Advances in
Communication, Network,
and Computing
Third International Conference, CNC 2012
Chennai, India, February 24-25, 2012
Revised Selected Papers
Volume Editors
Vinu V. Das
Network Security Group
The IDES
1191 GT Amsterdam, The Netherlands
E-mail: vinuvdas@theides.org
Janahanlal Stephen
Ilahia College of Engineering
686673 Kothamangalam, India
E-mail: drlalps@gmail.com
© ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2012
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
• Social media: With the rise of social networks such as Facebook, Myspace, Digg, and others, a form of crowd-based media known as social media has emerged. Businesses can take advantage of social media as another aspect of business communication automation.
• Research and open-source communication: Emerging young scientists and researchers can make good use of the above media to collaborate with open-source-minded people to help and promote research.
Janahanlal Stephen
CNC 2012 – Organization
Technical Chair
Dr. Janahan Lal Ilahia College of Engineering, India
Technical Co-chair
Dr. R. Vijayakumar MG University, India
Dr. Ford Lumban Gaol Bina Nusantara University, Indonesia
Organizing Chair
Prof. P.M. Thankachan Mar Ivanios College, India
Organizing Co-chair
Dr. Srinivasa K.G. M.S. Ramaiah Institute of Technology, India
Dr. Hatim A. Aboalsamh King Saud University, Riyadh
General Chair
Prof. Vinu V. Das The IDES
Publicity Chair
Dr. Mohammad Hammoudeh University of Wolverhampton, UK
Publication Chair
Prof. Mohamed Jamaludeen SRM University, India
Advisory Committee
Dr. Sudarshan TSB BITS Pilani, India
Dr. Sumeet Dua Louisiana Tech University, USA
Dr. Ansari Nirwan New Jersey Institute of Technology, USA
Program Committee
Prof. Shelly Sachdeva Jaypee Institute of Information & Technology
University, India
Prof. Pradheep Kumar K SEEE, India
Mrs. Rupa Ashutosh Fadnavis Yeshwantrao Chavan College of Engineering,
India
Dr. Shu-Ching Chen Florida International University, USA
Dr. Stefan Wagner Fakultät für Informatik, Technische Universität
München, Germany
Prof. Juha Puustjärvi Helsinki University of Technology
Dr. Selwyn Piramuthu University of Florida
Dr. Werner Retschitzegger University of Linz, Austria
Dr. Habibollah Haron Universiti Teknologi Malaysia
Dr. Derek Molloy Dublin City University, Ireland
Dr. Anirban Mukhopadhyay University of Kalyani, India
Dr. Malabika Basu Dublin Institute of Technology, Ireland
Dr. Tahseen Al-Doori American University in Dubai
Dr. V.K. Bhat SMVD University, India
Dr. Ranjit Abraham Armia Systems, India
Dr. Naomie Salim Universiti Teknologi Malaysia
Dr. Abdullah Ibrahim Universiti Malaysia Pahang
Dr. Charles McCorkell Dublin City University, Ireland
Dr. Neeraj Nehra SMVD University, India
Dr. Muhammad Nubli Universiti Malaysia Pahang
Dr. Zhenyu Yang Florida International University, USA
Dr. Keivan Navi Shahid Beheshti University, Tehran
Table of Contents
Full Paper
High Speed ASIC Design of DCT for Image Compression . . . . . . . . . . . . . 1
Deepa Yagain, Ashwini, and A. Vijaya Krishna
Image De-noising and Enhancement for Salt and Pepper Noise Using
Improved Median Filter-Morphological Operations . . . . . . . . . . . . . . . . . . . 7
K. Ratna Babu and K.V.N. Sunitha
Block Based Image Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Md. Mahmudul Hasan, Shaila Sharmeen, Md. Anisur Rahman,
M. Ameer Ali, and Md. Humayun Kabir
Secured Two Phase Geographic Forwarding with GSS Algorithm . . . . . . . 25
B. Prathusha Laxmi and A. Chilambuchelvan
On Demand Logical Resource Replication Scheme as a Service . . . . . . . . . 31
Vardhan Manu, Goel Akhil, Verma Abhinav, Mishra Shakti, and
Kushwaha D.S.
Data Storage Security Model for Cloud Computing . . . . . . . . . . . . . . . . . . . 37
Hiren B. Patel, Dhiren R. Patel, Bhavesh Borisaniya, and Avi Patel
Testing of Reversible Combinational Circuits . . . . . . . . . . . . . . . . . . . . . . . . 46
Y. Syamala, A.V.N. Tilak, and K. Srilakshmi
Classification of Medical Images Using Data Mining Techniques . . . . . . . . 54
B.G. Prasad and Krishna A.N.
A Novel Solution for Grayhole Attack in AODV Based MANETs . . . . . . 60
Rutvij H. Jhaveri, Sankita J. Patel, and Devesh C. Jinwala
Multilayer Feed-Forward Artificial Neural Network Integrated with
Sensitivity Based Connection Pruning Method . . . . . . . . . . . . . . . . . . . . . . . 68
Siddhaling Urolagin, Prema K.V., JayaKrishna R., and
N.V. Subba Reddy
ACTM: Anonymity Cluster Based Trust Management in Wireless
Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Shaila K., Sivasankari H., S.H. Manjula, Venugopal K.R.,
S.S. Iyengar, and L.M. Patnaik
Texture Based Image Retrieval Using Correlation on Haar Wavelet
Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
D.N. Verma, Sahil Narang, and Bhawna Juneja
Short Paper
Discovery of Cluster Patterns and Its Associated Data
Simultaneously . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
N. Kavitha and S. Karthikeyan
Poster Paper
A Survey on Single Channel Speech Separation . . . . . . . . . . . . . . . . . . . . . . 387
G. Logeshwari and G.S. Anandha Mala
Abstract. This paper presents the design and implementation of an image data
compression method, the Discrete Cosine Transform (DCT), using a Vedic
multiplier. This VLSI hardware can be used in practical coding systems to
compress images [1]. The DCT is one of the most popular schemes because of its
compression efficiency and small mean square error, and it is used especially
for the compression of images where tolerable degradation is accepted. In this
paper, DCT modules are designed, implemented, and verified in a 90 nm technology
library using Tanner EDA. Various individual cores are designed and connected
to implement an ASIC for image compression. The Vedic multiplier performs the
multiplication much faster than the usual array-multiplier approach, which
increases the overall speed. Moreover, since all the simulations and
implementations are done in 90 nm, one of the deep-submicron technologies, the
power, area, and interconnect length are reduced.
1 Introduction
Transform coding constitutes an integral component of contemporary image/video
processing applications. The three important features of a suitable transform are its
compression efficiency, which relates to concentrating the energy at low frequencies;
ease of computation; and minimum mean square error. The DCT is a popular technique
because it possesses these three advantages and can be expressed algorithmically. In a
video transmission system, adjacent pixels in consecutive frames show very high
correlation. The DCT converts data (image pixels) into sets of frequencies. The first
frequencies in the set are the most meaningful; the latter, the least. The least
meaningful frequencies can be stripped away based on the allowable resolution loss.
In this paper the 2-D DCT, an invertible linear transform widely used in many
practical image compression systems, is used for image compression. As DCT-related
technology becomes prominent in image coding systems [3], an efficient and reliable
implementation of the 2-D DCT operation may greatly improve system performance.
For example, when designing a video codec system, it is important to use a
two-dimensional DCT functional block in the circuit; using an ASIC as the DCT block
improves the codec's performance. The multiplier is designed using the
"Urdhva-Tiryakbhyam" sutra [6][7].
The 1-D DCT of a data sequence f(m), m = 0, 1, …, M−1, and its inverse are given by

$$F(u) = \sqrt{\frac{2}{M}}\, C(u) \sum_{m=0}^{M-1} f(m)\, \cos\frac{(2m+1)u\pi}{2M} \qquad (1)$$

$$f(m) = \sqrt{\frac{2}{M}} \sum_{u=0}^{M-1} C(u)\, F(u)\, \cos\frac{(2m+1)u\pi}{2M} \qquad (2)$$

$$C(u) = \begin{cases} 1/\sqrt{2}, & u = 0 \\ 1, & \text{otherwise} \end{cases} \qquad (3)$$

The 2-D DCT is a direct extension of the 1-D case and is given by

$$F(u,v) = \frac{2}{\sqrt{MN}}\, C(u)\, C(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n)\, \cos\frac{(2m+1)u\pi}{2M}\, \cos\frac{(2n+1)v\pi}{2N} \qquad (4)$$

where
M = number of rows r in the input data set
N = number of columns c in the input data set
m = row index in the time domain, 0 ≤ m ≤ M−1
n = column index in the time domain, 0 ≤ n ≤ N−1
f(m, n) = time-domain data
u = row index in the frequency domain
v = column index in the frequency domain
F(u, v) = frequency-domain coefficient for u, v = 0, 1, 2, …, N−1
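For readers who want to check the transform numerically, the following Python sketch evaluates equation (4) directly. It is an unoptimized software model, not the paper's pipelined hardware, and it assumes the standard orthonormal normalization shown above.

```python
import numpy as np

def dct_2d(f):
    """Direct evaluation of the 2-D DCT of equation (4) for an M x N block."""
    M, N = f.shape
    c = lambda k: 1.0 / np.sqrt(2.0) if k == 0 else 1.0   # C(.) of eq. (3)
    F = np.zeros((M, N))
    for u in range(M):
        for v in range(N):
            s = 0.0
            for m in range(M):
                for n in range(N):
                    s += (f[m, n]
                          * np.cos((2 * m + 1) * u * np.pi / (2 * M))
                          * np.cos((2 * n + 1) * v * np.pi / (2 * N)))
            F[u, v] = 2.0 / np.sqrt(M * N) * c(u) * c(v) * s
    return F

# Example: transform one 8 x 8 block of a grayscale image
block = np.arange(64, dtype=float).reshape(8, 8)
coeffs = dct_2d(block)
print(coeffs[0, 0])   # DC coefficient: 8 times the block mean here
```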
Design of DCT f(x) Module. The block diagram of the DCT f(x) module is shown in
Fig. 1(a). It takes f(x) as the input signal and the Count_Reg_enl and
Count_Mux_Enable signals as enable signals to produce the output Mux_Out. The
blocks used in this module are registers and a multiplexer. Any one of the f(x)
signals is selected as Mux_Out, depending on the Count_Mux_Enable signal, whenever
clk (clock) and the Count_Reg enables are high. The register and mux are
implemented using MOSFETs. Here pass transistors are used to design the DFF, which
increases the performance; the DFF in turn is used to implement a register.
Design of DCT Cos Module. The block diagram of the DCT Cos module is shown in
Fig. 1(b). The inputs given to the Dct_Calc block in the DCT Cos module are f(x),
from the DCT_f(x) module, and the Cos values chosen from the various multiplexers.
The chosen Cos values come from the memory block. For example, any one cos value
from Cos1 to Cos8 will
be chosen depending on Cos_Mux_En. This signal is combined with f(x), the output
of the DCT_f(x) module, and given as input to the DCT_Calc block, which in turn
generates F(X). The muxes and registers in the DCT Cos module are implemented in
the same way as in the DCT f(x) module.
(a) (b)
Fig. 1. (a) Block diagram of DCT f(x) Module and (b) Block diagram of DCT Cos Module
The schematic and symbol of the DFF using pass transistors are shown in Fig. 2.
The DFF is implemented using pass transistors to improve the performance
parameters. Similarly, the mux block is implemented using NAND gates and buffers.
By cascading 16 DFFs we can implement a register block in the DCT f(x) module.
Similarly, an 8:1 mux is implemented and connected to obtain the DCT f(x) module.
(a) (b)
Fig. 2. (a) Schematic of DFF using pass transistors and (b) Symbol of DFF
Design of DCT Calc Module. The block diagram and schematic of the DCT Calc block
are shown in Figures 3(a) and 3(b). Here, the multiplication of the f(x) and Cos
bits is performed in the multiplier block. The result is given to a register
enabled by the Mult_Reg_En signal from the controller. The output of the register
and its 2's-complement value are given as inputs to a mux. Simultaneously, the MSB
of the Cos bit stream is sent to the controller for a sign check. Based on the
sign, the controller issues the Mult_sel signal to the mux, which selects either
the output of Mult_Reg or its 2's complement. The two's-complement values are
generated by feeding the outputs of Mult_Reg to XOR gates. The generated result is
accumulated over all the rows (m) and columns (n) of the image block, as indicated
by the adder and SOP register. According to the formula given in equation (4), here
C(x) = 1/√2 for x = 0, and C(x) = 1 otherwise.
(a) (b)
Fig. 3. (a) Block Diagram of DCT calc Module and (b) Schematic of DCT calc Module
Thus x can be either zero or a non-zero value. Accordingly, the result is
right-shifted by either 2 or 3 bits so that it is divided by 4 or 8. The out_sel
signal from the controller is used to select the shifted values so that the final
result F(x) is obtained. The basic block in DCT Calc is the multiplier, shown in
Figures 4(a) and 4(b).
(a) (b)
Fig. 4. (a) Schematic of 16-bit Vedic multiplier and (b) schematic block of 8-bit
Vedic multiplier
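As a behavioral reference for this block, the following Python model implements the vertical-and-crosswise (Urdhva-Tiryakbhyam) scheme at the bit level: every output column sums the crosswise partial products of the operand bits and the carry ripples into the next column. This models only the arithmetic; the paper's 16-bit multiplier assembles it in hardware from 8-bit blocks.

```python
def urdhva_multiply(a, b, n=8):
    """Vertical-and-crosswise multiplication of two n-bit integers:
    column k sums a[i]*b[k-i] plus the incoming carry, emits one
    result bit, and passes the rest on as carry."""
    abits = [(a >> i) & 1 for i in range(n)]
    bbits = [(b >> i) & 1 for i in range(n)]
    result, carry = 0, 0
    for k in range(2 * n - 1):
        col = carry + sum(abits[i] * bbits[k - i]
                          for i in range(max(0, k - n + 1), min(k, n - 1) + 1))
        result |= (col & 1) << k     # this column's product bit
        carry = col >> 1             # remaining bits carry onward
    return result | (carry << (2 * n - 1))

assert urdhva_multiply(0xAB, 0xCD) == 0xAB * 0xCD   # 8-bit sanity check
```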
Design of Controller Module. The Block diagram of DCT functional block and state
diagram of Controller is as shown in figure 5(a) and 5(b).
(a) (b)
Fig. 5. (a) Block Diagram of DCT functional Block and (b) State machine of Controller Block
(a) (b)
Fig. 6. Simulation results of DCT module showing (a) inputs and (b) F(X) values
4 Conclusions
The entire design procedure of an ASIC 2D-DCT processor is presented in this paper.
The important advantage of this core is its usability in various real-time coding
systems. As DCT-related technology becomes prominent in image coding systems, an
efficient and reliable implementation of the 2D-DCT operation may greatly improve
system performance. The intent of the paper is to explore a 2D-DCT architecture and
to adapt the design to different applications, thus expediting the design procedure.
The transistors in these modules are designed using the MOSIS 90 nm technology library.
References
1. Rao, K.R., Yip, P.: Discrete Cosine Transform—Algorithms, Advantages, Applications.
Academic Press, London (1990)
2. Ahmed, N., Natarajan, T., Rao, K.R.: Discrete cosine transform. IEEE Trans.
Comput. C-23(1), 90–93 (1974)
3. Yu, S., Swartzlander Jr., E.E.: DCT implementation with distributed arithmetic. IEEE
Trans. Comput. 50(9), 985–991 (2001)
4. Wallace, G.K.: The JPEG still picture compression standard. Communications of the
ACM 34(4), 30–44 (1991)
5. Cho, N.I., Lee, S.U.: Fast Algorithm and Implementation of 2-D DCT. IEEE Transactions
on Circuits and Systems 38, 297 (1991)
6. Wallace, C.S.: A suggestion for a fast multiplier. IEEE Trans. Electronic Comput.
EC-13, 14–17 (1964)
7. Tiwari, H.D., Gankhuyag, G., Kim, C.M., Cho, Y.B.: Multiplier design based on ancient
Indian Vedic Mathematics. In: Proceedings IEEE International SoC Design Conference,
Busan, November 24-25, pp. 65–68 (2008)
8. Thapliyal, H., Srinivas, M.B.: High Speed Efficient N x N Bit Parallel Hierarchical Overlay
Multiplier Architecture Based on Ancient Indian Vedic Mathematics. Transactions on
Engineering, Computing and Technology 2 (2004)
Image De-noising and Enhancement for Salt
and Pepper Noise Using Improved Median
Filter-Morphological Operations
1 Introduction
some of the image processing functions. Grayscale images are distinct from one-bit
black-and-white images, which in the context of computer imaging are images with
only two colors, black and white (also called bi-level or binary images).
Grayscale images have many shades of gray between 0 and 255. A 640 × 480
grayscale image requires over 300 KB of storage. Linear and non-linear filtering
techniques [5] are used for image de-noising and enhancement.
Mathematical Morphology (MM) was born in 1964 from the collaborative work of
Georges Matheron and Jean Serra [17] [18]. The primary morphological operations
are Dilation and Erosion. More complicated morphological operators can be designed
by means of combining Erosions and Dilations. Mathematical morphology considers
images as geometrical objects, to be analyzed through their interactions with other
geometrical objects. It has various applications in bio-medical imaging, Geo-science,
Remote sensing, Quality control, Document processing and Data analysis. Its
techniques have been applied successfully to a variety of image processing tasks that,
roughly speaking, involve shape information of image objects. The extraction and
enhancement of shape information from images is one of the important tasks of
mathematical morphology. Main idea behind Morphological Filtering [12] [13] [14]
[16] is to examine the geometrical structure of an image by matching it with small
patterns called structuring elements at various locations. By varying the size and
shape of the matching patterns, we can extract useful information about the shape of
the different parts of the image and their inter-relations. The original image may
have high or low intensity values which mask the details. In this paper we propose
an adaptive image enhancement approach which operates mainly on image pixel values
to reduce salt-and-pepper noise [15] in the image.
The paper is organized as follows. Section 2 discusses several spatial-domain
research works. In Section 3, we propose a method for removing noise using an
improved median filter and morphological operations. Experiments on several
images are reported in Section 4. Finally, Section 5 concludes the work.
Jimenez Sanchez et al. [6] proposed a method to detect the image background and
to enhance the contrast in grey level images with poor lighting, in which an
approximation to the background using blocks analysis is computed. This was
subsequently extended using mathematical morphology operators. However, a
difficulty was detected when the morphological dilation and erosion were employed.
Therefore, a new method to detect the image background was proposed, which is
based on the use of morphological connected transformations. These morphological
contrast enhancement transformations are based on Weber’s law. But they can only be
used satisfactorily in images with poor lighting. Rafael Verdú Monedero et al [7]
proposed a spatially variant erosions/dilations and openings/closings approach.
Structuring elements (SE) can locally adapt their shape and orientation across the
direction of the structures in the image. The process of extracting shape and
orientation of the SE at each pixel from the image is under study. This method is
useful in the enhancement of anisotropic features such as coherent, flow-like
structures.
A general method based on fuzzy implication and inclusion grade operators has
been discussed by Yee Yee Htun et al. [8]. The fuzzy morphological operations
extend the ordinary morphological operations by using fuzzy sets, where the union
and intersection operations of the fuzzy sets are replaced by a maximum operation
and a minimum operation, respectively.
theory, fuzzy Mathematical morphology based on fuzzy logic and fuzzy set theory,
fuzzy Mathematical operations and their properties have been studied. The
applications of fuzziness in Mathematical morphology in practical work such as
image processing and illustration problems have been discussed. Fuzzy Filtering [9]
also useful in Image Enhancement in which spatial data is transformed to Fuzzy Data
and Fuzzy operations were carried out on this data and Finally converting the
modified Fuzzy Data to Spatial Data.Morphological operations [10] [12] are useful in
smoothing the Images. But they also remove thin features from the images along with
noise. Morphological Image Cleaning algorithm which is explained in [11] preserves
thin features while removing noise. This algorithm is best useful for Scanner noise,
still video images noise etc.
3 Proposed Algorithm
The techniques introduced in this work manipulate the dynamic range of a given
digital image to improve the visualization of its contents, based on the median
filtering technique [19][20]. Each pixel in the image has a value in the range
0 to 255. The algorithm is given below.
The central pixel of the 3×3 neighborhood window is the pixel being processed. We
check its intensity value: if the central pixel has an intensity value other than
0 or 255, it is not treated as a noisy pixel, so no further processing is required
for that pixel. If it has intensity value 0 or 255, its intensity value is replaced
using equation (1). If the central pixel intensity value is 0 or 255 and the
intensity values of all the neighborhood pixels are also 0 or 255, then the central
pixel intensity value is replaced using equation (2). If the central pixel
intensity value is either 0 or 255, then

P(i+1, j+1) = Median(K)    (1)

where Median(K) is the median of the elements of vector K, which is 60 for the
following case. Vector K contains all the neighborhood pixel intensity values.
55 125 75
42 0 or 255 78
If the central pixel intensity value is either 0 or 255 and the neighborhood pixel
values are also all 0 or 255, then

P(i+1, j+1) = Mean(K)    (2)

where Mean(K) is the mean of the elements of vector K, which is 85 for the
following case. Vector K contains all the neighborhood pixel intensity values.
0 0 255
0 0 0
Fig. 3. A case where the central pixel is 0 (salt noise) or 255 (pepper noise) and
the neighborhood pixels are also either 0 or 255
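A minimal Python sketch of the filter described above follows. One detail is an assumption: as in the border-handling cases below, 0/255 values are excluded before taking the median of equation (1), and the mean of equation (2) is used only when every neighbor is itself 0 or 255.

```python
import numpy as np

def improved_median_filter(img):
    """Improved median filter for salt-and-pepper noise: a pixel is
    treated as noisy only if its value is 0 or 255."""
    out = img.copy()
    H, W = img.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            if img[i, j] != 0 and img[i, j] != 255:
                continue                                     # not a noisy pixel
            k = np.delete(img[i-1:i+2, j-1:j+2].ravel(), 4)  # the 8 neighbors
            clean = k[(k != 0) & (k != 255)]
            if clean.size > 0:
                out[i, j] = np.median(clean)                 # equation (1)
            else:
                out[i, j] = k.mean()                         # equation (2)
    return out
```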
Image borders are also checked for salt-and-pepper noise. The algorithm has the two
border-preserving cases shown below; based on the intensity values of the pixels,
the appropriate method is used.
Case 1: If all the border pixel intensity values are either 0 or 255, calculate
their mean and substitute it for each 0 or 255 intensity value.
Function Borderpreserve_Mean //Mean of all borders//
(1) Read the values of all the borders into a vector.
(2) If all the pixels have intensity values of either 0 or 255,
(3) Replace the noised values in all the borders with the mean of the vector.
End Function.
Case 2: Otherwise, calculate the median of the vector and substitute it for each
0 or 255 intensity value.
Function Borderpreserve_Median //Median of all borders//
(1) Read the values of all the borders into a vector.
(2) Remove the values 0 and 255 from that vector and calculate the median of the
remaining values.
(3) Replace the noised values in all the borders with the median of the vector.
End Function
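The two border cases translate directly into the following sketch, assuming an 8-bit grayscale NumPy image:

```python
import numpy as np

def preserve_borders(img):
    """Border preservation: Case 1 (all border pixels noisy) uses the
    mean of the border vector; Case 2 uses the median of its
    non-noisy values."""
    out = img.copy()
    mask = np.zeros(img.shape, dtype=bool)
    mask[0, :] = mask[-1, :] = mask[:, 0] = mask[:, -1] = True   # all borders
    vec = img[mask].astype(float)
    noisy = (vec == 0) | (vec == 255)
    if noisy.all():
        fill = vec.mean()              # Case 1: Borderpreserve_Mean
    else:
        fill = np.median(vec[~noisy])  # Case 2: Borderpreserve_Median
    vec[noisy] = fill
    out[mask] = vec
    return out
```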
Operators. Dilation enlarges the boundaries so that holes in the region become
smaller, whereas erosion shrinks the boundaries so that holes in the region become
larger. Erosion and dilation are often used in combination to implement the image
processing operations known as "opening" and "closing". Opening is performed
through erosion followed by dilation with another image and is less destructive
than erosion alone.
Fig. 4. Results of proposed Method for Gray Scale Lena Image and Car Image (a) Images
Corrupted with 20%,50% and 80% Salt and Pepper Noise (b) Noise Reduced by Improved
Median Filter (c) Enhanced Images Using Morphological Operation
The enhancement process of the proposed system was evaluated with standard images
such as Lena, Baboon, and Cameraman. Using these images, the work was tested on
color and grayscale images by applying the improved median filter and the
morphological operation "imopen". PSNR values of the Lena image at different noise
densities are compared with those of different enhancement techniques in Table 1.
Table 1. Comparison of PSNR values (in dB) of the Lena image at noise densities of
20%, 50% and 80%. Columns: Noise Level, MF, AMF, DBA, MDBA, Proposed Method.
4.2 Discussion
Our algorithm, which enhances images using the improved median filter and
morphological operations, does not require complicated calculations to enhance
image contrast. It performs well up to 80% salt-and-pepper noise. If the noise
level is 90% or above, the algorithm produces blurred images, as shown in Fig. 5.
Fig. 5. Results of proposed Method for Parrot Image (a) Original Parrot Image (b) Corrupted
with 90% Salt and Pepper Noise (c) Noise Reduced by Improved Median Filter
5 Conclusion
In this paper, a hybrid image enhancement technique that combines the improved
median filter and a morphological operation is implemented. The original image is
first filtered using the improved median filter and then enhanced using the
morphological operation. The implemented algorithm presents the best performance
both visually and quantitatively, based on measures such as the mean square error
(MSE) and the peak signal-to-noise ratio (PSNR). This paper uses the morphological
open function with a 2 × 2 square structuring element for smoothing. Experiments
were carried out on a number of images with noise levels of 20%, 50% and 80%. The
results were robust and achieved a very good enhancement level, which proves the
effectiveness of the proposed work.
References
1. Astola, J., Kuosmaneen, P.: Fundamentals of Nonlinear Digital Filtering. CRC, Boca
Raton (1997)
2. Hwang, H., Haddad, R.A.: Adaptive median filters: New algorithms and results.
IEEE Trans. Image Process. 4(4), 499–502 (1995)
3. Srinivasan, K.S., Ebenezer, D.: A new fast and efficient decision based algorithm for
removal of high density impulse noise. IEEE Signal Process. Lett. 14(3), 189–192 (2007)
4. Selvi, M., Roselin, Kavitha: A Hybrid Image Enhancement Technique for Noisy Dim
Images using Curvelet and Morphology. International Journal of Engineering Science and
Technology 2(7), 2997–3002 (2010)
5. Satpathy, Panda, Nagwanshi, Nayak, Ardil: Adaptive Non-linear Filtering Technique for
Image Restoration. International Journal of Electrical and Computer Engineering 5(1), 15–
22 (2010)
Block Based Image Segmentation
1 Introduction
algorithms are domain dependent, (iii) thresholds used in these algorithms are
manually set, (iv) two adjacent regions do not share the same boundary information in
these algorithms. Region-based algorithms proposed in [4-6] are much more efficient
than the boundary-based algorithms. Split-and-merge (SM) of [6] is a popular, easy,
and simple region-based algorithm to segment objects in an image. To overcome the
problems associated with the basic SM algorithm and its several variations, a new
algorithm called pattern based object segmentation using split-and-merge (PSM) was
proposed in [7], using pattern matching for object extraction. However, the biggest
drawback of the PSM algorithm is that it cannot segment homogeneous regions that
are connected. In this paper, we propose a new algorithm, block based image
segmentation (BIS) taking into account the basic SM algorithm, image feature
stability, inter- and intra-object variability and human visual perception. The
experimental analysis has been conducted on gray-scale images considering the
intensity of pixel as the feature for segmentation. The results of the BIS algorithm are
compared with that of the PSM algorithm [7], basic SM algorithm [6], classical fuzzy
clustering algorithm namely suppressed fuzzy c-means (SFCM) using combination of
pixel intensity and pixel location as the feature for segmentation process [10], and the
newly developed shape based clustering algorithm called object based image
segmentation using fuzzy clustering (OSF) [1]. The BIS algorithm performs better
than all the algorithms mentioned above in segmenting connected regions and
producing less distortion.
The rest of the paper is organized as follows: the basic SM algorithm and the
supporting literature containing the theorems applied to propose the BIS algorithm
are detailed in Section 2, while the proposed BIS algorithm is presented in Section 3.
Computational complexity is calculated in Section 4. The experimental results are
described in Section 5 and finally some concluding remarks are provided in Section 6.
2 Supporting Literature
In this section, we describe related research works that are directly used to
propose the object segmentation algorithm called block based image segmentation (BIS).
Region stability test is applied in both the split and merge stage. A region is said to be
stable if it does not contain portion of more than one object. Region stability is
measured depending on whether all the samples in a sample space (here, pixels in a
region) belong to the same sample space (here, the same region) or not. If the
regions are unstable, they are subdivided into several smaller regions or sample
spaces. For this purpose, the T-test has been applied because of its wide
applicability [7].
Patterns were first successfully used in [8, 9] to find the motion vectors of
objects (micro-blocks) in video encoding. A video frame is segmented into one or
more moving objects using 16×16 blocks, called micro-blocks (MBs). Wong et al. [15]
compare each micro-block with the patterns to find whether there is any moving
object in the micro-block under consideration; a match with any pattern indicates
a moving object's presence in the micro-block. This matching technique for
micro-blocks was adopted in [7] to develop a new segmentation algorithm.
If the change in any object or feature is less than or equal to 0.5 dB, human
perception is unable to detect the change; the details are explained in [1].
3 Proposed Model
This Section presents a new algorithm for image segmentation, namely block based
image segmentation (BIS) algorithm, which is developed to overcome the various
segmentation faults of the PSM algorithm. The proposed BIS algorithm is divided
into following stages (i) split stage, (ii) region accepting stage, and (iii) merge stage.
The basic assumptions for the BIS algorithm are as follows: (i) the aspect ratio of
the image is 1.33 (4:3), with image size 1024×768; (ii) the image is segmented into
square regions down to 16×16 blocks; and (iii) only the foreground pixels are
segmented, and the background pixels are set to zero.
In the split stage, square regions of different sizes, R1, R2, R3, …, Rn, are
produced, where n is the number of split regions. The split regions are matched
with the patterns of the pattern codebook described in Section 2.3, using the
pattern-matching technique from [7]. If the size of the MB is a × b, the percentage
of matching (POM) of a region Ri with any pattern Pj can be calculated using the
following equation:

$$\mathrm{POM}(\%) = \frac{\sum_{x=1}^{a}\sum_{y=1}^{b} f(x,y)}{a \times b} \times 100$$
If POM(%) ≥ 95, pattern Pj is said to be fully matched with region Ri. If
60 ≤ POM(%) < 95, the region is said to be partially matched with pattern Pj, and
if POM(%) < 60, pattern Pj is unmatched with region Ri. The region accepting stage
finds three types of regions: accepted, partially accepted, and rejected regions.
An accepted region is a region that contains only foreground
pixels, while a partially accepted region has pixels of both foreground and
background. A partially accepted region needs to be replaced by the best-match
pattern, after which it is treated as an accepted region. A rejected region is the
background of the image and is not replaced by a matched pattern. The region
accepting stage uses the following two steps to mark regions as accepted or not
accepted:
Step 1: If a region is larger than the 16 × 16 micro-block and does not contain any
background pixels, the region is marked as accepted; otherwise, it is marked as
rejected.
Step 2: When a region's size equals the size of the MB, the region may be accepted,
partially accepted, or rejected. If the region does not have any background pixels,
it is treated as an accepted region, while a rejected block contains only
background pixels. A region having both background and foreground pixels is
considered partially accepted; in this case it is matched against the given
patterns, replaced by the best-match pattern, and the replaced block is then
treated as an accepted region. This process continues for all regions whose size
equals that of the MB (see the sketch below).
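The POM test and the resulting labels can be sketched as below, assuming f(x, y) = 1 where the region and the pattern agree and 0 otherwise (the text does not spell out f):

```python
def classify_region(region, pattern):
    """Label a region against a pattern using the 95% / 60% POM thresholds."""
    a, b = len(region), len(region[0])
    agree = sum(region[x][y] == pattern[x][y]
                for x in range(a) for y in range(b))
    pom = 100.0 * agree / (a * b)
    if pom >= 95:
        return "fully matched"
    if pom >= 60:
        return "partially matched"
    return "unmatched"

# 16 x 16 binary masks: 1 = foreground, 0 = background
region  = [[1] * 16 for _ in range(16)]
pattern = [[1] * 16 for _ in range(12)] + [[0] * 16 for _ in range(4)]
print(classify_region(region, pattern))   # 75% agreement -> "partially matched"
```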
Two regions are selected for merging using multistage merging technique if they
are connected and accepted. The merging stage is detailed in the following Section.
number of objects having different variations among them; these objects are only
differentiable if they have different appearances and are visually distinct from
each other. Since the number of clusters is neither fixed nor manually provided,
the proposed algorithm, like [16], considers minimizing the intra-region
variability and maximizing the inter-region variability in the union of two
regions. However, straight minimization of the intra-region variability or
maximization of the inter-region variability leads to undesirable trivial solutions
of N regions or 1 region, respectively. BIS minimizes the intra-region variability
while constraining the inter-region variability in the union of two regions. The
reasoning behind this condition is that if two regions belong to a single object,
their intensities should be similar, and as a result their combined variability
should be minimal. On the other hand, when one of two regions of similar intensity
disappears due to merging, the variance increases while the number of regions
decreases; this leads to maximization of the inter-region variability. Thereby, in
the overall image, the inter-object variance is maximal, as the objects are
distinct from each other. This idea is applied to the remaining regions that need
to be merged, and the task terminates when no more regions can be merged under
this criterion (see the sketch below).
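One way to realize this merge criterion is sketched below: two connected, accepted regions merge only while the variance of their union stays within a bound. The bound max_var is a hypothetical parameter here; the paper ties its actual criterion to the T-test and the 0.5 dB perception limit discussed in Section 2.

```python
import numpy as np

def should_merge(r1, r2, max_var):
    """Merge test: a bounded union variance keeps intra-region
    variability low while visually distinct regions stay apart."""
    union = np.concatenate([r1, r2])   # 1-D arrays of pixel intensities
    return union.var() <= max_var

r1 = np.array([100., 102., 101.])
r2 = np.array([103., 99., 100.])
print(should_merge(r1, r2, max_var=25.0))   # similar intensities -> True
```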
4 Computational Complexity
The time complexity of the BIS algorithm, which is detailed in Section 3.3.4, is
presented below in a step-wise approach.
Step 1: The time required to resize the image to 1024 × 768 is O(n), where n is the
number of pixels in the image.
Step 2: The computational time required for checking image stability using the
T-test is O(n).
Step 3: In the worst case, the computational time required to split the image is
O(1) × O(n) = O(n).
Step 4: Calculating the Rmap, which maps the pixels into regions, requires O(n)
computational time.
Step 5: Step 5 runs in O(n) computational time.
Step 6: The region stability test using the T-test requires O(n) time, and merging
regions with n pixels requires O(n) time. The time required to compute maxVar for
an image containing n pixels is O(n), while the T-test needs O(n) time. Examining
whether a region is less than 6% of another region requires O(n) time. So the total
time required for this step is O(n) in the worst case.
5 Experimental Results
In order to evaluate the performance of the BIS algorithm, results obtained from
the proposed BIS algorithm are compared with the results of the OSF [1], ROSSM [12]
and PSM [7] algorithms, all implemented using MATLAB 7.1. A total of 210
different natural and synthetic gray-scale and color 2-D images, taken from the
Internet and IMSI, were randomly selected for the experimental analysis, comprising
up to 5 different regions (objects) with varying degrees of surface variation. The
detailed implementation procedures are provided in [7], and the original and
manually segmented reference regions of the test images used are shown in Figures 1
and 2. To provide a better visual interpretation of the segmented results, both the
reference and the segmented regions are displayed in different colors rather than
their original gray-scale intensities.
Fig. 1. Tiger image: (a) original image, (b) manually segmented reference image, and
segmented results using (c) OSF, (d) OSSM, (e) PSM, and (f) BIS
Now we present an image with five different objects, some of which are connected
while some are not, shown in Figure 1. The original synthetic image is presented in
Figure 1(a) and the manually segmented image is illustrated in Figure 1(b). We can
see that the performance of the BIS algorithm (Figure 1(f)) is better than that of
the OSF (Figure 1(c)), OSSM (Figure 1(d)) and PSM (Figure 1(e)) algorithms in terms
of misclassification error rate and visual analysis.
The segmented results for an elephant image are shown in Figure 2. The original and
manually segmented images are displayed in Figures 2(a) and (b) respectively. This
image consists of four different objects, and the objects are not connected. The
BIS algorithm (Figure 2(f)) performs better than any other algorithm, with almost
zero misclassification error; a small amount of error is introduced because the
boundary blocks are replaced by predefined patterns. Since the regions (objects)
are not connected, the PSM algorithm (Figure 2(e)) performs the same as the BIS
algorithm. The segmentation performance of the OSF (Figure 2(c)) and OSSM
(Figure 2(d)) algorithms is very poor, with a high misclassification error rate.
In assessing the overall segmentation performance of the proposed segmentation
algorithm, it should be highlighted that only the best results for each
segmentation algorithm (OSF, OSSM and PSM) were considered. These three were
compared with
Fig. 2. Elephant image: (a) original image, (b) manually segmented reference image, and
segmented results using (c) OSF, (d) OSSM, (e) PSM, and (f) BIS
the BIS algorithm. Of the 146 test images, BIS produced superior results for 106
images. For the remainder of the images, OSF, OSSM and PSM provided better
results for only 45, 47, and 22 images respectively.
6 Conclusion
This paper has proposed a new segmentation approach, the block based image
segmentation (BIS) algorithm, to address some of the limitations inherent in the
PSM algorithm. The BIS algorithm, like the PSM algorithm, first splits the image
into several regions until region stability is achieved or the block size becomes
16×16. The split regions are then matched with the micro-block (MB) patterns to
produce accepted and rejected regions, and the accepted, connected split regions
are merged using multistage merging techniques, such as the T-test, intra-variance
and inter-variance tests, and human visual perception techniques. Experimental
results have shown that the newly developed BIS algorithm segments connected
images well and outperforms pattern based object segmentation using split and
merge (PSM), object based image segmentation using fuzzy clustering (OSF) and
Robust Image Segmentation Based on Split and Merge (OSSM). This makes it highly
applicable to low-bit-rate video coding in real-life applications where some
misclassification error is acceptable. A little shape distortion occurs in the BIS
algorithm due to pattern matching in 16×16 regions.
References
1. Ali, M.A., Dooley, L.S., Karmakar, G.C.: Object based segmentation using fuzzy
clustering. In: IEEE International Conference on Acoustics, Speech & Signal Processing,
vol. 2, pp. 105–108 (2005)
2. Canny, J.F.: A computational approach to edge detection. IEEE Transactions on
Pattern Analysis and Machine Intelligence 8(6), 679–698 (1986)
3. Ronfard, R.: Region-based strategies for active contour models. International Journal of
Computer Vision 13(2), 229–251 (1994)
4. Haris, K.: Hybrid Image Segmentation Using Watersheds and Fast Region Merging. IEEE
Transactions on Image Processing 7(12), 1684–1699 (1998)
5. Chaudhuri, D., Chaudhuri, B.B., Murthy, C.A.: A new split-and-merge clustering
technique. Pattern Recognition Letters 13, 399–409 (1992)
6. Horowitz, S.L., Pavlidis, T.: Picture segmentation by a directed split-and-merge procedure.
In: 2nd International Joint Conference on Pattern Recognition, pp. 424–433 (1974)
7. Karim, Z., Paiker, N.R., Ali, M.A., Sorwar, G.: Pattern based object segmentation using
split and merge. In: IEEE Conference on Fuzzy Systems, South Korea (2009)
8. Murshed, M., Paul, M.: Pattern Identification VLC for Pattern-based Video Coding using
Co-occurrence Matrix. In: International Conference on Computer Science, Software
Engineering, Information Technology, e-Business and Applications (CSITeA 2004), Egypt
(2004)
9. Paul, M., Murshed, M.: An Efficient Similarity Metric for Pattern based VLBR Video
Coding. Journal of Internet Technology (2005)
10. Fan, J.L., Zhen, W.Z., Xie, W.X.: Suppressed fuzzy c-means clustering algorithm. Pattern
Recognition Letters 24, 1607–1612 (2003)
11. Pavlidis, T.: Structural Pattern Recognition. Springer, Heidelberg (1977)
12. Faruquzzaman, A.B.M., Paiker, N.R., Arafat, J., Ali, M.A., Sorwar, G.: Robust Image
Segmentation Based on Split and Merge. In: IEEE TENCON (2008)
13. Brox, T., Firin, D., Peter, H.N.: Multistage region merging for image segmentation. In:
22nd symposium on Information Theory in to Benelux, Enschede, pp. 189–196 (2001)
14. Gupta, S.P.: Advanced Practical Statistics. S. Chad and Company Ltd., New Delhi (2001)
15. Wong, K.W., Lam, K.M., Siu, W.C.: An Efficient Low Bit-Rate Video-Coding Algorithm
Focusing on Moving Regions. IEEE Trans. on Circuits and Systems for Video
Technology 11(10), 1128–1134 (2001)
16. Veenman, C.J., Reinders, M.J.T., Backer, E.: A cellular coevolutionary algorithm
for image segmentation. IEEE Transactions on Image Processing 12(3), 304–316 (2003)
Secured Two Phase Geographic Forwarding
with GSS Algorithm
1 Introduction
node. And, if the path is optimized, the intermediate node appends the optimized
neighbor's ID in the path list before its own ID when the request is modified.
For example, an intermediate node ‘e’ receives a request message for which the
path list contains “a->b->c->d” and nodes ‘b’ and ‘c’ are neighbors of node ‘e’.
The intermediate node 'e' checks whether the source node is a neighbor; if it is
not, it searches the path list from the beginning (node 'a') until it finds a
neighbor node in the path list. The search returns the farthest neighbor node 'b'
(farthest in ID sequence, not in geographic distance), and the path list in the
request message for node 'e' is modified to "a->b->c->d->b->e". Finally, the
intermediate node records the address of the neighbor from which it received the
request, and the modified route request is forwarded. This process is repeated
until the request message reaches the base station.
Fig. 2. The dashed line shows the reverse traversal along the found path
When the base station receives the request message, it verifies the MAC. If this
verification succeeds, the base station then searches for a duplicated node ID in
the path list of the request message to obtain an optimized path. If the base
station finds a duplicate node ID, it assumes that the node following the
duplicated ID and the duplicated-ID node are neighbors, so it removes the node IDs
in between the two neighbor nodes to obtain the optimized path.
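The duplicate-removal step can be sketched in a few lines of Python (illustrative only; single-letter node IDs stand in for the node addresses):

```python
def optimize_path(path):
    """Drop every node recorded between the two occurrences of a
    duplicated node ID, since those occurrences are neighbors."""
    out = []
    for node in path:
        if node in out:
            out = out[:out.index(node) + 1]   # cut back to first occurrence
        else:
            out.append(node)
    return out

assert optimize_path(list("abcdbe")) == list("abe")   # a->b->c->d->b->e
```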
The route maintenance mechanism detects malfunctioning, dead or subverted nodes
along the routing path. In SecuTPGF, each node along the path forwards the data to
the next-hop node and then attempts to confirm that the data was received by that
node. If, after a limited number of local retransmissions of the data, a node in
the route is unable to obtain this confirmation, it propagates a route error
message (RERR) to the source node to inform it that the link is broken. The
initiator of the route error message computes a MAC using a non-interactive key.
Upon receiving a route error message, the source authenticates the RERR and may
then re-initiate the route discovery process for the destination.
The GSS algorithm is combined with secured two phase geographic forwarding as
sketched below. During the first part of GSS, the geographic location gu of each
authentic node u is obtained and the potential nearest authenticated neighbor to
the sink for each node is identified. In the second part of GSS, a random rank
ranku is picked for each node u and the subset Cu of u's currently awake
authenticated neighbors having rank < ranku is computed. Before u can go to sleep,
it needs to ensure that all nodes in Cu are connected by nodes with rank < ranku,
that each of its authenticated neighbors has at least k neighbors from Cu, and
that u is not the potential nearest authenticated neighbor node for any other node.
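The lost pseudocode can be approximated from this description. In the hedged Python sketch below, nbrs (sets of authenticated neighbors), rank, awake, nearest (each node's potential nearest authenticated neighbor toward the sink) and the connectivity helper connected are assumed data structures, not names from the paper:

```python
def can_sleep(u, nbrs, rank, awake, k, nearest, connected):
    """Sleep-eligibility test in the second part of GSS."""
    Cu = {v for v in nbrs[u] if awake[v] and rank[v] < rank[u]}
    if not connected(Cu, rank[u]):
        return False        # C_u must stay connected via lower-rank nodes
    if any(len(nbrs[w] & Cu) < k for w in nbrs[u]):
        return False        # every neighbor must keep k neighbors in C_u
    if any(nearest[w] == u for w in nbrs[u]):
        return False        # u is another node's next hop toward the sink
    return True
```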
4 Conclusion
Aiming at improving secured two phase geographic forwarding, the GSS algorithm is
combined with SecuTPGF. By making fewer of the potential nodes closest to the sink
go to sleep, the proposed GSS algorithm can obtain a better first transmission path
when transmitting data in WMSNs with SecuTPGF.
References
1. Shu, L., Zhang, Y., Yang, L.T., Wang, Y., Hauswirth, M., Xiong, N.X.: TPGF: Geographic
Routing in Wireless Multimedia Sensor Networks. Telecommunication Systems 44(1-2),
79–95 (2010)
2. Shu, L., Zhang, Y., Zhou, Z., Hauswirth, M., Yu, Z., Hynes, G.: Transmitting and Gathering
Streaming Data in Wireless Multimedia Sensor Networks within Expected Network
Lifetime. In: ACM/Springer Mobile Networks and Applications(MONET), vol. 13(3-4), pp.
306–322 (2008)
3. Mulugeta, T., Shu, L., Hauswirth, M., Chen, M., Hara, T., Nishio, S.: Secure Two
Phase Geographic Forwarding Routing Protocol in Wireless Multimedia Sensor
Networks. In: Proc. Intl. Conf. Global Communication (Globecom 2010), Miami,
Florida, USA (2010); Diffie, W., Hellman, M.: New directions in cryptography.
IEEE Transactions on Information Theory 22(6), 644–654 (1976)
4. Guerrero-Zapata, M., Zilan, R., Barcel-Ordinas, J., Bicakci, K., Tavli, B.: The Future of
Security in Wireless Multimedia Sensor Networks. Telecommunication Systems 45(1), 77–
91 (2010)
5. Karlof, C., Wagner, D.: Secure routing in wireless sensor networks: Attacks and
countermeasures. In: Proc. First IEEE Intl. Workshop on Sensor Network Protocols
and Applications, in conjunction with ICC 2003, AK, USA, pp. 113–127 (2003)
6. Sakai, R., Ohgishi, K., Kasahara, M.: Cryptosystems based on pairing. In: Proc. Intl.
Symposium on 2000 Cryptography and Information Security, Okinawa, Japan, pp. 26–28
(2000)
7. Zhu, C., Yang, L.T., Shu, L., Joel Rodrigues, J.P.C., Hara, T.: Proc. Intl. Conf. Global
Communication(Globecom 2010), Miami, Florida, USA, pp. 140–127 (2010)
On Demand Logical Resource Replication Scheme
as a Service
1 Introduction
Cloud computing is the future of technology, with applications distributed in every
field. Cloud computing eliminates the need for efficient hardware resources and
infrastructure by providing the resources on a pay-as-you-go basis; all that is
required is a machine with a web browser. A cloud can be deployed in three ways,
viz. public, private [8] and hybrid. As for cloud delivery architecture, there
exist three delivery models, viz. SaaS [7], PaaS [7] and IaaS [7]. IaaS is the most
widely used delivery architecture, as it provides provisions for processing,
storage, networks and other fundamental computing resources. To increase system
reliability, some fault-tolerance mechanism should be used so that the system keeps
functioning in case of failure. One such method is replication, which replicates
the critical software components so that if one of them fails, the others can be
used to continue. The on demand logical resource replication scheme (OLRRS)
provides on-demand replication of logical resources (files, services), with a view to
minimizing network resource utilization by reducing the message exchange overhead,
thereby speeding up overall system performance.
The rest of the paper is organized as follows. Section 2 presents related work.
Section 3 introduces the proposed architecture in a private cloud environment.
Section 4 discusses the simulation and results with a case study for the OLRRS
approach, and Section 5 concludes, followed by future work.
2 Related Work
Replication in a cloud environment is done to achieve high availability of
resources. Resources can be replicated dynamically or on demand to minimize, to
some extent, the overhead of maintaining the consistency of the replicated files.
Similar work has been carried out in distributed environments, considering the
various performance issues that can arise and affect overall system performance.
Richard T. Hurley and Soon Aun Yeap [1] proposed a file replication and migration
policy by which the total mean response time for a requested file at a particular
site can be reduced. Sometimes, instead of file replication, process migration is
preferred to achieve better system performance and minimal utilization of network
resources. A similar approach was taken by Anna Hac [3], who proposed file
replication, file migration and process migration techniques based on the workload
of the local and remote hosts. A concern with cloud computing is how to deploy an
application on the cloud and in what manner the deployed application should be
delivered as a service. Pengzhi Xu et al. [5] presented a prototype named
PosixCloud, which is designed to deliver general-purpose cloud storage via a
standard POSIX interface and supports traditional applications based on the
standard file system interface. To facilitate logical resource (file, service)
replication in a cloud environment, a replication mechanism is required which
facilitates replication while considering the issues of replication in cloud and
distributed computing. Wei-Tek Tsai et al. [6] proposed a replication scheme for
services: whenever the number of requests exceeds what a service can handle,
additional resources are acquired by replicating the service.
3 Proposed Approach
On-demand replication of logical resources (files or services) is achieved by
replicating a resource when the number of requests for it reaches a threshold
value. The proposed architecture is discussed below:
3.1 Architecture
The proposed architecture implements the on demand logical resource replication
scheme (OLRRS). Figure 1 shows the set of peer servers, called File Replication
Servers (FRS), responsible for providing the replication service in the cloud
environment. Based on the number of requests received for a particular file by an
FRS, the resource (file) is replicated when the total number of requests reaches a
threshold value.
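A minimal sketch of this trigger logic follows; the threshold value, the counter-reset policy and the store_replica method are illustrative assumptions, not details fixed by the paper:

```python
from collections import Counter

class FileReplicationServer:
    """FRS peer that replicates a file once its request count
    reaches the threshold."""
    def __init__(self, peers, threshold=10):
        self.peers = peers            # the other FRS peer servers
        self.threshold = threshold
        self.requests = Counter()

    def on_request(self, file_id):
        self.requests[file_id] += 1
        if self.requests[file_id] >= self.threshold:
            self.replicate(file_id)
            self.requests[file_id] = 0    # assumed: reset after replicating

    def replicate(self, file_id):
        for peer in self.peers:           # push a copy to each peer server
            peer.store_replica(file_id)   # stand-in for the transfer step
```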
[Figure: total number of messages exchanged by the OLRRS, RR and RRA schemes.]
5 Conclusion
The on demand logical resource replication scheme (OLRRS) proposed in this paper
aims at implementing the replication mechanism with the minimum number of messages
required to complete the replication operation, thus minimizing network resource
utilization. The OLRRS approach ensures several types of transparency, viz.
migration, access and performance. Migration decisions about file movement are
made by the OLRRS approach without any user intervention. It is responsible for
replicating a file from one peer server to another when the total number of
requests on a peer server for transferring the file reaches the threshold value.
This enhances the performance of the peer servers, which contributes to overall
system performance.
In the future, we will enhance the OLRRS approach, in particular by proposing a
consistency mechanism and by improving reliability and the security setup, so that
the purpose of building the OLRRS approach remains intact.
References
1. Hurley, R.T., Yeap, S.A.: File migration and file replication: a symbiotic relationship. IEEE
Trans. on Parallel and Distributed Systems 7(6), 578–586 (1996)
2. Mei, A., Mancini, L.V., Jajodia, S.: Secure Dynamic Fragment and Replica Allocation in
Large-Scale Distributed File System. IEEE Trans. on Parallel and Distributed Sys-
tems 14(9), 885–896 (2003)
3. Hac, A.: A Distributed Algorithm for Performance Improvement Through File Replication,
File Migration, and Process Migration. IEEE Trans. on Software Engg. 15(2), 1459–1470
(1989)
4. Xiong, K., Perros, H.: Service Performance and Analysis in Cloud Computing. In: World
Conference on Services-I, pp. 693–700 (2009)
5. Xu, P., Zheng, W., Wu, Y., Huang, X., Xu, C.: Enabling cloud storage to support traditional
applications. In: 5th Annual ChinaGrid Conference, pp. 167–172 (2010)
6. Tsai, W.-T., Zhong, P., Elston, J., Bai, X., Chen, Y.: Service Replication with MapReduce
in Clouds. In: 10th International Symp. on Autonomous Decentralized Systems (ISADS),
pp. 381–388 (2011)
7. Cloud Computing – A Primer. The Internet Protocol Journal 12(3),
http://www.cisco.com/web/about/ac123/ac147/archived_issues/
ipj_12-3/123_cloud1.html (accessed October 22, 2011)
8. Zheng, L., Hu, Y., Yang, C.: Design and Research on Private Cloud Computing Architec-
ture to Support Smart Grid. In: International Conf. on Intelligent Human-Machine Systems
and Cybernetics (IHMSC), August 26-27, pp. 159–161 (2011)
9. Spector, A.Z.: Performing remote operations efficiently on a local computer
network. Communications of the ACM 25(4), 246–259 (1982)
Data Storage Security Model for Cloud Computing
1 Introduction
The apparent benefit of the cloud computing model is to relieve the user of the
burden of storing and maintaining data or computing resources locally. This
drastically reduces the initial investment of any organization and provides a
pay-as-you-go model. In spite of these noticeable advantages, cloud computing has
not been widely adopted in practice due to security and privacy concerns. Along
with these, other traditional IT security issues such as integrity,
confidentiality, availability, reliability, non-repudiation, efficient retrieval,
and data sharing have the same significance in cloud computing. Among all these,
data storage correctness is one of the important security issues in the cloud.
Various methods are being adopted for data storage correctness. A trusted third
party such as a cryptographic coprocessor is preferred by many researchers
[2][3][6][8][11], but it adds additional cost on the cloud users' part for extra
hardware. Alternatively, one can implement the functionality of a cryptographic
coprocessor using open source code in the form of a client application [1][8];
this can prove to be a cost-effective solution with some compromise at the
performance level.
In this paper, we aim to provide a client-application-based data storage security
model. The rest of the paper is organized as follows. Section 2 discusses the
recent work carried out, followed by the problem statement in Section 3. Section 4
presents our proposed scheme in detail, along with a validation of the planned
goals against the design. Section 5 includes some possible techniques to implement
the core components of this model, and Section 6 concludes, followed by the
references.
2 Related Work
This section illustrates recent research in Cloud data storage correctness. There are
few approaches which make use of soft client applications without use of extra
hardware. Kamara at el. [1] propose a template of complete secure storage structure
without mentioning much on implementation of components involved. Pearson et al.
[8] describe a privacy manager which protects the data being stolen or misused.
Though both of these approaches reduce the burden of extra hardware cost from
Cloud user/provider, the performance is compromised to some extent, which can be
improved with third party auditor (TPA) and/or additional hardware such as
cryptographic coprocessors.
Recently, a few researchers have proposed approaches based on a third party auditor
(TPA). Wang et al. [2] propose an approach which enables public auditability for
Cloud data storage security through an external TPA, without demanding a local copy of
data or imposing extra online burden on the Cloud. Gowrigolla et al. [12] outline a data
protection scheme with public auditing which allows data to be stored in encrypted
form on the Cloud server without loss of accessibility or functionality for authorized
users. Homomorphic tokens are utilized by Wang et al. [3] and Tribhuwan et al.
[10] to achieve data storage correctness. Wei et al. [4] develop an auditing scheme
which seeks data storage security, computation correctness and privacy preservation with
the help of a probabilistic sampling technique and a verifier technique. Chuang et al. [5]
design an Effective Privacy Protection Scheme (EPPS) which provides privacy protection
according to the user's demand and also claims to maintain performance.
Tamper-proof cryptographic coprocessors configured by a trusted third party are
proposed by Itani et al. [6] and Ram et al. [11] to solve the problem of securely
processing confidential data in Cloud infrastructure based on various trust levels.
Cheng et al. [7] make use of a Trusted Platform Module (TPM) with sealed storage
ability. While enjoying the benefits of improved performance through extra hardware,
these approaches place a cost burden on the Cloud users'/providers' side. Security issues
for cross-Cloud environments are addressed by Li et al. [9]. Xu et al. [13] address the
security problem in the direction of securing document services. Yu et al. [14] argue
that the Cloud data security problem should be solved from a data life cycle
perspective.
As every proposal discussed here has its own way of understanding the problem of
data storage correctness, none of them handles the problem from all facets. For
instance, ignoring the dynamic nature of the Cloud or adding unnecessary cost on the
user's part may deter users from adopting the Cloud.
In this section, we propose a data storage security model which intends to solve the
data security problem from multiple facets. The first part outlines the design goals
which we aim to achieve, and the second part describes the proposed model.
(A) Registration Phase: The CDO and CDU register themselves with the Cloud before they
start accessing data, by providing their unique identification (Customer_ID) and
password-type code (Customer_Code). This information is stored in the Cloud Customer
Registration Master Table (table 1) maintained by the Cloud, for future customer
verification by the CSP. (B) Pre-Storage Phase: Prior to storing (encrypted) data into
the Cloud, the CDO computes a Hash/MAC code of the file; this code is kept locally and
is also recorded in table 2 at the CSP.
(C) Verification Phase: At any time, the CDO/CDU can use this phase to check the
integrity of data by issuing a QUERY to the CSP; the CSP returns an answer in the form
of a code REPLY, which is compared with the CDO's locally stored code for the same file
(or the code can be re-computed). The integrity of the data is considered protected if
the two are the same. (D) Grant Rights Phase: The CDO issues a coupon (table 3) to the
CDU and informs the CSP by sending some of the coupon's information (table 4) to the CSP.
To grant an access right to another CDU, the CDO can write a line to table 4.
Granting and revoking rights are performed through a simple SQL query (treating table 4
as an SQL table named, say, Access_Rights), such as:
UPDATE Access_Rights SET Access_Rights = 'V' WHERE File_ID = '101034' AND User_ID = '101';
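The same operation can be issued from the client application; below is a minimal
sketch using Python's built-in sqlite3 module, where the table and flag names are
illustrative assumptions, not fixed by the model:

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE IF NOT EXISTS Access_Rights
              (File_ID TEXT, User_ID TEXT, Access_Rights TEXT)""")
db.execute("INSERT INTO Access_Rights VALUES ('101034', '101', 'N')")

def set_rights(file_id, user_id, flag):
    # granting writes one flag value (e.g. 'V'); revoking simply writes another
    db.execute("UPDATE Access_Rights SET Access_Rights = ? "
               "WHERE File_ID = ? AND User_ID = ?", (flag, file_id, user_id))
    db.commit()

set_rights("101034", "101", "V")   # grant viewing rights, as in the query above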
Based on sensitivity, users' data can be divided into three categories: (a) not
sensitive (fully trusted model), (b) highly sensitive data (not trusted model) and (c)
moderately sensitive data (partial cryptographic primitives). In case of a
legitimate dispute, we optionally provide an audit trail in the form of a log file
(table 5) to the CDO, which tells him the changes made by other CDUs in the files he owns.
As mentioned earlier, confidentiality and integrity are two of the main goals to
be achieved in Cloud computing. Both operations in our model are achieved as
described below.
(A) Encryption Process (performed offline): Performed at the CDO's or CDU's site;
the customer can choose an encryption algorithm along with an appropriate key, or use a
custom-designed algorithm. Two broadly known options for encryption, viz.
symmetric key encryption (e.g., AES) and asymmetric key encryption (e.g., RSA), may
be used here. The keys are to be stored and maintained by the data owner, per file,
locally. (Alternatively, a trusted third party can take care of the storage
and maintenance of these keys.) (B) Verifying Data Integrity: Simply downloading
the data for integrity verification is not a practical solution due to the expense
in I/O cost and unsafe file transfer across the network, which may lead to new
vulnerabilities [16]. Moreover, legal regulations such as HIPAA [17] further
demand that the outsourced data not be leaked to external parties (e.g., a TPA). So
applying encryption before outsourcing is the most preferred way to mitigate the
privacy concern.
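As a sketch of the offline encryption step (A), a symmetric scheme can be wrapped in a
few lines; here we assume the third-party Python "cryptography" package, and the file
contents are illustrative:

# the per-file key stays with the CDO; only the ciphertext is outsourced
from cryptography.fernet import Fernet  # AES-based symmetric encryption

key = Fernet.generate_key()        # stored and maintained locally, per file
cipher = Fernet(key)

plaintext = b"sensitive file contents"     # stands in for the file to outsource
ciphertext = cipher.encrypt(plaintext)     # this is what goes to the Cloud
assert cipher.decrypt(ciphertext) == plaintext   # offline decryption at CDO/CDU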
Along with MD5 and MAC, proof of storage [15] is a widely used protocol for
checking the integrity of data stored on a remote server. The algorithms can be
run as many times as the user wants, and they do not incur excessive
communication or computation overhead. They produce a very small amount of
information (irrespective of the size of the data file) which can be exchanged between
the user and the Cloud any number of times.
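As an illustration of how little data such a check exchanges, a hash/MAC-based
QUERY/REPLY comparison can be sketched as follows; the key, file contents and the
simulated REPLY are illustrative assumptions, not the paper's exact protocol:

import hmac
import hashlib

def mac_code(data: bytes, key: bytes) -> str:
    return hmac.new(key, data, hashlib.sha256).hexdigest()

key = b"per-file secret held by the CDO"
outsourced = b"encrypted file contents"
local_code = mac_code(outsourced, key)   # recorded at pre-storage time

reply_code = local_code                  # stands in for the CSP's REPLY
ok = hmac.compare_digest(local_code, reply_code)
print("integrity protected" if ok else "integrity violated")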
Next comes the process of sharing data. As shown in figure 1, steps 4 and 5
illustrate the sharing requirement. In step 4, the CDO issues a coupon (as shown in
table 3) to the CDU, so that the CDU can download (retrieve) and decrypt the file. In
step 5, the CDO also issues a part of the coupon to the CSP (which can be used by the
CSP to allow the data user's request to use the data), as shown in table 4. Taking care
of Cloud dynamism in terms of revoking an access right and deregistering a user from
the Cloud is just a matter of executing a query, as shown earlier.
Having proposed the scheme for data storage correctness, we now cross-verify the goals
planned in the problem statement. (a) Storage correctness: The CDO can at any time
request the CSP to confirm data correctness. The CDO issues a QUERY to the CSP, and the
CSP gives a REPLY in the form of the code Hash/MAC_Code stored in table 2. The CDO
re-computes the same code (offline) and compares it with the code received to check that
data integrity is protected. (b) Encryption based on sensitivity of data: Data is
divided into three categories based on its sensitivity, viz. (i) not sensitive, (ii)
moderately sensitive and (iii) highly sensitive. The CDO specifies the
encryption/encoding algorithm based on data sensitivity in the fields
Encryption_Algo_Type and Encoding_Algo_Type in table 5. (c) Lightweight: The two main
functionalities, viz. confidentiality and integrity, are achieved through encryption and
encoding algorithms, and we propose that both operations be performed offline on the
premises of the CDO or CDU. To check the integrity of the data, i.e., storage
correctness, only a small piece of data (a hash or MAC code), independent of the file
size, is exchanged between the CSP and the CDO. (d) Dynamism: Granting and revoking
access rights to or from a CDU is just a matter of writing a small SQL query and
updating table 4, as shown earlier. After every modification, the file size is updated
in table 5 by the CDO/CDU. The log table (table 5) records the trail of changes made by
Cloud users. An increase in the number of users is not a matter of great disturbance for
the Cloud, as it merely increases the number of rows in table 1.
(e) No data duplication: Correctness can be measured without asking for a local copy of
the data, even when the data is in encrypted form. Decryption is also done offline at
the site of the CDO/CDU. (f) Legal and compliance issues: In case of any legal dispute,
if the data owner is under investigation, enforcement agencies cannot get the data
directly from the CSP. Sometimes there may be a difference of opinion on who made
changes to a shared file; the access control log file (table 5) keeps complete track of
the changes made by various users on a file, and the CSP may give this log to the CDO or
another regulatory bureau upon request. (g) Data type or format: We make no assumption
regarding the data file type or its format, whereas approaches such as [13] rely heavily
on them. Apart from this, our solution does not bring in new vulnerabilities, it places
no additional online burden on the Cloud in terms of data transportation, and the
dynamic nature of Cloud storage is taken care of.
6 Conclusion
In this paper, we proposed a Cloud data storage model. This model aims to achieve
lightweight storage correctness along with provision for the dynamic nature of the
Cloud. We emphasize that the proposed design prototype is meant only as an
illustration. The main part of this model is a client application built to open source
standards, which is downloaded by the Cloud customer (CDO/CDU) from the CSP at the
beginning of the entire process (one time only), and which provides all the
functionalities related to encryption/decryption, key management, encoding/decoding,
and integrity checking, such as MAC, hash, and proof-of-storage protocols.
References
1. Kamara, S., Lauter, K.: Cryptographic Cloud Storage. In: Proceedings of the 14th
International Conference on Financial Cryptography and Data Security, FC 2010, pp. 136–
149. Springer, Heidelberg (2010)
2. Wang, C., Wang, Q., Ren, K., Lou, W.: Privacy-preserving public auditing for data storage
security in cloud computing. In: 2010 Proceedings IEEE INFOCOM, vol. 54(2), pp. 1–9
(2010)
3. Wang, C., Wang, Q., Ren, K., Lou, W.: Ensuring data storage security in cloud computing.
Cryptology ePrint Archive, Report 2009/081 (2009)
4. Wei, L., Zhu, H., Cao, Z., Jia, W., Vasilakos, A.V.: Seccloud: Bridging secure storage and
computation in cloud. In: Distributed Computing Systems Workshops (2010)
5. Chuang, I.H., Li, S.H., Huang, K.C., Kuo, Y.H.: An effective privacy protection scheme
for cloud computing. In: 2011 13th International Conference on Advanced Communication
Technology (ICACT), pp. 260–265 (2011)
6. Itani, W., Kayssi, A., Chehab, A.: Privacy as a service: Privacy-aware data storage and
processing in cloud computing architectures. In: Proceedings of the 2009 Eighth IEEE
International Conference on Dependable, Autonomic and Secure Computing, DASC 2009,
pp. 711–716. IEEE Computer Society, Washington, DC (2009)
7. Cheng, G., Ohoussou, A.: Sealed storage for trusted cloud computing. In: International
Conference on Computer Design and Applications (ICCDA), vol. 5, pp. V5-335 –V5-339
(2010)
8. Pearson, S., Shen, Y., Mowbray, M.: A Privacy Manager for Cloud Computing. In: Jaatun,
M.G., Zhao, G., Rong, C. (eds.) CloudCom 2009. LNCS, vol. 5931, pp. 90–106. Springer,
Heidelberg (2009)
9. Li, W., Ping, L.: Trust Model to Enhance Security and Interoperability of Cloud
Environment. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) CloudCom 2009. LNCS,
vol. 5931, pp. 69–79. Springer, Heidelberg (2009)
10. Tribhuwan, M., Bhuyar, V., Pirzade, S.: Ensuring data storage security in cloud computing
through two-way handshake based on token management. In: 2010 International
Conference on Advances in Recent Technologies in Communication and Computing
(ARTCom), pp. 386–389 (2010)
11. Ram, C., Sreenivaasan, G.: Security as a service (sass): Securing user data by coprocessor
and distributing the data. In: Trendz in Information Sciences Computing (TISC), pp. 152–
155 (2010)
12. Gowrigolla, B., Sivaji, S., Masillamani, M.: Design and auditing of cloud computing
security. In: 2010 5th International Conference on Information and Automation for
Sustainability (ICIAFs), pp. 292–297 (2010)
13. Xu, J.-S., Huang, R.-C., Huang, W.-M., Yang, G.: Secure Document Service for Cloud
Computing. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) CloudCom 2009. LNCS,
vol. 5931, pp. 541–546. Springer, Heidelberg (2009)
14. Yu, X., Wen, Q.: A view about cloud data security from data life cycle. In: 2010
International Conference on Computational Intelligence and Software Engineering (CiSE),
pp. 1–4 (2010)
15. Juels, A., Kaliski Jr., B.S.: Pors: proofs of retrievability for large files. In: Ning, P., di
Vimercati, S.D.C., Syverson, P.F. (eds.) ACM Conference on Computer and
Communications Security, pp. 584–597. ACM (2007)
16. Shah, M.A., Baker, M., Mogul, J.C., Swaminathan, R.: Auditing to keep online storage
services honest. In: Proceedings of the 11th USENIX Workshop on Hot Topics in
Operating Systems, pp. 11:1–11:6. USENIX Association, Berkeley (2007)
17. 104th United States Congress: Health Insurance Portability and Accountability Act of
1996 (HIPAA) (1996), http://aspe.hhs.gov/admnsimp/pl104191.htm
18. Advanced encryption standard (AES) (FIPS pub. 197) (2001)
19. FIPS 46-3: Data Encryption Standard (DES). (fips pub 46-3) (1999)
20. Rivest, R., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-
key cryptosystems. Communications of the ACM 21, 120–126 (1978)
21. Certicom Research. Standards for efficient cryptography, SEC 1: Elliptic curve
cryptography, Version 1.0 (2000), http://www.secg.org/
22. Goyal, V., Pandey, O., Sahai, A., Waters, B.: Attribute-based encryption for fine-grained
access control of encrypted data. In: Proceedings of the 13th ACM Conference on
Computer and Communications Security, CCS 2006, p. 89 (2006)
23. Ostrovsky, R., Sahai, A., Waters, B.: Attribute-based encryption with non-monotonic
access structures. In: Proceedings of the 14th ACM Conference on Computer and
Communication Security, CCS 2007, pp. 195–203. ACM, New York (2007)
24. Ateniese, G., Burns, R., Curtmola, R., Herring, J., Kissner, L., Peterson, Z., Song, D.:
Provable data possession at untrusted stores. In: Proceedings of the 14th ACM Conference
on Computer and Communications Security, CCS 2007, pp. 598–609. ACM, NY (2007)
25. Rivest, R.: The md5 message-digest algorithm (1992)
26. National Institute of Standards and Technology: FIPS 180-2, Secure Hash Standard,
Federal Information Processing Standard (FIPS), Publication 180-2. Tech. rep.,
Department of Commerce (2002)
27. Neuman, B.M., Miller, S.P., Neuman, B.C., Schiller, J.I., Saltzer, J.H.: Section e.2.1
kerberos authentication and authorization system. Project Athena Technical Plan (1987)
28. Housley, R., Ford, W., Polk, W., Solo, D.: Internet x.509 public key infrastructure
certificate and crl profile (1999)
29. Sanka, S., Hota, C., Rajarajan, M.: Secure Data Access in Cloud Computing. In: 2010
IEEE 4th International Conference on Internet Multimedia Services Architecture and
Application (IMSAA), pp. 1–6 (2010)
30. Triple data encryption algorithm. Technical Report Federal Information Processing
Standard Publication 46-3, standard ANSI X9.52-1998, NIST (1998)
31. Rescorla, E.: Diffie-Hellman Key Agreement Method. RFC2631 (1999)
Testing of Reversible Combinational Circuits
Keywords: Low power design, reversible logic, fault models, testing, signature
analysis.
1 Introduction
Nowadays, power dissipation plays an important role in the design of VLSI circuits,
especially with the increasing trend of packing more and more logic elements
into smaller and smaller volumes and clocking these circuits at higher frequencies.
The logic elements are normally irreversible in nature and, according to Landauer [1],
irreversible logic computation results in energy dissipation due to power loss. This is
because the erasure of each bit of information dissipates at least $kT \ln 2$ joules of
energy, where k is Boltzmann's constant and T is the absolute temperature at which the
operation is performed. If Moore's law continues to hold, it is predicted that by the
year 2020 this will become a substantial part of energy dissipation.
Further, Bennet [2] showed that the energy dissipation problem of VLSI circuits
designed using conventional (irreversible) logic gates can be overcome by using
reversible logic gates. This is because reversible logic naturally takes care
of heating, since, in reversible circuits, the input vectors can be uniquely recovered
from their corresponding output vectors.
Reversible computation requires reversible logic circuits, and the synthesis of these
circuits differs significantly from its irreversible counterpart because of several
factors [3]. Reversible circuits are designed using reversible gates, i.e., logic
gates that generate a unique output vector from each input vector and vice versa:
there is a one-to-one mapping between the input and output vectors. A
number of reversible gates, such as the Fredkin, Toffoli, Peres, and Feynman gates, are
available for the synthesis of reversible logic circuits. Synthesizing a logic circuit
using reversible gates should satisfy the following criteria: a minimum number of
reversible gates and constant inputs, with few garbage outputs. Two restrictions that
have to be followed while designing reversible circuits are (i) fan-out is not allowed
and (ii) feedback loops are not permitted.
Decoders and encoders are some of the basic building blocks of complex
combinational circuits; they play an important role either in deriving a number of
control signals from a binary coded signal or in generating a number of outputs based
on single-bit information.
Testing and failure analysis of logic circuits are very important during and after
design and manufacturing, to ensure their functionality and durability. The testing
problems posed by irreversible circuits are very challenging due to the complexity of
their normal and faulty behavior models [4], [5], [6]. However, the complexity of test
generation is lower for reversible circuits than for conventional irreversible ones;
viz., all multiple stuck-at faults are covered by a complete test set for single stuck-at
faults [7].
Faults can be located at several places in a circuit, and it is difficult to determine
where errors are likely to occur because there is no commonly agreed upon technology
for building reversible logic circuits. In order to test for all likely faults
accurately, different fault models, such as the stuck-at, single missing gate,
and multiple missing gate fault models [8], are applied to the reversible decoder,
encoder, and priority encoder circuits realized using Fredkin gates.
Built-in self-test (BIST) is well known for its fast, inexpensive, on-chip
test process for testing integrated circuits [9]. An algorithm is proposed for fault
localization by a preset method, and testing of the designed circuits to determine the
fault coverage is carried out using reversible linear feedback shift register
(LFSR) [10] and multiple input signature register (MISR) techniques.
The remainder of the paper is organized as follows: Section 2 gives the realization
of the decoder, encoder and priority encoder using Fredkin gates. Fault models and
testing aspects are discussed in Section 3, and finally Section 4 concludes the paper.
2 Realization Using Fredkin Gates
Reversible decoder, encoder and priority encoder circuits are realized using Fredkin
gates. The Fredkin gate, shown in Figure 1(a), is a (3, 3) reversible gate with
(A, B, C) and (P, Q, R) as the input and output vectors, respectively. The outputs are
given by $P = A$, $Q = \bar{A}B \oplus AC$, and $R = \bar{A}C \oplus AB$. Figure 1(b)
gives the quantum representation of the Fredkin gate shown in Figure 1(a). Its quantum
cost, as estimated from Figure 1(b), is 5.
Fig. 1. (a) Fredkin gate (b) Quantum representation
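The controlled-swap behavior of the Fredkin gate, and the one-to-one input/output
mapping that makes it reversible, can be checked with a few lines of Python (a sketch
for illustration only):

from itertools import product

def fredkin(a, b, c):
    # P = A; B and C are swapped when the control line A is 1, which matches
    # Q = A'B xor AC and R = A'C xor AB for Boolean inputs
    return (a, c if a else b, b if a else c)

# reversibility: the gate is a bijection on the 8 input vectors, so every
# output vector identifies a unique input vector
vectors = list(product((0, 1), repeat=3))
assert len({fredkin(*v) for v in vectors}) == len(vectors)
assert all(fredkin(*fredkin(*v)) == v for v in vectors)  # Fredkin is its own inverse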
$Y_1 = En \cdot I_0 \cdot \overline{I_1}$ (2)
$Y_2 = En \cdot \overline{I_0} \cdot I_1$ (3)
$Y_3 = En \cdot I_0 \cdot I_1$ (4)
where En is the enable input, and I0 and I1 are the inputs. The circuit makes use of
three constant inputs and produces two garbage outputs, G1 and G2. The quantum cost and
depth for the circuit are 15.
$Y_0 = I_3 + I_1 \overline{I_2}$ (7)
$Y_1 = I_2 + I_3$ (8)
where I0, I1, I2 and I3 are the inputs and Y0, Y1 are the outputs. A reversible 4 to 2
priority encoder, shown in Figure 4, makes use of 3 Fredkin gates and 3 constant
inputs, producing 5 garbage outputs. The quantum cost and depth for the circuit are 15.
3 Testing
2 to 4 decoder            5    2    1
4 to 2 encoder           --    1    1
4 to 2 priority encoder   4    2    1
It should be noted that it is not possible to model the encoder circuit using the
stuck-at fault model, because the encoder circuit is designed under the assumption that
only one input is active at a time.
Testing of the proposed circuits is carried out using the BIST technique to determine
their fault coverage by introducing stuck-at, single missing gate, and multiple missing
gate faults.
An N-bit reversible LFSR and MISR are shown in Figures 5 and 6, respectively.
The reversible D flip-flop used in the LFSR and MISR is given in Figure 7. The MISR
produces a signature which is said to be good if the circuit is not faulty. During the
testing process, faults are introduced in the circuit under test (CUT), and the
outputs produced by the CUT are given as inputs to the MISR, thereby producing
signatures. For each fault, the signature produced is compared with the good signature.
If the two differ, the fault is said to be detected; otherwise it is not detected.
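A toy sketch of this signature comparison is given below; the register width, tap
positions and the stand-in circuit under test are illustrative assumptions, not the
reversible LFSR/MISR designs of Figures 5–7:

def misr_step(state, bits, taps=(0, 2)):
    # one clock of a toy multiple-input signature register: shift with XOR
    # feedback from the tapped stages, folding one CUT output bit into each stage
    fb = 0
    for t in taps:
        fb ^= state[t]
    shifted = [fb] + state[:-1]
    return [s ^ b for s, b in zip(shifted, bits)]

def signature(cut, patterns, width=3):
    state = [0] * width
    for p in patterns:
        state = misr_step(state, cut(p))
    return state

patterns = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
good_cut = lambda p: list(p)            # fault-free CUT (identity, for illustration)
faulty_cut = lambda p: [p[0], p[1], 0]  # same CUT with one output line stuck-at-0

good = signature(good_cut, patterns)    # the "good signature"
print("fault detected" if signature(faulty_cut, patterns) != good else "not detected")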
Table 2 summarizes the testing results obtained through BIST. The good signature is
the signature obtained from the signature analyzer for the fault-free circuit and is
shown for the three types of faults in binary form.
Table 2. Testing results for the 2 to 4 decoder, 4 to 2 encoder, and 4 to 2 priority encoder
The number within brackets indicates the decimal equivalent of the good signature.
Table 2 lists the number of faults detected out of the total number of faults
created, and the percentage fault coverage for each of the three fault models
considered. The fault coverage is found to be 85% with the SAF model and 50% with the
MMGF model for the reversible 2 to 4 decoder, and is only 75% for the priority encoder
using the SAF model. In all other cases it is found to be 100%.
4 Conclusions
References
1. Landauer, R.: Irreversibility and Heat Generation in the Computing Process. IBM J.
Research and Development 5, 183–191 (1961)
2. Bennet, C.H.: Logical Reversibility of Computation. IBM J. Research and
Development 17, 525–532 (1973)
3. Shende, V.V., Prasad, A.K., Markov, I.L., Hayes, J.P.: Reversible Logic Circuit Synthesis.
In: International Conference on Computer – Aided Design, San Jose, California, USA, pp.
125–132 (2002)
4. Knill, E., Laflamme, R., Ashikhmin, A., Barnum, H., Viola, L., Zurek, W.H.: Introduction
to Quantum Error Correction. Los Alamos Science, 188–221 (2002)
5. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information.
Cambridge Univ. Press (2000)
6. Obenland, K.M., Despain, A.M.: Impact of errors on Quantum Computer Architecture.
Tech. Rep, University of Southern California (1996)
7. Patel, K.N., Hayes, J.P., Markov, I.L.: Fault Testing for Reversible Circuits. IEEE Trans.
on Computer Aided Design of Integrated Circuits and Systems 23(8), 1220–1230 (2004)
8. Hayes, J.P., Polian, I., Becker, B.: Testing for Missing-gate Faults in Reversible Circuits.
In: Proceeding of 13th Asian Test Symposium, Taiwan, pp. 100–105 (2004)
9. Schiniger, A.: Testing and Built-in Self Test – a Survey. Journal of Systems
Architecture 46, 721–747 (2000)
10. Chen, J., Vasudevan, D.P., Popovici, E., Schellekens, M.: Reversible Online BIST Using
Bidirectional BILBO. In: Proceeding of ACM International Conference on Computing
Frontiers, pp. 257–266 (2010)
11. Polian, I., Fiehn, T., Becker, B., Hayes, J.P.: A Family of Logical Fault Models for
Reversible Circuits. In: Proceeding of 14th Asian Test Symposium, pp. 422–427 (2005)
12. Vasudevan, D.P., Lala, P.K., Parkerson, J.P.: CMOS Realization of Online Testable
Reversible Logic Gates. In: Proceeding of IEEE Annual Symposium on Computer Society,
VLSI, pp. 309–310 (2005)
13. Ramasamy, K., Tagare, R., Perkins, E., Perkowski, M.: Fault Localization in Reversible
Circuits is Easier Than for Classical Circuits. In: Proceeding of the International Workshop
on Logic and Synthesis (2004)
Classification of Medical Images
Using Data Mining Techniques
1 Introduction
2 Related Work
3 Haralick Features
$$\mathrm{Contrast} = \sum_{n=0}^{N_g-1} n^2 \left[ \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} P(i, j) \right], \quad \text{where } |i - j| = n \qquad (3)$$
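A small sketch of eq. (3), assuming numpy and a pre-computed normalized gray-level
co-occurrence matrix P (the random matrix here only stands in for one):

import numpy as np

def haralick_contrast(P):
    # contrast (eq. 3): sum over n of n^2 times the co-occurrence mass at |i - j| = n
    Ng = P.shape[0]
    i, j = np.indices(P.shape)
    return sum(n ** 2 * P[np.abs(i - j) == n].sum() for n in range(Ng))

rng = np.random.default_rng(0)
P = rng.random((8, 8))
P /= P.sum()                  # normalize so entries behave like probabilities
print(haralick_contrast(P))   # equivalently np.sum(P * (i - j) ** 2)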
4 Tamura Features
Tamura et al. [6] propose six texture features corresponding to human visual perception:
coarseness, contrast, directionality, line-likeness, regularity and roughness. Among
these, the first three are found to correlate strongly with human perception and are
defined as follows:
Coarseness: The coarseness gives information about the size of the texture elements.
The coarseness measure is calculated as follows:
1. Take averages at every point over neighborhoods whose sizes are powers of two. The
average over the neighborhood of size $2^k \times 2^k$ at the point (x, y) is given by
$$A_k(x, y) = \sum_{i=x-2^{k-1}}^{x+2^{k-1}-1} \;\; \sum_{j=y-2^{k-1}}^{y+2^{k-1}-1} f(i, j) \Big/ 2^{2k} \qquad (4)$$
where m and n are the effective width and height of the image.
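The averaging step of eq. (4) can be sketched with an integral image, assuming numpy;
the subsequent steps that pick the best neighborhood size per pixel are not shown here:

import numpy as np

def neighborhood_averages(f, k):
    # A_k(x, y) of eq. (4): mean of f over the 2^k x 2^k window around (x, y),
    # computed via a summed-area table; border positions are left as NaN
    h = 2 ** (k - 1)
    S = np.zeros((f.shape[0] + 1, f.shape[1] + 1))
    S[1:, 1:] = f.cumsum(axis=0).cumsum(axis=1)
    out = np.full(f.shape, np.nan)
    for x in range(h, f.shape[0] - h + 1):
        for y in range(h, f.shape[1] - h + 1):
            window_sum = (S[x + h, y + h] - S[x - h, y + h]
                          - S[x + h, y - h] + S[x - h, y - h])
            out[x, y] = window_sum / 2 ** (2 * k)
    return out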
Contrast: In the narrow sense, contrast stands for picture quality. The contrast of an
image is calculated by $F_{con} = \sigma / (\alpha_4)^n$ with $\alpha_4 = \mu_4 / \sigma^4$,
where $\mu_4$ is the fourth moment about the mean, $\sigma^2$ is the variance, and n has
been experimentally determined to be 1/4.
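A direct sketch of this formula, assuming numpy (and a non-constant image, so the
variance is nonzero):

import numpy as np

def tamura_contrast(image, n=0.25):
    # F_con = sigma / (alpha_4)^n with alpha_4 = mu_4 / sigma^4 (the kurtosis)
    img = np.asarray(image, dtype=float)
    sigma2 = img.var()                      # sigma^2, the variance
    mu4 = ((img - img.mean()) ** 4).mean()  # fourth moment about the mean
    alpha4 = mu4 / sigma2 ** 2              # sigma^4 = (sigma^2)^2
    return np.sqrt(sigma2) / alpha4 ** n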
Directionality: To calculate the directionality, the horizontal and vertical derivatives
∆H and ∆V are measured by the Prewitt operator. The magnitude |∆G| and the local edge
direction θ are approximated as
$$|\Delta G| = (|\Delta H| + |\Delta V|) / 2 \qquad (8)$$
$$\theta = \tan^{-1}(\Delta V / \Delta H) + \pi/2 \qquad (9)$$
The resulting θ is a real number (0 ≤ θ < π) measured counterclockwise so that the
horizontal direction is 0. The desired histogram HD can be obtained by quantizing θ
and counting the points whose magnitude |∆G| is over the threshold t:
$$H_D(k) = N_\theta(k) \Big/ \sum_{i=0}^{n-1} N_\theta(i), \qquad k = 0, 1, \ldots, n-1 \qquad (10)$$
where $N_\theta(k)$ is the number of points at which $(2k-1)\pi/2n \le \theta < (2k+1)\pi/2n$
and $|\Delta G| \ge t$. Thresholding |∆G| by t is aimed at preventing the counting of
unreliable directions which cannot be regarded as edge points. In our experiments, we
have used n = 16 and t = 12.
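A sketch of the histogram construction, assuming numpy and the Prewitt filter from
scipy.ndimage; folding arctan2 into [0, π) plays the role of the +π/2 shift in eq. (9):

import numpy as np
from scipy.ndimage import prewitt

def directionality_histogram(image, n=16, t=12):
    # the n-bin edge-direction histogram HD of eq. (10)
    img = np.asarray(image, dtype=float)
    dh = prewitt(img, axis=1)                    # horizontal derivative, delta-H
    dv = prewitt(img, axis=0)                    # vertical derivative, delta-V
    mag = (np.abs(dh) + np.abs(dv)) / 2.0        # |delta-G|, eq. (8)
    theta = np.mod(np.arctan2(dv, dh), np.pi)    # direction folded into [0, pi)
    edges = theta[mag >= t]                      # keep only reliable edge points
    counts, _ = np.histogram(edges, bins=n, range=(0.0, np.pi))
    return counts / max(counts.sum(), 1)         # normalized HD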
A way of measuring the directionality quantitatively from HD is to compute the
sharpness of its peaks. The approach adopted is to sum the second moments around
each peak from valley to valley, if multiple peaks are determined to exist, in order to
compute the directionality from HD. This measure can be defined by
$$F_{dir} = 1 - r \cdot n_p \sum_{p}^{n_p} \sum_{\phi \in w_p} (\phi - \phi_p)^2 \, H_D(\phi) \qquad (11)$$
where $n_p$ is the number of peaks, $\phi_p$ is the position of the p-th peak, $w_p$ is
the range of angles attributed to the p-th peak, and r is a normalizing factor.
5 Wold Features
The 2-D Wold theory [7] allows an image pattern to be decomposed into two
mutually orthogonal components: deterministic and nondeterministic. The
deterministic component is further divided into a harmonic component and an
evanescent component. Let y(m, n) be a real valued, regular, and homogeneous
random field uniquely represented by the decomposition given by
y (m, n) = w(m, n) + p(m, n) + g (m, n) (12)
6 Classifiers
Table 1. Precision
Table 2. Recall
8 Conclusions
In this paper, we have described classification techniques for CT scan brain
images. Three texture features and two classifiers are compared for the classification
of CT scan brain images from a large database. The texture features compared are
Haralick, Tamura and Wold harmonic peaks, all implemented in Java. We used the J48 and
Random Forest classifiers of the WEKA data mining tool for our experiments. Both the
features and the classifiers are compared based on precision and recall measures, as
shown in Tables 1 and 2, respectively. Haralick features combined with the Random
Forest classifier are found to give the best results for the classification of CT scan
brain images.
References
1. John, S., Ioannis, V., et al.: Computer Aided Diagnosis based on Medical Image Processing
and Artificial Intelligence Methods. Nuclear Instruments and Methods in Physics Research
A 569, 591–595 (2006)
2. Zheng, B.: Computer-Aided Diagnosis in Mammography using Content-Based Image
Retrieval Approaches: Current Status and Future Perspectives. Algorithms, 828–849 (2009)
3. Antonie, M.-L., Zaane, O.R., Coman, A.: Application of Data Mining Techniques for
Medical Image Classification. In: Proc. of the 2nd Int. Workshop on Multimedia Data
Mining, San Francisco, USA (August 26, 2001)
4. Dua, S., Jain, V., Thompson, H.W.: Patient Classification using Association Mining of
Clinical Images, 978-1-4244-2003-2/08 2008 IEEE
5. Haralick, R., Shanmugam, K., Dinstein, I.: Textural Features for Image Classification. IEEE
Trans. on Systems, Man and Cybernetics 3(6), 610–621 (1973)
6. Tamura, H., Mori, S., Yamawaki, T.: Textural Features Corresponding to Visual Perception.
IEEE Trans. on Systems, Man and Cybernetics (June 1978)
7. Liu, F., Picard, R.W.: Periodicity, Directionality and Randomness: Wold Features for Image
Modeling and Retrieval. IEEE Trans. on Pattern Analysis and Machine Intelligence 18(7)
(July 1996)
A Novel Solution for Grayhole Attack
in AODV Based MANETs
1 Introduction
The role of MANETs has become vital in pervasive computing due to their self-
configurable and rapidly deployable nature. A MANET connects mobile devices
anytime and anywhere without any fixed infrastructure or centralized access point. To
form the ad-hoc network and stay connected, nodes act as routers to relay packets
for peer nodes. In an environment where nodes have limited transmission range and
high mobility, routes may change frequently [1]. As a result, routing becomes a major
challenge. The duty of establishing and maintaining routes is performed by special
routing protocols [2]. Among all routing protocols, AODV is the most popular on-
demand protocol. As the designers of AODV did not focus on its security aspects, malicious
nodes can perform many attacks just by not following the protocol rules. Moreover, the
inherent nature of MANETs makes them vulnerable to various kinds of attacks such as
spoofing, flooding, eavesdropping, modification of packet contents, routing table
overflow, route cache poisoning and DoS attacks, viz. the Wormhole, Sinkhole,
Grayhole and Blackhole attacks [3]. In this paper, we focus on security
against the Grayhole attack, one of the most dangerous DoS attacks, which disrupts the
basic functionality of AODV of delivering data packets from source to destination and
thus degrades network performance [4].
The Grayhole attack is a variation of the Blackhole attack in which the attacker
promotes itself as having the shortest valid route to the destination by sending
fabricated routing information [5]. As a result, a bogus route is created through the
malicious node, which drops the received packets for a specific time period and behaves
normally afterwards. This disturbs the route discovery process and absorbs network
traffic [4, 6]. Due to its unpredictable nature, detecting the Grayhole attack is not
an easy task.
In this paper, we provide a variation of AODV that detects and removes malicious
nodes performing the Grayhole attack. The primary objective of designing this routing
protocol is to set up the shortest secured route with minimal overhead and minimal
resource consumption. The protocol adds a little functionality to each node involved
in the session: an intermediate node receiving unusual routing information
in an RREP (Route Reply Packet) sent by a neighbor node considers that node a
malicious node. The intermediate node marks it as malicious in its routing table,
appends its information to the RREP, and marks that RREP with a do-not-consider
flag; every node receiving that RREP on the reverse path updates its routing table to
mark the node as malicious. Before sending an RREQ (Route Request Packet), a list
of malicious nodes is appended to it, and every node receiving the broadcast RREQ
marks entries of the listed nodes as malicious in its routing table. Thus, a node finds
the attacker either by checking the do-not-consider flag of an RREP, by checking the
malicious node's entry in its routing table, or by identifying fabricated routing
information in the received RREP. The solution uses the default control packets, RREP
and RREQ, to inform other nodes to ignore routing packets received from malicious
nodes, and thus malicious nodes are isolated.
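The screening rule can be summarized in a few lines of pseudocode; the sketch below is
illustrative only (the threshold on sequence-number jumps and the packet field names
are assumptions, not the authors' ns-2 implementation):

SEQ_JUMP_LIMIT = 100   # how far above the known sequence number counts as "unusual"

routing_table = {}     # node_id -> {"malicious": bool}

def ignore_rrep(rrep, known_dest_seq):
    # return True when the RREP must be discarded and its sender isolated
    sender = rrep["sender"]
    if rrep.get("do_not_consider"):
        return True                                    # flagged on the reverse path
    if routing_table.get(sender, {}).get("malicious"):
        return True                                    # listed earlier via an RREQ
    if rrep["dest_seq"] - known_dest_seq > SEQ_JUMP_LIMIT:
        routing_table.setdefault(sender, {})["malicious"] = True
        rrep["do_not_consider"] = True                 # warn nodes on the reverse path
        return True
    return False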
The remainder of the paper is organized as follows. Section 2 presents the related work.
Section 3 describes design of our solution to prevent Grayhole attack. Simulation
results and analysis are presented in Section 4. Section 5 concludes the paper.
2 Related Work
Vishnu et al. [7] proposed a mechanism that establishes a backbone network of
trusted nodes over the ad hoc network. An unused IP address is requested periodically by
the source node from a backbone node. During the route discovery process, a node
sends an RREQ to search for the destination node as well as the unused IP. If an
attacker is present, it sends an RREP for the unused IP as well. On a positive response
for the unused IP, the source node starts the detection process. The mechanism, though,
assumes the network is divided into several grids and that trusted nodes have powerful
batteries and high transmission range; it also assumes that a node entering the network
is capable of finding its grid location and that the number of malicious nodes at any
point is less than the number of normal nodes, which may not hold in several situations.
Mamatha et al. [8] provided a robust approach to detect and prevent network layer
attacks in MANETs.
3 R-AODV Protocol
Fig. 1(a) shows a MANET using the AODV protocol in which a Grayhole attacker M is
present. S receives an RREP from M with an unusually high sequence number in response
to the broadcast RREQ, while destination D sends an RREP having a legitimately higher
sequence number. As the RREP sent by M contains the highest sequence number of
all received RREPs, S unknowingly selects the path through M to transfer data packets,
and as a result M drops some of the received packets for a specific time, causing
denial of service in the network.
4 Simulation Results and Analysis
Our simulations are performed using the ns-2 (ver. 2.34) simulation tool [14]. A new
routing agent containing the Grayhole attack is included in ns-2. We randomly move 5 to
30 nodes in an area of 800 m x 800 m for a simulation time of 50 seconds. The
transmission range of each node is 250 m. Table 1 shows the simulation parameters along
with their values. To analyze the performance of R-AODV, we use the following metrics:
Packet Delivery Ratio (PDR): The ratio of number of data packets received by the
application layer of destination nodes to the number of packets transmitted by the
application layer of source nodes.
Average End-to-End Delay: The average time taken by transmitted data packets to reach
their corresponding destinations.
Normalized Routing Overhead: The ratio of number of routing control packets to the
number of data packets.
Table 1. Simulation parameters
Parameter                   Value
Terrain Area                800 m x 800 m
Simulation Time             50 sec
Traffic Type                CBR (UDP)
Maximum Bandwidth           2 Mbps
Transmission Range          250 m
Data Payload                512 Bytes/Packet
Pause Time                  2.0 sec
Maximum Speed               20 m/sec
Number of Nodes             5 to 30
Number of Grayhole Nodes    1 to 7
We take ideal scenarios with zero packet loss for AODV in order to measure the exact
effects of the Grayhole attack. Fig. 3(a) shows the performance of R-AODV under
Grayhole attack for varying network size in the presence of a single attacker; R-AODV
gives a tremendous improvement in PDR, equivalent to that of AODV under normal
conditions, as long as an alternative genuine node is available to replace the isolated
malicious node and establish an alternate secured route. Even when multiple attackers
are present, R-AODV proves its reliability by excluding all malicious nodes attempting
to take part in the data transmission phase. Fig. 3(b) shows the performance comparison
of AODV and R-AODV in the presence of multiple attackers for a MANET containing 20
nodes; as the number of attackers increases, the PDR of AODV decreases, while R-AODV
performs equally well and gives a significant rise in PDR. Fig. 3(c) depicts the
performance of R-AODV in terms of end-to-end delay for varying network size in the
presence of a single attacker; R-AODV shows remarkable improvement in end-to-end delay
in comparison with AODV under attack. Fig. 3(d) shows the graph comparing
normalized routing overhead with increasing number of nodes. In the presence
of an attacker, R-AODV proves its efficiency with a noticeable decrease in normalized
routing overhead compared to AODV, as no extra control packets are added to
R-AODV.
Fig. 3. (a) PDR with single attacker (b) PDR with multiple attackers (c) End-to-end delay vs. network size (d) Routing overhead vs. network size
5 Conclusion
DoS attacks based on packet forwarding misbehavior have become major security
threats for the AODV protocol in MANETs. In this paper, we presented an alternative
to the AODV protocol, called R-AODV, which proves its reliability against the
Grayhole attack. Under the attack, AODV cannot perform its basic functionality of
reliably transferring all data packets to the destination and its performance drops
significantly, while R-AODV detects and isolates multiple malicious nodes
performing the attack and fulfills its design objectives. This novel solution finds
the shortest secured route without adding extra control packets and gives nearly the
same PDR as default AODV, with negligible difference in normalized routing overhead and
end-to-end delay. R-AODV is equally applicable to the Blackhole attack.
References
1. Qasim, N., Said, F., Aghvami, H.: Performance Evaluation of Mobile Ad Hoc Networking
Protocols. In: World Congress on Engineering, pp. 219–229 (2008)
2. Raj, P.N., Swadas, P.B.: DPRAODV: A Dynamic Learning System against Black hole At-
tack in AODV Based MANET. International Journal of Computer Science Issues 3, 54–59
(2010)
3. Mistry, N., Jinwala, D.C., Zaveri, M.: Improving AODV Protocol against Black hole At-
tacks. In: International Multiconference of Engineers and Computer Scientists, vol. 2, pp.
1034–1039 (2010)
4. Xiaopeng, G., Wei, C.: A Novel Gray Hole Attack Detection Scheme for Mobile Ad-Hoc
Networks. In: IFIP International Conference on Network and Parallel Computing, pp. 209–
214 (2007)
5. Jhaveri, R.H., Patel, A.D., Parmar, J.D., Shah, B.I.: MANET Routing Protocols and
Wormhole Attack against AODV. International Journal of Computer Science and Network
Security 10(4), 12–18 (2010)
6. Bala, A., Bansal, M., Singh, J.: Performance Analysis of MANET under Black hole At-
tack. In: 1st International Conference on Networks & Communications, pp. 141–145
(2009)
7. Vishnu, K., Paul, A.J.: Detection and Removal of Cooperative Black/Gray hole Attack in
Mobile ADHOC Networks. International Journal of Computer Applications 1(22), 38–42
(2010)
8. Mamatha, G.S., Sharma, S.C.: A Robust Approach to Detect and Prevent Network Layer
Attacks in MANETS. International Journal of Computer Science and Security 4(3), 275–
284 (2010)
9. Jhumka, A., Griffiths, N., Dawson, A., Myers, R.: An Outlook on the Impact of Trust Models
on Routing in Mobile Ad Hoc Networks (MANETs). In: Networking and Electronic
Commerce Research Conference (2008)
10. Gonzalez, O.F., Ansa, G., Howarth, M., Pavlou, G.: Detection and Accusation of Packet
Forwarding Misbehavior in Mobile Ad-Hoc Networks. Journal of Internet Engineer-
ing 2(1), 181–192 (2008)
11. Agrawal, P., Ghosh, R.K., Das, S.K.: Cooperative Black and Gray Hole Attacks in Mobile
Ad Hoc Networks. In: 2nd International Conference on Ubiquitous Information Manage-
ment and Communication, pp. 310–314 (2008)
12. Sen, J., Girish Chandra, N., Harihara, S.G., Reddy, H., Balamuralidhar, P.: A Mechanism
for Detection of Gray Hole Attack in Mobile Ad Hoc Networks. In: 6th International Con-
ference on Information, Communications and Signal Processing, pp. 1–5 (2007)
13. Jhaveri, R.H., Patel, S.J., Jinwala, D.C.: A Novel Approach for Grayhole and Blackhole
Attacks in Mobile Ad Hoc Networks. In: 2nd International Conference on Advanced
Computing & Communication Technologies, pp. 556–560 (2012)
14. Fall, K., Varadhan, K.: The ns Manual, http://www.isi.edu/nsnam/ns/doc/
Multilayer Feed-Forward Artificial Neural
Network Integrated with Sensitivity
Based Connection Pruning Method
Siddhaling Urolagin1, Prema K.V.2, JayaKrishna R.1 and N.V. Subba Reddy2
1 Dept. Comp Sc. & Engg., M.I.T., Manipal-576104, Karnataka, India
2 Mody Institute of Technology and Science, Rajasthan, India
siddesh_u@yahoo.com, {prema_kv,dr_nvsreddy}@rediffmail.com, jayakrishnaa.r@gmail.com
Abstract. An Artificial Neural Network (ANN) that is too small may not solve
the problem, while a network that is too large will suffer from poor
generalization. Pruning methods are approaches for finding an appropriate network size
by eliminating parameters from the network. Sensitivity-based pruning determines the
sensitivity of the network error to the removal of each parameter and eliminates the
parameters with the least sensitivity. In this work, a sensitivity-based pruning method
is integrated with a multilayer feed-forward ANN and applied to MNIST handwritten
numeral recognition. The effect of pruning on the network is analyzed and compared with
the performance of a network without pruning. It is observed that the network
integrated with the pruning method shows better generalization ability than a network
without pruning.
1 Introduction
In theory, if a problem is solvable with a network of a given size, it can also be
solved by a larger net which embeds the smaller one, with all the redundant connections,
or synapses, having zero strength. However, the learning algorithm will typically
produce a different structure, with non-vanishing synaptic weights spreading all over
the net, thus obscuring the existence of a smaller neural net solution. A network which
is too small may never solve the problem, while a larger network may cause
overfitting [1]. For a network to generalize well, it should have fewer
parameters than there are data points in the training set [2], [3]. It has been observed
that small networks that fit the data well have good generalization capability [3].
Thus it makes sense to start with a large net and then reduce its size.
Pruning algorithms selectively remove network elements such as nodes, weights or
biases in order to reduce the size of the network. Several pruning algorithms have been
proposed in the literature. The Optimal Brain Damage method [4] estimates the saliency
of connections and removes connections based on this estimate. Bottom-Up Freezing [5]
is a method in which nodes are frozen out of training if their contribution falls below
a threshold; frozen nodes are mostly removed from the network. A review of network
pruning algorithms is given in [6]. In this paper we integrate the sensitivity-based
pruning method of [7] with a multilayer feed-forward neural network and apply it to
handwritten numeral recognition. We analyze the behavior of the pruning-integrated
multilayer feed-forward neural network through experiments. During training, the usual
objective is to reduce the network error. The effect of pruning a connection on the
network error is discussed in Section 2. In Section 3 the sensitivity-based pruning
method of [7] is elaborated. In Section 4 the feature extraction on handwritten
numerals is discussed. The experimental results are discussed in Section 5, and
Section 6 covers the concluding remarks.
2 Effect of Pruning a Connection on Network Error
In a typical supervised learning method, on presenting a pattern p, let $t_{pi}$ be the
desired output for unit i. The difference between the produced output $O_{pi}$ and the
desired output $t_{pi}$ is usually estimated as the network error E. For a given
training set, E is a function of all the weights $w_{ij}$. Learning is the process of
modifying the weights such that the network error E decreases. The celebrated
back-propagation learning algorithm, a variant of the steepest descent optimization
method [8], updates the weights after each presentation of a subset of the training
patterns. After the network undergoes sufficient training, the network error E reaches
a local minimum, where all its weights are in their final states $w_{ij}^f$. Arbitrarily
setting a weight $w_{ij}$ to zero, which is equivalent to eliminating the synapse that
goes from neuron j to neuron i, will typically result in an increase of the error E,
i.e., $E(w_{ij} = 0) > E(w_{ij} = w_{ij}^f)$. So, efficient pruning means finding the
subset of weights that, when set to zero, will lead to the smallest increase in E.
3 Sensitivity Based Pruning Method
Mozer and Smolensky [9] introduced the idea of estimating the sensitivity of the
error function to the elimination of each unit. The sensitivity estimate $S_{ij}$ for
the elimination of weight $w_{ij}$ is
$$S = -\frac{E(w^f) - E(0)}{w^f - 0}\, w^f \qquad (2)$$
where $w = w_{ij}$ and E is expressed as a function of w, assuming that all other
weights are fixed at their final states upon completion of learning. A typical learning
process does not start with w = 0, but rather with some small, randomly chosen initial
value $w^i$. The error E as a function of the weight w is depicted in Fig. 1a: training
begins at the initial weight value $w^i$, E decreases as training progresses, and at
$w^f$ the network error reaches a local minimum. Since we do not know E(0), we
approximate the slope of E(w) (when moving from 0 to $w^f$) by the average slope
measured between $w^i$ and $w^f$, namely
$$S \approx -\frac{E(w^f) - E(w^i)}{w^f - w^i}\, w^f \qquad (3)$$
Fig. 1. (a) The error as a function of one weight, (b) Learning on an error function Surface
The initial weights $w^i$ and final weights $w^f$ are quantities that are available
during the training phase. However, for the numerator of (3) it was implicitly assumed
that only one weight, namely w, had changed while all other weights remained fixed.
This is not the case during normal learning. Consider, for example, a network with only
two weights u and w; the numerator of (3) will be
$$E(u^f, w^f) - E(u^f, w^i) \qquad (4)$$
The error E(u, w) is illustrated by the constant-value contours in Fig. 1b. The initial
point in weight space is designated by I, and the learning path is the dashed line
from I to F, the final point. For a precise evaluation of S, the numerator of (2) can be
evaluated as
$$E(w = w^f) - E(w = 0) = \int_A^F \frac{\partial E(u^f, w)}{\partial w}\, dw \qquad (5)$$
The integral is along the line from point A, which corresponds to w = 0 with all other
weights in their final states, to the final weight state F. However, the training
phase starts at point I rather than A, so we have to compromise on an approximation
to the integral above, namely
$$E(w = w^f) - E(w = 0) = \int_I^F \frac{\partial E(u, w)}{\partial w}\, dw \qquad (6)$$
This expression is further approximated by replacing the integral by a summation taken
over the discrete steps that the network passes through while learning. Thus the
estimated sensitivity to the removal of connection $w_{ij}$ is evaluated as
$$\hat{S}_{ij} = -\sum_{n=0}^{N-1} \frac{\partial E}{\partial w_{ij}}(n)\, \Delta w_{ij}(n)\, \frac{w_{ij}^f}{w_{ij}^f - w_{ij}^i} \qquad (7)$$
where N is the number of training epochs. The above estimate of the sensitivity uses
terms that are readily available during the normal course of training: the weight
increments $\Delta w_{ij}$ are the essence of every learning process, and virtually
every optimization search uses gradients to find the direction of change, so the
partial derivatives, which are the components of the gradient, are also available.
Therefore, the only extra computational demand is the summation in (7). For the special
case of back-propagation, weights are updated according to [8], hence (7) reduces to
$$\hat{S}_{ij} = \sum_{n=0}^{N-1} \left[\Delta w_{ij}(n)\right]^2 \frac{w_{ij}^f}{\eta\,(w_{ij}^f - w_{ij}^i)} \qquad (8)$$
Upon completion of training we are equipped with a list of sensitivity numbers, one
per connection. They are created by a process that runs concurrently with, but without
interfering in, the learning process. At this point a decision can be taken on
pruning the synapses of smallest sensitivity, based on some criterion.
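Because every term in (8) is available during training, the estimate can be kept as a
running accumulator; below is a minimal numpy sketch of this bookkeeping (class and
threshold names are illustrative):

import numpy as np

class SensitivityTracker:
    def __init__(self, w_init, eta):
        self.w_init = np.array(w_init, dtype=float)  # w^i
        self.eta = eta                                # learning rate
        self.acc = np.zeros_like(self.w_init)        # running sum of [dw(n)]^2

    def record(self, delta_w):
        self.acc += np.asarray(delta_w) ** 2          # called once per weight update

    def estimate(self, w_final):
        denom = self.eta * (w_final - self.w_init)
        safe = np.where(denom == 0.0, np.finfo(float).eps, denom)
        return self.acc * w_final / safe              # S-hat per connection, eq. (8)

# pruning: remove the connections whose |S-hat| falls below a chosen threshold, e.g.
# keep_mask = np.abs(tracker.estimate(w)) >= 1e-4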
5 Experimental Results
The neural network is trained for 5000 epochs with a learning rate of 0.25. After 1000
epochs, sensitivity-based pruning is carried out with different pruning thresholds: the
parameters whose sensitivity measures are less than the chosen threshold are removed.
Table 1 shows the number of connections pruned at different epochs with pruning
thresholds of 0.0001, 0.0005 and 0.001.
Table 1. Number of connections pruned at different epochs, for three pruning thresholds
Epoch   T=0.0001   T=0.0005   T=0.001
1000       82         82        82
2000       16         21        27
3000       19         24        28
4000       13         56        61
5000       13         20        39
6 Conclusion
Pruning algorithms are approaches for finding an appropriate network size by
removing redundant parameters from the network. In the sensitivity-based approach,
the sensitivity of the network error to the elimination of each unit is estimated, and
the parameters with the least sensitivity are removed from the network. In this
work, the sensitivity-based pruning method of [7] is integrated with a multilayer
feed-forward ANN and the effect of pruning is analyzed. Experiments have been
conducted on the integrated ANN to recognize MNIST handwritten numerals. It is
interesting to observe that more parameters are pruned during the initial phase of
learning than at later stages. During the training of the ANN, pruning parameters leads
to an increase in network error (i.e., MSE). However, as training progresses, the
network shows the ability to learn even with fewer parameters. On unseen (test) data,
it is usually observed that whenever pruning takes place it leads to an improvement in
generalization ability, which is evident from the decrease in MSE and increase in
classification rate on the test data. When compared with a network of the same topology
without pruning, the ANN integrated with the pruning algorithm shows better
generalization results.
References
1. Steve, L., Lee Giles, C.: Overfitting and Neural Networks: Conjugate Gradient and
Backpropagation. In: International Joint Conference on Neural Networks, pp. 114–119.
IEEE Computer Society, Los Alamitos (2000)
2. Baum, E.B., Haussler, D.: What size net gives valid generalization? Neural
Computation 1, 151–160 (1989)
3. Denker, J., Schwartz, D., Wittner, B., Solla, S., Howard, R., Jackel, L., Hopfield, J.: Large
automatic learning, rule extraction, and generalization. Complex Syst. 1, 877–922 (1987)
4. Le Cun, Y., Denker, J.S., Solla, S.A.: Optimal Brain Damage. In: Touretzky, D.S. (ed.)
Advances in Neural Information Processing (2) (Denver 1989), pp. 598–605 (1990)
5. Farzan, A., Ghorbani, A.A.: The Bottom-Up Freezing: An Approach to Neural
Engineering. In: Proceedings of Advances in Artificial Intelligence: 14th Biennial
Conference of the CAIAC, Ottawa, Canada, pp. 317–324 (2001)
6. Reed, R.: Pruning Algorithms-A Survey. IEEE Trans. on Neural Network 4(5), 740–747
(1993)
7. Karnin, E.D.: A Simple Procedure for Pruning Back-Propagation Trained Neural
Networks. IEEE Transaction on Neural Network 1(2), 239 (1990)
8. Luenberger, D.G.: Linear and Nonlinear Programming. Addison-Wesley, Reading (1984)
9. Mozer, M.C., Smolensky, P.: Skeletonization: a technique for trimming the fat from a
network via relevance assessment. In: Advances in Neural Information Processing Systems
1, pp. 107–115. Morgan Kaufmann (1988)
10. Urolagin, S., Prema, K.V., Subba Reddy, N.V.: Illumination Invariant Character
Recognition using Binarized Gabor Features. In: IEEE International Conference on CIMA,
India, December 13-15, pp. 216–220 (2007)
ACTM: Anonymity Cluster Based Trust Management
in Wireless Sensor Networks
Abstract. Wireless Sensor Networks consist of sensor nodes that are capable
of sensing information and maintaining security. In this paper, an
Anonymity Cluster based Trust Management (ACTM) algorithm is proposed
which enhances the security level and provides a stable path for
communication. Simulation results show that the performance of the network is
better than that of existing schemes.
1 Introduction
A Wireless Sensor Network (WSN) consists of a large number of tiny sensor nodes that
are equipped with sensing, processing and communication components. WSN
applications include target tracking on the battlefield, environmental monitoring, etc.
The deployment nature of sensor networks makes them vulnerable to various
attacks; thus, providing security to WSNs becomes very important. Traditionally,
cryptography and authentication approaches are used to provide security, but the
conventional approach is not sufficient for an autonomous network, so trust-based
approaches are used. In order to evaluate trustworthiness, it is essential to establish
cooperation and trust between sensor nodes. The Group-based Trust Management Scheme [1]
uses hybrid trust management and works on two topologies: intra-group topology and
inter-group topology.
Motivation: During data processing, each node forwards the trust of its neighbors
to its cluster head upon request. When the sink sends a request to a cluster head, the
cluster head transmits the neighboring clusters' trust values to the sink. There is
thus a possibility of an adversary performing traffic analysis during the communication
between sensor nodes. Hence, the security level has to be enhanced by incorporating an
identity anonymity feature into the existing Group-based Trust Management Scheme.
2 Literature Survey
Riaz et al. [2] proposed the Group-based Trust Management Scheme, which calculates
trust for a group of sensor nodes in each cluster. It works on the intra-group topology
using a distributed trust management approach and on the inter-group topology using a
centralized trust management approach. Karthik et al. [3] compare various trust
management techniques for high trust values in WSNs; the trust values are maintained
based on various processes such as trust establishment, trust propagation, trust
metrics and group-based trust management schemes.
Efthimia et al. [4] propose a certificate-based mechanism that relies on deployment
knowledge of the trust relationships within a network, while the behavior-based trust
model views trust as the level of positive cooperation between neighboring nodes in a
network. Yu et al. [5] present a Trustworthiness-Based QoS Routing protocol for
wireless ad hoc networks.
3 System Model
Consider a static Wireless Sensor Network consisting of a large number of small
devices called sensor nodes. The network may consist of 144 sensors deployed over a
600 x 600 area, 225 sensors over an 800 x 800 area, or 324 sensors over a 1000 x 1000
area. Each sensor node has its own ID. The network is divided into a number of groups
referred to as clusters. A Cluster Head (CH) is elected for each cluster and has more
power compared to the other members of the cluster. Each sensor node can communicate
with all its cluster members directly, and each cluster head communicates with
neighboring cluster heads as well as with the sink, either through intermediate CHs or
directly.
4 Problem Definition
Consider a grid-based WSN in which nodes are organized in the form of
clusters. Trust values are computed and communicated from the nodes to the sink
through the cluster heads. During this process, an adversary can perform traffic
analysis and alter the trust values. The objective of this work is to avoid the
traffic analysis attack.
Assumptions: (i) Initially all nodes are in the uncertain zone. (ii) Each node has
enough memory to store a range of dynamic IDs. (iii) Sensor nodes have to exchange
their ID ranges within a short period, to avoid nodes being compromised by an
adversary. (iv) The adversary cannot attack the sink.
In order to overcome the traffic analysis attack, the anonymity of the nodes and trust values is maintained during transmission. Initially, N nodes are generated using a random function and arranged in a grid fashion. These nodes are divided into smaller groups called clusters, and each cluster elects a leader called the Cluster Head, as proposed in the Selection of Cluster Head algorithm in Table 1.
These cluster heads communicate with the other cluster heads and the sink. An adversary can track the information being transmitted if it is able to trace the IDs of the sensor nodes. To overcome this problem, identity anonymity is created by dividing the dynamic ID pool into a number of subranges of equal size. Each sensor node is given randomly chosen, overlapping, non-contiguous subranges from the ID pool, as explained in the Assigning Anonymity IDs algorithm in Table 2. A map table is created at each sensor node to map the true ID of the sensor node to its dynamic sender and receiver IDs, as sketched below.
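This assignment can be sketched in a few lines; the pool size, subrange length and number of subranges per node below are illustrative values, not taken from the paper:

import random

POOL_SIZE = 10_000   # size of the dynamic ID pool (illustrative)
SUBRANGE_LEN = 50    # equal size of each subrange (illustrative)

# Partition the dynamic ID pool into equal-sized subranges.
subranges = [range(s, s + SUBRANGE_LEN)
             for s in range(0, POOL_SIZE, SUBRANGE_LEN)]

def assign_anonymity_ids(true_ids, k=3):
    """Give each node k randomly chosen subranges from the ID pool.

    The chosen subranges may overlap across nodes and are generally
    non-contiguous for a given node, so dynamic IDs cannot be linked
    back to a node.  Returns the map table: true ID -> subranges.
    """
    return {tid: random.sample(subranges, k) for tid in true_ids}

def pick_dynamic_id(map_table, true_id):
    """Draw a fresh dynamic sender/receiver ID for one transmission."""
    return random.choice(random.choice(map_table[true_id]))

map_table = assign_anonymity_ids(range(1, 145))   # 144 sensor nodes
print(pick_dynamic_id(map_table, true_id=1))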
The trust of any node indicates its ability to provide the required service. Based on the trust value, nodes are categorized as trusted, uncertain or untrusted. A malicious node is categorized as untrusted or uncertain. Trust is calculated first at the node level, then at the cluster head level and finally at the sink level, based on the number of successful and unsuccessful interactions between the nodes using a sliding window [2] for every r iterations. Similarly, the trust values are computed at cluster heads. A sketch of the node-level bookkeeping follows.
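The per-neighbour record behind this can be sketched as follows; the plain success ratio is a placeholder for, not a reproduction of, the exact GTMS trust formula of [2]:

from collections import deque

class NodeTrust:
    """Sliding-window trust record that a node keeps per neighbour."""

    def __init__(self, window=100):
        # Outcomes (True = successful interaction) of the last
        # `window` interactions; older outcomes fall out automatically.
        self.outcomes = deque(maxlen=window)
        self.idle_windows = 0    # sliding windows with no communication

    def record(self, success):
        self.outcomes.append(success)
        self.idle_windows = 0

    def trust(self):
        """Refreshed every r iterations on the cluster head's request."""
        if self.idle_windows > 2:  # silent for more than two windows:
            return 0.0             # assign zero, skip peer recommendations
        if not self.outcomes:
            return 0.5             # uncertain zone (initial state)
        return sum(self.outcomes) / len(self.outcomes)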
Table 5. Map Table: Dynamic ID range for node 1
Table 6. Trust Value for Each Cluster
Fig. 1. Comparison of ACTM with GTMS for Communication Overhead
The trust value is generated for each node separately. The trust value obtained for each cluster during simulation is tabulated in Table 6. For accuracy, the fractional value is recorded to six decimal places. A trust value of zero is assigned directly, instead of taking peer recommendations, if nodes have not communicated for more than two sliding-time-window periods.
The communication overhead is plotted for 100 simulation runs for 144, 225 and 324 nodes, as shown in Figure 1. The graph shows that the communication overhead is lower than that of GTMS. The communication overhead varies with the size of the network and the number of nodes in it. As the number of iterations increases, the communication overhead falls: although node transfers change the positions of nodes, each node retains its past recommendation values in the trust table and does not have to recalculate trust values from the beginning. This reduces the communication overhead substantially. The anonymity IDs are calculated initially and are simply reassigned to the nodes every r iterations. Even with this low communication overhead, the scheme provides enhanced security because it uses anonymous IDs.
7 Conclusions
Security is an important issue in Wireless Sensor Networks. We propose an Anonymity Cluster Based Trust Management (ACTM) algorithm to maintain security and avoid traffic analysis attacks in WSNs. The proposed approach combines anonymous IDs with the assignment of trust values to each node. The concept of anonymity is introduced to hide the identity of the sensor nodes from compromised nodes, whereas the anonymity of node IDs is not maintained in GTMS. The cluster heads and their members are regularly reorganized at random within the network, which reduces the chance of early node failure. Thus, enhanced security, longer lifetime and reduced communication overhead are achieved by our algorithm.
References
1. Ganeriwal, S., Balzano, L.K., Srivastava, M.B.: Reputation-based Framework for High Integrity Sensor Networks. ACM Transactions on Sensor Networks 4(3), 15:1–37 (2008)
2. Shaikh, R.A., Jameel, H., d’Auriol, B.J., Lee, H., Lee, S., Song, Y.-J.: Group-based Trust
Management Scheme for Clustered Wireless Sensor Network. IEEE Transactions on
Parallel and Distributed Systems 20(11), 1698–1712 (2009)
3. Karthik, S., Vanitha, K., Radhamani, G.: Trust Management Techniques in Wireless Sensor
Networks: An Evaluation. In: Proceedings of IEEE Conference on Communications and
Signal Processing (ICCSP), pp. 328–330 (2011)
4. Aivaloglou, E., Gritzalis, S., Skianis, C.: Trust Establishment in Sensor Networks:
Behaviour-Based, Certificate-Based and a Combinational Approach. International J. System
of Systems Engineering. 1(1/2), 128–148 (2008)
5. Yu, M., Lueng, K.K.: A Trustworthiness-Based QoS Routing Protocol for Wireless Ad Hoc
Networks. IEEE Transactions on Wireless Communication 8(4), 1888–1898 (2009)
Texture Based Image Retrieval Using Correlation
on Haar Wavelet Transform
Abstract. Content Based Image Retrieval deals with the retrieval of the most similar images to a query image from an image database. It
involves feature extraction and similarity computation. This paper proposes a
method named Correlation Texture Descriptor (CTD) which computes the
correlation between the sub bands formed after applying Haar Discrete Wavelet
Transform. Fuzzy Logic is used to compute the similarity of two feature
vectors. Experiments determined that the proposed method, CTD, showed a
significant improvement in retrieval performance when compared to other
methods such as Weighted Standard Deviation (WSD), Gradient operation
using Sobel operator and Gray Level Co-occurrence Matrix (GLCM).
1 Introduction
after applying Haar Wavelet Transform on the image at each level along with other
statistical features. Furthermore, a comparison of CTD with other methods such as WSD [2], Gradient operation using the Sobel operator [3] and GLCM [4] shows a significant improvement in retrieval performance.
This paper is organized as follows: This section gives a brief introduction on
CBIR. Section 2 elaborates on Haar Wavelet Transform in Image Processing. Section
3 introduces the proposed method for texture feature extraction. Section 4 gives an
idea of the other three approaches. Section 5 covers the similarity criteria. Section 6
presents the experimental results. Finally, the paper is concluded with Section 7.
If $X = (x_1, x_2, \ldots, x_{2n})$ is a row of pixel values, (3)

then $Y = \left(\frac{x_1+x_2}{2}, \ldots, \frac{x_{2n-1}+x_{2n}}{2}, \frac{x_1-x_2}{2}, \ldots, \frac{x_{2n-1}-x_{2n}}{2}\right)$. (4)

By applying the operations given above, the top left band acts as a 2D low pass filter and gives the approximation image [7]. Similarly, the top right band acts as an average horizontal gradient, i.e., a horizontal high pass and vertical low pass filter; the lower left band as an average vertical gradient, i.e., a horizontal low pass and vertical high pass filter; and the lower right band as a diagonal curvature or 2D high pass filter.
$\mathrm{corr}(A, B) = \frac{\mathrm{cov}(A, B)}{\sigma_A \sigma_B} = \frac{E[(A - \mu_A)(B - \mu_B)]}{\sigma_A \sigma_B}$ (5)

where E is the expected value operator, cov denotes covariance, $\mu_A$ and $\mu_B$ are the mean values and corr stands for correlation.
Correlation Texture Descriptor (CTD) uses the wavelet coefficients of all the sub
bands obtained after Haar Discrete Wavelet Transform. To compute CTD, an image is
first subjected to gray scale conversion using the formula:
$I_{ij} = (11 \cdot C_{ij}(R) + 16 \cdot C_{ij}(G) + 5 \cdot C_{ij}(B)) / 32$ (6)

where $I_{ij}$ is the intensity assigned to pixel (i, j) of image C, and $C_{ij}(R)$, $C_{ij}(G)$, $C_{ij}(B)$ denote the red, green and blue color values of pixel (i, j) of image C.
The gray scale image is subjected to level 1 Haar DWT decomposing it into 4 sub
bands. The approximation image is then subjected to level 2 Haar DWT decomposing
it into 4 sub bands, thereby resulting in a total of 8 sub bands. The correlation between
LL and HL sub bands, LL and LH sub bands and between HL and LH sub bands is
calculated using the formula [9]:

$\mathrm{corr}(a, b) = \frac{\sum_i \sum_j (a_{ij} - \bar{a})(b_{ij} - \bar{b})}{\sqrt{\left(\sum_i \sum_j (a_{ij} - \bar{a})^2\right)\left(\sum_i \sum_j (b_{ij} - \bar{b})^2\right)}}$ (7)

where a and b are N × N matrices containing the pixel intensity values of image sub bands A and B respectively, and $\bar{a}$, $\bar{b}$ represent the means of a and b respectively.
The standard deviation of each sub band, along with the mean and energy of the approximation image, is computed at each level. Since the number of pixels in the LH, HH and HL sub bands keeps decreasing as the level of decomposition increases, the standard deviation of these sub band images at the i-th level is weighted by the factor $1/2^{i-1}$, thus assigning higher weights to lower-level bands.
The 18 CTD Features (CF) for a two-level Haar image are:

CF = { σ1LL, σ1LH, σ1HL, σ1HH, µ1LL, E1LL, σ2LL, ½σ2LH, ½σ2HL, ½σ2HH, µ2LL, E2LL, corr(LL1-HL1), corr(LL1-LH1), corr(HL1-LH1), corr(LL2-HL2), corr(LL2-LH2), corr(HL2-LH2) }

where CF stands for CTD Features,
σiMM is the standard deviation of the MM sub band (MM stands for the LL, LH, HL or HH sub band) at decomposition level i,
µiLL is the mean of the approximation image (i = 1 for level 1 and i = 2 for level 2),
EiLL is the energy of the approximation image (i = 1 for level 1 and i = 2 for level 2),
corr(mmi-nni) stands for the correlation of sub bands mmi and nni (mm and nn stand for the LL, HL or LH sub bands; i = 1 for level 1 and i = 2 for level 2). A sketch of this feature computation is given below.
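Putting the pieces together, a compact sketch of the CTD feature computation follows; the Haar filter normalization used here is one common convention, and the image dimensions are assumed to be multiples of 4 so that two decomposition levels exist:

import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT: returns (LL, LH, HL, HH)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return ((a + b + c + d) / 4,   # LL: 2-D low pass (approximation)
            (a + b - c - d) / 4,   # LH: vertical gradient
            (a - b + c - d) / 4,   # HL: horizontal gradient
            (a - b - c + d) / 4)   # HH: diagonal curvature

def corr2(a, b):
    """Correlation of two equal-size sub bands, Eq. (7)."""
    da, db = a - a.mean(), b - b.mean()
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

def ctd_features(rgb):
    """The 18 CTD features of an RGB image (H x W x 3 array)."""
    r, g, b = (rgb[..., k].astype(float) for k in range(3))
    ll = (11 * r + 16 * g + 5 * b) / 32          # gray scale, Eq. (6)
    features = []
    for level in (1, 2):
        ll, lh, hl, hh = haar_dwt2(ll)           # decompose previous LL
        w = 1 / 2 ** (level - 1)                 # level weighting factor
        features += [ll.std(), w * lh.std(), w * hl.std(), w * hh.std(),
                     ll.mean(), (ll ** 2).sum(),           # mean, energy
                     corr2(ll, hl), corr2(ll, lh), corr2(hl, lh)]
    return features

print(len(ctd_features(np.random.randint(0, 256, (256, 256, 3)))))  # 18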
WSD [2] (Weighted Standard Deviation) is a texture descriptor which uses the
wavelet coefficients of all the sub bands obtained after Haar Discrete Wavelet
Transform. The standard deviation of 6 sub bands (3 at each level) along with the
mean and standard deviation of the approximation image obtained at the 2nd level of
decomposition is computed. The standard deviation of each HH, HL and LH sub band image at the i-th level is weighted by the factor $1/2^{i-1}$. Thus a total of 8 features are extracted for each image.
The Gradient Operation technique [2][3] applies the Sobel gradient mask directly on the gray scale image to obtain the gradient value and direction (theta) of each pixel. Nine bins, each of 40 degrees, are then calculated. Computing the mean, standard deviation and entropy of each bin results in a total of 27 features.
In the GLCM technique [2][4], four co-occurrence matrices are computed from the gray scale image by taking the distance between pixels to be 1 and the four directions to be 0°, 45°, 90° and 135°. For each co-occurrence matrix so obtained, four features, namely contrast, correlation, energy and homogeneity, are calculated, resulting in a total of 16 features for each image.
5 Similarity Measure
CBIR requires the computation of similarity between the query image and the images in the database. A specified number of images with the highest similarity are then retrieved. In all the above approaches, Fuzzy Logic [10] is used as the similarity measure. Retrieval results improved significantly when Fuzzy Logic was used as the similarity measure instead of the Minkowski or Euclidean distance. Fuzzy logic uses membership functions to find the similarity between the feature vectors of any two images. In our study, we have used triangular membership functions, whose base was determined experimentally. A minimal sketch of this similarity measure follows.
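In the sketch below, the per-feature base widths stand in for the experimentally determined ones:

def fuzzy_similarity(f1, f2, base):
    """Fuzzy similarity of two feature vectors.

    For each feature pair a triangular membership function gives 1 when
    the two values coincide and falls linearly to 0 when they differ by
    base[k] (the half-width of the triangle).  The overall similarity
    is the mean membership over all features.
    """
    memberships = [max(0.0, 1.0 - abs(a - b) / w)
                   for a, b, w in zip(f1, f2, base)]
    return sum(memberships) / len(memberships)

# Example: two 3-feature vectors with assumed base widths.
print(fuzzy_similarity([0.8, 1.2, 3.0], [0.9, 1.0, 3.0], [0.5, 0.5, 1.0]))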
6 Experimental Results
The experiments were performed with Java as the front end and MySQL as the back end. 16 different images, each of size 256×256, were taken from a database containing 681 images from the VisTex Album [11].
For the discussed methods, the respective features of each database image were
computed offline and stored. The feature computation process was carried out online
for the query image. For each of the 16 images, 10 most similar images were
retrieved. The retrieval accuracy was then measured using the formula given below:

$R(q) = \frac{|A(q) \cap S(q)|}{|A(q)|}$ (8)

where R(q) is the retrieval accuracy, |S(q)| is the total number of images of the texture type in the database, |A(q)| is the total number of images retrieved for display and |A(q) ∩ S(q)| is the number of relevant retrieved images belonging to the texture type of the query image. The following graph depicts the retrieval accuracy of the above-mentioned techniques.
Fig. 1. Plot of Retrieval Accuracy vs. Texture type for the four texture based techniques
The CTD approach was also applied to the Brodatz texture album [2], and a comparison [2] of its results with those of the other three techniques showed a pattern similar to the graph given above, i.e., the retrieval accuracy of CTD was better than that of the other three techniques.
7 Conclusion
The graph given above clearly depicts a significant improvement in precision of the
proposed CTD approach over the other three methods, namely: WSD, Gradient and
GLCM. It can also be seen that both CTD and WSD give significantly better results
compared to GLCM and Gradient methods. This can be attributed to the fact that both
CTD and WSD methods use detailed information of all the sub bands, after
decomposing the image by applying Haar Discrete Wavelet Transform. Also, the use
of correlation between sub bands at each level and incorporation of features of the
approximation image obtained after applying Haar Transform may be responsible for
the marginal improvement in results when the proposed CTD method is used as
compared to the WSD approach.
References
1. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 2nd edn. Prentice Hall (2002)
2. Verma, D.N., Garg, N., Garg, N., Dosi, G.: Improved Texture Based Image Retrieval using
Haar Wavelet Transform. In: Proceedings International Conference on Information
Processing (2010)
3. Wang, K.-A., Lin, H.-H., Chan, P.-C., Lin, C.-H., Chang, S.-H., Chen, Y.-F.:
Implementation of an Image Retrieval System Using Wavelet Decomposition and Gradient
Variation. WSEAS Transactions on Computers 7(6) (2008)
4. Yazdi, M., Gheysari, K.: A New Approach for Fingerprint Classification based on Gray-
Level Co-occurrence Matrix. World Academy of Science, Engineering and Technology 47
(2008)
5. Bénéteau, C., Van Fleet, P.J.: Discrete Wavelet Transformations and Undergraduate
Education. Notices of the AMS 58(05) (May 2011)
6. Hiremath, P.S., Shivashankar, S., Pujari, J.: Wavelet Based Features For Color Texture
Classification With Application To CBIR. IJCSNS International Journal of Computer
Science and Network Security 6(9A) (September 2006)
7. The Haar Transform, http://cnx.org/content/m11087/latest/
8. Cross Correlation, http://paulbourke.net/miscellaneous/correlate/
9. Digital Image Correlation,
http://en.wikipedia.org/wiki/Digital_image_correlation
10. Verma, D.N., Maru, V., Bharti: An Efficient Approach for Color Image Retrieval using
Haar Wavelet. In: Proceedings of International Conference on Methods and Models in
Computer Science (2009)
11. Vistex Database,
http://vismod.media.mit.edu/pub/VisTex/VisTex.tar.gz
Dynamic Cooperative Routing (DCR)
in Wireless Sensor Networks
1 Introduction
2 Literature Survey
Younis et al. [1] proposed energy-aware routing for wireless sensor networks, in which a gateway node acts as a centralized network manager; failure of the gateway and mobility of the sink are not considered. Sikora et al. [2] have considered a communication network with a single source node, a single destination node and N − 1 intermediate nodes placed equidistantly. Single-hop transmission is more suitable for the bandwidth-limited regime, especially when higher spectral frequencies are required; the gap between single-hop and multi-hop is bridged by employing interference cancellation. Madan et al. [3] formulated a distributed algorithm to compute an optimal routing scheme; the algorithm is derived from convex quadratic optimization under a time constraint to maximize the lifetime of the network. Ahmed et al. [4], [5] proposed a distributed relay assignment protocol for cooperative communication in wireless networks, where the relay is selected from the list of nearest neighbours to the base station. Simulation results show a significant gain in coverage area for cooperative routing compared to direct routing.
Proof: The entire network is mapped to a graph G, with each node i ∈ V at coordinates $(x_i, y_i)$. Let z and $z_m$ be the static sink and the mobile sink, with coordinates $(x_z, y_z)$ and $(x_{z(m_t)}, y_{z(m_t)})$ at time t, respectively. $d_i$ is the minimum distance between source i and the sink. By partially differentiating the distance equation and equating it to zero, we obtain the location of the sink with minimal distance with respect to all other nodes in the network:

$(x_z, y_z) = \left(\frac{1}{N}\sum_{i=1}^{N} x_i,\ \frac{1}{N}\sum_{i=1}^{N} y_i\right).$ (2)

$d_i = \sqrt{(x_i - x_z)^2 + (y_i - y_z)^2}.$ (3)

Partial differentiation of the distance $d_i$ with respect to x and y for the static sink gives

$\frac{\partial}{\partial \chi}\sum_{i=1}^{N} d_i^2 = 2\sum_{i=1}^{N} (\chi_z - \chi_i) = 0,$ (4)

and for the mobile sink

$\frac{\partial}{\partial \chi}\sum_{i=1}^{N} d_i^2 = 2\sum_{i=1}^{N} \left(\chi_{z(m_t)} - \chi_i\right) = 0,$ (5)

where $\chi$ is either x or y. The total power consumption of a path is the additive power consumption over each link present in the path, where the path comprises many links. The total energy consumption is computed as

$E_{total} = \sum_{k=1}^{l_i} E_{link}(k),$ (6)

where $l_i$ is the total number of links present in the path. Energy conservation is thus improved by reducing the distance between the source and the destination node.
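Under this reconstruction, the stationarity conditions place the static sink at the centroid of the node coordinates, which a few lines illustrate:

def optimal_sink_location(nodes):
    """Position minimizing sum_i d_i^2 over all sensor nodes: setting
    the partial derivatives in Eqs. (4)-(5) to zero yields the centroid."""
    xs, ys = zip(*nodes)
    return sum(xs) / len(xs), sum(ys) / len(ys)

# Four nodes at the corners of a unit square -> sink at the centre.
print(optimal_sink_location([(0, 0), (0, 1), (1, 0), (1, 1)]))  # (0.5, 0.5)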
Sensor nodes are deployed randomly in the given area. The Dynamic Cooperative Routing algorithm checks for the presence of void nodes in the network. Sensor nodes first need to identify or select the sink and then forward the data to the desired sink; we consider both a static and a mobile sink. In the case of a single static sink, all sensor nodes have global information about the sink location, so source nodes can route data to the destination sink. A source node first identifies the set of reachable nodes, i.e., its neighbour vector, and selects one node from the neighbour vector to forward data to. The source node applies the shortest path algorithm to find the next hop to which data can be sent. After data is sent to the next node, the receiving node calculates the strength rs of the received signal and compares it with the minimum threshold value SNRmin. If the received signal strength is less than this fixed minimum threshold, the receiving node re-requests the relay node to send the data through cooperative communication; if the relay node has received the data correctly, it cooperatively transmits the data to the next node. The procedure is repeated until the sink node is reached.
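A sketch of one such delivery is given below; the network-dependent behaviour is passed in as callables, since the paper does not fix their implementation, and the toy chain at the bottom is purely illustrative:

import random

def dcr_route(source, sink, next_hop, link_snr, relay_of, snr_min):
    """One Dynamic Cooperative Routing delivery.

    next_hop(n, sink) -> next node on the shortest path from n,
    link_snr(a, b)    -> received signal strength of transmission a -> b,
    relay_of(n)       -> relay node that overheard n's packet, or None.
    Returns the hop sequence and the number of cooperative retransmissions.
    """
    path, node, cooperative = [source], source, 0
    while node != sink:
        nxt = next_hop(node, sink)            # shortest-path next hop
        rs = link_snr(node, nxt)              # direct transmission attempt
        if rs < snr_min:                      # too weak: the receiver
            relay = relay_of(node)            # re-requests via the relay
            if relay is not None:             # relay got the data correctly
                cooperative += 1              # and retransmits cooperatively
        path.append(nxt)
        node = nxt
    return path, cooperative

# Toy chain 0-1-...-9 with random per-link SNR; every node has a relay.
print(dcr_route(source=0, sink=9,
                next_hop=lambda n, s: n + 1,
                link_snr=lambda a, b: random.random(),
                relay_of=lambda n: n,   # pretend a relay always heard it
                snr_min=0.3))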
Fig. 1. Energy Consumption versus Number of Links for Static Sink
Fig. 2. Energy Consumption versus Number of Links for Mobile Sink
Fig. 3. Lifetime versus Number of Nodes for Static Sink
Fig. 4. Lifetime versus Number of Nodes for Mobile Sink
6 Conclusions
Wireless Sensor Networks operate under limited energy resources and hence with reduced lifetime. We propose Dynamic Cooperative Routing to conserve energy and maximize the lifetime of the network. With direct transmission only one node has the opportunity to transmit the data, and the continuous participation of the same node drains its energy and reduces the lifetime. In the second scenario, the mobile sink moves to predefined locations; all sensor nodes then participate evenly in data transmission, so energy consumption is uniform across all nodes. The implemented DCR algorithm shows 70% and 65% improvements over the existing MPCR algorithm with respect to lifetime and energy conservation, respectively. This work can be extended in future to multiple static and multiple mobile sinks.
References
1. Younis, M., Youssef, M., Arisha, K.: Energy Aware Management in Cluster- Based Sensor
Networks. J. Computer Networks 43(5), 539–694 (2003)
2. Sikora, M., Laneman, J.N., Haenggi, M., Costello, D.J., Fuja, T.E.: Bandwidth and Power Efficient Routing in Linear Wireless Networks. IEEE Transactions on Information Theory 52, 2624–2633 (2006)
3. Madan, R., Lall, S.: Distributed Algorithms for Maximum Lifetime Routing in Wireless
Sensor Networks. IEEE Transactions on Wireless Communications 5(8), 2185–2193 (2006)
4. Ahmed, K.S., Han, Z., Ray Liu, K.J.: A Distributed Relay-Assignment Algorithm for
Cooperative Communications in Wireless Networks. In: Proceedings of IEEE ICC (2006)
5. Ibrahim, A.S., Han, Z., Ray Liu, K.J.: Distributed Energy Efficient Cooperative Routing in
Wireless Networks. IEEE Transactions on Wireless Communications 7(10), 3930–3941
(2008)
Addressing Forwarder’s Dilemma:
A Game-Theoretic Approach to Induce Cooperation
in a Multi-hop Wireless Network
1 Introduction
A multi-hop wireless network is a collection of computers and devices (nodes)
connected by wireless communication links. Because each radio link has a limited
communications range, many pairs of nodes cannot communicate directly, and must
forward data to each other via one or more cooperating intermediate nodes.
Multi-hop communication is not an issue where nodes are altruistic and faithful to
a global algorithm. However, if nodes are selfish, they may not behave cooperatively
as they have an incentive to free-ride by sending their own packets without relaying
packets for others since relaying packets for others consumes bandwidth and energy.
This concentrates traffic through the cooperative nodes, which decreases both
individual and system throughput, and might even partition an otherwise connected
network.
Hence, the need arises to design some mechanism that induces cooperation among
the nodes. The basic aim of any such mechanism is to encourage the nodes to forward
packets sent to them by other nodes. This can be done in a positive or a negative way,
that is, a node can be made to cooperate within a network either by providing some
incentive or by taking penalty actions against a node when its rate of packet
forwarding falls below a particular value. Marti et al. [10] discuss schemes to identify
misbehaving nodes (non-forwarders) and deflect traffic around them. Michiardi and
Molva [8] devise reputation mechanisms where nodes observe the behavior of others
and prepare reputation reports that they use to behave selectively. Zhong et al. [9]
propose the use of currencies to enforce cooperation. Buttyan and Hubaux [5, 6]
devise a scheme based on a virtual currency called a nuglet that a node pays to send
its own packets but receives if it forwards others' packets. Cooperation without
incentive mechanisms is an interesting topic. Srinivasan et al. [11] and Urpi et al. [1]
study the same in general mathematical frameworks. In [7], Felegyhazi et al. use
game theoretic and graph theoretic notions to examine whether cooperation can exist
in multi-hop communication without incentive mechanisms. They consider the effect
of network topology and communication patterns on the cooperative behavior. In [3],
the authors propose a self-learning repeated game framework to enforce cooperation
in wireless ad hoc networks. In [2], Kamhoua et al. model packet forwarding as a
stochastic game in which each node observes the behavior of its neighbours using an
imperfect monitoring technology. They develop a strategy that constrains self-
interested nodes to cooperate under noise.
In this paper, we model the problem of the forwarder's dilemma using non-cooperative game theory [4]. In this framework, the nodes
must choose their behavior, regarding packet forwarding, that is, they have to make
decisions every time a packet is sent to them for forwarding. We use a game theoretic
model to study this scenario. The strategy of a node is the probability with which it
forwards a received packet. Note that this is a generalization of the binary choices of
dropping or relaying a packet to a continuous space. The utility function of a node is
the point awarded to it by the base-station (based on its forwarding actions) offset by
the cost incurred in forwarding. A selfish rational node attempts to maximize its
utility function. The natural solution of the game is a Nash equilibrium.
We show the existence of Nash equilibria in the game. We use different criteria to
select the desirable equilibria and show that cooperation is feasible. The novelty of
our work lies in identifying utility functions that model the situation in an elegant way
and ensuring the existence of desirable Nash equilibria in the game. Unlike many
other works we do not use explicit currencies. Also we use refined notions of Nash
equilibrium that boost the performance of the network as a whole.
The remainder of this paper is organized in the following manner. Section 2
discusses the features of our model. The results obtained are enumerated in section 3
and the inferences are drawn in section 4.
2 Model Definition
In this section, we give a formal definition of the proposed model and the reasons
why we have chosen this model. Since we are studying cooperation in packet
forwarding, we assume that the main reason for packet losses in the network is the
non-cooperative behavior of the nodes.
Let us consider a wireless network having N nodes. Let X denote the set of nodes, where $X = \{x_1, x_2, \ldots, x_N\}$.
Each node in the network, in addition to its own packets, has to forward some
packets for other nodes, in case it is an intermediate node between the source and the
destination. However, due to the energy constraints of the nodes, there is a probability
associated with the event of a node forwarding packets sent by other nodes. We denote this probability for a node $x_i$ as $p_i$. Thus, if $p_i$ is 1, the node definitely forwards the received packet, while $p_i = 0$ indicates that it drops the packet.
Let us assume the cost incurred by a node xi in forwarding a packet is ci. The cost
can be defined to be the power consumed to transmit the packet, or the bandwidth
occupied by the packet and so on. Let us assume that 0 < ci < 1.
In order to encourage nodes to forward packets, the base station gives each node xi
an incentive gi that depends on the probability pi with which the node forwards the
packet. (We assume that the base station can collect all the required information.)
The whole scenario is defined as a non-cooperative game. Here the players are the
nodes of the network and each node aims to maximize its payoff. The strategy set of
each node is its set of allowable forwarding probabilities which is a closed subset of
[0, 1]. We can define the utility of a node xi, having a forwarding probability pi as
follows:
$U_i(p_i) = g_i - p_i\left(1 - \prod_{j=i+1}^{d} p_j\right) - p_i^{1/\alpha} c_i$ (1)

where $g_i = \ln(1 + p_i)$.
Note that the first term above is the incentive awarded by the base station. The
second term refers to the probability that the packet does not reach the destination d
given that this node has forwarded the packet. In this case, since the packet has not
reached the destination, this node’s forwarding action does not produce any benefit to
the network. Each node that forwards a packet that does not actually reach the
destination gets a mild punishment for resource wastage. In practice, this could
motivate the nodes to identify rogue nodes and refuse to forward their packets.
However, such analysis is beyond the scope of the current work. The third term is the
cost incurred in forwarding.
Substituting the value of $g_i$ in (1), we get

$U_i(p_i) = \ln(1 + p_i) - p_i\left(1 - \prod_{j=i+1}^{d} p_j\right) - p_i^{1/\alpha} c_i$ (2)
where α is a constant that denotes the degree of cost-constraint of a node: the greater the value of α, the more stringent the node is about forwarding packets for other nodes. For simplicity, we assume here that α = 1. Therefore,

$U_i(p_i) = \ln(1 + p_i) - p_i\left(1 - \prod_{j=i+1}^{d} p_j\right) - p_i c_i$ (3)
Since the strategy set of each node is a closed subset of [0, 1], it is a closed and bounded set, hence compact and convex. Also, differentiating equation (3) twice with respect to $p_i$, we get

$\frac{\partial^2 u_i}{\partial p_i^2} = -\frac{1}{(1 + p_i)^2} < 0.$

Hence $u_i$ is quasi-concave with respect to the strategy. Thus, we can say that there is at least one Nash equilibrium point for the above game. Recall that a Nash equilibrium is an action profile $p^* \in P$ with the property that for all players $i \in N$:

$u_i(p^*) = u_i(p^*_{-i}, p^*_i) \ge u_i(p^*_{-i}, p_i) \quad \forall\, p_i \in P_i$
In the last section, we did not answer how many Nash equilibria are possible in the game. Indeed, many Nash equilibria are possible (as we found out through simulations). So we define some refinements of the equilibrium, so that only the equilibria that satisfy the refinement criteria are retained and the rest are filtered out. The two criteria we use to select Nash equilibria are social welfare maximization and proportional fairness maximization. Social welfare is the sum of the payoffs of the n nodes, that is, $\sum_{i=1}^{n} U_i$. A payoff profile is said to be proportionally fair if the product of the individual payoffs, that is, $\prod_{i=1}^{n} U_i$, is maximized.
We perform a simulation with 5 nodes in a chain topology, where the third node has a very high packet-forwarding cost (= 1) compared to the other nodes, which are equivalent in terms of their respective forwarding costs; each of these nodes has a cost of 0.1. We find that 92 Nash equilibria exist in this case. Selecting the equilibrium strategies that maximize social welfare, we find that at this equilibrium the probability of packet forwarding for the third node is 0.98, while all the other nodes forward packets with probability 1. We obtain a social welfare of 2.03568 under this strategy set. A sketch of such an equilibrium search is given below.
We consider the same topology again but now all nodes have the same forwarding
cost. Graph 1 represents the value of the social welfare at the Nash equilibrium point
maximizing social welfare, with respect to cost of packet forwarding. The cost of any
given node is plotted along the X-axis. Graph 2 is similar to graph 1 except for the
fact that here the condition for refinement of Nash equilibrium strategy is that of
proportional fairness.
We also considered other network sizes with different cost values. The general
trend of the graphs remains the same. As the cost increases, the social utility and
proportional fairness decrease fast to zero. However, for moderate costs, the values of
social utility and proportional fairness are non-zero. This means that cooperation is
present when costs are not very high. Thus at the obtained Nash equilibria, multi-hop
communication occurs successfully.
4 Conclusion
In this paper, we have presented a game theoretic model to analyze and provide a
solution to the Forwarders’ Dilemma problem in wireless ad-hoc networks. We have
restricted ourselves to a static network scenario because of the complexity of the
problem.
We have shown that the proposed game possesses at least one Nash equilibrium.
Indeed, there are multiple equilibria, so we select the ones that maximize either the social utility or the proportional fairness. It is shown that intermediate nodes do
forward other nodes’ packets at the equilibrium point, thus resulting in successful
multi-hop communication. As the cost of forwarding increases, the social utility and
proportional fairness decrease at the equilibrium point.
The presence of multiple Nash equilibria prevents us from predicting which one
will actually exist in the system. In future we plan to explore how to design utility
functions that would make the Nash equilibrium unique. Simulating the game in a
larger network is also left as a future exercise.
References
1. Urpi, A., Bonuccelli, M., Giordano, S.: Modeling cooperation in mobile ad hoc networks:
a formal description of selfishness. In: Proceedings of WiOpt 2003, France, March 3-5
(2003)
2. Kamhoua, C.A., Pissinou, N., Busovaca, A., Makki, K.: Belief-free equilibrium of packet
forwarding game in ad hoc networks under imperfect monitoring. In: Proceedings of
IPCCC 2010, Albuquerque, NM, USA (December 2010)
3. Pandana, C., Han, Z., Liu, K.J.R.: Cooperation enforcement and learning for optimizing
packet forwarding in autonomous wireless networks. IEEE Transactions on Wireless
Communications 7(8) (August 2008)
4. Owen, G.: Game Theory, 3rd edn. Academic Press, New York (2001)
5. Buttyaan, L., Hubaux, J.P.: Nuglets: a virtual currency to stimulate cooperation in self-
organized mobile ad hoc networks. Technical Report DSC/2001/001, Department of
Communication Systems, Swiss Federal Institute of Technology (2001)
6. Buttyan, L., Hubaux, J.P.: Enforcing service availability in mobile ad hoc WANs. In:
Proceedings of MobiHoc 2000, Boston, MA, USA (August 2000)
7. Felegyhazi, M., Hubaux, J.-P., Buttyan, L.: Nash equilibria of packet forwarding strategies
in wireless ad hoc networks. IEEE Transactions on Mobile Computing 5(5) (May 2006)
8. Michiardi, P., Molva, R.: CORE: A COllaborative REputation mechanism to enforce node
cooperation in mobile ad hoc networks. In: Proceedings of CMS 2002, Portoroz, Slovenia,
September 26-27 (2002)
9. Zhong, S., Yang, Y.R., Chen, J.: Sprite: A simple, cheat-proof, credit-based system for
mobile ad hoc networks. In: Proceedings of IEEE INFOCOM 2003, March 30-April 3
(2003)
10. Marti, S., Giuli, T.J., Lai, K., Baker, M.: Mitigating Routing Misbehavior in Mobile Ad
Hoc Networks. In: Proceedings of Mobicom 2000 (2000)
11. Srinivasan, V., Nuggehalli, P., Chiasserini, C.F., Rao, R.R.: Cooperation in wireless ad hoc
networks. In: Proceedings of IEEE INFOCOM 2003, San Francisco, March 30-April 3
(2003)
Improving Reliability in Cognitive Radio Networks
Using Multiple Description Coding
1 Introduction
There exist few research efforts on the problem of secondary traffic transmission over Cognitive Radio networks using Multiple Description Coding. In [5], Kushwaha, Xing, Chandramouli and Subbalakshmi have studied the coding aspect of CR networks. Principally, the paper gives an overview of multiple description codes as a source coding scheme well suited for use in CR networks. For simulation results, the paper adopts LT codes to combat secondary-use losses under the CR architecture model defined in [7]. This study was an attempt to give a general analysis of MDC applications in CR networks, and no numerical results were presented for this specific coding scheme. In [8], Li has investigated the use of MDC in cognitive radio systems to overcome the losses caused by primary traffic arrivals for secondary applications that are delay sensitive with some tolerable quality degradation. Using a Gaussian source, he proposed an algorithm that turns the selection of rates and distortions into an optimization problem for the expected utility. The primary users' occupancy of each frequency channel was modeled as a Markov chain, and numerical results were presented for real-time image coding. However, this study did not consider the sharing aspect of CR networks due to the opportunistic spectrum access feature; consequently there is an additional average packet loss due to collision effects, which considerably degrades the system Spectral Efficiency. Moreover, the study considered only Gaussian sources and needs to be generalized to more source types. Recently, in [6], the issue of multimedia transmission over CR networks using MDC was treated, but OSS conflicts were omitted.
As in [7], one packet is sent per subchannel. The network topology providing the infrastructure for multimedia communication in a secondary-use scenario is depicted and discussed below. Our main attention is paid to multimedia applications that are delay constrained with a targeted distortion measure: the secondary stream delivery has a predetermined delay not to exceed, under some allowed distortion. The Time Division Multiple Access (TDMA) method has been adopted as a means of sharing the same CR infrastructure among multiple cognitive devices (SUs). Meanwhile, we have opted for an efficient time slot allocation. Secondary users transmit one after the other, each principally using its specific time slot i with some probability q. Nevertheless, since the secondary transmission is opportunistic, the same secondary user may utilize the other slots if it has data to transmit outside its own slot, such as urgent packets or prioritized data; let p be the probability that this SU conveys its packets in the remaining slots j ≠ i (Fig. 1) [9].
Realistic CR contexts are distinguished by their complexity and the multitude of factors shaping their performance. Thus, for ease of exposition, we consider only three chief factors influencing the reliability of the cognitive transmission: first, primary traffic interruptions causing harmful interference to the secondary signal; second, cognitive transmissions may collide with one another because each secondary peer attempts to transmit opportunistically in the remaining slots reserved for the other secondary peers; last, data packets may be disrupted by subchannel characteristics such as shadowing. In the present CR network model, a collision occurs when two or more secondary users attempt to deliver one or more packets across the same Secondary User Link (SUL) at the same time; in other words, the dynamics between the competing secondary users when accessing the available secondary user links is the root cause of secondary mutual collisions. Clearly, the occurrence of collisions may impede the performance of the CR network.

Multiple Description Coding has been employed as a joint source-channel coding technique to cope with the specific loss pattern examined, namely primary traffic interruptions, packet collisions and subchannel fading. Indeed, the media stream is progressively encoded using a scalable compression algorithm such as the well-known SPIHT [13]. The paper makes use of a specific source coding structure that implements the Priority Encoding Transmission (PET) packetization technique of Albanese et al. [10] (Fig. 2). Different amounts of Forward Error Correction (FEC) are allocated to different message fragments according to their contribution to the image quality; the FEC used can be Reed-Solomon codes [11] or any other error-correcting codes such as Fountain codes [12]. Using MDC in conjunction with the PET-based packetization scheme in CR networks enables recovering the multimedia content up to a quality commensurate with the number of received descriptions, and provides reliability for various secondary applications against the resulting erasures. A structural sketch of this packetization is given below.
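The layered PET layout can be sketched as follows; the erasure code is replaced by a dummy padding function, and the layer sizes and FEC amounts are illustrative:

def pet_packetize(layers, fecs, N):
    """Sketch of PET packetization: layer l is encoded into N symbols
    (its N - FEC_l data symbols plus FEC_l redundancy symbols), forming
    description D_l; packet j then collects symbol j of every
    description, so layer l survives the loss of any FEC_l packets.

    erasure_encode stands in for a real code (Reed-Solomon [11] or a
    Fountain code [12]); here it only pads with dummy parity symbols.
    """
    def erasure_encode(data, n_total):
        return list(data) + [b"FEC"] * (n_total - len(data))

    assert all(len(layer) + fec == N for layer, fec in zip(layers, fecs))
    descriptions = [erasure_encode(layer, N) for layer in layers]
    # Packet j is the j-th "column" across all L descriptions.
    return [[desc[j] for desc in descriptions] for j in range(N)]

# Three layers in decreasing importance get decreasing protection.
layers = [[b"a0", b"a1"],                     # most important layer
          [b"b0", b"b1", b"b2", b"b3"],
          [b"c0", b"c1", b"c2", b"c3", b"c4", b"c5"]]
packets = pet_packetize(layers, fecs=[4, 2, 0], N=6)
print(len(packets), len(packets[0]))          # 6 packets, L = 3 symbols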
The remainder of this paper is organized as follows: In Section 2 the analytic
expression of the Message Error Probability which considers primary traffic
interruptions, TDMA collisions and subchannel characteristics is computed. Then, the
Spectral Efficiency expression is derived. Section 3 evaluates the influence of the
proposed model parameter settings on the secondary traffic transmission performance
in view of the Spectral Efficiency and finally Section 4 draws some conclusions.
Fig. 1. Secondary users $SU_i^u$ and $SU_i^v$, 1 ≤ i ≤ 4, surrounding the transmission u → v
A redundancy amount (FEC) is added to each message fragment such that the sub stream $N_l$ and the FEC form a description $D_l$ (Fig. 2). Let $FEC_l$ be the length of the FEC assigned to the l-th stream, where $l \in \{1, \ldots, L\}$. We state that $N = N_l + FEC_l$. Unlike [6], the question of how much FEC to assign to each layer is not addressed in this contribution.
$P_{reclaim, l_0}$ is the probability that the active cognitive user v fails to receive the $N_{l_0}$ packets over the selected SUL due to primary traffic interruptions. $P_{collision, l_0}$ is the probability that at least $FEC_{l_0} + 1$ secondary communications coincide in the same time slot over the same subchannels in the selected SUL. $P_{fading, l_0}$ is the probability that at least $FEC_{l_0} + 1$ subchannels are subject to fading and noise.
The communication will succeed only if at most $FEC_{l_0}$ subchannels are claimed by their associated licensed users. Hence, the message error probability for the secondary users that takes into consideration only the primary traffic interruptions is given by:
$P_{reclaim, l_0} = \sum_{n=1}^{N_{l_0}} \binom{N_{l_0} + FEC_{l_0}}{FEC_{l_0} + n}\, p_a^{FEC_{l_0}+n} (1 - p_a)^{N_{l_0}-n}.$ (2)
Let i be the time slot assigned to the active cognitive user u and $Deg_v$ the number of neighbors of the active cognitive user v ($SU_i^v$, 1 ≤ i ≤ 4 in Fig. 1).
Let $P_c$ be the probability that some collision perturbs the transmission u → v on a given subchannel. As shown in [9], $P_c$ can be written as:

$P_c = 1 - \frac{q(1-p) + (M-1)\,p\,(2-p-q)}{M}\,(1-p)^{Deg_v - 1}.$ (3)
$P_{collision, l_0} = \sum_{i=1}^{N_{l_0}} \binom{N_{l_0} + FEC_{l_0}}{FEC_{l_0} + i}\, P_c^{FEC_{l_0}+i} (1 - P_c)^{N_{l_0}-i}.$ (4)
$Eff_{l_0} = \frac{(1 - P_{err, l_0}) \times L \times N_{l_0}}{S \times W \times T_{data}}.$ (6)
3 Numerical Results
For these experiments, we used the Lenna image compressed with SPIHT at a bit rate of r = 0.146 bit/pixel for data and FEC bytes. We consider an ATM transmission; ATM packets consist of 48 bytes, of which 1 byte is reserved for the sequence number. Therefore, a total of N = 100 packets are needed. The image must be transmitted over a TDMA CR network within a maximum delay of $T_{data}$ = 1 ms. The subchannel bandwidth is W = 100 kHz and the estimated traffic average on the assigned slot i is q = 98%. $p_a$ and π have been fixed at 0.08 and 0.02, respectively.
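With these settings, the loss components of Eqs. (2)-(4) can be evaluated directly; Eq. (5), which combines them into $P_{err, l_0}$, is not reproduced in this excerpt, so only the components are computed:

from math import comb

def p_reclaim(n_l, fec_l, p_a):
    """Eq. (2): more than fec_l of the n_l + fec_l subchannels are
    reclaimed by their licensed (primary) users."""
    return sum(comb(n_l + fec_l, fec_l + n)
               * p_a ** (fec_l + n) * (1 - p_a) ** (n_l - n)
               for n in range(1, n_l + 1))

def p_subchannel_collision(p, q, M, deg_v):
    """Eq. (3): per-subchannel collision probability."""
    return 1 - (q * (1 - p) + (M - 1) * p * (2 - p - q)) / M \
             * (1 - p) ** (deg_v - 1)

def p_collision(n_l, fec_l, pc):
    """Eq. (4): more than fec_l subchannels suffer a collision."""
    return sum(comb(n_l + fec_l, fec_l + i)
               * pc ** (fec_l + i) * (1 - pc) ** (n_l - i)
               for i in range(1, n_l + 1))

# Settings from the experiments above: N = 100 packets, FEC = 78,
# q = 0.98, p_a = 0.08, M = 5, Deg_v = 1, p = 0.4.
fec, n_data = 78, 100 - 78
pc = p_subchannel_collision(p=0.4, q=0.98, M=5, deg_v=1)
print(p_reclaim(n_data, fec, p_a=0.08), p_collision(n_data, fec, pc))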
Fig. 3 depicts the impact of the average traffic in the remaining slots j ≠ i on the message error probability $P_{err, l_0}$. $Deg_v$ and M have been fixed at 1 and 5, respectively. As expected theoretically, the message error probability improves as the number of FEC packets increases. Indeed, within a given time slot, as the number of subchannels carrying FEC packets increases, it becomes very unlikely that two secondary users transmit over the same subchannel, and consequently collisions are more likely to be avoided. It is also interesting to note that, for some FEC values, there is an optimal value of p that minimizes the message error probability. Carefully designing the opportunistic transmission plays a critical role in cognitive radio systems in reaching a good compromise between efficient sharing of licensed spectrum and reliability.
Fig. 3. Message Error Probability $P_{err}$ versus probability p (FEC = 85, 80, 78, 70, 65, 20)
Fig. 4 illustrates the Spectral Efficiency achieved over a Cognitive Radio network shared by several SUs using TDMA, plotted against the probability p; here the impact of the FEC on the traffic transmission performance is analyzed, with $Deg_v$ = 1 and M = 5 fixed. On the one hand, we find an optimal value of p (p ≈ 40%) that maximizes the system Spectral Efficiency; on the other hand, the quantity $Eff_{l_0}$ attains another maximum at a specific FEC value (FEC ≈ 78). For a large amount of FEC, i.e., a large amount of protection, transmission is reliable and the message error probability is small, which increases the Spectral Efficiency; however, a large amount of FEC leaves a reduced number of information symbols $N_{l_0}$, which degrades the Spectral Efficiency.
Fig. 4. Spectral Efficiency [bps/Hz] versus probability p (FEC = 85, 80, 78, 70, 65, 20)
In Fig. 5, the computed Message Error Probability is given versus the probability p; simulations were run for several FEC values, with M = 5 and $Deg_v$ = 3. It can be observed that high traffic performance in terms of $P_{err, l_0}$ is attained as $FEC_{l_0}$ increases. Nevertheless, when $Deg_v$ increases from 1 to 3, the message error probability increases.
Fig. 5. Message Error Probability $P_{err}$ versus probability p (FEC = 85, 80, 78, 70, 65)
This is expected: as the number of neighbors increases, the chance that this SU interferes with the other active cognitive users increases. On the other hand, it is noticed that there is no optimal value of the probability p maximizing the system Spectral Efficiency.
Fig. 6 represents the Spectral Efficiency against the probability p for several values of $FEC_{l_0}$. The number of slots has been fixed at 5 and $Deg_v$ has been increased to 3. The Spectral Efficiency is degraded by the increased number of neighbors of the active CR user v; the reason is obvious: more neighbors means more active SUs, which provokes more collisions. As the amount of FEC increases, $Eff_{l_0}$ improves, and the graph has no local maximum, which means that increasing the traffic p decreases the efficiency of the spectral resources for these specific system parameter settings.
Fig. 6. Spectral Efficiency [bps/Hz] versus probability p (FEC = 85, 80, 78, 70, 65, 20)
4 Conclusion
In this paper, the issue of delay-constrained multimedia transmission over Cognitive Radio networks is examined. Indeed, the TDMA frame is opportunistically shared by several SUs. We have suggested the use of the MDC approach to improve the resilience of CR networks against the unreliability caused by primary traffic reclaims, opportunistic collisions and subchannel shadowing. Numerical simulations have been presented in terms of the Message Error Probability and the Spectral Efficiency. The achieved results prove the effectiveness of the given solution in achieving a good balance between the various system parameters and the packet loss pattern.
References
1. Shared Spectrum Company: Spectrum occupancy measurements, http://www.sharedspectrum.com/measurements/
2. NTIA, U.S. frequency allocations,
http://www.ntia.doc.gov/osmhome/allochrt.pdf
3. Mitola III, J.: Cognitive Radio: An Integrated Agent Architecture for Software Defined
Radio. Ph.D Thesis, KTH Royal Institute of Technology (2000)
4. Cabric, D., Mishra, S.M., Willkomm, D., Brodersen, R.W.: A Cognitive Radio approach for usage of virtual unlicensed spectrum. In: Proc. 14th IST Mobile and Wireless Communications Summit (2005)
5. Kushwaha, H., Xing, Y., Chandramouli, R., Subbalakshmi, K.P.: Erasure Tolerant Coding
for Cognitive Radios. In: Mahmoud, Q.H. (ed.) Cognitive Networks: Towards Self-Aware
Networks. Wiley, Chichester (2007)
6. Chaoub, A., Ibn Elhaj, E., El Abbadi, J.: Multimedia Traffic Transmission over Cognitive
Radio Networks Using Multiple Description Coding. In: Abraham, A., Lloret Mauri, J.,
Buford, J.F., Suzuki, J., Thampi, S.M. (eds.) ACC 2011, Part I. CCIS, vol. 190, pp. 529–
543. Springer, Heidelberg (2011)
7. Willkomm, D., Gross, J., Wolisz, A.: Reliable link maintenance in cognitive radio systems.
In: First IEEE International Symposium on New Frontiers in Dynamic Spectrum Access
Networks, Baltimore, pp. 371–378 (2005)
8. Li, H.: Multiple description source coding for cognitive radio systems. In: 2010
Proceedings of the Fifth International Conference on Cognitive Radio Oriented Wireless
Networks & Communications (CROWNCOM), Cannes, pp. 1–5 (2010)
9. Chaoub, A., Ibn Elhaj, E., El Abbadi, J.: Multimedia traffic transmission over TDMA
shared Cognitive Radio networks with Poissonian Primary traffic. In: International
Conference on Multimedia Computing and Systems, Ouarzazat, pp. 378–383 (2011)
10. Albanese, A., Blömer, J., Edmonds, J., Luby, M., Sudan, M.: Priority encoding
transmission. IEEE Transactions on Information Theory 42, 1737–1744 (1996)
11. Mohr, A.E., Riskin, E.A., Ladner, R.: Generalized multiple description coding through
unequal loss protection. In: IEEE International Conference on Image Processing, Kobe, pp.
411–415 (1999)
12. MacKay, D.J.C.: Fountain codes. IEE Proceedings Communications 152, 1062–1068
(2005)
13. Said, A., Pearlman, W.A.: A new, fast, and efficient image codec based on set partitioning
in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology 6,
243–250 (1996)
Design and Development of an Enhanced UDDI
for Efficient Discovery of Web Services
Keywords: Web service, Web service registry, UDDI, Search engine, Efficient
Discovery.
1 Introduction
The use of the internet as a means of doing business on-line has increased with its popularity. Web services have become the preferred technology for such services because of the inherent benefit of loose coupling. Web services are internet-based, modular applications, and they are of immense interest to governments, businesses, and individuals. The basic web services architecture permits a service requester to find services in a service registry, where service providers publish their services so that requesters may find them. In this environment, searching mostly relies on the accessibility and capabilities of the
repositories in which these services are accumulated. The major specification for forming service-based repositories or registries is Universal Description, Discovery, and Integration (UDDI) [1]. UDDI allows for the description of global registries where information about services is published. At present, UDDI is the only established standard for Web service discovery across the world [2].

A search engine concept is used in this paper to enhance the UDDI registry with the intention of improving the searching facility. At first, the web services are published in the UDDI registry by the service provider. The information uploaded to the registry is stored as a WSDL document, which contains the businessEntity, businessService, bindingTemplate and tModel. To transfer information between the service provider and the registry, we use SOAP messages and API calls. Then, the index database is built using the <description> label of the businessEntity and businessService. We maintain two separate index databases, built using an inverted index structure. In the discovery phase, the search query is matched against the keywords defined in the index database, and the keys corresponding to the matched keywords are returned to the user. The accessing behavior of the current users is stored in log files, one for businessEntity and one for businessService. As more users use this search engine, the number of records in the log files grows, and personalized businesses and businessServices are provided to the user by analyzing these log files.

The outline of the paper is as follows. A brief review of related research is discussed in Section 2. The proposed Enhanced UDDI for web service discovery is presented in Section 3. The implementation of the proposed Enhanced UDDI registry is given in Section 4. The conclusion is summed up in Section 5.
A few studies are available in the literature on web service publishing and discovery using the UDDI registry; extending the UDDI registry has received considerable attention among researchers as a way of identifying web services effectively. Here, we present some of the research related to the UDDI registry. Zongxia Du et al. [4] extended and arranged private or semi-private UDDIs based on industry classifications to present an active and distributed UDDI architecture named Ad-UDDI. Song et al. [5] conducted an experiment to analyze better techniques of using general-purpose search engines to find web services: they published web services using nine different techniques and retrieved them using the Yahoo and Google search engines with two groups totalling 18 queries.
(ii) UDDI interfaces: There are two types of interfaces in the UDDI registry. The publisher interfaces are used by the service provider to register services in the registry; APIs such as get_authToken, save_business, save_service and save_tModel, known as the publishing API, are used by service providers to manage the entries present in the UDDI registry. The inquiry interfaces are used by service customers to discover services in the registry; inquiry API calls such as find_business, find_service, get_businessDetail and get_serviceDetail are used for discovering web services in the UDDI registry and for retrieving service descriptions of particular registrations. For publishing web services, the service provider should first get an authentication token from the registry by submitting a user name and password. After validating these inputs, the registry generates an authentication token for the service provider. Using this token, the service provider can upload the business details to the registry. Once the business details are uploaded, the registry returns a business key, which the service provider then uses to upload the web services. After the business is published, the registry generates a service key for the user, which is used for uploading technical details to the registry. For uploading all these details, the service provider uses the get_authToken, save_business, save_service and save_tModel interfaces. Thus, the descriptions of the business, web services and technical information provided by the service provider are stored in the registry as a WSDL document. A sketch of the first call in this flow is given below.
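A minimal sketch of the first publishing call, assuming a UDDI v2 registry endpoint reachable over HTTP; the URL and credentials are placeholders, and production code would normally use a UDDI client library rather than hand-built SOAP:

import urllib.request

PUBLISH_URL = "https://uddi.example.com/publish"   # placeholder endpoint

def get_auth_token(user_id, cred):
    """Send the UDDI v2 get_authToken publishing call as raw SOAP.

    The message layout follows the UDDI v2 API namespace
    (urn:uddi-org:api_v2); the endpoint and credentials are
    illustrative.  The response body carries the <authToken> that
    subsequent save_business / save_service calls must present.
    """
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">'
        '<Body>'
        f'<get_authToken generic="2.0" xmlns="urn:uddi-org:api_v2"'
        f' userID="{user_id}" cred="{cred}"/>'
        '</Body></Envelope>')
    request = urllib.request.Request(
        PUBLISH_URL, data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": '""'})
    with urllib.request.urlopen(request) as response:
        return response.read()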
This step is additional work done in the UDDI registry for easy retrieval of web services and for optimizing the speed and performance of relevant web service searches. Once the service providers register their services in the registry, the index database, in which each service is indexed, is built automatically. Indexing is commonly performed in several ways, for example using a suffix tree, inverted index, citation index, n-gram index or document-term matrix. In the proposed system, we make use of an inverted index for indexing the web services registered in the UDDI registry.

Here, we have designed two index databases, IB (for businessEntity) and IS (for businessService), using the inverted index data structure. The description given by the service provider in the businessEntity is used to construct the index database IB, which contains two important fields, namely Keyword and Business key. Keyword refers to the significant keywords obtained from the <description> label of the business entity. Business key refers to the unique business key of the business entity related to the significant keyword. These two fields are added to the index database for each business registered in the UDDI. For a new service provider, the significant keywords are identified, and the business key related to each keyword is inserted into the index database along with the keyword. If the keyword is already in the index database, the business key is added to the Business key field corresponding to that keyword. An example of the inverted index of the businessEntity is given in Table 1.
This section describes the discovery of web services from the index database using a
keyword-based search mechanism. When a new user wants to obtain a web service,
he/she submits a query (typically a keyword) to the search engine. The query is then
matched against the keyword field of the index database. The keys corresponding to
the matched keyword are taken from the index database and returned to the user. In
this way, businesses and services are discovered, and the detailed descriptions of both
are obtained using the returned keys.
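To make the indexing and lookup flow concrete, the following is a minimal Python sketch of the IB inverted index described above: keywords extracted from <description> texts map to sets of business keys, and a keyword query returns the matching keys. The stopword list, the key format and the helper names are illustrative assumptions, not part of the proposed system.

from collections import defaultdict

STOPWORDS = {"a", "an", "the", "of", "for", "and"}   # assumed stopword list

def significant_keywords(description):
    # extract significant keywords from a <description> text
    return [w for w in description.lower().split() if w not in STOPWORDS]

def build_ib(business_entities):
    # IB index: keyword -> set of business keys
    ib = defaultdict(set)
    for business_key, description in business_entities:
        for kw in significant_keywords(description):
            ib[kw].add(business_key)   # existing keyword: append the new key
    return ib

ib = build_ib([
    ("BK-001", "Online weather forecast service"),    # illustrative keys
    ("BK-002", "Weather alert and traffic service"),
])
print(sorted(ib["weather"]))   # ['BK-001', 'BK-002']
print(sorted(ib["traffic"]))   # ['BK-002']

A discovery query then reduces to a single dictionary lookup, which is what makes the keyword search over registered services fast.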
4 Implementation
The partial implementation of the Enhanced UDDI for web service registration and
discovery consists of two phases: service publishing and service discovery. In the
service publishing phase, the business details are stored in the businessEntity and
businessService data models and the technical details are uploaded in the tModel. The
details uploaded by the service provider are stored in a WSDL document, which is
located in the UDDI registry.
In the service discovery phase, the query keyword given to the search engine by the new
user is mapped to the index database, and the keys relevant to the keyword are returned
to the user to access the web services. Web services can be accessed by matching the
business query keyword as well as the businessService keyword.
5 Conclusion
Identification, Authentication and Tracking Algorithm
for Vehicles Using VIN in Centralized VANET
Abstract. A VANET should allow only authentic vehicles to participate in the
system for efficient utilization of its available resources. The proposed
system architecture contains multiple base stations in the coverage area of a
certifying authority. The base station verifies the identification of a vehicle,
and the certifying authority verifies the authentication of the vehicle using its
vehicle identification number. The certifying authority also generates a digital
signature for each authentic vehicle and assigns it to the corresponding vehicle
through the base station. The base station allocates a channel to each authentic
vehicle; the channel remains busy as long as the vehicle is within the coverage
area of that base station. The base station is thus able to track an authentic
vehicle by sensing the allocated channel within its coverage area.
1 Introduction
The unique identification of a vehicle moving at high speed using a vehicle tracking
system helps to track it at various check points within a boundary or premises. It
supports the security of vehicles in case of theft or other unwanted incidents. Along
with identification and tracking, it is also required to protect communication in a
secured VANET system from unauthorized message injection and message alteration.
Several identification, tracking and authentication schemes have been proposed so
far. Automatic vehicle identification techniques are discussed in [1]. One such
technique uses a barcode affixed to each vehicle, read by an optical reader, but its
quality is affected by weather. Most current automatic vehicle identification
technology relies on radio frequency identification, in which an antenna communicates
with a transponder on the vehicle using DSRC. It has excellent accuracy even at
highway speed, but the major disadvantage is the cost of the transponder.
Closed-circuit television technology [2] may be used in a vehicle tracking system, but
the image quality is affected by lighting or the presence of trees. The global
positioning system (GPS) may be used to determine the current position of a vehicle [3].
But it cannot distinguish one vehicle from another. Moreover, satellite communication
is required to process the data obtained from GPS, and satellite communication is
inadequate in urban and forest areas; storms carrying ionized particles may introduce
noise into the collected information. In [4], each vehicle has a unique identity along
with a public and private key. The public key is used to encrypt the unique identity
before sending it to the receiver, and the private key is used to decrypt any received
message. But the pattern of the unique identity is not specified in this scheme, and it
needs a considerable amount of time for encryption. The biometric information of the
driver is used for authentication in [5]. This hampers the privacy of the driver, and it
is applicable only if the driver of the vehicle is a fixed person. In [6], the CA
assigns a public key to each vehicle for V2V and V2I communication. The vehicle uses
the public key for encryption. The CA generates a signature from the said public key
and assigns it to the vehicle when the vehicle wants to get a service from the network.
But any intruder may obtain the public key of a vehicle and start some communication.
The proposed VANET is a hierarchy with the certifying authority (CA) at the root
level, base stations (BSs) at the intermediate level and vehicles at the leaf level. The
dedicated short range communication (DSRC) protocol [7] is proposed for short-distance
V2I communication between vehicles and the CA through a BS, in the form of data.
The frequency range provided by DSRC is 5.850-5.925 GHz, so the available bandwidth
at each BS is 75 MHz, of which 70 MHz is usable and the remaining 5 MHz serves as
guard frequency [8]. The 70 MHz bandwidth at each BS is divided into 7 links, of which
6 are reserved for service and 1 is used for control purposes. The bandwidth of each
link is therefore 10 MHz, allocated to authentic vehicles on demand in units of
1.28 MHz. So each 10 MHz link is divided into (10 MHz / 1.28 MHz) ≈ 8 channels, and
the total number of channels available to provide service at each BS is 48. Each BS
allocates one channel to each authentic vehicle within its coverage area and so can
serve 48 vehicles using 48 channels. The proposed scheme uses an identification
algorithm to identify a vehicle uniquely by its vehicle identification number (VIN).
The authentication algorithm verifies the authenticity of a vehicle during its initial
registration phase and assigns a digital signature (D_Sig) to an authentic vehicle. The
tracking algorithm is used by each BS to track an authentic vehicle within its coverage
area. Each BS maintains a VIN database storing the encrypted VIN and D_Sig pair of all
authentic vehicles; the CA maintains a VIN database storing the available VINs. Using
the VIN for identification and authentication is advantageous because it is impossible
to transfer a VIN between vehicles or to alter the information in it. Moreover, the VIN
of a vehicle remains intact even in harsh environmental conditions. It contains
information about the manufacturer and a description of the vehicle, so besides
identification and authentication, the VIN can also be used to obtain the manufacturing
details and the detailed description of a vehicle, which may be required in case of
accidents.
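The DSRC channelization described above reduces to simple arithmetic; the short Python sketch below checks it, using only figures taken directly from the text.

total_bw = 75.0                            # MHz provided by DSRC (5.850-5.925 GHz)
usable = total_bw - 5.0                    # 70 MHz after the 5 MHz guard frequency
links = 7                                  # 6 service links + 1 control link
link_bw = usable / links                   # 10 MHz per link
channels_per_link = round(link_bw / 1.28)  # 10 / 1.28 ~ 8 channels per link
service_channels = 6 * channels_per_link   # channels available for service per BS
print(link_bw, channels_per_link, service_channels)   # 10.0 8 48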
2 Present Work
In this section a modified VIN structure is proposed. The identification and
authentication algorithms for the vth vehicle (Vv) are elaborated, and the tracking
algorithm for Vv is also discussed.
Let Vv (1 ≤ v ≤ V) enter the coverage area of the Bth BS (BSB, 1 ≤ B ≤ S), where V is
the total number of vehicles and S is the total number of BSs in the proposed VANET
environment. In the proposed scheme it is assumed that each vehicle has an electronic
license plate (ELP) in which the encrypted VIN of the vehicle is embedded by the
vehicle manufacturer. The ELP of a vehicle broadcasts (as per IEEE P1609 and IEEE
802.11p) the encrypted VIN after entering the coverage area of a new BS. So the
ELP of Vv (ELPv) broadcasts the encrypted VIN (E_VINv). The identification
algorithm at BSB receives E_VINv from Vv and searches its VIN database
(BSB_VIN_DATABASE) for E_VINv using the BS_VIN_SEARCH algorithm (Fig. 1). If it is
found, Vv is an authentic vehicle whose initial registration phase is already over; the
identification algorithm at BSB reads the D_Sig of Vv (D_Sigv) from
BSB_VIN_DATABASE and triggers the tracking algorithm by passing D_Sigv to it.
Otherwise, the identification algorithm at BSB starts the initial registration phase of
Vv by triggering the authentication algorithm.
Fig. 1. Flowchart of the BS_VIN_SEARCH algorithm: the received encrypted VIN (Str_R) is
compared character by character with each encrypted VIN (Str_D) stored in
BSB_VIN_DATABASE; if Str_R is not found, the authentication algorithm is triggered.
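A minimal Python sketch of this search logic, as recovered from the flowchart, is given below; the database layout (a list of encrypted-VIN/D_Sig pairs) and the sample values are illustrative assumptions.

def bs_vin_search(received_vin, vin_database):
    # character-by-character comparison of Str_R with each stored Str_D
    for stored_vin, d_sig in vin_database:
        if len(stored_vin) != len(received_vin):
            continue
        if all(r == d for r, d in zip(received_vin, stored_vin)):
            return d_sig      # found: authentic vehicle, trigger tracking with D_Sig
    return None               # not found: trigger the authentication algorithm

db = [("8f3a9c", "SIG-42"), ("77b0e1", "SIG-77")]   # illustrative entries
print(bs_vin_search("77b0e1", db))   # SIG-77
print(bs_vin_search("000000", db))   # None -> initial registration phase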
3 Simulation
In this section the simulation parameters and simulation results are discussed. The
coverage area of the CA (Area_CA) is assumed to be 1000 meters to support DSRC, and
the coverage area of a BS (Area_BS) is assumed to be 300 meters during simulation. So
the maximum distance from a BS to the CA (Dist_BS_CA) is 300 meters, and from a
vehicle to its BS (Dist_V_BS) is 300 meters. The maximum number of BSs (S) is the
ratio of Area_CA to Area_BS, and the maximum number of vehicles (V) is 48S. The
data transmission time between components in the proposed VANET is the ratio of the
distance between source and destination to the data transmission rate. The data
transmission rate (Data_TR) is assumed to be 6 Mb/s [12].
The size of E_VINv (Size_E_VINv) is 17N bits, where N = log2[2 max(P, Q) − 1]. P
and Q are the prime numbers used in the RSA algorithm, and Kp and Ks are the public
and private keys used in RSA. Fig. 2 shows the plot of the transmission time of
E_VINv from Vv to BSB (TT_VBS_E_VINv) vs. Size_E_VINv.
[Figs. 2-4: plots of TT_VBS_E_VINv (s), TT_BSCA_E_VINv (s), Bv_CA_BS (s) and
VIN_P_Time (s) against Size_E_VINv (bit).]
The time to check D_VINv for validity (VT_CA_D_VINv) depends upon the size of
D_VINv. The decryption time of E_VINv (D_Time) depends upon the size of Ks in the
RSA algorithm [13]. As D_Sig is generated using the SHA-1 algorithm [13], the
D_Sig_Time is constant; it is 15 msec as observed during simulation.
The waiting time of E_VINv at CA_VIN_Queue is
WT_CA_E_VINv = Σ_{i=1}^{T} WT_CA_E_VINi, where WT_CA_E_VINi is the waiting time of
E_VINi at CA_VIN_Queue and T is the number of encrypted VINs waiting in
CA_VIN_Queue in front of E_VINv. WT_CA_E_VINi is the sum of D_Time, VT_CA_D_VINi and
D_Sig_Time. The maximum size of CA_VIN_Queue depends upon the number of
vehicles in the proposed VANET environment and is 48S. Fig. 4 shows the plot of
WT_CA_E_VINv.
4 Conclusion
The present work is an identification and authentication mechanism for vehicles in a
centralized VANET environment. It uses the VIN of a vehicle for its identification and
authentication. The performance of the proposed algorithms may be studied in a
distributed VANET environment. Moreover, this work concentrates on V2I
communication only; it can be extended to V2V communication.
References
1. http://ec.europa.eu/transport/roadsafety_library/
publications/evi_executive_summary.pdf
2. Parliamentary Office of Science and Technology. Postnote 175 (2002)
3. Balon, G.N.: Vehicular Ad-hoc Networks and Dedicated Short Range. University of
Michigan, Dearborn (2006)
4. Papadimitratos, P., Gligor, V., Hubaux, J.P.: Securing Vehicular Communications -
Assumptions, Requirements, and Principles. ESCAR (2006)
5. Raya, M., Papadimitratos, P., Hubaux, J.P.: Secure Vehicular Communications, EPFL. IEEE
Wireless Communications (2006)
6. Calandriello, G., Papadimitratos, P., Hubaux, J.P., Lioy, A.: Efficient and Robust
Pseudonymous Authentication in VANET. In: ACM VANET 2007 (2007)
7. Persad, K., Walton, C.M., Hussain, S.: Electronic Vehicle Identification: Industry Standards,
Performance. Project 0-5217, Texas Department of Transportation (August 2006)
8. Zang, Y., Stibor, L., Walke, B., Reumerman, H.J., Barroso, A.: Towards Broadband
Vehicular Ad-Hoc Networks-The Vehicular Mesh Network (VMESH) MAC Protocol.
IEEE Communication Society (2007)
9. Scully, J., Fildes, B., Logan, D.: Use of Vehicle Identification Number for Safety
Research. Monash University Accident Research Centre, Melbourne, Australia (2005)
10. http://www.angelfire.com/ca/TORONTO/VIN/VIS.html
11. http://en.wikipedia.org/wiki/List_of_cars
12. Towards Effective Vehicle Identification, The NMVTRC’s Strategic Framework for
Improving the Identification of Vehicles and Components (2004)
13. Kahate, A.: Cryptography and Network Security, 2nd edn. TMH (2010)
Performance Analysis of Fault Tolerant Node
in Wireless Sensor Network
1 Introduction
In this paper, we propose a new energy minimization scheme by which the average
energy consumption of individual nodes in the sensor network is reduced during
packet transmission based on a queue threshold, taking node failures into account. We
develop an analytical model of a sensor network that incorporates node failures and
analyze the system performance in terms of average energy consumption and mean
delay.
2 System Model
3 Performance Analysis
In this section, we analyse the behaviour of a single sensor node. The arrival of data
packets at a sensor follows a Poisson process with mean arrival rate λ per node. During
its active period, a node remains in the IDLE state, switches to the BUSY state when
its buffer holds at least the threshold number of packets (N), and switches back from
the BUSY state to the IDLE state when there are no packets in its buffer. We analyze
the performance of the system in terms of the following parameters.
L = \rho_{br} + \frac{N-1}{2} + \frac{\alpha \lambda \rho E[B_r^2]}{2(1-\rho_{br})} + \frac{\lambda^2 \rho_{br}^2 E[S^2]}{2 \rho^2 (1-\rho_{br})}   (1)

where \rho = \lambda/\mu, \rho_{br} = \rho(1 + \alpha/\beta), E[S^2] is the second-order
moment of the service time and E[B_r^2] is the second-order moment of the repair time.
Since the packet sizes are equal, we consider a deterministic service time with mean
1/μ; assuming that failures occur according to a Poisson process with mean time between
failures 1/α and that the mean repair time is 1/β, the mean number of packets in the
queue (L) is determined as

L = \rho_{br} + \frac{N-1}{2} + \frac{\alpha \lambda \rho}{2 \beta^2 (1-\rho_{br})} + \frac{\rho_{br}^2}{2(1-\rho_{br})}   (2)

The optimal threshold N*, at which the average energy consumption is minimized, is

N^* = 0.5\left(-1 + \sqrt{1 + \frac{8\,C_T\,\lambda\,(1-\rho_{br})}{C_H}}\right)   (4)
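A small Python sketch evaluating the reconstructed equations (2) and (4) is given below. The parameter values λ, μ, α, β are illustrative assumptions, not the paper's simulation settings, and the square root in (4) follows the reconstruction above; C_T and C_H default to the values reported in the simulation section.

import math

def mean_queue_length(lam, mu, alpha, beta, N):
    # mean number of packets L from the reconstructed equation (2)
    rho = lam / mu
    rho_br = rho * (1 + alpha / beta)
    return (rho_br + (N - 1) / 2
            + alpha * lam * rho / (2 * beta ** 2 * (1 - rho_br))
            + rho_br ** 2 / (2 * (1 - rho_br)))

def optimal_threshold(lam, mu, alpha, beta, CT=6.9e-3, CH=0.8e-3):
    # optimal queue threshold N* from the reconstructed equation (4)
    rho_br = (lam / mu) * (1 + alpha / beta)
    return 0.5 * (-1 + math.sqrt(1 + 8 * CT * lam * (1 - rho_br) / CH))

lam, mu, alpha, beta = 2.0, 10.0, 0.1, 1.0   # illustrative values only
L = mean_queue_length(lam, mu, alpha, beta, N=4)
print(L, L / lam)                 # mean delay via Little's law, W = L / lam
print(optimal_threshold(lam, mu, alpha, beta))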
4 Simulation Model
In this section, we present the simulation model. We consider Mica2 motes forming a
wireless sensor network and perform the simulation using the various parameters given
in [6-7]. The values of CT and CH are determined as CT = 6.9 mJ and CH = 0.8 mJ.
Simulation results are obtained for various scenarios by varying the mean arrival rate
per node, the failure rate and the threshold number of packets, to determine the
average energy consumption of a node and the mean delay experienced by the packets
per node. The results show that the average energy consumption is reduced by
increasing the threshold value N, with the minimum energy consumed at the optimal
threshold value N*. They also show that there exists a trade-off between energy
consumption and mean delay, and that the mean delay increases as the failure rate
increases.
In this section, we present the simulation and analytical results. Simulation and
analytical results for the mean delay and the average energy consumption of a node
are obtained by varying the queue threshold (N), as shown in Fig. 1 and Fig. 2. From
Fig. 1, it is inferred that the mean delay increases linearly with the queue threshold
(N), because packets have to wait longer for larger threshold values. From Fig. 2, it
is inferred that, as N increases, the average energy consumption per node first
decreases and then increases, with the minimum energy consumed at the optimal
threshold. For N = 4, N* = 6 and N = 10 with a mean arrival rate per node of 2, the
energy consumption savings are found to be 66%, 68% and 63% respectively when
compared to the no-threshold condition (i.e., N = 1). The optimal threshold value (N*)
from equation (4) for a mean arrival rate per node of 2 is 6. Hence, the maximum
energy saving of 68% is achieved for the optimal threshold value N* = 6, as shown in
Fig. 3. From Fig. 4, it is also inferred that the mean delay increases as the failure
rate increases.
6 Conclusions
In this work, we have proposed a new energy minimization scheme by which the
average energy consumption of nodes in the sensor network is reduced based on a
queue threshold during the active period. We have developed an analytical
model of a sensor network incorporating node failures using an M/G/1 queueing
model, and the system performance in terms of average energy consumption and mean
delay has been determined. The results show that the average energy consumption
saving is 68% for the optimal threshold value compared to the no-threshold condition
for a mean arrival rate per node of 2, and that there exists a trade-off between the
average energy consumption and the mean delay. It is also inferred that the mean
delay increases as the failure rate increases. The simulations were performed for
100 runs with a 95% confidence interval; the analytical results match the simulation
results closely under various scenarios, showing the accuracy of our approach.
References
1. Quan, Z., Subramanian, A., Sayed, A.H.: REACA: An Efficient Protocol architecture for
Large Scale Sensor Networks. IEEE Transactions on Wireless Communications 6(10),
3846–3855 (2007)
2. Carle, J., Simplot-Ryl, D.: Energy-efficient area monitoring for sensor networks. IEEE
Computer 37(2), 40–46 (2004)
3. Chiasserini, C.F., Garetto, M.: An analytical model for wireless sensor networks with
sleeping nodes. IEEE Trans. Mobile Computing 5(12), 1706–1718 (2006)
4. Maheswar, R., Jayaparvathy, R.: Performance Analysis of Cluster based Sensor Networks
Using N-Policy M/G/1 Queueing Model. European Journal of Scientific Research 58(2)
(2011)
5. Liu, H., Nayak, A., Stojmenović, I.: Fault-Tolerant Algorithms/Protocols in Wireless Sensor
Networks. In: Guide to Wireless Sensor Networks, pp. 261–291. Springer (2009)
6. Polastre, J., Hill, J., Culler, D.: Versatile low energy media access for wireless sensor
networks. In: Proc. SenSys 2004, pp. 95–107 (2004)
7. http://www.xbow.com/products/Product_pdf_files/
Wireless_pdf/MICA2_Datasheet.pdf
Diameter Restricted Fault Tolerant Network Design
Abstract. Low transmission delay, high fault tolerance and low design cost are
the three main properties of any network, and they are best described by its
topology. Transmission delay can be decreased by restricting the diameter of
the network, yet very few methods in the literature have considered the
importance of the network diameter in decreasing transmission delay. Fault
tolerance in the network depends on the number of disjoint paths between a node
pair. Designing a k-connected fault tolerant network subject to connectivity and
diameter constraints at minimal cost is an NP-hard problem. In this paper, an
efficient constructive heuristic algorithm is proposed for designing a
k-connected network while optimizing its cost subject to the connectivity and
diameter constraints. The diameter of the resultant network is two links
regardless of network size, giving speed comparable to a completely connected
network at low cost. The effectiveness of the proposed approach is also
evaluated using different examples.
1 Introduction
computationally effective, as these require a feasibility and optimality check for each
intermediate topology until a sub-optimal topology is found, and they are not suitable
for ad hoc networks where the resultant topology is needed quickly.
Various constructive heuristic approximation algorithms have also been used for
designing k-connected network topologies. These algorithms generate a topology from
scratch by adding low-cost links until a topology satisfying the desired constraints
is found. The number of links and the design cost are the parameters used in this
paper to measure the efficiency of constructive algorithms. In [8], the minimum
number of links is much larger than the optimal number, and there is no restriction
on the diameter of the resultant network. In another approach [9], the minimum number
of links required is k*(n-k), which is near-optimal only when k > n/2; but this is
hardly required in any network [7], and it is also observed that when k is greater
than n/2 the resultant network of this approach is not k-connected. In [10], the cost
of the resultant network depends strongly on the numbering of the nodes and may
differ for the same network numbered differently, and there is no restriction on the
diameter of the resultant network. Node degree is used there for connectivity, which
is not a sufficient condition [11]. El-Hajj et al. [12] proposed an approach for
designing a network with a diameter of two links regardless of network size, but the
design cost is not well optimized in that approach.
In this paper, an efficient constructive heuristic algorithm is proposed for designing
a topology subject to fault tolerance and diameter constraints while optimizing the
cost. The diameter of the network is set to two regardless of network size, since we
want to design a fast network whose speed is comparable to a completely connected
network at low cost. The resultant network is a √n-connected network, where n is the
number of nodes. The paper is organized as follows: the proposed constructive
heuristic algorithm is explained in Section 2. In Section 3, we illustrate our
proposed approach with some network design examples. A comparative analysis of our
proposed approach and other existing approaches is given in Section 4. We end the
paper with concluding remarks in Section 5.
The proposed approach searches for a subset of low-cost links to design a k-connected
network with a two-link diameter. Here k equals √n, as this is the minimum
connectivity requirement for a network with a two-link diameter. The starting node
from which the design starts significantly affects the cost of the resultant network.
Keeping this in mind, we propose to start the topology design with the node whose
eccentricity is the highest among all nodes in the network; the eccentricity of a node
is the distance/cost from that node to its farthest node. In this way we can
appreciably reduce the network cost, because no two distant nodes are ever connected
directly: the chosen farthest node is always connected to its k nearest nodes.
The first step is finding the node with the highest eccentricity. Suppose node s has
the highest eccentricity. Node s is first connected directly to its k nearest nodes to
satisfy the connectivity constraint. Once the connectivity requirement of node s is
satisfied, the next step is to connect it to all other remaining nodes in the network
via its k directly connected nodes, to satisfy the two-link diameter constraint. Node s
now satisfies both constraints, i.e., k-connectivity and two-link diameter. Next, we
find the node with the next highest eccentricity, say q, among the nodes of the given
network excluding node s. We connect node q to k − degree(q) of its nearest nodes to
satisfy the connectivity condition. We then check the length of the shortest path from
node q to all nodes. If any node is unreachable from node q, or its shortest path is
longer than two links, we connect it to its nearest directly connected neighbour of
node q. This procedure is repeated for all nodes. The resultant network is a low-cost
k-connected network with a two-link diameter because, at every step, we start with a
highest-eccentricity node and connect it to its nearest nodes, so that no two farthest
nodes are ever connected directly. The step-by-step details of the proposed algorithm
are given below.
1. Start with the highest eccentricity node s in set V. Set V = V − {s}.
2. Find (k − degree(s)) nearest nodes of s and store them in set C.
3. Connect s to all its nearest nodes in set C, either directly or via some other node,
depending upon the degree of the nodes in set C.
4. Set P = V − C.
5. Remove the farthest node d of node s from the set P.
6. Connect node d to the nearest directly connected node v of node s, depending upon
the degrees of node s, node d, the directly connected nodes of s and the directly
connected nodes of d.
7. Repeat steps 5 and 6 until set P is empty.
8. Repeat steps 1-7 until the set V is empty.
9. The resultant graph is a k-connected network with a two-link diameter.
Steps 3 and 6 connect the highest-eccentricity node to all other nodes of the network,
either directly or via an intermediate node. The degrees of the nodes play an important
role in determining whether two nodes are connected directly or through an intermediate
node. In step 3, suppose node u is the nearest node in set C to node s. If the degree
of node u is less than k, then node s and node u are connected directly. Otherwise, if
u has degree equal to or greater than k, node s is connected to the directly connected
node of u that is nearest to node s. And if all directly connected nodes of u also have
degree equal to or greater than k, then the node nearest to s among node u and its
directly connected nodes is chosen. In this way, only the two nearest nodes are ever
connected directly. Step 6 is responsible for restricting the network diameter to two
links: once the connectivity of the node chosen at step 3 is satisfied, it is connected
to the remaining nodes via its directly connected nodes, so that it can reach every
other node either directly or through only one intermediate node. Step 6 thus
constructs paths of at most two links from the currently chosen highest-eccentricity
node to all other nodes in the network. The working of steps 3 and 6 is best described
by the algorithm Create_Link given in Fig. 1.
Algorithm Create_Link(s, d)
Begin
1. S = direct_connecting_nodes(s)
2. D = direct_connecting_nodes(d)
3. If (Degree(s) < k) and (Degree(d) < k) then
       Connect(s, d)
   Else if (Degree(s) < k) and (Degree(d) ≥ k) then
       If (deg(D) < k) then
           m = find_nearest(s, D)
           Connect(s, m)
       Else
           m = find(s, d, D)
           Connect(s, m)
   Else if (Degree(s) ≥ k) and (Degree(d) < k) then
       If (deg(S) < k) then
           m = find_nearest(d, S)
           Connect(d, m)
       Else
           m = find(d, s, S)
           Connect(d, m)
   Else if (Degree(s) ≥ k) and (Degree(d) ≥ k) then
       If (deg(S) < k) and (deg(D) < k) then
           m = find_nearest(d, S)
           m1 = find_nearest(s, D)
           If (cost(d, m) < cost(s, m1)) then
               Connect(d, m)
           Else
               Connect(s, m1)
       Else if (deg(S) < k) and (deg(D) ≥ k) then
           m = find_nearest(d, S)
           Connect(d, m)
       Else if (deg(S) ≥ k) and (deg(D) < k) then
           m = find_nearest(s, D)
           Connect(s, m)
       Else if (deg(S) ≥ k) and (deg(D) ≥ k) then
           m = find(d, s, S)
           m1 = find(s, d, D)
           If (cost(d, m) < cost(s, m1)) then
               Connect(d, m)
           Else
               Connect(s, m1)
End
direct_connecting_nodes(s): returns an array containing all directly connected
nodes of node s.
deg(S): returns the lowest degree among all nodes stored in S.
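For concreteness, a condensed Python sketch of the overall construction is given below. It always links the chosen node directly and omits the degree-based case analysis of Create_Link, so it approximates rather than reproduces the full algorithm; it can be exercised on the cost matrices of Section 3 encoded as nested dictionaries.

import math

def eccentricity(node, cost):
    # cost from a node to its farthest node
    return max(cost[node][v] for v in cost if v != node)

def build_topology(cost):
    nodes = list(cost)
    k = round(math.sqrt(len(nodes)))          # k = sqrt(n), rounded
    adj = {v: set() for v in nodes}
    remaining = set(nodes)
    while remaining:
        # step 1: pick the highest-eccentricity node among the unprocessed ones
        s = max(remaining, key=lambda v: eccentricity(v, cost))
        remaining.discard(s)
        # steps 2-3 (simplified): connect s directly to its k nearest nodes
        need = max(0, k - len(adj[s]))
        nearest = sorted((v for v in nodes if v != s and v not in adj[s]),
                         key=lambda v: cost[s][v])
        for v in nearest[:need]:
            adj[s].add(v)
            adj[v].add(s)
        # steps 5-6: reach every other node via a direct neighbour of s,
        # processing the farthest nodes first
        for d in sorted(nodes, key=lambda v: -cost[s][v]):
            if d == s or d in adj[s]:
                continue
            if not (adj[s] & adj[d]):                    # no 2-link path yet
                m = min(adj[s], key=lambda v: cost[d][v])
                adj[d].add(m)
                adj[m].add(d)
    return adj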
3 Design Examples
In this section we illustrate our proposed approach by designing a 7-node and a
13-node network.
Example 1
Number of nodes n = 7
Connectivity k = √n ≈ 3
Let us consider the 7-node network shown in Fig. 2. Table 1 gives the cost matrix
constructed from the costs (a function of distance) associated with each pair of
nodes; cost and distance between two nodes are taken to be the same here.
Fig. 2. The 7-node network with nodes A-G

Table 1. Cost matrix of the 7-node network

      A   B   C   D   E   F   G
A     0   5   3   7   4   2   9
B     5   0   5  10   8   6  12
C     3   5   0   6   4   5  10
D     7  10   6   0   4   8   5
E     4   8   4   4   0   5   5
F     2   6   5   8   5   0   8
G     9  12  10   5   5   8   0
From Table 1, it is clear that nodes B and G are the two farthest nodes in this
network; either of the two may be chosen. Suppose B is chosen. The degree of node B
is 0, so B is connected directly to its k nearest nodes, i.e., A, C and F. The rest of
the nodes (D, E, G) in the network are connected to node B via one of its directly
connected nodes (A, C, F), because the degree of node B is now k. Next, choose the
node farthest from B in the node set {D, E, G}: node G is the farthest from node B,
and it is connected to F, its nearest node among nodes A, C and F. Again choose the
node farthest from B, this time from the node set {A, C}: node A is chosen, and it is
connected to its nearest node, node E, from the node set {D, E, G}. Finally, the last
remaining node D is connected to its nearest node, C, from the node
set {A, C, F}. The resultant network with a two-link diameter for node B is shown in
Fig. 3, in which node B is connected to every other node either directly or through
one intermediate node only.
Fig. 3. Network having a diameter of two links for node B
Fig. 4. Network having diameter of two links for node B and node C
The same procedure is applied to every node until every node satisfies the two-link
diameter constraint. The final network, having a two-link diameter, is shown in
Fig. 5.
Fig. 5. Final network having a diameter of two links
Example 2
Number of nodes n = 13
Connectivity k = √n ≈ 4
Let us consider the 13-node network shown in Fig. 6. Table 2 gives the cost
(distance) matrix constructed from the costs associated with each pair of nodes.
Fig. 6. The 13-node network with nodes A-M
Table 2. Cost matrix of the 13-node network

      A   B   C   D   E   F   G   H   I   J   K   L   M
A     0  15   5  10   4   9   8   7  10   6   8  10   6
B    15   0  10   8  12   6  10   8   8   9   7   8   8
C     5  10   0  12   6   8   7   6   9   5   4   6   4
D    10   8  12   0   8   8   6   7   5   6   6   5   7
E     4  12   6   8   0   8   4   6   5   4   6   6   5
F     9   6   8   8   8   0   7   4   6   5   4   5   4
G     8  10   7   6   4   7   0   6   6   5   4   5   3
H     7   8   6   7   6   4   6   0   6   5   3   5   3
I    10   8   9   5   5   6   6   5   0   4   5   3   4
J     6   9   5   6   4   5   5   5   4   0   3   4   2
K     8   7   4   6   6   4   4   3   5   3   0   4   2
L    10   8   6   5   6   5   5   5   3   4   3   0   3
M     6   8   4   7   5   4   3   3   4   2   2   3   0
[Figure: resultant two-link-diameter network for the 13-node example.]
4 Comparative Results
In this section, our proposed approach is compared with the existing approaches [9],
[12] as these are the only approaches which design a network of two links diameter.
Network design cost and number of links used to design a network are considered as
two parameters for comparison. Comparative analysis for link optimization and
design cost of two networks designed in previous section is given in table 3. It is
found that our proposed approach always results into less number of links and low
design cost than the existing approaches.
Table 3. Comparative analysis for link optimization and design cost for two networks
References
1. Wille, E.C.G., Mellia, M., Leonardi, E., Marsan, M.A.: Topological Design of Survivable
IP Networks Using Metaheuristic Approaches, pp. 191–206. Springer, Heidelberg (2005)
2. Dengiz, B., Altiparmak, F., Smith, A.E.: Local search Genetic Algorithm for optimal
design of reliable Networks. IEEE (1997)
3. Blum, C., Roli, A.: Metaheuristics in Combinatorial Optimization: Overview and
Conceptual Comparison. ACM Computing Surveys 35(3) (September 2003)
4. Szlachcic, E., Mlynek, J.: Efficiency Analysis in Communication Network Topology
Design. In: Fourth IEEE International Conference on Dependability of Computer Systems
(2009)
5. Szlachcic, E.: Fault Tolerant Topological Design for Computer Networks. In: Proceeding
of the IEEE International Conference on Dependability of Computer Systems (2006)
6. Kumar, R., Parida, P.P., Gupta, M.: Topological Design of Communication Network using
Multiobjective Genetic Optimization. IEEE (2002)
7. Pierre, S., Hyppolite, M.-A., Bourjolly, J.M., Dioume, O.: Topological Design of
Computer Communication Networks using Simulated Annealing. Engineering
Applications, Artificial Intelligence 8 (1995)
8. Zili, D., Nenghai, Y., Zheng, L.: Designing Fault Tolerant Networks Topologies based on
Greedy Algorithm. In: IEEE Third International Conference on Dependability of
Computer Systems DepCoS-RELCOMEX 2008 (2008)
9. Kamlesh, K.N., Srivatsa, S.K.: Topological Design of Minimum Cost Survivable
Computer Communication Networks: Bipartite Graph Method. (IJCSIS) International
Journal of Computer Science and Information Security 3(1) (2009)
10. Steiglitz, K., Weiner, P., Kleitman, D.J.: The Design of Minimum-Cost Survivable
Networks. IEEE Transaction on Circuit Theory Cr-16 (November 1969)
11. Fencl, T., Burget, P., Bilek, J.: Network topology design. Control Engineering
Practice. Elsevier Ltd. (2011)
12. El-Hajj, W., Hajj, H., Trabelsi, Z.: On Fault Tolerant Ad-Hoc Network Design. In:
ACM IWCMC (2009)
Investigation on the Effects of ACO Parameters
for Feature Selection and Classification
Shunmugapriya P.1, Kanmani S.2, Devipriya S.2, Archana J.2, and Pushpa J.2
1
Dept. of CSE
2
Dept. of IT
Pondicherry Engineering College, Puducherry, India
pshunmugapriya@gmail.com
1 Introduction
Feature selection is constructively used by a number of Machine Learning algorithms,
especially for Pattern Classification [2, 3, 4, 8, 9 and 10]. The presence of
redundant, irrelevant and noisy data may result in poor prediction (classification)
performance. Feature selection extracts the relevant and most useful features without
affecting the original representation of the dataset. The generic purpose of a
feature selection algorithm for Pattern Classification is the improvement of the
classifier or learner, either in terms of learning speed, generalization capacity or
simplicity of the representation [10].
Ant Colony Optimization (ACO) is a meta-heuristic search algorithm that has been
successfully employed to implement feature selection in numerous applications
[5, 7, 21, 22, 23, 24, 25, 26, 27 and 28]. It can be inferred from these works that
ACO leads to the optimal selection of features and effectively improves prediction
results. ACO employs certain parameters to solve optimization problems: the
Pheromone Evaporation Rate (PER), the Local Pheromone Update (LPU),
the parameter stating relative importance (β), the parameter that decides the
component selection (τ0), and the number of ants [1, 11].
While performing optimization using ACO, its parameters have to be fine-tuned and
assigned values. These values are assigned after experimenting with an allowed set of
numbers. The performance of the ant colony system changes with the values of the
parameters, especially PER. Works have been carried out analyzing the role of ACO
parameters in combinatorial optimization problems such as the Traveling Salesman
Problem, online parameter adaptation, etc. [11, 14, 15, 16, 17, 18, 19 and 20]. Our
work searches for the optimal values of the ACO parameters for the FS problem. Ten
standard datasets have been used to examine the behavior of ACO for different values
of PER, LPU and β. It can be inferred from the results that ACO leads to better
optimization when the value of PER is between 0.1 and 0.7: as PER goes from 0.1 to
0.7, the classification accuracy keeps increasing, and the accuracy undergoes a
transition and starts to decrease when PER exceeds roughly 0.75. So, from the
experiments conducted, the optimal value of PER when optimizing FS by ACO is a value
around 0.75.
This paper is organized in 6 sections. Feature Selection and Classification are
discussed in Section 2. Section 3 gives a brief description of ACO and the pheromone
trail. Section 4 outlines the ACO algorithm and the ACO parameters. The computational
experiments and results are described in Section 5. Section 6 concludes the paper.
2.1 Classification
A classifier takes a set of features as input, and these features have different
effects on the performance of the classifier. Some features are irrelevant and have
no ability to increase the discriminative power of the classifier. Some features are
relevant and highly correlated with the specific classification. For classification,
obtaining extra irrelevant features is sometimes very unsafe and risky [2]. A reduced
feature subset, containing only the relevant features, helps increase the
classification accuracy and reduces the time required for training.
in turn these are used to adapt the other parameter values. PER is the most
significant parameter, as it decides where more pheromone is accumulated and what is
selected [1].
τi(t+1) = (1 − ρ) · τi(t) + ρ · Δτi(t)   (3)
At the start of the algorithm, all the parameters are initialized. Each feature of the
dataset is assigned a pheromone value, initialized to a small positive number in the
range [0, 1]. Each ant selects a feature based on the probability value given in (1).
Pheromone is accumulated when a feature is selected, using equation (2). Because the
classifier is also involved, the pheromone accumulation encourages the selection of
features that are more relevant and have a positive effect on the classification
accuracy. After all the ants have finished a run, the ant producing the highest
classification accuracy is considered the best ant (L), and the global pheromone
update is done using equation (3). The global pheromone update effectively evaporates
the pheromone of irrelevant features. After a predetermined number of iterations, the
algorithm halts, yielding the set containing the optimal features.
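A hedged Python sketch of one iteration of this scheme is given below, with ρ (PER), ϕ (LPU), β and τ0 as the tunable parameters. The heuristic values eta and the evaluate() callback (standing in for the wrapped classifier) are illustrative assumptions, not the exact procedure of this paper.

import random

def pick_feature(candidates, tau, eta, beta):
    # selection probability proportional to pheromone * heuristic^beta, as in (1)
    weights = [tau[f] * (eta[f] ** beta) for f in candidates]
    r, acc = random.uniform(0, sum(weights)), 0.0
    for f, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return f
    return candidates[-1]

def aco_iteration(n_feat, subset_size, tau, eta, evaluate,
                  rho=0.7, phi=0.3, beta=0.5, tau0=0.1):
    best_subset, best_acc = None, -1.0
    for _ in range(n_feat):                              # one ant per feature
        cand, subset = list(range(n_feat)), []
        for _ in range(subset_size):
            f = pick_feature(cand, tau, eta, beta)
            cand.remove(f)
            subset.append(f)
            tau[f] = (1 - phi) * tau[f] + phi * tau0     # local update (LPU phi)
        acc = evaluate(subset)                           # classifier accuracy
        if acc > best_acc:
            best_subset, best_acc = subset, acc
    for f in best_subset:                                # global update, eq. (3)
        tau[f] = (1 - rho) * tau[f] + rho * best_acc
    return best_subset, best_acc

tau = [0.1] * 10                                         # pheromone per feature
eta = [random.random() for _ in range(10)]               # assumed heuristic values
subset, acc = aco_iteration(10, 4, tau, eta, evaluate=lambda s: random.random())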
5.2 Experiment
All computations are done using WEKA (Waikato Environment for Knowledge
Analysis) [13]. As discussed in section 4.1, the ACO parameters used for optimization of
feature selection are ρ, ϕ, β, τ0, τi and the number of ants. The number of ants is
usually set equal to the number of features, and the pheromone values τi of the
features are initialized to small positive numbers in the range [0, 1]. Because the
assigned pheromone values are probability-distributed, they can be assigned to the
features randomly. ρ, ϕ and β should also be assigned values in the range [0, 1]. The
effect of these parameters on optimization has been discussed in a number of works in
the literature [11, 14, 15, 16, 17, 18, 19 and 20].
The global update parameter ρ, which is also the parameter indicating the evaporation
rate, has a significant effect on the selection of features, depending on the value
set for it. The relative factor β and the LPU ϕ affect the optimal feature selection,
but compared to the evaporation rate ρ they are less significant. The importance of
the pheromone evaporation rate has been established in the literature [1, 11, 14, 15,
16, 17, 18, 19 and 20].
Tables 2-11 below show how the parameter values affect the behavior of ACO in the
optimal selection of features and in the increase of classification accuracy. For the
Heart dataset, when the evaporation rate ρ is assigned a value from 0 to 0.69,
features are selected and the accuracy keeps increasing. When ρ takes values in the
range [0.7, 1], the classification accuracy keeps decreasing, which is not a favorable
situation. The relative factor β has the same effect on optimization for all values in
the interval [0, 1]. The local pheromone update ϕ yields better results when it is set
to 0.3.
Evaporation Rate ρ | ϕ        | Actual Features | Features Selected                  | Accuracy (%)
0.1 to 0.73        | 0.0      | 19              | 2 (4,7)                            | 58.06
0.1 to 0.73        | 0.1      | 19              | 17 (all except 13,19)              | 60
0.1 to 0.73        | 0.2 to 1 | 19              | 14 (all except 13,15,16,17,19)     | 65.16
0.74 to 1          | 0.0 to 1 | 19              | 13 (all except 10,11,14,15,17,19)  | 77.45
Table 4. LUNG CANCER - ACO based Feature Selection and Classification Accuracy

Evaporation Rate ρ | ϕ        | Actual Features | Features Selected                | Accuracy (%)
0.1 to 0.78        | 0.0 to 1 | 56              | 33 (1,2,4,6,7,9,11,12,17,18,21,22,24,26,27,28,30,31,33,36,37,38,39,41,42,43,44,45,46,47,48,52,54) | 87.37
0.79 to 0.87       | 0.0      | 56              | 50 (all except 1,3,5,10,13,20)   | 81.25
0.79 to 0.87       | 0.1 to 1 | 56              | 53 (all except 3,5,13)           | 82.37
0.88               | 0.0      | 56              | 52 (all except 1,3,5,13)         | 71.25
0.88               | 0.1      | 56              | 6 (1,2,4,6,24,49)                | 70.87
0.88               | 0.2 to 1 | 56              | 1 (1)                            | 65.62
0.89 to 1          | 0.0      | 56              | 50 (all except 1,3,5,10,13,20)   | 80.25
0.89 to 1          | 0.1 to 1 | 56              | 2 (1,2)                          | 65.62
For the Hepatitis dataset, the impact of the parameters is the reverse of that for
Heart-C: ACO gives lower accuracy for values of ρ from 0 to 0.73, and the accuracy
increases for values of ρ higher than 0.73. β has the same effect for all values in
the range [0, 1]. ϕ affects the optimization process for lower values of ρ and
produces the same result for higher values of ρ.
Evaporation Rate ρ | ϕ          | Actual Features | Features Selected                  | Accuracy (%)
0.1 to 0.77        | 0.0 to 0.4 | 34              | 28 (all except 4,7,11,13,20,34)    | 96.90
0.1 to 0.77        | 0.5 to 1   | 34              | 27 (all except 4,7,9,11,13,20,34)  | 98.35
0.78 to 1          | 0.0        | 34              | 33 (all except 13)                 | 95.90
0.78 to 1          | 0.1        | 34              | 33 (all except 13)                 | 94.5
0.78 to 1          | 0.2 to 1   | 34              | 1 (1)                              | 35.79
Evaporation Rate ρ | ϕ           | Actual Features | Features Selected   | Accuracy (%)
0.1 to 0.77        | 0.0 to 1    | 8               | 6 (except 5,7)      | 81.11
0.78               | 0.0 to 0.2  | 8               | 6 (except 5,7)      | 81.11
0.78               | 0.3 to 1    | 8               | 6 (except 3,4)      | 89.82
0.79 to 0.81       | 0.0 to 0.2  | 8               | 6 (except 5,7)      | 81.11
0.79 to 0.81       | 0.3 and 0.4 | 8               | 6 (all except 4,8)  | 74.73
0.79 to 0.81       | 0.5 to 1    | 8               | 1 (1)               | 67.83
0.82 to 1          | 0.0         | 8               | 8 (all)             | 80.11
0.82 to 1          | 0.1         | 8               | 6 (all except 4,8)  | 74.73
0.82 to 1          | 0.2 to 1    | 8               | 1 (1)               | 67.83
Table 10. HIV - ACO based Feature Selection and Classification Accuracy
Table 11. DIABETES - ACO based Feature Selection and Classification Accuracy
From Tables 2-11 it can be seen that, except for Hepatitis, for all other datasets the
feature selection is optimal and the accuracy is higher when the evaporation rate is
set to 0.75 or below. The local pheromone update factor has only a little significance
and performs best when set to values in the range [0.3, 0.5]. The relative factor β
has the same effect on optimization for all values in [0, 1].
From the data represented in Tables 2 to 11, it can be inferred that:
a. The relative factor β affects the optimization process in the same way for all
values in the range [0, 1].
b. The local pheromone update factor ϕ shows varied performance depending on the
values assigned to it. However, the results suggest that ϕ gives the best results
when assigned values between 0.3 and 0.5.
c. When the PER ρ is assigned values around 0.7, it leads to the best optimization and
the highest classification accuracy.
d. Except for the Hepatitis dataset, the accuracy increases for PER values from 0.1 to
roughly 0.7, and starts decreasing when PER takes values from 0.75 to 1.
e. When ACO is applied to the optimization of feature selection, the parameters can be
set to the optimal values suggested by this experiment.
f. The optimal values are 0.3 to 0.5 for LPU and 0.7 to 0.75 for PER; β can take any
value from 0 to 1.
6 Conclusion
ACO has been widely employed to solve combinatorial optimization problems. The
literature shows that FS implemented using ACO has consistently resulted in optimal
feature selection and better classification accuracy. However, setting the values of
the parameters of the ACO mechanism is usually a time-consuming process: these
parameters are usually set by trial and error over all possible values, finally
selecting the numbers that yield the best results. In this work, ACO in combination
with a classifier is employed for the optimal selection of features. We have
experimented by assigning all allowed values to the ACO parameters on 10 different
datasets and arrived at optimal values for these parameters within the allowed range.
References
1. Dorigo, M., Stützle, T.: Ant Colony Optimization. The MIT Press, Massachusetts (2004)
2. Rezaee, M.R., Goedhart, B., Lelieveldt, B.P.F., Reiber, J.H.C.: Fuzzy feature selection.
Pattern Recognition 32, 2011–2019 (1999)
3. Santana, L.E.A., Silva, L., Canuto, A.M.P., Pintro, F., Vale, K.O.: A Comparative
Analysis of Genetic Algorithm and Ant Colony Optimization to Select Attributes for a
Heterogeneous Ensemble of Classifiers, pp. 465–472. IEEE (2010)
4. Ahmed, E.F., Yang, W.J.M., Abdullah, M.Y.: Novel method of the combination of forecasts
based on rough sets. Journal of Computer Science 5, 440–444 (2009)
5. Sadeghzadeh, M., Teshnehlab, M.: Correlation-based Feature Selection using Ant Colony
Optimization. World Academy of Science, Engineering and Technology 64, 497–502 (2010)
6. Dorigo, M., Di Caro, G., Gambardella, L.M.: Ant algorithms for discrete optimization.
Artificial Life 5, 137–172 (1999)
7. Abd-Alsabour, N., Randall, M.: Feature Selection for Classification Using an Ant Colony
System. In: Sixth IEEE International Conference on e-Science Workshops, pp. 86–91 (2010);
Kuncheva, L.I.: Combining Pattern Classifiers, Methods and Algorithms. Wiley
Interscience (2005)
8. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. John Wiley & Sons,
Inc. (2001)
9. Molina, L.C., Belanche, L., Nebot, À.: Feature Selection Algorithms: A Survey and
Experimental Evaluation. In: Second IEEE International Conference on Data Mining,
pp. 155–172 (2002)
10. Randall, M.: Near Parameter Free Ant Colony Optimisation. In: Dorigo, M., Birattari, M.,
Blum, C., Gambardella, L.M., Mondada, F., Stützle, T. (eds.) ANTS 2004. LNCS,
vol. 3172, pp. 374–381. Springer, Heidelberg (2004)
11. Frank, A., Asuncion, A.: UCI Machine Learning Repository. University of California,
School of Information and Computer Science, Irvine, CA (2010),
http://archive.ics.uci.edu/ml
12. WEKA: A Java Machine Learning Package,
http://www.cs.waikato.ac.nz/~ml/weka/
13. Ridge, E., Kudenko, D.: Screening the Parameters Affecting Heuristic Performance. In:
Proceedings of Genetic and Evolutionary Computation, GECCO 2007, p. 180. ACM (2007)
14. Matthews, D.C.: Improved Lower Limits for Pheromone Trails in Ant Colony Optimization.
In: Rudolph, G., Jansen, T., Lucas, S., Poloni, C., Beume, N. (eds.) PPSN 2008.
LNCS, vol. 5199, pp. 508–517. Springer, Heidelberg (2008)
15. Stützle, T., López-Ibáñez, M., Pellegrini, P., Maur, M., de Oca, M.M., Birattari, M.,
Dorigo, M.: Parameter Adaptation in Ant Colony Optimization. IRIDIA Technical Report
Series, No. TR/IRIDIA/2010-002 (2010)
16. Kumar, P.: A Note on the Parameter of Evaporation in the Ant Colony Optimization
Algorithm. International Mathematical Forum 6(34), 1655–1659 (2011)
17. Ivković, N.: Investigating MAX-MIN Ant System Parameter Space
18. Dobslaw, F.: A Parameter-Tuning Framework for Metaheuristics Based on Design of
Experiments and Artificial Neural Networks. World Academy of Science, Engineering and
Technology 64, 213–216 (2010)
19. Pellegrini, P., Favaretto, D., Moretti, E.: On MAX-MIN Ant System's parameters
20. Sivagaminathan, R.K., Ramakrishnan, S.: A hybrid approach for feature subset selection
using neural networks and ant colony optimization. Expert Systems with Applications 33,
49–60 (2007)
21. Aghdam, M.H., Ghasem-Aghaee, N., Basiri, M.E.: Text feature selection using ant colony
optimization. Expert Systems with Applications 36, 6843–6853 (2009)
22. Al-Ani, A.: Feature Subset Selection Using Ant Colony Optimization. International
Journal of Computational Intelligence 2(1), 53–58 (2005)
23. Al-Ani, A.: Ant Colony Optimization for Feature Subset Selection. World Academy of
Science, Engineering and Technology 4, 35–38 (2005)
24. He, Y., Chen, D., Zhao, W.: Ensemble classifier system based on ant colony algorithm and
its application in chemical pattern classification. Chemometrics and Intelligent
Laboratory Systems, 39–49 (2006)
25. Robbins, K., Zhang, W., Bertrand, J.: The ant colony algorithm for feature selection in
high-dimension gene expression data for disease classification. Mathematical Medicine
and Biology, 413–426 (2007)
26. Kanan, H., Faez, K.: An improved feature selection method based on ant colony
optimization (ACO) evaluated on face recognition system. Applied Mathematics and
Computation, 716–725 (2008)
27. Chandra, A., Yao, X.: Ensemble learning using multi-objective evolutionary algorithm.
Journal of Mathematical Modeling and Algorithms 5(4), 417–445 (2006)
Hop Count Based Energy Saving Dynamic Source
Routing Protocol for Ad Hoc Network
Abstract. Energy conservation has become one of the challenging issues in
extending the lifetime of a Mobile Ad hoc Network (MANET). Several energy
saving protocols have been proposed to maximize the lifespan, one of which is
the Energy Saving Dynamic Source Routing (ESDSR) protocol. ESDSR focuses only
on energy and does not consider the delay caused by an increased hop count.
This paper proposes Hop Count based ESDSR (HCESDSR), a novel approach to
maximize the lifespan of a MANET and to reduce the energy consumption. It
reduces the delay incurred in ESDSR by considering hop count as well as
energy. Fewer dead nodes are formed in the proposed work, which increases the
life of the network. Route selection is based on the number of hops and the
energy level of the route. The protocol is simulated using the network
simulator NS2. The simulation results show that this approach attains the
energy efficiency of ESDSR while overcoming its limitations.
1 Introduction
counter, which represents the current number of neighbors of each node that are kept
in the active state. In the LEAR protocol [10], an intermediate node forwards a route
request message only if its residual battery energy is higher than a threshold value;
otherwise it drops the message. The Smallest Common Power (COMPOW) protocol [5]
selects the smallest transmit power level that is just enough to maintain connectivity
of the entire network. Energy conservation in a MANET can be achieved by approaches
such as transmit power control and load distribution.
The transmit power control approach minimizes the total transmission energy required
to deliver a data packet from source to destination. Its disadvantage is that it
always chooses the same least-transmission-power path, which causes this path to be
overused and hence to 'die' faster than other paths. The load sharing approach
focuses on balancing energy usage among nodes by avoiding over-utilized nodes.
ESDSR overcomes the limitations of these two approaches [4].
In ESDSR, nodes that have a 'tendency' to 'die out' very soon are avoided during the
route discovery phase. The 'tendency' of a node to 'die out' is expressed
quantitatively as the ratio of its remaining battery energy to its current transmit
power, i.e., the 'expected life' of the node. Once the routing decision is made,
link-by-link transmit power adjustment is performed depending on the signal strength
at which a node receives a packet. However, this approach does not consider the
number of hops needed for transmission, so the delay is higher in ESDSR. This paper
focuses on a mechanism in which the path that has minimum hops and is also energy
efficient is selected for transmission. The following sections cover the basic
function of DSR, ESDSR, the proposed system, its simulation results and the
conclusion.
DSR is based on the concept of source routing [1], in which each packet carries the
complete ordered list of nodes through which it should pass in the network. This is
done by maintaining a cache of routes from source to destination. The DSR protocol
allows nodes to dynamically discover a source route across multiple network hops to
any destination in the ad hoc network. Each data packet sent then carries in its
header the complete, ordered list of nodes through which the packet must pass,
allowing packet routing to be trivially loop-free and avoiding the need for
up-to-date routing information in the intermediate nodes through which the packet is
forwarded. By including this source route in the header of each packet, other nodes
forwarding or overhearing any of these packets may also easily cache this routing
information for future use.
There are many energy-aware routing protocols. In Adaptive Energy-Aware Routing
Protocols for Wireless Ad Hoc Networks [1], the idea is to adaptively select
the subset of nodes required to participate in the search for a low-power path, in
networks where nodes can adaptively adjust their transmission power [3]. In efficient
energy management for mobile ad hoc networks [8], an overhead reduction and energy
management technique is adopted for the basic DSR protocol. The Energy Saving Dynamic
Source Routing (ESDSR) protocol is an enhancement of the DSR protocol. The routing
decision in ESDSR is based on the ratio of the remaining energy of a node to its
transmit power.
On a Route Reply (RREP), this protocol estimates a node's expected life as Ei/Pti.
Nodes replace the value recorded in the route reply if the calculated value is less
than the recorded value. The same process is repeated for Route Replies from other
routes, so that each reply packet carries the 'minimum expected life' of its route.
The source then selects the path with the highest 'minimum expected life' instead of
choosing the shortest path, as in DSR. After establishing the path for data
transmission, link-by-link power adjustment is done based on the transmit power
control approach.
This protocol focuses only on energy and not on hop count. Since packets are not sent
through the minimum number of hops, the average hop count increases, and thereby the
delay. The proposed system takes both hop count and energy into consideration, so it
overcomes the delay of ESDSR.
4 Proposed Work
The Sender Address is the source address from which the packet is sent; the Target
Address is the destination address to which the packet is sent; the Request Id is a
unique id for the corresponding request; and the route record includes the address of
each intermediate node through which the RREQ is forwarded. Every node maintains a
route cache. The pseudo code for the route discovery phase is presented in Fig. 1.
Route maintenance is the mechanism whereby the sender is able to detect whether the
network topology has changed such that it can no longer use its route to the
destination. Route maintenance is needed for two reasons: mobility and energy
depletion. Due to mobility, links between some nodes on the path may break; due to
energy depletion, the batteries of some nodes on the path may be depleting too
quickly. The pseudo code for the route maintenance phase is shown in Fig. 2.
Fig. 1. Pseudo code for route discovery phase. Fig. 2. Pseudo code for route
maintenance phase.
The proposed system differs in its load distribution approach. During the route reply,
the source node records the 'minimum expected life' Rj(t). But instead of selecting
the path with the maximum 'minimum expected life', this approach selects the path
whose cost function is maximized as follows:
Cj(t) = Rj(t) / hj(t) (1)

where Rj(t) is the minimum expected life of path j and hj(t) is the number of hops in
path j. The minimum expected life is computed using the energy metric. The energy
metric of a node is the ratio of its remaining energy to its transmit power:
expected life of node i = Ei / Pti (2)

where Ei is the remaining energy of node i on the discovered path and Pti is the
transmit power of node i on the discovered path. This approach is therefore energy
efficient: it overcomes the delay drawback of ESDSR and is more energy efficient
than the DSR protocol.
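As an illustration of this selection rule (a sketch only; the route and node structures are hypothetical, since the paper specifies just the metric), the following Python fragment scores each discovered route by its minimum expected life divided by its hop count, as in equations (1) and (2), and picks the maximum:

# Sketch of HCESDSR path selection; the route/node fields are illustrative.
def expected_life(node):
    """Energy metric of a node: remaining energy / transmit power (Eq. 2)."""
    return node["remaining_energy"] / node["transmit_power"]

def min_expected_life(route):
    """'Minimum expected life' carried back in the RREP for one route."""
    return min(expected_life(n) for n in route)

def select_route(routes):
    """Pick the route maximizing min-expected-life per hop (Eq. 1)."""
    return max(routes, key=lambda r: min_expected_life(r) / len(r))

# Usage: two candidate routes as lists of nodes (hop count = list length).
r1 = [{"remaining_energy": 8.0, "transmit_power": 0.5},
      {"remaining_energy": 6.0, "transmit_power": 0.5}]
r2 = [{"remaining_energy": 9.0, "transmit_power": 0.5},
      {"remaining_energy": 7.0, "transmit_power": 0.5},
      {"remaining_energy": 9.5, "transmit_power": 0.5}]
best = select_route([r1, r2])  # r1 wins: similar expected life, fewer hops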
In the transmit power control approach, when a node receives an RREQ packet at
power Precv that was transmitted at Ptrans, it calculates a transmit power Pnew such
that this node can communicate with the sender using the minimum required power.
Pnew = Pmin (3)
where
Pmin = Ptrans − Precv + Pthreshold + Pmargin
and Pthreshold is the required threshold power of the receiving node for successful
reception of the packet and Pmargin is the power included to overcome the problem of
unstable links due to channel fluctuations. Since this protocol maintains a small
margin it can save more energy than the protocol mentioned in [7].
Every node maintains a power table which records number of other routes through
that node and minimum required transmit power for the next hop in the corresponding
route. The power table includes power record and minimum transmit power.
The node records the newly calculated transmit power in ACK packet and sends it.
The transmitting node, after receiving the ACK packet reads Pnew and stores it in a
power table. The node uses the transmit power stored in the table to send next data
packet. This means that the packet is transmitted only with the required power,
reducing the energy consumption at each node and thus increasing the lifetime of the
route. When a node wants to send a packet to the next hop, it searches its own power
table; if it finds an entry, it transmits at the recorded minimum power, otherwise it
transmits at the default power level.
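A minimal sketch of this power table behaviour, with illustrative names and an assumed default power value (the paper does not give one), might look as follows:

# Sketch of the per-node power table described above; names are illustrative.
DEFAULT_POWER = 0.2818  # assumed default radio transmit power (W)

class PowerTable:
    def __init__(self):
        self.table = {}  # next-hop id -> minimum required transmit power

    def on_ack(self, next_hop, p_new):
        """Record the Pnew that the receiver calculated into the ACK."""
        self.table[next_hop] = p_new

    def tx_power(self, next_hop):
        """Use the stored minimum power if known, else the default level."""
        return self.table.get(next_hop, DEFAULT_POWER)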
5 Simulation
The simulation is done using the NS2 wireless extension, with the DSR version with
flow state disabled. Since the receiving power is constant for each node, it is set to
zero. The Medium Access Control (MAC) was based on IEEE 802.11 with 2 megabits
per second raw capacity. The 802.11 Distributed Coordination Function (DCF) used
RTS and CTS control packets for unicast data transmission. The data transmission
was followed by an acknowledgement (ACK) packet. The radio propagation model
chosen was the two-ray path loss model. The traffic sources were CBR with 512
bytes/packet. The power table is added to the basic node structure. The packet header
module was modified to carry the transmit power level and threshold power level of a
node. The route cache was modified to store the expected life of nodes for a path. A
scenario with 50 mobile nodes was kept constant while the simulation area and
transmission range were varied.
The performance metrics used to compare the proposed work with DSR and ESDSR
are capacity, energy consumption per packet, hop count and the number of dead
nodes formed. The capacity is measured as the total number of packets that reached
the destination. The number of packets that reached the destination in HCESDSR is
greater than in ESDSR and DSR, as illustrated in Table 1. The energy consumption
per packet is lower in HCESDSR than in ESDSR and DSR, as illustrated in Fig. 3.
Fig. 3. Energy Consumption per packet
Fig. 4. Number of Dead Nodes
Fig. 5. Number of Hop Count
The transmission range of the nodes is varied and the simulation runs for 50 nodes.
The network lifetime can be predicted from the number of dead nodes formed. A dead
node is a node whose energy is completely depleted. When the simulation area is
small, the number of dead nodes is low, but it grows as the area increases, as shown
in Fig. 4. The average number of hops between a particular source and destination is
given in Fig. 5. The delay in transmitting packets is reduced in the proposed system
when compared to ESDSR.
6 Conclusion
In this paper, an efficient mechanism for increasing the lifetime of a MANET is
proposed. The HCESDSR protocol reduces the delay caused by hop count. Energy is
conserved and can be improved in a future enhancement. Here, the number of dead
nodes is directly proportional to the transmission area, and this will be addressed in a
future enhancement.
References
1. Zhang, B., Hussein, T.: Adaptive Energy-Aware Routing Protocols for Wireless Ad Hoc
Networks. In: Proceedings of the First International Conference on Quality of Service in
Heterogeneous Wired/Wireless Networks, pp. 252–259 (2004)
2. Broch, J., Johnson, D.B., Maltz, D.A.: The Dynamic Source Routing Protocol for Mobile
Ad Hoc Networks. IETF Internet-Draft, draft-ietf-manet-dsr-00.txt (1998)
3. Misra, A., Banerjee, S.: MRPC: Maximizing network lifetime for reliable routing in
wireless environments. In: Proceedings of IEEE Wireless Communication and Networking
Conference, vol. 2, pp. 800–806 (2002)
4. Tarique, M., Kemal, E.T., Naserian, M.: Energy saving dynamic source routing for ad hoc
wireless networks. In: Proceedings of Modeling and Optimization in Mobile, Ad Hoc and
Wireless Networks, pp. 305–310 (2005)
5. Narayanaswamy, S., Kawadia, V., Srinivas, R.S., Kumar, P.R.: Power control in ad hoc
networks: theory, architecture, algorithm and implementation. In: Proceedings of European
Wireless Conference – Next Generation Wireless Networks, Technologies, Protocols,
Services and Applications, pp. 156–162 (2002)
6. Rajeswari, S., Venkataramani, Y.: An Adaptive Energy Efficient and Reliable Gossip
Routing Protocol for Mobile Ad Hoc Networks. International Journal of Computer Theory
and Engineering 2(5), 740–745 (2010)
7. Doshi, S., Brown, T.X.: An on demand minimum energy routing protocol for a wireless ad
hoc network. Proceedings of ACM SIGMOBILE Mobile Computing and Communication
Review 6(3), 50–66 (2002)
8. Tamilarasi, M., Chandramathi, S., Palanivelu, T.G.: Efficient energy management for
mobile ad hoc networks. Ubiquitous Computing and Communication Journal 3(5), 12–19
(2001)
9. Toh, C.K.: Maximum Battery Life Routing to Support Ubiquitous Mobile Computing in
wireless ad hoc networks. IEEE Communication Magazine 39(6), 138–147 (2001)
10. Woo, K., Yu, C., Hy, Y., Lee, B.: Non-Blocking, Localized Routing Algorithm for
Balanced Energy Consumption in Mobile Ad hoc Networks. In: Proceedings of
International Symposium on Modeling, Analysis and Simulation of Computer and
Telecommunication Systems, pp. 117–124 (2001)
Alternate Data Clustering for Fast Pattern Matching
in Stream Time Series Data
Abstract. Stream time series retrieval has been a major area of study due to its
vast application in various fields like weather forecasting, multimedia data
retrieval and huge data analysis. Presently, there is a demand for stream data
processing, high speed searching and quick response. In this paper, we use an
alternate data cluster or segment mean method for stream time series data,
where the data is pruned with a computational cost of O(log w). This approach
can be used for both static and dynamic stream data processing. The results
obtained are better than those of existing algorithms.
Keywords: Stream time series, Alternate data clustering, Fast pattern match,
Cluster mean.
1 Introduction
to analyze the patterns in stock data monitoring systems, etc. Previous work on
similarity search over archived time series is not applicable to stream time series,
since it cannot efficiently process frequent updates. Moreover, the existing work on
searching stream time-series data considers detecting a single static pattern over
multiple stream time series [1], or checking which pattern (from multiple static
patterns) is close to a single stream time series.
Different methodologies have been proposed for stream data processing in mining
stream time series, such as random sampling, sliding windows, histograms, and
multiresolution methods. In this paper, we use the sliding window
model to analyze stream data. The basic idea is that rather than running computations
on the entire data, we can make decisions based only on recent data. More formally,
at every time t, a new data element arrives. This element expires at t + w, where w is
the window size or length. The sliding window model is useful for stocks or sensor
networks, where only recent events may be important and reduces memory
requirements because only a small window of data is used. In contrast, our work
focuses on the detection of multiple static/dynamic patterns over multiple stream time
series.
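As a minimal illustration of this sliding window model (not the authors' implementation), each arriving element can be stamped with its arrival time and evicted once it expires at t + w:

from collections import deque

# Minimal sliding-window sketch: an element arriving at time t expires at t + w.
class SlidingWindow:
    def __init__(self, w):
        self.w = w
        self.items = deque()  # (arrival_time, value) pairs, oldest first

    def arrive(self, t, value):
        self.items.append((t, value))
        # Evict every element whose lifetime [arrival, arrival + w) has ended.
        while self.items and self.items[0][0] + self.w <= t:
            self.items.popleft()

    def values(self):
        return [v for _, v in self.items]

win = SlidingWindow(w=3)
for t, x in enumerate([5.0, 5.2, 5.1, 4.9, 5.3]):
    win.arrive(t, x)  # decisions use only the most recent w elements
print(win.values())   # -> [5.1, 4.9, 5.3]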
Contribution: In this paper, we present a new approach to the representation of
stream time series data, i.e., the alternate data cluster or segment mean of stream
time series data. Computational time is saved by selecting alternate patterns from the
stream time series and computing the mean cluster segment, which takes log w steps.
The progressively computed pattern representation is well suited to regular and
dynamic updating during stream processing. This data representation works well
under all Lp-norms.
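The paper does not spell out the computation, but one way to obtain the log w levels mentioned above, in the spirit of the multiscale segment means of [4], is to build segment means by repeated halving of the window; the sketch below illustrates this under that assumption:

# Illustrative sketch (assumption): build segment means by repeated halving,
# yielding log2(w) levels for a window of length w (w a power of two).
def segment_mean_levels(window):
    levels = [list(window)]          # level 0: the raw values
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each parent is the mean of two adjacent child segments.
        levels.append([(prev[i] + prev[i + 1]) / 2.0
                       for i in range(0, len(prev), 2)])
    return levels                    # len(levels) == log2(w) + 1

lv = segment_mean_levels([1.0, 3.0, 2.0, 4.0])
# lv == [[1.0, 3.0, 2.0, 4.0], [2.0, 3.0], [2.5]]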
2 Related Works
Agrawal et al. [1] proposed whole sequence matching, which finds sequences of the
same length that are similar to a query sequence using the Discrete Fourier Transform
(DFT). Later, Faloutsos et al. [2] extended this work to subsequence matching, using
an efficient indexing method to locate 1-dimensional subsequences within a collection
of sequences. In both works, Euclidean distance is used to measure the similarity
among sequences. Guttman [3] proposed a dynamic index structure for spatial
searching.
Lian et al. [4] proposed multiscale representations for fast pattern matching in
stream time series using the multiscale segment mean (MSM), which can be
incrementally computed and is well adapted to stream characteristics. Sethukkarasi
et al. [5] proposed a technique for similarity matching between static/dynamic
patterns and stream time-series image data to perform effective retrieval of image
data. Many techniques such as DFT, DWT [1], Piecewise Aggregate Approximation
(PAA) and MSM [4] are used for the dimensionality reduction problem.
D[X, Y] = Σi |xi − yi| (1)
4 Algorithms
5 Result Analysis
Experiments were performed over generated random data sets. The generated
synthetic data was applied to the DWT, MSM and ADCMP algorithms. We observed
that the average CPU time required for the existing DWT algorithm is 2.36 seconds
and for MSM is 1.904 seconds, whereas our ADCMP algorithm requires an average
CPU time of 0.282 seconds. Our ADCMP algorithm is about 80% more efficient than
the MSM and DWT algorithms, as shown in Figure 2.
6 Conclusion
References
1. Agrawal, R., Faloutsos, C., Swami, A.N.: Efficient Similarity Search in Sequence
Databases. In: Proc. Fourth Intl Conf. Foundations of Data Organization and Algorithms,
FODO (1993)
2. Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast Subsequence Matching in Time-
Series Databases. In: Proc. ACM SIGMOD (1994)
3. Guttman, A.: R-Trees: A Dynamic Index Structure for Spatial Searching. In: Proc. ACM
SIGMOD (1984)
4. Lian, X., Chen, L., Yu, J.X., Han, J., Ma, J.: Multiscale Representations for Fast Pattern
Matching in Stream Time Series. The IEEE TKDE 21(4), 568–581 (2009)
5. Sethukkarasi, R., Rajalakshmi, D., Kannan, A.: Efficient and Fast Pattern Matching in
Stream Time Series Image Data. In: Proc. 1st ICIIC (2010)
Representation of Smart Environments
Using Distributed P Systems
The concept of smart environments was first envisaged by Mark Weiser in the late
1990s, and it has evolved into a very challenging area of research. Smart environments
have their roots in two closely related paradigms: ambient intelligence and pervasive
computing [4]. The key to smart environments lies in capturing data live [5] from
the environment and in using smart devices enriched with intelligence and enhanced
functionality to increase the reasoning capabilities of the devices. Recognition,
Reasoning and Retrieval in smart environments have been dealt with in recent related
research works [2][3]. The challenge now is to 'Represent' these smart environments
effectively for efficient Reasoning.
Membrane Computing is a research area that aims to abstract computing ideas and
models from the structure and functioning of living cells. Membrane systems also
known as P systems, deal with distributed and parallel computing models, processing
multi-sets of symbol objects in a localized manner. Evolution rules and evolving
objects are encapsulated into compartments delimited by membranes. Objects are
communicated between the compartments and with the environment by means of
communication rules. Objects evolve by means of evolution rules which are localized
and associated with the regions of the membrane structure. In this paper, a variant of
the Distributed P System [6] is proposed and generated dynamically.
2 Proposed Model
The work proposed in this paper utilizes a variant of the existing Distributed P
System for effective modeling of smart environments. The environment is a network
of devices embedded with intelligence that continuously monitors the various
activities happening in the environment. The monitoring systems installed all around
the environment are assumed to be smart. Smart devices are the devices that not only
have the ability to capture data but also apply reasoning on the data being captured in
order to indicate the occurrence of specific events.
In order to model the environment a domain specific application of a smart home
for assisted living is considered. The following assumptions are made for modeling
the environment. The entire environment is represented as a Distributed P System and
the environment is divided into zones where each zone is a P System and a collection
of such P Systems form a Distributed P System. This is illustrated in Fig. 1a., where
four zones are identified in the environment which are represented as four P Systems
Π1, Π2, Π3 and Π4 respectively. The single occupant in the home is considered as a
mobile P System uniquely labeled as Πp. The embedded devices in the environment
that enable monitoring could range from video cameras and audio recorders to
microphones, sensors (light and smoke), controllers, LEDs, etc. Each of these devices
will form a membrane in the P System that represents a zone where the devices are
physically located. Fig. 1b. is an example of P System with membranes indicating the
presence of smart devices in a particular zone. Evolution rules are written to trigger
the alarm in each zone in case of an emergency.
Fig. 1a. Smart Home as Distributed P System
Fig. 1b. The P System representing a zone
A prototype that facilitates the design of smart spaces is being developed. Fig. 2
shows a snapshot of the prototype and its backend functionality.
The layout of the home is used as input in the prototype. Each room in the home is
considered as a zone. Let the smart devices fixed at each zone be video recorders,
audio recorders and an alarm system. Assume the smart devices capture information
in the form Dev_x = (lab, lev) where, ‘lab’ and ‘lev’ represents the label and level of
the activity captured by the device ‘x’. The activity for an audio system is assumed to
be the sound recorded and for the video system, it represents the current scenario
captured. The level for the audio system represents volume and for video system, it
represents description about the activity. Consider the device number of the alarm
system in all zones to be 1. Assume that there are three priority levels for an alarm
system. The same label or level can be assigned to different activities captured by
various devices and in different zones. To differentiate the labels or levels in each
device and in each zone, the label or level is denoted by labik or levik (where 'i'
represents the zone number in the home and 'k' represents the device number in the
zone).
The prototype collects the zone identity 'i', which represents the ith P System, and
all label-level pairs Dev_x = (lab, lev) from all devices in that zone. This forms the
input for Algorithm 1, which generates the rules for a P System. Algorithm 2
generates a Distributed P System from the P Systems.
The formal definition of proposed variant of Distributed P System with degree
(n≥1) has the following construct:
Δ = (O, Π1, . . ., Πn, Πp, S);
where,
1. O is an alphabet of objects;
2. Π1, . . ., Πn are P Systems containing m membranes with skin
membranes labeled with s1, . . ., sn, respectively where,
Πi = (Vi, µ i, wi1, ….., wim, Ei, Ri1, ……, Rim)
where,
a. Vi is an alphabet of objects, Vi ⊆ O;
b. µi is a membrane structure of the ith P System, which is of the form
[0 [1 ]1 [2 ]2 [3 ]3 ... [m ]m ]0;
c. wi1, ….., wim represents the multisets of objects available in each
membrane;
d. Ei ⊆ Vi represents the objects available in arbitrarily many copies in
the environment;
e. Ri1, ……, Rim represents the evolution and communication rules
used in each P System. The rules have the form a → v, where
a, v ∈ Vi.
3. Πp is a mobile P System that represents the single occupant in the
environment. This is an additional component proposed in this paper to
suit the domain specific application. Πp has the following construct:
Πp = (Vp, µ p, wp, Ep, Rp)
where,
a. Vp is an alphabet of objects, Vp ⊆ O;
b. µp is a membrane structure of the mobile P System with only the
skin membrane [0 ]0;
c. wp are strings representing the multisets over V associated with skin
region;
Algorithm 2 generates the Distributed P System. This algorithm takes as input all the
individual P Systems to form the Distributed P System. Assume i, j and k index the
P Systems. In line 1, O is set to the empty set. The for loop in lines 2-6 adds the
following:
• In line 3, Vi of each P System is added to the set O.
• In line 4, the P Systems are added to the Distributed P System Δ.
• In line 5, skin-to-skin communication rules are added to the set S.
Finally, lines 8 and 9 perform the same steps for the mobile P System.
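Read as code, Algorithm 2 amounts to the following sketch (the data structures and the skin_rules_for helper are illustrative, not from the paper):

# Sketch of Algorithm 2: union the object alphabets, collect the P Systems,
# and add skin-to-skin communication rules; repeat for the mobile P System.
def build_distributed_p_system(p_systems, mobile_p, skin_rules_for):
    O = set()                 # line 1: O starts as the empty set
    delta, S = [], []
    for pi in p_systems:      # lines 2-6
        O |= pi["V"]                      # line 3: add V_i to O
        delta.append(pi)                  # line 4: add Pi_i to Delta
        S.extend(skin_rules_for(pi))      # line 5: skin-to-skin rules into S
    O |= mobile_p["V"]        # lines 8-9: same steps for the mobile system
    delta.append(mobile_p)
    S.extend(skin_rules_for(mobile_p))
    return {"O": O, "systems": delta, "S": S}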
2.2 Illustration
Consider a smart home with 4 zones. Fig. 3a illustrates rule generation in P System i.
In this illustration, P System i contains an alarm system, a video camera and a voice
recorder, which form membranes 1, 2 and 3 respectively. Consider that the activity
recorded by the video camera has label 'a' and level '2', and that the activity recorded
by the voice recorder has label 'b' and level '4'. For this particular combination, an
alarm has to be raised with high alert, say '3'. The rule generated in each membrane is
shown in Fig. 3a.
Fig. 3a. A rule generated in the ith P System
Fig. 3b. Skin to skin communication rule
After an alarm is raised in one zone, the alarm systems in all other zones should be
triggered. To trigger the alarm systems in all other zones, skin-to-skin communication
rules are required, as illustrated in Fig. 3b.
3 Complexity Analysis
Consider 'Algorithm 1', for dynamic rule generation. Let m be the number of
membranes and n be the number of activities associated with each membrane. The for
loop is bounded by m−1 executions of the union function. The union operation has a
time complexity of O(mα(m+n, n)+n), where α is the functional inverse of
Ackermann's function [1]. Therefore, the time complexity of the loop is O(mn), and
hence the algorithm has a worst case time complexity of O(mn).
Consider 'Algorithm 2', for the formation of the Distributed P System. Let p be the
number of P Systems and m the number of membranes in each P System. Calculating
the complexity in the same manner, the algorithm has a worst case time complexity of
O(pm). Since neither algorithm takes exponential time, the algorithms are shown to
be efficient.
4 Conclusion
This paper focuses on the effective representation of the smart environment for
efficient real time response. A domain specific application for assisted living in smart
home has been proposed. Distributed P Systems are generally used in representing the
functioning of the system and thereby solving problems in a distributed manner.
Algorithms are designed for automatically representing smart spaces using
Distributed P System. Complexity analysis presented in this paper indicates that the
algorithms are efficient.
References
1. Blum, N.: On the single-operation worst-case time complexity of the disjoint set union
problem. In: Mehlhorn, K. (ed.) STACS 1985. LNCS, vol. 182, pp. 32–38. Springer,
Heidelberg (1984)
2. Menon, V., Jayaraman, B., Govindaraju, V.: Three R’s of Cyberphysical Spaces. IEEE
Computer Society 44(9), 73–79 (2011)
3. Menon, V., Jayaraman, B., Govindaraju, V.: Multimodal identification and tracking in smart
environments. Personal and Ubiquitous Computing 14(8), 685–694 (2010)
4. Menon, V., Jayaraman, B., Govindaraju, V.: Biometrics Driven Smart Environments:
Abstract Framework and Evaluation. In: Sandnes, F.E., Zhang, Y., Rong, C., Yang, L.T.,
Ma, J. (eds.) UIC 2008. LNCS, vol. 5061, pp. 75–89. Springer, Heidelberg (2008)
5. Menon, V., Jayaraman, B., Govindaraju, V.: Integrating recognition and reasoning in smart
environments. In: 4th International Conference on Intelligent Environments, pp. 1–8 (2008)
6. Paun, G., Perez-Jimenez, M.J.: Solving Problems in a Distributed way in Membrane
Computing: dP Systems. International Journal of Computers, Communication and
Control 2, 238–250 (2010)
Low Leakage-Power SRAM Cell Design
Using CNTFETs at 32nm Technology
1 Introduction
state. The most effective technique for reducing dynamic power dissipation is supply
voltage scaling and to maintain performance, transistor threshold voltage also has to
be scaled proportionally [2]. This has an adverse effect on leakage power. Thus, in the
nanometer regime due to lower supply voltages, leakage power cannot be neglected.
To curtail this static power loss, several techniques have been proposed that
efficiently minimize leakage power dissipation [3-4].
Since the first CNTFET was reported in 1998, great progress has been made in all
areas of CNFET science and technology, including materials, devices, and circuits [5].
Carbon Nano Tubes (CNTs) are sheets of graphene rolled into a tube. A CNTFET is
the analogue of the silicon MOSFET in which Single Wall CNTs (SWCNTs) replace
the silicon channel. Depending on their chirality (i.e., the direction in which the
graphite sheet is rolled), SWCNTs can be either metallic or semiconducting. CNFETs
are molecular devices that avoid the most fundamental silicon transistor restrictions
and have ballistic or near-ballistic transport in their channel. Therefore, a
semiconducting CNT is appropriate for use as the channel of a FET. The voltage
applied to the gate can control the electrical conductance of the CNT by changing the
electron density in the channel [6].
The SRAM leakage power has also become a more significant component of total
chip power as a large portion of the total chip transistors directly comes from on-die
SRAM. The dominant leakage power component is the subthreshold leakage.
Effectively lowering power supply decreases all the leakage components. Since the
activity factor of a large on-die SRAM is relatively low, it is much more effective to
put in a power reduction mechanism dynamically, which modulates the power supply.
But when the SRAMs are required to retain data as the power supply is lowered, the
rail-to-rail voltage needs to be carefully controlled to maintain sufficient cell stability,
avoiding potential data loss. Especially in modern VLSI processor design, SRAM
accounts for a large portion of power consumption and area overhead. While seeking
solutions with higher integration, performance, stability, and lower
power, CNT has been presented for next-generation SRAM design as an alternative
material in recent years [7-11]. Hence, techniques are needed to reduce this leakage
power dissipation in the CNTFET based SRAM cell.
Leakage power dissipation arises from the leakage currents flowing through the
transistor when there are no input transitions and the transistor has reached steady state.
The leakage paths in a conventional 6T CNTFET SRAM cell are shown in Fig. 1.
There are two dominant sub-threshold leakage paths in a 6T SRAM cell: one from
VDD to ground inside the SRAM cell, called the cell leakage path, and a second from
the bit-line 'BL' (or bit-bar line 'BLbar') to ground through the pass transistor M5
(or M6), called the bitline leakage path [12].
and is in inactive or sleep mode when 'WL' is '0'. When the control signal 'WL' is
'0', 'Sleep' is set to '1' and 'SleepBar' to '0' so that the sleep transistors MS1 and
MS2 are OFF, thereby reducing the leakage currents and hence the leakage power.
These sleep transistors MS1 and MS2 can be shared among multiple SRAM cells
to amortize the overhead. To reduce the impact on SRAM cell speed and to ensure
the stability of the SRAM, the sleep transistors must be carefully sized with respect
to the SRAM cell transistors they are gating. While these sleep transistors must be
made large enough to sink the current flowing through the SRAM cells during a
read/write operation in the active mode, too large a sleep transistor may reduce the
stacking effect, thereby diminishing the energy savings. Moreover, large transistors
also increase the area overhead.
Synopsys HSPICE is used for simulation to estimate delay and power consumption.
Simulations were performed with the Stanford CNTFET model at a 32nm feature
size with a supply voltage VDD of 0.9V [13]. The HSPICE Cscope is used for
Fig. 3. Leakage currents flowing through the 6T CNTFET SRAM cell with and without Sleep Transistors when Q='1'
Fig. 4. Leakage currents flowing through the 6T CNTFET SRAM cell with and without Sleep Transistors when Q='0'
5 Conclusion
CMOS technology at the nanometer scale faces great challenges due to sub-threshold
leakage power consumption. In this paper, a conventional 6T SRAM cell based on
CNTFETs is designed for leakage power reduction by applying the sleep transistor
technique. The results show that this design reduces leakage power significantly while
maintaining delay, with a minimal increase in area. The proposed cell can be used in
the design of CNTFET-based ultra-low-power SRAM memories.
References
1. International Technology Roadmap for Semiconductors by Semiconductor Industry
Association (2009), http://public.itrs.net
2. Mutoh, S., Douseki, T., Matsuya, Y., Aoki, T., Shigematsu, S., Yamada, J.: 1-V power
supply high-speed digital circuit technology with multi threshold-voltage CMOS. IEEE J.
Solid-State Circuits 30(8), 847–854 (1995)
3. Powell, M., Yang, S.H., Falsafi, B., Roy, K., Vijaykumar, T.N.: Gated-VDD: A Circuit
Technique to Reduce Leakage in Deep-submicron Cache Memories. In: International
Symposium on Low Power Electronics and Design, pp. 90–95 (2000)
4. Kim, K.K., Nan, H., Choi, K.: Power Gating for Ultra-Low Voltage Nanometer ICs. In:
IEEE ISCAS (2010)
5. Patil, N., Lin, A., Zhang, J., Wong, H.S.P., Mitra, S.: Digital VLSI logic technology using
Carbon Nanotube FETs: Frequently Asked Questions. In: 46th ACM-IEEE Design
Automation Conference, pp. 304–309 (2009)
6. Appenzeller, J.: Carbon Nanotubes for High-Performance Electronics—Progress and
Prospect. Proc. IEEE 96(2), 201–211 (2008)
7. Yu, Z., Chen, Y., Nan, H., Wang, W., Choi, K.: Design of Novel Low Power 6T CNFET
SRAM Cell Working in Sub-Threshold Region. In: IEEE International Conference on
Electro/Information Technology (2011)
8. Wang, W., Choi, K.: Novel curve fitting design method for carbon nanotube SRAM cell
optimization. In: IEEE International Conference on Electro/Information Technology, EIT
(2010)
9. Kureshi, A.K., Hasan, M.: Performance comparison of CNFET- and CMOS-based 6T
SRAM cell in deep submicron. Microelectronics Journal 40(6), 979–982 (2009)
10. Lin, S., Kim, Y.-B., Lombardi, F.: A New SRAM Cell Design Using CNTFETs. In:
International SoC Design Conference, vol. 1, pp. 168–171 (2008)
11. Lin, S., Kim, Y.B., Lombardi, F.: Design of a CNTFET-based SRAM cell by dual-
chirality selection. IEEE Transactions on Nanotechnology 9(1), 30–37 (2010)
12. Kim, C., Roy, K.: Dynamic VtSRAM: A Leakage Tolerant Cache Memory for Low
Voltage Microprocessors. In: Proceedings of the International Symposium on Low Power
Electronics and Design, USA, pp. 251–254 (2002)
13. Stanford University CNFET Model website,
http://nano.stanford.edu/model.php?id=23
Modified Low-Power Multiplier Architecture
1 Introduction
Power dissipation of VLSI chips was traditionally a neglected subject. In the past, the
device density and frequency were low enough that it was not a constraining factor in
chip design. As the scale of integration improves, more transistors, faster and smaller
than their predecessors, are being packed into a chip. This leads to steady growth of
the operating frequency and processing capacity per chip, resulting in increased power
dissipation [2].
2 Related Works
In a previous paper [1], modifications to the conventional architecture were proposed.
That work targeted a low power shift-and-add multiplier and showed that the BZ-FAD
architecture uses less area and power than the conventional architecture. Simulation
results show that the architecture lowers total switching activity by up to 76% and
power consumption by up to 30% when compared to the conventional
architecture [7].
advances in technology, many researchers have tried and are trying to design
multipliers which offer high speed, low power consumption, regularity of layout (and
hence less area), or a combination of these, making them suitable for various high
speed, low power, and compact VLSI implementations. Hence, in this paper we
design three different multiplier architectures:
1. Conventional Shift-and-Add Architecture.
2. Bypass Zero, Feed A Directly (BZ-FAD) Multiplier Architecture.
3. Modified Bypass Multiplier Architecture.
To obtain a low power architecture, the sources of switching activity have been
reduced. This involves the shifting of the B register, the switching activity of the
adder, the shifting of the PP register, etc. [4].
The figure above shows the modified version of the BZ-FAD architecture. In this
architecture we have eliminated some of the components of the previous one, which
further reduces the area and power consumption of the multiplier circuit [3]. A
bypass register and a mux are eliminated to reduce the shifting activity in the circuit.
Similarly, the D flip-flop and mux2 have been eliminated to reduce the delay. Instead,
a mux has been introduced to select the LSB between the feeder register and the
adder block.
The working of the given architecture is described as follows. When bit B(0) at mux1
equals one, the MSB of the adder register is loaded into the feeder register and the
LSB of the adder is loaded into the mux2 block; otherwise the content of the feeder
register itself is right-shifted by one bit and the LSB of the feeder is loaded into the
mux2 block. This value is finally stored in the latch. The MSBs of the final product
are obtained from the feeder register, whereas the LSBs are obtained from the latch.
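A behavioural sketch of the underlying shift-and-add computation with zero bypassing (software only; it models the power-saving idea, not the hardware datapath) is:

# Behavioural sketch of shift-and-add multiplication with zero bypassing:
# when the current bit of B is 0, the adder is bypassed and the partial
# product is only shifted, which is the source of the power saving.
def bypass_multiply(a, b, width=8):
    pp = 0
    for i in range(width):
        if (b >> i) & 1:          # B(i) == 1: feed A into the adder
            pp += a << i
        # B(i) == 0: adder bypassed; the shift is implicit in (a << i)
    return pp

assert bypass_multiply(13, 11) == 143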
In the previous case, the bits of the multiplicand B reach the main mux only after
passing through the D flip-flop and mux2, which increases the delay of the multiplier
circuit. The D flip-flop and mux2 of the previous architecture have been removed,
and a direct connection from mux1 to mux2 and the feeder register is provided here
to decrease the delay of the circuit. So, along with the power reduction, there is a
substantial decrease in area and delay as well.
Component                  BZ-FAD     Modified BZ-FAD
Partial Product register   0.4462     0.4462
Adder                      0.1256     0.0874
Multiplier                 5.686      -
Bypass Register            0.51314    -
Feeder Register            0.2514     5.572
Compared to the total power consumption of both the multiplexer and bypass register,
the power consumption of the feeder is very low. Even the adder in the modified
BZ-FAD multiplier consumes less power. So the total power consumption of the
modified BZ-FAD multiplier is lower.
References
1. Mottaghi-Dastjerdi, M., Afzali-Kusha, A., Pedram, M.: BZ-FAD: A Low-Power Low-Area
Multiplier Based on Shift-and-Add Architecture. IEEE Transactions on Very Large Scale
Integration (vlsi) Systems 17(2) (February 2009)
2. Yeap, G.: Motorola: Practical Low Power Digital VLSI Design. Kluwer Academic
Publishers
3. Kayyalha, M., Namaki-Shoushtari, M., Dorosti, H.: BZ-FAD A Low-Power Low-Area
Multiplier Based on Shift-and-Add Architecture
4. Marimuthu, C.N., Thangaraj, P., Ramesan, A.: Low Power Shift And Add Multiplier
Design. International Journal of Computer Science and Information Technology 2(3) (June
2010)
5. Chen, O.T., Wang, S., Wu, Y.-W.: Minimization of switching activities of partial products
for designing low-power multipliers. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 11
6. Chandrakasan, A., Brodersen, R.: Low-power CMOS digital design. IEEE J. Solid-State
Circuits 27(4)
7. Marimuthu, C.N., Thangaraj, P.: Low Power High Performance Multiplier. In: ICGST-
PDCS, vol. 8(1) (December 2008)
8. Chandrakasan, A.P., Sheng, S., Brodersen, R.W.: Low-Power CMOS Digital Design.
Journal of Solid State Circuits 27(4) (April 1992)
Clustering Methodologies and Their Implications
in Sensor Networks
Department of MCA,
R.V. College of Engineering
Mysore Road, Bangalore
{mhanaradhya72o,sumithraka}@gmail.com,
dharani_ap@yahoo.com, vijaysingh@rediffmail.com
Abstract. Currently many algorithms like LEACH, HEED and EECH are applied
to sensor networks to achieve a better network lifetime. But each of these
algorithms has some drawback in achieving an effective lifetime for a sensor
network. This paper examines existing algorithms and compares their simulated
results to identify an effective solution for increasing the lifetime of the sensor
network.
1 Introduction
Wireless ad hoc networks comprise a fast developing research area with a vast
spectrum of applications. Energy efficiency continues to be a key factor limiting the
deployability [1] of ad-hoc networks. Deploying an energy efficient system that
exploits the maximum lifetime of the network has remained a great challenge. The
lifetime of a wireless sensor network [2] is largely dependent on efficient utilization
of energy, so energy efficient protocols have a significant impact on the lifetime of
these wireless sensor networks.
2.1.1 ANDA
ANDA (Ad hoc Network Design Algorithm) [1] assigns the ordinary nodes to the
cluster heads such that energy is not drained from them easily and the lifetime of
the whole system increases drastically. A matrix is computed which lists the probable
lifetime of each cluster head when a particular node is assigned to it.
The ANDA algorithm basically comprises two algorithms: the covering algorithm,
which is applied to both the static and dynamic cases, and the reconfigure algorithm,
which applies only to the dynamic scenario.
Drawback: This algorithm assumes a fixed set of cluster heads which continuously
dissipate energy throughout the network's operating time.
Overcome: We came up with the idea of having a dynamic set of cluster heads,
thereby distributing the energy dissipation among the set of nodes for a better lifetime.
2.1.2 LID
The LID (Lowest ID) algorithm [1] defines which nodes will act as cluster heads
and determines the nodes that constitute each cluster. ANDA is then applied to
cover the nodes. A unique ID is assigned to each node in the network. The LID
algorithm chooses the node with the lowest ID as the cluster head and declares all
the nodes within the range of this cluster head as its members.
Drawback: It is difficult to choose the cluster head in a mobile network, because
the other nodes within range of a cluster head have to accept it as their cluster head;
if the cluster head keeps changing its position, some nodes may move out of that
range and new nodes may move in.
Overcome: Hence the need for dynamic cluster head selection.
2.1.3 LEAD
LEAD deals with the dynamic selection of cluster heads among the set of nodes in the
network and then allocates the ordinary nodes to the cluster heads dynamically. It
adapts itself to the network, and node selection and allocation are done according to
the current status of the network.
LEAD achieves three goals:
First, a set of cluster heads is selected randomly among the nodes, which is very
practical in wireless ad hoc networks, instead of having a fixed set of cluster heads.
Second, the set of cluster heads is re-selected dynamically after a time in a round
schedule, balancing the load (energy dissipation) across the nodes of the network and
thus increasing the lifetime.
Third, it dynamically allocates the nodes to the cluster heads using the enhanced
feature of ANDA, thereby reducing the load on each cluster head and making the
cluster heads sustainable for more rounds.
2.2.1 HEED
In order to avoid the random cluster head selection problem of the first method,
HEED (Hybrid Energy-Efficient Distributed clustering) [2] periodically selects
cluster heads according to a hybrid of their residual energy and a secondary
parameter, such as node proximity to its neighbors or node degree.
HEED has four primary goals:
2.2.2 LEACH
LEACH (Low-Energy Adaptive Clustering Hierarchy) [3] is a clustering protocol
that uses randomized rotation of cluster heads to balance energy consumption. The
principle of LEACH is to determine the cluster heads and clusters. A cluster head
accepts data from the other sensors in its cluster, aggregates the data, and then sends
it to the BS.
Advantage: It reduces energy consumption, uses minimum transmission power,
and nodes wake up only during their assigned TDMA slots, giving a longer network
lifetime and larger data capacity.
Drawback: It needs to know the number of neighbors (n) to calculate k, and the
energy level of all nodes.
Overcome: TDMA scheduling of the nodes and CDMA can be used to avoid
collisions.
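The randomized rotation in LEACH uses the well-known election threshold T(n) = P / (1 − P·(r mod 1/P)) for nodes that have not served as cluster head in the last 1/P rounds; a minimal sketch:

import random

# Standard LEACH threshold sketch: P is the desired cluster-head fraction;
# a node that has not been CH in the last 1/P rounds elects itself when a
# uniform random draw falls below T(n).
def leach_threshold(P, r):
    period = round(1.0 / P)          # 1/P rounds per rotation cycle
    return P / (1.0 - P * (r % period))

def elects_itself(P, r, was_ch_recently):
    if was_ch_recently:              # only nodes in G participate
        return False
    return random.random() < leach_threshold(P, r)

# e.g. P = 0.05 (the 5% used for Fig. 2), round 3:
print(leach_threshold(0.05, 3))      # ~0.0588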
2.3.2 PEGASIS
The PEGASIS (Power-Efficient Gathering in Sensor Information Systems) algorithm
is chain-based [5]; it uses a greedy algorithm to form a data chain. Each node
aggregates data from its downstream node and sends it to its upstream node along
the chain.
Advantage: Compared to LEACH, it eliminates the overhead of dynamic cluster
formation, so PEGASIS can save much energy.
Shortcoming: It can result in long distances between pairs of sensors along the
chain, which leads to higher energy consumption.
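The greedy chain formation can be sketched as follows (an illustration of the idea, not the exact simulator code): start from one node and repeatedly append the nearest unvisited node.

import math

# Greedy chain construction in the spirit of PEGASIS.
def greedy_chain(nodes):
    chain = [nodes[0]]
    remaining = set(range(1, len(nodes)))
    while remaining:
        last = chain[-1]
        nxt = min(remaining, key=lambda j: math.dist(last, nodes[j]))
        chain.append(nodes[nxt])
        remaining.remove(nxt)
    return chain

# The shortcoming quoted above shows up here: a greedy choice can leave two
# consecutive chain members far apart, costing extra transmit energy.
print(greedy_chain([(0, 0), (1, 0), (5, 5), (1, 1)]))
# -> [(0, 0), (1, 0), (1, 1), (5, 5)]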
Fig. 2 shows the network lifetime as a function of the number of nodes, for a
cluster head percentage P = 0.05.
The lifetime decreases as the number of nodes grows; however, for more than
100 nodes, the lifetime remains almost constant as the number of nodes increases.
The lifetime decreases because cluster heads have to cover more nodes as the
network size increases. But LEAD has a higher lifetime than ANDA for larger
numbers of nodes, hence we can say that LEAD performs better than ANDA. In
LEAD, however, the nodes are assigned randomly to cluster heads; to avoid this we
turn to LEACH, in which nodes are assigned to a cluster head based on the shortest
distance from the node and the energy level of the cluster head. LID has a major
problem, as described in the section above, so it is not considered in the comparison.
Fig. 3.1. Average residual energy for simulation radius 100 m
Fig. 3.2. Average residual energy for simulation radius 500 m
The figures above show the results obtained by simulating LEACH and HEED in a
WSN simulator, using an alkaline battery of capacity 2850 mAh, 100 nodes, a
simulation time of t = 50 s, and a given communication radius as parameters. The
X-axis is time and the Y-axis is power in mA. In Fig. 3.1, the energy saved by
LEACH is greater than that of HEED. This shows that LEACH consumes less energy
than HEED, and hence we can say that LEACH is more energy efficient than HEED.
But in Fig. 3.2 we can see that the average energy left in LEACH is less than in
HEED; this is due to the larger communication area, as this run was simulated with a
radius of 500 m. Hence LEACH can be used in smaller network areas to achieve
significant energy efficiency. EECH can give better performance than LEACH, but
the problem lies in calculating its optimal parameters.
4 Conclusions
The simulation results of the different algorithms were compared, and based on them
we can say that the LEACH algorithm performs better than the others and can
increase the lifetime of a sensor network by consuming less energy. It can therefore
be used efficiently to achieve a better sensor lifetime, and it can still be improved to
achieve good energy efficiency for larger network areas. LEACH is the better
clustering algorithm for sensor networks, among the different data collection
algorithms in wireless sensor networks, when the lifetime of the sensors in the
network is a major issue.
References
1. Mishra, S., Satpathy, S.M., Mishra, A.: Energy Efficiency in AD hoc Networks.
Proceedings of International Journal of Ad hoc, Sensor & Ubiquitous Computing (IJASUC)
INDIA 2(1) (March 2011)
2. Younis, O., Fahmy, S.: Distributed Clustering in Ad-hoc sensor Networks-A hybrid,
Energy-Efficient Approach. Department of Compute Science. Purdue University IN 47907-
2066,USA
3. Shen, L., Shi, X.: A Location Base Clustering Algorithm for Wireless Sensor Networks.
Proceedings in International Journal of Intelligent Control and Systems 13(3), 208–213
(2008)
4. Manjeshwar, A., Agrawal, D.P.: TEEN: A routing protocol for enhanced efficiency in
wireless sensor networks. In: Proceedings of the IPDPS Workshop on Issues in Wireless
Networks and Mobile Computing, San Francisco, CA, pp. 2009–2015 (April 2001)
5. Liu, Y., Ji, H., Yue, G.: An Energy-Efficient PEGASIS-Based Enhanced Algorithm in
Wireless Sensor Networks. DCN Lab, Beijing University of Posts and Telecommunications,
Beijing, China (2006)
6. Bandyopadhyay, S., Coyle, E.J.: Energy Efficient Hierarchical Clustering Algorithm for
Wireless Sensor Networks. School of Electrical and Computer Engineering Purdue
University West Lafayette, IN, USA
CCCDBA Based Implementation of Voltage Mode Third
Order Filters
Abstract. In this paper, implementations of third order low pass and high pass
filters are proposed using the current-controlled current differencing buffered
amplifier (CCCDBA). An effort has been made to simulate a third order
doubly-terminated LC ladder low pass filter and a third order doubly-terminated
LC ladder high pass filter using CCCDBAs. Each circuit utilizes more than one
CCCDBA and a few grounded capacitors. The designed circuits are very
suitable for integrated circuits and very easy to implement. The circuits'
performance is simulated in PSPICE, and the simulation results obtained are
comparable to the theoretical ones.
1 Introduction
Filters have found wide application in the fields of instrumentation, automatic control,
and communication. A basic approach to building higher-order filters is to emulate a
passive LC ladder filter, which possesses low sensitivities in the pass band.
Electronically tunable current mode ladder filters using the current controlled CDBA
are presented in this paper: the design and simulation of a third order doubly-
terminated LC ladder low pass filter and a third order doubly-terminated LC ladder
high pass filter.
2 Operation
The current differencing buffered amplifier (CDBA) is an active circuit building
block especially suitable for the realization of a class of continuous-time filters; its
bipolar-based realization has been introduced and used for the realization of active
and passive filters. It offers several advantageous features, such as:
a) high slew rate,
b) freedom from parasitic capacitances,
c) wide bandwidth, and
d) simple implementation.
[Figure: block symbol of the CCCDBA with input terminals P and N, output terminals W and Z, terminal voltages Vp, Vn, Vw, Vz, and currents Ip, In, Iz, Ia]
Since the proposed circuits are based on CCCDBAs, a brief review of the CCCDBA
is given here. The CCCDBA is a translinear-based current-controlled current
differencing buffered amplifier whose parasitic input resistances can be varied
electronically. Basically, the CCCDBA is a four-terminal active element. For ideal
operation, its current and voltage relations are described by equation (1)
[11],[12],[13]. From the circuit operation, the current-voltage characteristics of the
CCCDBA can be expressed by the following matrix.
[vp]   [0  0  Rx   0] [vz]
[vn] = [0  0  0   Rx] [iz]     (1)
[iz]   [0  0  1   -1] [ip]
[vw]   [1  0  0    0] [in]
[Figure: third order doubly terminated LC ladder low pass prototype with source resistance Rs, shunt capacitors C1 and C3, series inductor L2 carrying I2, and load resistance Rl driving Vo]
Figure-2 shows the block diagram of the leapfrog representation of the LC filter
[1], [8]. From Figure-2 it is obvious that the low pass filter can be constructed with
two lossy integrators and one lossless integrator. The CCCDBA implementation of
the low pass filter is shown in Figure-3, where
V1 = (Vin − V̂2) / (1 + sC1R) (2)

V̂2 = (R / sL2)(V1 − V3) (3)

Vo = V3 (4)

V3 = V̂2 / (1 + sC3R) (5)
The transfer function of the low pass filter is determined to be as given by
equation (6):

Vo(s)/Vi(s) = [1/(C³R²R1)] / [s³ + (2/(CR))s² + ((2R + R1)/(C²R²R1))s + 2/(C³R²R1)] (6)
In the above equation we have assumed that:
Rx1 = Rx2 = Rx3 = Rx4 = Rx5 = Rx7 = Rx8 = Rx9 = Rx11 = Rx12 = Rx13 = R1 and Rx6 = Rx10 =
Rx14 = R. Also, C1 = C2 = C3 = C.
From the above transfer function of equation (6) we obtain the expression for the
cut-off frequency as:

ωo = (2 / (C³R²R1))^(1/3) (7)
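Equation (7) can be checked numerically against the component values quoted in the conclusions (R = 975 Ω, R1 = 130 Ω, C = 1 nF); a short sketch:

import math

# Numerical check of Eq. (7): omega_o = (2 / (C^3 R^2 R1))^(1/3)
R, R1, C = 975.0, 130.0, 1e-9
omega_o = (2.0 / (C**3 * R**2 * R1)) ** (1.0 / 3.0)
f_o = omega_o / (2.0 * math.pi)
print(f"{f_o / 1e3:.1f} kHz")  # ~402.6 kHz, close to the reported 402.574 kHz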
[Figure: leapfrog block diagram with blocks 1/(1 + sC1R), R/sL2 and 1/(1 + sC3R) in cascade from Vi, with −1 feedback paths between adjacent blocks]
Fig. 3. Block Diagram of Third Order Leapfrog Filter (Low pass Section)
[Figure: five CCCDBAs (CCCDBA1–CCCDBA5) interconnected through resistances Rx1–Rx14 and grounded capacitors C1–C3, realizing the low pass ladder from Vin to Vo(LP)]
Fig. 4. Circuit Diagram of Third Order Doubly Terminated LC Ladder Leapfrog Low pass
Filter Using CCCDBA
V3 = V1 − V̂2 / (sC2R) (9)

V1 = V3 + V̂2 / (sC2R) (10)

V̂2 = V3 + (R / sL3)V3 (11)

Vo = V3 (12)
[Figure: third order doubly terminated LC ladder high pass prototype with source resistance Rs, series capacitor C2 carrying I2, shunt inductors L1 and L3, and load resistance Rl driving Vo]
With the help of the above four equations we can obtain the block diagram
corresponding to Figure-5, which is depicted in Figure-6 below.
Fig. 6. Block Diagram of Third Order Leapfrog Filter (High pass Section)
Vo(s)/Vi(s) = [s³/(C³R²R1)] / [s³ + (2/(CR))s² + ((2R + R1)/(C²R²R1))s + 2/(C³R²R1)] (13)
Fig. 7. Circuit Diagram of Third Order Doubly Terminated LC Ladder Leapfrog High pass
Filter Using CCCDBA
3 Simulation Results
Figure-8 depicts the SPICE simulation results of the proposed third order LC ladder
low pass filter circuit, which gives a cut-off frequency of 239.883 kHz. Figure-9
depicts the SPICE simulation results of the proposed third order LC ladder high pass
filter circuit, which gives a cut-off frequency of 164.059 kHz.
4 Conclusions
We observe that the realized low pass filter works in accordance with the theoretical
values for a Butterworth response with R = 975Ω and R1 = 130Ω, and equal
capacitors C = 1nF, which gives a cutoff frequency approximately equal to
402.574 kHz. We also observe that the realized high pass filter works in accordance
with the theoretical values for a Butterworth response with R = 780Ω and equal
capacitors C = 1nF, which gives a cutoff frequency approximately equal to
119.326 kHz. In this paper we have realized higher order current-mode continuous-
time filters using the current-controlled CDBA (CCCDBA), focusing on the
implementation and design of doubly terminated LC ladder leapfrog filters. As
examples, we have realized a third order low pass and a third order high pass filter.
References
1. Akerberg, D., Mossberg, K.: A Versatile Building Block: Current Differencing Buffered
Amplifier Suitable for Analog Signal Processing Filters. IEEE Trans. Circuit Syst. CAS-
21, 75–78 (1974)
2. Acar, C., Ozoguz, S.: Nth order current transfer function synthesis using current
differencing buffered amplifiers: signal flow graph approach. Microelectronics
Journal 31(1), 49–53 (2000)
3. Frey, D.R.: Log-domain filtering: an approach to current-mode filtering. IEE
Proceedings, Pt. G 140, 406–416 (1993)
4. Jaikala, W., Sooksood, K., Montree, S.: Current-Controlled CDBA’s (CCCDBA’s) based
Novel current-mode universal biquadratic filter. IEEE Trans., ISCAS 2006, 3806–3809
(2006)
5. Maheshwari, S., Khan, I.A.: Current controlled current differencing buffered amplifier:
implementation and applications. Active and Passive Electronics Components 27(4), 219–
222 (2004)
6. Maheshwari, S.: Voltage-Mode All-Pass filters including minimum component count cir-
cuits. Active and Passive Electronic Components 2007, 1–5 (2007)
7. Pisitchalermpong, S., Prasertsom, D., Piyatat, T., Tangsrirat, W., Surakampontorn, W.:
Current tunable quadrature oscillator using only CCCDBAs and grounded capacitor. In:
ECTI-con 2007 the 2007 ECTI International Conference, pp. 32–35 (2007)
8. Tangsrirat, W., Surakampontorn, W., Fujii, N.: Realization of Leapfrog Filters Using
Current Differencing Buffered Amplifiers. IEICE Trans. Fundamentals E86-A(2), 318–326
(2002)
9. Tangsrirat, W., Surakampontorn, W.: Electronically tunable floating inductance simulation
based on Current-Controlled Current Differencing Buffered Amplifiers. Thammasat Int. J.
Sc. Tech. 11(1), 60–65 (2006)
10. Tangsrirat, W., Surakampontorn, W.: Realization of multiple-output biquadratic filters
using current differencing buffered amplifiers. International Journal of Electronics 92(6),
313–325 (1993)
11. Toker, A., Ozouguz, S., Acar, C.: Current-mode KHN- equivalent biquad using CDBAs.
Electronics Letters 35(20), 1682–1683 (1999)
12. Tangsrirat, W.: Novel minimum-component universal filter and quadrature oscillator with
electronic tuning property based on CCCDBAs. Indian Journal of Pure and Applied
Physics 47, 815–822 (2009)
13. Tangsrirat, W., Surakampontorn, W.: Electronically tunable quadrature oscillator using
current controlled current differencing buffered amplifiers. Journal of Active and Passive
Electronic Devices 4, 163–174 (2009)
An Effective Approach to Build Optimal T-way
Interaction Test Suites over Cloud Using Particle Swarm
Optimization
1 Introduction
Software testing is an expensive and time consuming activity that is often restricted
by limited project budgets. The National Institute of Standards and Technology
(NIST) reports that software defects cost the U.S. economy close to $60 billion a
year [1], and suggests that approximately $22 billion could be saved through more
effective testing. There is a need for advanced software testing techniques that offer a
solid cost-benefit ratio in identifying defects. Interaction testing is one such method.
Interaction testing, or combinatorial testing, implements a model based testing
approach using combinatorial design: it creates test suites by selecting values for
input parameters and by combining these parameter values. This testing method has
been applied in numerous domains such as medical devices, browsers, and
servers [2] [3].
Combinatorial testing can detect hard-to-find software faults more efficiently than
manual test case selection methods. It can be categorized into two types: pairwise
testing and t-way testing.
2 Related Work
T-way testing is a very promising technique for generating test data in software
quality assurance because it provides effective error detection at low cost. There are
three main types of algorithms for constructing combinatorial test suites: algebraic,
computational and heuristic search algorithms [4][5]. A comparison chart is shown in
Table 1. The test data generation process for multi-way testing can be fully
automated; several tools that automate the production of complete test cases covering
up to 6-way combinations are summarized in Table 2. Combinatorial testing is a
practical software testing approach that can detect faults triggered by single factors in
software and even by interactions among them. Existing combinatorial testing
algorithms are summarized in Table 3.
2. Algebraic: They do not enumerate any combinations, hence less expensive.
Computational: It is expensive due to the need to consider explicit enumeration from
all the combination space. Heuristic: Produces smaller test suites than the
computational approach.
3. Algebraic: Algebraic approaches can be extremely fast. Computational: Time
required is more than the algebraic approach. Heuristic: Time required is more than
the computational approach.
4. Algebraic: Imposes serious restrictions on the system configurations to which it
can be applied. Computational: It can be applied to arbitrary system configurations.
Heuristic: It can be applied to arbitrary system configurations.
5. Algebraic: Test prioritization and constraint handling can be more difficult for
algebraic approaches. Computational: It constructs tests in a locally optimized
manner; thus, the size of the test sets generated may not be minimal. Heuristic: Easy
to adapt computational approaches for test prioritization and constraint handling.
6. Algebraic: Deterministic approach, e.g. covering arrays and orthogonal arrays.
Computational: Can be either deterministic or non-deterministic, e.g. AETG, IPO.
Heuristic: Can be either deterministic or non-deterministic, e.g. simulated annealing,
hill-climbing.
3 Research Objectives
Based upon the literature review, we found that one of the key issues of combinatorial
testing is the combinatorial explosion problem (i.e., too many combinations to
consider), which can be addressed through parallelization. Many combinatorial testing
techniques have been proposed which mainly focus on minimization of the resulting
test sets with balanced time and space requirements [6] [7] [16], removal of unwanted
control and data dependencies [8] [9], and pairwise testing with efficient data
structures for storing and searching pairs [10], but none of these techniques has yet
been ported to a cloud environment, which could further reduce time and cost. Due to
resource constraints, it is nearly always impossible to exhaustively test all
combinations of parameter values. This problem can be addressed through
parallelization, which can be an effective approach to manage the computational cost
of finding a good test set that covers all the combinations for a given interaction
strength (t). In this paper we propose a strategy to build optimal t-way interaction test
suites using artificial life techniques like particle swarm optimization that can be
executed in the cloud environment for further reduction in cost and time. The benefits
of using the cloud as an execution platform can be listed as: I) Computing clouds are
huge aggregates of various grids (academic, commercial), computing clusters and
supercomputers. They are used by a huge number of people either as users
(300 million users of Microsoft's Live) or developers (330,000 application developers
of Amazon EC2). II) Cloud computing has the strength to tackle vast amounts of data
coming not only from the web but also from a rising number of instruments and
sensors, as it draws on many existing technologies and architectures and integrates
centralized, distributed and 'software as a service' computing paradigms into an
orchestrated whole [21]. III) The emergence of the computing cloud will invigorate
academic research and has strong potential to spawn innovative collaboration
methods and new behaviors, e.g. SETI@home and FOLDING@home.
The manipulation of the particles around the search space is governed by velocity-update and position-update rules. The particles are manipulated according to the following equation [13]:

$V_{j,d}(t) = w\,V_{j,d}(t-1) + c\,r_{j,d}\,\big(pBest_{j,d}(t-1) - X_{j,d}(t-1)\big) + c'\,r'_{j,d}\,\big(lBest_{j,d}(t-1) - X_{j,d}(t-1)\big)$ (1)

where t is the iteration number (time), d is the dimension, j is the particle index, w is the inertia weight, r and r' are two random factors (random real numbers between 0 and 1), and c, c' are acceleration coefficients that adjust the weighting between the components.
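As a concrete illustration of equation (1), a minimal Python sketch of one velocity-and-position update follows; the parameter values and the list-based particle representation are our own illustrative choices, not the authors' implementation.

import random

def update_velocity(v, x, pbest, lbest, w=0.7, c=1.5, c_prime=1.5):
    """One application of Eq. (1) for a single particle, dimension by dimension."""
    new_v = []
    for d in range(len(v)):
        r, r_prime = random.random(), random.random()   # r, r' drawn from [0, 1]
        cognitive = c * r * (pbest[d] - x[d])            # pull toward the particle's own best
        social = c_prime * r_prime * (lbest[d] - x[d])   # pull toward the neighborhood best
        new_v.append(w * v[d] + cognitive + social)
    return new_v

def update_position(x, v):
    # Companion position rule commonly paired with Eq. (1): x(t) = x(t-1) + v(t).
    return [xi + vi for xi, vi in zip(x, v)]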
5 Proposed Strategy
Figure 2 shows the main classes of the tool that we have proposed in order to generate optimal t-way test suites using particle swarm optimization. Class Starter contains the main program and starts the execution of the operations for building the optimized test suite. Class PSuiteOptimizer, which is called by class Starter, contains the algorithm based upon particle swarm optimization that generates the optimized test suite. Class Repository contains the version of the test suite that is considered optimal. In order to obtain real parallelism between threads, the instances of PSuiteOptimizer will be executed on different machines. For this we use the MapReduce framework [14], which gives support for automatically distributing an application. MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. The generated minimal t-way test suites execute simultaneously on different machines in the cloud environment. A cloud will be set up to implement and validate the proposed model.
In this paper, we have proposed and illustrated an effective approach to build optimal t-way interaction test suites over the cloud, using the particle swarm optimization technique for software test case generation to obtain a near-optimal solution. Our approach supports interaction strengths up to t = 6, consistent with the requirements described by Kuhn et al. [15]. Concerning future work, we are now looking to compare the performance of our approach, particularly in terms of test suite size, with other strategies such as IPOG with its tool FireEye, WHITCH, Jenny, TConfig, and TVG.
References
1. National Institute of Standards and Technology: The Economic Impacts of Inadequate Infrastructure for Software Testing. U.S. Department of Commerce (May 2002)
2. Berling, T., Runeson, P.: Efficient Evaluation of Multifactor Dependent System Performance Using Fractional Factorial Design. IEEE Transactions on Software Engineering 29(9), 769–781 (2003)
Construct Fuzzy Decision Trees Based on Roughness Measures
Keywords: Fuzzy set, Rough set, Fuzzy decision tree, Accuracy measure,
Roughness measure.
1 Introduction
Decision Tree Induction (DTI), one of the data mining classification methods, is used in this research for predictive problem solving in analyzing patient medical track records. In this paper, we extend the concept of DTI to deal with meaningful fuzzy labels, in order to express human knowledge for mining fuzzy association rules using the fuzzy rough technique [1]. The theories of fuzzy sets and rough sets are generalizations of classical set theory for modeling vagueness and uncertainty [2], [3]. Rough sets are the result of approximating crisp sets using equivalence classes; in other words, in the traditional rough set approach the values of attributes are assumed to be nominal data, i.e., symbols. Fuzzy set theory deals with the ill definition of the boundary of a class through a continuous generalization of set characteristic functions. In many applications, however, the attribute values can be linguistic terms (i.e., fuzzy sets); for example, the attribute "height" may be given the values "high", "mid", and "low". The traditional rough set approach would treat these linguistic values simply as distinct symbols.
Fuzzy decision trees can process data expressed with symbolic values, numerical values and fuzzy terms. The apparent advantage of fuzzy decision trees is that they use the same routines as symbolic decision trees but with a fuzzy representation [8]. This allows the same comprehensible tree structure to be used for knowledge understanding and verification. Rough sets can reduce the fuzzy attributes and lead to simpler computation than fuzzy sets alone. The fuzzification of rough sets quantifies two kinds of uncertainty simultaneously [9]: vagueness, through the fuzzification of real-valued attributes, and ambiguity, through rough approximations [10]. The next subsections summarize some heuristics from related work that use fuzzy and rough concepts to generate FDTs.
One approach uses the Jaccard similarity measure to select the expanded attribute, combining each sub-attribute of every attribute with each sub-class. Aggregation operators such as MIN, MAX, OWA, and WA are then used to aggregate from depth 0 to depth 1, and the combination is repeated until the decision tree is complete. Its unattractive aspects are that more than one tree is generated at each level, to be aggregated with others at deeper levels, which leads to the construction of many trees, and that pruning algorithms are needed to eliminate some of the generated trees [6].
Another heuristic expands a partition whenever the dependency degree value of the next partition is greater than that of the previous partition, which leads to a large tree size [12].
The main objective of this paper is to enhance the construction of fuzzy decision trees by integrating the theories of fuzzy sets and rough sets, an integration that has been applied successfully in many fields such as machine learning, pattern recognition and image processing. This integration leads to efficient rule generation and smaller decision trees that can deal with fuzzy or crisp data sets; it tries to improve the accuracy and also to overcome the drawbacks of both fuzzy decision trees and rough (crisp) decision trees. The drawbacks of rough decision trees are that they deal only with data in classical or crisp form and cannot effectively deal with fuzzy initial data, e.g., linguistic terms. On the other hand, the drawbacks of fuzzy decision trees are that, from a computational point of view, increased tree size brings increased computational complexity, and that pruning is needed to decrease the size of large trees. Using rough set theory is a good way to determine relations within a data sample, which may lead to determining the core of the data via its measures, such as the roughness degree [16].
Consider a non-leaf node S consisting of n attributes, S = {F_1, ..., F_n}. For each 1 ≤ k ≤ n, the attribute F_k takes m_k values of fuzzy subsets, F_k = {F_k^1, ..., F_k^{m_k}}; the fuzzy classification is FC = {FC_1, ..., FC_{m_c}}, and the universe of discourse is U = {X_1, ..., X_N}, as shown in Table 1.
Table 1. Fuzzy information system

U   | F_1: F_1^1 ... F_1^{m_1} | ... | F_k: F_k^1 ... F_k^{m_k} | FClass: FC_1 ... FC_{m_c}
X_1 | a_{11}^1 ... a_{1m_1}^1  | ... | a_{k1}^1 ... a_{km_k}^1  | a_{C1}^1 ... a_{Cm_c}^1
... | ...                      | ... | ...                      | ...
X_N | a_{11}^N ... a_{1m_1}^N  | ... | a_{k1}^N ... a_{km_k}^N  | a_{C1}^N ... a_{Cm_c}^N
Our paper introduces a modified roughness measure formula; previously, the roughness measure was defined only in rough set theory. The next section presents a new formulation of the roughness measure in fuzzy rough set theory, together with the roughness measure of a fuzzy partition.
$\alpha_R(x) = \frac{\underline{R}(x)}{\overline{R}(x)}$ (1)

$\psi_R(x) = 1 - \alpha_R(x)$ (2)
Definition 3. A fuzzy rough set is a tuple of the lower approximation membership degree and the upper approximation membership degree of class l through F_k^j, both defined in Eq. (3) and Eq. (4) as [26]:

$\underline{\mu}_l(F_k^j) = \inf_{\forall i \in U} \max\{\,1 - \mu_{F_k^j}(x_i),\ \mu_l(y_i)\,\}$ (3)

$\overline{\mu}_l(F_k^j) = \sup_{\forall i \in U} \min\{\,\mu_{F_k^j}(x_i),\ \mu_l(y_i)\,\}$ (4)

$\alpha_l(F_k^j) = \frac{\underline{\mu}_l(F_k^j)}{\overline{\mu}_l(F_k^j)}$ (5)

$\psi_l(F_k^j) = 1 - \alpha_l(F_k^j)$ (6)
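A direct transcription of Eqs. (3)-(6) into Python may help fix the notation; passing the membership degrees as plain lists is our simplification.

def roughness(mu_F, mu_class):
    """Fuzzy-rough roughness of one class seen through a fuzzy subset F_k^j.
    mu_F[i] and mu_class[i] are the membership degrees of object i."""
    lower = min(max(1.0 - f, c) for f, c in zip(mu_F, mu_class))   # Eq. (3)
    upper = max(min(f, c) for f, c in zip(mu_F, mu_class))         # Eq. (4)
    alpha = lower / upper if upper > 0 else 0.0                    # Eq. (5)
    return 1.0 - alpha                                             # Eq. (6)

# Example: roughness([0.9, 0.4, 0.1], [1.0, 0.5, 0.0])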
The term $M(F_k^j) / \sum_{j=1}^{m_k} M(F_k^j)$ is the weight representing the relative size of the subset $F_k^j$ in $F_k$. The attribute with the minimum value of the roughness measure ψ(F_k) is selected as the root of the fuzzy decision tree, i.e., Root = min_k ψ(F_k) ∈ S_root, 1 ≤ k ≤ n.
$\beta_{calc} = \rho(F_i^k, F_i^C) = \frac{M\big(\mu_{F_i^k}(u) \cap \mu_{F_i^C}(u)\big)}{M\big(\mu_{F_i^k}(u)\big)}$, for the i of k in the same branch. (8)
One of the criteria to stop the growth of a branch in the tree is based on a user-defined threshold β_defined for leaf selection: each branch undergoes a leaf selection test, which is the calculation of β_calc. If the truth level of classifying into one class is greater than β_defined, the branch is terminated as a leaf. The selection of β_defined depends on the problem to be solved.
Definition 7. The roughness measure of a fuzzy partition, ψ(F^v | F_j^k), is the weighted average of the roughness measures of the fuzzy partition F^v ∈ F_P = S − S_root on the fuzzy evidence F_j^k, defined as:

$\psi(F^v \mid F_j^k) = \sum_{l=1}^{m_v} \frac{M(F_l^v \cap F_j^k)}{\sum_{j=1}^{m_k} M(F_l^v \cap F_j^k)}\, \psi_l(F_l^v \mid F_j^k)$ (9)
$\underline{\mu}_l(F^v \cap F_j^k) = \inf_{\forall i \in U} \max\{\,1 - \big(\mu_{F^v}(x_i) \cap \mu_{F_j^k}(x_i)\big),\ \mu_l(y_i)\,\}$ (10)

$\overline{\mu}_l(F^v \cap F_j^k) = \sup_{\forall i \in U} \min\{\,\mu_{F^v}(x_i) \cap \mu_{F_j^k}(x_i),\ \mu_l(y_i)\,\}$ (11)
If ψ(F^v ∩ F_j^k) ≤ ψ(F_j^k) for some F^v ∈ F_P, where F_P = {F^1, F^2, ..., F^L} = S − S_root, then the attribute F^v with the smallest value of the roughness measure is selected as a new decision node of the F_j^k branch; otherwise the F_j^k branch is terminated as a leaf with the highest truth level β_calc.
Input: FIS
Output: Group of fuzzy classification rules extracted from the generated fuzzy decision tree T
Step 1: Measure the roughness associated with each attribute and select the attribute with the smallest value of the roughness measure ψ_l(F_k^j), computed via Eq. (6), as the root decision node; this yields the purest classification.
Step 2: Delete all empty branches of the decision node. For each nonempty branch of the decision node, calculate the truth level of classifying all objects within the branch into each class as a leaf.
  If the truth level of classifying into one class is above the given threshold β from Eq. (8),
  then terminate the branch as a leaf.
  Else, investigate whether an additional attribute will further partition the branch and further reduce the value of the roughness measure.
    If so, select the attribute with the smallest value of the objective function from Eq. (9) as a new decision node of the branch.
    Else, terminate this branch as a leaf; at the leaf, all objects are labeled with the one class with the highest truth level per Eq. (8).
Step 3: Repeat Step 2 for all newly generated decision nodes until no further growth is possible.
Step 4: Extract the fuzzy classification rules from the fuzzy decision tree T.
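The steps above can be condensed into a recursive sketch; this is our reading of the procedure, with crisp branch partitioning as a simplification of fuzzy branch membership, and with the roughness and truth-level computations (Eqs. (6)/(9) and Eq. (8)) supplied by the caller.

def build_tree(objs, attributes, beta_defined, roughness_of, truth_level):
    """objs: list of dicts mapping attribute -> value. roughness_of(a, objs)
    stands for Eq. (6)/(9); truth_level(objs) returns (beta_calc, label) per Eq. (8)."""
    best = min(attributes, key=lambda a: roughness_of(a, objs))   # Step 1
    node = {"attribute": best, "children": {}}
    rest = [a for a in attributes if a != best]
    for v in {o[best] for o in objs}:                             # nonempty branches only
        subset = [o for o in objs if o[best] == v]
        beta_calc, label = truth_level(subset)                    # Step 2, Eq. (8)
        no_gain = (not rest or
                   min(roughness_of(a, subset) for a in rest) >= roughness_of(best, subset))
        if beta_calc >= beta_defined or no_gain:
            node["children"][v] = ("leaf", label)                 # terminate as a leaf
        else:                                                     # Step 3: recurse
            node["children"][v] = build_tree(subset, rest, beta_defined,
                                             roughness_of, truth_level)
    return node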
4 Experiments
5 Conclusions
A new technique has been presented to construct fuzzy decision trees from both crisp and fuzzy datasets based on concepts from rough set theory. The main aims of the proposal are to simplify the computational procedures, which is achieved by using the simpler machinery of rough set theory, and to increase the accuracy of the rules (or to keep a high grade of accuracy); the process of extracting rules from huge datasets using FDTA, FID3 and FDTDD involves much more complex computational procedures than our proposed FDTR. Our proposed method takes both the fuzziness and the roughness existing in an information system into consideration; in addition, the numerical experimental results statistically verify the effectiveness of our proposed method. The trees produced by our technique are found to be smaller than those of other well-known algorithms, and consequently show better generalization (test) performance.
6 Future Work
References
1. Yao, Y.Y.: A comparative study of Fuzzy Sets and Rough Sets. Information
Sciences 109(1-4), 227–242 (1998)
2. Zhi, W.W., Mi, J.S., Zhang, W.X.: Generalized fuzzy Rough Sets. Information
Sciences 151, 263–282 (2003)
3. Wang, X.Z., Tsang, E.C.C., Zhao, S., Chen, D., Yeung, D.S.: Learning Fuzzy Rules from Fuzzy Samples based on Rough Set Technique. Information Sciences 177, 4493–4514 (2007)
4. Wang, T., Li, Z., Yan, Y., Chen, H.: A Survey of Fuzzy Decision Tree Classifier
Methodology. In: Fuzzy Information and Engineering (ICFIE). ASC, vol. 40, pp. 959–968
(2007)
5. Ning, Y., Tianruii, L., Jing, S.: Construction of Decision Trees based Entropy and Rough
Sets under Tolerance Relation. In: International Conference on Intelligent Systems and
Knowledge Engineering, ISKE (2007)
6. Huang, Z., Gedeon, T.D., Nikravesh, M.: Pattern Trees Induction: A New Machine
Learning Method. IEEE Transactions on Fuzzy Systems 16(4) (2008)
7. Wang, X.Z., Zhai, J.H., Lu, S.X.: Induction of Multiple Fuzzy Decision Trees based on
Rough Set Technique. Information Sciences 178, 3188–3202 (2008)
8. Wang, X.Z., Zhai, J.H., Zhang, S.F.: Fuzzy Decision Tree based on the Important Degree
of Fuzzy Attribute. In: Proceedings of the Seventh International Conference on Machine
Learning and Cybernetics, Kunming, pp. 511–516 (2008)
9. Ruying, S.: Data Mining Based on Fuzzy Rough Set Theory and Its Application in the
Glass Identification. Modern Applied Science 3(8) (2009)
10. Janikow, C.Z.: Fuzzy Decision Forest. In: NAFIPS 22nd International Conference of the
North American Fuzzy Information Processing Society, pp. 480–483 (2003)
11. Qing, H.H., Ming, Y.G., Da, R.Y.: Construct Rough Decision Forests based on
Sequentially Data Reduction. In: Proceedings of the Fifth International Conference on
Machine Learning and Cybernetics, pp. 2284–2289 (2006)
12. Rajen, B., Gopal, M.: FRCT: Fuzzy-Rough Classification Trees. Pattern Analysis &
Applications 11(1), 73–88 (2008)
13. Pawlak, Z., Skowron, A.: Rudiments of Rough Sets. Information Sciences 177, 3–27
(2007)
14. Zhai, J.: Fuzzy decision tree based on fuzzy-rough technique. Soft Computing 15(6),
1087–1096 (2011)
15. Xue, L., Xiao, H.Z., Dong, D.Z.: Four Matching Operators of Fuzzy Decision Tree
Induction. In: Proceedings of the Fourth International Conference on Fuzzy Systems and
Knowledge Discovery (FSKD 2007), vol. 4, pp. 674–678 (2007)
16. Liang, J., Wang, J., Qian, Y.: A New Measure of Uncertainty based on Knowledge
Granulation for Rough Sets. Information Sciences 179, 458–470 (2009)
17. Chen, Y.L., Hu, H.W., Tang, K.: Construct a Decision Tree from Data with Hierarchical
Class Label. Expert Systems with Applications 36, 4838–4847 (2009)
18. Hefny, H.A., Ahmed, S.G., Abdel Wahab, A.H., Elashiry, M.: Effective Method for
Extracting Rules from Fuzzy Decision Trees based on Ambiguity and Classifiability.
Universal Journal of Computer Science and Engineering Technology 1(1), 55–63 (2010)
19. Jensen, R., Shen, Q.: Fuzzy-Rough Feature Significance for Fuzzy Decision Trees. In:
Proceedings of the 2005 UK Workshop on Computational Intelligence, pp. 89–96 (2005)
20. Elashiry, M.A., Hefny, H.A., Abdel Wahab, A.H.: Fuzzy Rough Decision Tree based on
Rough Accuracy Measure and Fuzzy Rough Set. In: Proceedings of the 1st International
Conference on Advanced Computing and Communication (ICACC 2010), pp. 134–138.
Amal Jyothi College of Engineering, Kerala (2010)
21. Blake, C.L., Merz, C.J.: UCI Repository of machine learning databases,
http://archive.ics.uci.edu/ml/
22. Yuan, Y., Shaw, M.J.: Induction of fuzzy decision trees. Fuzzy Sets and Systems 69, 125–139 (1995)
23. Wang, X.Z., Gao, X.H.: A Research on the Relation Between Training Ambiguity and
Generalization Capability. In: International Conference on Machine Learning and
Cybernetics, pp. 2008–2013 (2006)
24. Wang, X.Z., Yan, J.H., Wang, R.: A Sample Selection Algorithm in Fuzzy Decision Tree
Induction and Its Theoretical Analyses. In: IEEE International Conference on Systems,
Man and Cybernetics, pp. 3621–3626 (2007)
25. Baowen, X., Zhou, Y., Lu, H.: An Improved Accuracy Measure for Rough Sets. Journal of
Computer and System Sciences 71(2), 163–173 (2005)
26. Elashiry, M.A., Hefny, H.A., Abdel Wahab, A.H.: Induction of Fuzzy Decision Trees
Based on Fuzzy Rough Set Techniques. In: Proceedings of the International Conference on
Computer Engineering and Systems (ICCES 2011), pp. 134–139. Ain Shams University,
Cairo (2011)
Design of Low Power Enhanced Fully Differential
Recyclic Folded Cascode OTA
Abstract. In the literature, the Recyclic Folded Cascode (RFC) and Improved RFC (IRFC) Operational Transconductance Amplifiers (OTAs) have been proposed for enhancing the DC gain and the Unity Gain Bandwidth (UGB) of the Folded Cascode (FC) OTA. In this paper, an enhanced RFC (ERFC) OTA which uses positive feedback at the cascode node is proposed for increasing the DC gain and CMRR without changing the UGB. For the purpose of comparison, the RFC, IRFC and ERFC OTAs are implemented in UMC 90nm technology in moderate inversion and studied through simulation. From the simulation, it is found that the DC gain of the ERFC OTA is higher by 6dB and 1dB compared to that of the RFC and IRFC OTAs respectively. The CM gain of the ERFC OTA is lower by 31dB and 34dB compared to that of the RFC and IRFC OTAs respectively, for the same power and area.
1 Introduction
High performance A/D converters and switched capacitor filters require Operational Transconductance Amplifiers (OTAs) that have both a high DC gain and a high unity gain bandwidth (UGB). The advent of deep sub-micron technologies enables increasingly high speed circuits. As the technology scales down, the intrinsic gain of the transistor decreases, which makes it difficult to design OTAs with high DC gain. In low voltage CMOS processes, the Folded Cascode (FC) amplifier is one of the most preferred architectures for both single stage and multi stage amplifiers (in the first stage) due to its high gain and reasonably large output signal swing. Moreover, the FC with a PMOS input pair is preferred over its NMOS counterpart due to its higher non-dominant poles, lower flicker noise, and lower input common mode range [1]. A number of techniques have been proposed in the literature to enhance the gain of the FC OTA. One of these techniques, presented in [2], [3], enhances the DC gain by providing an additional current path at the cascode node; this converts the current source into an active current mirror, which raises the output current above its quiescent value during slewing. Another technique, proposed in [4], enhances the DC gain and UGB by modifying the bias current sources of the conventional FC, which do not contribute to the DC gain; this recycling technique reuses the bias currents to boost the transconductance.
The bias current sources in the conventional FC [1] consume high current and have large transconductance. However, these current sources do not contribute to the DC gain. In [4], the input transistors of the FC are split into two parts (M1a, M1b, M2a, M2b) which conduct fixed and equal currents of Ib/2. Next, the current source transistors in the FC are replaced by current mirrors M3a:M3b and M4a:M4b with a ratio of K:1. This architecture is called the RFC OTA and is shown in Fig. 1.
2.1 DC Gain
The DC gain of the RFC OTA is

$A_{DC} = G_m R_{out}$ (1)

where $G_m$ is the transconductance and $R_{out}$ is the output impedance. The transconductance $G_m$ is given by

$G_m = I_{out} / V_{in}$ (2)

where the output current $I_{out}$ is given by (3).
From Fig. 1, it can be seen that transistor M2b and the diode-connected transistors M11 and M3b act as a common source amplifier with a voltage gain of approximately −1. Since the input applied to M2b is in the opposite direction, the node X+ (or X−) is in phase with Vi+ (or Vi−). Substituting the resulting branch currents into (3) gives (4), and substituting (4) into (3) gives the small-signal transconductance Gm in (5).
The output impedance Rout of the RFC OTA is given by (6). In Fig. 1, the input impedance ZC at the cascode node C of the RFC OTA is given by (8).
In [3], the cascode node of the FC OTA is modified and the current sources are replaced with an active inverting current mirror. The same approach is adopted for the RFC OTA at the cascode node. The modified half circuit of the RFC OTA is shown in Fig. 2. The active load of the conventional RFC OTA comprising (M7, M9) is modified into an active inverting current mirror comprising (M7, M9, M14, M16, and the inverters). A normal current mirror creates a copy of a current of equal magnitude and in the same direction. The inverting current mirror creates copies of any incremental currents that are equal in magnitude but opposite in direction. The inverting incremental currents for M7 and M9 can be obtained from M2a, and hence the inverters shown in Fig. 2 are not required. Therefore, M12, M14 and M16 are attached to the drain of M2a. The fully differential enhanced RFC OTA is shown in Fig. 3. The gain from the cascode node to the output node is given by (9), and thus the modified input impedance ZC at the cascode node is given by (10).
Design of Low Power Enhanced Fully Differential Recyclic Folded Cascode OTA 211
3.1 DC Gain
The DC gain and bandwidth expressions of the ERFC OTA are given by (11)–(14), where Cout denotes the equivalent load capacitance, which includes the external capacitance as well as all the parasitic junction capacitances associated with the output node.
4 gm/ID Methodology
The $g_m/I_D$ methodology [6] relates the small-signal parameter $g_m$ to the large-signal parameter $I_D$. There are three degrees of freedom in this methodology: the inversion coefficient, the drain current, and the channel length of the transistor. The $g_m$ of the OTA is determined by the unity gain bandwidth (UGB) and the load capacitance. The expression for $g_m$ is given by

$g_{m1a} = 2\pi \cdot UGB \cdot C_L$ (15)

which gives the transconductance of the input transistor M1a in Fig. 4, where $C_L$ is the load capacitance. The different regions of operation give different values of $g_m/I_D$: the $g_m/I_D$ value is high in weak inversion and decreases as we move from weak to strong inversion. Since the $g_m/I_D$ ratio does not depend on the gate width, the drain current $I_D$ achieving any prescribed bandwidth (for a particular UGB) can be derived from

$I_D = \frac{g_m}{(g_m/I_D)}$ (16)
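Worked numerically, Eqs. (15) and (16) amount to two lines of Python; the UGB target and the g_m/I_D value below are illustrative assumptions (only the 5.6 pF load is taken from Section 5).

import math

UGB = 50e6          # Hz, illustrative target unity gain bandwidth (assumed)
C_L = 5.6e-12       # F, load capacitance used later in Section 5
gm_over_id = 15.0   # 1/V, assumed transconductance efficiency in moderate inversion

gm = 2 * math.pi * UGB * C_L   # Eq. (15): required gm of input transistor M1a
I_D = gm / gm_over_id          # Eq. (16): drain current achieving that gm
print(f"gm = {gm * 1e6:.1f} uS, ID = {I_D * 1e6:.1f} uA")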
Transistors are traditionally biased in the saturation region in analogue circuits, but they can be operated in strong inversion, moderate inversion or weak inversion. Transistors biased in weak inversion provide higher transconductance and higher gain with a smaller current [6]. Diffusion current and drift current dominate in the weak inversion and strong inversion regions respectively. For short channel devices biased in strong inversion, velocity saturation and other small-geometry and high-field effects reduce the drain current. The effective gate to source voltage is

$V_{eff} = V_{GS} - V_T$ (17)

Moderate inversion offers high transconductance efficiency and low drain-source saturation voltages compared to strong inversion. Moreover, it results in smaller gate area and capacitances, which in turn result in higher bandwidth compared to weak inversion. Hence, as the technology scales down, it is better to operate in moderate inversion. A closed form equation for the drain current in moderate inversion is not available in the literature; an interpolated equation for this region is reported in [6]. In [6], the ranges of effective gate to source voltage (Veff, given by (17)) required for each region of operation are given by the inequalities (18), (19) and (20). In (18), the RHS is taken as −72mV by assuming the substrate factor (n) to be 1.4 and the thermal voltage to be 25.9mV at 300K.

Weak inversion: $V_{eff} \leq -72\,\mathrm{mV}$ (18)
Moderate inversion: $-72\,\mathrm{mV} < V_{eff} < 0.25\,\mathrm{V}$ (19)
Strong inversion: $V_{eff} \geq 0.25\,\mathrm{V}$ (20)
GmRFC depends only on the transconductance gm1a of the input transistor. Hence, using (7) and (15), GmRFC can be written as in (22). The unary drain current is the drain current through a transistor whose aspect ratio is 1; for the M3b transistor it is 187.835nA. A similar design procedure is followed for the other transistors [6].
5 Simulation Results
The ERFC OTA, and the RFC and IRFC OTAs reported in the literature [4][5], are simulated using the UMC 90nm CMOS process with a supply voltage of 1.2 volts. The load capacitance for all the OTAs is 5.6pF. For all three OTAs, the parameter K in the bias current source is assumed to be three. The OTAs discussed are implemented and simulated using the Cadence SPECTRE simulator. The area required is the same for all three OTAs, as the transistor widths of M5, M7, and M9 are divided into the pairs M5/M11, M7/M13, and M9/M15. It can be verified from Table 1 that the sizes of M5/M7/M9 are 2 times those of M5/M11/M7/M13/M9/M15. The improvement in common mode rejection ratio can also be seen in Fig. 5. Designing a CMFB circuit is difficult for the fully differential RFC [4], but that need is eliminated in the ERFC. The various parameters of the OTAs, such as DC gain, UGB, phase margin, CMRR, and slew rate, are given in Table 2.
From Fig. 4 and Table 2 the following observations may be made:
• The gain of the ERFC OTA is higher by 6dB and 1dB respectively compared to the conventional RFC and IRFC OTAs.
• The CM gain of the ERFC OTA is lower by 30dB and 34dB compared to that of the RFC and IRFC OTAs respectively. This also implies that the CMRR of the ERFC OTA is higher by up to 49dB compared to that of the RFC and IRFC OTAs.
Table 1 (fragment). Transistor sizes (W/L):
M3c/M4c: 12.1/0.35
M11: 53/0.35
M13: 4.41/0.35
M15: 12.23/0.35
6 Conclusion
The fully differential enhanced RFC OTA proposed in this paper, together with the RFC and IRFC OTAs reported in the literature, has been designed and simulated in UMC 90nm CMOS technology. The increase in the low frequency DC gain is achieved by a positive current feedback technique. This in turn results in a symmetric slew rate and a high common mode rejection ratio. The fully differential enhanced RFC OTA achieves a higher DC gain and a lower CM gain compared to the other two OTAs. The need for a CMFB circuit is also avoided.
References
1. Razavi, B.: Design of Analog CMOS Integrated Circuit. Tata McGraw Hill (2001)
2. Nakamura, K., Carley, L.R.: An enhanced fully differential folded cascode op-amp. IEEE
Journal of Solid-State Circuits 27, 563–568 (1992)
3. Richard Carley, L., Nakamura, K.: Fully differential operational amplifier having frequency
dependent impedance division. Patent Number 5,146,179
4. Assaad, R.S., Silva-Martinez, J.: The Recycling folded cascode: A general enhancement of
the folded cascode amplifier. IEEE Journal of Solid State Circuits 44(9) (September 2009)
5. Li, Y.L., Han, K.F., Tan, X., Yan, N., Min, H.: Transconductance enhancement method for
operational transconductance amplifiers. IET Electronics Letters 46(9) (September 2010)
6. Binkley, D.M.: Tradeoffs and Optimization in Analog CMOS Design. John Wiley & Sons, Ltd.
CCCDBA Based Implementation of Sixth Order Band
Pass Filter
Abstract. In the present paper, an implementation of a sixth order band pass filter is proposed using the current-controlled current differencing buffered amplifier (CCCDBA). In this work, an effort has been made to simulate the sixth order doubly-terminated LC ladder band pass filter using CCCDBAs. In each circuit, more than one CCCDBA and a few grounded capacitors have been utilized. The designed circuits are very suitable for integrated circuits and very easy to implement. The circuits' performance is simulated through PSPICE, and the simulated results obtained are comparable to the theoretical ones.
1 Introduction
a) high slew rate,
b) freedom from parasitic capacitances,
c) wide bandwidth, and
d) simple implementation.
Since the proposed circuits are based on CCCDBAs, a brief review of the CCCDBA is given here. The CCCDBA is a translinear-based current-controlled current differencing buffered amplifier whose parasitic input resistances can be varied electronically. Basically, the CCCDBA is a four-terminal active element. For ideal operation, its current and voltage relations are described by equation (1) [11],[12],[13].
Fig. 1. Block diagram of the CCCDBA, with terminals P (Vp, Ip), N (Vn, In), W (Vw) and Z (Vz, Iz)
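For orientation, the ideal CCCDBA port relations as commonly given in the CCCDBA literature (a reconstruction consistent with the description above, to be checked against [11],[12],[13] rather than taken as a verbatim quotation of equation (1)) are

$V_p = I_p R_p, \quad V_n = I_n R_n, \quad I_z = I_p - I_n, \quad V_w = V_z$

where $R_p$ and $R_n$ are the electronically tunable parasitic resistances at the P and N input terminals, and Z and W are the current output and voltage output terminals respectively.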
2 Operation
Fig. 2. Doubly terminated LC ladder low pass prototype (input Vin, shunt capacitors C1 and C3, series branch current I2, load Rl, output Vo)
The low pass prototype is converted to a band pass filter through the frequency transformation

$p = \frac{s^2 + \omega_0^2}{\omega_{3dB}\, s} = \frac{Q\,(s_n^2 + 1)}{s_n}$ (2)

where

$Q = \frac{\omega_0}{\omega_{3dB}} = \text{Quality Factor}$ (3)

$s_n = \frac{s}{\omega_0}$, with $\omega_0$ the center frequency (4)

Also,

$\omega_0 = (\omega_l \omega_u)^{1/2} \approx \frac{\omega_l + \omega_u}{2}$ for $\frac{\omega_0}{\omega_{3dB}} \gg 1$ (5)

$\omega_{3dB} = \omega_u - \omega_l$ (6)
From the above equations it is clear that each series inductor element should be replaced by a series combination of an inductor and a capacitor, whereas each shunt capacitor element is replaced by a parallel combination of an inductor and a capacitor. The resultant network obtained after applying the required frequency transformation is shown in Figure 3.
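Plugging the simulated values reported later in Section 3 into Eq. (3) gives the realized quality factor of the band pass response; a one-line numerical check:

f0, f3db = 45.186e3, 26.677e3   # Hz, center frequency and 3 dB bandwidth from Section 3
Q = f0 / f3db                   # Eq. (3); the w0/w3dB ratio equals the f0/f3dB ratio
print(f"Q = {Q:.3f}")           # approximately 1.69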
Fig. 3. Sixth order doubly terminated LC ladder band pass network obtained from the prototype (source resistance Rs; series L2-C2 branch; shunt L1-C1 and L3-C3 branches; load Rl; node voltages V1, V3; output Vo)
To obtain the circuit realization of the filter in the form of voltage adders and voltage integrators, mathematical calculations are carried out, and we arrive at the following results:
$V_1 = \frac{\frac{1}{C_1 R}\, s}{s^2 + \frac{1}{C_1 R}\, s + \frac{1}{L_1 C_1}}\,\big(V_{in} - \hat{V}_2\big)$ (7)

$V_2 = \frac{\frac{R}{L_2}\, s}{s^2 + \frac{1}{L_2 C_2}}\,\big(V_1 - \hat{V}_3\big)$ (8)

$V_3 = \frac{\frac{1}{C_3 R}\, s}{s^2 + \frac{1}{C_3 R}\, s + \frac{1}{L_3 C_3}}\,\hat{V}_2$ (9)

$V_O = V_3$ (10)
From equation (8) it is observed that we require a second order resonant section, which can be implemented by cascading two lossless integrators, as shown by the block diagram representation in Figure 4; the corresponding circuit realization of the resonant section is depicted in Figure 5. The transfer function of the resonant section is then calculated as equation (11). With the help of the four equations (7) to (10), we can obtain the block diagram of the network of Figure 3, which is depicted in Figure 6; the circuit realization is shown in Figure 7.
Fig. 7. Circuit Diagram of Sixth Order Doubly Terminated LC Ladder Leapfrog Band pass
Filter Using CCCDBA
The transfer function of the circuit shown in Figure-7 is given by the following
equation:
$\frac{V_o(s)}{V_{in}(s)} = \frac{\frac{1}{8}\cdot\frac{3\,s^3}{C^3 R^3}}{s^6 + \frac{4\,s^5}{C R} + \frac{5\,s^4}{C^2 R^2} + \frac{8\,s^3}{C^3 R^3} + \frac{7\,s^2}{C^4 R^4} + \frac{4\,s}{C^5 R^5} + \frac{1}{C^6 R^6}}$ (12)
In deriving the above transfer function we have assumed the following conditions: Rx4 = Rx12 = Rx14 = Rx16 = Rx17 = R and Rx3 = R/4; also, C3 = C4 = C, C2 = C5 = C/4, and C1 = C6 = 4C.
The expression for the cut-off frequency is calculated as follows:

$f_o = \frac{(4)^{1/6}}{2\pi C R}$ (13)
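Equation (13) can be cross-checked against the component values quoted in the Conclusion (R = 520 Ω, C = 4 nF):

import math

R, C = 520.0, 4e-9                         # values quoted in the Conclusion
fo = 4 ** (1 / 6) / (2 * math.pi * C * R)  # Eq. (13)
print(f"fo = {fo / 1e3:.3f} kHz")          # ~96.404 kHz, matching the reported 96.405 kHz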
3 Simulation Results
Figure 8 depicts the SPICE simulation results of the second order resonant section, while Figure 9 depicts the SPICE simulation results of the proposed sixth order LC ladder band pass filter circuit, which gives a center frequency of 45.186 kHz and a bandwidth of 26.677 kHz.
4 Conclusion
We observe that the realized filter works in accordance with the theoretical values for a Butterworth response: with R = 520 Ω and a capacitor value of C = 4 nF, the center frequency is approximately 96.405 kHz. While realizing the band pass filter we come across a second order resonant section, which was implemented by cascading two lossless integrators. Also, realizing the sixth order band pass filter requires two second order band pass sections, which are implemented using multiple feedback band pass filters.
References
1. Tangsrirat, W., Surakampontorn, W.: Realization of multiple-output biquadratic filters using current differencing buffered amplifier. International Journal of Electronics 92(6), 313–325 (1993)
2. Akerberg, D., Mossberg, K.: A Versatile Building Block: Current Differencing Buffered
Amplifier Suitable for Analog Signal Processing Filters. IEEE Trans. Circuit Syst.
CAS 21, 75–78 (1974)
Discovery of Cluster Patterns and Its Associated Data Simultaneously
Abstract. Automatically discovering patterns from databases yields highly useful information and is in great demand in science and engineering fields. Effective pattern mining methods such as pattern discovery and association rule mining have been developed and used in various applications. The existing methods, however, are often unable to uncover the truly useful information in the raw data: discovering a large volume of patterns is easy, but finding the relationships between the patterns and their associated data is very difficult, and further analyzing the patterns is also a complex task. In this paper, we present a new algorithm which generates closed frequent patterns and their associated data simultaneously. Here the relationship between the patterns and their associated data is made explicit. Experimental results are included.
1 Introduction
Pattern Discovery (PD) is a useful tool for categorical data analysis. The patterns produced are easy to understand; hence it is widely used in business and commercial applications. PD typically produces an overwhelming number of patterns, and the scope of each pattern is very difficult and time consuming to comprehend. There is no systematic and objective way of combining fragments of information from individual patterns to produce a more generalized form of information. Since there are too many patterns, it is difficult to use them to further explore or analyze the data. To address these problems in pattern discovery, we propose a new method that simultaneously clusters the discovered patterns and their associated data. One important property of the proposed method is that each pattern cluster is explicitly associated with a corresponding data cluster. To effectively cluster patterns and their associated data, several distance measures are used. Once a distance measure is defined, existing clustering methods can be used to cluster patterns and their associated data. After clusters are found, each of them can be further explored and analyzed individually.
2 Related Work
Many algorithms have been developed to find frequent patterns. The first proposed algorithm for association rule mining was the AIS algorithm [1], which used candidate generation to generate frequent itemsets; its main drawback is that it generates too many candidate itemsets. Next, the Apriori algorithm [2] was developed to find frequent patterns. It uses a breadth-first strategy to count the support of itemsets and uses a candidate generation function. Wong and Li [3] proposed a method that simultaneously clusters the discovered patterns and their associated data, in which the notion of pattern-induced data clusters is introduced. It relates patterns to the set of compound events containing them and makes the relation between patterns and their associated data explicit. The pattern-induced data clusters so defined are constant, that is, each attribute has only one value in the cluster. Since each pattern can induce a constant cluster, the number of constant clusters is overwhelming; to reduce this number, it is desirable to merge clusters. Let I(i) and I(j) be two clusters. The merged data cluster of I(i) and I(j) is the union of their matched samples and matched attributes. When two data clusters are merged, the corresponding patterns inducing them are simultaneously clustered. The authors used the hierarchical agglomerative approach to cluster the patterns, and the Discover*e algorithm to generate the patterns. The main drawbacks of that algorithm are its speed and the absence of pattern pruning. Rather than clustering all frequent itemsets, one can cluster closed frequent itemsets, which can be much fewer than all frequent itemsets. Our work does exactly that.
where dist(d(j), c_k) measures the Euclidean distance between a point d(j) and its cluster center c_k. The k-means algorithm calculates cluster centers iteratively as follows:
1. Initialize the centers c_k using random sampling;
2. Decide the membership of the points in one of the K clusters according to the minimum distance from the cluster center criterion;
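A minimal sketch of the iteration, including the center-recomputation step that the listing above implies, follows; the convergence test is omitted for brevity.

import math, random

def kmeans(points, k, iterations=100):
    """Minimal k-means over points given as tuples of coordinates."""
    centers = random.sample(points, k)                  # step 1: random initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:                                # step 2: nearest-center membership
            i = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):               # recompute each cluster center
            if cl:
                centers[i] = [sum(dim) / len(cl) for dim in zip(*cl)]
    return centers, clusters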
5 Experimental Results
We took the Iris database from the UCI Machine Learning Repository for finding frequent closed itemsets. It consists of 160 samples, 4 attributes and 3 classes (Setosa, Versicolor and Virginica). The classes Versicolor and Virginica are highly overlapped, while the class Setosa is linearly separable from the other two. The algorithms are implemented in the Java programming language. The proposed algorithm generates 56 closed itemsets when the minimum support is 2, with an average execution time of 0.011 secs, and 48 closed itemsets when the minimum support is 3, with an average execution time of 0.002 secs. The existing algorithm, Apriori, generates 90 frequent itemsets when the minimum support is 2, with an average execution time of 0.016 secs, and 85 frequent itemsets when the minimum support is 3, with an average execution time of 0.003 secs.
The comparison of CHARM and Apriori is shown in Fig. 1.
As the chart shows, the proposed algorithm outperforms Apriori. Apriori generates all the frequent itemsets, whereas the proposed algorithm produces only the closed frequent patterns, and its execution speed is faster than Apriori's.
6 Conclusion
This paper has proposed a method for clustering patterns and their associated data.
The effectiveness of the above divide-and-conquer approach lies in the proposed
clustering method. One important property of the proposed method is that each
pattern cluster is explicitly associated with a corresponding data cluster. To
effectively cluster patterns and their associated data, several distance measures are
used.
References
1. Agrawal, R., Srikant, R.: Algorithms for Mining Frequent Itemsets. In: Proc. of the ACM SIGMOD Conference on Management of Data (1993)
2. Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: Proc. 20th Int'l Conf. Very Large Data Bases (VLDB 1994), pp. 487–499 (1994)
3. Wong, A.K.C., Li, G.C.L.: Simultaneous pattern and data clustering for pattern cluster
analysis. IEEE Trans. Knowledge and Data Eng. 20(7), 911–923 (2008)
4. Pei, J., Han, J., Mao, R.: CLOSET: An Efficient Algorithm for Mining Frequent Closed
Itemsets. In: Proc. ACM SIGMOD Int’l Workshop Data Mining and Knowledge Discovery
(May 2000)
5. Pei, J., Han, J., Wang, J.: CLOSET+: Searching for the Best Strategies for Mining Frequent
Closed Itemsets. In: Proc. Ninth ACM SIGKDD Int’l Conf. Knowledge Discovery and Data
Mining (August 2003)
6. Zaki, M., Hsiao, C.: CHARM: An efficient algorithm for closed itemset mining. In: SDM
2002 (April 2002)
A Method to Improve the Performance of endairA
for MANETs
1 Introduction
A MANET forms a network in the absence of base stations. The only requirement of a MANET is mobile nodes that can interact with each other and route traffic using some routing protocol. Security in MANETs is an essential component for basic network functions like packet forwarding; otherwise, network operations can be easily jeopardized. Different secure routing protocols have been proposed for MANETs, each with its own strengths and weaknesses. endairA [6] is one of the secure on-demand routing protocols for MANETs. In this paper we propose a new approach to improve the performance of the endairA [6] protocol.
This paper is organized as follows. Related work is presented in Section 2. Section
3 describes the proposed method. Results are described in Section 4. Finally, section 5
concludes the paper.
2 Related Work
There has been considerable work on securing MANET routing protocols. ARAN [8] uses the previous node's signature to ensure the integrity of routing messages; it needs extensive signature generation and verification during the route request phase.
The Secure Routing Protocol (SRP) [8] [10] uses a secret symmetric key to establish a secure connection. Security-Aware Ad hoc Routing (SAR) [8] provides levels of security through a shared secret key; SAR may fail to find a route if some of the nodes in the path do not meet the security requirements. Ariadne [12] adds digital signatures to the route request, which are verified only at the destination. An attack on SRP and Ariadne is explained in [1].
endairA [6][10] signs the route reply message. A practical problem of the basic endairA [6] protocol is that each intermediate node verifies all the signatures in the route reply, which is too expensive in terms of power consumption and end-to-end delay. To overcome this drawback, a variant of endairA was proposed [6]. In the variant of endairA, an intermediate node verifies only the destination's signature, while the source of the route verifies all the signatures in the route reply. This may allow successful attacks, which are clearly explained in [6].
3 Proposed Approach
The source of the route generates and broadcasts a route request. Once the route request reaches the destination, the destination generates a route reply. The route reply contains the source identifier, the destination identifier and the accumulated route obtained from the route request. The destination calculates a signature on the above elements, appends it to the route reply and forwards the reply to the next node present in the route. Each intermediate node in the route receives the route reply and verifies that its identifier is in the node list, that the adjacent identifiers belong to neighboring nodes, and that the number of signatures in the reply is less than or equal to the number of nodes in the node list; it also validates two signatures in the route reply message, i.e., the two-hop neighbor's signature and the destination's signature. If all these verifications are successful, the intermediate node attaches its signature and forwards the route reply message to the next node present in the node list; otherwise the node drops the route reply.
When the source node receives the route reply, it verifies that the first identifier in the route reply belongs to a neighbor and validates all the signatures in the route reply. If all these verifications are successful, the source accepts the route; otherwise it rejects the route. A sketch of the intermediate-node checks is given below.
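The following Python sketch mirrors the intermediate-node acceptance test described above; the message layout, the "D" identifier for the destination, and the verify(key, msg, sig) helper are illustrative assumptions rather than the protocol's wire format.

def accept_route_reply(my_id, node_list, sigs, neighbors, keys, msg, verify):
    if my_id not in node_list:                      # my identifier must be in the node list
        return False
    i = node_list.index(my_id)
    adjacent = node_list[max(i - 1, 0):i] + node_list[i + 1:i + 2]
    if any(n not in neighbors for n in adjacent):   # adjacent ids must be neighbors
        return False
    if len(sigs) > len(node_list):                  # signature count bounded by node list
        return False
    two_hop = node_list[i + 2] if i + 2 < len(node_list) else "D"
    return (verify(keys["D"], msg, sigs.get("D")) and        # destination's signature
            verify(keys[two_hop], msg, sigs.get(two_hop)))   # two-hop neighbor's signature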
The proposed approach, Impro_endairA, overcomes the attack on the variant of endairA that is explained in [6]. Consider the configuration in Fig. 1 [6]. Assume that the source is S, the destination is D, and the route reply contains the route (A, B, Z, C). After receiving the route reply from C, the adversarial node V2* can send the following message to A in the name of B:
(rrep, S, D, id, (A,B,Z,C), (sigD, sigC, sigZ))
A will not accept this message, as it verifies sigC with Z's key, which will fail. So Impro_endairA overcomes the attack specified in [6] by dropping the route reply packet at node A itself.
Our results have shown that Impro_endairA outperforms the variant of endairA and endairA itself. The attacks which are possible on endairA are still possible on Impro_endairA. We need to perform thorough simulations with different parameters to fully understand the performance of Impro_endairA.
References
1. Abidin, A.F.A., et al.: An Analysis on Endaira. International Journal on Computer Science
and Engineering 02(03), 437–442 (2010)
2. Perkins, C., Bhagwat, P.: Highly Dynamic Destination-Sequenced Distance Vector Routing (DSDV) for Mobile Computers. In: ACM SIGCOMM, pp. 234–244 (1994)
3. Johnson, D.B., Maltz, D.A., Broch, J.: DSR: The Dynamic Source Routing Protocol for
Multi-Hop Wireless Ad Hoc Networks, http://www.monarch.cs.cmu.edu/
4. Djenouri, D., Bouamama, M.: Black-hole-resistant ENADAIRA based routing protocol for
Mobile Ad hoc Networks. Int. Journal on Security and Networks 4(4) (2009)
5. Djenouri, D., Badache, N.: A novel approach for selfish nodes detection in manets:
proposal and petrinets based modeling. In: The 8th IEEE International Conference on
Telecommunications (ConTel 2005), Zagreb, Croatia, pp. 569–574 (2005)
6. Gergely, A., Buttyan, L., Vajda, I.: Provably Secure On-Demand Source Routing in
Mobile Ad Hoc Networks. IEEE Transaction on Mobile Computing 5(11) (November
2006)
7. Nguyen, H.L., Nguyen, U.T.: A study of different types of attacks on multicast in mobile
ad hoc networks. Ad. Hoc. Networks 6, 32–46 (2008),
http://www.sciencedirect.com
8. Kumar, M.J., Gupta, K.D.: Secure Routing Protocols in Ad Hoc Networks: A Review. In:
Special Issue of IJCCT, 2010 for International Conference (ICCT 2010), December 3- 5,
vol. 2(2,3,4) (2010)
9. Fanaei, M., Fanian, A., Berenjkoub, M.: Prevention of Tunneling Attack in endairA. In:
Sarbazi-Azad, H., Parhami, B., Miremadi, S.-G., Hessabi, S. (eds.) CSICC 2008. CCIS,
vol. 6, pp. 994–999. Springer, Heidelberg (2008)
10. Buttyan, L., Hubaux, J.P.: Security and cooperation in wireless networks. Cambidge
University Press (2008)
11. Marti, S., Giuli, T.J., Lai, K., Baker, M.: Mitigating routing misbehavior in mobile ad hoc
networks. In: Proceedings of the 6th Annual International Conference on Mobile
Computing and Networking (MobiCom 2000), Boston, MA, pp. 255–265 (2000)
12. Li, W., Joshi, A.: Security Issues in Mobile Ad Hoc Networks- A Survey (2006)
13. Hu, Y.C., Perrig, A., Johnson, D.B.: Ariadne: A Secure On-Demand Routing Protocol for
Ad hoc Networks. In: Proc. 8th ACM Int’l. Conf. Mobile Computing and Networking
(Mobicom 2002), Atlanta, Georgia, pp. 12–23 (September 2002)
Identification of Reliable Peer Groups in Peer-to-Peer
Computing Systems
1 Introduction
PCT = IT × AP (4)
PCT represents the time during which a peer actually executes the system's computations in the presence of peer autonomy failures.
The peer groups are constructed by the peer group construction algorithm given below in Figure 1(a); the peers are classified into the classes A, B, C, D, E, F, G, and H depending on the peer availability (AP), peer computation time (PCT), and peer credibility (CP).
In Figure 1(b) we show a unit cube. The three dimensions of the cube correspond to the three important peer characteristics which affect the performance of a peer group: the vertical dimension represents the peer availability (AP), the horizontal dimension represents the peer computation time (PCT), and the dimension perpendicular to the plane represents the peer credibility (CP). We divide this cube into eight equal-volume sub-cubes A, B, C, D, E, F, G, and H, as shown in Figure 1(b), which correspond to the peer groups constructed by the algorithm (a classification sketch follows the list below).
• The group ‘A’ (sub-cube A in Fig. 1(b)) represents a peer group in which all the peers have high values of AP, PCT, and CP. In group ‘A’ all the peers have a high possibility of executing the task reliably, because they have high credibility as well as high availability.
• The group ‘B’ (sub-cube B in Fig. 1(b)) represents a peer group in which all the peers have high values of AP and PCT but low values of CP. This means that the peer group has a high possibility of completing the task; however, its results might be incorrect.
Fig. 1. (a) Algorithm for peer group construction (b) Categorization of peer groups
• The group ‘C’ (sub-cube C in Fig. 1(b)) represents a peer group in which all the peers have high values of AP and CP but low values of PCT. The peer group has a high possibility of producing correct results; however, it may not complete the assigned task because of the lack of peer computation time.
• The group ‘D’ (sub-cube D in Fig. 1(b)) represents a peer group in which all the peers have high values of AP but low values of PCT and CP. This peer group has a low probability of completing the task, due to the lack of peer computation time, and the results produced by it might also be incorrect.
• The group ‘E’ (sub-cube E in Fig. 1(b)) represents a peer group in which all the peers have high values of PCT and CP but low values of AP. In this peer group, peers have small availability but high peer computation time, so there is a possibility of completing the computation task with correct results.
• The group ‘F’ (sub-cube F in Fig. 1(b)) represents a peer group in which all the peers have high values of PCT but low values of CP and AP. This peer group has low peer availability and credibility; hence it cannot complete the task and is not recommended for computations.
• The group ‘G’ (sub-cube G in Fig. 1(b)) represents a peer group in which all the peers have high values of CP but low values of AP and PCT. This peer group has the least probability of completing the task and is not recommended for computations.
• The group ‘H’ (sub-cube H in Fig. 1(b)) represents a peer group in which all the peers have low values of AP, PCT, and CP. This group is also not recommended for computations.
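The group assignment can be sketched as a simple threshold test; cutting each axis of the unit cube at 0.5 is our assumed reading of the equal-volume sub-cube construction, not the algorithm of Figure 1(a) itself.

def classify_peer(ap, pct, cp, threshold=0.5):
    """Map a peer's (AP, PCT, CP) values in [0, 1] to one of groups A-H."""
    groups = {
        (True,  True,  True):  "A",   # high AP, PCT, CP
        (True,  True,  False): "B",   # high AP, PCT; low CP
        (True,  False, True):  "C",   # high AP, CP; low PCT
        (True,  False, False): "D",   # high AP only
        (False, True,  True):  "E",   # high PCT, CP; low AP
        (False, True,  False): "F",   # high PCT only
        (False, False, True):  "G",   # high CP only
        (False, False, False): "H",   # all low
    }
    return groups[(ap >= threshold, pct >= threshold, cp >= threshold)]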
3 Conclusion
In a pure P2P computing system, peer properties such as peer availability, peer credibility, and peer computation time may be used to group the peers. Here we proposed an algorithm which categorizes the peers existing in a P2P computing system into eight different categories according to the values of the above-mentioned peer properties. In group ‘A’ all the peers have high values of availability, credibility, and computation time; hence this group has the highest probability of completing the computation task in the given time period and of producing correct results, and it may be very useful for real-time and deadline-driven computations. The group ‘B’ may be useful in computations where the time deadline is important and a fraction of error in the computation is acceptable. The groups ‘C’ and ‘E’ may be used for computations in which accuracy is a must but there is no time deadline. The group ‘D’ can be used for computations where a fraction of error is acceptable and there is no time deadline for the completion of the computation. The groups ‘F’, ‘G’, and ‘H’ are not recommended for computation purposes.
Improving the Efficiency of Data Retrieval in Secure
Cloud
1 Introduction
contain important information related to the data files. The existing searchable encryption techniques do not suit the cloud computing scenario because they support only exact keyword search. This significant drawback of existing schemes signifies the important need for new methods that support searching flexibility, tolerating both minor typos and format inconsistencies. The main problem is how to search the data efficiently and retrieve the results in the most secure and privacy-preserving manner. The existing work mainly focuses on the 'fuzzy keyword search' method: the outsourced data is encrypted; fuzzy keyword sets are constructed based on both the wildcard technique and the gram-based technique; and a symbol-based trie-traverse search scheme is introduced [2, 4], in which a multi-way tree is constructed for storing the fuzzy keyword set and finally retrieving the data.
The main challenges are security in data storage, searching, data retrieval, etc.; here we focus mainly on the data searching and retrieval part. Traditional encryption techniques support only exact keyword search. This is insufficient, because the data is retrieved only if the given keyword matches the predefined keyword set. So, to increase searching flexibility, many new searching techniques have been introduced.
S. Ji proposed a new computing paradigm called interactive fuzzy search [3], which uses a straightforward method for keyword set construction and addresses queries with a single keyword. Here the keyword set construction needs a lot of space for storing the keywords. In order to reduce the space complexity, J. Li et al. [4, 6] proposed another fuzzy keyword search method, which includes a wildcard-based method [2, 4, 6] and a gram-based method [2, 4, 6] for constructing fuzzy keyword sets, and a symbol-based trie-traverse search scheme [2, 6] for data retrieval. Philippe Golle proposed Secure Conjunctive Keyword Search [1], which gives a clear idea about the conjunction of keywords. By introducing the conjunction of keywords, the relevancy increases; that is, the efficiency improves and a ranking is generated automatically. Xin Zhou proposed a wildcard search method for a digital dictionary based on a mobile platform [2]. It gives a good account of the wildcard method and of the trie-tree approach, which reduces the search range greatly, includes the fuzzy pointer field, and also describes the process of inserting a word into the tree.
Figure 1 shows the node structure of an advanced trie-tree [2]. The tree consists of nodes, each of which has 26 child pointers. In the proposed work, the existing trie-tree method is replaced with the advanced trie-tree method, in which two fields, namely the exact index and the fuzzy pointer, are added to the node. The exact index holds the offset of the word's entry in the dictionary, which contains all the information about the word. The fuzzy pointer records a list of keywords' offsets in the dictionary.
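The node layout described above could look as follows in Python; the field names are our own, and only lowercase a-z keys are handled.

class AdvancedTrieNode:
    def __init__(self):
        self.children = [None] * 26   # one child pointer per letter a-z
        self.exact_index = None       # offset of the word's entry in the dictionary
        self.fuzzy_offsets = []       # offsets of fuzzy-matching keywords (fuzzy pointer)

def insert_word(root, word, entry_offset):
    """Insert a word, recording its dictionary offset at the terminal node."""
    node = root
    for ch in word.lower():
        i = ord(ch) - ord("a")
        if node.children[i] is None:
            node.children[i] = AdvancedTrieNode()
        node = node.children[i]
    node.exact_index = entry_offset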
The proposed scheme returns the matching files in a ranked order for the given keyword sequence, according to certain relevance criteria such as keyword frequency. Here a ranked list is generated for the conjunctive sequence of keywords, and the data is retrieved according to that ranking. It is highly secure, privacy preserving, and efficient.
In the wildcard method, for the keyword 'SIMPLE' with preset edit distance 1, the
total number of variants constructed is 13 + 1. In general, for a keyword of length l,
the number of variants constructed is 2·l + 1 + 1, so for a conjunction of n keywords,
n·(2·l + 1 + 1) variants are constructed. In the gram-based method, for the keyword
'SIMPLE' with preset edit distance 1, the total number of variants constructed is
l + 1, so for a conjunction of n keywords, n·(l + 1) variants are constructed. The
complexity of searching reduces from O(m·n·l) to O(m·n), where m is the length of
the hash value, and the space complexity reduces from O(m·p·n·l) to O(m·n·l).
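A minimal sketch of wildcard-based fuzzy-set construction for edit distance 1, matching the 2·l + 1 + 1 count above (the word itself, l single-character substitutions, and l + 1 insertions):

import java.util.LinkedHashSet;
import java.util.Set;

static Set<String> wildcardFuzzySet(String w) {
    Set<String> s = new LinkedHashSet<>();
    s.add(w);                                             // the exact keyword
    for (int i = 0; i < w.length(); i++)                  // replace one character with *
        s.add(w.substring(0, i) + "*" + w.substring(i + 1));
    for (int i = 0; i <= w.length(); i++)                 // insert * at each gap
        s.add(w.substring(0, i) + "*" + w.substring(i));
    return s;                                             // 13 + 1 variants for "SIMPLE"
}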
References
1. Golle, P., Staddon, J., Waters, B.: Secure Conjunctive Keyword Search over Encrypted
Data. In: Jakobsson, M., Yung, M., Zhou, J. (eds.) ACNS 2004. LNCS, vol. 3089,
pp. 31–45. Springer, Heidelberg (2004)
2. Zhou, X., Xu, Y., Chen, G., Pan, Z.: A New Wild-card Search Method for Digital
Dictionary Based on Mobile Platform. In: Proceedings of the 16th International Conference
on Artificial Reality and Telexistence Workshops, ICAT 2006. IEEE (2006)
3. Ji, S., Li, G., Li, C., Feng, J.: Efficient Interactive Fuzzy Keyword Search. In: Proc. of
WWW 2009 (2009)
4. Li, J., Wang, Q., Wang, C., Cao, N., Ren, K., Lou, W.: Enabling Efficient Fuzzy Keyword
Search over Encrypted Data in Cloud Computing. In: Proc. of IEEE INFOCOM 2010
Mini-Conference, San Diego, CA, USA (March 2010)
5. Kaufman, L.M.: Data Security in the World of Cloud Computing. IEEE Security and
Privacy 7(4), 61–64 (2009)
6. Li, J., Wang, Q., Wang, C., Cao, N., Ren, K., Lou, W.: Fuzzy Keyword Search over
Encrypted Data in Cloud Computing. In: Proc. of IEEE INFOCOM 2010 Mini-Conference,
San Diego, CA, USA (March 2010)
7. Wang, C., Cao, N., Li, J., Ren, K., Lou, W.: Secure Ranked Keyword Search over
Encrypted Cloud Data. In: Proc. of ICDCS 2010 (2010)
8. Cao, N., Wang, C., Li, M., Ren, K., Lou, W.: Privacy-Preserving Multi-keyword Ranked
Search over Encrypted Cloud Data. In: Proc. of IEEE INFOCOM 2011 (April 2011)
9. Boneh, D., Waters, B.: Conjunctive, Subset, and Range Queries on Encrypted Data. In:
Vadhan, S.P. (ed.) TCC 2007. LNCS, vol. 4392, pp. 535–554. Springer, Heidelberg (2007)
VLSI Implementation of Burrows Wheeler Transform
for Memory Reduced Distributed Arithmetic
Architectures
1 Introduction
Sorting and compression of the look-up table entries are offline processes; hence
only the reverse BWT has to be implemented in the FPGA. In this section we discuss
the implementation of the reverse BWT for generating the look-up table data required
for performing the convolution operation from the compressed table.
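For reference, a minimal software sketch of the reverse (inverse) BWT, assuming the transform output is the last column of the sorted-rotation matrix together with the index of the row holding the original string:

import java.util.Arrays;
import java.util.Comparator;

static String inverseBWT(String last, int primary) {
    int n = last.length();
    Integer[] next = new Integer[n];
    for (int i = 0; i < n; i++) next[i] = i;
    // A stable sort of the last column reproduces the first column; following
    // next[] repeatedly then walks the rotations in original text order.
    Arrays.sort(next, Comparator.comparingInt(last::charAt));
    StringBuilder out = new StringBuilder();
    for (int i = 0, p = primary; i < n; i++) {
        p = next[p];
        out.append(last.charAt(p));
    }
    return out.toString();
}

For example, inverseBWT("nnbaaa", 3) recovers "banana".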
Fig. 1. ASM chart for sorting operation
Fig. 2. Data path circuit for sorting
The basic steps include sorting and concatenation. An ASM chart for sorting a list
of k unsigned numbers stored in a set of registers R0, …, Rk-1 is shown in Fig. 1. A
data path circuit that meets the requirements of the ASM chart in Fig. 1 is illustrated
in Fig. 2. It shows how the registers R0, …, Rk-1 can be connected to registers A and
B using 4-to-1 multiplexers; we assume k = 4 for simplicity. Registers A and B are
connected to a comparator circuit and, through the multiplexers, back to the inputs of
the registers R0, …, Rk-1. The registers can be loaded with initial (unsorted) data
using the dataIn lines. The data is written (loaded) into each register by asserting the
WrInit control signal and placing the address of the register on the RAdd input. The
tristate buffer driven by the Rd control signal is used to output the contents of the
registers on the DataOut output. The signals Rin0, …, Rink-1 are controlled by the
2-to-4 decoder as shown in the figure. If Int = 1, the decoder is driven by one of the
counters ci or cj; if Int = 0, it is driven by the external input RAdd. The signals zi
and zj are set to 1 if ci = k-2 and cj = k-1, respectively.
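A software analogue of the sort the ASM chart encodes, assuming the usual exchange sort with outer counter ci and inner counter cj (the array models registers R0, …, Rk-1):

static void asmSort(int[] r) {
    int k = r.length;
    for (int ci = 0; ci <= k - 2; ci++) {            // zi = 1 when ci reaches k-2
        for (int cj = ci + 1; cj <= k - 1; cj++) {   // zj = 1 when cj reaches k-1
            int a = r[ci], b = r[cj];                // load registers A and B
            if (a > b) { r[ci] = b; r[cj] = a; }     // comparator decides the swap
        }
    }
}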
The implemented BWT block is applied to the discrete wavelet transform using the
Daubechies-4 wavelet. The size of the look-up table is reduced from 9216 to 8667
bits.
Fig. 4. VHDL Simulation output of a low pass filter with Daubechies-4 wavelet coefficients
using distributed arithmetic architecture
Fig. 5. Comparison of compression ratios of look-up tables with and without BWT for
Daubechies-4 high-pass (H1) and low-pass (H0) filter coefficients
5 Conclusion
The synthesis report generated by Xilinx ISE 9.1 revealed that the time needed for
executing the BWT block is only 5.00 seconds. This guarantees that the addition of
the BWT block does not add much delay to the entire operation. The method was
tested in the implementation of the DWT using the DB4 wavelet, obtaining a
compression ratio of 2.3:1. Thus this methodology of adding a BWT block to any
distributed arithmetic computation of filters with larger coefficients guarantees a
reduction in the memory needed for look-up tables.
Multi-objective Optimization for Object-oriented Testing
Using Stage-Based Genetic Algorithm
1 Introduction
Genetic algorithms, advanced heuristic search techniques, have been successfully
applied in the area of software testing. For a large search space from which an
optimal set of solutions must be obtained, a GA is a strong choice in software testing.
Commonly, these techniques are referred to as evolutionary testing. Evolutionary
testing tries to improve the effectiveness and efficiency of the testing process by
transforming testing objectives into search problems and applying evolutionary
computation to solve them. Testing of software can then be done with a single
objective or with multiple objectives: instead of fixing only one criterion or quality
parameter for generating test cases, multiple objectives, such as minimizing the time,
cost, and number of test cases while simultaneously maximizing the coverage (i.e.,
the test requirements), are considered, as the sketch below illustrates.
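A hedged sketch of one way such objectives can be scalarized into a single GA fitness value (the weights and names are illustrative, not the paper's exact formulation):

// Combine the three objectives: reward path coverage, penalize execution
// time and test-suite size; wCov, wTime, wSize are user-chosen weights.
static double fitness(double pathCoverage, double execTimeMs, int suiteSize,
                      double wCov, double wTime, double wSize) {
    return wCov * pathCoverage - wTime * execTimeMs - wSize * suiteSize;
}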
2 Existing Methods
Genetic Algorithms are the most popular heuristic technique to solve Multi-Objective
Optimization Problems [2]. A Multi-Objective Optimization Problem has a number of
[Figure: taxonomy of evolutionary methods, contrasting local search (HC, SA, TS)
with global search (ESs and GAs, including RWGA, NSGA, and the stage-based GA),
the two approaches to implementing GA, and the multiple objectives considered:
obj1 maximize coverage (the test requirements), obj2 minimize time, obj3 minimize
test-suite size.]
Program               Size   Coverage (GA)  Coverage (SBGA)  Time GA (ms)  Time SBGA (ms)  I (GA)  I (SBGA)  Convergence
2. Coinbox             120       0.75            0.83            4080           3802         0.7      0.5    up to 3 eras
4. Calculator          110       0.87            0.93            2725           2570         0.4      0.3    up to 3 eras
5. Postal code          98       0.78            0.88            5180           4215         0.8      0.5    up to 10 secs
6. Sliding window      100       0.87            0.93            4250           3990         0.5      0.4    up to 3 eras
8. Anomaly detector    140       0.80            0.90            2312           2196         0.5      0.4    up to 4 eras
9. Array difference    150       0.83            0.92            2105           1958         0.4      0.3    up to 4 eras
[Figure: immigration rate 'I' (0.30 to 0.80) plotted against the sample programs;
annotated execution times fall from (4080, 5180) ms to (3802, 4215) ms.]
4 Conclusion
Thus, the stage-based genetic algorithm with two stages is used for the generation of
object-oriented test cases. The fitness depends purely on the path coverage of the
test cases in the class. The results for sample Java programs show the efficiency
and effectiveness of the test cases generated by SBGA in terms of path coverage. In
addition to path coverage, the time required for execution and the immigration rate
are also satisfactory. The algorithm can be applied to similar types of software
engineering problems.
References
1. Ghiduk, A.S.: Automatic Generation of Object-Oriented Tests with a Multistage-Based
Genetic Algorithm. Journal of Computers 5(10), 1560–1569 (2010)
2. Singh, D.P., Khare, A.: Different Aspects of Evolutionary Algorithms, Multi-Objective
Optimization Algorithms and Application Domain. International Journal of Advanced
Networking and Applications 2(04), 770–775 (2011)
3. Konak, A., Coit, D.W., Smith, A.E.: Multi-objective optimization using genetic algorithms:
A tutorial. Reliability Engineering and System Safety, 992–1007 (2006)
Cluster Pattern Matching Using ACO Based Feature
Selection for Efficient Data Classification
1 Introduction
Classification is the task of learning from instances that are described by a set of
features and a class label. An unknown sample is an instance with a set of features
whose class label is to be predicted. The result of learning is a classification model
capable of accurately predicting the class label of unknown samples. Several
methods in the literature attempt to classify samples based on the patterns in the
instances of the training set. One such classification approach is K-Nearest
Neighbor (KNN). The drawback of the standard KNN classifier is that it does not
output meaningful probabilities associated with class prediction [4]. Therefore,
higher values of k are considered for classification, which provides smoothing that
reduces the risk of over-fitting due to noise in the training data [3]. However,
choosing a higher value of k leads to misclassification of samples present in the
training dataset, as shown in Section 5.1. A Bayesian solution to this problem was
proposed and is known as
Probabilistic k-Nearest Neighbor (PNN). However, it has been shown that PNN does
not offer an improvement in accuracy over the basic KNN [2]. Han et al. [1]
proposed Query Projection Analytical Learning (QPAL) for classification; the
drawback of this approach is that training instances with only a few features
matching a query are also considered. Ant Colony Optimization (ACO) is a swarm
intelligence algorithm for solving optimization problems. In this paper, a novel
algorithm called Cluster Pattern Matching based Classification (CPMC), using an
ACO-based approach to feature selection, is proposed. Experimental results show
that CPMC is efficient for classifying datasets. This paper is organized as follows.
Section 2 describes clustering the instances of the training set. Section 3 describes
cluster pattern matching based classification. Section 4 describes building a CPMC
training model using ACO-based feature selection. A comparison with existing
methods is given in Section 5. Finally, the conclusion is presented in Section 6.
The ACO-based feature selection discussed in Section 4 was used to select the
features of the training dataset for comparison with the unknown sample. Each
selected feature value xi of the instances of the training dataset in the cluster was
compared with the corresponding feature value of the unknown sample. The number
of features of a training instance whose values match the corresponding feature
values of the unknown sample was counted and denoted the feature match count.
This was repeated for all training instances in the cluster to which the unknown
sample belongs. The training instances in the cluster having the maximum feature
match count were grouped, and the class label of the unknown sample was predicted
as the majority class label of that group. If there was more than one majority class
label, the probability of each class label was found by dividing its majority count by
its corresponding class label count in the cluster, and the class label of the unknown
sample was predicted as the one with the highest probability.
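A minimal sketch of this prediction step (names are illustrative; the probability-based tie-break described above is reduced to simple voting here):

import java.util.*;

static String predict(int[][] cluster, String[] labels, int[] sample, int[] selectedFeatures) {
    int best = -1;
    List<Integer> winners = new ArrayList<>();
    for (int i = 0; i < cluster.length; i++) {
        int match = 0;
        for (int f : selectedFeatures)              // only ACO-selected features
            if (cluster[i][f] == sample[f]) match++;
        if (match > best) { best = match; winners.clear(); }
        if (match == best) winners.add(i);          // keep all best-matching instances
    }
    Map<String, Integer> votes = new HashMap<>();   // majority label among winners
    for (int i : winners) votes.merge(labels[i], 1, Integer::sum);
    return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
}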
Each agent is evaluated: if the energy value is greater than that of the previous trial,
the tabu list is updated with the pheromone deposition and the energy value; if the
energy value is lower than that of the previous trial, the newly added or deleted
features are ignored. The process is repeated until the energy value becomes constant
over a series of trials or the classification accuracy exceeds 99%. Once the solution
is found, the classification model is built, and the feature subset in the pheromone
deposition denotes the features to be used for comparison by the CPMC algorithm
to classify unknown samples.
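A hedged sketch of this trial loop, with the energy function abstracted as classification accuracy on the training data (the pheromone bookkeeping is reduced here to keeping the best subset; names are ours):

import java.util.*;
import java.util.function.Function;

static boolean[] selectFeatures(int nFeatures, Function<boolean[], Double> energy) {
    Random rnd = new Random();
    boolean[] subset = new boolean[nFeatures];
    Arrays.fill(subset, true);                    // start with all features selected
    double best = energy.apply(subset);
    int stableTrials = 0;
    while (stableTrials < 50 && best <= 0.99) {   // constant energy or accuracy > 99%
        boolean[] trial = subset.clone();
        trial[rnd.nextInt(nFeatures)] ^= true;    // add or delete one feature
        double e = energy.apply(trial);
        if (e > best) { subset = trial; best = e; stableTrials = 0; }
        else stableTrials++;                      // ignore the change
    }
    return subset;
}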
6 Conclusion
References
1. Han, Y., Lam, W., Ling, C.X.: Customized classification learning based on query
projections. Information Sciences 177, 3557–3573 (2007)
2. Manocha, S., Girolami, M.A.: An empirical analysis of the probabilistic k-nearest
neighbour classifier. Pattern Recogn. Lett. 28, 1818–1824 (2007)
3. Ming Leung, K.: k-Nearest Neighbor Algorithm for Classification. Polytechnic University,
Department of Computer Science / Finance and Risk Engineering (2007)
4. Tomasev, N., Radovanovic, M., Mladenic, D.: A Probabilistic Approach to Nearest-
Neighbor Classification: Naive Hubness Bayesian KNN. In: CIKM 2011, Glasgow,
Scotland, UK (2011)
Population Based Search Methods
in Mining Association Rules
Abstract. Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are
both population-based search methods: they move from one set of points (a
population) to another in a single iteration, with likely improvement, using a
set of control operators. GA has become popular because of its many versions,
ease of implementation, ability to solve difficult problems, and so on. PSO is a
relatively recent heuristic search mechanism inspired by bird flocking and fish
schooling. Association Rule (AR) mining is one of the most studied tasks in
data mining. The objective of this paper is to compare the effectiveness and
computational capability of GA and PSO in mining association rules. Though
both are heuristic-based search methods, the control parameters involved in GA
and PSO differ: the GA parameters are based on reproduction techniques drawn
from biology, whereas the PSO parameters are based on the particles' 'best'
values in each generation. From the experimental study, PSO is found to be as
effective as GA, with marginally better computational efficiency.
1 Introduction
With advancements in information technology, the amount of data stored in databases
and the kinds of databases continue to grow fast. Analyzing and finding the critical
hidden information in this data has become a very important issue. Association rule
mining techniques help in achieving this task. Association rule mining is the search
for interesting patterns or information in a database [12]; it finds interesting
associations and/or correlation relationships among large sets of data items.
Typically the relationship is in the form of a rule X ⇒ Y [13], where X and Y
are itemsets, X is called the antecedent, and Y the consequent.
Genetic algorithms and particle swarm optimization are both evolutionary heuristics
and population-based search methods proven to be successful in solving difficult
problems. A Genetic Algorithm (GA) is a procedure used to find approximate solutions
to search problems through the application of the principles of evolutionary biology.
Particle swarm optimization (PSO) is a heuristic search method whose mechanics are
inspired by the swarming or collaborative behavior of biological populations. The
major objective of this paper is to verify the hypothesis that PSO has the same
effectiveness as GA but better computational efficiency. The paper is organized as
follows. Section 2 discusses the related work carried out so far on GA and PSO in
association rule mining. Section 3 describes the methodology adopted for mining
ARs. In Section 4 the experimental results are presented, followed by conclusions
in Section 5.
2 Related Works
During the last few decades, much research has been carried out using evolutionary
algorithms in data mining; association rule mining accounts for a major part of this
research, and many classical approaches for mining association rules have been
developed and analyzed. GA discovers high-level prediction rules [1] with better
attribute interaction than other classical mining rules available. The mechanism of
selecting individuals for a new generation based on the technique of elitist
recombination [2] simplifies the implementation of GA.
In [3], the crossover probability and mutation probability are set dynamically
during evolution: when a new population evolves, if every individual is comparatively
consistent, the crossover probability Pc and mutation probability Pm are increased.
Noda et al. [4] proposed two relatively simple objective measures of rule
surprisingness (or interestingness). By contrast, genetic algorithms (GAs) [5] maintain
a population and thus can search for many non-dominated solutions in parallel. GA's
ability to find a diverse set of solutions in a single run, and its exemption from the
demand for objective preference information, give it an immediate advantage over
other classical techniques.
Particle Swarm Optimization is a population-based stochastic optimization technique
developed by Eberhart and Kennedy in 1995 [6], inspired by the social behavior of
bird flocking and fish schooling. PSO shares many similarities with evolutionary
computation techniques such as GA; however, unlike GA, PSO has no evolution
operators such as crossover and mutation. A binary PSO-based algorithm for fuzzy
classification rule generation, also called fuzzy PSO, is presented in [7]. PSO has
proved to be competitive with GA in several tasks, mainly in optimization areas. The
PSO variants implemented were the Discrete Particle Swarm Optimizer [8] (DPSO),
the Linear Decreasing Weight Particle Swarm Optimizer [9] (LDWPSO), and the
Constricted Particle Swarm Optimizer [10] (CPSO).
Fixing the best position [16] for particles after the velocity update by using the
Euclidean distance helps in generating the best particles. A chaotic operator based
on Zaslavskii maps used in the velocity update equation [17] proved to enhance the
efficiency of the method. The soft adaptive particle swarm optimization algorithm
[18] exploits self-adaptation to improve the ability of PSO to overcome optimization
problems of high dimensionality. Particle swarm optimization with self-adaptive
learning [19] aims to provide the user a tool for various optimization problems. The
problem of getting stuck at a local optimum, and hence premature convergence, is
overcome by adopting self-adaptive PSO [20], in which the diversity of the
population is maintained; this copes with the deception of multiple local optima and
reduces computational complexity.
3 Methodology
Genetic algorithm and particle swarm optimization, both population-based search
methods, are applied for mining association rules from databases. The self-adaptive
GA [15] is found to perform marginally better than the traditional GA. This section
describes the methodology adopted for mining ARs based on both SAGA and PSO.
The fitness value decides the importance of each itemset being evaluated and is
computed using the fitness function of equation (1):

fitness(x) = conf(x) · log(sup(x) · length(x) + 1)    (1)

where sup(x) and conf(x) are as described in equations (2) and (3), and length(x) is
the length of the association rule x:

sup(x) = |X & Y| / |D|    (2)

conf(x) = |X & Y| / |X|    (3)

where |X & Y| is the number of records that satisfy both the antecedent X and the
consequent Y, |X| is the number of records satisfying the antecedent X, and |D| is
the total number of records.
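A direct transcription of equations (1)-(3) into code, assuming the support counts are already available (treating |D| as the total record count is an assumption of this sketch):

// supXY = |X & Y|, supX = |X|, totalRecords = |D|, ruleLength = length(x)
static double ruleFitness(int supXY, int supX, int totalRecords, int ruleLength) {
    double sup = (double) supXY / totalRecords;    // equation (2)
    double conf = (double) supXY / supX;           // equation (3)
    return conf * Math.log(sup * ruleLength + 1);  // equation (1)
}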
In the self-adaptive GA, pm^(n) is the nth-generation mutation rate and pm^(n+1) the
(n+1)th-generation mutation rate; the first-generation mutation rate is pm^(0). fi^(n)
is the fitness of individual itemset i in the nth generation, fmax^(n+1) is the highest
fitness of the (n+1)th generation of individuals, m is the number of itemsets, and λ is
the adjustment factor. The fitness criterion is as described in equation (5).
In PSO, the velocity and position of the particles are updated as

v[] = v[] + c1 · rand() · (pbest[] - present[]) + c2 · rand() · (gbest[] - present[])    (6)

present[] = present[] + v[]    (7)

where v[] is the particle velocity and present[] is the current particle position;
pbest[] and gbest[] are the local best and global best positions of the particles;
rand() is a random number in (0, 1); and c1, c2 are learning factors, usually
c1 = c2 = 2. The position of the particles is then updated based on equation (7).
During the update, if the velocity exceeds the user-defined Vmax, it is clamped to
Vmax. The above process is repeated for a fixed number of generations or until the
termination condition is met.
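A minimal sketch of one PSO update over a single particle, following equations (6) and (7) with the usual Vmax clamp:

static void psoStep(double[] x, double[] v, double[] pbest, double[] gbest,
                    double c1, double c2, double vmax, java.util.Random rnd) {
    for (int d = 0; d < x.length; d++) {
        v[d] += c1 * rnd.nextDouble() * (pbest[d] - x[d])      // pull toward local best
              + c2 * rnd.nextDouble() * (gbest[d] - x[d]);     // pull toward global best
        v[d] = Math.max(-vmax, Math.min(vmax, v[d]));          // clamp to Vmax, eq. (6)
        x[d] += v[d];                                          // position update, eq. (7)
    }
}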
4 Experimental Results
To confirm the effectiveness of GA and PSO, both algorithms were coded in Java.
The Lenses, Haberman's Survival, and Car Evaluation datasets from the UCI Irvine
repository [14] were taken up for the experiment. Self-adaptive GA and PSO-based
mining of ARs on these datasets resulted in the predictive accuracy plotted in Fig. 1.
The maximum predictive accuracy achieved during successive iterations was
recorded. PSO is found to be as effective as SAGA in mining association rules; the
predictive accuracy of the two methods is close.
[Fig. 1. Predictive accuracy (%) of SAGA and PSO on the Lenses, Haberman's
Survival, and Car Evaluation datasets.]
[Fig. 2. Execution time (ms) of SAGA and PSO on the Lenses, Haberman's Survival,
and Car Evaluation datasets.]
The main difference between PSO and GA is that PSO does not have genetic
operators such as crossover and mutation. In PSO only the best particle passes
information to the others, and hence the computational capability of PSO is
marginally better than that of SAGA.
5 Conclusions
Particle swarm optimization is a recent heuristic search method based on the idea of
collaborative behavior and swarming in populations. Both PSO and GA depend on
sharing information between populations. In GA, information is passed from one
generation to the next through the reproduction operators, namely crossover and
mutation. GA is a well-established method with many versions and many applications.
The objective of this study was to analyze PSO and GA in terms of effectiveness and
computational efficiency.
From the study carried out on the three datasets, PSO proves to be as effective as
GA in mining association rules, and in terms of computational efficiency PSO is
marginally faster. The pbest and gbest values tend to pass information between
populations more effectively than the reproduction operators of GA. PSO and GA are
both inspired by nature and are effective for optimization problems; setting
appropriate values for the control parameters involved in these heuristic methods is
the key to success with them.
References
1. Freitas, A.A.: A Survey of Evolutionary Algorithms for Data Mining and Knowledge
Discovery. Postgraduate Program in Computer Science, Pontificia Universidade Catolica
do Parana, Rua Imaculada Conceicao, Brazil
2. Shi, X.-J., Lei, H.: Genetic Algorithm-Based Approach for Classification Rule Discovery.
In: International Conference on Information Management, Innovation Management and
Industrial Engineering, ICIII 2008, vol. 1, pp. 175–178 (2008)
3. Zhu, X., Yu, Y., Guo, X.: Genetic Algorithm Based on Evolution Strategy and the
Application in Data Mining. In: First International Workshop on Education Technology
and Computer Science, ETCS 2009, vol. 1, pp. 848–852 (2009)
4. Noda, E., Freitas, A.A., Lopes, H.S.: Discovering Interesting Prediction Rules with
Genetic Algorithm. In: Proceedings of the Conference on Evolutionary Computation
(CEC 1999), Washington, DC, USA, pp. 1322–1329 (1999)
5. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer,
Berlin (1994)
6. Kennedy, J., Eberhart, R.C.: Particle Swarm Optimization. In: Proceedings of the 1995
IEEE International Conference on Neural Networks, pp. 1942–1948. IEEE Press (1995)
7. He, Z., et al.: Extracting Rules from Fuzzy Neural Network by Particle Swarm
Optimization. In: IEEE Conference on Evolutionary Computation, USA, pp. 74–77 (1995)
8. Kennedy, J., Eberhart, R.C.: Swarm Intelligence. Morgan Kaufmann (2001)
9. Shi, Y., Eberhart, R.C.: Empirical Study of Particle Swarm Optimization. In: Proceedings
of the 1999 Congress on Evolutionary Computation, Piscataway (1999)
10. Clerc, M., Kennedy, J.: The Particle Swarm: Explosion, Stability and Convergence in a
Multidimensional Complex Space. IEEE Transactions on Evolutionary Computation 6,
58–73 (2002)
11. Dehuri, S., Mall, R.: Predictive and Comprehensible Rule Discovery Using a
Multiobjective Genetic Algorithm. Knowledge Based Systems 19, 413–421. Elsevier (2006)
12. Wang, M., Zou, Q., Liu, C.: Multi-dimension Association Rule Mining Based on Adaptive
Genetic Algorithm. In: IEEE International Conference on Uncertainty Reasoning and
Knowledge Engineering, pp. 150–153 (2011)
13. Dehuri, S., Patnaik, S., Ghosh, A., Mall, R.: Application of Elitist Multi-objective Genetic
Algorithm for Classification Rule Generation. Applied Soft Computing, 477–487 (2008)
14. Merz, C.J., Murphy, P.M.: UCI Repository of Machine Learning Databases. University of
California Irvine, Department of Information and Computer Science (1996),
http://kdd.ics.uci.edu
15. Indira, K., Kanmani, S., Gaurav Sethia, D., Kumaran, S., Prabhakar, J.: Rule Acquisition
in Data Mining Using a Self Adaptive Genetic Algorithm. In: Nagamalai, D., Renault, E.,
Dhanuskodi, M. (eds.) CCSEIT 2011. CCIS, vol. 204, pp. 171–178. Springer, Heidelberg
(2011)
16. Kuo, R.J., Chao, C.M., Chiu, Y.T.: Application of Particle Swarm Optimization in
Association Rule Mining. Applied Soft Computing, 323–336 (2011)
17. Alatas, B., Akin, E.: Multi-objective Rule Mining Using a Chaotic Particle Swarm
Optimization Algorithm. Knowledge Based Systems 23, 455–460 (2009)
18. Mohammed, Y., Ali, B.: Soft Adaptive Particle Swarm Algorithm for Large Scale
Optimization. In: Fifth International Conference on Bio Inspired Computing,
pp. 1658–1662. IEEE Press (2010)
19. Wang, Y., Li, B., Weise, T., Wang, J., Yun, B., Tian, Q.: Self-adaptive Learning Based on
Particle Swarm Optimization. Information Science 181, 4515–4538 (2011)
20. Lu, F., Ge, Y., Gao, L.: Self Adaptive Particle Swarm Optimization Algorithm for Global
Optimization. In: Sixth International Conference on Natural Computation, pp. 2692–2696.
IEEE Press (2010)
Efficient Public Key Generation for Homomorphic
Encryption over the Integers
1 Introduction
The contribution of Coron et al. [8] is in reducing the public key size of that scheme
from Õ(n^10) to Õ(n^7). A more expository survey of the recent advances in
homomorphic cryptography is given in [10].
In this work, an efficient variant of the underlying SHE scheme of [5] is presented,
using a comparatively smaller public key of size Õ(n^3). It is shown that the semantic
security of the proposed scheme is preserved under the two-element Partial
Approximate Greatest Common Divisor (PAGCD) problem. The proposed variant is
also proved compact, with a low ciphertext expansion of n^3. It is estimated that,
with these improvements, Homomorphic Encryption usage, and thus encrypted data
processing, becomes imminent for suitable applications that fall within the
multiplicative capacity of the proposed scheme. Due to space constraints, the proofs
of all theorems and lemmas are given in the appendix of the full version of this
paper.
The Somewhat Homomorphic Encryption over the integers [5], denoted HE in this
paper, consists of four algorithms: KeyGen, Encrypt, Decrypt, and Evaluate. The
sizes (bit lengths) of the various integers used in the scheme are denoted by the
parameters e, t, r, g, d, which represent the size of the secret key, the number of
elements in the public key, the size of the noise in the public key integers, the size
of each integer in the public key, and the size of the noise used for encryption,
respectively, and are polynomial in the security parameter n. The parameter setting
suggested in view of homomorphism and security is e = Õ(n^2), r = n, d = 2n,
g = Õ(n^5), and t = g + n. This makes the public key size Õ(n^10), because the
public key consists of t = Õ(n^5) integers, each of size g = Õ(n^5).
KeyGen(n): Choose a random e-bit odd integer from the right-open interval
[2^(e-1), 2^e) as the secret key P. For i = 0, 1, …, t, choose a random integer Qi
from [0, 2^g/P) and another integer Ri from the open interval (-2^r, 2^r), and
compute Xi = P·Qi + Ri, until the conditions X0 > X1, …, Xt; X0 mod 2 = 1; and
(X0 mod P) mod 2 = 0 are satisfied. Output the public key PK = (X0, X1, …, Xt)
and the secret key SK = P.
Encrypt(PK, M ∈ {0, 1}): Choose an integer B from (-2^d, 2^d) as noise for
encryption. Choose a subset J ⊆ {1, …, t}. Compute the sum S = ∑i∈J Xi. Output
the ciphertext as C = [M + 2(B + S)] mod X0.
Decrypt(SK, C): Compute M = (C mod P) mod 2.
Evaluate(PK, CKT, (C1, …, Ck)): Let CKT be the binary circuit to be evaluated,
representing a Boolean function f, with XOR and AND gates (i.e., CKT consists of
mod-2 addition and multiplication gates). Replace the XOR and AND gates of CKT
with addition and multiplication gates that operate over the integers. Let GCKT be
the resulting generalized circuit and fg the corresponding multivariate polynomial.
Apply GCKT over (C1, …, Ck) and output the resulting ciphertext
Cg = fg(C1, …, Ck).
EncryptSP(PK, M ∈ {0, 1}): For a plaintext bit M ∈ {0, 1}, choose a random even
integer N from the interval [2^(d-1), 2^d). The ciphertext is C = [M + N·X1] mod X0.
The EvaluateSP(PK, CKT, (C1, …, Ck)) and DecryptSP(SK, C) algorithms are the
same as those of the original HE.
The appealing feature of the scheme HESP is its relatively small public key, with
only two integers of size Õ(n^3) each. The encryption method is also comparatively
simple, because the product N·X1 corresponds to the HE operations of choosing a
random subset of the big set of public key elements, adding the elements of that
subset, multiplying the sum by 2, and adding an even noise term.
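A toy Java sketch of the two-element scheme (the parameter sizes here are illustrative miniatures, far below the Õ(n^2)/Õ(n^3) setting required for security, and the names are ours, not the paper's):

import java.math.BigInteger;
import java.security.SecureRandom;

public class ToyHESP {
    static final SecureRandom rnd = new SecureRandom();
    static final int E = 64, QBITS = 192, RBITS = 8, D = 16;  // toy bit lengths
    BigInteger p, x0, x1;

    void keyGen() {
        p = new BigInteger(E, rnd).setBit(E - 1).setBit(0);    // odd e-bit secret key P
        BigInteger q0 = new BigInteger(QBITS, rnd).setBit(0);  // odd q0, so x0 is odd
        BigInteger q1 = new BigInteger(QBITS, rnd);
        BigInteger r = new BigInteger(RBITS, rnd);             // small noise R
        x0 = p.multiply(q0);                                   // exact multiple of P
        x1 = p.multiply(q1).add(r);                            // near-multiple of P
    }

    BigInteger encrypt(int m) {                                // m in {0, 1}
        // random even N in [2^(d-1), 2^d), as in EncryptSP
        BigInteger n = new BigInteger(D, rnd).setBit(D - 1).clearBit(0);
        return BigInteger.valueOf(m).add(n.multiply(x1)).mod(x0);
    }

    int decrypt(BigInteger c) {
        return c.mod(p).testBit(0) ? 1 : 0;                    // (C mod P) mod 2
    }

    public static void main(String[] args) {
        ToyHESP he = new ToyHESP();
        he.keyGen();
        BigInteger c0 = he.encrypt(0), c1 = he.encrypt(1);
        // additive homomorphism: decrypting c0 + c1 yields 0 XOR 1 = 1
        System.out.println(he.decrypt(c0) + " " + he.decrypt(c1) + " " + he.decrypt(c0.add(c1)));
    }
}

Multiplying two such ciphertexts likewise decrypts to the AND of the plaintext bits, as long as the accumulated noise stays below P.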
Similar to the case of HE, the limits imposed on the sizes of the noise make the
scheme somewhat homomorphic. It is easy to see that HESP is a variant of HE for
the chosen parameter setting. The ciphertext in HE is M + 2B + PQ, and the ratio of
the size of P to the size of the noise (M + 2B) is Õ(n^2)/Õ(n) = Õ(n). Consider a
fresh ciphertext in HESP. We have C = [M + N·X1] mod X0 =
M + RN + P(NQ1 - KQ0) for some integer K ≥ 0. This can be written as
M + 2Bs + PQs, since RN is even, with Bs = RN/2 and Qs = NQ1 - KQ0. The ratio
between the size of P and the size of the noise (M + 2Bs) is Õ(n^2)/Õ(n) = Õ(n),
the same as in HE. Hence the two schemes are identical, differing only in the
methods of key generation and encryption. For EvaluateSP, corresponding to the
generalized circuit GCKT, we have the following notion of a permitted circuit.
Definition 1 (Permitted circuit). An arithmetic circuit with addition and
multiplication gates is called a permitted circuit for the scheme HESP if, for any set
of integer inputs each < 2^d in absolute value, the maximum absolute value output
by the circuit is < 2^(e-2). We denote the set of permitted circuits by PCKT.
Lemma 1. For the scheme HESP, ciphertexts resulting from EncryptSP, as well as
from EvaluateSP applied to a permitted circuit, decrypt correctly. □
Theorem 1. The encryption scheme HESP is correct, compact, and algebraically
homomorphic for any plaintext M ∈ {0, 1} and any circuit CKT ∈ PCKT. □
Since HESP is a variant of HE, we can follow the same strategy as [5] and [8] and
base the security of our proposition on the hard problem of solving a version of
GACD called the Partial Approximate Greatest Common Divisor (PAGCD) problem;
in [8] this problem is called the error-free approximate-GCD.
Definition 2 (Two-element Partial Approximate Greatest Common Divisor). The
two-element (r, e, g)-PAGCD problem is: for a random e-bit odd positive integer P,
given X0 = PQ0 and X1 = PQ1 + R, where Qi (i = 0, 1) and R are chosen from the
intervals [0, 2^g/P) and (-2^r, 2^r) respectively, output P.
The recent work of Chen and Nguyen [12] has shown that solving PAGCD is
relatively easier than solving GACD. However, as they mention, their attack's
implementation parameters are suboptimal for the medium and large challenges put
forth by Coron et al. [8]. Hence, if the security parameter n is appropriately chosen,
the PAGCD problem will be intractable, ensuring the semantic security of the
scheme. We have the following theorem, similar to [5], to base the security of our
scheme on the two-element PAGCD problem.
4 Known Attacks
In HESP, for a given security parameter n, the smallest problem instance available
for solving the PAGCD problem is the public key (X0, X1), because the noise in X1
is smaller than the noise in the ciphertexts for a particular instance of the scheme.
Therefore, the attacks against the two-element PAGCD problem, i.e., against the
public key only, are described, claiming that the high-noise ciphertexts (approximate
multiples of P) successfully withstand all these attacks.
Factoring the Exact Multiple. For the chosen parameter values, the size of the exact
multiple of P, i.e., X0, is big enough that even the best known integer factoring
algorithms, such as the General Number Field Sieve [13], will not be able to factor
X0. Even if the factor P, which is smaller than Q0, is targeted, algorithms such as
Lenstra's elliptic curve factoring [14] take about exp(O(√e)) time to find P. Note,
moreover, that P will not be recovered directly, as it is not necessarily prime and
may decompose further into smaller primes.
Brute-Force Attack on the Noise. Given the public key integers X0 and X1, a
simple brute-force attack is to choose an R from the interval (-2^r, 2^r), subtract it
from X1, and compute GCD(X0, X1 - R) each time, which may yield the required
secret integer P. In the worst case, this process needs to be repeated for all integers
R in the interval, so the complexity of this attack is 2^r · Õ(g) for g-bit integers.
Another integer more vulnerable to brute-force attack in HESP is the noise factor N
used during encryption. In fact, this integer clearly defines the overall security of
the scheme, because guessing this number breaks the scheme outright, without
guessing the secret integer P. The attack in this case is to choose every possible
even integer N from the interval mentioned and encrypt 0 with each such N and the
public key. Then, for a plaintext bit encrypted using some N, the
Continued Fractions and Lattice Based Attacks. Howgrave-Graham [11] described
two methods to solve the two-element PAGCD problem. In simple terms, the
continued-fraction-based approach (Algorithm 11, [11]) recovers P if the condition
R < P/Q is satisfied. Similarly, his lattice-based algorithm (Algorithm 12, [11])
recovers P if the condition R < P^2/(PQ)^ε is satisfied for some real number ε.
Also, as analyzed in [5] for the case of a two-element PAGCD problem, it is possible
to recover P when r/g is smaller than (e/g)^2. Since the parameter setting of HESP
does not satisfy these constraints, the concerned methods fail to recover the value
of P.
The General Common Divisors Attack. Consider Theorem 31.2 and its corollaries
discussed in [15]: GCD(X0, X1) is the smallest positive element of the set
{AX0 + BX1 : A, B ∈ ℤ}, since A and B can be any integers, including negative
numbers. Now, if a common divisor exists for both X0 and X1, it divides all possible
linear combinations of X0 and X1. Modular reduction of a ciphertext by such a
common divisor would yield the plaintext, because a ciphertext contains a linear
combination of X0 and X1. Therefore, taking the pair of integers X0, X1 to be
co-prime foils this attack.
As discussed earlier, the public key of HE contains Õ(n^5) elements, each Õ(n^5)
bits long; complete key generation therefore takes Õ(n^10) computations. Also, in
that scheme the bit length of a fresh ciphertext that encrypts a single bit is Õ(n^5),
leading to an expansion ratio of n^5.
The public key of the scheme HESP consists of only two elements of Õ(n^3) bits
each, making the complexity of key generation Õ(n^3), a considerable improvement
over the somewhat homomorphic schemes of [5] and [8]. Also, the encryption of an
Õ(n)-bit plaintext, which involves a multiplication of an Õ(n^3)-bit integer by an
Õ(n)-bit integer and a modular reduction by the Õ(n^3)-bit X0, takes Õ(n^3) steps.
Similarly, the bit complexity of decryption is roughly Õ(n^3). Therefore, the overall
complexity of the proposed variant HESP is Õ(n^3). A single plaintext bit is
embedded in a ciphertext of Õ(n^3) bits, making the expansion ratio, n^3, also
comparatively small. With these drastic improvements in bit complexity and
ciphertext expansion, this conceptually simple somewhat homomorphic scheme will
be suitable for many practical applications that involve simple functions for
homomorphic evaluation (the degree of the polynomial approximation of such
functions should be within the homomorphic evaluation capacity of the scheme).
6 Conclusions
In this paper, an efficient and, hopefully, practical variant of the existing Somewhat
Homomorphic Encryption over the integers is proposed. The improvement in
efficiency, from Õ(n^10) to Õ(n^3), is obtained by reducing the size of the public
key, which contains only two integers. The semantic security of the scheme is
analyzed thoroughly by reducing it to the hard problem of solving the two-element
Partial Approximate Greatest Common Divisor and describing all the known attacks.
With the improvement in bit complexity, it is expected that Homomorphic Encryption
usage, and thus encrypted data processing, becomes practically imminent.
References
1. Rivest, R., Adleman, L., Dertouzos, M.: On data banks and privacy homomorphisms. In:
Foundations of Secure Computation, pp. 169–180 (1978)
2. Gentry, C.: A Fully homomorphic encryption scheme. Ph.D. thesis, Stanford Univ. (2009)
3. Gentry, C.: Fully homomorphic encryption using ideal lattices. In: STOC 2009,
pp. 169–178. ACM (2009)
4. Smart, N.P., Vercauteren, F.: Fully Homomorphic Encryption with Relatively Small Key
and Ciphertext Sizes. In: Nguyen, P.Q., Pointcheval, D. (eds.) PKC 2010. LNCS,
vol. 6056, pp. 420–443. Springer, Heidelberg (2010)
5. van Dijk, M., Gentry, C., Halevi, S., Vaikuntanathan, V.: Fully Homomorphic Encryption
over the Integers. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 24–43.
Springer, Heidelberg (2010)
6. Gentry, C.: Computing arbitrary functions of encrypted data. Communications of the
ACM 53(3), 97–105 (2010)
7. GovindaRamaiah, Y., VijayaKumari, G.: State-of-the-art and Critique of Cloud
Computing. In: NCNGCIS 2011, pp. 50–60. IMS, Noida (2011)
8. Coron, J.-S., Mandal, A., Naccache, D., Tibouchi, M.: Fully Homomorphic Encryption
over the Integers with Shorter Public Keys. In: Rogaway, P. (ed.) CRYPTO 2011. LNCS,
vol. 6841, pp. 487–504. Springer, Heidelberg (2011)
9. Brakerski, Z., Gentry, C., Vaikuntanathan, V.: Fully Homomorphic Encryption without
Bootstrapping. Electronic Colloquium on Computational Complexity (ECCC) 18, 111 (2011)
10. Vaikuntanathan, V.: Computing Blindfolded: New Developments in Fully Homomorphic
Encryption, http://www.cs.toronto.edu/~vinodv/FHE-focs-survey.pdf
11. Howgrave-Graham, N.: Approximate Integer Common Divisors. In: Silverman, J.H. (ed.)
CaLC 2001. LNCS, vol. 2146, pp. 51–66. Springer, Heidelberg (2001)
12. Chen, Y., Nguyen, P.Q.: Faster algorithms for approximate common divisors: Breaking
fully homomorphic encryption challenges over the integers. Cryptology ePrint Archive,
Report 2011/436, http://eprint.iacr.org/2011/436
13. Briggs, M.: An Introduction to the General Number Field Sieve. Master’s Thesis, Virginia
Tech (April 1998),
http://scholar.lib.vt.edu/theses/available/etd-32298-93111/
14. Lenstra, H.: Factoring Integers with Elliptic Curves. Annals of Mathematics 126, 649–673
(1987)
15. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd
edn. MIT Press (2002)
Similarity Based Web Data Extraction and Integration
System for Web Content Mining
1 Introduction
The World Wide Web has become the largest knowledge base in human history. The
Web encourages decentralized authoring, in which users can create or modify
documents locally, making information publishing more convenient and faster than
ever. Because of these characteristics, the Internet has grown rapidly, creating a
new and huge medium for information sharing and exchange.
There are situations in which a user needs web pages to be available offline for
convenience, for reasons such as offline availability of data, limited download slots,
and storing data for future use. This essentially leads to downloading raw data from
web pages on the Internet, which forms a major set of inputs to the variety of
software available today for data mining.
In recent years there have been many improvements in technology, with products
differing in the slightest of terms. Every product needs to be tested thoroughly, and
the Internet plays a vital role in gathering information for effective analysis of the
products.
Several algorithms have been proposed to extract data from search engine result
pages, which contain both structured and unstructured data. ANDES [2] uses XML
technology for data extraction and provides access to the deep web. Xunhua Liu
et al. [3] proposed an algorithm based on the position of DIV elements to extract the
main text from the body of web pages. DEPTA [1] performs web data extraction
automatically in two steps: the first step identifies the individual records in a page
based on visual information and DOM tree matching; the second step aligns and
extracts data items from the identified records based on a partial alignment
technique. ONTOWRAPPER [4] is an ontological technique that uses an existing
lexical database for English to extract data records from deep web pages. Chia-Hui
Chang et al. [5] surveyed the major web data extraction approaches and compared
them in three dimensions: the task domain, the automation degree, and the technique
used. In these methods, the page containing the required data is crawled [6] and
then processed online. This leads to problems of offline unavailability of data,
limited download slots, etc., which can be overcome by using an offline browsing
mechanism [7].
In our approach, we replicate search result pages locally based on comparing page
URLs with a predefined threshold. The replication is such that the pages are
accessible locally in the same manner as on the web. To make the data available
locally to the user for analysis, we extract and integrate the data based on the
prerequisites defined in the configuration file.
Contribution: In a given set of web pages, it is difficult to extract matching data,
so we develop a tool capable of extracting the exact data from the web pages. In this
paper, we develop the WDES algorithm, which provides offline browsing of the
pages. We then integrate the downloaded content into a defined database and provide
a platform for efficient mining of the required data.
Given a start page URL and a configuration file, the main objective is to extract the
pages hyperlinked from the start page and integrate the required data for analysis
using data mining techniques. The user is assumed to have sufficient space on the
machine to store the downloaded data.
each of the search result pages obtained. For example, if a search result contains
150 records displayed as 10 records per page (15 pages of information in total), we
would have 15 sets of web documents, each containing 10 hyperlinks pointing to the
required data. This forms the set of web documents

W = {wi : 1 ≤ i ≤ n}.    (1)

Each web document wi ∈ W is read through to collect the hyperlinks contained in
it, which are to be fetched to obtain the data values. We represent this hyperlink set
as H(W); thus H(W) is the whole set containing the sets of hyperlinks on each page
wi ∈ W, i.e.,

H(W) = {H(wi) : 1 ≤ i ≤ n}.    (2)
Then, considering each hyperlink hj ∈ H(wi), we find the similarity SIM(hj, S)
between hj and S using equation (3), a token-level similarity between the two URLs
normalized to lie between 0 and 1. A page is downloaded when this similarity
reaches the threshold:

D(hj) = 1 if SIM(hj, S) ≥ To, and 0 otherwise.    (4)

The value SIM(hj, S) is compared with the defined threshold To (0.25), and we
download the page corresponding to hj to the local repository if SIM(hj, S) ≥ To.
The detailed algorithm of WDES is given in Table 1.
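Since the exact form of SIM is not reproduced here, the following is only one plausible instantiation: a token-overlap score between a candidate hyperlink and the start URL, yielding a value in [0, 1] (the names and tokenization are ours):

import java.util.*;

static double sim(String hyperlink, String startUrl) {
    Set<String> h = new HashSet<>(Arrays.asList(hyperlink.split("[/?&=.]+")));
    Set<String> s = new HashSet<>(Arrays.asList(startUrl.split("[/?&=.]+")));
    h.remove(""); s.remove("");          // drop empty tokens from leading delimiters
    if (s.isEmpty()) return 0.0;
    h.retainAll(s);                      // tokens shared with the start URL
    return (double) h.size() / s.size(); // fraction of start-URL tokens matched
}

// A page is fetched when sim(h, S) >= To, e.g. To = 0.25 as used above.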
The algorithm WDES navigates from the search result page given by the URL S and
the configuration file C and generates a set of web documents W. Next, the function
Hypcollection is called to collect the hyperlinks of all pages in wi, indexed by H(wi);
the page corresponding to H(wi) is stored in the local repository. The function
Webextract is called recursively for each H(wi). Then, for each hi ∈ H(wi), the
similarity between hi and S is calculated using equation (3); if SIM(hi, S) is greater
than the threshold To, the page corresponding to hi is stored and all the hyperlinks
in hi are collected into X. This process continues on X until it reaches the maximum
depth l.
Web Data Integration using Cosine Similarity (WDICS): The aim of this algorithm
is to extract data from the downloaded web pages (those available in the local
repository, i.e., the output of the WDES algorithm) into the database, based on the
attributes and keywords in the configuration file Ci. We collect all result pages W
from the local repository indexed by S; then H(W) is obtained by collecting all
hyperlinks from W, considering each hyperlink hj ∈ H(wi) such that k ∈ keywords
in Ci. On the existence of k in hj, we populate the new record set N[m, n] by parsing
page hj and obtaining the values defined with respect to the attributes[n] in Ci. We
then populate the old record set O[m, n] by obtaining all values with respect to
attributes[n] from the database. For each record i, 1 ≤ i ≤ m, we find the similarity
between N[i] and O[i] using the cosine similarity

SimRecord(N[i], O[i]) = (∑j N[i][j] · O[i][j]) / (√(∑j N[i][j]^2) · √(∑j O[i][j]^2)).    (5)
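A minimal sketch of SimRecord as in equation (5), assuming the attribute values of a record are encoded numerically:

static double simRecord(double[] n, double[] o) {
    double dot = 0, nn = 0, oo = 0;
    for (int j = 0; j < n.length; j++) {
        dot += n[j] * o[j];          // numerator: dot product
        nn += n[j] * n[j];           // squared norm of the new record
        oo += o[j] * o[j];           // squared norm of the old record
    }
    return (nn == 0 || oo == 0) ? 0 : dot / (Math.sqrt(nn) * Math.sqrt(oo));
}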
Input
S : Starting Page URL. C: Parameter Configuration File.
l : Level of Data Extraction. To: Threshold.
Output: Set of Webpages in Local Repository.
begin
W=Navigate to Web document on Given S and automate page with C
H(W)=Call: Hypcollection(W)
for each H(wi) ∊ H(w)
Save page H(wi) on local Machine page P
Call: Webextract(H(wi), 0, pageppath)
end for
end
Function Hypcollection(W)
begin
for each wi ∊ W do
H(wi)=Collect all hyperlinks in wi
end for
return H(W)
end
Function Webextract(Z, cl, lp)
Input
Z : set of URLs. cl : Current level. lp : local path to Z.
Output: Set of Webpages in Local Repository.
begin
for each hi ∊ Z do
if SIM(hi, S) ≥ To then
Save hi to Fhi
X=collect URLs from hi and change its path in lp
if( cl < l)
Call: Webextract(X, cl + 1, pageppath of X)
end if
end if
end for
end
3 Experimental Results
The experiment was conducted on the Bluetooth SIG Website [8], which contains
listings of Bluetooth products and its specifications and is a domain specific search
engine. We have collected data from www.bluetooth.org, which gives the listings of
qualified products of Bluetooth devices. Here, we have extracted pages on the date
range Oct-2005 to Jun-2011 consisting of total 92 pages, with each page containing
200 records of information. We were able to extract data from each of these pages.
Based on the data extracted on the given attribute mentioned in the configuration file,
we have a cumulative set of data for comparison.
Input
S : Starting Page URL stored in local repository (output of
WDES).
Ci : Configuration File (Attributes and Keywords).
Output: Integrated Data in Local Repository.
begin
H(w)=Call: Hypcollection(S)
for each H(wi) ∊ H(w) do
Call: Integrate(H(wi))
end for
end
Function Integrate(X)
Input: X : set of URLs.
Output: Integration of Values of Attributes Local Repository.
begin
for each hi ∊ X do
if( hi contain keyword) then
new[m][n]=parse page to obtain values of defined
attributes[n] in Ci
old[m][n]=obtain all values of attributes[n] from
repository
for each record i do
if(SimRecord(new[i], old[i])==1) Skip
end if
else
for each attribute j do
if ( new[i][j] Not Equal to old[i][j] )
IntegratedData=union(new[i][j],old[i][j])
end if
end for
store IntegratedData in local repository
end for
X=collect all links for hi
if (X not equal to NULL) Call: Integrate(X)
end if
end if
end for
end
The Precision and Recall are calculated based on the total available records on the
Bluetooth website, the records found by the search engine, and the records extracted
by our model. The Recall and Precision of DEPTA are 98.67% and 95.05%
respectively, and those of WDICS are 99.77% and 99.60% respectively, as shown
in Table 3. WDICS is more efficient than DEPTA because, when an object is
dissimilar to its neighboring objects, DEPTA fails to identify all records correctly.
4 Conclusions
Extraction of exact information from the web is an important issue in web mining.
We propose a Similarity based Web data Extraction and Integration System (WDES
and WDICS). The proposed approach includes extraction and integration of web
data; it provides faster data processing and effective offline browsing functionality,
which helps in saving time and resources. Integrating into the database helps in
extracting the exact content from the downloaded pages.
References
1. Yanhong, Z., Bing, L.: Structured Data extraction from the Web Based on Partial Tree
Alignment. Journal of IEEE TKDE 18(12), 1614–1627 (2006)
2. Jussi, M.: Effective Web Data Extraction with Standards XML Technologies. In:
Proceedings of the 10th International Conference on World Wide Web, pp. 689–696 (2001)
3. Xunhua, L., Hui, L., Dan, W., Jiaqing, H., Wei, W., Li, Y., Ye, W., Hengjun, X.: On Web
Page Extraction based on Position of DIV. In: IEEE 4th ICCAE, pp. 144–147 (2010)
4. Hong, J. L.: Deep Web Data Extraction. In: IEEE International Conference on Systems Man
and Cybernetics (SMC), pp. 3420–3427 (2010)
5. Chia-Hui, C., Moheb Ramzy, G.: A Survey of Web Information Extraction Systems.
Journal of IEEE TKDE 18(10), 1411–1428 (2006)
6. Tiezheng, N., Zhenhua, W., Yue, K., Rui, Z.: Crawling Result Pages for Data Extraction
based on URL Classification. In: IEEE 7th Web Information Systems and Application
Conference, pp. 79–84 (2010)
7. Ganesh, A., Sean, B., Kentaro, T.: OWEB: A Framework for Offline Web Browsing. In:
Fourth Latin America Web Congress. IEEE Computer Society (2006)
8. Bluetooth SIG Website, https://www.bluetooth.org/tpg/listings.cfm
Join Query Processing in MapReduce Environment
1 Introduction
MapReduce was proposed by Google [1]. Many complex tasks, such as parallelism,
fault tolerance, data distribution, and load balancing, are hidden from the user,
making it simple to use. Tasks are performed in two phases, Map and Reduce. Input
in the form of key/value pairs is processed by the Map function to produce
intermediate key/value pairs; intermediate values with the same key are then merged
by the Reduce function to form a smaller set of output values.
map(InputKey, InputValue) → list(IntermediateKey, IntermediateValue)
reduce(IntermediateKey, list(IntermediateValue)) → list(IntermediateValue)
The Map and Reduce functions are specified by the user, but their execution in the
distributed environment is transparent to the user. Hadoop is an open source
implementation of MapReduce [2], built on top of the Hadoop Distributed File
System (HDFS), which can handle petabytes of data [10]. Data blocks are replicated
over more than one location in the cluster to increase availability.
2 Related Work
The Map-Reduce-Merge framework [8] was designed to improve join processing; it
includes one more stage, called Merge, to join tuples from multiple relations. Join
performance can be improved by indexes; Hadoop++ [6] uses a Trojan Join and a
Trojan Index to improve join execution. The methods described in this paper apply
when data is organized row-wise; [9] describes join optimization algorithms for
column-wise data. The authors of [7] designed a query optimizer for Hadoop.
3 Join Processing
The join algorithms used by conventional DBMSs and by MapReduce differ, because
join execution in MapReduce uses Map and Reduce functions to compute the result.
This section describes various join processing algorithms for the MapReduce
environment, beginning with the reduce-side skeleton sketched below.
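As an illustration of how Map and Reduce cooperate in a join, a hedged Hadoop-style skeleton of the basic reduce-side (repartition) join between two relations R and S, assuming each input line has the form "tag,joinKey,payload" with tag "R" or "S" (the class and field names are ours):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class RepartitionJoin {
    // Map: key each tuple by its join attribute; the relation tag travels in
    // the value so the reducer can tell the two sides apart.
    public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] f = line.toString().split(",", 3); // tag, join key, payload
            ctx.write(new Text(f[1]), new Text(f[0] + "," + f[2]));
        }
    }

    // Reduce: all tuples sharing one join-key value meet here; emit the
    // cross product of the R-side and S-side tuples.
    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws IOException, InterruptedException {
            List<String> r = new ArrayList<>(), s = new ArrayList<>();
            for (Text v : values) {
                String t = v.toString();
                (t.startsWith("R") ? r : s).add(t.substring(2));
            }
            for (String a : r)
                for (String b : s)
                    ctx.write(key, new Text(a + "," + b));
        }
    }
}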
In the Trojan join [6], pairs from each of the two relations having the same join
attribute value are kept on the same split. The availability of data from both relations
with the same join key value on the same split makes it possible to execute the join
locally at the mapper node, without the shuffle and reduce phases, hence reducing
network communication. The join result is obtained by performing a cross product
between the data of a co-group in the Map phase. The execution of a Trojan join
between relations Passenger and Train is depicted in Figure 3.
A replicated join can be used when a tuple from one relation joins with many tuples
of other relations. k reduce processes are selected, where k = m·m, and the reducers
are numbered [i, j] with i, j = 1, 2, …, m. Tuples from R, S, and T are hashed into m
buckets by a hash function and sent to the reducers using the hashed values of the
join attributes B and C. The join is performed locally at each reducer, as the tuples
from R, S, and T with the same join attribute values all arrive at the reducer
numbered [hash(b), hash(c)]. The distribution of tuples to the reduce processes is
shown in Figure 4, where k = 16 = 4·4; the final results are marked in yellow. An
optimization algorithm to find the minimum number of replicas/reducers needed for
join execution was proposed and applied to star and chain joins in [4].
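A minimal sketch of this routing for a chain join R(A,B) ⋈ S(B,C) ⋈ T(C,D): S tuples go to exactly one reducer, while R and T tuples are replicated along the dimension whose join attribute they lack (the names and toy hash are ours):

import java.util.ArrayList;
import java.util.List;

static int bucket(int value, int m) { return Math.floorMod(value, m); } // toy hash

// Returns the [i, j] reducer coordinates a tuple is sent to.
static List<int[]> reducersFor(String relation, int b, int c, int m) {
    List<int[]> targets = new ArrayList<>();
    switch (relation) {
        case "S":                                   // S(b,c): exactly one reducer
            targets.add(new int[]{bucket(b, m), bucket(c, m)});
            break;
        case "R":                                   // R(a,b): replicate over all c-buckets
            for (int j = 0; j < m; j++) targets.add(new int[]{bucket(b, m), j});
            break;
        case "T":                                   // T(c,d): replicate over all b-buckets
            for (int i = 0; i < m; i++) targets.add(new int[]{i, bucket(c, m)});
            break;
    }
    return targets;
}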
4 Experiments
Experiments to evaluate the performance of the join processing algorithms are
described in this section. The authors of [3] evaluated performance for three
differently sized user and log relations on a cluster of 6 nodes with a split size of
64 MB; the Naive Asymmetric Join took half the time taken by the default Hadoop
join.
An Optimized Broadcast Join was performed between a user table and a log table
such that 50%, 70%, and 90% of the log table tuples were associated with the user
table [3]. The results showed that the time required for the semi-join was very small
compared to the actual join phase, and that the Optimized Broadcast Join performed
better than the Broadcast Join.
Experiments conducted in [5] on a 100-node cluster showed that the Improved
Repartition Join always performed better than the Repartition Join, and that the
performance of the Broadcast Join decreased as the number and percentage of
referenced tuples increased. The Semi-Join was not observed to perform better than
the Broadcast Join, because of the high overhead of scanning the entire table. The
scalability of the Improved Repartition Join, Broadcast Join, and Semi-Join was
observed to be linear.
The performance of a cascade of binary joins versus a three-way join using replication was evaluated by the authors of [4] on a four-node cluster; the processing times showed that the three-way join took less time than the cascade of two-way joins. Experiments conducted on the Amazon EC2 cloud showed that Trojan Join performed better than plain Hadoop [6]. With a split size of 1 GB, Hadoop++ outperformed Hadoop by a factor of 20, but for a split size of 256 MB the performance was alike; thus increasing the split size improved the performance of Hadoop++, but reduced fault tolerance.
When the schema and join condition are known in advance, the Trojan index and Trojan join should be used for better performance. A multiway (replicated) join is efficient for star and chain joins between more than two relations; otherwise a cascade of two-way joins is better. For a join between two relations where one relation is small enough to transmit efficiently over the network, Broadcast join is a good choice. When few tuples of a relation contribute to the join result, prefer Optimized Broadcast Join or Semi join; otherwise perform Repartition join. A sketch of this selection strategy follows.
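The selection heuristics above can be summarized as a small decision function. This is a sketch only; the parameter names and the selectivity threshold are illustrative assumptions, not values prescribed by the cited papers.

# Minimal sketch of the join-algorithm selection heuristics described above.
def choose_join(num_relations, schema_known, star_or_chain,
                smaller_fits_in_memory, join_selectivity):
    if schema_known:
        return 'Trojan index + Trojan join'           # co-partitioned data
    if num_relations > 2:
        return ('Replicated (multiway) join' if star_or_chain
                else 'Cascade of two-way joins')
    if smaller_fits_in_memory:
        return 'Broadcast join'                       # ship the small relation
    if join_selectivity < 0.5:                        # few tuples contribute
        return 'Optimized Broadcast join / Semi join'
    return 'Repartition join'

print(choose_join(2, False, False, False, 0.9))       # -> Repartition join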
6 Comparison
Consider a join performed between relations R1(a,b) and R2(b,c). Table 1 compares the above-mentioned join algorithms based on the number of MapReduce jobs required for execution, the advantages of using each method, and the issues involved.
7 Conclusion
This paper has described current research on optimizing join query processing in the MapReduce environment. A large body of work already exists for distributed, parallel and relational databases, which could be leveraged to further improve join execution in MapReduce. The algorithms described above do not provide a generic solution with optimal performance in all cases, so we proposed a join-algorithm selection strategy that can be applied dynamically based on parameters such as relation size, knowledge of the schema, and selectivity.
References
1. Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. In: OSDI 2004: Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation (2004)
2. Apache Foundation – Hadoop Project, http://hadoop.apache.org
3. Miao, J., Ye, W.: Optimization of Multi-Join Query Processing within MapReduce. In: 2010 4th International Universal Communication Symposium, IUCS (2010)
4. Afrati, F.N., Ullman, J.D.: Optimizing Multiway Joins in a Map-Reduce Environment. IEEE Transactions on Knowledge and Data Engineering 23(9) (2011)
5. Blanas, S., Patel, J.M., Ercegovac, V., Rao, J., Shekita, E.J., Tian, Y.: A Comparison of Join Algorithms for Log Processing in MapReduce. In: SIGMOD 2010, June 6–11. ACM, Indianapolis (2010)
6. Dittrich, J., Quiané-Ruiz, J.-A., Jindal, A., Kargin, Y., Setty, V., Schad, J.: Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing). In: Proceedings of the VLDB Endowment, vol. 3(1) (2010)
7. Wu, S., Li, F., Mehrotra, S., Ooi, B.C.: Query Optimization for Massively Parallel Data Processing. In: Symposium on Cloud Computing (SOCC 2011). ACM, Cascais (2011)
8. Yang, H.-C., Dasdan, A., Hsiao, R.-L., Parker, D.S.: Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters. In: SIGMOD 2007, June 12–14. ACM, Beijing (2007)
9. Zhou, M., Zhang, R., Zhou, D., Qian, W., Zhou, A.: Join Optimization in the MapReduce Environment for Column-wise Data Store. In: 2010 Sixth International Conference on Semantics, Knowledge and Grids. IEEE (2010)
10. Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The Hadoop Distributed File System. In: MSST 2010: Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (2010)
Applications of Hidden Markov Model
to Recognize Handwritten Tamil Characters
1 Introduction
to form composite letters, making a total of 247 different characters. In addition to the
standard characters, six characters taken from the Grantha script, which was used in
the Tamil region to write Sanskrit, are sometimes used to represent sounds not native
to Tamil, that is, words borrowed from Sanskrit, Prakrit and other languages. The
complete Tamil alphabet and composite character formations are given in Table 1.
Some 67 Tamil characters (vowels, consonants and composite letters) are identified as the basic characters; if these 67 characters are recognized, then all 247 characters can be recognized. The list of 247 characters is presented in Table 1.
2 Literature Survey
In [1] the author described a method for recognition of machine printed Tamil
characters using an encoded character string dictionary. The scheme employs string
features extracted by row- and column-wise scanning of character matrix. The
features in each row (column) are encoded suitably depending upon the complexity of
the script to be recognized.
In [2] the author has proposed an approach for hand-printed Tamil character
recognition. Here, the characters are assumed to be composed of line-like elements,
called primitives, satisfying certain relational constraints. Labeled graphs are used to
describe the structural composition of characters in terms of the primitives and the
relational constraints satisfied by them. The recognition procedure consists of
converting the input image into a labeled graph representing the input character and
computing correlation coefficients with the labeled graphs stored for a set of basic
symbols.
In [3] and [4] the authors attempt to use fuzzy concepts to classify handwritten Tamil characters as one of the prototype characters, using a feature called distance from the frame and a suitable membership function. The prototype characters are categorized into two classes: line characters/patterns and arc patterns. An unknown input character is first classified into one of these two classes and then recognized as one of the characters in that class.
In [5] a system was described to recognize handwritten Tamil characters using a
two stage classification approach, for a subset of the Tamil alphabet. In the first stage,
an unknown character was pre-classified into one of the three groups: core, ascending
and descending characters. Then, in the second stage, members of the pre-classified
group are further analyzed using a statistical classifier for final recognition.
In [6] a system was proposed to recognize printed characters, numerals and
handwritten Tamil characters using Fuzzy approach. In [7] the author proposed an
approach to use the fuzzy concept to recognize handwritten Tamil characters and
numerals. The handwritten characters are preprocessed and segmented into primitives.
These primitives are measured and labeled using fuzzy logic. Strings of a character
are formed from these labeled primitives. To recognize the handwritten characters,
conventional string matching was performed. However, the problem in this string
matching had been avoided using the membership value of the string.
In [8] the authors proposed a two stage approach. In the first stage, an unsupervised
clustering method was applied to create a smaller number of groups of handwritten
Tamil character classes. In the second stage, a supervised classification technique was
considered in each of these smaller groups for final recognition. The features
considered in the two stages are different.
In [9] an approach was proposed to recognize handwritten Tamil characters using
Neural Network. Fourier Descriptor was used as the feature to recognize the
characters. The system was trained using several different forms of handwriting
provided by both male and female participants of different age groups.
3 System Architecture
The scanned input document image is first passed through the preprocessing steps. After preprocessing, preliminary classification is performed for each character, features are extracted from each character, and the characters are then sent to the recognition stage, performed by a Hidden Markov Model (HMM), to obtain the recognized output.
[Fig.: Scanned Image -> Preprocessing (Binarization, Noise Removal & Segmentation) -> Feature Extraction -> Recognition using HMM]
The scanned image is preprocessed, i.e., the image is checked for skew correction,
then the image is binarized, then unwanted noise is removed and finally the characters
are segmented.
3.1.1 Binarization
Image binarization converts an image (up to 256 gray levels) to a black and white
image (0 or 1). Binarization is done using Modified Otsu Global Algorithm. This
algorithm is a combination of the Otsu global algorithm and the Sauvola algorithm. This method is both simple and effective. The algorithm assumes that the image to be thresholded contains two classes of pixels (e.g., foreground and background) and calculates the optimum threshold separating the two classes so that their combined spread (intra-class variance) is minimal. As in Otsu's method, we exhaustively search for the threshold that minimizes the intra-class variance, defined as a weighted sum of the variances of the two classes:

σ²w(t) = ω1(t)·σ1²(t) + ω2(t)·σ2²(t) (1)

where the weights ωi(t) are the probabilities of the two classes separated by a threshold t, and σi²(t) are the variances of these classes. Otsu shows that minimizing the intra-class variance is the same as maximizing the inter-class variance

σ²b(t) = σ² − σ²w(t) = ω1(t)·ω2(t)·[μ1(t) − μ2(t)]² (2)

which is expressed in terms of the class probabilities ωi and class means μi, which in turn can be updated iteratively.
Algorithm 1
Input: Scanned Image. Output: Binarized Image
Step 1: Compute the histogram and probabilities of each intensity level
Step 2: Set up initial ωi(0) and μi(0)
Step 3: Step through all possible thresholds t = 1 ... maximum intensity
1. Update ωi(t) and μi(t)
2. Compute σ²b(t)
Step 4: The desired threshold corresponds to the maximum of σ²b(t)
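A minimal Python/numpy sketch of Algorithm 1 follows; the function name and interface are assumptions for illustration.

import numpy as np

def otsu_threshold(gray):
    """Sketch of Algorithm 1: exhaustive Otsu threshold search.
    `gray` is a 2-D array of intensities in 0..255; returns (t, binary image)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                        # Step 1: intensity probabilities
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):                      # Step 3: try every threshold
        w1, w2 = p[:t].sum(), p[t:].sum()
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (levels[:t] * p[:t]).sum() / w1    # class means
        mu2 = (levels[t:] * p[t:]).sum() / w2
        var_b = w1 * w2 * (mu1 - mu2) ** 2       # inter-class variance, eq. (2)
        if var_b > best_var:                     # Step 4: keep the maximum
            best_t, best_var = t, var_b
    return best_t, (gray >= best_t).astype(np.uint8)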
For bad-quality images, global thresholding does not work well. In such cases we apply the Sauvola binarization technique (window-based), which calculates a local threshold for each image pixel (x, y) using the intensities of pixels within a small window W(x, y). The threshold T(x, y) is computed using the following formula:

T(x, y) = X · [1 + k·(σ/R − 1)]

where X is the mean of the gray values in the considered window W(x, y), σ is the standard deviation of the gray levels, R is the dynamic range of the variance, and k is a constant (usually 0.5, but it may be in the range 0 to 1).
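A minimal sketch of this window-based thresholding follows, using scipy's uniform filter for the local statistics; the window size w is an illustrative assumption, while k and R take the usual defaults stated above.

import numpy as np
from scipy.ndimage import uniform_filter

def sauvola_binarize(gray, w=15, k=0.5, R=128):
    """Sketch of Sauvola local thresholding: T = mean * (1 + k*(std/R - 1))."""
    g = gray.astype(np.float64)
    mean = uniform_filter(g, size=w)                 # local mean over W(x, y)
    sq_mean = uniform_filter(g * g, size=w)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0))  # local standard deviation
    T = mean * (1.0 + k * (std / R - 1.0))
    return (g >= T).astype(np.uint8)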
3.1.3 Segmentation
The preprocessed images are then made available to the segmentation process.
Algorithm 3
Input: Noise-Removed Image
Output: Segmented Image
Step 1: The gray-level image of a Tamil word is median filtered and then converted into a binary image using the Otsu threshold technique.
Step 2: Apply the skew detection and correction algorithm, then detect the headline and baseline as well as the bounding box.
Step 3: Detect the connected components of the word to be segmented.
Step 4: Trace the lower contour of each connected component anticlockwise; during this tracing the relevant features are extracted.
Step 5: Normalize the feature vectors; the MLP is trained with the normalized feature set.
After preprocessing, the image is cropped.

Table 2. Tamil characters categorized into different groups

The images of segmented characters are then rescaled to 32x32 pixels using bilinear interpolation. Each image is divided into N x M zones, and the image then enters the feature extraction stage, sketched below.
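The following minimal sketch illustrates this rescaling and zoning step; the per-zone feature (mean pixel density) is an assumption, since the exact zone feature is not specified here, and the input is assumed to be an 8-bit character image.

import numpy as np
from PIL import Image

def zone_features(char_img, n_zones=4, m_zones=4):
    """Rescale a segmented character to 32x32 with bilinear interpolation,
    split it into N x M zones, and return per-zone mean densities."""
    img = Image.fromarray(char_img).resize((32, 32), Image.BILINEAR)
    a = np.asarray(img, dtype=np.float64)
    zh, zw = 32 // n_zones, 32 // m_zones
    feats = [a[i * zh:(i + 1) * zh, j * zw:(j + 1) * zw].mean()
             for i in range(n_zones) for j in range(m_zones)]
    return np.array(feats)               # one feature vector per character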
This follows the segmentation phase of the character recognition system, where the individual image glyph is considered and its features are extracted. A character glyph is defined by the following attributes.
3.4.1.1 Data Preparation. In the data preparation stage, we first define a word network using a low-level notation called HTK Standard Lattice Format (SLF), in which each word instance and each word-to-word transition is listed explicitly. This word network can be created automatically from the grammar definition. At the same stage, we build a dictionary to create a sorted list of the characters or words to be trained. Then we use the tool HSGen to generate the prompts for test sentences. Note that the data preparation stage is required only for recognition; it is not used for training.
3.4.1.2 Training. The first task is to define a prototype for the HMM model to be trained. This task depends on the number of states and the extracted features of each character or word. The definition of an HMM must specify the model topology, the transition parameters and the output distribution parameters. HTK supports both continuous mixture densities and discrete distributions. In our application we use discrete distributions, as the observation state is finite for each frame (8 x 90 = 720 states). Then we initialize the estimates of the HMM model parameters. The model parameters contain the probability distribution of each model. By far the most prevalent probability density function is the Gaussian, which consists of means and variances; HTK uses this density function to define the model parameters. For this we invoke the tool HInit. After initialization is completed, the HMM model is written into an .mmf file that contains all the trained models and is used in recognition.
Fig. 4. Flowchart of the training and recognition procedure
4 Conclusion
Table 3. Recognition rate of the Tamil characters

Group     Characters tested   Correctly recognized   Recognition rate (%)
Group 1   70                  62                     88.57
Group 2   60                  53                     88.37
Group 3   65                  58                     89.23
Group 4   70                  62                     88.57
References
1. Siromoney, G., Chandrasekaran, R., Chandrasekaran, M.: Computer Recognition of Printed
Tamil Character. Pattern Recognition 10, 243–247 (1978)
2. Chinnuswamy, P., Krishnamoorthy, S.G.: Recognition of Hand printed Tamil Characters.
Pattern Recognition 12, 141–152 (1980)
3. Suresh, R.M., Ganesan, L.: Recognition of Hand printed Tamil Characters Using
Classification Approach. In: ICAPRDT, pp. 63–84 (1999)
4. Suresh, R.M., Arumugam, S., Ganesan, L.: Fuzzy Approach to Recognize Handwritten
Tamil Characters. In: International Conference on Computational Intelligence and
Multimedia Applications, pp. 459–463 (2000)
5. Hewavitharana, S., Fernando, H.C.: A Two Stage Classification Approach to Tamil
Handwriting Recognition. Tamil Internet 2002, pp. 118–124 (2002)
6. Suresh, R.M., Ganesan, L.: Recognition of Printed and Handwritten Tamil Characters Using
Fuzzy Approach. In: International Conference on Computational Intelligence and
Multimedia Applications, pp. 291–286 (2002)
7. Patil, P.M., Sontakke, T.R.: Rotation, Scale and Translation Invariant Handwritten
Devanagari Numeral Character Recognition Using General Fuzzy Neural Network. Pattern
Recognition 40, 2110–2117 (2007)
8. Bhattacharya, U., Ghosh, S.K., Parui, S.K.: A Two Stage Recognition Scheme for
Handwritten Tamil Characters. In: International Conference on Document Analysis and
Recognition, pp. 511–515 (2007)
9. Sutha, J., Ramaraj, N.: Neural Network Based Offline Tamil Handwritten Character
Recognition System. In: International Conference on Computational Intelligence and
Multimedia Applications, vol. 2, pp. 446–450 (2007)
Architectural Design and Issues for Ad-Hoc Clouds
Abstract. Effectively using, managing and harnessing data is key to the success of organizations in the time to come. We propose a cloud architecture that uses donation-based resources in a network and helps multiple organizations to collaborate and yet compete with each other. The resources are utilized non-intrusively. Organizations collaborate to create a data centre that does not harm their existence or profitability. At the same time, these organizations can compete by spreading to those locations where they carry a certain edge over others. This is where an ad-hoc cloud in a heterogeneous environment helps them venture into remote areas. To achieve this, an ad-hoc cloud architecture is proposed, along with the associated issues and strategies.
1 Introduction
Cloud computing is a computing paradigm where data and services reside in a common space in elastic data centers, and the services are accessible via authentication. It supports a "pay as you go" model. The services are composed using highly elastic and configurable resources. Cloud computing [18] services can form a strong infrastructural/service foundation framework for any kind of service-oriented computing environment. Ad-hoc clouds make existing infrastructure cloud compliant; the resources available in the environment are utilized non-intrusively. An education cloud [2], where a cloud computing framework is harnessed to manage the information system of an educational institution, would be highly efficient in terms of accessibility, manageability, scalability and availability. An ad-hoc cloud would enable us to harness both the services offered by a fixed education cloud and the services created and composed within the ad-hoc cloud itself.
An e-Education [13] system does not fit well in this scenario. As a solution to this problem, an ad-hoc cloud architecture is proposed that can rightly fit into the picture and serve the purpose. An ad-hoc cloud created at a remote site can be connected to the fixed cloud using an ad-hoc link; the ad-hoc cloud thus benefits from the existing services and cloud applications of the fixed cloud. But due to the ad-hoc connectivity, it needs to create its own data center and service composition environment, where it can persist and also process its data.
The cloud computing paradigm [14] is new, and there is a need to standardize the interfaces and methods of programming the cloud. Presently all the giants who have ventured into the cloud computing [18] paradigm (Microsoft Azure [11], IBM Blue Cloud [9], Amazon EC2 [1], salesforce.com [15], etc.) have their own way of implementing the cloud. Without standardization, interoperability will be a major problem.
2 Existing System
The closest comparison with our system is [12], which considers voluntarily donated resources reused as cloud-compliant resources; [12] discusses some of the challenges and some of their solutions. In [8] a similar concept, in which dispersed under-utilized systems are used to implement a data centre [19], is considered from an implementation perspective. Neither of the above mentions an application perspective from which the ad hoc cloud could be used; in our approach we have considered some of the challenges and proposed solutions to some of them.
Data Persistency
An ad-hoc data centre is proposed, having some Super (S) nodes, some Persistent (P) nodes and other Volunteer (V) nodes. S nodes are permanent, P nodes persistently store data on an ad-hoc basis, and V nodes voluntarily participate in the data centre. Mirroring is performed between S nodes, replication is performed between P nodes, and V nodes act as data sources (Fig. 2). A toy sketch of these roles follows.
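The following toy Python sketch models the proposed data flow; the class name, the replication factor and the write path are illustrative assumptions, not the authors' implementation.

# Toy model of the ad-hoc Data-centre node roles described above.
class AdHocDataCentre:
    def __init__(self, s_nodes, p_nodes):
        self.s_nodes = s_nodes               # permanent Super nodes (mirrored)
        self.p_nodes = p_nodes               # Persistent nodes (replicated)
        self.store = {n: {} for n in s_nodes + p_nodes}

    def write(self, key, value, replicas=2):
        # V nodes act only as data sources: data lands on P nodes first,
        # replicated across `replicas` of them ...
        for p in self.p_nodes[:replicas]:
            self.store[p][key] = value
        # ... and is mirrored on every S node.
        for s in self.s_nodes:
            self.store[s][key] = value

dc = AdHocDataCentre(['S1', 'S2'], ['P1', 'P2', 'P3'])
dc.write('lecture-42.mp4', b'...')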
Institutes can look forward to developing their own applications in various subjects, from elementary mathematics, science, physics and chemistry to advanced subjects such as mechanics and industrial engineering.
References
1. Amazon Web Services, http://aws.amazon.com
2. Dong, B., Zheng, Q., Qiao, M., Shu, J., Yang, J.: BlueSky Cloud Framework: An E-
Learning Framework Embracing Cloud Computing. In: Jaatun, M.G., Zhao, G., Rong, C.
(eds.) CloudCom 2009. LNCS, vol. 5931, pp. 577–582. Springer, Heidelberg (2009)
3. Eucalyptus, http://open.eucalyptus.com
4. Berman, F., Fox, G., Hey, T.: Education and the Enterprise With the Grid. In: Grid Com-
puting (May 30, 2003), doi:10.1002/0470867167.ch43
5. Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra,
T., Fikes, A., Gruber, R.E.: Bigtable: A Distributed Storage System for Structured Data.
In: OSDI, pp. 205–218 (2006)
6. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Si-
vasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-
value store. In: SOSP, pp. 205–220 (2007)
7. Google App Engine, http://appengine.google.com
296 S.K. Pippal, S. Mishra, and D.S. Kushwaha
8. Kirby, G., Dearle, A., Macdonald, A., Fernandes, A.: An Approach to Ad hoc Cloud Computing. CoRR abs/1002.4738 (June 2010)
9. IBM Cloud Computing, http://ibm.com/ibm/cloud
10. Mell, P., Grance, T.: The NIST definition of cloud computing (v15). Tech. rep., National
Institute of Standards and Technology (2009)
11. Microsoft Azure, http://www.microsoft.com/windowsazure
12. OpenNebula Project, http://www.opennebula.org
13. Pasatcha, P., Sunat, K.: A Distributed e-Education System Based on the Service Oriented
Architecture. In: 2008 IEEE International Conference on Web Services (2008)
14. Qian, L., Luo, Z., Du, Y., Guo, L.: Cloud Computing: An Overview. In: Jaatun, M.G.,
Zhao, G., Rong, C. (eds.) CloudComp 2009. LNCS, vol. 5931, pp. 626–631. Springer,
Heidelberg (2009)
15. Salesforce.com, http://salesforce.com
16. Das, S., Agarwal, S., Agrawal, D., El Abbadi, A.: ElasTraS: An Elastic, Scalable, and Self
Managing Transactional Database for the Cloud. Technical Report 2010-04, CS, UCSB
(2010)
17. Das, S., Nishimura, S., Agrawal, D., Ei Abbadi, A.: Albatross: Lightweight Elasticity in
Shared Storage Databases for the Cloud using Live Data Migration. In: the 37th Interna-
tional Conference on Very Large Data Bases, VLDB (2011)
18. Xia, T., Li, Z., Yu, N.: Research on Cloud Computing Based on Deep Analysis to Typical
Platforms. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) CloudCom 2009. LNCS, vol. 5931,
pp. 601–608. Springer, Heidelberg (2009)
19. Zimory GmbH: Building the flexible data centre, http://zimory.com
20. Zhang, Q., Cheng, L., Boutaba, R.: Cloud computing: state-of-the-art and research chal-
lenges. Journal of Internet Services and Applications 1, 7–18 (2010)
Periocular Region Classifiers
1 Introduction
With the exploration of the periocular region as a useful biometric trait, it is drawing a lot of attention in research studies [1, 2, 14]. Experiments have shown that the periocular region is one of the most discriminative regions of the human face. Periocular biometrics requires the analysis of periocular images for compliance with security-related applications. To support research in this area, periocular databases such as FRGC (Facial Recognition Grand Challenge), FERET (Facial Recognition Technology), MBGC (Multiple Biometrics Grand Challenge) and UBIRIS.v2, collected at different spectral ranges, lighting conditions, pose variations and distances, are available. From these periocular images, the region of interest is procured using a segmentation process and fed to the feature extraction algorithm. Feature extraction is a robust process that seeks distinguishing features of texture, color or size that are invariant to irrelevant transformations of the image. A feature extractor yields a representation to characterize the image. Various
feature extraction techniques such as Gradient Orientation Histogram (GOH), Local
Binary Patterns (LBP) [2, 16], Gabor Filters, Color Histograms [17], Walsh and
Laws’ mask, DCT, DWT, Force Field Transform and SURF are explored in
periocular biometric studies. The feature vectors provided by these feature extractors
are used by the classifiers to assign the object to a category. The abstraction provided
by the feature-vector representation enables the development of a largely domain
independent theory of classification. The degree of difficulty of the classification
problem depends on the variability in the feature values for periocular images in the
same category relative to the difference between feature values in different categories.
The next section focuses on the different classification techniques.
2 Classification Techniques
A classifier analyzes the numerical properties of image features and organizes them into categories. Classification algorithms typically employ two phases of processing: training and testing. In the initial training phase, characteristic properties of typical image features are isolated and, based on these, a unique description of each classification category (i.e., training class) is created. In the subsequent testing phase, these feature-space partitions are used to classify image features.
Scale Invariant Feature Transform (SIFT). SIFT transforms an image into a large collection of feature vectors, as shown in Figure 1 (right), each of which is invariant to image translation, scaling and rotation, partially invariant to illumination changes, and robust to local geometric distortion. SIFT features are extracted using Difference-of-Gaussian functions from a set of reference images and stored in a database. A new image is matched by individually comparing each feature from the new image to the database and determining candidate matching features. The best candidate match for each keypoint is its nearest neighbor in the database of keypoints from training images, i.e., the keypoint with minimum Euclidean distance between the invariant descriptor vectors [4, 5].
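A minimal numpy sketch of this nearest-neighbour descriptor matching follows; it assumes precomputed 128-dimensional SIFT descriptors as input and illustrates only the matching step, not a full SIFT pipeline.

import numpy as np

def match_descriptors(query, database):
    """Match each 128-D descriptor in `query` (n, 128) to its nearest
    neighbour in `database` (m, 128) by Euclidean distance."""
    # pairwise squared distances, computed without explicit loops
    d2 = ((query[:, None, :] - database[None, :, :]) ** 2).sum(-1)
    nn = d2.argmin(axis=1)        # index of the closest stored keypoint
    return nn, np.sqrt(d2[np.arange(len(query)), nn])

q = np.random.rand(5, 128)        # descriptors of a new image
db = np.random.rand(100, 128)     # descriptors from the reference images
idx, dist = match_descriptors(q, db)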
Linear Discriminant Analysis (LDA). LDA searches for the vectors in the underlying space of independent data features that best discriminate among classes (rather than those that best describe the data itself). It projects the data onto a hyperplane that minimizes the within-class scatter and maximizes the between-class scatter, as shown in Figure 1. Mathematically, these two measures are defined as the within-class scatter matrix and the between-class scatter matrix, given by equations 2 and 3 [8]:

S_W = Σ_{j=1}^{c} Σ_{i=1}^{N_j} (x_i^j − μ_j)(x_i^j − μ_j)^T (2)

where x_i^j is the ith sample of class j, μ_j is the mean of class j, c is the number of classes and N_j is the number of samples in class j;

S_B = Σ_{j=1}^{c} (μ_j − μ)(μ_j − μ)^T (3)

where μ represents the mean of all classes.
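A minimal numpy sketch of equations (2) and (3) follows; it is illustrative only, and the LDA directions would then be the leading eigenvectors of inv(S_W) S_B.

import numpy as np

def scatter_matrices(X, y):
    """Within-class (S_W, eq. 2) and between-class (S_B, eq. 3) scatter
    for data X (n_samples, n_features) with integer class labels y."""
    mu = X.mean(axis=0)                  # mean of all classes
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for j in np.unique(y):
        Xj = X[y == j]
        mu_j = Xj.mean(axis=0)
        diff = Xj - mu_j
        Sw += diff.T @ diff              # sum over the samples of class j
        Sb += np.outer(mu_j - mu, mu_j - mu)   # as in eq. (3)
    return Sw, Sb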
Principal Component Analysis (PCA). PCA is a standard technique used to approximate the original data with lower-dimensional feature vectors. It is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of uncorrelated variables called principal components. This transformation is defined in such a way that the first principal component has as high a variance as possible, and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to the preceding components [6, 7]. In principle, common properties implicitly existing in the training set, like gender, race, age and the usage of glasses, can be observed from these components.
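A minimal numpy sketch of this projection, via the singular value decomposition of the centred data, follows; it illustrates the technique and is not the implementation used in the cited studies.

import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its first `n_components`
    principal components, ordered by decreasing variance."""
    Xc = X - X.mean(axis=0)                  # centre the observations
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T          # component scores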
Multilayer Perceptron (MLP). The MLP is a feedforward network (no recurrent nodes) that maps a set of input data onto a set of outputs through multiple layers of nodes in a directed graph. Each node (except the input nodes) is a neuron with a nonlinear activation function. The MLP is trained with the backpropagation (BP) algorithm. The input vector x is transformed into the output vector y. The difference between the desired output d and the actual output y is computed as the error signal and propagated backwards through the entire network by updating the synaptic weights W and biases b. This updating brings the actual output closer to the desired output d [9].
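As an illustration of one such training step, the following minimal numpy sketch implements the forward pass and the backpropagated weight/bias update for a single hidden layer with sigmoid units and squared error; the layer sizes and learning rate are illustrative assumptions, not the network used in the cited work.

import numpy as np

rng = np.random.default_rng(0)
x, d = rng.random(8), rng.random(3)            # input vector and desired output
W1, b1 = rng.random((4, 8)) * 0.1, np.zeros(4)
W2, b2 = rng.random((3, 4)) * 0.1, np.zeros(3)
sig = lambda z: 1 / (1 + np.exp(-z))

h = sig(W1 @ x + b1)                           # forward pass, hidden layer
y = sig(W2 @ h + b2)                           # actual output
e2 = (y - d) * y * (1 - y)                     # error signal at the output
e1 = (W2.T @ e2) * h * (1 - h)                 # propagated backwards
lr = 0.5
W2 -= lr * np.outer(e2, h); b2 -= lr * e2      # update weights W and biases b
W1 -= lr * np.outer(e1, x); b1 -= lr * e1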
JointBoost Algorithm. The idea of this algorithm is that at each boosting round of a classification algorithm (C) such as AdaBoost, various subsets of classes S ⊆ C are examined and considered for fitting a weak classifier that distinguishes the subset from the background. The subset is picked so that it maximally reduces the error on the weighted training set over all the classes. The best weak learner h(v, c) is then added to the strong learners H(v, c) for all classes c ∈ S, and their weight distributions are updated so as to optimize the multiclass cost function

J = Σ_c Σ_i w_i^c · exp(−z_i^c · H(v_i, c))

where z_i^c is the membership label (±1) for class c [11].
3 Conclusion
This work presents a review of various classifier schemes suitable for categorizing a claimed identity using the periocular region. It covers both independent learning machines and fusions of classifiers, which form boosting algorithms that boost the performance of weak classifiers.
References
1. Park, U., Jillela, R., Ross, A., Jain, A.K.: Periocular Biometrics in the Visible Spectrum. IEEE Transactions on Information Forensics and Security 6(1) (March 2011)
2. Park, U., Ross, A., Jain, A.K.: Periocular biometrics in the visible spectrum: A feasibility study. In: Proc. Biometrics: Theory, Applications and Systems (BTAS), pp. 153–158 (2009)
3. Lyle, J.R., Miller, P., Pundlik, S., Woodard, D.: Soft Biometric Classification using Periocular Region Features. IEEE Transactions (2010)
4. Lowe, D.G.: Distinctive Image Features from Scale-Invariant Keypoints
5. Hollingsworth, K., Bowyer, K.W., Flynn, P.J.: Identifying useful features for recognition in Near Infrared Periocular Images. IEEE Transactions (2010)
6. Zhao, W., Krishnaswamy, A., Chellappa, R., Swets, D., Weng, J.: Discriminant Analysis of Principal Components for Face Recognition
7. Balci, K., Atalay, V.: PCA for gender estimation: Which Eigenvectors contribute? In: ICPR 2002 (2002)
8. Martinez, A.M., Kak, A.C.: PCA versus LDA. IEEE Transactions on PAMI (2001)
9. Seung, S.: Multilayer perceptrons and backpropagation learning (2002)
10. Friedman, J., Hastie, T., Tibshirani, R.: Additive Logistic Regression: A Statistical View of Boosting. The Annals of Statistics (2000)
11. Torralba, A., Murphy, K.P., Freeman, W.T.: Sharing features: efficient boosting procedures for multiclass object detection
12. Tu, Z.: Probabilistic Boosting Tree: Learning Discriminative Models for Classification, Recognition and Clustering
13. Cortes, C., Vapnik, V.: Support-Vector Networks. Machine Learning (1995)
14. Woodard, D., Pundlik, S., Miller, P., Lyle, J.R.: Appearance-based periocular features in the context of face and non-ideal iris recognition. Springer
15. Merkow, J., Jou, B., Savvides, M.: An Exploration of Gender Identification using only the periocular region. IEEE Transactions (2010)
16. Merkow, J., Jou, B., Savvides, M.: An Exploration of Gender Identification using only the periocular region. IEEE Transactions (2010)
17. Woodard, D., Pundlik, S., Lyle, J.R., Miller, P.: Periocular Region Appearance Cues for Biometric Identification. IEEE Transactions (2010)
Error Analysis and Improving the Speech Recognition
Accuracy on Telugu Language
1 Introduction
Speech is one of the easiest modes of interface between humans and machines. When interacting with a machine through speech, several factors affect the speech recognition system. Environmental conditions, prosodic variations, recording devices, speaker variations, etc., are some of the key factors that strongly influence the achievable recognition accuracy. Much effort has gone into increasing the performance of speech recognition systems. In spite of the increased performance, the output of a speech recognition system still contains many errors, and dealing with such errors is extremely difficult. Techniques are being investigated and applied to speech recognition systems to reduce the error rate and thereby increase recognition accuracy. It is very important to record the speech in a good environment with a sophisticated recording device, since background noise strongly influences recognition accuracy. Speakers should record the speech clearly, so that good acoustic signals are generated for use in both the training and decoding phases. It is
important to detect the errors in the speech recognition results and then correct them with suitable methods. This increases the accuracy of the speech recognition system by reducing the error rate. Particular care must be taken with the dictionary used to train the system. The pronunciation dictionary is a mapping table between the vocabulary terms and the acoustic models; it contains the words to be recognized. Incorrect pronunciations in the lexicon cause errors in the training phase of the speech recognition system, which in turn produce incorrect results in the decoding phase.
The Sphinx 3 speech recognition system is used for training and testing. Sphinx is a large-vocabulary, speaker-independent, continuous, HMM-based speech recognition system.
Hidden Markov Model (HMM)
An HMM is a method of estimating the conditional probability of an observation sequence given a hypothesized identity for the sequence. A transition probability gives the probability of moving from one state to another. After a particular transition occurs, an output probability defines the conditional probability of observing a set of speech features. In the decoding phase, the HMM is used to determine the sequence of (hidden) states (transitions) that occurred in the observed signal, and also the probability of observing the particular state sequence determined in the first step.
Learning Problem
The Baum-Welch algorithm finds the model parameters that maximize the probability of generating the observations, for a given model and a sequence of observations.
Evaluation Problem
The forward-backward algorithm is used to find the probability that the model generated the observations, for a given model and a sequence of observations.
Decoding Problem
The Viterbi algorithm is used to find the most likely state sequence in the model that produced the observations, for a given model and a sequence of observations. A minimal sketch follows.
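The following minimal Python/numpy implementation of the Viterbi algorithm recovers the most likely state sequence; the matrix names A, B and pi are conventional HMM notation assumed here, and the observations are given as indices into the emission table.

import numpy as np

def viterbi(A, B, pi, obs):
    """Most likely hidden-state sequence for observation indices `obs`,
    given transitions A (N x N), emissions B (N x M) and initial pi (N)."""
    N, T = A.shape[0], len(obs)
    delta = np.zeros((T, N))            # best path probability per state
    psi = np.zeros((T, N), dtype=int)   # best predecessor per state
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A       # best way into each state
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [delta[-1].argmax()]
    for t in range(T - 1, 0, -1):                # backtrack
        path.append(psi[t][path[-1]])
    return path[::-1]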
Probabilistic Formulation
Let A = {A1, A2, ..., At} be a sequence of acoustic observations and W = {W1, W2, ..., Wm} a sequence of words. Given the acoustic observations A, the probability of the word sequence W is P(W|A).

Bayes' rule: argmax_W P(W|A) = argmax_W P(W, A)/P(A) = argmax_W P(W, A) = argmax_W P(A|W) * P(W)

A model for the probability of the acoustic observations given the word sequence, P(A|W), is called an "acoustic model". A model for the probability of word sequences, P(W), is called a "language model".
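A toy Python sketch of this decoding rule follows; the candidate word sequences and the log-probabilities are made-up illustrative values, not outputs of the Sphinx system.

# Toy decoding: pick the word sequence maximizing log P(A|W) + log P(W).
acoustic = {'velle train': -12.3, 'velle rain': -11.9}   # log P(A|W)
language = {'velle train': -2.1,  'velle rain': -6.8}    # log P(W)

best = max(acoustic, key=lambda w: acoustic[w] + language[w])
print(best)   # 'velle train': the language model outweighs the acoustic score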
Error Analysis and Improving the Speech Recognition Accuracy on Telugu Language 303
3 Related Work
To study the number of words accessed and the relevant words among them, several measures are calculated. Classification Error Rate (CER), Detection Cost Function (DCF) and Detection Error Trade-off (DET) are calculated to determine the errors in a speech recognition system [1]. Metrics have been developed to evaluate error measures, which in turn describe the performance of the system; error-division diagrams are also used to convey useful information to the user by evaluating sensitivity and recall metrics [2]. Evaluation of speech recognition has become a practical issue; information retrieval is one such setting, where it is necessary to determine the word error rate and the accuracy of retrieving the desired document [3]. Hit rates are calculated to count the correct recognitions. The harmonic mean of precision and recall is one measure used to determine the cost to the user of different types of errors; the slot error rate is also calculated to overcome the limitations of the other measures [4]. Many errors occur at different stages. The lexicon also plays a key
role in speech recognition. The pronunciation dictionary consists of all the possible pronunciations of all the speakers. Different speakers may pronounce the same word differently, and in some cases the same speaker pronounces the same word differently in different contexts, due to dialectal variations, educational background, emotional conditions and so on. These variations increase the word error rate; if the training data covers all variations, there is a higher probability of improving the accuracy [5]. The dictionary should therefore be developed for all possible pronunciations. The pronunciation context and the frequency of a word also affect the accuracy of the system [6]. Compound and individual words in the training and testing data influence the accuracy of the system. Proper training is required so that all the language models are built properly. Errors corresponding to the transcript must be removed during the computation of the word error rate [7] [8].
If all the acoustic models were exactly mapped to vocabulary units, the effective word error rate would be zero, but practically this is difficult to achieve. Mis-recognized words occur because training cannot cover every pronunciation variation of every speaker, which lowers the performance of the speech recognition system [9]. Stochastic learning of lexicons is necessary for spontaneous speech, lecture speech, etc. [10]. Accent and speech rate also influence the pronunciation dictionary; vowel compression and expansion are commonly observed and are very difficult to represent in the pronunciation dictionary [11]. These also cause more confusion pairs, which degrades the performance of the system. Confusion pairs increase enormously in the out-of-vocabulary case [12].
The following error analyses are taken from the speech decoder to identify the types of errors. REF indicates the reference transcription used by the Sphinx speech recognition system, and HYP indicates the hypothesis obtained from the Sphinx decoder.
Type 1
Misrecognition occurs due to the substitution of one word in the place of the original word. This substitution reduces the performance of the speech recognition system.
REF: CHITTOORKI velle train peyremiti
HYP: CHITTOORKU velle train peyremiti
Type 2
Misrecognition occurs because of the substitution of multiple words in the place of an original single word, together with the insertion of a new word, which reduces accuracy.
REF: thirumala ekspres eppudu VASTHUNDHI
HYP: thirumala ekspres eppudu EEST VUNDHI
Type 3
Misrecognition occurs due to the substitution of a single word in the place of multiple words. This degrades the accuracy of the system.
REF:ANOWNSEMENT ELA CHEYALI
HYP:GANTALAKU VELUTHUNDHI
Type 4
This type of error occurs in the out-of-vocabulary (OOV) situation, where the decoder sometimes fails to map to an approximate word.
REF: AJHANTHA EKSPRES EKKADIKI VELUTHUNDHI
HYP: ENTHA EKPRES EKKADIKI VELUTHUNDHI
After analyzing the types of errors, it is necessary to recover from them to improve accuracy. The knowledge sources of the acoustic model, lexicon and language model need improvement. Error patterns are observed from the confusion pairs obtained from the decoder of the speech recognition system. The higher the frequency of an error pattern in the confusion pairs, the more test sentences are recognized incorrectly. Confusion pairs are useful for analyzing the errors in the recognition results: they collect the frequency of every possible recognition error. The more two words are confused with each other, the closer they are. This generally refers to the hits, substitutions, deletions and insertions. If the frequency of a confusion pair is n, then that particular word is recognized incorrectly n times, so it is necessary to reduce n; the number of confusion pairs also increases as n increases. To reduce the number of confusion pairs, a recovery technique called the pronunciation dictionary modification method (PDMM) is applied. This method reduces the frequency of the error patterns in the confusion pairs, and thereby the error rate.
7 Experimental Results
7.2 Results
The following table shows the percentage accuracy and the number of confusion pairs before and after modification of the dictionary.
Table 1. No. of confusion pairs and % accuracy (before and after PDMM)
From the speech recognition decoder, the substitutions (SUB), insertions (INS), deletions (DEL), misrecognitions and total errors are collected for the given data. The Word Error Rate (WER) is determined as WER = (SUB + DEL + INS) / N, where N is the total number of words in the reference transcription; ERROR-RATE is determined following [4].
From tables 2 and 3, the hits (number of correctly recognized words) and false alarms (number of words incorrectly recognized as true) are determined. The hit rate improves and the false alarm rate falls with the modification of the dictionary, as shown in the following figures.
Fig. 1. Hit rates before and after PDMM
Fig. 2. False alarms before and after PDMM
7.3 F-Measure
Precision and recall are used to measure the performance of the system. Precision is the ratio of the correctly recognized words (C) to the total including substitution (S) and insertion (I) errors; recall is the ratio of the correctly recognized words (C) to the total including substitution and deletion (D) errors. The F-measure is a weighted combination of precision and recall, also used as an error measure to evaluate the performance of the system:

Precision = C / (C + S + I) (3)
Recall = C / (C + S + D) (4)
F-measure = (2 * Precision * Recall) / (Precision + Recall) (5)
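These measures can be computed directly from the decoder's counts; the following minimal sketch implements equations (3)-(5), with illustrative counts in the usage line.

def error_measures(C, S, D, I):
    """Precision, recall and F-measure from counts of correct words (C),
    substitutions (S), deletions (D) and insertions (I)."""
    precision = C / (C + S + I)
    recall = C / (C + S + D)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

print(error_measures(C=900, S=60, D=25, I=15))  # illustrative counts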
Table 4. Precision, Recall and F-measure values before and after PDMM
From tables 2, 3 and 4, it is observed that E <= ERROR-RATE <= WER [4], as shown in the following figures.
[Figures: WER, ERROR-RATE and E values for the test sets 1S-10S]
8 Conclusion
In this paper, errors are analyzed in order to apply techniques that improve the accuracy of the speech recognition system, and error measures are discussed.
References
1. Pellegrini, T., Trancoso, I.: Improving ASR error detection with non-decoder based features. In: INTERSPEECH, pp. 1950–1953 (2010)
2. Minnen, D., Westeyn, T., Starner, T., Ward, J.A., Lukowicz, P.: Performance Metrics and
Evaluation Issues for Continuous Activity Recognition. Performance Metrics for
Intelligent System, 141–148 (2006)
3. McCowan, I., Moore, D., Dines, J., Gatica-Perez, D., Flynn, M., Wellner, P.: On the Use
of Information Retrieval Measures for Speech Recognition Evaluation. IDIAP Research
Report (2005)
308 N. Usha Rani and P.N. Girija
4. Makhoul, J., Kubala, F., Schwartz, R., Weischedel, R.: Performance Measures for Information Extraction. In: DARPA Broadcast News Workshop, pp. 249–252 (1999)
5. Bourlard, H., Hermansky, H., Morgan, N.: Towards increasing speech recognition error
rates. Speech Communication, 205–231 (1996)
6. Davel, M., Martirosian, O.: Pronunciation Dictionary Development in Resource-scarce
Environments. In: INTERSPEECH, pp. 2851–2854 (2009)
7. Chen, Z., Lee, K.-F., Lee, M.-J.: Discriminative Training on Language Model. In: ICSLP
2000, pp. 16–20 (2000)
8. Kuo, H.-K.J., Fosler-Lussier, E., Jiang, H., Lee, C.-H.: Discriminative Training of Language Models for Speech Recognition. In: ICASSP 2002, pp. 325–328 (2002)
9. Martirosian, O.M., Davel, M.: Error analysis of a public domain pronunciation dictionary.
In: PRASA, pp. 13–16 (2007)
10. Badr, I., McGraw, I., Glass, J.: Pronunciation Learning from Continuous Speech. In:
INTERSPEECH, pp. 549–552 (2011)
11. Benus, S., Cernak, M., Rusko, M., Trnka, M., Darjaa, S.: Adapting Slovak ASR for native
Germans speaking Slovak. In: EMNLP, pp. 60–64 (2011)
12. Karanasou, P., Yvon, F., Lamel, L.: Measuring the confusability of pronunciations in speech recognition. In: 9th International Workshop on Finite State Methods and Natural Language Processing, pp. 107–115. Association for Computational Linguistics (2011)
Performance Evaluation of Evolutionary and Decision
Tree Based Classifiers in Diversity of Datasets
Pardeep Kumar1, Vivek Kumar Sehgal1, Nitin1, and Durg Singh Chauhan2
1
Department of Computer Science & Engineering,
Jaypee University of Information Technology, Waknaghat, Solan (H.P), India
2
Department of Computer Science & Engineering, Institute of Technology, Banaras Hindu
University, Banaras(U.P), India. Currently with Uttrakhand Technical University,
Dehradun (UK), India
pardeepkumarkhokhar@gmail.com,
{vivekseh,delnitin}@ieee.org, pdschauhan@acm.org
Abstract. Large databases of digital information are ubiquitous. Data from the neighborhood store's checkout register, your bank's credit card authorization device, records in your doctor's office, patterns in your telephone calls and many more applications generate streams of digital records archived in huge databases, sometimes in so-called data warehouses. A new generation of computational techniques and tools is required to support the extraction of useful knowledge from these rapidly growing volumes of data. These techniques and tools are the subject of the emerging field of knowledge discovery in databases (KDD) and data mining. Data mining plays an important role in discovering information to support the decision making of a decision support system, and it has been an active area of research in the last decade. Classification is one of the important tasks of data mining; different kinds of classifiers have been suggested and tested to predict future events from unseen data. This paper compares the performance of an evolutionary genetic algorithm and decision tree based classifiers on a diversity of datasets. The performance evaluation metrics are predictive accuracy, training time and comprehensibility. The evolutionary classifier shows better comprehensibility than the decision tree based classifiers, while the classifiers show almost the same predictive accuracy. Experimental results demonstrate that evolutionary classifiers are slower than decision tree based classifiers. This research helps organizations select classifiers as information generators for their decision support systems when making future policies.
1 Introduction
Information plays a vital role in business organizations. Today’s business is
information hungry. Information can be used by the top level management for
decision making to frame future policies. Due to the rapidly increasing size of organizational data, manual interpretation of data for information discovery is not feasible.
Over the last three decades, data mining has been growing on the map of computer science. It deals with the discovery of hidden knowledge, unexpected patterns and new rules from large databases. Data mining is regarded as the key element of a much more elaborate process called Knowledge Discovery in Databases (KDD), which is defined as the non-trivial process of identifying valid, novel, and ultimately understandable patterns in large databases [1]. One of the important tasks of data mining is classification. The conventional classifiers used for classification are decision trees, neural networks, and statistical and clustering techniques. There is a lot of research going on in the machine learning and statistics communities on classifiers for classification. In the recent past, there has been increasing interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD), and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated.
The STATLOG project [2] found that no classifier is uniformly most accurate over the datasets studied and that many classifiers possess comparable accuracy. Earlier comparative studies put emphasis on the predictive accuracy of classifiers; other factors like comprehensibility and the classification index are also becoming important. Breslow and Aha surveyed methods of decision tree simplification to improve comprehensibility [3]. Brodley and Utgoff; Brown, Corruble, and Pittard; Curram and Mingers; and Shavlik, Mooney and Towell have also done comparative studies in the domain of classifiers [4-7]. Saroj and K.K. Bhardwaj have done excellent work on GA's ability to discover production rules and censor-based production rules [8]. No single method has been found to be superior to all others for all datasets. Issues such as accuracy, training time, robustness and scalability must be considered and can involve tradeoffs, further complicating the quest for an overall superior method.
This paper compares an evolutionary genetic algorithm and decision tree based classifiers (CHAID, QUEST and C4.5) on four datasets (Mushroom, Vote, Nursery and Credit) taken from the University of California, Irvine, Repository of Machine Learning Databases (UCI) [9].
2 The Classifiers
CHAID, QUEST and C5.0 are decision tree based classifiers [10-12]. Genetic
algorithm is the evolutionary approach based classifier [13-17].
3 Experimental Setup
Four datasets (Mushroom, Vote, Nursery and Credit) from real domains are used in this research; they are available from the UCI machine learning repository [9]. Predictive accuracy, training time and comprehensibility are the parameters used for performance evaluation of the underlying classifiers [1, 10-11]. The decision tree based classifiers were tested using Clementine 10.1 on a Windows XP platform; the GA was tested using the GALIB 245 simulator on a Linux platform.
4 Results
5 Conclusion
References
1. Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P.: The KDD process for extracting useful knowledge from volumes of data. Communications of the ACM 39(11), 27–34 (1996)
2. King, R.D., Feng, C., Sutherland, A.: STATLOG. Comparison of classification algorithms
on large real-world problems. Applied Artificial Intelligence 9(3), 289–333 (1995)
3. Breslow, L.A., Aha, D.W.: Simplifying decision trees: A survey. Knowledge Engineering
Review 12, 1–40 (1997)
4. Brodley, C.E., Utgoff, P.E.: Multivariate versus univariate decision trees, Department of
Computer Science, University of Massachusetts, Amherst, MA. Technical Report 92-8
(1992)
5. Brown, D.E., Corruble, V., Pittard, C.L.: A comparison of decision tree classifiers with
back propagation neural networks for multimodal classification problems. Pattern
Recognition 26, 953–961 (1993)
6. Curram, S.P., Mingers, J.: Neural networks, decision tree induction and discriminant
analysis: An empirical comparison. Journal of the Operational Research Society 45, 440–
450 (1994)
7. Shavlik, J.W., Mooney, R.J., Towell, G.G.: Symbolic and neural learning algorithms: an
empirical comparison. Machine Learning 6, 111–144 (1991)
8. Saroj, Bhardwaj, K.K.: A parallel genetic algorithm approach for automated discovery of
censored production rules. In: AIAP 2007 Proceedings of the 25th Conference on
Proceedings of the 25th IASTED International Multi-Conference: Artificial Intelligence
and Applications, pp. 435–441 (2007)
9. UCI Repository of Machine Learning Databases. Department of Information and
Computer Science University of California (1994),
http://www.ics.uci.edu/~mlearn/MLRepositry.html
10. Han, J., Kamber, M.: Data mining: concepts and techniques: Book (Illustrated), 550 pages
(January 2001) ISBN-10: 1558604898, ISBN-13: 9781558604896
11. Quinlan, J.R.: Induction of decision trees. Machine Learning 1(1), 81–106 (1986)
12. Lim, T.S., Loh, W.Y., Shih, Y.S.: A Comparison of Prediction Accuracy, Complexity, and
Training Time of Thirty-Three Old and New Classification Algorithms. Journal of
Machine Learning 40, 203–228 (2000)
13. Goldberg, D.E.: Genetic algorithms in search, optimization and machine learning.
Addison-Wesley (1989)
14. Deb, K.: Genetic Algorithm in search and optimization: The techniques and Applications.
In: Proceeding of Advanced Study Institute on Computational Methods for Engineering
Analysis and Design, pp. 12.1–12.25 (1993)
15. Saroj: Genetic Algorithm: A technique to search complex space. In: Proceedings of
National Seminar on Emerging Dimension in Information Technology, August 10-11, pp.
100–105 (2002)
16. Freitas, A.A.: A survey of evolutionary algorithms for data mining and knowledge discovery, pp. 819–845. Springer, New York (2003)
17. Freitas, A.A.: Data Mining and Knowledge Discovery with Evolutionary Algorithms, 265 pages (2002) ISBN: 978-3-540-43331-6
An Insight into Requirements Engineering Processes
1 Introduction
Requirements engineering (RE) is concerned with identifying the goals to be achieved by the system. Much software fails because of incomplete, inconsistent, or ambiguous requirements, and a clear understanding of the RE process helps in developing a successful software system. The objective of this article is to give an insight into the requirements engineering processes, i.e., requirements elicitation, requirements modeling, requirements analysis, requirements verification & validation, and requirements management. Traditional requirements engineering treats requirements as consisting only of processes and data, making it difficult to understand requirements with respect to high-level concerns in the problem domain. Traditional modeling and analysis techniques do not allow alternative system configurations, where more or less functionality is automated or different assignments of responsibility are explored. Goal Oriented Requirements Engineering (GORE) attempts to solve these problems [12]. Goals have long been used in Artificial Intelligence (AI). Lamsweerde defines a goal as an "objective that the system should achieve through cooperation of agents in the software-to-be and in the environment" [11].
The rest of this paper is organised as follows: Section 2 gives a brief description of requirements engineering, Section 3 describes the requirements engineering processes in detail, and Section 4 concludes the discussion.
2 Requirements Engineering
Zave [18] defines the “requirement engineering as the branch of software engineering
concerned with the real world goal for functions of and constraints on software
systems. It is also concerned with the relationship of these factors to precise
specification of software behaviour, and to their evolution over time and across
software families”. Brooks [3] states that “the hardest single part of building a
software system is deciding precisely what to build. Therefore, the most important
function that the software builder performs for the client is the iterative extraction and
refinement of the product requirements”.
[Figure: Requirements engineering, a process which determines the requirements for a software system, divided into sub-processes]
4 Conclusion
This paper presents a comprehensive view of the requirements engineering processes. Requirements elicitation techniques are employed to understand the goals and objectives of building a software system. Requirements modeling is used to represent the requirements; most GORE approaches support modeling notations for requirements representation, such as AGORA, i*, NFR (for modeling the non-functional requirements), KAOS, GBRAM, and TROPOS. It is difficult to elicit the high-level objectives of an enterprise using traditional requirements engineering processes; therefore, Goal Oriented Requirements Engineering (GORE) is used to elicit them. Requirements analysis is the process of determining the user expectations for the software system. The requirements verification and validation (RV&V) model ensures that the developed product meets the needs of the users. Requirements management covers requirements documentation, communication, prioritization, and controlling change in the requirements. Fractional knowledge of the RE processes may lead to the failure of software projects.
References
1. Albayrak, O., Hulya, Bicakci, M.: Incomplete Software Requirements and Assumptions
made by Software Engineers. In: 16th Asia Pacific Software Engineering Conference, pp.
333–339 (2009)
2. Bell, T.E., Thayer, T.A.: Software Requirements: Are They Really a Problem? In: ICSE-2:
2nd International Conference on Software Engineering, San Francisco, pp. 61–68 (1976)
3. Brooks, F.P.: No Silver Bullet: Essence and Accidents of Software Engineering. IEEE
Computer 20(4), 10–19 (1987)
4. Cheng, B.H.C., Atlee, J.M.: Current and Future Research Directions in Requirements
Engineering. In: Lyytinen, K., Loucopoulos, P., Mylopoulos, J., Robinson, B. (eds.)
Design Requirements Workshop. LNBIP, vol. 14, pp. 11–43. Springer, Heidelberg (2009)
5. Davis, C.J., et al.: Communication Challenges in Requirements Elicitation and the use of
the Repertory Grid Technique. Journal of Computer Information System, Special Issue,
78–86 (2006)
6. Goguen, J.A., Linde, C.: Techniques for Requirements Elicitation. In: Proceedings,
Requirement Engineering, pp. 152–164 (1993)
7. Hickey, A.M., Davis, A.M.: Elicitation Technique Selection: How Do Experts Do It? In:
11th IEEE International Conference on Requirement Engineering (2003)
8. Jiang, L., Eberlein, A., Far, B.H.: Combining Requirements Engineering Techniques –
Theory and Case Study. In: Proceedings of the 12th IEEE International Conference and
Workshops on the Engineering of Computer-Based Systems (ECBS 2005), pp. 105–112
(2005)
9. Keller, T.: Contextual Requirements Elicitation: An Overview. Seminar in Requirement
Engineering, Department of Informatics. University of Zurich
10. Kukkanen, J., Vakevainen, K., Kauppinen, M., Uusitalo, E.: Applying a systematic
Approach to Link Requirements and Testing: A Case Study. In: 16th IEEE Asia pacific
Software Engineering Conference, pp. 482–488 (2009)
11. Lamsweerde, A.V.: Requirements Engineering in the Year 00: A Research perspective. In:
22nd International Conference on Software Engineering (2000)
12. Lapouchnian, A.: Goal Oriented Requirements Engineering: An Overview of the Current
Research. Technical Report, Department of Computer Science, University of Toronto
(June 28, 2005)
13. Nurmuliani, N., Zowghi, D., Williams, S.P.: Using card Sorting technique to Classify
Requirements Change. In: Proceedings of the 12th IEEE International Requirement
Engineering Conference (RE 2004) (2004)
14. Nuseibeh, B., Easterbrook, S.: Requirements Engineering: A Roadmap. In: Proceedings of
the Conference on the Future of Software Engineering, New York (2000)
15. Rumbaugh, J., Jacobson, I., Booch, G.: The Unified Modelling Language Reference
Manual, 2nd edn. Pearson Education (2004)
16. Sindre, G., Opdahl, A.L.: Eliciting Security Requirements with Misuse Cases.
Requirement Engineering Journal, 34–44 (2005)
17. Uusitalo, E., Komssi, M., Kauppinen, M., Davis, A.M.: Linking Requirements and Testing
in Practice. In: 16th IEEE International Requirements Engineering Conference, pp. 265–
270 (2008)
18. Zave, P.: Classification of Research Efforts in Requirements Engineering. ACM
Computing Surveys, 315–321 (1997)
Design Challenges in Power Handling Techniques
in Nano-Scale CMOS Devices
1 Introduction
To achieve lower power consumption, CMOS devices have been scaled down for more
than 30 years. Transistor delay times decrease by more than 30% per technology
generation, resulting in a doubling of microprocessor performance every two years [3].
The supply voltage has been scaled down in order to keep the power consumption under
control.
For a CMOS circuit, the total power dissipation includes dynamic and static
components. In the standby mode, the power dissipation is due to the standby leakage
current. Dynamic power dissipation consists of two components. One is the switching
power due to charging and discharging of load capacitance. The other is short circuit
power due to the nonzero rise and fall time of input waveforms. The static power of a
CMOS circuit is determined by the leakage current through each transistor. The
dynamic (switching) (Pd) power and leakage power (Pleak) are expressed as
Pd = α f C V²dd ,  Pleak = ∑ (leakage current) × (supply voltage)  (1)

Psc = k f τ (Vdd − 2Vt)³ / 12  (2)
where α is the switching activity, f is the operating frequency, C is the load capacitance,
Vdd is the supply voltage, and Ileak is the cumulative leakage current due to all the
leakage components. Leakage current (and hence leakage power) increases dramatically
in scaled devices. In particular, with the reduction of the threshold voltage (to achieve
high performance), leakage power becomes a significant component of the total power
consumption in both the active and standby modes of operation.
Total power consumption is the sum of the static and dynamic power consumption:

P = (CL VDD² f + tsc VDD Ipeak f) + VDD Ileak  (3)

As technology scales down, dynamic power consumption shrinks in absolute terms while
static power grows in relative terms; reports indicate that 40% or even more of the total
power consumption is due to transistor leakage.
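As a quick illustration of Eqs. (1)–(3), the following sketch evaluates the three power components; all device values below are assumed for demonstration and are not taken from the paper's simulations.

```python
# A minimal sketch evaluating the power components of Eqs. (1)-(3);
# every device value below is an illustrative assumption.

def dynamic_power(alpha, f, c_load, vdd):
    """Switching power Pd = alpha * f * C * Vdd^2 (Eq. 1)."""
    return alpha * f * c_load * vdd ** 2

def short_circuit_power(k, f, tau, vdd, vt):
    """Short-circuit power Psc = k * f * tau * (Vdd - 2*Vt)^3 / 12 (Eq. 2)."""
    return k * f * tau * (vdd - 2 * vt) ** 3 / 12

def leakage_power(i_leak, vdd):
    """Static power Pleak = Ileak * Vdd."""
    return i_leak * vdd

# Illustrative 180 nm-class numbers (assumed for demonstration only)
alpha, f, c_load, vdd, vt = 0.2, 500e6, 50e-15, 1.8, 0.4
p_d = dynamic_power(alpha, f, c_load, vdd)
p_sc = short_circuit_power(k=1e-4, f=f, tau=50e-12, vdd=vdd, vt=vt)
p_leak = leakage_power(i_leak=10e-9, vdd=vdd)
print(f"Pd = {p_d:.3e} W, Psc = {p_sc:.3e} W, Pleak = {p_leak:.3e} W")
print(f"Total P = {p_d + p_sc + p_leak:.3e} W")   # Eq. (3) with Psc written out
```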
Mobility Degradation with Vertical Field: At large gate to source voltage, the high
electric field developed between the Gate and channel confines the carrier to a
narrower region below the oxide-silicon interface, leading to more carrier scattering
and hence lower mobility.
The threshold voltage also becomes lower under drain bias: when a high drain voltage is
applied to a short-channel device, the barrier height is lowered even more, resulting in a
further decrease of the threshold voltage.
3 Runtime Techniques
A common architectural technique to keep the power of fast, hot circuits within
bounds has been to freeze the circuits, placing them in a standby state whenever they are
not needed. Standby-leakage reduction techniques exploit this idea to place certain
sections of the circuit in a low-leakage standby mode when they are not required.
Fig. 2. VTCMOS NAND gate schematic (all transistors W = 22 µm, L = 2 µm; body-bias
terminals bp and bn; inputs IN1 and IN2; output OUT)
4 Experimental Results
Simulation analysis has been carried out for a NAND gate in two technologies, 180 nm
and 350 nm. The SPICE simulation results are given in Tables 1 and 2. From the tables,
leakage power is reduced by approximately 10 times with the VTCMOS (body-bias)
technique, by approximately 1000 times with the transistor-stacking technique, and by
approximately 10000 times with the MTCMOS (sleep-transistor) technique.
5 Conclusion
With the continuous scaling of CMOS devices, leakage current is becoming a major
contributor to the total power consumption. In current nano-regime CMOS devices
with low threshold voltages, subthreshold and gate leakage have become the dominant
sources of leakage and are expected to increase with further technology scaling.
Design-time and run-time techniques such as dual-Vth, MTCMOS, VTCMOS and
dynamic Vth scaling can effectively reduce the leakage current in high-performance
logic.
References
1. Agarwal, A., Roy, K.: Leakage Power Analysis and Reduction for Nano Scale Circuits.
IEEE Computer Society (2006)
2. Fallah, F., Pedram, M.: Standby and Active Leakage Current Control and Minimization in
CMOS VLSI Circuits. IEICE Transactions on Electronics (2005)
3. Roy, K., Mukhopadhyay, S., Mahmoodi-Meimand, H.: Leakage Current Mechanisms and
Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits. Proceedings of
the IEEE 91(2) (February 2003)
(February 2003)
4. Borkar, S.: Design Challenges of Technology Scaling. IEEE Micro 19(4), 23–29 (1999)
5. Abdollahi, A., Fallah, F., Pedram, M.: Leakage Current Reduction in CMOS VLSI
Circuits by Input Vector Control. IEEE Transactions on Very Large Scale Integration
Systems 12(2) (February 2004)
6. Nielsen, L.S., Niessen, C., Sparso, J., Van Berkel, C.H.: Low-Power Operation Using Self-
Timed Circuits and Adaptive Scaling of the Supply Voltage. IEEE Trans. on VLSI Systems,
391–397 (December 1994)
7. Ishihara, T., Yasuura, H.: Voltage scheduling problem for dynamically variable voltage
processors. In: Proc. of Int’l Symp. on Low Power Electronics and Design, pp. 197–202
(August 1999)
CMR – Clustered Multipath Routing
to Increase the Lifetime of Sensor Networks
Abstract. Routing in wireless sensor networks is an important task, and has led
to a number of routing protocols that must work within limited resources. Since
wireless sensor nodes are battery powered, it is essential to use their energy
efficiently, and under this constraint many methods for conserving power have
been proposed to increase battery life. In this paper we propose clustered
multipath routing (CMR), a novel way to increase the lifetime of sensor nodes.
It uses multiple paths between the source and the destination, which is intended
to provide consistent transmission at low energy. The proposed system saves
about 23% of energy.
1 Introduction
2 Related Work
Wireless sensor networks have attracted much research in recent years. In order to
minimize the energy consumption in WSNs, several energy-efficient routing protocols
and algorithms have been developed [1, 2]. The majority of the routing protocols can
be classified into data-centric, hierarchical, location-based, and network-flow protocols.
Each sensor node is assumed to know its own position as well as those of its neighbors,
which can be obtained with localization schemes [3], [4]. Multipath routing in ad hoc
networks has been proposed in [5], [6], [7], [8]. Partitioning the whole network into
smaller areas can turn the network into an easily controllable and manageable
infrastructure; such grouping of sensors is called clustering. Generally, clustering
methods can be categorized into static and dynamic clustering. Static clustering aims at
minimizing the total energy spent during cluster formation for a given network [9].
Dynamic clustering deals with the same energy-efficiency problems, such as increasing
node lifetime and selecting the cluster head, as in [10]. In the existing shortest-path
algorithm [11], the nodes are placed randomly, and at some instants of time data-packet
collisions and path breaks occur.
3 Proposed Work
The plan proposed in this article is to find the neighbour node list and then to find
multiple paths through the neighbour nodes. The work is divided into four stages.
In the first stage, the node with the most energy is elected as cluster head (CH); the
remaining nodes are treated as member nodes of that cluster. The node with the
next-highest energy level is designated next_CH. If the cluster head's energy level falls
below the threshold value, the next_CH takes over as cluster head and the current head
goes to sleep mode. The remaining energy of the cluster head is calculated by the formula

Remaining energy = I.E. − ((no. of packets transmitted × T.E.) + (no. of packets received × E.C.))

where I.E. is the initial energy of the node, T.E. the transmission energy required to
transmit a packet, and E.C. the energy consumed by receiving a packet.
Before finding the multipath, a neighbour list is created for the source node. The steps
are as follows (a sketch is given after the list):
1. Get the value of the maximum number of nodes in each cluster.
2. Get the position of the source node.
3. Let the source node have node id = 0.
4. Find the distance between the source node and all other nodes using the distance
formula.
5. If (distance < trans_range), update the neighbours of the source node in the
neighbour list.
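The following is a minimal Python sketch of the neighbour-list construction and the remaining-energy formula above; the node layout, transmission range, and energy constants are illustrative assumptions, not values from the paper.

```python
# A minimal sketch of steps 1-5 and the remaining-energy bookkeeping.
import math

TRANS_RANGE = 40.0   # assumed transmission range (metres)

def neighbour_list(nodes, source_id=0):
    """Steps 1-5: collect every node within TRANS_RANGE of the source."""
    sx, sy = nodes[source_id]
    neighbours = []
    for nid, (x, y) in nodes.items():
        if nid == source_id:
            continue
        dist = math.hypot(x - sx, y - sy)       # step 4: distance formula
        if dist < TRANS_RANGE:                  # step 5: range test
            neighbours.append(nid)
    return neighbours

def remaining_energy(ie, sent, received, te=0.02, ec=0.01):
    """Remaining energy = I.E. - (sent*T.E. + received*E.C.)."""
    return ie - (sent * te + received * ec)

nodes = {0: (0, 0), 1: (10, 5), 2: (25, 30), 3: (60, 60)}  # assumed layout
print(neighbour_list(nodes))                               # -> [1, 2]
print(remaining_energy(ie=100.0, sent=120, received=200))  # -> 95.6
```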
Consider Fig. 1, where node 1 acts as the source and node 11 as the
destination.
The proposed system uses only a small amount of energy while sending the data. Fig. 3
shows that the total remaining energy is higher than with the existing approach.
Here we introduced a novel technique to send data through clustered multipath
routing (CMR). CMR routing has saved energy as well as time. The simulation results
show that energy is saved, so the lifetime of the nodes is also increased. Our future work
will further investigate multipath routing for larger numbers of nodes.
References
1. Boukerche, A., Chatzigiannankis, I., Nikoletseas, S.: A New Energy Efficient and Fault
Tolerant Protocol for Data Propagation in smart dust Networks using varying transmission
range. Computer Communication 4(29), 477 (2008)
2. Gao, J., Zhang, L.: Load Balanced short Path Routing in Wireless Networks. In: IEEE
INFOCOM 2004, pp. 1099–1108 (2004)
3. Doherty, L., El Ghaoui, L., Pister, K.S.J.: Convex position estimation in wireless sensor
networks. In: IEEE INFOCOM, pp. 1655–1663 (2001)
4. Shang, Y., Ruml, W., Zhang, Y., Fromherz, M.P.J.: Localization from mere connectivity.
In: MobiHoc, pp. 201–212 (2003)
5. Lee, S.-J., Gerla, M.: AODV-BR: Backup Routing in Ad hoc Networks. In: IEEE WCNC
2000, Chicago, IL (September 2000)
6. Nasipuri, A., Das, S.R.: On-Demand Multipath Routing for Mobile Ad Hoc Networks. In:
IEEE ICCCN 1999, Boston, MA, pp. 64–70 (1999)
7. Park, V.D., Corson, M.S.: A Highly Adaptive Distributed Routing Algorithm for Mobile
Wireless Networks. In: IEEE INFOCOM 1997, Kobe, pp. 1405–1413 (1997)
8. Raju, J., Garcia-Luna-Aceves, J.J.: A New Approach to On-demand Loop-Free Multipath
Routing. In: IEEE ICCCN 1999, Boston, MA, pp. 522–527 (1999)
9. Bandyopadhyay, S., Coyle, E.J.: Minimizing communication Costs in hierarchically
clustered networks of wireless sensors. Computer Networks 44(1), 1–16 (2004)
10. Ma, Y., Aylor, J.H.: System Lifetime Optimization for heterogeneous sensor networks
with a hub-spoke topology. IEEE Trans. Mobile Computing 3(3), 286–294 (2004)
11. Singh, P.K., Singh, N.P.: Data Forwarding in Adhoc Wireless Sensor Network Using
Shortest Path algorithm. Journal of Global Research in Computer Science 2(5) (2011)
Multiregion Image Segmentation by Graph Cuts
for Brain Tumour Segmentation
Abstract. Multiregion graph cut image partitioning via kernel mapping is used
to segment any type of image data. The piecewise constant model of the
graph cut formulation becomes applicable when the image data is transformed
by a kernel function. The objective function contains an original data term that
evaluates the deviation of the transformed data, within each segmentation region,
from the piecewise constant model, and a smoothness, boundary-preserving
regularization term. Using a common kernel function, energy minimization
typically consists of iterating image partitioning by graph cut iterations and
evaluations of region parameters via fixed-point computation. The method
yields good segmentations and runs faster than other graph cut methods.
Segmentation from MRI data is an important but time-consuming task
performed manually by medical experts, and segmenting MRI images is
challenging due to the high diversity in tissue appearance among patients. A
semi-automatic interactive brain segmentation system with adjustable operator
control is achieved with this method.
1 Introduction
initialization [3]. Several interactive graph cut methods have used models more
general than the Gaussian by adding a process to learn the region parameters at any
step of the graph cut segmentation process.
Kernel functions are used to transform the image data, rather than seeking accurate
(complex) image models and addressing a nonlinear problem directly. Using Mercer's
theorem, the dot product in the feature space suffices to write the kernel-induced data
term as a function of the image, the region parameters, and a kernel function.
Graph cut methods state image segmentation as a label-assignment problem: a data
term measures the conformity of the image data within the segmentation regions to a
statistical model, and a regularization term (the prior) encourages smooth region
boundaries.
The kernel trick consists of using a linear classifier to solve a nonlinear problem by
mapping the original nonlinear data into a higher-dimensional space. It follows from
Mercer's theorem that any continuous, symmetric, positive semi-definite kernel
function can be expressed as a dot product in a high-dimensional space [5].
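To illustrate, the kernel-induced data term can be evaluated with kernel calls alone, since ||φ(x) − φ(μ)||² = K(x,x) − 2K(x,μ) + K(μ,μ). The sketch below assumes an RBF kernel and illustrative region parameters; the paper only requires a common kernel function, so this is a demonstration, not the authors' exact formulation.

```python
# A minimal sketch of the kernel-induced data term: by Mercer's theorem the
# feature-space distance needs only dot products, i.e. kernel evaluations.
import numpy as np

def rbf(x, y, sigma=10.0):
    """Assumed RBF kernel K(x, y)."""
    return np.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def kernel_distance(pixels, mu, sigma=10.0):
    """||phi(I_p) - phi(mu)||^2 = K(I,I) - 2K(I,mu) + K(mu,mu)."""
    return rbf(pixels, pixels, sigma) - 2 * rbf(pixels, mu, sigma) + rbf(mu, mu, sigma)

def data_term(image, labels, region_params, sigma=10.0):
    """Sum of kernel-induced deviations of each pixel from the constant
    parameter mu_l of its assigned region (the piecewise constant model)."""
    cost = 0.0
    for lbl, mu in region_params.items():
        cost += kernel_distance(image[labels == lbl], mu, sigma).sum()
    return cost

img = np.array([[10., 12., 200.], [11., 198., 205.]])   # toy image
lab = np.array([[0, 0, 1], [0, 1, 1]])                  # candidate labelling
print(data_term(img, lab, {0: 11.0, 1: 201.0}))         # small cost: good fit
```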
3.2 Optimization
4 Results
The kernel method is used for segmenting various types of images. In this paper
the graph cut method is tested on medical images.
The brain image, shown in Fig. 2(a), was segmented into three regions. In this
case, the choice of the number of regions is based upon prior medical knowledge.
Segmentation at convergence and final labels are displayed as in previous examples.
Fig. 2(d) depicts very narrow human vessels with very small contrast within
some regions. These results on gray-level images show that the proposed method is
flexible. Detection of anatomical brain tumours plays an important role in the
planning and analysis of various treatments including radiation therapy and surgery.
5 Conclusion
The multiregion graph cut image segmentation method in a kernel-induced space
consists of minimizing a functional containing an original data term that references
the image data transformed by a kernel function. The optimization algorithm iterates
two consecutive steps: graph cut optimization and fixed-point iterations for updating
the region parameters. The flexibility and effectiveness of the method were tested
on medical and natural images, showing it to be a flexible and effective alternative
to complex modeling of image data. Performance can be improved further for
specific applications.
References
1. Freedman, D., Zhang, T.: Interactive graph cut based segmentation with shape priors. In:
Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., pp. 755–762 (2005)
2. Chan, T.F., Vese, L.A.: Active contours without edges. IEEE Trans. Image Process. 10(2),
266–277 (2001)
3. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts.
IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)
4. El-Zehiry, N.Y., Elmaghraby, A.: A graph cut based active contour for multiphase image
segmentation. In: Proc. IEEE Int. Conf. Image Process., pp. 3188–3191 (2008)
5. Girolami, M.: Mercer kernel based clustering in feature space. IEEE Trans. Neural
Netw. 13(3), 780–784 (2001)
6. Liu, X., Veksler, O., Samarabandu, J.: Graph cut with ordering constraints on labels and its
applications. In: Proc. IEEE Int. Conf. Comput.Vis. Pattern Recognit., pp. 1–8 (2008)
Performance Parameters for Load Balancing Algorithm
in Grid Computing
1 Introduction
2 Load Balancing
L_j^i(t) = L_j(τ_ij(t)) , … (1)

where τ_ij(t) is an integer variable satisfying 0 ≤ τ_ij(t) ≤ t, i.e., L_j^i(t) is the
(possibly outdated) load of processor j as known to processor i at time t. For each
neighboring processor j ∈ N(i), if L_i(t) > L_j^i(t), then a nonnegative amount of load,
denoted s_ij(t), is transferred from i to j; no load is transferred if L_i(t) ≤ L_j^i(t),
in which case we let s_ij(t) = 0. For notational convenience, we also let s_ij(t) = 0 if
j ∉ N(i). We assume that a load transfer can take some time to be completed, and we
use u_ij(t) to denote the amount of load that has been sent from processor j to
processor i before time t but has not been received by processor i before time t. Let
r_ij(t) be the load received by processor i from processor j at time t. We then have

L_i(t+1) = L_i(t) − ∑_{j∈N(i)} s_ij(t) + ∑_{j∈N(i)} r_ij(t) , … (2)

and

u_ij(t+1) = u_ij(t) + s_ji(t) − r_ij(t) , … (3)

where we make the implicit assumption that u_ij(0) = 0. Since no load is assumed to
be in transit at time zero, we have ∑_i L_i(0) = L, and using Eqs. (2) and (3) we easily
obtain the load conservation equation

∑_i ( L_i(t) + ∑_{j∈N(i)} u_ij(t) ) = L , ∀ t ≥ 0. … (4)
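A minimal simulation of the update rule (2) illustrates the conservation property (4). It makes the simplifying assumption of instantaneous transfers, so r_ij(t) = s_ji(t) and no load is ever in transit; the topology and the half-difference transfer policy are assumptions for illustration only.

```python
# A minimal sketch of the load-update rule of Eq. (2); the assert checks the
# load-conservation property of Eq. (4) at every step.
import numpy as np

def balance_step(load, neighbours):
    sent = np.zeros_like(load)
    recv = np.zeros_like(load)
    for i, nbrs in neighbours.items():
        for j in nbrs:
            if load[i] > load[j]:                    # transfer only downhill
                s = (load[i] - load[j]) / (2 * len(nbrs))
                sent[i] += s
                recv[j] += s
    return load - sent + recv                        # Eq. (2)

neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # a 4-node chain
load = np.array([100.0, 0.0, 0.0, 0.0])
total = load.sum()
for t in range(50):
    load = balance_step(load, neighbours)
    assert abs(load.sum() - total) < 1e-9            # Eq. (4): load conserved
print(np.round(load, 2))                             # approaches [25 25 25 25]
```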
5 Conclusion
Through this paper, we have described multiple aspects of grid computing and
introduced numerous concepts that illustrate its broad capabilities. Grid computing
is a promising approach for solving highly demanding applications and many kinds
of problems. This paper presents a number of parameters for load balancing, such as
communication delay, security, fault tolerance, efficiency, overload rejection,
complexity, and grid topology. The table at the end of the paper shows the performance
of various load balancing algorithms with respect to these parameters.
References
1. Bode, A.: Load Balancing In Distributed Memory Multiprocessors. Invited paper. IEEE
(1991)
2. Ranganathan, A., Campbell, R.H.: What is the Complexity of a Distributed Computing
System? National Science Foundation, NSF CCR 0086094 ITR and NSF 99-72884 EQ
3. Eager, D.L., Lazowska, E.D., Zahorjan, J.: Adaptive load sharing in homogeneous
distributed systems. IEEE Transactions on Software Engineering 12(5), 662–675 (1986)
4. Milojičić, D.S., Douglis, F., Paindaveine, Y., Wheeler, R., Zhou, S.: Process Migration.
ACM Computing Surveys 32(3), 241–299 (2000)
5. Harvey, D.J.: Development of the Grid Computing Infrastructure, NASA Ames Research
Center Sunnyvale, California,
http://webpages.sou.edu/~harveyd/presentations/Grid.ppt
6. Prathima, G., Saravanakumar, E.: A novel load balancing algorithm for computational
grid. International Journal of Computational Intelligence 1(1), 20–26 (2010)
7. Lin, H.-C., Raghavendra, C.S.: A Dynamic Load Balancing Policy With a Central Job
Dispatcher (LBC). IEEE Transactions on Software Engineering 18(2), 148–158 (1992)
8. Alkadi, I., Gregory, S.: Grid Computing: The Trend of The Millennium. Review of
Business Information System 11(2), 33–38 (2007)
9. Psoroulas, I., Anagnostopoulos, I., Loumos, V., Kayafas, E.: A Study of the Parameters
Concerning Load Balancing Algorithms. IJCSNS 7(4) (April 2007)
10. Jayabharathy, J., Parveen, A.: A Fault Tolerant Load Balancing Model for Grid
Environment. International Journal of Recent Trends in Engineering 2(2) (November 2009)
11. Salehi, M.A., Deldari, H.: A Novel Load Balancing Method in an Agent-based Grid. IEEE,
Iran Telecommunication Research Center (ITRC), 1-4244–0220-4 (2006)
12. Bheevgade, M., Mujumdar, M., Patrikar, R., Malik, L.: Achieving Fault Tolerance in Grid
Computing System. In: Proceeding of 2nd National Conference on Challenges &
Opportunities in Information Technology (COIT 2008). RIMT-IET (March 29, 2008)
13. Nandagopal, M., Uthariaraj, R.V.: Hierarchical Status Information Exchange Scheduling
and Load Balancing For Computational Grid Environments. IJCSNS 10(2) (February
2010)
14. Malik, S.: Dynamic Load Balancing in a Network of Workstation, 95.515 Research Report
(November 19, 2000)
15. Sharma, S., Singh, S., Sharma, M.: Performance Analysis of Load Balancing Algorithms.
World Academy of Science, Engineering and Technology 38 (2008)
Contourlet Based Image Watermarking Scheme
Using Schur Factorization and SVD
1 Introduction
With the growth and advancement in multimedia technology, the user can easily
access, tamper and modify the content. The technique for the copyright protection of
the digital product is called Digital Watermarking in which information pertaining to
ownership of data is embedded into original work. In transform domain, the image is
represented by frequency. Here, Discrete Wavelet Transform (DWT) [1] and
Contourlet Transform (CT) have gained much popularity in the research arena. CT,
proposed by Do and Vetterli [2], provides directionality and anisotropy besides the
time-frequency localization and multiresolution representation features of wavelets.
Moreover, wavelets have only three directions at each resolution, while the contourlet
transform provides any number of directional decompositions at every level of
resolution, making the CT more advantageous than wavelets [1]. An algorithm based on
CT was proposed by Liu and Tan [3] in which the watermark was added in the SVD
domain of the cover image. Xu et al. [4] proposed a scheme where the lowpass
coefficients of the image in the CT domain were decomposed using SVD. Liu et al. [5]
proposed a scheme in which the original cover image was decomposed by CT and
Schur factorization was adopted in the lowpass subband to embed the watermark
information.
Fig. 1. The CT of the Lena image. Fig. 2. The CT of the logo image
3 Proposed Algorithm
Watermarking techniques generally embed the watermark coefficients directly into
the original image coefficients. In the proposed scheme, two matrix factorization
methods, namely SVD and Schur, are instead applied after CT to decompose the
coefficients of the watermark to be embedded. Schur decomposition is applied to the
watermark coefficients after CT, and the Schur form is further factorized by SVD. The
singular values thus obtained are embedded into the CT- and SVD-decomposed
coefficients of the original image. The proposed scheme has been found to be more
robust against attacks than schemes implemented with SVD and CT alone.
Step 5: The singular values of the original image are then modified by the singular
values of the watermark, using α as a visibility factor:

P′ = P + α W . (3)

Here the singular values of the original and the embedded image are represented by P
and P′, respectively, and the singular values of the watermark by W.
Step 6: Inverse SVD and inverse CT are then applied to the modified singular values
of the original image to obtain the embedded image.
For extraction, the singular values of the watermark are recovered as

W = (P′ − P) / α . (4)

Step 4 (of the extraction procedure): Inverse SVD, inverse Schur, and inverse CT are
then applied to the extracted singular values to obtain the watermark image.
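The embedding and extraction of Eqs. (3)–(4) can be sketched as follows. For brevity the SVD is applied directly to pixel arrays; in the paper the SVD and Schur factorization operate on contourlet-transform coefficients, which this sketch omits, so it is a demonstration of the singular-value arithmetic only.

```python
# A minimal sketch of the singular-value embedding/extraction of Eqs. (3)-(4).
import numpy as np

def embed(host_sv, wm_sv, alpha=0.05):
    """P' = P + alpha * W  (Eq. 3)."""
    return host_sv + alpha * wm_sv

def extract(embedded_sv, host_sv, alpha=0.05):
    """W = (P' - P) / alpha  (Eq. 4)."""
    return (embedded_sv - host_sv) / alpha

rng = np.random.default_rng(0)
host = rng.random((8, 8))           # stand-in cover coefficients
watermark = rng.random((8, 8))      # stand-in watermark coefficients

U, P, Vt = np.linalg.svd(host)
Pw = np.linalg.svd(watermark, compute_uv=False)   # singular values of W
P_marked = embed(P, Pw)
marked = (U * P_marked) @ Vt                      # inverse SVD

recovered = extract(np.linalg.svd(marked, compute_uv=False), P)
print(np.allclose(recovered, Pw, atol=1e-6))      # True: watermark recovered
```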
Fig. 3. Original image. Fig. 4. Original watermark. Fig. 5. Watermarked image.
Fig. 6. Extracted watermark
5 Conclusion
The proposed method displays better performance than the other schemes, which were
implemented either with CT and SVD only or by applying Schur factorization to CT
coefficients. Experimental results demonstrate that the extracted watermark is similar
to the embedded watermark. The proposed algorithm shows excellent resilience against
attacks, and the watermarked image has high perceptual quality. Future work involves
the use of matrix factorization methods such as QR decomposition and Takagi's
factorization.
References
1. Javidan, R., Masnadi-Shirazi, M.A., Azimifar, Z., Sadreddini, M.H.: A Comparative study
between wavelet and Contourlet Transorm Features for Textural Image Classification. In:
Information and Comm. Technologies: From Theory to Applications, pp. 1–5 (2008)
2. Do, M.N., Vetterli, M.: The contourlet transform: an efficient directional multiresolution
image representation. J. Image Processing 14(12), 2091–2106 (2005)
3. Liu, R., Tan, T.: An SVD-Based Watermarking Scheme for Protecting Rightful Ownership.
J. Multimedia 4(1), 121–128 (2002)
4. Bi, H., Li, X., Zhang, Y., Xu, Y.: A blind robust watermarking scheme based on CT and
SVD. In: IEEE 10th International Conference on Signal Processing, pp. 881–884 (2010)
5. Liu, P., Yang, J., Wei, J., Chen, F.: A Novel Watermarking Scheme in Contourlet Domain
Based on Schur Factorization. In: International Conference on Information Engineering and
Computer Sciences, pp. 1–4 (2010)
Patch-Based Categorization and Retrieval
of Medical Images
1 Introduction
With the increasing influence of computer techniques on the medical industry, the
production of digitized medical data is growing rapidly. Although the medical data
repository keeps growing, it is not being utilized efficiently: the data is often used only
once, for a specific case diagnosis, and the time spent analyzing it benefits that one case
only. If the time and data were instead used to solve multiple medical cases, the medical
industry could benefit greatly from medical experts' time in providing new and more
effective ways of handling and inventing medical solutions for the future. This can be
made possible by combining two prominent fields of computer science: data mining
and image processing.
Medical imaging is the technique used to create images of the human body for
medical procedures (i.e., to reveal, diagnose or examine disease) or for medical
science. Medical imaging is often perceived to designate the set of techniques that
noninvasively produce images of the internal aspect of the body. Due to the rise of
efficient medical imaging techniques, there has been an incredible increase in the number
of medical images. These images, if archived and maintained, would aid the medical
industry (doctors and radiologists) in ensuring efficient diagnosis.
The core of the medical data is the digital images obtained after processing the X-ray
medical images; these should be processed in order to improve their texture and quality
using image processing techniques, and data mining techniques may then be applied
in order to retrieve the relevant and significant data from the vast amount of existing
medical data.
2 Related Work
The diagnosis process is very time consuming and requires the expertise of a
radiologist. Thus, automating the process of indexing and retrieving medical images
would be a great boon to the medical community. Medical images are usually obtained
in the form of X-rays using recording techniques that add unwanted extra data to the
image, such as noise, air, etc. [2, 5]. When these X-ray images are transformed into
digital format, these disturbances also get converted and become part of the image,
which may adversely affect the generation of accurate data when the images are
processed for medical help. This unwanted data therefore needs to be separated from
the images, which can be done using image processing techniques. Medical image
mining includes the following phases: a pre-processing phase, a bag-of-visual-words
phase, a clustering phase, and a retrieval phase.
Image digitization – images acquired as X-rays need to be processed to remove the
unwanted data. The preprocessing phase is necessary in order to get rid of this unwanted
data [4]; here, histogram equalization is used in order to get the required clarity in the
image [1]. Any medical image consists of primitive features, which are low-level
features such as color, shape, and texture, and logical features, which are medium-level
features describing the image by a collection of objects and their spatial relationships
[8]. The images are divided into patches called visual words, and the collection of all
the patches is referred to as a bag of (visual) words [1, 10]. Feature vectors are created
for the features of the image patches. These vectors are used to compare images; the
difference is calculated as a Euclidean distance [9]. For different mining techniques, the
results of feature extraction differ [3]. Based on the Euclidean distance, the images are
segregated into multiple clusters, which are used to retrieve a matching image during
the retrieval phase.
3 Proposed Method
Preprocessing phase – includes removing the unwanted data from the image and
improving image quality. This removal of unwanted data (analogous to stop-words in
data mining) can be achieved by cropping, image enhancement, and histogram
equalization. An X-ray image is obtained in gray scale; this gray-scale image is
pre-processed using histogram equalization in order to improve its visual quality.
Bag-of-visual-words phase – consists of the complete process of bag-of-visual-words
(patch) formation and feature extraction.
Bag-of-visual-words (patch) formation – In this phase, the images are segmented
into patches. Each segment has more clarity than the complete image. Each segment
of the image is referred to as a patch (visual word); each patch is a collection of pixels
in the image.
Feature extraction – After image enhancement and cropping, high-quality images are
obtained. For each patch, the hue, saturation, and brightness are calculated. Since the
image is gray scale, the hue and saturation counts remain zero, so ultimately only the
brightness feature is used in further processing. The patches are collected into groups,
formed from adjacent patches along the horizontal and vertical lines, and the average
feature is calculated for each group. Finally, there is a certain number of
feature-description groups for each image. These feature-description groups, along
with the images, are stored in a database for the clustering process.
Clustering phase – The images obtained from the previous phases are segregated into
groups based on the similarity of the extracted features; each group thus formed is
referred to as a cluster. Initially a certain number of clusters is chosen, and randomly
picked images are set as the centroid of each cluster. The images are compared against
each other using the feature-description groups, and the Euclidean distance is
calculated; similar images are grouped together into a cluster. Once this is done, the
centroid of each cluster is recalculated from the images it contains.
Retrieval phase – Medical images can be retrieved based on feature comparison with
the images stored in the clusters. Image retrieval can also be done for a specific region
of interest (ROI) [5] using CAD (computer-aided diagnosis) algorithms [1]. Medical
image retrieval methods can be categorized by the type and nature of the features
considered: Text Based Image Retrieval (TBIR), Content Based Image Retrieval
(CBIR), and Semantic Based Image Retrieval (SBIR).
Proposed Algorithm:
1. Divide the input image into patches.
2. Apply histogram equalization to each patch of the image, in order to improve
the image quality.
3. Calculate the hue, saturation and brightness of the segments and generate the
brightness vector for every patch of the image.
4. Store the images along with their vectors in the database.
5. Choose a number of clusters, and fill the clusters using K-means clustering.
6. For retrieving the matched images, repeat steps 1 through 5 (a sketch of
steps 3–7 is given after this list).
7. Compare the image with the images in the clusters, in order to find the
matching image.
8. Retrieve the image and its recommended diagnosis from the database.
9. Display the diagnosis report to the user.
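A minimal Python sketch of steps 3–7 follows: brightness vectors per patch, K-means clustering, and nearest-centroid retrieval. The patch size, the number of clusters, and the toy images are assumptions, and the database storage of diagnosis reports (steps 4, 8 and 9) is omitted.

```python
# A minimal sketch of the brightness features, K-means clustering and retrieval.
import numpy as np

def brightness_vector(img, patch=8):
    """Mean brightness of each patch x patch block, flattened to a vector."""
    h, w = img.shape[0] // patch, img.shape[1] // patch
    blocks = img[:h * patch, :w * patch].reshape(h, patch, w, patch)
    return blocks.mean(axis=(1, 3)).ravel()

def kmeans(vectors, k=2, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), k, replace=False)]
    for _ in range(iters):
        # Euclidean distance from every image vector to every centroid
        d = np.linalg.norm(vectors[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = vectors[labels == c].mean(axis=0)
    return centroids, labels

def retrieve(query_vec, centroids, labels, vectors):
    """Pick the closest cluster, then the closest image inside it."""
    c = np.linalg.norm(centroids - query_vec, axis=1).argmin()
    members = np.where(labels == c)[0]
    best = members[np.linalg.norm(vectors[members] - query_vec, axis=1).argmin()]
    return best                      # index of the closest stored image

rng = np.random.default_rng(1)
images = [rng.random((64, 64)) for _ in range(10)]     # stand-in X-ray patches
vecs = np.stack([brightness_vector(im) for im in images])
cents, labs = kmeans(vecs)
print(retrieve(brightness_vector(images[3]), cents, labs, vecs))   # -> 3
```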
4 Implementation
The work is implemented in the Java language, using JDBC for database connectivity
along with the image-related and core Java APIs.
Pre-processing phase: The first phase of the work is pre-processing. The images to be
preprocessed are collected in a directory as BufferedImage objects and processed one
by one using the histogram-equalization technique. The height and width of each image
are determined, and the pixels are collected as ints (integers) of the form 0xRRGGBB.
From each pixel, the red, green, and blue components are extracted, and from this data
the histogram is created. Histogram equalization then yields much clearer images; a
sketch of the procedure is given below.
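The following is a minimal sketch of standard grey-scale histogram equalization (the authors' implementation is in Java using BufferedImage; here numpy stands in for the pixel access).

```python
# A minimal sketch of grey-scale histogram equalization.
import numpy as np

def equalize(img):
    """Map grey levels through the normalized cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # classic histogram-equalization transfer function (clipped to [0, 255])
    lut = np.clip(np.round((cdf - cdf_min) / (img.size - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

rng = np.random.default_rng(0)
xray = rng.integers(90, 160, size=(32, 32), dtype=np.uint8)  # low-contrast toy image
eq = equalize(xray)
print(xray.min(), xray.max(), eq.min(), eq.max())   # contrast stretched to 0..255
```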
Bag-of-visual-words phase: In this phase, the visual words (patches) are extracted
from the equalized images obtained in the previous phase. We first obtain the width
and height of the image and then divide the image into patches based upon textons,
forming the required bag of visual words.
Feature description – For each patch in the image, the HSB (hue, saturation and
brightness) values are calculated. Since the hue and saturation counts for gray-scale
images are zero, only the brightness feature is taken into account. The neighboring
patches (along the vertical and horizontal lines) are grouped into a vector of patches,
and for each patch vector the average brightness is obtained. This information is stored
in the database for future calculations.
Clustering phase: Based on the various medical cases, we decide upon the number of
clusters. A few images are then picked at random and taken as the centroids of the
clusters. The similarity between images is determined by calculating the Euclidean
distance between their vectors; images are considered similar if their Euclidean distance
is within an assumed threshold value. Similar images are collected into a cluster (here,
a cluster is a directory). Once all the images have been placed into their respective
clusters, a new centroid image is computed for each cluster from the images it contains.
Retrieval phase – During retrieval, the query image is compared with the centroid
images of all the clusters; based on the closest matching centroid, the respective cluster
is selected, the query image is compared with the images it contains, and the matching
image is retrieved. From the retrieved image, its image ID is found and the respective
diagnosis report is fetched from the database.
5 Conclusions
The process of diagnosis is time consuming, and treatment is recommended only when
the patient is found to be affected; in such a scenario, much of the human expert's time
spent on diagnosis is wasted. If the generation of diagnosis reports is automated, the
expert's time and experience can be put to better use improving the medical field by
inventing new, effective ways of dealing with diseases. Certain challenges remain,
since dealing with images is a very time-consuming process and the storage requirement
is very high. The work can be extended in the future by taking into account other image
features such as texture and shape.
References
1. Avni, U., Greenspan, H., Konen, E., Sharon, M., Goldberger, J.: X-ray Categorization and
Retrieval on the Organ and Pathology Level, Using Patch-Based Visual Words. IEEE
Transactions on Medical Imaging 30(3) (March 2011)
2. Bhadoria, S., Dethe, C.G.: Study of Medical Image Retrieval System. In: International
Conference on Data Storage and Data Engineering. IEEE Computer Society (2010)
3. Fu, L.-D., Zhang, Y.-F.: Medical Image Retrieval and Classification Based on
Morphological Shape Feature. In: Third International Conference on Intelligent Networks
and Intelligent Systems (2010)
4. Antonie, M.-L., Zaïane, O.R., Coman, A.: Application of Data Mining Techniques for
Medical Image Classification. In: Proceedings of the Second International Workshop on
1 Introduction
The huge growth of digital music in recent years has made a large number of musical
recordings available in digital form as sampled audio. Additionally, progress in
electronic music production has resulted in a lot of symbolic music data being created.
Sampled audio cannot be manipulated as easily as symbolic music formats, while
symbolic formats lack the authenticity of real recordings. A key step towards combining
the benefits of these two realms is the ability to automatically produce a symbolic
representation of a sampled music recording; this process is referred to as musical audio
transcription [1]. Various techniques exist that can accurately transcribe monophonic
recordings (e.g., YIN [2], TWM [3], and the correlogram [4]) [1]. Raga is the most
important concept in Indian music, making accurate raga recognition a prerequisite to
almost all musical analysis [5]. There has been insufficient scientific research analyzing
the recordings of maestros who sing Indian classical music (ICM); this is the main aim
of this research work. The proposed method is accurate and computationally efficient.
Pitch is estimated by the
Fourier of Fourier Transform method (FFT2) [7]. The location of spectral peaks is
further refined with parabolic interpolation [6]. The resulting accuracy is suitable for
detecting microtones [7].
3 Estimation of Pitch
To determine pitch at a certain time t, temporal frame centered at t is considered.
Then it is checked if the frame is monophonic and harmonic. If yes, then pitch for that
frame is estimated as discussed in [7]. Frame overlapping is done by selecting hop
size such that there is 75% overlap of the frames. This procedure is repeated for each
frame. Finally we get “time vs. pitch” graph.
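A minimal sketch of this frame-wise loop is given below. It uses a plain FFT magnitude peak with parabolic interpolation [6] as a stand-in for the FFT2 estimator of [7], and the monophonic/harmonic frame test is omitted, so it illustrates the framing and interpolation only.

```python
# A minimal sketch of frame-wise pitch tracking with 75% frame overlap.
import numpy as np

def track_pitch(x, fs, frame=2048):
    hop = frame // 4                          # hop = frame/4 -> 75% overlap
    pitches = []
    for start in range(0, len(x) - frame, hop):
        w = x[start:start + frame] * np.hanning(frame)
        mag = np.abs(np.fft.rfft(w))
        k = mag[1:-1].argmax() + 1            # interior spectral peak bin
        a, b, c = np.log(mag[k - 1:k + 2] + 1e-12)
        delta = 0.5 * (a - c) / (a - 2 * b + c)   # parabolic interpolation
        pitches.append((k + delta) * fs / frame)
    return np.array(pitches)

fs = 8000
t = np.arange(4 * fs) / fs
tone = np.sin(2 * np.pi * 220 * t)            # steady 220 Hz test tone
f0 = track_pitch(tone, fs)
print(f0[:3].round(1))                        # close to 220.0 in each frame
print((69 + 12 * np.log2(f0[:3] / 440)).round(1))  # as MIDI numbers (~57)
```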
4 Results
Fig. 1. Stages in music transcription of polyphonic ICM into MIDI data through approach 1
(a: audio waveform, amplitude vs. time; b: pitch in Hz vs. time; c: pitch as MIDI number
vs. time)
In Fig. 1, at the onset of the tabla strokes near 0.5, 1.5 and 2.5 s, the polyphonic and
inharmonic frames are rejected to obtain a smooth pitch graph. In Figs. 2 and 3, panel
(a) shows the audio waveform, panel (b) the pitch graph, and panel (c) the pitch graph
in MIDI numbers.
Fig. 2. Stages in music transcription of ICM into MIDI data through approach 1 (a: audio
waveform; b: pitch in Hz; c: MIDI number; each vs. time)
Fig. 3. Stages in music transcription of ICM into MIDI data through approach 2 (a: audio
waveform; b: pitch in Hz; c: MIDI number; each vs. time)
Figs. 1, 2 and 3 show the progress of music transcription of ICM into MIDI data. The
MIDI note sequence for the audio file in Fig. 2 through approach 1 is 3C(34.8) 3D(23.2)
3F(696.6) 3G(127.7) 3F(441.2) 3D(1091.3); here, 3C(50) means octave 3, MIDI note C,
note duration 50 ms. The MIDI note sequence for
5 Conclusion
Pitch estimation is carried out using the Fourier of Fourier Transform with parabolic
interpolation of spectral peaks, which greatly reduces the computational complexity.
Automatic music transcription using approach 1 yields a MIDI note sequence, with note
durations, containing only the notes of the raga of the audio sample, whereas in
approach 2 the MIDI note sequence may comprise any note in the octave, since no prior
knowledge of the raga is given. This note sequence can be further used for musical
pattern recognition, i.e., raga identification, which can be treated as a basis for music
information retrieval of ICM and of film songs based on ICM.
References
1. Sutton, C.: Transcription of vocal melodies in popular music. Report for the degree of MSc
in Digital Music Processing at the University of London (2006)
2. de Cheveigné, A., Kawahara, H.: YIN, a fundamental frequency estimator for speech and
music. J. Acoust. Soc. Am. 111(4), 1917–1930 (2002)
3. Maher, R.C., Beauchamp, J.: Fundamental frequency estimation of musical signals using a
two-way mismatch procedure. J. Acoust. Soc. Am. 95(4), 2254–2263 (1994)
4. Slaney, M., Lyon, R.F.: A perceptual pitch detector. In: Proceedings of the International
Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. 357–360 (1990)
5. Chordia, P., Rae, A.: Raag recognition using pitch-class and pitch-class dyad distributions.
In: Proceedings of the International Conference on Music Information Retrieval (2007)
6. Smith, J.O., Serra, X.: PARSHL: An Analysis/Synthesis Program for Non-Harmonic
Sounds Based on a Sinusoidal Representation. In: Proceedings of the 1987 International
Computer Music Conference, pp. 290–297. International Computer Music Association, San
Francisco (1987)
7. Akant, K.A., Pande, R., Limaye, S.S.: Accurate Monophonic Pitch Tracking Algorithm for
QBH and Microtone Research. Pacific Journal of Science and Technology 11(2), 342–352
(2010)
8. Akant, K.A., Pande, R., Limaye, S.S.: Monophony/Polyphony Classification system Using
Fourier of Fourier Transform. International Journal of Electronics Engineering 2(2), 299–
303 (2010)
Enhanced Video Indexing and Retrieval Based on Face
Recognition through Combined Detection and Fast LDA
Abstract. Content-based indexing and retrieval of videos plays a key role in
helping today's Internet move towards the semantic web. The exponential
growth of multimedia data has increased the demand for video search based on
a query image rather than traditional text annotation. The best possible way to
index most videos is by the people featured in the video. The paper proposes a
combined face detection approach with high detection efficiency and low
computational complexity. The proposed fast LDA method performs wavelet
decomposition as a pre-processing stage on the face image. This preprocessing
stage reduces the retrieval time by a factor of 1/4^n, where n is the level of
decomposition, as well as improving the face recognition rate. Experimental
results demonstrate the effectiveness of the proposed method, reducing the
retrieval time by 64 times over the direct LDA implementation.
1 Introduction
Digital image and video are rapidly evolving into the modus operandi for information
creation, exchange and storage in the modern era of the Internet. Videos on the Internet
have traditionally been annotated with keywords manually. The fast growth of videos
over the past few decades has increased the demand for a query-by-example (QbE)
retrieval system in which retrieval is based on the content of the videos [1].
Face detection and recognition techniques, besides being used extensively in
authentication and identification of users, have also been extended to index and retrieve
videos [2]. People are the most important subjects in a video: face detection is used to
identify faces in the image sequences, and face recognition is used to associate the
video with the people featured in it. Face recognition algorithms are classified into two
types, namely appearance-based and geometric-feature-based approaches [3]; the latter
are computationally expensive compared to the former [4].
Wavelet transform has been a prominent tool for multi-resolution analysis of images
in the recent past [5]. The 2D Discrete Wavelet Transform (DWT) decomposes the
image into four components, namely Approximation (cA), corresponding to
low-frequency variations; Horizontal (cH), corresponding to horizontal edges; Vertical
(cV), corresponding to vertical edges; and Diagonal (cD), corresponding to
non-horizontal and non-vertical edge components, as in Fig. 4.
This paper proposes a system that uses a combination of skin color models to identify
skin regions, followed by morphological and geometric operations to find the face
regions. The face image fed to the recognition phase is pre-processed by wavelet
decomposition and the approximation component is used as input, thereby increasing
the recognition rate and reducing the time complexity by nearly 64 times.
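A minimal sketch of this pre-processing stage follows, using PyWavelets for a 3-level 2D DWT and keeping only the approximation component cA; the Haar mother wavelet is an assumption, since the paper does not name the wavelet used.

```python
# A minimal sketch of the wavelet pre-processing: keep cA after a 3-level DWT.
import numpy as np
import pywt

face = np.random.default_rng(0).random((128, 128))   # stand-in face image

coeffs = pywt.wavedec2(face, 'haar', level=3)
cA = coeffs[0]                                       # approximation component
print(face.size, cA.size, face.size // cA.size)      # 16384 256 64

# cA.ravel() is now the reduced-dimension input vector to the LDA, cutting the
# recognition time by ~1/4^3 = 1/64, as reported above.
```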
2 Framework
The overall architecture, shown in Fig. 1, is based on the system proposed in [2]. Video
data can be viewed as a hierarchy made up of frames at the lowest level; a collection
of frames focusing on one object depicts a shot [6]. The key frames extracted after
shot detection [7], [8] are then subjected to the combined face detection method, and
the face images obtained are preprocessed by wavelet decomposition. In the retrieval
phase, the images in the training database are used both in the training stage (building
the projection space) and in the testing stage (identifying a close match to the test
image).
The face detection method proposed in this paper tries to maximize the detection
efficiency while reducing the computational complexity.
The union of all binary images is shown in Fig. 3a. This step follows basic set-theory
principles: a region detected as skin in any one of the color spaces is recognized as a
skin region. The overhead of converting the image into three color spaces instead of
one is offset by the added advantage of a reduced false rejection rate.
Fig. 2. Results of the image after applying a threshold with the a) original image,
b) normalized RGB, c) YCbCr, and d) HSV color models
The connected regions are analyzed to check whether they conform to the geometry of
a face region. Regions that are too narrow, too short, short and wide, or narrow and tall
cannot be faces and are excluded. The height-to-width ratio of each remaining region
is calculated and checked for conformance to the Golden Ratio, given by (1). Fig. 3c
is the result of ANDing the resulting mask with the original image.
φ = (1 + √5) / 2 ≈ 1.618 . (1)
Fig. 3. a) Union of binary images b) Resultant after Morphological & Geometrical Analysis
c) Selected regions by ANDing d) Cropped Images
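A minimal sketch of the mask union and the geometric test of Eq. (1) follows. The threshold ranges are common illustrative values, not the paper's tuned parameters, and the HSV mask would be built analogously to the two shown.

```python
# A minimal sketch of the combined skin detection and the Golden Ratio test.
import numpy as np

GOLDEN = (1 + 5 ** 0.5) / 2

def skin_mask_nrgb(img):
    """Normalized-RGB rule (assumed bounds)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    s = r + g + b + 1e-9
    return (r / s > 0.36) & (g / s > 0.28) & (g / s < 0.36)

def skin_mask_ycbcr(img):
    """Widely used CbCr box (77<=Cb<=127, 133<=Cr<=173)."""
    r, g, b = img[..., 0] * 255, img[..., 1] * 255, img[..., 2] * 255
    cb = 128 - 0.169 * r - 0.331 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.419 * g - 0.081 * b
    return (77 <= cb) & (cb <= 127) & (133 <= cr) & (cr <= 173)

def face_candidate(height, width, tol=0.25):
    """Geometric test: height/width ratio near the Golden Ratio of Eq. (1)."""
    return abs(height / width - GOLDEN) < tol

img = np.random.default_rng(0).random((64, 64, 3))   # stand-in video frame
mask = skin_mask_nrgb(img) | skin_mask_ycbcr(img)    # union over colour spaces
print(mask.mean(), face_candidate(34, 21))           # fraction skin; True
```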
Linear subspace methods suffer from a large computational load, which arises due to
the large dimensions of the eigen-subspace. The approximation component (cA) is
used as input to the LDA; this component, with reduced dimension, retains the high
intensity variance. Although this pre-processing stage has the overhead of wavelet
decomposition to three levels, it reduces the recognition time by a factor of 1/4^n,
where n is the depth of decomposition, owing to the reduced dimension of the Fisher
face space. The within-class scatter matrix S_W is given by

S_W = ∑_{i=1}^{C} ∑_{x_k ∈ X_i} (x_k − μ_i)(x_k − μ_i)^T . (2)
The between-class scatter matrix can be considered as the variance of each class
across all images and is given by

S_B = ∑_{i=1}^{C} N_i (μ_i − μ)(μ_i − μ)^T . (3)
The main objective of LDA is to maximize the ratio between S_B and S_W, known as
Fisher's criterion, given in (4); the problem reduces to solving the generalized
eigen-equation in (5):

W_opt = arg max_W ( |W^T S_B W| / |W^T S_W W| ) , (4)

S_B W = λ S_W W . (5)
where λ is a diagonal matrix containing the eigenvalues and W the corresponding
eigenvectors. S_W happens to be singular in almost all cases, because the number of
training images would have to be comparable to the size of the scatter matrices. The
singularity problem is solved by using the Direct Fractional-Step LDA [9].
Once the transformation matrix W_opt is calculated, the projection of the mean image
μ_i of each class onto the LDA subspace is calculated by

y_i = W_opt^T μ_i , (6)

and the set of class prototypes is collected as Ω = {y_1, y_2, …, y_C}. (7)
The highest Fisher faces obtained by applying LDA and fast LDA are shown in
Figs. 6 and 7. The proposed algorithm projects the mean image of each class onto the
projection space, rather than all the images in a class, as in (6). Let the test image vector
be x. The test image is projected onto the subspace by

y = W_opt^T x . (8)

The Euclidean distance between the test image and each class is calculated by

d_i = || y − y_i || . (9)
The class which has the least Euclidean measure with respect to the test image is
considered the match, and the associated videos are retrieved.
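A minimal sketch of this recognition stage, covering Eqs. (6)–(9), is given below. It assumes the projection matrix W_opt is already available (e.g., from the Direct Fractional-Step LDA of [9], which is not reproduced here), so only the projection and nearest-class matching are shown.

```python
# A minimal sketch of Eqs. (6)-(9): project class means and the test image,
# then match by the smallest Euclidean distance.
import numpy as np

def enroll(class_means, W_opt):
    """Eq. (6): one projected prototype per person."""
    return class_means @ W_opt                   # shape: (classes, lda_dims)

def identify(test_vec, prototypes, W_opt):
    y = test_vec @ W_opt                         # Eq. (8): project test image
    d = np.linalg.norm(prototypes - y, axis=1)   # Eq. (9)
    return d.argmin(), d.min()

rng = np.random.default_rng(0)
dim, n_classes, lda_dims = 256, 5, 4             # e.g. 16x16 cA faces (assumed)
W_opt = rng.standard_normal((dim, lda_dims))     # stand-in projection matrix
means = rng.standard_normal((n_classes, dim))    # stand-in class mean images
protos = enroll(means, W_opt)

test = means[2] + 0.01 * rng.standard_normal(dim)   # noisy image of person 2
print(identify(test, protos, W_opt))             # -> (2, small distance)
```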
5 Experimental Results
The face detection and recognition system has been tested with different video
streams. The video types include news and talk shows with transition effects such as
fade, abrupt transition, wipe, and dissolve. The proposed system focuses on improving
the retrieval time, to facilitate recognition in real time and for large databases. The
results presented in this section correspond to the implementation of the proposed
algorithms, and of the system as a whole, in MATLAB R2010 on an Intel Core 2 Duo
processor running Windows 7.
Fig. 8. Performance of the proposed fast LDA algorithm against LDA in a) recognition
time (seconds) and b) recognition rate (%), versus the number of persons in the database
Fig. 8 shows the performance of LDA and fast LDA in the retrieval phase over the face
databases, namely the ORL database, the Indian Face database [10], and the MUCT
database [11]. An integrated system for video indexing and retrieval was built with the
proposed enhancement, using MPEG video sequences at 30 frames/second. Table 2
gives the details of the comparative performance of the system.
The paper proposes the combined face detection and fast LDA methods, which
improve the recognition rate and reduce the retrieval time of videos based on face
recognition, making the system suitable for large databases. Further work in the same
direction includes analyzing methods for faster implementation of wavelet
decomposition, to reduce the extra overhead in the indexing phase.
References
1. Hu, W., Xie, N., Li, L., Zeng, X.: A Survey of Visual Content Video Indexing and
Retrieval. J. IEEE 41(6), 797–819 (2011)
2. Torres, L., Vila, J.: Automatic Face Recognition for Video Indexing Applications.
J. Pattern Recognition 35(3), 615–625 (2002)
3. Chellappa, R., Wilson, C.L., Sirohey, S.: Human and Machine Recognition of Faces: a
Survey. IEEE 83(5), 705–741 (1995)
4. Etemad, K., Chellappa, R.: Discriminant Analysis for Recognition of Human Face Images.
J. Optical Society of America 14(8), 1724–1733 (1997)
5. Todd Ogden, R.: Essential Wavelets for Statistical Applications and Data Analysis.
Birkhäuser, Boston (1997)
6. Monaco, J.: How to Read a Film: The Art, Technology, Language, History, and Theory of
Film and Media. Oxford University Press (1977)
7. Yusoff, Y., Christmas, W., Kittler, J.: Video Shot Cut Detection Using Adaptive
Thresholding. In: British Machine Vision Conference (2000)
8. Boreczky, J.S., Rowe, L.A.: Comparison of Video Shot Boundary Detection Techniques.
In: SPIE Conference on Video Database, pp. 170–179 (1996)
9. Lu, J., Plataniotis, K.N., Venetsanopoulos, A.N.: Face Recognition Using LDA Based
Algorithms. IEEE Transactions on Neural Networks 14(1), 195–200 (2003)
10. The Indian Face Database (2002),
http://vis-www.cs.umass.edu/~vidit/IndianFaceDatabase/
11. Milborrow, S., Morkel, J., Nicolls, F.: MUCT database. University of Capetown (2008)
12. Gonzalez, R.C., Woods, R.E., Eddins, S.L.: Digital Image Processing Using MATLAB.
Tata McGraw Hill, New Delhi (2011)
An Efficient Approach for Neural Network
Based Fingerprint Recognition by Using Core,
Delta, Ridge Bifurcation and Minutia
1 Introduction
A fingerprint is the feature pattern of one finger, and it is believed, with strong
evidence, that each fingerprint is unique. Each person has his own fingerprints with
permanent uniqueness, so fingerprints have been used for identification and forensic
investigation for a long time. A fingerprint is composed of many ridges and furrows,
which show good similarities in each small local window, such as parallelism and
average width. However, as intensive research on fingerprint recognition has shown,
fingerprints are distinguished not by their ridges and furrows but by minutiae, which
are abnormal points on the ridges. Among the variety of minutia types reported in the
literature, two are most significant and heavily used: one is called a termination, the
immediate ending of a ridge; the other is called a
bifurcation, which is the point on the ridge from which two branches derive. The
fingerprint recognition problem can be grouped into two sub-domains: fingerprint
verification and fingerprint identification. Fingerprint verification is verifying the
authenticity of one person by his fingerprint: the user provides his fingerprint together
with identity information such as an ID number; the fingerprint verification system
retrieves the fingerprint template according to the ID number and matches the template
with the fingerprint acquired in real time from the user. This is usually the underlying
design principle of an Automatic Fingerprint Authentication System (AFAS).
Fingerprint identification is specifying one person's identity by his fingerprint: without
knowledge of the person's identity, the fingerprint identification system tries to match
his fingerprint with those in the whole fingerprint database. It is especially useful in
criminal investigation cases, and it is the design principle of an Automatic Fingerprint
Identification System (AFIS).
4 Proposed Work
Fig. 1. Supervised neural network for recognition of one-to-one fingerprints. Fig. 2. Supervised recurrent neural network for recognition of one-to-many fingerprints.
Figure 1 shows a supervised neural network with an input layer, a hidden layer, and an
output layer. The input-layer neurons receive the fingerprint images, and the output
layer gives the recognition percentage.
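The paper does not give the network's code; the following is a minimal sketch of such a supervised network, assuming the fingerprint images are flattened into vectors. The layer sizes, learning rate, and toy training loop are illustrative assumptions, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SimpleFingerprintNet:
    """One hidden layer, as in Fig. 1: the input layer receives a
    flattened fingerprint image, the output is a recognition score."""
    def __init__(self, n_in, n_hidden):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, n_hidden)
        self.b2 = 0.0

    def forward(self, x):
        h = sigmoid(self.W1 @ x + self.b1)      # hidden layer
        return sigmoid(self.W2 @ h + self.b2)   # recognition score

    def train_step(self, x, target, lr=0.1):
        """One gradient-descent step on squared error."""
        h = sigmoid(self.W1 @ x + self.b1)
        y = sigmoid(self.W2 @ h + self.b2)
        dy = (y - target) * y * (1.0 - y)       # output delta
        dh = dy * self.W2 * h * (1.0 - h)       # hidden deltas
        self.W2 -= lr * dy * h
        self.b2 -= lr * dy
        self.W1 -= lr * np.outer(dh, x)
        self.b1 -= lr * dh

net = SimpleFingerprintNet(n_in=64 * 64, n_hidden=32)
x = rng.random(64 * 64)                         # stand-in for an image
for _ in range(20):
    net.train_step(x, target=1.0)
print("recognition score:", net.forward(x))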
[Fig. 3 flow: binary image → calculate ridge details and minutiae of the image → recognition → matching → matching score]
Fig. 3. Recognition procedure of the image. Fig. 4. Displaying the matching scores of two fingerprint images.
5 Experimental Results
We determine the recognition of the image, and after recognition we perform the
matching task. Figures 1 and 2 show the neural network models of the proposed work;
Fig. 3 shows the recognition procedure, and Fig. 4 displays the matching score of two
fingerprint images.

The accuracy and efficiency of our proposed work are most clearly seen in the graph
below, which represents the matching results for the minutia points, ridge bifurcations,
islands, deltas, and cores extracted from the input fingerprint images. Table 1 shows the
exact results after evaluation of the fingerprint images. The table and graph below show
the simulation results of the proposed work.
Table 1. Feature counts extracted from the fingerprint images

Image   Minutia Points   Ridge Bifurcation   Island   Delta   Core
A       29               18                  13       9       12
B       32               21                  19       7       8
C       37               13                  21       12      13
D       35               15                  17       6       17

Here A, B, C, and D denote four sample fingerprint images.
Table 2 shows the final percentage of matched fingerprints for the grey-scale images.
First, morphological operations remove noise and small objects from the fingerprint
images. Second, the image intensity is improved for better analysis. A threshold value
obtained with the graythresh function then yields a binary image. After this we obtain
information about the fingerprint image; by determining the pixel values of the image
together with their positions, we get the data shown in Table 1, and the correlation
between the pixels of two corresponding images can then be used to evaluate the
matched ratio.
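A rough Python analogue of this pipeline follows, assuming the scikit-image library (whose threshold_otsu plays the role of MATLAB's graythresh). The structuring-element size and min_size value are illustrative choices, not from the paper.

import numpy as np
from skimage import exposure, filters, morphology

def preprocess(gray):
    """Rough analogue of the steps described above."""
    cleaned = morphology.opening(gray, morphology.square(3))  # remove noise
    stretched = exposure.rescale_intensity(cleaned)           # improve intensity
    binary = stretched > filters.threshold_otsu(stretched)    # binarize
    return morphology.remove_small_objects(binary, min_size=20)

def matched_ratio(img_a, img_b):
    """Correlation between the pixels of two corresponding binary
    images, used here as the matching score."""
    a = preprocess(img_a).astype(float).ravel()
    b = preprocess(img_b).astype(float).ravel()
    return float(np.corrcoef(a, b)[0, 1])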
In this paper, we have developed a neural network based recognition method that is
very effective and efficient. It reduces the deficiencies of existing methods based on
minutiae, ridges, and correlation, and gives better results than each individual method.
In future work we will add further concepts, such as 2D cross-correlation, shape
descriptors, and moment invariants, to obtain more accurate results.
References
1. Zhao, Q., Zhang, D., Zhang, L., Luo, N.: Adaptive fingerprint pore modeling and extraction.
Pattern Recognition, 2833–2844 (2010)
2. Yang, S., Verbauwhede, I.M.: A Secure Fingerprint Matching Technique, California, USA
(2003)
3. Jordan, M.I., Bishop, C.M.: Neural Networks. CRC Press (1996)
4. Abraham, A.: Artificial Neural Networks. John Wiley & Sons, Ltd. (2005) ISBN: 0-470-
02143-8
5. Hassoun, M.H.: Fundamentals of Artificial Neural Networks. MIT Press (1995)
6. Zhao, Q., Zhang, L., Zhang, D., Luo, N.: Direct Pore Matching for Fingerprint Recognition.
In: Tistarelli, M., Nixon, M.S. (eds.) ICB 2009. LNCS, vol. 5558, pp. 597–606. Springer,
Heidelberg (2009)
7. Ravi, J., Raja, K.B., Venugopal, K.R.: Fingerprint Recognition Using Minutia Score
Matching, vol. 1(2), pp. 35–42 (2009)
8. Ito, K., Morita, A., Aoki, T., Higuchi, T., Nakajima, H., Kobayashi, K.: A Fingerprint
Recognition Algorithm Using Phase-Based Image Matching of Low Quality Fingerprints.
IEEE (2005)
Specification-Based Approach for Implementing
Atomic Read/Write Shared Memory in Mobile Ad Hoc
Networks Using Fuzzy Logic
Abstract. In this paper we propose an efficient fuzzy logic based solution for
specification and performance evaluation depending on the generation of fuzzy
rules. A new property matching mechanism is defined. A requirement with
attributes is handled in the following manner: the basic functionality is ensured
by matching property names according to the classical reading/writing strategy;
the preliminary solutions are then selected and ranked according to the degree
of attribute matching. Consequently, we describe the basic principles of the
proposed solutions and illustrate them by implementing atomic read/write
shared memory in a mobile ad hoc network. This is done with fuzzy logic,
which is considered a clear way to illustrate the results of this application in
distributed systems. The results are approximate, but they are very good and
consistent with the nature of this application.
1 Introduction
A software system is viewed as a set of components that are connected to each other
through connectors. A software component is an implementation of some
functionality, available under the condition of a certain contract, independently
deployable and subject to composition. In the specification approach, each component
has a set of logical points of interaction with its environment. The logic of a
component composition (the semantic part) is enforced through the checking of
component contracts. Components may be simple or composed [1] [11]. A simple
component is the basic unit of composition that is responsible for certain behavior.
Composed components introduce a grouping mechanism to create higher abstractions
and may have several inputs and outputs. Components are specified by means of their
provided and required properties. Properties in this specification approach are facts
known about the component. A property is a name from a domain vocabulary set and
may have refining sub-properties (which are also properties) or refining attributes that
are typed values [1]. The component contracts specify the services provided by the
component and their characteristics on one side and the obligations of client and
environment components on the other side. The provided services and their quality
depend on the services offered by other parties, being subject to a contract. A
component assembly is valid if the contracts of all individual components are respected. A
contract for a component is respected if all its required properties have found a match.
The criterion for a semantically correct component assembly is matching all required
properties with provided properties on every flow in the system [11]. In this
specification approach, it is not necessary that a requirement of a component is
matched by a component directly connected to it. It is sufficient that requirements are
matched by some components that are presented on the flow connected to the logical
point; these requirements are able to propagate.
2 Fuzzy Attributes
A property consists of a name describing functionality and of attributes that are either
typed values or fuzzy terms. The names used for the properties and for the attributes
are established through a domain-specific vocabulary [2][11]. Such a restriction is
necessary because a totally free-text specification makes retrieval difficult, producing
false-positive or false-negative matches due to the use of non-standard terminology
[2][11]. In this work, the domain-specific vocabulary must also describe the domains
of the fuzzy attributes (linguistic variables) for each property, as well as the
membership functions for the fuzzy terms. The membership functions of all linguistic
variables are taken to be of triangular shape, as shown in Fig. 1.

For each linguistic variable, first the number and the names of the terms of its
domain must be declared, and after that the values of the parameters a1, a2, …, an
must be specified.
[Fig. 1. Triangular membership functions over the domain of a linguistic variable, defined by the parameters a1, a2, a3, …, an]
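As an illustration of the triangular membership functions of Fig. 1, a minimal sketch follows. The function name, the example terms, and the parameter values are assumptions for illustration, not taken from the paper's vocabulary.

def triangular(x, left, peak, right):
    """Triangular membership function as in Fig. 1: rises linearly
    from 0 at `left` to 1 at `peak`, then falls to 0 at `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Illustrative terms for one linguistic variable; the parameter
# values are assumptions.
ack_status = {
    "almost_no_response": lambda x: triangular(x, -0.4, 0.0, 0.4),
    "no_change_needed":   lambda x: triangular(x, 0.2, 0.6, 1.0),
}
print(ack_status["almost_no_response"](0.1))   # membership degree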
In this paper, the GeoQuorums approach is presented for implementing atomic
read/write shared memory in mobile ad hoc networks. The approach is based on the
focal point object model: if a focal point becomes depopulated, the associated focal
point object fails [4]. (Note that it does not matter how a focal point becomes
depopulated, be it as a result of mobile nodes failing, leaving the area, or going to
sleep; any depopulation results in the focal point failing.) The GeoQuorums algorithm
implements an atomic read/write memory algorithm on top of this geographic
abstraction, that is, on top of the focal point object model. Nodes implementing the
atomic memory use a geocast service to communicate with the focal point objects. To
achieve fault tolerance and availability, the algorithm replicates the read/write shared
memory at a number of focal point objects. To maintain consistency, accessing the
shared memory requires updating certain sets of focal points known as quorums.
An important aspect of our approach is that the members of our quorums are focal
point objects, not mobile nodes [3][4]. The algorithm uses two sets of quorums:
(i) get-quorums and (ii) put-quorums, with the property that every get-quorum
intersects every put-quorum. There is no requirement that put-quorums intersect other
put-quorums, or that get-quorums intersect other get-quorums. The put/get quorums
implementing atomic read/write shared memory in mobile ad hoc networks are shown
in Fig. 2.

The use of quorums allows the algorithm to tolerate the failure of a limited number
of focal point objects. Our algorithm uses a Global Positioning System (GPS) time
service, allowing it to process write operations using a single phase; prior single-phase
write algorithms made other strong assumptions, relying either on synchrony or on
single writers [3][4]. The algorithm guarantees that all read operations complete
within two phases, but allows some reads to complete using a single phase.
The atomic memory algorithm flags the completion of a previous read or write
operation to avoid using additional phases, and propagates this information to the
various focal point objects [3][4]. As far as we know, this is an improvement on
previous quorum-based algorithms. For performance reasons, at different times it may
be desirable to use different sets of get-quorums and put-quorums.
Fig. 2. Put/get quorums implementing atomic read/write shared memory in mobile ad hoc
networks
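The quorum discipline described above can be sketched compactly. The following toy model is an illustrative assumption throughout: the focal point names, the quorum layout, and the timestamp handling are not the paper's code.

import itertools

focal_points = ["A", "B", "C", "D"]
put_quorums = [{"A", "B"}, {"C", "D"}]
get_quorums = [{"A", "C"}, {"A", "D"}, {"B", "C"}, {"B", "D"}]

# Every get-quorum must intersect every put-quorum.
assert all(g & p for g, p in itertools.product(get_quorums, put_quorums))

replica = {fp: (0, None) for fp in focal_points}  # (timestamp, value)

def write(value, ts):
    """Single-phase write: install (ts, value) at one put-quorum;
    a GPS time service supplies the monotonically increasing ts."""
    for fp in put_quorums[0]:
        if ts > replica[fp][0]:
            replica[fp] = (ts, value)

def read():
    """Read: collect from one get-quorum, take the freshest value,
    then write it back to a put-quorum (the second phase, which can
    be skipped when the value is known to be fully propagated)."""
    ts, val = max((replica[fp] for fp in get_quorums[0]),
                  key=lambda r: r[0])
    for fp in put_quorums[0]:
        if ts > replica[fp][0]:
            replica[fp] = (ts, val)
    return val

write("x1", ts=1)
print(read())   # -> x1, even though {"C", "D"} never saw the write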
[R4] If read/write_Ack_status = no_change_needed and occurrence = connect then decision = weak_select.

The rules generated for two different neighbours are:

[R5] If read/write_Ack_status = almost_no_response and occurrence = almost_no_connect then decision = weak_reject.
[R6] If read/write_Ack_status = ack_response and occurrence = almost_no_connect then decision = weak_reject.
[R7] If read/write_Ack_status = almost_no_response and occurrence = connect then decision = weak_reject.
[R8] If read/write_Ack_status = ack_response and occurrence = connect then decision = strong_select.
The method has been implemented in Java; the code consists of four phases: an
analysis phase, a specification phase, a design phase, and a test phase. This paper
proposes an efficient fuzzy logic based solution for specification and performance
evaluation depending on the generation of fuzzy rules. As shown in Fig. 3 to Fig. 6,
we discuss samples at instant values with a resulting controller output; the controller
samples several times each second, with a resulting "correction" output following
each sample. Thus, we introduce a specification approach for the GeoQuorums
approach for implementing atomic read/write shared memory in mobile ad hoc
networks, based on fuzzy logic. The advantages of this solution are a natural treatment
of certain non-functional attributes that cannot be exactly evaluated and specified, and
a relaxed matching of required/provided attributes that need not always be precise
(see Fig. 9). Figures 3 to 6 (a, b, c, d) illustrate how each of the generated rules is
composed with the fact represented by the specification of component c1 (with
read/write_Ack_rate = 0.1, 0.3, 0.4, 0.8 and occurrence = almost_no_connect).
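Where the figures show this composition graphically, the following sketch illustrates the underlying rule firing, assuming standard Mamdani-style min/max inference. The membership degrees below are illustrative assumptions, not values from the paper.

# Composing a fact with rules R4-R8 via min (AND) and max (aggregation).
rules = {
    ("no_change_needed",   "connect"):           "weak_select",    # R4
    ("almost_no_response", "almost_no_connect"): "weak_reject",    # R5
    ("ack_response",       "almost_no_connect"): "weak_reject",    # R6
    ("almost_no_response", "connect"):           "weak_reject",    # R7
    ("ack_response",       "connect"):           "strong_select",  # R8
}

# Fact: fuzzified degrees for read/write_Ack_rate = 0.1 and
# occurrence = almost_no_connect (illustrative numbers).
status_mu = {"no_change_needed": 0.1, "almost_no_response": 0.8,
             "ack_response": 0.0}
occurrence_mu = {"connect": 0.0, "almost_no_connect": 1.0}

decisions = {}
for (status, occ), decision in rules.items():
    strength = min(status_mu[status], occurrence_mu[occ])  # rule firing
    decisions[decision] = max(decisions.get(decision, 0.0), strength)
print(decisions)   # weak_reject dominates for this fact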
Fig. 3a. Rule: If read/write_Ack_status = almost_no_response and occurrence = almost_no_connect then decision = weak_reject. Facts: read/write_Ack_rate = 0.1.
Fig. 3b. Rule: If read/write_Ack_status = almost_no_response and occurrence = about_right then decision = weak_reject. Facts: read/write_Ack_rate = 0.1.
Fig. 3c. Rule: If read/write_Ack_status = no_change_needed and occurrence = almost_no_connect then decision = weak_reject. Facts: read/write_Ack_rate = 0.1.
Fig. 3d. Rule: If read/write_Ack_status = no_change_needed and occurrence = about_right then decision = weak_reject. Facts: read/write_Ack_rate = 0.1.
Fig. 4a. Rule: If read/write_Ack_status = almost_no_response and occurrence = almost_no_connect then decision = weak_reject. Facts: read/write_Ack_rate = 0.3.
Fig. 4b. Rule: If read/write_Ack_status = almost_no_response and occurrence = about_right then decision = weak_select. Facts: read/write_Ack_rate = 0.3.
Fig. 4c. Rule: If read/write_Ack_status = no_change_needed and occurrence = almost_no_connect then decision = weak_reject. Facts: read/write_Ack_rate = 0.3.
Fig. 4d. Rule: If read/write_Ack_status = no_change_needed and occurrence = about_right then decision = strong_select. Facts: read/write_Ack_rate = 0.3.
Fig. 5a. Rule: If read/write_Ack_status = almost_no_response and occurrence = almost_no_connect then decision = weak_reject. Facts: read/write_Ack_rate = 0.4.
Fig. 5b. Rule: If read/write_Ack_status = almost_no_response and occurrence = about_right then decision = weak_reject. Facts: read/write_Ack_rate = 0.4.
Fig. 5c. Rule: If read/write_Ack_status = no_change_needed and occurrence = almost_no_connect then decision = weak_reject. Facts: read/write_Ack_rate = 0.4.
Fig. 5d. Rule: If read/write_Ack_status = no_change_needed and occurrence = about_right then decision = strong_select. Facts: read/write_Ack_rate = 0.4.
Fig. 6a. Rule: If read/write_Ack_status = almost_no_response and occurrence = almost_no_connect then decision = weak_select. Facts: read/write_Ack_rate = 0.8.
Fig. 6b. Rule: If read/write_Ack_status = almost_no_response and occurrence = about_right then decision = weak_select. Facts: read/write_Ack_rate = 0.8.
Fig. 6c. Rule: If read/write_Ack_status = no_change_needed and occurrence = almost_no_connect then decision = weak_select. Facts: read/write_Ack_rate = 0.8.
Fig. 6d. Rule: If read/write_Ack_status = no_change_needed and occurrence = about_right then decision = weak_reject. Facts: read/write_Ack_rate = 0.8.
Fig. 7. First interface of the software development process. Fig. 8. The interface of the specification phase.
[Flow chart: X = 0; while X is not greater than 0.2, print "decision = weak_reject" and set X = X + 0.1]

Fig. 9. Flow chart of the specification phase for implementing atomic read/write shared
memory in mobile ad hoc networks
Consequently, the results of the previous figures (Fig. 3 to Fig. 6) are determined as
follows. The status of the read/write operation is always almost_no_response or
no_change_needed, and the future occurrence of the connection is always
almost_no_connect or about_right. In these figures we also have the fact that the
current occurrence is almost_no_connect, and according to the fuzzy logic we assume
values of X in the range 0 to 1. The resulting decision for the network connection is
either weak_reject or weak_select. We can also observe that when X is nearly 0.4, the
read/write_Ack_status is no_change_needed, and the current occurrence of the
connection is about_right, the decision may be strong_select; all possible output cases
are shown in Table 1. Snapshots of the GUI are shown in Fig. 7 and Fig. 8.
5 Conclusions
In this paper we introduced a specification-based GeoQuorums approach for
implementing atomic read/write shared memory in mobile ad hoc networks, based on
fuzzy logic. The advantages of this solution are a natural treatment of certain non-
functional attributes that cannot be exactly evaluated and specified, and a relaxed
matching of required/provided attributes that need not always be precise.
References
1. Bachman, F., Bass, L., Buhman, C., Comella-Dorda, S., Long, F., Robert, J., Seacord, R.,
Wallnau, K.: Technical concepts of component-based software engineering. Technical
Report CMU/SEI-2000-TR-008, Carnegie Mellon Software Engineering Institute (2000)
2. Cooper, K., Cangussu, J.W., Lin, R., Sankaranarayanan, G., Soundararadjane, R., Wong,
E.: An Empirical Study on the Specification and Selection of Components Using Fuzzy
Logic. In: Heineman, G.T., Crnković, I., Schmidt, H.W., Stafford, J.A., Ren, X.-M.,
Wallnau, K. (eds.) CBSE 2005. LNCS, vol. 3489, pp. 155–170. Springer, Heidelberg
(2005)
3. Dolev, S., Gilbert, S., Lynch, N.A., Shvartsman, A.A., Welch, J.L.: GeoQuorums:
Implementing Atomic Memory in Mobile Ad Hoc Networks. In: Proceedings of the
17th International Conference on Distributed Computing, pp. 306–319 (2005)
4. Haas, Z.J., Liang, B.: Ad Hoc Mobility Management with Uniform Quorum Systems.
IEEE/ACM Transactions on Networking 7(2), 228–240 (1999)
5. Koyuncu, M., Yazici, A.: A Fuzzy Knowledge-Based System for Intelligent Retrieval.
IEEE Transactions on Fuzzy Systems 13(3), 317–330 (2005)
6. Sora, I., Verbaeten, P., Berbers, Y.: A Description Language For Composable
Components. In: Pezzé, M. (ed.) FASE 2003. LNCS, vol. 2621, pp. 22–36. Springer,
Heidelberg (2003)
7. Şora, I., Creţu, V., Verbaeten, P., Berbers, Y.: Automating Decisions in Component
Composition Based on Propagation of Requirements. In: Wermelinger, M., Margaria-
Steffen, T. (eds.) FASE 2004. LNCS, vol. 2984, pp. 374–388. Springer, Heidelberg (2004)
376 S. El-etriby and R. Shihata
8. Sora, I., Cretu, V., Verbaeten, P., Berbers, Y.: Managing Variability of Self-customizable
Systems through Composable Components. Software Process Improvement and
Practice 10(1) (January 2005)
9. Szyperski, C.: Component Software: Beyond Object Oriented Programming. Addison
Wesley (2002)
10. Zhang, T., Benini, L., De Micheli, G.: Component Selection and Matching for IP-Based
Design. In: Proceedings of Conference on Design, Automation and Test in Europe
(DATE), Munich, Germany, pp. 40–46 (2001)
11. Şora, I., Todinca, D.: Specification-based Retrieval of Software Components through
Fuzzy Inference. Acta Polytechnica Hungarica 3(3) (2006)
12. Oliveira, R., Bernardo, L., Pinto, P.: Modeling delay on IEEE 802.11 MAC protocol for
unicast and broadcast non saturated traffic. In: Proc. WCNC 2007, pp. 463–467. IEEE
(2007)
13. Fehnker, A., Fruth, M., McIver, A.K.: Graphical Modelling for Simulation and Formal
Analysis of Wireless Network Protocols. In: Butler, M., Jones, C., Romanovsky, A.,
Troubitsyna, E. (eds.) Fault Tolerance. LNCS, vol. 5454, pp. 1–24. Springer, Heidelberg
(2009)
14. Ghassemi, F., Fokkink, W., Movaghar, A.: Restricted broadcast process theory. In: Proc.
SEFM 2008, pp. 345–354. IEEE (2008)
15. Ghassemi, F., Fokkink, W., Movaghar, A.: Equational Reasoning on Ad Hoc Networks.
In: Arbab, F., Sirjani, M. (eds.) FSEN 2009. LNCS, vol. 5961, pp. 113–128. Springer,
Heidelberg (2010)
16. Lin, T.: Mobile Ad-hoc Network Routing Protocols: Methodologies and Applications. PhD
thesis, Virginia Polytechnic Institute and State University (2004)
17. Tracy Camp, V.D., Boleng, J.: A survey of mobility models for ad hoc network research.
Wireless Communications and Mobile Computing 2, 483–502 (2002)
An Enhanced Scheme for Using Error Correction Codes
in ARQ Protocol
Prajit Paul, Asok Kumar, Mrinmoy Sarkar, and Anup Kumar Bhattacharjee
1 Introduction
To ensure the reliable delivery of packets over the error-prone wireless channel,
automatic repeat request (ARQ) protocols are employed to acknowledge correct
packet reception [1]. The principle behind the proposed protocols is intuitive:
multiple outstanding packets are sent by the base station without acknowledgment,
and the mobile node then acknowledges a group of packets with just a single
acknowledgment. Through simulation we evaluate the throughput for different
packets using the Space-Time Ring-Trellis Coded Modulation (ST-RTCM) code. To
resist errors, ARQ and FEC have been widely adopted - ARQ combats channel errors
through retransmission and FEC through redundancy [2].
2 Throughput Analysis
2.1 Previous Case
The throughput of all ARQ techniques depends on the average number n of times a
packet needs transmission (including retransmissions) for successful reception by the
receiver.
In normal stop-and-wait ARQ:

n_sw = 1 / (1 − P),     (1)

where P = packet error probability = 1 − (1 − α)^k, α is the bit error rate, and k is the
packet size.

In PRPC, all single-bit errors are corrected. The probability that a packet contains a
single-bit error is:

P1 = kC1 · α · (1 − α)^(k−1)     (2)

Thus the probability of a packet in error, excluding single-bit errors, is:

P − P1     (3)
In the previous scheme, when a negative acknowledgement is received, the transmitter
transmits two copies: one in PRPC mode and another in the original form of the
packet. Then, in MPC with PRPC, up to double-bit errors are corrected at the receiver.
The probability of a packet in error, excluding single-bit and double-bit errors, is:

P'' = P − P1 − P2     (4)

where P2 is the probability of a packet with a double-bit error:

P2 = kC2 · α² · (1 − α)^(k−2)     (5)
found to justify the idea in terms of throughput. The inquiry is an attempt to explore
this basic gap.
2.2.1 Modified MPC with PRPC over Conventional PRPC with ECC
We propose an MPC operation combined with PRPC, called modified MPC with
PRPC, which corrects both single-bit errors (with PRPC) and double-bit errors (with
MPC) at the receiver using the erroneous copies. In the proposed scheme, when a
negative acknowledgement is received, the transmitter transmits two copies: one in
PRPC mode and another in the original form of the packet. Then, in modified MPC
with PRPC, single-bit as well as double-bit errors are corrected at the receiver as
follows. If P1 and P2 are the single-bit and double-bit error probabilities, then the
probability that the receiver acknowledges without single-bit or double-bit errors is:

P1·P1 + P1·(1−P1) + P2·(1−P1) + P2·P1 + P2·P2 + P2·(1−P2) = P1 + 2·P2     (6)

Thus, when modified MPC with PRPC is implemented in a stop-and-wait protocol,
the average number of times n_mmpc a packet needs transmission (including
retransmissions) for successful delivery is:

n(MMPC+PRPC) = (P1 + 2·P2) + (P''/(1−P''))     (7)

The first part of the right-hand side of eq. (7) is for modified MPC with PRPC
correcting up to double-bit errors; the second part is for normal stop-and-wait ARQ
handling bit errors other than single-bit and double-bit errors. PRPC corrects all
single-bit errors in a packet; modified MPC with PRPC corrects double-bit as well as
single-bit errors.
The probability gain in correcting a packet by modified MPC with PRPC over
conventional PRPC is:

Gain_mpcprpc % = (P1 + 2·P2) / P1 × 100     (8)

The throughput of PRPC in normal S/W ARQ with single-bit error correction is:

2·P1 + (P'/(1 − P'))     (9)

where P' = P − P1. The first part is for PRPC correcting single-bit errors; the second
part is for normal S/W ARQ handling errors other than single-bit errors.

The coding efficiency is k / (k + c), where k is the packet size in bits and c is the
number of check bits. Here we use CRC-16 as the error detection code (EDC). The
throughput efficiency is then (throughput) × (coding efficiency), i.e.:

Throughput_eff,S/W = (2·P1 + (P'/(1 − P'))) × (k / (k + 16))     (10)

So, for this scheme in an S/W ARQ system, the throughput, from eq. (7), is:

n(MMPC+PRPC) = (P1 + 2·P2) + (P''/(1−P''))     (11)

The coding efficiency is again k / (k + c), where c is the number of check bits; in
eq. (10) we used CRC-16.
380 P. Paul et al.
Using the ST-RTCM (213 132/3) code [5] in eq. (11), the coding efficiency is
k / (k/132 × (213 − 132) + k + 1) = 132k / (213k + 132), so the throughput efficiency
of the modified MPC+PRPC scheme is:

Throughput_eff(MMPC+PRPC) = [(P1 + 2·P2) + (P''/(1−P''))] × (132k / (213k + 132))     (12)

Using the ST-RTCM (25 12/47) code [6] in eq. (11), the coding efficiency is
k / (k/12 × (25 − 12) + k + 1) = 12k / (25k + 12), so the throughput efficiency of the
modified MPC+PRPC scheme is:

Throughput_eff(MMPC+PRPC) = [(P1 + 2·P2) + (P''/(1−P''))] × (12k / (25k + 12))     (13)
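As a numerical check of eqs. (1) to (13), the following sketch evaluates the error probabilities and throughput-efficiency expressions for a given bit error rate and packet size. It is a direct transcription of the formulas above, not the authors' simulation code; the example BER and k follow Fig. 4.

from math import comb

def throughput_terms(alpha, k):
    """Evaluate the quantities of eqs. (1)-(13)."""
    P   = 1 - (1 - alpha) ** k                              # packet error prob.
    P1  = comb(k, 1) * alpha * (1 - alpha) ** (k - 1)       # eq. (2)
    P2  = comb(k, 2) * alpha ** 2 * (1 - alpha) ** (k - 2)  # eq. (5)
    Ppp = P - P1 - P2                                       # P'' of eq. (4)
    n_mmpc  = (P1 + 2 * P2) + Ppp / (1 - Ppp)               # eqs. (7)/(11)
    eff_213 = 132 * k / (213 * k + 132)   # ST-RTCM (213 132/3) coding eff.
    eff_25  = 12 * k / (25 * k + 12)      # ST-RTCM (25 12/47) coding eff.
    return n_mmpc * eff_213, n_mmpc * eff_25                # eqs. (12), (13)

print(throughput_terms(alpha=1e-5, k=10000))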
Simulation parameters

Simulation parameter   Value
Simulation time        10 s
Uplink frame           0.5 ms
Downlink frame         0.5 ms
Duplex mode            TDD
Application profile    CBR (100 bytes, 0.2 ms)
Agent profile          UDP
ARQ_WINDOW_SIZE        500
PHY rate               1 Mbps
Fig. 1. Throughput percent with respect to packet size k= 10000 Bits by using ST-RTCM (213
132/3), ST-RTCM (25 12/47) codes for BER=0.01
Fig. 2. Throughput percent with respect to packet size k= 1000000 Bits by using ST-RTCM
(213 132/3), ST-RTCM (25 12/47) codes for BER=0.01
Fig. 3. Throughput percent with respect to packet size k= 80000 Bits by using ST-RTCM (213
132/3), ST-RTCM (25 12/47) codes for BER=0.01
Fig. 4. Throughput percent with respect to packet size k= 10000000 Bits by using ST-RTCM
(213 132/3), ST-RTCM (25 12/47) codes for BER=0.00001
In Fig. 1 we see that, for a medium packet size (k = 10000 bits), the throughput
percentage is higher for normal S/W ARQ with CRC-16 than for the modified
MPC+PRPC scheme with the ST-RTCM (213 132/3) and ST-RTCM (25 12/47)
codes, although the ST-RTCM (213 132/3) code shows better performance than
ST-RTCM (25 12/47).
References
1. Redi, J., Petrioli, C., Chlamtac, I.: An Asymmetric, Dynamic, Energy-conserving ARQ
Protocol. In: Proceedings of the 49th Annual Vehicular Technology Conference, Houston,
Texas, May 16-20 (1999)
2. Lin, S., Costello, D.J., Miller, M.J.: Automatic repeat-request error control schemes. IEEE
Com. Mag. 22(12), 5–16 (1984)
3. Paul, P., Kumar, A., Roy, K.C.: Reliable Approach for ARQ Protocol on Communication
and Networks. In: NCIS 2010, C2_ES_0013, organized by MCIS, April 23-24, p. 31.
Manipal University, Manipal (2010)
4. Peterson, B.: Data Coding and Error Checking Techniques. Virtium Technology
5. Schlegel, C.B., Pérez, L.C.: Trellis and Turbo Coding. Wiley Publication
6. Carrasco, R.A., Johnston, M.: Non-Binary Error Control Coding for Wireless Communication and
Data Storage. Wiley Publication
An Adaptive Earley Algorithm for LTAG Based Parsing
Abstract. Among traditional parsing methods, Earley parsing is one of the best
parsers, implemented for both NLP and programming-language requirements.
Tree Adjoining Grammar (TAG) is more powerful than traditional CFG and is
suitable for representing the complex structure of natural languages. An improved
version, LTAG, has appropriate generative capacity and a strong linguistic
foundation. Here we introduce a new algorithm that adapts the Earley method to
LTAG, which yields the combined advantages of TAG and Earley parsing.
1 Introduction
Tree Adjoining Grammars are somewhat similar to context-free grammars, but the
elementary unit of rewriting is the tree rather than the symbol. Whereas context-free
grammars have rules for rewriting symbols as strings of other symbols, tree-adjoining
grammars have rules for rewriting the nodes of trees as other trees.
TAG has more generative capacity than CFG. For example, it can be shown that
L3 = {a^n b^n c^n} is not a context-free language, yet TAG can generate it; TAG can
even generate L4 = {a^n b^n c^n d^n}, so it is strictly more powerful than CFG. TAG is
thus a mildly context-sensitive formalism. On the other hand, L5 = {a^n b^n c^n d^n e^n}
is not a tree-adjoining language, but it is context-sensitive. So it follows that
L(CFG) < L(TAG) < L(CSG).
Definition 1 (Tree Adjoining Grammar): A TAG is a 5-tuple G = (VN, VT, S, I, A),
where VN is a finite set of non-terminal symbols, VT is a finite set of terminals,
S is a distinguished non-terminal, I is a finite set of trees called initial trees,
and A is a finite set of trees called auxiliary trees. The trees in I ∪ A are called
elementary trees.
In LTAG, each word is associated with a set of elementary trees. Each elementary
tree represents a possible tree structure for the word, and may have more than one
lexical item. There are two kinds of elementary trees, initial trees and auxiliary trees.
Elementary trees can be combined through two operations, substitution and
adjunction: the former is used to attach an initial tree, the latter to attach an auxiliary
tree.
The use of dots in LTAG is basically the same as that proposed by Earley (1970) for
his CFG algorithm, and we mimic the same idea here. A dot on the left side of a
non-terminal indicates that the tree has not been explored yet; a dot on the right side
indicates that all its children have already been explored.

Adjunction builds a new tree from an auxiliary tree β (with root/foot node X) and a
tree α (with an internal node X). The sub-tree at internal node X in α is excised and
replaced by β; the excised sub-tree is then attached to the foot node of β.

Substitution is most commonly applied to initial trees, but it may also be done at
frontier nodes of auxiliary and derived trees. Substitution takes place at non-terminal
nodes on the frontier of a tree (usually an initial tree): the node marked for
substitution is replaced by the tree to be substituted.
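A minimal sketch of these two combination operations on toy trees follows; the node representation and helper names are assumptions for illustration, not the paper's data structures.

class Node:
    def __init__(self, label, children=None, subst=False, foot=False):
        self.label = label
        self.children = children or []
        self.subst = subst   # frontier node marked for substitution
        self.foot = foot     # foot node of an auxiliary tree

def substitute(tree, initial):
    """Replace a frontier node marked for substitution (same label)."""
    for i, child in enumerate(tree.children):
        if child.subst and child.label == initial.label:
            tree.children[i] = initial
            return True
        if substitute(child, initial):
            return True
    return False

def adjoin(tree, aux, label):
    """Excise the subtree at an internal node labelled `label`, replace
    it with `aux`, and attach the excised subtree at aux's foot node."""
    for i, child in enumerate(tree.children):
        if child.label == label and child.children:
            foot = next(n for n in _walk(aux) if n.foot)
            foot.children = [child]
            tree.children[i] = aux
            return True
        if adjoin(child, aux, label):
            return True
    return False

def _walk(n):
    yield n
    for c in n.children:
        yield from _walk(c)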
4 Proposed Algorithm
The algorithm uses two basic data structures: the state and the state set.

Definition 2: A state s is defined as a 5-tuple [a, cur_it, pos, parent, lchild], where
a is the name of the dotted tree; cur_it is the address of the dotted element in tree a;
pos is the position of the dot; parent is the parent element of cur_it (ф for the start
node); and lchild is the left child of cur_it (ф for a leaf node).

A state set S is a set of states. The state sets are indexed by an integer: Si with i ∈ N.
The presence of a state in state set i means that the input string a1…ai has been
recognized. The algorithm for state set creation is:
Let G be an LTAG,
Let a1…an be the input string,
/* Push the initial (dummy) state (α0, s', L, ф, S) to state set 0 */
ENQUEUE((α0, s', L, ф, S), stateset 0)  {Dummy}
For i = 1 to LENGTH(sentence) do
    For each state in stateset i do
        If INCOMPLETE(state) and some operation is possible
            PREDICTOR(state)
        Else if INCOMPLETE(state) and no operation is possible
            SCANNER(state)
        Else
            COMPLETOR(state)
        End
    End
End
Algorithm Predictor
For each state with cur_it as root in stateset(i), and for
each GRAMMAR_RULE
    Case 1: dot is on the left side of a non-terminal
        If the NT is not a leaf
            ENQUEUE(tree, cur_it, L, P, lc)  {Predictor}
            /* Do the adjunction operation */
            /* Add all cur_it-rooted elements to stateset(i) */
            Move the dot to the immediate left child
        Else
            ENQUEUE(tree, cur_it, L, P, lc)  {Predictor}
            /* Substitution operation */
        End
    Case 2: dot is on the left side of a terminal
        ENQUEUE(tree, cur_it, R, P, ф)  {Predictor}
        /* Move the dot to the right of the terminal */
End
Algorithm Scanner
/* Increment the state set index */
For word(j) in the input sentence
    Find the elementary tree for the word
    ENQUEUE(tree, root, L, ф, lc)  {Scanner}
End
Algorithm Completer
For each state whose left tree and all children are explored
    Case 1: dot is on the right side of a non-terminal
        If a sibling exists
            ENQUEUE(tree, sibl, L, P, nlc)  {Completer}
            /* Move the dot to the left of the immediate sibling */
        Else
            ENQUEUE(tree, P, R, GP, cur_it)  {Completer}
            /* Move the dot to the right of the parent */
        End
    Case 2: dot is on the right of a terminal
        ENQUEUE(tree, root, R, GP, cur_it)  {Completer}
End
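A direct transcription of Definition 2's state and state-set structures might look as follows; the concrete names are illustrative.

from collections import namedtuple

# A state is the 5-tuple [a, cur_it, pos, parent, lchild] of Definition 2;
# state sets are indexed by an integer i.
State = namedtuple("State", "tree cur_it pos parent lchild")

statesets = [set()]          # S0, S1, ... grow as the input is scanned

def enqueue(state, i):
    """Add a state to state set i (duplicates are ignored)."""
    while len(statesets) <= i:
        statesets.append(set())
    statesets[i].add(state)

# The dummy initial state (alpha0, s', L, ф, S) of the algorithm above.
enqueue(State("alpha0", "s'", "L", None, "S"), 0)
print(statesets[0])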
4.1 Complexity
The basic idea and method of the proposed algorithm come from the Earley parsing
technique, and the average complexity of the proposed work is unchanged from that
of Earley parsing. Analysis shows O(|G|·n³) average time behaviour and O(|G|·n)
space, where |G| is the size of the input grammar and n the input length.
5 Conclusion
We have designed a new Earley-based parsing algorithm for LTAG. It works with
lower complexity than existing TAG parsers, it is easy to implement, and the complex
data structures of the existing Earley algorithm for TAG have been simplified. It
combines the advantages of both TAG and Earley parsing, and its worst-case
behaviour is also adaptable.
References
1. Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques, and Tools. Addison-Wesley
(2002)
2. Shen, L., Joshi, A.K.: Statistical LTAG Parsing. Ph.D. thesis, University of Pennsylvania
(2006)
3. Joshi, A.K., Schabes, Y.: Tree-adjoining grammars. In: Rozenberg, G., Salomaa, A. (eds.)
Handbook of Formal Languages, vol. 3, pp. 69–124. Springer (1997)
4. McDonald, R., Crammer, K., Pereira, F.: Online large-margin training of dependency pars-
ers. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Lin-
guistics, ACL (2005)
5. Shen, L., Joshi, A.K.: An SVM based voting algorithm with application to parse reranking.
In: Proceedings of the 7th Conference on Computational Natural Language Learning (2003)
6. Frost, R., Hafiz, R., Callaghan, P.: Modular and Efficient Top-Down Parsing for Ambiguous
Left-Recursive Grammars. In: ACL-SIGPARSE, 10th International Workshop on Parsing
Technologies (IWPT), pp. 109–120 (2007)
7. Chiang, D.: Statistical Parsing with an Automatically-Extracted Tree Adjoining Grammar.
In: Proceedings of the 38th Annual Meeting of the Association for Computational Linguis-
tics, ACL (2000)
A Survey on Single Channel Speech Separation
1 Introduction
In our daily life we hear sounds not in isolation but mixed with background noise that
depends on the environment, such as car noise, television noise, radio noise, and
crowd noise; this is known as the cocktail party effect. Humans have the capability to
recognize the target speech while eliminating the background noise, but a machine
captures the combination of several speech signals as a mixture that overlaps in time
and frequency. Single-channel speech separation means the separation of a specific,
required speech signal from a mixture of speech signals or from background noise,
where the speech mixture is captured by a single microphone. Single-channel speech
separation is also called a Multiple Input Single Output (MISO) system and is a
branch of the speech separation field.

The pioneering work on separating audio signals from background noise started in
1976, and the research continues to this day. The main goal of speech separation
research is to model the processes underlying the separation mechanism in humans
and to replicate it in machines. Various techniques have been proposed for single-
channel source separation, using either time-domain or frequency-domain techniques.
When the noisy speech is recorded in a strongly reverberant environment, the time-
domain method needs to learn many parameters and may suffer convergence
difficulty and a heavy computation load. The frequency domain method,
Frequency Domain Based Speech Separation System. Most algorithms that dealt
with this problem were based on masking, wherein unreliable frequency components
of the mixed-signal spectrogram were suppressed and the reliable components were
inverted to obtain the speech signal of the speaker of interest. Most techniques
estimated the mask in a binary fashion, resulting in a hard mask. In [4], Aarthi
estimated all the spectral components of the desired speaker and derived a soft mask
that weights the frequency sub-bands of the mixed signal. This algorithm was
computationally expensive but achieved better performance than that obtained with
hard binary masks. The computational complexity was reduced by deriving a soft
mask filter using minimum square error estimation of the log-spectral vectors of the
sources, which were modeled using the Gaussian composite source modeling
approach [5].
Separation using the instantaneous frequency of each narrow-band frequency
channel, obtained by short-time Fourier analysis based on cross-correlation, was
better than separation based on fundamental frequency alone [6]. Yun used both the
magnitude and phase information of the signal to extract and enhance the desired
speech signal from the mixed speech signals [7]. The complexity of the STFT was
reduced by using sinusoidal parameters composed of amplitude and frequency [8, 9],
and this algorithm worked independently of pitch estimation.
An adaptive time-frequency resolution approach for non-negative matrix
factorization was proposed by Serap [10] to improve the quality and intelligibility of
the separated sources. The degradation caused by non-negative matrix factorization
on a complex-valued short-time Fourier transform was reduced by incorporating
phase estimates via complex matrix factorization [11]. The speech mixture was
decomposed into a series of oscillatory components, and a sparse non-negative matrix
factorization was derived to estimate the spectral bases and temporal codes of the
sources; this methodology required no training knowledge for speech separation [12].
A least-squares fitting approach was proposed by Srikanth to model the speech
mixture as a sum of complex exponentials, which separated the participating speaker
streams rather than favoring the dominating speaker [13].
Amplitude Modulation Based Algorithm. The time-domain signal was first
transformed into a time-frequency representation by applying the short-time Fourier
transform. Then the instantaneous amplitude was calculated for each narrow-band
frequency bin. As the mixed speech signal had a great deal of overlap in the time
domain, modulation frequency analysis could provide a greater degree of separation
among sources [14]. A segment of the speech mixture was sparsely decomposed into
periodic signals with time-varying amplitude, each of which was a component of an
individual speaker. For speech separation, the authors of [15] used the K-means
clustering algorithm on the set of periodic signals. After the clustering, each cluster
was assigned to its corresponding speaker using codebooks containing spectral
features of the speakers, which could be done with low computational cost.
Pitch Tracking Based Algorithm. It was very natural to imagine that speech
separation could be accomplished by detecting the pitch of the mixed speech.
Generally speaking, pitch estimation could be done using either temporal, spectral, or
spectro-temporal methods (e.g., [16], [17], [18]). To identify the pitch contours of
each of several simultaneous speakers, comb filtering or other techniques could be
used to select the frequency components of the target speaker and suppress other
components from competing speakers.
The autocorrelation function of cochlear outputs was computed using dynamic
programming to estimate the dominant pitch. The components of dominating speaker
were removed and this process was repeated to retrieve the pitch values for the
weaker speaker [19]. Though simple and easy to implement, it did not lead to a
satisfactory reduction in word error rate. Other researchers ([20], [21]) had proposed
similar recursive cancellation algorithms in which the dominant pitch value was first
estimated and then removed so that a second pitch value could be calculated. All of
these algorithms were critically dependent on the performance of the first estimation
stage, and errors in the first pass usually led to errors in all subsequent passes. The
signal of the target speaker was separated from an interfering speaker by manually
masking out modulation spectral features of the interferer, but this algorithm needed a
estimated the pitch range in each frame of modulation spectrum of speech by
analyzing onsets and offsets. He filtered the mixture signal with a mask extracted
from the modulation spectrogram of mixture signal [23].
Hu estimated the multiple pitches present in the mixture simultaneously from the
speech signal and performed voiced/unvoiced decisions at the same time by
separating speech into low and high frequency segments [24]. For multi pitch
estimation, Michael Stark utilized the factorial HMM method. He modeled the vocal
tract filters either by vector quantization or by non negative matrix factorization for
the fast approximation of the likelihood computation [25]. The improvement in
speech quality was consistent with Ji's conclusion that long speech segments maintain
the temporal dynamics and speaker characteristics better than short segments [26]. In
a long-short frame associated harmonic model, the long frame can achieve high
harmonic resolution, while the short frame ensures the short-time stationarity of the
speech signal; the two are jointly used to improve the accuracy of multi-pitch
estimation [27].
3 Conclusion
This paper highlights the importance of single-channel speech separation systems,
critically reviews their growth over the last few decades, and raises awareness of the
challenges faced by researchers in the development of new theory and algorithms.
In separating speech signals, a priori knowledge of the underlying sources is used to
estimate the sources; hence a system should be designed to separate two sources
without prior knowledge of the source signals. The overall performance of a system
also degrades when it must be speaker independent. More accurate speaker separation
without cross-talk is still lacking, mainly because of the interference of unvoiced
segments in the target signal; hence algorithms must be designed to deal with
unvoiced segments. The long-short frame associated harmonic model can handle the
separation of unvoiced speech mixed with voiced speech by detecting the energy of
the high frequencies. However, if two unvoiced speech signals occur simultaneously,
it fails. Hence the proposed system is to combine the long-short frame associated
method with a clustering algorithm to handle the inharmonic structures of unvoiced
speech.
References
1. Parsons, T.W.: Separation of speech from interfering speech by means of harmonic
selection. J. Acoust. Soc. Am. 60(4), 911–918 (1976)
2. Weintraub, M.: A theory and computational model of Auditory Monaural Sound
Separation. Ph.D Thesis, Stanford University (1985)
3. Wang, D.L., Brown, G.J.: Computational Auditory Scene Analysis. John Wiley & Sons
(2006)
4. Reddy, A.M., Raj, B.: Soft Mask Methods for Single Channel Speaker Separation. IEEE
Tran. Audio, Speech, Lang. Process. 15(6), 1766–1776 (2007)
5. Radfar, M.H., Dansereau, R.M.: Single Channel Speech Separation Using Soft Mask
Filtering. IEEE Tran. Audio, Speech, Lang. Process. 15(8), 2299–2310 (2007)
6. Gu, L.: Single-Channel Speech Separation based on Instantaneous Frequency, Carnegie
Mellon University, Ph.D Thesis (2010)
7. Lee, Y.-K., Lee, I.S., Kwon, O.-W.: Single Channel Speech Separation Using Phase Based
Methods. IEEE Trans. Acoust., Speech, Signal Process. 56(4), 2453–2459 (2010)
8. Mowlaee, P., Christensen, M.G., Jensen, S.H.: New Results on Single-Channel Speech
Separation Using Sinusoidal Modeling. IEEE Tran. Audio, Speech, Lang. Process. 19(5),
1265–1277 (2011)
9. Mowlaee, P., Saeidi, R., Tan, Z.H., Christensen, M.G., Kinnunen, T.: Sinusoidal Approach
for the Single Channel Speech Separation and Recognition Challenge. In: Proc.
Interspeech, pp. 677–680 (2011)
10. Kırbız, S., Smaragdis, P.: An adaptive time-frequency resolution approach for non-
negative matrix factorization based single channel sound source separation. In: Proc. IEEE
Conference ICASSP, pp. 253–256 (2011)
11. King, B.J., Atlas, L.: Single-Channel Source Separation Using Complex Matrix
Factorization. IEEE Tran. Audio, Speech, Lang. Process. 19(8), 2591–2597 (2011)
12. Gao, B., Woo, W.L., Dlay, S.S.: Single-Channel Source Separation Using EMD-Subband
Variable Regularized Sparse Features. Tran. Audio, Speech, Lang. Process. 19(4), 961–
976 (2011)
13. Vishnubhotla, S., Espy-Wilson, C.Y.: An Algorithm For Speech Segregation of Co-
Channel Speech. In: Proc. IEEE Conference ICASSP, pp. 109–112 (2009)
14. Schimmel, S.M., Atlas, L.E., Nie, K.: Feasibility of single channel speaker separation
based on modulation frequency analysis. In: Proc. IEEE Conference ICASSP, pp. IV605–
IV608 (2007)
15. Nakashizuka, M., Okumura, H., Iiguni, Y.: Single Channel Speech Separation Using A
Sparse Periodic Decomposition. In: Proc. 17th European Signal Processing Conference
(EUSIPCO 2009), Glasgow, Scotland, pp. 218–222 (2009)
16. Bach, F., Jordan, M.: Discriminative training of hidden markov models for multiple pitch
tracking. In: Proc. of ICASSP, pp. v489–v492 (2005)
17. Charpentier, F.J.: Pitch detection using the short-term phase spectrum. In: Proc. of
ICASSP, pp. 113–116 (1986)
18. Rabiner, L.R., Schafer, R.W.: Digital processing of speech signals. Prentice-Hall,
Englewood (1993)
19. Weintraub, M.: A computational model for separating two simultaneous talkers. In: Proc.
of ICASSP, pp. 81–84 (1986)
20. de Cheveigne, A., Kawahara, H.: Multiple period estimation and pitch perception model.
Speech Communication 27(3-4), 175–185 (1999)
21. Barker, J., Coy, A., Ma, N., Cooke, M.: Recent advances in speech fragment decoding
techniques. In: Proc. of Interspeech, pp. 85–88 (2006)
22. Schimmel, S.M., Atlas, L.E., Nie, K.: Feasibility of Single Channel Speaker Separation
Based on Modulation Frequency Analysis. In: Proc. of ICASSP, pp. IV605–IV608 (2007)
23. Mahmoodzadeh, A., Abutalebi, H.R., Soltanian-Zadeh, H., Sheikhzadeh, H.: Single Channel
Speech Separation with a Frame-based Pitch Range Estimation Method in Modulation
Frequency. In: Proc. of IST, pp. 609–613 (2010)
24. Hu, G., Wang, D.L.: Monaural speech segregation based on pitch tracking and amplitude
modulation. IEEE Tran. on Neural Networks 15(5), 1135–1150 (2004)
25. Stark, M., Wohlmayr, M., Pernkopf, F.: Source–Filter-Based Single-Channel Speech
Separation Using Pitch Information. IEEE Trans. on Acoustics, Speech, Signal
Process. 19(2), 242–255 (2011)
26. Ji, M., Srinivasan, R., Crookes, D.: A corpus-based approach to speech enhancement from
nonstationary noise. In: Proc. of Interspeech, Makuhari, Chiba, Japan, pp. 1097–1100
(2010)
27. Huang, Q., Wang, D.: Single-channel speech separation based on long-short frame
associated harmonic model. Digital Signal Processing 21, 497–507 (2011)
28. Roweis, S.T.: One microphone source separation. In: Proc. of NIPS-13, pp. 793–799
(2001)
29. Roweis, S.T.: Factorial models and refiltering for speech separation and denoising. In:
Proc. Eurospeech, pp. 1009–1012 (2003)
30. Jang, G.J., Lee, T.W.: A maximum likelihood approach to single channel source
separation. Journal of Machine Learning Research 4(7-8), 1365–1392 (2004)
31. Bach, F., Jordan, M.I.: Blind one-microphone speech separation: A spectral learning
approach. Neural Info. Process. System, 65–72 (2005)
32. Jang, G.-J., Lee, T.-W., Oh, Y.-H.: Single channel Signal Separation Using Time-Domain
Basis Functions. IEEE Signal Processing Letters 10(6), 168–171 (2003)
33. Prendergast, G., Johnson, S.R., Green, G.G.R.: Extracting amplitude modulations from
speech in the time domain. Speech Communication 53, 903–913 (2011)
Wavelet Based Compression Techniques: A Survey
1 Introduction
Image compression algorithms are mainly used to reduce redundancy and
irrelevancy, and thus to store images efficiently. The main goal of image compression
is to achieve the best possible image quality at a given bit rate (compression rate).
Compression techniques are mainly categorized into two classes: lossless and lossy
compression [1]. Several techniques are used for compression: some apply to grey-
scale images, some to colour images, and some to both. There are various types of
coding techniques; here only the discrete wavelet transform based methods are
considered.

This section gives an idea of the basic techniques used in image processing and of
the types of compression techniques. In the following sections, Section 2 explains
transform coding, Section 3 compares the different techniques, Section 4 explains the
advantages of the DWT, and the conclusion is in Section 5.
In transform coding, pixel values are transformed from the spatial to the frequency
domain. The basic steps of transform coding are shown in Fig. 1. Here the discrete
wavelet transform (DWT) [2][3] based technique is described in terms of several
methods, and a comparison is made between the DWT methods.
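A minimal transform-coding sketch along these lines follows, assuming the PyWavelets library; hard thresholding of the smallest coefficients stands in for the quantization/encoding stage, and the wavelet, level, and keep fraction are illustrative choices.

import numpy as np
import pywt  # PyWavelets, assumed available

def dwt_compress(image, wavelet="haar", level=2, keep=0.05):
    """DWT -> discard small coefficients -> inverse DWT."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)
    # Keep only the largest `keep` fraction of coefficients.
    thresh = np.quantile(np.abs(arr), 1 - keep)
    arr = pywt.threshold(arr, thresh, mode="hard")
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs, wavelet)

image = np.random.rand(64, 64)        # stand-in for a grayscale image
reconstructed = dwt_compress(image)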
Several DWT-based techniques exist in image processing. This section gives the
basic idea, coding method, and features of methods such as Compression with
Reversible Wavelets (CREW), the Embedded Block Coder (EBCOT), and Resolution
Progressive Compression (RPC). These comparisons are based on [5][6].
Coding method: First the image is tiled into rectangular regions called "tile
components", whose size can be chosen by the user; choosing too small a tile size
reduces the compression efficiency. These tile components are coded separately.
omitting the unwanted layers. Here a bit-plane coder is used. Four types of bit-plane
coding primitives are used in the significance test: zero coding (ZC), run-length
coding (RLC), sign coding (SC), and magnitude refinement (MR).
The methods discussed above result in rate loss. The method developed by Wei
Liu [5] identified two types of rate loss in previous methods: image coding loss and
source coding loss. Resolution Progressive Compression (RPC) is a wavelet transform
based compression technique. To achieve progressive compression, the discrete
wavelet transform (DWT) coefficients are converted to sign-magnitude form and
encoded one bit plane at a time, starting with the most significant magnitude bit plane.
The wavelet decomposition uses a reversible transform, so lossless compression is
achieved when all subband bit planes are coded. Correlation between adjacent
coefficients in a subband is exploited via predictive coding and context modeling.
Coding method: The encoder receives the ciphertext and decomposes it into multiple
levels; Wei Liu's work uses a three-level decomposition. The encoder then encodes
each subband independently using Slepian-Wolf coding and transmits the encoded
bits from the lowest resolution to the highest. A context-adaptive interpolator (CAI)
is used for SI generation, using the four horizontal and vertical neighbours and the
four diagonal neighbours. A two-step interpolation is used: first, subimage 11 is
interpolated from subimage 00; then, after subimage 11 is decoded, subimages 00 and
11 are used to interpolate subimages 01 and 10. The CAI can be calculated according
to equation (1). The interpolated image, together with the secret key, is then used as
the side information to decode the next resolution level. This is repeated until the
whole image is decoded. Here the decorrelation task is shifted to the decoder.
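The paper's CAI equation (1) is not reproduced in the text above, so the following sketch uses a simple stand-in: each missing pixel is predicted as the mean of its four horizontal/vertical neighbours, following the two-step interpolation just described. The function name and this particular predictor are assumptions, not Wei Liu's actual interpolator.

import numpy as np

def interpolate_from_neighbours(known, mask):
    """Predict each pixel flagged in `mask` from the mean of its four
    horizontal/vertical neighbours; the result is used as side
    information (SI) for decoding the next resolution level."""
    padded = np.pad(known.astype(float), 1, mode="edge")
    up, down = padded[:-2, 1:-1], padded[2:, 1:-1]
    left, right = padded[1:-1, :-2], padded[1:-1, 2:]
    estimate = (up + down + left + right) / 4.0
    out = known.astype(float).copy()
    out[mask] = estimate[mask]
    return out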
Features of RPC:
(1) It efficiently exploits source dependency in an encrypted image.
(2) A reverse cryptosystem is used in this scheme, making the method more secure.

This section gave the basic idea, coding method, and features of the DWT-based
methods CREW, EBCOT, and RPC. Of these, the Resolution Progressive
Compression scheme provides the best features and performs most efficiently.
3 Performance Evaluation
The previous section discussed several DWT techniques; here certain performance
measures are used to compare them. For comparison, performance measures such as
compression ratio, peak signal-to-noise ratio (PSNR), and mean square error (MSE)
are used. The comparison is shown in Table 1. From this it is clear that the method
with the highest PSNR and lowest MSE values provides the best features.
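The three measures can be computed as follows (a peak value of 255 for 8-bit images is assumed):

import numpy as np

def mse(original, reconstructed):
    diff = original.astype(float) - reconstructed.astype(float)
    return float(np.mean(diff ** 2))

def psnr(original, reconstructed, peak=255.0):
    m = mse(original, reconstructed)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

def compression_ratio(raw_size_bytes, compressed_size_bytes):
    return raw_size_bytes / compressed_size_bytes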
4 Conclusion
References
1. Johnson, M., Ishwar, P., Prabhakaran, V., Schonberg, D., Ramchandran, K.: On
compressing Encrypted Data. IEEE Trans. Signal Processing 52, 2992–3006 (2004)
2. Rao, R., Bopardikar, A.S.: Wavelet Transforms-Introduction to theory applications. Pearson
Education, Asia (2004)
3. Rajesh Kumar, P., Prasad, G.V.H., Prema Kumar, M.: Discrete Cosine Transforms Discrete
Wavelet Transform: An Objective Comparison of Image Compression Techniques
4. Raja, S.P., Narayanan Prasanth, N., Arif Abdul Rahuman, S., Kurshid Jinna, S., Princess, S.P.:
Wavelet Based Image Compression: A Comparative Study. In: 2009 International Conference
on Advances in Computing, Control, and Telecommunication Technologies (2009)
5. Shukla, J., Alwani, M., Tiwari, A.K.: A Survey on Lossless Image Compression Methods.
IEEE (2010)
Practical Approaches for Image Encryption/Scrambling
Using 3D Arnolds Cat Map
Abstract. This paper is an exploratory study of the 3D Arnold's cat map. The
paper discusses Arnold's cat map and its 3D extension, the 3D Arnold's cat
map, in detail, and extends the idea of encryption/scrambling to colour images
by encrypting/scrambling the R, G, and B components. Experimental
implementations of two different 3D Arnold's cat maps proposed by different
authors are provided along with their results. The paper also discusses the
inverse ACM transformation used to recover the scrambled image.
Keywords: Image encryption, Arnolds Cat Map, 3D Arnolds Cat Map, Chaos.
1 Introduction
Chaotic encryption is a relatively new area in network security and cryptography that
is gaining widespread acceptance. Chaotic maps possess sensitivity to initial
conditions and ergodicity, which make them very desirable for encryption [4], and it
has been found that chaotic algorithms are faster than classical algorithms such as
DES, IDEA, and MD5 [1]. The chaotic Arnold's cat map (ACM) is generally used for
image scrambling. The ACM provides only scrambling of image pixels, which does
not give a desirable level of security, and it additionally contains a small number of
constants. To deal with these problems, higher-dimensional ACMs have been
proposed by many authors [2], [5], [6]. 3D ACMs are more secure because they
provide additional substitution apart from the scrambling of the image, and they also
contain more constants. The next section discusses Arnold's cat map and two
different 3D Arnold's cat maps in detail, along with their implementation strategies
and results. Lastly, a comparison between the ACM and the different 3D ACMs is
made to summarize the output.
Arnold’s cat map was invented by Russian mathematician Vladimir I. Arnold in the
x 1 p x
1960s. The ACM Transformation is given by y → q pq +1 y
mod N,
where N×N is the dimension of the image [4]. The above representation is equivalent to
$$\begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \bmod N,$$
where $p = 1$ and $q = 1$. Putting in different values of $p$ and $q$ gives variations of the Arnold's Cat Map transformation.
This ACM transformation moves the pixel at location $(x, y)$ to location $(x + y,\ x + 2y)$. The ACM is used for scrambling the image pixels before sending the image for encryption; this provides additional security. The intensity of the scrambled image remains exactly the same as that of the original image. In order to recover the image in its original form, we need to apply the inverse matrix for the same number of iterations. In this case the inverse ACM transformation is
$$\begin{pmatrix} x \\ y \end{pmatrix} \to \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \bmod N.$$
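A minimal sketch of the forward and inverse maps for the $p = q = 1$ case follows; the square integer image and the loop-based indexing are illustrative choices.

```python
import numpy as np

def acm(image, p=1, q=1, iterations=1):
    """Arnold's Cat Map scrambling of an N x N image:
    (x, y) -> (x + p*y, q*x + (p*q + 1)*y) mod N."""
    n = image.shape[0]
    out = image.copy()
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                scrambled[(x + p * y) % n, (q * x + (p * q + 1) * y) % n] = out[x, y]
        out = scrambled
    return out

def inverse_acm(image, iterations=1):
    """Inverse of the p = q = 1 map: (x, y) -> (2x - y, -x + y) mod N."""
    n = image.shape[0]
    out = image.copy()
    for _ in range(iterations):
        recovered = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                recovered[(2 * x - y) % n, (-x + y) % n] = out[x, y]
        out = recovered
    return out
```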
Figure 7 shows a 124×124 image after 15 iterations of ACM. The recovered image is not exactly the same as the original, but is closely similar to it.
The security of ACM depends entirely upon the values of $p$ and $q$: for different values of $p$ and $q$ the scrambling proportion of the same image is different, and a different number of iterations is needed to recover the image. The number of iterations (the period) required for recovery of the image appears random as $p$ and $q$ change. It has been found that this periodic behaviour of the ACM makes it weak for encryption.
2.3.1 Implementation1
This section discusses the 3D ACM of Hongjuan Liu et al. [2], which improves the ACM by introducing two new control parameters $c$ and $d$. The enhanced ACM is
$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & a & 0 \\ b & ab+1 & 0 \\ c & d & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \bmod N \;\; [3].$$
This 3D ACM performs a dual encryption: first it shuffles pixel positions using the regular ACM, and second it substitutes pixel values using the $z$ component. Using the ACM, the correlation among adjacent pixels can be disturbed completely. On the other hand, this 3D ACM can
substitute grey/colour values according to the positions and original grey/colour values of the pixels. The 3D ACM has been implemented for both colour and greyscale images; the following is the result for a colour image.
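A sketch of one iteration of this matrix on a greyscale image follows. The parameter values are illustrative, and, as in the matrix above, all three components are reduced modulo the image dimension N.

```python
import numpy as np

def acm3d_liu(image, a=1, b=1, c=1, d=1):
    """One iteration of the 3D ACM above on an N x N greyscale image:
    the position (x, y) is shuffled by the 2D sub-matrix while the pixel
    value z is substituted as z' = (c*x + d*y + z) mod N."""
    n = image.shape[0]
    out = np.empty_like(image)
    for x in range(n):
        for y in range(n):
            x2 = (x + a * y) % n
            y2 = (b * x + (a * b + 1) * y) % n
            out[x2, y2] = (c * x + d * y + int(image[x, y])) % n
    return out
```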
2.3.2 Implementation2
The second 3D ACM was proposed by Zhou Zhe et al. [5]; its defining equation is given in [6]. Here the state components lie in $[0, 2^N - 1]$ and the transformation matrix is an $m \times m$ matrix. Taking $m = 3$, we get the matrix given in [6].
Though the above 3D ACM is used as an S-box by its authors, we can also use it for scrambling of the image by multiplying the matrix with $(x, y, z)^T$, where $x$ and $y$ represent the pixel location and $z$ represents the intensity/colour value of the image pixel. Another way to use this map is to take the R, G, and B components of an image pixel as $x$, $y$, and $z$ and apply the 3D ACM. The security of this 3D ACM is substantial due to the presence of a large number of constant terms. The following are the results obtained by applying this 3D ACM to the R, G, and B components of an image. It is important to mention that, unlike the previous 3D ACM, this implementation does not scramble the image pixel positions; it only substitutes new values for the pixels' R, G, and B values. The following is the output of the above implementation.
Fig. 15. Intensity histogram before 3D ACM
Fig. 16. Intensity histogram after 3D ACM
Fig. 17. Intensity histogram of red colour
Fig. 18. Intensity histogram of green colour
Fig. 19. Intensity histogram of blue colour
Fig. 20. Histogram of Red, Green, Blue colour after 3D ACM
2.4 Summary
The following table compares ACM and 3D ACM according to certain parameters.
An image scrambled/encrypted using ACM can be recovered by applying the inverse ACM transformation for the same number of iterations. The image was scrambled for 15 iterations and the inverse ACM was then applied for 15 rounds, giving the following result.
Fig. 23. Original image Fig. 24. Scrambled image Fig. 25. Unscrambled image
From the results obtained, it is found that the original image can be recovered after applying the inverse transform, but with some loss of image data. It is also found that the more iterations of the ACM transformation are applied, the greater the loss of image precision after applying the inverse transformation to recover the image.
4 Conclusion
This paper has studied the Arnold's Cat Map and the 3D Arnold's Cat Map in detail. It has been observed that ACM can only shuffle pixel locations. The former implementation of the 3D ACM is provided with additional constants and is able to both shuffle and substitute pixel values. The latter implementation of the 3D ACM can be used in two ways: like the former implementation, or to substitute the values of the R, G, and B components of individual pixels. Hence the conclusion is that the 3D ACM is more secure than the ACM and can be successfully used in chaotic encryption algorithms. To overcome the periodic property of ACM and 3D ACM, we recommend the use of other 3D chaotic maps for encryption after encrypting with the 3D ACM. Applying both 3D ACMs in a cascaded manner can provide a very high level of security.
References
1. Bose, R., Banerjee, A.: Implementing Symmetric Cryptography Using Chaos Function. In:
Advanced Computing & Communication Conference (1999)
2. Liu, H., Zhu, Z., Jiang, H., Wang, B.: A Novel Image Encryption Algorithm Based on
Improved 3D Chaotic Cat Map. In: The 9th International Conference for Young Computer
Scientists (2008)
3. Huang, M.-Y., Huang, Y.-M., Wang, M.-S.: Image encryption algorithm based on chaotic
maps. In: Computer Symposium, ICS (2010)
4. Mingming, Z., Xiaojun, T.: A Multiple Chaotic Encryption Scheme for Image. In: 6th
International Conference on Wireless Communications Networking and Mobile Computing,
WiCOM (2010)
5. Zhe, Z., Haibing, Y., Yu, Z., Wenjie, P., Yunpeng, Z.: A Block Encryption Scheme Based
on 3D Chaotic Arnold Maps. In: International Asia Symposium on Intelligent Interaction
and Affective Computing (2009)
6. Senthil Arumuga, A., Kiruba Jothi, D.: Image Encryption Algorithm Based on Improved 3D Chaotic Cat Map. In: International Conference on Computational Intelligence and Computing Research, ICCIC (2010)
7. Frazier-Reed, T.: M.I.S.T.: Cat Map,
http://music.calarts.edu/~tcfr33/technology/catmapex.html
(retrieved October 21, 2008)
Communication Efficient Distributed Decentralized
Key Management Framework for Message
Authentication in VANET
1 Introduction
Vehicular ad-hoc networks (VANETs) are an emerging area of interest for the security community. The important components involved in VANETs are road-side units (RSUs) and on-board units (OBUs). RSUs are positioned at the sides of roads, at stop lights, and the like, while OBUs are mounted in vehicles. OBUs enable communication between vehicles and with RSUs.
A major application of this technology is the distribution of safety-related information, such as turn warnings, curve warnings, speed limit information, and other vital information between vehicles traveling over the road. In safety driving applications, each vehicle periodically broadcasts messages including its current position, direction,
and velocity, as well as road information such as traffic congestion. Since safety information may contribute to the survival of the humans driving the vehicles participating in a VANET, security is of crucial importance to the system.
Various types of services can be offered in a VANET to provide security. Confidentiality is not a primary concern here, but the safety information must be distributed only among a valid set of vehicles, so message authentication is the primary concern. Since this is a reliable group communication platform and each vehicle must be authenticated each time it starts exchanging messages, communication delay and computation overhead are the two issues to be considered: in safety driving applications, vehicles broadcast messages every 300 ms. Group key management protocols between groups of vehicles offer cost-effective authentication. Here, the computation overhead depends on how effectively the keys are managed in the cryptosystem.
First, the vehicles in a VANET are constantly roaming and are highly dynamic. Second, since this is a highly dynamic environment, the number of peers in a VANET can become very large; each vehicle receives a lot of data from nearby vehicles in a congested area. Third, modeling trust in a VANET environment is difficult, because a VANET is a decentralized, open system: there is no centralized infrastructure and peers may join and leave the network at any time.
2 Related Work
Using pseudonyms is one basic idea. The shortcoming of this approach is that it requires vehicles to store a large number of pseudonyms and certificates, and a revocation scheme for abrogating malicious vehicles is difficult to implement.
TESLA, a hash-based protocol, can be applied to reduce the computation overhead; however, malicious vehicles cannot be identified in this protocol.
All these studies assume a centralized key management scheme; hence a cost-effective decentralized distributed key management framework is proposed in this paper to offer authentication and achieve privacy between the vehicles in a VANET.
A protocol called the Group Secret Key Protocol (GSKP) is proposed for privacy preservation in VANETs. The proposed work avoids communication latency on a reliable communication platform without minimizing cryptographic operations. In reliable communication, communication latency increasingly dominates the key setup latency; hence the bottleneck is shifted from computation to communication latency. To reduce the communication latency, the number of cryptographic operations and the number of rounds should be significantly reduced. The proposed
scheme in this paper is efficient in avoiding such latency without minimizing cryptographic operations. GSKP is based on the formation of a virtual Skinny Tree (VST), which follows the approach that extends the 2-party Diffie-Hellman key exchange and supposes the formation of a secure group. This protocol involves the following computation and communication requirements: O(n) communication rounds and O(n) cryptographic operations [2] are necessary to establish a shared key in a group of 'n' members. This framework is extended to deal with dynamic groups in a communication-efficient manner for VANET.
In the proposed system, three types of entities are incorporated: authorities, Road Side Units (RSUs), and nodes. Authorities are responsible for key generation and malicious vehicle judgement; they have powerful firewalls and other security protections.
RSUs are deployed at the road sides and are in charge of key management in the proposed framework. Traffic lights or road signs can be used as RSUs after renovation. RSUs communicate with authorities through a wired network. It is assumed that a trusted platform module is equipped in each RSU; it can resist software attacks but not sophisticated hardware tampering. Nodes are ordinary vehicles on the road that can communicate with each other and with RSUs through radio. It is assumed that each vehicle is equipped with a GPS receiver using DGPS [1], with an accuracy on the order of centimeters, and an on-board unit (OBU) which is in charge of all communication and computation tasks. Nodes have the lowest security level.
Fig. 1 shows the Group Secret Key Protocol (GSKP), in which the entire group shares the same secret key, called the session-encrypting key (SEK).
The tree has two types of nodes: leaf and internal. Each leaf node is associated with a specific group member. An internal node IN(i) always has two children: another (lower) internal node IN(i-1) and a leaf node LN(i). Each leaf node LN(i) has a session random $r_i$ chosen and kept secret by $M_i$; its public version is $br_i = \alpha^{r_i} \bmod p$. Every internal node IN(j) has an associated secret key $k_j$ and a public instant key (bkey) $bk_j = \alpha^{k_j} \bmod p$ [2]. The secret key $k_i$ ($i > 1$) is the result of a Diffie-Hellman key agreement between the node's two children:
$$k_i = (bk_{i-1})^{r_i} \bmod p = (br_i)^{k_{i-1}} \bmod p = \alpha^{r_i k_{i-1}} \bmod p, \quad i > 1.$$
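A toy sketch of this chain of agreements follows. The group parameters and the seeding of $k_1$ with the first member's session random are illustrative assumptions; a real deployment would use a standardised large prime group.

```python
import secrets

P = 0xFFFFFFFFFFFFFFC5   # 2**64 - 59, a prime; toy modulus for illustration only
ALPHA = 5                # assumed generator, illustrative only

def gskp_chain(session_randoms):
    """Sketch of the VST key chain: each internal key k_i is the Diffie-Hellman
    agreement of the previous internal key k_{i-1} with the new leaf's session
    random r_i, i.e. k_i = (bk_{i-1})^{r_i} mod p = alpha^{r_i * k_{i-1}} mod p.
    Seeding k_1 with M1's random is an assumption of this sketch."""
    k = session_randoms[0]
    for r in session_randoms[1:]:
        bk = pow(ALPHA, k, P)          # public bkey of IN(i-1)
        k = pow(bk, r, P)              # k_i = (bk_{i-1})^{r_i} mod p
    return k                           # group session-encrypting key (SEK)

members = [secrets.randbelow(P - 2) + 1 for _ in range(4)]
sek = gskp_chain(members)
```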
The following membership changes are considered: Single member changes include
member join or leave, and multiple member changes include group merge and group
partition.
4 Protocol Discussion
In VST, the group key is calculated as a function of all the private keys of the vehicles and the session key of the individual user. The dynamicity of the proposed scheme shows whenever the instant key varies on the arrival or departure of a user, which prevents impersonation of the user.
As the newly generated key is derived from the current users' hidden keys, only the currently operating valid users can access it. Hence authentication is successfully ensured, leading to secure communication.
Whenever a node joins/leaves, the RSU has to perform (n+1)/(n-1) computations respectively, which keeps the computational complexity low. All the above points make our scheme robust for offering privacy and easy to verify.
5 Conclusion
In this paper, a novel dynamic decentralized distributed key management scheme based on group key agreement is proposed to provide privacy in VANETs. The protocol involves O(n) communication rounds and O(n) cryptographic operations to establish a GSK in a group of 'n' members, so it is communication efficient. The proposed design guarantees that RSUs distribute keys only to a valid set of vehicles, so it offers strong authentication with low computation complexity and communication delay.
References
1. Hao, Y., Cheng, Y., Zhou, C., Song, W.: A Distributed Key Management Framework with
Cooperative Message Authentication in VANETs. IEEE Journal on Selected Areas in
Communications 29(3) (March 2011)
2. Kim, Y., Perrig, A., Tsudik, G.: Group Key Agreement Efficient in Communication. IEEE Transactions on Computers (2004)
3. Langley, C., Lucas, R., Fu, H.: Key Management in Vehicular Ad-Hoc Networks
4. Raya, M., Hubaux, J.-P.: Securing vehicular ad hoc networks. Journal of Computer Securi-
ty 15(1), 39–68 (2007)
5. Freudiger, J., Raya, M., Feleghhazi, M., Papadimitratos, P., Hubaux, J.P.: Mix zones for lo-
cation privacy in vehicular networks. In: Proc. International Workshop on Wireless Net-
working for Intelligent Transportation Systems, Vancouver, British Columbia (August
2007)
6. Duraiswamy, K., Shantharajah, S.P.: Key Management and Distribution for Authenticating
Group Communication. IEEE Transactions on Computers (2006)
Graph Learning System for Automatic Image Annotation
1 Introduction
How to index and search digital image collections effectively and efficiently is an increasingly urgent research issue in the multimedia community. To support this, keywords describing the images are required to retrieve and rank them. Manual annotation is a direct way to obtain these keywords, but it is labor-intensive and error-prone. Thus automatic annotation of images has emerged as an important technique for efficient image search. An image annotation algorithm based on a graph is proposed, and, based on an analysis of the graph structure, a fast solution algorithm is derived.
2 Related Work
3 Proposed Approach
Images are segmented into different regions and features of each region are extracted. These details are appended as nodes to the graph constructed from the training set. The image is then annotated using a graph learning method. It is proposed to extract shape context features. The entire flow described above is modeled in Figure 1.
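The graph learning step can be illustrated with a hedged sketch of random walk with restart (RWR), the computation that the paper's IFRWR iterates; the affinity weights here are a placeholder for the paper's own edge definitions, which are not reproduced here.

```python
import numpy as np

def rwr_scores(W, seed_index, restart=0.15, iterations=100):
    """Random walk with restart over an affinity graph whose nodes mix image
    regions and annotation words. W is a symmetric non-negative affinity
    matrix; the restart vector points at the query image node. Scores that
    accumulate on word nodes rank the candidate annotations."""
    col_sums = W.sum(axis=0)
    A = W / np.where(col_sums == 0, 1, col_sums)   # column-stochastic transitions
    e = np.zeros(W.shape[0])
    e[seed_index] = 1.0
    p = e.copy()
    for _ in range(iterations):                    # iterative fixed-point solve
        p = (1 - restart) * (A @ p) + restart * e
    return p
```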
4 Experimental Analysis
In order to evaluate the proposed algorithm, a test was conducted on an image database of 500 images, of which 450 were used as training images and 50 as test images. It is observed that the IFRWR annotates with considerable accuracy. A comparison between the ground truth and the annotated results of the proposed system is shown in Fig. 3. This approach yields an average precision of 0.1501 and an average recall of 0.1892; the average F-score, calculated using formula (7), is estimated to be 0.1674.
$$F_{score} = \frac{2 \cdot precision \cdot recall}{precision + recall} \qquad (7)$$
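The reported numbers are consistent with this formula, as a quick check shows:

```python
precision, recall = 0.1501, 0.1892
f_score = 2 * precision * recall / (precision + recall)
print(round(f_score, 4))   # 0.1674, matching the reported value
```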
5 Conclusion
In the proposed method, an undirected graph is first employed to integrate the correlations among low-level features and words. Image annotation is then implemented by IFRWR, which addresses the scalability issue through its iterative nature. The experiments show satisfactory results for the proposed algorithm. Future work includes exploring Parallel RWR and Hierarchical RWR.
Usage of FPGA in Network Security
Abstract. This paper develops the RSA algorithm on an FPGA so that it can be used as a standard device in secured communication systems. The RSA algorithm is implemented on the FPGA with the help of VHDL and works in the radio frequency range to make the information safer. A simple nested-loop addition and subtraction scheme has been used to implement the RSA operation, resulting in less processing time and less space on the FPGA. The input to encryption is a statement or a file, and the same appears at decryption. The hardware design is targeted at a Xilinx Spartan 3E device. The RSA design uses approximately 1000 total equivalent gate counts and achieves a clock frequency of 50.00 MHz.
1 Introduction
The enormous advances in network technology have resulted in an amazing potential
for changing the way we communicate and do business over the Internet. However,
for transmitting confidential data, the cost-effectiveness and globalism provided by
the Internet are diminished by the main disadvantage of public networks: security
risks. As security plays a vital role in the communication channel, the development of
a new and efficient hardware security module has become the primary preference. A
vast number and wide variety of works have been done in this particular field of hardware implementation of the RSA algorithm. A hardware implementation of the RSA scheme has been proposed by Hani et al. [1], who use the Montgomery algorithm with modular multiplication and a systolic array architecture. Ibrahimy et al. [2] proposed implementing the RSA algorithm with a flexible key size, but they feed the input to the RSA encryption side directly in the form of binary values. This work approaches the hardware implementation of the RSA algorithm using the modular exponentiation operation. In this design, it is possible to change the RSA key size according to the application requirement, and the information can be taken either in the form of a statement or a file.
2 Design Methodology
An exceptional feature of the RSA algorithm [3] is that it allows most of the components used in encryption to be re-used in the decryption process, which minimizes the resulting hardware area. In RSA, a plaintext block M is encrypted to a ciphertext block C by C = M^e mod n, and the plaintext block is recovered by M = C^d mod n. RSA encryption and decryption are mutual inverses and commutative, as shown in these equations, due to the symmetry of modular arithmetic. One of the potential applications for which this RSA design has been targeted is secured data communication. In this application, the data input, either a statement or a file, is fed into the FPGA board directly via serial communication. The encryption module takes care of the security. The process at the receiving end is the same as at the sending end, except that the sequence of the modules is reversed. The RSA core covers both encryption and decryption.
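A toy round trip through these relations, with small illustrative primes rather than the 1024-bit keys targeted by the hardware, is shown below (Python 3.8+ for the modular inverse via pow):

```python
# Textbook RSA: C = M^e mod n and M = C^d mod n.
p, q = 61, 53
n = p * q                    # modulus, 3233
phi = (p - 1) * (q - 1)      # Euler totient, 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent, 2753

M = 65                       # plaintext block, must be less than n
C = pow(M, e, n)             # encryption
assert pow(C, d, n) == M     # decryption recovers the plaintext
```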
3 VHDL Modeling
VHDL (VHSIC Hardware Description Language) is a hardware description language used in electronic design automation to describe digital and mixed-signal systems such as field-programmable gate arrays and integrated circuits. The language is based on the IEEE 1164 standard. Here, the whole system is designed in VHDL. The RTL model of this approach is shown in Figure 1.
6 Conclusion
The primary goal of this research project is to develop the RSA algorithm on an FPGA that can provide a significant level of security while also being fast. The maximum bit length for both the public and private keys is 1024 bits. Besides the security issue, another major concern of this project is to process data or files as input. The VHDL language provides a useful means of implementing the algorithm without actually drawing a large number of logic gates. Although the current key size of this RSA implementation provides a sufficient amount of security, a larger key size can always ensure better security.
References
1. Hani, M.K., Lin, T.S., Shaikh-Husin, N.: FPGA Implementation of RSA Public Key Cryp-
tographic Coprocessor. In: TENCON, Kuala Lumpur, Malaysia, vol. 3, pp. 6–11 (2000)
2. Ibrahimy, M.I., Reaz, M.B.I., Asaduzzaman, K., Hussain, S.: FPGA Implementation of
RSA Encryption Engine with Flexible Key Size. International Journal of Communica-
tions 1(3) (2007)
3. Rivest, R., Shamir, A., Adleman, L.: A Method for Obtaining Digital Signatures and Public
Key Cryptosystems. Communications of the ACM 21(2), 120–126 (1978)
4. Mazzeo, A.: FPGA Based Implementation of a Serial RSA Processor: Design, Automation
and Test in Europe, vol. 1. IEEE Computer Society, Washington, DC (2003)
5. Ghayoula, R., Hajlaoui, E.A.: FPGA Implementation of RSA Cryptosystem. International
Journal of Engineering and Applied Sciences 2(3), 114–118 (2006)
6. Senthil Kumar, M., Rajalakshmi, S.: Effective Implementation of Network Security using
FPGA. In: Recent Issues in Network Galaxy (RING 2011), Chennai, India, pp. 52–56
(2011)
Novel Relevance Model for Sentiment Classification
Based on Collision Theory
1 Introduction
2 Related Work
The authors of [6] analyzed the effectiveness of machine learning methods, viz. Naïve Bayes, Maximum Entropy, and Support Vector Machines (SVM), for sentiment classification. A term-count based method that exploits negations, intensifiers, and diminishers for sentiment classification is explained in [7]. A similarity-based approach for separating factual sentences from opinion-bearing sentences is proposed and discussed in [8]. The authors of [9] give a detailed account of four related problems in opinion mining, viz. subjectivity classification, word sentiment classification, document sentiment classification, and opinion extraction. The role of polarity shifting in sentiment classification is discussed in [10]. Models inspired by concepts in physics, such as Quantum Theory [11], [12] and the Theory of Gravitation [13], have been effectively applied in Information Retrieval.
$$\text{Bound-Bound transition} = \frac{(nv\_high \times e^{nv\_low}) + HV(P\_terms)}{Avg\_distance[(feature,\ polarity\_term1),\ (feature,\ polarity\_term2)]}$$
pv_low – polarity weight of the positive polarity term having the lower weight in a transition.
pv_high – polarity weight of the positive polarity term having the higher weight in a transition.
nv_low – polarity weight of the negative polarity term having the lower weight in a transition.
nv_high – polarity weight of the negative polarity term having the higher weight in a transition.
The distance between the features and polarity terms is calculated by counting the number of nouns and verbs between the polarity term and the feature(s) in a sentence. Each polarity term is reduced to half its value in successive free-free and bound-bound transitions until the half-value of the previous polarity terms becomes less than both polarity values in the current transition. The polarity terms on either side of the features are considered in the distance measure used in these transitions, as shown below.
The collision score of the overall review combines the effect of the individual collision scores of all features as given below:
RS = Positive if FS1 + FS2 + … + FSn > 0
RS = Negative if FS1 + FS2 + … + FSn < 0
where RS is the sentiment of the review.
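A direct transcription of this combination rule follows; the tie-handling for a zero sum is an assumption, as the text leaves that case unspecified.

```python
def review_sentiment(feature_scores):
    """Combine per-feature collision scores FS1..FSn into the review label.
    A zero sum is labelled Undecided here (an assumption of this sketch)."""
    total = sum(feature_scores)
    if total > 0:
        return "Positive"
    if total < 0:
        return "Negative"
    return "Undecided"

print(review_sentiment([0.8, -0.3, 0.1]))  # Positive
```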
4 Evaluation
We have used the dataset containing four products provided by [15] for our experiments. To evaluate the effectiveness of our model, we use the accuracy measure. The term-count method, where polarity lists are built as shown in [14], has been successfully applied to sentiment classification; hence, we compare the performance of our approach with the term-count based method. The classification results are shown in Table 1.
Table 1. Comparison of accuracies for four categories using term count method and Collision
Theory based model
We can observe that for the kitchen and books categories, the accuracies of both positive and negative reviews outperform those of the term-count based method. In the electronics and DVD categories, the accuracies of positive reviews are marginally lower; however, the results for negative reviews are better than with the term-count method. Overall, our approach gives better results in 6 out of 8 categories used in the evaluation.
5 Conclusion
In this paper, we have proposed and tested the effectiveness of a Collision Theory inspired model of relevance calculation for sentiment classification. The distribution of positive and negative polarity terms is analyzed using three types of transitions, and the sentiment of the review is determined by the difference between positive and negative collisions. The advantages of the collision model over the conventional relevance method are evident from the results of our approach.
References
1. Mizzaro, S.: How many relevances in information retrieval? Interacting with Computers,
303–320 (1998)
2. Metzler, D., Dumais, S.T., Meek, C.: Similarity Measures for Short Segments of Text. In:
Amati, G., Carpineto, C., Romano, G. (eds.) ECIR 2007. LNCS, vol. 4425, pp. 16–27.
Springer, Heidelberg (2007)
3. Zhang, M., Ye, X.: A Generation Model to Unify Topic Relevance and Lexicon-based
Sentiment for Opinion Retrieval. In: Proceedings of 31st Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval, pp. 411–418
(2008)
4. Harwit, M.: Astrophysical Concepts, 4th edn. Springer (2006)
5. Murugeshan, M.S., Mukherjee, S.: A Collision Theory Inspired Model for Categorization
of Wikipedia Documents. European Journal of Scientific Research 56(3), 396–403 (2011)
6. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment Classification using Machine
Learning Techniques. In: Proceedings of the Conference on Empirical Methods in Natural
Language Processing (EMNLP 2002), pp. 79–86 (2002)
7. Kennedy, A., Inkpen, D.: Sentiment Classification of Movie Reviews Using Contextual
Valence Shifters. Computational Intelligence 22(2), 110–125 (2006)
8. Yu, H., Hatzivassiloglou, V.: Towards Answering Opinion Questions: Separating Facts
from Opinions and Identifying the Polarity of Opinion Sentences. In: Proceedings of the
Conference on Empirical Methods in Natural Language Processing (EMNLP 2003), pp.
129–136 (2003)
9. Tang, H., Tan, S., Cheng, X.: A survey on sentiment detection of reviews. Expert Syst.
Appl. 36(7), 10760–10773 (2009)
10. Li, S., Lee, S.Y.M., Chen, Y., Huang, C., Zhou, G.: Sentiment Classification and Polarity
Shifting. In: Proceedings of the 23rd International Conference on Computational
Linguistics (COLING 2010), pp. 635–643 (2010)
11. van Rijsbergen, C.J.: The Geometry of Information Retrieval. Cambridge University Press,
New York (2004)
12. Piwowarski, B., Lalmas, M.: A Quantum-Based Model for Interactive Information
Retrieval. In: Azzopardi, L., Kazai, G., Robertson, S., Rüger, S., Shokouhi, M., Song, D.,
Yilmaz, E. (eds.) ICTIR 2009. LNCS, vol. 5766, pp. 224–231. Springer, Heidelberg
(2009)
13. Shi, S., Wen, J., Yu, Q., Song, R., Ma, W.: Gravitation-Based Model for Information
Retrieval. In: Proceedings of the 28th International Conference on Research and
Development in Information Retrieval (SIGIR 2005), pp. 488–495 (2005)
14. Murugeshan, M.S., Sampath, A., Ahmed, F., Ashok, B., Mukherjee, S.: Effect of
Modifiers for Sentiment classification of Reviews. In: Proceedings of the 6th International
Conference on Natural Language Processing (ICON 2008), pp. 157–164 (2008)
15. Blitzer, J., Dredze, M., Pereira, F.: Biographies, Bollywood, Boom-boxes and Blenders:
Domain Adaptation for Sentiment Classification. In: Proceedings of the Association of
Computational Linguistics (ACL), pp. 440–447 (2007)
Comparative Study of Crosstalk Reduction Techniques
for Parallel Microstriplines
1 Introduction
Microstrip lines are widely used for chip-to-chip interconnects on printed circuit boards (PCBs), mainly for their low cost. In two parallel microstrip lines, a large impulse-type far-end crosstalk voltage appears at one end of the victim line when a digital signal is applied at the opposite end of the aggressor line. This far-end crosstalk voltage is induced by the difference between the capacitive and inductive coupling ratios of the two microstriplines [1]. Although no far-end crosstalk is induced in striplines, striplines are more costly than microstrip lines because they need more PCB layers. To reduce the far-end crosstalk in microstriplines, extra dielectric material can be deposited over the microstrip lines, but this extra material deposition is a cost-adding process. Another method of reducing far-end crosstalk is widening the spacing between the lines, but this increases the PCB routing area. A newer method is to place a guard trace with vias between the parallel microstrip lines; this solution has been adopted in many applications, and many PCB designers use a guard trace with vias to reduce coupling. However, this via-stitched guard imposes restrictions on backside PCB routing due to the via holes. In this work, a guard trace with a serpentine form is proposed to reduce crosstalk effectively, and this paper also presents a comparative study of the guard trace with vias and the serpentine trace.
2 Parallel Microstripline
Fig. 1 shows the cross section of a coupled microstrip line pair in an inhomogeneous medium with the top side exposed to air. In the geometry of the microstripline, W represents the width of the conducting strip, h is the substrate height, t is the thickness of the conductor, s is the spacing between the two microstriplines, and l is the length of the microstripline. An isolated transmission line can be modeled by a uniformly distributed self capacitance (Cs) and self inductance (Ls). A pair of coupled transmission lines can be modeled by a uniformly distributed mutual capacitance (Cm) and mutual inductance (Lm) in addition to the self capacitance (Cs) and the self inductance (Ls) [3], as shown in Fig. 2.
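As a rough illustration of the mechanism described in the introduction, the first-order far-end coupling imbalance can be sketched as follows. The function name and example values are hypothetical, and in a real design Cm, Cs, Lm, and Ls would be extracted from a field solver.

```python
def fext_imbalance(Cm, Cs, Lm, Ls):
    """Far-end crosstalk is driven by the mismatch between the inductive and
    capacitive coupling ratios; this returns the dimensionless imbalance term
    (Lm/Ls - Cm/Cs)/2 that appears in the standard first-order FEXT
    approximation. Treat this as a sketch, not a full crosstalk model."""
    return 0.5 * (Lm / Ls - Cm / Cs)

# Homogeneous medium (stripline): the ratios match, so no far-end crosstalk.
print(fext_imbalance(Cm=0.05, Cs=1.0, Lm=0.05, Ls=1.0))   # 0.0
# Microstrip in an inhomogeneous medium: the ratios differ, so FEXT appears.
print(fext_imbalance(Cm=0.04, Cs=1.0, Lm=0.06, Ls=1.0))   # 0.01
```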
3 Proposed Structure
In this paper, work is focused on how to reduce the coupling strength between parallel microstrip lines. Fig. 3(a) shows the schematic diagram of parallel microstrip lines without a guard trace. In the simulation experiment, the parallel microstrip line structure is treated as a four-port symmetrical network, and the structure with a guard trace is treated as such a network as well. Fig. 3 gives the various parallel microstripline structures.
Fig. 3. Schematic diagram of (a) parallel microstriplines, (b) parallel microstriplines with guard trace, (c) guard trace with vias, (d) parallel microstriplines with serpentine trace, (e) serpentine guard trace with vias
The parameters of the simulated structures are as follows: the two microstrip lines are 21.5234 mm long and 1.809 mm wide, and the spacing between the microstriplines is 1.5 mm. The width of the guard trace is 0.5 mm, the distance between the transmission line and the guard trace is 0.5 mm, and the diameter of the vias is 0.4 mm; the distribution of the vias in the guard trace is always homogeneous, and the length of the guard trace is the same as that of the microstripline. Each port is matched with a 50 Ω resistance. The proposed serpentine trace is 0.5 mm wide and 22 mm long, and the spacing between the microstripline and the serpentine trace is 0.5 mm. The simulation uses the momentum method over the frequency range 1-10 GHz.
Fig. 5. Simulation results without guard, with guard, and with guard and vias (|S13| in dB versus frequency, 1-10 GHz)
Fig. 6. Simulation results of the serpentine trace and the serpentine trace with vias (|S13| in dB versus frequency, 1-10 GHz)
Fig. 5 and Fig. 6 show the simulation results obtained using the simulation tool ADS. A parallel microstrip line with a guard trace is treated as a four-port symmetrical network: port 1 is the input port, port 2 is the cutoff port, port 3 is the coupling port, and port 4 is the transmission port. Because of the symmetry, S13 = S31. The results are obtained in terms of S-parameters, from which the coupling between the lines is analyzed; S13 is therefore the main parameter for analysis. When the spacing is 1.5 mm without a guard trace, the coupling degree increases from -30 dB at 1 GHz to -10 dB at 10 GHz. If a suspended guard trace is placed between the two microstrip lines, the coupling intensity increases at some frequencies, because resonance is generated in the guard trace at those frequencies: a standing wave forms in the guard trace, so the coupling between the parallel lines increases. Table 2 gives the comparison of coupling strength for the different microstripline structures.
5 Conclusion
From the simulation result, guard trace with vias is helpful to decrease the coupling
intensity. Serpentine guard trace reduces the far end crosstalk. Guard trace with vias is
a good solution. The same result is achieved in serpentine gurad trace with less
number of vias, compared to the conventional guard trace. So, serpentine trace helps
to reduce the coupling strength. This work may extended to design a optimum
dimensions of microstriplinewith serpentine trace by using Particle swarm
Optimization.
References
1. Lee, K., Jung, H.-K., Chi, H., Kwon, H.J., Sim, J.-Y., Park, H.J.: Serpentine Microstrip Lines with Zero Far-End Crosstalk for Parallel High-Speed DRAM Interfaces. Proceedings of the IEEE 33(2), 552–558 (2010)
2. Lee, K., Lee, H.B., Jung, H.-K., Sim, J.Y., Park, H.J.: A Serpentine Guard Trace to Reduce
the Far End Crosstalk Voltage and the Crosstalk Induced Timing Jitter of Parallel
MicrostripLines. Proceedings of the IEEE 31(4), 809–817 (2008)
3. Sohn, Y.S., Lee, J.C., Park, H.J.: Empirical equations on electrical parameters of coupled
microstrip lines for crosstalk estimation in printed circuit board. IEEE Trans. Adv.
Packag. 24(4), 521–527 (2001)
4. Li, Z., Wang, Q., Shi, C.: Application of Guard Traces with Vias in the RF PCB Layout.
Proceedings of the IEEE (2002)
5. Lee, K., Lee, H.-B., Jung, H.-K., Sim, J.-Y., Park, H.-J.: Serpentine guard trace to reduce
far-end crosstalk and even-odd mode velocity mismatch of microstrip lines by more than
40%. In: Electron. Compon. Technol. Conf., Reno, NV, pp. 329–332 (2007)
6. Lee, H.-B., Lee, K., Jung, H.-K., Park, H.-J.: Extraction of LRGCmatrices For 8-coupled
uniform lossy transmission lines using 2-port VNA measurements. IEICE Trans. Electron.
E89-C(3), 410–419 (2006)
An Adaptive Call Admission Control in WiMAX
Networks with Fair Trade off Analysis
1 Introduction
sustained traffic rate (MSTR) and minimum reserved traffic rate (MRTR). In [6], the authors proposed a CAC policy called adaptive bandwidth degradation CAC (ABDCAC) that provides bandwidth and delay guarantees and improves the bandwidth utilization of the system compared to previous schemes. But by giving priority to handoff calls, the HCDP is reduced while the NCBP increases. It may not be fair to block new originating calls within the cell while admitting more handoff calls from neighboring cells; therefore a fair trade-off between HCDP and NCBP is necessary. Some recent studies on CAC schemes can be found in [7-9]. The authors in [7] proposed fuzzy-logic based partitioning of the bandwidth; however, its bandwidth utilization is poor compared to the scheme in [6].
In this paper, the algorithm in [6] is extended to achieve a fair trade-off between HCDP and NCBP. The proposed algorithm is called ABDCAC with fairness.
2 Proposed Algorithm
In this section we present the algorithm for extended ABDCAC with fairness. The arrival processes of the handoff and newly originated UGS, rtPS, and nrtPS connections are assumed Poisson with rates $\lambda_{hu}$, $\lambda_{hr}$, $\lambda_{hn}$, $\lambda_{nu}$, $\lambda_{nr}$, and $\lambda_{nn}$ respectively, where the subscript h denotes handoff calls and the subscript n denotes new calls originated within the cell. The service times of UGS, rtPS, and nrtPS calls are exponentially distributed with means $1/\mu_u$, $1/\mu_r$, and $1/\mu_n$ respectively. Each base station can be modeled as a five-dimensional Markov chain [3, 5, 6], $s = (n_u, n_r, B_r, n_n, B_n)$, where $n_u$, $n_r$, $n_n$ represent the numbers of UGS, rtPS, and nrtPS calls admitted and $B_r$, $B_n$ represent the currently available bandwidth of rtPS and nrtPS calls respectively.
The summary of the proposed algorithm is given below.
Begin
(new UGS call) if $(n_u + 1) B_u + n_r (B_r^{max} - B_{th}) + n_n B_n^{min} < B$, accept the new UGS call request, else reject it;
(new rtPS call) if $n_u B_u + (n_r + 1)(B_r^{max} - B_{th}) + n_n B_n^{min} < B$, accept the new rtPS call request, else reject it;
(new nrtPS call) if $n_u B_u + n_r B_r^{max} + (n_n + 1)(B_n^{max} - B_{th}) < B$, accept the new nrtPS call request, else reject it;
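A direct transcription of these admission tests follows; the parameter names are illustrative, with B taken as the total cell bandwidth and B_th as the degradation threshold.

```python
def admit_new_call(call_type, state, B, B_u, B_r_max, B_n_min, B_n_max, B_th):
    """Admission tests for new calls, as in the algorithm above.
    `state` holds the currently admitted counts (n_u, n_r, n_n)."""
    n_u, n_r, n_n = state
    if call_type == "UGS":
        need = (n_u + 1) * B_u + n_r * (B_r_max - B_th) + n_n * B_n_min
    elif call_type == "rtPS":
        need = n_u * B_u + (n_r + 1) * (B_r_max - B_th) + n_n * B_n_min
    else:  # nrtPS
        need = n_u * B_u + n_r * B_r_max + (n_n + 1) * (B_n_max - B_th)
    return need < B
```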
The degradation variable $B_{th}$ is optimized to minimize a cost function (CF) and varies for different traffic conditions. When $B_{th} = 0$, the algorithm degenerates to the simple ABDCAC algorithm.
Using the concept proposed in [10], we define two metrics: Grade of Service (GoS) and cost function (CF). GoS is defined as
$$GoS_k = NCBP_k + \beta_k \cdot HCDP_k, \quad k \in \{u, r, n\} \qquad (1)$$
where u, r, n denote UGS, rtPS, and nrtPS connections respectively and $\beta_k$ is the penalty weight for handoff calls relative to new calls. In general, $\beta_k$ should be greater than 1 because handoff calls should be given higher priority than new calls. A smaller GoS means better session-layer performance for the related traffic type.
$$CF = w_1 \cdot GoS_u + w_2 \cdot GoS_r + w_3 \cdot GoS_n \qquad (2)$$
where $w_1$, $w_2$, and $w_3$ reflect the different priorities of the services; the weights are selected such that $w_1 > w_2 > w_3$ and $w_1 + w_2 + w_3 = 1$. Because of the dynamic characteristics of the traffic flow, the value of CF changes with the traffic, so the threshold $B_{th}$ is periodically adjusted to minimize CF. $B_{th}$ can vary in the interval $0 < B_{th} < B_n^{min}$. For a particular arrival rate, every feasible value of $B_{th}$ is used to evaluate CF; the value of $B_{th}$ that gives the smallest CF is the optimal value of $B_{th}$ for that arrival rate.
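This periodic re-tuning can be sketched as a simple grid search; `evaluate_cf` stands in for the Markov-chain evaluation of CF and is an assumption of this sketch.

```python
def optimal_bth(arrival_rate, evaluate_cf, B_n_min, step=0.1):
    """Sweep the feasible interval 0 < B_th < B_n_min and keep the value
    minimising the cost function for the given arrival rate."""
    candidates = [i * step for i in range(1, int(B_n_min / step))]
    return min(candidates, key=lambda b: evaluate_cf(arrival_rate, b))

# Illustrative use with a dummy convex cost function:
print(optimal_bth(1.0, lambda lam, b: (b - 0.3) ** 2, B_n_min=1.0))  # ~0.3
```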
4 Simulation Analysis
Fig. 1. (a) Optimal value of Bth (b) Comparison of NCBP of UGS connections (c) Comparison
of NCBP of rtPS connections (d) Comparison of NCBP of nrtPS connections
Fig. 2. (a) Comparison of HCDP of UGS connections (b) Comparison of HCDP of rtPS
connections (c) Comparison of HCDP of nrtPS connections
The blocking probabilities of the different service flows are reduced in the proposed scheme compared to the simple ABDCAC scheme at lower arrival rates, and the blocking probabilities of the ABDCAC scheme with fairness differ for different weights $w_1$, $w_2$, and $w_3$. However, as the arrival rate increases, the proposed scheme degenerates to the simple ABDCAC scheme, because the degradation threshold is zero at higher arrival rates, as indicated in Fig. 1(a). The effect of the degradation threshold on nrtPS connections is minimal because they are the lowest-priority connections. Fig. 2(a)-(c) shows the HCDP for UGS, rtPS, and nrtPS connections respectively. Although the degradation threshold reduces the NCBP of the different connections, the dropping probabilities increase; however, the increase in HCDP is not as significant as the decrease in NCBP, so it does not degrade the performance of the network.
5 Conclusion
In this paper we propose a CAC scheme that provides fairness between new calls and handoff calls. Simulation results show that the proposed scheme achieves an improvement in NCBP without affecting HCDP too much. Thus the proposed ABDCAC scheme with fairness satisfies the requirements of both service providers and subscribers. However, tuning of the parameters used in the trade-off analysis is required, which may be done using robust optimization techniques.
References
1. IEEE Standard: Air Interface for Fixed Broadband Wireless Access System. IEEE
STD 802.16-2009 (October 2004)
2. IEEE Standard: Air Interface for Fixed and Mobile Broadband Wireless Access System.
IEEE P802.16e/D12 (February 2005)
3. Hou, F., Ho, P.-H., (Sherman) Shen, X.: Performance Analysis of Reservation Based
Connection Admission Scheme in IEEE 802.16 Networks. In: Global Telecommunications
Conference, GLOBECOM 2006, pp. 1–5 (November 2006)
4. Chen, X., Li, B., Fang, Y.: A Dynamic Multiple-Threshold Bandwidth Reservation
(DMTBR) Scheme for QoS Provisioning in Multimedia Wireless Networks. IEEE Trans.
on Wireless Communications 4(2), 583–592 (2005)
5. Suresh, K., Misra, I.S., Saha (Roy), K.: Bandwidth and Delay Guaranteed Connection
Admission Control Scheme for QoS Provisioning in IEEE 802.16e Mobile WiMAX. In:
Proc. IEEE GLOBECOM (2008)
6. Laishram, R., Misra, I.S.: A Bandwidth Efficient Adaptive Call Admission Control
Scheme for QoS Provisioning in IEEE 802.16e Mobile Networks. Int. J. Wireless Inf.
Networks 18(2), 108–116 (2011)
7. Shu’aibu, D.S., Syed Yusof, S.K., Fisal, N., et al.: Fuzzy Logic Partition-Based Call
Admission Control for Mobile WiMAX. ISRN Communications and Networking 2011,
Article ID 171760 (2011)
8. Sabari Ganesh, J., Bhuvaneswari, P.T.V.: Enhanced Call Admission Control for WiMAX
Networks. In: IEEE-International Conference on Recent Trends in Information
Technology, ICRTIT 2011, Chennai, June 3-5, pp. 33–36 (2011)
9. Kim, W., Hwang Jun, S.: QoS-aware joint working packet scheduling and call admission
control for video streaming service over WiMAX network. Wirel. Netw. 17(4), 1083–1094
(2011)
10. Xie, S., Wu, M.: Optimized Call Admission Control in Wireless Networks. In: Proc.
International Conference on Advanced Infocomm Technology (2008)
Analysis and Performance of Photonic Microwave Filters
Based on Multiple Optical Carriers
Abstract. Photonic microwave filters are photonic subsystems designed to carry out tasks equivalent to those of ordinary microwave filters, with the supplementary advantages inherent to photonics such as low loss, high bandwidth, immunity to electromagnetic interference (EMI), tunability, reconfigurability, reduced size and weight, and low and constant electrical loss. Many photonic microwave filter architectures have been proposed over recent years using a variety of fiber-optic devices. Some of them are based on using multiple optical carriers [wavelength-division multiplexing (WDM)] and dispersive media to obtain a set of time-delayed samples of the RF input signal. In this paper, a statistical analysis of the performance of a photonic microwave filter based on multiple optical carriers (WDM) and a dispersive medium, with random errors in the amplitude and wavelength spacing of the optical carriers, is presented.
1 Introduction
2 Theory
The transfer function of an N-tap WDM photonic microwave filter, for optimum polarization adjustment and neglecting the carrier suppression effect (e.g., using single-sideband modulation [9]), is given by [6]
$$H_{RF}(f) = \Re \sum_{k=0}^{N-1} P_k (1 + \Delta_k)\, e^{-j2\pi f DL[k\Delta\lambda + \varepsilon_k]} \qquad (1)$$
$$|H(f)|^2 = H(f) \cdot H^*(f) = \Re^2 \sum_{k=0}^{N-1}\sum_{n=0}^{N-1} P_k P_n (1+\Delta_k)(1+\Delta_n)\, e^{-j2\pi f DL[(k-n)\Delta\lambda + \varepsilon_k - \varepsilon_n]} \qquad (2)$$
Assuming that the amplitude and spacing errors have zero-mean Gaussian distributions, the average of equation (2) is expressed as
$$\overline{|H(f)|^2} = \Re^2 \sum_{k=0}^{N-1}\sum_{n=0}^{N-1} P_k P_n\, \overline{(1+\Delta_k)(1+\Delta_n)}\;\; \overline{e^{-j2\pi f DL[(k-n)\Delta\lambda + \varepsilon_k - \varepsilon_n]}} \qquad (3)$$
Fig. 1. Experimental setup for random error evaluation using five optical sources of different wavelengths and an SSMF coil of 9-10 km as the dispersive element
Fig. 2. Simulated average squared transfer function of 100 filters of 50 taps, using a Hanning window, with amplitude and wavelength spacing errors of standard deviation 0.05 (solid line)
where it has been considered that the error sources are independent random processes and that every optical source has the same error statistics. Combining equations (4) and (5), the average of the squared transfer function of the filter yields
$$\overline{|H(f)|^2} = \Re^2\left[\overline{(1+\Delta)^2}\sum_{k=0}^{N-1} P_k^2 + \overline{(1+\Delta)}^2\;\overline{e^{-j2\pi f DL\varepsilon}}\;\overline{e^{j2\pi f DL\varepsilon}} \sum_{k=0}^{N-1}\sum_{\substack{n=0 \\ n\neq k}}^{N-1} P_k P_n\, e^{-j2\pi f DL(k-n)\Delta\lambda}\right] \qquad (6)$$
The last term in parentheses in equation (6) is the filter transfer function without errors, except for the term k = n. If this term is added and subtracted, the average squared transfer function of the filter is given by
$$\overline{|H(f)|^2} = \Re^2\left(\overline{(1+\Delta)^2} - \overline{(1+\Delta)}^2\;\overline{e^{-j2\pi f DL\varepsilon}}\;\overline{e^{j2\pi f DL\varepsilon}}\right)\sum_k P_k^2 + \overline{(1+\Delta)}^2\;\overline{e^{-j2\pi f DL\varepsilon}}\;\overline{e^{j2\pi f DL\varepsilon}}\;|H_0(f)|^2 \qquad (7)$$
From (7), it can be seen that the average squared transfer function of a filter with errors is the superposition of the squared transfer function of the ideal filter (without errors) and an error term, which depends on frequency. If the transfer function is normalized such that the transfer function without errors is unity at f = 0, the error term can be read as the sidelobe level relative to the peak of the main lobe [7]. The average squared transfer function of the filter with errors at f = 0 is given by
Fig. 3. Simulated average squared transfer function of 100 filters of five taps, using a Hanning window, with amplitude and wavelength spacing errors of standard deviation 0.05 (solid line). The dotted line corresponds to the estimation provided by (10) for a five-tap filter, and the dashed line to the residual sidelobe level for a 50-tap filter
$$\overline{|H(0)|^2} = \Re^2\left(\overline{(1+\Delta)^2} - \overline{(1+\Delta)}^2\right)\sum_k P_k^2 + \Re^2\,\overline{(1+\Delta)}^2\left(\sum_k P_k\right)^2 \qquad (8)$$
where the term $\left(\sum_k P_k\right)^2$ is usually much larger than $\sum_k P_k^2$ and, therefore, the first term in (8) can be neglected. Thus, the average squared transfer function of the filter with errors at f = 0 is approximated by
$$\overline{|H(0)|^2} \approx \Re^2\,\overline{(1+\Delta)}^2\left(\sum_k P_k\right)^2 \qquad (9)$$
Dividing (7) by (9), the normalized average squared transfer function of the filter with errors can be expressed in a form whose first term is a residual sidelobe level due to the random errors. This term simplifies if the error statistics are known. Usually the system will be calibrated and, therefore, the means of the amplitude and spacing errors will be zero ($\bar{\Delta} = 0$, $\bar{\varepsilon} = 0$); in this case
$$\sigma^2 \approx \left(\overline{(1+\Delta)^2}\,\left(\overline{e^{-j2\pi f DL\varepsilon}}\;\overline{e^{j2\pi f DL\varepsilon}}\right)^{-1} - 1\right)\frac{\sum_k P_k^2}{\left(\sum_k P_k\right)^2} \qquad (10)$$
3 Simulation Results
Simulations based on equation (1) are used to calculate filter squared transfer functions with random errors, which are then averaged. It is then possible to compare these results with the residual sidelobe level given by equation (7). Fig. 2 depicts the average squared transfer function (solid line) of 100 filters with amplitude and wavelength spacing errors between carriers of standard deviations 0.04 and 0.05, respectively, for filters of 50 taps, using a nominal wavelength spacing of 0.9 nm and the nominal amplitudes of a Hanning window. The dispersive medium is a coil of standard single-mode fiber (SSMF) of 9-10 km length with D = 16.4 ps/(nm·km).
Moreover, the residual sidelobe level depends on the number of taps, i.e., the number of optical carriers, as can be seen from (8). Fig. 3 depicts the average transfer function of a filter equal to the one shown in Fig. 2 but using five taps.
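This averaging procedure can be sketched as a short Monte Carlo simulation sampling equation (1) directly. The five-tap values, the error scales, and the interpretation of the spacing error as a fraction of Δλ are assumptions for illustration, with the responsivity normalised to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5                                   # taps (optical carriers)
D = 16.4e-12 / (1e-9 * 1e3)             # 16.4 ps/(nm km) expressed in s/m^2
L = 10e3                                # fibre length in m
dlam = 0.9e-9                           # nominal wavelength spacing in m
f = np.linspace(1e9, 10e9, 500)         # 1-10 GHz frequency grid
P = np.hanning(N + 2)[1:-1]             # nonzero Hanning tap amplitudes
k = np.arange(N)

H2 = np.zeros_like(f)
for _ in range(100):                    # average 100 filter realisations
    delta = rng.normal(0.0, 0.05, N)             # amplitude errors
    eps = rng.normal(0.0, 0.05, N) * dlam        # spacing errors
    delay = D * L * (k * dlam + eps)             # per-tap time delays
    H = (P * (1 + delta) * np.exp(-2j * np.pi * np.outer(f, delay))).sum(axis=1)
    H2 += np.abs(H) ** 2 / 100

sidelobe_db = 10 * np.log10(H2 / H2.max())       # normalised response in dB
```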
Fig. 4. Measured average squared transfer function (dotted line), transfer function with a Hanning window without errors (dashed line), and residual sidelobe level due to amplitude (std = 0.02) and wavelength spacing (std = 0.01) random errors (solid line)
4 Experimental Results
To verify the validity of the previous expressions and to demonstrate the effect of amplitude and spacing random errors on the transfer function of WDM photonic microwave filters, measurements have been carried out in the laboratory using the experimental setup shown in Fig. 1. A five-tap filter has been implemented using four distributed feedback (DFB) lasers and one external cavity laser (ECL), with a nominal wavelength spacing between carriers of 0.9 nm and a Hanning window [-6.05, -1.25, 0, -1.26, -6.02] dB as the nominal amplitude distribution. This window has a low sidelobe level and, thus, the residual sidelobe level is clearly displayed. The dispersive medium used is a coil of approximately 10 km of SSMF.
To study the performance of the transfer function with errors, 14 transfer functions have been measured under random amplitude and spacing errors of standard deviations 0.02 and 0.01 respectively. These values were measured using the optical spectrum analyzer (OSA) of the setup of Fig. 1. Fig. 4 shows the average of the 14 measured squared transfer functions (dotted line), the ideal squared transfer function (without errors) using a five-tap Hanning window (dashed line), and the residual sidelobe level (solid line) obtained from (7)-(8) due to random errors in amplitude and wavelength spacing of standard deviations 0.02 and 0.01 respectively. From this figure, it can be seen that (7) provides a good estimation of the residual sidelobe level from the standard deviations of the amplitude and wavelength spacing errors between carriers.
5 Conclusion
References
1. Moslehi, B., Goodman, J.W., Tur, M., Shaw, H.J.: Fiber-optic lattice signal processing. Proc. IEEE 72(7), 909–930 (1984)
2. You, N., Minasian, R.A.: A novel high-Q optical microwave processor using hybrid delay line filters. IEEE Trans. Microw. Theory Tech. 47(7), 1304–1308 (1999)
3. Coppinger, F., Yegnanarayanan, S., Trinh, P.D., Jalali, B., Newberg, I.L.: Nonrecursive tunable photonic filter using wavelength selective true time delay. IEEE Photon. Technol. Lett. 8(9), 1214–1216 (1996)
4. Capmany, J., Ortega, B., Pastor, D., Sales, S.: Discrete time optical processing of
microwave signals. J. Lightw. Technol. 23(2), 702–723 (2005)
5. Norton, D., Johns, S., Keefer, C., Soref, R.: Tunable microwave filtering using high dispersion fiber time delays. IEEE Photon. Technol. Lett. 6(7), 831–832 (1994)
6. Capmany, J., Pastor, D., Ortega, B.: New and flexible fiber-optic delay line filters using chirped fiber Bragg gratings and laser arrays. IEEE Trans. Microw. Theory Tech. 47(7), 1321–1326 (1999)
7. Vidal, B., Polo, V., Corral, J.L., Marti, J.: Photonic microwave filter with tuning and
reconfiguration capabilities using optical switches and dispersive media. Electron
Lett. 39(6), 547–549 (2003)
8. You, N., Minasian, R.A.: Synthesis of WDM grating-based optical microwave filter with
arbitrary impulse response. In: Proc. Int. Topical Meeting Microwave Photonics (MWP),
Melbourne, Australia, vol. 1, pp. 223–226 (1999)
9. Smith, G.H., Novak, D., Ahmed, Z.: Technique for optical SSB generation to overcome dispersion penalties in fiber-radio systems. Electron. Lett. 33(1), 74–75 (1997)
10. Capmany, J., Pastor, D., Ortega, B.: Microwave signal processing using optics. In: Proc.
Optical Fiber Conf. (OFC), Anaheim, CA, March 6–11, p. 2376 (2005)
A Genetic Algorithm for Alignment
of Multiple DNA Sequences
Abstract. This paper presents a new genetic-algorithm based solution for the alignment of multiple DNA molecular sequences. Multiple sequence alignment (MSA) is one of the most active ongoing research problems in the field of computational molecular biology. Sequence alignment is important because it allows scientists to analyze biological sequences (such as DNA and RNA) and determine where they overlap. These overlaps can show commonalities in evolution, and they also allow scientists to better prepare vaccines against viruses. We propose new genetic operations for crossover, mutation, fitness calculation, and population initialization; the proposed scheme generates new populations with better fitness values. We also review some of the popular works by different researchers towards solving the MSA problem with respect to the various phases involved in the general GA procedure. A working example is presented to validate the proposed scheme, and the improvement in the overall population fitness is calculated.
Initial Populations: The first challenge of a genetic algorithm is to determine how the
individuals of the population will be encoded and to generate an initial population
with some degree of randomness. The literature [2,3,4] suggests that each
individual in the population should be one multiple alignment of all the given
sequences, but the way the initial population is produced varies. In [3,4,5],
the sequence length is increased by a certain percentage and gaps, or buffers of gaps,
are inserted into the sequences at random positions. Hernandez et al. [2] took a
new approach and used previously developed tools to align the sequences to a certain
degree, and then used the GA to optimize the alignment.
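A minimal sketch of the random-gap approach of [3,4,5], assuming the gap-position encoding used in the worked example below (the helper name and the uniform random placement of gaps are our assumptions):

import random

def random_alignment(seqs, aln_len):
    # One chromosome: for each sequence, the sorted positions of the gaps
    # needed to pad it to aln_len columns (11 in the worked example).
    chromosome = []
    for s in seqs:
        n_gaps = aln_len - len(s)
        chromosome.append(sorted(random.sample(range(aln_len), n_gaps)))
    return chromosome

seqs = ["AAAGCTAT", "GATACAA", "ACCTTAAA", "ATAGAAGGT"]
population = [random_alignment(seqs, 11) for _ in range(10)]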
Reproduction: Hernandez et al. [2] (2004), Horng et al. [3] (2005), Shyu et
al. [5] (2006), and Wang & Lefkowitz [4] used the typical fitness-proportionate, or
"roulette wheel", style of reproduction. Two of them [3, 4] also used some sort of
elitism, while Wang & Lefkowitz [4] further restricted reproduction to only
the top-scoring individuals.
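A minimal sketch of roulette-wheel reproduction (ours; shifting the negative Sum-of-Pairs scores into positive weights is one common convention, not necessarily the one used in [2-5]):

import random

def roulette_select(population, scores):
    # Shift the (negative) Sum-of-Pairs scores into positive weights so
    # that fitter alignments are proportionally more likely to reproduce.
    lo = min(scores)
    weights = [s - lo + 1 for s in scores]
    return random.choices(population, weights=weights, k=1)[0]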
Crossover: After reproduction, pairs of alignments from the old population are
randomly chosen for crossover. The most common type of crossover is called "one-point
crossover". Hernandez et al. [2] and Shyu et al. [5] divide the sequences in the
alignments at a random point and then swap the first half of the first alignment
with the first half of the second alignment.
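With chromosomes represented as lists of per-sequence gap lists, this one-point crossover reduces to a list splice. A sketch (ours), verified against the worked example later in the paper:

def one_point_crossover(p1, p2, point):
    # Chromosomes are lists of per-sequence gap lists; splicing at `point`
    # swaps the leading segments. point = 1 reproduces the child pairs in
    # the worked example (the first sequence's gap list is exchanged).
    return p1[:point] + p2[point:], p2[:point] + p1[point:]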
Mutation: Mutation is the last step in the process. There are a few ways to do
mutation for this problem, and they all have to do with gaps and with sliding
subsequences left or right. Hernandez et al. [2] have two forms of mutation: they
either remove a gap, or slide a sub-sequence adjacent to a gap into the gap space,
which essentially moves the gap from the beginning of the sub-sequence to its end,
or vice versa. Horng et al. [3] have four forms of mutation. Shyu et al. [5]
randomly select columns in the sequences and then swap nucleotides and spaces
within these columns. Wang & Lefkowitz [4] have three forms of mutation.
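A simplified sketch (ours) in the spirit of the gap-sliding mutation of Hernandez et al. [2], moving a single gap one column left or right when the neighbouring column holds a residue:

import random

def slide_gap(gaps, aln_len):
    # Move one randomly chosen gap a single column left or right when the
    # neighbouring column is free, i.e. slide the residue into the gap.
    g = random.choice(gaps)
    new = g + random.choice([-1, 1])
    if 0 <= new < aln_len and new not in gaps:
        gaps = sorted(new if p == g else p for p in gaps)
    return gaps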
Fitness Function: The fitness function determines how "good" an alignment is. The
most common strategy, used by all of the authors albeit with significant
variations, is called the "Sum-of-Pairs" score.
Hernandez et al. [2] and Wang & Lefkowitz [4] create their own scoring matrices
based upon the sequences that they are trying to align. Wang & Lefkowitz [4] create
a library of optimal pair-wise alignments from the sequences being aligned
and then evaluate the consistency of the calculated multiple alignment with the
alignments in the library. Horng et al. [3] use the straightforward Sum-of-Pairs
calculation. Shyu et al. [5] use the nucleic acid scoring matrix from the
International Union of Biochemistry (IUB). This matrix groups nucleotides together
according to certain properties, e.g., purines (A or G) and pyrimidines (C or T).
Fatumo et al. [8] describe MSA as belonging to a class of optimization problems
called combinatorial problems, with exponential time complexity O(L^N). The
literature [9, 10] describes how genetic algorithms can be used to solve the MSA
problem and obtain optimal or near-optimal solutions. Karadimitriou and Kraft [10]
showed the GA to be better than other optimization methods, as it requires only a
fitness function rather than a problem-specific algorithm. The fitness function is
the cost function, given using different weights for different types of matching
symbols and assigning gap costs [11]. Nizam et al. [12] proposed a scheme in which
parameters such as the number of generations, chromosome length, and crossover and
mutation rates adapt their values during execution, using the concept of a
self-organizing GA. Chen et al. [13] base their approach on a Genetic Algorithm
with Reserve Selection (GARS). One drawback of the genetic algorithm is that it can
suffer from premature convergence, in which the solution gets stuck at a locally
optimal stage.
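A naive Sum-of-Pairs scorer in Python (ours; the match/mismatch/gap weights are illustrative stand-ins, since the paper's exact weights follow [11]):

def sum_of_pairs(rows, match=1, mismatch=-1, gap=-2):
    # Score every unordered pair of symbols in every column; gap-gap pairs
    # score 0, residue-gap pairs take the gap penalty.
    score = 0
    for col in zip(*rows):
        for i in range(len(col)):
            for j in range(i + 1, len(col)):
                a, b = col[i], col[j]
                if a == '-' or b == '-':
                    score += 0 if a == b else gap
                elif a == b:
                    score += match
                else:
                    score += mismatch
    return score

print(sum_of_pairs(["A-AAGCT--AT", "GA-T-A-C-AA",
                    "A-CC-TTAA-A", "A-TAGAAGG-T"]))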
The four input sequences are: AAAGCTAT, GATACAA, ACCTTAAA, ATAGAAGGT.
The initial population is the chromosome representation of various multiple
sequence alignments generated by randomly inserting gaps into the input sequences.
(i) A-AAGCT--AT
GA-T-A-C-AA
A-CC-TTAA-A
A-TAGAAGG-T
Fitness= -52 1 7 8 11 2 4 6 8 11 1 4 9 11 1 9 11
Here the chromosome lists, for each sequence in turn, the positions of its gaps
in the 11-column alignment, with the alignment length 11 acting as a terminator
(e.g., gaps at positions 1, 7 and 8 in the first sequence).
Similarly ten other chromosomes are generated by inserting the gaps randomly to
form the initial population:
(ii) Fitness= -73 6 7 8 11 0 2 6 9 11 1 7 8 11 3 5 11
(iii) Fitness= -35 2 4 9 11 0 3 7 9 11 0 1 9 11 2 4 11
(iv) Fitness= -43 1 5 7 11 0 4 7 8 11 0 2 4 11 4 5 11
(v) Fitness= -54 1 5 8 11 0 1 3 8 11 0 6 8 11 5 7 11
(vi) Fitness= -53 1 2 5 11 1 3 4 5 11 1 2 9 11 0 2 11
(vii) Fitness= -35 2 3 9 11 2 5 6 9 11 0 2 5 11 0 3 11
(viii) Fitness= -77 0 4 7 11 2 5 6 8 11 1 2 9 11 4 5 11
(ix) Fitness= -54 1 4 7 11 0 1 6 9 11 0 5 9 11 0 4 11
(x) Fitness= -45 3 5 8 11 4 6 7 8 11 3 5 8 11 1 7 11
For the given input sequences: Overall Fitness Score = -521
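The encoding can be made concrete with a small decoder (ours) that rebuilds the aligned rows of chromosome (i) from its gap positions:

def decode(chromosome, seqs, aln_len=11):
    # Rebuild the aligned rows: place a gap at each listed position and
    # fill the remaining columns with the sequence's residues in order.
    rows = []
    for gaps, seq in zip(chromosome, seqs):
        it = iter(seq)
        rows.append(''.join('-' if c in gaps else next(it)
                            for c in range(aln_len)))
    return rows

seqs = ["AAAGCTAT", "GATACAA", "ACCTTAAA", "ATAGAAGGT"]
print(decode([[1, 7, 8], [2, 4, 6, 8], [1, 4, 9], [1, 9]], seqs))
# -> ['A-AAGCT--AT', 'GA-T-A-C-AA', 'A-CC-TTAA-A', 'A-TAGAAGG-T']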
Selection: Save the best 20% and perform operations on the remaining 80%. For the 10
chromosomes of the initial generation, save (iii) and (vii), which pass to the next
generation without undergoing any change. Perform genetic operations on the remaining
eight chromosomes. The inputs for crossover are given as:
1 7 8 11 2 4 6 8 11 1 4 9 11 1 9 11 Fitness= -52 parent 1
6 7 8 11 0 2 6 9 11 1 7 8 11 3 5 11 Fitness= -73 parent 2
1 5 7 11 0 4 7 8 11 0 2 4 11 4 5 11 Fitness= -43 parent 3
1 5 8 11 0 1 3 8 11 0 6 8 11 5 7 11 Fitness= -54 parent 4
1 2 5 11 1 3 4 5 11 1 2 9 11 0 2 11 Fitness= -53 parent 5
0 4 7 11 2 5 6 8 11 1 2 9 11 4 5 11 Fitness= -77 parent 6
1 4 7 11 0 1 6 9 11 0 5 9 11 0 4 11 Fitness= -54 parent 7
3 5 8 11 4 6 7 8 11 3 5 8 11 1 7 11 Fitness= -45 parent 8
After crossover, the new chromosomes generated would be:
1 7 8 11 0 2 6 9 11 1 7 8 11 3 5 11 Fitness= -67 Child 1
6 7 8 11 2 4 6 8 11 1 4 9 11 1 9 11 Fitness= -60 Child 2
1 5 7 11 0 1 3 8 11 0 6 8 11 5 7 11 Fitness= -59 Child 3
1 5 8 11 0 4 7 8 11 0 2 4 11 4 5 11 Fitness= -40 Child 4
1 2 5 11 2 5 6 8 11 1 2 9 11 4 5 11 Fitness= -40 Child 5
0 4 7 11 1 3 4 5 11 1 2 9 11 0 2 11 Fitness= -66 Child 6
1 4 7 11 4 6 7 8 11 3 5 8 11 1 7 11 Fitness= -49 Child 7
3 5 8 11 0 1 6 9 11 0 5 9 11 0 4 11 Fitness= -66 Child 8
Selection: The best two chromosomes among parent[i], parent[i+1], child[i], child[i+1]
are passed to the next step, for i = 1, 3, 5, 7:
1 7 8 11 2 4 6 8 11 1 4 9 11 1 9 11 Fitness= -52 Parent 1
6 7 8 11 2 4 6 8 11 1 4 9 11 1 9 11 Fitness= -60 Child 2
(Mutation has additionally been applied to some of the surviving chromosomes, which
accounts for the altered gap positions and improved fitness values in some of the
lines below.)
1 5 7 11 0 4 7 8 11 1 2 4 11 4 5 11 Fitness= -41
1 5 8 11 0 4 7 8 11 0 2 4 11 4 5 11 Fitness= -40
1 2 5 11 2 5 6 8 11 1 2 9 11 4 5 11 Fitness= -40
1 2 5 11 1 6 4 5 11 1 2 9 11 0 2 11 Fitness= -38
3 5 8 11 4 6 7 8 11 3 5 8 11 1 7 11 Fitness= -45
1 4 7 11 4 6 7 8 11 3 5 8 11 1 7 11 Fitness= -49
Next generation produced. Output population: Overall Fitness Score = -435, an
improvement over the initial score of -521.
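The pairwise elitism just applied can be sketched as follows (ours; fitness is assumed to be a function mapping a chromosome to its score):

def next_generation(parents, children, fitness):
    # For each parent pair and its child pair, keep the two fittest of the
    # four (0-based i = 0, 2, 4, 6 corresponds to i = 1, 3, 5, 7 above).
    survivors = []
    for i in range(0, len(parents), 2):
        pool = parents[i:i + 2] + children[i:i + 2]
        pool.sort(key=fitness, reverse=True)  # less negative is fitter
        survivors.extend(pool[:2])
    return survivors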
The best alignment is defined by the chromosome with the highest fitness value:
2 3 9 11 2 5 6 9 11 0 2 5 11 0 3 11 Fitness= -35
A A - - A G C T A - T
G A - T A - - C A - A
- A - C C - T T A A A
- A T - A G A A G G T
3 Results
We have used the defined operators in our implementation for 4 sequences, initially
generating 10 chromosomes. The various operators developed by us were then applied to
these 10 chromosomes for one iteration, and it was observed that the overall fitness
score for the next generation (population) of chromosomes was better than that of the
previous one. The success rate, computed as a percentage (here, an improvement from
-521 to -435, roughly 16%), is quite significant, and the proposed scheme can be used
to generate good multiple alignments.
References
1. Goldberg, D.E.: Genetic algorithms in search, optimization & machine learning. Addison-
Wesley Publishing Company, Inc., Reading (1989)
2. Hernandez, D., Gras, R., Appel, R.: MoDEL: an efficient strategy for ungapped local
multiple alignment. Computational Biology and Chemistry 28, 119–128 (2004)
3. Horng, J.T., Wu, L.C., Lin, C.M., Yang, B.H.: A genetic algorithm for multiple sequence
alignment. Soft Computing 9, 407–420 (2005)
4. Wang, C., Lefkowitz, E.J.: Genomic multiple sequence alignments: Refinement using a
genetic algorithm. BMC Bioinformatics 6, 200 (2005)
5. Shyu, C., Sheneman, L., Foster, J.A.: Multiple sequence alignment with evolutionary
computation. Genetic Programming and Evolvable Machines 5, 121–144 (2004)
6. Buscema, M.: Genetic doping algorithm (GenD): Theory and applications. Expert
Systems 21(2), 63–79 (2004)
7. Notredame, C., Higgins, D.G.: SAGA: Sequence alignment by genetic algorithm. Nucleic
Acids Research 24(8), 1515–1524 (1996)
8. Fatumo, S.A., Akinyemi, I.O., Adebiyi, E.F.: Aligning Multiple Sequences with Genetic
Algorithm. International Journal of Computer Theory and Engineering 1(2), 186–190
(2009)
9. Carrillo, H., Lipman, D.: The multiple sequence alignment problem in biology. SIAM J.
Appl. Math. 48(5), 1073–1082 (1988)
10. Karadimitriou, K., Kraft, D.H.: Genetic Algorithms and the Multiple Sequence Alignment
Problem in Biology. In: Proceedings of the Second Annual Molecular Biology and
Biotechnology Conference, Baton Rouge, LA (February 1996)
11. Altschul, S.F.: Gap costs for multiple sequence alignment. J. Theoretical Biology 138,
297–309 (1989)
12. Nizam, A., Shanmugham, B., Subburaya, K.: Self-Organizing Genetic Algorithm for
Multiple Sequence Alignment (2010)
13. Chen, Y., Hu, J., Hirasawa, K., Yu, S.: Multiple Sequence Alignment Based on Genetic
Algorithms with Reserve Selection. In: ICNSC, pp. 1511–1516 (2008)
Food Distribution and Management System
Using Biometric Technique (FDMS)
1 Introduction
The main objective of the total food grain supply chain computerization in the Civil
Supplies Corporation (CSC) is to check the diversion of food grains. The diversion
takes place in four main areas.
1. Diversion in the procurement itself.
2. Diversion in the movement of commodities between CSC warehouses.
3. Diversion while transporting to FPS from CSC warehouses.
4. Diversion at the FPS level.
2 Objective
3 Methodology
The entire system is designed to provide easy access to data, records, and
information while maintaining data integrity and security in every aspect. All the
information is stored in encrypted form, and access is provided only to authorized
persons. The entire system is divided into two core processes, namely (a) enrolment
and authentication, and (b) authorized access to data.
Data Acquisition
Fingerprint data is acquired when subjects firmly press their fingers against a glass or
polycarbonate plate.
Template or File Size
Fingerprint user files are generally between 500 and 1,500 bytes.
Accuracy
Some fingerprint systems can be adjusted to achieve a false accept rate of 0.0%.
Sandia National Laboratories tests of a top-rated fingerprint system in 1991 and 1993
produced a three-try false reject rate of 9.4% and a crossover error rate of 5%.
4 Conclusion
Improving Intelligent IR Effectiveness
in Forensic Analysis
1 Introduction
Current digital forensic text string search tools use matching and/or indexing
algorithms to search digital evidence at the physical level in order to locate
specific text strings, but they fail to group and/or order the search hits. Text
mining is a new approach in digital forensics. The text mining approach improves the
IIR (Intelligent Information Retrieval) effectiveness of digital forensic text string
searching, and the technology scales to large datasets of gigabytes or terabytes. The
system searches for specific keywords, weighted by the user in accordance with
domain-specific analysis. The system then ranks the correspondence data and displays
it. It also provides the user with graphs and charts about the ranked data, which
help investigators to analyze further.
2 Proposed System
The scope of this research is to design and develop a Forensic Analysis and Inference
of Correspondence Data tool that can be used to detect any trend or activity that
may compromise security. The architecture of the proposed system is shown in Fig. 1.
3 Analysis Tool
The functional requirements of the tool are described below:
3.4 Visualization
Functionalities of visualization are query input, output display, and graphic
display. Query input is a GUI module that allows the investigator to feed in the
input query. The query may consist of keywords of interest, or be based on message
header information such as sender id, recipients, time, or day. Output display
presents the message information in an HTML page based on the ranking criteria; when
a particular message id is clicked, the actual message is displayed in a new window.
The graphic/chart display offers a user-interaction graph and a date/time frequency
graph. The user-interaction graph shows the interaction between senders and
recipients during a given date or time interval; it represents sender ids and
receiver ids as nodes, and the number of lines between two nodes shows the number of
mails communicated between them. The date/time frequency graph plots the sender or
receiver ids on the X axis and the number of messages sent or received on the Y axis.
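The user-interaction graph can be accumulated with a simple edge counter. A sketch (ours; the message field names sender, recipients, and date are assumptions, not the tool's actual schema):

from collections import Counter

def interaction_graph(messages, start, end):
    # Count (sender, recipient) pairs inside the queried interval; the
    # count per edge corresponds to the "number of lines" between nodes.
    edges = Counter()
    for m in messages:
        if start <= m['date'] <= end:
            for r in m['recipients']:
                edges[(m['sender'], r)] += 1
    return edges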
4 Conclusion
Energy Efficient Cluster Routing Protocol
for Heterogeneous Wireless Sensor Networks
1 Introduction
In EE-SEP, nodes are elected as cluster heads (CHs) with probability p, and each
non-CH node attaches itself to the CH that requires the least communication
energy [4]. A node becomes a CH for the current rotation round if the random number
it draws is less than the threshold Th(n_i), which is proposed as:
\[
Th(n_i) =
\begin{cases}
\dfrac{p}{1 - p\left(r \bmod \dfrac{1}{p}\right)}, & n_i \in G \\
0, & \text{otherwise}
\end{cases}
\tag{1}
\]
where p is the desired percentage of the CH nodes in the sensor population, r is the
current round number, and G is the set of nodes that have not been CHs in the last 1/p
rounds.
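A one-line sketch (ours) of the stochastic CH election in (1): each eligible node draws a uniform random number and becomes CH when the draw falls below Th(n_i):

import random

def is_cluster_head(p, r, in_G):
    # Eq. (1): nodes outside G (CH within the last 1/p rounds) never elect;
    # eligible nodes elect when a uniform draw falls below the threshold.
    if not in_G:
        return False
    th = p / (1 - p * (r % (1 / p)))
    return random.random() < th

# e.g. p = 0.1, r = 3: Th = 0.1 / (1 - 0.1*3) ≈ 0.143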
If the fraction of advanced nodes is m and the additional energy [3] factor between
advanced and normal nodes is α, then
\[
p_{nrml} = \frac{p}{1 + \alpha m}, \qquad
p_{adncd} = \frac{p\,(1 + \alpha)}{1 + \alpha m}
\tag{2}
\]
Hence, in SEP, the threshold in (1) is replaced for the normal sensors by
Th(n_{nrml}) and for the advanced nodes by Th(n_{adncd}), as follows:
\[
Th(n_{nrml}) =
\begin{cases}
\dfrac{p_{nrml}}{1 - p_{nrml}\left(r \bmod \dfrac{1}{p_{nrml}}\right)}, & n_{nrml} \in G' \\
0, & \text{otherwise}
\end{cases}
\tag{3}
\]
\[
Th(n_{adncd}) =
\begin{cases}
\dfrac{p_{adncd}}{1 - p_{adncd}\left(r \bmod \dfrac{1}{p_{adncd}}\right)}, & n_{adncd} \in G'' \\
0, & \text{otherwise}
\end{cases}
\tag{4}
\]
where r is the current round, G′ is the set of normal nodes that have not become CHs
within the last 1/p_{nrml} rounds of the epoch, and Th(n_{nrml}) is the new threshold
applied to a population of n(1 − m) normal nodes. This guarantees that each normal
node will become a CH exactly once every (1/p)(1 + αm) rounds per epoch, and that the
average number of cluster heads that are normal nodes per round per epoch is equal to
n(1 − m)·p_{nrml}. Similarly, G′′ is the set of advanced nodes that have not become
CHs within the last 1/p_{adncd} rounds of the epoch, and the new Th(n_{adncd}) is the
threshold applied to a population of n·m advanced nodes. This guarantees that each
advanced node will become a CH exactly once every (1/p)(1 + αm)/(1 + α) rounds.
According to the radio energy dissipation model illustrated in the literature [5, 6],
the total energy dissipated in the network is equal to:
\[
E_{total} = L\left(2N E_{elec} + N E_{DA} + k\,\varepsilon_{mp}\,d_{toBS}^{4} + N\,\varepsilon_{fs}\,d_{toCH}^{2}\right)
\tag{5}
\]
where L is the number of bits per message, k the number of clusters, E_{elec} the
electronics energy, E_{DA} the data-aggregation energy, ε_{fs} and ε_{mp} the
free-space and multipath amplifier energies, d_{toBS} the CH-to-sink distance, and
d_{toCH} the node-to-CH distance.
The average distance from a cluster head to the sink is given [6] by
\[
\bar{d}_{toBS} = 0.765\,\frac{M}{2}
\tag{6}
\]
where M is the side length of the square sensing field.
The optimal number of clusters, k_{opt}, and hence the optimal probability of a node
to become a cluster head, p_{opt}, can be given [6] by
\[
k_{opt} = \sqrt{\frac{N}{2\pi}}\sqrt{\frac{\varepsilon_{fs}}{\varepsilon_{mp}}}\,\frac{M}{d_{toBS}^{2}}
\tag{7}
\]
\[
p_{opt} = \frac{k_{opt}}{N}
\tag{8}
\]
The optimal construction of clusters, which is equivalent to setting the optimal
probability for a node to become a cluster head, is very important.
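Assuming the standard form of (7)-(8) from [6] as reconstructed above, the optimal cluster count can be evaluated numerically; the node count, field size, and amplifier energies below are illustrative values, not taken from the paper:

from math import pi, sqrt

N, M = 100, 100.0                      # nodes and field side in m (assumed values)
eps_fs, eps_mp = 10e-12, 0.0013e-12    # free-space / multipath amplifier energies
d_toBS = 0.765 * M / 2                 # Eq. (6): mean CH-to-sink distance

k_opt = sqrt(N / (2 * pi)) * sqrt(eps_fs / eps_mp) * M / d_toBS ** 2  # Eq. (7)
p_opt = k_opt / N                                                     # Eq. (8)
print(round(k_opt, 1), round(p_opt, 3))   # ≈ 23.9 clusters, p ≈ 0.239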
Fig. 1. Network lifetime comparison of LEACH, SEP and EE-SEP with m=0.1 and α=1
Fig. 2 gives the network lifetime comparison for the above algorithms with m = 0.2
and α = 1. In this case too, the proposed algorithm outperforms SEP and
LEACH, with 10% more heterogeneity applied to the nodes. Similarly, the network
lifetime of EE-SEP is 55% greater than that of SEP, with lifetimes observed as 4800
and 2250 rounds respectively, as the proportion of advanced nodes is increased by 10%
compared to Fig. 1.
Fig. 2. Network lifetime comparison of LEACH, SEP and EE-SEP with m=0.2 and α=1
The network stability, from the first round to the death of the first node (FND), has
been compared among the three algorithms. EE-SEP outperformed the other two in all
the graphs taken into consideration, with two different topologies. This indicates
the formation of more energy-efficient CHs with the newly proposed algorithm.
4 Conclusion
The EE-SEP routing protocol has been proposed. Simulations have indicated that it
outperforms the SEP and LEACH algorithms when the new clustering threshold is applied
to the advanced and normal nodes, in two different topologies. The proposed
algorithm achieves greater network stability, a longer network lifetime, and more
energy-efficient CH selection in every round than the other two algorithms.
References
1. Zhang, Y., Yang, L.T., Chen, J.: RFID and sensor networks: architectures, protocols,
security and integrations, pp. 323–354. CRC Press (2010)
2. Akyildiz, I., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A survey on sensor networks.
IEEE Communications Magazine 40(8), 102–114 (2002)
3. Smaragdakis, G., Matta, I., Bestavros, A.: SEP: A stable election protocol for clustered
heterogeneous wireless sensor networks. In: SANPA, pp. 1–11 (2004)
4. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: An application-specific protocol
architecture for wireless microsensor networks. IEEE Transactions on Wireless
Communications 1(4), 660–670 (2002)
5. Bari, A., Jaekel, A., Bandyopadhyay, S.: Maximizing the Lifetime of Two-Tiered Sensor
Networks. In: The Proceedings of IEEE International Electro/Information Technology
Conference, pp. 222–226 (2006)
6. Qing, L., Zhu, Q., Wang, M.: Design of a distributed energy-efficient clustering algorithm
for heterogeneous wireless sensor networks. Computer Communications, 2230–2237
(2006)
7. Wang, G., Wang, Y., Tao, X.: An Ant Colony Clustering Routing Algorithm for Wireless
Sensor Networks. IEEE Computer Society, 670–673 (2009)
Author Index