2015 International Conference on Computing, Communication and Security (ICCCS)
Le Meridien, Mauritius
IEEE Mauritius Subsection
ISBN 978-1-4673-9353-9
www.icccs.in
Preface
Copyright and Reprint Permission: Abstracting is permitted with credit to the source. Libraries
are permitted to photocopy beyond the limit of U.S. copyright law for private use of patrons
those articles in this volume that carry a code at the bottom of the first page, provided the per-
copy fee indicated in the code is paid through Copyright Clearance Center, 222 Rosewood Drive,
Danvers, MA 01923. For reprint or republication permission, email to IEEE Copyrights Manager
at pubs-permissions@ieee.org. All rights reserved. Copyright ©2015 by IEEE.
The Society accepts no responsibility for the content of the articles published on this DVD, which is solely the responsibility of the authors.
Only registered and presented papers will be sent for inclusion into the IEEE Xplore Digital
Library.
IEEE Notice:
Please be advised that conference proceedings must meet IEEE's quality standards and IEEE reserves the
right not to publish any proceedings that do not meet these standards.
The 2015 International Conference on Computing, Communication and Security (ICCCS) is conceived as an annual conference to foster students, researchers, academics and industry professionals in the fields of Computer Science and Engineering, Communication and Security. ICCCS 2015 is organized by the Society of Information Technologists and Entrepreneurs and the IEEE Mauritius Subsection on 4-6 December 2015 at Le Meridien, Mauritius.
The conference will include invited keynotes and oral presentations. ICCCS 2015 will provide a forum for researchers and engineers in both academia and industry to exchange the latest innovations and research advances in computing, communication and security. It also gives attendees the chance to identify emerging research topics as well as future development directions in all fields of Computer Science.
The primary goal of the conference is to promote research and development activities on advanced computing, communication and security challenges. Another goal is to promote the exchange of scientific information between researchers, developers, engineers, students, and practitioners working around the world.
Authors have contributed original research papers which have been accepted after a double-blind review
process. All registered papers presented in the conference will be submitted for inclusion into the IEEE Xplore
Digital Library provided they meet quality standards.
We cordially invite you to join us at the conference and we look forward to welcoming you in Mauritius.
Prof. Subramaniam Ganesan, Oakland University, Rochester, USA
Prof. Sri Niwas Singh, Indian Institute of Technology Kanpur, India
Prof. Vijay Kumar, University of Missouri-Kansas City, USA
Keynote Speech
KEYNOTE ABSTRACT
In this presentation, automotive embedded system security, software and hardware solutions, vehicle-to-vehicle (V2V) communication, security threats due to V2V, CAN bus simulations for security, and MiL (model-in-the-loop) and SiL (software-in-the-loop) modeling of vehicle-environment security are covered. Solutions for security threats due to advanced wireless sensors, handheld mobile devices, over-the-air software updating, data routers with cloud-based computing, and connected vehicles are presented.
Keynote Speech
Intelligence-based optimization methods such as the genetic algorithm (GA), particle swarm optimization (PSO), bacterial foraging, ant colony optimization and neural networks are multi-point search techniques that provide solutions near the global optimum. They do not require derivatives of the objective function or of the constraints. This presentation briefly covers some of the important techniques of optimization along with their scope and future challenges.
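A minimal particle swarm optimization sketch illustrates the derivative-free, multi-point search the speaker describes. This is a generic textbook PSO, not the speaker's implementation; the function name, the coefficients (w, c1, c2) and the sphere test function are illustrative assumptions.

```python
import random

random.seed(0)  # deterministic run for illustration

def pso(f, dim, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimise f over a box using a basic particle swarm (derivative-free)."""
    lo, hi = bounds
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                         # personal best positions
    pbest_val = [f(p) for p in pos]
    gbest = pbest[pbest_val.index(min(pbest_val))][:]   # global best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < f(gbest):
                    gbest = pos[i][:]
    return gbest

# Sphere function: global minimum at the origin.
best = pso(lambda x: sum(v * v for v in x), dim=3, bounds=(-5.0, 5.0))
```

Because the swarm searches from many points at once and only evaluates f, it needs no gradient information, which is the property highlighted above.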
Keynote Speech
Recent advances in intelligent system paradigms have been rapid and significant, and the pace of development shows no sign of abating. We now live in a world where robots help in such activities as running
hotels, take a lead role in the manufacturing of an almost unmanageable range of products, and even provide
care for the sick and ageing. Autonomous vehicles assisted by intelligent control algorithms are now being
tested on roads used by the general public and unmanned drones are capable of flying more safely than
manned planes. The benefits that the advancements in AI afford humanity seem almost endless.
Set against the many advances and achievements of such intelligent systems and machines there have been
stark warnings from such notable scientists as Professor Stephen Hawking who highlight the very real danger
of some of the possible intended and unintended misuses of such technology. Moreover, there are still myriad
unanswered questions about the legal and ethical ramifications of the use of such systems in everyday life.
The talk will highlight some of the advances made in the area of intelligent systems and investigate the pros
and cons of such systems. In short, the talk will seek to inform discussion on whether such systems are friend
or foe.
Keynote Speech
Mobile computing and cloud computing are two technologies converging into the rapidly growing field of mobile cloud computing (MCC). By the definition of the National Institute of Standards and Technology (NIST), cloud computing is a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. Future applications enabled by 5G will have impacts on almost every aspect of digital life. The 5G system, to be deployed initially around 2020, is expected to provide approximately 1000 times higher wireless area capacity and save up to 90 percent of energy consumption per service compared to the current fourth-generation (4G) system. Also, 10 times higher battery life of connected devices and 5 times reduced end-to-end latency are anticipated.
The objective of this presentation is to address recent advances in mobile cloud computing for 5G networks. The topics of interest for both academia and industry span many aspects of MCC in 5G, including dynamic voltage and frequency scaling, as well as power gating. Cloud centers need to support on-demand
dynamic resource provisioning, where clients can at any time submit virtual machine requests with various
amounts of resources. Cloud-assisted mobile ad-hoc networks are expected to be popular in 5G mobile
networks because the significantly faster performance of 5G communications enables clouds to provide
realistic services. On the other hand, the energy consumption problem will become serious due to the highly
increased cloud computing speed. In closing, many challenges remain to be addressed ranging from MCC
architecture/5G network design, resource/mobility management, security enhancement and protection, to
networking protocol development and new services. This will stimulate further research of MCC in 5G
networks.
General Chairs
Prof. Subramaniam Ganesan, Oakland University, Rochester, USA
Organising Chairs
Vishal Kumar, BTKIT, India
Dr. Al-Sakib Khan Pathan, International Islamic University Malaysia, Kuala Lumpur
Anupam Srivastava, Associate Dean, Middle East College, Knowledge Oasis Muscat, Oman
Parma Nand, School of Computing Science & Engineering, Galgotias University, India
Hlaing Htake Khaung Tin, Computer University, Loikaw, Kayah State, Myanmar
Shekhar Verma, Associate Professor, Indian Institute of Information Technology, Allahabad, India
Publicity Committee
Shilpi Saxena, Graphic Era Hill University, India
Financial Committee
Hoshiladevi Ramnial, Mauritius
Publications Committee
The Society of Information Technologists and Entrepreneurs
ICCCS-2015 Table of Contents
Hardware Implementation of Ultralightweight Cryptographic Protocols . . . . . . . . . . . . . . . . . . . 1
Umar Mujahid, Najam Ul Islam Muhammad and Qurat Ul-Ain
Physical Layer Secrecy Solution for Passive Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . 2
Avinash Thombre and Aditya Trivedi
Visualising and Analysing Online Social Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Manoj Maharaj and Kambale Vanty Muhongya
An Efficient Key Management Scheme in Hierarchical Wireless Sensor Networks . . . . . . . . . . 4
Xinyang Zhang and Jidong Wang
Experiential Analysis of the Effectiveness of Buck and Boost DC-DC Converters in a
Basic Off Grid PV system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Osamede Asowata, Ruaan Schoeman and Pienaar Christo
Sixsoid: A New Paradigm for k-Coverage in 3D Wireless Sensor Networks . . . . . . . . . . . . . . . . 6
Manjish Pal and Nabajyoti Medhi
Identifying Ideal Values of Parameters for Software Performance Testing . . . . . . . . . . . . . . . . . . 7
Charmy Patel and Ravi Gulati
Reducing Structured Big Data Benchmark Cycle Time using Query Performance
Prediction Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Rekha Singhal
Robust Blind Watermarking Technique for Color Images using Online Sequential
Extreme Learning Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Ankit Rajpal, Anurag Mishra and Rajni Bala
A Detection Mechanism of DoS Attack using Adaptive NSA Algorithm in Cloud
Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Sumana Maiti, Chandan Garai and Ranjan Dasgupta
Vehicle Security and Forensics in Mauritius and Abroad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Pravin Selukoto Paupiah
Performance Evaluation and Analysis of Layer 3 Tunneling between OpenSSH and
OpenVPN in a Wide Area Network Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Irfaan Coonjah, Clarel Catherine and K. M. S. Soyjaudah
Overview of Data Quality Challenges in the Context of Big Data . . . . . . . . . . . . . . . . . . . . . . . . . 13
Suraj Juddoo
A Classification Method to Classify High Dimensional Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Amit Gupta
Experimental Performance Comparison between TCP vs UDP tunnel using OpenVPN. . . . 15
Irfaan Coonjah, Pierre Catherine and K.M.Sunjiv Soyjaudah
6to4 Tunneling Framework using OpenSSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Irfaan Coonjah, Pierre Catherine and K.M.Sunjiv Soyjaudah
Hardware Implementation of Ultralightweight Cryptographic Protocols

Abstract—RFID (Radio Frequency Identification) is among the most rapidly emerging technologies in the field of automatic identification. The technology is far better than contending systems such as barcodes and magnetic tapes, as it provides an optimal communication link with non-line-of-sight capability. Several researchers have proposed various ultralightweight RFID authentication protocols to provide cost-effective security solutions. However, the proper hardware implementation of these ultralightweight authentication protocols has long been neglected, leaving their practical feasibility unclear. In this paper, we address this problem and propose a generic hardware architecture for EPC-C1G2 tags. We have also simulated four ultralightweight authentication protocols (SASI, LMAP, David-Prasad and RAPP) using ModelSim PE and synthesized them using Xilinx Design Suite 14.5. The algorithms have been successfully implemented in hardware on Spartan 6 (Low Power) and Virtex 7 Field Programmable Gate Arrays (FPGAs). We believe that the proposed architecture and simulation results will act as a stepping stone for future ASIC implementations of these low-cost RFID tags.

Keywords—RFID; Ultralightweight; Mutual Authentication; Tags; FPGA; SASI

I. INTRODUCTION

RFID systems are a rapidly developing technology across the world because of their enriched features and speed. An RFID system is basically composed of three main modules: the reader, the tag, and the backend server (commonly known as the database). The reader and backend server can use traditional cryptographic algorithms (such as AES, 3DES, and public- or private-key cryptography). However, the channel of communication between the tag and the reader is considered insecure because of the limited computational power on the tag's side; as this link is wireless, an adversary can easily access it to launch passive or active attacks. In order to enhance the security of this wireless channel, a large number of ultralightweight mutual authentication protocols (UMAPs) have been proposed. However, almost all of these algorithms apparently failed to provide optimal security. The generic solution for the implementation of security protocols is to encrypt the data to be communicated. The main focus of this paper is the efficient hardware implementation of four protocols of the UMAP family: SASI [1], LMAP [2], David-Prasad [6] and RAPP [14]. The UMAP family mainly involves simple bitwise logical operators such as AND, OR and XOR, and some ultralightweight primitives (permutation, cyclic left rotation). For a low-cost RFID tag, only 2500 to 3000 gates can be allocated for the implementation of security-related tasks.

N. J. Hopper et al. proposed the first lightweight authentication protocol (the HB protocol [11]) for passive RFID tags. A. Juels et al. extended the protocol (HB+ [10]). However, in 2005, H. Gilbert presented an active attack against the HB+ protocol in [12] and challenged its security claim. In 2006, Pedro Peris-Lopez et al. proposed new security frameworks and introduced two new ultralightweight authentication protocols, LMAP [2] and EMAP [3], using simple triangular functions (XOR, AND, OR). After these protocols, many active and passive attacks against them came into existence. H. Chien in [1] has presented a classification of authentication protocols based on the cryptographic functions that can be used at the tag's end.

(Figure: generic tag hardware architecture, with Sequencer, Register, Memory, ALU and Output blocks.)
*Where n1 and n2 are two random numbers of n-bit length.

On obtaining D transmitted by the tag, the reader calculates D' and verifies the tag by checking D' = D. An important element of the tag is revealed in message D, which is generated after verification of the reader. Key updating follows the pattern:

IDS_next = (IDS_prior + n2 + K4_prior) ⊕ ID   (5)
K1_next = K1_prior ⊕ n2 ⊕ (K3_prior + ID)   (6)
K2_next = K2_prior ⊕ n2 ⊕ (K4_prior + ID)   (7)

*Where x and y are strings of n-bit length (according to EPC-C1G2, n can be 32, 48, 64 or 96 bits).

The tag has some specific entities which play a vital role in the description of the protocol. These entities are the pseudonym identity (IDS), the tag ID (ID) and two keys. The IDS is divided into a new IDS (IDS_next) and an old IDS (IDS_prior), and the two secret keys are divided in a similar manner (K1_next, K1_prior, K2_next, K2_prior). The tag shares its IDS and keys with the reader for mutual authentication. The most significant entity for any tag is its ID, which is revealed in the last authentication step. The IDS, ID and keys are each n-bit strings. The reader initially sends a HELLO message to the tag. In reply, the tag transmits its IDS to the reader. The reader now checks whether this IDS is present in the database; if so, it continues with the authentication process and calculates the messages A, B and C using the ALU through the following equations:

A = IDS ⊕ K1 ⊕ n1   (10)
B = (IDS ∨ K2) + n2   (11)
C = (K1 ⊕ K2') + (K1' ⊕ K2)   (12)

where the rotated keys are

K1' = ROT(K1 ⊕ n2, K1)   (13)
K2' = ROT(K2 ⊕ n1, K2)   (14)

*Where n1 and n2 are two randomly generated numbers of n-bit length.

The rotation operation is explained in Figure 5. As the reader uses its own set of keys and IDS to generate the messages A, B and C for encryption purposes, the rotated keys K1' and K2' are used. To create a single string of data, the reader transmits the messages A, B and C towards the tag by concatenating them as A||B||C. Key updating follows:

IDS_next = (IDS + ID) ⊕ (n2 ⊕ K1')   (16)
K1_next = K1'   (17)
K2_next = K2'   (18)

LEFT ROTATION OPERATION

The rotation operation ROT(x, y) left-rotates the bits of x by the Hamming weight of y, where the Hamming weight of y is the number of 1 bits in the string y. Let x and y be 5-bit values, and calculate ROT(10110, 01101):

x = 10110
y = 01101

Step 1: Count the number of ones in y. In this example, the number of ones in y is three.
Step 2: Rotate x left by three bits, since the number of ones in y is three.

2.3 David-Prasad Protocol

In September 2009, David and Prasad [6] proposed a new protocol of the UMAP family for passive low-cost RFID tags. The main aim of the protocol was to provide security within limited resources (hardware and power consumption). In this protocol, before interrogating a tag, the reader needs to obtain a one-day certificate from a Certificate Authority (CA). The logical operations involved in the protocol are:

- Logical exclusive-OR operator (⊕)
- Logical AND operator (∧)
- Logical NOT operator (~)

Figure 6 represents the David-Prasad protocol. In this protocol, the IDS and keys shared by the tag and reader are again strings of length n bits. After the initialization of the conversation by the reader through a HELLO message, and on receiving the IDS from the tag, the reader generates A, B and D. In these messages the reader uses two randomly generated numbers, n1 and n2. The exchange is:

Reader → Tag: Hello
Tag → Reader: IDS
Reader → Tag: A||B||D
Tag → Reader: E||F

where both reader and tag hold {ID, IDS, K1, K2}, and the messages are formed in the following manner:

A = (IDS ∧ K1 ∧ K2) ⊕ n1
B = (IDS ∧ K2 ∧ K1) ⊕ n2
D = (K1 ∧ K2) ⊕ (K2 ∧ n1)
E = (K1 ⊕ n1 ⊕ ID) ⊕ (K2 ∧ n2)
F = (K1 ∧ n1) ⊕ (K2 ∧ n2)

IDS_next = IDS_prior ⊕ n1 ⊕ n2
Physical Layer Secrecy Solution for Passive Wireless Sensor Networks

Abstract—The backscatter communication system has tremendous potential in commercial applications, yet very little work has been done to study its benefits. The backscatter communication system is the backbone of many low-cost and low-power distributed wireless systems. Data transmission between nodes in a wireless communication system always comes with the risk of third-party interception, which leads to privacy and security breaches. In this paper, the physical layer security of a backscatter wireless system for the multiple-eavesdropper, single-tag, single-reader case is studied. Unique characteristics of the channel are used to secure signal transmission. In order to degrade the reception of the signal by an eavesdropper, a noise injection scheme is proposed. The advantages of this approach are discussed for various cases while evaluating the impact of key factors, such as antenna gain and the location of the eavesdropper, on the secrecy of the transmission. Analytical results indicate that, if properly employed, the noise injection scheme improves the performance of the backscatter wireless system.

Index Terms—Backscatter communication system, physical layer secrecy, wireless sensor networks, artificial noise injection.

I. INTRODUCTION

Physical layer security focuses on exploiting the physical layer properties of channels to secure signal transmission against eavesdropping. It can also complement current cryptographic techniques, as the two approaches operate in different domains: one protects the communication phase while the other protects the data processing after the communication phase. In addition, physical layer security techniques can be adopted for secret key generation and distribution by exploiting the rich randomness and dynamics available in the wireless medium.

Radio Frequency Identification (RFID) is a wireless technology used to identify and track objects using a tag and a reader. Tags, also called transponders or labels, are attached to the object being detected or implanted inside the body of the animal to be tracked. Every tag consists of a chipset and an antenna, used respectively to store information about the object and to transmit information to the reader as well as to harvest power. The most common types of tags do not have their own energy source; instead they harvest energy from a continuous carrier wave. The reader, also called an interrogator, interrogates tags. The interrogation takes place over wireless media, and there is no line-of-sight requirement between the two.

Backscatter wireless communication is an emerging wireless technology that has been extensively used in distributed passive wireless sensor networks and RFID systems [1]-[10]. RFID systems allow interconnection of various objects using tags [1]. Various technical problems, from tag circuit design to advanced backscatter signal processing, must be dealt with to reap the benefits of backscatter-based wireless systems [1]-[11]. The standardization and practical issues related to RFID and backscatter communication security are discussed in [5]. Different issues related to the authentication of RFID tags and their associated protocols are studied in [6], [8], and [9].

The secrecy of backscatter wireless communication systems constitutes a main design problem. Malicious attacks in communication, such as eavesdropping, may result in data interception as well as serious security or privacy issues like owner tracking or identity or data modification [11]. The pioneering work of Wyner [12] was the first to discuss the use of wireless channel characteristics as a tool to secure wireless communication. Securing backscatter wireless communication systems faces basic problems arising from practical limitations in terms of size, cost, and computation, which encourage novel approaches to wireless communication security [13], [14]. To overcome some of the limitations of cryptography in backscatter wireless communication systems, physical layer (PHY) secrecy techniques have been developed that do not depend on secret key generation and distribution. Physical layer secrecy schemes can be used either as a supplement to cryptography or as a complement that strengthens available cryptographic schemes by providing a secure channel for key distribution and exchange.

In section II, the system model is explained; the multiple-eavesdropper, single-reader, single-tag case is illustrated with the help of received-signal and SNR expressions. In section III, the noise injection scheme is discussed, and its effects on signal reception and the corresponding changes in SNR are studied. In section IV, conditions for positive secrecy are evaluated. Section V contains the results of the paper. Conclusions and future work are discussed in section VI.

II. SYSTEM MODEL

Consider a backscatter communication system consisting of a single reader, a single tag and multiple eavesdroppers. The tag does not initiate communication on its own. To wake up the tag, the reader transmits a continuous wave (CW) signal, i.e., the carrier. An RF voltage is induced over the antenna, which in
γ_R = (P_x Γ G_RT² K² d_RT⁻⁴) / (k P_z Γ G_RT² K² d_RT⁻⁴ + σ_R² + σ_T² G_RT K d_RT⁻²)   (7)

where γ_R is the SNR at the reader and k represents the noise attenuation factor. The aim of this paper is to secure the transmission of data from eavesdroppers using physical layer security techniques. The received signal at the eavesdropper is

y_E = h_TE h_RT x_s + η_E + h_TE η_TE   (4)

The additional interference caused at the eavesdropper while receiving the backscattered signal from the tag gives

y_En = h_TEn h_RT x_s + h_TEn h_RT z_s + h_REn z + η_En + h_TEn η_TEn   (8)
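Equation (7) can be exercised numerically. The sketch below is purely illustrative: the function name and every parameter value are assumptions, not figures from the paper; it only demonstrates that the injected-noise term vanishes from the reader's SNR when self-interference cancellation is perfect (k = 0).

```python
def reader_snr(Px, Pz, Gamma, G_RT, K, d_RT, k, sigma_T2, sigma_R2):
    """Reader SNR gamma_R per eq. (7): backscattered signal power divided by
    residual injected noise, reader noise, and forwarded tag noise."""
    signal = Px * Gamma * G_RT**2 * K**2 * d_RT**-4
    noise = (k * Pz * Gamma * G_RT**2 * K**2 * d_RT**-4   # residual injected noise
             + sigma_R2                                    # reader noise
             + sigma_T2 * G_RT * K * d_RT**-2)             # tag noise, backscattered
    return signal / noise

# Perfect noise cancellation (k = 0) versus imperfect (k = 0.5), same link.
params = dict(Px=1.0, Pz=0.5, Gamma=0.5, G_RT=4.0, K=0.1,
              d_RT=2.0, sigma_T2=1e-6, sigma_R2=1e-6)
snr_perfect = reader_snr(k=0.0, **params)
snr_leaky = reader_snr(k=0.5, **params)
```

With k = 0 the first noise term drops out entirely, which is why the reader can inject noise against the eavesdropper without degrading its own reception.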
V. RESULTS

Using the SNR expressions in (7) and (9), with the worst-case condition on the eavesdropper's noise (i.e., σ_E² = σ_TE² = 0), this condition is given by

(d_TE / d_RE)² (G_TE / G_RE) > (d_RT² σ_R² + K G_RT σ_T²) / (K G_RT P_z) − (1 − k) Γ G_RT K d_RT⁻²   (12)

It is clear that noise injection, P_z > 0, is important for a positive secrecy rate, and the amount required depends on how far away the eavesdropper is. When the eavesdropper is very close to the tag, maintaining secrecy becomes very difficult and a strong noise signal has to be introduced.

In order to fulfil the condition in (12) without regard to the eavesdropper's parameters, the right side of the above inequality must be made non-positive. Therefore

P_z > (d_RT⁴ σ_R² + d_RT² K G_RT σ_T²) / ((1 − k) Γ G_RT² K²)   (13)

The right side of (13) is finite as long as k < 1. Therefore, if the reader is able to adjust the power of noise injection according to (13), secure communication is possible regardless of the location and antenna gains of the eavesdropper.

Figure 3 shows the minimum required power for noise injection for a range of noise attenuation factors. As the attenuation capability of the reader gets weaker, i.e. as k increases, a stronger noise signal is needed to maintain positive secrecy. Also, the lower curve, with a reader-to-tag distance of 2 m, indicates that the minimum required noise injection power is less than that for a reader-to-tag distance of 4 m; that is, as the reader gets closer to the tag, the required noise injection power decreases.

Figure 4 shows that the minimum required power for noise injection depends on the location of the eavesdropper. As the eavesdropper gets closer to the tag, the power required for noise injection increases towards infinity and achieving positive secrecy becomes more challenging. It is therefore interesting to study how close the eavesdropper can get to the receiver for practical values of the noise power generated by the reader.

Now consider the case where the reader does not perform noise processing (i.e., k = 0) and all antennae are omnidirectional. The condition for achieving positive secrecy then becomes

P_z > d_RE² (d_RT² σ_R² + K G_RT σ_T²) / (d_TE² K G_RT)   (14)

Secure communication at a positive secrecy rate can usually be achieved by inserting a small amount of noise at the reader. Next, the problem of optimally allocating the total transmission power at the reader between the conventional CW signal and the injected noise is discussed, in order to maximize the achievable secrecy rate given in (10). To perform this power allocation, the reader must be able to estimate the reader-tag channel. To do so, the reader can use the backscatter signal to estimate the channel, either jointly during signal detection or in an initial training phase with the tag.

Fig. 5. Optimal value of α versus receiver-eavesdropper distance d_TE.
Fig. 6. Optimal value of α versus eavesdropper's antenna gain ratio G_TE/G_RE.
Fig. 7. Variation in the SNR at the eavesdropper for the optimal power allocation ratio.

To study the optimal power allocation problem, we define the ratio of power allocated between the conventional CW signal and the noise as α ∈ (0, 1]:

P_x = αP and P_z = (1 − α)P

The optimization of the power gives the optimal α as

α = 1 − [√(a(a + k)[a(b − 1) + (b − k)]) − a(1 − k)] / [a(b − 1) + k(b − k)]   (15)

where a = σ_R² d_RT⁴ / (P Γ G_RT² K²) and b = 1 + G_RE d_RT² d_TE² / (Γ G_RT G_TE K d_RE²).

Figure 5 shows the optimal value of α versus the receiver-eavesdropper distance d_TE ranging from 0 to 1 m. The optimal value of α is very close to 1 for nearly the entire range of tag-eavesdropper distances (including the values of d_TE > 1 not shown in the figure), which implies that only a small fraction of power is required for noise injection in order to achieve the optimal physical layer security performance. Only when the receiver-eavesdropper distance approaches 0 does the optimal α start to drop significantly.

The impact of the eavesdropper's antenna gains, when it is equipped with a directional antenna of potentially high directivity, can be studied as follows: by either minimizing its antenna gain towards the reader, G_RE, or maximizing its antenna gain towards the tag, G_TE, the eavesdropper is able to improve its SNR. The expression for γ_E in (9) shows (with the assumption of σ_E² = 0) that the ratio of the eavesdropper's antenna gains, i.e. G_TE/G_RE, is an important factor, and a range of its values is considered. For simplicity, consider d_RT = d_RE = d_TE = 2 m or d_RT = d_RE = d_TE = 4 m.

Figure 6 shows the optimal value of α versus the eavesdropper's antenna gain ratio G_TE/G_RE ranging from 1 to 10^5. The optimal value of α is very close to 1 for a wide range of practical antenna gains. Only when the gain ratio goes beyond 10^3 does the optimal α start to drop, but it still remains at a large value even if the gain ratio reaches 10^5. This implies that, for most practical antenna gains, a small portion of the power is needed for noise injection.

Figure 7 shows that, as the optimal power allocation ratio increases, the SNR at the eavesdropper also increases. An increase in the power allocation ratio indicates a decrease in the transmitted noise power. This results in an improved SNR at the eavesdropper and thus affects the transmission, increasing the vulnerability of the signal to detection by the eavesdropper and hence reducing the secrecy of backscatter communication. Thus, for a low SNR at the eavesdropper, more noise power has to be injected and in turn a lower power allocation ratio should be maintained.

VI. CONCLUSION

In this paper, we analyzed the characteristics of physical layer security in the multiple-eavesdropper, single-reader, single-tag case and proposed conditions for positive secrecy. A noise injection scheme is proposed using the characteristics of the channel. Conditions for positive secrecy are derived for various cases. Optimal power allocation is discussed and the performance analyzed under different situations.

In future, investigations may be done for the multi-reader/tag case, and on designing more efficient, backscatter-oriented secrecy-achieving codes with low complexity under various backscatter radio propagation environments.

REFERENCES

[1] D. M. Dobkin, The RF in RFID: Passive UHF RFID in Practice. Newnes, 2007.
[2] T. Philips, T. Karygiannis, and R. Kuhn, "Security standards for the RFID market," IEEE Security & Privacy, vol. 3, no. 6, pp. 85-89, Nov. 2005.
[3] A. O. Bicen and O. B. Akan, "Energy-efficient RF source power control for opportunistic distributed sensing in wireless passive sensor networks," in Proc. 2012 IEEE Symp. Comput. Commun.
[4] D. Arnitz, U. Muehlmann, and K. Witrisal, "Wideband characterization of backscatter channels: derivations and theoretical background," IEEE Trans. Antennas Propag., vol. 60, no. 1, pp. 257-266, Jan. 2012.
[5] A. Juels, "RFID security and privacy: a research survey," IEEE J. Sel. Areas Commun., vol. 24, no. 2, pp. 381-394, Feb. 2006.
[6] A. Juels, "Minimalist cryptography for low-cost RFID tags," in Proc. 2004 Int. Conf. Security Commun. Netw.
[7] E. Vahedi, R. K. Ward, and I. Blake, "Security analysis and complexity comparison of some recent lightweight RFID protocols," in Proc. 2011 Int. Conf. Computational Intelligence Security Inf. Syst.
[8] S. Piramuthu, "SASI: a new ultralightweight RFID authentication protocol providing strong authentication and strong integrity," IEEE Trans. Dependable Secure Comput., vol. 4, no. 4, pp. 337-340, Dec. 2007.
[9] P. Peris-Lopez, J. C. Hernandez-Castro, J. Estevez-Tapiador, and A. Ribagorda, "M2AP: a minimalist mutual-authentication protocol for low-cost RFID tags," in Proc. 2006 Int. Conf. Ubiquitous Intelligence Comput.
[10] T. Li and G. Wang, "Security analysis of two ultra-lightweight RFID authentication protocols," in Proc. 2007 IFIP SEC.
[11] H.-J. Chae, D. J. Yeager, J. R. Smith, and K. Fu, "Maximalist cryptography and computation on the WISP UHF RFID tag," in Proc. 2007 RFID Security.
[12] A. D. Wyner, "The wire-tap channel," Bell System Tech. J., vol. 54, no. 8, pp. 1355-1387, 1975.
[13] P. K. Gopala, L. Lai, and H. El Gamal, "On the secrecy capacity of fading channels," IEEE Trans. Inf. Theory, vol. 54, no. 10, pp. 4687-4698, Sept. 2008.
[14] Z. Li, W. Trappe, and R. Yates, "Secret communication via multi-antenna transmission," in Proc. 2007 Conf. Inf. Sciences Syst.
[15] M. Bloch and J. Barros, Physical-Layer Security: From Information Theory to Security Engineering. Cambridge University Press, 2011.
[16] W. Saad, X. Zhou, Z. Han, and H. V. Poor, "On the physical layer security of backscatter wireless systems," IEEE Trans. Wireless Commun., vol. 13, no. 6, pp. 3442-3451, 2014.
Visualising and Analysing Online Social Networks
Kambale Vanty Muhongya and Manoj Sewak Maharaj
School of Management, Information Technology and Governance
University of KwaZulu-Natal
Durban, South Africa
vantykyb@gmail.com, Maharajms@ukzn.ac.za
Abstract—The immense popularity of online social networks generates sufficient data that, when carefully analysed, can reveal unexpected realities. People are using them to establish relationships in the form of friendships. Based on data collected, students' networks were extracted, visualised and analysed to reflect the connections among South African communities using Gephi. The analysis revealed slow progress in terms of connections among communities from different ethnic groups in South Africa. This was facilitated through analysis of data collected through Netvizz, as well as by using Gephi to visualise social media network structures.

Keywords—online social network; betweenness centrality; closeness centrality; race; visualization; graph; Gephi

I. INTRODUCTION

Data visualization is becoming an increasingly important component of analytics in the age of big data, with businesses trying to understand global markets, governments concerned about the transformation and well-being of their citizens, and many other challenges. A variety of tools such as Gephi, Vizster, Leximancer and NodeXL are used around the world to visualise and analyse large-scale data.

Online social networks are formed by individuals and/or content and the relationships between them. Facebook remains the leading social network today, known for connecting individuals in the form of friendships and claiming to transform distant communities. In this paper, the researchers present an analysis of student Facebook networks using Gephi, an open-source tool that works like a database and assists in visualising and analysing obscured large-scale online networks.

II. BACKGROUND

A primary use of social networking sites (SNSs) is communicating and sharing information with friends. When it comes to social network site statistics, African nations rank low. However, South Africa currently ranks 29th on Facebook's international customer record and shows many resemblances with larger nations [1]. According to Socialbakers [2], with a large 82% membership, Facebook is the most prominent social media site used in Southern Africa. Over half of South African Facebook users access the site via their mobile phones. Eighty percent of the respondents indicated that they have a MySpace profile, with MXit at 29% and Twitter at a close 28%. Additional findings from the same survey indicated that 74% surf the Internet to visit SNSs, 74% access Facebook at least once a day, 25% have met more friends on SNSs than they have in real life, 24% have gone on a real-world date with someone they met on social media, and 16% use SNSs to advertise their businesses.

In general, Facebook contains a wall, a friend page, a news feed and an email page. A wall is an area where the user or friends can post notes or add multimedia. A friend page shows the number and a list of the friends a user is connected to. A news feed informs the user about Facebook events and about the activities of Facebook friends. Facebook has an embedded email service available for users to send private messages to other Facebook users. To see and view profiles of individuals on Facebook, a user needs to subscribe, and a valid existing email address is required to subscribe and use the network. Facebook allows searches and discloses personal information. Users are not obliged to disclose information, and they can restrict access to their profiles by changing their privacy settings in the system [3]. Nevertheless, by default, anyone can search for and read other people's profiles on the network.

Facebook is a social network used by university students, high school pupils and others. Facebook profiles show contact details, including physical addresses and telephone numbers, and additional information not often found on other social networks, thus enhancing its capacity to form both online and real-world friendships. Facebook users can add friends to their profiles by sending a friend request to another user. When the other party accepts the request, the connection is shown in the network of friends.

Considering the descriptions above, it can be difficult to understand the links among the events, people and content on social networks like Facebook. However, with a tool like Gephi, this has become possible. Initiated by Mathieu Jacomy in 2006, Gephi visualises and analyses all kinds of networks and complex datasets, including dynamic and hierarchical graphs [4]. It runs on Windows, Linux and Mac OS and is free of charge. We have used Gephi to visualise and analyse large Facebook networks of students at South African universities. Gephi works with other compatible software such as Netvizz.
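The kind of friendship-network metric this paper relies on, betweenness centrality, can be illustrated with a short stand-alone sketch. The graph, the node names and the brute-force algorithm below are illustrative only; they are neither the study's data nor Gephi's implementation.

```python
from collections import deque
from itertools import combinations

# Toy friendship network: two tight communities bridged by one student "X".
# Names and edges are hypothetical, not data from the study.
edges = [("A", "B"), ("A", "C"), ("B", "C"),     # community 1
         ("D", "E"), ("D", "F"), ("E", "F"),     # community 2
         ("C", "X"), ("X", "D")]                 # X bridges the two groups
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def shortest_paths(s, t):
    """Enumerate every shortest path from s to t by breadth-first search."""
    q, found, best = deque([[s]]), [], None
    while q:
        path = q.popleft()
        if best is not None and len(path) > best:
            break                      # BFS order: all shortest paths found
        if path[-1] == t:
            best = len(path)
            found.append(path)
            continue
        for nb in adj[path[-1]]:
            if nb not in path:
                q.append(path + [nb])
    return found

# Betweenness: for every node pair, credit the interior nodes of each
# shortest path with 1/(number of shortest paths for that pair).
bc = {v: 0.0 for v in adj}
for s, t in combinations(adj, 2):
    paths = shortest_paths(s, t)
    for p in paths:
        for v in p[1:-1]:
            bc[v] += 1 / len(paths)

print(max(bc, key=bc.get))  # prints: X
```

The bridging student scores highest because every shortest path between the two communities passes through that node, which is exactly why the visualisations later in the paper size nodes by betweenness.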
Netvizz assists in the collection of Facebook networks in the form of a file that is then imported into Gephi for further analysis. Other compatible formats include .gexf, .gdf, .gml, GraphML, Pajek NET, GraphViz DOT, .csv, UCINET DL, Tulip TLP, NetDraw VNA and spreadsheets. Once imported, the data is presented in the form of a dataset with rows and columns, and more columns can be added if more data need to be added to the data source. Gephi is easier to use than other network analysis tools, where the user might need to write scripts in order to visualise networks.

Gephi has been used to visually map @reply conversation networks on Twitter [6] and to visualise and analyse business intelligence data [7], but it has not been used to map connections among different ethnic groups. In this paper we use Gephi to map and analyse students' relationships based on race.

III. NETWORK VISUALIZATION

A network can be partitioned. A partition uses labels from the data source: for instance, if there is a label such as gender or race in the data source, it can be used to distribute groups in the network. The partition presents nodes in the form of frequencies with an associated percentage, which indicates how many nodes belong to a particular group. A network filter is used to perform statistics linked to ego networks, degree range and so on. Ego networks can be studied separately from the main network; this is made possible by taking the number identifying the node from the database and searching for it in the ego network filter settings. Analysis in terms of path length, clustering coefficient, modularity (betweenness and closeness centrality) and network density can be performed.

The usability of the application is of great importance, as it manipulates data, presents it in graphical form, and performs the kind of frequency-based statistical analysis otherwise performed by applications such as SPSS or Excel.
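The partition and ego-filter operations described above can be mimicked in a few lines of plain Python. The labels, node names and edges below are hypothetical placeholders, and this is only a sketch of what Gephi's partition panel and ego filter compute, not their implementation.

```python
from collections import Counter

# Hypothetical dataset: node -> partition label, plus an edge list.
# All values are illustrative placeholders, not the study's data.
labels = {"A": "group1", "B": "group1", "C": "group1",
          "D": "group2", "E": "group2", "X": "group2"}
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("D", "E"), ("C", "X"), ("X", "D")]

# Partition: frequency of each label as a percentage of all nodes,
# the figure Gephi's partition panel reports.
total = len(labels)
partition = {g: 100.0 * n / total for g, n in Counter(labels.values()).items()}
print(partition)                     # prints: {'group1': 50.0, 'group2': 50.0}

# Ego-network filter: keep node "C", its neighbours, and edges among them.
ego_nodes = {"C"} | {v for u, v in edges if u == "C"} \
                  | {u for u, v in edges if v == "C"}
ego_edges = [(u, v) for u, v in edges if u in ego_nodes and v in ego_nodes]
print(sorted(ego_nodes))             # prints: ['A', 'B', 'C', 'X']
```

On real Netvizz exports the labels would come from a column of the imported dataset rather than a hard-coded dictionary, but the frequency and subgraph logic is the same.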
Network 3 - Indians betweenness centrality in an Indian Facebook network

Network 4 - Blacks betweenness centrality in an Indian Facebook network

Fig. 2: Betweenness centrality Gephi visualisation

V. CONCLUSION

Apparently, despite 20 years of post-Apartheid democracy in South Africa, the impact and effects of enforced segregation are still manifest, at least insofar as friendship relationships between university students are concerned. Our analysis shows that students mostly befriend those on Facebook who are of the same ethnic background as themselves. While we acknowledge that many factors contribute to whether students become online friends, our analysis reveals an unmistakable pattern. Parallel research on this aspect, relating to the conversion of online friendships to real-world friendships, reveals a similar pattern. In this paper, Gephi was used to visualise and analyse relationships among students belonging to different ethnic groups. Gephi is a useful visual environment for the manipulation and analysis of both online and offline social networks with quantifiable data. There are opportunities for designers and developers to initiate integrative graph software that manipulates both qualitative and quantitative data in the form of graphs and other statistical measures. This would revolutionize how research is currently being performed. Obscured networks are easily observed and understood, and the understanding of individual and community behavior in relation to demographic factors can easily be analysed. It should be highlighted that the application is easy to use.
REFERENCES
Abstract—In this paper, a secure efficient hierarchical key management scheme (SEHKM) for wireless sensor networks (WSNs) is proposed. It satisfies several requirements of WSNs. The design of the scheme is motivated by the constraints of sensor networks and the high resource cost of traditional key management techniques. The scheme supports the establishment of three types of keys to encrypt messages sent among sensor nodes: the network key, the group key and the pairwise key. The network key encrypts broadcast messages and authenticates new nodes. A group key is shared by all nodes in the same cluster. A pairwise key is shared by a specific pair of nodes. In a hierarchical WSN, cluster heads are very important due to the structure of the network. In order to improve security strength and reduce the resource cost and risk after cluster head compromise, an assistant node is introduced in the scheme. SEHKM includes key establishment, key transportation, mechanisms after node operation and dynamic freshness of keys. An evaluation of SEHKM is presented, as well as a comparison with some existing key management schemes. The performance analysis and simulation show that SEHKM is very efficient in computation, communication and storage, while its security strength is high.

Keywords—hierarchical wireless sensor networks, security, key management, assistant node, key update

978-1-4673-9354-6/15/$31.00 ©2015 IEEE

I. INTRODUCTION

A wireless sensor network (WSN) is a multi-hop ad-hoc network formed by a large number of low-cost micro-sensor nodes which communicate through a radio channel [1]. It is widely used in many areas of modern society, such as the military, agriculture, the environment and health care. Because information protection is essential, security for WSNs is important and receives a lot of attention from the public. As one of the important issues, key management includes key establishment, key transportation, key update and mechanisms after node operation. Based on current research in this area, some existing key management schemes are assessed under different requirements and constraints of WSNs.

In this paper, a key management scheme for hierarchical WSNs (HWSNs) using different keys to achieve security at each level is proposed. There are three types of keys: network key, group key and pairwise key. The network key is shared by all nodes in the network and is utilized for securing broadcast messages and for authentication. A group key is shared by the nodes in the same cluster. A pairwise key is shared by two nodes. The scheme also includes key updating and mechanisms after node operation.

II. RELATED WORK

A. Constraints of WSN

In a WSN, sensor nodes are independent tiny devices. Each of them has its own battery and hardware, which leads to constraints [2, 3]. The first constraint is energy. Once a node is deployed in a network, the battery is not rechargeable in many applications; it has to serve all functions for the lifetime of the node. As a result, reasonable energy consumption is very important for a sensor node and affects the overall performance of the whole network. Next, the physical size and price of the node device mean it cannot carry large and expensive chips, so the memory and processor hardware in a node is limited. However, a sensor node built on an embedded system has three functions: a sensor interface, data processing and a network interface [4]. A sensor node has to carry out all these functions with limited hardware, and the processor requires energy too.

On the other hand, all nodes in a WSN communicate with each other over a radio channel, which is open and can be accessed by anyone in the same range. This poses a great challenge for security. In addition, a WSN can be deployed in different environments depending on the application, and the entire network is affected by the environmental conditions. Moreover, due to the unreliable channel and severe environment, there is much more packet loss and there are more faults in a WSN than in traditional networks. Table 1 summarizes the constraints of WSNs.

TABLE I. CONSTRAINTS OF WSN

Constraints      Details
Physical         Severe environment; limited resources of memory, energy and computation
Communication    Unreliable channels and limited bandwidth; collision and latency

B. Requirements of WSN Key Management [5]

Due to the constraints of WSNs, the key management scheme employed in a WSN should provide the required security strength and work efficiently under the limitations of power, memory, computation and communication. According to Carman [2] and Sastry [6], a key management
scheme should satisfy several requirements based on the security and function of the WSN. Table 2 shows the requirements.

TABLE II. REQUIREMENTS OF WSN KEY MANAGEMENT

Requirements     Details
Security         Confidentiality, authentication, data integrity, freshness and robustness
Efficiency       Reasonable communication, computation, storage and power consumption
Operation        Flexibility, scalability and accessibility

C. Network Structure

WSNs can be classified as hierarchical networks or flat networks according to the way the sensor nodes form the network [1]. In an HWSN, all nodes are divided into several clusters and each cluster has a head [5]. A cluster head leads a group of general sensor nodes, and all cluster heads are led by the head of cluster heads. Every node belongs to a cluster and communicates with the base station (BS) through its cluster head. In a flat network, all nodes have the same capability; they communicate with neighbors and relay data to the BS hop by hop. Compared with a flat WSN, an HWSN has many advantages [7]. First, in a hierarchical network, the cluster heads and the BS manage the network, and normal nodes can wake up only when they are needed for data transmission or collection, so energy consumption is reduced. Second, cluster heads and cluster members exchange information within a cluster, which helps a cluster head aggregate local information. Lastly, there is less competition between nodes for the communication channel, as cluster heads transmit most of the data, so more nodes can be deployed in the network; the hierarchical WSN has better scalability. Compared with key management schemes for flat WSNs [8, 9], schemes for HWSNs [10, 11] are more efficient. However, a cluster head manages the keys of all nodes in its cluster; hence the cluster will face serious problems if the head is compromised or damaged [5].

D. Existing Key Management Schemes [5]

Based on TDGH [12], Panja [13] proposed a key management mechanism suitable for HWSNs. The structure is one in which a cluster head leads a group of general sensor nodes and all cluster heads are led by the head of cluster heads. All sensor nodes pre-install a symmetric key for key transport encryption. Every node has a partial key used to generate group keys. The leaf nodes generate random numbers to calculate their partial keys, and the partial keys of parent nodes are calculated from the partial keys of the leaf nodes. There are two types of group keys: the intra-cluster key and the inter-cluster key. The intra-cluster group key is used for encryption/decryption of messages inside a sensor network group, whereas the inter-cluster group key is used among groups of cluster heads. A cluster head computes the intra-cluster group key by using all the partial keys in the group as arguments. The inter-cluster group key is generated in the same way by using the partial keys of the cluster heads. This scheme is simple and easy to implement, and its energy consumption is reasonable, as Panja analyzed. It also satisfies the requirements of scalability, accessibility, authentication and flexibility. But the scheme has weaknesses too. In the Panja scheme, group key generation with the blind factor costs more communication resources. The scheme implements one key per cluster, so if a node is compromised, the group key may be lost and the whole cluster will be in danger. The security strength is not high enough for some applications.

LEAP (Localized Encryption and Authentication Protocol) [10] is a key management protocol with four types of keys: the individual key, the pairwise key, the cluster key and the group key. The individual key is shared by each node with the BS. A pairwise key is shared with another node. The cluster key is shared by all nodes within a cluster, and the group key is used by the BS to communicate with all other sensor nodes in the network. These four keys are generated from a pre-distributed key called the initial key. Firstly, the individual key is calculated by a function with the ID of the node. Secondly, nodes broadcast their IDs during neighbor discovery, and the receiver uses a function with the initial key to establish the pairwise key shared with the neighbor. All nodes then delete the initial key after the pairwise key generation phase. Next is cluster key distribution: the cluster head secures the key with the pairwise key and broadcasts it in the cluster. Lastly, the BS broadcasts the group key. LEAP uses the μTimed Efficient Streaming Loss-tolerant Authentication Protocol (μTESLA) [14] to authenticate the broadcasts of the BS, which makes sure that a packet carrying the group key is sent only by the BS. LEAP has many advantages: its security strength is high and it meets the requirement of accessibility. However, LEAP fails to satisfy the flexibility requirement, because no new node can be added to the network after deletion of the initial key. In addition, if the initial key is disclosed, the adversary will be able to establish a pairwise key with any node in the network.

III. SEHKM: A SECURE EFFICIENT HIERARCHICAL KEY MANAGEMENT SCHEME FOR WSN

SEHKM provides three types of keying mechanisms to secure WSNs. An overview of the keys is introduced first. Then the assistant node, which increases the security strength of the scheme, is presented. Next are the establishment of each type of key, node operation and key update. Lastly, node compromise is described.

A. Overview

With consideration of the advantages of the hierarchical structure and of existing schemes, SEHKM is proposed. It assumes that the adversary can only eavesdrop during the initialization phase and that the BS is always secure.

SEHKM is a key management scheme for HWSNs and provides three types of keys to encrypt messages: a network key shared by all nodes in the network, a group key shared by a group of nodes and a pairwise key shared by a specific pair of nodes. Each node pre-distributes a network key and a number IN.

Network key: this is a key shared by all nodes in the network. It is pre-installed and helps with the initialization of the network. After the initialization phase, the key is updated by the BS. It is used by the BS to encrypt broadcast messages and for authentication.

Group key: a group key is a key shared by all nodes in the same group. At the lower level of the network, a cluster can be a
group, with the cluster head as group leader. At the higher level, the BS and all cluster heads form a group, with the BS as group leader. Figure 1 shows the groups at different levels. The group key secures messages broadcast within a group.

Pairwise key: a pairwise key is a key shared by two nodes. In this scheme, not every pair of nodes has a pairwise key. The BS has pairwise keys with all cluster heads, the BS shares a key with the assistant node in each cluster, a cluster head has pairwise keys with all other nodes in its cluster, and the assistant node has pairwise keys shared with all normal nodes in the cluster. During key establishment, a disposable pairwise key is used to secure the phase, and it is erased later. Pairwise keys secure unicast messages and can be used for authentication.

In the following sections, the assistant node, key establishment, transportation and updating are presented. The keying mechanisms during node operation, such as node addition, revocation and replacement, are described after that.

B. Assistant Node

The hierarchical structure has many advantages compared with the flat structure. In an HWSN, cluster heads are the most important nodes in their clusters. A cluster head organizes the other nodes in the cluster, processes data aggregation and manages security keys. As a cluster head plays such important roles, the compromise of a cluster head is fatal in the cluster and affects all of its nodes. In order to combat the security problem caused by cluster head compromise, an assistant node is proposed.

In a cluster, besides the cluster head, another management node, named the assistant node, is arranged. An assistant node is randomly chosen by the cluster head at the cluster setup phase. It shares pairwise keys with each member node and the cluster head; pairwise key establishment is covered in the next section. All member nodes in the cluster are candidates to be an assistant node. Whenever a cluster head is compromised, replaced or removed, the assistant node kicks in and takes over the management role in the cluster. Figure 1 shows assistant nodes in a two-layer wireless sensor network.

Fig. 1. Assistant nodes in a two-layer wireless sensor network.

An assistant node has the IDs of all nodes in the same cluster as well as the information of the gateway nodes. The gateway nodes enable the assistant node to find routes to the BS above its own cluster head in the hierarchical structure. It will find routes to the BS to get commands from it when the cluster head is compromised or damaged.

The assistant node finds a route to the BS in six steps:

1) Initialization.
   Set Searching Depth to 0;
   Create a route request packet with the following parameters:
   i. Cluster head ID (source CH ID)
   ii. Routing Path: Assistant Node ID
   iii. Searching Depth
2) The assistant node passes the request packet to a gateway node never encountered before.
3) The gateway node adds its own ID to the Routing Path and sends the request to a gateway node in a neighboring cluster.
4) This neighboring gateway node adds its own ID to the Routing Path and passes the request packet to its cluster head.
5) Each time a cluster head receives a request, it increases Searching Depth by 1. It then checks whether the source CH ID is one of its members; if it is, the route is found. Otherwise it checks whether Searching Depth is larger than 3; if so, go back to step 1. Otherwise it adds its ID to the Routing Path, passes the request to its own head, and goes back to step 5.
6) End.

C. Establishment of Disposable Pairwise Keys

During network clustering, each member node u joins the cluster of the head node i and generates the disposable pairwise key DPK_{u,i}, shared by node u and cluster head i, using IN and node u's ID with the function f:

    DPK_{u,i} = f_IN(ID_u)

As a cluster head knows all the IDs of its member nodes, it is able to calculate the disposable pairwise key with each member node. After network clustering and generation of the disposable pairwise key with the head node, each node erases IN and retains only the disposable pairwise key DPK_{u,i}. The cluster head may transport this key to another node if needed, so this key may be shared by more than two nodes.

D. Establishment of Group Keys

A group key is shared by a group of nodes. The first group key is a random number generated by the group leader, and it is sent to each member node encrypted by the disposable pairwise key.

After initialization of all keys, the group key needs to be updated to keep it secure. The generation of subsequent group keys differs from that of the first group key; the algorithm is described in the key update section.

E. Establishment of Network Key

All nodes in the network share one network key. This key is pre-installed and secures messages in the initialization phase. After group key establishment, the BS generates a random number
as the new network key. The BS encrypts the key with the group key and broadcasts it to all cluster heads; each cluster head then broadcasts the key, encrypted by its own group key, to its member nodes. The network key is always a random number generated by the BS.

F. Establishment of Pairwise Keys

In the SEHKM scheme there are three types of pairwise keys, distinguished by their generation and storage. The first type is the group-associated pairwise keys, which include the keys shared by group leaders with their member nodes and the pairwise keys shared by the BS with cluster heads. The second type is assistant-node associated; it includes the keys shared by the assistant node with the member nodes in the same cluster and the key shared by the assistant node with the BS. The last type is the disposable pairwise key, which is erased after the establishment of all keys. The establishment of the disposable pairwise key was introduced above.

1) Group-associated pairwise key establishment: The generation of a group-associated pairwise key uses the Diffie-Hellman algorithm. For example, in a group, the leader i generates a random seed g_1i, then calculates g'_1i as:

    g'_1i = α^(g_1i) mod ρ

The group leader broadcasts g'_1i, encrypted by the group key, in the group. A member node u then generates g_2u and sends g'_2u, encrypted by the disposable pairwise key, to the group leader i. The group leader therefore obtains g'_2u. Then the pairwise key between member node u and group leader i is:

assistant node. After the establishment of all pairwise keys, the disposable pairwise keys should be erased.

G. Node Operation

In a WSN, the operations of node addition, revocation and replacement are needed. A list of notations is defined below:

BS: base station
CH: cluster head
MN: member node
NK: network key
GK: group key
PK: pairwise key
DPK: disposable pairwise key
IK: temporary key

1) Node addition: The new node pre-installs the current network key and IN to authenticate itself and join a cluster. The cluster head then generates a random number as the pairwise key and sends it, together with the network key and the group key, to the new node. The process
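The displayed expression for the group-associated pairwise key in Section III-F is lost to a page break. Under the textbook Diffie-Hellman exchange that the passage describes, both sides arrive at the same shared value, which the following sketch checks numerically. Here α, ρ and the seeds are small illustrative values, not the scheme's actual parameters.

```python
import secrets

# Illustrative Diffie-Hellman run between group leader i and member node u.
# rho and alpha are toy public parameters (a real deployment would use a
# large safe prime); the random seeds stand in for g_1i and g_2u.
rho = 8191                 # small prime modulus (2^13 - 1), for illustration
alpha = 17                 # public base

g1i = secrets.randbelow(rho - 2) + 1    # leader's secret seed
g2u = secrets.randbelow(rho - 2) + 1    # member's secret seed

g1i_pub = pow(alpha, g1i, rho)   # broadcast by the leader (under the group key)
g2u_pub = pow(alpha, g2u, rho)   # sent by the member (under the disposable key)

# Each side exponentiates the other's public value with its own secret:
pk_leader = pow(g2u_pub, g1i, rho)
pk_member = pow(g1i_pub, g2u, rho)
assert pk_leader == pk_member == pow(alpha, g1i * g2u, rho)
print(pk_leader == pk_member)    # prints: True
```

The equality holds because (α^g_2u)^g_1i = (α^g_1i)^g_2u = α^(g_1i·g_2u) mod ρ; an eavesdropper who sees only the two public values cannot efficiently recover the shared key.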
TABLE III. STORAGE COST (S) IN DIFFERENT NODES

Node            Network Key (S)  Group Key (S)  Pairwise Key (S)  Total (S)
BS              1                1              (N-1)/m           2+(N-1)/m
Cluster head    1                2              m                 m+3
Assistant node  1                1              2                 4
Member node     1                1              2                 4

4) Simulation: Based on the above analysis, the simulated energy cost of using DES for encryption in a two-level HWSN on a popular sensor node, the MICA from CrossBow, is presented. The key size is 32 bytes and the unit is the joule. The results are shown in Table 4.

TABLE IV. ENERGY COST (JOULE) OF GROUP KEY UPDATE IN SEHKM, LEAP AND PANJA

Network Size  LEAP  Panja  SEHKM
100           0.33  0.27   0.03
200           0.60  0.54   0.05
300           1.00  0.81   0.07
400           1.34  1.08   0.10
500           1.68  1.36   0.12
600           2.01  1.63   0.14
700           2.35  1.90   0.17
800           2.68  2.17   0.20
900           3.02  2.45   0.22
1000          3.69  2.99   0.27

The results show that SEHKM always costs less energy than Panja and LEAP.

But in an HWSN, the network degree can be more than two levels, so the energy cost of SEHKM and Panja under different network degrees was simulated. DES and AES are different encryption algorithms; AES is stronger than DES but costs more energy and time. In DES, the energy and time cost of encryption for 24 bytes of data both are

Table 5 presents the time cost of group key update in Panja and SEHKM. It shows that Panja costs much more time than SEHKM. When the network degree increases, the time consumed for group key update in Panja increases, while in SEHKM it remains constant. The time for network key and group key update together in SEHKM grows as the network degree rises.

TABLE V. TIME COST (SECOND) OF PANJA AND SEHKM

Network  Network  Nodes in   Group Key  Group Key  NK and GK
Size     Degree   a Cluster  (Panja)    (SEHKM)    in SEHKM
22765    3        28         34.55      0.04       0.16
22621    4        12         75.13      0.04       0.20
19608    5        7          94.01      0.04       0.24
19531    6        5          105.53     0.04       0.28
21845    7        4          119.64     0.04       0.31

B. Security and Operation Analysis

In WSNs, compromise detection is one of the most critical security requirements, and almost all key management schemes rely on it. Once a compromise is detected, SEHKM provides a node revocation mechanism to delete the compromised node. This mechanism avoids disclosing new keys to the compromised node and improves the security strength of the sensor network. The assistant node improves the security level too: it is able to communicate with the BS safely if the cluster head is compromised, because it has a pairwise key with the BS. As described in Sections III-B and III-G, the assistant node gets commands from the BS and lets the other, uncompromised nodes join other clusters. This mechanism saves many normal nodes that other hierarchical schemes are not able to save. The method reduces resource cost and improves security.

The proposed scheme satisfies the security requirements of WSNs. First, data confidentiality can be achieved by encryption.
Three types of keys provide protection for different communication types. Then, data authentication can be provided by the pairwise key or group key using a symmetric message authentication code. Data integrity is achieved through data authentication.

In SEHKM, each node in the network is able to exchange information with any other node; this is accessibility. On the other hand, this scheme supports node addition, node revocation and node replacement services, so flexibility is provided. Besides, scalability of the network is provided by the hierarchical structure; the network is able to support a large number of nodes. Therefore, the operation requirements of WSNs are satisfied by SEHKM.

V. CONCLUSION

In this paper, a secure and efficient key management scheme for hierarchical wireless sensor networks (SEHKM) is proposed. It has the following features:

SEHKM supports the establishment, transportation and updating of three types of keys: a network key shared by all nodes in the network, a group key shared by a group of nodes and a pairwise key shared by a specific pair of nodes.

SEHKM includes an assistant node to help organize the member nodes in a cluster when the cluster head is removed, compromised or replaced. The group key update algorithm is safe and efficient.

The resource cost of SEHKM is reasonable. The key update procedure is efficient and the memory cost of key storage is small.

REFERENCES

[1] J. Zheng and A. Jamalipour, Wireless Sensor Networks: A Networking Perspective. John Wiley & Sons, 2009.
[2] D. W. Carman, P. S. Kruus, and B. J. Matt, Constraints and Approaches for Distributed Sensor Security. NAI Labs Tech. Rep. #00-010, 2000.
[3] S. Gajjar, S. Pradhan, and K. Dasgupta, "Wireless sensor network:
[8] L. Eschenauer and V. D. Gligor, "A key-management scheme for distributed sensor networks," in Proc. 9th ACM Conference on Computer and Communications Security, Washington, DC, USA, 2002.
[9] W. Du, J. Deng, Y. S. Han, P. K. Varshney, J. Katz, and A. Khalili, "A pairwise key predistribution scheme for wireless sensor networks," ACM Trans. Inf. Syst. Secur., vol. 8, pp. 228-258, 2005.
[10] S. Zhu, S. Setia, and S. Jajodia, "LEAP+: Efficient security mechanisms for large-scale distributed sensor networks," ACM Trans. Sen. Netw., vol. 2, pp. 500-528, 2006.
[11] B. Maala, H. Bettahar, and A. Bouabdallah, "TLA: A two-level architecture for key management in wireless sensor networks," in Proc. Second International Conference on Sensor Technologies and Applications (SENSORCOMM '08), 2008, pp. 639-644.
[12] M. Steiner, G. Tsudik, and M. Waidner, "Key agreement in dynamic peer groups," IEEE Trans. Parallel Distrib. Syst., vol. 11, pp. 769-780, 2000.
[13] B. Panja, S. K. Madria, and B. Bhargava, "Energy and communication efficient group key management protocol for hierarchical sensor networks," in Proc. IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006, 8 pp.
[14] A. Perrig, R. Szewczyk, J. D. Tygar, V. Wen, and D. E. Culler, "SPINS: Security protocols for sensor networks," Wireless Networks, vol. 8, pp. 521-534, 2002.
[15] W. Stallings, Cryptography and Network Security, 4/E. Pearson Education India, 2006.
[16] G. Yang, W. Chen, and X. Cao, Wireless Sensor Network Security. Beijing: Science Press, 2010.
[17] R. Watro, D. Kong, S. Cuti, C. Gardiner, C. Lynn, and P. Kruus, "TinyPK: Securing sensor networks with public key technology," in Proc. 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '04), Washington, DC, Oct. 2004, pp. 59-64.
[18] J. Zheng and A. Jamalipour, Wireless Sensor Networks: A Networking Perspective. John Wiley & Sons and IEEE, 2009.
[19] C. Chih-Chun, S. Muftic, and D. J. Nagel, "Measurement of energy costs of security in wireless sensor nodes," in Proc. 16th International Conference on Computer Communications and Networks (ICCCN 2007), 2007, pp. 95-102.
[20] N. A. Pantazis and D. D. Vergados, "A survey on power control issues in wireless sensor networks," IEEE Commun. Surveys Tuts., vol. 9, pp. 86-107, 2007.
[21] G. de Meulenaer, F. Gosset, O. X. Standaert, and O. Pereira, "On the energy cost of communication and cryptography in wireless sensor
Application led research perspective," in Recent Advances in Intelligent Networks," in Networking and Communications, 2008. WIMOB '08.
Computational Systems (RAICS), 2011 IEEE, 2011, pp. 025-030. IEEE International Conference on Wireless and Mobile Computing,
[4] I. F. Akyildiz and M. C. Vuran, Wireless Sensor Networks. Singapore: 2008, pp. 580-585.
Markono Print Media Pte Ltd, 2010. [22] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci,
[5] X. Zhang and J. Wang, "Key Management in Wireless Sensor Networks: "Wireless sensor networks: a survey," Computer Networks, vol. 38, pp.
Development and Challenges," in Applied Mechanics and Materials, 393-422, 3/15/ 2002.
2014, pp. 654-660. [23] A. S. Wander, N. Gura, H. Eberle, V. Gupta, and S. C. Shantz,
[6] N. Sastry and D. Wagner, "Security considerations for IEEE 802.15. 4 "Energy analysis of public-key cryptography for wireless sensor
networks," in Proceedings of the 3rd ACM workshop on Wireless networks," in Pervasive Computing and Communications, 2005.
security, 2004, pp. 32-42. PerCom 2005. Third IEEE International Conference on, 2005, pp. 324-
[7] R. Rajagopalan and P. K. Varshney, "Data aggregation techniques in 328.
sensor networks: A survey," 2006.
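The data-authentication step summarized above (a symmetric message authentication code computed under a shared pairwise or group key) can be sketched generically as follows. This is an illustrative HMAC sketch, not the paper's actual construction; the key and message values are made up.

```python
import hmac
import hashlib

def authenticate(key: bytes, message: bytes) -> bytes:
    """Sender computes a symmetric MAC over a sensor reading (illustrative)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver recomputes the MAC; a match gives both authentication and integrity."""
    return hmac.compare_digest(authenticate(key, message), tag)

# Hypothetical pairwise key shared by two sensor nodes.
pairwise_key = b"\x01" * 16
reading = b"node-42:temp=21.5"
tag = authenticate(pairwise_key, reading)
assert verify(pairwise_key, reading, tag)             # authentic message accepted
assert not verify(pairwise_key, reading + b"!", tag)  # tampered message rejected
```

The same sketch applies with a group key instead of a pairwise key; only the set of nodes holding the key changes.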
Experiential Analysis of the Effectiveness of Buck and Boost DC-DC Converters in a Basic Off-Grid PV System
Asowata Osamede, Ruaan Schoeman and HCvZ Pienaar
Dept of Electronic,
Vaal University of Technology,
Vanderbijlpark, SOUTH AFRICA-1900
Abstract— The major renewable energy potential on earth is provided by solar radiation, and solar photovoltaics (PV) are considered a promising technological solution to support the global transformation to a low-carbon economy and to reduce dependence on fossil fuels. The aim of this paper is to carry out an experiential analysis of the effectiveness of power conditioning devices, with a focus on Buck and Boost DC-DC converters (12 V, 24 V and 48 V), in a basic off-grid PV system with a fixed load profile. An improper choice of power conditioning device in a basic off-grid PV system can contribute to power loss; hence the right choice of power conditioning device to be coupled with the system is of the essence. This would assist in harnessing more of the available solar energy. The practical setup consists of …

… DC converters [7]. Fig. 1 reveals a basic off-grid PV system connected to a DC-DC converter via a data logging interface circuit (DLIC). The geographical position of a country strongly affects the operational application of solar energy [8]. This is attributed to the fact that the intensity of solar energy is unevenly distributed over the surface of the earth.

[Figure 1: Basic off-grid PV system — solar panel connected via current and voltage sensors (DLIC) to a DC-DC converter.]
Among the factors that unfavourably influence the output power of a PV system, the temperature effect is a major factor [10], as is the appropriate choice of power conditioning devices in PV systems. A rise in the surface temperature of a PV module negatively affects its output power. It is important to note that the direct beam radiation received by a PV panel is directly related to its power conversion [11]. In line with the above, the average minimum and maximum day temperatures in degrees for the months of April to July 2012 are represented in Table 1 and Fig. 2, to give a concise description of the temperature recorded during the sampling period.

[Figure 2: Average minimum and maximum temperature (April – July, 2012).]
The average maximum and minimum day temperatures were derived using MS Excel from the day temperature data downloaded from the DAQPRO data logger. Further analysis, which involves a bar chart, shows that the months of April and May had the highest day temperatures, with the month of July having the lowest recorded temperature. The temperature on the surface of a PV panel directly affects the PV output power which is supplied to the DC-DC converter (Buck or Boost). Following the graph presented in Fig. 2, a Pearson correlation between the PV voltages and the average temperature per week was computed and is presented in Fig. 3. Correlation is a statistical technique that can show whether, and how strongly, pairs of variables or data are related [12]. Fig. 3 clearly shows the correlation of the PV voltage with an increase in day temperature. It is advised that a PV system should be operated at the maximum power point in order to reduce the overall cost of the system [13], so power regulation circuits such as DC-DC converters should be incorporated in the set-up of a PV system.

[Figure 3: Graph of the correlation of the PV voltage against the day temperature for the month of June, 2012.]

Power regulation circuits used in PV systems include Maximum Power Point trackers, DC-DC converters and solar chargers, but this article is limited to the empirical test of the use of DC-DC converters. The correlation is further used to determine the effect that different radiation conditions exert on the output of a DC-DC converter. Fig. 4 shows an example of this effect. The first portion of the graph (0% to 56%) indicates no output of the DC-DC converter, as no solar radiation exists (attributed mainly to night time). The second portion of the graph (56% to 81%) is where the DC-DC converter produces an output voltage below its stated operating voltage (12 V); this is due to the fact that diffused radiation exists rather than direct beam radiation. Diffused radiation is basically the radiation from the surroundings of the PV panel, which is not sufficient to drive the load successfully. The third portion of the graph (82% to 100%) indicates the effect of direct beam radiation on the DC-DC converter voltage. The point at which power is delivered to the load is indicated by a triangle and represents the start of the conversion time as a result of the direct beam radiation (100% − 81%, equaling 19% for this given example). The triangle also indicates a voltage of about 12 V, which coincides with the output voltage specification of the DC-DC converter. The effect of the minimum and maximum temperature on the output voltage from the DC-DC converter during the sampling period is presented in Fig. 3. Table 2 presents a succinct comparison of the types of DC-DC converters that exist and their basic applications, giving an overview of the characteristics of the Buck and Boost DC-DC converters and thus assisting in making the right choice for PV systems with a specific load profile.

TABLE I. AVERAGE MINIMUM AND MAXIMUM TEMPERATURES FOR THE MONTHS OF APRIL–JULY, 2012

Month   Average minimum      Average maximum
        temperature (°C)     temperature (°C)
April        3                   38
May          4                   38
June         1                   25
July        -3                   28
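The Pearson correlation used for Fig. 3 can be reproduced from first principles; the temperature/voltage pairs below are invented for illustration and chosen to show the expected inverse relationship between day temperature and PV voltage, not the paper's measurements.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: cov(x, y) / (std(x) * std(y))."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical weekly averages: day temperature (°C) vs. PV voltage (V).
temps = [3, 8, 15, 21, 27, 33]
volts = [29.1, 28.4, 27.2, 26.0, 24.9, 24.1]
r = pearson_r(temps, volts)  # strongly negative: voltage falls as temperature rises
```

An r close to −1 for such data would indicate the strong negative linear association between module temperature and PV voltage that the paper discusses.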
… measuring instrument which must be controlled on at least two occasions [14]. The PV panel was connected to 12 V, 24 V and 48 V MEANWELL DC-DC converters, which are Buck and Boost converters and can be used for power conditioning [15]. The 12 V, 24 V and 48 V DC-DC converters were tested in this research because they maintain a low power loss in the conversion process. A DLIC using hall-effect current sensors, LTS 6-NP (a LEM product), connected the PV system to a DAQPRO 5300 data logger for recording input and output voltages and currents. The DAQPRO 5300 data logger has 8 analogue input channels and a 16-bit sampling resolution, which makes it ideal for logging purposes in this type of empirical research [16]. The DC-DC converter was connected to a varying load resistance.

[Figure 4: Radiation conditions for the output of a DC-DC converter — DC-DC converter voltage (0–14 V) against percentage of time (0–100%), showing regions of no radiation, diffuse radiation and direct radiation, with an R² value for the fitted trend.]
However, it is possible that the PV voltages could be as high as 29 V. Bearing this 29 V limit in mind reveals that the data loggers stated above will not be able to accommodate such high voltages, as most of them have a maximum input voltage of 2.5 V. A data logging interface (DLI) circuit was therefore incorporated in the system. It is necessary to condition the voltage to make it less than the maximum input voltage required by the data loggers. The DLI must also be able to provide DC current monitoring using hall-effect current sensors. The electrical design of the test setup is presented in Fig. 5.

TABLE II. TYPES OF DC-DC CONVERTERS THAT EXIST AND THEIR BASIC APPLICATION

                    Step-up switching converters     Step-down switching regulators
Known names         Boost switching regulators       Buck switching regulators
Working principle   Provide a higher output          Provide a regulated voltage level
                    voltage than the input voltage   that is less than the input voltage
Advantages          Easy to analyse; simple          High efficiency, high switching
                    circuit                          frequencies, built-in circuit
                                                     protections at low cost
Disadvantages       Not suitable for high power      High voltage ripples at the output
                    conversion
Application         Designed for driving strings     Available with internal architectures
                    of LEDs                          optimized for applications with specific
                                                     goals, such as power saving and highly
                                                     efficient PV systems

III. PRACTICAL SET UP

[Figure 5: Electrical design of the test setup — SOLARWORLD SW220 PV panel (tilt angles 16°, 26° and 36°; orientation angle 0°) connected through DLICs and DAQPRO 5300 data loggers (input power Vi × Ii, output power Vo × Io) to a MEANWELL 12 V/24 V DC-DC converter and a 6 Ω load resistor.]
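The working principles in Table II correspond, for ideal converters in continuous conduction mode, to the textbook duty-cycle relations Vout = D·Vin (Buck) and Vout = Vin/(1 − D) (Boost). The sketch below is a generic illustration of those relations, not code or data from the paper.

```python
def buck_vout(vin: float, duty: float) -> float:
    """Ideal buck (step-down): output is the input scaled by the duty cycle D."""
    assert 0.0 <= duty < 1.0
    return vin * duty

def boost_vout(vin: float, duty: float) -> float:
    """Ideal boost (step-up): output exceeds the input by the factor 1/(1 - D)."""
    assert 0.0 <= duty < 1.0
    return vin / (1.0 - duty)

# A nominal 24 V panel voltage stepped down to 12 V, or stepped up to 48 V.
print(buck_vout(24.0, 0.5))   # 12.0
print(boost_vout(24.0, 0.5))  # 48.0
```

Real converters deviate from these ideal relations through switching and conduction losses, which is exactly what the efficiency measurements in the next table quantify.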
Samples  Tilt angle (°)  PV voltage Vin (V)  PV current Iin (A)  Power in (W)  DC voltage Vout (V)  DC current Iout (A)  Power out (W)  Efficiency (%)  On-time/wk (hrs)
1        16              27.86               4.115               114.4         22.95                4.078                93.59          81.63           4.23
2        16              29.43               4.115               121.0         24.21                4.078                98.72          81.52           4.77
3        26              31.8                4.115               130.9         24.27                4.078                98.97          75.63           33.06
4        26              30.37               4.115               124.9         25.17                4.078                102.64         82.13           28.39
5        36              31.26               4.115               128.3         24.01                4.078                97.91          76.11           14.71
6        36              30.26               4.115               124.1         24.16                4.078                98.52          79.12           16.91
Average                  30.16               4.115               123.93        24.12                4.078                98.39          79.35           NA
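The Efficiency column of the table above is the ratio of output to input power, η = (Vout·Iout)/(Vin·Iin). Re-computing sample 1 from its voltage and current readings reproduces the printed 81.63% to within rounding (the printed "Power in" values appear rounded independently).

```python
def efficiency(vin: float, iin: float, vout: float, iout: float) -> float:
    """Converter efficiency in percent: output power over input power."""
    return (vout * iout) / (vin * iin) * 100.0

# Sample 1 from the measurement table (16° tilt).
eta = efficiency(27.86, 4.115, 22.95, 4.078)
# eta is about 81.6, matching the table's 81.63 % up to rounding.
```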
Abstract—Coverage in 3D wireless sensor networks (WSNs) is always a very critical issue to deal with. Coming up with good coverage models implies more energy-efficient networks. k-coverage is one model that ensures that every point in a given 3D Field of Interest (FoI) is guaranteed to be covered by k sensors. When it comes to 3D, coming up with a deployment of sensors that guarantees k-coverage becomes much more complicated than in 2D. The basic idea is to come up with a convex body that is guaranteed to be k-covered by taking a specific arrangement of sensors, and then fill the FoI with non-overlapping copies of this body. In this work, we propose a new shape for the 3D scenario, which we call a Sixsoid. Prior to this work, the convex body which was proposed for coverage in 3D was the so-called Reuleaux tetrahedron. Our construction is motivated by a construction that can be applied to the 2D version of the problem, in which it implies better guarantees than the Reuleaux triangle. Our contribution in this paper is twofold: firstly, we show how the Sixsoid guarantees more coverage volume than the Reuleaux tetrahedron; secondly, we show how the Sixsoid also guarantees a simpler and more pragmatic deployment strategy for 3D wireless sensor networks. In this paper, we show the construction of the Sixsoid, calculate its volume and discuss its implications for k-coverage in WSNs.

I. INTRODUCTION

Wireless Sensor Networks (WSNs) consist of energy-constrained sensor nodes having limited sensing range, communication range, processing power and battery. The sensors generally follow different hop-by-hop ad-hoc data-gathering protocols to gather data and communicate. A sensor can sense whatever lies in its sensing range using RF-ID (Radio Frequency Identification). The sensed data can be communicated to another sensor node which lies within the communication range of the sender. The gathered data finally reach the base station, which may be many hops away from any sensor. Since sensor transceivers are omni-directional, we model the sensing and communication ranges as spheres of certain radii. The network is called homogeneous when all the sensors have the same radii and heterogeneous otherwise. Coverage of a certain FoI and deployment of sensors are research issues where the aim is to make energy-efficient networks. Sensor deployment and coverage in 2D require simpler strategies and protocols than in 3D. 3D sensor networks are generally used for underwater sensor surveillance [4], floating lightweight sensors in air and space, air and water pollution monitoring, forest monitoring, and other possible 3D deployments. Real-life applications of WSNs are mostly confined to 3D environments. The term k-coverage in 3D is used to describe a scenario in which k sensors cover a common region. More precisely, a point in 3D is said to be k-covered if it lies in the region that is common to the sensing spheres of k sensors; k is termed the degree of coverage. The ultimate aim of this project is to come up with a deployment strategy for sensors that guarantees k-coverage of a given 3-dimensional Field of Interest (FoI) for large values of k. A first step to address this issue is to come up with a 3-dimensional convex body (tile) that is guaranteed to be k-covered by a certain arrangement of k (or more) sensors, and then fill the FoI with non-overlapping copies of that shape by repeating the same arrangement.

The term Sixsoid has been coined in this paper to signify a geometrical shape that resembles a super-ellipsoid [14]. The Sixsoid is created by the intersection of six sensing spheres of the same radius r, placed on the six face centers of a cube of side length r. We compare the implications of this convex body with the previously proposed model of 3D k-coverage based on the Reuleaux tetrahedron. Recall that the Reuleaux tetrahedron is created by the intersection of four spheres placed on the vertices of a regular tetrahedron of side length r. In an attempt to guarantee 4-coverage of the given field, [2] considers a scenario in which four sensors are placed on the vertices of a regular tetrahedron of side length equal to the sensing radius r; their intersection is the so-called Reuleaux tetrahedron. Unfortunately, it is not possible to obtain a tiling of 3D space with non-overlapping copies of the Reuleaux tetrahedron [1]; in fact, such a tiling is not possible even with a tetrahedron [15]. In [1] a plausible deployment strategy is hinted at that exploits this construction by overlapping two Reuleaux tetrahedra, gluing them at a common tetrahedron face, but this deployment does not seem to be pragmatic. In this paper, we propose another 3D solid (the Sixsoid) for this purpose and an extremely pragmatic deployment strategy. We show that the volume guaranteed to be 6-covered (by the Sixsoid) is more than the volume guaranteed to be 4-covered (by the Reuleaux tetrahedron). Indeed, 6-coverage is more desirable than 4-coverage if it takes fewer sensors to k-cover the same FoI. We also discuss how the fraction of the volume of the Sixsoid inside a cube changes as a function of the sensing radius.

Another key ingredient of the work of [1], [2] is a result that states that any convex object with breadth (the maximum distance between any two points) at most r (where r is the …
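The Sixsoid's membership test follows directly from its definition: a point is 6-covered if it lies within all six sensing spheres of radius r centered on the face centers of a cube of side r. The Monte Carlo sketch below estimates the Sixsoid's volume under that definition; the sample count, seed and bounding box are our own choices, and the estimate is an illustration, not the paper's closed-form computation.

```python
import random

def in_sixsoid(p, r=1.0):
    """Point p is inside the Sixsoid iff it is within distance r of all six
    sphere centers, which sit on the face centers of a cube of side r."""
    centers = [( r/2, 0, 0), (-r/2, 0, 0),
               (0,  r/2, 0), (0, -r/2, 0),
               (0, 0,  r/2), (0, 0, -r/2)]
    return all(sum((a - b) ** 2 for a, b in zip(p, c)) <= r * r for c in centers)

def sixsoid_volume(r=1.0, samples=200_000, seed=0):
    """Monte Carlo volume estimate over the bounding box [-r, r]^3."""
    rng = random.Random(seed)
    hits = sum(in_sixsoid((rng.uniform(-r, r),
                           rng.uniform(-r, r),
                           rng.uniform(-r, r)), r)
               for _ in range(samples))
    return hits / samples * (2 * r) ** 3

vol = sixsoid_volume()
# For r = 1 the estimate falls between the ball inscribed in the Sixsoid
# (radius 1/2, volume roughly 0.52) and a loose circumscribing ball
# (volume roughly 1.02).
```

The same `in_sixsoid` predicate doubles as a k-coverage check for this arrangement: any point satisfying it is covered by all six sensors.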
REFERENCES
[1] Ammari, H. M.: On the problem of k-coverage in 3D wireless sensor networks: A Reuleaux Tetrahedron-based approach. In: 7th International Conference on Intelligent Sensors, Sensor Networks and Information Processing (2011)
[2] Ammari, H. M., Das, S. K.: A Study of k-coverage and Measures of Connectivity in 3D Wireless Sensor Networks. In: IEEE Transactions on Computers, vol. 59, no. 2, February (2010)
[3] Alam, S., Haas, Z.: Coverage and connectivity in three dimensional networks. In: Proc. ACM MobiCom, pp. 346-357 (2006)
[4] Akyildiz, I. F., Pompili, D., Melodia, T.: Underwater Acoustic Sensor Networks: Research Challenges. In: Ad Hoc Networks, vol. 3, pp. 257-279, Mar (2005)
[5] Bai, X., Kumar, S., Xuan, D., Yun, Z., Lai, T. H.: Deploying …
Identifying Ideal Values of Parameters for Software
Performance Testing
Charmy Patel
Shree Ramkrishna Institute of Computer Education & Applied Sciences,
Surat, Gujarat
charmyspatel@gmail.com

Ravi Gulati
Department of Computer Science, Veer Narmad South Gujarat University,
Surat, Gujarat
rmgulati@gmail.com
Abstract—Performance is the most imperative feature concerned with the quality of software. Performance testing provides proof of the performance of the product and sets a baseline for further enhancement of the application. Software performance depends on various factors like response, speed and underlying resources. In this paper we identify critical factors affecting software performance and derive the measures and metrics related to performance. We perform statistical mining to find the confidence intervals of critical parameters which play an important role in software performance. Our research on software performance analysis is a systematic, quantitative and qualitative step towards supporting the development of performance-sensitive software systems that meet performance objectives and prevent performance problems.

Keywords—Performance testing; performance measurements; linear regression; correlation; confidence interval

I. INTRODUCTION

Performance is a marker of how well a software system fulfils its requirements; responsiveness, in terms of response time and throughput, is a software system's main goal [1]. Software performance testing validates the stability, scalability and speed of the application/system under test. Performance is concerned with achieving the resource utilization, response times and throughput that meet the performance objectives of the product/project.

In most software projects, testing is done in the last phase. To lead the industry towards a performance-driven approach and to diminish stress, critical problems relevant to performance and quality should be solved at an early phase of implementation. So, to achieve software quality at the time of application development, the developer requires an indication of the performance of a page/module created by him/her. This indication can be given in terms of the critical factors which affect software performance. Rigorous analysis and statistics are required for the identification of such parameters.

This paper is structured as follows: Section 2 covers the concept of performance measurements. Section 3 describes the methodology to find performance measures by applying statistical mining, in which the relationships between various parameters are established using the correlation coefficient and the simple linear regression technique. Section 4 describes our experimental setup, process and results. We finally conclude by presenting our derived results in the form of critical performance-affecting factors and a performance measurements matrix.

II. PERFORMANCE MEASUREMENTS

In our previous work we analyzed that software quality and performance problems commence at an early stage of implementation but are not fixed until they become very expensive and difficult. Different performance testing tools on the market come with their own static boundaries, like platform and browser compatibility. No consolidated framework is available which analyzes any web application to find the critical factors affecting its performance. To make an application efficient, performance management and improvements are done manually, based on suggestions of the team leader and/or the project manager. The above discussion shows that maintaining software quality and performance at the time of software development and deployment is a research challenge, which leads us to the development of an integrated performance analysis framework in support of software quality assurance. So, to address this challenge we had proposed a framework which:

- Extracts the values of performance parameters.
- Compares these values with ideal performance measurements.
- Provides feedback for performance improvement.

By a pilot survey and analysis among industry people we have identified key factors responsible for software performance. Performance testing tools help to test system performance in a practically virtual environment. However, the performance of software and web applications depends on various critical performance factors [3]. Website performance testing is dependent on the speed, scalability and stability of the page. So, the number of HTTP requests, the sizes of various objects and their loading times are very important parameters which affect webpage performance.
A. Data Collection and Big Query Processing

To find the ideal values of all performance-sensitive parameters we have taken the HttpArchive data set. Steve Souders created the HttpArchive dataset, and Pat Meenan's WebPagetest system provides the platform on which it is built [4]. HttpArchive is a treasure trove of web performance data: it crawls millions of popular sites twice a month and records the performance-relevant data of each webpage [5]. Approximately 4 GB of data is available in the dataset. All the data are retrieved using the IE9 browser with a default internet connection speed of 5.0 Mbps.

The HTTPArchive data set is made available on Google BigQuery by Ilya Grigorik, a web performance developer advocate and engineer at Google, where he works on making the web fast and on performance best practices.

We have explored the HTTPArchive dataset for our research using GitHub as a service. First we developed a new project into which we imported HTTPArchive, a web performance …

t = 536.7898, df = 290044, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: 0.7041132 0.7077645
sample estimates: cor = 0.7059435

As explained, with the help of the above function we obtain the correlation between different parameters. We define these performance parameters for the analysis of critical factors: Total_http_request – the total number of HTTP requests generated by the website; Page_size – the size of the underlying webpage, including images, scripts, CSS and multimedia; and Page_loadtime – the total time of page rendering in the browser. These are highly dependent on their relevant independent parameters. So, we find the correlation of these dependent parameters with their relevant independent parameters, which is shown in Tables I, II and III.

plot(mydata$img_size_KB, mydata$Page_size_KB)        (2)
res = lm(mydata$Page_size_KB ~ mydata$img_size_KB)   (3)
abline(res, col='red')                               (4)

TABLE I. CORRELATION BETWEEN REQUEST PARAMETERS

Parameters              Total_http_req
Total_HTML_req          0.71
Total_Javascripts_req   0.71
Total_css_req           0.36
Total_images_req        0.91
Total_Multimedia_req    0.33

[Table II header: Parameters / Page_size (KB)]
[Table III header: Parameters / Page_loadtime (seconds)]

qnorm(0.975)*s/sqrt(n)                               (5)

Then we find the upper bound and lower bound of each parameter:

SR. No   Parameter Name         Confidence Interval
                                Lower Bound   Upper Bound
1        Page_load_time (sec)   5.79          6.87
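The interval in equation (5), mean ± qnorm(0.975)·s/√n, is the usual large-sample 95% confidence interval for a mean. The sketch below reproduces it in Python with made-up page-load samples; the paper's bounds of 5.79–6.87 s come from its own HttpArchive data, not from these numbers.

```python
import math
import statistics

Z_975 = 1.959964  # qnorm(0.975): the 97.5th percentile of the standard normal

def confidence_interval_95(samples):
    """95% CI for the mean: mean +/- z * s / sqrt(n)."""
    n = len(samples)
    mean = statistics.fmean(samples)
    s = statistics.stdev(samples)  # sample standard deviation
    half_width = Z_975 * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical page-load times in seconds.
loads = [5.9, 6.4, 7.1, 5.5, 6.8, 6.2, 7.3, 5.8, 6.6, 6.1]
lo, hi = confidence_interval_95(loads)  # interval around the sample mean of 6.37 s
```

For small n, the normal quantile would normally be replaced by a Student-t quantile, but equation (5) uses the normal form, so the sketch does too.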
Abstract—The paradigm of big data demands either the extension of existing benchmarks or the building of new benchmarks to capture the diversity of data and the impact of changes in data size and/or system size. This has led to an increase in the cycle time of benchmarking an application, which includes multiple workload executions on different data sizes. This paper addresses the problem of reducing the benchmark cycle time for structured application evaluation on different data sizes. The paper proposes an approach to reducing big data benchmark cycle time using prediction models for estimating SQL query execution time with data growth. The paper also proposes a model which could be used for efficient tuning of benchmark queries before their execution, to speed up the application evaluation process on different data sizes. The proposed model estimates structured query execution time for large data sizes by exploiting the data value distribution without actually generating high-volume data. The model is validated against three lab implementations of real-life applications and the TPC-H benchmarks.

Keywords—Data Volume; Model; Structured Big data; Query tuning; Evaluation time

I. INTRODUCTION

The proliferation of big data applications, which are dynamic in the velocity, volume and variety of data, leads to the development of new benchmarks for evaluating their performance. The complexity of defining a benchmark for big data has been discussed in [7]. Recently, BigDataBench [1] has defined 19 workloads, based on the diversity and type of data. This kind of extensive benchmark suite increases the evaluation time due to the time spent in generating and loading data of large sizes and executing multiple workloads of different types and sizes. Xion [5] has discussed a PCA approach to reducing the evaluation time by finding a valid minimal subset of representative workloads which includes the same characteristics as the benchmark suite without missing out any diversity. However, these benchmarks still need to be executed multiple times on different data sizes to evaluate performance sensitivity to the data volume.

Big data application benchmarks can be classified into four quadrants based on the variety of data and the mode of accessing it [4]: structured and SQL, structured and NOSQL, unstructured and SQL, and unstructured and NOSQL. The conventional approach of evaluating benchmarks on different data sizes spends huge time generating and loading data before the actual execution, which may contribute a large percentage of the total evaluation cycle. In order to save time, most of the benchmark queries may be tuned to execute fast on large data sizes based on their execution on small data sizes. However, the efficiency and accuracy of such tuning may not be guaranteed. PIQL [9] is a programming framework for writing an application's structured queries over a key/value store which ensures that the query execution time is at most an SLO (service level objective) with increase in data size, i.e. the query is scale independent. That paper classifies queries into constant, bounded, linear and non-linear depending on the relation between the increase in query execution time and the increase in data size. We shall focus, in this paper, on the constant, bounded and linear structured queries for relational databases, which fall in the first quadrant [4] of the benchmarks, for the efficient tuning of their queries.

We propose an approach to reducing benchmark cycle time which could be applied to any big data system architecture (centralized, distributed or parallel) provided appropriate query performance prediction models as functions of data size are available. In this paper, however, we focus on reducing the benchmark cycle time only for structured, centralized RDBMS-based big data systems, using SQL query prediction models such as linear regression for efficient tuning of benchmark queries on large data sizes without actually generating the data. An intuitive approach is to estimate a query's execution time as linearly proportional to data size; however, in structured big databases, a query's execution time depends on the query execution plan, including data access paths, as decided by the optimizer, and this execution plan may be different for different data sizes for the same query. Here, the challenge is first to identify which types of queries of an application will follow the constant or linear model for different data sizes, and for what data size performance should be measured so that it can be linearly extrapolated for correct prediction. The main contribution of the paper is an approach to reduce benchmarking cycle time by using query performance prediction models. The paper also presents an approach for identifying queries whose execution time may be predicted using a simple performance prediction model such as linear regression. We show the applicability of the approach on a single RDBMS for large data sizes.

The paper is organized as follows. Section II talks about the benchmarking process. Section III discusses efficient tuning of benchmark queries using a model to predict a query's execution time on large data sizes. Section IV presents …
Abstract—In this paper, a robust blind color image watermarking technique using the Online Sequential Extreme Learning Machine (OS-ELM) is proposed. The blue channel is utilized and transformed using DWT. The low frequency LL4 sub-band is used for watermark embedding. A variant of mini-batch machine learning algorithm, i.e. OS-ELM, is initially tuned with a fixed number of training data used in its initial phase and the size of the block of data learned by it in each step. The training data to OS-ELM is constructed by combining the quantized and desired LL4 sub-band coefficients of the DWT domain. A random key decides the starting watermark embedding position of the coefficients. Two binary images are used as watermarks. The robustness towards common image processing attacks is enhanced using this process. Experimental results show that the extracted watermarks from watermarked and attacked images are similar to the original watermarks. Computed time spans for embedding and extraction are of the order of milliseconds, which is suitable for developing real time watermarking applications.

Keywords—Blind Watermarking; Extreme Learning Machine; Online Sequential; Normalized Correlation (NC); PSNR; MSSIM; BER

I. INTRODUCTION

The explosive usage and growth of the internet over advanced communication networks have led to a very alarming scenario wherein illegal copying, reproduction and distribution of original multimedia content have become easier and faster. There are now significant opportunities to pirate copyrighted digital multimedia products. Therefore, the development of robust digital watermarking techniques has acquired a very special status in the research domain of multimedia security [1]. Cox et al. [2] argued that a watermark must be placed in perceptually significant components of the signal for it to be robust to common signal processing distortions and malicious attacks. Dey et al. [3] proposed a robust biomedical content authentication system by embedding different logos of the hospital or multiple electronic patient records (EPR) within the retinal image using a DWT-DCT-SVD based approach of watermarking combined with the firefly algorithm. Mishra et al. [4] have proposed an informed/non-blind watermarking scheme based on ELM. The fact that the watermark is distributed irregularly over the transformed image makes it difficult for the attacker to remove the watermark or modify the watermarked content. DWT based watermarking algorithms have gained popularity as they give better results in terms of visual imperceptibility and robustness against common image processing attacks [5-6]. Presently, the problem of watermarking of images has converged to an optimization problem wherein the twin requirements, namely visual quality of the watermarked image and robustness of the embedding algorithm, must be balanced. Many soft computing techniques have been employed to this end. The visual quality of the watermarked and attacked images is assessed by computing PSNR and MSSIM parameters, while multiple image processing attacks are implemented over watermarked images to examine the issue of robustness. For this purpose, usually two parameters - Normalized Correlation (NC) and Bit Error Rate (BER) - are used on a case to case basis to quantify and assess the degree of similarity between embedded and recovered watermarks. Many research groups have proposed different soft computing techniques, especially meta-heuristic algorithms, to develop robust watermarking algorithms. Among these, adaptive meta-heuristic techniques (AMHTs) are used to optimize the numerical values of scaling factors or embedding strength while integrating the watermarks with image coefficients in the transform domain [7]. In addition to this, Artificial Neural Networks (ANNs) have been used to embed and extract watermarks. Other techniques, especially hybrid networks such as Adaptive Neuro-Fuzzy Inference Systems (ANFIS) [8] and machine learning techniques such as Support Vector Regression (SVR) and its different variants, are also used for image watermarking [9-10]. Pal et al. [11] proposed a reversible watermarking method (Odd-Even Method) used for watermark insertion and extraction in a biomedical image with large data hiding capacity, security as well as high watermarked quality. Cheng-Ri Piao et al. [12] proposed a blind watermark embedding and extraction algorithm using an RBF Neural Network. Interpolation methods and a few trigonometric functions can also be used to embed secret bits into the gray planes of a color image [13]. Similarly, Der-Chyuan Lou et al. proposed a new
978-1-4673-9354-6/15/$31.00 ©2015 IEEE
healthcare image watermarking scheme based on the HVS model and back-propagation network (BPN) technique [14]. Their experimental results indicate that this technique could survive various image processing attacks including JPEG lossy compression. Qun-Ting Yang et al. [15] have proposed a color image oblivious watermarking scheme based on BPN and DWT. We observe that primitive neural network training algorithms, such as the gradient descent optimization based Back Propagation Network (BPN), often suffer from drawbacks such as long training time, multiple local minima etc. [16]. Thus, they tend to waste time during training, but they are found to be adaptive in nature and operation. Fuzzy inference based schemes do not suffer from these problems, but they are not adaptive in nature. Therefore, it has been suggested for this and other similar engineering applications to make use of hybrid variants such as neuro-fuzzy systems to achieve better results. These alternatives have been tried and tested in a major way, but they have proved to be costly in terms of embedding and extraction time [17]. Traditional neural networks such as BPN or RBF are slow in learning. Usually, it takes several minutes or several hours to train neural networks for different applications by choosing control parameters (i.e. learning rate, learning epochs, stopping criteria and other pre-defined parameters) which must be appropriately selected in advance. Inappropriate values for these control parameters may lead either to unsuccessful training of a neural network or to overfitting of the trained neural network. G. B. Huang et al. [18] have implemented a batch learning algorithm called the extreme learning machine (ELM) for SLFNs, which randomly chooses the input weights and the hidden neurons' biases for an SLFN with additive neurons, and then analytically computes the output weights. This makes its training extremely fast and especially suitable for developing real time applications.

In some online applications, mini-batch sequential machine learning algorithms may be preferred over batch learning algorithms as they do not need to be re-trained for upcoming data [19, 20]. In this work, ELM is modified based on the recursive least-squares (RLS) algorithm, which is referred to as the Online Sequential Extreme Learning Machine (OS-ELM), and is used for watermark embedding and extraction. OS-ELM can learn the training data one-by-one or chunk-by-chunk (with fixed or varying size) and discard the training data as soon as the training procedure for those data is completed.

II. RESEARCH FOCUS & CONTRIBUTION

In this paper, the Online Sequential Extreme Learning Machine (OS-ELM) is used to develop novel watermark embedding and extraction in three different color images. The embedding is done in low frequency coefficients to ensure robustness of binding the watermark with the coefficients of the host image in the transform domain. The visual quality of the watermarked images is also found to be good after embedding. The OS-ELM is used in the DWT domain of the host images, which ensures a better embedding process. More specifically, two different binary watermarks have been embedded into and extracted from three host images of size 512 x 512 x 3. The OS-ELM algorithm is used to minimize the time lapses needed to train the neural network. This is contrary to the performance of other gradient descent based neural architectures such as BPN. This will definitely be helpful in extending the proposed scheme to real time moving multimedia data such as video, both in compressed and uncompressed form. The processing time spans in milliseconds make our proposed algorithm fit for developing real time image processing applications such as color video watermarking. Four different image processing operations over the watermarked images are carried out as attacks to examine the robustness of the embedding scheme. These attacks are described in detail in Section III. Visual quality of the watermarked and attacked images is quantified by using PSNR and MSSIM metrics. Note that both these metrics are full reference metrics which require both the watermarked/attacked image and the original image to compute the visual quality. The watermarks have been extracted in a blind manner by using values predicted by the OS-ELM algorithm. The extracted watermarks are matched with the embedded ones and the degree of matching is evaluated by computing two different parameters - Normalized Correlation, NC(X, X*), and Bit Error Rate, BER(X, X*), X and X* being the original and extracted watermarks. These correlation metrics have been chosen in the present work as they are widely reported in the literature [23]. It is found that the embedding and extraction processes are well optimized and the proposed DWT-OSELM based watermarking scheme is robust enough against the selected attacks.

III. OS-ELM FORMULATION AND EXPERIMENTAL DETAILS

A novel color image watermarking scheme using a newly developed SLFN known as the Online Sequential Extreme Learning Machine (OS-ELM) is implemented. This is done with a view to achieving fast computation, good generalization capability and accuracy, and because it offers a solution in the form of a system of linear equations H β = Y. The SVD method is used to calculate the Moore-Penrose generalized inverse [21] of H according to Eqn. (1):

    β̂ = (H^T H)^(-1) H^T Y    (1)

where H is the hidden layer output matrix of the neural network, β̂ is the estimate of the output weights and Y is the expected output.

The activation function used in OS-ELM computation is the sigmoid activation function defined in Eqn. (2):

    g(x) = 1 / (1 + e^(-λx))    (2)

where λ is the gain parameter of the sigmoid function.

The OS-ELM consists of two main phases. The first phase - the Boosting Phase - trains the SLFN using the primitive ELM method with some batch of training data in the initialization stage; these boosting training data are discarded as soon as the boosting phase is completed. The required batch of training data is very small and can be equal to the number of hidden neurons. In this work, the number of hidden neurons has been set to 30, the number of initial training data to 75 and the size of the block of data learned by OS-ELM in each step to 5. In the second phase - the Sequential Learning Phase - the OS-ELM learns the training data one-by-one or chunk-by-chunk, and all the training data are discarded once the learning procedure on these data is completed.

As stated above, a blind color image watermarking scheme using the OS-ELM in the DWT domain is implemented in this work. For this purpose, the LL4 sub-band coefficients are used to carry out the embedding and extraction processes. For the embedding process, the blue channel is extracted from the host image. Note that the host image is of size 512 x 512 x 3; therefore, the extracted blue channel of the RGB image is of size 512 x 512 x 1. It has been observed that the best results are obtained when the blue channel is used for embedding and extraction. The OS-ELM is trained with the LL4 sub-band coefficients. A random key is used to decide the initial location of watermark embedding. Three standard color images - Airplane, Baboon, and Lena - are used to embed binary watermarks after training the OS-ELM using the quantized values of the LL4 sub-band coefficients. The size of the input dataset is 1024 x 2, while it produces an output sequence of size 1024 x 1 whose coefficients are close to the desired LL4 sub-band coefficients. Two different 32 x 32 sized binary watermarks are tested in this experimental work. The watermarked images are tested for visual quality by computing two full reference metrics: PSNR and MSSIM [13]. The watermarked images are also subjected to selected image processing attacks to verify the issue of robustness. These attacks are: (a) JPEG (QF=50, QF=75 and QF=90), (b) Scaling (resized to half and then restored to original size), (c) Gaussian Noise (5% and 10%), and (d) Salt and Pepper (0.1% and 0.5%). Blind extraction of the watermarks from the watermarked images is done before and after executing the image processing attacks. Both embedding and extraction are done using the same key and the same OS-ELM model. In this scenario, only the watermarked or the attacked image is required to recover the watermark (blind extraction) by predicting the output of the OS-ELM. A comprehensive analysis of the results obtained in this simulation is given in Section IV.

A. Watermark Embedding Algorithm

Listing 1 gives the sequence of steps used to carry out the embedding process.

Listing 1: Embedding Procedure
1. Transform the blue channel of the host image using the 4-level DWT and select the LL4 (32 x 32 size) sub-band coefficients C_i.
2. Quantize C_i by Q and use the trained OS-ELM model to predict the output:

    P'_i = OSELM(Round(C_i / Q))    (3)

This output is close to the desired output included in the dataset used to train the OS-ELM. The optimized numerical value Q = 32 is used for all practical computations.
3. Select the starting location of the watermark embedding coefficient C_i using the random secret key.
4. Embed the watermark according to Eqn. (4), which uses the predicted output of the OS-ELM (P'_i):

    C'_(i,key) = P'_(i,key) + α w_i    (4)

where w_i is the watermark, which is either a binary or a random sequence, α is the embedding strength, optimized to be equal to 0.17 for both watermarks, and C'_i are the modified LL4 sub-band coefficients obtained after watermark embedding.
5. Perform the inverse DWT to generate the watermarked image.

B. Watermark Extraction Algorithm

Listing 2 gives the sequence of steps used to extract the watermark from watermarked and attacked images in a completely blind manner.

Listing 2: Extraction Procedure
1. Transform the watermarked blue channel using the 4-level DWT. Select the LL4 (32 x 32 size) sub-band coefficients and set C''_i to them.
2. Quantize C''_i by Q, and use the already trained OS-ELM model to predict the output:

    P''_i = OSELM(Round(C''_i / Q))    (5)

3. Extract the watermark w'_i using Eqn. (6) below, from the output of the OS-ELM in Eqn. (5), C''_i and the secret key:

    w'_i = (C''_i - P''_i) (1/α)    (6)

where α is the embedding strength.

The watermark extraction is carried out using the algorithm given in Listing 2, and the normalized correlation NC(W, W') is computed between the embedded and the extracted watermarks. This formulation is given in Eqn. (7):

    NC(W, W') = [Σ_{i=1..n} W(i) W'(i)] / [sqrt(Σ_{i=1..n} W(i)^2) sqrt(Σ_{i=1..n} W'(i)^2)]    (7)
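The batch ELM solution of Eqns. (1)-(2) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the 30 hidden neurons match the setting quoted in the text, but the dataset, gain, and seeds are arbitrary choices for the example.

```python
import numpy as np

def sigmoid(x, gain=1.0):
    # Eqn. (2): g(x) = 1 / (1 + e^(-lambda*x)); gain plays the role of lambda
    return 1.0 / (1.0 + np.exp(-gain * x))

def elm_fit(X, Y, n_hidden=30, seed=0):
    """Batch ELM: random input weights and biases, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = sigmoid(X @ W + b)                           # hidden layer output matrix H
    beta = np.linalg.pinv(H) @ Y                     # Eqn. (1) via SVD-based pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta

# Toy usage: fit a simple mapping with 30 hidden neurons
X = np.random.default_rng(1).random((200, 2))
Y = (X[:, :1] + X[:, 1:]) / 2.0
W, b, beta = elm_fit(X, Y)
pred = elm_predict(X, W, b, beta)
```

Because the output weights are obtained in one linear-algebra step rather than by iterative gradient descent, training time is dominated by a single pseudoinverse, which is what makes millisecond-scale operation plausible.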
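Once the OS-ELM prediction P_i is available, Listings 1 and 2 (Eqns. (3)-(6)) reduce to a simple additive rule. The sketch below is a simplification under stated assumptions: the trained OS-ELM is stood in for by a caller-supplied `predict` function (here a hypothetical identity-style stand-in), while Q = 32 and α = 0.17 are the values quoted in the text.

```python
import numpy as np

Q = 32        # quantization step used to form OS-ELM inputs (from the paper)
ALPHA = 0.17  # embedding strength reported in the paper

def embed(coeffs, watermark, start, predict):
    """Eqns (3)-(4): C'_i = P'_i + alpha * w_i over the keyed coefficient range."""
    c = coeffs.astype(float).copy()
    p = predict(np.round(c / Q))          # Eqn (3): P'_i = OSELM(Round(C_i / Q))
    n = watermark.size
    c[start:start + n] = p[start:start + n] + ALPHA * watermark
    return c

def extract(coeffs_w, n, start, predict):
    """Eqns (5)-(6): w'_i = (C''_i - P''_i) / alpha over the keyed range."""
    p = predict(np.round(coeffs_w / Q))   # Eqn (5)
    w = (coeffs_w[start:start + n] - p[start:start + n]) / ALPHA
    return (w > 0.5).astype(int)          # binarize the recovered watermark bits

# Toy usage: a fake predictor that just de-quantizes (stand-in for the OS-ELM)
rng = np.random.default_rng(0)
coeffs = rng.uniform(0, 255, 1024)        # pretend LL4 coefficients
wm = rng.integers(0, 2, 64)               # small binary watermark
fake_predict = lambda q: q * Q
cw = embed(coeffs, wm, start=100, predict=fake_predict)
recovered = extract(cw, 64, 100, fake_predict)
```

With a well-trained predictor, P''_i computed from the watermarked coefficients stays close to P'_i, so the difference in Eqn. (6) isolates α·w_i; the thresholding step absorbs small prediction and attack errors.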
A. JPEG Compression

JPEG compression with quality factor (Q.F.) = 50, 75 and 90 is used. In the case of binary watermarks #1 and #2, the recovered watermarks are depicted in Table I, and the NC and BER values indicate that the extracted watermarks are still recognizable.

[Figure: watermarked image, PSNR = 43.998204 dB, MSSIM = 0.9989]

B. Image Scaling

The watermarked image is resized to 256 x 256 and then restored to its original size, i.e. 512 x 512. In the case of binary watermarks #1 and #2 of size 32 x 32, high NC(W, W') values and low BER(W, W') values show that the watermark is successfully extracted, as compiled in Table II.

TABLE II. THE EXPERIMENT RESULT AFTER SCALING AND RESIZING

Attack                 | Image    | PSNR (dB)/MSSIM | NC(W,W')/BER(W,W') | Extracted Watermark
Scaling (watermark #1) | Airplane | 30.75/0.9458    | 0.9806/0.1006      | [image]
                       | Baboon   | 23.06/0.8081    | 0.9699/0.1045      | [image]
                       | Lena     | 32.26/0.9924    | 1/0.0596           | [image]
Scaling (watermark #2) | Airplane | 30.65/0.9421    | 0.8974/0.0635      | [image]
                       | Baboon   | 23.04/0.8066    | 0.8753/0.0732      | [image]
                       | Lena     | 32.11/0.9835    | 0.9228/0.0527      | [image]

C. Image Noising

Gaussian noise with noise amounts of 5% and 10% respectively is added to the watermarked images. In the case of binary watermarks #1 and #2 of size 32 x 32, the recovered watermarks are still recognizable, as shown in Table III.

TABLE III. THE EXPERIMENT RESULTS AFTER ADDING NOISE

Noise Type                       | Image    | PSNR (dB)/MSSIM | NC(W,W')/BER(W,W') | Extracted Watermark
Gaussian Noise 5% (watermark #1) | Airplane | 36.87/0.9213    | 1/0.0938           | [image]
                                 | Baboon   | 36.90/0.9918    | 1/0.0938           | [image]
                                 | Lena     | 36.86/0.9946    | 1/0.0938           | [image]
Gaussian Noise 5% (watermark #2) | Airplane | 36.47/0.9188    | 0.9206/0.0547      | [image]
                                 | Baboon   | 36.46/0.9913    | 0.9206/0.0537      | [image]
                                 | Lena     | 36.46/0.9942    | 0.9206/0.0537      | [image]
Gaussian Noise 10%               | Airplane | 30.39/0.7547    | 0.8356/0.0850      | [image]
                                 | Baboon   | 30.38/0.9651    | 0.8574/0.0762      | [image]
                                 | Lena     | 30.41/0.9766    | 0.8734/0.0742      | [image]
Salt & Pepper 0.1% (watermark #1)| Airplane | 34.69/0.9568    | 0.9806/0.0986      | [image]
                                 | Baboon   | 35.12/0.9909    | 0.9889/0.0977      | [image]
                                 | Lena     | 35.12/0.9932    | 0.9861/0.0986      | [image]
Salt & Pepper 0.1% (watermark #2)| Airplane | 34.37/0.9533    | 0.9117/0.0557      | [image]
                                 | Baboon   | 34.74/0.9905    | 0.8881/0.0674      | [image]
                                 | Lena     | 34.68/0.9839    | 0.8938/0.0938      | [image]
Salt & Pepper 0.5% (watermark #1)| Airplane | 27.73/0.8185    | 0.7865/0.1582      | [image]
                                 | Baboon   | 28.27/0.9579    | 0.8348/0.1523      | [image]
                                 | Lena     | 28.08/0.9664    | 0.8070/0.1465      | [image]
Salt & Pepper 0.5% (watermark #2)| Airplane | 27.78/0.825     | 0.7519/0.1289      | [image]
                                 | Baboon   | 28.12/0.9569    | 0.7138/0.1514      | [image]
                                 | Lena     | 28.13/0.9670    | 0.7450/0.1318      | [image]

The results compiled in this section indicate that the OS-ELM algorithm has been quite successful in developing a color image watermarking scheme which can produce results fulfilling real time constraints. The algorithm successfully handles the selected image processing attacks. We observe that watermark recovery is good, as indicated by the high computed values of NC. The corresponding BER values are the lowest, as per our expectation. The visual quality of the signed and attacked images is also very good, as indicated by the high computed PSNR and MSSIM values. Thus, the OS-ELM algorithm is capable of minimizing the tradeoff between robustness and imperceptibility at a fast processing speed. This makes it particularly suitable for extending this work to video watermarking applications.

V. CONCLUSIONS

Copyright protection and image authentication are crucial domains of contemporary multimedia research. To this end,
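The NC and BER figures reported in the tables above can be reproduced from a pair of binary watermarks with a few lines of NumPy. This is a sketch assuming {0, 1}-valued watermarks, with NC as in Eqn. (7) and BER taken as the fraction of mismatched bits:

```python
import numpy as np

def nc(w, w2):
    # Eqn. (7): normalized correlation between embedded and extracted watermarks
    w, w2 = w.ravel().astype(float), w2.ravel().astype(float)
    return float(np.sum(w * w2) /
                 (np.sqrt(np.sum(w ** 2)) * np.sqrt(np.sum(w2 ** 2))))

def ber(w, w2):
    # Bit Error Rate: fraction of watermark bits that differ
    return float(np.mean(w.ravel() != w2.ravel()))

# Toy usage on a 32x32 binary watermark compared with itself
w = np.random.default_rng(0).integers(0, 2, (32, 32))
score_nc = nc(w, w)    # identical watermarks give NC close to 1
score_ber = ber(w, w)  # and BER of 0
```

High NC together with low BER, as in Tables II and III, is what signals a recognizable extracted watermark.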
numerous soft computing techniques based watermarking schemes have been proposed worldwide. However, they do not touch upon the crucial requirement of minimizing the processing time spans. This requirement is found to be an important one for developing secure multimedia applications such as image or video watermarking. In this paper, a novel blind watermarking scheme using a single layer feed-forward neural network (SLFN), commonly known as the Online Sequential Extreme Learning Machine (OS-ELM), is proposed for color images. We train the OS-ELM by using quantized LL4 sub-band coefficients of the blue channel of the host image, obtained by taking its 4-level DWT transform. A dataset of size 1024 x 2 is prepared by using the LL4 sub-band coefficients. This dataset is used to train the OS-ELM network, which produces an output sequence of size 1024 x 1 used to carry out embedding. A random key decides the beginning position of the coefficients where the watermark is embedded. Two different binary watermarks of size 32 x 32 are used in this work. Blind extraction of the watermarks is carried out. We conclude that the proposed DWT-OS-ELM based watermarking scheme is efficient in minimizing the tradeoff between imperceptibility and robustness. Experimental results show that the extracted watermarks from watermarked and attacked images are similar to the original watermarks. The computed time spans for embedding and extraction are in milliseconds, which makes the scheme suitable for developing real time watermarking applications. Overall, the watermarking scheme is well optimized for visual quality of images and robustness, and may be successfully used to carry out watermarking of uncompressed video sequences.

REFERENCES

[1] Hartung, F., Kutter, M., "Multimedia Watermarking Techniques," (1999), Proceedings of the IEEE, pp. 1079-1094
[2] I. J. Cox, Joe Kilian, F. Thomson Leighton and Talal Shamoon, "Secure Spread Spectrum Watermarking for Multimedia," (1997), IEEE Transactions on Image Processing, vol. 6(12), pp. 1673-1687
[3] Dey, N., Samanta, S., Chakraborty, S., Das, A., Chaudhuri, S. and Suri, J., "Firefly Algorithm for Optimization of Scaling Factors during Embedding of Manifold Medical Information: An Application in Ophthalmology Imaging," (2014), Journal of Medical Imaging and Health Informatics, 4, pp. 384-394
[4] Anurag Mishra, Amita Goel, Rampal Singh, Girija Chetty, Lavneet Singh, "A novel image watermarking scheme using Extreme Learning Machine," (2012), International Joint Conference on Neural Networks (IJCNN), pp. 1-6
[5] Dey, N.; Das, P.; Roy, A.B.; Das, A.; Chaudhuri, S.S., "DWT-DCT-SVD based intravascular ultrasound video watermarking," (2012), World Congress in Information and Communication Technologies (WICT), pp. 224-229
[6] Wang Zhenfei, Zhai Guangqun, Wang Nengchao, "Digital watermarking algorithm based on wavelet transform and neural network," (2006), Wuhan University Journal of Natural Sciences, 11(6), pp. 1667-1670
[7] N. Dey, S. Samanta, X.-S. Yang, A. Das, and S. S. Chaudhuri, "Optimisation of Scaling Factors in Electrocardiogram Signal Watermarking using Cuckoo Search," (2013), International Journal of Bio-Inspired Computation, 4(5), pp. 315-326
[8] Charu Agarwal, Anurag Mishra, Arpita Sharma, "A novel gray-scale image watermarking using hybrid Fuzzy-BPN architecture," (2015), Egyptian Informatics Journal, 16(1), pp. 83-102
[9] Pan-Pan Zheng, Jun Feng, Zhan Li, Ming-quan Zhou, "A novel SVD and LS-SVM combination algorithm for blind watermarking," (2014), Neurocomputing, 142(22), pp. 520-528
[10] Mehta, R.; Mishra, A.; Singh, R.; Rajpal, N., "Digital Image Watermarking in DCT Domain Using Finite Newton Support Vector Regression," (2010), Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), pp. 123-126
[11] Pal, A.K.; Dey, N.; Samanta, S.; Das, A.; Chaudhuri, S.S., "A hybrid reversible watermarking technique for color biomedical images," (2013), IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), pp. 1-6
[12] Cheng-Ri Piao, Seunghwa Beack, Dong-Min Woo, and Seung-Soo Han, "A Blind Watermarking algorithm Based on HVS and RBF Neural Network for Digital Image," (2006), Springer Verlag Berlin Heidelberg, pp. 493-496
[13] Chakraborty, S.; Maji, P.; Pal, A.K.; Biswas, D.; Dey, N., "Reversible Color Image Watermarking Using Trigonometric Functions," (2014), International Conference on Electronic Systems, Signal Processing and Computing Technologies (ICESC), pp. 105-110
[14] Der-Chyuan Lou, Ming-Chiang Hu, and Jiang-Lung Liu, "Healthcare Image Watermarking Scheme Based on Human Visual Model and Back-Propagation Network," (2008), Journal of C.C.I.T, vol. 37(1), pp. 151-162
[15] Qun-Ting Yang, Tie-Gang Gao, Li Fan, "A novel robust watermarking scheme based on neural network," (2010), International Conference on Intelligent Computing and Integrated Systems (ICISS), pp. 71-75
[16] Charu Agarwal, Anurag Mishra, Arpita Sharma, "Digital image watermarking in DCT domain using Fuzzy Inference System," (2011), 24th Canadian Conference on Electrical and Computer Engineering (CCECE), pp. 822-825
[17] Charu Agarwal and Anurag Mishra, "A Novel Image Watermarking Technique using Fuzzy-BP Network," (2010), Proceedings of the 6th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 102-105
[18] G.-B. Huang (2004), The Matlab code for ELM is available at: http://www.ntu.edu.sg/home/egbhuang
[19] G.-B. Huang, N.-Y. Liang, H.-J. Rong, P. Saratchandran, and N. Sundararajan, "On-line sequential extreme learning machine," (2005), IASTED International Conference on Computational Intelligence (CI 2005), Calgary, AB, Canada, Jul. 4-6, 2005
[20] N.-Y. Liang, G.-B. Huang, P. Saratchandran, and N. Sundararajan, "A Fast and Accurate Online Sequential Learning Algorithm for Feedforward Networks," (2006), IEEE Transactions on Neural Networks, vol. 17(6), pp. 1411-1423
[21] D. Serre (2002), "Matrices: Theory and Applications," Springer Verlag, New York Inc.
[22] Zhou Wang; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P., "Image quality assessment: from error visibility to structural similarity," (2004), IEEE Transactions on Image Processing, vol. 13(4), pp. 600-612
[23] Fung, V.; Rappaport, T.S.; Thoma, B., "Bit error simulation for π/4 DQPSK mobile radio communications using two-ray and measurement-based impulse response models," (1993), IEEE Journal on Selected Areas in Communications, vol. 11(3), pp. 393-405
A Detection Mechanism of DoS Attack using
Adaptive NSA Algorithm in Cloud Environment

Abstract—Security of any distributed system is not only complex in nature; it also needs much more attention, as most of the applications being used and developed in the recent past are on distributed platforms. A Denial of Service (DoS) attack causes a drop in quality of service and may also lead to a complete absence of service for some 'real' users. Identifying some users as attackers also needs an appropriate algorithm. The negative selection algorithm (NSA) is a very effective approach for identifying some user as an attacker. However, declaring some 'real' user as an attacker is a very common limitation of these types of algorithms, unless the mechanism of detection is updated at regular intervals. In this research work we have modified the NSA algorithm to take into account the necessity of updating the detector set from time to time. We have introduced a second detection module to accommodate the update. Both algorithms are implemented on a common data set and a comparative study is presented. Our proposed algorithm comes out with much improved results and significantly reduces false positive (false alarm) cases.

Keywords—NSA; DDoS; Feature Vector; IP Spoofing

I. INTRODUCTION

The Negative Selection Algorithm (NSA) is one of the very effective approaches [1-3] to building an Artificial Immune System (AIS). The T-cells of the thymus gland of the human body detect foreign molecules by eliminating the body's own cells. This is a well-accepted approach to distinguishing malicious traffic from normal traffic. In the case of Distributed Denial of Service (DDoS), both the server (host) and the bandwidth of the network are consumed by the attacker(s) and, as a result, the regular and genuine users do not get the services to the desired and committed level. As such, it is very hard to discriminate between these two types of traffic (genuine and malicious) correctly. The conventional firewall mechanism of dropping all UDP packets during a UDP flood attack causes genuine (non-attack) UDP packets to be dropped as well, and is obviously not a desirable solution as it cannot guarantee the interest of the genuine users. More accurate mechanisms for anomaly detection, fraud detection etc. are thus required to prevent any loss of interest of genuine users.

The Novel Neighborhood Negative Selection Algorithm (NNNSA) [1-6] takes care of some of the limitations of NSA by introducing a better mechanism for neighborhood representation, a novel matching technique etc. However, this approach lacks adaptability of the self-learning mechanism and needs periodical updates of the detector set to eliminate the unnecessary presence of the old and irrelevant dataset. This augmentation will improve the performance of the overall system by removing false positive cases.

II. REVIEW WORK

A. Denial of Service (DoS) Attack

In a Denial of Service attack [20-25], the attacker sends such huge numbers of requests to a server that the server fails to respond to an authorized user in time. In the case of flooding attacks like TCP SYN flood, UDP flood and ICMP flood, the attacker sends an excessive amount of packets to waste different resources, whereas for logic attacks (such as Ping of Death, Land, Teardrop), the attacker triggers some error in the software system.

In TCP SYN flood [15-18], UDP flood [18] and ICMP flood [18], the attacker sends a large number of SYN requests to a server, UDP packets to random ports of a remote host, and ICMP echo requests, respectively. In most cases the attacker uses spoofed IP addresses. As a result, a large number of half-open connections flood the port's capacity.

In Ping of Death, the attacker sends a malformed ping message to a computer [18]. This oversized and malformed message can cause a buffer overflow which results in a system crash. In a Land attack [18], the attacker sends TCP SYN packets to a target system by using the spoofed IP of the target system itself, so the system replies to itself continuously.

A Teardrop attack [18] uses fragmented IP packets, sent by the attacker with errors in the offset field. The offset field indicates the starting position of the fragments. At the time of reassembling the packets, due to the error in the offset, reassembly cannot be done correctly and the fragments overlap.

B. Distributed Denial of Service (DDoS) Attack

A Distributed Denial of Service (DDoS) attack [19][25] is a type of Denial of Service attack where multiple sources are used to attack one single system. The targeted systems and all the other systems which are maliciously used by the attacker
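The core of negative selection described above can be sketched very compactly: random detectors are generated, any detector that matches a "self" (normal traffic) sample is discarded, and traffic that later matches a surviving detector is flagged. The sketch below is a minimal real-valued NSA under assumed parameters (feature vectors in [0, 1]^2, Euclidean matching, radii chosen for the toy data); it is not the paper's modified algorithm, which additionally updates the detector set over time.

```python
import numpy as np

def train_detectors(self_set, n_detectors=50, self_radius=0.1, seed=0):
    """Negative selection: keep random detectors that do NOT match any self sample."""
    rng = np.random.default_rng(seed)
    detectors = []
    while len(detectors) < n_detectors:
        d = rng.random(self_set.shape[1])             # random candidate detector
        dists = np.linalg.norm(self_set - d, axis=1)
        if dists.min() > self_radius:                 # no match with normal traffic
            detectors.append(d)
    return np.array(detectors)

def is_attack(x, detectors, detect_radius=0.1):
    """Flag a traffic feature vector that falls inside any detector's radius."""
    return bool((np.linalg.norm(detectors - x, axis=1) < detect_radius).any())

# Toy usage: 'self' = normal traffic clustered in one corner of feature space
normal = np.random.default_rng(1).random((200, 2)) * 0.3
det = train_detectors(normal)
```

Periodically re-running `train_detectors` on fresh normal traffic is the kind of detector-set update the abstract argues is needed to keep false positives down.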
Abstract—Virtual Private Networks (VPNs) provide secure encrypted communication between remote networks worldwide by using Internet Protocol (IP) tunnels and a shared medium like the Internet. End-to-end connectivity is established by tunneling. OpenVPN and OpenSSH are cross-platform, secure, highly configurable VPN solutions. A performance comparison between OpenVPN and OpenSSH VPN, however, has not yet been undertaken. This paper focuses on such a comparison and evaluates the efficiency of these VPNs over Wide Area Network (WAN) connections. The same conditions are maintained for a fair comparison. To the best knowledge of the authors, these are the first reported test results of these two commonly used VPN technologies. Three parameters, namely speed, latency and jitter, are evaluated. Using a real life scenario with deployment over the Linux operating system, a comprehensive in-depth comparative analysis of the VPN mechanisms is provided. Results of the analysis between OpenSSH and OpenVPN show that OpenSSH utilizes the link better and significantly improves transfer times.

Keywords—Tunneling, OpenSSH, OpenVPN, VPN.

I. INTRODUCTION

The Secure Shell (SSH) protocol was designed by Tatu Ylonen in 1995 to secure data sent over unprotected networks like the Internet. It does so by encrypting the traffic [1]. The SSH software was open source and free, and it was therefore very popular on the market. Unfortunately, in December 1995 Tatu made the software proprietary and it was no longer freely available to users [2]. In 1999, open source developers set up a research community to develop a free version, the Open Secure Shell (OpenSSH), deriving from an earlier version of SSH [3]. In 2000, more than 2 million people were using OpenSSH for free. The OpenSSH developers maintain that the application is more secure than the original SSH protocol due to their policy of producing clean and audited code under the BSD license, one of a family of permissive free software licenses [4]. As such, OpenSSH became one of the most popular security implementations, shipping by default in a large number of operating systems [5]. OpenVPN is another open source VPN, developed by a different research group. The advantage of using OpenVPN over OpenSSH is that OpenVPN is deployable on practically all platforms, whereas OpenSSH can only be installed on operating systems similar to UNIX [6]. However, OpenSSH forms part of a big protocol family, unlike OpenVPN, which can only be used for VPN tunneling. Contribution to research in SSH VPN has been very slow, and new versions have mostly contained patches for security flaws as well as cryptography fixes. Since OpenVPN releases have more innovations, it is currently the choice of companies for their VPN needs [7]. The performance comparison between OpenVPN and OpenSSH VPN, however, has not yet been undertaken.

This document is organized into the following sections:

• Section 2 covers the basics of VPNs and gives some insight into the concepts associated with OpenVPN and OpenSSH.
• Section 3 describes the physical framework.
• Section 4 details the experimental testing.
• The last section, entitled 'Performance measures', performs some exhaustive tests between OpenVPN and OpenSSH VPN.
• We then conclude by giving the results and observations.

II. VIRTUAL PRIVATE NETWORK (VPN)

Virtual Private Networks (VPNs) [8] are nowadays becoming the most widely used method for remote access. Companies tend to expand to multiple different locations and have office branches in many countries. VPNs securely convey information (file sharing, video conferencing, etc.) across the Internet connection to remote users, office branches and business partners, forming an extended corporate network. A VPN is created using dedicated connections, virtual tunneling protocols or traffic encryption to establish a virtual point-to-point connection. OpenSSH [9] (OpenBSD Secure Shell), based on the SSH protocol, is a collection of secure network-level services which protect network communications by encrypting network traffic and providing secure tunneling capabilities. OpenVPN [6] is open-source and implements virtual private network (VPN) techniques for creating secure point-to-point
V. PERFORMANCE MEASURES

The performance of a network can be measured using a set of metrics. For OpenVPN and OpenSSH, transfer speed, jitter and latency are used to represent the performance of the VPN. Latency [15] is the amount of time a packet takes to move from one designated point to another. The machine hardware, the link speed and the encapsulation time affect the latency through a VPN tunnel. Jitter [16] is the variation in the time between arriving packets, caused by network route changes or congestion.
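The two metrics above can be made concrete with a small numeric sketch. The following is illustrative only (not taken from the paper): it computes mean one-way latency and jitter from per-packet timestamps, under the assumption that sender and receiver clocks are synchronized.

```python
# Illustrative sketch (not from the paper): latency and jitter from
# per-packet send/receive timestamps, in seconds, assuming
# synchronized clocks on both ends of the tunnel.

def mean_latency(send_times, recv_times):
    """Average one-way delay across all packets."""
    delays = [r - s for s, r in zip(send_times, recv_times)]
    return sum(delays) / len(delays)

def jitter(recv_times):
    """Mean variation of inter-arrival gaps between consecutive packets."""
    gaps = [b - a for a, b in zip(recv_times, recv_times[1:])]
    diffs = [abs(b - a) for a, b in zip(gaps, gaps[1:])]
    return sum(diffs) / len(diffs)
```

In a real measurement tool the receive timestamps would come from a packet capture on each side of the VPN tunnel; tools such as iperf report jitter using a smoothed variant of the same idea.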
Abstract—Data quality management systems are thoroughly researched topics and have resulted in many tools and techniques developed by both academia and industry. However, the advent of Big Data might pose some serious questions pertaining to the applicability of existing data quality concepts. There is a debate concerning the importance of data quality for Big Data: one school of thought argues that high data quality methods are essential for deriving higher-level analytics, while another school of thought argues that the data quality level will not be so important, as the volume of Big Data would be used to produce patterns, and some amount of dirty data will not mask the analytic results which might be derived. This paper aims to investigate various components and activities forming part of data quality management, such as dimensions, metrics, data quality rules, data profiling and data cleansing. The result lists existing challenges and future research areas associated with Big Data for data quality management.

Keywords—Big Data, data quality, data profiling, data cleansing, data quality rules, dimensions, metrics.

I. INTRODUCTION

We are currently living at the beginning of a Big Data era. This Information Technology based concept might prove to radically change the way organizations operate, and at a larger scale how human society operates. Big Data is interesting not only to IT people, but also to physicists, mathematicians, politicians, law/security officials and people involved in the tourism and hospitality sectors, amongst many others [6]. The term 'Big Data' in itself is a poor definition of its representation; it often only conveys the idea of a volume of data too large to be handled by the current processing power of computers [12][13]. However, Big Data does concern a large volume of data, but it also includes the capacity to search, process, analyze and present valuable information coming from large, diverse and rapidly changing datasets. This leads to Big Data being defined by volume, variety and velocity [13]. Veracity is another characteristic of Big Data which is growing in popularity and concerns the rising issue of certainty involved with using data.

The principal goal of the paper is to analyze and present the data quality techniques which would be most appropriate for Big Data in a general context; issues such as Big Data quality dimensions, methodologies and activities are discussed in order to arrive at the research questions most relevant for data quality in a Big Data context.

Figure 1: Data quality components

Data quality dimensions are the ways to express the notion of data quality. The major investigation concerning dimensions for this paper is whether the same dimensions which have been applied in traditional data quality strategies, such as completeness, relevancy and consistency amongst others, could also be relevant for Big Data quality strategies, or whether there could be new dimensions applicable only in the context of Big Data.

In order to quantify dimensions, so as to bring some notion of measurability and comparison to them, metrics need to be applied. Thus, there is the need to investigate whether the same metrics which have traditionally been applied to measure data quality dimensions could still be applicable for the dimensions determined applicable to Big Data, and whether existing metrics and their formulas would need to be upgraded.

The execution part of a data quality strategy would consist of several activities whose common goal is to improve the quality of data based upon the dimensions identified in a particular context. Some of the most cited data quality activities are:

Data profiling: examination of data sources to generate information about the data and the datasets.
A mere comparison of the dimensions mentioned by the two sets of authors cited above clearly indicates a high level of correlation and similarity between the principal ways in which data should be assessed for quality purposes. However, we raise the question of how far those dimensions would still apply in a Big Data context. Which dimensions are judged to be more important for data quality in a Big Data use context?

B. Dimensions of data quality for Big Data

The high volume and velocity properties of Big Data entail that segregating correct and erroneous data for further data analysis becomes more important. Also, with data coming from multiple sources, there is a need for better methods of data integration to harmonize the semantics of the data being used [17]. However, according to [18], improving data quality for Big Data might not be so important, as the amount of incorrect data is deemed too negligible to affect the final outcome after the data have been analysed. Thus, which of those two completely contrasting schools of thought is relevant seems to depend upon the amount and impact of the erroneous or 'dirty' data within a big dataset. This increases the importance of understanding which dimensions are more relevant for Big Data.

Caballero et al have mapped how the 3Vs of Big Data affect the 3Cs of data quality as follows:

Figure 2: Matrix of 3Cs relative to the 3Vs

However, the way the dimensions have been mapped to the 3Vs was based solely upon hypothesis, and no actual research method has been applied to generate this mapping. Thus, there is a need for research in the area of Big Data quality to investigate more deeply which dimensions are more important in the context of Big Data.
C. Measurements and metrics

As briefly outlined in the previous section, the measurement of data quality is closely linked with measuring the dimensions of data quality. Most metrics used for the measurement of data quality normally fall within a range from 0 to 1, with 0 representing an incorrect value and 1 representing a correct value [3]. Many dimensions, such as accuracy, completeness and consistency amongst others, are calculated by the following function:

D = 1 - (Ni/Nt)    (1)

where D is the metric for a given dimension, Ni is the number of incorrect values and Nt is the total number of values for the dimension concerned. This measurement and its associated metric would definitely still hold even for Big Data, but it could be quite difficult to derive both Ni and Nt in situations where there is a constant input of data. Thus, the velocity aspect of Big Data could be the most problematic property in terms of data quality measurement; but if this velocity aspect has been mastered, there is no reason why the same metrics won't be applicable for Big Data.

D. Data cleansing

Data cleansing is a well-cited process and potentially involves the most data transformations with respect to data quality activities. The need to transform or edit some data source to meet a certain data quality standard poses an important dilemma when it comes to Big Data quality: as the same data could be used or analyzed towards different use cases with Big Data, transforming the original dataset according to the business rules of one use case might negatively impact the Big Data activities of another use case with the same dataset [12].

Some researchers from IBM Research India have identified four stages as part of the data cleansing process for large enterprise datasets, which are summarized in the diagram below [9].

Figure 3: Main stages of data cleansing process

To accomplish those four main stages, the above named authors cite different tools and methods which should be applied as part of, or in support of, the different activities. The following gives a brief summary of some of them:

Classification: building classes from the data in a dataset is explained to be the first step in building rule sets. An example of classification for address standardization is that the 'street' value could be linked with corresponding values such as 'Road', 'avenue' or 'lane'; thus all those values could be assigned the same classification label.

Patterns: these give a generalized view of how the data is formatted; this involves parsing the data, classifying all tokens into appropriate classes and replacing those classes with pre-defined labels. An example of a resulting pattern could be N/N+B+II, where N represents numbers, B is a label for city names and I represents a single letter. Rules are then written to process the patterns.

Dictionaries: standardized data could be validated against some domain to ensure proper identification. For example, a dictionary of available cities in a country could be used to validate city names stored as part of addresses in a dataset.

Discovering variants of a term: this is a very elaborate sub-stage which would involve the use of reference sets for each token or value of data; the use of syntactic clustering for data which possess the same set of terms in a sequence except for some minor differences; the use of a resemblance measure, based on a formula, for detecting groups of similar data; and the use of a diff-utility to find the difference between groups of similar records.

Thus, [9] have proposed a data-driven tool relying upon detecting the characteristics of 'dirty' data in a given dataset. The dependence upon domain experts has been minimized. However, the question as to whether this same proposed data cleansing methodology could be applied to Big Data datasets is more than ever relevant. How feasible will it be to devise reference sets and create rule sets given the 3Vs characteristics of Big Data? What would be the performance of this data cleansing method in generating dictionaries and applying resemblance measures and the diff-utility? Some serious scientific research is warranted to uncover those aspects.

Dealing with user-defined functions (UDFs) has been reported as one of the challenges in implementing data cleansing when scaling to Big Data. Thus, Khayyat et al have proposed a new architecture, named 'BigDansing', to incorporate the application of UDFs more efficiently [10]. The typical data cleansing steps as described by the above named authors are (1) specifying quality rules, (2) detecting errors with respect to the data quality rules and (3) repairing detected errors. However, detecting and repairing data quality issues are reported to face some issues, namely:

a) The high complexity of rules leads to intractable computations over large datasets, thus limiting the applicability of data cleansing systems.
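Equation (1) above can be sketched in a few lines of code. The `is_valid` predicate below is a hypothetical stand-in for whatever rule defines a "correct" value for the dimension being measured; it is not taken from [3].

```python
# Minimal sketch of equation (1): D = 1 - Ni/Nt, where Ni is the
# number of incorrect values and Nt the total number of values for
# the dimension concerned. `is_valid` stands in for a real rule.

def dimension_score(values, is_valid):
    nt = len(values)
    ni = sum(1 for v in values if not is_valid(v))
    return 1 - ni / nt

# Hypothetical accuracy check: ages outside [0, 120] count as incorrect.
ages = [34, 28, -5, 51, 240]
score = dimension_score(ages, lambda a: 0 <= a <= 120)
```

The difficulty the text raises for Big Data is visible even in this sketch: with a constantly growing stream, neither `len(values)` nor the error count is ever final, so the score would have to be computed over windows or snapshots.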
b) Effective parallelization is hard to achieve with UDFs when the latter are specified through the use of procedural languages.

'BigDansing' is reported to deal with those issues by (1) abstracting and simplifying the process of rule specification for UDFs and (2) enabling the application of distributed repair algorithms. 'BigDansing' has been benchmarked against other systems which can support some level of data cleansing routines, such as Spark SQL, Shark, NADEEF and PostgreSQL. The results show that 'BigDansing' outperforms the other systems on measures such as time to scale data quality activities upon large datasets, higher efficiency in deduplication of large datasets and improvements in repair efficiency.

However, it could be argued that placing the process of rule specification on the shoulders of users could be one of the limiting factors of a system such as 'BigDansing'. Thus, questions arise such as whether statistical processes, in terms of supervised or unsupervised learning models, could automate the process of rule specification and further enhance the efficiency of such systems.

Another recent method to help improve data cleansing for Big Data has been the application of Bayesian networks, termed BayesWipe [20]. The authors emphasize that traditional data cleansing techniques such as outlier detection, noise removal, entity resolution and imputation cannot provide effective solutions in the context of Big Data. The fact that techniques such as CFDs depend upon clean external reference sets to learn data quality rules is one of the major drawbacks in devising an effective data cleansing solution for Big Data. Even devising rules from the 'dirty' data is not judged to be a satisfactory enough solution [20]. Thus, the authors posit that a statistical process underlies the generation of both clean and dirty data, with the data source and error models used to undertake the detection and repairing stages of data cleansing. Algorithms generated from the statistical process are coupled with updated query rewriting techniques; the fact that BayesWipe can also be applied in an online scenario, where only the top-k portion of the data is considered and the cleansing process is performed while the data is being retrieved, adds to its applicability in the context of Big Data. Empirical evaluations performed over both synthetic and real datasets tend to show improvements in terms of data cleansing ratios when BayesWipe is compared with CFDs and Amazon Mechanical Turk, but there is still a very large portion of dirty data not cleansed. For example, the offline version cleans only 40% of the data in a synthetic car database. Another question about BayesWipe concerns the efficiency of the data source and error models which form the foundation of this method. The evaluation results denote that those models could be improved to lead to higher data cleansing ratios.

All the techniques discussed above apply data cleansing upon the data itself, but do not attempt to correct the cause which creates the errors and dirtiness in the data; as most of those data quality issues are reported to be systematic, and thus inherent to the process of data creation, it is quite reasonable to find meaningful ways to cure the causes of data quality errors as an efficient data cleaning process [22]. Xiaolan et al explain that diagnosing data quality errors in Big Data environments raises some issues for traditional methods such as provenance analysis, feature selection and causal analysis, in terms of massive scale, system complexity and high error rates. Data X-Ray proposes to overcome the above issues by (1) finding a hierarchical structure of features which best represent erroneous elements and (2) using Bayesian analysis to estimate the causal likelihood of features being associated with potential causes of errors, diagnosing those causes using the conciseness, specificity and consistency dimensions.

E. Data profiling

Profiling of data is basically about examining the data available in a given data source and collecting/producing statistics and information such as metadata, relationships, dependencies, patterns and cardinalities [14]. The results of a data profiling job usually prove to be very useful, as they contribute towards the creation of constraints and rules which can be applied during data cleansing. The traditional use cases of profiling include query optimization, data cleansing, data integration, scientific data management and data analytics; of those, data cleansing and data analytics seem to be the most relevant when we consider data quality for Big Data.

Big Data is reported to raise three main impacts/challenges when it comes to data profiling:

1. Profiling could be very useful in assessing the usefulness of known and unknown data; this could help the deployment of future Big Data use cases.

2. The high variety and volume of data challenge existing data profiling tools and techniques in terms of computational complexity and memory requirements, amongst others.

3. New data management models and architectures are involved with Big Data, such as key-value and document-based stores, high usage of parallel and distributed systems and so on. Thus, there is a need to re-think how data profiling should be carried out in the context of Big Data [14].

The following chart maps the upcoming research challenges associable between Big Data and data profiling.
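The kinds of column statistics the profiling literature lists (null counts, cardinalities, value patterns) can be illustrated with a minimal sketch. The pattern encoding used here (digits mapped to 9, letters to A) is a common profiling convention and an assumption of this sketch, not a detail taken from [14].

```python
# Hypothetical single-column profiler: null count, cardinality and
# the set of value patterns, with digits generalized to '9' and
# letters to 'A' (an assumed, common encoding convention).
import re

def profile_column(values):
    non_null = [v for v in values if v is not None]
    patterns = {re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", str(v)))
                for v in non_null}
    return {
        "nulls": len(values) - len(non_null),
        "cardinality": len(set(non_null)),
        "patterns": patterns,
    }
```

A single dominant pattern suggests consistent formatting; several competing patterns for one column are exactly the syntactic heterogeneity that later sections flag as a Big Data profiling challenge.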
Figure 4: Research challenges for Big Data profiling

Online profiling is motivated by the fact that users might need to wait a substantial amount of time before viewing results if profiling is handled by traditional data profiling methods; hence, displaying intermediate results with a proven level of confidence might help users decide whether they want to continue working on a specific use case. Given the huge volumes involved with Big Data, this would certainly be an appreciable feature for potential Big Data users such as data analysts and data scientists. Furthermore, it is imperative to understand that profiling would be an activity performed both before and after data cleansing, as after the cleansing process the information about the data and the data source would be updated.

Incremental profiling is linked with the velocity characteristic of Big Data. As data is expected to be updated quite often, there should be means to perform profiling upon changing data at periodic timespans. Re-using past profiling results to improve the computational response times of profiling might be an avenue to explore.

Continuous profiling is almost the same idea as incremental profiling, with the exception that profiling is now expected to be performed upon data while it is being created or updated.

One of the aims of profiling Big Data would be to determine the common properties or level of heterogeneity of different datasets; thus, two types of heterogeneity are of particular importance when profiling for Big Data, namely syntactic heterogeneity, which is mainly about finding inconsistent formatting between data, and structural heterogeneity, which is about unmatched schemata. Consequently, the information generated by profiling would be very useful in the integration aspect of the different sources of a Big Data use case. This is an open area of research, as existing solutions simply appear to be ineffective for Big Data.

Recognizing the domain of yet unknown data is the aim of topical profiling. With Big Data use cases, there could be the use of unknown datasets such as social media data. There should be thorough research upon how topical profiling could be implemented efficiently for Big Data.

F. Data quality rules

Enforcing data quality rules (DQRs) is an integral activity in improving the cleansing part of data quality. Prior research in the field shows that DQRs are enforced via various methods such as functional dependencies (FDs), conditional functional dependencies (CFDs), Dedupalog, integrity constraints (ICs) and Bayesian networks, amongst others [5][23][24]. All of those techniques have been evaluated through several studies with the common purpose of improving the efficiency of the cleaning or repairing activities. Another common theme through all those studies is the fact that rules are discovered from the characteristics present in the data.

Yeh and Puri aimed at increasing consistency in datasets by discovering rules for more efficient CFDs [24]. Current challenges with increasing consistency are that (1) it is a labor-intensive process and (2) rule discovery is largely a manual process which relies heavily upon subject matter experts. Furthermore, current methods for discovering CFDs have been reported to have difficulties scaling to relations with a large number of attributes, and they are not robust with datasets having a high level of dirty data. Thus, Yeh and Puri have developed an approach called CFinder which follows these main steps to automatically generate better CFDs:

Figure 5: Main steps of CFinder

Even if CFinder outperforms CFD-TANE in terms of the recall and precision metrics, Yeh and Puri are investigating ways to improve CFinder with (1) the use of heuristics to improve scalability, (2) the application of industry ontologies to determine which attributes are related, for the pruning process, and (3) the exploration of other metrics to eliminate weak CFDs.

There could be pitfalls in relying upon automated data repair solutions based on DQRs, especially in use cases dealing with critical data. However, the involvement of human users to validate data repairs means the response time of data quality tools degrades considerably. Thus, an interactive approach which performs some proportion of automatic repairs while allowing users to validate repairs has been proposed [23]. It involves generating repairs for only the top-k most important violated rules and ranking the most beneficial repairs from the users' perspectives for their validation.

Experiments comparing the rule ranking method with other techniques such as the Greedy and Random algorithms tend to demonstrate that the rule ranking method is more efficient. However, in the context of Big Data, there are several questions which would be raised by the current research:

1. How practical is it to have the user validate repairs, even for the top-k rules, when there could be a huge number of top-k repairs to be undertaken?

2. There should be some measure of the computational complexity of the rule ranking algorithm, as it involves nested loops and user interactions. Furthermore, the stopping condition of one of the loops is that there are no more dirty tuples in a given dataset. In a Big Data scenario, with the high velocity of data production, this could well result in the algorithm generating infinite loops!
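As a rough illustration of what enforcing a CFD involves, the sketch below flags pairs of tuples that satisfy a rule's condition but map the same left-hand-side value to different right-hand-side values. The rule and the field names (`country`, `zip`, `city`) are hypothetical and not taken from [24].

```python
# Hedged sketch of a conditional functional dependency (CFD) check:
# within tuples matching `condition`, the value of `lhs` must
# functionally determine the value of `rhs`.

def cfd_violations(rows, condition, lhs, rhs):
    """Return (first_seen, conflicting) row pairs that violate the CFD."""
    seen = {}
    violations = []
    for row in rows:
        if not condition(row):
            continue  # the dependency only applies under the condition
        key = row[lhs]
        if key in seen and seen[key][rhs] != row[rhs]:
            violations.append((seen[key], row))
        else:
            seen.setdefault(key, row)
    return violations

# Hypothetical rule: for country == "MU", zip determines city.
rows = [
    {"country": "MU", "zip": "742CU", "city": "Curepipe"},
    {"country": "MU", "zip": "742CU", "city": "Quatre Bornes"},
    {"country": "FR", "zip": "742CU", "city": "Paris"},
]
bad = cfd_violations(rows, lambda r: r["country"] == "MU", "zip", "city")
```

Detection is the cheap half; deciding which of the conflicting values to keep is the repair problem that the interactive, top-k ranked approaches discussed above try to make tractable.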
3. The top-k repairs would invariably be linked with the use case for which analytics are being applied to a Big Data dataset. As already questioned in this research, there is a legitimate question of whether to transform the data repairs according to one particular use case and thus update the original dataset, or to create a copy of the corrected data while keeping the original dataset for other use cases.

RULEMINER is a system to discover DQRs which addresses the main limitations of existing rule discovery methods [3]:

- Existing rule discovery algorithms are usually designed for a single rule language, and are thus unable to discover many useful rules for a dataset.

- Most existing algorithms generate a large number of rules, many of which are not adequate.

- Manual evaluation of the output of rules is a time-consuming process.

RULEMINER discovers rules expressed as denial constraints (DCs), which are supposed to subsume FDs and CFDs. However, it is quite unclear whether DCs would subsume more elaborate rules, such as those for the semantic interpretation of data discussed in the data cleansing section [9] of this current research; thus it is uncertain whether the first limitation listed above is addressed by RULEMINER. Another issue with RULEMINER remains its dependence upon users to validate repairs in terms of its Negative Example-Positive Example pairing; with Big Data, this could result in computationally too costly repairs. However, this method seems very user friendly, with a front-end interface which easily allows the user to specify the maximum number of errors to display for a given discovered rule (similar to the top-k notion), and a filtering option which allows the user to focus upon certain rules depending upon a given use case.

III. FINDINGS

A. Dimensions

Concerning data dimensions, which provide the basis for measuring data quality, the above discussions sourcing from existing literature point to two important conclusions:

Applicability of traditional dimensions: the dimensions which are normally cited for measuring data quality for normal datasets are on the whole still relevant in the context of Big Data. Thus, consistency, completeness, accuracy and credibility/believability have been ranked among the most useful dimensions for big datasets. Further empirical research is recommended to validate the assumptions of Caballero et al. linking the traditional dimensions with the "V's" of Big Data.

Dimensions tightly coupled with use case: the area of application or use case of big datasets highly influences which dimensions would be more relevant to ensure data quality. For example, the use of big datasets coming from social media to generate sentiment analysis systems could require a higher degree of consistency and credibility compared to completeness. On the other hand, big datasets used in the context of power consumption and sensor/RFID based data might demand a more important component of accuracy and/or completeness to ensure better data quality. However, whatever the use case, it has been discussed that data quality activities are slowly growing in importance to ensure that the results of analytics can be trusted.

B. Measurement and metrics

Most measurements and metrics applicable in normal data quality practices would still hold for Big Data, even if the velocity inherent in Big Data might make metrics calculation more cumbersome. Getting accurate interim results from data profiling would be highly useful. However, the major area of research would most certainly relate to the meaning of the measurement; e.g., can it be argued that a 90% consistency measure of data in a data warehouse system is equivalent to a 90% consistency measure of data in a Big Data context? This seems highly unlikely, as due to the 3Vs of Big Data there is bound to be a higher tolerance of dirty data.

C. Data quality rules

Rule discovery normally involves several expressions such as CFDs and DCs. However, due to the heightened decision-making exigencies associated with Big Data analytics, there is a need to create new methods which would efficiently deal with multiple and complex rules.

Automatic or user-defined rules: the involvement of users during rule generation would be a limiting factor with Big Data, but the rules could potentially be so complex that there could be no alternative other than relying upon user validation for certain types of highly complex repairs. On the other hand, the automatic discovery and application of rules to the repair process might be quite dangerous, especially in use cases dealing with critical data. Thus, this research recommends discovering DQRs for Big Data with the use of algorithms which should be robust enough to automatically discover a certain level of repairs, potentially via heuristics or learning with dirty data in terms of Bayesian networks; human validation should be restricted to only those repairs which concern very ambiguous and complex rules.

Computational complexity: the most recent research points towards aiming at the most important or top-k rules which must be taken into account to generate enough confidence in the use of Big Data. This could largely be left to subject or domain specialists to identify and denote those top-k rules; ultimately, the weak spot of complex and efficient data quality management systems for Big Data could once again be the capability of the human specialists to correctly denote the top-k rules for particular use cases.

D. Data profiling

Profiling is one of the most important pieces of groundwork in enabling a proper data quality management system and is central to the proper generation of DQRs and the performance of data cleansing. However, the traditional outputs of profiling are seriously challenged by Big Data in the following terms:
New Big Data models: the fact that many Big Data models such as HBase do not use indexes implies that some of the outcomes of traditional profiling, such as foreign key dependencies, no longer hold. Thus, even if traditional profiling results such as the number of NULL values could still be relevant, there is a growing need for data profiling to provide other statistics in view of the new Big Data models. On the other hand, the fact that most Big Data architectures make use of in-memory processing could resolve some of the disk-related computational issues generally involved with data quality activities.

Online profiling: this needs to be investigated more deeply, as the intermediate results of profiling could allow both decision takers and data quality management developers to take the most suitable actions in a context of high data velocity. E.g., it could ease the formulation of the top-k rules much more rapidly, or identify inconsistency patterns for certain attributes which are more repetitive, and thereby support the data cleansing activity on the fly.

Variety: harnessing the power of yet unknown data is one of the benefits of implementing Big Data solutions. Those data are very often in improperly defined or totally unstructured formats, which is a major obstacle to properly carrying out profiling activities and could ultimately result in vastly incorrect profiling outputs. Thus, there is a high need to develop new profiling techniques which take into account the heterogeneity of data sources, to consolidate the integration level required for Big Data implementation.

E. Data cleansing

Cleansing or repairing activities are normally the culmination of any proper data quality management process. The following are the major challenges which Big Data seems to bring for data cleansing:

Repair upon original data: how some datasets are intended to be used is critical towards understanding how cleansing activities would need to be carried out. Thus, in situations where the same dataset could be subject to different types of analysis, as has been seen in many Big Data use cases, repairing or providing edits on the original dataset might prove very beneficial for one particular use case but could cause the dataset to be unsuitable for other use cases. Thus, there should be the development of mechanisms which distinguish original and edited data for Big Data cleansing activities.

Symptoms v/s causes: unfortunately, many data quality management systems correct the data which is judged to be dirty according to some rules, rather than the actual causes or triggers of those errors. Thus, with the high velocity of Big Data, we could face a situation where a previously cleansed dataset is filled with new errors very rapidly. E.g., machine-generated data from sensors which are improperly calibrated might be the cause of flooding datasets with continuously improper or dirty data. Instead of wasting resources on continuously detecting and correcting the data, it makes more sense to re-calibrate the sensors accordingly.

Volume issues: typical data cleansing methods rely upon reference sets or dictionaries of clean data in order to compare and clean a particular dataset. Such a clean reference set can prove impossible to produce in Big Data use cases; thus, existing research is moving away from clean reference sets towards the use of heuristics or statistical processes to detect dirty data, but those techniques still need to be improved in terms of the amount of dirty data corrected. E.g., BayesWipe is reported to clean only 40% of the dirty data in a synthetic and not so huge dataset. With real-life big datasets, this ratio should be worse. Hence, there is a need for more efficient data cleansing methods.

Computational complexity: some data cleansing activities might involve huge processing power. E.g., performing a resemblance measure might perfectly well be achievable on datasets of millions of records with tens of attributes, but when it is scaled to Big Data proportions of billions of records across thousands of attributes, it could result in disastrous response times if the data quality solution is not properly powered by due processing capabilities.

IV. CONCLUSION AND FUTURE WORK

Data quality has been of major importance since information systems grew in importance for decision making over the past decades. The fact that Big Data promises to unlock largely untapped data sets the scene for the importance of data quality once again; it would be foolish to be able to process and analyze a large quantity and variety of data if those data cannot be guaranteed to be 'fit for use', and thus the benefits which data analytics promises to bring might potentially be jeopardized by improper handling of the data quality process. The latter has largely been researched, especially in the context of data warehousing, and has produced very positive results in terms of methodologies, tools and techniques to improve the quality of data in known data environments. However, Big Data is up to now still a burgeoning and developing data environment, characterized mostly by the volume, velocity and variety aspects. Many researchers are starting to investigate the veracity or data quality aspect relative to Big Data. Hence, how relevant are the known and used methodologies, tools and techniques in the context of Big Data?

Throughout this paper, there has been an investigation of the main elements pertaining to data quality. The first general learning from, and consensus amongst, experts in the Big Data quality circle is that many traditional data quality processes are highly relevant or would need only minor updates to fit the Big Data environment. Still, there are certain areas which need to be largely updated for Big Data. Hence, the dimensions relevant for Big Data seem to be tied to particular use cases, and thus it would seem highly improbable to come up with general dimensions which could be argued to be always applicable. The measures and metrics developed so far tend to indicate an overall applicability to Big Data, with doubts
remaining only about the velocity or ever-changing statistics of the data in the Big Data context. Thus, we could argue that measurements and metrics for Big Data would have a highly important temporal aspect in order to be usable for comparisons and decision making. In terms of data quality rules, there is a big challenge associated with Big Data when it concerns user-defined functions and the human validation of discovered rules; the volume, velocity and variety of the data might make it impossible for human validation of rules to be established in time for them to be applied to data cleansing processes. Thus, the whole method of rule discovery needs to be investigated more deeply, as it could also prove to be a computationally very expensive process. Data profiling for Big Data might also require new methods; online profiling seems to be one demand of profiling experts which could be applied to Big Data, and there is a definite need for new methods to resolve data profiling issues related to heterogeneous datasets. Finally, there are many challenges set by Big Data for the data cleansing activity: the volume of data might make it computationally too expensive to repair all dirty data detected; whether to make corrections on the original dataset or to make a corrected copy depends upon the characteristics of the use case of the Big Data scenario; and determining the exact causes of dirty data and eliminating those causes are major research challenges.

Future work relative to this research would involve carrying out more empirical evaluations and testing of the main research questions set out in the previous paragraph; solving those different questions would gradually bring a new nomenclature towards the creation of data quality processes which could be more successful in the context of Big Data.

REFERENCES
[1] Anon., 2006. Data Centric Systems and Applications. In: Data Quality. s.l.: Springer.
[2] Anon., n.d. Data quality. [Online] Available at: http://searchdatamanagement.techtarget.com/definition/data-quality [Accessed 4 May 2015].
[3] Blake, R. & Mangiameli, P., 2011. The effects and interactions of Data Quality and Problem Complexity on Classification. ACM Journal of Data and Information Quality, 2(2).
[4] Caballero, I., Serrano, M. & Piattini, M., 2014. A Data Quality in Use model for Big Data. ER Workshops, pp. 65-74.
[5] Chu, X., Ilyas, I. F., Papotti, P. & Ye, Y., 2014. RULEMINER: Data Quality Rules Discovery. s.l., IEEE.
[6] boyd, d. & Crawford, K., 2012. Critical questions for Big Data. Information, Communication and Society, 10 May, pp. 663-679.
[7] Davenport, T. H., 2013. At the Big Data crossroads: turning towards a smarter travel experience, s.l.: Amadeus IT Group.
[8] Gretzel, U., 2013. Technology and tourism: building competitive digital capability. s.l.: s.n.
[9] Hima, P. K. et al., 2011. Data Cleansing techniques for Large Enterprise datasets. s.l., IEEE Computer Society.
[10] Khayyat, Z. et al., 2015. BigDansing: A system for Big Data Cleansing. Melbourne, ACM.
[11] Loshin, D., 2014. Understanding Big Data Quality for Maximum Information Usability, s.l.: s.n.
[12] Loshin, D., 2013. Big Data Analytics: from strategic planning to enterprise integration with tools, techniques, NoSQL and Graph. Elsevier.
[13] Malik, P., 2013. Governing Big Data: Principles and Practices. IBM Journal of Research and Development.
[14] Naumann, F., 2013. Data Profiling Revisited. s.l., s.n.
[15] Kwon, O. et al., 2014. Data quality management, data usage experience and acquisition intention of big data analytics. International Journal of Information Management, pp. 387-394.
[16] Pipino, L., Yang, L. & Wang, R., 2002. Data Quality Assessment. Communications of the ACM.
[17] Saha, B. & Srivastava, D., 2014. Data Quality: The other face of Big Data. s.l., s.n.
[18] Soares, S., 2012. Big Data quality. In: Big Data Governance: An Emerging Imperative. s.l.: MC Press, pp. 101-112.
[19] LaValle, S. et al., 2011. Big Data, Analytics and the path from insights to value. MIT Sloan Management Review.
[20] Sushovan, D., Yuheng, H., Yi, C. & Subbarao, K., 2014. BayesWipe: A Multimodal System for Data Cleaning and Consistent Query Answering on Structured Big Data. s.l., s.n.
[21] UNECE, 2014. A suggested framework for the quality of Big Data, s.l.: s.n.
[22] Xiaolan, W., Xin Luna, D. & Alexandra, M., 2015. Data X-Ray: A diagnostic tool for data errors. Melbourne, ACM.
[23] Yakout, M., Elmagarmid, A. K. & Neville, J., 2010. Ranking for data repairs. s.l., IEEE.
[24] Yeh, P. Z. & Puri, C. A., 2010. An efficient and robust approach for discovering Data Quality Rules. s.l., IEEE.
[25] Lee, Y. et al., 2003. The Technology Acceptance Model: Past, present and future. Communications of the Association for Information Systems.
A Classification Method to Classify
High Dimensional Data
Amit Gupta, Department of Computer Science and Engineering, Graphic Era Hill University, Dehradun, Uttarakhand, India (amitgupta7920@gmail.com)
Naganna Chetty, Department of Computer Science and Engineering, Mangalore Institute of Technology and Engineering, Karnataka, India (nsc.chetty@gmail.com)
Shraddha Shukla, School of Computing Science and Engineering, Galgotias University, Greater Noida, Uttar Pradesh, India (shraddha52d27@gmail.com)
Abstract—The rapid computerization and advancement in technology has led to a huge amount of data in databases. Research has shown that the amount of data in the world doubles every 20 months. However, this available data contains a large number of noise values and thus cannot be used directly. The extraction of information from this vast pool of data has emerged as a major challenge.

Machine learning techniques have emerged as an effective tool to overcome this challenge. Several machine learning algorithms (like SVM, K-means etc.) are effectively applied in data mining.

In this paper the authors have applied classification and clustering techniques on different datasets and have proposed a model for enhancing the performance of the K-means data clustering method and the Naïve Bayes data classification method. The efficiency of the proposed model is calculated based on general parameters like accuracy, precision, recall, F-measure and number of iterations.

Keywords—Data Mining; K-Means; Naïve Bayes Classification; Clustering; preprocessing; WEKA tool

provided by researchers in data mining to obtain patterns out of data. Different patterns can be mined by classification, clustering, association rules, regression, outlier analysis, etc. [2]. There are abundant tools available for data mining; some of them are RapidMiner, R, KNIME, own code, Weka or Pentaho, Statistica, SAS or SAS Enterprise Miner, Orange, Tanagra, and MATLAB.

Classification [9] is the process of finding a model (or function) that describes and distinguishes data classes or concepts, for the purpose of being able to use the model to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training data (i.e., data objects whose class label is known).

Different types of classification algorithms [7,8] are:

Decision tree induction
Naïve Bayesian Classification
Rule-Based Classification
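As an aside on the Naïve Bayesian classifier listed above, a Gaussian Naïve Bayes can be written in a few lines of pure Python. This is only an illustrative sketch; the toy two-feature dataset is invented and is unrelated to the paper's Weka experiments.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a log-prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for x, y in zip(rows, labels):
        by_class[y].append(x)
    model = {}
    for y, xs in by_class.items():
        n = len(xs)
        means = [sum(col) / n for col in zip(*xs)]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*xs), means)]
        model[y] = (math.log(n / len(rows)), means, variances)
    return model

def predict(model, x):
    """Pick the class with the highest log-posterior under the Gaussian model."""
    def log_post(prior, means, variances):
        return prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda y: log_post(*model[y]))

# Toy dataset: two well-separated classes in two dimensions.
rows = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(rows, labels)
print(predict(model, (1.1, 1.0)))  # a point near the first cluster
```

The class-conditional independence assumption is what makes the model "naïve": each feature contributes its own Gaussian likelihood term independently.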
The basic approach of object clustering is shown in Figure 1. The approach is applied on random data points to form three different groups of clusters. Each cluster consists of similar types of data.

Approaches for Attribute Selection

A framework for attribute selection is shown in Figure 2. For selecting features we used the WEKA [6] data mining tool. Initially we applied preprocessing approaches on the original data to remove noise. A feature selection approach, by selecting an appropriate feature evaluator and search technique, has been applied on the preprocessed data. This approach results in a reduced set of features.
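The evaluator-plus-search idea can be sketched outside Weka as a simple filter-style selector. The variance score below is only a stand-in for whichever attribute evaluator is actually chosen, and the small dataset is made up for illustration.

```python
def select_features(rows, k):
    """Rank features by sample variance and keep the indices of the top k.
    A stand-in for a Weka attribute evaluator combined with a ranker search."""
    cols = list(zip(*rows))
    def variance(col):
        m = sum(col) / len(col)
        return sum((v - m) ** 2 for v in col) / len(col)
    scores = [(variance(col), i) for i, col in enumerate(cols)]
    keep = sorted(i for _, i in sorted(scores, reverse=True)[:k])
    # Project every row onto the selected feature indices.
    return keep, [[row[i] for i in keep] for row in rows]

rows = [[0.1, 5.0, 3.3], [0.1, 9.0, 3.1], [0.1, 2.0, 3.2]]  # feature 0 is constant
keep, reduced = select_features(rows, 2)
print(keep)  # indices of the two highest-variance features
```

A constant feature carries no discriminating information, so a variance-based filter discards it first; real evaluators (information gain, correlation-based selection, etc.) follow the same rank-and-keep pattern.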
The results produced after attribute selection are given below. In Table 1, we have listed the total initial features and the selected features. After applying attribute selection we have reduced the number of features on the basis of their importance. In the abalone dataset, there were initially 9 features including the class; after applying feature selection this was reduced to 6. In the iris dataset, there were initially 5 features including the class attribute; after applying feature selection this was reduced to 4. Similarly, for the wine dataset, we have 15 attributes initially, reduced to 9 after selection.

Approaches for Data Preprocessing

The following two different approaches are used for data preprocessing.

Normalisation:

Normalisation is the process in which the data is reduced to a form ensuring data integrity and eliminating data redundancy.
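Assuming min-max scaling (the default behaviour of Weka's unsupervised Normalize filter), a minimal sketch of rescaling one numeric column to the [0, 1] range:

```python
def min_max_normalize(column):
    """Rescale a numeric column to [0, 1] using min-max scaling."""
    lo, hi = min(column), max(column)
    if hi == lo:                       # constant column: map everything to 0
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

Scaling every attribute to a common range prevents attributes with large numeric spans from dominating the Euclidean distances that K-means relies on.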
(Cluster scatter plots of the data: original, after randomization, and after attribute selection.)
In the above Table 5 we can observe the values noted after the clustering. The number of iterations was 57; it was reduced to 26 after randomization and to 17 after attribute selection. The SSE was reduced from 172.694 to 96.539. The time was initially 0.7 sec; it improved to 0.33 sec after randomization and 0.16 sec after attribute selection. The percentage of incorrectly clustered instances was 79.70%; it was reduced to 49.70% after randomization and 46.2054% after attribute selection. Hence the accuracy improved from 50.3% to 53.79%. Figure 8 shows a graphical representation of the results produced and clearly shows the improvements.

Figure 9: Iris Clustering Graph (original data, after normalization, after attribute selection)
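The SSE values quoted above are sums of squared Euclidean distances from each point to its assigned centroid; a minimal computation on invented points:

```python
def sse(points, centroids, assignment):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for p, c_idx in zip(points, assignment):
        c = centroids[c_idx]
        total += sum((a - b) ** 2 for a, b in zip(p, c))
    return total

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
centroids = [(0.5, 0.0), (10.0, 10.0)]
assignment = [0, 0, 1]
print(sse(points, centroids, assignment))  # 0.25 + 0.25 + 0.0 = 0.5
```

Each K-means iteration cannot increase this quantity, which is why a lower SSE after preprocessing indicates tighter clusters.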
Table VII: Wine Clustering

Wine Clustering            No. of Iterations   SSE      Time (sec)   Incorrectly Clustered (%)   Accuracy (%)
Original Data              8                   48.970   0            5.618                       94.382
After Attribute Selection  7                   27.972   0            8.988                       91.012

For the wine clustering, separate preprocessing has not been performed. In the above Table 7 we can observe the values noted after the clustering. The number of iterations is reduced from 8 to 7. The SSE is improved from 48.970 to 27.972. The time is constant. Figure 10 shows a graphical representation of the results produced and clearly shows the improvements.

IV. CONCLUSION

The results discussed here show that the K-means clustering algorithm applied on the selected attribute set produces significant improvements over the values obtained for the original dataset. The results obtained for the abalone and iris datasets show an improvement in accuracy of 3.49% and 2% respectively for the selected attribute set. For the randomized dataset the accuracy remains the same for the abalone dataset, while it increases by 8.67% for the iris dataset using normalization. The number of iterations and the execution time also decreased for all three datasets. The percentage of incorrectly clustered instances and the SSE also decreased for all three datasets.

REFERENCES
[1] E. A. Khadem, E. F. Nezhad, M. Sharifi, "Data Mining: Methods & Utilities", Researcher 2013; 5(12):47-59. (ISSN: 1553-9865).
[3] A. K. Jain, S. Maheshwari, "Survey of recent clustering techniques in data mining", International Journal of Computer Science and Management Research, vol. 1, issue 1, Aug 2012.
[4] D. Sisodia, L. Singh, S. Sisodia, K. Saxena, "Clustering Techniques: A Brief Survey of Different Clustering Algorithms", International Journal of Latest Trends in Engineering and Technology (IJLTET), vol. 1, issue 3, September 2012.
[5] E. Kijsipongse, S. U-ruekolan, "Dynamic load balancing on GPU clusters for large-scale K-Means clustering," 2012 IEEE International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 346-350, May 30 2012-June 1 2012.
[6] http://www.cs.waikato.ac.nz/ml/weka/downloading.html
Abstract—The comparison between TCP and UDP tunnels has not been sufficiently reported in the scientific literature. In this work, we use OpenVPN as a platform to compare the performance of TCP and UDP tunnels. The de facto belief has been that a TCP tunnel provides a permanent tunnel and therefore ensures a reliable transfer of data between two end points. However, the effect of transmitting TCP within a UDP tunnel has been explored and could prove a valuable approach. The results provided in this paper demonstrate that TCP in a UDP tunnel indeed provides better latency. Throughout this paper, a series of tests have been performed: UDP traffic was sent inside a UDP tunnel and a TCP tunnel successively. The same tests were performed using TCP traffic.

Keywords—TCP, UDP, OpenSSH, VPN, Tunneling.

I. INTRODUCTION

An IP tunnel [1] is defined as an Internet Protocol (IP) communications medium between two networks. It encapsulates its own network protocol within the TCP/IP packets carried by the Internet. IP tunnels connect separate IP networks which are not directly connected to each other. A tunneling protocol allows access to a network service which is not otherwise supported. The nature of the traffic sent through the tunnel can be hidden by an encryption standard which repackages traffic data into a different form. It makes use of a layered protocol model such as those of the Open Systems Interconnection (OSI) model or the TCP/IP protocol suite. Tunneling is used in all VPNs; one common open source application layer solution available is OpenVPN (Open Virtual Private Network). The popularity of VPNs has increased due to their low cost and the security they provide. The trade-off between TCP and UDP, regardless of VPN usage, is always said to be the same: speed is sacrificed for reliability, as UDP is connectionless and the server sending the data theoretically does not ensure that it reaches the destination. UDP is claimed to be faster but TCP is meant to be more reliable. This paper focuses on such a comparison and evaluates the efficiency of a UDP tunnel versus a TCP tunnel using OpenVPN. Throughout this paper, a series of tests have been performed: UDP traffic was sent inside a UDP tunnel and a TCP tunnel successively. The same tests were performed using TCP traffic.

This document is organized into the following sections:
• Section II covers Virtual Private Networks, with different layers in the OSI model.
• Section III covers User Datagram Protocol (UDP) and Transmission Control Protocol (TCP).
• Section IV makes a comparison between TCP and UDP.
• Section V details the experimental testing.
• Section VI describes the physical framework.
• Section VII details the performance measures.
• We then conclude by giving the results and observations in Section VIII.

II. VIRTUAL PRIVATE NETWORK (VPN)

A virtual private network (VPN) [2] makes use of a public network to connect multiple remote locations. A VPN extends a private network by making use of a public network (the Internet) to set up point-to-point connections and virtual tunneling protocols. Using a VPN, a computer can communicate across public networks as if it were connected to the same LAN as the private network. A VPN is a logical network on top of an already existing network. Different VPN solutions work on different layers of the Open System Interconnect (OSI) model [3]. In the tunnels, the traffic is encrypted and sent through using the lower layers of the OSI model. The VPN traffic is separated from any other network traffic by using encrypted tunnels between the VPN hosts. Inside a tunnel, the forwarded traffic is encapsulated into a special packet format on which a block cipher is used to encrypt the traffic [4]. As mentioned in the previous paragraph, a VPN can work on different layers of the OSI model. Three common types of VPNs are Application Layer VPNs, Network Layer VPNs and Datalink Layer VPNs. Secure Shell (SSH), Secure Sockets Layer (SSL), and OpenVPN are VPNs that work on the Application layer of the OSI model [5]. The tunneled traffic is encapsulated into application-specific headers before being sent to the other side using the available Transport Layer protocol [6], such as the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP).

III. TRANSMISSION CONTROL PROTOCOL (TCP) AND USER DATAGRAM PROTOCOL (UDP)

TCP is the main protocol in TCP/IP networks. The IP protocol processes data packets while TCP allows two hosts
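The TCP-versus-UDP trade-off described above can be observed even without a VPN. The following sketch (a loopback echo test, not the paper's OpenVPN setup) times a single round trip over a TCP socket and over a UDP socket; the ports are chosen by the OS and all names are illustrative.

```python
import socket
import threading
import time

def start_echo_server(kind):
    """Start a one-shot echo server (TCP or UDP) on loopback; return its port."""
    sock_type = socket.SOCK_STREAM if kind == "tcp" else socket.SOCK_DGRAM
    srv = socket.socket(socket.AF_INET, sock_type)
    srv.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    if kind == "tcp":
        srv.listen(1)
    def serve():
        if kind == "tcp":
            conn, _ = srv.accept()
            conn.sendall(conn.recv(1024))
            conn.close()
        else:
            data, addr = srv.recvfrom(1024)
            srv.sendto(data, addr)
        srv.close()
    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]

def round_trip(kind):
    """Send one small message and time the echo round trip."""
    port = start_echo_server(kind)
    sock_type = socket.SOCK_STREAM if kind == "tcp" else socket.SOCK_DGRAM
    s = socket.socket(socket.AF_INET, sock_type)
    s.settimeout(5.0)
    s.connect(("127.0.0.1", port))      # for UDP this only fixes the peer address
    t0 = time.perf_counter()
    s.send(b"ping")
    reply = s.recv(1024)
    rtt = time.perf_counter() - t0
    s.close()
    return reply, rtt

tcp_reply, tcp_rtt = round_trip("tcp")
udp_reply, udp_rtt = round_trip("udp")
print(f"TCP round trip {tcp_rtt * 1e6:.0f} us, UDP round trip {udp_rtt * 1e6:.0f} us")
```

The TCP path pays for connection setup and in-order delivery guarantees, while the UDP datagram carries no such machinery; over loopback the gap is tiny, but it is the same mechanism that the tunnel measurements in this paper exercise at scale.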
tunnel. The TCP stacking starts when the message size reaches 2 MB, since the difference in latency before 2 MB is very small compared to the latency gap after 2 MB. On the other hand, Figure 2 shows the latency comparison between UDP and TCP when using a UDP tunnel in a LAN environment. When using a UDP tunnel, we do not have the double-stacking problem. Compared to Figure 1, the latency is smaller for both TCP and UDP traffic. It was said that encryption slows down UDP traffic, since when bits of data are missing the entire message may need to be re-sent, causing latency. During the tests, it was shown that a UDP message is not affected by the encryption mechanism when the size is between 1 MB and 10 MB. Sending TCP traffic inside a TCP tunnel does not create the double TCP stacking problem; this is shown in Figure 2. When we compare the two graphs for example, the

Fig. 3. Latency comparison between UDP and TCP inside a TCP tunnel in a WAN environment
Fig. 4. Latency comparison between UDP and TCP inside a UDP tunnel in a WAN environment

VIII. CONCLUSION

This paper addresses the performance comparison between TCP and UDP tunnel connections. Two distinct scenarios were used to test the two VPN tunneling mechanisms. The results conclude that the UDP tunnel utilizes the link more efficiently and provides radically improved transfer times and speeds compared with the TCP tunnel. The results also demonstrate that TCP in a UDP tunnel indeed provides better latency. It would be worthwhile to investigate the performance of the tunnels on mobile networks. OpenVPN can be installed on mobile devices; therefore another direction is to test the VPN on a mobile device and measure the amount of energy required to secure connections, since mobile devices have batteries of limited capacity.

REFERENCES
[1] S. Zhou and J. Luo, "A novel IP over UDP tunneling based firewall traversal for peer-to-peer networks," in Service Operations and Logistics, and Informatics (SOLI), 2013 IEEE International Conference on, pp. 382-386, July 2013.
[2] R. Bush and T. Griffin, "Integrity for virtual private routed networks," in INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications. IEEE Societies, vol. 2, pp. 1467-1476, March 2003.
[3] Y. Li, W. Cui, D. Li, and R. Zhang, "Research based on OSI model," in Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference on, pp. 554-557, May 2011.
[4] A. Mayer, B. Collini-Nocker, F. Vieira, J. Lei, and M. Castro, "Analytical and experimental IP encapsulation efficiency comparison of GSE, MPE, and ULE over DVB-S2," in Satellite and Space Communications, 2007. IWSSC '07. International Workshop on, pp. 114-118, Sept 2007.
[5] M. Mimura and H. Tanaka, "Behavior shaver: An application based layer 3 VPN that conceals traffic patterns using SCTP," in Broadband, Wireless Computing, Communication and Applications (BWCCA), 2010 International Conference on, pp. 666-671, Nov 2010.
[6] D. Sarkar and H. Narayan, "Transport layer protocols for cognitive networks," in INFOCOM IEEE Conference on Computer Communications Workshops, 2010, pp. 1-6, March 2010.
[7] I.-S. Yoon, S.-H. Chung, and J.-S. Kim, "Implementation of lightweight TCP/IP for small, wireless embedded systems," in Advanced Information Networking and Applications, 2009. AINA '09. International Conference on, pp. 965-970, May 2009.
[8] T. Le, G. Kuthethoor, C. Hansupichon, P. Sesha, J. Strohm, G. Hadynski, D. Kiwior, and D. Parker, "Reliable user datagram protocol for airborne network," in Military Communications Conference, 2009. MILCOM 2009. IEEE, pp. 1-6, Oct 2009.
[9] Y.-L. Chang and C.-C. Hsu, "Connection-oriented routing in ad hoc networks based on dynamic group infrastructure," in Computers and Communications, 2000. Proceedings. ISCC 2000. Fifth IEEE Symposium on, pp. 587-592, 2000.
[10] E. Bethel and J. Shalf, "Grid-distributed visualizations using connectionless protocols," Computer Graphics and Applications, IEEE, vol. 23, pp. 51-59, Mar 2003.
[11] M.-Y. Park and S.-H. Chung, "Distinguishing the cause of TCP retransmission timeouts in multi-hop wireless networks," in High Performance Computing and Communications (HPCC), 2010 12th IEEE International Conference on, pp. 329-336, Sept 2010.
[12] D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, "Modeling and taming parallel TCP on the wide area network," in Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International, pp. 68b, April 2005.
6to4 Tunneling Framework using OpenSSH
Irfaan Coonjah, Faculty of Engineering, University of Mauritius, Réduit, Mauritius (irfaan.coonjah@umail.uom.ac.mu)
Pierre Clarel Catherine, School of Innovative Technologies and Engineering, University of Technology, Mauritius, La Tour Koenig, Pointes aux Sables (ccatherine@umail.utm.ac.mu)
K. M. S. Soyjaudah, Faculty of Engineering, University of Mauritius, Réduit, Mauritius (ssoyjaudah@uom.ac.mu)
Abstract—6to4 tunneling enables IPv6 hosts and routers to connect with other IPv6 hosts and routers over the existing IPv4 Internet. The main purpose of IPv6 tunneling is to maintain compatibility with the large existing base of IPv4 hosts and routers. OpenSSH VPN tunneling is said to have limitations with numerous IPv6 clients, and it is therefore usually considered advisable to use OpenVPN. To the best knowledge of the authors, this is the first reported successful implementation of 6to4 tunneling over OpenSSH with more than one client. This proof of concept therefore positions OpenSSH as a potential alternative to conventional VPNs.

Keywords—Tunneling, OpenSSH, VPN, IPV6.

I. INTRODUCTION

In 1995, Tatu Ylonen designed the Secure Shell (SSH) protocol to protect data by encrypting traffic before sending it over an unsecured network such as the internet [1]. Since the SSH software was open source and freely available to the public [2], it quickly gained popularity. In December 1995, the founder of SSH made the software proprietary; hence, it was no longer accessible to the public and developers [3]. However, in 1999, open source developers founded a research community and created a free version, Open Secure Shell (OpenSSH), which is derived from SSH [4]. More than 2 million users were using the free version of OpenSSH in 2000, and by 2013 the number of users had reached 10 million. The OpenSSH developers are convinced that their application is more secure than the original SSH protocol, as they respect their policy of producing clean and audited code. The software is released under the BSD license, a family of permissive free software licenses [5]. OpenSSH is currently one of the most used security implementations, is available in a large number of operating systems, and has been a reliable solution for internet security vendors' equipment. OpenVPN, developed by another research group, is another well-known open source VPN. OpenSSH is limited to UNIX-like operating systems whereas OpenVPN can be deployed on almost all platforms. Unlike OpenSSH, OpenVPN does not form part of a protocol family and has been limited to VPN tunnelling. The OpenSSH group's research in SSH VPN has been very slow and recent versions have fixed patches of security flaws; therefore the 6to4 tunneling feature will undoubtedly be of great help to OpenSSH users [6].

This document is organized into the following sections:
• Section II covers Virtual Private Networks.
• Section III describes OpenSSH.
• Section IV covers IPv6 and IPv4 private and public addressing.
• Section V describes the requirements and prerequisites for building 6to4 tunneling on OpenSSH.
• Section VI describes the scenario and Section VII details the steps to create 6to4 tunneling.
• Section VIII performs a series of tests.
• We then conclude by giving the results and observations in Section IX.

II. VIRTUAL PRIVATE NETWORK (VPN) AND IP TUNNEL

An IP tunnel [7] is an Internet Protocol (IP) communications connection between two networks used to transport another network protocol. It does this by encapsulating its own network protocol within the TCP/IP packets carried by the Internet. IP tunnels are used to connect two separate IP networks that are not directly connected to each other. A user can access any network service not supported in its local area network using a tunneling protocol; one example is to run IPv6 over IPv4. A second use is to provide otherwise unavailable services, for example giving a different subnet IP address to a remote user whose physical network address is not part of the LAN. Tunneling protects the type of traffic that is run through the tunnel [8], using an encryption standard to repackage traffic data into a different form. A virtual private network (VPN) [9] makes use of a public network to connect multiple remote locations. A VPN extends a LAN using a public network, such as the Internet, by establishing a point-to-point connection and virtual tunneling protocols. It allows a computer to communicate across public networks as if it were connected directly to the same LAN. A VPN is a logical network on top of an already existing network. Different VPN solutions work on different layers in the Open System Interconnect (OSI) model [10]. In the tunnels, the traffic is encrypted and sent through using the lower layers in the OSI model. IPsec VPN operates on the Network layer of the OSI model [11]. The tunneling is transparent to the applications working on the higher layers. Support for Network Layer tunneling is implemented by the operating system of the host that is an end point of the tunnel. Data-link layer VPNs [12] work on the Data-link Layer of the OSI model. They are most commonly used on top of PPP in order to secure modem-based connections,
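As background to the 6to4 scheme named in the title: under RFC 3056 a site derives its IPv6 /48 prefix by appending its 32-bit public IPv4 address to the 2002::/16 prefix. A minimal sketch of that derivation, using the server's public address from the scenario section purely as an example input:

```python
import ipaddress

def sixto4_prefix(ipv4):
    """Derive the 2002::/16-based 6to4 /48 prefix from a public IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    # 16 bits of 2002 prefix, then the 32 embedded IPv4 bits, then 80 zero bits.
    prefix = (0x2002 << 112) | (v4 << 80)
    return ipaddress.IPv6Network((prefix, 48))

# 202.123.2.13 is the server's public address in the scenario section.
print(sixto4_prefix("202.123.2.13"))
```

Note that the tunnel built later in this paper assigns link-local fe80:: addresses by hand rather than computing a 2002:: prefix; the function above only illustrates the addressing scheme the 6to4 name refers to.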
V. REQUIREMENTS AND PREREQUISITES

Root access to the client and the server is needed. The client and the server are linked to the internet and both have public IP addresses. The aim is to use OpenSSH to connect to the server from a remote computer and set up a virtual tunnel with IPv6 addresses allowing the client computer to connect virtually to the server network.

1) SSH version 6.0 or higher on both ends of the VPN
2) Standard Linux or Ubuntu
3) Two computers

VI. SCENARIO

The SSH server has IP address 10.1.2.24/24 and is connected to the internet with public IP address 202.123.2.13. The client computer has private IP address 192.168.1.2 and will connect to the server using a simple ADSL line. Both server and client are running the Ubuntu 13.04 operating system.

To set as default, using any editor, open /etc/sysctl.conf and add:

# Needed to add for forwarding
net.ipv4.ip_forward = 1

5) Configure iptables to allow masquerading (NAT) [22]:

# sudo iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE

6) The iptables settings will be lost on reboot unless a way of saving them is configured. Open /etc/rc.local and add this line (above the exit 0 line) [23]:

# iptables-restore < /etc/iptables.rules

7) Configure the tunnel (tun0):

# sudo ssh -w 0:0 202.123.2.11

This command creates a tunnel interface named tun0 on both machines, server and client. If there are no errors, the tun0 interface will be seen on both systems, but not configured:

# ip addr show tun0
tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500 link/none

Sometimes the MTU of the tun0 interface needs to be changed to 1500:

ifconfig tun0 mtu 1500 up

B. Client Configuration

1) Make an SSH key, call it VPN [24]:

# ssh-keygen -f VPN -b 1024

When generating the key, enter a password when asked, to increase security.

2) On the client, put the private key (VPN) in /root/.ssh and set permissions [25]. If needed, make the directory /root/.ssh first.

# sudo mkdir /root/.ssh
# sudo cp VPN /root/.ssh/VPN
# sudo chown root:root /root/.ssh/VPN
# sudo chmod 400 /root/.ssh/VPN

3) Configure the key on the server: after transferring the public key (VPN.pub) to the server, put it in /root/.ssh/authorized_keys. If needed, make the directory /root/.ssh first [24].

# sudo mkdir /root/.ssh
# sudo bash -c "cat VPN.pub >> /root/.ssh/authorized_keys"

C. Bring up the VPN connection

The ifup and ifdown commands are an automated way of bringing the tunnels up and down.

Bringing the connection up:

sudo ifup tun0

Shutting the connection down:

sudo ifdown tun0

Alternatively, configure the interfaces manually by giving each an IP address (do this as root).

On the server:

ip link set tun0 up
ip -6 addr add fe80::2/64 dev tun0

On the client:

ip link set tun0 up
ip -6 addr add fe80::1/64 dev tun0
route add -net 10.1.2.0 netmask 255.255.255.0 dev tun0
gateway fe80::2
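One quick way to confirm that the interfaces in step C received their IPv6 addresses is to parse the output of "ip -6 addr show tun0". The sample output below is canned for illustration (it mirrors the client's fe80::1 assignment above), and the helper name is hypothetical:

```python
import re

SAMPLE = """\
5: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 state UNKNOWN qlen 500
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever
"""

def tun_ipv6_addrs(ip_output):
    """Extract the IPv6 addresses assigned to an interface from `ip -6 addr show` output."""
    return re.findall(r"inet6\s+([0-9a-f:]+)/\d+", ip_output)

print(tun_ipv6_addrs(SAMPLE))  # ['fe80::1']
```

In practice the same check would be run against the live command output (for example via subprocess) on both ends of the tunnel before attempting a ping6 between fe80::1 and fe80::2.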
REFERENCES
[1] R. Seggelmann, M. Tuxen, and E. Rathgeb, "SSH over SCTP - optimizing
a multi-channel protocol by adapting it to SCTP," in Commu-
nication Systems, Networks Digital Signal Processing (CSNDSP), 2012
8th International Symposium on, pp. 1–6, July 2012.
[2] J. Sarkinen, “An open source(d) controller,” in Telecommunications
Energy Conference, 2007. INTELEC 2007. 29th International, pp. 761–
768, Sept 2007.
[3] A. .B, D. .F, and N. .B, “Linux operations and administration,” vol. 1,
pp. 399–400, August 2012.
[4] A. Jacoutot, “About openssh.” http://www.openssh.com/, May 2014.
[5] A. Jacoutot, “Openssh.” http://www.openssh.com/portable.html, May
2014.
[6] D. Meng, “Implementation of a host-to-host vpn based on udp tunnel and
openvpn tap interface in java and its performance analysis,” in Computer
Science Education (ICCSE), 2013 8th International Conference on,
pp. 940–943, April 2013.
[7] S. Zhou and J. Luo, “A novel ip over udp tunneling based firewall traver-
sal for peer-to-peer networks,” in Service Operations and Logistics, and
Informatics (SOLI), 2013 IEEE International Conference on, pp. 382–
386, July 2013.
[8] P. Rawat, J. Bonnin, and L. Toutain, “Designing a tunneling header
compression (tucp) for tunneling over ip,” in Wireless Communication
Systems. 2008. ISWCS ’08. IEEE International Symposium on, pp. 273–
278, Oct 2008.
[9] R. Bush and T. Griffin, “Integrity for virtual private routed networks,” in
INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE
Computer and Communications. IEEE Societies, vol. 2, pp. 1467–1476
vol.2, March 2003.
[10] Y. Li, W. Cui, D. Li, and R. Zhang, “Research based on osi model,”
in Communication Software and Networks (ICCSN), 2011 IEEE 3rd
International Conference on, pp. 554–557, May 2011.
[11] M. Zhuli, L. Wenjing, and G. ZhiPeng, “Context based deep packet
inspection of ike phase one exchange in ipsec vpn,” in Innovative Com-
puting Communication, 2010 Intl Conf on and Information Technology
Ocean Engineering, 2010 Asia-Pacific Conf on (CICC-ITOE), pp. 3–6,
Jan 2010.
[12] M. Khan, “Quantitative analysis of multiprotocol label switching (mpls)
in vpns,” in Students Conference, 2002. ISCON ’02. Proceedings. IEEE,
vol. 1, pp. 56–65 vol.1, Aug 2002.
[13] D. Djim, “Openssh, keeping your communiques secret,” January 2011.
http://openbsd.das.ufsc.br/openssh/.
[14] D. Jim, “Openssh portable openssh.” http://www.openssh.com/portable.
html, 2014.
[15] G. J. Caceres Forte, Next Generation SSH2 Implementation: Securing
Data in Motion. Burlington, MA: Syngress Publishing, Inc., second ed.,
2011.
[16] J. Jayanthi and S. Rabara, “Ipv6 addressing architecture in ipv4 net-
work,” in Communication Software and Networks, 2010. ICCSN ’10.
Second International Conference on, pp. 461–465, Feb 2010.
Comparative Study of Wireless Sensor Network Standards for Application in Electrical Substations

Fabrice Labeau, McGill University, Montreal, Canada, fabrice.labeau@mcgill.ca
Akash Agarwal, Indian Institute of Technology, Patna, India
Basile Agba, Hydro-Québec (IREQ), Varennes, Canada
Abstract—Power utilities around the world are modernizing their grid by adding layers of communication capabilities to allow for more advanced control, monitoring and preventive maintenance. Wireless Sensor Networks (WSNs), due to their ease of deployment, low cost and flexibility, are considered as a solution to provide diagnostics information about the health of the connected devices and equipment in the electrical grid. However, in specific environments such as high voltage substations, the equipment in the grid produces a strong and specific radio noise, which is impulsive in nature. The robustness of off-the-shelf equipment to this type of noise is not guaranteed; it is therefore important to analyze the characteristics of devices, algorithms and protocols to understand whether they are suited to such harsh environments. In this paper, we review several WSN standards: 6LoWPAN, Zigbee, WirelessHART, ISA100.11a and OCARI. Physical layer specifications (IEEE 802.15.4) are similar for all standards, with considerable architectural differences present in the higher layers. The purpose of this paper is to determine the appropriate WSN standard that could support reliable communication in the impulsive noise environment in electrical substations. Our review concludes that the WirelessHART sensor network is one of the most suitable to be implemented in a harsh impulsive noise environment.

Keywords—Wireless Sensor Networks; 6LoWPAN; Zigbee; WirelessHART; ISA100.11a; OCARI; impulsive noise environment; reliable communication.

This work was supported by Hydro-Québec, the Natural Sciences and Engineering Research Council of Canada and McGill University in the framework of the NSERC/Hydro-Québec Industrial Research Chair in Interactive Information Infrastructure for the Power Grid.

I. INTRODUCTION

Today, electrical substations require real-time information for adaptive energy allocation to the end users, for efficient delivery of power to the customers and to increase profitability. To address such needs, advanced wireless devices with low-power and mobile sensors are emerging. Such sensor devices monitor the equipment in the substation and provide an adaptive, self-healing electric automation system for tele-maintenance, tele-protection and tele-control [1]. Several Wireless Sensor Network (WSN) standards currently exist, such as 6LoWPAN, Zigbee, ISA100.11a, WirelessHART and, more recently, Optimization of Communication for Ad-hoc Reliable Industrial Networks (OCARI) [2]. However, their implementation in the electrical substation is an open area for research. In the rest of the paper, all the above-mentioned standards will be collectively termed "WSN standards".

The impact of WSNs in the electrical substation depends on reliable communication in the harsh and complex impulsive noise environment (INE) of the electric substation [3]. In order to deploy WSNs in smart grids, knowledge of parameters such as the wireless channel model or link quality information in such an environment is essential.

This paper provides a description and study of the above-mentioned WSN standards along with their stack structures and network architectures. An overview of the measurement and characteristics of impulsive noise, such as impulse rate, amplitude, duration and rise time, in the electric power systems environment (400/275/132 kV) is also provided.

The rest of the paper is organized as follows. In Section II, an overview of the estimation and characteristics of the impulsive noise environment is provided. In Section III, we study the protocol stack structures of the WSN standards. In Section IV, we compare the WSN standards to determine the most suitable for application in the INE. Finally, we conclude in Section V.

II. IMPULSIVE NOISE ENVIRONMENT IN ELECTRICAL SUBSTATION

It is known that the noise environment in electrical substations is adverse and typically impulsive [4]. The impulsive nature of the noise degrades communication carried out on the operating frequency band of a wireless network deployed in such an environment [3]. The major sources of impulsive noise in an electricity substation are (1) Partial Discharge (PD), which typically is caused by imperfect insulation, and (2) Sferic Radiation (SR), which results from the operation of circuit breakers, isolators, etc. Both of these processes produce radio waves that can be measured using UHF antennas. Apart from electrical substations, atmospheric noise and other man-made noise are also sources of impulsive noise.

The estimation of impulsive noise can help in assessing the difficulties in deploying a wireless network. Much effort has already been devoted to the measurement of impulsive noise in a variety of physical environments. For instance, the statistical characterization of the wireless channel, such as shadowing deviation and path loss, has been studied in various environments within the substation, such as 500 kV substations, an industrial power control room, and an underground network transformer vault [5].

Various other experiments for the estimation of impulsive noise in different frequency bands have also been conducted. In the measurement setup of [6], four types of antennas are used for monitoring partial discharge: two quasi-TEM half horns, designed to capture signals in the frequency bands 0.716–1.98 GHz and 1.92–5 GHz; a high band (HB) horn covering the range 2–6 GHz; and a low band (LB) horn covering the range 0.7–2 GHz. The fourth antenna is a di-cone antenna to collect data below 700 MHz. It is observed that some external interferences are encountered during the on-site partial discharge measurements, such as discrete spectral interferences, periodic pulse shaped
Abstract—Here, we propose a method for recognition of handwritten English digits utilizing a discrete cosine space-frequency transform known as the Discrete Cosine S-Transform (DCST). Experiments have been conducted on the publicly available standard MNIST handwritten digit database. The DCST features, along with an Artificial Neural Network (ANN) classifier, are utilized for solving the classification of handwritten digits. The Discrete Cosine S-Transform coefficients are extracted from the standard images of the MNIST handwritten isolated digit database. The database consists of a total of 70000 samples, including 60000 training samples and 10000 test samples. To reduce the computational overhead, we have normalized all the images of the MNIST dataset from 28 × 28 to 20 × 20 image size by eliminating the unwanted boundary pixels up to width four. Further, the classification of digits has been made by using a back propagation neural network (BPNN). This work has achieved a 98.8% success rate on the MNIST database.

Keywords—Classification; MNIST dataset; Discrete Cosine S-Transform; BPNN.

I. INTRODUCTION

The problem of optically scanned character recognition [1] is to interpret and translate legible handwritten input automatically, which is of great interest as well as a challenge in the field of object identification. It has numerous practical applications in real-time e-processing of data, such as automatic address reading, postal mail sorting through ZIP/PIN codes, bank check processing, etc. As a benchmark, the Modified National Institute of Standards and Technology (MNIST) dataset has often been used to design new digit recognition systems [2], [3]. In the literature, we found a great number of works based on the MNIST dataset, suggesting many different methods [4]–[6]. Lauer et al. [7] have proposed trainable feature extractors for handwritten digit recognition. A comprehensive examination by Liu et al. [8] compares the performance of previously proposed classifiers including linear and polynomial classifiers, the Nearest Neighbor classifier, and different neural networks. Normally, individuals do not always write the same digit in the very same way at a given point of time. Due to this within-class variance of the shape of a character, it is a major challenge for researchers to classify characters. Many feature extraction approaches, such as biologically inspired features [9], higher order singular value decomposition [10], a GA-based feature selection approach [11], and fuzzy model based recognition [12], have been proposed to capture the shape invariance within a class to improve the discrimination ability. It is observed that the accuracy and efficiency of many classifiers could be improved substantially by extracting direction features, or local structural or curvature features [13]. Kumar et al. [22] used mathematical morphology techniques to divide the dataset into two groups and further classify those considering the structural features of the digits. The overall recognition rate was recorded as 92.5%. The system fails to identify digits with broken lines with large gaps, incomplete digits and digits with uneven strokes. Also, many multi-resolution techniques are used for the classification of English digits. The S-transform is one of the multi-resolution techniques that extracts multi-scale resolution like the discrete wavelet transform [17]. It has been widely used in image restoration and texture analysis [16]. One of the major limitations of the S-transform is its high computational overhead. Recently, a faster adaptation of the S-transform, namely the discrete orthonormal Stockwell transform (DOST), has been designed for feature extraction. We have utilized the discrete cosine transform (DCT) rather than the fast Fourier transform in the DOST, which leads to the Discrete Cosine S-Transform (DCST).

Fig. 1: 100 sample digit images of MNIST dataset

The rest of the paper is organized as follows. Section II gives a full description of the database used. The discrete cosine S-transform is examined briefly in Section III-A. The proposed method is described in Section IV. Section V presents the results, followed by concluding remarks in Section VI.
II. DATABASE USED

The MNIST database was obtained from NIST's Special Database 3 (SD-3) and Special Database 1 (SD-1) [2], which comprise binary images of handwritten digits ranging from 0 to 9. These were collected from high school students and employees of the United States Census Bureau. A total of 30,000 patterns from SD-3 and 30,000 from SD-1 were gathered randomly to form the 60,000 samples of the training set. The test set comprises 10,000 samples, out of which 5,000 are chosen from SD-3 and the rest from SD-1. The MNIST dataset contains gray-scale images of size 28 × 28; thus, the dimensionality of each image sample vector is 784. A hundred sample images from the MNIST database are shown in Fig. 1. Since the samples are already normalized, only a small amount of preprocessing is required. For our experiment, we further normalized all the images of the MNIST dataset to 20 × 20 image size by eliminating the unwanted boundary pixels up to width four. Accordingly, the dimensionality of every image sample vector becomes 400. The MNIST database is publicly available at the homepage of LeCun [3]; the dataset files are listed in Table I along with their sizes in bytes. The distribution of each digit of the test set is shown in Table II.

TABLE I: MNIST dataset files and their sizes

Files                  Size in Bytes
Training set images    9912422
Training set labels    28881
Test set images        1648877
Test set labels        4542

to the Discrete Stockwell Transform (DST) and is given by

S[k, n] = \sum_{m=0}^{N-1} e^{-2\pi^2 m^2 / n^2} H[m + n] e^{i 2\pi m k / N}    (2)

where k is the time translation and n is the index of frequency shift with n ≠ 0. Here H(·) is the DCT of h(·). The 2D-DST of an image of size N × N has the growth function O[N^4 + N^4 log(N)]. Due to the high complexity and redundant information of the S-transform, it is not frequently used in many applications. However, the DCST, which is the orthonormal variant of the DST, can be used to represent information efficiently, creating an N-point time-frequency representation for a signal of length N. In this way, the DCST gives features with zero data redundancy. The voice frequencies (v_x, v_y) are obtained within the bandwidth of 2^{p_x - 1} × 2^{p_y - 1}. If we represent the voice frequencies as a complex image where v_x corresponds to the real part and v_y to the imaginary part, then the magnitude and phase angle are obtained by M_v = \sqrt{v_x^2 + v_y^2} and θ_v = \tan^{-1}(v_y / v_x), respectively. Hence, for each sample we obtain 400 DCST coefficients, where the size of the image is 20 × 20. So, our feature matrix size for m samples is m × 400.

B. Classification Phase

We have considered the back propagation neural network (BPNN) for the classification of the test set. The Artificial Neural Network (ANN) used has three layers, as shown in Fig. 2. The DCST features are extracted from each sample in the training set. The network is trained considering the samples from the training set. We fixed a total of 400 (i.e., n = 400)
Fall-out (or FPR) = FP / (TN + FP)    (5)

Miss rate (or FNR) = FN / (TP + FN)    (6)

Accuracy = (TP + TN) / (TP + FN + FP + TN)    (7)

The accuracy of the network can be calculated from Equation 7.
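Equations (5)–(7) translate directly into code; a small sketch, where the confusion-matrix counts TP, FP, TN and FN are illustrative inputs:

```python
def fall_out(fp, tn):
    """Eq. (5): false positive rate FP / (TN + FP)."""
    return fp / (tn + fp)

def miss_rate(fn, tp):
    """Eq. (6): false negative rate FN / (TP + FN)."""
    return fn / (tp + fn)

def accuracy(tp, tn, fp, fn):
    """Eq. (7): (TP + TN) / (TP + FN + FP + TN)."""
    return (tp + tn) / (tp + fn + fp + tn)
```

For example, with TP = 45, TN = 45, FP = 5 and FN = 5, the accuracy is 0.9.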
Fig. 6: Training convergence characteristics of MNIST dataset (cross-entropy versus number of epochs for the training, validation and test sets).

Fig. 7: Validation checks during the training of MNIST dataset.

VI. CONCLUSION

REFERENCES
[1] Richard O. Duda, Peter E. Hart, et al. Pattern classification and scene analysis, volume 3. Wiley, New York, 1973.
[2] Li Deng. The MNIST database of handwritten digit images for machine learning research [best of the web]. Signal Processing Magazine, IEEE, 29(6):141–142, Nov 2012.
[3] Yann LeCun and Corinna Cortes. The MNIST database of handwritten digits.
[4] Ernst Kussul and Tatiana Baidyk. Improved method of handwritten digit recognition tested on MNIST database. Proceedings from the 15th International Conference on Vision Interface. Image and Vision Computing, 22(12):971–981, 2004.
[5] Zhihong Man, Kevin Lee, Dianhui Wang, Zhenwei Cao, and Suiyang Khoo. An optimal weight learning machine for handwritten digit image recognition. Special issue on Machine Learning in Intelligent Image Processing. Signal Processing, 93(6):1624–1638, 2013.
[6] Ping Zhang, Tien D. Bui, and Ching Y. Suen. A novel cascade ensemble classifier system with a high recognition performance on handwritten digits. Pattern Recognition, 40(12):3415–3429, 2007.
[7] Fabien Lauer, Ching Y. Suen, and Gérard Bloch. A trainable feature extractor for handwritten digit recognition. Pattern Recognition, 40(6):1816–1824, 2007.
[8] Cheng-Lin Liu, Kazuki Nakashima, Hiroshi Sako, and Hiromichi Fujisawa. Handwritten digit recognition: benchmarking of state-of-the-art techniques. Pattern Recognition, 36(10):2271–2285, 2003.
[9] Angelo Cardoso and Andreas Wichert. Handwritten digit recognition using biologically inspired features. Neurocomputing, 99(0):575–580, 2013.
[10] Berkant Savas and Lars Eldén. Handwritten digit classification using higher order singular value decomposition. Pattern Recognition, 40(3):993–1003, 2007.
[11] C. De Stefano, F. Fontanella, C. Marrocco, and A. Scotto di Freca. A GA-based feature selection approach with an application to handwritten character recognition. Frontiers in Handwriting Processing. Pattern Recognition Letters, 35(0):130–141, 2014.
[12] M. Hanmandlu and O. V. Ramana Murthy. Fuzzy model based recognition of handwritten numerals. Pattern Recognition, 40(6):1840–1854, 2007.
[13] Jie Zhou, Adam Krzyzak, and Ching Y. Suen. Verification: a method of enhancing the recognizers of isolated and touching handwritten numerals. Handwriting Processing and Applications. Pattern Recognition, 35(5):1179–1189, 2002.
[14] Ying Wen and Lianghua He. A classifier for Bangla handwritten numeral recognition. Expert Systems with Applications, 39(1):948–953, 2012.
[15] Hossein Khosravi and Ehsanollah Kabir. Introducing a very large dataset of handwritten Farsi digits and a study on their varieties. Pattern Recognition Letters, 28(10):1133–1141, 2007.
[16] Robert Glenn Stockwell. A basis for efficient representation of the S-transform. Digital Signal Processing, 17(1):371–393, 2007.
[17] Rafael C. Gonzalez. Digital image processing. Pearson Education India, 2009.
[18] Yanwei Wang and Jeff Orchard. The discrete orthonormal Stockwell transform for image restoration. In 16th IEEE International Conference on Image Processing (ICIP), pages 2761–2764. IEEE, 2009.
[19] Yann LeCun, L. D. Jackel, L. Bottou, A. Brunot, C. Cortes, et al. Comparison of learning algorithms for handwritten digit recognition. In International Conference on Artificial Neural Networks, volume 60, pages 53–60, 1995.
[20] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[21] Cheng-Lin Liu, Kazuki Nakashima, Hiroshi Sako, and Hiromichi Fujisawa. Handwritten digit recognition: benchmarking of state-of-the-art techniques. Pattern Recognition, 36(10):2271–2285, 2003.
[22] V. Vijaya Kumar, A. Srikrishna, B. Raveendra Babu, and M. Radhika Mani. Classification and recognition of handwritten digits by using mathematical morphology. Sadhana, 35(4):419–426, 2010.
Improving User Cognitive Processes in Mobile Learning Platforms through Context-Awareness

Brita Curum, Faculty of Engineering, University of Mauritius (UoM), Réduit, Mauritius, brita_curum@hotmail.com
Kavi Kumar Khedo, Faculty of Engineering, University of Mauritius (UoM), Réduit, Mauritius, k.khedo@uom.ac.mu
Abstract—The use of mobile devices for learning is prominent nowadays. Mobile learning has become an essential part of enhancing today's learning style. However, technologies still lack the ability to fully understand human reactions and comprehend users' cognitive processes so as to interact with them and deliver the right amount of content. Therefore, one research direction is devices which can sense and acknowledge user context information according to the individual's cognitive load. The purpose is to seriously engage a learner with learning materials at an appropriate speed of information flow, within the mobile device, without constraints of time, location or device restrictions. In this paper, a detailed evaluation of context-aware mechanisms to improve the cognitive load of users in mobile learning platforms is carried out. Different evaluation criteria are identified and discussed, followed by a detailed assessment of recent existing work on context-aware algorithms. Identified open challenges and research directions for using context-aware information to improve mobile learning are presented.

Keywords—mobile learning; context-awareness; cognitive load; context-aware mechanisms; mobile learning platforms

I. INTRODUCTION

Technology has brought about many changes in our daily life. The dominating characteristic of mobile devices permeates our routine with their permanent accessibility to information and communication. Being a ubiquitous asset, mobile devices have shown their potential to support learning. Mobile learning refers to the use of mobile devices whereby learning is supported while on the move, leaving behind the limitations of traditional educational environments. Reference [1] defines mobile learning as the process of teaching and learning with the aid of mobile technologies such as smart phones, PDAs, media players and tablet computers, which can be available anywhere instantly. In recent years, mobile learning has become widespread [2] and has attracted many learners. As a result, a clear growth in the demand for adapted delivery of educational content is seen [3]. A platform such as Moodle is an integrated open Learning Management System (LMS) which assists learners and educators in creating personalized educational materials. Other similar systems exist which are gradually moving to mobile; examples are MOMO, MLE-Moodle and many others. The key reason behind many researchers' interest in mobile learning stems from its portable, customizable and situated characteristics [4-7]. However, with the vast quantity of learning resources available, studies have also pointed out that too much information causes cognitive load [8-9]. Mobile learning in education is restricted by two characteristics: the effectiveness of m-learning and the design of these systems [10]. Researchers have identified m-learning as efficient using learning outcomes such as motivation, perceptions, attitudes, academic records and satisfaction of the student, rather than the learning processes [11-12]. Learning processes which cover cognitive load need to be looked into to address this gap.

II. MOBILE LEARNING

Mobile learning is a focused and challenging area for enhancing learning approaches in the next few years. Mobile devices, which are used for information sharing and communication, are produced with different hardware and software specifications [13]. This new form of learning will provide us with an interface based on our actual situation and help us to strengthen our knowledge in any instance [14]. Wisdom will come from mobile devices and be propagated amongst peers in a social context system, as supported by Reference [15], who also reported that m-learning supports instant communication with peers. Many authors view mobile learning as an add-on to e-learning which allows a wireless connection [16]. This additional feature has modernized the lifestyles of people, allowing various functions which were not present earlier. As a result, mobile technologies are reshaping the routine of users in many ways.

A. Recent advances in Mobile learning

Mobile devices have impacted many sectors beyond the educational sector, such as politics, and are seen to be lucrative for productivity [17]. Given that the reach of mobile technologies is further expanded, with low competitive prices and newly added features, more and more people are able to own one or many such devices. Since people carry mobile devices with them most of the time, learning can occur anytime and anywhere. We are no longer restricted to traditional classrooms to study. The flexibility supported by mobile learning is thus sustained. Furthermore, the concept of Bring Your Own Device (BYOD) in workplaces is increasing productivity. Richer and more dynamic learning is encouraged in this way. The portable characteristic together with the power of applications (apps) in mobile devices are
Gamification is a growing form of digital content in the form of games, integrated to entertain users while learning is supported simultaneously. Reference [23] reported gamification as a provider of interactive edutainment, increasing student engagement and offering alternative ways of extending the learning process. These edu-games can be accessed on mobile devices at any time.

One noticeable recent trend in learning using new technologies is the rise of Massive Open Online Courses (MOOCs). Many universities have adopted these courses as a way to increase the number of student admissions. Mobile technologies will enable MOOCs in the coming years to propose more customized learning [24].

Although recent advancements in technology present different options and capabilities for learning through mobile devices, it is very difficult to grasp the full spectrum of the possibilities provided by the device. Digital technology development was limited to social communication in the past, and only a few people centered mobile learning as an academic activity in higher institutions [25]. Indeed, there is a growing interest in catering for user context change and providing improved learning experiences, as stated by Reference [26].

B. Mobile learning platforms

Investigations have shown that students like the flexibility that mobile learning provides them, and with experience they develop positive perceptions towards this technology [27]. There exist a number of commercial mobile learning platforms already implemented and in use today. Moodle, as seen in Section I, is one model which is widely adopted internationally. In this section, other mobile learning platforms are presented and discussed.

Mobl21 [29] is an award-winning mobile learning platform that supports dynamic and unconventional learning. Available as mobile apps, desktop widgets or web applications, it lets educators develop customizable contents that are available through mobile devices. It allows study at a more convenient pace, with instant access to valuable learning material anytime and

III. CONTEXT AWARENESS IN MOBILE LEARNING

A system is considered "context-aware" if it can extract, read and use contextual information and adapt its functionalities to the current context in use [29]. According to researchers, the three types of information to be collected to define context are places, people and things [30]. To define these three entities, four categories of context are identified, which are described below [38]:

Identity: characterizes the entity with an explicit identifier, unique in the name domain of the application.

Location: includes spatial and geographical data.

Status or Physical context: contains properties distinguished by the user. It includes noise level, temperature, lighting and many others.

Time: can be the time of day, month, year, or date.

A. Types of context

A common approach to classifying context instances is to distinguish between context dimensions. References [31] and [32] broadly refer to these dimensions as external or extrinsic and internal or intrinsic, while Reference [33] grouped them into physical and logical context. External or physical dimensions refer to context that can be measured by hardware sensors, such as light, sound, touch, temperature and many others. The internal or logical dimension is specified by the user or captured by monitoring user interactions, such as the learner's goals, tasks to complete and emotional state. Importance is given to external context instances most of the time: location, activity, identity and time are identified as primary contexts. Nonetheless, some efforts to use other context information have been carried out in recent years [34]. Additionally, contextual information is vital for effective content adaptation in m-learning platforms. An adaptive engine is required to provide adaptive learning [35]. The
engine obtains data and outputs relevant adaptation results. performances of the learner need to be thoroughly investigated
Context is based on the learner‟s state, educational state, [46].
activity‟s state, infrastructure‟s state and the surrounding‟s
state [36]. The learner‟s state consists of the following Basically, cognitive load stems from three types that
dimensions: cognitive skills, intentions, learning styles, directly affect learning which are intrinsic, extraneous and
preferences and interactions with the system [37]. Depending germane. For learning to occur, the total cognitive burden
on the individual‟s contextual information, these dimensions should never go beyond an individual‟s working memory
may be constant or may vary over time. In this paper, focus is capability. Generally, the addition of extraneous and germane
mainly on the learner‟s state since it enfolds the cognitive cognitive load is believed to be equivalent to the cognitive
abilities of a learner. burden without considering the intrinsic cognitive burden.
Therefore, instructional designing‟s aim is to reduce the
On the other hand, Reference [38] stated that contextual extraneous load as the germane cognitive burden is increased
information is captured using sensors which are of three basic [48]. Elaborated hereunder are the three types of cognitive
categories. These are the Activity sensors, Bio sensors and the load:
Environmental sensors. These sensors study and record
efficiently the situation and current location of the user. Intrinsic cognitive load refers to the complexity that is
Activity sensors are supportive to users while they perform involved in learning materials. It is related to the
registration of the work currently in progress or saving amount of information the working memory deals with
activities that have already been completed. It keeps track of the tasks the user is trying to solve. An example is DYONIPOS [39], which captures the activities that the learner undertakes and suggests appropriate learning materials to the learner. Bio-sensors, on the other hand, capture temperature, respiration, heart rate and so on. An example is the accelerometer, a device which functions as a compass: it detects motion and calculates the acceleration from these data. Environmental sensors use photo, audio and proximity sensors to extract information about the user's surroundings; the proximity sensor in today's smartphones, for instance, turns off the display when the user is on a call [40]. These sensors remain very limited in terms of usability and power consumption, and among the many brands of mobile devices on the market today, not all possess the sensors required to evaluate the cognitive processes of users.

The challenge is to combine the different contexts so as to deliver accurate learning materials to the learner [41-42], [39]. Combining user performance recorded from the learner's independent thinking, the contextual information and the interaction with the learning system will bring m-learning to another level: it is argued that users will eventually boost their motivation and maintain good learning performance [43].

IV. CONTEXT-AWARE MECHANISMS TO IMPROVE COGNITIVE LOADS IN MOBILE LEARNING

Cognitive load is defined as the load generated when work is assigned to an individual's cognitive system. A higher cognitive load brings lower user satisfaction towards learning [44]; hence, it is important to find ways to lower the cognitive load in teaching activities. Improper design of learning elements increases the cognitive burden on students, causing them to overload their working memory [45]. Researchers are still examining whether learners are well prepared to face the pressure and difficulties that newly proposed technologies present them with. When employing new technologies to provide richer content, the factors that will affect the learning experience must be considered.

Intrinsic cognitive load is inherent to the learning material itself, whose elements must be processed principally at a simultaneous time [49]. Extraneous cognitive load consists of non-relevant, unimportant material that causes learners to make additional use of their mental processes; it can be defined as unimportant processing caused by instructional design [50].

Germane load comprises elements that enable learners to devote proper cognitive resources to the learning process. Moreover, it encourages the growth of a learner's knowledge level. In other words, it is the load which allows the learner's conscious focus of attention to intentionally recall and comprehend the learning contents they come across [51]. Fig. 1 depicts the cognitive load types; the dashed lines represent the loads that can be manipulated by purposeful instructional design [47].

Fig. 1. Depiction of Cognitive Load types [47].

Cognitive Load Theory (CLT) recommends that extraneous load be reduced by re-engineering learning activities when the intrinsic complexity of a task stays fixed [52]. When extraneous cognitive load is low, the released cognitive resources can be re-assigned to the germane aspect to balance the load and eventually ease learning [53].

The current situation of learning through a mobile device has a negative impact on the user because of the high cognitive load caused by improperly designed content. Thus, it is vital to centre the problem of adaptation on the cognitive level required for a particular context. Reference [54] proposed the "Split Attention principle", which shows that people learn more when information is presented in more than one form; learning material can be textual, pictorial, animated or verbal. In order to provide a better mobile learning experience, it is important to produce
an instructional context in which the learner feels at ease with the frequent change of the surrounding conditions and can simultaneously be immersed in the contents of the device.

Another instructional design principle based on cognitive load theory is the goal-free problem effect [55]. In conventional geometry problem-solving we need to find the value of a particular angle from a diagram, whereas in goal-free problems students are required to find the values of as many angles as they can [56].

Reference [57] introduced the redundancy effect. Redundant information increases the extraneous cognitive load and occupies part of the learner's working memory capacity, since the learner processes unnecessary information which is not vital for learning. Therefore, to support multimedia learning, text that accompanies pictorial or animated material should be presented as narration instead of static text.

The expertise reversal effect [58] describes the phenomenon where instructional techniques that are advantageous to inexperienced learners have, conversely, a negative impact on experienced learners. The prime recommendation is that instructional design methods need adjustment as learners acquire more knowledge in a specific field of study. Table I evaluates the theories against the respective cognitive load types.

TABLE I. EVALUATION OF COGNITIVE LOAD THEORY

Cognitive Load Theory          | Intrinsic load | Extraneous load | Germane load
Split Attention Principle [54] | Constant       | High            | Low
Goal-free problem effect [55]  | Constant       | Low             | High
Redundancy effect [57]         | Constant       | High            | Low
Expertise Reversal effect [58] | Constant       | High            | Low

Building an adaptive educational system that adapts to the different contexts previously described is not an easy task. Contract-based Adaptive Software Architecture (CASA) [59] is a framework for developing adaptive applications; it dynamically adapts their functionality in response to changes at execution time. Reference [60] proposed the 5R adaptation concept, based on location information, to allow learning at the right time, in the right environment, using the right device, while providing the right materials to the learner. Likewise, reference [61] developed the Facilitation Activities Technology (FAT) framework, which guides the design of contextual and location-based learning. These frameworks are evaluated for their situated characteristics of mobile learning, their mobile application design or their contextual influences on learning contents. Similar to the frameworks described above, many more have been built to cater for adaptive learning materials.

Sensing technologies such as the Global Positioning System (GPS), Radio-Frequency Identification (RFID) and Quick Response (QR) codes have enabled learning systems to focus on and detect real-world locations and contexts of learners [62]. However, those efforts are very limited, not exploiting the full capabilities of the available devices, with narrow content adaptation algorithms and little focus on context information [39]. Existing context-aware adaptive systems also use very limited context information [63]. For example, JAPELAS [64] is a context-aware learning system equipped with GPS which assists in practising Japanese in the real world. It is supported by JEDY, an online dictionary which considers the intrinsic context of the system, while TANGO [64] helps Japanese students to recognize English words using mobile phones with an RFID tag reader. It includes six modules that select suitable English words based on learner models; a learner model can be the learner's profile, such as name, gender, interests and knowledge level. The English vocabulary learning system of [65] makes use of WLAN positioning technologies to locate the current position of the learner; in this way it stimulates the student's interest and allows for good performance. Additionally, collected data such as the learner's location and the time available for learning are used to pre-select and adapt learning contents according to the individual's preferences. Some researchers further extended DYONIPOS [40], described in Section III, where sensors are used to capture appropriate intrinsic contexts and provide learning materials suited to the learners' preferences; its environment context captures the nature of the location the user is currently in.

The Mobile Learning Support System (MLSS) is a tool designed by Reference [41] to ease students' learning in outdoor settings. It supports multimedia learning through a content learning function, a tag searching function and a database connection function. With the MLSS, students scan tags attached to corresponding objects and a variety of related material is displayed on screen. It uses GPS and 2D barcode technologies, which help students obtain information by interacting with their surroundings. This system combines real-world and digital resources and provides additional adaptive learning activities in a variety of outdoor and indoor settings.

TABLE II. EVALUATION OF CONTEXT-AWARE MECHANISMS
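None of the surveyed systems publish their selection logic, but the 5R idea (right time, right place, right device, right learner, right material) reduces in essence to filtering a content catalogue against a context record. The following stdlib-only sketch is purely illustrative; every field name, rule and material below is hypothetical and not taken from CASA, 5R, FAT or the other surveyed systems.

```python
# Hypothetical 5R-style content selector; all names and fields are illustrative.
MATERIALS = [
    {"title": "Campus-tour vocabulary quiz", "location": "campus", "device": "phone",  "minutes": 5},
    {"title": "Lab safety video",            "location": "lab",    "device": "phone",  "minutes": 12},
    {"title": "Full lecture slides",         "location": "any",    "device": "tablet", "minutes": 45},
]

def select_materials(context):
    """Return the materials matching the learner's location, device and free time."""
    return [
        m for m in MATERIALS
        if m["location"] in (context["location"], "any")   # right place
        and m["device"] == context["device"]               # right device
        and m["minutes"] <= context["free_minutes"]        # right time budget
    ]

ctx = {"location": "campus", "device": "phone", "free_minutes": 10}
print([m["title"] for m in select_materials(ctx)])  # ['Campus-tour vocabulary quiz']
```

A real adaptive system would of course rank rather than merely filter, and would fold in the cognitive-load considerations of Section IV, but the contract is the same: context in, suitable materials out.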
Abstract—When secure data transmission is implemented through chaotic systems, the choice of the output is a preliminary problem. In this paper, the quality of the transmitted information is analysed with respect to the observability concept, for each potential output. Moreover, in order to overcome observability loss, a dual immersion technique is proposed. The use of a high order sliding mode observer on the well-known Lorenz system allows us to highlight the well-foundedness of the proposed analysis and method.

Keywords—Chaotic systems; Observability; Singularity; Immersion; Secure data transmission

1. Introduction

Since the works of Pecora and Carroll [19], it is well known that two chaotic systems can be synchronized. Following this fact, many authors have proposed secure data transmission schemes (see for example [5, 1, 17]) based on the synchronization of chaotic systems. Nevertheless, the output choice with respect to the synchronization is, to the best of our knowledge, less analyzed (see for example [13, 9]). In [18] Marels and Nijmeijer draw the link between unidirectional synchronization and observation, which allows control system theory to be used to analyze the synchronization of chaotic systems. Nevertheless, as chaotic systems are always nonlinear, linear control system theory cannot be applied to them directly: basic properties such as stability, controllability and observability are generally only local for nonlinear systems. This is the reason why local observability concepts and criteria are introduced in [11]; these lead to the determination of an observability singularity set [9]. This set is only constituted of 'forbidden' states (i.e. states which are unobservable). In this paper, starting from the results introduced in [11] and other recent works [13, 9], an analysis of the "best" output for secure data transmission is realized on the basic Lorenz circuit. It is important to mention that the observability analysis with respect to the output choice is only a preliminary step in the design of a secure data transmission scheme. After that, many other problems occur, such as the choice of the ciphering method, the input choice, the method for retrieving the information (see for example [6, 12]), and so on. Moreover, when an output is chosen, an observability singularity set may exist, which leads to a loss of information in some parts of the state space. Hereafter a solution based on the immersion technique is proposed to overcome this problem under very weak conditions. It is important to mention that, since the work of [22, 21], the immersion technique has been extensively used in the observer design context. Nevertheless, it is usually employed to recover the linearity property by diffeomorphism and output injection [3, 4, 24, 23], but generally with only a local diffeomorphism. In the majority of the mentioned papers, immersion was accomplished by adding a dynamic by means of output integration [3]. The stability of such extra dynamics can be problematic, and an elegant solution is proposed in [23]; nevertheless, the problem of observability singularity was not tackled there. A dual immersion technique using only extra differentiations is proposed in this paper. This method is close to the one proposed in [2], in another context. Moreover, for stability arguments it is chosen, in this paper, to impose exponentially stable dynamics instead of constant dynamics. The proposed approach is feasible thanks to finite time differentiators, for example the one proposed in [7] (for which, however, a delay appears due to the data acquisition frame) or High Order Sliding Mode (HOSM) differentiators [15, 8].

∗ K.A.A Langueh is with QUARTZ Laboratory, IPGP, ENSEA, 6 Avenue du Ponceau, 95014 Cergy-Pontoise; O. Datcu is with Politehnica University of Bucharest, Faculty of Electronics, Telecommunications, and Information Technology, and QUARTZ Laboratory, IPGP, ENSEA, 6 Avenue du Ponceau, 95014 Cergy-Pontoise; J-P. Barbot is with QUARTZ Laboratory, IPGP, ENSEA, 6 Avenue du Ponceau, 95014 Cergy-Pontoise, and EPI Non-A, INRIA; G. Zheng is with EPI Non-A, INRIA Lille - Nord Europe; and K. Busawon is with Northumbria University, Newcastle, UK.

The paper is organized as follows: in the next section some observability concepts, the symbolic observability index and observability singularity definitions are recalled, and the problem statement is explained in the framework of the Lorenz system. In Section III, the dual immersion method is presented. After that, some recalls on the HOSM differentiator are given in Section IV.
Rank{dO(n)} = n    (4)

S4 = {x ∈ R³ : x1 = x2 = 0}

from the following injective function:

O(4) = (y, ẏ, ÿ, y⁽³⁾)ᵀ    (10)
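The rank condition (4) and the singular set S4 can be checked numerically. The sketch below, not from the paper, takes the Lorenz system with the standard parameters (σ = 10, ρ = 28, β = 8/3) and output y = x3; the rows of the Jacobian of O(3) = (y, ẏ, ÿ) were derived by hand from the Lorenz equations, and the determinant of that Jacobian vanishes exactly on S4 = {x1 = x2 = 0}.

```python
# Observability check for the Lorenz system x1' = sigma(x2 - x1),
# x2' = x1(rho - x3) - x2, x3' = x1 x2 - beta x3, with output y = x3.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def dO(x1, x2, x3):
    """Jacobian of (y, Lf y, Lf^2 y) w.r.t. (x1, x2, x3) for y = x3."""
    return [
        [0.0, 0.0, 1.0],                                   # grad of y = x3
        [x2, x1, -BETA],                                   # grad of y' = x1 x2 - beta x3
        [-SIGMA * x2 + 2 * x1 * (RHO - x3) - (1 + BETA) * x2,
         2 * SIGMA * x2 - (SIGMA + 1 + BETA) * x1,
         -x1 ** 2 + BETA ** 2],                            # grad of y''
    ]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# On the singularity set S4 = {x1 = x2 = 0} the rank condition (4) fails:
print(det3(dO(0.0, 0.0, 5.0)))   # 0.0 -> observability lost
# At a generic point the determinant is nonzero, so the state is observable:
print(det3(dO(1.0, 1.0, 1.0)))
```

On S4 the first two rows of dO are both multiples of (0, 0, 1), so the rank drops regardless of the third row, which is exactly why information is lost whenever the trajectory crosses the z-axis.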
Abstract—Three types of antenna arrays, namely uniform linear arrays, uniform rectangular arrays and uniform cube arrays, are used at the transmitter, and their corresponding channel capacities over several paths in an indoor environment are calculated. Numerical results show that uniform linear arrays perform better than uniform rectangular arrays and uniform cube arrays, both with and without interference.

Keywords—uniform linear array; uniform rectangular arrays; uniform cube arrays

I. INTRODUCTION

Wireless data rates continue to increase through various techniques, one of which is the antenna array, which deploys multiple antennas at the transmitter or receiver. In general, an antenna array system is called single-input multiple-output (SIMO) when a single antenna at the transmitter and multiple antennas at the receiver are deployed, and multiple-input single-output (MISO) when multiple antennas at the transmitter and a single antenna at the receiver are deployed. Furthermore, multiple-input multiple-output (MIMO) also belongs to this family, with multiple antennas deployed at both the transmitter and the receiver. Compared with single-input single-output (SISO), MISO and SIMO utilize transmit and receive antenna diversity, respectively, to yield the corresponding diversity gains, which enhance the average received power and further increase the channel capacity. In contrast to MISO and SIMO, MIMO breaks a multipath channel into several individual spatial channels, providing an additional spatial dimension for communication and yielding a spatial degree-of-freedom gain. These additional spatial degrees of freedom can be exploited by spatially multiplexing several data streams onto the MIMO channel, leading to an increase in capacity [1], [2].

In theory, MIMO can break a multipath channel into several uncorrelated sub-channels and transmit multiple data streams simultaneously through the channel to obtain high channel capacity, because it provides more spatial degrees of freedom [1], [2]. However, MIMO capacity is largely affected by the number of paths and by the correlation between sub-channels in realistic environments, both of which affect the spatial degrees of freedom MIMO provides. It is well known that a very complicated environment can offer many paths and little correlation between sub-channels, yielding high MIMO capacity, while the opposite holds in a very simple environment. The question is whether a communication environment is sufficiently complicated to support the spatial degrees of freedom provided by MIMO.

The spatial degrees of freedom depend not only on environmental complexity but also on the number of antennas at both the transmitter and the receiver. In other words, to obtain the necessary MIMO capacity, environmental complexity and suitable antenna deployment have to be treated together. Hence, how to deploy a suitable number of antennas at both transmitter and receiver in the corresponding environment is an important research topic. Up to now, most research on MIMO capacity has been based on symmetric deployment, in which the number of transmitting antennas equals the number of receiving antennas. However, asymmetric deployment, in which these numbers differ, is worth investigating, because the capacity of asymmetric deployment is better than that of symmetric deployment under some conditions.

For wireless communication systems, two main sources of performance degradation are the thermal noise present in the channel or generated in the receiver, and unwanted signals emanating from the same or nearby stations. Co-channel interference (CCI) is one of these unwanted signals; it appears due to frequency reuse in wireless channels. CCI reduction has been studied and used in a limited form in wireless networks for many years [1], [2], [3]. The use of
978-1-4673-9354-6/15/$31.00 ©2015 IEEE
directional antennas and antenna arrays has long been recognized as an effective technique for reducing CCI, since the differentiation between the spatial signatures of the desired signal and the CCI signals can be exploited to reduce the interference when multiple antennas are used. Although using multiple antennas (SIMO and MISO) to reduce CCI has been proven effective in much of the literature, it is not certain that MIMO can be used to reduce CCI as well. Furthermore, how CCI affects capacity is still largely an unsolved problem [3], and the corresponding literature is scarce. As a result, it is worth investigating whether MIMO can effectively reduce CCI in channel capacity calculations.

The 60 GHz band provides 7 GHz of unlicensed spectrum with the potential to support wireless communication systems with multi-Gbps throughput as part of the fifth-generation (5G) system [4]. For wireless communication systems, CCI [5] is one of the unwanted signals, and it appears due to frequency reuse in wireless channels. Reference [6] investigated the MIMO capacity of WLAN systems with a single CCI in the 60 GHz band. However, to our knowledge, a comparative study of the performance of uniform linear, uniform rectangular and uniform cube arrays applied to channel capacity analysis has not yet been carried out for MIMO-WLAN systems. In this paper, we compare these three types of antenna arrays at the transmitter and calculate their corresponding channel capacities over several paths in the indoor environment.

The received signal of the MIMO-NB system with single CCI sketched in Fig. 1 is

Y = Hd Xd + Hi Xi + W    (1)

where the element hxy of the channel matrix denotes the complex channel gain from the yth transmitting antenna to the xth receiving antenna, and the corresponding element of the interference channel matrix denotes the complex channel gain from the ith interference antenna to the xth receiving antenna.

A matrix representation of the MIMO-NB system with single CCI is shown in Fig. 2. In this figure, the desired signal can still be fed into several sub-channels by the corresponding signal processing and SVD. However, the signal processing and SVD are useless for the interference signal, since the interference channel matrix Hi is unknown to the receiver. As a result, the contribution of the interference can be expressed as U* Hi Xi. Finally, an equivalent architecture of the MIMO-NB system with single CCI based on CSI-B is shown in Fig. 3. It is seen that the received signal vector Ŷ can be seen as the combination of the desired signal vector Xd propagating through several sub-channels, plus the contribution of the interference Xi and the zero-mean additive white Gaussian noise vector Ŵ. The system can be expressed as follows:

Ŷ = D Xd + S Xi + Ŵ    (3)

where S = U* Hi denotes an Nr × Ni equivalent channel matrix.

Figure 1. A sketch of MIMO-NB system with single CCI (the original figure shows Xd (Nd × 1) passing through Hd (Nr × Nd) with noise W (Nr × 1), and the equivalent diagonalized system Ŷ = D Xd + Ŵ).
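For intuition, the narrowband capacity that such comparisons rest on can be sketched for a 2×2 channel with equal power allocation, using C = log2 det(I + (SNR/Nt) H Hᴴ). This stdlib-only example is an illustration of that formula, not the authors' simulation code, and the channel entries are made up.

```python
import math

def capacity_2x2(H, snr):
    """Capacity C = log2 det(I + (snr/Nt) H H^H) of a 2x2 MIMO channel
    with equal power allocation over Nt = 2 transmit antennas.
    H is a 2x2 list of (possibly complex) channel gains h_xy."""
    g = snr / 2.0
    # Entries of the Hermitian matrix A = H H^H
    a = sum(abs(H[0][k]) ** 2 for k in range(2))
    d = sum(abs(H[1][k]) ** 2 for k in range(2))
    b = sum(H[0][k] * H[1][k].conjugate() for k in range(2))
    # det(I + g A) for a 2x2 Hermitian A: (1 + g a)(1 + g d) - g^2 |b|^2
    det = (1 + g * a) * (1 + g * d) - (g ** 2) * abs(b) ** 2
    return math.log2(det)

# Two ideal, uncorrelated sub-channels at SNR = 2: capacity 2 bit/s/Hz
print(capacity_2x2([[1, 0], [0, 1]], snr=2.0))  # 2.0
```

In a full simulation H would come from the indoor ray-traced paths, and the interference term of (3) would enter as an additional noise covariance; the determinant formula itself is unchanged.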
Figure 2. Layouts of different arrays. (a) Linear Array (b) Rectangular Array (c) Cube Array. (The indoor scenario comprises two rooms separated by a partition, with a desk and a metallic cabinet; the desired transmitter is at (1.25 m, 2 m, 1.2 m) and the CCI transmitter at (3.95 m, 2 m, 1.2 m).)

The average capacities of WLAN systems, calculated from 236 receiving locations, for linear arrays, uniform rectangular arrays and uniform cube arrays with and without a single CCI are shown in Figure 3.

Figure 3. The average capacities of WLAN systems for MIMO-Linear, MIMO-Rectangular and MIMO-Cube with and without single CCI (plotted against SNRt in dB).

VRc for linear arrays, uniform rectangular arrays, uniform cube arrays and SISO (1X1, 8X8-Linear, 8X8-Rectangular, 8X8-Cube) are shown in Figure 4.

Figure 4. VRc for MIMO-Linear, MIMO-Rectangular, MIMO-Cube and SISO (plotted against SNRt in dB).

The numerical results for the outage probability Po with and without interference are shown in Fig. 5 and Fig. 6, respectively. Here SNRt is 35 dB. Note that the average SNRr (received signal power to noise power ratio) is SNRt minus about 25 dB in our cases, i.e. about 10 dB.

ACKNOWLEDGEMENTS

REFERENCES
[1] G. D. Durgin, Space-Time Wireless Channels. New Jersey: Prentice Hall PTR, 2003.
[2] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge: Cambridge University Press, 2005.
Abstract—Developing automatic and accurate computer-aided diagnosis (CAD) systems for detecting brain disease in magnetic resonance imaging (MRI) has been of great importance in recent years. These systems help radiologists in the accurate interpretation of brain MR images and also substantially reduce the time needed for it. In this paper, a new system for abnormal brain detection is presented. The proposed method employs a multiresolution approach (the discrete wavelet transform) to extract features from the MR images. Kernel principal component analysis (KPCA) is harnessed to reduce the dimension of the features, with the goal of obtaining discriminant features. Subsequently, a version of the support vector machine (SVM) with low computational cost, called the least squares SVM (LS-SVM), is utilized to classify brain MR images as normal or abnormal. The proposed scheme is validated on a dataset of 90 images (18 normal and 72 abnormal). A 6-fold stratified cross-validation procedure is implemented, and the results of the experiments indicate that the proposed scheme outperforms other competent schemes in terms of classification accuracy with a relatively small number of features.

Keywords—Magnetic resonance imaging (MRI); Discrete wavelet transform (DWT); Kernel principal component analysis (KPCA); Least squares support vector machine (LS-SVM)

I. INTRODUCTION

Brain diseases are growing rapidly among children and adults throughout the world. According to the National Brain Tumor Foundation (NBTF) in the United States, it has been estimated that, in children, brain tumors are the reason for one-quarter of all cancer deaths [1]. In the year 2014, the World Health Organization (WHO) reported that around 250,000 people globally are diagnosed with primary brain tumors every year. Therefore, early detection of brain disease is very important. Magnetic resonance imaging (MRI) has been used as the most suitable medical imaging technique for the accurate detection of various brain diseases in recent years [2]. It is a low-risk, non-invasive method that generates high-quality images of the anatomical structures of the human brain and gives rich information about the anatomy of the soft brain tissues [3], [4]. MRI provides better contrast for different brain tissues than all other imaging modalities [5]. These advantages have established MRI as the most well-known method for brain pathology diagnosis and treatment. However, the high volume of information makes it difficult to analyze and interpret MR images. Computer-aided diagnosis (CAD) systems, which examine brain MR images with the help of image processing techniques, are currently used. CAD systems help radiologists in the accurate interpretation of brain MR images for detecting the abnormal brain. Therefore, it is necessary to develop a CAD system to increase the diagnostic capability and to reduce the time required for it. One of the most important steps in such a system is to find a set of discriminative features that can distinguish a normal brain MR image from an abnormal one. An assortment of techniques has been studied for this purpose.

Over the last decade, several studies have been carried out on brain MR image classification. The most widely used approach for feature extraction is multiresolution analysis, which decomposes the original MR image into several sub-images. These images preserve information about both low and high frequencies. The wavelet transform is one of the most important approaches for the texture analysis of an image, and various researchers have used it to extract features from MR images. Chaplot et al. [3] utilized the approximation coefficients of the two-dimensional discrete wavelet transform (2D DWT) at level-2 decomposition as features and employed self-organizing map (SOM) and support vector machine (SVM) classifiers. Maitra and Chatterjee [6] introduced the Slantlet transform (ST), an improved version of the DWT, for feature extraction and applied a back-propagation neural network (BPNN) classifier. El-Dahshan et al. [7] used the approximation coefficients of the level-3 decomposition of the 2D DWT to represent each image; principal component analysis (PCA) was employed to reduce the number of coefficients, and feed-forward back-propagation artificial neural network (FP-ANN) and k-nearest neighbor (k-NN) classifiers were used separately to detect the normal and pathological brain. In [4], [8]–[10], the researchers used the coefficients of the level-3 approximation sub-band of the 2D DWT to extract features from images and then employed PCA for feature reduction. They suggested different classifiers with various training parameter optimization approaches, namely, a feed-forward neural network (FNN) with the scaled chaotic artificial bee colony algorithm (SCABC) [9], an FNN with adaptive chaotic particle swarm optimization (ACPSO) [8], and a BPNN with the scaled conjugate gradient (SCG) [4]. Zhang et al. [10] used a kernel SVM (KSVM) classifier with four kernels, viz., linear (LIN), homogeneous polynomial (HPOL), inhomogeneous polynomial (IPOL) and Gaussian radial basis (GRB), to segregate the normal and pathological MR images; they achieved high classification accuracy with the GRB kernel. Das et al. [5] presented an efficient multiscale geometric analysis tool, the Ripplet transform (RT), for feature extraction, followed by PCA for dimensionality reduction. A less expensive SVM approach, called the least squares SVM (LS-SVM), was applied for classification, and they achieved suitable results over larger datasets. Saritha et al. [11] suggested combined wavelet-entropy-based spider web plots (SWP) to extract features. The entropy values were calculated
Fig. 1. Block diagram of the proposed scheme for detection of abnormal brain
for the approximation sub-bands of the level-8 decomposition of the Daubechies-4 wavelet. Finally, a probabilistic neural network (PNN) was applied for classification. Zhang et al. [12] used Shannon entropy (SE) and Tsallis entropy (TE) to obtain features from the discrete wavelet packet transform (DWPT) coefficients and suggested a generalized eigenvalue proximal SVM (GEPSVM) classifier. Zhou et al. [13] achieved a classification accuracy of 92.60% by using wavelet entropy values as the features for each image; they applied a Naive Bayes classifier (NBC) to determine the normal and abnormal brain. Zhang et al. [14] obtained 82.69% accuracy using an SVM classifier; to get the features, they utilized the wavelet-energy values of all the detail sub-bands of the level-2 decomposition.

The literature review reveals different existing schemes for abnormal brain detection. Most of the schemes are not able to achieve high classification accuracy. It has been observed that the dimension of the feature space is relatively high in many cases, which may degrade performance. When the extracted features have more complicated structures and cannot be well represented in a linear subspace, PCA is not helpful for dimension reduction. Hence, there is a need for a new technique for nonlinear dimensionality reduction. Moreover, PCA requires a high computational cost for eigenvalue decomposition when the number of features exceeds the number of images. To address these issues, we have utilized the coefficients of the approximation sub-band of the 2D DWT for feature extraction. A kernel PCA (KPCA) approach is employed to handle the nonlinear feature values and to reduce the computational cost. To make the system more robust and computationally efficient, LS-SVM is used to distinguish the abnormal brain from the normal one. The proposed method is tested on a dataset of 90 images, and the experimental results indicate that the scheme is superior to its competent schemes. The remainder of this paper is organized as follows. Section II deals with the working procedure of the proposed method. In Section III, the simulation results and comparisons are portrayed. Finally, Section IV gives the concluding remarks.

II. PROPOSED METHOD

The proposed method includes three important phases, namely, feature extraction, feature dimensionality reduction and classification. The overall block diagram of the proposed scheme is shown in Fig. 1. All the phases of the scheme are portrayed below in detail.

A. Feature Extraction using a multiresolution technique

The proposed scheme uses a popular multiresolution technique, the DWT, to extract features from the brain MR images. The wavelet transform has proven to be a powerful mathematical tool for feature extraction [15]. Compared to other transformation techniques, the wavelet transform provides the time-frequency localization of an image, which is very important for classification.

A 2D DWT is implemented using low-pass and high-pass filters and down-samplers. In the case of images, the DWT is applied to each dimension individually, which results in four sub-band images (LL, LH, HL, HH) at each level. Among them, the three sub-band images LH (low-high), HL (high-low) and HH (high-high) are the detail (high-frequency) components in the horizontal, vertical and diagonal directions, respectively. The LL (low-low) sub-band image is the approximation (low-pass) component, which is used for the next level of 2D DWT computation [4]. Fig. 2 illustrates the wavelet decomposition of a normal brain MR image up to three resolution levels. In this study, we have utilized the coefficients of the approximation sub-band of the level-3 decomposition (LL3) of the Daubechies-4 wavelet to extract features. Daubechies-4 provides better resolution for smoothly varying signals, as in the case of MR images of the brain; we have therefore selected the Daubechies-4 wavelet, which gives better classification accuracy. The coefficients of the LL3 sub-band are arranged in row-major order to generate a feature vector, and a feature matrix is created by combining the vectors corresponding to all brain MR images. The extracted features are normalized before being passed to KPCA. A feature z is normalized to zn using the following formula:

zn = (z − μ) / σ    (1)

where μ and σ are the mean and standard deviation of the features, respectively. The normalized feature vectors are then sent to the next phase.

B. Feature Reduction using KPCA

The size of the feature space becomes large if the approximation coefficients are used directly as features, and not all the features are relevant for classification. Hence, to make the classification task feasible, the dimensionality of the feature vector needs to be significantly reduced, and informative features need to be extracted. PCA is often used for this purpose [16]. However, PCA only allows linear dimensionality reduction, and it does not perform well on high-dimensional features having complicated structures. Therefore,
Fig. 2. A normal brain MR image and its wavelet decomposition at three resolution levels
a non-linear form of PCA, called kernel PCA (KPCA), is employed in this paper for the dimensionality reduction of the features. Additionally, KPCA is computationally more efficient than conventional PCA when the size of the feature space is greater than the number of samples [17], [18].

Consider a dataset {xk ∈ X} of N observations, and let K be the kernel (Gram) matrix with entries Kij = φ(xi)·φ(xj). Here, K is a symmetric and positive semidefinite matrix. We then normalize the eigenvectors β to ensure that the corresponding eigenvectors V are orthonormal. The resulting kth principal component of a test sample x is calculated using

yk(x) = Vᵏ·φ(x) = Σᵢ₌₁ᴺ βᵢᵏ (φ(xi)·φ(x))
Fig. 3. Sample of brain MR images with (a) Normal brain, (b) brain tumor, (c) stroke, (d) degenerative disease, (e) infectious disease
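The KPCA projection used here can be sketched in NumPy. This is an illustrative toy (synthetic data, a reduced feature size, and a generic polynomial kernel), not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 40))      # 90 samples, toy feature size

# Polynomial kernel matrix K (symmetric, positive semidefinite).
K = (X @ X.T + 1.0) ** 2
N = K.shape[0]

# Center K in feature space.
one = np.ones((N, N)) / N
Kc = K - one @ K - K @ one + one @ K @ one

# Eigenvectors beta of the centered kernel matrix; scale each one so the
# corresponding feature-space eigenvector V has unit norm.
vals, vecs = np.linalg.eigh(Kc)
order = np.argsort(vals)[::-1][:7]          # keep the top 7 components
beta = vecs[:, order] / np.sqrt(vals[order])

# k-th principal component of each training sample:
# y_k(x) = sum_i beta_i^k (phi(x_i) . phi(x)), i.e. Kc @ beta here.
Y = Kc @ beta
print(Y.shape)    # (90, 7)
```

The 90-by-7 output mirrors the paper's setup of reducing each image's features to 7 principal components.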
TABLE III. DIFFERENT CLASSIFICATION PERFORMANCE MEASURES
TP (True Positive): correctly classified positive cases; TN (True Negative): correctly classified negative cases; FN (False Negative): incorrectly classified positive cases; FP (False Positive): incorrectly classified negative cases

TABLE VI. CONFUSION MATRIX FOR 'LS-SVM + RBF' CLASSIFIER
2) Results and Discussion: The proposed method utilizes the coefficients of the LL3 sub-band as the primary features of each MR image. However, the size of the feature space is 1444 for the Daubechies-4 wavelet, which is quite large for computation. Thus, the KPCA approach is used to reduce the dimension of the features to only 7. These reduced features are the first 7 principal components (PCs), which are only 0.48% of the primary features. The polynomial kernel has been selected to calculate the kernel matrix in the KPCA method. The performance of the method is tested with different numbers of principal components to find out the required number of features. It has been observed that the proposed system works efficiently with 7 PCs on the given dataset. The LS-SVM classifier has been trained with the three kernels (linear, polynomial, and RBF). To estimate the optimal values of the parameters, viz., θ, σ and ζ, various pairs of (ζ, θ) and (ζ, σ) are tested and finally the pair with the lowest error rate is chosen to train the classifier. The confusion matrices for the linear, polynomial, and RBF kernels are illustrated in Tables IV, V, and VI, respectively.

The sensitivity obtained with the linear, polynomial and RBF kernels is 93.06%, 98.61% and 100%, respectively. We have also compared the ROC curves obtained by the LS-SVM classifier with the three kernels, shown in Fig. 4. Table VIII presents the classification performance comparison of our proposed method with the existing schemes. It is observed that the suggested scheme is superior to its competing schemes while requiring relatively fewer features.

TABLE VII. PERFORMANCE METRICS FOR THREE CLASSIFIERS

Classifier          Sensitivity (%)  Specificity (%)  ACC (%)  AUC
LS-SVM+Linear       93.06            100              94.44    0.965
LS-SVM+Polynomial   98.61            100              98.89    0.986
LS-SVM+RBF          100              100              100      1
TABLE IV. CONFUSION MATRIX FOR 'LS-SVM + LINEAR' CLASSIFIER

                                   Output (predicted) class
                                   Abnormal (positive)  Normal (negative)
Target class  Abnormal (positive)  67                   5
              Normal (negative)    0                    18

Fig. 4. ROC curves (1−Specificity vs. sensitivity) for the LS-SVM classifier with the linear, polynomial, and RBF kernels

TABLE V. CONFUSION MATRIX FOR 'LS-SVM + POLYNOMIAL' CLASSIFIER
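The reported metrics can be reproduced directly from the linear-kernel confusion matrix (67 true positives, 5 false negatives, 0 false positives, 18 true negatives); a quick Python check:

```python
# Worked check of the 'LS-SVM + Linear' confusion matrix against the
# metrics reported in Table VII.
TP, FN = 67, 5    # abnormal cases: correctly / incorrectly classified
FP, TN = 0, 18    # normal cases: incorrectly / correctly classified

sensitivity = 100 * TP / (TP + FN)                    # 93.06 %
specificity = 100 * TN / (TN + FP)                    # 100 %
accuracy    = 100 * (TP + TN) / (TP + TN + FP + FN)   # 94.44 %
print(round(sensitivity, 2), round(specificity, 2), round(accuracy, 2))
```

These match the LS-SVM+Linear row of Table VII exactly.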
TABLE VIII. CLASSIFICATION PERFORMANCE COMPARISON OF THE PROPOSED METHOD WITH EXISTING SCHEMES

References                   Total images  Feature extraction  Feature reduction  Classifier          Features  ACC (%)
Chaplot et al., 2006 [3]     52            DWT                 —                  SVM with RBF        4761      98
El-Dahshan et al., 2010 [7]  70            DWT                 PCA                FP-ANN              7         97
El-Dahshan et al., 2010 [7]  70            DWT                 PCA                k-NN                7         98.6
Zhang et al., 2011 [9]       66            DWT                 PCA                FNN with SCABC      19        100
Zhang et al., 2011 [4]       66            DWT                 PCA                FNN with SCG        19        100
Zhou et al., 2015 [13]       64            Wavelet entropy     —                  Naive Bayes         7         92.60
Zhang et al., 2015 [14]      66            Wavelet energy      —                  SVM                 6         82.69
Proposed method              90            DWT                 KPCA               LS-SVM+Linear       7         94.44
                                                                                  LS-SVM+Polynomial   7         98.89
                                                                                  LS-SVM+RBF          7         100
IV. CONCLUSION

… accurate CAD system for brain MR image classification. The classification accuracies of LS-SVM with respect to the linear, polynomial, and RBF kernels are 94.44%, 98.89% and 100%, respectively. The results show the efficacy of the suggested scheme with considerably fewer features as compared to other schemes. Though the feature reduction technique and the classifier used in this paper are less expensive than the schemes proposed in the literature, the feature extraction step is more time consuming. The proposed work can be experimented with larger datasets, and the performance of the feature extraction stage can be enhanced using other advanced transformation techniques.

REFERENCES

[1] E. A. El-Dahshan, H. M. Mohsen, K. Revett, and A. B. M. Salem, "Computer-aided diagnosis of human brain tumor through MRI: A survey and a new algorithm," Expert Systems with Applications, vol. 41, no. 11, pp. 5526–5545, 2014.
[2] C. Westbrook, Handbook of MRI Technique. John Wiley & Sons, 2014.
[3] S. Chaplot, L. M. Patnaik, and N. R. Jagannathan, "Classification of magnetic resonance brain images using wavelets as input to support vector machine and neural network," Biomedical Signal Processing and Control, vol. 1, no. 1, pp. 86–92, 2006.
[4] Y. Zhang, Z. Dong, L. Wu, and S. Wang, "A hybrid method for MRI brain image classification," Expert Systems with Applications, vol. 38, no. 8, pp. 10049–10053, 2011.
[5] S. Das, M. Chowdhury, and K. Kundu, "Brain MR image classification using multiscale geometric analysis of ripplet," Progress In Electromagnetics Research, vol. 137, pp. 1–17, 2013.
[6] M. Maitra and A. Chatterjee, "A Slantlet transform based intelligent system for magnetic resonance brain image classification," Biomedical Signal Processing and Control, vol. 1, no. 4, pp. 299–306, 2006.
[7] E.-S. A. El-Dahshan, T. Honsy, and A.-B. M. Salem, "Hybrid intelligent techniques for MRI brain images classification," Digital Signal Processing, vol. 20, no. 2, pp. 433–441, 2010.
[8] Y. Zhang, S. Wang, and L. Wu, "A novel method for magnetic resonance brain image classification based on adaptive chaotic PSO," Progress In Electromagnetics Research, vol. 109, pp. 325–343, 2010.
[9] Y. Zhang, L. Wu, and S. Wang, "Magnetic resonance brain image classification by an improved artificial bee colony algorithm," Progress In Electromagnetics Research, vol. 116, pp. 65–79, 2011.
[10] Y. Zhang and L. Wu, "An MR brain images classifier via principal component analysis and kernel support vector machine," Progress In Electromagnetics Research, vol. 130, pp. 369–388, 2012.
[11] M. Saritha, K. P. Joseph, and A. T. Mathew, "Classification of MRI brain images using combined wavelet entropy based spider web plots and probabilistic neural network," Pattern Recognition Letters, vol. 34, no. 16, pp. 2151–2156, 2013.
[12] Y. Zhang, Z. Dong, S. Wang, G. Ji, and J. Yang, "Preclinical diagnosis of MR brain images via discrete wavelet packet transform with Tsallis entropy and generalized eigenvalue proximal support vector machine (GEPSVM)," Entropy, vol. 17, no. 4, pp. 1795–1813, 2015.
[13] X. Zhou, S. Wang, W. Xu, G. Ji, P. Phillips, P. Sun, and Y. Zhang, "Detection of pathological brain in MRI scanning based on wavelet-entropy and naive Bayes classifier," in Bioinformatics and Biomedical Engineering, 2015, pp. 201–209.
[14] G. Zhang, Q. Wang, C. Feng, E. Lee, G. Ji, S. Wang, Y. Zhang, and J. Yan, "Automated classification of brain MR images using wavelet-energy and support vector machines," in 2015 International Conference on Mechatronics, Electronic, Industrial and Control Engineering (MEIC-15), 2015.
[15] S. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 1999.
[16] S. Theodoridis and K. Koutroumbas, Pattern Recognition. Academic Press, 2008.
[17] B. Schölkopf, A. Smola, and K.-R. Müller, "Kernel principal component analysis," in Artificial Neural Networks - ICANN'97, 1997, pp. 583–588.
[18] B. Schölkopf, A. Smola, and K.-R. Müller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Computation, vol. 10, no. 5, pp. 1299–1319, 1998.
[19] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[20] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, vol. 9, no. 3, pp. 293–300, 1999.
[21] "Harvard Medical School data," http://www.med.harvard.edu/AANLIB/.
Introducing FERPS: A Framework for Enterprise
Resource Planning Selection
ON Singh UG Singh
School of Management, Information Systems and School of Management, Information Systems and
Governance Governance
University of Kwa Zulu-Natal University of Kwa Zulu-Natal
Durban, South Africa Durban, South Africa
omesh.narain.singh@gmail.com singhup@ukzn.ac.za
Abstract—Enterprise Resource Planning (ERP) systems, over the years, have evolved into integral business systems in both medium and large organizations. The strength of an ERP is that it is a computerized transactional information system with a centralized data repository. This allows for significant data availability and seamless collaboration between business functions. Realizing the associated benefits through an ERP implementation is challenging, as experienced by a South African water utility. On the establishment of the utility, the failure to implement an ERP system successfully forced the organization to critically review the position it found itself in. This study identifies some of the critical shortfalls in the initial ERP implementation, and introduces the process of the creation of an ERP evaluation framework, named FERPS, that assists in ERP evaluation and selection prior to acquisition. Ultimately, FERPS assists in providing an organization with evaluation results of ERP systems to support the implementation of an optimal ERP system, based on organizational fit.

Keywords—evaluation framework; enterprise resource planning (ERP) systems; ERP selection; ERP modules; ERP procurement

I. INTRODUCTION

With the implementation of ERP systems, companies are able to better manage their resources, standardise business processes across the organisation and assist management with real-time financial and production information [9]. There are many commercial off-the-shelf (COTS) software packages available in the market that organisations can configure to their specific needs, with inherent benefits such as outsourced system maintenance, support and system improvements. Some of these COTS software packages include Oracle, SAP, BAAN, MS Dynamics, etc. Organisational reasons for implementing ERP systems may vary from replacing legacy systems to transforming business operations [15].

When considering the public sector, citizens in many countries expect their administration to provide quality services that are adapted to the technological environment at the lowest cost. This expectation is in response to business practices in the private sector, where evolved business methods, techniques and practices appear promising, more so in the area of information systems and information technology [16]. Researchers argue that IT management in the public sector is "more of the same" when compared to the private sector, whilst others view the public sector as "a whole new ball game" [17].

The water utility under study was established in 2004, and the following business units were created: strategy and leadership, customer services, engineering, operations, environmental services, human resources, corporate services, finance and technology. In order to operate a sound business with strong governance principles, it was imperative for this water utility to implement financial systems to account for business transactions. Thus, the utility implemented an ERP system, where the acquisition process was out-sourced to a consulting company. At the time the utility lacked the resources and technically skilled staff to champion this project.

The background of this particular water utility is that it is responsible for the provisioning of water, waste water management and other environmental services. These services include the supply of water from the source to the consumer, and from sewer connection to waste water treatment and, finally, disposal into the environment. These water services are provided to a range of consumers including community households, industries, businesses, and government institutions. The highest governance structure of a water utility is usually composed of a board of directors. The management of this particular water utility is divided between two committees: the leadership committee (Leadco), which is responsible for executing the overall business strategy set by the board of directors, and a management committee (Manco), whose responsibility it is to manage the operational activities of the utility.

However, the ERP acquisition by the consulting company resulted in an ERP implementation failure, as the system was unsuccessful in satisfying the utility's operational and strategic objectives. A post-implementation review of the ERP implementation identified two major shortcomings:
Following this unsuccessful implementation of an ERP system at this water utility, the researcher set out to identify the criteria that are essential to evaluate an ERP system, prior to implementation, to ensure a 'best fit' in the adoption process. The developers and suppliers of ERP systems hold the perspective that a plain vanilla approach results in a "one size fits all" solution. This perception arises from the argument that the systems operate on the most widely accepted best practice [10], [13]. This proved incorrect in the case of the utility.

This research reports on the process of identification, as well as the criteria identified, which have been incorporated into FERPS (Framework for Evaluating an Enterprise Resource Planning System).

II. LITERATURE REVIEW

Enterprise Resource Planning systems are computer-based information systems that have evolved over the years to process organisational transactions and facilitate integrated real-time production planning, customer response and reporting [7].

Over the years the largest investment in information systems software has been in ERP systems [8]. With the implementation of ERP systems, companies are able to better manage their resources, standardise business processes across the organisation and assist management with real-time financial and production information [9].

Essentially, an Enterprise Resource Planning (ERP) system is computer-based software that consists of various modules that relate to specific business functions, with transactional information stored in a central database. This allows for significant data availability and collaboration between business functions.

Although ERP systems bring inherent competitive advantages, there also exists a high risk of failure. About 70% of ERP implementations fail to deliver the expected benefits and three quarters of ERP implementations are unsuccessful. On average these projects were 178% over …

Hence, through this study, the researchers will address this gap and add to the body of literature on the acquisition of ERP systems.

III. RESEARCH DESIGN AND METHODOLOGY

This research followed an exploratory approach to identifying the criteria that are essential for inclusion in a framework that can be used to evaluate ERP system suitability prior to purchase.

Since the current ERP implementation process was regarded as a 'failure', as the current system was not fully in use in the utility, senior management at the utility initiated this 'criteria determination process', based on commercial off-the-shelf (COTS) ERP modules.

The newly appointed IT manager was appointed as the project champion and was tasked to project-manage the data collection process. The key categories that were identified by the IT manager for data collection included the processes of ERP evaluation framework development, ERP evaluation, ERP selection and ERP implementation.

Data was collected using non-probability purposive sampling. Key role players were identified based on their field of expertise and functional responsibilities in the utility. These participants included the chief financial officer (CFO), income accountant, expenditure accountant, procurement officer, asset officer and budget officer of the utility.

Six semi-structured interviews were conducted to determine what modules and scope were essential to the water utility.

Thereafter, a second round of interviews was conducted. These purposive in-depth interviews were conducted with finance departmental heads and key staff. The purpose was to collect data on the actual business processes of the utility. Interviewees were asked specific questions regarding the modules applicable to their job functions and processes.

The information from both these sets of interviews was qualitative in nature. Hence, narrative analysis was conducted and responses were grouped into themes. Essentially, these themes formed the categories of FERPS.

Subsequent narrative analysis of the same data, once again through thematic representation, identified the individual criteria essential in each category.

These categories and criteria formed the basis of the modular framework FERPS. Among the criteria identified were:

Request for quotation (RFQ)
Purchase order
Emergency orders
Receiving
Goods returns note
Vendor master file
Inventory
Inventory master file
Inventory reorder
Dispatch
Stock takes
Reporting requirements
Acknowledgments

The authors thank the water utility, consisting of the board, leadership committee, management committee, chief financial officer and finance staff, for allowing the data collection.
References

[1] K. Ganesh et al., Enterprise Resource Planning: Fundamentals of Design and Implementation. Springer, 2014.
[2] J. W. Turton, "A manager's view of critical success factors necessary for the successful implementation of ERP," 2010.
[3] G. Thomas and S. Jajodia, "Commercial-off-the-shelf enterprise resource planning software implementations in the public sector: practical approaches for improving project success," Journal of Government Financial Management, vol. 53, no. 2, pp. 12–19, 2004.
[4] W. Al Rashid, "Managing stakeholders in enterprise resource planning (ERP) context - a proposed model of effective implementation," 2013.
Performance Evaluation of Matrix- and EXIF-Based
Video Rotation Methods in MJPEG-Based Live
Video Encoder
...
while … is the angular offset of the camera from the normal orientation of the device. For example, if a device with portrait …

Using the created rotation matrix, the second step is mainly pixel multiplication. One by one, pixels are multiplied by the rotation matrix, thus creating a rotated image. Consequently, the complexity of matrix-based rotation is O(width × height), which is approximately the O(n²) complexity class. It is reasonable to expect that this algorithm's performance will degrade as video resolution grows. However, the alterations made to the pixels are permanent and, because of that, this kind of rotation is supported by all relevant browsers and image viewers.

Fig. 4. Delivery of MJPEG-based live video stream over HTTP

Operation           EXIF orientation value
Normal              1
Flip horizontal     2
Rotate for 180°     3
Flip vertical       4
Transpose           5
Rotate for 90°      6
Transverse          7
Rotate for 270°     8
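To make the O(width × height) cost concrete, here is a toy Python sketch (not the paper's code) that remaps every pixel once for a 90° clockwise rotation, so the work grows directly with the frame resolution:

```python
import numpy as np

def rotate90(img):
    """Rotate a 2-D frame 90 degrees clockwise by remapping each pixel."""
    h, w = img.shape
    out = np.empty((w, h), dtype=img.dtype)
    for y in range(h):              # every pixel is visited exactly once,
        for x in range(w):          # hence O(width * height) total work
            out[x, h - 1 - y] = img[y, x]
    return out

img = np.arange(6).reshape(2, 3)    # tiny 2x3 toy "frame"
print(rotate90(img).shape)          # (3, 2)
```

Doubling both frame dimensions quadruples the number of pixel assignments, which is the performance degradation with resolution growth that the text describes.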
B. EXIF Tagging

EXIF (Exchangeable Image File Format) metadata [20] is extra information about the image, such as the date and time when the image was taken, the model and manufacturer of the camera being used, the image resolution, etc. EXIF metadata is appended to the image data as a file header, as shown in Fig. 5. Through the orientation attribute, EXIF metadata can store information about the image orientation, …

V. ANDROID-POWERED IMPLEMENTATION OF MJPEG-BASED LIVE VIDEO ENCODER

To compare the performance of the two video rotation methods on mobile devices, we developed a simple MJPEG video encoder based on the Android operating system. The encoder performs the algorithm described in Fig. 3. The goal of the experiment was to determine the extent to which the rotation procedures reduce the frame rate of the produced live video.
…which takes frames in the range of 10–30 video frames per second (fps). The video rotation techniques have been analyzed and compared with regard to rotation angle and lighting conditions. The results show that video rotation speed is not correlated with the rotation angle for either technique. Therefore, further discussion is focused on lighting conditions.

A. Performance Limits of the Camera

Android cameras use an adaptive frame rate that depends on the lighting conditions and is not under the direct control of an application developer. The best the developer can do is to specify the preferred frame rate range among the supported ones. Therefore, in our first experiment, we examine the performance of the camera without the encoder being attached. The goal is to determine the maximum speed at which the camera is able to deliver raw frames to the encoder under different lighting conditions. The results are shown in Fig. 7a.

We confirmed there is a difference in the frequency at which the camera captures frames with regard to the amount of light entering the camera lens. Camera speeds under all three lighting conditions remain steady as resolution grows because camera speed is a fixed hardware characteristic which has nothing to do with image size. However, the speed in daylight is almost twice that under interior and street lighting. This suggests that the camera adjusts its speed in a way which does not affect the human eye drastically. In the poor …

public void onPreviewFrame(byte[] yuvData, Camera camera) {
    int rotationAngle = calculateRotationAngle();
    ByteArrayOutputStream jpeg = new ByteArrayOutputStream();
    if (rotationType == MATRIX) {
        Bitmap rotBmp = rotateMatrix(yuvData, rotationAngle);
        rotBmp.compress(Bitmap.CompressFormat.JPEG,
                        MJPEG_QUALITY, jpeg);
    } else {
        YuvImage yuv = new YuvImage(yuvData, ImageFormat.NV21,
                                    W, H, null);
        yuv.compressToJpeg(new Rect(0, 0, W, H),
                           MJPEG_QUALITY, jpeg);
        rotateExif(jpeg, rotationAngle);
    }
    sendToClient(jpeg);
}

private Bitmap rotateMatrix(byte[] yuvData, int rotationAngle) {
    byte[] rgbData = convertYuvToRgb(yuvData);
    Bitmap bmp = Bitmap.createBitmap(rgbData, 0, 0, W, H, null);
    Matrix rotationMatrix = new Matrix();
    rotationMatrix.postRotate(rotationAngle);
    Bitmap rotBmp = Bitmap.createBitmap(bmp, 0, 0, W, H,
                                        rotationMatrix, false);
    return rotBmp;
}

private void rotateExif(ByteArrayOutputStream jpeg, int rotationAngle) {
    // Android's EXIF library manages EXIF metadata through a file system
    writeToTempFile(jpeg, EXIF_PATH);
    ExifInterface exif = new ExifInterface(EXIF_PATH);
    exif.setAttribute(ExifInterface.TAG_ORIENTATION,
                      String.valueOf(rotationAngle));
    exif.saveAttributes();
    readFromTempFile(EXIF_PATH, jpeg);
}

Fig. 6. Implementation of the MJPEG live video encoder using the Android camera API and ExifInterface library
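In contrast to the per-pixel path, the EXIF-based path changes only a metadata tag. As a toy illustration in plain Python (invented helper name; orientation codes taken from the table in the text, where 1 = normal, 3 = 180°, 6 = 90°, 8 = 270°), the whole "rotation" is a constant-time dictionary update:

```python
# Sketch of EXIF-style rotation: pixels are left untouched and only an
# orientation code is recorded, so the cost is O(1) per frame.
EXIF_ORIENTATION = {0: 1, 180: 3, 90: 6, 270: 8}

def tag_rotation(metadata, angle_deg):
    """Record the rotation in metadata, leaving the image data intact."""
    metadata["Orientation"] = EXIF_ORIENTATION[angle_deg % 360]
    return metadata

meta = tag_rotation({}, 90)
print(meta)   # {'Orientation': 6}
```

The trade-off mirrors the text: tagging is nearly free, but it relies on the viewer honoring the orientation attribute, whereas matrix rotation bakes the result into the pixels.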
The MJPEG encoder is implemented using the Android camera API [21] and the Android EXIF library [22], as shown in Fig. 6. Raw frame data are delivered to the encoder using the onPreviewFrame() callback method of the Camera.PreviewCallback class. This method is called autonomously by the Android framework whenever the camera has fresh frame data to be delivered to the application. The programmer can then manipulate the frame data by implementing the internals of the callback method. Within the callback, we differentiate between the two video rotation methods and invoke the appropriate rotation routine and image processing libraries, respectively.

Fig. 7. Performance of the camera and MJPEG encoder when no video rotation is required: a) performance of the camera without running the encoder; b) MJPEG encoder performance during JPEG compression

VI. PERFORMANCE EVALUATION OF VIDEO ROTATION TECHNIQUES

Performance evaluation of the video rotation techniques has been done on an Android-powered device with a 5 Mpx camera
resolution growth, as shown in Fig. 7b. However, for small resolutions, the encoder is able to perform the compression fast enough that the performance loss is not visible to the user. This means that the encoder finishes processing a frame before the next frame arrives from the camera. For larger resolutions, we notice a proportional decline in performance, since the compression becomes the dominant resource-consuming process and starts to slow down the encoder.

Since the camera delivers frames faster as lighting conditions improve, the turning point at which the encoder starts to reduce the frame rate appears earlier in daylight than under interior and street lighting. Also, once the turning point has been reached, we can notice slight variations in encoder performance for the same resolution but different lighting conditions. Street lighting speed is the fastest and daylight speed is the slowest. The reason for this is the differing potential for pixel arrangement coding under different lighting conditions, which makes images taken during the day much less compressible than the others, because they have more details and shades. In other words, they contain many more distinct pixels than darker images.
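The "turning point" described above can be summarized with a simple model: the delivered frame rate is bounded by both the camera's rate and the encoder's per-frame processing throughput. The numbers below are invented purely for illustration:

```python
# Toy model of the encoder turning point: delivered fps is limited by
# whichever is slower, the camera or the encoder.
def delivered_fps(camera_fps, encode_time_s):
    """Frame rate actually produced, given per-frame encoding time."""
    return min(camera_fps, 1.0 / encode_time_s)

# Below the turning point the encoder keeps up with the camera...
print(delivered_fps(30, 0.010))   # 30.0
# ...past it, compression dominates and the frame rate drops.
print(delivered_fps(30, 0.050))   # 20.0
```

Better lighting raises camera_fps, which is why the turning point is reached at a smaller resolution in daylight than under dimmer conditions.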
S Gunness UG Singh
Centre for Innovative and Lifelong Learning School of Management, Information Systems and
University of Mauritius Governance
Ebene, Mauritius University of Kwa Zulu-Natal
sandhya.gunness@uom.ac.mu Durban, South Africa
singhup@ukzn.ac.za
Abstract—This paper attempts to understand the broader implications of integrating learning analytics in e-learning systems. This is also the commencement of a collaborative effort for developing research bases around learning analytics between the University of Mauritius and the University of KwaZulu-Natal. The focus on e-assessments is deliberate: the tendency in both our universities to appraise student learning as summative and quantitative does not reflect contemporary workplace and scholarly needs, where learning processes are more valuable than outcomes. The necessity for epistemological rethinking and developing systems based on contextual requirements is important, whether it is related to curriculum development, workshop organisation or even conducting a class. After a survey of the literature on the potentials and hypes around learning analytics, we look towards the literature linking learning analytics, data mining and e-assessment for educators to interface more effectively and efficiently with the LMS platform. We conclude this concept-paper with a workshop outline which can help educational technologists organise practical, sustainable and meaningful hands-on sessions with educators.

Keywords—learning analytics, big data, data mining, e-assessments, higher order thinking skills (HOTS)

INTRODUCTION

The University of Mauritius (UoM) falls under the Applying stage of technology integration. At this stage, teaching still takes priority in the classroom. However, computer-based learning, through Learning Management Systems (LMS), is integrated into regular teaching, and thus the users (lecturers and students) become more confident in adopting specialized technology-based tools to teach and learn their subject fields.

The Centre for Innovative and Lifelong Learning (CILL), responsible for the academic induction of e-learning integration at the UoM, has adopted the Moodle LMS as its e-learning platform. It has been observed that most academics at the UoM adopt e-learning services for communication and the provision of supplementary reading material. Essentially, Moodle is used by academics as a typical placeholder for lecture material. Academics are clearly not incorporating any of the interactive features available in Moodle. In addition to the Quizzes tool which Moodle provides for electronic assessment (e-assessment), the latest Moodle 2.8 version provides tools for assessing the performance of students, which is usually grounded in analysis of the data gathered during interactive online sessions. This paper aims at investigating how to introduce educators to adopt these tools to enhance their teaching practices, in particular, to use learning analytics for appraising metacognitive skills in e-assessments. After a survey of the literature on the potentials and hypes around learning analytics, we look towards the literature linking learning analytics, data mining and e-assessment for educators to interface more effectively and efficiently with the LMS platform. We conclude this discussion-paper with a workshop outline which can help educational technologists organise practical, sustainable and meaningful hands-on sessions with educators. The aim of this workshop is to familiarise academics with learning analytics for e-assessments that go beyond the simple recall of knowledge, and that are geared towards equipping learners with a disposition towards lifelong learning.

BIG DATA FOR EDUCATION

Traditionally, teaching pedagogies provide limited immediate feedback to students. Thus, educators often have to spend long hours marking regular assignments. As a result, they lack the enthusiasm to guide students on how to improve their comprehension skills. Furthermore, many educators fail to experiment with, and provide support through, technology-based resources that provide variety in the learning process. Adopting data-driven approaches to teaching facilitates better analysis of students' actual learning, as well as providing systematic feedback to both students and educators. The potential of "big data" provides opportunities to mine the learning patterns of students, to get a better insight into both their performance and their learning approaches [1].

Data mining is defined as the 'broad field of discovering innovative and potentially valuable information from large amounts of data' [2].

Educational Data Mining specifically refers to 'an emerging discipline, focusing on developing methods for exploring
Abstract—The advancement of the Internet, mobile and wireless technologies has produced a rapid change in information in terms of volume and accessibility. The enormous volume of information can be overwhelming, especially to mobile users, exceeding the human ability to differentiate information which is relevant from that which is irrelevant. For many years now, recommender systems have become well known, and have been studied in various domains such as e-learning, online shopping and tourism to help overcome information overload. Recommendations are produced based on users who have interests in a particular thing or item. This recommendation process was further enhanced by incorporating context, such as time, weather and location, to make recommendations more accurate and efficient. These enhanced systems are known as context-aware recommender systems. This paper presents a survey of context-aware recommender systems, the background and algorithms of context-aware recommender systems, and also discusses the open issues of context-aware recommender systems.

Keywords—Recommender System; Context; Context-aware Recommender Systems

I. INTRODUCTION

In the past, many researchers have studied context-aware computing and have also built several applications to prove its importance [1]. Context-aware computing is intended to automatically use context data so as to run the services that are suitable for a specific time, place or event. It was incorporated to improve the traditional user request-response pattern, which requires the user to express the desire for a recommendation [1].

Traditional recommendation focuses on suggesting the most important items/objects to users without considering any additional contextual information, like location, weather and time [2]. Though, in several applications such as travel packages, e-learning, etc., it might not be adequate to consider items and users only; it is essential to integrate the context information into the recommendation process so as to recommend services/items to users under particular circumstances [2]. Context-awareness is a promising technology for mobile devices, facilitating the use of the device in challenging conditions by dynamically adapting the device behaviour [3].

The main contribution of this paper is to give a survey of context-aware recommender systems and services. The remainder of this paper is as follows: Section II discusses the background of context-aware recommender systems, Section III explains context-aware systems and their components, Section IV discusses the context-aware recommender system (CARS) and the algorithm, Section V presents the application, Section VI discusses the issues and challenges, and Section VII concludes the paper.

II. BACKGROUND

A. Definition of Context

Different definitions of context exist. According to [4], Schilit and Theimer introduced the term context-awareness; they defined context as the location, objects and identities of individuals nearby. A general definition of context is given by Dey and Abowd in [5-7]; they defined "context" as any information that can be used to characterize the situation of an entity. An entity is a place or object that can be considered relevant to the interaction between an application and a user. The context information of a user consists of different attributes, like physical location, emotional state (such as "calm, distraught or angry"), physiological state (such as body temperature or heart rate), behavioural patterns, personal history, etc. [8].

B. Categories of Context

Various researchers have categorized context in different ways. This is of advantage to application designers, as it reveals the pieces of context most likely to be of benefit when designing their applications [4]. According to [5, 9], context is divided into two categories: static and dynamic. Static context seldom changes during system operation; examples include the contact list, calendar, user profile, to-do list and address book. Dynamic context is information that is highly variable, like the temperature, time, location of the user, mood of the user, etc. [5]. Work done by Schilit in [2] divided context into three categories: User Context, Physical Context and Computing Context, shown in Figure 1.
Middleware
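The static/dynamic split described above can be expressed as a simple data structure. A minimal sketch, assuming Python and invented attribute names; the mood-based pre-filter is purely illustrative and is not taken from any surveyed system:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class StaticContext:
    """Context that rarely changes during system operation."""
    user_profile: dict
    address_book: list
    calendar: list

@dataclass
class DynamicContext:
    """Highly variable context information."""
    location: tuple          # (latitude, longitude)
    temperature_c: float
    mood: str                # e.g. "calm", "distraught", "angry"
    timestamp: datetime = field(default_factory=datetime.now)

def contextual_filter(items, ctx):
    """Toy contextual pre-filter: keep only items tagged for the user's
    current mood, the simplest way context can narrow recommendations."""
    return [it for it in items if ctx.mood in it.get("moods", [])]

items = [{"name": "jazz playlist", "moods": ["calm"]},
         {"name": "metal playlist", "moods": ["angry"]}]
ctx = DynamicContext(location=(-25.7, 28.2), temperature_c=24.0, mood="calm")
filtered = contextual_filter(items, ctx)   # keeps only the "calm" item
```

A real context-aware recommender would combine many such attributes (time, weather, location) rather than a single mood tag.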
Abstract—Blood is a non-replenishable entity, the only source of which is humans. Timely availability of quality blood is a crucial requirement for sustaining healthcare services. Therefore, maintaining the quality of blood and identification of Professional Donors are major responsibilities of blood banks. NACO (National AIDS Control Organization) and NABH (National Accreditation Board for Hospitals and Healthcare Providers) have provided guidelines for ensuring the quality of blood and identifying Professional Donors. However, manually monitoring standards and identifying professional donors is a challenging job. In this work, we develop a standard-compliant Blood Bank Management System with a novel rule-based enforcing mechanism. The developed system is an end-to-end solution that not only manages blood banks but also implements enforcement strategies and provides decision support to users. The proposed Blood Bank Management System has been implemented across 28 blood banks and a major hospital. It has been found extremely effective in streamlining the workflow of blood banks.

Keywords—Blood Bank Management System; Blood Stock; NACO; NABH; Professional Donor; Donor Repository

I. INTRODUCTION
The major concern of blood banks is to ensure efficient and effective collection and maintenance of quality blood stock as well as identification of Professional Donors (Section II-E). This becomes crucial since the span of time, especially in emergency situations, between requirement, arrangement and delivery of blood is very narrow. Moreover, blood banks across the state and districts are not able to utilize the available blood stock appropriately due to lack of connectivity and the time taken to propagate information via conventional channels. The National AIDS Control Organization (NACO) and the National Accreditation Board for Hospitals and Healthcare Providers (NABH) have provided guidelines to ensure the quality of blood, but there is no effective enforcement strategy to ensure adherence to these guidelines. In view of this, we propose a comprehensive IT solution, i.e. a Blood Bank Management System (BBMS), attempting to address this problem by providing means to connect, digitize and streamline the workflow of blood banks.

The need for automating blood banks has existed for a long time. In the early days of digitization, the primary purpose of an IT solution for blood banks was inventory management [6][13]. With time, the processes involved in the management of blood bank services as well as Blood Transfusion Systems [5][4] have become more complex. The main issue which plagues BBMS in the country is enforcing the standards and identification of Professional Donors. Therefore, in the modern world the purpose of a BBMS is not only to passively act as an inventory management system but to actively enforce standard operating procedures along with providing decision support. Recent Blood Bank Management Systems tend to focus on adapting the system to local practices instead of enforcing standard practices. The developed solution augments the functionality of contemporary systems. The authors in [15] developed a blood bank management system which adhered to the requirements of a single hospital. The system developed in [3] caters to a national-level transfusion service, but limits its scope to providing citizen-centric services and inventory management. The authors in [8] attempt to address the issue of safe transfusion by developing an end-to-end solution. A real-time system for blood banks has been proposed in [1].

A recent study [12] has observed that the existing workflow of blood banks needs to be strengthened, both in terms of planning and monitoring. Also, there are many gaps in the management of the blood supply [16][9]. Our work addresses the following gaps as compared to similar systems. Firstly, the existing systems have been designed to take care of the routine functioning of a blood bank and are not able to enforce the guidelines and standards based on rules. Secondly, identification of professional donors, if available, is usually based on biometric devices only. Such identification mechanisms mostly require an additional level of integration effort and scrutiny by the blood bank staff. Thirdly, quality checks are based on manual entry processes. In addition to these fundamental issues, there are many challenges that are encountered when such a system is implemented across a large number of locations. These challenges are discussed in Section II-A. In view of the above, the contributions of this paper are:
- Development of a standard-compliant Blood Bank Management System including a stringent rule-based enforcing mechanism.
- An architecture for fault-tolerant deployment, especially for rural areas and areas with sparse connectivity.
- Identification of key requirements and learnings from implementation of a Blood Bank Management System for large-scale deployments.
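A rule-based enforcing mechanism of the kind proposed here can be illustrated with a small sketch. This is not the authors' actual rule engine, and the thresholds below are illustrative placeholders, not the real NACO/NABH criteria:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Donor:
    age: int
    weight_kg: float
    hemoglobin_g_dl: float
    last_donation: Optional[date]

# Each rule returns a violation message, or None if the check passes.
# Thresholds are illustrative placeholders, not the actual guideline values.
RULES = [
    lambda d: None if 18 <= d.age <= 65 else "age outside permitted range",
    lambda d: None if d.weight_kg >= 45 else "below minimum weight",
    lambda d: None if d.hemoglobin_g_dl >= 12.5 else "hemoglobin too low",
    lambda d: None if d.last_donation is None
              or (date.today() - d.last_donation).days >= 90
              else "donation interval too short",
]

def enforce(donor):
    """Run every rule; an empty list means the donor may proceed.
    A rule-enforcing BBMS would block the workflow until the list is
    empty, rather than merely logging the violations."""
    return [msg for rule in RULES if (msg := rule(donor)) is not None]
```

The point of the design is that guidelines live in data (the rule list), so updated standards can be enforced uniformly across all deployed blood banks.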
The application is further divided into layers based on a responsibility layering strategy that associates each layer with a particular responsibility. This strategy has been chosen as it isolates various system responsibilities from one another, hence improving efficiency during both development and maintenance of the system.

3) Infrastructure at the Blood Banks: The infrastructure at the blood banks consists of Gigabit LAN connections from two distinct providers, client PCs and additional hardware such as printers, as per the requirement. In case of failure of connectivity between the beneficiary blood bank and the data center, the system automatically switches over to the other provider. In case both networks are down, local servers have been installed at the blood banks to continue operations until connectivity is resumed.
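The switchover logic described above can be sketched as follows. The host names are hypothetical and the liveness probe is simplified to a TCP connection attempt:

```python
import socket

# Hypothetical endpoints for the two provider links and the local fallback.
PROVIDERS = ["fiber.provider1.example", "rf.provider2.example"]
LOCAL_SERVER = "localhost"

def reachable(host, port=443, timeout=2.0):
    """Liveness probe: can a TCP connection to the data center be opened
    over this provider's link?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def select_endpoint(probe=reachable):
    """Failover policy: try the primary link, then the second provider,
    then fall back to the local blood-bank server when both are down."""
    for host in PROVIDERS:
        if probe(host):
            return host
    return LOCAL_SERVER
```

Injecting the probe as a parameter keeps the policy testable without live network links.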
IV. DISCUSSION
Prior to the implementation of the solution, the entire workflow of the concerned blood banks was manual. When a donor arrived, his/her entry was made into multiple registers. Once the blood was collected in blood bags, various tests were performed on the collected samples. The results were again written into registers. In case of a requirement for blood, a citizen had to enquire at each blood bank separately, either by phone or in person. Moreover, there was no effective monitoring mechanism at the blood banks to ensure the quality of the collected blood. In such a scenario, the chances of stocked blood expiring without being used were high. The reasons were twofold. Firstly, there was no timely intimation to the concerned official about stock that was going to expire. This prevented appropriate measures from being taken by authorities to transfer the stock to places where there might be a requirement. Secondly, since each blood bank operated in isolation, determining such requirements was tedious and had to be initiated at the blood banks' end. In addition to this, there was no provision at all to identify donors, including professional donors. The professional donors were not only troublesome to the blood banks, since their blood did not satisfy the quality parameters for donation, wasting time and effort, but also harmful to the recipients. For administrative purposes, the authorities received statutory reports of donors and stock separately from each blood bank on a monthly basis. This helped the concerned authorities to provide controlling guidelines to blood banks. The controlling guidelines may include scheduling blood donation camps, reducing the demand-supply gap by encouraging voluntary donations, enhancing infrastructure at blood banks based on the stock, etc.

After the deployment of the solution, the complete process of donating blood and its management has become not only simpler but more effective as well. The system has helped the authorities to enforce standard practices and at the same time makes work convenient for the staff, saving their precious time and effort. Due to real-time availability of stock, the system has allowed citizens to enquire about and arrange blood much quicker, hence saving many invaluable lives. The statutory reports can be viewed on the most recent data, allowing authorities to issue controlling guidelines quickly and extract statistical information easily. Most importantly, the solution has brought transparency and accountability to the system, putting power in the hands of citizens. In Table II, we show specific improvements on numerous parameters brought in by the BBMS as compared to the pre-deployment scenario.

A. Key Learnings
A successful and effective system unfolds and solves many issues. In the following, we describe the key learnings from our work.

1) Key Learning for Improving Citizen Centric Services:

Automation Should Reduce Effort: Any such application should aim at prioritizing the reduction in the workload of the end user rather than only automating the workflow. In order to achieve this, the system should be designed to minimize the inputs at multiple stages, including intermixing of offline and online data entries; efforts should be made to capture the information directly into the system, with proper validation, instead of first capturing it in registers and then entering it into the system. Moreover, specially designing the Graphical User Interface (GUI) as per the existing processes not only minimizes the training time but also enhances the productivity of the end user.

Secondary Notifications: A domain such as health, and especially services dealing with emergency cases such as blood banks, cannot solely rely on passive generation of reports or alerts confined within the system. Critical alerts such as blood expiry notifications, unattended emergency blood requests etc. should be intimated to the concerned authorities via SMS/push notifications.

Importance of Mobile Applications: A key requirement was to make the camp details, reports, stock details etc. available to the public through a web portal. During the operation of the project, it was observed that availability of such information via mobile apps would help reach a wider range of people. It was also observed that push notifications to users via mobile apps, and their consent/feedback through the same, would result in reaching a higher number of donors.

Online Training Videos/Manuals: Transfer of staff, change in work profile etc. are a few common issues in such organizations. In such cases, continuous retraining is not possible. Therefore, online training material including interactive tutorials and videos should be available to the staff. In addition to this, comprehensive help, including tips on using the system, should be provided to boost efficient use of the system.

2) Key Learning for Project Implementation:

Site Readiness: Site preparation should be a parallel activity with development and should be initiated immediately after the initiation of the project. Roles and responsibilities of stakeholders should be clarified and communicated properly. Regular meetings should be scheduled to discuss the status of the site. Responsibility for site preparation should be well defined in the Project Signing Document.

Stakeholder Participation: Stakeholder engagement is the process of involving people who either influence or are affected by the decisions taken by an organization. It plays a vital role in successful implementation of the project. The stakeholders' inclusion and recognition in the BBMS decision-making process is the key to the successful implementation and use of the system. The module-wise nodal officer and team for requirement gathering and workflow finalization should be identified. End users should be involved from the requirement gathering phase. The requirement gathering team should also interact with end users, analyze their way of working, and examine the registers and reports in use so that the complete requirement is gathered.

Service Provider for Backbone Connectivity: Reliable network connectivity critically influences the performance of an application. The requisite quality is achieved by opting for two service providers: the first with fiber-based leased line connectivity and the second with RF-based last mile connectivity. This ensured lower network downtime while keeping the system operational in a transparent manner.

Initial Teething Troubles: Since the staff at the blood banks was not acquainted with using such a system, a Project Management Unit (PMU) was required to be stationed at various blood banks for handling technical issues. Due to the critical nature of services at blood banks, an additional responsibility of the PMU was to educate and enable the blood bank staff to handle the technical issues as well.

V. CONCLUSION
The paper discussed the development of a standard-compliant Blood Bank Management System with a novel rule-based enforcing mechanism. The paper also discussed an architecture for fault-tolerant deployment, especially for rural areas and areas with sparse connectivity. The challenges faced in such large-scale deployments of citizen-centric services were identified, along with key learnings from the implementation.
TABLE II. COMPARISON OF PRE-DEPLOYMENT AND POST-DEPLOYMENT SCENARIOS

1. Stock Availability
Pre-Deployment: a) Telephonic or in-person visit by patients and hospital staff to each blood bank. b) Chances that the operator either does not know the exact stock or hides the stock details.
Post-Deployment: a) Online availability of stock. b) Information from multiple blood banks is shown in a consolidated manner.

2. Donor Repository
Pre-Deployment: a) No centralized repository of donors. b) Tedious and time-consuming process to search registers for volunteer donors having a particular blood group.
Post-Deployment: a) Centralized donor repository. b) One-click search of registered volunteer donors across multiple blood banks.

3. Enforcing and Authenticity Mechanism
Pre-Deployment: a) NACO and NABH guidelines were manually monitored. b) Chances of typographical errors and lack of consistency due to manual entry.
Post-Deployment: a) The NACO and NABH guidelines are strictly enforced. b) Bar-coded labels available on bags, synchronized with the database; this helps in reducing manual errors due to misplaced tags or entries.

4. Statutory Reports
Pre-Deployment: a) Manual preparation of stock reports. b) Delay in dispatch or generation of reports. c) No statistical analysis.
Post-Deployment: a) Real-time availability of statutory reports. b) Easy to extract statistical information.

5. Duplicity
Pre-Deployment: a) Repeated entry in registers. b) Results from medical equipment were entered manually in registers.
Post-Deployment: a) Once the information is captured, it is available to be used in other forms and reports without re-entry. b) Generic Medical Equipment Interface for automatically mapping results from machines to blood samples and inserting them into BBMS.

6. Efforts
Pre-Deployment: a) The staff was occupied in preparation of statutory reports and register entries. b) Stock count and donor count were done manually.
Post-Deployment: a) Online reports reduce this effort. b) Automatic and immediate availability of statistical data based on various parameters.

7. Transparency
Pre-Deployment: a) No provision to display the real-time stock in a blood bank. b) In case of shortage of blood in a blood bank, no mechanism to get other blood banks' stock status.
Post-Deployment: a) Comprehensive user management and access policies. b) Transaction log availability. c) Real-time information available through public portals.

8. Accuracy
Pre-Deployment: a) A donor's physical fitness was directly dependent upon the doctor's examination. b) Bag grouping and whether the bag is reactive for disease depended upon technician entry in registers. c) Chances of typographical errors were higher.
Post-Deployment: a) On the basis of keyed values the system suggests donor fitness; this helps in cases where a doctor is not available or a technician conducts the tests. b) With validations, cross-checks and bar codes in place at various stages, the chances of errors have been reduced.

9. Response Time
Pre-Deployment: a) Long queues due to manual and multiple register entries. b) Patients' relatives needed to travel to the blood banks in search of blood. c) Stock enquiry was time-consuming.
Post-Deployment: a) Waiting time has been reduced. b) With the availability of the portal and the capability to integrate kiosks, registration time can be further reduced. c) Users can get stock information through the portal.

10. Temperature Monitoring during Blood Transportation
Pre-Deployment: a) No monitoring system available during the transportation of bags from camps to the blood bank and from the blood bank to transfusion centers; bags which should have been rejected due to exposure to unsafe temperature (if any) were accepted and used.
Post-Deployment: a) With the Temperature Tag, the chain of temperature is recorded during transportation; in case a bag is exposed to an unsafe temperature, it can be rejected at the recipient's end.
REFERENCES
[1] N. Adarsh, J. Arpitha, M. D. Ali, N. M. Charan, and P. G. Mahendrakar. Effective blood bank management based on RFID in real time systems. In Embedded Systems (ICES), 2014 International Conference on, pages 287-290. IEEE, 2014.
[2] L. Bos and K. Donnelly. SNOMED-CT: The advanced terminology and coding system for eHealth. Stud Health Technol Inform, 121:279-290, 2006.
[3] E. Ekanayaka and C. Wimaladharma. Blood bank management system. Technical Session-Computer Science and Technology & Industrial Information Technology, page 7, 2015.
[4] L. T. Goodnough, M. E. Brecher, M. H. Kanter, and J. P. AuBuchon. Transfusion medicine: blood transfusion. New England Journal of Medicine, 340(6):438-447, 1999.
[5] D. M. Harmening. Modern blood banking and transfusion practices. FA Davis, 2012.
[6] H. Lowalekar and N. Ravichandran. Blood bank inventory management in India. OPSEARCH, 51(3):376-399, 2014.
[7] C. J. McDonald, S. M. Huff, J. G. Suico, G. Hill, D. Leavelle, R. Aller, A. Forrey, K. Mercer, G. DeMoor, J. Hook, et al. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clinical Chemistry, 49(4):624-633, 2003.
[8] M. F. Murphy, E. Fraser, D. Miles, S. Noel, J. Staves, B. Cripps, and J. Kay. How do we monitor hospital transfusion practice using an end-to-end electronic transfusion management system? Transfusion, 52(12):2502-2512, 2012.
[9] M. F. Murphy and M. H. Yazer. Measuring and monitoring blood utilization. Transfusion, 53(12):3025-3028, 2013.
[10] National AIDS Control Organization. Standards for blood banks and transfusion services. 2007.
[11] World Health Organization et al. International classification of diseases translator: ninth and tenth revisions: user's guide to electronic tables [computer file]. 1997.
[12] K. Ramani, D. V. Mavalankar, and D. Govil. Study of blood-transfusion services in Maharashtra and Gujarat states, India. Journal of Health, Population, and Nutrition, 27(2):259, 2009.
[13] C. Sapountzis. Allocating blood to hospitals from a central blood bank. European Journal of Operational Research, 16(2):157-162, 1984.
[14] S. Srivastava, R. Gupta, A. Rai, and A. Cheema. Electronic health records and cloud based generic medical equipment interface. National Conference on Medical Informatics (NCMI), AIIMS, New Delhi, 2014.
[15] S. Sulaiman, A. A. K. A. Hamid, and N. A. N. Yusri. Development of a blood bank management system. Procedia-Social and Behavioral Sciences, 195:2008-2013, 2015.
[16] L. M. Williamson and D. V. Devine. Challenges in the management of the blood supply. The Lancet, 381(9880):1866-1875, 2013.
A Survey of Recommender System Feedback
Techniques, Comparison and Evaluation Metrics
Masupha Lerato, Omobayo A. Esan, Ashley-Dejo Ebunoluwa
Dept. of Computer Systems Engineering
Tshwane University of Technology, TUT
Soshanguve, South Africa
{masuphaLE, esanoa}@tut.ac.za, ebunashley@gmail.com

Ngwira SM, Tranos Zuva
Dept. of Computer Systems Engineering
Tshwane University of Technology, TUT
Soshanguve, South Africa
{ngwiraSM, zuvaT}@tut.ac.za
[Figure: feedback processing pipeline, in which observation and inference yield estimated ratings, and prediction yields predicted ratings.]

IFT [6-9].

Merits
1. IFT can be collected at a much lower cost of user time in examining and rating.
2. IFT is effortless; it does not put a burden on the user of the retrieval system.
3. It can be continuously collected from user-system interaction and can be used for updating the user's profile.
4. It is less accurate when compared with EFT, but large quantities of information can be obtained at lower cost from users.
5. The affective labelling is unobtrusive and difficult for users to cheat.

Demerits
1. IFT is susceptible to noise.
2. IFT is sensitive to the user's context, albeit not to the same extent [10, 11].
3. IFT is less accurate compared to EFT.
4. Implicit feedback is always positive. For instance, if a user did not watch a track, that does not mean the user does not like the track [1, 2].
5. IFT is difficult to interpret.

Comparison of the feedback techniques:

                                 IFT       EFT                  HFT
Expressivity of user preference  Positive  Positive & Negative  Positive
Accuracy                         Low       High                 Very High
Noise                            Yes       Yes                  Yes
Measurement reference            Relative  Absolute             Absolute

C. Hybrid Feedback Technique (HFT)
HFT is the combination of both IFT and EFT. This approach utilizes a combination of numerical rating scores and human behaviour in predicting items of interest and taste to the users.

Merits
1. HFT helps to improve the prediction rating accuracy.

IV. METRIC EVALUATION OF RECOMMENDER SYSTEMS
The evaluation metrics of a recommender system can be used to measure the performance of the system, in order to quantify the error the system can incur during operation. The following metrics can be used to evaluate the rating a user will give to items; they are shown in (1)-(6) respectively.

Mean Absolute Error (MAE): This measures the average of the absolute deviation between the predicted rating pr_i and the actual rating ar_i given by the users in the system, as in (1):

    MAE = (1/N) Σ_{i=1}^{N} |pr_i - ar_i|    (1)

Mean Square Error (MSE): This is used in order to give more importance to cases with larger deviation from the actual rating. It is used instead of MAE, as in (2):

    MSE = (1/N) Σ_{i=1}^{N} (pr_i - ar_i)^2    (2)

Root Mean Square Error (RMSE): This variant of MSE was the error metric used in the Netflix Prize. The RMSE between the predicted and actual ratings is given in (3):

    RMSE = sqrt( (1/N) Σ_{i=1}^{N} (pr_i - ar_i)^2 )    (3)

In applications where a list of recommendations is provided for the user to evaluate, relevant items must be distinguished from irrelevant ones; Precision, Recall and DCG, from information retrieval, are used in such scenarios.

Precision: This is the fraction of relevant items recommended to the items in the recommendation list, as in (4):

    Precision = (relevant items recommended) / (items in the recommendation list)    (4)

The length of the list can be huge, depending on the type of recommendation technique and the size of the database used.

Recall: This is defined as the fraction of items recommended and judged relevant by the user to all relevant items, as in (5):

    Recall = (relevant items recommended) / (relevant items)    (5)

Discounted Cumulative Gain (DCG): This is used for measuring the effectiveness of the recommendation method in locating the most relevant items at the top and the less relevant items at the bottom of the recommendation list, as in (6):

    DCG = Σ_{i=1}^{p} (2^{rel_i} - 1) / log_2(1 + i)    (6)

V. CONCLUSION
In this paper, techniques that are used to construct users' interest in and taste for items through RS feedback have been highlighted. The merits and demerits of IFT, EFT and HFT in RS have been discussed, as has the performance evaluation of RS. In summary, accurate feedback in RS can assist users in various applications, such as helping users find items of interest among a huge number of items within a limited time.
In future, IFT, EFT and HFT will be implemented in a real-life experiment to address some of the challenges highlighted in this paper, such as accuracy, speed and efficiency.

ACKNOWLEDGMENT
The authors acknowledge the contribution of Tshwane University of Technology.

REFERENCES
[1] L. H. Li, R.-W. Hsu and F.-M. Lee, "The Technologies and the Assessment Methods of Recommender Systems," pp. 1-46.
[2] A. Mild and M. Natter, "Collaborative Filtering or Regression Models for Internet Recommendation Systems," Journal of Targeting, Measurement and Analysis for Marketing, vol. 10, no. 4, pp. 304-313, 2002.
[3] G. Jawaheer, M. Szomszor and P. Kostkova, "Comparison of Implicit and Explicit Feedback from an Online Music Recommendation Service," HetRec '10, September 2010.
[4] F. Ricci, L. Rokach, B. Shapira and P. B. Kantor, Recommender Systems Handbook, p. 77.
[5] E. R. Núñez-Valdéz, J. M. Cueva Lovelle, O. S. Martínez, V. García-Díaz, P. Ordoñez de Pablos and C. E. Montenegro Marín, "Implicit feedback techniques on recommender systems applied to electronic books," Computers in Human Behavior, vol. 28, 2012, pp. 1186-1193.
[6] P. Lops, M. de Gemmis and G. Semeraro, "Content-based Recommender Systems: State of the Art and Trends," in Recommender Systems Handbook, pp. 73-85.
[7] X. Amatriain and J. M. Pujol, "I Like It... I Like It Not: Evaluating User Rating Noise in Recommender Systems," 2009.
[8] T. Q. Lee, Y. Park and Y.-T. Park, "A time-based approach to effective recommender systems using implicit feedback," Expert Systems with Applications, vol. 34, 2008, pp. 3055-3062.
[9] D. W. Oard and J. Kim, "Implicit Feedback for Recommender Systems," Digital Library Group, 2000.
[10] I. Arapakis, J. M. Jose and P. D. Gray, "Affective Feedback: An Investigation into the Role of Emotions in the Information Seeking Process," SIGIR '08, pp. 395-402.
[11] J. L. Herlocker, J. A. Konstan, L. G. Terveen and J. T. Riedl, "Evaluating Collaborative Filtering Recommender Systems," ACM Transactions on Information Systems, vol. 22, no. 1, January 2004, pp. 5-53.
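The evaluation metrics in (1)-(6) translate directly into code. A minimal sketch, assuming equal-length rating lists, set-valued relevance judgements, and graded relevances rel_i ordered by list position:

```python
import math

def mae(pred, actual):
    """Eq. (1): mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def mse(pred, actual):
    """Eq. (2): mean square error; penalizes larger deviations more."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)

def rmse(pred, actual):
    """Eq. (3): root mean square error, the metric used in the Netflix Prize."""
    return math.sqrt(mse(pred, actual))

def precision(recommended, relevant):
    """Eq. (4): fraction of the recommendation list that is relevant."""
    return len(set(recommended) & set(relevant)) / len(recommended)

def recall(recommended, relevant):
    """Eq. (5): fraction of all relevant items that were recommended."""
    return len(set(recommended) & set(relevant)) / len(relevant)

def dcg(relevances):
    """Eq. (6): discounted cumulative gain over graded relevances rel_i,
    at positions i = 1..p."""
    return sum((2 ** rel - 1) / math.log2(1 + i)
               for i, rel in enumerate(relevances, start=1))
```

For example, with predictions [4, 3, 5] against actual ratings [3, 3, 4], MAE and MSE are both 2/3 and RMSE is sqrt(2/3).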
Implementing Protocol to Access White-Space
Databases on Smart Set-top Box
Elesa Ntuli, Seleman Ngwira, Tranos Zuva
Department of Computer Systems Engineering, TUT
Pretoria, South Africa
Ntulife@tut.ac.za, NgwiraSM@tut.ac.za, ZuvaT@tut.ac.za
Abstract—In television white space networks, secondary users are required to query an accredited geo-location spectrum database (GSDB) in order to determine the vacant channels, or white spaces. The recent development of the Protocol to Access White Spaces (PAWS) by the Internet Engineering Task Force (IETF) proposes to standardize communication between the GSDB and white space devices (WSDs). Enabling efficient sharing of spectrum among WSDs under their related restrictions, and the mechanism for channel selection, remain open research issues. In this paper, we propose the use of PAWS and JavaScript Object Notation (JSON) to handle the communication between the GSDB and WSDs and to enable optimal channel selection decisions in the set-top box, making it a smart set-top box.

Keywords—TV whitespace; set-top box; digital television; cognitive radio; base station; geo-location spectrum database; Protocol to Access White Spaces; White space device.

I. INTRODUCTION
In their prior work on TV spectrum measurements in both urban and rural areas in Southern Africa, [1] and [2] found that TVWS availability ranges between 100 and 300 MHz. The key element in the worldwide deployment of digital television (DTV) is the digital set-top box receiver and its related interactive services. First-generation set-top boxes were quite simple, with support only for channel tuning and audio-video data decoding. Increasingly, however, these set-top boxes can be used as a gateway to the Internet and as a hub in home networking and entertainment. Digital-quality broadcast television, personal video recording, and high-speed Internet access are some of the features of an extensive range of consumer broadband multimedia services. Such capabilities are paving the way for the set-top box to become not only a residential entertainment center but also a smart central piece of home computing equipment capable of handling various advanced applications such as browsing the internet. Successful sharing of TV spectrum between broadcast and broadband services depends on the approval of a dynamic spectrum access (DSA) regulatory approach. In DSA-based broadband networks, cognitive radios (CRs) or white space devices (WSDs) operate as secondary users (SUs) and access the white spaces without creating interference to the certified or primary users (PUs). There are mainly two techniques commonly considered for discovering the white spaces, namely spectrum sensing and the geo-location spectrum database (GSDB) [3]. Experimental TVWS broadband networks using GSDBs have been piloted in many parts of the world. Worldwide there is a coordinated move for Digital Switch Over (DSO), terminating analogue television transmission. This Digital Dividend (DD) has created new spectrum opportunities for many new wireless technologies and research, including in South Africa [7].

We review the portable devices when switching over to digital transmission using the Television White Space (TVWS), Base Station (BS) and geo-location, and the possibility of turning the set-top box (STB) into a smart set-top box (SSTB) by introducing the possibility of browsing the internet, which is suitable for rural areas in South Africa. Spectrum is becoming a limited natural resource, although the evidence continually shows instances where the spectrum is not fully exploited by the allocated facilities, making it inefficiently utilized [1].

According to ITU predictions, spectrum of about 1720 MHz will be required by the year 2020, and many new wireless services/applications cannot be rolled out due to non-availability of spectrum [17], which demands dynamic allocation of spectrum instead of static allocation [1]. The technology of CR would assist in meeting the constantly growing demand for radio spectrum and aid in dealing with the resource in a more methodical and efficient way. The basic idea of CR is to reuse the spectrum whenever it is left vacant by the primary/licensed users (PUs). Frequent spectrum sensing for detecting the TV switchover to the full digital broadcast service (DD) is required to be performed by the secondary/unlicensed users (SUs), which creates new spectrum opportunities due to the higher spectral efficiency compared to analogue services. This TVWS technology opens the door for CR technology. In various countries spectrum regulatory bodies are studying the pros and cons of CR devices. Other countries have made provisions for CR
V. ACCESS TO GSDB

The TVWS-BS is connected to one of the approved GSDBs. The TVWS-BS establishes an HTTP session with the national GSDB in order to register and initiate a query for available channels. The messages between the TVWS-BS and the GSDB are standardized by the PAWS [8], while messages between the WSD and the GSDB are use-case dependent, i.e. not standardized by the PAWS. Once it has selected a channel for itself, the TVWS-BS then uses each WSD's geo-location (which is readily known to the TVWS-BS) to query the GSDB for more available channels using the AVAIL SPECTRUM BATCH REQ request.
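As a concrete illustration, the batch query described above can be sketched as a JSON-RPC payload in the style of the PAWS protocol (RFC 7545). The method name and field names follow RFC 7545's conventions, but the device identifier and coordinates below are hypothetical placeholders, not values from this paper:

```python
def build_batch_request(master_location, slave_locations):
    """Sketch of an AVAIL_SPECTRUM_BATCH_REQ body in the PAWS JSON-RPC style.

    The TVWS-BS (master) supplies its own geo-location plus the geo-locations
    of its slave WSDs, so the GSDB can return a channel list per location.
    """
    def point(lat, lon):
        return {"point": {"center": {"latitude": lat, "longitude": lon}}}

    return {
        "jsonrpc": "2.0",
        "method": "spectrum.paws.getSpectrumBatch",
        "params": {
            "type": "AVAIL_SPECTRUM_BATCH_REQ",
            "version": "1.0",
            "deviceDesc": {"serialNumber": "TVWS-BS-0001"},  # hypothetical ID
            "location": point(*master_location),             # master location
            "locations": [point(lat, lon) for lat, lon in slave_locations],
        },
        "id": 1,
    }

# Master at one location, two slave WSDs at nearby (illustrative) locations.
req = build_batch_request((-33.93, 18.42), [(-33.94, 18.43), (-33.95, 18.41)])
```

The GSDB answers with one availability list per entry in `locations`, which is what lets the TVWS-BS select channels on behalf of each slave.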
Fig. 2. Channel Selection hierarchy structure
The WSDs are also called slaves because they do not have direct access to the GSDB [5]. Depending on the amount of traffic demand or class of service (i.e. real-time or best effort) required by the WSDs, the distance from the BS, and the availability of white spaces, the most suitable number of channels will be selected. Finally, by sending a SPECTRUM USE NOTIFY message, the TVWS-BS notifies the GSDB of the selected channels used by the entire network. This assists the GSDB in not providing the same list of available channels to other secondary networks, thereby preventing any co-channel interference [6].

Fig. 2 illustrates the hierarchy structure, where the top level presents the main goal of the structure, which is to select the best available channel. The selection of the channels is based on the user's preference, which requires a specific quality of service (QoS) for each class of service. Two classes of service (CoS) are considered: Real Time (RT) and Best Effort (BE), which form the second level of the hierarchy. On the third level of the hierarchy are the three independent criteria to be compared when selecting the channels. At the bottom of the hierarchy are at least four alternative channels, which are compared in order to select the best channel for the user.

D. Overview of openPAWS protocol

VI. AVAILABLE SPECTRUM RESPONSE MESSAGE

According to the PAWS protocol, attributes are extracted from the AVAIL SPECTRUM RESP message which is sent to the TVWS-BS by the GSDB [5]. This message contains several parameters, which include timestamp, spectrumSchedules, maxTotalBwHz, maxContiguousBwHz, etc. The most important parameters used in the model are maxTotalBwHz and spectrumSchedules, which are explained below.
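To make the role of these two parameters concrete, a minimal sketch of extracting them from an AVAIL SPECTRUM RESP-style message follows. The nesting (spectrumSpecs containing spectrumSchedules and maxTotalBwHz) follows RFC 7545's field names; the sample timestamp and bandwidth values are invented for illustration only:

```python
def summarize_spectrum_response(resp):
    """Pull the model's two key parameters out of an AVAIL_SPECTRUM_RESP-style
    message: maxTotalBwHz and the spectrumSchedules list."""
    spec = resp["spectrumSpecs"][0]
    return {
        "timestamp": resp["timestamp"],
        "maxTotalBwHz": spec["maxTotalBwHz"],
        "numSchedules": len(spec["spectrumSchedules"]),
    }

# Illustrative response: one spectrum spec with a single 12-hour schedule.
sample = {
    "type": "AVAIL_SPECTRUM_RESP",
    "timestamp": "2015-03-14T10:00:00Z",
    "spectrumSpecs": [{
        "maxTotalBwHz": 16e6,        # e.g. two 8 MHz TV channels in total
        "maxContiguousBwHz": 8e6,
        "spectrumSchedules": [
            {"eventTime": {"startTime": "2015-03-14T10:00:00Z",
                           "stopTime": "2015-03-14T22:00:00Z"},
             "spectra": []},
        ],
    }],
}
summary = summarize_spectrum_response(sample)
```

maxTotalBwHz bounds how much bandwidth the secondary network may use in total, while each spectrumSchedules entry bounds *when* the listed spectra are usable.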
ACKNOWLEDGMENT
We would like to thank TUT for supporting this work, and Sibusiso Colby Ntuli for all the support given.
REFERENCES
[1] M. T. Masonta, D. Johnson, and M. Mzyece, "The White Space Opportunity in Southern Africa: Measurements with Meraka Cognitive Radio Platform," R. Popescu-Zeletin et al., Eds. Springer, vol. 92, pp. 64-73, Feb. 2012.
[2] A. Lysko, M. Masonta, D. Johnson, and H. Venter, "FSL based estimation of white space availability in UHF TV bands in Bergvliet, South Africa," in SATNAC, George, South Africa, Sep. 2-5, 2012.
[3] M. Masonta, Y. Haddad, L. De Nardis, A. Kliks, and O. Holland, "Energy efficiency in future wireless networks: Cognitive radio standardization requirements," in IEEE CAMAD, Barcelona, Spain, Sep. 17-19, 2012.
[4] Ghosh, S.; Naik, G.; Kumar, A.; Karandikar, A., "OpenPAWS:
An open source PAWS and UHF TV White Space database
implementation for India," in Communications (NCC), 2015.
[5] TENET, "The Cape Town TV white spaces trial," Available from: http://www.tenet.ac.za/tvws, 2015, [Accessed: 14/03/2015].
[6] V. Chen, Ed., S. Das, L. Zhu, J. Malyar, and P. McCann, "Protocol to access spectrum database," draft-IETF-PAWS-Protocol-06, Jun. 2013.
Fig. 5. PAWS slave device state diagram. Once the slave device is connected to the Master, the rest of the PAWS messages are exchanged in a manner similar to the Master-database communication. [4]

[7] C. Ghosh, S. Roy, and D. Cavalcanti, "Coexistence challenges for heterogeneous cognitive wireless networks in TV white spaces," IEEE Wireless Communications, vol. 18, no. 4, pp. 22–31, Aug. 2011.
[8] FCC Spectrum Policy Task Force. http://www.fcc.org. ET
Docket No. 04-186, 2008.
[9] T. Dhope, D. Simunic, and M. Djurek, "Application of DOA estimation algorithms in smart antenna systems," Studies in Informatics and Control, 19(4), 445–452, 2010.
[10] K-C. Chen and R. Prasad. Cognitive Radio Networks. John
Wiley & Sons, 2009.
[11] Ofcom: Digital Dividend: Clearing the 800 MHz Band.
[12] http://www.ofcom.org.uk/consult/condocs/cognitive/,
[Accessed: 14/03/2015].
[13] http://ieee802.org/22/, [Accessed: 14/03/2015].
Abstract—With the current boom in social networks, users interact with one another through social networking platform actions (such as pressing "like" and joining fan pages and groups), and this interactive information on the social platform can fully represent a user's interest in the content of information sources in different social groups. Therefore, how to collect the user behavior patterns generated by social interaction, and then analyze and understand user preferences from these interactive data, is the purpose of the research in this paper.

Furthermore, for many brand enterprises it is important to understand individual preferences, because once individual preference information is known, it can be used for personalized advertising, product recommendation, article recommendation, and other diversified personal social services, which can increase the click rate and exposure of products and better fit the needs of the user's life. Therefore, following the social phenomena arising from the current trend of social technology development, this paper mainly analyzes the data of users' personal interactions on the social network: fan pages the user has joined, articles the user has liked, and data the user has shared. These three kinds of personal information are used for personal preference analysis, and from this huge amount of personal data the corresponding diverse groups for each personal preference category are found. The personal preference information can then be used for diversified personal advertising, product recommendation, and other services. Finally, through actual business verification, the research grew website page views by 8%, dropped the site bounce rate by 11%, and grew the commodity click-through rate by 36%, which shows that the results of this research fit users' preferences.

Keywords—Social Networks; Social Personal Analysis; Cost Per Click; Personal Interest Analysis

I. INTRODUCTION

Recently, the flourishing of information technology has caused people to think about how to increase their chances of interacting with others through information technology. With the progress of information technology and the extension of the concept of interaction, virtual communities (e.g. Facebook, Weibo, Blogspot, PTT, forums, news sites) began forming and growing. Users have gotten used to interacting and communicating with their families, colleagues and friends over all kinds of social networking sites. They will even exchange information with others having similar interests or demands. Hence, under the influence of social networks, a considerable amount of social activity is shifting from reality to virtual online platforms. This brings a direct impact to the consumer market which is too immense to be ignored.

For business operators, the value of social network information can be one of the important sources for understanding customers' opinions and analyzing the behavior retained by the virtual social group. The "human" plays a very important linking role in the social network, and because of the interactive links between people, the social platform has become a very large information exchange center, where people's words and social behavior spread and scatter their personal preferences about any event. By collecting and analyzing personal interaction behavior and the content of written comments on the social platform, we can obtain more significant results in understanding personal interest preferences. Therefore, how to effectively analyze personal preferences on the social platform has become the problem to be solved in this paper.

This paper mainly puts forward research on social personal preference analysis, through the planning and implementation of three modules to collect and analyze data. According to the algorithm of each module, effective personal preference information analysis is performed for the social network platform. After actual system verification through the module planning in this paper, web page views grew 8%, the site bounce rate dropped 11%, and the product click-through rate grew 36%. The results can provide an extension for follow-up research; the methods will be introduced in order below.

The framework of this paper is as follows: the first section introduces the motivation and background of this paper. The second section discusses and analyzes the relevant literature put forward by previous research. The third section introduces the methods and technology used in the paper. The fourth section puts forward the implementation results and analysis. The fifth section is the conclusion and future outlook.

A. Social Network Development Trends

According to the estimation of the international market research authority eMarketer, the number of global users of social networks has reached 1.5 billion. The numbers of active users of Facebook, Twitter, and Weibo and Tencent were one billion, 0.5 billion, and 0.3 billion in 2012, respectively. Furthermore, the penetration rate of social network services among global users of all ages has reached 79%. Even those once considered the main composition of the digital divide have seen a rise in penetration rate of up to 9.3%. These facts show that the use of social networks is very popular. Domestically, the number of social network users broke through ten million in 2011, according to research by the Institute for Information Industry [3]. Among those under 30 and those from 50 to 59, the proportions using social networks were 96% and 70%, respectively, which shows an extremely high penetration rate. As for the period of using social networks, research shows that more than 70% of domestic people had used them for over a year; if we also count those using them for more than six months but less than a year, the proportion is approximately 90%. Judging from the high penetration rate, the application of social networks is deeply rooted in our daily life and has changed our lifestyle.

With this research we used a previously studied distributed Web Crawler [3] to carry out a personal information collection service for social networking sites (e.g. Facebook). Since this research carries out a personalized information collection service, for the sake of the personal information protection laws of different countries we needed to obtain personal authorization. Therefore, this research cooperated in practice with brand owners, legally acquired Customer Relationship Management information from the vendor, and, through user consent, legally obtained personalization authorization to carry out the analysis of this research.

This research mainly added two data scopes: 1. legally acquired authorization from 10,000 users for their Facebook personal information; 2. observation of 10,000 Taiwanese Facebook fan pages, as shown in Table 1. The research maps the user's personal ID to understand which fan pages the user has joined and, through the user's personal authorization, gathers the articles the user has liked and shared on Facebook to carry out personalized preference analysis via these two data scopes. The analysis results are then provided to the brand owners to carry out follow-up recommendation services.
Table 2. Weight allocation by operational complexity

Operational complexity | Join fans pages | Press like fans pages | Press like article | Share article
Weight allocation      | 0.35            | 0.30                  | 0.25               | 0.10
The weight parameters in Table 2 were set based on the field experiments of the fourth section. The research obtained 10,000 users' personal access records to conduct the data analysis, and the weights were optimized through A/B testing and an iterative mining and adjustment process; for the detailed experimental procedures, refer to the fourth section.

Fig. 4 Personal favor graph

Through the personal favor graph in Fig. 4, the results of personal preference can be defined based on the preference data of each node. Dijkstra's shortest-path algorithm, shown in Fig. 5, is used to find the social-group data with the higher scores, which define the results of personal preference.
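Dijkstra's shortest-path search referenced above can be sketched in adjacency-matrix form. The graph values below are illustrative, not taken from the paper's data:

```python
import math

def dijkstra(A, F):
    """Dijkstra's shortest-path algorithm over an adjacency matrix A.

    D is initialized from row F of A, then repeatedly the unvisited vertex t
    with minimum D[t] is moved into the visited set S, and D is relaxed via
    D[I] = min(D[I], D[t] + A[t][I]).  Returns shortest distances from F.
    """
    N = len(A)
    D = list(A[F])             # D[I] = A[F][I]
    D[F] = 0
    S = {F}                    # start from source vertex F
    while len(S) < N:
        # pick the unvisited vertex t with minimum tentative distance
        t = min((i for i in range(N) if i not in S), key=lambda i: D[i])
        if math.isinf(D[t]):
            break              # remaining vertices are unreachable
        S.add(t)
        for I in range(N):     # relaxation step
            if I not in S and D[t] + A[t][I] < D[I]:
                D[I] = D[t] + A[t][I]
    return D

INF = math.inf                 # INF marks an absent edge
A = [[0,   4,   1,   INF],
     [4,   0,   2,   5],
     [1,   2,   0,   8],
     [INF, 5,   8,   0]]
dist = dijkstra(A, 0)          # shortest distances from vertex 0
```

In the preference graph, edge weights would be derived from node preference scores so that the "shortest" path identifies the most strongly preferred group.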
3.2.2 Personal preference analysis

Fig. 5 Dijkstra's Algorithm

Find a vertex t from the set U−V such that D[t] is minimal, and put t into the set S; repeat until U−V is empty. Adjust the values in the D array according to the formulas below:

MinW(P) = Σ_{(u,v)∈P} W(u, v)    (3-1)

D[I] = A[F, I]  (I = 1, …, N)    (3-2)
u = {F}    (3-3)
v = {1, 2, …, N}    (3-4)
D[I] = min(D[I], D[t] + A[t, I])  ((I, t) ∈ E)    (3-5)

D: an array of N positions for storing the shortest distance from one vertex to another.

The system in this paper collects 10,000 users' personal preference data, with the users' approval, through the personal preference recommendation interface; the WebCrawler of the data processing platform module carries out the personal data collection. In addition, the research simultaneously observed data from 10,000 Taiwanese Facebook fan pages to understand the fan pages that users have joined and to collect personal social interaction data. Then, the personal preference analysis calculation module analyzes the personal preferences in these data, and finally, through the results of the personal preference analysis, conveys the products or articles that users are interested in.

4.2 The system flow

(Flowchart: Start → Users browse the web pages → FB APP_Users Login → ...)
Fig. 9-3 Click through rate empirical data

Table 5. The rule-of-thumb (ROT) recommended services data

Field  | PV      | Bounce rate | Click-through rate
Test_A | 168,157 | 8.12%       | 1.21%
Test_B | 21,113  | 10.21%      | 1.34%
Test_C | 6,571   | 8.97%       | 1.52%
Test_D | 3,543   | 6.77%       | 0.33%
Test_E | 523     | 9.93%       | 1.67%

The research was verified through the data of the five recommendation fields collected through the rule-of-thumb recommended services of the co-operative enterprise. The resulting data are shown in Table 5: the total PV reading data is 199,907, the average page views are 39,981.4, the average bounce rate is 8.8%, and the average click-through rate is 1.21%.

The purpose of this paper is to quickly understand the personal preferences in the social behavior of the user, and then to carry out recommendation and other innovative application services. In the future, given the rapid development of social media, reaching a variety of innovative application services through the analysis of personal interest preferences will be one of the next big issues of social analysis. Therefore, the personal preference analysis data produced by this research can be used as a reference for follow-up research; we expect follow-up research to put forward further industrial applications, population analysis, social policy, etc. This will be the focus of our future work.

VI. ACKNOWLEDGEMENT

This study is conducted under the "Social Intelligence Analysis Service Platform (3/4)" of the Institute for Information Industry, which is subsidized by the Ministry of Economic Affairs of the Republic of China.

REFERENCES

[1] Zhang, C.; Fang, Y., "A Trust-Based Privacy-Preserving Friend Recommendation Scheme for Online Social Networks", IEEE Transactions on Dependable and Secure Computing, 2015, 12(4), 413-427.
[2] Louta, M.D. ; Varlamis, I. “A Trust-Aware System for Personalized User
Recommendations in Social Networks”, IEEE Transactions on Systems,
Man, and Cybernetics, 2014, 44(4), 409-421.
[3] Cheng-Hung Tsai, Tsun Ku, and Wu-Fan Chien, “Object Architected
Design and Efficient Dynamic Adjustment Mechanism of Distributed
Web Crawlers”, International Journal of Interdisciplinary
Telecommunications and Networking, 2015, 7(2), 57-71.
[4] Rashid, A. “Who Am I? Analyzing Digital Personas in Cybercrime
Investigations” IEEE Computer Magazines 46(4).
[5] Sheth, A., Thomas, C., & Mehra, P. "Continuous semantics to analyze real-time data." IEEE Internet Computing, 2010, 14(6), 84-89.
[6] Cheng-Hung Tsai, Tsun Ku, Liang-Pu Chen, Ping-Che Yang, "Constructing Concept Space from Social Collaborative Editing," Proceedings of AHFE 2015.
[7] Hui Zheng ; Ji-kun Yan ; Ye Jin “Persona Analysis with Text Topic
Modelling” Proceedings of IEEE ACAI 2012.
[8] Shiga, A. , “A Support System for Making Persona Using Bayesian
Network Analysis” Proceedings of IEEE ICBAKE 2013.
[9] Clark, J.W. , “Correlating a Persona to a Person,” Proceedings of IEEE
SocialCom 2012.
[10] Cunha, B. ; Pereira, J.P. ; Gomes, S. ; Pereira, I. ; Santos, J.M. ; Abraham,
A., “Using personas for supporting user modeling on scheduling systems”,
Proceedings of IEEE HIS 2014.
[11] Letier, E. ; Sasse, M.A., “Building a National E-Service using Sentire
experience report on the use of Sentire: A volere-based requirements
framework driven by calibrated personas and simulated user feedback”
Proceedings of IEEE RE 2014
[12] Bo Wu ; Qun Jin ; Nishimura, S. ; Julong Pan ; Wenbin Zheng ; Jianhua
Ma, “Social Stream Organization Based on User Role Analysis for
Participatory Information Recommendation” Proceedings of IEEE
UMEDIA 2014
[13] Chengjie Mao ; Zhenxiong Yang ; HanJiang Lai, “A new team
recommendation model with applications in social network”, Proceedings
of IEEE CSCWDM 2014.
[14] Piedra, N. ; Chicaiza, J. ; Tovar, E., “Recommendation of OERs shared in
social media based-on social networks analysis approach”, Proceedings of
IEEE FIE 2014.
[15] Djauhari, M.A., “A recommendation on PLUS highway development: A
social network analysis approach”, Proceedings of IEEE IEEM 2014.
[16] Mishra, S., “An analysis of positivity and negativity attributes of users in
twitter”, Proceedings of IEEE ASONAM 2014.
[17] Lu Yang, “A collaborative filtering recommendation based on user profile
and user behavior in online social networks”, Proceedings of IEEE
ICSEC 2014.
[18] Huaimin Wang ; Gang Yin ; Ling, C.X. “Who Should Review this
Pull-Request: Reviewer Recommendation to Expedite Crowd
Collaboration” Proceedings of IEEE APSEC 2014.
[19] Castiglione, A. ; De Santis, A. “Friendship Recommendations in Online
Social Networks” Proceedings of IEEE INCoS 2014.
[20] Shaffer, C.A. ; Weiguo Fan ; Fox, E.A., “Recommendation based on
Deduced Social Networks in an educational digital library”, Proceedings
of IEEE JCDL 2014
[21] Gerogiannis, V. ; Anthopoulos, L., “Project team selection based on social
networks”, Proceedings of IEEE ITMC 2014.
[22] Diaby, M. ; Cataldi, M. ; Viennet, E. ; Aufaure, M.-A., “Field selection
for job categorization and recommendation to social network users”,
Proceedings of IEEE ASONAM 2014.
[23] Zegarra Rodriguez, D. ; Bressan, G. “Music recommendation system
based on user's sentiments extracted from social networks”, Proceedings
of IEEE ICCE 2014.
[24] Sun Yanshen ; Yu Xiaomei, “Personalized recommendation based on link
prediction in dynamic super-networks”, Proceedings of IEEE ICCCNT
2014.
[25] Xueming Qian ; Dan Lu ; Xingsong Hou ; Liejun Wang, “Personalized
tag recommendation for Flickr users”, Proceedings of IEEE ICME 2014.
Multimodal Biometric Scheme for E-Assessment
S.O. Ojo T. Zuva S.M. Ngwira
Computer Systems Engineering Computer Systems Engineering Computer Systems Engineering
Tshwane University of Technology Tshwane University of Technology Tshwane University of Technology
Tshwane, South Africa Tshwane, South Africa Tshwane, South Africa
Abstract—Problems of impersonation and other malpractices of various degrees have continued to mar the electronic assessment (e-assessment) of students at various levels. The use of traditional methods of password and token passing for identity management and control has failed to provide an acceptable solution to these problems, due to their susceptibility to theft and various forms of violation. The failure of these and other methods in identity management has propelled the emergence of different biometric-based systems as solutions. Biometric systems use naturally endowed traits, which include fingerprint, face, iris, voice, etc., for more reliable human identification and authentication. This paper proposes a multimodal biometric scheme for e-assessment aimed at reducing the time spent when performing biometric authentication. Results obtained indicate that by combining the fingerprint and facial biometric modes in a multimodal biometric scheme, the amount of time required to authenticate a learner in an e-assessment can be reduced.

Keywords—E-assessment; biometrics; fusion

I. INTRODUCTION

Traditionally, the assessment process has been considered a high-stakes process, and the inclusion of one or more proctors throughout the assessment process has been one of the means of ensuring that the assessment is conducted in a credible manner. Proctors, also known as invigilators, are present to address security concerns during an assessment.

The C.I.A. (Confidentiality, Integrity, and Availability) security goals [1], which are meant to prevent the compromise of assets in computer network systems, are applicable to the summative e-assessment [2]. Apart from the hardware and software used for the purpose of an e-assessment, the student undergoing the e-assessment is also an asset, and certain threats are posed by this asset. These threats include an illegal or unauthorized candidate taking an e-assessment, the dissimulation of identity detail, and the abuse or misuse of authenticity detail [2].

Due to the aforementioned threats posed by a student, the C.I.A. goals, though sufficient to protect the hardware and data used for the e-assessment, will not adequately address the threats posed by the student; thus three other goals for the summative e-assessment process can be deduced [2]:

Presence – Physical and online presence, which defines the state of an entity being in a particular environment.
Identity – The ability to distinguish one entity from another.
Authentication – Preventing the misuse of identity detail.

In an effort to guard against the aforementioned threats and achieve the goals of Presence, Identity, and Authentication, several biometric-based schemes have been proposed with the use of biometrics for the authentication of learners in an e-assessment.

The main purpose of this paper is to present a non-intrusive multimodal biometric authentication scheme for e-assessment. This paper is arranged as follows: Section 2 gives a general overview of biometrics and some popular biometric modes, Section 3 the proposed method, Section 4 results, and Section 5 the conclusion and recommendation.

II. LITERATURE REVIEW

A. Biometrics

Historically, identification of persons has been done through various means, including visual identification, gait recognition, fingerprint matching, presentation of access cards, usernames and passwords, analysis of bioelectrical signals, etc. Biometrics is the ability to identify and authenticate an individual using one or more of their behavioral, physical or chemical characteristics [3]. A biometric recognition system relies on who a person is or what a person does, instead of what they know or possess. Physical biometric characteristics include face, fingerprint, retina, iris, etc. Behavioral biometric characteristics include gait recognition, keystroke dynamics, mouse dynamics, etc. Chemical biometric characteristics involve the use of human DNA to identify a person. A biometric system can be described as a pattern-recognition system that aims to recognize a person based on a feature vector deduced from physiological and/or behavioral characteristics that the person possesses.

The use of a particular biometric mode is largely dependent on the application domain. Some popular biometric modes used for identification include gait recognition, fingerprint recognition, facial recognition, bioelectrical signals, keystroke dynamics, and mouse dynamics.

The gait is a popular biometric due to the plausibility of obtaining one's gait unobtrusively as an individual walks about. The gait has been defined as "the coordinated, cyclic combination of movements that result in human locomotion" [4]. Research has been conducted into the identification of humans through gait recognition [5, 6], and the gait has been found to be reliable even in law enforcement, as Closed Circuit Television (CCTV) camera footage can be used in identifying
Acc% = 100 − ((FAR + FRR) / 2) %    (3)

where:
Acc% = percentage accuracy.
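Equation (3) can be computed directly. In the sketch below, the FAR and FRR values are hypothetical examples, not measured results from this paper:

```python
def accuracy_percent(far, frr):
    """Equation (3): Acc% = 100 - (FAR + FRR)/2, with FAR and FRR
    given in percent (false acceptance and false rejection rates)."""
    return 100.0 - (far + frr) / 2.0

# Hypothetical example: FAR = 2%, FRR = 4% gives 97% accuracy.
acc = accuracy_percent(2.0, 4.0)
```

Averaging the two error rates means a matcher cannot inflate its accuracy by trading one error type entirely for the other.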
As the size of a biometric database increases, the possibility of a false rejection or false acceptance also increases [18]. The accuracy metrics are commonly used in evaluating a biometric system for any given application domain, such as e-assessment.

Fig 1 Multimodal Biometric Scheme with fusion at the matching module. Adapted from [19]

The biometric database contains templates of both the learners' facial biometrics and the corresponding fingerprint biometrics in a one-to-one mapping. Fig 2 shows a sample database with five learners enrolled.
III. PROPOSED APPROACH
This study proposes the use of fingerprint and facial biometrics in a multimodal biometric authentication scheme for e-assessment. The facial biometric has been found to be less precise than the fingerprint biometric when identifying individuals, and will thus be used as the first mode of authentication to obtain a pruned list of possible samples which have met the threshold. From this pruned list, the fingerprint biometric can then be used to examine the smaller subset of possible identities of the learners. In so doing, we are able to reduce the overall amount of time spent matching biometric samples against a large biometric database.
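The two-stage idea can be sketched as follows: the face matcher first shortlists candidates above a threshold, and only that shortlist is handed to the more precise fingerprint matcher. The scoring functions and toy templates here are stand-ins for real matchers, not the paper's actual implementation:

```python
def two_stage_identify(probe, database, face_score, finger_score,
                       face_threshold):
    """Two-stage multimodal identification sketch.

    database: list of (identity, face_template, finger_template).
    face_score / finger_score: similarity functions (higher = more similar).
    Stage 1 shortlists identities whose face score meets the threshold,
    ranked most-similar first; stage 2 runs the fingerprint matcher only
    on that shortlist.  Returns the best identity, or None if no match.
    """
    shortlist = []
    for ident, face_t, finger_t in database:
        s = face_score(probe["face"], face_t)
        if s >= face_threshold:
            shortlist.append((s, ident, finger_t))
    shortlist.sort(reverse=True)           # most likely identity first
    if not shortlist:
        return None
    return max(shortlist,
               key=lambda e: finger_score(probe["finger"], e[2]))[1]

# Toy matchers: templates are single numbers; similarity = negative distance.
sim = lambda a, b: -abs(a - b)
db = [("A", 0.1, 0.9), ("B", 0.2, 0.5), ("C", 0.8, 0.4)]
who = two_stage_identify({"face": 0.15, "finger": 0.5}, db, sim, sim, -0.2)
```

Because the expensive fingerprint comparison runs only on the shortlist, the matching time grows with the shortlist size rather than with the full database size.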
Fusion will be done at the matching module. Fig 1 shows
the flow chart of the proposed scheme with fusion done at the
matching module. The fingerprint biometric is obtained
μ = (1/N) Σ_{i=1}^{N} x_i    (4)

where:
X = [x_1, x_2, …, x_i, …, x_N] represents an n × N data matrix
N = number of facial images being examined
x_i = a vector of dimension n made up of a p × q image
n = p × q.
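Equation (4) is the element-wise mean of the N image vectors. A minimal pure-Python sketch, using tiny 2 × 2 "images" flattened to vectors purely for illustration:

```python
def mean_face(X):
    """Equation (4): element-wise mean of N image vectors.
    X is a list of N vectors, each of dimension n = p * q."""
    N = len(X)
    n = len(X[0])
    return [sum(x[i] for x in X) / N for i in range(n)]

# Two flattened 2x2 "images" (p = q = 2, so n = 4).
mu = mean_face([[0, 2, 4, 6],
                [2, 4, 6, 8]])
```

In eigenface-style facial matching, this mean vector is subtracted from each image before further analysis, so that only the variation between faces remains.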
LLR = p(X | I = 0) / p(X | I = 1)    (5)

where:
X = {s_1, s_2, q_1, q_2}
s_1 = matching score from the fingerprint biometric matching module
s_2 = matching score from the facial biometric matching module
q_1 = quality of the fingerprint biometric sample
q_2 = quality of the facial biometric sample
I ∈ {0, 1}
p(X | I = 0) = the genuine distribution of set X
p(X | I = 1) = the imposter distribution of set X

Fig 2 One-to-one mapping of facial biometric templates with fingerprint biometric templates

From the set of five templates shown in Fig 2, the facial biometric module will return three templates which have achieved the set threshold. These three templates are ranked in order of similarity score, with the first template in the set being the most likely correct identity and the last template being the least likely identity. The three templates are passed on to the fingerprint biometric module, which then compares its biometric data against the three provided. In comparing fingerprint biometric data with template data, minutiae matching is used, as shown in equation 6 [21].

sd = √((row_k^T − row_ref^T)² + (col_k^T − col_ref^T)²)    (6)
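The spatial-distance test of equation (6) is a Euclidean distance between a template minutia and a reference minutia. A minimal sketch follows; the tolerance value r0 is a hypothetical example, not a value given in the paper:

```python
import math

def minutiae_spatial_distance(m_k, m_ref):
    """Equation (6): Euclidean distance between two minutiae given as
    (row, col) pixel coordinates."""
    return math.hypot(m_k[0] - m_ref[0], m_k[1] - m_ref[1])

def minutiae_match(m_k, m_ref, r0=8.0):
    """Two minutiae are considered a spatial match when their distance
    falls within the tolerance r0 (the value here is illustrative)."""
    return minutiae_spatial_distance(m_k, m_ref) <= r0

# Minutiae 3 pixels apart in rows and 4 in columns lie 5 pixels apart.
d = minutiae_spatial_distance((10, 20), (13, 24))
```

A small tolerance r0 absorbs the slight displacement of minutiae between two impressions of the same finger while still rejecting unrelated minutiae.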
Abstract—Problems of impersonation and other malpractices of various degrees have continued to mar the electronic assessment (e-assessment) of students at various levels. The use of traditional methods of password and token passing for identity management and control has failed to provide an acceptable solution to these problems, due to their susceptibility to theft and various forms of violation. The failure of these and other methods in identity management has propelled the emergence of different biometric-based systems as solutions. Biometric systems use naturally endowed traits, which include fingerprint, face, iris, voice, etc., for more reliable human identification and authentication. In this paper, various biometric modes are discussed along with their application within the domain of e-assessment. The results in this paper show that the need for multimodal biometrics and transparent authentication in an e-assessment is still an open issue.

Keywords—E-assessment; Biometrics; Multimodal Biometrics

I. INTRODUCTION

Traditionally, the assessment process has been considered a high-stakes process, and the inclusion of one or more proctors throughout the assessment process has been one of the means of ensuring that the assessment is conducted in a credible manner. Proctors, also known as invigilators, are present to address security concerns during an assessment. The C.I.A. (Confidentiality, Integrity, and Availability) security goals [1], which are meant to prevent the compromise of assets in computer network systems, are applicable to the summative e-assessment. Apart from the hardware and software used for the purpose of an e-assessment, the student undergoing the e-assessment is also an asset, and certain threats are posed by this asset. These threats include an illegal or unauthorized candidate taking an e-assessment, the dissimulation of identity detail, and the abuse or misuse of authenticity detail [2].

Due to the aforementioned threats posed by a student, the C.I.A. goals, though sufficient to protect the hardware and data used for the e-assessment, will not adequately address the threats posed by the student; thus three other goals for the summative e-assessment process can be deduced [2]:

Presence – Physical and online presence, which defines the state of an entity being in a particular environment.
Identity – The ability to distinguish one entity from another.
Authentication – Preventing the misuse of identity detail.

In an effort to guard against the aforementioned threats and achieve the goals of Presence, Identity, and Authentication, several techniques have been proposed with the use of biometrics for the authentication of learners in an e-assessment.

The main purpose of this paper is to discuss various biometric modes and their application for e-assessment. This paper is arranged as follows: Section 2 gives a general overview of biometrics, Section 3 biometrics performance measures, Section 4 multimodal biometrics, Section 5 applications of biometrics for e-assessment, and Section 6 challenges and open issues.

II. BIOMETRICS

Historically, identification of persons has been done through various means, including visual identification, gait recognition, fingerprint matching, presentation of access cards, usernames and passwords, etc. Biometrics is the ability to identify and authenticate an individual using one or more of their behavioral, physical or chemical characteristics [3]. A biometric recognition system relies on who you are or what you do [4]. Physical biometric characteristics include face, fingerprint, retina, iris, etc. Behavioral biometric characteristics include gait recognition, keystroke dynamics, mouse dynamics, etc. Chemical biometric characteristics involve the use of human DNA to identify a person. A biometric system is a pattern-recognition system which aims to recognize a person based on a feature vector deduced from physiological or behavioral characteristics that the person possesses.

Biometric recognition entails enrollment, authentication or identification tasks. Enrollment associates an identity with a biometric characteristic. In verification, an enrolled individual can proffer an identity and the system is responsible for verifying the authenticity of the proffered identity based on his or her biometric feature. An identification system identifies an enrolled individual based on their biometric characteristics without the individual having to claim an identity. Biometric recognition has become a principal method of human identification and verification.

Any human trait can serve as a biometric characteristic provided it satisfies the following measurement requirements: universality – each person should have the characteristic; distinctiveness – any two persons should be different in terms of the characteristic; permanence – the characteristic should not change drastically over a short period of time; collectability – the characteristic should be quantitatively measurable; and, for practical purposes, in addition: performance – it should meet accuracy, speed, and resource requirements; acceptability – it should be accepted by the intended population; and circumvention – it should not be easily hoodwinked [4, 5]. These seven measurement parameters are used to determine the usefulness of a biometric. For the e-assessment domain, the most important five of these are universality, distinctiveness, permanence, collectability, and circumvention.

Tables 1 and 2 show some physiological biometrics and behavioral biometrics, respectively, and how they measure against these five parameters on the scale of LOW (L) – MEDIUM (M) – HIGH (H).

Next we identify some behavioral and physical biometrics and discuss how they are used.

A. Keystroke dynamics

The keystroke dynamics biometric aims to obtain biometric data as a learner types on the computer keyboard. In order to authenticate a learner using keystroke biometrics, the following metrics are considered: typing speed, keystroke seek time, flight time, characteristic errors, and characteristic sequences [7]. The time between keystrokes is known as a digraph and is a vital part of the keystroke biometric data. In matching the keystroke data obtained with templates in a database, correlation is used. The correlation r can be
Table 1 Physiological biometric modes rating based on five characteristics [5] determined by equation 1 [8]:
𝑛
𝑖=1 (𝑘𝑖 ∗ 𝑡𝑖 )
𝑟=
Distinctiveness
Circumvention 𝑛 2 𝑛 2
(1)
Collectability
Permanence
Where
𝑘 = 𝑣𝑒𝑐𝑡𝑜𝑟 𝑜𝑓 𝑙𝑒𝑛𝑔𝑡 𝑛 𝑠𝑡𝑜𝑟𝑖𝑛𝑔 𝑡𝑒 𝑓𝑙𝑖𝑔𝑡 𝑡𝑖𝑚𝑒 𝑜𝑓 𝑡𝑒
Biometric 𝑠𝑡𝑜𝑟𝑒𝑑 𝑡𝑒𝑚𝑝𝑙𝑎𝑡𝑒
Fingerprint H H M M H 𝑡 = 𝑣𝑒𝑐𝑡𝑜𝑟 𝑜𝑓 𝑙𝑒𝑛𝑔𝑡 𝑛 𝑠𝑡𝑜𝑟𝑖𝑛𝑔 𝑡𝑒 𝑓𝑙𝑖𝑔𝑡 𝑡𝑖𝑚𝑒 𝑏𝑒𝑡𝑤𝑒𝑒𝑛
𝑘𝑒𝑦𝑠𝑡𝑟𝑜𝑘𝑒𝑠 𝑜𝑓 𝑎 𝑏𝑖𝑜𝑚𝑒𝑡𝑟𝑖𝑐 𝑠𝑎𝑚𝑝𝑙𝑒
Palmprint H H M M M
Retina H H H L L B. Mouse dynamics
Iris H H H M L Mouse dynamics, like keystroke dynamics, is a behavioral
biometric and is also captured non-intrusively as a learner
Face H M M H L interact with the computer (uses the mouse). Any mouse action
can be classified into one of four categories [9]:
Mouse-Move – Any general motion of the mouse
Drag-and-Drop – A mouse down action followed by a mouse up action

Table 2 Behavioural biometric modes rating based on five characteristics [5]
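As a concrete illustration, the template-matching step of equation (1) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation, and the 0.8 acceptance threshold in `verify` is an illustrative assumption, not a value from the paper.

```python
import math

def flight_time_correlation(template, sample):
    """Correlation r (equation 1) between the stored template's flight
    times k and a biometric sample's flight times t, both of length n."""
    if len(template) != len(sample):
        raise ValueError("template and sample must have equal length n")
    num = sum(k * t for k, t in zip(template, sample))
    den = math.sqrt(sum(k * k for k in template) * sum(t * t for t in sample))
    return num / den

def verify(template, sample, threshold=0.8):
    """Accept the sample if its correlation with the stored template is
    high enough (threshold chosen purely for illustration)."""
    return flight_time_correlation(template, sample) >= threshold
```

Identical flight-time vectors give r = 1, while completely dissimilar (orthogonal) vectors give r = 0, so a threshold close to 1 demands a near-exact typing-rhythm match.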
Abstract—Over a relatively short time span, mobile communication progressed from an expensive and unwieldy commodity to the brink of delivering a truly “agile” workplace. It has become important for organisations to invest in initiatives aimed at providing workers with a degree of mobility. The impact of mobility on organisational environments seems to be generally well understood. The researchers identified a gap with regard to a holistic analysis of the requirements of mobile environments. The question as to what framework can be defined to guide organisations when designing environments to support mobile knowledge workers was identified.

Having analysed data gathered via an online questionnaire against the current literature knowledge base, a layered approach to mobile environments, namely the SAND framework, was designed. Addressing the aspects captured in the layers of the framework can assist organisations in designing successful and optimal mobile environments.

Keywords—Mobile workforce; Mobile knowledge workers; Mobile worker requirements; Challenges facing mobile workers; Mobile worker perceptions; Framework

I. INTRODUCTION

Mobile work has introduced a wide range of opportunities into the technological and economic landscape. With the continuing increase in mobile device capabilities, globalisation, and the dramatically fast “need now, want now” attitude of consumers and clients, organisations can no longer ignore the mobility trend, lest they suffer the consequences of missed business due to being unable to react as fast as their competitors.

Despite literature addressing best practices and guidelines for mobile knowledge worker support, a framework that brings these together into a single holistic whole remains missing. Organisations consequently need to spend much time researching and gathering information to guide them when implementing their own mobile initiatives [37]. All of this leads to increased cost while competitors gain traction in the market [1].

The question of “what framework may be defined to address the needs of mobile workers” therefore gained importance as the trend for mobile work gains momentum, aimed at realising a number of benefits for a company. However, failing to elicit the necessary requirements or hastily implementing mobility initiatives may become a costly exercise [2].

The layout of the paper is as follows: Section 2 addresses the research questions that inspired this research, while existing models and frameworks are discussed in Section 3. The research methodology followed to gather and analyse the data is outlined in Section 4. Section 5 describes the layers of the SAND framework. Sections 6 and 7 present the conclusion and directions for future work respectively.

The research questions are defined hereunder.

II. RESEARCH QUESTIONS

Main RQ: What framework/model can be defined to address the requirements of mobile knowledge workers?

Sub Research Questions

Our main question may be deconstructed into the following sub-RQs:

RQ1: What activities and tasks are performed by mobile knowledge workers?
RQ2: Which technologies are being utilised by mobile knowledge workers?
RQ3: What constraints are being experienced by mobile knowledge workers?

The answers to the above sub-questions led to the researchers devising a framework, coined the SAND Framework, that brings these factors together.

Such a framework may be used by organisations as a guide during the implementation of mobile environments, thereby providing them with a set of guidelines to act faster, hence reducing the risk of financial losses.

In building the framework, specific focus was placed on identifying the different aspects of mobility, including the challenges and constraints that are faced by mobile knowledge workers when performing tasks in a mobile environment.
Mobile technology – Technology ought to enable workers to seamlessly connect to the network in order to enable

Requirement – Brief discussion

Heterogeneity and Interoperability – Mobile workers may utilise various heterogeneous devices, such as laptops, PDAs and smart-phones, each with their own hardware features and computing capabilities. Thus, the mobile system should allow the user to seamlessly switch between different hardware platforms.

Communication – As mobile workers tend to be on the move to carry out their activities, some of them may not be reachable for certain periods. Therefore, the mobile system should provide mechanisms for automatic connection/disconnection, synchronous/asynchronous communication and attended/unattended message delivery.

Awareness – Mobile workers should be able to be aware of changes in the environment, be it document changes or new information becoming available. The mobile system should, therefore, be able to provide information on both offline (changes/new information) and online (list of connected users and their locations) changes to the mobile environment.

The notion of dividing mobile environments into layers is discussed by Neyem et al. [6]. They indicate that mobile environments ought to be designed on two layers, namely:

Infrastructure – The underlying infrastructure should cover both wireless and wired communication services in order to provide a stable platform on a mass scale. The infrastructure will need to be technically heterogeneous, geographically dispersed, and institutionally complex without a centralised coordination mechanism.

Services – While the infrastructure should be heterogeneous and stable, the services will require personalisation; be able to be dynamically configured and modified; and be combined to meet the personal needs of the worker.

Most authors (see [8], [9] and [5]) seem to agree on several issues, commonalities and the interrelation of various concepts pertaining to the implementation of mobile environments. These models were not always shown to address the requirements from the mobile worker’s perspective, but rather from an organisational standpoint, focussing on aspects such as data protection and governance. The differences between the said models could result from the use of different research methodologies, strategies and viewpoints by the various authors. Having compared the results of our survey with the above-mentioned studies, it is apparent that not all issues garner the same level of importance, and that the viewpoints of the mobile workers often did not form part of the said studies.

IV. RESEARCH METHODOLOGY

The literature generally deals with the implementation of a mobile environment from the organisational standpoint, but lacks input from the actual or future users of these environments. When the needs of the users are not included in the design of a mobile environment, they very often become estranged and unable to successfully and optimally function in the said environment. In order to gather input from the users pertaining to their experiences and their requirements for a successful mobile environment, it is important to gain greater insight into the daily work and challenges of mobile workers.

The first two phases of the study entailed gaining an improved understanding of the activities and requirements of mobile knowledge workers. This was accomplished by administering a questionnaire, designed to extract as much relevant information as feasible, in order to perform relevant data analyses.

The questionnaire was built around the constructs inspired by our research questions as well as groups of issues related to mobile environments as identified during the literature review phase of the study. The literature groups (constructs) are:

Services

Questions in this construct deal with the respondents’ experience regarding the services that were made available for mobile working. The respondents were also afforded the opportunity to comment on services they would like to have access to.
Flexibility and Autonomy

Flexibility and Autonomy questions determine the level of flexibility and autonomy the respondents are afforded when performing mobile work.

Technology

Technology-related questions gauged the respondents’ access to the technology that enables them to perform mobile work. Apart from general technology aspects, questions pertaining to connectivity and availability were also posed.

Awareness and Collaboration

With these questions, the goal was to:

Determine how aware the respondents were of their team members, and whether this was in fact a necessary requirement; and
Ascertain whether the respondents collaborated with their team members, and if so, how such collaboration was done and how often it occurred.

Privacy and Security

Questions related to privacy and security dealt with the respondents’ perceived need for privacy and security of their data and online persona.

During the third phase, the researchers analysed the data collected and determined the relevant patterns that emerged from the information provided. In the final phase, the researchers focussed on the requirements that emerged from the perspective of the mobile knowledge worker, and devised the SAND framework that can be utilised by organisations when entering into their own mobility initiatives.

The analyses of the data collected allowed the researchers deeper insight into the daily work and challenges of mobile workers, and highlighted the pertinent points that should be included in the design of a framework to optimally meet the needs of the users. Several recommendations and best practices stemmed from this.

V. BUILDING THE SAND FRAMEWORK

Analyses of the results indicated that various challenges are more prominent when they are addressed without adequate support for the underlying aspects related to such challenges. One such aspect emerging from the analyses is that mobile environments may usefully be designed as a series of layers, with each layer building upon previous layers.

The approach taken by the SAND framework therefore allows organisations to divide issues pertaining to mobility-enabling environments into various layers, thereby providing them with a set of recommendations that focuses on the different areas of mobile environments.

By dividing focus into different layers, the SAND framework has the potential of providing an understanding of the needs of mobile workers in a linear and logical manner, and of highlighting the variables of a successful mobile environment. The SAND layers are underpinned by related security issues. It is, however, important to recognise that the layers are interdependent. By implication, although they can function modularly, the layers that make up a mobile environment function best when addressed as a holistic whole.

The SAND layers are discussed below.

A. Devices and Hardware

One of the main challenges that mobile workers face pertains to the physical devices and hardware that they use to interact with the mobile environment. By grouping these challenges together, the first layer of the SAND framework came to the fore.

Naturally, the devices and hardware layer is of significant importance, as failing to equip mobile workers with adequate hardware and devices can have a noticeable effect on the efficiency of interactions with the environment.

Two main issues are highlighted in this layer:

Physical requirements of mobile devices
BYOD (Bring Your Own Device) strategies

The above two aspects are closely related, especially when employees require guidance with regard to the procurement of devices.

For mobile workers to perform their daily tasks efficiently, it was found that four physical requirements are of importance:

Processing power of the device
Size of the device’s screen
Capacity of power sources
Portability of the device

These findings are consistent with those of [10], which claims that these requirements are also a main concern for manufacturers.

According to [11], perhaps the most distinguishing characteristic of mobile devices is portability, and in order to achieve maximum portability, manufacturers generally strive to find a balance between device size and user expectations.

The second aspect addressed by the Devices and Hardware layer of the SAND framework pertains to BYOD (Bring Your Own Device) strategies. Generally, BYOD does not constitute a generic technology; rather, it is driven by a multitude of technological trends and constitutes a business policy of allowing employees to use personally-owned devices to access organisational resources, such as email, intranets and networks [12].
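The physical-requirements aspect of this layer can be made concrete with a small screening structure; the attribute names and threshold values below are illustrative assumptions for the four requirements identified above, not figures from the study.

```python
from dataclasses import dataclass

@dataclass
class DeviceSpec:
    cpu_ghz: float        # processing power of the device
    screen_inches: float  # size of the device's screen
    battery_hours: float  # capacity of power sources
    weight_kg: float      # portability proxy (lighter is more portable)

def meets_mobile_baseline(spec, min_cpu=1.5, min_screen=5.0,
                          min_battery=8.0, max_weight=2.0):
    """Screen a candidate device against the four physical requirements.
    The thresholds are placeholders an organisation would tune."""
    return (spec.cpu_ghz >= min_cpu
            and spec.screen_inches >= min_screen
            and spec.battery_hours >= min_battery
            and spec.weight_kg <= max_weight)
```

Such a check could support procurement guidance or the vetting of employee-owned devices under a BYOD policy.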
The results of our survey indicate that some kind of BYOD strategy is increasingly expected by mobile employees. However, BYOD introduces a range of concerns for companies, such as privacy and the safeguarding of company resources and information [13].

In order to address concerns related to BYOD strategies, the SAND framework proposes that focus be placed on at least the following aspects:

Where feasible, allow mobile workers to use their own devices;
Investigate the feasibility and related risks of implementing BYOD initiatives;
Policies and procedures regulating acceptable use and standards should be incorporated;
Solutions that offer remote management of devices ought to be considered;
Time should be spent analysing the security of the network, especially at the network edges, in order to provide maximum security of network resources;
Employees should be provided with education and awareness pertaining to the consumption of network resources.

B. Network connectivity

Irrespective of how functionally and technologically superior a device or piece of hardware might be, without proper remote connectivity technologies and platforms, users may still struggle to connect remotely and access company resources via extranets and/or intranets. Though the quality of the connectivity technology might largely be out of the hands of the organisation, several measures can be put in place to alleviate some of the concerns raised by connection quality.

The very nature of mobility increases the volatility of remote connections [14]. The efficiency of remote connectivity technologies is a major concern in mobile environments and has a decided influence on how successful a mobile worker is at interacting with company resources.

Despite connectivity challenges, mobile workers still expect an efficient mobile experience, high quality of service and ubiquitous coverage [15].

Based on the issues reported in this study, connectivity issues can be grouped into four categories:

No connectivity: Users are unable to connect.
Slow connectivity: Users can connect, but the data transfer speed is experienced as relatively slow.
Unstable/Unreliable connectivity: Users can connect, but the connection is unstable.
Successful connectivity: Users can connect and authenticate, and are able to successfully interact with resources.

The priority and strategies for addressing these issues will, in part, determine how architects design the software for mobile workers as well as the measures that are put in place to reduce or mask these issues.

Network connectivity and related issues are subsequently grouped together into the second layer of the SAND framework.

The SAND framework subscribes to the notion that the more autonomous a mobile system is designed to be, the better it is able to tolerate various connectivity issues, and the less sensitive it becomes to bandwidth fluctuations and unstable connections.

In order to achieve a measure of success when addressing network connectivity issues, developers should consider the offline usage capabilities of the application, as well as data synchronisation strategies [16].

C. Applications and Services

Addressing network connectivity issues leads to the next layer of the SAND framework, namely Applications and Services.

Much has been written about mobile application development and the best practices that govern it [16], [17], [18]. The development of applications for mobile devices has experienced exponential growth in recent years, especially since the opening of the iPhone AppStore in 2008 [19].

With the increasing popularity and complexity of devices, the number of scenarios that need to be considered is also increasing [20]. Applications capable of integrating and interacting with organisational resources are becoming particularly challenging, as aspects such as platform heterogeneity, data security and privacy need to be considered.

Modern consumers have an increasing number of devices and platforms to choose from; hence, developing cross-platform applications is becoming difficult.

Without an adequate cross-platform development strategy in place, companies may face the need to develop applications for each target platform [21], which adds to the costs of developing and maintaining applications.

The ultimate goal of cross-platform application development is therefore to create applications that are able to run on as many platforms as possible while spending the least amount of time on development and achieving optimal performance.
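The four connectivity categories identified above lend themselves to a simple diagnostic mapping. The sketch below is illustrative only: the drop-rate and throughput thresholds are assumptions, not values from the study.

```python
def classify_connectivity(can_connect, drop_rate, throughput_kbps,
                          max_drop_rate=0.05, min_throughput_kbps=256.0):
    """Map observed connection behaviour onto the four connectivity
    categories reported in the study (thresholds are illustrative)."""
    if not can_connect:
        return "No connectivity"
    if drop_rate > max_drop_rate:
        return "Unstable/Unreliable connectivity"
    if throughput_kbps < min_throughput_kbps:
        return "Slow connectivity"
    return "Successful connectivity"
```

A monitoring agent using such a mapping could, for example, trigger offline mode and deferred synchronisation whenever the category degrades below "Successful connectivity".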
Another complexity related to the development of applications and services stems from the fact that the experience of working on a mobile device is quite different to working on a desktop computer.

Applications for mobile devices are generally more constrained than their computer-based counterparts with regard to various aspects, such as smaller displays, different ways of interacting with the application, and data transmission rates.

Most company networks host a variety of resources that are required by mobile workers in order to accomplish their tasks [22]. How applications interact with these resources is an important factor to consider when designing and choosing applications that will be deployed to mobile employees. Applications ought to be intelligently integrated into existing applications and architecture.

At least two issues can be seen as critical to the success of a mobile application, namely the availability of the application and the ease with which mobile workers are able to interact with and access company resources.

The recommendations suggested by the SAND framework are that companies need to think not only in terms of platform heterogeneity but also in terms of the network conditions and environment in which the applications will be used. Hence, the goal is not only to procure or develop applications that are appropriate for the tasks to be performed, but ones which will also provide the users with a satisfying user experience. This includes the availability of the application and the ability to easily and effectively access company resources.

D. Support

The research data has shown that without addressing the “soft” issues surrounding mobile initiatives, the risk of alienating employees is increased. When employees are estranged, they tend to resist change and generally decline to buy into new strategies. Factors such as workplace inclusion, organisational identity, employee awareness programmes and training thus present hurdles that should not be ignored [23].

Consequently, a third layer of the SAND framework is introduced. This layer deals with the support and “soft” issues that need to be addressed to provide an optimal environment for mobile workers. It also addresses, to some degree, the management of mobile workers.

Baumeister and Leary [24] argued that the need to belong is a fundamental human motivation. The implication is that all humans have an inherent need to belong to, and be accepted by, a group. In an organisational context, these groups can include their peers, their department, or their employers.

Mobile workers are not physically present in the office as their peers are, and this may cause the workers to experience a feeling of being “left out”. Interpersonal communication is essential to the motivation and psyche of mobile workers [25]. Therefore, finding effective methods of making mobile workers feel like “part of the group” and not alienating them from the company is an important consideration in mobile initiatives.

By comparing the results of this study to those of [24], it may be argued that organisational identity has ties to employees’ commitment to their employers. It is therefore important for companies to foster good relationships with their employees and to provide ways in which employees can build organisational identity.

Naturally, employees can only form an identity with respect to aspects they are aware of. Consequently, the effectiveness of communication channels between employees, their peers and leadership directly impacts identity building [26]. The SAND framework supports this notion: companies should set open communication policies aimed at facilitating two-way communication.

It can be argued that a lack of face-to-face communication is an inherent challenge of mobile initiatives. It has furthermore been shown by [27] that poor communication skills have an impact on how the communicator and his/her message are perceived. The implication is that different strategies need to be investigated to resolve communication-related issues amongst the mobile workforce, as proper communication channels are crucial for co-ordination between members [28].

A further “soft” issue surrounding mobile initiatives revolves around the education and training of mobile workers. While mobile workers generally experience much more freedom than their office-bound counterparts, this perceived freedom may have disruptive side-effects, such as anti-social behaviour, distraction and a blurring of work/life boundaries. Training on how to cope with the challenges presented by mobile work was noted as beneficial to mobile workers [29].

Mobile workers generally remove intellectual resources from company premises, and it is important that these workers appreciate the risks involved and, ultimately, are able to accept responsibility for the company resources they access and utilise off-premise.

It is, however, not only the mobile workers who need educational support. Managers of teams that include mobile workers need to learn the skills necessary to manage these types of teams.

E. Security

Irrespective of how the previously mentioned layers are implemented, a final, and arguably the most all-encompassing, set of issues that need to be addressed deals with the security concerns affecting the mobile environment.

Although various security strategies have been proposed [30], [31], these artefacts have been shown to have several
attributes in common, thus allowing the researchers to categorise the various aspects. Once these aspects have been categorised, they can be shown to be related to the categories (or groupings) identified in sections 5.1 to 5.4.

The layers as discussed above can be synthesised as in Fig. 2.
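The layered synthesis can be summarised as a plain mapping from each SAND layer to the issues grouped under it in sections 5.1 to 5.4. The short issue labels below are paraphrases, and treating security as cross-cutting follows the discussion above; this is a summary sketch, not a structure defined by the paper.

```python
# Each SAND layer mapped to the main issues grouped under it (paraphrased).
SAND_LAYERS = {
    "Devices and Hardware": [
        "physical requirements of mobile devices",
        "BYOD strategies",
    ],
    "Network Connectivity": [
        "no / slow / unstable / successful connectivity",
        "offline usage and data synchronisation",
    ],
    "Applications and Services": [
        "cross-platform development",
        "integration with company resources",
    ],
    "Support": [
        "organisational identity and inclusion",
        "communication channels",
        "education and training",
    ],
}

# Security underpins every layer rather than forming a separate tier.
CROSS_CUTTING = "Security"
```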
In order to successfully implement mobile initiatives, diligent planning and an understanding of the elements of a mobile environment are required. Companies should not hastily embark on mobility efforts. The goal is to find a balance between the requirements of the users and the requirements of the company, and to establish an environment that is not only capable of providing optimal support, but is also safe, secure, modular and scalable, and able to seamlessly integrate with existing infrastructure.

REFERENCES

[3] Volker Derballa and Key Pousttchi, "Extending Knowledge Management to Mobile Workplaces," ICEC'04 Sixth International Conference on Electronic Commerce, pp. 583-590, 2004.
[4] Bettina Beurer-Zeullig and Miriam Meckel, "Smartphones Enabling Mobile Collaboration," Proceedings of the 41st Hawaii International Conference on System Sciences, pp. 1-10, 2008.
[5] Torsten Brodt, Liz Carver, Terrence Fernando, Hans Schaffers, and Robert Slagter, Mobile Virtual Work, Erik Andriessen and Matti Vartiainen, Eds.: Springer Berlin Heidelberg, 2006.
[6] Andres Neyem, Sergio Ochoa, and Jose Pino, "Designing mobile shared workspaces for loosely coupled workgroups," CRIWG, pp. 173-190, 2007.
[7] Torsten Brodt, Marc Pallot, Wolfgang Prinz, and Hans Schaffers, The Future Workspace - Perspectives on Mobile and Collaborative Working.: Telematica Instituut (Enschede), 2006.
[8] Valeria Herskovic, Andres Neyem, Sergio Ochoa, and Jose Pino, "General Requirements to Design Mobile Shared Workspaces," in Computer Supported Cooperative Work in Design, 2008, pp. 582-587.
[9] Lasse Berntzen, Tore Engvig, Per Hasvold, and Bente Skattor, "A framework for mobile services supporting mobile non-office workers," Human Computer Interaction, vol. 4, pp. 742-751, 2007.
[10] Anar Gasimov, Chee Wei Phan, Juliana Sutanto, and Chuan-Hoo Tan, "Visiting mobile application development: What, how and where," Ninth International Conference on Mobile Business, pp. 74-78, 2010.
[11] Yufei Yuan and Wuping Zheng, "From stationary work support to mobile work support: A theoretical framework," Proceedings of the International Conference on Mobile Business, pp. 1-7, 2005.
[12] Louis Jonker, "Bring your own device: Policies and contracts," ITechLaw European Conference, pp. 3-20, 2012.
[13] Rupali Gurudatt, Sangita Mohite, Rajnikant Shelke, and Shirkant Belsare Vikas Solanke, "Mobile Cloud Computing - Bring your own device," Fourth International Conference on Communication Systems and Network Technologies, p. 566, 2014.
[14] Chevonne Dancer, Jacqueline Jackson, and Gordon Skelton, "Teaching software engineering through the use of mobile application development," Journal of Computing Sciences in Colleges, vol. 28, no. 5, pp. 39-40, 2013.
[15] Tolga Ayhan, Leonid Kazovsky, and Shing-Wa Wong, "Hybrid optical-wireless access networks," Proceedings of the IEEE, vol. 100, no. 5, pp. 1197-1225, 2012.
[16] Sybase, "Best practices for developing mobile applications: The data driven approach," 2007.
[17] Motorola, "Best practices for developing enterprise smartphone apps," 2012.
[18] AQuA, "Best practice guidelines for producing high quality mobile applications," 2013.
[19] Anthony Wasserman, "Software Engineering Issues for Mobile Application Development," FoSER, pp. 397-400, 2010.
[20] Stefan Diewald, Matthias Kranz, Andreas Moller, and Luis Roalter, "Towards a holistic approach for mobile application development in intelligent environments," MUM, p. 73, 2011.
[21] Spyros Xanthopoulos and Stelios Xinogalos, "A comparative analysis of cross-platform development approaches for mobile applications," Proceedings of the 6th Balkan Conference in Informatics, pp. 213-219, 2013.
[22] Marco Casole and Yi Cheng, "Secure access to corporate resources in a multi-access perspective: Needs, problems, and solutions," The Institution of Electrical Engineers, pp. 482-487, 2003.
[23] Debra Major, Valerie Morganson, and Kurt Oborn, "Comparing telework locations and traditional work arrangements," Journal of Managerial Psychology, vol. 25, no. 6, p. 583, 2010.
[24] Roy Baumeister and Mark Leary, "The need to belong: Desire for interpersonal attachments as a fundamental human motivation," Psychological Bulletin, vol. 117, no. 3, pp. 497-529, 1995.
[25] Tomonori Hashiyama, Junko Ichino, Yasuhiro Kojima, and Shun'ichi Tano, "Casual Multi-user Web Collaboration by Lowering Communication Barriers," Tokyo, 2010.
[26] Jos Bartels, Menno De Jong, Oscar Peters, and Ad Pruyn, "Horizontal and vertical communication as determinants of professional and organisational identification," Personnel Review, vol. 39, no. 2, pp. 210-226, 2009.
[27] Jonathan Gardner and Lamar Reinsch, "Do communication abilities affect promotion decisions? Some data from the C-suite," Journal of Business and Technical Communication, vol. 28, no. 1, pp. 31-57, 2013.
[28] E. Haribabu, C. Lakshmi, and Avvari Mohan, "Intra-organizational communication and aspects of organisation culture: A workplace study among knowledge worker teams of two IT firms in Hyderabad," Management of Innovation and Technology, vol. 1, pp. 131-135, 2006.
[29] IBM, "The mobile working experience," p. 12, 2005.
[30] Intel, "Enterprise Mobility: Increasing productivity for mobile users," 2013.
[31] Al Salam and Rahul Singh, "Semantic information assurance for secure distributed knowledge management: A business process perspective," IEEE Transactions on Systems, Man and Cybernetics, vol. 36, no. 3, pp. 472-474, 2006.
[32] Vipin Chaudhary, Hanping Lufei, and Weisong Shi, "Adaptive secure access to remote services in mobile environments," IEEE Transactions on Services Computing, vol. 1, no. 1, p. 49, 2008.
[33] Lu Ma and Jeffrey Tsai, "Formal modeling and analysis of a secure mobile-agent system," IEEE Transactions on Systems, Man and Cybernetics, vol. 38, no. 1, p. 180, 2008.
[34] Max Landman, "Managing smart phone security risks," InfoSec '10, pp. 145-154, 2010.
[35] NIST, "Improving critical infrastructure cybersecurity," 2013. [Online]. https://www.huntonprivacyblog.com/wp-content/files/2013/10/preliminary-cybersecurity-framework.pdf
[36] Sybase, "Best Practices for Developing Mobile Applications: The data driven approach," iAnywhere Solutions, 2007.
[37] E. V. D. Kar, S. M. Muniafu, and Y. Wang, "Mobile services in unstable environments: Design requirements based on three case studies," Proceedings of ICEC, pp. 302-308, 2006.
Characteristics of Transfer Function of Power Lines
having Tapped Branches in the Propagation Path
Banty Tiru
Department of Physics
Gauhati University
Guwahati-14, India
banty_tiru@rediffmail.com
Abstract— Power lines can be used for communication purposes. However, the channel is highly frequency selective owing to discontinuities in the network. In this work, the effect of tapped branches in different topologies (STAR and BUS) is studied. The frequency selectiveness is measured in terms of the Coherence Bandwidth (CBW). It is found that the CBW is larger for a STAR branch than for a BUS one. In the STAR topology, the CBW first decreases and then increases as the number of taps at the node grows. For a large number of components, the transfer function of a STAR branch is governed by the characteristics of the connecting cable, while in the BUS topology it becomes very complex. The paper gives an idea of how the efficiency of the communication system depends on the topologies of the branches encountered in the propagation path.

Keywords—Power Line Communication, Complex Trigonometric Hyperbolic Functions, ABCD matrices, Transmission Line, Star Topology, Bus Topology.

I. INTRODUCTION

Power line communication (PLC) uses the ubiquitous power line (PL) to meet the data requirements of access, in-house and control applications [1]. PLC is also used as a component of hybrid networks incorporating both cabled and wireless channels [2]. PLs are, however, among the worst channels to deal with, characterized by a time-varying, frequency-selective transfer function (TR). In-depth knowledge is therefore required to predict the efficiency of the devices.

The frequency selectiveness of PL channels reflects the multipath nature of the medium, arising from the many discontinuities in the communication path. Notches occur when the communication signal reaches the receiver via different paths having different propagation delays and opposite phase. The worst cases occur when the signal encounters open- or short-circuited branches. The number of notches increases with the length of the branches, and the notches are periodic in nature. Many researchers have analyzed the problem of notch formation, and the underlying processes are now clearly understood. The dependence on channel variables such as branch length, loads, etc. [3]-[10] has been studied elaborately. However, most papers target simple branches in the propagation path like that shown in Fig. 1(a).

Fig. 1. Different types of branches CD in the propagation path AB. (a) Simple branch (b) Branch with a STAR (c) BUS topology.

In practical networks, complex branches with STAR and BUS topologies are found. In the former, a large number of branches meet at a single node; in the latter, they are connected serially. This paper studies the effect of the different topologies on the communication path. The frequency selectiveness is measured in terms of the coherence bandwidth (CBW), defined as the range of frequencies over which the channel can approximately be considered flat. The study enables the efficiency of PL in a practical network to be predicted. The PL channels are assumed to have linear time-invariant behavior, modeled as a two-port network (2PN). Section II of the paper describes the use of ABCD matrices in the evaluation of the TR and their application to PLs. In Section III, the effect of different types of branches on the TR is studied. A comparative study of the CBW is done in Section IV. The paper concludes with the key results.

II. TRANSMISSION MATRIX OF A TWO PORT NETWORK AND APPLICABILITY TO POWER LINE CHANNEL MODELING

The transmission, chain or ABCD matrix of a 2PN (Fig. 2) is defined in terms of the input voltage v1, input current i1, output voltage v2 and output current i2 as in (1)

[ v1 ]   [ A  B ] [ v2 ]
[ i1 ] = [ C  D ] [ i2 ]                                        (1)

Fig. 2. ABCD matrix of a 2 Port Network

For source impedance Z_S and load impedance Z_L, the transfer function is

H(f) = Z_L / (A Z_L + B + C Z_S Z_L + D Z_S)                    (2)
Here, Z_S and Z_L are the source and load impedances respectively. The ABCD matrix of a complex network is equal to the product of the matrices of its cascaded sections. A PL network can be taken as a 2PN with cascaded sections of transmission line (TL) and branch taps. For a PL of length l, characteristic impedance Z_0 and propagation constant γ, the ABCD matrices of a simple TL section and of a branched tap are given by (3)

ABCD_TL = [ cosh(γl)        Z_0 sinh(γl) ]  ;  ABCD_Tap = [ 1        0 ]
          [ sinh(γl)/Z_0    cosh(γl)     ]                [ 1/Z_in   1 ]     (3)

Fig. 3. Plot of absolute value of complex hyperbolic (a) cotangent (b) tangent function showing periodicity along the imaginary axis.
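The evaluation of H(f) from eqs. (1)-(3) is straightforward to sketch in code. The snippet below cascades the ABCD matrices of two line sections around one tapped branch. It assumes a lossless line (α = 0) and matched source and load (Z_S = Z_L = Z_0), which are simplifications of the paper's model, and all function names are mine.

```python
import numpy as np

# Per-metre line constants from the paper's measurements; attenuation is
# neglected here (lossless assumption, not the paper's full model).
L, C = 0.69e-6, 38e-12          # H/m and F/m
Z0 = 135.0                      # high-frequency characteristic impedance, ohms
ZS = ZL = Z0                    # matched source and load (assumed)

def abcd_tl(gamma, length):
    """ABCD matrix of a transmission-line section, eq. (3)."""
    g = gamma * length
    return np.array([[np.cosh(g), Z0 * np.sinh(g)],
                     [np.sinh(g) / Z0, np.cosh(g)]])

def z_in_branch(gamma, l_br, z_br):
    """Input impedance of a branch of length l_br loaded by z_br, eq. (4)."""
    t = np.tanh(gamma * l_br)
    return Z0 * (z_br + Z0 * t) / (Z0 + z_br * t)

def transfer_function(f, l1, l2, l_br, z_br):
    """H(f) of path A-B: line l1, shunt tap, line l2, via eqs. (1)-(3)."""
    w = 2.0 * np.pi * f
    gamma = 1j * w * np.sqrt(L * C)            # lossless propagation constant
    tap = np.array([[1.0, 0.0],
                    [1.0 / z_in_branch(gamma, l_br, z_br), 1.0]])
    m = abcd_tl(gamma, l1) @ tap @ abcd_tl(gamma, l2)
    A, B, Cm, D = m[0, 0], m[0, 1], m[1, 0], m[1, 1]
    return ZL / (A * ZL + B + Cm * ZS * ZL + D * ZS)   # eq. (2)
```

With a nearly open 4.55 m branch, |H(f)| dips sharply near 10.7 MHz, the first admittance maximum quoted for that length; without a tap, a matched lossless line gives |H| = 0.5.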
where Z_in is the input impedance of the branch of length l_br terminated by a load Z_br. Z_in is given by

Z_in = Z_0 (Z_br + Z_0 tanh(γ l_br)) / (Z_0 + Z_br tanh(γ l_br))                    (4)

For open- and short-circuited branches, the input impedance (and input admittance) is obtained by substituting the corresponding load limits in (4), yielding (5) and (6)

Z_in_open  = lim (Z_br → ∞) Z_in = Z_0 coth(γ l_br) ;  Y_in_open  = tanh(γ l_br) / Z_0     (5)

Z_in_short = lim (Z_br → 0) Z_in = Z_0 tanh(γ l_br) ;  Y_in_short = coth(γ l_br) / Z_0     (6)

The periodicity of the notches arises from the periodicity of the hyperbolic tangent and cotangent functions (Fig. 3). Notches are found whenever the input admittance (impedance) becomes a maximum (minimum), so that most of the signal is shorted away from the load. Z_0 and γ depend on the primary line constants of the cable, termed R (resistance/length), L (inductance/length), G (conductance/length) and C (capacitance/length). At angular frequency w they are given by (7)

Z_0 = sqrt( (R + jwL) / (G + jwC) ) ;  γ = sqrt( (R + jwL)(G + jwC) ) = α + jβ     (7)

α and β are termed the attenuation constant (nepers/m) and phase constant (radians/m) of the cable respectively. Both are frequency-dependent and are given by (8) [11]

α = sqrt( (1/2) [ sqrt( (R² + w²L²)(G² + w²C²) ) + (RG − w²LC) ] )

β = sqrt( (1/2) [ sqrt( (R² + w²L²)(G² + w²C²) ) − (RG − w²LC) ] )                  (8)

In the simulation, the constants are taken from measurements on a real network: L = 0.69 µH/m, G = 0.018 µmho/m (Siemens/m) and C = 38 pF/m. For high frequencies, Z_0 ≈ 135 Ω.

III. EFFECT OF DIFFERENT BRANCHES IN THE COMMUNICATION PATH

A communication path from A to B encounters different branches in the propagation path, like those shown in Fig. 1. A PL network can consist of either of the configurations, or a hybrid system consisting of both.

A. Simple Branch in the Propagating Path

In a simple branch CD in the propagating path AB, as the length of the branch is increased, the number of admittance peaks, and hence the number of notches, also increases. For l_br = 4.55 m, Y_open has maxima at 10.7 MHz and 32.2 MHz and minima at 21.5 MHz and 42.9 MHz.
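As a numerical check of (7)-(8) and of the quoted notch positions, the sketch below evaluates α and β from the measured constants; R is not quoted in the paper, so it is set to zero here, an assumption that makes the line effectively lossless.

```python
import numpy as np

L, G, C = 0.69e-6, 0.018e-6, 38e-12   # per-metre constants from the paper
R = 0.0                               # not quoted in the paper; assumed negligible

def alpha_beta(f):
    """Attenuation and phase constants from eq. (8)."""
    w = 2.0 * np.pi * f
    s = np.sqrt((R**2 + w**2 * L**2) * (G**2 + w**2 * C**2))
    alpha = np.sqrt(0.5 * (s + (R * G - w**2 * L * C)))
    beta = np.sqrt(0.5 * (s - (R * G - w**2 * L * C)))
    return alpha, beta

# Y_open = tanh(gamma*l)/Z0 peaks where beta*l = (2k+1)*pi/2, i.e. at
# f_k = (2k+1) * v / (4*l) with phase velocity v ~ 1/sqrt(L*C).
l_br = 4.55
v = 1.0 / np.sqrt(L * C)
f_peaks = [(2 * k + 1) * v / (4 * l_br) for k in range(2)]
```

With these constants the first two maxima come out at about 10.7 MHz and 32.2 MHz, matching the values quoted in the text.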
Fig. 4. Magnitude of transfer function of AB with (a) and (c) different branch length and (b) and (d) variation of Z_in and Y_in for the two lengths.

Fig. 5. STAR configuration

For the STAR configuration of Fig. 5, the input impedances of the two branches are

Z_in1 = Z_0 (Z_L1 + Z_0 tanh(γ l_br1)) / (Z_0 + Z_L1 tanh(γ l_br1)) ;
Z_in2 = Z_0 (Z_L2 + Z_0 tanh(γ l_br2)) / (Z_0 + Z_L2 tanh(γ l_br2))                 (10)

or, in terms of admittances,

Y_in1 = (1/Z_0) (Z_0 Y_br1 + tanh(γ l_br1)) / (1 + Z_0 Y_br1 tanh(γ l_br1)) ;
Y_in2 = (1/Z_0) (Z_0 Y_br2 + tanh(γ l_br2)) / (1 + Z_0 Y_br2 tanh(γ l_br2))         (11)

where

Y_br1 = 1/Z_br1 ;  Y_br2 = 1/Z_br2                                                  (12)

Y_total = Y_in1 + Y_in2 ;  Z_total = 1/Y_total                                      (13)

The input admittance at node 'N2' is then

Y_IN2_STAR = (1/Z_0) (Z_0 Y_total + tanh(γl)) / (1 + Z_0 Y_total tanh(γl))          (14)

For n branches at the node, with Y_1 = Y_in1, Y_2 = Y_in2, ..., the same can be written as (16) and, equivalently, as the ratio (17)

Y_INn_STAR = ( Σ(n) Y_n + Y_open ) / ( 1 + Z_0² (Σ(n) Y_n) Y_open )                 (16)

Y_IN2_STAR = (1 + Z_total Y_open) / (Z_total + Z_0² Y_open) = n/d                   (17)

For a small number of branches with STAR topology, Y_in depends more on the lengths of the branches at the node; for large numbers of branches whose lengths vary randomly, it depends mainly on the branch connecting the node to the main line. More analysis is done in Section IV.
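The STAR relations (11)-(16) translate directly into code. The sketch below (lossless line, same constants as before; the list-of-pairs interface and function names are mine) sums the branch admittances at the node and transforms the total through the connecting cable:

```python
import numpy as np

Z0 = 135.0
L, C = 0.69e-6, 38e-12

def y_in_branch(gamma, l_br, z_br):
    """Input admittance of one loaded branch, eqs. (11)-(12)."""
    t = np.tanh(gamma * l_br)
    y_br = 1.0 / z_br
    return (Z0 * y_br + t) / (Z0 * (1.0 + Z0 * y_br * t))

def y_in_star(f, l_conn, branches):
    """Admittance seen from the main line for a STAR node, eqs. (13)-(16).
    `branches` is a list of (length, load impedance) pairs; their input
    admittances add at the node (Y_total), and the connecting cable of
    length l_conn transforms Y_total back to the line."""
    w = 2.0 * np.pi * f
    gamma = 1j * w * np.sqrt(L * C)            # lossless assumption
    y_total = sum(y_in_branch(gamma, l, z) for l, z in branches)
    t = np.tanh(gamma * l_conn)
    return (Z0 * y_total + t) / (Z0 * (1.0 + Z0 * y_total * t))
```

With a zero-length connecting cable the node admittance reduces to the plain sum of the branch admittances, which is a quick sanity check of the transformation.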
Fig. 6. Analysis of the peaks of Y_IN2_STAR: panels (a)-(h) show |Y_total|, |Z_total|, |Y_open|, |Y_in1|, |Y_in2|, |d| and |n/d| against frequency. Y axis in Siemens.

Fig. 7. The input admittance Y_INn_STAR of a branch having a STAR connection with (a) 3 branches of equal lengths l = l_brn = 4.55 m, n = 1, 2, 3 (b) 8 branches of equal lengths l = l_brn = 4.55 m, n = 1 to 8 (c) 3 branches of equal lengths l = l_brn = 8.55 m, n = 1, 2, 3 (d) 3 branches, l ≠ l_brn, l = 8.55 m, l_brn = 4.55 m, n = 1, 2, 3 (e) 3 branches, l, l_br1, l_br2, l_br3 all different (f) 8 branches, l, l_brn random (0-7 m), n = 1 to 8. Y axis in Siemens.

For the BUS configuration, take the admittances at the first node as

Y_1 = 1/Z_in1 ;  y_1 = 1/Z_in0 ;  Y_open1 = tanh(γ l_1) / Z_0                       (18)

Y_open1 is the input admittance of the branch of length l_1 if it were open. The input admittance at 'N2' is given by

Y_IN1_BUS = (Y_1 + y_1 + Y_open1) / (1 + Z_0² (Y_1 + y_1) Y_open1)                  (19)

where the subscript '1' in Y_IN_BUS denotes that there is only one node in the BUS. At the next stage,

y_2 = Y_IN1_BUS ;  Y_2 = 1/Z_in2 ;  Y_open2 = tanh(γ l_2) / Z_0                     (21)

and the admittance at 'N3' is

Y_IN2_BUS = (Y_2 + y_2 + Y_open2) / (1 + Z_0² (Y_2 + y_2) Y_open2)                  (20)

In general,

y_i = (Y_(i-1) + y_(i-1) + Y_open_(i-1)) / (1 + Z_0² (Y_(i-1) + y_(i-1)) Y_open_(i-1))     (22)

Y_INn_BUS = (Y_n + y_n + Y_open_n) / (1 + Z_0² (Y_n + y_n) Y_open_n)                (24)
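The BUS recursion (19)-(24) can likewise be sketched. The starting admittance y_1 and the per-stage values are simplified here (equal section lengths, identical high-resistance loads), so this is an illustration of the recursion rather than the paper's exact circuit; all names are mine.

```python
import numpy as np

Z0 = 135.0
L, C = 0.69e-6, 38e-12

def y_in_bus(f, n_nodes, l=4.55, z_load=1e6):
    """Admittance at the n-th BUS node via the recursion of eqs. (19)-(24).
    At each stage the tap admittance Y_i, the carried-over admittance y_i
    and the open-section admittance Y_open combine as in eq. (22)."""
    w = 2.0 * np.pi * f
    gamma = 1j * w * np.sqrt(L * C)            # lossless assumption
    t = np.tanh(gamma * l)
    y_open = t / Z0                            # open section, eq. (5)
    y_tap = (Z0 / z_load + t) / (Z0 * (1.0 + (Z0 / z_load) * t))
    y = y_tap                                  # y_1 = 1/Z_in0, here the same section
    for _ in range(n_nodes):
        y = (y_tap + y + y_open) / (1.0 + Z0**2 * (y_tap + y) * y_open)
    return y
```

Iterating the recursion for growing n reproduces the qualitative behavior discussed below: the node admittance keeps changing as stages are added, so the transfer function grows more structured.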
Analysis. To study the effect of increasing the number of nodes, the simplest circuit, like Fig. 8(a), is taken and the input admittance at 'N2', i.e. Y_IN1_BUS, is found. In the next stage this acts as the input admittance of the branch connection at 'N2', and the input admittance at 'N3' is found. Likewise the number of nodes is increased, and the admittance at the main line is found in each case. To make the analysis simple, the lengths of all the cables are taken to be equal, l_1 = l_2 = ... = l_br1 = l_br2 = ... = l_0 = 4.55 m. As before, the loads are taken to be resistive with very high values. As noted earlier, this length gives Y_open1 maxima at 10.7 MHz and 32.2 MHz and minima at 21.5 MHz and 42.9 MHz. Similarly, Y_1 and y_1 have maxima and minima at these frequencies. Fig. 9 shows Y_INn_BUS for various n. In Fig. 10, the formation of peaks is analyzed for Y_IN2_BUS using the calculations in APPENDIX A. The figure shows that the peaks depend strongly on the admittances of the open circuits and the loaded branches. With each addition of a BUS node, the admittance of the open circuit adds to the net impedance forming the load for the next stage.

Fig. 10. Analysis of the input admittance at 'N3' for Y_IN2_BUS. Here K = 10³, Z_in1 = Z_in2 = Z_in0, Y_1 = Y_2 = y_1, y_2 = Y_IN1_BUS and y_3 = Y_IN2_BUS. The details are given in APPENDIX A. Y axis in Siemens.

Fig. 9 shows that as the number of BUS connections is increased, the admittance at node 'Nn' shows a larger number of peaks with decreasing amplitude. Therefore the presence of a BUS connection in the main propagating path gives a complicated transfer function when the number of elements is increased.
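The CBW itself can be estimated from a sampled transfer function. The helper below uses the frequency-autocorrelation definition with a 0.9 threshold, one common choice (the paper does not state which threshold it uses), demonstrated on a hypothetical two-path channel; the function name and normalization are mine.

```python
import numpy as np

def coherence_bandwidth(H, df, level=0.9):
    """Coherence bandwidth of a sampled transfer function H(f): the
    frequency shift at which the autocorrelation of H over frequency
    first falls below `level` of its zero-shift value."""
    n = len(H)
    r0 = np.vdot(H, H).real
    for k in range(1, n):
        r = abs(np.vdot(H[:-k], H[k:]))
        if r < level * r0 * (n - k) / n:   # scale threshold for shrinking overlap
            return k * df
    return n * df

# Hypothetical two-path channel: direct path plus an echo delayed by tau,
# sampled every 50 kHz up to 50 MHz.
f = np.arange(0, 50e6, 50e3)
tau = 0.5e-6
H = 1.0 + 0.8 * np.exp(-2j * np.pi * f * tau)
cbw = coherence_bandwidth(H, 50e3)
```

A perfectly flat channel never decorrelates, so the helper returns the full sampled span; the two-path channel above decorrelates within a few hundred kHz, on the order of the CBW values reported in Table I.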
Fig. 9. The input admittance at node 'Nn' for branches in BUS configuration having different numbers of nodes n. Y axis in Siemens.

IV. COMPARATIVE STUDY OF THE COHERENCE BANDWIDTH

TABLE I

No.      T      N     Specification            Coherence Bandwidth (MHz)
(iii)    -do-   8     l = l_brn = 4.55 m       0.3
(iv)     -do-   10    -do-                     0.3
(v)      -do-   20    -do-                     0.25
(vi)     -do-   50    -do-                     0.45
(vii)    -do-   100   -do-                     0.85
(viii)   -do-   2     l = l_brn = 8.55 m       0.3
(ix)     -do-   3     l = l_brn = 8.55 m       0.25
(x)      -do-   3     l ≠ l_brn                0.40
(xi)     -do-   8     l, l_brn random          0.33
(xii)    -do-   40    l, l_brn random          0.65
(xiii)   bus    1     4.55 m                   0.4
(xiv)    -do-   2     -do-                     0.3
(xv)     -do-   3     -do-                     0.2
(xvi)    -do-   4     -do-                     0.2
(xvii)   -do-   5     -do-                     0.2
(xviii)  -do-   6     -do-                     0.2
(xix)    -do-   7     -do-                     0.2
(xx)     -do-   8     -do-                     0.2
V. CONCLUSION

In this work, the dependence of the transfer function on branched taps in the propagation path, for STAR and BUS topologies, is studied. It is seen that the effect on the transfer function increases as the number of branches at the STAR node or the number of serial connections in the BUS configuration increases, seen as a decrease in the Coherence Bandwidth. However, for STAR topologies the CBW improves again for very large numbers of branches. In general, the CBW of the STAR topology is larger than that of the BUS topology. The results of the work can be used to find the efficiency of various PLC channels connecting devices in a complex network. In the future, the procedure will be applied to developing a model for predicting efficiencies of PLC in more complex topologies. Such a study will enhance the use of the power line, and means can be found to make the channel more compatible for communication.
APPENDIX A

1. Taking the first circuit as shown in Fig. 8(a),

Y_1 = 1/Z_in1 ;  y_1 = 1/Z_in0 ;  Y_open1 = tanh(γ l_1) / Z_0

where

Z_in0 = Z_0 (Z_L0 + Z_0 tanh(γ l_0)) / (Z_0 + Z_L0 tanh(γ l_0))

Z_in1 = Z_0 (Z_L1 + Z_0 tanh(γ l_L1)) / (Z_0 + Z_L1 tanh(γ l_L1))

Then

Y_IN1_BUS = (Y_1 + y_1 + Y_open1) / (1 + Z_0² Y_open1 (Y_1 + y_1))

2. Then

Y_IN2_BUS = (Y_2 + y_2 + Y_open2) / (1 + Z_0² Y_open2 (Y_2 + y_2))

With Y_total2 = Y_2 + y_2 and Z_total2 = 1/Y_total2, this can be written

Y_IN2_BUS = (1 + Y_open2 Z_total2) / (Z_total2 + Z_0² Y_open2) = n_2 / d_2 = y_3

REFERENCES

[1] P. Sutterlin and W. Downey, "A Power Line Communication Tutorial - Challenges and Technologies," in Proc. ISPLC, Japan, 1998. Available: http://www.viste.com/LON/tools/PowerLine/pwrlinetutoral.pdf
[2] S. Barmada, M. Raugi, M. Tucci, "Power line communication integrated in a wireless power transfer system: A feasibility study," in Proc. 18th ISPLC, Glasgow, 2014, pp. 116-120.
[3] E. Biglieri, "Coding and Modulation for a Horrible Channel," IEEE Comm. Mag., vol. 41, no. 5, pp. 92-98, May 2003.
[4] P. Mlynek, J. Misurec, M. Koutny, "Random Channel Generator for Indoor Power Line Communication," Meas. Sc. Rev., vol. 13, no. 4, pp. 206-212, 2013.
[5] F. Zwane, T.J.O. Afullo, "An Alternative Approach in Power Line Communication Channel Modelling," Prog. Elect. Res., vol. 47, pp. 85-93, 2014.
[6] B. Tiru, R. Baishya and U. Sarma, "An analysis of indoor power line as a communication media using ABCD matrices," in Advances in Communication and Computing, Springer, 2015, pp. 171-181.
[7] I. Tsokalo, R. Lehnert, "Modelling approach of broadband in-home PLC in network simulator 3," in Proc. 19th ISPLC, Austin, 2015, pp. 113-118.
[8] J. Anatory, N. Theethayi, M. Kissaka, N. Mvungi, "Broadband Power Line Communications: The Factors Influencing Wave Propagation in Medium Voltage Lines," in Proc. ISPLC, Pisa, 2007, pp. 127-132.
[9] S. Tsuzuki, S. Yamamoto, T. Takamatsu, Y. Yamada, "Measurement of Japanese Indoor Power Line Channels," presented at the International Symposium on Power Line Communications, Apr. 2001. [Online]. Available: http://www.isplc.org/docsearch/Proceedings/2001/pdf/0687_001.pdf
[10] T.P. Surekha, T. Ananthapadmanabha, C. Puttamadappa, "Analysis of Effect of Power Line Channel Characteristic Parameters in Broadband Power Line Communications (BPLC) Systems," in Proc. IEEE/PES PSCE, Seattle, 2009, pp. 1-6.
[11] W. Frazer, "Transmission Lines," in Telecommunications, 2nd ed., New Delhi: CBS Publishers and Distributors, 1985, p. 97.
Abstract— The IT sector is growing day by day, and its data and energy costs are also increasing, creating a need to access high-end computing capabilities. To fulfill these requirements, there is a need to shift from traditional computing practices to new ones that give access to a broad network and unlimited resources and enable on-demand self-service at a reasonable cost using the pay-per-use method. All these requirements can be fulfilled with cloud computing. Cloud computing has a number of advantages as well as disadvantages. Security is a major threat to cloud computing because the data placed on cloud servers is always vulnerable: there are many attackers on the internet who try to compromise the security of the cloud. To overcome this issue, security appliances are used, either as physical machines or through virtualization. A significant amount of power is consumed by security appliances, which not only leads to high energy costs but is also a major threat to the environment as carbon dioxide is released. So there is a need to find optimal solutions for providing security in a cloud computing environment in an efficient manner. In this paper, a comprehensive survey of existing cloud security frameworks has been done. Based upon the limitations of the existing frameworks, a new framework has been proposed to provide security in virtual networks, based on an Intrusion Detection and Prevention System (IDPS); its prototype implementation has also been done.

Keywords— security; green cloud computing; IDPS

I. INTRODUCTION

Cloud computing is the most trending technology in the IT sector because of its characteristics like rapid elasticity, broad network access, measured services, on-demand self-service and resource pooling [1]. It has given relief to the user by eliminating the need for high-cost hardware at the client side. Many organizations have shifted their data to the cloud, which saves costs and provides ubiquitous and fast access to the data. As the database grows, more servers are needed, which leads to an increase in the size of the data center. These large data centers cause two major issues: the first is maintaining security and the other is energy consumption.

Implementing security policies covering confidentiality, integrity and availability is the biggest challenge in the cloud industry. The recent attacks on the Apple and Sony clouds have created fear in the minds of customers about the safety of their data. The security appliances that are used to prevent attacks and to maintain a safe environment consume a significant amount of energy. So, to lower the energy costs and the carbon content, we need a green cloud environment. Green cloud computing is a necessity for the environment as global warming increases. Green cloud computing aims to minimize power consumption costs and emissions of carbon dioxide, thus making cloud computing more efficient and eco-friendly. So making the cloud green not only benefits cloud providers by saving costs but also helps make the environment sustainable for the coming generations. This work aims to minimize power usage by considering virtualization at various levels [10]. Virtualization means running two or more operating systems on a single physical machine, thus abstracting the computer's resources. In this work, an IDPS-based framework has been proposed and its prototype implementation is given. The IDPS uses the concept of autonomic computing to automatically detect a malicious activity, attempt to stop it, and report to the security Admin. In the next section, various papers based on security issues in cloud computing are discussed, and a survey of the existing frameworks for cloud security is presented.

II. LITERATURE SURVEY

This section is divided into three parts. Part A discusses the survey of security issues, including confidentiality, availability, integrity etc. Part B discusses the security frameworks for clouds as well as for green clouds. Part C discusses the limitations of the existing frameworks; a comprehensive review of existing frameworks is shown in Table 1.

A. Survey on Security Issues

S. Subashini et al. [2] surveyed various risks and security threats existing in cloud computing systems at the service delivery model level (SaaS, IaaS and PaaS). Security issues in SaaS that are the main challenges in the cloud environment are discussed one by one. Though cloud computing is a very advantageous technology in the IT sector, issues like SLA agreements, security, privacy and power efficiency still exist and are a bottleneck for potential customers. They suggested that an integrated solution must be designed by analyzing macro and micro elements and then deployed in the cloud.

Dimitrios Zissis et al. [3] discussed the transition from traditional computing to grid computing and now to cloud computing. They worked on two challenges: firstly, identifying unique security requirements to evaluate cloud security, and secondly, proposing a solution that eliminates these types of risks. The proposed solution, based on cryptographic techniques, assures authentication, integrity and confidentiality of data communications. The security requirements, threats and the users of the service levels are discussed, and the threats are categorized. A discussion is also given of the trusted third party, describing on whom it can rely, the creation of security domains and the cryptographic separation of data. The assessment
Table 1. COMPARISON OF EXISTING SECURITY FRAMEWORKS

[10] — Parameters considered: green cloud computing, security. Green cloud: Yes. Security policy: Dynamic. Architecture: De-centralized. IDPS used: No (IDS only). Tactics: Virtualization. Implemented: Yes. Types of attacks: attacks related to the security of VMs. Advantages: supports green computing; drop rate of 2% only. Disadvantages: computational cost of IDS; coping with a dynamic environment; detects but does not prevent.

[15] — Parameters considered: threat events compromising cloud and internet security. Green cloud: No. Security policy: Static. Architecture: Centralized. IDPS used: No. Tactics: ---. Implemented: Yes. Types of attacks: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege. Advantages: allows customers to choose comparatively among different vendors; alleviates Fear, Uncertainty and Doubt; quantitative and iterative convergence approach. Disadvantages: requires meticulous collection of data and industry SME inputs.

[16] — Parameters considered: trust and accountability issues. Green cloud: No. Security policy: Dynamic. Architecture: De-centralized. IDPS used: No. Tactics: Virtualization. Implemented: No. Types of attacks: diminishing controls and lack of privacy. Advantages: found new accountability issues that were not known before. Disadvantages: a detective rather than preventive approach is proposed; securing the log files is itself a challenge.

[17] — Parameters considered: risk, security, privacy. Green cloud: No. Security policy: Static. Architecture: De-centralized. IDPS used: No. Tactics: ---. Implemented: Yes. Types of attacks: security and privacy attacks. Advantages: follows the PDCA security cycle. Disadvantages: ---.

[18] — Parameters considered: security. Green cloud: No. Security policy: Dynamic. Architecture: De-centralized. IDPS used: No. Tactics: Virtualization. Implemented: Yes (at SaaS model). Types of attacks: breaching into a company's network and accessing data; using TELNET to access system files using a guest ID; obtaining unauthorized access. Advantages: developed within standard quality management (PDCA); helpful for risk analysis, assessment and mitigation; applicable to all service and deployment models. Disadvantages: ---.
C. Limitations in existing frameworks

The frameworks discussed above handle security issues but, on the other hand, increase the computational cost. None of the frameworks has used an IDPS. An IDPS can automatically take actions like dropping malicious packets, resetting the connection and/or blocking the traffic from the offending IP address. Other advantages of an IDPS include automatic correction of Cyclic Redundancy Check (CRC) errors and cleaning up unwanted traffic. So an IDPS is more useful than an IDS. There is only one framework that supports green cloud computing, quickly reduces the network overload and results in a packet drop rate of only 2% [10]. Some issues do exist in this framework. Firstly, the Intrusion Detection System (IDS) only detects an attack but does not prevent it; sometimes it is too late for the Admin to take the necessary actions to prevent the attack. Secondly, each virtual machine has a different configuration and thus requires different security solutions, so the rules/signatures in the IDS should be relevant to each VM. Lastly, the IDS itself can be attacked, as an attacker can directly send a malicious packet into the network that affects the installed IDS. To overcome the above issues, an IDPS-based framework has been proposed and a prototype implementation has been done, as discussed in the next section. A comprehensive survey of the above frameworks, covering parameters like architecture, green/non-green, tactics used etc., has also been done; the results are shown in Table 1.

III. PROPOSED FRAMEWORK

An IDPS [19] based framework has been proposed and its prototype implementation is given in this section. It secures the network and also takes care of green cloud computing. Fig. 1 shows the layout of the proposed framework. The main tactics used are:

- IDPS: The Intrusion Detection and Prevention System identifies malicious activity, logs information about it, attempts to stop it and reports to the security Admin. The IDPS used in this framework is a hybrid comprising signature- and anomaly-based techniques, which gives real-time protection with an active response time. It has minimal impact on the performance of the overall network [20].

- Virtualization: Virtualization is running two or more operating systems on a single hardware system; it is the abstraction of computer resources. We have used virtualization to support green cloud computing. A virtual switch is used instead of a physical switch, resulting in an energy-efficient environment. The IDPS itself runs on a VM, and its data is also stored on a VM.

- Nesting of Virtual Machines (VMs): Nesting of Virtual Machines means one or more VMs running under another VM [21]. By using nesting, we can group VMs of the same configuration together. This eases the implementation of security, as the security requirements of VMs of the same type will be the same. It also decreases the overall computation cost and makes the network more efficient.

Fig.1. Parameters considered for the framework

The limitations discussed in the above section are eliminated by the IDPS, which will not just detect an attack but also attempt to prevent it. The power consumption is reduced by the virtualization technique, as a single hardware system does all the tasks. In Fig. 2, the switch used is virtual in nature, the database is also virtual, and even the IDPS itself runs on a virtual machine. The problem of different security protocols is taken into consideration with the concept of nesting of virtual machines: all VMs with the same security protocols are grouped together to make the functioning of the IDPS better and more energy efficient.
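To make the hybrid signature-plus-anomaly idea concrete, here is a toy sketch of such a detector. All class names, example patterns and thresholds are hypothetical illustrations; the actual prototype uses OSSEC, not this code.

```python
# Illustrative only: a toy hybrid detector combining a signature check with
# a simple anomaly (rate) check. Patterns and limits are invented examples.
from collections import defaultdict, deque
import time

SIGNATURES = {b"' OR 1=1", b"../../etc/passwd"}    # hypothetical payload patterns
RATE_LIMIT = 100                                   # packets/s per source (assumed)

class HybridIDPS:
    def __init__(self):
        self.history = defaultdict(deque)          # src -> recent packet times
        self.blocked = set()

    def inspect(self, src, payload, now=None):
        """Return 'drop', 'block' or 'pass'; logging and alerting omitted."""
        now = time.monotonic() if now is None else now
        if src in self.blocked:
            return "drop"                          # prevention: already blocked
        # Signature-based detection: known malicious byte patterns
        if any(sig in payload for sig in SIGNATURES):
            self.blocked.add(src)                  # block the offending source
            return "block"
        # Anomaly-based detection: abnormal packet rate from one source
        times = self.history[src]
        times.append(now)
        while times and now - times[0] > 1.0:      # keep a one-second window
            times.popleft()
        if len(times) > RATE_LIMIT:
            self.blocked.add(src)
            return "block"
        return "pass"
```

The two checks are complementary: the signature path catches known attack payloads immediately, while the rate check flags sources whose behavior deviates from the expected profile, which is the division of labor a hybrid IDPS relies on.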
A. Working of the Framework

The working of the proposed framework, as shown in Fig. 2, is given below:

Fig.2. Framework for security in green cloud computing

- The user gets connected to the internet and logs into the cloud system.

- The message generated from the user's system passes through the gateway and reaches the router. The router checks its routing table and forwards the packet to another router. The process goes on until the packet reaches the destination node, which here is the switch.

- The switch receives, processes and forwards the data, sending it to the desired location, which here is port0. Port0 is a physical network interface, and all packets go through port0.

- The firewall monitors and controls the incoming and outgoing traffic. It distinguishes the trustworthy, secure network from the outside network.

- The packets through port0 reach the Virtual Switch, which works like a physical switch except that it is connected to Virtual Machines rather than physical ones. It receives the data packets from port0, processes them and forwards them to the desired Virtual Machine. Here, a Linux bridge is implemented as the virtual switch.

- For providing security, we have installed the IDPS. The data packet reaches the IDPS through VNI-p, a Virtual Network Interface connected to the mirror port, which duplicates and monitors data packets. The IDPS detects malicious activities, logs information about them, takes action to prevent or block them, and reports to the Admin. The actions taken by the IDPS include triggering an alarm, blocking the particular IP, dropping malicious packets and resetting the network.

- VNI-c is the Virtual Network Interface connecting the IDPS and port1. Port1 is connected to the physical network to ensure no disturbance to the whole system.

- For logging the malicious activities, we have used a tracking module and DB, a dedicated virtual machine that tracks malicious activities and stores them in the database for further use, to prevent the same types of attacks that have already occurred.

- All the VMs are managed by the hypervisor. Here, we have used a hosted hypervisor, which runs on an operating system like other application software does.

- CloudVisor [21] is used for the nesting of VMs, which helps in running one or more VMs under another VM. VMs having the same configuration are nested together so that they have the same security protocols.

B. Prototype used for implementation of the framework

The tools used for implementation, the tasks they perform, the kind of input they take, the output they provide and where they are used are given in Table 2. Cisco Packet Tracer is used to route and check the flow of packets among the various routers and switches. A Linux bridge is implemented as a virtual switch by making a few changes to it; it sends the packets to the desired destination. To check whether the IDPS copes with sudden increases and decreases in traffic, we have used traffic generator software by which we can introduce fluctuations in the flow rate of the packets. For network protocol analysis, Wireshark is used. OSSEC is implemented here as the IDPS. All the logs are stored on a separate virtual machine where ORACLE is installed.

Table 2. PROTOTYPE IMPLEMENTATION MODEL

IV. CONCLUSION AND FUTURE SCOPE

Instead of using separate hardware systems, virtualization technologies are used, which leads to more manageable and efficient cloud computing. But in a multi-tenant environment this raises the issue of data privacy and security. The large data centers emit carbon dioxide, which contributes to global warming. To address both of these issues, we have proposed a framework that tackles security and is energy efficient. The signature- and anomaly-based hybrid IDPS detects, logs, prevents and alerts the security Admin. The nesting of the VMs provides easy management, and the IDPS can cope better because the VMs are grouped on the basis of security protocols. Virtualization gives an overall low computation cost because a single hardware system does all the tasks (providing security, maintaining log files, managing virtual machines, implementing the switch) and thus consumes very little power, making our framework energy efficient.
Further research will implement this framework in real-time scenarios; based on those results, the computational cost of the IDPS could be further reduced. We also aim to improve the IDPS for better detection and prevention of attacks and to lower the rates of false negatives and false positives.

REFERENCES

[1] NIST, "The NIST definition of cloud computing", October 2011, <http://faculty.winthrop.edu/domanm/csci411/Handouts/NIST.pdf> Accessed on 12 Aug, 2015.
[2] S. Subashini, V. Kavitha, "A survey on security issues in service delivery models of cloud computing", Journal of Network and Computer Applications, ELSEVIER, pp. 1-10, 2010.
[3] D. Zissis, D. Lekkas, "Addressing cloud computing security issues", Journal of Network and Computer Applications, ELSEVIER, pp. 583-592, 2010.
[4] C. Modi, D. Patel, B. Borisaniya, H. Patel, A. Patel, M. Rajarajan, "A survey of intrusion detection techniques in Cloud", Journal of Network and Computer Applications, ELSEVIER, pp. 42-55, 2012.
[5] B. Makhija, V. Gupta, I. Rajput, "Enhanced data security in cloud computing with third party auditor", International Journal of Advanced Research in Computer Science and Software Engineering, vol. 3, no. 2, February 2013.
[6] M. Sharma, H. Bansal, A. Sharma, "Cloud computing: different approach and security challenge", International Journal of Soft Computing and Engineering, March 2012.
[7] S. Ramgovind, MM. Eloff, E. Smith, "The management of security in cloud computing", International Conference on Information Security for South Africa, IEEE, 2010.
[8] J. Wang, Y. Zhao, S. Jiang, J. Le, "Providing privacy preserving in cloud computing", International Conference on Test and Management, IEEE, 2009.
[9] A. Bakshi, B. Yogesh, "Securing cloud from DDOS attacks using intrusion detection system in virtual machine", IEEE, 2010.
[10] …, Journal of Network and Computer Applications, ELSEVIER, pp. 379-388, 2011.
[11] H. Takabi, J.B.D. Joshi, G. Ahn, "SecureCloud: towards a comprehensive security framework for cloud computing environments", Computer Software and Applications Conference Workshops, IEEE, pp. 393-398, 2010.
[12] A. Malik, M. Nazir, "Security framework for cloud computing environment: a review", Journal of Emerging Trends in Computing and Information Sciences, vol. 3, no. 3, pp. 390-394, 2012.
[13] M. Almorsy, J. Grundy, A. Ibrahim, "Collaboration-based cloud computing management framework", International Conference on Cloud Computing, IEEE, pp. 364-371, 2011.
[14] A. Youssef, M. Alageel, "A framework for secure cloud computing", International Journal of Computer Science Issues, pp. 487-500, 2012.
[15] P. Saripalli, B. Walters, "QUIRC: a quantitative impact and risk assessment framework for cloud security", International Conference on Cloud Computing, IEEE, 2010.
[16] R. Ko et al., "TrustCloud: a framework for accountability and trust in cloud computing", World Congress on Services, IEEE, 2011.
[17] ENISA, "Security framework for governmental clouds", February 2015, <https://www.enisa.europa.eu/activities/Resilience-and-CIIP/cloud-computing/governmental-cloud-security/security-framework-for-govenmental-clouds/security-framework-for-governmental-clouds/at_download/fullReport> Accessed on Aug 2015.
[18] X. Zhang, N. Wuwong, H. Li, X. Zhang, "Information security risk management for the cloud computing environments", International Conference on Computer and Information Technology, IEEE, pp. 1328-1334, 2010.
[19] NIST, "Guide to intrusion detection and prevention systems (IDPS)", 2007, <http://csrc.nist.gov/publications/nistpubs/800-94/SP800-94.pdf>, Accessed on 4 August, 2015.
[20] A. Patel, M. Taghavi, K. Bakhtiyari, J. Celestino, "An intrusion detection and prevention system in cloud computing: A systematic review", Journal of Network and Computer Applications, ELSEVIER, pp. 25-41, 2012.
[21] F. Zhang, J. Chen, H. Chen, B. Zang, "CloudVisor: retrofitting protection of virtual machines in multi-tenant cloud with nested virtualization", in Proc. SOSP, 2011.
[10] J. Li, Bo Li, T. Wo, C. Hu, J. Huai, l. Liu, K.P. Lam, “Cyberguarder: a of virtual machines in multi-tenant cloud with nested virtualization”,
virtualization security assurance architecture for green cloud computing”, Symposium on Operating Systems, ACM, 2011.
Face Recognition Techniques, their Advantages,
Disadvantages and Performance Evaluation
Lerato Masupha, Tranos Zuva, Seleman Ngwira, Omobayo Esan
Tshwane University of Technology
Department of Computer Systems Engineering
Pretoria, South Africa
{masuphaLE,zuvat,ngwirasm,esanoa}@tut.ac.za
Abstract—A human brain can store and remember thousands of faces in a person's lifetime; however, it is very difficult for an automated system to reproduce the same results. Faces are complex and multidimensional, which makes extraction of facial features very challenging, yet it is imperative for our face recognition systems to be better than our brain's capabilities. The face, like many physiological biometrics including fingerprint, hand geometry, retina, iris and ear, uniquely identifies each individual. In this paper we focus mainly on face recognition techniques. This review looks at three types of recognition approaches, namely holistic, feature based (geometric) and the hybrid approach. We also look at the challenges faced by these approaches.

Keywords—Face recognition, biometrics, performance evaluation

I. INTRODUCTION
Biometrics have emerged as the main alternative for authenticating individuals in this present technological age, instead of authenticating using conventional passwords, PINs, smart cards, tokens, keys, etc. [1, 2]. A biometric system is an automated technique of examining an individual using physiological or biological traits in order to ascertain his/her identity [1, 3].

The disadvantages of traditional techniques, such as vulnerability to loss, theft, misplacement or forgery, have made many application areas such as law enforcement, banking, time attendance and immigration migrate towards biometric systems to improve security [1]. The biological traits are signature, gait, speech and keystroke; these traits change with time [4]. The physiological traits include face, fingerprint, palm print and iris, which remain permanent throughout an individual's lifetime [5, 6].

Fingerprint is the oldest biometric of all to be used for identifying individuals [7]. However, face recognition has an advantage over the other physiological biometrics in that the individual being recognized does not need to participate in or acknowledge the action. Furthermore, facial images can be obtained easily with a fixed, inexpensive camera, as opposed to biometrics such as the retina and iris that require more expensive equipment. The face has features that include the eyes, nose, ears, lips, chin, teeth and cheeks, some of which are used to recognize individuals.

In this paper we look at face recognition techniques, categorized under three headings, namely holistic, feature based and hybrid approaches. The major contributions of this paper are:
- Highlighting various face recognition techniques, their merits and demerits.
- Challenges and open issues affecting face recognition techniques.
- Evaluation metrics for face recognition techniques.

The rest of the paper is organized as follows: Section II is an overview of face recognition techniques; Section III covers challenges and open issues affecting face recognition techniques; Section IV covers performance evaluation techniques; and Section V concludes the paper.

II. FACE RECOGNITION TECHNIQUES
Face recognition techniques can be divided into three categories:
- Techniques that operate on intensity images.
- Techniques that deal with video sequences.
- Techniques that require other sensory data such as 3D information or infra-red imagery.

Face recognition has received a great deal of attention in various applications in the field of image analysis and computer vision due to several advantages it has over other biometric methods. These advantages include low-cost capturing equipment, the fact that face capture can be done without explicit action on the part of the user, and its non-intrusive characteristics [8, 9].

A. Face recognition based on intensity
Face recognition methods based on the intensity of images can be divided into three [8]: (i) the feature-based approach, (ii) the holistic-based approach and (iii) the hybrid-based approach, as in Fig. 1.
Fig. 1. Categories of intensity-based face recognition: feature-based, holistic-based and hybrid-based.

1) Advantages
- In feature-based techniques, the feature points precede the analysis done for matching the image to that of a known individual.
- The feature-based technique can be made invariant to size, orientation and lighting.
- It offers a compact representation of the face images and high-speed matching.

2) Disadvantages
- Feature-based techniques lack discrimination ability.
- It is difficult to automatically detect features in this approach.
b) Elastic bunch graph
This technique is based on dynamic link structures. A graph for an individual face is generated using a set of fiducial points on the face; each fiducial point is a node of a fully connected graph and is labeled with the Gabor filters' response. Each arc is labeled with the distance between corresponding fiducial points [10].

1) Advantages
- They do not destroy any of the information in the images by concentrating on only limited regions or points of interest.
- The technique produces better recognition results than the feature-based technique.

2) Disadvantages
- Because the approach does not destroy any image information, it starts from the basic assumption that all the pixels in the image are equally important.
- The approach is computationally expensive and also requires a high degree of correlation between the test and training images.
- The approach does not perform effectively under large variations in pose, scale, illumination, etc.

c) Hybrid approach
The hybrid approach is a combination of two or more approaches aimed at yielding more efficient results. Using more than one approach means that the downfalls of one approach are compensated by the advantages of the other, each complementing the other. An example is the combination of the template matching algorithm and the 2DPCA algorithm [14, 15]. The advantage of this technique is that it is easy and efficient, as PCA reduces the dimension size of an image in a short period of time. Another advantage is that it has high correlation between the training data and the recognition data [13]. The disadvantage of this technique is that its accuracy depends on a number of factors, among them the lighting, which decreases the accuracy.

C. 3D model-based techniques
Using 3D information for face recognition helps in exploiting features based on the shape and the curvature of the face, such as the shape of the forehead, jawline and cheeks, without being plagued by the variances caused by lighting, orientation and background clutter that affect 2D systems [16]. Examples of 3D techniques include scanning systems, stereo vision systems, structured light systems, reverse rendering/shape from shading, etc.

a) Advantage
- It exploits features using the shape and curvature of the face.

b) Disadvantage
- The approach is complex and computationally costly.

D. Infra-red based techniques
Thermal infra-red imagery is insensitive to variations in lighting, which makes such images usable for detecting and recognizing faces. The vein and tissue structures visible in infra-red images are unique to each individual and yield good recognition results.
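The PCA (eigenfaces) idea discussed above, projecting faces onto a small set of principal components and matching in the reduced space, can be sketched in a few lines. This is a minimal illustration on synthetic data; the array shapes, component count `k` and nearest-neighbour matching rule are our own choices, not taken from the paper:

```python
import numpy as np

# Synthetic "training faces": 20 images of 32x32 pixels, flattened to vectors.
rng = np.random.default_rng(0)
train = rng.random((20, 32 * 32))

# 1) Centre the data: subtract the mean face.
mean_face = train.mean(axis=0)
centred = train - mean_face

# 2) PCA via SVD: the rows of vt are the principal components (eigenfaces).
u, s, vt = np.linalg.svd(centred, full_matrices=False)
k = 10                      # keep the k strongest components (assumed value)
eigenfaces = vt[:k]         # shape (k, 1024)

# 3) Project every training face into the reduced k-dimensional space.
train_weights = centred @ eigenfaces.T      # shape (20, k)

# 4) Recognise a probe image: project it and find the nearest neighbour.
probe = train[3] + 0.01 * rng.random(32 * 32)   # a noisy copy of face 3
probe_w = (probe - mean_face) @ eigenfaces.T
match = int(np.argmin(np.linalg.norm(train_weights - probe_w, axis=1)))
print(match)
```

The dimensionality reduction is what gives the speed advantage noted in the text: matching happens in k dimensions rather than in the full pixel space.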
B. Uniqueness
The human face is not unique; there are many factors that cause the appearance of the face to vary. These are categorized as extrinsic and intrinsic.

C. Variation
Age, illumination and pose also affect a face recognition system. Although most face recognition systems work well under constrained conditions, the performance of most of these systems degrades rapidly when they are put to work under conditions where none of the features are regulated.

IV. EVALUATION METRICS FOR FACE BIOMETRIC RECOGNITION SYSTEMS

A. Evaluation classification
The evaluation of a face biometric system can be classified into three, as shown in Fig. 2 below.

Fig. 2. Evaluation of biometric systems: data quality, usability and security.

2. Usability
This refers to the extent to which the product can be used by specified users to achieve certain goals with effectiveness, efficiency and satisfaction.

B. Performance metrics
The fundamental performance metrics for a face or any biometric system are:
- Failure-to-Enroll Rate (FER): the number of users whom the system fails to enroll.
- Failure-to-Acquire Rate (FTR): the number of verification or identification attempts in which the biometric system is unable to capture a sample.
- False Match Rate (FMR): the rate of incorrect positive matches by the matching algorithm for single-template comparisons.
- False Non-Match Rate (FNMR): the rate of incorrect negative matches by the matching algorithm for single-template comparisons.

C. Verification performance metrics
- False Rejection Rate (FRR): the number of genuine users that the system incorrectly denies.
- False Acceptance Rate (FAR): the number of impostors that the system accepts.
- Receiver Operating Characteristic (ROC) curve: the plot of FMR or FAR (accepted impostor attempts) on the x-axis against FNMR or FRR (rejected genuine attempts) on the y-axis, plotted as a function of the decision threshold.
- Equal Error Rate (EER): the point at which FAR and FRR correspond. The nearer the EER is to 0%, the better the system performance.
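The relationship between FAR, FRR and EER described above can be made concrete with a small numeric sketch. The score values below are invented for illustration (higher score = more similar); the sketch sweeps the decision threshold and picks the point where the two error rates meet:

```python
import numpy as np

# Hypothetical matcher scores (not from the paper).
genuine = np.array([0.9, 0.8, 0.7, 0.45, 0.3])    # same-person comparisons
impostor = np.array([0.6, 0.5, 0.4, 0.3, 0.2])    # different-person comparisons

def far_frr(threshold):
    far = float(np.mean(impostor >= threshold))   # impostors wrongly accepted
    frr = float(np.mean(genuine < threshold))     # genuine users wrongly rejected
    return far, frr

# Sweep thresholds; the EER is where FAR and FRR are (approximately) equal.
thresholds = np.linspace(0.0, 1.0, 1001)
rates = [far_frr(t) for t in thresholds]
eer_idx = int(np.argmin([abs(far - frr) for far, frr in rates]))
eer_far, eer_frr = rates[eer_idx]
print(thresholds[eer_idx], eer_far, eer_frr)
```

Moving the threshold trades one error for the other, which is exactly what the ROC curve visualizes; the EER is the single summary point on that curve.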
ACKNOWLEDGMENT
The authors acknowledge the financial contributions and resources made available by Tshwane University of Technology.
REFERENCES
[1] O. A. Esan, T. Zuva, S. M. Ngwira, and K. Zuva, "Performance
Improvement of Authentication of Fingerprints using Enhancement and
Matching Algorithms," International Journal of Emerging Technology
and Advanced Engineering, vol. Vol. 3, 2013.
[2] O. S. Adeoye, "A survey of emerging biometric technology,"
international journal of computer applications, vol. 9, p. 2, 2010.
[3] M. D. Dhameliya, "A multimodal biometric recognition system based on
fusion of palmprint and fingerprint," international journal of engineering
trends and technology, vol. 4, p. 1908, 2013.
[4] L. Masupha, T. Zuva, and S. Ngwira, "A Review of Gait Recognition
Techniques and their Challenges," presented at the Third international
conference on digital information processing, E-Business and cloud
computing, Reduit Mauritius, 2015.
[5] H. A. Aboalsamh, "Vein and Fingerprint Biometrics Authentication-
Future Trends," International Journal of Computer and Communications,
vol. 3, 2009.
[6] O. Esan, S. Ngwira, L. Masupha, and T. Zuva, "Health Care
Infrastructure Security using Bimodal Biometrics System," international
journal of computer and information technology, vol. 03, pp. 299-305,
2014.
[7] A. Senior and R. Bole, "Improved Fingerprint Matching by Distortion
Removal," IEICE Trans. Inf. System, vol. 8, pp. 825-831, 2001.
[8] R. Jafri and H. R. Arabnia, "A Survey of Face Recognition Techniques,"
Journal of Information Processing Systems, vol. 5, pp. 41-63, June 2009.
[9] R. Patel and S. B. Yagnik, "A literature survey on face recognition
techniques," international Journal of computer trends and technology,
vol. 5, pp. 189-194, 2013.
[10] M. Sharif, S. Mohsin, and M. Y. Javed, "Face recognition techniques,"
Research journal of applied science engineering and technology, vol. 4,
pp. 4979-4990, 2012.
[11] B. A. Draper, K. Baek, M. S. Bartlett, and J. R. Beveridge, "Recognizing
faces with PCA and ICA," Computer vision and image understanding,
vol. 91, pp. 115-137, 2003.
[12] V. Vijayakumari, "Face recognition techniques: A survey," world journal
of computer application and technology, vol. 1, 2013.
[13] A. Mir and A. G. Mir, "feature extraction methods(PCA fused with
DCT)," international Journal of advances in engineering and technology,
vol. 6, pp. 2145-2152, 2013.
[14] J. Wang and H. Yang, "Face detection based on template matching and
2DPCA algorithm," presented at the congress on image and signal
processing, 2008.
[15] M. Agarwal, N. Jain, M. Kumar, and H. Agrawal, "Face Recognition Using Eigen Faces and Artificial Neural Network," International Journal of Computer Theory and Engineering, vol. 2, pp. 624-629, 2010.
[16] R. Brunelli and T. Poggio, "Face recognition: Feature versus Templates,"
IEEE transaction on Pattern Analysis and Machine Intelligence, vol. 15,
pp. 1042-1052, 2003.
Designing Budget Forecasting and Revision System
Using Optimization Methods
Abstract—Sales procedures are the most important factors for keeping companies alive and profitable, so sales and sales budgets are considered important parameters influencing all other decision variables in an organization. Poor sales forecasting can therefore lead to great losses caused by inaccurate and non-comprehensive production and human resource planning. Hence, in this research a coherent solution is proposed for forecasting sales and for refining and revising the forecast continuously with an ANFIS¹ model that takes time series relations into consideration. Data has been collected from the public annual financial reports of a well-known Iranian company. Moreover, for higher forecasting accuracy, the solution has been examined with BPN² and PSO³ as optimization methods. The comparison between the obtained predictions and the real data shows that the PSO method optimizes some parts of the prediction, while the rest coincides better with the output of the BPN analysis. As a consequence, a hybrid system integrating both has been designed; it uses each method where it performs better, and so produces relatively more precise results.

Keywords—ANFIS; PSO and BPN methods; hybrid method

I. INTRODUCTION
The importance of sales forecasting for a firm has often been stressed [1] and is best expressed by what happens when it is absent. "Without a sales forecast, in the short term, operations can only respond retroactively, leading to lost orders, inadequate service and poorly utilized production resources. In the longer term, financial and market decision making misallocate resources so that the organization's continuing existence may be brought into question" [2]. Forecasts are used for a number of purposes in a firm, including production planning, budgeting, sales quota setting and personnel planning ([3], [4]).

The primary objective of most business enterprises is securing a profit and the accumulation of wealth. Budgeting aids management in realizing its profit objective by providing a scientific technique for forecasting business operations and establishing standards [5]. Managers use the budget as a road map for allocating the company's resources. The main purpose of budgeting is aligning the company's activities with its objectives. Due to changing conditions and the influence of many internal and external factors, decision making about budget allocation is very important and highly complex, and making it easy and flexible under the changing conditions of organizations is essential. It is therefore important to adjust the budget periodically and flexibly; systematizing this process helps it to be done more accurately and reduces human error.

Efforts to balance and adjust the budget lead to a better understanding of income, expenditure and cash flow in a business. Inappropriate budgeting cannot help managers or improve performance: a budget which is planned inappropriately will be ignored, because it gives staff no reasonable criteria for comparison with actual performance. That is why it is very important to review and revise the budget periodically in accordance with actual performance. Additionally, analysis of variance from the budget during the revision process helps managers determine when to adjust their operations and costs.

The successful combination of methods such as neural networks, fuzzy logic and evolutionary computation has produced what is called soft and intelligent computing; these soft techniques can be used for estimation, forecasting and decision making in various contexts. A neuro-fuzzy (fuzzy neural) system is a hybrid system that combines the decision-making ability of fuzzy logic with the learning ability of neural networks, at high levels of complexity, to provide a modeling/estimation system. A fuzzy neural system is a neural network that works in coordination with a neuro-fuzzy inference system [6].

For sales forecasting and budget assessment it is necessary to identify the relations between the variables influencing the forecast. Several methods exist to correlate variables with sales volume, such as multiple linear regression and computational intelligence regression, which are statistical methods for studying and modeling the relations between variables. Artificial neural networks (ANNs) and the Adaptive Network Fuzzy-based Inference System (ANFIS) are two common nonlinear techniques for sales forecasting in recent years. ANFIS, first proposed in [7], is a combination of ANNs and a Fuzzy Inference System (FIS). In addition, ANFIS

¹ Adaptive Neuro Fuzzy Inference System
² Back Propagation Neural Network
³ Particle Swarm Optimization
can do training more precisely because it uses the fuzzy system to obtain the membership function parameters and optimizes them. There are several methods for optimization and for composing them with ANFIS; the training algorithms used in this research for error reduction are the PSO and BPN methods.

TABLE I. INTERNAL AND EXTERNAL FACTORS

Variable | Abbreviation | Source | Test Data
Operating Income Margin | OIM | prior financial statement | 0.814
Inventory Turnover | IT | prior financial statement | 26.21
Debt Ratio | DR | prior financial statement | 0.252
Return On Assets | ROA | prior financial statement | 0.328
Employee Count | EC | company data | 127
Asset | A | company data | 350000
Industry Share | IS | company/industry data | 0.023
Currency | C | industry factors | 24846
Inflation rate | IR | industry factors | 182.5
Sale | Sale | company data | 424786

B. Adaptive Neuro Fuzzy Inference Systems
ANFIS is an adaptive network of nodes and directional links with associated learning rules. It is called adaptive because some, or all, of the nodes have parameters which influence the output of the node. These networks identify and learn relationships between inputs and outputs. The basic architecture of ANFIS consists of five layers with different functions, completely described in reference [8].
7. Amount of promotion budget.
8. Fashion and taste of consumers.

Applying business experience extracted from past financial statements helps management and salespeople to make better decisions. Accordingly, we could identify some factors that seem to be more effective in sales forecasting than others; all of these factors are listed in Table I.

PSO uses a motion vector that intelligently tries to update itself at every moment. To find the best move, a particle must follow two types of motion: 1) moving toward its previous experience, the local best (x_ib); 2) moving toward the pattern, the global best (x_gb). Since full motion of a particle to x_gb is impossible, as is full motion to x_ib, the move a particle selects is a movement between the two:

v_i^{k+1} = ω·v_i^k + c_1·r_1·(x_ib − x_i^k) + c_2·r_2·(x_gb − x_i^k)   (2)

x_i^{k+1} = x_i^k + v_i^{k+1}   (3)

where ω is the inertia factor, ω ∈ (0.4, 1.4); x_ib and x_gb are the local best and global best; and c_1 and c_2 are random variables defined as c_1 = r_1·b_1 and c_2 = r_2·b_2, with r_1 and r_2 random numbers and b_1 and b_2 positive acceleration constants. Kennedy asserted that c_1 + c_2 ≤ 4 guarantees the stability of PSO [10]. Figure 2 shows the flowchart of the proposed PSO algorithm.

Fig. 2. Flowchart of the proposed PSO technique
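A minimal PSO implementation following the velocity and position updates of equations (2) and (3) might look like the sketch below. It minimises a simple quadratic instead of the paper's ANFIS training error, and the inertia weight, acceleration constants, swarm size and iteration count are illustrative values of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)

def pso(f, dim=2, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimise f with basic inertia-weight PSO (eqs. (2)-(3))."""
    x = rng.uniform(-5.0, 5.0, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))                 # particle velocities
    pbest = x.copy()                                 # local bests x_ib
    pbest_val = np.array([f(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()       # global best x_gb
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # eq. (2): inertia + pull toward local best + pull toward global best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # eq. (3): move each particle along its new velocity
        x = x + v
        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())

# Toy objective with known minimum at (3, 3).
best_x, best_val = pso(lambda p: float(np.sum((p - 3.0) ** 2)))
print(best_x, best_val)
```

In the paper's setting, `f` would instead evaluate the ANFIS training error for a candidate set of membership function parameters.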
IV. TIME SERIES MODEL
In this contribution an additive model with the following components is applied to mimic the time series. A time series is a collection of statistical data gathered at equal, regular time intervals, and the statistical methods applied to this kind of data are called time-series analysis.

In this article a time-series technique is used to predict the future according to patterns identified in the past. The model works as follows: a certain number of data points at the end of the series are held out to assess the forecast, and the remaining data are evaluated and used for training. In fact, if x_t represents the value of the series at the present moment, it can be expressed as:

x_t = f(x_{t−1}, x_{t−2}, ..., x_{t−n})   (4)
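Equation (4) says that each value is predicted from a fixed window of its n predecessors. Building the corresponding training pairs from a series takes only a few lines; the series below is invented for illustration:

```python
import numpy as np

def make_windows(series, n):
    """Build (X, y) training pairs per eq. (4): y[t] = series[t],
    X[t] = the n values that precede it."""
    X = np.array([series[i:i + n] for i in range(len(series) - n)])
    y = np.array(series[n:])
    return X, y

series = [10, 12, 13, 15, 14, 16, 18, 17, 19, 21]
X, y = make_windows(series, n=3)
print(X.shape, y.shape)   # (7, 3) (7,)
print(X[0], y[0])         # first pattern: [10 12 13] -> 15
```

Holding out the last few (X, y) pairs as a test set mirrors the paper's scheme of reserving the end of the series to assess the forecast.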
Step 1: Completing the source data with internal and external factors and accessible data.

Step 2: Designing and training the first adaptive neuro fuzzy system (ANFIS-1) to predict budget sales.

Step 3: Designing ANFIS-2 to predict the second semiyearly period by using time series analysis. (By using ANFIS-2, prediction can be done without any data about effective factors.)

Step 4: Calculating the ratio of every season's sales to total sales from the seasonal sales data of various years; designing ANFIS-3 on the time series of these ratios to predict the next period; and obtaining, as a result, the sales forecast of all the seasons, so that the sales forecast can be made more accurate after each season occurs.

Step 5: Gathering the realized data of the last two seasons, then returning to the first step and going through the process again (a = a + 1).

[Figure: flowchart of the proposed forecasting and revision system. ANFIS-1 (seasonal coefficients from effective factors) and ANFIS-2 (time-series analysis) build the fuzzy inference system (FIS), whose learning engine is improved by using PSO; the total sale prediction for year a+1 is split by seasonal coefficients into seasonal predictions (sales budget version 1), and as the realized sales of spring, summer and winter arrive, successive budget versions replace the corresponding predictions with realized sales and record the deviations.]
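The revision loop in Steps 1-5 amounts to: split an annual forecast into seasons via seasonal coefficients, then, as each season's actual sales arrive, substitute them and adjust the remaining seasons. The toy sketch below illustrates only that bookkeeping; all numbers are invented, and the proportional re-scaling used here is our own simplification (the paper re-runs its ANFIS models at each revision):

```python
# Toy illustration of seasonal budget revision (all numbers invented).
total_forecast = 400_000                       # annual sale prediction
coeffs = {"spring": 0.19, "summer": 0.24, "fall": 0.27, "winter": 0.30}

# Version 1: split the annual forecast by seasonal coefficients.
budget = {s: c * total_forecast for s, c in coeffs.items()}

def revise(budget, season, realized):
    """Replace one season's prediction with its realized sales and
    spread the deviation proportionally over the remaining seasons."""
    deviation = realized - budget[season]
    rest = [s for s in budget if s != season]
    rest_sum = sum(budget[s] for s in rest)
    revised = dict(budget)
    revised[season] = realized
    for s in rest:
        revised[s] = budget[s] - deviation * budget[s] / rest_sum
    return revised

# Version 2: spring's realized sales arrive and the budget is revised.
budget_v2 = revise(budget, "spring", realized=80_000)
print(budget_v2)
```

Repeating the call after summer, then winter, produces the successive budget versions of the flowchart; the annual total is preserved at every revision.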
A. Semiyearly Modeling
In this section the output of ANFIS-1 has been trained with both the BPN and PSO methods. Fig. 5(a) compares the real data with the trained output of ANFIS-1, and Fig. 5(b) illustrates the percentage error. These outputs show that, when effective factors are used as inputs, the forecasting system trained by the PSO method gives better and more accurate results than the one trained by BPN.

Three common indices, Mean Square Error (MSE), Root Mean Square Error (RMSE) and Mean Absolute Percent Error (MAPE), are used as performance evaluation criteria (see Table II).

Fig. 5. Semiyearly modeling: a) training data with BPN & PSO methods, b) percentage error

TABLE II. THE COMPARISON OF TWO DIFFERENT OPTIMIZED RESULTS FOR SALE PREDICTION WITH ANFIS-1
(Columns: Optimization Method | Training Data: MSE, RMSE, MAPE (%) | Testing Data: Sale Prediction, MAPE (%))

Fig. 6. Seasonal modeling: a) training with BPN & PSO, b) percentage of error

Seasonal results (testing data):

Season | Method | MSE | MAPE (%) | Seasonal Coefficient | MAPE (%) | Prediction MSE
Spring | BPN | 3.98e-6 | 0.897 | 0.1912 | 1.24 | 5.5e-6
Spring | PSO | 3.99e-6 | 0.898 | 0.1913 | 1.26 |
Summer | BPN | 1.48e-5 | 1.547 | 0.2375 | 2.39 | 3.4e-5
Summer | PSO | 1.48e-5 | 1.547 | 0.2376 | 2.38 |
Fall | BPN | 1.63e-6 | 0.391 | 0.2659 | 1.50 | 1.66e-5
Fall | PSO | 1.63e-6 | 0.391 | 0.2658 | 1.48 |
Winter | BPN | 1.76e-5 | 1.174 | 0.3051 | 2.43 | 5.08e-5
Winter | PSO | 1.77e-5 | 1.168 | 0.3050 | 2.36 |
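The three error indices used in these comparisons (MSE, RMSE, MAPE) can be computed as follows; the two short arrays are illustrative, not the paper's data:

```python
import numpy as np

def error_indices(actual, predicted):
    """Return (MSE, RMSE, MAPE in percent) for two equally long series."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mse = float(np.mean(err ** 2))                       # mean square error
    rmse = float(np.sqrt(mse))                           # its square root
    mape = float(np.mean(np.abs(err / actual)) * 100.0)  # mean absolute % error
    return mse, rmse, mape

mse, rmse, mape = error_indices([100, 200, 400], [110, 190, 380])
print(mse, rmse, mape)
```

MAPE is the scale-free index of the three, which is why the tables report it for series of very different magnitudes (raw sales versus seasonal coefficients).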
TABLE VI. SALE PREDICTION FOR EACH SEASON USING BPN AND PSO

Season | Optimization Method | Sale prediction for each season | MAPE (%)
Spring | BPN | 72,588 | 2.1044
Spring | PSO | 72,177 | 2.6582
Summer | BPN | 90,122 | 5.622
Summer | PSO | 89,611 | 6.1573
Fall | BPN | 100,850 | 4.7649
Fall | PSO | 100,290 | 5.2998
Winter | BPN | 115,780 | 0.9935
Winter | PSO | 115,070 | 1.5998

Fig. 7. Second semiyearly modeling: training with BPN & PSO

The outputs in Table IV show that, when time series data are used as inputs, the forecasting system trained by the BPN method gives better and more accurate results than the one trained by PSO.

Table VI indicates the revision process based on one piece of realized data: the realized spring sales replace their prediction, all other seasons are revised accordingly, and this process repeats for summer as well.

TABLE IV. THE COMPARISON RESULTS OF DIFFERENT OPTIMIZATION METHODS FOR SECOND SIX-MONTH PREDICTION USING THE TIME SERIES METHOD
Optimization Method | Training Data: MSE | RMSE | MAPE (%) | Testing Data: Second six-month prediction | MAPE (%)
BPN | 1.8739e5 | 432.89 | 0.359 | 214,530 | 3.729
PSO | 1.683e6 | 1297.2 | 1.034 | 213,970 | 3.982

E. Hybrid model
The results show that the PSO method could improve on the performance of BPN in the ANFIS model when analyzing the effects of the internal and external factors on sales forecasting, while its effect on the analysis of time series data is ambiguous. It seems that PSO does not always perform well and that its behavioral parameters may need tuning.

Considering the fact that we face high test errors although we start from low training errors, one can assume that the training set was too small for this specific problem and the model is overfitted, especially as data points which are not in close proximity to the training set are hard to predict correctly.

We found that the best-fitting model for forecasting the discussed process was the hybrid ANFIS model, which produces more accurate predictions. The hybrid model applies a combination of the PSO and BPN methods to train the membership functions of the fuzzy inference system (FIS).
Table VII illustrates the optimization results of the hybrid method and its percentage improvement over BPN and PSO.

TABLE VII. OPTIMIZATION RESULTS OF THE HYBRID METHOD
(Columns: actual spring sales | actual summer sales | optimization method | sales amount of the first semiyearly | forecast for second semiyearly, prediction vs. real data | forecast seasonal coefficient, prediction vs. real data | MAPE (%))
REFERENCES
[1] S.G. Makridakis, S.C. Wheelwright and R.J. Hyndman, “Forecasting:
Methods and Applications,” Third edition, John Wiley and Sons. 1998
[2] R. Fildes, and R. Hastings, “The organization and improvement of
market forecasting,” Journal of the Operational Research Society, vol.
45, pp. 1–16. 1994.
[3] J. Mentzer and K. Kahn, "Forecasting technique familiarity, satisfaction, usage and application," Journal of Forecasting, vol. 14(5), pp. 465–476, 1995.
[4] H. R. White, “In Sales Forecasting: Timesaving and Profit-Making
Strategies that Work,” London, UK: Scott, Foresman and Company.
TABLE VIII. SALE PREDICTION FOR EACH SEASON AND IMPROVEMENT PERCENT USING THE HYBRID MODEL

Season | Optimization Method | Sale prediction for each season | MAPE (%) | Improvement vs. BPN | Improvement vs. PSO
Spring | Hybrid | 73,754 | 0.531 | 1.58% | 2.138%
Abstract—Cognitive Radio (CR) networks is well-known for its being obstacle to the activity of Primary Users (PU) or
expertness in solving the problem of spectrum scarcity that reside Licensed Users. In this paper, we cautious to pin-point that
with wireless resources where unlicensed users can Cognitive Radio Mobile Ad Hoc Network (CR-MANET),
opportunistically perform transmission without impacting the internally neither consists of federal party to obtain the
operations of licensed users. Channel switching is inherently spectrum usage information from the neighborhood nor
necessary to make unlicensed and licensed users to appropriate external third party provision (spectrum broker) that
propagation in the channel. All through the progression, empowers the distribution of the offered spectrum resources.
communication characteristics such as bottleneck bandwidth and
Round-trip time (RTT) have to be modified. However this change
has to adaptively update by TCP in its congestion window In classical ad hoc networks, the mobility of relay
(CWND) to make an efficient use of the available resources. TCP nodes and the ambiguity residing with wireless channels are
CRAHN is well-known mechanism which generates spectrum the two key factors that affect the reliable distribution data
alertness by retrieving explicit feedback from relay and from source to destination [3]. TCP CR-MANET works
destination nodes. In this paper we proposed TCP CR-MANET progressively in considering the temporal spectrum sensing,
and it is evaluated with respect to bottleneck bandwidth and channel switching and the cognizance of primary User’s (PU)
RTT varying characteristics. This protocol updates its CWND activity. Here, we proposed window based TCP for CR-
based upon the available buffer space of the relay node. TCP CR- MANET. CR-MANET constructed from many mobile
MANET system is implemented in NS2 simulator and analyzed
secondary users interconnected to each other in a distributed
with various characteristics of the network. Experimentally, it
has been proved that our proposed TCP CR-MANET provides
manner. CR-MANET can be deployed in various aspects of
better throughput with respect to TCP CRAHN. Intelligent Transport Systems (ITS) applications [4].
Keywords—Cognitive Radio, Congestion control, Spectrum The main challenges of transport layer in a classical
Sensing, Transport Protocol, and Mobile ad hoc network. wireless ad hoc networks are [5]
I. INTRODUCTION

Cognitive radio technologies have accompanied the rapid growth of wireless communication, owing to their capability of improving spectrum utilization by exploiting unused spectrum in dynamically changing environments. The unlicensed bands, mostly those around 900 MHz and 2.4 GHz, are getting more and more congested [1], since many wireless applications operate in them. The demand for radio spectrum has increased dramatically, while little additional spectrum remains to be allocated. However, according to the Federal Communications Commission's (FCC) report [2], the licensed spectrum bands are underutilized, with idle spectrum holes existing across both space and time. Cognitive Radio technology has the potential to ameliorate this scarcity of wireless resources, and Cognitive Radio Networks (CRNs) have gained increasing popularity for the same reason: they improve spectrum utilization by exploiting vacant spectrum in dynamically varying environments. Secondary Users (SUs), or unlicensed users, are able to function resourcefully in the licensed spectrum bands without degrading primary users' [9] performance, via 1) thorough recognition of the primary user and 2) effective utilization of the channel, which are two contradictory goals that need to be balanced by diagnosing a better blending factor. The transport layer adapts to the current sensing state and decides the optimal setting of the sensing time [11]; during this period it detects primary users and utilizes channels, while minimizing interference to primary users and maintaining throughput.

The main challenges of the transport layer in classical wireless ad hoc networks are [5]:

1. Congestion.
2. Packet drops due to channel-related problems.
3. Packet losses due to mobility.

In case 1, the RTT value increases with the queuing delay of the relay nodes. When the RTT value goes beyond the given limit, the relay nodes fail to forward packets, and this event degrades the performance of TCP. In case 2, packets are dropped in the network due to channel-related problems, such as fading and shadowing in the channel. In case 3, packet losses occur in the network due to mobility-related or permanent losses [6]. The source node would mistakenly consider all of the above cases as congestion events.

All these losses also apply to CR-MANETs. In a CR-MANET, we rely on intermediate nodes, which periodically piggyback spectrum information on Acknowledgements (ACKs) and also report the Primary User's arrival on time by explicitly informing the source. Recently, many dimensions of spectrum sensing algorithms have been the focus of research [1]. Handling diverse channel information derived from various channels, and attaining insight into the performance of these techniques, remains an open challenge from an end-to-end protocol perspective [7]. Our protocol anticipates channel switching events by adhering to significant updates in the bandwidth of the interfered link. When the bottleneck spectrum permits only a low data rate, there is a sudden rise in the number of packets entering the network per unit time. Thus we propose a TCP congestion window (CWND) that reacts rapidly to changes in the environment. TCP is a broadly studied area of wireless networking, for which numerous theoretical models have been proposed to explain and predict its performance. Hence the objective of TCP CR-MANET is to retain the window-based methodology of classical TCP while increasing its applicability.

II. MOTIVATION

Here we analyse one of the problems of preceding transport protocols, such as TCP NewReno in CRAHNs, which drives us to seek enhanced performance with the proposed TCP CR-MANET. In a CRAHN, each node is furnished with an RF transceiver. The key activities of a Cognitive Radio Network are:

1. Spectrum sensing.
2. Impact of primary user activity.
3. Spectrum change.

B. Impact of Primary User Activity

A primary user's activity is periodically detected during spectrum sensing or data transfer. If, on arrival of the primary user, the secondary user's operation affects the current channel, the system searches for an unoccupied channel elsewhere in the spectrum. When the current channel's spectrum sensing occurs at periodic, well-defined intervals, two activities are performed:

1. Discovery of the available channel set across the various spectrum bands.
2. Harmonization with the next-hop neighbours to derive mutually acceptable channels from the set.

In this setting a risk arises: the path towards the destination becomes disconnected and remains so until the system detects a new channel. The source is unaware of this event during that period, so the affected node has to transmit an explicit feedback notification to the source node. The transport protocol [11] differentiates these states based on the values of the "on" and "off" stages (α and β) of primary user activity, which comprise four different patterns as follows.
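The "on"/"off" primary-user behaviour parameterized by α and β is commonly modelled as an alternating two-state process. The sketch below is only an illustration of that idea: the exponential dwell-time assumption and all function names are ours, not the paper's.

```python
import random


def pu_activity_trace(alpha, beta, duration, seed=0):
    """Generate a primary-user on/off trace of the given duration.

    Dwell times are drawn from exponential distributions: rate
    `alpha` governs how long the channel stays "off" (free for
    secondary users) and rate `beta` how long it stays "on"
    (occupied by the primary user). This is a common simplifying
    assumption for PU activity models.
    """
    rng = random.Random(seed)
    t, state, trace = 0.0, "off", []
    while t < duration:
        rate = alpha if state == "off" else beta
        dwell = rng.expovariate(rate)
        trace.append((state, t, min(t + dwell, duration)))
        t += dwell
        state = "on" if state == "off" else "off"
    return trace


def channel_utilisation(trace):
    """Fraction of the observed time the channel is free for SUs."""
    total = sum(end - start for _, start, end in trace)
    free = sum(end - start for s, start, end in trace if s == "off")
    return free / total
```

A transport protocol could feed such a trace into its state differentiation logic, e.g. to decide whether a stall is more likely caused by PU activity than by congestion.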
A. Network Modeling
REFERENCES
Abstract—Wireless Sensor Networks (WSNs) consist of battery-powered sensor nodes that are often deployed in remote areas and left unattended for long periods between battery charges. According to the studies, over sixty percent of energy is consumed during the transmission and reception processes of the network. For this reason, protocols and algorithms have been designed to manage energy and ensure that the lifespan of the network is prolonged. This study proposes an improvement of the Energy Efficient Chain Based Network protocol (ECBSN). That protocol is limited in that it forms chains using a fixed number of nodes per chain instead of varying the number of nodes per chain at different levels. This causes the chain furthest from the Base Station (BS) to die faster if its Cluster Head (CH) is the chosen leader that has to relay all aggregated data to the sink, just as in the Improved Energy Efficient Chain Based Network (IEBSN). We therefore propose the Improved Energy Efficient Chain Based Routing (IECBR) in WSNs. This study varies the number of nodes at different levels: more nodes are assigned to the chain closest to the BS, and fewer nodes are assigned the further a chain is from the BS. It also considers the energy level and the distance from the CH node to the BS. The simulation results showed a great improvement in the lifespan of the nodes in the network as compared to ECBSN.

Keywords—Wireless Sensor Networks; Sensor Nodes; Energy Efficient; Hierarchical Clustering.

I. INTRODUCTION

Wireless Sensor Networks (WSNs) consist of sensor nodes which cost little and have limited power [1]. The structure of a sensor node consists of four basic components: a sensing unit formed from a sensor and an analogue-to-digital converter, a processing unit formed from a microcontroller with memory, a communication unit which is a transceiver, and a power unit whose energy source is usually a battery. Normally a WSN has numerous sensor nodes which are densely deployed in the region of interest, and a sink which may not be very far from the sensor nodes. The sensor nodes are deployed in an area to monitor changes in different applications, e.g. environmental, military, healthcare, and security and surveillance applications. A sensor monitors a physical phenomenon and generates an electrical signal based on what it has observed. The signal is passed to the analogue-to-digital converter so that the analogue signal can be understood by the microcontroller, which processes the signal and takes a decision about it. The communication device exchanges information between the nodes, converting the bit stream coming from the microcontroller into radio waves [2].

The design of WSNs is influenced by a number of factors. According to [3] these factors are fault tolerance, scalability, production cost, operating environment, sensor network topology, hardware constraints, transmission media, and power consumption. A WSN must be designed in such a way that even when one node fails due to power failure or physical damage, the network is still able to deliver reliable messages to the end user. Of the factors listed above, the major one to consider when designing a WSN is power consumption: since the nodes are battery powered, their power can be drained very quickly if there is no adequate power management strategy in place. To address this challenge, a number of protocols and algorithms have been proposed to manage power in the entire WSN.

Power consumption in the network can be divided into three main categories: sensing, communication and data processing. According to [4] and other researchers, over sixty percent of the energy is consumed in the transmit, receive, idle and sleep modes; in other words, most of the energy is consumed during communication. According to [5], some sensors can consume more energy than the radio, so choosing the right hardware components is also very important when designing a WSN: the components chosen contribute significantly to the total energy consumed in the entire network.

This paper is arranged as follows: Section 2 is the Literature Survey, Section 3 is the Proposed Technique, and Section 4 is the Conclusion.

II. ROUTING PROTOCOLS

This literature review focuses on energy-efficient routing protocols for wireless sensor networks, with particular attention to hierarchical chain-based protocols. Routing is very important for energy efficiency in WSNs. [6] shows that a good routing protocol can address challenges faced in WSNs such as scalability, fault tolerance, energy efficiency, and quality of service.
Figure 4: EECB [10]

The transmission cost follows the first-order radio model

E_tx(b, d) = Eelec * b + Eamp * b * d^k

where Eelec stands for the energy consumed by the hardware electronics (for our calculation we use 50 nJ/bit), Eamp is the energy for the amplifier (100 pJ/bit/m²), b is the number of bits, d is the distance between the nodes, and the coefficient k models signal strength loss. The value of k depends on the data propagation model used in the system: k = 2 for direct line of sight with no obstacles, and k = 4 for environments with extreme impedance. This protocol assumes the free-space case, with no obstacles and direct line of sight, where k = 2.
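The per-transmission cost under this model can be computed directly. A minimal Python sketch using the constants given above; note that, purely for illustration, the same amplifier coefficient is reused for k = 4, whereas in practice the multipath case typically uses a different coefficient:

```python
E_ELEC = 50e-9    # 50 nJ/bit: electronics energy per bit (as given)
E_AMP = 100e-12   # 100 pJ/bit/m^2: amplifier energy coefficient (as given)


def tx_energy(bits, distance, k=2):
    """Energy (joules) to transmit `bits` over `distance` metres.

    k = 2 assumes free space with direct line of sight;
    k = 4 models environments with extreme impedance.
    """
    return E_ELEC * bits + E_AMP * bits * distance ** k


def rx_energy(bits):
    """Energy (joules) to receive `bits` (electronics only)."""
    return E_ELEC * bits
```

For example, sending a 1000-bit packet over 10 m with k = 2 costs Eelec-dominated energy, while the amplifier term grows quickly with distance, which is exactly why chain-based protocols prefer many short hops over one long hop.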
3) Location Based Routing
This type of protocol utilizes position information to relay data to the BS, using signal strength to determine the distance between neighbouring nodes. One location-based protocol, Location-Aided Routing (LAR) in mobile ad hoc networks, is explained in [12].

This study proposes the Improved Energy Efficient Chain Based Routing (IECBR) in WSNs to enhance the lifespan of the system and reduce the energy used during communication (that is, transmission and reception). The operation of the protocol is founded on the following stages.

Clustering Stage
doing, the CH will directly send the received information to the head leader without aggregating it. The Min limit saves the energy of the CHs by not permitting them to perform data aggregation. When the CHs receive the information from their group members, transmission of the information will begin along the CH chain. This will ensure that the CHs aggregate their

(Figure: No of Nodes)
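The level-dependent chain sizing that IECBR calls for (more nodes in chains closer to the BS, fewer in chains further away) can be illustrated with a small allocation sketch. The exact allocation rule is not given in the text, so the inverse-of-level weighting below is an assumption of ours, as is the function name:

```python
def nodes_per_level(total_nodes, levels):
    """Split `total_nodes` across chain levels so that levels closer
    to the base station (level 1) receive more nodes.

    The allocation rule is illustrative only: each level is weighted
    by the inverse of its distance rank from the BS.
    """
    weights = [1.0 / level for level in range(1, levels + 1)]
    total_w = sum(weights)
    counts = [max(1, round(total_nodes * w / total_w)) for w in weights]
    # Absorb any rounding drift into the level nearest the BS so the
    # counts always sum to total_nodes.
    counts[0] += total_nodes - sum(counts)
    return counts
```

With 100 nodes and 4 levels this yields a decreasing allocation, so the chain nearest the BS, whose CH relays the most aggregated traffic, has the most members to share the leader role.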
Abstract—As an effect of the modernization policy in the administrative processes inside the Ecuadorean Central Government, the usage of Information and Communication Technologies (ICTs) has increased during the last years within more than 300 of the most important and high-ranked public institutions. Likewise, citizen access to Internet and e-Government services has grown due to the democratization of the governmental ICT platform to ensure access to the most important public services, including those offered by Higher Education Institutions (HEIs). However, the applicable ICT regulatory and legal framework in the public sector shows very little compliance with governance and management considerations. Our work develops a combined model for ICT governance and management based on academic models for governance and strategic alignment, aided by professional practices in the field, fully compliant with the regulatory mechanisms that operate within the Central Government of Ecuador. Also, our work analyzes the results, limitations and future work regarding the applicability of this model inside the Council for Evaluation, Accreditation and Quality Assurance of Higher Education Institutions (CEAACES) in order to support the process of continuous improvement of public higher education in Ecuador.

Keywords—ICT; IT; governance; management; COBIT; model; ISO; practices; standard

I. INTRODUCTION

According to Art. 226 of the Ecuadorean Constitution [1], the Executive Function (Central Government) is part of the public sector, comprised of more than 300 dependencies where more than a hundred e-Government applications are used [2] [3]. Currently, there are more opportunities to use these e-Government services, as the Central Government has promoted citizen access to ICT platforms. In fact, between 2010 and 2015, the percentage of Internet users amongst the Ecuadorean population increased from 17 to 51 per cent [4].

Governmental dependencies maintain ICT platforms to support e-Government services along with their internal processes. Management of these platforms is based on a regulatory and legal framework, which includes Organic and Common Laws, Statutory Instruments, Agreements, Decrees and General Norms. Out of these, the Internal Control Norms (ICN) of the General State Comptroller's Office [5], the Agreement for Information Security (IS) of the National Secretariat of Public Administration [6] and the Agreement for Process-Based Government (PBG) [7] make direct reference to ICT Management Processes. In Table I, the main characteristics of the legal framework are described.

TABLE I. ECUADOREAN LEGAL FRAMEWORK FOR PUBLIC SECTOR'S MANAGEMENT

Instrument: Internal Control Norms (ICN)
Authority: General State Comptroller's Office (CGE)
Description: The processes that should be implemented in public institutions are listed and briefly explained from a controlling perspective.

Instrument: Agreement for Information Security (IS)
Authority: National Secretariat of Public Administration (SNAP)
Description: The directives for using ISO 27001 [8] as National Technical Norm for implementing Information Security Management Systems (ISMS).

Instrument: Agreement for Process-Based Government (PBG)
Authority: Ecuador President's Office
Description: The components for establishing a process-based government, such as governing processes (subjects) and products/services (adjectives)a.

a. The adjectives depicted in the agreement are translated as products and services which are provided to the governing processes that are shown as subjects in this agreement.

The Central Government's public institutions, created before 2013, have processes equivalent to those described in Table I. These processes are known as Governing, Value-Aggregator and Enabling Processes (consulting and support). The process known as "Information and Communication Technologies (ICTs)" is considered part of the Enabling Processes (adjectives in PBG). That ICTs is seen as an "adjective" process makes it evident that both institutional and ICT processes must be aligned so that the relevance of the latter can be assured. Nonetheless, these criteria and alignment processes are not detailed enough in the Ecuadorean Legal Framework for ICT Management.

Because of the aforementioned reasons, in our current work we propose the design of a model-type artifact in order to
integrate ICT Governance and Management for the Ecuadorean public sector.

This artifact is developed by consolidating systematic alternatives, and partial but complementary solutions, both built on the basis of the analyzed problem, state-of-the-art research and the best practices to solve this issue; the artifact's strength relies on keeping its components internally consistent with the Ecuadorean Framework for ICT Management in the public sector. Then, in order to assess the applicability of this artifact-like model, the Council for Evaluation, Accreditation and Quality Assurance of Higher Education Institutions (CEAACES) is considered as a case study so that criteria for its implementation can be devised from the evaluation results. This research approach is based on the Design Science Paradigm proposed by Hevner, Ram, March and Park [9].

From this point forward, this article is organized as follows: in Section II, considerations regarding governance, ICT Management and strategic alignment are discussed in depth. Later, these considerations are used to define the model, which is fully specified in Section III. In Section IV, the evaluation of the model is described, as it was applied in CEAACES. Finally, in Section V, a discussion of the obtained results and further work is presented.

II. GOVERNANCE AND IT MANAGEMENT CONSIDERATIONS

According to Webb, Pollard and Ridley [10], whilst IT executives and managers (Information Technology Management, ITM) deploy and supervise business strategies, other structures handle organizational policies, culture and IT investment (Information Technology Governance, ITG). In other words, ITG defines and spreads the mechanisms required to ensure current and future business-technology alignment objectives [11], whereas ITM must ensure that the governance mechanisms are in place in order to fulfil the corresponding strategies [12].

ITG can be implemented by combining diverse relational structures, processes and mechanisms; plenty of research has been done in the last decade about how to implement ITG in a structured and process-oriented way [13] [14]. Recently, ISO 38500 [15] has become the first international standard that depicts general directives about implementing ITG inside organizations; however, it does not include mechanisms, techniques or specific tools, so it lacks practical contribution.

Regarding ITM, the Business-Driven Information Management model (BDIM) is the application of models, practices, techniques and tools for mapping and quantitatively evaluating the interdependencies between business performance and IT solutions in order to improve IT service quality and the related business outcomes [16] [17] [18]. Currently, ITIL [19] and COBIT [20] are the most developed frameworks for IT service delivery and governance, respectively.

For the integrated implementation of ITG and ITM, two models can be found in the specialized literature: Business-Objectives-driven IT Management (IT-MBO) and the Guidelines and Areas-based Model. The first is conceived around a set of key concepts: objectives, key performance indicators (KPIs) and organizational perspectives (financial, customer, processes and learning/growth) [21]. In this case, COBIT is a clear adaptation of this model.

A combined framework has already been created based on Guidelines and Areas [22] [23]. After comparing this model with others, it has been reported to be similar to COBIT 4.1. COBIT 5 [20] covers and enhances all the criteria of the previous version, so COBIT 5 can be considered the most complete framework for combining ITG and ITM [24].

Returning to a broad approach, in spite of COBIT and ITIL providing an excellent choice of IT frameworks for governance and service provision, respectively, they both have limitations. On the one hand, the definitions of impact, risk and other measurements are vague and not necessarily quantitative [21] [25]. On the other hand, they do not provide implementation details, which obliges the usage of other guides or standards to fill the gaps, as has been reported in previous work [26] [27].

A. IT Strategic Alignment

IT alignment aids institutions by maximizing IT investment return, generating competitive advantage through information systems, and delivering orientation and flexibility to react to new opportunities [28].

Some models have been identified and described as alignment research trends [29] [30] [31]. These models adopt the Strategic Alignment Model (SAM) [32], which has been widely adopted [31] and has been considered as a design paradigm [9] on which our proposal is developed. The SAM model is comprised of four domains which are related in various ways. Each of them describes one perspective that displays both links (strategic adjustment and functional integration) in order to obtain proper IT alignment inside an institution (Fig. 1).

Fig. 1. IT Strategic Alignment Domains [45]

For example, the second perspective, or technology transformation, describes the business strategy implementation through IT. Then, as business strategies are conductors for alignment, they provide vision and objectives to meet business
TABLE II. INTEGRATED MODEL FOR ITG/ITM

Management (ITM): Business-Driven Information Management Model (BDIM), a service-oriented management model based on COBIT and the ITIL Management Domains, with an internal approach covering departments and specific individuals, the present strategy, projects and operations, cost and quality, budget accountability, and current work.

COBIT guidelines: business structures/processes/mechanisms; IT processes; IT Balanced Scorecard (BSC); best practices; auditing; improvement; innovation.

ITIL domains: Service Strategy (SS); Service Design (SD); Service Transition (ST); Service Operation (SO); Service Configuration (CM).

Specific strategies and supporting norms: Information Security (ISO 27000); Business Continuity (ISO 22301); Risk Management (ISO 31000); Software Development (ISO 12207, CMMI); other strategies (other norms).

Areas: service; resources; risk; development; architecture; projects; support and Q&A; investment; outsourcing; compliance; improvement; innovation.
The strategy for defining the model is explained as follows:

i. COBIT is adopted as the generic practice for Governance and Management, considering its content and completeness level.

ii. For ITG, a basic framework is adopted: structures, processes, relationships and strategic alignment. Also, the recommendations for implementation are aligned with the local legal framework for the public sector so that compliance can be ensured throughout the whole model.

iii. For ITM, ITIL and COBIT service-related processes are considered, along with the eventual usage of specific norms and guidelines according to the needs.

The model components, or mechanisms, are comprised of recommendations, processes, strategic alignment, guidelines and norms. These mechanisms must be applied as follows:

1) If strategic alignment employs technology transformation in the SAM Model (Business Strategy -> IT Strategy -> IT Infrastructure): the COBIT Goal Cascade technique has to be applied in order to define the main ITG COBIT processes to be implemented.

2) According to the organization's restrictions and reality, some ITG recommendations depicted in the model should be ignored. For instance, Acquisition, IT Project Portfolio Management and IT Project Follow-Up are centralized activities in the public sector that may be overlooked. Whenever feasible, recommendations should be adapted to the closest ITG COBIT processes.

3) COBIT processes related to ITM are adopted. Alternatively, there are ITIL services that can be considered; however, in doing so, these services must be integrated with the ITG components. Then, it has to be defined whether processes need to be decomposed, and whether the granularity level given to them requires applying specific guidelines and norms. Eventually, current process capabilities and their maturity may be assessed by using a maturity/capacity model, or else processes may be included or discarded according to specific organizational needs.

IV. APPLYING THE ITG/ITM MODEL IN THE COUNCIL FOR EVALUATION, ACCREDITATION AND QUALITY ASSURANCE OF HIGHER EDUCATION INSTITUTIONS IN ECUADOR (CEAACES)

The proposed ITG/ITM Model has been applied to the Council for Evaluation, Accreditation and Quality Assurance of Higher Education Institutions in Ecuador (CEAACES). The model was considered suitable for application in CEAACES due to the multifaceted nature of this institution from the political and public points of view.

From the political context, this institution has its own organic structure [40] created to ensure transparency, quality and continuous improvement within the processes for evaluation, accreditation and academic quality assurance inside Ecuadorian higher education institutions (HEIs). Mainly, CEAACES, along with the Council of Higher Education (CES), is in charge of ensuring political and strategic alignment between the National Plan for Good Living [41] and the academic strategy inside the universities' current study programs and research projects.

From the public context, CEAACES is a public institution that is regulated by the Ecuadorean Legal Framework (Table I). Therefore, as a public institution, it is controlled by the Central Government in Ecuador. Since CEAACES regulates academic quality inside Ecuadorean HEIs, its institutional mission is clearly focused on education as a public service. Furthermore, Ecuador has promoted e-Government platforms to democratize access to public information and provide equal opportunities for citizens, including access to public higher education programs.

As a consequence, the information that CEAACES manages has a direct impact on the public and political scenario in which Ecuador deploys its national strategy. This information and its management processes require governance models that start at the strategy level and end at the operational IT areas. Thus, in order to apply the proposed model inside CEAACES, three alignment scenarios have been identified:

i. Corporative goals alignment, in which COBIT processes are taken into account.

ii. IT goals alignment, to ensure that the information infrastructure, its processes and strategies are aligned with the corporative goals.

iii. Business processes alignment, to ensure that the processes are deployed in compliance with the whole set of corporative and IT goals. These processes are classified by priority according to their contribution level: one IT goal (low priority), two IT goals (medium priority), three or more IT goals (high priority).

The alignment results are shown in Table III.
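The priority rule used in the third scenario (one IT goal is low, two is medium, three or more is high) can be sketched directly. The process identifiers and the dictionary shape in the usage example are our own illustration, not taken from the paper's tables:

```python
def process_priority(it_goal_count):
    """Classify a process by how many IT goals it contributes to,
    following the stated rule: 1 -> low, 2 -> medium, >= 3 -> high."""
    if it_goal_count >= 3:
        return "high"
    if it_goal_count == 2:
        return "medium"
    if it_goal_count == 1:
        return "low"
    return "none"


def classify(process_goals):
    """Map each process id to its priority, given the IT goals
    (any iterable) that the process contributes to."""
    return {proc: process_priority(len(goals))
            for proc, goals in process_goals.items()}
```

For example, a process linked to three IT goals would be classified as high priority and therefore retained for the further analysis described below for Scenario 1.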
Mission: Exercise management of public policy for quality assurance of higher education in Ecuador, through processes of evaluation, accreditation and categorization of Higher Education Institutions (HEIs).

Strategic Objectives:
a) To evaluate and accredit universities and polytechnic schools, and their undergraduate and graduate academic programs.
b) To evaluate and accredit superior institutes and their academic programs.
c) To place CEAACES as a reference in matters of higher education quality, introducing it into the national, regional and international academic debate.
d) To ensure academic quality of undergraduate and graduate students of HEIs.

Mission of ICT Management: To advise and provide technological support regarding management of evaluation, accreditation, categorization and quality assurance processes; as well as information technology management, through infrastructure and computer services for the management, storage and custody of data, information, and knowledge.

Corporative Goals (all scenarios): customer-oriented service culture; business service continuity and availability.

Scenario 1 (Corporative Goals): focused on customer orientation and internal policy compliance; emphasis on benefit making, majorly on risk management optimization.
Scenario 2 (IT Goals): customer orientation and business process functionality optimization; emphasis on risk management and resource optimization, majorly on benefit making.
Scenario 3 (Business Processes): customer orientation; emphasis on benefit making and resource optimization, majorly on risk management optimization.

COBIT Processes (2 IT goals):
Scenario 1: BAI01, BAI02, BAI04, BAI10, DSS01, DSS02, DSS06, MEA03
Scenario 2: BAI01, BAI02, BAI04, DSS01, DSS02, DSS05, DSS06, MEA01
Scenario 3: DSS03, DSS04, DSS05

COBIT Processes (1 IT goal):
Scenario 1: EDM05, APO03, APO05, APO07, BAI03
Scenario 2: APO01, APO03, APO05, APO07, BAI10, MEA02, MEA03
Scenario 3: EDM01, EDM02, APO01, APO02, APO03, APO05, APO07, APO08, APO09, APO10, BAI02, BAI04, BAI10, DSS01, DSS02, DSS06, MEA01, MEA02, MEA03

In Scenario 1, the high-priority processes encompass the others, so they are considered for further analysis. Next, considerations are made regarding corporative governance in CEAACES, along with particular process interests, so that definitive mechanisms can be established (Table IV).
TABLE IV. MECHANISMS IDENTIFIED AFTER APPLYING THE ITG/ITM MODEL IN CEAACES

Governance (ITG)

Basic Framework, Structure:
- ICT structure with centralized decision making, inherent to supporting public processes, according to the organizational structure of CEAACES.
- Creation of a Technology Committee at Advisor Level, led by the CEAACES Chairman or a representative.
- Integration of ICT alignment tasks into the roles and responsibilities attached to the ICT Department and/or Technology Committee.
- Presidency of the ICT Department in the Technology Committee, reporting on and fulfilling the administrator roles of ICT business relationships.

Governance Processes:
- EDM03 Risk Optimization Assurance
- APO02.05 Define Strategic Plan and Road Map
- APO09 Service Level Agreement Management
- APO03 Enterprise Architecture Management

Strategic Alignment (BITA):
- SAM Model: see Table II, Integrated Model for ITG/ITM.

Management (ITM)

Business-Driven Information Management Model (BDIM):
- APO13 Security Management
- BAI06 Change Management
- BAI03 Identification and Solution Construction Management
- DSS03 Problem Management
- DSS04 Continuity Management
- DSS05 Service Security Management
- MEA01 Supervise, Evaluate and Assess Performance and Continuity
- MEA02 Supervise, Evaluate and Assess the Internal Control System

Specific Strategies and Norms:
- Information security and security services: ISO 27001 and ISO 27002 (APO13 + DSS05)
- Business continuity and problem management: ISO 22301 and ISO 27001 (DSS03 + DSS04)
- Solution development: ISO 25000 and agile methods (BAI03)
1,3 Computer Science and Engineering Dept, University of Mauritius, Mauritius
r.sungkur@uom.ac.mu
2 School of Management, Information Technology and Governance, University of Kwa-Zulu Natal (Westville campus), South Africa
Singhup@ukzn.ac.za
Abstract—The introduction of game-based learning (GBL) into the pedagogical processes and curriculum design can increase student engagement in the learning process. There is a range of game-based learning approaches available but, so far, limited adoption of serious games has been recorded. The digital habits of learners should be studied carefully, to better understand the way current technology-savvy students learn and integrate socially. With the widespread use of mobile devices today, GBL has a vital role to play in the educational landscape. This research analyses the potential usage of mobile devices to enhance the learning process, through game-based learning.

Keywords—Game Based Learning; Mobile Devices; Serious games; assessment; student feedback

I. INTRODUCTION

Adopting Game Based Learning (GBL) as the basis of pedagogical processes and curriculum design is a way of promoting learner engagement in the education cycle. A variety of GBL approaches exist but actual, current usage is restricted to serious games [1], [2], [3]. To fully exploit the potential that GBL offers, the digital habits of learners should be understood fully. This could provide a better understanding of the way this ubiquitous generation of young students learns and integrates socially. Furthermore, learning games presently available may be limiting, but not as restrictive as the syllabus into which they are currently being integrated.

In simple terms, Game Based Learning refers to the concept of learning a specified module through a game [4], [5], [6].

Too many 'sage on the stage' type lectures, on a daily basis, can be monotonous, non-engaging and unstimulating, and hence may not be the most effective way to teach students. In this scenario, teaching is present, but the question often arises whether learning has actually taken place following the teaching 'session'. As early as 1976, research on student attention patterns during lecture sessions suggested that student attention declines steadily during a lecture [15]. Students' attention span lasts as long as their interest [16]. Pedagogically speaking, the average human mind has a prime focus time of between 10 and 20 minutes in a normal face-to-face session [17]. Beyond this, the potential for students to absorb more content decreases. Thus, if an important concept is presented after this 'prime focus time', students do not absorb this material. Furthermore, certain modules require greater practical exposure; but due to the nature of our current curriculum, these modules are restricted to being taught theoretically only. Quoting a Chinese proverb, "I hear and I forget; I see and I remember; I do and I understand": unless students are involved in practical sessions, thereby engaging with the material taught, comprehension of the theory taught is often difficult.

The question is: what does GBL offer that traditional lectures cannot? Ideally, GBL encourages the learner to explore the solution space and ask, "What if I do this?". The theory behind gaming is that decisions are taken at regular intervals. This forced decision-making process ensures that critical thinking is propelled and reflection harvested. Being forced to take decisions, the learning experience increases by leaps and bounds.

II. LITERATURE REVIEW

A. Pedagogical approaches to Learning

Every educator develops his/her own way of teaching. To be effective, this method should be adapted to suit his/her students' learning, while still supporting the personal pedagogy of the teacher. There are different kinds of pedagogies to learning, which include m-learning pedagogies [7], inclusive pedagogies [8], deep and surface pedagogies [9], and participatory approaches [10], to name a few. Three specific approaches that relate directly to the current research are explained briefly below:

1. Learning by doing [11], [12]

This is a very practical form of learning as it requires 'hands-on experience'. This is usually adopted in technical and
science studies where students acquire knowledge, and test this knowledge in laboratories and workshops, through practical engagement. With the advent of web technology, students can perform simulations in an online environment, enabling them to write up their findings immediately. Since this learning experience is group-based, the student can collaborate with other learners to get answers to their queries, as well as get feedback from their seniors, during the learning process. The 'learning by doing' approach is highly effective in environments where students can create their own artefacts.

2. Learning done through discussion and debate [11]

Traditionally, discussion and debate learning was arranged in 'real settings with real people', and success was achieved through a small class and an effective instructor. In recent times, discussions and debates are hosted in online environments, using web technology to facilitate the communication process. The introduction of web technology into this learning approach allows students to chat with each other using discussion forums, and thus share ideas in a very collaborative and flexible manner, without the constraints of time and place. The effectiveness of this learning approach is dependent on two factors: how big the student group is, and the role of the instructor.

3. Blended Learning [11], [13], [14]

Advocates of blended learning state that social and human interactivity such as body language, welcome, socializing and face-to-face contact are necessary in education. In this context, the blended learning approach has its roots in technology-supported learning. de Boer (2004) defines blended learning as an approach that blends different kinds of delivery and learning methods, which can be enabled by technology and supported by traditional teaching methods. Thus it offers the better of both worlds.

B. Game Based Learning

Tang, Hanneghan and El Rhalibi (2009) define GBL as "the innovative learning approach derived from the use of computer games or software applications that use games for learning and education purposes, such as learning support, teaching enhancement, assessment and evaluation of learners."

As presented in Figure 1, GBL not only offers a unique concept to complement traditional teaching strategies, but also infuses teaching with energy, encourages innovative thinking and provides diversification in teaching methods [18]. GBL encourages creative behavior and divergent thought (Fuscard, 2001), and can serve as an excellent ice breaker. GBL can act as a learning trigger, to induce lively discussion on learning concepts amongst students following game play.

Zyda (2005) defines a 'serious game' as 'a mental contest, played with a computer, in accordance with specific rules, that uses entertainment to further education and training' in various sectors. Sawyer and Smith's (2008) taxonomy of serious games expands this definition to include games for health, training, science and research, and production and marketing. Serious Games is a sub-category of GBL; however, it should be noted that the terms are sometimes used synonymously (Corti, 2006).

C. What is GBL and how does it work?

Advocates of effective GBL state that it is a highly effective and interactive platform, which motivates and engages learners in the continuous learning process [25]. This trend has been enhanced, with video game designers producing and refining highly encouraging environments for their players to enjoy, as depicted in Figure 1.

Figure 1: Learning with GBL [25]

In an environment of effective game-based learning, the principal focus is on a pre-defined goal, choosing actions that will help achieve the goal, as well as experiencing the consequences of those actions, be they correct or incorrect, all along the way. Mistakes are made in a minimal-risk setting, hence the participant actively learns from these mistakes, and through repeated practice and experimentation, the right actions are sought. In this way, the participant is constantly alert and engaged in exercising behavior and thought processes, which can easily be transferred from a simulated environment to a real-life scenario.

D. GBL as Empirical Evidence

The benefits of games and simulations for education have been researched extensively. Randel, Wetzel and Whitehill examined 68 comparisons of simulation/game approaches to learning against traditional instruction, in relation to student performance. Their main discoveries were:

- 56% found no difference; 32% favored simulations/games; 7% favored simulations/games with controls questionable; 5% favored conventional instruction.
- In so far as retention is concerned, it was found that simulations/games induced better retention over time than conventional approaches.
- With regard to interest, 86% of the respondents showed a much greater liking for games and simulations over conventional approaches.
F. GBL v/s Traditional Learning

The effectiveness of hands-on learning is not new [20]. Thus, properly designed GBL has many benefits over traditional learning methods, as presented in Figure 2, including being less costly and having minimal risk. The main advantage, however, lies in its significantly engaging learning process [21]. A precise set of circumstances can be attempted several times, thus enabling the student to explore the consequences of different actions. In the context of pedagogy, GBL is useful to enlighten practically oriented teaching topics, as well as to deal effectively with problem solving. Research shows that 'GBL has a singular role in building students' self-confidence' and it 'can reduce the gap between quicker and slower learners' (Fuscard, 2001). The comparison between GBL and Traditional Learning is summarized in Table 1.

Figure 2: GBL as a Pedagogical Device

Table 1: Comparison of Traditional Learning and GBL [27]

| Feature | Traditional Learning (lectures, online tutorials) | Game-based Learning |
| Cost-effective | X | X |
| Low physical risk/liability | X | X |
| Standardized assessments allowing student-to-student comparisons | X | X |
| Highly engaging | | X |
| Learning pace customized to individual student | | X |
| Immediate feedback in response to student mistakes | | X |
| Student can without difficulty transfer learning to real-world scenarios | | X |
| Learner is dynamically engaged | | X |

G. Advantages of GBL [22],[23]

Like other forms of technology adopted in the teaching process, GBL also offers advantages. Some of these advantages are briefly presented below.

H. The Four Principles of Learning Games [13],[24]

Since the GBL environments created are not suitable for everyone, often the games that we find uninteresting are the ones which cast a bad experience on our learning phase. For GBL to be effective, it must be structured in an environment that is conducive to student learning. In this context, Carnegie Mellon's Eberly Center for teaching excellence has devised four key principles that describe successful GBL learning approaches, as presented in Figure 3.
III. METHODOLOGY AND PROPOSED SOLUTION
A. Overview of Proposed System
The proposed system consists of developing mobile games
for the module titled ‘Network System Administration’, and is
depicted in Figure 4.
Games are designed in the form of quizzes and puzzles,
while at the same time, making interesting use of graphics and
videos.
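The quiz mechanics described above (a question bank with a running score, levels, and feedback) can be sketched in a few lines. The following is an illustrative sketch only, written in Python for brevity; the class and question names, and the level-up rule, are assumptions and are not taken from the application itself.

```python
# Minimal sketch of quiz mechanics: a question bank, a running score and
# level counter, and immediate feedback on mistakes.
# All names (QuizGame, Question, ...) are illustrative, not from the paper.
from dataclasses import dataclass

@dataclass
class Question:
    prompt: str
    options: list
    answer: int          # index of the correct option
    feedback: str        # shown immediately after a wrong attempt

@dataclass
class QuizGame:
    questions: list
    score: int = 0
    level: int = 1

    def attempt(self, q_index: int, choice: int) -> str:
        """Check an answer, update the score, and return feedback."""
        q = self.questions[q_index]
        if choice == q.answer:
            self.score += 1
            # advance a level every two correct answers (arbitrary rule)
            self.level = 1 + self.score // 2
            return "Correct!"
        # wrong answer: no score change, but targeted feedback right away
        return f"Incorrect. {q.feedback}"

# Example usage with a network-administration style question
bank = [
    Question("Which protocol resolves hostnames to IP addresses?",
             ["DHCP", "DNS", "SMTP"], answer=1,
             feedback="DNS (Domain Name System) maps names to addresses."),
]
game = QuizGame(bank)
print(game.attempt(0, 1))  # correct choice
print(game.score)
```

The immediate, per-question feedback is the point of the sketch: a wrong answer surfaces the misconception at once rather than at test time, in line with the principles discussed in the literature review.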
Principle 1

The first principle relates to a student's prior knowledge of the game. Students who have interacted with GBL, or a similar environment, previously have an advantage of familiarity over those students who are adopting it for the first time. In traditional learning, a buried student misconception might arise only at test time, whereas in a game-based approach, the weaknesses of the learner are immediately made apparent.

Principle 2

The second principle is a student's motivation. It is one of the key factors that sustain students in what they do to learn. Some may even play a particular game repeatedly, until they achieve a perfect score. Without knowingly realizing it, learners learn how to operate within a game environment, actively think, experiment and learn how to achieve their goals. Through repeated practice, the lessons learned help them to develop and master consistent and fruitful thought processes.

Principle 3

Students need to acquire component skills from GBL and know when to apply these skills. Learning is a step-by-step process, and each learner works at a different pace. This means slower students often struggle, and faster students become bored, in passive training programs. On the contrary, game-based learning can be tailored to each learner. A learner normally begins with the basic concepts and then gradually, as he begins to master his interim goals, moves on to more advanced challenges.

Principle 4

Students' quality of learning can be enhanced by goal-directed practice, coupled with targeted feedback. In this context, traditional learning, when compared to GBL, does not provide learners with the opportunity to practice thought processes and skills. In a GBL environment, as learners advance in the game, they also learn to experience immediate in-game consequences, especially with regard to their mistakes.

Figure 4: System Structure

The State Transition of the proposed system is shown in Figure 5.

Specifications Required

The different tools that will be used are:
- Any device using an Android platform [26] not beyond version 4.2 (Jelly Bean) to run the GBL software, e.g. tablets and smartphones (see Figure 6), and
- An Android Emulator using the latest APIs, to run the GBL application on a computer (see Figure 7).

Figure 6: Emulator
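The state transitions that Figure 5 depicts are not reproduced here, but the general idea of screen-to-screen transitions in such a quiz application can be sketched as a small finite-state machine. The states, events and transition table below are assumptions for illustration (written in Python for brevity), not the paper's actual diagram.

```python
# Hypothetical screen flow for a quiz app: menu -> question -> feedback,
# with a level-up screen between levels. The transition table is an
# illustrative assumption, not the system's real state diagram (Figure 5).
TRANSITIONS = {
    ("menu", "start"): "question",
    ("question", "answer"): "feedback",
    ("feedback", "next"): "question",
    ("feedback", "level_complete"): "level_up",
    ("level_up", "continue"): "question",
    ("question", "quit"): "menu",
}

def step(state: str, event: str) -> str:
    """Return the next screen; stay put on an unknown event."""
    return TRANSITIONS.get((state, event), state)

# Walk one short play session through the machine
state = "menu"
for event in ["start", "answer", "next", "answer", "level_complete"]:
    state = step(state, event)
print(state)  # ends on the level-up screen
```

A table-driven machine like this keeps the screen flow in one place, which makes it easy to audit that every game state has a way forward and a way back to the menu.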
Figure 7: Features of the GBL Mobile Application developed

IV. DISCUSSION

The Game-Based Learning Application for Mobile Devices was successfully created and effectively used to demonstrate how the teaching of a module on 'Network Systems Administration' can be made more effective and interesting. A number of interactive features such as quizzes, puzzles, jigsaw puzzles, a counter, scores, levels, graphics and videos were adopted to enrich the mobile learning experience. The GBL environment provided learners with videos for enhancing their learning. The feedback facility provided them not only with the correct answers to their tasks, but also with learning feedback on concepts that they had misunderstood. The pedagogical aspects of Game-Based Learning on Mobile Devices were considered while developing the application. Thus, the mobile application developed was self-explanatory, intuitive and user friendly.

It is understood that the GBL application developed is not meant to replace the lecture class, but rather to complement it, by providing students with a blended learning environment. The GBL appeared to be much more popular and effective with the students for revision purposes. The application has been tested with 40 students following the above-mentioned module and it was unanimously agreed that it definitely enhances the process of learning. Further enhancements to the application would include simulation of a network, speech recognition, and the possibility of incorporating multiple players, to make the games more challenging.

V. CONCLUSION

Game-based learning helps develop in the individual the sense of perseverance which is latent in him. When people are provided with interesting materials, it becomes easier for them to achieve their potential in a particular field of study. The sense of competition within GBL software is an enabler to foster inventiveness. The GBL application that has been developed not only allows for mobile learning but also allows the students to fully use the concept of GBL to enhance their learning experience. This has proved to be very motivating to the students, who become actively engaged in the learning process. Moreover, it can connect students sharing a common interest, thus encouraging creativity, as well as developing teamwork skills, which are crucial in the job market. Increasing time in the classroom to attempt to teach students how to think and perform in the face of real-world challenges will be ineffective. GBL can help students to better remember what they are learning, through the experiences and interactions with the application developed. This provides motivation to the students as they are able to see the connection between their learning experience and real-life scenarios.

REFERENCES

[1] Dagnino, F., Ott, M., Pozzi, F., & Yilmaz, E. (2015). Serious games design: reflections from an experience in the field of intangible cultural heritage education. E-learning & Software for Education.
[2] Arnab, S., Berta, R., Earp, J., De Freitas, S., Popescu, M., Romero, M., & Usart, M. (2012). Framing the Adoption of Serious Games in Formal Education. Electronic Journal of e-Learning, 10(2), 159-171.
[3] Stapleton, A. J. (2004). Serious games: Serious opportunities. In Australian Game Developers' Conference, Academic Summit, Melbourne.
[4] Blunt, R. (2007). Does game-based learning work? Results from three recent studies. In Proceedings of the Interservice/Industry Training, Simulation, & Education Conference (pp. 945-955).
[5] Van Eck, R. (2006). Digital game-based learning: It's not just the digital natives who are restless. EDUCAUSE Review, 41(2), 16.
[6] Pivec, M., Dziabenko, O., & Schinnerl, I. (2003). Aspects of game-based learning. In 3rd International Conference on Knowledge Management, Graz, Austria (pp. 216-225).
[7] Lindsay, L. (2015). Transformation of teacher practice using mobile technology with one-to-one classes: M-learning pedagogical approaches. British Journal of Educational Technology.
[8] Spratt, J., & Florian, L. (2015). Inclusive pedagogy: From learning to action. Supporting each individual in the context of 'everybody'. Teaching and Teacher Education, 49, 89-96.
[9] Boughey, C. (2015). Approaches to large-class teaching. Large-class pedagogy: Interdisciplinary perspectives for quality higher education, David J. Hornsby, Ruksana Osman, Jaqueline de Matos-Ala (Eds.): book review. South African Journal of Science, 111(1 & 2), 4-5.
[10] Anderson, V., McKenzie, M., Allan, S., Hill, T., McLean, S., Kayira, J., & Butcher, K. (2015). Participatory action research as pedagogy: investigating social and ecological justice learning within a teacher education program. Teaching Education, 26(2), 179-195.
[11] Tadesse, T., & Gillies, R. M. (2015). Nurturing Cooperative Learning Pedagogies in Higher Education Classrooms: Evidence of Instructional Reform and Potential Challenges. Current Issues in Education, 18(2).
[12] Yates, T. (2015). Learning and Teaching Community-Based Research: Linking Pedagogy to Practice by Etmanski, C., Budd, L. H., and Dawson, T. (ed.) 2014. University of Toronto Press. Toronto, ON. 388pp. Engaged Scholar Journal: Community-Engaged Research, Teaching, and Learning, 1(1).
[13] Jabbar, A. I. A., & Felicia, P. (2015). Gameplay Engagement and Learning in Game-Based Learning: A Systematic Review. Review of Educational Research, 0034654315577210.
[14] Roy, A., & Sharples, M. (2015). Mobile Game Based Learning: Can It Enhance Learning Of Marginalized Peer Educators? International Journal of Mobile and Blended Learning (IJMBL), 7(1), 1-12.
[15] Johnstone, A. H., & Percival, F. (1976). Attention breaks in lectures. Education in Chemistry, 13(2), 49-50.
[16] Matheson, C. (2008). The educational value and effectiveness of lectures. The Clinical Teacher, 5(4), 218-221.
[17] Read, B. (2005). Lectures on the Go. The Chronicle of Higher Education, 52(10), A39.
[18] Pitt, M. B., Borman-Shoap, E. C., & Eppich, W. J. (2015). Twelve tips for maximizing the effectiveness of game-based learning. Medical Teacher, (0), 1-5.
[19] Learning in Immersive Worlds
http://www.jisc.ac.uk/media/documents/programmes/elearninginnovatio
n/gamingreport_v3.pdf [Accessed 15 August 2015]
[20] Chen, C. H., Wang, K. C., & Lin, Y. H. (2015). The Comparison of
Solitary and Collaborative Modes of Game-based Learning on Students'
Science Learning and Motivation. Journal of Educational Technology &
Society, 18(2), 237-248.
[21] Furió, D., Juan, M. C., Seguí, I., & Vivó, R. (2015). Mobile learning vs.
Traditional classroom lessons: a comparative study. Journal of Computer
Assisted Learning, 31(3), 189-201.
[22] Hung, C. Y., Sun, J. C. Y., & Yu, P. T. (2015). The benefits of a
challenge: student motivation and flow experience in tablet-PC-game-
based learning. Interactive Learning Environments, 23(2), 172-190.
[23] Tsai, F. H., Tsai, C. C., & Lin, K. Y. (2015). The evaluation of different
gaming modes and feedback types on game-based formative assessment
in an online learning environment. Computers & Education, 81, 259-
269.
[24] Shi, Y. R., & Shih, J. L. (2015). Game Factors and Game-Based
Learning Design Model. International Journal of Computer Games
Technology, 2015.
[25] Jian, M. S., Shen, J. H., Huang, T. C., Chen, Y. C., & Chen, J. L. (2015).
Language Learning in Cloud: Modular Role Player Game-Distance-
Learning System Based on Voice Recognition. In Future Information
Technology-II (pp. 129-135). Springer Netherlands.
[26] Android 4.2 APIs | Android Developers, 2013. [Online], http://developer.android.com/about/versions/android-4.2.html [Accessed on 05 August 2015]
[27] Game Based Learning vs Traditional Learning, Game-Based Learning:
What it is, Why it Works, and Where it’s Going. [Online],
http://www.newmedia.org/game-based-learning--what-it-is-why-it-
works-and-where-its-going.html [Accessed on 10 August 2015]
[28] Sawyer, B. & Smith, P. (2008). Serious Game taxonomy. Paper presented
at the Serious Game Summit 2008, San Francisco, USA
[29] Corti, K. (2006) Games-based Learning; a serious business application.
PIXELearning Limited.
[30] Boer, W. F. de (2004). Flexibility support for a changing university. PhD dissertation, Faculty of Behavioural Sciences, University of Twente, Enschede, NL.
[31] Tang, S., Hanneghan, M., & El Rhalibi, A. (2009). Introduction to games-
based learning. In T. Connolly, M. Stansfield, & L. Boyle
(Eds.), Games-based learning advancements for multi-sensory human
computer interfaces (pp. 1–17). Hershey, PA: IGI Global
[32] Zyda, M. (2005). "From Visual Simulation to Virtual Reality to
Games", IEEE Computer, vol. 38, no. 9, pp.25 -32
[33] O'Neil, H. F., Wainess, R., & Baker, E. L. (2005). Classification of learning outcomes: evidence from the computer games literature. Curriculum Journal, 16(4), 455-474.
Privacy Challenges in Proximity Based Social
Networking: Techniques & Solutions
Asslinah Mocktoolah
Department of Computer Science and Engineering, University of Mauritius, Mauritius
ashlinahmee@gmail.com

Kavi Kumar Khedo
Department of Computer Science and Engineering, University of Mauritius, Mauritius
k.khedo@uom.ac.mu
Abstract—The development of Proximity based Mobile Social Networking (PMSN) has been growing exponentially with the adoption of smartphones and the introduction of Wi-Fi hotspots in public and remote areas. Users present in the vicinity can interact with each other using the embedded technologies in their mobile devices such as GPS, Wi-Fi and Bluetooth. Due to its growing momentum, this new social networking has also aroused the interest of business people and advertisers. However, due to a lack of security in these networks, several privacy concerns were reported. Users are more reluctant to share their locations and, to address this issue, some initial solutions to preserve location privacy were implemented. The aim of this paper is to present a clear categorization of the different privacy threats in PMSN. Moreover, the location privacy enforcement policies and techniques used to ensure privacy are outlined, and some solutions employed in existing systems are presented and discussed. To the best of our knowledge, this is the first study outlining several categories of PMSN privacy challenges and their solutions in this new type of social networking services. Finally, some privacy research challenges and future perspectives are proposed.

Keywords—proximity based social networking; location privacy challenges; security; privacy solutions; categorization of privacy challenges

I. INTRODUCTION

The 21st century has witnessed an array of technological advances in the world of the Internet changing the nature of socialization and communication, especially with the advent of social media networks. According to statistics carried out in March 2015, Online Social Networking (OSN) prevails as the most popular category of networks visited worldwide [1]. Earlier, even with the notion of freedom of speech, it was not obvious how to share views or pass on a message easily. With this new phenomenon, voicing opinions is as easy as a piece of cake; for example, the hashtag #JeSuisCharlie, which translates to 'I am Charlie', allowed millions of people from all over the world to share their solidarity messages just a few hours after the terrorist attack in Paris [2].

During recent years, with the rapid evolution of technology and the upsurge of smartphones, traditional OSN websites such as Facebook [3] and Twitter [4] have extended their services to mobile applications. Users can now enjoy the benefits of social networks by communicating with their friends and sharing information while they are on the move. Taking advantage of the embedded location services such as GPS in mobile devices, this concept was further exploited to create a new category of mobile social networks called Proximity based Mobile Social Networking (PMSN), targeting mainly users in the vicinity.

PMSN refers to the social interaction of physically proximate mobile users through their mobile devices via Bluetooth or Near Field Communication (NFC) techniques. While OSN has been accused of promoting anti-social behavior, PMSN on the other hand has bridged the gap between the virtual and physical worlds. Different services for users within close proximity are provided, such as connecting to known friends or strangers having similar interests, selecting nearby restaurants having good reviews or setting reminders at explicit places. However, sometimes, these benefits are overwhelmed by the cost in revealing locations, and many users share a mutual apprehension of sharing their location information. According to a recent report by the Pew Research Center, a slight drop is observed in the number of smartphone users who use location based services, and around 46% of teen users and 35% of adult users have turned off their location tracking features due to privacy concerns [5]. The reluctance to adopt these services is mainly due to the dangers associated with the disclosed location coordinates, such as physical stalking, tracking usual routes of users or identifying sensitive information about users' visits to some embarrassing places such as clinics or clubs [6].

To minimize the risks related to the privacy issues and to further encourage people to use PMSN services, many attempts to preserve privacy have been made, such as K-anonymity algorithms, obfuscation techniques or cryptographic schemes. However, no in-depth analysis of the different privacy challenges and solutions of PMSN has been made prior to the implementation of an effective privacy preservation framework. Several recent studies on mobile social networks have presented a categorization of different privacy issues and outlined some possible solutions, but a survey on the PMSN privacy challenges and the existing works on the solutions implemented is missing. This study is important for further research in privacy and security in PMSN networks by addressing the different
privacy concerns so as to assure the continuity of these systems.

The rest of this paper is organized as follows. In Section II, an overview of PMSN services is outlined, and a categorization of the privacy challenges related to these platforms is presented in Section III. Section IV gives an idea of the privacy enforcement policies and techniques, while Section V studies the existing privacy-aware PMSN solutions. Some privacy research challenges and future perspectives are described in Section VI, and the study is summarized on a concluding note in Section VII.

II. PROXIMITY BASED MOBILE SOCIAL NETWORKING (PMSN)

Social network giants such as Facebook focus on re-creating a user's offline social graph on the net but usually overlook the changing social networks that users participate in in their daily lives, for example, people working out at the gym, shopping for groceries or parents accompanying their children to playgrounds. PMSN revolves around connecting these users, who share the same interests or activities at that same moment in time, helping them to discover new connections based on their physical proximities.

PMSN applications such as Lokast [7] and Proxxi [8] automatically discover users in proximity and help them to interact with each other based on their geographical locations. One of the popular PMSN applications, Foursquare, with more than 55 million users worldwide, adopted its own location detection technology, Pilgrim, using GPS and users' past check-in histories [9]. The users' locations are also tracked in the background and push notifications are sent when they are located at specific places. Although these tracking services can be helpful to users, privacy issues may also arise since, according to the Foursquare privacy policy, user information is leaked to third parties, e.g., to display the user's current location on a map, the latitude and longitude details are directed to the Google map server. Actually, no privacy protection schemes are present for large PMSN platforms such as Foursquare; the security of users is only based on the privacy settings as managed by the users.

These networks are also usually known as ephemeral social networks due to the short duration for which they are created at specific events [10]. For example, when attending conferences, these applications can be very advantageous to allow new attendees to connect to people sharing the same research areas or having similar interests. PMSN has been applied in other areas as well, for businesses, advertising or marketing, and also for entertainment purposes such as gaming applications where mobile game users adopt new gaming experiences and different ways to interact with other users around them. By diversifying into different application areas, PMSN is paving the way to attract more users to adopt this new trend.

Such applications are mostly based on a Peer-to-Peer (P2P) architecture allowing mobile users to communicate with each other in the absence of a central server through wireless ad-hoc networks. The P2P architecture gives rise to a greater number of privacy issues as compared to a client-server architecture. The next section gives a broad categorization of the different privacy challenges in PMSN systems.

III. PRIVACY CHALLENGES IN PMSN

Together with the popularity of PMSN, an increasing danger to user privacy is observed due to the repeated release of location information of the users. Some efforts have been made to give an overview of the privacy threats of PMSN, but a detailed and comprehensive categorization of PMSN privacy challenges is missing. In this paper, the main privacy challenges are categorized in four main classes: location privacy, identity privacy, trust and malicious attacks.

A. Location Privacy

In PMSN networks, it is imperative to share location coordinates, but this act is accompanied by several problems, such that it can cause users trouble ranging from being robbed to being sexually assaulted. Users' usual routes can be predicted with advanced attacks, and even the mode of transport can be inferred [11]. The location privacy issues in these systems are further categorized into different sub-classes: privacy disclosure, absence location privacy, inferred location privacy, location data exploitation and location cheating.

1) Privacy Disclosure

With the advent of location-enhanced technology, disclosing location information to another person or service can be valuable but yet dangerous. It can be risky if anyone can access the information and, in addition, location information can be directly associated with other information such as identities, preferences and social relationships.

2) Absence Location Privacy

The absence location privacy issue refers to the hypothesis that a person is not at a particular place at a given moment. An example of such a privacy violation refers to a theft that occurred in Manchester in August 2010, where social networking was used to verify the absence of the victims before committing the theft at their residence [12].

3) Inferred Location Privacy

In this case, locations of other users shared in the network are used to decipher the fact that a specific person is currently at a place, based on the relationships between them. For example, a group of friends usually hang out together, and if one of them shares their location, it can be deduced that the others are located at the same place.

4) Location Data Exploitation

Location information exploitation is related to the extent of how PMSN applications or third parties use the data shared and for which purposes it is being used. The privacy policy of commercial PMSN applications such as Foursquare clearly states that once information has been
[Figure: Taxonomy of PMSN challenges, solutions and techniques. Location Privacy: location disclosure, absence location, inferred location, location exploitation, location cheating. Identity Privacy: direct anonymity, K-anonymity, neighborhood attacks. Malicious Attacks: wormhole attacks, Sybil attacks, eavesdropping, distributed DoS. Also shown: trust relationships.]
Abstract—Transportation costs for road transport companies may be intensified by rising fuel prices, levies, traffic congestion, etc. Of particular concern to the Mpact group of companies is the long waiting times in the queues at loading and offloading points at three processing mills in the KZN (KwaZulu-Natal) province in South Africa. Following a survey among the drivers who regularly deliver at these sites, recommendations for alleviating the lengthy waiting times are put forward. On the strength of one of these recommendations, namely the innovative use of ICTs, suggestions on how cloud-based technologies may be embraced by the company are explored. In the process, the value added by a cloud-based supply chain, enterprise systems, CRM (Customer Relationship Management) and knowledge management is examined.

Keywords—Transportation company; Supply chain; Cloud computing; Enterprise systems; CRM; knowledge management

vehicles taking detours, resulting in longer turnaround times, which reduce overall supply chain efficiency. According to the National Planning Commission, 20% of the paved road network is currently classified as being in a poor or very poor condition [4].

Other transportation challenges include volatile fuel prices, toll costs and stricter carbon requirements. Leading transportation companies are addressing some of these issues by optimising vehicle fleets, using more fuel efficient vehicles, implementing strategic route planning, and advancing driver training [4].

An additional challenge experienced by transport companies is potentially long waiting times during the offloading of goods and the subsequent reloading of goods onto a transportation vehicle. Of particular interest to the researchers is the impact such waiting times have on the Mpact group of companies [5], which specialise in the manufacture of paper. Trucks often have to wait for extended periods in long queues during the offloading of timber and the reloading of finished products.
Figure 2: Recoverable Fibre Collection Points and Recycling Mills (synthesised by researchers).
Recoverable fibre is moved to the individual processing plants from various areas across the country. Bottlenecks of vehicles queuing at the offloading sites often occur. Currently, the planning of the vehicles is done off-line by an individual who receives an order from the recycling plant informing them of the various recycled fibre available around the country, as well as the mill to which the recyclable fibre needs to be transported. The scheduler does not have the information on the stock and consumption levels of the mills needed to determine whether recyclable fibre can indeed be sent to them. Over and above these factors, the scheduler also needs to determine the travel times associated with each of the loads, as the distances to each mill vary across the loading sites distributed around the country.

The above uncertainties lead to shortcomings in the planning of routes, ultimately leading to trucks arriving at a mill with a long queue while there may be no queue at some of the other mills.

As part of a solution, the power of more information on the processes, integrated communication, and real-time analytics may well deliver savings. Logistics solutions that allow for real-time monitoring make accurate models of merge-in-transit possible for the first time: a process that was previously considered too complex to accurately estimate can now be forecast and monitored through a real-time, online dashboard in the cloud [16].

quantitative studies strive for random sampling; qualitative studies often use purposeful or criterion-based sampling, that is, a sample that has the characteristics relevant to the research questions [18].

Ultimately, all fieldwork culminates in the analysis and interpretation of some dataset. Analyses involve categorising the data into manageable themes, patterns, and relationships. Subsequently, one aims to understand the various constitutive elements of the data through an inspection of the relationships between concepts, constructs or variables, and to determine whether there are any patterns or trends that can be identified or isolated, or to establish themes (factors) in the data [19].

A survey instrument to assess the turnaround times of trucks at the three recycling mills was developed and validated. The survey relied not only on driver feedback, but also on data captured through the use of on-board tracking devices installed on each of the trucks, which was analysed to determine the actual turnaround times of the vehicles at the various mills. The financial implications for the recycling mills, the transporter and the truck drivers were determined and analysed. Aspects of qualitative and action research methods were integrated into the formal process. Qualitative and quantitative data were collected from each of the interviewees. Root causes were identified, and meaningful inferences focusing on the objective (to reduce turnaround times) were made.
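The scheduler's mill-selection problem described above can be illustrated with a minimal dispatch rule: send a load to the mill with the smallest estimated completion time, travel plus expected queue wait. The paper prescribes no algorithm, and the travel times, queue lengths and offloading rate below are invented for illustration:

```python
# Minimal sketch of the kind of dispatch rule ICT planning software could
# apply: choose the mill minimising travel time plus estimated queue wait.
# All figures are hypothetical; the paper does not specify an algorithm.

def best_mill(travel_hours, queue_trucks, offload_hours_per_truck=0.75):
    """Return the mill minimising travel time + estimated queue wait."""
    def eta(mill):
        return travel_hours[mill] + queue_trucks[mill] * offload_hours_per_truck
    return min(travel_hours, key=eta)

travel = {"Felixton": 3.0, "Piet Retief": 5.0, "Merebank": 4.0}  # hours from loading site
queues = {"Felixton": 10, "Piet Retief": 1, "Merebank": 4}       # trucks currently waiting

print(best_mill(travel, queues))  # Piet Retief: 5.75h estimated, vs 10.5h at Felixton
```

Such a rule only works if live queue lengths and stock levels are visible to the scheduler, which is precisely the information gap a cloud-based system is meant to close.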
SURVEY REPORT
Figure 3: Data Flow Analysis (Adapted from the SA Office of Research, Development, and Technology).
V. FINDINGS & RECOMMENDATIONS

Analyses of the financial losses and the potential financial savings indicated that by shortening the turnaround times to within the four-hour norm, a saving of two million ZAR could be gained. The results of the driver questionnaire highlighted the mills with the worst offloading times, as well as the areas within the offloading process that required attention.

The study identified a number of measures that could be implemented to reduce the time spent by a driver at the recoverable fibre offloading sites:

The implementation of ICT planning software that takes into account the variable factors determining truck arrival dates and times at specific mills, thereby facilitating planning to reduce bottlenecks at the mills, was identified as a strategic necessity. Cloud computing could be used, as the cloud allows for track-and-trace functionality, with drivers submitting proof of delivery from mobile devices. Customers could then access such data. Not only is this a facility customers are coming to expect, but it also alleviates the administrative and communications burden on the scheduler [21].

Through the survey it was also established that the use of flat deck trailers contributed to the longer than normal turnaround times. Subsequently, the study identified that the use of tautliner trailers instead of flat deck trailers would reduce the turnaround time, owing to the time saved by not having to tarp and un-tarp the flat deck trailer. It was recommended that tautliners be introduced into the transporter fleet as part of its vehicle replacement programme. Further discussion of this technical recommendation is beyond the scope of this paper; more information appears in [6].

The results of the survey furthermore revealed a correlation between how helpful the drivers found the staff (EQ measurement) at each mill and the turnaround time at that mill. Mills with more helpful staff had a better turnaround time than those where the staff were not perceived to be helpful. It is recommended that companies educate their employees and create a service-oriented culture at all mills.
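The reported relationship between staff helpfulness and turnaround time can be quantified with a rank correlation. The paper does not state which statistic was used, and the per-mill figures below are invented for illustration; the sketch shows a Spearman rank correlation computed from scratch:

```python
# Sketch: quantifying the helpfulness/turnaround link with a Spearman
# rank correlation. The per-mill means below are hypothetical.

def rank(xs):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

helpfulness = [3.6, 2.1, 2.9]  # mean driver rating per mill (hypothetical)
turnaround  = [4.5, 7.8, 5.9]  # mean turnaround hours per mill (hypothetical)
print(round(spearman(helpfulness, turnaround), 2))  # -1.0: more helpful, shorter waits
```

A strongly negative coefficient would support the survey finding that mills with more helpful staff turn trucks around faster.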
Various advantages of the different cloud-based supply chains are [10] [11]:

A connected supply chain has the following advantages:
- Real-time visibility: supply chains become more dynamic, secure and interactive.
- Actionable insights: innovative data analysis supports advanced decision-making.
- Enhanced and accelerated innovation: digitally inspires and supports creative advances in design, personnel, operations and customer relationships (CRs).

Advantages of a scalable supply chain are:
- Improved efficiency: provides for the integration of people, processes and technology.
- Organisational flexibility: digital plug-and-play enablers provide for natural "configure and re-configure" capabilities.
- Personalised experiences: channel-centric supply networks help foster individualised products and services.

Advantages of a rapid supply chain are:
- Seamless collaboration: supply chain capabilities are harmonised beyond physical boundaries.
- Highly evolved operating models: product/service delivery is exponentially improved to meet customers' evolving demands.

Advantages of an intelligent supply chain are:
- Enhanced responsiveness: using better information and sophisticated analytics to interpret and react speedily to disruptions, including demand and supply signals.
- Proactive prevention: decision support, driven by predictive analytics, helps to confirm reliability and rapid adaptability.
- Last mile postponement: swift repurposing of organisational assets at short notice helps to ensure that supplies always meet changing demands.

Supply-chain-enabled cloud computing has lately been recognised as a decisive enabler, providing a route through which supply chain executives can rapidly and efficiently access innovative supply chain solutions [9] [10].

Next we discuss how cloud computing enablers may assist with the transport company challenges identified earlier.
Subscribing to enterprise- and CRM software in the cloud offers lucrative advantages to a transport company. Coupled with an

[Table fragment: Transport Logistic - SaaS: supply chain management (SCM); information in real time; improving user experience (UX).]

VIII. CONCLUSIONS AND FUTURE WORK
Owing to the deregulation of freight movement on South African roads and the decrease in rail volumes out of the mills, the net result has been an increase in freight movement through paper processing mills, which has resulted in an increase in turnaround times at such mills. The increased turnaround times incur additional costs for the owners of the fleet, resulting in higher transport tariffs for the customers of such services. One of the challenges faced is the layout of the recoverable fibre transport matrix: all the individual processing plants are located in KZN, while the recoverable fibre is transported from areas across South Africa. Consequently, personnel have difficulty in optimally planning the movement/routes of the trucks.

In an attempt to alleviate the challenges, the researchers embarked on a survey among the drivers delivering at these mills. A questionnaire was distributed, and it was confirmed that drivers are paid on an incentive basis related to productivity and hence the kilometres they drive per month. The research indicated that driver morale is negatively affected by the long turnaround times, leading to time lost on the road and a subsequent loss in income. The information in this section provides an answer to RQ2.

A further finding was that the use of innovative ICTs would be of value. Some correlation was also revealed between the friendliness of the personnel at the mills and the queue waiting times. The use of a different kind of truck was also suggested.

On the strength of the recommendation for ICTs, the researchers investigated how cloud-based technologies in the supply chain may provide an answer to these challenges. It was found that the SaaS service model may be used for a cloud-based enterprise system embodying a CRM (Customer Relationship Management) system and knowledge management. As a result, the company's IT infrastructure and logistics management would be strengthened.

Further research would be required to study time and motion properties of the waiting trucks. A cloud-based framework could also be developed for road transport companies.
1. Name / Age / Telephone Number / E-Mail address

2.a. What time do you normally arrive to off-load at Felixton mill? (6:01-12:00 / 12:01-18:00 / 18:01-24:00 / 24:01-6:00)
2.b. What time do you normally arrive to off-load at Piet Retief mill? (6:01-12:00 / 12:01-18:00 / 18:01-24:00 / 24:01-6:00)
2.c. What time do you normally arrive to off-load at Merebank mill? (6:01-12:00 / 12:01-18:00 / 18:01-24:00 / 24:01-6:00)

3.a. What time of the day would you consider the best to arrive at Felixton Mill (allows for quick turn-around)? (6:01-12:00 / 12:01-18:00 / 18:01-24:00 / 24:01-6:00)
3.b. What time of the day would you consider the best to arrive at Piet Retief Mill (allows for quick turn-around)? (6:01-12:00 / 12:01-18:00 / 18:01-24:00 / 24:01-6:00)
3.c. What time of the day would you consider the best to arrive at Merebank Mill (allows for quick turn-around)? (6:01-12:00 / 12:01-18:00 / 18:01-24:00 / 24:01-6:00)

5.a. How helpful do you find the Mpact personnel at Felixton Mill? (Very Helpful / Helpful / Unhelpful / Very Unhelpful)
5.b. How helpful do you find the Mpact personnel at Piet Retief Mill? (Very Helpful / Helpful / Unhelpful / Very Unhelpful)
5.c. How helpful do you find the Mpact personnel at Merebank Mill? (Very Helpful / Helpful / Unhelpful / Very Unhelpful)

6.a. How many trucks are on average before you in the queue at Felixton Mill? (0 to 3 / 4 to 7 / 7 to 10 / 10 or more)
6.b. How many trucks are on average before you in the queue at Piet Retief Mill? (0 to 3 / 4 to 7 / 7 to 10 / 10 or more)
6.c. How many trucks are on average before you in the queue at Merebank Mill? (0 to 3 / 4 to 7 / 7 to 10 / 10 or more)

7. Do you feel that Mpact is trying its best to turn the trucks around quickly? (Yes / No)
8. Are you incentivised and paid per kilometre you drive? (Yes / No)
9. If you could choose not to load for Mpact, would you? (Yes / No)
10. Which of the three mills takes the longest to turn your truck around? (Felixton / Piet Retief / Merebank)
11. Would the use of taut-liner trailers improve the turn-around time? (Yes / No)

12.a. How much longer does it take on average to turn around at Felixton Mill compared to other offloading points? (0-1 hrs / 1-2 hrs / 2-3 hrs / 3 or more hrs)
12.b. How much longer does it take on average to turn around at Piet Retief Mill compared to other offloading points? (0-1 hrs / 1-2 hrs / 2-3 hrs / 3 or more hrs)
12.c. How much longer does it take on average to turn around at Merebank Mill compared to other offloading points? (0-1 hrs / 1-2 hrs / 2-3 hrs / 3 or more hrs)
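The categorical questionnaire answers lend themselves to simple per-mill tallies, for instance reporting the most commonly observed queue-length bin from question 6. The responses below are invented for illustration; the actual analysis also drew on the on-board tracking data:

```python
# Sketch of tallying categorical questionnaire answers (question 6,
# queue length on arrival) per mill. Responses are hypothetical.
from collections import Counter

responses = {  # mill -> one answer per driver (invented)
    "Felixton":    ["10 or more", "7 to 10", "10 or more", "4 to 7"],
    "Piet Retief": ["0 to 3", "4 to 7", "0 to 3", "0 to 3"],
    "Merebank":    ["4 to 7", "4 to 7", "7 to 10", "0 to 3"],
}

def modal_bin(answers):
    """Most frequently reported queue-length bin."""
    return Counter(answers).most_common(1)[0][0]

for mill, answers in responses.items():
    print(mill, "->", modal_bin(answers))
```

A tally like this immediately surfaces the mill with the worst reported queues, which the driver data can then be cross-checked against.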
How to Detect Unknown Malicious Code Efficiently?
Abstract—Recently, rapid developments in IT have led to the creation of various platforms, and with these new platforms, diverse malicious codes are created to target them. These new malicious codes pose a critical new threat to national infrastructure, especially to the important systems whose compromise can lead to social chaos. In Korea, Korea Hydro and Nuclear Power was hacked and blueprints were stolen, which were later posted online. This created a great problem, as the place that was hacked was critical infrastructure. Thus, anti-virus techniques are sought as an effective method to analyse the malicious codes that are created, uncontrollably, every day. However, the personnel who analyse malicious codes are limited in number compared to the newly created malicious codes. How to detect unknown malicious code efficiently therefore remains an unanswered question. To answer it, the malicious code analysis method has to be considered, with the critical codes addressed first. In order to analyse unknown malicious codes effectively, an unknown malicious code detection model was introduced in the previous paper. However, this model sometimes treated a normal file as malicious code, which eventually decreased its effectiveness in finding and analysing malicious codes. Thus, it became necessary to decrease the misdetection rate in order to increase the effectiveness of the model. As a result, in this research we created specific conditions that decrease the misdetection rate significantly. Hence, in this paper we present a method that detects unknown malicious codes more efficiently.

Keywords—IT security; Unknown malicious code detection model; Critical Infrastructure

I. INTRODUCTION
Information technology (IT) has become part of our lives; it is involved in our daily activities, and without it we cannot live comfortably. With the advancement of technology, issues related to data and file security have also grown. With the widespread use of IT in society, various new IT-based platforms are being created, and with the development of new IT platforms, malicious codes that target those platforms are also being created, which makes pre-existing signature-based malicious code detection difficult. Unknown malicious code has become an important Advanced Persistent Threat to critical infrastructure; one example was the hacking in the Korea Hydro and Nuclear Power incident, which became a great national security threat. Unknown malicious codes act continuously without being detected by security systems. To achieve their malicious goals, these codes use a variety of mechanisms, such as stealing password information or creating traffic to attack other hosts through a connection to a command and control server [1]. However, due to the limited number of professional personnel in the area, it is nearly impossible to analyse every malicious code in a short period of time. Thus, analysing unknown malicious code before it causes severe problems to computer systems has become a priority. In order to solve this problem, an Advanced Unknown Malicious Code Detection Model, which decides the priority of malicious code, was presented in the previous paper [2]. However, after analysing one hundred normal files and one hundred malicious files, cases were found in which normal files were treated as malicious, decreasing the effectiveness of the malicious code detection. Thus, finding a method to decrease the misidentification of normal files as malicious became a critical issue, which led to research in the field.

II. RELATED WORK
In this section, previous work on research methods for the detection of malicious code is introduced and discussed. Recent research is described along with the problems that came with the previous methods.

A. Static Analysis of Executables to Detect Malicious Patterns
Detecting malicious code is an important part of information security [3]. In this paper, the authors present a static analysis of executables to detect malicious code patterns. In malicious code detection, malicious code writers try to obfuscate their code to evade anti-virus software. The authors tested the efficiency of three commercial anti-virus tools against code obfuscation; the anti-virus software could not detect that the codes were obfuscated. They therefore present a new architecture for detecting obfuscated malicious code.
(Ruleset table, continued)
No.  Rule                                        Score  Detection rate
7    Patch/Hook detection                        6.0    8%
8    Stack/Heap test                             6.0    3%
9    File generation/deletion/modification       1.0    92%
10   Registry generation/deletion/modification   4.0    76%
11   Network connection                          10.0   5%

In other words, through this placement of the malicious code detection rules, the expression, as expected, predicted actual results that were higher than the malware detection efficiency. We examined the detection rate over an infection set consisting of 41 samples when the rules were combined sequentially, ordered from the highest score group downwards. The results were as follows.
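The behavioural rules in the table each carry a score (weight) and an observed detection rate. A minimal sketch of the weighted-ruleset idea follows; the scores are taken from the table rows above, but the classification threshold and the example trigger sets are assumptions for illustration, not values from the paper:

```python
# Sketch of weighted-ruleset scoring: each behavioural rule carries a
# score (weights from the table above); a file whose triggered rules
# sum past a threshold is flagged. Threshold and trigger sets invented.

RULE_SCORES = {
    "patch_hook":    6.0,   # rule 7
    "stack_heap":    6.0,   # rule 8
    "file_mod":      1.0,   # rule 9
    "registry_mod":  4.0,   # rule 10
    "network_conn": 10.0,   # rule 11
}

def score(triggered):
    """Sum the scores of the rules a sample triggered."""
    return sum(RULE_SCORES[r] for r in triggered)

def classify(triggered, threshold=10.0):
    return "malicious" if score(triggered) >= threshold else "normal"

print(classify({"file_mod", "registry_mod"}))                # normal (score 5.0)
print(classify({"patch_hook", "network_conn", "file_mod"}))  # malicious (score 17.0)
```

Ordering and combining rules by score group, as the text describes, amounts to applying the highest-weighted rules first so that the most suspicious samples are prioritised for analysis.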
A. Dataset
We planned experiments to improve the efficiency of the malicious code analysis program we developed. The Advanced Unknown Malicious Code Detection Model presented in the previous work was applied to one hundred normal files and one hundred malicious files, in order to examine the effectiveness of the model and to find and fix its limitations. These files, which have a variety of characteristics, were chosen randomly and were provided by AhnLab.

B. Implementation
To improve the program, we analysed each of the 100 unknown malicious files and 100 normal files with the ruleset applied. A total of 67 rules was applied, obtained by adding rules for each group to the 24-rule ruleset mentioned above.

C. Results
We have proposed an algorithm, shown in Figure 2, to reflect the actual malware analysis process, and we have developed a program that detects unknown malicious code on the basis of this algorithm.

Figure 2. Advanced Unknown Malicious Code Detection Model

After experimenting on these sample files, some problems appeared, as expected. In some instances a normal file exhibited characteristics similar to malicious code and was categorised as malicious code. This increased the misdetection rate. The results of the experiment are as follows.
Rank  ID  Rule Description                                            Number of Misdetections
1     2   Compression or encryption (entropy over 7) file detection   30
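Rule ID 2 flags files whose byte entropy exceeds 7 bits per byte, a hallmark of packed or encrypted payloads. The sketch below, with invented sample data, shows why this rule also fires on legitimately compressed normal files, which is exactly the misdetection at issue:

```python
# Sketch of the entropy check behind rule ID 2: packed or encrypted
# payloads have near-uniform byte distributions, so their Shannon
# entropy approaches 8 bits/byte, while ordinary data sits much lower.
# Legitimate compressed files also exceed 7, hence the misdetections.
import math
import os
import random
import zlib

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

rng = random.Random(0)
plain = bytes(rng.choice(b"ACGT") for _ in range(100_000))  # low-entropy "normal" data

print(byte_entropy(plain) < 7)                  # True: ordinary data is low entropy
print(byte_entropy(zlib.compress(plain)) > 7)   # True: compressed data looks "encrypted"
print(byte_entropy(os.urandom(4096)) > 7)       # True: random bytes, near 8 bits/byte
```

Since the rule cannot distinguish compression from encryption by entropy alone, reducing its weight or combining it with other rules is one plausible way to cut the 30 misdetections it produced.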
Ravi Foogooa
rfoogooa@umail.utm.ac.mu
Chandradeo Bokhoree
sbokhoree@umail.utm.ac.mu
Kumar Dookhitram
University of Technology, Mauritius
La Tour Koenig
Pointe-aux-Sables, Mauritius
kdookhitram@umail.utm.ac.mu
Abstract—A lot of effort is still required in greening ICT and in the use of ICT to green organisations. However, Green ICT initiatives are hard to sustain. In this context, Green ICT maturity models help by providing a benchmarking tool and a roadmap. However, several Green ICT maturity models have been proposed by different researchers over the years, with no clear justification. This makes it difficult for companies to choose which Green ICT maturity model to adopt. This research aims at comparing the different Green ICT maturity models. This could help companies manage their Green ICT initiatives in a more sustainable way. It will analyse the evolution of Green ICT maturity models and provide cues for further research in this area.

Keywords—Green ICT, maturity models, sustainable development

I. INTRODUCTION
Much effort has been put into sustainable development so far, and at least the rate of unsustainable growth has slowed down in recent years. Nevertheless, a lot still remains to be done in sustainable development, as the world is getting less and less sustainable. It is the duty of everyone to make a contribution towards making the world a better place for ourselves and our future generations. Research on the link between Information and Communication Technologies (ICT) and sustainable development has shown that ICT accounts for 2% of global carbon emissions [1], with its power consumption and e-waste on an upward trend. However, ICT could also be used to reduce the impact of human activities on the environment by 15% by 2020 [2], especially with applications geared towards dematerialisation such as electronic transactions, video conferencing and e-commerce, amongst others. ICT can thus have a positive or negative impact on the environment depending on how it is used [3]. There has been considerable research on prescriptive directives on how to reduce the impact of ICT on the environment, such as in [4]–[7] amongst others, which are focused on reduced resource use. However, getting reliable data for such assessments is a daunting task [8]. Initial enthusiasm easily dies out in such cases. Thus, there is a need for a carefully devised Green ICT strategy to craft an organisation's way to a harmonious development [4]. In this context, maturity models are very helpful. They can help in assessing the current status of companies with respect to best practices, and they can also support the planning of a roadmap for promoting environmental sustainability initiatives.

However, a number of Green ICT maturity models have been developed over the years, for example in [9]–[13]. They are based on different expectations, ambitions and assessment criteria, and it can therefore be difficult to choose which model to adopt. An inappropriate choice could either not be motivating enough for a company, or further discourage it and kill new Green ICT initiatives in the bud. In order to overcome this difficulty, we set off to compare and contrast several existing Green ICT maturity models, in order to understand the functioning and limitations of these models through a cross-comparative analysis. Subsequently, a generic approach to Green ICT maturity is developed and presented herein.

II. GREEN ICT MATURITY MODELS
Maturity models are not new. Way back in 1993, the Software Engineering Institute of Carnegie Mellon University presented the Capability Maturity Model (CMM) for process maturity [14], which over the years evolved into the Capability Maturity Model Integration (CMMI). It defined maturity in a number of levels, each characterised by a specific behaviour. Organisations were encouraged to assess the level at which they were and then follow the characteristics of the next level to define initiatives for process improvement in their organisation. The advantage of the maturity model was that it allowed benchmarking across different organisations and also
We are aware that the ideal situation would have been to apply the Green ICT maturity model assessment, with the proper consultants, on selected real companies with different Green ICT experiences. However, for various reasons such as lack of resources and time, we had to resort to an alternative

Initial: Questions on Green IT awareness and any implementations. Qualifies for the level if the company has only a little understanding and a few policies.

Basic: Questions on Green IT strategy and its implementations. Qualifies for the level if the company has a Green IT strategy but implementation is immature, with no clear accountability.

Intermediate: Questions on sustainable IT strategy and on the control of its implementation. Qualifies for the level if the company has a full sustainable IT strategy with targets and metrics at the individual project level.

Advanced: Questions on the importance of sustainability in IT and business life cycles. Qualifies for the level if the company puts sustainable ICT at the heart of IT and business planning, and both IT and the business drive the efforts together.

Optimising: Questions on sustainable ICT across the extended enterprise. Qualifies for the level if the company adopts sustainable ICT practices across its full supply chain.

Model 4 [12]
Level 1 – Initial: Questions on Green ICT awareness and practice in the organisation. Qualifies for the level if there is little awareness/practice, for example in terms of e-waste and energy efficiency, and there is no Green ICT strategy for the organisation.

Level 2 – Repeatable: Questions on Green ICT strategy and the scope of its implementation. Qualifies for the level if there is only a basic Green ICT strategy and the scope of implementation is restricted to the ICT department.

Level 3 – Defined: Questions on Green ICT strategy and the scope of its implementation. Qualifies for the level if there is a clear Green ICT strategy geared towards resource consumption and the ICT department is a regular partner in the greening of some processes across the organisation.

Level 4 – Manageable: Questions on the inclusion and review of the Green ICT strategy in the organisation and the role of the ICT department in the greening of the organisation. Qualifies for the level if the Green ICT policy is taken into consideration in all business processes and is itself reviewed regularly.

Level 5 – Optimising: Questions on the role and scope of the Green ICT strategy. Qualifies for the level if the Green ICT strategy is broad, allows participants to moderate their actions to optimise their impact on the environment, and is applied beyond the frontiers of the organisation to reach the whole supply chain.

Model 5 [13]
Level 0a – Not applicable: This is a decision by the organisation that it is not worthwhile or possible to carry out an assessment.

Level 0 – Ad-hoc: Questions on Green ICT strategy. Qualifies for the level if there is no agreed Green ICT plan in place in the organisation.

Level 1 – Foundation: Questions on Green ICT strategy. Qualifies for the level if there is an agreed Green ICT plan in the organisation and environmental issues are considered in business decisions.

Level 2 – Embedded: Questions on Green ICT strategy and committed resources. Qualifies for the level if there are resources committed to the Green ICT strategy and initial actions have started.

Level 3 – Practised: Questions on implementation of the Green ICT strategy across the organisation. Qualifies for the level if the Green ICT strategy is agreed upon and its implementation is growing across the organisation.

Level 4 – Enhanced: Questions on control of the implementation of the Green ICT strategy in the organisation. Qualifies for the level if there is a consistent application of the Green ICT strategy with an emphasis on improving and learning across the organisation.

Level 5 – Leadership: Questions on the inclusion of the Green ICT strategy within the organisation strategy. Qualifies for the level if the effectiveness of the Green ICT strategy is measured and if sustainability concerns form part of the business strategy.

We used the selected Green ICT maturity models to generate the list of fictitious companies as follows:

Model 1 – companies A-E
Model 2 – companies F-J
Model 3 – companies K-O
Model 4 – companies P-T
Model 5 – companies U-Y

This approach gave the following list of fictitious companies with different experiences in Green ICT:

A. Company has no Green ICT awareness.
B. Company has some Green ICT awareness but has not initiated any Green ICT actions.
C. Company has implemented some Green ICT actions but has no formal Green ICT strategy.
D. Company has a formal Green ICT strategy and has only started implementing it.
E. Company has implemented its Green ICT strategy and makes use of metrics to control its implementation.
F. Company has not implemented any Green ICT initiatives.
G. Company has implemented a few Green ICT initiatives, resulting in a reduction of the carbon footprint of its ICT activities by 10%.
H. Company has a dedicated organisation for the governance of Green ICT and achieves a reduction of the carbon footprint of its ICT activities by 50%.
I. Company's ICT department is carbon neutral and reports its carbon footprint.
J. Company uses Green ICT to reduce carbon footprint across the organisation systematically.
K. Company has little awareness and no policies on Green ICT.
L. Company has a Green ICT strategy but implementation has just started.
M. Company has a Green ICT strategy and has proven implementations with metrics to control it - Green ICT is
a concern in ICT projects. TABLE I. COMPARISON OF GREEN ICT MATURITY
ASSESSMENTS
N. Company has implemented its Green ICT strategy and ASSESSMENT
Green ICT concerns are at the heart of decisions across COMPANY Mod el 1 Mod el 2 Mod el 3 Mod el 4 Mod el 5
H 3 2 2 2 2
P. Company has little awareness / practice in Green ICT and I 4 3 2 2 2
J 4 4 3 3 4
does not have a Green ICT strategy K 1 0 0 0 0
L 3 1 1 1 1
Q. Company has a basic Green ICT strategy but the scope of M 4 2 2 2 2
Y 4 4 3 3 4
regularly as well.
T. Company has a practice to include Green ICT policy in all The results show that the different models may agree on
its actions across its entire supply chain low Green ICT maturity companies near the 0 level. However,
they differ on the medium to high Green ICT maturity
U. Company has no green ICT plan companies. The increasing expectations of all stakeholders in
recent times probably explains the difference in the models.
V. Company has an agreed Green ICT plan and Indeed, the oldest model in [9], although encouraging for low
environmental issues are generally considered in the achieving (from the Green ICT maturity perspective), is less
company ambitious and discerning as the other models. There is
W. Company has an agreed Green ICT plan and has however agreement on the need for a Green ICT strategy or
committed resources to it with some initial results plan and for a rigorous control of its implementation across
the different models. There is also increasing use of metrics
X. Company has an agreed Green ICT plan and its and quantitative ways of controlling Green ICT initiatives for
implementation is growing across the company higher maturity levels. Models such as the one in [10] include
Y. Company measures the effectiveness of the carbon footprinting throughout as a means of assessment.
implementation of its Green ICT strategy and Although most of the models are ICT department centric with
sustainability is considered as part of all business strategy many initiatives starting from the ICT department itself, they
also recognize that Green ICT cannot be restricted to the ICT
For the analysis, the results of the assessment of each department and reward those that take it across the whole
Green ICT maturity model was slightly modified for organization. Some like in [11], [12] even expect high
comparison sake. As all the models used a five level scale, all maturity companies to extend Green ICT initiatives to the full
assessment levels were coded in the number range 0-4 to supply chain. All the different models focus mostly on the
facilitate comparison. A colour coding was also used to show environmental aspect of Green ICT with rare exceptions such
the difference in assessment of the same company by the as the one in [15] which was unfortunately excluded from the
different Green ICT maturity models. The colour coding study. It is clear though that Green ICT maturity cannot focus
followed is red, amber, yellow, blue and green for levels 0 to 4 solely on the environmental aspect of sustainability any
respectively. longer.
The table below shows the results of this analysis:
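The normalisation and colour coding used for the comparison can be sketched as follows. This is a minimal illustration, not the study's tooling: the helper names and the example 1-5 scale are hypothetical, and the paper's models all use five levels mapped onto 0-4.

```python
# Colour for each normalised level, as described in the text:
# red, amber, yellow, blue and green for levels 0 to 4.
LEVEL_COLOURS = {0: "red", 1: "amber", 2: "yellow", 3: "blue", 4: "green"}

def normalise(level, scale_min=0, scale_max=4):
    """Map a model-specific maturity level onto the common 0-4 range."""
    span = scale_max - scale_min
    return round(4 * (level - scale_min) / span)

def colour_code(level, scale_min=0, scale_max=4):
    """RAG-style colour for a (possibly rescaled) maturity level."""
    return LEVEL_COLOURS[normalise(level, scale_min, scale_max)]

# Example: a hypothetical model scored 1-5 maps its middle level to "yellow".
print(colour_code(3, scale_min=1, scale_max=5))
```

A full comparison would simply apply `colour_code` to every cell of Table I, one call per company and model.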
V. GENERAL GREEN ICT MATURITY MODEL

In light of the above, we propose a general Green ICT model encompassing the different models reviewed. The model should be discerning at the lower levels to differentiate between those who are not doing anything and those trying to devise and implement initial Green ICT initiatives. We feel that this is important to encourage those starting on the journey of Green ICT. However, it should also set more ambitious goals, such as the inclusion of metrics at higher maturity levels. Finally, Green ICT mature companies are expected to promote Green ICT values not only within their companies but across their full supply chain. Our proposed general Green ICT model thus includes the following maturity levels:

0. Reckless: At this level, companies have little or no awareness of Green ICT and especially believe that nothing can be done at the company level.

1. Ad-hoc: At this level, there is recognition of Green ICT but initiatives are ad-hoc and not sustainable.

2. Conscious: At this level, the company has a Green ICT plan and is committed to its implementation across the ICT department.

3. Responsible: At this level, the company has a Green ICT plan and is committed to its implementation across the whole organization. It is important for metrics to be defined and monitored at this level.

4. Role model: At this level, the company includes Green ICT issues at all levels of business decision making and strives to implement them across its full supply chain. Companies here are in a virtuous improvement mode for sustainable development of the business.

This model regroups the main directions and levels from several models and will help to provide standard guidance for companies wishing to properly manage their Green ICT initiatives.

VI. CONCLUSION AND FUTURE WORK

The comparison attempted was not a statistical one and, as such, it is difficult to make far-reaching generalising claims. However, a qualitative assessment shows that there is some agreement as well as clear differences among the different Green ICT models.

Future work would involve confirming our results through more quantitative approaches. It would also be interesting to further develop and test the generic Green ICT maturity model. There is also a need for a taxonomy of Green ICT maturity models, which would give a direction for the future development of Green ICT maturity models.

References

[1] Gartner, "Gartner Estimates ICT Industry Accounts for 2 Percent of Global CO2 Emissions," 2007. [Online]. Available: http://www.gartner.com/newsroom/id/503867.
[2] M. Webb, "SMART 2020: Enabling the low carbon economy in the information age," 2008.
[3] L. Erdmann, L. Hilty, J. Goodman, and P. Arnfalk, "The Future Impact of ICTs on Environmental Sustainability," Seville / Brussels, 2004.
[4] S. Murugesan, "Harnessing Green IT: Principles and Practices," IEEE IT Professional, pp. 24-33, Feb. 2008.
[5] S. Ruth, "Reducing ICT-related Carbon Emissions: An Exemplar for Global Energy Policy?," IETE Technical Review, vol. 28, no. 3, pp. 207-211, 2011.
[6] K. Raza, V. K. Patle, and S. Arya, "A Review on Green Computing for Eco-Friendly and Sustainable IT," Journal of Computational Intelligence and Electronic Systems, vol. 1, no. 1, pp. 3-16, Jun. 2012.
[7] G. Bekaroo, C. Bokhoree, and C. Pattinson, "Towards Green IT Organisations: A Framework for Energy Consumption and Reduction," The International Journal of Technology, Knowledge, and Society, vol. 8, no. 3, 2013.
[8] A. Shah, T. Christian, C. Patel, C. Bash, and R. Sharma, "Assessing ICT's Environmental Impact," IEEE Computer, pp. 91-93, Jul. 2009.
[9] G. Philipson, "A Green ICT Framework," St Leonards, 2010.
[10] B. M. Desai and V. Bhatia, "Green IT Maturity Model: How does your Organization Stack up?," SETLabs Briefings, vol. 9, no. 1, pp. 49-57, 2011.
[11] B. Donnellan, C. Sheridan, and E. Curry, "A capability maturity framework for sustainable information and communication technology," IT Professional, vol. 13, no. 1, pp. 33-40, 2011.
[12] A. Hankel, L. Oud, M. Saan, and P. Lago, "A Maturity Model for Green ICT: The case of the SURF Green ICT Maturity Model," in 28th EnviroInfo Conference, 2014.
[13] "UK HM Greening Government ICT Strategy Combined Assessment Model 2015," 2015. [Online]. Available: https://www.gov.uk/government/publications/green-ict-maturity-model.
[14] M. C. Paulk, B. Curtis, M. B. Chrissis, and C. V. Weber, "Capability Maturity Model for Software, Version 1.1," Pittsburgh, Pennsylvania, 1993.
[15] S. DeMonsabert, K. Odeh, and J. Meszaros, "SustainaBits: A framework and rating system for sustainable IT," in International Green Computing Conference (IGCC), 2012.
[16] R. Foogooa and K. Dookhitram, "A Self Green Maturity Assessment Tool for SMEs," in IST Africa, 2014.
QoS-Aware Single Service Selection Mechanism for Ad-Hoc
Mobile Cloud Computing
Abstract—Mobile technology has made notable progress over the past few years, owing to the development of new handheld devices, improved wide-area cellular coverage and the seamless integration of wireless data access into mobile devices. These recent advances in wireless internet technologies have given birth to the Mobile Cloud Computing paradigm. However, intermittent internet disconnection, among other factors, has led to the evolution of the ad-hoc mobile cloud, where mobile devices expose their computing resources and resident services to other devices. One challenge in the ad-hoc mobile cloud is that of service selection, especially in a virtualized ad-hoc mobile environment. This is because the best service to be selected might have left after discovery, as a result of the dynamic nature of the ad-hoc mobile system. To resolve this challenge, we propose a single service selection mechanism where the instances or images pertaining to the information of community members are stored temporarily on a centralized virtual node. When a selection is to be done, these images are searched instead of searching each individual node. A multi-criteria decision is proposed as our solution approach. An experiment is conducted using execution time as the QoS parameter, and evaluation and analysis are carried out. We first recorded an increase in execution time as the number of services increases. Beyond a certain point, however, the execution time increment is no longer noticeable or proportional to the service increment. This is attributed to the service images that are already on the virtual leading server, thereby reducing the execution time.

Keywords—Ad hoc mobile cloud; GUISET; service selection; server mobile machine; SMMEs

I. INTRODUCTION

Cloud computing is an infrastructure service provisioning paradigm based on a pay-per-use mechanism. The basic goals of this computing paradigm are to remove the burden of providing infrastructure, platform and software services from service consumers and also to reduce costs. This allows service consumers to concentrate on their core business.

One major improvement and advantage of cloud computing for subscribers is mobile services. This paradigm has allowed mobile information to be moved to the cloud, enhancing data and information shareability. This information or these services can be obtained through web browser interaction over a wireless connection [1].

Mobile subscribers accumulate a wealth of experience from various uses of mobile services, such as currency conversion and weather forecast services. These services operate on remote servers and/or on devices through wireless networks. With the rapid growth in the new compelling era of mobile computing, technological innovations are observed to be occurring at an accelerated rate in both low- and middle-income nations [2]. Undoubtedly, ubiquitous demand for mobile services will keep increasing, to the point where 60 percent of IP traffic is expected to originate from mobile devices by 2016 [3].

The rich experience evolving from the integration of these mobile devices with the cloud computing architecture, through the aid of wireless internet connections, results in the recent paradigm called Mobile Cloud Computing (MCC) [3]. MCC simply refers to an infrastructure system that allows mobile information or services to be moved to the centralized cloud. These can then be exposed to any devices registered with the central cloud. The benefits of cloud computing, such as reliability, scalability, device independence and on-demand services, bring a limelight to the functionality of resource-constrained mobile devices through this development.

This improvement motivated the idea of a use case scenario in our project called Grid-based utility Infrastructure for SMME Enabling Technology (GUISET) [5]. GUISET is a proposed middleware platform whose basic goal is to provide utility infrastructure services to small, medium and macro enterprises (SMMEs), most especially in the context of mobile services like m-Learning, m-Commerce and m-Health [5]. Because of the richness of MCC, mobile devices have become part and parcel of everyday life, and people can hardly go anywhere without them.

Many studies are currently ongoing in different phases of our GUISET project along the mobile cloud computing paradigm for proper efficiency and effectiveness. Some of the research works include its implementation [6]-[8], performance [5] and pricing strategy [9]. Despite the numerous advantages brought by MCC, which include multi-tenancy, ease of integration, scalability and dynamic provisioning of services in an on-demand manner, MCC also incurs some challenges. These include intermittent dis-connectivity and long WAN latency [3]. Other challenges include low processing capabilities, low memory, small displays, limited battery capacity, etc. [10]. It is with a view to mitigating these challenges that an infrastructure-less mobile platform named the ad hoc mobile cloud was born [2].
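The selection mechanism summarised above, caching service images on a central virtual node and scoring them with a multi-criteria function, can be sketched as follows. All field names, weights and values are hypothetical illustrations under the paper's general description, not the actual GUISET implementation.

```python
from dataclasses import dataclass

@dataclass
class ServiceImage:
    """Metadata snapshot of a community member's service, held on the
    central virtual node so selection need not query each mobile node."""
    node_id: str
    exec_time_ms: float   # QoS parameter used in the paper's experiment
    battery_level: float  # extra criterion for the multi-criteria score

def score(img: ServiceImage, w_time: float = 0.7, w_batt: float = 0.3) -> float:
    """Lower is better: weighted execution time, penalised for low battery."""
    return w_time * img.exec_time_ms + w_batt * (100.0 - img.battery_level)

def select_service(cache: list[ServiceImage]) -> ServiceImage:
    """Pick the best-scoring image from the virtual node's cache."""
    return min(cache, key=score)

cache = [
    ServiceImage("node-a", exec_time_ms=120.0, battery_level=80.0),
    ServiceImage("node-b", exec_time_ms=90.0, battery_level=20.0),
]
print(select_service(cache).node_id)
```

Because only the cached images are searched, a provider that has already left the ad-hoc community does not stall the selection step; staleness is handled by refreshing the cache.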
Abstract—This paper presents a comprehensive study of leakage reduction techniques applicable to CMOS based devices. In the process, mathematical equations that model the power-performance trade-offs in CMOS logic circuits are presented. From those equations, suitable techniques for leakage reduction pertaining to CMOS devices are deduced. Throughout this research it became evident that designing CMOS devices with high-κ dielectrics is a viable method for reducing leakages in cryptographic devices. To support our claim, a 22nm NMOS device was built and simulated in the Athena software from Silvaco. The electrical characteristics of the fabricated device were extracted using the Atlas component of the simulator. From this research, it became evident that high-κ dielectric metal gates are capable of providing reliable resistance to DPA and other forms of attack on cryptographic platforms such as smart cards. The fabricated device showed a marked improvement in the I_on/I_off ratio, where a higher ratio means that the device is suitable for low-power applications. Physical models used for simulation included Si3N4 and HfO2 as gate dielectrics with TiSix as the metal gate. From the simulation results, it was shown that HfO2 was the best dielectric material when TiSix is used as the metal gate.

Keywords—Differential power analysis; High-K dielectric gate; Smart card

I. INTRODUCTION

CMOS technology was invented at Fairchild Semiconductor in 1963 by Frank Wanlass. The original idea …

… level, countless alternatives to first-order DPA attacks have been developed and demonstrated. Researchers in [2] have successfully demonstrated that in sub-100 nm and related technologies, power leakages are as high as the dynamic power of the device and hence the leakage (static) supply current can be used as a new side-channel.

Off-state leakages have been recognized by semiconductor manufacturers as a major bottleneck for future microcontroller integration. Off-state leakage is a static current that leaks through transistors even when such devices are turned off. It is one of two principal sources of power dissipation in today's microcontrollers. The other source is of course dynamic power, which is caused by the repeated capacitance charge and discharge on the outputs of the hundreds of millions of gates in today's microcontrollers. Until recently, only dynamic power was identified as a significant source of power consumption, and Moore's law has helped to control it through shrinking process technology. Dynamic power is proportional to the square of the supply voltage; therefore reducing the voltage significantly reduces the device's power consumption. "Unfortunately smaller geometries aggravate current leakage problems; static power begins to dominate the power consumption equation in microcontroller design" [12].
Five mathematical equations govern the power performance in CMOS logic circuits according to [12]. In this paper, we present them in a way that addresses the basics of physics and logic circuitry design. The first mathematical equations related to CMOS power fundamentals cover the basics of low power consumption [3], while the last two equations are more concerned with sub-threshold and gate-oxide leakage modeling in CMOS technologies.

A. Investigation of Frequency and Voltage Relationships

Equation (1) below depicts the supply voltage dependency of the operating frequency of the device as computed in [12]:

    f ∝ (V − V_th)^α / V                                                (1)

In this equation, V represents the transistor's supply voltage while V_th is the device's threshold voltage. The exponent α is an experimentally derived constant with a value of approximately 1.3 [12]. Dynamic voltage scaling in CMOS devices is used to control switching power dissipation in battery-operated systems. Also, power consumption minimization techniques rely on low-voltage modes and lowered clock frequencies. In [12] the authors used the relation derived in (1) to compute an equation that depicts the relationship between frequency and supply voltage. The derivation begins with the selection of the device's working voltage and frequency, defined as V_norm and f_norm respectively. The quantities selected are normalized entities depicting the relationship to the largest possible operating voltage V_max and frequency f_max. This relationship is shown in (2) below:

    V_norm = β1 + β2·f_norm = V_th/V_max + (1 − V_th/V_max)·f_norm      (2)

From (1) it is evident that if f = 0, then (2) becomes:

    V_norm = β1 = V_th/V_max

The value of V_norm can safely be approximated as 0.37. That approximation closely matches present-day industrial data [12]. It is also worth mentioning at this stage that f_max is proportional to V_max and that the frequency will drop to zero if V is equal to V_th, as clearly shown in (1).

The total power consumption is the sum of its dynamic and static components:

    P = P_dynamic + P_static                                            (3)

The first term of (3) can be broken into two distinct entities, namely P_short and P_switch. The first component, P_short, is the power dissipated during the gate voltage transient time, while the second component, P_switch, comes as a result of the many charging and discharging cycles of capacitances in the device. The last term, P_static, represents the power dissipated when the transistor is not in the process of switching. Equation (3) can be rewritten as:

    P = (P_short + P_switch) + P_static = A·C·V²·f + V·I_leak           (4)

In (4), A denotes the number of bits that are actively switching and C is the combination of the device's load and internal capacitance. It is worth mentioning at this stage that, for simplicity, the power lost to spasmodic short circuits at the gate's output has been neglected.

From (4) it is evident that dropping the supply voltage leads to an important decrease in the device's power consumption. Mathematically speaking, halving the supply voltage will reduce the dynamic power consumption by a factor of four. The main drawback of that proposition is that it will also reduce the processor's top operating frequency by more than half. A better approach suggested in [12] relies on the use of parallel or pipelined techniques to compensate for the performance losses due to supply voltage reduction.

C. Computing Leakage Current

Parallelism and pipelining techniques for power reduction were first proposed by [19]. Since then, researchers have conducted studies aimed at optimizing the pipelining depth for dissipated power reduction in CMOS devices. Furthermore, research has been conducted at the functional block level to compare the performance of pipelining and parallelism to find out which technique performs best when it comes to minimizing total switching power. In (3) it was shown that leakage current (the source of static power consumption) is a combination of subthreshold and gate-oxide leakage, i.e.:

    I_leak = I_sub + I_ox                                               (5)

Deriving Subthreshold Power Leakage: The authors of [19] present an equation representing the direct relationship between a CMOS device's threshold voltage, its subthreshold leakage current and the device's supply voltage as follows:
    I_sub = K1·W·e^(−V_th/(n·V_θ))·(1 − e^(−V/V_θ))                     (6)

In (6), K1 and n are normally derived experimentally, W represents the device's gate width, and V_θ is its thermal voltage. The quantity V_θ can safely be approximated as 25 mV at room temperature (20 degrees Celsius). If I_sub rises enough to generate heat, V_θ will rise as well and in that process cause a further increase in I_sub, which may result in thermal runaway. From (6) it becomes clear that two ways exist for reducing I_sub, which are (1) turning off the supply voltage and (2) stepping up the threshold voltage. In [12] it is argued that "since this quantity shows up as a negative exponent, increasing that value could have a dramatic effect in even small increments. On the other hand, it is evident from (1) that increasing V_th automatically creates a reduction in speed. The obvious problem with the first approach is loss of state; as for the second option, its major inconvenience relates to the loss of performance". The device's gate width W, its gate length Lg, the device's oxide thickness T_ox, and the doping concentration N_pocket are other major contributors to subthreshold leakage in CMOS based technologies. Processor designers often optimize one or a few of those leakage components as a convenient technique to reduce subthreshold leakage, as will be seen in subsequent paragraphs.

Deriving Gate-Oxide Power Leakage: Gate leakage mechanisms, such as tunneling across the thin gate oxide leading to gate-oxide leakage current, become significant at the 90nm node and smaller. Gate-oxide leakage is not as well understood as subthreshold leakage. For the purpose of this research, a simplification of the equations from the authors of [5] is sufficient to illustrate the point:

    I_ox = K2·W·(V/T_ox)²·e^(−α·T_ox/V)                                 (7)

where K2 and α are derived experimentally. In (7) we draw our attention to the oxide thickness component, T_ox, of the equation. "Increasing T_ox will reduce the gate leakage. However, it also negatively affects the transistor's efficiency, since T_ox must decrease proportionately with process scaling to avoid short-channel effects. Therefore, increasing T_ox is not a viable option" [12]. A better approach to this problem may lie in the development of high-κ dielectric gate insulators. This approach is currently under heavy investigation by the research community.

III. REDUCING STATIC POWER CONSUMPTION

Many researchers as well as research groups have developed power models for reducing static power consumption in embedded devices. Power gating [4, 16] is slowly becoming a very popular design technique for decreasing leakage currents. Although effective in reducing static power consumption in many instances, its major drawback lies in its tendency to introduce delays by adding extra circuitry and wires, and it also uses extra area and power.

Another approach to static power reduction is based on the utilization of multiple threshold voltage techniques. "Present day's processes typically offer two threshold voltages. Microprocessor designers assign a low threshold voltage to some of the few identified performance-critical transistors and a high threshold voltage to the majority of less time-critical transistors. This approach has the tendency to incur a high subthreshold leakage current for the performance-critical transistors, but can significantly reduce the overall leakage" [12].

Other techniques for reducing subthreshold leakage are closely related to the gate tunneling current; however, their effects are still under investigation. Gate-oxide leakage has a negligible dependence on temperature [12]. Therefore, as subthreshold leakage subsides with drops in temperature, gate-oxide related current leakage becomes important.

IV. FABRICATION METHOD

For this paper, a 20nm NMOS MOSFET was fabricated using the SILVACO Athena module, and the device's electrical characteristics and performance were simulated and evaluated using the Atlas module from SILVACO. The sample used in this experiment was a p-type (boron-doped) silicon substrate with a doping concentration of 1.5e15 atoms·cm⁻³ and <100> orientation. The next step consisted in developing the P-well by growing an 800 Å oxide screen on top of the bulk silicon. This technique makes use of dry oxygen at a very high temperature (approximately 800°C), followed by boron as the dopant with a concentration of 3.75e13 atoms·cm⁻³. In the third step, the deposited oxide layer is etched and thereafter annealed to ensure that all boron atoms are spread uniformly. This is done at a temperature of 900°C using nitrogen, followed by a further rise in temperature to 950°C using dry oxygen. The next step was to isolate the neighbouring transistors by creating a shallow trench isolation (STI) with a thickness of 130 Å. After that step, the wafer was oxidized with dry oxygen for approximately 25 minutes at a temperature of 1000°C. Two important processes were involved in the development of the STI, namely Low Pressure Chemical Vapour Deposition (LPCVD) and reactive ion etching (RIE). The LPCVD process starts with the deposition of a 1000 Å nitride layer on top of the STI oxide layer, followed by a photoresist deposition on the wafer. The RIE process consisted in etching away the unnecessary material on top of the STI area. Both chemical and mechanical polishing were implemented to strip away any extra oxide on the wafer. The STI was further annealed for approximately 15 minutes at a temperature of 850°C. As a final STI step in the process, an oxide layer was carefully deposited and etched to eliminate possible defects that may have occurred on the surface.

It is important to mention at this stage that the thickness of the deposited high-κ dielectric is selected so that it has the same equivalent oxide thickness as SiO2. Furthermore, the length of the high-κ material was scaled so as to obtain the equivalent 22nm gate length of the transistor. The next step consisted of the deposition of Titanium Silicide (TiSix) on top of the high-κ dielectrics (Si3N4, HfO2), followed by halo implantation of an indium dose to obtain the optimum value of the NMOS device [6, 7]. The next step involved the formation of the sidewall spacers that would serve as source and drain electrodes for the device. In this case, implantation with arsenic is followed by a dose of phosphorus to ensure an uninterrupted flow of current in the fabricated NMOS device [8].

The next step consisted in the deposition of a 0.5 μm layer of Boron Phosphor Silicate Glass (BPSG) to act as the pre-metal dielectric [9]. Once more, annealing was done at 950°C on the wafer to strengthen the structure. The next step consisted of a compensation implantation with a layer of phosphorus. The last step involved the deposition of an aluminium layer which served as the metal electrodes (source and drain). Once the model structure design was completed, we then proceeded with device simulation using Atlas.

V. RESULTS AND DISCUSSIONS

The complete NMOS structure is shown in Fig. 1 below. The fabrication process is the same for all high-κ devices fabricated for this research, except that the dielectric materials were varied.

Fig. 2. Doping profile of the fabricated device

Results of the electrical characteristic simulation are shown in Fig. 3 below. The plots are also known as "V_t curves", because device designers use them extensively to extract the threshold voltage (V_t), which defines an approximation of when a transistor is "on" and allows current to flow across the channel. For this research, the figure presents the drain current (I_d) vs gate voltage (V_GS) curves for Si3N4 (k~29), HfO2 (k~21) and a conventional device made of SiO2. The fabricated device's drain voltage V_DS was fixed when I_d vs V_GS was plotted. The threshold voltage (V_th), on-state current (I_on) and off-state current (I_off) can be extracted from the I_d vs V_GS curve.
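The extraction of I_off, I_on and an approximate V_th from an I_d vs V_GS sweep can be sketched as follows. This is a minimal illustration only: the sweep values are hypothetical, not the simulated device's data, and peak transconductance is used here as a rough V_th proxy rather than the full extrapolation method.

```python
def extract_metrics(vgs, ids, vdd):
    """Return (I_off, I_on, approx V_th) from a swept Id-Vgs curve."""
    i_off = ids[vgs.index(0.0)]          # drain current at Vgs = 0
    i_on = ids[vgs.index(vdd)]           # drain current at Vgs = Vdd
    # Finite-difference transconductance gm = dId/dVgs.
    gm = [(ids[k + 1] - ids[k]) / (vgs[k + 1] - vgs[k])
          for k in range(len(ids) - 1)]
    v_th = vgs[gm.index(max(gm))]        # Vgs at peak gm: rough V_th proxy
    return i_off, i_on, v_th

vgs = [0.0, 0.5, 1.0, 1.5, 2.0]          # gate voltage sweep (V), hypothetical
ids = [1e-9, 5e-8, 1e-5, 3e-5, 4e-5]     # drain current (A/um), hypothetical
i_off, i_on, v_th = extract_metrics(vgs, ids, vdd=2.0)
print(f"Ion/Ioff = {i_on / i_off:.1e}, Vth ~ {v_th} V")
```

The same computation applied to each dielectric's curve yields the I_on/I_off comparison discussed in the abstract.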
Fig. 3. Drain current (A/μm) vs gate voltage (V) for gate dielectrics with ε = 3.9, 21 and 29
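As a worked illustration of the power model in equations (3) to (7), the following sketch evaluates the dynamic and leakage terms numerically. Every constant here is a hypothetical placeholder, not a fitted device parameter.

```python
import math

def p_dynamic(a_frac, c_farads, v, f_hz):
    """Switching term of eq. (4): A*C*V^2*f."""
    return a_frac * c_farads * v ** 2 * f_hz

def i_sub(k1, w, v, v_th, n, v_theta=0.025):
    """Subthreshold leakage, eq. (6); V_theta ~ 25 mV at room temperature."""
    return k1 * w * math.exp(-v_th / (n * v_theta)) * (1 - math.exp(-v / v_theta))

def i_ox(k2, w, v, t_ox, alpha):
    """Gate-oxide leakage, eq. (7)."""
    return k2 * w * (v / t_ox) ** 2 * math.exp(-alpha * t_ox / v)

def p_total(v, f_hz):
    """Eq. (3)/(4): dynamic power plus V * I_leak, hypothetical constants."""
    leak = i_sub(1e-6, 1.0, v, v_th=0.3, n=1.5) + i_ox(1e-7, 1.0, v, t_ox=1.2, alpha=2.0)
    return p_dynamic(0.1, 1e-9, v, f_hz) + v * leak

# Halving the supply voltage cuts the dynamic term by a factor of four.
print(p_dynamic(0.1, 1e-9, 1.0, 1e9) / p_dynamic(0.1, 1e-9, 0.5, 1e9))  # -> 4.0
```

This mirrors the trade-off discussed in Section II: the quadratic voltage saving on the dynamic term comes at the cost of a lower top operating frequency per eq. (1).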
Brian Jones
Leeds Business School
Leeds Beckett University
United Kingdom
b.t.jones@leedsbeckett.ac.uk
Abstract—Social media have transformed the world in which we live. Although several studies have uncovered forms of customer engagement on social media, there is a scarcity of academic research on customer engagement within the grocery sector. This study therefore aims to address this gap in the literature and shed light on the various ways customers engage with grocery stores on Facebook. Netnography is used to gain an understanding of the behaviour of customers on the Facebook pages of Tesco and Walmart. The findings of this study reveal that cognitive, emotional and behavioural customer engagement are manifested and that customers can both create and destroy value for the firm. This study contributes to knowledge by uncovering the various forms of customer engagement on the Facebook pages of Tesco and Walmart.

Keywords—Social Media; Customer Engagement; Relationship Marketing; Grocery Stores; Facebook

I. INTRODUCTION

Technological advances are dramatically and substantially transforming the retail industry [1] [2]. The Internet enables the transfer of information and knowledge worldwide in real time to customers [3], who use these technologies to enhance their shopping experience [4]. To improve the satisfaction levels of customers, retailers are increasingly using social media, mobile and Internet technologies to enrich their shopping experience [4]. The advent of social media is revolutionising marketing practices [5] [6].

With social media, it is now feasible for businesses to have ongoing dialogues and exchanges of experiences by actively listening and responding to queries of customers. However, in practice businesses seem to be using social media just as any other communication medium; they are not directly interacting and are not seeking to obtain highly engaged customers through social media. Consequently, it is critical that businesses understand the expectations of customers who are interacting with them through social media.

Following a call for research on understanding what customers seek when interacting with businesses on social media [7], this paper aims to provide knowledge as to why and how customers engage with companies on social media, more specifically on the Facebook pages of grocery stores. Furthermore, some researchers have examined the motivations of customers for interacting with apparel retailers on social media and suggested that further research be carried out to understand the motivations of customers who connect with retailers on social media in different retail sectors [8]. This paper will therefore focus on the grocery sector to understand the motivations of customers for interacting with grocery stores on Facebook by analysing responses of customers to customer- and company-initiated messages.

The purpose of this study is to examine the various ways customers engage with grocery stores. The paper is organised as follows. First, the main concepts of social media, customer engagement and relationship marketing are presented. Second, we outline the netnography method, followed by an analysis and discussion of the findings. We then present the conclusion, limitations, and suggestions for future research directions.

II. LITERATURE REVIEW

A. Social Media and Facebook

Social media usage is exploding and online platforms have become vital tools for marketing [9]. Facebook is the social media platform most widely adopted by brands and companies [10]. These social media platforms have converted the Internet from a platform for information to a platform for influence [6]. Individuals leverage social networks and blogs to create, recommend and share …
Social media marketing has been defined as "a social and managerial process by which individuals and groups obtain what they need and want through a set of Internet-based …"

Unlike authors who have used multidimensional perspectives, van Doorn et al. [36] focus only on the behavioural dimension of customer engagement. According to these scholars, customer engagement consists of behaviours which go beyond transactions. This definition highlights that behavioural manifestations do not only mean purchases, but also include other activities of the customer such as word of mouth, customer co-creation and complaining behaviour [37], recommendations, helping other customers, blogging, writing reviews and even engaging in legal actions [36].

Customer engagement also incorporates customer co-creation [36]. Value co-creation is defined as the mutual collaborative activities by stakeholders participating in direct interactions aspiring to add to the value that materializes for either one or both parties [38]. Customers can participate in the invention of the offering, or the design and production of related products. Hence, co-creation happens when the customer contributes through spontaneous, discretionary behaviours that personalise the customer-to-brand experience [36].

In virtual communities, conversations occur on prices, performance, quality and personal experiences with specific brands [39]. These conversations in virtual communities illustrate customer empowerment and value co-creation [39]. Interventions of marketers are accepted in online communities only if they contribute to the community [39]. Customers tend to react negatively when marketers send commercially driven communications. This research reveals that customer engagement increases satisfaction, loyalty, empowerment, emotional bonding, connection, trust and commitment.

Social media enable customers to connect and interact with other customers and non-customers in their social networks and influence them [40]. Customers with strong emotional bonds can become advocates for businesses in peer-to-peer interactions with other customers and non-customers and play a crucial role as co-creators of value in …

… person, company, or group in social media networks" [46]. Similarly, these repetitive and systematic customer attacks are referred to as a shitstorm [45]. "A shitstorm denotes emotional and often irrational criticisms carried out by many consumers. Rational negative opinions usually form the basis for shitstorms, which eventually grow through irrational and assertive content added by other dissatisfied users." [45]. Likewise, customers tend to express strong emotions on the Internet more readily and easily owing to the anonymity offered via this medium of communication [44].

III. METHODOLOGY

To gain an understanding of the types of customer engagement occurring on the Facebook pages of grocery stores, a netnographic study was undertaken. Netnography, developed by Robert V. Kozinets in the late nineties, is a participant-observation approach used to collect data for research on online communities. It is an interpretive method formulated specifically to investigate the consumer behaviour of communities and cultures present on the Internet [47]. Netnography provides insights into virtual communities similar to the ways that anthropologists try to understand the norms, cultures and practices of traditional offline communities. Virtual communities consist of online gatherings of customers expressing interest in similar lifestyles, brands, products and services [47]. Similar to ethnographic research, netnography tries to provide an understanding of a community and of the interactions and communications within the community [48].

The purpose of this netnographic research is to observe the interactions of hypermarkets and supermarkets with their customers within a Web 2.0 platform. To undertake the netnographic research, the researchers observed interactions on the official Facebook pages of Tesco and Walmart. Facebook has been selected as it is the
the value adding process [40]. social platform, which is the most widely used by
companies to interact with their customers. The researcher
However, when organisations fail to engage customers has opted for a non-participation observation during the
they have to face the potential threat of customer netnographic research similar to studies carried out by other
enragement [41], a situation where customers can easily researchers [48], [49].
become value destroyers instead of value creators for
companies [42] [43]. The open-comment platform of To ensure a rigorous and reliable research approach, the
Facebook and the anonymity offered by the Internet produce researchers have followed the five stages and procedures
the ideal conditions for public outrage to be vented on recommended by Kozinets [47]. The five sequential steps
corporate walls [44]. Furthermore, social media have are (1) making entrée, (2) data collection and analysis, (3)
empowered customers and the public by giving them a voice providing trustworthy interpretation, (4) research ethics, and
and weakened the position of companies by rendering them finally (5) member checks.
vulnerable to customer attacks, negative publicity and
corporate reputation damage [45]. Additionally, social For the entrée, the researchers have selected Tesco and
media users can generate huge waves of outrage within a Walmart, the leading supermarkets and hypermarkets in the
short period of time when reacting to questionable activity world based on the March 2013 Global Food Retail report.
or statement of an organisation [46]. They qualify such a Moreover, both Tesco and Walmart have set up their official
phenomenon as an online firestorm, which they define ―as Facebook pages where the language used is mainly English.
the sudden discharge of large quantities of messages Another reason why these two grocery stores have been
containing negative WOM and complaint behavior against a selected is that they both have adopted an international
strategy and are operating in several countries. The official Facebook pages of Tesco and Walmart meet the criteria set by Kozinets [47] in that they are relevant to the topic of the research. On both Facebook pages there is high traffic of postings and a large number of discrete posters. More detailed and rich data are available on both Facebook pages. Finally, the two Facebook pages meet the last criterion by enabling companies to communicate to customers, customers to communicate back to the company, and customers to communicate with other customers.

Data were collected over a one-month period, during which saturation of data was reached as recommended by Kozinets [47]. During this one-month data collection period, the researchers downloaded conversations occurring on the official Facebook pages of Tesco and Walmart. The data were analysed using the qualitative software NVivo 7.0. The researchers used qualitative content analysis to elicit themes from the datasets. Similar to previous research on social media in other sectors [50], [51], the unit of analysis consisted of the content of the Facebook pages of Tesco and Walmart, and the coding units were the individual posts and comments by customers of these grocery stores.

The third step of netnography is to provide trustworthy interpretation [47]. Research is deemed to be reasonable and trustworthy when conventional procedures of netnography are followed while collecting and analysing data [47]. Data were triangulated to enhance the credibility of the study. Triangulation involved the use of a wide range of customers who have posted comments [52]. Viewpoints and experiences of customers could be verified against the opinions and beliefs of other customers, thus building a rich picture of the needs, attitudes and behaviours of the users under scrutiny [53]. Additionally, the researchers achieved site triangulation through the participation of customers from two distinct organisations (Tesco and Walmart), in order to reduce the effect on the research of particular local factors peculiar to one grocery store. Findings may be perceived as more credible when similar findings emerge from two or more different sites [53].

In this study, the ethical procedure recommended by Langer and Beckman [54] has been adopted because the comments posted by customers of Tesco and Walmart on the official Facebook pages are not password restricted and are available to the public.

The researchers did not carry out member checks in their study, as they argued that it was unnecessary to present the findings back to members of the community who participated in the research when it was conducted entirely unobtrusively.

IV. FINDINGS AND DISCUSSION

Customers react in various ways to messages posted by companies on their Facebook pages. Customer behaviours such as liking, sharing and commenting on social media pages are now used as measurements of consumer engagement in social media [36], [55]. All three dimensions of customer engagement have been observed on the Facebook pages of Tesco and Walmart.

A. Cognitive Engagement

From a cognitive standpoint, engagement is a positive state of mind that is represented by high commitment, energy, and loyalty towards a firm [34], e.g. the person's level of concentration or engrossment in the brand [32]. Cognitive engagement occurs at Tesco and Walmart whenever customers post comments in which they provide information and help to other customers, or when they give advice to other customers. Customers post comments on the Facebook pages of Tesco and Walmart when they want to share information with other members of the brand community. For example, a happy customer posted a comment to inform others about the gift he got:

“Found a plane in my sons [sic] kinder surprise”.

This is consistent with the findings of a previous study, which observed that customers gain social benefits by sharing their experiences with other customers on the social network, which is essentially a social venue [56].

Another form of cognitive engagement is when customers post comments about their loyalty towards the stores. A customer of Walmart posted a comment in which she expresses her loyalty to the store by mentioning that Walmart is her favourite store:

“Walmart the best store in the whole world, everything that I need is there, yeah [sic] because after looking in the other stores website comparing prices at end Walmart is the my favourite and I finished my day in Walmart.”

Prolonged customer engagement with a brand can result in customer loyalty [57]. Additionally, customer loyalty is triggered in several ways, e.g. through positive online interactions with the brand and the community members, by actively defending the company when faced with negative user-generated content, or by getting good customer care after having experienced a bad service [28].

B. Emotional Engagement

From an emotional perspective, customer engagement may be characterised by the feelings of an individual towards a brand [35]. In this study, four main emotions have been identified: enthusiasm, humour, sarcasm and scepticism.

Enthusiasm mirrors an individual's intrinsic level of interest and excitement about the online brand community, while enjoyment indicates the level of pleasure and happiness derived by the customer when interacting with the virtual community and its members [57]. Enthusiasm is linked to positive emotions felt by customers who post
enthusiastic comments when they are happy or excited. They thus convey in words their hedonic shopping value [33]. For instance, customers of Tesco expressed happiness and excitement about the game it launched for Easter. The following post reveals the positive emotions felt by customers who were very happy and excited to have won after participating in the egg hunt organised by Tesco:

“Thank you tesco [sic] for running the competition! I won a hudl from egg #17! Very excited for it to arrive”

Customers use humour in their comments when they find an event or a situation amusing or funny. Walmart posted an advertisement for yoghurt with a picture showing a lady putting the yoghurt in her bag, with copy that reads:

“Meet your new office buddy - delicious Chobani Greek Yogurt (5.3 oz.). Now only $1 on Rollback.” (Walmart, posted on 9 April 2014, Walmart Facebook page).

In response to this advertisement, a customer posted the following comment:

“SHE'S STEALING YOGURT! LITERALLY PUTTING IT IN HER PURSE AND NOT IN THE GROCERY KART [sic]”

These humorous comments posted by customers are ways by which they engage with companies. This form of customer engagement brings positive value to the online community [28].

However, there are times when customers express negative emotions, hence damaging the image of the company. Comments are labelled as sarcastic when irony is used to mock or to convey contempt. Customers post such comments to express their anger and/or disappointment following an action of the company or an event that has occurred. This form of emotion brings negative value and can potentially harm the relationship between the two parties. This study reveals that sarcasm is the form of emotional expression most used by customers on both the Tesco and Walmart Facebook pages. Following a company post in which Tesco asked about the most exotic food that customers had not tasted yet, several customers posted sarcastic comments referring to the horsemeat scandal, as illustrated below:

“Was going to say horsemeat but then I remembered, I had it last year in place of beef !!! [sic]” (TDW, posted on 7 April 2014, Tesco Facebook page).

These findings are consistent with another study, which showed that information broadcast by a third party through social media exacerbates publics' emotions such as disgust, anger, and contempt when the reason for the crisis is internal [58]. Furthermore, some customers seem to have lost trust, which is one of the pillars of relationship marketing. Sarcastic comments may indicate that the relationship between Tesco/Walmart and its customers has been damaged.

Sceptic comments are those that incorporate an element of doubt or lack of conviction about something. Customers let companies know that they remain dubious or have a feeling of incredulity about promises made by companies. Sceptic comments are posted when customers have lost trust in the company, and this may damage or have already damaged the relationship between the customer and the company. At Walmart, customers post sceptical comments whenever Walmart posts content which they consider to be dubious. For example, Walmart shared a link praising the action of a Walmart employee. One customer posted two comments to let the other members of the community know that she has doubts about the truth of this article:

“This is bogus!!!!” (WAB, posted on 23 April 2014, Walmart Facebook page).

“This is the second article I've seen that a [sic] employee of Walmart supposedly done something good. Like I said before Bogus” (WAB, posted on 23 April 2014, Walmart Facebook page).

When customers are emotionally engaged, the nature of the relationship changes [31]. Positive emotions associated with enthusiasm and humour tend to enhance the relationship between the company and the customer, while negative emotions associated with sarcasm and scepticism may harm the relationship between the two parties.

C. Behavioural Engagement

From a behavioural viewpoint, engagement refers to actions toward a firm that go beyond transactions [34], e.g. participation, vigour and interaction with the focal brand [33]. This study reveals that customers engage with the grocery stores by communicating back to them (C2B communication) to seek more information (customer queries), for entertainment, to get additional incentives, and to participate by responding to posts of the stores.

Additionally, this study uncovers that customers converse with other customers (C2C communication) on Facebook, share advertisements, give advice to other customers, get or provide feedback, criticise other customers, help other customers, make themselves or someone else known to other customers (reputation), and also provide support and encouragement to other customers. These forms of behavioural engagement add value to the relationship. These types of customer behaviour are consistent with the findings of van Doorn et al. [36], who categorise them as a form of behavioural customer engagement, which incorporates customer co-creation. Hence, co-creation happens when the customer contributes
through spontaneous, discretionary behaviours that personalise the customer-to-brand experience [36]. Examples of customer engagement behaviours are when customers suggest how to enhance the customer experience, help and train service providers, or simply help other customers to enjoy their customer experience [36]. For example, a customer of Walmart posted a comment advising customers how to keep strawberries fresh for a longer period of time:

“I found a better way for them to last longer, put them in a bowl of water and a cup of vinegar and soak for 10 min. and then rinse off and put in refrigerator. They stay fresh LONGER AND THEY ARE CLEAN AND PESTICIDE FREE!!! [sic]”

Furthermore, the researchers have identified five positive actions resulting from comments posted by customers which add value to the organisation: customer referral, customer suggestions, customers defending the company, customers defending employees of the company and, finally, promotion by customers. Customers post positive comments to recommend the brand or the company to their friends. This type of customer behaviour is consistent with the findings of Sashi [40], who refers to these customers as advocates. Advocacy is a form of consumer engagement which happens when consumers actively recommend specific brands and/or ways of using these brands [39]. Similarly, customer referral is a form of customer engagement termed endorsing [57]. Endorsing is a form of behavioural customer engagement in which the customer proactively recommends products and services to the members of the online community [57].

Customers at times make suggestions to the company, and these suggestions turn customers into co-creators of value for the organisation [43]. These types of comments are very valuable for any organisation, as the customers are readily informing the company about their needs and wants, providing competitive intelligence for free, and telling companies how to solve problems that they have encountered.

Another form of behavioural customer engagement is when customers defend the company or employees of the company on the Facebook page. This is highly appealing for the organisation as it shows a high level of customer loyalty [57]. In this study there are several occasions when the company has posted on its official Facebook page and has been criticised by its customers. Other customers who did not agree with the latter replied to these customers and defended the company. These findings are consistent with researchers who argue that customers very often respond to complaints before the companies do, giving the company the benefit of the doubt [49]. For example, when Tesco posted a comment to gain insight into the type of exotic food that its customers had not yet consumed, several customers posted sarcastic comments referring to the horsemeat which Tesco had allegedly been selling in the past. A customer defended the company by posting the following comment:

“Sick of hearing about horse meat comedy, I don't think Tesco's would knowingly sell you anything that you did not want... The supplier is at fault not Tesco”

Analysis of the data also shows that customers post comments that are negative and unfavourable for the company in five main instances: to complain, to criticise, to provide information about competitors, to warn customers against a product/service and, lastly, to retaliate, i.e. to inform of actions they have taken or are about to take because of their dissatisfaction. These forms of negative comments are referred to as negative word of mouth [49]. Complaints are the most common form of customer post containing an unfavourable message in this study. Customers use the Facebook page of the company to voice their dissatisfaction and discontent with the product or service of the company. Customers post messages on the Facebook pages of Tesco and Walmart to criticise actions taken by the respective companies. When Walmart posted about its initiative of empowering women in different parts of the world, a customer posted a negative comment to inform Walmart and the Facebook community that she does not believe this company post, which she describes as hogwash (i.e. insincere speech or meaningless talk), as Walmart does not pay its employees decent wages:

“What a bunch of hog wash [sic] !!!! Walmart you are one of the hugest reasons the working poor stay poor!!!! Shame on u [sic]!!! Pay living wages and give health benefits and then you can make claims like this!!!!”

Such customer engagement activities may have negative consequences for the organisation [36]. When customers post negative reviews, the reputation of the company may be damaged. Thus, co-destruction of value arises during interactions between the company and its customers where, instead of creating value for both parties, value is destroyed either for all parties or for one party [60].

V. CONCLUSION

The aim of this study was to analyse the various forms of customer engagement by examining the reactions of customers to company-initiated messages on the Facebook pages of Tesco and Walmart. From the extant literature and the findings of the study on the use of Facebook in the hypermarket and supermarket sector, it is clear that Facebook is influencing the way businesses are managed today. Within the hypermarket and supermarket sector, co-creation in terms of new product development or improving existing products is limited, since these businesses are merely selling products manufactured or produced by suppliers. Nonetheless, this study has revealed that customers do create value for the organisation by interacting on the Facebook pages of Tesco and Walmart, thus adding to the literature on co-creation of value by
customers. Customers become co-creators of value when they offer advice or help to other customers within the community, when they refer products or services to other customers, when they make suggestions to the grocery stores, and when they defend the company and its employees.

The findings of this study also provide considerable support for customers destroying value for the grocery stores on their Facebook pages. This study has revealed that customers mainly use the Facebook pages of grocery stores to post complaints and criticisms. Furthermore, they actively recommend other customers to boycott the grocery stores and to patronise competitors' stores. Additionally, customers provide information about the products and services of competitors, thus encouraging other customers to shop in other outlets. These actions harm the organisation as they destroy value [41], [42]. This study therefore contributes to the body of literature in that it has compiled various ways in which customers can threaten and harm grocery stores on Facebook.

The limitation of the study is linked to the nature of the netnography method, which restricted the analysis to those customers who have posted comments online, thus neglecting other sources such as offline customer feedback. Despite choosing two main grocery stores, the researchers did not carry out a comparative analysis of the use of Facebook within the two grocery stores, as the aim of the study was to gain an understanding of the types of customer engagement occurring within the grocery sector on Facebook. For future research, a comparative analysis of the use of Facebook by grocery stores may be undertaken to analyse the social media strategies adopted by these firms. An additional avenue for future research could be to focus on one grocery store operating in different countries. This would help the identification of differences and similarities between social media communication in various country contexts. Furthermore, this would allow the researcher to look at how communication is tailored to specific locations, i.e. the importance of place and culture.
REFERENCES
[1] S. Sands, E. Harper, and C. Ferraro, "Customer-to-non-customer interactions: Extending the 'social' dimension of the store environment", Journal of Retailing and Consumer Services, Vol. 18, 2011, pp. 438–447.
[2] R. Varadarajan, R. Srinivasan, G.G. Vadakkepatti, M. Yadav, P.A. Pavlou, S. Krishnamurthy, and T. Krause, "Interactive Technologies and Retailing Strategy: A Review, Conceptual Framework and Future Research Directions", Journal of Interactive Marketing, Vol. 24, No. 2, 2011, pp. 96–110.
[3] M. O'Hern and A. Rindfleisch, "Customer co-creation: a typology and research agenda", Review of Marketing Research, Vol. 6, 2009, pp. 84–106.
[4] S. Pookulangara and K. Koesler, "Cultural influence on consumers' usage of social networks and its impact on online purchase intentions", Journal of Retailing and Consumer Services, Vol. 18, 2011, pp. 348–354.
[5] C. Dubose, "The social media revolution", Radiologic Technology, Vol. 12, No. 1, 2011, pp. 30–43.
[6] R. Hanna, A. Rohm, and V.L. Crittenden, "We're all connected: The power of the social media ecosystem", Business Horizons, Vol. 54, 2011, pp. 265–273.
[7] G. Tsimonis and S. Dimitriadis, "Brand strategies in social media", Marketing Intelligence & Planning, Vol. 32, No. 3, 2014, pp. 328–344.
[8] R.M. Alexander and J.K. Gentry, "Using social media to report financial results", Business Horizons, Vol. 57, 2014, pp. 161–167.
[9] L.I. Labrecque, "Fostering Consumer–Brand Relationships in Social Media Environments: The Role of Parasocial Interaction", Journal of Interactive Marketing, 2014, in press.
[10] C. Ho, "Consumer behavior on Facebook", EuroMed Journal of Business, Vol. 9, Iss. 3, 2014, pp. 252–267.
[11] R. Garretson, "Future tense: The global CMO", retrieved September 29, 2010, from http://graphics.eiu.com/upload/Google%20Text.pdf
[12] S. Asur, "The Economics of Attention: Social Media and Businesses", Vikalpa: The Journal for Decision Makers, Vol. 37, No. 4, 2012, pp. 77–85.
[13] G.V. Valerio Ureña, D. Herrera Murillo, N. Herrera Murillo, and F.J. Martínez Garza, "Purposes of the communication between companies and their Facebook followers", Revista Latina de Comunicación Social, Vol. 70, 2015, pp. 110–121, http://www.revistalatinacs.org/070/paper/1037mx/07en.html, DOI: 10.4185/RLCS-2015-1037en.
[14] A. Kaplan and M. Haenlein, "Users of the world, unite! The challenges and opportunities of social media", Business Horizons, Vol. 53, 2010, pp. 59–68.
[15] Harvard Business School, The New Conversation: Taking Social Media From Talk to Action, Harvard Business Review Report, Harvard Business Publishing, Boston, MA, 2010, available at: www.sas.com/resources/whitepaper/wp_23348.pdf (accessed December 17, 2012).
[16] D.E. Schultz and J. Peltier, "Social media's slippery slope: challenges, opportunities and future research directions", Journal of Research in Interactive Marketing, Vol. 7, No. 2, 2013, pp. 86–99.
[17] Facebook, Newsroom – Key facts, 2014, http://newsroom.fb.com/Key-Facts.
[18] H.G. Pereira, M.F. Salgueiro, and I. Mateus, "Say yes to Facebook and get your customers involved! Relationships in a world of social networks", Business Horizons, 2014, http://dx.doi.org/10.1016/j.bushor.2014.07.001
[19] E. Pöyry, P. Parvinen, and T. Malmivaara, "Can we get from liking to buying? Behavioral differences in hedonic and utilitarian Facebook usage", Electronic Commerce Research and Applications, Vol. 12, 2013, pp. 224–235.
[20] S.H. Dekay, "How large companies react to negative Facebook comments", Corporate Communications: An International Journal, Vol. 17, Iss. 3, 2012, pp. 289–299.
[21] J.H. Kietzmann, K. Hermkens, I.P. McCarthy, and B.S. Silvestre, "Social Media? Get Serious! Understanding the functional building blocks of social media", Business Horizons, Vol. 54, 2011, pp. 241–251.
[22] S. Fournier and J. Avery, "The uninvited brand", Business Horizons, Vol. 54, 2011, pp. 193–207.
[23] P. LaPointe, "Measuring Facebook's Impact on Marketing", Journal of Advertising Research, Vol. 52, No. 3, 2012, pp. 286–287.
[24] W.G. Mangold and D.J. Faulds, "Social media: The new hybrid element of the promotion mix", Business Horizons, Vol. 52, 2009, pp. 357–365.
[25] J. Colliander and M. Dahlén, "Following the Fashionable Friend: The Power of Social Media", Journal of Advertising Research, Vol. 51, Iss. 1, 2011, pp. 313–320.
[26] N.L. Chan and B.D. Guillet, "Investigation of social media marketing: How does the hotel industry in Hong Kong perform in marketing on social media websites?", Journal of Travel & Tourism Marketing, Vol. 28, No. 4, 2011, pp. 345–368, p. 347.
[27] K.S. Coulter and A. Roggeveen, "'Like it or not': Consumer responses to word-of-mouth communication in on-line social networks", Management Research Review, Vol. 35, Iss. 9, 2012, pp. 878–899.
[28] V. Kumar, L. Aksoy, B. Donkers, R. Venkatesan, T. Wiesel, and S. Tillmanns, "Undervalued or Overvalued Customers: Capturing Total Customer Engagement Value", Journal of Service Research, Vol. 13, No. 3, 2010, pp. 297–310.
[29] N. Woodcock, A. Green, and M. Starkey, "Social CRM as a business strategy", Journal of Database Marketing & Customer Strategy Management, Vol. 18, No. 1, 2011, pp. 50–64.
[30] T.M. Harrison and M. Barthel, "Wielding new media in Web 2.0: exploring the history of engagement with the collaborative construction of media products", New Media & Society, Vol. 11, No. 1-2, 2009, pp. 155–178.
[31] R.J. Brodie, L.D. Hollebeek, B. Juric, and A. Ilic, "Customer Engagement: Conceptual Domain, Fundamental Propositions, and Implications for Research", Journal of Service Research, Vol. 14, No. 3, 2011a, pp. 252–271.
[32] L. Hollebeek, "Demystifying customer brand engagement: Exploring the loyalty nexus", Journal of Marketing Management, Vol. 27, No. 7/8, 2011, pp. 785–807, p. 790.
[33] L.D. Hollebeek, "The customer engagement/value interface: An exploratory investigation", Australasian Marketing Journal, Vol. 21, Iss. 1, 2013, pp. 17–24.
[34] C. Porter, N. Donthu, W. MacElroy, and D. Wydra, "How to foster and sustain engagement in virtual communities", California Management Review, Vol. 53, No. 4, 2011, pp. 80–110.
[35] S. Vivek, S. Beatty, and R. Morgan, "Customer Engagement: Exploring Customer Relationships Beyond Purchase", Journal of Marketing Theory & Practice, Vol. 20, No. 2, 2012, pp. 122–146.
[36] J. van Doorn, K.N. Lemon, V. Mittal, S. Nass, D. Pick, P. Pirner, and P.C. Verhoef, "Customer Engagement Behavior: Theoretical Foundations and Research Directions", Journal of Service Research, Vol. 13, No. 3, 2010, pp. 253–266.
[37] T.H.A. Bijmolt, P.S.H. Leeflang, F. Block, M. Eisenbeiss, B.G.S. Hardie, A. Lemmens, and P. Saffert, "Analytics for Customer Engagement", Journal of Service Research, Vol. 13, No. 3, 2010, pp. 341–356.
[38] C. Grönroos, "Creating a relationship dialogue: communication, interaction and value", The Marketing Review, Vol. 1, No. 1, 2012, pp. 5–14.
[39] R.J. Brodie, A. Ilic, B. Juric, and L. Hollebeek, "Consumer engagement in a virtual brand community: An exploratory analysis", Journal of Business Research, Vol. 66, Iss. 1, 2011b, pp. 105–114.
[40] C.M. Sashi, "Customer engagement, buyer-seller relationships, and social media", Management Decision, Vol. 50, Iss. 2, 2012, pp. 253–272.
[41] P.S.H. Leeflang, P.C. Verhoef, P. Dahlström, and T. Freundt, "Challenges and solutions for marketing in a digital era", European Management Journal, Vol. 32, 2014, pp. 1–12.
[42] P. Verhoef, S. Beckers, and J. van Doorn, "Understand the Perils of Co-Creation", Harvard Business Review, Vol. 91, No. 9, 2013, p. 28.
[43] P.C. Verhoef, W.J. Reinartz, and M. Krafft, "Customer Engagement as a New Perspective in Customer Management", Journal of Service Research, Vol. 13, No. 3, 2010, pp. 247–252.
[44] V. Champoux, J. Durgee, and L. McGlynn, "Corporate Facebook pages: when 'fans' attack", Journal of Business Strategy, Vol. 33, Iss. 2, 2012, pp. 22–30.
[45] I. Schulze Horn, T. Taros, S. Dirkes, L. Huer, M. Rose, R. Tietmeyer, and E. Constantinides, "Business Reputation and Social Media - A Primer on Threats and Responses", IDM Journal of Direct, Data and Digital Marketing Practice, Vol. 16, No. 3, 2015, http://www.palgrave-journals.com/dddmp/index.html, p. 4.
[46] J. Pfeffer, T. Zorbach, and K.M. Carley, "Understanding online firestorms: Negative word-of-mouth dynamics in social media networks", Journal of Marketing Communications, Vol. 20, Nos. 1-2, 2014, pp. 117–128, DOI: 10.1080/13527266.2013.797778, p. 118.
[47] R.V. Kozinets, "The field behind the screen: using netnography for marketing research in online communities", Journal of Marketing Research, Vol. 39, 2002, pp. 61–72.
[48] A. Rageh, T.C. Melewar, and A. Woodside, "Using netnography research method to reveal the underlying dimensions of the customer/tourist experience", Qualitative Market Research: An International Journal, Vol. 16, No. 2, 2013, pp. 126–149.
[49] J. Colliander and A.H. Wien, "Trash talk rebuffed: consumers' defense of companies criticized in online communities", European Journal of Marketing, Vol. 47, No. 10, 2013, pp. 1733–1757.
[50] H. Hsieh and S. Shannon, "Three approaches to qualitative content analysis", Qualitative Health Research, Vol. 15, No. 9, 2005, pp. 1277–1288.
[51] C. Stavros, M.D. Meng, K. Westberg, and F. Farrelly, "Understanding fan motivation for interacting on social media", Sport Management Review, Vol. 17, 2014, pp. 455–469.
[52] Y.S. Lincoln and E.G. Guba, Naturalistic Inquiry, Beverly Hills, CA: Sage Publications, 1985.
[53] A.K. Shenton, "Strategies for ensuring trustworthiness in qualitative research projects", Education for Information, Vol. 22, 2004, pp. 63–75.
[54] R. Langer and S.C. Beckman, "Sensitive research topics: netnography revisited", Qualitative Market Research: An International Journal, Vol. 8, No. 2, 2005, pp. 189–203.
[55] J. Gummerus, V. Liljander, E. Weman, and M. Pihlström, "Customer engagement in a Facebook brand community", Management Research Review, Vol. 35, Iss. 9, 2012, pp. 857–877.
[56] H. Park and Y.K. Kim, "The role of social network websites in the consumer–brand relationship", Journal of Retailing and Consumer Services, Vol. 21, 2014, pp. 460–467.
[57] L. Dessart, C. Veloutsou, and A. Morgan-Thomas, "Consumer engagement in online brand communities: A social media perspective", Journal of Product & Brand Management, Vol. 24, Iss. 1, 2015, pp. 1–33.
[58] Y. Jin, B. Fisher Liu, and L.L. Austin, "Examining the role of social media in effective crisis management: The effects of crisis origin, information form, and source on publics' crisis responses", Communication Research, Vol. 41, No. 1, 2014, pp. 74–94.
[59] A. Palmer and N. Koenig-Lewis, "An experiential, social network-based approach to direct marketing", Direct Marketing: An International Journal, Vol. 3, 2009, pp. 162–176.
[60] L. Plé and R. Chumpitaz Cáceres, "Not always co-creation: introducing interactional co-destruction of value in service-dominant logic", Journal of Services Marketing, Vol. 24, Iss. 6, 2010, pp. 430–437.
The Impact and Opportunities of e-Tutoring in a
Challenged Socio-Economic Environment
Petra le Roux1; Marianne Loock2
School of Computing
University of South Africa (UNISA)
lrouxp@unisa.ac.za1; loockm@unisa.ac.za2
Abstract—As social network sites rise in popularity, the youth spend large amounts of their time browsing the Internet and interacting via social network sites. Furthermore, social network sites create opportunities that allow the youth to connect to different learning environments and thus open up options for new dimensions of learning. This gives rise to research on a shift from conversational to educational content on social media. However, due to socially and economically challenged circumstances, many children in third world countries cannot share in these opportunities. Poor numeracy and literacy levels achieved in basic education predict huge stumbling blocks for these learners during their school career. Added to this is the huge shortage of teachers who can provide mother tongue education to non-English speaking learners, so a challenge is inevitable. But rapid technological changes in information and communication technology enable people to help one another, even over distance. Situated in this context, the broad aim of this ongoing research is to investigate how the use of social networking tools as a platform for cross-age e-tutoring addresses the social and educational needs of socially and economically challenged learners. The first part of this ongoing research used an experimental study, and the conclusions drawn clearly indicate that the possibility exists for adolescent tutors to develop a higher self-esteem when fully administered in a position as tutors of younger tutees. The younger tutees also indicated that it was a positive learning and social experience. These results are enough reason to repeat this research with a quasi-experimental research method with a control group.

Keywords—social computing; mobile learning; e-tutoring; identity development; information security awareness

I. INTRODUCTION

There are severe problems with basic numeracy and literacy levels in third world countries [1], [2]. According to the South African Department of Basic Education [3] the "quality of basic literacy levels is still well below what it should be. Fewer than half of all learners in the country perform at a level that indicates that they have at least partially achieved the competencies specified in the curriculum. In Grade 6, the results indicate that only around 30% of learners fall into this category. The percentage of learners reaching at least a 'partially achieved' level of performance varies from 30% to 47%, depending on the grade and subject considered. The percentage of learners reaching the 'achieved' level of performance varies from 12% to 31%".

The severe problems with basic numeracy and literacy levels in South Africa have a ripple effect that can be seen in the low Grade 12 mathematics throughput. Mathematics offers an account of a learner's ability to think logically and systematically, reason, judge, calculate, compare, reflect and summarize. Furthermore, mathematics requires consistency with the knowledge of numbers learnt from Grade 1 and earlier. It is generally accepted that addressing numeracy and literacy problems during the early years of education will reduce the problem of high levels of failure and dropout in later grades. This challenge is particularly important in communities that are socially and economically challenged.

South Africa has a huge shortage of teachers, especially teachers who can provide mother tongue education to non-English speaking children. This problem has a direct effect on class sizes, and teachers need support that will make up for the fact that they cannot attend to individual children sufficiently during class time [3]. Many current teachers have had little exposure to technology and do not feel comfortable using it. They will require training (with patience) to get them to adjust to teaching practices that rely on technology. Bringing technology into the classroom may be met with resistance from current, older teachers, which can lead to alienation.

The problem stated above can be addressed by improving the numeracy and literacy outcomes for the socially and economically challenged learner without placing additional load on the already overburdened teaching resources in the formal learning environment [4]. A solution is to make use of out-of-school time for learning, where school and community partner together to create academic and enrichment activities to support learning. Expanded learning opportunities describe these ranges of learner programs and activities that occur beyond traditional school hours [5]. The proposed solution will endeavour to provide a secure expanded learning opportunity for socio-economically challenged youths through e-tutoring. This will be attempted by examining both the opportunities offered by social networking tools and mobile learning, as well as the sociological and psychological impact it has on the participants.

II. LITERATURE REVIEW

A. Social computing

Schuler [6] defines social computing as "any type of computing application in which software serves as an intermediary or a focus for a social relation" [6]. Thus, social
kriangkrai.l@bu.ac.th, {kensuke,kei,shigeki}@nii.ac.jp
Abstract—Anomaly detection is one of the crucial issues of network security. Many techniques have been developed for certain application domains, and recent studies show that machine learning techniques offer several advantages for detecting anomalies in network traffic. One of the issues in applying these techniques to a real network is to understand how much the learning algorithm should weight new traffic over old traffic. In this paper, we investigate the dependency of the learning time period on the performance of anomaly detection in Internet traffic. For this, we introduce a weighting technique that controls the influence of recent and past traffic data in an anomaly detection system. Experimental results show that the weighting technique improves detection performance by between 2.7% and 112% for several learning algorithms, such as the multivariate normal distribution, k-nearest neighbor, and one-class support vector machine.

Keywords—weighting technique, machine learning, multiple timeline, anomaly detection.

I. INTRODUCTION

Because of mobile applications, access and backbone Internet traffic has grown exponentially every year. In addition, software and tools for novice attackers are easily available [1], so that they can simply use this software to attack and intrude into target networks. These attack tools have been developed very well to imitate normal packets and flows, and easily evade existing detection systems. Some attackers have applied advanced techniques, so conventional detection methods hardly detect the traffic from such malicious software. For these reasons, daily operation is difficult for network administrators who monitor unusual incidents or anomalies passing through their own systems.

Many anomalies in current computer networks are more complicated than prior ones. We generally categorize anomalies on computer systems into two groups. The first group is anomalies caused by threats or human intention, such as attacks, viruses, worms, scanning, and spamming. The second group is anomalies caused by accidents, including outages, misconfigurations, and flash crowds. The desirable detection system should discover both types of anomalies with high accuracy and produce a low error rate.

In past years, researchers have studied various detection techniques to compete against new malicious software and new types of anomalies in computer networks. A study by Denning [2] classified detection systems into host-based and network-based detection systems. Host-based detection systems are installed locally on many different types of machines, namely servers, workstations, notebook computers, or even mobile devices. Host-based detection systems analyze traffic and pass it through a particular host if there are no potentially malicious packets. An example of host-based detection systems is PortSentry [3]. Network-based detection systems are strategically positioned in a network to monitor all traffic and detect any unusual traffic on the hosts of that network. Network-based detection systems need to be very fast at analyzing traffic, because they need to monitor all packets or flows passing through the network. Examples of network-based detection systems are RealSecure, SecureNet [4], and Snort [5].

Recently, studies on machine learning [6] have shown promising results in anomaly detection. They have been mainly classified into two techniques: classification and clustering [7]. Classification is a learning technique that requires labeled instances as training data to generate a function for classifying a new instance. Detection systems applying supervised techniques need to label packets or flows as normal or anomalous. Several learning algorithms based on the classification technique have been studied, such as k-nearest neighbors [8] and support vector machines [9]. Clustering is a learning technique that tries to find hidden structure in data. We could detect anomalies in network traffic on the basis of the assumption that major groups are normal traffic and minor groups are anomalies. Many studies have employed clustering techniques, such as [10] and [11].

As an extension of network anomaly detection with machine learning, we proposed a multi-timeline detection system [12] that uses multiple time series of network traffic, corresponding to previous days' behavior, as input to a machine learning algorithm. This technique helps the system detect anomalies more accurately than a single timeline, as shown in the empirical results [12]. Nevertheless, this multi-timeline technique treats all traffic or timelines equally, but some network environments need a weight on particular timelines, for example, more on recent timelines. Thus, a use of weights among different time series would improve detection performance.

In this paper, we propose a weighting technique that biases the multi-timeline detection system toward recent timelines. We conduct experiments on real network traffic to compare detection performance between the weighting and non-weighting techniques for several types of attacks with three learning algorithms: the multivariate normal distribution, k-nearest neighbor, and one-class support vector machine.

The following sections provide an explanation of our detection system and show experimental results on a campus network
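The multi-timeline idea from [12] can be illustrated with a minimal sketch (toy numbers and invented names, not the authors' implementation): traffic observed in the same time interval on several previous days forms the training sample for that interval, and today's observation is scored against it.

```python
import numpy as np

def multi_timeline_training_set(traffic, interval, days):
    """Collect the feature values observed at the same time interval
    on each of the previous `days` timelines (illustrative sketch)."""
    # traffic[d][i] = feature value (e.g. packet count) on day d, interval i
    return np.array([traffic[d][interval] for d in range(days)])

# Toy traffic: 6 previous days, 4 intervals per day.
traffic = [
    [100, 210, 190, 120],
    [ 98, 205, 200, 118],
    [102, 215, 195, 121],
    [ 99, 208, 198, 119],
    [101, 212, 193, 122],
    [100, 209, 196, 120],
]

train = multi_timeline_training_set(traffic, interval=1, days=6)
mean, std = train.mean(), train.std()

# Score today's observation for the same interval: a large deviation
# from previous days' behaviour suggests an anomaly.
today = 450
z = abs(today - mean) / std
print(z > 3)  # flags the spike as anomalous
```

A weighted variant, in the spirit of the paper, would simply let recent days contribute more to the training sample than older ones.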
where ρ is a parameter to set the proportion of the maximum probability; smaller values of ρ produce higher probabilities. We varied ρ between 2 and 4 on a linear scale for selection of the best detection performance. We define the classify function of test data x as

    f(x) = anomaly if p(x) < ε, normal otherwise.    (7)

    k(x, x') = exp(−γ ||x − x'||²).    (11)

Each support vector thus becomes the center of an RBF, and γ determines the area of influence that the support vector has over the data space. We varied the γ value between 10^−5 and 10^4 to observe a change for the best accuracy.

To classify testing data, we used the Svm-Predict function from Libsvm to determine whether an unknown vector sample x belongs to the positive or negative class. It returns +1 or −1 as the result of classification and provides to y the result of the sum from the SVM decision formula, Eq. 10. If the result value at a particular time interval is −1, we classify that time interval as an anomalous interval, while if the result is +1, we classify that time interval as a normal interval.

Fig. 4. An example of weighting process with weight length = 6 and weight value = 3.

E. Performance Metric

We used the F-score [21] as a single measure for evaluating the detection performance of our proposed technique. The F-score is widely used to evaluate the quality of binary classifications,
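Eq. (7) is a simple density threshold. A minimal sketch with a multivariate normal density (toy data and an assumed ε, not the paper's settings):

```python
import numpy as np

def mnd_pdf(x, mean, cov):
    """Multivariate normal density p(x) for a feature vector x."""
    d = len(mean)
    diff = x - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ inv @ diff) / norm)

def classify(x, mean, cov, eps):
    """Eq. (7): 'anomaly' if p(x) < eps, 'normal' otherwise."""
    return "anomaly" if mnd_pdf(x, mean, cov) < eps else "normal"

# Toy training data (normal traffic features) and an assumed threshold.
train = np.array([[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.05]])
mean, cov = train.mean(axis=0), np.cov(train.T)
eps = 1e-4

print(classify(np.array([1.0, 2.0]), mean, cov, eps))   # normal
print(classify(np.array([5.0, -3.0]), mean, cov, eps))  # anomaly
```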
Fig. 5. Detection performance of multi-timeline module with weighting process by using MND.

Fig. 7. Detection performance of multi-timeline module with weighting process by using OSVM.
Fig. 6. Detection performance of multi-timeline module with weighting process by using KNN.

especially when the sizes of the two classes are substantially skewed. The F-score, which considers both the precision and recall [22] to compute the score, assigns a value ranging between 0 and 1, where 1 represents perfect detection and 0 represents the worst detection. We measured the precision, recall, and F-score based on entire intervals. The precision, recall, and F-score are derived by Eqs. 12-14, respectively:

    precision = TP / (TP + FP),    (12)

    recall = TP / (TP + FN),    (13)

    F-score = 2 × (precision × recall) / (precision + recall),    (14)

where TP is the number of true positives (the number of anomalous intervals that were correctly detected), FP is the number of false positives (the number of normal intervals incorrectly identified as anomalous intervals), and FN is the number of false negatives (the number of anomalous intervals that were not detected). TP, FP, and FN were directly derived.

We conduct experiments to explore the detection performance of the multi-timeline detector module both with and without the weighting process. Network operators intend to give recent traffic behavior more weight. We apply a weighting process to the multi-timeline detector module and compare detection performance to that without the weighting process.

We perform experiments with a linear gradual weighting technique. First, we set the weight length and weight value to 1 for the weighting process. Second, we select the best feature from the first experiment for each type of anomaly by using the first learning algorithm, MND. After that, we measure the F-score for every type of anomaly and compute the average performance of the multi-timeline detector module for MND. Third, we alter the weight length from 1 day to 5, 10, 15, 20, and 25 days, then average the detection performance over all weight lengths. Next, we change the weight value from 1 to 3, 5, 7, and 9, then follow the procedure from the first to third step and compute the average detection performance for all values. Finally, we switch the learning algorithm from MND to KNN and OSVM, respectively, and plot all computed average values on three-dimensional graphs to compare trends in detection performance between these learning algorithms.

Performance results from this experiment on the weighted timeline are shown in Figures 5-7 for MND, KNN, and OSVM, respectively. The x-axis represents the weight values from 1 to 9, and the y-axis indicates the weight length during the learning process. The z-axis shows the F-score, which takes a value between 0 and 1, where 0 represents the worst and 1 the best detection performance.

Our experimental results indicate the advantage of the weighting process in most cases. Results of MND (Figure 5) show that the multi-timeline detector module with the weighting process produces a 3.2-16.6% improvement over the module without the weighting process. The case of KNN (Figure 6) indicates
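The F-score metric of Eqs. (12)-(14) reduces to a few lines; for example, with toy counts:

```python
def f_score(tp, fp, fn):
    """Eqs. (12)-(14): precision, recall, and their harmonic mean (F-score)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

# Toy counts: 8 anomalous intervals detected, 2 false alarms, 2 missed.
p, r, f = f_score(tp=8, fp=2, fn=2)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.8 0.8 0.8
```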
that adding the weighting process produces a small, random effect of between 2.7% and 6.4% improvement. Employing OSVM in Figure 7, however, the trend in detection performance is quite different from those of MND and KNN. The detection performance with OSVM improves linearly with an increase of the weight values and weight lengths. The improvement in performance by using OSVM is between 3.5% and 112%. From these three figures, OSVM is a suitable algorithm for the multi-timeline detector module with the linear gradual weighting technique.

V. DISCUSSION AND CONCLUSION

Empirical results from our experiment reveal that the detection performance of the multi-timeline system with weighting relies on two main factors. One factor is the dependency on the two parameters of the weighting process. The weighting technique in this experiment is timeline replication, so both the number of training data and the number of replications as a weighting value mainly affect the F-score. Another factor is the learning algorithm employed in the detection module. Our results indicate that the weighting technique strengthens the role of OSVM. The weighting process with MND improved slightly with an increase in the number of training data and the weighting value. The results from KNN show random and small change when we added the weighting technique to the multi-timeline detection module.

In summary, we proposed a weighting technique over the multi-timeline detection system so that we can apply any machine learning algorithm or use any traffic feature to detect network anomalies. We conducted two experiments to examine the capabilities of the multi-timeline and the detection performance over different volumes of background traffic and the weighting process. Experimental results strongly confirm that the multi-timeline detector module with the weighting process outperforms that without the weighting process, especially for OSVM.

For our future work, we intend to apply the multi-timeline detection system to a real network environment. Although our experimental results show that the multi-timeline detection system with some learning algorithms detected several types of anomalies with promising performance, there are many factors in network traffic that might adversely affect the detection performance of the system. Moreover, to fulfill an essential requirement of anomaly detection in real time, we intend to develop an automatic inspector that provides full details of anomalies after they have been detected by the multi-timeline detection system.

ACKNOWLEDGMENT

We gratefully acknowledge the funding from the Faculty Members Development Scholarship Program of Bangkok University, Thailand. The authors would like to thank all of the anonymous reviewers for their excellent suggestions that have greatly improved the quality of this paper.

REFERENCES

[1] H. F. Lipson, "Tracking and tracing cyber-attacks: Technical challenges and global policy issues," Software Engineering Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, Special Report CMU/SEI-2002-SR-009, November 2002.
[2] D. E. Denning, "An intrusion-detection model," IEEE Trans. Softw. Eng., vol. 13, no. 2, pp. 222-232, Feb. 1987.
[3] B. Toxen, Real World Linux Security, 2nd ed. Prentice Hall Professional Technical Reference, 2002.
[4] P. Spirakis, S. Katsikas, D. Gritzalis, F. Allegre, J. Darzentas, C. Gigante, D. Karagiannis, P. Kess, H. Putkonen, and T. Spyrou, "SECURENET: A network oriented intrusion prevention and detection intelligent system," in Proceedings of the 10th International Conference on Information Security, IFIP SEC94, The Netherlands, May 1994.
[5] M. Roesch, "Snort - lightweight intrusion detection for networks," in Proceedings of the 13th USENIX Conference on System Administration, ser. LISA '99. Berkeley, CA, USA: USENIX Association, 1999, pp. 229-238.
[6] C.-F. Tsai, Y.-F. Hsu, C.-Y. Lin, and W.-Y. Lin, "Intrusion detection by machine learning: A review," Expert Systems with Applications, vol. 36, no. 10, pp. 11994-12000, 2009.
[7] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey," ACM Comput. Surv., vol. 41, no. 3, pp. 15:1-15:58, Jul. 2009.
[8] S. Manocha and M. A. Girolami, "An empirical analysis of the probabilistic k-nearest neighbour classifier," Pattern Recogn. Lett., vol. 28, no. 13, pp. 1818-1824, Oct. 2007.
[9] W.-H. Chen, S.-H. Hsu, and H.-P. Shen, "Application of SVM and ANN for intrusion detection," Comput. Oper. Res., vol. 32, no. 10, pp. 2617-2634, Oct. 2005.
[10] S. Jiang, X. Song, H. Wang, J.-J. Han, and Q.-H. Li, "A clustering-based method for unsupervised intrusion detections," Pattern Recogn. Lett., vol. 27, no. 7, pp. 802-810, May 2006.
[11] A. Kind, M. Stoecklin, and X. Dimitropoulos, "Histogram-based traffic anomaly detection," IEEE Transactions on Network and Service Management, vol. 6, no. 2, pp. 110-121, June 2009.
[12] K. Limthong, K. Fukuda, Y. Ji, and S. Yamada, "Unsupervised learning model for real-time anomaly detection in computer networks," IEICE Transactions on Information and Systems, vol. E97.D, no. 8, pp. 2084-2094, 2014.
[13] S. Aksoy and R. M. Haralick, "Feature normalization and likelihood-based similarity measures for image retrieval," Pattern Recognition Letters, vol. 22, no. 5, pp. 563-582, 2001.
[14] J. Grossman, M. Grossman, and R. Katz, The First Systems of Weighted Differential and Integral Calculus. Archimedes Foundation, 1980.
[15] R. Lippmann, D. Fried, I. Graf, J. Haines, K. Kendall, D. McClung, D. Weber, S. Webster, D. Wyschogrod, R. Cunningham, and M. Zissman, "Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation," in DARPA Information Survivability Conference and Exposition, 2000. DISCEX '00. Proceedings, vol. 2, 2000, pp. 12-26.
[16] S. Amasaki and C. Lokan, "The effects of gradual weighting on duration-based moving windows for software effort estimation," in Product-Focused Software Process Improvement, ser. Lecture Notes in Computer Science, A. Jedlitschka, P. Kuvaja, M. Kuhrmann, T. Männistö, J. Münch, and M. Raatikainen, Eds. Springer International Publishing, 2014, vol. 8892, pp. 63-77.
[17] X. Wu, V. Kumar, J. Ross Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z.-H. Zhou, M. Steinbach, D. J. Hand, and D. Steinberg, "Top 10 algorithms in data mining," Knowl. Inf. Syst., vol. 14, no. 1, pp. 1-37, Dec. 2007.
[18] S. Theodoridis and K. Koutroumbas, Pattern Recognition, 4th ed. Academic Press, 2008.
[19] T. M. Mitchell, Machine Learning, 1st ed. New York, NY, USA: McGraw-Hill, Inc., 1997.
[20] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 27:1-27:27, 2011.
[21] C. J. van Rijsbergen, Information Retrieval, 2nd ed. Newton, MA, USA: Butterworth-Heinemann, 1979.
[22] J. Davis and M. Goadrich, "The relationship between precision-recall and ROC curves," in Proceedings of the 23rd International Conference on Machine Learning, ser. ICML '06. New York, NY, USA: ACM, 2006, pp. 233-240.
Algorithm Performance Indexing through Parallelism
Peter Manyere                          Andren L. Nel
University of Zimbabwe                 University of Johannesburg
Electrical Department                  Mechanical Department
P.O. Box MP 167, Mt Pleasant           P.O. Box 524, Auckland Park 2006
Harare, Zimbabwe                       Johannesburg, South Africa
Email: pmanyere@eng.uz.ac.zw           Email: andren@uj.ac.za
I. INTRODUCTION

Parallel computing systems can be evaluated in terms of execution time. The execution time is expressed as a function of the size of the input, the number of processors used, and their relative computation and inter-process communication speed (Grama et al, 2003). A parallel computing system comprises the algorithm and the architecture. An evaluation of the algorithm alone may not be conclusive due to some loss in accuracy. The assessment of the performance of parallel processors is based on speed-up and efficiency. The reason for having multiple processors is to execute a given problem faster. One would expect a system to run twice as fast when twice as many hardware resources are used. Practically, this will not be the case due to some overheads related to parallelism, which include inter-process interaction, idling and excess computation.

    S_ds(n, k) = Σ_{n=0} Σ_{k=0} a_t e^{jφ₁(n,k)}    (1)

where

    φ₁(n, k) = (4πf_c / c)(k/F_s + 2R_o/c + 2R_a/c + T_s/2 + T_p/2)(R_t − R_a)    (2)

R_a = slant range of the antenna phase center (APC) to the scene center
R_t = distance from the location of the APC to the point target at time t
T_p = pulse duration
n = pulse number
R_o = real-time range
T_s = sampling period
F_s = sampling frequency
Fig 4. 1-D Segmentation Algorithm

The segments were parallel processed into sub-images that were recombined after processing to form a composite image.

Fig 2.b 2-D Algorithm Image

Simulation of 36-point target data that was processed by the 2-D Segmentation Algorithm generated an image in fig 3.

V. PARALLEL FORMULATION BY 2-D BLOCK SEGMENTATION

The choice of the shortest path to solve the problem is critical. Two methods of achieving the fastest path of solving a problem use the vertices of the data matrix in what is referred to as 'All-Pairs Shortest Paths' (Dijkstra, 1983): Source-Partitioned Formulation and Source-Parallel Formulation. The Source-Partition Formulation partitions the vertices among n processors. Each processor computes the shortest path for all vertices assigned to it. On the contrary, Source-Parallel Formulation allots each vertex to a set of processors and applies the parallel formulation of a single-source algorithm in order to solve the problem on each set of processors (Grama et al, 2003). The major limitation of Source-Partition Parallel Formulation is that only n processors are kept busy doing useful work while the rest will be idle. Source-Parallel Formulation was therefore adopted in this paper to improve the performance of Source-Partition Parallel Formulation (Floyd, 1962). Two-dimensional (2-D) block segmentation was therefore implemented in this research (fig. 5), a useful tool for segmenting SSAR data for parallel processing.

Fig 5. 2-D Segmentation Algorithm

The 2-D technique partitioned data A(k) into N_pc blocks or segments of size (n/N_pc) × (n/N_pc). Each segment was assigned to one of the N_pc processors in parallel for processing, and hence the number of segments was equivalent to the number of processors. In practice, not all data is square (n × n), and this calls for data resizing prior to segmentation.

The total overhead is expressed as a function T_o and is given by:

    T_o = N_pc T_Np − T_S    (3)

The speedup S captured the relative benefit of solving a problem in parallel. For n × n data segmented by edge detection, the efficiency and the corresponding values of the speedup for the parallel algorithm can be determined. The whole operation on a serial computer would take 9t_c n² seconds, where t_c represents the time for each multiply-add operation. A simple parallel algorithm to solve this problem would require partitioning the image data equally across the processing elements, with each element applying the template to its own image segment (sub-image). In this case, a processing element allotted a vertically sliced image data of dimensions n × (n/N_pc) has to access a single layer of n pixels from the processing elements to the left and right. The total time for the parallel algorithm becomes:

    T_Np = 9t_c n² / N_pc + 2(t_s + t_w n)    (4)

with t_s as the startup time for data transfer and t_w being the per-word transfer time. The value of speedup becomes:
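Eqs. (3) and (4) can be evaluated directly; the parameter values below are illustrative assumptions, not measurements from this paper:

```python
def parallel_run_time(n, n_pc, t_c, t_s, t_w):
    """Eq. (4): parallel time for the template operation on n x n data
    split across N_pc processing elements."""
    return 9 * t_c * n**2 / n_pc + 2 * (t_s + t_w * n)

def total_overhead(n, n_pc, t_c, t_s, t_w):
    """Eq. (3): T_o = N_pc * T_Np - T_S, the work done beyond the serial time."""
    t_serial = 9 * t_c * n**2
    return n_pc * parallel_run_time(n, n_pc, t_c, t_s, t_w) - t_serial

# Assumed values: a 1024 x 1024 image, 16 processors, and
# per-operation / transfer costs on the order of microseconds.
n, n_pc = 1024, 16
t_c, t_s, t_w = 1e-6, 1e-3, 1e-6

t_par = parallel_run_time(n, n_pc, t_c, t_s, t_w)
speedup = (9 * t_c * n**2) / t_par
print(round(speedup, 2))  # just under 16: communication overhead costs a little
```

The overhead term 2(t_s + t_w n) is why the speedup falls short of the processor count, mirroring the diminishing returns reported in the Results section.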
VII. RESULTS

7.1 Performance Index for 1-D Algorithm

The 1-D Segmentation algorithm for SSAR was evaluated in terms of its execution time in both noisy and noise-free environments. The number of parallel processors as well as the data segment sizes were varied, and the mean execution times were recorded as in Table 1.

Table 1. Mean Run Times for 1-D Segmentation Algorithm

Segment Length (samples) | No. of Parallel Processors | Noise-Free Run time (s) | Noisy Run time (s)
1024 |  2 | 0.7635 | 0.7903
 512 |  4 | 0.6351 | 0.6517
 256 |  8 | 0.5972 | 0.5973
 128 | 16 | 0.6361 | 0.6456
  64 | 32 | 0.7300 | 0.7334

The actual performance index of the 1-D Algorithm is presented in figure 6. It is observed that as the number of parallel processors was increased from 2 to 8, the algorithm's mean run time decreased.

Table 2. Relative Speedup Performance for 1-D Segmentation Algorithm

Algorithm | Best Running Time (s), Noise-Free Environment | Best Running Time (s), Noisy Environment
PFA | 3.085 | 3.683
1-D Segmentation | 0.600 | 0.600
Speedup (No. of times faster) | 5.142 | 6.138

7.2 Performance Index for 2-D Algorithm

The algorithm was evaluated in terms of its data processing speed both in noisy and noise-free environments. The number of parallel processors in the grid was varied and the mean execution time recorded (Table 2). The mean run times show a general decrease in execution time as the grid size increases up to some point. The minimum mean execution time is obtained with 16 processors. A further increase in the number of processors tends to reduce the efficiency of the parallel system. Figure 7 shows the performance index for the 2-D algorithm.

Fig. 7. 2-D Segmentation Algorithm Performance Index (Run Time (s) vs. Parallel Processors)
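The speedup rows of Table 2 are the ratio of the best PFA run time to the best 1-D Segmentation run time, which can be checked directly:

```python
# Best running times from Table 2 (seconds).
pfa = {"noise_free": 3.085, "noisy": 3.683}
seg_1d = 0.600

# Speedup = PFA time / 1-D Segmentation time for each environment.
for env, t in pfa.items():
    print(env, round(t / seg_1d, 3))
# noise_free -> 5.142 and noisy -> 6.138, matching the table
```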
VIII. CONCLUSION
Abstract—License Plate Recognition (LPR) is an automatic system able to recognize a license number plate. It is an effective monitoring method that uses optical character recognition (OCR) on images to read vehicle number plates. Genetic neural networks have been used in such systems to address many challenges. The system is used in computerized highway toll collection and in surveilling the movement of road traffic. A licensed number plate recognition system is a very important component of an Intelligent Transportation System. Using morphological and connected-component analysis, the image of the car number plate is extracted for character segmentation; finally, each individual character is matched against a neural database and the vehicle is recognized. A feed-forward back-propagation neural network (FFBPNN) is selected as the most powerful tool to perform the recognition process. The main focus of this paper is on three modules: license plate location (LPL), character segmentation, and character recognition using a neural network. The research work begins with pre-processing, extraction of an image and connected-component analysis of the image, and finally recognition of the image by the feed-forward back-propagation neural network.

Keywords—Template matching; Neural Network; Feed Forward Back Propagation; License Number Plate

I. INTRODUCTION
Intelligent transportation systems (ITSs) play an important role in people's routine life, as their main objective is to improve the safety and mobility of transportation and to enhance productivity through the use of new technologies [1].

In this work, a character recognition algorithm for a vehicle license plate recognition system is introduced, to be used as the brain of intelligent infrastructure such as computerized payment systems (highway toll collection, parking fee collection) and management systems for traffic monitoring [1]. Template matching based on minimum Euclidean distance is used for visually similar characters such as '5' and 'S', or '0' and 'D' [12].

The vehicle license plate recognition (VLPR) algorithm consists of the following steps:
(i) Capture the car's number plate;
(ii) Apply morphological and connected-component analysis;
(iii) Extract the character image from the license plate;
(iv) Extract alphanumeric characters from the license plate image;
(v) Recognize the characters of the license number plate with the help of a neural network and identify the vehicle.

A neural network is an information-processing system inspired by the way the biological nervous system, such as the brain, processes information. An ANN is composed of a large number of highly interconnected processing elements, the neurons. Every neuron has its own local memory, and the output of each neuron depends only on the input signals arriving at the neuron.

The basic architecture consists of three layers of neurons: an input layer, a hidden layer and an output layer. The information flows from input units to output units, strictly in a feed-forward manner: there are no feedback connections, and data can be processed using multiple layers of units. The dynamical properties of the feed-forward network are important. In other applications, the significant changes are due to the activation of output neurons, such that the transient behaviour constitutes the output of the neural network. Other artificial neural network architectures exist, such as adaptive resonance theory maps, Elman networks and competitive networks, depending on the properties and requirements of the application [2].

At the entrance of a neuron, every input is multiplied by an individual weight. The intermediate section of the neuron is a sum function, which sums all the weighted inputs and the bias. At the output side, the transfer function is evaluated: the summation of the weighted inputs and bias is passed through an activation function (Fig. 1).

Fig. 1. Working of an artificial neuron [8].
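The weighted-sum-plus-activation behaviour of Fig. 1 can be sketched as follows. This is a minimal illustration, not the paper's implementation; the sigmoid activation is an assumption, since the paper does not name its transfer function:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron (cf. Fig. 1): each input is multiplied by its
    weight, the weighted inputs and bias are summed, and the sum is passed
    through an activation function (sigmoid here, as an assumption)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))

def feed_forward(x, layers):
    """Strictly feed-forward propagation through (weights, biases) layers,
    with no feedback connections, as described in the text."""
    for weights, biases in layers:
        x = [neuron(x, w, b) for w, b in zip(weights, biases)]
    return x
```

With zero weights and bias the neuron outputs 0.5, the midpoint of the sigmoid; stacking `layers` gives the input-hidden-output structure described above.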
IX. CONCLUSION & FUTURE SCOPE
A licensed number plate recognition system based on a neural network and template matching has been proposed in this paper. It can be concluded from the experimental results that the proposed system is 100% efficient in segmentation as well as in recognition of characters. The combination of a neural network and template matching performs well compared with either of them alone. In future, this system can be used for high-security recognition of number plates. It can be redesigned for multinational car license plates with two rows of characters. The present research work was done on stationary vehicles with white-background license plates; further work can be done on images of moving vehicles. Recognition of colored-background number plates can also be considered in future, and two-row number plates will likewise be a typical problem to solve.

REFERENCES
[1] Harpreet Kaur and Naresh Kumar Garg, "Number Plate Recognition Using Neural Network Classifier and KMEAN," International Journal of Advanced Research in Computer Science and Software Engineering, vol. 4, issue 8, pp. 429-434, August 2014.
[6] Sarmad Majeed Malik and Rehan Hafiz, "Automatic Number Plate Recognition based on connected component analysis technique," 2nd International Conference on Emerging Trends in Engineering and Technology (ICETET'2014), London (UK), pp. 33-36, May 30-31, 2014.
[8] Andrej Krenker, Janez Bešter and Andrej Kos, "Introduction to the Artificial Neural Networks."
[10] Antonio Albiol, Jose Manuel Mossi, Alberto Albiol and Valery Naranjo, "Automatic License Plate Reading Using Mathematical Morphology," Spanish Ministry of Science and Technology, 2002.
[12] Xifan Shi, Weizhong Zhao and Yonghang Shen, "Automatic License Plate Recognition System Based on Color Image Processing," ICCSA, pp. 1159-1168, 2005.
A Classification Method to Detect if a Tweet Will be
Popular in a Very Early Stage
ZHAO Xianghui¹, PENG Yong¹, YAO Yuangang¹, WANG Xiaoyi², ZHENG Zhan²
¹ China Information Technology Security Evaluation Center, Beijing, China
² International School of Software, Wuhan University, Wuhan, China
e-mail: zhaoxh@itsec.gov.cn
Abstract—Timely prediction of popular tweets is of great value in monitoring public opinion, marketing, emergency detection, personalized recommendation and other areas. This paper makes two improvements to predicting the popularity of a tweet on Microblog. On one hand, we propose some dynamic features, such as the retweet depth, the retweet width and the total number of fans of the retweeters, to improve prediction accuracy. On the other hand, we sharply shorten the time needed to detect whether a tweet will be popular by putting forward a method called LR-DT, which combines linear regression and a decision tree. We first use linear regression to predict the amounts of the dynamic features one hour after the tweet is posted, and then combine them with some static features in a decision tree classifier to detect the popularity of tweets. Our experiments are based on a real data set from Sina Weibo, the most popular microblog service platform in China. The results show that the proposed method effectively identifies the popularity of a tweet in less than 5 minutes, with little loss of accuracy.

Keywords—Microblog; retweet; popularity; prediction.

I. INTRODUCTION
Microblog is a platform for information interaction, sharing and forwarding based on users' relationships. Because of the originality of its content, the convenience of its broadcasting and other features, Microblog has provided people with a brand new way of social communication. The number of its registered users has shown explosive growth and has reached over 0.4 billion in China. With so many users, we can observe public opinion, detect hot topics, analyze surveys and perform marketing from popular microblogs. It is therefore important to detect whether a tweet will become hot at an early stage.

Researchers worldwide have studied tweet-level forwarding and analyzed whether a tweet will be popular [1]. Zhu X, Tang X and Wang X studied the factors influencing the popularity of tweets [2,3,4]. Their results showed that not only content-based features but also context features, such as the number of verified retweeters, strongly impact the popularity of a tweet. Zaman T R suggested that the features of the author and the retweeters are important for predicting whether a tweet will be retweeted [5]. However, experiments conducted by Kai W [6], Yang Z [7] and Luo Z [8] showed that tweets are more easily forwarded among people sharing common interests. Wu Z [9] and Kong S [10] found that similar tweets obtain similar retweet counts. Can E F [11] suggested that whether the content contains pictures, tags and other nice visual features has some impact on the final retweet count. Some researchers have also studied the popularity of a tweet directly. For example, Hong L [12] defined the tweet popularity problem as a classification task. Peng B [13] showed that, besides static features, the dynamic transferring features also impact tweet popularity. Gupta J P [14] proposed an approach to recognize human activities through gait, contributing a model-based approach for activity recognition using the movement of the legs only. Kong S [15] argued that tweet popularity can be analyzed from two aspects: the tweet lifespan and the final repost count. In summary, some researchers have studied the tweet popularity problem; however, they did not pay much attention to the dynamic features of tweets, especially the transferring characteristics. In addition, previous research focused on features collected more than an hour after a tweet has been created, so it could not make timely predictions of tweet popularity at a very early stage.

In this paper, we analyze the dynamic features of tweets and propose a novel approach, combining linear regression and a decision tree, to detect whether a tweet will be popular. Contrary to previous studies, our work aims to find a way to detect a tweet's popularity at a very early stage. First, we add four dynamic tweet features (repostDepth, repostWidth, vUserCount, totalFollowersCount) to the traditional static tweet features. Next, by analyzing the forwarding history of tweets, we apply feature selection and choose the ten features most related to a tweet's popularity by the information gain measure. Among them, we find that some dynamic features, especially repostCount, repostWidth and vUserCount, grow almost linearly with time. Finally, we combine linear regression and the decision tree as LR-DT to detect a tweet's popularity at a very early stage. We tested our approach on a tweet dataset crawled from Sina Weibo, covering 14376 tweets and 708534 retweets related to 70073 users. Experimental results showed that our approach with the new features achieves higher prediction accuracy than previously studied features. More importantly, this accuracy is achieved just 5 minutes after a tweet is created.

The contributions of this paper are two-fold. On one hand, we added some dynamic features which change linearly over time; propagation characteristics such as repostWidth, vUserCount and other features have a closer relation to a tweet's popularity.
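The LR-DT idea, extrapolating an early dynamic-feature count to one hour by linear regression and then classifying, can be sketched as follows. This is a minimal illustration, not the authors' implementation: an ordinary least-squares fit stands in for the linear-regression step, and a simple count threshold stands in for the decision tree, whose real inputs would also include the static features. The `threshold` value is invented for illustration:

```python
def fit_linear(ts, ys):
    """Ordinary least squares for y = a*t + b."""
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    a = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / \
        sum((t - mt) ** 2 for t in ts)
    return a, my - a * mt

def predict_popularity(early_counts, threshold=500):
    """LR step of the LR-DT idea (sketch): extrapolate the retweet count
    observed during the first 5 minutes out to 60 minutes, then apply a
    stand-in popularity rule in place of the decision tree."""
    ts = list(range(1, len(early_counts) + 1))   # minutes 1..5
    a, b = fit_linear(ts, early_counts)
    predicted_1h = a * 60 + b                    # extrapolated count at 1 hour
    return predicted_1h, predicted_1h >= threshold
```

The extrapolation is what lets the method answer after 5 minutes instead of waiting a full hour for the dynamic features to be observed directly.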
TABLE II. EVALUATION MEASURES

Parameter name                         Formula
accuracy (recognition rate)            (TP + TN) / (P + N)
error rate (misclassification rate)    (FP + FN) / (P + N)
recall (true positive rate)            TP / P
precision                              TP / (TP + FP)
F1 (harmonic mean of precision         2 × precision × recall /
and recall)                            (precision + recall)

III. EXPERIMENTAL RESULTS
We tested our approach on a tweet dataset crawled from Sina Weibo. It covers 14376 tweets and 708534 retweets related to 70073 users. For each tweet, we extract the 34 features illustrated in Table 1 during its propagation process. Next, we apply information gain as a measure to select the best 10 features, shown in Figure 1.

[Figure 1: information gain rate (0-0.4) of the selected features, including followersCou…, followerCount, vUserCount, depth, friendsCount, contentLength, verified and statusesCount.]

In the experiments, we apply 10-fold cross-validation. The dataset is randomly partitioned into 10 subsets of roughly equal size. We select one subset as the test set and use the remaining subsets for training. A decision tree classification model is learned from the training set and applied to the corresponding test set. We then obtain the final prediction results, reported in Table 4.

TABLE IV. PREDICTION RESULTS BY 10-FOLD CROSS VALIDATION

Groups     accuracy    errorRate    recall     precision    F1
Base       85.03%      14.07%       86.22%     75.98%       80.63%
Total      86.68%      13.32%       91.80%     82.00%       86.47%
Selected   86.67%      13.33%       91.79%     81.99%       86.46%

From Table 4, we see that the total group, which includes the dynamic features, achieves the best performance. The selected group, with only the top 10 informative features, loses almost no classification accuracy. This shows that the dynamic features of tweets have a major effect on tweet propagation, while the basic features of tweets matter far less.

Through the experiments above, we conclude that with dynamic and static features, the decision tree method can predict a tweet's popularity well. However, these experiments are based on features extracted 1 hour after tweet creation; we aim to make the prediction at a very early stage.
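The measures of Table II can be computed directly from confusion-matrix counts (TP, FP, TN, FN), with P = TP + FN and N = TN + FP; a minimal sketch:

```python
def classification_measures(tp, fp, tn, fn):
    """Standard measures as in Table II, from confusion-matrix counts."""
    total = tp + fp + tn + fn          # P + N
    accuracy = (tp + tn) / total
    error_rate = (fp + fn) / total
    recall = tp / (tp + fn)            # true positive rate, TP / P
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, error_rate, recall, precision, f1
```

Note that accuracy and error rate always sum to 1 under these definitions, which is a useful sanity check when reading results such as Table IV.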
Abstract—In an interconnected power system, all the generators must run at an appropriate capacity to meet the power demand. Loss of synchronism between the generators and/or too large frequency fluctuations may cause protective equipment to trip. Load frequency control (LFC) is necessary to balance the power generation and the load, by monitoring frequency and power changes in the tie-lines between interconnected power systems. In this paper, power systems from previous research works are analysed for stability, and different types of controllers are designed, validated through simulation, and compared with a Proportional-Integral-Derivative (PID)-controlled power system. Three types of controllers are considered, namely Fuzzy, Fuzzy-PID and Adaptive Neuro-Fuzzy controllers. The first power system considered is a linear identical non-reheat two-area system. However, a linear system does not model an actual power system completely because of neglected nonlinearities. Hence, two main sources of nonlinearity (generation rate constraint (GRC) and governor dead band (GDB)), which arise due to practical constraints, are considered and included in the model of the system.

Keywords—Load Frequency Control, Fuzzy, Adaptive Neuro-Fuzzy, Fuzzy-PID, Governor Deadband, Generation Rate Constraint.

I. INTRODUCTION
An electrical power service aims at providing reliable and uninterrupted electricity. Reliable and uninterrupted mean that the supply must be of constant RMS voltage and frequency. In real life this is of course not possible, and hence a tolerance range set by the service provider is used. This range must be within certain norms. For example, if there is a voltage drop of 10-15% or a frequency drop of a few hertz, there is a risk of stalling in motor loads, but usually generators trip on voltage or frequency before this happens. An electrical grid involves a lot of planning and simulation, and its operation is complex. Hence, automatic control is used in the system instead of human control, because a fast reaction speed is needed.

In an electrical grid, many synchronous generators with different voltage ratings are connected to bus terminals which have the same frequency and phase sequence as the generators. All generators connected in parallel must be run at the appropriate capacity to meet the demand. If a generator loses synchronism, there will be fluctuations in the voltage and frequency supply. It is essential to keep the bus synchronised with the generators throughout transmission for stable operation. This is called power system stability.

Power system stability, also known as synchronous stability, refers to the ability of a system to return to synchronism after any disturbance, such as a sudden change in loading conditions. Supply frequency and voltage must always be within certain limits to ensure the safe and reliable operation of electrical equipment and apparatus, both at the consumer's premises and during transmission and distribution. Thus, it is essential to be able to monitor and keep the voltage and frequency within limits.

After a perturbation involving a net change in power, the system will enter a transient state which is normally oscillatory and reflected by fluctuations in the power flow over transmission lines. This is called the dynamic system performance. In a tie-line connecting one group of generators to another, these oscillations may build up and be reflected by excessive fluctuations in power flow in the tie-line. This will cause protective equipment to trip [1].

A stable system is one in which, after any perturbation, the synchronous machines remain in synchronism at the end of a finite transient period. Moreover, the amplitude of the oscillations in the transient period must be kept under control. Besides, if there is any prolonged change, as long as these changes are within a predefined limit, the system must remain stable, with both frequency and voltage kept constant.

To accomplish LFC, a controller is used to make the system return to synchronism after any load change. As input, the controller needs the error signal, which is the difference between the desired and the actual output value. The controller then generates a control signal which affects the system's output value, amplifying or attenuating it in an attempt to obtain the desired output. Controllers can be adjusted to improve the transient response, by decreasing maximum overshoot and settling time, and to improve the steady-state response, by removing steady-state error.

Electrical energy is one of the most important resources, and hence optimization in power systems is essential. LFC is thus important, and being able to design optimal but cheap controllers is of utmost priority. There is a wide variety of controllers designed using different methods, each offering distinct advantages and disadvantages. There are conventional controllers like PID and other interesting concepts like fuzzy controllers. It is useful to compare the actual impact of these controllers.
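The error-driven loop described above, where the controller acts on the difference between the desired and actual output, can be sketched in discrete time. This is an illustration only; the gains are invented, and the paper's controllers are designed and tuned in Matlab/Simulink:

```python
def make_pid(kp, ki, kd, dt):
    """Discrete PID controller (sketch): the control signal is formed
    from the error signal, with proportional, integral and derivative
    terms. Gains kp, ki, kd are illustrative, not values from the paper."""
    state = {"integral": 0.0, "prev_err": 0.0}

    def step(error):
        state["integral"] += error * dt                    # integral action
        deriv = (error - state["prev_err"]) / dt           # derivative action
        state["prev_err"] = error
        return kp * error + ki * state["integral"] + kd * deriv

    return step
```

Dropping terms gives the P, I, PI and PD variants mentioned later: for instance, `make_pid(kp, 0.0, 0.0, dt)` is a pure proportional controller.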
A. PID Controller
A proportional-integral-derivative (PID) controller involves three parameters: the proportional, integral and derivative gains. Often, not all three control actions are needed to optimally control a system with minimal cost, and hence P, I, PI and PD controllers can also be used. PID controllers are very popular in the industrial field because of their simplicity and low cost, while effectively improving transient and steady-state responses even when the actual working principles and parameters of the system are unknown.

[Figure 1. Fuzzy numbers [4].]

B. Fuzzy PID Controller
The Fuzzy-PID controller uses a fuzzy and a PID controller together to improve the control action. The membership functions and fuzzy rules are difficult to fine-tune to obtain the best response. Besides, both fuzzy and PID controllers are not adaptive, meaning that if the set-point is changed, the system may no longer be optimally controlled, or may even become unstable. Using a fuzzy and a PID controller in cascade, such that the fuzzy controller adjusts the parameters of the PID controller depending on the actual response of the system, results in online and more optimal control.

C. Neuro-fuzzy Controller
In neuro-fuzzy controllers, a set of training data representing the desired response of the controller is used to train the fuzzy controller's rules and membership functions. This is done using the Adaptive Neuro-Fuzzy Inference System (ANFIS) function in Matlab/Simulink. For this type of controller, the ANFIS training routine trains a Sugeno-type fuzzy inference system using a hybrid training algorithm (a combination of the backpropagation gradient descent method and the least-squares method) to identify the parameters of the FIS.

II. METHODOLOGY
A linear two-area system and a nonlinear two-area system are considered for LFC. All systems are analysed to observe their initial response. After this, various types of controllers (PID, Fuzzy, Fuzzy-PID and adaptive Neuro-Fuzzy) are designed for the two-area systems so that a comparative analysis can later be performed. The systems are implemented in the form of block diagrams in the s-domain and simulated. Controllers are designed using the Control System Toolbox, which can be used to tune predefined controllers such as PID and Fuzzy so as to adapt them to a particular system.

A. Systems
The stability of the system is checked by analysing the uncontrolled system's response to confirm whether the frequency deviation and incremental tie-line power responses tend to finite values.

1) Linear Two-Area System
The two-area interconnected power system proposed by [11] is used, with a load change of 0.01 pu MW for area 1 and 0.02 pu MW for area 2. The areas are both large systems of 1000 MW, 60 Hz each [14].

2) Non-Linear Two-Area System
Considering an uncontrolled non-linear system [12],[13], a Generation Rate Constraint (GRC) of 10% pu MW min-1 (0.0017% pu MW s-1) and a Governor Deadband (GDB) of 0.05% were added to the linear two-area system for each area [15],[16].

B. Fuzzy Controller
A fuzzy controller is now designed for each area of the two-area system. The Matlab/Simulink inbuilt fuzzy block is used. To create the FIS, which defines all the parameters of the fuzzy block, 'fuzzy' is typed in the Matlab command window so as to open the FIS editor. Before starting the simulation, the FIS is exported to the workspace and the name of the corresponding FIS is used as the parameter of the fuzzy block.

ACEi = Bi Δfi + ΔPtie,i    (2)

where ACEi is the Area Control Error, i is the area number, Δfi is the frequency deviation, and Bi is the frequency bias parameter [17-19].

The input variables are ACE and ΔACE, and there is one output variable, the control action. The area control error is often used as the controller input: by using ACE as our controller input, deviations in both Δf and ΔPtie can be reduced.

Next, the membership functions are defined using the Fuzzy Inference System (FIS) membership function editor. Seven triangular membership functions are used for both the input and control variables. The ranges of the membership functions are initially set according to the ranges of the chosen input variables in the uncontrolled two-area system. Afterwards, the linguistic variables are assigned. The defuzzification method is taken as centroid. The rules are designed approximately by logic and then fine-tuned by trial and error to obtain the best control action. The weight of each rule is kept at 1. The ranges of the membership functions are adapted to obtain a better control action, and the rules are tuned again to improve the response.

C. Fuzzy-PID Controller
The same method as for the fuzzy controllers is used to design the fuzzy rules and set the fuzzy membership functions. The output of the fuzzy controller is connected to a PID controller.

D. Adaptive Neuro-Fuzzy Controller
The same system as for the fuzzy-controlled system is used, except that the FIS is a Sugeno type designed using the Adaptive Neuro-Fuzzy Inference System (ANFIS) editor. The AND method is set as product, and weighted average is used for defuzzification. A set of training data representing the inputs and the desired controller output is obtained from the two-area system with a PID controller. The training data set is loaded in the ANFIS editor and an initial FIS is generated using grid partition. The number of membership functions for the inputs is set to 7 and the output is set to be linear. The FIS is then trained for 10 epochs using the hybrid optimization method and tolerance 0.

III. DATA ANALYSIS AND FINDINGS
The systems were first analysed for a step load change of 0.01 pu MW for the first area and 0.02 pu MW for the second area. The same step changes were initially used for the different controllers, to be able to compare their performances. Then, different load changes were applied to the controlled systems to investigate their responses under different conditions.

As can be observed from Table I, even though the same step load changes are applied in all systems, there are discrepancies when nonlinearities are added to the system. When GRC is added, the steady-state error, the settling time and the maximum overshoot increase for Δf and ΔPtie in both areas. This is because GRC limits the rate of change of power generation. When both GDB and GRC are present, the settling time increases by a much greater amount, due to the very oscillatory response of the system. On the other hand, the steady-state error and maximum overshoot decrease slightly compared with the system with GRC only. This is because GDB adds a range of inputs for which the system does not react.
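The building blocks of the fuzzy design above, the ACE signal of eq. (2), triangular membership functions and centroid defuzzification, can be sketched outside Matlab as follows (illustrative only; the paper uses the Simulink fuzzy block and FIS editor):

```python
def ace(b_i, d_f, d_ptie):
    """Area Control Error, eq. (2): ACE_i = B_i * Δf_i + ΔP_tie,i."""
    return b_i * d_f + d_ptie

def tri(x, a, b, c):
    """Triangular membership function with feet at a and c, peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def centroid(xs, mus):
    """Centroid defuzzification over a sampled universe of discourse."""
    den = sum(mus)
    return sum(x * m for x, m in zip(xs, mus)) / den if den else 0.0
```

A real controller would evaluate seven such triangles per variable, combine them through the rule base, and defuzzify the aggregated output with `centroid`; the membership ranges here are whatever the caller passes in, mirroring how the paper sets them from the uncontrolled system's response ranges.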
A. Uncontrolled two-area system

Table I. Uncontrolled Two-Area System

                     Linear two-area system           Two-area system with GRC         Two-area system with GRC and GDB
                     Area 1           Area 2          Area 1           Area 2          Area 1           Area 2
                     Δf/Hz   ΔPtie/   Δf/Hz   ΔPtie/  Δf/Hz   ΔPtie/   Δf/Hz   ΔPtie/  Δf/Hz   ΔPtie/   Δf/Hz   ΔPtie/
                             pu MW            pu MW           pu MW            pu MW           pu MW            pu MW
Steady state value   -0.035  0.005    -0.035  -0.005  -0.327  0.005    -0.327  -0.005  -0.292  0.004    -0.292  -0.004
Maximum overshoot    0.044   0.006    0.052   0.006   0.331   0.010    0.330   0.010   0.365   0.010    0.368   0.010
Settling time/s      5.55    6.13     5.44    6.13    9.87    17.3     8.95    17.3    68.3    154.7    67.1    154.7
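The quantities reported in Table I (steady-state value, maximum overshoot, settling time) can be read off a sampled step response. A sketch, assuming a positive-going response and a ±2% settling band (the paper does not state its tolerance band, so `band` is an assumption):

```python
def response_metrics(t, y, final_value, band=0.02):
    """Read steady-state error, maximum overshoot and settling time
    from a sampled step response (sketch; assumes the response targets
    a positive final_value)."""
    sse = abs(final_value - y[-1])        # steady-state error vs. target
    overshoot = max(y) - final_value      # absolute maximum overshoot
    tol = band * abs(final_value) if final_value else band
    settling = t[0]
    # settling time: last instant the response leaves the +/- tol band
    for i in range(len(y) - 1, -1, -1):
        if abs(y[i] - final_value) > tol:
            settling = t[min(i + 1, len(t) - 1)]
            break
    return sse, overshoot, settling
```

The percentage changes reported later in Tables III and IV are then just `(controlled - uncontrolled) / uncontrolled * 100` for each of these three quantities.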
B. Controlled Linear Two-Area System
It can be observed from Table III that, in terms of settling time, the adaptive Neuro-Fuzzy controller performs best, with a decrease in settling time of over 40% for all responses. It also performs well in terms of maximum overshoot reduction, which is above 70%. However, it has the poorest performance in terms of steady-state error. The PID controller gave zero steady-state error and greatly decreased maximum overshoot, by above 80% in all areas, but had poor performance in terms of settling time (reduction of 0.082% to 11.41%) compared with the other controllers.

On the other hand, the fuzzy controller provided good performance in terms of settling time (above 30%) but little reduction of maximum overshoot (around 40% for the frequency deviation in both areas), and steady-state error was still present.

The Fuzzy-PID controller designed completely removed the steady-state error and gave more than 80% reduction in maximum overshoot for all responses. Besides, though not the best, it gave a good enough reduction in settling time of more than 25% for both Δf and ΔPtie. Different load changes are then applied to the Fuzzy-PID system to check whether it can perform well under different conditions. As can be observed in Figure 3, the Fuzzy-PID system performs well for other load changes, including negative load changes.

C. Controlled Non-Linear Two-Area System
As can be observed in Table IV, the steady-state error could not be completely removed, though it was reduced to a very small amount. This is because of the oscillatory nature of the system when GDB was added. In terms of settling time, both types of controllers offered similar improvement. However, the Fuzzy-PID controller gave a better response in terms of maximum overshoot. The controlled system is then tested under different load changes. As can be seen from Figure 4, the system responds well even with other load changes.

Table II. Controller Types
Controller number    Controller type
1                    PID
2                    Fuzzy with Δf and ΔPtie as inputs
3                    Fuzzy with ACE and ΔACE as inputs
4                    Fuzzy-PID
5                    Adaptive Neuro-Fuzzy
Table III. Controlled Linear Two-Area System (controller numbers as in Table II)

         Controller   Change in maximum   Change in settling   Change in steady
         number       overshoot/%         time/%               state error/%
Δf,      1            -90                 -0.082               -100
Area 1   2            -72.27              -3.52                -99.7
         3            -39.55              -45.98               -99.4
         4            -90.45              -26.62               -100
         5            -72.05              -59.52               -66.86
Δf,      1            -84.2               -11.1                -100
Area 2   2            -82.86              +15.69               -99.7
         3            -39.43              -43.42               -99.43
         4            -82.48              -33.86               -100
         5            -76.57              -62.53               -66.86
ΔPtie    1            -93.3               -11.41               -100
         2            -11.86              -21.06               0
         3            -89.77              -30.08               -98.39
         4            -90.55              -32.39               -100
         5            -96.37              -41.54               -97.94

Table IV. Controlled Non-Linear Two-Area System

         Controller   Change in maximum   Change in settling   Change in steady
                      overshoot/%         time/%               state error/%
Δf,      PID          -15.14              -62.96               -99.86
Area 1   Fuzzy-PID    -25.99              -65.13               -99.82
Δf,      PID          -16.73              -62.42               -99.96
Area 2   Fuzzy-PID    -26.8               -64.86               -99.78
ΔPtie    PID          +3.33               -83.36               -98.67
         Fuzzy-PID    +2.08               -82.24               -99.23

Figure 3. Linear two-area system with Fuzzy-PID controller response under different load changes.
Figure 4. Non-linear two-area system with Fuzzy-PID controller response under different load changes.
IV. CONCLUSION
From the simulation results obtained, it can be concluded that, for the linear two-area system, the Fuzzy-PID controller offered the best overall performance, with elimination of the steady-state error, more than 80% reduction of the maximum overshoot, and over 25% reduction in settling time. It was also found that adding GRC and GDB as nonlinearities to the system leads to larger settling times, higher maximum overshoot and larger steady-state errors. Moreover, upon designing controllers for the non-linear two-area system, the Fuzzy-PID controller only gave slightly better performance than the anti-windup controller.

Hence, it can be concluded that Fuzzy-PID controllers should be considered, and that following the trend of mostly using PID controllers in industrial environments may not always lead to the best control action. However, it was also observed that there is no single controller which gives the best control action in all situations. Instead, each controller performs differently when applied to the different systems under consideration.

REFERENCES
[1] Anderson, P. M. & Fouad, A. A., 1977. Power System Control and Stability. Ames, Iowa: The Iowa State University Press.
[2] El-Saady, G. et al., 2013. A new robust technique LFC of multi-area power system using SMES. International Journal of Control, Automation and Systems, 1(2).
[3] Babulu, K. & Kumar, K., 2012. Fuzzy self-adaptive PID controller design for electric heating surface. International Journal of Engineering Inventions, 1(5), pp. 10-21.
[4] Sharma, D., 2011. Designing and modeling fuzzy control systems. International Journal of Computer Applications, 16(1), pp. 46-53.
[5] Chen, G. & Pham, T. T., 2001. Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems. New York: CRC Press.
[6] Ramesh, S. & Krishnan, A., 2010. Fuzzy rule based load frequency control in a parallel AC-DC interconnected power systems through HVDC link. International Journal of Computer Applications, 1(4).
[7] Altas, I. H. & Neyens, J., 2006. A fuzzy logic load-frequency controller for power systems. Turkey, s.n.
[8] Vavilala, S. K., Srinivas, R. S. & Machavarapu, S., 2014. Load frequency control of two area interconnected power system using conventional and intelligent controllers. Journal of Engineering Research and Applications, 4(1), pp. 156-160.
[9] Datta, M., Senjyu, T., Yona, A. & Funabashi, T., 2011. A fuzzy based method for leveling output power fluctuations of photovoltaic-diesel hybrid power system. Renewable Energy: An International Journal, Volume 36, pp. 1693-1703.
[10] Sivanandam, S. N., Sumathi, S. & Deepa, S. N., 2007. Introduction to Fuzzy Logic using Matlab. New York: Springer.
[11] Venkata, P. B. & Kumar, J. S., 2005-2008. Load frequency control for a two area interconnected power system using robust genetic algorithm controller. Journal of Theoretical and Applied Information Technology, 4(12), pp. 1204-1212.
[12] Jang, R. J., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man and Cybernetics, 23(3), pp. 665-685.
[13] Loganathan, C. & Girija, K. V., 2013. Hybrid learning for adaptive neuro fuzzy inference system. Research Inventy: International Journal of Engineering and Science, 2(11), pp. 06-13.
[14] Beaufays, F., Abdel-Magid, Y. & Widrow, B., 1994. Application of neural networks to load-frequency control in power systems. Neural Networks, 7(1), pp. 183-194.
[15] Panda, G., Panda, S. & Ardil, C., 2009. Automatic generation control of interconnected power system with generation rate constraints by hybrid neuro fuzzy approach. International Journal of Electrical and Electronics Engineering, 3(9).
[16] Kumari, N. & Jha, A. N., 2013. Effect of generation rate constraint on load frequency control of multi area interconnected thermal systems. Journal of Electrical and Electronic Engineering Research, 5(3), pp. 44-49.
[17] Elgerd, O. I., 1973. Electric Energy Systems Theory: An Introduction. New Delhi: Tata McGraw-Hill Publishing Company Ltd.
[18] Ndubisi, S. N., 2010. An intelligent fuzzy logic controller applied to multi-area load frequency control. American Journal of Scientific and Industrial Research, 1(2), pp. 220-226.
[19] Prakash, S. & Sinha, S. K., 2011. Load frequency control of three area interconnected hydro-thermal reheat power system using artificial intelligence and PI controller. International Journal of Engineering, Science and Technology, 4(1), pp. 23-37.
Observer-based Control for Biomass Regulation in
Wastewater Treatment Plants
R. Ramjug-Ballgobin and H. C. S. Rugooputh
Faculty of Engineering, University of Mauritius, Reduit, Mauritius

K. Busawon and R. Binns
Faculty of Engineering and Environment, Northumbria University, Newcastle upon Tyne, NE1 8ST, United Kingdom
Abstract—In this paper, an estimation and control design methodology for biomass concentration in bioreactors is presented. For this, a feedback linearising control was designed to perform biomass regulation. Since the controller depends on the measurements of the biomass concentration, an observer was designed to estimate the latter. After that, an observer and an observer-based controller were applied to the system. A simulation study showed the good convergence features of the proposed observer-based controller using Matlab/Simulink.

Keywords—bioreactor; biomass concentration; observer; controller.

I. INTRODUCTION

During the last decades, biological and biotechnical processes have gained significant importance in industry. Some common applications are the production of certain chemical compounds by microorganisms, the cultivation of a specific biomass for its utilization, the extraction of its metabolites and the degradation of pollutants, such as in biological wastewater treatment plants. As a result, bioreactors must use sophisticated control procedures to ensure a satisfactory and efficient performance. The regulation of the bioreactor is, however, a complex problem since exact biological models are not available in most cases. One way to overcome this difficulty is to use mass-balance-based modelling [1], where the unknown biological components are included in the kinetics of the bioreaction.

In this work, we focus on biological wastewater treatment plants. The absence of a reliable sensor for the on-line measurement of water quality parameters is a major problem that needs to be tackled. The difficulty resides in the fact that many of these parameters cannot be measured by on-line sensors, and the accuracy of the available hardware sensors is either inadequate or too costly. Also, the latter gives rise to maintenance problems which can result in the system's malfunction. This has led to an increased interest among researchers in the problem of observer design [2-6] for the estimation of biomass concentration in wastewater treatment plants (see e.g. [1], [7-10]). For instance, since wastewater treatment control amounts to studying a simple microbial growth reaction, nonlinear observer-based estimators have been proposed for on-line estimation of kinetic rates inside bioreactors [7].

In this paper, we propose an observer-based control to regulate the biomass concentration for a bioreactor that emulates the dynamics of a wastewater treatment system. First, we show that such a control should be bounded and should obey some practical constraints. Consequently, the control law has to be designed such that these constraints are obeyed at all times. Next, a simple observer is designed based on the structure of the system. The controller and the observer are then combined to produce an observer-based control. A simulation study is carried out to show the efficacy of the proposed observer-based controller using Matlab/Simulink. Finally, some conclusions are drawn.

II. SYSTEM MODEL AND ANALYSIS

A. Wastewater Treatment System

The study of wastewater treatment control can be reduced to studying the following simple microbial growth reaction given by:

    dX/dt = μ(t) X(t) − D(t) X(t)
    dS/dt = D(t)(S_in − S) − Ys μ(t) X(t)        (1)

where X is the biomass concentration; S is the substrate concentration; μ, the biomass specific growth rate; D is the dilution rate; S_in, the influent substrate concentration; and Ys is the yield coefficient for substrate concentration.

Figure 1 A schematic of the bioreactor
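For reference, model (1) can be integrated directly. The sketch below assumes the Monod law μ(S) = μ_max S/(Ks + S), which is consistent with the μ_max and Ks values listed in Section IV of the paper; the function names and the Euler scheme are ours.

```python
# Open-loop simulation of model (1). The Monod law mu(S) = mu_max*S/(Ks + S)
# is an assumption consistent with the mu_max and Ks values of Section IV;
# all parameter values below are taken from that section.
MU_MAX, KS = 0.9, 9.0          # h^-1, g/l
D, S_IN, YS = 0.8, 10.0, 2.0   # dilution rate, influent substrate, yield

def mu(S):
    """Monod specific growth rate."""
    return MU_MAX * S / (KS + S)

def simulate(X0=0.1, S0=2.0, dt=0.001, t_end=50.0):
    """Forward-Euler integration of the mass-balance model (1)."""
    X, S = X0, S0
    for _ in range(int(t_end / dt)):
        dX = mu(S) * X - D * X                 # dX/dt = (mu - D) X
        dS = D * (S_IN - S) - YS * mu(S) * X   # dS/dt = D(Sin - S) - Ys mu X
        X, S = X + dt * dX, S + dt * dS
    return X, S
```

Running `simulate()` exhibits washout: since the constant D = 0.8 h⁻¹ exceeds the largest attainable growth rate μ(S_in) ≈ 0.47 h⁻¹, the biomass decays to zero while S approaches S_in, which is precisely why D must be regulated by feedback.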
The observer is given by:

    dŜ/dt = −Ys μ X̂ + D(S_in − y) + k1 (S − Ŝ)
    dX̂/dt = μ X̂ − D X̂ − k2 (S − Ŝ)        (15)

where k1 and k2 are the gains of the observer, which are chosen such that the overall error dynamics are stable. Setting

    ε1 = S − Ŝ,    ε2 = X − X̂

the error dynamics of the observer are given by:

    dε1/dt = −k1 ε1 − Ys μ ε2
    dε2/dt = k2 ε1 + (μ − D) ε2        (16)

That is,

    [dε1/dt]   [ −k1   −Ys μ ] [ε1]   [     0      ]
    [dε2/dt] = [  k2     0   ] [ε2] + [ (μ − D) ε2 ]

The gains of the observer are chosen as follows:

    k1 = 2θ Ys,    k2 = θ² / Ys

so that,

    [dε1/dt]   [ −2θ Ys   −Ys μ ] [ε1]   [     0      ]
    [dε2/dt] = [  θ²/Ys     0   ] [ε2] + [ (μ − D) ε2 ]

The observer is combined with the controller of the previous section. More precisely, the estimate of the biomass is fed into the controller (14) so that we obtain:

    D(X̂) = μ + k − k Xref / X̂        (18)

However, it is not at all guaranteed that the overall system would stay stable under the feedback (17). Consequently, we need to proceed with a stability analysis of the overall system controlled via the observer (separation principle). For this, consider the closed-loop system:

    dX/dt = μX − D(X̂)X
          = μX − D(X)X + D(X)X − D(X̂)X        (19)
          = −k(X − Xref) + (D(X) − D(X̂))X

Also,

    (D(X) − D(X̂))X = k Xref (1/X̂ − 1/X) X = k Xref ε2 / X̂        (20)

Now since ε2 and X̂ are bounded, one can conclude that the overall closed-loop system is stable.
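Combining the observer (15) with the controller (18) can be sketched as follows. Beyond the equations above, this sketch assumes Monod growth μ(S), forward-Euler integration, and clipping of the dilution rate to a physical range [0, 2] h⁻¹; the parameter values are those of Section IV.

```python
# Observer-based control: observer (15) driving controller (18).
# Assumptions beyond the equations in the text: Monod growth mu(S),
# forward-Euler integration, dilution rate clipped to [0, 2] h^-1.
MU_MAX, KS, S_IN, YS = 0.9, 9.0, 10.0, 2.0   # Section IV parameters
THETA, X_REF, K = 5.0, 1.5, 1.0
K1, K2 = 2 * THETA * YS, THETA ** 2 / YS     # observer gains k1, k2

def mu(S):
    return MU_MAX * S / (KS + S)

def run(dt=0.001, t_end=40.0):
    X, S = 2.0, 2.0     # true plant state (X unknown, S measured)
    Xh, Sh = 1.0, 2.0   # observer state, initialised with Sh = y(0)
    for _ in range(int(t_end / dt)):
        y = S
        m = mu(y)
        # controller (18) fed with the estimate, kept inside [0, 2]
        D = min(max(m + K - K * X_REF / max(Xh, 1e-6), 0.0), 2.0)
        dX = m * X - D * X                     # plant (1)
        dS = D * (S_IN - S) - YS * m * X
        dSh = -YS * m * Xh + D * (S_IN - y) + K1 * (y - Sh)   # observer (15)
        dXh = m * Xh - D * Xh - K2 * (y - Sh)
        X, S = X + dt * dX, S + dt * dS
        Xh, Sh = Xh + dt * dXh, Sh + dt * dSh
    return X, S, Xh, Sh
```

With these settings X converges to Xref = 1.5 g/l and the estimate X̂ tracks X, qualitatively reproducing the behaviour reported in Figures 4 and 7.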
IV. RESULTS AND DISCUSSION

A simulation of the bioreactor model given in Figure 2 and the controller is carried out using the following simulation parameters: D = 0.8 h⁻¹, S_in = 10 g/l, μ_max = 0.9 h⁻¹, Ks = 9 g/l, θ = 5, Ys = 2, Xref = 1.5, k = 1.
Figure 4 Profiles of X and X̂ (biomass concentration in g/l versus time in days)
Figure 6 Profile of X under feedback linearising control with Xref = 1.5 and k = 5 (biomass concentration in g/l versus time in days)

Figure 7 shows the profile of X using the observer-based control. One can see that X goes to its target reference value, Xref, as expected. The transient oscillations are due to the dynamics of the observer.

Figure 7 Profile of X under observer-based control (biomass concentration in g/l versus time in days)

[4] W. Haddad, V. Chellaboina and S. Nersesov, Impulsive and Hybrid Dynamical Systems: Stability, Dissipativity and Control, Princeton University Press, 2006.
[5] H. Hammouri, M. Nadri and R. Mota, "Constant gain observer for continuous-discrete time uniformly observable systems", Proceedings of the 45th IEEE Conference on Decision & Control, San Diego, CA, USA, December 13-15, 2006.
[6] I. Karafyllis and C. Kravaris, "From continuous-time design to sampled-data design of observers", IEEE Transactions on Automatic Control, Vol. 54, No. 9, 2009.
[7] M. Farza, K. Busawon and H. Hammouri, "Simple nonlinear observers for on-line estimation of kinetic rates in bioreactors", Automatica, Vol. 34, No. 3, pp. 301-318, 1998.
[8] J.-L. Gouzé and V. Lemesle, "A bounded error observer with adjustable rate for a class of bioreactor models", Proceedings of the European Control Conference, ECC 2001, Porto, Portugal, 2001.
[9] S. Nunez, F. Garelli and H. De Battista, "Second-order sliding mode observer for biomass concentration and growth rate estimation in batch photo-bioreactors", International Journal of Hydrogen Energy, Vol. 39, pp. 8772-8779, 2014.
[10] D. Tingey, "An observer design for state affine systems with application to a bioprocess", Proceedings of Control 2004, Bath, UK, 2004.
Abstract—Thermal convection and fluid flow in porous media have gained increasing research interest in recent years due to the presence of porous media in many engineering applications. Rough set theory has been regarded as a powerful, feasible and effective methodology in the performance of data mining and knowledge discovery activities. This paper introduces a method for building knowledge of the rate of heat transfer (local Nusselt number) considering the free convection flow over a vertical flat plate in a fluid-saturated porous medium in the presence of heat sources or sinks and with nonlinear density temperature variation. First, solutions for a class of coupled nonlinear equations are obtained by using the fourth-order Runge-Kutta method with a shooting technique; second, numerical calculations of the rate of heat transfer for different parameters, such as variable suction/blowing, variable wall temperature exponent, heat source/sink and nonlinear density temperature (NDT) variation, for both uniform and variable permeability cases, are made and presented in tabular form (decision table). Finally, a set of maximally generalized decision rules is generated by using the rough sets methodology. The effectiveness of the obtained results is illustrated by comparing them with previously published work, and they are found to be in excellent agreement. The proposed method effectively decreases the time and complexity of the process of obtaining the rate of heat transfer.

Keywords—Data mining; Knowledge discovery; Rule induction; Rough sets; free convection; porous medium; variable permeability; heat source/sink.

… industry, filtration processes and heat transfer enhancement, especially in high heat flux applications such as cooling of electronic equipment and building insulation [3,4,5,6,7].

In recent years, many approaches have been proposed for extracting hidden relationships holding among pieces of information stored in a given database [8]. Rough set theory is a relatively new mathematical and AI technique proposed by Pawlak and Skowron [9, 10] to handle imprecision, uncertainty and vagueness. As an effective method for feature selection, rough sets can preserve the meaning of the features. It has been widely applied in many fields such as machine learning [11,12], data mining [13] and stock market analysis [14]. The main advantage of rough set theory is that it does not require any preliminary or additional information about the data, like probability in statistics, basic probability assignment in DS theory or the value of possibility in fuzzy set theory.

The rest of the paper is organized as follows. Section 2 discusses the basic concepts of rough set theory, Section 3 the analysis of the proposed problem, Section 4 the results and discussion, and Section 5 concludes the paper.

II. ROUGH SETS

In this context, some essential definitions from rough set theory that are used for extracting decision rules will be recalled.
    IND(B) = { (x, y) ∈ U × U : ∀a ∈ B, a(x) = a(y) }        (1)

The intersection of the reduct sets is called the core, denoted as: CORE = ∩ Reducts.

Once the indiscernibility in the condition concepts is found, the equivalence classes are used to classify the objects considering the available information. So we need the concept of set approximation. Consider the universe U and the partition induced by a relation R, with equivalence classes R(x). Then:

The lower approximation of X is the set containing all objects which with certainty belong to the set X. It can be defined as (Fig. 1(b)):

    RX = ∪_{x∈U} { R(x) : R(x) ⊆ X }        (2)

The upper approximation of X is the set of all objects which possibly belong to X:

    R̄X = ∪_{x∈U} { R(x) : R(x) ∩ X ≠ ∅ }        (3)

The boundary region of X is the difference between the upper and lower approximations:

    BN_R(X) = R̄X − RX        (4)

When a boundary region exists, i.e. when R̄X − RX ≠ ∅, then X is a rough set [15].

III. ANALYSIS

A. Formulation of the Problem

According to the assumptions and analysis mentioned in [6], the problem is described by the coupled nonlinear ordinary differential equations (5)-(6) given there, whose governing parameters are the suction/blowing parameter fw, the permeability parameter, the nonlinear density temperature (NDT) variation parameter and the heat source/sink parameter Q0. The boundary conditions are:

    f(0) = fw,    θ(0) = 1        (7)
    f′(∞) = 0,    θ(∞) = 0        (8)
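Definitions (1)-(4) translate directly into code. In the sketch below, the toy information system (objects with two illustrative attributes) is an assumed example, not the paper's decision table; the function names are ours.

```python
# Toy information system: each object maps to a tuple of attribute values.
# Objects and attributes here are an assumed illustration.
U_SYS = {"x1": (0, 1), "x2": (0, 1), "x3": (1, 0), "x4": (1, 1), "x5": (0, 0)}

def ind_classes(objs, attrs):
    """Equivalence classes of the indiscernibility relation IND(B), eq. (1)."""
    classes = {}
    for obj, values in objs.items():
        key = tuple(values[a] for a in attrs)
        classes.setdefault(key, set()).add(obj)
    return list(classes.values())

def lower_upper(objs, attrs, target):
    """Lower (2) and upper (3) approximations of the target set."""
    lower, upper = set(), set()
    for cls in ind_classes(objs, attrs):
        if cls <= target:      # R(x) contained in X: certainly in X
            lower |= cls
        if cls & target:       # R(x) meets X: possibly in X
            upper |= cls
    return lower, upper
```

With only attribute 0, the set {x1, x2, x3} has an empty lower approximation and a full upper approximation, so its boundary region (4) is non-empty and it is a rough set; adding attribute 1 makes both approximations coincide.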
U | fw | … | … | … | … | Nusselt number
X40 0 0 0 0 2 0.49465
X41 0 0 0.2 0 2 0.67484
X42 0 0 0.5 0 2 0.89321
X11 0 -0.2 0.2 -0.5 2 0.74915 X53 0.2 0.2 0.2 0 2 0.58082
X12 0 -0.2 0.5 -0.5 2 0.91427 X54 0.2 0.2 0.5 0 2 0.85192
X14 0 0 0.2 -0.5 2 0.59573 X56 -0.2 -0.2 0.2 0.5 2 0.80938
X15 0 0 0.5 -0.5 2 0.78808 X57 -0.2 -0.2 0.5 0.5 2 1.01352
X17 0 0.2 0.2 -0.5 2 0.3969 X59 -0.2 0 0.2 0.5 2 0.6788
X18 0 0.2 0.5 -0.5 2 0.63646 X60 -0.2 0 0.5 0.5 2 0.90756
X19 0.2 -0.2 0 -0.5 2 0.68211 X61 -0.2 0.2 0 0.5 2 0.28286
X20 0.2 -0.2 0.2 -0.5 2 0.81969 X62 -0.2 0.2 0.2 0.5 2 0.52239
X21 0.2 -0.2 0.5 -0.5 2 1.00092 X63 -0.2 0.2 0.5 0.5 2 0.78672
X23 0.2 0 0.2 -0.5 2 0.66995 X65 0 -0.2 0.2 0.5 2 0.87567
X24 0.2 0 0.5 -0.5 2 0.87754 X66 0 -0.2 0.5 0.5 2 1.09347
X26 0.2 0.2 0.2 -0.5 2 0.47765 X68 0 0 0.2 0.5 2 0.74712
X27 0.2 0.2 0.5 -0.5 2 0.73033 X69 0 0 0.5 0.5 2 0.98886
X29 -0.2 -0.2 0.2 0 2 0.74812 X71 0 0.2 0.2 0.5 2 0.59363
X30 -0.2 -0.2 0.5 0 2 0.92694 X72 0 0.2 0.5 0.5 2 0.86991
X32 -0.2 0 0.2 0 2 0.60627 X74 0.2 -0.2 0.2 0.5 2 0.94576
X33 -0.2 0 0.5 0 2 0.81135 X75 0.2 -0.2 0.5 0.5 2 1.17846
X35 -0.2 0.2 0.2 0 2 0.43054 X77 0.2 0 0.2 0.5 2 0.81976
X78 0.2 0 0.5 0.5 2 1.0757 X120 0 -0.2 0.5 0 5 1.37089
X79 0.2 0.2 0 0.5 2 0.4187 X121 0 0 0 0 5 0.66832
X80 0.2 0.2 0.2 0.5 2 0.67007 X122 0 0 0.2 0 5 0.93906
X81 0.2 0.2 0.5 0.5 2 0.95941 X123 0 0 0.5 0 5 1.27715
X82 -0.2 -0.2 0 -0.5 5 0.66625 X124 0 0.2 0 0 5 0.48514
X83 -0.2 -0.2 0.2 -0.5 5 0.85675 X125 0 0.2 0.2 0 5 0.79953
X84 -0.2 -0.2 0.5 -0.5 5 1.10629 X126 0 0.2 0.5 0 5 1.17141
X85 -0.2 0 0 -0.5 5 0.49265 X127 0.2 -0.2 0 0 5 0.87965
X86 -0.2 0 0.2 -0.5 5 1.07773 X128 0.2 -0.2 0.2 0 5 1.13416
X87 -0.2 0 0.5 -0.5 5 0.99213 X129 0.2 -0.2 0.5 0 5 1.46507
X88 -0.2 0.2 0 -0.5 5 0.24566 X130 0.2 0 0 0 5 0.74004
X89 -0.2 0.2 0.2 -0.5 5 0.5313 X131 0.2 0 0.2 0 5 1.02071
X90 -0.2 0.2 0.5 -0.5 5 0.85536 X132 0.2 0 0.5 0 5 1.37464
X91 0 -0.2 0 -0.5 5 0.73 X133 0.2 0.2 0 0 5 0.56635
X92 0 -0.2 0.2 -0.5 5 0.93219 X134 0.2 0.2 0.2 0 5 0.88792
X93 0 -0.2 0.5 -0.5 5 1.19903 X135 0.2 0.2 0.5 0 5 1.2738
X94 0 0 0 -0.5 5 0.56308 X136 -0.2 -0.2 0 0.5 5 0.82737
X95 0 0 0.2 -0.5 5 0.79561 X137 -0.2 -0.2 0.2 0.5 5 1.09703
X96 0 0 0.5 -0.5 5 1.08951 X138 -0.2 -0.2 0.5 0.5 5 1.4375
X97 0 0.2 0 -0.5 5 0.33122 X139 -0.2 0 0 0.5 5 0.69476
X98 0 0.2 0.2 -0.5 5 0.3969 X140 -0.2 0 0.2 0.5 5 0.98995
X99 0 0.2 0.5 -0.5 5 0.96057 X141 -0.2 0 0.5 0.5 5 1.35228
X100 0.2 -0.2 0 -0.5 5 0.79654 X142 -0.2 0.2 0 0.5 5 0.53451
X101 0.2 -0.2 0.2 -0.5 5 1.01088 X143 -0.2 0.2 0.2 0.5 5 0.86655
X102 0.2 -0.2 0.5 -0.5 5 1.29574 X144 -0.2 0.2 0.5 0.5 5 1.25787
X103 0.2 0 0 -0.5 5 0.63648 X145 0 -0.2 0 0.5 5 0.89156
X104 0.2 0 0.2 -0.5 5 0.87991 X146 0 -0.2 0.2 0.5 5 1.17102
X105 0.2 0 0.5 -0.5 5 1.19088 X147 0 -0.2 0.5 0.5 5 1.52659
X106 0.2 0.2 0 -0.5 5 0.41982 X148 0 0 0 0.5 5 0.76298
X107 0.2 0.2 0.2 -0.5 5 0.71758 X149 0 0 0.2 0.5 5 1.06708
X108 0.2 0.2 0.5 -0.5 5 1.06951 X150 0 0 0.5 0.5 5 1.44387
X109 -0.2 -0.2 0 0 5 0.74896 X151 0 0.2 0 0.5 5 0.60931
X110 -0.2 -0.2 0.2 0 5 0.98185 X152 0 0.2 0.2 0.5 5 0.94847
X111 -0.2 -0.2 0.5 0 5 1.28018 X153 0 0.2 0.5 0.5 5 1.35302
X112 -0.2 0 0 0 5 0.59898 X154 0.2 -0.2 0 0.5 5 0.95782
X113 -0.2 0 0.2 0 5 0.86022 X155 0.2 -0.2 0.2 0.5 5 1.2475
X114 -0.2 0 0.5 0 5 1.1831 X156 0.2 -0.2 0.5 0.5 5 1.6188
X115 -0.2 0.2 0 0 5 0.40618 X157 0.2 0 0 0.5 5 0.83328
X116 -0.2 0.2 0.2 0 5 0.71378 X158 0.2 0 0.2 0.5 5 1.14669
X117 -0.2 0.2 0.5 0 5 1.07225 X159 0.2 0 0.5 0.5 5 1.53857
X118 0 -0.2 0 0 5 0.81313 X160 0.2 0.2 0 0.5 5 0.68609
X119 0 -0.2 0.2 0 5 1.0566 X161 0.2 0.2 0.2 0.5 5 1.03276
X162 0.2 0.2 0.5 0.5 5 1.45114 X204 0 0 0.5 0 1 0.77044
X163 -0.2 -0.2 0 -0.5 1 0.53672 X205 0 0.2 0 0 1 0.20898
X164 -0.2 -0.2 0.2 -0.5 1 0.63175 X206 0 0.2 0.2 0 1 0.41523
X165 -0.2 -0.2 0.5 -0.5 1 0.75047 X207 0 0.2 0.5 0 1 0.63254
X166 -0.2 0 0 -0.5 1 0.3447 X208 0.2 -0.2 0 0 1 0.67708
X167 -0.2 0 0.2 -0.5 1 0.47346 X209 0.2 -0.2 0.2 0 1 0.80536
X168 -0.2 0 0.5 -0.5 1 0.62074 X210 0.2 -0.2 0.5 0 1 0.96966
X169 -0.2 0.2 0 -0.5 1 0.06615 X211 0.2 0 0 0 1 0.50518
X170 -0.2 0.2 0.2 -0.5 1 0.26787 X212 0.2 0 0.2 0 1 0.66258
X171 -0.2 0.2 0.5 -0.5 1 0.46544 X213 0.2 0 0.5 0 1 0.85191
X172 0 -0.2 0 -0.5 1 0.59057 X214 0.2 0.2 0 0 1 0.27579
X173 0 -0.2 0.2 -0.5 1 0.69387 X215 0.2 0.2 0.2 0 1 0.4865
X174 0 -0.2 0.5 -0.5 1 0.82539 X216 0.2 0.2 0.5 0 1 0.71535
X175 0 0 0 -0.5 1 0.40185 X217 -0.2 -0.2 0 0.5 1 0.59282
X176 0 0 0.2 -0.5 1 0.53715 X218 -0.2 -0.2 0.2 0.5 1 0.71939
X177 0 0 0.5 -0.5 1 0.69614 X219 -0.2 -0.2 0.5 0.5 1 0.87467
X178 0 0.2 0 -0.5 1 0.13034 X220 -0.2 0 0 0.5 1 0.42594
X179 0 0.2 0.2 -0.5 1 0.3342 X221 -0.2 0 0.2 0.5 1 0.58328
X180 0 0.2 0.5 -0.5 1 0.54143 X222 -0.2 0 0.5 0.5 1 0.76407
X181 0.2 -0.2 0 -0.5 1 0.6481 X223 -0.2 0.2 0 0.5 1 0.21209
X182 0.2 -0.2 0.2 -0.5 1 0.76075 X224 -0.2 0.2 0.2 0.5 1 0.42029
X183 0.2 -0.2 0.5 -0.5 1 0.9068 X225 -0.2 0.2 0.5 0.5 1 0.63849
X184 0.2 0 0 -0.5 1 0.46362 X226 0 -0.2 0 0.5 1 0.64718
X185 0.2 0 0.2 -0.5 1 0.60659 X227 0 -0.2 0.2 0.5 1 0.78127
X186 0.2 0 0.5 -0.5 1 0.77906 X228 0 -0.2 0.5 0.5 1 0.94859
X187 0.2 0.2 0 -0.5 1 0.201 X229 0 0 0 0.5 1 0.48259
X188 0.2 0.2 0.2 -0.5 1 0.40807 X230 0 0 0.2 0.5 1 0.64604
X189 0.2 0.2 0.5 -0.5 1 0.62657 X231 0 0 0.5 0.5 1 0.83802
X190 -0.2 -0.2 0 0 1 0.56535 X232 0 0.2 0 0.5 1 0.27174
X191 -0.2 -0.2 0.2 0 1 0.67692 X233 0 0.2 0.2 0.5 1 0.48398
X192 -0.2 -0.2 0.5 0 1 0.81487 X234 0 0.2 0.5 0.5 1 0.7123
X193 -0.2 0 0 0 1 0.38726 X235 0.2 -0.2 0 0.5 1 0.70482
X194 -0.2 0 0.2 0 1 0.53104 X236 0.2 -0.2 0.2 0.5 1 0.84739
X195 -0.2 0 0.5 0 1 0.69591 X237 0.2 -0.2 0.5 0.5 1 1.02829
X196 -0.2 0.2 0 0 1 0.14792 X238 0.2 0 0 0.5 1 0.54317
X197 -0.2 0.2 0.2 0 1 0.35067 X239 0.2 0 0.2 0.5 1 0.71373
X198 -0.2 0.2 0.5 0 1 0.55796 X240 0.2 0 0.5 0.5 1 0.91845
X199 0 -0.2 0 0 1 0.61949 X241 0.2 0.2 0 0.5 1 0.33656
X200 0 -0.2 0.2 0 1 0.7389 X242 0.2 0.2 0.2 0.5 1 0.55373
X201 0 -0.2 0.5 0 1 0.88921 X243 0.2 0.2 0.5 0.5 1 0.79361
X202 0 0 0 0 1 0.44377
X203 0 0 0.2 0 1 0.59416
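The Nusselt numbers in the decision table are produced by fourth-order Runge-Kutta integration combined with a shooting technique. Since equations (5)-(6) are not reproduced above, the sketch below applies the same RK4-plus-shooting machinery to the classical uniform-permeability limit of this problem (the Cheng-Minkowycz similarity equations f′ = θ, θ″ + f θ′/2 = 0), used here purely as a stand-in; its wall gradient −θ′(0) ≈ 0.444 is a well-known check value.

```python
# RK4 + shooting sketch on a stand-in boundary-value problem: the classical
# Cheng-Minkowycz equations f' = theta, theta'' = -f*theta'/2, with
# f(0) = 0, theta(0) = 1, theta(inf) = 0. The paper's equations (5)-(6)
# add variable permeability, NDT and heat source/sink terms to this limit.
def _deriv(s):
    f, th, g = s                   # g = theta'
    return (th, g, -0.5 * f * g)   # f' = theta, theta' = g, g' = -f*g/2

def rk4_step(state, h):
    k1 = _deriv(state)
    k2 = _deriv(tuple(s + h / 2 * k for s, k in zip(state, k1)))
    k3 = _deriv(tuple(s + h / 2 * k for s, k in zip(state, k2)))
    k4 = _deriv(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def shoot(g0, eta_max=15.0, h=0.01):
    """Integrate from the wall with guessed slope theta'(0) = g0."""
    state = (0.0, 1.0, g0)
    for _ in range(int(eta_max / h)):
        state = rk4_step(state, h)
    return state[1]                # theta at the far boundary

def solve(lo=-1.0, hi=-0.1, iters=60):
    """Bisect on the initial slope until theta(eta_max) hits zero."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if shoot(mid) < 0.0:       # overshoot: slope too steep
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In such similarity solutions the local Nusselt number is proportional to −θ′(0), so each row of the table corresponds to one converged shooting run for one parameter combination.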
B. The Rough Set Framework for Rule Induction

The complete process used to generate a set of decision rules, which are then classified to assess their performance using the rough sets methodology, is shown in Fig. 2.

An increase in the values of the parameters, including fw, enhances the rate of heat transfer. As the NDT parameter increased, the range of the local Nusselt number reduced.

Physically, this means that as ε increases, the rate of heat transfer increases in magnitude, because the increased near-wall permeability allows the fluid to convect heat away more quickly than it would in the uniform-permeability case, thereby thinning the boundary layer and increasing the temperature gradient.
Rules
.....
Rule 85: IF (fw ∈ (-Inf,-0.1)) AND (Q0 ∈ (-0.1,0.1)) AND (… ∈ (-Inf,0.1)) AND (… ∈ (-Inf,0.25)) AND (… ∈ (3.5,Inf)) THEN Nu = {0.49265}
Rule 118: IF (fw ∈ (-0.1,0.1)) AND (Q0 ∈ (-Inf,-0.1)) AND (… ∈ (-Inf,0.1)) AND (… ∈ (-0.25,0.25)) AND (… ∈ (3.5,Inf)) THEN Nu = {0.81313}
Rule 212: IF (fw ∈ (0.1,Inf)) AND (Q0 ∈ (-0.1,0.1)) AND (… ∈ (0.1,0.35)) AND (… ∈ (-0.25,0.25)) AND (… ∈ (-Inf,1.5)) THEN Nu = {0.66258}
Rule 243: IF (fw ∈ (0.1,Inf)) AND (Q0 ∈ (0.1,Inf)) AND (… ∈ (0.35,Inf)) AND (… ∈ (0.25,Inf)) AND (… ∈ (-Inf,1.5)) THEN Nu = {0.79361}
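Rules of this shape can be assembled mechanically from decision-table rows once discretization cut points are fixed. In the sketch below, the attribute names and cut points are illustrative stand-ins, not the reduct-derived ones computed by the paper's rough-set procedure.

```python
# Assembling a discretized IF-THEN rule from a decision-table row.
# Attribute names and cut points are illustrative assumptions.
CUTS = {"fw": [-0.1, 0.1], "Q0": [-0.1, 0.1], "sigma": [1.5, 3.5]}

def interval(attr, value):
    """Map a numeric value onto its discretization interval label."""
    lo, hi = float("-inf"), float("inf")
    for c in CUTS[attr]:
        if value < c:
            hi = c
            break
        lo = c
    return f"({lo},{hi})"

def make_rule(row, nu):
    """Render one decision rule in the style of the listing above."""
    conds = " AND ".join(f"{a} in {interval(a, v)}" for a, v in row.items())
    return f"IF {conds} THEN Nu={{{nu}}}"
```

For example, `make_rule({"fw": -0.2, "sigma": 5}, 0.49265)` yields `IF fw in (-inf,-0.1) AND sigma in (3.5,inf) THEN Nu={0.49265}`.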
TABLE III. THE VALUES OF THE RATE OF HEAT TRANSFER IN TERMS OF THE LOCAL NUSSELT NUMBER, COMPARING THE RESULT OF EL-KABEIR ET AL. [4] WITH THE PROPOSED METHODOLOGY

U | fw | Q0 | … | Nusselt Number (El-Kabeir et al. [4]) | Nusselt Number (proposed methodology)
Abstract—Existing wireless channel Multiple-Input Multiple-Output (MIMO) interference models demonstrate that full duplex communication in a k-link MIMO system suffers from both self-interference and inter-user interference [1]. However, the fact that nodes exchange information simultaneously (transmission and reception activities) in a bi-directional manner on each of the k considered links comes with considerable potential gain in terms of channel capacity. On the other hand, half-duplex communication between the k MIMO links comes with the advantage that it does not suffer from self-interference. Half-duplex MIMO communication, however, comes with the drawback that it only achieves single-link channel capacity, and it can still result in inter-user interference. This paper proposes a mixed strategy which takes advantage of the Full-duplex MIMO capacity gain while reducing self-interference by tapping into the Half-duplex potential, and also reduces inter-user interference, by interleaving Full-duplex and Half-duplex communications on the k MIMO links. This is done by designing optimum transmit and receive filters for Weighted Sum Rate (WSR) maximisation using Rosen's gradient projection method. After proper modeling and MATLAB simulation, the obtained results show that the Full-Half duplex mixed strategy enhances the overall channel capacity as compared to the k-link MIMO Full-duplex WSR maximisation approach.

Index Terms—Full-duplex, Half-duplex, Multiple-Input Multiple-Output (MIMO), self-interference, inter-user interference, mixed strategy, filter design, Weighted Sum Rate (WSR).

I. INTRODUCTION

We live in an age of globalization; the world is becoming a small village. This has triggered the need to interconnect everything through what has started to be called the Internet of Things (IoT) [2]. The proliferation of such applications has created a serious demand in terms of high data rates, which has therefore ignited the quest for achieving more and more spectral efficiency. Research works have been trying to investigate ways to push the boundaries of wireless channel utilization by proposing many possible solutions in order to use as much capacity of the wireless channel as possible. Multiple-Input Multiple-Output (MIMO) technologies, as well as opportunistic communication such as cognitive radio, have been demonstrated to be very powerful technologies that help improve the efficiency of wireless spectrum utilization.

The radio spectrum is becoming increasingly scarce as more and more devices go wireless. Opportunistic communication technologies, such as cognitive radios, are promising technologies in wireless communications for enhancing the spectrum efficiency of the channel. It is also important to note that the ideas of MIMO and opportunistic communication are not exclusive. There exist wireless communication systems which consist of both a cognitive radio and a MIMO topology with the aim of achieving higher spectral efficiency [5].

The MIMO technology provides spatial multiplexing by using multiple transmit and receive antennas instead of a single pair. It has therefore proven to be capable of tremendously improving communication data rate capabilities in more than one current application, ranging from sensor networks to cellular systems. However, the use of multiple antennas comes with its own challenges.

The interference between the multiple transmissions remains one of the main challenges faced by MIMO technologies. The advent of MIMO technologies has shifted the focus of wireless channel modeling from a fading channel to an interference channel. The latter approach pays much more attention to the multiplicative channel impairments such as interference, while the fading channel modeling mainly considers the additive effect of noise on the transmitted signal. Therefore, handling interference at both the transmitter and the receiver design level has become the purpose of many research works [3]. The purpose of this work is to tap into the potential of both the Full-duplex and the Half-duplex models as a possible way to enhance spectral efficiency by interleaving them on a single k-link MIMO channel.

II. RELATED WORK

Most existing MIMO technologies have so far been fully Half-duplex [4] or fully Full-duplex [1]. Most of them have been either Time-Division Duplex (TDD) or Frequency-Division Duplex (FDD). This means that, despite the increase in channel capacity provided by MIMO technologies, most of them have either simply provided the possibility of using only half the capacity of a link, as they have not been able to allow transmission and reception to occur concurrently, or they have evaluated the impact of using Full-duplex communication for all the k links. The channel model is depicted in Figure 1; it clearly shows the different considered channel impairments at each link.
… problems for bi-directional full-duplex systems were studied in [10]. Furthermore, weighted sum-rate maximisation for full-duplex systems with multiple pairs of nodes, or full-duplex MIMO interference channels, has been considered in [1]. It comes with the expectation of doubling the link capacity as a major advantage; however, it suffers from a very strong self-interference at each of the links' nodes. Therefore, the present work proposes a strategy which still allows a gain in terms of channel capacity while minimizing the negative effect of self-interference at the front-end of the receiver antennas.

III. MIXED STRATEGY: MODELING AND ALGORITHM

This section discusses the proposed model which leads to the formulation of the optimisation problem. It also briefly explains how the formulated problem is solved by using the gradient projection optimisation algorithm for inequality constraints.

Notation: The conjugate transpose of a matrix A is noted A^H throughout the modeling process.

A. Modeling of the system

This study proposes the utilisation of full-duplex MIMO interference channels as considered in [1], interleaved with half-duplex MIMO systems. There are two nodes on each link, and each full-duplex link has two neighbouring half-duplex links, giving (K+1)/2 and (K−1)/2 Full-duplex and Half-duplex links respectively when K is odd. Assuming that data streams are transmitted from node i(a), the data first goes through a transmit filter V_i^(a), also referred to as the precoding matrix; the transmitted signal is modelled as a Gaussian distributed, zero-mean random vector given by:

    x_i^(a) = V_i^(a) d_i^(a)        (3)

where:
• d_i^(a) is the complex, zero-mean and i.i.d. vector of transmitted data streams at node i(a);
• x_i^(a) is the N_i x 1 signal vector transmitted by node i(a).

Fig. 1. System model: K MIMO links, each with nodes i(1) and i(2) carrying (N_i, M_i) antennas; odd-indexed links operate in Full-duplex (FD) and even-indexed links in Half-duplex (HD), with channel gains of the form sqrt(ρ)H and sqrt(η)H modelling the inter-user and self-interference paths and additive noise n_i at each receiver.
The undistorted received signal μ_i^(a) in both cases (Full-duplex and Half-duplex) can be modeled as follows:

    μ_FD,i^(a) = y_FD,i^(a) − e_i^(a)    and    μ_HD,i^(a) = y_HD,i^(a) − e_i^(a)

    … + γ ρ_i^(ab) diag( H_ii^(ab) V_i^(b) V_i^(b)H H_ii^(ab)H ) + γ η_ii^(aa) diag( H_ii^(aa) V_i^(a) V_i^(a)H )        (8)
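The capacity trade-off that motivates the mixed strategy can be illustrated with a deliberately simplified scalar model. The SNR and the interference fractions beta (residual self-interference) and rho (inter-user interference) below are illustrative assumptions, not the paper's MATLAB settings or its MIMO filter model.

```python
from math import log2

# Scalar toy comparison of all-Full-duplex versus the FD/HD mixed strategy.
# beta: residual self-interference fraction; rho: inter-user interference
# fraction. Both are assumed illustrative values.
def fd_rate(snr, beta, rho):
    """One FD link: two simultaneous directions, each self-interfered."""
    sinr = snr / (1.0 + beta * snr + rho * snr)
    return 2.0 * log2(1.0 + sinr)

def hd_rate(snr, rho):
    """One HD link: a single direction, no self-interference."""
    return log2(1.0 + snr / (1.0 + rho * snr))

def all_fd(K, snr, beta=0.2, rho=0.02):
    """Sum rate with every link Full-duplex."""
    return K * fd_rate(snr, beta, rho)

def mixed(K, snr, beta=0.2, rho=0.005):
    """Interleaved FD/HD links: every link sees less inter-user
    interference (smaller rho) and HD links avoid self-interference."""
    n_fd = K // 2
    return n_fd * fd_rate(snr, beta, rho) + (K - n_fd) * hd_rate(snr, rho)
```

At 20 dB SNR with strong residual self-interference (beta = 0.2), `mixed(4, 100.0)` exceeds `all_fd(4, 100.0)`, mirroring the qualitative conclusion drawn from Fig. 5.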
In order to design optimal transmit and receive filters for …

Fig. 5. Spectral efficiency performance comparison between FD and Mixed-strategy (FD-HD) with power level variation (sum-rate in bps/Hz versus SNR in dB)

Full-duplex (FD) versus Mixed Strategy (FD-HD) convergence behavior (sum-rate in bps/Hz)

… been investigated. After a tedious modeling of the proposed strategy in terms of the different involved channel impairments, such as interference (both self-interference and inter-user interference), … the gains related to the increase in the number of antennas are clearly exhibited. This work therefore contributes towards the current research in terms of MIMO system design with the aim of achieving higher spectral efficiency. Further work could consist of an investigation into applying the mixed-strategy model proposed in this paper in a cellular system scenario, for example.

REFERENCES
Interval addition of x = [a, b] and y = [c, d] gives z = x + y = [a + c, b + d] = [e, f], where:

a = a31 a30 … a0, aj ∈ {0, 1}
b = b31 b30 … b0, bj ∈ {0, 1}
c = c31 c30 … c0, cj ∈ {0, 1}
d = d31 d30 … d0, dj ∈ {0, 1}
e = e31 e30 … e0, ej ∈ {0, 1}
f = f31 f30 … f0, fj ∈ {0, 1}

a31, b31, c31, d31, e31 and f31 are the sign bits: if a31 = 0 the number is positive; if a31 = 1 the number is negative.

Consider the input data sample a = 0.54. The following steps indicate the conversion from a floating point real number to Q format. Normalization step to get the data in the range 0 < a < 1: a = 0.54/4 = 0.1350. Then 0.1350 × 2^30 = 144955146.2; round this number: a = (144955146)d.
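The conversion steps above can be sketched as follows; the normalization divisor of 4 mirrors the worked example, and the helper names are ours.

```python
# Q-format conversion following the worked example: normalize into (0, 1),
# scale by 2**30 and round to the nearest integer. Helper names are a
# sketch of the text's procedure, not a fixed API.
Q_FRACTION_BITS = 30

def to_q(value, norm=4.0):
    """Convert a real number to its integer Q-format representation."""
    scaled = value / norm                 # normalization step, 0 < scaled < 1
    assert 0.0 < scaled < 1.0, "value must normalize into (0, 1)"
    return round(scaled * (1 << Q_FRACTION_BITS))

def from_q(q, norm=4.0):
    """Recover the real value from the Q-format integer."""
    return q / (1 << Q_FRACTION_BITS) * norm
```

`to_q(0.54)` gives 144955146, matching the rounded value in the example; the round-trip error is bounded by one least-significant fractional bit.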
Do we need to move towards more reliable computing systems? In critical mission control system applications, is it sufficient to obtain a single solution, or is it necessary to obtain a bound on the solutions?

There are four 32 bit Q format Booth multipliers. The input operands to the first multiplier are the values a and c, to the second multiplier a and d, to the third multiplier b and c, and to the fourth multiplier b and d.

Experimental results

The interval arithmetic Q format 32 bit multiplier is implemented using the Xilinx ISE 14.2 tool suite, a design and simulation environment for Xilinx devices. The area and delay estimates of the design are given in Fig. 9.

Figure 7. Max comparator for multiplier upper bound

Fig. 7 illustrates the multiplier upper bound computation unit. The architecture is similar to the lower bound computing unit, except that now the comparison is to obtain the maximum of two operands.
Computational time is more; hence the computational cost is more.

Interval Q format multiplier:
- 32 bit Q format representation; the MSB is the sign bit.
- Booth multiplier, CSD or Wallace tree: any multiplier design can be used.
- Four multiplication operations to obtain a bounded solution.
- Three comparisons to obtain the lower bound; three comparisons to obtain the upper bound.

One 32 bit Q multiplier using the Booth algorithm: a) uses signed integers; b) only integer multiplications; c) …

One 32 bit Booth multiplier: no addition and subtraction for exponents, no normalization.

Four 32 bit Booth multipliers: no exponent additions, no exponent subtractions.
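The bullet points describe exactly the operations a software model of the unit performs. A minimal sketch, assuming plain Python numbers stand in for 32 bit Q format words:

```python
# Software model of the interval unit: addition needs one add per bound;
# multiplication needs the four cross products (the four Booth multipliers
# in hardware) followed by min/max selection (three comparisons per bound).
def iv_add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)                     # z = [a + c, b + d]

def iv_mul(x, y):
    (a, b), (c, d) = x, y
    products = (a * c, a * d, b * c, b * d)   # the four multiplications
    return (min(products), max(products))     # lower / upper bound selection
```

For example, `iv_mul((-1, 2), (3, 4))` returns `(-4, 8)`: the bounds cannot be read off the endpoints alone when signs mix, which is why all four products must be computed and compared.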
Computation Beyond Moore’s Law:
Adaptive Field-Effect Devices for Reconfigurable
Logic and Hardware-Based Neural Networks
Udo Schwalke
Institute for Semiconductor Technology and Nanoelectronics
Technische Universität Darmstadt
Darmstadt, Germany
schwalke@iht.tu-darmstadt.de
Abstract—The success of integrated silicon technology is based on the down-scaling of minimum feature sizes of silicon field-effect devices (MOSFETs) in a complementary circuit configuration (CMOS) according to Moore's Law. Reducing the feature size provides more components per chip and higher speed. However, this continuous miniaturization of MOSFETs will come to an end as CMOS scaling will soon approach atomic dimensions. Taking computation beyond Moore's Law requires breaking with at least two major paradigms: (1) high computing performance is directly related to the high switching speed of the single device, and (2) the separation of memory and computing. In this work we report on a novel adaptive nanowire field-effect transistor (a-NWFET) architecture which provides a release from paradigms (1) and (2). The fabricated a-NWFETs are originally ambipolar nanowire devices, using midgap Schottky-barrier contacts as source and drain (S/D) electrodes. The final unipolar a-NWFET device type (i.e. NMOS or PMOS) is created by applying an electric bias at the back-gate. The ability to select the transistor type by an electrical signal at the back-gate adds to the versatility of the device concept: the two complementary device types are interchangeable on the fly. A simple and versatile device structure for logic and intrinsic memory applications, with the potential to realize novel reconfigurable logic architectures and hardware-based neural networks, will be presented.

Keywords—Adaptive Field-Effect Transistor; Moore's Law; Reconfigurable Logic; Neural Network.

I. INTRODUCTION

Like no other technology, integrated electronics has changed our daily life, and silicon has been the ultimate semiconductor material in micro- and nanoelectronics for more than 50 years. The continuous down-scaling of silicon field-effect devices (MOSFETs) [1] in a complementary configuration (CMOS) [2] according to Moore's Law [3] provides the basis of the tremendous progress in information technology. Thanks to silicon CMOS, we can enjoy, for example, the rich multimedia experience of the internet, mobile phones and tablet PCs. In fact, without the present hardware technology platform, many (if not all) areas of computing, communication, networking and information processing, such as big data analysis, as we know them today would simply not exist.

The success of integrated silicon technology is based on the scaling of minimum feature sizes. Within the past 50 years, microelectronics has completed the transition into nanoelectronics, i.e. 100 µm (micrometer) "big" devices have been downscaled into the sub-100 nanometer range. For example, today's state-of-the-art advanced silicon CMOS technologies utilize feature sizes of 22 - 14 nm, five times below the size of a virus. Reducing the feature size provides more components per chip and higher speed. However, this continuous top-down miniaturization of MOSFETs will come to an end [4] as CMOS downscaling will soon approach atomic dimensions.

Taking computation beyond Moore's Law requires breaking with at least two major paradigms: (1) high computing performance is directly related to the high switching speed of the single device, and (2) the separation of memory (e.g. DRAM) and computing (e.g. CPU). At the same time, it would be highly desirable for any new approach to be fully compatible with the established silicon-CMOS integrated circuit technology, in order to make use of most of the existing semiconductor industry.

In conventional CMOS technology, NMOS- and PMOS-FETs are hardware-defined by choosing the appropriate doping of the source (S) and drain (D) junctions with respect to the substrate. In this work, however, we report on a novel adaptive nanowire field-effect transistor (a-NWFET) architecture which provides a release from paradigms (1) and (2). The fabricated a-NWFETs are originally ambipolar silicon nanowire devices, using midgap Schottky-barrier contacts as source and drain (S/D) electrodes. The final unipolar a-NWFET device type (i.e. NMOS or PMOS) is selected by the electric bias at the back-gate. The ability to select the transistor type by an electrical signal at the back-gate adds to the versatility of the device concept: the two complementary device types are interchangeable on the fly. This simple and versatile device structure for logic and intrinsic memory applications has the potential to realize novel reconfigurable logic architectures and hardware-based neural networks.
[Figure: a-NWFET transfer characteristics. (a) |IDS| (A) vs. VBG (V) for N-type and P-type NWFET operation, showing the hole and electron branches; (b) IDS (A) vs. VFG (V) for N- and P-NWFETs before and after the write cycle (VBG = ±7.5 V), showing the memory window.]
the list of controls for this grid, with recommended standard settings that can be modified according to the particular need. The user's first step is to determine which controls are mandatory or recommended. He must then assign a weighting to each control (a value between 0 and 1). For the score, the user enters the degree of compliance of each control, a value between 0 and 100.

Fig. 2. Sequence to follow for the proposed framework: (1) controls, metrics and recommendations that provide immediate protection without further cost or change of infrastructure; (2) tighter controls, metrics and recommendations for monitoring and controlling the effectiveness of such controls; (3) reduction of weak safety practices, vulnerabilities and their impact, with maintenance and appropriate control settings; (4) reduction of the chances of success of determined attackers, through installation of high-cost controls.

B. Framework Function

Each grid formed between a phase and an iteration will contain a number of controls to be implemented and complied with. The sum of the compliance of each of the controls gives the level of compliance of the grid. The framework can also calculate the degree of compliance not only of each grid but also by phase and by iteration. Fig. 3 shows an example of the proposed interface, where a classification by colours indicates the compliance levels.

The total of the grid is determined as follows:

1) Calculate the subtotal (maximum) of each security control: Subtotal (maximum) = entered weighting × 100.

2) Calculate the subtotal (real) of each security control: Subtotal (real) = entered weighting × score obtained.

3) The total of the grid is the ratio between the sum of the real subtotals and the sum of the maximum subtotals, multiplied by 100 to obtain a percentage.

Following the example of Fig. 4, the total of the grid is equal to (407.60 / 462) × 100 = 88.22%.
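The grid total in steps 1)-3) above can be sketched in a few lines of code. The individual control weightings and scores below are hypothetical; only the aggregate check (407.60 / 462 ≈ 88.22%) comes from the paper's Fig. 4 example.

```python
def grid_total(controls):
    """Compute a grid's compliance percentage.

    controls: list of (weighting, score) pairs, with weighting in [0, 1]
    and score (degree of compliance) in [0, 100].
    """
    subtotal_max = sum(w * 100 for w, _ in controls)   # step 1: subtotal (maximum)
    subtotal_real = sum(w * s for w, s in controls)    # step 2: subtotal (real)
    return 100 * subtotal_real / subtotal_max          # step 3: percentage

# Hypothetical grid with three controls (weighting, score):
example = [(1.0, 90), (0.8, 75), (0.5, 100)]
print(round(grid_total(example), 2))  # → 86.96
```

Note that a control with weighting 0 contributes nothing to either subtotal, so in practice only controls marked mandatory or recommended (weighting > 0) affect the grid total.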
Abstract—Climate Change concepts are hard to impress on the minds of adolescent learners in a traditional classroom environment. Designing appropriate instructional strategies with the best visual experiences enables learners to grasp the complex principles behind the concept and stimulates their interest. Hence, technology can be seen as a panacea. In this regard, a study was conducted to explore Form II learners' views on learning Climate Change with a blended learning approach. The theoretical context of this research is underpinned by socio-constructivism. As methodology, an action research was carried out by implementing an interactive educational software, designed based on learners' requirements. A questionnaire was administered to 30 mixed-ability learners from a mixed-gender private secondary school with the aim of collecting their learning experience. The active participation of the learners during the face-to-face session demonstrated that, through blended learning, they had a clearer understanding and appreciated the concept, with a marked improvement in learners' performance. Hence, this active and constructive learning strategy encouraged collaboration and cooperation, and celebrated the autonomy of the learners. Moreover, learners were motivated to take action and address problems in the community, developing their civic responsibility for sustainable development.

Keywords—Climate Change; Blended learning; Socio-constructivism

I. INTRODUCTION

Environmental education cuts across multiple disciplines such as the Sciences, Arts and Humanities. The study of Climate Change has been under-emphasised in most secondary education curricula, and the capacity to support the integration of climate change science is still relatively limited [1]. Imparting the science of climate change is not always easy, as the content itself is difficult to comprehend. As such, technology helps learners to visualise far-field Climate Change impacts and improve their global perspectives. Also, since Climate Change consists of theoretically complex models and hypotheses involving interdisciplinary connections, the traditional classroom is unable to accommodate the varying learning styles of learners, or to provide adequate explanation and suitable, timely feedback in this digital era [2]. The traditional didactic strategies used to educate learners about issues of global climate change are inappropriate. Traditional ways of teaching, which are largely based on the infusion of knowledge, are inapt in helping learners to apply knowledge in understanding the real issues of everyday life [3]. Hence, innovative instructional approaches and techniques must be designed. The science of Climate Change demands a variety of strategies, techniques and aids to teach, because the content itself is quite intricate to comprehend. Learners demonstrate difficulty grasping the complex systems involved, such as the greenhouse effect [4]. Learners, in fact, face scepticism, as they build mental models of the Climate Change concept which are not aligned with the models taught in class. They enter the class with preconceived notions on Climate Change and, unfortunately, educators are only able to probe the accuracy of learners' prior knowledge after summative assessment [5].

Scholars contend that with blended learning, learners are better able to clarify any lesson at any time and enrich socialisation opportunities while, simultaneously, it enables educators to share content, lesson plans and other curriculum components, thus shifting the role of the educator from content deliverer to guide [6, 7]. In this sense, educators are moving away from a dogmatic approach to a more engaging and interesting approach that encourages critical thinking rather than mere fact accumulation. Moreover, to ensure that learning takes place through blended learning, an appropriate pedagogy must be adopted. Evidence shows that socio-constructivism is a good fit for blended learning because it stimulates active learning whereby learners are guided through a conceptualisation of the implications of environmental change by exploring the effects of Climate Change and analysing real-world problems [8, 9]. This paper aims at using an appropriate instructional design to guide learners in reconstructing their knowledge of the concept of Climate Change. The objectives of this paper are to adopt a blended learning approach to teach Climate Change science and to celebrate the autonomy of the learners.

II. LITERATURE REVIEW

A. Learning with technology

Traditional learning approaches have long been used and proven effective in the past. Nevertheless, they have recently been questioned in their ability to provide the learner with "rich" environments and "authentic" experiences of learning [10]. Traditional classrooms are space-bound, feedback given to learners is delayed, and the educator makes limited use of visual aids and materials. With the advent of the Internet and the prevalence of computer technology cutting across demographic boundaries, educators have turned to online learning [11]. Visual materials such as graphics, photographs, concept maps, films, computer and television images support written texts in all disciplines and make learning fun, motivate learners and enrich their imagination [12]. However, the
separation of the action of teaching and the reaction of learning causes learners to feel isolated, confused and frustrated, and at times learners' interest in the subject and learning effectiveness are reduced [13]. As such, to get the best of both learning methods, blended learning is considered, as it provides a viable option for learners who seek the flexibility of online courses but also request personal contact with the educator and other learners in a classroom setting. Indeed, blended learning makes pedagogically significant use of the Internet and other technological tools while reducing seat time (time spent in the classroom) [14]. Per se, blended learning improves pedagogy, increases access to knowledge, fosters social interaction, increases the amount of educator presence during learning, enhances ease of revision, and provides learners with greater control over their pace of learning, instructional flow, selection of resources, and time management [15]. Proponents of this approach believe that this system includes a committed, sustained, and well-thought-out implementation plan, which combines appropriate technology with traditional classroom interaction, leading to better outcomes for learners [16, 17]. Consequently, due to difficulties in understanding the complex and dynamic principle behind Climate Change, educators have to design appropriate instructional strategies and methods [18]. It is vital that learners develop a holistic understanding of the concept. So, Khan's framework, as illustrated in Fig. 2.1, can serve as a guide to plan, develop, deliver, manage, and evaluate blended learning programs for Climate Change.

Fig. 2.1: Khan's Blended Learning Framework [19]

B. Socio-Constructivism

In seeking to improve learners' performance and satisfaction, educators may consider adopting a socio-constructivist approach to teaching and learning. Accordingly, it is believed that learning is an active and constructive process; learners not only construct knowledge, but the knowledge they already possess affects their ability to gain new knowledge [20]. Learners are believed to be enculturated into their learning community, based on their existing understanding, through their interaction with the immediate learning environment. In a school setting, three elements are always active: the environment, the learner, and the educator [21]. This conception of the social and the individual as closely interconnected, functionally unified, constantly interacting, with change and development in one relentlessly influencing the other, provides a valid explanation for both social and individual change. The socio-constructivist pedagogy is obviously crucial for helping learners understand Climate Change concepts, that is, in meaning-making, and it guides learners to build new mental models [22]. Educational researchers have advanced that socio-constructivism is the most appropriate pedagogy to be adopted for blended learning [5, 8, 9].

III. METHODOLOGY

The purpose of this study is to adopt a blended learning approach to Climate Change science by designing and implementing an interactive educational software for mixed-ability learners, so as to cater for the complexities of the concept faced in the traditional classroom. Khan's blended learning framework guided the design, implementation and evaluation processes of the software, as further discussed in the Research Technique section below. The research was supported by socio-constructivist learning theory.

A. Research Design

Researchers and practitioners in the field of blended learning regularly use quantitative methods to measure the effectiveness of this model among learners. So, based on recent literature reviews, a questionnaire with questions similar to those in [23] was administered to 30 mixed-ability learners of Form II from a mixed-gender private secondary school. The questionnaire used a five-point Likert-type scale anchored at Strongly Disagree = 1 and Strongly Agree = 5, indicating learners' disagreement or agreement with each item. The questions were worded based on Khan's Blended Learning Framework in relation to the features of the pedagogical tool and the satisfaction of learners. The questions were set to evaluate the effectiveness of learning in an online environment and the interface design, the appropriateness of the resources adopted for learning the science of Climate Change, the accuracy of the instructions and the clarity of the evaluation criteria.

B. Research Technique

An action research was carried out for the study, as it seemed most appropriate: "Educators collaborate in evaluating their practice and try out new strategies to render their practice more consistent with educational values they espouse and thus develop a shared theory of teaching by research practice" [22]. Fig. 3.1 illustrates the processes undertaken in the action research for designing, implementing and evaluating the educational software for learning Climate Change science.

Fig. 3.1: Action Research [24]
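The per-item scoring of such a questionnaire reduces to a simple mean over the responses, which is how item means like 4.04 or 1.73 are obtained later in the paper. A minimal sketch follows; the response values are hypothetical, not the study's data.

```python
def item_mean(responses):
    """Mean of 5-point Likert responses (1 = Strongly Disagree ... 5 = Strongly Agree)."""
    if not all(1 <= r <= 5 for r in responses):
        raise ValueError("Likert responses must be in 1..5")
    return sum(responses) / len(responses)

# Hypothetical responses from ten learners to one questionnaire item:
responses = [5, 4, 4, 5, 3, 4, 5, 4, 3, 5]
print(round(item_mean(responses), 2))  # → 4.2
```

An item mean of 4 or above is then read as broad agreement with the statement, and a mean near 1 or 2 as disagreement.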
Step 1: Making the commitment

A requirement analysis was done before designing the software. The lesson was performed in the traditional classroom and, "en amont" (beforehand), a diagnostic evaluation was carried out to measure the performance of the learners and simultaneously to identify their weaknesses and needs level (Maslow's hierarchy of needs). Following the evaluation, it was observed that learners demonstrated difficulty in grasping the concept and its related aspects, such as greenhouse gases and their relationship to global warming. The ambiguities of the cumulative effects of hundreds of environmental factors blurred the concept of Climate Change. So, it was concluded that the traditional method of teaching Climate Change might be inefficient for digital natives to understand such complexities.

Step 2: Designing the study

Based on the results of the diagnostic evaluation (performance of the learners) and the requirements and needs level of the learners, identified through their weaknesses, a prototype of an interactive educational software was designed. The educational tool consisted of inter-connected pages and links to ease navigation and interaction. The software followed an interdisciplinary approach, and pages were enriched with video clips from YouTube, pictures, simple texts and interactive web-based content to allow learners a closer appreciation of the concept. Instructions were given on each page so that the tool would be user-friendly. Alongside, Thorndike's three laws of learning were considered in the design process. The interface design aimed at making the learner physically, mentally and emotionally ready to learn (Law of Readiness). Animated as well as static pictures, both in the foreground and the background of the pages, projected the learners into an environment conducive to learning Climate Change science, with sounds of water and natural hazards. Links were also inserted to direct learners to YouTube video clips on specific Climate Change concepts, such as global warming and its consequences, or the causes of climatic change. In addition, the simple texts on the pages allowed learners to further enhance what they saw in the video clips. Navigation icons allowed the learners to move from one page to another.

Likewise, at the end of each sub-topic there was a recapitulation of what had been covered and, only at this point, the learner was able to engage in the evaluation tasks. The evaluation tasks comprised MCQs, Cloze texts, gap-fills, matching exercises and crosswords, which were created with Hot Potatoes 6 and linked to the educational tool. Hence, to enable learners to construct new knowledge and master the concept, the formative evaluation at the end of each sub-topic was set with immediate feedback (Law of Exercise). Learners were provided with appropriate feedback immediately for both correct and wrong answers. They were allowed to work out each evaluation task three times. Then, at the end of the topic on Climate Change, a summative evaluation was set to reinforce learning through an on-task and field activity (Law of Effect). The educational software was uploaded to Google Drive.

Step 3: Making sense of experience

In trying to solve the problems of traditional teaching, "en aval" (afterwards) of the diagnostic evaluation, the educational software was administered to the 30 mixed-ability learners of Form II from a mixed-gender private secondary school. The learners were asked to download the educational software and work out the concept at home at their own pace. Prior to this, in class, the 30 learners had to create their Gmail accounts in the Computer Lab and were given a demonstration of how to download and use the educational tool. They had a period of one week to complete the task. During the class session (35 minutes), after the deadline, the educator was asked to clarify and enhance aspects of Climate Change which seemed ambiguous to the learners. Learners responded positively in class and shared their own research work with their classmates. At the end of the classroom session, the learners were administered a questionnaire to measure their appreciation of, and views on, the educational software. The questionnaire was collected one day after its administration. Lastly, learners' degree of understanding of the concept of Climate Change was measured through a written evaluation ranging from lower-order to higher-order questions (Bloom's Taxonomy). The performance on this second evaluation was compared to that on the diagnostic evaluation.

Step 4: Improving your practice

Results from the questionnaire highlighted new needs of the learners, such as synchronous online communication with educators and among learners. Learners wished to communicate with their peers and educator while they were on task, so that they could ask questions and share their views. The needs were reviewed and rethought.

Step 5: Beginning again

At this point, the new needs of the learners were clearly identified and an innovative strategy was re-designed. A forum for online synchronous discussion was integrated. Eventually, the change was implemented for better learner performance and satisfaction.

IV. RESULTS

Prior to designing the educational software, the needs of the learners were measured through a diagnostic evaluation on the concept of Climate Change, to identify the needs, weaknesses and performance of the learners. Based on the results obtained, an interactive educational tool was designed and implemented to cater for the needs of the learners. The educational tool provided the learners with animated pictures, simple texts, interactive web-based content and video clips, as well as evaluation tasks to test the knowledge of the learners at the end of each sub-topic. After the implementation of the educational tool, the learners' views on the software were captured through a questionnaire and, subsequently, their performance was measured through a summative evaluation (class test).

Data collected from the learners during the survey were compiled, analysed and discussed. The results provided insights on the learners' view of learning Climate Change through blended learning. The effectiveness of blended
learning was assessed by measures of learners' success in terms of performance and satisfaction [25].

Results demonstrated that 80% of the learners responded that blended learning helped them in learning the concept. It was observed that both the interactive educational tool used and the face-to-face session were helpful for gaining an understanding of the concept. Nevertheless, the remaining 20% found the learning model unsuitable. The percentage structure is shown in Fig. 4.1.

Fig. 4.1: "Blended learning helps in the learning of the concept" (pie chart: Extremely 50%, Very 30%, Not Enough 10%, Does Not 10%)

Indeed, learners usually react positively towards blended learning; however, they use the traditional approach as the simplest form of discussion on the content and for receiving feedback from the educator. A similar result was obtained in [26]. Nonetheless, the insufficient level of two-way interaction among learners and the educator during the online sessions resulted in 10% of the learners not finding the learning model appropriate and helpful. The remaining 10% responded negatively to the blended learning approach, as they favour mostly face-to-face interaction. Since they have always been cradled by the traditional learning model, they needed constant personal contact with the educator.

Undeniably, though most learners reacted positively to the educational software in terms of clarity and straightforwardness, a mean of 4.80 reflected that the majority of the learners agreed that there should be synchronous online communication, so as to enable discussion of the concept among themselves and to ask the educator for clarification while doing the work. It is to be pointed out that learners meet their educator only twice a week, for a period of 35 minutes each time. Also, results from the survey (mean = 4.50) demonstrated that the instructions were precisely given to facilitate navigation from page to page, to view the video clips from YouTube and to perform the evaluation activities. Learners agreed that the design of the educational software (video clips, audio, animated pictures, simple texts and interactive web-based content) was well selected for enhancing the learning of Climate Change science, making it clear and simple to understand (mean = 4.05), and as a result the resources met their needs (mean = 4.04). Learning in the online environment was a new experience for the learners and, being digital natives, they were eager to evolve in this virtual space.

Moreover, for both right and wrong answers, learners were provided with immediate feedback. In the case of a wrong answer, they were given the opportunity to work out the activity again. Research has advanced that, in most circumstances, feedback that is immediate and specific results in better learning [27, 28]. The concept was made closer to reality and increased their interest in the subject. A majority of learners (mean = 4.04) advanced that they were able to work anytime and anywhere at their own pace. Thus, they were eager to search for more information on the Internet ("I'm bored when I'm learning online", mean = 1.73). They, besides, became more aware of the climatic changes in their own environment by building on what they see and what they learn. Actually, the summative activity was a real challenge which they were eager to tackle. Learners initially had a low level of comprehension of the concept; however, the online lessons, further sustained during classroom interaction, enabled them to better appreciate it. In fact, learners are motivated by activities that reflect those in the "real world" [29]. Thus the learners agreed that the blended learning model is an efficient one (mean = 4.00).

In terms of personal contact during the face-to-face session, the mean of 3.73 demonstrated that learners have divergent views. Some learners agreed that personal contact with the educator is imperative during the learning process, while others did not. As contended in [29], personal contact with the educator is essential to provide help and support. Educators play a major role in social presence. Indeed, the results of the summative evaluation were compared to those of the diagnostic evaluation. There was significant improvement in all learners' performance. In fact, even the 10% of the learners who did not favour the blended learning approach performed better.

This conclusion is analogous to the result of [30], obtained over a much larger sample in their study on the use of blended learning.

V. CONCLUSION

The paper aimed at designing and adopting a blended learning model to guide and reconstruct the knowledge of learners on the science of Climate Change and to celebrate the autonomy of the learners. The results from this action research suggested that a well-designed and supported blended model for learning Climate Change science enables learners to appreciate the concept, and raised their interest in investigating the concept further on their own: the climatic happenings in their environment and worldwide. Learners developed an overarching philosophy that will help them set goals for sustainable futures and develop a world view. Hence, it is observed that the autonomy of the learners was celebrated. Likewise, it was concluded that learners' performance improved with this learning model compared to that of the traditional classroom. As advanced in [31], "there is convincing evidence that online learners do just as well if not better than learners in face-to-face courses". Indeed, the findings which emanated from the action research also provided a platform for educators to enhance learners' knowledge of Climate Change through blended learning. The role of the educator in the blended learning environment is essential and should be considered for the success of blended learning programs. The educator's role in a blended learning environment includes management, guidance, providing feedback, and evaluation [19]. The greatest value of an effective blended learning model is that it fosters more interactive, collaborative and engaging learning.

Nevertheless, this study needs to be supported by larger-scale studies to reveal the impact and to have a broader
appreciation of using the Blended Learning approach in conceptualising Climate Change. Also, learners could be encouraged, in line with the objectives of the curricula, to participate actively in their community life to make the change for a sustainable world.

ACKNOWLEDGMENT

I would like to thank the learners at St Helena's College for their participation. I would like to extend my gratitude to Ms. V. Bundhoo and Mrs. N. Domun for their assistance during the research. Last but not least, a special thanks to Mrs. M. Dhuny for her help and support.

REFERENCES

[1] D.K. Bardsley and A.M. Bardsley, "A Constructivist Approach to Climate Change Teaching and Learning", Geographical Research, Volume 45, Issue 4, 2007.
[2] E. Aladag and B. Ugurlu, "Global Climate Change Education in Turkey", 2008. Unpublished paper.
[3] V. Papadimitriou, "Prospective Primary Teachers' Understanding of Climate Change, Greenhouse Effect, and Ozone Layer Depletion", Journal of Science Education and Technology, Vol. 13, No. 2, 2004.
[4] National Research Council, "Climate Change Education in Formal Settings, K-14: A Workshop Summary", A. Beatty, Rapporteur, Steering Committee on Climate Change Education in Formal Settings, K-14, Board on Science Education, Division of Behavioral and Social Sciences and Education, 2012, Washington, DC: The National Academies Press.
[5] F. Movahedzadeh, "Improving Students' Attitude Toward Science Through Blended Learning", Science Education and Civic Engagement, 3:2, 2011.
[6] E. Aboukhatwa, "Blended Learning as a Pedagogical Approach to Improve the Traditional Learning and E-Learning Environments", The Second International Arab Conference on Quality Assurance in Higher Education, 2012.
[7] E. Banados, "A Blended-Learning Pedagogical Model for Teaching and Learning EFL Successfully Through an Online Interactive Multimedia Environment", CALICO Journal, Vol. 23, No. 3, 2006.
[8] A. Koohang and K. Harman, "Open source: A metaphor for e-learning", Informing Science: The International Journal of an Emerging Transdiscipline, 8, 75-86, 2005. Retrieved from http://inform.nu/Articles/Vol8/v8p075-086Kooh.pdf
[9] A. Koohang, "A learner-centered model for blended learning design", International Journal of Innovation and Learning, 6(1), 76-9, 2009.
[10] A. Relan and B.J. Gillani, "Web-based instruction and the traditional classroom: Similarities and differences", in B. Khan (Ed.), Web-Based Instruction (pp. 25-37), 1997, New Jersey: Educational Technology Publications.
[11] M.T. Stone and S. Perumean-Chaney, "The Benefits of Online Teaching for Traditional Classroom Pedagogy: A Case Study for Improving Face-to-Face Instruction", MERLOT Journal of Online Learning and Teaching, Vol. 7, No. 3, September 2011.
[12] A.E. Bozdogan, "The effects of instruction with visual materials on the
[13] J. Humphries, "Gauging Faculty Attitudes Toward Online and Hybrid Learning", Journal of Applied Computing, 5(1), 28-32, 2009.
[14] A. deNoyelles, C. Cobb, and D. Lowe, "Influence of Reduced Seat Time on Satisfaction and Perception of Course Development Goals: A Case Study in Faculty Development", Journal of Asynchronous Learning Networks, 16(2), 85-98, 2012.
[15] Y. Blieck, M. de Jong, and L. Vandeput, "Blended Learning for Lifelong Learners in a Multicampus Context (MuLLLti)", Leuven University College, Belgium, 2012.
[16] C.J. Bonk and C.R. Graham (Eds.), The Handbook of Blended Learning: Global Perspectives, Local Designs, 2006, San Francisco: Pfeiffer.
[17] D.R. Garrison and N.D. Vaughan, Blended Learning in Higher Education: Framework, Principles, and Guidelines, 2008, San Francisco, CA: John Wiley and Sons.
[18] C.M. DiEnno and S.C. Hilton, "High school students' knowledge, attitudes, and levels of enjoyment of an environmental education unit on non-native plants", The Journal of Environmental Education, 37(1), 13-25, 2005.
[19] B. Khan, Managing E-Learning Strategies: Design, Delivery, Implementation and Evaluation, Information Science Publishing, 2005, London.
[20] E. Etkina and J.P. Mestre, "Implications of Learning Research for Teaching Science to Non-Science Majors", SENCER Backgrounder, presented at SSI 2004.
[21] V.V. Davydov (translated by S.T. Kerr), "The influence of L. S. Vygotsky on educational theory, research, and practice", Educational Researcher, 24(3), 12-21, 1995.
[22] M. Turuk, "The relevance and implications of Vygotsky's sociocultural theory in the second language classroom", ARECLS, Vol. 5, 244-262, 2008.
[23] V. Aleksić and M. Ivanović, "Blended Learning in Tertiary Education: A Case Study", Sun SITE Central Europe, Vol. 1036, 2013.
[24] F. Rust and C. Clark, How to Do Action Research in Your Classroom, Taking Action with Teacher Research, Heinemann Press.
[25] C.C. Wai and E. Lim Kok Seng, "Measuring the effectiveness of blended learning environment: A case study in Malaysia", Education and Information Technologies, 20:429-443, 2015.
[26] I. Miliszewska, "Transnational Education Programs: Student Reflections on a Fully-Online Versus a Hybrid Model", Victoria University, Australia, 2007.
[27] L.M. Jeffrey, J. Milne, G. Suddaby, and A. Higgins, "Blended learning: How teachers balance the blend of online and classroom components", Journal of Information Technology Education: Research, 13, 121-140, 2014.
[28] S.D. Miller, "How high- and low-challenge tasks affect motivation and learning: Implications for struggling learners", Reading & Writing Quarterly, 19(1), 39-57, 2010.
[29] A.R. Artino and J.M. Stephens, "Academic motivation and self-regulation: A comparative analysis of undergraduate and graduate students learning online", Internet & Higher Education, 12(3/4), 146-151, 2009, doi: 10.1016/j.iheduc.2009.02.001.
[30] S. Reasons, K. Valadares, M. Slavkin, "Questioning the Hybrid Model:
development of preservice elementary teachers‟ knowledge and attitude Student Outcomes In Different Course Formats”. JALN, Vol. 9, No. 1,
towards global warming” The Turkish Online Journal of Educational 2009
Technology, volume 10 Issue 2 , 2011
[31] A.W.T. Bates, and A. Sangra, “Classroom assessment techniques”,
2011, San Francisco: Josey Bass.
Performance Analysis of Parallel CBAR in
MapReduce Environment
Sayantan Singha Roy Chandan Garai Ranjan Dasgupta
Department of C.S.E Department of C.S.E Department of C.S.E
NITTTR, Kolkata NITTTR, Kolkata NITTTR, Kolkata
Kolkata, India Kolkata, India Kolkata, India
sayantansingharoy@outlook.com chandangarai@hotmail.com ranjandasgupta@ieee.org
V. RESULT

We evaluate our proposed algorithm's performance in Fig. 5. Performance experiments were executed on a cluster of 4 computers, each with a 3.6 GHz quad-core i7 processor and 4 GB RAM. Hadoop version 1.0.3 and Java version 1.8.0_31 were used to run the MapReduce jobs.

In Fig. 5, we increased the number of records and, for each set of records, also the number of computers (nodes). The time taken to process a set of records decreases as the number of nodes increases, which demonstrates the parallel nature of our proposed algorithm. It also shows that when the number of records becomes high, a single node takes much longer, whereas four nodes take relatively the same time for all sets of records. The time complexity of cluster creation is quadratic in nature, as mentioned earlier; we arrest the growth of this function by increasing the number of nodes. As the number of records increases, the time spent by the algorithm grows only moderately, so our proposed algorithm can handle large data.

Fig. 5. Speed-up graph

In Table 2, we analyse the growth of execution time between a single node and four nodes. For 10 million records, relative to 1 million records, the execution-time growth for a single node is 129%, whereas the growth for four nodes is 60% under the same conditions.
Table 1: Case study

Table 2: Execution time growth comparison between single node and cluster of 4 nodes
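The figures quoted above (129% versus 60% growth, and the Fig. 5 speed-up) can be reproduced from raw timings with a short script. A minimal Python sketch; the timing values below are illustrative placeholders, not the measured values behind Fig. 5 and Table 2:

```python
# Illustrative computation of speed-up and execution-time growth, in the
# sense used in Fig. 5 and Table 2. Timing values are hypothetical
# placeholders, not the paper's measurements.

def speedup(t_single, t_parallel):
    """Speed-up of a parallel run relative to the single-node run."""
    return t_single / t_parallel

def growth_percent(t_small, t_large):
    """Execution-time growth (%) when moving to a larger input."""
    return (t_large - t_small) / t_small * 100.0

# Hypothetical execution times (seconds) for 1M and 10M records.
t1 = {"1M": 100.0, "10M": 229.0}   # single node: 129% growth
t4 = {"1M": 40.0,  "10M": 64.0}    # four nodes:  60% growth

print(round(growth_percent(t1["1M"], t1["10M"])))  # 129
print(round(growth_percent(t4["1M"], t4["10M"])))  # 60
print(speedup(t1["10M"], t4["10M"]))               # 3.578125 (about 3.6x)
```

With hypothetical timings chosen to match the reported growth rates, the four-node cluster arrests the growth of execution time exactly as described in the text.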
VI. CONCLUSION

An improvement of more than 50% has been observed in some application cases, which is quite impressive from a computational perspective. It has also been observed that the clustering time becomes almost stationary for higher numbers of nodes, even when the input volume of data is increased from 7 million to 10 million records. Thus, CBAR being a very useful clustering technique, using it in a cloud environment (Hadoop MapReduce) for processing Big Data has some inherent advantages and may be used for various applications, as discussed in our work.
Using a Location-Based Mobile Application to
Locate a Traditional Health Practitioner in South
Africa
Johannes M. Kekana¹ and Robert T. Hans²
Department of Computer Science
Tshwane University of Technology
Soshanguve, Pretoria, South Africa
¹joe_kekana@yahoo.com
²hansr@tut.ac.za
Abstract—Mobile technology has permeated many industries, including the health industry. Different applications are used for various purposes in the health industry, from checking blood pressure to monitoring patients' health remotely. Following legislation passed in South Africa to recognize traditional health practitioners as practicing doctors, many bogus traditional health practitioners have emerged and are robbing unsuspecting patients of their money. To curb this problem, the South African government established a traditional health council aimed at ensuring that all legally practicing health practitioners register with it. However, prospective patients still find it difficult to locate these registered health practitioners. This paper proposes and presents a location-based mobile application that enables prospective patients to search for the nearest traditional health practitioner anytime and anywhere.

Keywords—mobile technology; location-based; mobile application; traditional health practitioner; e-health.

I. INTRODUCTION

The advent of mobile technology has transformed how we lead our lives. Mobile phones now outnumber personal computers in both developed and developing countries [1]. Mobile phones are portable and enable people to perform internet-based functions that would previously have demanded the use of personal computers [2]. For some people, mobile phones have become an expression of their personality [3].

Mobile technology has been adopted in many different industries. Mobile applications are used in social networking to enable people to interact [4]. Advertisement remains the major source of revenue for mobile applications [5]. Smartphones are also used to print documents that users may have been reading online and want to refer to later in hard copy [6].

The use of mobile applications has also permeated the health industry in a big way. Users are able to use their mobile phones to establish their blood pressure levels and to check whether they are suffering from cancer, diabetes, etc. [7]. The South African government recently passed legislation aimed at recognizing traditional health practitioners as practicing doctors, just like medically trained doctors. This legislation is in line with what has been observed on the ground: people have shown trust in consulting traditional health practitioners who are based in their communities [8]. As a result of these developments, dubious traditional health practitioners have cropped up all over the country, conning people out of their hard-earned money. In an attempt to curb this problem, the government has established a council whose mandate is to ensure that all legally practicing traditional health practitioners register with it. However, people who intend to consult traditional health practitioners are still unable to find them easily and on the move. This paper is therefore aimed at closing this gap by proposing a location-based mobile application that enables a user to find the nearest traditional health practitioner.

The remainder of this paper is structured in the following manner. Section II discusses related work in mobile applications and their usage. Section III provides insight into the functionality of the proposed mobile application. Section IV discusses the experimental results. Section V discusses the benefits which will be realized by the proposed system. Section VI presents the conclusion and future work.

II. RELATED WORK

In the 21st century, mobile phones have become an integral part of almost everything we do. Mobile phones are found in entertainment, education, health [7],[9] and so on. It is difficult to imagine going through a day without having used a mobile phone in one way or another. Mobile phones owe their popularity to their ability to put information at the users' fingertips anytime and anywhere [10].

In recent years the health industry has seen a boom in mobile applications meant to address various issues, including stress-related matters [11], maternal mortality [12] and heart-related illnesses [7], among many more. In support of the abovementioned assertion, [13] states that mobile technology has been leveraged to provide healthcare services in many countries, with some of the examples being emergency response systems for road traffic
Figure 5. Map directions

REFERENCES

[1] V.C.E. Bahamondez, J. Häkkilä and A. Schmidt, "Towards better UIs for mobile learning: experiences in using mobile phones as multimedia tools at schools in rural Panama," MUM '12: Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, 2012.
[2] Y. Bang, D.J. Lee and K. Han, "The Impact of Mobile Channel Adoption on Purchase Time Dispersion in e-Commerce," ICEC '15: Proceedings of the 17th International Conference on Electronic Commerce, 2015, Article No. 34.
[3] A. Meschtscherjakov, D. Wilfinger and M. Tscheligi, "Mobile attachment: causes and consequences for emotional bonding with mobile phones," CHI '14: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2014, pp. 2317-2326.
[4] D. Ferreira, J. Goncalves, V. Kostakos, L. Barkhuus and A.K. Dey, "Contextual experience sampling of mobile application micro-usage," MobileHCI '14: Proceedings of the 16th International Conference on Human-Computer Interaction with Mobile Devices & Services, New York, United States of America, 2014, pp. 91-100.
[5] J.P. Rula, B. Jun and F. Bustamante, "Mobile AD(D): Estimating Mobile App Session Times for Better Ads," HotMobile '15: Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications, New York, United States of America, 2015, pp. 123-128.
[6] B. Leeladevi, C.P.R. Raj and A. Tolety, "A study on smartphone printing approaches," 2013 IEEE Conference on Information & Communication Technologies (ICT), JeJu Island.
[7] N. Nirwal, N. Sardana and A.J. Bhatt, "Hopeful Hearts: A Mobile Health Care Application," 2014 Seventh International Conference on Contemporary Computing (IC3), pp. 351-356.
[8] J.P. van Niekerk, (2012), Traditional healers formalised? [Online] Available from: http://www.scielo.org.za/scielo.php?pid=S0256-95742012000300001&script=sci_arttext [Accessed: 16 August 2015].
[9] J. Pirker, P. Weiner, C. Gütl and V.M. Garcia-Barrios, "Location-based Mobile Application Creator: Creating Educational Mobile Scavenger Hunts," 2014 International Conference on Interactive Mobile Communication Technologies and Learning (IMCL), pp. 160-164.
[10] D.M. Mahmud and N.A.S. Abdullah, "Mobile Application Development Feasibility Studies," 2014 IEEE Conference on Open Systems (ICOS), pp. 30-35.
[11] J. Sakata, M. Zhang, S. Pu, J. Xing and K. Versha, "Beam: a mobile application to improve happiness and mental health," CHI EA '14: CHI '14 Extended Abstracts on Human Factors in Computing Systems, New York, United States of America, 2014, pp. 221-226.
[12] S.R. Shinde, R. Shinde, S. Shanbhag, M. Solanki and P. Sable, "mHEALTH-PHC - Application design for rural health care," 2014 Humanitarian Technology Conference (IHTC), pp. 1-5.
[13] A. Maitra and N. Kuntagod, "A novel mobile application to assist maternal health workers in rural India," 2013 5th International Workshop on Software Engineering in Health Care (SEHC), pp. 75-78.
[14] R.M. González, J. Francisco, F.J. Álvarez, A. Rodríguez, J.M. Arteaga and A.M. González, "Guidelines for designing graphical user interfaces of mobile e-health communities," INTERACCION '12: Proceedings of the 13th International Conference on Interacción Persona-Ordenador, Article No. 3, ACM, New York, NY, United States of America, 2012.
[15] S. Brown and T.X. Brown, "Value of mobile monitoring for diabetes in developing countries," ICTD '13: Proceedings of the Sixth International Conference on Information and Communication Technologies and Development: Full Papers, ACM, New York, United States of America, 2013, pp. 267-273.
[16] R. De Bruin and S.H. von Solms, "Securing mobile applications in hostile rural environments," IST-Africa 2014 Conference Proceedings, Paul Cunningham and Miriam Cunningham (Eds), pp. 1-9.
[17] S. Sathe, R. Melamed, P. Bak and S. Kalyanaraman, "Enabling Location-Based Service 2.0: Challenges and Opportunities," 2014 IEEE 15th International Conference on Mobile Data Management, pp. 313-316.
[18] T.J. Gerpott and S. Berg, "Adoption of Location-Based Service Offers of Mobile Network Operators," Mobile Business and 2010 Ninth Global Mobility Roundtable (ICMB-GMR), pp. 154-160.
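The core operation of the proposed application, locating the registered practitioner nearest to the user, can be sketched as a haversine nearest-neighbour query. A minimal Python sketch; the function names and sample coordinates are illustrative assumptions, not taken from the actual application:

```python
# Sketch of a nearest-practitioner lookup over registered coordinates.
# Names and sample data are hypothetical; a real app would query the
# council's registry and the device's GPS position.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_practitioner(user_pos, practitioners):
    """Return the registered practitioner closest to the user's position."""
    return min(
        practitioners,
        key=lambda p: haversine_km(user_pos[0], user_pos[1], p["lat"], p["lon"]),
    )

# Hypothetical registered practitioners around Pretoria.
registry = [
    {"name": "Practitioner A", "lat": -25.75, "lon": 28.19},
    {"name": "Practitioner B", "lat": -25.53, "lon": 28.10},  # Soshanguve area
]
print(nearest_practitioner((-25.54, 28.11), registry)["name"])  # Practitioner B
```

A production version would delegate the distance ranking to a spatial query on the registry database rather than scanning all records on the device.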
A New Efficient Algorithm for Executing Queries
over Encrypted Data
Rasha Refaie¹, A. A. Abd El-Aziz², Nermin Hamza³, Mahmood A. Mahmood⁴, Hesham Hefny⁵
Dept. of Computer & Information Sciences
Institute of Statistical Studies and Research
Cairo University
e-mail: {¹rosha2030@gmail.com, ²a.ahmed@cu.edu.eg, ³nermin_hamza@yahoo.com, ⁴mahmoodissr@cu.edu.eg, ⁵hehefny@ieee.org}
Abstract—Outsourcing databases into the cloud increases the need for data security. The cloud user must be sure that his data will be safe and will not be stolen or reused, even if the data centers are attacked. The service provider is not trustworthy, so the data must be invisible to him. Executing queries over encrypted data preserves a certain degree of confidentiality. In this paper, we propose an efficient algorithm to run computations on data encrypted for different principals. The proposed algorithm allows users to run queries over encrypted columns directly, without decrypting all records.

Keywords—Database security; query processing; homomorphic encryption; CryptDB; MONOMI; secure indexes

I. INTRODUCTION

When a database is provided as a service, the service provider may not be trustworthy, and data needs to be protected from the database service provider. The problem with using encryption is that there is no way to execute queries over encrypted data without decrypting it [1]. Hence the data will not be protected, which presents a problem for users. We therefore propose an efficient algorithm to run queries over encrypted data without decrypting it, so that the data remains confidential and invisible even to the cloud provider.

II. RELATED WORK

Different techniques have been suggested to maintain a certain degree of privacy in database outsourcing scenarios while still allowing some SQL queries to be executed efficiently.

Fully homomorphic encryption (FHE) is a new security concept: such a system can evaluate any function on encrypted data [2]. However, fully homomorphic encryption is still very expensive and slow [3].

CryptDB's approach is to execute queries over encrypted data on the DBMS server as it would on a plaintext database [4]. A proxy manages all communication to and from the database and uses secret keys to encrypt all data included in queries issued to the DBMS [5,6]. In this approach, data is encrypted in a layered way called an adaptive onion [7,8,9,10,11]. CryptDB is much more efficient, but it cannot run computations at the DBMS on values encrypted using different keys [3] and cannot support most analytical queries over encrypted data.

MONOMI is a system for executing analytical workloads over encrypted data on an untrusted database server. In MONOMI, part of the query is run on the untrusted server over encrypted data, and the remainder of the query on the plain database at the user's server [12]. MONOMI sends the encrypted result to the user, who decrypts it and runs the final computation, which is more efficient to compute on the user's side. Installing a new database design is still a big problem in MONOMI, and security constraints are not taken into account [3].

[13] suggests a good algorithm for searching over encrypted data. The limitations of existing techniques for fuzzy-match and range queries are efficiently eliminated using this algorithm. The algorithm is efficient when the result of the query is less than 40% of the total data.

[14] proposes a new architecture to support data confidentiality, integrity and availability. This architecture provides secure and robust cloud storage by combining cryptography and access control with two-layered encryption.

Using pre-computed indexes makes the execution of database queries faster, but standard indexes become ineffective when the data is encrypted [15]. Moreover, if several users with different access rights use the same index, each of them needs access to the entire index, possibly including indexed elements that are beyond his access rights. A simple but elegant solution to this problem is to split the index into several sub-indexes, where each sub-index relates to the values in the column encrypted using the same key [3].

Security issues and their related costs are among the most strategic issues related to database outsourcing. [16] proposes a model which includes the variability of database workload, cloud prices and the related cost of encryption schemes. By applying the model to actual cloud provider prices, the encryption costs for data privacy can be determined.
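The homomorphic property referred to above can be illustrated with textbook RSA, which is multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. This is only a toy sketch of the principle with tiny, insecure parameters; FHE schemes generalize it to arbitrary functions:

```python
# Toy illustration of a homomorphic property using textbook RSA
# (multiplicatively homomorphic). Parameters are tiny and insecure;
# this sketches the principle only, not any FHE scheme.

p, q = 61, 53
n = p * q                   # 3233
e, d = 17, 413              # e*d = 7021 ≡ 1 (mod lcm(p-1, q-1) = 780)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

a, b = 4, 6
c = (enc(a) * enc(b)) % n   # the server multiplies ciphertexts only
assert dec(c) == a * b      # ...yet the result decrypts to a*b
print(dec(c))               # 24
```

An FHE scheme extends this idea so that both addition and multiplication (and hence any circuit) can be evaluated on ciphertexts, which is what makes it so expensive in practice.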
Suppose a user issues the following query:

SELECT Emp_Name, Salary, Dept_Num
FROM TABLE2
WHERE Salary ≥ 100

In CryptDB, data is encrypted in a layered way called an onion. The term "onion" refers to layers of encryption overlapping like the layers of an onion. These onions have different layers, each encrypted using a different algorithm, and the outer layer of an onion is the most secure one. A value has only one current layer in each onion. The ORD onion is used to adjust the order leakage for queries that include comparisons. Order-preserving encryption (OPE) is used to encrypt values while keeping their order: if x < y, then OPE_K(x) < OPE_K(y) for any secret key K [4]. Therefore, if a column is encrypted with OPE, the server can perform range queries and queries including comparisons. In our example, the salary values are encrypted with different keys. The problem is that the same value encrypted with different keys gives different ciphertexts (E_K1(100) ≠ E_K2(100)), so the server cannot perform range or comparison queries on a column containing values encrypted with different keys.

To execute this query using CryptDB, the proxy encrypts the query and sends it to the DBMS server to run. The server cannot check whether Salary ≥ 100 because the salaries are encrypted with different keys. Thus, CryptDB executes the following query at the DBMS server:

FROM TABLE2

[Figure: User1 (password P1) communicates through the proxy server, which sends queries Q1, Q2, Q3 to the encrypted database]
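The two properties used in this argument, order preservation under a single key and incomparability across keys, can be demonstrated with a deliberately simplified stand-in for OPE (a strictly increasing affine map per key). Real OPE constructions are far more involved; this sketch only illustrates the properties:

```python
# Deliberately simplified "OPE" sketch: a strictly increasing affine map
# per key. Real order-preserving encryption is far more sophisticated;
# this only illustrates the two properties discussed in the text.

def ope(key, x):
    a, b = key
    return a * x + b        # strictly increasing in x for a > 0

k1, k2 = (7, 13), (3, 101)  # illustrative key material

# Property 1: order is preserved under a single key.
assert ope(k1, 100) < ope(k1, 800)

# Property 2: the same value under different keys gives different
# ciphertexts, so the server cannot compare across keys.
print(ope(k1, 100), ope(k2, 100))   # 713 401, i.e. E_K1(100) != E_K2(100)
```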
The following algorithm outlines the suggested technique. When an authorized user wants to search for records and the query condition is on a column encrypted using different keys, the proxy encrypts the query using the different keys (K1, K2, K3, ...) into (Q1, Q2, Q3, ...). The proxy sends these queries to the DBMS, which runs each query on the corresponding data encrypted with the same key. These queries return exactly the records the user wants from Encrypted_Table. This approach does not need to decrypt all the values of the entire encrypted column; rather, it decrypts only those values which satisfy the user query. The following example describes the searching operation of the suggested algorithm:

SELECT Emp_Name, Salary, Dept_Num
FROM TABLE2
WHERE Salary ≥ 100

The proxy encrypts this query using (K1, K2) into (Q1, Q2) and sends Q1 and Q2 to the DBMS. The DBMS runs Q1 on the data encrypted by K1 and Q2 on the data encrypted by K2, returning exactly the records the user wants. The DBMS server returns the encrypted query results (Tables 3 and 4).

Q1: SELECT Emp_Name, Salary, Dept_Num
FROM TABLE2
WHERE Salary ≥ x5a8c34
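The per-key fan-out described above can be simulated end to end. In this sketch, affine maps stand in for real encryption, and all table contents and key material are illustrative; only the name Encrypted_Table and the Q1/Q2 rewriting follow the text:

```python
# Sketch of the proposed multi-key search: the proxy issues one rewritten
# query per key (Q1, Q2, ...) and decrypts only the matching rows.
# The affine "encryption" is an insecure stand-in for a real OPE scheme.

KEYS = {1: (7, 13), 2: (3, 101)}          # key_id -> (a, b), illustrative

def enc(key_id, x):
    a, b = KEYS[key_id]
    return a * x + b

def dec(key_id, c):
    a, b = KEYS[key_id]
    return (c - b) // a

# Encrypted_Table rows: (name, enc(salary), dept, key_id used for the row)
encrypted_table = [
    ("Alice",    enc(1, 100), 1, 1),
    ("Eve",      enc(1, 800), 1, 1),
    ("McDonald", enc(1, 100), 1, 1),
    ("Alice",    enc(2, 100), 2, 2),
    ("Bob",      enc(2,  50), 2, 2),
]

def proxy_query(threshold):
    """Rewrite 'Salary >= threshold' once per key and merge the results."""
    results = []
    for key_id in KEYS:                    # one rewritten query per key
        enc_thr = enc(key_id, threshold)   # per-key encrypted constant
        for name, enc_sal, dept, kid in encrypted_table:
            if kid == key_id and enc_sal >= enc_thr:   # server-side compare
                results.append((name, dec(key_id, enc_sal), dept))
    return results

print(proxy_query(100))
# [('Alice', 100, 1), ('Eve', 800, 1), ('McDonald', 100, 1), ('Alice', 100, 2)]
```

Only rows satisfying the per-key comparison ever reach the decryption step, mirroring the claim that the proxy decrypts just the values matching the user query.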
TABLE 3: RESULT OF Q1

Emp_Name Salary Dept_Num
98wu x5a8c34 1
u8sb x84a21c 1
3781e2 x5a8c34 1

Q2: SELECT Emp_Name, Salary, Dept_Num
FROM TABLE2
WHERE Salary ≥ x638e54

TABLE 4: RESULT OF Q2

Emp_Name Salary Dept_Num
98wu x638e54 2

The proxy decrypts these results and returns them to the user (TABLE 5).

TABLE 5: THE RETURNED RESULT FROM THE PROXY SERVER'S SIDE

Emp_Name Salary Dept_Num
Alice 100 1
Eve 800 1
McDonald 100 1
Alice 100 2

In this technique the proxy does not need to decrypt all the values of the entire encrypted column; rather, it decrypts only those values which match the user query.

IV. CONCLUSION

In this paper, we outlined various techniques which offer privacy in database outsourcing scenarios, and suggested a secure algorithm for searching over encrypted data. Our proposed technique builds on CryptDB's design. CryptDB's approach is to provide secure communication between the user and the encrypted database by executing queries over encrypted data on the DBMS server as it would on a plaintext database. Our proposed algorithm efficiently eliminates the limitations of computations on values encrypted for different principals. We will apply our algorithm in future work to verify its effectiveness.

REFERENCES

[1] (2011) The Forbes website. [Online]. Available: http://www.forbes.com/sites/andygreenberg/2011/12/19/an-mit-magic-trick-computing-on-encrypted-databases-without-ever-decrypting-them/
[2] M. Tebaa, S. El Hajji and A. El Ghazi, "Homomorphic Encryption Applied to the Cloud Computing Security," in Proc. of the World Congress on Engineering 2012, Vol. I, WCE 2012, London, U.K., July 4-6, 2012.
[3] R. Refaie, A. A. Abd El-Aziz, N. Hamza, M. A. Mahmood and H. Hefny, "A Survey on Executing Query on Encrypted Data," The International Conference on Intelligent Information Technologies (ICIIT 2014), Chennai, India, 12/2014.
[4] R. A. Popa, C. M. S. Redfield, N. Zeldovich and H. Balakrishnan, "CryptDB: Protecting confidentiality with encrypted query processing," in Proc. of the 23rd ACM Symposium on Operating Systems Principles (SOSP), Cascais, Portugal, Oct. 2011, pp. 85-100.
[5] R. A. Popa, C. M. S. Redfield, N. Zeldovich and H. Balakrishnan, "Review of 'CryptDB: Protecting Confidentiality with Encrypted Query Processing'," May 16, 2012.
[6] R. A. Popa, C. M. S. Redfield, N. Zeldovich and H. Balakrishnan, "CryptDB: A Practical Encrypted Relational DBMS." Available: http://lib4shared.com/doc-file/cryptdb-a-practical-encrypted-relational-dbms
[7] Z. Dayioglu, M. Kiraz, F. Birinci and I. Akin, "Secure Database in Cloud Computing: CryptDB Revisited," 6th International Information Security & Cryptology Conference, 20-21 September 2013, Ankara, Turkey.
[8] http://en.wikipedia.org/wiki/Tor_(anonymity_network)
[9] L. Ferretti, F. Pierazzi, M. Colajanni and M. Marchetti, "Security and Confidentiality Solutions for Public Cloud Database Services," SECURWARE 2013: The Seventh International Conference on Emerging Security Information, Systems and Technologies.
[10] P. Srivani, S. Ramachandram and R. Sridevi, "A Survey Report on CryptDB," Computer Science, Communication & Instrumentation Devices, Editors: Janahanlal Stephen, Harish Rohil and S. Vasavi, © 2015 AET-2014 Organisers, ISBN: 978-981-09-5247-1.
[11] C. Curino, E. P. C. Jones, R. Popa, N. Malviya, E. Wu, S. Madden, H. Balakrishnan and N. Zeldovich, "Relational Cloud: A Database-as-a-Service for the Cloud," available at https://people.csail.mit.edu/nickolai/papers/curino-relcloud.pdf
[12] S. Tu, M. F. Kaashoek, S. Madden and N. Zeldovich, "Processing Analytical Queries over Encrypted Data," in Proc. of the 39th International Conference on Very Large Data Bases (VLDB), Riva del Garda, Italy, August 2013.
[13] M. Sharma, A. Chaudhary and S. Kumar, "Query Processing Performance and Searching over Encrypted Data by using an Efficient Algorithm," International Journal of Computer Applications (0975-8887), Vol. 62, No. 10, January 2013.
[14] R. Kalaichelvi and L. Arockiam, "Secure and Robust Cloud Storage with Cryptography and Access Control," Elixir Comp. Sci. & Engg. 56 (2013) 13481-13484. Available online at www.elixirpublishers.com (Elixir International Journal).
[15] E. Shmueli, R. Waisenberg, Y. Elovici and E. Gudes, "Designing Secure Indexes for Encrypted Databases," in Proc. of Data and Applications Security, 19th Annual IFIP WG 11.3 Working Conference, USA, 2005.
[16] L. Ferretti, F. Pierazzi, M. Colajanni and M. Marchetti, "Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cloud Databases," IEEE Transactions on Cloud Computing, Vol. 2, No. 2, April-June 2014.
Memristor Model For Massively-Parallel
Computations
Abstract—The memristor model described in this paper is designed for building models of large networks for analog computations. A circuit containing thousands of memristors for finding the shortest path in a complicated maze is a typical example. The model is designed to meet the following criteria: 1. It is a model of the HP memristor with linear dopant drift which respects the physical bounds of the internal state variable. 2. It operates reliably in the SPICE environment, also when simulating extremely large networks. 3. It minimizes the simulation time while computing bias points and during transient analyses. A benchmark circuit for testing applications of various complexities is presented. The results confirm a perfect operation of the model, also in applications containing thousands of memristors.

Keywords—memristor; model; massively-parallel analog computations; SPICE

I. INTRODUCTION

Circuits for massively-parallel computing, in which the collective cooperation of a large number of memristors results in effects unattainable by digital computers, are important potential application areas of memristors. Fast searching for paths in complicated labyrinths, where the computation time grows exponentially with the maze size, is a typical example. An interesting method for solving this problem efficiently via a memristor network is proposed in [1]. By means of analog switches, this network can be reconfigured to the form of the analyzed labyrinth. When a DC voltage is applied across the terminals which represent the entrance and the exit of the maze, current flows only through the memristors which lie on the sought paths inside the labyrinth. After disconnecting the source, these paths are memorized in the form of the memristances of the corresponding memristors. The path lengths can be evaluated by measuring the total resistances of the individual paths. Such a principle is used in [2] for finding optimal trajectories of a passenger on the London tube network, which is modeled by a memristive grid.

Experimenting with such networks requires hundreds if not thousands of memristors, and physical samples are not currently accessible. However, problems also exist with computer simulations of such large circuits. The numerical problems burdening simple models of the HP memristor with frequently used window functions [3] increase with the increasing complexity of the application circuit. Experiments to date with these models reveal their malfunction in the SPICE environment even for primitive circuits containing only one memristor. The well-known Pickett model of the TiO2 memristor [4] is rather complicated in itself: in the SPICE implementation suggested in [5], it represents 26 additional equations. When such a model is utilized in an application network containing 1000 memristors, it represents an additional 26 thousand rows and 26 thousand columns in the circuit matrix. In addition, the Pickett model labors under serious convergence problems: a simple circuit containing 8 memristors, each modeled by the Pickett model, can be shown whose DC solution cannot be found in SPICE due to numerical problems.

This paper describes a model of the HP memristor with linear dopant drift which is mathematically equivalent to the classical Strukov model [6] but takes into account the physical limits of the internal state variable x ∈ [0, 1], the normalized width of the doped (and thus conductive) TiO2 layer [7]. This limitation is commonly modeled by a rectangular window function [7]. However, such a model fails numerically if the memristor reaches either border of the state variable [8]. For our purposes we will therefore use the method of nonlinear transformation [3] of the native state variable (charge) into the physical state variable x, which must be optimized in order to minimize the total count of equations for the simulation program.

II. PROPOSED MODEL OF HP MEMRISTOR

The classical model of the TiO2 memristor has the form [7]

v = R_m(x)·i,  R_m(x) = R_on·x + R_off·(1 − x) = R_off − R_Δ·x,  R_Δ = R_off − R_on,  (1)

dx/dt = k·i·f_w(x),  k = μ_v·R_on / D².  (2)
This work was supported by the Czech Science Foundation under grant
No 14-19865S – Generalized higher-order elements. Research described in
this paper was financed by the Czech Ministry of Education in the frame of
the National Sustainability Program under grant LO1401. For research, the
infrastructure of the SIX Center was used. The research was also supported by
the Project of Specific Research, K217 Department, UD Brno.
Fig. 1. Nonlinear function FI_w for transforming the normalized charge q into the normalized physical state variable x of the TiO2 memristor with rectangular window.

i = dq/dt.  (4)
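The charge-controlled form of the model (memristance as an explicit function of the accumulated charge, with the rectangular window realized by the saturating table function FI) can be sketched numerically. The following is our illustrative reconstruction using the parameter values from the appendices, not the authors' code; the time step, the forward-Euler charge integration, and the sinusoidal drive are our own choices:

```python
# Sketch of the charge-controlled HP memristor model: the native state is the
# charge q (dq/dt = i) and the memristance is Rmem = Roff - dR*FI(xini-0.5+k*q).
import numpy as np

Ron, Roff, Rini = 100.0, 10e3, 1e3   # ohms, as in the appendices
uv, D = 1e-14, 10e-9                 # dopant mobility, device thickness
k = uv * Ron / D**2
dR = Roff - Ron
xini = (Roff - Rini) / dR            # initial normalized state

def FI(x):
    # TABLE(x,-0.5,0,0.5,1): 0 below x = -0.5, 1 above x = 0.5, linear between
    return np.clip(x + 0.5, 0.0, 1.0)

def memristance(q):
    # Rmem = Roff - deltaR*FI(xini - 0.5 + k*q), as in the appendices
    return Roff - dR * FI(xini - 0.5 + k * q)

# Sinusoidal 1 mA / 1 Hz drive, as in Fig. 5; forward-Euler charge integration.
ts = 1e-4
t = np.arange(0.0, 2.0, ts)
i = 1e-3 * np.sin(2 * np.pi * t)
q = np.cumsum(i) * ts                # dq/dt = i  (eq. (4))
v = memristance(q) * i               # v = Rm(x)*i  (eq. (1))
```

At q = 0 the memristance equals Rini, and for large positive/negative charge the window saturates the resistance at Ron/Roff, which is exactly what keeps the model numerically safe at the state-variable borders.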
Fig. 2. Model of the HP memristor: (a) ideological, (b) for SPICE implementation.

III. BENCHMARK CIRCUIT

The benchmark circuit in Fig. 3 consists of 2MN + M + N identical memristors. Its complexity can be arbitrarily varied by selecting the parameters M and N. The signal source is applied between the in and ground terminals. The current source has been selected for the experiments described below, providing either a DC current or a sinusoidal waveform with adjustable amplitude, frequency and initial phase, with the voltage response being evaluated in both cases. For sinusoidal excitation, the fingerprints of v–i pinched hysteresis loops can be tested. The experiments were done in Micro-Cap 11 and LTspice on a PC with Intel(R) Core(TM) i7-3770 CPU @ 3.4 GHz and 16 GB RAM, Windows 7 Ultimate 64 bit.

A constant current flows into the in node. All the memristors have the same initial memristances. As is obvious from the current direction, all vertical/horizontal memristors will decrease/increase their resistances. The initial resistance of the in node is given by the resistance of the network with identical 6 kΩ resistors, whereas the final resistance is given by the network of vertical Ron and horizontal Roff resistors. The corresponding initial and final voltages are shown in Fig. 4. The non-monotonic transient can be achieved via selecting a proper value of Rini.

Fig. 4. Results of transient analysis of the circuit from Fig. 3 for M = 50, N = 30: voltage response to a constant current of 250 μA flowing into the in node; Rini = 6 kΩ, Ron = 100 Ω, Roff = 10 kΩ, step ceiling = 0.2 sec. (The plotted voltage Vin rises from 4.3898 V to 8.2585 V within the 20-second interval.)

Figure 5 provides the simulation results of the same circuit for sinusoidal excitation. It should be noted that the pinched hysteresis loop can exhibit crossing points also outside the v–i origin. This status can be adjusted via a proper selection of the initial memristance.

Fig. 5. Transient analysis of the circuit from Fig. 3 for M = 50, N = 30, driven by a sinusoidal 1 mA / 1 Hz current; Rini = 1.3 kΩ, Ron = 100 Ω, Roff = 10 kΩ; step ceiling = 0.02 sec.

The computation times for the transient analyses from Figs. 4 and 5 were approximately the same, ca 4 seconds. Similar analyses of more complicated circuits with M = N = 50, thus containing 5100 memristors, and M = N = 100 with 20200 memristors take ca 9 and 75 seconds, respectively.
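The figures quoted above can be cross-checked directly: the memristor count 2MN + M + N, and the DC input resistances implied by the Fig. 4 voltage levels at a 250 μA drive. This is a simple sketch with names of our own choosing; reading 4.3898 V as the initial and 8.2585 V as the final plateau is our interpretation of Fig. 4:

```python
# Cross-check of Section III: memristor count of the M x N benchmark grid,
# and the input resistances implied by the Fig. 4 voltage levels.

def memristor_count(M, N):
    # 2MN + M + N identical memristors, as stated for the benchmark circuit
    return 2 * M * N + M + N

I_dc = 250e-6                      # 250 uA constant current into the "in" node
R_initial = 4.3898 / I_dc          # input resistance before the transient
R_final = 8.2585 / I_dc            # input resistance after the transient

print(memristor_count(50, 50), memristor_count(100, 100))  # 5100 20200
```

The resulting resistances (ca 17.6 kΩ and 33.0 kΩ) bracket the initial 6 kΩ-per-element grid value and the final Ron/Roff mixture, consistent with the monotone rise of the response.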
IV. CONCLUSIONS

Computer experiments with extremely large circuits containing the proposed model of the HP memristor confirm the robustness of this model. It enables a fast and reliable analysis of circuits containing thousands of memristors. Comparable parameters are unattainable with classical memristor models, irrespective of whether they utilize window functions or Pickett-type physical models.

It follows from Appendix 1 that a variable resistance can be modeled in Micro-Cap directly via a resistor with formula-dependent resistance. A memristive port modeled in this way speeds up the analysis of large circuits containing memristors in comparison with the classical SPICE modeling via controlled sources, as applied in PSpice and LTspice (see Appendix 2). In some special cases, the latter approach can provide more accurate results: the accuracy of the first model is guaranteed only with a smaller step ceiling. Since this reduces the speed of the transient analysis, the advantage of the direct modeling of the memristance is rather questionable.
APPENDIX 1 – MEMRISTOR MACRO FOR MICRO-CAP 11

The macro consists of the memristive port RM with formula-defined resistance {Rmem} between the Plus (IN1) and Minus (IN2) terminals, and the charge-computing integrator Gq = I(RM), Cq = 1 shunted by the convergence resistor R2 = 1 GΩ.

.PARAMETERS(Ron=100,Roff=10k,Rini=1k)
.define uv 1e-14
.define k (uv*Ron/D**2)
.define D 10n
.define deltaR (Roff-Ron)
.define xini ((Roff-Rini)/deltaR)
.define FI(x) TABLE(x,-0.5,0,0.5,1)
.define Rmem {Roff-deltaR*FI(xini-0.5+k*v(q))}

APPENDIX 2 – SUBCIRCUIT FOR PSPICE AND LTSPICE

.SUBCKT MEMRISTOR_IDEAL In1 In2
+ PARAMS: RON=100 ROFF=10K RINI=1K
*
.FUNC FI(X)={TABLE(X,-0.5,0,0.5,1)}
.PARAM K={(UV*RON/D**2)} D=10N UV=1E-14
+ XINI={(ROFF-RINI)/DELTAR} DELTAR={ROFF-RON}
*
GQ 0 q VALUE = {I(GM)}
CQ q 0 1
R2 q 0 1G
*
GM IN1 IN2 VALUE =
+ {V(IN1,IN2)/(ROFF-DELTAR*FI(XINI-0.5+K*V(Q)))}
R3 IN1 IN2 1G
.ENDS MEMRISTOR_IDEAL

REFERENCES

[1] Y. V. Pershin and M. Di Ventra, "Solving mazes with memristors: A massively parallel approach," Phys. Rev. E, vol. 84, p. 046703, 2011.
[2] Z. Ye, S. H. M. Wu, and T. Prodromakis, "Computing shortest paths in 2D and 3D memristive networks," arXiv:1303.3927v1 [physics.comp-ph], 15 Mar 2013.
[3] D. Biolek, Z. Biolek, V. Biolkova, and Z. Kolka, "Reliable modeling of ideal generic memristors via state-space transformation," Radioengineering, vol. 24, no. 2, pp. 393–407, 2015.
[4] M. D. Pickett, D. B. Strukov, J. L. Borghetti, J. J. Yang, G. S. Snider, D. R. Stewart, and R. S. Williams, "Switching dynamics in titanium dioxide memristive devices," Journal of Applied Physics, vol. 106, p. 074508, 2009.
[5] H. Abdalla and M. D. Pickett, "SPICE modeling of memristors," in Proc. IEEE Int. Symp. Circuits and Systems, Rio de Janeiro (Brazil), pp. 1832–1835, 2011.
[6] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," Nature, vol. 453, pp. 80–83, 2008.
[7] Z. Biolek, D. Biolek, and V. Biolkova, "SPICE model of memristor with nonlinear dopant drift," Radioengineering, vol. 18, no. 2, pp. 210–214, 2009.
[8] D. Biolek, Z. Biolek, V. Biolkova, and Z. Kolka, "Modeling of TiO2 memristor: from analytic to numerical analyses," Semiconductor Science and Technology, vol. 29, no. 12, p. 125008, 2014.
Stability of Digitally Emulated Mem-Elements
Abstract—The paper analyzes the stability issue of a special emulator of memristive, memcapacitive and meminductive systems, as well as of higher-order elements from Chua's periodical table and other two-terminal devices. It is demonstrated that, in order to provide stable behavior, the analog port of the emulator must be implemented by a controlled voltage or current source, depending on the type of the emulated two-terminal device. As a practical consequence, the emulator must contain both of the above analog interfaces in order to be universal.

Keywords—memristor; memcapacitor; meminductor; emulator

I. INTRODUCTION

Memristors [1] and memristive systems [2] belong to promising building blocks for constructing future computer systems [3] and bio-inspired electronics [4]. In addition to this potential application area, there are also many studies dealing with analog applications of memristors [5], drawing on the analog nature of the memristor. These novel circuit ideas, utilizing the unique features of memristors, cover various areas of analog signal processing commonly used in communication circuits, such as oscillators [6] and relaxation generators [7], rectifiers [8], electronically controlled filters [9], controllers [10] and amplifiers [11], modulators and demodulators [12], A/D and D/A converters [13], and systems for image [14] and audio [15] signal processing.

Re-designing the above systems from the conventional to the memristive platform is inconceivable without the corresponding experimental work. However, there is a lack of real mem-devices on the market which could be useful for such experimenting. As a result, usually the only feedback for the designers is computer simulation [16], with all its obvious and well-known limitations in comparison with live experiments.

In addition to memristors and memristive systems, memcapacitors, meminductors, memcapacitive and meminductive systems [17], and general higher-order elements (HOEs) [18] from Chua's periodical table [19] also come on the scene. The lack of useful aids for experimenting is yet worse than for memristive systems.

In comparison with computer simulations, the hardware emulation of memristive systems [20] can be an important step towards laboratory experiments with newly designed memristor-based applications. Such emulators can concurrently stimulate the interest of students and designers who are not familiar with unconventional electronic components. With regard to the variety of potentially useful unconventional circuit elements, there is a need for mimicking the behavior of arbitrary devices, the memcapacitor, meminductor and HOEs being a small subset of them.

Many circuit ideas for emulators of the above systems have been published so far. Their survey is described, for example, in [20], [21], [22]. All of them can be classified as analog or hybrid (i.e. with a digitally-programmed analog interface), and as single-purpose or universal. The ambition of emulating an arbitrary component flexibly can only be accomplished by using a hybrid universal emulator. The first step towards its implementation was described in [11]: the resistance of a digital potentiometer is controlled by a microcontroller depending on the program which evaluates the differential state equation of a considered memristive system. However, the use of the potentiometer for emulating the analog port limits the applications to memristive devices alone. In order to mimic also memcapacitive, meminductive and other systems, the so-called memulator was proposed in [21]. The analog port is emulated via a controlled voltage or current source. The way of the control depends on the mathematical model between the port voltage and current of the emulated two-terminal device.

With regard to the duality of the voltage and current sources, it would seem that such a memulator could be of an economical construction, providing only one type of the above controlled sources. However, practical experiments with the memulator revealed serious problems with the stability of the emulated devices. It turns out, for example, that memcapacitive/meminductive devices cannot be satisfactorily emulated via current/voltage sources. This stability issue is analyzed in depth in Section II of this paper. It is clearly shown that the universal hybrid emulator must use both a controlled voltage and a controlled current source for mimicking devices of arbitrary nature. Section III presents experimental results with real hardware demonstrating the usefulness of this approach.

II. STABILITY ANALYSIS OF EMULATED ELEMENTS

A. Hybrid Emulator

The hybrid emulator can be configured either as a voltage-controlled current source or as a current-controlled voltage source. Modern integrated DC-DC converters, such as the ADuM5000, allow designing the device as fully floating with a parasitic capacitance in the order of several picofarads. Fig. 1 shows the block diagram of a designed emulator, which is
This work was supported by the Czech Science Foundation under grant
No 14-19865S – Generalized higher-order elements. The research is a part of
the COST Action IC1401 and is financially supported by the Czech Ministry
of Education under grant No LD15033. Research described in this paper was
financed by the Czech Ministry of Education in the frame of the National
Sustainability Program under grant LO1401. For research, the infrastructure
of the SIX Center was used. The research was also supported by the Project of
Specific Research, K217 Department, UD Brno.
D/A converter. The MCU runs a discretized system of differential-algebraic equations of the emulated element between the terminals A and B.

Let us consider an emulator embedded in a continuous-time electrical network, which closes a feedback loop. Generally, the loop also contains an antialiasing linear filter at the input of the A/D converter and a reconstruction filter at the output of the D/A converter.

v_{n+1} = v_n·e^(−t_s/τ_F) + i_n·R·(1 − e^(−t_s/τ_F)),  (1)

where τ_F is the filter time constant. If we denote

α = e^(−t_s/τ_F),  (2)

then the z-transform of (1) is

Vz = αV + (1 − α)RI.  (3)

τ_F | α | stability condition
0 (no filtering) | 0 | R < R_M

Let us note that using τ_F = 10 t_s significantly decreases the emulator bandwidth.

The same analysis performed for a current-controlled emulator leads to the inverse of condition (11):

R/R_M > 1.  (12)

D. Meminductive System

The analysis of the meminductive system is based on the same assumption which was used for the slow dynamics of the memristive system (5), (6). We will analyze an emulated inductor L_M:

i = i(0) + (1/L_M)·∫₀ᵗ v(τ)dτ,  (13)

where i(0) is the initial condition. Applying the simple rectangle integration rule, the discretized version of (13) in the time domain is

i_n = i_{n−1} + (t_s/L_M)·v_{n−1}.

Taking into consideration the sampling process of the emulated system dynamics, it makes sense to analyze the stability for β < 1. Fig. 4 shows the magnitudes of both roots of (16) for the same filter scenario as for the resistive system (see Table I). In all cases we obtain stable behavior for β < 1.

E. Memcapacitive System

Similarly to both previous systems, we will analyze an emulated capacitor C_M:

i = C_M·dv/dt.  (18)

Applying the simple forward rule for differentiating, the discretized version of (18) is

i_n = C_M·(v_n − v_{n−1})/t_s.  (19)

Analogously to the inductive system, we obtain the characteristic equation combining (19) and (4) as

1 + β(1 − α)(z − 1)/(z(z − α)) = 0.  (20)

f_t(i) = b(i − I_t) for i > I_t;  b(i + I_t) for i < −I_t;  0 otherwise.  (24)

Fig. 6. Test circuit for memristor and meminductor and experiment setup.
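The stability mechanism analyzed above can be illustrated with a small numerical sketch. This is our reconstruction under the stated model, not the authors' code: the port update v_{n+1} = α·v_n + (1 − α)·R·i_n of equation (1), closed by a loop through an emulated resistance R_M (with the sign convention i_n = −v_n/R_M assumed by us). With no filtering (α = 0) it reproduces the Table I condition R < R_M:

```python
# Closed-loop pole of the sampled emulator loop: the port update
#   v[n+1] = alpha*v[n] + (1 - alpha)*R*i[n]
# closed by the emulated resistor i[n] = -v[n]/RM gives
#   v[n+1] = (alpha - (1 - alpha)*R/RM) * v[n],
# which is stable iff the pole magnitude is below 1.

def pole(alpha, R, RM):
    return alpha - (1.0 - alpha) * R / RM

# No antialiasing filter (tau_F = 0, so alpha = 0): stable iff R < RM.
assert abs(pole(0.0, 900.0, 1000.0)) < 1.0
assert abs(pole(0.0, 1100.0, 1000.0)) > 1.0

# For R < RM the loop remains stable for any filter constant 0 <= alpha < 1.
assert all(abs(pole(a, 900.0, 1000.0)) < 1.0 for a in (0.0, 0.3, 0.6, 0.9))
```

The same one-pole picture explains why the source type matters: swapping the roles of the controlled voltage and current sources inverts the loop gain, turning R < R_M into R > R_M, as in condition (12).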
Abstract—Research and commercial efforts are currently addressing challenges and providing solutions in cloud computing. Business models are emerging to address different use scenarios of cloud computing. In this paper, we present a virtual enterprise (VE) model of cloud computing to enable Small, Medium and Micro Enterprises (SMMEs) to respond quickly to customers' demands and market opportunities, thereby enabling SMMEs through cloud utility infrastructure to gain the agility and flexibility needed for business success. In our virtual enterprise model, temporary co-operations are created to realize the value of a short-term business opportunity that the partner SMMEs cannot (or can, but only to a lesser extent) capture on their own. This model is based on the realization that it is not economically viable for SMMEs to acquire their own private cloud infrastructure or even subscribe to public cloud services as single entities. The pricing model obtained from our proposed business model shows the benefits that are derived from using the VE cloud model over subscription to a public cloud as a single business enterprise. The pricing structure of our VE cloud model is up to 17.82 times more economical than the equivalent Amazon EC2 instance-type pricing model.

Keywords—Cloud computing; service computing; SMMEs; Virtual Enterprise

I. INTRODUCTION

Cloud computing is a computing paradigm in which every layer of computing, from infrastructure to application, is a service. It enables the usage of computing hardware and software belonging to a "third party", thus lowering the cost of ownership of computing and enabling mobile computing [1], [2], [3], [4], [5], [6], [7]. Adoption of cloud computing as a utility infrastructure lowers the total cost of doing business for Small, Micro and Medium Enterprises (SMMEs). SMMEs contribute to economic growth and the promotion of equitable development. The employment potential of SMMEs at low capital cost has been the major advantage of the sector. The employment intensity of the SMME sector is much higher than that of large enterprises. SMMEs constitute over 90% of total enterprises in most economies, are credited with generating the highest rates of employment growth, and account for a major share of industrial production and exports [31, 32, 50].

What appear to be the major constraints to the development of SMMEs in many developing countries are limited access to finance, technology, markets and management skills. Access to and awareness of business information is also a main constraint to the development and growth of SMMEs in developing economies. Limited access to the information resources needed to start, survive and grow is one of the challenges faced by SMMEs in enterprise development [33].

In the current e-business environment, individual enterprises, including SMMEs, cannot survive on their own. It is crucial that SMMEs engage effectively with their partners and customers. These enterprises require a certain way of e-business interaction with their partners. The virtual enterprise (VE) business concept, also known as the networked enterprise, consists of distributed business functions and utilities, outsourced to partners that work with the firm to deliver the product to end customers. The VE model is one such business environment that can facilitate cloud computing for SMMEs. Emerging technologies, including cloud computing, have the potential to transform and automate the business processes of SMMEs and enable them to engage with trading partners and customers in global networks [34]. Our VE model of cloud utility infrastructure attempts to enable SMMEs to take advantage of cloud computing in a VE business model.

The networked enterprise model is identical to the Grid-based Utility Infrastructure for SMMEs-enabled Technology (GUISET) project ([8], [9]). GUISET is modelled to provide a computing utility infrastructure for small and medium enterprises, as well as the average rural dweller of African communities, from a cooperative/networked enterprise viewpoint [8]. Cost is a key constraint for African users of technology, thus our business model of VE for cloud computing pays specific attention to the cost of the computing utility [2], [10]. Also, because a larger percentage of users of the proposed VE model use mobile devices, its design addressed challenges associated with generic cloud
B. Model prototyping

In this section we present a simulated prototype of the proposed model. The cloud deployment is illustrated in terms of automation, identity, permissions and delegation, openness and choice.

1) Basic assumptions of the simulation

In developing our simulation, the following assumptions are considered, bearing in mind the duration of the project:

● The cloud infrastructure is running, services are deployed and service consumers request services.

● Services are consumed by users on a pay-per-use basis; hence maintenance, scalability and infrastructure are the third party's concern.

● VMs are created and customized per users' requests and requirements.

2) Description of the simulation

The scenario described in chapter four is considered in simulating our model. End-consumer requirements, infrastructure capabilities and the cloud service catalogue that operate in fulfilling a particular request were considered. In order to fulfil a request, end-user requirements are the deciding factor. The cloud is scaled according to the number of users.

3) Simulation environment

The simulation of our model was done by installing Nimbula Director. "Nimbula Director is an automated cloud management system which allows customers to easily repurpose their existing infrastructure and build a private computing cloud in the safety of their own data centre." This cloud deployment model is appropriate for our architecture, since the medium-sized enterprises will be forming a cloud using their existing infrastructure to become cloud SPs for very small enterprises. To install a Nimbula Director site, we needed a minimum setup of 3 machines and a seed node machine with a DVD. The machines comprising the Nimbula Director cluster(s) need to comply with the prescribed hardware and software requirements; the machines we used meet this standard.

The Nimbula Director UI is divided into a top pane and a bottom pane; this allows you to create, modify and destroy objects. The five main tasks at the top pane perform the following:

● User management: create and manage groups and their permissions.

● Image list: contains a persistent list of machine images that can be used to keep track of different versions of a machine image.

● A machine image is a VM template that you can launch into a running machine instance.

● Virtual network: allows you to create and manage VEthernets and VDHCP servers. VEthernets are virtual layer-2 networks that provide isolation without the complexity of using VLANs.

● VDHCP servers can be created for each VEthernet to dynamically assign IP addresses to VM instances running in that VEthernet.

● Network security list: lets you configure a built-in distributed firewall for isolating instances and regulating traffic in and out of the cloud, dynamically configured and independent of the underlying network.

● Instance management: allows you to view and launch machine images into running machine instances.

A running VM instance created by the SMME cloud administrator was used for our demonstration. The VM instance is created using quick launch, via the instance management task, where an instance can be launched and viewed. Instances are customizable according to user requirements. The administrator verifies that the new VM instance has been launched and is running. The Web interface displays details such as the image list, state, placement requirements, etc. for the new VM instance. An IP address to use to connect to the VM instance is also provided (i.e. VM: 172.18.1.18). Before creating a VM instance, the SMME administrator creates the user SMME under the SMME customer.

VII. COST SAVING EVALUATION

In evaluating the performance of our model, we have assessed it according to the utility evaluation of an SMME, to see if the utility requirements are fulfilled. Cost saving is the basic requirement of our VE-enabled cloud model. We compared our model with Amazon Elastic Compute Cloud (EC2) pricing, using the Standard On-Demand EC2 cost model. EC2 has a number of pricing models (Amazon.com, http://aws.amazon.com/ec2/pricing/). The Standard On-Demand model is the pricing model equivalent to the cloud infrastructure in this work: the user pays for compute capacity by the hour, with no long-term commitments or upfront payments. Our pricing starts at $300 per year per processor core, including support and maintenance. This is equivalent to our model. Our model is based on Nimbula Director, where the software price is based only on the number of physical processor cores on which it runs (i.e. the bigger the physical infrastructure, the more you pay, because you have more cores). This is the same criterion for similar EC2 pricing models. However, for the configuration and proposed model in this research, the Standard On-Demand pricing model is the ideal comparable model. The comparison of the pricing model of the proposed VE-enabled Cloud Enterprise Architecture for SMMEs is therefore based on the EC2 Standard On-Demand pricing model. This does not suggest rigidity in the pricing model of the proposed architecture; the analysis is done to show the cost-saving capability of the model.

Cloud providers provide four basic cost models, 1–4, as stated in [36]:

Cost model for data storage: size(total) × t_sub × cost(storage), where t_sub is the subscription time;

Cost model for a computational machine: cost(machine);

Cost model for data transfer into the cloud: cost(transfer_in); and

Cost model for data transfer out of the cloud: cost(transfer_out).

Amazon EC2 provides the flexibility to choose from a number of different instance types to meet flexible computing needs (see TABLE I). Each instance provides a predictable amount of dedicated compute capacity and is charged per instance-hour consumed. The standard instance type has memory-to-CPU ratios suitable for most general-purpose applications.

TABLE I. THE EC2 STANDARD INSTANCE TYPES
The Nimbula configuration equivalent to the highest EC2 standard instance was used in our prototype. This is shown in TABLE II.

TABLE II. NIMBULA DIRECTOR INSTANCE TYPE USED FOR OUR VE-ENABLED CLOUD

To illustrate the cost estimation, we examined the case of the VE-Enabled Cloud Enterprise Architecture using the Nimbula Director instance type in Table 3 and the Amazon instance type in Table 4. TABLE III and Table 4 show the estimated costs based on the instance types obtained in our private cloud and in Amazon EC2, respectively. TABLE III shows a Linux-based double extra-large instance in Amazon EC2. The configuration we used is equivalent to a double extra-large EC2 machine instance type. The price of the EC2 instance-type configuration is 17.82 times more expensive than the equivalent VE-cloud configuration in the proposed architecture. This is a huge saving for the SMMEs who are the target users of the proposed architecture.

TABLE III. AMAZON EC2 INSTANCE TYPE WITH HIGH-MEMORY DOUBLE EXTRA LARGE

Instance | Type | Name | EC2 Compute units (ECU) | Virtual cores | Memory | Instance store volumes | Platform | I/O | Price
Standard on-demand instances | High-memory double | M2.2xlarge | 13 | 4 (with 3.25 ECUs each) | 34.2 GiB | 840 GiB (1 x 840 GiB) | 64-bit | High | $0.640 per hour
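The four cost models from [36] and the yearly EC2-versus-VE comparison can be sketched as follows. The function names and the single-core example are our own assumptions; the exact 17.82x figure reported above depends on the configurations in Tables II and III and is not reproduced here:

```python
# Hedged sketch of the cost models from [36] and the EC2-vs-VE comparison.

def storage_cost(size_total_gb, t_sub, cost_storage):
    # size(total) x t_sub x cost(storage), where t_sub is the subscription time
    return size_total_gb * t_sub * cost_storage

def ec2_yearly_cost(price_per_hour, hours_per_year=24 * 365):
    # Standard On-Demand: pay per instance-hour, no upfront commitment
    return price_per_hour * hours_per_year

def ve_yearly_cost(cores, price_per_core_year=300.0):
    # Nimbula-style licensing: price scales with physical processor cores
    return cores * price_per_core_year

ec2 = ec2_yearly_cost(0.640)   # M2.2xlarge from TABLE III: ca $5606 per year
ve = ve_yearly_cost(1)         # assumed single-core example: $300 per year
print(ec2, ve, ec2 / ve)
```

The design point is visible immediately: the per-hour EC2 price accumulates continuously, whereas the per-core license is a flat yearly fee, so the saving grows with utilization.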
[2] Xinhui L, Ying L, Tiancheng L, Jie Q, Fengchun W. The method and tool of cost analysis for cloud computing. IEEE International Conference on Cloud Computing. 2009; CLOUD '09 (DOI: 10.1109/CLOUD.2009.84): 93–100.
[3] Fernando N, Loke SW, Rahayu W. Mobile cloud computing: A survey. In press, Future Generation Computer Systems. 2013; 29: 84–106.
[4] Seneviratne A, Thilakarathna K, Petander H, Wasalthilake D. Moving from clouds to mobile clouds to satisfy the demand of mobile user generated content. IEEE 5th International Conference on Advanced Networks and Telecommunication Systems (ANTS). 2011 (DOI: 10.1109/ANTS.2011.6163645): 1–4.
[5] Buyya R, Yeo CS, Venugopal S, Broberg J, Brandic I. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems. 2009; 25: 599–616.
[6] Angeli D, Masala E. A cost-effective cloud computing framework for accelerating multimedia. In press, J. Parallel Distrib. Comput. 2012 (doi:10.1016/j.jpdc.2012.06.005).
[7] Wang X, Wang B, Huang J. Cloud computing and its key techniques. IEEE International Conference on Computer Science and Automation Engineering (CSAE). 2011; 2: 404–410.
[8] Adigun MO, Emuoyibofarhe OJ, Migiro SO. Challenges to access and opportunity to use SMME enabling technologies in Africa. 1st All Africa Technology Diffusion Conference, Johannesburg, South Africa. June 12-14, 2006; 1: 34–43.
[9] Ekabua OO. Change impact analysis model-based framework for service provisioning in a grid environment. PhD thesis, Department of Computer Science, University of Zululand, South Africa; 2009.
[10] Rodero-Merino L, Caron E, Muresan A, Desprez F. Using clouds to scale grid resources: An economic model. Future Generation Computer Systems. 2012; 28: 633–646.
[11] Rimal BP, Jukan A, Katsaros D, Goeleven Y. Architectural requirements for cloud computing systems: An enterprise cloud approach. J Grid Computing. 2011; 9 (DOI 10.1007/s10723-010-9171-y): 3–26.
[12] Sun D, Chang G, Sun L, Wang X. Surveying and analyzing security, privacy and trust issues in cloud computing environments. Procedia Engineering. 2011; 15 (doi:10.1016/j.proeng.2011.08.537): 2852–2856.
[13] Tom T-h, Mohammed S. Toward designing a secure biosurveillance cloud. J Supercomput. 2011; Springer Science+Business Media, LLC (DOI 10.1007/s11227-011-0709-y).
[14] Mohamed EM, Abdelkader HS, EI-Etriby S. Enhanced data security model for cloud computing. The 8th International Conference on INFOrmatics and Systems (INFOS2012). 2012; 14-16 May: CC12–CC17.
[15] Lin Y-K, Chang P-C. Performance indicator evaluation for a cloud computing system from QoS viewpoint. Qual Quant. 2011; Springer Science+Business Media B.V. (DOI 10.1007/s11135-011-9613-z).
[16] Mauch V, Kunze M, Hillenbrand M. High performance cloud computing. Elsevier Future Generation Computer Systems. In press (doi:10.1016/j.future.2012.03.011).
[21] Lin C. A novel college network resource management method using cloud computing. Elsevier Physics Procedia. 2012; 24: 2293–2297.
[22] Maurer M, Brandic I, Sakellariou R. Adaptive resource configuration for cloud infrastructure management. Future Generation Computer Systems. 2012; Accepted manuscript (doi:10.1016/j.future.2012.07.004).
[24] Espadas J, Molina A, Jiménez G, Molina M, Ramírez R, Concha D. A tenant-based resource allocation model for scaling Software-as-a-Service applications over cloud computing infrastructures. Future Generation Computer Systems. 2012; In press (doi:10.1016/j.future.2011.10.013).
[25] Stanoevska-Slabeva K, Wozniak T, Hoyer V. Practical guidelines for evolving IT infrastructure towards Grids and Clouds. In: Stanoevska-Slabeva K (ed.) Grid and Cloud Computing: A Business Perspective on Technology and Applications. 1st ed. Berlin Heidelberg: Springer-Verlag; 2010. p. 225–243.
[26] Ranjan R, Zhao L. Peer-to-peer service provisioning in cloud computing environments. J Supercomput. 2011; Springer Science+Business Media, LLC (DOI 10.1007/s11227-011-0710-5).
[27] Martins J, Pereira J, Fernandes SM, Cachopo J. Towards a simple programming model in cloud computing platform. In Proceedings of the First International Symposium on Network Cloud Computing and Applications. 2011; IEEE Computer Society (DOI 10.1109/NCCA.2011.21): 83–90.
[28] Pang J, Cui L, Zheng Y, Wang H. A workflow-oriented cloud computing framework and programming model for data intensive application. Proceedings of the 15th International Conference on Computer Supported Cooperative Work in Design. 2011: 356–361.
[29] Tang K, Zhang JM, Feng CH. Application centric lifecycle framework in cloud. Eighth IEEE International Conference on e-Business Engineering. 2011; IEEE Computer Society (DOI 10.1109/ICEBE.2011.32): 329–334.
[30] Iqbal W, Dailey MN, Carrera D, Janecek P. Adaptive resource provisioning for read intensive multi-tier applications in the cloud. Future Generation Computer Systems. 2011; 27 (doi:10.1016/j.future.2010.10.016): 871–879.
[31] Mobileeservices.org. Research themes. Centre for Mobile e-Services, University of Zululand. http://mobileeservices.org/themain/node/10 (accessed 07 August 2012).
[32] Aremu MA, Adeyemi SL (2011). Small and Medium Scale Enterprises as a survival strategy for employment generation in Nigeria. Journal of Sustainable Development, 4(1), 200–206. Accessible at http://ccsenet.org/journal/index.php/jsd/article/view/9240. Last accessed Tuesday, 27 November 2012.
[33] Sharma M, Mehra A, Jola H, Kumar A, Misra M, Tiwari V (2011). Scope of cloud computing for SMEs in India. Journal of Computing 2(5): 144–149. Accessible at http://arxiv.org/ftp/arxiv/papers/1005/1005.4030.pdf. Last accessed Tuesday, 27 November 2012.
[34] Abor PA, Abekah-Nkrumah G, Sakyi K, Adjasi CKD, Abor J (2011). The socio-economic determinants of maternal health care utilization in Ghana. International Journal of Social Economics, Vol. 38, Iss. 7, pp. 628–648. DOI: 10.1108/03068291111139258.
[35] Dai W (2009). The impact of emerging technologies on small and medium enterprises (SMEs). Journal of Business Systems, Governance and Ethics 4(4): 53–60. Accessible at http://www.jbsge.vu.edu.au/issues/vol04no4/Dai.pdf. Last accessed on Tuesday, 27 November 2012.
[36] Mantri S, Nandi G, Kumar S, Kumar (Eds.). High performance architecture and grid computing (pp. 4-10). Proceedings of the International Conference, HPAGC 2011, Chandigarh, India. Series: Communications in Computer and Information Science, Vol. 169.
[37] GTSI Solutions (2009). Building a framework for successful transition. White Papers/Cloud Computing. Retrieved from http://www.gtsi.com/cms/documents/White-Papers/Cloud-Computing.pdf.
[38] Bloomberg J, Schmelzer R (2006). The Business Inflexibility Trap. In: Service Orient or Be Doomed: How Service Orientation Will Change Your Business (pp. 1-11). Retrieved from http://www.amazon.com/Service-Orient-Be-Doomed-Orientation/dp/0471768588.
[39] Arteta BM, Giachetti RE (2004). A measure of agility as the complexity of the enterprise system. Robotics and Computer Integrated Manufacturing: 13th International Conference on Flexible Automation and Intelligent Manufacturing, 20(6), 495–503. Retrieved from http://www.sciencedirect.com/science/article/pii/S0736584504000717.
[40] Camarinha-Matos LM (2005). Brief historical perspective for virtual organizations. In: Virtual Organizations: Systems and Practices.
[41] Sobotta A (2008). Enhancing agility promoting benefits of service-orientation with utility computing. Master's dissertation, Copenhagen, Denmark: IT University of Copenhagen (ITU). Retrieved from http://gotze.eu/wp-content/uploads/2009/03/adrian-sobotta-master-thesis.pdf.
[42] Davis E (2007). What is on American minds? Management Review, April, 14–20.
[43] Bloomberg J, Schmelzer R (2006). The Business Inflexibility Trap.
Service Orient Or Be Doomed: How Service Orientation will Change
your Business How Service Orientation will change your Business (pp. 1-
11). (Original work published 2006). Retrieved from
Amazon/http://www.amazon.com/Service-Orient-Be-Doomed-
Orientation/dp/0471768588.
[44] Bharadwaj, A. S (2000). A resource-based perspective on information
technology capability and firm performance: An empirical investigation.
MIS Quarterly Executive, 24(1), 169-196.
[45] Leonard-Barton, D. (1992). Core Capabilities and Core Rigidities: A
Paradox in Managing New Product Development. Strategic Management
Journal, 13(5), 112-125.
[46] Overby, E., Bharadwaj, A., Sambamurthy, V. (2006). Enterprise agility
and the enabling role of information technology. European Journal of
Information Systems, 15(2), 120.
[47] Zhang, L., Zhou, Q. (2009). Cloud computing open architecture.
Proceedings of the IEEE International Conference on Web Services,
ICWS 2009, 607-616. Retrieved from IEEE Digital
library/http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5175875&
url=http%3A
[48] Weil, P., Broadbent, M. (1998). Leveraging the new infrastructure: How
market leaders capitalize on information technology. Boston: Harvard
Business School Press.
[49] Conboy, K., Fitzgerald, B. (2004). a study of agility in different
disciplines. Proccedings of the 2004 ACM Workshop in Interdisciplinary
Software Engineering Research, 37-44.
[50] Katua, N. T. (2014). The Role of SMEs in Employment Creation and
Economic Growth in Selected Countries. International Journal of
Education and Research Vol. 12 (2), 46-472
Agent Development Platforms for Bioinformatics
Abstract—During the past decade, Molecular Biology wet labs and sequencing activities have been generating a vast amount of data, causing an increase in the adoption of new methods and tools for identification and analysis. In particular, the use of software agent technology in bioinformatics applications has been on the rise, due to some of its unique features such as autonomy and remote execution capabilities. However, multiple software agent development platforms now exist, and their potential application to bioinformatics requires an assessment based on their inherent features. We therefore present a comparative analysis of the main available platforms, focusing on standard compatibility, communication, mobility, security policy, availability, usability and development issues. Our results, based on a scoring system, demonstrate that the JADE platform is the most appropriate and promising one for tackling bioinformatics problems.

Keywords—Software Agent; Development Platform; Bioinformatics.

I. INTRODUCTION

Bioinformatics is an interdisciplinary subject involving Molecular Biology, Computing and Statistics to collect and analyze biochemical and biological data. The major areas in this field are sequence analysis, gene expression, protein expression, structural bioinformatics, network and systems biology, data mining and gene prediction, among others. Bioinformatics has in recent years generated an avalanche of data, so the development and use of unsupervised and automated analysis tools for these data became imperative.

All living organisms are made up of cells. Most of these cells contain hereditary material, the chromosomes (DNA and RNA).

Fig 1: Major parts of a DNA: genes (exons) and non-coding regions (introns)

To be able to extract, sequence and analyze DNA, appropriate tools, hardware and software are essential in Bioinformatics. During wet lab experimentation, DNA is extracted and purified through a number of steps and then sequenced.

Fig 2: Major steps in DNA Extraction and purification

DNA is composed of four major bases (Adenine, Guanine, Cytosine and Thymine). The exact order of these bases determines the gene, and the latter determines the characteristics, structure and function of proteins. Determining the exact sequence of these bases is one of the most fundamental tasks in Molecular Biology and Bioinformatics. The next logical step is to determine a gene's function from its underlying sequence. New-generation sequencing technologies generate 'Big Data' which need to be analyzed. 'Big Data' possesses four essential characteristics, the four popular V's: volume (amount) of data, data processing velocity, inter- and intra-variability of data sources, and veracity of the data quality. Usually these data are very diverse, complex (network interactions and dynamics mean data can no longer be analyzed in isolation), and often span several scales. For example, the Human Genome Project required the know-how, technology and human resources of twenty different institutions, spread over 13 years of work with a budget of more than three billion dollars, to determine the whole genome structure, which consists of approximately three billion nucleotides. The same task can now be done within three days, thanks to high-throughput genomics. Specific bioinformatics activities are employed for specific purposes.

According to the existing literature, system integration seems to be the most appropriate solution to deal with this avalanche of data. Three essential aspects of system integration are distribution, autonomy and heterogeneity. The development of software agents, as well as their wide usage, requires a good underlying infrastructure. However, simply having an infrastructure is usually not enough, as user acceptance of the infrastructure depends on ease of application
B. Sequence Alignment
Sequence comparison is essential in bioinformatics analysis. It is of paramount importance for the functional and structural analysis of newly discovered sequences. The JADE platform is the best suited for this task due to its use of the behaviour model, which greatly simplifies the creation of sequence alignment algorithms. Aglets and Anchor use the event model, which is more suitable than the procedural model implemented by Grasshopper.
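As an illustration of the sequence comparison task itself (a sketch added here, not code from the paper), the classic dynamic-programming approach to pairwise global alignment is Needleman-Wunsch; a minimal scorer might look like this, with the scoring parameters chosen purely for illustration:

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Minimal Needleman-Wunsch global alignment score (illustrative sketch)."""
    # dp[i][j] = best score aligning a[:i] with b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        dp[i][0] = i * gap          # leading gaps in b
    for j in range(1, len(b) + 1):
        dp[0][j] = j * gap          # leading gaps in a
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag,                  # align a[i-1] with b[j-1]
                           dp[i - 1][j] + gap,    # gap in b
                           dp[i][j - 1] + gap)    # gap in a
    return dp[len(a)][len(b)]
```

In an agent-based setting, such a routine would typically run inside an agent's behaviour (JADE) or event handler (Aglets, Anchor), which is why the platform's execution model matters for this task.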
C. Gene Prediction
Gene prediction is a prerequisite for detailed functional annotation of genes and genomes. Current gene prediction methods can be categorized into two major classes: ab initio-based and homology-based approaches. Both approaches adopt the sequence alignment technique as a step in the prediction procedure; therefore their suitability ratings are similar to those in the sequence alignment column.
According to the evaluation, the JADE platform obtained the highest score amongst the evaluated platforms, making it the most suitable platform for the three main bioinformatics tasks, namely data mining, sequence alignment and gene prediction.
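The scoring-based comparison can be sketched as follows; the platform names are those evaluated in the paper, but the criteria labels and numeric ratings here are hypothetical placeholders, not the paper's actual scores:

```python
# Hypothetical per-criterion ratings (0-3); the real values are in Table 1
# of the paper, these placeholders only illustrate the scoring mechanism.
CRITERIA = ["standards", "communication", "mobility", "security",
            "availability", "usability", "development"]

def total_scores(ratings):
    """Sum each platform's per-criterion ratings into a single score."""
    return {p: sum(r.get(c, 0) for c in CRITERIA) for p, r in ratings.items()}

ratings = {
    "JADE":        {"standards": 3, "communication": 3, "mobility": 2,
                    "security": 2, "availability": 3, "usability": 3,
                    "development": 3},
    "Aglets":      {"standards": 1, "communication": 2, "mobility": 3,
                    "security": 2, "availability": 2, "usability": 2,
                    "development": 2},
    "Grasshopper": {"standards": 2, "communication": 2, "mobility": 2,
                    "security": 2, "availability": 1, "usability": 2,
                    "development": 2},
}
totals = total_scores(ratings)
best = max(totals, key=totals.get)  # with these placeholder scores: "JADE"
```

The point of the sketch is the mechanism, not the numbers: each platform is rated per criterion and the platform with the highest aggregate wins.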
Table 1: Qualitative Comparison among Agent Development Platforms
Confused which Educational Video to Choose?
Appropriateness of YouTube Videos for Instructional Purposes: Making the Right Choice
Abstract—The ubiquity of computers and increasing Internet connectivity is leading to the emergence of innovative ways of learning, such that terms like mixed-ability teaching, independent and flexible learning, and interactive lessons are becoming buzzwords for the learning society. The availability of an enormous variety of free online resources poses problems when it comes to picking out the relevant, updated and appropriate ones, be it for general use or for educational purposes. YouTube EDU offers a learning platform with hundreds of thousands of instructional materials, only a mouse click away. In an attempt to spot the best videos, many are guided by the number of views and likes. This study investigated the appropriateness of this selection approach. Media specialists and video producers brainstormed, discussed and agreed on a list of 16 criteria, which were then applied to 30 videos dealing with three subjects of Mathematics, namely Fractions, Sets and Integers. Each video was attributed a suitability video index (SVI) on a scale of 0-48. The study revealed that while the number of views and likes can be a first step towards selecting appropriate videos, a more scientific approach needs to be adopted, and this study has compiled a set of carefully selected criteria to assess the suitability of videos for instructional purposes. These criteria can be used as important guidelines for video makers wishing to produce sound pedagogical instructional videos.

Keywords—YouTube Education, innovative video technology, learning mathematics

I. INTRODUCTION

Created in February 2005 by Chad Hurley, Steve Chen, and Jawed Karim, three former PayPal employees, YouTube provides a public-access Web-based platform that allows people to upload, view and download videos, with the possibility of sharing video clips on the Internet through other websites, mobile devices, blogs and emails [1]. With more than 1 billion users every day, people spend hundreds of millions of hours on YouTube and generate billions of views.

YouTube introduced YouTube EDU, which brings learners and educators together in a global video classroom where viewers have access to a broad set of educational videos. YouTube created two programs to help schools and teachers utilize YouTube EDU most effectively: YouTube for Schools and YouTube for Teachers [2].

II. LITERATURE REVIEW

A. History

The advent of technology has revolutionized learning in many innovative ways. Non-print multimedia-based technologies, such as instructional videos, are now regarded as cost-effective, interactive teaching and learning tools [3].

Since the mid-20th century, instructional technologies and educational media have undergone major development. Paving the way from learning and teaching through audio in the 1950s, the very first instructional videos appeared in the late 1960s. Parallel with the introduction and nearly universal use of personal computers, an evolution of instructional technology was witnessed. This continued to expand with the appearance of the Internet for interactive educational lessons, webinars, podcasts, live distance video interactions, and Web-based courses and programs [4].

If before the millennium people had to buy or rent films, YouTube has been able to position itself as a giant library with millions of free videos to choose from. YouTube has emerged as the third most popular website behind Google and Facebook [5], being the go-to site for video on the web.

B. Contribution of ICT, including YouTube videos, to learning

Videos are just one constituent of the complex classroom activity system. Thus the learning outcomes depend largely on the way video viewing is integrated with other learning resources and tasks [6].

Anderson [7] defines ICT in education as "the digitizing of human knowledge, cloud computing, social networking, touch-screen technology, and the convergence of mobile and PC technologies".

Incorporating technology into the classroom benefits not only the students but the teachers as well [8]. In addition, not all students have the same learning abilities, and
A focus group discussion can be defined as an in-depth, open-ended group discussion of 1 to 2 hours' duration during which a specific set of issues on a predefined and limited topic is explored. Such a group usually consists of 5 to 8 participants and is convened under the guidance of a facilitator [21]. Through such group discussions, people's knowledge and experiences can be explored, along with what people think, how they think and why they think that way [22].

Our focus group consisted of a total of 8 educational media producers and 3 professional technicians involved in the production of educational audio-visual materials.

Some of the leading questions that fuelled our discussion included:
1. What are the main stages in video production?
2. How is the raw script worked out?
3. What guides the translation of a raw script to audio-visual scripts?
4. Which elements are considered?
5. What is the input of TV Directors in the process?
6. Which are the components which add value to an educational video?

We also discussed the primary objectives of producing educational videos and how they can be successfully integrated in the classroom. After in-depth discussion, a list of criteria emerged. The results are discussed in the next section. Based on these results, we developed a marking rubric based on a 4-point Likert scale, again discussed in depth in the next section.

Mathematics learning disability is rampant among Mauritian learners, and ICT is being viewed as a channel to bridge the learning gap. YouTube resources are often accessed by both teachers and students, especially to complement learning and teaching. We therefore decided to analyse a number of mathematical videos as per the established set of criteria. Ten videos for each of the following 3 topics were analyzed against the criteria: Fractions, Sets and Integers. These topics were based upon various MES reports [23] which mentioned the

VI. DATA ANALYSIS, FINDINGS AND DISCUSSIONS

Phase I

The results of the first phase of this study are provided in Table 1; they show the 16 criteria discussed and agreed by the media specialists. Furthermore, the participants of the focus group pointed out that not all criteria have the same degree of importance, such that it would be inappropriate to discard a video based on some of these criteria only. For instance, the group opined that a video cannot be discarded if it has no stated objectives. On the other hand, criteria such as language and accuracy were deemed mandatory.

Based on this argument, we agreed upon further categorizing the 16 criteria into three groups, namely Mandatory, Essential and Desirable. An explanatory note for each criterion was also provided:

Mandatory: The two groups stressed the importance of these criteria. Unless they are present to some extent, the video should be rejected. For instance, if the content does not follow the syllabus, the video can be rejected. The same rigour applies to language, which must be clear and understandable by the students.

Essential: Over and above the mandatory criteria, the groups drew up a list of essential criteria. One such criterion is duration. Research shows that the majority of people watch only the first 4 minutes of videos on YouTube [25]. This means the longer the duration of a video, the fewer people will watch it till the end. Hence, there is a need to select videos that are not too long.

Desirable: Finally, the last six criteria were considered desirable and included the following: aesthetics, presenter, use of props/captions/jingles/music, and addressing mixed-ability students, amongst others.

Despite this categorization, the groups agreed that all three categories have a certain degree of importance, making the whole set of criteria a necessary reference guide for the identification of good instructional videos.
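The gating rule the focus group agreed on (a video failing any mandatory criterion is rejected outright, while essential and desirable criteria only affect its score) can be sketched as follows; the shorthand criterion names and the pass threshold are illustrative assumptions, not the paper's exact labels:

```python
# Shorthand names for the four mandatory criteria of Table 1 (assumed labels).
MANDATORY = {"syllabus", "language", "instructional_design", "accuracy"}

def acceptable(criterion_scores, threshold=1):
    """Reject a video outright if any mandatory criterion scores below
    the threshold; other criteria only contribute to the overall score."""
    # criterion_scores: criterion name -> rubric score (0-3)
    return all(criterion_scores.get(c, 0) >= threshold for c in MANDATORY)

# A video off-syllabus is rejected no matter how well it scores elsewhere.
ok = acceptable({"syllabus": 3, "language": 2,
                 "instructional_design": 2, "accuracy": 3})
rejected = acceptable({"syllabus": 0, "language": 3,
                       "instructional_design": 3, "accuracy": 3})
```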
TABLE 1: Set of criteria to determine appropriateness of online educational videos

Mandatory:
- Syllabus/Curriculum: Content of the video must be aligned with the school syllabus/curriculum.
- Language: Language must be easily understood by the students.
- Instructional design: Content must be properly structured so that understanding and assimilation are facilitated.
- Accuracy/Adequacy: The content of the video must be accurate with regard to all concepts and theses to be mastered. Explanation should be exhaustive.

Essential:
- Clarity of image/sound: The quality of the images and the sound should be of high standard.
- Engagement: The video should catch and retain the attention of the viewers and keep them motivated throughout.
- Presenter/Narrator: The presenter should be well groomed and have the right presentation skills to keep the attention of the audience.
- Duration: The duration of the video should not be too long but just enough to meet the objectives of the conceptual content. Short duration is recommended.
- Supports/enhances classroom lessons: The video should support lessons taught in class and should add value to them.
- Pace of delivery (accent, tone, punctuation, pitch): The content should be delivered at a reasonable pace, with the appropriate accent, tone and pitch of the narrator, so that explanations are very clear.

Desirable:
- Aesthetics: The visual appearance, judicious use of colors and format of the video should elicit and maintain viewing interest.
- Addressing mixed-ability students: The video should be designed in a way that caters for mixed-ability students, making use of prior knowledge and illustrations to introduce new concepts while also triggering the higher-order skills of the high scorer.
- Illustrations involving real-life examples: Real-life examples should be used to explain difficult concepts, especially for slow learners, to give a better understanding of the application of the concept.
- Use of props/captions/jingles/music: The use of props, captions and music are desirable features of a video that help to keep viewers' interest while aiding them to better understand difficult terms and concepts.
- Objectives and recap: The objectives of the lesson should be well defined right at the beginning of the video, and a recap is desirable at the end to allow viewers to summarize the main learning points of the video.
- Encourages independent and self-learning: The video should promote independent and self-learning among students, where minimum assistance is required to understand the lesson.

Based on the framework in Table 1, a marking rubric scale was developed to allocate a score for each criterion. Table 2 proposes a sample for the criterion 'objectives and recap'.

TABLE 2: Objectives and Recap criterion
3: Both objectives and recap are included
2: Only objectives or recap included
1: Objectives and recap not well defined
0: No objectives and recap

Phase II

Thirty videos, 10 for each of the three topics (Fractions, Sets and Integers), were previewed and a number of observations were made. The search results displayed hundreds of thousands of suggestions. Like Google, YouTube uses ranking factors to determine which videos end up at the top of each search engine results page (SERP). YouTube looks at a video's number of views, how long users watch it, and how many positive ratings and comments it has. For good measure, it also considers the number of channel subscribers, how many times the video appears in users' playlists, how often it is added to a favourites list or playlist, and how many times it has been embedded on websites.

Videos with the highest number of views from the first two pages were chosen and assessed as per the 16 criteria. The scores were recorded and analysed. The average duration of the videos was 7 minutes and yearly views ranged from 5,000 to 340,000. We noted that for Fractions there were many more videos, and the yearly views were higher than for Sets and Integers.

For analysis purposes, we considered average yearly views, as videos on YouTube are posted at different points in time. As an example, a 3-year-old video might have 20,000 views, while another might have been up for less than one year and have around 10,000 views.

Suitability Video Index (SVI)

Each video was attributed an SVI value based on the sum of the scores for each of the 16 criteria [max = 48 (16 × 3), min = 0]. The range of SVI values was 22 to 46, with most of the videos scoring above 28, which is a good threshold for videos of acceptable pedagogical worth.

Finding 1:
In general, videos with a high number of yearly views scored well on the SVI scale and are thus appropriate for sound instructional purposes. However, as shown in Figure 1, higher yearly views did not necessarily mean higher SVI: there were videos with an average of 100,000 yearly views and SVI values ranging from 25 to 48. So, while high yearly views can be indicative of the popularity of a video, the ultimate choice cannot simply be made on this parameter. This corroborates research carried out by Duncan et al. [22] suggesting that the 100 most easily accessible or watched YouTube videos may not necessarily be robust or invested with high-quality production values.

Fig 1. Variation of SVI with yearly views [scatter plot: SVI (0-50) against number of views per year (0-400,000)]

Finding 2:
The analysis regarding yearly likes revealed a weak but significant association with the SVI value. Figure 2 shows that the higher the yearly likes, the higher the SVI value tends to be. But again, the chart indicates cases of divergence from this rule, such that it could be a mistake to select an instructional video solely on account of its yearly likes, though this might be a first step towards earmarking a set of possible good instructional videos.

Fig 2. Variation of SVI with yearly likes [scatter plot: SVI (0-50) against number of likes per year (0-1,500)]

Finding 3:
Figure 3 reveals that yearly dislikes are also far from being indicative of the inappropriateness of a video for instructional purposes. The data show that, on one hand, there are videos with low yearly dislikes but low SVI values, and, on the other, videos with high yearly dislikes but high SVI values.

Fig 3. SVI versus number of dislikes [scatter plot: SVI (0-50) against number of dislikes per year (0-200)]

Finding 4:
Figure 4 shows the comparison of SVI with the percentage of likes per view. The scatter diagram once again shows that there is no relationship between these two factors: a high percentage of likes per view did not correlate with high SVI values.

Fig 4. SVI versus percentage of likes per view [scatter plot]

VII. CONCLUSIONS AND RECOMMENDATIONS

While some of the parameters on a YouTube page, like the number of views and likes, can be a first step towards choosing a video for viewing purposes, none of them is a strong, reliable metric. Therefore, teachers should not rely on these parameters alone to select appropriate content on YouTube for classroom integration. Based on the findings of this study, we recommend that the best approach to selecting YouTube videos is for teachers to view the whole content and to assess it as per the 16 criteria mentioned in this research paper. Furthermore, teachers need to build a database of videos for a particular topic, as no one video caters for a whole topic. In
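The SVI computation and the yearly-views normalization used in this study can be sketched in a few lines; this is an illustrative sketch of the arithmetic described above, not code from the paper:

```python
def svi(scores):
    """Suitability Video Index: sum of the 16 rubric scores,
    each in 0-3, so the SVI ranges from 0 to 16 * 3 = 48."""
    assert len(scores) == 16 and all(0 <= s <= 3 for s in scores)
    return sum(scores)

def yearly_views(total_views, age_years):
    """Normalize a raw view count by the video's age, since videos
    are posted at different points in time."""
    return total_views / age_years

# Ten criteria scored 3 and six scored 2 give an SVI of 42.
score = svi([3] * 10 + [2] * 6)

# The paper's example: a 3-year-old video with 20,000 views averages
# fewer views per year than a sub-one-year video with 10,000 views.
old_video = yearly_views(20000, 3)   # about 6,667 views/year
new_video = yearly_views(10000, 1)   # 10,000 views/year
```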
Abstract—In recent years, demand for high data rates amongst internet users has increased due to the use of real-time data services such as video and voice conferencing. The IEEE 802.11n standard is an advancement on the previous a, b and g versions. Its design goal is to increase data rate while maintaining compatibility with the previous versions. The IEEE 802.11n standard introduces improvements at both the PHY and MAC layers. This standard still adopts the EDCA and HCCA mechanisms used for enhancing QoS in the network by assigning priorities to different classes of data called Access Categories (ACs). In this paper we compute a theoretical throughput model for 4 different stations transmitting separate ACs in non-saturated conditions. Networks with non-saturated conditions find real applications in WSNs and WiFi enterprises. We then perform a practical experiment with 4 different nodes and compare the throughput results with the theoretical ones. Based on the comparison, we determine the fairness linked with the QoS mechanism. The comparative throughput performance indicates that the analytical results compare well with the practical experimental results for non-saturated ACs.

I. INTRODUCTION
Over the years, we have experienced increased usage of WLAN for several applications. In fact, WLAN has become very flexible and ubiquitous. For that matter, WLAN is now being used for a wide range of services such as video streaming, audio, web, printing and gaming. The first development of Wireless LAN (WLAN) saw several improvements into different versions, all intended to satisfy different needs. The IEEE 802.11a standard uses Orthogonal Frequency Division Multiplexing (OFDM) to provide data rates of up to 54 Mbps in the 5 GHz band [1]. This standard yields a higher data rate than the IEEE 802.11 and IEEE 802.11b versions, which deliver 2 Mbps and 11 Mbps respectively. The IEEE 802.11a standard was very expensive to integrate into chips due to its 5 GHz operating band. Its variant, the IEEE 802.11b, was more popular even though it achieved a relatively lower data rate. The quest for improved, cost-effective data rates at 2.4 GHz prompted the development of a new version called IEEE 802.11g, which provided the same data rate at the same frequency as its variant. The most recent of all these standards is the IEEE 802.11n, which boasts data rates of up to 600 Mbps.

Improvement of data rates has therefore provided the chance for heavy traffic to be accommodated in WLAN networks. This comes at a cost, as the channel is capacity limited, which causes network constraints and compromises Quality of Service (QoS) [2]. Efforts were made to provide QoS for the increasing traffic load of WLANs. This led to the amendments in the IEEE 802.11n standard. In the IEEE 802.11n amendments, new MAC techniques have been introduced to enhance QoS and throughput. The MAC layer improvement techniques envisaged in the latest IEEE 802.11n standard amendment include [3]: Enhanced Distributed Channel Access (EDCA), Frame Aggregation, Block Acknowledgement and Reduced Inter-frame Space. For the purpose of our work we will be making use of EDCA.

EDCA is basically an extension of the Distributed Coordination Function (DCF), introduced in the IEEE 802.11e amendment to provide enhanced QoS based on priority classes. Information to be transmitted by each station is classified into priority classes called Access Categories (ACs). Information such as video and audio is delay, jitter and packet-loss sensitive and hence requires more attention to QoS. Other applications, like internet surfing and sending emails, do not have hard real-time requirements. EDCA addresses this by giving higher priorities to real-time data and lower priorities to other forms of data. Even though one may say QoS is being enhanced, users of lower priority may still experience low QoS, especially in instances where high-priority ACs are numerous within the network. This paper seeks to evaluate the fairness of the IEEE 802.11n network in relation to categorising different data types into ACs. We achieve this by computing the theoretical throughput of 4 stations. Each station transmits one AC at several random instances, so that an analysis of the fairness can be made. We also evaluate this in real time using Observer Expert as a test-bed. Observer Expert provides real-time VoIP and Video Expert Analysis, Stream Reconstruction, Multi-Hop Analysis, and Connection Dynamics [14]. This paper provides a comparative analysis of IEEE 802.11n ACs between analytical and real-life measurements using Observer Expert. The analytical results show some level of conformance with the real-life performance.

The remainder of the paper is outlined as follows: Section II will elaborate on the new MAC layer enhancements of the IEEE 802.11n standard. Section III will highlight the analytical model used for throughput evaluation. Section IV will present the simulation and experimental set-up. In Section V, results obtained from the computations and experiments
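EDCA differentiates the four Access Categories through per-AC contention parameters (AIFSN, CWmin, CWmax). As an illustration only, the sketch below uses the commonly tabulated 802.11e/n default values (for aCWmin = 15, aCWmax = 1023) together with example 2.4 GHz short-slot timings; these are not the parameters measured in this paper's experiments:

```python
# Default EDCA parameters per Access Category (commonly tabulated
# 802.11e/n defaults; verify against the standard for a given PHY).
EDCA_DEFAULTS = {
    "AC_VO": {"aifsn": 2, "cwmin": 3,  "cwmax": 7},     # voice (highest priority)
    "AC_VI": {"aifsn": 2, "cwmin": 7,  "cwmax": 15},    # video
    "AC_BE": {"aifsn": 3, "cwmin": 15, "cwmax": 1023},  # best effort
    "AC_BK": {"aifsn": 7, "cwmin": 15, "cwmax": 1023},  # background (lowest)
}

SIFS_US, SLOT_US = 10, 9  # example timings (2.4 GHz, short slot), in microseconds

def aifs_us(ac):
    """Arbitration inter-frame space: AIFS[AC] = SIFS + AIFSN[AC] * SlotTime."""
    return SIFS_US + EDCA_DEFAULTS[ac]["aifsn"] * SLOT_US

for ac, p in EDCA_DEFAULTS.items():
    print(f"{ac}: AIFS = {aifs_us(ac)} us, CWmin = {p['cwmin']}, CWmax = {p['cwmax']}")
```

A smaller AIFS and contention window means a station carrying that AC waits less and wins the channel more often, which is exactly the prioritisation whose fairness the paper evaluates.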
Fig. 1. Basic DCF Mechanism [9]
(a) Simulation (in b/s vs sec) (b) Experimental (in pkt/s vs sec)
Fig. 8. Throughput vs transmission times for Background AC.

Fig. 9. Jain's fairness index for simulation results based on 4 stations with different ACs.

The results of the computation of the fairness are indicated in figure 9. In this figure it is evident that AC1 (highest) obtains the highest fairness, whilst AC2 receives less fairness compared to AC1. The trend decreases until AC4, which has the lowest fairness index. The fairness index results are due to the variation in throughput across all ACs. This impacts negatively on the QoS, in terms of throughput, for ACs with lower priorities in non-saturated networks.

REFERENCES
[1] C. Kai, Y. Chen and N. Yu, "Performance Analysis of DCF under two access mechanisms in IEEE 802.11a WLAN", 2nd International Conference on Mobile Technology, Applications and Systems, pp. 1-7, 2005.
[2] P. E. Engelstad and O. N. Osterbo, "Delay and Throughput Analysis of IEEE 802.11e EDCA with Starvation Prediction", in LCN, Zurich, pp. 647-655, 2005.
[3] E. Perahia and R. Stacey, "Next Generation Wireless LANs: Throughput, Robustness, and Reliability in 802.11n", Cambridge University Press, 2008, ISBN-13 978-0-521-88584-3, eBook (EBL).
[4] P. E. Engelstad and O. N. Osterbo, "Non-Saturation and Saturation Analysis of IEEE 802.11e EDCA with Starvation Prediction", in International Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems, Montreal, Quebec, pp. 224-233, 2005.
[5] D. He and C. Q. Shen, "Simulation Study of IEEE 802.11e EDCF", Vehicular Technology Conference, pp. 685-689, 2003.
[6] Y. Daldoul, T. Ahmed and D. Meddour, "IEEE 802.11n Aggregation Performance Study for the Multicast", Wireless Days (WD), pp. 1-6, 2011.
[7] S. Maaroufi, W. Ajib and H. Elbiaze, "Performance Evaluation of New MAC Mechanisms for IEEE 802.11n", 2007.
[8] J. Kolap, S. Krishnan and N. Shaha, "Comparison of Frame Aggregation Mechanism in 802.11n WLAN", 2011.
[9] T. Vanhatupa, "Wi-Fi Capacity Analysis for 802.11ac and 802.11n: Theory & Practice", Ekahau Wi-Fi Design White Paper, 2013.
[10] R. Leutert, "WLAN 802.11n MIMO Analysis", SHARKFEST, Stanford University, 2010.
[11] E. Charfi, L. Chaari and L. Kamoun, "Fairness of the IEEE 802.11n aggregation scheme for real time application in unsaturated condition", IFIP WMNC, 2011.
[12] K. Xu, Q. Wang and H. Hassanein, "802.11e enhanced distributed coordination function (EDCF) in WLAN", in Globecom, San Francisco, pp. 1048-1053, 2003.
[13] L. Kriara, M. K. Marina and A. Farshad, "Characterisation of 802.11n Wireless LAN Performance via Testbed Measurements and Statistical Analysis", in SECON, pp. 158-166, 2013.
[14] Network Instruments, "Observer Expert User Guide", 2013.
VI. CONCLUSION
In this paper we have elaborated on the main MAC mechanisms, beginning with DCF and moving to EDCA. We have presented the enhancements associated with the EDCA mechanism and indicated their purpose, which was mainly to improve QoS as envisaged in the IEEE 802.11e standard. Moving further, we used a theoretical model from the literature, based on a stochastic approach, to compute the throughput of the network. We then provided results and compared them with physical experimental results, and in that regard evaluated the fairness caused by the EDCA mechanism. We conclude
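Jain's fairness index, used for the per-AC evaluation above, has a standard closed form, J(x) = (Σxᵢ)² / (n · Σxᵢ²). A minimal sketch (the throughput numbers below are illustrative, not the measured ones):

```python
def jain_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2); 1.0 is perfectly fair."""
    n = len(throughputs)
    s = sum(throughputs)
    sq = sum(t * t for t in throughputs)
    return s * s / (n * sq)

# Hypothetical per-AC throughputs (e.g. in Mbps) for the 4 stations.
print(jain_index([10, 10, 10, 10]))  # equal shares -> 1.0
print(jain_index([30, 10, 5, 2]))    # skewed shares -> closer to 1/n
```

The index ranges from 1/n (one flow takes everything) up to 1.0 (all flows equal), which is why the decreasing trend from AC1 to AC4 reads directly as reduced fairness for the low-priority categories.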
Security in the Internet of Things through
Obfuscation and Diversification
Shohreh Hosseinzadeh, Sampsa Rauti, Sami Hyrynsalmi and Ville Leppänen
Department of Information Technology
University of Turku, Finland
Emails: {shohos, sjprau, sthyry, ville.leppanen}@utu.fi
Abstract—Internet of Things (IoT) is composed of heterogeneous embedded and wearable sensors and devices that collect and share information over the Internet. This may include private information of the users. Thus, securing the information and preserving the privacy of the users are of paramount importance. In this paper we look into the possibility of applying two techniques, obfuscation and diversification, in IoT. Diversification and obfuscation are two outstanding security techniques used for proactively protecting software and code. We propose obfuscating and diversifying the operating systems and APIs on the IoT devices, and also some communication protocols enabling the external use of IoT devices. We believe that the proposed ideas mitigate the risk of unknown zero-day attacks, large-scale attacks, and also targeted attacks.

Keywords—Internet of Things, IoT, security, privacy, obfuscation, diversification

I. INTRODUCTION
Internet of Things (IoT), or Internet of Everything, introduced by MIT Auto-ID Labs in 1999 [1], is a network of physical devices (objects) connecting to each other (wirelessly) to send and receive data. The tiny chips embedded in the devices enable them to communicate without human interaction. This aims to make human lives more intelligent, automated and thus more comfortable. IoT is known as the third revolution in information technology after the Internet and mobile communication networks, and today it is being used in a multitude of public and private sectors, ranging from public safety to health care. With this continuous growing trend, more and more "things" are getting connected to each other every day, collecting and transmitting personal and business information back and forth. Cisco IBSG [2] reports that the number of connected devices in 2015 is 25 billion, and this number is expected to grow to 50 billion by 2020. However, security in IoT is still a major challenge. According to [3], 70 percent of IoT devices are vulnerable to exploits that could be a doorway for attackers into the network. On this basis, researchers and developers are continually seeking effective techniques that boost security in this environment while remaining compatible with the limitations of the participating nodes in IoT. To the best of our knowledge, none of the existing research works in the field of IoT security has looked into obfuscation and diversification as potential techniques for mitigating the risk of malware.

In this paper, we propose a novel idea that addresses possible security threats in IoT. We base our idea on two promising techniques, obfuscation and diversification, that have proved successful in impeding malware in various domains [4]. In this work-in-progress paper, we propose using these two techniques to protect the operating systems and APIs of the devices participating in IoT, and also to introduce an additional level of security at the network level by diversifying some protocols used in the communication. This study is a research proposal in which we present our novel ideas. In future work, we intend to implement these ideas and demonstrate the effectiveness of the proposed approaches.

The remainder of the paper is structured as follows: in Section 2 we present background on the characteristics of IoT and the software and protocols used in these domains. Section 3 discusses our proposed idea in detail, and Section 4 concludes the paper.

II. CHARACTERISTICS OF THE IOT
Operating systems and software in IoT
IoT comprises a wide variety of heterogeneous sensors and devices, of which some are powered by more potent 32-bit processors (e.g., smart phones) and some are controlled by lightweight 8-bit micro-controllers [5]. Therefore, the chosen software should be applicable to a range of devices, including the lowest-power ones. On one hand, it should be capable of supporting the functionality of the object; on the other hand, it should be in line with the limitations of these devices in memory, computational power and energy capacity. The software in IoT should have the following characteristics [6]:
• Heterogeneous hardware constraints: the IoT software should have limited CPU and memory requirements, so that it can support constrained hardware platforms.
• Autonomy: it should be energy efficient, reliable, and adaptive to the network stack.
• Programmability: it should provide a standard Application Program Interface (API) for software development and support standard programming languages.
These factors have led developers to think of operating systems that are adaptive to the diverse low-power objects in IoT. Among all, Contiki [7] and TinyOS [8] are
REFERENCES
[1] I. Bose and R. Pal, "Auto-ID: Managing anything, anywhere, anytime in the supply chain," Commun. ACM, 48(8), pp. 100-106, Aug. 2005.
[2] "The Internet of Things - how the next evolution of the Internet is changing everything," verified 2015-07-08. [Online]. Available: https://www.cisco.com/web/about/ac79/docs/innov/IoTIBSG 0411FINAL.pdf
[3] "Internet of Things research study - HP report," verified 2015-07-08. [Online]. Available: http://www8.hp.com/h20195/V2/GetPDF.aspx/4AA5-4759ENW.pdf
[4] P. Larsen, A. Homescu, S. Brunthaler, and M. Franz, "SoK: Automated software diversity," in Security and Privacy (SP), May 2014, pp. 276-291.
[5] K. T. Nguyen, M. Laurent, and N. Oualha, "Survey on secure communication protocols for the Internet of Things," Ad Hoc Networks, vol. 32, pp. 17-31, 2015.
[6] E. Baccelli, O. Hahm, M. Günes, M. Wählisch, and T. C. Schmidt, "Operating systems for the IoT - goals, challenges, and solutions," in WISG 2013, January 2013.
[7] A. Dunkels, B. Gronvall, and T. Voigt, "Contiki - a lightweight and flexible operating system for tiny networked sensors," in Local Computer Networks, 29th Annual IEEE International Conference on, Nov 2004, pp. 455-462.
[8] P. Levis, S. Madden, J. Polastre, R. Szewczyk, K. Whitehouse, A. Woo, D. Gay, J. Hill, M. Welsh, E. Brewer, and D. Culler, "TinyOS: An operating system for sensor networks," in Ambient Intelligence. Springer Berlin Heidelberg, 2005, pp. 115-148.
[9] M. Corson, R. Laroia, J. Li, V. Park, T. Richardson, and G. Tsirtsis, "Toward proximity-aware internetworking," Wireless Communications, IEEE, vol. 17, no. 6, pp. 26-33, December 2010.
[10] Z. Shelby, K. Hartke, and C. Bormann, "The constrained application protocol (CoAP)," in Internet Engineering Task Force (IETF), 2014.
[11] S. Babar, P. Mahalle, A. Stango, N. Prasad, and R. Prasad, "Pro-
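As a concrete, hypothetical illustration of the protocol-level diversification proposed above (not the authors' implementation), each device could derive a device-unique opcode table for its messaging protocol from a secret seed, so that malware hard-coded against the stock opcodes fails on every device it meets. The names here (`STOCK_OPCODES`, `diversified_table`) are invented for the sketch:

```python
import hashlib

# Stock opcodes of a hypothetical IoT request protocol.
STOCK_OPCODES = ["GET", "PUT", "POST", "DELETE", "OBSERVE"]

def diversified_table(device_secret: bytes):
    """Map each stock opcode to a device-unique 1-byte code derived
    deterministically from the device's secret seed."""
    table, used = {}, set()
    for name in STOCK_OPCODES:
        code = hashlib.sha256(device_secret + name.encode()).digest()[0]
        while code in used:          # resolve rare first-byte collisions
            code = (code + 1) % 256
        used.add(code)
        table[name] = code
    return table

# Two devices with different secrets end up with different opcode tables.
a = diversified_table(b"device-A-secret")
b = diversified_table(b"device-B-secret")
print(a)
print(b)
```

Both legitimate endpoints of a link would need the shared secret to agree on the table; key distribution and the actual wire format are outside this sketch.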
Proactive Digital Forensics in the Cloud using Virtual
Machines
D.J. Ras H.S. Venter
Department of Computer Science Department of Computer Science
University of Pretoria University of Pretoria
Pretoria, South Africa Pretoria, South Africa
dras@cs.up.ac.za hventer@cs.up.ac.za
Abstract—With the advent of cloud computing systems it has become possible to provision large scale systems in a short time with little effort. The systems underpinning these cloud systems have to deal with massive amounts of data in order to function. Should an incident occur that requires some form of forensic investigation, it can be very challenging for an investigator to conduct the investigation. This is due, in large part, to the volatility of data in cloud systems.

In this paper, a model architecture is proposed to enable proactive forensics of cloud computing systems. Using a reference architecture for cloud systems, an add-on system is created to enable the capture and storage of forensic data. The captured data is then available to the investigator should the need for an investigation arise. This must be achieved with minimal alteration or interruption of existing cloud systems.

The system is described and a theoretical architectural model is given. An evaluation discusses the possible advantages and disadvantages of such a system and how it can be implemented as a proof of concept. It also relates the proposed model to the ISO 27043 standard of forensic investigations.

Keywords—Cloud computing; digital forensics; cloud forensics; standards; ISO standard 27043

I. INTRODUCTION
In the period from the early 80s to the present day, computer storage capabilities have massively increased, from low capacity floppy disks to massive clustered cloud computing systems. As these technologies have evolved and improved, so too has the need to perform forensics on devices and systems. Where it is relatively simple to forensically analyze a single disk, performing a forensic investigation on a cloud computing system is much more complex, due to the inherent complexity of such a system. Problems like data location, segregation and recovery, to name but a few, plague investigators, making cloud forensics very challenging [1]. Another, more general, problem is that until recently there was no single standard on which a forensic investigation process could be based. This problem should hopefully be alleviated with the introduction of the ISO 27043 standard [2]. In combination with a standard investigative process, the application of forensic readiness principles and procedures could further aid in making cloud forensics less challenging [3]–[5].

In this paper, a model architecture is proposed that implements forensic readiness as described by the ISO standard 27043 [2] by means of proactive forensics of cloud computing systems. The proactive forensic model is used as a method to address the problems of data volume, data isolation, data integrity and availability of the cloud system. The proposed architectural model captures potential forensic evidence with the aid of a forensic monitor and transmits the potential evidence, via a secure channel, to a cloud forensic system where the potential evidence can be analyzed.

The goal of the architecture is to provide a mechanism that would allow an investigator to rapidly and effectively acquire data needed for an investigation. It is not the goal of the system to fully replicate the virtual machines in the cloud system; this would in effect massively increase the size of the cloud and exacerbate the problem of large data volumes. The exact details of what potential digital evidence is captured and how the transmission channel is secured are not in the scope of this paper, only the system that could facilitate the collection.

The paper is structured as follows: Section II contains background information regarding the current state of the cloud, cloud forensics and the ISO 27043 standard. It also contains the related work on which this paper is based. Section II.D contains models for the possible implementation of the ISO 27043 standard (Section II.C). Section III contains the proposed model for the realization of the ISO 27043 standard. The model is evaluated in Section IV; finally, Section V concludes the paper. In all instances where the phrase "the standard" is used, it refers to the ISO 27043 standard.

II. BACKGROUND
In order to realize an architecture that enables forensic readiness in cloud computing systems, background information is needed regarding the general concepts of cloud computing, the current state of cloud forensics and work related to forensic readiness of cloud systems. The ISO 27043 standard outlines the forensic process that must be followed during an investigation by separating the phases into different process classes [2]. In this paper the readiness class is used as a measure of whether the proposed cloud architecture is useful. The following sections elaborate on the aforementioned topics.
The ISO 27043 standard serves as a baseline reference point for this paper, in terms of the goals that must be achieved in order to realize forensic readiness. The aim of the proactive forensic architecture is to realize the goals set forth in the readiness class of the ISO 27043 standard.

D. Previous work
The work of Ras and Venter [22] describes 5 possible architectures for enabling proactive cloud forensics. Table I summarizes the different architectures in terms of their implementation, implementation complexity, data segregation, tamper resistance, and As-a-Service paradigm. The model architectures are based on the NIST reference architecture for cloud systems [1], where hardware nodes are virtualized via clustering and the installation of a cloud operating system (OS). A hypervisor then allows guest virtual machines (VMs) to be tenants on the cloud system. Using the information in Table I, a selection can be made of which model to implement as a proof of concept. The model in this paper builds on this previous work and extends it into a theoretical architecture that can be implemented to address the problems of architecture, data collection and data analysis.

III. PROPOSED MODEL
From the previous work, a full system model can be designed, which is shown in Fig. 1. Because this paper deals with a proof of concept and the model must still be implemented, the Single tenant VM model is chosen as a departure point for the full model architecture. This model is selected because it offers the least complex implementation and the best data segregation. Its most prominent drawback is that the Single tenant VM model has an adverse impact on the performance of the cloud system.

It is not the intention of the system to capture all the information contained in all the virtual machines of the cloud system. Capturing all the information of each VM in the cloud is not a feasible solution: each guest VM would take up roughly twice the amount of disk space, i.e. the space required for the VM to execute plus the space required for storage of the captured information. The implementation of the model is intended for a specialized case of private cloud systems. In this case private cloud systems can be defined as cloud systems owned by companies or organizations that are not open for use by the general public.

The model, as shown in Fig. 1, is comprised of four components, namely: the cloud hosting system (block 1), the forensic controller (block 2), the cloud forensics system (block 3) and the cloud security system (block 4).

A. Cloud hosting system
The single tenant forensic monitor model uses nested VMs to enable forensic monitoring and is shown in Fig. 1.
B. Forensic controller
The forensic controller (Fig. 1 block 2) governs the
forensic VMs in the cloud system. It is connected to the
security system of the cloud infrastructure.
The controller has the function of assigning the level of
forensic data acquisition to the different forensic VMs. In
order to save both space and computational overhead, the
forensic VM can be configured to different levels of data
acquisition. At the lowest level, the VM captures only minimal data, for example log files, whereas at the highest level the entire guest VM can be captured for forensic analysis. This
level of data acquisition is determined by the configuration set
by the cloud service provider.
The forensic controller receives input from the cloud
security system. Should a specific VM be targeted by some
external threat, the cloud security system detects the threat and
signals the forensic controller. The forensic controller can
escalate the level of data acquisition by the forensic VM for
the targeted guest VM. This can be done either for a single
VM or for a group of VMs, depending on the detected threat.
The captured forensic data is transported via a secure
channel to the cloud forensics system where automated
analysis can be performed.
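The escalation behaviour described above (the security system signals a threat and the controller raises the acquisition level for one VM or a group, from log-only capture up to full VM capture) can be sketched as follows. The class name, level numbers and severity thresholds are hypothetical illustrations, not taken from the proposed model:

```python
# Hypothetical acquisition levels a forensic VM could be set to.
LEVELS = {0: "logs only", 1: "logs + network capture", 2: "full VM capture"}

class ForensicController:
    """Sketch of the controller governing forensic VMs in the cloud."""

    def __init__(self):
        self.acquisition = {}  # vm_id -> current acquisition level

    def signal_threat(self, vm_ids, severity):
        """Escalate acquisition for one VM or a group; never de-escalate,
        so evidence gathering is not silently reduced mid-incident."""
        level = 2 if severity >= 8 else 1 if severity >= 4 else 0
        for vm in vm_ids:
            self.acquisition[vm] = max(self.acquisition.get(vm, 0), level)

ctl = ForensicController()
ctl.signal_threat(["vm-17"], severity=5)            # suspicious traffic
ctl.signal_threat(["vm-17", "vm-18"], severity=9)   # confirmed attack on a group
print({vm: LEVELS[lvl] for vm, lvl in ctl.acquisition.items()})
```

In the paper's model the captured data would then flow over the secure channel to the cloud forensics system for automated analysis; that transport is outside this sketch.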
Abstract—In recent years, there have been many research developments in face recognition systems. Face recognition systems are widely used for access control, border control, surveillance and law enforcement. Among other biometrics, it is the most natural and acceptable way of identifying an individual. A face recognition system does not require physical interaction with the user. Research is still being done intensively to produce systems that can cater for several challenges such as changes in pose, illumination, occlusion and low resolution images. Algorithms reported in the literature use either global feature extraction or local feature extraction. In this work, a different technique is proposed that combines both global and local approaches for face recognition. Principal Components Analysis (PCA) and Local Binary Patterns (LBP) have been employed. Face recognition yields a recognition rate of 90% with PCA and 92% with LBP. However, results show an improvement in recognition rate to 95% when both approaches are fused.

Keywords—Face Recognition; Biometric; Global Feature; Local Feature; PCA; LBP

I. INTRODUCTION
The need for reliable authentication techniques has increased due to information security, privacy concerns and rapid advancements in networking, communication and mobile systems. Increasing demands and deployments of biometrics are observed in diverse environments such as airports, banks, law enforcement, secure access controls, and commercial and forensic applications. Biometric-based authentication has started to gain acceptance as a legitimate method for determining an individual's identity in many countries.

Biometrics refers "to identifying an individual based on his or her distinguishing characteristics" [1]. Biometric traits are often classified as physiological or behavioral traits. Physiological characteristics are related to the shape of the body. Examples include, but are not limited to, fingerprint, face, iris, hand geometry, palm print, retina and DNA. Behavioural characteristics are related to the behavior of a person, including, but not limited to, keystrokes, signature, gait and voice. A more recent term in this field is soft biometrics: certain information about the user which can be combined with his biometric identity to provide a stronger authentication method.

Traditional methods of authentication used two approaches. Firstly, the user can possess something that will help with authentication, such as a user ID, keys, a smart card or a badge. However, these things can be easily shared, duplicated, lost or stolen. A second approach to identifying a person is for the user to remember something such as a PIN or password. This approach also is not secure, as the PIN or password can be easily obtained by any third party who wants unauthorized access to valuables and information [2].

A different method of authentication is by using biometrics. Biometric-based methods easily deal with the problems of traditional methods, since users are identified by who they are, not by something they have to remember or carry with them [3]. Biometric traits are more difficult to forge, copy, share, misplace or guess [4]. A biometric system requires the person being authenticated to be present at the time and point of authentication. Biometrics has another advantage over traditional authentication methods by providing negative identification functionality. These situations arise when the user tries to avoid successful identification (in the case of thieves) or an imposter tries to establish a false identity. Biometric recognition has thus become an indispensable tool for authentication in this technologically-advanced society.

Though many biometric traits are available nowadays, not all biometrics are suitable for all types of applications [5]. Each application has its own requirements in terms of recognition accuracy, resource requirements, reliability and cost [6]. In this work, the focus is on face recognition.

A facial recognition system is a computer-based authentication system that can identify users based on their unique facial characteristics. Unlike other biometrics such as fingerprint, hand geometry and DNA, face recognition does not require direct contact with the sensor, since the face image can be taken from a distance; thus hygiene issues do not arise. Also, face recognition is the most socially and culturally accepted biometric internationally, since manual facial recognition is already being used for passports. However, face recognition systems suffer from occlusion by hair and make-up. Moreover, changes in facial expressions, illumination and pose also affect the system. In order to cater for these problems, in our research a fusion of global and local features has been done to make the recognition system more robust.
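As a sketch of the local half of such a combined approach, the basic 3×3 LBP operator thresholds each pixel's eight neighbours against the centre pixel and packs the resulting bits into an 8-bit code; histograms of these codes form the texture descriptor. This is the textbook operator, not necessarily the paper's exact implementation (which is additionally fused with global PCA features):

```python
import numpy as np

def lbp_3x3(img):
    """Basic local binary pattern: threshold each pixel's 8 neighbours
    against the centre and pack the bits into a code in [0, 255]."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Neighbour offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:h - 1, 1:w - 1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= centre).astype(np.uint8) << bit
    return codes

def lbp_histogram(img, bins=256):
    """Normalised histogram of LBP codes: the local texture descriptor."""
    hist, _ = np.histogram(lbp_3x3(img), bins=bins, range=(0, bins))
    return hist / hist.sum()

face = np.random.default_rng(0).integers(0, 256, size=(16, 16))
print(lbp_histogram(face)[:8])
```

A fused system would concatenate (or score-level combine) this histogram with the image's PCA projection before matching; the fusion rule itself is not specified here.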
Fig. 8 ROC Curve for LBP, PCA and combined approach

The ROC curve shows an improvement in recognition rate when the combined approach is used as compared to using only PCA or only LBP. The recognition rate increases to 95% with lower error rates, a FAR of 0.04 and an FRR of 0.05. This approach is helpful when we have low resolution images and one algorithm alone cannot cater for such images.

V. CONCLUSION
Research on face recognition has produced quite satisfactory results; however, the performance of the system degrades when there is pose variance. The performance of existing face recognition systems can be further improved by merging both local and global features. Two techniques have been investigated: Principal Component Analysis (PCA) and Local Binary Patterns (LBP). The UOM database has been created for the purpose of this study. Experimental results have shown that the combination of PCA and LBP

References
[1] Bolle, R. M., Connell, J. H., Pankanti, S., Ratha, N. K. & Senior, A. W., 2003. Guide to Biometrics. Springer-Verlag, New York.
[2] Jain, A. K., Ross, A. & Prabhakar, S., 2001. Fingerprint Matching Using Minutiae and Texture Features. In: Proceedings of the International Conference on Image Processing, Thessaloniki, Greece, pp. 282-285.
[6] Prabhakar, S., Pankanti, S. & Jain, A. K., 2003. Biometric Recognition: Security and Privacy Concerns. IEEE Security and Privacy Magazine, Vol. 1, No. 2, pp. 33-42.
[7] Zhao, W., Chellappa, R., Phillips, J. & Rosenfeld, A., 2003. Face Recognition in Still and Video Images: A Literature Survey. ACM Computing Surveys, Vol. 35, pp. 399-458.
[8] Phillips, P. J., Moon, H., Rizvi, S. A. & Rauss, P. J., 2000. The FERET Evaluation Methodology for Face Recognition Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 10, pp. 1090-1104.
[9] Bledsoe, W. W., 1964. The model method in facial recognition. Technical Report PRI 15, Panoramic Research, Inc., Palo Alto, California.
[10] Kirby, M. & Sirovich, L., 1990. Application of the Karhunen-Loeve Procedure for the Characterization of Human Faces. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, pp. 103-107.
[11] Turk, M. & Pentland, A., 1991. Eigenfaces for recognition. Journal of Cognitive Neuroscience, Vol. 3, No. 1, pp. 71-86.
[12] Etemad, K. & Chellappa, R., 1997. Discriminant Analysis for Recognition of Human Face Images. Journal of the Optical Society of America, Vol. 14, pp. 1724-1733.
[13] Swets, D. & Weng, J., 1996. Using Discriminant Eigenfeatures for Image Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, pp. 831-836.
[14] Belhumeur, P. N., Hespanha, J. P. & Kriegman, D. J., 1997. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 711-720.
[15] Martinez, A. & Kak, A., 2001. PCA versus LDA. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 2, pp. 228-233.
[16] Jaiswal, S., Bhadauria, S. S. & Jadon, R. S., 2011. Comparison Between Face Recognition Algorithms: Eigenfaces, Fisherfaces and Elastic Bunch Graph Matching. Journal of Global Research in Computer Science, Vol. 2, No. 7.
[17] Bartlett, M. S., 2001. Face Image Analysis by Unsupervised Learning. Kluwer Academic Publishers.
[18] Liu, C. & Wechsler, H., 1999. Comparative Assessment of Independent Component Analysis (ICA) for Face Recognition. In: Second International Conference on Audio and Video-based Biometric Person Authentication, AVBPA'99, Washington D.C., USA, pp. 22-24.
[19] Rasied, T. S. M., Khalifa, O. O. & Kamarudin, Y. B., 2005. Face Recognition Based on Singular Value Decomposition and Backpropagation Neural Network. In: 1st International Conference on Computers, Communications and Signal Processing with Special Track on Biomedical Engineering, CCSP 2005, pp. 304-309.
[20] Kaur, M., Vashisht, R. & Neeru, N., 2010. Recognition of Facial Expressions with Principal Component Analysis and Singular Value Decomposition. International Journal of Computer Applications, Vol. 9, No. 12, pp. 36-40.
[21] Scholkopf, B., Smola, A. & Müller, K.-R., 1999. Kernel Principal Component Analysis. In: Scholkopf, B., Burges, C. J. C. & Smola, A. J.,
[25] Hsieh, C., Lai, S. & Chen, Y., 2010. An Optical Flow-Based Approach to Robust Face Recognition Under Expression Variations. IEEE Transactions on Image Processing, Vol. 19, No. 1, pp. 233-240.
[26] Chellappa, R., Sinha, P. & Phillips, P. J., 2010. Face Recognition by Computers and Humans. Computer, Vol. 43, No. 2, pp. 46-55.
[27] Park, U., Tong, Y. & Jain, A. K., 2010. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 5, pp. 947-954.
[28] Cheng, Z., Sun, J. & Lei, X., 2010. U-face of Applied Research in the Face Recognition. 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), Vol. 5, pp. 15-18, 20-22 Aug. 2010, Chengdu.
[29] De Silva, V. & Tenenbaum, J. B., 2003. Global Versus Local Methods in Nonlinear Dimensionality Reduction. pp. 721-728 in S. Becker, S. Thrun & K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15, MIT Press, Cambridge, MA.
editors, Advances in Kernel Methods - Support Vector Learning, pages [30] Fawcett, T., 2006. An introduction to ROC analysis, Pattern Recognition
327–352. MIT Press, Cambridge, MA, 1999. Letters, Vol. 27, pp. 861–874.
[22] Ojala T., Pietikainen M., and Harwood D., 1996. A comparative study of [31] Mansfield, A. J. & Wayman, J. L. 2002. Best practices in testing and
texture measures with classification based on featured distribution, reporting performance of biometric devices, National Physical
Pattern Recognition, vol. 29, no. 1, 51–59. Laboratory, Centre for Mathematics and Scientific Computing, Tech.
Rep. Version 2.01, 2002.
[23] Agarwal, M. Agrawal, H. Jain, & N. Kumar, M., 2010. Face Recognition
using Principle Component Analysis, Eigenface and Neural Network. In: [32] Lachiche, N. & Flach, P. 2003, Improving accuracy and cost of two-class
2010 International Conference on Signal Acquisition and Processing, and multi-class probabilistic classifiers using ROC curves, In: Twentieth
ICSAP '10. Bangalore, 9-10 Feb. 2010, pp. 310 – 314. International Conference on Machine Learning (ICML'03), Washington
DC., pp. 416–423.
[24] Sellahewa, H. & Jassim, S. A., 2010. Image-Quality-Based Adaptive
Face Recognition, IEEE Transactions On Instrumentation And
Measurement, Vol. 59, No. 4, April 2010, pp. 805 – 813.
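The basic 3x3 Local Binary Pattern operator used in the study above (Ojala et al. [22]) can be sketched in a few lines of pure Python. This is only an illustration of the standard operator, not the implementation used in the paper; the function names and the toy image are ours.

```python
# Illustrative sketch of the basic 3x3 LBP operator and the histogram
# features it produces. Function names and data are invented for this example.

def lbp_code(image, r, c):
    """Compute the 8-bit LBP code for pixel (r, c) of a 2D grayscale image.

    Each of the 8 neighbours is compared with the centre pixel; a neighbour
    greater than or equal to the centre sets one bit, clockwise from the
    top-left neighbour, giving a texture label in 0..255.
    """
    center = image[r][c]
    # Clockwise neighbour offsets starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << (7 - bit)
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels; histograms
    of this kind are the 'local' features combined with global PCA features."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

In practice the histogram is computed per image region and the concatenated histograms serve as the local feature vector.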
Rapid Prototyping with a Local Geolocation API

Geerish Suddul, Kevin Nundran, Jonathan L.K. Cheung
Dept. Business Informatics & Software Engineering
University of Technology, Mauritius (UTM)
Republic of Mauritius
g.suddul@umail.utm.ac.mu

Morgan Richomme
Research and Development
Orange Labs
France
morgan.richomme@orange-ftgroup.com
Abstract—Geolocation technology provides the ability to target content and services to users visiting specific locations. There is an expanding growth of device features and Web Application Programming Interfaces (APIs) supporting the development of applications with geolocation services on mobile platforms. However, to be effective, these applications rely on the availability of broadband networks, which are not readily available in various developing countries, especially in Africa. We propose a geolocation API for the Orange Emerginov Platform which keeps geolocation data in an offline environment and periodically synchronises with its online database. The API also has a set of new features, like categorisation and shortest path. It has been successfully implemented and tested with geolocation data of Mauritius, consumed by mobile applications. Our results demonstrate a reduction in response time of around 80% for some features, when compared with other online Web APIs.

Keywords—micro-services; geolocation; Web API; OSM geolocation API; mobile applications

I. INTRODUCTION

Geolocation technology makes it possible for a computing device to acquire its physical location. The methods for acquiring geolocation data can be categorised as device-based or server-based [1]. The former implies that the device uses the Global Positioning System (GPS) and/or base-station triangulation techniques to generate its location, while server-based techniques rely on third-party services using IP lookup or WiFi service set identifiers (SSIDs).

Once the location information is available, applications can connect to geolocation databases which provide geo-positioning, geo-coding and geo-tagging functionalities in the form of Application Programming Interfaces (APIs). A number of commercial and open-source geolocation databases are available, storing huge amounts of unstructured and semi-structured data and providing varying levels of accuracy, such as the Google Maps APIs [2] and the OpenStreetMap APIs [3].

The emergence of mobile devices supporting both device-based and server-based location techniques, and the availability of several geolocation databases, have made geolocation services significantly popular across several types of application. They have become the foundation for location-aware mobile applications. Both the private and the public sector rely on geolocation data; for instance, businesses can track, learn about and target consumers based on their behaviours. Therefore, the ability to identify a customer's location at any time opens up seemingly endless sales, marketing and business opportunities.

It is mandatory for most mobile applications to have Internet connectivity in order to give meaning to geolocation data, such as positioning user coordinates on a map, finding nearby objects or services on a map, or even tagging geographic information around objects. In developing countries, the ITU estimates that around 67% of the population remains offline; in Africa specifically, only 20.7% of individuals are using the Internet, while mobile broadband subscriptions are below 20 per 100 inhabitants [4]. Even though connectivity remains a hurdle, the adoption of mobile phones in sub-Saharan Africa has witnessed a growth of around 62% since 2008. Countries like Mayotte, Gabon, Mauritius, Botswana and Reunion have more than 75% unique mobile subscribers [5]. The rate of mobile application usage has also gone up, and we reasonably foresee a rise in geolocation services in these applications.

In order to bridge the connectivity gap and promote the development of geolocation applications, we propose a local geolocation API for rapid application development on France Telecom Orange Labs' Emerginov platform [6]. Emerginov is an open platform which provides a rich set of APIs to support the development of micro-services and aims at empowering local communities to design and deliver multimedia (vocal, SMS, USSD) and user-generated content. Being the usual hotspots of innovation, local universities in the African region are especially targeted; they contribute to enriching the platform with features like the USSD enabler [7] and the Mobile Payment API [8]. The aim of this work is to further contribute with an offline geolocation API.
Fig. 1. E-OSM API Architecture

A. Developer Services
An interface is provided to allow developers to register for an API key and manage their accounts. The API key needs to be used in each of their micro-services/applications. Requests are typically encoded with the HTTP GET method, within the URL, in the following format:
URL/ClassName/Method/Parameters

B. API Services
The API follows a modular design, which allows quick and easy extension with new features. It consists of the following modules:
Points – nodes on a map; each point is represented by a latitude and a longitude value.
Linestrings – ways or lines on a map; for example, a linestring can take the form of a road.
Polygons – sets of points that are linked to form a shape (with any number of sides); for example, a polygon can be a building on a map.
Review – allows users to associate reviews with points, linestrings and polygons.
Categorisation – allows users to associate category names with points, linestrings and polygons; for example, regrouping all school entities (kindergarten, college, university, etc.) under the School category name.
Administration – allows the registration of new developer accounts and other administrative and operational features of the API; for example, a developer registers for an account, which is thereafter activated by an administrator.

C. Local Data Services
OpenStreetMap geolocation data is a large set in a semi-structured format, available in XML. It is converted to JSON format, which allows faster querying. Third-party tools, notably Osmosis [20] and 3angle [21], perform the conversion, following which the data is stored in MongoDB [22], a JSON-style data store. Unlike standard SQL databases, it does not enforce document structures. This flexibility facilitates the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, even if the data has substantial variation. The API administrator can choose from different options to synchronise with the offline database, usually performed off-peak.

IV. TESTING

An all-in-one geolocation client tool, called MapPie, has been developed to test all functionalities available to client applications. We describe a few test scenarios below.

Scenario 1: Finding the shortest path
The function takes two location parameters, either a standard point or a point on a polygon, each with a pair of coordinates, and uses Dijkstra's algorithm [23] to find the shortest path and distance between the two points. The algorithm picks the unvisited vertex (node) with the lowest distance, calculates the distance through the edges (routes) connected to that vertex to each unvisited neighbour, and updates the neighbour's distance if it is smaller.

Fig. 2. Shortest Path Result

Scenario 2: Searching a polygon category
The function searchPolyCategory takes a category name defined by the application user, such as school, restaurant or public building, and looks up all polygons in this category. The category name groups multiple OSM keywords under a single name, which allows custom searches.

Scenario 3: Finding nearby objects
A user can find all nearby clothing shops within a distance of 1 km from the current location. The function is named searchPointBound; the user provides a location, the keyword (e.g. clothing) and the range/distance in metres (e.g. 1000). The API processes the operation according to the parameters and displays only the shops that are within the 1 km range.

Scenario 4: Reverse geo-coding
The function searchLineCoord takes as input a point on a path and searches for the complete path (e.g. a coordinate on a road can be used to find the road's name). The function also provides additional information, such as the total distance of the path and the average time to travel the path by motor vehicle, by cycle or on foot.
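The shortest-path step of Scenario 1 can be sketched as follows. This is a minimal pure-Python illustration of Dijkstra's algorithm [23] under our own assumptions (an adjacency-list graph of node identifiers); it is not the E-OSM implementation, and the function name is invented.

```python
import heapq

def shortest_path(graph, start, goal):
    """Return (distance, path) between two nodes.

    `graph` maps node -> list of (neighbour, edge_length) pairs. The algorithm
    repeatedly settles the unvisited node with the lowest tentative distance
    and relaxes the edges to its unvisited neighbours, as described above.
    """
    dist = {start: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for neighbour, length in graph.get(node, []):
            nd = d + length
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(heap, (nd, neighbour))
    if goal not in dist:
        return float("inf"), []
    # Reconstruct the path by walking predecessors back from the goal.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return dist[goal], path[::-1]
```

On a road network, nodes would be OSM points and edge lengths the geographic distances along the connecting ways.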
Following the implementation of MapPie, we conducted a set of simple experiments to compare the response time of accessing offline data from E-OSM with that of an existing online API. A subset of the functionalities from MapPie dealing with Points, Linestrings and Polygons was implemented to use the online Nominatim API. As depicted in Fig. 3, for both the Point and Polygon functionalities E-OSM performs better, with reductions of around 88% and 86% respectively in response time. As expected, clients using the E-OSM API only require a local connection to the database, as opposed to accessing the geographically distant OSM server through Nominatim; there are fewer intermediary network nodes involved, and therefore less delay. As for the Linestring functionality, the request consists of finding the shortest path between two points. Here E-OSM shows negative results because it returns the actual shortest path, whereas Nominatim returns only the first set of points (route) which can connect the two points.

Fig. 3. Response Time Comparison for E-OSM & Nominatim

V. CONCLUSION & FUTURE WORKS

We successfully implemented an API for the Emerginov platform which stores geolocation data in an offline mode and provides a RESTful interface to be consumed by micro-services. Test results demonstrate that the offline mode clearly solves the data-rate and delay issues experienced over the Internet, and it can thus be adopted in developing countries. Our future work, on the one hand, deals with improving the performance of the shortest-path algorithm, which is based on the categorisation tags we introduced. On the other hand, in order to handle a bigger data set, such as geolocation data for the whole of Africa, we are investigating the opportunities provided by Big Data solutions such as Hadoop's Distributed File System (HDFS) [25] and the MapReduce [26] programming model.

REFERENCES
[1] ISACA, "Geolocation: Risk, Issues and Strategies", white paper, ISACA, USA, 2011.
[2] Google Map APIs. Available at: https://developers.google.com/maps/?hl=en [Sep 2015]
[3] OpenStreetMap (OSM) API. Available at: http://wiki.openstreetmap.org/wiki/API [Aug 2015]
[4] ITU, "ICT Facts and Figures - The World in 2015", International Telecommunications Union, Geneva, May 2015. Available at: http://www.itu.int/en/ITU-D/Statistics/Pages/facts/default.aspx
[5] GSM Association, "The Mobile Economy 2015", GSMA, London, UK, 2015. Available at: http://www.gsmamobileeconomy.com/GSMA_Global_Mobile_Economy_Report_2015.pdf
[6] M. Richomme, D. Blaisonneau, B. Herard, B. Ngom and G. Suddul, "Emerginov: An Open PHP PaaS to Stimulate and Animate Local Co-innovation", In Proc. of 4th International ICST Conference, AFRICOMM 2012, Yaounde, Cameroon, Nov 2012. Springer-Verlag, Berlin Heidelberg, pp. 126-132.
[7] G. Suddul, A. Soobul, U. Bahadoor, A. Ramdoyal, N. Doolhur and M. Richomme, "An Open USSD-Enabler to Simplify Access to Mobile Services in Emerging Countries", 4th International Conference on Emerging Trends in Engineering and Technology (ICETET), 2011.
[8] N. Doolhur, G. Suddul, R. Foogooa and M. Richomme, "An Open API to Monetize Mobile Micro-Services for Emerging Countries", In Proc. of the IEEE 11th International Africon 2013 Conference, Mauritius, Sep 2013. IEEE, USA, pp. 684-687.
[9] Yahoo BOSS PlaceFinder API. Available at: https://developer.yahoo.com/boss/placefinder/ [Sep 2015]
[10] Nominatim Search Tool. Available at: http://wiki.openstreetmap.org/wiki/Nominatim [Jan 2015]
[11] XAPI, OSM Extended API. Available at: http://wiki.openstreetmap.org/wiki/Xapi [Jan 2015]
[12] Overpass API. Available at: http://wiki.openstreetmap.org/wiki/Overpass_API [May 2015]
[13] Data Processing or Parsing Libraries. Available at: https://wiki.openstreetmap.org/wiki/Frameworks [May 2015]
[14] Libosmscout. Available at: https://wiki.openstreetmap.org/wiki/Libosmscout [May 2015]
[15] GNU Lesser General Public License. Available at: http://www.gnu.org/licenses/lgpl-3.0.en.html [Jun 2015]
[16] CartoType. Available at: https://wiki.openstreetmap.org/wiki/CartoType [Apr 2015]
[17] OpenStreetMap Database. Available at: http://wiki.openstreetmap.org/wiki/Database [Aug 2015]
[18] W3C XML Specification (version 1.0), 2008. Available at: http://www.w3.org/TR/2008/REC-xml-20081126/ [Aug 2015]
[19] JSON Specification. Available at: http://json.org/ [Sep 2015]
[20] Osmosis Tool. Available at: www.wiki.openstreetmap.org/wiki/Osmosis [Jun 2015]
[21] Derick Rethans, 3angle, OSM to MongoDB. Available at: http://derickrethans.nl/talks/osm-mongouk12.pdf [Mar 2015]
[22] MongoDB. Available at: https://www.mongodb.org/ [Jun 2015]
[23] Dijkstra, E.W., "A Note on Two Problems in Connexion with Graphs", Numerische Mathematik 1, pp. 269-271, 1959.
[24] A. Mannara, "Location-based Games and the Use of GIS Information", Master's Thesis, Norwegian University of Science and Technology, 2012.
[25] Hadoop HDFS. Available at: http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html [Sep 2015]
[26] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", In Proc. of the 6th Symposium on Operating Systems Design and Implementation, USENIX, 2004.
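The bounded nearby-object search described in Scenario 3 above (searchPointBound) can be approximated in a few lines of pure Python using the haversine great-circle distance. This sketch filters an in-memory point list with invented field names; the actual API answers such queries from its MongoDB store, so this is an illustration of the semantics only.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def search_point_bound(points, lat, lon, keyword, range_m):
    """Return the points tagged `keyword` within `range_m` metres of (lat, lon)."""
    return [p for p in points
            if p["tag"] == keyword
            and haversine_m(lat, lon, p["lat"], p["lon"]) <= range_m]
```

With range_m set to 1000, this reproduces the "clothing shops within 1 km" query of Scenario 3 on whatever point set is supplied.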
Role of Attributes Selection in Classification of Chronic Kidney Disease Patients

Naganna Chetty
Research Scholar at UTU, Dehradun, and
Dept. of CSE, MITE, Mangalore, India
nsc.chetty@gmail.com

Kunwar Singh Vaisla
Dept. of Computer Science and Engg.,
BTK Institute of Technology, Dwarahat,
Uttarakhand, India
vaislaks@rediffmail.com

Sithu D Sudarsan
ABB Corporate Research, India
sdsudarsan@gmail.com
Abstract—In the present day, Chronic Kidney Disease (CKD) is a common problem among the public. CKD is generally considered as kidney damage and is usually measured with the GFR (Glomerular Filtration Rate). Several researchers from health care and academia are working on the CKD problem to build an efficient model to predict and classify CKD patients in the initial stage of the disease, so that the necessary treatment can be provided to prevent or cure it. In this work, classification models have been built with different classification algorithms, the WrapperSubset attribute evaluator and the BestFirst search method, to predict and classify CKD and non-CKD patients. These models have been applied to a recently collected CKD dataset downloaded from the UCI repository. The models have shown good performance in classifying CKD and non-CKD cases. Results of the different models are compared; from the comparison, it has been observed that the classifiers performed better on the reduced dataset than on the original dataset.

Keywords—Data mining; classification; prediction; chronic kidney disease; attributes reduction

I. INTRODUCTION

Real-world data doubles every 20 months and contains some amount of noise; hence there is a need to store, manage and process this data efficiently. With the rapid advances in storage devices, it is easier to store and manage vast amounts of data. Even though continuous effort has been made, the efficient processing of huge amounts of data is still a challenge for researchers and academicians. This challenge can be handled with data mining techniques. Data mining is an essential activity in the KDD (Knowledge Discovery in Databases) process, which extracts patterns from observed data.

Health Informatics is producing vast amounts of data, and processing this data creates more possibilities for knowledge to be gained. The gained information can improve the service quality of healthcare to patients. A number of issues arise when dealing with such a vast amount of data; one among them is how to analyze the data in a reliable manner. The basic goal of Health Informatics is to use real-world medical data to improve our understanding of medicine and medical practice [1].

The present work emphasizes an application of data mining, in particular classification techniques, in health informatics to detect Chronic Kidney Disease (CKD).

CKD is usually a silent condition. Signs and symptoms, if present, are generally nonspecific and, unlike in several other chronic diseases (such as congestive heart failure and chronic obstructive lung disease), they do not reveal a clue to the diagnosis or severity of the condition. Typical symptoms and signs of uremia almost never appear in the early stages (Stage 1 to 3, and even Stage 4) and develop late, and only in some patients, in the course of CKD. Still, all newly diagnosed CKD patients, patients with an acute worsening in their kidney function, and CKD patients on regular follow-up should have a focused history and physical examination. This will be the key to perceiving the real health implications associated with decreased kidney function in CKD [2].

CKD is defined as damage to the kidney or a Glomerular Filtration Rate (GFR) < 60 mL/min/1.73 m2 for 3 months or more, irrespective of the cause. Kidney damage in kidney-related diseases can be indicated by the presence of albuminuria, defined as an albumin-to-creatinine ratio > 30 mg/g in two of three spot urine specimens. GFR can be estimated from calibrated serum creatinine and estimating equations, such as the Modification of Diet in Renal Disease (MDRD) study equation or the Cockcroft-Gault formula [3].

GFR is traditionally measured as the renal clearance of an ideal filtration marker, such as inulin, from plasma. This measured GFR is considered the gold standard but is not practical for daily clinical use due to the complexity of the measurement procedure. Estimating GFR based on a filtration marker (usually serum creatinine) is now widely accepted as an initial test. Several GFR prediction equations that use serum creatinine or other filtration markers along with certain patient characteristics (like age, gender, and race) give precise estimates of GFR in various clinical settings [4].

The different stages and action plans for CKD are shown in TABLE 1. Here the CKD stages from 1-5, along with the GFR reading and the actions required during each stage, are described.
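The wrapper-based attribute selection named in the abstract (Weka's WrapperSubset evaluator with BestFirst search) can be illustrated by a rough stdlib-only analogue: greedy forward selection scored by the leave-one-out accuracy of a simple 1-nearest-neighbour classifier. The toy data and all names below are our inventions, and Weka's implementation differs in detail; this only shows the wrapper idea of scoring attribute subsets by the accuracy of the target classifier.

```python
# Sketch of wrapper-style attribute selection: attribute subsets are scored
# by the classifier itself (here 1-NN with leave-one-out accuracy), and a
# greedy forward search adds attributes while the score improves.

def loo_accuracy(rows, labels, attrs):
    """Leave-one-out accuracy of 1-NN using only the attribute indices in attrs."""
    correct = 0
    for i, row in enumerate(rows):
        best_d, best_j = None, None
        for j, other in enumerate(rows):
            if j == i:
                continue
            d = sum((row[a] - other[a]) ** 2 for a in attrs)
            if best_d is None or d < best_d:
                best_d, best_j = d, j
        if labels[best_j] == labels[i]:
            correct += 1
    return correct / len(rows)

def forward_select(rows, labels):
    """Greedily add the attribute that most improves the wrapper score."""
    n_attrs = len(rows[0])
    selected, best_score = [], 0.0
    while True:
        gains = [(loo_accuracy(rows, labels, selected + [a]), a)
                 for a in range(n_attrs) if a not in selected]
        score, attr = max(gains)
        if score <= best_score:  # stop when no remaining attribute helps
            return selected, best_score
        selected.append(attr)
        best_score = score
```

On data where one attribute separates the classes and another is noise, the search keeps the informative attribute and discards the noisy one, mirroring the reduced-dataset effect reported in the abstract.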
[Figure: knowledge discovery process, Dataset → Attributes Selection → Selected Attributes → Data Mining (Classification) → Patterns → Knowledge Discovery, with the data and selected attributes fed separately to the Naïve Bayes, IBK and SMO classifiers.]

[Figure: bar chart comparing the number of attributes, the number of selected attributes and the reduction (%) achieved by the attribute selector with Naïve Bayes, SMO and IBK.]

[Figure: bar chart comparing correctly classified instances, incorrectly classified instances and classification accuracy for the NB, SMO and IBK classifiers.]
Abstract—Facebook has become the most powerful online social network in recent years. Its billions of users appreciate the ease of use and the various applications of the network. It is so pervasive in people's personal lives that it is difficult to imagine life without Facebook. However, there have been a number of concerns with regard to privacy and security. The present study investigates the current use of Facebook in Mauritius and assesses the level of awareness of privacy and security among a selected sample of users. The literature is scarce in Mauritius, even though the island aspires to be the Cyber Island of the Indian Ocean; thus, a large-scale study on the use of Facebook in Mauritius is required. A comprehensive questionnaire was designed to gauge perceptions and to investigate privacy and security concerns. Results demonstrate that users in Mauritius blindly trust the network but are relatively concerned about the usage of their personal information. It is also a matter of serious concern that users have limited knowledge of the provisions of the Computer Misuse and Cybercrime Act (2003), which is one of the harshest in the world when it comes to the misuse of computers. The paper proposes a number of recommendations which need to be applied urgently by online social networks and other stakeholders, including the need for an explicit informed consent from users each time personal data is used.

Keywords—Facebook; privacy; law; security; extimacy; awareness; Mauritius

I. INTRODUCTION

Facebook, the technological revolution that has invaded the world and become an intimate part of the private lives of millions of users, is increasingly 'commodifying' the personal into the public sphere [1]. Around the world, Facebook has positioned itself as a trove of data from a wide spectrum of users, at least 1.49 billion of whom are active each month [2]. Researchers state that 'the revenue generated by Facebook alone is approximately 3.91 million dollars' [3]. Facebook is regarded as one of the most wide-reaching information-sharing platforms. At present, there are more than 370,000 Facebook users in Mauritius, and the 18-34 age group is the most active, making up 63% of total users [4]. Unfortunately, not much has been published in the field of online social networks, and the literature is even scarcer for developing countries like Mauritius, which have embraced technological innovations without any second thoughts. What is the current scenario in Mauritius? Is Facebook private and secure? Very often, before feeding sensitive and personal information to social networks, users hardly give a thought to the implications. This is precisely the kind of question that shakes an average active user who pays no attention to privacy and security issues on the online network. In fact, no survey has been done in Mauritius in which the personal information dispensed through Facebook has been assessed in terms of awareness of privacy and security.

AIMS AND OBJECTIVES OF PRESENT STUDY

The goal of this research is to contribute to the safety and security of personal information on Facebook and to raise consciousness about distributing sensitive data which might affect people's lives. In this work, we examine the currently existing privacy and security policies of Facebook and look at users' awareness vis-a-vis privacy practices and their knowledge of the existing laws of the country. The objectives can be described as follows:
To assess the current awareness level of the current terms of use of Facebook
To evaluate user privacy concerns with regard to Facebook
To gauge respondents' knowledge of the provisions of the relevant legislation governing the use of ICT in Mauritius
II. LITERATURE REVIEW taxonomy of privacy threats which they categorise into attacks
directed at the (i) user, and (ii) online social networking
A. The ‘New World Order’ of Facebook platform. These range from identity thefts, fraud, virus spread,
In his book „The Facebook Effect‟, Kirkpatrick [5] highlights „malware‟, phishing, to „de-anonymization‟ of user identity.
the founder of Facebook, Mark Zuckerberg‟s philosophy of Albanesius [12] cites a Wall Street Journal report indicating
„publicness‟. Marichal [1] aptly describes the design of the that many applications share personal data with advertising
platform as „architecture of disclosure‟, whereby the human agencies and other internet tracking companies. According to
instinct of „sharing‟ is networked as an act of „public Wall Street Journal this breach of privacy has impacted
revelation‟. This „engineering of connectivity‟ [6] results into millions of users. Companies are „mining trends‟ through
generalised visibility of private life, often without the likes, and data surveillance has the power to swing election
conscious awareness of users. Facebookers, then, are very results. In fact, many risks are involved that might rope in
vulnerable indeed in the hands of hackers, lurkers and other users in complex situation about the protection of information
unscrupulous forces. Tello [7] argues that certain Facebook on Facebook and the freedom of people‟s privacy. Kulcu &
settings infringe on the right of people to private scrutiny and Henkoglu [13] explains that the cognizance and control of
the protection of personal data. She aptly refers to the erosion digital information by the content owner has a decisive
of the boundaries of privacy as „extimacy‟ – the phenomenon responsibility in the assurance of personal data. It is
of exposed intimacy. Searches, likes, browses, links, and unbelievable and even worth considering how information
other acts of virtual participation can be tracked by Big distributed around the world through Facebook, are used
Brother avatars such as (i) individuals obssessed with feeding momentarily and replicated in numbers. Furthermore,
on the private lives of others (ii) profit-oriented companies, Facebook and even other social networking sites keep on
and (iii) Governments collecting personal data in their own polishing up their advertising strategies based on the personal
servers. Ominously quoting Scott McNealy, the co-founder data obtained on a regular basis. Facebook follow the tracks
of Sun Microsystems, „ Privacy has died. Get used to‟, Tello and sort ads accordingly with the personal interest of users. It
[7] also calls for the review of certain features of Facebook‟s is crucial to secure private information through making legal
design which limit users‟ ability to control the accessibility of arrangement concerning the way Facebook might use personal
personal information. data and also without user‟s confirmation, the diffusion and
use of information be allowed. Nonetheless, the onus falls on
the users to protect their privacy since personal information is
B. Big Brother’s Eyes: Privacy and Security issues imparted on Facebook by themselves. Private information is
According to Zhang & Sun [8], revealing personal information in any online social network (OSN) is described as a 'double-edged sword'. On the one hand, publicizing information is, in general, a must, as individuals desire to take part in social communities; on the other hand, an individual's identity may attract malicious attacks such as 'stalking, personalized spamming and phishing' from the virtual community. Shehab, Squicciarini, Ahn & Kokkinou [9] are of the opinion that the growth of insecurity and uncertainty owing to the mushrooming of network sharing services via the internet, the booming volume of messages, and the fast progress of ICT has made the protection of privacy one of the most arguable, worrisome and daunting challenges. Leitch & Warren [10] provide considerable empirical evidence to support their argument on the reality of security threats related to Facebook, and further state that the security risks and threats on Facebook are similar to those facing the common internet community. In various cases, these risks proved significant and had a commanding effect on Facebook; for instance, for people who trust their Facebook companions, the possible exposure to identity theft is even greater. Shehab, Squicciarini, Ahn & Kokkinou [9] point out that the increased use of social networks like Facebook and Twitter has caused many challenging security and privacy difficulties. The growing amount of personal information on Facebook raises privacy concerns and necessitates a deep understanding and awareness of security problems. Kayes and Iamnitchi [11] propose an elaborate survey of privacy and security in online social networks.
Personal information is displayed and provided on Facebook by the user's own will. The exposure of personal information was limited by Facebook in 2005, yet everyone could access these data in 2010. In addition, users had the choice of keeping the visibility of their intimate information within bounds [14]. From one point of view, this adjustment in Facebook's default settings empowered everyone to view the data in a user's profile; on the other hand, users had the chance to change the entire profile background to defend their privacy. However, the changes related to the services agreement have created awkward conditions for users. Although users are informed by mail about sporadic visits to their Facebook accounts, they pay no heed to their account information, which may result in an increase in possible risks. Another concern is the complication of gaining access to and applying the privacy settings, which have to be provided to users owing to legal obligations, although this backdrop differs from the basic sharing principle of Facebook. Those who do not know how to protect their personal data, and those who have insufficient information about the settings to protect their privacy on Facebook, are really unaware of what confidential information is actually available. Based on the information acquired through Facebook, anyone can get in touch with an individual's private data [15]. Facebook may act correctly and regulate our social lives, but it can also damage our romantic relationships. In fact, Facebook does gather data on a large scale about its users in spite of copyright issues. Moreover, Facebook is aware when one looks at another person's timeline or sends a message, and it even knows the
time, date and place where your picture is taken. For example, if you log on or post anything from a mobile phone, it can spot where you are, or else, if you access the site from a computer, it immediately tracks you through the IP address you are using [16].

C. Legal framework for using Facebook in Mauritius
As an online social networking service, Facebook has been the subject of much debate, especially with regard to the right to privacy and freedom of expression. Article 22 of the Civil Code of Mauritius lays the foundation of the right to privacy [17]. It reads: 'Chacun a droit au respect de la vie privée.' ('Everyone has the right to respect for his private life.') In other words, the law of Mauritius jealously protects and guarantees the right to privacy. The parent Act, the Information and Communication Technologies Act 2001 of Mauritius, was enacted with the objective of making the dissemination of information easy while guaranteeing citizens the necessary protection of the inalienable right to privacy. The question that has perplexed many people, here and elsewhere, is how far we have compromised our right to privacy under the garb of freedom of expression and the right to information through the medium of Facebook. It is pertinent to note that by just clicking to accept the terms and conditions appended to the use of Facebook, we are voluntarily or inadvertently relinquishing our right to privacy. In this context, we must remind ourselves that recourse to the authorities for redress would become a difficult, if not impossible, legal battle. Thus, under Section 46 of the Computer Misuse and Cybercrime Act, 'any person who knowingly sends, transmits or causes to be transmitted a false or fraudulent message; uses an information and communication service, including telecommunication service, for the transmission or reception of a message which is grossly offensive, or of an indecent, obscene or menacing character; or for the purpose of causing annoyance, inconvenience or needless anxiety to any person; or for the transmission of a message which is of a nature likely to endanger or compromise State defence, public safety or public order', will, if found guilty, be liable to a fine not exceeding 1,000,000 rupees and to imprisonment for a term not exceeding 5 years. Citizens need to be educated that Facebook, being an international social network, does not and cannot check the veracity of the information shared among its users. The responsibility to do so resides with the user of the network, and the consequences of publicizing information through the medium remain exclusively with the user. Moreover, the jurisdiction where a case may be lodged against Facebook is California, which may make it difficult for some to achieve justice. However, in March 2015 a French court ruled that a litigant can have recourse to French jurisdiction in the event of a dispute arising between Facebook and its user in France. This is a matter of jurisprudential importance, but as regards Mauritius, it is still to be tested how far a Mauritian citizen can take a matter against Facebook to a Mauritian court.

D. Existing Research and limitations
The usual objective of most previous research was to gauge the existing level of privacy and user awareness and to reveal the dangers and risks that social network users encounter. In spite of the extensive and alarming privacy and security clauses in the user agreements of online social networks, users are continually sharing, liking and blindly connecting all sorts of personal information with enthusiasm and curiosity, and as a result blunting, with a smile, the criticism made against online social networks in past research papers. Past research showed that social network users are often not aware of the bulk of data that is actually being divulged, as the privacy threat is often hidden from or unclear to them. Complaints from Facebook users were related to failing compliance with privacy matters and practices. The improper use of private information on social networking sites and the content viewable by other users have become the key reasons for the need to be responsive about the use of social networking sites and to attribute importance to privacy. Various studies emphasized the important position of users on social networking sites. In Mauritius, there is no study analysing user profiles which puts forward the sensitivity and awareness level of users concerning the protection of personal data and privacy. One study in Turkey showed the shortfall in the legal arrangements, but online users gain an advantage from this study, apparently to act more cautiously against the risks in the general use of social networks [13]. Since the advent of social platforms, the protection of users' privacy on Facebook has been an element of discussion, but privacy policies change each year on these platforms [13]. However, in past studies, a decrease was noticed in the amount of personal data available to everyone, but an increase was also noted in the amount shared with users in the friend list. It is easy to access important personal data of Facebook users unless the privacy settings are changed. Everyone must contemplate the restrictions in the default settings, which normally allow other users to view the Facebook profile [13]. To what extent are privacy settings used, and how aware are Facebook users of the various privacy and security threats? It is commonly assumed that users cannot be bothered to use privacy controls. Albanesius [18] reports a projection that 13 million users do not apply or are not aware of privacy controls. She cites a survey showing that 28% of respondents open their profiles to others beyond friends, and that 4.8% reveal their whereabouts on Facebook. Do they have any knowledge of the dangers lurking on the net? Interestingly, Grimmelmann [19] argues that people are generally not cognisant of the consequences of revealing personal information on Facebook. He posits that users 'massively misunderstand Facebook's privacy architecture.' Grimmelmann [19] debunks 'myths' about users' attitudes towards the sharing of personal information, asserting that 'social networking sites activate the subconscious cues that make users think that they are interacting within bounded, closed, "private spaces"'.

III. METHODOLOGICAL APPROACH
The current study adopts a quantitative approach. This research is based on a two-pronged outlook on:
1. The terms of use of Facebook
2. The Computer Misuse and Cybercrime Act (2003) of Mauritius
The survey method has been chosen to gather data from a wide audience. This method is recommended in several studies on online social networks as it allows the collection of a large amount of data [20]. The sampling technique used is a probability method, namely random sampling. A questionnaire was designed and disseminated online, especially on the Facebook walls of the authors of the study, and passed on to random users in Mauritius. The first part of the questionnaire assessed the level of awareness of the contents of the 'terms of use' of Facebook. In the second part, knowledge of the laws of Mauritius pertaining to the use of ICT was assessed. The 'friends' of the respective authors were encouraged to participate in the study and to disseminate it to a maximum number of people online. A number of printed questionnaires were also distributed randomly to Facebook users in Mauritius from a wide-ranging background. According to statistical tables, a population of 370,000 yields a required sample of 384 responses at a 95% confidence level. About 620 responses were obtained, of which 600 were analysed after cleaning and coding of the data. Data was analysed using SPSS for descriptive and inferential outputs. An open-ended question was also included, and this was analysed qualitatively in the form of a word cloud.

IV. FINDINGS AND INTERPRETATION

A. Demographic Information
The following shows the demographic distribution of the sample. The majority of users were employees of either the private or the public sector, making up 66% of the total. In general, people make use of the network on a quasi-daily basis, as shown below:

Fig 3. Usage of Facebook

B. General interpretation
Results show that users had serious concerns with regard to the use of private data but were ignorant of the type of information being collected. Thus, there is no correlation (r = 0.046) between knowledge of the terms of use of Facebook and the amount of information collected by the network. This clearly shows that though the terms of use are public knowledge, no one reads them, which accounts for users' ignorance of the fact that Facebook collects all data pertaining to their contents, chats and other activities.
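The sample-size figure cited in the methodology (384 responses for a population of about 370,000 at 95% confidence) can be reproduced with the Krejcie and Morgan (1970) formula that underlies such statistical tables. A minimal sketch, assuming a 5% margin of error and maximum variability (P = 0.5), neither of which is stated in the paper:

```python
import math

def krejcie_morgan(N, chi2=3.841, P=0.5, d=0.05):
    """Required sample size for a finite population N.

    chi2: chi-square value for 1 df at 95% confidence,
    P: assumed population proportion (0.5 maximises the sample),
    d: margin of error (assumed 5% here).
    """
    return (chi2 * N * P * (1 - P)) / (d ** 2 * (N - 1) + chi2 * P * (1 - P))

print(math.ceil(krejcie_morgan(370_000)))  # 384
```

The formula converges quickly for large populations, which is why the quoted 384 barely changes whether the population is 100,000 or a million.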
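The reported absence of correlation (r = 0.046) between knowledge of the terms of use and the amount of information collected is a Pearson product-moment coefficient of the kind SPSS computes. A self-contained sketch with hypothetical Likert-scale responses (not the study's data):

```python
def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# hypothetical 5-point Likert responses for two questionnaire items
terms_knowledge = [1, 3, 2, 4, 2, 3, 1, 5, 2, 3]
data_awareness  = [3, 2, 4, 2, 3, 1, 4, 2, 3, 2]
r = pearson_r(terms_knowledge, data_awareness)
```

A value of |r| close to zero, as in the study, indicates no linear relationship between the two items.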
but in general, some are called just 'Facebook friends', which is equivalent to the online aspect only.

A more in-depth analysis was carried out on the available data to test whether there was any difference between male and female respondents concerning knowledge of the terms of use. It shows that there is a difference (significant at 0.000, less than 0.05).

ANOVA: Terms of use and Gender

The responses also revealed:
• Inadequate knowledge of the provisions of the law
• Superficial insight into the fines associated with transgressing the law
• Erroneous ideas about the term of imprisonment associated with specific offences
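The gender comparison above is a one-way ANOVA on the terms-of-use knowledge scores. The F statistic that SPSS reports can be sketched as follows; the male and female scores here are hypothetical, not the study's data:

```python
def one_way_anova_f(*groups):
    """F statistic: between-group mean square over within-group mean square."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    k, n = len(groups), len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# hypothetical terms-of-use knowledge scores by gender
male = [2, 3, 2, 1, 3, 2]
female = [4, 3, 5, 4, 3, 4]
f_stat = one_way_anova_f(male, female)
```

A large F (with a p-value below 0.05, as reported) indicates that the two groups differ.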
V. RECOMMENDATIONS
Risks are never-ending when it comes to technology. What people have to understand is that technology is not perfect and certain aspects of it are not foolproof. Technology is man-made and glitches may happen at any time. Incidents may happen with any website, not just Facebook. Hence it is up to the public to be reasonable about what information they share and keep on social networking sites. Furthermore, we should be cautious and remember to delete personal conversations if they are no longer needed. We choose to share information through Facebook at our own peril; it is a willing risk we are taking. In the light of the findings of this study (Section IV, parts B, C & D of the paper), the following measures are proposed to protect user privacy:
• awareness-raising campaigns in various media to sensitise the population about basic security measures regarding Facebook usage
• enforcement of our legal systems vis-à-vis security and privacy issues
• integrating safe Facebooking principles into secondary and tertiary curricula
• changing certain privacy policies of Facebook and re-designing elements of its architecture which make users vulnerable to privacy attacks (for instance, user consent should be sought prior to using personal information)
• dissemination of information security news to organisations and the public
• awareness sessions for organisations on information security issues
• conducting IT security audits for organisations with respect to social networks
• a dedicated cyber security portal
• sensitisation campaigns in schools and cyber-caravans
• a dedicated portal for online reporting of privacy breaches

VI. CONCLUSION
The human race is continually going through a lot of changes nowadays, and it could be said that something elusive is happening every second. New inventions, new gadgets, new mobiles, new fashions, new social platforms and much more are the rage of the day. Hacking and interception of users' profiles by other unknown users are not uncommon. The technological processes, starting from the 'click or touch screen to like or share' operations, are indeed very complex to users, and this complexity is hidden from them behind a 'window'-like interface.
Facebook research has started only very recently. Considering the importance of this social network in the lives of individuals, it is worth noting that people have been overtaken by the novelty of this phenomenon. The majority of users willingly join networks without understanding the risks of doing so. The terms of use, though publicly available, are not being scrutinised and applied by the users. However, paradoxically, users feel strongly about privacy issues.
People on social networks want to have private conversations without anyone, even the site owner, being aware of the contents. Unfortunately, this remains a utopia with the new technology, which seems to capture everything, even before we join the network.
Facebook's architecture promotes extimacy and a global culture of publicness. Within this superstructure of shared disclosure, Big Brother avatars pose serious threats to the personal information willingly shared every day by millions of users unaware of the dangers lurking in the maze of networks. Whether shared innocently, naively, foolishly or negligently, the fact remains that intimate data, and indeed every move on Facebook, can be spied on and fraudulently misused. Although users don't read the terms and conditions of usage and profess ignorance of privacy threats as well as existing laws, paradoxically they are concerned about the various security issues surrounding private data in social networks. This calls for the implementation of widespread awareness-raising endeavours to deflate the myth that privacy can be safely shared on Facebook. Whilst policy changes aimed at giving users some degree of control over their data will require global lobbies, in the short term enlightened and responsible 'facebooking' can go a long way towards correcting a utopic vision of Facebook.

Acknowledgments
The authors would like to put on record the invaluable contribution of Miss Lorna Ramsamy and Miss Nivita Mannick, Research Assistants, for their significant input.

VII. REFERENCES
[1] Marichal, J. Facebook Democracy: The Architecture of Disclosure and the Threat to Public Life. Ashgate. 2012.
[2] International Business Times. Beyond LOL – Facebook's collection and analysis of user data – no laughing matter. http://www.ibtimes.com/beyond-lol-facebooks-collection-analysis-user-data-no-laughing-matter-204671. 2015.
[3] Kumar, A.M., Sharma, B.N. & Shrivastava, S.K. Online Social Networks: Privacy Challenges and Proposed Security Framework for Facebook. International Journal of Soft Computing and Engineering (IJSCE). 2014. Vol. 4, Issue 1.
[4] Le Defi Quotidien. Le nombre des utilisateurs de Facebook. 2014.
[5] Kirkpatrick, D. The Facebook Effect: The Real Inside Story of Mark Zuckerberg and the World's Fastest-growing Company. 2011. Virgin Books.
[6] Van Dijck, J. The Culture of Connectivity: A Critical History of Social Media. Oxford. 2013. Oxford University Press.
[7] Tello, L. Intimacy and 'Extimacy' in Social Networks: The Ethical Boundaries of Facebook. Comunicar. 2013. No. 41, Vol. XXI.
[8] Zhang, C. & Sun, J. Privacy and Security for Online Social Networks: Challenges and Opportunities. IEEE Network. 2010.
[9] Shehab, M., Squicciarini, A., Ahn, G-J. & Kokkinou, I. Access control for online social network third-party applications. Computers & Security. 2012. pp. 897-911. Elsevier Ltd.
[10] Leitch, S. & Warren, M. Security Issues Challenging Facebook. Proceedings of the 7th Australian Information Security Management Conference. 2009.
[11] Kayes, I. & Iamnitchi, A. A Survey on Privacy and Security in Online Social Networks. http://arxiv.org/pdf/1504.03342.pdf. 2015.
[12] Albanesius, C. A Privacy Breach for Facebook? PC Magazine. 2010. Vol. 29, Issue 12, p. 1.
[13] Kulcu, O. & Henkoglu, T. Privacy in social networks: An analysis of Facebook. International Journal of Information Management. 2014. pp. 761-769. Elsevier Ltd.
[14] Boyd, D. & Hargittai, E. Facebook privacy settings: Who cares? First Monday. 2010. Vol. 15, No. 8. http://firstmonday.org/article/view/3086/2589#p2
[15] Kosinski, M., Stillwell, D. & Graepel, T. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences (PNAS). 2013.
[16] Le Defi Quotidien. Facebook viral copyright warning is useless. http://www.defimedia.info/dimanche-hebdo/dh-magazine/item/22845-facebook-%E2%80%93-viral-copyright-warning-is-useless.html. 2012.
[17] Code Civil Mauricien – Article 22. http://www.ilo.org/dyn/natlex/docs/ELECTRONIC/88152/114145/F-172904586/MUS88152%20Fre.pdf. 2012.
[18] Albanesius, C. 13 Million U.S. Facebook Users Not Using Privacy Controls. PC Magazine. 2012. p. 1.
[19] Grimmelmann, J. Privacy as Product Safety. Widener Law Journal. 2010. Vol. 19, Issue 3, pp. 793-827.
[20] Fasola, O. S. Perceptions and acceptance of librarians towards using Facebook and Twitter to promote library services in Oyo State, Nigeria. The Electronic Library. 2015. Vol. 33, Issue 5.
UOM Multimodal Face and Ear Database
Nazmeen B. Boodoo-Jahangeer* & Sunilduth Baichoo
Department of Computer Science, University of Mauritius,
Reduit, Mauritius
nazmeen182@yahoo.com*
Abstract—Research in biometrics has received increased attention in the past decades. Existing face and ear databases are described and compared. Some databases may not have a sufficient number of subjects or images per subject to allow proper evaluation of algorithms. In order to support the development of algorithms for biometrics, especially for face and ear, a new database, the UOM Multimodal database, has been created. This database comprises images of more than 100 volunteers taken at the University of Mauritius. The images are taken under different lighting conditions and various poses. At least 80 images per subject are available in the database.

Keywords—Face Recognition; Biometrics; Ear; Database

I. INTRODUCTION
Face recognition plays an important role in our society, since it is currently being used successfully in several applications including identity cards, passports, credit cards, driver's licenses, access control and crowd surveillance, among others. Researchers have studied face recognition extensively in the past few decades and have developed several algorithms that allow identification of an individual based on their facial image. Nevertheless, face recognition still remains a challenge, since researchers are trying to achieve improved performance rates. One way of improving the performance of an authentication system is by fusing face features with ear features, thus creating a multi-modal biometric system.

Ear biometrics, on the other hand, is a newer technology compared to face biometrics. The potential of the human ear for personal identification was recognized and advocated as long ago as 1890 by the French criminologist Alphonse Bertillon [1]. In machine vision, ear biometrics has received little attention compared to the more popular techniques of automatic face, eye, or fingerprint recognition. However, ears have played a significant role in forensic science for many years, especially in the United States, where an ear classification system based on manual measurements was developed by Iannarelli and has been in use for more than 40 years [2].

Ears have certain advantages over the more established biometrics; as Bertillon pointed out, they have a rich and stable structure that is preserved from birth well into old age. The ear does not suffer from changes in facial expression, and is firmly fixed in the middle of the side of the head, so that the immediate background is predictable, whereas face recognition usually requires the face to be captured against a controlled background. Ear image acquisition does not have an associated hygiene issue, as may be the case with direct-contact fingerprint scanning, and is not likely to cause anxiety, as may happen with iris and retina measurements. The ear is large compared with the iris, retina, and fingerprint and therefore is more easily captured.

The main drawback of ear biometrics is that it is not usable when the ear of the subject is covered [2]. In the case of active identification systems, this is not a drawback, as the subject can pull his hair back and proceed with the authentication process. The problem arises during passive identification, as in this case no assistance on the part of the subject can be assumed. In the case of the ear being only partially occluded by hair, it is possible to recognize the hair and segment it out of the image.

A common problem that researchers face is the non-availability of a large dataset to test the reliability of face and ear recognition algorithms. In this work, datasets of face and ear images have been created to cater for the testing of different algorithms. The next sections outline the different databases available, evaluate them and explain the framework of the UOM Multimodal face and ear database.

II. BACKGROUND STUDY
In this section, the existing face databases used in the literature by other researchers are outlined, together with the features of each database.

A. Face Databases
A database consists of a number of images used for testing an algorithm. Several criteria affect the choice of database, depending upon the task required. Some features of a database are:
Number of subjects: each database has a different number of subjects.
Number of images per subject: the number of images per subject helps in determining the sizes of the training and test sets.
Gender: certain databases cater for both male and female subjects, and support research in gender classification.
Image resolution: the resolution of an image determines the storage space required and also has an impact on the processing time.
Pose variation: images of different poses help in testing algorithms that cater for variance in pose.
Light variation: in real-life scenarios, light cannot be controlled; images with lighting variance help to test algorithms that cater for light variance.
Variation in facial expression: face images can be subject to different expressions, which can be used for research such as detecting drowsy drivers.
Occlusion: in a real-life system, the image of a subject is at times partly occluded by a scarf, sunglasses or jewelry. A database providing occluded face images helps to test the robustness of algorithms with respect to occlusion.
The following paragraphs give details about the existing facial databases commonly used in the literature.

1) AR Face database
The AR face database [3] contains over 4,000 color images corresponding to 126 people's faces (70 men and 56 women). Images feature frontal-view faces with different facial expressions, illumination conditions, and occlusions (sunglasses and scarf). The pictures were taken under strictly controlled conditions. No restrictions on wear (clothes, glasses, etc.), make-up, hair style, etc. were imposed on participants. Each person participated in two sessions, separated by two weeks (14 days).

2) MIT database
The MIT-CBCL face recognition database contains face images of 10 subjects [4]. It provides two training sets. The first consists of high-resolution pictures, including frontal, half-profile and profile views, while the second consists of synthetic images (324 per subject) rendered from 3D head models of the 10 subjects. The test set consists of 200 images per subject, with varying illumination, pose (up to about 30 degrees of rotation in depth) and background.

3) BANCA face database
The BANCA database is a multi-modal database intended for training and testing multi-modal verification systems [5]. It was captured in four European languages in two modalities (face and voice). The subjects were recorded in three different scenarios (controlled, degraded and adverse) over 12 different sessions spanning three months. In total, 208 people were captured: 104 men and 104 women.

4) CAS-PEAL
The CAS-PEAL face database provides large-scale Chinese face images [6]. Currently, it contains 99,594 images of 1040 individuals (595 males and 445 females) with varying Pose, Expression, Accessory, and Lighting (PEAL). Each subject is also asked to look up and down to capture 18 images in another two shots. The authors also considered 5 kinds of expressions, 6 kinds of accessories (3 glasses and 3 caps), and 15 lighting directions. The database is partly available to the public.

5) CMU Hyperspectral Face Database
The CMU hyperspectral database contains visible and near-infrared (NIR) images from 450 nm to 1100 nm, covering 65 spectral bands at a spatial resolution of 640x480 pixels [7]. Acquisition of the 65 images took an average of 8 seconds. Each of the 54 subjects was imaged under four illumination conditions (three lamps individually and then combined). Subjects were recorded between one and five times over a 6-week period.

6) CMU Pose, Illumination, and Expression (PIE) Database
The CMU PIE database contains over 40,000 facial images of 68 people [8]. Using the CMU 3D Room, the authors imaged each person across 13 different poses, under 43 different illumination conditions, and with 4 different expressions.

7) Equinox IR
The Equinox IR database consists of images from 340 individuals [9]. For each individual, images are simultaneously acquired in the visible, Short Wave InfraRed (SWIR), Mid Wave InfraRed (MWIR) and Long Wave InfraRed (LWIR) bands. For each of three illumination conditions, 40-frame contiguous sequences are taken while the subject recites a standard phrase and displays 3 different expressions.

8) FERET
The FERET database was collected in 15 sessions between August 1993 and July 1996 [10]. It contains 1564 sets of images, for a total of 14,126 images, covering 1199 individuals and 365 duplicate sets. A duplicate set is a second set of images of a person already in the database, usually taken on a different day. For some individuals, over two years elapsed between their first and last sittings, with some subjects photographed multiple times. This time lapse was important because it enabled researchers to study, for the first time, changes in a subject's appearance that occur over a year.

9) FRGC database
The FRGC database contains 3D face scans [11], with 4950 images of 466 different subjects. Each subject session consists of images taken under well-controlled conditions (i.e., uniform illumination, high resolution) and images taken under fairly uncontrolled ones (i.e., non-uniform illumination, poor quality). The database also contains various facial expressions (happiness, surprise). The subjects are 57% male and 43% female.

10) GTAV database
This database was created at the Universitat Politècnica de Catalunya (UPC) [12]. It emphasizes facial expression variations, pose variations, illumination variations and partially occluded faces. It includes a total of 44 persons, with 27 pictures per person corresponding to different pose views (0º, ±30º, ±45º, ±60º and ±90º) under three different illuminations (environment or natural light, a strong light source at an angle of 45º, and an almost frontal mid-strong light source). Furthermore, at least 10 additional frontal-view pictures are included, with different occlusions and facial expression variations.

11) Harvard RL
The Harvard Robotics Lab database consists of 10 subjects imaged under a wide range of illumination [13]. In each image, the subject held his head steady while being illuminated by a
dominant light source. The space of light source directions, which can be parameterized by spherical angles, was sampled in 15° increments. 75 images were recorded for each subject, each of 193 × 254 pixels.

12) KFDB database
The Korean Face Database (KFDB) contains images of 1000 subjects, taken with varying pose, illumination and facial expressions [14]. The subjects were imaged in the middle of an octagonal frame carrying seven cameras and eight lights (in two colors: fluorescent and incandescent) against a blue-screen background. Pose images were collected in three styles: natural (no glasses, no hair band to hold back hair from the forehead), hair band, and glasses. Separate frontal pose images were recorded with each light turned on individually, for both the fluorescent and incandescent lights. The subjects were also asked to display five facial expressions: neutral, happy, surprise, anger, and blink, which were recorded with two different colored lights, resulting in 10 images per subject.

13) MPI database
The MPI (Max Planck Institute) face database [15] contains images of 200 subjects (100 men and 100 women). Subjects were wearing bathing caps at the time of recording, which are later automatically removed. The faces are free of makeup, accessories, or facial hair. Part of the database is available to researchers.

14) ND HID
The Notre Dame HumanID database contains images of more than 300 subjects [16]. The images were recorded multiple times over a period of 13 weeks. A minimum of four high-resolution color images were obtained during each session under controlled conditions. In addition, images were recorded under two lighting conditions and with two facial expressions, resulting in more than 15,000 images.

facial expression, luminance, scale and viewing angle, and were shot at different times. Limited side movement and tilt of the head were tolerated.

19) University of Texas database
The University of Texas database [21] contains images of 284 subjects, including 76 males and 208 females. It contains both still images and video clips of faces and people. The still images and videos were taken at close range, under controlled lighting conditions, in an indoor laboratory environment. A duplicate session, comprising a full set of these still and video images taken at an average interval of 24 days, was included for testing purposes.

20) University of Oulu
The University of Oulu face database contains images of 125 subjects under 16 different camera calibration and illumination conditions, with an additional image if the person wears glasses [22]. The faces were in frontal position, captured under Horizon, Incandescent, Fluorescent and Daylight illuminants. The images also include 3 spectral reflectance measurements of the skin per person, taken from both cheeks and the forehead.

21) XM2VTS database
The Extended M2VTS (XM2VTS) multi-modal face database [23] includes still colour images, audio data, video sequences and a 3D model. It contains four recordings of 295 subjects taken over a period of four months. Each recording contains a speaking head shot and a rotating head shot.

22) Yale database A
The first Yale face database [24] contains 165 grayscale images in GIF format of 15 individuals (14 male, 1 female). There are 11 images per subject, one per facial expression or configuration: center-light, with glasses, happy, left-light, without glasses, normal, right-light, sad, sleepy, surprised, and wink.
3
B. Evaluation of Face Database
As detailed in the above section, several databases have been created in the past for testing face recognition algorithms. The advantages of these databases are:
- Most of them are publicly available for research purposes.
- The databases have varying numbers of subjects, so researchers can choose depending on the application.
- Some have images of both males and females.
- Databases such as CAS-PEAL, CMU PIE, FERET, GTAV, KFDB and Yale DB provide images taken at different angles.
- Some databases, including AR, CAS-PEAL, CMU PIE, Harvard RL, KFDB, Oulu and Yale, offer images taken under different lighting conditions.
- A few databases also offer variations in facial expression, such as neutral, smiling, sleepy and sad, among others.
- Databases such as AR, CAS-PEAL, CMU PIE, GTAV, ORL, UMIST and Yale provide images with occlusion due to caps, scarves, glasses and hands.
- Databases such as LFW (Labeled Faces in the Wild) allow algorithms to be tested in unconstrained environments.

The above databases are definitely an asset to researchers, since they support research in face recognition and help to test the robustness of such systems under variations in lighting conditions, pose, expression and occlusion. An algorithm may work well when tested on images captured in a controlled environment; however good the algorithm may be, its performance may vary when the images come from an unconstrained environment.

The database created for this research will therefore have:
- At least 10 images per person, as per the tests carried out (refer to section 4.2.3)
- A resolution of at least 200 by 200 pixels, in order to extract the features precisely
- Frontal images of the subjects without variation in pose and lighting
- Profile images of the subjects, to be able to extract the features of the ear

C. Ear Database
Several databases have been used in the literature for testing ear recognition systems. Examples include the University of Science and Technology Beijing (USTB) database and the University of Notre Dame (UND) database, which are publicly available for research purposes. Due to the lack of a standard set, several researchers have built proprietary ear datasets [27]; results cannot be compared if experiments are done on different datasets. Many researchers have used face databases containing profile images, such as FERET, in order to extract the ear image. Others have created their own databases for testing purposes.

An ear database allows ear detection and ear recognition algorithms to be evaluated. Criteria for choosing an ear database can include: number of subjects, number of images per subject, image resolution, pose variation, light variation and occlusion. The following paragraphs give details about commonly-used ear databases.

1) IIT Delhi ear database
The Indian Institute of Technology (IIT) Delhi ear image database [28] was used for the purpose of this research. It was acquired from 125 different subjects, and each subject has at least 3 ear images. All the subjects in the database are in the age group 14-58 years. The resolution of the ear images is 272 x 204 pixels.

2) IIT Kanpur ear database
The IIT Kanpur database is available in two subsets. The first subset contains 801 side-view images of 190 people, with 2 to 10 images per person. The second subset contains images of 89 subjects, with 9 images per person taken at 3 dissimilar postures; each pose was captured at three different scales.

3) UND databases
The University of Notre Dame provides several collections that can be used for ear recognition [29]:
- Collection E: 464 visible-light profile ear images from 114 human subjects;
- Collection F: 942 3D plus corresponding 2D profile ear images from 302 human subjects;
- Collection G: 738 3D plus corresponding 2D profile ear images from 235 human subjects;
- Collection J2: 1800 3D plus corresponding 2D profile ear images from 415 human subjects.

4) USTB databases
USTB DB1
The USTB DB1 database involves 60 subjects, whereby the right ear of each subject is photographed with a digital camera [30]. Three different images of each subject are taken: a normal frontal image, a frontal image with a trivial angle rotation, and an image under different lighting conditions. Each image has 256 gray levels.

USTB DB2
The USTB DB2 contains images of 77 volunteers [30]. The purpose of the database was to support research on ear recognition under illumination and angle variations; the images are accordingly taken under different lighting conditions and angles.

USTB DB3
The USTB DB3 contains images from 79 subjects [30]. The main purpose of this database was to support research concerning the steps of an ear recognition system, including ear detection, the robustness of recognition methods under depth variation,
ear recognition under partial occlusion, and multi-modal biometric recognition based on the fusion of information from ear and face. The database contains both face and ear images with varying pose and occlusion.

USTB DB4
This database contains images from 500 volunteers [30]. The images were captured in different poses; each pose is photographed by 17 CCD cameras simultaneously, with an interval of 15 degrees between the cameras. Integral images of face and ear are captured, and both grayscale and color images are available.

5) UBEAR
The UBEAR ear database has over 4430 images [31]. The images were captured under varying lighting conditions, and the subjects were captured in different poses, such as looking towards the camera, upwards or downwards. The subjects were not asked to remove hair, jewellery or headdresses during image capture, and the images have varying quality, to enable testing of the robustness of algorithms.

Table II summarises details of the commonly-used ear image databases in terms of the number of subjects, number of images, image resolution, pose variation, light variation and occlusion, as well as the number of face images corresponding to the ear images.

D. Evaluation of Ear databases
Some of the commonly-used databases are detailed above. These databases provide:
- Images of different resolutions
- Images captured under various pose and illumination conditions
- UBEAR, USTB DB4 and IIT Delhi also provide images with partial occlusion by hair and jewellery
- IIT Kanpur also provides the corresponding face images, two per subject, but this is not enough for testing

E. Rationale for New Face and Ear database
A new database has been created for this study, because the research requires:
- Images of more than 100 subjects
- Both male and female subjects
- Images taken at different poses and lighting conditions, to support further research
- Photographs taken by several cameras at the same time
- A controlled environment
- Both frontal and profile images of the same person
- A good resolution of at least 200 by 200 pixels

III. FRAMEWORK OF THE UOM MULTIMODAL DATABASE
A multi-modal database consists of different traits of a single person. Existing multi-modal databases are either not free or do not have enough images to obtain valid results, and some multi-modal databases involving face and ear are not made available to the public. Most researchers make use of the profile face image to obtain the ear, while others use chimeric databases [32].

According to the investigations done by Poh and Bengio [33], generating multiple chimeric databases neither degrades nor improves the performance of a fusion operator when tested on a real-user database, with respect to using only a real-user database. However, fusion models can only be evaluated on real multimodal biometric data. For the purpose of this research we have therefore created a multi-modal face and ear database, named the UOM Multimodal Database in this study. Database availability helps to validate a given algorithm, compare different algorithms and develop new algorithms [34].

Consideration has been given to the following factors:
(a) Number of subjects: at least 100 subjects will be taken in the first instance.
(b) Number of images per subject: the system will take 10 images per subject.
(c) Gender: both males and females will be considered.
(d) Image resolution: face image 252 × 288; ear image 80 × 150.
(e) Pose variations: 5 poses will be taken: -90°, -45°, 0°, 45° and 90° on the x-axis.
(f) Light variations: 3 lighting conditions will be considered.
(g) Occlusion: minor occlusion due to earrings and glasses is tolerated; however, subjects were asked to make sure the ear is not occluded.

A. Experiment to determine Number of Training and Testing images
An investigation was done to determine the number of training and test images required to obtain the maximum recognition rate. Details of the experiment are given below.

Aim of experiment: To determine how many test images and training images are required to obtain a stable recognition rate.

Dataset: Frontal face images from the UOM database.

Algorithm: Principal Components Analysis (PCA) is applied to recognise faces.

Methodology: The testing is done in two parts:

Experiment 1: First, the training set is kept constant and the test set is changed incrementally from 1 to 10 images per subject. Here the training set consists of 3 images per subject.
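The PCA-based recogniser used in these experiments can be sketched as follows. This is a minimal illustrative eigenface pipeline, not the exact code used in the study: it assumes each image has already been flattened into a vector, extracts principal components with an SVD, and labels a test image by its nearest training projection. The function names and the component count are ours.

```python
import numpy as np

def pca_fit(train_vecs, n_components):
    """Learn an eigenface subspace from flattened training images.

    train_vecs: (n_samples, n_pixels) array, one image per row.
    Returns the mean image and the top principal components."""
    mean = train_vecs.mean(axis=0)
    centered = train_vecs - mean
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(vecs, mean, components):
    """Project flattened images onto the eigenface subspace."""
    return (vecs - mean) @ components.T

def recognise(test_vecs, train_proj, train_labels, mean, components):
    """Label each test image with the label of its nearest training projection."""
    test_proj = project(test_vecs, mean, components)
    # Euclidean distance between every test and training projection.
    dists = np.linalg.norm(test_proj[:, None, :] - train_proj[None, :, :], axis=2)
    return train_labels[dists.argmin(axis=1)]
```

The recognition rate reported in each experiment would then simply be the fraction of test images whose predicted label matches the true subject.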
Experiment 2: Secondly, the training set is changed incrementally from 1 to 10 images per subject while the test set is kept unchanged. Here the test set consists of 3 images per subject.

Results: The results of the experiments are as follows:

1) Experiment 1 result
The recognition rate increases initially with more test images, as shown in Table III. A graph is drawn to show the trend, as in Fig. 1. As shown in Fig. 1, the recognition rate increases with an increase in the number of images per subject in the testing set, until a stable value is reached. Here the value 5 is considered to be the appropriate value.

TABLE III: RESULTS OF EXPERIMENT 1 (TRAINING SET KEPT UNCHANGED AT 3 IMAGES PER SUBJECT)

No. of test images (per subject)   Recognition rate (%)
 1                                 65
 2                                 70
 3                                 75
 4                                 80
 5                                 87
 6                                 90
 7                                 90.2
 8                                 90.3
 9                                 90.4
10                                 91

Fig. 1 Trend showing recognition rate versus test images

2) Experiment 2 result
As far as the training images are concerned, a similar trend is noticed, as shown in Table IV: performance increases with an increase in the number of training images per subject. As shown in Fig. 2, the performance of the system increases up to a stable value as the number of training images increases; here the performance is stable with 5 training images per subject.

As per the above experiments, a stable performance is obtained with 5 training images and 5 testing images per subject. Thus, for the UOM Multimodal database, a minimum of 10 images per subject is required for testing of a face or ear recognition system.

TABLE IV: RESULTS OF EXPERIMENT 2 (TEST SET KEPT UNCHANGED AT 3 IMAGES PER SUBJECT)

No. of train images (per subject)  Recognition rate (%)
 1                                 67
 2                                 71
 3                                 75
 4                                 79
 5                                 85
 6                                 87
 7                                 88
 8                                 89
 9                                 90
10                                 91

A graph is drawn to study the trend in recognition rate, as shown in Fig. 2.

B. Imaging framework
The database has been created at the University of Mauritius to support further research involving pose and light variance. However, the scope of this research is limited to frontal face and ear images. The equipment used includes:
(a) Five CCTV cameras to take videos of a subject simultaneously from 5 angles: -90°, -45°, 0°, 45° and 90° on the x-axis.
(b) Three light sources.
(c) A computer connected to the DVR system to view the videos.
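The "stable value" read off the trends in Tables III and IV above can be stated as an explicit rule: pick the smallest set size after which every additional image improves the recognition rate by less than some threshold. A small sketch applied to the rates from the two tables; the 3.5-percentage-point threshold is our illustrative assumption, chosen so that the rule reproduces the value of 5 adopted in this study.

```python
def stable_point(rates, threshold):
    """Smallest number of images per subject (1-indexed) after which every
    additional image improves the recognition rate by less than `threshold`
    percentage points."""
    for n in range(1, len(rates) + 1):
        later_gains = [rates[i] - rates[i - 1] for i in range(n, len(rates))]
        if all(g < threshold for g in later_gains):
            return n
    return len(rates)

# Recognition rates (%) for 1..10 images per subject, from the experiments:
table_iii = [65, 70, 75, 80, 87, 90, 90.2, 90.3, 90.4, 91]  # varying test images
table_iv = [67, 71, 75, 79, 85, 87, 88, 89, 90, 91]         # varying training images

stable_point(table_iii, 3.5)  # -> 5
stable_point(table_iv, 3.5)   # -> 5
```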
Fig. 3 shows the set-up of the cameras. The subject was required to look at the middle camera, and the video was taken for a few minutes. To get the images, corresponding frames at specific time intervals were extracted from the different videos taken, resulting in 5 images at any particular time. 10 images per camera were extracted within one second, to avoid variations in pose, giving a total of 50 images per subject in normal light. Also, images were captured under three lighting conditions, resulting in 30 additional images per subject. Above 100 subjects volunteered to provide their images. Since for some of the subjects there were problems of pose and occlusion, 100 subjects were selected, resulting in a total of 5000 images in the UOM Multimodal database.

Fig. 3 Setup of cameras

C. Sample Images
Sample face images and ear images from the UOM Multimodal database are shown in Fig. 4 and Fig. 5 respectively.

Fig. 4. Sample face images from UOM Multimodal database

Fig. 5. Sample ear images from UOM Multimodal database

Five images per subject are included in the test set and five in the training set. Images of 80 subjects were used for training, and images of 20 subjects were used for testing of imposters.

IV. CONCLUSION
This work provides details about the different factors that need to be considered when choosing a database for research. The databases commonly used for face recognition and ear recognition have been outlined and evaluated. The reasons for a new database have been given, and the framework for the new database has been explained. The UOM Multimodal database contains images with light variance and pose variance, to encourage research in pose and lighting for ear and face recognition. Sample images of all databases used in this research have been given.

References
[1] Bertillon, A., 1890. La Photographie Judiciaire, avec un Appendice sur la Classification et l'Identification Anthropometriques. Gauthier-Villars, Paris.
[2] Iannarelli, A., 1989. Ear Identification. Forensic Identification Series. Paramont Publishing Company, Fremont, California.
[3] Martinez, A.M. & Benavente, R., 1998. The AR Face Database. CVC Technical Report #24, June 1998.
[4] Weyrauch, B., Huang, J., Heisele, B. & Blanz, V., 2004. Component-based Face Recognition with 3D Morphable Models. First IEEE Workshop on Face Processing in Video, Washington, D.C.
[5] Bailly-Baillière, E., Bengio, S., Bimbot, F., Hamouz, M., Kittler, J., Mariéthoz, J., Matas, J., Messer, K., Porée, F. & Ruiz, B., 2003. The BANCA Database and Evaluation Protocol. In Proc. International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA03), pp. 625-638.
[6] Gao, W., Cao, B., Shan, S., Zhou, D., Zhang, X. & Zhao, D., 2004. The CAS-PEAL Large-Scale Chinese Face Database and Evaluation Protocols. Technical Report No. JDL_TR_04_FR_001, Joint Research & Development Laboratory, CAS.
[7] Denes, L.J., Metes, P. & Liu, Y., 2002. Hyperspectral Face Database. Technical Report CMU-RI-TR-02-25, Robotics Institute, Carnegie Mellon University, October 2002.
[8] Sim, T., Baker, S. & Bsat, M., 2002. The CMU Pose, Illumination, and Expression Database. In: IEEE International Conference on Automatic Face and Gesture Recognition, pp. 46-51.
[9] Socolinsky, D., Wolff, L., Neuheisel, J. & Eveland, C., 2001. Illumination Invariant Face Recognition Using Thermal Infrared Imagery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[10] Phillips, P.J., Moon, H., Rizvi, S.A. & Rauss, P.J., 2000. The FERET Evaluation Methodology for Face Recognition Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 10, pp. 1090-1104.
[11] Phillips, P.J., Flynn, P.J., Scruggs, T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Min, J. & Worek, W., 2005. Overview of the Face Recognition Grand Challenge. Proc. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, No. 1, pp. 947-954.
[12] Tarrés, F. & Rama, A., 2005. A Novel Method for Face Recognition under Partial Occlusion or Facial Expression Variations. 47th International Symposium ELMAR, pp. 163-166.
[13] Hallinan, P., 1995. A Deformable Model for Face Recognition under Arbitrary Lighting Conditions. PhD thesis, Harvard University.
[14] Hwang, B.W., Byun, H., Roh, M.C. & Lee, S.W., 2003. Performance Evaluation of Face Recognition Algorithms on the Asian Face Database, KFDB. In Audio- and Video-Based Biometric Person Authentication (AVBPA), pp. 557-565.
[15] Troje, N. & Bülthoff, H.H., 1996. Face recognition under varying poses: The role of texture and shape. Vision Research, Vol. 36, pp. 1761-1771.
[16] Flynn, P., Bowyer, K. & Phillips, P.J., 2003. Assessment of Time Dependency in Face Recognition: An Initial Study. In Audio- and Video-Based Biometric Person Authentication (AVBPA), pp. 44-51.
[17] Watson, C.I., 1994. NIST mugshot identification database. Available at http://www.nist.gov/srd/nistsd18.htm, retrieved on 15th May 2015.
[18] Samaria, F. & Harter, A., 1994. Parameterisation of a Stochastic Model for Human Face Identification. Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, Sarasota, FL, 5-7 December 1994, pp. 138-142.
[19] Graham, D.B. & Allinson, N., 1998. Characterizing Virtual Eigensignatures for General Purpose Face Recognition. In: Face Recognition: From Theory to Applications. NATO ASI Series F, Computer and Systems Sciences, Vol. 163, Springer, pp. 446-456.
[20] Spacek, L., 2007. Computer Vision Science Research Projects. Accessed on July 20, 2013, from the Essex University website: http://cswww.essex.ac.uk/mv/allfaces/faces94.html.
[21] O'Toole, A.J., Harms, J., Snow, S.L., Hurst, D.R., Pappas, M.R. & Abdi, H., 2005. A Video Database of Moving Faces and People. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 5, pp. 812-816.
[22] Martinkauppi, B., Soriano, M., Huovinen, S. & Laaksonen, M., 2002. Face video database. Proc. First European Conference on Colour in Graphics, Imaging and Vision (CGIV'2002), April 2-5, Poitiers, France, pp. 380-383.
[23] Messer, K., Matas, J., Kittler, J., Luettin, J. & Maitre, G., 1999. XM2VTSDB: The Extended M2VTS Database. In Second International Conference on Audio- and Video-based Biometric Person Authentication.
[24] Belhumeur, P.N., Hespanha, J.P. & Kriegman, D.J., 1997. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 711-720.
[25] Georghiades, A.S., Belhumeur, P.N. & Kriegman, D.J., 2001. From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, pp. 643-660.
[26] Huang, G.B., Ramesh, M., Berg, T. & Learned-Miller, E., 2007. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. University of Massachusetts, Amherst, Technical Report 07-49, October 2007.
[27] Abate, A.F., Nappi, M., Riccio, D. & Ricciardi, S., 2006. Ear Recognition by Means of a Rotation Invariant Descriptor. In: 18th International Conference on Pattern Recognition (ICPR 2006), Hong Kong, pp. 437-440.
[28] Kumar, A. & Wu, C., 2012. Automated human identification using ear imaging. Pattern Recognition, Vol. 45, No. 3, pp. 956-968.
[29] UND, 2014. CVRL Data Sets (University of Notre Dame UND database). Retrieved from http://www3.nd.edu/~cvrl/CVRL/Data_Sets.html.
[30] USTB, 2005. The Ear Recognition Laboratory at the University of Science and Technology Beijing, China.
[31] Raposo, R., Hoyle, E., Peixinho, A. & Proenca, H., 2011. UBEAR: A dataset of ear images captured on-the-move in uncontrolled conditions. In: Computational Intelligence in Biometrics and Identity Management (CIBIM), 2011 IEEE Workshop on, pp. 84-90.
[32] Ross, A. & Jain, A., 2003. Information Fusion in Biometrics. Pattern Recognition Letters, Vol. 24, No. 13, pp. 2115-2125.
[33] Poh, N. & Bengio, S., 2005. Can chimeric persons be used in multimodal biometric authentication experiments? In: MLMI'05, Second International Conference on Machine Learning for Multimodal Interaction, pp. 87-100.
[34] Fabregas, J. & Faundez-Zanuy, M., 2008. Biometric Face Recognition with Different Training and Testing Databases. Lecture Notes in Computer Science, Vol. 5042, pp. 44-55.
[Table (continued): partially recovered summary rows for the MIT, BANCA, CMU Hyperspectral, CMU PIE, MPI, University of Essex and University of Texas face databases, covering number of subjects, gender, image resolution, and pose, illumination and occlusion variations.]
Reviewing Current CRM Architectures and Introducing
a New Architecture Adapted to Big Data

Abstract—The business model, as known by the majority of specialists, has moved from product concentration to customer concentration, and electronic commerce and the world of technology in general have amplified that principle exponentially. In this era, many companies have begun to adopt Electronic Customer Relationship Management (E-CRM) rather than conventional CRM to better understand their customers. From the first definitions of CRM, technology has been among its three main axes. Today the world of data is also undergoing a change with the advent of the Big Data paradigm, and there is a need to rethink and reconsider the validity of existing architectures for managing the electronic customer relationship.

In this paper, we review the existing architectures and subsequently present an architecture with features capable of respecting and exploiting the new data, based on research works.

Keywords—CRM; Big Data; e-CRM; Architecture

I. INTRODUCTION
Customer Relationship Management (CRM) is a business strategy that integrates organizational culture, human resources, processes and technology to acquire and retain high-value customers.

With consumers made more demanding and critical of brands by their access to information, organizations can no longer do without a new reflection on the customer relationship. While the digital revolution and the social web renew the opportunities to enrich customer relationship strategies, pitfalls must be avoided so as not to lose the customer, who now does not hesitate to turn to the competition if he finds more advantages there.

Customer relationship management handles all aspects of customer interactions. It offers a view of the company's performance and employees, and brings productivity. After a good consolidation of all possible data sources within the enterprise, the commercial and marketing departments can profitably manage the servicing and targeting of customers and prospects:
- the marketing plan (contact management, tracking of leads, opportunities and campaigns);
- sales from the different channels, as well as the rest of the process.

In fact, it is necessary to consider the challenges of implementing CRM: commercial market studies have shown that approximately 70% of CRM projects result in either losses or no improvement in the performance of the company [1]. These negative results are well described at both the academic and the professional level [2]. We can add that many actors and startups have positioned themselves on the new niche of Big Data and CRM, but there is a gap between their positioning and the state of evolution of their systems, due to the rapid development, diversification and heterogeneity of the data available today. This calls for a review from the start of the development cycle of these solutions, and a review of the structure and architecture that gives a clear view of the modules and the possible interfaces between them.

It is in this sense that we started our research on managing the customer relationship, precisely the recommendation part. In this paper we study some existing architectures from the main market players and propose our own architecture, which includes a module with features that could add value to the final results of an electronic customer relationship management system. This paper is organized as follows: in the second section we present Big Data fundamentals and concepts; then we present some architectures; in section 4 we present our proposed architecture; and the paper is concluded in Section 5.

II. BIG DATA FUNDAMENTALS AND CONCEPTS
Big Data refers to more or less structured data sets that become so large that they are difficult to work with using conventional database management tools.

The associated definition is: "structured and unstructured data whose very large volume requires adapted analytical tools". The term "massive data" is also found, and has often been preferred in recent years.

The excitement around this phenomenon, which appeared a few years ago, generates a muddle of definitions.

In this part, we will present some architectures from the specialist industries in this segment [7] [11] [12], and we will finish with our proposal in section 4.

ORACLE
Abstract—Because of the complexity of Object-Oriented Programming (OOP), various applications have been implemented as learning supports. However, results show that novice learners still face problems in learning OOP. This paper reports empirical research aimed at understanding the problems faced by novice learners in grasping OOP concepts, and therefore at determining the requirements for a software tool which caters for the needs of students. As methodology, a survey was carried out to identify the learners' difficulties and the types of support they require while interacting with an OOP supporting tool. The tool also takes into consideration valid features of prior OOP teaching software tools. A particular feature built into the tool caters for e-assessment, which allows users to know their proficiency level; novice learners are thus able to monitor their progress and take relevant actions accordingly. This will encourage novice learners to engage and be motivated to learn OOP at their own pace.

Keywords—Graphical User Interface (GUI); Object-Oriented Programming (OOP); teaching; learning; software tools; engage; proficiency level; Integrated Development Environment (IDE)

I. INTRODUCTION
Today, there is much ongoing research regarding teaching and learning [1], seeking the most efficient ways to teach and learn. However, learning goes beyond classrooms: students should be able to study at their own pace and place [2]. Classroom technology has evolved from the horn-book (1650) to the iPad (2011) [3]. Despite the need for more programmers in the future, students are not willing to overcome the difficulties of programming. Object-oriented programming (OOP) remains difficult to teach [4, 5] and to learn [5]. Over the years, researchers have been trying to find the best strategy to teach OOP so that novices are motivated to learn it [6]. One of the strategies being used is software tools that facilitate teaching and learning.

Recent studies in both 2013 and 2014 confirmed that existing tools do not meet the expectations of novice learners [7, 8]. The Technical University of Kosice in Slovakia has been using existing tools for many years, and concluded that existing software tools still do not prepare novice learners for the job market: novice learners still find OOP difficult, as they have to switch from a novice tool to a real IDE [7]. In 2014 [8], an experiment was carried out with students at the University of Surrey in the United Kingdom. Although the students liked the simplicity of the tool being used, they still complained that they would prefer to see real Java code, so as to be more accustomed to industry coding, and would prefer a tool similar to a real-industry IDE where they can work with real code [8]. Existing OOP tools do not allow a progressive learning style: they are either too simple, without displaying real code, or too complex, displaying too much information. A compromise between the different features to be displayed to a user should be considered, and existing tools should take into account the learners' growth. This means they should allow a novice learner to progress to an advanced level, which will ultimately reduce the problem of switching to a real IDE [7].

It is thus imperative to implement relevant OOP tools which support novice learners. The aim of this paper is to understand the problems faced by novice learners in grasping OOP concepts and to propose an efficient software tool. The latter will contribute to both the teaching and the learning of OOP.

II. LITERATURE REVIEW
A. Evolution of classroom tools [3]
Different classroom tools and technologies have been used over the centuries to help teachers teach efficiently in their classes. In 1650, the introduction of the horn-book allowed students to follow lessons with printed wooden paddles. In 1850 the ferrule, a kind of pointer, was being used. The "magic lantern" in 1870 allowed teachers to project lectures onto glass slides so that students could follow the class in large groups. The chalkboard in 1890 was very beneficial for many educational institutions and was used for more than a hundred years. In the same year, the invention of a small chalkboard commonly known as "the school slate" allowed students to learn concepts more effectively. The pencil in 1900 was more widely used when introduced, compared to the school slate. The stereoscope in 1905 allowed classroom lectures to be more visually animated. The filmstrip projector in 1925 allowed students to view motion pictures; in that same year, the transmission of class lessons through the radio made a tremendous impact for two decades. The mimeograph (1940), the language lab headset (1950), the reading accelerator (1957), the Skinner teaching machine (1957), educational television (1958), liquid paper (1960), the filmstrip viewer (1965), the hand-held calculator (1972), the Scantron (1972), the PLATO computer (1980), the CD-ROM drive (1985), the hand-held graphing calculator (1985), the interactive whiteboard (1999), the iClicker (2005), the XO laptop (2006) and the iPad (2011) are tools which have contributed a lot to the progress of teaching and learning. Software programming tools are now run on iPads or laptops to teach programming; today, learners can eventually learn OOP through these devices.

B. Evolution of software OOP teaching tools
With the aim of supporting students in understanding the concept of programming, the LOGO concept [11] was conceived in 1967. It supported learning and formal thinking to solve problems by using micro-chips to build small robots; instructions were given to these micro-tools to motivate learners to grasp the concept of programming [11]. After LOGO, the most used software tool contributing to the teaching and learning of computer programming was "Karel the Robot", developed by Richard E. Pattis [12]. Xinogalos [13] made another version of Karel (objectKarel), which was used to introduce OOP to first-year university students. ObjectKarel made a positive impact on novice learners, who were able to overcome their difficulties in understanding OOP concepts and were more motivated. However, objectKarel was not used on the job market, as it was not a professional programming language [13].

Further research on programming educational tools produced tools to help teachers deliver quality programming classes. These tools have been built to motivate students so that they do not drop programming classes [14]. These software tools have been categorized into 5 categories: narrative tools, visual programming tools (static and dynamic visualization), flow-model tools, specialized output realizations and tiered language tools [15] [16]. Narrative tools allow the user to follow storytelling tutorials to learn programming; in the end, the user is able to build the story in the software. Visual programming tools can be either static or dynamic: static visual programming tools do not allow us to see the state and behavior of classes [17], whereas dynamic ones do.

BlueJ [22, 17], in 1996, allowed students to learn OOP in a more structured way. One of the strengths of BlueJ is the clear separation of the concepts of OOP classes and objects and the possibility of interacting with and analysing these objects and classes; another is its simple user interface, which makes it very suitable for novice learners. A research study [22] conducted in 2007 at the University of Piraeus in Greece favoured BlueJ for its simplicity, which allowed students to be more engaged in learning OOP. However, the research highlighted some disadvantages with its usage. One of them is the fact that BlueJ is a static visualizing tool, meaning it does not allow us to monitor the state and behavior of objects; this has limited the teaching and learning process of OOP.

Dynamic visualization software tools allow us to see the state and behavior of objects. Both JPIE [23] and Greenfoot [18] [24] are dynamic visualizing tools. Over the years, various experiments [25] have been carried out to test the effectiveness of Greenfoot in the teaching and learning of OOP. Greenfoot still has many limitations which prevent novices from self-learning; they desperately need assistance for complicated classes [25]. Other educational software tools have also brought out new features which contribute to easing the learning process of OOP. Jeeroo is a multi-language and narrative tool [26], while Scratch is not multi-language but is a narrative tool as well [27]. Jeeroo reduced the withdrawal rate of programming classes in the USA, and students maintain a high interest in learning OOP with its introduction [28, 32]. Baltie displays components on the screen as the user develops his proficiency skills (tiered language): for instance, for a novice user, fewer components will be displayed compared to an expert user. JHAVE displays pop-up questions and instructions to guide the user throughout his learning process; this feature makes it a specialized-output-realizations software tool [26]. Raptor is a flow-model
dynamic visual tools allow us to see the state and the behavior software tool. [26]
of the objects [18]. However, both static and visual tools allow
the user to drag and drop objects to learn OOP programming. III. ANALYSIS
Flow-model tools allow the user to connect program elements
together in such a way that the user can represent the order of A. Analysis of existing software tools
computation. Specialized output realizations provide
feedbacks in non-textual way which guides the user more Alice [20, 21] being a 3D tool is good for visual
appropriately. Finally, tiered tool language allows the user to programming. However, 3D is seen to be strainful for the eyes
work with only features he is allowed to, depending on his of the users. For a novice user, the first-time exposure should
proficiency skills. For instance, a novice user will have less be a 2d-image [29]. Upon the level of expertise of the user, the
features appearing on his screen compared to an advanced option 3-D can be made available.
user.
BlueJ being a simple software object-oriented teaching
Alice, an OOP educational software tool, was introduced tool still lacks both animated feedback (specialized output
in 1995 [19]. Alice is a 3D tool. The user is not required to realization features) and tiered-language features (displaying
understand 3D to use the software. The user will drag and components according to user expertise level). The same
drop objects to code visually and graphically. He does not see improvements for both JPIE and Greenfoot would be
the real java codes. [20, 21]. Alice aim was to bring virtual recommended as suggested for BlueJ. Baltie’s main strength is
reality for students to work in group [19]. It has tremendously that it allows users to learn concepts only at their proficiency
ease learning process of novices [8]. Teachers have also found level. Unfortunately, baltie is not free. We need to purchase
the tool to be effective for them. However, learners had the software to acquire its full version [30] [28]. JHave [26]
As mentioned in the literature review section, software tools have been categorized into five categories: narrative tools, visual programming tools (static and dynamic visualization), flow-model tools, specialized output realizations and tiered language tools [15]. According to figure 4, Alice, Scratch and Jeeroo are categorized as narrative and visual tools; moreover, Jeeroo is multi-language as well. BlueJ, Greenfoot and Raptor are visual and flow-model tools. JPIE is a visual tool. Baltie is a visual and tiered-language tool. Lastly, JHAVE is the specialized output realization tool, guiding users through their learning process. As can be seen, different functionalities are being researched, implemented and assessed in order to give as much support to learners as possible. No software seems to have given effective support to novice learners yet.

B. Methodology

A qualitative analysis was done in order to identify the problems that novice learners face in understanding OOP. An online survey was carried out; thirty-two surveyees aged between 18 and 28 participated. Some of them had already learnt OOP some years ago and some were beginners. The survey consisted of four sections: the profile of the surveyee, his motivation to learn OOP, his and others' difficulties faced in learning OOP, and the tools they have been using, or used while learning OOP for the first time.

In the profile section (first), the data collected allowed us to identify whether the respondent is male or female. Moreover, it gives the age and the highest qualification of the person. The variables in the profile section allowed us to know whether the person filling in the form is an expert or a beginner in OOP.

In the motivation section (second), the data collected allowed us to know whether the surveyee still wants to learn OOP or is discouraged by the complexity of OOP concepts.

The difficulties-faced-with-OOP section (third) allowed us to know how the surveyee finds OOP for novice learners. He was asked to rate the complexity of the following topics for a novice learner: inheritance, association and polymorphism.

The fourth section allows us to identify the software currently being used by most people. It was seen that most of the people who do not like OOP had been using advanced IDEs when they first started learning OOP; these tools were too complicated for them to work with and to learn OOP.

The approach taken was to enhance BlueJ because it has a simple user interface compared to the others. Furthermore, findings from both the survey and the background study concluded that class diagrams are a better choice for learning OOP. BlueJ consists of class diagrams which look simple to users. As BlueJ is the simplest tool among all the existing ones, it was easier to progressively add features to it. This will ultimately allow us to achieve our goal of displaying features according to the user's proficiency level: the newer the learner is to OOP concepts, the fewer features will be displayed to him; the more at ease the learner is with OOP concepts, the more features will be displayed to him.
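The tiered display just described (fewer features for a newcomer, more as proficiency grows) can be sketched as a simple level filter. This is a minimal illustrative sketch, not code from the proposed tool; the class name, feature names and level numbers are all invented for illustration:

```java
import java.util.List;

// Illustrative sketch of the tiered-feature idea: the UI asks this
// registry which features to show for the learner's current level.
public class FeatureRegistry {
    // A feature is visible at its own minimum level and above.
    public record Feature(String name, int minLevel) {}

    private static final List<Feature> FEATURES = List.of(
        new Feature("create class diagram", 1),
        new Feature("add methods and variables", 1),
        new Feature("inheritance (parent-child) links", 2),
        new Feature("association links", 2),
        new Feature("view generated source code", 3),
        new Feature("edit source code directly", 3));

    // Returns only the features a learner at 'level' is allowed to use.
    public static List<String> visibleFeatures(int level) {
        return FEATURES.stream()
                .filter(f -> f.minLevel() <= level)
                .map(Feature::name)
                .toList();
    }
}
```

A level-1 learner would see only the two diagram-editing features, while a level-3 learner would see all six, matching the progressive-disclosure goal stated above.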
Fig. 3. Rating for introducing polymorphism to novices

Fig. 5. Pie chart representing the percentage of preferences regarding the 8 features which are seen to be vital according to the literature review (distribution of data from Table II).

Fig. 6. Level 1 functionalities (features)
It was concluded that most of the surveyees wanted a simple user interface tool with tiered-feature possibilities, flow-model and specialized output realization features. 21% preferred a simple user interface for ease of interaction. 17% would like to be guided during their learning process by having feedback and assessments to test their level of proficiency, and to be able to move to more advanced OOP concepts progressively. The programming language taught should be used in the industry.

IV. SOFTWARE TOOL PROPOSED

As can be seen, each of the existing software tools lacks one or more particular features which are essential for a novice to learn object-oriented programming at his own pace. The solution proposed is to build an OOP educational software tool combining all the seven functionalities discussed earlier in this paper. Therefore, the software to be built should be a narrative, visual, flow-model, specialized output realization, tiered-language and multi-language tool with a simple user interface. The programming language taught should be used in the industry.

A. Description of the software proposed

Important OOP topics are inheritance, association and polymorphism. The application to be built will sub-divide these topics into 6 topics for the learning process of a user.

Novice learners will be able to create inherited (parent-child) class relationships between class diagrams and define the methods and variables of each class. Tutorials can be in the form of text and images; these tutorials should be part of the software to be built. Furthermore, the user should be given the choice between narrative-style tutorials and other types of tutorials. Users will be allowed to play around graphically with the class diagrams while following these tutorials. Upon creation/modification of each class diagram and its relationships with other classes, real programming code will be generated in the back-end of the software tool. The user will ultimately have an option to view and modify this code in real time. If the name, a method or a variable of a class is changed in the code, this change will be reflected on the class diagram. This is often known as static visualization. If a user wishes to move to another level of proficiency, he will have to click on the "change level" button as shown in figure 7. Upon clicking on the "change level" button, the learner will be asked to answer a set of questions, which allow the software application to assess the user. In this way, if the user fails the test, the application will not allow him to proceed to further topics.

The following figure illustrates one of the e-assessment questions used to test the user's knowledge before allowing him to go to the next level:

Fig. 7. E-assessment of OOP concepts by typing Java code in the textboxes and validating it using the "Next" button.
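The back-end code generation described above would emit ordinary Java source from the class diagrams. As an illustration only (the class names Animal, Dog and Owner are invented here, not taken from the paper), a diagram with one parent-child link and one association could map to code like this, which also exercises all three topics the tool covers:

```java
// Hypothetical output of the back-end generator for a small diagram:
// Dog inherits from Animal, and Owner holds an association to an Animal.
class Animal {
    protected String name;
    Animal(String name) { this.name = name; }
    String speak() { return name + " makes a sound"; } // overridden below
}

class Dog extends Animal {                             // inheritance (parent-child link)
    Dog(String name) { super(name); }
    @Override
    String speak() { return name + " barks"; }         // polymorphism: overriding
}

class Owner {
    private final Animal pet;                          // association link from the diagram
    Owner(Animal pet) { this.pet = pet; }
    String petSpeaks() { return pet.speak(); }         // dynamic dispatch picks Dog.speak()
}
```

Renaming `speak` or `pet` in this code would, in the proposed tool, be reflected back onto the class diagram, and vice versa.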
The application interacts with the user by providing him feedback. If the user successfully passes the tests, he will be allowed to proceed to level 2, where the user will be able to view the state and behaviors of each class dynamically. The following figure shows the options at level 3. The user is allowed to learn both inheritance (2nd button) and association (4th button):

progressively. It will be equipped with simple features such as a simple graphical user interface (GUI) adapted to user-level expertise, simple class diagrams, and pop-up messages providing feedback on the user's performance.

An OOP course should be set up to evaluate the software with novice learners. A survey should be carried out at the end of the course to receive feedback from these students.

The tool proposed in this paper caters only for OOP topics such as inheritance, association and polymorphism. Other OOP topics, such as encapsulation, can be included in the application. 3D modeling was not considered in this paper; future work may consider switching between 2D and 3D per user level to maximise learning growth.

Acknowledgment

We thank all the surveyees who spared some of their free time to answer the questions in the survey forms. Without them, it would have been very difficult to validate the gap of knowledge found in the literature review section.
Abstract—This paper presents the design and implementation of an automatic lecturer availability tracking system incorporating various software and hardware technologies. The system is aimed at reducing the amount of time that students, visitors and faculty members spend looking for specific employees, without being hindered by the ethical issues surrounding RFID-based staff tracking. It focuses on the availability of staff members within their allocated office space. The system is currently being tested and is performing well enough to provide clients and other staff members with real-time information regarding staff member availability and essential contact information.

Keywords—Staff tracking; Staff availability; Privacy

I. INTRODUCTION

There are many systems available for tracking staff members. A good percentage of them involve the personnel wearing RFID (radio-frequency identification) tags that are scanned as they move from one part of a building to another. This has brought about ethical issues regarding the privacy of the staff members being tracked [1]–[4]. The proposed system aims to bypass these privacy issues by focusing on the availability of the staff member in their physical office or lab space. It does not factor in the movement of staff members once they leave their offices.

II. OBJECTIVES

The system needs to track staff members unobtrusively in an attempt to provide other staff members and visitors with an easy means of obtaining information regarding staff members' whereabouts and availability. The system needs to be accessible via a web interface for easy, ubiquitous access. It also needs to operate in an automated fashion with as little input from the staff members themselves as possible. This will be achieved by installing sensors that monitor the doors of staff members. The sensors should be able to give data about the state of the door – either open or closed. This data should then enable the system to give information about the availability of the staff member concerned. A means of visually representing the availability of staff members needs to be designed, and the representation should conform to both mobile and desktop platforms. This visual representation should then be displayed in the lobby or waiting area of the office so that clients can easily identify the availability of the staff members they are looking for. Changes in staff member availability need to be reflected client-side in real time.

III. TECHNOLOGY DESCRIPTION

A. Software

Apache: serves as the web host for all the web pages used by the system.

PHP (Hypertext Preprocessor): used for the web component of the project. PHP was chosen because it is open source and has very good performance statistics compared to other web service engines [5].

MySQL: chosen as the database system because it is one of the most successful database systems used worldwide [6]. It also performs well compared to other database systems even when hosted on a limited-resource server [6], and integrates easily with a PHP-based project.

JavaScript: used for client-side scripting; it performs the visual changes with the help of PHP and AJAX.

AJAX (Asynchronous JavaScript and XML): chosen so that the web interface can be updated dynamically without having to refresh the entire page every time a change in staff availability is detected.

HTML (HyperText Markup Language): chosen for its simplicity and the fact that it is supported on a wide range of devices.

C#: chosen as the platform for a desktop application that allows staff members to make manual changes to their availability status when needed.

B. Hardware

Magnetic switches: these switches are inexpensive and provide clear binary data on the status of the door.
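The deployed system implements its logic in PHP, JavaScript and C#; purely as an illustrative sketch (written here in Java, with invented names), the way a binary door-sensor reading and a manual override from the desktop application could combine into one displayed status is:

```java
// Minimal sketch of availability derivation: a manual override set by the
// staff member wins; otherwise the magnetic switch's binary state decides.
public class AvailabilityStatus {
    public enum Status { AVAILABLE, UNAVAILABLE }

    // doorOpen: binary reading from the magnetic switch on the office door.
    // manualOverride: status set via the desktop application; null if unset.
    public static Status resolve(boolean doorOpen, Status manualOverride) {
        if (manualOverride != null) {
            return manualOverride;  // staff member keeps final control
        }
        return doorOpen ? Status.AVAILABLE : Status.UNAVAILABLE;
    }
}
```

Giving the manual override priority is what preserves the paper's privacy goal: the sensor only supplies a default, and the staff member always has the last word.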
Fig. 3. Desktop interface

VI. SYSTEM RESPONSIVENESS

A. What was assessed

Response time was measured in terms of how frequently the system can check for a change in the database and then successfully update the visual representation of that data on the client's side.

1) Variables: The JavaScript timer was set to run at different intervals for each test case.

2) Method used: A web page was set up that automatically made 20 changes to a staff member's status in five seconds. The view on the client's side was then observed and the number of changes that occurred was tabulated.

3) Results: From Fig. 5 it can be seen that changes are successfully reflected on the client's side within 150 milliseconds. This is fast enough to meet the system's objective of being real-time.

VII. CONCLUSION

The system successfully manages to provide the availability of staff members in an unobtrusive way. The staff members also have ultimate control over what availability status is selected, thereby eliminating the ethical problems associated with monitoring a user by means of RFID. The system is available via the web and is therefore easily accessible to anyone with a device capable of loading an HTML web page; for real-time feedback, a JavaScript-enabled browser is needed. A visual representation was designed in HTML that shows the availability of staff members along with their contact information. The HTML was designed to conform to the screen size of the device it is viewed on, and is therefore viewable on both mobile and desktop platforms.

VIII. REFERENCES

[1] S. Kurkovsky, E. Syta, and B. Casano, "Continuous RFID-enabled authentication: Privacy implications," IEEE Technology and Society Magazine, vol. 30, no. 3, pp. 34–41, 2011.
[2] G. Kaupins and R. Minch, "Legal and ethical implications of employee location monitoring," in Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05), 2005, pp. 133a–133a.
[3] V. Samaranayake and C. Gamage, "Employee perception towards electronic monitoring at work place and its impact on job satisfaction of software professionals in Sri Lanka," Telematics and Informatics, vol. 29, no. 2, pp. 233–244, 2012.
[4] M. Workman, "A field study of corporate employee monitoring: Attitudes, absenteeism, and the moderating influences of procedural justice perceptions," Information and Organization, vol. 19, no. 4, pp. 218–232, 2009.
[5] T. Suzumura, S. Trent, M. Tatsubori, A. Tozawa, and T. Onodera, "Performance comparison of web service engines in PHP, Java and C," in IEEE International Conference on Web Services (ICWS'08), 2008, pp. 385–392.
[6] M. Ahmed, M. M. Uddin, M. S. Azad, and S. Haseeb, "MySQL performance analysis on a limited resource server: Fedora vs. Ubuntu Linux," in Proceedings of the 2010 Spring Simulation Multiconference, 2010, p. 99.
Chel-Mari Spies
Department: Computer Systems Engineering
Tshwane University of Technology
Pretoria, South Africa
spiesc@tut.ac.za
Abstract—Advances in mobile health (mHealth) have been notable in the last few years, but so have the problems associated with the development, implementation and sustained use of mHealth systems. After implementation, various difficulties arise that contribute to a system being considered a failure. The purpose of the study is to create a model by which e-Health systems can be evaluated in order to ensure development for sustained use, with attention focused on aspects identified through the literature. The author examines problems identified in previous research to establish difficulties and shortfalls regarding the perceived success of said systems. Patient and healthcare practitioner points of view, along with software and hardware considerations, are taken into account. The investigation determines that the use of IT in mHealth is still dependent on serious factors influencing the realization of success of well-established as well as newly developed systems. These concerns undermine the effectiveness and usefulness of the improvement and efficiency of healthcare facilities and knowledge as a whole. The author proposes a model against which developed and developing mHealth systems can be measured in order to promote continued usage of implemented systems. This study sheds new light on little-recognized issues for bringing concept and practice together in a useful, uninterrupted, and unending manner.

Keywords—mHealth; system design; system evaluation

I. INTRODUCTION

There has been a steady increase in the use of IT in the health sector. Hospitals and healthcare practitioners have come a long way from simply entering patient data into a computer. The implementation of information technology takes on many forms – from taking simple measurements as implemented in electronic healthcare systems (eHealth), to more complex systems incorporating wireless technology in mobile health (mHealth), to wearable and implantable devices as described in personalised electronic healthcare and medicine (pHealth).

Through the involvement of healthcare practitioners and engineers alike, many systems have been designed and implemented with various degrees of success. In the instances where the continuous use of such systems failed, various reasons have been cited as to why this was the case. It becomes clear from the literature referred to throughout the paper that no single person, institution, or aspect can be blamed; instead, a collaboration of culprit issues has been identified.

In this study, the author aims to identify the problematic areas and propose a solution in the form of a model that can be used throughout the design and implementation phases to try to improve sustainability in terms of implementability and continued use.

A. Background

Thorough research into development and implementation has been done by various companies, institutions and/or individuals in order to develop useful systems which aim to improve service delivery, ensure ease of use for patients, and lighten the load of healthcare practitioners, but despite best efforts, most systems are still open to improvement. Studies show that problems regarding various human and technological aspects are still experienced, and the evidence shows that many flaws can be avoided by taking a few things under advisement during the design phase.

In this study, the difficulties are studied from various points of view: patient, healthcare practitioner, and technical (which has been divided into software and hardware).

Problems experienced by the patient are consumer oriented: users are concerned with comfort and ease of use. Healthcare practitioners have different needs, as they are focused on the information that can be collected from the system.

When considering the technical (or design) side, it becomes clear that a distinction must be made between the hardware and software aspects. The software must be accuracy and security driven, and the hardware scheme must consider user needs.

II. PROBLEMS FACING IMPLEMENTATION OF MHEALTH SYSTEMS

Bearing in mind previous research and case studies, it becomes evident that the following issues pose major concerns not only in the design of new tools, but also when sustained use is intended:
Security – Data in a health system must be protected from malicious attacks and disclosures, and patient information must be protected at all cost [29].

Sensing methods – Various sensing methods have been implemented in developed systems in the past. Examples include electrocardiogram (ECG) electrodes using textiles, global positioning system (GPS) modules, and temperature sensors that are optical-fiber based [30].

After consideration of the identified concerns, it becomes clear that there is a need for a single model that can be used to scrutinize the projected efficiency of a developed system, together with its accompanying tools.

This paper proposes that the evaluation problem be solved by means of identifying endorsed methods of standardization.

III. UNIVERSAL STANDARDS AS BASIS FOR THE PROPOSED MODEL

After considering the aspects described in the previous paragraphs, judgement methods and official engineering (and other) standards were identified as the basis for the proposed model.

Tables 1 to 4 show existing standards that serve as guidelines in the development of tools, systems and other related aspects in the development and implementation processes.

Standards from the following bodies have been included:
Clinical and Laboratory Standards Institute (CLSI)
"Institute of Electrical and Electronics Engineers Standards Association (IEEE)" [31]
"International Electrotechnical Commission (IEC) – International Electrotechnical Commission – Technical Report (IEC/TR)" [32]
"International Organization for Standardization (ISO) – International Organization for Standardization – Technical Report (ISO/TR)" [33]
"German Institute for Standardisation (DIN – "Deutsches Institut für Normung")" [34]

In the tables, the standardising body and standard number are given as the identification method of each standard. The description (printed in italic script) of each is the endorsed name of each standard.

A. Patient point-of-view

Aspects in this section cover acceptance by the patient, convenience for the patient, discretion when the patient is wearing or using the device, the degree of perceived usefulness, and wearability in terms of comfort for the patient. A summary is given in Table 1.

B. Practitioner point-of-view

Healthcare practitioners consider ease of use, usefulness, the perception of others, activity monitoring and the quality of the service that the system yields to be among the most important aspects, as shown in Table 2.
Table 2. Aspects identified from practitioner point-of-view

Aspect | Features | Standardising body, number & description
Acceptance | Ease-of-use | "DIN ISO 20282-1:2008 Ease of operation of everyday products - Part 1: Design requirements for context of use and user characteristics" [26]
Acceptance | Usefulness | "DIN ISO 20282-1:2008 Ease of operation of everyday products - Part 1: Design requirements for context of use and user characteristics" [26]
Acceptance | Others' perception | Survey
Acceptance | Personal initiative | Electronic monitoring
Activity monitoring | Stationary patients | "ISO/IEC 18092:2013 Information technology - Telecommunications and information exchange between systems - Near Field Communication - Interface and Protocol (NFCIP-1)" [35]
Activity monitoring | Roaming patients | "ISO/IEC TR 8802-1:2001 Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks specification requirements - Part 1: Overview of Local Area Network Standards" [36]
Quality of service | Intended use | "ISO 10004:2012 Quality management - Customer satisfaction - Guidelines for monitoring and measuring" [25]
Quality of service | Data reliability | "ISO 8000-130:2009 Data quality - Part 130: Master data: Exchange of characteristic data: Accuracy" [37]
Table 3. Aspects identified from hardware design point-of-view

Aspect | Features | Standardising body, number & description
Compatible architecture | Multi-vendor compatibility | "ISO/IEC 7498-1 Information technology - Open Systems Interconnection - Basic Reference Model" [38]
Electromagnetic compatibility | Electromagnetic susceptibility | "ISO/TR 21720:2007 Health informatics - Use of mobile wireless communication and computing technology in healthcare facilities - Recommendations for electromagnetic compatibility (management of unintentional electromagnetic interference) with medical devices" [39]
Mobility | Favourable | "ISO 9241-411:2012 Ergonomics of human-system interaction - Part 411: Evaluation methods for the design of physical input devices" [27]
Mobility | Unobtrusive | "ISO 9241-411:2012 Ergonomics of human-system interaction - Part 411: Evaluation methods for the design of physical input devices" [27]
Power consumption | Low power consumption | "IEC 62018 ED. 1.0 B:2003 Power consumption of information technology equipment. Measurement methods" [40]
Physical parameters | Weight | "ISO 9241-411:2012 Ergonomics of human-system interaction - Part 411: Evaluation methods for the design of physical input devices" [27]
Physical parameters | Size | "ISO 9241-411:2012 Ergonomics of human-system interaction - Part 411: Evaluation methods for the design of physical input devices" [27]
Local and metropolitan area networks (2.4%)
Medical devices (2.4%)
Patient and sample identification (2.4%)
Power consumption (2.4%)
Other areas (4.9%)

Figure 1. Fields of application in final model

As a point of interest, it might be highlighted that this study does not cover telemedicine and teleconsulting, or implantable medical devices as described by pHealth. These aspects might be included additionally in further development of the proposed model.

VI. ACKNOWLEDGMENT

Chel-Mari Spies is the principal researcher and author of this study. This work is partially supported in terms of resources by Tshwane University of Technology. The author declares no conflict of interest.

VII. REFERENCES

[1] R. Patel, W. Green, M. W. Shahzad, and C. Larkin, "Use of Mobile Clinical Decision Support Software by Junior Doctors at a UK Teaching Hospital: Identification and Evaluation of Barriers to Engagement," JMIR mHealth and uHealth, vol. 3, no. 3, p. e80, 2015.
[2] I. Kim, P.-H. Lai, R. Lobo, and B. J. Gluckman, "Challenges in wearable personal health monitoring systems," in Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE, 2014, pp. 5264–5267.
[3] M. Ji, Y. Wu, P. Chang, X. Yang, F. Yang, and S. Xu, “Development
and Usability Evaluation of the Mobile Delirium Assessment App
Table 5. Adherence standards for universal design – Final proposed model
Standardising body Standard number Aspects covered Point-of-View
CLSI GP33-A Vol. 30 No. 7 Patient identification Software
DIN ISO 20282-1:2008 Convenience Patient
IEC 61523 Processing capabilities Software
IEC 62018 ED. 1.0 B:2003 Power consumption Hardware
IEC 62366 Usefulness Patient
80001-1 ED. 1.0
IEC Interoperability Software
B:2010
IEC/TR 80001-2-3 ED. 1.0 EN:2012 Network management Software
IEEE 802.21-2008 Communication Software
IEEE STD 1175.3 – 2004 Functionality Software
Data management Practitioner
ISO 8000-130:2009
Quality of service Software
Discretion
Mobility Hardware
ISO 9241
Physical parameters Patient
Wearability
Acceptance Patient
ISO 10004:2008
Quality of service Practitioner
Processing capabilities
ISO 10303-1:1994 Software
Security
ISO 13606-1:2012 Data management Software
ISO 15489-1:2001 Data management Software
ISO/IEC 7498 Compatible architecture Hardware
ISO/IEC 9798-1:2010 Network management Software
ISO/IEC 15408-1:2009 Security Software
ISO/IEC 18092:2013 Activity monitoring Practitioner
ISO/IEC 19794-7:2007 Communication Software
ISO/IEC 25010:2011 Processing capabilities Software
ISO/IEC 27040:2015 Security Software
ISO/IEC TR 8802-1:2001 Activity monitoring Practitioner
ISO/IEEE 11073-20602:2010 Interoperability Software
ISO/TR 16982:2002 Convenience Patient
ISO/TR 21720:2007 Electromagnetic compatibility Hardware
Electronic monitoring --- Acceptance Practitioner
Survey --- Acceptance Practitioner
An Improved Image Steganography Scheme with
High Visual Image Quality
Sumit Laha, Rinita Roy
Department of Computer Science and Engineering
Future Institute of Engineering and Management
Kolkata, India
sumitlaha@icloud.com, rinita5493@gmail.com
Abstract—With the advent of the internet, secured transmission of data over networks poses a great challenge. This problem of secrecy of information has drawn a lot of attention due to its immense demand in real-life applications such as information security systems. The objective of this study is three-fold. In the first phase, we propose an embedding algorithm to hide a secret image within the cover image. In the second phase, the stego image quality is optimized using GA. Then, in the third phase, we present the procedure to extract the secret image from the optimized stego image with 100% data losslessness. The experimental results demonstrate that the proposed method significantly outperforms five state-of-the-art steganography techniques in terms of enhanced peak signal to noise ratio (PSNR) of the stego image.

Keywords—Image Steganography; Embedding algorithm; XOR; Genetic Algorithm; 8-connected neighbor; Peak Signal to Noise Ratio

I. INTRODUCTION

With the development of the internet, secured transmission of data over networks presents a great challenge to researchers and practitioners. This problem of secrecy of information has long drawn a lot of attention due to its immense demand in real-life applications such as information security systems and the rapid development of communication networks. There are two methods, namely cryptography and steganography, to keep data secret. Cryptography comprises two phases, encryption and decryption of data; however, the existence of the data cannot be hidden from a third party. On the other hand, steganography aims to keep the existence of the data (in the form of image, video, audio or text) secret within the cover image [1]. The steganography procedure [2] consists of three important elements: the data to be hidden (here, an image), the cover image, and the resulting stego image (after embedding the secret image into the cover file). Two steganography approaches appear frequently in the literature, namely spatial domain-based [3-6] and frequency domain-based [7, 8]. In the spatial domain approach, embedding of data is carried out using the Least Significant Bit (LSB) substitution method, whereas in the frequency domain, the data is embedded in the frequency coefficients of images.

This paper deals with the problem of embedding the secret image, optimizing the stego image to be as close as possible to the cover image without altering the embedded bits, and extracting the original secret image from the stego image. Although there have been several studies on the embedding of secret images, most of them deal with plain LSB substitution. But the LSB substitution approach suffers from certain drawbacks. For example, if a third party reads the LSB positions of the stego image, a secret image hidden using this technique can easily be intercepted. Moreover, this method completely replaces the original values in the LSB positions while substituting the secret image, which results in low visual quality of the stego image compared to the original cover image. Motivated by this, in this paper we propose an embedding algorithm to hide a secret image within the cover image. Then, the stego image quality is optimized using GA, and finally we present the procedure to extract the secret image from the optimized stego image. As a result, the secret image is recovered with 100% data losslessness, and the stego image retains high visual quality with respect to the cover image.

The remainder of the paper is organized as follows: Section II gives a brief literature review of previous solution methods; Section III presents the proposed embedding technique, the GA and the extraction procedure; Section IV provides the experimental results; and finally, conclusions are drawn in Section V.

Nomenclature
pc     cover image pixel
ps     stego image pixel
ip     initial population array
C      cover image represented as a 2D matrix
S      stego image represented as a 2D matrix
MAXI   maximum intensity of a pixel, i.e., 255
k      population size of the GA

II. LITERATURE REVIEW

Recently, there has been a growing number of studies on steganography methods, especially those based on soft computing methods such as GA and particle swarm optimization, in both the spatial and frequency domains. In the spatial domain, embedding of the message is done by the LSB substitution method, whereas in the frequency domain, the message is embedded in the frequency coefficients of images. In this section, we focus on a review of the previous studies on steganography methods.

Kanan and Nazeri [9] presented a novel steganography technique based on GA to obtain high embedding capacity and enhanced PSNR of the stego image. The high embedding capacity is achieved by modifying the secret bits and then finding the best place for embedding them into the cover image based on GA.

Roy and Laha [10] proposed a GA to optimize the stego image based on a fitness function using the 8-connected neighborhood of each pixel. They showed that the proposed scheme performs relatively better than some existing steganography techniques.

Li and Wang [11] presented a steganographic scheme to hide data in JPEG images securely, with high message capacity and good image quality, in two phases. In the first phase, a particle swarm optimization algorithm is applied to improve the stego image quality by deriving the optimal substitution matrix for transforming the secret images. In the next phase, the secret image is hidden in the cover image using a modified JPEG quantization table.

Tseng, Chan, Ho and Chu [12] proposed an effective steganography scheme to enhance the quality of the stego image by using an improved GA and an optimal pixel adjustment process. Their experimental results show that the mean square error of the stego image produced by the proposed scheme is much lower than those produced by the existing methods.

Wang, Lin and Lin [13] developed a steganography method based on GA to embed the secret message in the moderately significant bits of the cover image. They also applied a global substitution step and a local pixel adjustment process to improve the quality of the stego image. Later, in another research work, Wang, Lin and Lin [14] proposed optimal LSB substitution to embed the secret data in the host image, and applied a GA to the data hiding problem in the k LSBs of the host image.

Recently, there have been some noteworthy studies on the development of image sharing schemes with steganography and authentication ability to protect the integrity of stego images from dishonest participants [15-18].

Lin and Tsai [15] proposed a technique for secret image sharing by applying Shamir's method. Their scheme provides three levels of security protection, namely sharing the secret among the participants, data hiding by embedding, and authentication capability, resulting in high security and efficiency of the system.

Chang, Hsieh and Lin [16] developed a novel image sharing scheme considering both steganography and authentication based on the Chinese remainder theorem. The performance of this scheme is shown to be superior to the existing methods in respect of both high authentication ability and good stego image quality.

Wu and Kao [17] presented a secret image sharing scheme employing an optimal pixel adjustment process to obtain enhanced stego image quality under different payload capacity and authentication bit conditions, and showed its improved performance compared with recent steganography schemes.

Yang, Chen, Yu and Wang [18] proposed an improved image secret sharing scheme by hashing the four-pixel block, block ID and image ID to increase the authentication ability, by rearranging the bits in the four-pixel block to improve the stego image, and finally by using GF(2^8) to obtain a lossless version without requiring additional pixels.

III. PROPOSED WORK

The proposed work comprises three phases. The first phase deals with a new embedding algorithm to hide the secret image, the second phase addresses the proposed GA, which uses a heuristic approach to find superior solutions, and the third phase provides the extraction algorithm. These are presented in the subsequent sections.

A. Embedding of data

In this section, we propose a new embedding technique to hide the secret image. In the conventional LSB substitution method, the LSBs of the cover image are simply replaced with the bits of the secret image. But there is a flaw in terms of security: if an intruder scans the LSBs, he can easily intercept the secret image. Furthermore, this approach yields low PSNR values.

So a new approach is proposed where, instead of simply substituting the secret image bits, an XOR operation is performed between the cover image bits and the secret image bits. This is illustrated in Fig. 1. The proposed algorithm (considering 2 bits per pixel for embedding secret bits) is as follows:

Fig. 1. Embedding of secret image into cover image.

Input: Cover and secret grey scale images of sizes m×m and n×n, respectively, where n ≤ m/2
Output: Stego image of size m×m
Step 1: Read the cover and secret image files.
Step 2: Take 2 bits from the MSB side of the first pixel of the secret image, XOR them respectively with the 6th and 7th bits of the corresponding pixel of the cover image, and copy the results into the 6th and 7th bit positions of the corresponding pixel of the stego image.
Step 3: Repeat Step 2 for all the pixels of the secret image, taking 2 bits of the secret pixel at a time.
Step 4: Write the resulting stego image into the output file.
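The steps above, and the matching extraction (XOR is its own inverse, so XOR-ing a stego LSB pair with the corresponding cover LSB pair recovers the secret bits), can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names and flat-list pixel representation are ours, and we take the paper's "6th and 7th" bit positions to mean the two least significant bits of an 8-bit pixel (an assumption about the bit-numbering convention).

```python
def embed(cover, secret_bit_pairs):
    """XOR-embed 2-bit groups of the secret image into the two
    least significant bits of successive cover pixels."""
    stego = list(cover)
    for i, two_bits in enumerate(secret_bit_pairs):
        c = cover[i]
        # XOR the secret 2-bit group with the cover's two LSBs and
        # write the result into the stego pixel's two LSB positions.
        stego[i] = (c & 0b11111100) | ((c ^ two_bits) & 0b11)
    return stego

def extract(cover, stego, count):
    """Recover the 2-bit groups: (cover LSBs) XOR (stego LSBs)."""
    return [(cover[i] ^ stego[i]) & 0b11 for i in range(count)]

def pixel_to_pairs(px):
    """Split an 8-bit secret pixel into four 2-bit groups, MSB side first."""
    return [(px >> s) & 0b11 for s in (6, 4, 2, 0)]
```

Note that, unlike plain LSB substitution, an intruder reading only the stego LSBs sees the XOR of cover and secret bits; recovery requires the cover image.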
B. Optimization using GA

Genetic algorithms (GAs), originally developed by John Holland [19], are robust, structured, stochastic global search algorithms based on biologically inspired computational intelligence that have been applied to a wide range of problems in science, engineering, business and other fields [20-23]. These algorithms can successfully generate efficient solutions from large, complex and multi-modal search spaces. While implementing a GA for a particular problem, one has to choose the encoding of the solution in the search space, along with appropriate tuning of the GA parameters, to make effective use of the problem-specific information available. The important components of a GA are an initial population of candidate solutions (chromosomes), selection of a new population based on a suitable fitness function, and recombination.

Fig. 2. Heuristic initialization for a single pixel.

Similarly, in the random initialization phase, the remaining part of the initial population is generated randomly, using random permutation over the entire range of possible values (0-255), while retaining the embedding in the two LSBs unaltered. The solutions so generated are distinct from the solutions obtained heuristically in the initial population.

2) Fitness Function and Selection
In this GA, the fitness function based on the 8-connected PSNR [10] is obtained from the following equation:

mse'(k) = [ Σ_{i=x-1}^{x+1} Σ_{j=y-1}^{y+1} (C(i,j) - S(i,j))^2 - (C(x,y) - S(x,y))^2 + (C(x,y) - ip(k))^2 ] / 9    (1)
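Equation (1) can be read as: take the squared error over the 3×3 neighborhood of pixel (x, y), then swap the centre term computed with S(x, y) for one computed with the candidate value ip(k). A minimal sketch under that reading (the function name and nested-list image representation are ours, not the paper's):

```python
def mse_8connected(C, S, x, y, candidate):
    """Mean squared error over the 3x3 neighborhood of (x, y), with the
    centre stego pixel S[x][y] replaced by `candidate` (the GA individual
    ip(k)), following Equation (1)."""
    total = 0.0
    for i in range(x - 1, x + 2):
        for j in range(y - 1, y + 2):
            total += (C[i][j] - S[i][j]) ** 2
    # Swap the centre term: remove (C - S)^2, add (C - candidate)^2.
    total -= (C[x][y] - S[x][y]) ** 2
    total += (C[x][y] - candidate) ** 2
    return total / 9.0
```

A GA individual that lowers this local MSE raises the corresponding 8-connected PSNR, so fitter candidates are those closer to the cover pixel's neighborhood.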
A. Embedding Algorithm

To demonstrate the effectiveness of the new embedding technique, an experiment has been carried out. A 512×512 grayscale "Lena" image and a 256×256 "General test pattern" image are taken as the cover and secret images, respectively. Fig. 4 represents the histogram of the original "Lena" image, while Fig. 5 and Fig. 6 represent those of the stego images.

Fig. 7. Secret image – (a) General test pattern; Cover images – (b) Lena, (c) Jet, (d) Peppers, (e) Sailboat, (f) Baboon.

Fig. 4. Histogram of original Lena image.

MSE = (1/(mn)) Σ_{i=0}^{m-1} Σ_{j=0}^{n-1} (C(i,j) - S(i,j))^2    (3)

PSNR = 10 log10 (MAXI^2 / MSE)    (4)

The comparisons of the above methods are made on the basis of the PSNR values (from equations (3) and (4)) of the optimized images obtained by applying the proposed GA, with respect to the corresponding cover images. The GA parameters are summarized in Table I.

TABLE I. GA PARAMETERS
Parameter                Value
Number of generations    100
Population size          30
Crossover rate           0.9
Mutation rate            0.05

Table II compares the PSNR values of different grey scale images for the proposed method against Lin and Tsai's method [15], Yang, Chen, Yu and Wang's method [18], Chang, Hsieh and Lin's method [16], Wu and Kao's method [17] and Kanan and Nazeri's method [9].

TABLE II. COMPARISON OF PROPOSED METHOD WITH DIFFERENT STEGANOGRAPHY TECHNIQUES
Methods                         Lena    Jet     Peppers  Sailboat  Baboon  Avg.
Lin and Tsai [15]               39.2    39.25   39.17    39.16     39.18   39.19
Chang, Hsieh and Lin [16]       40.37   40.73   39.3     38.86     39.94   39.84
Yang, Chen, Yu and Wang [18]    41.6    41.66   41.56    41.51     41.55   41.58
Wu and Kao [17]                 43.54   43.53   43.56    43.55     43.54   43.54
Kanan and Nazeri [9]            45.12   45.18   45.13    45.1      45.12   45.13
Proposed method                 47.03   47.037  47.033   47.039    47.029  47.034

The results of Table II reveal that the proposed method yields an average PSNR value of 47.034, which is considerably higher than those produced by all the other algorithms.

V. CONCLUSIONS

In this paper, the problem of steganography is considered: optimizing the stego image to be close to the cover image without affecting the secret image embedded within it. We proposed an embedding algorithm to hide a secret image within the cover image, after which the stego image quality is optimized using GA. Finally, we presented the procedure to extract the secret image from the optimized stego image with 100% data losslessness. The proposed scheme optimizes some benchmark images taken from the USC-SIPI image database to obtain images with higher PSNRs, and these PSNRs are greater than those produced by the existing algorithms.

There are some issues important for future research consideration. First, an attempt can be made to develop a better embedding algorithm that produces high visual quality while keeping the secret message intact. Second, as an alternative to the existing soft computing algorithm, other algorithms such as the cuckoo search algorithm and the bacterial foraging optimization algorithm can be implemented to obtain accurate modeling of steganography. Finally, the proposed scheme can also be tried on color images.

ACKNOWLEDGMENT

We are grateful to Professor Amrita Khamrui for her helpful comments and suggestions in writing this paper.

REFERENCES
[1] A. Cheddad, J. Condell, K. Curran, and P. McKevitt, "Digital image steganography: Survey and analysis of current methods," Signal Processing, vol. 90, pp. 727-752, 2010.
[2] L.M. Marvel, C.G. Boncelet, and C.T. Retter, "Spread spectrum image steganography," IEEE Trans. Image Process., vol. 8, pp. 1075-1083, 1999.
[3] H. Sajedi and M. Jamzad, "BSS: Boosted steganography scheme with cover image processing," Expert Systems with Applications, vol. 37, pp. 7703-7710, 2010.
[4] W.-J. Chen, C.-C. Chang, and T. Le, "High payload steganography mechanism using hybrid edge detector," Expert Systems with Applications, vol. 37, pp. 3292-3301, 2010.
[5] A. Ioannidou, S.T. Halkidis, and G. Stephanides, "A novel technique for image steganography based on a high payload method and edge detection," Expert Systems with Applications, vol. 39, pp. 11517-11524, 2012.
[6] B.E. Carvajal-Gamez, F.J. Gallegos-Funes, and A.J. Rosales-Silva, "Color local complexity estimation based steganographic (CLCES) method," Expert Systems with Applications, vol. 40, pp. 1132-1142, 2013.
[7] W.-Y. Chen, "Color image steganography scheme using DFT, SPIHT codec, and modified differential phase-shift keying techniques," Applied Mathematics and Computation, vol. 196, pp. 40-54, 2008.
[8] R. Jafari, D. Ziou, and M.M. Rashidi, "Increasing image compression rate using steganography," Expert Systems with Applications, vol. 40, pp. 6918-6927, 2013.
[9] H.R. Kanan and B. Nazeri, "A novel image steganography scheme with high embedding capacity and tunable visual image quality based on a genetic algorithm," Expert Systems with Applications, vol. 41, pp. 6123-6130, 2014.
[10] R. Roy and S. Laha, "Optimization of stego image retaining secret information using genetic algorithm with 8-connected PSNR," Procedia Computer Science, vol. 60, pp. 468-477, 2015.
[11] X. Li and J. Wang, "A steganographic method based upon JPEG and particle swarm optimization algorithm," Information Sciences, vol. 177, pp. 3099-3109, 2007.
[12] L.-Y. Tseng, Y.-K. Chan, Y.-A. Ho, and Y.-P. Chu, "Image hiding with an improved genetic algorithm and an optimal pixel adjustment process," in Intelligent Systems Design and Applications (ISDA'08), Eighth International Conference on, IEEE, vol. 3, pp. 320-325, 2008.
[13] R.Z. Wang, C.F. Lin, and J.C. Lin, "Hiding data in images by optimal moderately significant-bit replacement," IEEE Electronics Letters, vol. 36, pp. 2069-2070, 2000.
[14] R.Z. Wang, C.F. Lin, and J.C. Lin, "Image hiding by optimal LSB substitution and genetic algorithm," Pattern Recognition, vol. 34, pp. 671-683, 2001.
[15] C.-C. Lin and W.-H. Tsai, "Secret image sharing with steganography and authentication," The Journal of Systems and Software, vol. 73, pp. 405-414, 2004.
[16] C.-C. Chang, Y.-P. Hsieh, and C.-H. Lin, "Sharing secrets in stego images with authentication," Pattern Recognition, vol. 41, pp. 3130-3137, 2008.
[17] C.-C. Wu, S.-J. Kao, and M.-S. Hwang, "A high quality image sharing with steganography and adaptive authentication scheme," The Journal of Systems and Software, vol. 84, pp. 2196-2207, 2011.
[18] C.-N. Yang, T.-S. Chen, K.H. Yu, and C.-C. Wang, "Improvements of image sharing with steganography and authentication," The Journal of Systems and Software, vol. 80, pp. 1070-1076, 2007.
[19] J. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications in Biology, Control, and Artificial Intelligence. Ann Arbor, MI: University of Michigan Press, 1975.
[20] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Boston, MA: Addison-Wesley Longman Publishing Co., Inc., 1989.
[21] M. Gen and R. Cheng, Genetic Algorithms and Engineering Design. NY: John Wiley & Sons, 1997.
[22] T.D. Gwiazda, Genetic Algorithms Reference. Crossover for Single-objective Numerical Optimization Problems (Vol. I). Berlin: Springer, 2006.
[23] T.D. Gwiazda, Genetic Algorithms Reference. Mutation Operator for Single-objective Numerical Optimization Problems (Vol. II). Lomianki, Poland: Tomasz Gwiazda, 2007.
A Quantum-inspired Cuckoo Search Algorithm for
the Travelling Salesman Problem
Sumit Laha
Department of Computer Science and Engineering
Future Institute of Engineering and Management
Kolkata, India
sumitlaha@icloud.com
Abstract—The quantum algorithm and the cuckoo search algorithm, as emerging novel evolutionary techniques, have recently drawn a lot of research interest due to their capability to search globally as well as locally by exploring the search space more efficiently in various applications of engineering and management. To the best of our knowledge, this paper is the first to consider the application of a quantum-inspired cuckoo search algorithm to the classic travelling salesman problem. In this paper, we present a quantum-embedded cuckoo search algorithm for the travelling salesman problem. To accelerate the search process for better solution quality, some neighborhood search based construction and stochastic heuristic approaches are utilized in the simulated annealing algorithm. The proposed method is tested on several benchmark problem instances taken from the TSP library in the literature. The computational results demonstrate that the proposed hybrid method is very competitive with the state-of-the-art procedures in the literature.

Keywords—Travelling salesman problem; quantum algorithm; cuckoo search algorithm; simulated annealing; quantum-inspired cuckoo search algorithm; combinatorial optimization

I. INTRODUCTION

The travelling salesman problem (TSP) is a classic example of an NP-hard combinatorial optimization problem. The objective of this problem is to obtain the shortest possible path such that the salesman visits each city exactly once and returns to the starting city. TSP problems are frequently encountered in various applications in industry and real-life situations such as transportation problems, assignment problems, scheduling, production systems, vehicle routing, computer systems, and image processing and pattern recognition. Since the TSP belongs to the class of NP-hard problems, there is great interest among researchers in developing efficient algorithms that find optimal or near-optimal solutions in reasonable computational time.

During the last three decades, there have been numerous studies on the development of different optimization techniques, such as exact methods, heuristics, and metaheuristics, to solve the TSP. Exact algorithms include the branch and bound method and linear programming; however, these algorithms are feasible only for solving small-sized TSPs, whereas for large-sized problems the computational times they require are excessively high. Heuristics are based on problem-specific information and produce near-optimal solutions, especially for large-sized problems, in reasonable computational times due to their polynomial time complexities. On the other hand, metaheuristics are robust, stochastic adaptive search optimization techniques; they perform better than heuristics in generating optimal or near-optimal solutions for large-sized TSP instances, however at the cost of additional computational time.

Noteworthy metaheuristics for solving the TSP include simulated annealing [1], neural networks [2], particle swarm optimization [3], ant colony optimization [4], genetic algorithms [5], self-organizing maps [6], Lagrangian relaxation [7], and the elastic net [8].

Apart from the above metaheuristics, quantum computing has recently drawn much attention as a new research field, with wide applications in combinatorial optimization. There is a growing interest in merging quantum computing with evolutionary algorithms, and quantum-inspired evolutionary algorithms have developed into promising alternative global optimization techniques due to their capability to search globally as well as locally by exploring the search space more efficiently in various applications of engineering and management. Since the late 1990s, a variety of quantum-inspired evolutionary algorithms, such as the genetic-quantum algorithm [9], the quantum-inspired gravitational search algorithm [10], the quantum differential evolution algorithm [11] and the quantum-inspired cuckoo search algorithm (QCSA) [12], have been successfully utilized to solve different engineering and combinatorial optimization problems.

Although there exists some research on the application of the QCSA to knapsack problems [12], the application of the QCSA to the TSP has not been reported in the literature. However, there is some literature on other quantum-inspired genetic algorithms for solving TSP problems [13].

This paper offers a novel evolutionary algorithm, the QCSA, for the travelling salesman problem. The important feature of the cuckoo search algorithm [14, 15] is its global random walk governed by Lévy flights, rather than a standard isotropic random walk, to converge towards the global optimal solution. To improve the solution further, some neighborhood search based construction heuristic approaches are utilized in the simulated annealing (SA) algorithm.
0.93 0.28 0.84 1.00 0.73 0.61
0.88 0.53 0.67 0.73 0.61 0.31          (6)
0.47 0.85 0.74 0.69 0.80 0.95

The binary representation of the path from (6) can be
obtained following Equation (4) as:

010 111 100 000 110 001                (7)

The decimal equivalent of the path given in Equation (7) is
[2-7-4-0-6-1], which becomes [2-1-4-0-0-1] after replacing the
values greater than 5 (taking 7 (mod 6) = 1 and 6 (mod 6) = 0),
and finally [2-1-4-0-3-5] (since the path now contains two
repeated cities, they are replaced by the two missing cities,
3 and 5).

Step 7: Apply the quantum intra-bit mutation operation by
randomly selecting a qubit, with mutation probability (pm),
from the path obtained in Step 6, and interchange the
probability amplitudes α and β of that qubit.
Step 8: Apply the quantum measurement operation to obtain the
feasible path and the corresponding path distance.
Step 9: Apply the quantum interference operator by using
appropriate quantum gates U(θ) at each position in the path.
Among the different types of gates, the rotation gate U(θ) is
the one most frequently used by researchers [12, 16]. In this
paper, we use rotation gates for solving the TSP. The rotation
gate is given as:

U(θ) = | cos θ   −sin θ |
       | sin θ    cos θ |              (9)

where θ is the rotation angle. The value of the rotation angle
should be chosen carefully to avoid getting stuck in a local
optimum. A big value of θ can lead to premature convergence to
a local optimum, whereas a small value of θ increases the
convergence time [12]. The lookup table giving the direction of
θ is presented in Table I. In this study, θ is taken to be the
same as used by Layeb [12], who applied it to a similar problem,
bin packing.

TABLE I. LOOKUP TABLE OF THE ROTATION ANGLE θ

αij    βij    Reference binary value    Direction of θ
>0     >0     1                         +θ
>0     >0     0                         −θ
>0     <0     1                         −θ
>0     <0     0                         +θ
<0     >0     1                         −θ
<0     >0     0                         +θ
<0     <0     1                         +θ
<0     <0     0                         −θ

Step 10: Replace some paths (50%) among the N paths at random
by the path generated in Step 9, if it is better.
Step 11: Delete a fraction (pa) of the inferior paths and
replace them following Step 6.
Step 12: Keep the best path and rank the paths.
Step 13: Set k = k + 1. If k ≤ maxcount, return to Step 5;
otherwise, stop.

B. Proposed SA heuristic

SA [17] is a stochastic neighborhood search technique based on
the ideas of the physical annealing process drawn from the
principles of statistical mechanics. It has been found effective
for many combinatorial optimization problems. It begins with an
initial solution, an initial high temperature, and a number of
iterations. A neighborhood solution is generated from the
initial solution, which is considered as the current solution.
If the neighborhood solution is better than the current
solution, it is accepted and becomes the current solution.
However, if the neighborhood solution is inferior to the current
solution, it is still accepted, with a probability based on
e^(−ΔC/T), and updated as the current solution, thereby
enhancing the possibility of escaping a local optimum during the
global search. At the high temperature of the beginning, there
is a high probability of moving to a worse solution. As the
iterations increase, this probability is reduced and the
temperature is gradually lowered following a particular
annealing schedule, until the stopping criterion is reached.
The proposed SA-based heuristic is presented as follows.

Step 1: Initialization: obtain the initial solution (X) with
path distance C(X) and consider it the current solution. Set
the initial temperature (T1); set the iteration number k = 1.
Step 2: Set k = k + 1.
Step 3: Generate a new neighborhood solution Y from X
(Section III D).
Step 4: If C(Y) ≤ C(X), set X = Y.
Step 5: If C(Y) > C(X), compute ΔC = C(Y) − C(X), update the
temperature from Tk to Tk+1 following the annealing schedule,
and set X = Y with probability e^(−ΔC/Tk).
Step 6: If k < max_iteration, return to Step 2; otherwise go to
Step 7.
Step 7: Output the current best solution as the final solution.

C. Initial solution

The initial solution for the SA is taken as the output solution
of the QCSA (Section III A).

D. Neighborhood solutions

In order to improve the neighborhood solutions of the proposed
SA, two construction heuristics are proposed. Applying these
two heuristics to a current n-city path, a total of 2(n−1)
paths are generated, and the best among them is taken as the
updated current path. Next, in order to explore the solution
space more efficiently at random and converge to the global
solution instead of getting stuck in a local optimum, we use
four random neighborhood search techniques on the updated
current path obtained from the construction heuristics. The
details of these techniques are presented below.

1) Heuristic 1

Consider an example of a 5-city TSP and assume the current path
is 3-5-1-4-2. Pick the first city from the path and append it to
each of the remaining cities in the path to generate four
partial paths: 5-3, 1-3, 4-3 and 2-3. Next, select the partial
path with the least distance (say, 4-3). Then, to generate the
new path, insert this partial path 4-3 at the beginning and
append the remaining cities from the previous path (here,
3-5-1-4-2) sequentially. Thus the new path becomes 4-3-5-1-2.
Proceeding in a similar manner, repeat these steps until a total
of four new paths are generated, and take the best path among
them as the current path. A detailed illustration of Heuristic 1
is given in Fig. 1.

Fig. 1. Illustration of Heuristic 1

2) Heuristic 2

In order to explain Heuristic 2, let us again take an example of
a 5-city TSP and assume the current path is 3-5-1-4-2. To
generate the first path, the same procedure as in Heuristic 1 is
applied and the new path becomes 4-3-5-1-2. Now the partial path
4-3 in this path is locked, and its terminal cities (4 and 3)
are considered for further processing. Applying a similar
procedure to both cities individually, new sets of partial paths
(5-4, 1-4, 2-4; 5-3, 1-3, 2-3) are generated. Next, the partial
path with the least distance is selected, and it is appended
before or after the previously locked partial path, with the
remaining cities from the previous path added sequentially.
Proceeding in a similar manner, repeat these steps until a total
of four new paths are generated, and take the best path among
them as the current path. A detailed illustration of Heuristic 2
is given in Fig. 2.

A. Experimental Framework

In order to measure the performance of the proposed method
against other existing algorithms, two metrics are usually used
for each problem instance in the TSP literature: the percentage
error of the average solution obtained by an algorithm with
respect to the optimal solution, PEavg (%), and the percentage
error of the best solution obtained by an algorithm with respect
to the optimal solution, PEbest (%). These performance measures
for an algorithm on a particular problem instance are defined as
follows:

PEavg = (average solution by an algorithm − optimal solution)
        / optimal solution × 100                          (10)

PEbest = (best solution by an algorithm − optimal solution)
        / optimal solution × 100                          (11)
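The SA procedure of Steps 1-7 can be sketched as follows. This is a minimal illustration, assuming a simple two-city swap as the neighborhood move and a geometric cooling schedule; the paper instead generates neighbors with the construction heuristics of Section III D, and the function and parameter names (t0, cooling, max_iter) are ours, not the paper's.

```python
import math
import random

def path_length(path, dist):
    """Total length of a closed tour; dist is a 2-D distance matrix."""
    return sum(dist[path[i]][path[(i + 1) % len(path)]] for i in range(len(path)))

def simulated_annealing(path, dist, t0=100.0, cooling=0.95, max_iter=2000):
    x, cx = list(path), path_length(path, dist)
    best, cbest = list(x), cx
    t = t0
    for _ in range(max_iter):
        # Neighborhood move (illustrative): swap two randomly chosen cities.
        i, j = random.sample(range(len(x)), 2)
        y = list(x)
        y[i], y[j] = y[j], y[i]
        cy = path_length(y, dist)
        # Steps 4-5: always accept improvements; accept a worse tour
        # with probability exp(-(C(Y) - C(X)) / Tk).
        if cy <= cx or random.random() < math.exp(-(cy - cx) / t):
            x, cx = y, cy
        if cx < cbest:
            best, cbest = list(x), cx
        t *= cooling  # geometric annealing schedule
    return best, cbest
```

Because the best tour found so far is retained, the returned cost can never be worse than the initial tour's cost.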
Abstract—Cloud computing is one of the most active research
areas today, and research in it is advancing at a very fast
pace. Computing based on the cloud is expanding its umbrella day
by day, taking almost every computing activity under its shade,
whether related to resources, tasks or data storage. This
growing interest from users has made it quite challenging for
providers to meet consumer demands.

Task scheduling plays an important role in the performance of
any cloudlet scheduling algorithm. Observed closely, task
scheduling in the real world deals with dependent tasks, i.e.,
tasks that form a workflow. A number of scheduling algorithms
have been proposed for independent tasks, but very few operate
on dependent tasks. In this paper we propose a new scheduling
algorithm for workflows with dependencies among tasks, taking
into consideration the important parameters of transfer time and
bandwidth along with the basic requirements of optimizing
execution time and cost. The simulation is carried out using the
CloudSim toolkit. Our algorithm provides better results than
other existing algorithms such as PSO and CSO and is more
closely related to real-world scenarios.

Keywords—Cloud computing; Workflow; Task scheduling;
CloudSim; Virtual Machine

… better performance of any computing scheme. Scheduling of
tasks means defining the order in which tasks are to be executed
on virtual machines. Cloud computing does provide several
special characteristics to workflows:

1. Resources are available in the form of standardized
services and can be availed as per the user's choice.

2. The number and type of resources allocated to a
workflow is decided as per the customer's requirement.

3. The number of resources allocated to a workflow can
be changed dynamically at runtime; hence it is rightly
said to be elastically scalable.

Broadly, workflows can be categorized into two types:
a) business workflows and b) scientific workflows. A scientific
workflow is typically represented by a Directed Acyclic Graph
(DAG). The tasks are thus dependent in nature, and this
dependency needs to be taken special care of while scheduling.
The allocation of tasks and resources at runtime, and the
dynamic mapping of resources to meet performance expectations,
must be handled with great care.
Fig. 2. Flowchart of EWSA (the two branches compute either
Total execution time = processing time + transfer time and
Cost of execution = task length × cost of VM + transfer cost,
or, when no transfer is needed, Total execution time =
processing time and Cost of execution = task length × cost of
VM, and then combine the results)

The EWSA algorithm is compared with the Cat Swarm Optimization
(CSO) technique [12]. The results obtained by running the
simulation clearly show that the EWSA algorithm provides a more
optimized solution for scheduling workflow tasks in a more
realistic way. The experiment is performed for different sets of
workflows, as mentioned in Table III and Table IV. Tabular as
well as graphical comparisons between the two algorithms are
shown in terms of execution time and cost of execution.
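The two branches of the EWSA flowchart can be written directly as functions. This is a hedged sketch, with parameter names (processing_time, task_length, vm_cost, etc.) invented for illustration, since the paper gives only the formulas:

```python
def execution_time(processing_time, transfer_time=0.0):
    """Total execution time of a task; transfer_time is nonzero only
    when the task's input data must be moved to the VM it runs on."""
    return processing_time + transfer_time

def execution_cost(task_length, vm_cost, transfer_cost=0.0):
    """Cost of running a task: task length times the VM's unit cost,
    plus any data-transfer cost."""
    return task_length * vm_cost + transfer_cost
```

For a task with no data transfer, both formulas reduce to the second branch of the flowchart (processing time only, task length × VM cost only).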
TABLE III. EXECUTION TIME COMPARISON

Number of tasks in workflow    CSO algorithm    EWSA algorithm
20                              98.7135          76.5634
30                             283.8921         214.5672
40                             729.4284         576.3049
50                             799.6712         652.8310
60                             819.3914         734.2735

… workflow. For each number of tasks, we executed the algorithm
20 times and took the mean value of those 20 executions as the
final value for each respective number of tasks. Fig. 3 and
Fig. 4 show the distribution of workload for different numbers
of tasks with respect to execution time and execution cost,
respectively, which differs each time depending upon the file
size, the output file size, and the bandwidth of the VM on which
a task is executed. But the results are improved each time as
compared to the CSO algorithm.
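As a quick check on Table III, the relative improvement of EWSA over CSO can be computed from the tabulated execution times (the values below are copied from the table; the improvement function itself is ours, not the paper's):

```python
# Execution times from Table III: {number of tasks: (CSO, EWSA)}.
results = {
    20: (98.7135, 76.5634),
    30: (283.8921, 214.5672),
    40: (729.4284, 576.3049),
    50: (799.6712, 652.8310),
    60: (819.3914, 734.2735),
}

def improvement_pct(cso, ewsa):
    """Percentage reduction in execution time achieved by EWSA."""
    return (cso - ewsa) / cso * 100

for n, (cso, ewsa) in results.items():
    print(n, round(improvement_pct(cso, ewsa), 1))
```

The reductions range from roughly 10% (60 tasks) to about 24% (30 tasks).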
Abstract—Cloud computing is fast gaining popularity in
educational institutions of developing countries like Nigeria.
Software as a Service, Platform as a Service and Infrastructure
as a Service are the three key models through which cloud
computing services are delivered to end-users. A number of
studies have been conducted to identify the enabling factors as
well as the issues faced in the adoption of cloud computing in
the Nigerian context. In this study, however, a Strength,
Weakness, Opportunity and Threat analysis of the service
delivery models in the Nigerian education landscape is
presented. In addition, the issues that an educational
institution needs to consider when adopting cloud computing are
discussed.

Keywords—cloud computing; SWOT analysis; suitability;
Nigerian education; service delivery models

I. INTRODUCTION

Cloud computing has some characteristics that distinguish it
from other technologies [1] [2]. They include the following:
users do not necessarily have to own the information technology
(IT) resources they utilize. For instance, the servers they
interact with might be hosted in data centers at locations
remote from them. Also, services are provided on demand to the
end-users, and the end-users pay only for what is used.

Cloud services can be delivered as software (SaaS), platform
(PaaS), or infrastructure (IaaS) [1]. SaaS is a model in which
application software is delivered via the Internet [3]. PaaS is
a model where the service providers supply services to the
users, such as development environments and server platforms,
through which the users can develop custom applications. In
IaaS, computer infrastructure such as servers and storage
devices is delivered remotely through the Internet.

Cloud computing has moved from just being a topic of interest
and debate to one that is being adopted and applied to various
aspects of the economy in sub-Saharan Africa, most prominently
the enterprise [4] and, in recent times, the education
landscape. A recent study has identified the need for Nigerian
educational institutions to have a plan of action for the
adoption of cloud services [5]. The need thus arises for an
analysis of the applicability of the various cloud service
offerings within the Nigerian educational institution context.
This is the main motivation behind this study. Also, the various
challenges that may be faced are outlined, as well as possible
solutions. In addition, the issues the institutions need to
consider when making the move to the cloud are also discussed.
Hence, the rest of this paper is structured as follows: Section
2 reviews related works. In Section 3, a Strengths, Weaknesses,
Opportunities and Threats (SWOT) analysis is performed to
determine the applicability of the various cloud service
offerings in Nigerian educational institutions. Section 4
discusses the issues to be considered by institutions intending
to adopt any of the service delivery models. Section 5 concludes
the paper.

II. RELATED WORKS

In [6] an empirical study was conducted to determine the
potential for the adoption of grid computing in tertiary
institutions of Nigeria. Although grid computing is not exactly
cloud computing, they share a lot in common, especially in terms
of vision, architecture and technology [7]. The study revealed a
significant lack of awareness about the benefits of grid
computing, particularly in tertiary institutions, which largely
explains its low adoption, and suggested investing in awareness
initiatives and workshops as well as the acquisition of grid
resources to facilitate adoption.

In [8] an empirical study was also conducted to assess the
affordances of selected cloud computing tools for language
teacher education in Nigeria. The study revealed that
participants were able to perceive the opportunities inherent in
the use of cloud computing for classroom learning as well as the
unintended affordances. The study, however, was focused on
language teacher educators in Colleges of Education, and the
scope of tools studied comprised mainly SaaS tools such as
DropBox and Google Drive.

In [9] a critical analysis of the benefits and challenges of the
adoption and usage of cloud computing in Nigeria was performed.
The study presented the relationship between key stakeholders in
the Nigerian cloud ecosystem and proposed methods for optimizing
the benefits of cloud computing while reducing the inherent
adoption challenges. That study, however, focused on businesses
and corporate organizations as the consumers of cloud technology
and not educational institutions. It also deemphasizes the cloud
service delivery channels (IaaS, PaaS, and SaaS), arguing that
the rise of the cloud ecosystem would render them irrelevant.
Therefore, the challenges identified in that study were for the
cloud ecosystem and not for the service delivery models.
Abstract—Transportation is an issue of concern in big cities of
many developing countries today. Due to the large population in
these cities, there is constant traffic congestion and
pollution. As a result, taxi services are common. In Nigeria,
companies offering these services have discovered that they can
better serve the large population by providing their services
through the mobile platform. Given the widespread adoption of
smartphones in these regions, we designed, developed and
deployed an Android-based application for one of the taxi
service companies, called Red Cab. The application makes it
easier for Red Cab to cater for its current customers and also
reach out to new ones.

Keywords—Android, Mobile application, Taxi booking,
transportation

I. INTRODUCTION

Transportation plays a vital role in the day-to-day activities
of society. In most communities, a large fraction of the working
population commutes to work daily [1]. Commuting may not only be
for business purposes but also for relaxation, shopping and
other social activities. Of all the means of transportation,
land transport, comprising the use of both private and
commercial vehicles, is common, especially in the developing
countries of Africa [2]. However, a key advantage of commercial
transportation over owning a personal vehicle is that it is less
expensive, which matters considering the high poverty rate in
this region [3].

Taxi services are becoming prominent, especially in big cities
of sub-Saharan Africa [4] [5]. Prior to this time, the companies
offering these services have not been able to reach out to as
many people as they would have desired, especially in remote
locations [6], [7], [8]. However, given the widespread adoption
of mobile devices (particularly smartphones) in this region,
there is a pressing need to reach out to more customers and also
better cater for existing ones [9], [10]. Some of the companies
based in Nigeria, the nation with the largest black population
in sub-Saharan Africa, have already deployed mobile applications
to better serve their customers, but Red Cab Taxi is yet to
successfully deploy a mobile app to support its services.
Instead the company relies on its Web-based taxi booking system.
In order to maintain a competitive edge over its competitors,
Red Cab Taxi has recognized the need for a mobile version of its
taxi booking service. The aim of this paper therefore is to
discuss the development of an Android-based mobile app for the
Red Cab Taxi Company. The rest of this paper is thus structured
as follows: In Section 2 a review of existing systems is
conducted. In Section 3, we focus on the design of the proposed
system using Unified Modeling Language (UML) diagrams. In
Section 4, we discuss the implementation of the mobile
application, and we compare it with the existing systems in
Section 5. Section 6 concludes the paper.

II. REVIEW OF EXISTING SYSTEMS

Easy Taxi was founded in Brazil and has since been expanding.
Bankole Cardoso introduced the company to Nigeria in July 2013
[11]. Its main goal is to create efficiencies in the Nigerian
transportation network by changing the perception of Nigerians
about taxis. With Easy Taxi, users can call a taxi anytime and
anywhere, and the app automatically searches for the taxi
closest to the vicinity of the customer. As soon as a taxi is
selected, a live map of the taxi's movement is shown on the
screen of the customer's mobile device. The drawback of Easy
Taxi, however, is that because it is new, Nigerians are still
rather skeptical about its ability to always deliver; hence it
is not yet widely used.

Afro Cab is another Nigerian-based taxi-hailing app, which
emerged in 2012 [12]. The company is currently servicing Lagos
and Abuja, with plans to expand its reach to more major cities
in the country. The company has over 600 registered drivers in
Abuja and Lagos. The application works by allowing users to
input their location, after which a menu of registered
locations/streets is shown. If the location keyed in is not
known, an alternative pickup location is shown. The type of car
the user wants to ride in is listed, and the user is also able
to specify the range of the amount that s/he is willing to pay
before tapping the Get Cab button. The app sends the request to
a number of drivers within the vicinity. A driver can either
accept or reject the request, and the feedback is relayed to the
customer in real time. Once a driver accepts the offer, a live
map view of the taxi's movement is shown on the screen of the
user's mobile device. A drawback of this application is that it
fails to recognize some locations, and so customers requiring
the service in those areas may be denied it.

Uber Taxi, on the other hand, can be described as a taxi service
for the elite. It portrays its users as successful individuals
who have style, class and elegance [13]. The Uber smartphone
application that is available on all mobile platforms …
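The "closest taxi" matching described for Easy Taxi amounts to a nearest-neighbor search over driver positions. A generic sketch using the haversine great-circle distance follows; the coordinates and the selection rule are illustrative assumptions, not Easy Taxi's actual algorithm:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def closest_taxi(customer, taxis):
    """Return the taxi (id, lat, lon) nearest to the customer's position."""
    return min(taxis, key=lambda t: haversine_km(customer[0], customer[1], t[1], t[2]))
```

A production system would typically restrict the search to drivers in nearby geographic cells before computing exact distances, but the selection criterion is the same.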
C. Order History
This module shows, in ascending order, the various orders
previously placed. The user can also delete an entry from the
order history by pressing down for three seconds on the order
that is to be deleted.
The module is depicted in Fig. 5.
A. Registration Module
This is the first interactive screen that is displayed after the
splash screen if the user is downloading the application for the
first time. It provides fields that capture the name (surname),
the mobile phone number, and the password for logging in to the
application. This is depicted in Fig. 3.
D. Taxi Locator
This module is integrated with Google Maps showing the
location of the assigned driver on the map. It is used to monitor
the position of the driver as shown in Fig. 6.
TABLE II. A COMPARATIVE ANALYSIS BETWEEN THE RED CAB APP AND
OTHER EXISTING APPS
Privacy and security policies to protect data need to be
formulated, as well as a risk management process that
prioritizes securing information systems to ensure enforcement
of the policies. A lack of skilled personnel also poses a risk
to the privacy and security of data. Data security is based on a
systematic assessment of the threats and risks of breach or
loss, which should be considered when developing a system's
design requirements. Risk assessment results should inform
security controls, and risk analysis as an ongoing process can
help …

B. Hyper Text Transfer Protocol Secure (HTTPS)

HTTPS is a protocol, developed by Netscape, that uses the Secure
Sockets Layer (SSL) or Transport Layer Security (TLS) when
accessing a website server. HTTPS encrypts and decrypts end-user
web communications using 128-, 192- or 256-bit keys for
encryption and decryption [28]. Data packets captured during
transmission are not recognizable [29].

The ODK data encryption capabilities and the ability of HTTPS to
securely transmit data from the study facilities to the
aggregate server informed the design and setup of our mHealth
data collection system.
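As an aside on how such a setup looks in practice, the client side of an HTTPS connection can be configured, for example in Python, with a default SSL context. This is a generic illustration of TLS certificate verification, not the actual ODK Aggregate configuration:

```python
import ssl

# A default client-side context: certificate verification and hostname
# checking are enabled by default, so a socket wrapped with this
# context both encrypts traffic with the negotiated cipher (e.g.
# AES-128/256) and authenticates the server it talks to.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Disabling either verification setting would still encrypt the traffic but would leave the client open to man-in-the-middle attacks, which is why the defaults are left on.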
III. METHODOLOGY
Case-based surveillance systems are not standard in sub-
Saharan African, and most surveillance data come from facility
records or periodic surveys, such as Demographic & Health
Surveys (DHS) and AIDS Indicator Surveys. However, each of
these has significant limitations. Facility records are often
unreliable or poorly kept and are not generalizable beyond the
facility-level. Surveys provide higher quality, national-level
data using sound methodology, but they only occur once every
5 years. Thus, it is difficult for a Ministry of Health (MoH) to
use these data to assess the effectiveness of changes in health
system policies and practices.

Fig. 1. HIV Case-Based Surveillance System
Abstract—The enormous applications and future necessities of the
image processing area open new paths for researchers. The
analysis and recognition of numerous documents in the form of
images is a challenging task. Classification is one of the vital
phases in image processing, and the methods of classification
must possess consistency and accuracy. This research analyzes
different classification techniques, and a performance analysis
is carried out by testing the image classification principles.
In this research paper a few classification techniques for the
Devanagari script are considered and evaluated in MATLAB R2014a.

Keywords—OCR; Classification; Nearest neighbor; Neural network;
SFSA; MLP

I. INTRODUCTION

In the current scenario, interdisciplinary research is growing
very rapidly; it brings two or more areas together to do the
research. One such interdisciplinary study area is machine
learning, which gives a common platform for scholars from the
fields of computer science, artificial intelligence, biology,
medicine, economics, statistics, mathematics, etc. to work
together. Optical Character Recognition (OCR) is the task of
automated computer identification of characters in electronic
or scanned images to generate a machine-editable version. The
accuracy rate of an OCR system varies and is dependent on the
input document. For a good-quality image obtained from a clean
document, the result is typically an accurate transcription of
the text blocks of the input. When a noisy image is taken from a
document which is damaged, unclear or otherwise in poor
condition, the resulting quality is likely to be degraded, with
errors. In such scenarios, the image itself is a more realistic
representation of the original than the OCR outcome. There are
numerous OCR systems available commercially on the market [1].
These OCRs are extensively used to convert documents into
electronic archives [2] with the aim of automating record
keeping in offices [3] or publishing the text on a Web page [4].
Because of OCR it is possible to revise the text, search for a
text/phrase [5] [6], store text more efficiently and apply
methods like text-to-speech conversion [7], text mining [8],
machine translation [9], etc. Consequently, OCR is a demanding,
contemporary pattern recognition field, predominantly within
document image analysis. Various applications of OCR are in
natural language processing, data compression, health care,
banks, post offices, etc.

A number of researchers have enthusiastically participated in
the emergence of OCR systems [10]-[13], and investigation of
document image pattern analysis has been vigorous for 30 years
[14]. OCR work is continuously going on for several Indian
scripts [15]-[17]. The Devanagari script is used for Marathi,
Hindi, Nepali, Sanskrit, and various other languages, which are
used by more than 450 million people around the world. Other
Indian languages like Punjabi, Bengali, and Gujarati follow
scripts analogous to Devanagari. Indian scripts, including
Devanagari, evolved from the ancient Brahmi script after diverse
modifications. The Devanagari script has 13 vowels as well as 34
consonants with 14 vowel modifiers. Compound characters are made
by joining two or more core characters, and the script does not
distinguish upper- and lowercase characters. Researchers have
contributed their efforts to Devanagari script recognition.
Bhattacharya and Chaudhuri [35] used a different multilayer
perceptron classifier at every stage of their recognition
scheme. Govindaraju [46] considered gradient features for
character feature extraction. A method for the recognition and
segmentation of joined machine-printed Devanagari text, relying
on fuzzy multifactorial analysis, is explained by Garain and
Chaudhuri [47]. Recognition of words is done by devising a
stochastic finite state automaton, which considers classifier
scores and character frequencies [26]. Sinha [48] explained how
the spatial association among the constituent characters of the
Devanagari script plays a crucial role in word recognition. For
the recognition of Devanagari characters, classifiers like the
support vector machine have been reported by Jayadevan et al.
[15]. Currently the significance of bilingual [18]-[21] and
multilingual OCR is also rising [22]. The papers published
before 2000 are covered in the survey carried out by Pal and
Chaudhuri [17]. Because of the exploration, application and
expansion of machine learning algorithms, newer and newer
classification techniques keep emerging from research. Numerous
classification methods have been developed, belonging to
categories like neural network based, rule based, fuzzy based,
perceptron based, statistical, symbolic, logical, etc. This
research paper is an attempt to present a review of the OCR work
on Devanagari script classification available from the year 2000
onwards. The review focuses on efforts associated with the
identification of printed characters and numerals, and it
provides …
II. CLASSIFICATION

OCR systems comprehensively use the techniques of pattern
recognition, which assign an unidentified sample to an
identified class. The extracted features are provided as input
to the classification method. The output of the feature
extraction phase in image processing is a feature vector, which
is given as input to the classification phase and classified by
means of supervised or unsupervised techniques. Here the data
set is divided into a training set and a test set for each
character. Bags of keypoints taken out by feature extraction are
employed for the recognition task. Numerous approaches [15] are
employed to classify the character structures, such as the
k-Nearest Neighbor approach, neural networks, SVM classifiers,
etc. The phases of document image processing, including
classification, are shown in Fig. 1.

Fig. 1. Phases of Document Image Processing (Feature Extraction:
statistical/structural etc.; Classification: nearest neighbor,
neural network, SFSA, MLP etc.)

III. CLASSIFIERS CONSIDERED

The automation of Devanagari script recognition began in the
early 1970s. Numerous classifiers have been used for the
Devanagari text identification work. In the training period of
such approaches, patterns are given as input and the scheme
adjusts itself to reduce the misclassification errors on these
patterns. The trained schemes are then employed to classify
unknown test patterns. One prominent example of such a technique
is the neural network, which revises the weight associations
from the training configurations for the purpose of improved
classification. The classifiers considered here are as follows.

3.1 Schema-dependent minimum distance classifier

Here the nearness of an incoming pattern to the patterns of the
probable pattern classes gives a measure for deciding the
pattern class of the pattern under consideration. As the
classification depends on a minimum distance computation, this
technique is known as the minimum distance classification
technique. For classification, varied information sources are
used. The schema states the classifier's hierarchical structure
and affords the relative significance and role of the varied
information sources in the hierarchy. The benefit of this
methodology is that the classifier structure can be effortlessly
customized by changing the schema [23] [24].

3.2 Nearest neighbor

Nearest neighbor is an instance-based learning method that
classifies objects on the basis of the closest training examples
in the feature space. k-NN is very easy to compute: for every
data point, the k neighboring training data points with known
class values and nearest distance are chosen, and the class
which occurs most frequently in that neighborhood is allocated
to the new data point, based on a chosen distance metric. k-NN
classifiers are extremely easy to train, classifying the input
sample's features by a majority vote of its k nearest neighbors
[25]-[27]. k-NN has been utilized with concavity, structural,
and gradient features [27]. One type of nearest neighbor
classifier is the Minimum Distance Classifier [28], which is
used for the evaluation of priority matching of character
portions and contour sets.

3.3 Neural network

The neural network is a computing design that encompasses a
parallel interconnection of adaptive neural processors. Because
of this parallel nature, it can accomplish calculations at a
greater rate than traditional procedures. Due to its adaptive
nature, it learns how to accommodate changes in the data and
discover the characteristics of the input signal. The output of
a node is passed to other nodes in the network, and the final
result depends on the complex interactions of all nodes. Neural
networks have excellent learning and generalization skills.
These skills are crucial for dealing with fuzziness in input
arrays: they perform tolerably in the presence of noise and
partial data, and they can learn from examples [26], [29]-[31].
For character recognition, one of the most popular classifiers
is the multilayer perceptron [32]-[41].
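The k-NN decision rule described above (a majority vote among the k nearest training samples under a distance metric) can be sketched as follows; the Euclidean metric and the toy data in the test are illustrative choices, not the features used in the reviewed systems:

```python
import math
from collections import Counter

def knn_classify(sample, training_data, k=3):
    """Classify `sample` by majority vote among its k nearest
    neighbours; training_data is a list of (feature_vector, label)."""
    def dist(a, b):
        # Euclidean distance between two feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    neighbours = sorted(training_data, key=lambda item: dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

Since there is no training step beyond storing the samples, all the cost is paid at classification time, which is why practical systems pair k-NN with compact feature vectors.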
Abstract—Human readable symbols are extracted from a trained neural network using rule extraction techniques. In this paper the internal representation of a feed forward neural network is augmented by a distance term to produce fewer rules. This paper presents an efficient method to extract fewer rules from a multilayer feed forward neural network. The proposed method calculates the distance between activation values of hidden units for given input values and moves them depending on the calculated distance value. The method yields fewer rules on three publicly available data sets without compromising classification accuracy.

I. INTRODUCTION

The application of Artificial Neural Network (ANN) technology, now extended to fields as diverse as commerce, science, industry and medicine, offers clear testament to the capability of the ANN paradigm. Three salient characteristics of ANNs make this success possible. The first is the ability to learn dynamically by adjusting the network's weights during training, i.e. dynamically acquiring knowledge about a given problem domain through a training phase. The second is the ability to store knowledge in numerical form, which makes it compact. The third is robustness: the ability to provide solutions for noisy input data.

In addition to these characteristics, one of the most important advantages of trained artificial neural networks is the high degree of accuracy when generalizing a solution [2], extracted from sample data of a problem domain, to previously unseen examples from that domain. The effectiveness of artificial neural networks as a tool to aid human decision making in a variety of business applications and other industries is increasing widely. Classification and pattern recognition are among the important applications of ANNs [8], [9]. Text classification uses several tools from Information Retrieval (IR) and Machine Learning [13]. Similarly, gait-based human activity recognition can also be done using ANNs [14]. To detect and diagnose mental disorders, researchers used a neural network pattern recognition tool for the classification task and designed a network [15].

Though efficient rule extraction algorithms make it possible for the decision process of a trained network to be expressed as classification rules, the concepts learned by neural networks are difficult to understand because they are represented using large assemblages of real valued parameters. If these rules are more comprehensible to a human user, they lead to better, quicker and more accurate classification decisions. The Greedy Rule Generation algorithm produces concise and accurate rules [6].

Fig. 1. Fully connected feed forward neural network.

Figure 1 shows a network consisting of simple interconnected units, or neurons, where each unit receives input from other units and sends its output to others. The neurons of a layer receive the inputs, which are weighted, biased and passed through a transfer/activation function, usually a step threshold or a sigmoid function; the output value is then sent to the neurons of the next layer, for which it serves as input. A neuron of one layer is connected to the neurons of the next layer via weighted connections. The simplified process for training a Feed Forward Neural Network (FFNN) is: (1) present the input to the network; (2) propagate through the network until it reaches the output layer, producing a predicted output; (3) calculate the error value for the network, i.e. the difference between the predicted output and the actual output; (4) adjust the weights of the connections to minimize the error value. Since a network's internal layers and the weights at its internal nodes are a set of floating point values, their behavior is not human readable. Hence ANNs are also known
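The four training steps listed above can be sketched for a tiny one-hidden-layer network. This is a minimal illustration: the layer sizes (2-3-1), the learning rate of 0.5, the single training example, and the plain gradient-descent update are all assumptions for the sketch, not the paper's Levenberg-Marquardt setup:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = 0.5 * rng.normal(size=(3, 2)), np.zeros(3)   # input (2) -> hidden (3)
W2, b2 = 0.5 * rng.normal(size=(1, 3)), np.zeros(1)   # hidden (3) -> output (1)
x, target = np.array([0.5, -0.2]), np.array([1.0])

for _ in range(200):
    # Steps (1)-(2): present the input and propagate forward to the output layer.
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # Step (3): error = predicted output - actual output.
    err = y - target
    # Step (4): adjust the weights down the error gradient (backpropagation).
    delta2 = err * y * (1.0 - y)
    delta1 = (W2.T @ delta2) * h * (1.0 - h)
    W2 -= 0.5 * np.outer(delta2, h)
    b2 -= 0.5 * delta2
    W1 -= 0.5 * np.outer(delta1, x)
    b1 -= 0.5 * delta1

print(float(y[0]))  # the prediction approaches the target of 1.0
```

After the loop, the trained knowledge lives entirely in the floating point entries of W1, b1, W2, b2, which is exactly why, as the text notes, the network's behavior is not human readable.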
V. RULE EXTRACTION

The three layered feed forward network is trained using the Levenberg-Marquardt training algorithm. After training, the hidden unit activation values are clustered using the Chi2 algorithm [12]. It clusters these hidden unit activation values into the intervals [0, b1), [b1, b2), [b2, b3), ..., [bk, 1] such that each activation value falls in an interval li, where li represents [bi−1, bi) [12]. The output is then explained in terms of the clustered hidden unit activation values using the CART algorithm. Rules are extracted in the form: If (Hj1 = l1 and Hj2 = l2) then class = class1, which means that if the j1-th hidden unit's activation value is in interval l1 and the j2-th hidden unit's activation value is in interval l2, then the pattern is classified as class1. Since, after introducing the distance term, the hidden unit activation values are moved closer together, the values fall into a smaller number of different intervals li, and hence fewer rules result. In the next step, rules relating input values and hidden unit values can be generated directly for continuous inputs as follows: if the j-th hidden unit activation is in an interval li, i.e.

If aHj falls in li then wj1x1 + wj2x2 + ... + wjnxn falls in li.

Since the extracted hidden-output rules are stated in terms of the intervals into which a hidden unit value falls, the input-output rules can be extracted by combining them with the above input and hidden value rule.

Fig. 3. Pattern of hidden unit activation throughout the space for the ILPD problem after training without the Euclidean distance term.

Because the activation values are scattered throughout the space, more rules are required to explain the hidden activation-output relationship.

Figure 4 shows the pattern of hidden unit activation values after training the network with the Euclidean distance term. It can be seen that the hidden unit activation values are not scattered in the space; the values are brought closer to each other to form clusters. As a result, more hidden unit activation values fall into the same interval, and hence fewer rules are extracted for classifying the hidden unit values belonging to an interval to a class.
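The interval step above can be illustrated with plain binning. The boundaries bi and the rule format below are illustrative assumptions for the sketch, not the Chi2/CART output of [12]:

```python
import bisect

def interval_of(a, boundaries):
    """Return i such that activation a falls in interval l_i, where the
    boundaries [b1, ..., bk] define [0,b1), [b1,b2), ..., [bk,1]."""
    return bisect.bisect_right(boundaries, a)

boundaries = [0.3, 0.7]          # b1, b2 -> l0=[0,0.3), l1=[0.3,0.7), l2=[0.7,1]
activations = [0.1, 0.45, 0.95]  # hidden unit activation values for one input

rule = " and ".join(f"H{j} in l{interval_of(a, boundaries)}"
                    for j, a in enumerate(activations, start=1))
print(f"If ({rule}) then class = class1")
```

Moving activations closer together (as the distance term does) means fewer distinct interval indices appear across the training set, which is precisely what shrinks the extracted rule set.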
REFERENCES
[1] C.W.Omlin, C. L. Giles. Extraction of Rules from Discrete-Time
Recurrent Neural Networks, Neural Networks,Vol 9,No 1,pp 441-52,
1996.
[2] A.Gupta. Generalized Analytic Rule Extraction for Feed forward Neural
Networks, IEEE Transactions on Knowledge and Data Engineering, Vol.
11, No. 6, pp 985-991, 1999.
[3] S.M. Kamruzzaman and A. R. Hasan, Rule Extraction using Artificial
Neural Networks, ICTM, 2005.
[4] D. Dancey, Z.A.Bandar and D. McLean.Logistic Model Tree extraction
from Artificial Neural Network. IEEE Transactions on Systems, Man,
and Cybernetics, Vol.37, No.4, pp 794-803, 2007.
[5] R. Setiono, B. Baesens, and C. Mues. Recursive Neural Network Rule
Extraction for data with mixed attributes, IEEE Transactions on Neural
Networks, Vol.19, No.2, 2008.
[6] K. Odajimaa, Y. Hayashi, G. Tianxia, R. Setiono. Greedy rule generation
from discrete data and its use in neural network rule extraction, Neural
Network, Vol 21, pp 1020-1028, 2008.
[7] M.G. Augasta, T. Kathirvalavakumar. Reverse Engineering the Neural Networks for Rule Extraction in Classification Problems, Neural Processing Letters, Springer, Vol. 35, pp. 131-150, 2011.
[8] S.M. Kamruzzaman and A.M. Jehad Sarkar. A new data mining scheme
using artificial neural networks, Sensors, Vol 11, pp 4622-4647, 2011.
[9] L. Ai-sheng, Z.Qi. Automatic modulation classification based on the
combination of clustering and neural network, Science Direct, 2011.
[10] T. Q. Huynh and J. A. Reggia. Guiding Hidden Layer Representations
for Improved Rule Extraction from Neural Networks, IEEE, Neural
networks, Vol 22. No 2, Pg No 264-275, 2011.
[11] S. Kulluk, L. Ozbakir, A. Baykasoglu. "Fuzzy DIFACONN-miner: A novel approach for fuzzy rule extraction from neural networks," Expert Systems with Applications, Vol. 40, No. 3, pp. 938-946, 2013.
[12] H. Liu and R. Setiono, Chi2: Feature selection and discretization of
numeric attributes, Proceedings of the Seventh International Conference
on Tools with Artificial Intelligence, IEEE Computer Society
Washington, DC, USA,1995.
[13] V. Bijalwan, V. Kumar, P. Kumari. KNN based Machine Learning Approach for Text and Document Mining, International Journal of Database Theory and Application, Vol. 7, pp. 61-70, 2014.
[14] A. Gupta, J. Prakash. Human activity recognition using gait pattern,
International Journal of Computer Vision and Image Processing, Vol 3 ,
pp 31-53,2013.
[15] P. Kumari, A. Vaish. Individual Identification based on Neuro-signal using Motor Movement and Imaginary Cognitive Process, International Journal for Light and Electron Optics, DOI: 10.1016/j.ijleo.2015.09.020, 2015.
[16] A. Asuncion and D.J. Newman, UCI Machine Learning Repository, http://www.ics.uci.edu/~mlearn/MLRepository.html.
An Investigation on Residential Exposure to
Electromagnetic Field from Cellular Mobile Base
Station Antennas
Amar Renke and Dr. Mahesh Chavan
Department of Electronics Engineering, Shivaji University
KIT's College of Engineering
Kolhapur, India
amarrenke@hotmail.com, maheshpiyu@gmail.com
Abstract—The increasing number of antennas on cellular base station towers has led to growing public concern about human exposure to electromagnetic fields (EMF) and possible health effects. This study investigated the levels of electromagnetic radiation emitted from cellular base stations in Kolhapur, a district located in west Maharashtra. The paper focuses on human exposure to electromagnetic radiation from base station antennas in residential areas. Various exposure situations were considered, such as hall, kitchen, bedroom, and terrace, where the electromagnetic field exposure was measured in terms of power density and electric field. The average height of the antenna towers was around 150 feet and the average number of antennas on the base station towers was around 16. Results were tabulated in terms of power density and electric field. The measured values of power density were well below the maximum permissible exposure levels. The electromagnetic field exposure was highest at the terrace, medium in halls and bedrooms, and low in kitchens. The residential exposure varies from 45.92 to 13860 μW/m². The maximum value of residential exposure was 2797 μW/m².

Keywords—power density; health consequences; residential exposure; dwellings

I. INTRODUCTION

In recent years, wireless data transmission has recorded impressive technological development. Due to the increased use of mobile phones, the installation of new base stations in the vicinity of dwellings increases public exposure to electromagnetic fields. As base stations are installed in residential areas, residential exposure comes into the picture, because an EM field then exists everywhere in the dwellings, and every person in a dwelling is exposed to the EM field from cellular base station antennas. Generally, in urban areas the population is dense, so the cell size is reduced to cover every part of the city; this requires a larger number of base stations. To find out the likely impact of EMF exposure from cellular base stations, different studies have been carried out on EMF exposure measurement [1]-[5] and on the possible consequences of human exposure to EM fields from cellular base stations [6]-[10]. More awareness of EM field exposure due to cellular base stations is present in developed countries, and less in Asian countries. This paper presents work related to EM radiation emitted from cellular base stations in close vicinity of dwellings. We investigated different cases of residential human exposure to EM radiation from base stations in the village of Pachgaon, located on the south-west side of Kolhapur. Base station antenna towers may be located on the ground, on rooftops of residential buildings, on commercial buildings, or on hotel rooftops. We measured the magnitude of the EM field and the power density in various parts of the dwellings. The contribution of EM radiation from cellular mobile base station antennas is higher compared to other EM field sources.

II. MATERIALS AND METHODS

In Kolhapur there were different cellular systems, such as the 2G and 3G systems GSM, CDMA and UMTS, which provide mobile voice and data services. Each system uses a different frequency band, i.e. 800, 900, 1800 or 2100 MHz, and all systems provide voice, data and internet services. The service providers available in Kolhapur are Reliance, Airtel, Vodafone, Idea, etc. To estimate the worst case residential exposure, we measured EM field levels in different parts of the dwellings. In our study, the base station towers are mounted on the ground and the average number of antennas on the towers was around 16 to 20. All investigated dwellings were situated around the base station antenna towers. The average distance between a base station antenna tower and the dwellings varied from 10 m to 50 m.

The EM field was measured in different parts of the dwellings such as hall, bedroom, kitchen, balcony and rooftop. In most cases the distance between the dwellings and the antenna tower was around 20 m. Considering the average height of Indian people, the EM field exposure was measured at a height of 1.5 m with the help of a three-axis electromagnetic field meter, model KM-195, as shown in Figure 1.0. This meter has a frequency range of 200 MHz to 2.5 GHz, which covers the entire EMF frequency band used for cellular mobile communication. First, base station antenna sites were selected; then the EM field and power density were measured in different parts of the dwellings.
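The two quantities reported in this study, electric field and power density, are related in the far field by the standard free-space plane-wave relation S = E²/Z₀, with Z₀ ≈ 377 Ω. The conversion can be sketched as follows; the sample field value is illustrative, not a measurement from the study:

```python
FREE_SPACE_IMPEDANCE = 376.73  # ohms, intrinsic impedance of free space

def power_density_uw_per_m2(e_field_v_per_m):
    """Far-field power density in microwatt/m^2 from an electric field in V/m."""
    return (e_field_v_per_m ** 2) / FREE_SPACE_IMPEDANCE * 1e6

# An electric field of 1 V/m corresponds to roughly 2654 microwatt/m^2.
print(round(power_density_uw_per_m2(1.0)))
```

Because S scales with E², doubling the measured electric field quadruples the reported power density.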
Abstract—In today's big data era, all modern applications are generating and collecting large amounts of data. As a result, data mining is encountering new challenges and opportunities to design algorithms such that this voluminous data can be effectively and efficiently transformed into actionable knowledge. Traditional algorithms were designed to run sequentially on a single machine. But as the volume of data increases, the computational cost associated with its processing also increases. This causes problems in analysing data on a single sequential machine, and instead of assisting in data analysis, the processor serves more as a bottleneck. Parallel and distributed approaches improve the performance in terms of computational cost as well as scalability, but experience limitations in load balancing, data partitioning, job assignment, monitoring, etc. MapReduce, a parallel programming model, is a new concept which provides seemingly unlimited computing power and cheap storage, and can overcome the above limitations. This makes it a topic of upcoming research interest. A detailed literature review of some existing methods is given along with their pros and cons.

1. Introduction

With the advancement of technology, all modern applications are generating and collecting large amounts of data. Typical examples include social networking, wireless sensor networks, computer networks, etc. As a result, data mining requires more effective and efficient algorithms to transform this huge data into actionable knowledge. Data mining can be defined as the process of extracting hidden patterns and predictive information from large volumes of data. Broadly, this hidden knowledge can be extracted using methods such as association rule mining, clustering, sequence analysis, classification or forecasting. Association rule mining is one of the important techniques used for extracting interesting patterns and correlations among items in a large dataset [21]. Two traditional approaches are followed, called candidate-generation based (the Apriori algorithm) and candidate-less (the FP-Growth algorithm).

The rule mining task can be divided into two computationally intensive subtasks, i.e. frequent itemset generation and association rule generation. It requires the algorithms to be very efficient and scalable. Traditional algorithms were designed to run sequentially on a single machine. But as the volume of data increases, the computational cost associated with its processing also increases. This causes problems in analysing data on a single sequential machine and, instead of assisting in data analysis, the processor serves more as a bottleneck. To deal with these issues arising from large scale data, parallelization of mining algorithms becomes inevitable. Parallelization of association rule mining techniques can be done either by dividing and distributing data over multiple nodes and generating association rules locally, followed by merging local results to obtain global association rules, which is called data parallelization, or by parallelizing the algorithm itself, called algorithm parallelization.

A detailed literature review of various existing methods is given along with their pros and cons. Section 2 includes some background on traditional association rule mining techniques. Section 3 discusses some existing research work done on the parallelization of association rule mining. Section 4 explores some of the existing techniques for implementing association rule mining on MapReduce with their respective advantages and limitations. Section 5 draws the conclusion.

2. Association Rule Mining

Association rule mining can be defined as the process of extracting correlations and associations among items in a large dataset [21]. An association rule is an implication of the form X → Y, where X and Y are different itemsets in a transactional or relational database. X → Y holds in dataset D with two properties. Support (s) is the fraction of transactions containing both X and Y (X ∪ Y) out of the total number of transactions, Equation (1):

support(X → Y) = P(X ∪ Y).    (1)

Confidence (c) is the percentage of transactions containing Y among those that already contain X, Equation (2):

confidence(X → Y) = P(Y | X).    (2)
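Equations (1) and (2) can be computed directly from a toy transaction database; the transactions below are illustrative:

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item of itemset (Eq. 1)."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, X, Y):
    """P(Y|X): share of transactions containing X that also contain Y (Eq. 2)."""
    return support(transactions, set(X) | set(Y)) / support(transactions, X)

db = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
print(support(db, {"bread", "milk"}))        # 2 of 4 transactions -> 0.5
print(confidence(db, {"bread"}, {"milk"}))   # 2 of the 3 bread transactions
```

A rule {bread} → {milk} is then reported only if both values clear the Min_sup and Min_conf thresholds introduced below.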
Association rule mining aims to extract frequent itemsets with support(s) ≥ Min_sup and finally to generate association rules from these frequent itemsets with confidence(c) ≥ Min_conf. Here, Min_sup and Min_conf are threshold values. Association rule mining techniques can be divided into two major categories: candidate-generation based techniques and candidate-less techniques. A number of algorithms exist under each category, but most importantly the Apriori algorithm is a candidate-generation based approach and FP-Growth is a candidate-less approach.

The Apriori algorithm [21] uses prior knowledge in terms of the properties of frequent itemsets. It is an iterative approach where the k-frequent itemsets are extracted using the (k−1)-frequent itemsets. The algorithm exhibits an anti-monotonic property, called the Apriori property [21], which states that "for a frequent itemset, all its non-empty subsets must also be frequent." Apriori can be seen as a two step process: frequent itemset generation and association rule generation. Frequent itemset generation can be further divided into two subtasks: joining, where the (k−1)-frequent itemsets Lk−1 are joined with themselves to generate the candidate k-itemsets (Ck), and pruning, where all candidate itemsets with support less than Min_sup are pruned off. The working of Apriori can be seen in Figure 1.

Figure 1. Working Flow of Apriori Algorithm [21]

The FP-Growth algorithm [20] is a candidate-less approach and therefore does not generate candidate itemsets in between. It uses a tree data structure, which is much more compact, and frequent itemsets are mined directly from the tree. The FP-Tree is a composed and compressed representation of a large dataset. This approach requires only two database scans. The first scan obtains the support count of each item in the dataset; infrequent items with support less than Min_sup are removed and the others are arranged in decreasing order of their support. The second scan over the dataset constructs the FP-Tree. Finally, this tree is used to extract frequent itemsets directly using a bottom-up strategy. Both traditional approaches are compared over several parameters in Table 2.

Traditional algorithms are designed to run sequentially on a single machine [20], [21]. With the increase in data volume, the computational intensiveness increases such that it becomes impossible for a single machine to perform efficiently [3].

3. Parallelization of Association Rule Mining

Association rule mining can be viewed as a two step process [3], [21]: frequent itemset generation and association rule generation. These being computationally intensive tasks, parallelization is the next step required to reduce the workload on one sequential machine and distribute it among several nodes. Parallelization of association rule mining can be achieved in two major ways: data parallelization, which divides and distributes the data over multiple nodes and generates association rules locally, followed by merging the local results to obtain global association rules; and algorithm parallelization, where the algorithmic tasks themselves are distributed over a number of nodes. Based on the literature review done, it can be concluded that most of the research work has been done on data parallelization.

3.1. Parallelization of Apriori Algorithm

Apriori is the most widely used frequent itemset generation algorithm; it iteratively generates the candidate k-itemsets from the present (k−1)-frequent itemsets. Several interesting and effective research works on parallelizing the mining of itemsets can be seen in [11], [12]. Initially Apriori was implemented on multi-processors, and further [1], [2], [13] implemented it on distributed architectures. A summary of the parallel versions of Apriori is given in [8]: Count distribution can be seen as the most direct form of parallelization of the Apriori algorithm. At each node, the global candidate itemsets as well as the frequent itemsets are stored. For the local data present on each node, the support count of the candidate itemsets is calculated using the Apriori algorithm, and finally these local results are exchanged among the nodes. In candidate distribution, each node does not maintain global results but partitions the candidate itemsets with respect to the partition of the dataset; each node then locally calculates the support count of its own candidate itemsets. Data distribution combines the above two approaches by partitioning both the dataset and the candidate itemsets at the same time, such that each node can work independently.

A parallel implementation of finding frequent itemsets followed by sequential extraction of association rules depletes the performance gained so far when the number of generated frequent itemsets is large. Hence, parallelization of the second phase is equally important to enhance the overall performance in terms of scalability and efficiency. SeaRum [4] is, to our knowledge, one of the initial attempts at developing both phases of the Apriori algorithm in a distributed environment.

3.2. Parallelization of FP-Growth Algorithm

The multi-tree approach [10] was one of the initial works done to parallelize the FP-Growth algorithm. Broadly, it uses three steps to achieve parallelism, which are followed in sequence by parallel processing flows. Initially, a horizontal subset of the data is analyzed. Secondly, a local FP-Tree is built
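The join-and-prune loop described above can be sketched in miniature. The toy database and Min_sup value are illustrative assumptions:

```python
from itertools import combinations

def apriori(transactions, min_sup):
    """Return every frequent itemset with support count >= min_sup."""
    def count(itemsets):
        return {s: sum(s <= t for t in transactions) for s in itemsets}

    # Frequent 1-itemsets.
    items = {frozenset([i]) for t in transactions for i in t}
    freq = {s: c for s, c in count(items).items() if c >= min_sup}
    all_freq, k = dict(freq), 2
    while freq:
        # Join: combine (k-1)-frequent itemsets into candidate k-itemsets (Ck).
        cands = {a | b for a in freq for b in freq if len(a | b) == k}
        # Prune: by the Apriori property, drop candidates with an infrequent subset.
        cands = {c for c in cands
                 if all(frozenset(s) in freq for s in combinations(c, k - 1))}
        freq = {s: c for s, c in count(cands).items() if c >= min_sup}
        all_freq.update(freq)
        k += 1
    return all_freq

db = [frozenset(t) for t in
      ({"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"})]
print(apriori(db, min_sup=3))
```

For this toy database each single item has support 4 and each pair support 3, while the triple {a, b, c} (support 2) falls below Min_sup and is discarded, so six frequent itemsets are returned.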
Table 2. Comparison of Apriori and FP-Growth

Parameters    | Apriori                                                      | FP-Growth
Technique     | Candidate-based                                              | Candidate-less
Time          | Execution time is longer, as candidates must be produced in every pass (slow). | Much smaller than Apriori (fast).
Memory usage  | Large candidate generation requires large space.             | Uses a tree data structure and does not generate candidate itemsets; requires less memory.
No. of scans  | Multiple scans for generating candidate sets.                | Only two scans are required.
Property      | Uses the Apriori property, prune step, join step.            | Conditional frequent pattern tree and base are constructed from the database, satisfying the Min_sup value.
in parallel, and finally the mining process is carried out on this local FP-Tree. Candidate patterns are obtained from every processing flow and then merged together. Further enhancements were made to the merging algorithms using a cluster computing environment. Moreover, in [13] certain constraints were proposed for massive datasets which should be followed in parallel itemset extraction to obtain good scalability. Itemset extraction was further improved with better hardware resource exploitation on multi-core processors. This proved to be a new innovation in the field, enhancing the performance of the FP-Growth algorithm by improving the temporal locality of data accesses at different levels of memory. In [12] an initial effort was made to address cache-hint optimization using the above technique, and certain applications implementing it were developed [11] for mining itemsets in parallel.

4.1. MapReduce

MapReduce [14] is a parallel programming model which supports the distributed computing required for mining large-scale data. MapReduce is composed of two procedures: Map, which basically performs sorting and filtering of the data, and Reduce, which merges the map output to produce the final result. MapReduce is a framework which can process parallelizable problems over massive data sets by using a large number of machines. The machines are collectively called a cluster, which can take advantage of data locality.
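The Map/Reduce pair can be illustrated in miniature with item support counting, the core subtask of parallel rule mining. The in-process loop below simulates the shuffle step that a real framework such as Hadoop would perform across a cluster; the transactions are illustrative:

```python
from collections import defaultdict

def map_phase(transaction):
    """Map: emit an intermediate (item, 1) pair for every item in one transaction."""
    return [(item, 1) for item in transaction]

def reduce_phase(key, values):
    """Reduce: merge all intermediate counts for one key into a final total."""
    return key, sum(values)

transactions = [{"bread", "milk"}, {"bread"}, {"milk", "butter"}]

# Shuffle: group the intermediate pairs by key, as the framework would do
# between the map and reduce stages.
grouped = defaultdict(list)
for t in transactions:
    for key, value in map_phase(t):
        grouped[key].append(value)

support_counts = dict(reduce_phase(k, v) for k, v in grouped.items())
print(support_counts)  # bread: 2, milk: 2, butter: 1
```

In a real deployment the map calls run on the nodes holding each data split (exploiting data locality), and only the compact intermediate pairs travel over the network.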
4.4. Other Approaches

Besides MapReduce, much other research work using different approaches can be seen which also ultimately improves the association rule mining process in a parallel environment. A cloud based service model, SeaRuM [4], efficiently extracts association rules from huge sets of frequent items. SeaRuM runs a number of distributed MapReduce jobs performing different tasks in the cloud. The architecture contains the following jobs: data acquisition, data preprocessing, item frequency computation, itemset mining, rule extraction, and rule aggregation and sorting.

A randomized approach [6], PARMA (a Parallel Randomized Algorithm for Approximate Association Rule Mining), combines two different methods, random sampling and parallelization, for extracting association rules from massively huge datasets. The overall cost of association rule mining can be divided into two components: scanning and mining. The scanning factor increases very rapidly, dominates the mining, and makes the whole process unscalable and complex. Hence, PARMA combines both approaches in a novel fashion: it mines small random samples in parallel. Each sample is given as input to a MapReduce function running in an orthogonal manner. Lastly, the filtered and aggregated association rules from each sample are collected. PARMA can compute association rules directly and is not limited to frequent itemset extraction. It significantly outperforms the alternatives, has near-linear speed-up and good scalability, and is able to achieve near constant runtime by scaling data and nodes together.

Another strategy, YAFIM (Yet Another Frequent Itemset Mining) [18], takes a different approach, running a parallel Apriori algorithm on the Spark RDD framework instead of MapReduce. Spark is specially designed for the iterative and interactive algorithms of data mining. It is basically an in-memory parallel computing model, meaning all data is loaded into memory itself. Secondly, it does not use the fixed two stage model of MapReduce but provides DAG based data flow. These features speed up the computation significantly for iterative algorithms like Apriori, and YAFIM outperforms MapReduce implementations in terms of computation speed.

A new method called NIMBLE [19] aims to achieve portability of programming code for fast and efficient implementation of parallel data mining algorithms. The toolkit allows machine learning and data mining algorithms to be built in parallel around reusable building blocks, such that they can be easily utilized by other programming models as well; this helps to achieve inter-portability. NIMBLE facilitates the processing of a variety of data formats through its built-in support, and also simplifies custom data-format implementations. Strategies for optimization and abstraction work hand in hand to deliver a high performance runtime. It can be seen as an infrastructure providing limited-effort parallelization and support for rapid prototyping.
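The sample-then-aggregate idea behind PARMA can be sketched as follows: mine each random sample independently (in a real system, one parallel job per sample) and keep the items found frequent in a majority of samples. The sample sizes, thresholds, and majority vote below are illustrative simplifications of [6], not its actual guarantees:

```python
import random
from collections import Counter

def frequent_items(sample, min_frac):
    """Items whose relative support within the sample reaches min_frac."""
    counts = Counter(i for t in sample for i in t)
    return {i for i, c in counts.items() if c / len(sample) >= min_frac}

random.seed(1)
db = [{"a", "b"}] * 60 + [{"a"}] * 25 + [{"c"}] * 15
samples = [random.sample(db, 40) for _ in range(5)]   # mined in parallel in PARMA

# Aggregate: keep items reported frequent by a majority of the samples.
votes = Counter(i for s in samples for i in frequent_items(s, 0.5))
approx_frequent = {i for i, v in votes.items() if v >= 3}
print(approx_frequent)
```

Because each sample is small, every parallel worker avoids scanning the full database, which is precisely the scanning cost that PARMA sets out to eliminate.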
All the approaches discussed in Sections 4.1, 4.2 and 4.3 can be summarized in the form of Table 2.

5. Conclusion

Association rule mining aims to find correlations and associations in data. Association rules can be extracted using either of two primary approaches, i.e. candidate-generation based (the Apriori algorithm) or candidate-less (FP-Growth). The process of association rule mining can be further divided into two subtasks: frequent itemset generation and association rule generation. Frequent itemset generation is computationally very intensive; hence, parallelization of the traditional approaches becomes inevitable. Association rule mining techniques can be parallelized in two ways, i.e. data parallelization or algorithmic parallelization. Based on the literature review done, it can be concluded that most of the research work has been done using the data parallelization approach, while not much research work has been done on algorithmic parallelization. Parallelization of association rule mining techniques over a distributed architecture helps to achieve scalability and computational speed-up. But there are many limitations, such as data partitioning problems, uneven data distribution, improper load balancing, data communication overhead, etc. MapReduce, a parallel programming model, is a new concept which seemingly provides high computational power and hence promises great scope for evolving association rule mining techniques in a parallel environment.

References

[1] Le Zhou, Zhiyong Zhong, Jin Chang, Junjie Li, J.Z. Huang and Shengzhong Feng, "Balanced parallel FP-Growth with MapReduce," in IEEE Youth Conference on Information Computing and Telecommunications (YC-ICT), pp. 243-246, 2010.
[2] Haoyuan Li, Yi Wang, Dong Zhang, Ming Zhang and Edward Chang, "PFP: Parallel FP-Growth for Query Recommendation," in Proceedings of the ACM Conference on Recommender Systems (RecSys), 2008.
[3] Z. Farzanyar and N. Cercone, "Efficient mining of frequent itemsets in social network data based on MapReduce framework," in IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 1183-1188, 25-28 Aug. 2013.
[4] D. Apiletti, E. Baralis, T. Cerquitelli, S. Chiusano and L. Grimaudo, "SeaRum: A Cloud-Based Service for Association Rule Mining," in 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 1283-1290, 16-18 July 2013.
[5] http://www.slideshare.net/zafarjcp/data-mining-association-rules-basics.
[6] Matteo Riondato, Justin A. DeBrabant, Rodrigo Fonseca and Eli Upfal, "PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce," in Proceedings of the 21st ACM International Conference on Information and Knowledge
[8] Xin Yue Yang, Zhen Liu and Yan Fu, "MapReduce as a programming model for association rules algorithm on Hadoop," in 3rd International Conference on Information Sciences and Interaction Sciences (ICIS), pp. 99-102, 23-25 June 2010.
[9] Othman Yahya, Osman Hegazy and Ehab Ezat, "An Efficient Implementation of Apriori Algorithm Based on Hadoop-Mapreduce Model," in International Journal of Reviews in Computing, Vol. 12, pp. 59-67, 31 December 2012.
[10] O.R. Zaiane, M. El-Hajj and P. Lu, "Fast parallel association rule mining without candidacy generation," in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 665-668, 2001.
[11] L. Liu, E. Li, Y. Zhang and Z. Tang, "Optimization of frequent itemset mining on multiple-core processor," in Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB), pp. 1275-1285, 2007.
[12] A. Ghoting, G. Buehrer, S. Parthasarathy, D. Kim, A. Nguyen, Y.-K. Chen and P. Dubey, "Cache-conscious frequent pattern mining on modern and emerging processors," The VLDB Journal, Vol. 16, pp. 77-96, 2007.
[13] M. El-Hajj and O.R. Zaiane, "Parallel bifold: Large-scale parallel pattern mining with constraints," Distributed and Parallel Databases, Vol. 20, No. 3, pp. 225-243, 2006.
[14] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, Vol. 51, No. 1, pp. 107-113, Jan. 2008.
[15] Lamine M. Aouad, Nhien-An Le-Khac and Tahar M. Kechadi, "Distributed frequent itemsets mining in heterogeneous platforms," Journal of Engineering, Computing and Architecture, Vol. 1, No. 2, 2007.
[16] Wang Yong, Zhang Zhe and Wang Fang, "A parallel algorithm of association rules based on cloud computing," in 8th International ICST Conference on Communications and Networking in China (CHINACOM), pp. 415-419, 14-16 Aug. 2013.
[17] Xiaoting Wei, Yunlong Ma, Feng Zhang, Min Liu and Weiming Shen, "Incremental FP-Growth mining strategy for dynamic threshold value and database based on MapReduce," in Proceedings of the IEEE 18th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 271-276, 21-23 May 2014.
[18] Hongjian Qiu, Rong Gu, Chunfeng Yuan and Yihua Huang, "YAFIM: A Parallel Frequent Itemset Mining Algorithm with Spark," in IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW), pp. 1664-1671, 19-23 May 2014.
[19] Amol Ghoting, Prabhanjan Kambadur, Edwin Pednault and Ramakrishnan Kannan, "NIMBLE: A Toolkit for the Implementation of Parallel Data Mining and Machine Learning Algorithms on Mapreduce," in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 334-342, 2011.
[20] J. Han, J. Pei and Y. Yin, "Mining frequent patterns without candidate generation," in SIGMOD, pp. 1-12, 2000.
[21] Rakesh Agrawal and Ramakrishnan Srikant, "Fast algorithms for mining association rules," in Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pp. 487-499, 1994.
[22] Xia Geng and Zhi Yang, "Data mining in cloud computing," in International Conference on Information Science and Computer Applications (ISCA 2013), 2013.
management, pp. 85-94, 2012. [23] X. Y. Yang, Z. Liu and Y. Fu, MapReduce as a Programming
[7] Lingjuan Li and Min Zhang, ”The Strategy of Mining Association Model for Association Rules Algorithm on Hadoop, in Proceed-
Rule Based on Cloud Computing,” in International Conference ings 3rd International Conference on Information Sciences and
on Business Computing and Global Informatization (BCGIN), Interaction Sciences (ICIS), vol. 99, no. 102, pp. 23-25, 2010.
pp.475-478, 29-31 July 2011.
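The conclusion's point about MapReduce can be made concrete with a toy sketch of frequent-itemset counting split into map and reduce phases. This illustrates the programming model only; it is not the implementation of any of the systems discussed above, and the transactions are invented for the example.

```python
from collections import defaultdict
from itertools import combinations

# Map step: emit (candidate itemset, 1) pairs; in a real MapReduce job each
# mapper would process one shard of the transaction database in parallel.
def map_phase(transactions):
    for t in transactions:
        for item in t:
            yield (frozenset([item]), 1)
        for pair in combinations(sorted(t), 2):
            yield (frozenset(pair), 1)

# Shuffle + reduce step: sum the counts per itemset, keep the frequent ones.
def reduce_phase(pairs, min_support):
    counts = defaultdict(int)
    for itemset, one in pairs:
        counts[itemset] += one
    return {k: v for k, v in counts.items() if v >= min_support}

transactions = [{"bread", "milk"},
                {"bread", "butter"},
                {"bread", "milk", "butter"}]
frequent = reduce_phase(map_phase(transactions), min_support=2)
```

The key property the conclusion alludes to is that the map step is embarrassingly parallel over data shards, while the shuffle groups identical candidate itemsets onto the same reducer.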
Effectiveness of LSP Features for Text
Independent Speaker Identification
Sharada V Chougule1, Mahesh S. Chavan2,
Finolex Academy of Management and Technology, Ratnagiri1,
KIT’s College of Engineering, Kolhapur2,
shardavchougule@gmail.com1, maheshpiyu@gmail.com2
Abstract— The speech features used for speaker recognition should uniquely reflect the characteristics of the speaker's vocal tract apparatus and contain negligible information about the linguistic contents of the speech. Cepstral features such as Linear Predictive Cepstral Coefficients (LPCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are the most commonly used features for the speaker recognition task, but they are found to be sensitive to noise and distortion. Other complementary features used initially for speech recognition can also be useful for the speaker recognition task. In this work, Line Spectral Pair (LSP) features (derived from baseline linear predictive coefficients) are used for text-independent speaker identification. For LSP features, the power spectral density at any frequency tends to depend only on the nearby LSPs, whereas for cepstral features a change in a particular parameter affects the whole spectrum. The goal here is to investigate the performance of LSP features against conventional cepstral features in the presence of acoustic disturbance. Experimentation is carried out using the TIMIT and NTIMIT datasets to analyze performance under acoustic and channel distortions. It is observed that the LSP features perform equally well compared with conventional cepstral features on the TIMIT dataset and show enhanced identification results on the NTIMIT dataset.

Index Terms- Linear Predictive Cepstral Coefficients (LPCCs), Mel Frequency Cepstral Coefficients (MFCCs), Line Spectral Pair (LSP)

I. INTRODUCTION

Speech is a natural means of communication for human beings. The human brain and auditory system analyze speech well to obtain the information content from it. In a similar manner, computer analysis of speech is useful for various purposes: it helps in observing real-time speech spectra and in extracting important characteristics carried by the speech (the utterance) as well as features of the speaker. Speech recognition and speaker recognition by machines are the two main streams found useful in a variety of real-world applications. When the task is to determine 'who' is talking (rather than 'what' is said), it is referred to as 'speaker recognition' in general. Depending on the nature of the end decision, it is categorized as speaker identification or speaker verification. Irrespective of the end task, the requirement of speech analysis is to extract measures of 'speaker' characteristics that distinguish one person from another. This procedure is called feature extraction and is one of the important front-end procedures in speaker recognition techniques. There is a growing need for using speech as a person's identity in a number of day-to-day applications. Person authentication prior to admission to a secure facility or for transactions over the telephone, and verifying a person's (criminal's) voice for forensic purposes or in audio conferencing, are a few examples of the same [1].

Speaker recognition is usually carried out in two steps, namely enrolment (training) and testing (decision making). In the training stage, a set of speaker-dependent features is derived from the input speech, which is used to form the model of the particular speaker. In the testing phase, the test speaker's features are compared with the stored database models and the end decision is made using some form of matching algorithm. A variety of speech features useful for speaker recognition have been studied in the past. These features are primarily based on two main factors: i) the physical nature of the human speech production organs, and ii) the socio-educational background of the speaker. The features based on the first category are called low-level (physical) features, whereas the features relating to the second factor are called behavioral (high-level) features [2]. Low-level features, derived from the shape of the vocal tract and the source excitation of the human speech production model [3], are easy to compute and require little data for analysis and extraction. In comparison, high-level features are generally related to a speaker's learned habits and style (such as particular word usage or idiolect and long-term prosody), and are difficult to extract and require larger amounts of speech data. The shape of the vocal tract and the excitation parameters are supposed to be the most unique characteristics of an individual's speech.

Cepstral-based features separate the source (vocal folds) and system (vocal tract) parameters from the speech signal. Mel Frequency Cepstral Coefficients (MFCCs) and Linear Predictive Cepstral Coefficients (LPCCs) are the most common examples of the same. Studies in [4, 5, 6] showed that source or vocal tract features alone are susceptible to additive noise and channel distortions and give decreased recognition performance. Vocal source features such as spectral centroids, spectral band energy, spectral bandwidth, spectral crest factor and Shannon entropy have been used in combination with MFCC features for speaker identification on the TIMIT dataset [7]. Features like the Minimum Variance Distortionless Response (MVDR) [8] and Mean Hilbert Envelope Coefficients (MHEC) [9] have been proposed for improving the performance of speaker recognition in noisy conditions. Front-end processing based on autoregressive models followed by a modulation filtering process has been proposed for robust feature extraction in noisy conditions [10]. It is desirable that the features used to represent speaker-specific characteristics be robust to various forms of added noise and undesired distortions in the surroundings.
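To make the LSP/LSF construction concrete, the following sketch (our own illustration, not the authors' code) derives line spectral frequencies from an LPC polynomial A(z) via the symmetric polynomial P(z) = A(z) + z^-(p+1)A(1/z) and the antisymmetric polynomial Q(z) = A(z) - z^-(p+1)A(1/z), whose unit-circle root angles are the LSFs. The test signal and model order are arbitrary choices for the example.

```python
import numpy as np

def lpc(frame, order):
    # Autocorrelation-method LPC: solve the normal (Yule-Walker) equations
    n = len(frame)
    r = np.array([frame[: n - k] @ frame[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -a))           # A(z) = 1 - sum_k a_k z^-k

def lpc_to_lsf(a):
    # Build P(z) and Q(z); their root angles in (0, pi) are the p LSFs.
    # Model order is assumed even here, so P has a fixed root at z = -1
    # and Q a fixed root at z = +1, which we deflate before root-finding.
    a_ext = np.concatenate((a, [0.0]))
    P = a_ext + a_ext[::-1]
    Q = a_ext - a_ext[::-1]
    P = np.polydiv(P, [1.0, 1.0])[0]             # remove root at z = -1
    Q = np.polydiv(Q, [1.0, -1.0])[0]            # remove root at z = +1
    roots = np.concatenate((np.roots(P), np.roots(Q)))
    return np.sort(np.angle(roots[roots.imag > 0]))  # LSFs in radians

# Example: 12 LSFs for one frame of a noisy sinusoid
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 0.05 * np.arange(400)) + 0.01 * rng.standard_normal(400)
lsf = lpc_to_lsf(lpc(frame, 12))
```

For a stable (minimum-phase) A(z) the LSFs are distinct and interlaced on (0, pi), which is the localized-sensitivity property the abstract refers to.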
ACKNOWLEDGMENT
The authors would like to thank TIFR Mumbai and Dr.
Samudravijaya K. for providing the speech database.
REFERENCES
Abstract—Road accidents are one of the crucial areas of research in India. A variety of research has been done on data collected through police records, which cover only a limited portion of the highways. The analysis of such data can reveal information regarding those portions only, but accidents are scattered not only over highways but also over local roads. A different source of road accident data in India is the Emergency Management and Research Institute (EMRI), which serves and keeps track of every accident record on every type of road and covers information on an entire state's road accidents. In this paper, we have used data mining techniques to analyze the data provided by EMRI: we first cluster the accident data, and then association rule mining is applied to identify the circumstances in which an accident may occur for each cluster. The results can be utilized to put accident prevention efforts into the areas identified for different categories of accidents in order to reduce the number of accidents.

Keywords—Data Mining; Road Accidents; Accident Analysis; Association Rule Mining.

I. INTRODUCTION

Road accidents are undesirable events that are uncertain and unpredictable, and they are one of the major causes of unnatural death, disability and property damage. A report by MORTH, 2014 [1] states that 0.4 million accidents are reported each year in India, which makes India one of the countries with the highest accident rates. The report notes a minor decrease in the accident rate from 2012 to 2013, but this does not guarantee that the decrease will continue in future years. Fruitful research is therefore required in order to identify the circumstances of accident occurrence.

In India, when an accident takes place, it is recorded by the concerned police officer of that region's police station. Police stations only cover the accidents that happen within their territories. Raj V. P. [2] discussed that in India the method used for collecting, compiling and recording accident data needs a lot of improvement. He states that the reports prepared at accident sites are very basic and non-analytical for research purposes. Various studies [3-5] have been done on this data using statistical techniques such as Poisson models, Negative Binomial models, etc. As previously mentioned, since this data was collected from police stations, it contains accident records from the limited portions of highways under their territories. Hence their results only explore the characteristics of accidents on some portions of highways.

However, authors from other countries have used quality accident data and analyzed it using statistical techniques. Poisson models [6-7] and negative binomial (NB) models [8-10] have been used extensively to identify the relationship between traffic accidents and their causative factors. It has been widely recognized that Poisson models outperform standard regression models in handling the nonnegative, random and discrete nature of crash counts [11-12].

Although Poisson models perform better, the constraint that the mean equal the variance in Poisson models is often violated by over-dispersed accident data. As an alternative, Negative Binomial (NB) models can be used to accommodate this over-dispersion by incorporating an additional independently distributed error term. However, with their assumption of independent observations, both the Poisson models and the NB models are not suitable for handling the heterogeneous nature of road accident data.

There are various regression-based models, i.e. linear regression models, negative binomial regression models and Poisson regression models, which are popular methods in road accident data analysis, as these methods can find the factors associated with a road accident. Such information can be very useful for preventing road accidents by taking preventive measures at the locations of accident occurrence. Although these techniques are useful for analyzing road accident data and identifying associated factors, [13] finds that using them may lead to problems if the accident data have higher dimensions: exponential growth in the number of parameters with respect to growth in the number of variables, and invalid results of statistical tests because of sparse data in contingency tables. One of the basic requirements of regression models is that they have certain
d(X, Y) = Σ δ(Xi, Yi), i = 1 to n        (1)

where

δ(Xi, Yi) = 0 if Xi = Yi, and 1 if Xi ≠ Yi        (2)

In the above equations, Xi and Yi are the values of objects X and Y for attribute i. This distance measure is often referred to as the simple matching dissimilarity measure. Here we provide a brief description of the K-modes clustering algorithm.

K-modes clustering procedure:
In order to cluster the data set D into k clusters, the K-modes clustering algorithm performs the following steps:
1. Initially select k random objects as cluster centers or modes.
2. Find the distance between every object and each cluster center using the distance measure defined in equation (1).
3. Assign each object to the cluster whose distance from the object is minimum.
4. Select a new center or mode for every cluster and compare it with the previous value of the center or mode; if the values are different, continue with step 2.

A. Cluster Analysis

The R statistical software was used for K-modes clustering. We used the Akaike Information Criterion (AIC) [21], the Bayesian Information Criterion (BIC) [22] and the Consistent AIC (CAIC) [23] to identify the number of clusters in the data set. A total of 15 models were generated, from 1 cluster to 15 clusters. Figure 1 illustrates the evolution of BIC, AIC and CAIC for all 15 models. We can see a fall in the values of AIC, BIC and CAIC with respect to an increase in the number of clusters. Based on Figure 1, we select the model with 6 clusters, as beyond this point there is no further improvement in the values of AIC, BIC and CAIC. The K-modes clustering algorithm was then applied to obtain the six clusters, which are described in the following subsection.

B. Cluster Description

Cluster 1 involves 69% of the two-wheeler accidents, which happened in populated areas such as markets, hospitals and local colonies across highways and local roads. The road features associated with these two-wheeler accidents were intersections and bends. Two-wheeler accidents that occurred at intersections and bends on highways involved only one injured person, while two-wheeler accidents on local roads mostly involved two or more injured victims.

Cluster 2 consists of two-wheeler accidents that occurred in less populated areas, such as highways that go through hill, forest or vegetation areas. Accidents with two victims are mainly involved in this cluster.
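The K-modes procedure and the simple matching dissimilarity of equations (1)-(2) above can be sketched as follows. This is an illustrative implementation only; the toy records and attribute values are our own assumptions, not the paper's data or code.

```python
import random

def matching_dissimilarity(x, y):
    # Simple matching dissimilarity: number of attributes on which
    # the two objects take different values (equations (1)-(2))
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

def k_modes(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    modes = rng.sample(data, k)                  # step 1: k random objects as modes
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for obj in data:                         # steps 2-3: assign to nearest mode
            j = min(range(k),
                    key=lambda i: matching_dissimilarity(obj, modes[i]))
            clusters[j].append(obj)
        new_modes = []
        for j, members in enumerate(clusters):   # step 4: per-attribute most frequent value
            if not members:
                new_modes.append(modes[j])
            else:
                new_modes.append(tuple(max(set(vals), key=vals.count)
                                       for vals in zip(*members)))
        if new_modes == modes:                   # modes unchanged: converged
            break
        modes = new_modes
    return modes, clusters

# Hypothetical categorical records (road type, road feature)
data = [("highway", "bend"), ("highway", "bend"), ("local", "intersection"),
        ("local", "intersection"), ("local", "bend")]
modes, clusters = k_modes(data, 2)
```

Unlike K-means, the "center" here is a mode (the per-attribute most frequent value), which is why the method applies directly to categorical accident attributes.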
Figure 1: Cluster selection criteria

Cluster 3 consists of all accidents in which a vehicle falls down from a height. Most of these accident cases were critical, where the surrounding area was hilly and more than two victims were injured. The main road feature responsible for these accidents was a blind turn on the road. The main duration of these accidents was the morning time of 4:00 am to 6:00 am on hilly roads and 8:00 pm to 4:00 am on other roads.

Cluster 4 consists of accidents involving multiple vehicles and divider/fixed object hit cases. These accidents mostly happened at night, around 11:00 pm to 4:00 am, and were critical, whereas accidents at other times were non-critical. These accidents show the same tendency on both highways and local roads. Intersections are the main places where these accidents occurred.

Cluster 5 involves pedestrian hit accidents. Local roads involve more pedestrian hit accidents than highways. Mostly these pedestrian hit accidents occurred in markets, colonies, near hospitals and in other populated areas. Pedestrian hit accidents at night were critical, whereas those during the day were non-critical.

Cluster 6 consists of accidents involving vehicle roll-over cases. Vehicle roll-over cases were found at bends and slopes on highways. It has been observed that 80% of these accidents happened in forest and vegetation areas on local roads.

All these clusters were further analyzed using association rule mining to find correlations among the different attributes in the data. Each cluster was analyzed using the Apriori association rule mining algorithm in the WEKA 3.6 tool. For this purpose, a minimum support of 30% was taken as the threshold to identify strong rules. Although various rules were generated for every cluster, only certain rules with strong confidence and lift values were taken into consideration. The information extracted from these rules is discussed in the following subsection.

C. Description of Association Rules

The Apriori algorithm [24] has been applied on every cluster using WEKA 3.6 to generate association rules. Although several association rules are generated for each cluster, we summarize here the information conveyed by the association rules for each cluster.

1. Association rules for cluster 1
The rules show that two-wheeler accidents mainly occur on specific road segments, such as intersections, in community areas like colonies and markets. Intersections in colonies near highways are more prone to two-wheeler accidents than colonies on local roads. Also, market areas are more likely to have two-wheeler accidents with two or more injured victims. For example, the rule {Road_type = "highway" AND Road_feature = "intersection" AND Area_around = "colony"} → {Victim_injured = "1"} specifies that in cluster 1, if a two-wheeler accident occurs at an intersection in a colony near a highway, then the chance of one victim being injured is high. Another rule {Road_type = "local" AND Road_feature = "intersection" AND Area_around = "market"} → {Victim_injured = ">2"} indicates that if an accident occurs at an intersection in a market on a local road, then that accident will involve more than 2 injured victims.

2. Association rules for cluster 2
The rules indicate that forest and vegetation areas alongside certain highways are dangerous for two-wheeler accidents, as a sudden bend or slope at night can unbalance the rider and cause an accident. The association rule {Road_type = "highway" AND Area_around = "vegetation" AND Time = "Night"} → {Accident_severity = "critical"} suggests that highways that go through vegetation land are dangerous for two-wheeler accidents at night, and most of these accidents are critical.

3. Association rules for cluster 3
The rules show that most of the vehicle-fall-from-height accidents involved more than 2 injured victims. It is found that vehicles falling from a height on hilly highways cause severe accidents in which more than two persons are injured. The reason might be that the vehicle type is a four-wheeler or a similar category that transports more than 2 persons at a time. The rules also show that vehicles mostly fall from a height at hill locations due to a bend in the road. The rule {Road_type = "highway" AND Area_around = "hill" AND Road_feature = "bend" AND Victim_injured = ">2"} → {Accident_severity = "critical"} indicates that accidents at bends on hilly highways with more than two victims injured are critical accidents. It indicates that vehicles that can transport more than two persons, such as cars, taxis and buses, are more prone to fall-from-height accidents at bends on hilly highways. The drivers of such vehicles should therefore drive carefully on hilly highways, as these accidents are critical in these regions.

4. Association rules for cluster 4
The rules indicate that multi-vehicle and fixed object/divider hit accidents mostly occurred at night on highways. Intersections on highways, which are very difficult to observe at night, are another cause of such accidents. A rule {Road_feature = "intersection" AND Time = "night" AND Accident_severity =
"critical"} → {Victim_injured = ">2"} implies that accidents in this cluster involve more than two injured victims if the accidents happen at night at intersections and are critical. The rules for this cluster also indicate that these accidents show the same tendency on local as well as highway roads. Hence one should drive carefully at night and be careful at intersections, otherwise critical accidents can occur.

5. Association rules for cluster 5
The rules show that colonies and markets on local roads are the major places of pedestrian hit accidents. The rule {Road_type = "local" AND Area_around = "market" AND Victim_injured = "1"} → {Age = "Young"} indicates that if a pedestrian hit accident in a market on a local road involves one injured person, then the age of the victim involved will be young. Another rule {Road_type = "local" AND Area_around = "colony" AND Victim_injured = "2"} → {Age = "child"} indicates that if an accident in a colony on a local road involves two injured victims, then the age group involved will be the child group. The rules for this cluster suggest that markets, colonies and hospital areas are the major places where pedestrian hit cases occur. Hence pedestrians should be careful at these locations.

6. Association rules for cluster 6
The rules indicate that vehicle roll-over accidents occurred at night in forest areas and on roads near vegetation areas. A slope on a forest road and a bend in the road are the road features involved in these accidents. A forest road is more prone to vehicle roll-over accidents at night. The rule {Area_around = "forest" AND Time = "night"} → {Accident_severity = "critical"} indicates that roll-over accidents at night in forest areas are critical accidents. Another rule {Area_around = "vegetation" AND Road_feature = "bend"} → {Victim_injured = ">2"} indicates that bends on roads near vegetation areas are dangerous for vehicle roll-over accidents, and these accidents involve more than two injured victims.

V. CONCLUSION AND SUGGESTIONS

In this paper, we used association rule mining to analyze accident patterns for different types of accidents on the road network of Dehradun district in Uttarakhand state of India. Our approach used 9,640 accident records, after preprocessing, that occurred on the Dehradun district road network during 2009 to 2014. These accident records are based on information collected by the 108 ambulance service running in Uttarakhand state and contain about 17 variables, of which 13 variables were found suitable for analysis. We used K-modes clustering, which identified 6 clusters that are mainly based on accident type. Further, association rule mining was applied on these clusters to identify interesting rules that can help in understanding the circumstances of accidents in the different clusters. Although certain important rules were identified, more interesting rules could be found if more information were available in the data. As previously mentioned, in a country like India the data collection process for traffic and road accidents is not up to the mark. This data provides information that was not available with the previous data collection process by police officers.

We understand that the 108 service is a lifesaving system rather than an accident data collection system. More staff could be provided by the government to collect more information, such as the speed of the vehicle at the time of the accident, weather information and road structure information, in order to enhance research on road accidents in India, so that researchers can find much better results that can help in preventing accidents. The results can be utilized by the concerned officers of the traffic and road safety department of India to put accident prevention efforts into the areas identified for different categories of accidents in order to reduce the number of accidents.

ACKNOWLEDGMENT

The authors acknowledge GVK-EMRI Dehradun for providing the data for our research work.

REFERENCES

[1] MORTH, 2014. "Road Accidents in India 2013", New Delhi: Ministry of Road Transport and Highways Transport Research Wing, Government of India, August 2014.
[2] R. V. Ponnaluri, "Road traffic crashes and risk groups in India: analysis, interpretations, and prevention strategies", IATSS Research, 35(2), 2012.
[3] M. Parida, S. S. Jain, and C. N. Kumar, "Road traffic crash prediction on national highways", Indian Highways, Indian Road Congress, 40(6), 2012.
[4] C. N. Kumar, M. Parida, and S. S. Jain, "Poisson family regression techniques for prediction of crash counts using Bayesian inference", Procedia - Social and Behavioral Sciences, 104, 2013.
[5] R. Bandyopadhyaya, and S. Mitra, "Modelling Severity Level in Multi-vehicle Collision on Indian Highways", Procedia - Social and Behavioral Sciences, 104, 1011-1019, 2013.
[6] B. Jones, L. Janssen, and F. Mannering, "Analysis of the Frequency and Duration of Freeway Accidents in Seattle", Accident Analysis and Prevention, Elsevier, vol. 23, 1991.
[7] S. P. Miaou, and H. Lum, "Modeling Vehicle Accidents and Highway Geometric Design Relationships", Accident Analysis and Prevention, Elsevier, vol. 25, 1993.
[8] S. P. Miaou, "The Relationship between Truck Accidents and Geometric Design of Road Sections - Poisson versus Negative Binomial Regressions", Accident Analysis and Prevention, Elsevier, vol. 26, 1994.
[9] M. Poch, and F. Mannering, "Negative Binomial Analysis of Intersection-Accident Frequencies", Journal of Transportation Engineering, vol. 122, 1996.
[10] M. A. Abdel-Aty, and A. E. Radwan, "Modeling Traffic Accident Occurrence and Involvement", Accident Analysis and Prevention, Elsevier, vol. 32, 2000.
[11] S. C. Joshua, and N. J. Garber, "Estimating Truck Accident Rate and Involvements using Linear and Poisson Regression Models", Transportation Planning and Technology, vol. 15, 1990.
[12] M. J. Maher, and I. Summersgill, "A Comprehensive Methodology for the Fitting of Predictive Accident Models", Accident Analysis and Prevention, Elsevier, vol. 28, 1996.
[13] W. Chen, and P. Jovanis, "Method of Identifying Factors Contributing to Driver-Injury Severity in Traffic Crashes", Transportation Research Record 1717, 2002.
[14] L. Y. Chang, and W. C. Chen, "Data Mining of Tree-based Models to Analyze Freeway Accident Frequency", Journal of Safety Research, Elsevier, vol. 36, 2005.
[15] J. Abellan, G. Lopez, and J. Ona, "Analysis of Traffic Accident Severity using Decision Rules via Decision Trees", Expert Systems with Applications, Elsevier, vol. 40, 2013.
[16] B. Depaire, G. Wets, and K. Vanhoof, "Traffic Accident Segmentation by means of Latent Class Clustering", Accident Analysis and Prevention, Elsevier, vol. 40, 2008.
[17] V. Rovsek, M. Batista, and B. Bogunovic, "Identifying the Key Risk Factors of Traffic Accident Injury Severity on Slovenian Roads using a Non-parametric Classification Tree", Transport, Taylor and Francis, 2014.
[18] T. Kashani, A. S. Mohaymany, and A. Rajbari, "A Data Mining Approach to Identify Key Factors of Traffic Injury Severity", Promet - Traffic & Transportation, vol. 23, 2011.
[19] P. N. Tan, M. Steinbach, and V. Kumar, "Introduction to Data Mining", Pearson Addison-Wesley, 2006.
[20] J. Han, and M. Kamber, "Data Mining: Concepts and Techniques", Morgan Kaufmann Publishers, USA, 2001.
[21] H. Akaike, "Factor analysis and AIC", Psychometrika, 52, 317-332, 1987.
[22] A. E. Raftery, "A note on Bayes factors for log-linear contingency table models with vague prior information", Journal of the Royal Statistical Society, Series B 48, 249-250, 1986.
[23] C. Fraley, and A. E. Raftery, "How many clusters? Which clustering method? Answers via model-based cluster analysis", The Computer Journal 41, 578-588, 1998.
[24] R. Agrawal, and R. Srikant, "Fast Algorithms for Mining Association Rules in Large Databases", Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487-499, 1994.
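The rule-quality measures used in the analysis above (support, confidence and lift) can be computed directly. The toy records below are hypothetical stand-ins for the accident attributes, and the rule shown is an invented example, not one of the paper's rules.

```python
# Hypothetical accident-attribute "transactions"; both the records and the
# threshold only loosely mirror the paper's setup (30% minimum support).
records = [
    {"local", "intersection", "market"},
    {"local", "intersection", "colony"},
    {"highway", "bend", "vegetation"},
    {"local", "intersection", "market"},
    {"highway", "intersection", "colony"},
]

def support(itemset):
    # Fraction of records containing every item of the itemset
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent):
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    # Lift > 1 means antecedent and consequent co-occur more than expected
    return confidence(antecedent, consequent) / support(consequent)

# Example rule {local, intersection} -> {market}
a, c = {"local", "intersection"}, {"market"}
sup_val = support(a | c)      # 2/5 = 0.4, above a 30% minimum support
conf_val = confidence(a, c)   # 2/3
lift_val = lift(a, c)         # (2/3) / (2/5) = 5/3
```

A tool such as WEKA's Apriori reports exactly these measures per rule; the thresholds on support, confidence and lift determine which rules survive, as in the paper.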
Distribution Methods of Web Pages and Testing
Abstract—Publishing the contents through internet we need processes are unable to work together with the primary
a medium i.e Web browser. Hyper Text Markup Language is method and have to use system calls grant by the Kernel of
very helpful for writing the web contents and by the help of web browser. Primary process links with the other processes will
browsers we publish our page on net. It is invented as an communicate with each other using inter-process
application so that people can take advantage to view immobile communications with the support of browser kernel. Other
web pages in succession. Technologies improves and the web sites
gain more advance features and have dynamic web applications
than Opera Mini, Kernel of browsers offers the same shield to
with the exchanging of contents from other websites, similarly hook up contents as to the standard network content. These
browsers features have turn out to be multi-principal operating types of multi-principal operating system creation for a
surroundings with shared resources with commonly trustable browser fetch considerable security and dependability benefits
web site principals. Internet Explorer, Google Chrome, Mozila to the whole browser scheme: the negotiation or failure of
Firefox and many more have a multiple key principal operating rules affects the Browser Kernel.
system assembly that gives a facility of web browser. The
constrained control to supervise the protection of all machine A group of instructions for retrieving, traversing and
resources for the collected web page are based on the defined rules. An intranet is not a public network: its use is limited to one organization, although it follows the same technical principles and procedures as the public Internet. The Internet is the most obvious venue for publishing web pages, but we may need to limit the sharing of our pages to a local intranet inside the group instead of making them available to the public. We can also share our pages from the local storage of a personal computer or a server. This paper covers the available publishing options, offers guidance on web page design and distribution methods, and covers basic methods for testing published web pages.

Keywords— Web Browsers; Web Pages; Web Server; Testing

I. INTRODUCTION

Publishing web pages through web browsers has grown into a primary operational activity whose key artefact is the web site [11]. As in a multi-functional operating system, modern proposals [2, 3, 6, 11, 12] and browsers such as Internet Explorer 8 and 9 [8] and Firefox 3 [4] support generalized mechanisms for two-way communication, e.g. postMessage and frame protection for web developers. On the other hand, no existing browser, including newer architectures such as Internet Explorer, Google Chrome and Opera Mini, has a multi-functional operating-system construction built around a browser-based operating system called a browser kernel, in which a restricted controller manages the protection and sharing of system resources consistently with browser principals [5, 7, 9].

The browser kernel interacts directly with the elementary parts of the operating system and exposes a set of structures on behalf of browser principals. We draw the isolation boundary between distinct browser principals, defined by the origin policy [1, 10] as the combination of protocol, domain name and port, using sandboxed operating-system processes. Due to the sandboxed

The software for presenting information resources on the web is a computer program known as a web browser. A source of information is identified by a Uniform Resource Locator (URL) or Uniform Resource Identifier (URI), and it can be a shared web page, a video, an image or any other type of content. Hyperlinks in a resource let users navigate their browsers to the related resources. Web browsers include Nexus (1990, the first web browser, by Tim Berners-Lee), Erwise (the first web browser with a graphical user interface), Mosaic (1993, Marc Andreessen), Netscape Navigator (1994), Microsoft Internet Explorer (1995), Opera (1996), Mozilla Firefox (1998, later 2011), Apple's Safari (January 2003), Google Chrome (September 2008), and Internet Explorer 8 (December 2011) [13].

Early browsers such as Netscape Navigator and Mosaic were simple software applications that provided HTML rendering, bookmarks and processed input. As websites grow, so do the demands they place on the browser. Today's browsers are far more advanced and support several web technologies, such as PHP, XHTML, HTML5 and active JavaScript, while secure websites add encryption. Advanced browsers let web developers build well-designed interactive websites: Ajax, for example, allows a browser to load updated information without reloading the contents of the page, and advances in CSS let the browser render visual effects in responsive web designs. Cookies allow the browser to remember your settings for a specific page within a website. A website sometimes works very well in one browser but does not function efficiently in another, so it is sensible to install several browsers and check which gives the best rendering of your pages. Although browsers are chiefly intended for the World Wide Web, they can also access information made available by web servers on local area networks [14].
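The point that browsers can reach web servers on a local area network matches the publishing option of serving pages straight from a personal computer. A minimal sketch, assuming Python's built-in http.server is acceptable for the intranet case (the port-0 choice, which asks the operating system for any free port, is our convenience, not anything from the paper):

```python
import http.server
import socketserver
import threading
import urllib.request

# Serve files from the current working directory. Port 0 asks the OS
# for any free port; a fixed port such as 8000 is typical on an intranet.
handler = http.server.SimpleHTTPRequestHandler
httpd = socketserver.TCPServer(("", 0), handler)
port = httpd.server_address[1]

# Run the server in the background; machines on the same LAN can now
# browse to http://<this-machine's-address>:<port>/
threading.Thread(target=httpd.serve_forever, daemon=True).start()

# Quick self-check from the publishing machine itself.
status = urllib.request.urlopen(f"http://localhost:{port}/").status
print(status)  # 200 when the directory (or its index.html) is served
```

Stopping the server is a call to httpd.shutdown(); on a real intranet a fixed, firewall-permitted port is the more usual choice.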
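The isolation boundary sketched in the introduction is keyed on a principal's protocol, domain name and port. As a hypothetical illustration only (the origin function and its default-port table are ours, not the paper's), that triple can be computed with Python's standard urllib.parse:

```python
from urllib.parse import urlsplit

def origin(url: str):
    """Return the (protocol, domain name, port) triple that origin-based
    policies compare when isolating web content."""
    parts = urlsplit(url)
    # Fall back to the scheme's default port when none is given explicitly.
    default_ports = {"http": 80, "https": 443}
    port = parts.port or default_ports.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

# Two URLs on the same host but different ports are distinct origins.
print(origin("http://example.com/page.html"))       # ('http', 'example.com', 80)
print(origin("http://example.com:8080/page.html"))  # ('http', 'example.com', 8080)
```

Two URLs agreeing on all three components belong to the same principal; differing in any one of them places them in separate sandboxes under such a policy.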
Fig. 8. CuteFTP hosting default settings dialog box

2. Make sure the testing computer is set to a 16-colour video mode or, at most, a 256-colour mode. Also try several different brightness settings on your monitor, and change its colour balance to make sure your pages still look acceptable on displays that are more blue or red than yours.

REFERENCES

[1] H. J. Wang, C. Grier, A. Moshchuk, S. T. King, P. Choudhury, and H. Venter, "The Multi-Principal OS Construction of the Gazelle Web Browser," MSR Technical Report MSR-TR-2009-16, Microsoft Research; University of Illinois at Urbana-Champaign; University of Washington, 2009.
[2] D. Crockford, "JSONRequest," http://www.json.org/jsonrequest.html.
[3] D. Crockford, "The Module Tag: A Proposed Solution to the Mashup Security Problem," http://www.json.org/module.html.
[4] "Firefox 3 for developers," 2008, https://developer.mozilla.org/en/Firefox_3_for_developers.
[5] C. Grier, S. Tang, and S. T. King, "Secure web browsing with the OP web browser," in Proceedings of the 2008 IEEE Symposium on Security and Privacy, 2008.
[17] http://www.ccsf.edu/Pub/Fac/composer.html
[18] http://en.wikipedia.org/wiki/Microsoft_FrontPage#mediaviewer/File:Microsoft-FrontPage-screenshot.png
[19] http://www.a2hosting.com/kb/getting-started-guide/publishing-your-web-site/publishing-your-web-site-with-microsoft-frontpage
[20] http://support.lanset.net/pages/conf_2.html
Author Index
Abderrahmane, Daif 78
Abuna, Felix 87
Adewumi, Adewole 85, 86
Agarwal, Akash 17
Agba, Basile 17
Ahmed, Abd El-Aziz 64
Ahmed, Mahmood 64
Akingbesote, Alaba 46
Akinola, Ayotuyi 46
Annamalai, Senthamarai Selvan 38
Appavoo, Perianen 69
Ashley-Dejo, Ebunoluwa 26, 28
Asowata, Osamede 5
Catherine, Clarel 12
Catherine, Pierre 15, 16
Chavan, Mahesh 90, 92
Cheema, Amarjeet 27
Chen, Chien-Hung 21
Cherutich, Peter 87
Chetty, Naganna 75
Chien, Wu-Fan 30
Chikudo, Admiral 68
Chiu, Chien-Ching 21
Chougule, Sharada 92
Christo, Pienaar 5
Chuku, Peter 70
Coonjah, Irfaan 12, 15, 16
Curum, Brita 19
El Jamiy, Fatima 78
Engelbrecht, Klarissa 33
Esan, Omobayo 28, 36
Farquhar, Carey 87
Flores, Denys A. 40
Foogooa, Ravi 45
Hamza, Nermin 64
Hans, Robert 63
Hefny, Hesham 64
Ho, Min-Hui 21
Horvat, Matija 24
Hosanee, Yeeshtdevisingh 79
Hosseinzadeh, Shohreh 71
Hyrynsalmi, Sami 71
Ierache, Jorge 60
Jaufeerally-Fakim, Yasmina 68
Jena, Sanjay Kumar 18
Jones, Brian 48
Juddoo, Suraj 13
Jutton, Teenah 69
Ka, Selvaradjou 38
Kekana, Johannes 63
Khedo, Kavi Kumar 19, 42
Kim, Hyoungjun 44
L. K. Cheung, Jonathan 74
Labeau, Fabrice 17
Laha, Sumit 82, 83
Langueh, Kokou 20
Le Roux, Petra 49
Lee, Jaehee 44
Lee, Kyungho 44
Leppänen, Ville 71
Lerato, Masupha 28
Liao, Shu-Han 21
Limthong, Kriangkrai 50
Liu, Han-Wen 30
Loock, Marianne 49
M. Abdou, Mohamed 56
Macharia, Paul 87
Maharaj, Manoj 3
Mahlobogwane, Zanele 39
Maiti, Sumana 10
Majhi, Banshidhar 18, 22
Mansur, Vidya 58
Manyere, Peter 51
Masupha, Lerato 36
Medhi, Nabajyoti 6
Migabo, Emmanuel 57
Mishra, Anurag 9
Misra, Sanjay 85, 86, 88, 89
Mocktoolah, Asslinah 42
Mohamed, Hossam 56
Mohapatra, Ramesh Kumar 18
Montenegro, Carlos W. 40
Muhammad, Najam Ul Islam 1
Muhongya, Kambale Vanty 3
Mujahid, Umar 1
Murthy, B. K. 27
Mvelase, Promise 67
Pal, Manjish 6
Panchoo, Shireen 61, 79
Patel, Charmy 7
Paupiah, Pravin Selukoto 11
Peeroo, Swaleha 48
Peng, Yong 53
Phate, Thato 57
Rajpal, Ankit 9
Ramjug-Ballgobin, Rajeshree 54, 55
Ras, Dirk 72
Rauti, Sampsa 71
Refaie, Rasha 64
Renke, Amar 90
Richomme, Morgan 74
Robberts, Michelle 33
Roy, Rinita 82
Rughooputh, H. C. S. 55
Sack, Pablo 60
Sambai, Betsy 87
Samy, Martin 48
Saxena, Deepika 84
Saxena, Shilpi 84
Sayed Hassen, Sayed Zahiruddeen 54
Schoeman, Ruaan 5
Schwalke, Udo 59
Sharma, Anju 35
Sharma, Shivani 91
Sheel, Neelaksh 94
Shettar, Rajashekar 58
Singh, Omesh 23
Singh, Upasana 23, 25, 41
Singha Roy, Sayantan 62
Singhal, Rekha 8
Skvorc, Dejan 24
Sm, Ngwira 28
Sohoraye, Mrinal 76
Sonawane, M.S. 88
Sotenga, Prosper 70
Soyjaudah, K. M. S. 12, 15, 16
Spies, Chel-Mari 81
Spies, Jan 80
Srivastav, Vinita 89
Srivastava, Praveen 27
Srivastava, Siddharth 27
Stone, Kyle 80
Sudarshan, Sithu D 75
Suddul, Geerish 68, 74
Sungkur, Roopesh 41
Ul-Ain, Qurat 1
Wang, Jidong 4
Wang, Xiaoyi 53
Yao, Yuangang 53
Yoon, Hyunsik 44
Zhan, Zheng 53
Zhang, Xinyang 4
Zhao, Xianghui 53
Zheng, Gang 20
Zuva, Tranos 28, 29, 31, 32, 36
Keyword Index
6LoWPAN 17
8-connected neighbor 82
802.11n 70
ABCD matrices 34
Access Categories (ACs) 70
Accident Analysis 93
Activation value 89
Ad hoc cloud mobile 46
Adaptive Field-Effect Transistor 59
Adaptive Neuro Fuzzy 54
Algorithm indexing 51
Android 24, 86
ANFIS 37
anomaly detection 50
Architecture 78
artificial noise injection 2
assessment 41
assistant node 4
Association Rule Mining 62, 93
attributes reduction 75
Automotive 11
Data cleansing 13
Data Mining 14, 25, 56, 75, 91, 93
Data profiling 13
Data quality 13
Data quality dimensions 13
Data quality metrics 13
Data quality rules 13
Database 77
Database security 64
DDoS 10
design aspects 81
Development Platform 68
Differential power analysis 47
Digital forensics 72
Digital signal operations 58
digital television 29
Discrete Cosine S-Transform 18
Discrete wavelet transform (DWT) 22
diversification 71
E-assessment 31, 32
e-assessments 25
e-CRM 78
e-health 63
e-tutoring 49
Ear 77
education 19
EM Radiation 90
Embedding algorithm 82
emulator 66
Energy Efficiency 39
Enforcing Mechanism 27
engage 79
Enhanced Distributed Channel Access (EDCA) 70
enterprise resource planning (ERP) systems 23
Enterprise systems 43
ERP modules 23
ERP procurement 23
ERP selection 23
ethical tracking 80
evaluation framework 23
EXIF metadata 24
Extreme Learning Machine 9
Forensic monitoring 72
FPGA 1
Framework 33
free convection 56
ftp 94
Full-duplex 57
Fusion 31
Fuzzy 54
Fuzzy-PID 54
Hadoop 62
Half-duplex 57
health 63
heat source/sink 56
Hidden units 89
hierarchical wireless sensor networks 4
High-K dielectric 47
higher order thinking skills (HOTS) 25
Homomorphic encryption 64
hosting 94
Hybrid Feedback 28
hybrid method 37
Hypervisor 72
ICT 40
identity development 49
Image Steganography 82
Immersion 20
Implicit Feedback 28
impulsive noise environment 17
Informatic Security 60
information security awareness 49
Innovative video technology 69
inter-user interference 57
Internet of Things 71
Interval arithmetic 58
Intrusion detection and prevention system 35
IoT 71
IP Spoofing 10
ISA100.11a 17
iso 40
IT 40
IT security 44
k-Coverage 6
K-Means 14
Kernel principal component analysis (KPCA) 22
key management 4
Knowledge discovery 56
Knowledge management 43
law 76
LBP 73
learning analytics 25
Learning mathematics 69
Least squares support vector machine (LS-SVM) 22
licensed number plate 52
Line Spectral Pair (LSP) 92
Linear Predictive Cepstral Coefficients (LPCCs) 92
linear regression 7
live video streaming 24
Load Frequency Control 54
Local Feature 73
location privacy challenges 42
location-based 63
machine learning 50
Magnetic resonance imaging (MRI) 22
management 40
Map reduce 91
MapReduce 62
massively-parallel analog computations 65
Maturity models 45
Mauritius 76
Mel Frequency Cepstral Coefficients (MFCCs) 92
memcapacitor 66
meminductor 66
memristor 65, 66
Metrics 60
mHealth 81, 87
micro-services 74
Microblog 53
MIMO 57
MIMO-WLAN 21
mixed strategy 57
MJPEG 24
MLP 88
MNIST dataset 18
Mobile ad hoc networks 38
mobile application 63
Mobile application 86
mobile applications 74
Mobile Devices 41
Mobile knowledge workers 33
mobile learning 19, 49
mobile technology 63
Mobile worker perceptions 33
Mobile worker requirements 33
Mobile workforce 33
model 40, 65
MONOMI 64
Moore's Law 59
Multimodal Biometrics 32
multiple timeline 50
Mutual Authentication 1
NABH 27
NACO 27
Naïve Bayes 14
nearest neighbor 88
Neural network 88
neural network 52
Neural Network 59
Nigerian education 85
Normalized Correlation (NC) 9
NSA 10
obfuscation 71
Object-Oriented Programming (OOP) 79
Observability Singularity 20
observer 55
OCR 88
Online Sequential 9
Online social network 3
OpenSSH 12, 16
OpenVPN 12, 15
orientation angles 5
Parallelization 91
Parallelism 51
PCA 73
Peak Signal to Noise Ratio 82
performance evaluation 36
performance measurements 7
Performance testing 7
Personal of Interest analysis 30
physical layer secrecy 2
Polycrystalline 5
popularity 53
porous medium 56
portable device 29
Power Line Communication 34
practices 40
prediction 53, 75
Preprocessing 14
Privacy 71, 76, 80, 87
privacy solutions 42
Professional Donor 27
proficiency level 79
Protocol to Access White Spaces 29
proximity based social networking 42
PSNR 9
PSO and BPN methods 37
public concern 90
Q format representation 58
Quality of Service (QoS) 70
quantum algorithm 83
quantum-inspired cuckoo search algorithm 83
query processing 64
RDBMS 8
Recommender system 26
Recommender Systems 28
Reconfigurable Logic 59
Relationship marketing 48
reliable communication 17
Residential Exposure 90
retweet 53
Reuleaux Tetrahedron 6
RFID 1
Road Accidents 93
rotation matrix 24
Rough sets 56
Rule extraction 89
Rule Induction 56
SASI 1
Secure data transmission 20
Secure indexes 64
Security 4, 35, 42, 71, 87
Security Controls 60
self-interference 57
Sensor Nodes 39
Serious games 41
server mobile machine 46
service computing 67
service delivery models 85
service selection 46
SFSA 88
simulated annealing 83
single CCI ray-tracing approach 21
Sixsoid 6
smart card 47
SMMEs 46, 67
social computing 49
Social media 48
Social Networks 30
Social Personal Analysis 30
Socio-constructivism 61
Software Agent 68
software tools 79
Spectrum Sensing 38
SPICE 65
SSAR data 51
SSIM 9
staff availability 80
staff tracking 80
standard 40
Star Topology 34
student feedback 41
suitability 85
Supply chain 43
Sustainable Development 45
SWOT analysis 85
Tasks scheduling 84
Taxi booking 86
TCP tunnel 15
Testing 94
Throughput 70
tilt 5
traditional health practitioner 63
Transmission Line 34
Transport Protocol 38
transportation 86
Transportation company 43
Travelling salesman problem 83
Tunneling 12, 16
TV whitespace 29
UDP tunnel 15
Ultralightweight 1
Unknown malicious code detection model 44
variable permeability 56
vehicle security 11
video rotation 24
Virtual Enterprise 67
Virtual Machine 84
visualization 3
VPN 12, 15, 16
web 80
web API 74
weighting technique 50
WEKA tool 14
White space device 29
wireless sensor networks 2
Wireless Sensor Networks 6, 17, 39
WirelessHART 17
Workflow 84
WSR 57
XOR 82
YouTube Education 69
Zigbee 17
Sponsored by: IEEE Mauritius Subsection
ISBN 978-1-4673-9353-9