

Volume 1 & 2

The Proceedings
Of 2nd National Conference on Innovation and Entrepreneurship in Information and Communication Technology May 14-15, 2011


Dr. Anil Kumar Pandey, Convener, SIG-IEICT

Editors
1. Dr. Saba Hilal, GNIT-MCA Institute
2. Mr. Pradeep Agrawal, GGIT
3. Dr. S.K. Pandey, GGIT
4. Dr. Shikha Jalota, GGIT
5. Mr. Ankit Shrivastava, GGIT
6. Ms. Monika, GNIT-MCA Institute
7. Ms. Jyoti Guglani, GGIT
8. Mr. Amit Kumar, GGIT

Copyright 2011. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means without the written permission of the Special Interest Group on Innovation and Entrepreneurship in ICT (IE-ICT).

India is one of the youngest nations in the world, with millions of people of employable age, and conventional methods alone will not be adequate to employ them. Innovation and entrepreneurship are key to job creation and national competitiveness. Technological advancement is accelerating, and developed economies are deriving economic dividends from it to create wealth and improve the efficiency of public services and processes. Throughout the developing world, innovative entrepreneurs are working to establish ICT-enabled businesses. India is at the threshold of a new takeoff: the Indian IT industry is likely to clock revenues of over USD 70 billion by the end of this year and USD 220 billion by 2020. This is why innovation, entrepreneurship and the ability to combine the two in the domain of ICT become of immense importance. ICT has touched our society in all walks of life, from education, employment, healthcare, communication, governance, business and banking to defence and disaster management. This has been possible due to the innovative entrepreneurial ventures that have come up using ICT and ICT-enabled services. However, in countries like India, taking ICT to rural areas and integrating it with agriculture, sanitation and village-level micro enterprises is yet to be developed and commercialised. University- and institute-based innovators routinely produce breakthrough technologies that, if commercialised by industry, have the power to sustain economic growth; however, this is not happening. In its absence, the business world is witnessing the rise of student entrepreneurs who begin their entrepreneurial journey at a very early stage, while still pursuing their education. To help innovators and entrepreneurs create ventures, academic institutions including schools, government, businesses and investors must work together.
A clear knowledge of incubation support for turning their ideas into start-ups would help economic growth as well as job opportunities for the millions of youths of this country. This conference has been organized with the objective of bringing students, faculty members, IT professionals, government agencies, venture capital agencies, innovation foundations and social entrepreneurs onto a common platform, to elicit and explore the sources of innovation that exist among the upcoming and informal sections of the population, who can be converted into the entrepreneurs of tomorrow through the interventions generated here. It has the following aims:
1. To provide a platform for students to meet and interact with innovators, successful entrepreneurs, government agencies and venture capitalists.
2. To develop insight for integrating entrepreneurial experiences into the formal education process.
3. To empower and sensitize students and faculty towards wealth creation through innovation and entrepreneurship.
4. To give the basic knowledge and tools for setting up a technology-driven enterprise.
5. To invoke the drive and motivation to convert intellectual property rights into enterprise through innovative management.
The Special Interest Group (SIG) on Innovation and Entrepreneurship in ICT, the CSI Ghaziabad Chapter and Mahamaya Technical University have joined hands to organize this conference. It is a matter of great satisfaction that the response, in terms of research papers as well as participation, has been overwhelming and nationwide. Distinguished academicians, scholars from universities, government departments and entrepreneurs are participating to make it meaningful. We present here all the papers, presentations and abstracts submitted by the authors. In future we intend to have them reviewed by a third party and published.

Dr. Anil Kumar Pandey, Editor-in-Chief



Volume 1
New Technology Acceptance Model to Predict Adoption of Wireless Technology in Healthcare. Manisha Yadav, Gaurav Singh, Shivani Rastogi. p. 9
Solutions to Security and Privacy Issues in Mobile Social Networking. Nikhat Parveen, Danish Usmani. p. 15
Wireless Monitoring of the Greenhouse Using ATmega Based Monitoring System: WSN Approach. Vrushali R. Deore, Prof. V. M. Umale. p. 21
Fuzzy C-Mean Algorithm Using Different Variants. Vikas Chaudhary, Kanika Garg, Arun Kr. Sharma. p. 27
Security Issues in Data Mining. Rajeev Kumar, Pushpendra Kumar Singh, Arvind Kannaujia. p. 38
Web Caching Proxy Services: Security and Privacy Issues. Anoop Singh, Rohit Singh, Sushma Sharma. p. 42
A Comparative Study to Solve Job Shop Scheduling Problem Using Genetic Algorithms and Neural Network. Vikas Chaudhary, Kanika Garg, Balbir Singh. p. 48
Innovation & Entrepreneurship in Information and Communication Technology. Deepak Sharma, Nirankar Sharma, Nimisha Srivastava. p. 55
Insider Threat: A Potential Challenge for the Information Security Domain. Abhishek Krishna, Santosh Kumar Smmarwar, Jai Kumar Meena, Monark Bag, Vrijendra Singh. p. 57
Search Engine: Factors Influencing the Page Rank. Prashant Ahlawat, Hitesh Kumar Sharma. p. 63
Are the CMMI Process Areas Met by Lean Software Development? Jyoti Yadav. p. 68
Password Protected File Splitter and Merger (with Encryption and Decryption). Shikha Saxena, Rupesh Kumar Sharma. p. 73
Security Solution in Wireless Sensor Network. Pawan Kumar Goel, Bhawnesh Kumar, Vinit Kumar Sharma. p. 77
Vertical Perimeter Based Enhancement of Streaming Application. P. Manikandan, R. Kathiresan, Marie Stanislas Ashok. p. 82
Orthogonal Frequency Division Multiplexing for Wireless Communications. Meena G. Shende. p. 87
A Comprehensive Study of Adaptive Resonance Theory. Vikas Chaudhary, Avinash Dwivedi, Sandeep Kumar, Monika Bhati. p. 91
Is Wireless Network Purely Secure? Shikha Saxena, Neetika Sharma, Rachana Singh. p. 98
An Innovative Digital Watermarking Process: A Critical Analysis. Sangeeta Shukla. p. 104
Design of a Reconfigurable SDR Transceiver Using LabVIEW. Sapna Suri, Vikram Verma, Rajni Raghuvanshi and Pooja Pathak. p. 111
A Modified Zero Knowledge Identification Scheme Using ECC. Kanika Garg, Dr. R. Radhakrishan, Vikas Chaudhary, Ankit Panwar. p. 116
Security and Privacy of Conserving Data in Information Technology. Suresh Kumar Kashvap, Pooja Agrawal, Minakshi Agrawal, Vikas Chandra Pandey. p. 120
Barriers to Entrepreneurship: An Analysis of Management Students. Dr. Pawan Kumar Dhiman. p. 126
A Novel Standby Leakage Power Reduction Method Using Reverse Body Biasing Technique for Nanoscale VLSI Systems. James Appollo A. R., Tamijselvan D. p. 131
Survey on Decision Tree Algorithm. Jyoti Shukla, Shweta Rana. p. 136
Distributed Security Using Onion Routing. Ashish T. Bhol, Savita H. Lambole. p. 141
Virtualization Implementation in an Enterprise. Rohit Goyal. p. 146
Design of Data Link Layer Using WiFi MAC Protocols. K. Srinivas (M.Tech). p. 149
Leveraging Innovation for Successful Entrepreneurship. Dr. Sandeep Kumar, Sweta Bakshi, Ankita Pratap. p. 153
Performance Evaluation of Cache Replacement Algorithms for Cluster Based Cross Layer Design for Cooperative Caching (CBCC) in Mobile Ad Hoc Network. Madhavarao Boddu, Suresh Joseph K. p. 165
Empowerment and Total Quality Management for Innovation and Success in Organisations. Shamsi Sukumaran K., Bableen Kaur. p. 178
A New Multiple Snapshot Algorithm for Direction of Arrival Estimation Using Smart Antenna. Lokesh L., Sandesha Karanth, Vinay T., Roopesh, Aaquib Nawaz. p. 185
Quality Metrics for TTCN-3 and Mobile Web Application. Anu Saxena, Kapil Saxena. p. 190
A Unique Pattern Matching Algorithm Using the Prime Number Approach. Nishtha Kesswani, Bhawani Shankar Gurjar. p. 194
Study and Implementation of Power Control in Ad Hoc Networks. Animesh Srivastava, Vanya Garg, Vivekta Singh. p. 197
Improving the Performance of Web Log Mining by Using K-Means Clustering with Neural Network. Vinita Srivastava. p. 203
Higher Education Through Entrepreneurship Development in India. Mrs. Vijay. p. 208
Concepts, Techniques, Limitations and Applications of Data Mining. S. C. Pandey, P. K. Singh, D. Dubey. p. 210
ICT for Energy Efficiency, Conservation & Reducing Carbon Emissions. Aakash Mittal. p. 212
Study of Ant Colony Optimization for Proficient Routing in Solid Waste Management. Aashdeep Singh, Arun Kumar, Gurpreet Singh. p. 213
Survey on Decision Tree Algorithm. Jaya Bhushan, Shweta Rana, Indu. p. 215

Volume 2
Analysis of Multidimensional Modeling Related to Conceptual Level. Udayan Ghosh, Sushil Kumar. p. 222
Wireless Sensor Networks Using Clustering Protocol. Gurpreet Singh, Shivani Kang. p. 229
Performance Evaluation of Route Optimization Schemes Using NS2 Simulation. Manoj Mathur, Sunita Malik, Vikas. p. 235
IT-Specific SCM Practices in Indian Industries: An Investigation. Sanjay Jharkharia. p. 239
Cloud Computing. Parveen Sharma, Manav Bharti University, Solan, Himachal Pradesh. p. 259
Comparing the Effect of Hard and Soft Threshold Techniques on Speech Compression Using Wavelet Transform. Sucheta Dhir. p. 263
A Hybrid Filter for Image Enhancement. Vinod Kumar, Kaushal Kishore and Dr. Priyanka. p. 269
Comprehensive Study of Finger Print Detection Technique. Vivekta Singh, Vanya Garg. p. 273
Study of Component Based Software Engineering Using Machine Learning Techniques. Vivekta Singh. p. 281
Efficient Location-Based Spatial Query (LBSQ) Processing in Wireless Broadcast Environments. K. Madhavi, Dr. Narasimham Challa. p. 285
A 3D Face Recognition Using Histograms. Sarbjeet Singh, Meenakshi Sharma, Dr. N. Suresh Rao, Dr. Zahid Ali. p. 291
An Application of Eigen Vector in Back Propagation Neural Network for Face Expression Identification. Ahsan Hussain. p. 295
Next Generation Cloud Computing Architecture. Ahmad Talha Siddiqui, Shahla Tarannum, Tehseen Fatma. p. 299
Virtualization of Operating System Using Xen Technology. Annu Dhankhar, Siddharth Rana. p. 304
Quality Metrics for TTCN-3 and Mobile-Web Applications. Anu Saxena, Kapil Saxena. p. 308
Future of ICT Enabled Services for Inclusive Growth in Rural Unprivileged Masses. Bikash Chandra Sahana, Lalu Ram. p. 312
Conversion of Sequential Code to Parallel: An Overview of Various Conversion Methods. Danish Ather, Prof. Raghuraj Singh. p. 314
Innovation and Entrepreneurship in Information and Communication Technology. Deepak Sharma, Nirankar Sharma, Nimisha Shrivastava. p. 320
Fuzzy Classification on Customer Relationship Management. Mohd. Faisal Muqtida, Ashi Attrey, Diwakar Upadhyay. p. 322
New Technology Acceptance Model to Predict Adoption of Wireless Technology in Healthcare. Gaurav Singh, Manisha Yadav, Shivani Rastogi. p. 330
Entrepreneurship Through ICT for Disadvantaged Communities. Geetu Sodhi, Vijay Gupta. p. 335
Efficient Location-Based Spatial Query (LBSQ) Processing in Wireless Broadcast Environments. K. Madhavi, Dr. Narasimham Challa. p. 342
K-Means Clustering Algorithm with High Performance Using Large Data. Vikas Chaudhary, Vikas Mishra, Kapil. p. 350
Performance Evaluation of Route Optimization Schemes Using NS2 Simulation. Manoj Mathur, Sunita Malik, Vikas. p. 355
Image Tracking and Activity Recognition. Navneet Sharma, Divya Dixit, Ankur Saxena. p. 358
An Innovative Digital Watermarking Process: A Critical Analysis. Sangeeta Shukla, Preeti Pandey, Jitendra Singh. p. 361
Survey on Decision Tree Algorithm. Shweta Rana. p. 368
Comparing the Effect of Hard and Soft Threshold Techniques on Speech Compression Using Wavelet Transform. Sucheta Dhir. p. 374
Improving the Performance of Web Log Mining by Using K-Means Clustering with Neural Network. Vinita Shrivastava. p. 379
A Hybrid Filter for Image Enhancement. Vinod Kumar, Kaushal Kishore, Dr. Priyanka. p. 385
Trends in ICT Track: Software Development & Deployment (Agile Methodologies). Shubh, Priyanka Gandh, Manju Arora. p. 390
Vulnerabilities in WEP Security and Their Countermeasures. Akhilesh Arora. p. 400
Implementation of Ethernet Protocol and DDS in Virtex-5 FPGA for Radar Applications. Garima Chaturvedi, Dr. Preeta Sharan, Peeyush Sahay. p. 408
CCK Coding Implementation in IEEE 802.11b Standard. Mohd. Imran Ali. p. 413
Cognitive Radio and Management of Spectrum. Prof. Rinkoo Bhatia, Narendra Singh Thakur, Prateek Bhadauria, Nishant Dev. p. 416
Impact of MNCs on Entrepreneurship. Ms. Sonia. p. 428
Multilayered Intelligent Approach: A Hybrid Intelligent System. Neeta Verma, Swapna Singh. p. 437
Green ICT: A Next Generation Entrepreneurial Revolution. Pooja Tripathi. p. 441
An Innovation Framework for Practice-Predominant Engineering Education. Om Vikas. p. 450
Mobile Ad-hoc Network. Apoorv Agarwal, Apeksha Aggarwal. p. 460
Fuzzy C-Mean Clustering Algorithm. Arun Kumar Sharma. p. 470
A Comparative Study of Web Security Protocols. Hanish Kumar. p. 475
E-Village: A New Mantra for Rural Development. S. K. Mourya. p. 481
Green ICT: A Next Generation Entrepreneurial Revolution. Prof. Pooja Tripathi. p. 484
Role of 21st Century ICT: Need of the Day. Saurabh Choudhry. p. 486
Reusability of Software Components Using Clustering. Meenakshi Sharma, Priyanka Kakkar, Dr. Parvinder Sandhu, Sonia Manhas. p. 487

PART - 1

New Technology Acceptance Model to Predict Adoption of Wireless Technology in Healthcare

Abstract: The adoption of new technologies has been researched in the Information Systems (IS) literature for the past two decades, starting with the adoption of desktop computer technology and moving to the adoption of electronic commerce technology. Issues that have been researched include how users handle the various options available in a software environment, their perceived opinions, barriers and challenges to adopting a new technology, and IS development procedures that directly impact adoption, including interface designs and human factors. However, the literature indicates that models proposed in the IS literature, such as the Technology Acceptance Model (TAM), are not suitable in specific settings for predicting the adoption of technology. Studies in the past few years have strongly concluded that TAM is not suitable in the healthcare setting because it does not consider the myriad of factors influencing technology adoption in healthcare. This paper discusses the problems in healthcare due to poor information systems development and the factors that need to be considered while developing healthcare applications, as these are complex and different from traditional MIS applications, and derives a model that can be tested for the adoption of new technology in healthcare settings. The contribution of this paper is in building theory that is not available in the combined areas of Information Systems and healthcare.

Index Terms: healthcare, Information Systems, adoption factors.


I. INTRODUCTION: The Institute of Medicine (IOM) in the United States has recognized that frontier technologies such as wireless technology would improve access to information in order to achieve quality health care. A report released by the IOM in 2003 outlined a set of recommendations to improve patient safety and reduce errors using reporting systems based on Information Systems (IS). While it is widely accepted that IS assists health-related outcomes, how this can be efficiently achieved is an under-researched area; healthcare studies therefore report conflicting outcomes as to the successful role of IS. In essence, research is needed to investigate the role, and the use, of frontier technologies in improving information management, communication, cost and access so as to improve the quality of healthcare. In healthcare, specific issues relating to the failures of information management are being addressed using frontier technologies such as RF tags and wireless handheld devices. The main focus in using these technologies is to collect patient-related information in an automated manner, at the point of entry, so as to reduce the manual procedures needed to capture data. While no other discipline relies more heavily on human interaction than health care, it is in healthcare that technology in the form of wireless devices has the means to increase, not decrease, the benefits derived from the important function of human interaction. Essential to this is the acceptance of wireless handheld technology, as it enables data to be collected at the point of entry, with minimal manual intervention and a higher degree of accuracy and precision. When it comes to the management of information systems, the development and implementation of a hospital information system differs from that of traditional information systems due to the life-critical environment in hospitals. Patient lives are dependent

upon the information collected and managed in hospitals, and hence the smart use of information is crucial for many aspects of healthcare. Therefore, any investigation conducted should be multi-dimensional and should cover many aspects beyond the technical feasibility and functionality dictated by traditional systems. Successful implementation of health information systems includes addressing clinical processes so that they are efficient, effective, manageable and well integrated with other systems. While traditional information systems address issues of integration with other systems, this is all the more important in hospital systems because of the profound impact these systems have on the short- and long-term care of patients. Reasons for failure in information systems developed for healthcare include the lack of attention paid to the social and professional cultures of healthcare professionals, underestimation of complex clinical routines, dissonance between the various stakeholders of health information, long implementation time cycles, reluctance to support projects financially once they are delivered, and failure to learn from past mistakes. Therefore, any new technology should address these reasons in order to be accepted in the healthcare setting.

II. UNSUITABILITY OF CURRENT TECHNOLOGY ACCEPTANCE MODELS TO HEALTHCARE: The acceptance of new technologies has long been an area of inquiry in the MIS literature. The acceptance of personal computer applications, telemedicine, e-mail, workstations and the WWW are some examples of technologies that have been investigated. User technology acceptance is a critical success factor for IT adoption, and many studies have predicted it, to some extent accurately, using the Technology Acceptance Model (TAM), by means of a host of factors categorized into characteristics of the individuals, characteristics of the technology and characteristics of the organizational context.
The Technology Acceptance Model specifically measures the determinants of computer usage in terms of perceived usefulness and perceived ease of use. While perceived usefulness has emerged as a consistently important determinant of attitude formation, studies have found perceived ease of use to be inconsistent and less significant. The literature suggests that a plausible explanation could be users' prolonged exposure to technology, leading to familiarity and hence ease in using the system; users may therefore have regarded perceived ease of use as insignificant when determining their intention to use a technology. The strength of TAM lies in the fact that it has been tested in IS with various sample sizes and characteristics. The results of these tests suggest that it is capable of providing adequate explanation as well as predicting user acceptance of IT, and strong support can be found for TAM being robust in predicting user acceptance. However, some studies criticize TAM for examining the model's validity with students who have limited computing exposure, or with administrative and clerical staff who do not use all the IT functions found in software applications. Studies also indicate that the applicability of TAM to specific disciplines such as medicine is not yet fully established, and the validity and reliability of TAM in certain professional contexts, such as medicine and law, have been questioned. Only limited information is found in the healthcare-related literature as to the suitability of TAM; similarly, in literature related to the legal field, especially where IT is referred to, limited information can be found on TAM. Therefore, it appears that the model has not been fully tested with other professionals in their own professional contexts.
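To make TAM's structure concrete, its core claim, that intention to use is driven by perceived usefulness and perceived ease of use, can be sketched as a weighted scoring of Likert-scale responses. The weights and responses below are purely illustrative assumptions, not estimates from any study, and the sketch omits TAM's indirect path from ease of use through usefulness:

```python
# Illustrative sketch of TAM's core structure: behavioral intention (BI)
# as a weighted combination of perceived usefulness (PU) and perceived
# ease of use (PEOU). The weights are made up for illustration; a real
# TAM study would estimate them by regression over survey data.

def behavioral_intention(pu: float, peou: float,
                         w_pu: float = 0.6, w_peou: float = 0.25) -> float:
    """Predict intention to use on a 1-7 Likert scale."""
    bi = w_pu * pu + w_peou * peou
    return max(1.0, min(7.0, bi))  # clamp to the Likert range

# A respondent who rates the system very useful (6/7) but hard to
# use (2/7) still scores a moderate intention to use it:
print(round(behavioral_intention(6, 2), 2))
```

In the terms used above, the inconsistent significance of perceived ease of use corresponds to the PEOU weight shrinking towards zero as users become more familiar with the technology.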
Therefore, it can be argued that, when it comes to emerging technology such as wireless handheld devices, TAM may not be sufficient to predict the acceptance of technology, because the context becomes quite different. It should be noted that the current context in healthcare-related information systems is not only the physical environment but also the ICT environment, as wireless technology is markedly different from desktop technology. A major notable change is the way in which information is accessed: with wireless technology, information is pushed to the users, as opposed to users pulling information from desktop computers. In desktop technology, users have the freedom to choose what they want to access, and usage behavior is dependent upon that choice. With wireless devices, on the other hand, information can reach the device whether it is needed or not, and these devices assume significant importance because of the settings in which they are used. For example, in an operation theatre patient lives assume importance and information needs must reflect this. If wireless handheld devices do not support data management closely linked with clinical procedures, due to device restrictions such as screen size and memory, then despite their attractions users will discard these devices. Therefore,

applications developed for these devices must address the complex clinical procedures that the devices can support. Another major consideration in the domain of wireless technology is connectivity. While this is assumed to be always available in a wired network environment, it cannot be guaranteed in a wireless environment due to the mobility of the network connectivity. As users carry the device and roam, the signal strength may change from strong to weak, and this may interrupt user operations. Therefore, to accomplish smart information management, certain technical aspects must also be addressed. Current users of wireless technology are concerned with the security and privacy aspects of using this technology, because they need to reveal their identity in order to receive information. While privacy is concerned with the information that users provide to others, security threats fall under the categories of physical threats and data threats. Due to the infancy of the technology and hardware restrictions, handheld devices are not able to implement these features to the level found on desktop computers. In a healthcare setting, any leak of private information would have a potentially adverse impact on the stakeholders. Further, due to other devices that may be using radio or infra-red frequencies in providing healthcare to patients, there may be practical restrictions on the usage of wireless devices for ICT. Our own experience in providing wireless technology solutions to a private healthcare provider in Western Australia yielded mixed responses. The wireless technology developed and implemented for the Emergency Department was successful in terms of software development and deployment, and the project was well accepted by its users.
However, the wireless solution provided to address problems encountered in the Operation Theatre Management System was not well received by the users, despite its superiority in design, functionality and connectivity. Users were reluctant to use the application due to hardware and database connectivity restrictions, despite scoring a high level of acceptance for usefulness and ease of use. Now, let us assume that TAM is correct in claiming that the intention to use a particular system is a very important factor in determining whether users will actually use it. Let us also assume that the wireless systems developed for the private healthcare provider in Western Australia showed that there were clear intentions to use the system. However, despite a positive effect on perceived usefulness and perceived ease of use, the wireless system was not accepted by users. It should be noted that the new system mimicked the existing traditional system, and yet did not yield any interest in terms of user behavior. While searching for reasons for this hard-to-explain phenomenon, we found arguments, made after studying TAM, that perceived usefulness should also include near-term and long-term usefulness in order to study behavioral intentions; other studies that have examined the utilization of Internet technology have supported this view. This has given us a feeling that TAM may not be sufficient to predict the acceptance of wireless technology in a specific healthcare setting. A brief review of prior studies in healthcare indicated that a number of issues associated with the lack of acceptance of wireless handheld devices are highlighted but not researched to the full extent that they warrant.
For example, the drawbacks of these devices in healthcare include doctors' perceived fear of new learning, the time investment needed for such learning, the cost involved in setting up wireless networks, and the cost implications of integrating existing systems with the new wireless system. The vast majority of these studies concur that wireless handheld devices would be able to provide solutions to the information management problems encountered in healthcare. While these studies unanimously agree that information management would be smarter using wireless technology and handheld devices, they seldom provide details of the factors that enable the acceptance of wireless technology specific to the healthcare setting; MIS journals appear to be lagging behind in this area. Therefore, it is safe to assume that current models that predict the acceptance of technology based on behavioral intentions are insufficient. This necessitates a radically new model to predict the acceptance of wireless handheld technology in specific professional settings.

III. INGREDIENTS FOR A NEW MODEL TO PREDICT ACCEPTANCE OF NEW TECHNOLOGY: Some of the previous models measured actual use through the intention to use, and the inputs to these models are perceived usefulness, perceived ease of use, attitude, subjective norm, perceived behavioral control, near-term use, short-term use, experience, facilitating conditions and so on. In recent years, factors impacting technology acceptance have

included job relevance, output quality and result demonstrability. In the fields of electronic commerce and mobile commerce, factors such as security and trust are considered factors in the adoption of these technologies. In end-user computing, factors such as user friendliness and maintainability appear to influence applications. Therefore, any new model to determine the acceptance of wireless technology would include some of the above factors. In addition, when it comes to wireless technology, acceptance factors should hinge on two dominant concepts, the hardware (or device) and the applications that run on the hardware, as the battle continues to accommodate more applications on devices that are diminishing in size but improving in power. Further, mobile telephones and PDAs appear to be accepted based on their attractiveness, hardware design, the type of keypad they provide, screen colour and resolution, the ability to be carried around, and so on. In effect, the hardware component appears to be an equally dominant factor in the adoption of wireless technology. Once the hardware and software applications are accepted, the third dominant factor in the acceptance of wireless technology appears to be telecommunication. This factor involves the various services provided by telecommunication companies, the cost involved in such services, the type of connectivity, roaming facilities, the ability to access the Internet, provision for Short Messaging Services (SMS), the ability to play games on the mobile device, and so on. These factors are common to both mobile telephones and emerging PDAs. Some common features that users would like to see appear to be alarm services, a calendar, a scheduler, and the ability to access digital messages, both text and voice.
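The factor categories just discussed can be collected into a simple checklist structure, which is useful when designing a survey instrument around them. The factor names are taken from the discussion above; the grouping and the averaging scheme are illustrative only, not part of any proposed instrument:

```python
# The three factor categories named in the text, sketched as a simple
# checklist structure. The factor names come from the discussion; the
# scoring scheme itself is illustrative, not taken from the paper.

ADOPTION_FACTORS = {
    "hardware": ["device size", "memory", "keypad", "screen resolution",
                 "portability", "brand"],
    "applications": ["ease of use", "reliability", "performance",
                     "portability across hardware"],
    "telecommunication": ["plan cost", "connectivity type", "roaming",
                          "Internet access", "SMS provision"],
}

# Security cuts across all three categories, as the text notes.
for factors in ADOPTION_FACTORS.values():
    factors.append("security")

def category_score(ratings: dict[str, float]) -> dict[str, float]:
    """Average the rated factors per category (unrated factors score 0)."""
    return {cat: sum(ratings.get(f, 0.0) for f in fs) / len(fs)
            for cat, fs in ADOPTION_FACTORS.items()}
```

A survey analysis could then compare the three category averages per respondent to see which building block dominates adoption in a given setting.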
Therefore, studies that investigate the adoption of wireless technology should aim to categorize factors based on hardware, applications and telecommunication, as these appear to be the building blocks of any adoption of this technology. Specific factors for applications could involve portability across various hardware, reliability of code, performance, ease of use, module cohesion across different common applications, clarity of code, and so on. In terms of hardware, the size of the device, memory size, keypad, screen resolution, voice tones, portability, attractiveness, brand names such as Nokia, and capabilities such as alarms would be some of the factors of adoption or acceptance. In terms of service provision, plan types, costs, access, free time zones, SMS provision, the cost of local calls, the cost of Internet access, and provision to share information stored between devices appear to be the dominant factors. Factors such as security form a common theme, as all three dominant categories need to ensure them. The factors mentioned above are crucial in determining the development aspects of Wireless Information Systems (WIS) for healthcare, as they dictate the development methodology, choice of software language, user interface design and so on. Further, the factors of adoption, in conjunction with the methodology, would determine integration aspects such as coupling the new system with existing systems; this would then determine the implementation plans. In essence, an initial model that can determine the acceptance of wireless technology in healthcare can be portrayed as follows:

Diagram 1: Proposed Model for Technology Adoption in Healthcare Settings. In the above model, the three boxes with dark borders show the relationship between the various factors that influence the acceptance of technology. The box on the left indicates the various factors influencing wireless technology in any given setting. The three categories of factors, hardware, software and telecommunication, affect the way in which wireless technology is implemented. The factors portrayed in the box are generic, and their role in a specific healthcare setting varies depending upon the level of implementation. Once the technology is implemented, it is expected to be used. In healthcare settings, it appears that usage, relevance and need are the three most important factors influencing the continued usage of a new technology. When the correct balance is established, users exhibit positive perceptions about using a new technology such as wireless handheld devices for data management purposes. This, in turn, brings about a positive attitude towards using the system, both short

and long term usage. The positive usage would then determine the intentions to use, resulting in usage behavior. The usage behavior then determines the factors that influence the adoption of new technology in a given setting. This is shown by the arrow that flows from right to left. Based on the propositions made in the earlier paragraphs, it is suggested that any testing done to predict the acceptance of new technology in healthcare should test the following hypotheses: 1. Hardware factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management 2. Software factors have a direct effect on the development, integration and implementation of wireless technology in healthcare for data management 3. Telecommunication factors direct effect on the development, integration and implementation of wireless technology in healthcare for data management 4. Factors influencing wireless technology in healthcare setting have direct positive effect on usage, relevance and need 5. User perception of new technology is directly affected by usage, relevance and need 6. User perception of new technology has a direct effect on user attitude in using such technology 7. User attitude has a direct effect on intentions to use a new technology 8. Usage behavior is determined by intentions to use a new technology of those key factors. This approach would complement the open ended questions so as to determine the importance of the individual factors determining the adoption and usage of wireless devices and applications.

V. DATA COLLECTION: In order to perform validity and reliability tests, a minimum of 250 samples is required. Any study to test the model should ensure the randomness of the samples to avoid any collective bias. Similarly, about 50 participants may be required for the interview process, with each interview lasting 60 minutes. Any instruments developed for testing the model should be able to elicit responses of 'how' and 'why'. This is essential in order to discern differences between adoption and usage decisions for wireless handheld applications. In addition, comparing responses to the questions about adoption with questions about use would provide evidence that respondents were reporting their adoption drivers and not simply their current behavior. The interview questions should be semi-structured or partially structured to guide the research. There are variations in qualitative interviewing techniques, such as informal, standardized and guided interviews. Structured and partially structured interviews can be subjected to validity checks similar to those done in quantitative studies. Participants could be asked about their usage of wireless devices, including mobile telephones and other hospital systems, during the initial stages of the interview. They could be interviewed further to identify factors that would lead to the continued usage of these devices and any emerging challenges that they foresee, such as training. The interviews can be recorded on a digital recording system with provision to transfer recordings automatically to a PC, avoiding transcription errors and minimizing transcription time and cost. The interview questions should be developed in such a way that both determinants and challenge factors can be identified, enhancing the research results and keeping them free of errors and bias.

IV. INSTRUMENTS: The instruments would typically constitute two broad categories of questions. The first category would relate to the adoption and usage of wireless applications in healthcare for data collection purposes. The second category would consist of demographic variables, as these variables determine the granularity of the setting. Open-ended questions can be included in the instrument to obtain unbiased and non-leading information. Prior to administering the questions, a complete peer review and a pilot study are recommended in order to ascertain the validity of the instrument. A two-stage approach can be used in administering the instrument, where the first stage would gather information about the key factors influencing users' decisions to use wireless applications and the second stage on the importance of those key factors. This approach would complement the open-ended questions so as to determine the importance of the individual factors determining the adoption and usage of wireless devices and applications.

VI. DATA ANALYSIS: Data should be coded by two individuals into a computer file prior to analysis, and a file comparator technique should be used to resolve any data entry errors. A coding scheme should also be developed based on the instrument. The coders

should be given sufficient instructions on the codes, anticipated responses and any other detail needed to conduct the data entry. Coders should also be given a start-list that includes definitions from prior research for the categories of the construct. Some of the categories would include utilitarian outcomes, such as applications for personal use, and barriers, such as cost and knowledge. Data should be analyzed with statistical software, using both quantitative and qualitative techniques. Initially, a descriptive analysis needs to be conducted, including a frequency breakdown. This should then be followed by a detailed cross-sectional analysis of the determinants of behavior. A factor analysis should also be conducted to identify factors of adoption. Once this is completed, tests for significance can be performed between the various factors.
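The frequency breakdown and significance testing described above can be sketched as follows. The coding categories, sample records and variable names here are hypothetical, not taken from the study instrument, and the chi-square statistic is computed by hand for transparency.

```python
# Hypothetical analysis pipeline: tally coded survey responses
# (frequency breakdown), then compute a chi-square statistic for one
# candidate determinant of adoption. All data are invented.
from collections import Counter

# Each record: (user_role, adopted_wireless), as coded by the coders
records = [
    ("nurse", "yes"), ("nurse", "yes"), ("nurse", "no"),
    ("physician", "yes"), ("physician", "no"), ("physician", "no"),
    ("admin", "no"), ("admin", "no"), ("nurse", "yes"),
]

# Descriptive analysis: frequency breakdown per category
freq = Counter(role for role, _ in records)
print(dict(freq))

# Contingency counts and a chi-square test of independence
roles = sorted(freq)
table = {role: Counter(a for rr, a in records if rr == role) for role in roles}
total = len(records)
yes_total = sum(1 for _, a in records if a == "yes")

chi2 = 0.0
for role in roles:
    for outcome, n_outcome in (("yes", yes_total), ("no", total - yes_total)):
        observed = table[role][outcome]
        expected = freq[role] * n_outcome / total
        chi2 += (observed - expected) ** 2 / expected
print(f"chi-square = {chi2:.3f}")
```

In practice the statistic would be compared against the chi-square distribution with the appropriate degrees of freedom, and a factor analysis would follow using a statistical package.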
[4] Freeman, E. H. (2003). Privacy Notices under the Gramm-Leach-Bliley Act. Legally Speaking (May/June), 5-9.
[5] Goh, E. (2001). Wireless Services: China (Operational Management Report No. DPRO-94111): Gartner.
[6] Hu, P. J., Chau, P. Y. K., & Liu Sheng, O. R. (2002). Adoption of telemedicine technology by health care organizations: An exploratory study. Journal of Organizational Computing and Electronic Commerce, 12(3), 197-222.
[7] Hu, P. J., Chau, P. Y. K., Sheng, O. R. L., & Tam, K. Y. (1999). Examining the technology acceptance model using physician acceptance of telemedicine technology. Journal of Management Information Systems, 16(2), 91-112.
[8] Kwon, T. J., & Zmud, R. W. (Eds.). (1987). Unifying the fragmented models of information systems implementation. New York: John Wiley.
[9] Ortiz, E., & Clancy, C. M. (2003). Use of information technology to improve the quality of health care in the United States. Health Services Research, 38(2), 11-22.
[10] Remenyi, D., Williams, B., Money, A., & Swartz, E. (1998). Doing Research in Business and Management. London: SAGE Publications Ltd.
[12] Rogers, E. M. (1995). Diffusion of Innovation (4th ed.). New York: Free Press.
[13] Rozwell, C., Harris, K., & Caldwell, F. (2002). Survey of Innovative Management Technology (Research Notes No. M-15-1388): Gartner Research.
[14] The nature and determinants of IT acceptance, routinization, and infusion, 67-86 (1994).
[15] Sausser, G. D. (2003). Thin is in: web-based systems enhance security, clinical quality. Healthcare Financial Management, 57(7), 86-88.
[16] Simpson, R. L. (2003). The patient's point of view -- IT matters. Nursing Administration Quarterly, 27(3), 254-256.
[17] Smith, D., & Andrews, W. (2001). Exploring Instant Messaging: Gartner Research and Advisory Services.
[18] Sparks, K., Faragher, B., & Cooper, C. L. (2001). Well-being and occupational health in the 21st century workplace. Journal of Occupational and Organizational Psychology, 74(4), 481-510.
[19] Tyndale, P. (2002). Taxonomy of Knowledge Management Software Tools: Origins and Applications, 2002, from
[20] Wiebusch, B. (2002). First response gets reengineered: Will a new sensor and the power of wireless communication make us better prepared to deal with biological attacks? Design News, 57(11), 63-68.
[21] Wisnicki, H. J. (2002). Wireless networking transforms healthcare: physician's practices better able to handle workflow, increase productivity (The human connection). Ophthalmology Times, 27(21), 38-41.
[22] Yampel, T., & Eskenazi, S. (2001). New GUI tools reduce time to migrate healthcare applications to wireless. Healthcare Review, 14(3), 15-16.



VII. CONCLUSION: We saw in this case study that there is a need for a new model to accurately predict the adoption of new technologies in specific healthcare settings, because the current models available in the Information Systems domain have yet to fulfill this need. Based on our experience and the available literature, we identified some initial factors that can influence and determine the acceptance of technology. We also proposed a theoretical model that can be tested using these initial factors. Finally, we suggested a methodology for testing the model.

[1] Davis, F. D., Bagozzi, R. P., & Warshaw, P. R. (1989). User acceptance of computer technology: A comparison of two theoretical models. Management Science, 35(8), 982-1003.
[2] Davis, G. B. (1985). A typology of management information systems users and its implication for user information satisfaction research. Paper presented at the 21st Computer Personnel Research Conference, Minneapolis.
[3] Dyer, O. (2003). Patients will be reminded of appointments by text messages. British Medical Journal, 326(402), 281.




Solutions to Security and Privacy Issues in Mobile Social Networking

Abstract: Social network information is now being used in ways for which it may not have been originally intended. In particular, increased use of smartphones capable of running applications which access social network information enables applications to be aware of a user's location and preferences. However, current models for the exchange of this information require users to compromise their privacy and security. We present several of these privacy and security issues, along with our design and implementation of solutions for them. Our work allows location-based services to query local mobile devices for users' social network information without disclosing user identity or compromising users' privacy and security. We contend that it is important that such solutions be accepted as mobile social networks continue to grow exponentially.


Our focus is on security and privacy in location-aware mobile social network (LAMSN) systems. Online social networks are now used by hundreds of millions of people and have become a major platform for communication and interaction between users. This has brought a wealth of information to application developers who develop on top of these networks. Social relation and preference information allows for a unique breed of application that did not previously exist. Furthermore, social network information is now being correlated with users' physical locations, allowing information about users' preferences and social relationships to interact in real time with their physical environment. This fusion of online social networks with real-world mobile computing has created a fast-growing set of applications that have unique requirements and unique implications that are not yet fully understood. LAMSN systems such as WhozThat [1] and Serendipity [2] provide the

infrastructure to leverage social networking context within a local physical proximity using mobile smartphones. However, such systems pay little heed to the security and privacy concerns associated with revealing one's personal social networking preferences and friendship information to the ubiquitous computing environment. We present significant security and privacy problems that are present in most existing mobile social network systems. Because these systems have not been designed with security and privacy in mind, these issues are unsurprising. Our assertion is that these security and privacy issues lead to unacceptable risks for users of mobile social network systems. We make three main contributions in this paper. a) We identify three classes of privacy and security problems associated with mobile social network systems: (1) direct anonymity issues, (2) indirect or K-anonymity issues, and (3) eavesdropping, spoofing, replay, and wormhole attacks. While these problems have been examined before in other contexts, we discuss how they present unique challenges in the context of mobile social network systems, and we motivate the need for solutions. b) We present the design of a system, called the identity server, that provides solutions for these security and privacy problems. The identity server adapts established privacy and security technologies to provide novel solutions within the context of mobile social network systems. c) We describe our implementation of the identity server.

X. BACKGROUND In this section we provide the reader with a short introduction to work in the area of mobile social

networking and the technologies that have made it possible.
2.1 MOBILE COMPUTING
Smartphones now allow millions of people to be connected to the Internet all the time and support mature development environments for third-party application developers. Recently there has been a dramatic rise in the usage of smartphones: phones capable of Internet access, wireless communication, and supporting development of third-party applications. This rise has been due largely to the iPhone and iPod Touch.
2.2 SOCIAL NETWORKS
The growth of social networks has exploded over the last year. In particular, usage of Facebook has spread internationally and to users of a wide age range. According to Facebook.com's statistics page, the site has over 200 million active users [4] [5], of which over 100 million log on every day. Comparing this with ComScore's global Internet usage statistics [6], this would imply that nearly 1 in 10 of all Internet users log on to Facebook every day, and that the daily active Facebook population is larger than any single country's Internet population (China is the largest, with 179.7 million Internet users [6]).
2.3 PRIVACY AND SECURITY
The work described in this paper draws on previous privacy research in both location-based services and social networks [12] [13]. This prior work does not approach the same problem as addressed in this paper; however, the mechanisms used in these papers may provide certain functions necessary to associate user preferences anonymously with user location for use in third-party applications. Our work differs in that it seeks to hide the user's identity while distributing certain personal information obtained from existing online social networks.
XI. SECURITY AND PRIVACY PROBLEMS
Peer-to-peer mobile social network systems, like WhozThat and SocialAware, exchange users' social network identifiers between devices using short-range wireless technology such as Bluetooth.
In contrast to these systems, a mobile device in client-server mobile social network systems, such as Brightkite and Loopt, notifies a centralized server about the current location of the device (available via GPS, cell-tower identification, or other mechanisms). By querying the server, mobile devices in these client-server systems can find nearby users, information about these nearby users, and other items of interest.
3.1 Direct Anonymity Issues
The information exchange model of the mobile social network systems discussed previously provides little protection for the user's privacy. These systems require the user to allow access to his or her social network profile information, and at the same time they associate that information with the user's identity. For instance, Facebook applications generally require the user to agree to give the application access to his/her information through Facebook's API, intrinsically tying such information to the user's identity. In a peer-to-peer context-aware mobile social network system such as SocialAware, we can track a user by logging the date and time that each mobile or stationary device detects the user's social network ID. By collecting such logs, we can construct a history of the locations that a user has visited and the times of each visit, compromising the user's privacy. Finally, given access to a user's social network ID, someone else could access that user's public information in a way that the user may not have intended, by simply viewing that user's public profile on a social network Web site. We conclude that clear-text exchange of social networking IDs in systems such as WhozThat and SocialAware leads to unacceptable security and privacy risks, and allows the user's anonymity to be easily compromised. We call such problems that directly compromise a user's anonymity direct anonymity attacks. Direct anonymity attacks are also possible in client-server mobile social network systems.
While users' social network IDs are generally not directly exchanged between mobile devices in such systems, mobile or stationary devices can still track a user by logging the date and time that each device finds the user nearby. Since each device in these systems can find the social network user names, and often the full names, of nearby users, the privacy of these users can be compromised. Thus, we have a direct anonymity issue: exposure of user names and locations in client-server systems allows the user's anonymity to be compromised.
3.2 The Indirect or K-Anonymity Problem
The indirect anonymity problem exists when a piece of information indirectly compromises a user's identity. An example of this is when a piece of information unique to a user is given out, such as a list of the user's favorite movies; this information might then be easily mapped back to the user. The K-anonymity problem occurs when n pieces of information or n sets of related information can be

used together to uniquely map back to a user's identity. Furthermore, if a set of information can only be mapped to a set of k or fewer sets of users, the user's anonymity is still compromised to a degree related to k. The challenge is to design an algorithm that can decide what information should and should not be given out in order to guarantee the anonymity of associated users. This problem is similar to previous K-anonymity problems related to the release of voter or hospital information to the public. It has been shown that by correlating a few data sets, a high percentage of records can be re-identified; a paper by Sweeney shows how this re-identification process is done using voter records and hospital records [17]. The K-anonymity problem in this paper is unique in that standard K-anonymity guarantees that released information cannot distinguish between k-1 individuals associated with the released information. However, the problem discussed here does not involve the release of personal records but rather sets of aggregated information that may relate to sets of individuals that may or may not be associated with the released information. Therefore, the K-anonymity guarantee for our problem refers to the minimal number of indistinguishable unique sets that are sufficient to account for all released information. More precisely, there must be no more than k unique sets that are not subsets of each other, and all other sufficient sets must be supersets of some of the minimal sets. This paper presents this K-anonymity problem informally and proposes a solution that is currently being explored and implemented by the authors; however, it does not formally solve this problem, which is proposed as an important open problem in the area of mobile social network privacy. We argue that this problem is important because a solution would allow users to take advantage of new mobile social network applications without compromising their privacy.
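A minimal sketch of the classic k-anonymity check, the simpler cousin of the aggregated-set variant discussed above, might look like this. The user names and preference data are hypothetical; the point is only that a released attribute combination is safe when at least k users share it.

```python
# Toy k-anonymity check: a preference combination may be released only
# if it is indistinguishable among at least K users. All data invented.
from collections import Counter

K = 2
users = {
    "u1": ("comedy", "jazz"),
    "u2": ("comedy", "jazz"),
    "u3": ("horror", "jazz"),
    "u4": ("horror", "rock"),
}

# How many users share each exact preference combination
counts = Counter(users.values())

def safe_to_release(prefs):
    """Release only combinations shared by at least K users."""
    return counts[prefs] >= K

print(safe_to_release(("comedy", "jazz")))  # shared by u1 and u2
print(safe_to_release(("horror", "rock")))  # unique to u4: identifies the user
```

The aggregated-set problem in the text is harder because released information maps to sets of users rather than to individual records, but the same counting intuition underlies both.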
The K-anonymity problem applies to both peer-to-peer and client-server mobile social network systems, since both involve sharing a user's social network profile data with other users of these systems.
3.3 Eavesdropping, Spoofing, Replay, and Wormhole Attacks
Once a user's social network ID has been intercepted in a peer-to-peer mobile social network system, it can be used to mount replay and spoofing attacks. In a spoofing attack, a malicious user can masquerade as the user whose ID was intercepted (the compromised user) by simply sending (replaying) the intercepted ID to mobile or stationary devices that request the user's social network ID. Thus, the replay attack, where the compromised user's ID is maliciously repeated, is used to perform the spoofing attack. Another specific type of replay attack is known as a wormhole attack [18], where wireless transmissions are captured on one end of the network and replayed on another end. These attacks could be used for a variety of nefarious purposes; for example, a malicious user could masquerade as the compromised user at a specific time and place while committing a crime. Clearly, spoofing attacks in mobile social networking systems present serious security risks. In addition to intercepting a user's social network ID via eavesdropping of the wireless network, a malicious user could eavesdrop on information transmitted when a device requests a user's social network profile information from a social network server. For example, if a mobile device in a peer-to-peer system uses HTTP (RFC 2616) to connect to the Facebook API REST server [19] instead of HTTPS (RFC 2818), all user profile information requested from the Facebook API server is transmitted in clear text and can be intercepted. Interception of such data allows a malicious user to circumvent Facebook's privacy controls and access private user profile information that the user had no intention of sharing.
Eavesdropping, spoofing, replay, and wormhole attacks are generally not major threats to client-server mobile social network systems. These attacks can be defended against with the appropriate use of a robust security protocol such as HTTPS, in conjunction with client authentication using user names and passwords or client certificates. If a user's social network login credentials (user name and password, or certificate) have not been stolen by a malicious user, and the user has chosen an appropriately strong password, then it is nearly impossible for the malicious user to masquerade as that user.

XII. SECURITY AND PRIVACY SOLUTIONS We have designed and implemented a system, called the identity server, to address the security and privacy problems described previously. Our system assumes that each participating mobile device has reasonably reliable Internet access through a wireless wide area network (WWAN) cell data connection or through a WiFi connection. Mobile devices that lack such an Internet connection will not be able to participate in our system. Furthermore, we assume that each participating mobile device has a short-range wireless network interface, such as either Bluetooth or WiFi,

for ad-hoc communication with nearby mobile and/or stationary devices. We describe the design and implementation of the identity server in this section.
4.1 Design of the Identity Server and Anonymous Identifier
As discussed in subsections III-A and III-C, the clear-text exchange of a user's social network ID presents significant privacy and security risks [20]. To address these risks, we propose the use of an anonymous identifier, or AID. The AID is a nonce that is generated by a trusted server, called the identity server (IS). Before a user's mobile device advertises the user's presence to other nearby mobile and stationary devices, it securely contacts the IS to obtain an AID. The IS associates the newly generated AID with the mobile device that requested it, and then returns the new AID to the mobile device. The user's mobile device then proceeds to share this AID with nearby mobile and/or stationary devices by launching a Bluetooth AID sharing service. After a nearby mobile or stationary device (device B) discovers this AID sharing service on the user's mobile device (device A), device B establishes a connection to the user's mobile device to obtain the shared AID. After the AID has been obtained by device B, device A requests another AID from the IS. This new AID will be shared with the next mobile or stationary device that connects to the AID sharing service on device A. While our design and implementation use Bluetooth for AID sharing, we could also implement AID sharing using WiFi. After device B obtains the shared AID from device A, device B proceeds to query the IS for the social network profile information of the user associated with this AID. Figure 1 shows the role of the IS in generating AIDs and processing requests for a user's social network information. Once the social network information for an AID has been retrieved from the IS, the IS removes this AID from the list of AIDs associated with the mobile user.
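The AID life cycle described above can be sketched as follows, under simplifying assumptions: an in-memory store stands in for the real IS, the timeout value and profile payload are placeholders, and transport security is omitted.

```python
# Sketch of the AID life cycle: the identity server (IS) issues nonce
# AIDs, associates each with a user, and consumes or expires them so
# an AID can never be replayed. Simplified stand-in for the real IS.
import secrets
import time

AID_TIMEOUT = 60.0  # seconds; illustrative value only

class IdentityServer:
    def __init__(self):
        self._aids = {}  # aid -> (user, issue_time)

    def issue_aid(self, user):
        """Generate a fresh nonce and associate it with the user."""
        aid = secrets.token_hex(16)
        self._aids[aid] = (user, time.monotonic())
        return aid

    def lookup_profile(self, aid):
        """Consume an AID: return non-identifying profile data, then
        remove the AID so a replayed query fails."""
        entry = self._aids.pop(aid, None)
        if entry is None:
            return None  # unknown, already consumed, or timed out
        _user, issued = entry
        if time.monotonic() - issued > AID_TIMEOUT:
            return None
        # Only non personally identifiable fields are ever returned.
        return {"interests": ["jazz", "hiking"]}  # placeholder profile

is_server = IdentityServer()
aid = is_server.issue_aid("alice")
print(is_server.lookup_profile(aid) is not None)  # first query succeeds
print(is_server.lookup_profile(aid))              # replay returns None
```

Because the AID is removed on first use, an eavesdropper who captures it gains a one-shot, non-identifying token at best, which is the property the design relies on.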
Before the user's mobile device next advertises the user's presence using the Bluetooth AID sharing service, it will obtain a new AID from the IS as described above. We permit multiple AIDs to be associated with a mobile user, which allows multiple nearby mobile or stationary devices to obtain information about the user. To improve efficiency, the user's mobile device may submit one request for multiple AIDs to the IS, and then proceed to share each AID one at a time with other nearby devices. The IS sets

a timeout value for each AID when the AID is created and provided to a user's mobile device. An AID times out if it is not consumed within the timeout period, that is, if the IS has not received a query for the social network profile information of the user associated with this AID within the timeout period. Upon timeout of an AID, the IS removes the AID from the list of AIDs associated with the user. We use AID timeouts to prevent the list of AIDs associated with a user from growing without bound. The use of AIDs in our system provides important privacy features for mobile users. Since the mobile device shares only AIDs with other devices, a malicious user who has intercepted these AIDs cannot connect them to a particular user's social network identity. Furthermore, the IS does not support the retrieval of certain personally identifiable information from a user's social network profile, such as the user's full name, email address, phone number, etc. Since the IS does not support the retrieval of personally identifiable information, a device that retrieves social network information for the user associated with an AID is unable to connect the AID to the user's social network identity. Thus, only by compromising the IS can a malicious user tie an AID to a user's social network ID. We assume that the IS is a secure and trusted system, and that compromising such a system would prove to be a formidable task. The use of the IS and AIDs as we have described solves the direct anonymity problem. As the reader will see in subsection IV-C, the IS also addresses the indirect anonymity problem by providing a K-anonymity

guarantee for information returned from users' social network profiles.
4.2 Implementation of the Identity Server
All IS services accessed by mobile and/or stationary devices are exposed as web services conforming to the REST architecture [21]. We used the open source Restlet framework [22] for Java to develop the IS. We expose each resource on the IS, including a mobile user's AID, a mobile user's current location, and the Facebook profile information for a mobile user, as separate URL-accessible resources supporting HTTP GET, POST, and PUT methods as appropriate. Figure 2 shows the web-accessible resources exposed on the IS, along with the HTTP methods supported by each resource. The body of each HTTP request is encoded using JSON (RFC 4627). All web service network traffic between the IS and other mobile/stationary devices is encrypted using HTTPS, and access to all resources is authenticated using HTTP basic access authentication (RFC 2617). Each mobile user must sign up for a user account on the IS prior to participation in our system. During the signup process, the user provides his/her Facebook user ID (we can obtain this using Facebook Connect [23]) and chooses a user name and password. The user's user name and password are securely stored on the user's mobile device, and are used to authenticate with the IS and obtain access to the guarded web resources on the IS for the device's current location, the user's AID, and the user's Facebook profile information. Access to the web resources for the user's AID and current location is available only to the user himself/herself, and to no other entity save for the logic implemented on the IS. Access to the web resource for a user's Facebook profile information (we call this user user A) is provided to any authenticated user with a user account on the IS, provided that the authenticated user's device is within an acceptable range of user A's mobile device. See below for more information on location-based access control for a user's Facebook profile.
4.3 Trust Networks and Onion Routing
One way to support privacy in social network applications is to transfer information using a trusted peer-to-peer network [29]. Such a network would require a trust network much like that used by Katz and Golbeck [30], in which social networks provided trust for default actions. Moreover, in a mobile social network application, nodes could not only share their information directly but could give permission to their trusted network to share their information. This approach was used in the OneSwarm [31] system to allow peer-to-peer file sharing with privacy settings that allowed the user to share data publicly, just with friends, or even with a chosen subset of those friends. However, such a model has obvious problems if any nodes are compromised, since information is easily associated with its source. These peer-to-peer networks could be made anonymous through the use of onion routing [32]. The Tor network [33] uses onion routing to allow nodes to send data anonymously. Through the use of layers of encryption that are decrypted at selected routers along a virtual route, routing nodes cannot directly relate the information at the destination to its source. If data were shared in this manner it would not be easy to identify the source of the information, protecting the direct anonymity of the user. We are currently exploring the use of trust networks and onion routing as a more decentralized approach to protecting user anonymity that does not require trust of the social network (such as Facebook) itself [29].
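The layered encryption at the heart of onion routing, as discussed in subsection 4.3, can be illustrated with a toy sketch. XOR with per-hop keys stands in for the real public-key encryption used by systems such as Tor, and must not be used in practice; the keys and payload are invented.

```python
# Toy onion layering: the sender wraps the payload in one encryption
# layer per router; each router peels exactly one layer, so no single
# router can link the final payload to its source. XOR is a stand-in
# for real per-hop public-key encryption (never use XOR in practice).
def xor_layer(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

hop_keys = [b"key-router-1", b"key-router-2", b"key-router-3"]
payload = b"user preference data"

# Sender wraps layers in reverse hop order
onion = payload
for key in reversed(hop_keys):
    onion = xor_layer(onion, key)

# Each router peels its own layer in forward order
for key in hop_keys:
    onion = xor_layer(onion, key)

print(onion == payload)  # True: all layers peeled, payload recovered
```

In a real onion route each hop also learns only the next hop's address, which is what prevents any intermediate router from relating destination to source.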

CONCLUSION
We have identified several important privacy and security issues associated with LAMSN systems, along with our work on novel solutions for these issues. Our solutions support anonymous exchange of social network information with real-world location-based systems, enabling context-aware systems that do not compromise users' security and privacy. We hope that our work will convince users and developers that it is possible to move forward with creative mobile social network applications without further compromising user security and privacy.
REFERENCES
[1] N. Eagle and A. Pentland, Social serendipity: Mobilizing social software,
[2] Global internet use reaches 1 billion, release.asp?press=2698.
[3] C. M. Gartrell, Social aware: Context-aware multimedia presentation via mobile social networks, Master's thesis, University of Colorado at Boulder, December 2008.
[4] E. Miluzzo, N. D. Lane, S. B. Eisenman, and A. T. Campbell, CenceMe - injecting sensing presence into social networking applications, in Proceedings of the 2nd European Conference on Smart Sensing and Context (EuroSSC 2007), October 2007.
[5] Brightkite,
[6] Loopt,
[7] A. Tootoonchian, K. K. Gollu, S. Saroiu, Y. Ganjali, and A. Wolman, Lockr: social access control for web 2.0, in WOSP '08: Proceedings of the first


Wireless Monitoring Of The Green House Using ATMEGA Based Monitoring System: WSN Approach

Abstract: In the present paper, the authors emphasise a WSN approach for green house monitoring and control. A control system is developed and tested using a recent Atmega microcontroller. Farmers in developing countries can easily use the designed system for maximising yield. Atmega microcontrollers are preferred over other microcontrollers due to some important features, including a 10-bit ADC, sleep mode, wide input voltage range and higher memory capacity.
Index Terms: WSN, AVR, MICROCONTROLLERS, GREEN HOUSE, PRECISION AGRICULTURE, RF 2.4

I. INTRODUCTION
A recent survey of the advances in wireless sensor network applications has reviewed a wide range of applications for these networks and identified agriculture as a potential area of deployment, together with a review of the factors influencing the design of sensor networks for this application. A WSN is a collection of sensor and actuator nodes linked by a wireless medium to perform distributed sensing and acting tasks. The sensor nodes collect data and communicate over a network environment to a computer system, which is called a base station. Based on the information collected, the base station takes decisions and the actuator nodes then perform appropriate actions upon the environment. This process allows users to sense and control the environment from anywhere. There are many situations in which the application of a WSN is preferred, for instance environment monitoring, product quality monitoring, and others where supervision of big areas is necessary. Wireless sensor networks form a useful part of the automation system architecture in modern greenhouses. Wireless communication can be used to collect the measurements and to communicate between the centralized control and the actuators located in the different parts of the greenhouse. In advanced WSN solutions, parts of the control system itself can also be implemented in a distributed manner in the network, such that local control loops can be formed. Compared to cabled systems, the installation of a WSN is fast, cheap and easy. Moreover, it is easy to relocate the measurement points when needed by just moving sensor nodes from one location to another within the communication range of the coordinator device. If the greenhouse flora is high and dense, the small and lightweight nodes can even be hung on the plants' branches. WSN maintenance is also relatively cheap and easy. The only additional costs occur when the sensor nodes run out of batteries and the batteries need to be charged or replaced, but the lifespan of the battery can be several years if an efficient power-saving algorithm is applied. Research on the use of WSNs in agriculture has focused primarily on proof-of-concept applications that demonstrate the efficiency and efficacy of using sensor networks to monitor and control agriculture management strategies. The authors attempt to show the effective utilization of this concept in day-to-day monitoring of the green house for higher yield.
II. RF COMMUNICATION AND MONITORING OF THE GREEN HOUSE PARAMETERS
RF is the wireless transmission of data by digital radio signals at a particular frequency. RF

communication works by creating electromagnetic waves at a source and being able to send the electromagnetic waves at a particular destination. These electromagnetic waves travel through the air at near the speed of light. The advantages of a RF communication are its wireless feature so that the user neednt have to lay cable all over the green house. Cable is expensive, less flexible than RF coverage and is prone to damage. RF communication provides extensive hardware support for packet handling, data buffering, burst transmissions, clear channel assessment and link quality. A. FEATURES a) Low power consumption. b) High sensitivity (type -104dBm) c) Programmable output power -20dBm~1dBm d) Operation temperature range -40~+85 deg C e) Operation voltage: 1.8~3.6 Volts. f) Available frequency at : 2.4~2.483 GHz B. APPLICATIONS a) Wireless alarm and security systems b) AMR-automatic Meter Reading c) Wireless Game Controllers. d) Wireless Audio/Keyboard/Mouse C. PROPOSED RF COMMUNICATION BASED GREEN HOUSE PARAMETER MONITORING HARDWARE In the proposed hardware, there would be two section master and slave. The slave part would contain the temperature and humidity sensor. The sensor would be connected to the AVR microcontroller. The RF transceiver would be connected to the AVR microcontroller which would wirelessly send the data to the master part. The master part would contain the RF transceiver which would receive the data and give to the microcontroller. The count would be displayed on the graphics LCD. The motor and DC fan would also be connected to the master board. These motor and DC fan would be accordingly controller based upon the relevant temperature and humidity condition. The major components of the proposed hardware,as seen in fig.1, are, Microcontroller - AVR- Atmega 16, Atmega 32. Compiler : AVR studio Range - 150 meter Master and Slave communication: 247 slaves. Sensor: Temperature : LM35 and Humidity sensor
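The master-side control logic described above can be sketched as follows. This is an illustrative sketch, not the authors' firmware: `control_actions` is a hypothetical helper, and the thresholds mirror the ON-OFF scheme reported in the field tests (fan ON above 30 deg C, water pump ON below 50% relative humidity).

```python
# Illustrative sketch of the master-side ON-OFF control logic.
# The temperature and humidity values stand in for readings received
# from the slave node over the RF 2.4 link.

TEMP_LIMIT_C = 30.0   # fan switches ON above this temperature
RH_LIMIT_PCT = 50.0   # pump switches ON below this relative humidity

def control_actions(temp_c: float, rh_pct: float) -> dict:
    """Return the desired actuator states for one sensor reading."""
    return {
        "fan": temp_c > TEMP_LIMIT_C,
        "pump": rh_pct < RH_LIMIT_PCT,
    }

# Example: a hot, dry reading should switch both actuators ON.
print(control_actions(32.5, 42.0))  # {'fan': True, 'pump': True}
```

On the real board the returned states would drive the relay outputs for the exhaust fan and the water pump.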



III. WHY ATMEGA?

There are several features of the ATmega microcontroller, given below, which make it an ideal choice for green house parameter monitoring.

A. FEATURES
a) High-performance, low-power AVR 8-bit microcontroller
b) Advanced RISC architecture
c) High-endurance non-volatile memory segments: 16K bytes of in-system self-programmable Flash program memory, 512 bytes EEPROM, 1K byte internal SRAM
d) Peripheral features: two 8-bit Timer/Counters with separate prescalers and compare modes, one 16-bit Timer/Counter, 8-channel 10-bit ADC
e) Special microcontroller features: power-on reset and programmable brown-out detection, internal calibrated RC oscillator, external and internal interrupt sources

B. ARCHITECTURAL DESCRIPTION

The ATmega16 provides the following features: 16K bytes of in-system programmable Flash program memory with read-while-write capabilities, 512 bytes EEPROM, 1K byte SRAM, 32 general-purpose I/O lines, 32 general-purpose working registers, a JTAG interface for boundary scan, on-chip debugging support and programming, three flexible Timer/Counters with compare modes, internal and external interrupts, a serial programmable USART, a byte-oriented Two-wire Serial Interface, an 8-channel 10-bit ADC with optional differential input stage with programmable gain (TQFP package only), a programmable Watchdog Timer with internal oscillator, an SPI serial port, and six software-selectable power-saving modes. The Idle mode stops the CPU while allowing the USART, Two-wire interface, A/D Converter, SRAM, Timer/Counters, SPI port, and interrupt system to continue functioning. The Power-down mode saves the register contents but freezes the oscillator, disabling all other chip functions until the next external interrupt or hardware reset. In Power-save mode, the asynchronous timer continues to run, allowing the user to maintain a timer base while the rest of the device is sleeping. The ADC Noise Reduction mode stops the CPU and all I/O modules except the asynchronous timer and the ADC, to minimize switching noise during ADC conversions. In Standby mode, the crystal/resonator oscillator runs while the rest of the device is sleeping; this allows very fast start-up combined with low power consumption. In Extended Standby mode, both the main oscillator and the asynchronous timer continue to run. The device is manufactured using Atmel's high-density non-volatile memory technology. The on-chip ISP Flash allows the program memory to be reprogrammed in-system through an SPI serial interface, by a conventional non-volatile memory programmer, or by an on-chip boot program running on the AVR core. The boot program can use any interface to download the application program into the Application Flash memory. Software in the Boot Flash section will continue to run while the Application Flash section is updated, providing true read-while-write operation.
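The 10-bit ADC noted above is what digitises the LM35 temperature sensor described later. A sketch of the count-to-degrees conversion is given below; the 5 V ADC reference is an assumption (the AVR's internal 2.56 V reference could equally be used), and `lm35_celsius` is a hypothetical helper name.

```python
# Illustrative sketch: converting a raw 10-bit ADC reading of the LM35
# output into degrees Celsius. The LM35 produces 10 mV per deg C; with a
# 5 V reference (an assumption), each ADC count is 5/1024 volts.

def lm35_celsius(adc_count: int, vref: float = 5.0, bits: int = 10) -> float:
    volts = adc_count * vref / (1 << bits)   # ADC count -> volts
    return volts / 0.010                     # 10 mV per degree Celsius

# Example: a reading of 61 counts at Vref = 5 V is roughly 29.8 deg C.
print(round(lm35_celsius(61), 1))  # 29.8
```

The same arithmetic runs on the AVR in integer form; the floating-point version here is only for clarity.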
By combining an 8-bit RISC CPU with In-System Self-Programmable Flash on a monolithic chip, the Atmel ATmega16 is a powerful microcontroller that provides a highly-flexible and cost-effective solution to many embedded control applications. IV. WHY RF 2.4?

(Abbreviations: SOC: System-on-Chip, NP: Network Processor, TXRX: Transceiver)


Table II. Part number, minimum and maximum frequency range, operating voltage and description

In a nutshell, the advantages of RF 2.4 are:
a) Low power consumption.
b) Integrated data filters.
c) High sensitivity.
d) Operating temperature range: -40 to +85 deg C.
e) Available frequency: 2.4 to 2.483 GHz; no certification is required from the government.

V. DETAILS OF THE SENSORS USED

The important features given below in Table I and Table II make RF 2.4 an ideal choice for green house parameter monitoring.

Table I. Part number, status, device type, frequency range and sensitivity

A. TEMPERATURE SENSOR

The LM35 series, shown in Fig. 3, are precision integrated-circuit temperature sensors whose output voltage is linearly proportional to the Celsius (Centigrade) temperature. The LM35 thus has an advantage over linear temperature sensors calibrated in Kelvin, as the user is not required to subtract a large constant voltage from its output to obtain convenient Centigrade scaling. The LM35 does not require any external calibration or trimming to provide typical accuracies of +/-1/4 deg C at room temperature and +/-3/4 deg C over the full -55 to +150 deg C temperature range. Low cost is assured by trimming and calibration at the wafer level. The LM35's low output impedance, linear output, and precise inherent calibration make interfacing to readout or control circuitry especially easy. It can be used with single power supplies, or with plus and minus supplies. As it draws only 60 uA from its supply, it has very low self-heating, less than 0.1 deg C in still air. The LM35 is rated to operate over a -55 to +150 deg C temperature range, while the LM35C is rated for a -40 to +110 deg C range (-10 deg C with improved accuracy). The LM35 series is available packaged in hermetic TO-46 transistor packages, while the LM35C, LM35CA, and LM35D are also available in the plastic TO-92 transistor package. Fig. 2 shows the typical use of the IC temperature sensor in the green house control system using the AVR microcontroller.

VI. DESIGN OBJECTIVES

Fig. 2. Typical use of IC temperature sensor

B. HUMIDITY SENSOR (HIH-3610 SERIES)

Following are the features of the humidity sensor selected for this design:
a) Linear voltage output vs %RH.
b) Chemically resistant (the output is not disturbed by the presence of chemicals in the air).
c) The HIH-3610 Series humidity sensor is designed specifically for high-volume OEM (Original Equipment Manufacturer) users.
d) Direct input to a controller due to the sensor's linear voltage output.

Table III shows the available humidity sensors for the green house application.

Table III. Available humidity sensors for green house applications

The horticulturists near the Nasik region felt the need for an automatic controller for their green houses, where they grow export-quality roses. The atmosphere in India changes greatly with the season; hence the quality of the roses does not remain the same, owing to the large changes in temperature and humidity. Poor-quality roses give less income; the loss of income due to poor-quality roses is of the order of 2 to 3 lakhs per acre per season. For roses, ideally, the green house should provide good light throughout the year and a temperature range between 15 and 28 deg C; the night temperature should be between 15 and 18 deg C, and the day temperature should not exceed 30 deg C in any case. Growth is slowed when the temperature falls below 15 deg C. If the temperature rises above 28 deg C, humidity must be kept high. A night temperature above 15 deg C hastens flower development, while a lower temperature, around 13.5 deg C, delays it. Depending on the temperature inside the greenhouse, the moisture should be kept in line for best results; for example, at 24 deg C, 60% humidity is suitable. Hence, variable temperature and humidity control for different crops in a WSN environment using a low-cost wireless technique was the main objective. Low power consumption was another objective; hence the selection of the sensors and, most importantly, of the microcontroller was very important, keeping power consumption at remote places in view. To bring the temperature within the control limit, exhaust fans were switched ON automatically, and for humidity control, a water pump was switched ON and OFF.

VII. PROGRAMMING

Embedded C is used for the programming. Fig. 3 shows the programming window of the AVR Studio software used during programming.

Fig. 3. AVR Studio window during programming

Following are some important features of AVR Studio:
a) Integrated Development Environment (write, compile and debug)
b) Fully symbolic source-level debugger
c) Extensive program flow control options
d) Language support: C, Pascal, BASIC, and Assembly

VIII. FIELD OBSERVATIONS AND RESULTS

The results are found to be satisfactory. Two areas, A and B (each admeasuring 10 meters x 10 meters), were selected. Area A was used to take readings without temperature and humidity control; readings in Area B were taken after suitable automatic control action with the help of the AVR-based green house controller. It is found that the designed hardware gave consistently faithful readings and also proved to be accurate in the humid atmosphere of the green house. The following readings and graphs show some of the readings in Area A and the corresponding readings after corrective action in Area B.

Table IV. Readings taken in the green house near Nasik before and after control action. The fan is automatically switched ON when the temperature in the area exceeds 30 deg C, and the motor is switched ON when the relative humidity falls below 50%.

Readings were taken for 15 days. The ON-OFF action of the hardware was tested, and satisfactory results were achieved. Figs. 6, 7, 8 and 9 show photographs of the green house structures used to take the readings.

Fig. 4: Green house near Nasik growing roses: complete sections of the green house are seen. Vents are used to regulate the temperature naturally.

Fig. 5: Green house parameters in Area A (without control action)

Fig. 6: Green house parameters in Area B (with control action). Humidity values are increased and temperature values are decreased due to the automatic control action of the AVR-based wireless green house controller.

A. Low-cost, maintenance-free sensors are used to monitor the environment. The system has several advantages in terms of its compact size, low cost and high accuracy.
B. The green house system considers design optimization and functional improvement of the system.
C. The same system can be used to monitor industrial parameters as well.
D. The system developed showed consistency, accuracy and precise control action over a period of 15 days and did not fail even once during testing.
E. The quality of roses in Area B was found to be better than in Area A.
F. The owner of the green house said that good-quality roses are sold at 1.5 times the rate of medium-quality roses. Hence the system, if implemented, can increase the profit margin.
G. The cost of the system is less than Rs. 2500/- if produced in quantity.
H. For a one-acre green house, only 5 sets of the AVR-based green house controller are needed.
I. The projected increase in profit is in the range of 4-5 lakhs per acre.

REFERENCES

[1] ... Network, European Journal of Scientific Research, ISSN 1450-216X, Vol. 33, No. 2 (2009), pp. 249-260, EuroJournals Publishing, Inc., 2009.
[2] Wang Juan and Wang Yan, "The Greenhouse Remote Monitoring System Based on RCM2100", College of Mechanical and Electric Engineering, Agricultural University of Hebei, Baoding 071001, China.
[3] BeomJin Kang, Dae-Heon Park, KyungRyung Cho, ChangSun Shin, SungEon Cho and JangWoo Park, "A Study on the Greenhouse Auto Control System Based on Wireless Sensor Network", IEEE, 22 December 2008.
[4] Anil Kumar Singh, "Precision Farming", Water Technology Centre, New Delhi.
[5] Debashis Mandal and S. K. Ghosh, "Precision Farming".
[6] H. J. Hellebrand, H. Beuche and K. H. Dammer, "Precision Agriculture".
[7] S. M. Swinton and J. Lowenberg-DeBoer, "Precision Agriculture".
[8] Mahmoud Omid, "A Computer-Based Monitoring System to Maintain Optimum Air Temperature and Relative Humidity in Greenhouses".
[9] Teemu Ahonen, Reino Virrankoski and Mohammed Elmusrati, "Greenhouse Monitoring with Wireless Sensor Network".
[10] Andrzej Pawlowski, Jose Luis Guzman, Francisco Rodriguez, Manuel Berenguel, Jose Sanchez and Sebastian Dormido, "Simulation of Greenhouse Climate Monitoring and Control with Wireless Sensor Network and Event-Based Control".
[11] Candido, F. Cicirelli, A. Furfaro and L. Nigro, "Embedded Real-Time System for Climate Control in a Complex Greenhouse".


Fuzzy C-Means Algorithm Using Different Variants

Abstract - Clustering can be considered the most important unsupervised learning problem; as with every other problem of this kind, it deals with finding structure in a collection of unlabeled data. A loose definition of clustering could be the process of organizing objects into groups whose members are similar in some way. A cluster is therefore a collection of objects which are similar to one another and dissimilar to the objects belonging to other clusters. Clustering of data is a method by which large sets of data are grouped into clusters of smaller sets of similar data; the goal of clustering is thus to determine the intrinsic grouping in a set of unlabeled data. Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters; this method is frequently used in pattern recognition and is based on the minimization of an objective function.

Index Terms - clustering analysis, fuzzy clustering, fuzzy c-means, genetic algorithm.


Clustering of data is a method by which large sets of data are grouped into clusters of smaller sets of similar data; the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But how do we decide what constitutes good clustering? It can be shown that there is no absolute best criterion which is independent of the final aim of the clustering. Consequently, it is the user who must supply this criterion, in such a way that the result of the clustering suits their needs. For instance, we could be interested in finding representatives for homogeneous groups (data reduction), in finding natural clusters and describing their unknown properties (natural data types), in finding useful and suitable groupings (useful data classes) or in

finding unusual data objects (outlier detection). The main requirements that a clustering algorithm should satisfy are: scalability; dealing with different types of attributes; discovering clusters with arbitrary shape; minimal requirements for domain knowledge to determine input parameters; ability to deal with noise and outliers; insensitivity to the order of input records; handling high dimensionality; and interpretability and usability. There are a number of problems with clustering, among them: current clustering techniques do not address all the requirements adequately (and concurrently); dealing with a large number of dimensions and a large number of data items can be problematic because of time complexity; the effectiveness of the method depends on the definition of distance (for distance-based clustering); if an obvious distance measure doesn't exist we must define one, which is not always easy, especially in multi-dimensional spaces; and the result of the clustering algorithm (which in many cases can be arbitrary itself) can be interpreted in different ways. Clustering algorithms may be classified as listed below:
Exclusive Clustering
Overlapping Clustering
Hierarchical Clustering
Probabilistic Clustering

In the first case, data are grouped in an exclusive way, so that if a certain datum belongs to a definite cluster then it cannot be included in another cluster. A simple example is shown in the figure below, where the separation of points is achieved by a straight line on a bi-dimensional plane. On the contrary, the second type, overlapping clustering, uses fuzzy sets to cluster data, so that each

point may belong to two or more clusters with different degrees of membership. In this case, data will be associated with an appropriate membership value. A hierarchical clustering algorithm, instead, is based on the union between the two nearest clusters: the starting condition is realized by setting every datum as a cluster, and after a few iterations the final clusters are reached. Finally, the last kind of clustering uses a completely probabilistic approach. Clustering algorithms can be applied in many fields, for instance: marketing: finding groups of customers with similar behavior given a large database of customer data containing their properties and past buying records; biology: classification of plants and animals given their features; libraries: book ordering; insurance: identifying groups of motor insurance policy holders with a high average claim cost, and identifying frauds; city planning: identifying groups of houses according to their house type, value and geographical location; earthquake studies: clustering observed earthquake epicenters to identify dangerous zones; WWW: document classification, and clustering weblog data to discover groups of similar access patterns.


Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of the membership u_ij and the cluster centers c_j by:

u_ij = 1 / SUM_{k=1..C} ( ||x_i - c_j|| / ||x_i - c_k|| )^(2/(m-1))    (2)

c_j = ( SUM_{i=1..N} u_ij^m * x_i ) / ( SUM_{i=1..N} u_ij^m )    (3)

This iteration will stop when max_ij | u_ij^(k+1) - u_ij^(k) | < e, where e is a termination criterion between 0 and 1, and k is the iteration step. This procedure converges to a local minimum or a saddle point of J_m. As already said, data are bound to each cluster by means of a membership function, which represents the fuzzy behaviour of this algorithm. To do that, we simply build an appropriate matrix named U, whose entries are numbers between 0 and 1 representing the degree of membership between the data and the centers of the clusters. The algorithm is composed of the following steps:
1. Initialize the U = [u_ij] matrix, U(0).
2. At step k, calculate the center vectors C(k) = [c_j] using U(k) and equation (3).

XIV. FUZZY C-MEANS CLUSTERING ALGORITHM Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method (developed by Dunn in 1973 and improved by Bezdek in 1981) is frequently used in pattern recognition. It is based on minimization of the following objective function:

J_m = SUM_{i=1..N} SUM_{j=1..C} u_ij^m ||x_i - c_j||^2    (1)

where m is any real number greater than 1, u_ij is the degree of membership of x_i in cluster j, x_i is the ith of the d-dimensional measured data, c_j is the d-dimensional center of the cluster, and ||*|| is any norm expressing the similarity between any measured data and the center. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above.

3. Update U(k) to U(k+1) using equation (2).

choosing an initialization for the c-means clustering algorithms. The experiments use six data sets, including the Iris data, magnetic resonance and color images. The genetic algorithm approach is generally able to find the lowest J value; on data sets with several local extrema, the GA approach always avoids the less desirable solutions. Degenerate partitions are always avoided by the GA approach, which provides an effective method for optimizing clustering models whose objective function can be represented in terms of cluster centers. The time cost of genetically guided clustering is shown to be competitive with a series of random initializations of fuzzy/hard c-means in which the partition with the lowest J value is chosen, making it an effective competitor for many clustering domains. The subtractive clustering method assumes each data point is a potential cluster center and calculates a measure of the likelihood that each data point would define a cluster center, based on the density of surrounding data points.
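The density measure used by subtractive clustering can be sketched as follows. This is an illustrative sketch following Chiu's standard formulation (alpha = 4 / r_a^2), which the source paragraph describes but does not spell out; `potentials` is a hypothetical helper name.

```python
# Illustrative sketch of the subtractive-clustering density measure:
# each point is scored by how many neighbours fall within the radius
# r_a; the highest-potential point is the first candidate cluster centre.

import math

def potentials(points, ra=1.0):
    alpha = 4.0 / ra ** 2
    return [
        sum(math.exp(-alpha * sum((a - b) ** 2 for a, b in zip(p, q)))
            for q in points)
        for p in points
    ]

# Two tight pairs of points: every point sees one close neighbour,
# so all potentials come out close to 2 (itself plus the neighbour).
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
p = potentials(data, ra=1.0)
best = p.index(max(p))  # index of the first candidate centre
```

After a centre is selected, the full method subtracts its contribution from the remaining potentials and repeats; that revision step is omitted here for brevity.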

4. If ||U(k+1) - U(k)|| < e, then STOP; otherwise return to step 2.
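The iteration steps above can be sketched compactly with NumPy. This is an illustrative implementation, not the code from any of the cited papers; `fcm` is a hypothetical function name, and m = 2.00 follows Bezdek's usual choice.

```python
# Compact sketch of the FCM iteration: X is an (n, d) data matrix, c the
# number of clusters, m the weighting exponent, eps the termination
# criterion on ||U(k+1) - U(k)||.

import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)        # rows of U sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]          # eq. (3)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)          # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)            # eq. (2)
        if np.linalg.norm(U_new - U) < eps:  # termination criterion
            U = U_new
            break
        U = U_new
    return centers, U

# Two well-separated blobs: the two centers land near the blob means.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], float)
centers, U = fcm(X, c=2)
```

Each row of the returned U gives the fuzzy membership of one point in every cluster, which is exactly the overlapping behaviour the section describes.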

Advantage: (i) it gives the best results for overlapped data sets and is comparatively better than the k-means algorithm.

XV. VARIANTS OF FUZZY C-MEANS

The most widely used clustering algorithm implementing the fuzzy philosophy is Fuzzy C-Means (FCM), initially developed by Dunn and later generalized by Bezdek, who proposed a generalization by means of a family of objective functions. Although this algorithm proved to be less accurate than others, its fuzzy nature and ease of implementation made it very attractive to many researchers, who proposed various improvements and applications. Usually FCM is applied to unsupervised clustering problems.

i) Optimizing the Fuzzy C-Means Clustering Algorithm Using a Genetic Algorithm (GA)

Pattern recognition is a field concerned with machine recognition of meaningful regularities in noisy or complex environments; in simpler words, pattern recognition is the search for structure in data. In pattern recognition, a group of data is called a cluster [1]. Fuzzy C-Means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters; the method was developed by Dunn [2] in 1973, improved by Bezdek [3] in 1981, and is frequently used in pattern recognition. What we want from optimization is to improve the performance toward some optimal point or points [4]. Luus [5] identifies three main types of search methods: calculus-based, enumerative and random. Hall, Ozyurt and Bezdek [6] describe a genetically guided approach for optimizing the hard (J1) and fuzzy (Jm) c-means functionals used in cluster analysis. Their experiments show that a genetic algorithm ameliorates the difficulty of

where m is any real number greater than 1 (it was set to 2.00 by Bezdek), u_ij is the degree of membership of x_i in cluster j, x_i is the ith of the d-dimensional measured data, c_j is the d-dimensional center of the cluster, and ||*|| is any norm expressing the similarity between any measured data and the center.

Then go to the next step:-

the same tested data. Fig. 1 in Appendix A shows the flow chart of the program.

A. Example 1 - Modeling a Two-Input Nonlinear Function

In this example, a nonlinear function was proposed:

z = sin(x) * sin(y)    (7)

The range X in [-10.5, 10.5] and Y in [-10.5, 10.5] is the input space of the above equation, and 200 data pairs were obtained randomly. First, the best least-square error is obtained for FCM with weighting exponent m = 2.00. Next, the least-square error of subtractive clustering is obtained by iteration, stopping when the error falls below a predefined value. The resulting cluster number (24 clusters) is then taken into the FCM algorithm, and the error is evaluated with respect to the weighting exponent m.

The genetic algorithm is a stochastic global search method that mimics the metaphor of natural biological evolution. GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce (hopefully) better and better approximations to a solution [9, 10]; the new individuals tend to be better suited to their environment than the individuals they were created from, just as in natural adaptation. The algorithm is composed of the following steps:
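A minimal GA loop of this kind (tournament selection, blend crossover, Gaussian mutation) can be sketched as follows. This is an illustrative sketch, not the paper's MATLAB program: the fitness is a simple surrogate with its minimum at m = 2.4 (an arbitrary illustrative value), standing in for the fuzzy model's actual least-square error so the sketch is self-contained.

```python
# Illustrative GA sketch: evolve the FCM weighting exponent m to
# minimise a fitness function (here a surrogate, not the real model error).

import random

def fitness(m):                      # stand-in for the model's LSE
    return (m - 2.4) ** 2

def ga_optimize_m(pop_size=30, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [rng.uniform(1.1, 3.0) for _ in range(pop_size)]
    for _ in range(generations):
        # tournament selection: survival of the fittest
        parents = [min(rng.sample(pop, 3), key=fitness)
                   for _ in range(pop_size)]
        nxt = []
        for i in range(0, pop_size, 2):
            a, b = parents[i], parents[i + 1]
            w = rng.random()
            c1 = w * a + (1 - w) * b              # blend crossover
            c2 = (1 - w) * a + w * b
            nxt += [c1 + rng.gauss(0, 0.05),      # Gaussian mutation
                    c2 + rng.gauss(0, 0.05)]
        pop = nxt
    return min(pop, key=fitness)

best_m = ga_optimize_m()  # converges close to the surrogate optimum
```

In the papers discussed here, `fitness` would instead build the fuzzy model for a candidate m and return its least-square error on the test data.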

Fig. 2 Random data points of equation (7); blue circles for the data to be clustered and red stars for the testing data

B. Example 2 - Modeling a One-Input Nonlinear Function

In this example, a nonlinear function with one variable x was also proposed:

y = sin(x)    (8)

A complete program using the MATLAB programming language was developed to find the optimal value of the weighting exponent. It starts by performing subtractive clustering on the input-output data, builds the fuzzy model using subtractive clustering, and optimizes the parameters by minimizing the least-square error between the output of the fuzzy model and the output of the original function on test data. This optimization is carried out by iteration. After that, the genetic algorithm optimizes the weighting exponent of FCM: in the same way, the fuzzy model is built using FCM and the weighting exponent m is optimized by minimizing the least-square error between the output of the fuzzy model and the output of the original function by entering

The range X in [-20.5, 20.5] is the input space of the above equation; 200 data pairs were obtained randomly and are shown in Fig. 3.

star class from the point class. On the other hand, it is easy to see that Fig. 2 is more crisp than Fig. 3. This illustrates that, for the classification of the Iris database, the features PL and PW are more important than SL and SW. Here we can conclude that the weight assignment (0, 0, 1, 1) is better than (1, 1, 0, 0) for Iris database classification. With respect to FCM clustering, it is sensitive to the selection of the distance metric. Zhao [12] stated that the Euclidean distance gives good results when all clusters are spheroids of the same size or when all clusters are well separated. In [13, 10], a GK algorithm was proposed which uses the well-known Mahalanobis distance as the metric in FCM; the authors reported that the GK algorithm is better than Euclidean-distance-based algorithms when the shape of the data is considered. In [11], the authors proposed a new robust metric, distinguished from the Euclidean distance, to improve the robustness of FCM. Since FCM's performance depends on the selected metric, it will depend on the feature-weights that are incorporated into the Euclidean distance. Each feature should have an importance degree, which is called its feature-weight. Feature-weight assignment is an extension of feature selection [17]: the latter allows only 0-weight or 1-weight values, while the former can have weight values in the interval [0, 1]. Distance measures are studied and a new one is proposed to handle the different feature-weights, and the new FCM for clustering data objects with different feature-weights is then presented.

Modified Distance Measure for the New FCM Algorithm

Two distance measures are widely used in FCM in the literature: the Euclidean and the Mahalanobis distance. Suppose x and y are two pattern vectors (pattern vectors were introduced in section 3). The Euclidean distance between x and y is:

d^2(x, y) = (x - y)^T (x - y)    (14)

Fig. 3 Random data points of equation (8); blue circles for the data to be clustered and red stars for the testing data

Advantage: the genetic algorithm provides higher resolution capability. The time needed to reach an optimum through the genetic algorithm is less than the time needed by the iterative approach; the genetic algorithm gives better performance and less approximation error in less time. The subtractive clustering parameters (the radius, squash factor, accept ratio and reject ratio) are optimized using the GA. The original FCM proposed by Bezdek is optimized using the GA, and values of the exponent other than m = 2 give less approximation error.

ii) A New Feature-Weighted Fuzzy C-Means Clustering Algorithm

The goal of cluster analysis is to assign data points with similar properties to the same groups and dissimilar data points to different groups [3]. Generally, there are two main clustering approaches, i.e. crisp clustering and fuzzy clustering. In the crisp clustering method the boundary between clusters is clearly defined. However, in many real cases the boundaries between clusters cannot be clearly defined, and some objects may belong to more than one cluster; in such cases, the fuzzy clustering method provides a better and more useful approach [2]. Fuzzy c-means (FCM), proposed by [5] and extended by [4], is one of the most well-known methodologies in clustering analysis. Basically, FCM clustering depends on the measure of distance between samples. In a clustering based on PL and PW, one can see that there is much more crossover between the star class and the point class; it is difficult for us to discriminate the

The Mahalanobis distance between x and a center t (taking into account the variability and correlation of the data) is:

d^2(x, t, C) = (x - t)^T C^-1 (x - t)    (15)

In the Mahalanobis distance measure, C is the covariance matrix. Using the covariance matrix in the Mahalanobis distance takes into account the variability and correlation of the data. To take into account the weight of the features in the calculation of the distance between two data points, we suggest the use

of (x - y)_m (a modified (x - y)) instead of (x - y) in the distance measure, whether Euclidean or Mahalanobis. (x - y)_m is a vector whose ith element is obtained by multiplying the ith element of the vector (x - y) by the ith element of the vector FWA. With this modification, eq. (14) and eq. (15) become:

d^2_m(x, y) = (x - y)_m^T (x - y)_m    (16)

d^2_m(x, t, C) = (x - t)_m^T C^-1 (x - t)_m    (17)

(x - y)_m(i) = (x - y)(i) * FWA(i)    (18)

To calculate the FWA vector we need some clusters over the data set (having these in hand, we can easily calculate the feature estimation index for each feature; see section 3). To obtain these clusters we apply the FCM algorithm with the Euclidean distance on the data set; the created clusters help us to calculate the FWA vector. This step is, in fact, a pre-computing step. In the next and final step, we apply our feature-weighted FCM algorithm on the data set, but here we use the modified Mahalanobis distance in the FCM algorithm.
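The feature-weighted difference vector is simple to compute; a sketch of the weighted Euclidean case is given below. This is an illustrative sketch with a hypothetical helper name; the Iris-style weights follow the (0, 0, 1, 1) assignment discussed earlier.

```python
# Sketch of the modified distance: each coordinate difference is scaled
# by its feature-weight FWA(i) before the squared Euclidean distance is
# taken. FWA values lie in [0, 1].

def weighted_sq_dist(x, y, fwa):
    """Squared feature-weighted Euclidean distance (eq. (16) with (18))."""
    return sum(((a - b) * w) ** 2 for a, b, w in zip(x, y, fwa))

# With weights (0, 0, 1, 1) only the last two features (PL and PW for
# Iris) contribute to the distance.
d = weighted_sq_dist([5.1, 3.5, 1.4, 0.2],
                     [7.0, 3.2, 4.7, 1.4],
                     [0, 0, 1, 1])
print(d)  # approximately 3.3**2 + 1.2**2 = 12.33
```

The Mahalanobis variant of eq. (17) differs only in that the weighted difference vector is additionally multiplied through the inverse covariance matrix.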

We will use this modified distance measure in our algorithm for clustering data sets with different feature-weights. Generally speaking, a feature selection method cannot be used as a feature-weight learning technique, but the inverse is possible. To be able to deal with such cases, we propose a new FCM algorithm that takes into account the weight of each feature in the data set to be clustered. After a brief review of FCM in section 2, a number of feature-ranking methods are described in section 3; these methods will be used in determining the FWA (feature-weight assignment) of each feature. In section 4 distance measures are studied and a new one is proposed to handle the different feature-weights, and in section 5 we propose the new FCM for clustering data objects with different feature-weights.

New Feature-Weighted FCM Algorithm

In this section we propose the new clustering algorithm, which is based on FCM, extends the method proposed by [15] for determining the FWA of features and, moreover, uses the modified Mahalanobis measure of distance, which takes into account the FWA of the features in addition to the variability of the data. As mentioned before, unlike FCM, this algorithm clusters the data set based on the weights of the features. In the first step of this algorithm we calculate the FWA vector using the method proposed in [15]. To do so, we need some clusters over the data set.

The result will be clusters which have two major differences from the clusters obtained in the first step. The first difference is that the Mahalanobis distance is used (fig. 3): the variability and correlation of the data are taken into account in calculating the clusters. The second difference, which is the main contribution of this investigation, is that the feature-weight index plays a great role in shaping the clusters. Advantage: We transformed the values into the FWA vector, whose elements lie in the interval [0,1] and each of which shows the relative significance of its peer feature, and performed clustering on the data set in which the weight of each feature plays a significant role in forming the shape of the clusters.

iii) A Modified Fuzzy C-Mean Clustering Algorithm

The Fuzzy C-Mean clustering algorithm is widely used in data mining technology at present, and is also the

representative fuzzy clustering algorithm. FCM has a complete theory and a profound mathematical basis [1]. However, the common FCM clustering algorithm also has some disadvantages: it generally adopts the Euclidean distance to measure the dissimilarity between objects, so it can only easily discover clusters with convex shapes [2]. On some data sets the clustering cannot achieve a good result when the data are expressed in a Euclidean space. In order to adapt to more modes, the Euclidean dissimilarity

Dij = (xi − xj)^T (xi − xj) (1)

must be replaced. Advantage: We propose two modified programs for the FCM clustering algorithm: one brings the Mahalanobis distance into FCM as the measure space; the other adapts matriculated input vectors in FCM. We put forward a modified algorithm on this basis, i.e. one that can adapt to more data modes by matriculating the data objects; tests and comparisons on different data sets show that the modified FCM clustering algorithm gives a much better clustering result.

iv) An Efficient Fuzzy C-Mean Clustering Algorithm

Many fuzzy clustering methods have been introduced [1]. The fuzzy C-Mean (FCM) algorithm is widely used. It is based on the concept of fuzzy C-partition, which was introduced by Ruspini [2], developed by Dunn [3], and generalized by Bezdek [4,5]. The FCM algorithm and its derivatives have been used very successfully in many applications, such as pattern recognition [6]
, classification [7], data mining [8], and image segmentation [9,10]. It has also been used for data analysis and modeling [11,12]. Normally, the FCM algorithm consists of several execution steps. In the first step, the algorithm selects C initial cluster centers from the original dataset randomly. Then, after several iterations, the result converges to the actual cluster centers. Therefore, if a good set of initial cluster centers is chosen, the algorithm may take fewer iterations to find the actual cluster centers. In [13] the multistage random sampling FCM algorithm is proposed; it is based on the assumption that a small subset of a dataset of feature vectors can be used to approximate the cluster centers of the complete data set. Under this assumption, FCM is used to compute the cluster centers of an appropriately sized subset of the original database. The efficient algorithm for improving FCM is called the partition simplification FCM (psFCM). It is divided into two phases. In phase 1 we first partition the data set into small blocks (unit cells) using the k-d tree method [14] and reduce the original dataset into a simplified dataset with unit blocks, as described in our work [15]. The FCM algorithm measures the quality of the partitioning by comparing the distance from

In accordance with the data type, data can be divided into numeric data and character data; the dissimilarity calculation above is usually suitable only for numeric attributes [4].

Design of the Modified FCM Clustering Algorithm

In the common FCM clustering algorithm, the membership of a data object in a particular cluster is determined by the distance from the object to that cluster center; the membership of each data object indicates the degree to which the object belongs to the cluster. By using the Mahalanobis distance and matriculated input vectors to modify the FCM algorithm, we obtain the main study object of this paper. The objective function is as follows:

J = Σi Σj (uij)^m d2(xj, ci) (2)

where the dissimilarity between objects is defined using the Mahalanobis distance

d2(xj, ci) = (xj − ci)^T Fi^-1 (xj − ci) (3)

Here Fi measures the internal tightness of cluster i; following the standard fuzzy covariance construction, the matrix Fi is defined as

Fi = Σj (uij)^m (xj − ci)(xj − ci)^T / Σj (uij)^m
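A minimal numeric sketch of eq. (3) and the tightness matrix (the fuzzy-covariance form of Fi is an assumption based on the standard construction; the data and memberships are illustrative):

```python
import numpy as np

def fuzzy_covariance(X, center, u, m=2.0):
    # F_i = sum_j u_ij^m (x_j - c_i)(x_j - c_i)^T / sum_j u_ij^m
    um = u ** m
    diff = X - center
    outer = np.einsum('ni,nj->nij', diff, diff)      # per-point outer products
    return (um[:, None, None] * outer).sum(axis=0) / um.sum()

def mahalanobis_sq(x, center, F):
    # d^2(x_j, c_i) = (x_j - c_i)^T F_i^-1 (x_j - c_i)   -- eq. (3)
    d = x - center
    return float(d @ np.linalg.inv(F) @ d)

rng = np.random.default_rng(1)
X = rng.normal(0, 1, (100, 2))
u = np.full(100, 1.0)                 # crisp memberships for illustration
F = fuzzy_covariance(X, X.mean(axis=0), u)
print(mahalanobis_sq(X[0], X.mean(axis=0), F))   # a non-negative distance
```

With all memberships equal to 1, Fi reduces to the (biased) sample covariance of the cluster.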





First, select sample data sets Balance and Artificial, where Balance Scale is a standard data set from UCI and Artificial is an artificial data set. Second, implement the process with the standard FCM algorithm, the MatFCM-for-vectors algorithm and the MatFCM-for-matrices algorithm respectively. Then test the clustering algorithms according to the different clusters. Because the data are not directly visible, we can only evaluate each algorithm's Rec (Reception) and Rej (Rejection) results with the FRC (Fuzzy Relational Classifier) classification algorithm, which is based on clustering and can determine whether the clustering effect is good or bad. Half of the data is used as the training data set and the other half as the test data set; finally, the test results are compared and analyzed to determine the effectiveness of the modified FCM clustering algorithm.

pattern xi to the current candidate cluster center wj with the distance from pattern xi to the other candidate cluster centers. The objective function is an optimization function that calculates the weighted within-group sum of squared errors, as follows [15].

Phase 1: Refine initial prototypes for fuzzy c-mean clustering
Step 1: First partition the dataset into unit blocks using the k-d tree method. The splitting priority depends on the scattered degree of the data values in each dimension: a dimension with a higher scattered degree has a higher priority to be split. The scattered degree is defined by the distribution range and the standard deviation of the feature.
Step 2: After splitting the data set into unit blocks, calculate the centroid of each unit block that contains sample patterns; the centroid represents all sample patterns in that block. All of these centroids then stand for the original patterns. In addition, each centroid carries statistical information about the patterns in its unit block, including the number of patterns in the block (WUB) and the linear sum of all patterns in the block. When the database is scanned the second time, the statistics of each dimension are also found. These statistics are used when the algorithm calculates new candidate cluster centers, which improves the system performance.
Step 3: Initialize the cluster center matrix using a random generator over the dataset, record the cluster centers and set t = 0.
Step 4: Initialize the membership matrix U(0) with the simplified data set.
Step 5: Increase t (i.e., t = t + 1); compute a new (candidate) cluster center matrix W(t).
Step 6: Compute the new membership matrix on the simplified dataset, and then go to phase 2.

Phase 2: Find the actual cluster centers for the dataset
Step 1: Initialize the fuzzy partition matrix U(0) for the full dataset X using the result W(t) from steps (5) and (6) of Phase 1.
Step 2: Follow steps 3 to 5 of the FCM algorithm discussed in Section 2, using the stopping condition.

From the experimental results described in the last section, the proposed psFCM algorithm has approximately the same speedup for patterns with a normal distribution as for a uniform distribution; in general the method works well for most kinds of datasets. In Phase 1 of the psFCM algorithm, the cluster centers found with the simplified dataset are very close to the actual cluster centers. Phase 2 converges quickly when these centers from phase 1 are used as its initial cluster centers: in the experiments, Phase 2 in most cases converges in only a few iterations, taking more only when the stopping condition is smaller. Moreover, the number of patterns used in phase 1 of the proposed algorithm is much smaller than in the FCM algorithm, because Nps << N.
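The two-phase idea can be sketched as follows (a simple grid stands in for the k-d tree unit blocks of [14], and all names and parameters are illustrative):

```python
import numpy as np

def fcm(X, centers, m=2.0, iters=30):
    """FCM iterations from given initial centers; returns final centers."""
    for _ in range(iters):
        d = np.linalg.norm(X[None] - centers[:, None], axis=2) + 1e-12
        U = 1.0 / d ** (2 / (m - 1)); U /= U.sum(axis=0)
        um = U ** m
        centers = um @ X / um.sum(axis=1, keepdims=True)
    return centers

def simplify(X, bins=8):
    """Phase 1 reduction: replace each occupied grid cell by its centroid."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    cells = np.floor((X - lo) / (hi - lo + 1e-12) * bins).astype(int)
    groups = {}
    for key, x in zip(map(tuple, cells), X):
        groups.setdefault(key, []).append(x)
    return np.array([np.mean(g, axis=0) for g in groups.values()])

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(8, 1, (200, 2))])
Xs = simplify(X)                          # simplified dataset (Nps << N)
init = Xs[rng.choice(len(Xs), 2, replace=False)]
rough = fcm(Xs, init)                     # phase 1: centers on simplified data
final = fcm(X, rough, iters=10)           # phase 2: refine on the full data
print(len(Xs) < len(X))                   # True: far fewer patterns in phase 1
```

Phase 2 starts from the phase-1 centers, which is why it typically converges in only a few iterations.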
The psFCM algorithm divides a dataset into several unit blocks. The centroids of the unit blocks replace the patterns and form a new dataset, the simplified dataset. As mentioned in Section 4, the simplified dataset decreases the complexity of computing the membership matrix in every iteration. For a fair comparison, the initialization in phase 1 of the psFCM algorithm and in the FCM algorithm is determined randomly. We have also found in the psFCM algorithm that an initial cluster center selected from a unit block with a higher density is closer to the actual cluster center. This is a feature that cannot be obtained with the FCM algorithm and its derivatives; in future work we will study this feature more thoroughly.

Advantage: An efficient clustering that is better than the FCM algorithm; we reduce the computation cost and improve the performance by finding a good set of initial cluster centers. The index is defined based on the aggregated measure of separation between the classes in terms of class membership functions; the index value decreases with an increase in both the compactness of the individual classes and the separation between the classes. To calculate the feature estimation index we passed through a pre-computing step, a fuzzy clustering using FCM with Euclidean distance. Fuzzy C-Mean is an algorithm for clustering based on optimizing an objective function, and it is sensitive to initial conditions.

v) The Global Fuzzy C-Mean Clustering Algorithm

Many fuzzy clustering methods have been introduced [2]. The Fuzzy C-Mean clustering algorithm is one of the most important and popular fuzzy clustering algorithms. At present the FCM algorithm is extensively used in feature analysis, pattern recognition, image processing, classifier design, etc. ([3][4]). However, the FCM clustering algorithm is sensitive to initialization and can easily fall into a local minimum or a saddle point when iterating. To solve this problem, several other techniques have been developed based on global optimization methods (e.g. genetic algorithms, simulated annealing) [5-7]. In many practical applications, however, the clustering method used is FCM with multiple restarts to escape the sensitivity to the initial value [8]. In the following we describe the proposed global Fuzzy C-Mean algorithm. FCM runs depend on their starting conditions and always converge to a local minimum; we therefore employ FCM as a local search, scheduling runs that differ in the initial positions of the cluster centers. Based on the k-means algorithm, Ref. [9] proposed that if the k−1 centers are placed at the optimal positions for the (k−1)-clustering problem and the remaining kth center is placed at an appropriate position to be discovered, an optimal clustering solution with k clusters can be obtained through local search. Based on this assumption we propose the global Fuzzy C-Means clustering algorithm. Instead of randomly selecting initial values for all cluster centers, as is the case with most global clustering algorithms, the proposed technique proceeds in an incremental way, attempting to optimally add one new cluster center at each stage. More specifically, we start with the fuzzy 1-partition and find its optimal position, which corresponds to the centroid of the data set X.
For the fuzzy 2-partition problem, the first initial cluster center is placed at the optimal position for the fuzzy 1-partition, while the second initial center at execution n is placed at the position of the data point xn (n = 1, …, N). We then perform the FCM algorithm from each of these initial positions respectively, to obtain the best solution for the fuzzy 2-partition. In general, once we have found the solution for the fuzzy (C−1)-partition problem, we perform the FCM algorithm with C clusters from each such initial state.

Comparison of the global Fuzzy C-Means algorithm with FCM and the global k-means algorithm. To validate the insensitivity to initial values and the accuracy of the proposed algorithm, we conducted several experiments on two artificial data sets and three real data sets. The three algorithms (FCM, GKM and GFCM) are compared on five data sets: (i) a synthetic two-dimensional data set consisting of ten clusters, each of 30 points, depicted in Fig. 1(a); (ii) a synthetic two-dimensional data set consisting of fifteen clusters, each of 20 points, depicted in Fig. 1(b); (iii) 63 data points selected from the vowel data set, which consists of nine clusters and is 10-dimensional; (iv) 600 data points selected from the sat-image dataset, which consists of six clusters and is 36-dimensional; (v). The main advantage of the algorithm is that it does not depend on any initial conditions and improves the accuracy of clustering. The algorithm is briefly summarized as follows:

Step 1: Perform the FCM algorithm to find the optimal clustering center v(1) of the fuzzy 1-partition problem, and let obj_1 be the corresponding value of the objective function found by (1).
Step 2: Perform N runs of the FCM algorithm with c+1 clusters, where run n starts from the initial state (v1*, …, vc*, xn), and obtain the corresponding values of the objective function and the clustering centers.
Step 3: Find the minimal value of the objective function obj_(c+1), and let its corresponding clustering centers V(c+1) be the final clustering centers of the fuzzy (c+1)-partition.
Step 4: If c+1 = C, stop; otherwise set c = c+1 and go to step 2.

Advantage: The global Fuzzy C-Mean algorithm is not sensitive to the initial value; its clustering errors and accuracy are stable, and its experimental results are better than those of FCM. The global Fuzzy C-Mean clustering algorithm (GFCM) is a global clustering algorithm for the minimization of the clustering error. This algorithm is an incremental approach to clustering: we obtain an optimal solution for the fuzzy C-partition through a series of local searches (FCM). At each local search we let the optimal cluster centers of the fuzzy (c−1)-partition problem be the first (c−1) initial positions, and an appropriate position within the data space be the remaining cth initial position. The global FCM clustering algorithm does not depend on any initial conditions and effectively escapes the

sensitivity to the initial value, and improves the accuracy of clustering. For each of the data sets presented above we executed each of the three algorithms 10 times and summarize the results in Table I. The experimental results suggest that the global Fuzzy C-Means algorithm and the global k-means algorithm are not sensitive to the initial value, that their clustering errors and accuracy are stable, and that the experimental results of the global Fuzzy C-Means algorithm are better than those of the global k-means algorithm and FCM. To address the disadvantage of the global algorithms' converging speed, we propose the fast global Fuzzy C-Means clustering algorithm, which significantly improves the convergence speed of the global Fuzzy C-Means clustering algorithm; it is very encouraging that the fast global Fuzzy C-Means algorithm, although executing significantly faster, remains effective.
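The incremental scheme above can be sketched as follows (reusing a plain FCM routine as the local search; the data, cluster count and iteration budget are illustrative):

```python
import numpy as np

def fcm(X, centers, m=2.0, iters=40):
    """FCM local search from given initial centers; returns centers and objective."""
    for _ in range(iters):
        d = np.linalg.norm(X[None] - centers[:, None], axis=2) + 1e-12
        U = 1.0 / d ** (2 / (m - 1)); U /= U.sum(axis=0)
        um = U ** m
        centers = um @ X / um.sum(axis=1, keepdims=True)
    obj = float((um * d ** 2).sum())     # approximate objective, last iteration
    return centers, obj

def global_fcm(X, C):
    centers = X.mean(axis=0, keepdims=True)        # step 1: fuzzy 1-partition
    for _ in range(C - 1):                          # add one center per stage
        best = None
        for xn in X:                                # step 2: try every data point
            cand, obj = fcm(X, np.vstack([centers, xn]))
            if best is None or obj < best[1]:
                best = (cand, obj)                  # step 3: keep the minimum
        centers = best[0]
    return centers                                  # step 4: stop at C clusters

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(5, 0.5, (30, 2))])
centers = global_fcm(X, 2)
print(centers.shape)   # (2, 2)
```

Because every stage starts the local searches from the previous stage's optimal centers, the result does not depend on a random initialization.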

XVI. COMPARISON

Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters; it is frequently used in pattern recognition and is based on the minimization of an objective function. The genetic algorithm is a stochastic global search method that mimics natural biological evolution: GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce (hopefully) better and better approximations to a solution. The genetic algorithm provides higher resolution capability; the time needed to reach an optimum through the genetic algorithm is less than the time needed by the iterative approach, and it gives better performance with less approximation error in less time, given the subtractive clustering parameters. In fuzzy weighted c-Means we transformed the values into the FWA vector, whose elements lie in the interval [0,1] and each of which shows the relative significance of its peer feature, and performed clustering on the data set in which the weight of each feature plays a significant role in forming the shape of the clusters. We propose two modified programs for the FCM clustering algorithm: one brings the Mahalanobis distance into FCM as the measure space; the other adapts matriculated input vectors in FCM. We put

XVII. CONCLUSION

In this paper we studied fuzzy c-means, then its different variants, and compared them. Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters; it is frequently used in pattern recognition and is based on the minimization of an objective function. The genetic algorithm is a stochastic global search method that mimics natural biological evolution: GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce better and better approximations to a solution. The genetic algorithm provides higher resolution capability, reaches an optimum in less time than the iterative approach, and gives better performance with less approximation error, given the subtractive clustering parameters. In fuzzy weighted c-Means we transformed the values into the FWA vector, whose elements lie in the interval [0,1] and each of which shows the relative significance of its peer feature, and performed clustering on the data set in which the weight of each feature plays a significant role in forming the shape of the clusters. We propose two

modified programs for the FCM clustering algorithm: one brings the Mahalanobis distance into FCM as the measure space; the other adapts matriculated input vectors in FCM. We put forward a modified algorithm on this basis, i.e. one that can adapt to more data modes; tests and comparisons on different data sets show that the modified FCM clustering algorithm gives a much better clustering result. The psFCM algorithm yields an efficient clustering that is better than the FCM algorithm: we reduce the computation cost and improve the performance by finding a good set of initial cluster centers. The feature estimation index is defined based on the aggregated measure of separation between the classes in terms of class membership functions; its value decreases with an increase in both the compactness of the individual classes and the separation between the classes. The global Fuzzy C-Mean algorithms are not sensitive to the initial value, their clustering errors and accuracy are stable, and the experimental results of the global Fuzzy C-Mean algorithm (GFCM) are better than those of FCM. GFCM is a global clustering algorithm for the minimization of the clustering error; it is an incremental approach to clustering through which we can obtain an optimal solution for the fuzzy C-partition via a series of local searches.
REFERENCES
[1] T. Kwok, R. Smith, S. Lozano, and D. Taniar, "Parallel fuzzy c-means clustering for large data sets".
[2] F. Hoppner, F. Klawonn, R. Kruse and T. Runkler, "Fuzzy Cluster Analysis".
[3] M. C. Clark, L. O. Hall, "MRI segmentation using fuzzy clustering techniques integrating knowledge".
[4] Y. W. Lim, S. U. Lee, "On the color image segmentation algorithm based on the thresholding and the fuzzy c-means techniques".
[5] Li-Xin Wang, "A Course in Fuzzy Systems and Control", Prentice-Hall, Upper Saddle River, NJ, 1997: 342-353.
[6] J. C. Dunn, "A Fuzzy Relative of the ISODATA Process and its Use in Detecting Compact Well-Separated Clusters", Journal of Cybernetics 3, 1973: 32-57.
[7] J. C. Bezdek, "Pattern Recognition with Fuzzy Objective Function Algorithms", Plenum Press, NY, 1981.
[8] C. S. Beightler, D. J. Phillips and D. J. Wild, "Foundations of Optimization" (2nd ed.), Prentice-Hall, Englewood Cliffs, NJ, 1979.
[9] R. Luus and T. H. I. Jaakola, "Optimization by Direct Search and Systematic Reduction of the Size of Search Region", AIChE Journal 1973; 19(4): 760-766.
[10] L. O. Hall, A. M. Bensaid, L. P. Clarke, et al., 1992, "A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain", IEEE Trans. Neural Networks 3.
[11] M. Hung, D. Yang, 2001, "An efficient fuzzy c-means clustering algorithm", in Proc. 2001 IEEE International Conference on Data Mining.
[12] J. Han, M. Kamber, 2001, "Data Mining: Concepts and Techniques", Morgan Kaufmann Publishers, San Francisco.
[13] S. K. Pal and A. Pal (Eds.), 2002, "Pattern Recognition: From Classical to Modern Approaches", World Scientific, Singapore.
[14] J. V. de Oliveira, W. Pedrycz, 2007, "Advances in Fuzzy Clustering and its Applications", John Wiley & Sons.
[15] X. Wang, Y. Wang and L. Wang, 2004, "Improving fuzzy c-means clustering based on feature-weight learning".
[16] J. S. Zhang, Y. W. Leung, "Improved possibilistic c-means clustering algorithm", IEEE Trans. on Fuzzy Systems, 2004, 12(2), pp. 209-227.
[17] N. Zahid, O. Abouelala, M. Limouri, A. Essaid, "Fuzzy clustering based on K-nearest-neighbours rule", Fuzzy Sets and Systems, 2001, 120(1), pp. 239-247.
[18] Xin-Bo Gao, "The Analysis and Application of Fuzzy Clustering", XiDian University Press, 2004.
[19] Jia-Wei Han, Micheline Kamber, "Data Mining: Concepts and Techniques", Mechanical Industry Press, 2002.
[20] F. Hoppner, F. Klawonn, R. Kruse, and T. Runkler, "Fuzzy Cluster Analysis", Wiley Press, New York, 1999.
[21] M. R. Rezaee, P. M. J. Zwet, B. P. E. Lelieveldt, R. J. Geest and J. H. C. Reiber, "A multiresolution image segmentation technique based on pyramidal segmentation and fuzzy clustering".
[23] P. Teppola, S. P. Mujunen, and P. Minkkinen, "Adaptive fuzzy c-means clustering in process monitoring", Chemometrics and Intelligent Laboratory Systems.
[24] X. Chang, W. Li and J. Farrell, "A c-means clustering based fuzzy modeling method".


Security Issues In Data Mining

Abstract— In this article we discuss our research in developing general and systematic methods for intrusion detection. The key ideas are to use data mining techniques to discover consistent and useful patterns of system features that describe program and user behavior, and to use the set of relevant system features to compute (inductively learned) classifiers that can recognize anomalies and known intrusions. The paper also discusses the current level of computer security development in Tanzania, with particular interest in IDS application, given that the approach is easy to implement with little added complexity to the computer system architecture, has less dependence on the operating environment (as compared with other security-based systems), and can detect abuse of user privileges easily. The findings are geared towards developing security infrastructure and providing ICT services.

Index Terms— computer security; data mining; security; intrusion detection; ICT


In the last decade there have been great advances in data mining research, and many data mining methods have been applied to everyday business, such as market basket analysis, direct marketing and fraud detection. If we want to find something unusual about a system, we need to know something about the expected behavior of the system, the behavior of its functionalities, and whether additional, unwanted functionalities have been introduced. R. L. Grossman, in "Data Mining: Challenges and Opportunities for Data Mining During the Next Decade", defines data mining as being concerned with uncovering patterns, associations, changes, anomalies, and statistically significant structures and events in data. Simply put, it is the ability to take data and pull from it patterns or deviations which may not be seen easily by the naked eye. The recent rapid development in data mining has made available a wide variety of algorithms, drawn from the fields of statistics, pattern recognition, machine learning, and databases. Several types of algorithms are particularly relevant to our research. We provide an overview

in general of what intrusion detection is, for better understanding, before we implement our approach. Data mining techniques can be used to compute the intra- and inter-packet record patterns, which are essential in describing program or user behavior. The discovered patterns can guide the audit data gathering process and facilitate feature selection.

A. Why Data Mining for Security?
Applications of data mining in computer security concentrate heavily on the area of intrusion detection. The reason for this is twofold. First, the volume of data dealing with both network and host activity is so large that it makes an ideal candidate for data mining techniques. Second, intrusion detection is an extremely critical activity. An ideal application in intrusion detection is to gather sufficient normal and abnormal audit data for a user or a program, then apply a classification algorithm to learn a classifier that will label (future) audit data as belonging to the normal class or the abnormal class.

B. Sequence Analysis
Approaches that model sequential patterns have been applied in our manuscript. These algorithms can help us understand what (time-based) sequences of audit events are frequently encountered together. These frequent event patterns are important elements of the behavior profile of a user or program. We are developing a systematic framework for designing, developing and evaluating intrusion detection systems. Specifically, the framework consists of a set of environment-independent guidelines and programs that can assist a system administrator or security officer to:
a) select appropriate system features from audit data to build models for intrusion detection;
b) architect a hierarchical detector system from component detectors;
c) update and deploy new detection systems as needed;
d) understand the in-norm behavior patterns, which are unique for each system and which

make the IDS itself unique as well. This uniqueness in turn makes the intrusion detection system more resistant to attacks.

XX. METHODOLOGY
Prior work has shown the need for better security tools to detect malicious activity in networks and systems. These studies also propose the need for more usable tools that work in real contexts [3, 4]. To date, however, there has been little focus on the preprocessing steps of intrusion detection. We designed our study to fill this gap, as well as to further the understanding of IDS usability and utility, particularly as IDSs are installed and configured in an organization. Consequently, our research questions were:
a. What do security practitioners expect from an IDS security mechanism?
b. What difficulties do security practitioners face when installing and configuring an IDS and security mechanism?
c. How can the usability of IDS security mechanisms be improved?
We used a qualitative approach to answer these questions, relying on empirical data from security practitioners who have experience with IDSs in real environments. Below we detail our data sources and analysis techniques.

A. Data Collection
We collected data from two different sources. First, we conducted semi-structured interviews with security practitioners. Second, we used participatory observation, an ethnographic method [5], to both observe and work with two senior security specialists who wanted to implement an IDS in their organization. These two sources of data allowed us to triangulate our findings; the descriptions from interviewees about the usability of IDSs were complemented by the richer data from the participatory observation.

B. Data Analysis
The data from the interviews and participatory observation were analyzed using qualitative description with constant comparison and inductive analysis. We first identified instances in the interviews when participants described IDSs and security activities in the context of the activities they had to perform.
We next contrasted these descriptions with our analysis of the participatory observation notes. These notes were coded iteratively, starting with open coding and continuing with axial and theoretical coding [7].

XXI. MOTIVATING APPLICATION
A file server can utilize associations discovered between requests to its stored files for predictive flowing. For instance, if it discovers that a file A is usually requested after a file B of the same size, then flowing of packet A after a request for B could reduce the latency, especially if this pairing holds in a large percentage of file requests. A detection mechanism needs to know which packets, and of what size, are usually being transmitted through it. It could be beneficial to know, for example, the flowing sequence and size of the packets, so as to detect and control unusual flowing or capturing behavior. The detection filter can reduce its response time by allowing only the authorized traffic (from and to the internal network). Results were then organized by the challenges that the participants faced when deploying and maintaining an IDS system.
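The normal/abnormal classification idea described in Section A can be sketched as follows (the feature vectors and the nearest-centroid model are invented for illustration; they stand in for an inductively learned classifier over audit data):

```python
import numpy as np

# Toy audit records: [connections/min, failed logins, bytes transferred (KB)]
# Labels: 0 = normal, 1 = abnormal. All values are invented for illustration.
X = np.array([[ 5, 0,  20], [ 7, 1,  35], [ 6, 0,  25],    # normal sessions
              [80, 9, 900], [95, 7, 850], [70, 8, 990]])   # abnormal sessions
y = np.array([0, 0, 0, 1, 1, 1])

# Nearest-centroid classifier standing in for an inductively learned model
centroids = np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def classify(record):
    d = np.linalg.norm(centroids - np.asarray(record), axis=1)
    return int(np.argmin(d))   # 0 -> normal, 1 -> abnormal

print(classify([6, 0, 30]))    # 0: resembles the normal audit data
print(classify([90, 8, 870]))  # 1: resembles the intrusion records
```

A real deployment would learn from far richer audit features, but the decision structure — label future audit data by proximity to learned behavior profiles — is the same.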

From the mechanism, for outside (inter) and inside (intra) traffic, two default decisions are possible:
a. Default = discard: that which is not expressly permitted is prohibited.
b. Default = forward: that which is not expressly prohibited is permitted.
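The two default policies can be sketched as follows (the rule sets and packet fields are invented for illustration):

```python
# Sketch of the two default policies of a packet filter.
# Rule sets and packet fields are invented for illustration.
ALLOWED = {("10.0.0.0/8", 443), ("10.0.0.0/8", 22)}   # expressly permitted
PROHIBITED = {("0.0.0.0/0", 23)}                       # telnet expressly banned

def default_discard(packet):
    """Default = discard: forward only what is expressly permitted."""
    return (packet["net"], packet["port"]) in ALLOWED

def default_forward(packet):
    """Default = forward: discard only what is expressly prohibited."""
    return (packet["net"], packet["port"]) not in PROHIBITED and \
           ("0.0.0.0/0", packet["port"]) not in PROHIBITED

pkt = {"net": "10.0.0.0/8", "port": 80}
print(default_discard(pkt))   # False: port 80 is not expressly permitted
print(default_forward(pkt))   # True: port 80 is not expressly prohibited
```

The same packet is dropped under one policy and passed under the other, which is exactly the trade-off between the two defaults.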

XXII. SECURITY SUPPORT AND EVALUATION
Despite the fact that the security specialists had tried to simplify the deployment of some of the security techniques, such as the IDS, by limiting its purpose, the IDS integration proved to be a challenging task due to a number of organizational constraints. For example, to connect the IDS in place, the specialists needed to have available ports (at least two in the case of the

IDS used during the participatory observation). We looked for security techniques that were not only easy to use but also gave relevant information about the security of the organization's systems. Consequently, the ideal situation would have been to install the IDS and antivirus software in the most critical network domains of the different organizations, to generate meaningful reports about the security level of the networks with a minimal use of resources. However, we found that this did not occur; as discussed, organizational factors such as the distribution of IT responsibilities affected the decision not to involve critical networks, due to the corresponding overhead of involving multiple administrators.

A key concept that appears in both the authentication and confidentiality mechanisms for transmitted packets is the security association (SA). An association is a one-way relationship to the traffic; if a peer relationship is needed for a two-way secure exchange, then two security associations are required. A security association is uniquely identified by the following parameters:
1. Sequence counter overflow: a flag indicating whether overflow of the sequence number counter should generate an auditable event and prevent further transmission of packets on this security association (SA).
2. Anti-replay number of packets: used to determine whether an inbound packet is a replay.
3. Lifetime of the security association: a time interval or packet count after which an SA must be replaced with a new SA or terminated, plus an indication of which of these actions should occur.
4. Path: any observed path maximum transmission unit (the maximum size of a packet that can be transmitted without fragmentation) and aging variables.
Because the packets use a connectionless, unreliable service, the protocol did not guarantee that packets would be delivered in order, nor that all packets would be delivered, when we tested our approach.
Therefore the receiver should implement a window of packets, with a default of NP = 60. The right edge of the window represents the highest sequence number, NP, received so far for a valid packet. For any packet with a sequence number in the range from NP − TP + 1 to NP (where TP is the Total Packets window size) that has been correctly received (i.e., properly authenticated), we authorize the traffic. The findings are presented experimentally in Figure 2 below.
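The windowed anti-replay check described above can be sketched as follows. This is an illustrative implementation under the text's NP = 60 default; it assumes the packet has already been authenticated, and the data structure is an assumption rather than the paper's code.

```python
def check_replay(seq, state, window=60):
    """Sliding-window anti-replay check: the right edge of the window is
    the highest sequence number received for a valid packet; packets
    inside the window are accepted once, packets left of it are rejected."""
    if seq > state["top"]:                 # new high mark: advance the window
        state["seen"] = {s for s in state["seen"] if s > seq - window}
        state["top"] = seq
        state["seen"].add(seq)
        return True
    if seq <= state["top"] - window:       # left of the window: reject
        return False
    if seq in state["seen"]:               # duplicate inside the window: replay
        return False
    state["seen"].add(seq)
    return True

state = {"top": 0, "seen": set()}
check_replay(5, state)     # first arrival: accepted
check_replay(5, state)     # same sequence number again: rejected as a replay
```

Accepted sequence numbers are remembered only within the current window, so the state stays bounded regardless of traffic volume.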

Effects of Number of Packets on Misclassification Rates
To understand the effects of the time interval on the misclassification rates, we ran the experiments using various time intervals: 5s, 10s, 30s, 60s, and 90s. The effects on the outgoing and inter-LAN traffic were very small. However, as Figure 2 shows, for the incoming traffic the misclassification rates on the intrusion data increase dramatically as the time interval goes from 5s to 30s, then stabilize or taper off afterwards.

XXIII. CONCLUSION
In this paper we have proposed an intrusion detection method that employs data mining techniques. We believe that a periodic, comprehensive evaluation of the IDS presented could be valuable for acquisition managers, security analysts, and R&D program managers. The Internet does not recognize administrative borders, which makes it an attractive option for people with criminal intent. The accuracy of the proposed detection model depends on sufficient training data and the right feature set. We suggested that association rules and frequent-packet detection can be used to compute consistent patterns from audit data. With security technology still in its infancy at many sites, the intrusion detection approach presented could be a place to begin. Preliminary experiments using the presented techniques on LAN data in the region have shown promising results.



REFERENCES
[1] E. Kandogan and E. M. Haber. Security administration tools and practices. In Security and Usability: Designing Secure Systems that People Can Use, chapter 18, pages 357-378. O'Reilly Media, Inc., Sebastopol, 2005.
[2] D. Botta, R. Werlinger, A. Gagné, K. Beznosov, L. Iverson, S. Fels, and B. Fisher. Towards understanding IT security professionals and their tools. In Proc. of ACM Symposium on Usable Privacy and Security (SOUPS), pages 100-111, Pittsburgh, Pennsylvania, July 18-20, 2007.
[3] D. M. Fetterman. Ethnography: Step by Step. Sage Publications Inc., 1998.
[4] M. Sandelowski. Whatever happened to qualitative description? Research in Nursing & Health, 23(4):334-340, 2000.
[5] K. Charmaz. Constructing Grounded Theory. SAGE Publications, 2006.
[6] J. Yu, Z. Chong, H. Lu, and A. Zhou. False positive or false negative: Mining frequent itemsets from high speed transactional data streams. In Proceedings of the 30th VLDB International Conference on Very Large Data Bases, pages 204-215, 2004.
[7] Wu et al. Top 10 algorithms in data mining. Springer-Verlag London, 2007.
[8] Yin. Scalable mining and link analysis across multiple database relations. SIGKDD, volume 10, issue 2, 2008.
[9] Z. Jiuhua. Intrusion detection system based on data mining. In Knowledge Discovery and Data Mining (WKDD 2008), First International Workshop, 23-24 Jan. 2008, pages 402-405.


Web Caching Proxy Services: Security and Privacy issues

Abstract: A Web proxy server is a server that handles HTTP requests from clients. If the clients belong to a common organization or domain, or exhibit similar browsing behavior, the proxy can effectively cache requested documents. Caching, which migrates documents across the network closer to the users, reduces network traffic, reduces the load on popular Web servers, and reduces the time that end users wait for documents to load. A proxy server accepts requests from clients; when possible and desired, it generates replies from documents stored in its local cache; otherwise, it forwards the requests, transfers the replies to the clients, and caches them when possible. WWW clients perceive improved performance when responses to requested pages are served from the cache of a proxy server, resulting in faster response times after the first document fetch.

XXV. OVERVIEW
A Web proxy works as an intermediary between a browser and the Web itself [2]. A browser configured to use a proxy sends all of its URL requests to the proxy instead of sending them directly to the target Web server. In addition to funneling requests, proxies can also implement a caching mechanism, in which the proxy returns a cached version of the requested document for increased speed [3, 4]. Many companies, governmental agencies, and universities make their proxies mandatory by blocking direct access to the Internet. Proxies typically handle both clear-text and SSL-encrypted traffic, and users can configure their browser for either or both situations (caching works only with clear-text traffic). Web proxies are sometimes coupled with Web accelerators, which attempt to further reduce the time

required to display a requested Web page by predicting what page the user will request next [5]. These accelerators work on the server side, on the proxy, or on the client side, but in all three cases the aim is to send the Web page to the client before it is requested. Server-side accelerators push pages to clients, whereas proxy and client-side accelerators pull pages from the server. Client-side accelerators require a local component (the client) to be installed on the user's machine. In this article, we are interested in client-side accelerators for which HTTP traffic goes from the browser to the local client and then to the proxy, which either returns a cached copy of the requested document or fetches it from the site to which it belongs. Some Web accelerators add a local cache on the client (so that pages are cached at three different levels: the browser, the client, and the proxy), some prefetch the pages they anticipate will be requested next, and some perform compression between the client and the proxy server. Examples of freeware Web accelerators include Fasterfox. Security and privacy issues are common on the Internet, and many of the threats this article identifies aren't unique to Web proxies. Network architecture, the widespread use of clear-text messages, the accumulation of sensitive data by various actors, flawed software, gullible and imprudent users, and unclear yet constantly changing and globally inconsistent legal protections are just some of the well-known sources of Internet security problems [1]. The security and privacy implications of installing a Web proxy are an excellent illustration of this general problem, but ideally this article will increase awareness about the issue beyond this specific example. In today's fully connected working environment, individuals and companies often choose to use new software and services without anticipating all the possible side effects.
Employees, and indeed entire corporations, often add tools to their environments or use services that promise various

benefits by tapping into a vast offering of software freely available on the Internet from both unknown and well-known companies. Inevitably, this leads to problems: someone in an organization might use a free Web-based email system, for example, thereby exposing confidential documents over the Internet, or an employee might install a file-searching tool without realizing that it sends results outside the company's firewall. From a security and privacy viewpoint, such decisions' ramifications are often overlooked: some are easy to anticipate, but others are more difficult to foresee and stem from a combination of factors. In this article, I review the security and privacy implications of one such decision: installing an externally run, accelerated Web proxy. I assume a worst-case scenario in which this decision is widespread (that is, the Web proxy has a large user base). I also consider three different levels of impact: on a proxy user, on an organization whose employees use a proxy, and on a Web site owner whose site is accessed through a proxy.

Figure 1 shows users accessing a Web page with and without a proxy. Requests without a proxy go directly to the target Web server (pink path), but proxy requests go first to the proxy and then to the target Web server (red paths). If the proxy has a cached copy of the requested page, it might return that copy to avoid extra network activity.

XXVI. WEB CACHING (PROXY SERVER)
The proxy server's main goal is to satisfy a client's request without involving the original Web server. It is a server acting as a buffer between the client's Web browser and the Web server. It accepts requests from users and responds to them if it holds the requested page. If it does not have the requested page, it requests it from the original Web server and then responds to the client. It uses a cache to store pages; if the requested Web page is in the cache, it fulfills the request quickly.
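The serve-from-cache-or-fetch behavior described above can be sketched in a few lines. This is a minimal illustration, not a production proxy: the origin fetch is injected as a callable so the logic is visible, and a real proxy would also honor expiry headers and handle errors.

```python
class CachingProxy:
    """Buffer between client and origin server: serve from the cache
    when the page is held, otherwise fetch from the origin and cache it."""
    def __init__(self, fetch):
        self.fetch = fetch   # callable that contacts the original Web server
        self.cache = {}

    def get(self, url):
        if url in self.cache:              # cache hit: fast path
            return self.cache[url], "HIT"
        body = self.fetch(url)             # cache miss: ask the origin
        self.cache[url] = body
        return body, "MISS"

# usage with a stand-in origin server
origin = lambda url: b"<html>page</html>"
proxy = CachingProxy(origin)
proxy.get("http://example.test/a")   # first fetch: MISS, contacts the origin
proxy.get("http://example.test/a")   # repeat request: HIT, origin not contacted
```

Only the first request for a document reaches the origin server; every later request is answered from the proxy's cache, which is exactly the source of the faster response times the abstract mentions.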







Figure 1. Internet users accessing a target Web site. Their requests are routed through their ISP, from node to node (black circles) to the target. The users with a Web proxy have their requests routed to the Web proxy first (red paths), whereas the users who don't have a proxy access the Web site directly (pink path).

The two main purposes of a proxy server are:
1. Improve performance: the proxy saves results for a particular period of time. If the same result is requested again and is present in the proxy server's cache, the request can be fulfilled in less time, which drastically improves performance. The major online search services use arrays of proxy servers to serve large numbers of Web users.
2. Filter requests: proxy servers can also be used to filter requests. If a company wants to restrict users from accessing a specific set of sites, this can be done with the help of proxy servers.
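The second purpose, filtering requests, amounts to a blocklist check at the proxy before any forwarding happens. A minimal sketch (the site names are hypothetical):

```python
# Hypothetical organizational blocklist, maintained by the administrator.
BLOCKED_SITES = {"badsite.example", "timewaster.example"}

def filter_request(host, blocked=BLOCKED_SITES):
    """Refuse to forward requests whose destination host is blocklisted;
    everything else is passed on to the origin server."""
    return "403 Forbidden" if host in blocked else "forward"

filter_request("badsite.example")   # blocklisted: refused
filter_request("news.example")      # not blocklisted: forwarded
```

Because every browser in the organization is configured to use the proxy, this single checkpoint enforces the policy for all users.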

XXVII. WEB PROXY SECURITY
A Web proxy is externally run if the service is controlled entirely by a third party with no connection to the end user, the end user's employer, or the end user's ISP. This situation is typical of Web-based services, which external corporations often provide freely and without contractual obligations.

XXXI. CONTENT DOWNLOADS
Another concern for users is the storage of unwanted content on their machines, which is most commonly found in accelerated Web proxy client-server architectures. (Classic Web proxies usually don't have a client program running on the user's machine, so this concern doesn't apply to them.) As part of prefetching, the proxy service can download pages to the user's machine that he or she never requested, which puts that person in the difficult position of having downloaded potentially offensive or inappropriate documents. This concern is often mitigated by the fact that the accelerator's client handles the prefetched pages, not the browser, so someone competent enough to know the difference might sort out the problem for the user. However, this doesn't help if the user's activity is directly monitored over the network, which is common practice in many companies. It's still possible to find out whether the pages were, in fact, automatically prefetched, but doing so requires specific technical skills and an understanding of the accelerated Web proxy's inner workings. In the best cases, the HTTP headers that the accelerator sends contain something specific, in which case a network administrator can capture and scan request content to identify prefetching requests. If requests from the accelerator are identical to requests from the client's Web browser, a detailed analysis of the client's system log can distinguish prefetched from requested pages.
Note that, conversely, users can also intentionally have offensive pages prefetched so that they can read them freely while claiming not to have downloaded those pages themselves.

XXVIII. WEB PROXIES AND USERS
To evaluate the security implications of using an externally run Web proxy, let's first look at the consequences from the end user's viewpoint. Most casual users don't have a clear understanding of the system's workings or the implications of using it, even if the end-user license agreement they accepted when installing or configuring the product mentions privacy considerations. Consequently, these users end up putting their own privacy and security at risk.

XXIX. PRIVACY
A proxy owner can store and analyze any unencrypted traffic that goes through its Web proxy (pages read, links followed, forms filled out, and so on) [9], a situation that's especially unavoidable when that owner is also the connectivity provider (such as a company, a university, or an ISP). This situation is similar to a node mining unencrypted plain-text traffic flowing over the Internet between a source and a destination, but the difference is that information in a Web proxy is visible to a third party that otherwise wouldn't have access to it. Although the proxy owner can gather only a limited amount of information over an encrypted channel, users aren't protected from the accumulation of personal data during unencrypted browsing activities or when accessing Web sites that should use encryption but don't. Moreover, users can't directly detect such information-gathering activities.

XXXII. WEB PROXIES AND ORGANIZATIONS
The use of Web proxies can also affect entire organizations. When employees start using an externally run proxy without the organization's knowledge or without understanding the consequences of their actions, they can expose some of the organization's confidential activities or bypass some of its rules.

XXXIII. USER ACTIVITIES
If enough users in an organization use an externally run proxy, then that proxy's owner accumulates an enormous amount of information about the activities within the organization. Obviously, much of it might be inconsequential, but targeted data mining can reveal important information about an organization's technical choices, current prospects, and so forth.

XXX. DECEPTION
Next, let's consider what proxies can send to users instead of what users can send through proxies. The data a proxy sends is completely under the proxy's control, which means it can modify the pages viewed without any obvious way for the user to detect those changes. For example, a proxy could claim that a particular site is temporarily unavailable, send back an older version of a page, or return a version that never actually existed. Ultimately, the user's view of the Web is under the proxy owner's control, so the user can be deceived or manipulated at will.

Conversely, the company controlling a Web proxy might withhold, delay, or alter information sent back to users, thereby sending them down the wrong path or driving them away from important opportunities. Again, this situation isn't unique: an organization's ISP can do the same thing. However, if users operate a proxy without organizational consent, the organization hasn't explicitly chosen to use that proxy and didn't enter into a contractual agreement with the company offering the service.

XXXIV. ORGANIZATIONAL CONTROL
An externally run Web proxy also has implications for organizational control because a proxy can bypass or defeat system checks. If the system in place is meant to block access to a list of forbidden sites by scanning the destination at the IP level, for example, it won't be able to trigger any hits because outgoing requests go to the proxy, not the target URL. Thus, the controlling tool must be modified to analyze the targets in HTTP headers, because they contain the actual end destination. If the system in place scans incoming documents for forbidden patterns, it can also fail if the accelerated Web proxy modifies the encoding in some unexpected way, for example by compressing messages. Such modifications circumvent an organization's attempts to control user activities for productivity or other purposes; public libraries and schools might have to control content, but a proxy can forestall compliance plans. Of course, control is still possible (the site running the proxy can itself be banned), but the mechanisms in place must be adapted to handle this new situation, which comes at a price and requires a level of technical expertise that might be difficult to find in any given organization. Using proxies to circumvent checks is hardly new: technically savvy users routinely use similar approaches to bypass filtering. However, casual users can also employ evasive techniques without intending to do so. XXXV.
INTERNAL INFORMATION DISCLOSURE
An additional concern is the possibility of leaking internal network information to the company running the Web proxy. If user requests go to the proxy first before being processed locally, then information about document names and internal network topology can leak out. (This doesn't necessarily give the proxy owner access to those documents, though.) A similar concern is the disclosure of information that's secret merely because nothing betrays its existence. Obviously, this practice is extremely insecure in the first place, but the problem worsens with an externally run Web proxy. If a user releases a document to another user by dropping it on the organization's Web site, for example, then using an externally run Web proxy discloses that document to the company running the proxy (in this case, access to the document is possible).

XXXVI. WEB PROXIES AND WEB SITE OWNERS
Centrally controlled Web proxies can have a substantial impact on Web site owners and application developers as well. Such a proxy blurs the picture of a site's actual usage and can interfere with the site's applications and its content delivery.

Content Control
If a large portion of a site's visitors use the same Web proxy provider, then the Web content provider should be concerned about the proxy provider's ability to alter or withhold some or all of the content being published. (This is technically no different from what any host between a Web site and its users can already do, but in this case, the proxy provider is artificially included in a path with a potentially large number of users.) This behavior is considered malicious if the proxy owner (or a hacker) is intentionally interfering with Web pages, but it can also be a technical side effect; for example, an excessive caching time at the proxy level might jeopardize the Web site owner's efforts to provide rapidly changing content. Connectivity problems at the proxy level also affect Web site providers.
Site-Usage Tracking
Detailed site-usage statistics are crucial to most content providers: how many visitors viewed what information, what patterns did those visitors follow, and when and from where did they access the Web site? This information is important for maintaining and improving a site, but it can also be a source of additional revenue in the form of advertising. If a large portion of Internet users goes through a Web proxy, it jeopardizes the ability to track usage. The first problem is request origin: requests mostly stem from a single source, the Web proxy. The second problem is page caching: when a popular Web proxy service delivers a cached version of a document, the traffic on the original site drops significantly. This differs from the situation in which a multitude of small proxies scattered across the Internet cache various pages, where a good ratio of actual accesses to the site would be maintained (instead of one proxy serving a very large user base). The third issue is prefetching: with automatically prefetched documents, it's difficult for Web site owners to make sense of a visit.

Web Applications

Page prefetching also sometimes interferes with Web-based applications because it involves automatically following links in the currently displayed page. If a link triggers an action in the application (such as logout or delete), then page prefetching can automatically trigger these actions and thus break the application. However, it's arguably safe to prefetch GET-based links for applications compliant with the HTTP 1.1 specification because, according to the specification, GET methods should not have the significance of taking an action other than retrieval, and thus shouldn't trigger any action in the application.

A Real-World Example
Google launched its Google Web Accelerator (GWA), a freely distributed accelerated Web proxy. GWA is a good example of an externally run, accelerated Web proxy controlled entirely by a third party with no connection to end users, their employers, or their ISPs. HTTP proxies and caches are common, but they're scattered across the Internet, each managing relatively few users. A GWA of Google-sized proportions changes the situation because a great many users have their traffic proxied by this one source; GWA is thus an excellent real-world test bed because of its potential scale. Of course, I don't claim that GWA poses all the threats discussed in this article, and I certainly don't suggest that Google's aim was to create any of these issues; I merely use GWA to illustrate the fact that the problems outlined here could materialize. If we analyze GWA with respect to the threats outlined earlier, we should first note that even if the service were pre-installed on a user's machine, the current version advertises itself very clearly, so the user would likely notice a normal installation. Moreover, the user can configure prefetching and local caching behavior for GWA. In the versions I tested (googlewebaccclient, version 0.2.
62.80-pintail.a; googlewebaccwarden.exe; and googlewebacctoolbar.dll, in Firefox 1.5 on Windows XP SP2) [12], prefetching is enabled and local caching always checks for a newer version of the page by default; there's no obvious way to disable local caching entirely, although the user can delete the cache content. Citing security reasons, GWA doesn't handle SSL-encrypted traffic at all. Google makes no secret of storing the data sent through GWA. Google also says that it temporarily caches cookies to improve performance, but it doesn't provide any details about how long cached data is kept. GWA does compress data between the server and the client, which means that some GWA users could bypass some of the filters put in place by their organizations. On the positive side, the version we tested doesn't route internal information (that is, accesses to machines by name or via a nonroutable IP), so the content disclosure concern discussed earlier doesn't apply here. GWA also deals with caching issues by letting content providers specifically request that their pages not be cached. However, this assumes that GWA indeed doesn't cache the page, which the user or content provider can't control; it also means that site providers must disable page caching entirely, regardless of the caching source. Google's connectivity to the Internet is obviously outstanding, so in practice, it's likely that having connectivity through GWA is actually an improvement for the vast majority of content providers. (That said, because of its high exposure, Google is more likely than most content providers to be the target of various network-level attacks, as we'll discuss later.) Google has taken steps to prevent the problems of activity tracking to some degree; for example, it adds the real origin as part of its request, so a modified log analyzer can retrieve real sources of hits.
However, this doesn't help if GWA returns a cached version of the page (although Web site owners can get around this by requesting that the page not be cached at all). GWA also adds a special HTTP header, x-moz: prefetch, to its prefetching requests so that webmasters can configure the server to return a 403 HTTP response code (access forbidden) if they want to deny such requests. However, this places a large burden on technical teams, who now have to modify their Web servers and log analyzers to adapt to the decisions made by a single company. Moreover, they must follow GWA's evolution to maintain their tools and Web sites.

XXXVII. POTENTIAL WEAKNESS
From a reliability viewpoint, one potential weakness of using a popular Web proxy service is that it creates a single point of failure. If numerous Internet users subscribe to the same service, an attacker might see the Web proxy as a target of choice because attacking it has a formidable global impact. This problem is common to any popular service, but it's exacerbated by the level of control over users' Web sessions that Web proxies provide. Although many Web sites might be affected, only the proxy site's owner can assess such a crucial site's overall security. Other situations present similar threats, not directly linked to Web proxies as such, but as consequences of the way Web proxies work.
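The deny-prefetch idea above, returning 403 when the x-moz: prefetch header is present, can be sketched as WSGI middleware. This is an illustrative sketch, not a recommended production configuration; the wrapper name and response body are assumptions.

```python
def deny_prefetch(app):
    """WSGI middleware: refuse requests carrying 'x-moz: prefetch' with a
    403, so prefetching neither distorts usage statistics nor triggers
    application actions. WSGI exposes the header as HTTP_X_MOZ."""
    def wrapper(environ, start_response):
        if environ.get("HTTP_X_MOZ", "").lower() == "prefetch":
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Prefetch requests are not served."]
        return app(environ, start_response)
    return wrapper
```

Wrapping an application is a one-liner (`app = deny_prefetch(app)`), which is exactly the kind of server-side change the text notes every technical team would have to make and then maintain as the accelerator evolves.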

XXXVIII. CONCLUSION
Some Solutions and Assessing the Threat. So what can be done to protect users, organizations, and content providers against the threats we've outlined here? From a technical viewpoint, one approach is the systematic use of encryption and server authentication: it prevents both page modification and caching, and reduces activity monitoring to just the URL. This solution has some drawbacks, though: it has an overhead price, only the site provider can implement it, and it makes proxies essentially useless except as bridges. From the user and organizational viewpoint, the best approach seems to be a generic one, something that isn't specific to Web proxies but instead applies to using third-party services in general. Using such services must be understood as ceding control over various things to another entity, so it's important to analyze and truly understand what you're giving up, how it could backfire, how likely it is to backfire, and then compare that with the real benefits you can expect from using the service. It's also important to understand how the service provider fits in: do you have a legally enforceable contract, and does this contract provide sufficient privacy protection? Why is the service offered? How long is the data kept? What's the long-term impact if the situation changes? In general, a company policy that prevents the modification of its computing environment seems like a sensible choice. However, this kind of policy is difficult to enforce in practice and should be coupled with extensive user education as well as technical enforcement. We've seen that a third-party Web proxy (such as GWA) has the potential to create security and reliability problems because it can provide unwanted external control over Web activities, impacting end users, organizations, and content providers. It can also interfere with organizations' attempts to control their users' activities and potentially disclose important internal information.
Furthermore, it can impair content providers' ability to accurately track activities on their sites and create a particularly dangerous single point of failure over which only the company running the service has control. These threats might never materialize if a trusted party manages the external Web proxy, as long as this trusted company isn't forced into providing information by governmental agencies. The threats I've outlined here haven't fully materialized in the real world yet, but several rogue players have enough incentives to at least attempt some of them, so organizations depending on this kind of architecture must be aware of the potential threats and do what they can to mitigate them.

REFERENCES:
[1] A. Luotonen and K. Altis, World-Wide Web Proxies, Computer Networks and ISDN Systems, vol. 27, no. 2, 1994, pp. 147-154.
[2] G. Coulouris, Distributed Systems.
[3] Singhal and Shivaratri, Distributed Operating Systems.
[4] R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, Wiley, 2001.
[5]
[6]
[7]
[8] R. Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems, Wiley, 2001.
[9] A. Luotonen and K. Altis, World-Wide Web Proxies, Computer Networks and ISDN Systems, vol. 27, no. 2, 1994, pp. 147-154.
[10] J. Wang, A Survey of Web Caching Schemes for the Internet, ACM SIGCOMM Computer Comm. Rev., vol. 29, no. 5, 1999, pp. 36-46.
[11] A. Rousskov and V. Soloviev, A Performance Study of the Squid Proxy on HTTP/1.0, World Wide Web, vol. 2, nos. 1-2, 1999, pp. 47-67.
[12] J. Domènech et al., The Impact of the Web Prefetching Architecture on the Limits of Reducing User's Perceived Latency, Proc. IEEE/ACM Int'l


A Comparative Study To Solve Job Shop Scheduling Problem Using Genetic Algorithms And Neural Network

This paper presents the difference between genetic algorithms and neural networks for solving the job shop scheduling problem. We describe the operation of the genetic algorithm and a genetic algorithm for solving the job shop scheduling problem. In this paper we compare the genetic algorithm and the neural network, and solve a fuzzy flexible job shop scheduling problem using a genetic algorithm and a neural network. In each generation, crossover and mutation are applied to only one part of the chromosome, and the populations are combined and updated using half of the individuals.

Index Terms: Genetic Algorithm, Neural Network, Job Shop Scheduling Problem.


The job shop scheduling problem (JSSP) is one of the most well-known problems in both production management and combinatorial optimization. The classical n-by-m JSSP studied in this paper can be described as follows: schedule n jobs on m machines with the objective of minimizing the completion time for processing all jobs. Each job consists of m operations with a predetermined processing sequence on specified machines, and each operation of the n jobs needs an uninterrupted processing time of given length. Operations of the same job cannot be processed concurrently, and each job must be processed on each machine exactly once. Efficient methods for solving the JSSP are important for increasing production efficiency, reducing cost, and improving product quality. Moreover, the JSSP is acknowledged as one of the most challenging NP-hard problems, and no exact algorithm can be employed to solve it consistently even when the problem scale is small. It has therefore drawn the attention of researchers because of its theoretical, computational, and empirical significance since it

was introduced. Due to the complexity of the JSSP, exact techniques such as branch and bound and dynamic programming are only applicable to modest-scale problems; most of them fail to obtain good solutions for large-scale problems because of the huge memory and lengthy computational time required. On the other hand, heuristic methods, including dispatching priority rules, the shifting bottleneck approach and Lagrangian relaxation, are attractive alternatives for large-scale problems. With the emergence of new techniques from the field of artificial intelligence, much attention has been devoted to meta-heuristics. One main class of meta-heuristics is the construction and improvement heuristic, such as tabu search and simulated annealing. Another main class is the population-based heuristic; successful examples of population-based algorithms include the genetic algorithm (GA), particle swarm optimization (PSO), the artificial immune system, and their hybrids. Among these methods the GA, proposed by John Holland, is regarded as a problem-independent approach and is well suited to dealing with hard combinatorial problems. GAs use the basic Darwinian mechanism of survival of the fittest and repeatedly utilize the information contained in the solution population to generate new solutions with better performance. Classical GAs use binary strings to represent potential solutions. One main problem in classical GAs is that binary strings are not naturally suited to the JSSP. Another problem is premature convergence: although GAs perform better than most conventional methods, they cannot guarantee resistance to premature convergence when the individuals in the population are not well initialized. From the viewpoint of the JSSP itself, it is a hard combinatorial optimization problem with constraints. The goal of the scheduling methods is to

find a solution that satisfies the constraints. However, some of the infeasible solutions are similar to the feasible optimal solutions, and may provide useful information for generating the optimal solution.

XL. GENETIC ALGORITHM

Genetic algorithms are one of the best ways to solve a problem about which little is known. They are very general algorithms and so will work in almost any search space. Genetic algorithms use the principles of selection and evolution to produce several solutions to a given problem. Genetic algorithms were developed in the USA in the 1970s [3]. As a simple application, consider maximizing f(x) = x^2 for x = 0, ..., 31. The most common type of genetic algorithm works like this: a population is created from a group of randomly generated individuals, and the individuals in the population are then evaluated. The evaluation function is provided by the programmer and gives each individual a score based on how well it performs at the given task [2].
A) Individual: any possible solution.
B) Population: the group of all individuals.
C) Search space: all possible solutions to the problem.
D) Chromosome: the blueprint for an individual.
E) Trait: a possible aspect of an individual.
F) Allele: a possible setting for a trait.
G) Locus: the position of a gene on the chromosome.
H) Genome: the collection of all chromosomes for an individual.

Fig.1 Simple problem using a genetic algorithm.

The Operations of a Genetic Algorithm
1. Selection (based on misfit): a filter that selects the models that best fit the data is commonly used, so a method to calculate this fitness is needed [4]. In the space optimization example, the fitness method simply calculates the amount of free space each individual/solution offers.
2. Cross-over (swapping information): the exchange (swapping) of genes between parent models in order to produce child models (the next generation). It controls the degree of mixing and sharing of information, and is performed only by those individuals that pass the fitness test. Fig. 2 gives an example of how two parents cross over to make two children.

Fig.2 Swapping of genes (crossover) between two parents.

3. Mutation: making random changes in genes. It is done in order to maintain some degree of randomness in the population (which helps to avoid local minima). The probability of mutation should be kept low in order to prevent an excess of randomness.
4. Reproduction: a pair of "parent" solutions is selected to generate child solutions with many of the characteristics of their parents. Reproduction includes crossover (high probability) and mutation (low probability).

Advantages of Genetic Algorithms
A) They efficiently search the model space, so they are more likely (than local optimization techniques) to converge toward a global minimum.
B) There is no need to linearize the problem.
C) There is no need to compute partial derivatives.
D) They can quickly scan a vast solution set.
E) They are very useful for complex or loosely defined problems.

Disadvantages of Genetic Algorithms
A) They show a very fast initial convergence followed by progressively slower improvement (it is sometimes good to combine them with a local optimization method).
B) In the presence of a lot of noise, convergence is difficult and a local optimization technique might be useless.
C) Models with many parameters are computationally expensive.
D) Sometimes models that are not particularly good are still better than the rest of the population and cause premature convergence to a local minimum.
E) The fitness of all the models may be similar, so convergence is slow.
F) It is too hard for the individuals to venture away from their current peak.

The flowchart illustrates the basic steps in a genetic algorithm.

Fig.3 Flowchart of a genetic algorithm.

Fitness: a measure of the goodness of an organism, expressed as the probability that the organism will live another cycle (generation); it is the basis of the natural selection simulation. Offspring: the new individuals produced by reproduction.

Applications of Genetic Algorithms
1. Automotive Design: using genetic algorithms to design both composite materials and aerodynamic shapes, for race cars and regular means of transportation (including aviation), can return combinations of the best materials and best engineering to provide faster, lighter, more fuel-efficient and safer vehicles. Rather than spending years in laboratories working with polymers, wind tunnels and balsa-wood shapes, the process can be done much more quickly and efficiently by computer modeling, using GA searches to return a range of options that human designers can then combine as they please.
2. Computer Gaming: GAs have been programmed to incorporate the most successful strategies from previous games into the programs.
3. Trip, Traffic and Shipment Routing: new applications of a GA to the "Traveling Salesman Problem" (TSP) can be used to plan the most efficient routes and schedules for travel planners, traffic routers and even shipping companies: the shortest routes for traveling, timing to avoid traffic tie-ups and rush hours, and the most efficient use of transport for shipping, even including pickup loads and deliveries along the way. The program can model all this in the background while the human agents do other things, improving productivity as well. Chances are increasing steadily that when you get that trip-plan packet from the travel agency, a GA contributed more to it than the agent did.
4. Business: using a genetic algorithm we can solve business problems: first we analyze the problem, then evaluate it, and then obtain an optimal solution.
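The f(x) = x^2, x = 0, ..., 31 example above can be sketched as a minimal genetic algorithm. This is an illustrative sketch only: the 5-bit encoding follows naturally from the range 0..31, but the tournament selection, single-point crossover, mutation rate and elitism below are assumed design choices, not prescribed by the paper.

```python
import random

BITS = 5  # chromosomes encode x = 0..31 as 5-bit binary strings

def fitness(chrom):
    """Score an individual: f(x) = x^2, where x is the decoded chromosome."""
    return int(chrom, 2) ** 2

def select(pop):
    """Tournament selection: the fitter of two random individuals survives."""
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    """Single-point crossover producing two children."""
    point = random.randint(1, BITS - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chrom, rate=0.05):
    """Bit-flip mutation with a low per-bit probability."""
    return "".join(b if random.random() > rate else "10"[int(b)] for b in chrom)

def run_ga(pop_size=20, generations=50):
    pop = ["".join(random.choice("01") for _ in range(BITS)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best]  # elitism: carry the best individual found so far
        while len(nxt) < pop_size:
            c1, c2 = crossover(select(pop), select(pop))
            nxt += [mutate(c1), mutate(c2)]
        pop = nxt[:pop_size]
        best = max(best, max(pop, key=fitness), key=fitness)
    return int(best, 2)

if __name__ == "__main__":
    random.seed(1)
    print(run_ga())  # best x found; typically 31 on this tiny 32-point space
```

Because of the elitism step, the best fitness seen can never decrease from one generation to the next, which illustrates the "fast initial convergence, slower improvement" behaviour noted above.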

XLI. NEURAL NETWORK

A neural network is a network of many very simple processors ("units"), each possibly having a small amount of local memory [10]. The units are connected by unidirectional communication channels ("connections") [6], which carry numeric (as opposed to symbolic) data. The units operate only on their local data and on the inputs they receive via the connections. The design motivation is what distinguishes neural networks from other mathematical techniques: a neural network is a processing device, either an algorithm or actual hardware, whose design was motivated by the design and functioning of human brains and components thereof. There are many different types of neural networks, each of which has different strengths particular to its applications. The abilities of different networks can be related to their structure, dynamics and learning methods. Neural networks offer improved performance over conventional technologies in areas which include: machine vision, robust pattern detection, signal filtering, virtual reality, data segmentation, data compression, data mining, text mining, artificial life, adaptive control, optimization and scheduling, complex mapping, and more.

Advantages of Neural Networks
A) Systems which combine disparate technologies.
B) Systems which capitalize on the synergy of multiple technologies.
C) Systems which implement multiple levels or facets of activities from different perspectives.

Meta-strategies such as simulated annealing (SA), tabu search (TS) and genetic algorithms (GAs) guide a local heuristic, embedded within their structure, through the search domain and hence are able to provide a superior method (cf. the results of Vaessens et al. 1996).

Disadvantages of Neural Networks
A) HOPFIELD: for complex problems the Hopfield model is unable to converge to the global optimum, as it has a tendency to become trapped within local minima solutions; hence there is no guarantee of achieving good solutions.
B) THE BEP NEURAL MODEL: although the BEP model is able to perform classification effectively, it exhibits limited success in dealing with optimisation problems because of the inherent lack of generic patterns between inputs and outputs in optimisation problems.

Applications of Neural Networks
1. Character Recognition: the idea of character recognition has become very important as handheld devices like the Palm Pilot have become increasingly popular. Neural networks can be used to recognize handwritten characters.
2. Image Compression: neural networks can receive and process vast amounts of information at once, making them useful in image compression. With the Internet explosion and more sites using more images, using neural networks for image compression is worth a look.
3. Stock Market Prediction: the day-to-day business of the stock market is extremely complicated; many factors determine whether a given stock will go up or down on any given day. Since neural networks can examine a lot of information quickly and sort it all out, they can be used to predict stock prices.
4. Traveling Salesman Problem: interestingly enough, neural networks can solve the traveling salesman problem, but only to a certain degree of approximation.
5. Medicine, Electronic Nose, Security, and Loan Applications: these are applications at the proof-of-concept stage, with the exception of a neural network that decides whether or not to grant a loan, which has already been used more successfully than many humans.
6. Miscellaneous Applications: these are some very interesting (albeit at times a little absurd) applications of neural networks.

XLII. SOLUTION OF JOB SHOP SCHEDULING PROBLEM USING GENETIC ALGORITHM

The job shop scheduling problem (JSSP) is one of the most well-known problems in both production management and combinatorial optimization. The classical n-by-m JSSP studied in this paper can be described as follows: schedule n jobs on m machines with the objective of minimizing the completion time for processing all jobs [7]. Each job consists of m operations with a predetermined processing sequence on specified machines, and each operation of the n jobs needs an uninterrupted processing time of given length. Operations of the same job cannot be processed concurrently, and each job must be processed on each machine exactly once. Efficient methods for solving the JSSP are important for increasing production efficiency, reducing cost and improving product quality.

There are two important issues in applying a genetic algorithm to the job shop problem: (1) how to encode a solution of the problem into a chromosome so as to ensure that every chromosome corresponds to a feasible solution; and (2) how to enhance the performance of the genetic search by incorporating traditional heuristic methods [5]. Job shop scheduling with alternative machines is decomposed into two subproblems: first, operations are allocated to specific machines; second, the sequence of the operations allocated to each machine is determined with respect to the specified operation-sequence constraints. There are many approaches for finding solutions to job scheduling problems; dispatch rules, for example, have been used to solve them (Baker 1974). The operations of a given job have to be processed in a given order. The problem consists of finding a schedule of the operations on the machines, taking into account the precedence constraints, that minimizes the makespan (Cmax), that is, the finish time of the last operation completed in the schedule. Each operation uses one of the m machines for a fixed

duration. Each machine can process at most one operation at a time, and once an operation initiates processing on a given machine it must complete processing on that machine without interruption. The JSSP is acknowledged as one of the most challenging NP-hard problems: no exact algorithm can be employed to solve it consistently even when the problem scale is small, and it is amongst the hardest combinatorial optimization problems. It is assumed that a potential solution to the problem may be represented as a set of parameters. The individuals, during the reproductive phase, are selected from the population and recombined, producing offspring which comprise the next generation. The JSSP is NP-hard (Lenstra and Rinnooy Kan, 1979) and has also proven to be computationally challenging. In the example there are four jobs, each with three different operations to be processed according to a given sequence; there are six different machines, and the alternative routings and processing times are shown.
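As an illustration of encoding issue (1), one common representation that always yields feasible schedules is the operation-based representation: a chromosome is a permutation with repetition of job numbers, decoded left to right by greedily scheduling each job's next operation at the earliest time its machine and its job predecessor allow. The sketch below is an assumed minimal implementation, not the paper's own; for concreteness it uses the two-job, three-machine data of the example discussed later in the paper (job 1: machines 1, 2, 3 with times 5, 8, 2; job 2: machines 3, 1, 2 with times 7, 3, 9).

```python
# Routing and processing times for the 2-job, 3-machine example:
# job -> list of (machine, processing_time) in the required order.
JOBS = {
    1: [(1, 5), (2, 8), (3, 2)],
    2: [(3, 7), (1, 3), (2, 9)],
}

def makespan(chromosome):
    """Decode an operation-based chromosome into a semi-active schedule.

    Each gene is a job number; the k-th occurrence of job j means the
    k-th operation of job j. Every operation starts as early as both its
    machine and its job predecessor allow, so any permutation with
    repetition decodes to a feasible schedule.
    """
    next_op = {j: 0 for j in JOBS}    # index of next unscheduled operation per job
    job_ready = {j: 0 for j in JOBS}  # time the job's previous operation finishes
    machine_free = {}                 # time each machine becomes idle
    for job in chromosome:
        machine, duration = JOBS[job][next_op[job]]
        start = max(job_ready[job], machine_free.get(machine, 0))
        job_ready[job] = machine_free[machine] = start + duration
        next_op[job] += 1
    return max(job_ready.values())

print(makespan([1, 1, 1, 2, 2, 2]))  # sequential ordering: 34
print(makespan([1, 2, 1, 2, 1, 2]))  # interleaved ordering: 22
```

A GA over this representation would apply crossover and mutation operators that preserve the number of occurrences of each job, using the decoded makespan as the (inverse) fitness.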

XLIII. SOLUTION OF JOB SHOP SCHEDULING PROBLEM USING NEURAL NETWORK

The job-shop scheduling neural network should contain units that are capable of representing the starting times of the operations (S units) [9], whether sequence constraints are violated (SC units), whether resource constraints are violated (RC units), and the values of the Yipk indicator variables (Y units).

S units. The input Ni of these units is calculated by adding the previous activation Ai(t-1) (thus simulating a capacitor) to the summed incoming weighted activation: Ni(t) = Ai(t-1) + sum_j wij Aj(t). The selected unit type is a deterministic linear threshold unit with an activation Ai, a threshold x and a net input Ni. Since an operation can never start before time 0, the starting time of the entire schedule, the threshold x can be set at 0, resulting in the implementation of a non-negativity constraint. The thresholds can also be determined in a problem-specific manner by calculating the earliest possible starting time of each operation. The earliest possible starting time of the first operation of a job is the starting time of the scheduled time-span: 0 in the example [6]. The second operation's (i(j+1)k) earliest possible starting time is 0 + tijk, and the earliest possible starting time of the third operation of job 1 is 0 + tijk + ti(j+1)k. In this manner the thresholds of the units representing the starting times are determined, thus reducing the search space and resulting in a more stable network [5].

SC and RC units. The net input of these units is calculated by adding a bias to the summed incoming weighted activation: Ni = Bi + sum_j wij Aj. The bias (Bi) added to the incoming weighted activations of the connected units is the processing time of the operation appearing in the equation the unit represents. The constraint-representing units are of a deterministic negated linear threshold type, with x being the threshold and Ni the input: Ai = x - Ni if Ni < x, and Ai = 0 otherwise. This activation function allows the violation of the equation being represented to be signalled; the bias represents the problem-specific operation processing times.

Y units. The net input of the Y units is calculated by summing the incoming weighted activations: Ni = sum_j wij Aj. The activation is determined according to a deterministic step function: Ai = 1 if Ni > 0, and Ai = 0 otherwise.
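The problem-specific S-unit thresholds described above are simply cumulative sums of the processing times along a job's routing. A small sketch, using processing times 5, 8 and 2 for the three operations of job 1 as in the example problem:

```python
def s_unit_thresholds(processing_times):
    """Earliest possible starting time of each operation of a job:
    0 for the first operation, then the running sum of the durations
    of all preceding operations of the same job."""
    thresholds, t = [], 0
    for duration in processing_times:
        thresholds.append(t)
        t += duration
    return thresholds

print(s_unit_thresholds([5, 8, 2]))  # job 1: [0, 5, 13] -> S133 threshold is 13
```

This reproduces the threshold of 13 (t111 + t122) quoted for unit S133 in the worked example.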

The activation of this unit represents whether job 1 precedes job 2 on machine k, or job 2 precedes job 1. The proposed structure consists of three layers: the bottom layer containing the S units, the middle layer containing the SC and RC units, and the top layer containing the Y units. As an example, a 2/3/J/Cmax scheduling problem is used, with its machine allocations and operation times presented in Tables 1 and 2.

Before a dedicated neural network can be designed, an integer linear programming representation has to be created according to the method presented. For the sequence constraints this translation results in n(m-1) = 4 sequence constraints of type Sijb - Si(j-1)a - ti(j-1)a >= 0:

1) S122 - S111 - 5 >= 0
2) S133 - S122 - 8 >= 0
3) S221 - S213 - 7 >= 0
4) S232 - S221 - 3 >= 0

For the resource constraints the value of the constant H must be determined. This constant should have a value that is large enough to ensure that one of the disjunctive statements holds while the other is eliminated. For the example problem (with H = 35) there are nm(n-1) = 6 resource constraints of type Spjk - Sijk - H*Yipk + H - tijk >= 0 and Sijk - Spjk + H*Yipk - tpjk >= 0:

5) S221 - S111 - 35*Y121 + 35 - 5 >= 0
6) S111 - S221 + 35*Y121 - 3 >= 0
7) S232 - S122 - 35*Y122 + 35 - 8 >= 0
8) S122 - S232 + 35*Y122 - 9 >= 0
9) S213 - S133 - 35*Y123 + 35 - 2 >= 0
10) S133 - S213 + 35*Y123 - 7 >= 0

In total there are n(nm-1) = 10 equations, nm = 6 starting-time (Sijk) variables and mn(n-1)/2 = 3 disjunction (Yipk) variables for the 2-job, 3-machine problem. The first layer of the neural network designed for the example problem consists of 6 Sijk units, the second layer consists of 10 constraint units and the third layer consists of 3 Y units. The thresholds of the S units can be determined in a problem-specific manner; for instance the threshold of the S unit representing the third operation of job 1 (S133) can be set at 13 (t111 + t122). The first unit of the second layer (representing the first sequence equation), for instance, collects negated information (connection weight -1) from the unit representing S111 and positive information (weight +1) from the unit representing S122. Together with the bias of this unit (-5), the violation of this constraint can be determined. If, for instance, S111 = 1 and S122 = 2, a constraint violation should be signalled, since the second operation starts before the first operation has ended. The net input of the first constraint unit, SC1 in the figure, will be 2 - 1 - 5 = -4, resulting in an activation of 4, signalling a violation. This information has to be fed back to the S units to cause an appropriate change in starting times. For this reason, the S units collect information from the SC units (see the figure). The corresponding S units will receive this redirecting information, resulting in an inhibition (4 * -1) of the S111 unit and an excitation (4 * +1) of the S122 unit, thus working towards an acceptable solution. If these feedback weights are set correctly, feasible solutions will be generated without requiring explicit initialisation of the S units. The resource constraints are implemented in the same general structure. The RC units collect information from the appropriate S units and Y units according to the resource equations. Suppose the starting time of operation 1 of job 1 on machine 1 (S111) is 1, and the starting time of operation 2 of job 2 on machine 1 (S221) is 2. In that case, unit Y121 receives a net input of -1 + 2 = 1, resulting in an activation of 1, signalling that operation 111 precedes operation 221. The RC unit representing equation 5 receives -1 from S111, 2 from unit S221 and -35 from Y121; this value added to the bias (30) of this RC unit results in a net input of -4, producing an activation of 4 and signalling a violation. The S111 and S221 units receive this activation through their weighted feedback connections, resulting in an advanced starting time of operation 111 and a delayed starting time of operation 221. The RC unit representing equation 6 receives 1 from S111, -2 from S221, and 35 from Y121; this value added to the bias (-3) of this RC unit results in a net input of 31, resulting in an activation of 0.
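The worked example's unit computations can be checked with a small sketch of the two activation functions described above: a negated linear threshold for the constraint (SC/RC) units and a step function for the Y units. The weights and biases below follow the SC1 and equation-5 (RC5) constraints; note that the RC5 net input works out to -1 + 2 - 35 + 30 = -4, giving an activation of 4.

```python
def negated_linear_threshold(net, threshold=0.0):
    """Constraint (SC/RC) unit: activation is the size of the violation."""
    return threshold - net if net < threshold else 0.0

def step(net):
    """Y unit: fires when its net input is positive."""
    return 1 if net > 0 else 0

# Worked example: S111 = 1, S122 = 2, S221 = 2.
s111, s122, s221 = 1, 2, 2

# SC1 represents S122 - S111 - 5 >= 0 (weights +1 and -1, bias -5).
sc1_net = s122 - s111 - 5                 # = -4
print(negated_linear_threshold(sc1_net))  # 4.0 -> violation of size 4

# Y121 receives -1*S111 + 1*S221.
y121 = step(-s111 + s221)                 # net = 1 -> activation 1

# RC5 represents S221 - S111 - 35*Y121 + 35 - 5 >= 0 (bias 35 - 5 = 30).
rc5_net = s221 - s111 - 35 * y121 + 30    # = -4
print(negated_linear_threshold(rc5_net))  # 4.0 -> violation fed back to S units
```

The positive activations are exactly the amounts by which the corresponding constraints are violated, which is what the weighted feedback connections use to shift the starting times toward feasibility.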

Fig.4 Solving the job shop scheduling problem using a neural network.

XLIV. COMPARISON
A) Genetic algorithms (GAs) essentially started with the work of Holland (1975), who in effect tried to use nature's genetically based evolutionary process to investigate unknown search spaces for optimal solutions. Neural networks (NNs) are based on the early work of McCulloch and Pitts (1943), who built a first crude model of a biological neuron with the aim of simulating essential traits of biological information handling. For a comprehensive survey of the use of evolutionary algorithms, and GAs in particular, in management applications, see Nissen (1995): in industry, production planning, operations scheduling, personnel scheduling, line balancing, grouping orders, sequencing and siting; in financial services, risk assessment and management, developing dealing rules, modeling trading behavior, portfolio selection and optimization, credit scoring, and time-series analysis. Moreover, the way an NN learns depends on the structure of the NN and cannot be examined separately from its design. In GA theory it is usually necessary to put related parameters together in the genome; in our case, we keep the weights corresponding to the same hidden-layer neuron close together in the genome. The name "neural networks" already describes what they try to do, i.e. to handle information like biological neurons do, thus using the accumulated experience of nature and evolution in developing those networks. ... obviously effective and viable tools for real-time prediction tasks.
G) For the two extreme cases of building-block scaling, uniform and exponential, genetic algorithms with perfect mixing have time complexities of O(m) and O(m^2), respectively.
H) Genetic algorithms are fast, accurate and easy to use, and performed better than the neural network.

XLV. CONCLUSION
In this paper we have presented a genetic algorithm and a neural network for solving a job shop scheduling problem. We have traced the evolution of genetic algorithms and neural networks and explained the benefits and applications of each. The genetic algorithm produced quite good results, apparently at least as good as those in the literature. With the help of a genetic algorithm we can also solve business problems and obtain an optimum solution. We have carried out numerical experiments for solving a job shop scheduling problem using the genetic algorithm and the neural network. This research considered only nominal features; our work extends in a natural way to other varieties of features.

REFERENCES
[1] James F. Frenzel received a Ph.D. in Electrical Engineering from Duke University in 1989.
[2] J.H. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor: The University of Michigan Press, 1975.
[3] D. Goldberg, Genetic Algorithms, Addison-Wesley, 1988.
[4] K.R. Baker, Introduction to Sequencing and Scheduling, Wiley, New York, 1974.
[5] Y.P.S. Foo and Y. Takefuji, "Neural Networks for Solving Job-Shop Scheduling," 1988.
[6] Co-Evolution of Neural Networks for Control of Pursuit and Evasion.
[7] B.J. Lageweg, J.K. Lenstra and A.H.G. Rinnooy Kan, "Job shop scheduling by implicit enumeration," Management Science, vol. 24, pp. 441-450, 1977.
[8] M. Garey, D. Johnson and R. Sethi, "The complexity of flowshop and jobshop scheduling," Mathematics of Operations Research, vol. 1, pp. 117-129, 1976.
[9] S.W. Mahfoud and D.E. Goldberg, "A genetic algorithm for parallel simulated annealing," in Parallel Problem Solving from Nature 2, R. Männer and B. Manderick, eds., Elsevier Science Publishers B.V., 1992.
[10] P.J. Werbos, "An Overview of Neural Networks."







Innovation & Entrepreneurship In Information And Communication Technology

Abstract: This paper describes ICT (information and communication technology) and the new innovations related to it. ICT, as we all know, is a wider perspective on information technology: it covers unified communications (UC) and the integration of telecommunications and audio-visual systems with modern IT. This paper illustrates new innovations by Cisco for both larger and smaller organizations. Unified communications is an important aspect of ICT: it integrates real-time as well as non-real-time communication with business processes and requirements. ICT is a powerful tool for the development of various business and IT activities, and it concentrates on the economic issues of business in the digital era. Later in this paper the focus is on telecommunication: the telecommunication link and medium play a vital role in the smooth running of any business as well as in the success of ICT. Innovations and digital entrepreneurship will provide better chances to rise in the ICT field; digital entrepreneurship describes the relationship between an entrepreneur and the digital world.

Index Terms: communication, technology, economy, unified, Cisco

The world of information and communications technology is constantly changing, never standing still. ICT consists of all technical issues related to handling information and aiding communication, including computer and network hardware and communication systems as well as the necessary software. In other words, ICT consists of IT as well as telephony, broadcast media, all types of audio and video processing and transmission, and network-based control and monitoring functions. The term ICT is now also used to refer to the merging of audio-visual and telephone networks with computer networks through a single cabling or link system. There are large economic incentives (huge cost savings due to the elimination of the telephone network) to merge the audio-visual, building-management and telephone networks with the computer network system, using a single unified system of cabling, signal distribution and management.

XLVII. NEW INNOVATIONS IN ICT
An innovation starts as a concept that is refined and developed before application. Innovations may be inspired by reality. The innovation process, which leads to useful technology, requires:
A) Research
B) Development
C) Production
D) Marketing
E) Use
In this section we describe some of the most significant recent innovations in the fields of unified communications, telecommunications and audio-visual systems.

XLVIII. MICROSOFT'S LITE GREEN IT PROJECT
The Microsoft research labs in India have been working on a project called Lite Green, which aims to reduce energy bills and improve energy efficiency. It is a very important innovation in the field of ICT because desktops consume close to 100-220 watts when running at full capacity and 62-80 watts when running at close to zero percent CPU usage. The project is most effective during weekends and overnight, when the energy saving is close to 80 percent.

XLIX. SOFTWARE DEFINED RADIO
Software defined radio (SDR) is a radio communication system where components that have typically been implemented in hardware (for example filters, amplifiers, modulators/demodulators and detectors) are instead implemented by means of software on a personal computer or embedded computing devices. This brings benefits to every actor involved in the telecommunication market: manufacturers, operators and

users. The advantage for users is the ability to roam their communication to other cellular systems and take advantage of worldwide mobility and coverage. A basic SDR system may consist of a personal computer equipped with a sound card, or other analog-to-digital converter, preceded by some form of RF front end. Software radios have significant utility for the military and for cell phone services, both of which must serve a wide variety of changing radio protocols in real time.

L. IPTV
Internet Protocol television (IPTV) is a system through which Internet television services are delivered using the architecture and networking methods of the Internet Protocol Suite over a packet-switched network infrastructure, e.g., the Internet and broadband Internet access networks, instead of being delivered through traditional radio frequency broadcast, satellite signal, or cable television (CATV) formats. Many regulations are emerging in the information and entertainment sector due to the changing technological scenario coupled with the digitization of the broadcasting industries. This changing environment has led to the growing popularity of IPTV at the international level, although the scope of IPTV in India is not yet widely recognized. IPTV services may be classified into three main groups:
A) Live television, with or without interactivity related to the current TV show.
B) Time-shifted programming: catch-up TV that replays a TV show broadcast hours or days ago, and start-over TV that replays the current TV show from its beginning.
C) Video on demand (VOD): browsing a catalog of videos not related to TV programming.

LI. MOBILE TV
Mobile TV is expected to achieve substantial growth in the Asia-Pacific region. Mobile TV has already arrived in India and its future is bright. Mobile TV is a wireless device-based service, and wireless services have been a great success in India, with adoption in both urban and rural areas rising steadily.
IPTV, when introduced in the country, was considered to be the next big technology driver in the telecom industry. However, the service did not pick up as expected, due to several factors including low broadband penetration and slow internet access speeds.

LII. UNIFIED COMMUNICATIONS 300 SERIES
The Unified Communications 300 Series (UC300) is part of Cisco's Foundational Unified Communications (UC) offering, which provides basic or foundational UC features for small businesses: typically voice, data and wireless integration, plus some basic integrated messaging applications. The UC300 is positioned for businesses that require more basic UC features at an affordable price. Cisco's earlier Unified Communications 500 Series (UC500), also for smaller businesses, belongs to the Advanced UC offering, which has a more advanced feature set, including video, enhanced security and mobility.

LIII. CISCO UNIFIED COMMUNICATIONS MANAGER BUSINESS EDITION 3000
The Cisco Unified Communications Manager Business Edition 3000 is an all-in-one solution specifically designed for mid-sized businesses with 75-300 users (400 total devices) and up to 10 sites (nine remote sites) with centralized call processing. Cisco Unified Communications Manager software, the Cisco Unity Connection messaging solution (12 ports) and Cisco Unified Mobility are pre-installed on a single Cisco Media Convergence Server.

LIV. DIGITAL ENTREPRENEURSHIP
Entrepreneurship is the act of being an entrepreneur, which can be defined as "one who undertakes innovations, finance and business activities in an effort to transform innovations into economic goods". The term digital entrepreneurship is introduced to define an organization digitally: each and every detail of the enterprise is expressed in digital terms.

LV. CONCLUSION
The aim of this paper has been to focus on the emerging innovations in information and communication technology.
Finally, it can be concluded that information and communication technology is succeeding not only in the area of wireless devices, for example mobile TV, but also in other fields of ICT such as audio-visual systems, whether wired or wireless.


Insider Threat: A Potential Challenge For The Information Security Domain

Abstract: The insider threat is ever expanding its proliferation in information technology sectors. Managing such threats is one of the most demanding challenges for information security professionals, and it is also one of the earnest duties of the members of the board and the executives of the company concerned. Insiders have the exceptional privilege of accessing various vital information assets and information systems in an organization, and they sometimes misuse this privilege for numerous reasons. Our studies show that such threats can cause unbounded destruction to the business of an organization and create a highly exacerbated situation in which the organization struggles to achieve its objectives. In this paper we deliver the results of an empirical study of the several reasons that tend to turn an insider of an organization hostile and the various methods used by insiders to carry out IT sabotage; we also survey various measures used to deter, detect and mitigate malicious insider threats.

Index Terms Insider threat, Contentious threat, Disgruntled Employee, Mala fide intention, Instigation.

LVI. INTRODUCTION
Ever since the expansion of computerized data entry in the field of information technology, we have been paving the way towards the digitalized era, and the potential security threats attached to it have been growing in vigor accordingly, creating an encumbering situation for organizations to face and counter. One such contentious threat is the insider threat. Insiders are typically those legitimate users who have, or had, authorization to access an organization's critical information and information systems, such as trade secrets, account numbers, social security numbers, intellectual property and personal/health records, as Fig. 1 illustrates. Fig. 1 further shows that, with mala fide intent, they furnish such critical information, which may be stored on a network or shared drive, to counterparts of the organization, such as its market competitors, regulators, unauthorized internal users, or the press and media, so as to obliterate the confidentiality, integrity and availability of the organization's information or information systems. In short, it is the main cause of data leakage in organizations. This results in huge financial losses, loss of assets and company defamation, and can even force the closure of the business. These types of threat exist across many disciplines, including environmental and technology security. Placing the appropriate level of defense and security against such threats is a stiff challenge for each concern; it is a very time-consuming and, at the same time, costly practice. The magnitude of the threat posed by insiders is greater than that of outsider threats, mainly due to insiders' enormous knowledge of the organization's vital and sensitive information. Various security measures have been adopted to date, such as multilevel security policies and access control lists, but they have not been up to the mark in mitigating the risk of insider threat.


LVII. METHODS OF ATTACK BY INSIDERS
It has been reported that most malicious insiders were believed to be disgruntled, and most of them commit the crime simply to take revenge, mainly after some negative incident in the organization, such as dissatisfaction with their salary package. It has been found that 95% of insiders steal data during working hours. 85% of insiders use their own credentials to perform IT sabotage, whereas 44% use an unauthorized account that they created previously. The survey also reveals that 15% use technical methods and means to carry out their attacks, such as logic bombs, viruses and various kinds of spyware, which they plant inside the targeted computers or computer systems. Most insiders steal or alter vital information during the normal working hours of their duties. Approximately 8% of insiders use remote access from outside the organization to reach their employer's secret information. Findings reveal that most insiders are male, technical, hold highly dignified designations in the organization, and in the majority of cases are former employees. Social engineering is also one of the biggest weapons in the hands of insiders for committing data theft. Fig. 2 illustrates ZDNet Asia's 2008/09 survey on the region's top IT security priorities, in which special emphasis is given to the insider threat. The survey reveals that, among the existing threats, protection against the insider threat ranks at 52.8 percent compared with the others.
LVIII. PROPOSED COUNTERMEASURES AGAINST INSIDER ATTACKS
The following countermeasures against insider attacks are proposed; their deployment will benefit the organization at large in combating insider attacks.

Fig. 2 Top IT security priorities. Source: ZDNet Asia IT Priorities Survey 2008/2009.

A) Conduct a proper and result-oriented risk assessment specifically for the insider threat. Comprehensive risk-based security should be implemented by management across the whole organization so as to provide an appropriate level of protection to its vital assets. It is a fact that one cannot implement 100% protection against every risk factor for every resource in the organization, but an adequate level of protection can be provided to the most vital and critical resources (Ciechanowicz, 1997). A real security goal of every organization is to protect its critical assets and resources against every possible internal and external threat. Many organizations conduct risk assessments, but they fail to give proper and special emphasis to insider threats, with the result that only partial protection is achieved. During assessment they must not overlook the insiders of the organization. It is imperative that both qualitative and quantitative risk assessment be carried out specifically for insiders.
Case study: An organization that runs a business of maintaining a database of phone numbers and addresses for emergency services failed to secure its vital data and computer systems from one of its disgruntled employees. The insider deleted the whole database from the server in the organization's network operations centre, gaining access by defeating all the server's physical security through illegal use of the network administrator's badge. The situation then turned worse, as the organization had no backup mechanism left: the backup tapes for recovering the database, which resided inside the same network operations centre, were also stolen by the malicious insider. It was noted that no outsider could have caused such a disaster as easily as this insider did; the insider was well acquainted with the organization's security features, so exploiting them was easier for the insider than for any outsider. Had the organization conducted a proper risk assessment prior to the incident, to ascertain what vulnerabilities and threats existed, it could easily have overcome this risk factor.

B) State separate security policies and procedures for insiders. In an organization where all sensitive information is in digitized form, it is highly recommended that, along with the general security policy, a separate set of security policies and procedures be drafted for all conduct and activities of insiders, such as the implementation of strong password and account management (Roy Sarkar, 2011). This effort will result in closer and more stringent control over insiders' daily activities and will prevent misunderstandings among employees. The policies and procedures should specifically state the constraints, privileges and responsibilities of employees. There should also exist flexibility to change such policies and procedures; the reason is obvious: an organization is not a static entity, and changes in its working are inevitable.
Case study: A disgruntled employee, a software developer, downloaded the password files from the organization's server onto his laptop with the pure malicious intention of breaking the organization's passwords. He cracked the passwords, including the root password of the server, with the help of various password-cracking tools available on the internet. Thereafter he began unauthorized access to the organization's network and started bragging to the network administrator about dismantling the important database residing on that server. In this case the organization had no specific security policies overseeing and controlling the conduct of its employees; despite having general organizational security policies and procedures, there must be separate policies and procedures placing stringent restrictions on employees' actions so that they do not exceed their limits beyond requirements. The organization later modified its policies and procedures and put a rigorous password management policy in place for its insiders. The gist of the whole case is to strengthen the inner infrastructure of the organization so that all operational activities proceed seamlessly.

C) Use advanced and stringent logical control technologies. It is imperative for an organization to secure its sensitive digital data from malicious insiders with advanced and stringent logical control technologies, for instance a Data Leakage Prevention (DLP) system. This tool is, to a huge extent, very useful in preventing insider attacks on the organization's sensitive data; it is software that enforces policies to protect the organization's sensitive and critical information. The software tries to capture the activities of users when they deviate from their legitimate duties; for instance, if a person tries to copy sensitive organizational data by inserting a USB storage drive into a computer, against security policy, the software immediately triggers an alarm to restrict the action. In this way it defends the organization's critical data from being stolen through a disgruntled insider's malicious actions. It is a preventive measure against insider threats; in short, a watchdog inside the organization (Magklaras and Furnell, 2002). Likewise, another technology that helps in combating insider threats is network access control, which provides control over client and server activities.

D) Follow separation of duties and least privilege. The separation of duties and the principle of least privilege must be followed in every business process in the organization so as to reduce the damage that may be caused by malicious insiders (Alghathbar, 2007). This requires dividing the duties of employees according to their skill sets, reducing the possibility that one employee could embezzle or steal sensitive or vital information, or commit IT sabotage, without the help of another employee. Separation of duties can be enforced both technically and non-technically. Role-based access control also plays a vital role in controlling insiders' activities; such a least-privilege mechanism provides greater strength and reliability regarding the proper utilization of resources and limits the impact of insider threats.
Case study: An employee of an immigration service organization exceeded the limits of his work and fraudulently modified client records related to United States immigration asylum decisions, which are highly sensitive data. The investigation later found that the modification was done using the organization's computer system, which was managed only by authorized employees, and that the fraudulent act was done in consideration of $50,000. To control such situations, the organization thereafter implemented separation of duties via a role-based access control methodology, limiting the authorization of its employees; in addition, least privilege was implemented to prevent officials from approving or modifying any immigration decision without the authority to do so.
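The separation-of-duties and least-privilege controls described in D) can be sketched in a few lines of code. This is a minimal illustration only, not the actual system of any organization in the case studies; the role names, permissions and approval rule are all assumptions made for the example:

```python
# Sketch of role-based access control with separation of duties:
# the employee who submits a change can never be the one who approves it.
# All role names and permissions below are illustrative assumptions.

ROLE_PERMISSIONS = {
    "clerk":       {"submit_change"},
    "adjudicator": {"approve_change"},
    "auditor":     {"read_log"},
}

def has_permission(role, permission):
    return permission in ROLE_PERMISSIONS.get(role, set())

def approve_change(request, approver, approver_role):
    # Least privilege: only roles granted 'approve_change' may approve.
    if not has_permission(approver_role, "approve_change"):
        raise PermissionError(f"{approver} ({approver_role}) may not approve")
    # Separation of duties: submitter and approver must differ.
    if approver == request["submitted_by"]:
        raise PermissionError("submitter cannot approve own request")
    request["approved_by"] = approver
    return request

req = {"id": 1, "submitted_by": "alice"}
approve_change(req, "bob", "adjudicator")   # allowed: different person, right role
```

Under this sketch, even an employee holding the approving role cannot approve a record change they themselves submitted, which is exactly the constraint that would have blocked the fraudulent asylum-record modification in the case study above.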
E) Effective leadership limits the insider security threat. A survey reveals that many organizations feel uneasy in their efforts to combat the insider threat simply because they lack the practice of effective leadership over their staff. Leadership plays a very vital role in the accomplishment of an organization's objectives: it builds team spirit among the organization's staff and members and integrates all their activities and interests (Williams, 2008). Leadership is the binding energy that shapes the ideology of team members so that they focus only on achieving the stated objectives of the organization; it motivates insiders and is a constant source of inspiration for its followers. It has rightly been said that good leadership helps greatly in mitigating the insider threat by maintaining proper and well-defined discipline in the organization. It is therefore recommended to establish the practice of effective leadership.

F) Discontinue computer access for ex-employees who have left the organization. Whenever an employee is terminated or leaves the organization, it is crucial to follow a stringent procedure that disables all access paths to the organization's network and computer systems for that employee. If such a procedure is not duly followed, the organization's computer systems and network remain vulnerable to access by unauthorized, illegitimate terminated employees, which can be disastrous for the organization's sustainability. It has been found that many former employees frequently use remote access to the organization's network; it is therefore recommended that remote access or VPN (virtual private network) connections be disabled specifically for those ex-employees. In addition, the termination process must include termination of physical access to the organization, which can be done by collecting back all keys, badges, parking permits and the like from the terminated employee (Hines, 2007). When an employee is fired, it is important to circulate the termination notice to all other working employees of the organization; this will considerably reduce the insider threat.
Case study: A banking organization's system administrator was terminated abruptly, without notice from his employer. After being terminated, he began accessing the company's web server from his home for nefarious purposes in order to retaliate against the company. It was found that although the administrator had been terminated, his access to the company's systems had not been removed: the login password of the web server had not been changed after his termination. He exploited the vulnerability of the web server through remote login and caused the server to shut down and finally crash. This incident cost the company heavily; it lost many ventures with clients on account of the failure of the web server, which carried critical company information. The essence of the case is that it is the responsibility of the company to thoroughly disable all access points of terminated employees as soon as they are terminated.

G) Organize an emergency insider threat response plan. The organization must have a well-defined and duly documented insider response plan to face the stiff challenge of the period when an insider attack occurs. This plan is the backup for combating insider threats. An insider threat response plan usually differs from the response plan for external attackers. These plans are usually drafted by the technical staff of the organization, who are designated as the insider threat response team (Cappelli, Moore, Trzeciak and Shimeall, 2009). The plan must contain the specific actions to be taken against malicious attackers, how to recover the organization from the impact, and the responsibilities of the response team members. Last but not least, the plan must have the full support of the organization's top management. The details of the plan must not be shared with all employees of the organization; it should be handled only by confidential staff members.

H) Organize periodic security awareness programs for all employees. It has been found that periodic security awareness programs for all employees greatly enhance the ability to identify malicious insiders within the organization. These programs help the organization assess the behavior of its employees and what kinds of behavior may amount to a threat; whether conduct is suspicious or not can be ascertained only through such programs. Managers and top officials are also advised to procure these awareness programs, as they help in ascertaining what type of social networking prevails inside the organization among the employees, and whether insiders are engaged with external disgruntled employees to wage an attack against the organization, especially to steal critical information for financial gain (Colwill, 2009). Frequent security awareness programs keep the organization's security professionals up to date on the latest types of threat and help them manage and report those threats to management efficiently. Social engineering is one of the malevolent practices that helps a perpetrator gain physical or electronic access to a victim's network through account and password theft; by social engineering, malicious insiders may, for example, insert a hardware keylogger on a computer to steal the organization's vital information. We can overcome these risk factors by instituting security awareness programs for employees so that they remain vigilant in advance against such threats.

I) Perform auditing and monitoring of every online action of employees. Auditing and monitoring are activities that can help discover the suspicious activities of malicious insiders before any adverse consequence occurs (Peecher, Schwartz and Solomon, 2007). In the information technology domain, auditing and monitoring refer to the verification and investigation of various electronic systems, computer networks, log files and so on, which helps to a great extent in tracing the root cause of insider threats. In addition, auditing must verify the sanctity and integrity of all access log files in the organization's network (Moynihan, 2008). It is also imperative to conduct random auditing and monitoring; this serves as a deterrent control against performing any malicious action. Various automated monitoring tools help in preventing and detecting emails written by malicious insiders to the organization's counterparts; they also help in monitoring and detecting documents copied to hard drives or flash media, and in preventing insiders from printing, copying or downloading critical organizational data, thus protecting privacy.

J) Protect the physical environment of the organization. Although an organization may have electronic security for its vital business assets and information,
it is also imperative to place emphasis on securing the physical environment of the organization from both internal and external threats. The organization must first protect its employees, who are among its critical assets; this can be achieved only by securing the office surroundings from various occupational hazards and from malicious outsiders (Magklaras and Furnell, 2002). Securing the physical environment tends to prevent terminated employees from regaining access through legitimate current employees, as physical security acts as an extra layer of defense. By maintaining such a layer, an insider has the least chance of turning hostile against the organization for financial gain at the instigation of a terminated employee. Therefore, physical security carries equal importance in eradicating the insider threat from the organization.

LIX. CONCLUSION
The insider threat is a long-standing security issue for every organization. We can only mitigate the threat; even with the use of highly sophisticated tools and techniques we cannot altogether eradicate it. Only the aforesaid precautionary measures, if followed meticulously, can help an organization counter the insider threat to some extent. In this paper we have attempted to propose some specific guidelines for combating the insider threat in any organization. Indeed, we do not claim that our guidelines are completely sufficient to face this threat; they may differ from organization to organization according to their security requirements.

REFERENCES
[1] ZDNet Asia IT Priorities Survey 2008/2009 (accessed 22/02/2011).
[2] Ciechanowicz, Z. (1997), Risk analysis: requirements, conflicts and problems, Computers & Security, 16(3), 223-232.
[3] Roy Sarkar, K. (2011), Assessing insider threats to information security using technical, behavioural and organisational measures, Information Security Technical Report, article in press.
[4] Magklaras, G.B. and Furnell, S.M. (2002), Insider Threat Prediction Tool: Evaluating the probability of IT misuse, Computers & Security, 21(1), 62-73.
[5] Alghathbar, K. (2007), Validating the enforcement of access control policies and separation of duty principle in requirement engineering, Information and Software Technology, 49(2), 142-157.
[6] Williams, A.H. (2008), In a trusting environment, everyone is responsible for information security, Information Security Technical Report, 13(4), 207-215.
[7] Hines, M. (2007), Insider threats remain IT's biggest nightmare, InfoWorld, September 22.
[8] Cappelli, D., Moore, A., Trzeciak, R. and Shimeall, T. (2009), Common Sense Guide to Prevention and Detection of Insider Threats, 3rd Edition Version 3.1, Carnegie Mellon University CyLab.
[9] Colwill, C. (2009), Human factors in information security: The insider threat, who can you trust these days?, Information Security Technical Report, 14(4), 186-196.
[10] Peecher, M., Schwartz, R. and Solomon, I. (2007), It's all about audit quality: Perspectives on strategic-systems auditing, Accounting, Organizations and Society, 32(4-5), 463-485.
[11] Moynihan, J. (2008), Managing the Insider Threat: Data Surveillance, Information Systems Audit and Control Association (accessed 11/02/2011).


Search Engine: Factors Influencing the Page Rank

Abstract In today's world the Web is considered an ocean of data and information (text, videos, multimedia, etc.) consisting of millions and millions of web pages linked with each other. It is often argued, especially considering the dynamics of the internet, that too much time has passed since the scientific work on PageRank for it still to be the basis of the ranking methods of the Google search engine. There is no doubt that within the past years many changes, adjustments and modifications regarding the ranking methods of Google have most likely taken place, but PageRank was absolutely crucial to Google's success, so at least the fundamental concept behind PageRank should still be constitutive. This paper describes the factors that affect the ranking of web pages and helps in calculating those factors. By adapting these factors, website developers can increase their site's PageRank. Within the PageRank concept, the rank of a document is given by the rank of the documents that link to it; their rank, in turn, is given by the rank of the documents that link to them. The PageRank of a document is thus always determined recursively by the PageRank of other documents.

A) Page Rank: PageRank is the algorithm used by the Google search engine, originally formulated by Sergey Brin and Larry Page in their paper "The Anatomy of a Large-Scale Hypertextual Web Search Engine". It is based on the premise, prevalent in the world of academia, that the importance of a research paper can be judged by the number of citations the paper has from other research papers. Brin and Page simply transferred this premise to its web equivalent: the importance of a web page can be judged by the number of hyperlinks pointing to it from other web pages. [2] The web graph now has huge dimensions and is subject to dramatic updates in terms of nodes and links; therefore a PageRank assignment tends to become obsolete very soon. [4]
B) About the algorithm
PageRank was developed by Google founders Larry Page and Sergey Brin at Stanford. At the time Page and Brin met, search engines typically linked to pages that had the highest keyword density, which meant people could game the system by repeating the same phrase over and over to attract higher search results. The rapidly growing web graph contains several billion nodes, making graph-based computations very expensive. One of the best-known web-graph computations is PageRank, an algorithm for determining the importance of web pages. [7] Page and Brin's theory is that the most important pages on the Internet are the pages with the most links leading to them. [1] PageRank thinks of links as votes, where a page linking to another page is casting a vote.

It may look daunting to non-mathematicians, but the PageRank algorithm is in fact elegantly simple and is calculated as follows:
i) PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where
PR(A) is the PageRank of page A,
PR(T1) is the PageRank of page T1,
C(T1) is the number of outgoing links from page T1,
d is a damping factor in the range 0 < d < 1, usually set to 0.85.
The PageRank of a web page is therefore calculated as a sum of the PageRanks of all pages linking to it (its incoming links), each divided by the number of links on that page (its outgoing links). From a search engine marketer's point of view, this

means there are two ways in which PageRank can affect the position of your page on Google:
ii) The number of incoming links. Obviously the more of these the better. But there is another thing the algorithm tells us: no incoming link can have a negative effect on the PageRank of the page it points at; at worst it can simply have no effect at all.
iii) The number of outgoing links on the page which points at your page. The fewer of these the better. This is interesting: it means that, given two pages of equal PageRank linking to you, one with 5 outgoing links and the other with 10, you will get twice the increase in PageRank from the page with only 5 outgoing links.
At this point we take a step back and ask ourselves just how important PageRank is to the position of your page in the Google search results. The next thing we can observe about the PageRank algorithm is that it has nothing whatsoever to do with relevance to the search terms queried. It is simply one single (admittedly important) part of the entire Google relevance ranking algorithm. Perhaps a good way to look at PageRank is as a multiplying factor, applied to the Google search results after all its other computations have been completed: the Google algorithm first calculates the relevance of pages in its index to the search terms, and then multiplies this relevance by the PageRank to produce a final list. The higher your PageRank, therefore, the higher up the results you will be, but there are still many other factors related to the positioning of words on the page which must be considered first. [2]
LX. THE EFFECT OF INBOUND LINKS
It has already been shown that each additional inbound link for a web page always increases that page's PageRank. Taking a look at the PageRank algorithm, which is given by
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
one may assume that an additional inbound link from page X increases the PageRank of page A by
d × PR(X) / C(X)
where PR(X) is the PageRank of page X and C(X) is the total number of its outbound links. But page A usually links to other pages itself, so these pages receive a PageRank benefit as well. If these pages link back to page A, page A will gain an even higher PageRank benefit from its additional inbound link. The single effects of additional inbound links shall be illustrated by an example.
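Before turning to the worked example, the recursive definition can be made concrete by simple fixed-point iteration: start every page at a rank of 1 and repeatedly re-evaluate PR(p) = (1-d) + d Σ PR(T)/C(T) until the values settle. The sketch below uses the non-normalized variant of the formula given above; the three-page link structure is an illustrative assumption, not data from the paper:

```python
# Fixed-point iteration of PR(p) = (1-d) + d * sum(PR(t)/C(t)) over the
# pages t that link to p, i.e. the non-normalized PageRank of this paper.

def pagerank(links, d=0.85, iterations=100):
    """links maps each page to the list of pages it links to."""
    pr = {page: 1.0 for page in links}        # every page starts at rank 1
    for _ in range(iterations):
        pr = {page: (1 - d) + d * sum(pr[t] / len(links[t])
                                      for t in links if page in links[t])
              for page in links}
    return pr

# Hypothetical three-page web: A and B link to each other, C links to A.
ranks = pagerank({"A": ["B"], "B": ["A"], "C": ["A"]})
```

With d = 0.85, page C (no inbound links) settles at exactly 1-d = 0.15, while A outranks B because it receives votes from both B and C, matching the algorithm's behavior described above.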

We regard a website consisting of four pages A, B, C and D which are linked to each other in a circle. Without external inbound links to one of these pages, each of them obviously has a PageRank of 1. We now add a page X to our example, for which we presume a constant PageRank PR(X) of 10. Further, page X links to page A by its only outbound link. Setting the damping factor d to 0.5, we get the following equations for the PageRank values of the single pages of our site:
PR(A) = 0.5 + 0.5 (PR(X) + PR(D)) = 5.5 + 0.5 PR(D)
PR(B) = 0.5 + 0.5 PR(A)
PR(C) = 0.5 + 0.5 PR(B)
PR(D) = 0.5 + 0.5 PR(C)
Since the total number of outbound links for each page is one, the outbound links do not need to be considered in the equations. Solving them gives us the following PageRank values:
PR(A) = 19/3 = 6.33
PR(B) = 11/3 = 3.67
PR(C) = 7/3 = 2.33
PR(D) = 5/3 = 1.67
We see that the initial effect of the additional inbound link of page A, which was given by
d × PR(X) / C(X) = 0.5 × 10 / 1 = 5,
is passed on by the links on our site.
A) The Influence of the Damping Factor
The degree of PageRank propagation from one page to another by a link is primarily determined by the damping factor d. If we set d to 0.75 we get the following equations for our above example:
PR(A) = 0.25 + 0.75 (PR(X) + PR(D)) = 7.75 + 0.75 PR(D)
PR(B) = 0.25 + 0.75 PR(A)
PR(C) = 0.25 + 0.75 PR(B)
PR(D) = 0.25 + 0.75 PR(C)
Solving these equations gives us the following PageRank values:
PR(A) = 419/35 = 11.97

PR(B) = 323/35 = 9.23
PR(C) = 251/35 = 7.17
PR(D) = 197/35 = 5.63
First of all, we see that there is a significantly higher initial effect of the additional inbound link for page A, which is given by
d × PR(X) / C(X) = 0.75 × 10 / 1 = 7.5
We remark that the way one handles dangling nodes is crucial, since there can be a huge number of them. According to Kamvar et al. [Kamvar et al. 03b], a 2001 sample of the web containing 290 million pages had only 70 million non-dangling nodes. This large number of nodes without out-links includes both pages that do not point to any other page and pages whose existence is inferred from hyperlinks but which have not yet been reached by the crawler. Besides, a dangling node can represent a pdf, ps, txt, or any other file format gathered by a crawler but with no hyperlinks pointing outside. [4]
This initial effect is then propagated even more strongly by the links on our site. In this way, the PageRank of page A is almost twice as high at a damping factor of 0.75 as it is at a damping factor of 0.5. At a damping factor of 0.5 the PageRank of page A is almost four times as high as the PageRank of page D, while at a damping factor of 0.75 it is only a little more than twice as high. So, the higher the damping factor, the larger the effect of an additional inbound link on the PageRank of the page that receives the link, and the more evenly PageRank is distributed over the other pages of a site.
B) The Actual Effect of Additional Inbound Links
At a damping factor of 0.5, the accumulated PageRank of all pages of our site is given by
PR(A) + PR(B) + PR(C) + PR(D) = 14
Hence, by a page with a PageRank of 10 linking to one page of our example site by its only outbound link, the accumulated PageRank of all pages of the site is increased by 10. (Before adding the link, each page had a PageRank of 1.)
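The values solved above can be verified numerically by iterating the four equations until they converge. The sketch models the external page X as a fixed contribution of PR(X)/C(X) = 10 to page A, exactly as in the example:

```python
# Reproduce the circular-site example: pages A -> B -> C -> D -> A,
# plus external page X (constant PR 10, single outlink) pointing at A.

def solve_site(d, pr_x=10.0, iterations=200):
    pr = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
    for _ in range(iterations):
        pr["A"] = (1 - d) + d * (pr_x + pr["D"])  # X and D both link to A
        pr["B"] = (1 - d) + d * pr["A"]
        pr["C"] = (1 - d) + d * pr["B"]
        pr["D"] = (1 - d) + d * pr["C"]
    return pr

low = solve_site(0.5)    # PR(A) = 19/3, PR(D) = 5/3, total = 14
high = solve_site(0.75)  # PR(A) = 419/35, PR(D) = 197/35, total = 34
```

Running both damping factors side by side also confirms the accumulated totals used in the next subsection: 14 at d = 0.5 and 34 at d = 0.75.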
At a damping factor of 0.75, the accumulated PageRank of all pages of the site is given by

PR(A) + PR(B) + PR(C) + PR(D) = 34

This time the accumulated PageRank increases by 30. The accumulated PageRank of all pages of a site always increases by

(d / (1-d)) (PR(X) / C(X)),

where X is a page additionally linking to one page of the site, PR(X) is its PageRank and C(X) its number of outbound links. The formula presented above is only valid if the additional link points to a page within a closed system of pages, as, for instance, a website without outbound links to other sites. As far as the website has links pointing to external pages, the surplus for the site itself diminishes accordingly, because a part of the additional PageRank is propagated to external pages. The justification of the above formula was given by Raph Levien and is based on the Random Surfer Model. The walk length of the random surfer follows an exponential distribution with a mean of d/(1-d). When the random surfer follows a link into a closed system of web pages, he visits on average d/(1-d) pages within that closed system. So, this much of the PageRank of the linking page - weighted by the number of its outbound links - is distributed to the closed system. For the actual PageRank calculations at Google, Lawrence Page and Sergey Brin claim to usually set the damping factor d to 0.85. Thereby, the boost for a closed system of web pages by an additional link from page X is given by

(0.85 / 0.15) (PR(X) / C(X)) = 5.67 (PR(X) / C(X))

So, inbound links have a far larger effect than one may assume.[2]

LXI. THE EFFECT OF OUTBOUND LINKS
Since PageRank is based on the linking structure of the whole web, it is inescapable that if the inbound links of a page influence its PageRank, its outbound links also have some impact. To illustrate the effects of outbound links, we take a look at a simple example.

We regard a web consisting of two websites, each having two web pages. One site consists of pages A and B, the other consists of pages C and D. Initially, both pages of each site solely link to each other. It is obvious that each page then has a PageRank of one.[6] Now we add a link which points from page A to page C. At a damping factor of 0.75, we therefore get the following equations for the single pages' PageRank values:

PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.75 PR(D) + 0.375 PR(A)
PR(D) = 0.25 + 0.75 PR(C)

Solving the equations gives us the following PageRank values for the first site:

PR(A) = 14/23
PR(B) = 11/23

We therefore get an accumulated PageRank of 25/23 for the first site. The PageRank values of the second site are given by

PR(C) = 35/23
PR(D) = 32/23

So, the accumulated PageRank of the second site is 67/23. The total PageRank for both sites is 92/23 = 4. Hence, adding a link has no effect on the total PageRank of the web. Additionally, the PageRank benefit for one site equals the PageRank loss of the other.

A) The Actual Effect of Outbound Links
As has already been shown, the PageRank benefit for a closed system of web pages from an additional inbound link is given by

(d / (1-d)) (PR(X) / C(X)),

where X is the linking page, PR(X) is its PageRank and C(X) the number of its outbound links. Hence, this value also represents the PageRank loss of a formerly closed system of web pages when a page X within this system now points by a link to an external page. The validity of the above formula requires that the page which receives the link from the formerly closed system of pages does not link back to that system, since it otherwise regains some of the lost PageRank. Of course, this effect may also occur when it is not the page receiving the link that links back directly, but another page which has an inbound link from it. Indeed, this effect may be disregarded because of the damping factor, if there are enough other web pages in between the link recursion. The validity of the formula also requires that the linking site has no other external outbound links. If it has other external outbound links, the loss of PageRank of the regarded site diminishes, and the pages already receiving a link from that page lose PageRank accordingly.[6] Even if the actual PageRank values for the pages of an existing web site were known, it would not be possible to calculate to what extent an added outbound link diminishes the site's PageRank, since the above presented formula regards the status after adding the link.

B) Intuitive Justification of the Effect of Outbound Links
The intuitive justification for the loss of PageRank by an additional external outbound link, according to the Random Surfer Model, is that by adding an external outbound link to one page the surfer becomes less likely to follow an internal link on that page. So, the probability of the surfer reaching other pages within the site diminishes. If those other pages of the site have links back to the page to which the external outbound link has been added, this page's PageRank will also deplete. We can conclude that external outbound links diminish the totalized PageRank of a site and probably also the PageRank of each single page of the site. But, since links between web sites are the foundation of PageRank and indispensable for its functioning, there is the possibility that outbound links have positive effects within other parts of Google's ranking criteria. Lastly, relevant outbound links do constitute the quality of a web page, and a webmaster who points to other pages integrates their content in some way into his own site.

C) Dangling Links
An important aspect of outbound links is the lack of them on web pages. When a web page has no outbound links, its PageRank cannot be distributed to other pages. Lawrence Page and Sergey Brin characterize links to those pages as dangling links.
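In graph terms, a dangling link points to a page with no out-links. A minimal sketch of detecting such pages in a link graph (the dict representation, the toy graph and the function name are illustrative, not from the paper):

```python
# A dangling node is a page with no outbound links. Pages that
# appear only as link targets (e.g. crawled PDF files) also count.

def dangling_nodes(graph):
    known = set(graph)
    for targets in graph.values():
        known.update(targets)  # include pages seen only as targets
    return sorted(page for page in known if not graph.get(page))

graph = {"A": ["B", "C"], "B": ["A"], "C": []}  # C is dangling
print(dangling_nodes(graph))  # ['C']
```

Such pages would be removed before the PageRank computation, as described at the end of this section.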


The effect of dangling links shall be illustrated by a small example website. We take a look at a site consisting of three pages A, B and C. In our example, pages A and B link to each other. Additionally, page A links to page C. Page C itself has no outbound links to other pages. At a damping factor of 0.75, we get the following equations for the single pages' PageRank values:

PR(A) = 0.25 + 0.75 PR(B)
PR(B) = 0.25 + 0.375 PR(A)
PR(C) = 0.25 + 0.375 PR(A)

Solving the equations gives us the following PageRank values:

PR(A) = 14/23
PR(B) = 11/23
PR(C) = 11/23

So, the accumulated PageRank of all three pages is 36/23, which is just over half the value we could have expected if page C also linked back to one of the other pages. According to Page and Brin, the number of dangling links in Google's index is fairly high. One reason is that many linked pages are not indexed by Google, for example because indexing is disallowed by a robots.txt file. Additionally, Google meanwhile indexes several file types and not HTML only; PDF or Word files do not really have outbound links and, hence, dangling links could have a major impact on PageRank. Regarding our example website for dangling links, removing page C from the database results in pages A and B each having a PageRank of 1. After the calculations, page C is assigned a PageRank of 0.25 + 0.375 PR(A) = 0.625. So, the accumulated PageRank does not equal the number of pages, but at least the pages which have outbound links are not harmed by the dangling links problem.

The definition of PageRank above has another intuitive basis in random walks on graphs. The simplified version corresponds to the standing probability distribution of a random walk on the graph of the Web. Intuitively, this can be thought of as modeling the behavior of a random surfer. The random surfer simply keeps clicking on successive links at random. However, if a real Web surfer ever gets into a small loop of web pages, it is unlikely that the surfer will continue in the loop forever.
Instead, the surfer will jump to some other page.[5] By removing dangling links from the database, they do not have any negative effects on the PageRank of the rest of the web. Since PDF files are dangling links, links to PDF files do not diminish the PageRank of the linking page or site. So, PDF files can be a good means of search engine optimization for Google.

LXII. CONCLUSION
We conclude that the main factors influencing the PageRank of a page are its inbound links and its outbound links, including dangling links. Future work could examine how the total number of pages affects the PageRank of a web site. In order to protect PageRank from the negative effects of dangling links, pages without outbound links have to be removed from the database until the PageRank values are computed. According to Page and Brin, the number of outbound links on pages with dangling links is thereby normalized. As shown in our illustration, removing one page can cause new dangling links and, hence, removing pages has to be an iterative process. After the PageRank calculation is finished, PageRank can be assigned to the formerly removed pages based on the PageRank algorithm. For this, as many iterations are needed as were used for removing the pages. Regarding our illustration, page C could be processed before page B. At that point, page B has no PageRank yet and, so, page C will not receive any either. Then, page B receives PageRank from page A, and during the second iteration, page C also gets its PageRank.

REFERENCES
[1] L. Page, S. Brin, R. Motwani, and T. Winograd, "The PageRank Citation Ranking: Bringing Order to the Web," 1999.
[2] S. Brin and L. Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine," Proceedings of the 7th International Conference on World Wide Web (WWW), 1998.
[3] T. Haveliwala and S. Kamvar, "The Second Eigenvalue of the Google Matrix," Stanford University Technical Report 7056, March 2003.
[4] G. M. Del Corso, A. Gullì, and F. Romani, "Fast PageRank Computation via a Sparse Linear System," 2005.
[5] S. Brin, R. Motwani, L. Page, and T. Winograd, "What Can You Do with a Web in Your Pocket?," 1998.
[6] J. Cho, H. Garcia-Molina, and L. Page, "Efficient Crawling Through URL Ordering," 1998.
[7] L. Page, S. Brin, R. Motwani, and T. Winograd, "The PageRank Citation Ranking: Bringing Order to the Web," Stanford Digital Libraries Working Paper, 1998.






Are The CMMI Process Areas Met By Lean Software Development?

Abstract Agile software development is a conceptual framework for undertaking software engineering projects. There are a number of agile software development methodologies such as Extreme Programming (XP), Lean Development, and Scrum. Along with these, many organizations demand CMMI compliance of projects where agile methods are employed. This paper analyzes to what extent the CMMI process areas can be covered by Lean Development and where adjustments have to be made.

Index Terms - Agile, CMMI, Lean, Lean software development, process areas, maturity levels

LXIII. INTRODUCTION
Organizational maturity indicators like CMMI levels have become increasingly important for software development. In large organizations there are policies which enforce that all parts of the organization achieve certain maturity levels. At the same time, Lean software development is gaining increasing attention in the software development community. Just like the other agile methods, lean methods offer new approaches to the existing challenges in software development. In this paper we investigate the applicability and usefulness of the CMMI model suite in agile/lean development efforts.

LXIV. LEAN SOFTWARE DEVELOPMENT
The methodology of lean software development is the application of lean manufacturing principles to developing software. Bob Charette, the originator, writes that the measurable goal of Lean development is to build software with one-third the human effort, one-third the development hours and one-third the investment compared to what an SEI CMM Level 3 organization would achieve. The steps to be followed in developing a project using Lean thinking [1] are:
Step 1: Identify value
Step 2: Map the value stream
Step 3: Create flow
Step 4: Establish pull
Step 5: Seek perfection

LXV. PRINCIPLES OF LEAN THINKING
1. Eliminate Waste
2. Increase Learning/Feedback
3. Make Decisions as Late as Possible/Delay Commitment
4. Deliver as Quickly as Possible/Deliver Fast
5. Empower the Team
6. Build Integrity In
7. See the "Big Picture"/See the Whole

LXVI. CMMI
Capability Maturity Model Integration (CMMI) is a process improvement approach that helps an organization improve its performance. According to the Software Engineering Institute (SEI, 2008), CMMI helps "integrate traditionally separate organizational functions, set process improvement goals and priorities, provide guidance for quality processes, and provide a point of reference for appraising current processes."[2]

LXVII. ARE THE PROCESS AREAS OF CMMI MET BY LEAN SOFTWARE DEVELOPMENT?
Process areas are the areas that will be covered by the organization's processes. The process areas have been listed below:
A) Project Planning [4]

plans for iteration at a time. B) Project Monitoring and Control [4] SG 1: Monitor the Project against the Plan Table 4 Specific practices and their analysis Specific practices SP 1.1 Monitor Project Planning Parameters SP 1.2 Monitor Commitments SP 1.3 Monitor Project Risks SP 1.4 Monitor Data Management SP 1.5 Monitor Stakeholder Involvement SP 1.6 Conduct Progress Reviews SP 1.7 Conduct Milestone Reviews Covered by lean? Find out the wastes, check the progress, continuous review, continuous waste elimination, communication with stakeholders, pull kanban metrics [3], meeting commitments on daily basis. Priorities are set for each iteration.

SG1: Establish estimates Table 1 Specific practices and their analysis Specific practices SP 1.1 Estimate the Scope of the Project SP 1.2 Establish Estimates of Work Product and Task Attributes SP 1.3 Define Project Lifecycle Phases SP 1.4 Estimate Effort and Cost Covered by lean? Decide the iterations Value stream mapping is the mechanism used

To eliminate wastes

SG2: Develop a project plan Table 2 Specific practices and their analysis Specific practices SP 2.1 Establish the Budget and Schedule SP 2.2 Identify Project Risks SP 2.3 Plan Data Management SP 2.4 Plan the Project's Resources SP 2.5 Plan Needed Knowledge and Skills SP 2.6 Plan Stakeholder Involvement SP 2.7 Establish the Project Plan SG3: Obtain commitment to the plan Table 3 Specific practices and their analysis Specific practices SP 3.1 Review Plans that Affect the Project SP 3.2 Reconcile Work and Resource Levels SP 3.3 Obtain Plan Commitment Covered by lean? Plan for single iteration not the whole project at the beginning. Covered by lean? Decide the work flow, define the definition of done, option analysis [3], involve the stakeholders for feedback.

SG 2: Manage Corrective Action to Closure Table 5 Specific practices and their analysis Specific practices SP 2.1 Analyze Issues SP 2.2 Take Corrective Action SP 2.3 Manage Corrective Actions Covered by lean? Rework is done if an iteration's performance deviates from what was planned. The plan is flexible, so deviation from the plan is allowed. Features are dropped to the next iteration rather than missing the current iteration's deadline [3].


Supplier agreement management [4]

SG 1 Establish Supplier Agreements Table 6 Specific practices and their analysis Specific practices SP 1.1 Determine Acquisition Type SP 1.2 Select Suppliers Covered by lean? Tight coupling of suppliers, fast response times[3]

SP 1.3 Establish Supplier Agreements SG 2 Satisfy Supplier Agreements Table 7 Specific practices and their analysis

CMMI covers the entire project in a single go, whereas Lean breaks the project into iterations and

Covered by lean? Quick delivery in short durations

This process area is relevant for the entire project, but in lean software development the teams and the customer are more crucial. Therefore it might be replaced with a more relevant process area related to the integration of teams or customers.

Specific practices SP 2.1 Review COTS products SP 2.2 Execute the Supplier Agreement SP 2.3 Accept the Acquired Product SP 2.4 Ensure Transition of Products

E) Risk Management [4] SG 1 Prepare for Risk Management Table 10 Specific practices and their analysis

D) Integrated Project Management [4] SG 1 Use the Project's Defined Process Table 8 Specific practices and their analysis

Specific practices Specific practices Covered by lean? Define the steps for iteration, learn within a team and through organization, change quickly according to the need [3], contribute to team by equal participation.

Covered by lean? Find the wastes/ risks and eliminate them.

SP 1.1 Establish the Project's Defined Process SP 1.2 Use Organizational Process Assets for Planning Project Activities SP 1.3 Establish the Project's Work Environment SP 1.4 Integrate Plans SP 1.5 Manage the Project Using the Integrated Plans SP 1.6 Contribute to Organizational Process Assets

SP 1.1 Determine Risk Sources and Categories SP 1.2 Define Risk Parameters SP 1.3 Establish a Risk Management Strategy

SG 2 Identify and Analyze Risks Table 11 Specific practices and their analysis

Specific practices SP 2.1 Identify Risks SP 2.2 Evaluate, Categorize, and Prioritize Risks

Covered by lean? Find out the value added activities and separate wastes.

F) Quantitative Project Management [4] SG 2 Coordinate and Collaborate with Relevant Stakeholders Table 9 Specific practices and their analysis SG 1 Prepare for Quantitative Management Table 12 Specific practices and their analysis

Specific practices

Covered by lean? Close collaboration with stakeholders, frequent feedback from stakeholders [3].

Specific practices SP 1.1 Establish the Projects Objectives SP 1.2 Compose the Defined Processes

SP 2.1 Manage Stakeholder Involvement SP 2.2 Manage Dependencies SP 2.3 Resolve Coordination

Covered by lean? Subprocesses are the iterations in the value stream.

SP 1.3 Select Subprocesses and Attributes SP 1.4 Select Measures and Analytic Techniques SP 1.1 Elicit Needs SP 1.2 Transform Stakeholder Needs into Customer Requirements SG 2 Develop Product Requirements Table 16 Specific practices and their analysis SG 2 Quantitatively Manage the Project Table 13 Specific practices and their analysis Specific practices Specific practices
SP 2.1 Monitor the Performance of Selected Subprocesses SP 2.2 Manage Project Performance SP 2.3 Perform Root Cause Analysis

Rough idea of customer needs

Covered by lean?
The technique of 5 whys can be used for root cause analysis.

SP 2.1 Establish Product and Product Component Requirements SP 2.2 Allocate Product Component Requirements SP 2.3 Identify Interface Requirements

Covered by lean? Prioritize the requirements at iteration level


G) Requirements Management [4] SG 1 Manage Requirements Table 14 Specific practices and their analysis Specific practices SP 1.1 Understand Requirements SP 1.2 Obtain Commitment to Requirements SP 1.3 Manage Requirements Changes SP 1.4 Maintain Bidirectional Traceability of Requirements SP 1.5 Ensure Alignment Between Project Work and Requirements H) Requirements Development [4] SG 1 Develop Customer Requirements Table 15 Specific practices and their analysis Covered by lean? Understanding of requirements by meeting the customer [3]. Requirements evolve as the project progresses. Work is more important than signing agreements. Initial requirements might not be useful as the project moves toward completion. SG 3 Analyze and Validate Requirements Table 17 Specific practices and their analysis

Specific practices SP 3.1 Establish Operational Concepts and Scenarios SP 3.2 Establish a Definition of Required Functionality and Quality Attributes SP 3.3 Analyze Requirements SP 3.4 Analyze Requirements to Achieve Balance SP 3.5 Validate Requirements

Covered by lean? Functional analysis is limited to iteration. Requirements are met according to the definition of done.

I) Technical Solution [4]

Table 18 Specific goal, its specific practices and their analysis

Specific goal

Specific practices

Covered by lean?

Specific practices

Covered by lean?

SG 1 Select Product Component Solutions SP 1.1 Develop Alternative Solutions and Selection Criteria SP 1.2 Select Product Component Solutions SP 3.1 Implement the Design SP 3.2 Develop Product Support Documentation
Q) Causal Analysis and Resolution [4] This process area is followed by lean software development.
R) Organizational Process Focus [4] The focus here is on iterations rather than on the process of the entire organization.
S) Organizational Process Definition [4] The focus is on iterations rather than on the process of the entire organization.
T) Organizational Training [4] The focus of training is on individuals. They are provided with the environment and tools and are trusted to develop the project.
U) Organizational Process Performance [4] The process is fixed in lean software development, so SP 1.2 [4] is not applicable. The focus of lean software development is to improve the process by eliminating wastes.
V) Organizational Performance Management [4] The practices of this process area fully support the lean practices.

LXVIII. CONCLUSION
CMMI and lean software development are both approaches to continuous improvement. This paper concludes that CMMI tends to reduce risk in lean software development. These practices make good sense, and one could argue that they have always inherently been expected as part of an agile method. In general the CMMI model provides a good understanding of what practices to consider, but you will have to adapt it to your context and find lean implementations for the practices.

REFERENCES
[1] M. Poppendieck and T. Poppendieck, Lean Software Development: An Implementation Guide. Addison-Wesley, 2006.
[2] D. N. Card, "Integrating Lean, Six Sigma, and CMMI."
[3] J. L. Dutton and R. S. McCabe, "Agile/Lean Development and CMMI," Systems and Software Consortium.

Options are considered until the last possible moment.

SG 3 Implement the Product Design

Design is not to be prepared for the whole project before coding begins.

J) Product Integration [4] This process area is not strictly followed in lean software development.
K) Verification [4] Reviews are conducted whenever the whole team and the customer meet. Thus verification takes place more frequently, exposing defects earlier.
L) Validation [4] This process area conforms to the lean software development methodology.
M) Measurement and Analysis [4] The focus here is the collection of data in huge volumes, but lean software development focuses on storing only small amounts of necessary data, so that the team can focus on what provides value to the customer.
N) Process and Product Quality Assurance [4] The work products are evaluated by the team and the customer together. Any required changes are carried out quickly instead of waiting for documentation to be completed.
O) Configuration Management [4] This process area is followed by lean software development, but changes are made quickly rather than following long procedures.
P) Decision Analysis and Resolution [4] This process area is followed by lean software development.





Password Protected File Splitter And Merger (With Encryption And Decryption)

Abstract A splitter is a program used to split a single file into several smaller files, and a file merger is used to combine these smaller files back into a single file whose size is the sum of the pieces. A splitter breaks a file into a number of pieces to reduce the file size, and the merger recombines the pieces so the file is again readable and usable by people who want to work with the single large file. The individual pieces of a split file are not usable on their own, because they lose their identity and cannot be recognized by the operating system: they have no file extension and are only small chunks of the large file. The purpose of splitting files into small sizes is to store them in less memory space and to transfer them over the Internet, where large files are often not allowed. This project researches file splitters and mergers and tries to enhance the functionality of existing splitters and mergers by adding extended functionality. We split the desired file into a number of pieces according to the size given by the user in MB (megabytes). If there is a 100 MB file and the user wants to split it into 10 MB pieces, the system will generate 10 pieces of the 100 MB file, each of 10 MB size. So it depends on the user how large each single piece should be. If a file named file is broken into 10 pieces, the pieces are indexed 0 to 9, i.e. file0 to file9. The file is then broken into 10 pieces, none of which means anything to anyone individually until they are merged again into a human-understandable form.


What is File Splitter?
A file splitter is a program that splits a particular file into pieces of a specified size. We thereby reduce the file size, which is desirable from a transfer point of view. The splitter breaks a big file into a number of small files, each much smaller than the original, using a program that divides the original file. For example, suppose you have a file which is 100 MB in size and you want to deliver it to your friend over the Internet. You may not be allowed to send it at its original size, so you have to break it into the size allowed by the service provider carrying your file. Mailing services such as Gmail and Yahoo Mail, which send and receive mails to and from remote servers, allow only a limited file size per message, for example 25 MB; any file greater than this size will not be sent or received. So if you want to exchange files with your friends over the Internet, you really need a program to convert files to the size allowed by your service provider.

The splitter program you use should also have some restrictions so that no one other than you can use it. If a splitter program is accessible to everyone, it can be used by anyone who can access your computer. So you should use a splitter program which at least guarantees that the software cannot be used by unauthorized persons, because if you are paying for something, no one else has the right to use it without your permission. If your friend copies your paid splitter program and executes it on his local machine, it should not work; otherwise it is a waste of money, with another person using the same thing for which you paid.

What is File Merger?
Once you break your files into smaller files, you again need a program which can reassemble them into a file of the original size, so you can use it again.
So you have to select a merger program to merge the split files. The merger takes the small files as input and generates a file of the actual file size, which can then be used by ordinary people who are not very aware of the underlying technology. A merger is the program you use to reassemble a file which you broke down before sending it through the Internet. The merger uses the same process as the splitter, but in reverse: it combines the split parts, while the splitter splits the combined file. The merger thus reassembles the split files and restores a usable structure to pieces that are otherwise unreadable.

What is Encryption?
Encryption is the process of transforming information using an algorithm to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information. Encryption has long been used by militaries and governments to facilitate secret communication, and is now commonly used to protect information within many kinds of civilian systems. Encryption is used to protect data in transit, for example data being transferred via networks (e.g. the Internet, e-commerce), mobile telephones, wireless microphones, wireless intercom systems, Bluetooth devices and bank automatic teller machines. There have been numerous reports of data in transit being intercepted in recent years. Encryption can protect the confidentiality of messages, but other techniques are still needed to protect the integrity and authenticity of a message (your files). So if you send a confidential video file to your friend and it is intercepted by a hacker in between, or your friend's mail account is hacked by someone who knows his/her password, it will no longer be confidential and will be leaked.
So it is necessary to have a system that can protect the files which are confidential and which you do not want to share with anyone: you have to use a program to protect your files from unauthorized access.

What is Decryption?
Decryption is the inverse of encryption: an encrypted text is converted back into plain text, i.e. normal, human-understandable information. Decryption is only possible if the user who wants to decrypt the encrypted file knows how it was encrypted, or possesses a key that can reverse the encryption process. That key may be a password or any kind of scheme that can convert the encrypted text back to the normal file, whatever it was.

LXX. SOFTWARE REQUIREMENT
A) Purpose
This document details the software requirements specification for the File Splitter and Merger (with encryption and decryption) open source project. It will later be used as a base for the extension of the existing software itself. This document follows the IEEE (Institute of Electrical and Electronics Engineers) standard for software requirements specification documents. A computer file is a block of arbitrary information, or a resource for storing information, which is usually based on some kind of durable storage. Most computers have at least one file system, and some computers allow the use of several different file systems. For instance, on newer MS Windows computers, the older FAT (File Allocation Table) file systems of old versions of Windows are supported, in addition to the file system that is the normal file system for recent versions of Windows.

B) Document Conventions
File Splitter and Merger were created prior to this document, so all requirements stated here are already satisfied.
It is very important to update this document with every future requirement and to clarify its priority, for consistency purposes, so that this document remains useful. Because file splitter and merger are already implemented, parts of this document have a style similar to a manual.

C) Project Scope
File Splitter and Merger is a tool that can split, merge and manipulate files. It provides a command line interface and a shell interface (console). It is available in two versions, basic and enhanced; both are open source. The command line interface provides the user with all the functionality needed to handle a file (or several files together). The functionality is distributed in modules; each module performs a specific function and loads in the main menu. In the basic version, the software contains four modules:
A) Split
B) Merge
C) Encrypt
D) Decrypt

E) Password Manager


Fig 3.2: USE CASE for MERGER C) ENCRYPTION AND DECRYPTION USE CASE In this use case the user first has to set a password to encrypt the file, because the encrypted file can only be decrypted again with the password set at the time of encryption. The password is the critical part of the encrypted file: if the password is stolen by an unauthorized person, he or she can view your encrypted data by decrypting it, so you have to be very careful about setting the password. Once you set your password, the file is encrypted, and it is decrypted again only by providing the same password to the decryption module. We merge the use cases for encryption and decryption because the two depend on each other; without encryption there is no use for the decryption module.
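The paper does not specify which cipher the encryption module uses, so the following is only an illustrative sketch, not the tool's actual algorithm: a keystream is derived from the user's password, and XORing it with the file contents makes decryption with the same password exactly the inverse operation, as described above.

```python
import hashlib

def _keystream(password: str, n: int) -> bytes:
    """Derive n pseudo-random bytes from the password (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(password.encode() + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(data: bytes, password: str) -> bytes:
    # XOR each byte with the password-derived keystream.
    ks = _keystream(password, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

decrypt = encrypt  # XOR is its own inverse, so decryption reuses the same routine
```

Decrypting with the correct password restores the original bytes; a wrong password yields only garbage, which is why the password must be shared safely with the receiver.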

Fig 3.1: USE-CASE for SPLITTER In the splitter use case, the user is first asked for the size into which he or she wants to break the file; that is, the module asks for the size of the small pieces of the original file the user needs. The piece size plays an important role in splitting a particular file into smaller pieces. The user then sets the destination folder for keeping the different small pieces of the single large file, and after the destination folder is set, all the small pieces are placed there as the output of the module. Here the user is any end user or customer of the intended program who wants to transfer a large file to another location. B) MERGER USE CASE In this use case, the user loads the files from the destination folder into the merger module and again sets a destination folder for binding all the files; that is, the user sets the destination for the file that will be an arrangement of all the small broken files.
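The splitter and merger use cases above can be sketched as two small routines. The piece-naming scheme and directory layout here are illustrative assumptions, not taken from the tool:

```python
import os

def split_file(path: str, chunk_size: int, dest_dir: str) -> list:
    """Split `path` into numbered pieces of at most `chunk_size` bytes in `dest_dir`."""
    os.makedirs(dest_dir, exist_ok=True)
    pieces = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            piece = os.path.join(dest_dir, f"{os.path.basename(path)}.{index:03d}")
            with open(piece, "wb") as out:
                out.write(chunk)
            pieces.append(piece)
            index += 1
    return pieces

def merge_files(pieces: list, dest_path: str) -> None:
    """Concatenate the pieces, in numbered order, back into a single file."""
    with open(dest_path, "wb") as out:
        for piece in sorted(pieces):
            with open(piece, "rb") as src:
                out.write(src.read())
```

Because the pieces carry zero-padded indices, sorting the file names restores the original order before merging.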


LXXII. CONCLUSION A data splitter and merger can really work as the basis of a good file transfer system. The biggest problem with the transfer of files is their large size. People mostly use USB pen drives (flash drives) and other storage devices for file transfer, but these do not work when a file must be transferred to a remote location or is larger than the space these devices can handle. You need a good file splitter to send a file to a remote location when its size exceeds the limits accepted on the Internet. The other option for file transfer is email, but email providers restrict us to a finite file size when transferring files over the Internet; for example, Yahoo allows a 25 MB and Gmail a 30 MB file size that you can send from your local computer to remote computers or other electronic devices where the mail system can be accessed. This password-protected File Splitter and Merger offers a better solution to all of the above problems. In addition to these problems, another arises when a file must be transferred securely from one computer to another over the network. Generally, the pieces of a split file do not reveal any information, but the master piece (the first of the split files), which contains the Beginning of File (BOF) byte, still opens even after the original file has been broken into small pieces. This problem can be eliminated by using a secure file transfer system, that is, encrypted files that are not easily understandable by a human, because encryption converts the normal text into encrypted text. Accordingly, our software provides a secure file transfer system along with the splitting and merging process, with which you can safely transfer your data over the Internet without exposing confidential information to anyone, even if it is stolen.
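The tool also guards itself with a password and a three-strike lockout. The paper does not give the mechanism's implementation, so the sketch below is illustrative only; storing a password hash rather than the password itself is an assumption:

```python
import hashlib

MAX_ATTEMPTS = 3

def unlock(stored_hash: str, attempts: list) -> bool:
    """Return True if the correct password appears within the first
    MAX_ATTEMPTS tries; after three wrong entries the tool stays locked.
    `attempts` stands in for interactive input so the logic is testable."""
    for attempt in attempts[:MAX_ATTEMPTS]:
        if hashlib.sha256(attempt.encode()).hexdigest() == stored_hash:
            return True          # correct password: unlock the software
    return False                 # three failures: refuse to operate further
```

A correct password on the third try still unlocks the tool; a correct password on a fourth try does not, matching the three-attempt rule described in this section.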
With the encryption process you set a password for the file you are going to encrypt. After the file is received at the other end, you tell your friend the password you applied to the file. Your friend, who also has our software, can then decrypt the same file with that password, for which the software asks at the time of decryption. Accordingly, we prepared a system consisting of four modules: (a) Splitter, (b) Merger, (c) Encryption and (d) Decryption. Each of these modules has its own importance for transferring a file from a local computer to another device over the network. First you set your password, which guarantees that the software is not accessed by an unauthorized person. You must always remember your password to use the software from then on; if you forget it and enter the wrong password three times, the software will not work further, that is, it will refuse to perform any work after that. The Splitter is used to split a file into small pieces of the desired size; Encryption is then used to encrypt the master file so that it is not readable by anyone. After this, the Internet is used to transfer the file pieces from one device to another. At the receiving end, the Decryption module is used to decrypt the master file, and the Merger module then merges all the pieces of the file received through the Internet. REFERENCES
[1] Balagurusamy E. (2004), Programming in ANSI C, Tata McGraw-Hill.
[2] Booch Grady, Rumbaugh James, Jacobson Ivar (2004), The UML User Guide, Pearson Education.
[3] Jacobson Ivar (2004), Object Oriented Software Engineering, Pearson Education.
[4] Kaner C. (2004), A Course in Black Box Software Testing. Available:
[5] Kanetkar Yashwant P. (1999), Let Us C, 5th Edition, BPB Publications.
[6] Stallings William (2006), Cryptography and Network Security: Principles and Practices, 4th Edition, Pearson Prentice Hall.
[7] Aggarwal K.K. & Singh Yogesh (2005), Software Engineering, New Age International.


Security Solution in Wireless Sensor Network

Abstract: Wireless sensor networking is an emerging technology that promises a wide range of potential applications in both civilian and military areas. Wireless sensor networks (WSNs) have many potential civilian and military applications, e.g., environmental monitoring, battlefield surveillance, and homeland security. In many important military and commercial applications it is critical to protect a sensor network from malicious attacks, which creates a demand for security mechanisms in the network. A wireless sensor network typically consists of a large number of low-cost, low-power, and multifunctional sensor nodes that are deployed in a region of interest. These sensor nodes are small in size but are equipped with sensors, embedded microprocessors, and radio transceivers; therefore they have not only sensing but also data processing and communication capabilities. They communicate over short distances via a wireless medium and collaborate to accomplish a common task, for example environment monitoring, military surveillance, or industrial process control. Wireless sensor networks are the result of developments in micro electro mechanical systems and wireless networks. These networks are made of tiny nodes and are becoming the future of many applications in which sensor networks are deployed in hostile environments. The nature of such deployments, where sensor networks are prone to physical interaction with the environment, together with resource limitations, raises serious questions about securing these nodes against adversaries, and traditional security measures are not enough to overcome these weaknesses. To address the special security needs of tiny sensor nodes and of sensor networks as a whole, we introduce a security model. In our model we emphasize three areas: (1) cluster formation, (2) a secure key management scheme, and (3) a secure routing algorithm. Our security analysis shows that the model presented in this paper meets the unique security needs of sensor networks.
Index Terms: Wireless sensor network security, secure key management.


Advancements in micro electro mechanical systems (MEMS) and wireless networks have made possible the advent of tiny sensor nodes, called smart dust, which are low-cost, small devices with limited coverage, low power, small memory and low bandwidth. Wireless sensor networks consist of large numbers of sensor nodes and are becoming a viable solution to many challenging domestic, commercial and military applications. Sensor networks collect and disseminate data from fields where ordinary networks are unreachable for various environmental and strategic reasons. In addition to common network threats, sensor networks are more vulnerable to security breaches because they are physically accessible to possible adversaries; consider sensitive sensor network applications in military settings and hospitals compromised by adversaries. Many developments have been made in introducing countermeasures to potential threats in sensor networks; however, sensor network security remains a less-addressed area. In this paper we present a security framework for wireless sensor networks to provide desired security countermeasures against possible attacks. Our security framework consists of three interacting phases: cluster formation, secure key management and secure routing schemes. We make three contributions in this paper: A) We discuss cluster formation and leader election in a multi-hop hierarchical cluster model B) We present a secure key management scheme C) We propose a secure routing mechanism which addresses potential threats in node to cluster leader and cluster leader to base station (and vice versa) communication. The rest of the paper is organized as follows. Section

II provides a summary of related work on key management and routing protocols in wireless sensor networks. Section III presents our security framework, discussing the cluster formation and leader election process, the secure key management scheme, secure routing and their algorithms. Section IV provides an analysis of our security model, and finally in Section V we conclude the paper, providing future research directions. LXXV. RELATED WORK Researchers have addressed many areas of sensor network security; some of the related work is summarized in the following paragraphs. Eschenauer et al. [1] present a probabilistic key pre-distribution scheme in which each sensor node receives a random subset of keys from a large key pool before deployment. To agree on a key for communication, two nodes find one common key within their subsets and use that key as their shared key. Chan et al. [2] extended the idea of Eschenauer et al. [1] and developed three key pre-distribution schemes: the q-composite, multipath reinforcement, and random-pairwise keys schemes. Pietro et al. [3] present a random key assignment probabilistic model and two protocols, direct and cooperative, to establish pairwise communication between sensors by assigning a small set of random keys to each sensor. This idea later converges to pseudo-random generation of keys, which is energy efficient compared with previous key management schemes. Liu et al. [4] propose pairwise key schemes based on polynomial pool-based and grid-based key pre-distribution, which have high resilience against node capture with reduced communication overhead. The pairwise key pre-distribution of Du et al. [5] is an effort to improve the resilience of the network by lowering the initial payoff of smaller-scale network attacks, pushing the adversary to attack at a bigger scale to compromise the network. Du et al. [6] present a key scheme based on deployment knowledge.
This key management scheme takes advantage of deployment knowledge where sensor positions are known prior to deployment. Because of the randomness of deployment it is not feasible to know the exact neighbor locations, but knowing the set of likely neighbors is realistic; this issue is addressed using the random key pre-distribution of Eschenauer et al. Adrian et al. [7] introduced SPINS (Security Protocols for Sensor Networks). SPINS is a collection of security protocols: SNEP and microTESLA. SNEP (Secure Network Encryption Protocol) provides data confidentiality and two-way data authentication with minimum overhead. MicroTESLA, a micro version of TESLA (Timed Efficient Streamed Loss-tolerant Authentication), provides authenticated streaming broadcast. SPINS leaves some questions open, such as the security of compromised nodes, DoS issues, and network traffic analysis. Furthermore, this protocol assumes a static network topology, ignoring the ad hoc and mobile nature of sensor nodes. Chen et al. [8] proposed two security protocols. First, base station to mote confidentiality and authentication, which states that an efficient shared-key algorithm like RC5 be used to guarantee the authenticity and privacy of information. Second, source authentication, implemented with a hash chain function similar to that used by TESLA to achieve mote authentication. Jeffery et al. [9] proposed a lightweight security protocol that operates at the base station of sensor communication, where the base station can detect and remove an aberrant node if it is compromised. This protocol does not specify any security measures against passive attacks in which an adversary intercepts a node's communication. LXXVI. THE SECURITY MODEL Our security model consists of three interacting phases: cluster formation, secure key management and secure routing.
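To make the pre-distribution idea of Eschenauer et al. [1] concrete, the sketch below draws a random key ring for each node from a shared pool and performs shared-key discovery between two neighbors. The pool and ring sizes are illustrative values, not taken from [1], and the integer "keys" stand in for real symmetric keys:

```python
import random

POOL_SIZE = 1000   # illustrative key-pool size
RING_SIZE = 75     # keys pre-loaded into each node before deployment

key_pool = list(range(POOL_SIZE))   # stand-ins for real symmetric keys

def key_ring() -> set:
    """Random subset of the pool installed on a node before deployment."""
    return set(random.sample(key_pool, RING_SIZE))

def shared_key(ring_a: set, ring_b: set):
    """Two neighbors agree on any key common to both rings, if one exists."""
    common = ring_a & ring_b
    return min(common) if common else None
```

With a suitably sized pool and ring, the probability that two random rings overlap is high enough that the whole graph of nodes is connected by shared keys with overwhelming probability, which is the core observation of the scheme.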
A) Cluster formation As soon as sensor nodes are deployed, each broadcasts its ID, listens to its neighbors, adds the neighbor IDs to its routing table and counts the number of neighbors it can hear. These connected neighbors become a cluster. Each cluster elects a sensor node as its leader, and all inter-cluster communication is routed through cluster leaders. Cluster leaders also serve as fusion nodes to aggregate packets and send them to the base station. The cluster leader receives the highest number of messages, so this role changes after reaching an energy threshold, giving all nodes the opportunity to become a cluster leader as nodes move around in a dynamic environment. The coverage of a cluster depends on the signal strength of the cluster leader. A cluster leader and its neighbor nodes form a parent-child relationship in a tree-based network topology. In this multi-hop cluster model, data is collected by

the sensor nodes, aggregated by the cluster leaders, and forwarded to the next level of cluster, eventually reaching the base station. Figure 1 below shows a network of 210 sensor nodes forming 10 clusters.

Fig 1: Cluster formation

B) Secure key management scheme Key management is critical to meet the security goals of confidentiality, integrity and authentication and to prevent the sensor network being compromised by an adversary. Due to the ad-hoc nature and resource limitations of sensor networks, providing the right key management is challenging. Traditional key management schemes based on trusted third parties like a certification authority (CA) are impractical because the topology is unknown prior to deployment; a trusted CA is required to be present at all times to support public key revocation and renewal [10]. Relying on a single CA for key management is also more vulnerable: a compromised CA will risk the security of the entire sensor network. Fei et al. [10] decompose the key management problem into: key pre-distribution (installation of keys in each sensor node prior to deployment); neighbor discovery (discovering neighbor nodes based on shared keys); end-to-end path key establishment (end-to-end communication with nodes that are not directly connected); isolating aberrant nodes (identifying and isolating damaged nodes); re-keying (replacement of expired keys); and key-establishment latency (reducing the latency resulting from communication and power consumption). The core problem we recognize in wireless sensor network security is to initialize secure communication between sensor nodes by setting up secret keys between communicating nodes; in general we call this key establishment. There are three types of key establishment techniques [5, 6]: the trusted-server scheme, the self-enforcing scheme, and the key pre-distribution scheme. The trusted-server scheme depends on a trusted server, e.g., Kerberos [11]. Since there is no trusted infrastructure in sensor networks, the trusted-server scheme is not suitable in this case. The self-enforcing scheme depends on asymmetric cryptography using public keys. However, limited computation resources in sensor nodes make this scheme less desirable: public key algorithms such as Diffie-Hellman [12] and RSA [13], as pointed out in [6, 7], require high computation resources which tiny sensors do not provide. The key pre-distribution scheme, where key information is embedded in sensor nodes before the nodes are deployed, is the more desirable solution for resource-starved sensor nodes. A simple solution is to store a master secret key in all the nodes and obtain new pairwise keys from it; in this case the capture of one node will compromise the whole network, and storing the master key in tamper-resistant sensor nodes increases the cost and energy consumption of the sensors. Another key pre-distribution scheme [5] is to let each sensor carry N-1 secret pairwise keys, each of which is known only to this sensor and one of the other N-1 sensors (N is the total number of sensors). Extending the network makes this technique impractical, as existing nodes will not have the new nodes' keys. In our security framework we introduce a secure hierarchical key management scheme with three keys, two pre-deployed in all nodes and one in-network generated cluster key, to address the hierarchical nature of the sensor network.
Kn (network key): generated by the base station, pre-deployed in each sensor node, and shared by the entire sensor network. Nodes use this key to encrypt the data and pass it on to the next hop.
Ks (sensor key): generated by the base station, pre-deployed in each sensor node, and shared by the entire sensor network. The base station uses this key to decrypt and process the data, and a cluster leader uses this key to decrypt data and send it to the base station.
Kc (cluster key): generated by the cluster leader and shared by the nodes in that particular cluster. Nodes of a cluster use this key to decrypt data and forward it to the cluster leader.
By providing this key management scheme we make our security model resilient against possible attacks on the sensor network. In this key management scheme the base station uses Kn to encrypt and broadcast data. When a sensor node receives the message, it decrypts it using its Ks. In this key calculation, the base station uses Kn1..nn to broadcast the message. The process is as follows: the base station encrypts its own ID, a current time stamp TS and its Kn as a private key; the base station generates a random seed S and assumes itself at level 0; a sensor node decrypts the message received from the base station using Ks. When a node sends a message to the cluster leader, it constructs the message as follows: {ID, Ks, TS, MAC, S (message)}. The cluster leader checks the ID in the packet; if the ID in the packet matches the ID it holds, it verifies the authentication and integrity of the packet through the MAC, otherwise the packet is dropped by the cluster leader. The cluster leader aggregates the messages received from its nodes and forwards them to the next-level cluster leader or, if the cluster leader is one hop from the base station, sends them directly to the base station. A receiving cluster leader checks its routing table and constructs the packet to be sent to the next-level cluster leader or base station: it adds its own ID and its network and cluster keys to the incoming packet and rebuilds the packet as {ID, Kn, Kc, [ID, Ks, TS, MAC, S (Aggr message)]}. Here ID is the ID of the receiving cluster leader which aggregates and wraps the message and sends it to the next-hop cluster leader, or to the base station if directly connected. The next-hop cluster leader receives the packet and checks the ID; if the ID embedded in the packet is the same as the one it holds, it updates the ID for the next hop and broadcasts it, else the packet is discarded. The base station receives the packet from its directly connected cluster leader; it checks the ID of the sending cluster leader and verifies the authentication and integrity of the packet through the MAC. The cluster leader directly connected to the base station adds its own ID to the packet received from the sending cluster leader.
The packet contains the following fields: {ID[ID, Kn, Kc, [ID, Ks, TS, MAC, S (Aggr message)]]}
C) Secure Routing In our secure routing mechanism, all the nodes have a unique ID#. Once the network is deployed, the base station builds a table containing the ID#s of all the nodes in the network. After the self-organizing process the base station knows the topology of the network. Using our secure key management scheme, nodes collect the data and pass it to the cluster leader, which aggregates the data and sends it to the base station. We adapt the energy-efficient secure data transmission algorithms of [15] and modify them with our secure key management scheme to make them more resilient against attacks in wireless sensor networks. Two algorithms, the sensor node algorithm and the base station algorithm, are presented for secure data transfer from node to base station and from base station to node:
The node algorithm performs the following functions: A) Sensor nodes use Kn to encrypt and transmit the data B) Transmission of encrypted data from nodes to the cluster leader C) Appending the ID# to the data and then forwarding it to higher-level cluster leaders D) The cluster leader uses Kc to decrypt and then uses its Kn to encrypt and send the data to the next level of cluster leaders, eventually reaching the base station
The base station algorithm is responsible for the following tasks: A) Broadcasting of Ks and Kn by the base station B) Decryption and authentication of data by the base station
Node algorithm
Step 1: If sensor node i wants to send data to its cluster leader, go to step 2, else exit the algorithm.
Step 2: Sensor node i requests the cluster leader to send Kc.
Step 3: Sensor node i uses Kc and its own Kn to compute the encryption key Ki,cn.
Step 4: Sensor node i encrypts the data with Ki,cn, appends its ID# and the TS to the encrypted data, and then sends them to the cluster leader.
Step 5: The cluster leader receives the data, appends its own ID#, and then sends them to the higher-level cluster leader, or to the base station if directly connected. Go to Step 1.
Base Station Algorithm
Step 1: Check if there is any need to broadcast a message. If so, broadcast the message, encrypting it with Kn.
Step 2: If there is no need to broadcast a message, check if there is any incoming message from the cluster leaders. If there is no data being sent to the base station, go to step 1.
Step 3: If there is data coming to the base station, decrypt the data using Ks, the ID# of the node and the TS within the data.
Step 4: Check whether the decryption key Ks has decrypted the data correctly; this includes checking the credibility of the TS and the ID#. If the decrypted data is not correct, discard the data and go to step 6.

Step 5: Process the decrypted data and obtain the message sent by the sensor nodes.
Step 6: Decide whether to request the sensor nodes to retransmit the data. If not necessary, go back to step 1.
Step 7: If a request is necessary, send the request to the sensor nodes to retransmit the data. When this session is finished, go back to step 1.
This routing technique provides stronger resilience towards spoofed routing information, selective forwarding, sinkhole attacks, Sybil attacks, wormholes and HELLO flood attacks as presented in [16].
LXXVII. ANALYSIS OF PROPOSED SECURITY MODEL This section presents an analysis of the features of our security model that make it feasible to implement. In our security model the packet format in a typical node to cluster leader communication would be as under: {ID, Ks, TS, MAC, S (message)}. This gives us 44 bytes of data packet to transmit. Taking into account the 128K program memory of the ATmega128L-based MICA2DOT, our model can best be implemented in a network of up to 3000 sensor nodes. Going beyond this number we might need a tradeoff between security and performance, which is unlikely to arise because most applications so far do not deploy sensor nodes in such large quantities. Assuming the ongoing developments in enhancing program memory, this framework will be feasible in even larger and denser networks. The algorithms presented in this model take into consideration the nodes and cluster leaders which are not participating in sending and aggregating data; these nodes forward the data packets without applying any further cryptographic operation, thus saving processing power and memory.
LXXVIII. CONCLUSION In this paper we have presented a security framework for wireless sensor networks which is composed of three phases: cluster formation, secure key management and secure routing. The cluster formation process has described the topology formation and self-organization of sensor nodes, leader election and route selection towards the base station. We have presented a hierarchical secure key management scheme based on three levels of pre-deployed keys, and lastly we have presented a secure routing mechanism which provides stronger resilience towards susceptible attacks on sensor networks. We plan to implement this security framework on Berkeley's motes, having confidence that this framework will provide added security in wireless sensor network communication.
REFERENCES
[1] L. Eschenauer and V. Gligor, A Key-Management Scheme for Distributed Sensor Networks, Proceedings of the 9th ACM Conference on Computer and Communication Security, 2002, Washington DC, USA.
[2] P. Ganesan, R. Venugopalan, P. Peddabachagari, A. Dean, F. Mueller, and M. Sichitiu, Analyzing and Modeling Encryption Overhead for Sensor Network Nodes, WSNA'03, September 19, 2003, San Diego, California, USA.
[3] R. Pietro, L. Mancini, and A. Mei, Random Key-Assignment for Secure Wireless Sensor Networks, ACM SASN, 2003.
[4] D. Liu and P. Ning, Establishing Pairwise Keys in Distributed Sensor Networks, ACM CCS, 2003.
[5] W. Du, J. Deng, Y. S. Han, and P. K. Varshney, A Pairwise Key Pre-Distribution Scheme for Wireless Sensor Networks.
[6] W. Du, J. Deng, Y. S. Han, S. Chen, and P. K. Varshney, A Key Management Scheme for Wireless Sensor Networks Using Deployment Knowledge, IEEE InfoCom, 2004.
[7] A. Perrig, R. Szewczyk, V. Wen, D. Culler, and J. D. Tygar, SPINS: Security Protocols for Sensor Networks, Wireless Networks Journal (WINET), September 2002.
[8] H. Chan, A. Perrig, and D. Song, Random Key Predistribution Schemes for Sensor Networks, Proceedings of the IEEE Symposium on Security and Privacy, Oakland, California, USA.
[9] J. Undercoffer, S. Avancha, A. Joshi, and J. Pinkston, Security for Sensor Networks, 2002 CADIP Research Symposium.
[10] F. Hu, J. Ziobro, J. Tillett, and N. Sharma, Wireless Sensor Networks: Problems and Solutions, Rochester Institute of Technology, Rochester, New York, USA.
[11] B. C. Neuman and T. Ts'o, Kerberos: An Authentication Service for Computer Networks, IEEE Communications, 32(9):33-38, 1994.
[12] W. Diffie and M. E. Hellman, New Directions in Cryptography, IEEE Transactions on Information Theory, 22:644-654, 1976.
[13] R. L. Rivest, A. Shamir, and L. M. Adleman, A Method for Obtaining Digital Signatures and Public Key Cryptosystems, Communications of the ACM, 21(2):120-126, 1978.
[14] T. Li, H. Wu and F. Bao, SenSec Design, Institute for Infocomm Research, Singapore, 2004.
[15] H. Cam, S. Ozdemir, D. Muthuavinashiappan, and P. Nair, Energy Efficient Security Protocol for Wireless Sensor Networks, IEEE, 2003.
[16] C. Karlof and D. Wagner, Secure Routing in Wireless Sensor Networks: Attacks and Countermeasures, University of California at Berkeley, USA, 2003.
[17] P. K. Goel and V. K. Sharma, Wireless Sensor Network: Security Model, International Journal of Science Technology and Management, IJSTM Vol. 2, Issue 2, ISSN: 2229-6646 (online), pp 100-107.



Vertical Perimeter Based Enhancement of Streaming Application

Abstract: The explosion of the Web and the increase in processing power have led to a large number of short-lived connections, making connection setup time equally important. With FireEngine, the networking stack went through one more transition in which the core pieces (i.e., the socket layer, TCP, UDP, IP, and the device driver) use an IP classifier and serialization queues to improve connection setup time, scalability, and packet processing cost. The FireEngine approach is to merge all protocol layers into one fully multithreaded STREAMS module. Inside the merged module, instead of using per-data-structure locks, a per-CPU synchronization mechanism called the vertical perimeter is used. The vertical perimeter is implemented using a serialization queue abstraction called squeue.

system does not provide related system calls. Although there are some existing mechanisms, like procfs, to access system information and system calls, like sched_setaffinity (), to dispatch threads to certain processors, dispatch wraps these into a complete API. It not only provides generic interface for future extension and high portability without kernel modification, but also performs much better than procfs. A. Fire Engine The Fire Engine [10] networking stack for the Solaris Operating System (OS) is currently under development by Sun. Enhanced network performance and a flexible architecture to meet future customer networking needs are twin goals of Fire Engine development. Addressing existing requirements, including increased performance and scalability, Disaster Recovery (DR), Secure Internet Protocol (IPSec), and IP Multiprocessing (IPMP), as well as future requirementssuch as 10-gigabits per second (Gbps) networking, 100-Gbps networking, and TCP/IP Offload Engine (TOE)are given equal priority. Implemented in three phases, Fire Engines development stages are structured to provide increased flexibility and a significant performance boost to overall network throughput. Phase 1 has already been completed and these goals have been realized in services using TCP/IP. Web-based benchmarks show a 30- to 45-percent improvement on both SPARC and Intel x86 architectures, while bulk data transfer benchmarks show improvements in the range of 20 to 40 percent. Phases 2 and 3 should deliver similar overall performance improvements. With increased flexibility and performance boosts of this magnitude, FireEngine is well on its way to reinforcing Suns Solaris OS as the commercial standard for networking infrastructure. B. Performance Barriers


threading, scheduling, dispatching, STREAM, Multicore, Fire Engine, Vertical Perimeter, CPU scheduling, task queue, IP Multithreading.

I. INTRODUCTION In recent years, since the clock rate of single processor cannot be increased without overheating, to increase performance, manufacturers develop multicore systems instead. In order to run the applications on multicore systems, there are many parallel algorithms developed, e.g. parallel H.264 applications. By exploiting parallelism, multicore systems compute more effectively. Multiple threads and processes are common useful approaches to speed up user a task with one thread in some operating systems, e.g. Linux. In our observation, threads sometimes are not dispatched reasonably on processors. We redefine these anomalies formally from multiprocessing timing anomalies and focus on thread manipulation on multicore systems. For example, even if some cores are idle, the operating system does not dispatch any thread to the idle ones. Furthermore, even if users find out this situation, they still cannot directly dispatch these threads accordingly if the operation


The existing TCP/IP stack uses STREAMS perimeters and kernel adaptive mutexes for multithreading, and the current STREAMS framework provides only per-module, per-protocol-stack-layer, or horizontal perimeters. This can, and often does, lead to a packet being processed on more than one CPU and by more than one thread, causing excessive context switching and poor CPU data locality.

C. Network Performance

FireEngine introduces a new, highly scalable packet-classification architecture called Firehose. Each incoming packet is classified early on, then proceeds through an optimized list of functions (the Event List) that makes it easy to add protocols without impacting the network stack's complexity, performance, or scalability. FireEngine concentrates on improving the performance of key server workloads that have a significant networking component, on the impact of network performance on these workloads, and on benchmarks that describe overall workload performance.

D. Performance metrics

Applications often use networking in two distinct ways: to perform transactions over the network, or to stream data over the network. Transactions are short-lived connections transferring a small amount of application data, while streaming is a transfer of large amounts of data over long-lived connections. In the transaction case, performance is determined by a combination of the time it takes to get the first byte (first-byte latency), connection set-up/tear-down, and network throughput (bits per second, or bps). In the streaming case, performance is dominated by overall network throughput. These parameters affect performance in different proportions depending on the amount of data transferred. For instance, when transferring one byte of data, only first-byte latency and connection set-up/tear-down count; when transferring very large amounts of data, only network throughput is relevant.

Finally, there is the ability to sustain performance as the number of active simultaneous connections increases; this is often a requirement for Web servers. A networking stack must also take into account the host system's hardware characteristics. For low-end systems, it is important to make efficient use of the available hardware resources, such as memory and CPU. For higher-end systems, the stack must take into account the high variability in memory access time, as well as system resources that offload some functions to specialized hardware.
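The relative weight of these parameters can be seen from a simple back-of-the-envelope model. The constants below are illustrative assumptions, not measured Solaris figures:

```python
# Illustrative model of network transfer time: setup/teardown and
# first-byte latency dominate small transfers; throughput dominates
# large ones. All constants are assumed example values.

def transfer_time(payload_bytes, setup_s=0.001, first_byte_s=0.0005,
                  throughput_bps=1e9):
    """Total time = connection setup/teardown + first-byte latency
    + payload wire time (payload / throughput)."""
    return setup_s + first_byte_s + (payload_bytes * 8) / throughput_bps

tiny = transfer_time(1)          # dominated by setup + first-byte latency
bulk = transfer_time(10**9)      # dominated by throughput (8 s of wire time)

print(f"1 B transfer:  {tiny:.4f} s")
print(f"1 GB transfer: {bulk:.4f} s")
```

For the one-byte case the wire time (8 ns at 1 Gbps) is negligible next to the 1.5 ms of setup and latency, while for the gigabyte case the fixed costs all but vanish, matching the observation in the text.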

FireEngine focuses on these network performance metrics [10]: network throughput; connection set-up/tear-down; first-byte latency; connection and CPU scalability; and efficient resource usage.

E. Vertical Perimeters

The Solaris 10 FireEngine project introduces the abstraction of a vertical perimeter, which is composed of a new kernel data structure, the squeue_t (serialization queue type), and a worker thread owned by the squeue_t, which is bound to a CPU. Vertical perimeters, or squeues, by themselves provide packet serialization and mutual exclusion for the data structures. FireEngine uses a per-CPU perimeter, with a single instance per connection. For each CPU instance the packet is queued for processing, and a pointer to the connection structure is stored inside the packet. The thread entering the squeue may either process the packet immediately or queue it for later processing; the choice depends on the squeue's entry point and its state. Immediate processing is possible only when no other thread has entered the same squeue. A connection instance is assigned to a single squeue_t, so it is processed only within the vertical perimeter. As a squeue_t is processed by a single thread at a time, all data structures used to process a given connection from within the perimeter can be accessed without additional locking. This improves both CPU and thread-context data locality of access for the connection metadata, packet metadata, and packet payload data, improving overall network performance. This approach also allows the removal of per-device-driver worker thread schemes, which are often problematic in solving system-wide resource issues, and lets additional strategic algorithms be implemented to best handle a given network interface, based on network interface throughput and system throughput (such as fanning out per-connection packet processing to a group of CPUs).

II. RELATED WORK

Many techniques have been developed to exploit parallelism. OpenMP [5] is a tool with which looped tasks can be partitioned into multiple independent tasks automatically. Affine partitioning [6], [7] is another method that can find the optimal partition, which maximizes parallelism with the minimum synchronization. However, little work has been done on how to allocate resources to threads on multicore systems. André C. Neto and Filippo Sartori [1] proposed MARTe, a framework built over a multiplatform library that allows the execution of the same code in different operating systems; its drawback is latency. François Trahay, Élisabeth Brunet, and Alexandre Denis [2] presented thread safety during processing through a locking mechanism; its drawback is deadlock. Fengguang Song and Shirley Moore [3] proposed an analytical model to estimate the cost of running an affinity-based thread schedule on multicore systems. Tang-Hsun Tu and Chih-Wen Hsueh [4] proposed decomposing threads with a User Dispatching Mechanism (UDispatch) that provides controllability in user space to improve application performance. Ana Sonia Leon [8] proposed a Chip Multi-Threading (CMT) architecture that maximizes overall throughput performance for commercial workloads; its drawback is low performance. Sunay Tripathi, Nicolas Droux, and Thirumalai Srinivasan presented a new architecture that addresses Quality of Service (QoS) by creating unique flows for applications and services.

III. FIRE ENGINE ARCHITECTURE

The Solaris FireEngine networking performance improvement project adheres to these design principles [10]:
Data locality: ensures that a connection is always processed by the same CPU whenever possible.
CPU modelling: efficient use of available CPUs and the interrupt/worker-thread model; allows the use of multiple CPUs for protocol processing.
Code path locality: improves the performance and efficiency of TCP/IP interactions.
TCP/IP interaction: switches from a message-passing-based interface to a function-call-based interface.

Because of the large number and dependent nature of the changes required to achieve the FireEngine goals, the development program is split into three phases:
Solaris 10 FireEngine phase 1 [10]: fundamental infrastructure implemented and a large performance boost realized.
Application and STREAMS module developers see no changes other than better performance and scalability. Solaris 10U and SX FireEngine phase 2 [10]: feature scalability, offloading, and the new Event List framework implemented.

1. Solaris 10 Fire Engine phase 1 architecture

A. IP Classifier-Based Fan-Out

When Solaris IP receives a packet from a NIC, it classifies the packet and determines the connection structure and vertical perimeter instance that will process that packet. New incoming connections are assigned to the vertical perimeter instance attached to the interrupted CPU, or, to avoid saturating an individual CPU, a fan-out across all CPUs is performed. A NIC always sends a packet to IP in interrupt context, so IP can choose between interrupt and non-interrupt processing, avoiding CPU saturation by a fast NIC. There are multiple advantages to this approach: the NIC does minimal work, and complexity is hidden from independent NIC manufacturers; and IP can decide whether the packet needs to be processed on the interrupted CPU or fanned out across all CPUs. Processing a packet on the interrupted CPU in interrupt context saves a context switch compared to queuing the packet and letting a worker thread process it. IP can also control the amount of work done by the interrupt without incurring extra cost. At low loads, processing is done in interrupt context; at higher loads, IP dynamically switches between interrupts and polling while employing interrupt and worker threads for the most efficient processing. In the case of a single high-bandwidth NIC (such as 10 Gbps), IP also fans connections out to multiple CPUs. If multiple CPUs are applied, the connection is bound to one of the available CPUs servicing the NIC. Worker threads, their management, and special fan-out schemes can be coupled to the vertical perimeter with little code complexity. Since these functions reside in IP, this architecture benefits all NICs.
The DR issues arising from binding a worker thread to a CPU can be effectively handled in IP.
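The serialization property of the vertical perimeter can be sketched schematically. The class below is an illustrative Python model, not the Solaris kernel API: each connection is bound to one squeue, whose single worker thread drains packets in order, so per-connection state needs no extra locking (the real squeue may also process a packet inline when no other thread is inside the perimeter):

```python
import queue
import threading

class Squeue:
    """Schematic model of a serialization queue: one worker thread
    (conceptually bound to one CPU) drains packets in order, so a given
    connection's data structures are only ever touched by one thread."""
    def __init__(self):
        self.q = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def enqueue(self, conn, packet):
        # A pointer to the connection structure travels with the packet.
        self.q.put((conn, packet))

    def _drain(self):
        while True:
            conn, packet = self.q.get()
            # Serialized processing: no additional locking needed here.
            conn['bytes'] += len(packet)
            self.q.task_done()

# Each connection is assigned to exactly one squeue for its lifetime,
# e.g. chosen by a classifier-style fan-out over per-CPU squeues.
squeues = [Squeue() for _ in range(4)]
conn = {'id': 1, 'bytes': 0}
sq = squeues[conn['id'] % len(squeues)]
for pkt in (b'abc', b'defg'):
    sq.enqueue(conn, pkt)
sq.q.join()                 # wait until both packets are processed
print(conn['bytes'])        # 7
```

Because only the owning worker mutates `conn`, the increment needs no mutex, which is the data-locality and lock-avoidance argument made above.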

Figure 2: Packets flowing in TCP through the vertical perimeter
tcp_input - all inbound data packets and control messages
tcp_output - all outbound data packets and control messages
tcp_close_output - on user close
tcp_timewait_output - timewait expiry
tcp_rsrv_input - flow control relief on the read side
tcp_timer - all TCP timers

IV. ANALYSIS

1. Scheduling algorithm

Good cluster schedulers attempt to minimize job wait time while maximizing cluster utilization. The maximization of utilization and the minimization of wait time are subject to the policy set by the scheduler administrator. Types of scheduling [11]:
Long-term scheduling: the decision to add to the pool of processes to be executed.
Mid-term scheduling: the decision to add to the number of processes that are partially or fully in memory.
Short-term scheduling: the decision as to which available process will be executed.
I/O scheduling: the decision as to which process's pending request shall be handled by an available I/O device.

Figure 3: Scheduling and process state transition

A. Fair share scheduling

The FSS scheduler has two levels of scheduling: process and user. Process-level scheduling is the same as in standard UNIX (priority and nice values act as a bias to the scheduler as it repositions processes in the run queue). The user-level scheduling relationship can be expressed in simplified pseudo-code. Whereas process-level scheduling still occurs 100 times a second, user-level scheduling adjustments (the usage parameter) occur once every 4 seconds. Also, once a second, the process-level priority adjustments made in the previous second begin to be forgotten; this avoids starving a process. FSS is all about making scheduling decisions based on process sets rather than on the basis of individual processes.
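The decayed usage bookkeeping behind fair-share scheduling can be sketched as follows. This is an assumed illustration of the decay weighting described in this section, not actual scheduler code; the function and parameter names are ours:

```python
def fairshare_utilization(window_usage, fs_decay=0.5):
    """Decay-weighted fair-share utilization: the most recent window
    counts fully, and each older window's contribution is multiplied by
    a further factor of fs_decay. window_usage[0] is the most recent
    window; len(window_usage) plays the role of a fair-share depth."""
    return sum(u * (fs_decay ** age) for age, u in enumerate(window_usage))

# A user whose heavy usage is recent outweighs one whose heavy usage is
# old, so recent hogs are penalized more when priorities are recomputed.
recent_heavy = fairshare_utilization([10.0, 2.0, 1.0])   # 10 + 1 + 0.25 = 11.25
old_heavy    = fairshare_utilization([1.0, 2.0, 10.0])   # 1 + 1 + 2.5  = 4.5
print(recent_heavy, old_heavy)
```

The same total CPU time (13 units) yields very different utilization figures depending on when it was consumed, which is exactly the "forgetting" behaviour described above.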

Figure 4: Scheduling process
fs_interval: duration of each fair-share window.
fs_depth: number of fair-share windows factored into the current fair-share utilization calculation.
fs_decay: decay factor applied when weighting the contribution of each past fair-share window.

V. RESULTS

Comparison of single queue and multiple queues:

Here we examine how execution time is lower when using multiple queues than a single queue, comparing against packet count and packet length in bytes. Execution times were measured in nanoseconds but are converted to seconds for easy reference.

Execution time (s) for 300-byte packets:

Packet count    Single queue    Multiple queues
10              0.4             0.15
100             0.9             0.34
500             5               2
1000            10              4
Execution time (s) for 600-byte packets:

Packet count    Single queue    Multiple queues
10              0.3             0.1
100             0.9             0.4
500             5               2
1000            9               3

VI. CONCLUSION

In this paper, we presented the vertical perimeter for enhancing streaming applications. In a multicore environment, using multiple threads is a common, useful approach to improve application performance. Nevertheless, even in many simple applications, performance may degrade as the number of threads increases; in our observation, the more significant effect is the dispatching of threads. In Solaris 10, FireEngine provides the concept of the vertical perimeter. By using the vertical perimeter we can improve streaming applications, and we concentrated mainly on queue management, core assignment, thread allocation, and time profiling. For the allocation of processes we need a scheduling algorithm; here we used fair share scheduling (FSS), which makes the scheduling process easier and more efficient.

VII. REFERENCES

[1] André C. Neto, Filippo Sartori, "MARTe: A Multiplatform Real-Time Framework," IEEE Transactions on Nuclear Science, vol. 57, no. 2, April 2010.
[2] François Trahay, Élisabeth Brunet, "An Analysis of the Impact of Multi-Threading on Communication Performance," IEEE, 2009.
[3] Fengguang Song, Shirley Moore, "Analytical Modeling and Optimization for Affinity Based Thread Scheduling on Multicore Systems," IEEE, 2009.
[4] Tang-Hsun Tu, Chih-Wen Hsueh, Rong-Guey Chang, "A Portable and Efficient User Dispatching Mechanism for Multicore Systems," IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pp. 427-436, 2009.
[5] L. Dagum and R. Menon, "OpenMP: An Industry-Standard API for Shared-Memory Programming," IEEE Computational Science & Engineering, vol. 5, no. 1, pp. 46-55, Jan. 1998.
[6] W. Lim, G. I. Cheong, and M. S. Lam, "An Affine Partitioning Algorithm to Maximize Parallelism and Minimize Communication," Proceedings of the 13th International Conference on Supercomputing, pp. 228-237, 1999.
[7] W. Lim and M. S. Lam, "Maximizing Parallelism and Minimizing Synchronization with Affine Transforms," Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 201-214, 1997.
[8] Ana Sonia Leon, "A Power-Efficient High-Throughput 32-Thread SPARC Processor," IEEE Journal of Solid-State Circuits, vol. 42, no. 1, Jan. 2007.
[9] W.-Y. Cai and H.-B. Yang, "Cross-Layer QoS Optimization Design for Wireless Sensor Networks," Wireless, Mobile and Sensor Networks, 2007.
[10] Tong Li, Alvin R. Lebeck, "Spin Detection Hardware for Improved Management of Multithreaded Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 6, June 2006.
[11] Sunay Tripathi, "FireEngine - A New Networking Architecture for the Solaris Operating System," Nov. 2004.
[12] Prof. Navneet Goyal, Department of Computer Science & Information Systems, BITS Pilani, operating systems lecture notes.

Orthogonal Frequency Division Multiplexing for Wireless Communications

Abstract: Orthogonal Frequency Division Multiplexing (OFDM) is a modulation technique whose increased symbol duration makes it robust against Inter-Symbol Interference (ISI). OFDM splits a high-rate data stream into a number of lower-rate streams that are transmitted simultaneously over a number of subcarriers. The advanced transmission techniques of OFDM, applied in wireless LANs and in digital audio and video broadcasting, and CDMA, the foundation of 3G mobile communications, have been part of almost every communication system. In this paper we study the OFDM transmission and reception scheme, its working, advantages, disadvantages, and applications.

Keywords: OFDM, ADSL, DVB-T

I INTRODUCTION

Orthogonal frequency division multiplexing (OFDM) is widely known as a promising communication technique in current broadband wireless mobile communication systems due to its high spectral efficiency and robustness to multipath interference. Currently, OFDM has been adopted in the digital audio and video broadcasting (DAB/DVB) systems, high-speed wireless local area networks (WLANs) such as IEEE 802.11x, HIPERLAN II and multimedia mobile access communications (MMAC), ADSL, the digital multimedia broadcasting (DMB) system, and the multi-band OFDM ultra-wideband (MB-OFDM UWB) system, among other multi-carrier OFDM systems. Besides being the basis for many high-data-rate wireless standards, the main advantages of OFDM are its high spectral efficiency and its ability to use the multipath channel to its advantage.

II PRINCIPLES OF OFDM

The basic principle of OFDM is to split a high-rate data stream into a number of lower-rate streams that are transmitted simultaneously over a number of subcarriers. In OFDM, a rectangular pulse is used as the subcarrier for transmission. This facilitates pulse forming and modulation, which can be implemented efficiently with a simple inverse discrete Fourier transform (IDFT), in practice an inverse fast Fourier transform (IFFT). To reverse this operation at the receiver, an FFT (fast Fourier transform) is needed. According to the theorems of the Fourier transform, the rectangular pulse shape leads to a sin(x)/x type of spectrum for the subcarriers (Fig. 1). The spectra of the subcarriers are not separated but overlap; the information transmitted over the carriers can nevertheless be separated because of the orthogonality relation. By using an IFFT for modulation, the spacing of the subcarriers is chosen such that, at the frequency where a received subcarrier is evaluated (indicated by arrows), all other signals are zero. For this orthogonality, the receiver and transmitter must be perfectly synchronized, i.e. both must assume the same modulation frequency and the same time scale for transmission. OFDM is a block transmission technique. In the baseband, complex-valued data symbols modulate a large number of tightly grouped carrier waveforms. The transmitted OFDM signal multiplexes several low-rate data streams; each data stream is associated with a given subcarrier. The main advantage of this concept in a radio environment is that each of the data streams experiences an almost flat fading channel. In slowly fading channels, the intersymbol interference (ISI) and intercarrier interference (ICI) within an OFDM symbol can be avoided with a small loss of transmission energy using the concept of a cyclic prefix.
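The IFFT/FFT transmit-receive chain described above can be sketched in a few lines. This is an illustrative simulation over an ideal channel, assuming numpy; the parameter values are arbitrary:

```python
import numpy as np

# Minimal OFDM link sketch: QPSK symbols on N subcarriers, IFFT at the
# transmitter, FFT at the receiver, with a cyclic prefix guarding against ISI.
N, CP = 64, 16                       # subcarriers, cyclic-prefix length
rng = np.random.default_rng(0)

bits = rng.integers(0, 2, size=2 * N)
# Map bit pairs to QPSK: bit 0 -> +1, bit 1 -> -1 on each axis.
qpsk = (1 - 2.0 * bits[0::2]) + 1j * (1 - 2.0 * bits[1::2])

tx_time = np.fft.ifft(qpsk)                           # multiplex onto subcarriers
tx_signal = np.concatenate([tx_time[-CP:], tx_time])  # prepend cyclic prefix

# Ideal channel: the receiver strips the prefix and applies a forward FFT.
rx_time = tx_signal[CP:]
rx_freq = np.fft.fft(rx_time)

rx_bits = np.empty(2 * N, dtype=int)
rx_bits[0::2] = (rx_freq.real < 0).astype(int)        # inverse of the mapping
rx_bits[1::2] = (rx_freq.imag < 0).astype(int)

print(np.array_equal(bits, rx_bits))   # True: ideal channel recovers all bits
```

Since the FFT exactly inverts the IFFT, every bit is recovered here; a real channel would add fading and noise between the two transforms.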


Fig 1: OFDM and the orthogonality principle

III SYSTEM DESCRIPTION

1. Transmitter

Fig 2: Transmitter of an OFDM system

An OFDM carrier signal is the sum of a number of orthogonal subcarriers, with baseband data on each subcarrier being independently modulated, commonly using some type of quadrature amplitude modulation (QAM) or phase-shift keying (PSK). This composite baseband signal is typically used to modulate a main RF carrier. The input is a serial stream of binary digits. By inverse multiplexing, these are first demultiplexed into parallel streams, and each one is mapped to a (possibly complex) symbol stream using some modulation constellation (QAM, PSK, etc.). Note that the constellations may be different, so some streams may carry a higher bit rate than others. An inverse FFT is computed on each set of symbols, giving a set of complex time-domain samples. These samples are then quadrature-mixed to passband in the standard way: the real and imaginary components are first converted to the analogue domain using digital-to-analogue converters (DACs); the analogue signals are then used to modulate cosine and sine waves at the carrier frequency, respectively. These signals are then summed to give the transmission signal.

2. Receiver

Fig 3: Receiver of an OFDM system

The receiver picks up the signal, which is then quadrature-mixed down to baseband using cosine and sine waves at the carrier frequency. This also creates signals centered on twice the carrier frequency, so low-pass filters are used to reject these. The baseband signals are then sampled and digitised using analogue-to-digital converters (ADCs), and a forward FFT is used to convert back to the frequency domain. This returns parallel streams, each of which is converted to a binary stream using an appropriate symbol detector. These streams are then recombined into a serial stream, which is an estimate of the original binary stream at the transmitter.

B. Data on OFDM

Fig 4: Data on OFDM

The data to be transmitted on an OFDM signal is spread across the carriers of the signal, each carrier taking part of the payload. This reduces the data rate taken by each carrier. The lower data rate has the advantage that interference from reflections is much less critical. This is achieved by adding a guard time, or guard interval, into the system, which ensures that the data is only sampled when the signal is stable and no new delayed signals arrive that would alter the timing and phase of the signal. The distribution of the data across a large number of carriers in the OFDM signal has some further advantages. Nulls caused by multi-path effects or interference on a given frequency only affect a small number of the carriers, the remaining ones being received correctly. By using error-coding techniques, which does mean adding further data to the transmitted signal, many or all of the corrupted data can be reconstructed within the receiver. This is possible because the error-correction code is transmitted in a different part of the signal. In the OFDM system, orthogonally placed subcarriers carry the data from the transmitter end to the receiver end; the guard interval deals with the problem of ISI, and noise is minimized by the larger number of subcarriers.

IV ADVANTAGES & DISADVANTAGES OF AN OFDM SYSTEM

Advantages:
Due to the increase in symbol duration, there is a reduction in the effect of delay spread.
The addition of a guard interval almost removes the ISI and ICI in the system.
Conversion of the channel into many narrowly spaced orthogonal subcarriers renders it immune to frequency-selective fading.
As is evident from the spectral pattern of an OFDM system, orthogonally placing the subcarriers leads to high spectral efficiency.
OFDM can be efficiently implemented using the IFFT.

Disadvantages:
These systems are highly sensitive to Doppler shifts, which affect the carrier frequency offsets, resulting in ICI.
The presence of a large number of subcarriers with varying amplitude results in a high peak-to-average power ratio (PAPR), which in turn hampers the efficiency of the RF amplifier.

PEAK TO AVERAGE POWER RATIO

The main drawback of OFDM is its high PAPR, which distorts the signal if the transmitter contains nonlinear components such as power amplifiers, and causes deficiencies such as intermodulation, spectral spreading, and changes in the signal constellation. Minimising the PAPR allows a higher average power to be transmitted for a fixed peak power, improving the overall signal-to-noise ratio at the receiver. There are many solutions in the literature to reduce the effect of PAPR in OFDM signals. Some of them are amplitude coding, clipping, partial clipping and filtering, partial transmit sequence (PTS), selected mapping (SLM), and interleaving. These techniques achieve PAPR reduction at the expense of transmit-signal power increase, bit-error-rate (BER) increase, data-rate loss, computational complexity increase, and so on.

AMPLITUDE CLIPPING AND FILTERING

A threshold value of the amplitude is set in this process, and any subcarrier having an amplitude greater than that value is clipped, or that subcarrier is filtered, to bring out a lower PAPR value.

SELECTED MAPPING

In this technique a set of sufficiently different data blocks, representing the same information as the original data block, is generated. The data block with the lowest PAPR value is then selected for transmission.

PARTIAL TRANSMIT SEQUENCE

Transmitting only part of the data of varying subcarriers, which covers all the information to be sent in the signal as a whole, is called the partial transmit sequence technique.

V APPLICATIONS

ADSL: OFDM is used in ADSL connections that follow the G.DMT (ITU G.992.1) standard, in which existing copper wires are used to achieve high-speed data connections. Long copper wires suffer from attenuation at high frequencies. The fact that OFDM can cope with this frequency-selective attenuation and with narrow-band interference is the main reason it is frequently used in applications such as ADSL modems. However, DSL cannot be used on every copper pair; interference may become significant if more than 25% of the phone lines coming into a central office are used for DSL. For experimental amateur radio applications, users have even hooked up commercial off-the-shelf ADSL equipment to radio transceivers, which simply shift the bands used to the radio frequencies the user has licensed.

LAN and MAN: OFDM is extensively used in wireless LAN and MAN applications, including IEEE 802.11a/g/n and WiMAX. IEEE 802.11a/g/n, operating in the 2.4 and 5 GHz bands, specifies per-stream airside data rates ranging from 6 to 54 Mbit/s. If both devices can utilize the "HT mode" added with 802.11n, the top 20 MHz per-stream rate is increased to 72.2 Mbit/s, with the option of data rates between 13.5 and 150 Mbit/s using a 40 MHz channel. Four different modulation schemes are used: BPSK, QPSK, 16-QAM, and 64-QAM, along with a set of error-correcting rates (1/2 to 5/6). The multitude of choices allows the system to adapt the data rate to the current signal conditions.

DVB-T: By directive of the European Commission, all television services transmitted to viewers in the European Community must use a transmission system that has been standardized by a recognized European standardization body, and such a standard has been developed and codified by the DVB Project: Digital Video Broadcasting (DVB); framing structure, channel coding and modulation for digital terrestrial television. Customarily referred to as DVB-T, the standard calls for the exclusive use of COFDM for modulation. DVB-T is now widely used in Europe and elsewhere for terrestrial digital TV.

VI CONCLUSION

In this paper we studied the OFDM system and concluded that orthogonal frequency division multiplexing (OFDM) is a promising technique for broadband wireless communication systems.
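As a numerical illustration of the PAPR problem and the amplitude-clipping remedy discussed in Section IV, the following sketch (assuming numpy; the threshold choice is an arbitrary illustration) measures the PAPR of an OFDM symbol before and after clipping:

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())

rng = np.random.default_rng(1)
N = 256
symbols = rng.choice(np.array([1+1j, 1-1j, -1+1j, -1-1j]), size=N)  # QPSK
ofdm = np.fft.ifft(symbols)     # many subcarriers can add constructively

# Amplitude clipping: limit the envelope to a threshold, keeping the phase.
threshold = 1.5 * np.abs(ofdm).mean()
clipped = np.where(np.abs(ofdm) > threshold,
                   threshold * ofdm / np.maximum(np.abs(ofdm), 1e-12),
                   ofdm)

print(f"PAPR before clipping: {papr_db(ofdm):.1f} dB")
print(f"PAPR after clipping:  {papr_db(clipped):.1f} dB")
```

Clipping lowers the peak and hence the PAPR, but, as noted above, this comes at the cost of in-band distortion and spectral spreading, which the subsequent filtering step tries to control.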


A Comprehensive Study of Adaptive Resonance Theory

Abstract: In this paper we study Adaptive Resonance Theory (ART) and its extensions, and provide an introduction to ART by examining ART1, the first member of the ART family of neural networks, together with ART2, ART2-A, FUZZY ART, ARTMAP, and FUZZY ARTMAP. ART was specially designed to overcome the stability-plasticity dilemma. This paper also describes the use of the unsupervised ART2 neural network for recognizing patterns. We investigate the performance of ART2-A, which offers better recognition accuracy even when the illumination of the images is varied. The paper also explores the features of the FUZZY ARTMAP neural network classifier, which is both fast and incrementally stable. FUZZY ART is an unsupervised network which is the essential component of FUZZY ARTMAP.

Keywords: ART, neural network, pattern recognition

Neural network: A neural network is a mathematical or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases a neural network is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.

Supervised learning: Supervised learning is the machine learning task of inferring a function from supervised training data. In this setting, each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function, which is called a classifier. The inferred function should predict the correct output value for any valid input object. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way.

Unsupervised learning: Unsupervised learning is a class of problems in machine learning where the goal is to determine how the data is organized. Many methods employed here are based on data mining methods used to preprocess unlabeled examples. Unsupervised learning is closely related to the problem of density estimation in statistics; however, it also encompasses many other techniques that seek to summarize and explain key features of the data.

Adaptive Resonance Theory (ART) [1] is both a cognitive and neural theory of how the brain quickly learns to categorize and predict objects and events in a changing world, and a set of algorithms which computationally embody ART principles and are used in large-scale engineering and technological applications where fast, stable, incremental learning about complex changing environments is needed. ART clarifies the brain processes from which conscious experience emerges. ART predicts how top-down attention works and regulates fast, stable learning of recognition categories. In particular, ART articulates a critical role for "resonant" states in driving fast, stable learning; hence the name adaptive resonance. These resonant states are bound together, using top-down attentive feedback in the form of learned expectations, into coherent representations of the world. ART hereby clarifies one important sense in which the brain carries out predictive computation. ART algorithms have been used in large-scale applications such as medical database prediction, remote sensing, and airplane design.

Fig1: Basic structure of the ART network

neuron in F1 is connected to all neurons in F2 via the continuous-valued forward long term memory (LTM) Wf , and vice versa via the binary-valued backward LTM Wb. The other modules are gain 1 and 2 (G1 and G2), and a reset module. Each neuron in the comparison layer receives three inputs: a component of the input pattern, a component of the feedback pattern, and a gain G1. A neuron outputs a 1 if and only if at least three of these inputs are high: the 'two-thirds rule.' The neurons in the recognition layer each compute the inner product of their incoming (continuous-valued) weights and the pattern sent over these connections. The winning neuron then inhibits all the other neurons via lateral inhibition. Gain 2 is the logical 'or' of all the elements in the input pattern x. Gain 1 equals gain 2, except when the feedback pattern from F2 contains any 1; then it is forced to zero. Finally, the reset signal s sent to the active neuron in F2 if the input vector x & the output of F1 di er by more than some vigilance level. Operation: The network starts by clamping the input at F1. Because the output of F2 is zero, G1 and G2 are both on and the output of F1 matches its input. The pattern is sent to F2, and in F2 one neuron becomes active. This signal is then sent back over the backward LTM, which reproduces a binary pattern at F1. Gain 1 is inhibited, and only the neurons in F1 which receive a 'one' from both x and F2 remain active. If there is a substantial mismatch between the two patterns, the reset signal will inhibit the neuron in F2 and the process is repeated. 1. Initialization:

Basic processing module of ART networks is an extended competitive learning network, as shown in Fig.1 [2]. The m neurons of an input layer F1 register values of an input pattern1= (i1,i2,,im ) every neuron of output layer F2 receives a bottom-up net activity tj, built from all F1outputs S = I. The vector elements of T=(t1,t2,.,tn) can be perceived as the results of comparison between input pattern I and prototypes W1=(w11,,,w1m),,Wn=(wn1,.,wnm). These prototypes are stored in the synaptic weights of the connections between F1 - and F2 -neurons. The ART Architecture:

Fig2.The ART architecture. The system consists of two layers F1 andF2,which are connected to each other via the LTM. The input pattern is received at F1, whereas classification takes Place in F2. As mentioned before, the input is not directly classified. First a characterization takes place by means of extracting features, giving rise to activation in the feature representation field. The expectations, residing in the LTM connections, translate the input pattern to a categorization in the category representation field. The classification is compared to the expectation of the network, which resides in the LTM weights from F2 to F1. If there is a match, the expectations are strengthened, otherwise the classification is rejected. The simplified neural network model:

where N is the number of neurons in F1, M the number of neurons in F2, 0 ≤ i < N, and 0 ≤ j < M. Also, choose the vigilance threshold ρ, 0 ≤ ρ ≤ 1; 2. Apply the new input pattern x; 3. Compute the activation values y of the neurons in F2:

4. Select the winning neuron k (0 ≤ k < M); 5. Vigilance test: if

where · denotes the inner product, go to step 7, else go to step 6. Note that Wkb · x is essentially the inner product of the prototype and the input.
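The vigilance test above can be sketched in a few lines of Python; this is a minimal illustration assuming binary vectors (the function name, vectors and threshold value are illustrative, not taken from the paper):

```python
import numpy as np

def vigilance_test(x, w_b, rho):
    """Vigilance test of the simplified ART1 model.

    x:   binary input vector (components 0 or 1)
    w_b: binary backward-LTM weights of the winning F2 neuron
    rho: vigilance threshold, 0 <= rho <= 1
    Passes when the inner product <w_b, x> (the count of shared
    1-components) reaches rho times the number of 1s in x.
    """
    x, w_b = np.asarray(x), np.asarray(w_b)
    return w_b @ x >= rho * x.sum()

# A prototype sharing 3 of the input's 4 active bits passes at rho = 0.7;
# one sharing a single active bit does not.
print(vigilance_test([1, 1, 1, 1, 0], [1, 1, 1, 0, 0], 0.7))  # True
print(vigilance_test([1, 1, 1, 1, 0], [1, 0, 0, 0, 1], 0.7))  # False
```

When the test fails, the winning neuron is inhibited (step 6) and the search continues among the remaining F2 nodes.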

Fig3. The ART neural network. The ART1 simplified model consists of two layers of binary neurons (with values 1 and 0), called F1 and F2. The quantity Wkb · x will be large if the prototype and the input are near to each other; 6. Neuron k is disabled from further activity. Go to step 3;

7. Set for all l, 0 ≤ l < N: yj = −1 (inhibit node j, and continue executing step 8 again). If ||x||/||s|| ≥ ρ, then proceed to step 13. Step13: update the weights for node j: bij(new) = L·xi/(L − 1 + ||x||); tji(new) = xi. Step14: test for the stopping condition. ii) Adaptive Resonance Theory 2 (ART2): ART2 accepts continuous-valued vectors. ART2 networks [3] are both plastic and stable in that they can learn new data without erasing currently stored information. Thus ART2 networks are suitable for continuous, incremental online training. The main advantage of ART networks is that they do not suffer from the stability-plasticity problem of supervised networks and thus are more suitable for continuous online learning of the classification task. The difference between ART2 and ART1 reflects the modifications needed to accommodate patterns with continuous-valued components. The architecture of an ART2 network is delineated in Fig4. In this particular configuration, the feature representation field (F1) consists of four loops. An input pattern will be circulated in the lower two loops first. Inherent noise in the input pattern will be suppressed [this is controlled by the parameters a and b and the feedback function f(·)] and prominent features in it will be accentuated. Then the enhanced input pattern will be passed to the upper two F1 loops and will excite the neurons in the category representation field (F2) via the bottom-up weights. The established class neuron in F2 that receives the strongest stimulation will fire. This neuron will read out a top-down expectation in the form of a set of top-down weights, sometimes referred to as class templates. This top-down expectation will be compared against the enhanced input pattern by the vigilance mechanism. If the vigilance test is passed, the top-down and bottom-up weights will be updated and, along with the enhanced input pattern, will circulate repeatedly in the two upper F1 loops until stability is achieved.
The time taken by the network to reach a stable state depends on how close the input pattern is to passing the vigilance test. If it passes the test comfortably, i.e. the input pattern is quite similar to the top-down expectation, stability will be quick to achieve. Otherwise, more iterations are required. After the top-down and bottom-up weights have been updated, the currently firing neuron will become an established class neuron. If the vigilance test fails, the currently firing neuron will be disabled. Another search within the remaining established class neurons in the F2 layer will be conducted. If none of the established class neurons has a top-down expectation similar to the input pattern, an unoccupied F2 neuron will be assigned to classify the input pattern. This procedure repeats itself until either all the patterns are classified or the memory capacity of F2 has been exhausted. The ART2 algorithm: Step1: initialize parameters a, b, c, d, e, ρ, θ, α. Step2: perform steps 3-13 up to the specified number of epochs of training.

8. Re-enable all neurons in F2 and go to step 2. III. EXTENSIONS IN ADAPTIVE RESONANCE THEORY. The extensions of ART are discussed as follows: i) Adaptive Resonance Theory 1 (ART1): ART1 is an efficient algorithm that emulates the self-organizing [2] pattern recognition and hypothesis-testing properties of the ART neural network architecture, for example for horizontal and vertical classification in 0-9 digit recognition. The ART1 model can self-organize in real time, producing stable and clear recognition while getting input patterns beyond those originally stored. It can also preserve its previously learned knowledge while keeping its ability to learn new input patterns, which can be saved in such a fashion that the stored patterns cannot be destroyed or forgotten. ART1 is used to cluster binary (nonzero) input vectors and gives the user direct control of the degree of similarity among patterns placed on a cluster unit. The learning process is designed such that patterns need not be presented in a fixed order, and the number of patterns for clustering may be unknown in advance. Updates for both the bottom-up and top-down weights are controlled by differential equations. However, this process may be finished within a learning trial; in other words, the weights reach equilibrium during each learning trial. The ART1 algorithm:
Step1: initialize parameters L > 1 and 0 < ρ ≤ 1. Initialize weights 0 < bij(0) < L/(L − 1 + n), tji(0) = 1.
Step2: while the stopping condition is false, perform steps 3-14.
Step3: for each training input, do steps 4-13.
Step4: set activations of all F2 units to zero. Set activations of F1(a) units to the input vector s.
Step5: compute the norm of s: ||s|| = Σi si.
Step6: send the input signal from the F1(a) to the F1(b) layer.
Step7: for each F2 node that is not inhibited: if yj ≠ −1, then yj = Σi bij·xi.
Step8: while reset is true, perform steps 9-12.
Step9: find J such that yJ ≥ yj for all nodes j. If yJ = −1, then all nodes are inhibited and this pattern cannot be clustered.
Step10: recompute the activation x of F1(b): xi = si·tji. Step11: compute the norm of vector x: ||x|| = Σi xi. Step12: test for reset: if ||x||/||s|| < ρ, then
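The ART1 steps above can be condensed into a short Python sketch; this is an illustrative fast-learning implementation under the stated parameter constraints (the function name, sample patterns and deterministic argmax tie-breaking are assumptions, not from the paper):

```python
import numpy as np

def art1_cluster(patterns, n_clusters, rho=0.7, L=2.0):
    """Minimal ART1 clustering sketch (fast learning).

    patterns:   list of binary input vectors
    n_clusters: maximum number of F2 category nodes
    rho:        vigilance parameter (0 < rho <= 1)
    L:          bottom-up weight parameter (L > 1)
    Returns the cluster index assigned to each pattern (-1 if none).
    """
    n = len(patterns[0])
    b = np.full((n_clusters, n), L / (L - 1 + n))  # bottom-up weights
    t = np.ones((n_clusters, n))                   # top-down weights
    labels = []
    for s in patterns:
        s = np.asarray(s, dtype=float)
        inhibited = set()
        while True:
            y = b @ s                        # F2 net activities
            for i in inhibited:
                y[i] = -1.0
            j = int(np.argmax(y))            # winning F2 node
            x = s * t[j]                     # F1(b) activity: componentwise AND
            if x.sum() / s.sum() >= rho:     # vigilance test passed: resonance
                b[j] = L * x / (L - 1 + x.sum())
                t[j] = x
                labels.append(j)
                break
            inhibited.add(j)                 # reset: disable node, search again
            if len(inhibited) == n_clusters:
                labels.append(-1)            # pattern cannot be clustered
                break
    return labels

print(art1_cluster([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]], n_clusters=3))
# [0, 1, 2]
```

With ρ = 0.7, the second pattern first activates cluster 0, fails the vigilance test (match ratio 2/3), and is rerouted to a fresh node, illustrating the search-and-reset cycle.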

Step3: for each input vector, do steps 4-12.
Step4: update F1 unit activations: ui = 0, xi = si/(e + ||s||), wi = si, qi = 0, pi = 0, vi = f(xi). Update F1 unit activations again: ui = vi/(e + ||v||), wi = si + a·ui, pi = ui, xi = wi/(e + ||w||), qi = pi/(e + ||p||), vi = f(xi) + b·f(qi).
Step5: compute signals to F2 units: yj = Σi bij·pi.
Step6: while reset is true, perform steps 7-8.
Step7: for the F2 units, choose the yj with the largest signal.
Step8: check for reset: ui = vi/(e + ||v||), pi = ui + d·tji, ri = (ui + c·pi)/(e + ||u|| + c·||p||). If ||r|| < ρ − e, then yj = −1 (inhibit j); since reset is true, go to step 6. If ||r|| ≥ ρ − e, then wi = si + a·ui, vi = f(xi) + b·f(qi); reset is false, so go to step 9.
Step11: update F1 activations.
Step12: test the stopping condition for weight updates.
Step13: test the stopping condition for the number of epochs.
iii) Adaptive Resonance Theory 2-A (ART2-A): ART networks are designed to allow the user to control the degree of similarity of patterns placed on the same cluster. The resulting number of clusters then depends on the distances between all input patterns presented to the network during training. The ART2-A network can be characterized by its preprocessing, choice, match and adaptation steps, where choice and match define the search circuit for a fitting prototype [2]. The central functions of the ART2-A algorithm [2] are as follows. Preprocessing: no negative input values are allowed, and all input vectors A are encoded to unit Euclidean length, denoted by the function symbol N as:

(1) Carpenter and Grossberg suggested an additional method of noise suppression for contrast enhancement: setting to zero all input values which do not exceed a certain bias θ, as defined by,

(2) This kind of contrast enhancement only makes sense if the characteristics of input patterns that lead to a distribution over different clusters are coded in their highest values, with θ bounded by

Fig4: Architecture of an ART2 network. Step9: perform steps 10-12 up to the specified number of learning iterations. Step10: update the weights for the winning unit J: tJi(new) = α·d·ui + [1 + α·d·(d − 1)]·tJi(old); bJi(new) = α·d·ui + [1 + α·d·(d − 1)]·bJi(old).

(3) The upper limit will lead to complete suppression of all patterns having the same constant value for all elements. Choice: bottom-up activities leading to the choice of a prototype are determined by


(4) Bottom-up net activities are determined differently for previously committed and uncommitted prototypes. The choice parameter α ≥ 0 again defines the maximum depth of search for a fitting cluster. With α = 0, all committed prototypes are checked before an uncommitted prototype is chosen as winner. The simulations in this paper apply α = 0. Match: resonance and adaptation occur either if j is the index of an uncommitted prototype, or if j is a committed prototype and (5) holds. Adaptation: adaptation of the final winning prototype requires a shift towards the current input pattern, (6). ART2-A type networks always use the fast-commit slow-recode mode. Therefore the learning rate is set to η = 1 if j is an uncommitted prototype, and to lower values for further adaptation. Since match and choice do not evaluate the values of uncommitted prototypes, there is no need to initialize them with specific values. iv) Fuzzy ART: Fuzzy ART is a neural network introduced by Carpenter, Grossberg and Rosen in 1991 [4]. It is a modified version of the binary [5] ART1, which is notably able to accept analog fuzzy input patterns, i.e. vectors whose components are real numbers between 0 and 1. Fuzzy ART is an unsupervised neural network capable of incremental learning, that is, it can learn continuously without forgetting what it has previously learned. The Fuzzy ART network:

Fig5. Sample Fuzzy ART network. A Fuzzy ART network is formed of two layers of neurons, the input layer F1 and the output layer F2, as illustrated in Fig5. Both layers have an activity pattern, schematized in the figure with vertical bars of varying height. The layers are fully interconnected, each neuron being connected to every neuron of the other layer. Every connection is weighted by a number lying between 0 and 1. A neuron of F2 represents one category formed by the network and is characterized by its weight vector wj (j is the index of the neuron). The weight vector's size is equal to the dimension M of layer F1. Initially all the weight vectors' components are set to 1. Until the weights of a neuron are modified, we say that it is uncommitted; conversely, once a neuron's weights have been modified, the neuron is said to be committed. The network uses a form of normalization called complement coding. The operation consists of taking the input vector and concatenating it with its complement. The resulting vector is presented to layer F1. Therefore, the dimension M of layer F1 is double the input vector's dimension. Complement coding can be deactivated, in which case layer F1 has the same dimension as the input vector. Unless specified otherwise, we will always suppose that complement coding is active. The Fuzzy ART learns by placing hyperboxes in the M/2-dimensional hyperspace, M being the size of layer F1. As said earlier, each neuron of layer F2 represents a category formed by the network. The position of the box in the space is encoded in the weight vector of the neuron. The general structure of Fuzzy ART: a typical ART network includes three layers. The layers F0, F1 and F2 are the input, comparison and recognition layers, respectively. The input layer F0 gets the attributes of peers which need to be classified. Each peer advertises its capability in the comparison layer F1 for competence comparison.
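Complement coding as described above is easy to state in code; the following NumPy sketch is illustrative (the function name and sample vector are assumptions, not from the paper):

```python
import numpy as np

def complement_code(a):
    """Fuzzy ART complement coding: concatenate a with its complement 1 - a.

    a: analog input vector with components in [0, 1]
    Returns the coded vector presented to F1, whose dimension M is
    double the input vector's dimension.
    """
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a])

print(complement_code([0.25, 0.75]))  # [0.25 0.75 0.75 0.25]
# Each pair (a_i, 1 - a_i) sums to 1, so the coded vector's city-block
# norm stays constant: a built-in normalization that helps prevent
# category proliferation.
```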
The nodes in F0 and F1 are composed of the entities of the ontology. The corresponding nodes of layer F0 and F1 are connected together via one-to-one, non-modifiable links.

Nodes in recognition layer F2 are candidates for the semantic clusters. There are two sets of distinct connections between the layers: bottom-up (F1 to F2) and top-down (F2 to F1). Then there is a vigilance parameter which defines a kind of tolerance for the comparison of vectors. F2 is a competitive layer, which means that only the node with the largest activation becomes active while the other nodes are inactive (in other words, each node in F2 corresponds to a category). Therefore, every node in F2 has its own unique top-down weight vector, also called the prototype vector (it is used to compare the input pattern to the prototypical pattern associated with the category for which the node in F2 stands). This inter-ART vigilance resetting signal is a form of "back propagation" of information, but one that differs from the back propagation that occurs in the Back Propagation network. For example, the search initiated by inter-ART reset can shift attention to a novel cluster of visual features that can be incorporated through learning into a new ARTa recognition category. This process is analogous to learning a category for "green bananas" based on taste feedback. However, these events do not "back propagate" taste features into the visual representation of the bananas, as can occur using the Back Propagation network. Rather, match tracking reorganizes the way in which visual features are grouped, attended, learned, and recognized for purposes of predicting an expected taste.
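The competition among F2 nodes and the prototype comparison described above can be sketched with the standard Fuzzy ART choice and match functions; the numeric values below are illustrative assumptions:

```python
import numpy as np

def choose_category(I, W, alpha=0.001):
    """Fuzzy ART choice function T_j = |I ^ w_j| / (alpha + |w_j|),
    where ^ is the fuzzy AND (componentwise minimum) and |.| the
    city-block norm. Returns the index of the winning F2 node."""
    T = [np.minimum(I, w).sum() / (alpha + w.sum()) for w in W]
    return int(np.argmax(T))

def passes_vigilance(I, w, rho):
    """Match criterion |I ^ w| / |I| >= rho against the prototype."""
    return np.minimum(I, w).sum() / I.sum() >= rho

I = np.array([0.2, 0.9, 0.8, 0.1])      # complement-coded input
W = np.array([[1.0, 1.0, 1.0, 1.0],     # uncommitted node (all ones)
              [0.2, 0.8, 0.7, 0.1]])    # committed node
j = choose_category(I, W)
print(j, passes_vigilance(I, W[j], rho=0.8))  # 1 True
```

The committed prototype wins because it is almost contained in the input, and it then passes the vigilance comparison, so no reset occurs.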

Fig6. The general structure of Fuzzy ART. v) ARTMAP: ARTMAP, also known as Predictive ART [5], combines two slightly modified ART1 or ART2 units into a supervised learning structure, where the first unit takes the input data and the second unit takes the correct output data, which is then used to make the minimum possible adjustment of the vigilance parameter in the first unit in order to make the correct classification. ARTMAP models combine two unsupervised modules to carry out supervised learning. Many variations of the basic supervised and unsupervised networks have since been adapted for technological applications and biological analyses. Modules ARTa and ARTb self-organize categories for vector sets a and b. ARTa and ARTb are connected by an inter-ART module that consists of the Map Field and the control nodes called the Map Field gain control and the Map Field orienting subsystem. Inhibitory paths are denoted by a minus sign; other paths are excitatory. ARTa and ARTb are here connected by an inter-ART module that in many ways resembles ART1. This inter-ART module includes a Map Field that controls the learning of an associative map from ARTa recognition categories to ARTb recognition categories. This map does not directly associate exemplars a and b, but rather associates the compressed and symbolic representations of families of exemplars a and b. The Map Field also controls match tracking of the ARTa vigilance parameter. A mismatch at the Map Field between the ARTa category activated by an input a and the ARTb category activated by the input b increases ARTa vigilance by the minimum amount needed for the system to search for and, if necessary, learn a new ARTa category whose prediction matches the ARTb category.

Fig7: Block diagram of an ARTMAP system.

vi) Fuzzy ARTMAP: The Fuzzy ARTMAP, introduced by Carpenter et al. in 1992 [6], is a supervised network which is composed of two Fuzzy ARTs. The Fuzzy ARTs are identified as ARTa and ARTb, and the parameters of these networks are designated respectively by the subscripts a and b. The two Fuzzy ARTs are interconnected by a series of connections between the F2 layers of ARTa and ARTb. The connections are weighted, i.e. a weight wij between 0 and 1 is associated with each one of them. These connections form what is called the map field Fab. The map field has two parameters (βab and ρab) and an output vector xab.
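Match tracking, the map field's key control mechanism, can be sketched in a few lines of Python; the function name and the epsilon value are illustrative assumptions:

```python
def match_track(match_ratio, rho_a, epsilon=1e-6):
    """ARTMAP/Fuzzy ARTMAP match tracking: after a map-field mismatch,
    raise the ARTa vigilance just above the winner's current match
    ratio |I ^ w_J| / |I|, so that the offending ARTa category fails
    its vigilance test and the search for a better category resumes.

    match_ratio: match ratio of the current ARTa winner
    rho_a:       baseline ARTa vigilance
    Returns the minimally increased vigilance.
    """
    return max(rho_a, match_ratio + epsilon)

# A winner matching at 0.85 under baseline vigilance 0.5 forces a new
# search with vigilance just above 0.85.
print(match_track(0.85, 0.5) > 0.85)  # True
```

Raising vigilance by only the minimum amount is what lets ARTMAP learn the coarsest categorization still consistent with the supervised labels.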

version of binary ART1. ARTMAP combines two slightly modified ART1 or ART2 units in a supervised learning structure. Fuzzy ARTMAP is a supervised network which is composed of two Fuzzy ART modules.

Figure 8: Sample Fuzzy ARTMAP network

The input vector a of ARTa is put in complement-coding form, resulting in vector A. Complement coding is not necessary in ARTb, so we present the input vector B directly to this network. Figure 8 presents the structure of the Fuzzy ARTMAP. The weights of the map field's connections are illustrated by vertical bars with heights proportional to the size of the weights. The weights of the map field are all initialized to 1. A category in ARTMAP is represented by many rectangles; in other words, it learns to distinguish the data by mapping boxes into the space, enclosing each category with a certain number of these boxes. Classifying: once the Fuzzy ARTMAP is trained [6], it can be used as a classifier. In this case, the ARTb network is not used. We present a single input vector to ARTa, which is propagated until resonance, with a (temporary) vigilance ρa = 0. Thus, the first category selected by the choice function is accepted. If the map field used fast learning (βab = 1), the output vector contains only 0's except for a single 1; the index of this component is the number of the category in which the input vector A has been classified. The use of the map field is thus to associate a category number with each neuron of ARTa's F2 layer (Fa2), i.e. with each box in the hyperspace. If fast learning was not used, one neuron of Fa2 can correspond to many categories with different degrees. One way to determine the category number is to select the index of the maximal component. If needed, the desired output vector can be restored from the weights of the neuron of Fb2 whose index is the category number.

ART is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. It describes a number of neural network models. ART1 is the simplest variety of ART network, accepting only binary inputs; it is an unsupervised learning model specially designed for recognizing binary patterns. ART2 extends the network's capabilities to support continuous inputs. The main advantage of ART2 networks is that they do not suffer from the stability-plasticity problem of supervised networks and thus are more suitable for continuous online learning of the classification task. ART2-A is the best classifier when the images are corrupted by additive random noise; ART2-A is much less time-consuming than the other neural networks, and its adaptation is fast when a new sample is introduced. Fuzzy ART implements fuzzy logic in ART's pattern recognition, thus enhancing generalizability. An optional (and very useful) feature of Fuzzy ART is complement coding, a means of incorporating the absence of features into pattern classification, which goes a long way towards preventing inefficient and unnecessary category proliferation. Fuzzy ARTMAP is a powerful neural network model with many useful characteristics, including stability and online learning. Fuzzy ARTMAP gave better performance and fewer rules than other machine learning algorithms and newer models. The ARTMAP learning process is a lot faster. Moreover, Fuzzy ARTMAP is capable of incrementally stable learning.

[1] Carpenter, G.A. and Grossberg, S., "Adaptive Resonance Theory," in The Handbook of Brain Theory and Neural Networks.
[2] T. Frank, K.F. Kraiss, and T. Kuhlen, "Comparative Analysis of Fuzzy ART and ART-2A Network Clustering Performance," IEEE Transactions on Neural Networks, Vol. 9, pp. 544-559.
[3] Carpenter, G.A. and Grossberg, S., "ART 2: self-organization of stable category recognition codes for analog input patterns."
[4] Gail A. Carpenter, Stephen Grossberg, and David B. Rosen, "Fuzzy ART: fast, stable learning and categorization of analog patterns by an adaptive resonance system," Neural Networks, 4:759-771, 1991.
[5] Gail A. Carpenter and Stephen Grossberg, "A massively parallel architecture for a self-organizing neural pattern recognition machine," Computer Vision, Graphics, and Image Processing, 37:54-115, 1987.
[6] Gail A. Carpenter, Stephen Grossberg, Natalya Markuzon, John H. Reynolds, and David B. Rosen, "Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps," IEEE Transactions on Neural Networks, 3:698-713, 1992.

Of the models discussed, ARTMAP and Fuzzy ARTMAP are supervised models, while ART1, ART2, ART2-A and Fuzzy ART are unsupervised models. ART1 is used to cluster binary (nonzero) input vectors, whereas ART2 accepts continuous-valued vectors. ART2-A is designed to control the degree of similarity of patterns. Fuzzy ART is a modified


Is Wireless Network Purely Secure?

Abstract Wireless technology has been gaining rapid popularity for some years. Adoption of a standard depends on the ease of use and the level of security it provides. In this case, the contrast between wireless usage and security standards shows that security is not keeping up with the growth pace of end users' usage. Current wireless technologies in use allow hackers to monitor and even change the integrity of transmitted data. The lack of rigid security standards has caused companies to invest millions in securing their wireless networks. When discussing the security of wireless technologies, there are several possible perspectives: different authentication, access control and encryption technologies all fall under the umbrella of security. During the beginning of the commercialization of the Internet, organizations and individuals connected without concern for the security of their systems or networks. While current access points provide several security mechanisms, my work shows that all of these mechanisms are completely ineffective. As a result, organizations with deployed wireless networks are vulnerable to unauthorized use of Internet-enabled systems and access to their internal infrastructure. 1. Introduction

external compromise. As a result, organizations have channeled their external network traffic through distinct openings protected by firewalls. The idea is simple: by limiting external connections to a few well-protected openings, the organization can better protect itself. Unfortunately, the deployment of a wireless network opens a back door into the internal network that permits an attacker access beyond the physical security perimeter of the organization. 2. 802.11 Wireless Networks

802.11 wireless networks operate in one of two modes: ad-hoc or infrastructure mode. The IEEE standard defines the ad-hoc mode as Independent Basic Service Set (IBSS) and the infrastructure mode as Basic Service Set (BSS). In the remainder of this section, we explain the differences between the two modes and how they operate. In ad-hoc mode, each client communicates directly with the other clients within the network (see Figure 1).

Organizations are rapidly deploying wireless infrastructures based on the IEEE 802.11 standard. Unfortunately, the 802.11 standard provides only limited support for confidentiality through the Wired Equivalent Privacy (WEP) protocol, which contains significant design flaws. Furthermore, the standards committee for 802.11 left many of the difficult security issues, such as key management and a robust authentication mechanism, as open problems. As a result, many of the organizations deploying wireless networks use either a permanent fixed cryptographic variable, or key, or no encryption whatsoever. Organizations over the last few years have expended considerable effort to protect their internal infrastructure from

Figure 1: Example ad-hoc network. Ad-hoc mode is designed such that only the clients within transmission range (within the same cell) of each other can communicate. If a client in an ad-hoc network wishes to communicate outside of the cell, a member of the cell MUST operate as a gateway and perform routing.

In infrastructure mode, each client sends all of its communications to a central station, or access point (AP). The access point acts as an ethernet bridge and forwards the communications onto the appropriate network: either the wired network or the wireless network (see Figure 2). The association process concludes with the client sending an association request frame and the access point responding with an association response frame. After following the process described in the previous paragraph, the client becomes a peer on the wireless network and can transmit data frames on the network. 3. Traditional Wireless Security

Wireless security can be broken into two parts: authentication and encryption. Authentication mechanisms can be used to identify a wireless client to an access point and vice-versa, while encryption mechanisms ensure that it is not possible to intercept and decode data. Authentication: access points support MAC authentication of wireless clients, which means that only traffic from authorized MAC addresses will be allowed through the access point. The access point determines whether a particular MAC address is valid by checking it against either a RADIUS server external to the access point or a database within the non-volatile storage of the access point. For many years, MAC access control lists have been used for authentication. Encryption: much attention has been paid recently to the fact that the Wired Equivalent Privacy (WEP) encryption defined by 802.11 is not an "industrial strength" encryption protocol. For many years, 802.11 WEP has been used for encryption. 4. 802.11 Standard Security Mechanisms

Figure 2: Example infrastructure network. Prior to communicating data, wireless clients and access points must establish a relationship, or association. Only after an association is established can the two wireless stations exchange data. In infrastructure mode, the clients associate with an access point. The association process is a two-step process involving three states: 1. unauthenticated and unassociated, 2. authenticated and unassociated, and 3. authenticated and associated. To transition between the states, the communicating parties exchange messages called management frames. We will now walk through a wireless client finding and associating with an access point. All access points transmit a beacon management frame at a fixed interval. To associate with an access point and join a BSS, a client listens for beacon messages to identify the access points within range. The client then selects the BSS to join in a vendor-independent manner. For instance, on the Apple Macintosh, all of the network names (or service set identifiers (SSIDs)) which are usually contained in the beacon frame are presented to the user so that they may select the network to join. A client may also send a probe request management frame to find an access point affiliated with a desired SSID. After identifying an access point, the client and the access point perform a mutual authentication by exchanging several management frames as part of the process. The two standardized authentication mechanisms are described in sections 4.1 and 4.2. After successful authentication, the client moves into the second state, authenticated and unassociated. Moving from the second state to the third and final state, authenticated and associated, involves the client sending an association request frame.

4.1 Open System Authentication Open system authentication is the default authentication protocol for 802.11. As the name implies, open system authentication authenticates anyone who requests authentication. Essentially, it provides a NULL authentication process. Experimentation has shown that stations do perform a mutual authentication using this method when joining a network, and our experiments show that the authentication management frames are sent in the clear even when WEP is enabled. 4.2 Shared Key Authentication Shared key authentication uses a standard challenge and response along with a shared secret key to provide authentication. The station wishing to authenticate, the initiator, sends an authentication request management frame indicating that they wish to use shared key authentication. The recipient of the authentication request, the responder, responds

by sending an authentication management frame containing 128 octets of challenge text to the initiator. The challenge text is generated by using the WEP pseudo-random number generator (PRNG) with the shared secret and a random initialization vector (IV). Once the initiator receives the management frame from the responder, they copy the contents of the challenge text into a new management frame body. This new management frame body is then encrypted with WEP using the shared secret along with a new IV selected by the initiator. The encrypted management frame is then sent to the responder. The responder decrypts the received frame and verifies that the 32-bit CRC integrity check value (ICV) is valid, and that the challenge text matches that sent in the first message. If they do, then authentication is successful. If the authentication is successful, then the initiator and the responder switch roles and repeat the process to ensure mutual authentication. The entire process is shown in figure 4, and the format of an authentication management frame is shown in figure 3.
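The challenge-response exchange above can be sketched end to end in Python; this is an illustrative model (the RC4 keystream stands in for the WEP PRNG, the key and challenge sizes follow the text, and the ICV computation is omitted for brevity):

```python
import os

def rc4_keystream(key, n):
    """Generate n bytes of RC4 keystream from key (bytes)."""
    S = list(range(256))
    j = 0
    for i in range(256):                       # key-scheduling algorithm
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(n):                         # pseudo-random generation
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def shared_key_auth(secret):
    """Sketch of the 802.11 shared key exchange (messages 2 and 3)."""
    # Responder: 128-octet challenge produced with the PRNG and a random IV.
    iv = os.urandom(3)
    challenge = rc4_keystream(iv + secret, 128)
    # Initiator: encrypt the challenge under a fresh IV of its own.
    iv2 = os.urandom(3)
    ks = rc4_keystream(iv2 + secret, 128)
    encrypted = bytes(c ^ k for c, k in zip(challenge, ks))
    # Responder: regenerate the keystream from the IV (sent in the clear),
    # decrypt, and compare against the original challenge text.
    ks2 = rc4_keystream(iv2 + secret, 128)
    decrypted = bytes(c ^ k for c, k in zip(encrypted, ks2))
    return decrypted == challenge

print(shared_key_auth(b"\x01\x02\x03\x04\x05"))  # True
```

Note that a passive eavesdropper sees both the plaintext challenge and its encryption, which is precisely why this scheme leaks keystream material.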

Sequence number   Status Code   Challenge text   WEP used
1                 Reserved      Not present      No
2                 Status        Present          No
3                 Reserved      Present          Yes
4                 Status        Not present      No

Table 1: Message Format based on Sequence Number

Management Frame Format: Frame Control | Duration | Dest Addr | Source Addr | BSS ID | Seq # | Frame Body | FCS

4.3 Closed Network Access Control: Lucent has defined a proprietary access control mechanism called Closed Network. With this mechanism, a network manager can use either an open or a closed network. In an open network, anyone is permitted to join the network. In a closed network, only those clients with knowledge of the network name, or SSID, can join. In essence, the network name acts as a shared secret. 4.4 Access Control Lists: another mechanism used by vendors (but not defined in the standard) to provide security is the use of access control lists based on the ethernet MAC address of the client. Each access point can limit the clients of the network to those using a listed MAC address. If a client's MAC address is listed, then they are permitted access to the network; if the address is not listed, then access to the network is prevented. 4.5 Wired Equivalent Privacy protocol: the Wired Equivalent Privacy (WEP) protocol was designed to provide confidentiality for network traffic using the wireless protocol. WEP aims to provide the security of a wired LAN by encrypting traffic with the RC4 algorithm on the two sides of a data communication. A. In the sender side:

Authentication Frame Format: Algorithm Number | Seq Num | Status Code | Element ID | Length | Challenge Text

Figure 3: Authentication Management Frame

The format shown is used for all authentication messages. The value of the status code field is set to zero when successful, and to an error value if unsuccessful. The element identifier indicates that the challenge text is included. The length field gives the length of the challenge text and is fixed at 128. The challenge text field contains the random challenge string. Table 1 shows the possible values and indicates when the challenge text is included, based on the message sequence number.

WEP uses four operations to encrypt the data (plaintext). First, the secret key used in the WEP algorithm is 40 bits long, and a 24-bit Initialization Vector (IV) is concatenated to it to act as the encryption/decryption key. Second, the resulting key acts as the seed for a Pseudo-Random Number Generator (PRNG). Third, the plaintext is run through an integrity algorithm and the resulting integrity check value (ICV) is concatenated to the plaintext. Fourth, the key sequence and the plaintext-plus-ICV go through the RC4 algorithm. The final encrypted message is made by attaching the IV in front of the ciphertext. Fig.5 defines the objects and explains the details of the operations. The WPA encryption described in section 4.6 is assisted also by a MIC (Message Integrity Check), whose function is to avoid the bit-flipping attacks easily applied to WEP, by using a hashing technique. Figure 7 shows the whole picture of the WPA process. As can be seen, TKIP uses the same RC4 technique as WEP, but performs a hash before invoking the RC4 algorithm. A duplicate of the initialization vector is made: one copy is sent to the next step, and the other is hashed (mixed) with the base key.
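The sender-side and recipient-side WEP operations can be sketched together in Python; this is an illustrative model (RC4 plus a CRC-32 ICV, with the ICV byte ordering chosen for the sketch rather than taken from the standard):

```python
import os
import zlib

def rc4(key, data):
    """RC4: encryption and decryption are the same XOR with the keystream."""
    S = list(range(256))
    j = 0
    for i in range(256):                       # key-scheduling algorithm
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for c in data:                             # pseudo-random generation + XOR
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(c ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(secret, plaintext):
    """Sender side: append the ICV, encrypt under IV||key, prepend the IV."""
    icv = zlib.crc32(plaintext).to_bytes(4, "little")  # 32-bit integrity check
    iv = os.urandom(3)                                 # 24-bit IV, sent in clear
    return iv + rc4(iv + secret, plaintext + icv)

def wep_decrypt(secret, frame):
    """Recipient side: rebuild the key from the clear IV, verify the ICV."""
    iv, ciphertext = frame[:3], frame[3:]
    data = rc4(iv + secret, ciphertext)
    plaintext, icv = data[:-4], data[-4:]
    if zlib.crc32(plaintext).to_bytes(4, "little") != icv:
        raise ValueError("ICV check failed")
    return plaintext

key = b"\x01\x02\x03\x04\x05"          # 40-bit shared secret
frame = wep_encrypt(key, b"hello")
print(wep_decrypt(key, frame))         # b'hello'
```

Because the ICV is a linear CRC rather than a keyed hash, controlled bit flips in the ciphertext can be matched by compensating flips in the ICV, which is the weakness WPA's MIC addresses.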

Figure 7: WPA Encryption Algorithm (TKIP) Figure 5: WEP encryption Algorithm (Sender Side) B. In the Recipient side: WEP try to use five operations to decrypt the received side (IV + Cipher text).At first, the PreShared Key and IV concatenated to make a secret key. Secondly, the Cipher text and Secret Key go to in CR4 algorithm and a plaintext come as a result. Thirdly, the ICV and plaintext will separate. Fourthly, the plaintext goes to Integrity Algorithm to make a new ICV (ICV) and finally the new ICV (ICV) compare with original ICV. In Fig.3 you can see the objects and the detail of operations schematically. After performing the hashing, the result generates the key to the package that is going to join the first copy of the initialization vector, occurring the increment of the algorithm RC4. After that, there's the generation of a sequential key with an XOR from the text that you wish to cryptograph, generating then the cryptography text. Finally, the message is ready for send. It is encryption and decryption will performed by inverting the process. 5. Weaknesses in 802.11 Standard Security Mechanisms

This section describes the weaknesses in the access control mechanisms of currently deployed wireless network access points. 5.1 Weakness of Lucents access control mechanism In practice, security mechanisms based on a shared secret are robust provided the secrets are wellprotected in use and when distributed. Unfortunately, this is not the case with Lucents access control mechanism. Several management messages contain the network name, or SSID, and these messages are broadcast in the clear by access points and clients. The actual message containing the SSID depends on the vendor of the access point. The end result, however, is that an attacker can easily sniff the network namedetermining the shared secret and gaining access to the protected network. This flaw exists even with

Figure 6: WEP encryption Algorithm (Recipient Side) 4.6 WPA (Wi-Fi Protected Access) The WPA came with the purpose of solving the problems in the WEP cryptography method, without the users need to change the hardware. The main reason why WPA generated after WEP is that the WPA allows a more complex data encryption on the TKIP protocol (Temporal Key Integrity Protocol) and

WEP enabled because the management messages are broadcast in the clear. 5.2 Weakness of Ethernet MAC Address Access Control Lists In theory, access control lists provide a reasonable level of security when a strong form of identity is used. Unfortunately, this is not the case with MAC addresses for two reasons. First, MAC addresses are easily sniffed by an attacker since they MUST appear in the clear even when WEP is enabled, and second most all of the wireless cards permit the changing of their MAC address via software. As a result, an attacker can easily determine the MAC addresses permitted access via eavesdropping, and then subsequently masquerade as a valid address by programming the desired address into the wireless card by-passing the access control and gaining access to the protected network. 5.3 Weaknesses of WEP In the WEP mechanism the access point sends the random challenge (plaintext, P) and then the station responds with the encrypted random challenge (ciphertext, C). -The attacker can capture the random challenge (P) and encrypted random challenge (C). - The IV was in the plaintext in the packet. Because the attacker now knows the random challenge (plaintext, P), the encrypted challenge (ciphertext, C), and the public IV, the attacker can derive the pseudo-random stream produced using WEP, with the shared key, K, and the public initialization variable, IV. WEP does not prevent replay attacks. An attacker can simply record and replay packets as desired and they will be accepted as legitimate WEP uses RC4 improperly. The keys used are very weak, and can be brute-forced on standard computers in hours to minutes, using freely available software. WEP reuses initialization vectors. A variety of available cryptanalytic methods can decrypt data without knowing the encryption key. WEP allows an attacker to undetectably modify a message without knowing the encryption key. 
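The WEP operations described in Section 4.5 can be sketched in ordinary code. This is a minimal illustrative model, not a vendor implementation: the function names and toy key/IV values are our own, RC4 is implemented directly from its key-scheduling and output stages, and the ICV is CRC-32 as in WEP.

```python
import zlib

def rc4_keystream(key, n):
    """RC4: key-scheduling (KSA) then pseudo-random generation (PRGA)."""
    S = list(range(256))
    j = 0
    for i in range(256):                                  # KSA
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = [], 0, 0
    for _ in range(n):                                    # PRGA
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def wep_encrypt(shared_key, iv, plaintext):
    """The four sender-side operations: seed = IV || key, ICV = CRC-32 of
    the plaintext, then XOR of plaintext || ICV with the RC4 stream."""
    assert len(shared_key) == 5 and len(iv) == 3          # 40-bit key, 24-bit IV
    icv = zlib.crc32(plaintext).to_bytes(4, "little")
    data = plaintext + icv
    ks = rc4_keystream(iv + shared_key, len(data))
    cipher = bytes(d ^ k for d, k in zip(data, ks))
    return iv + cipher                                    # IV travels in the clear

def wep_decrypt(shared_key, frame):
    """Recipient side: rebuild the seed from the cleartext IV, invert the
    XOR, split off the ICV and verify it."""
    iv, cipher = frame[:3], frame[3:]
    ks = rc4_keystream(iv + shared_key, len(cipher))
    data = bytes(c ^ k for c, k in zip(cipher, ks))
    plaintext, icv = data[:-4], data[-4:]
    assert icv == zlib.crc32(plaintext).to_bytes(4, "little"), "ICV mismatch"
    return plaintext

frame = wep_encrypt(b"\x01\x02\x03\x04\x05", b"\xaa\xbb\xcc", b"hello 802.11")
assert wep_decrypt(b"\x01\x02\x03\x04\x05", frame) == b"hello 802.11"
```

Note how everything an eavesdropper needs to mount the keystream-recovery attack of Section 5.3 (the IV and the ciphertext) is visible on the wire.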
5.4 Weaknesses of WPA
WPA Personal uses a Pre-Shared Key (PSK) to establish security using an 8 to 63 character passphrase. The PSK may also be entered as a 64-character hexadecimal string. WPA Personal is secure when used with good passphrases or a full 64-character hexadecimal key. Weak PSK passphrases can be broken using off-line dictionary attacks by capturing the messages in the four-way exchange when the client reconnects after being deauthenticated. This weakness rests on the pairwise master key (PMK), which is derived from the passphrase, the SSID and the length of the SSID, together with a nonce used only once in each session. The resulting string is hashed 4,096 times to generate a 256-bit value and then combined with nonce values. The information required to generate and verify this key (per session) is broadcast with normal traffic and is readily obtainable; the challenge then becomes the reconstruction of the original values. The pairwise transient key (PTK) is a keyed HMAC function based on the PMK; by capturing the four-way authentication handshake, the attacker has the data required to subject the passphrase to a dictionary attack. Wireless cracking suites such as aircrack-ng can crack a weak passphrase in less than a minute.
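The 4,096-iteration, 256-bit derivation described above is PBKDF2-HMAC-SHA1 with the SSID as salt, which the standard library exposes directly. The sketch below shows why weak passphrases fall to off-line dictionary attacks; as a simplification, the "check" here compares candidate PMKs directly, whereas a real attack recomputes the PTK and MIC from the captured handshake nonces. The function names and sample passphrases are ours.

```python
import hashlib

def wpa_psk_pmk(passphrase, ssid):
    """WPA-Personal pairwise master key: PBKDF2-HMAC-SHA1 over the
    passphrase, salted with the SSID, 4096 iterations, 256-bit output."""
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)

def dictionary_attack(ssid, candidates, target_pmk):
    """Off-line attack: repeat the derivation per candidate passphrase and
    compare against the key material recovered from the handshake."""
    for word in candidates:
        if wpa_psk_pmk(word, ssid) == target_pmk:
            return word
    return None

pmk = wpa_psk_pmk("correct horse battery", "HomeAP")
assert dictionary_attack("HomeAP", ["password", "correct horse battery"], pmk) \
       == "correct horse battery"
```

The 4,096 iterations slow each guess down, but for a passphrase that appears in a wordlist the attack still succeeds; this is why long random passphrases or the full 64-character hexadecimal key are required for WPA Personal to be secure.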

C XOR P = WEP[K, IV, PR]   (the pseudo-random stream)

The attacker now has all of the elements to successfully authenticate to the target network without knowing the shared secret K. The attacker requests authentication of the access point it wishes to associate with or join. The access point responds with an authentication challenge in the clear. The attacker then takes the random challenge text, R, and the pseudo-random stream, WEP[K, IV, PR], and computes a valid authentication response frame body by XOR-ing the two values together. The attacker then computes a new Integrity Check Value (ICV), responds with a valid authentication response message, and associates with the AP and joins the network.
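The keystream recovery and forgery just described can be demonstrated numerically. In this sketch the keystream WEP[K, IV, PR] is simulated with random bytes, because the point of the attack is precisely that the attacker never needs the key K; the ICV recomputation mentioned in the text is omitted for brevity.

```python
import os

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# --- What the attacker observes: one shared-key authentication exchange ---
keystream = os.urandom(128)          # stands in for WEP[K, IV, PR]; unknown to attacker
P = os.urandom(128)                  # random challenge, sent in the clear
C = xor(P, keystream)                # station's encrypted response (IV is public)

# --- Keystream recovery: C XOR P = WEP[K, IV, PR] ---
recovered = xor(C, P)
assert recovered == keystream        # the attacker never learned the key K

# --- Forgery: answer a fresh challenge R without knowing K ---
R = os.urandom(128)                  # AP's new challenge to the attacker
forged_response = xor(R, recovered)  # valid ciphertext for the same IV
assert xor(forged_response, keystream) == R   # the AP decrypts it back to R
```

Because WEP permits IV reuse, the recovered 128-byte stream stays valid for every later challenge encrypted under the same IV.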


Conclusions and Future Work

Our work demonstrates serious flaws in all of the security mechanisms used by the vast majority of access points supporting the IEEE 802.11 wireless standard. The end result is that all deployed 802.11 wireless networks are at risk of compromise, providing a network access point to internal networks beyond the physical security controls of the organization operating the network.

Unfortunately, fixing the problem is neither easy nor straightforward. The only good long-term solution is a major overhaul of the current standard, which may require replacement of current APs (although in some cases a firmware upgrade may be possible). Fortunately, the 802.11 standards body is currently working on significant improvements to the standard. However, it is too late for deployed networks and for those networks about to be deployed. A number of vendors are now releasing high-end access points claiming that they provide an increase in security. Unfortunately, few of the products we have examined provide enough information to determine the overall assurance that the product will provide, and worse, several of the products that do provide enough information use unauthenticated Diffie-Hellman, which suffers from a well-known man-in-the-middle attack. The use of unauthenticated Diffie-Hellman introduces a greater vulnerability to the organization's network. The increase in risk occurs because an attacker can insert themselves in the middle of the key exchange between the client and the access point, obtaining the session key, K. This is significantly worse than the current situation, where the attacker must first determine the pseudo-random stream produced for a given key, K, and public IV, and then use the stream to forge packets.

References
[1] LAN MAN Standards Committee of the IEEE Computer Society, Wireless LAN medium access control (MAC) and physical layer (PHY) specification, IEEE Standard 802.11, 1997 Edition, 1997.

[2] J. Walker, Unsafe at any key size: an analysis of the WEP encapsulation, Tech. Rep., IEEE 802.11 committee, March 2000. Document Holder/0-362.zip.
[3] N. Borisov, I. Goldberg, and D. Wagner, Intercepting Mobile Communications: The Insecurity of 802.11. http://www.isaac.cs.berkeley.edu/isaac/wep-faq.html.
[4] L. Blunk and J. Vollbrecht, PPP Extensible Authentication Protocol (EAP), Tech. Rep. RFC 2284, Internet Engineering Task Force (IETF), March 1998.
[5] A. H. Lashkari, A Survey on Wireless Security Protocols (WEP, WPA and WPA2/802.11i), FCSIT, University of Malaya, Malaysia.


An Innovative Digital Watermarking Process

Abstract: The seemingly ambiguous title of this paper, using the terms criticism and innovation in concord, signifies the imperative of every organisation's need for security within the competitive domain. Where organisational security criticism and innovativeness were traditionally considered antonymous, the assimilation of these two seemingly contradictory notions is fundamental to the assurance of long-term organisational prosperity. Organisations are required, now more than ever, to grow and be secured, with their innovation capability rendering consistent innovative outputs. This paper describes research conducted to consolidate the principles of digital watermarking and identify the fundamental components that constitute organisational security capability. The process of conducting a critical analysis is presented here. A brief description is provided of the field of digital watermarking, followed by a description of the advantages and disadvantages that were evaluated for the process. The paper concludes with a summary of the analysis and potential findings for future research.

Keywords: Digital Watermarking, Intellectual Property Protection, Steganography

Introduction
Digital watermarking is the process of embedding information into a digital signal in a way that is difficult to remove. In digital watermarking, the signal may be audio, pictures, or video. If the signal is copied, then the information also is carried in the copy. A signal may carry several different watermarks at the same time.

In visible digital watermarking, the information is visible in the picture or video. Typically, the information is text or a logo which identifies the owner of the media. When a television broadcaster adds its logo to the corner of transmitted video, this also is a visible watermark.

Fig 1. A digital watermarked picture

In invisible digital watermarking, information is added as digital data to audio, picture, or video, but it cannot be perceived as such (although it may be possible to detect that some amount of information is hidden in the signal). The watermark may be intended for widespread use and thus is made easy to retrieve, or it may be a form of steganography, where a party communicates a secret message embedded in the digital signal. In both cases, as in visible watermarking, the objective is to attach ownership or other descriptive information to the signal in a way that is difficult to remove. It also is possible to use hidden embedded information as a means of covert communication between individuals.

Fig 2. General Digital Watermarking Process

Applications
Digital watermarking may be used for a wide range of applications, such as:
- Copyright protection
- Source tracking (different recipients get differently watermarked content)
- Broadcast monitoring (television news often contains watermarked video from international agencies)
- Covert communication

Digital watermarking life-cycle phases
General digital watermark life-cycle phases are the embedding, attacking, and detection and retrieval functions. The information to be embedded in a signal is called a digital watermark, although in some contexts the phrase digital watermark means the difference between the watermarked signal and the cover signal. The signal where the watermark is to be embedded is called the host signal. A watermarking system is usually divided into three distinct steps: embedding, attack, and detection. In embedding, an algorithm accepts the host and the data to be embedded, and produces a watermarked signal.

Fig 3. Digital watermarking life cycle phases

Then the watermarked digital signal is transmitted or stored, usually transmitted to another person. If this person makes a modification, this is called an attack. While the modification may not be malicious, the term attack arises from the copyright protection application, where pirates attempt to remove the digital watermark through modification. There are many possible modifications, for example, lossy compression of the data (in which resolution is diminished), cropping an image or video, or intentionally adding noise.

Detection (often called extraction) is an algorithm which is applied to the attacked signal to attempt to extract the watermark from it. If the signal was unmodified during transmission, then the watermark still is present and it may be extracted. In robust digital watermarking applications, the extraction algorithm should be able to produce the watermark correctly, even if the modifications were strong. In fragile digital watermarking, the extraction algorithm should fail if any change is made to the signal.

Classification:
A digital watermark is called robust with respect to transformations if the embedded information may be detected reliably from the marked signal, even if degraded by any number of transformations. A digital watermark is called robust if it resists a designated class of transformations. Robust watermarks may be used in copy protection applications to carry copy and access control information. Typical image degradations are JPEG compression, rotation, cropping, additive noise, and quantization. For video content, temporal modifications and MPEG compression often are added to this list.

A digital watermarking method is said to be of quantization type if the marked signal is obtained by quantization. Quantization watermarks suffer from low robustness, but have a high information capacity due to rejection of host interference.

A digital watermark is called imperceptible if the watermarked content is perceptually equivalent to the original, unwatermarked content [1]. A digital watermark is called perceptible if its presence in the marked signal is noticeable, but non-intrusive.

A digital watermarking method is referred to as spread-spectrum if the marked signal is obtained by

an additive modification. Spread-spectrum watermarks are known to be modestly robust, but also to have a low information capacity due to host interference.

A digital watermarking method is referred to as amplitude modulation if the marked signal is embedded by additive modification similar to the spread-spectrum method, but embedded particularly in the spatial domain.

Reversible data hiding is a technique which enables images to be authenticated and then restored to their original form by removing the digital watermark and replacing the image data that had been overwritten. Digital watermarking for relational databases has emerged as a candidate solution to provide copyright protection, tamper detection, traitor tracing, and maintenance of the integrity of relational data.

Literature Review
The literature surveyed prior to the research process, and throughout the duration of this project, constituted more than a hundred documents. From this large literature set, some documents were identified as core, directly addressing the subject of digital watermarking. These documents were sourced from many locations, including peer-reviewed journals, conference proceedings, white papers, electronic books, etc. The core documents were further subdivided into 9 groups. The topics were tabulated and were used to perform the critical analysis. The first step was a detailed manual analysis and interpretation of the documents thus extracted (supplementing the initial literature study), and the second step was a critical approach towards every document for analysis.

Critical analysis
The analysis was done on the basis of the topics indexed, and the present work, the future scope, the methodology used and the conclusion were tabulated for concluding the report on the analysis of digital watermarking, considering the potential advantages and disadvantages of this technology.

Advantages:
Content Verification: Invisible digital watermarks allow the recipient to verify the author's identity. This might be very important with certain scientific visualizations, where a maliciously altered image could lead to costly mistakes. For example, an oil company might check for an invisible watermark in a map of oil deposits to ensure that the information is trustworthy. Watermarks provide a secure electronic signature.

Determine rightful ownership: Scientific visualizations are not just graphs of data; they are often artistic creations. It is therefore entirely appropriate to copyright these images. If an author is damaged by unauthorized use of such an image, the author is first obligated to prove rightful ownership. Invisible digital watermarking provides another method of proving ownership (in addition to posting a copyright notice and registering the image).

Track unlawful use: This technology might allow an author to track how his or her images are being used. Automated software would scan randomly selected images on the Internet (or any digital network) and flag those images which contain the author's watermark. This covert surveillance of network traffic would detect copyright violations, thereby reducing piracy.

Avoid malicious removal: The problem with copyright notices is that they are easily removed by pirates. However, an invisible digital watermark is well hidden and therefore very difficult to remove. Hence, it could foil a pirate's attack.

Disadvantages:
Degrade image quality: Even an invisible watermark will slightly alter the image during embedding. Therefore, watermarks may not be appropriate for images which contain raw data from an experiment. For example, embedding an invisible watermark in a medical scan might alter the image enough to lead to a false diagnosis.

May lead to unlawful ownership claims for images not yet watermarked: While invisible digital watermarks are intended to reduce piracy, their widespread acceptance as a means of legal proof of ownership may actually have the opposite effect. This is because a pirate could embed their watermark in older images not yet containing watermarks and make a malicious claim of ownership. Such claims might be difficult to challenge.

No standard system in place: While many watermarking techniques have been proposed, none of them has become the standard method. Furthermore, none of these schemes has yet been tested by a trial case in the courts. Therefore, they do not yet offer any real copyright protection.

May become obsolete: This technology only works if the watermarks cannot be extracted from an image. However, technological advances might allow future pirates to remove the watermarks of today. It is very difficult to ensure that a cryptographic method will remain secure for all time.
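The additive spread-spectrum scheme discussed in the classification section can be sketched in a few lines. This is a minimal illustrative model, not any of the surveyed methods: the seed, embedding strength alpha and function names are assumptions of ours, and detection here is blind correlation against the same pseudo-noise sequence.

```python
import random

def embed(host, wm_bits, alpha=2.0, seed=42):
    """Spread-spectrum embedding: add alpha * bit * pn[i] to each host
    sample, where pn is a +/-1 pseudo-noise sequence and bit is +/-1."""
    rng = random.Random(seed)
    pn = [rng.choice((-1, 1)) for _ in host]
    chip = len(host) // len(wm_bits)            # host samples spent per bit
    marked = list(host)
    for i, s in enumerate(host):
        b = 1 if wm_bits[i // chip] else -1
        marked[i] = s + alpha * b * pn[i]
    return marked

def detect(marked, n_bits, seed=42):
    """Blind correlation detector: correlate each chip-span with the same
    PN sequence; the sign of the correlation recovers the bit."""
    rng = random.Random(seed)
    pn = [rng.choice((-1, 1)) for _ in marked]
    chip = len(marked) // n_bits
    bits = []
    for k in range(n_bits):
        corr = sum(marked[i] * pn[i] for i in range(k * chip, (k + 1) * chip))
        bits.append(1 if corr > 0 else 0)
    return bits

random.seed(7)
host = [random.gauss(0, 1) for _ in range(4000)]
wm = [1, 0, 1, 1, 0, 0, 1, 0]
assert detect(embed(host, wm), len(wm)) == wm
```

The host-interference trade-off from the classification section is visible here: the correlation sums the watermark term coherently (alpha times the chip length) while the host contributes only random-walk noise, which is why capacity is low but robustness is reasonable.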

Table 1: Critical analysis of the literature reviewed
(table flattened in extraction; reconstructed row by row with the column headings Topic / Present work / Future work / Method / Conclusion)

Topic: Watermarking for 3D Polygons using Multiresolution Wavelet Decomposition
Present work: Proposed watermarking method is based on the wavelet transform (WT) and multiresolution representation (MRR) of the polygonal model.
Future work: In future, digital watermarking should be more robust to possible geometric operations, noise imposition and intentional attack. The embedding capacity should also be increased, and processing time decreased. A method to extract the watermark without the original polygon should be proposed. It must be expanded to the free-form surface model or solid model, which has to be more secret than the polygonal model in the CAD/CAM area.
Method: First the requirements and features of the proposed watermarking method are discussed. Second the mathematical formulations of WT and MRR of the polygonal model are shown. Third the algorithm of embedding and extracting the watermark is proposed.
Conclusion: Finally, the effectiveness of the proposed watermarking method is shown through several simulation results.

Topic: A Practical Method for Watermarking Java Programs
Present work: A practical method that discourages program theft by embedding Java programs with a digital watermark. Embedding a program developer's copyright notation as a watermark in Java class files will ensure the legal ownership of class files.
Future work: In the future, in order to make watermarks more tamper-resistant, error-correcting codes are to be applied to the watermarking method.
Method: The embedding method is indiscernible by program users, yet enables identification of an illegal program that contains stolen class files.
Conclusion: The result of the evaluation experiment showed that most of the watermarks (20 out of 23) embedded in class files survived two kinds of attacks that attempt to erase watermarks: an obfuscator attack, and a decompile-recompile attack.

Topic: A Digital Audio Watermark Embedding Algorithm
Present work: Proposed an audio digital watermarking algorithm based on the wavelet transform and the complex cepstrum transform (CCT), combined with a human auditory model and using the masking effect of human ears.
Future work: This algorithm does not compromise the robustness and inaudibility of the watermark.
Method: The algorithm embeds a binary image watermark into the audio signal and improves the imperceptibility of watermarks.
Conclusion: Experimental results show that this algorithm has better robustness against common signal processing such as noise, filtering, resampling and lossy compression.

Topic: Soft IP Protection: Watermarking HDL Codes
Present work: Leverages the unique features of Verilog HDL design to develop watermarking techniques. These techniques can protect both new and existing Verilog designs.
Future work: Currently collecting and building more Verilog and VHDL circuits to test the approach, and planning to develop CAD tools for HDL protection.
Method: SCU-RTL and ISCAS benchmark Verilog circuits, as well as an MP3 decoder, are watermarked. Both original and watermarked designs are implemented on ASICs and FPGAs.
Conclusion: The results show that the proposed techniques survive the commercial synthesis tools and cause little design overhead in terms of area/resources, delay and power.

Topic: Secure Spread Spectrum Watermarking for Multimedia
Present work: Presents a secure (tamper-resistant) algorithm for watermarking images, and a methodology for digital watermarking that may be generalized to audio, video, and multimedia data.
Future work: The experiments presented are preliminary, and should be expanded in order to validate the results; ongoing work is being conducted in this area. Further, the degree of precision of the registration procedures used in undoing affine transforms must be characterized precisely across a large test set of images.
Method: The use of Gaussian noise ensures strong resilience to multiple-document, or collusional, attacks.
Conclusion: Experimental results are provided to support these claims, along with an exposition of pending open problems.

Topic: Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise
Present work: A communications perspective on digital watermarking is used to compute upper performance limits on blind digital watermarking for simple AWGN attacks and attacks by amplitude scaling and additive white noise.
Future work: An important result, that the practical ST-SCS watermarking scheme achieves at least 40% of the capacity of ICS, can still be improved by further research.
Method: It is shown that this case can be translated into effective AWGN attacks, which enables a straightforward capacity analysis based on the previously obtained watermark capacities for AWGN attacks. Watermark capacity for different theoretical and practical blind watermarking schemes is analyzed.
Conclusion: Analysis shows that the practical ST-SCS watermarking achieves at least 40% of the capacity of an ideal blind watermarking scheme.

Topic: Digital Watermark Mobile Agent
Present work: A digital watermark agent travels from host to host on a network and acts like a detective that detects watermarks and collects evidence of any misuse. Furthermore, an active watermark method was developed which allows the watermarked documents themselves to report their own usage to an authority if detected.
Future work: The second component under development is a data-mining and data-fusion module to intelligently select the next migration hosts based on multiple sources of information, such as related business categories and results of web search engines.
Method: The system enables an agency to dispatch digital watermark agents to agent servers, where an agent can perform various tasks. Once all the actions have been taken, a report is sent to the agency's database and the agent can continue to travel to another agent server.
Conclusion: Development of an active watermark method which allows the watermarked documents themselves to report their own usage to an authority if detected.

Topic: Analysis of Watermarking Techniques for the Graph Coloring Problem
Present work: A theoretical framework to evaluate watermarking techniques for intellectual property protection (IPP). Based on this framework, two watermarking techniques for the graph coloring (GC) problem are analyzed, since credibility and overhead are the most important criteria for any efficient watermarking technique.
Method: Formulae are derived that illustrate the trade-off between credibility and overhead.
Conclusion: Asymptotically, it is proved that arbitrarily high credibility can be achieved with at most 1-color overhead for both proposed watermarking techniques.

Topic: Practical capacity of digital watermark as constrained by reliability
Present work: Theoretical analysis of watermark capacity. A simplified watermark scheme is postulated, in which detection yields a multidimensional vector where each dimension is assumed to be i.i.d. (independent and identically distributed) and to follow the Gaussian distribution.
Future work: Some more experiments can be performed.
Method: Reliability is represented by three kinds of error rates: the false positive error rate, the false negative error rate, and the bit error rate.
Conclusion: Experiments were performed to verify the theoretic analysis, and it was shown that this approach yields a good estimate of the capacity of a watermark.

Conclusion
This paper concludes with a discussion on the relevance and applicability of innovative digital watermarking against digital piracy, and potential further research. The first point pertains to the requirement and the need of digital watermarking as an answer to digital piracy. The economic impact of digital piracy on the media industry is a credible threat to the sustainment of the industry. The advantages of digital watermarking are the following: content verification, determining rightful ownership, tracking unlawful use, and avoiding malicious removal. Its disadvantages are: it may degrade image quality, it may lead to unlawful ownership claims for images not yet watermarked, no standard system is in place, and it may become obsolete.

References
[1] A Practical Method for Watermarking Java Programs, The 24th Computer Software and Applications Conference (COMPSAC 2000), Taipei, Taiwan, Oct. 2000.
[2] A Digital Audio Watermark Embedding Algorithm, School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China.
[3] Yue Sun, Hong Sun, and Tian-ren Yao, Digital audio watermarking algorithm based on quantization in wavelet domain, Journal of Huazhong University of Science and Technology, 2002, Vol. 30(5), pp. 12-15.
[4] Hong-yi Zhao, Chang-nian Zhang, Digital signal processing and realization in MATLAB, Publishing company of chemical industry, Beijing, 2001, pp. 129-131.
[5] Secure and Robust Digital Watermarking on Grey Level Images, SERC Journals IJAST, Vol. 11, 1.
[6] Secure Spread Spectrum Watermarking for Multimedia, IEEE Transactions on Image Processing, Vol. 6, No. 12, December 1997.
[7] Digital Watermarking, Chaelynne M. Wolak, DISS 780, School of Computer and Information Sciences, Nova Southeastern University, July 2000.
[8] Digital watermarking: a technology overview, Hebah H.O. Nasereddin, Middle East University, P.O. Box 144378, Code 11814, Amman, Jordan. IJRRAS 6 (1), January 2011, _1_10.pdf
[9] Digital Watermarking facing Attacks by Amplitude Scaling and Additive White Noise, Joachim J. Eggers, Bernd Girod, Robert Bäuml, 4th Intl. ITG Conference on Source and Channel Coding, Berlin, Jan. 28-30, 2002.
[10] Digital Watermark Mobile Agents, Jian Zhao and Chenghui Luo, Fraunhofer Center for Research in Computer Graphics, Inc., 321 South Main Street, Providence.
[11] Digital Watermarks in Scientific Visualization, Wayne Pafko, Term Paper (Final Version), SciC 8011, Paul Morin, May 8, 2000.


Design of Reconfigurable SDR Transceivers using LabVIEW

Abstract: Software defined radio has assumed a lot of significance in the recent past. By now it is an established fact that most future radios will be implemented using software. This offers the advantage of on-the-fly repair, maintenance and modifications. Also, software defined radio is one of the best available approaches to the global roaming problem currently faced in the field of information technology. Unless we overcome the global roaming problem we shall not be able to access any information any time, anywhere. Because different modules of software can be switched on depending upon the availability of different types of radio transmissions, software defined radio offers portability, miniaturization, easy software updates and global roaming. This paper presents the design and implementation of a QAM, QPSK and reconfigurable SDR based radio transmitter and radio receiver. Both the transmitter and receiver are implemented using LabVIEW, a graphical language. The radio transceiver was tested on a PC by simulating a software defined transmission channel. Audio and music signals were first recorded using a sound capture VI (Virtual Instrument); these were then modulated and transmitted, then received and demodulated at the receiver. We were able to receive an exact replica of the transmitted audio signals. Both transmitter and receiver were implemented on the same PC. Work is in progress where one PC will act as the transmitter and the other PC will act as the receiver.

Keywords: Software Defined Radio, Virtual Instruments, MODEM, LabVIEW, QAM, QPSK

INTRODUCTION
One of the revolutionary technologies of the 21st century is the Software Defined Radio, which is attempting to solve the seamless/global roaming problem and to realize all-frequency radio transceivers. Of late, a need to design a reconfigurable radio that could switch over to any desired frequency, especially in the defense sector, has been felt.

Software defined radio (SDR) is defined as a radio whose some or all the physical layer function such as modulation or coding schemes are software defined. SDR further refers to a technology where in software modules running on a generic hardware platforms consisting of Digital Signal Processors (DSP),General purpose Microprocessors, Microcontrollers and Personal computers are used to implement radio functions such as generation of modulation modules in the transmitter, demodulation, tuning and amplification in the receiver. The SDR architecture evolved in three phases: Its phase1 architecture implements channel coding, source coding and control functionality in software on DSP/microcontroller programmable logic. This architecture is already in use for today's digital phones and allows a measure of new service introduction on to the phone in the field; in effect it allows reconfiguration at the very top of the protocol stack, the applications layer. In second phase the baseband modem functionality is implemented in software. This step allows the realization of new and adaptive modulation schemes under either selfadaptive or download control. Further extension of this i.e. phase three shown in figure 1, involving a major change to the overall architecture to implement the intermediate frequency (IF) signal processing digitally in software, will allow a single terminal to adapt to multiple radio interface standards by software reconfigurability. It is clear that the processing power required to implement a phase 3 handset exceeds that available from generic low-power DSPs in the near future by a large margin. The main interests of the scientific community in SDR are due to its following advantages: Reconfigurability Ubiquitous connectivity Interoperability On the air upload of software module to subscriber Faster deployment of new services and Quick remote diagnostic and defect rectification. 
Due to these advantages, DRDO India has also launched ambitious projects to replace almost all hardware-based radios with their SDR counterparts. In a communiqué dated 18th Nov. 2010, DRDO targeted deploying SDR in the Army, Air Force and Navy by the year 2013. A generic architecture of SDR is given below.


(Figure 1 blocks: wideband RF front end; A/D and D/A converters; digital IF processing; baseband modem processing; bit-stream processing; data and man-machine interfaces.)

Front Panel: This is the user interface, where the program is controlled and executed. Block Diagram: This is the internal circuit where the program code is written; any information needed during the simulation can be found in the controls and indicators on the Front Panel. Icon/Connector: The icon, which is a visual representation of the VI, has connectors for program inputs and outputs.
DIGITAL MODULATION
Digital modulation is used in many communications systems. The move to digital modulation provides: more information capacity; compatibility with digital data services; higher data security; better-quality communications; and quicker system availability.
EXAMPLES OF DIGITAL MODULATION
QPSK (Quadrature Phase Shift Keying), FSK (Frequency Shift Keying), MSK (Minimum Shift Keying), QAM (Quadrature Amplitude Modulation).
DESIGN STEPS AND RESULTS
The design steps for implementing the SDR are as follows: to design, implement and simulate VIs for capturing and reproducing audio signals in the LabVIEW environment;


Figure 1. Architecture of SDR

METHODOLOGY
LabVIEW (Laboratory Virtual Instrument Engineering Workbench) simulation is used to design the SDR transceivers. LabVIEW, a graphical programming language from National Instruments, allows systems to be designed in an intuitive, block-based manner in shorter times than the commonly used text-based programming languages. LabVIEW programs are called virtual instruments, or VIs, because their appearance and operation imitate physical instruments such as multimeters and oscilloscopes. LabVIEW is commonly used for data acquisition, instrument control and industrial automation. Simply put, a Virtual Instrument (VI) is a LabVIEW programming element. A VI consists of two windows, the front panel and the block diagram, together with an icon that represents the program. The code is in one window and the user interface (inputs and outputs) appears in a separate window, as shown in figure 2.

Figure 2. Parts of a VI


To design, implement and simulate a software-based QAM MODEM; to design, implement and simulate a software-based QPSK MODEM; and to implement the reconfigurable SDR transceivers on a single PC using a simulated channel between them.





Figure 3.



Figure 4. Voice Capture VI

The VI shown in figure 4 is used to capture sound information in LabVIEW. This VI inputs sound into LabVIEW without the use of the I/O drivers that are otherwise necessary to provide external inputs to the LabVIEW environment. The sound input can be provided using a microphone. The response of this SDR transmitter was analyzed by giving different types of audio input, such as speech and music, from the microphone. The following are the observations and results obtained after giving input to the system.

Figure 5. Sound Input Graph

Voice Regenerator VI
In a similar manner, the voice-regeneration VI can be designed using the sub-VIs for sound output to regenerate the original sound information. The regenerated sound can be heard through the speaker system. This VI again eliminates the need for external I/O devices to deliver LabVIEW signals to the external environment.
QAM MODEM
Quadrature amplitude modulation (QAM) is a modulation technique that exploits both the amplitude and phase information of the carrier signal. The VI used to modulate the carrier with the captured sound is shown in figure 6. The QAM scheme is used because of its merits over other modulation schemes. The received modulated information is demodulated using a QAM demodulator to recover the original information. Demodulation is the process of extracting the original information from the modulated signal; here, the original sound information is extracted from the modulated sound signal using the QAM demodulator. The QAM-demodulated signal s(t) is obtained by multiplying the complex

carrier, consisting of a cosine and a sine waveform, with the real part of the modulated signal. At the receiver, the two modulated signals can be demodulated using a coherent demodulator: the receiver multiplies the received signal separately by both a cosine and a sine signal to produce received estimates of I(t) and Q(t), respectively. Because of the orthogonality of the carrier signals, it is possible to detect the two modulating signals independently.
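The quadrature modulation and coherent demodulation steps above can be sketched in a few lines of NumPy (an illustrative simulation, not the paper's LabVIEW implementation; the sample rate, carrier and message frequencies are made-up values chosen so that a simple FFT low-pass recovers the messages exactly):

```python
import numpy as np

fs = 10_000                      # sample rate in Hz (illustrative)
fc = 1_000                       # carrier frequency in Hz (illustrative)
t = np.arange(0, 0.1, 1 / fs)    # 0.1 s of samples

# Baseband message signals carried on the in-phase and quadrature arms
i_msg = np.cos(2 * np.pi * 50 * t)
q_msg = np.sin(2 * np.pi * 80 * t)

# QAM modulation: s(t) = I(t)*cos(2*pi*fc*t) - Q(t)*sin(2*pi*fc*t)
carrier_i = np.cos(2 * np.pi * fc * t)
carrier_q = np.sin(2 * np.pi * fc * t)
s = i_msg * carrier_i - q_msg * carrier_q

def lowpass(x, keep_bins):
    """Crude FFT low-pass: zero every bin at or above keep_bins."""
    X = np.fft.rfft(x)
    X[keep_bins:] = 0
    return np.fft.irfft(X, n=len(x))

# Coherent demodulation: multiply by each carrier, low-pass away the 2*fc terms
i_hat = lowpass(2 * s * carrier_i, 50)    # bins below 500 Hz survive
q_hat = lowpass(-2 * s * carrier_q, 50)
```

Because the carriers are orthogonal, `i_hat` and `q_hat` recover `i_msg` and `q_msg` almost exactly, mirroring the exact-replica result reported for the LabVIEW transceiver.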

Figure 6.

Figure 7.

Figure 8.

Figure 9.


QPSK MODEM VI
The VI used to modulate the carrier using the captured sound is shown in figure 8.
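The bit-to-symbol mapping at the heart of a QPSK modulator can be sketched as follows (a Gray-coded constellation chosen for illustration; the paper's VI may use a different bit-to-symbol assignment):

```python
import numpy as np

# Gray-coded QPSK: each pair of bits selects one of four carrier phases.
QPSK_TABLE = {
    (0, 0): (1 + 1j),
    (0, 1): (-1 + 1j),
    (1, 1): (-1 - 1j),
    (1, 0): (1 - 1j),
}

def qpsk_map(bits):
    """Map an even-length bit sequence to unit-energy QPSK symbols."""
    assert len(bits) % 2 == 0
    pairs = zip(bits[0::2], bits[1::2])
    return np.array([QPSK_TABLE[p] for p in pairs]) / np.sqrt(2)

symbols = qpsk_map([0, 0, 1, 1, 0, 1])
```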

After designing the QPSK and QAM modems separately in LabVIEW, the next step is to add the reconfigurability feature to the design. For this we have used a case structure that works with two cases, true and false. In the true case the QAM modem is used, whereas in the false case the QPSK modem is used. The selection between the two is based upon their frequency ranges: if the input signal frequency lies between 5 MHz and 42 MHz then QPSK is used; otherwise, for higher frequency ranges, QAM is used. Other modulation schemes can also be incorporated by selecting their frequency ranges. The figures above show the outputs of the QAM and QPSK modems, and figures 10 and 11 show the front panel of our complete reconfigurable SDR transceiver design.
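The frequency-based case structure described above behaves like a simple dispatch function (a sketch of the selection logic only; the 5-42 MHz band is the range quoted in the text):

```python
def select_modem(freq_mhz):
    """Choose the modulation scheme the way the LabVIEW case structure does:
    QPSK for input frequencies in the 5-42 MHz range, QAM for higher ranges."""
    if 5 <= freq_mhz <= 42:
        return "QPSK"
    return "QAM"

# Example: a 10 MHz input selects QPSK, a 100 MHz input selects QAM.
```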

Figure 10. Front panel of the transceiver

Figure 11. Front panel of the transceiver

CONCLUSION
This paper presented the design and implementation of a reconfigurable software-defined radio transceiver using LabVIEW. Using this design, our audio signal was modulated by two different modulation methods at different carrier frequencies, and the design was found to work efficiently. When the modulated signal was demodulated at the receiver, it was observed that the recovered audio signal was an exact replica of the transmitted signal. However, for the transmission of the modulated signal the RF front end of the SDR is still implemented in analog circuitry; this is the main limitation of our work, since in an ideal SDR all radio components, including the RF front end, would be implemented in software. Because of technology limitations, such ideal SDRs are not yet achievable.

BIBLIOGRAPHY
[1] Mazen K. Alsliety, "An Overview of SDR Opportunities and Challenges in Telematics," Proceedings of the SDR '06 Technical Conference and Product Exposition.
[2] Sven G. Bilén, "Modulation Classification for Radio Interoperability via SDR," Proceedings of the SDR '07 Technical Conference and Product Exposition.
[3] Raymond J. Lackey and Donald W. Upmal, "Speakeasy: The Military Software Radio," IEEE Communications Magazine, May 1995.
[4] Kumagai, Jean, "Radio Revolutionaries," IEEE Spectrum, 44, pp. 24-27, Jan 2007.
[5] Mitola, J., "The Software Radio Architecture," IEEE Communications Magazine, 33, pp. 26-38, May 1995.
[6] "Digital Modulation in Communications Systems - An Introduction," Application Note 1298, September 2003.
[7] "Introduction to LabVIEW Three-Hour Course," Part Number 323668B-01, National Instruments, September 2000.
[8] "LabVIEW Basics II Course Manual," Part Number 320629G-01, National Instruments, August 2005.
[9] "LabVIEW Fundamentals," Part No. 374029A-01, Instruction Manual, National Instruments.
[10] Clark, Cory L., "LabVIEW Digital Signal Processing and Digital Communication," Tata McGraw-Hill, 2006.
[11] Kim, N., Kehtarnavaz, N. & Torlak, M., "LabVIEW-Based Software-Defined Radio: 4-QAM Modem," Systemics, Cybernetics and Informatics, 4(3).
[12] Kehtarnavaz, N. & Kim, N., "Digital Signal Processing System-Level Design Using LabVIEW," Elsevier, 2005.
[13] Shakti Kumar, Rajni Raghuvanshi, Pooja Pathak, "Design and Implementation of Software Defined Radio for IT Applications," Proceedings of the International Conference, 2010.


A Modified Zero Knowledge Identification Scheme using ECC

Abstract- In this paper we present a Fiat-Shamir-like zero-knowledge identification scheme based on elliptic curve cryptography. As we know, in an open network-computing environment a workstation cannot be trusted to identify its users correctly to network services. Zero-knowledge (ZK) protocols are designed to address this concern by allowing a prover to demonstrate knowledge of a secret while revealing no information that the verifier could use to convey the demonstration of knowledge to others. ECC has been chosen because it provides a methodology for obtaining higher-speed implementations of authentication protocols and encryption/decryption techniques while using fewer bits for the keys; consequently, ECC systems require smaller chip sizes and less power consumption.

Key Words- Identification, Security, Zero-Knowledge, Elliptic Curve.

have been proposed and implemented to limit the amount of information shared, in order to provide positive identification. Several of these techniques have weaknesses and are particularly susceptible to man-in-the-middle, off-line and impersonation attacks [1]. Zero-knowledge proof techniques are powerful tools in such critical applications, providing both security and privacy at the same time.


Communication between a computer and a remote user is currently one of the most vulnerable aspects of a computer system. To secure it, a cryptographic system must be built into the user terminal, and suitable protocols must be developed to allow the computer and the user to recognize each other upon initial contact and to maintain continued assurance of the secrecy of messages. In particular, zero-knowledge proofs (ZKPs) can be used whenever there is a need to prove possession of critical data without actually exchanging the data itself. Examples of such applications include credit card verification, digital cash systems, digital watermarking, and authentication. Most messaging systems in use rely on secret sharing to provide identification. Unfortunately, once you tell a secret it is no longer a secret; this is how identity theft and credit card fraud happen. Authentication and key exchange protocols

A zero-knowledge interactive proof system allows one party to convince another of some fact without revealing any information about the proof. In particular, it does not enable the verifier to later convince anyone else that the prover has a proof of the theorem, or even merely that the theorem is true [2]. A zero-knowledge proof is a two-party protocol between a prover and a verifier which allows the prover to convince the verifier that he knows a secret value satisfying a given relation (the zero-knowledge property). Zero-knowledge protocols are instances of an interactive proof system, in which the prover and verifier exchange messages (typically depending on random events).
1. Security: An impostor can comply with the protocol only with overwhelmingly small probability.
2. Completeness: An interactive proof is complete if the protocol succeeds (for an honest prover and an honest verifier) with overwhelming probability p > 1/2 (typically, p ≈ 1).
3. Soundness: An interactive proof is sound if there is an algorithm M with the following properties:
i. M is polynomial time.
ii. If a dishonest prover can, with non-negligible probability, successfully execute

the protocol with the verifier, then M can be used to extract knowledge from this prover which, with overwhelming probability, allows successful subsequent protocol executions. (In effect, if someone can fake the scheme, then so can everyone observing the protocol, e.g. by computing the secret of the true prover.)
4. Zero-Knowledge (ZK) Property: There exists a simulator (an algorithm) that can simulate, upon input of the assertion to be proven but without interacting with the real prover, an execution of the protocol that an outside observer cannot distinguish from an execution of the protocol with the real prover.
The concept of zero-knowledge, first introduced by Goldwasser, Micali and Rackoff [4], is one approach to the design of such protocols. In particular, Feige, Fiat and Shamir show in [2] an elegant method for using an interactive zero-knowledge proof to prove identity in a cryptographic protocol. The Fiat-Shamir zero-knowledge identification scheme is based on the difficulty of computing square roots modulo a composite. In this paper, we modify the Fiat-Shamir zero-knowledge identification scheme using elliptic curve cryptography.

Fig. 1 Fiat-Shamir User Identification Process

Fiat-Shamir User Identification Process: The process of user identification can be understood as follows.
Key Generation Process:
i. The trusted centre chooses two large prime numbers p and q.
ii. The trusted centre then calculates n = p*q and publishes n as the modulus.
iii. Each potential claimant (prover) selects a secret number s, which should be coprime to n.
iv. Each potential claimant (prover) calculates v = s^2 mod n as its public key and publishes it.
Verifying Process: The following steps are performed to identify the authenticated user.
i. The prover chooses a random number r and sends x = r^2 mod n (the witness x) to the verifier.
ii. The verifier randomly selects a single bit c = 0 or c = 1 and sends c to the prover.
iii. The prover computes the response y = r*s^c mod n and sends it to the verifier.
iv. The verifier rejects the proof if y = 0 and accepts it if y^2 = x*v^c mod n.
Informally, the challenge (or exam) c selects between two answers: one (c = 0) that reveals only the random r (to keep the claimant honest) and one (c = 1) that can only be produced with knowledge of s. If a false claimant were to know that the challenge will be c = 1, he could pick an arbitrary number a and send the witness a^2/v; upon receiving c = 1, he sends y = a, and then y^2 = a^2 = (a^2/v)*v passes the check. If the false claimant were to know that the challenge will be c = 0, he could select an arbitrary number a and send the witness a^2. This property allows us to simulate runs of the protocol that an outside observer cannot distinguish from real runs (where the challenges c are truly random).
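One round of the scheme above can be sketched as follows (a toy implementation with small, insecure primes purely to show the algebra; a real deployment would use a modulus of 1024 bits or more):

```python
import random

p, q = 1000003, 1000033          # toy primes (far too small for real security)
n = p * q                        # public modulus

s = 123456789                    # prover's secret, coprime to n
v = (s * s) % n                  # public key v = s^2 mod n

def commit():
    """Prover picks a random r and forms the witness x = r^2 mod n."""
    r = random.randrange(2, n)
    return r, (r * r) % n

def respond(r, c):
    """Prover answers challenge bit c with y = r * s^c mod n."""
    return (r * pow(s, c, n)) % n

def check(x, c, y):
    """Verifier accepts iff y != 0 and y^2 = x * v^c (mod n)."""
    return y != 0 and (y * y) % n == (x * pow(v, c, n)) % n

r, x = commit()
for c in (0, 1):                 # an honest prover passes either challenge
    assert check(x, c, respond(r, c))
```

The identity behind the check is y^2 = r^2 * s^(2c) = x * v^c mod n, exactly step iv above.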

The Fiat-Shamir protocol is based on the difficulty of calculating a square root: the claimant proves knowledge of a square root modulo a large composite modulus n. Verification is done in four steps, as shown in figure 1.

Elliptic Curve Cryptography (ECC) is a form of public key cryptography. In public key cryptography, each user or device taking part in the communication generally has a pair of keys, a public key and a private key, and a set of operations associated with the keys for performing the cryptographic operations. Only the particular user knows the private key, whereas the public key is distributed to all users taking part in the communication. Some public key algorithms may require a set of predefined constants to be known by all devices taking part in the communication; the domain parameters of ECC are an example of such constants. Public key cryptography, unlike private key cryptography, does not require any shared secret

between the communicating parties, but it is much slower than private key cryptography. The mathematical operations of ECC are defined over the elliptic curve y^2 = x^3 + ax + b, where 4a^3 + 27b^2 mod p ≠ 0. Each choice of a and b gives a different elliptic curve. All points (x, y) satisfying the above equation, plus a point at infinity, lie on the elliptic curve. The public key is a point on the curve and the private key is a random number; the public key is obtained by multiplying the private key by the generator point G on the curve. The generator point G and the curve parameters a and b, together with a few more constants, constitute the domain parameters of ECC. One main advantage of ECC is its small key size: a 160-bit key in ECC is considered to be as secure as a 1024-bit key in RSA. The elliptic curve addition operation differs from ordinary addition. Assuming P = (x1, y1) and Q = (x2, y2) are two points on the elliptic curve with P ≠ -Q, the sum P + Q = (x3, y3) can be obtained through the following rules:
x3 = (λ^2 - x1 - x2) mod p ---[1]
y3 = (λ(x1 - x3) - y1) mod p ---[2]
where λ = (y2 - y1)/(x2 - x1) for P ≠ Q (point addition) and λ = (3x1^2 + a)/(2y1) for P = Q (point doubling), all computed mod p. The dominant operation in ECC cryptographic schemes is point multiplication: calculating kP, as shown in figure 2, where k is an integer and P is a point on the elliptic curve defined over the prime field. These properties make ECC an attractive alternative: in particular, for a given level of security, the sizes of the cryptographic keys and operands involved in the computation of EC cryptosystems are normally much shorter than in other cryptosystems and, as the computational power available for cryptanalysis grows, this difference becomes more and more noticeable.
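The addition and doubling rules [1]-[2] can be checked on a toy curve over a small prime field (illustrative parameters, not a standardized curve; `pow(x, -1, p)` computes the modular inverse and needs Python 3.8+):

```python
p, a, b = 97, 2, 3               # toy curve y^2 = x^3 + 2x + 3 over GF(97)

def ec_add(P, Q):
    """Point addition; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None               # P + (-P) = point at infinity
    if P == Q:                    # doubling: lambda = (3*x1^2 + a) / (2*y1)
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:                         # addition: lambda = (y2 - y1) / (x2 - x1)
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P):
    """Double-and-add point multiplication kP, as in figure 2."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

G = (3, 6)   # on the curve: 6^2 = 36 = 3^3 + 2*3 + 3 (mod 97)
```

Every multiple of G computed this way lands back on the curve, which is a quick sanity check of formulas [1]-[2].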

The Fiat-Shamir zero-knowledge identification scheme is based on the difficulty of computing square roots modulo a composite. We modify the Fiat-Shamir zero-knowledge identification scheme using elliptic curve cryptography, as shown in figure 3.
Modified Fiat-Shamir User Identification Process: The process of user identification can be understood as follows.
Key Generation Process:
i) A third party chooses the values of a and p for the elliptic curve Ep(a, b).
ii) The value of b is selected by the claimant so that the equation satisfies the condition 4a^3 + 27b^2 mod p ≠ 0.
iii) The values of a and p are announced publicly, whereas b remains secret to the claimant.
iv) The claimant chooses a secret point s on the curve and calculates v = 2s mod p (a point doubling). The claimant keeps s as its private key and registers v as its public key with the third party.
Verifying Process: The following steps are performed to identify the authenticated user.
i) Alice, the claimant, chooses a random point r on the curve. She then calculates x = 2r mod p, called the witness, and sends x to Bob.
ii) Bob, the verifier, sends the challenge c to Alice. The value of c is a prime number lying between 1 and p-1.

Fig. 2 Point Multiplication

All reported methods for computing kP parse the scalar k and, depending on the bit value, perform either an ECC-ADD or an ECC-DOUBLE operation. In fact, ECC is no longer new and has withstood in recent years a great deal of cryptanalysis and a long series of attacks, which makes it appear a mature and robust cryptosystem at present. ECC has a number of advantages over other public-key cryptosystems, such as RSA.

Only the claimant knows the values of r and s, and the public key depends on the value of s. The absence of a sub-exponential time algorithm for the scheme means that significantly smaller parameters can be used in ECC than with DSA or RSA. This has a significant impact on a communication system, as the relative computational performance advantage of ECC versus RSA is indicated not by the key sizes but by the cube of the key sizes. The difference becomes even more dramatic as the greater increase in RSA key sizes leads to an even greater increase in computational cost.

Fig. 3 Fiat-Shamir Scheme using ECC

iii) Alice calculates the response y = r + c·s mod p. Note that r is the random point selected by Alice in the first step, s is the secret point, and c is the challenge sent by Bob; she sends the response y to Bob.
iv) Bob calculates x + (c·v) mod p and 2y mod p. If these two values are congruent, then Alice knows the value of s and she is an authenticated person (she is honest). If they are not congruent, she is not an authenticated person and the verifier can reject her request.
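The modified round can be sketched end to end on a toy curve (illustrative, insecure parameters; the point arithmetic is repeated here so the sketch runs standalone, and the "random" point r is fixed for the demo):

```python
import random

# Toy curve y^2 = x^3 + 2x + 3 over GF(97)
p, a, b = 97, 2, 3

def ec_add(P, Q):
    """Point addition; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P):
    """Double-and-add point multiplication kP."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

# Key generation: secret point s, public key v = 2s
s = (3, 6)                       # claimant's secret point on the curve
v = ec_mul(2, s)                 # registered with the third party

# One identification round
r = (80, 87)                     # Alice's random point (fixed for the demo)
x = ec_mul(2, r)                 # witness x = 2r
c = random.randrange(1, p - 1)   # Bob's challenge
y = ec_add(r, ec_mul(c, s))      # response y = r + c*s

# Bob's check: 2y must equal x + c*v, since 2y = 2r + 2cs = x + cv
assert ec_mul(2, y) == ec_add(x, ec_mul(c, v))
```

The final assertion is exactly step iv: the check passes for every challenge because the two sides are the same group element.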

A unique feature of the new identification scheme is that it is based on Elliptic Curve Cryptography (ECC). In [8], it is concluded that the Elliptic Curve Discrete Logarithm Problem is significantly more difficult than the Integer Factorization Problem. For instance, it was found that to achieve reasonable security RSA should employ a 1024-bit modulus, while a 160-bit modulus is sufficient for ECC. Our identification scheme is also faster than the Fiat-Shamir scheme [5] and Guillou-Quisquater [7], because our scheme depends on addition operations while those schemes depend on exponentiation. In future work, the dominant proof techniques that have emerged in security proofs, among which are probabilistic polynomial-time reducibilities between problems, simulation proofs, the hybrid method, and random self-reducibility, can be introduced and a comparative performance study carried out.
REFERENCES
[1] Ali M. Allam, Ibrahim I. Ibrahim, Ihab A. Ali, Abd El-rahman H. Elsawy, "Efficient Zero-knowledge Identification Scheme with Secret Key Exchange," IEEE, 2004.
[2] Ali M. Allam, Ibrahim I. Ibrahim, Ihab A. Ali, Abd El-rahman H. Elsawy, "The Performance of an Efficient Zero-Knowledge Identification Scheme," IEEE, 2004.
[3] Sultan Almuhammadi, Nien T. Sui, and Dennis McLeod, "Better Privacy and Security in E-Commerce: Using Elliptic Curve-Based Zero-Knowledge Proofs," IEEE, 2004.
[4] S. Goldwasser, S. Micali, and C. Rackoff, "The knowledge complexity of interactive proof systems," SIAM J. Comput., 18(1), pp. 186-208, February 1989.
[5] U. Feige, A. Fiat, and A. Shamir, "Zero knowledge proofs of identity," Journal of Cryptology, 1(2), pp. 77-94, 1988.
[6] Chengming Qi (Beijing Union University), "A Zero-Knowledge Proof of Digital Signature Scheme Based on the Elliptic Curve Cryptosystem," 2009 Third International Symposium on Intelligent Information Technology Application.
[7] L. Guillou and J. Quisquater, "A 'Paradoxical' Identity-Based Signature Scheme Resulting from Zero-Knowledge," Proc. CRYPTO '88.
[8] W. Stallings, "Cryptography and Network Security," 3rd edition, Prentice Hall, 2003.
[9] Behrouz A. Forouzan, "Cryptography and Network Security," TMH.

The security of the system is directly tied to the relative hardness of the underlying mathematical problem. We can easily prove that 2y is the same as x + (c·v) in modulo-p arithmetic, as shown below:
2y = 2(r + c·s) = 2r + 2c·s = x + c·v --- [3]
The challenge c selects a value between 1 and p-1; the response combines the secret random point r (to keep the claimant honest) with a part that can only be computed with knowledge of s. Since b is chosen by the claimant and determines which points lie on the elliptic curve Ep(a, b), no other person can guess on which curve the points are generated or which point was randomly selected by the claimant. If a false claimant sends some arbitrary point m as the witness, it will certainly fail the final verification, as only the claimant knows the


Security and Privacy of Conserving Data in Information Technology

Abstract- In this paper, we present a security infrastructure design to ensure safety in an electronic government system: a combination of well-known security solutions including Public Key Infrastructure, Shibboleth, smart cards and the Lightweight Directory Access Protocol. In this environment we give an overview of privacy preservation and security for data mining processes. The original target of supplying services through the internet has evolved into assessing the impact of e-Government programmes in delivering better, more efficient services to citizens in an inclusive society, with emphasis on the quality of the services provided and the extent to which online services meet user needs.

Safe access to services European Union-wide is to be achieved by establishing secure systems for the mutual recognition of national electronic identities for public administration websites and services (European Commission, 2006). The necessity of an interoperable and scalable security and identity infrastructure has been identified by all implicated parties, with a focus on the effectiveness of the solutions provided.

SECURITY AND ELECTRONIC GOVERNMENT

Keywords- Security, Integration, Single Sign-On, Privacy, Cryptography.

INTRODUCTION
Member countries of the European Union are speeding towards the digitalization of government services, with countries currently offering a wealth of interactive services that are increasing in availability and sophistication. International attempts to develop integrated, customer-oriented administrative services represent efforts to alleviate the problems of bureaucracy and improve the provision of administrative services. Since the launch of the European strategy for the development of e-Government, with the e-Europe 2002 initiative presented in March 2000 at the Lisbon European Council, a change of focus has occurred. The original target of supplying services through the internet has evolved into assessing the impact of e-Government programmes in delivering better, more efficient services to citizens in an inclusive society, with emphasis on the quality of the services provided and the extent to which online services meet user needs. Identified as a major aspect is the safe access to services across the European Union.

Electronic government services are being rapidly deployed throughout Europe. Security is the main concern in this process, creating the need for an interoperable secure infrastructure that will meet all current and future needs. Such an infrastructure must provide a horizontal level of service for the entire system and must be accessible by all applications and sub-systems in the network. Delivering electronic services will largely depend upon the trust and confidence of citizens. To this end, means have to be developed to achieve the same quality and trustworthiness in public services as is provided in the traditional way. At the level of systems design, some fundamental security requirements have to be met:
- Identification of the sender of a digital message.
- Authenticity of a message and its verification.
- Non-repudiation of a message or a data-processing act.
- Avoidance of risks related to availability and reliability.
- Confidentiality of the existence and content of a message.
The best solution makes use of coexisting and complementary

technologies which ensure safety throughout all interactions. Such a system provides assurance of its interoperability by using widely recognized standards and open-source software. This evolutionary infrastructure design is based on a collaboration of existing cutting-edge technologies in a unique manner: public key infrastructure, single sign-on techniques and LDAP collaborate effectively, guaranteeing efficient and secure communications and access to resources. A Public Key Infrastructure (PKI), based on asymmetric keys and digital certificates, is the fundamental architecture enabling the use of public key cryptography to achieve strong authentication of the involved entities and secure communication. PKIs have reached a stage of relative maturity due to the extensive research that has occurred in the area over the past two decades, becoming the necessary trust infrastructure for every e-business (e-commerce, e-banking, e-cryptography). Only an identity in possession of a smart card, a smart card reader and the Personal Identification Number (PIN) can use the smart card. Smart cards provide the means for performing secure communications with minimal human intervention; in addition, smart cards are suitable for electronic identification schemes as they are engineered to be tamper-proof. The Lightweight Directory Access Protocol, or LDAP, is the internet-standard way of accessing directory services that conform to the X.500 data model. LDAP has become the predominant protocol in support of PKIs accessing directory services for certificates and certificate revocation lists (CRLs), and is often used by other (web) services for authentication. A directory is a set of objects with similar attributes organized in a logical and hierarchical manner. An LDAP directory tree often reflects various political, geographic, and/or organizational boundaries, depending on the model chosen. LDAP deployments today tend to use Domain Name System (DNS) names for structuring the topmost levels of the hierarchy.
The directory contains entries representing people, organizational units, printers, documents, groups of people, or anything else that is represented by a given tree entry (or multiple entries). Single Sign-On (SSO) is a method of access control that enables a user to authenticate once and gain access to the resources of multiple independent software systems. Shibboleth is standards-based, open-source middleware which provides web Single Sign-On (SSO) across or within organizational boundaries.

Fig1. Security and E-Government purpose of PKI is to bind a public key to an entity. The binding is performed by a certification authority (CA), which plays the role of a trusted third party. The user identity must be unique for each CA. The CA digitally signs a data structure, which contains the name of the entity and the corresponding public key besides other data. Such a pervasive security infrastructure has many and varied benefits, such as cost savings, interoperability (inter and intra enterprise) and consistency of a uniform solution. A PKI smart card is a hardware-based cryptographic device for securely generating and storing private and public keys, digital certificates and performing cryptographic operations. Implementing digital signatures in combination with advanced cryptographic smart cards minimizes user side complexity while maintaining reliability and security (Only an identity in possession of a smart card, a

Fig 2. Public Key Encryption (PKI)

Shibboleth allows sites to make informed authorization decisions for individual access to protected online resources in a privacy-preserving manner. It is a Security Assertion Markup Language (SAML) based system with a focus on federating research and educational communities. Key concepts within Shibboleth include:
Federated Administration: The origin campus (home to the browser user) provides attribute assertions about that user to the target site. A trust fabric exists between campuses, allowing each site to identify the other speaker and assign a trust level. Origin sites are responsible for authenticating their users, but can use any reliable means to do this.
Access Control Based on Attributes: Access control decisions are made using those assertions. The collection of assertions might include identity, but many situations will not require this (e.g. accessing a resource licensed for use by all active members of the campus community, or accessing a resource available to students in a particular course).
Active Management of Privacy: The origin site (and the browser user) controls what information is released to the target. A typical default is merely "member of community". Individuals can manage attribute release via a web-based user interface; users are no longer at the mercy of the target's privacy policy.
A collaboration of the independent technologies presented previously leads to an evolutionary horizontal infrastructure. Introducing federations in e-government, in association with PKI and LDAP technology, will lead to efficient trust relationships between the involved entities. A federation is a group of legal entities that share a set of agreed policies and rules for access to online resources. These policies enable the members to establish trust and a shared understanding of language and terminology. A federation provides a structure and a legal framework that enable authentication and authorization across different organizations. In general, the underlying trust relationships of the federation are based on Public Key Infrastructure (PKI), and certificates enable mutual authentication between the involved entities.
This is performed using the SSL/TLS protocol and XML digital signatures, with keys contained in X.509 certificates obtained from eschool Certification Authorities. An opaque client certificate can contain information about the user's home institution and, optionally, the user's pseudonymous identity. Shibboleth technology relies on a third party to provide the information about a user, named attributes. Attributes refer to the characteristics of a user rather than to the user directly: a set of attributes about a user, rather than a name, is what is actually needed to decide whether to give the user access to a resource. In the hypothesized architecture, this is performed by the LDAP repository, which is also responsible for the association of user attributes. Additionally, LDAP contains a list of all valid certificates and revoked certificates. Digital signatures are used to secure all information in transit between the various subsystems. This infrastructure leverages a system of certificate distribution and a mechanism for associating these certificates with known origin and target sites at each participating server. User-side complexity is guaranteed to be minimal, without any cutbacks in overall security and reliability. The model presented in this paper offers the advantages of each single technology used and deals with their deficiencies through their combined implementation: the hybrid hierarchical PKI infrastructure delegates trust to subordinate CAs, permitting the creation of trust meshes, under a central CA, between independent organizations, so interoperability is simply addressed; PKI supports single sign-on with the use of Shibboleth; and Shibboleth coordinates with PKI to develop enhanced, complexity-free authorization and authentication processes. The user becomes part of the designed system using Single Sign-On (SSO) technology, which simplifies access to multiple resources with only one sign-on procedure.
In practice this results in enhancing the security of the whole infrastructure, among other evident technical benefits, because a sufficient level of usability is assured. Providing a security infrastructure is not enough; the user must also be able to make use of the security features. Otherwise, the designed service will fail, due to the fact that users' behavior is often the weakest link in a security chain. The combination of the above-mentioned techniques creates strong trust relationships between users and eGovernment services by implementing a zero-knowledge procedure for strong authorization. Zero-knowledge is an interactive method for one entity to prove possession of a secret without actually revealing it, resulting eventually in not

revealing anything about the entity's personal information. The combined techniques also mitigate the problem of memorizing many passwords and reduce the vulnerability of using the same password to access many web services.

Fig3. Single Sign On (SSO)

It is essential to distinguish the authentication process from the authorization process. During the authentication process a user is required to navigate to his home site and authenticate himself. During this phase information is exchanged between the user and his home site only, with all information on the wire being encrypted. After the successful authentication of a user, permission to access resources is either granted or rejected according to the user's attributes/credentials. The process in which the user exchanges his attributes with the resource server is the authorization process, during which no personal information is leaked; it can only be performed after successful authentication. User authentication is performed only once, when the user identifies himself inside the trust mesh. Once authenticated inside the trust mesh, users are not required to re-authenticate themselves. When a user navigates to a resource stored inside the trust mesh, the authorization process is executed. During this process the service provider requires the user's Identity Provider to present the user's access credentials. The Identity Provider, after successfully identifying the user and checking whether he has previously been authenticated, retrieves the user's credentials for the required resource. If the user has not previously been authenticated, the authentication process is initialized. The Shibboleth Identity Provider contains four primary components: the Attribute Authority (AA), the Handle Service (HS), attribute sources, and the local sign-on system (SSO). Shibboleth interacts with the LDAP infrastructure to retrieve user credentials. From the Identity Provider's point of view, the first contact will be the redirection of a user to the Handle Service, which will then consult the SSO system to determine whether the user has already been authenticated. If not, the browser user will be asked to authenticate, and then sent back to the SP URL with a handle bundled in an attribute assertion. Next, a request from the Service Provider's Attribute Requester (AR) will arrive at the AA, including the previously mentioned handle. The AA then consults the ARPs for the directory entry corresponding to the handle, queries the directory for these attributes, and releases to the AR all attributes the requesting application is entitled to know about that user.

PRIVACY PRESERVING DATA MINING

In large intra-organizational environments, data are usually shared among a number of distributed databases, for security or practicality reasons, or due to the organizational structure of the business. Data can be partitioned either horizontally, where each database contains a subset of the complete transactions, or vertically, where each database contains shares of each transaction. The role of a data warehouse is to collect and transform the dispersed data into an acceptable format before they are forwarded to the Data Mining (DM) subsystem. Such a central repository raises privacy concerns, especially if it is used in an inter-organizational setting, where several entities, mutually untrusted, may desire to mine their private inputs, both securely and accurately. Alternatively, data mining can be performed locally, at each database (or intranet), and the sub-results then combined to extract knowledge, although this will most likely affect the quality of the output. If a general discussion were to be made about protecting privacy in distributed databases, we would point to the literature on access control and audit policies, authorization and information flow control (e.g., multilevel and multilateral security strategies), security in the application layer (e.g., database views), and operating systems security, among others. However, in this paper we assume that appropriate security and access control exist in the intra-organizational setting, and we mainly focus on the inter-organizational setting, where a set of mutually untrusted entities wish to execute a miner on their private databases. As an alternative layer of protection, original data can be suitably altered or anonymized before being given as input to a miner, or queries in statistical databases may be perturbed. The problem with data perturbation is that in highly distributed environments, preventing the inference of unauthorized information by combining authorized information is not an easy problem. Furthermore, in most perturbation techniques lies a tradeoff between protecting the privacy of the individual records and at the same time establishing accuracy of the DM results. At a high abstraction level, the problem of privacy preserving data mining between mutually untrusted parties can be reduced to the following problem for a two-party protocol: each party owns some private data, and both parties wish to execute a function F on the union of their data without sacrificing the privacy of their inputs. In a DM environment, for example, the function F could be a classification function that outputs the class of a set of transactions with specific attributes, a function that identifies association rules in partitioned databases, or a function that outputs aggregate results over the union of two statistical databases. In the above distributed computing scenario, an ideal protocol would require a trusted third party who would accept both inputs and announce the output. However, the goal of cryptography is to relax or even remove the need for trusted parties. Contrary to other strategies, cryptographic mechanisms usually do not pose dilemmas between the privacy of the inputs and the accuracy of the output.

Fig4. Kinds of SMC Solutions

In the academic literature on privacy preserving data mining, following the line of work that began with Yao, most theoretical results are based on the Secure Multiparty Computation (SMC) approach. SMC protocols are interactive protocols, run in a distributed network by a set of entities with private inputs, who wish to compute a function of their inputs in a privacy preserving manner. We believe that research on privacy preserving DM could borrow knowledge from the vast body of literature on secure e-auction and e-voting systems. These systems are not strictly related to data mining, but they exemplify some of the difficulties of the multiparty case (this was first pointed out for e-auctions only, while we extend it to include e-voting systems as well). Such systems also tend to balance well the efficiency and security criteria, in order to be implementable in medium to large scale environments. Furthermore, such systems fall within our distributed computing scenario and have similar architecture and security requirements, at least at our abstraction level. In a sealed-bid e-auction, for example, the function F, represented by an auctioneer, receives several encrypted bids and declares the winning bid. In a secure auction, there is a need to protect the privacy of the losing bidders, while establishing accuracy of the auction outcome and verifiability for all participants. Or, in an Internet election, the function, represented by an election authority, receives several encrypted votes and declares the winning candidate. Here the goal is to protect the privacy of the voters (i.e., unlinkability between the identity of the voter and the vote that has been cast), while also establishing eligibility of the voters and verifiability of the election result. During the last decade, a number of cryptographic schemes for conducting online e-auctions and e-elections have been proposed in the literature. Research has shown that it is possible to provide both privacy and accuracy assurances in a distributed computing scenario, where all participants may be mutually untrusted, without the presence of an unconditionally trusted third party.

CONCLUSIONS

Internationally, numerous government services are becoming available online every day. As isolated efforts at electronic government are implemented globally, the need for an interoperable horizontal security infrastructure is stressed. The effective security infrastructure design presented in this paper is a solution which makes use of coexisting and complementary open source technologies and standards. It provides secure and effective communication, supported by ease of use for the end user. Scalability and interoperability are advantages of this design, making it suitable to meet the needs of electronic government. In this environment we also studied the context of DM security; further research is needed to choose and then adapt specific cryptographic techniques to the DM environment, taking into account the kind of databases to work with, the kind of knowledge to be mined, as well as the specific DM technique to be used.

REFERENCES
[1] Rakesh Agrawal, Ramakrishnan Srikant: Privacy-Preserving Data Mining. SIGMOD Conference 2000: 439-450
[2] Josep Domingo-Ferrer, Antoni Martínez-Ballesté, Francesc Sebé: MICROCAST: Smart Card Based (Micro)Pay-per-View for Multicast Services. CARDIS 2002: 125-134
[3] Pho Duc Giang, Le Xuan Hung, Sungyoung Lee, Young-Koo Lee, Heejo Lee: A Flexible Trust-Based Access Control Mechanism for Security and Privacy Enhancement in Ubiquitous Systems. MUE 2007: 698-703
[4] Murat Kantarcioglu, Chris Clifton: Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data. DMKD 2002
[5] Yehuda Lindell, Benny Pinkas: Privacy Preserving Data Mining. J. Cryptology 15(3): 177-206 (2002)
[6] M. Naor, B. Pinkas: Computationally Secure Oblivious Transfer. Advances in Cryptology: Proceedings of Crypto 1999
[7] Benny Pinkas: Cryptographic Techniques for Privacy-Preserving Data Mining. SIGKDD Explorations 4(2): 12-19 (2002)
[8] Roland Traunmüller (ed.): Electronic Government, Second International Conference, EGOV 2003, Prague, Czech Republic, September 1-5, 2003, Proceedings. Springer 2003
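The two-party computation of a function F without revealing the inputs, discussed above, can be illustrated with a minimal additive-secret-sharing sketch for the simplest case, a joint sum (e.g., an aggregate over two statistical databases). This is an honest-but-curious sketch under our own simplifying assumptions, not a full SMC protocol: it has no verifiability or malicious-party security, unlike the e-auction and e-voting schemes cited in the paper.

```python
import random

# All arithmetic is modulo a public value M.
M = 2**61 - 1

def share(x: int) -> tuple[int, int]:
    """Split a private value into two additive shares: x = s1 + s2 (mod M).
    Each share alone is uniformly random and reveals nothing about x."""
    s1 = random.randrange(M)
    s2 = (x - s1) % M
    return s1, s2

# Each party holds a private input and wants only the joint sum F.
alice_input, bob_input = 42, 58

a1, a2 = share(alice_input)  # Alice keeps a1, sends a2 to Bob
b1, b2 = share(bob_input)    # Bob keeps b2, sends b1 to Alice

# Each party adds the shares it holds; neither partial sum leaks an input.
alice_partial = (a1 + b1) % M
bob_partial = (a2 + b2) % M

# Publishing the two partial sums reveals only F = alice_input + bob_input.
result = (alice_partial + bob_partial) % M
assert result == alice_input + bob_input
```

The design choice here mirrors the paper's point: the output F is exact (no perturbation tradeoff), and privacy rests on the randomness of the shares rather than on a trusted third party.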


Barriers to Entrepreneurship: An Analysis of Management Students

Abstract: Management education has been at the vanguard of higher education in India. With the booming economy and ever increasing job opportunities, the MBA has become the most preferred postgraduate degree in the country. The only area of concern is that management institutes are creating more job seekers than job creators. Entrepreneurship is still not considered a serious career option by most management graduates, owing to the availability of jobs and hefty pay packages in some industries. In the present study it has been observed that most management students at the Sant Longowal Institute of Engineering & Technology come from service-class family backgrounds, do not prefer to take risks, and prefer jobs in MNCs or the corporate sector.

Introduction: The third world countries are still facing socio-economic problems like unemployment, poverty, inflation and low productivity. In India, despite sixty-two years of development, about 30 percent of the total population, i.e. around 300 million people, are still living below the poverty line. In order to improve their living standards, they have to be productively employed. There is a growing worldwide appreciation of the fact that micro enterprises may be a vibrant option for employment generation, and our institutions of higher learning can create entrepreneurs (job creators) who may in turn create jobs for others. Entrepreneurship has been acknowledged as one of the essential dynamic factors determining the socio-economic growth of any country, because it increases employment, efficiency, productivity, GNP and the standard of living. The significance of institutions of higher education augments manifold when they are able not only to produce skilful and employable human resources but also to help develop among their students the attitude to opt for entrepreneurship as a career choice, because institutions of the education system are distinctive places of knowledge transfer and innovation which help nurture entrepreneurial activities. Some experts envisage that India and China will rule the world economy in the 21st century. For over a century the United States has been the largest economy in the world, but major developments have taken place in the world economy since then, leading to a shift of focus from the US and the rich countries of Europe to the two Asian giants, India and China. In recent years the rich countries of Europe have seen the greatest decline in global GDP share, by 4.9 percentage points, followed by the US and Japan with a decline of about 1 percentage point each. Within Asia, the rising share of China and India has more than made up for the declining global share of Japan since 1990.
During the seventies and eighties, the ASEAN countries, and during the eighties South Korea, along with China and India, contributed to the rising share of Asia in world GDP. On the

other side, according to some experts, the share of the US in world GDP is expected to fall (from 21 per cent to 18 per cent) and that of India to rise (from 6 per cent to 11 per cent in 2025), and hence the latter will emerge as the third pole in the global economy after the US and China. By 2025 the Indian economy is projected to be about 60 per cent of the size of the US economy. The transformation into a tri-polar economy will be complete by 2035, with the Indian economy only a little smaller than the US economy but larger than that of Western Europe. By 2035, India is likely to be a larger growth driver than the six largest countries in the EU, though its impact will be a little over half that of the US. India, which is now the fourth largest economy in terms of purchasing power parity, will overtake Japan and become the third major economic power within 10 years. Therefore, there is a paramount need to promote an entrepreneurial attitude among the masses. As industry is expanding at a faster rate, and in order to integrate the agricultural labour force and attain growth across all sectors of the economy, world-class infrastructure is required. Foreign Direct Investment is coming up in a number of areas of the economy. An entrepreneur is one who always takes up challenges, builds new products through innovative ideas and responds to changes quickly. Entrepreneurs are normally considered agents of change in the socio-economic development of a country. They are also seen as innovators, risk takers, decision makers and people with a definite vision. They have different characteristics compared to people accepting jobs or wage employment. The misconception that entrepreneurship is the monopoly of certain communities has been disproved: entrepreneurs are not born but can be identified, trained and developed through a proper environment. Peter F. Drucker defines an entrepreneur as one who always searches for change, responds to it and exploits it as an opportunity. Innovation is the specific tool of entrepreneurs, the means by which they exploit change as an opportunity for a different business or service. Entrepreneurship development is a complex phenomenon. Productive activity undertaken by

him, and the constant endeavour to sustain and improve it, are the outward expression of this process of development of his personality. Such a process is the crystallization of the social milieu from which he comes: the values imbibed from his family, the make-up of his mind, personal attitudes, educational level, parental occupation and so on. Employment has been the biggest indicator of any economy, and employment is governed by many factors like industrial culture, market opportunities and, most importantly, the development of businesses. It is a cyclic process: more business, more employment, improved buying power and more purchases lead to more business opportunities, and so on. On the other hand, India's needs are very clear: to remove the poverty of our millions as speedily as possible, say before 2010; to provide health for all; to provide good education and skills for all; to provide employment opportunities for all; to be a net exporter; to be self-reliant in national security; and to build up capabilities to sustain and improve on all these in the future. A country like ours needs to emphasize technical education and training, as it is an essential element in capacity building for socio-economic growth and development. Management and technical education have assumed greater importance due to globalization, international competition and the need for sustainable economic growth. Increased brain drain is another reason for the growing importance of technical education. The size of our technical manpower is about six million, the third largest in the world. In the year 1947-48, the country had 38 degree-level institutes with an intake capacity of 3,670; the intake for postgraduates was 70. As per the AICTE Annual Report 2003-2004, there are approximately 1,500 colleges of engineering around the country, enrolling approximately 4 lakh students at the undergraduate level, whereas postgraduate studies and research have been limited to 26,203 candidates in 268 AICTE-approved institutions of engineering; in addition, 54,167 candidates were enrolled in MCA in 1,012 institutions. Our professional institutions are the major source of manpower for employment by industry; they provide technological and managerial solutions to the problems arising in industry, both short term and long term, through consultancy and R&D. Industry and institutions depend on each other and derive benefit from mutual interaction, which can be made possible by giving a boost to the establishment of industries and by inculcating entrepreneurial competencies among students. The Ministry of Human Resource Development (MHRD), Government of India (GOI), has implemented the scheme of Strengthening Existing Institutions and Establishment of New Institutions for the Non-corporate and Unorganised Sectors. To cater to the needs of these sectors, the New Education Policy of 1986 emphasized the following: To encourage students to consider self-employment as a career option, training in entrepreneurship will be provided through modular or optional courses in degree and diploma programmes (NPE 6.10). In order to increase the relevance of management education, particularly in the non-corporate and under-managed sectors, the management education system will study and document the Indian experiences and create a body of knowledge and specific educational programmes suited to these sectors (NPE 6.7). Continuing education, covering established as well as emerging technologies, will be promoted (NPE 6.4). The number of unemployed graduates in the engineering and management disciplines has been increasing of late due to the mushrooming of institutions. Such an increase has been causing quite some concern in the country, primarily because of two factors. Firstly, a large amount of money is invested in teaching a student to become a graduate or postgraduate in management or engineering. Secondly, the modern aspects of technology taught to such students go to naught when they are not provided with adequate opportunities to utilize their skills. Recognizing the problems posed by such an increase in unemployed graduates, the Government has been concerned with promoting enterprises which would exploit the know-how, talent and skill available in the scientific and technological population emerging from this educational system. Realizing the need to mitigate the problem of unemployment among science, engineering and technology persons, the Government of India established the National Science and Technology Entrepreneurship Development Board (NSTEDB) in the Department of Science and Technology in January 1982 to promote avenues of gainful self/wage employment for science and technology persons, with an emphasis on entrepreneurship development.

Objectives: The study has been undertaken with the following objectives: 1. To know about entrepreneurship among the students of technical institutions. 2. To study whether education plays any role in the development of entrepreneurship.

Review of related studies: The studies conducted in India and abroad on entrepreneurship development and the role of education have stressed that, in the era of knowledge-intensive work environments, it is important to establish knowledge infrastructure in engineering institutions to foster technology innovation and technology incubation, and these institutions can play a vital role in disseminating the knowledge pool among the various sections of society. The present paper focuses on the role of institutions in the development of entrepreneurship, with special reference to Sant Longowal Institute of Engineering and Technology (SLIET), a Central Government institution in district Sangrur. Some of the studies conducted in India and abroad highlight the importance of entrepreneurship.

As described by David McClelland (1961), entrepreneurs are primarily motivated by an overwhelming need to achieve and strong urges to build. Therefore, it can be said that entrepreneurs are high achievers with a strong desire to create new things. Collins and Moore (1970) argued that entrepreneurs are tough, pragmatic people driven by needs of independence and achievement; they do not easily submit to authority. Cooper, Woo and Dunkelberg (1970) argued that entrepreneurs exhibit extreme optimism in their decision-making processes. Bird (1992) described entrepreneurs as mercurial, that is, prone to insights, brainstorms, deceptions, ingeniousness and resourcefulness; they are cunning, opportunistic, creative and unsentimental.

Sample Selection, Methodology and Findings: The present paper is an attempt to find out the role of technical institutions in promoting techno-entrepreneurship in the district of Sangrur, Punjab. Punjab is situated in the north-west of India. It is bordered by Pakistan on the west, the Indian state of Jammu & Kashmir on the north, Himachal Pradesh on its north-east, and Haryana and Rajasthan to its south. The total population of the state is 2,42,89,000 (2001 census), of which 29.55 percent live in urban areas and 70.45 percent in villages. It has 19 districts, namely Amritsar, Fatehgarh Sahib, Gurdaspur, Ferozepur, Ludhiana, Jalandhar, Kapurthala, Hoshiarpur, Mansa, Moga, Muktsar, Nawanshahr, Rupnagar, Faridkot, Patiala, Bathinda, Mohali, Barnala and Sangrur. The literacy rate of district Sangrur is only 45.99 percent, with 41.25 percent of rural and 60.42 percent of urban people literate. There are three professional institutions of higher education in the district: two institutions of engineering and technology in the government sector and one in the private sector cater to the needs of the district and other areas of the country. Therefore, to understand the road ahead for entrepreneurship and to analyze the attitude of the youth towards entering business, the researcher prepared a questionnaire in which various parameters, such as family background, parents' occupation, future planning, schooling, income level, and need for achievement, keeping entrepreneurship as a career option in view, were taken into account, and a sample of 53 MBA students of the final and pre-final years of Sant Longowal Institute of Engineering and Technology (SLIET) was taken as first-hand information. We contacted all 60 students, but only 53 came forward and shared the information. After collecting the data, it was analyzed in percentages.

Table-1. Legend: S = Service, B = Business, A = Agriculture, H = Higher Studies, A = Settled Abroad. Source: Personal Investigation


Tables 1-2 depict that, out of the total 53 surveyed students at Sant Longowal Institute of Engineering and Technology, 50.94 percent of the students' fathers were in service, 18.86 percent in business, and 30.18 percent engaged in the agriculture sector. As far as the mothers' support to the family is concerned, 13.20 percent were in service, 1.88 percent in business, and 84.90 percent were housewives engaged in household chores. As regards family income, 1.88 percent had up to Rs. 1 lakh per annum, 22.64 percent Rs. 1-2 lakh, 54.71 percent Rs. 2-4 lakh, and 26.41 percent Rs. 4-6 lakh per annum. As regards the educational background of the surveyed students, 37.73 percent had passed their matriculation examination from the Punjab School Education Board (P.S.E.B.), 33.96 percent from the Central Board of Secondary Education (C.B.S.E.), and 28.30 percent from other state boards. As far as the future planning of the students is concerned, 39.62 percent were interested in business, 56.60 percent wanted to join service, 13.20 percent were interested in pursuing higher studies, and 9.43 percent were interested in going abroad. No one was interested in pursuing a career in the agriculture sector. Most students want to take a job because of job security. Furthermore, the information elicited from the students reveals that those interested in higher studies are chiefly interested in landing a coveted dream job. Thus we can draw the conclusion that students want job security and hesitate to take the risk of starting small ventures.

TABLE - 2 (SLIET)
Father's Occupation: Service 50.94%, Business 18.86%, Agriculture 30.18%
Mother's Occupation: Service 13.20%, Business 1.88%, Housewife 84.90%
Matric Education: P.S.E.B. 37.73%, C.B.S.E. 33.96%, I.C.S.E. Nil, Other 28.30%
Family Income (p.a.): Up to 1 Lakh 1.88%, 1-2 Lakh 22.64%, 2-4 Lakh 54.71%, 4-6 Lakh 26.41%
Future Planning: Business 39.62%, Service 56.60%, Higher Study 13.20%, Abroad 9.43%

Suggestions: 1. The management institutions should focus on improving the overall personality of the students; individual motivation and counselling are the need of the hour, apart from classroom teaching. 2. The Government should make it mandatory to teach entrepreneurship as a subject in all management institutions, and every institution should fix a target
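As a quick arithmetic cross-check of the occupation rows, the printed percentages correspond exactly to whole-student counts out of the 53 respondents. The 53-respondent base is from the paper; the rounding helper below is our own illustration, not part of the study.

```python
# Convert the reported Table 2 percentages back into head counts
# out of the 53 surveyed students (values as printed in the paper).
N = 53

father = {"Service": 50.94, "Business": 18.86, "Agriculture": 30.18}
mother = {"Service": 13.20, "Business": 1.88, "Housewife": 84.90}

def counts(pcts: dict) -> dict:
    """Round each percentage of N to the nearest whole student."""
    return {k: round(v / 100 * N) for k, v in pcts.items()}

father_counts = counts(father)  # 27 + 10 + 16 students
mother_counts = counts(mother)  #  7 +  1 + 45 students

# Both occupation breakdowns account for all 53 respondents.
assert sum(father_counts.values()) == N
assert sum(mother_counts.values()) == N
```

Note that the same check does not hold for the income and future-planning rows (their percentages sum to more than 100), which suggests some respondents fell into, or selected, more than one category there.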


to generate entrepreneurs, and they should sign MoUs with financial institutions so that students do not have to run from pillar to post for financial assistance.

Conclusion: Going global, India has emphasized the need to accelerate the development of small and medium enterprises. The target articulated in the preamble of the Small and Medium Enterprises Development Bill, 2005 affirms "to provide for facilitating the promotion and development and enhancing the competitiveness of small and medium enterprises". The findings of the present study indicate that the majority of the students are more interested in service as a secure profession rather than a risk-taking occupation. Some inputs relating to entrepreneurship are being given to the students, but these are not sufficient, and more proactive steps are required in this direction.

References:
1. Ajay Singh, Shashi Singh, Girish Tyagi, Role of Economic Status, Secondary & Engineering Level Education in Development of Entrepreneurial Attitudes & Activities of Technical Undergraduate Students. 9th Biennial Conference on Entrepreneurship, Feb. 16-18, 2011, EDI Ahmedabad.
2. Dr. Abhilasa Singh, Role of Entrepreneurship in Small Scale Business Management. A Bi-annual Journal of Management & Technology, Vol. 1, Issue 1, July-Dec. 2006, pp. 147-151.
3. Kuratko, Donald F., The Emergence of Entrepreneurship Education: Development, Trends and Challenges. Entrepreneurship Theory and Practice, Sept. issue.
4. Mohit A. Parekh, Devagni Devashragi, Entry Barriers to Entrepreneurship. 9th Biennial Conference on Entrepreneurship, Feb. 16-18, 2011, EDI Ahmedabad.
5. Prof. P. B. Sharma, Role of Technical Institutions in Technology Innovation and their Impact on National Security. The Journal of Engineering Education, Jan. 2006, p. 16.
6. S. G. Patil, Lata S. Patil, Rabindra D. Patil, Role of Women in Socio-economic Development of the Region, with Special Reference to North Maharashtra. 9th Biennial Conference on Entrepreneurship, Feb. 16-18, 2011, EDI Ahmedabad.
7. Vasant Desai, Dynamics of Entrepreneurial Development and Management. Himalaya Publishing House, New Delhi, 1999.








Design of data link layer using WiFi MAC protocols

K. Srinivas (M.Tech), C R Reddy College of Engg; C R S Murthy (M.Tech), C R Reddy College of Engg

Abstract: Wi-Fi is the name of a popular wireless networking technology that uses radio waves to provide wireless high-speed Internet and network connections. The Wi-Fi Alliance, the organization that owns the Wi-Fi (registered trademark) term, specifically defines Wi-Fi as any "wireless local area network (WLAN) product that is based on the Institute of Electrical and Electronics Engineers' (IEEE) 802.11 standards".

bands at 2.4GHz and 5GHz. Further hardware design, the communication of hardware design more, operation in these bands entails a strict regulatory data and the maintenance, modification and procurement transmit power constraint, thus limiting range and even bit of hardware. It is a common language for electronics rates beyond a certain distance [5] design and development prototyping Section 1 Users expect Internet connectivity wherever they travel and many of their devices, such as iPods and wireless cameras, rely on local area Wi-Fi access points (APs) to obtain connectivity. Even smart phone users may employ Wi-Fi instead of 3G and WiMAX to improve the performance of bandwidth intensive applications or to avoid data charges. Fortunately, there is often a large selection of commercial APs to choose from. For example, JiWire [6], a hotspot directory, reports 395 to 1,071 commercial AP the designed data link layer is capable of mac layer and physical layer .the data link layer communicates with the other three layers .the layers designed are capable of transmitting 1 and 2 mbits \sec i.e frequency hopping spread spectrum in the 2,4 ghz band and infrared .beyond the standard functionality usally peromed by mac layers the 802.11 mac performance .the protocols consisting of fragmentation ,packet retransmission and acknowledgements .tha mac layer defines two access methods the distribution coordination function [dcf]and point coordination function[pcf]

The main objective of the ieee 802.11 are standard the csma /ca ,physical and mac layer for transmitter and receiver is modleded in this paper .the vhdl (bery high speed hardware description language ) is defined in ieee as a tool of cretation of electronics systems because it supports the development verification sysnthasis and testing The main core of the IEEE 802.1 lb standard are an 802.11 and another 802 LAN [3]. The CSMA\CA, Physical and MAC layers. But only MAC However, all is not prefect in the WLAN world. layer for transmitter is modeled in this paper using the Offering nominal bit rates of 11Mbps [802.1 lb] and VHDL. The VHDL (Very High Speed Hardware 54Mbps(802.1la and 802.11g) the effective throughputs Description Language ) is defined in IEEE as a tool of are actually much lower owing to packet collisions, creation of electronics system because it supports the protocol overhead , and interference in the increasingly development , verification , synthesis and testing of congested unlicensed



Various individual modules of the Wi-Fi transmitter have been designed, verified functionally using a VHDL simulator, and synthesized by the synthesis tool. This design of the Wi-Fi transmitter is capable of transmitting the frame formats, which include all 802.11 frame types, i.e. the MAC frame, RTS frame, CTS frame and ACK frame. The transmitter is also capable of generating error-checking codes such as HEC and CRC, and it can handle variable data transfer.
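The CRC such a transmitter appends as the 802.11 frame check sequence (FCS) is the standard CRC-32. A minimal Python reference model, useful for cross-checking a hardware implementation (a sketch, not the paper's VHDL), is:

```python
import binascii

def crc32_fcs(data: bytes) -> int:
    """Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used for
    the IEEE 802.11 frame check sequence (FCS)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

# Cross-check the bitwise model against the library implementation.
payload = b"123456789"
assert crc32_fcs(payload) == binascii.crc32(payload) & 0xFFFFFFFF
assert crc32_fcs(payload) == 0xCBF43926  # well-known CRC-32 check value
```

In hardware the same computation is typically realized as a 32-bit linear-feedback shift register clocked once per bit, which is how a VHDL transmitter would generate the FCS on the fly.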

[1] Blind deconvolution of spatially invariant image blurs with phase.
[2] Identification of image and blur parameters for the restoration of non-causal blurs.
[3] Total variation blind deconvolution.
[4] Maximum-likelihood parametric blur identification based on a continuous spatial domain model.
[5] Out-of-focus blur estimation and restoration for digital auto-focusing system.
[6] Simultaneous out-of-focus blur estimation and restoration for digital auto-focusing system.


Leveraging Innovation For Successful Entrepreneurship

Abstract Innovation has become the most hyped word in the dictionary of business models of the present century. New ideas, concepts and products have provided the firm with a new tool to counter the dynamic forces of change that disrupt its establishment and force it to recreate itself based on new rules set by change. Innovation has given firms the power to lead change and create an impetus to transform the industry structure, rather than sitting and waiting for the storm to arrive and cast its effects on the surroundings. Innovation is a powerful tool in the hands of firms. It helps them create longevity, as it opens wide horizons for them to expand their reach and stay one step ahead. The key process in economic change is the introduction of innovations, and the central innovator is the entrepreneur. There will always be continuous winners and losers in this system. The entrepreneur is the initiator in this change process. The entrepreneur, seeking profit through innovation, transforms the static equilibrium, the circular market flow, into the dynamic process of economic development. He interrupts the circular flow and diverts labor and land to investment. "The function of entrepreneurs is to reform or revolutionize the pattern of production by exploiting an invention or, more generally, an untried technological possibility for producing a new commodity or producing an old one in a new way, opening a new source of supply of materials or a new outlet for products, by reorganizing an industry." The notion of entrepreneurship is typically associated with new business creation and new product development and offerings by individuals. With the onset of intensifying global competition there is an increasing need for business organizations to become more entrepreneurial, not only to survive but to thrive and prosper. Hence corporate entrepreneurship has become an important paradigm in today's business environment.
Corporate Entrepreneurship is a much broader concept encompassing innovation, creativity, change and regeneration within the corporate climate or entire organization. Corporate Entrepreneurship is a concept by which corporate employees at any level of the company identify and construct a unique business model that offers significant growth opportunities for their company.



Throughout history, innovators and entrepreneurs have had a tremendous impact on development, exploration, trade, education, science, and integration. During the 20th century, innovation and entrepreneurship have been regarded as key drivers of technological progress and productivity development worldwide. New radical innovations from new fields of knowledge such as information and communication technologies and biotechnology have emerged to influence everyday life for most people. Realizing this, policy makers as well as individuals argue that innovative and entrepreneurial change processes need to be further implemented on the micro as well as macro levels in society.

Innovation refers to radically new or incremental changes in ideas, products, processes or services. Following Joseph Schumpeter's (1934) original work, an invention is related to a new idea or concept, while an innovation refers to such ideas applied in practice. In The Theory of Economic Development (1934), Schumpeter defined innovation from an economic perspective as the introduction of a new good or of a new quality

of a good, the introduction of a new method of production, the opening of a new market, the conquest of a new source of supply of raw materials or half-manufactured goods, and the carrying out of a new organization of an industry. On the individual level, innovation comprises the origination of an idea through to its implementation, at which point it can be transformed into something useful. Since innovation is also considered a major driver of the economy, especially when it leads to new product or service categories, or to increasing productivity, the factors that stimulate individuals or groups to innovate should be of major interest to policy makers. In particular, public-policy incentives could be implemented to spur innovation and growth. On the organizational level, innovation may be used to improve performance and growth through new concepts and methods that increase efficiency, productivity, quality, competitive positioning, and market share. Innovation policies and practices may be implemented in a variety of organizations, such as industries, hospitals, universities, as well as local governments. While most forms and practices of innovation aim to add value, radical innovation may also result in a negative or destructive effect for some. Many new developments clear away or change aging practices, and those organizations that do not innovate effectively may be substituted by new organizations and firms that do. It is not only our understanding of the importance of innovations for development that is changing, but also the concept of how innovations are formed. New models of innovation are emerging that are shifting the concept of innovation from a closed to an open paradigm (Hedner, Maack, Abouzeedan, & Klofsten, 2010).
Such forms of innovation include, for example, user innovation, open innovation, crowd-sourcing, and crowd-casting, which all represent novel and interesting phenomena that may change our conception of how innovation of use, innovation in services, innovation in configuration of technologies, as well as innovation of novel technologies themselves are formed. In agreement with such open concepts of innovation, loosely formed groups of customers, users, scientific communities, or experts/researchers may collectively shape product or process innovations within a variety of sectors.

Technology is often attributed as one of the driving forces behind globalization (Bartlett & Ghoshal, 1996). With each wave of technological change the bar of knowledge required to obtain a level of sophistication changes. The result is generally a greater need for human capital, which has given rise to the increase in knowledge workers (Gilbert et al., 2004). An economic landscape characterized by the rise of international production, innovation networks, and the emergence of science-based technologies has emerged. With the technology-driven boom of the 1990s, Germany and Japan were replaced by the USA as the innovation policy exemplar. Innovation policy, which has been growing in interest and emphasis since the mid-1990s, has largely evolved from S&T policy (OECD, 2006). The first generation of innovation policy, based on the science push or linear model, focused primarily on funding of science-based research in universities and government laboratories. The second generation of innovation policy adopted more of a demand-led view based on interaction between users and producers of innovation in what is referred to as national innovation systems (NIS). Since then, innovation policy has shifted toward an innovation systems perspective, including demand-pull and interaction between users and producers of innovation. Innovation policy plays an important role in influencing innovation performance, but must be closely tailored to the specific needs, capabilities, and institutional structures of each country (OECD, 2005a), i.e. the national innovation system.

There is no common definition of the innovation system concept. Typically the concept includes the activities of private as well as public actors; linkages; and the role of policy and institutions. The analysis is carried out at the national level: R&D activities and the role played by universities, research institutes, government agencies, and government policies are viewed as components of a single national system, and the linkages among these are viewed at the aggregate level (Carlsson, Jacobsson, Holmén, & Rickne, 2002). Lundvall, Johnson, Andersen, and Dalum (2002, p. 220) find it useful to think about innovation systems in two dimensions. One refers to the structure of the system: what is produced in the system and what competences are most developed? The second refers to the institutional set-up: how do production, innovation, and learning take place? The innovation system concept can be understood in a narrow as well as a broad sense (Lundvall, 1992). The narrow sense concentrates on those institutions that deliberately promote the acquisition and dissemination of knowledge and are the main sources of innovation. The broad sense recognizes that these narrow institutions are

embedded in a much wider socio-economic system. The concept has become popular among several important policymaking organizations; for example, both the OECD and the EU have absorbed the concept as an integral part of their analytical perspective. Much of the literature on innovation systems insists on the central importance of national systems, but a number of authors have argued that globalization has greatly diminished or even eliminated the importance of the nation state (Freeman, 2002). As a result, there have been several new concepts emphasizing the systemic characteristics of innovation, but related to levels other than the nation state. Sometimes the focus is on a particular country or region, which then determines the spatial boundaries of the system. The literature on regional systems of innovation has grown rapidly since the mid-1990s (e.g. Cooke, 1996; Maskell & Malmberg, 1999). In other cases, the main dimension of interest is a sector or technology. Carlsson and Jacobsson (1997) developed the concept of technological systems, while Breschi and Malerba (1997) use the notion of sectoral systems of innovation. Usually these different concepts and dimensions reinforce each other and are not in conflict. Despite this growing interest in systems of innovation, there have been few attempts to include entrepreneurship as a central component (Golden, Higgins, & Lee, 2003). In Europe, all European Union (EU) Member States and candidate countries have committed to the Lisbon Agenda and increased their public R&D expenditure. Thus, in the 2000s, European innovation policy has become somewhat biased toward a science push or linear model, in which R&D is supposed to lead to increased innovation and entrepreneurship.
The third generation of innovation policy thinking calls for more horizontality, coordination and integration of innovation, and other policy domains (OECD, 2006) and stronger linkages with entrepreneurship as a component of the NIS (Golden et al., 2003) and through development of indicators to measure its importance as a driver of innovation (Arundel & Hollanders, 2006). Carlsson (2006) argues that in order to understand how successful innovation systems are in generating economic growth, one would have to include an assessment of the level of entrepreneurial activity and business formation outputs.

Innovation as a policy area is primarily concerned with a few key objectives: ensuring the generation of new knowledge and making government investment in innovation more effective; improving the interaction between the main actors in the innovation system (universities, research institutes, and firms) to enhance knowledge and technology diffusion; and establishing the right incentives for private sector innovation to transform knowledge into economic value and commercial success (Commission of the European Communities, 2005c; OECD, 2002c). A review of innovation policy documents compiled by the OECD and the EU suggests that the framework for innovation policy could be illustrated as in Fig. 2. Here we find policy objectives for the increase of R&D intensity, the stimulation of a climate and culture of innovation, as well as for the commercialization of technology. The last of these includes instruments and support which are important for many innovative start-ups, e.g. to support the innovation infrastructure (such as technology transfer offices, science parks, and business/technology incubators); encourage the uptake of strategic technologies among SMEs; improve access to pre-commercialization funding and venture capital; and provide tax incentives (e.g. R&D tax credits, favorable capital cost allowances) and other incentives and supports to accelerate the commercialization of new technologies and products. As we shall demonstrate later in this paper, this framework is not dissimilar from that for entrepreneurship policy. The major difference may be the types of policy measures included within each of the framework boxes.

Entrepreneurship is the act of being an entrepreneur. According to the French tradition, this implies one who undertakes an endeavor, combining innovation, finance and business acumen in an effort to transform innovations into economic goods.

Entrepreneurs undertake such tasks in response to a perceived opportunity, which in its most obvious form may be a new start-up company. However, the entrepreneurship concept has in recent years been extended to also include other forms of activity, such as social, political, and international entrepreneurship. Some of these new fields of entrepreneurship research and practice are to a large extent driven by e-globalization processes, which are facilitated by new information technology tools. Social entrepreneurship, focusing on non-profit entrepreneurial activities, is a new area which is currently attracting more research. Other developing perspectives include academic entrepreneurship and women's entrepreneurship (Thompson & Jones-Evans, 2009), as well as ethnic entrepreneurship, the latter focusing on the role of immigrants as entrepreneurs in their new home countries (cf. Clark & Drinkwater, 2010; Smallbone, Kitching, & Athaya, 2010). In addition, there is also an increasing emphasis on specific sectors where entrepreneurs are active, such as the medical, life sciences, services, and technology areas, with new paradigms emerging as a result. Needless to say, other new paradigms and concepts within the field of entrepreneurship will appear in the future as the concept of entrepreneurship takes on new forms and shifts into new frontiers. Certainly, it is within the nature of the metaphor of entrepreneurship that such creativity and development should be anticipated. As such, research in the entrepreneurship field needs to develop a better understanding of the important relationship between innovation, entrepreneurial activities, and economic development (Acs & Storey, 2004; Acs & Szerb, 2007; Carlsson, Acs, Audretsch, & Braunerhjelm, 2009; Reynolds, 1997; Reynolds, Carter, Gartner, & Greene, 2004; Stough, Haynes, & Campbell, 1998).
The entrepreneur is an actor in microeconomics and, according to Schumpeter (1934), is a person who is willing and able to convert a new idea or invention into a successful innovation. In the classical sense, entrepreneurship employs what Schumpeter called the gale of creative destruction, which means that entrepreneurial activities may partly or fully replace inferior practices across markets and industries, while new products or new business models are created simultaneously. According to this perspective, creative destruction is a driver of the dynamism of industries and long-term economic growth. A vital ingredient in entrepreneurship is therefore risk-taking. Knight (1961) classified three types of uncertainty facing an entrepreneur: risk, which can be measured statistically; ambiguity, which is difficult to measure statistically; and true uncertainty, or Knightian uncertainty, which is impossible to statistically estimate or predict. Entrepreneurship is often associated with true uncertainty, in particular when it involves new-to-the-world innovations. Innovation and technological change are developed and implemented more rapidly today than ever before. Entrepreneurs across the globe implement the process of commercialization resulting from innovation and technological change. Over several decades, our concepts of the innovation process have transitioned from the technology push and need pull models of the 1960s and early 1970s, through the coupling model of the late 1970s to early 1980s, to today's integrated model. Thus, our concept of the innovation process has shifted from one that presented innovation as a linear sequential process to our current perception of innovation as a shifting, parallel, networking and open phenomenon.
As a result of internetization, communication, and e-globalization, innovation is moving more rapidly, is more dispersed, and increasingly involves inter-company and inter-personal networking (Abouzeedan et al., 2009; Hedner et al., 2010). As a result, entrepreneurs are needed to develop and implement innovation. Needless to say, innovation and entrepreneurship policies need to be supported and firmly embedded in society (Norrman & Klofsten, 2009). Since entrepreneurship may be translated into economic growth, governments increasingly support the development of an entrepreneurial culture by integrating entrepreneurship into educational systems, encouraging business risk-taking in start-ups, and running national campaigns supporting a range of public entrepreneurship incentives. Over the last century, Alfred Nobel, the famous Swedish inventor and philanthropist, has personified the concept of innovation and

entrepreneurship on the individual level (Jorpes, 1959; Schück & Sohlman, 1929). Nobel (1833-1896) pursued a career as a chemist, engineer, innovator, and entrepreneur and became one of the great philanthropists of his time. Nobel held 355 different patents, including that of dynamite. He created an enormous fortune during his lifetime, and in his final will and testament he instituted the Nobel Prizes, the most prestigious scientific prizes of all time.

There is a long debate, tracing back to the economist Joseph Schumpeter, about the role of small and large firms with respect to technological progress and innovation. While during the 1970s and the early part of the 1980s the leading role of large enterprises was stressed amongst academics and policymakers, in the late 1980s and throughout the 1990s the role and impact of SMEs was rediscovered. It is now well established that SMEs and entrepreneurship are important for economic growth and renewal (Acs, Audretsch, Braunerhjelm, & Carlsson, 2004; Birch, 1981; Davidsson, Lindmark, & Olofsson, 1994; Reynolds, Bygrave, Autio, Cox, & Hay, 2002; Wennekers & Thurik, 1999). As mentioned, entrepreneurship policy has emerged primarily from SME policy, becoming particularly evident as a policy area in the late 1990s and early 2000s (European Commission, 1998, 2004a; Hart, 2003; OECD, 1998, 2001a; Stevenson & Lundström, 2002). Although it is the company's size that is the crucial criterion to distinguish SMEs from other enterprises, when considering SMEs in particular there is much more that matters, such as the applied business model, occupied market segment, sector alignment, growth orientation, etc. However, there is an aspect that is obviously more closely linked to the company's size than all the others: the age of the company, or more precisely the stage of the firm's life cycle (Ortega-Argilés & Voigt, 2009). In fact, it makes a difference whether an enterprise is classified as an SME because it is a very recently established firm (an entrepreneurial start-up) or because the company's size is rather the result of a market adjustment process (e.g. a limited niche market). Since the majority of new firms are born small,

it is natural that SMEs and entrepreneurial firms would, at least for a period of time, be seen as synonymous entities, and that SME policy and entrepreneurship policy would have overlapping domains, as illustrated in Lundström and Stevenson (2005). However, it is important to remember that there are differences between SMEs and entrepreneurial firms (not all entrepreneurial firms stay small), just as there are differences between SME policy and entrepreneurship policy. Whereas the main objective of SME policy is to protect and strengthen existing SMEs (i.e. firms), entrepreneurship policy emphasizes the individual person or entrepreneur. Thus, entrepreneurship policy encompasses a broader range of policy issues geared to creating a favorable environment for the emergence of entrepreneurial individuals and the start-up and growth of new firms. A critical issue for entrepreneurship policy is how to encourage the emergence of more new entrepreneurs and growing firms.

Entrepreneurship policy, then, is primarily concerned with creating an environment and support system that will foster the emergence of new entrepreneurs and the start-up and early-stage growth of new firms (Lundström & Stevenson, 2005; Stevenson & Lundström, 2002). The framework of entrepreneurship policy measures includes policy actions in six areas: (1) promotion of entrepreneurship; (2) reduction of entry/exit barriers; (3) entrepreneurship education; (4) start-up support; (5) start-up financing; and (6) target group measures (Stevenson & Lundström, 2002). Major policy instruments and measures in this policy area include those to remove administrative and regulatory barriers to new firm entry and growth, and to improve access to financing, to information, and to other support infrastructure and services. To promote a culture of entrepreneurship, expose more students to entrepreneurship in the education system, and remove barriers to entrepreneurship among specific target groups within the population are further examples of major policy instruments

(Gabr & Hoffman, 2006; Lundström & Stevenson, 2005).

Fig. 3. Framework of entrepreneurship policy areas.

When presenting their Entrepreneurship Policy typology, Stevenson and Lundström (2002) included four different categories of entrepreneurship policy. The first of these is the SME Policy Add-on, in which case initiatives to respond to the needs of starting firms, or the broader stimulation of entrepreneurship, are added on to existing SME programs and services, but at a somewhat marginalized and weakly resourced level. The second is the New Firm Creation Policy, in which case the government focuses on measures to reduce administrative and regulatory (government) barriers to business entry and exit, and generally to simplify the start-up process so more people are able to pursue that path. In the Niche Entrepreneurship Policy, the government formulates targeted measures to stimulate the level of business ownership and entrepreneurial activity around specified groups of the population. There are two types of targets for niche policies: (a) segments of the population which are under-represented as business owners (e.g. women, youth, ethnic minorities, the unemployed, etc.), where the objective is to address identified social, systemic, or other particular barriers to entry; and (b) techno-starters, where the objective is to encourage high-growth-potential businesses based on R&D, technology or knowledge. Finally, the Holistic Entrepreneurship Policy is a comprehensive policy approach encompassing the full range of entrepreneurship policy objectives and measures. Clearly, the Niche Entrepreneurship Policy addressing techno-starters is highly relevant when discussing innovative entrepreneurship policy. However, as will be further discussed in the next section, the effectiveness of a niche policy as a stand-alone policy may be impeded if the entrepreneurial culture is underdeveloped.

Bear in mind that, similar to what was earlier argued about the role of innovation policy (in the section on Science, Technology, and Innovation), entrepreneurship policy plays an important role in influencing entrepreneurial performance, but the policy should be closely tailored to the specific needs, capabilities, and institutional structures of each country/region and innovation system.





Joseph Alois Schumpeter pointed out over one hundred years ago that entrepreneurship is crucial for understanding economic development. Today, despite the global downturn, entrepreneurs are enjoying a renaissance the world over, according to a recent survey in the Economist magazine (Wooldridge, 2009). The dynamics of the process can be vastly different depending on the institutional context and level of development within an economy. As Baumol (1990) classified it, entrepreneurship within any country can be productive, destructive or unproductive. If one is interested in studying entrepreneurship within or across countries, the broad nexus between

entrepreneurship, institutions, and economic development is a critical area of inquiry, and one which can determine the eventual impact of that entrepreneurial activity. The interdependence between incentives and institutions affects other characteristics, such as quality of governance, access to capital and other resources, and the perceptions of entrepreneurs. Institutions are critical determinants of economic behavior and economic transactions in general, and they can have both direct and indirect effects on the supply and demand of entrepreneurs (Busenitz & Spencer, 2000). Historically, all societies may have a constant supply of entrepreneurial activity, but that activity is distributed unevenly between productive, unproductive, and destructive entrepreneurship because of the incentive structure. To change the incentive structure you need to strengthen institutions, and to strengthen institutions you need to fix government. The role incentives play in economic development has become increasingly clear to economists and policymakers alike. People need incentives to invest and prosper. They need to know that if they work hard, they can make money and actually keep that money. As incentive structures change, more and more entrepreneurial activity is shifted toward productive entrepreneurship that strengthens economic development (Acemoglu & Johnson, 2005). This entrepreneurial activity tends to explode during the innovation-driven stage that culminates in a high level of innovation, with entrepreneurship leveling out as institutions are fully developed (Fukuyama, 1989).

Technical change and economic development for most of the first part of the twentieth century was assumed to be a function of capital and labor inputs. Douglas (1934) at the University of Chicago compiled a time series of US labor supply (L) and a series of capital plant and equipment (K) for the period 1899-1922. The results suggested that labor received about 0.75 of output and capital about 0.25, and that K/L ratio deepening (more capital per worker) was important to technological change. Of course the static interpretation was subject to much criticism. Solow (1957) at MIT updated the data on wages and capital returns, and improved on Douglas's simple estimation regressions by bringing in yearly data on profit/wage sharing. For the 1909-1949 time-span, Solow modified Douglas's earlier findings by a kind of exponential growth factor suggested by Schumpeter early in the century. As the Nobel Laureate Samuelson (2009, p. 76) recently pointed out, "This residual, Solow proclaimed, demonstrated that much of post-Newtonian enhanced real income had to be attributed to innovational change (rather than, as Douglas believed, being due to deepening of the capital/labor K/L ratio)."

Fig. 1 shows the relationship between entrepreneurship and economic development. Entrepreneurship differs from innovation because it involves an organizational process. Schumpeter provided an early statement on this. In recent years, economists have come to recognize what Leibenstein (1968) termed the input-completing and gap-filling capacities of potential entrepreneurial activity in innovation and development. Entrepreneurship is considered to be an important mechanism for economic development through employment, innovation, and welfare. The intersection of the S-curve on the vertical axis is consistent with Baumol's (1990) observation that entrepreneurship is also a resource, and that all societies have some amount of entrepreneurial activity, but that activity is distributed between productive, unproductive, and destructive entrepreneurship. As institutions are strengthened, more and more entrepreneurial activity is shifted toward productive entrepreneurship, strengthening economic development (Acemoglu & Johnson, 2005). This entrepreneurial activity explodes through the efficiency-driven stage and culminates in a high level of innovation, with entrepreneurship leveling out.

Fig. 1. The relationship between entrepreneurship and economic development and the corresponding stages of development as found in Porter et al. (2002).
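The Douglas estimates and the Solow residual discussed above can be stated compactly. As a sketch in standard textbook notation (not the authors' own symbols), the Cobb-Douglas production function with Douglas's estimated factor shares is

```latex
\[
  Y = A\,K^{0.25}L^{0.75},
\]
```

and Solow's growth-accounting decomposition attributes to technical change whatever output growth the weighted input growth cannot explain:

```latex
\[
  \frac{\dot A}{A} \;=\; \frac{\dot Y}{Y} \;-\; 0.25\,\frac{\dot K}{K} \;-\; 0.75\,\frac{\dot L}{L}.
\]
```

The left-hand side is the residual Samuelson refers to: the part of measured growth over 1909-1949 that could not be explained by capital deepening alone.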




Baumol (1990) proposed a theory of the allocation of entrepreneurial talent in a seminal article titled "Entrepreneurship: Productive, Unproductive and Destructive". He makes the important observation that although entrepreneurship is typically associated with higher incomes, innovation and growth, the entrepreneur is fundamentally engaged only in activity aimed at increasing wealth, power and prestige (1990, p. 898). Therefore, entrepreneurship is not inherently economically healthy and can be allocated among productive, unproductive, and destructive forms. The framework presented by Baumol is useful in that it brings to attention the importance of the full range of entrepreneurial activity. The tradeoff between productive and unproductive activity has been studied, typically in developed countries, most often from the perspective of economic organization. Strong regulatory regimes often mean that policies typically oversee the direction of entrepreneurship in the economy. In contrast, many developed countries have designed economic policies specifically to minimize the ability of entrepreneurs to engage in unproductive activities, and to support productive entrepreneurship. In many developing countries, unproductive and destructive activities are substantial components, if not the most substantial components, of the economy. Even in rapidly developing countries, opportunities for profit can outpace the evolution of institutions, and this mismatch widens the scope for rent-seeking or worse activities. In underdeveloped countries, economic activities are often found to be predatory and extractive. Baumol originally proposed a framework to understand the allocation, rather than the supply, of entrepreneurship. He assumes that a certain proportion of entrepreneurs exists across and within societies. Baumol hypothesizes that the allocation of entrepreneurial talent is influenced by the structure of rewards in the economy. He suggests that the rules of the game determine the

outcome of entrepreneurial activity for the economy, rather than the objectives or supply of the entrepreneurs. According to Baumol (1990, p. 897), Schumpeter's analysis was not elaborate enough because it did not place value on moving between these forms of entrepreneurship. If activities are chosen based on perceived opportunity for profit (or other personal gain), it should not be assumed that the activities will be of a certain type. For this reason, Baumol (1990, p. 897) extends Schumpeter's list of entrepreneurial activities to include activities of questionable value to society, such as innovative new practices of rent-seeking. These activities of questionable value form Baumol's conception of unproductive entrepreneurship. Unproductive entrepreneurship is what Baumol refers to as a range of activities that threaten productive entrepreneurship. Specifically, he notes rent-seeking, tax evasion, and tax avoidance as the dominant forms of unproductive entrepreneurship. Within rent-seeking he includes excessive legal engagement; within taxation he notes that high-tax societies host a certain set of incentives for entrepreneurial effort. Baumol makes several useful propositions about productive and unproductive entrepreneurship, but he offers no insight into destructive entrepreneurship. In order to shed light on destructive entrepreneurship that is not captured in his existing framework, Acs and Desai proposed a theory of destructive entrepreneurship. They assume entrepreneurs operate to maximize utility and accept Baumol's proposition that the supply of entrepreneurs remains relatively constant. Acs and Desai also find that most treatments of entrepreneurship allocation assume the existence of occupational choice, limiting their applicability (Desai & Acs, 2007; Desai, Acs, & Weitzel, 2010).



Entrepreneurship and innovation are closely linked. Much of entrepreneurial activity most assuredly involves innovation, and, likewise, entrepreneurs are critical to the innovation

process. In addition, the turbulence (creative destruction) produced by a high rate of business entry and exit is itself associated with higher levels of innovation in an economy. It is possible to observe convergence between innovation and entrepreneurship policy, particularly when the policy goal is to foster new high-growth innovative firms. In this section we discuss the start-up of innovative and rapidly growing firms, as well as how public policy can be deployed to promote innovative entrepreneurship.

Entrepreneurship and innovation policy as derivatives of other policy areas

Entrepreneurship and innovation policy are both derivations of other policy areas. While entrepreneurship policy has emerged primarily from SME policy, innovation policy has largely evolved from science and technology (S&T) or research and development (R&D) policy. Innovations in the form of a new product, process, or service are an important factor in providing competitive advantage for SMEs. Continuous creation and recognition of new ideas and opportunities are common characteristics of innovation activity and entrepreneurship. At best, innovation helps small companies overcome the resource restrictions that constrain growth.

To sum up, it has been argued that both entrepreneurship and innovation are linked to economic growth and industrial renewal, but it is not entirely evident exactly how: the relationships between growth, entrepreneurship, and innovation tend to be indirect rather than direct. Today it is a well-established fact that SMEs are important for economic growth and renewal. The carrying out of new combinations may, however, have less to do with the size of a firm or organization; instead, newness in the form of innovation and entrepreneurship has again caught the attention of many academics and policymakers.
Much of entrepreneurial activity most assuredly involves innovation. Likewise, entrepreneurs are critical to the innovation process, and entrepreneurial capacity is a key element in the transfer of knowledge in the commercialization process. The turbulence produced by a high rate of business entry and exit is itself associated with higher levels of innovation in an economy. Examination of existing work on entrepreneurship and innovation policy suggests that an important direction for the future is to link the two to each other. It is argued that public policy promoting innovation and economic growth must also involve instruments promoting entrepreneurship. For innovative entrepreneurship to contribute fully to economic growth and development, its importance will need to be further acknowledged in both innovation and entrepreneurship policies. The combination of entrepreneurship and innovation results in innovative entrepreneurship: new firms based on new (inventive) ideas, and sometimes, but not

Fig. 2. Illustrative framework of innovation policy areas. However, it is noted that policy measures to stimulate innovative entrepreneurship are often of a different form than those to foster general entrepreneurial activity, as are the target groups they seek to influence and the composition of system members (Lundström & Stevenson, 2005; Stevenson, 2002). Of course, innovation policy is broader than policy to foster innovative entrepreneurship, especially regarding objectives such as increasing R&D investments or encouraging the uptake of strategic technologies. As Lundström and Stevenson (2005) observed, it is possible for governments to have policies for innovation that do not incorporate much consideration of policies to foster entrepreneurial capacity, not even for innovative entrepreneurship.

The importance of innovation by entrepreneurs is even greater as global competition offers more entrepreneurial opportunities to a greater pool of people; the challenge for firms is to find and make use of these individuals for their survival. Innovation and entrepreneurship are the essence of the capitalist society. Entrepreneurs contribute to society as a whole by introducing new products and services that often run counter to institutional norms.

always, research-based. Such firms often have relatively high growth potential and may become future gazelles. Thus the encouragement of innovative entrepreneurship has caught the attention of both policymakers and academics. In this paper it is, however, argued that policies in favor of innovative entrepreneurship should be considered in the context of a holistic entrepreneurship policy framework. The effectiveness of innovative entrepreneurship policy as a stand-alone policy may be impeded if the culture for entrepreneurship is under-developed, the density of business owners too thin, the full range of education support missing, and so on. Thus, in order to increase economic growth through innovative entrepreneurship, at least three alternatives should be considered. The first is the encouragement of entrepreneurship in general: not only does the establishment and expansion of new firms create additional jobs, increased general entrepreneurial activity is also likely to result in a higher number of innovative high-growth firms. The two other options are niche policies focusing either on increasing the frequency of high-growth firms among the innovative ones or on increasing R&D and innovative activity among low-growth firms. All three alternatives can increase innovative entrepreneurship, but the actual policy instruments are very different. There are many examples of highly successful innovations stemming from small enterprises that have revolutionized entire industries. Start-up companies, young entrepreneurs, university spin-offs, and small highly innovative firms more often than not produce the major technological breakthroughs and innovations, outpacing the R&D efforts and innovation strategies of large global corporations.
It has been argued that entrepreneurship takes on new importance in a knowledge economy because it serves as a key mechanism by which knowledge created in one organization can become commercialized in another enterprise. New and small firms also serve as important vehicles for knowledge spillovers when their ideas, competencies, products, strategies, innovations, and technologies are acquired, accessed, and commercialized by larger enterprises. Small- and medium-sized enterprises (SMEs) and entrepreneurship continue to be a key source of dynamism, innovation, and flexibility in advanced industrialized countries, as well as in emerging and developing economies. For innovative entrepreneurship to contribute fully to economic growth and development, its importance will need to be further acknowledged in both innovation and entrepreneurship policies.

Abouzeedan, A., Busler, M., & Hedner, T. (2009). Managing innovation in a globalized economy: Defining the open capital. In A. Ahmed (Ed.), World sustainable development outlook 2009, Part VII: Knowledge management and education. Brighton, UK: World Association for Sustainable Development, University of Sussex.
Acemoglu, D., & Johnson, S. (2005). Unbundling institutions. Journal of Political Economy.
Acemoglu, D., Johnson, S., & Robinson, J. (2001). The colonial origins of comparative development: An empirical investigation. American Economic Review.
Acs, Z., & Storey, D. (2004). Introduction: Entrepreneurship and economic development. Regional Studies.
Acs, Z., & Szerb, L. (2007). Entrepreneurship, economic growth and public policy. Small Business Economics.
Acs, Z. J., Audretsch, D. B., & Evans, D. S. (1994). Why does the self-employment rate vary across countries and over time? Discussion Paper No. 871, Centre for Economic Policy Research.
Acs, Z., Braunerhjelm, P., Audretsch, D. B., & Carlsson, B. (2009). The knowledge spillover theory of entrepreneurship. Small Business Economics.
Acs, Z. J., & Varga, A. (2005). Entrepreneurship, agglomeration and technological change. Small Business Economics.

Ahmad, N., & Hoffman, A. (2007). A framework for addressing and measuring entrepreneurship. Paris: OECD Entrepreneurship Indicators Steering Group.
Aquilina, M., Klump, R., & Pietrobelli, C. (2006). Factor substitution, average firm size and economic growth. Small Business Economics.
Audretsch, D. (2002). Entrepreneurship: A survey of the literature. Report for the European Commission, Enterprise Directorate General. European Commission, Enterprise and Industry.
Autio, E. (2007). GEM 2007 high-growth entrepreneurship report. Global Entrepreneurship Monitor.
Bates, T. (1990). Entrepreneur human capital inputs and small business longevity. The Review of Economics and Statistics.
Baumol, W. (1990). Entrepreneurship: Productive, unproductive and destructive. Journal of Political Economy.
Baumol, W., Litan, R., & Schramm, C. (2007). Good capitalism, bad capitalism, and the economics of growth and prosperity. New Haven, CT: Yale University Press.
Bhola, R., Verheul, I., Thurik, R., & Grilo, I. (2006). Explaining engagement levels of opportunity and necessity entrepreneurs.
Birch, D. L., & Medoff, J. (1994). Gazelles. In L. C. Solmon & A. R. Levenson (Eds.), Labor markets, employment policy and job creation (pp. 159-167). Boulder, CO and London: Westview Press.
Blanchflower, D. (2000). Self-employment in OECD countries. Labour Economics.
Blanchflower, D., Oswald, A., & Stutzer, A. (2001). Latent entrepreneurship across nations. European Economic Review.
Block, J., & Wagner, M. (2006). Necessity and opportunity entrepreneurs in Germany: Characteristics and earnings differentials.
Bosma, N., Acs, Z. J., Autio, E., Coduras, A., & Levie, J. (2009). GEM executive report. Babson College, Universidad del Desarrollo, and Global Entrepreneurship Research Consortium.
Busenitz, L., & Spencer, J. W. (2000). Country institutional profiles: Unlocking entrepreneurial phenomena. Academy of Management Journal.
Bygrave, W., Hay, M., Ng, E., & Reynolds, P. (2003). Executive forum: A study of informal investing in 29 nations composing the global entrepreneurship monitor. Venture Capital.
Carlsson, B., Acs, Z., Audretsch, D. B., & Braunerhjelm, P. (2009). Knowledge creation, entrepreneurship, and economic growth: A historical review. Industrial & Corporate Change.
Clark, K., & Drinkwater, S. (2010). Recent trends in minority ethnic entrepreneurship in Britain. International Small Business Journal.
Caliendo, M., Fossen, F. M., & Kritikos, A. S. (2009). Risk attitudes of nascent entrepreneurs: New evidence from an experimentally validated survey. Small Business Economics.
Davidsson, P. (2004). Researching entrepreneurship. New York: Springer.
De Clercq, D., Sapienza, H. J., & Crijns, H. (2005). The internationalization of small and medium firms. Small Business Economics.
Desai, S., & Acs, Z. J. (2007). A theory of destructive entrepreneurship. Jena Economic Research Paper No. 2007-085.
Desai, S., Acs, Z. J., & Weitzel, U. (2010). A model of destructive entrepreneurship. United Nations University (UNU) WIDER Working Paper No. 2010/34.
Djankov, S., La Porta, R., Lopez-de-Silanes, F., & Shleifer, A. (2002). The regulation of entry. Quarterly Journal of Economics.
Douglas, P. H. (1934). The theory of wages. New York: Macmillan.
Dreher, A. (2006). Does globalization affect growth? Evidence from a new index of globalization. Applied Economics.
Fukuyama, F. (1989). The end of history? The National Interest, 16, 3-18.
Godin, K., Clemens, J., & Veldhuis, N. (2008). Measuring entrepreneurship: Conceptual frameworks and empirical indicators. Studies in Entrepreneurship Markets, 7.
Gompers, P., & Lerner, J. (2004). The venture capital cycle. Cambridge, MA: MIT Press.
Grilo, I., & Thurik, R. A. (2008). Determinants of entrepreneurship in Europe and the U.S. Industrial and Corporate Change.
Guiso, L., Sapienza, P., & Zingales, L. (2006). Does culture affect economic outcomes?
Hindle, K. (2006). A measurement framework for international entrepreneurship policy research: From impossible index to malleable

matrix. International Journal of Entrepreneurship and Small Business.
Jorgenson, D. W. (2001). Information technology and the US economy. American Economic Review.
Liebenstein, H. (1968). Entrepreneurship and development. American Economic Review.
Miller, T., & Holmes, K. R. (Eds.) (2010). 2010 index of economic freedom: The link between entrepreneurial opportunity and prosperity. The Heritage Foundation and The Wall Street Journal.
Minniti, M. (2005). Entrepreneurship and network externalities. Journal of Economic Behavior and Organization.
Mueller, S., & Thomas, A. (2001). Culture and entrepreneurial potential: A nine country study of locus of control and innovativeness. Journal of Business Venturing.
Murphy, K. M., Shleifer, A., & Vishny, R. W. (1993). Why is rent seeking so costly to growth? American Economic Review Papers and Proceedings, 83(2), 409-414.
OECD (2006). Understanding entrepreneurship: Developing indicators for international comparisons and assessments.
Papagiannidis, S., & Li, F. (2005). Skills brokerage: A new model for business start-ups in the networked economy. European Management Journal.
Porter, M., Sachs, J., & McArthur, J. (2002). Executive summary: Competitiveness and stages of economic development. Oxford University Press.
Porter, M., Ketels, C., & Delgado, M. (2007). The microeconomic foundations of prosperity: Findings from the Business Competitiveness Index, Chapter 1.2. In The Global Competitiveness Report 2007-2008. Geneva: World Economic Forum.
Porter, M., & Schwab, K. (2008). The global competitiveness report 2008-2009. Geneva: World Economic Forum.
Román, Z. (2006). Small and medium-sized enterprises and entrepreneurship. Hungarian Central Statistical Office.
Romer, P. (1990). Endogenous technological change. Journal of Political Economy.
Rostow, W. W. (1960). The stages of economic growth: A non-communist manifesto. Cambridge: Cambridge University Press.
Sala-I-Martin, X., Blanke, J., Hanouz, M., Geiger, T., Mia, I., & Paua, F. (2007). The Global Competitiveness Index: Measuring the productive potential of nations. In M. E.
Samuelson, P. (2009). Advances in total factor productivity and entrepreneurial innovation.
Schumpeter, J. (1934). The theory of economic development. Cambridge, MA: Harvard University Press.
Shane, S., & Cable, D. (2003). Network ties, reputation, and the financing of new ventures. Management Science.
Solow, R. M. (1957). Technical change and the aggregate production function. The Review of Economics and Statistics.
Sørensen, J. B., & Sorenson, O. (2003). From conception to birth: Opportunity perception and resource mobilization in entrepreneurship. Advances in Strategic Management.
Weitzel, U., Urbig, D., Desai, S., Sanders, M., & Acs, Z. (2010). The good, the bad and the talented: Entrepreneurial talent and selfish behavior. Journal of Economic Behavior and Organization.
Weitzman, M. (1970). Soviet post war economic growth and factor substitution. American Economic Review.
Woolridge, A. (2009, March 14). Global heroes: A special report on entrepreneurship. The Economist.
*****

* Head, Dept. of Management, Raj Kumar Goel Institute of Technology for Women, Ghaziabad, U.P., Pin 201306
** Head, Dept. of Management, Rishi Chadha Viswas Girls Institute of Technology, Ghaziabad, U.P., Pin 201306
*** Lecturer, Dept. of Management, Raj Kumar Goel Institute of Technology for Women, Ghaziabad, U.P., Pin 201306


Performance Evaluation of Cache Replacement Algorithms for Cluster-Based Cross-Layer Design for Cooperative Caching (CBCC) in Mobile Ad Hoc Networks

Madhavarao Boddu, Suresh Joseph K

Department of Computer Science, School of Engineering and Technology, Pondicherry University {,}

Abstract: The Cluster-Based Cross-layer design for Cooperative Caching (CBCC) approach is used to improve data accessibility and to reduce query delay in MANETs. An efficient cache replacement algorithm plays a major role in reducing query delay and improving data accessibility. This paper presents a comparative evaluation of the cache replacement algorithms LRU, LRU-MIN, and LNC-R-W3-U based on Hit Ratio (HR) and Delay Savings Ratio (DSR) with respect to variable cache sizes; the AODV routing protocol is used for path determination. The experimental results show that LNC-R-W3-U outperforms LRU and LRU-MIN in HR and DSR under variable cache sizes.
Key terms: Ad hoc networks, cross-layer design, clustering, cooperative caching, prefetching


A mobile ad hoc network (MANET) is a collection of wireless mobile nodes dynamically forming a network without the aid of any network infrastructure. In MANETs, wireless transmission is affected by attenuation, interference, and multipath propagation, and because of node mobility the topology changes dynamically. As the topology changes, routes must be updated immediately by sending control messages, which causes overhead for route discovery and maintenance. Mobile nodes are also resource-constrained in terms of power supply and storage space. First, accessing a remote information station via multi-hop communication leads to longer query latency and high energy consumption. Second, when many clients frequently access the database server, they place a high load on the server and degrade server response time. Third, multi-hop communication degrades network capacity when network partitions occur. To overcome the

above limitations, data caching is an efficient methodology for reducing query delay and bandwidth consumption. To further enhance the performance of data caching, cluster-based cross-layer design and prefetching techniques are used. The focus of our research is to improve overall network performance by reducing client query delay and response time. In this paper, a comparative evaluation of the LRU, LRU-MIN, and LNC-R-W3-U cache replacement algorithms is carried out for cluster-based cross-layer design for cooperative caching in MANETs. The rest of the paper is organized as follows: Section II describes related work. Section III gives an overview of the CBCC approach. Section IV describes the proposed cache replacement algorithm for the CBCC approach. Section V concludes the paper and suggests possible future work.

Caching has been widely used in wired networks such as the Internet to increase the performance of web services. However, existing cooperative caching schemes cannot be implemented directly in MANETs because of the resource constraints that characterize these networks, so new approaches have been proposed to tackle the challenges. Many cooperative caching proposals are available for wireless networks; they can be grouped according to the underlying routing protocol, the cache consistency management, and the cache replacement mechanism. In [1, 2], different approaches have been introduced to increase data accessibility and reduce query delay. In cooperative cache-based data access in ad hoc networks [1], a scheme is proposed that caches both data items and routing paths. Moreover,


the cache replacement algorithm used there is based only on least-recently-used information; LRU as a replacement algorithm has limitations, and the approach does not consider prefetching. In [2], a similar approach is proposed for networks integrating ad hoc networks with the Internet. Cache replacement algorithms have a direct impact on cache performance. In [3-7], a considerable number of proposals give much higher priority to data accessibility than to access latency; both factors are largely influenced by the caching scheme adopted by the cache management. In [3-7], new cache replacement algorithms are used to make the best use of the cache space, but the traditional replacement algorithms such as LRU, LFU, and LRFU have problems. Caching alone, however, is not sufficient to guarantee high data accessibility and low communication latency in a dynamic system. To overcome these drawbacks, a new approach was proposed: cluster-based cross-layer design for cooperative caching in MANETs [8]. That proposal uses LRU-MIN as the cache replacement algorithm, which has certain limitations. First, it prefers only small objects in order to raise the hit ratio. Second, it does not exploit the frequency information of memory accesses. Third, there is the overhead cost of moving cache blocks to the most-recently-used position each time a cache block is accessed. In order to address these limitations, we carry out a comparative evaluation of cache replacement algorithms (LRU, LRU-MIN, LNC-R-W3-U) based on recency and cost-based functions respectively.
The cost-based greedy algorithm (LNC-R-W3-U), which makes use of access-frequency information when evicting objects from the cache, consistently provides better performance than LRU and LRU-MIN in terms of cache hit ratio and delay savings ratio, and further enhances the performance of cluster-based cross-layer design for cooperative caching in MANETs.

CBCC is a cluster-based middleware that sits on top of the underlying network stack and provides caching and other data management services to the upper-layer applications in the MANET environment. An instance of CBCC runs in each mobile host. Network traffic information from the data link layer can be retrieved by the middleware layer for prefetching purposes.
Fig. 1. System architecture for Cluster-Based Cooperative Caching (CBCC): an application layer (App 1 ... App n); a middleware layer containing cache management (cache admission control, cache replacement, cache consistency), information search (local, cluster, remote, and global hits), and prefetching, together with the CachePath, CacheData, and HybridCache schemes; a transport layer (TCP, UDP); a network layer (routing protocols, AODV); and a data link layer (Bluetooth, 802.11, HiperLAN).

A. CBCC Architecture

Application layer: Responsible for providing an interface for users to interact with application services or networking services, using protocols such as HTTP, FTP, TFTP, and TELNET.

Middleware layer: Responsible for service location, group communication, and shared memory. In CBCC, the middleware layer consists of blocks for cache management, information search, prefetching, and clustering.

Cache management: Includes cache admission control, cache consistency maintenance, and cache replacement.
a. Cache admission control: A node caches all received data items until its cache space is full. Once the cache space is full, a received data item is not cached if a copy of it already exists within the cluster.
b. Cache replacement: When a fresh data item arrives for caching and the cache space is full, the cache replacement algorithm selects one or more cached data items to remove from the cache. The replacement process involves two steps. First, if some cached data items have become obsolete, they are removed to make space for the newly arrived item. Second, if there is still not enough cache space after all obsolete items are removed, one or more cached data items are evicted according to some criterion.
c. Cache consistency: The cache consistency strategy keeps the cached data items synchronized with the original data items in the data source.

Information search: Deals with locating and fetching the data item requested by the client.

Prefetching: Responsible for determining the data items to be prefetched from the data centre for future use.

Transport layer: Responsible for data delivery between applications in the network, using protocols such as TCP and UDP. It includes functionalities like identifying services, segmentation, sequencing, reassembly, and error correction.

Network layer: Responsible for logical addressing and path determination (routing). Routing protocols such as AODV, DSR, and DYMO perform path determination; the current system architecture uses the AODV protocol.

Data link layer: Provides transparent network services, so that the network layer can be ignorant of the network topology, and provides access to the physical networking media. It includes error checking and flow control mechanisms.

B. Cluster Formation and Maintenance

Clustering is a method used to partition the network into several virtual groups based on some predefined criterion. For cluster formation we use the least cluster change (LCC) algorithm [8], an improvement of the lowest-ID algorithm. Each mobile node has a unique ID, and the node with the lowest ID in a group is elected as the cluster head. The cluster head maintains a list with information about all the other nodes in the group. Within a cluster, the number of hops between any two nodes is at most two, and there is no direct connection between cluster heads anywhere in the network. A cluster member is just an ordinary mobile node with no extra functionality. A node common to two cluster heads is elected as a gateway, which provides the communication between the two cluster heads. Whenever a node requests a data item, the request is first checked against the cluster head's list; if the item is not in the list, the cluster head forwards the request to the other cluster via the gateway.

Fig. 2. Clustering architecture (pink nodes: cluster heads; green nodes: gateways; remaining nodes: cluster members).


By using LCC we can reduce frequent changes of the cluster head. LCC adopts the lowest-ID (LID) scheme to create clusters. If a cluster member moves out of its cluster, the existing clustering architecture is not affected. If two cluster heads come to exist within one cluster, the mobile node with the lowest ID is elected as the cluster head, and if a number of nodes move out of a cluster, they form a new cluster.
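As a rough illustration of the lowest-ID election that LCC builds on, the sketch below is a hypothetical simplification (the `neighbours` map, function name, and decision order are invented for this example; the paper's ns-2 implementation is not shown):

```python
def elect_cluster_heads(neighbours):
    """Lowest-ID clustering sketch.

    neighbours: dict mapping node ID -> set of one-hop neighbour IDs.
    Nodes decide in increasing-ID order: a node joins the lowest-ID
    cluster head within radio range, or becomes a head itself if no
    head is in range. Returns a dict mapping node ID -> head ID.
    """
    heads = set()
    head_of = {}
    for node in sorted(neighbours):
        head_neighbours = [n for n in neighbours[node] if n in heads]
        if head_neighbours:
            head_of[node] = min(head_neighbours)  # join lowest-ID head in range
        else:
            heads.add(node)                       # no head in range: become one
            head_of[node] = node
    return head_of

# Nodes 1-3 form one cluster (head 1); nodes 4-5 form another (head 4).
# A node hearing members of both clusters would act as the gateway.
membership = elect_cluster_heads({1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}})
```

Under LCC, this election would only be re-run when topology changes actually force it, which is what reduces cluster-head churn.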
C. Information Search Operation

It mainly deals with locating and fetching the data item requested by the client from the cache. Information search includes four cases.
Case 1: Local hit. A copy of the requested data item is stored in the local cache of the requester; the data item is retrieved to serve the query and no cooperation is necessary.
Case 2: Cluster hit. The requested data item is stored at a client within the requester's cluster; the requester sends a request to the cluster head, which returns the address of a client that has cached the data item.
Case 3: Remote hit. The data item is found at a client belonging to a cluster other than the requester's home cluster, along the routing path to the data source.
Case 4: Global hit. The data item is retrieved from the server.
When a client data request arrives at a mobile node, the node first checks its local cache. If the item is available locally, the reply is sent back to the client. Otherwise the request is forwarded to the neighbours based on the current cache-state information held by the cluster head. If the cluster head has the requested cache-state information, it returns a reply to the requester identifying the cluster member that holds the item.

Fig. 3. Information search operation: a client request is checked against the local cache (local hit, with a consistency check and validation from the client if the copy is not valid); on a local miss the neighbours are searched via the cluster head (cluster hit); on a cluster miss the forwarding nodes along the route are searched (remote hit); on a remote miss the data is retrieved from the data centre, after which the cache admission mechanism and the replacement algorithm (LNC-R-W3-U) are applied and the data is returned to the client.

If the data item is not available within the cluster, the request is forwarded to the other cluster through the gateway, where it is processed in the same way and the reply is sent back to the requester. Otherwise the request reaches the data centre, which processes


the data request and sends back the requested information to the client via multi-hop communication. The client then applies the cache admission control, with a consistency check in the cluster: if the same data item is already cached within the cluster, the client does not cache it; otherwise it caches the data object and sends the cached-item information to the cluster head for updating the cluster cache state.
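The four lookup cases (local, cluster, remote, global) can be sketched as a simple cascade. The data structures, parameter names, and the function `lookup` below are illustrative assumptions, not the paper's implementation:

```python
def lookup(item, local_cache, cluster_index, path_caches, data_centre):
    """Sketch of the CBCC information search cascade.

    local_cache:   items cached at the requesting node (dict)
    cluster_index: cache-state list kept by the cluster head (dict)
    path_caches:   items cached at nodes on the route to the server (dict)
    data_centre:   the origin server's database (dict)
    Returns (hit_type, value).
    """
    if item in local_cache:                # Case 1: local hit
        return "local", local_cache[item]
    if item in cluster_index:              # Case 2: cluster hit via cluster head
        return "cluster", cluster_index[item]
    if item in path_caches:                # Case 3: remote hit along the path
        return "remote", path_caches[item]
    value = data_centre[item]              # Case 4: global hit at the data centre
    local_cache[item] = value              # admission control would run here
    return "global", value
```

On a global hit the sketch caches the item locally, standing in for the admission-control and cluster-state-update steps described above.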
profit_i = (rr_i * d_i - vr_i * vd_i) / S_i    - (1)

where
rr_i : mean reference rate of document i
d_i : mean delay to fetch document i into the cache
vr_i : mean validation rate of document i
vd_i : mean validation delay for document i
S_i : size of document i
Least Recently Used (LRU): One of the most widely used cache replacement algorithms; it evicts objects based on least-recently-used information. LRU maintains a hash table for fast access to the data, with the most recently used entry at the head of the list and the least recently used entry at the tail. When a new data item is added to the cache, it is placed at the head of the list; whenever a cache hit occurs, the access time of the requested data item is updated and the item is moved to the head of the list. When the cache is full, LRU simply removes the tail element of the list.
LRU-MIN: Uses least-recently-used information with a minimal number of replacements. Like LRU, LRU-MIN maintains a list of documents sorted by the time each document was last used; the only difference is the method of selecting documents for replacement. When the cache needs to evict, LRU-MIN searches from the tail of the list and evicts only data items whose size is equal to or greater than that of the newly arrived data item. If all cached documents are smaller than the new document, the search is repeated looking for the first two documents greater than half the size of the new document. The process of halving the size threshold and doubling the number of documents to be removed is repeated until large enough documents are found for replacement.
LNC-R-W3-U: A cost-based greedy algorithm [12] that combines cache replacement and cache consistency mechanisms. The algorithm selects the documents with the least profit for replacement until enough space has been freed; the profit of a document is calculated using formula (1).
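A minimal, size-aware sketch of the LRU policy described above, using Python's OrderedDict as the recency list (an illustrative stand-in for the simulated cache; the class name and the arbitrary size/capacity units are assumptions of this example):

```python
from collections import OrderedDict

class LRUCache:
    """LRU sketch with size-aware eviction: entries are kept in recency
    order, and the least recently used entries are evicted until the
    new item fits within the capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = OrderedDict()        # key -> (value, size), oldest first

    def get(self, key):
        if key not in self.entries:
            return None                     # cache miss
        self.entries.move_to_end(key)       # refresh recency on a hit
        return self.entries[key][0]

    def put(self, key, value, size):
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        while self.entries and self.used + size > self.capacity:
            _, (_, evicted_size) = self.entries.popitem(last=False)  # evict LRU
            self.used -= evicted_size
        if self.used + size <= self.capacity:
            self.entries[key] = (value, size)
            self.used += size
```

LRU-MIN would differ only in `put`: instead of evicting strictly from the tail, it would scan the recency list for victims at least as large as the incoming item, halving the size threshold when none exist.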

We assume that the values rr_i, d_i, vr_i, and vd_i are known a priori and are not functions of time. The cache consistency algorithm is TTL-based: if a document with an expired TTL is referenced and found in the cache, its content is validated by sending a conditional GET to the server owning the document.

In order to obtain adequate cache space, the algorithm first considers for replacement all documents having just one reference sample, in increasing profit order, then all documents with two reference samples, in increasing profit order. The cache consistency algorithm sets the TTL for a newly received document i from its Expires timestamp (2); if the Expires timestamp is not available, the TTL is estimated from the document's modification history. Whenever a referenced document i has been cached longer than TTL_i units, the consistency algorithm validates the document by sending a conditional GET to the server specified in the document's URL. Whenever a new version of document i is received, the algorithm updates the sliding windows containing the last K distinct Last-Modified timestamps and the last K validation delays, and recalculates vr_i, vd_i, and TTL_i.
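Putting the profit function (1) and the replacement order together, a least-profit eviction pass can be sketched as below. The dictionary fields for the document statistics, and the single sort key combining sample count and profit, are simplifying assumptions of this sketch, not the algorithm's actual data structures:

```python
def profit(doc):
    """Net delay saving per unit size, following formula (1) as
    reconstructed: (rr_i * d_i - vr_i * vd_i) / S_i."""
    return (doc["rr"] * doc["d"] - doc["vr"] * doc["vd"]) / doc["S"]

def select_victims(docs, space_needed):
    """Pick documents to evict in increasing profit order, preferring
    documents with fewer reference samples first (as the text describes),
    until enough space has been freed. Returns the victims' names."""
    ranked = sorted(docs, key=lambda doc: (doc["samples"], profit(doc)))
    victims, freed = [], 0.0
    for doc in ranked:
        if freed >= space_needed:
            break
        victims.append(doc["name"])
        freed += doc["S"]
    return victims
```

With all statistics assumed known a priori, each eviction decision is a single sort over the cached documents; an implementation would keep a priority queue instead of re-sorting.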

In this section we evaluate the performance of the LRU, LRU-MIN and LNC-R-W3-U cache replacement algorithms in the CBCC approach using the ns-2 simulation environment. The simulation parameters used in the experiments are shown in Table 1. The simulation was carried out in a grid of 4000 m × 300 m with 50 to 100 nodes. The time interval between two consecutive queries generated from each node/client follows an exponential distribution; the mean query delay Tq is taken as 6 sec. The node density can be varied by selecting the number of nodes; we consider 70 nodes by default. The transmission bandwidth is 2 Mbps and a transmission range of 250 m is used in the simulation. Each client generates a single stream of read-only queries. After a query is sent out, the client does not generate a new query until the pending query is served. Each client accesses the data items following a Zipf distribution [9] with skewness parameter θ = 0.8. If θ = 0, clients access the data items uniformly; as θ increases, access to the data items becomes more skewed. Similar to other studies [10][11], we choose θ to be 0.8. The AODV routing protocol was used in the simulation. The nodes/clients move according to the random waypoint model. Initially, the clients are randomly distributed in the area. Each client selects a random destination and moves towards the destination with a speed selected

TABLE 1: SIMULATION PARAMETERS
Parameter                    | Default value | Range
Simulation area              | 4000 * 300 m  | -
Database size                | 750 items     | -
Cache size (KB)              | 80            | 50-450
Size of the document (Smin)  | 1 kB          | -
Size of the document (Smax)  | 10 kB         | -
Transmission range           | 250 m         | 25-250 m
Number of clients            | 70            | 50-100
Zipf-like parameter          | 0.7           | 0.5-1.0
Time-To-Live (TTL)           | 500 sec       | 200-1000 sec
Mean query delay (Tq)        | 6 sec         | 2-100 sec
Bandwidth                    | 2 Mb/s        | -
Node speed                   | 2 m/s         | 2-20 m/s

randomly from [ , ]. After the client reaches its destination, it pauses for a period of time and then repeats this movement pattern. The data are updated only by the server. The server serves the requests on a first-come-first-served (FCFS) basis. When the server sends a data item to a client, it sends the TTL value along with the data. The TTL value is set exponentially with a mean value. After the TTL expires, the client has to get the new version of the data item, either from the server or from another client that holds the data item in its cache, before serving the query. The Zipf-like distribution [9] can be expressed as
P_N(i) = (1 / i^θ) / ( Σ_{j=1}^{N} 1 / j^θ )    - (3)

Here N is the total number of data items and θ is the skewness parameter.
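For concreteness, the Zipf-like access pattern of eq. (3) and the exponential query inter-arrival described above can be generated as follows. This is an illustrative Python sketch using the paper's parameters (N = 750, θ = 0.8, Tq = 6 s); the function names are ours.

```python
import random

def zipf_pmf(N, theta):
    """Access probabilities P_N(i) proportional to 1/i^theta
    for items i = 1..N, as in eq. (3)."""
    weights = [1.0 / (i ** theta) for i in range(1, N + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def next_query(N, theta, mean_delay, rng=random):
    """One simulated client step: wait an exponentially distributed
    interval with mean `mean_delay` (Tq), then request an item drawn
    from the Zipf-like distribution."""
    wait = rng.expovariate(1.0 / mean_delay)                      # inter-query delay
    item = rng.choices(range(1, N + 1), weights=zipf_pmf(N, theta))[0]
    return wait, item
```

With θ = 0 the probabilities are uniform; as θ grows, item 1 is requested increasingly more often than the rest, matching the skewness behaviour described in the text.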
B. Performance Metrics
Performance metrics are used to evaluate and to improve the efficiency of the process. Two metrics, Hit Ratio (HR) and Delay Savings Ratio (DSR), are considered in the simulation experiments.

Hit ratio: defined as the ratio of the number of successful requests to the total number of requests:

Hit Ratio = (number of successful requests) / (total number of requests)    - (4)

Delay savings ratio: defined as the fraction of the download delay for successful requests that is saved by serving them from the cache:

DSR = Σ_i (nr_i * d_i - nv_i * c_i) / Σ_i (f_i * d_i)    - (5)

where nr_i is the number of cache-hit references to document i, f_i is the total number of references to document i, nv_i is the number of validations performed on document i, d_i is the delay to download document i, and c_i is the delay to validate it.

C. Cache Performance Comparison
We compared the performance of LRU, LRU-MIN and LNC-R-W3-U. As Fig. 4 indicates, LNC-R-W3-U consistently provides better performance than LRU and LRU-MIN for all cache sizes: it improves the Delay Savings Ratio (DSR) on average by 26.5 percent compared with LRU and by 8.12 percent compared with LRU-MIN. LNC-R-W3-U also improves the cache hit ratio, by 34.35 percent over LRU and 6.7 percent over LRU-MIN, and in addition improves the consistency of the cached documents. The performance evaluations of the various parameters are plotted graphically: in Fig. 4 the X-axis represents cache size and the Y-axis DSR, and in Fig. 5 the X-axis represents cache size and the Y-axis the hit ratio.
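The two metrics above can be computed directly from per-document counters. This is an illustrative Python sketch; the dictionary keys mirror the symbols of eqs. (4) and (5), and the list-of-dicts representation is an assumption of ours.

```python
def hit_ratio(hits, total_requests):
    """Eq. (4): successful (cache-hit) requests over all requests."""
    return hits / total_requests

def delay_savings_ratio(docs):
    """Eq. (5): DSR = sum_i(nr_i*d_i - nv_i*c_i) / sum_i(f_i*d_i).
    Each entry of `docs` describes one document i with keys
    nr (cache-hit references), f (total references), nv (validations),
    d (download delay), c (validation delay)."""
    saved = sum(x["nr"] * x["d"] - x["nv"] * x["c"] for x in docs)
    total = sum(x["f"] * x["d"] for x in docs)
    return saved / total
```

Note that validations subtract from the savings: a cache hit only saves delay net of the conditional-GET cost c_i spent keeping the document consistent.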




TABLE 2: DSR AND HIT RATIO FOR VARIOUS CACHE SIZES

Cache Size (KB) | LRU DSR | LRU HR | LRU-MIN DSR | LRU-MIN HR | LNC-R-W3-U DSR | LNC-R-W3-U HR
80              | 0.14    | 0.08   | 0.22        | 0.11       | 0.25           | 0.14
100             | 0.19    | 0.12   | 0.25        | 0.18       | 0.28           | 0.19
150             | 0.22    | 0.15   | 0.28        | 0.23       | 0.32           | 0.24
280             | 0.26    | 0.19   | 0.32        | 0.28       | 0.35           | 0.28
340             | 0.32    | 0.25   | 0.38        | 0.34       | 0.39           | 0.34
400             | 0.36    | 0.30   | 0.41        | 0.39       | 0.42           | 0.43

In this paper, a comparative performance evaluation of the LRU, LRU-MIN and LNC-R-W3-U cache replacement algorithms over the cluster-based cooperative caching (CBCC) approach in MANETs was carried out using the NS-2 simulation environment. The experimental results show that LNC-R-W3-U consistently provides better performance in terms of Delay Savings Ratio (DSR) and Hit Ratio (HR) when compared to LRU and LRU-MIN for various cache sizes.
REFERENCES
[1] G. Cao, L. Yin, and C. R. Das, "Cooperative cache-based data access in ad hoc networks," IEEE Computer, vol. 37, pp. 32-39, 2004.
[2] M. K. Denko and J. Tian, "Cross-layer design for cooperative caching in mobile ad hoc networks," in Proc. 5th IEEE Consumer Communications and Networking Conf. (CCNC), 2008, pp. 375-380.
[3] L. Yin and G. Cao, "Supporting cooperative caching in ad hoc networks," IEEE Trans. Mobile Comput., vol. 5, no. 1, pp. 77-89, Jan. 2006.
[4] J. Zhao, P. Zhang, and G. Cao, "On cooperative caching in wireless P2P networks," in Proc. 28th Int. Conf. Distributed Computing Systems (ICDCS 2008), 2008.
[5] H. Artail, H. Safa, K. Mershad, Z. Abou-Atme, and N. Sulieman, "COACS: A cooperative and adaptive caching system for MANETs," IEEE Trans. Mobile Comput., vol. 7, no. 8, pp. 961-977, Aug. 2008.
[6] N. Chand, R. C. Joshi, and M. Misra, "Cooperative caching strategy in mobile ad hoc networks based on clusters," Wireless Personal Commun., pp. 41-63, Dec. 2006.
[7] J. Tian and M. K. Denko, "Exploiting clustering and cross-layer design approaches for data caching in MANETs," in Proc. 3rd IEEE Int. Conf. Wireless and Mobile Computing, Networking and Communications (WiMob), 2007, p. 52.
[8] M. K. Denko, J. Tian, T. K. R. Nkwe, and M. S. Obaidat, "Cluster based cross-layer design for cooperative caching in mobile ad hoc networks," IEEE Systems Journal, vol. 3, no. 4, Dec. 2009.
[9] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web caching and Zipf-like distributions: Evidence and implications," in Proc. IEEE INFOCOM, pp. 126-134, Mar. 1999.
[10] L. Yin and G. Cao, "Supporting cooperative caching in ad hoc networks," in Proc. IEEE INFOCOM, pp. 2537-2547, Mar. 2004.
[11] H. Shen, S. K. Das, M. Kumar, and Z. Wang, "Cooperative caching with optimal radius in hybrid wireless networks," in Proc. NETWORKING, pp. 841-853, 2004.
[12] J. Shim, P. Scheuermann, and R. Vingralek, "Proxy cache algorithms: Design, implementation, and performance," IEEE Trans. Knowledge and Data Engineering, vol. 11, no. 4, pp. 549-562, Jul./Aug. 1999.

DSR - Delay Savings Ratio, HR - Hit Ratio

Fig. 4: Performance comparison of DSR (X-axis: cache size in KB; Y-axis: delay savings ratio; curves for LRU, LRU-MIN, LNC-R-W3-U).

Fig. 5: Performance comparison of HR (X-axis: cache size in KB; Y-axis: hit ratio; curves for LRU, LRU-MIN, LNC-R-W3-U).


Comparative Study of the phases of Wireless Intelligent Network

Rashid Ali Khan, Computer Science, M.Tech, SRMS Bareilly, Uttar Pradesh, India
Ashok Kumar Verma, Computer Science, M.Tech, SRMS Bareilly, Uttar Pradesh, India

Abstract - The primary weapon for empowering providers to deliver distinctive services with enhanced flexibility is the Wireless Intelligent Network (WIN). The Wireless Intelligent Network seeks to win and retain subscribers with a proven and scalable solution. Today's wireless subscribers are much more sophisticated telecommunications users than they were five years ago. No longer satisfied with just completing a clear call, today's subscribers demand innovative ways to use the wireless phone. Increasing complexity in telecommunications services requires ever more complex standards, and therefore better means of writing them. Over the years, scenario-driven approaches have been introduced to describe functional aspects of systems at several levels of abstraction. Their application to the early stages of design and standardization processes raises new hopes of producing concise, descriptive, maintainable, and consistent documents that can be understood by a variety of readers. In this context, this paper presents a comparative study of the four phases of WIN and comments on their successive growth and services.

Index Terms - Wireless Intelligent Network (WIN), Telecommunications Industry Association (TIA), Interim Standard (IS), Personal Communications Service (PCS), Automatic Speech Recognition (ASR)

Intelligent network (IN) solutions have revolutionized wireline networks: the rapid creation and deployment of services has become the hallmark of a wireline network based on the IN conceptual model.

INTRODUCTION
Wireless intelligent network (WIN) is a concept being developed by the Telecommunications Industry Association (TIA) Standards Committee TR45.2. The charter of this committee is to drive intelligent network (IN) capabilities, based on interim standard (IS)-41, into wireless networks. IS-41 is a standard currently being embraced by wireless providers because it facilitates roaming. Wireless service providers are deploying intelligent network technology in their networks to facilitate mobility management and to offer a variety of enhanced services to subscribers. Technical Marketing Services has recently completed a study showing that spending on wireless intelligent networks is likely to be a good investment.

WIN is an evolving network architecture. It enhances mobile services, creates new capabilities, and performs intelligent networking to fulfil customer needs. The features are controlled outside the switch, and the architecture is purposely defined for both wireline and wireless networks. Enhanced services will also entice potentially new subscribers to sign up for service and will drive up airtime through increased usage of PCS or cellular services. As the wireless market becomes increasingly competitive, rapid deployment of enhanced services becomes critical to a successful wireless strategy. Thus far, the telecommunications industry has deployed mobile networks that have focused mainly on the needs of retail consumers. These networks have advanced considerably from their analogue origins to encompass 3G mobile networks and broadband wireless networks such as WiFi and WiMAX, and are now progressing towards LTE 4G networks. While wireless networks have evolved to support the needs of the mobile user, new applications for mobile data are emerging. The wireless intelligent network (WIN) has brought successful strategies into wireless networks.

Services
The WIN protocol facilitates the development of platform-independent, transport-independent and vendor-independent WIN services such as:

A) Hands-Free, Voice-Controlled Services: Voice-controlled services employ voice-recognition technology to allow the wireless user to control features and services using spoken commands, names, and numbers. There are two main types of automatic speech recognition (ASR): speaker-dependent recognition, which requires spoken phrases unique to an individual user, and speaker-independent recognition, which works for any speaker.

B) Voice-Controlled Dialing (VCD): VCD allows a subscriber to originate calls by dialing digits using spoken commands instead of the keypad. VCD may be used during call origination or during the call itself.

C) Voice-Controlled Feature Control (VCFC): VCFC allows a subscriber to call a directory number, identify the calling party as an authorized subscriber with a mobile directory number and personal identification number (PIN), and specify feature operations via one or more feature-control strings.

D) Voice-Based User Identification (VUI): VUI permits a subscriber to place restrictions on access to services by using VUI to validate the identity of the speaker. VUI employs a form of ASR technology to validate the identity of the speaker rather than to determine what was said.

E) Calling Name Presentation (CNAP): CNAP provides the name identification of the calling party (e.g., personal name, company name, restricted, not available) to the called subscriber.

F) Password Call Acceptance (PCA): PCA is a call-screening feature that allows a subscriber to limit incoming calls to only those calling parties who can provide a valid password (a series of digits). Calls from parties who cannot provide a valid password are refused while PCA is active.

G) Selective Call Acceptance (SCA): SCA is a call-screening service that allows a subscriber to receive incoming calls only from parties whose calling party numbers (CPNs) are in an SCA screening list.
Calls without a CPN will be given call-refusal treatment while SCA is active.

A. Some basic ruling factors of WIN services

2.5G: CDMA2000's 1xRTT is the first technology for the evolution of cdmaOne 2G networks to 2.5G networks. The major impetus for 2.5G is the "always-on" capability. Being packet based, 2.5G technologies allow infrastructure and facilities to be used only when a transaction is required, rather than maintained in a session-like manner. This provides tremendous infrastructure efficiency and service delivery improvements.

3G: Third generation (3G) networks were conceived from the Universal Mobile Telecommunications Service (UMTS) concept of high speed networks enabling a variety of data-intensive applications. 3G systems consist of two main standards, CDMA2000 and W-CDMA, as well as other 3G variants such as NTT DoCoMo's Freedom of Mobile Multimedia Access (FOMA) and Time Division Synchronous Code Division Multiple Access (TD-SCDMA), used primarily in China.

AAA: Sometimes referred to as "triple-A", authentication, authorization, and accounting represent the "big three" of IP-based network management and policy administration. Authentication provides a vehicle to identify a client that requires access to some system and logically precedes authorization. The mechanism for authentication is typically the exchange of logical keys or certificates between the client and the server. Authorization follows authentication and entails determining whether the client is allowed to perform and/or request certain tasks or operations; it is therefore at the heart of policy administration. Accounting is the process of measuring resource consumption, allowing monitoring and reporting of events and usage for various purposes including billing, analysis, and ongoing policy management.
Advanced Messaging: Advanced messaging technologies will provide capabilities beyond those of SMS. In fact, many believe that messaging is the single most important application to exploit the capabilities of 3G (and beyond) networks.

Billing: Billing systems collect, rate, and calculate charges for the use of telecommunications services. For post-paid services, a collector at the switch gathers data and builds a call detail record (CDR). For prepay systems, the prepay processing system determines the appropriate charges and decrements the account accordingly. Both systems utilize a guiding process to match calls to customers' plans and a rating engine to rate individual calls.

General Packet Radio Service: General Packet Radio Service (GPRS) is a 2.5-generation packet based network technology for GSM networks. The major impetus for GPRS and other packet based mobile data technologies is the "always-on" capability. Being packet based, GPRS allows infrastructure and facilities to be used only when a transaction is required, rather than maintained in a session-like manner. This provides tremendous infrastructure efficiency and service delivery improvements.

Calling Party Pays: Calling Party Pays (CPP) is the arrangement in which the mobile subscriber does not pay for incoming calls; instead, the calling party pays for those calls. CPP is offered in many places, but has not been regulated in the United States, where Mobile Party Pays (MPP) is still predominant.

Electronic Billing Presentation and Payment: Electronic Billing Presentation and Payment (EBPP) is the use of electronic means, such as email or a short message, for rendering a bill. The advantage of EBPP over traditional means is primarily the savings to the operator in terms of the cost to produce, distribute,

and collect bills. EBPP may be used in lieu of a standard paper bill as a means to reduce operational costs.

GETS: The Government Emergency Telecommunications Service (GETS) is an organization established to support the United States National Communications System (NCS). The role of GETS is to provide specialized call processing in the event of congestion and/or outages during an emergency, crisis, or war. GETS has already established capabilities to facilitate priority call treatment for wireline/fixed networks, including the local exchange and long distance networks. In the event of an emergency, the authorities would be able to gain faster access to telecommunications resources than the everyday citizen.

Intelligent Agents: A key enabling technology for personalization, intelligent agent technology provides a mechanism for information systems to act on behalf of their users. Specifically, intelligent agents can be programmed to search, acquire, and store information on behalf of the wants and needs of users. Intelligent agents are task-oriented.

Inter-operator Messages: Inter-carrier Messaging (ICM), sometimes referred to as inter-operator or inter-network messaging, refers to the ability to transmit messages between mobile communications networks regardless of the technologies involved (CDMA, GSM, iDEN, PDC, or TDMA) and regardless of the SMSC protocols deployed (CIMD, SMPP, UCP).

IWF: The Interworking Function (IWF) acts as a gateway between the mobile network and data network infrastructure such as a WAP gateway. The IWF is used to facilitate a circuit switched connection from the MSC to the WAP gateway. In addition, the IWF can be used to support mobile originated and terminated calls for asynchronous data and fax.

Lawful Intercept: Lawful Intercept (LI), or CALEA (Communications Assistance for Law Enforcement Act), represents regulation requiring mobile network operators to enable legally authorized surveillance of communications.
This means a "wireless tap" of the communications channel for voice and/or data communications. LI is being considered in Europe and is being mandated in the USA in 2002.

LDAP: LDAP is an important protocol for IP networking and is therefore important to the development and administration of mobile data applications. An important evolution of LDAP will involve the migration to DENs, which have the potential to considerably improve directory environments.

Mobile Instant Messaging: Simply put, Mobile Instant Messaging (MIM) is the ability to engage in instant messaging from a mobile handset via various bearer technologies, which may include SMS, WAP, or GPRS. In a mobile environment, the user is constrained by bandwidth and the user interface.

Mobile IN: This module provides a brief introduction to the concepts and technologies associated with intelligent networks for mobile communications. All intelligent networking for telecommunications involves the concept of a "query/response" system. This system entails the notion of distributed intelligence wherein a database is queried for information necessary for call processing.

Mobile IP: Mobile IP is the underlying technology for support of various mobile data and wireless networking applications. For example, GPRS depends on mobile IP to enable the relay of messages to a GPRS phone via the SGSN from the GGSN without the sender needing to know the serving node's IP address.

MVNO: A Mobile Virtual Network Operator (MVNO) is a mobile operator that does not own its own spectrum and usually does not have its own network infrastructure. Instead, MVNOs have business arrangements with traditional mobile operators to buy minutes of use (MOU) for sale to their own customers.

Personal Area Networks: Personal Area Networks (PANs) are formed by wireless communications between devices by way of technologies such as Bluetooth and UWB.
PAN standards are embodied by the IEEE 802.15 family of "home wireless" standards, which superseded older infrared standards and HomeRF for dominance in this area of wireless communications.

Prepay Technology: While many technologies are involved in deploying mobile prepay (prepaid wireless service), this section provides an introduction to the various types of mobile prepay deployments and the associated technologies: point solutions, ISUP based solutions, intelligent network based solutions, handset solutions, hybrid solutions, and call detail record (CDR) based solutions.

Presence & Availability: Presence and availability technologies provide the ability to determine whether a mobile user is present in a certain location and/or available for certain events to take place, such as mobile messaging, games, and other location based services.

Mobile Positioning: The terms mobile positioning and mobile location are sometimes used interchangeably in conversation, but they are really two different things. Mobile positioning refers to determining the position of the mobile device; mobile location refers to the location estimate derived from the mobile positioning operation. There are various means of mobile positioning, which can be divided into two major categories: network based and handset based positioning.

Personalization: The goal of mobile operators is to make their service offerings increasingly personalized towards their customers. This movement is led by the need to differentiate products against fierce competition while driving improved revenue per customer. Most of the emphasis on personalized services today is placed on mobile data services enabled by technologies such as GPRS.

Service Bureaus: A telecommunications service bureau is an organization or business that offers outsourced telecommunications services on a wholesale basis to other service providers, which typically offer retail services, directly or indirectly, to end-users.
Many services may be obtained from a service bureau solutions provider. Typical candidates are new, unproven services, applications that require economies of scale, and regulation-driven applications.

Softswitch: Simply put, softswitch is the concept of separating network hardware from network software. In traditional circuit switched networks, hardware and software are not independent. Circuit switched networks rely on dedicated facilities for interconnection and are designed primarily for

voice communications. The more efficient packet based networks use the Internet Protocol (IP) to efficiently route voice and data over diverse routes and shared facilities.

Smart Cards: Smart cards in the wireless marketplace provide improved network security through user identification, a facility for storing user data, and a mechanism for recording various service data events. These capabilities enable improved service customization and portability in a secure environment, especially suited for various transaction based services. Smart cards are tamper resistant and utilize ISO-standardized Application Protocol Data Units (APDUs) to communicate with host devices via PIN codes and cryptographic keys.

SMS: Short Message Service (SMS) is a mobile data service that allows alphanumeric messaging between mobile phones and other equipment such as voice mail systems and email. SMS is a store-and-forward system: messages are sent to a Short Message Service Center (SMSC) from various devices, such as another mobile phone, or via email. The SMSC interacts with the mobile network to determine the availability and location of the user who is to receive a short message.

SS7: SS7 is a critical component of modern telecommunications systems. SS7 is a communications protocol that provides signaling and control for various network services and capabilities. While the Internet, wireless data, and related technologies have captured the attention of millions, many forget or don't realize the importance of SS7. Every call in every network is dependent on SS7, and every mobile phone user depends on SS7 for inter-network roaming. SS7 is also the "glue" that sticks circuit switched (traditional) networks together with Internet protocol based networks.

Unified Messaging: Unified messaging (UM) is the concept of bringing together all messaging media, such as voice messaging, SMS and other mobile text messaging, email, and facsimile, into a combined communications experience.
Minimally, the communications experience will take the form of a unified mailbox and/or alert service, allowing the end-user to have a single source for message delivery, repository, access, and notification.

USSD: Unstructured Supplementary Service Data (USSD) is a technology unique to GSM. It is a capability built into the GSM standard to support transmitting information over the signaling channels of the GSM network. USSD provides session-based communication, enabling a variety of applications. USSD is defined within the GSM standard in the documents GSM 02.90 (USSD Stage 1) and GSM 03.90 (USSD Stage 2).

VAS: Value-added services (VAS) are unlike core services. They have unique characteristics and relate to other services in a completely different way, and they provide benefits that core services cannot.

WAP: Wireless Application Protocol (WAP) is an enabling technology, based on the Internet client-server architecture model, for transmission and presentation of information from the World Wide Web (WWW) and other applications utilizing the Internet Protocol (IP) to a mobile phone or other wireless terminal.

Wireless 911/112: Wireless Emergency Services (WES) refers to the use of mobile positioning technology to pinpoint mobile users for the purpose of providing enhanced wireless emergency dispatch services (including fire, ambulance, and police) to mobile phone users.

Wireless Testing: The complexity of wireless networks will increase more quickly in the next five years than it did in the previous fifteen due to the rapid advent of broadband service. These networks will need to move quickly from supporting voice-centric applications to implementing data-intensive ones. This urgency comes from the need to defray the steep entry costs paid by wireless operators worldwide.

WLAN Roaming: Wireless Local Area Networks (WLANs) are increasingly becoming an attractive alternative to licensed spectrum.
Initially thought of strictly as a competitor to 3G, WLAN is now thought of as a complement to cellular based data services, providing a high bandwidth alternative to 3G at a fraction of the cost.

Comparative Study
One of the vital solutions for this highly competitive and increasingly demanding market is to build a sophisticated Wireless Intelligent Network infrastructure that can flexibly support existing and new services. This approach can reduce the load on the wireless switches; thus the WIN phases are constantly in progress, addressing growing customer requirements.

WIN: A Telecommunications Industry Association/American National Standards Institute (TIA/ANSI) standard messaging protocol that enables subscribers in ANSI-41 based wireless networks to use intelligent network services. WIN also supports the network capabilities to provide wireless activities such as automatic roaming, incoming call screening, and voice-controlled services.

CAMEL: A European Telecommunications Standards Institute (ETSI) standard messaging protocol for including IN functions in GSM mobile networks. CAMEL is used when roaming between networks, allowing the home network to monitor and control calls made by its subscribers. The CAMEL API allows roaming subscribers access to their full portfolio of Intelligent Network (IN) services. CAMEL is a relatively inexpensive method of allowing telecom operators to add new services to the existing network infrastructure. A few typical applications include: prepaid calling, personal numbering, and location dependent services.

UMTS/GSM-MAP: An ETSI standard messaging protocol used in UMTS/GSM wireless networks to communicate among network elements - Mobile Switching Center (MSC), Home Location Register (HLR), Visitor Location Register (VLR), Equipment Identity Register (EIR), Short Message Service Center (SMSC), and Authentication Center (AuC) - to support user authentication, equipment identification, and roaming. Typical applications include: Intelligent Peripheral (IP), Service Control Point (SCP), and Enhanced Services Platform.

ANSI-41: A TIA/ANSI standard messaging protocol used in Code Division Multiple Access (CDMA) and Time Division Multiple Access (TDMA) wireless networks, primarily in the Americas and parts of Asia, to communicate among network elements (MSC, HLR, VLR, EIR, SMSC) to support inter-system handoff, automatic roaming, authentication, and supplementary call features. The ANSI-41D specification (formerly known as IS-41) is primarily used in the wireless network to provide services such as automatic roaming, authentication, inter-system hand-off, and short message service. All wireless network elements use this messaging protocol to communicate. Typical applications include: Intelligent Peripheral (IP), Service Control Point (SCP), and Enhanced Services Platform.

INAP: Intelligent Network Application Protocol (INAP), an ITU-T specification, allows applications to communicate between the various nodes/functional entities of a wireline Intelligent Network. The protocol defines the operations that need to be performed between nodes/functional entities to provide Intelligent Network services. A few typical applications include: call center solutions requiring handling of special service numbers (800 and 900 services); local number portability; calling card registration and authentication, including charging and fraud management capabilities; Interactive Voice Response (IVR) systems for small and large business segments; calling name delivery; and service management systems for the study of traffic patterns as well as the generation of call reports and billing records at a central administration and billing center.
AIN protocol: AIN applications include:
A) Toll-free dialing and FreePhone facilities for subscribers;
B) Virtual Private Network services for closed user groups operating over geographically distributed facilities;
C) Universal Access Number (UAN);
D) Split charging capability, enabling the subscriber to separately charge personal and business calls made from the same instrument;
E) Call rerouting and redistribution based on traffic volume and/or time of day, suitable for telemarketing businesses and reservation centers with multiple locations;
F) Prepaid and calling card services;
G) Televoting, whereby franchisees may cast their choice over secure voice response systems, preserving privacy and saving possible travel time as well as avoiding human tampering with results and other malpractices.

Comparative study of CAMEL in four phases:

CAMEL Phase 1: Control of MO, MT and MF calls; any time interrogation.

CAMEL Phase 2: Additional EDPs; interaction between a user and a service using announcements, voice prompting, and information collection via in-band or USSD interaction; control of call duration and transfer of AoC information to the MS; the CSE can be informed about the invocation of the supplementary services ECT, CD and MPTY; for easier post-processing, charging information from a serving node can be integrated into normal call records.

CAMEL Phase 3: Support of facilities to avoid overload; capabilities to support dialed services; capabilities to handle mobility events, such as not-reachable and roaming; control of GPRS sessions and PDP contexts; control of mobile originating SMS through both CS and PS serving network entities; interworking with SoLSA (Support of Localized Service Area) (optional); the CSE can be informed about the invocation of the SS CCBS.

CAMEL Phase 4: Support of optimal routing for CS mobile-to-mobile calls; capability of the CSE to create additional parties in an existing call; capability for the CSE to create a new call unrelated to any other existing call; capabilities for the enhanced handling of call party connections; capability for the CSE to control sessions in the IP Multimedia Subsystem (IMS); enhanced CSE capability for dialed services; the capability to report basic service changes during an ongoing call; the CSE capability to select between preferred and less preferred bearer services; the capability for the CSE to control trunk originated calls; the capability for the CSE to request additional dialed digits.
Intelligent Network and Wireless Protocols

Matrix: The matrix below shows the IN wireline and wireless protocols supported by telecommunications standards.

| Codec Name | Description | Standard |
| Customized Application for Mobile network Enhanced Logic (CAMEL) | Phase 2, 3 & 4 | 3GPP TS 29.078 (v5.1.0, Release 5) |
| Universal Mobile Telecommunications System Mobile Application Part (UMTS-MAP), which includes Global System for Mobile communication Mobile Application Part (GSM-MAP) | Phase 1, 2, 2+ & 3 | 3GPP TS 29.002 V4.2.1 (2000-12); 3G TS 29.002 v3.4.0 (2000-03) |
| Wireless Intelligent Network (WIN) | Phase I & II | TIA/EIA/IS-771; TIA/EIA/IS-826 |
| ANSI-41 | ANSI-41D | TIA/EIA-41.(1-6)D, Dec 1997 |
| Intelligent Network Application Protocol (INAP) | CS-1; CS-2 | ITU-T Q.1218, release 10/95; ITU-T Q.1228, release 9/97 |

Conclusion

The first phase of WIN standards was published in 1999 and established the fundamental call models and operations required to support this flexible service architecture. Many service providers currently implement WIN Phase 1 in their networks. Examples of WIN Phase 1 services are calling name presentation and restriction, call screening, and voice-control services. Nearing completion are WIN Phase 2 standards that provide both additional service capabilities for wireless operators and greater harmonization of network capabilities and operations with emerging third-generation network requirements. WIN Phase 2 includes MSC triggers for an IN prepaid solution. WIN Phase 3 incorporates enhancements to support location-based services; these requirements are based on four service drivers: location-based charging, fleet and asset management service, enhanced call routing service, and location-based information service. WIN Phase 4 is currently in requirements review by the WIN standards group. Wireless Intelligent Networking allows the service provider to rapidly introduce new services; Mobile Pre-Pay is a common application. There are two overall standards employed today: CAMEL and WIN. Maintaining and monitoring the Common Channel Signaling (CCS) network is critical to its success, and understanding and troubleshooting the SS7 protocol is a key part of that success.

REFERENCES
[1] Dr. S. S. Riaz Ahamed, Journal of Theoretical and Applied Information Technology.
[2] Wireless Intelligent Network ppt, IEEE Trans. on WIN.
[3] Lucent Technologies Bell Lab, Service management pdf.
[4] W. H. Tranter and K. L. Kosbar, "Simulation of Communication Systems," IEEE Communications Magazine, July 1994, pp. 22-28.
[5] W. Turin, "Simulation of Error Sources in Digital Channels," IEEE J. on Selected Areas in Comm., Vol. 6, pp. 85-93, January 1988.
[6] K. Walsh and E. G. Sirer, "Staged Simulation for Improving the Scale and Performance of Wireless Network Simulations," Proc. 2003 Winter Simulation Conference, New Orleans.

Empowerment And Total Quality Management For Innovation And Success In Organisations

Abstract: While struggling with the changing climate in technological communication, with access to the Internet and electronic services, many organisations have brought in external experts who advise them on quality and restructuring. To survive in a competitive environment characterized by deregulation and converging markets, complex customer needs, corporate restructuring, and downsizing, today's organizational leaders are searching for innovative ways to enhance the creative potential of their workforce. As with total quality management and re-engineering, empowerment has become one of the mantras of the 1990s. Employee empowerment has become a topic that attracts both research academics and practitioners. An empowerment approach encourages employees to have more discretion and autonomy in organizing their own work. It also involves a quality service delivery system in which employees can face customers free of rulebooks and are encouraged to do whatever is necessary to satisfy them. However, many academics contend that implementation of the empowerment principle is a rather difficult task. In their promotional literature, earlier advocates of employee empowerment often prescribed simple, step-by-step procedures to be followed, and predicted certain success as a result. Recently, many researchers have challenged this view. They point out that property management empowerment, which was often omitted in earlier research, is a critical factor in determining success in customer satisfaction. Index Terms: Empowerment, Responsibility, Total Quality Management

Empowerment is the process of giving staff real authority in their work to achieve continuous improvement and job satisfaction in an organisation's performance, for better quality products and customer service, in order to remain competitive. Empowerment encourages and allows individuals to take personal responsibility for quick response times to consumer needs and complaints, with greater warmth and enthusiasm. In recent years, empowerment has become a separate discipline that has attracted widespread discussion.

According to Spreitzer (1992), there are four characteristics most empowered employees have in common:
1. a sense of self-determination, to choose how to do the work;
2. a sense of competence, to perform the work well;
3. a sense of meaning, to care about what they are doing; and finally
4. a sense of impact, to have influence on the larger environment.
Empowerment is a mind-set: employees have an overall feeling of psychological empowerment about their role in the organisation. Greater job autonomy, increasing meaningfulness of the job, mentoring behaviours of immediate supervisors and job satisfaction increase the organisational commitment of employees and increase their psychological empowerment in the workplace. It is observed that employee empowerment is strongly associated with the nature of the job and the leadership's commitment to developing an empowered workforce. An empowered workplace should be structured to encourage front-line employees to exercise initiative and imagination in solving problems, improving processes and meeting customer needs. There is a need to create enthusiasm and commitment through the development of organisational values and visions that are congruent with workers' values and visions. Furthermore, the role of management from this perspective is to create a culture of participation by providing a compelling mission, a structure that emphasizes flexibility and autonomy, rewards for participation and a lack of punishment for risk taking, as well as ongoing involvement programmes.


Empowerment programmes can transform a stagnant organisation into a vital one by creating a shared purpose among employees, encouraging greater collaboration and, most importantly, delivering enhanced value to customers. It has been found that organisations with a commitment to employee involvement and empowerment also have a commitment to total quality. This concept stems from the current international strategy towards total quality management (TQM). It is often based on a desire to gain competitive advantage, increase productivity and improve customer relationships through quality assurance. As to the origins of empowerment, empowered groups have often resulted from organizational change such as downsizing or the adoption of a flatter structure. Therefore, employees often perceive empowerment as receiving additional tasks. Effecting such organizational change is probably the hardest aspect of establishing TQM. However, effective empowerment can bring most organisations many successes and achievements as employees learn about the connection between their decisions, actions and customer value. In addition, they become self-directed decision makers aligned with the shared purpose. In order to stimulate employees to become involved and empowered in business improvement programmes, employees at all levels need to be given power, knowledge, information, and rewards that are relevant to business performance.


The key to achieving empowerment for improved performance is for everyone in an organization to have a clear understanding of what they are trying to achieve by empowerment and what they must do to achieve their purpose. The empowerment process management model identifies the following six key steps in planning, initiating and evaluating an organisation's initiative to extend and strengthen empowerment.

Figure 1: The Empowerment Process Management Model

These steps make a closed-loop process whose output is continuous improvement (Kinlaw, 1995):
1. Define and communicate the meaning of empowerment to every member of the organization.
2. Set goals and strategies that become the organizing framework for staff at every organizational level as they undertake their own efforts to extend and strengthen empowerment.
3. Train staff to fulfil their new roles and perform their functions in ways that are consistent with the organisation's goals for extending and strengthening empowerment.
4. Adjust the organization's structure so that it has a flatter format and creates greater autonomy and freedom to act.
5. Adjust the organization's systems (such as planning, rewarding, promoting, training, and hiring) to support the empowerment of staff.

6. Evaluate and improve the process of empowerment by measuring improvement and the perceptions of the organization's members.

These six elements in the model are linked together within a single rectangle to emphasize their relatedness. Around this large rectangle are a series of smaller rectangles which identify sources of critical inputs. The empowerment process can only be undertaken successfully if the following kinds of information and knowledge are well understood:
A) the meaning of empowerment;
B) the payoffs expected;
C) targets for empowerment, which provide a set of alternatives that everyone can use in targeting specific opportunities to empower themselves and others;
D) strategies for empowerment, which provide multiple alternatives for reaching the targets which individuals and organizations identify;
E) how controls for empowerment differ from traditional controls and how these controls can be developed;
F) the new roles and functions in which property managers and other members of the organization must become competent for their performance to be compatible with the meaning and purposes of empowerment.

The Empowerment Strategy Grid is a management tool constructed and refined from research and intervention at several US corporations. Integrating the fundamental concepts of empowerment, team building, and diversity, the Empowerment Strategy Grid facilitates organizational assessment of work team development and progress towards achieving company empowerment strategies. What follows is a comprehensive description of the Empowerment Strategy Grid and its practical application in the context of designing and implementing effective, firm-specific, empowered work teams. A demonstrated corporate need to minimize potential confusion and transform work groups into empowered work teams at a leading US multinational corporation, Multicorp, led to the development of the Empowerment Strategy Grid. The Grid is a holistic framework for analysing work groups and mapping team progress towards empowerment strategies. Capturing within its simple schematic the fundamental variables that structure and categorize empowered work teams (for example, degree and use of group activity, manager-subordinate relationships, and decision-making authority), the Grid offered a logical starting point for the organization to engage in discussions about its current environment and human resource strategies by provoking questions such as the following:
A) How interdependent are group members?
B) Do we reinforce individual or group performance?
C) Do members strongly identify with each other?
D) Do demographically or culturally diverse groups look and feel substantially different?
E) How much authority, control, and hierarchical trust do members of various groups have?

Developing the Grid as a framework for simultaneous assessment: in order to perform simultaneous assessments of Multicorp's team-building efforts and empowerment directives, we constructed the Grid from two continua. The horizontal continuum refers to the distinction described above between co-acting groups and real teams. For any organization, placement along this continuum involves analysis of group/team variables such as individual roles, nature of tasks, and problem-solving and learning styles. The Grid's vertical continuum illustrates possible team transitions from disempowered to empowered. To ensure an accurate assessment and avoid cross-organizational confusion, the disempowered-empowered continuum is derived from a firm-specific definition of empowerment. Specifically, the empowered end of Multicorp's continuum was characterized first by what employees described as decision-making authority; that is, the permission to make a particular choice from a range of options. Organizational participants frequently used the words "good" and "responsible" to describe the kind of decisions they expected to make when empowered, implying some alignment of personal values with organizational priorities. In contrast, at the disempowered end of the same continuum, Multicorp employees referred to a lack of decision-making autonomy and to management control systems that were not only disempowering but also demoralizing. Lastly, trust at the disempowered end of the continuum referred to a fear of negative consequences, often related to direct repercussions experienced in the past.

The intersection of the Grid's two continua creates four quadrants into which groups may fall following firm-specific evaluation: empowering managers (upper left), empowered work teams (upper right), platoons (lower right), and automatons (lower left). By identifying the quadrant in which a group is anchored, we have found that organizational participants involved in the change process can better assess their team model, identify gaps between espoused human resource strategy and practice, design interventions to empower teams fully, and measure the transition path into their target area of the Grid.

Figure 2: The Empowerment Strategy Grid

According to Multicorp definitions of empowerment, work groups located in the empowering managers quadrant are characterized by both the attributes of a co-acting group and a participatory management style of decision making. Group members are individuals who work in close proximity to one another, report to the same manager, or perform functionally similar, yet independent, tasks. These co-actors are empowered to contribute ideas to the decision-making process but not to influence, make, or implement final decisions. Moreover, given the dynamic of this group structure, group conflict is handled through compromise, individual domination, or authority intervention. Finally, in an environment in which output and performance are additive, compensation, rewards, and productivity measures are individually based. The pattern of empowered relations described by the empowering managers quadrant defined for Multicorp appears to be the strategy with which many managers in existing organizational structures are most adept. In response to empowerment-type initiatives, both managerial and individual employee roles shift to meet high-level expectations of empowerment.

Moving counter-clockwise and down Multicorp's empowerment-disempowerment continuum, the term automatons is attributed to Frederick II, who said in describing his soldiers, "in general, they are veritable machines with no other forward movement than that which you give them". In a state of machine-like, automatic operation, members of work groups in the automaton quadrant have clearly defined roles and tasks and limited autonomy. Overall, automatons are told how, when, and where to perform their work. They work in close proximity to one another, report to the same manager, and gain authority through function, level, or role.

Next, sliding along the co-actors-team continuum to the lower right quadrant of the Grid, the platoons quadrant incorporates those individuals who identify with, trust, and respect their team members yet are not empowered. Named by several of Multicorp's participants, "platoon" is intended to evoke imagery of a controlled or conditional work environment: teams in this quadrant see themselves as peers with varying talents who contribute synergistically to the team's performance, as competent and capable within their delineated sphere of action, and as constructively managing conflict. In contrast to empowered work teams, platoon members must report to and obey a higher authority. Ultimately, these teams are bounded by external rules and managerial controls, which often creates an "us versus them" culture. Within the team, group performance is reinforced, recognized, and rewarded. Platoon-like teams can operate very efficiently under certain conditions. Similar to empowering managers, platoons may be a viable strategy: teams are motivated to learn, members perform well with one another, and work teams produce. These teams are simply not as effective as empowered work teams in rapid, autonomous, creative problem solving, since they must comply with set rules and appeal to authority. Given the possible benefits of this type of team and the firm's competitive environment, organizational practitioners and beneficiaries must decide that empowered work teams are more effective and efficient prior to moving from platoons along the empowerment-disempowerment continuum to the empowered work teams quadrant.

Finally, in Multicorp's empowered work teams quadrant, individual roles are fluid and assigned relative to team needs and member competences, tasks are interdependent, and team members share decision-making authority based on member skills and the specific knowledge needed to manage problems. Team members may report to the same or to different managers; identification with other members may be more important than physical proximity. In this self-managed, collective autonomy, team members are often self-learners and rely on group problem solving. Team members manage conflict creatively and focus on producing integrative solutions. Members trust each other and demand fair organizational processes. In an environment in which team output is a collective goal, performance measurement, recognition, and rewards are based on the team. As a result of team member cohesion and synergistic performance, these empowered work teams often better sustain their effectiveness over time as well as drive potential improvements in productivity, satisfaction, turnover, and absenteeism.

Given the potential intrinsic and extrinsic rewards for both company and employee, the empowered work team quadrant is clearly an ideal empowerment target for many companies. As a result, organizational members at various levels are often committed, or simply chartered by senior management, to build empowered work teams. Yet implementers and presumed beneficiaries are typically overwhelmed by the enormity of the tasks, time, and resources required to develop truly empowered teams. Many seem to doubt the possibility of replacing the assumptions, habits, and practices embedded in one quadrant with those that support empowered work teams. Challenges also exist in knowing where to start and in planning interventions which stimulate, monitor, and sustain the change process. At a minimum, by providing rich, firm-specific descriptors of four possible environments, the Grid can help an organization rapidly identify both its current group status and gaps between desired strategy and practice.

Figure 3: The Empowerment Strategy Matrix (Grid Quadrant Descriptors)

The overall process stifles individual initiative and decision-making autonomy, group problem solving, and innovative ways of thinking about balancing work and family commitments. Moreover, organizational outcomes include systemic inconsistencies and confusion, individualization of manager-employee interaction, hindrance of team development, challenges to diversity, employee dissatisfaction, and turnover.



Clearly, organizational transformation from one quadrant, such as automatons, to empowered work teams is no simple feat. To make these moves, up-front assessment is critical in order to avoid the root problems associated with empowerment programmes that have no organizational meaning. Equally important are understanding the gaps between the organization's current position and its target quadrant, and then measuring and monitoring progress to close these gaps. The Empowerment Strategy Grid allows organizational practitioners to identify relevant gaps by evaluating their team models and comparing outcomes with managerial intent. But the Grid can also give guidance in measuring and monitoring the effectiveness of interventions designed to promote change. Following team model assessment, gap identification, and development planning, practitioners can use the Grid to perform iterative analyses of team progress in response to specific team-building or empowerment interventions. To do this, we strongly recommend that companies identify relevant organizational measures at the beginning of their change process. This will help anchor the continua according to internally defined goals and then determine the initial placement of the organization's groups in one of the Grid's quadrants.

Organizations, and the people in them, face uncertainty, change, complexity and huge pressures. Among the factors causing this are: the demand for higher quality and value for money (more for less); higher expectations of quality of life at work and elsewhere; increasing globalization of the economy; efforts to contain growth in public expenditure and to transfer services from the public to the private sector; the growing urgency of both equal opportunities and ecological issues, and awareness of inequities in the global economic system; and, lately, international recession. How can we find our way through this complex situation, which is at once exciting and daunting? It is likely that we shall find ways forward most successfully, first, by releasing creative energy, intelligence and initiative at every level; and second, by learning how to unite people in solving common problems, achieving common purposes, and respecting and valuing difference. Organizations which do this will have the best chance of surviving and prospering. They will attract the most able people and have the best relationships with customers. This implies a different culture: leadership that is inspiring, empowering and nurturing, rather than controlling; an atmosphere of high expectations, appreciation and excitement; a balance between yin and yang; recognition that, normally, internal competition is destructive and that there are elegant or win-win solutions; an attitude of wanting everyone to excel; and acceptance that in today's conditions we are bound to have difficult feelings, and that understanding how to deal with our feelings, and how to assist others with theirs, is a key skill. We also need to learn how to tap into the energy to improve things, so often expressed as complaint, criticism and blame, and to help people deal with feelings of hopelessness, often masquerading as cynicism. It is believed that greater employee empowerment is the breakthrough opportunity for all businesses to leverage in improving their sustainable performance. TQM programmes that do not have management commitment and employee empowerment are bound to fail. Property managers believe that, with top management commitment, involving employees in problem solving, decision making, and business operations will increase performance and productivity. To be able to participate in empowerment, employees need to be sufficiently educated. Employees should be encouraged to control their destiny and to participate in the processes of the organisation. To be effective, employees should be given power, information, knowledge and rewards that are relevant to business performance. Successful empowerment can not only drive decision-making authority down to those employees closest to the product or customer, improving quality and customer service, but also inculcate a sense of job ownership, commitment, and efficacy among empowered individuals and work teams.
TQM calls for a change of culture, with the support of management, that requires employee empowerment for quality improvement at all levels. Empowerment also leads to greater levels of satisfaction among the workforce, and empowered employees give faster and friendlier service to customers as well. There is a significant, positive relationship between success with organisational process improvement and the presence of all three cultural elements related to quality improvement: customer focus, employee empowerment, and continuous improvement. Given the potential financial and employee benefits associated with empowered work teams, empowerment may be a viable survival solution for companies facing intensifying competitive pressures and changing workforce dynamics. Yet empowerment in practice is more than just a 1990s buzzword. Instead, if leadership involves envisaging future strategy, attracting and enabling diverse organizational members at every level to embrace the leader's vision, and persistently operationalizing this vision in a meaningful and congruent manner throughout the entire organization, then empowerment is a significant leadership challenge. The Empowerment Strategy Grid helps companies like Multicorp avoid the implementation pitfalls associated with group differences, variations in the definition and degree of empowerment across an organization, and interventions which unintentionally disempower. By helping organizational practitioners assess their team development, interventions, and progress towards achieving corporate empowerment strategies, the Empowerment Strategy Grid can help companies reap the full potential of their empowerment programmes.
References
[25] Alderfer, C.P. (1986), "An intergroup perspective on group dynamics", in Lorsch, J. (Ed.).
[26] Bandura, A. (1977), "Self-efficacy: toward a unifying theory of behavior change", Psychological Review, Vol. 84, pp. 191-215.
[27] Bandura, A. (1982), "Self-efficacy mechanism in human agency", American Psychologist, Vol. 37, No. 2, pp. 122-47.
[28] Conger, J.A. and Kanungo, R.N. (1988), "The empowerment process: integrating theory and practice", Academy of Management Review, Vol. 13, No. 3, pp. 471-82.
[29] Follet, M.P. (1940), in Metcalf, H.C. and Urwick, L. (Eds).
[30] Greenhaus, J.H., Parasuraman, S. and Wormley, W.M. (1990), "Effects of race on organizational experiences, job performance evaluations, and career outcomes", Academy of Management Journal, Vol. 33, No. 1, pp. 64-86.
[31] Hackman, J.R. and Oldham, G.R. (1980), Work Redesign, Addison-Wesley, Reading, MA.
[32] Hill, L.A. (1992), Becoming a Manager: How New Managers Master the Challenges of Leadership, Harvard Business School Press, Boston, MA.
[33] Johnson, R.D. (1994), "Where's the power in empowerment?: definition, differences, and dilemmas of empowerment in the context of work-family boundary management", unpublished doctoral dissertation, Harvard University.
[34] Kotter, J.P. (1990), A Force for Change: How Leadership Differs from Management, Free Press, New York, NY.
[35] Kouzes, J.M. and Posner, B.Z. (1987), The Leadership Challenge: How to Get Extraordinary Things Done in Organizations, Jossey-Bass, San Francisco, CA.
[36] Lawler, E.E., Mohrman, S.A. and Ledford, G.E. (1992), Employee Involvement and Total Quality Management: Practice and Results in Fortune 1000 Companies, Jossey-Bass, San Francisco, CA.
[37] Maznevski, M.L. (1995), "Process and performance in multicultural teams", submitted to the Organizational Behavior Division of the Academy of Management Annual Meetings, Vancouver, BC.
[38] Miller, W.H. (1995), "General Electric; Auburn, Maine", Industry Week.
[39] Nemeth, C. (1987), "Influence processes, problem solving, and creativity", in Zanna, M., Olson, J. and Hermer, C. (Eds).
[40] "Powerless empowerment".
[41] Tajfel, H. and Turner, J.C. (1986), "The social identity of intergroup behavior", in Worchel, S. and Austin, W.G. (Eds).
[42] Thomas, K.W. and Velthouse, B.A. (1990), "Cognitive elements of empowerment: an interpretive model of intrinsic task motivation", Academy of Management Review, Vol. 15, No. 4, pp. 666-81.
[43] Treitschke, H. von (1915), The Confessions of Frederick the Great with Treitschke's Life of Frederick, G.P. Putnam and Sons, New York, NY.
[44] Tsui, A., Egan, T. and O'Reilly, C. (1992), "Being different: relational demography and organizational attachment", Administrative Science Quarterly, Vol. 37, No. 4, pp. 549-79.
[45] Watson, W.E., Kumar, K. and Michaelson, L.K. (1993), "Cultural diversity's impact on interaction processes and performance: comparing homogeneous and diverse task groups".
[46] Morgan, G. (1988), Riding the Waves of Change, Jossey-Bass, San Francisco, CA.
[47] Nixon, B. (1992), "Developing an Empowering Culture in Organizations", Empowerment in Organizations, Vol. 2, pp. 14-24.
[48] Senge, P. (1993), The Fifth Discipline, Century Business, London.
[49] Simmons, M. (1993), "Creating a New Leadership Initiative", Management Development Review, Vol. 6, No. 5.
[50] Stacey, R. (1993), Strategic Management and Organizational Dynamics, Pitman, Marshfield, MA.
[51] Batten, J. (1994), "A total quality culture", Management Review, Vol. 83, No. 5.
[52] Becker, F. (1990), The Total Workplace, Van Nostrand Reinhold, USA.
[53] Brymer, R.A. (1991), "Employee empowerment: a guest-driven leadership strategy", The Cornell HRA Quarterly, pp. 58-68.
[54] Honold, L. (1997), "A review of the literature of employee empowerment", Empowerment in Organizations, Vol. 5, No. 4.
[55] Huyton, J. and Baker, S. (1992), "Empowerment: a way to increase productivity and morale", Education Forum Proceedings on Direction 2000, Hong Kong, pp. 511-18.
[56] Jones, P. and Davies, A. (1992), "Empowerment: a study of general managers of four-star hotel properties in the UK", International Journal of Hospitality Management, Vol. 10, No. 3, pp. 211-17.
[57] Kinlaw, D.C. (1995), The Practice of Empowerment: Making the Most of Human Competence, Gower, Hampshire.
[58] Mohrman, S., Lawler, E. and Ledford, G. (1996), "Do employee involvement and TQM programmes work?", Journal of Quality and Participation, Vol. 19, No. 1, pp. 6-10.
[59] Potterfield, T.A. (1999), The Business of Employee Empowerment, Quorum Books, USA.
[60] Simmons, P. and Teare, R. (1993), "Evolving a total quality culture", International Journal of Contemporary Hospitality Management, Vol. 5, No. 3.


A New Multiple Snapshot Algorithm for Direction-of-Arrival Estimation using Smart Antenna
Lokesh L, Sandesha Karanth, Vinay T, Roopesh, Aaquib Nawaz

Abstract: In this paper, a new eigenvector algorithm for direction-of-arrival (DOA) estimation is developed, based on eigenvalue decomposition and normalization of the covariance matrix. Unlike the classical Maximum Likelihood Method (MLM) and Maximum Entropy Method (MEM) algorithms, the proposed method only involves the determination of the noise-subspace eigenvectors, which provides better resolution and bias compared to existing DOA algorithms. The performance of the proposed method is demonstrated by numerical results for widely spaced and closely spaced sources. Keywords: array signal processing, direction of arrival, MUSIC, MLM, MEM


Wireless networks face ever-changing demands on their spectrum and infrastructure resources. Increased minutes of use, capacity-intensive data applications, and the steady growth of worldwide wireless subscribers mean carriers will have to find effective ways to accommodate increased wireless traffic in their networks. However, deploying new cell sites is not the most economical or efficient means of increasing capacity. Wireless carriers have begun to explore new ways to maximize the spectral efficiency of their networks and improve their return on investment. Smart antennas have emerged as one of the leading innovations for achieving highly efficient networks that maximize capacity and improve quality and coverage. Smart antennas provide greater capacity and performance benefits than standard antennas because they can be used to customize and fine-tune antenna coverage patterns to changing traffic or radio frequency (RF) conditions.

Fig1: Smart Antenna System

A smart antenna system at the base station of a cellular mobile system is depicted in Fig. 1. It consists of a uniform linear antenna array for which the current amplitudes are adjusted by a set of complex weights using an adaptive beamforming algorithm. The adaptive beamforming algorithm optimizes the array output beam pattern such that maximum radiated power is produced in the directions of desired mobile users, and deep nulls are generated in the directions of undesired signals representing co-channel interference from mobile users in adjacent cells. Prior to adaptive beamforming, the directions of users and interferers must be obtained using a direction-of-arrival estimation algorithm. The paper is organized as follows: Section II develops the theory of smart antenna systems. Section III describes the Maximum Likelihood Method. Section IV describes the Maximum Entropy Method, and Section V describes the MUSIC method. Section VI presents performance results for the smart antenna direction-of-arrival algorithms. Finally, conclusions are given in Section VII.

A) Signal Model

$x(n) = \sum_{i=0}^{M-1} b_i(n)\, a(\theta_i) + n(n)$


Formation of Array Correlation Matrix

Fig1: Uniform Linear Array

Consider a uniform linear array geometry with L elements numbered 0, 1, ..., L-1, with half-a-wavelength spacing between the elements. Because the array elements are closely spaced, we can assume that the signals received by the different elements are correlated. A propagating wave carries a baseband signal, s(t), that is received by each array element, but at a different time instant. It is assumed that the phase of the baseband signal s(t) received at element 0 is zero; the phase of s(t) received at each of the other elements is measured with respect to the phase of the signal received at the 0th element. To measure the phase difference, it is necessary to measure the difference between the time the signal s(t) arrives at element 0 and the time it arrives at element k. The steering vector is such a measure.

The spatial covariance matrix of the antenna array can be computed as follows. Assume that the signal and the noise are uncorrelated, and that the noise is a vector of Gaussian white noise samples with zero mean. The spatial covariance matrix is then given by

$R = E[x_n x_n^H] = E[(A b_n + n_n)(A b_n + n_n)^H]$

The spatial covariance matrix is divided into signal and noise subspaces, and hence we obtain


$R = A R_{ss} A^H + \sigma_n^2 I$

where $R$ is the array correlation (spatial covariance) matrix, $A$ is the array manifold matrix, $A^H$ is the Hermitian transpose of $A$, $R_{ss}$ is the source covariance matrix, and $\sigma_n^2$ is the noise variance.
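As a hedged numerical check of this covariance model, the NumPy sketch below forms the sample covariance of synthetic ULA snapshots x(n) = A b(n) + n(n) and compares it with A Rss A^H + sigma^2 I. The array size, source angles and noise level are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

L, M, N = 8, 2, 20000                  # elements, sources, snapshots
sigma = 0.1                            # noise standard deviation (assumed)
angles = np.deg2rad([10.0, 50.0])      # assumed source directions
rng = np.random.default_rng(0)

# Array manifold A for a half-wavelength ULA: a_k(theta) = exp(j*pi*k*sin(theta))
k = np.arange(L)[:, None]
A = np.exp(1j * np.pi * k * np.sin(angles)[None, :])

# Unit-power uncorrelated sources (so Rss = I) plus white Gaussian noise
b = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
n = sigma * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
X = A @ b + n

R_sample = X @ X.conj().T / N                      # estimate of E[x x^H]
R_model = A @ A.conj().T + sigma**2 * np.eye(L)    # A Rss A^H + sigma^2 I
err = np.abs(R_sample - R_model).max()             # shrinks as N grows
```

For large N the sample covariance converges to the model, so `err` stays small; the Hermitian symmetry of R holds by construction.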

Maximum Likelihood Method

The Capon AOA estimate is known as the Minimum Variance Distortionless Response (MVDR) estimate. It is, alternatively, a maximum likelihood estimate of the power arriving from one direction while all other sources are treated as interference. The goal is thus to maximize the Signal-to-Interference Ratio (SIR) while passing the signal of interest undistorted in phase and amplitude. The source correlation matrix is assumed to be diagonal. This maximized SIR is achieved with a set of array weights given below.

$S(\theta) = [\,1,\; e^{j 2\pi d \sin\theta/\lambda},\; \ldots,\; e^{j 2\pi (L-1) d \sin\theta/\lambda}\,]^T$

The combination of all possible steering vectors forms a matrix A, known as the array manifold matrix; hence the received signal vector x(t) of (1) can be expressed in terms of A. The Capon weight vector is then


$w = \dfrac{R_{xx}^{-1}\, a(\theta)}{a^H(\theta)\, R_{xx}^{-1}\, a(\theta)}$


where $R_{xx}^{-1}$ is the inverse of the un-weighted array correlation matrix and $a(\theta)$ is the steering vector for an angle $\theta$. The MLM pseudo spectrum is given by



$P_{MLM}(\theta) = \dfrac{1}{a^H(\theta)\, R_{xx}^{-1}\, a(\theta)}$


where $a^H(\theta)$ is the Hermitian transpose of $a(\theta)$ and $R_{xx}^{-1}$ is the inverse of the autocorrelation matrix.
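As an illustration of the MLM/Capon pseudo spectrum, the NumPy sketch below builds a sample covariance matrix from synthetic ULA snapshots and scans 1/(a^H R^{-1} a) over a grid of angles. The scenario (8 elements, sources at 10 and 50 degrees, 500 snapshots) is an assumption for demonstration, not the paper's exact simulation setup.

```python
import numpy as np

def steering(L, theta):
    """Steering vector a(theta) of an L-element ULA with d = lambda/2."""
    return np.exp(1j * np.pi * np.arange(L) * np.sin(theta))

L, M, N = 8, 2, 500
true_deg = [10.0, 50.0]                # assumed true directions of arrival
rng = np.random.default_rng(1)

A = np.stack([steering(L, np.deg2rad(t)) for t in true_deg], axis=1)
b = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N)))
X = A @ b + noise                      # snapshots x(n) = A b(n) + n(n)
R = X @ X.conj().T / N                 # sample spatial covariance matrix
Rinv = np.linalg.inv(R)

grid = np.arange(-90.0, 90.5, 0.5)     # scan angles in degrees
P_mlm = np.array([1.0 / np.real(steering(L, np.deg2rad(t)).conj()
                                @ Rinv @ steering(L, np.deg2rad(t)))
                  for t in grid])
peak = grid[np.argmax(P_mlm)]          # strongest peak lies near a true DOA
```

At this (assumed) signal-to-noise ratio the spectrum shows sharp maxima close to the true directions.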

Maximum Entropy Method

This method [8] finds a power spectrum such that its Fourier transform equals the measured correlation, subject to the constraint that its entropy is maximized. The solution to this problem requires an infinite-dimensional search, so the problem has to be transformed into a finite-dimensional search. One of the algorithms proposed by Lang and McClellan has the power spectrum given by

Fig2: Proposed Eigen Vector System

As shown in Fig. 2, the algorithm estimates the covariance matrix and then performs eigenvalue decomposition to form the subspaces. The noise subspace is then normalized to obtain better resolution than other DOA algorithms. One must know in advance the number of incoming signals, or search the eigenvalues to determine it. If the number of signals is M, the number of signal eigenvalues and eigenvectors is M, and the number of noise eigenvalues and eigenvectors is L-M (L is the number of array elements). Because the Eigen Vector method exploits the noise eigenvector subspace, it is sometimes referred to as a subspace method. The eigenvalues and eigenvectors of the correlation matrix are found, and the M eigenvectors associated with the signals are separated from the L-M eigenvectors associated with the noise. The eigenvectors associated with the smallest eigenvalues are chosen to calculate the power spectrum. For uncorrelated signals, the smallest eigenvalues equal the variance of the noise. The L x (L-M) dimensional subspace spanned by the noise eigenvectors is given below.


$P_{ME}(\theta) = \dfrac{1}{S^H(\theta)\, C C^H\, S(\theta)}$


where C is a column of R^{-1} and S(θ) is the steering vector. P_ME(θ) is based on selecting one of the L array elements as a reference and attempting to find weights, applied to the remaining L-1 received signals, that permit their sum to fit the reference with minimum mean square error. Since there are L possible references, there are in general L different P_ME(θ), obtained from the L possible column selections of R^{-1}.

MUSIC Method or Eigen Vector Method

The Eigen Vector method promises to provide unbiased estimates of the number of signals, the angles of arrival, and the strengths of the waveforms. It makes the assumption that the noise in each channel is uncorrelated, making the noise correlation matrix diagonal. The incident signals, however, may be correlated, creating a non-diagonal signal correlation matrix. Under high signal correlation the traditional MUSIC algorithm breaks down, and the Eigen Vector method must be implemented to correct this weakness.


$E_N = [\, e_{M+1} \;\; e_{M+2} \;\; \cdots \;\; e_L \,]$


where $\lambda_i$ is the i-th eigenvalue. The noise-subspace eigenvectors are orthogonal to the array steering vectors at the angles of arrival. Because of this orthogonality condition, one can show that the Euclidean distance $a^H(\theta) E_N E_N^H a(\theta)$ goes to zero at each and every angle of arrival. Placing this distance expression in the denominator creates sharp peaks at

the angles of arrival. The MUSIC pseudo spectrum is given by


$P_{MUSIC}(\theta) = \dfrac{1}{a^H(\theta)\, E_N E_N^H\, a(\theta)}$


Given a case when the sources are widely apart and a small number of antenna elements is used, as shown in Fig. 3, it is found that the MNM, MEM and MUSIC methods all detect the directions of the sources, and all DOA algorithms produce good output.

Case 2: Closely spaced sources with fewer antenna elements
Table2: Input to MNM, MEM and MUSIC Method

where $a(\theta)$ is the steering vector for an angle $\theta$, $E_N$ is the matrix of noise eigenvectors, and $\lambda_i$ is the i-th eigenvalue.
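A sketch of this noise-subspace pseudo spectrum is given below for a closely spaced scenario, in the spirit of the paper's Case 2; the exact numbers (8-element half-wavelength ULA, sources at 25 and 30 degrees, 1000 snapshots) are assumptions for illustration.

```python
import numpy as np

def steering(L, theta):
    """Steering vector of an L-element ULA with half-wavelength spacing."""
    return np.exp(1j * np.pi * np.arange(L) * np.sin(theta))

L, M, N = 8, 2, 1000
true_deg = [25.0, 30.0]               # assumed closely spaced sources
rng = np.random.default_rng(2)

A = np.stack([steering(L, np.deg2rad(t)) for t in true_deg], axis=1)
b = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N)))
X = A @ b + noise
R = X @ X.conj().T / N

# eigh returns eigenvalues in ascending order, so the first L - M
# eigenvectors (smallest eigenvalues) span the noise subspace E_N.
eigvals, eigvecs = np.linalg.eigh(R)
E_N = eigvecs[:, :L - M]

grid = np.arange(-90.0, 90.5, 0.25)
def pseudo_spectrum(t):
    a = steering(L, np.deg2rad(t))
    proj = E_N.conj().T @ a           # projection onto the noise subspace
    return 1.0 / np.real(proj.conj() @ proj)

P_ev = np.array([pseudo_spectrum(t) for t in grid])
peak = grid[np.argmax(P_ev)]          # peaks appear near the true DOAs
```

The orthogonality of steering vectors to the noise subspace makes the denominator small near the true angles, producing the sharp peaks described above.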

Simulation Results
Here the DOA algorithms, namely MNM, MEM, and the new MUSIC method, are simulated using MATLAB. Assumptions: 1. The distance between antenna elements is half a wavelength, to avoid grating lobes. 2. Signal and noise are uncorrelated.

Case 1: Widely spaced sources with fewer antenna elements
Table1: Input to MNM, MEM and MUSIC Method

Number of array elements: 8
Number of sources: 2
Directions of sources: [10°, 50°]
Amplitude of sources: [1, 2] V

Number of array elements: 8
Number of sources: 2
Directions of sources: [25°, 30°]
Amplitude of sources: [1, 1] V
Fig4: Comparisons of MNM, MEM and MUSIC Method for closely spaced sources with fewer antenna elements

Given a case when the sources are closely spaced and a small number of antenna elements is used, as shown in Fig. 4, it is found that MNM and MEM perform badly while the new MUSIC method yields better output.

Case 3: Widely spaced sources with more antenna elements
Table3: Input to MNM, MEM and MUSIC Method

Fig3: Comparisons of MNM, MEM and MUSIC Method for widely spaced sources with fewer antenna elements

Number of array elements: 50
Number of sources: 2
Directions of sources: [20°, 60°]
Amplitude of sources: [1, 3] V

Given a case when the sources are widely apart and a larger number of antenna elements is used, as shown in Fig. 5, it is found that the performance of the MNM, MEM and MUSIC methods is good.

The Direction-of-Arrival (DOA) block of smart antenna systems, based on classical and subspace methods, has been presented. The new Eigen Vector method is compared with the existing MEM and Delay & Sum methods. From the MATLAB simulation results, the conclusions are: when the sources are widely spaced and few antenna elements are used, the performance of the MNM, MEM and MUSIC methods is good. When the sources are closely spaced and few antenna elements are used, the performance of MNM and MEM is worst and the Eigen Vector algorithm is best suited in this case. When the sources are widely spaced and many antenna elements are used, all algorithms perform well. When the sources are closely spaced and many antenna elements are used, the performance of MUSIC is improved.

REFERENCES
[61] R. M. Shubair and A. Al-Merri, "Robust algorithms for direction finding and adaptive beamforming: performance and optimization," Proc. of IEEE Int. Midwest Symp. Circuits & Systems (MWSCAS'04), Hiroshima, Japan, July 25-28, 2004, pp. 589-592.
[62] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, "Computationally efficient DOA estimation in a multipath environment using covariance differencing and iterative spatial smoothing," Proc. of IEEE Int. Symp. Circuits & Systems (ISCAS'05), Kobe, Japan, May 23-26, 2005, pp. 3805-3808.
[63] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, "Investigation of high-resolution DOA estimation algorithms for optimal performance of smart antenna systems," Proc. of 4th IEE Int. Conf. 3G Mobile Communications (3G'03), London, UK, 25-27 June, 2003, pp. 460-464.
[64] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, "Performance evaluation of direction finding algorithms for adaptive antenna arrays," Proc. of 10th IEEE Int. Conf. Electronics, Circuits & Systems (ICECS'03), Sharjah, United Arab Emirates, 14-17 December, 2003, Vol. 2, pp. 735-738.
[65] R. M. Shubair and A. Al-Merri, "Convergence study of adaptive beamforming algorithms for spatial interference rejection," Proc. of Int. Symp. Antenna Technology & Applied Electromagnetics (ANTEM'05), Saint-Malo, France, June 15-17, 2005.
[66] R. M. Shubair and W. Al-Jessmi, "Performance analysis of SMI adaptive beamforming arrays for smart antenna systems," Proc. of IEEE Int. Symp. Antennas & Propagation (AP-S'05), Washington, D.C., USA, July 3-8, 2005, pp. 311-314.
[67] R. M. Shubair, A. Al-Merri, and W. Al-Jessmi, "Improved adaptive beamforming using a hybrid LMS/SMI approach," Proc. of IEEE Int. Conf. Wireless and Optical Communications Networks (WOCN'05), Dubai, UAE, March 6-8, 2005, pp. 603-606.

Fig5: Comparisons of MNM, MEM and MUSIC Method for widely spaced sources with more antenna elements

Case 4: Closely spaced sources with more antenna elements
Table4: Input to MNM, MEM and MUSIC Method

Number of array elements: 70
Number of sources: 2
Directions of sources: [30°, 33°]
Amplitude of sources: [1, 2] V

Fig6: Comparisons of MNM, MEM and MUSIC Method for closely spaced sources with more antenna elements

Given a case when the sources are closely spaced and a larger number of antenna elements is used, as shown in Fig. 6, it is found that the performance of the MNM, MEM and MUSIC methods is good, because the EM wave strikes a larger number of antenna elements.


Quality Metrics for TTCN-3 and Mobile-Web Applications

Abstract A Web-based application is essentially a client-server system, which combines traditional effort logic and functionality, usually server based. This paper is designed to predict Web metrics for evaluating the efficiency and maintainability of hyperdocuments in terms of the Testing and Test Control Notation (TTCN-3) and mobile-wireless web applications. In the modern era of Information and Communication Technology (ICT), the Web and the Internet have brought significant changes in Information Technology (IT) and related scenarios. The quality of a web application can be measured from two perspectives: the programmer's view and the user's view; here maintainability is perceived by the programmers, and efficiency is experienced by the end-user. Index Terms Web-based effort estimation, Web-based design, Web metrics, E-commerce, web application.

effort measurement models for Web-based hypermedia applications based on the implementation phase of the development life cycle. For this work, we studied various size measures at different points in the development life cycle of Web-based systems to estimate effort, and these have been compared on the basis of several predictions. The main objective of design metrics is to provide basic feedback on the design being measured. In this paper we introduce some new metrics for quality factors in terms of TTCN-3 and mobile-wireless web applications. Here we calculate only two quality factors: maintainability and efficiency.

Many software quality factors have already been defined, and in this paper we define quality factors for web metrics evaluating efficiency and maintainability. It is well established that a website should be treated as a set of components. Our interest is in the nature of these components and how they affect the web site's quality. We put a lot of emphasis on maintainability and efficiency in this paper, since for most of the life of a web site it is being actively maintained. Web sites differ from most software systems in a number of ways: they are changed and updated constantly after they are first developed. As a result, almost all of the effort involved in running a web site is maintenance. We will use the following criteria for estimating maintainability and efficiency (from ISO 9126):


The diverse nature of web applications makes it difficult to measure them using existing quality measurement models. Web applications often use large numbers of reusable components, which makes traditional measurement models less relevant. Through a client Web browser, users are able to perform business operations and change the state of business data on the server. The range of Web-based applications varies enormously, from simple (static) Web sites that are essentially hypertext document presentation applications, to sophisticated high-volume e-commerce applications, often involving supply, ordering, payment, tracking and delivery of goods or the provision of services (i.e. dynamic and active Web sites). We have focused on the implementation and comparison of

terms of analysability and changeability and for locating issues, an initial set of appropriate TTCN-3 metrics has been developed. To ensure that these metrics have a clear interpretation, their development was guided by the Goal Question Metric approach: first the goals to achieve were specified, e.g. Goal 1: Improve changeability of TTCN-3 source code, or Goal 2: Improve analysability of TTCN-3. Coupling metrics are used to answer the questions of Goal 1, and counting the number of references answers the questions of Goal 2. The resulting set of metrics not only uses well-known metrics for general-purpose programming languages but also defines new TTCN-3-specific metrics. As a first step, some basic size metrics and one coupling metric are used:
A) Number of lines of TTCN-3 source code, including blank lines and comments, i.e. physical lines of code.
B) Number of test cases.
C) Number of functions.
D) Number of altsteps.
E) Number of port types.
F) Number of component types.
G) Number of data type definitions.
H) Number of templates.
I) Template coupling, which is to be computed as follows: where stmt is the sequence of behaviour entities referencing templates in a test suite, n is the number of statements in stmt, and stmt(i) denotes the i-th statement in stmt.
Template coupling measures the dependence of test behaviour and test data in the form of TTCN-3 template definitions. On the basis of template coupling we have calculated the quality factors and sub-factors of maintainability and efficiency.

A. MAINTAINABILITY
Web-based software applications have a higher frequency of new releases, or update rate. Maintainability is a set of attributes that bear on the effort needed to make specified modifications (ISO 9126: 1991, 4.5). The ability to identify and fix a fault within a software component is what the maintainability characteristic addresses. The sub-characteristics of maintainability are analysability, changeability, stability and testability.
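The basic size metrics A)-H) can be sketched as simple keyword counts over TTCN-3 source text. A real tool would parse the core notation; the regexes and the sample module below are simplifying assumptions invented for illustration.

```python
import re

# Hypothetical TTCN-3 snippet, invented for this example.
TTCN3_SAMPLE = """
module DemoTest {
    type port CmdPort message { inout charstring }
    type component MTC { port CmdPort p }

    template charstring t_greeting := "hello"

    function f_setup() runs on MTC {}
    altstep a_default() runs on MTC { [] p.receive { repeat } }

    testcase tc_hello() runs on MTC {
        f_setup();
        p.send(t_greeting);
    }
}
"""

def size_metrics(src: str) -> dict:
    """Naive keyword-based counts of the basic TTCN-3 size metrics."""
    return {
        "loc": len(src.splitlines()),   # physical lines, incl. blanks/comments
        "testcases": len(re.findall(r"\btestcase\s+\w+", src)),
        "functions": len(re.findall(r"\bfunction\s+\w+", src)),
        "altsteps": len(re.findall(r"\baltstep\s+\w+", src)),
        "port_types": len(re.findall(r"\btype\s+port\s+\w+", src)),
        "component_types": len(re.findall(r"\btype\s+component\s+\w+", src)),
        "templates": len(re.findall(r"\btemplate\s+\w+", src)),
    }

m = size_metrics(TTCN3_SAMPLE)
```

The keyword approach misses edge cases (keywords inside strings or comments), which is acceptable for a first rough measurement pass but not for a production metrics tool.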
a. Analysability
Analysability is measured as the attributes of the software that have a bearing on the effort needed for

A) Maintainability: i) Analysability ii) Changeability iii) Stability iv) Testability
B) Efficiency: i) Time-based efficiency ii) Resource-based efficiency
We now look at each of these criteria and their metrics in the following subsections, which are suitable for TTCN-3-specific metrics and mobile web applications.

III. METRICS FOR TTCN-3

Experience with the Testing and Test Control Notation (TTCN-3) has shown that this maintenance is a non-trivial task and that its burden can be reduced by means of appropriate concepts and tool support. The test specification and test implementation language TTCN-3 has the look and feel of a typical general-purpose programming language, i.e. it is based on a textual syntax referred to as the core notation. For assessing the overall quality of software, metrics can be used. Since this article treats quality characteristics such as maintainability of TTCN-3 test specifications, only internal product attributes are considered in the following. For assessing the quality of TTCN-3 test suites in

diagnosis and modification of deficiencies and causes of failures. For optimal analysability, most templates should be inline templates.
- Metric 1.1: complexity violation :=

b. Changeability
For changeability we are interested in how easily the data, formatting and program logic in the website can be changed. For good changeability, a decoupling of test data and test behaviour might be advantageous.

a. Time-based efficiency
The time behaviour describes, for instance, processing times and throughput rates.
b. Resource-based efficiency
Resource behaviour means the amount of resources used and the duration of use.

IV. METRICS FOR MOBILE-WIRELESS WEB APPLICATIONS
Mobile and wireless devices and networks enable "any place, any time" use of information systems, providing advantages such as productivity enhancement, flexibility, service improvements and information accuracy. This research develops a methodology to define and quantify the quality components of such systems. In this section we describe the metrics development process and present examples of metrics.

c. Stability
Stability is the tolerance of the application towards unexpected effects of modifications. This metric measures the number of component variables and timers referenced by more than one function, test case, or altstep, and relates them to the overall number of component variables and timers.
- Metric: global variable and timer usage :=

d. Testability
Testability is the effort for validating modifications. There are only a few special considerations to be made when measuring testability for a web site, since the site can be tested through a web browser, exactly like black-box testing. While most of these metrics mainly describe the overall quality of test suites (an example is the template coupling metric), some of them can also be used to improve a test suite by identifying the location of individual issues.

B. EFFICIENCY
Efficiency is a set of attributes that bear on the relationship between the level of performance of the software and the amount of resources used, under stated conditions (ISO 9126: 1991, 4.4). This characteristic is concerned with the system resources used when providing the required functionality. The amount of disk space, memory, network capacity, etc. provides a good indication of this characteristic. The sub-characteristics of efficiency are time behaviour and resource utilization.

In this section we measure only two quality attributes: maintainability and efficiency.

A. Maintainability
Increasing the quality of the development processes and products/program code in the area of maintainability will help to lower the cost when adding a new target platform. Maintainability (ISO 9126) includes the analysability, changeability, stability, and testability sub-characteristics. This set of characteristics mainly reflects the technical stakeholders' viewpoint, such as the developers and

maintenance people.
a. Use of standard protocol

B. Efficiency
Efficiency (ISO 9126) includes the time behaviour and resource utilization sub-characteristics.
a. Time efficiency
The time behaviour sub-characteristic is very important in mobile-wireless applications because the price of each minute of data transfer is high, and users will avoid expensive systems.
i) Response time to get information from the server
ii) Response time to get information from the client
b. Resource efficiency
Mobile devices have small memory and low processing resources, so applications must be aware of these restrictions and optimize resource utilization.
i) Size of the application on the mobile device
ii) Size of memory on the mobile device
iii) Device memory cleanup after completing the task
iv) Network throughput
Finally, the limited processing and network resources require efficient use of the available resources.

V. CONCLUSION & FUTURE WORK
This paper has discussed the ISO 9126 norm with respect to the development of mobile web applications and TTCN-3. It introduced two subjects with respect to the ISO 9126 quality attributes: first, metrics for the quality factors in terms of TTCN-3; second, mobile-wireless information systems, for which the same quality factors are measured. In this paper we have calculated only two quality factors, maintainability and efficiency, in terms of TTCN-3 and mobile-wireless web applications. The research can be expanded to calculate other quality factors, such as functionality, reliability, usability and portability, in terms of TTCN-3 and mobile-wireless web applications.

REFERENCES
[1] Stefani, A., Xenos, M., "A model for assessing the quality of e-commerce systems", Proceedings of the PC-HCI 2001 Conference on Human Computer Interaction, 2001.
[2] Asunmaa, P., Inkinen, S., Nykänen, P., Päivärinta, S., Sormunen, T., Suoknuuti, M. (2002), "Introduction to mobile internet technical architecture", Wireless Personal Communications, 22, 253-259.
[3] Boehm, B.W., Brown, J.R., Kaspar, J.R., Lipow, M., MacLeod, G., Characteristics of Software Quality, North Holland, Amsterdam, 1978.
[4] Bache, R., Bazzana, G., Software Metrics for Product Assessment, McGraw-Hill, 1994.
[5] Calero, C., Ruiz, J., Piattini, M. (2004), "A web metrics survey using WQM", Proceedings ICWE 2004, LNCS 3140, Springer-Verlag, Heidelberg, 147-160.
[6] Coleman, D., Ash, D., Lowther, B., Oman, P., "Using Metrics to Evaluate Software System Maintainability", Computer, Vol. 27, No. 8, pp. 44-49.
[7] Ejiogu, L., Software Engineering with Formal Metrics, QED Publishing, 1991.
[8] Weinberg, G.M., The Psychology of Computer Programming, 1979.
[9] Hordijk, W., Wieringa, R. (2005), "Surveying the factors that influence maintainability", Proceedings of the 10th European Software Engineering Conference held jointly with the 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE'05), Lisbon, Portugal, 385-388.
[10] ISO 9000 (2000), Quality management systems - Requirements, International Organization for Standardization, Geneva, Switzerland.
[11] Eisenstein, J., Vanderdonckt, J., Puerta, A., "Applying Model-Based Techniques to the Development of UIs for Mobile Computers", Proceedings on Intelligent User Interfaces, Santa Fe, 2001.
[12] Offutt, J., "Quality Attributes of Web Software Applications", IEEE Software, March/April 2002, pp. 25-32.
[13] Satyanarayanan, M., "Fundamental Challenges in Mobile Computing", Symposium on Principles of Distributed Computing, 1996.
[14] McCall, J.A., Richards, P.K., Walters, G.F., Factors in Software Quality, Vols. 1-2, AD/A-049-014/015/05, National Technical Information Service, Springfield, VA, 1977.
[15] McConnell, S., "Real Quality for Real Engineers", IEEE Software, March/April 2002, pp. 5-7.












A Unique Pattern Matching Algorithm Using The Prime Number Approach

Nishtha Kesswani, Bhawani Shankar Gurjar

Abstract There are several pattern-matching algorithms available in the literature. The Boyer-Moore algorithm performs pattern matching from right to left and computes the bad-character and good-suffix shifts [8]. The Rabin-Karp algorithm performs pattern matching on the basis of the modulus of a number, but it suffers from the drawback that there may be spurious hits in the result. In this paper we suggest a unique prime-factor based approach to pattern matching. This algorithm gives better results in several cases as compared to contemporary pattern-matching algorithms.

Introduction
Pattern matching is the interesting problem of finding a pattern in a text. This problem becomes even more interesting when we try to solve it in optimum time and with minimum complexity. There can be many ways of finding a pattern in a text; the basic approach is to compare each element of the pattern with the text. For example, for the following text T and pattern P we have

In this way we have a large number of comparisons. If the size of the text is n and the size of the pattern is m, then in the worst case the number of comparisons required is described by the following equation:

Number of comparisons = (n - m + 1)m = nm - m² + m

So, the primary emphasis of every optimal pattern-matching algorithm is to reduce these comparisons. As the problem size increases, this becomes hard and time consuming. In this algorithm we try to make the minimum number of comparisons so that it can be an optimal pattern-finding algorithm.

The Proposed Algorithm
The main idea of this algorithm is a property of prime numbers: each natural number can be expressed uniquely as a product of primes. First, we convert the original problem into the form of numbers (we make a one-to-one mapping with natural numbers). We form a unique number from the pattern to be searched. The algorithm calculates the maximum prime factor of the pattern using


max_prime_factor, and checks whether the numeric value of the current text window is divisible by the maximum prime factor thus calculated. Only if it is, is the pattern checked against the text.

Algorithm 1: Search a pattern P in the text T
Algorithm Prime_Pattern_Search(P, T)
1. Let P[1..m] be the pattern and T[1..n] the text; let r be the radix.
2. P1 := Value(P, r, m)
3. T1 := Value(T[1..m], r, m)
4. d := max_prime_factor(P1)
5. s := n - m
6. For i := 1 to s + 1
7.   If Ti mod d = 0 then
8.     Compare P[1..m] with T[i..i+m-1]; if all positions match, print "pattern is found"
9.   Ti+1 := Value(T[i+1..i+m], r, m)
10. End For

Algorithm 2: Return the numeric value corresponding to the pattern/text
Algorithm Value(P, r, m)
1. P0 := 0
2. For i := 1 to m
3.   P0 := r * P0 + P[i]
4. End For
5. Return P0

Algorithm 3: Return the maximum prime factor of n
Algorithm max_prime_factor(n)
1. i := 2; j := 0
2. If n = 1 or n = 2, return n
3. While i <= n
4.   If n mod i = 0 then P[j] := i; j := j + 1; n := n / i; i := 2
5.   Else i := i + 1
6. End While
7. Return P[j - 1]
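The three algorithms above can be sketched as a runnable Python program. The radix value, the rolling-value update, and the sample strings are illustrative choices; the character-by-character verification after the divisibility test is what filters out spurious hits.

```python
def value(s, r=256):
    """Algorithm 2: radix-r numeric value of a string."""
    v = 0
    for ch in s:
        v = r * v + ord(ch)
    return v

def max_prime_factor(n):
    """Algorithm 3: largest prime factor of n, by trial division."""
    if n <= 2:
        return n
    largest, i = 1, 2
    while i * i <= n:
        if n % i == 0:
            largest, n = i, n // i
        else:
            i += 1
    return max(largest, n)   # any leftover n > 1 is the largest prime factor

def prime_pattern_search(pattern, text, r=256):
    """Algorithm 1: return all starting indices of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    d = max_prime_factor(value(pattern, r))
    t = value(text[:m], r)   # value of the current window
    high = r ** (m - 1)
    hits = []
    for s in range(n - m + 1):
        # candidate check: window value divisible by the max prime factor,
        # then verify character by character to avoid spurious hits
        if t % d == 0 and text[s:s + m] == pattern:
            hits.append(s)
        if s + m < n:        # rolling update of the window value
            t = r * (t - ord(text[s]) * high) + ord(text[s + m])
    return hits

print(prime_pattern_search("aba", "abacabab"))  # → [0, 4]
```

Note that trial-division factoring of the pattern's full radix value is only practical for short patterns; a production implementation would work modulo a prime, as Rabin-Karp does.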

Experimental Results
The algorithm was tested on different patterns and texts, and it was found that, compared to contemporary pattern-matching algorithms such as Rabin-Karp, this prime-number based approach is more efficient because it uses the maximum-prime-factor test. Although the worst-case complexity of this algorithm is O(nm), the same as that of the Rabin-Karp algorithm, it may produce better results in some cases: only when a window value is exactly divisible by the maximum prime factor may the pattern match the text at that position, so in other cases no further comparisons are required, reducing the total number of comparisons. With this algorithm we can solve a problem using very few comparisons compared to other pattern-matching algorithms such as Rabin-Karp, Boyer-Moore and Knuth-Morris-Pratt. All these algorithms may require a large number of comparisons, whereas the prime-number based approach requires fewer comparisons if the maximum prime factor is unique, which is the best case for this algorithm. The worst case arises if Ti is divisible by the maximum prime number at every position.

Conclusions and Future Work
Although this algorithm gives better results when the maximum prime factor test matches the text, it may also generate some spurious results. As a future enhancement, the algorithm can be generalized to remove spurious results as far as possible. One of its main drawbacks is that it can only be used for patterns and texts that can be expressed as numbers, as


this algorithm primarily uses the prime-number based approach.

References
[1] Christian Lovis, Robert H. Baud, "Fast Exact String Pattern-matching Algorithms Adapted to the Characteristics of the Medical Language", Journal of the American Medical Informatics Association, Jul-Aug 2000, v.7(4), pp. 378-391.
[2] Aho, A.V., 1990, "Algorithms for finding patterns in strings", in Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, J. van Leeuwen (ed.), Chapter 5, pp. 255-300, Elsevier, Amsterdam.
[3] Aoe, J.-I., 1994, Computer Algorithms: String Pattern Matching Strategies, IEEE Computer Society Press.
[4] Baase, S., Van Gelder, A., 1999, Computer Algorithms: Introduction to Design and Analysis, 3rd Edition, Chapter

11, pp. ??-??, Addison-Wesley Publishing Company.
[5] Baeza-Yates, R., Navarro, G., Ribeiro-Neto, B., 1999, "Indexing and Searching", in Modern Information Retrieval, Chapter 8, pp. 191-228, Addison-Wesley.
[6] Beauquier, D., Berstel, J., Chrétienne, P., 1992, Éléments d'algorithmique, Chapter 10, pp. 337-377, Masson, Paris.
[7] Cole, R., 1994, "Tight bounds on the complexity of the Boyer-Moore pattern matching algorithm", SIAM Journal on Computing, 23(5):1075-1091.
[8] Cormen, T.H., Leiserson, C.E., Rivest, R.L., 2010, Introduction to Algorithms, Chapter 34, pp. 853-885, MIT Press.
[9] Crochemore, M., Hancart, C., 1999, "Pattern Matching in Strings", in Algorithms and Theory of Computation Handbook, M.J. Atallah (ed.), Chapter 11, pp. 11-1 to 11-28, CRC Press Inc., Boca Raton, FL.


Study and Implementation of Power Control in Ad hoc Networks

Abstract
An ad hoc network facilitates communication between nodes without an established infrastructure. Nodes are connected to one another in an ad hoc fashion, and routing among the nodes is done by forwarding packets from one node to another, with the forwarding decided dynamically. Packets are transmitted among the nodes at a specified power level. Power control is the method of transmitting packets at an optimized power level so as to increase the traffic carrying capacity, reduce battery consumption and minimize interference, improving the overall performance of the system with regard to power usage. This paper describes the COMPOW (Common Power) and CLUSTERPOW (Cluster Power) protocols, two existing protocols for power control in homogeneous and non-homogeneous networks respectively. We have implemented these two protocols on the Java platform and run them for different numbers of nodes. From the implementation we obtain the power-optimal route among the nodes and the routing table for each node, for both homogeneous and non-homogeneous networks. The COMPOW protocol is an asynchronous, distributed and adaptive algorithm for calculating the common optimized power for communication among different nodes. The CLUSTERPOW protocol is designed to optimize the transmit power and establish efficient clustering and routing in non-homogeneous networks.

INTRODUCTION
1.1 WIRELESS NETWORKS
A wireless network is any computer network in which the connections between nodes are wireless. It is commonly used in telecommunication networks where wires are not the mode of connectivity among the nodes [5]. The various types of wireless networks are:
1. Wireless LAN (Local Area Network)
2. Wireless PAN (Personal Area Network)
3. Wireless MAN (Metropolitan Area Network)
4. Wireless WAN (Wide Area Network)
5. Mobile Devices Network
Wireless networks are basically used for sending information quickly with greater reliability and efficiency. Their usage ranges from overseas communication to daily communication among people through cellular phones. One of the most extensive uses of wireless networks is Internet connectivity across countries. However, wireless networks are more exposed to threats from malicious hackers and are hence vulnerable.
1.2 WIRELESS AD HOC NETWORK


Ad hoc networks are a new paradigm of wireless communication for mobile hosts (which we call nodes). In an ad hoc network there is no fixed infrastructure such as base stations or mobile switching centers. Mobile nodes that are within each other's radio range communicate directly via wireless links, while those that are far apart rely on other nodes to relay messages as routers. Node mobility in an ad hoc network causes frequent changes of the network topology [1]. In a wireless ad hoc network, routing is done at each node by forwarding data to other nodes, and the forwarding decision is made dynamically based on the connectivity among the nodes. Being decentralized and easy to set up, wireless ad hoc networks find their usage in a variety of applications where a central node is not reliable. However, in most ad hoc networks the competition among the nodes results in interference, which can be reduced using various cooperative wireless communication techniques.

1.3 POWER CONTROL IN AD HOC NETWORKS
Power control basically deals with performance within the system. The intelligent selection of the transmit power level in a network is very important for good performance. Power control aims at maximizing the traffic carrying capacity, reducing interference and latency, and increasing battery life. Power control helps combat long-term fading effects and interference. When power control is administered, a transmitter will use the minimum transmit power level required to communicate with the desired receiver. This ensures that the necessary and sufficient transmit power is used to establish link closure, which minimizes the interference caused by this transmission to others in the vicinity. This improves both bandwidth and energy consumption. However, unlike in cellular networks, where base stations make centralized decisions about power control settings, in ad hoc networks power control needs to be managed in a distributed fashion [2]. Power control is a cross-layer design problem. In the physical layer it can enhance the quality of transmission. In the network layer it can increase the range of transmission and the number of simultaneous transmissions. In the transport layer it can reduce the magnitude of interference.

1.4 PROJECT DESCRIPTION
The aim of this work is to find the lowest common power level at which an ad hoc network remains connected, for both homogeneous and non-homogeneous networks. For this there are two existing protocols [3][4]: the COMPOW protocol and the CLUSTERPOW protocol, which find the lowest common power levels for homogeneous and non-homogeneous networks respectively. We found the routing table for each node in the network and then found the optimized route using the Bellman-Ford algorithm with power as the metric. From the optimized route we show the connectivity among the nodes at different power levels and compare the connectivity and efficiency of transmission among the nodes at different power levels.

POWER CONTROL
2.1 INTRODUCTION
Power control is the intelligent selection of the lowest common power level at which an ad hoc network remains connected. The power-optimal route for a sender-receiver pair is calculated and the power level used for this transmission is set as the lowest power level for that particular transmission. In the case of multiple nodes, the power-optimal route for each transmission is calculated. The importance of power control arises from the fact that it has a major impact on the battery life and the traffic carrying capacity of the network. In the subsequent topics we discuss how power control affects the various layers.

2.2 TRANSMIT POWER
The transmit power level is the power level at which transmission among the nodes takes place. Increasing the transmit power level has its advantages: a higher transmit power level means higher signal power at the receiver end, so the signal-to-noise ratio is significantly increased and link errors are reduced. When the signals in a network keep fading, it is advantageous to use a high transmit power so that the signal received at the receiver's end is not too weak. However, high transmit power also has disadvantages: the overall battery consumption of the transmitting device will be high, and interference in the same frequency band increases drastically. Hence the need arises for an algorithm that can select an optimum transmit power level in a network.
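The route computation mentioned in Section 1.4, Bellman-Ford with power as the metric, can be sketched as follows. This is an illustrative implementation, not the authors' code; the node numbering and power costs in the usage note are hypothetical.

```python
def bellman_ford(n, edges, src):
    """Minimum total-power routes from src.
    edges: list of (u, v, power_cost) directed links among n nodes."""
    INF = float("inf")
    dist = [INF] * n
    pred = [None] * n
    dist[src] = 0
    for _ in range(n - 1):              # relax every link n-1 times
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
    return dist, pred

def route(pred, dst):
    # Reconstruct the power-optimal path by walking predecessors.
    path = []
    while dst is not None:
        path.append(dst)
        dst = pred[dst]
    return path[::-1]
```

For instance, with links (0,1) and (1,2) costing 1 power unit each and a direct link (0,2) costing 5, the two-hop route 0-1-2 (total cost 2) is preferred over the direct high-power link.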

COMPOW PROTOCOL
3.1 INTRODUCTION
The COMPOW (Common Power) protocol provides an asynchronous, distributed and adaptive algorithm which finds the smallest power level at which the network remains connected. The protocol provides bidirectionality of links and connectivity of the network, asymptotically maximizes the traffic carrying capacity, provides power-aware routes and reduces MAC layer contention [3].

3.2 CONNECTIVITY
We generate nodes randomly on a surface of area S square meters, assigning each node random x and y coordinates. For each source-destination pair we check whether the distance between them is less than m (the range in meters of each node). Then we check for interference in transmission among the nodes: we take an interference parameter n (assumed to be much less than the range m) and check that the distance between the nodes of two simultaneous transmissions is less than (1+n)*m. Suppose the rate at which the sender wants to send data packets to the receiver is a bits per second. There is a reciprocal dependency between the rate a and the range m [3]. Hence we need to decrease the value of m. However, very low values of m may result in a disconnected network. So we need to choose a value of m at which the network remains connected, which suffices for our aim of finding the lowest common power level at which the network remains connected.

3.3 ADVANTAGES
The COMPOW protocol increases the traffic carrying capacity, reduces battery consumption (i.e. increases battery life), reduces latency, reduces interference, guarantees bidirectional links, provides power-aware routes and can be used with any proactive routing protocol [3]. Another feature of the COMPOW protocol is its plug-and-play capability. It is among the very few protocols that have been implemented and tested on a real wireless test bed [3].
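The search for the smallest connecting power level described in Section 3.2 can be sketched as below, assuming a simplified unit-disk model in which each power level is represented by its transmission range r and two nodes are linked when their distance is at most r:

```python
def is_connected(points, r):
    """Depth-first search over the unit-disk graph: nodes u and v are
    linked when their Euclidean distance is at most r."""
    if not points:
        return True
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in range(len(points)):
            if v not in seen:
                dx = points[u][0] - points[v][0]
                dy = points[u][1] - points[v][1]
                if dx * dx + dy * dy <= r * r:
                    seen.add(v)
                    stack.append(v)
    return len(seen) == len(points)

def compow_level(points, levels):
    # Smallest power level (modelled by its range) keeping the network connected.
    for r in sorted(levels):
        if is_connected(points, r):
            return r
    return None
```

For three collinear nodes spaced one meter apart, a range of 0.5 leaves the network disconnected, so the protocol would settle on the next level up.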

3.4 LIMITATIONS
When the nodes in a network are clustered, the COMPOW protocol may settle for an unnecessarily high power level [3]. Even a single node outside the cluster may result in a high power level being selected for the whole network. The COMPOW protocol works only for homogeneous networks.

3.5 ENHANCEMENT
Below we describe how to find the lowest common power level in a clustered network using an existing protocol, CLUSTERPOW, which fills the loopholes of the COMPOW protocol.

CLUSTERPOW
4.1 INTRODUCTION
The CLUSTERPOW protocol provides implicit, adaptive, loop-free and distributed clustering based on the transmit power level. The routes discovered by this protocol consist of a non-increasing sequence of transmit power levels. CLUSTERPOW is an enhanced version of COMPOW: CLUSTERPOW is used in non-homogeneous networks, whereas COMPOW is used where the network is homogeneous. CLUSTERPOW can be used with both reactive and proactive routing protocols. It finds the lowest transmit power at which the network is connected [4].

4.2 CONNECTIVITY
We generate nodes randomly on a surface of area S square meters, assigning each node random x and y coordinates. The next hop in CLUSTERPOW is found by consulting the lowest-power routing table in which the destination is reachable. As we go from the source toward the destination, the power level at every intermediate node is non-increasing. That is, for every destination D, the entry (row) in the kernel routing table is copied from the lowest-power routing table in which D is reachable, i.e., has a finite metric. The kernel routing table has an additional field, the transmit power (txpower), for every entry, which indicates the power level to be used when routing packets to the next hop for that destination [4].

4.3 ADVANTAGES

The CLUSTERPOW recursive lookup scheme can be modified so that it is indeed free of infinite loops. This is done by tunneling the packet to its next hop using lower power levels, instead of sending the packet directly. One mechanism to achieve this is IP-in-IP encapsulation. Thus, while doing a recursive lookup for the next hop, we also recursively encapsulate the packet with the address of the node for which the recursive lookup is being done. The decapsulation is likewise done recursively when the packet reaches the corresponding destination [4].

CONCLUSION

The CLUSTERPOW protocol increases the network capacity, reduces battery consumption (i.e. increases battery life), reduces interference and is loop-free. It handles non-homogeneous networks. By weighing the additional relaying burden of using small hops against the interference caused by long hops, it can be shown that reducing the transmit power level is optimal for the traffic-carrying capacity of the network [4].

4.4 LIMITATIONS
CLUSTERPOW does not account for the energy consumed in transmitting packets over the network. While CLUSTERPOW takes care of the network capacity, the power consumed in processing while transmitting and receiving is typically higher than the radiative power required to actually transmit the packet [4].

4.5 ENHANCEMENTS
The limitations of the CLUSTERPOW protocol can be overcome using two existing protocols: Tunneled CLUSTERPOW and the MINPOW protocol. The MINPOW protocol reduces the energy consumption in sending packets over the network. In Tunneled CLUSTERPOW, packets are tunneled to their next hop at lower power levels using IP-in-IP encapsulation [4].
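The kernel routing table construction of Section 4.2 can be sketched as follows; the table layout (a destination mapped to a next hop and metric per power level) is an assumption made for illustration, not the authors' data structures:

```python
def build_kernel_table(tables):
    """tables: {power_level: {dest: (next_hop, metric)}}.
    For each destination, copy the entry from the lowest power level at
    which it is reachable (finite metric), and record that level as the
    txpower field of the kernel routing table."""
    kernel = {}
    for power in sorted(tables):                 # lowest power level first
        for dest, (next_hop, metric) in tables[power].items():
            if dest not in kernel and metric != float("inf"):
                kernel[dest] = (next_hop, metric, power)
    return kernel
```

For example, if B is reachable at power level 1 but C only at level 10, the kernel table keeps the low-power entry for B and records txpower 10 only for C.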

In this paper we have implemented two existing protocols for power control in ad hoc networks: COMPOW and CLUSTERPOW. We performed the simulation in Java. The protocols were simulated with a constant number of nodes, and the power level was taken as the metric to compare performance. We constructed the routing tables for the transmission of data among the nodes and calculated the minimum transmit power required for transmission. The simulation results confirm that the COMPOW protocol is better for homogeneous networks but not suitable for clustered networks, whereas CLUSTERPOW is better for non-homogeneous networks. We conclude that no single protocol supersedes the others; the performance of each protocol depends upon the scenario to which it is subjected.

REFERENCES
[1] Zhou Lidong and Haas Zygmunt J., Securing Ad Hoc Networks, IEEE Network magazine, special issue on network security, Vol. 13, No. 6, November/December 1999, pp. 24-30.
[2] Agarwal Sharad, Krishnamurthy Srikanth V., Katz Randy H., Dao Son K., Distributed Power Control in Ad-hoc Wireless Networks, Proc. of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, San Diego, CA, vol. 2, 2001, pp. 59-66.
[3] Narayanaswamy Swetha, Kawadia Vikas, Sreenivas R.S. and Kumar P.R., Power Control in Ad-hoc Networks: Theory, Architecture, Algorithm and Implementation of the COMPOW Protocol, Proc. of European Wireless Conference, 2002, pp. 156-162.
[4] Kawadia Vikas and Kumar P.R., Power Control and Clustering in Ad Hoc Networks, Proc. of IEEE INFOCOM, 2003, pp. 459-469.
[5] Goldsmith Andrea, Wireless Communications, Cambridge University Press, 2005.
[6] Tanenbaum Andrew S., Computer Networks, Prentice Hall, 2002.
[7] Bellman Richard, On a Routing Problem, Quarterly of Applied Mathematics, 1958, 16(1), pp. 87-90.
[8] Toh C.K., Ad Hoc Mobile Wireless Networks, Prentice Hall, 2002.
[9] Ankit Saha, Chirag Hota, Study and Implementation of Power Control in Ad hoc Networks, National Institute of Technology Rourkela.


Improving The Performance Of Web Log Mining By Using K-Mean Clustering With Neural Network
Vinita Shrivastava

Abstract
The World Wide Web has evolved in less than two decades into the major source of data and information for all domains. The Web has become today not only an accessible and searchable information source but also one of the most important communication channels, almost a virtual society. Web mining is a challenging activity that aims to discover new, relevant and reliable information and knowledge by investigating the Web's structure, its content and its usage. Though the web mining process is similar to data mining, the techniques, algorithms and methodologies used to mine the Web go beyond those specific to data mining, mainly because the Web contains a great amount of unstructured data and its changes are frequent and rapid. In the present work, we propose a new technique to enhance the learning capabilities and reduce the computational intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model uses a multi-layered network architecture with a back-propagation learning mechanism to discover and analyse useful knowledge from the available Web log data.

Index Terms: Clustering algorithms, data mining, unsupervised learning algorithms, online learning algorithms, neural networks, K-means clustering, web usage mining.

INTRODUCTION
Web mining is the application of machine learning techniques to web-based data for the purpose of learning or extracting knowledge. Web mining encompasses a wide variety of techniques, including soft computing. Web mining methodologies can generally be classified into one of three distinct categories: web usage mining, web structure mining, and web content mining. In web usage mining the goal is to examine web page usage patterns in order to learn about a web system's users or the relationships between the documents. For example, the tool presented creates association rules from web access logs, which store the identity of pages accessed by users along with other information such as when the pages were accessed and by whom; these logs are the focus of the data mining effort, rather than the actual web pages themselves. Rules created by this method could include, for example, "70% of the users that visited page A also visited page B". Web usage mining is useful for providing personalized web services, an area of web mining research that has lately become active. It promises to help tailor web services, such as web search engines, to the preferences of each individual user. In the second category of web mining methodologies, web structure mining, we examine only the relationships between web documents by utilizing the information conveyed by each document's hyperlinks. Data mining is a set of techniques and tools used in the non-trivial process of extracting and presenting implicit, previously unknown, useful and reliable knowledge from large sets of data, with the object of automatically describing models and detecting tendencies and patterns [1,2]. Web mining is the set of data mining techniques applied to the Web [7]. Web usage mining is the process of applying these techniques to detect usage patterns of web pages [3,5]. Web usage mining uses the data stored in the log files of the web server as its first resource; in these files the web server registers each access to a resource on the server by the users [4,6].
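Before pattern discovery, the log records described above are typically grouped into per-user sessions; a minimal sketch using the common timeout rule (the 30-minute default is an assumption, not a value from this paper):

```python
def sessionize(events, timeout=1800):
    """events: iterable of (user, timestamp_seconds), sorted by timestamp.
    Returns {user: [session, ...]}, each session a list of timestamps;
    a gap longer than `timeout` starts a new session."""
    sessions = {}
    for user, t in events:
        user_sessions = sessions.setdefault(user, [])
        if user_sessions and t - user_sessions[-1][-1] <= timeout:
            user_sessions[-1].append(t)   # within the timeout: same session
        else:
            user_sessions.append([t])     # gap too long: start a new session
    return sessions
```

For a single client with hits at t = 0, 600 and 9000 seconds, the 8400-second gap splits the click-stream into two sessions.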

NEURAL NETWORK An Artificial Neural Network (ANN) is an information-processing paradigm that is inspired by the way biological nervous systems, such as the

brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process [9].

2.1 Architecture of neural networks
2.1.1 Feed-forward networks
Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (no loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organization is also referred to as bottom-up or top-down.

2.1.2 Feedback networks
Feedback networks can have signals traveling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organizations.

MINING WEB USAGE DATA
In Web mining, data can be collected at the server side, the client side and proxy servers. The information provided by these data sources can be used to construct several data abstractions, namely users, page-views, click-streams, and server sessions. A user is defined as a single individual accessing files from web servers through a browser. In practice, it is very difficult to uniquely and repeatedly identify users. A page-view consists of every file that contributes to the display on a user's browser at one time and is usually associated with a single user action such as a mouse click. A click-stream is a sequential series of page-view requests. A server session (or visit) is the click-stream of a single user for a particular Web site. The end of a server session is defined as the point when the user's browsing session at that site has ended [3, 10]. The process of Web usage mining can be divided into three phases: pre-processing, pattern discovery, and pattern analysis [3, 8]. Pre-processing consists of converting the usage information contained in the various available data sources into the data abstractions necessary for pattern discovery. Another task is the treatment of outliers, errors, and incomplete data that can easily occur for reasons inherent to web browsing. The data recorded in server logs reflects the (possibly concurrent) access of a Web site by multiple users, and only the IP address, agent, and server-side clickstream are available to identify users and server sessions. The Web server can also store other kinds of usage information such as cookies, which are markers generated by the Web server for individual client browsers to automatically track site visitors [3, 4]. After each user has been identified (through cookies, logins, or IP/agent analysis), the click-stream for each user must be divided into sessions. As we cannot know when the user has left the Web site, a timeout is often used as the default method of breaking a user's click-stream into sessions [2]. The next phase is pattern discovery. The methods and algorithms used in this phase have been developed in several fields such as statistics, machine learning, and databases. This phase of Web usage mining has three main operations of interest: association (i.e. which pages tend to be accessed together), clustering (i.e. finding groups of users, transactions, pages, etc.), and sequential analysis (the order in which web pages tend to be accessed) [3, 5]. The first two are the focus of our ongoing work. Pattern analysis is the last phase in the overall process of Web usage mining. In this phase the motivation is to filter out uninteresting rules or patterns found in the previous phase. Visualization techniques are useful in helping domain experts analyze the discovered patterns.
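As an illustration of the feed-forward architecture of Section 2.1.1, a single-hidden-layer forward pass with sigmoid units can be sketched in pure Python; the weights and layer sizes are arbitrary placeholders, not part of the proposed model:

```python
import math

def sigmoid(z):
    # Standard logistic activation used in back-propagation networks.
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One feed-forward pass: signals travel input -> hidden -> output only,
    with no loops, as in a feed-forward ANN."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]
```

With all-zero weights and biases every unit outputs sigmoid(0) = 0.5, which is a convenient sanity check before training.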

CONVENTIONAL METHODS USED IN WEB MINING
Clustering
Clustering is the process of partitioning a set of data into a set of meaningful subclasses known as clusters. It helps users understand the natural grouping or structure in a data set. Clustering is an unsupervised learning technique whose aim is to find structure in a collection of unlabeled data. It is used in many fields such as data mining, knowledge discovery, pattern recognition and classification [3]. A good clustering method will produce high-quality clusters, with high intra-class similarity and low inter-class similarity. The quality of clustering depends upon both the similarity measure used by the method and its implementation, and it is also measured by the method's ability to discover hidden patterns. Generally speaking, clustering techniques can be divided into two categories: pairwise clustering and central clustering. The former, also called similarity-based clustering, groups similar data instances together based on a pairwise proximity measure between data points. Examples of this category include graph-partitioning-type methods. The latter, also called centroid-based or model-based clustering, represents each cluster by a model, i.e., its centroid. Central clustering algorithms [4] are often more efficient than similarity-based clustering algorithms. We choose centroid-based clustering over similarity-based clustering because with similarity-based methods we could not efficiently obtain a desired number of clusters (e.g., 100, as set by users). Similarity-based algorithms usually have a complexity of at least O(N^2) (for computing the pairwise proximity measures), where N is the number of data instances. In contrast, centroid-based algorithms are more scalable, with a complexity of O(NKM), where K is the number of clusters and M the number of batch iterations. In addition, all these centroid-based clustering techniques have online versions, which can be suitably used for adaptive attack detection in a data environment.

The K-Means Algorithm
The K-means algorithm is one of a group of algorithms called partitioning clustering algorithms [4]. The most commonly used partitional clustering strategy is based on a square-error criterion. The general objective is to obtain the partition that, for a fixed number of clusters, minimizes the total square error. Suppose that the given set of N samples in an n-dimensional space has somehow been partitioned into K clusters {C1, C2, C3, ..., CK}. Each CK has nK samples and each sample is in exactly one cluster, so that n1 + n2 + ... + nK = N. The mean vector MK of cluster CK is defined as the centroid of the cluster:

MK = (1/nK) * Σ_{i=1..nK} xiK

where xiK is the i-th sample belonging to cluster CK. The square error for cluster CK is the sum of the squared Euclidean distances between each sample in CK and its centroid. This error is also called the within-cluster variation [5]:

eK^2 = Σ_{i=1..nK} ||xiK - MK||^2

The square error for the entire clustering space containing K clusters is the sum of the within-cluster variations:

E^2 = e1^2 + e2^2 + ... + eK^2

The basic steps of the K-means algorithm are:
1. Select an initial partition with K clusters containing randomly chosen samples, and compute the centroids of the clusters.
2. Generate a new partition by assigning each sample to the closest cluster centre.
3. Compute new cluster centres as the centroids of the clusters.
4. Repeat steps 2 and 3 until an optimum value of the criterion function is found or until the cluster membership stabilizes.

4.3 Problem identification: problems with K-means
In K-means, the free parameter is K, and the results depend on the value of K. Unfortunately, there is no general theoretical solution for finding an optimal value of K for a given data set. The algorithm can take considerable time on large data sets, it can only handle numerical data, and the result depends on the metric used to measure ||x - mi||.
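The four steps above can be sketched as follows for 2-D points; taking the first K points as initial centroids is a simplification of the random choice in step 1:

```python
def kmeans(points, k, iters=100):
    """Basic K-means for 2-D points; returns (centroids, labels)."""
    centroids = points[:k]                      # simplified initialization
    labels = [0] * len(points)
    for _ in range(iters):
        # Step 2: assign each sample to the closest cluster centre.
        labels = [min(range(k),
                      key=lambda j: (p[0] - centroids[j][0]) ** 2 +
                                    (p[1] - centroids[j][1]) ** 2)
                  for p in points]
        # Step 3: recompute each centre as the centroid of its cluster.
        new_centroids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centre
        # Step 4: stop when cluster membership (hence the centres) stabilizes.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

On two well-separated pairs of points the algorithm converges in a few iterations to the two pair means.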

PROPOSED APPROACH
In the present work, the role of the K-means algorithm is to reduce the computational intensity of the neural network by reducing the input set of samples to be learned. This is achieved by clustering the input data set using the K-means algorithm and then taking only discriminative samples from the resulting clustering schema to perform the learning process. The number of clusters can be varied to specify the coverage repartition of the samples. The number of selected samples for each class is also a parameter of the selection algorithm: for each class, we specify the number of samples to be selected according to the class size. When the clustering is achieved, samples are taken from the different obtained clusters according to their relative intra-class variance and their density. The two measurements are combined to compute a coverage factor for each cluster, and the number of samples taken from a given cluster is proportional to the computed coverage factor.

Let A be a given class to which we want to apply the proposed approach to extract S samples, and let k be the number of clusters used during the K-means clustering phase. For each generated cluster cli (i = 1..k), a relative variance Vr(cli) is computed, where Card(X) gives the cardinality of a given set X and dist(x,y) gives the distance between the two points x and y. The most commonly used distance measure is the Euclidean metric, which defines the distance between two points x = (p1, ..., pN) and y = (q1, ..., qN) from R^N as:

dist(x, y) = sqrt((p1 - q1)^2 + ... + (pN - qN)^2)

A density value Den(cli) is computed for the same cluster cli, and the coverage factor Cov(cli) is then computed by combining the two. We can clearly see that 0 <= Vr(cli) <= 1 and 0 <= Den(cli) <= 1 for any cluster cli, so the coverage factor Cov(cli) also belongs to the [0,1] interval. Hence, the number of samples selected from each cluster is determined using the expression:

Num_samples(cli) = Round(S * Cov(cli))

Let A be the input class; k the number of clusters; S the number of samples to be selected (S >= k); Sam(i) the resulting selected set of samples for cluster i; Out_sam the output set of samples selected from class A; Candidates a temporary array that contains the cluster points and their respective distances from the centroid; i, j, min, x intermediate variables; and ε a neighbourhood parameter. The proposed selection algorithm is:

1- Cluster the class A using the K-means algorithm into k clusters.
2- For each cluster cli (i = 1..k) do {
     Sam(i) := {centroid(cli)}; j := 1;
     For each x from cli do {
       Candidates[j].point := x;
       Candidates[j].location := dist(x, centroid(cli));
       j := j + 1; };
     Sort the array Candidates in descending order with respect to the values of the location field;
     j := 1;
     While (card(Sam(i)) < Num_samples(cli)) and (j < card(cli)) do {
       min := 100000;
       For each x from Sam(i) do
         { if dist(Candidates[j].point, x) < min then min := dist(Candidates[j].point, x); }
       if (min > ε) then Sam(i) := Sam(i) U {Candidates[j].point};
       j := j + 1; }
     if card(Sam(i)) < Num_samples(cli) then
       repeat { Sam(i) := Sam(i) U {Candidates[random].point} } until (card(Sam(i)) = Num_samples(cli));
   }
3- For i = 1 to k do Out_sam := Out_sam U Sam(i);
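A runnable sketch of the per-cluster selection loop (step 2) follows, under stated assumptions: Euclidean distance, a caller-supplied quota standing in for Num_samples(cli) (whose coverage-factor formula is not reproduced here), and a neighbourhood parameter eps:

```python
import math
import random

def select_samples(cluster, centroid, quota, eps):
    """Pick up to `quota` well-spread samples from one cluster.
    Points farthest from the centroid are considered first; a candidate is
    kept only if it lies more than `eps` away from every sample kept so far.
    If the quota is still unmet, it is filled with random cluster points."""
    dist = lambda a, b: math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    selected = [centroid]                                  # Sam(i) starts at the centroid
    candidates = sorted(cluster, key=lambda x: dist(x, centroid), reverse=True)
    for point in candidates:
        if len(selected) >= quota:
            break
        if min(dist(point, s) for s in selected) > eps:
            selected.append(point)
    while len(selected) < quota:                           # fill the remaining quota randomly
        selected.append(random.choice(cluster))
    return selected
```

For a cluster {(0,0), (1,0), (5,0)} centred at the origin with quota 2 and eps 0.5, the farthest point (5,0) is selected alongside the centroid, while (1,0) is never reached.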

Conclusion
In this work, we studied the possible use of neural network learning capabilities to classify web traffic data for mining. The discovery of useful knowledge, user information and server access patterns allows Web-based organizations to mine user access patterns, which helps in future development, maintenance planning and also in targeting more rigorous advertising campaigns at groups of users. Previous studies have indicated that the size of a website and its traffic often impose a serious constraint on the scalability of such methods. As the popularity of the Web continues to increase, there is a growing need to develop tools and techniques that will help improve its overall usefulness.

REFERENCES:
[1] W.J. Frawley, G. Piatetsky-Shapiro, and C.J. Matheus, Knowledge Discovery in Databases: An Overview, in Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W.J. Frawley, eds., Cambridge, Mass.: AAAI/MIT Press, pp. 1-27, 1991.
[2] Mika Klemettinen, Heikki Mannila, Hannu Toivonen, A Data Mining Methodology and Its Application to Semi-automatic Knowledge Acquisition, DEXA Workshop 1997, pp. 670-677.
[3] R. Kosala, H. Blockeel, Web Mining Research: A Survey, SIGKDD Explorations, vol. 2(1), July 2000.
[4] J. Borges, M. Levene, An Average Linear Time Algorithm for Web Usage Mining, 2000.
[5] J. Srivastava, R. Cooley, M. Deshpande, P.-N. Tan, Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data, SIGKDD Explorations, vol. 1, Jan 2000.
[6] P. Batista, M.J. Silva, Mining Web Access Logs of an On-line Newspaper, 2002.

[7] Cernuzzi, L., Molas, M.L., 2004, Integrando diferentes técnicas de Data Mining en procesos de Web Usage Mining, Universidad Católica "Nuestra Señora de la Asunción", Asunción, Paraguay.
[8] R. Iváncsy, I. Vajk, Different Aspects of Web Log Mining, 6th International Symposium of Hungarian Researchers on Computational Intelligence, Budapest, Nov. 2005.
[9] Chau, M., Chen, H., Incorporating Web Analysis Into Neural Networks: An Example in Hopfield Net Searching, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Volume 37, Issue 3, May 2007, pp. 352-358.
[10] Raju, G.T., Satyanarayana, P.S., Knowledge Discovery from Web Usage Data: Extraction of Sequential Patterns through ART1 Neural Network Based Clustering Algorithm, International Conference on Computational Intelligence and Multimedia Applications, 2007, Volume 2, 13-15 Dec. 2007, pp. 88-92.
[11] Jalali, Mehrdad, Mustapha, Norwati, Mamat, Ali, Sulaiman, Md. Nasir B., A New Classification Model for Online Predicting Users' Future Movements, International Symposium on Information Technology (ITSim 2008), 26-28 Aug. 2008, Volume 4, pp. 1-7, Kuala Lumpur, Malaysia.


Higher Education Through Entrepreneurship Development In India

Mrs Vijay, Research Scholar, Manav Bharti University, Solan, H.P. MBA (HR/Mktg), M.Phil (Management), pursuing Ph.D. (Management). vijaykharb18@gmail.com, 9253722808.
Abstract Higher education plays a very important role today: it develops new skills, new ideas and new strategies, and India's higher education system is one of the largest in the world. Entrepreneurship can play a central role in shaping institutional development; indeed, entrepreneurial engagement is a rapidly expanding and evolving aspect of higher education that requires proper support and development. Higher education is changing the face of the entire society and helping to develop the nation.

CHANGING EDUCATION SYSTEMS IN INDIA

History: The education system in India has a long history. In olden days the Gurukul system of education was popular. The Gurukul gave the highest education for human development: physical, mental and spiritual. At the Gurukuls, the teacher imparted knowledge of Religion, Scriptures, Philosophy, Literature, Statecraft, Medicine, Astrology etc. Education was free, but students gave a voluntary gurudakshina after completing their studies. Education was mostly for males; men did the outside work while women took care of children and the home. Among the popular centres of learning in those days were Takshila (medicine), Ujjain (astronomy) and Nalanda, the biggest, which taught all branches of knowledge to some 10,000 students. By the 18th century education had become widespread, reaching temples and villages in every region of the country; the subjects taught included Reading, Writing, Arithmetic, Theology, Law, Astronomy, Metaphysics, Ethics, Medical Science and Religion, and the schools were attended by students representative of all classes of society. In the 20th century the education system came under British rule, and Gandhi is said to have described the traditional educational system as a beautiful tree that was destroyed during the British rule. After Independence, education became the responsibility of the states, the central Government's role being to co-ordinate technical and higher education and to specify standards.

Milestones in the Indian education system:

Year           Initiative                                          Field/Outcome
1964           Education Commission under the chairmanship         Scientific field
               of Dr. D.S. Kothari
1976           Department of Education under the Ministry          Education policy and planning
               of Human Resource Development
1986-1992      Government                                          Education made compulsory up to 14 years
1998           PM A.B. Vajpayee                                    Setting up of the Vidya Vahini Network to link
                                                                   universities, the UGC and the CSIR; 6% of Gross
                                                                   Domestic Product (GDP) spent on primary education
2000 onwards   Government                                          More emphasis on higher education

Expenditure on Education in India: Government expenditure on education has greatly increased since the First Five-Year Plan. Much of the Government of India's spending on elementary education goes towards the payment of teachers' salaries. The Government has also established Bal Bhavans, distance education, and education for women, with 30% of seats reserved for women.

Higher Education and Entrepreneurship: Entrepreneurship is the act of being an entrepreneur, which can be defined as "one who undertakes innovations, finance and business acumen in an effort to transform innovations into economic goods". It has assumed great importance for accelerating economic growth in both developed and developing countries. It promotes capital formation and creates wealth in the country; it is the hope and dream of millions of individuals around the world; and it reduces unemployment and poverty. Entrepreneurship is the process of searching out opportunities in the marketplace and arranging the resources required to exploit them for long-term gain. It involves planning, organizing, controlling and assuming the risk of a business enterprise, and it is a creative and innovative skill of adapting and responding to the environment.

Higher education in India has evolved in distinct and divergent streams, with each stream monitored by an apex body indirectly controlled by the Ministry of Human Resource Development. There are 18 important universities called Central Universities, which are maintained by the Union Government, and the private sector is strong in Indian higher education. The National Law School, Bangalore is highly regarded, with some of its students being awarded Rhodes Scholarships to Oxford University, and the All India Institute of Medical Sciences is consistently rated the top medical school in the country. The Indian School of Business, Hyderabad and the Indian Institutes of Management (IIMs) are the top management institutes in India. The University Grants Commission Act 1956 explains: "The right of conferring or granting degrees shall be exercised only by a University established or incorporated by or under a Central Act, or a State Act, or an Institution deemed to be University or an institution specially empowered by an Act of the Parliament to confer or grant degrees. Thus, any institution which has not been created by an enactment of Parliament or a State Legislature or has not been granted the status of a Deemed to be University is not entitled to award a degree."

Accreditation for higher learning is handled by thirteen autonomous institutions established by the University Grants Commission, including:
A) All India Council for Technical Education (AICTE)
B) Indian Council for Agriculture Research (ICAR)
C) National Council for Teacher Education (NCTE)
D) Pharmacy Council of India (PCI)
E) Indian Nursing Council (INC)
F) Dentist Council of India (DCI)
G) Central Council of Homeopathy (CCH)
H) Central Council of Indian Medicine (CCIM)

Society and human life have both changed considerably in the last few years, and Information Technology has increased the pace of development. Higher education helps entrepreneurs develop entrepreneurship, and today many of the best entrepreneurs are women. A number of courses in India help to develop entrepreneurship; the MBA, for example, is one of the best. An MBA is an indicator of a person's eligibility for managerial positions and ability to deal with a heavy workload. Moreover, an employer can be confident that an applicant for such positions has a solid background of experience, because the MBA was initially designed for working professionals who want to expand their skills, knowledge, networking opportunities, confidence, innovation and creativity, or to start their own business; one of the main criteria for admission to an MBA is several years of working experience, depending on the programme. The MBA thus helps graduates handle real situations and contributes to developing the economy of our country.

Objectives (developed through education):
1. Develop and strengthen entrepreneurial qualities.
2. Analyze the environment.
3. Build entrepreneurial discipline.
4. Understand overall procedures.
5. Provide large-scale employment.
6. Mobilize capital and skills effectively.
7. Achieve balanced regional development.
8. Develop trade.
9. Promote capital formation.
10. Develop a passion for integrity and honesty.

Role of Entrepreneurship: According to Jones and Butler (1992), corporate entrepreneurship is the process by which firms notice opportunities and act to creatively organize transactions between factors of production so as to create surplus value. Different variables affect the development of entrepreneurship:
1. Economic factors: capital, labour, raw materials and markets.
2. Social factors: family, society and socio-cultural influences.
3. Psychological factors: primary and secondary needs.
4. Political factors: government rules and regulations.

Conclusion: The best education cultivates the human mind.


Concepts, Techniques, Limitations & Applications Of Data Mining

Abstract Data mining is a powerful new technique to extract useful information from large and unorganized databases. It is the search for relationships and global patterns which exist in large databases but are hidden among the very large amount of data. It is concerned with the analysis of data and the use of software techniques for generating patterns and regularities in sets of data. Data mining enables us to understand present market trends and to adopt proactive measures to gain maximum profit from them. In the present paper the concepts, techniques, limitations and applications of data mining in marketing are discussed and analysed. The paper demonstrates the ability of data mining to improve the decision-making process in the marketing field. Keywords: Mass marketing, Artificial Neural Networks, Proactive, Predictive, Data Mining Applications.

INTRODUCTION Data mining can be defined as the process of data selection and exploration and building models using vast data stores to uncover previously unknown patterns [1]. It aims to identify valid, novel, potentially useful and understandable correlations and patterns in data by combing through copious data sets to sniff out patterns that are too subtle or complex for humans to detect [2]. The existence of medical insurance fraud and abuse, for example, has led many healthcare insurers to attempt to reduce their losses by using data mining tools to help them find and track offenders [3]. Data mining can improve decision making by discovering patterns and trends in large amounts of complex data [4]. Presently most industries which sell products and services require advertising and promotion of those products and services; banks, insurance companies and retail stores are typical examples. Usually two types of techniques are used for advertisement and promotion: (i) mass marketing and (ii) direct marketing. Mass marketing uses mass media such as television, radio and newspapers to broadcast messages to the public without discrimination. It was an effective way of promotion when products were in huge demand by the public. The second technique of promotion is direct marketing. In this technique, instead of promoting to customers indiscriminately, direct marketing studies customers' characteristics and requirements and chooses certain customers as the target for promotion. It is expected that the response rate for the selected customers can be much improved. At present a large amount of information on customers is available in databanks, hence data mining can be very useful for direct marketing, and it has been used widely in direct marketing to target customers [5, 6]. In the medical community some authors refer to data mining as the process of acquiring information, whereas others refer to it as the utilization of statistical techniques within the knowledge discovery process [7]. Terror-related activities can also be detected on the web using data mining techniques [8]. Terrorist cells use the internet infrastructure to exchange news and recruit new members and supporters, and it is believed that the detection of terrorists on the web might prevent further terrorist attacks. Keeping this in view, law enforcement agencies are trying to detect terrorists by monitoring ISP traffic using data mining [8].

Process of Data Mining:

The various steps [9] in the data mining process to extract useful information are: (i) Problem definition: This phase is to understand the problem and the domain environment in which it occurs. We need to define the problem clearly before proceeding further. The problem definition specifies the limits within which the problem needs to be solved, as well as the cost limitations in solving it. (ii) Creation of a database for data mining: This phase is to create a database where the data to be mined are stored for knowledge acquisition. The creation of the data mining database consumes about 50% to 90% of the overall data mining process. A data warehouse is also a kind of data storage where large amounts of data are stored for data mining. (iii) Searching the database: This phase is to select and examine important data sets of the data mining database in order to determine their feasibility for solving the problem. Searching the database is a time-consuming process and requires a good user interface and a computer system with good processing speed. (iv) Creation of a data mining model: This phase is to select variables to act as predictors. New variables are also built from the existing variables, along with defining the range of variables in order to support imprecise information. (v) Building a data mining model: This phase is to create various data mining models and to select the best of them. Building a data mining model is an iterative process. The selected model can be a decision tree, an artificial neural network or an association rule model. (vi) Evaluation of the data mining model: This phase is to evaluate the accuracy of the selected data mining model. In data mining the evaluation parameter is data accuracy, used to test the working of the model, because the information generated in the simulated environment varies from the external environment.
(vii) Deployment of the data mining model: This phase is to deploy the built and evaluated data mining model in the external working environment. A monitoring system should monitor the working of the model and produce reports about its performance; the information in these reports helps to enhance the performance of the selected model. The following figure shows the various phases in the data mining process.
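Phases (v) and (vi), building and evaluating a model, can be sketched in a few lines of Python. This is purely an illustrative sketch: the customer records are invented, and a one-nearest-neighbour classifier (one of the techniques surveyed later in this paper) stands in for the data mining model.

```python
# Illustrative sketch only: a 1-nearest-neighbour classifier stands in
# for the "data mining model", and the customer records are invented.
def nearest_neighbour(train, query):
    """Return the label of the training record closest to `query`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    record, label = min(train, key=lambda rl: dist(rl[0], query))
    return label

# (v) Build the model from (age, monthly long-distance usage) records,
# labelled with whether the customer responded to a promotion.
train = [((25, 10), "no"), ((60, 200), "yes"),
         ((30, 15), "no"), ((55, 180), "yes")]

# (vi) Evaluate the model's accuracy on held-out records.
test = [((58, 190), "yes"), ((27, 12), "no")]
accuracy = sum(nearest_neighbour(train, q) == lbl for q, lbl in test) / len(test)
```

In a real deployment (phase vii) the same accuracy check would be run periodically against fresh data to monitor the model's performance.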

Data mining process models: We need to follow a systematic approach to data mining for meaningful retrieval of data from large data banks. Several process models that provide systematic phases for data mining have been proposed by various individuals and organizations. The three most popular are: The CRISP-DM process model: CRISP-DM stands for Cross-Industry Standard Process for Data Mining. The life cycle of the CRISP-DM process model consists of six phases: (i) Understanding the business: This phase is to understand the objectives and requirements of the business problem and to generate a data mining definition for it. (ii) Understanding the data: This phase is to analyze the data collected in the first phase, study its characteristics and matching patterns, and propose a hypothesis for solving the problem. (iii) Preparing the data: This phase is to create the final datasets that are input to the various modeling tools. The raw data items are first transformed and cleaned to generate datasets in the form of tables, records and fields. (iv) Modeling: This phase is to select and apply different modeling techniques of data mining. We input the datasets from the previous phase to these modeling techniques and analyze the generated output. (v) Evaluation: This phase is to evaluate the model or set of models generated in the previous phase for better analysis of the refined data. (vi) Deployment: This phase is to organize and implement the knowledge gained from the evaluation phase in such a way that it is easy for the end users to comprehend. The 5As process model: 5As stands for Assess, Access, Analyze, Act and Automate. The 5As process model generally begins by assessing the problem in hand. The next logical step is to access or accumulate data related to the problem. We then analyze the accumulated data from different angles using various data mining techniques, extract meaningful information from the analyzed data, and act on the result to solve the problem in hand. At last we try to automate the process of data mining by building software that uses the techniques applied in the earlier stages. The six sigma process model: Six sigma is a data-driven process model that eliminates defects, waste and quality control problems that generally occur in a production environment.
Six sigma is very popular in various American industries due to its easy implementation, and it is likely to be adopted worldwide. This process model is based on various statistical techniques, the use of various types of data analysis techniques, and the systematic training of all the employees of an organization. The six sigma process model postulates a sequence of five stages called DMAIC, which stands for Define, Measure, Analyse, Improve and Control: (i) Define: This phase is to define the goals of a project along with its limitations. (ii) Measure: This phase is to collect information about the current process in which the work is done and to try to identify the basis of the problem. (iii) Analyze: This phase is to identify the root causes of the problem in hand and confirm them using various data analysis tools. (iv) Improve: This phase is to implement the solutions that address the root causes of the problem. (v) Control: This phase is to monitor the outcome of all the previous phases and suggest improvement measures for each of them. Data Mining Techniques: The most commonly used techniques [10] in data mining are: (a) Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure. (b) Decision trees: The decision tree methods include classification and regression trees (CART) and chi-square automatic interaction detection (CHAID). These resemble tree-shaped structures. (c) Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation and natural selection in a design based on the concepts of evolution. (d) Nearest neighbour method: A technique that classifies each record in a data set based on a combination of the classes of the k record(s) most similar to it in a historical dataset.
It is sometimes called the k-nearest neighbour technique. (e) Rule induction: The extraction of useful if-then rules from data based on statistical significance.

How Data Mining is Applied: The technique used in data mining is called modeling. Modeling is simply the act of building a model in one situation where the answers are known and then applying it to another situation where they are not. The following two examples may help in understanding the use of data mining for building a model of a new customer. Suppose a marketing director has a lot of information about his prospective customers, i.e. their age, sex, credit history etc. His problem is that he does not know the long-distance calling usage of these prospects (because they are most likely now customers of his competition). He would like to concentrate on those prospects that have large amounts of long-distance usage, and he can obtain this by building the model [8] shown in Table-1.

Table-1: Data mining for prospecting

The aim in prospecting is to make calculated guesses about the information in the lower right-hand quadrant, based on the model built while going from customers' general information to customers' proprietary information. With this model in hand, new customers can be selected as targets. Another common example of model building is shown in Table-2. Test marketing is an excellent source of data for this kind of modeling: by mining the results of a test market, which represents a broad but relatively small sample of prospects, one can provide a base for identifying good prospects in the overall market.

Table-2: Data mining for predictions

Data mining can also be used in direct marketing to obtain higher profit than mass marketing. Suppose the whole database contains 300,000 customers. In direct marketing, only the 20% of customers (60,000) identified as likely buyers by data mining are chosen to receive the promotion package in the mail, so the mailing cost is reduced dramatically. At the same time the response rate can improve from 1% in mass mailing to 3% (a real improvement for a 20% rollout). We see from Table-3 that the net profit from the promotion becomes positive in direct mailing, compared with a loss in mass mailing.

Table-3: A comparison between a direct-mail campaign and a mass-mail campaign

Details                                     Mass mailing     Direct mailing (20%)
Number of customers mailed                  300,000          60,000
Cost of printing and mailing (Rs 40 each)   12,000,000       2,400,000
Cost of data mining                         Nil              1,000,000
Total promotion cost                        12,000,000       3,400,000
Response rate                               1.0%             3.0%
Number of sales                             3,000            1,800
Profit from sales (Rs 3,000 each)           9,000,000        5,400,000
Net profit from promotion                   -3,000,000       2,000,000
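The arithmetic behind Table-3 can be checked in a few lines of Python. The helper function below is purely illustrative; all figures are taken from the table.

```python
# Reproduce the Table-3 campaign economics (all figures from the paper).
def net_profit(mailed, mail_cost_each, mining_cost, response_rate, profit_per_sale):
    promotion_cost = mailed * mail_cost_each + mining_cost
    sales = round(mailed * response_rate)
    return sales * profit_per_sale - promotion_cost

mass = net_profit(300_000, 40, 0, 0.01, 3_000)           # mass mailing
direct = net_profit(60_000, 40, 1_000_000, 0.03, 3_000)  # 20% rollout with data mining
```

With the same per-sale profit, mailing only the mined 20% turns a loss of Rs 3,000,000 into a profit of Rs 2,000,000, even after paying Rs 1,000,000 for the data mining itself.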

Discussions: We see from Table-3 that the net profit from the promotion becomes positive in direct mailing, compared with a loss in mass mailing. Thus the paper demonstrates the ability of data mining to improve the decision-making process in the marketing field. Data mining issues: Although data mining has developed into a conventional, mature, trusted and powerful technique, certain issues [10, 11] related to it remain, discussed below in detail. Note that these issues are not exclusive and are not ordered in any way. (i) Issues related to data mining methods: It is often desirable to have different data mining methods available, because different approaches mine data differently depending on the data in hand and the mining requirements [9]. The algorithms used in data mining assume that the stored data is noise-free, and in most cases that is a strong assumption. Most data sets contain exceptions and invalid or incomplete information, which complicates the data analysis method, and the presence of noisy data reduces the accuracy of mining results. For this reason data preprocessing, i.e. the cleaning and transformation of data, becomes essential; data cleaning is time-consuming, but it is one of the most important phases in the knowledge discovery process. Data mining techniques should be able to handle noisy or incomplete data. (ii) Issues related to data sources: Data mining systems rely on databases to supply the raw data for input, and this raises serious issues because databases are dynamic, incomplete, noisy and large [10]. The current trend [9] is to collect as much data as possible and mine it later as and when required. The concern is about the quality and type of the large volumes of data being collected; a very clear understanding is required to collect the right data in the proper amount and to distinguish between useful and useless data. Nowadays databanks are of different types and store data with complex and diverse data types. It is very difficult to expect a data mining system to achieve good mining results effectively and efficiently on all kinds of data and sources; different data types and data sources require specialized mining algorithms and techniques. (iii) Issues related to the user interface: The knowledge discovered by data mining tools is useful only as long as it is interesting and understandable by the end user. Good data visualization simplifies the interpretation of data mining results and helps end users to better understand their needs.
The major problems related to user interfaces and visualization are screen real estate, information rendering and interaction. Interactivity with the stored data and the data mining results is essential, because it provides a means for the end user to focus and refine mining tasks and to visualize the discovered knowledge from different angles and at different conceptual levels. (iv) Issues related to security and social matters: Security is an important problem with any data collection that is shared and/or intended to be used for strategic decision making [9]. Moreover, when data is collected for customer profiling, understanding user behaviour, correlating personal data with other information and so on, large amounts of sensitive and private information about individuals or companies are gathered and stored. This becomes controversial given the confidential nature of some of this data and the potential for illegal access to the information. Data mining could also disclose new implicit knowledge about individuals or groups that could be against privacy policies. (v) Issues related to performance: Artificial intelligence and statistical methods for data analysis and interpretation are generally not designed for mining large data sets, yet data sets of terabyte size are common nowadays. As a result, processing large data sets raises the problems of scalability and efficiency of data mining methods. It is not practical to use algorithms of exponential, or even medium-order polynomial, complexity for data mining; linear algorithms are generally used for mining large data. Applications of Data Mining: Data mining finds applications in many fields of our daily life. It is very useful for small, medium and large organizations that produce large amounts of data every day, and almost all present-day organizations use data mining in some or all phases of their work.
Some of these potential applications are mentioned below: (i) Data mining in marketing: Data mining provides the marketing and sales executive with various decision support systems that help in consumer acquisition, consumer segmentation, consumer retention and cross-selling. In this way it enables us to interact better with consumers, improve the level of consumer services we provide, and establish a long-lasting consumer relationship. One can demonstrate that data mining is an effective tool for direct marketing which can give more profit to the retail industry than the traditional means of mass marketing. (ii) Data mining in healthcare: Healthcare and pharmaceutical organizations produce huge amounts of data in their clerical and diagnostic activities. Data mining enables such organizations to use machine learning techniques to analyse healthcare and pharmaceutical data and retrieve information that might be useful for developing new drugs. When medical institutions apply data mining to their existing data they can discover new, useful and potentially life-saving knowledge that would otherwise have remained inert in their databases. (iii) Data mining in banking: Using data mining, bank authorities can study and analyse the credit patterns of their consumers and prevent bad credit or detect fraud in banking transactions. They can find hidden correlations between different financial indicators, identify stock trading rules from historical market data, and identify customers likely to change their credit card affiliation. (iv) Data mining in the insurance sector: Data mining can help insurance companies to predict which customers will buy new policies, and to identify the behaviour patterns of risky customers and fraudulent behaviour. (v) Data mining in stocks and investment analysis and management: Data mining enables us to study first the specific patterns of growth or decline of various companies and then intelligently invest in a company which shows the most stable growth over a specific period. (vi) Data mining in computer security analysis and management: Data mining enables network administrators and computer security experts to combine analytical techniques with business knowledge to identify probable instances of fraud and abuse that compromise the security of a computer or network. (vii) Crime analysis and management: Data mining enables security agencies and police organizations to analyse the crime rate of a city or locality by studying the past and current trends that led to crime, helping to prevent the recurrence of such incidents and enabling the concerned authorities to take preventive measures. References: [1] A. Milley (2000). Healthcare and data mining. Health Management Technology, 21(8), 44-47. [2] D. Kreuze (2001). Debugging hospitals. Technology Review, 104(2), 32. [3] T. Christy (1997). Analytical tools help health firms fight fraud. Insurance & Technology, 22(3), 22-26.

[4] S. Biafore (1999). Predictive solutions bring more power to decision makers. Health Management Technology, 20(10), 12-14. [5] T. Terano, Y. Ishino (1996). Interactive knowledge discovery from marketing questionnaire using simulated breeding and inductive learning methods. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 279-282. [6] V. Ciesielski, G. Palstra (1996). Using a hybrid neural/expert system for database mining in market survey data. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 38-43. [7] A. Wilson, L. Thabane, A. Holbrook (2003). Application of data mining techniques in pharmacovigilance. British Journal of Clinical Pharmacology, 57(2), 127-134. [8] M. Ingram (2001). Internet privacy threatened following terrorist attacks on US. URL: 2001/isps24shtm [9] Data Mining. BPB Publications, B-14, Connaught Place, New Delhi, pp. 15-16. [10] A.C. Swami, V. Jain. System Analysis & Design. College Book House (P) Ltd., Chaura Rasta, Jaipur, pp. 328-329. [11] G. Sharma. Data Mining, Data Warehousing and OLAP. S.K. Kataria & Sons, Ansari Road, Daryaganj, New Delhi, pp. 14-15.


ICT for Energy Efficiency, Conservation & Reducing Carbon Emissions

Abstract Energy efficiency and conservation is a component that requires in-depth analysis and is as important as alternative energy resources and other socially relevant issues like climate change. This conforms to what other reports have found: energy efficiency and conservation matter. The focus of this effort is not only to understand why energy efficiency and conservation are important and why the emphasis given to them is justified, but to explore beyond that and look for initiatives society must take to conserve energy. Our goal is to unlock the efficiency potential and look for methods to conserve energy in the future. In this report we first target small households, establishing what an individual can do to conserve energy by using smart appliances, before moving on to the industrial sector. The report also examines how large industrial setups can contribute to the environment through energy efficiency and conservation while tapping an unexploited potential for economic savings. In this effort, the report also discusses smart buildings, a recent initiative to cut carbon emissions and resource use and hence to use energy efficiently. The research further describes the impetus the Government of India has provided in its trade and development policies to favour energy efficiency, the pilot projects it has carried out, and what more needs to be done. Finally, the research discusses sustainable development and a smarter future that combines energy efficiency and lower carbon emissions with the newer technologies that have started to surface around the world. The research is an endeavour towards a smarter planet, with contributions from individuals and from society as a whole.


ICT has made progressive inroads into our society. Whether it is providing education to rural India or implementing safer governance, ICT is one of the fastest growing industries in the world. In the years to come, ICT will play a bigger, better

and a safer role in shaping our future. ICT shares equal responsibility in building a smarter future, improving the quality of life by reducing carbon and greenhouse gas emissions, and providing day-to-day solutions for seamless services and information. ICT shares equal responsibility towards energy utilization. The sudden growth in energy utilization and its conservation stems not only from the gradual depletion of resources but also from factors like reliance on unstable regions of the earth for petroleum and other crude resources, and the imbalance among countries on such factors. The fortnightly increases in crude oil prices are one such reason why energy must be efficiently used and conserved. Beyond the ethical responsibility we share towards our planet and towards future generations, to whom what we have utilized also belongs, all of us share an equal responsibility for reducing carbon and other GHG emissions. When discussing energy efficiency, energy conservation is important, but so is embracing renewable sources to supply energy to individuals and organizations. Utilizing renewable resources is another step ICT has to fulfill. The scenario we face today is indeed depressing, as we possess the knowledge and technology to slash energy use while at the same time increasing levels of comfort. The basic barriers faced today by the ICT sector in energy efficiency are three in nature, namely behavioral, organizational and financial. These barriers can be addressed by valuing energy, transforming behavior, and government incentives and policies. Greater energy efficiency is an important component of comprehensive national and strategic policy making for energy resources and climate change in the future.
Efficient energy utilization would lead us to lower carbon and GHG emissions, an important landmark for the generations to come. Lowering carbon emissions is an important and necessary step that ICT will help with in the years to come. Monetization of carbon emissions would be

a welcome step, showing individuals and organizations alike the amount of money that could be saved by efficient energy conservation. Energy efficiency offers a vast, low-cost energy and economic resource for the country, if smartly unlocked. Most of the power grid laid out for distributing electricity throughout the country is decades old, and answering the energy crisis on the basis of such a grid is not feasible. A report by IIT K in March 1999 states that transmission and distribution losses in India account for roughly 4-6% in transmission and 15-18% in distribution, while India still suffers a deficit of over 35-40% between supply and demand. Due to this energy deficit, power cuts are common in India and have adversely affected the country's economic growth. According to Wikipedia, theft of electricity, common in most parts of urban India, amounts to 1.5% of India's GDP. Despite an ambitious rural electrification program, some 400 million Indians lose electricity access during blackouts. Statistics show that most of the deficit is due to T&D losses. It is possible to bring distribution losses down to a 6-8% level in India with the help of newer technological options (including information technology) in the electrical power distribution sector, enabling better monitoring and control. About 70% of the electricity consumed in India is generated by thermal power plants, 21% by hydroelectric power plants and 4% by nuclear power plants. Of the 70% produced by thermal power plants, 20% is lost in transmission and distribution. Apart from the losses, considerable carbon emissions result, which further hamper ecological stability. More than 50% of India's commercial energy demand is met through the country's vast coal reserves. The Indian Government has looked to increase production of electricity with nuclear power plants, which according to experts is a dangerous liaison.
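To put the loss figures quoted above in perspective, a back-of-the-envelope calculation (using the approximate percentages from this section, not precise grid data) shows how much of the thermal generation actually reaches consumers:

```python
# Rough illustration using the approximate figures quoted above:
# ~70% of consumed electricity comes from thermal plants, and ~20% of
# that thermal output is lost in transmission and distribution (T&D).
generated_share = 0.70          # thermal share of total generation
td_loss = 0.20                  # approximate T&D loss on that share
delivered = generated_share * (1 - td_loss)
print(f"Thermal share delivered after T&D losses: {delivered:.0%}")
```

Under these assumed numbers, only about 56% of total generation arrives as delivered thermal power, which is why even modest T&D improvements translate into large absolute savings.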
Using nuclear reactors to produce energy is waiting for another Bhopal or Chernobyl to happen, or provides a potential target for terrorists. With ever more products demanding more and more energy, finding other viable sources of energy is an important task before the Indian Government. Wind and solar energy are everlasting potential sources of energy that need to be efficiently utilized, and ICT can play a huge role in embracing such renewable sources within the existing infrastructure. The country has also invested heavily in recent years in renewable energy utilization, especially wind energy. In 2010, India's installed wind-generated electric capacity was 13,064 MW. Additionally, India has committed a massive amount of funds for the construction of various nuclear reactors, which would generate at least 30,000 MW. In July 2009, India unveiled a $19 billion plan to produce 20,000 MW of solar power by 2020. Even though India was one of the pioneers in renewable energy resources in the early 1980s, its success has been dismal: only 1% of energy needs are met by renewable energy sources. Apart from electricity, petroleum products are a huge concern for India. Conditions have changed over the past few decades, and energy resources can't be utilized the way they were in the past. Petroleum products need to be utilized efficiently, and ICT can play a pivotal role in this development. Hybrid cars have started coming onto the market, but with limited success due to the high operational cost involved. Electric cars have surfaced in India as well, but with little respite. It is important that the government rolls out efficient policy and business plans for their efficient utilization. Energy efficiency and utilization would not only maximize the life of resources but also help with climate change and reducing the carbon footprint.
Business and policy makers now realize that climate change is a global problem needing immediate and sustained attention. The ICT industry can enable a large part of that reduction. ICT solutions include measures not just for efficient utilization of electricity; these efforts would translate into gross energy and fuel savings.

What kind of policies does India require?

A) Better public-private collaboration: Most energy utilities are government operated, with almost 97% of electricity government produced. It is important that private companies are given ample opportunities, which would facilitate better utilization and minimum wastage. B) Government incentives and policy making: The government should provide incentives and subsidies to private players that have taken steps towards climate change and efficiently cutting down on energy utilization. C) Setting up a project and monitoring: Use a comprehensive building commissioning plan throughout the life of the project. D) Advocate for cleaner energy at local and national level: Many companies use coal-based power to spread their networks around the country. Strict actions at the management level are required against such companies. Also, advocate to these companies how

a reduced carbon footprint would mean increased savings on fuel and other resources. E) Create a long-term strategy that supports market-based solutions. F) Lead by example and support pilot projects: An investment of this kind can only be facilitated by government. It is important that India promotes these kinds of ICT solutions and supports pilot projects undertaken at the local level. G) Monitor production plants, like wind farms, and put power in the hands of citizens.

Thrust areas where ICT solutions can be used for energy efficiency/conservation.

Smart Grid and Smart Meters:
India is trying to expand electric power generation capacity, as current generation is seriously below peak demand. Although about 80% of the population has access to electricity, power outages are common, and the unreliability of electricity supplies is severe enough to constitute a constraint on the country's overall economic development. The government had targeted capacity increases of 100,000 megawatts (MW) over the next ten years. As of January 2001, total installed Indian power generating capacity was 112,000 MW. The electric grid of the country is still not complete, although the government has started on the unification of state electricity boards (SEBs). At a time when the Indian Government is still talking about completing the electric grid, implementation of a smart grid looks unrealistic. A smart grid, in simple terms, is an electrical network using information technology. The smart grid is envisioned to overlay the ordinary electrical grid with an information and net metering system, which includes smart meters. Smart grids are being promoted by many governments as a way of addressing energy independence, global warming and energy conservation issues. A smart grid is not a new grid that would change how power is distributed; it is merely an enhancement. A smart grid involves two-way communication between the user and the supplier, and is responsible for automated processes in distribution systems. A smart grid is useful in that it would help shift the peak loads which generally result in erratic power supply and blackouts. Implementation of a smart grid would also mean that excessive voltage demand is shifted to off-peak hours. For example, a washing machine would be able to run at the time when demand is least. Likewise, a fridge would be allowed to cool to temperatures lower than usual when supply is plentiful, and would be allowed to drift to a temperature higher than usual in case of low or erratic power supply. Modernization is necessary for energy consumption efficiency, real-time management of power flows, and to provide the bi-directional metering needed to compensate local producers of power.

Smart Grid implementation would mean:

1. Reducing transmission and distribution losses. 2. Real-time usage, i.e. two-way communication in which the grid knows exactly what kind of appliance is being utilized. 3. Renewable energy: an area where wind or solar energy is available can rely on them rather than thermal or other conventional sources of energy. ICT-enabled solutions would ensure that most of the demand in such an area is met by renewable sources, with the other sources concentrated elsewhere. Hence, a smart grid would provide a formidable way to embrace renewable sources of energy.
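The load-shifting idea described above can be sketched in a few lines. This is a simplified illustration, not a real grid protocol: the hourly tariff, the peak window and the appliance run length below are all invented for the example; a real smart grid would deliver prices over a two-way communication channel.

```python
# Simplified sketch of a smart appliance deferring its run to off-peak hours.
# The hourly "price signal" is invented for illustration (Rs/kWh, with an
# assumed evening peak from 6 pm to 10 pm).
hourly_price = {h: (8.0 if 18 <= h <= 22 else 3.5) for h in range(24)}

def cheapest_start(duration_hours):
    """Pick the start hour that minimises total cost for a run of given length."""
    def cost(start):
        return sum(hourly_price[(start + i) % 24] for i in range(duration_hours))
    return min(range(24), key=cost)

# A washing machine needing a 2-hour run gets scheduled outside the peak window.
start = cheapest_start(2)
print("Scheduled start hour:", start)
```

The same comparison, driven by live prices instead of a fixed table, is what lets the grid flatten peak demand without the user doing anything by hand.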

Smart Meter:
Another smart technology enabled by the smart grid is the smart meter. Smart meters aim to provide remote meter reading, reducing physical trips for maintenance and meter data collection. Another major advantage of ICT-enabled solutions is lowering electricity demand by communicating real-time electricity usage and prices through the smart meter: for example, letting customers know on an hourly basis how much energy they consumed in the last hour and what the bill for that service would be, or alerting them whenever usage in the last hour crosses their average. Smart meters would provide the government with knowledge of how consumers react to prices, giving providers better information on how to structure and price electricity to optimize efficiency. Another major advantage would be smart-thermostat-enabled meters that reduce or time-shift demand based on price triggers. Gas and water meters can be implemented on the same

platform and provide further opportunities for efficiency. Linking smart meters with weather channels would also allow consumers to decide how much energy to utilize and how much to conserve. Aside from reducing carbon dioxide emissions, employing ICT solutions in the grid can lead to other benefits such as: 1. Increasing national security by enabling decentralized operations and energy supply. 2. Providing a platform for electric vehicles.

Road Transportation
Another area where ICT can help us conserve energy and reduce carbon emissions is road congestion. With easier access to car loans and better salaries, Indians have given up two-wheelers and shifted to four-wheelers for better safety and standards. The $2,000 Tata Nano and other low-cost cars have fuelled this change too. Quoting from "Traffic congestion in Indian cities: Challenges of a rising power" by Azeem Uddin, the statistics say Indians are rushing headlong to get behind the wheel: Indians bought 1.5 million cars in 2007, more than double the figure for 2003; the cumulative growth of the passenger vehicles segment during April 2007 to March 2008 was 12.17 percent; and in 2007-08 alone, 9.6 million motorized vehicles were sold in India. India is now the fastest growing car market in the world. With the increasing number of cars on Indian roads came the problem of congestion. India has some of the worst congestion in the world, understandable from the fact that while other countries have a continuous traffic of cars, motorcycles and heavy motor vehicles, India also runs on auto-rickshaws, hand-pulled rickshaws, bullock carts and cycles. Unlike the western world, India doesn't practice lane driving and safe driving: no braking distance and continuous honking are common scenes on Indian roads. To add to it, non-functioning traffic signals, deteriorating road surfaces and a lax attitude in restoring bridges and broken roads contribute to congestion as well. Traffic congestion is a serious problem in most Indian metros. The scorching pace of economic growth and the growing incomes of India's burgeoning middle class are only likely to make the situation worse. Public transport systems are overloaded, and there is a limit on how much additional infrastructure such as roads and rail lines a city can add. Apart from congestion, road safety is another important issue. The accident rate among cars is the highest in the world: India has about 1% of the world's cars (some 4.5 million) yet still manages to kill over 100,000 people in traffic accidents each year, amounting to 10% of the entire world's traffic fatalities. The U.S., with more than 40% of the world's cars, has just 43,000 fatalities. Alleviating congestion is important not only because it would help the timely movement of traffic and reduce accidents; it also matters for the climate. Most petroleum and diesel products are utilized in this sector, be it for personal movement or commercial logistics. Removing congestion would reduce the traffic snarls and jams urban India faces every day, translate into fuel savings, and advocate climate control.

How can ICT solutions conserve energy and lower carbon emissions?
Since energy conservation leads to lower carbon emissions, any setup meant for energy conservation automatically translates into lower carbon emissions and favorable climate change. Let us look at a few schemes that ICT can bring about. 1. Help individuals make better plans: Continuous updates on traffic movement and which roads to avoid are required on the fly. Let citizens know, before they plan a trip, how much time they can save by taking another route. 2. Improve the journey experience: The best possible way to reduce congestion is to keep the commuter informed. Set up LCD panels at major traffic junctures to keep motorists informed about the condition of the road ahead, providing average traffic speed and estimated time of arrival at a particular location. This involves routing based on real-time information to avoid congestion. 3. Traffic lights and roads updated with sensors: Rather than traffic lights being time-bound, let them be volume-bound, i.e. install sensors on traffic lights and roads that can judge the movement of traffic on every major road. If the traffic from a particular side is heavier than from the other sides, let that traffic move for a longer time. 4. Increase vehicle performance: Provide mobile apps that give drivers feedback on miles per gallon. Install chips in the car engine that help a driver keep track of the engine condition of the vehicle. Such a mobile

application can link up revenue as well, as it would let the customer make an informed decision about the quality of oil he gets for his vehicle. The same mobile application can inform the customer about services due for his vehicle and the service centers available for them. The reduction potential of eco-driving is also large because its impact is still considerably new. This also means off-the-shelf devices that connect to a car's on-board computer to give drivers information about fuel use. 5. Social networking and other collaboration tools to ease car-pooling and car sharing: The government should take on the onus of building websites that help in pooling cars. Private organizations should take part too, making sure that four people coming from the same location and headed to the same place can use one vehicle. Incentives should be provided to the driver or the vehicle owner, like a small gift voucher from petroleum companies. 6. Smart parking/reserve a spot: Another smart ICT solution involves reserving a parking spot for a particular time-slot on the curbs or in the market place; alternatively, GPS or satellite tracking can help you find the next vacant parking spot. These solutions could reduce movement time to a minimum and hence conserve oil. 7. Commercial logistics could use less carbon-intensive methods of transport and the same eco-driving measures as personal drivers. Periodic feedback to the driver on his driving habits could also drive optimization. 8. Premium toll rates using RFID: Levy a core-area charge to reduce traffic congestion in crowded business districts. For instance, if you are in Delhi, every time you drive into Connaught Circus the RFID chip will log your entry and you will be charged for entering a core area.
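Point 3 above, volume-bound signals, can be sketched in a few lines. This is a minimal illustration under assumptions: the per-approach vehicle counts are dummy values, and a real deployment would read them from induction loops or cameras and respect local signal-timing regulations.

```python
# Sketch of volume-proportional green-time allocation for a four-way junction.
# Vehicle counts per approach would come from road sensors; these are dummies.
def green_times(counts, cycle_seconds=120, minimum=10):
    """Split a signal cycle among approaches in proportion to queued vehicles,
    guaranteeing each approach a minimum green window."""
    spare = cycle_seconds - minimum * len(counts)
    total = sum(counts.values()) or 1
    return {road: minimum + round(spare * n / total) for road, n in counts.items()}

times = green_times({"north": 40, "south": 10, "east": 30, "west": 20})
print(times)  # busier approaches get proportionally longer green phases
```

Compared with fixed timers, re-running this allocation every cycle lets the junction adapt to the skewed traffic patterns common on Indian arterial roads.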
Despite the benefits that ICT can provide in improving personal and commercial transport, it has to tackle challenges like lack of infrastructure, simplifying the user experience, awareness regarding car-pooling and public transport, and improving road structure and laying down sensors for the job.

Smart Buildings:
The percentage of urban population in India increased from 18.0 in 1961 to 27.8 in 2001. Energy consumption rose threefold, from 4.16 to 12.8 quadrillion Btu between 1980 and 2001, putting India next only to the US, Germany, Japan and China in total energy consumption. According to the International Energy Outlook projections for 2030 of the US Department of Energy, China and India account for nearly one-half of the total increase in residential energy use in non-OECD countries. With increasing activity in the urban real estate and building sectors, urban buildings will soon become big polluters. The time to take initiatives in this direction is now, by popularizing what are called 'intelligent' and 'green' buildings. Smart buildings are intelligent buildings that incorporate information and communication technology and mechanical systems, making them more cost-effective, comfortable, secure and productive. An intelligent building is one equipped with a robust telecommunications infrastructure, allowing for more efficient use of resources and increasing the comfort and security of its occupants. An intelligent building provides these benefits through automated control systems such as heating, ventilation and air conditioning (HVAC); fire safety; security; energy/lighting management; and other building management systems. For example, in the case of a fire, the fire alarm communicates with the security system to unlock the doors, and with the HVAC system to regulate airflow and prevent the fire from spreading. The objective of a smart building is to optimize energy consumption and hence lower carbon emissions. Smart buildings require both a firm groundwork, i.e. the design, and the embedded technology. While the design sets the initial energy consumption of the building, technology optimizes the energy use of building operations.
Smart buildings require proper use of sunlight and ventilation and less reliance on the heating, ventilation and air-conditioning (HVAC) system. ICT can also provide software tools for choosing materials, modeling lighting, assessing air flows and sizing HVAC systems. Apart from the energy efficiency it provides, a smart building has several additional benefits: 1. Higher quality of life. 2. Better air and access to sunlight while working. 3. Reduced water consumption and environment-friendly locations. Of course, the setup cost involved in a smart building is high, but the savings tend to make the technology effective. Let us look at a few ICT solutions that exist today for smart buildings. Smart appliances: these appliances are in communication with the smart grid and can switch on or off depending on energy supply and demand, and hence manage to shift load to off-peak hours. Safety: smart buildings bring with them impressive safety systems; in case of a fire, the alarm system will not only trip the water sensors and alert the police booths but also regulate the HVAC system to control the fire. Smart thermostat: the smart thermostat lets occupants set the temperature they require in their room and also works on real-time information like weather alerts. Sensors: automatically trip the lighting or conditioning system whenever the room is empty and switch it on as soon as any door or window is opened. Intensity sensors: gauge the amount of sunlight coming from outside and set the lighting system accordingly; the same principle applies for HVAC as well.

What is required?
Need for certifications
Worldwide, green buildings are certified through an independent body, the US Green Building Council (USGBC), through its LEED (Leadership in Energy and Environmental Design) certification programme. In India, the Indian Green Building Council (IGBC), set up by the Confederation of Indian Industry (CII), certifies smart buildings. It comprises all stakeholders in the green building value chain. But there are only 135 certified green buildings in the country as yet. A few achievements by India in the field of smart buildings include: the India Habitat Centre in New Delhi, whose exteriors are so designed that they are cleaned every time it rains. Despite its location at the intersection of two major roads with heavy traffic, the building is devoid of disturbance and protected against tropical sunshine due to its unique design. The use of shaded canopies over large paved courts reduces the energy load on air conditioning and produces an improved climate for its occupants. The Confederation of Indian Industry Sohrabji Godrej Green Business Centre (CII-Godrej GBC) was the first structure in India to receive the prestigious 'platinum' rating from the USGBC. The Engineering Design and Research Centre (EDRC) of Larsen & Toubro's Engineering Construction and Contracts Division in Chennai is another such building. It has fully automated energy management, life-safety and telecommunication systems and is possibly the first building in India without any light switch. All cabins are equipped with infra-red detectors to detect occupancy, and entry is only through smart cards with built-in antennas. The Wipro Technologies Development Centre (WTDC) in Gurgaon is the largest platinum-rated green building in Asia felicitated by the USGBC. [Source: Hindu Business, Urban Buildings: Green and smart, the way ahead.]
Even though the advantages and the reduction potential smart buildings provide are enormous, they face a number of challenges, like limited interoperability, limited deployment of the smart grid, high up-front cost, and a shortage of expertise. Smart buildings require constant support from the government and increased incentives for organizations that implement smart buildings rather than traditional buildings. The government also needs to address the shortage of expertise and support investments from both public and private partnerships. It also needs to commission new high-performing buildings and retrofit existing ones at all levels of government. This will contribute to a better understanding of building costs and expected energy efficiency, as well as increase the knowledge base for building professionals.

ICT provides a tremendous opportunity to use energy efficiently and reduce carbon emissions. Beyond these benefits, it would help put India on a path to energy independence, but this task requires collective effort from organizations, citizens and the government alike. The adoption of energy efficiency policies would increase the productivity of employees and the overall quality of life. The reduction of energy utilization and carbon emissions is an opportunity for humanity to enable change. IGBC's vision stands true: "To usher in a green building movement and facilitate India emerging as one of the world leaders in green buildings by 2015", which would help India unlock the potential of other sources of energy as well as optimize the use of present resources. The importance of the monetization of carbon emissions, and of ubiquitous broadband throughout the country for implementing a smart grid, cannot be overstated. The search for alternative fuels like biodiesel, ethanol, hybrids, green batteries, flex-fuel vehicles and conversions, synfuels, and solar-assisted fuel,

and ways to implement them at a lower cost, is also required. The hybrid cars that have surfaced are expensive, and considerable research is required to lower their prices further. Development of infrastructure and of government policies and incentives will be the front-runners in implementing smart and green technologies. The government also needs to keep the private sector involved so as to make green-enabled IT commercially viable. Reducing carbon emissions and efficiently utilizing energy would not only increase savings and the economy but also increase green-collar jobs, an avenue of employment for talented and highly skilled Indian youth. Energy efficiency offers a vast, low-cost energy resource for the Indian economy, if the nation can craft a comprehensive and innovative approach to unlock it. Significant and persistent barriers will need to be addressed to stimulate demand for energy efficiency and manage its delivery across the nation.

REFERENCES:
1. Unlocking energy efficiency in the US economy, McKinsey and Company, July 2009
2. Energy Efficiency in Buildings: Business realities and opportunities, World Business Council for Sustainable Development
3. Energy Efficiency, Best Practices, Foundation for Community Association Research
4. Indian Green Building Council
5. Urban buildings: Green and Smart, the way to go
6. SMART BUILDINGS: Make Them Tech Smart
7. Traffic congestion in Indian cities: Challenges of a rising power; Kyoto of the Cities, Naples
8. Smart grid, Wikipedia
9. Global Energy Network Institute
10. Smart 2020, United States Report Addendum, GeSI, 2008
11. Automation in power distribution, Vol. 2, No. 2
12. Electricity sector in India, Wikipedia
13. Electrical Distribution in India: An overview, Forum of Regulators
14. Optimize energy use, WBDG Sustainable Development, 12-21-2010


Study of Ant Colony Optimization for Proficient Routing in Solid Waste Management

Abstract. Routing in solid waste management is one of the key areas where 70% to 85% of the total system cost is spent just on the collection of the waste. During transportation, many of the collection points may be missed, and the path followed by the driver may be longer than the optimal path. The study in this paper intends to find the optimal route for collecting solid waste in cities. Ant colony optimization is a meta-heuristic technique inspired by the behavior of real ants that helps in finding optimal solutions to such problems. The system implements the solid waste management routing problem using ant colony optimization. Key-Words: Solid waste management, Ant colony optimization (ACO), Routing, Waste collection.


The collection of municipal solid waste is one of the most difficult operational problems faced by local authorities in any city. In recent years, due to a number of cost, health, and environmental concerns, many municipalities, particularly in industrialized nations, have been forced to assess their solid waste management and examine its cost effectiveness and environmental impacts in terms of designing collection routes. During the past 15 years, there have been numerous technological advances, new developments, and mergers and acquisitions in the waste industry. The result is that both private and municipal haulers are giving serious consideration to new technologies such as computerized vehicle solutions [1]. It has been estimated that, of the total amount of money spent on the collection, transportation, and disposal of solid waste, approximately 60-80% is spent on the collection phase [2, 4]. Therefore, even a small improvement in the collection

operation can result in a significant saving in the overall cost. The present study is mainly focused on the collection and transport of solid waste from any loading spot in the area under study. The routing optimization problem in waste management has already been explored with a number of algorithms. Routing algorithms use a standard of measurement called a metric (i.e. path length) to determine the optimal route or path to a specified destination. Optimal routes are determined by comparing metrics, and these metrics can differ depending on the design of the routing algorithm used [3, 10]. The complexity of the problem is high due to the many alternatives that have to be considered. Fortunately, many algorithms have been developed and discussed in order to find an optimized solution, leading to various different results. The reason for this diversity is that the majority of routing algorithms include the use of heuristic algorithms. Heuristic algorithms are ad hoc, trial-and-error methods which do not guarantee finding the optimal solution but are designed to find near-optimal solutions in a fraction of the time required by exact methods.
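As a concrete instance of the trial-and-error heuristics mentioned above, a nearest-neighbour route builder, fast but with no optimality guarantee, can be sketched as follows. The coordinates stand in for hypothetical pick-up points and are not data from this study.

```python
import math

# Nearest-neighbour heuristic: from the current stop, always visit the closest
# unvisited pick-up point next. Fast, but not guaranteed to be optimal.
def nearest_neighbour_route(points, start=0):
    unvisited = set(range(len(points))) - {start}
    route, current = [start], start
    while unvisited:
        current = min(unvisited,
                      key=lambda j: math.dist(points[current], points[j]))
        route.append(current)
        unvisited.remove(current)
    return route

# Hypothetical pick-up point coordinates (e.g. km offsets from the garage).
stops = [(0, 0), (2, 1), (5, 0), (1, 4)]
print(nearest_neighbour_route(stops))
```

Running such a greedy metric-comparison in milliseconds, versus enumerating all permutations, is exactly the speed/optimality trade-off the heuristic approach accepts.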

In the literature of the past few years, much effort has been made in the domain of urban solid waste. The effort focuses either on theoretical approaches, including socio-economic and environmental analyses concerning waste planning and management, or on methods, techniques and algorithms developed for the automation of the process. The theoretical approaches examined in the literature refer to issues concerning the conflict between urban residents and the municipality over the selection of sites for waste treatment, transshipment

stations and disposal, and the issue of waste collection and transport and its impact on human health due to noise, traffic, etc. In this context, the calculation of total cost for collection and transport, for a specific scenario, is implemented, and the identification of the most cost-effective alternative scenario and its application is simulated. In the literature, the methods and algorithms used for optimizing siting and routing aspects of solid waste collection networks were deterministic models including, in many cases, Linear Programming (LP) [Hsieh and Ho, 1993], [Lund and Tchobanoglous, 1994]. However, uncertainty frequently plays an important role in handling solid waste management problems. The random character of solid waste generation, the estimation errors in parameter values, and the vagueness in planning objectives and constraints are possible sources of uncertainty. Fuzzy mathematical programming approaches for dealing with systematic uncertainties have been broadly used in the last few years: for example, the siting planning of a regional hazardous waste treatment center [Huang et al., 1995], a hypothetical solid waste management problem in Canada [Koo et al., 1991], and an integrated solid waste management system in Taiwan [Chang and Wang, 1997]. To cope with non-linear optimization problems, such as deciding on efficient routing for waste transport, methods based on Genetic Algorithms (GA), Simulated Annealing (SA), Tabu Search and Ant Colony Optimization (ACO) have also been proposed [Chang and Wei, 2000], [Pham and Karaboka, 2000], [Ducatelle and Levine, 2001], [Bianchi et al., 2002], [Chen and Smith, 1996], [Glover and Laguna, 1992], [Tarasewich and McMullen, 2002]. The problem can be classified as either a Traveling Salesman Problem (TSP) or a Vehicle Routing Problem (VRP), and for this particular problem several solutions and models have been proposed.
However, the complexity of the problem is high due to the many alternatives that have to be considered, and the number of possible solutions is considerably high, too. As mentioned above, the most popular algorithms used today in similar cases include Genetic Algorithms, Simulated Annealing (SA), Tabu Search and the Ant Colony Optimization (ACO) algorithm. Genetic Algorithms [Glover et al., 1992], [Pham and Karaboka, 2000], [Chen and Smith, 1996] use biological methods such as reproduction, crossover and mutation to quickly search for solutions to complex problems. A GA begins with a random set of possible solutions. In each step, a fixed number of the best current solutions are saved and used in the next step to generate new solutions using genetic operators, of which crossover and mutation are the most important. In the crossover operation, parts of two randomly chosen solutions are exchanged between the two solutions; as a result, two new child solutions are generated. The mutation operation alters parts of a current solution, generating a new one. The Ant Colony Optimization (ACO) algorithm [Dorigo and Maniezzo, 1996] was inspired by the observation of swarm colonies, and specifically ants. Ants are social insects whose behaviour is directed toward the survival of the colony rather than that of the individual. The way ants find their food is particularly noteworthy: although ants are almost blind, they build chemical trails using a chemical substance called pheromone, and these trails are used by the ants to find their way to food or back to their colony. ACO simulates this characteristic of ants in order to find optimum solutions to computational problems such as the Traveling Salesman Problem. As this work is mainly focused on the ACO algorithm and its application to the solid waste collection problem, the ACO is described analytically in the next section.
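The crossover and mutation operators described above can be sketched for permutation-encoded tours. This is an illustrative sketch, not the paper's implementation: order crossover and a two-city swap are assumed choices that keep every child a valid tour.

```python
import random

def order_crossover(p1, p2):
    """Exchange a random slice between two parent tours (order crossover),
    producing one child that remains a valid permutation of the cities."""
    n = len(p1)
    i, j = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[i:j] = p1[i:j]                      # copy a slice from parent 1
    rest = [c for c in p2 if c not in child]  # fill the rest in parent-2 order
    for k in range(n):
        if child[k] is None:
            child[k] = rest.pop(0)
    return child

def mutate(tour):
    """Alter part of a current solution by swapping two randomly chosen cities."""
    i, j = random.sample(range(len(tour)), 2)
    tour = tour[:]
    tour[i], tour[j] = tour[j], tour[i]
    return tour
```

Both operators preserve the set of visited cities, which is what distinguishes tour-encoded GAs from plain bit-string GAs.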

Problem Formulation of Solid Waste Routing.

The solid waste collection operation in the city begins when workmen collect plastic bags containing residential solid waste from different points in the city. These bags are carried to the nearest pick-up point, where there are steel containers. The containers are unloaded into special compactor vehicles. Every morning the vehicles are driven from the garage to the regions, where they begin to collect residential solid waste from the pick-up points. There is no specific routing basis for the vehicles, the route being left to the driver's choice. Occasionally, some pick-up points may be missed, and the route followed by the driver may be longer than the optimal route. In regions which have two collection vehicles, the vehicles may meet at the same pick-up point several times. Once the solid waste is loaded into the vehicles, it is carried to the disposal site located far away. The suggested procedure for solving the vehicle routing problem in the selected Region 2 begins with a particular node closest to the garage or the previous region and ends with the nodes closest to the disposal site; this reduces the number of permutations considerably. Further detailed node networks, descriptions and procedures can be found in the literature. The problem of routing in solid waste management is the main point of focus in this work. There are many ways to

solve the problem of solid waste management. Ant colony optimization is a relatively new technique for solving such optimization problems, and since routing in solid waste management is a challenge, in this work we plan to tackle it through the technique of ant colony optimization. In the Ant Colony System, ants are initially positioned on customer points chosen at random, with at most one ant at each customer point. Each ant builds a tour incrementally by applying a state transition rule. While constructing the solution, ants also update the pheromone on the visited edges by a local updating rule. Once the ants complete their tours, the pheromone on the edges is updated again by applying a global updating rule [2]. During the construction of tours, ants are guided by both heuristic information and pheromone information: heuristic information refers to the lengths of the edges, where ants prefer short edges, while an edge with a higher amount of pheromone is a more desirable choice. The pheromone updating rules are designed so that ants tend to leave more pheromone on the edges which should be visited by ants. The Ant Colony System algorithm is given in Fig. 2 [3,4]. In ACS, an artificial ant k, after serving customer r, chooses the customer s to move to from the set Jk(r) of customers that remain to be served by ant k, by applying the following state transition rule, also known as the pseudo-random-proportional rule:

Ant Colony Optimization (ACO) algorithm.

The Ant System was efficient in discovering good or optimal solutions for small problems with up to 30 nodes, but for larger problems it requires an unreasonable amount of time to find such good results. Thus, Dorigo and Gambardella [3,4] and Bianchi et al. [1] devised three main changes in the Ant System to improve its performance, which led to the Ant Colony System. The overall scheme is:

    Initialize
    Loop  /* each loop is called an iteration */
        Each ant is placed on a starting customer point
        Loop  /* each loop is called a step */
            Each ant constructs a solution (tour) by applying the state
            transition rule and a local pheromone updating rule
        Until all ants have constructed a complete solution
        A global pheromone updating rule is applied
    Until the stopping criterion is met

The Ant Colony System differs from the Ant System in three main aspects. Firstly, the state transition rule gives a direct way to balance the exploration of new edges against the exploitation of a priori and accumulated information about the problem. Secondly, the global updating rule is applied only to the edges which belong to the best ant tour. Lastly, while ants construct a tour, a local pheromone updating rule is applied. Basically, the Ant Colony System (ACS) works as follows: m ants are initially positioned on n nodes chosen according to some initialization rule, and each ant then chooses the next customer by the state transition rule:

s = arg max_{u ∈ Jk(r)} { [τ(r,u)] · [η(r,u)]^β }   if a ≤ a0   (exploitation)
s = S                                                otherwise   (biased exploration)        (1)

Where:
β = the control parameter of the relative importance of the visibility;
τ(r,u) = the pheromone trail on edge (r,u);
η(r,u) = a heuristic function, chosen to be the inverse of the distance between customers r and u;
a = a random number uniformly distributed in [0,1];
a0 (0 ≤ a0 ≤ 1) = a parameter;
S = a random variable selected according to the probability distribution of Eq. (2), which favours edges that are shorter and have a higher amount of pheromone.

This distribution is the same as in the Ant System and is also known as the random-proportional rule:

pk(r,s) = [τ(r,s)] · [η(r,s)]^β / Σ_{u ∈ Jk(r)} [τ(r,u)] · [η(r,u)]^β   if s ∈ Jk(r)
pk(r,s) = 0                                                              otherwise        (2)
Where pk(r,s) is the probability that ant k, after serving customer r, chooses customer s to move to. The parameter a0 determines the relative importance of exploitation versus exploration: when an ant, after serving customer r, has to choose the customer s to move to, a random number a (0 ≤ a ≤ 1) is generated; if a ≤ a0, the best edge according to Eq. (1) is chosen, otherwise an edge is chosen according to Eq. (2) [3]. While building a tour, ants visit edges and change their pheromone level by applying the local updating rule:

τ(r,s) ← (1 − ρ) · τ(r,s) + ρ · τ0,   0 < ρ < 1        (3)

Where:
ρ = a pheromone decay parameter;
τ0 = the initial pheromone level.

Local updating makes the desirability of edges change dynamically: every time an ant uses an edge, some of its pheromone is lost and the edge becomes less desirable. In other words, local updating drives the ants to search not only in the neighbourhood of the best previous tour. Global updating is performed after all the ants have completed their tours. Among them, only the ant which produced the best tour is allowed to deposit pheromone; this choice is intended to make the search more directed. The pheromone level is updated by the global updating rule:

τ(r,s) ← (1 − α) · τ(r,s) + α · Δτ(r,s),   0 < α < 1        (4)

Where:

Δτ(r,s) = 1 / Lgb   if edge (r,s) belongs to the global-best tour        (5)
Δτ(r,s) = 0         otherwise

Lgb is the length of the globally best tour from the beginning of the trial, and α is a pheromone decay parameter. Global updating is intended to provide a greater amount of pheromone to shorter tours, and Eq. (5) dictates that only the edges belonging to the globally best tour receive reinforcement. The main purpose of this paper is to provide an adequately fast heuristic algorithm which yields a better solution than traditionally available methods. The ant system exploits this natural behaviour of ants to solve such optimization problems; the concept of the method is to find the a priori tour that gives the minimum total expected cost.
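Putting the state transition rule and the two pheromone updating rules together, a compact Ant Colony System loop for a small collection-route instance might look as follows. This is a hedged sketch, not the paper's implementation: the function name `acs_tour`, the Euclidean distance matrix, and the default parameter values are assumptions chosen for illustration.

```python
import math
import random

def acs_tour(points, n_ants=8, n_iter=50, beta=2.0, a0=0.9, rho=0.1, alpha=0.1):
    """Minimal Ant Colony System sketch for a TSP-style collection route.
    Implements the pseudo-random-proportional state transition rule,
    the local pheromone update, and the global-best pheromone update."""
    n = len(points)
    d = [[math.dist(p, q) or 1e-9 for q in points] for p in points]
    tau0 = 1.0 / (n * sum(d[0]))            # rough initial pheromone level
    tau = [[tau0] * n for _ in range(n)]
    best, best_len = None, float("inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            r = random.randrange(n)
            tour, unvisited = [r], set(range(n)) - {r}
            while unvisited:
                # state transition rule: exploit the best edge or explore
                score = {u: tau[r][u] * (1.0 / d[r][u]) ** beta for u in unvisited}
                if random.random() <= a0:
                    s = max(score, key=score.get)       # exploitation
                else:                                   # biased exploration
                    x, acc = random.random() * sum(score.values()), 0.0
                    for s, v in score.items():
                        acc += v
                        if acc >= x:
                            break
                # local pheromone update on the edge just used
                tau[r][s] = (1 - rho) * tau[r][s] + rho * tau0
                tau[s][r] = tau[r][s]
                tour.append(s)
                unvisited.discard(s)
                r = s
            length = sum(d[tour[i]][tour[i - 1]] for i in range(n))
            if length < best_len:
                best, best_len = tour, length
        # global pheromone update on the globally best tour only
        for i in range(n):
            u, w = best[i], best[i - 1]
            tau[u][w] = (1 - alpha) * tau[u][w] + alpha / best_len
            tau[w][u] = tau[u][w]
    return best, best_len
```

Run on the four corners of a unit square, the sketch recovers the perimeter tour of length 4; on realistic pick-up-point networks the distance matrix would come from the road network rather than straight-line distances.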
In many problems of this type, ACO has shown its ability to obtain good solutions.

The study mainly concentrates on the carrying or moving of solid waste from different points of collection. Recent results in the literature show that ant colony optimization is an emerging field in the area of network optimization problems. The general idea underlying the Ant System paradigm is that of a population of agents (ants), each guided by an autocatalytic process directed by a greedy force. Were an agent alone, the autocatalytic process and the greedy force would tend to make the agent converge to a suboptimal tour with exponential speed. We employ positive feedback as a search and optimization tool: if at a given point an agent has to choose between different options and the one actually chosen turns out to be good, then in the future that choice will appear more desirable than it was before.

REFERENCES
[1] Karadimas, N.V., Kolokathi, M., Defteraiou, G., Loumos, V., Ant Colony System vs ArcGIS Network Analyst: The Case of Municipal Solid Waste Collection, 5th WSEAS Int. Conf. on Environment, Ecosystems and Development, Tenerife, Spain, December 14-16, 2007.
[2] Municipality of Athens, Estimation, Evaluation and Planning of Actions for Municipal Solid Waste Services During Olympic Games 2004, Municipality of Athens, Athens, Greece, 2003.
[3] Parker, M., Planning Land Information Technology Research Project: Efficient Recycling Collection Routing in Pictou County, 2001.
[4] Karadimas, N.V., Doukas, N., Kolokathi, M., Defteraiou, G., Routing Optimization Heuristics Algorithms for Urban Solid Waste Transportation Management, WSEAS Transactions on Computers, Vol. 7, Issue 12, ISSN: 1109-2750, pp. 2022-2031, 2008.
[5] Awad, A., Aboul-Ela, M.T., AbuHassan, R., Development of a Simplified Procedure for Routing Solid Waste Collection, International Journal for Science and Technology (Scientia Iranica), 8(1), 2001, pp. 71-75.
[6] Karadimas, N.V., Kouzas, G., Anagnostopoulos, I., Loumos, V., Urban Solid Waste Collection and Routing: the Ant Colony Strategic Approach, International Journal of Simulation: Systems, Science & Technology, Vol. 6, 2005, pp. 45-53.
[7] Dorigo, M., Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italy, 1992.
[8] Goss, S., Aron, S., Deneubourg, J.-L., Pasteels, J.-M., Self-organized Shortcuts in the Argentine Ant, Naturwissenschaften, Vol. 76, pp. 579-581, 1989.
[9] Deneubourg, J.-L., Aron, S., Goss, S., Pasteels, J.-M., The Self-organizing Exploratory Pattern of the Argentine Ant, Journal of Insect Behavior, Vol. 3, p. 159, 1990.
[10] Colorni, A., Dorigo, M., Maniezzo, V., Distributed Optimization by Ant Colonies, Proceedings of ECAL'91, European Conference on Artificial Life, Elsevier, Amsterdam, 1991.
[11] Dorigo, M., Maniezzo, V., Colorni, A., The Ant System: An Autocatalytic Optimizing Process, Technical Report TR91-016, Politecnico di Milano, 1991.
[12] von Poser, I., Awad, A.R., Optimal Routing for Solid Waste Collection in Cities by Using Real Genetic Algorithm, IEEE, 0-7803-9521-2/06, pp. 221-226.
[13] Dorigo, M., Gambardella, L.M., Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem, IEEE Transactions on Evolutionary Computation, Vol. 1, No. 1, April 1997.
[14] Ismail, Z., Loh, S.L., Ant Colony Optimization for Solving Solid Waste Collection Scheduling Problems, Journal of Mathematics and Statistics, 5(3), pp. 199-205, 2009, ISSN 1549-3644.


Survey on Decision Tree Algorithm

This paper aims to study the various classification algorithms used in data mining. All these algorithms are based on constructing a decision tree for classifying the data but basically differ from each other in the methods employed for selecting splitting attribute and splitting conditions. The various algorithms which will be studied are: CART (Classification and regression tree), ID3 and C4.5.

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems. Data mining tools can answer business questions that were traditionally too time-consuming to resolve: they search databases for hidden patterns, finding predictive information that experts may have missed because it lies outside their expectations. Most companies already collect and refine massive quantities of data, and data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high-performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, "Which clients are most likely to respond to my next promotional mailing, and why?"

Classification is a data mining function that assigns items in a collection to target categories or classes. The goal of classification is to accurately predict the target class for each case in the data. The classification task begins with a data set in which the class assignments are known for each case. The classes are the values of the target; they are distinct and do not exist in an ordered relationship to each other. Ordered values would indicate a numerical, rather than a categorical, target, and a predictive model with a numerical target uses a regression algorithm, not a classification algorithm. For example, customers might be classified as either users or non-users of a loyalty card. The predictors would be the attributes of the customers: age, gender, address, products purchased, and so on; the target would be yes or no (whether or not the customer used a loyalty card). In the model build (training) process, a classification algorithm finds relationships between the values of the predictors and the values of the target. Different classification algorithms use different techniques for finding these relationships, which are summarized in a model that can then be applied to a different data set in which the class assignments are unknown.

Definition: Given a database D = {t1, t2, ..., tn} of tuples (data records) and a set of classes C = {c1, c2, ..., cm}, the classification problem is to define a mapping f: D → C where each ti is assigned to one class. A class cj contains precisely those tuples mapped to it; that is, cj = {ti : f(ti) = cj, 1 ≤ i ≤ n, and ti ∈ D} [4].

K Nearest Neighbors
The KNN technique assumes that the training set includes not only the data but also the desired classification for each item; as a result, the training data become the model. When a classification is to be made for a new item, its distance to each item in the training set is determined, and only the K closest entries in the training set are considered. The new item is then placed in the class that contains the most items from this set of K closest items.
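A minimal sketch of the KNN procedure just described, assuming Euclidean distance and a majority vote among the K closest training items (the function name and data layout are illustrative):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training items.
    `train` is a list of ((features...), label) pairs; distance is Euclidean."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    # the training data itself is the model: rank it by distance to the query
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```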

Different types of classification algorithms

1. Statistical-Based Algorithms
Regression
As with all regression techniques, we assume the existence of a single output variable and one or more input variables. The output variable is numerical. The general regression tree building methodology allows the input variables to be a mixture of continuous and categorical variables. A decision tree is generated in which each decision node contains a test on some input variable's value, and the terminal nodes of the tree contain the predicted output variable values. A regression tree may therefore be considered a variant of a decision tree, designed to approximate real-valued functions instead of being used for classification tasks.
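The regression-tree idea above, choosing a split point on a numeric input and predicting each side by its mean, can be sketched as a single-node split search. The function name and the sum-of-squared-errors criterion are assumptions for illustration; the text does not prescribe a specific criterion.

```python
def best_regression_split(xs, ys):
    """Find the split point on one numeric input that minimizes the summed
    squared error of predicting each side by its mean (one regression-tree node)."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)
    pairs = sorted(zip(xs, ys))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint between neighbours
        best = min(best, (sse(left) + sse(right), cut))
    return best[1]
```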

3. Decision Tree-Based Algorithm

The decision tree approach is most useful in classification problems. In this technique, a tree is constructed to model the classification process. Once the tree is built, it is applied to each tuple in the database and yields a classification for that tuple. It involves two basic steps: building the tree and applying the tree to the database.

The ID3 technique is based on information theory and attempts to minimize the expected number of comparisons. Its basic strategy is to choose the splitting attribute with the highest information gain first. The amount of information associated with an attribute value is related to its probability of occurrence; the concept used here to quantify information is called entropy.

Bayesian Classification
The Naïve Bayes classifier assumes that the effect of a variable value on a given class is independent of the values of the other variables. This assumption is called class conditional independence; it is made to simplify the computation, and in this sense the classifier is considered "naïve". This is a fairly strong assumption and is often not applicable in practice. However, it is the relative ordering of the estimated probabilities, not their exact values, that determines the classification.
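The class-conditional independence assumption can be sketched as a tiny categorical Naive Bayes classifier: the class score is the prior times the product of per-feature conditional probabilities. The function name and the Laplace smoothing are illustrative assumptions, not from the text.

```python
from collections import Counter, defaultdict

def naive_bayes(train, query):
    """Pick the class maximizing P(class) * product of P(feature_i | class),
    relying on the class-conditional independence assumption.
    `train` is a list of (features_tuple, label) with categorical features."""
    labels = Counter(label for _, label in train)
    counts = defaultdict(Counter)  # (label, position) -> feature-value counts
    for feats, label in train:
        for i, v in enumerate(feats):
            counts[(label, i)][v] += 1
    best, best_p = None, -1.0
    for label, n in labels.items():
        p = n / len(train)  # class prior
        for i, v in enumerate(query):
            p *= (counts[(label, i)][v] + 1) / (n + 2)  # Laplace smoothing
        if p > best_p:
            best, best_p = label, p
    return best
```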

C4.5 is an improvement of ID3. Here classification is via either decision trees or rules generated from them. For splitting purposes, it uses the attribute with the largest gain ratio, which ensures a larger-than-average information gain; this compensates for the fact that the gain ratio value is skewed towards splits where the size of one subset is close to that of the starting one.

2. Distance-Based Algorithms
Simple Approach

In this approach, if we have a representation of each class, we can perform classification by assigning each tuple to the class to which it is most similar. A simple classification technique is to place each item in the class whose center it is most similar to; a predefined pattern can be used to represent each class. Once the similarity measure is defined, each item to be classified is compared to the predefined patterns, and the item is placed in the class with the largest similarity value.
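The simple approach above, comparing each item to a class center, corresponds to a nearest-centroid classifier. This sketch assumes numeric features, Euclidean distance as the (inverse) similarity measure, and an illustrative function name.

```python
from collections import defaultdict

def nearest_centroid(train, query):
    """Place `query` in the class whose center (mean vector) is closest.
    `train` is a list of ((features...), label) pairs with numeric features."""
    groups = defaultdict(list)
    for feats, label in train:
        groups[label].append(feats)
    def centroid(vectors):
        return [sum(col) / len(col) for col in zip(*vectors)]
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    centers = {label: centroid(vs) for label, vs in groups.items()}
    return min(centers, key=lambda label: dist2(centers[label], query))
```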

Classification and Regression Trees (CART) is a technique that generates a binary decision tree. As with ID3, entropy is used as a measure to choose the best splitting attribute and criterion. However, unlike ID3, where a child is created for each subcategory, only two children are created. The splitting is performed around what is found to be the best split point, and the tree stops growing when no split will improve the performance.

4. Neural Network-Based Algorithms


As with decision trees, a model representing how to classify any given tuple in the database is constructed with neural networks. The activation functions are typically sigmoid. When a tuple must be classified, certain attribute values from that tuple are input into the directed graph at the corresponding source nodes.

The ID3 Algorithm

ID3 is a non-incremental algorithm, meaning it derives its classes from a fixed set of training instances. (An incremental algorithm, by contrast, revises the current concept definition, if necessary, in response to a new sample.) The classes created by ID3 are inductive: given a small set of training instances, the specific classes created by ID3 are expected to work for all future instances, provided the distribution of the unknowns is the same as that of the test cases. Inductive classes cannot be proven to work in every case, since they may classify an infinite number of instances. Note that ID3 (or any inductive algorithm) may misclassify data.

How does ID3 decide which attribute is the best? A statistical property called information gain is used. Gain measures how well a given attribute separates training examples into the target classes; the attribute with the highest information gain (information being the most useful for classification) is selected. In order to define gain, we first borrow an idea from information theory called entropy, which measures the amount of information in an attribute. Given a collection S with c possible outcomes,

Entropy(S) = Σ −p(i) · log2 p(i)

where p(i) is the proportion of S belonging to class i and the sum runs over the c classes. Note that S is not an attribute but the entire sample set.
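The entropy and gain quantities defined above can be computed directly. This is a small sketch with assumed function names, where each example row is a dictionary mapping attribute names to values.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = sum over the classes of -p(i) * log2 p(i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain(S, A) = Entropy(S) - sum over values v of |S_v|/|S| * Entropy(S_v),
    where S_v is the subset of rows whose value for attribute A is v."""
    n = len(labels)
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(label)
    remainder = sum(len(sub) / n * entropy(sub) for sub in by_value.values())
    return entropy(labels) - remainder
```

ID3 would call `information_gain` for every remaining attribute at a node and split on the one with the highest value.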

ID3 (Examples, Target_Attribute, Attributes)
    Create a root node Root for the tree.
    If all examples are positive, return the single-node tree Root with label = +.
    If all examples are negative, return the single-node tree Root with label = -.
    If the set of predicting attributes is empty, return the single-node tree Root
        with label = the most common value of the target attribute in the examples.
    Otherwise begin:
        A = the attribute that best classifies the examples.
        Decision tree attribute for Root = A.
        For each possible value v of A:
            Add a new tree branch below Root, corresponding to the test A = v.
            Let Examples(v) be the subset of examples that have the value v for A.

Data Description
The sample data used by ID3 has certain requirements, which are:

Attribute-value description - the same attributes must describe each example and have a fixed number of values.
Predefined classes - an example's attributes must already be defined; that is, they are not learned by ID3.
Discrete classes - classes must be sharply delineated; continuous classes broken up into vague categories such as a metal being "hard, quite hard, flexible, soft, quite soft" are suspect.
Sufficient examples - since inductive generalization is used (i.e. not provable), there must be enough test cases to distinguish valid patterns from chance occurrences.

            If Examples(v) is empty, then below this new branch add a leaf node
                with label = the most common target value in the examples;
            Else below this new branch add the subtree
                ID3(Examples(v), Target_Attribute, Attributes - {A}).

Attribute Selection

    End
    Return Root




C4.5 is an algorithm used to generate a decision tree, developed by Ross Quinlan as an extension of his earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason C4.5 is often referred to as a statistical classifier. This section explains one of the algorithms used to create univariate decision trees. This algorithm, called C4.5, is based on the ID3 algorithm and tries to find small (or simple) decision trees. We start by presenting some premises on which this algorithm is based, and then discuss the inference of the weights and tests in the nodes of the trees.

At each node of the tree, C4.5 chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. Its criterion is the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data; the attribute with the highest normalized information gain is chosen to make the decision, and the C4.5 algorithm then recurs on the smaller sub-lists. This algorithm has a few base cases. If all the samples in the list belong to the same class, it simply creates a leaf node for the decision tree saying to choose that class. If none of the features provide any information gain, C4.5 creates a decision node higher up the tree using the expected value of the class. If an instance of a previously-unseen class is encountered, C4.5 again creates a decision node higher up the tree using the expected value.
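C4.5's normalized information gain (the gain ratio) can be sketched as follows. The helper names are assumptions, and SplitInfo is taken to be the entropy of the attribute's value distribution, as in Quinlan's formulation.

```python
import math
from collections import Counter

def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """GainRatio(A) = Gain(S, A) / SplitInfo(A), where SplitInfo is the entropy
    of the partition induced by attribute A's values themselves."""
    n = len(labels)
    by_value = {}
    for v, label in zip(values, labels):
        by_value.setdefault(v, []).append(label)
    gain = _entropy(labels) - sum(
        len(sub) / n * _entropy(sub) for sub in by_value.values())
    split_info = _entropy(values)  # entropy of the attribute-value distribution
    return gain / split_info if split_info else 0.0
```

Dividing by SplitInfo penalizes attributes that shatter the data into many small subsets, which plain information gain would otherwise favour.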

Some premises guide this algorithm, such as the following: if all cases are of the same class, the tree is a leaf, and the leaf is returned labeled with this class; for each attribute, calculate the potential information provided by a test on the attribute (based on the probabilities of each case having a particular value for the attribute), and also calculate the gain in information that would result from a test on the attribute (based on the probabilities of each case with a particular value for the attribute being of a particular class); depending on the current selection criterion, find the best attribute to branch on.

Classification and Regression Trees (CART) is a nonparametric decision tree learning technique that produces either classification or regression trees, depending on whether the dependent variable is categorical or numeric, respectively. Trees are formed by a collection of rules based on the values of certain variables in the modeling data set. Rules are selected based on how well splits on variable values can differentiate observations with respect to the dependent variable. Once a rule is selected and splits a node into two, the same logic is applied to each child node (i.e. it is a recursive procedure).

C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set of already-classified samples; each sample is a vector whose components represent the attributes or features of the sample, and the training data is augmented with a vector of values that represent the class to which each sample belongs [1].

Splitting stops when CART detects that no further gain can be made, or when some pre-set stopping rules are met. Each branch of the tree ends in a terminal node; each observation falls into one and exactly one terminal node, and each terminal node is uniquely defined by a set of rules. If a node becomes pure, that is, all cases in the node have identical values of the dependent variable, the node will not be split. If all cases in a node have identical values for each predictor, the node will not be split. If the current tree depth reaches the user-specified maximum tree depth limit, the tree growing process will stop. If the size of a node is less than the user-specified minimum node size, the node will not be split. If the split of a node results in a child node whose size is less than the user-specified minimum child node size, the node will not be split.

Conclusion and Future Work

Decision tree induction is one of the classification techniques used in decision support systems and machine learning. With the decision tree technique, the training data set is recursively partitioned using a depth-first (Hunt's method) or breadth-first greedy technique (Shafer et al., 1996) until each partition is pure, i.e. belongs to the same class/leaf node (Hunt et al., 1966; Shafer et al., 1996). The decision tree model is preferred among other classification algorithms because it is an eager learning algorithm that is easy to implement. Decision tree algorithms can be implemented serially or in parallel. Whichever implementation method is adopted, most decision tree algorithms in the literature are constructed in two phases: a tree growth phase and a tree pruning phase. Tree pruning is an important part of decision tree construction, as it improves the classification/prediction accuracy by ensuring that the constructed tree model does not overfit the data set (Mehta et al., 1996).
In this study we focused on serial implementations of decision tree algorithms, which are memory-resident, fast and easy to implement compared with parallel implementations of decision trees, which are complex to implement. The disadvantages of a serial decision tree implementation are that it is not scalable (disk-resident) and that it cannot exploit the underlying parallel architecture of the computer system's processors. Our experimental analysis of the performance of the commonly used decision tree algorithms on the Statlog data sets (Michie et al., 1994) shows that there is a direct relationship between the execution time in building the tree model and the volume of data records, and an indirect relationship between the execution time and the attribute size of the data sets. The experimental analysis also shows that the C4.5 algorithm has good classification accuracy compared to the other

Tree Growing Process

The basic idea of tree growing is to choose, among all possible splits at each node, the split whose resulting child nodes are the purest. In this algorithm, only univariate splits are considered; that is, each split depends on the value of only one predictor variable. The possible splits consist of the possible splits of each predictor. For each continuous or ordinal predictor, sort its values from the smallest to the largest; then go through each value from the top to examine each candidate split point (call it v: if x ≤ v, the case goes to the left child node, otherwise it goes to the right) and determine the best one. The best split point is the one that maximizes the splitting criterion when the node is split according to it; the splitting criterion is defined below. For each nominal predictor, examine each possible subset of categories (call it A: if x ∈ A, the case goes to the left child node, otherwise it goes to the right) to find the best split. Then find the node's best split: among the best splits found for the individual predictors, choose the one that maximizes the splitting criterion, and split the node using this best split if the stopping rules are not satisfied.

Splitting criteria and impurity measures. At node t, the best split s is chosen to maximize a splitting criterion Δi(s,t). When the impurity measure for a node can be defined, the splitting criterion corresponds to a decrease in impurity. In SPSS products, I(s,t) = p(t) · Δi(s,t) is referred to as the improvement.

Stopping rules. Stopping rules control whether the tree growing process should be stopped. The following stopping rules are used: if a node becomes pure, that is, all cases in the node have identical values of the dependent variable, the node will not be split.
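The candidate-split scan and the impurity-decrease criterion Δi(s,t) described above can be sketched for one continuous predictor. Gini impurity is used here as an assumed, commonly paired impurity measure; the text does not fix a specific one, and the function names are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity i(t) = 1 - sum over classes of p(j|t)^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, labels):
    """Scan the sorted values of one continuous predictor and return the split
    point v (cases with x <= v go left) maximizing the decrease in impurity
    delta_i = i(t) - p_L * i(left) - p_R * i(right)."""
    pairs = sorted(zip(xs, labels))
    n, parent = len(pairs), gini(labels)
    best_v, best_dec = None, 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal predictor values
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        dec = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        if dec > best_dec:
            best_dec = dec
            best_v = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_v, best_dec
```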

algorithms used in the study. The variation of the data sets' class sizes, numbers of attributes and volumes of data records is used to determine which of the ID3 and CART algorithms has the better classification accuracy. In future work we will perform an experimental analysis of commonly used parallel implementations of tree algorithms, compare them with serial implementations of decision tree algorithms, and determine which is better in practice.

[5] Alcalá, J., Sánchez, L., García, S., del Jesus, M. et al., KEEL: A Software Tool to Assess Evolutionary Algorithms for Data Mining Problems. Soft Computing, 2007.
[6] Clark, P., Niblett, T., The CN2 Induction Algorithm. Machine Learning, 1989, 3(4).
[7] Hämäläinen, W., Vinni, M., Comparison of Machine Learning Methods for Intelligent Tutoring Systems. Conference on Intelligent Tutoring Systems, Taiwan, 2006.
[8] Jovanoski, V., Lavrac, N., Classification Rule Learning with APRIORI-C. In: Progress in Artificial Intelligence, Knowledge Extraction, Multi-agent Systems, Logic Programming and Constraint Solving, 2001.
[9] Romero, C., Ventura, S., Educational Data Mining: A Survey from 1995 to 2005. Expert Systems with Applications, 2007, 33(1).
[10] Yudelson, M.V., Medvedeva, O., Legowski, E., Castine, M., Jukic, D., Rebecca, C., Mining Student Learning Data to Develop High Level Pedagogic Strategy in a Medical ITS. AAAI Workshop on Educational Data Mining, 2006.

[1] Baik, S., Bala, J. (2004). A Decision Tree Algorithm for Distributed Data Mining: Towards Network Intrusion Detection. Lecture Notes in Computer Science.
[2] McSherry, D. (1999). Strategic induction of decision trees. Knowledge-Based Systems.
[3] Agrawal, R., Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. Proc. 20th Int. Conf. Very Large Data Bases (VLDB '94).
[4] Seifert, J.W. Data Mining: An Overview. Analyst in Information Science and Technology Policy, Resources, Science, and Industry Division.


PART - 2


Analysis of Multidimensional Modeling Related To Conceptual Level

Udayan Ghosh and Sushil Kumar, University School of Information Technology, Guru Gobind Singh Indraprastha University, Kashmere Gate, Delhi.
Abstract: Many OLAP usages indicate that their usability and performance degrade due to wrong interpretation of business dimensions. In this paper, we focus on business dimensions through multidimensional data model structures for DWs. Multidimensionality is a design technique that separates information into facts and dimensions by understanding the business processes and the required dimensions. Many approaches have been suggested, but we will focus on the widely accepted star schema, with a slight improvement using the snowflake schema, a variation of the star schema in which the dimension tables are organized into a hierarchy by normalizing them. The multidimensional model presents information to the end user in a way that corresponds to his normal understanding of his business: dimensions, key figures or facts from the different scenarios that influence the user's requirements. Key Words: Business Dimensions, Multidimensional Modeling, Data Warehouse, OLAP, Star Schema, Snowflake Schema, Fact. 1. Introduction


DWs generalize and consolidate data in the multidimensional space. The construction of a DW involves data cleaning, data integration, and data transformation, and can be viewed as an important preprocessing step for data mining. Moreover, data warehouses provide on-line analytical processing (OLAP) tools for the interactive analysis of multidimensional data of varied granularities, which facilitates effective data generalization and data mining. A data warehouse is a set of data and technologies aimed at enabling executives, managers and analysts to make better and faster decisions. DWs manage information efficiently as the main organizational asset. Given the principal role of DWs in taking strategic decisions, quality is fundamental. Data warehouse systems are

important tools in today's competitive, fast-changing era. In the last several years, many firms have spent millions of dollars in building enterprise-wide data warehouses. DWs have inherent support for complex queries, while their maintenance does not impose a transactional load. These features cause the design techniques and strategies used to differ from the traditional ones. Many people feel that, with competition mounting in every industry and domain, data warehousing is the latest must-have marketing weapon and panacea: a way to retain customers by learning more about their requirements. Enterprise DW [3]: an enterprise DW provides a centralized database architecture for decision support for the enterprise. Operational Data Store: it has a broader, enterprise-wide frame but, unlike the enterprise DW, data is refreshed in near real time and used for routine business processes. Data Mart: a data mart is a subset of the data warehouse that supports a specific domain or business process. 1.1 Characteristics of DW [8]. The main characteristics of a data warehouse are: Subject oriented. The DW is organized around major subjects, such as supplier, product, customer and sales. Separate. The DW is always a physically distinct store of data transformed from the application data found in the traditional OLTP environment. Due to this separation, a data warehouse does not require transaction processing, recovery, and concurrency control mechanisms. It usually combines two operations in data accessing: initial loading of data and access of data. Time Variant. Problems have to be addressed; trends and correlations have to be explored. Data are time-stamped and associated with defined periods of time [2]. Not dynamic. Data is updated only periodically, not on an individual basis.

Integrated Performance. The data requested by the user has to perform well at all scales of integration. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on. Consistency. The architecture and contents of the data are very significant; consistency can only be ensured by the use of metadata, which is independent of the source and collection date of the data. 1.2 Data warehouse building process [7]: to construct an effective data warehouse we have to analyze the business processes, dimensions and business environment. After obtaining the DW logical schema, we build it through the application of transformations to the source logical schema. The construction of a large and complex information system can be viewed as the construction of a large and complex building, for which the owner, architect, and builder have different views. These perspectives are merged to form a complex framework that represents the top-down, business-driven, or owner's perspective, as well as the bottom-up, builder-driven, or implementer's view of the information system. The multidimensional model transforms the visualization of a schema into a more business-focused framework. All these structures (cubes, measures and dimensions) interact with each other to provide an extremely powerful reporting environment. Most of the multidimensional database systems used in business frameworks and decision-support applications fall into two categories: the first is traditional relational DBMSs that create multidimensional schemas such as the star schema and snowflake schema by applying the mature theory of relational database systems; the second is multidimensional database systems designed specially for online analysis. In the star schema, all dimension tables are directly connected with the fact table and have no connections with other dimension tables.
However, it may be necessary to separate one dimension into many dimensions as per the business dimension mappings. Such a structure is called the snowflake model, a slight modification of the star schema that adds the relational constraints of normalization. Relational database systems are suitable for OLTP [4] applications, but they do not guarantee to meet the expectations of online analytical processing applications in a real-time environment. Relational OLAP [3] systems, although inherently object-relational DBMSs, can only be classified as relational database systems because, once adapted to support OLAP applications, only the relational approach is used and the object features disappear. A multidimensional database is a type of database (DB) which is optimized for DW and OLAP applications. Multidimensional databases are mostly generated from the data in existing relational databases; a multidimensional database allows a user to pose problems and questions related to summarizing business operations and trend analysis. An application that processes data from a multidimensional database is formally referred to as a multidimensional OLAP application. A multidimensional database, or multidimensional database management system, implies the ability to rapidly process the data in the database so that answers can be generated quickly. A number of vendors provide products that use multidimensional databases; their approaches to how data is stored and to the user interface differ. For multidimensional database systems, application development is not eased, because uniform specifications do not exist; they are special database systems which do not support comprehensive querying. Four different views regarding the design of a data warehouse must be considered: the top-down view, the data source view, the data warehouse view, and the business query view. The top-down view allows the selection of the relevant information vital for the DW. This information covers the current and future business requirements.
The data source view reflects the information being captured, stored, and managed by operational systems. This information may be documented at various levels of detail and accuracy, from individual data source tables to integrated data source tables. Data sources are often modeled by traditional data modeling approaches, such as the entity-relationship model or CASE (computer-aided software engineering) tools. The data warehouse view combines fact tables and dimension tables. It represents the information that is stored inside the DW, including predetermined aggregates and counts, as well as information pertaining to the source, date, and time of origin, added to provide a historical context. Finally, the business query view is the perspective of data in the data warehouse from the viewpoint of the end user. 2. Multidimensional Modeling [9]: it is a technique for formalizing and visualizing data models as a set of measures that are defined by common aspects of the business processes. Business dimensional modeling has two basic concepts. Facts:

A fact is a collection of related data items, composed of business measures. A fact is a focus of interest for the decision-making business process.

Measures are continuously valued results that describe facts; a fact is a business statistic. Multidimensional database technology is a key element in the interactive analysis of large amounts of data for decision-making purposes. The multidimensional data model introduced here is based on relational elements: dimensions are modeled as dimension relations. Data mining applications provide knowledge by searching semi-automatically for previously unknown patterns, trends and relationships in multidimensional database structures. OLAP software enables analysts, managers, and executives to gain insight into the performance of an enterprise through fast and interactive access to a wide range of views of data organized to reflect the multidimensional nature of enterprise-wide data. 3.1 The Goals of Multidimensional Data Models [11]: to enable end users to access information in a way that corresponds to their normal understanding of the business, i.e. key figures or facts from the different perspectives of the business environment that influence them; and to facilitate a physical implementation that the software (the OLAP engine) recognizes, thus allowing a program to easily access the data required for processing. 3.2 Usages of multidimensional modeling with business dimensions: INFORMATION PROCESSING: support for querying, basic statistical analysis, and reporting using crosstabs, graphs, tables or charts. A current trend in data warehouse information processing is to construct low-cost Web-based application tools, integrated with Web browsers, for global access. ANALYTICAL PROCESSING: using dimensions with OLAP, it includes OLAP operations such as slice-and-dice, drill-down, roll-up, drill-through, drill-across and pivoting. It generally operates on historical data in both summarized and detailed forms. The major strength of on-line analytical processing over information processing is the multidimensional analysis of data warehouse data.
DATA MINING: supported by KDD (knowledge discovery in databases), it helps to discover hidden patterns and associations, perform clustering, classification and prediction, and present the mining results using visualization tools.
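As a toy illustration of the roll-up and slice operations listed above, consider an in-memory fact table. The record layout and function names are invented for this sketch and are not from the text:

```python
from collections import defaultdict

# Tiny fact table: (region, quarter, sales_amount)
facts = [
    ("North", "Q1", 100), ("North", "Q2", 150),
    ("South", "Q1", 200), ("South", "Q2", 250),
]

def roll_up(facts, dim):
    """Aggregate the sales measure along one dimension
    (0 = region, 1 = quarter)."""
    totals = defaultdict(int)
    for row in facts:
        totals[row[dim]] += row[2]
    return dict(totals)

def slice_(facts, dim, value):
    """Fix one dimension to a single value (the OLAP 'slice')."""
    return [row for row in facts if row[dim] == value]

print(roll_up(facts, 0))       # {'North': 250, 'South': 450}
print(slice_(facts, 1, "Q1"))  # [('North', 'Q1', 100), ('South', 'Q1', 200)]
```

Drill-down is the inverse of roll-up: it re-introduces a dimension that a previous aggregation had summed away.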

Dimension: the parameter over which we perform analysis of facts and data; the parameter that gives meaning to a measure (e.g., number of customers is a fact that we may analyze over time). When we come up with more complex queries for our DW involving three or more dimensions, the multidimensional database plays an eminent role. Dimensions are the perspectives by which summarized data can be viewed. Cubes are data manipulation units composed of fact tables and dimensions from the data warehouse (DW). Dimensional modeling has emerged as the only coherent architecture for building distributed data warehouse applications. 3. Multidimensional Modeling using business dimensions [9]: multidimensional database technology has come a long way since its inception more than 30 years ago. It has recently begun to reach the mass market, with major providers now delivering multidimensional database engines along with their traditional relational database software, often at no extra cost. A multidimensional data model is typically used for the design of corporate data warehouses and departmental data marts. Such a model can be adopted with a star schema, snowflake schema, or fact constellation schema. The core of the multidimensional model is the data cube, which consists of a large set of facts (or measures) and a number of business dimensions. Business dimensions are the entities or perspectives with respect to which an organization wants to keep information, and they are hierarchical in nature. Multidimensional technology has also made significant gains in scalability and maturity to describe the organization's current business requirements. The multidimensional model is based on three key concepts: modeling business rules; cubes and measures; dimensions.

3.3:- Logical Multidimensional Model

The multidimensional data model is important because it enforces simplicity. As Ralph Kimball states in his landmark book, The Data Warehouse Toolkit: the central attraction of the dimensional model of a business is its simplicity; that simplicity is the fundamental key that allows users to understand databases, and allows software to navigate databases efficiently. The multidimensional data model is composed of logical cubes, measures, dimensions, hierarchies, levels, and attributes. The simplicity of the model is inherent because it defines objects that represent real-world business entities. Analysts know which business measures they are interested in examining, which dimensions and attributes make the data meaningful, and how the dimensions of their business are organized into levels and hierarchies. Multidimensional data cubes are the basic logical model for OLAP applications [12]. The focus of OLAP tools is to provide multidimensional analysis of the underlying information. To achieve this goal, these tools employ multidimensional models for the storage and presentation of data.

Figure 1: Diagram of logical multidimensional model

Users can quickly and easily create multi-level queries. The multidimensional query model has one important advantage over relational querying techniques: each dimension can be queried separately. This allows users to divide and analyze what would be a very complex query in simple, manageable steps. The multidimensional model also provides powerful filtering capabilities. Additionally, it is possible to create conditions based on measures that are not part of the final report. Because the dimensional query is independent of the filters, it allows complete flexibility in determining the structure of the condition. The relational implementation of the multidimensional data model is typically a star schema or a snowflake schema. 3.4 Conceptual View:

Figure 2: Levels of view

Figure 1: Keys of multidimensional model

A logical model for cubes (Figure 1) is based on the key observation that a cube is not a self-existing entity, but rather a view over an underlying data set. Logical cubes provide a means of organizing measures that have the same shape, that is, they have exactly the same dimensions. The relational model forces users to manipulate all the elements as a whole, which tends to lead to confusion and unexpected result sets. In contrast, the multidimensional model allows end users to filter each dimension in isolation and uses more friendly terms such as Add, Keep and Remove.

The conceptual view describes the semantics of a domain, being the scope of the model; for example, it may be a model of the interest area of an organization or industry. It consists of entity classes, representing kinds of things of significance in the domain, and relationships: assertions about associations between pairs of entity classes. A conceptual view specifies the kinds of facts or propositions that can be expressed using the model. In that sense, it defines the allowed expressions in an artificial 'language' with a scope that is limited by the scope of the model. Early phases of many software development projects emphasize the design of a conceptual data model. Such a design can be detailed into a logical data model [6]. In later stages, this model may be translated into a physical data model. However, it is also possible to implement a conceptual model directly. The Multidimensional Conceptual View provides a multidimensional data model that is intuitively analytical and easy to use. Business users' view of an enterprise is multidimensional in nature. Therefore, a

multidimensional data model conforms to how users perceive business problems. 3.5 Star schema architecture with a business dimension scenario: it consists of a fact table for a particular business process (for example, a sales analysis would take Sales as the fact table) with a single table for each dimension. The star schema is a special design technique for multidimensional data representations; it optimizes data query operations rather than data update operations. The star schema is a relational database schema for representing multidimensional data. It is the simplest form of data warehouse schema that contains one or more dimension tables and fact tables [15]. It is called a star schema because the entity-relationship diagram between the dimension and fact tables resembles a star-like structure, where one fact table is connected to multiple dimensions. The center of the star schema consists of a large fact table that points towards the dimension tables. The advantages of the star schema are slicing down, performance increase and easy understanding of data. Steps in designing a star schema: identify a business process for analysis; identify measures or facts; identify the dimensions for the facts; list the columns that describe each dimension; determine the lowest level of summary in the fact table [15]. Snowflake schema: the snowflake schema is a variant of the star schema, where some dimension tables are normalized, further splitting the data into additional tables [16]. The resulting schema graph forms a shape similar to a snowflake. Important aspects of the star schema and snowflake schema: in a star schema every dimension has a primary key and a dimension table does not have any parent table, whereas in a snowflake schema a dimension table has one or more parent tables. In a star schema, hierarchies for the dimensions are stored in the dimension table itself, whereas in a snowflake schema hierarchies are broken into separate tables [16,17].
These hierarchies help to drill down the data from the topmost level to the lowermost level. The snowflake schema is the normalized form of the star schema.
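As a concrete sketch of the star schema described above, a fact table can be represented as rows holding foreign keys into dimension tables plus a measure; answering a query then means joining fact rows with their dimension attributes. The table and column names here are invented for illustration:

```python
# Dimension tables keyed by surrogate key
dim_product = {1: {"name": "Laptop", "category": "Electronics"}}
dim_location = {10: {"city": "Delhi", "region": "North"}}

# Fact table: each row holds foreign keys plus the measure
fact_sales = [
    {"product_id": 1, "location_id": 10, "amount": 55000},
    {"product_id": 1, "location_id": 10, "amount": 62000},
]

def sales_by_region(facts, locations):
    """Join fact rows to the location dimension and roll up by region."""
    totals = {}
    for row in facts:
        region = locations[row["location_id"]]["region"]
        totals[region] = totals.get(region, 0) + row["amount"]
    return totals

print(sales_by_region(fact_sales, dim_location))  # {'North': 117000}
```

In a snowflake variant, `dim_location` would itself reference a separate, normalized region table instead of carrying the region attribute inline.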

4. Proposed Model: In this section we summarize the basic concepts of the object-oriented multidimensional model [19]. The multidimensional model is the core of a comprehensive object-oriented model of a DW, containing all the details necessary to specify a data cube: the dimensions, the classification hierarchies, and the descriptions of fact and measure attributes.

Figure 3: Hierarchical View of Proposed Model Schema

At the lowest layer, each vertex represents an occurrence of an attribute or measure, e.g. product name, day, customer city. A set of semantically related vertices is grouped together to construct an Elementary Semantic Group at the lower layer. Next, several related Elementary Semantic Groups are grouped together to form a Semantic Group at the middle layer; the next upper layer is constructed to represent any context of business analysis. A set of vertices of any middle layer that determines the other vertices of the lower layer is called the Determinant Vertices. This layered structure may be further organized by combining two or more middle- and lower-layer groups to represent the next upper-level layers; from the topmost layer the entire database appears to be a graph with middle-layer groups as vertices and edges between middle-layer objects. A Dimensional Semantic Group is a type of middle-layer object representing a dimension member, which is an encapsulation of one or more lower-layer groups along with extension and/or composition of one or more constituent middle-layer groups. A Fact Semantic Group (FSG) is a type of group representing facts, which is an inheritance of all related lower, middle and a set of upper groups defined on measures. In order to materialize the Cube, one must ascribe values to the various measures along all dimensions; a Cube can be created from an FSG. Example: consider an example based on a Sales application, with sales Amount as the measure and four dimensions Customer, Model, Time and Location with the sets of attributes {C_ID, C_NAME, C_ADDR}, {M_ID, M_NAME, P_ID, P_NAME, P_DESC}, {T_ID, T_MONTH, Q_ID, Q_NAME, YEAR} and {L_ID, L_CITY, R_ID, R_NAME, R_DESC} respectively. The Model, Time and Location dimensions have upper-level hierarchies, say Product, QTR and Region respectively. Then, in the notation of the GOOMD model, there will be four middle-layer groups {DCustomer, DModel, DLocation, DTime} with hierarchy.

Figure 4: Schema for Sales Application in Proposed Model

The model also provides an algebra of OLAP operators that operate on the different semantic groups. The Select operator is an atomic operator that extracts vertices from any middle-layer group depending on some predicate P. The Retrieve operator extracts vertices from any Cube using some constraint over one or more dimensions or measures; it is helpful to realize the slice and dice operations of OLAP. The Aggregation operators perform aggregation on Cube data based on relational aggregation functions like SUM, AVG and MAX on one or more dimensions, and are helpful to realize the roll-up and drill-down operations of OLAP. 5. Comparisons of Conceptual Design Models. Property 1 (Additivity of measures): DF, starER and OOMD support this property. Using the ME/R model, only static data structure can be captured; no functional aspect can be implemented with the ME/R model. Property 2 (Many-to-many relationships with dimensions): starER and OOMD support this property; the DF and ME/R models do not support many-to-many relationships. Property 3 (Derived measures): none of the conceptual models include derived measures as part of their conceptual schema except the OOMD model.
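The Retrieve and Aggregation operators described above can be illustrated in Python. This is only an interpretation: the paper defines these operators over semantic groups, whereas here a cube is simplified to a plain list of cells, and all names are invented:

```python
# A cube cell: (customer, model, location, time, sales_amount)
cube = [
    ("C1", "M1", "Delhi", "Q1", 100),
    ("C1", "M2", "Delhi", "Q2", 200),
    ("C2", "M1", "Mumbai", "Q1", 300),
]

def retrieve(cube, pred):
    """Retrieve: extract cells satisfying a constraint (slice/dice)."""
    return [cell for cell in cube if pred(cell)]

def aggregate(cube, dim, fn=sum):
    """Aggregation: apply SUM/AVG/MAX etc. over the measure,
    grouped by one dimension (supports roll-up and drill-down)."""
    groups = {}
    for cell in cube:
        groups.setdefault(cell[dim], []).append(cell[4])
    return {k: fn(v) for k, v in groups.items()}

q1 = retrieve(cube, lambda c: c[3] == "Q1")  # dice on Time = Q1
by_city = aggregate(cube, 2)                 # roll up sales by Location
```

Swapping `fn=sum` for `max` or a mean function realizes the other relational aggregation functions the text mentions.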


[1] Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques, Second Edition. Morgan Kaufmann.
[2] S. Kelly. Data Warehousing in Action. John Wiley & Sons, 1997.
[3] Kimball, R. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. John Wiley and Sons, 1996. ISBN 0-471-15337-0.
[4] S. Chaudhuri, U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Record 26(1), 1997.
[5] G. Colliat. OLAP, relational and multi-dimensional database systems. SIGMOD Record 25(3), 1996.
[6] M. Golfarelli, D. Maio, and S. Rizzi. The dimensional fact model: a conceptual model for data warehouses. IJCIS, 7(2-3):215-247, 1998.
[7] M. Jarke, M. Lenzerini, Y. Vassiliou, and P. Vassiliadis, editors. Fundamentals of Data Warehousing. Springer-Verlag, 1999.
[8] L. Cabibbo and R. Torlone. A logical approach to multidimensional databases. In Proc. of EDBT-98, 1998.
[9] E. Franconi and U. Sattler. A data warehouse conceptual data model for multidimensional aggregation. In Proc. of the Workshop on Design and Management of Data Warehouses (DMDW-99), 1999.
[10] McGuff, F. Data Modeling for Data Warehouses. October 1996.
[11] Gyssens, M. and Lakshmanan, L.V.S. A foundation for multi-dimensional databases. Technical Report, Concordia University and University of Limburg, February 1997.
[12] M. Blaschka, C. Sapia, G. Höfling, and B. Dinter. Finding Your Way through Multidimensional Data Models. In DEXA '98, pages 198-203, 1998.
[13] Antoaneta Ivanova, Boris Rachev. Multidimensional models: Constructing DATA CUBE. International Conference on Computer Systems and Technologies, CompSysTech 2004.
[14] Torben Bach Pedersen, Christian S. Jensen. Multidimensional Database Technology. Aalborg University.
[15] Rakesh Agrawal, Ashish Gupta, and Sunita Sarawagi. Modeling multidimensional databases. Research Report, IBM Almaden Research Center, San Jose, California, 1996.
[16] P. Vassiliadis and T.K. Sellis. A Survey of Logical Models for OLAP Databases. ACM SIGMOD Record, vol. 28, no. 4, 1999.
[17] L. Cabibbo and R. Torlone. A logical approach to multidimensional databases. In Proc. of EDBT-98, 1998.
[18] Deepti Mishra, Ali Yazici, Beril Pinar Baaran. A Case study of Data Models in Data Warehousing.
[19] Anirban Sarkar, Swapan Bhattacharya. Object Relational Implementation of Graph Based Conceptual Level Multidimensional Data Model.

Property 4 (Non-strict and complete classification hierarchies): although DF and ME/R can define certain attributes for classification hierarchies, the starER model can define exact cardinality for non-strict and complete classification hierarchies; OOMD can also represent non-strict and complete classification hierarchies. Property 5 (Categorization of dimensions, specialization/generalization): all conceptual design models except DF support this property. Property 6 (Graphic notation and specifying user requirements): all modeling techniques provide a graphical notation to help designers in the conceptual modeling phase. The ME/R model also provides state diagrams to model system behavior and provides a basic set of OLAP operations to be applied from these user requirements. OOMD provides a complete set of UML diagrams to specify user requirements and help define OLAP functions. Property 7 (Case tool support): all conceptual design models except starER have case tool support [18]. 6. Conclusion. This paper aims to improve our understanding of the multidimensional structures related to business processes and dimensions. The multidimensional data model combines facts and context dimensions using the star and snowflake schemas. This paper relates the various multidimensional modeling approaches according to the multidimensional space, language aspects and physical representation of the traditional database model, and establishes the relationship of multidimensional data to object-oriented data.



Gurpreet Singh*, Shivani Kang** Faculty of Computer Applications*, Dept. of Computer Science** Chandigarh Group of Colleges, Gharaun Campus, Mohali, Punjab. Abstract: This paper presents the Centralized Energy Management System (CEMS), a dynamic fault-tolerant reclustering protocol for wireless sensor networks. CEMS reconfigures a homogeneous network both periodically and in response to critical events (e.g. cluster head death). A global TDMA schedule prevents costly retransmissions due to collision, and a genetic algorithm running on the base station computes cluster assignments in concert with a head selection algorithm. CEMS performance is compared to the LEACH-C protocol in both normal and failure-prone conditions, with an emphasis on each protocol's ability to recover from unexpected loss of cluster heads. Keywords: CEMS, TDMA, WSN, GA. I. Introduction

Wireless sensor networks (WSNs) are increasingly deployed in a variety of environments and applications, ranging from the monitoring of medical conditions inside the human body to the reporting of mechanical stresses in buildings and bridges. In these and many other WSN applications the sensors cannot be recharged once placed, making energy expenditure the primary limiting factor in overall network lifetime. One standard WSN configuration consists of a set of sensors that communicate with the external world via a base station, or sink, that has no power constraints. The sensors number in the hundreds or even thousands, and are primarily constrained by a limited battery supply of available energy. While the sink is modeled as a single node, it may provide access to other systems upstream, such as distributed processing facilities or databases devoted to consolidating and cataloging the reported WSN data. Since the primary form of energy dissipation for wireless sensors is radio transmission and reception [1], a variety of network modifications have been proposed to limit radio use as much as possible. Sensor clustering at the network layer has been shown to be a scalable method of reducing energy dissipation. Rather than individual sensor nodes transmitting their data to the base station, they instead transmit to another sensor designated as the local cluster head. The cluster head then sends aggregated (and possibly compressed) sensor information to the sink as a single transmission. Note that clustering makes some nodes more important than others, while increasing the energy dissipation of those same nodes. This paper implements a novel reclustering technique that minimizes both energy expenditure and loss of network coverage due to the failure of cluster heads. CEMS (Centralized Energy Management System) moves almost all processing not directly related to data collection off of the energy-limited sensor nodes and onto the sink. Furthermore, the base station maintains a record of expected transmission times from the network's cluster heads, based on their location in the global TDMA schedule. If a cluster head consistently fails to transmit during its expected window of time, the sink triggers an emergency reclustering to restore network coverage.
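The sink-side failure detection just described (a cluster head repeatedly missing its TDMA window triggers an emergency reclustering) might be sketched as follows; the miss threshold and data structures are assumptions for illustration, not taken from the paper:

```python
MISS_THRESHOLD = 3  # consecutive missed TDMA slots before reclustering

class Sink:
    """Base-station bookkeeping of expected cluster-head transmissions."""
    def __init__(self, cluster_heads):
        self.missed = {ch: 0 for ch in cluster_heads}

    def on_slot_end(self, head, received):
        """Called at the end of each cluster head's TDMA slot."""
        if received:
            self.missed[head] = 0      # head is alive; reset its counter
        else:
            self.missed[head] += 1
        if self.missed[head] >= MISS_THRESHOLD:
            return "recluster"         # trigger emergency reclustering
        return "ok"

sink = Sink(["CH1", "CH2"])
sink.on_slot_end("CH1", received=False)
sink.on_slot_end("CH1", received=False)
print(sink.on_slot_end("CH1", received=False))  # prints "recluster"
```

A threshold above one slot avoids reclustering the whole network on a single lost packet, at the cost of a few slots of delayed recovery.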


Wireless Sensor Network

While WSNs have much in common with more traditional ad-hoc and infrastructure-mode wireless networks, they differ in several important ways. A WSN generally has a large number of sensors scattered over an area and a single node referred to as the base station, or sink, which is responsible for receiving data transmitted by sensors in the field. It bears some similarity to an access point in an infrastructure-mode network. The sink may or may not be located inside of the space being sensed, and is almost always considered to be a powered node operating without energy constraints. Depending on the application and configuration of the WSN, the base station may have additional responsibilities such as coordinating network activities, processing or formatting incoming data, or working with an upstream data analysis system to provide data matching any query requests that it receives [2]. In contrast to the single base station, WSNs may have hundreds or even thousands of sensors operating in

the field. These low-power devices are often battery powered and sometimes include solar panels or other alternative energy sources. During their limited lifespan (defined as the time interval during which sufficient energy remains to transmit data), sensors are tasked with monitoring a single aspect of their surrounding environment and reporting their sensed data via an onboard radio transceiver. Given the above characteristics, a few important differences from standard WLANs become apparent:

- Most traffic is upstream, from the sensors to the base station. The small amount of downstream traffic tends to be dominated by broadcast traffic from the base station providing generic updates to all sensor nodes.
- Power use is a key performance metric, as sensors are battery powered.
- Network links tend to be low-capacity, as high throughput is energy intensive and often unnecessary.
- Long delays may be acceptable for many WSN applications (e.g., a network monitoring soil pH will be relatively immune to high latencies).

Components of Sensor Network

WSNs employ a variety of hardware platforms and software systems. Sensors themselves vary in size down to a few millimeters, and even within similar sensors the radio transceivers, sensing elements, and microprocessor facilities may vary. Given the wide variety of platforms available, any protocols developed for a WSN must consider the characteristics of the underlying hardware on which they will operate.

Wireless sensor nodes have shrunk significantly as MEMS technology has progressed. Individual components are now often integrated into the same chip, and hardware design has evolved to reflect this change. A sensor node is composed of several independent components linked together to form one operative package: a power supply provides the necessary energy to a sensing unit designed to monitor the environment and produce a representative signal, a processor governing sensor operations, onboard flash memory, and the radio transceiver responsible for linking the sensor node to the rest of the network.

Sensing Unit
At their simplest, sensors are designed to generate a signal corresponding to some changing quantity in the surrounding environment. This may be anything from a simple Peltier diode to measure temperature (such as the Microchip TC74) to something as complex as a charge-coupled device to monitor video input (such as the Omnivision OV7640). The data are sent to an onboard microprocessor after being converted to a digital signal by the sensor electronics, which may perform some processing or culling before electing to transmit the information over the module's transceiver. Generally, sensors are procured and attached to WSN sensor platforms by the purchaser, and are often manufactured by different companies than those which provide the platform itself.

Processing Unit
A sensor's processor must, at minimum, serve as an effective interface to the sensor module and regulate data flow from the sensing unit to the radio transceiver. There are currently three popular types of processing unit in general use: microcontrollers, microprocessors, and Field-Programmable Gate Arrays (FPGAs) [4]. Onboard storage, often in the form of flash memory, is also often included as part of a sensor's processing unit. Microcontrollers such as the 8-bit TI MSP430 [9] and 16-bit Atmel AVR are the simplest and most common form of processor; they cannot support complex operations, but they run at a low clock speed and consume the least power. They are most often used when little data processing or decision making is necessary. Microprocessors such as the 32-bit Intel XScale are more general-purpose CPUs and are potentially much more powerful than microcontrollers, with significantly higher clock speeds and more flexibility in terms of their programming. Field-Programmable Gate Arrays (FPGAs) use a hardware description language to allow sensor modules to be reconfigured in the field to rapidly process the data that their sensor units are reporting. This can be invaluable for real-time surveillance networks and target tracking, where image processing algorithms can be implemented at the hardware level without purchasing a dedicated GPU. FPGAs are also the highest energy consumers of the three processor types, and may not be compatible with general-purpose WSN software systems.

Software written for wireless sensor networks differs from more conventional platforms in several respects. The vast majority of WSN operating systems and network protocols produced in academia or in the commercial sector are power-aware due to the limited amount of energy available to sensor nodes. From a software design standpoint, this necessitates optimizing algorithms and program architectures to minimize the amount of energy dissipated per operation. Efficiency of execution in terms of running time is still a concern, but is of secondary importance: if an algorithm could finish rapidly but consume more power than a slower implementation, the slower version might still be selected for use in a WSN. Furthermore, sensor nodes are extremely limited in terms of resources. On-board RAM capacity is extremely small due to the energy drain of volatile memory. Software systems must therefore rely primarily on register-based operations and any flash-based storage medium that might be present. This requires that a program use a limited number of often-accessed data structures, and that it perform computations using as little memory as possible. Two other distinguishing features of WSN software arise less from hardware limitations and more from environmental constraints [5]. Since sensors can be deployed in a potentially inaccessible field (e.g., underwater, inside walls, in a combat zone), WSN software systems must be able to run unattended for long periods of time. Any logical or physical faults should be dealt with, worked around, or minimized in impact without the intervention of human agencies. Support for any kind of graphical user interface, or even a terminal interface in field conditions, is not generally provided. Sensors may be reprogrammed or configured in controlled conditions, however, via software running on an external machine to which individual nodes may be connected.

Operating Systems
WSN-specific operating systems are distinguished from existing embedded OSs such as ChibiOS/RT or Nucleus RTOS by their lack of real-time processing constraints. Sensor networks are rarely interactive, and only a few specific applications such as live video surveillance impose any strict time constraints on data acquisition and processing. Since sensor hardware is often extremely limited in terms of both resources and available energy, small footprints and efficient use of memory and processor cycles are key requirements for any WSN operating system (e.g., TinyOS).

WSN Simulation
Wireless sensor network research is largely directed at improving the energy efficiency, coverage, reliability, and security of sensors and networks. This translates into a need for detailed information about conditions on the lower levels of the network stack in an ad-hoc wireless environment. Conventional network simulators often have more support for packet-level network-layer simulation than for, e.g., frame-level information gathering and radio energy dissipation modules. To meet this need, a number of simulators have evolved or been adapted to serve the needs of WSNs (e.g., GloMoSim).

IV. The Centralized Energy Management System (CEMS)

The Centralized Energy Management System is a clustering protocol that exploits the predictable nature of TDMA-based channel access to rapidly detect and respond to critical failures. Almost all energy-intensive operations (such as cluster formation) are moved upstream to the base station, which is assumed not to have any energy constraints. CEMS has two distinct phases: cluster formation and steady-state operation. The former is run at the beginning of each reclustering phase, which occurs both periodically and in response to cluster head death. The base station calculates cluster assignments and notifies the new cluster heads. If all heads acknowledge, the steady-state phase is initiated. During this phase, sensors report data to their cluster heads. The data are then compressed and aggregated before being forwarded to the base station. Two assumptions governed the creation of this system: the optimal clustering configuration changes over time as the residual energy of cluster heads decreases, and any node, including a cluster head, has a non-zero probability of failing at a given time due to random accidents. The sink maintains state information on each node in the network consisting of its location and its projected amount of residual energy. The assignment of nodes to clusters is calculated using a genetic algorithm (GA) which considers the nodes' spatial positions, and the assignment of a cluster head to each cluster is calculated using node energy and position. The number of cluster heads, determined a priori, is an input parameter to the system.

Clustering Phase
CEMS's clustering phase is initiated at network startup and at each subsequent reclustering, whether due to periodic triggers or in response to cluster head failure. Selection of cluster heads and cluster members is divided into two stages. A genetic algorithm first determines cluster membership for each sensor in the network during the cluster formation stage. This information, along with spatial coordinates and current energy levels, is then passed to a cluster head selection algorithm during the head selection stage. Once both cluster heads and members have been determined, the sink informs each sensor of its new assignment during the sensor notification stage. The genetic algorithm which determines cluster membership is implemented with the GAlib C++ library. It uses a fixed-length list of integers to describe a genome representing a potential network topology. The genome's length is always equal to the current number of living sensors. Each value in the list signifies a cluster ID. (The number of clusters is determined a priori.) Selection is accomplished through the minimizing objective function presented in Figure 1. First, a centroid for each cluster is determined. Each cluster is then assigned a score based on the sum of the squared distances between each cluster member and that cluster's centroid. The sum of all cluster scores is used as the objective score for that genome [7]. Each individual has a probability of being chosen for mating equal to its fitness score divided by the sum of fitness scores over that generation.
Two individuals are chosen each generation, and the highest scoring genome is selected.

Figure 1 - Objective Function: score = sum over z = 1..m of s_z, where s_z = sum over i = 1..n of ||s_zi - c_z||^2. Here c_z is the centroid for cluster z, s_zi is sensor i in cluster z, s_z is the score for cluster z, n is the number of sensors in a given cluster, and m is the number of clusters in the genome.
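The scoring rule of Figure 1 can be illustrated with a short standalone sketch. The paper's implementation uses the GAlib C++ library; the Python function below is only an illustration of the objective computation, and its names are our own.

```python
def objective_score(positions, genome, num_clusters):
    """Objective score for one genome (lower is better): the sum over
    clusters of squared distances from each member to its cluster
    centroid. `positions` is a list of (x, y) sensor coordinates;
    `genome[i]` is the cluster ID assigned to sensor i."""
    total = 0.0
    for z in range(num_clusters):
        members = [positions[i] for i, c in enumerate(genome) if c == z]
        if not members:
            continue  # empty cluster contributes nothing
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        total += sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in members)
    return total
```

For example, four sensors split into two well-separated pairs score 4.0 (each sensor lies one unit from its pair's centroid), whereas assigning all four to one cluster scores much higher, so the minimizing GA prefers the compact clustering.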

Sensor Notification
After sensors have been assigned and heads elected to clusters, the base station sends a message to each cluster head informing it of its new role, which sensors are in its cluster, the distance it must transmit to reach the base station, and the distances that its members must transmit [9]. The cluster head relays distance and membership data to each sensor in its cluster and sends an acknowledgement to the base station. Once all acknowledgments are received, the sink initiates the network's steady-state phase. If not all cluster heads send an acknowledgement before a timeout window expires, the sink reclusters and increases the missed transmission count of any cluster head which failed to acknowledge. Any sensor with three consecutive missed transmissions will be declared dead and removed from future clustering assignments.

TDMA Scheduling
CEMS employs a global TDMA schedule (i.e., all sensors and clusters participate) to manage channel access among the sensors and the base station. There is a single broadcast slot at the beginning of each cycle, while the remaining slots are strictly unicast. Each sensor is given a unique slot in the TDMA schedule during each reclustering phase. Note that there is no guarantee a sensor's slot will be the same in two different rounds of operation. All sensors, including cluster heads, transmit to the base station in their slot. All nodes must also listen on slot 0, which is reserved for broadcast communications from the base station. Furthermore, cluster heads must listen during each slot in their cluster's range to receive data from their members. To minimize hardware delays resulting from switching between sleep and wake states, slots within a single cluster always form a contiguous block. Sensors are

in sleep mode at all other times, with their radio electronics turned completely off.
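The slot layout described above can be sketched as follows. The function and its representation of clusters are our own simplification for illustration, not CEMS's actual implementation.

```python
def build_schedule(clusters):
    """Construct a CEMS-style global TDMA schedule: slot 0 is reserved
    for base-station broadcasts, and each cluster's sensors (head
    included) occupy a contiguous block of unicast slots, so a cluster
    head wakes only once per cycle to hear all of its members.
    `clusters` maps a cluster head ID to the list of node IDs in that
    cluster (head included)."""
    schedule = {0: "broadcast"}
    slot = 1
    for head, members in clusters.items():
        for node in members:  # contiguous block per cluster
            schedule[slot] = node
            slot += 1
    return schedule
```

With two clusters {10: [10, 11, 12]} and {20: [20, 21]}, slots 1-3 go to the first cluster and slots 4-5 to the second, so each head listens over a single contiguous range.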

Failure Reclustering




CEMS uses the periodic nature of its global TDMA cycle to rapidly recover from coverage loss due to cluster head death. At the beginning of each steady-state phase, the base station computes the expected transmission times of each cluster head using its TDMA slot and the overall cycle length. If any cluster head fails to transmit during its expected time, the sink increments that sensor's missed transmission count. Three missed transmissions result in that sensor being labeled as dead, and trigger an emergency reclustering event to reconnect the cluster to the WSN. A successfully received transmission resets the sensor's missed transmission count. Note that a tradeoff exists between the recovery period and accurate classification of cluster head death [10]. The more missed transmissions required before a sensor is declared dead, the longer a cluster may be offline before emergency reclustering is triggered. A small missed transmission count, however, is vulnerable to false positives. In the field, a sensor's transmissions may be blocked by a mobile obstacle (e.g., a passing vehicle), interfered with by a spike in radio noise, etc. Misinterpretation of these temporary problems as permanent sensor death could lead to unnecessary energy expenditure and downtime due to reclustering.
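The missed-transmission bookkeeping described above can be sketched as a small state machine. The class and method names are illustrative assumptions, not taken from the CEMS implementation.

```python
class FailureDetector:
    """CEMS-style cluster head failure detection: three consecutive
    missed transmissions mark a head as dead and trigger emergency
    reclustering; any successful reception resets the count."""

    DEATH_THRESHOLD = 3

    def __init__(self, head_ids):
        self.missed = {h: 0 for h in head_ids}
        self.dead = set()

    def slot_elapsed(self, head_id, received):
        """Called once per expected slot. Returns True when emergency
        reclustering should be triggered for this head."""
        if received:
            self.missed[head_id] = 0  # reset on successful reception
            return False
        self.missed[head_id] += 1
        if self.missed[head_id] >= self.DEATH_THRESHOLD:
            self.dead.add(head_id)
            return True
        return False
```

Raising `DEATH_THRESHOLD` trades slower recovery for fewer false positives, mirroring the tradeoff discussed in the text.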

Therefore, in dense networks with overlapping areas of coverage or for networks which do not require complete coverage, a long reclustering period may be preferable. For networks where coverage must be maintained for as long as possible, a shorter reclustering period is desirable.

Figure 2: 20-Hour Reclustering Period

Reclustering Period
The ideal duration of each reclustering period in CEMS is application-specific. Figures 2 and 3 show the tradeoff between coverage and lifetime for reclustering periods of 20 hours and 100 hours, respectively. Each graph shows the residual energy over time for each sensor in the network. Given an initial population of one hundred sensors, the simulation begins with five clusters. Sharp declines in energy correspond to being made a cluster head, while gradual energy loss represents cluster membership. The relatively narrow gap between the sensor with the lowest residual energy and the sensor with the highest residual energy in Figure 2 indicates a fairly even balancing of energy costs over the network. This preserves coverage for as long as possible, after which all sensors die within a few hours of each other. Figure 3, conversely, begins to lose sensors almost immediately. In this configuration, cluster heads lost so much energy before being reassigned as cluster members that they die shortly after reclustering.

Figure 3: 100-Hour Reclustering Period



By reclustering infrequently, a network may have a longer operative lifespan at the expense of early and increasingly common gaps in its coverage. For denser networks or those monitoring conditions likely to

register on multiple sensors, this may be an acceptable tradeoff. For sparser or more precise networks, however, a decreased lifespan may be an acceptable cost for ensuring good coverage. A further tradeoff must be made between the number of clusters in a network and the expected failure rate of sensors due to accidents. A small number of cluster heads causes a significant loss of coverage when one fails, while a larger number of cluster heads causes a proportionally smaller coverage loss. A synchronized global TDMA schedule allows the base station to predict when transmissions from given cluster heads are expected, and ensures that no sensors will act as hidden terminals. CEMS uses the former feature to implement a quick recovery system that rapidly restores network coverage in the event of cluster head death.

REFERENCES
[1] Heinzelman, W.; Chandrakasan, A.; Balakrishnan, H., "Energy-efficient communication protocol for wireless microsensor networks," Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, vol. 2, 4-7 Jan. 2000.
[2] Heinzelman, W.; Chandrakasan, A.; Balakrishnan, H., "An Application-Specific Protocol Architecture for Wireless Microsensor Networks," IEEE Transactions on Wireless Communications 1 (2002).
[3] Voigt, T.; Dunkels, A.; Alonso, J.; Ritter, H.; Schiller, J., "Solar-aware clustering in wireless sensor networks," Proceedings of the Ninth International Symposium on Computers and Communications (ISCC'04), Volume 2, June 28 - July 01, 2004, IEEE Computer Society, Washington, DC, pp. 238-243.
[4] Manjeshwar, A.; Agrawal, D.P., "TEEN: a routing protocol for enhanced efficiency in wireless sensor networks," Proceedings of the 15th International Parallel and Distributed Processing Symposium, pp. 2009-2015.
[5] Tang, Q.; Tummala, N.; Gupta, S.; Schwiebert, L., "Communication Scheduling to Minimize Thermal Effects of Implanted Biosensor Networks in Homogeneous Tissue," IEEE Transactions on Biomedical Engineering 52 (2005): 1285-1293.
[6] Mudundi, S.R.; Hasham, H.A., "A New Robust Genetic Algorithm for Dynamic Cluster Formation in Wireless Sensor Networks," Proceedings of the Seventh IASTED International Conferences (2007).
[7] Hussain, S.; Matin, A.W.; Islam, O., "Genetic Algorithm for Energy Efficient Clusters in Wireless Sensor Networks," International Conference on Information Technology (ITNG'07), pp. 147-154, 2007.
[8] Grefenstette, J., "Optimization of control parameters for genetic algorithms," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-16(1), pp. 122-128, 1986.
[9] Haupt, R., "Optimum population size and mutation rate for a simple real genetic algorithm that optimizes array factors," Proc. of Antennas and Propagation Society International Symposium, 2000, Salt Lake City, Utah.
[10] Varga, A., OMNeT++ Discrete Event Simulation System, computer software, version 3.2, OMNeT++ Community Site.


Performance Evaluation of Route optimization Schemes Using NS2 Simulation

Electronics & Communication Engineering Department, D.C.R.U.S.T., Murthal (Haryana)

Abstract- In MIPv4 (Mobile Internet Protocol version 4), the main problem is triangular routing. A mobile node can deliver packets to a correspondent node directly through its foreign agent, but when the correspondent node sends a packet to the mobile node, the packet reaches the foreign agent via the home agent before being delivered to the mobile node. This asymmetry is called triangle routing. It leads to many problems, such as extra load on the network and delay in delivering packets. The next-generation IPv6 is designed to overcome this kind of problem. To solve the triangle routing problem, three different route optimization schemes are used, which eliminate inefficient routing paths by creating the shortest routing path: Liebsch's route optimization scheme, the Light Weight Route Optimization scheme, and the Enhanced Light Weight Route Optimization scheme. I have taken throughput and packet delivery fraction as performance metrics to compare these three schemes using NS-2 simulations. Throughput is the rate of communication per unit time. Packet delivery fraction (PDF) is the ratio of the data packets delivered to the destinations to those generated by the CBR sources. Using these parameters, I have found that the Enhanced Light Weight Route Optimization scheme performs better than Liebsch's route optimization scheme and the Light Weight Route Optimization scheme.

I INTRODUCTION
With the growth of wireless network technology, demand for accessing mobile networks has increased dramatically. Mobile Internet Protocol version 6 is a mobility protocol standardized by the Internet Engineering Task Force (IETF). In Mobile IPv6, communications are maintained even when the mobile node (MN) moves from its home network to a foreign network. This is because the MN sends a Binding Update (BU) message to its Home Agent (HA), located in the home network, to report its location information

whenever the MN hands off (moves) to other networks. Supporting mobile nodes in the Internet requires that the MNs maintain mobility-related information and create their own mobility signaling messages; however, MNs have limited processing power, battery, and memory resources. To overcome such limitations, the IETF has proposed the Proxy Mobile IPv6 (PMIPv6) protocol. In PMIPv6, the MN's mobility is guaranteed by newly proposed network entities: the local mobility anchor (LMA) and the mobile access gateway (MAG). PMIPv6, however, suffers from the triangle routing problem, which causes inefficient routing paths. In order to establish efficient routing paths, three different Route Optimization (RO) schemes have been introduced, which eliminate inefficient routing paths by creating the shortest routing path. The RO schemes use a correspondent information (CI) message. These are Liebsch's route optimization scheme, the Light Weight Route Optimization scheme, and the Enhanced Light Weight Route Optimization scheme. In this paper I compare these three schemes using NS-2 simulations.

II TERMINOLOGY
1. Local Mobility Anchor: The LMA is the Home Agent of an MN in a PMIPv6 domain. It is the topological anchor point for the MN's home network. The LMA provides charging and billing services to the MN when the MN accesses network resources and services, so communication between MNs must pass through the LMA.
2. Mobile Access Gateway: The MAG is the functional element that manages mobility-related signalling on behalf of MNs. It is responsible for detecting the MN's attachment to or detachment from an access network.
3. Binding Cache Entry: It provides the route information about a communicating node in the network. It can exist either within a Local Mobility Anchor or in a Mobile Access Gateway.


4. Proxy Mobility Agent: The PMA is the proxy mobility agent that resides in each of the access routers within a mobility domain. The PMA sends a proxy binding update to the LMA on behalf of the mobile node.
5. Proxy Binding Update: The PBU is a signaling message sent by a Mobile Access Gateway to the MN's Local Mobility Anchor (which acts as the home agent) for establishing or tearing down a connection for the Mobile Node. It informs the Local Mobility Anchor that the MN is now connected to or disconnected from the MAG.
6. Proxy Binding Acknowledgment (PBA): A PBA is a response message sent by an LMA in response to a PBU that it earlier received from the corresponding MAG. A success (positive) response indicates that the MAG can start transmitting data packets on behalf of the MN through the responding LMA to the MN's Correspondent Node(s).

III THE RO SCHEMES
1. Liebsch's RO Scheme: In Liebsch's route optimization, the Local Mobility Anchor and the Mobile Access Gateways exchange RO messages to establish the RO path for the Mobile Nodes. It is the LMA which makes packet delivery possible between MN1 and MN2. When MN1 sends packets to MN2, the Local Mobility Anchor raises the RO trigger for data packets sent from MN1 to MN2, because the LMA has all network topology information in the LMD. At the beginning of the RO procedure, the LMA sends the RO Init message to MAG2. MAG2 then sends the RO Init Acknowledgement to the LMA. The LMA next sends the RO Setup message to MAG1, and MAG1 sends the RO Setup Acknowledgment back to the LMA. Once the LMA has sent and received the same messages for MAG2, the RO procedure is finished. Data packets are then delivered directly between MN1 and MN2 as a result of the RO.

2. Light Weight Route Optimization (LWRO) Scheme: In the LWRO scheme, the Local Mobility Anchor and the Mobile Access Gateways are used to establish the route optimization path between the Mobile Nodes. MN1 is connected to MAG1, and MN2 is connected to MAG2. Packets from MN1 to MN2 initially pass through the Local Mobility Anchor. When the Local Mobility Anchor receives a packet, it knows the path for the packets to MAG2, and at the same time it also sends a Correspondent Binding Update to MAG2. MAG1 receives the Correspondent Binding Acknowledgment, after which packets are sent from MAG2 onward to MN2. Thus packets from MN1 destined to MN2 are intercepted by MAG1 and forwarded directly to MAG2, instead of being forwarded to the Local Mobility Anchor.


3. Enhanced Light Weight Route Optimization (ELWRO) Scheme: In the ELWRO scheme, the Local Mobility Anchor and the Mobile Access Gateways are likewise used to establish the route optimization path between the Mobile Nodes. In ELWRO, Correspondent Binding Information (CBI) messages are used. Suppose MN1 sends data packets to MN2. First, MN1 sends the data packets to MAG1, and then MAG1 sends the data packets to the Local Mobility Anchor. The LMA knows that an RO path can be set up, so it sends a Correspondent Binding Information (CBI) message to MAG1. The CBI message includes MN1's address, MN2's address, and MAG2's address. When MAG1 receives the CBI message, it sends a Correspondent Binding Update message to MAG2. The Correspondent Binding Update message includes MN1's address, MN2's address, and MAG1's address. MAG2 then sends a Correspondent Binding Acknowledgment (CBA) message to MAG1 to complete the Correspondent Binding (CB). The packets are then exchanged directly between MN1 and MN2.
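The ELWRO signaling sequence described above can be sketched as follows. The field names and the function are illustrative assumptions for exposition only, not taken from a real PMIPv6 implementation or from the NS-2 code used in this paper.

```python
def elwro_setup(mn1_addr, mn2_addr, mag1_addr, mag2_addr):
    """Sketch of the three ELWRO control messages that establish the
    optimized MAG1 <-> MAG2 path, so that subsequent data traffic
    bypasses the LMA."""
    # LMA -> MAG1: Correspondent Binding Information (CBI),
    # telling MAG1 where MN2's gateway (MAG2) is.
    cbi = {"type": "CBI", "mn1": mn1_addr, "mn2": mn2_addr, "mag2": mag2_addr}
    # MAG1 -> MAG2: Correspondent Binding Update (CBU),
    # telling MAG2 where MN1's gateway (MAG1) is.
    cbu = {"type": "CBU", "mn1": mn1_addr, "mn2": mn2_addr, "mag1": mag1_addr}
    # MAG2 -> MAG1: Correspondent Binding Acknowledgment (CBA),
    # completing the Correspondent Binding.
    cba = {"type": "CBA", "mn1": mn1_addr, "mn2": mn2_addr}
    return cbi, cbu, cba
```

After the CBA is received, both gateways hold each other's address for this MN pair, so data packets flow MAG1 -> MAG2 directly.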

IV PERFORMANCE METRICS
The performance metrics for the above three schemes are:
1) Throughput: defined as the rate of communication per unit time.
TH = SP / PT
where SP = sent packets and PT = pause time.

2) Packet Delivery Fraction: defined as the ratio of data packets delivered to the destinations to those generated by the CBR sources.
PDF = (SPD / GPCBR) * 100
where SPD = packets delivered to the destinations and GPCBR = packets generated by the CBR sources.

V PERFORMANCE RESULTS
A) Throughput: As indicated in the graph, the Enhanced Light Weight route optimization scheme performs better than Liebsch's and the Light Weight route optimization schemes. In ELWRO the rate of packet communication is higher with respect to pause time, and packets are transmitted between the CN and MN faster.
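The two metrics above can be computed directly from simulation trace counts; a minimal sketch (the function names are our own, not part of NS-2):

```python
def throughput(sent_packets, pause_time):
    """Throughput as defined in the paper: TH = SP / PT,
    packets sent per unit (pause) time."""
    return sent_packets / pause_time

def packet_delivery_fraction(delivered, generated_by_cbr):
    """PDF = (SPD / GPCBR) * 100: the percentage of CBR-generated
    packets that actually reached their destinations."""
    return delivered / generated_by_cbr * 100.0
```

In practice the counts would be parsed from the NS-2 trace file; e.g., 950 delivered out of 1000 generated gives a PDF of 95%.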


B) Packet Delivery Fraction: As indicated in the graph, the Enhanced Light Weight route optimization scheme performs better than Liebsch's and the Light Weight route optimization schemes, and packets are transmitted between the CN and MN faster.

REFERENCES
[1] D. Johnson, "Scalable Support for Transparent Mobile Host Internetworking," in Mobile Computing, edited by T. Imielinski and H. Korth.
[2] RFC 2002, IP Mobility Support.
[3] C.E. Perkins, Mobile IP: Design Principles and Practices.
[4] IETF Working Group on Mobile IP.
[5] "Effect of Triangular Routing in Mixed IPv4/IPv6 Networks."
[6] "Route Optimization Mechanisms Performance Evaluation in Proxy Mobile IPv6," 2009 Fourth International Conference on Systems and Networks Communications.
[7] J. Lee, et al., "A Comparative Signaling Cost Analysis of Hierarchical Mobile IPv6 and Proxy Mobile IPv6," IEEE PIMRC 2008, pp. 1-6, September 2008.
[8] S. Deering and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification," IETF RFC 2460, December 1998.

VI CONCLUSION
In this paper, we have described the operation of three RO schemes that solve the triangle routing problem and provided the results of a performance evaluation. The throughput and packet delivery fraction results show that the performance of the ELWRO scheme is better than that of Liebsch's route optimization scheme and the LWRO scheme.


IT-Specific SCM Practices in Indian Industries: An Investigation

Sanjay Jharkharia Associate Professor, QMOM Area Indian Institute of Management Kozhikode

Abstract- To capture the issues related to information technology and supply chain management in Indian industries, a survey of manufacturing industries was conducted. The objective of the survey includes understanding the status and practices of IT-specific supply chain management in Indian manufacturing industries. Some other relevant issues which are not exclusively in the domain of IT but greatly influence the performance of supply chains are also discussed and investigated in this survey; for example, aspects of the performance measurement of supply chains are also addressed. The survey outcome has been compared with previous surveys conducted in similar areas in global or Indian contexts. It is observed that companies are using IT in their supply chain activities, but the benefits of IT-enablement are not fully realized due to various reasons, which the authors have explored in this report.
Index Terms - Supply chain, Information technology, Survey methodology

I INTRODUCTION
Since the economic reforms which started in India in 1991, the competitive environment for Indian companies has become more complex. This has led to more focus on customer service for survival in the global market. As the market is flooded with new and innovative products, cost, quality and responsiveness have come to play a greater role in the survival of a company. Further, the increased requirement for greater responsiveness and the shortening of product life cycles create an uncertain environment. Many companies have identified supply chain management (SCM) as a way to effectively tackle these situations. Harland (1997) describes supply chain management as managing business activities and relationships: (i) internally within an organization, (ii) with immediate suppliers, (iii) with first and second-tier suppliers and customers along the supply chain, and (iv) with the entire supply chain. In the manufacturing sector, purchased goods and services represent a significant share of the value of the product. Firms in this sector depend on many suppliers and service providers. Therefore, a firm's competitive advantage no longer lies within its own boundary but also depends on the links which form its supply chain. Parlar and Weng (1997) investigated the relationship between the manufacturing and supply chain functions. Through a mathematical modeling approach, they demonstrated that the two functions should be coordinated, because the costs associated with a second round of supply and production, to meet unsatisfied demand, are much higher than those for the first production run.


There is a growing confidence that adoption of supply chain management is essential for companies to compete in the global market. There is also a perception that supply chain management helps in improving flexibility at different levels of operations. Tully (1994) found that firms are achieving volume, design, and technology flexibility through SCM. However, successful implementation of supply chain management requires a shift in the paradigms, structures, policies and behaviors in organizations. In order to monitor and facilitate this transition, it is important to develop an understanding of the existing scenario of supply chain management. In supply chains, information technology (IT) can play the role of a facilitator for information sharing. Advancements in IT can be used to share information on a real-time basis. Many companies are now deploying IT tools to integrate their supply chains and make them IT-enabled. Therefore, the issues related to the IT-enablement of supply chains are now more important and relevant. A questionnaire-based survey of Indian manufacturing companies was undertaken to address the related issues. Whether these companies are in line with global practices in supply chain and IT applications, or lag in adopting these advanced practices, is the focal issue that lies at the core of this research.

II INDIAN MANUFACTURING INDUSTRIES AND SUPPLY CHAINS
According to the Global Competitiveness Report (2009-10), India ranks 49th out of 133 countries in the global competitiveness index for the year 2009-10. This is an improvement from its rank of 57th in the year 2001-02. The competitiveness of a country's manufacturing sector has a significant role in these rankings. Manufacturing in India is believed to be suffering from a neutrality syndrome (Korgaonker, 2000), which means little strategic emphasis and an overwhelming focus on decisions such as capacity planning, make or buy, etc. The stability of production is a major organizational goal for Indian manufacturing industry. Information processing is still very much fragmented, even in computerized application areas. The decision-making process in the companies is still based on traditional information processing, which is time consuming and may yield insufficient or unreliable information. Departments and companies are internally managed according to their own goals rather than the goals of the whole organization or the supply chain (Sahay et al., 1997). Under such situations, organizations may not deliver superior value to their customers. In a developing country like India, where the market is diverse and fragmented, supply chain efficiency can bring remarkable benefits to organizations (Kanungo et al., 1999). Supply chains of Indian manufacturing industries are characterized by weak infrastructure outside the organization (Kadambi, 2000) and a lack of supply chain policy (Sahay et al., 2001). However, many manufacturing companies are now in the process of integrating their supply chains to stay competitive in the market.


III ROLE OF IT IN SUPPLY CHAIN MANAGEMENT

As a business grows, the number of products and the geographical spread of the market rise, and the role of information sharing becomes critical for managing the business. Information technology that facilitates information sharing also contributes to the reduction of lead times and shipment frequency by reducing the time and cost of processing orders. The key findings of the KPMG 1997 global supply chain survey identified IT as a major enabler of SCM (Freeman, 1998). Information technology plays its role both internally and externally within the supply chain. It assists not only in the identification and fulfilment of customer needs but also enhances cross-functionality within the organization. This significantly reduces the cycle time of the planning process. Developments in information technology have made it possible for information to be available to supply chain partners on a real-time basis. By integrating operations outside the organization, IT increases the accuracy of sales forecasts and helps manage inventories effectively. The reduction in inventory is possible because IT systems enable a company to make decisions based on real-time information rather than on guesses (Kwan, 1999). For effective utilization of the latest developments in IT, all supply chain partners must have some minimum essential IT infrastructure. However, despite substantial investment in IT by companies, the relation between IT investment and increased performance has been extremely elusive. The anticipated benefits of technology investment include reduced costs, improved quality, increased flexibility, improved customer satisfaction, higher productivity and, ultimately, higher financial performance. However, in some cases traditional performance measures such as return on investment and return on sales may not necessarily increase with IT usage, since the IT investment may simply be the cost of remaining competitive in the industry (Byrd and Marshall, 1997). How IT is deployed and maintained in a supply chain is a crucial issue (Scala and McGrath, 1993) and depends on many factors, such as the maturity and compatibility of the IT tools that the supply chain partners use, the costs involved, strategic alliances among the partners, the competitiveness of the supply chain, and the level of integration needed.

IV REVIEW OF SURVEY PAPERS AND PROJECT OBJECTIVES

Many researchers have conducted surveys in the area of supply chain management, and some have addressed the specific needs of manufacturing industries. From the literature review and previous empirical studies, it is observed that many researchers have investigated the use of SCM practices in organizations. These practices improve the overall deliverables of organizations and supply chains. However, the extent to which industries have really embraced these practices still needs to be examined. Some issues are not discussed in the literature, and in a majority of cases considerable time has elapsed since the earlier studies. Therefore, there is a need to take a fresh look at these issues, mainly in the Indian context. Are Indian manufacturing companies aware of, and implementing, the latest IT tools and supply chain management practices today? Addressing these basic questions is at the core of this research project. In the context of Indian supply chains, a survey was conducted by Kadambi (2000), but it addressed only a few supply chain issues and was based on only 32 responses. Another survey on SCM in India was conducted by Sahay et al. (2001); it was not comprehensive and was carried out about eight years before the present study. Hence, the author was motivated to conduct a survey that not only assesses the status of supply chain management in India but also addresses issues not discussed in past surveys. This study covered the following issues:
a) Supply chain strategy of the organizations
b) Application of IT tools in the organizations and their effectiveness
c) Adoption of supply chain practices by the organizations
d) Opinion of the organizations on certain supply chain issues, such as the benefits and flexibility of an IT-enabled supply chain
e) Issues relevant to supply chain performance measurement and opinion of the organizations on the indicators for supply chain performance

V RESEARCH METHODOLOGY

To address the supply chain issues in Indian manufacturing companies, a questionnaire-based survey was undertaken. The questionnaire was designed keeping in view the available literature and previous surveys. Practicing managers and academicians in the area of supply chain management were also consulted in developing the questionnaire. It was designed on a five-point Likert scale. As the response rate of such surveys is generally low and respondents are reluctant to spare time in filling them, the questions were set close-ended, requiring less time and effort to answer. Further, the Indian experience of mailed/postal surveys using random samples drawn from an industrial database has not been encouraging. Therefore, to obtain a high response rate, convenience-randomized sampling was preferred in deriving the company database. Accordingly, the targeted respondents of the questionnaire included:
(i) Working executives participating in IIMK's Executive Education Programmes,
(ii) Executives from the manufacturing industry in the National Capital Region, Delhi, and
(iii) Other executives from the manufacturing sector who were easy to contact.
The questionnaire was served in hard copy to those respondents who were approached personally (face to face) by the author, and in soft copy for the web-based survey. A total of three hundred companies/executives operating in India were approached for their responses during May-September 2009. The results of the survey are discussed in the next section.

VI SURVEY RESPONSE AND RESPONDENTS' PROFILE

Out of 300 targeted executives, only 97 usable responses were received. These responses have been analyzed to get an overview of supply chain practices in Indian industry. For each question, Cronbach's coefficient (α) was calculated to test the reliability and internal consistency of the responses. A Cronbach's coefficient of more than 0.5 is considered adequate for such exploratory work (Nunnally, 1978). Barring two questions, which were discarded from further analysis, the values of α have been found to be more than 0.5, with an average value of 0.76. This implies a high degree of internal consistency in the responses to the questionnaire. In terms of turnover, the respondent companies were divided into three categories: (i) turnover between Rs. 25-100 Crore, (ii) turnover between Rs. 100-500 Crore, and (iii) turnover of more than Rs. 500 Crore. The breakup of the respondent companies by turnover is given in Table 1.

Table 1: Annual turnover of the respondent companies
S. No. | Annual turnover (Rs. Crore) | No. of companies
1      | 25-100                      | 16
2      | 100-500                     | 41
3      | More than 500               | 40

VII SUPPLY CHAIN STRATEGY

For companies to work effectively in a supply chain, coordinated activities and planning between the linkages of the chain are necessary (Cartwright, 2000). Supply chain management focuses on how firms utilize their suppliers' processes, technology, and capability to enhance competitive advantage. It also promotes the coordination of manufacturing, logistics, and materials management functions within the organization (Lee and Billington, 1992). In the present survey, respondents were asked about their policy on supply chain collaboration with trading partners. It was found that 54.4% of the respondent companies are strong believers in collaboration and are actively extending their supply chains. However, 30% of the respondents believe in collaboration but use a go-slow strategy, and fifteen percent are interested in collaboration but have other priorities before entering into any such collaboration. By contrast, Sahay et al. (2001) observed that about one-third of companies had no supply chain policies. Although the respondents differ in the two cases, the comparison of these two results indicates a growing awareness of supply chain collaboration in Indian companies. Though the companies appear enthusiastic about collaboration in their supply chains, these collaborations seem to be more on a one-to-one basis, as the companies do not hold regular joint meetings with all partners of the supply chain. When asked about the frequency of such joint meetings on a 1-to-5 Likert scale (one indicating "never" and five "most often"), the mean value of the responses was only 2.77, and forty-six percent of the respondents had never (or rarely) attended any such meeting. This trend is quite similar to that reported by Brabler (2001) in his survey of German companies. It is further observed in the present survey that only 38% of the companies have a separate supply chain department, and in 15% of the companies it is headed by the CEO. A large number of respondents, with a mean score of 3.81 on a 5-point scale, observed that business process re-engineering (BPR) is a prerequisite to supply chain integration. This is in tune with McMullan's (1996) survey, where 88% of the respondents considered it a necessity in supply chain management. There is a moderate level of agreement (3.42) with the statement that relevant information of one department in a supply chain organization is available online to the others. Here, the values in brackets indicate the mean value of the responses on the five-point Likert scale. It has been recommended in the literature that incentives should be provided to the small partners in the supply chain for information sharing (Munson et al., 2000), but this is not a common practice in Indian supply chains: the mean agreement level with the statement is quite poor, at only 1.70. When asked about the weightage of certain factors in formulating an IT-enabled supply chain strategy, cost-benefits analysis (4.23) received the maximum weightage among the respondents (Figure 1). Respondents are also of the opinion that upcoming technological advancements (3.84), trading partners' IT infrastructure and willingness (3.74), and logistics-related factors (3.73) should be given more weightage in formulating the strategy for an IT-enabled supply chain.

Figure 1: Factors influencing formulation of IT-enabled supply chain strategy (factors rated included cost-benefits analysis, trading partners' IT infrastructure and willingness, human factors, availability of trained manpower, financial constraints and government regulations; mean and standard deviation shown)

A practical approach to supply chain management is to keep only strategically important suppliers in the value chain and reduce the number of suppliers. This strategy strengthens buyer-supplier relations (Tan et al., 1999). The other benefits of a reduced supplier base are lower price, lower administration cost and improved communications (Szwejczewski et al., 2001). However, the survey found that there are multiple suppliers for one component or a finished product at the manufacturer's end. When asked about the adoption of various supply chain practices in their organizations, the interaction of business and IT staff (3.27) emerged as the most frequently used practice (Figure 2). Ross et al. (1996) made a similar observation and reported a strong partnership between IT management and business management. Online tracking of inventory status (3.16) and target costing (3.00) are also used moderately. The practices rarely used are: online tracking of electronic point of sales (EPOS) (1.72), virtual customer servicing (2.03), cross docking (2.18) and activity-based costing (2.34).

Figure 2: Adoption of various supply chain practices (practices rated: interaction of business and IT staff, online tracking of inventory status, target costing, joint meetings of entities of the supply chain, stabilized product price, online order processing and billing, activity based costing (ABC), cross docking, virtual customer servicing and online tracking of electronic point of sales (EPOS); mean and standard deviation shown)
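The reliability check used in the methodology (Cronbach's α on the Likert-scale responses, with 0.5 as the acceptance threshold) can be sketched in a few lines. This is an illustrative implementation with made-up response data, not the survey's actual dataset:

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire scale.

    items: one list per question item, each holding the Likert
    responses (1-5) of the same respondents in the same order.
    alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)
    """
    k = len(items)
    item_vars = sum(statistics.pvariance(it) for it in items)
    totals = [sum(resp) for resp in zip(*items)]
    return (k / (k - 1)) * (1 - item_vars / statistics.pvariance(totals))

# Hypothetical responses of five respondents to three related items:
items = [[1, 2, 3, 4, 5],
         [2, 2, 3, 4, 4],
         [1, 3, 3, 3, 5]]
alpha = cronbach_alpha(items)
print(round(alpha, 2))  # -> 0.93, above the 0.5 threshold used in the survey
```

Perfectly duplicated items give α = 1.0; uncorrelated items drive α toward zero, which is why the two inconsistent questions in the survey fell below the 0.5 cutoff and were discarded.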


VIII INFORMATION SHARING IN THE SUPPLY CHAIN

Information sharing among partners in the supply chain leads to visibility of processes across the entire chain. It reduces uncertainty, which is the root cause of high inventory levels in supply chains. Information sharing can also cover forecasting and design data. Pandey et al. (2010), in their survey of Indian manufacturing companies, found a positive and significant correlation between various types of information sharing and competitive strength. Kaipia and Hartiala (2006) have carried out case and empirical studies on information sharing; they observed that only information that improves supply chain performance should be shared. This survey explores the types of information that supply chain partners usually share. Eight widely used domains of information sharing were identified from the literature, and respondents were asked to indicate their level of information sharing with suppliers on a 5-point Likert scale. The most widely used areas of information sharing (Figure 3) are those related to purchasing (3.67), order tracking (3.46), product development (3.30) and inventory status (3.07). The magnitude of this information sharing, shown in brackets, indicates only a moderate level; it may therefore be inferred that there is considerable scope for further collaboration. The survey results indicate some involvement of suppliers in the manufacturers' processes. However, the KPMG global supply chain survey indicates a very low level of supplier involvement in manufacturers' processes (Freeman, 1998). This difference may be attributed to the time gap between the two surveys and to the increasing awareness among manufacturers of suppliers' constructive involvement in their supply chains.

Figure 3: Types of information sharing in a supply chain (domains included purchasing, order tracking, product development, inventory status and sales forecasting; mean and standard deviation shown)

IX USE OF IT TOOLS IN SUPPLY CHAIN AUTOMATION

For the automation of the supply chain, firms may use a number of IT tools such as bar-coding, electronic data interchange (EDI), intranet, extranet, internet, websites, enterprise resource planning (ERP), and supply chain management (SCM) software. Regarding the use of the internet in SCM, Cagliano et al. (2005) found in their survey that companies use both partial adoption of the internet on a few processes and complete adoption throughout the supply chain; the former, however, is only a transition phase. Respondents were asked to indicate the use of these tools in their organizations. The survey shows that the Internet is the most widely used IT tool in supply chain automation, currently used by 100% of the companies. Eighty-nine percent of the companies have either developed their own websites or plan to do so in the next one year. Intranet and ERP software are also emerging as favourite IT tools among the companies. Sixty-nine percent of the companies have implemented ERP; fifteen percent plan to install it within the next one year, but 16% have no plan to use it in the near future. However, the penetration of extranet, bar-coding, EDI and SCM software is limited to a few companies. SCM software is the least used supply chain automation tool, used by only 16.5% of the companies; ten percent intend to use it within the next one year, but about 60% have no plan to use it in the near future. In KPMG's global supply chain survey (Freeman, 1998), nearly all companies expected a dramatic increase in the requirement of EDI and bar-coding by their suppliers and customers in the years ahead. This does not seem to hold in the Indian context. The application of bar-coding is likely to increase in the coming years, but EDI does not seem to be gaining ground in Indian companies: though EDI is used by 26% of the companies, more than 50% of the respondent companies have no plan to use it in the near future. ERP implementation has been reported in 69% of the surveyed companies, significantly higher than the 40% and 20% figures given by Sahay et al. (2001) and Saxena and Sahay (2000) respectively. Kadambi (2000) reported ERP implementation in 60% of the responding companies in India, but it may be recalled that his observation is based on only 32 responses. As far as SCM software is concerned, its implementation level is close to fifteen percent in all the Indian surveys discussed in this paper. Fodor (2000) reported in his survey that 20% of the sample had implemented ERP software and 15% had opted to install SCM software; these results are almost similar to the past Indian surveys in terms of SCM software. The difference in the level of ERP implementation in India compared to other countries may be attributed partly to the time gap between these surveys and partly to local conditions and other factors. The survey also explored the level of IT-based information sharing by the manufacturer with suppliers, customers, distributors, warehouse providers and logistics service providers in the supply chain. It is revealed that, compared to other constituents of the supply chain, suppliers share information with manufacturers through IT most frequently (2.77). However, the level of IT-based information sharing between the manufacturer and any other supply chain constituent is only moderate (Figure 4), as the maximum score on the five-point Likert scale is less than three. This indicates that the overall status of IT-based information sharing in the supply chain is still not encouraging and that there is considerable scope for improvement in this direction.

Figure 4: Level of IT-based information sharing by linkages in the supply chain (constituents included suppliers, customers, distributors, and warehouse and logistics service providers; mean and standard deviation shown)
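The "moderate level" inference drawn from the Likert means above can be reproduced mechanically. The scores below are the survey's reported means for the information-sharing domains of Figure 3; the banding thresholds are an assumption introduced for illustration, not taken from the paper:

```python
# Reported mean Likert scores (5-point scale) for information
# sharing with suppliers, taken from the survey results above.
domains = {
    "purchasing": 3.67,
    "order tracking": 3.46,
    "product development": 3.30,
    "inventory status": 3.07,
}

def band(score):
    """Rough interpretation bands for a 5-point Likert mean
    (thresholds are illustrative, not from the paper)."""
    if score >= 4.0:
        return "high"
    if score >= 3.0:
        return "moderate"
    return "low"

ranked = sorted(domains.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score} ({band(score)})")
# Every domain falls in the "moderate" band, which supports the
# inference that there is scope for further collaboration.
```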


X INVESTMENT IN IT TOOLS

Respondents were asked about the degree of investment in various IT tools that support the smooth functioning of supply chain management. The survey shows that the maximum investment has been made in ERP. Investments in local area networks (LAN) and computer hardware closely follow the investment in ERP (Figure 5). However, the companies have invested much less in SCM software, extranet and EDI. It is also observed that investment in SCM software and extranet is likely to increase in the future, but investment in EDI is not likely to increase significantly, as companies now use the Internet to send information through e-mail and attachments.

Figure 5: Investment in IT tools for supply chain automation (tools rated included ERP, LAN, office automation, bar-coding, automated storage and retrieval systems, extranet, EDI and SCM software; mean and standard deviation shown)

XI USE OF BAR-CODING

Bar-coding has the capability to track the flow of goods in a supply chain and, if implemented properly, can yield significant savings for FMCG companies. Fodor (2000) reported the results of a questionnaire survey conducted among the readers of some reputed magazines and journals: bar-coding was implemented by 52% of the sample. In India, bar-coding application by big companies has increased from almost nothing to 30%. It improves a company's demand forecasting accuracy by close to 20% (Anand, 2002). More importantly, it assists in tracking consumer buying patterns, which enables companies to price products as per market conditions, introduce new items at stock keeping units (SKUs) and keep a better watch on new launches. In the present survey, only 32% of the companies used bar-coding. Respondents observed that the maximum benefit of bar-coding is obtained in speeding up data entry, and there is almost unanimity among the respondents about this advantage, as the standard deviation for this option is the minimum among all the discussed advantages of bar-coding. The other major advantages reported are verification of orders at receiving and shipping, and updating of the stock position (Figure 6).

Figure 6: Benefits of bar-coding technology (benefits rated included speeding up data entry, verification of orders at receiving and shipping, updating of stock position, enhanced data security, improved customer service and accurate forecasting; mean and standard deviation shown)

XII BENEFITS OF IT-ENABLED SUPPLY CHAIN

The IT-enablement of the supply chain offers several advantages over the conventional supply chain, where IT is not predominantly used for communication among supply chain partners. Some of these advantages are responsiveness, reduction in manpower and increase in turnover. Bal and Gundry (1999) reported time and cost savings as the two leading advantages of virtual team working, a possible emerging application area of the IT-enabled supply chain. Closs et al. (1996) observed that IT capability improves timeliness and flexibility, which influence logistics competence. The top five benefits of IT observed by Sohal et al. (2001) are control of inventory cost, improvement of management productivity, improvement of order cycle time, improved staff productivity and improved product quality. Brabler (2001) reported that e-business can reduce lead time and improve general flexibility in the supply chain. Bhatt (2000) and Bhatt et al. (2001) are of the view that firms can use IT to enhance the quality of products and services. In the present survey, the five most important benefits of an IT-enabled supply chain, in decreasing order, are identified as responsiveness (4.32), inventory reduction (4.16), order fulfillment time reduction (4.14), better customer service (4.14) and improved relations in the supply chain (4.07). For each of these observed benefits, the coefficient of variation (CV), defined as the ratio of the standard deviation (σ) to the mean (μ), was also calculated. The CV values are: responsiveness (17.2%), inventory reduction (22.5%), order fulfillment time reduction (18.8%), better customer service (21.6%) and improved relations in the supply chain (21.6%). A lower CV indicates a greater degree of convergence among the respondents on that parameter. The low CV observed for responsiveness and order fulfillment time reduction further substantiates the finding that these are the undisputed benefits of IT-enablement of supply chains (Figure 7). The present survey endorses the view that improved customer service can be achieved through IT-enablement, as there is a possibility of significant reduction in lead times. It is also observed that, in the case of manufacturing companies, IT-enablement of the supply chain does not have much impact on product quality. Fawcett et al. (2008) have also reported the strategic benefits of SCM in their survey paper; of the benefits reported there, increased inventory turnover, increased revenue, and cost reduction across the supply chain are the most sought after.

Figure 7: Benefits of IT-enabled supply chain (benefits rated also included low working capital, edge over new entrants, accurate forecasting, reduced unit cost, reduction in manpower, increase in turnover and product quality; mean and standard deviation shown)
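The coefficient of variation used above (CV = σ/μ) can be computed directly from the reported figures. The pairs below are the survey's reported means and CVs for the top benefits; the standard deviations are back-calculated for illustration:

```python
# (mean, CV%) pairs reported in the survey for the top benefits.
benefits = {
    "responsiveness": (4.32, 17.2),
    "inventory reduction": (4.16, 22.5),
    "order fulfillment time reduction": (4.14, 18.8),
    "better customer service": (4.14, 21.6),
    "improved supply chain relations": (4.07, 21.6),
}

def implied_sd(mean, cv_percent):
    """Back out the standard deviation from CV = sigma/mu."""
    return mean * cv_percent / 100.0

# The lowest CV marks the greatest respondent consensus.
most_agreed = min(benefits, key=lambda b: benefits[b][1])
print(most_agreed)                                   # responsiveness
print(round(implied_sd(*benefits[most_agreed]), 2))  # -> 0.74
```

Note that two benefits can share a mean (4.14 here) while their CVs differ: the CV ranks consensus, not importance.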


XIII USE OF INFORMATION TECHNOLOGY IN ORGANIZATIONAL ACTIVITIES

Feedback was also taken from the respondents about the use of IT in various activities of the organization. The maximum use of IT is observed in the area of accounts and finance, reasonably ahead of other application areas such as purchasing, sales and service, logistics operations, and manufacturing scheduling. The findings of KPMG's Global Supply Chain Survey (Freeman, 1998) and the Indian survey by Saxena and Sahay (2000) similarly reported that IT systems are better integrated in the accounts and finance area than elsewhere.

Purchasing activities have an important role in supply chain management, and the survey explored the IT applications in purchasing. Ninety-one percent of the respondent companies have provided personal computers (PCs) or mainframe terminals for their purchasing staff, ninety percent have provided e-mail facilities, and 75% have provided Internet access to their purchasing staff. Fifty-three percent of the companies have a purchase performance evaluation system and 78% have their own vendor rating system. Thirty-one percent of the companies practice the automatic release of purchase orders based on inventory level. However, despite this high level of IT penetration in the organizations, only twenty-one percent of the companies follow the practice of online real-time supplier information tracking, and only nineteen percent have installed software for the automatic handling of online queries. This may be attributed mainly to the disparity in trading partners' IT capability.

McCormack and Kasper (2002) reported a significant relationship between Internet usage and supply chain performance. However, the present survey suggests that the use of the Internet in Indian supply chains is mainly confined to communication through e-mail. Other supply chain functions such as inventory tracking, purchasing and collaborative information sharing, and other Internet applications such as e-business, online ordering and order confirmation, online quotation, and tracking of electronic point of sales (EPOS), are not widely used by the respondent companies. Respondents were also asked to rank the problems in integrating their supply chains with the Internet: the threat to data security (3.23), insufficient bandwidth (2.93) and lack of trained manpower (2.90) emerged as the main problems. However, these are likely to phase out with time, as technological advancements in Internet security and bandwidth have been quite rapid in recent years. Moreover, the mean values of the responses are around three, which represents only a moderate level of barrier.

XIV BULLWHIP EFFECT AND ITS CAUSES

The amplification of demand variability in the upstream of supply chains is a common phenomenon, more visible in the consumer goods sector; it is known as the bullwhip effect. When asked about the reasons for the bullwhip effect, the respondents observed that long lead time of material acquisition (3.34), lack of real-time information availability at the vendor's end (3.31), price fluctuations in the market (3.15) and forecasting errors (3.13) are its root causes (Figure 8). Batch ordering, based on demand consolidation, is considered the least important of these causes. Earlier, Lee et al. (1997) identified four major causes of the bullwhip effect: (i) demand forecast updating, (ii) order batching, (iii) price fluctuation and (iv) rationing and shortage gaming. An analysis of the results of these two studies also provides insight into the mechanism to counter it. The present survey identifies long lead time of material acquisition as the most important cause; the top two reasons are almost equally important, and both are related to information sharing. Using online real-time information sharing in the supply chain, the long lead time of material acquisition can certainly be reduced to a large extent, and IT-based information sharing can play an important role in reducing forecasting errors as well. However, information sharing is possible only when there is good trust and integration in the supply chain. The authors therefore suggest that, to counter the bullwhip effect, two measures should be given importance: (a) integrate the supply chain and promote trust among the linkages for information sharing, and (b) use the latest IT tools for online information sharing so that there is no confusion about demand at the various levels of the supply chain.

Figure 8: Reasons for the bullwhip effect (causes rated included long lead time of material acquisition, lack of real-time information at the vendor's end, price fluctuations, forecasting errors and batch ordering; mean and standard deviation shown)

XV CRITICAL SUPPLY CHAIN ISSUES

In a study by McMullan (1996) conducted in the Asia-Pacific region, respondents identified issues such as information technology, inventory and infrastructure, both internal and external, as the key supply chain management issues. Fantazy et al. (2009), in a survey, found flexibility among the top eight factors necessary for successful implementation of an SCM initiative; this observation has been validated in the present survey. The role of top management in the success of supply chains was investigated in detail by Sandberg and Abrahamsson (2010), who observed that, despite its often-stated importance, we know little about it. In the present survey, respondents were asked to assign weightage to certain supply chain management issues for judging their effectiveness. The issues considered important for the functioning of supply chains, in decreasing order of importance, are (Figure 9): commitment of top management (4.63), buyer-supplier relations (4.30), IT and decision support systems (4.29), customer focus (4.20), information sharing at all levels of the chain (4.17), motivation and commitment of the personnel (4.08) and flexibility of the supply chain (4.06).

Figure 9: Criticality of various supply chain issues (issues rated also included cross-functional integration, a supplier being part of a competitor's chain, the bullwhip effect and dictatorial attitude of major stakeholders; mean and standard deviation shown)
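The demand-variability amplification discussed in Section XIV can be illustrated with a minimal simulation of a single stage that forecasts demand with a moving average and follows an order-up-to policy, the standard setting in which Lee et al.'s "demand forecast updating" cause operates. All parameters here are hypothetical, chosen only to make the amplification visible:

```python
import random
import statistics

def order_stream(demand, lead_time=2, window=4):
    """Orders placed by a stage that forecasts demand with a moving
    average and keeps its inventory position at
    forecast * (lead_time + 1) -- a simple order-up-to policy."""
    orders, prev_target = [], None
    for t in range(window, len(demand)):
        forecast = sum(demand[t - window:t]) / window
        target = forecast * (lead_time + 1)
        if prev_target is not None:
            # order = demand seen + adjustment of the order-up-to level
            orders.append(max(0.0, target - prev_target + demand[t]))
        prev_target = target
    return orders

random.seed(42)
demand = [random.gauss(100, 10) for _ in range(1000)]  # i.i.d. end demand
orders = order_stream(demand)
amplification = statistics.pvariance(orders) / statistics.pvariance(demand)
print(round(amplification, 1))  # > 1: order variance exceeds demand variance
```

Each forecast update shifts the order-up-to level, so orders swing more than demand does; the longer the lead time relative to the forecast window, the stronger the amplification, which is why the survey's countermeasures center on real-time information sharing.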


Bryson and Currie (1995) observed in their survey that IT is considered strategically important, but very few organizations ascribed critical importance to it. The present survey, however, indicates that companies have now realized the importance of information technology in the success of supply chains. Some other issues, such as a supplier being part of more than one supply chain and the dictatorial attitude of the major stakeholders of the chain, are also discussed in the literature (Munson et al., 2000), but none of these issues were considered important by the respondents.

XVI SUPPLY CHAIN PERFORMANCE MEASUREMENT

Performance measurement provides the means by which a company can assess whether its supply chain has improved or degraded over a period of time. In the present survey, respondents were asked about the relevance of performance measurement of a supply chain, and it is observed that supply chain performance measurement has a motivational effect on performance improvement. Keebler (2001) also observed that the impact of good or bad performance of any partner of the supply chain on the performance of the entire supply chain is inevitable. Regarding supply chain performance measurement, Cassivi (2006) has noted that identifying the operational performance measures of a supply chain requires a good understanding of the most important research initiatives in logistics, manufacturing, and operations activities. In India, Saad and Patel (2006) conducted a study on automotive sector companies and found that supply chain performance measurement is not fully embraced by the Indian auto sector; they also highlighted the difficulties associated with performance measurement. It is observed in this survey that supply chain performance measurement is a continuous process in 28% of the organizations; it is carried out 3-4 times a year in 16% and once a year in 19%, and it is not regularly measured in about 19% of the organizations. However, eighteen percent of the respondents did not say anything about the frequency of performance measurement of a supply chain in their organization.

The Balanced Scorecard (Kaplan and Norton, 1992) provides an excellent background for the performance measurement of a supply chain. The balanced scorecard approach can be used to classify the performance measures of a supply chain into the following four categories:
(i) Financial perspectives
(ii) Customer service perspectives
(iii) Internal business measures
(iv) Innovation and other measures

XVII PERFORMANCE MEASUREMENT INDICATORS

Financial results are the major criteria in determining how supply chains are performing over a period of time, yet they are not the complete drivers of success. Moreover, operations managers cannot wait long for the financial results of a quarter or a month to become available. In a study by McMullan (1996), the most commonly used performance measures of a supply chain in the customer service category were on-time delivery, customer complaints, back orders, stock-outs, etc. Manrodt (2001) identified the most frequently used logistics performance measures, which in decreasing order of usage are outbound freight costs, inventory count accuracy, order fill rate, on-time delivery, and customer complaints; the least used measures, in increasing order of usage, are enquiry response time, cash-to-cash time, units processed per unit time and cost to service.

Inappropriate performance measures often lead managers to respond to situations incorrectly and to continue supporting undesirable behavior. Therefore, it is desirable to identify the most relevant performance measures as felt by the supply chain managers of various organizations. In the present survey, performance indicators were classified into four categories based on the balanced scorecard (Kaplan and Norton, 1992). A literature review was conducted to identify and shortlist the performance indicators for this purpose (McMullan, 1996; Manrodt, 2001; Beamon, 1999; Johnson and Davis, 1998; Keebler, 2001; Lapide, 2001; Mooraj, 1999; Neely et al., 1995, 2000; Pires et al., 2001). The opinion of supply chain experts from industry was also sought in deciding the appropriate measures. Respondents were asked to identify, from the list of given indicators, the most important performance indicators of a supply chain relevant to their organization. The 15 most important performance indicators, as identified from the survey, are shown in decreasing order of mean value in Table 2.

Table 2: Main indicators for supply chain performance measurement

S.N.  Performance indicator                  Mean   Std. dev.  CV
1     On-time delivery                       4.74   0.56       11.8%
2     Responsiveness                         4.44   0.67       15.1%
3     Order fill rate                        4.43   0.61       13.8%
4     Inventory turnover ratio               4.35   0.89       20.4%
5     Ease in tracking of customer orders    4.26   0.94       22.0%
6     Return on investment                   4.21   0.77       18.3%
7     Total supply chain inventory control   4.15   0.86       20.7%
8     Plant productivity                     4.13   1.05       25.4%
9     Just-in-time environment               4.11   1.07       26.0%
10    Economic value added                   4.10   0.97       23.6%
11    Reduced wastes                         4.08   0.90       22.0%
12    Retention of old customers             4.08   1.12       27.4%
13    Cost per unit of product               4.08   1.08       26.4%
14    Better product quality                 --     --         --
15    Reduced throughput time                --     --         --





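The coefficient of variation column in Table 2 is simply the standard deviation expressed as a percentage of the mean. As a quick sanity check on the top three rows (a throwaway snippet, not part of the survey methodology):

```python
# Coefficient of variation (CV) = standard deviation / mean, as a percentage.
# Mean and standard deviation values taken from Table 2.
indicators = {
    "On-time delivery": (4.74, 0.56),
    "Responsiveness":   (4.44, 0.67),
    "Order fill rate":  (4.43, 0.61),
}

for name, (mean, sd) in indicators.items():
    cv = 100.0 * sd / mean
    print(f"{name}: CV = {cv:.1f}%")
```

The computed values (11.8%, 15.1%, 13.8%) agree with the CV column, which is how the dispersion figures scattered in the source were matched back to their rows.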
In the overall ranking of performance indicators, on-time delivery (4.74), responsiveness (4.44) and order fill rate (4.43) emerge as the three most important indicators for the performance evaluation of a supply chain. These indicators have CV values of 11.8%, 15.1% and 13.8% respectively, the lowest among all the performance indicators discussed; it may therefore be inferred that these indicators are consistently accepted in industry and can be used as important parameters to measure the performance of supply chains. Among the top fifteen indicators, five each belong to customer service and internal business measures, three belong to financial measures, and two belong to innovation and other measures. These findings indicate that business managers accord a very high priority to customer service and internal business measures. This result is also justified because it is the customer who is the ultimate evaluator of a supply chain, by purchasing the products it delivers, and customer satisfaction should therefore figure in the performance of the supply chain.

XVIII CONCLUSION

The status of supply chain management in Indian manufacturing companies has been explored

through a questionnaire-based survey. The findings indicate that Indian companies are moving steadily to adopt supply chain practices, and that these are in line with practices elsewhere. The IT-enablement of supply chains is another issue examined in the paper, and the benefits observed due to IT-enablement are discussed in the report. Supply chain managers have to decide which IT tools offer the greatest strategic value to their supply chain. The financial impact of IT on the supply chain can be quantified only in certain areas, such as inventories, working capital and costs of communication, but its intangible impact on goodwill and on the responsiveness of the company is far greater. As more companies emphasize responsiveness, information technology is going to become increasingly important in supply chain management in the days to come. It is observed that firms have upgraded their internal capabilities in terms of computer hardware, internet, intranet, extranet, ERP and SCM software, but they have been less successful in utilizing these capabilities for external coordination, be it in the purchase process, design data sharing or inventory control. These figures indicate that though companies have developed individual IT capability to a large extent, integration and information sharing in the supply chain are still much lower than desired. The observation of Closs et al. (1996) is also valid in the Indian context: companies have developed their internal capabilities, but substantial improvement is needed to make supply chain integration a reality.

XIX REFERENCES

1. Anand, M. (2002), "Operation streamline", Business World, 18 February, pp. 20-26, New Delhi.
2. Bal, J. and Gundry, J. (1999), "Virtual teaming in the automotive supply chain", Team Performance Management: An International Journal, Vol. 5 No. 6, pp. 174-193.
3. Beamon, B.M. (1999), "Measuring supply chain performance", International Journal of Operations and Production Management, Vol. 19 No. 3, pp. 275-292.
4. Bhatt, G.D. (2000), "An empirical examination of the effects of information systems integration on business process improvement", International Journal of Operations and Production Management, Vol. 20 No. 11, pp. 1331-1359.
5. Bhatt, G.D. and Stump, R.L. (2001), "An empirically derived model of the role of IS networks in business process improvement initiatives", Omega: International Journal of Management Science, Vol. 29, pp. 29-48.
6. Brabler, A. (2001), "E-supply chain management: results of an empirical study", Proceedings of the Twelfth Annual Conference of the Production and Operations Management Society, POM-2001, March 30-April 2, 2001, Orlando, FL.
7. Bryson, C. and Currie, W. (1995), "IT strategy: formal rational orthodoxy or contingent adhocracy", Omega: International Journal of Management Science, Vol. 23 No. 6, pp. 677-689.
8. Byrd, T.A. and Marshall, T.E. (1997), "Relating information technology investment to organizational performance: a causal model analysis", Omega: International Journal of Management Science, Vol. 25 No. 1, pp. 43-56.
9. Cagliano, R., Caniato, F. and Spina, G. (2005), "E-business strategy: how companies are shaping their supply chain through the Internet", International Journal of Operations and Production Management, Vol. 25 No. 12, pp. 1309-1327.
10. Cartwright, S.D. (2000), "Supply chain interdiction and corporate warfare", IEEE Engineering Management Review, third quarter, pp. 30-35.
11. Cassivi, L. (2006), "Collaboration planning in a supply chain", Supply Chain Management: An International Journal, Vol. 11 No. 3, pp. 249-258.
12. Closs, J.C., Goldsby, T.J. and Clinton, S.R. (1996), "Information technology influences on world class logistics capability", International Journal of Physical Distribution and Logistics Management, Vol. 27 No. 1, pp. 4-17.
13. Fantazy, K.A., Kumar, V. and Kumar, U. (2009), "An empirical study of the relationships among strategy, flexibility, and performance in the supply chain context", Supply Chain Management: An International Journal, Vol. 14 No. 3, pp. 177-188.
14. Fawcett, S.E., Magnan, G.M. and McCarter, M.W. (2008), "Benefits, barriers and bridges to effective supply chain management", Supply Chain Management: An International Journal, Vol. 13 No. 1, pp. 35-48.
15. Fodor, G. (2000), "Room to grow", .htm
16. Freeman, B. (1998), "Highlights of KPMG's global supply chain survey", _figures.htm
17. Harland, C. (1997), "Supply chain operational performance roles", Integrated Manufacturing Systems, Vol. 8 No. 2, pp. 70-78.
18. Johnson, M. and Davis, T. (1998), "Supply chain performance by using order fulfillment metrics", National Productivity Review, Summer, pp. 3-16.
19. Kadambi, B. (2000), "IT-enabled supply chain management: a preliminary study of few manufacturing companies in India".
20. Kaipia, R. and Hartiala, H. (2006), "Information sharing in supply chains: five proposals on how to proceed", The International Journal of Logistics Management, Vol. 17 No. 3, pp. 377-393.
21. Kanungo, S., Sharma, S., Bhatia, K. and Babu, S. (1999), "Toward a model for relating supply chain management and use of IT: an empirical study", in Sahay, B.S. (Ed.), Supply Chain Management for Global Competitiveness, Macmillan India Limited, New Delhi.
22. Kaplan, R.S. and Norton, D.P. (1992), "The balanced scorecard: measures that drive performance", Harvard Business Review, January-February, pp. 71-80.
23. Keebler, J.S. (2001), "Measuring performance in the supply chain", in Mentzer, J.T. (Ed.), Supply Chain Management, Response Books, New Delhi.
24. Korgaonker, M.G. (2000), "Competitiveness of Indian manufacturing enterprises", Manufacturing Magazine, December, pp. 26-37.
25. Kwan, A.T.W. (1999), "The use of information technology to enhance supply chain management in the electronics and chemical industries", Production and Inventory Management Journal, third quarter, pp. 7-15.
26. Lapide, L. (2001), "What about measuring supply chain performance?"
27. Lee, H. and Billington, C. (1992), "Managing supply chain inventory: pitfalls and opportunities", Sloan Management Review, Spring, pp. 65-73.
28. Lee, H.L., Padmanabhan, V. and Whang, S. (1997), "Information distortion in a supply chain: the bullwhip effect", Management Science, Vol. 43, pp. 546-558.
29. Manrodt, K.B. (2001), "The state of supply chain measurement", pdf
30. McCormack, K. and Kasper, K. (2002), "The extended supply chain: a statistical study", Benchmarking: An International Journal, Vol. 9 No. 2, pp. 133-145.
31. McMullan, A. (1996), "Supply chain management practices in Asia Pacific today", International Journal of Physical Distribution and Logistics Management, Vol. 26 No. 10, pp. 79-95.
32. Mooraj, S. (1999), "The balanced scorecard: a necessary good or an unnecessary evil?", European Management Journal, Vol. 17 No. 5, pp. 481-491.
33. Munson, C.L., Rosenblatt, M.J. and Rosenblatt, Z. (2000), "The use and abuse of power in supply chains", IEEE Engineering Management Review, second quarter, pp. 81-91.
34. Neely, A., Bourne, M. and Kennerley, M. (2000), "Performance measurement system design: developing and testing a process-based approach", International Journal of Operations and Production Management, Vol. 20 No. 10, pp. 1119-1145.
35. Neely, A., Gregory, M. and Platts, K. (1995), "Performance measurement system design: a literature review and research agenda", International Journal of Operations and Production Management, Vol. 15 No. 4, pp. 80-116.
36. Nunally, J.C. (1978), Psychometric Methods, McGraw-Hill, New York.
37. Pandey, V.C., Garg, S. and Shankar, R. (2010), "Impact of information sharing on competitive strength of Indian manufacturing enterprises: an empirical study", Business Process Management Journal, Vol. 16 No. 2 (in press).
38. Parlar, M. and Weng, Z.K. (1997), "Designing a firm's coordinated manufacturing and supply decisions with short product life cycles", Management Science, Vol. 43 No. 10, pp. 1329-1344.
39. Pires, S.R.I. (2001), "Measuring supply chain performance", Proceedings of the Twelfth Annual Conference of the Production and Operations Management Society, POM-2001, March 30-April 2, 2001, Orlando, FL.
40. Ross, J.W., Beath, C.W. and Goodhue, D.L. (1996), "Develop long-term competitiveness through IT assets", Sloan Management Review, Vol. 38 No. 2, pp. 31-42.
41. Saad, M. and Patel, B. (2006), "An investigation of supply chain performance measurement in the Indian automotive sector", Benchmarking: An International Journal, Vol. 13 No. 1/2, pp. 36-53.
42. Sahay, B.S., Cavale, V., Mohan, R., Rajini, R. and Gupta, P. (2001), "The Indian supply chain architecture", Industry, September, pp. 19-32, New Delhi.
43. Sahay, B.S., Saxena, K.B.C. and Kumar, A. (1997), "Information technology exploitation for world-class manufacturing: the Indian scenario", research report, Centre for Excellence in Information Management, Management Development Institute, Gurgaon, India.
44. Sandberg, E. and Abrahamsson, M. (2010), "The role of top management in supply chain management practices", International Journal of Retail and Distribution Management, Vol. 38 No. 1, pp. 57-69.
45. Saxena, K.B.C. and Sahay, B.S. (2000), "Managing IT for world-class manufacturing: the Indian scenario", International Journal of Information Management, Vol. 20, pp. 29-57.
46. Scala, S. and McGrath, R. (1993), "Advantages and disadvantages of electronic data interchange", Information and Management, Vol. 25, pp. 85-91.
47. Sohal, A.S., Moss, S. and Ng, L. (2001), "Comparing IT success in manufacturing and service industries", International Journal of Operations and Production Management, Vol. 21 No. 1/2, pp. 30-45.
48. Szwejczewski, M., Goffin, K. and Lemke, F. (2001), "Supplier management in German manufacturing companies: an empirical investigation", International Journal of Physical Distribution and Logistics Management, Vol. 31 No. 5, pp. 354-373.
49. Tan, K.C., Kannan, V.R., Handfield, R.B. and Ghosh, S. (1999), "Supply chain management: an empirical study of its impact on performance", International Journal of Operations and Production Management, Vol. 19 No. 10, pp. 1034-1052.
50. Tully, S. (1994), "You'll never guess who really makes what", Fortune, October, pp. 124-128.














Cloud Computing
A Study of Utility Computing and Software as a Service (SaaS)

Parveen Sharma, Manav Bharti University, Solan, Himachal Pradesh

Guided by: Dr. M.K. Sharma, Associate Professor & Head, MCA Program, Department of Computer Science, Amrapali Institute, Haldwani (Uttarakhand)
Abstract: Today, cloud computing is a most attractive technology. This paper studies how the cloud has quickly taken hold as part of our everyday lives and how it is poised to become a major part of the IT strategies of all organizations. The cloud will be advantageous from a business intelligence standpoint over the isolated alternative that is more common today. The paper studies on-demand utility computing, analogous to utilities such as electricity and telephone service, and describes how the cloud model, and SaaS in particular, will ultimately change traditional models for pricing and procuring information technology; it also examines the challenges of cloud computing and argues that SaaS is a budget-smart choice. SaaS quickly evolved within the tentative field of cloud computing; it delivers a single application through the browser to thousands of customers using a multi-tenant architecture, lowering the cost and service charges of its applications. SaaS buyers are weighing the trade-offs between fast flexibility, cost savings, and reduced reliance on internal IT resources, and deciding whether SaaS makes sense for their requirements.

Index Terms: Supplier, cloud computing, client, SaaS, multi-tenancy.

I INTRODUCTION

Cloud computing means Internet ('cloud') based development and use of computer technology. The word "cloud" is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network and later to describe the Internet in computer network diagrams as an abstraction of the underlying infrastructure. Cloud computing is a phrase used today to describe the act of storing, accessing, and sharing data, applications, and computing power in cyberspace. The concepts of storing data in remote locations and using tools only when we need them present users with unprecedented opportunities and challenges. Cloud computing (CC) is a natural evolution of the widespread adoption of virtualization, service-oriented architecture and utility computing. CC describes a new supplement, consumption, and delivery model for IT services based on the Internet, and it typically involves the over-the-Internet provision of dynamically scalable and often virtualized resources. In 1960 John McCarthy said that "computation may someday be organized as a public utility." Cloud computing can be regarded as the latest wave of disruption in IT. It can best be described as a highly automated, readily scalable, on-demand computing platform of virtually unlimited processing and storage, always available to carry out a task of any size and charged based on usage. Almost all of the modern-day characteristics of CC, including the comparison of public, private, government and community deployment forms, have been thoroughly explored in the literature. The concept of clouds is not new, but they have proved a major commercial success over recent years. The primary aim of cloud computing is to provide mobile deployment of web-based applications by means of easily accessible tools. CC has three service models:
1. IaaS (Infrastructure as a Service)
2. PaaS (Platform as a Service)
3. SaaS (Software as a Service)

II HISTORY

Software as a service's acronym, SaaS, first appeared in an article called "Strategic Backgrounder: Software as a Service", published in February 2001 by the Software & Information Industry Association's eBusiness Division. Software as a service is essentially an extension of the idea of the Application Service Provider (ASP) model [1].

III WHAT IS SAAS?

SaaS, referred to as "software on demand," is software that is deployed over the Internet. This approach to application delivery is part of the utility computing model, in which all of the technology sits in the "cloud," accessed over the Internet as a service. SaaS is presently the most popular model of cloud computing service because of its high flexibility and scalability, high performance with better availability, wide range of services and low maintenance. SaaS delivers a single application through the browser to thousands of customers using a multi-tenant architecture.
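The single-instance, multi-tenant delivery described above can be illustrated with a deliberately simplified sketch; the class and method names here are invented for illustration and do not correspond to any real SaaS product. One application instance holds the data of many customers, and every read or write is partitioned by a tenant identifier:

```python
# Minimal sketch of multi-tenant data isolation in a single application
# instance: one shared store, keyed first by tenant ID (hypothetical design).
class MultiTenantStore:
    def __init__(self):
        self._data = {}  # tenant_id -> {key: value}

    def put(self, tenant_id, key, value):
        self._data.setdefault(tenant_id, {})[key] = value

    def get(self, tenant_id, key):
        # A tenant can only ever read from its own partition.
        return self._data.get(tenant_id, {}).get(key)

store = MultiTenantStore()           # one instance, many customers
store.put("acme", "invoice-1", 100)
store.put("globex", "invoice-1", 250)

print(store.get("acme", "invoice-1"))    # each tenant sees only its own data
print(store.get("globex", "invoice-1"))
```

Production SaaS platforms enforce such isolation at the database, schema or row level rather than in an in-memory dictionary; the sketch only shows the principle that the tenant ID travels with every request.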


The SaaS model offers a high-level abstraction of the distributed delivery of software. The customer requires only a computer with Internet access to reach the application, and the software can be licensed either for a single user or for a whole group of users. The SaaS dealer is responsible for providing the service and the data centres essential to run the application, which makes SaaS a key setting for quick development. The SaaS model makes it possible for every customer to take advantage of the provider's latest technological features without the burden of software maintenance, management, updates and upgrades.

IV SAAS AS A PLATFORM SERVICE

The SaaS platform service is expected to reduce the costs of developing business system applications. Its conceptual basis is a business standard that overcomes barriers globally and seamlessly connects all business processes, thus dynamically promoting effective business development. The SaaS platform service performs integrated management of application records and provides a business system with the functions required for the development and operation of SaaS applications. It has four components:
(a) Basic functions: authentication, user management and authorization control.
(b) Common components: e-mail distribution and billing data generation.
(c) Service linkage: implements linkage with other services.
(d) Development framework: covers the development methodology.
The SaaS platform service runs on the common IT platform service. A department that provides SaaS applications using the SaaS platform service is called the SaaS provider.

V BENEFITS OF SAAS

a. SaaS is defined as a method by which an Application Service Supplier (ASS) provides different applications over the Internet.
b. SaaS frees the customer from installing and operating the application on their own computer, and removes the great load of software maintenance.
c. The ability to access powerful technologies with a minimal financial commitment.
d. The ability to run the most recent version of the application.
e. SaaS helps organizations avoid capital expenditure and pay only for the functionality they use.
f. SaaS removes customer doubts about application servers, loading, application desertion and related, common operational concerns of IT.
g. The SaaS provider can improve the organization of the management of SaaS applications by executing work in accordance with the standard operation flow defined by the SaaS platform service.
h. Money is saved by not having to purchase servers or other supporting software.
i. Faster implementation.
j. Budgets are focused on competitive advantage rather than infrastructure: a monthly obligation rather than an up-front capital cost, multi-tenant efficiency, and flexibility and scalability.

VI CHARACTERISTICS OF SAAS

Applications are network based, so business users are free to use the service from anywhere. Each application is offered on a pay-per-usage basis, leveraging cloud infrastructure with a pay-as-you-go pricing structure.
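The pay-per-usage characteristic described above reduces, at its simplest, to metering: record each customer's consumption and bill a per-unit rate instead of an up-front licence fee. A minimal sketch, with an invented rate and invented usage records:

```python
# Illustrative pay-as-you-go metering: sum each tenant's usage records and
# multiply by a per-unit rate (both the records and the rate are made up).
RATE_PER_UNIT = 0.05  # currency units per usage unit (hypothetical)

usage_records = [
    ("acme", 120),   # (tenant, units consumed)
    ("globex", 40),
    ("acme", 80),
]

def monthly_bill(records, rate):
    totals = {}
    for tenant, units in records:
        totals[tenant] = totals.get(tenant, 0) + units
    return {tenant: round(units * rate, 2) for tenant, units in totals.items()}

print(monthly_bill(usage_records, RATE_PER_UNIT))  # {'acme': 10.0, 'globex': 2.0}
```

The point of the sketch is only that the provider's revenue tracks measured consumption, which is what distinguishes the utility-computing model from perpetual licensing.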

VII ADVANTAGES
- Pay per use
- Anytime, anywhere accessibility
- Pay as you go
- Instant scalability
- Security
- Reliability

VIII SAAS CLIENT

SaaS is a new delivery and flexibility model in which the provider remotely manages software applications for its customers. SaaS eliminates customer worries about application servers, storage and application development, sanctioning the business owner to plan their budget around the usage of the number of applications the business needs. It also enables every customer to benefit from the dealer's latest technological features, and it is highly efficient owing to its multi-tenant structural design. Managing complexity while reducing software costs, the model offers:
- Network-based access to, and management of, commercially available software.
- Activities managed from central locations rather than at each customer's site, enabling customers to access applications remotely via the Web.
- Application delivery typically closer to a one-to-many model (single instance, multi-tenant architecture) than to a one-to-one model, including architecture, pricing, partnering, and management characteristics.
- Centralized feature updating, which obviates the need for end-users to download patches and upgrades.
- Frequent integration into a larger network of communicating software, either as part of a mashup or as a plugin to a platform as a service.

The SaaS provider's tasks include managing servers, power and cooling; maintaining operating system software, databases and the installation of updates; using web-based applications to easily provision software for customers on demand; offering a multi-tenant application model with room for customization for each customer; reducing support costs through centrally controlled software deployment; providing the latest version of the application software to the customer; and ensuring the security and privacy of client data.


IX BENEFITS TO SAAS OPERATOR

The operator owns the development platforms and hardware and retains a high degree of control over maintenance. Software is subscribed to for a yearly or monthly fee. The model brings improved reliability, performance and efficiency, enhanced productivity and faster deployment. Users can access the on-demand application anywhere, anytime, and do not have to purchase and support the infrastructure that the application runs upon.

X CONCLUSION

By integrating all of the application software, data centre, database, IT infrastructure and services together in a web-based, multi-tenant, on-demand delivery model, SaaS dealers can provide customers with economies of scale and talent that were among the biggest challenges for traditional, on-premise deployments. SaaS shifts the duty of deployment, operation, management, support and successful operation of the application from the customer to the vendor. The aim of cloud computing is to apply vast computational power and storage capacity to solve problems. Resources in cloud computing are not confined to data storage and servers; they can also be complete distributed systems, especially clusters.

XI REFERENCES

1. "Software as a Service: Strategic Backgrounder", Software & Information Industry Association, February 2001, retrieved May 2010.



Comparing the effect of Hard and Soft threshold techniques on speech compression using wavelet transform
Sucheta Dhir Indira Gandhi Institute of Technology, G.G.S. Indraprastha University, E-mail:
Abstract- In mobile communication systems, service providers are trying to accommodate more and more users in the limited bandwidth available to them. To accommodate more users they continuously search for low bit-rate speech coders. There are many types of speech coder (vocoder) available, such as Pulse Code Modulation (PCM) based vocoders, the Linear Predictive vocoder (LPC), and higher quality vocoders like Residual Excited Linear Prediction (RELP) and Code Excited Linear Prediction (CELP). This paper deals with a somewhat newer concept that involves the use of the wavelet transform in speech compression. The wavelet transformation of a speech signal results in a set of wavelet coefficients which represent the speech signal in the wavelet domain. Most of the speech energy is concentrated in the high-valued coefficients, which are few; thus the small-valued coefficients can be truncated or zeroed. For compression, wavelet coefficients are truncated below a threshold. There are two approaches for calculating thresholds, global threshold and level-dependent threshold, and both types can be either hard or soft. The results of MATLAB simulation show that the compression factor increases when soft thresholding is used in both the global and level-dependent techniques; however, better signal-to-noise ratio and retained signal energy values are obtained when hard thresholding is used.

Index Terms- DWT, global threshold, level-dependent threshold, compression

I INTRODUCTION

Humans use multiple ways to communicate with one another, and speech is the medium most commonly used to express thoughts. The development of telephones, mobile and satellite communication, etc. has helped us to communicate with anyone on the globe who has access to mobile technology. With a bandwidth of only 4 kHz, human speech can convey information with emotion. Nowadays there is great emphasis on reducing transmission delay as well as on the clarity of the transmitted and received signal. Through speech coding a voice signal is converted into a more compact form, which can then be transmitted over a wired or wireless channel. The motivation behind the compression of speech is the limited bandwidth available for transmission. Thus a speech signal is first compressed and then coded before its transmission; at the receiver end the received signal is first decoded and then decompressed to recover the speech signal. Special stress is laid on the design and development of efficient compression techniques and speech coders for voice communication and transmission. Speech coders may be used for real-time coding of speech in mobile satellite communication, cellular telephony, and audio for videophones or video teleconferencing. Traditionally, speech coders are classified into two main categories: waveform coders and analysis/synthesis vocoders. A waveform coder attempts to copy the actual shape of the signal produced by a microphone; the most commonly used waveform coding technique is Pulse Code Modulation (PCM). A vocoder attempts to reproduce a signal that is perceptually equivalent to the speech waveform; the most commonly used analysis/synthesis coding techniques are Linear Predictive Coding (LPC) [1], Residual Excited Linear Prediction (RELP) and Code Excited Linear Prediction (CELP). This paper deals with a somewhat newer concept which employs wavelets for speech compression [5]. Wavelets are mathematical functions of finite duration with an average value of zero. A signal can be represented by a set of scaled and translated versions of a basic function called the mother wavelet, and this process is known as wavelet transformation [9]. The wavelet transformation of a signal results in a set of wavelet coefficients which represent the signal in the wavelet domain. All subsequent data operations can then be performed using just the corresponding wavelet coefficients.
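The claim that most of the signal energy concentrates in a few coefficients can be seen with a single level of the Haar wavelet, the simplest mother wavelet. The toy sine signal and the pure-Python transform below are for illustration only and are not the paper's MATLAB pipeline:

```python
import math

def haar_level1(signal):
    """One level of the orthonormal Haar DWT: scaled pairwise sums give the
    approximation coefficients, scaled pairwise differences the details."""
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2)
              for i in range(0, len(signal), 2)]
    return approx, detail

def energy(coeffs):
    return sum(v * v for v in coeffs)

# A smooth toy signal standing in for a voiced speech segment (even length).
x = [math.sin(2 * math.pi * n / 32) for n in range(64)]

a, d = haar_level1(x)

# The orthonormal transform preserves total energy...
print(round(energy(a) + energy(d), 6) == round(energy(x), 6))
# ...and almost all of it sits in the few approximation coefficients.
print(energy(a) / energy(x) > 0.95)
```

On this smooth signal well over 95% of the energy lands in the 32 approximation coefficients, which is exactly what makes the detail coefficients cheap to truncate.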


II SPEECH COMPRESSION USING WAVELET TRANSFORMATION: Fig-1 shows the Design Flow of Wavelet based Speech Encoder. process can be described as the usual process of setting to zero the elements whose absolute values are lower than the threshold. Soft threshold process is an extension of hard threshold, first setting to zero the elements whose absolute values are lower than the threshold, and then shrinking the nonzero coefficients toward 0.

4- Quantization and Encoding:

Quantization is the process of mapping large set of input values to a smaller set. Since quantization involves many to few mapping therefore it is a nonlinear and irreversible process. The thresholding of wavelet coefficients gives floating point values. These floating point values are converted into integer values using quantization table. These quantized coefficients are the indices to the quantization table. Quantized table contains redundant information. To remove the redundant information the quantized coefficients are then efficiently encoded. Encoding can be performed using Huffman Coding. Huffman coding is a statistical technique which attempts to reduce the amount of bits required to represent a string of symbols. Huffman coding is a type encoding technique which involves computation of probabilities of occurrence of symbols. These symbols are the indices to the quantization table. Symbols are arranged in descending order according to their probability of occurrence. Shortest code is assigned to symbol having maximum probability of occurrence and longest code is assigned to the symbol having minimum occurrence. The actual compression takes place in this step only because in the previous steps the length of the signal at each stage was equal to the length of the original signal. It is in this step that each symbol is represented with a variable code. III PERFORMANCE PARAMETERS

Fig 1: Design Flow of Wavelet based Speech Coder.

The major steps shown in the above diagram are explained in the following sections.

1- Choice of Wavelet Function

To design a high quality speech coder, the choice of an optimal mother wavelet function is of prime importance. The selected wavelet function should be capable of reducing the reconstructed error variance and maximizing signal to noise ratio (SNR). Different criteria can be used to select an optimal mother wavelet function [6]. Selection of optimal mother wavelet function can be based on the amount of energy a wavelet function can concentrate into level 1 approximation coefficients.

2- Wavelet Decomposition
A signal is decomposed into different resolutions or frequency bands. This task can be carried out by taking the discrete wavelet transform of the signal by a suitable function at appropriate decomposition level. The level of decomposition can be selected based on the value of entropy [2]. For processing a speech signal level-5 wavelet decomposition is adequate.
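The decomposition step can be sketched as follows. This is a minimal Python illustration using the Haar wavelet implemented directly (the paper's experiments use families such as Daubechies and biorthogonal wavelets via a wavelet toolbox; the signal here is synthetic):

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT (len(x) must be even)."""
    x = np.asarray(x, dtype=float)
    cA = (x[0::2] + x[1::2]) / np.sqrt(2)  # approximation (low-pass)
    cD = (x[0::2] - x[1::2]) / np.sqrt(2)  # detail (high-pass)
    return cA, cD

def wavedec(x, level):
    """Multi-level decomposition: returns [cA_n, cD_n, ..., cD_1]."""
    details = []
    cA = np.asarray(x, dtype=float)
    for _ in range(level):
        cA, cD = haar_dwt(cA)
        details.append(cD)
    return [cA] + details[::-1]

# Synthetic stand-in for a speech frame (length divisible by 2**5)
fs = 8000
t = np.arange(4096) / fs
x = np.sin(2 * np.pi * 200 * t) + 0.3 * np.sin(2 * np.pi * 900 * t)

# Energy concentrated in the level-1 approximation (selection criterion)
cA1, cD1 = haar_dwt(x)
retained = 100 * np.sum(cA1**2) / np.sum(x**2)

# Level-5 decomposition, as used for compression in the text
coeffs = wavedec(x, level=5)
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the signal, and for a low-frequency signal most of it ends up in the approximation band.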

III PERFORMANCE PARAMETERS

1- Compression Factor (CR): It is the ratio of the length of the original signal to that of the compressed signal:

CR = Length(original signal) / Length(compressed signal)    (1)

3- Truncation of Coefficients
Most of the speech energy is concentrated in a few high-valued coefficients, so the small-valued coefficients can be truncated or zeroed. For compression, wavelet coefficients below a threshold are truncated. There are two approaches to calculating thresholds: the global threshold and the level-dependent threshold. A global threshold retains the largest absolute-value coefficients, regardless of the level of decomposition, whereas level-dependent thresholds vary with the level of decomposition of the signal. Both types of threshold can be either hard or soft: a hard threshold zeroes out the coefficients whose magnitude falls below the threshold and leaves the rest unchanged, while a soft threshold additionally shrinks the retained coefficients toward zero by the threshold value.

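The hard and soft thresholding rules can be sketched as follows (a minimal Python illustration with made-up coefficients; a level-dependent scheme would simply apply a different `thr` per detail level):

```python
import numpy as np

def hard_threshold(c, thr):
    """Zero out coefficients whose magnitude falls below thr."""
    return np.where(np.abs(c) >= thr, c, 0.0)

def soft_threshold(c, thr):
    """Zero small coefficients and shrink the rest toward zero by thr."""
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

# Made-up wavelet coefficients and a single global threshold
coeffs = np.array([0.9, -0.05, 0.4, -0.6, 0.02, 1.2])
thr = 0.1
hard = hard_threshold(coeffs, thr)
soft = soft_threshold(coeffs, thr)
```

Note that soft thresholding scales every retained coefficient down by `thr`, which is why the retained energy drops more under soft thresholding than under hard thresholding.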
2- Retained Signal Energy (PERFL2): It indicates the amount of energy retained in the compressed signal as a percentage of the energy of the original signal:

PERFL2 = 100 · (norm of the thresholded coefficient vector)² / (norm of the original coefficient vector)²    (2)

3- Percentage of Zero Coefficients (PERF0): PERF0 is the percentage of zeros introduced into the signal by thresholding, given by the following relation:

PERF0 = 100 · (number of zero coefficients) / (total number of coefficients)    (3)



4- Signal to Noise Ratio (SNR): SNR gives the quality of the reconstructed signal; a high value indicates better reconstruction:

SNR = 10 log10( σx² / σe² )    (4)

where σx² is the mean square of the original speech signal and σe² is the mean square of the reconstruction error.

IV SIMULATION RESULTS

For choosing the optimal mother wavelet, functions from five different wavelet families were used to decompose the speech sample shown in Fig 2. The retained signal energy at level-1 wavelet decomposition was calculated and recorded in Table 1(a, b, c, d, e).
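The performance parameters defined in this section can be computed directly; the following Python sketch uses toy data (not the paper's signals) to mirror those definitions:

```python
import numpy as np

def compression_factor(len_original, len_compressed):
    """Ratio of original signal length to compressed signal length."""
    return len_original / len_compressed

def perf0(coeffs_thr):
    """Percentage of coefficients that are zero after thresholding."""
    return 100.0 * np.sum(coeffs_thr == 0) / coeffs_thr.size

def perfl2(coeffs_thr, coeffs):
    """Retained signal energy (%) of the thresholded coefficients."""
    return 100.0 * np.sum(coeffs_thr**2) / np.sum(coeffs**2)

def snr_db(x, x_rec):
    """Signal-to-noise ratio of the reconstruction, in dB."""
    return 10.0 * np.log10(np.sum(x**2) / np.sum((x - x_rec)**2))

# Toy data (not the paper's signals)
c = np.array([5.0, 0.2, -0.1, 3.0])
c_thr = np.where(np.abs(c) >= 0.5, c, 0.0)
x = np.array([1.0, 2.0, 3.0, 4.0])
x_rec = np.array([1.0, 2.1, 2.9, 4.0])
```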

Function    Retained Signal Energy
bior-1.1    91.4615
bior-1.3    96.3201
bior-1.5    92.5828
bior-2.2    96.7950
bior-2.4    96.9173
bior-2.6    96.9730
bior-2.8    97.0020
bior-3.1    98.3436
bior-3.3    98.3986
bior-3.5    98.4286
bior-3.7    98.4455
bior-3.9    98.4556
bior-4.4    95.8568
bior-5.5    93.5781
bior-6.8    96.5751

Table 1(e): Retained Signal Energy for Biorthogonal Wavelet Family

Fig 2: Speech signal sample.

Function    Retained Signal Energy
haar        91.4615

Table 1(a): Retained Signal Energy for Haar Wavelet Family.

Function    Retained Signal Energy
db-1        91.4160
db-2        93.8334
db-3        94.8626
db-4        95.4728
db-5        95.8830
db-6        96.1680
db-7        96.2927
db-8        96.3349
db-9        92.3262
db-10       96.3416

Table 1(b): Retained Signal Energy for Daubechies Wavelet Family.

Function    Retained Signal Energy
sym-2       93.8224
sym-3       94.8626
sym-4       95.6662
sym-5       96.0647
sym-6       96.1711
sym-7       96.1728
sym-8       96.3343

Table 1(c): Retained Signal Energy for Symlets Wavelet Family.

Function    Retained Signal Energy
coif-1      93.8450
coif-2      95.7307
coif-3      96.1958
coif-4      96.3504
coif-5      96.4062

Table 1(d): Retained Signal Energy for Coiflets Wavelet Family.

One wavelet function out of each wavelet family is selected based on the maximum retained signal energy at level-1 wavelet decomposition. On this criterion, the bior-3.9, db-10, sym-8 and coif-5 wavelet functions are selected for level-5 wavelet decomposition for speech compression. Table 2 shows the values of Compression Factor (CR), Signal to Noise Ratio (SNR), Percentage of Zero Coefficients (PERF0), and Retained Signal Energy (PERFL2) for the selected wavelet functions under both hard and soft global thresholds. Fig 3 shows the reconstructed signal after decoding and decompression of the encoded and compressed speech signal using the global threshold approach.

Wavelet Function: bior-3.9 (Global Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.2932           3.4657
SNR         23.8536          15.1721
PERF0       76.5387          76.5383
PERFL2      96.9588          63.5240

Table 2(a): Performance Parameter Table for bior-3.9 wavelet function and Global Threshold Approach.

Wavelet Function: db-10 (Global Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.3852           3.5429
SNR         23.8335          14.1865
PERF0       78.7307          78.7307
PERFL2      90.8521          41.3243

Table 2(b): Performance Parameter Table for db-10 wavelet function and Global Threshold Approach.

Wavelet Function: sym-8 (Global Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.3070           3.4524
SNR         23.7707          14.1173
PERF0       78.6117          78.6117
PERFL2      90.7802          40.9308

Table 2(c): Performance Parameter Table for sym-8 wavelet function and Global Threshold Approach.

Wavelet Function: coif-5 (Global Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.3035           3.4460
SNR         23.8271          14.1590
PERF0       78.6407          78.6407
PERFL2      90.8347          41.3203
Table 2(d): Performance Parameter Table for coif-5 wavelet function and Global Threshold Approach.

Fig 3(g): Reconstructed signal for coif-5 wavelet function using hard-global threshold.

Fig 3(h): Reconstructed signal for coif-5 wavelet function using soft-global threshold.

Fig 3(a): Reconstructed signal for bior-3.9 wavelet function using hard-global threshold.

Fig 3(b): Reconstructed signal for bior-3.9 wavelet function using soft-global threshold.

Similarly, Table 3 shows the values of Compression Factor (CR), Signal to Noise Ratio (SNR), Percentage of Zero Coefficients (PERF0), and Retained Signal Energy (PERFL2) for the selected wavelet functions under both hard and soft level-dependent thresholds, and Fig 4 shows the reconstructed signal after decoding and decompression of the encoded and compressed speech signal using the level-dependent threshold approach.

Wavelet Function: bior-3.9 (Level Dependent Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.3905           3.6901
SNR         17.7406          9.9707
PERF0       78.0619          78.0619
PERFL2      92.7563          58.4797
Table 3(a): Performance Parameter Table for Bior-3.9 wavelet function and Level Dependent Threshold Approach.

Fig 3(c): Reconstructed signal for db-10 wavelet function using hard-global threshold.

Wavelet Function: db-10 (Level Dependent Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.3150           3.5871
SNR         18.2082          10.5317
PERF0       78.0619          78.0619
PERFL2      83.9324          37.4042
Table 3(b): Performance Parameter Table for db-10 wavelet function and Level Dependent Threshold Approach.

Fig 3(d): Reconstructed signal for db-10 wavelet function using soft-global threshold.

Fig 3(e): Reconstructed signal for sym-8 wavelet function using hard-global threshold.

Wavelet Function: sym-8 (Level Dependent Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.2535           3.4300
SNR         17.9551          10.1267
PERF0       78.0964          78.0964
PERFL2      83.4822          35.9319
Table 3(c): Performance Parameter Table for sym-8 wavelet function and Level Dependent Threshold Approach.

Fig 3(f): Reconstructed signal for sym-8 wavelet function using soft-global threshold.

Wavelet Function: coif-5 (Level Dependent Threshold)
Parameter   Hard Threshold   Soft Threshold
CR          3.2273           3.4563
SNR         18.3864          10.7569
PERF0       77.9466          77.9466
PERFL2      84.1851          38.7021

Table 3(d): Performance Parameter Table for coif-5 wavelet function and Level Dependent Threshold Approach.

Fig 4(h): Reconstructed signal for coif-5 wavelet function using soft-level dependent threshold.

VI CONCLUSION

Compression of the speech signal is essential, since raw speech is highly space consuming. In this paper the wavelet transform is used for speech compression; its performance was tested on various parameters, and the following points were observed. Speech compression using the wavelet transform involves quantization of the coefficients before the encoding step, which is an irreversible process; hence the original speech cannot be exactly retrieved from the compressed speech signal. As can be seen in Table 2, the percentage of zeros introduced (PERF0) remains exactly the same for the hard and soft global threshold techniques. Ideally, for equal values of PERF0 the CRs should also be equal, but a difference in the values of CR is observed. This discrepancy can be attributed to the introduction of additional zeros at the quantization stage, because the coefficients are scaled down by the soft threshold. It is due to this scaling that the retained signal energy, and hence the SNR, drops to lower values, though the audibility and intelligibility of the speech were not significantly affected. Higher compression factors are achieved when the db-10 wavelet function is used for speech compression, and higher signal-to-noise ratios are achieved when the bior-3.9 wavelet function is used. A similar inference can be drawn from the observations for the hard and soft level-dependent threshold techniques (Table 3).
Fig 4(a): Reconstructed signal for bior-3.9 wavelet function using hard-level dependent threshold.

Fig 4(b): Reconstructed signal for bior-3.9 wavelet function using soft-level dependent threshold.

Fig 4(c): Reconstructed signal for db-10 wavelet function using hard-level dependent threshold.

Fig 4(d): Reconstructed signal for db-10 wavelet function using soft-level dependent threshold.

Fig 4(e): Reconstructed signal for sym-8 wavelet function using hard-level dependent threshold.

Fig 4(f): Reconstructed signal for sym-8 wavelet function using soft-level dependent threshold.

Fig 4(g): Reconstructed signal for coif-5 wavelet function using hard-level dependent threshold.

VII REFERENCES

[1] Shijo M. Joseph, Firoz Shah A and Babu Anto P, "Spoken digit compression: A Comparative Study between Discrete Wavelet Transforms and Linear Predictive Coding", International Journal of Computer Applications (0975-8887), Vol. 6, No. 6, September 2010.
[2] Wonyong Chong and Jongsoo Kim, "Speech and Image Compressions by DCT, Wavelet, and Wavelet Packet", International Conference on Information, Communications and Signal Processing (ICICS '97), Singapore, 9-12 September 1997.
[3] Wonyong Chong and Jongsoo Kim, "Speech and Image Compressions by DCT, Wavelet, and Wavelet Packet", International Conference on Information, Communications and Signal Processing (ICICS '97), Singapore, 9-12 September 1997.
[4] P. Prakasam and M. Madheswaran, "Adaptive Algorithm for Speech Compression using Cosine Packet Transform", Proc. IEEE International Conference on Intelligent and Advanced Systems, 2007, pp. 1168-1172.
[5] Abdul Mawla M. A. Najih, Abdul Rahman Ramli, Azizah Ibrahim and Syed A. R., "Comparing Speech Compression Using Wavelets With Other Speech Compression Schemes", Proc. IEEE Students Conference on Research and Development (SCOReD), 2003, pp. 55-58.
[6] R. Polikar, "The Wavelet Tutorial", URL: ial.html, March 1999.
[7] Gonzalez, Woods and Eddins, Digital Image Processing, Gatesmark Publishing Ltd., 2009. ISBN 9780982085400.
[8] K. Subramaniam, S.S. Dlay and F.C. Rind, "Wavelet transforms for use in motion detection and tracking application", IEEE Image Processing and its Applications, pp. 711-715, 1999.
[9] P.S. Addison, The Illustrated Wavelet Transform Handbook, IOP Publishing Ltd, 2002. ISBN 0-7503-0692-0.
[10] M. Tico, P. Kuosmanen and J. Saarinen, "Wavelet domain features for fingerprint recognition", IEEE Electronic Letters, 37(1):21-22, January 2001.
[11] Jalal Karam, "End Point Detection for Wavelet Based Speech Compression", Proceedings of World Academy of Science, Engineering and Technology, Vol. 27, February 2008, ISSN 1307-6884.



A Hybrid Filter for Image Enhancement

Vinod Kumar,[a] Kaushal Kishore,[b] and Dr. Priyanka[a]

[a] Deenbandhu Chotu Ram University of Science and Technology, Murthal, Sonepat, Haryana, India
[b] Ganpati Institute of Technology and Management, Bilaspur, Yamunanagar, Haryana, India

Abstract- Image filtering processes are applied to images to remove the different types of noise that are either present in the image during capture or introduced into the image during transmission. Salt & pepper (impulse) noise is a type of noise that occurs during transmission of images, due to bit errors or dead pixels in the image contents. Images are blurred due to object movement or camera displacement during capture. This paper deals with removing impulse noise and blurredness simultaneously from images. The hybrid filter is a combination of a Wiener filter and a median filter.

Keywords: Salt & Pepper (Impulse) noise; Blurredness; Median filter; Wiener filter

I INTRODUCTION

The basic problem in image processing is image enhancement and restoration in a noisy environment. To enhance the quality of images, various filtering techniques available in image processing can be used. Various filters can remove noise from images while preserving image details and enhancing image quality. Hybrid filters are used to remove either Gaussian or impulse noise from the image; these include the median filter and the Wiener filter. Combination (hybrid) filters have been proposed to remove mixed types of noise from images during image processing.

II MEDIAN FILTER

The median filter gives the best result when the impulse noise percentage is less than 0.1%; when the quantity of impulse noise is increased, it no longer gives the best result. Median filtering is a nonlinear operation used in image processing to reduce "salt and pepper" noise. The mean filter can also be used to remove impulse noise: it replaces each pixel with the mean of the neighboring pixel values, but it does not preserve image details, as some details are removed by the averaging. In the median filter, instead of replacing the pixel value with the mean of the neighboring pixel values, we replace it with their median. The median is calculated by first sorting all the pixel values from the surrounding neighborhood into numerical order and then replacing the pixel being considered with the middle value. (If the neighborhood contains an even number of pixels, the average of the two middle pixel values is used.) Fig.1 illustrates an example calculation.

Fig.1: Example of median filtering

III WIENER FILTER

The main purpose of the Wiener filter is to filter out the noise that has corrupted a signal. The Wiener filter is based on a statistical approach. Most filters are designed for a desired frequency response; the Wiener filter approaches filtering from a different point of view. One method is to assume knowledge of the spectral properties of the original signal and the noise, and to seek the linear time-invariant filter whose output comes as close to the original signal as possible [1]. Wiener filters are characterized by the following assumptions:


a. Signal and (additive white Gaussian) noise are stationary linear random processes with known spectral characteristics.
b. Requirement: the filter must be physically realizable, i.e. causal (this requirement can be dropped, resulting in a non-causal solution).
c. Performance criterion: minimum mean-square error.

Wiener Filter in the Fourier Domain

The Wiener filter is given by the following transfer function:

G(u,v) = H*(u,v) Ps(u,v) / [ |H(u,v)|² Ps(u,v) + Pn(u,v) ]

Dividing the numerator and denominator by Ps makes its behaviour easier to explain:

G(u,v) = H*(u,v) / [ |H(u,v)|² + Pn(u,v)/Ps(u,v) ]

where
H(u,v) = degradation function,
H*(u,v) = complex conjugate of the degradation function,
Pn(u,v) = power spectral density of the noise,
Ps(u,v) = power spectral density of the un-degraded image.
The term Pn/Ps is the reciprocal of the signal-to-noise ratio.

IV IMAGE NOISE

Image noise is a degradation of image quality. It is produced by random variation of the brightness or color information in images, caused by the sensors and circuitry of a scanner or digital camera. Image noise can also originate in film grain and in the unavoidable shot noise of an ideal photon detector. It is generally regarded as an undesirable by-product of image capture. The types of noise considered here are additive white Gaussian noise, salt-and-pepper noise, and blurredness.

Additive White Gaussian Noise
Additive white Gaussian noise present in images is independent at each pixel and of the signal intensity. In color cameras, where more amplification is used in the blue channel than in the green or red channel, there can be more noise in the blue channel.

Salt-and-Pepper Noise
An image with salt-and-pepper noise shows dark pixels in bright regions and bright pixels in dark regions [2]. Salt & pepper noise in images can be caused by dead pixels, analog-to-digital conversion errors, bit errors in transmission, etc. It can largely be eliminated by using dark frame subtraction and by interpolating around dark/bright pixels.

Blurredness
The blurredness of an image depends on the point spread function (psf), which may be circular or linear. An image is blurred due to camera movement or object displacement.

V HYBRID FILTER

The hybrid filter is a combination of the median and Wiener filters. When we arrange these filters in series we get the desired output: first the impulse noise is removed, and the result is then passed to the Wiener filter, which removes the blurredness and the additive white noise from the image. The result is not identical to the original image, but it is close.
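The median-then-Wiener pipeline described above can be sketched as follows. This is an illustrative Python version (the constant `k` stands in for the noise-to-signal ratio Pn/Ps, and the 3x3 median and circular-convolution blur model are simplifying assumptions, not the authors' exact implementation):

```python
import numpy as np

def median3x3(img):
    """3x3 median filter for impulse noise; borders use reflection padding."""
    p = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    windows = np.stack([p[i:i + h, j:j + w]
                        for i in range(3) for j in range(3)])
    return np.median(windows, axis=0)

def wiener_deconv(img, psf, k):
    """Fourier-domain Wiener filter G = H* / (|H|^2 + k); the constant k
    stands in for the noise-to-signal power ratio Pn/Ps."""
    H = np.fft.fft2(psf, s=img.shape)
    G = np.conj(H) / (np.abs(H) ** 2 + k)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * G))

def hybrid_filter(img, psf, k=0.01):
    """Median first (impulse noise), then Wiener (blur and Gaussian noise)."""
    return wiener_deconv(median3x3(img), psf, k)

# Hypothetical usage on a degraded image `noisy`, assuming a 1x5
# horizontal motion-blur psf:
#   restored = hybrid_filter(noisy, np.ones((1, 5)) / 5.0, k=0.01)
```

Running the median stage first matters: the Wiener filter assumes additive stationary noise, and isolated impulse pixels would otherwise be smeared across the image by the deconvolution.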

Algorithm

The following steps are performed to filter the image:
1. If the image is colored, convert it to a gray-scale image.
2. Convert the image to double precision.
3. Find the median by sorting all the values of the 3x3 mask in increasing order.
4. Replace the center pixel value with the median value.
5. Estimate the signal-to-noise ratio.
6. Apply the deconvolution function to filter the image.

VI MSE & PSNR

The term peak signal-to-noise ratio (PSNR) is the ratio between the maximum possible power of a signal and the power of the corrupting noise. For an m x n image I and its filtered version K, the mean squared error is

MSE = (1/(m·n)) Σi Σj [ I(i,j) − K(i,j) ]²

The PSNR is defined as:



PSNR = 10 · log10( MAXI² / MSE ) = 20 · log10( MAXI / √MSE )

where MAXI is the maximum possible pixel value of the image.

VII SIMULATION RESULT

The original image is the cameraman image. Three types of noise (additive white Gaussian noise, salt & pepper noise, and blurredness) are added, and the noisy image is passed to the hybrid filter to obtain the desired result. The result depends upon the blurring angle (theta), the blurring length (Len) and the intensity of the impulse noise. The performance is evaluated by comparing the MSE and PSNR of the original image and the filter output image.
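The two quality measures can be computed as follows (a minimal Python sketch; `max_i` defaults to 1.0 for double-precision images and would be 255 for 8-bit images):

```python
import numpy as np

def mse(original, filtered):
    """Mean squared error between two equally sized grayscale images."""
    diff = original.astype(float) - filtered.astype(float)
    return np.mean(diff ** 2)

def psnr(original, filtered, max_i=1.0):
    """Peak signal-to-noise ratio in dB; max_i is the maximum possible
    pixel value (1.0 for double images, 255 for 8-bit images)."""
    return 10.0 * np.log10(max_i ** 2 / mse(original, filtered))

# Toy example: a constant per-pixel error of 0.1 on a double image
a = np.zeros((8, 8))
b = np.full((8, 8), 0.1)
```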

Fig.4 Blurred image with Gaussian noise of mean=0, var=.001

Fig.2 Original Image

Fig.5 Blurred or impulse-noisy hybrid filter output


Fig.6 Hybrid filter output

Fig.3 Blurred image with Len=21, Theta=11


Fig.7 Hybrid filter output

Now we calculate the mean square error under different conditions to check the performance of our filter. Table 1 shows the results when the blurredness of the image varies with angle and length while the percentage of impulse noise is constant.

Table 1:
Blur     Blurring  Impulse     MSE      PSNR
length   angle     noise (%)            (dB)
21       11        0.01        0.0087   69.11
15       09        0.01        0.0079   69.30
10       07        0.01        0.0074   69.49
05       03        0.01        0.0050   70.49
02       02        0.01        0.0040   71.49

Next, when the blurredness of the image is kept the same and the percentage of impulse noise is increased, the following results are obtained:

Table 2:
Blur     Blurring  Impulse     MSE      PSNR
length   angle     noise (%)            (dB)
21       11        0.01        0.0087   68.11
21       11        0.03        0.0172   66.08
21       11        0.05        0.0268   64.15
21       11        0.07        0.0333   63.02
21       11        0.09        0.0398   62.06

When the blurredness and the impulse noise vary simultaneously, we get the following results:

Table 3:
Blur     Blurring  Impulse     MSE      PSNR
length   angle     noise (%)            (dB)
21       11        0.01        0.0087   68.22
15       09        0.02        0.0130   67.35
10       05        0.01        0.0060   70.08
10       05        0.03        0.0131   67.39
05       03        0.01        0.0052   71.11
05       03        0.04        0.0135   67.20

VIII CONCLUSION

We used the cameraman image in .tif format, added three types of noise (impulse noise, Gaussian noise, blurredness), and applied the noisy image to the hybrid filter. The final filtered image depends upon the blurring angle, the blurring length and the percentage of impulse noise; when these variables are small, the filtered image is nearly equal to the original image.

IX SCOPE FOR FUTURE WORK

There are a couple of areas we would like to improve. One is de-noising along the edges, as the method used did not perform well there; instead of the median filter, an adaptive median filter could be used. The range of noise types handled could also be increased.

X REFERENCES

[1] M. Kazubek, "Wavelet domain image de-noising by thresholding and Wiener filtering", IEEE Signal Processing Letters, Vol. 10, Issue 11, Nov. 2003, p. 265.
[2] Shi Zhong, "Image Denoising using Wavelet Thresholding and Model Selection", Proceedings of the 2000 International Conference on Image Processing, Vol. 3, 10-13 Sept. 2000, p. 262.
[3] Shaomin Peng and Lori Lucke, "A hybrid filter for image enhancement", Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455.
[4] "Performance Comparison of Median and Wiener Filter in Image De-noising", International Journal of Computer Applications (0975-8887), Vol. 12, No. 4, November 2010.
[5] Shaomin Peng and Lori Lucke, "Multi-level Adaptive Fuzzy Filter for Mixed Noise Removal", Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455.



Comprehensive Study of Finger Print Detection Technique

Vivekta Singh
Sr. Lecturer, GNIT-GIT, Greater Noida
+91-9971506661

Vanya Garg
Lecturer, GNIT-GIT, Greater Noida
+91-9410236413

Abstract- Extraction and verification of biometric signatures are tedious tasks that are prone to errors and influenced by many factors. The fingerprint is one of the most reliable and most widely used personal identification methods. Fingerprint detection techniques are either automated or, in some cases, the prints can be matched manually. Manual fingerprint detection is tedious, time consuming and expensive, while automated fingerprint detection techniques are not as reliable and authentic. In this paper, a technique based on a combination of an automated system and traditional manual detection is presented. This technique can cope with the problems associated with both automated and manual methods of detection.

Index Terms- Finger Print Detection (FDT), Minutiae, Matching, Verification, Ridge extraction
One problem with automatic fingerprint identification techniques is that they do not follow the same guidelines as manual matching. For example, a sweat pore present in a fingerprint may be identified as a different pattern, and thus the fingerprint may not match its counterpart in the database. An automated fingerprint identification system mainly involves three stages:
1. Preprocessing stage
2. Minutiae extraction stage
3. Post-processing stage
In this paper we try to address the different aspects of fingerprint detection techniques and their flow. The following section, Finger Print Detection Technique, describes the details of the various stages and their flow.

II FINGER PRINT DETECTION TECHNIQUE

A fingerprint is composed of many ridges and furrows, as shown in [Figure-1]. However, fingerprints are not distinguished by their ridges and furrows but by minutiae [6], which are some abnormal points on the ridges. Among the variety of minutia types reported in the literature, two are most significant and in heavy usage: one is called a termination, which is the immediate ending of a ridge; the other is called a bifurcation, which is the point on a ridge from which two branches derive.
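Terminations and bifurcations are commonly detected on a thinned (one-pixel-wide) binary ridge skeleton using the classical crossing-number method; the sketch below is a generic Python illustration of that method, not necessarily the exact procedure used in this paper:

```python
import numpy as np

def crossing_number(skel, r, c):
    """Crossing number of pixel (r, c) in a binary ridge skeleton
    (1 = ridge), visiting the 8 neighbours in circular order."""
    n = [skel[r - 1, c - 1], skel[r - 1, c], skel[r - 1, c + 1],
         skel[r, c + 1], skel[r + 1, c + 1], skel[r + 1, c],
         skel[r + 1, c - 1], skel[r, c - 1]]
    return sum(abs(int(n[i]) - int(n[(i + 1) % 8])) for i in range(8)) // 2

def find_minutiae(skel):
    """CN == 1 marks a ridge termination; CN == 3 marks a bifurcation."""
    ends, forks = [], []
    for r in range(1, skel.shape[0] - 1):
        for c in range(1, skel.shape[1] - 1):
            if skel[r, c] == 1:
                cn = crossing_number(skel, r, c)
                if cn == 1:
                    ends.append((r, c))
                elif cn == 3:
                    forks.append((r, c))
    return ends, forks

# Toy skeleton: a ridge that forks into two branches (a Y shape)
skel = np.zeros((5, 7), dtype=int)
skel[2, 1:5] = 1          # main ridge
skel[1, 5] = 1            # upper branch
skel[3, 5] = 1            # lower branch
terminations, bifurcations = find_minutiae(skel)
```

An ordinary ridge pixel has a crossing number of 2, so only endings and forks stand out, which matches the termination/bifurcation definitions above.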

I INTRODUCTION

Fingerprints have been widely used in personal identification for several centuries [5]. The fingerprint is much more reliable than other popular personal identification methods based on signature, face and speech. Apart from fingerprint verification for criminal identification and police work, it is nowadays used for various applications such as security control, work-hour tracking, online transaction authentication, security of PC applications, banking cash machines, etc. [7]. C