
The project, entitled “Web Mining Browser”, is a VB.NET Windows application with a SQL Server 2005 back end, developed for Creative solutions, Coimbatore. Its basic aim is a browser that works both online and offline and keeps personalization data for each user. When the Internet connection is lost, pages that have already been stored can still be browsed without interruption. Stored pages are updated automatically by the auto intelligent class. The built-in RSS Reader lets the user read RSS feeds; feeds are also stored locally, updated automatically and available for offline reading.

The browser also detects the URLs in a page and stores the pages. The system is personalizable: the browser can be configured for different users, and once a user name is provided it authenticates the user and loads that user's personalized settings. The system consists of the modules User Registration, Link Extractor, Page Storage, RSS Reader and Feed Storage. User Registration registers a new user and assigns each user a unique userid. Link Extractor extracts the hyperlinks on an HTML page and displays them in a corner of the window. Page Storage stores pages in the database in binary format. RSS Reader detects the RSS feeds of each website and reads them automatically. Feed Storage stores RSS feeds in the database for offline use.
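The Link Extractor step described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's VB.NET code; the class name `LinkExtractor`, the function `extract_links` and the example addresses are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute target of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the address of the page.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the hyperlinks found in html, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content used only for illustration.
page = '<p><a href="/news">News</a> <a href="http://example.org/a">A</a></p>'
links = extract_links(page, "http://example.com/index.html")
```

The resulting list could then be shown in a side panel, which is how the report describes the extracted links being displayed.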

Offline access:
Offline pages can be read without interruption, since the stored pages are updated automatically by the auto intelligent class. RSS feeds are likewise stored locally and updated automatically, so the built-in RSS Reader can also be used offline.
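One way to picture the offline fallback is a store that is refreshed whenever a page loads successfully and consulted when the network call fails. The sketch below is an assumption-laden illustration in Python with SQLite standing in for SQL Server; the table name `webpage`, the functions `open_store` and `load_page`, and the injected `fetch` callable are all hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the page store; content is kept as a binary BLOB."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS webpage ("
        " user_id TEXT, url TEXT, content BLOB,"
        " PRIMARY KEY (user_id, url))"
    )
    return conn


def load_page(conn, user_id, url, fetch):
    """Return page bytes, preferring the network and falling back to the store."""
    try:
        content = fetch(url)  # assumed to raise OSError when offline
    except OSError:
        row = conn.execute(
            "SELECT content FROM webpage WHERE user_id = ? AND url = ?",
            (user_id, url),
        ).fetchone()
        if row is None:
            raise LookupError("page not stored: " + url)
        return bytes(row[0])
    # Online: refresh the stored copy so it is available offline later.
    conn.execute(
        "INSERT OR REPLACE INTO webpage (user_id, url, content) VALUES (?, ?, ?)",
        (user_id, url, content),
    )
    conn.commit()
    return content


def offline_fetch(url):
    """Simulates a dropped connection."""
    raise ConnectionError("no network")


conn = open_store()
first = load_page(conn, "user1", "http://example.com/", lambda url: b"<html>v1</html>")
again = load_page(conn, "user1", "http://example.com/", offline_fetch)
```

Passing `fetch` in as a parameter keeps the sketch free of real network access and makes the offline case easy to simulate.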


Feed reader:
RSS feeds are detected automatically for each website and read automatically. This module stores the RSS feeds in the database for offline use.
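Feed detection and reading might look like the sketch below. It assumes that a site advertises its feed with a `<link rel="alternate" type="application/rss+xml">` tag, which is a common convention, and that the feed is plain RSS 2.0; the report does not say how the project detects feeds. The names `FeedFinder`, `find_feeds` and `parse_rss` are hypothetical, and the code is Python rather than the project's VB.NET.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedFinder(HTMLParser):
    """Finds <link rel="alternate" type="application/rss+xml" href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        is_rss = (
            tag == "link"
            and (a.get("rel") or "").lower() == "alternate"
            and (a.get("type") or "").lower() == "application/rss+xml"
        )
        if is_rss and a.get("href"):
            self.feeds.append(urljoin(self.base_url, a["href"]))


def find_feeds(html, base_url):
    """Return the absolute feed addresses advertised by a page."""
    finder = FeedFinder(base_url)
    finder.feed(html)
    return finder.feeds


def parse_rss(xml_text):
    """Return (title, link) pairs for each <item> of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title", ""), item.findtext("link", ""))
        for item in root.iter("item")
    ]


# Hypothetical inputs used only for illustration.
page = '<head><link rel="alternate" type="application/rss+xml" href="/rss.xml"></head>'
rss = (
    '<rss version="2.0"><channel><title>Site</title>'
    "<item><title>One</title><link>http://example.com/1</link></item>"
    "</channel></rss>"
)
feeds = find_feeds(page, "http://example.com/")
items = parse_rss(rss)
```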

User registration:
User registration is used to register a new user with the system. A unique userid and a password are assigned to each user.
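A registration and login flow of this kind can be sketched as below. The report does not describe how passwords are stored, so the salted PBKDF2 hash, the table layout and the function names `register` and `authenticate` are assumptions; the sketch also lets the user choose the password, and uses Python with SQLite in place of VB.NET and SQL Server.

```python
import hashlib
import hmac
import os
import sqlite3
import uuid


def open_users(path=":memory:"):
    """Create the (hypothetical) users table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        " user_id TEXT PRIMARY KEY, user_name TEXT UNIQUE,"
        " salt BLOB, pw_hash BLOB)"
    )
    return conn


def _hash(password, salt):
    # Salted, iterated hash so that the plain password is never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn, user_name, password):
    """Store a new user and return the unique userid assigned to it."""
    user_id = uuid.uuid4().hex
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        (user_id, user_name, salt, _hash(password, salt)),
    )
    conn.commit()
    return user_id


def authenticate(conn, user_name, password):
    """Return the userid when the password matches, otherwise None."""
    row = conn.execute(
        "SELECT user_id, salt, pw_hash FROM users WHERE user_name = ?",
        (user_name,),
    ).fetchone()
    if row is None:
        return None
    user_id, salt, pw_hash = row
    return user_id if hmac.compare_digest(_hash(password, salt), pw_hash) else None


conn = open_users()
alice_id = register(conn, "alice", "s3cret")
```

The userid returned by `authenticate` is what the browser would use to load that user's personalized settings and stored pages.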

Quick launch:
Predefined links to the most-used websites are provided as separate buttons. Clicking a button opens the page directly, which saves the time of typing the address into the address bar.

This chapter presents a short profile of the organization and a brief introduction to the system.

1.1 Organization profile
Creative K Solutions is a software company established in 1997 at Coimbatore. It provides software and web solutions to companies in and around Coimbatore and Tamilnadu. Its developers are well trained and well equipped to meet the needs of customers. Its web solutions target Windows Server and Linux Server, and the company works with .NET and Java technologies. The company is fully equipped with the latest hardware and software.

As multiple skills and competencies combine to realize technology-driven business transformations, software development continues to be the largest software engineering activity across enterprises. Custom-built solutions for clients are driven by the innovative use of technology to achieve competitive advantage and differentiation. As organizations iterate on their business and IT strategies, outsourcing IT application development allows them to focus on their core business, with benefits across the business spectrum.

The solutions of Creative K Solutions are robust and scalable and integrate easily with a diverse range of products and technologies. The company's expertise spans the entire gamut of application and custom development. Its wide range of technological expertise, application knowledge and consulting experience enables it to develop and integrate robust and scalable e-business solutions that keep the end customer's requirements in mind. The software development process, supported by a proven onsite-offshore development methodology and quality management system, shortens application development timeframes and provides significant business benefits to customers.

Hardware configuration
Processor Name  : Pentium IV
Processor Speed : 1.7 GHz
Memory (RAM)    : 256 MB
Hard Disk       : 40 GB
Floppy Drive    : 3½" 1.44 MB Drive
Monitor         : Samsung Color Monitor
Keyboard        : 104 keys HCL Multimedia Keyboard
Mouse           : Logitech Optical Mouse

Software configuration
Operating System : Windows 2000 onwards
Software Tools   : VB.NET Windows application, ADO.NET
Front end        : VB.NET
Back end         : SQL Server 2005


Front End: VB.NET
• VB.NET is a simple, modern, object-oriented language for the .NET Framework and the successor to classic Visual Basic.
• It aims to combine the high productivity of Visual Basic with the power of the .NET Framework.
• Console applications, Windows applications and Web applications can all be developed using VB.NET.
• VB.NET avoids common C++ problems such as manual memory management and pointer errors.
• It supports garbage collection and automatic memory management.
• There are no pointers in VB.NET.
• With Option Strict On, unsafe implicit conversions, such as from a Double to a Boolean, are rejected at compile time.
• Value types (primitive types) are initialized to zero, and reference types (objects and classes) are initialized to null (Nothing), automatically.
• Arrays are zero-based and bounds-checked.
• Arithmetic overflow can be checked.
• VB.NET has no unsafe code blocks; existing native code is reached through COM interop or platform invoke rather than through pointers.
• Components written in VB.NET and in other managed-code languages can be used directly from one another.

Back End: SQL Server
• A SQL Server database is a collection of many objects, such as tables, views, stored procedures and constraints.
• It is owned by a single user account but can contain objects owned by other users.
• It has its own set of system tables, which catalog the definition of the database.
• It maintains its own set of user accounts and security.
• It is the primary unit of recovery and maintains logical consistency among the objects in the database.
• It has its own transaction log and manages the transactions within the database.
• It can participate in two-phase commit transactions with other SQL Server databases on the same server or on different servers.
• It can span multiple disk drives and operating system files.
• It can range in size from 1 MB through a theoretical limit of 1 TB.
• It can grow and shrink, either automatically or by command.
• Its objects can be joined in queries with objects from other databases in the same SQL Server installation.
• It can have specific options set or disabled.
• It is conceptually similar to, but richer than, the ANSI SQL-schema concept.

Features of Windows 2000 Professional
Microsoft Windows 2000 Professional is described as more compatible and more powerful than earlier workstation releases. It provides faster access to information, so tasks can be accomplished more quickly and easily. Network administrators can work more efficiently because many of the most common computer management tasks are automated and streamlined. It offers increased compatibility with different types of network and with a wide array of legacy hardware and software.

Basic features
• It provides improved driver support.
• It supports a personalized computing environment.
• Increased support for new-generation hardware and multimedia technologies.
• Sophisticated web and Internet integration.
• Standards-based security structures.
• It provides quick and easy access to the Internet.
• Active Desktop allows the workspace to be customized, and the address bar helps to connect to the Internet from any window.
• A variety of tools that help to communicate with people and other computers are available.
• The communication tools are used to send e-mail, handle phone calls, send a fax, conduct a meeting by videoconference, etc.

A detailed study of the existing system is necessary. The system study examines the functions of the system, the requirements of its users and the structure of the current system. It also identifies the problems in the current system and works out solutions to them.

2.1 Existing system:
Existing browser systems allow users to surf websites. While the user surfs, the browser stores some of the visited pages in its cache as offline web pages and records their URL addresses in the history. Existing browsers are designed to be used fully online.

The system study follows these steps:
• Thoroughly investigate the existing system.
• Form a clear idea of what the system is.
• Find out what further requirements there are.
• Evaluate the system concept for feasibility.
• Design the input, process and output.
• Establish the constraints and prepare for verification and validation.

Limitations of the existing system:
• It is available only for browsing websites and stores them in cache memory.
• Once the browser history is erased, both the history and the stored web pages are lost.
• Only URL addresses are stored as history.
• It is not personalized; that is, no user-specific data is considered while browsing.

2.2 Proposed system
The proposed system provides a desktop, data-personalizable browser that allows users to browse the websites they like and stores each page for the particular user. The links available on a web page are also extracted and displayed separately for easy navigation. The system updates the stored pages automatically, without user intervention. The proposed system contains an RSS (Really Simple Syndication) Feed Reader, which reads the RSS feed of a website automatically. It retrieves the RSS feed data and displays it in an easier-to-read user interface.

Features of the proposed system:
• Personalized browser settings and page storage.
• Automatically updates a stored page if any change occurs on the web page.
• Allows the user to browse offline, without an Internet connection.
• Allows users to read RSS feeds from the browser itself.
• RSS feeds are updated automatically and are available for offline reading.
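Automatic updating needs some way to tell whether a page has changed. The report attributes this to the auto intelligent class without describing its logic, so the sketch below swaps in a simple, clearly different technique: comparing a SHA-256 digest of the newly fetched content with the digest of the stored copy. It is written in Python; `refresh`, `digest` and the in-memory `stored` dictionary are hypothetical names for illustration.

```python
import hashlib


def digest(content):
    """Fingerprint of the page content."""
    return hashlib.sha256(content).hexdigest()


def refresh(stored, url, fetch):
    """Re-fetch url and replace the stored copy only if the content changed.

    stored maps url -> (sha256 hex digest, content bytes).
    Returns True when the stored copy was replaced.
    """
    content = fetch(url)
    new_digest = digest(content)
    old = stored.get(url)
    if old is not None and old[0] == new_digest:
        return False  # unchanged: keep the existing copy
    stored[url] = (new_digest, content)
    return True


stored = {}
first = refresh(stored, "http://example.com/", lambda url: b"version 1")
second = refresh(stored, "http://example.com/", lambda url: b"version 1")
third = refresh(stored, "http://example.com/", lambda url: b"version 2")
```

A background timer could call `refresh` for every stored address while a connection is available, which would match the "without user intervention" behaviour described above.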


System design is a solution, a “how to” approach to the creation of a new system. It provides the understanding and procedural details necessary for implementing the system recommended in the feasibility study. A design goes through logical and physical stages of development. Design is a creative process that involves working with the unknown new system, rather than analyzing the existing system; analysis, by contrast, can produce a correct model of the existing system.

Table design
A database is a collection of inter-related data with minimum redundancy, organized to serve the user quickly and efficiently. The data are stored in tables. Data are the basis of the information system: without data there is no system, but the data must be provided in the right form for input, and the information produced must be in a format acceptable to the user. The tables used are USER TABLE, FEED TABLE and WEBPAGE TABLE.
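The report names the three tables but does not list their columns. The sketch below shows one plausible layout, purely as an assumption: every column name is hypothetical, and SQLite (driven from Python) stands in for SQL Server 2005. Pages and feeds are stored as binary data and keyed by the user who saved them.

```python
import sqlite3

# Hypothetical columns; only the three table names come from the report.
SCHEMA = """
CREATE TABLE user_table (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL UNIQUE
);
CREATE TABLE webpage_table (
    page_id  INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES user_table(user_id),
    url      TEXT NOT NULL,
    content  BLOB NOT NULL,
    UNIQUE (user_id, url)
);
CREATE TABLE feed_table (
    feed_id  INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES user_table(user_id),
    feed_url TEXT NOT NULL,
    content  BLOB NOT NULL,
    UNIQUE (user_id, feed_url)
);
"""


def create_schema(conn):
    """Create the three tables in an empty database."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
tables = {
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
}
```

The `UNIQUE (user_id, url)` constraint reflects the idea that each user keeps one stored copy per address, which keeps redundancy low.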

2.3 Environmental model
Context analysis diagram
The environmental model defines the interface between the system and the rest of the universe. A Context Analysis Diagram (CAD) is developed for the proposed system. It is the first step of the modeling: a single data flow diagram that depicts the entire system and gives an overview of it.

2.4 Behavioral model
Data flow diagram
A Data Flow Diagram (DFD) is a modeling tool that pictures a system as a network of functional processes connected to one another by pipelines of data. DFDs are also widely used to represent external and top-level design specifications. The DFD shows the interface between the system and its external terminators. A Data Flow Diagram is also called a “Bubble Chart”: a bubble represents a process, a line represents a data flow and a rectangle represents an entity.

2.5 Data model
Entity relationship diagram
An Entity Relationship Diagram (ERD) is a model that describes the stored data layout of a system at a high level of abstraction. The ER diagram makes it possible to examine and highlight the data structures and the relationships between the data stores in the DFDs. It provides the information needed to access database records efficiently.

System design sits at the technical kernel of software engineering and is applied regardless of the software process model that is used. It begins once the software requirements have been analyzed and specified, and it is followed by the coding and testing required to build and verify the software. Each activity transforms information in a manner that ultimately results in validated computer software. Three characteristics serve as a guide for evaluating a good design:

• The design must implement all of the explicit requirements contained in the analysis model, and it must accommodate all of the implicit requirements desired by the customer.
• The design must be a readable, understandable guide for those who generate code and for those who test and subsequently support the software.
• The design should provide a complete picture of the software, addressing the data, functional and behavioral domains from an implementation perspective.

System design is thus the process of planning a new system, or of replacing or complementing an existing one. The design is based on the limitations of the existing system and on the requirements specification gathered during system analysis.

Input design is the process of converting the user-oriented description of the computer-based business information into a program-oriented specification. The goal of designing input data is to make the automation as easy and as free from errors as possible.

Logical design describes the features of the system, forms the procedures that meet the system requirements and provides a detailed specification of the new system.

Architectural design includes identifying the software components, decoupling and decomposing them into processing modules and conceptual data structures, and specifying the relationships among the components.

Detailed design is concerned with the packaging of processing modules and with the implementation of the processing algorithms, the data structures and the interconnections among modules and data structures.

External design involves conceiving, planning and specifying the externally observable characteristics of the software product. It begins in the analysis phase and continues through the design phase.

The design phase required the following designs to be produced. Each was worked out separately, in a step-by-step process, keeping in mind all the requirements, constraints and conditions.

Process design is the design of the processing to be done; it is the design that leads to the coding. The conditions and constraints of the system are considered here, and the design is carried out accordingly.

Output design matters because output is the most important and direct source of information to the user. Output design is an ongoing activity during the study phase. Its objective is to define the contents and format of all documents and reports so that they are attractive and useful.

After the requirement analysis has been completed successfully, the next step is the design and development phase, in which the project is actually built. The methods applied during the development phase are:
• Software Design
• Code Generation
• Software Testing

The project was developed using the Linear Sequential Model, also called the Classic Life Cycle or the Waterfall Model. This is a sequential approach to software development that begins at the system level and progresses through analysis, design, coding and testing.

System / Information Engineering and Modeling
Because software is always part of a larger system, work begins by establishing requirements for all system elements and then allocating some subset of these requirements to software. This system view is essential when software must interact with other elements such as hardware, people and databases.

Software requirements analysis
Requirements gathering is intensified and focused specifically on software. To understand the nature of the program to be built, the software engineer must understand the information domain of the software, as well as the required function, behavior, performance and interface.

Design of the project
The design process translates requirements into a representation of the software that can be assessed for quality before coding begins. Like the requirements, the design is documented and becomes part of the software configuration.

Code Generation
The design must be translated into a machine-readable form. The code generation step performs this task. If design is performed in a detailed manner, code generation can be accomplished mechanistically.

After the design phase was completed, the code was written in the Visual Basic environment and SQL Server 2005 was used to create the database. The server and the application were connected through ADO.NET.

The purpose of a code is to facilitate the identification and retrieval of items of information. Codes are built with mutually exclusive features. They are used to give operational instructions and other information, and they also show the interrelationships among different items. Codes are used for identifying, accessing, sorting and matching records. A code ensures that only one value, with a single meaning, is applied to a given entity or attribute, however it is described. Codes can also be designed so that they are easily understood and applied by the user.

The coding standards used in the project are as follows:
1. Every variable is named so that the name represents the flow or function it serves.
2. Every function is named so that the name represents the task it performs.

A software application is generally implemented after it has passed through the complete life cycle of a project. Life cycle processes such as requirement analysis, design, verification and testing, followed finally by the implementation phase, result in successful project management. This application, which is a Windows desktop application, was implemented successfully after passing through the life cycle processes mentioned above.

As the software is to be used in a high-standard industrial sector, factors such as the application environment, user management, security, reliability and performance were treated as key factors throughout the design phase. These factors were analyzed step by step, and the positive as well as negative outcomes were noted before the final implementation.

Security and authentication are maintained at both the user level and the management level. The data is stored in the SQL Server 2005 RDBMS, which is reliable and simple to use. User-level security is managed with passwords and sessions, which ensures that all transactions are made securely.

The application's validations take into account the entry fields available in the various modules. Restrictions such as number formatting, date formatting and confirmations for both save and update operations ensure that correct data is fed into the database. Thus all aspects were charted out, and the complete project study was implemented successfully for the end users.
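The number and date restrictions mentioned above can be illustrated with two small validators. This is a sketch in Python, not the project's VB.NET code; the function names and the `dd/mm/yyyy` date format are assumptions, since the report does not state which formats the forms accept.

```python
from datetime import datetime


def parse_number(text):
    """Return the integer value of text, or None if it is not a whole number."""
    text = text.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    return None


def parse_date(text, fmt="%d/%m/%Y"):
    """Return a date for text in the given format, or None if it is invalid."""
    try:
        return datetime.strptime(text.strip(), fmt).date()
    except ValueError:
        return None


def validate_entry(number_text, date_text):
    """Return a list of error messages; an empty list means the entry is valid."""
    errors = []
    if parse_number(number_text) is None:
        errors.append("number must contain digits only")
    if parse_date(date_text) is None:
        errors.append("date must be a real date in dd/mm/yyyy form")
    return errors
```

A form would call `validate_entry` before a save or update and ask for confirmation only when the returned list is empty.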

Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design and code generation. Once the source code has been generated, the software must be tested to uncover as many errors as possible before delivery to the customer. In order to find the highest possible number of errors, tests must be conducted systematically and test cases must be designed using disciplined techniques.

Types of testing

White box Testing
White box testing, sometimes called glass box testing, is a test case design method that uses the control structure of the procedural design to derive test cases. Using white box testing methods, the software engineer can derive test cases that guarantee that all independent paths within a module have been exercised at least once, exercise all logical decisions on their true and false sides, execute all loops at their boundaries and within their operational bounds, and exercise internal data structures to ensure their validity.

“Logic errors and incorrect assumptions are inversely proportional to the probability that a program path will be executed.” The logical flow of a program is sometimes counterintuitive, meaning that unconscious assumptions about the flow of control and data may lead to design errors that are uncovered only once path testing commences.

“Typographical errors are random.” When a program is translated into programming language source code, it is likely that some typing errors will occur. Many will be uncovered by syntax and type checking mechanisms, but others may go undetected until testing begins. A typo is as likely to exist on an obscure logical path as on a mainstream path.
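As an illustration of deriving test cases from control structure, the sketch below takes a small function with one loop and two decisions and lists one test case per path of interest. The function `count_external` is a hypothetical example written in Python; it is not taken from the project.

```python
def count_external(links, host):
    """Count links that point outside host; empty strings are skipped."""
    count = 0
    for link in links:        # loop: zero, one and many iterations
        if not link:          # decision 1
            continue
        if host not in link:  # decision 2
            count += 1
    return count


# Each case is chosen by reading the code, not the specification.
cases = [
    ([], 0),                          # loop body never runs
    ([""], 0),                        # decision 1 true
    (["http://example.com/a"], 0),    # decision 1 false, decision 2 false
    (["http://other.org/"], 1),       # decision 1 false, decision 2 true
    (["", "http://other.org/", "http://example.com/"], 1),  # many iterations
]
results = [count_external(links, "example.com") == expected
           for links, expected in cases]
```

Together the five cases exercise both decisions on their true and false sides and run the loop zero, one and several times.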

Black box Testing
Black box testing, also called behavioral testing, focuses on the functional requirements of the software. That is, black box testing enables the software engineer to derive sets of input conditions that will fully exercise all functional requirements of a program. Black box testing attempts to find errors in the following categories:
1. Incorrect or missing functions
2. Interface errors
3. Errors in data structures or external database access
4. Behavior or performance errors
5. Initialization and termination errors

By applying black box techniques, a set of test cases was created that satisfies the following criteria: test cases that reduce, by a count greater than one, the number of additional test cases that must be designed to achieve reasonable testing; and test cases that tell something about the presence or absence of classes of errors, rather than about an error associated only with the specific test at hand. Black box testing is not an alternative to white box testing. Rather, it is a complementary approach that is likely to uncover a different class of errors than white box methods.
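The idea of test cases that speak for a whole class of inputs can be shown with equivalence classes and boundary values. The rule below (a user name of 3 to 20 letters or digits) is a hypothetical requirement invented for this Python sketch; the report does not state the project's actual rule. The cases are derived from the rule alone, without looking at the implementation.

```python
import re


def is_valid_user_name(name):
    """Hypothetical rule: 3 to 20 characters, letters and digits only."""
    return re.fullmatch(r"[A-Za-z0-9]{3,20}", name) is not None


# One representative per equivalence class, plus the boundary values.
cases = {
    "ab": False,        # just below the minimum length
    "abc": True,        # minimum length
    "a" * 20: True,     # maximum length
    "a" * 21: False,    # just above the maximum length
    "bad name": False,  # contains a character outside the allowed set
    "": False,          # empty input
}
failures = [name for name, expected in cases.items()
            if is_valid_user_name(name) != expected]
```

Each case stands in for every other input of its class, which is how a handful of tests can replace a much larger number of similar ones.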

Validation Testing:
Validation testing provides the final assurance that the software meets all functional, behavioral and performance requirements. Validation can be defined in many ways, but a simple definition is that validation succeeds when the software functions in a manner that the user expects. Once validated, the software must be combined with the other system elements. System testing verifies that all elements combine properly and that the overall system function and performance are achieved. After the modules had been integrated, the validation test was carried out on the system. All the modules were found to work well together and to meet the overall system function and performance.

Integration Testing
Integration testing is a systematic technique for constructing the program structure while at the same time conducting tests to uncover errors associated with interfacing. The objective is to take unit-tested modules and build the program structure that has been dictated by the design. Careful test planning is required to determine the extent and nature of the system testing to be performed and to establish the criteria by which the results will be evaluated.

All the modules were integrated after the completion of unit testing. Top-down integration was followed: the modules were integrated by moving downward through the control hierarchy, beginning with the main module. Since the modules had been unit tested, their integration went smoothly. As a next step, the remaining modules were integrated with the modules already in place. After the integration was complete, the system ran with no errors uncovered, and all the modules worked as designed, without any deviation from the features of the proposed system design.
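Top-down integration usually relies on stubs: the main module is tested first, with simple stand-ins for the lower-level modules that have not been integrated yet. The Python sketch below illustrates the idea with hypothetical classes `Browser` (the main module) and `StubStorage` (a stand-in for Page Storage); neither name comes from the project.

```python
class StubStorage:
    """Stand-in for the Page Storage module until the real one is integrated."""

    def __init__(self):
        self.saved = {}

    def save(self, url, content):
        # Record the call so the test can check the interface is used correctly.
        self.saved[url] = content


class Browser:
    """Main module: fetches a page and hands it to the storage module."""

    def __init__(self, fetch, storage):
        self.fetch = fetch
        self.storage = storage

    def visit(self, url):
        content = self.fetch(url)
        self.storage.save(url, content)
        return content


# Integrate the main module with the stub and exercise the interface.
storage = StubStorage()
browser = Browser(lambda url: b"<html>ok</html>", storage)
page = browser.visit("http://example.com/")
```

Once the real storage module is ready, it replaces the stub and the same test is run again, moving one level down the control hierarchy at a time.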

Acceptance Testing
Acceptance testing involves planning and executing functional tests, performance tests and stress tests in order to demonstrate that the implemented system satisfies its requirements. When custom software is built for one customer, a series of acceptance tests is conducted to enable the customer to validate all requirements. Acceptance testing can also uncover cumulative errors that might degrade the system over time, and it incorporates the test cases developed during integration testing. Additional test cases are added to achieve the desired level of functional, performance and stress testing of the entire system.

Unit testing
Static analysis is used to investigate the structural properties of the source code. Dynamic test cases are used to investigate the behavior of the source code by executing the program on test data. This testing was carried out during the programming stage itself. After each field in a module had been tested, the modules of the project were tested separately. Unit testing focuses verification effort on the smallest units of the software design, down to the individual field. This is known as field testing.


The implementation and testing were done in a step-by-step process. Each module was developed and tested individually to obtain the required output in the desired form. The project is full-fledged and user-friendly. The system has greatly reduced the clerical overhead and the time that tasks take, and it satisfies the requirements of the user.

The software has been designed and run to satisfy the requirements and needs of the organization as well as the end users. The system reduces the manual work of maintaining records. It also allows quick retrieval and reference of required information, which is vital to the organization. The entire system is documented and can be easily understood by the end users. The forms are user-friendly and easy to handle, even for beginners, with very little effort and guidance.

• Other applications, such as desktop software, could be integrated, with authentication and authorization through the Passport services.
• Registered users could be informed about newly integrated websites through e-mail.
• The data screens can be upgraded, and menus can easily be added when required.
• User details can be added to the forms when new data becomes necessary.

The system has much scope for the future, and it can be developed further to add more features that satisfy users' requests.





[Diagram: Admin/User enters user name and password]

Data flow diagram: