- System analysis
- Existing system
- Proposed system
- Feasibility study
- Technical feasibility
- Operational feasibility
- Economic feasibility
- System requirements
- Modules description
- SDLC methodology
- Software requirement
- Hardware requirement
- System design
- UML
- Technology description
- Coding
- Testing
- Output screens
- Conclusion
- Bibliography
- References
Abstract
Clustering large volumes of high-dimensional data is a challenging task. Many clustering
algorithms have been developed to address either handling datasets with a very large sample size
or with a very high number of dimensions, but they are often impractical when the data is large
in both aspects. To simultaneously overcome both the ‘curse of dimensionality’ problem due to
high dimensions and scalability problems due to large sample size, we propose a new fast
algorithm which uses fast data-space reduction and an intelligent sampling strategy. In addition
to clustering, FensiVAT also provides visual evidence that is used to estimate the number of
clusters (cluster tendency assessment) in the data. In our experiments, we compare FensiVAT
with nine state-of-the-art approaches which are popular for large-sample-size or high-dimensional
data clustering. Experimental results suggest that FensiVAT, which can cluster large volumes of
high-dimensional data in a few seconds, is the fastest and most accurate of the methods compared.
DISADVANTAGES
The siVAT scheme does not involve any sensitive threshold parameter, and requires the user
to supply only two parameters: n, the desired sample size, and k0, an overestimate of k, the
assumed number of clusters, used to obtain k0 distinguished objects in the sample.
Subspace clustering methods do not suffer from nearest-neighbor problems in high-
dimensional space. PROCLUS is a subspace clustering approach which first samples the
data, then selects a set of k medoids, and iteratively improves the clustering. PROCLUS is
capable of discovering arbitrarily shaped clusters in high-dimensional datasets. However,
PROCLUS is very sensitive to its input parameters and is not efficient for very large N.
Proposed system
To deal with large amounts of high-dimensional data, this paper introduces a rapid, hybrid
clustering algorithm which efficiently integrates (i) a new random projection (RP) based
ensemble technique; (ii) an improved visual assessment of cluster tendency (iVAT) algorithm;
and (iii) a smart sampling strategy called Maximin and Random Sampling (MMRS). The
proposed method achieves fast clustering by combining ensembles of random projections with a
scalable version of iVAT, hence we call it FensiVAT. FensiVAT aggregates multiple distance
matrices, computed in a lower-dimensional space, to obtain the iVAT image in a fast and
efficient manner, which provides visual evidence about the number of clusters to seek in the
original dataset. MMRS sampling picks distinguished objects from the dataset, so it requires
far fewer samples than random sampling to yield a diverse subset of the big data that
represents the cluster structure in the original (big) dataset.
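The Maximin (MM) stage of MMRS repeatedly picks the object farthest from those already chosen, so that the chosen objects spread across the dataset. Below is a minimal illustrative Java sketch of that stage only (class and method names are ours, not from the paper's implementation; the random-sampling stage of MMRS is omitted):

```java
import java.util.ArrayList;
import java.util.List;

// Maximin sampling: repeatedly pick the point farthest from all points
// chosen so far, yielding k0 "distinguished" objects spread across the
// dataset. (Illustrative sketch; MMRS then adds random samples around
// the distinguished objects.)
public class MaximinSampler {
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    public static List<Integer> maximin(double[][] data, int k0) {
        int n = data.length;
        List<Integer> chosen = new ArrayList<>();
        double[] minDist = new double[n]; // distance to the nearest chosen point
        java.util.Arrays.fill(minDist, Double.MAX_VALUE);
        chosen.add(0);                    // seed with the first object
        for (int c = 1; c < k0; c++) {
            int last = chosen.get(chosen.size() - 1);
            int farthest = -1;
            double best = -1;
            for (int i = 0; i < n; i++) {
                minDist[i] = Math.min(minDist[i], dist(data[i], data[last]));
                if (minDist[i] > best) { best = minDist[i]; farthest = i; }
            }
            chosen.add(farthest);
        }
        return chosen;
    }
}
```

Because each new pick maximizes the distance to the already-chosen set, even a small k0 tends to touch every cluster in the data, which is why far fewer samples are needed than with plain random sampling.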
ADVANTAGES OF PROPOSED SYSTEM
Existing approaches either take hours for large datasets having hundreds to
thousands of dimensions, or sacrifice accuracy for faster computation time.
Moreover, the datasets used in those papers are not considered large in today's
computing environment.
We use a statistical measure to compare the cluster distributions in samples
obtained from three sampling strategies: random sampling, MMRS sampling in the
(high-dimensional) up space, and MMRS sampling in the (reduced-dimensional) down
space (we call this type of sampling Near-MMRS). Our experiments show that Near-
MMRS samples accurately portray the distribution of the original data in lower
dimensions.
FEASIBILITY STUDY
PRELIMINARY INVESTIGATION
The development of a project starts from the idea of designing a mail-enabled
platform for a small firm in which it is easy and convenient to send and receive messages;
there is a search engine, an address book, and also some entertaining games. When it is
approved by the organization and our project guide, the first activity, i.e. preliminary
investigation, begins. The activity has three parts:
Request Clarification
Feasibility Study
Request Approval
REQUEST CLARIFICATION
After the approval of the request by the organization and project guide, with an
investigation being considered, the project request must be examined to determine precisely what
the system requires. Our project is basically meant for users within the company, whose
systems can be interconnected by a Local Area Network (LAN). In today's busy schedule, people
expect everything to be available in a ready-made manner, so, considering the widespread use of
the net in day-to-day life, the corresponding development of the portal came into existence.
FEASIBILITY ANALYSIS
An important outcome of the preliminary investigation is the determination that the system
request is feasible. This is possible only if it is feasible within the limited resources and time
available. The different feasibilities that have to be analyzed are:
Operational Feasibility
Economic Feasibility
Technical Feasibility
Operational Feasibility
Operational feasibility deals with the study of the prospects of the system to be developed.
This system operationally relieves the Admin of routine burdens and helps him effectively
track the project's progress. This kind of automation will surely reduce the time and energy
previously consumed by manual work. Based on the study, the system is proved to be
operationally feasible.
Economic Feasibility
Technical Feasibility
According to Roger S. Pressman, technical feasibility is the assessment of the technical
resources of the organization. The organization needs IBM-compatible machines with a graphical
web browser connected to the Internet and Intranet. The system is developed for a platform-
independent environment. Java Server Pages, JavaScript, HTML, SQL Server and WebLogic
Server are used to develop the system. The technical feasibility study has been carried out; the
system is technically feasible for development and can be developed with the existing facility.
SYSTEM REQUIREMENTS
Modules description
Distance Matrix using Ensemble Method:
The third (previous) step provides n samples in the down space, Sd ⊂ R^q, which can be used to
build an n × n distance matrix D_{n,d}. We need a reliable iVAT image in order to select the
number of clusters obtained by SL in the penultimate steps of FensiVAT. The VAT/iVAT image
provides a subjective visual assessment of potential cluster substructure based on how distinctive
the dark blocks (clusters) appear in the image. However, the quality of the image of the reordered
distance matrix D′_{n,d}, obtained by applying VAT/iVAT to D_{n,d}, often turns out to be very
poor due to the unstable nature of random projection. Hence, we turned to an ensemble-based
approach to obtain a good-quality iVAT image from multiple reordered distance matrices
{D′_{d,i}}, i = 1, …, Q, in the down space. Since the ordering of the data in every reordered
matrix D′_{d,i} may be different, it is not feasible to directly aggregate the multiple reordered
distance matrices. Therefore, we devised a new method to aggregate the ensemble of Q n × n
distance matrices to obtain a better-quality iVAT image.
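The ensemble idea can be sketched in Java as follows. This illustrative sketch averages Q Euclidean distance matrices computed from Q independent Gaussian random projections; note that the paper aggregates reordered matrices with its own alignment method, so this is a simplification under our own naming:

```java
import java.util.Random;

// Ensemble of random projections: project the sample Q times into a
// q-dimensional down space, compute a Euclidean distance matrix for each
// projection, and average the Q matrices entry-wise. The averaged matrix
// is the kind of input iVAT would then reorder.
public class RPEnsemble {
    public static double[][] project(double[][] X, double[][] R) {
        int n = X.length, p = X[0].length, q = R[0].length;
        double[][] Y = new double[n][q];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < q; j++)
                for (int k = 0; k < p; k++)
                    Y[i][j] += X[i][k] * R[k][j];
        return Y;
    }

    public static double[][] distanceMatrix(double[][] Y) {
        int n = Y.length;
        double[][] D = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) {
                double s = 0;
                for (int k = 0; k < Y[0].length; k++) {
                    double d = Y[i][k] - Y[j][k];
                    s += d * d;
                }
                D[i][j] = D[j][i] = Math.sqrt(s);
            }
        return D;
    }

    // Average Q distance matrices from Q independent Gaussian projections.
    public static double[][] ensembleDistance(double[][] X, int q, int Q, long seed) {
        Random rnd = new Random(seed);
        int n = X.length, p = X[0].length;
        double[][] agg = new double[n][n];
        for (int t = 0; t < Q; t++) {
            double[][] R = new double[p][q];
            for (int i = 0; i < p; i++)
                for (int j = 0; j < q; j++)
                    R[i][j] = rnd.nextGaussian() / Math.sqrt(q); // JL-style scaling
            double[][] D = distanceMatrix(project(X, R));
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    agg[i][j] += D[i][j] / Q;
        }
        return agg;
    }
}
```

Averaging over several projections damps the instability of any single random projection, which is exactly the motivation given above for the ensemble.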
Clustering:
All single linkage (SL) partitions are aligned partitions in the VAT/iVAT ordered matrices, so SL
is an obvious choice for the clustering algorithm in Step 9. Having the estimate of the number of
clusters, k, from the previous step, we cut the k−1 longest edges in the iVAT-built MST,
resulting in k single linkage clusters. If the dataset is complex and clusters are intermixed,
cutting the k−1 longest edges may not always be a good strategy, as the data points (outliers)
which are typically furthest from normal clusters might comprise most of the k−1 longest edges
of the MST, leading to misleading partitions. Such data points need to be partitioned (usually
into their own cluster) before a reliable partition can be found via the SL criterion. However, the
iVAT image provides visual evidence as to how large the clusters should be. Thus, if the size of
the SL clusters does not match the visual evidence well, the partition can be discarded (perhaps
choosing a different clustering algorithm to partition the sample of feature vectors in R^p, or
throwing out data from small clusters).
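The SL-by-MST-cutting step can be sketched as follows: build an MST with Prim's algorithm over a distance matrix, drop the k−1 longest edges, and label the connected components. Names are illustrative, and the outlier handling discussed above is omitted:

```java
import java.util.*;

// Single-linkage clustering by MST edge cutting: compute a minimum
// spanning tree (Prim's algorithm on a distance matrix), remove the k-1
// longest MST edges, and take the connected components as the k clusters.
public class MstCut {
    public static int[] cluster(double[][] D, int k) {
        int n = D.length;
        boolean[] in = new boolean[n];
        double[] best = new double[n];
        int[] parent = new int[n];
        Arrays.fill(best, Double.MAX_VALUE);
        best[0] = 0; parent[0] = -1;
        List<double[]> edges = new ArrayList<>(); // {u, v, weight}
        for (int step = 0; step < n; step++) {
            int u = -1;
            for (int i = 0; i < n; i++)
                if (!in[i] && (u == -1 || best[i] < best[u])) u = i;
            in[u] = true;
            if (parent[u] >= 0) edges.add(new double[]{parent[u], u, best[u]});
            for (int v = 0; v < n; v++)
                if (!in[v] && D[u][v] < best[v]) { best[v] = D[u][v]; parent[v] = u; }
        }
        // Keep all but the k-1 longest edges, then label components.
        edges.sort((a, b) -> Double.compare(a[2], b[2]));
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int i = 0; i < edges.size() - (k - 1); i++) {
            double[] e = edges.get(i);
            adj.get((int) e[0]).add((int) e[1]);
            adj.get((int) e[1]).add((int) e[0]);
        }
        int[] comp = new int[n];
        Arrays.fill(comp, -1);
        int label = 0;
        for (int i = 0; i < n; i++) {
            if (comp[i] != -1) continue;
            Deque<Integer> stack = new ArrayDeque<>(List.of(i));
            while (!stack.isEmpty()) {
                int u = stack.pop();
                if (comp[u] != -1) continue;
                comp[u] = label;
                for (int v : adj.get(u)) stack.push(v);
            }
            label++;
        }
        return comp;
    }
}
```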
Extension
In the extension step (Step 10) of FensiVAT, we label the remaining Ñ = N − n data points in O
by giving them the label of their nearest object in Sd. This requires the computation of an n × Ñ
matrix, D̂, with computational complexity O(qnÑ). In this step, we use the sample Sd and the
feature vectors Y in R^q (obtained in Step 2) to compute the distance matrix D̂. This further
reduces the computation time which would be needed for the equivalent operation in R^p. Next,
the remaining Ñ data points in O are labeled using this distance matrix, based on the label of the
nearest object in Sd. Although a single random projection (RP) might be sufficient to achieve
comparable accuracy in the NOPR labeling step, several RPs are used to best ensure a robust
nearest neighbor search in NOPR. First, multiple RPs are applied to the full dataset to get
multiple Ys. Then, the sample labels are extended to each of these Ys using NOPR, which gives
multiple sets of labels {Û(i)}, i = 1, …, Q, for the full dataset. The final labels (U) are selected
by voting, based on the labels cast by each voter from each RP, for each remaining data point
in O.
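The NOPR labeling and voting steps can be sketched as below (illustrative names; the per-projection distance computation is reduced to a nearest-sample search for a single point):

```java
// NOPR extension: each remaining data point receives the label of its
// nearest sampled object, computed per random projection, and the final
// label is chosen by majority vote over the Q projections.
public class NoprExtension {
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return s;
    }

    // Label one projected point by its nearest object in the sample.
    public static int nearestLabel(double[] y, double[][] sample, int[] sampleLabels) {
        int best = 0;
        for (int i = 1; i < sample.length; i++)
            if (dist2(y, sample[i]) < dist2(y, sample[best])) best = i;
        return sampleLabels[best];
    }

    // Majority vote over the labels obtained from Q projections.
    public static int vote(int[] labelsFromProjections, int numClusters) {
        int[] counts = new int[numClusters];
        for (int l : labelsFromProjections) counts[l]++;
        int winner = 0;
        for (int c = 1; c < numClusters; c++)
            if (counts[c] > counts[winner]) winner = c;
        return winner;
    }
}
```

Working in the q-dimensional down space keeps each nearest-sample search cheap, which is where the O(qnÑ) cost quoted above comes from.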
Partition Accuracy:
For all datasets except US Census 1990, the quality of the output crisp partition obtained by the
various clustering algorithms is assessed using ground-truth information, U_gt. The similarity of
computed partitions with respect to ground-truth labels is measured using the partition accuracy
(PA). The PA of a clustering algorithm is the ratio of the number of samples with matching
ground-truth and algorithmic labels to the total number of samples in the dataset. The value of
PA ranges from 0 to 1, and a higher value implies a better match to the ground-truth partition.
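The PA computation described above is straightforward; a small sketch, assuming the algorithmic cluster labels have already been aligned with the ground-truth numbering:

```java
// Partition accuracy (PA): the fraction of objects whose algorithmic
// label matches the ground-truth label. Assumes the cluster labels have
// already been aligned with the ground-truth numbering.
public class PartitionAccuracy {
    public static double pa(int[] groundTruth, int[] predicted) {
        if (groundTruth.length != predicted.length)
            throw new IllegalArgumentException("label arrays must match in length");
        int matches = 0;
        for (int i = 0; i < groundTruth.length; i++)
            if (groundTruth[i] == predicted[i]) matches++;
        return (double) matches / groundTruth.length;
    }
}
```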
SDLC methodology
INPUT DESIGN
Input design plays a vital role in the life cycle of software development and requires very
careful attention from developers. The aim of input design is to feed data to the application as
accurately as possible, so inputs are supposed to be designed effectively so that the errors
occurring while feeding are minimized. According to software engineering concepts, the input
forms or screens are designed to provide validation control over the input limit, range and other
related validations.
This system has input screens in almost all the modules. Error messages are developed to
alert the user whenever he commits a mistake and to guide him in the right way so that
invalid entries are not made. Let us look at this in depth under module design.
Input design is the process of converting user-created input into a computer-based
format. The goal of input design is to make data entry logical and free from errors; errors in
the input are controlled by the input design. The application has been developed in a
user-friendly manner. The forms have been designed in such a way that, during processing, the
cursor is placed in the position where data must be entered. The user is also provided with an
option to select an appropriate input from various alternatives related to the field in certain cases.
Validations are required for each datum entered. Whenever a user enters erroneous data, an
error message is displayed, and the user can move on to the subsequent pages only after
completing all the entries in the current page.
OUTPUT DESIGN
The output from the computer is required mainly to create an efficient method of
communication within the company, primarily between the project leader and his team members
(in other words, the administrator and the clients). The output of the VPN is the system which
allows the project leader to manage his clients in terms of creating new clients and assigning new
projects to them, maintaining a record of project validity, and providing folder-level access to
each client on the user side depending on the projects allotted to him. After completion of a
project, a new project may be assigned to the client. User authentication procedures are
maintained at the initial stages themselves. A new user may be created by the administrator
himself, or a user can register himself as a new user, but the task of assigning projects and
validating a new user rests with the administrator only.
The application starts running when it is executed for the first time. The server has to be started,
and then Internet Explorer is used as the browser. The project will run on the local area
network, so the server machine will serve as the administrator while the other connected systems
act as the clients. The developed system is highly user-friendly and can be easily understood
by anyone using it, even for the first time.
FUNCTIONAL REQUIREMENTS:
Input: the system takes a dataset as input, which is used for evaluation.
Process: the processing depends on the algorithms applied; the system analyses the data and
searches items using these algorithms.
NON-FUNCTIONAL REQUIREMENTS:
Usability: this should be given the leading priority. The user should be able to log into the
system with ease and access all granted functions. A user can learn to operate the system,
prepare inputs for it, and interpret its outputs.
Reliability: this is the ability of a system or component to perform its required functions under
stated conditions for a specified period of time. Reliability includes the mean time to security
attacks or failures, and is one of the main factors used to determine the important requirements
of any application.
Performance: this is concerned with the quantifiable attributes of the system. The system must
have an internet facility to maintain an accurate date and time and to perform transfer operations.
Implementation: the client is implemented in Java, so it can run in any browser where the user
will be able to operate the system.
Operations: The operations requirements are constraints on the Boolean keywords and query
conditions.
Extensibility: This system should be flexible in such a way that it can be easily extended in
order to add some more modules in the future.
Hardware Constraints:
RAM : 128 MB
Software Constraints:
Techniques : Java
IDE : NetBeans
Database : MySQL
SYSTEM DESIGN
There are several reasons to identify the design goals of any system. These goals help to
design the system in an efficient manner. There are several criteria to identify these goals;
some of them are explained below:
Performance criteria:
a) Response time: The response time of the method is very low because of the system's simple
design, developed on a high-performance system.
Dependability criteria:
a) Robustness: the system should be designed to work efficiently on images of any type of
formats without any problem.
b) Availability: the system should be ready to accept command from user at any point of time.
c) Fault Tolerance: the system should not allow the user to work with faulty input. It displays
error messages for every specific fault that occurs.
Maintenance criteria:
a) Portability: the system should work on all platforms, such as Linux and Windows.
b) Readability: the code generated should make the purpose of the project easy to understand,
so that the user can make modifications easily.
c) Traceability: the code generated should be easy to map to the functions and the operations
selected by the user.
End-user criteria:
a) Utility: the system should be made to operate on all inputs of the end-user under any kind of
circumstances. It should complete all the commands or instructions given by the user without
any interruptions.
b) Usability: the user interface is to be defined with all options which make the work of the
end-user easier.
UML Diagrams
UML stands for Unified Modeling Language. This object-oriented system of notation has
evolved from the work of Grady Booch, James Rumbaugh, Ivar Jacobson, and the Rational
Software Corporation. These renowned computer scientists fused their respective technologies
into a single, standardized model. Today, UML is accepted by the Object Management Group
(OMG) as the standard for modeling object-oriented programs.
There are two broad categories of diagrams, and they are again divided into sub-categories:
• Structural Diagrams
• Behavioral Diagrams
Structural Diagrams:
The structural diagrams represent the static aspect of the system. These static aspects represent
those parts of a diagram which form the main structure and are therefore stable.
These static parts are represented by classes, interfaces, objects, components and nodes. The four
structural diagrams are:
• Class diagram
• Object diagram
• Component diagram
• Deployment diagram
Class Diagram:
Class diagrams are the most common diagrams used in UML. A class diagram consists of classes,
interfaces, associations and collaborations. Class diagrams basically represent the object-oriented
view of a system, which is static in nature. An active class is used in a class diagram to represent
the concurrency of the system. Since a class diagram represents the object orientation of a system,
it is generally used for development purposes. This is the most widely used diagram at the time of
system construction.
Object Diagram:
Component Diagram:
During design phase software artifacts (classes, interfaces etc) of a system are
arranged in different groups depending upon their relationship. Now these groups are known as
components. Finally, component diagrams are used to visualize the implementation.
Deployment Diagram:
Deployment diagrams are a set of nodes and their relationships. These nodes are
physical entities where the components are deployed. Deployment diagrams are used for
visualizing deployment view of a system. This is generally used by the deployment team.
Behavioral Diagrams: Any system can have two aspects, static and dynamic. So a model is
considered complete when both aspects are covered fully. Behavioral diagrams basically
capture the dynamic aspect of a system. The dynamic aspect can be further described as the
changing/moving parts of a system. The behavioral diagrams include:
• Use case diagram
• Sequence diagram
• Collaboration diagram
• Activity diagram
Use case diagrams are a set of use cases, actors and their relationships. They represent
the use case view of a system. A use case represents a particular functionality of a system, so
a use case diagram is used to describe the relationships among the functionalities and their
internal/external controllers. These controllers are known as actors.
Sequence Diagram:
A sequence diagram is an interaction diagram. From the name it is clear that the
diagram deals with some sequences, which are the sequence of messages flowing from one
object to another. Interaction among the components of a system is very important from
implementation and execution perspective. So Sequence diagram is used to visualize the
sequence of calls in a system to perform a specific functionality.
Collaboration Diagram:
The purpose of collaboration diagram is similar to sequence diagram. But the specific purpose
of collaboration diagram is to visualize the organization of objects and their interaction.
State chart Diagram:
Activity Diagram:
Architecture Diagram
USE CASE DIAGRAM:
To model a system, the most important aspect is to capture its dynamic behaviour. To
clarify in a bit more detail, dynamic behaviour means the behaviour of the system when it is
running/operating. Static behaviour alone is not sufficient to model a system; dynamic
behaviour is more important than static behaviour.
In UML there are five diagrams available to model the dynamic nature, and the use case diagram
is one of them. Since the use case diagram is dynamic in nature, there should be some internal or
external factors for making the interaction. These internal and external agents are known as
actors. So use case diagrams consist of actors, use cases and their relationships.
The diagram is used to model the system/subsystem of an application. A single use case
diagram captures a particular functionality of a system, so to model the entire system a number
of use case diagrams are used. A use case diagram at its simplest is a representation of a user's
interaction with the system, depicting the specifications of a use case. A use case diagram can
portray the different types of users of a system and the use cases, and will often be accompanied
by other types of diagrams as well.
[Use case diagram: actors LBS PROVIDER and LBS user; use cases: Register, Login, View
Query, Generate key, Send user, Decrypt key, Query location, Store location]
CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of
static structure diagram that describes the structure of a system by showing the system's classes,
their attributes, operations (or methods), and the relationships among the classes. It explains
which class contains information.
[Class diagram: classes user (attributes Register, login; operations Register(), Login(),
send query(), decrypt()), provider (attributes Register, Login; operations view user(),
view query(), generate key()), and admin (attributes Login, store location; operation
view location())]
SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that
shows how processes operate with one another and in what order.
[Sequence diagram: register, login, send query, view query, decrypt key, store location]
COLLABORATION DIAGRAM
[Collaboration diagram among Provider, user and Admin: 1: Register; 2: login; 3: register;
4: login; 5: send query; 6: view query; 7: decrypt key; 8: view query location; 9: view source
destination; 10: store location]
ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of
components in a system. An activity diagram shows the overall flow of control.
[Activity diagram: workflow ending with logout]
Technology description
Java Technology
The Java programming language is a high-level language that can be characterized by all
of the following buzzwords:
i. Simple
ii. Architecture neutral
iii. Object oriented
iv. Portable
v. Distributed
vi. High performance
vii. Interpreted
viii. Multithreaded
ix. Robust
With most programming languages, you either compile or interpret a program so that you
can run it on your computer. The Java programming language is unusual in that a program is
both compiled and interpreted. With the compiler, first you translate a program into an
intermediate language called Java byte codes —the platform-independent codes interpreted by
the interpreter on the Java platform. The interpreter parses and runs each Java byte code
instruction on the computer. Compilation happens just once; interpretation occurs each time the
program is executed. The following figure illustrates how this works.
A platform is the hardware or software environment in which a program runs. The Java
platform differs from most other platforms in that it’s a software-only platform that runs on top
of other hardware-based platforms.
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is
ported onto various hardware-based platforms. The Java API is a large collection of ready-made
software components that provide many useful capabilities, such as graphical user interface
(GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these
libraries are known as packages. The following figure depicts a program that’s running on the
Java platform. As the figure shows, the Java API and the virtual machine insulate the program
from the hardware.
Native code is code that, after compilation, runs on a specific hardware platform. As a
platform-independent environment, the Java platform can be a bit slower than native code.
However, smart compilers, well-tuned interpreters, and just-in-time bytecode compilers can
bring performance close to that of native code without threatening portability.
Every full implementation of the Java platform gives you the following features:
i. The essentials: Objects, strings, threads, numbers, input and output, data structures,
system properties, date and time, and so on.
ii. Applets: The set of conventions used by Java applets.
iii. Networking: URLs, TCP (Transmission Control Protocol), UDP (User Datagram
Protocol) sockets, and IP (Internet Protocol) addresses.
iv. Internationalization: Help for writing programs that can be localized for users
worldwide. Programs can automatically adapt to specific locales and be displayed in the
appropriate language.
v. Security: Both low level and high level, including electronic signatures, public and
private key management, access control, and certificates.
vi. Software components: Known as JavaBeansTM, can plug into existing component
architectures.
vii. Object serialization: Allows lightweight persistence and communication via Remote
Method Invocation (RMI).
viii. Java Database Connectivity (JDBCTM): Provides uniform access to a wide range of
relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers,
collaboration, telephony, speech, animation, and more. The following figure depicts what is
included in the Java 2 SDK.
ODBC
Through the ODBC Administrator in Control Panel, you can specify the particular
database that is associated with a data source that an ODBC application program is written to
use. Think of an ODBC data source as a door with a name on it. Each door will lead you to a
particular database. For example, the data source named Sales Figures might be a SQL Server
database, whereas the Accounts Payable data source could refer to an Access database. The
physical database referred to by a data source can reside anywhere on the LAN.
The ODBC system files are not installed on your system by Windows 95. Rather, they
are installed when you set up a separate database application, such as SQL Server Client or
Visual Basic 4.0. When the ODBC icon is installed in Control Panel, it uses a file called
ODBCINST.DLL. It is also possible to administer your ODBC data sources through a
stand-alone program called ODBCADM.EXE. There is a 16-bit and a 32-bit version of this
program, and each maintains a separate list of ODBC data sources.
The advantages of this scheme are so numerous that you are probably thinking there must
be some catch. The only disadvantage of ODBC is that it isn’t as efficient as talking directly to
the native database interface. Many detractors have charged that ODBC is too slow.
Microsoft has always claimed that the critical factor in performance is the quality of the driver
software that is used. In our humble opinion, this is true. The availability of good ODBC drivers
has improved a great deal recently. In any case, the criticism about performance is somewhat
analogous to those who said that compilers would never match the speed of pure assembly
language. Maybe not, but the compiler (or ODBC) gives you the opportunity to write cleaner
programs, which means you finish sooner. Meanwhile, computers get faster every year.
JDBC Goals:
1. SQL Level API
The designers felt that their main goal was to define a SQL interface for Java. Although
not the lowest database interface level possible, it is at a low enough level for higher-level tools
and APIs to be created. Conversely, it is at a high enough level for application programmers to
use it confidently.
Attaining this goal allows for future tool vendors to “generate” JDBC code and to hide many of
JDBC’s complexities from the end user.
2. SQL Conformance
SQL syntax varies as you move from database vendor to database vendor. In an effort to
support a wide variety of vendors, JDBC will allow any query statement to be passed through it
to the underlying database driver. This allows the connectivity module to handle non-standard
functionality in a manner that is suitable for its users.
3. JDBC must be implementable on top of common database interfaces
The JDBC SQL API must “sit” on top of other common SQL level APIs. This goal allows
JDBC to use existing ODBC level drivers by the use of a software interface. This interface
would translate JDBC calls to ODBC and vice versa.
4. Provide a Java interface that is consistent with the rest of the Java system
Because of Java’s acceptance in the user community thus far, the designers feel that they
should not stray from the current design of the core Java system.
5. Keep it simple
This goal probably appears in all software design goal listings. JDBC is no exception.
Sun felt that the design of JDBC should be very simple, allowing for only one method of
completing a task per mechanism. Allowing duplicate functionality only serves to confuse the
users of the API.
6. Use strong, static typing wherever possible
Strong typing allows for more error checking to be done at compile time; also, fewer errors
appear at runtime.
7. Keep the common cases simple
Because more often than not, the usual SQL calls used by the programmer are simple
SELECT, INSERT, DELETE and UPDATE statements, these queries should be simple to
perform with JDBC, while more complex SQL statements should also be possible.
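A minimal JDBC usage sketch consistent with these goals, passing a SQL statement through to the underlying driver; the connection URL, credentials, table and column names are placeholders, not part of the project's actual schema:

```java
import java.sql.*;

// Minimal JDBC usage: obtain a connection, pass a parameterized SQL
// statement through to the underlying driver, and iterate the results.
// Any JDBC-compliant driver (e.g. MySQL Connector/J for the MySQL
// database listed in the software constraints) follows the same pattern.
public class JdbcExample {
    public static final String URL = "jdbc:mysql://localhost:3306/projectdb"; // placeholder

    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection(URL, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT id, name FROM clients WHERE project_id = ?")) {
            ps.setInt(1, 42); // bind the placeholder parameter
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next())
                    System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        }
    }
}
```

The try-with-resources blocks close the connection, statement and result set automatically, which keeps the common case simple in line with goal 7.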
NetBeans:
The NetBeans IDE is primarily intended for development in Java, but also supports other
languages, in particular PHP, C/C++ and HTML5.
NetBeans is cross-platform and runs on Microsoft Windows, Mac OS X, Linux, Solaris and
other platforms supporting a compatible JVM.
History:
NetBeans began in 1996 as Xelfi (word play on Delphi),[7][8] a Java IDE student project under the
guidance of the Faculty of Mathematics and Physics at Charles University in Prague. In 1997,
Roman Staněk formed a company around the project and produced commercial versions of the
NetBeans IDE until it was bought by Sun Microsystems in 1999. Sun open-sourced the
NetBeans IDE in June of the following year. Since then, the NetBeans community has continued
to grow.[9] In 2010, Sun (and thus NetBeans) was acquired by Oracle Corporation.
NetBeans Platform:
The NetBeans Platform is a framework for simplifying the development of Java Swing desktop
applications. The NetBeans IDE bundle for Java SE contains what is needed to start developing
NetBeans plugins and NetBeans Platform based applications; no additional SDK is required.
Applications can install modules dynamically. Any application can include the Update Center
module to allow users of the application to download digitally signed upgrades and new features
directly into the running application. Reinstalling an upgrade or a new release does not force
users to download the entire application again. The platform offers reusable services common to
desktop applications, allowing developers to focus on the logic specific to their application.
Among the features of the platform are:
i. User interface management (e.g. menus and toolbars)
ii. User settings management
iii. Storage management (saving and loading any kind of data)
iv. Window management
v. Wizard framework (supports step-by-step dialogs)
NetBeans IDE :
All the functions of the IDE are provided by modules. Each module provides a well-defined
function, such as support for the Java language, editing, or support for the CVS versioning
system, and SVN. NetBeans contains all the modules needed for Java development in a single
download, allowing the user to start working immediately. Modules also allow NetBeans to be
extended. New features, such as support for other programming languages, can be added by
installing additional modules. For instance, Sun Studio, Sun Java Studio Enterprise, and Sun
Java Studio Creator from Sun Microsystems are all based on the NetBeans IDE.
MIME Type or Content Type: The sample HTTP response header above contains the tag
"Content-Type". Also called the MIME type, the server sends it to the client to indicate the
kind of data being sent, which helps the client render the data for the user. Some of the most
commonly used MIME types are text/html, text/xml, and application/xml.
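As an illustration (the class name here is ours, not part of any server API), the JDK's built-in extension-to-MIME-type table can be queried to see the same labels a server puts in the Content-Type header:

```java
import java.net.URLConnection;

public class MimeDemo {
    public static void main(String[] args) {
        // The JDK maps file extensions to MIME types, the same labels
        // a web server sends in the Content-Type response header.
        System.out.println(URLConnection.guessContentTypeFromName("hello.html")); // text/html
        System.out.println(URLConnection.guessContentTypeFromName("data.xml"));
        System.out.println(URLConnection.guessContentTypeFromName("notes.txt"));
    }
}
```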
Understanding URL
URL is an acronym for Uniform Resource Locator and is used to locate the server and the resource.
Every resource on the web has its own unique address. Let's look at the parts of a URL with an example.
http://localhost:8080/FirstServletProject/jsps/hello.jsp
http:// – The first part of the URL, which specifies the communication protocol to be used in
server-client communication.
localhost – The unique address of the server; most of the time this is the hostname of the server,
which maps to a unique IP address. Sometimes multiple hostnames point to the same IP address, and
the web server's virtual-host configuration takes care of routing the request to the particular server instance.
8080 – The port on which the server is listening. It is optional; if we don't provide it in the
URL, the request goes to the default port of the protocol. Port numbers 0 to 1023 are reserved
for well-known services, for example 80 for HTTP, 443 for HTTPS, and 21 for FTP.
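The parts described above can be pulled apart programmatically with the standard java.net.URL class; a small sketch using the example URL from the text:

```java
import java.net.URL;

public class UrlParts {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:8080/FirstServletProject/jsps/hello.jsp");
        System.out.println(url.getProtocol()); // http -> the communication protocol
        System.out.println(url.getHost());     // localhost -> the server address
        System.out.println(url.getPort());     // 8080 -> the listening port
        System.out.println(url.getPath());     // /FirstServletProject/jsps/hello.jsp

        // When the port is omitted, getPort() returns -1 and the request
        // goes to the protocol's default port (80 for HTTP).
        URL noPort = new URL("http://localhost/index.html");
        System.out.println(noPort.getPort() + " vs default " + noPort.getDefaultPort()); // -1 vs default 80
    }
}
```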
Web Container
Tomcat is a web container. When a request is made from a client to the web server, the server
passes the request to the web container, and it is the web container's job to find the correct
resource (servlet or JSP) to handle the request, use that resource's output to generate the
response, and hand it back to the web server. The web server then sends the response back to the client.
When the web container receives a request for a servlet, it creates two objects,
HttpServletRequest and HttpServletResponse. It then finds the correct servlet based on the
URL and creates a thread for the request. It invokes the servlet's service() method, and based
on the HTTP method, service() invokes the doGet() or doPost() method. The servlet methods
generate the dynamic page and write it to the response. Once the servlet thread completes, the container
converts the output into an HTTP response and sends it back to the client.
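Tomcat itself is far too large to show here, but the dispatch idea above, map a URL path to a handler, let the handler write the body, let the server wrap it in an HTTP response, can be sketched with the JDK's bundled com.sun.net.httpserver package. This is an illustration only, not a servlet container:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class MiniContainer {
    public static void main(String[] args) throws Exception {
        // Listen on port 8080 and map the path /hello to a handler,
        // playing the role the container plays for a servlet.
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/hello", exchange -> {
            byte[] body = "<h1>Hello from the handler</h1>".getBytes();
            exchange.getResponseHeaders().set("Content-Type", "text/html");
            exchange.sendResponseHeaders(200, body.length); // status line + Content-Length
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body); // the "dynamic page" written to the response
            }
        });
        server.start();
        System.out.println("Listening on http://localhost:8080/hello");
    }
}
```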
Some of the important work done by the web container includes:
Communication Support – The container provides an easy way of communication between the
web server and the servlets and JSPs. Because of the container, we don't need to build a
server socket to listen for requests from the web server, parse the request, and generate the
response. All these important and complex tasks are done by the container, and all we need to
focus on is the business logic of our applications.
Lifecycle and Resource Management – The container takes care of managing the life cycle
of a servlet: it loads servlets into memory, initializes them, invokes their methods, and
destroys them. The container also provides utilities such as JNDI for resource pooling and
management.
Multithreading Support – The container creates a new thread for every request to the servlet,
and when the request is processed, the thread dies. Servlets are therefore not re-initialized
for each request, which saves time and memory.
JSP Support – JSPs don't look like normal Java classes, and the web container provides
support for them. Every JSP in the application is compiled by the container, converted to a
servlet, and then managed like any other servlet.
Miscellaneous Tasks – The web container manages the resource pool, performs memory
optimizations, runs the garbage collector, provides security configurations, and supports
multiple applications, hot deployment, and several other tasks behind the scenes that make
our life easier.
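The multithreading point above, one handler instance shared by all requests with a fresh worker thread per request, can be illustrated with a plain thread pool (a sketch of the idea; real container internals are more involved):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPerRequest {
    // One shared counter standing in for a single loaded servlet instance
    static final AtomicInteger handled = new AtomicInteger();

    static void handle(int requestId) {
        // Each request runs on its own worker thread from the pool
        System.out.println("request " + requestId + " on " + Thread.currentThread().getName());
        handled.incrementAndGet();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // the container's worker pool
        for (int i = 0; i < 8; i++) {
            final int id = i;
            pool.submit(() -> handle(id)); // handler is reused, never re-created per request
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("handled " + handled.get() + " requests with one handler instance");
    }
}
```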
Coding
package reformance.evaluation;
import weka.classifiers.Classifier;
import weka.classifiers.Sourcable;
import weka.classifiers.trees.j48.BinC45ModelSelection;
import weka.classifiers.trees.j48.C45ModelSelection;
import weka.classifiers.trees.j48.C45PruneableClassifierTree;
import weka.classifiers.trees.j48.ClassifierTree;
import weka.classifiers.trees.j48.ModelSelection;
import weka.classifiers.trees.j48.PruneableClassifierTree;
import weka.core.AdditionalMeasureProducer;
import weka.core.Capabilities;
import weka.core.Drawable;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.Matchable;
import weka.core.Option;
import weka.core.OptionHandler;
import weka.core.RevisionUtils;
import weka.core.Summarizable;
import weka.core.TechnicalInformation;
import weka.core.TechnicalInformationHandler;
import weka.core.Utils;
import weka.core.WeightedInstancesHandler;
import weka.core.TechnicalInformation.Field;
import weka.core.TechnicalInformation.Type;
import java.util.Enumeration;
import java.util.Vector;
/**
<!-- globalinfo-start -->
* Class for generating a pruned or unpruned C4.5 decision tree. For more information, see<br/>
* <br/>
* Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers,
San Mateo, CA.
* <p/>
<!-- globalinfo-end -->
*
<!-- technical-bibtex-start -->
* BibTeX:
* <pre>
* @book{Quinlan1993,
* address = {San Mateo, CA},
* author = {Ross Quinlan},
* publisher = {Morgan Kaufmann Publishers},
* title = {C4.5: Programs for Machine Learning},
* year = {1993}
*}
* </pre>
* <p/>
<!-- technical-bibtex-end -->
*
<!-- options-start -->
* Valid options are: <p/>
*
* <pre> -U
* Use unpruned tree.</pre>
*
* <pre> -C <pruning confidence>
* Set confidence threshold for pruning.
* (default 0.25)</pre>
*
* <pre> -M <minimum number of instances>
* Set minimum number of instances per leaf.
* (default 2)</pre>
*
* <pre> -R
* Use reduced error pruning.</pre>
*
* <pre> -N <number of folds>
* Set number of folds for reduced error
* pruning. One fold is used as pruning set.
* (default 3)</pre>
*
* <pre> -B
* Use binary splits only.</pre>
*
* <pre> -S
* Don't perform subtree raising.</pre>
*
* <pre> -L
* Do not clean up after the tree has been built.</pre>
*
* <pre> -A
* Laplace smoothing for predicted probabilities.</pre>
*
* <pre> -Q <seed>
* Seed for random data shuffling (default 1).</pre>
*
<!-- options-end -->
*
* @author Eibe Frank (eibe@cs.waikato.ac.nz)
* @version $Revision: 1.9 $
*/
public class C45
extends Classifier
implements OptionHandler, Drawable, Matchable, Sourcable,
WeightedInstancesHandler, Summarizable, AdditionalMeasureProducer,
TechnicalInformationHandler {
/**
* Returns a string describing classifier
* @return a description suitable for
* displaying in the explorer/experimenter gui
*/
public String globalInfo() {
return "Class for generating a pruned or unpruned C4.5 decision tree. For more "
+ "information, see\n\n"
+ getTechnicalInformation().toString();
}
/**
* Returns an instance of a TechnicalInformation object, containing
* detailed information about the technical background of this class,
* e.g., paper reference or book this class is based on.
*
* @return the technical information about this class
*/
public TechnicalInformation getTechnicalInformation() {
TechnicalInformation result = new TechnicalInformation(Type.BOOK);
result.setValue(Field.AUTHOR, "Ross Quinlan");
result.setValue(Field.YEAR, "1993");
result.setValue(Field.TITLE, "C4.5: Programs for Machine Learning");
result.setValue(Field.PUBLISHER, "Morgan Kaufmann Publishers");
result.setValue(Field.ADDRESS, "San Mateo, CA");
return result;
}
/**
* Returns default capabilities of the classifier.
*
* @return the capabilities of this classifier
*/
public Capabilities getCapabilities() {
Capabilities result;
try {
if (!m_reducedErrorPruning)
result = new C45PruneableClassifierTree(null, !m_unpruned, m_CF, m_subtreeRaising, !
m_noCleanup).getCapabilities();
else
result = new PruneableClassifierTree(null, !m_unpruned, m_numFolds, !m_noCleanup,
m_Seed).getCapabilities();
}
catch (Exception e) {
result = new Capabilities(this);
}
result.setOwner(this);
return result;
}
/**
* Generates the classifier.
*
* @param instances the data to train the classifier with
* @throws Exception if classifier can't be built successfully
*/
public void buildClassifier(Instances instances)
throws Exception {
ModelSelection modSelection;
if (m_binarySplits)
modSelection = new BinC45ModelSelection(m_minNumObj, instances);
else
modSelection = new C45ModelSelection(m_minNumObj, instances);
if (!m_reducedErrorPruning)
m_root = new C45PruneableClassifierTree(modSelection, !m_unpruned, m_CF,
m_subtreeRaising, !m_noCleanup);
else
m_root = new PruneableClassifierTree(modSelection, !m_unpruned, m_numFolds,
!m_noCleanup, m_Seed);
m_root.buildClassifier(instances);
if (m_binarySplits) {
((BinC45ModelSelection)modSelection).cleanup();
} else {
((C45ModelSelection)modSelection).cleanup();
}
}
/**
* Classifies an instance.
*
* @param instance the instance to classify
* @return the classification for the instance
* @throws Exception if instance can't be classified successfully
*/
public double classifyInstance(Instance instance) throws Exception {
return m_root.classifyInstance(instance);
}
/**
* Returns class probabilities for an instance.
*
* @param instance the instance to calculate the class probabilities for
* @return the class probabilities
* @throws Exception if distribution can't be computed successfully
*/
public final double [] distributionForInstance(Instance instance)
throws Exception {
return m_root.distributionForInstance(instance, m_useLaplace);
}
/**
* Returns the type of graph this classifier
* represents.
* @return Drawable.TREE
*/
public int graphType() {
return Drawable.TREE;
}
/**
* Returns graph describing the tree.
*
* @return the graph describing the tree
* @throws Exception if graph can't be computed
*/
public String graph() throws Exception {
return m_root.graph();
}
/**
* Returns tree in prefix order.
*
* @return the tree in prefix order
* @throws Exception if something goes wrong
*/
public String prefix() throws Exception {
return m_root.prefix();
}
/**
* Returns tree as an if-then statement.
*
* @param className the name of the Java class
* @return the tree as a Java if-then type statement
* @throws Exception if something goes wrong
*/
public String toSource(String className) throws Exception {
StringBuffer [] source = m_root.toSource(className);
return "class " + className + " {\n\n"
+ "  public static double classify(Object[] i)\n"
+ "    throws Exception {\n\n"
+ "    double p = Double.NaN;\n"
+ source[0]
+ "    return p;\n"
+ "  }\n"
+ source[1]
+ "}\n";
}
/**
* Returns an enumeration describing the available options.
*
* Valid options are: <p>
*
* -U <br>
* Use unpruned tree.<p>
*
* -C confidence <br>
* Set confidence threshold for pruning. (Default: 0.25) <p>
*
* -M number <br>
* Set minimum number of instances per leaf. (Default: 2) <p>
*
* -R <br>
* Use reduced error pruning. No subtree raising is performed. <p>
*
* -N number <br>
* Set number of folds for reduced error pruning. One fold is
* used as the pruning set. (Default: 3) <p>
*
* -B <br>
* Use binary splits for nominal attributes. <p>
*
* -S <br>
* Don't perform subtree raising. <p>
*
* -L <br>
* Do not clean up after the tree has been built.
*
* -A <br>
* If set, Laplace smoothing is used for predicted probabilites. <p>
*
* -Q <br>
* The seed for reduced-error pruning. <p>
*
* @return an enumeration of all the available options.
*/
public Enumeration listOptions() {
Vector newVector = new Vector(10);
newVector.
addElement(new Option("\tUse unpruned tree.",
"U", 0, "-U"));
newVector.
addElement(new Option("\tSet confidence threshold for pruning.\n" +
"\t(default 0.25)",
"C", 1, "-C <pruning confidence>"));
newVector.
addElement(new Option("\tSet minimum number of instances per leaf.\n" +
"\t(default 2)",
"M", 1, "-M <minimum number of instances>"));
newVector.
addElement(new Option("\tUse reduced error pruning.",
"R", 0, "-R"));
newVector.
addElement(new Option("\tSet number of folds for reduced error\n" +
"\tpruning. One fold is used as pruning set.\n" +
"\t(default 3)",
"N", 1, "-N <number of folds>"));
newVector.
addElement(new Option("\tUse binary splits only.",
"B", 0, "-B"));
newVector.
addElement(new Option("\tDon't perform subtree raising.",
"S", 0, "-S"));
newVector.
addElement(new Option("\tDo not clean up after the tree has been built.",
"L", 0, "-L"));
newVector.
addElement(new Option("\tLaplace smoothing for predicted probabilities.",
"A", 0, "-A"));
newVector.
addElement(new Option("\tSeed for random data shuffling (default 1).",
"Q", 1, "-Q <seed>"));
return newVector.elements();
}
/**
* Parses a given list of options.
*
<!-- options-start -->
* Valid options are: <p/>
*
* <pre> -U
* Use unpruned tree.</pre>
*
* <pre> -C <pruning confidence>
* Set confidence threshold for pruning.
* (default 0.25)</pre>
*
* <pre> -M <minimum number of instances>
* Set minimum number of instances per leaf.
* (default 2)</pre>
*
* <pre> -R
* Use reduced error pruning.</pre>
*
* <pre> -N <number of folds>
* Set number of folds for reduced error
* pruning. One fold is used as pruning set.
* (default 3)</pre>
*
* <pre> -B
* Use binary splits only.</pre>
*
* <pre> -S
* Don't perform subtree raising.</pre>
*
* <pre> -L
* Do not clean up after the tree has been built.</pre>
*
* <pre> -A
* Laplace smoothing for predicted probabilities.</pre>
*
* <pre> -Q <seed>
* Seed for random data shuffling (default 1).</pre>
*
<!-- options-end -->
*
* @param options the list of options as an array of strings
* @throws Exception if an option is not supported
*/
public void setOptions(String[] options) throws Exception {
// Other options
String minNumString = Utils.getOption('M', options);
if (minNumString.length() != 0) {
m_minNumObj = Integer.parseInt(minNumString);
} else {
m_minNumObj = 2;
}
m_binarySplits = Utils.getFlag('B', options);
m_useLaplace = Utils.getFlag('A', options);
// Pruning options
m_unpruned = Utils.getFlag('U', options);
m_subtreeRaising = !Utils.getFlag('S', options);
m_noCleanup = Utils.getFlag('L', options);
if ((m_unpruned) && (!m_subtreeRaising)) {
throw new Exception("Subtree raising doesn't need to be unset for unpruned tree!");
}
m_reducedErrorPruning = Utils.getFlag('R', options);
if ((m_unpruned) && (m_reducedErrorPruning)) {
throw new Exception("Unpruned tree and reduced error pruning can't be selected " +
"simultaneously!");
}
String confidenceString = Utils.getOption('C', options);
if (confidenceString.length() != 0) {
if (m_reducedErrorPruning) {
throw new Exception("Setting the confidence doesn't make sense " +
"for reduced error pruning.");
} else if (m_unpruned) {
throw new Exception("Doesn't make sense to change confidence for unpruned "
+"tree!");
} else {
m_CF = (new Float(confidenceString)).floatValue();
if ((m_CF <= 0) || (m_CF >= 1)) {
throw new Exception("Confidence has to be greater than zero and smaller " +
"than one!");
}
}
} else {
m_CF = 0.25f;
}
String numFoldsString = Utils.getOption('N', options);
if (numFoldsString.length() != 0) {
if (!m_reducedErrorPruning) {
throw new Exception("Setting the number of folds" +
" doesn't make sense if" +
" reduced error pruning is not selected.");
} else {
m_numFolds = Integer.parseInt(numFoldsString);
}
} else {
m_numFolds = 3;
}
String seedString = Utils.getOption('Q', options);
if (seedString.length() != 0) {
m_Seed = Integer.parseInt(seedString);
} else {
m_Seed = 1;
}
}
/**
* Gets the current settings of the Classifier.
*
* @return an array of strings suitable for passing to setOptions
*/
public String [] getOptions() {
String [] options = new String [14];
int current = 0;
if (m_noCleanup) {
options[current++] = "-L";
}
if (m_unpruned) {
options[current++] = "-U";
} else {
if (!m_subtreeRaising) {
options[current++] = "-S";
}
if (m_reducedErrorPruning) {
options[current++] = "-R";
options[current++] = "-N"; options[current++] = "" + m_numFolds;
options[current++] = "-Q"; options[current++] = "" + m_Seed;
} else {
options[current++] = "-C"; options[current++] = "" + m_CF;
}
}
if (m_binarySplits) {
options[current++] = "-B";
}
options[current++] = "-M"; options[current++] = "" + m_minNumObj;
if (m_useLaplace) {
options[current++] = "-A";
}
while (current < options.length) {
options[current++] = "";
}
return options;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String seedTipText() {
return "The seed used for randomizing the data " +
"when reduced-error pruning is used.";
}
/**
* Get the value of Seed.
*
* @return Value of Seed.
*/
public int getSeed() {
return m_Seed;
}
/**
* Set the value of Seed.
*
* @param newSeed Value to assign to Seed.
*/
public void setSeed(int newSeed) {
m_Seed = newSeed;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String useLaplaceTipText() {
return "Whether counts at leaves are smoothed based on Laplace.";
}
/**
* Get the value of useLaplace.
*
* @return Value of useLaplace.
*/
public boolean getUseLaplace() {
return m_useLaplace;
}
/**
* Set the value of useLaplace.
*
* @param newuseLaplace Value to assign to useLaplace.
*/
public void setUseLaplace(boolean newuseLaplace) {
m_useLaplace = newuseLaplace;
}
/**
* Returns a description of the classifier.
*
* @return a description of the classifier
*/
public String toString() {
if (m_root == null) {
return "No classifier built";
}
if (m_unpruned)
return "J48 unpruned tree\n------------------\n" + m_root.toString();
else
return "J48 pruned tree\n------------------\n" + m_root.toString();
}
/**
* Returns a superconcise version of the model
*
* @return a summary of the model
*/
public String toSummaryString() {
return "Number of leaves: " + m_root.numLeaves() + "\n"
+ "Size of the tree: " + m_root.numNodes() + "\n";
}
/**
* Returns the size of the tree
* @return the size of the tree
*/
public double measureTreeSize() {
return m_root.numNodes();
}
/**
* Returns the number of leaves
* @return the number of leaves
*/
public double measureNumLeaves() {
return m_root.numLeaves();
}
/**
* Returns the number of rules (same as number of leaves)
* @return the number of rules
*/
public double measureNumRules() {
return m_root.numLeaves();
}
/**
* Returns an enumeration of the additional measure names
* @return an enumeration of the measure names
*/
public Enumeration enumerateMeasures() {
Vector newVector = new Vector(3);
newVector.addElement("measureTreeSize");
newVector.addElement("measureNumLeaves");
newVector.addElement("measureNumRules");
return newVector.elements();
}
/**
* Returns the value of the named measure
* @param additionalMeasureName the name of the measure to query for its value
* @return the value of the named measure
* @throws IllegalArgumentException if the named measure is not supported
*/
public double getMeasure(String additionalMeasureName) {
if (additionalMeasureName.compareToIgnoreCase("measureNumRules") == 0) {
return measureNumRules();
} else if (additionalMeasureName.compareToIgnoreCase("measureTreeSize") == 0) {
return measureTreeSize();
} else if (additionalMeasureName.compareToIgnoreCase("measureNumLeaves") == 0) {
return measureNumLeaves();
} else {
throw new IllegalArgumentException(additionalMeasureName
+ " not supported (j48)");
}
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String unprunedTipText() {
return "Whether pruning is performed.";
}
/**
* Get the value of unpruned.
*
* @return Value of unpruned.
*/
public boolean getUnpruned() {
return m_unpruned;
}
/**
* Set the value of unpruned. Turns reduced-error pruning
* off if set.
* @param v Value to assign to unpruned.
*/
public void setUnpruned(boolean v) {
if (v) {
m_reducedErrorPruning = false;
}
m_unpruned = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String confidenceFactorTipText() {
return "The confidence factor used for pruning (smaller values incur "
+ "more pruning).";
}
/**
* Get the value of CF.
*
* @return Value of CF.
*/
public float getConfidenceFactor() {
return m_CF;
}
/**
* Set the value of CF.
*
* @param v Value to assign to CF.
*/
public void setConfidenceFactor(float v) {
m_CF = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String minNumObjTipText() {
return "The minimum number of instances per leaf.";
}
/**
* Get the value of minNumObj.
*
* @return Value of minNumObj.
*/
public int getMinNumObj() {
return m_minNumObj;
}
/**
* Set the value of minNumObj.
*
* @param v Value to assign to minNumObj.
*/
public void setMinNumObj(int v) {
m_minNumObj = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String reducedErrorPruningTipText() {
return "Whether reduced-error pruning is used instead of C.4.5 pruning.";
}
/**
* Get the value of reducedErrorPruning.
*
* @return Value of reducedErrorPruning.
*/
public boolean getReducedErrorPruning() {
return m_reducedErrorPruning;
}
/**
* Set the value of reducedErrorPruning. Turns
* unpruned trees off if set.
*
* @param v Value to assign to reducedErrorPruning.
*/
public void setReducedErrorPruning(boolean v) {
if (v) {
m_unpruned = false;
}
m_reducedErrorPruning = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String numFoldsTipText() {
return "Determines the amount of data used for reduced-error pruning. "
+ " One fold is used for pruning, the rest for growing the tree.";
}
/**
* Get the value of numFolds.
*
* @return Value of numFolds.
*/
public int getNumFolds() {
return m_numFolds;
}
/**
* Set the value of numFolds.
*
* @param v Value to assign to numFolds.
*/
public void setNumFolds(int v) {
m_numFolds = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String binarySplitsTipText() {
return "Whether to use binary splits on nominal attributes when "
+ "building the trees.";
}
/**
* Get the value of binarySplits.
*
* @return Value of binarySplits.
*/
public boolean getBinarySplits() {
return m_binarySplits;
}
/**
* Set the value of binarySplits.
*
* @param v Value to assign to binarySplits.
*/
public void setBinarySplits(boolean v) {
m_binarySplits = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String subtreeRaisingTipText() {
return "Whether to consider the subtree raising operation when pruning.";
}
/**
* Get the value of subtreeRaising.
*
* @return Value of subtreeRaising.
*/
public boolean getSubtreeRaising() {
return m_subtreeRaising;
}
/**
* Set the value of subtreeRaising.
*
* @param v Value to assign to subtreeRaising.
*/
public void setSubtreeRaising(boolean v) {
m_subtreeRaising = v;
}
/**
* Returns the tip text for this property
* @return tip text for this property suitable for
* displaying in the explorer/experimenter gui
*/
public String saveInstanceDataTipText() {
return "Whether to save the training data for visualization.";
}
/**
* Check whether instance data is to be saved.
*
* @return true if instance data is saved
*/
public boolean getSaveInstanceData() {
return m_noCleanup;
}
/**
* Set whether instance data is to be saved.
* @param v true if instance data is to be saved
*/
public void setSaveInstanceData(boolean v) {
m_noCleanup = v;
}
/**
* Returns the revision string.
*
* @return the revision
*/
public String getRevision() {
return RevisionUtils.extract("$Revision: 1.9 $");
}
/**
* Main method for testing this class
*
* @param argv the commandline options
*/
public static void main(String [] argv){
// Fall back to the bundled sample file when no training file is supplied,
// instead of writing into a possibly empty argument array
if (argv.length == 0) {
argv = new String[]{"-t", "weather3.arff"};
}
runClassifier(new C45(), argv);
}
}
TESTING
Software testing can be stated as the process of validating and verifying that a computer
program/application/product:
• works as expected,
• can be implemented with the same characteristics,
• satisfies the needs of stakeholders.
Software testing, depending on the testing method employed, can be implemented at any time in
the software development process.
Testing levels
There are generally four recognized levels of tests: unit testing, integration testing,
system testing, and acceptance testing. Tests are frequently grouped by where they are added in
the software development process, or by the level of specificity of the test.
Unit testing
Unit testing, also known as component testing, refers to tests that verify the functionality of a
specific section of code, usually at the function level. In an object-oriented environment, this is
usually at the class level, and the minimal unit tests include the constructors and destructors.[32]
These types of tests are usually written by developers as they work on code (white-box style), to
ensure that the specific function is working as expected. One function might have multiple tests,
to catch corner casesor other branches in the code. Unit testing alone cannot verify the
functionality of a piece of software, but rather is used to assure that the building blocks the
software uses work independently of each other.
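In Java projects this is normally done with JUnit; the idea can be shown with plain assertions over a hypothetical clamp() helper (both the helper and the test harness here are illustrative):

```java
// A tiny unit test in plain Java: each check exercises one function in isolation.
public class MathUtilTest {
    static int clamp(int v, int lo, int hi) {          // the unit under test
        return Math.max(lo, Math.min(hi, v));
    }

    public static void main(String[] args) {
        // typical case
        check(clamp(5, 0, 10) == 5, "inside range");
        // corner cases: exactly the branch boundaries
        check(clamp(-1, 0, 10) == 0, "below range");
        check(clamp(99, 0, 10) == 10, "above range");
        System.out.println("all unit tests passed");
    }

    static void check(boolean ok, String name) {
        if (!ok) throw new AssertionError("failed: " + name);
    }
}
```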
Integration testing
Integration testing is any type of software testing that seeks to verify the interfaces
between components against a software design. Software components may be integrated in an
iterative way or all together. Normally the former is considered a better practice since it allows
interface issues to be located more quickly and fixed.
Integration testing works to expose defects in the interfaces and interaction between integrated
components (modules). Progressively larger groups of tested software components
corresponding to elements of the architectural design are integrated and tested until the software
works as a system.
System testing
System testing tests a completely integrated system to verify that it meets its requirements.
Testing Types:
Installation testing
An installation test assures that the system is installed correctly and works on the actual
customer's hardware.
Regression testing
Regression testing focuses on finding defects after a major code change has occurred.
Specifically, it seeks to uncover software regressions, such as degraded or lost features, including old
bugs that have come back. Such regressions occur whenever software functionality that was
previously working correctly stops working as intended. Typically, regressions occur as an
unintended consequence of program changes, when the newly developed part of the software
collides with the previously existing code. Common methods of regression testing include
rerunning previous sets of test cases and checking whether previously fixed faults have
re-emerged.
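The "rerun previous test cases" approach can be sketched as follows, using a hypothetical discount() function together with its previously recorded input/output pairs:

```java
import java.util.Map;

// Regression check: rerun the stored cases and compare against recorded outputs.
public class RegressionSuite {
    static int discount(int price) {               // the function that just changed
        return price >= 100 ? price - 10 : price;
    }

    public static void main(String[] args) {
        // previously recorded input -> expected output pairs
        Map<Integer, Integer> recorded = Map.of(50, 50, 100, 90, 200, 190);
        recorded.forEach((in, want) -> {
            int got = discount(in);
            if (got != want)
                throw new AssertionError("regression at input " + in + ": got " + got);
        });
        System.out.println("no regressions");
    }
}
```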
Acceptance Testing
1. A smoke test is used as an acceptance test prior to introducing a new build to the main
testing process, i.e. before integration or regression.
2. Acceptance testing performed by the customer, often in their lab environment on their
own hardware, is known as user acceptance testing (UAT). Acceptance testing may be
performed as part of the hand-off process between any two phases of development.
Alpha testing
Alpha testing is simulated or actual operational testing by potential users or an independent
test team at the developers' site, performed as a form of internal acceptance testing before the
software goes to beta testing.
Beta Testing
Beta testing comes after alpha testing and can be considered a form of external user acceptance
testing. Versions of the software, known as beta versions, are released to a limited audience
outside of the programming team. The software is released to groups of people so that further
testing can ensure the product has few faults or bugs. Sometimes, beta versions are made
available to the open public to increase the feedback field to a maximal number of future users.
Functional testing refers to activities that verify a specific action or function of the
code. These are usually found in the code requirements documentation, although some
development methodologies work from use cases or user stories. Functional tests tend to answer
the question of "can the user do this" or "does this particular feature work."
Non-functional testing refers to aspects of the software that may not be related to a
specific function or user action, such as scalability or other performance characteristics,
behavior under certain constraints, or security. Testing will determine the breaking point: the
point at which extremes of scalability or performance lead to unstable execution.
Bibliography
1. User Interfaces in C#: Windows Forms and Custom Controls by Matthew MacDonald.
2. Applied Microsoft® .NET Framework Programming (Pro-Developer) by Jeffrey Richter.
3. Practical .Net2 and C#2: Harness the Platform, the Language, and the Framework by Patrick
Smacchia.
4. Data Communications and Networking, by Behrouz A Forouzan.
5. Computer Networking: A Top-Down Approach, by James F. Kurose.
6. Operating System Concepts, by Abraham Silberschatz.
7. M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Konwinski, G. Lee, D. A.
Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "Above the clouds: A Berkeley view of cloud
computing," University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28, Feb 2009.
8. “The apache cassandra project,” http://cassandra.apache.org/.
9. L. Lamport, “The part-time parliament,” ACM Transactions on Computer Systems, vol. 16,
pp. 133–169, 1998.
10. N. Bonvin, T. G. Papaioannou, and K. Aberer, “Cost-efficient and differentiated data
availability guarantees in data clouds,” in Proc. of the ICDE, Long Beach, CA, USA, 2010.
11. O. Regev and N. Nisan, “The popcorn market. online markets for computational resources,”
Decision Support Systems, vol. 28, no. 1-2, pp. 177 – 189, 2000.
12. A. Helsinger and T. Wright, “Cougaar: A robust configurable multi agent platform,” in Proc.
of the IEEE Aerospace Conference, 2005.
Sites Referred:
http://www.sourcefordgde.com
http://www.networkcomputing.com/
http://www.ieee.org
http://www.emule-project.net/
REFERENCES
[1] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge
Discovery and Data Mining. AAAI/MIT Press, 1996.
[2] F. Höppner, F. Klawonn, R. Kruse, and T. Runkler, Fuzzy Cluster Analysis: Methods for
Classification, Data Analysis and Image Recognition. New York, NY: John Wiley & Sons, 2000.
[4] H. Gunadi, "Comparing nearest neighbor algorithms in high-dimensional space," 2011.
[5] T. C. Havens and J. C. Bezdek, "An efficient formulation of the improved visual assessment
of cluster tendency (iVAT) algorithm," IEEE Trans. Knowl. Data Eng., vol. 24, no. 5, pp.
813–822, May 2012.
[6] D. Kumar, M. Palaniswami, S. Rajasegarar, C. Leckie, J. C. Bezdek, and T. C. Havens,
"clusiVAT: A mixed visual/numerical clustering algorithm for big data," in Proc. IEEE Int.
Conf. Big Data, pp. 112–117, 2013.
[7] A. Fahad, N. Alshatri, Z. Tari, A. Alamri, I. Khalil, A. Y. Zomaya, S. Foufou, and A.
Bouras, "A survey of clustering algorithms for big data: Taxonomy and empirical analysis,"
IEEE Trans. Emerging Topics Comput., vol. 2, no. 3, pp. 267–279, Sep. 2014.
[8] M. Popescu, J. Keller, J. Bezdek, and A. Zare, "Random projections fuzzy c-means (RPFCM)
for big data clustering," in Proc. IEEE Int. Conf. Fuzzy Syst., pp. 1–6, 2015.