
COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY

COCHIN – 682022

2010

Seminar Report

On

Soft Computing

Submitted By

Jishin K C

In partial fulfillment of the requirement for the award of

Degree of Master of Technology (M.Tech)

In

Software Engineering


DEPARTMENT OF COMPUTER SCIENCE

COCHIN UNIVERSITY OF SCIENCE AND TECHNOLOGY

COCHIN – 682022

Certificate

This is to certify that the seminar report entitled "Soft Computing", submitted by Jishin K C, Semester I, in partial fulfillment of the requirement for the award of the M.Tech. degree in Software Engineering, is a bonafide record of the seminar presented by her in the academic year 2010.

Dr. Sumam Mary Idicula Dr. K Paulose Jacob

Seminar Guide Head of the Department


ACKNOWLEDGEMENT

I express my profound gratitude to the Head of the Department, Dr. K Paulose Jacob, for allowing me to proceed with the seminar and for giving me full freedom to access the lab facilities.

My heartfelt thanks to my guide Dr. Sumam Mary Idicula for taking time
and helping me through my seminar. She has been a constant source of encouragement without which
the seminar might not have been completed on time. I am very grateful for her guidance.

I am also thankful to Mr. G Santhosh Kumar, Lecturer, Department of Computer Science, for helping me with my seminar. His ideas and thoughts have been of great importance.


ABSTRACT

Soft Computing differs from conventional (hard) computing in that, unlike hard
computing, it is tolerant of imprecision, uncertainty, partial truth, and approximation. In
effect, the role model for soft computing is the human mind. Principal constituents of Soft
Computing are Neural Networks, Fuzzy Logic, Evolutionary Computation, Swarm
Intelligence and Bayesian Networks. The successful applications of soft computing suggest
that the impact of soft computing will be felt increasingly in coming years. Soft computing is
likely to play an important role in science and engineering, but eventually its influence may
extend much farther.

Keywords: Soft Computing, Neural Networks, Fuzzy Logic, Evolutionary Computation, Swarm Intelligence, Bayesian Networks


CONTENTS

1. Introduction

2. Neural Networks
   2.1 Biological Model of Neuron
   2.2 Model of an Artificial Neuron
   2.3 Characteristics of Neural Networks
   2.4 Learning
   2.5 Neural Network Case Study

3. Evolutionary Computation
   3.1 General Framework
   3.2 Genetic Algorithms
   3.3 Genetic Algorithm Case Study

4. Fuzzy Systems
   4.1 Fuzzy Sets
   4.2 Fuzzy Control in Detail
   4.3 Fuzzy Logic Case Study

5. Bayesian Network

6. Swarm Intelligence
   6.1 Example Algorithms
   6.2 Applications

7. Conclusion

8. References


1. INTRODUCTION

Soft Computing became a formal Computer Science area of study in the early 1990s. Earlier computational approaches could model and precisely analyze only relatively simple systems. More complex systems arising in biology, medicine, the humanities, management sciences, and similar fields often remained intractable to conventional mathematical and analytical methods. That said, it should be pointed out that the simplicity and complexity of systems are relative, and many conventional mathematical models have been both challenging and very productive. Soft computing deals with imprecision, uncertainty, partial truth, and approximation to achieve tractability, robustness and low solution cost.

The idea of soft computing was initiated in 1981 by Lotfi A. Zadeh. Generally
speaking, soft computing techniques resemble biological processes more closely than
traditional techniques, which are largely based on formal logical systems, such as sentential
logic and predicate logic, or rely heavily on computer-aided numerical analysis (as in finite
element analysis). Soft computing techniques are intended to complement each other.

Unlike hard computing schemes, which strive for exactness and full truth, soft
computing techniques exploit the given tolerance of imprecision, partial truth, and
uncertainty for a particular problem. Another common contrast comes from the observation
that inductive reasoning plays a larger role in soft computing than in hard computing.
Components of soft computing include: neural networks, perceptrons, fuzzy systems, Bayesian networks, swarm intelligence and evolutionary computation.

The highly parallel processing and layered neuronal morphology with learning abilities of the human cognitive faculty (the brain) provide us with a new tool for designing a cognitive machine that can learn and recognize complicated patterns like human faces and Japanese characters. The theory of fuzzy logic, the basis for soft computing, provides mathematical power for the emulation of the higher-order cognitive functions (the thought and perception processes). A marriage between these evolving disciplines, such as neural computing, genetic algorithms and fuzzy logic, may provide a new class of computing systems (neural-fuzzy systems) for the emulation of higher-order cognitive power.


2. NEURAL NETWORKS

Neural networks, which are simplified models of the biological neuron system, are massively parallel distributed processing systems made up of highly interconnected neural computing elements that have the ability to learn and thereby acquire knowledge and make it available for use. They resemble the brain in two respects:
- Knowledge is acquired by the network through a learning process.
- Interconnection strengths known as synaptic weights are used to store the knowledge.

2.1 BIOLOGICAL MODEL OF NEURON

A neuron is composed of a cell body, known as the soma, which contains the nucleus. Attached to the soma are long, irregularly shaped filaments called dendrites. The dendrites behave as input channels: all inputs from other neurons arrive through them. Another link to the soma, called the axon, is electrically active and serves as an output channel. If the cumulative inputs received by the soma raise the internal electric potential of the cell, known as the membrane potential, above a threshold, the neuron fires by propagating an action potential down the axon to excite or inhibit other neurons. The axon terminates in a specialized contact called a synapse that connects the axon with the dendrite links of another neuron.


2.2 MODEL OF AN ARTIFICIAL NEURON


An artificial neuron model bears direct analogy to the actual constituents of a biological neuron. This model forms the basis of Artificial Neural Networks.

Here x1, x2, x3, ..., xn are the n inputs to the artificial neuron, and w1, w2, w3, ..., wn are the weights attached to the input links. The total input I received by the soma of the artificial neuron is

I = x1w1 + x2w2 + x3w3 + ... + xnwn

To generate the final output y, the sum is passed to a non-linear filter φ called the activation function, which releases the output:

y = φ(I)

A very commonly used activation function is the thresholding function. Here the sum is compared with a threshold value θ: if I is greater than θ, the output is 1; otherwise it is 0.
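As a minimal sketch, the weighted sum and the thresholding activation described above can be written in Python as follows (the inputs, weights and threshold are illustrative values, not from the report):

    def artificial_neuron(x, w, theta):
        # Total input I = x1*w1 + x2*w2 + ... + xn*wn
        I = sum(xi * wi for xi, wi in zip(x, w))
        # y = phi(I): thresholding activation function
        return 1 if I > theta else 0

    # Example: three inputs with threshold 0.5 -> fires (0.4 + 0.3 = 0.7 > 0.5)
    y = artificial_neuron(x=[1, 0, 1], w=[0.4, 0.9, 0.3], theta=0.5)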
2.3 CHARACTERISTICS OF NEURAL NETWORKS
1) NNs exhibit mapping capabilities; that is, they can map input patterns to their associated output patterns.
2) NNs learn by example. Thus, a NN architecture can be 'trained' with known examples of a problem before it is tested for its 'inference' capability on unknown instances of the problem. NNs can therefore identify objects on which they were not previously trained.
3) NNs possess the capability to generalize. Thus, they can predict new outcomes from past trends.
4) NNs are robust systems and are fault tolerant. They can therefore recall full patterns from incomplete, partial or noisy patterns.


5) NNs can process information in parallel, at high speed, and in a distributed manner.
2.4 LEARNING
Basically, learning is a process by which the free parameters (i.e., synaptic weights
and bias levels) of a neural network are adapted through a continuing process of stimulation
by the environment in which the network is embedded. The type of learning is determined by
the manner in which the parameter changes take place. Specifically, learning machines may
be classified as follows:
- Learning with a teacher, also referred to as supervised learning
- Learning without a teacher
This second class of learning machines may also be subdivided into
- Reinforcement learning
- Unsupervised learning or self-organizing learning
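As a concrete illustration of learning with a teacher, the following Python sketch adapts the free parameters (weights and bias) of a single thresholding neuron with the classic perceptron rule; the AND task, learning rate and epoch count are illustrative choices, not from the report:

    import numpy as np

    def train_perceptron(X, y, epochs=20, lr=0.1):
        # Supervised learning: the teacher's desired output y drives the
        # adaptation of the synaptic weights w and the bias level b.
        w = np.zeros(X.shape[1])
        b = 0.0
        for _ in range(epochs):
            for xi, target in zip(X, y):
                output = 1 if xi @ w + b > 0 else 0   # threshold activation
                error = target - output               # teacher signal
                w += lr * error * xi                  # adapt synaptic weights
                b += lr * error                       # adapt bias level
        return w, b

    # Teach the two-input AND function.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])
    w, b = train_perceptron(X, y)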

2.5 NEURAL NETWORK CASE STUDY

High Quality Cluster Generation of Feature Points of Fingerprint Using Neural Network
The identification of a person requires a comparison of a fingerprint with all the fingerprints in a database. This database may be very large, so identification can take a long response time. The identification process can be sped up by reducing the number of comparisons that must be performed, and this is possible only if the fingerprints in the database are classified. Traditionally, the classification of fingerprints has been based on their geometric features only, while minutiae features, which represent a fingerprint as an individual, have been used for verification and identification. Here, a methodology is proposed for the formation of visually isolated clusters of a fingerprint on the basis of minutiae distribution; the minutiae information extracted for verification is reused for the classification.
Methodology
A fingerprint image should be viewed as a flow of feature points. The local features (minutiae) are extracted from the fingerprint image.

Figure 1: The main steps in the proposed methodology


A. Minutiae Extraction

The input image is divided into equal sized blocks. Each block is processed
independently. The ridges in each block are thinned and the resulting skeleton image is then
used for feature extraction.

B. X-Y Location Calculation


After identification of the minutiae points, their X-Y coordinate locations are stored in the database as input to the classification system.
C. Fingerprint Classification
Fingerprints are then classified using an ART neural network.
Clustering Fingerprints using Adaptive Resonance Theory
For clustering the fingerprints we adapt the ART1 algorithm, which is suitable for clustering binary vectors, to our situation. ART1 consists of an attentional subsystem and an orientation subsystem. The attentional subsystem consists of a comparison layer F1, a recognition layer F2, and control gains G1 and G2. The F1 and F2 layers are fully connected with top-down and bottom-up weights. The orientation subsystem consists of the vigilance parameter ρ. The input pattern vectors PH (H = 1, 2, ...) are presented at the F1 layer. Each input pattern presented at the F1 layer activates a node (the winner) in the F2 layer. The F2 layer
reads out the top-down expectation to F1, where the winner is compared with the input vector. The vigilance parameter determines the mismatch that is to be tolerated when assigning each pattern to a cluster. If the match between the winner and the input vector is within the tolerance, the top-down weights corresponding to the winner are modified. If a mismatch occurs, the F1 layer sends a reset burst to F2, which shuts off the current node and chooses another uncommitted node. Once the network stabilizes, the top-down weights corresponding to each node in the F2 layer represent the prototype vector for that node. Our architecture of the ART1-based network for clustering fingerprints (illustrated in Figure 2) consists of 2 nodes in the F1 layer, each presented with the binary value 0 or 1. The pattern vector PH, which represents the X-Y location of a minutiae point, is presented at the F1 layer. The F2 layer consists of a variable number of nodes corresponding to the number of clusters.
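The search-and-resonance cycle just described can be sketched in Python. This is a simplified fast-learning ART1 (binary inputs, prototype update by logical AND), not the full F1/F2 gain-control dynamics; the choice constant and vigilance value are illustrative:

    import numpy as np

    def art1(patterns, rho=0.7):
        # patterns: (n_samples, n_features) array of 0/1 values.
        # rho: vigilance; higher values give more, tighter clusters.
        prototypes = []                 # top-down weights of committed F2 nodes
        labels = []
        for p in patterns:
            # Rank committed nodes by bottom-up activation |p AND w| / (beta + |w|).
            order = sorted(range(len(prototypes)),
                           key=lambda j: -(np.minimum(p, prototypes[j]).sum()
                                           / (0.5 + prototypes[j].sum())))
            for j in order:
                match = np.minimum(p, prototypes[j])
                # Vigilance test: fraction of the input matched by the prototype.
                if match.sum() / max(p.sum(), 1) >= rho:
                    prototypes[j] = match          # resonance: w <- p AND w
                    labels.append(j)
                    break
            else:                                  # mismatch everywhere: reset
                prototypes.append(p.copy())        # and commit a new F2 node
                labels.append(len(prototypes) - 1)
        return prototypes, labels

    # Example with four binary-encoded feature vectors.
    protos, labels = art1(np.array([[1, 1, 0, 0], [1, 1, 1, 0],
                                    [0, 0, 1, 1], [0, 1, 1, 1]]), rho=0.6)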
We measure the accuracy of the classification approach by plotting each minutiae point on an X-Y plane in a colour determined by its cluster, i.e. a specific colour for a specific cluster. When the points of one colour are connected with a curve to represent a cluster, the clusters formed by ART1 appear visually isolated, so we observe that ART1 produces high quality clustering. The graphical representation is given below.

Performance of the ART1 Clustering Technique

The ART1 algorithm classifies the minutiae points of a fingerprint into a number of clusters. To measure the quality of the clusters obtained, minutiae points belonging to different clusters are represented graphically in different colours.
Figure 3: Three clusters created by ART1

In recent years there has been considerable research exploring novel methods and techniques for the classification of fingerprints. Here the ART1 clustering algorithm is used to classify the minutiae points of a fingerprint, and the cluster information generated by the ART1 neural network can be used in a multilevel classification of fingerprints.


3. EVOLUTIONARY COMPUTATION

In computer science, evolutionary computation is a subfield of artificial intelligence (more particularly, computational intelligence) that deals with combinatorial optimization problems. Evolutionary computation uses iterative progress, such as growth or development in a population; the population is then selected in a guided random search using parallel processing to achieve the desired end. Such processes are often inspired by biological mechanisms of evolution, and they aim at function optimization and problem solving. Evolutionary computation includes genetic programming, evolution strategies, genetic algorithms and evolutionary programming.

3.1 GENERAL FRAMEWORK OF EVOLUTIONARY COMPUTATION
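The framework common to these methods is a single iterative loop: initialize a population, evaluate it, select the fitter individuals, and vary them to form the next generation. A minimal Python sketch of the loop (the operator functions are problem-specific placeholders, not from the report):

    def evolutionary_computation(init, evaluate, select, vary, done):
        population = init()                           # random initial population
        fitness = [evaluate(x) for x in population]
        while not done(population, fitness):
            parents = select(population, fitness)     # guided random selection
            population = vary(parents)                # recombination / mutation
            fitness = [evaluate(x) for x in population]
        return population, fitness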

3.2 GENETIC ALGORITHMS

The most popular evolutionary algorithm is the genetic algorithm (GA) of J. Holland. A GA is an iterative procedure whose starting condition is a large set of random strings called genes or the genome. The genes are linear arrays of numbers, and each number represents an aspect of the system to which the algorithm is being applied; for example, if we are dealing with neural network topology, then one of the numbers could represent the number of layers in a particular network.


When the algorithm is run, each of the networks represented by the population of
genes is tested and graded according to its performance. The genes are then copied with a
probability that is larger if their performance was greater. That is, the genes that produce a
network with poor performance are less likely to get copied; those with good performance are
more likely to get copied. The copying process therefore produces a population that has a
large number of better performing genes.
Having completed the copying process, the genes are then "bred" by crossing over
some of the numbers in the arrays at random points. A small random change called a
"mutation" is introduced to add some diversity. The whole process is then repeated. Each
genome or gene encodes a possible solution in a given problem space-referred to as the
search space. This space comprises all possible solutions to the problem at hand. The symbol
alphabet used is often binary but this has been extended in recent years to include character
based encodings, real-valued encodings, and tree representations.
The following steps are required to define a basic GA (a minimal code sketch of the procedure is given after the steps):
1. Create a population of random individuals (e.g., a mapping from the set of parameter values into the set {0, 1}) such that each individual represents a possible solution to the problem at hand.
2. Compute each individual's fitness, i.e., its ability to solve the given problem. This involves finding a mapping from bit strings into the reals, the so-called fitness function.
3. Select individual population members to be parents. One of the simplest selection procedures is fitness-proportionate selection, where individuals are selected with a probability proportional to their relative fitness. This ensures that the expected number of times an individual is chosen is approximately proportional to its relative performance in the population. Thus, high-fitness individuals stand a better chance of "reproducing", while low-fitness ones are more likely to disappear.
4. Produce children by recombining parent material via crossover and mutation, and add them to the population.
5. Evaluate the children's fitness.
6. Repeat steps (3) to (5) until a solution with the required fitness is obtained.
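A minimal Python sketch of steps 1 to 6, using fitness-proportionate (roulette-wheel) selection, one-point crossover and bit-flip mutation; the "one-max" fitness function and the parameter values are illustrative choices, not from the report:

    import random

    def genetic_algorithm(fitness, n_bits=20, pop_size=50, p_cross=0.7,
                          p_mut=0.01, generations=100):
        # Step 1: population of random bit strings.
        pop = [[random.randint(0, 1) for _ in range(n_bits)]
               for _ in range(pop_size)]
        for _ in range(generations):
            # Step 2: fitness of each individual.
            scores = [fitness(ind) for ind in pop]
            total = sum(scores) or 1.0

            def select():                     # Step 3: roulette-wheel selection
                r, acc = random.uniform(0, total), 0.0
                for ind, s in zip(pop, scores):
                    acc += s
                    if acc >= r:
                        return ind
                return pop[-1]

            children = []
            while len(children) < pop_size:   # Step 4: crossover and mutation
                a, b = select()[:], select()[:]
                if random.random() < p_cross:
                    cut = random.randrange(1, n_bits)
                    a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
                for child in (a, b):
                    for i in range(n_bits):
                        if random.random() < p_mut:
                            child[i] ^= 1
                    children.append(child)
            pop = children[:pop_size]         # Steps 5-6: evaluate and repeat
        return max(pop, key=fitness)

    # Example: maximise the number of ones in the string ("one-max").
    best = genetic_algorithm(fitness=sum)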
Selection alone cannot introduce any new individuals into the population; genetically inspired operators such as crossover and mutation are used to find new points in the search space. The most important genetic operator is the crossover operator. As in biological systems, the crossover process yields a recombination of alleles via exchange of segments between pairs of genotypes. Genetic algorithms are stochastic iterative processes that are not guaranteed to converge; the termination condition may be specified either as some fixed, maximal number of generations or as reaching a predefined acceptable fitness level.

3.3 GENETIC ALGORITHM CASE STUDY

Solving New Student Allocation Problem with Genetic Algorithms

The New Student Allocation Problem (NSAP) is a clustering problem in which students are allocated to classes so that the intelligence gap within each class is minimal and the number of students in each class does not exceed its maximum capacity. The topic is important because it is very difficult to provide a good educational service to a large number of students with a high diversity of achievements or skills. Once the students are allocated into groups, policies differentiated by group can be implemented easily.
It is difficult to measure students' intelligence: universities and schools have only very limited time to assess the real intelligence level, since the educational process must start shortly after the new-student registration period. It is fairly acceptable for intelligence to be measured by the admission test score, but students are then usually allocated into classes by a sorting method: new students with similar rankings are assigned to the same class. Although this method has no appropriate concept of student similarity, many educational institutions still use it. For example, there is a case in California, USA, where this method was applied; the case came to light after research showed how similar students obtained different results in hundreds of schools in the area. Since students with the same ranking are merely assumed to be similar, it is quite reasonable that they get different results.
Same total scores do not always show students' similarity, because the same total score can be assembled from various score combinations. In clustering terms, among the various score combinations, a combination with completely similar components shows a high degree of student similarity. For example, if we apply only two subjects as grouping criteria, student A with 100 in Biology and 50 in Mathematics must be similar to student B whose Biology score is approximately 100 and whose Mathematics score is approximately 50. If one of those combinations is reversed, so that student A with 100 in Biology and 50 in Mathematics is considered similar to student C with 50 in Biology and 100 in Mathematics, it definitely does not make sense, although their total scores are the same. Student C should be placed in a different class from student A or student B. If there is a student D with 90 in Biology and 45 in Mathematics, then he/she can be grouped in the same
class with student A or student B, which is acceptable since student D is more similar to student A or student B than to student C. In the traditional sorting method, by contrast, student C is not merely similar to students A and B but exactly the same as them, because their total scores are equal. This is an inappropriate concept of similarity. Hence, NSAP should be solved by a clustering method with a correct concept of object similarity.
Many clustering methods have been developed in various fields, but none of them can be applied directly to NSAP because of the classroom capacity constraint: no existing clustering method predefines the number of objects in each cluster. In NSAP, the number of objects (students) in each cluster (class) depends on the classroom size and cannot simply follow from the result of the clustering process. The clustering methods must therefore be modified to solve NSAP. This is common in the clustering area, because clustering is a subjective process; this subjectivity makes clustering difficult, since a single algorithm or approach is not adequate for every clustering problem.
Bhuyan et al. applied GAs to the general clustering problem. They clustered objects so that each cluster had high dissimilarity with the other clusters while containing similar objects, and they suggested a chromosomal representation called the order-based representation. Their idea about chromosomal representation is very good, but it is not sufficient to represent classroom capacity. However, it inspired one of our proposed approaches to solving NSAP.
As with the general clustering problem, finding a solution of NSAP is a hard process because of the large number of ways to cluster n objects into nc clusters.

Let us consider a simple illustration. There are 15 new students to be allocated to three classes with the same capacity. They will be clustered based on two variables: Mathematics and Biology scores. Assume that the distribution of their scores in a Cartesian coordinate system is as shown in the figure. Although the number of ways equals 210,766,920, the solution is established easily by examining the distribution. The solution is: class A consists of students with index numbers 3, 5, 8, 13 and 15, class B consists of students with
index number 4, 7, 9, 11 and 14, and class C consists of students with index number 1, 2, 6,
10 and 12.

The problem is easy to solve for a small number of students, but in practice the number of students is large and the number of criteria is more than two. Finding the solution by traditional methods needs infeasible time or gets trapped in local optima. The problem is very complicated: the larger the number of objects, the harder it is to find the optimal solution, and the longer it takes to reach a reasonable result. This is the reason why this problem is labeled an NP-hard problem.
We propose two approaches of chromosomal representation for solving NSAP with Genetic Algorithms (GAs), inspired by successful applications of GAs to difficult optimization problems, including NP-hard problems. Using conventional methods to solve an NP-hard problem requires an exceptionally long time, but GAs can solve it in feasible time. GAs have proved powerful for solving optimization problems in many empirical works and theories. They are particularly well suited to hard, multidimensional problems whose optima are difficult to find, and they search for the solution based on the mechanism of natural selection and natural genetics.
Centre Based Approach (CBA)
The chromosomal representation of CBA is a binary representation. To cluster n new students into nc classes, the chromosomal representation is designed as follows:
(1) A chromosome consists of nc sub-chromosomes. The ith sub-chromosome represents the student chosen as the centre of the ith class and consists of m genes or bits.
(2) A student must be a member of exactly one class, but he/she may be the centre of two or more classes.
The chromosomal representation is shown in the figure.

In the initial generation step, chromosomes consisting of nc sub-chromosomes are generated randomly in binary representation. To generate them we must determine the width of a sub-chromosome; based on this value, the mathematical formulation of the class centres can be determined too. The rules are as follows:
(1) If there is an integer m such that the number of new students equals 2^m, then the width of a sub-chromosome is m, and the index of the student serving as the centre of the ith class is computed from the value of the ith sub-chromosome.
(2) If there is no integer m such that the number of new students equals 2^m, then the width of a sub-chromosome is the lowest m for which 2^m > n, and the index of the student serving as the centre of the ith class is likewise computed from the value of the ith sub-chromosome.

The class centres must then be converted into a distribution of new students over the classes by a decoding algorithm. After converting the centres to the distribution, the value of the objective function (Equation (3) of the paper) can be calculated for evaluation.
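A sketch of this decode step under stated assumptions: the exact centre-index formula and the objective of Equation (3) are given in the source paper, so here each m-bit sub-chromosome is simply read as an integer modulo n to pick a centre student, and the remaining students are assigned to the nearest centre that still has free capacity:

    import numpy as np

    def decode_cba(chromosome, scores, n_classes, capacity):
        # chromosome: flat 0/1 list holding n_classes sub-chromosomes of m bits.
        # scores: (n_students, n_subjects) array of admission-test scores.
        n = len(scores)
        m = len(chromosome) // n_classes
        centres = []
        for i in range(n_classes):
            bits = chromosome[i * m:(i + 1) * m]
            value = int("".join(map(str, bits)), 2)
            centres.append(scores[value % n])    # assumed index decoding
        assignment, load = [], [0] * n_classes
        for s in scores:
            # Nearest centre (Euclidean distance) with remaining capacity.
            dists = [np.linalg.norm(s - c) for c in centres]
            for k in np.argsort(dists):
                if load[k] < capacity:
                    assignment.append(int(k))
                    load[k] += 1
                    break
        return assignment

    # Example: 8 students, two subjects, two classes of capacity 4 (m = 3 bits).
    scores = np.random.randint(0, 101, size=(8, 2))
    print(decode_cba([1, 0, 1, 0, 0, 1], scores, n_classes=2, capacity=4))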
Experimentation parameters
To evaluate the proposed approaches, we implemented each approach as software capable of allocating n students into nc classes based on m subjects, and experimented with random score data consisting of 200 students in four subjects. The scores are integers between 0 and 100, and the students are to be allocated into five classes with the same capacity. We use the Islamic University of Indonesia as the model for the numbers of students, classes and class capacities. In this case, the largest intelligence gap in the random data equals 75.56, and among the students there are 9 pairs with identical scores.
Experiments have been performed for each combination of the following parameters:
(1) Crossover probability: 0.25 and 0.50.
(2) Mutation probability: 0.01, 0.05 and 0.1.
(3) Population size: 50, 100 and 200.
Here, we follow the commonly suggested GA parameters. Experiments with PBA (the approach based on Bhuyan et al.'s order-based representation) have been performed for all combinations of the crossover and mutation methods; experiments with CBA have been performed for one combination of crossover and mutation. The algorithms were run on a 1.73 GHz Pentium Dual-Core notebook. We stop the algorithms after exactly 1000 generations for PBA and 250 generations for CBA; the CPU time is less than 5 minutes for PBA and less than 7 minutes for CBA.

We also ran the PBA algorithm for 2000 generations to give this approach more chance. In these additional generations the CPU spends about 9 minutes, but there is no improvement compared to the previous experiments.
The largest intelligence gap of all classes shows that both approaches can split the students with the largest intelligence gap into different classes: none of the classes has an intelligence gap equal to the largest intelligence gap of the input data.


However, PBA cannot group every pair of students with the same scores into the same class: only four pairs were allocated to the same class, even with the additional generations. On the other hand, every such pair of students is allocated to the same class by CBA. Although CBA needs more computation time than PBA, it needs fewer generations to reach the best known solution, finding it after only 236 generations; the experiment with additional generations shows that PBA needs an indeterminate number of generations to reach it.
Experiments showed that CBA needed more time than PBA. This result is understandable, because CBA has to decode the chromosomes from binary to decimal numbers and then convert them into solutions, which PBA does not. In addition, CBA depends on the number of classes, while PBA is independent of it. This indicates that the time complexity of PBA is simpler than that of CBA; however, the results showed that the best known result of PBA was no better than the objective function in the first generation, indicating that PBA is trapped in local optima. Additional generations gave no improvement to the PBA result, and neither the combinations of parameters nor the variations of the crossover and mutation methods had any effect on the results. The searching process of PBA resembles blind search: additional population size has no effect, although logically a larger population size gives a higher probability of a better initial population (1st generation), similar to the running result of CBA.
We should analyze the search space of PBA to understand why this approach is trapped in local optima. The search space of this approach, with its order-based representation of the chromosome, is much greater than that of the real problem: it depends only on the number of students and equals the factorial of the number of students. For 200 students it equals 200!, while the total number of ways to cluster 200 students into five classes with the same capacity equals only 200!/(40!)^5. A huge number of gene combinations within one partition represent the same class, so too many different chromosomes represent the same solution. Manipulation by the GA operators only produces different chromosomes that represent the same solution, or old solutions created previously. In other words, the GA operators cannot improve the fitness function in each generation; since this principle of GAs cannot be attained, the approach clearly cannot give a good solution.
The comparison of PBA and CBA shows that CBA is better than PBA. With the binary chromosomal representation in CBA, the GA operators can effectively manipulate chromosomes and easily produce new, different solutions. This gives GAs more
possibility to explore the search space. Finally, a potential solution can be found in reasonable time. CBA also reduces the search space, to two raised to the width of the chromosome: for 200 students and five classes it equals only 2^40, which is relatively small compared to the search space of the real problem (200!/(40!)^5). From the largest intelligence gap in each class we conclude that CBA can produce classes with the intelligence gap as small as possible, meaning that more students with similar intelligence levels are allocated to the same class.
We have proposed two approaches for GAs to solve NSAP. Using PBA to solve NSAP makes GAs lose their power: NSAP is a hard problem for PBA. Although the time complexity of PBA is simpler than that of CBA, the experiments proved that this approach cannot solve the problem. PBA traps GAs in local optima, because its operators do not effectively produce a better population in the next generation; the total number of chromosomes that GAs can generate is much larger than the total number of ways to cluster n students into nc classes. On the contrary, CBA succeeds in solving NSAP. In this approach, chromosomes represent students chosen as the centres of the classes; it can minimize the largest intelligence gap in each class and it also reduces the search space. Since we used only one combination of GA operators with CBA, future research should try to improve the ability of GAs with more than one combination of operators, or hybridize GAs with other artificial intelligence approaches.


4. FUZZY SYSTEMS

Fuzzy logic is widely used in machine control. The term itself inspires a certain
skepticism, sounding equivalent to "half-baked logic" or "bogus logic", but the "fuzzy" part
does not refer to a lack of rigour in the method, rather to the fact that the logic involved can
deal with fuzzy concepts—concepts that cannot be expressed as "true" or "false" but rather as
"partially true". Although genetic algorithms and neural networks can perform just as well as
fuzzy logic in many cases, fuzzy logic has the advantage that the solution to the problem can
be cast in terms that human operators can understand, so that their experience can be used in
the design of the controller. This makes it easier to mechanize tasks that are already
successfully performed by humans.

4.1 FUZZY SETS

The input variables in a fuzzy control system are in general mapped by sets of membership functions known as "fuzzy sets". The process of converting a crisp input value to a fuzzy value is called "fuzzification".
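As a minimal sketch of fuzzification with triangular membership functions (the temperature sets and breakpoints below are illustrative, not from the report):

    def triangular(x, a, b, c):
        # Membership of x in a triangular fuzzy set with feet a, c and peak b.
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    # Fuzzify a crisp reading of 18 degrees against three overlapping sets.
    t = 18.0
    memberships = {
        "cold": triangular(t, -10, 0, 20),   # 0.1
        "warm": triangular(t, 10, 20, 30),   # 0.8
        "hot":  triangular(t, 20, 30, 45),   # 0.0
    }
    # The crisp value belongs partly to "cold" and mostly to "warm".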

A control system may also have various types of switch, or "ON-OFF", inputs along
with its analog inputs, and such switch inputs of course will always have a truth value equal
to either 1 or 0, but the scheme can deal with them as simplified fuzzy functions that happen
to be either one value or another.

Given "mappings" of input variables into membership functions and truth values, the
microcontroller then makes decisions for what action to take based on a set of "rules", each of
the form:

IF brake temperature IS warm AND speed IS not very fast


THEN brake pressure IS slightly decreased.

In this example, the two input variables, "brake temperature" and "speed", have values defined as fuzzy sets. The output variable, "brake pressure", is also defined by a fuzzy set that can have values like "static", "slightly increased", "slightly decreased", and so on. This rule by itself seems puzzling, since it looks like it could be used without bothering with fuzzy logic, but remember that the decision is based on a set of rules:

• All the rules that apply are invoked, using the membership functions and truth values
obtained from the inputs, to determine the result of the rule.


• This result in turn will be mapped into a membership function and truth value
controlling the output variable.
• These results are combined to give a specific ("crisp") answer, the actual brake
pressure, a procedure known as "defuzzification".

This combination of fuzzy operations and rule-based "inference" describes a "fuzzy expert
system".

4.2 FUZZY CONTROL IN DETAIL

Fuzzy controllers are very simple conceptually. They consist of an input stage, a
processing stage, and an output stage. The input stage maps sensor or other inputs, such as
switches, thumbwheels, and so on, to the appropriate membership functions and truth values.
The processing stage invokes each appropriate rule and generates a result for each, then
combines the results of the rules. Finally, the output stage converts the combined result back
into a specific control output value.

The most common shape of membership function is triangular, although trapezoidal and bell curves are also used; the shape is generally less important than the number of curves and their placement. From three to seven curves are generally appropriate to cover the required range of an input value, or the "universe of discourse" in fuzzy jargon.

As discussed earlier, the processing stage is based on a collection of logic rules in the form of
IF-THEN statements, where the IF part is called the "antecedent" and the THEN part is called
the "consequent". Typical fuzzy control systems have dozens of rules.

Consider a rule for a thermostat:

IF (temperature is "cold") THEN (heater is "high")

This rule uses the truth value of the "temperature" input, which is some truth value of
"cold", to generate a result in the fuzzy set for the "heater" output, which is some value of
"high". This result is used with the results of other rules to finally generate the crisp
composite output. Obviously, the greater the truth value of "cold", the higher the truth value
of "high", though this does not necessarily mean that the output itself will be set to "high",
since this is only one rule among many. In some cases, the membership functions can be
modified by "hedges" that are equivalent to adjectives. Common hedges include "about",
"near", "close to", "approximately", "very", "slightly", "too", "extremely", and "somewhat".
These operations may have precise definitions, though the definitions can vary considerably
between different implementations. "Very", for one example, squares membership functions;
since the membership values are always less than 1, this narrows the membership function.
"Extremely" cubes the values to give greater narrowing, while "somewhat" broadens the
function by taking the square root.

In practice, the fuzzy rule sets usually have several antecedents that are combined
using fuzzy operators, such as AND, OR, and NOT, though again the definitions tend to vary:
AND, in one popular definition, simply uses the minimum weight of all the antecedents,
while OR uses the maximum value. There is also a NOT operator that subtracts a
membership function from 1 to give the "complementary" function.

There are several different ways to define the result of a rule, but one of the most
common and simplest is the "max-min" inference method, in which the output membership
function is given the truth value generated by the premise.

Rules can be solved in parallel in hardware, or sequentially in software. The results of all the rules that have fired are "defuzzified" to a crisp value by one of several methods. There are dozens in theory, each with various advantages and drawbacks.

The "centroid" method is very popular, in which the "center of mass" of the result
provides the crisp value. Another approach is the "height" method, which takes the value of
the biggest contributor. The centroid method favors the rule with the output of greatest area,
while the height method obviously favors the rule with the greatest output value.
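The pieces fit together as in the following sketch: min/max fuzzy operators, max-min inference (each output set is clipped at the truth value of its rule's premise), max aggregation, and centroid defuzzification. The rules, membership functions and truth values are illustrative:

    import numpy as np

    f_and, f_or, f_not = min, max, lambda a: 1.0 - a   # popular definitions

    def centroid(universe, aggregated):
        # "Center of mass" of the aggregated output membership curve.
        return np.sum(universe * aggregated) / np.sum(aggregated)

    # Rules (antecedent truth values already obtained by fuzzification):
    #   IF temp IS cold                  THEN heater IS high
    #   IF temp IS warm AND air IS humid THEN heater IS low
    cold, warm, humid = 0.7, 0.3, 0.6
    heater = np.linspace(0.0, 100.0, 201)              # output universe (%)
    high = np.clip((heater - 50.0) / 50.0, 0.0, 1.0)   # membership of "high"
    low = np.clip((50.0 - heater) / 50.0, 0.0, 1.0)    # membership of "low"

    # Max-min inference: clip each consequent at its premise truth value,
    # then aggregate the clipped sets with max (fuzzy OR).
    aggregated = np.maximum(np.minimum(high, cold),
                            np.minimum(low, f_and(warm, humid)))
    print(centroid(heater, aggregated))                # crisp heater setting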

Fuzzy control system design is based on empirical methods, basically a methodical approach
to trial-and-error. The general process is as follows:

• Document the system's operational specifications and inputs and outputs.


• Document the fuzzy sets for the inputs.
• Document the rule set.
• Determine the defuzzification method.
• Run through test suite to validate system, adjust details as required.
• Complete document and release to production.

4.3 FUZZY LOGIC CASE STUDY

Among the various metrics for software quality prediction, the inspection metrics, inspection rate and error density, are the most important for industrial
applications. Ebenau was the first to employ inspection metrics to identify modules that are likely to be error prone. Software inspection is considered an essential practice for developing high quality software. If potentially error-prone modules can be identified with a relatively high degree of accuracy, at little or no extra cost, by analyzing the existing inspection data, then project managers can use such findings to optimize software development productivity; they may test potentially erroneous modules more extensively than others. Past researchers focused their attention on empirically validating the cost effectiveness of inspection methods. Barnard and Price identified 9 key metrics used in planning, monitoring, controlling and improving inspection processes. Sun Sup So proposed a fuzzy logic based approach to predict error-prone modules.
In order to develop a software quality assessment model, one must first identify factors that strongly influence software quality and the number of residual errors. Unfortunately, it is extremely difficult to accurately identify the relevant quality factors. Furthermore, the degree of influence is imprecise in nature: although exact and discrete metric data are used, the inference rules may be fuzzy in nature. Suppose, for example, that an inspection team reports an inspection rate of over 380 LoC/h whereas the typical inspection rate ranges from 150 to 200 LoC/h [22]. One can convincingly argue that such an inspection rate significantly exceeds the reported average from industrial applications, and experts will most likely agree unanimously with this conclusion. However, the assessment is fuzzy because the term "significantly" cannot be precisely quantified. Moreover, if a team reports an inspection rate of 275 LoC/h, experts are likely to differ in their opinions as to whether the inspection rate exceeded the industrial norm and by how much: the decision boundary is not well defined. Due to its natural ability to model imprecise and fuzzy aspects of data and rules, fuzzy logic is an attractive alternative in situations where approximate reasoning is called for. A prototype system can be developed based solely on domain knowledge, without relying on extensive training data, and its performance can be gradually tuned as more data become available.
A fuzzy logic-based prediction model is proposed as follows. Let

I = inspection rate,
E = error density.

The complexity and coefficient matrices for inspection rate and error density, the quality grades, and the marks of a module are defined by tables in the source paper:

Complexity and coefficient matrix of I (complexity table no. 1)
Complexity and coefficient matrix of E (complexity table no. 2)
Definition of quality grades
Definition of the marks of a module


Fuzzification:

The complexity attributes low, medium and high of the two metrics, inspection rate and error density, are taken as triangular fuzzy numbers for I ≥ 100 and 0 ≤ E ≤ 35. Ri for I < 100 is obtained directly from complexity table no. 1, while Ri for I ≥ 100 is obtained by the process of fuzzification and defuzzification. Re for E > 35 is obtained directly from complexity table no. 2, while Re for 0 ≤ E ≤ 35 is obtained by fuzzification and defuzzification. The pictorial representations are given in the figures below.
Fuzzy Pictorial representation of I

Fuzzy Pictorial representation of E


Defuzzification:

The defuzzified values Ri and Re are computed from the triangular fuzzy numbers by the formula defined in the source paper.

The study proposes a fuzzy logic based, precise approach to quantify the quality of software. Software can be given quality grades on the basis of two metrics: inspection rate per hour and error density. The prediction of software quality is very easy with this approach, and precise quality grades can be given to any software. Software is graded into 10 quality grades: modules with grade 1 are considered the most error prone, while those with quality grade 10 are considered satisfactory on the basis of quality. Triangular fuzzy numbers have been used for inspection rate and errors/kLOC. The fuzzy logic methodology used in the study is sufficiently general and can be applied to other areas of quantitative software engineering.


5. BAYESIAN NETWORK

A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For example, a Bayesian network could
represent the probabilistic relationships between diseases and symptoms. Given symptoms,
the network can be used to compute the probabilities of the presence of various diseases.

Formally, Bayesian networks are directed acyclic graphs whose nodes represent
random variables in the Bayesian sense: they may be observable quantities, latent variables,
unknown parameters or hypotheses. Edges represent conditional dependencies; nodes which
are not connected represent variables which are conditionally independent of each other.
Each node is associated with a probability function that takes as input a particular set of
values for the node's parent variables and gives the probability of the variable represented by the node. For example, if the parents are m Boolean variables, then the probability function could be represented by a table of 2^m entries, one entry for each of the 2^m possible combinations of its parents being true or false.
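As an illustration of such a table, here is a hypothetical node with two Boolean parents and its conditional probability table; all names and probabilities are invented for the example:

    from itertools import product

    # Node "Fever" with Boolean parents "Flu" and "Infection": the CPT has
    # 2**m = 4 rows, one per combination of parent values, each storing
    # P(Fever = True | parents).
    cpt_fever = {
        (True, True): 0.90,
        (True, False): 0.70,
        (False, True): 0.60,
        (False, False): 0.05,
    }

    def p_fever(flu, infection, fever=True):
        p_true = cpt_fever[(flu, infection)]
        return p_true if fever else 1.0 - p_true

    # The table enumerates all 2**m parent combinations.
    assert set(cpt_fever) == set(product([True, False], repeat=2))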

Efficient algorithms exist that perform inference and learning in Bayesian networks.
Bayesian networks that model sequences of variables (e.g. speech signals or protein
sequences) are called dynamic Bayesian networks. Generalizations of Bayesian networks that
can represent and solve decision problems under uncertainty are called influence diagrams.

Bayesian networks are used for modelling knowledge in computational biology and
bioinformatics (gene regulatory networks, protein structure and gene expression analysis),
medicine, document classification, information retrieval, image processing, data fusion,
decision support systems, engineering, gaming and law.


6. SWARM INTELLIGENCE

Swarm intelligence (SI) describes the collective behaviour of decentralized, self-organized systems, natural or artificial. The concept is employed in work on artificial intelligence. The expression was introduced by Gerardo Beni and Jing Wang in 1989, in the context of cellular robotic systems.

SI systems are typically made up of a population of simple agents or boids interacting locally with one another and with their environment. The agents follow very simple rules, and
although there is no centralized control structure dictating how individual agents should
behave, local, and to a certain degree random, interactions between such agents lead to the
emergence of "intelligent" global behavior, unknown to the individual agents. Natural
examples of SI include ant colonies, bird flocking, animal herding, bacterial growth, and fish
schooling.

The application of swarm principles to robots is called swarm robotics, while 'swarm
intelligence' refers to the more general set of algorithms. 'Swarm prediction' has been used in
the context of forecasting problems.

6.1 EXAMPLE ALGORITHMS

6.1.1 Ant colony optimization

Ant colony optimization (ACO) is a class of optimization algorithms modeled on the actions of an ant colony. ACO methods are useful in problems that need to find paths to
goals. Artificial 'ants' - simulation agents - locate optimal solutions by moving through a
parameter space representing all possible solutions. Real ants lay down pheromones directing
each other to resources while exploring their environment. The simulated 'ants' similarly
record their positions and the quality of their solutions, so that in later simulation iterations
more ants locate better solutions. One variation on this approach is the bees algorithm, which
is more analogous to the foraging patterns of the honey bee.
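A minimal ACO sketch under stated assumptions (a complete graph given as a distance matrix; the parameter values are conventional defaults, not from the report):

    import random

    def aco_path(dist, source, target, n_ants=20, n_iters=50,
                 alpha=1.0, beta=2.0, rho=0.5, q=1.0):
        n = len(dist)
        tau = [[1.0] * n for _ in range(n)]        # pheromone on each edge
        best, best_len = None, float("inf")
        for _ in range(n_iters):
            tours = []
            for _ in range(n_ants):
                node, tour = source, [source]
                while node != target:
                    choices = [j for j in range(n) if j not in tour]
                    # Edge attractiveness: pheromone**alpha * closeness**beta
                    weights = [tau[node][j] ** alpha / dist[node][j] ** beta
                               for j in choices]
                    node = random.choices(choices, weights=weights)[0]
                    tour.append(node)
                length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
                tours.append((tour, length))
                if length < best_len:
                    best, best_len = tour, length
            for i in range(n):                     # pheromone evaporation
                for j in range(n):
                    tau[i][j] *= 1.0 - rho
            for tour, length in tours:             # deposit: shorter -> more
                for a, b in zip(tour, tour[1:]):
                    tau[a][b] += q / length
        return best, best_len

    # Example: four fully connected nodes.
    dist = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 3], [10, 4, 3, 0]]
    print(aco_path(dist, source=0, target=3))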

6.1.2 Particle swarm optimization

Particle swarm optimization (PSO) is a global optimization algorithm for dealing with problems in which a best solution can be represented as a point or surface in an n-dimensional space. Hypotheses are plotted in this space and seeded with an initial velocity, as well as a communication channel between the particles. Particles then move through the solution space, and are evaluated according to some fitness criterion after each time step. Over time, particles are accelerated towards those particles within their communication grouping which have better fitness values. The main advantage of such an approach over other global minimization strategies such as simulated annealing is that the large number of members that make up the particle swarm makes the technique impressively resilient to the problem of local minima.
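A minimal global-best PSO sketch; the inertia weight and acceleration coefficients below are conventional illustrative values, not from the report:

    import random

    def pso(f, dim, n_particles=30, n_iters=100,
            w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
        pos = [[random.uniform(lo, hi) for _ in range(dim)]
               for _ in range(n_particles)]
        vel = [[0.0] * dim for _ in range(n_particles)]
        pbest = [p[:] for p in pos]                   # personal best positions
        pbest_val = [f(p) for p in pos]
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm (global) best
        for _ in range(n_iters):
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = random.random(), random.random()
                    # Accelerate towards the particle's own best and the
                    # best found by its communication grouping (here: all).
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                                 + c2 * r2 * (gbest[d] - pos[i][d]))
                    pos[i][d] += vel[i][d]
                val = f(pos[i])                       # fitness after the step
                if val < pbest_val[i]:
                    pbest[i], pbest_val[i] = pos[i][:], val
                    if val < gbest_val:
                        gbest, gbest_val = pos[i][:], val
        return gbest, gbest_val

    # Example: minimise the sphere function in three dimensions.
    print(pso(lambda x: sum(v * v for v in x), dim=3))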

6.1.3 Gravitational search algorithm

Gravitational search algorithm (GSA) is constructed based on the law of gravity and the notion of mass interactions. The GSA algorithm uses the theory of Newtonian physics
and its searcher agents are the collection of masses. In GSA, we have an isolated system of
masses. Using the gravitational force, every mass in the system can see the situation of other
masses. The gravitational force is therefore a way of transferring information between
different masses.

6.1.4 Intelligent Water Drops

The Intelligent Water Drops algorithm (IWD) is a swarm-based, nature-inspired optimization algorithm inspired by natural rivers and how they find almost optimal paths to their destination. These near-optimal or optimal paths follow from actions and reactions occurring among the water drops and between the water drops and their riverbeds. In
the IWD algorithm, several artificial water drops cooperate to change their environment in
such a way that the optimal path is revealed as the one with the lowest soil on its links. The
solutions are incrementally constructed by the IWD algorithm. Consequently, the IWD
algorithm is generally a constructive population-based optimization algorithm.

6.2 APPLICATIONS

Swarm Intelligence-based techniques can be used in a number of applications. The U.S. military is investigating swarm techniques for controlling unmanned vehicles. The European Space Agency is considering an orbital swarm for self-assembly and interferometry. NASA is investigating the use of swarm technology for planetary mapping. A
1992 paper by M. Anthony Lewis and George A. Bekey discusses the possibility of using
swarm intelligence to control nanobots within the body for the purpose of killing cancer
tumors.


6.2.1 Crowd simulation

Artists are using swarm technology as a means of creating complex interactive systems or simulating crowds. Tim Burton's Batman Returns was the first movie to make use
of swarm technology for rendering, realistically depicting the movements of a group of bats
using the Boids system. The Lord of the Rings film trilogy made use of similar technology,
known as Massive, during battle scenes. Swarm technology is particularly attractive because
it is cheap, robust, and simple.

6.2.2 Ant-based routing

The use of swarm intelligence in telecommunication networks has also been researched, in the form of ant-based routing. This was pioneered separately by Dorigo et al. and Hewlett-Packard in the mid-1990s, with a number of variations since. Basically this uses a probabilistic routing table rewarding/reinforcing the route successfully traversed by each "ant" (a small control packet); the ants flood the network. Reinforcement of the route in the forwards direction, the reverse direction, and both simultaneously has been researched: backwards
reinforcement requires a symmetric network and couples the two directions together;
forwards reinforcement rewards a route before the outcome is known (but then you pay for
the cinema before you know how good the film is). As the system behaves stochastically and
is therefore lacking repeatability, there are large hurdles to commercial deployment. Mobile
media and new technologies have the potential to change the threshold for collective action
due to swarm intelligence.


7. CONCLUSION

The complementarity of FL, NC, GC and PR (fuzzy logic, neurocomputing, genetic computing and probabilistic reasoning) has an important consequence: in many cases a problem can be solved most effectively by using FL, NC, GC and PR in combination rather than exclusively. A striking example of a particularly effective combination is what has
come to be known as "neurofuzzy systems." Such systems are becoming increasingly visible
as consumer products ranging from air conditioners and washing machines to photocopiers
and camcorders. Less visible but perhaps even more important are neurofuzzy systems in
industrial applications. What is particularly significant is that in both consumer products and
industrial systems, the employment of soft computing techniques leads to systems which
have high MIQ (Machine Intelligence Quotient). In large measure, it is the high MIQ of SC-
based systems that accounts for the rapid growth in the number and variety of applications of
soft computing.

The successful applications of soft computing suggest that the impact of soft
computing will be felt increasingly in coming years. Soft computing is likely to play an
especially important role in science and engineering, but eventually its influence may extend
much farther. In many ways, soft computing represents a significant paradigm shift in the
aims of computing - a shift which reflects the fact that the human mind, unlike present day
computers, possesses a remarkable ability to store and process information which is
pervasively imprecise, uncertain and lacking in categoricity.


8. REFERENCES

1. Zadeh, Lotfi A., "Fuzzy Logic, Neural Networks, and Soft Computing", Communications of the ACM, Vol. 37, No. 3, March 1994, pp. 77-84.

2. Harish Mittal, Pradeep Bhatia and Puneet Goswami, "Software Quality Assessment Based on Fuzzy Logic Technique", International Journal of Soft Computing Applications, ISSN 1453-2277, Issue 3 (2008), pp. 105-112.

3. Zainudin Zukhri and Khairuddin Omar, "Solving New Student Allocation Problem with Genetic Algorithms", International Journal of Soft Computing Applications, ISSN 1453-2277, Issue 3 (2008), pp. 6-15.

4. Bhupesh Gour, T. K. Bandopadhyaya and Sudhir Sharma, "High Quality Cluster Generation of Feature Points of Fingerprint Using Neural Network", International Journal of Soft Computing Applications, ISSN 1453-2277, Issue 4 (2009), pp. 13-18.

5. Naresh K. Sinha and Madan M. Gupta, "Soft Computing and Intelligent Systems: Theory and Applications".

6. R.G.S. Asthana, "Evolutionary Algorithms and Neural Networks".
