1
2
3


[Figure: diagram with components labelled Graphical User Interface, Pattern Evaluation, Data Mining Engine]
4
5
6
– Selection
– Crossover
– Mutation
7
• Before a GA can be run, a suitable coding (or representation) for the problem must be devised.
• It is assumed that a potential solution to a problem can be represented as a set of parameters (for example, the dimensions of the beams in a bridge design).
• For example, if the problem is to maximize a function of three variables, F(x, y, z), each variable might be represented by a 10-bit binary number. The chromosome would then contain three genes and consist of 30 binary digits.
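The 30-bit coding can be illustrated with a short Python sketch. The function name `decode` and the value range `[lo, hi]` onto which each 10-bit gene is mapped are assumptions made for illustration; the slides only fix the gene width and the number of genes.

```python
GENE_BITS = 10   # bits per variable, as in the example above
NUM_GENES = 3    # one gene each for x, y and z

def decode(chromosome, lo=0.0, hi=1.0):
    """Split a 30-bit string into three genes and map each to [lo, hi].

    The range [lo, hi] is an assumption; the slides do not specify one.
    """
    if len(chromosome) != GENE_BITS * NUM_GENES:
        raise ValueError("expected a 30-bit chromosome")
    max_int = 2 ** GENE_BITS - 1  # largest value a 10-bit gene can hold
    values = []
    for i in range(NUM_GENES):
        gene = chromosome[i * GENE_BITS:(i + 1) * GENE_BITS]
        values.append(lo + (hi - lo) * int(gene, 2) / max_int)
    return tuple(values)
```

For instance, `decode("0000000000" + "1111111111" + "0000000001")` yields the smallest value for x, the largest for y, and one step above the smallest for z.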
8
• A fitness function must be devised for each problem to be solved.
• Given a particular chromosome, the fitness function returns a single numerical “fitness” or “figure of merit”.
• This value is supposed to be proportional to the “utility” or “ability” of the individual that the chromosome represents.
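As a minimal sketch of such a fitness function, the Python below scores a 30-bit chromosome made of three 10-bit genes. The objective `F(x, y, z)` used here is hypothetical and chosen only so that the result is a single positive number; the scaling of each gene to [0, 1] is likewise an assumption.

```python
GENE_BITS = 10  # bits per variable, as in the 30-bit chromosome example

def decode_unit(chromosome):
    """Split the bit string into 10-bit genes and scale each to [0, 1]."""
    max_int = 2 ** GENE_BITS - 1
    return [
        int(chromosome[i:i + GENE_BITS], 2) / max_int
        for i in range(0, len(chromosome), GENE_BITS)
    ]

def objective(x, y, z):
    # Hypothetical F(x, y, z): largest (1.0) when x = y = z = 0.5.
    return 1.0 - ((x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2)

def fitness(chromosome):
    """Return a single figure of merit; higher means a better individual."""
    x, y, z = decode_unit(chromosome)
    return objective(x, y, z)
```

With this objective, a chromosome whose genes all decode close to 0.5 scores higher than one whose genes sit at the extremes.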
9
• During the reproductive phase of the GA, individuals are selected from the population and recombined, producing offspring that make up the next generation.
• Parents are selected randomly from the population using a scheme that favours the fitter individuals.
• Once two parents have been selected, their chromosomes are recombined, typically using crossover and mutation.
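One way the reproductive phase could look in Python is sketched below. The slides do not name a particular selection scheme, so roulette-wheel (fitness-proportional) selection, single-point crossover, and bit-flip mutation are assumptions here, as are all function names and the default mutation rate.

```python
import random

def select_parent(population, fitnesses, rng=random):
    """Roulette-wheel selection: probability proportional to fitness.

    Assumes non-negative fitnesses with a positive total.
    """
    return rng.choices(population, weights=fitnesses, k=1)[0]

def crossover(a, b, rng=random):
    """Single-point crossover: swap the tails after a random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rate=0.01, rng=random):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )

def next_generation(population, fitness_fn, rng=random):
    """Build a new population of the same size from selected parents."""
    fitnesses = [fitness_fn(c) for c in population]
    children = []
    while len(children) < len(population):
        p1 = select_parent(population, fitnesses, rng)
        p2 = select_parent(population, fitnesses, rng)
        c1, c2 = crossover(p1, p2, rng)
        children.extend([mutate(c1, rng=rng), mutate(c2, rng=rng)])
    return children[:len(population)]
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing parameter settings.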
10
11
• Convergence is the progression towards increasing uniformity.
• A gene is said to have converged when 95% of the population share the same value.
• The population is said to have converged when all of the genes have converged.
• If the GA has been correctly implemented, the population will evolve over successive generations so that the fitness of the best and the average individual in each generation increases.
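The convergence criterion can be checked with a few lines of Python. This is a sketch under two assumptions: each individual is a sequence in which every position holds one gene value (for a bit string, each bit is treated as a gene), and "95% share the same value" is read as "at least 95%". The function names are illustrative.

```python
from collections import Counter

def gene_converged(population, position, threshold=0.95):
    """True when at least `threshold` of individuals share one value at `position`."""
    counts = Counter(individual[position] for individual in population)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(population) >= threshold

def population_converged(population, threshold=0.95):
    """True when every gene position has converged."""
    length = len(population[0])
    return all(gene_converged(population, i, threshold) for i in range(length))
```

A GA loop might call `population_converged` once per generation and stop when it returns true.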
12
• Robustness.
• Ability to work on large and “noisy” datasets.
• GAs perform a global search of the solution space, whereas most other algorithms use a greedy approach.
• They cope well with attribute interaction.
• With parallel approaches to genetic algorithms, scalability can be achieved.
– This characteristic is of great importance in data mining.
• Moreover, genetic algorithms have a high degree of autonomy.
13
14
15
Domain                       Application Types
Control                      gas pipeline, pole balancing, missile evasion, pursuit
Design                       semiconductor layout, aircraft design, keyboard configuration, communication networks
Scheduling                   manufacturing, facility scheduling, resource allocation
Robotics                     trajectory planning
Machine Learning             designing neural networks, improving classification algorithms, classifier systems
Signal Processing            filter design
Game Playing                 poker, checkers, prisoner’s dilemma
Combinatorial Optimization   set covering, travelling salesman, routing, bin packing, graph colouring and partitioning
• The concept is easy to understand.
• Modular and separate from the application.
• Very useful for very complex and loosely defined problems.
• With a well-defined fitness function and carefully chosen attributes, a genetic algorithm can perform much faster than
17
• Defining the fitness function can sometimes be very complicated.
• The fitness function may significantly affect the performance of the process as its complexity increases.
18
• MATLAB (Matrix Laboratory): MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation.
• Simulink
19
• Math and computation
• Algorithm development
• Data acquisition
• Modeling, simulation, and prototyping
• Data analysis, exploration, and visualization
• Scientific and engineering graphics
• Application development, including graphical user interface building
20
• In future work, the algorithm derived in this presentation will be implemented as a program in MATLAB.
• In addition, the study will focus on applying the genetic algorithm to the database.
• Finally, it will be compared with conventional data mining techniques in order to identify the benefits of using genetic algorithms.
21
22
23
24