
Genetic Programming Based Image Segmentation with Applications to Biomedical Object Detection

Tarundeep Singh (1), Nawwaf Kharma (2), Mohmmad Daoud (3) and Rabab Ward (4)
(1, 2) Electrical & Computer Engineering Department, Concordia University, Montreal, Quebec, Canada
(3) Department of Electrical & Computer Engineering, U. of Western Ontario, London, Ontario, Canada
(4) Department of Electrical & Computer Engineering, U. of British Columbia, Vancouver, B.C., Canada
t_dhot@encs.concordia.ca (1), kharma@ece.concordia.ca (2), mdaoud@imaging.robarts.ca (3), rababw@icics.ubc.ca (4)

ABSTRACT
Image segmentation is an essential process in many image analysis applications and is mainly used for automatic object recognition purposes. In this paper, we define a new genetic programming based image segmentation algorithm (GPIS). It uses a primitive image-operator based approach to produce linear sequences of MATLAB® code for image segmentation. We describe the evolutionary architecture of the approach and present results obtained after testing the algorithm on a biomedical image database for cell segmentation. We also compare our results with another EC-based image segmentation tool called GENIE Pro. We found the results obtained using GPIS were more accurate as compared to GENIE Pro. In addition, our approach is simpler to apply and evolved programs are available to anyone with access to MATLAB®.

Categories and Subject Descriptors
I.4.6 [Image Processing and Computer Vision]: Segmentation – pixel classification.

General Terms: Algorithms, Experimentation.

Keywords: Image Segmentation, Genetic Programming.

1. INTRODUCTION
Image segmentation is the process of extraction of objects of interest from a given image. It allows certain regions in the image to be identified as an object based on some distinguishing criteria, for example, pixel intensity or texture. It is an important part of many image analysis techniques as it is a crucial first step of the imaging process and greatly impacts any subsequent feature extraction or classification. It plays a critical role in automatic object recognition systems for a wide variety of applications like medical image analysis [8, 9, 14, 15], geosciences and remote sensing [2, 3, 4, 5, 10, 11], and target detection [10, 11, 16].
However, image segmentation is an ill-defined problem. Even though numerous approaches have been proposed in the past [7, 12, 13], there is still no general segmentation framework that can perform adequately across a diverse set of images [1]. In addition, most image segmentation techniques exhibit a strong domain or application-type dependency [7, 12]. Automated segmentation algorithms often include a priori information of its subjects [8], making use of well-designed segmentation techniques restricted to a small set of imagery.
In this paper, we propose a new, simple image segmentation algorithm called Genetic Programming based Image Segmentation (GPIS) that uses a primitive image-operator based approach for segmentation and present results. The algorithm does not require any a priori information about objects to be segmented other than a set of training images. In addition, the algorithm is implemented on MATLAB® and uses its standard image-function library. This allows easy access to anyone with MATLAB®.
In the following sections, we provide a brief introduction to relevant work in GP based image segmentation and image analysis, followed by an overview of our approach in Section 1.3. Section 2 describes the methodology of our algorithm and the experimental setup for compiling results. Finally, Section 3 presents the results of the experiments conducted on a biomedical image database for cell segmentation purposes. We also compare our results with another EC-based image segmentation algorithm called GENIE Pro.

1.1 Related Work
One of the initial works in this field was published by Tackett [16] in 1993. He applied GP to develop a processing tree capable of classifying features extracted from IR images. These evolved features were later used to construct a classifier for target detection. On the same lines, in 1995, Daida et al. [5, 6] used GP to derive spatial classifiers for remote sensing purposes. This was the first time GP was used for image processing applications in geosciences and remote sensing.
In 1996, Poli [14] proposed an interesting approach to image analysis based on evolving optimal filters. The approach viewed image segmentation, image enhancement and feature detection purely as a filtering problem. In addition, he outlined key criteria while building terminal sets, function sets and fitness functions for an image analysis application.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
GECCO’09, July 8–12, 2009, Montréal, Québec, Canada.
Copyright 2009 ACM 978-1-60558-325-9/09/07...$5.00.

In 1999, Howard et al. [10, 11] presented a series of works using GP for automatic object detection in real world and military image analysis applications. They proposed a staged evolutionary approach for evolution of target detectors or discriminators. This resulted in achieving practical evolution times.
In 1999, another interesting approach was proposed by Brumby et al. [4]. They used a hybrid evolutionary approach to evolve image extraction algorithms for remote sensing applications. These algorithms were evolved using a pool of low level image processing operators. On the same lines, Bhanu et al. [2, 3] used GP to evolve composite operators for object detection. These operators were synthesized from combinations of primitive image processing operations used in object detection. In order to control the code-bloat problem, they also proposed size limits for the composite operators.
In 2003, Roberts and Claridge [15] proposed a GP based image segmentation technique for segmenting skin lesion images. A key feature of their work was the ability of the GP to generalize based on a small set of training images.
Our approach is motivated by the works of Tackett [16], Brumby et al. [4] and Bhanu et al. [2, 3]. They all effectively implemented a primitive image operator based approach for image analysis. This is similar to our approach. In addition, we have used the key criteria outlined by Poli [14] as references while building our algorithm.

1.2 GENIE Pro
GENIE Pro [4, 9] is a general purpose, interactive and adaptive GA-based image segmentation and classification tool. GENIE Pro uses a hybrid GA to assemble image-processing algorithms or pipelines from a collection of low-level image processing operators (for example edge detectors, texture measures, spectral orientations and morphological filters). The role of each evolved pipeline is to classify each pixel as feature or non-feature.
The GA begins with a population of random pipelines, performs fitness evaluation for each pipeline in the population and selects the fitter pipelines to produce offspring pipelines using crossover and mutation. In order to compute the fitness of a pipeline, the resultant segmentation produced by the pipeline is compared to a set of training images. These training images are produced by manual labeling of pixels by the user as True (feature) or False (non-feature) pixels using an in-built mark-up tool called ALLADIN. Finally, when a run of GENIE Pro is concluded, the fittest pipeline in the population is selected and combined using a linear classifier (Fisher Discriminant) to form an evolved solution that can be used to segment new images.
GENIE Pro was originally developed for analyzing multispectral satellite data but has later also been applied to biomedical feature-extraction problems [9]. We use GENIE Pro as a comparative method to check the effectiveness of our algorithm.

1.3 Overview of Our Work
In this paper, we describe a new genetic programming based image segmentation algorithm, GPIS, that uses a primitive image-operator based approach for segmentation. Each segmentation algorithm can be viewed as a unique combination of image analysis operators that are successfully able to extract desired regions from an image. If we are able to describe a sufficient set of these image analysis operators, it is possible to build multiple segmentation algorithms that segment a wide variety of images. In GPIS, we define a pool of low level image analysis operators. The GP searches the solution space for the best possible combination of these operators that is able to perform the most accurate segmentation. From now on, we refer to these image analysis operators as primitives. Each individual in a population is a combination of these primitives and represents an image segmentation program. Therefore, GPIS typically breeds a population of segmentation programs in order to evolve one accurate image segmentation program.

2. METHODOLOGY
The proposed algorithm GPIS is designed as a general tool for learning based segmentation of images. In this paper, particular attention is given to testing it on biomedical images. Our approach does not require a particular image format or size and works equally well on both color and grayscale images in any MATLAB® compatible format.
For the purpose of learning, a directory with both input images and matching ground truths (GTs) must be provided. From this point onwards, we call this a training set. Every input image must have a corresponding GT of the same size and format. The GT image is a binary image showing the human expert assessment of the boundaries of the objects of interest; all pixels inside those boundaries are by definition object pixels and all pixels outside the boundaries are by definition non-object pixels. Pixels on the boundary of an object are by definition also object pixels.
GPIS has two stages of operation. Stage 1 is a learning phase in which GPIS uses the training set to evolve a MATLAB® program which meets a user-defined threshold of segmentation accuracy relative to the input images of the training set.
In the second stage, this evolved individual is evaluated for its ability to segment unseen images of the same type as the training images. The accuracy results achieved here are from here on called validation accuracy.
In a real world situation, due to lack of GTs for unseen images, validation accuracy will take the form of the subjective assessment of a human user. However, for this paper, the authors evaluate the quality, i.e. the validation accuracy, of the individual evolved by GPIS by comparing its segmentation results to the matching GT images. We report the results of our evaluation in the Results section (Section 3) of this paper.

2.1 Stage 1: Learning phase of GPIS
GPIS operates in a typical evolutionary cycle in which a population of potential program solutions (each meant to segment images) is subjected to repeated selection and diversification until at least one of the individuals meets the termination criteria. The flowchart of the learning stage is presented in Figure 1.
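The select-and-diversify cycle of Stage 1 can be illustrated with a minimal Python sketch (Python is used here only for illustration; GPIS itself evolves MATLAB® programs). The function names `evolve`, `tournament`, `random_bits`, `fraction_of_ones` and `flip_one` are hypothetical, and the toy bit-string problem stands in for image segmentation. The percentages (tournament window 10%, parents 50%, elite 1%) are taken from Sections 2.1.4 and 2.1.5; the stopping rule is simplified to a fitness threshold, and the plateau test of Section 2.1.3 and the injection step of Section 2.1.7 are left out.

```python
import random

def tournament(population, fitnesses, window, rng):
    """Return the fittest of `window` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), window)
    return population[max(contenders, key=lambda i: fitnesses[i])]

def evolve(init, evaluate, vary, pop_size=50, threshold=0.95,
           max_generations=200, seed=0):
    """Generic evolutionary cycle; returns (best individual, fitness, generation)."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    window = max(2, round(0.10 * pop_size))   # tournament window: 10% of population
    n_parents = round(0.50 * pop_size)        # parents: 50% of population
    n_elite = max(1, round(0.01 * pop_size))  # elite: top 1%, at least one
    for generation in range(max_generations + 1):
        fitnesses = [evaluate(ind) for ind in population]
        ranked = sorted(zip(fitnesses, range(pop_size)), reverse=True)
        best_fitness, best_index = ranked[0]
        # Simplified termination: stop at the threshold or the generation cap.
        if best_fitness >= threshold or generation == max_generations:
            return population[best_index], best_fitness, generation
        elite = [population[i] for _, i in ranked[:n_elite]]
        parents = [tournament(population, fitnesses, window, rng)
                   for _ in range(n_parents)]
        offspring = [vary(rng.choice(parents), rng)
                     for _ in range(pop_size - n_parents - n_elite)]
        population = elite + parents + offspring

# Toy stand-in problem (an assumption, not part of GPIS): maximise the share of 1s.
def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(20)]

def fraction_of_ones(bits):
    return sum(bits) / len(bits)

def flip_one(bits, rng):
    child = list(bits)
    k = rng.randrange(len(child))
    child[k] = 1 - child[k]
    return child

best, best_fitness, generations = evolve(random_bits, fraction_of_ones, flip_one)
```

In GPIS the individuals would be chromosomes of image operators and `evaluate` would run the encoded program on the training images; only the control flow is meant to carry over.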

Figure 1. Flowchart of GPIS

2.1.1 Representation and Initialization
In our scheme, the genome of an individual encodes a MATLAB® program that processes an image. The input to the program is an image file and the output of the MATLAB® program is an image of the same size and format. This output image file is a segmented version of the input image.
The general layout of a gene is shown in Figure 2 (a). As seen in the figure, each gene specifies information about the primitive operator it encodes, the input images to the operator and parameter settings for the operator. This corresponds to a few lines (1-3) of the equivalent MATLAB® program. The gene consists of five parts. The first part contains the name of the primitive operator and the second and third parts contain the possible input images to the operator. Based on the nature of the primitive operator, a gene may have one or two input images. The fourth part contains weights or parameter values for the primitive operator and the fifth part encodes the nature of the Structuring Element or SE (only in case of morphological operations) or a secondary Filter Parameter or FP (only in case of filter operators).
The phenomic representation (chromosome) is a simple combination of genes, as shown in Figure 2 (b). The chromosome represents a complete MATLAB® segmentation program. There is a one-to-one mapping between the genome and the phenome as shown in Figure 2 (c). It also shows the representation of the knowledge structure used by the genetic learning system.
We use a pool of 20 primitive operators. Table 1 provides the complete list of all primitive image analysis operators in the gene pool along with the typical number of inputs required for each operator.
Initialization creates a starting population for the GP. The initial population of the GP is randomly generated, i.e. chromosomes are formed by a randomly assigned sequence of operators. The genomic initialization is also random, i.e. parameter values of operators are also assigned randomly, based on the operator type. For practical reasons, the size of each chromosome is limited to a maximum length of 15.

Figure 2. (a) Typical layout of a gene (b) Typical layout of a chromosome comprising n genes (c) One-to-one mapping of the genome and phenome

Table 1. Primitive image analysis operators in the gene pool
Operator Name   Description              Inputs   Operator Type
ADDP            Add Planes               2        Arithmetic
SUBP            Subtract Planes          2        Arithmetic
MULTP           Multiply Planes          2        Arithmetic
DIFF            Absolute Difference      2        Arithmetic
AVER            Averaging Filter         1        Filter
DISK            Disk Filter              1        Filter
GAUSSIAN        Gaussian Filter          1        Filter
LAPL            Laplacian Filter         1        Filter
UNSHARP         Unsharp Filter           1        Filter
LP              Lowpass Filter           1        Filter
HP              Highpass Filter          1        Filter
DIL             Image Dilate             1        Morphological
ERODE           Image Erode              1        Morphological
OPEN            Image Open               1        Morphological
CLOSE           Image Close              1        Morphological
OPCL            Image Open-Close         1        Morphological
CLOP            Image Close-Open         1        Morphological
HISTEQ          Histogram Equalization   1        Enhancement
ADJUST          Image Adjust             1        Enhancement
THRESH          Thresholding             1        Post-processing
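The five-part gene and the random initialization described in Section 2.1.1 might be represented as in the following Python sketch. The operator names, input counts and types are copied from Table 1 and the length limit of 15 comes from the text. Everything else is an assumption made for illustration: the `Gene` class, the helper names, the rule that a gene may read the source image `d1` or any earlier output `io<k>` (naming borrowed from Figure 5), and all parameter ranges.

```python
import random
from dataclasses import dataclass

# (name, number of inputs, operator type), copied from Table 1
OPERATORS = [
    ("ADDP", 2, "Arithmetic"), ("SUBP", 2, "Arithmetic"),
    ("MULTP", 2, "Arithmetic"), ("DIFF", 2, "Arithmetic"),
    ("AVER", 1, "Filter"), ("DISK", 1, "Filter"), ("GAUSSIAN", 1, "Filter"),
    ("LAPL", 1, "Filter"), ("UNSHARP", 1, "Filter"), ("LP", 1, "Filter"),
    ("HP", 1, "Filter"), ("DIL", 1, "Morphological"),
    ("ERODE", 1, "Morphological"), ("OPEN", 1, "Morphological"),
    ("CLOSE", 1, "Morphological"), ("OPCL", 1, "Morphological"),
    ("CLOP", 1, "Morphological"), ("HISTEQ", 1, "Enhancement"),
    ("ADJUST", 1, "Enhancement"), ("THRESH", 1, "Post-processing"),
]
MAX_GENES = 15  # chromosome length limit stated in the text


@dataclass
class Gene:
    operator: str    # part 1: primitive operator name
    input1: str      # part 2: first input image
    input2: str      # part 3: second input image, "0" if unused
    weight: float    # part 4: weight or parameter value
    se_or_fp: float  # part 5: structuring element code or filter parameter


def random_gene(position, rng):
    """Build one random gene; `position` is its 1-based index in the chromosome."""
    name, n_inputs, kind = rng.choice(OPERATORS)
    # Assumption: inputs are the source image d1 or the output of an earlier gene.
    sources = ["d1"] + [f"io{k}" for k in range(1, position)]
    input1 = rng.choice(sources)
    input2 = rng.choice(sources) if n_inputs == 2 else "0"
    # Hypothetical parameter ranges, chosen per operator type.
    if kind == "Filter":
        weight, extra = float(rng.randint(3, 9)), round(rng.uniform(0.1, 2.0), 4)
    elif kind == "Morphological":
        weight, extra = 0.0, float(rng.randint(1, 3))
    elif kind == "Post-processing":
        weight, extra = round(rng.uniform(0.0, 1.0), 5), 0.0
    else:
        weight, extra = 0.0, 0.0
    return Gene(name, input1, input2, weight, extra)


def random_chromosome(rng):
    """Return a random sequence of 1 to MAX_GENES genes."""
    length = rng.randint(1, MAX_GENES)
    return [random_gene(pos, rng) for pos in range(1, length + 1)]


chromosome = random_chromosome(random.Random(1))
```

Keeping the operator type next to each name lets the initializer pick parameters "based on the operator type", as the text puts it, without a separate lookup.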

In addition, at the time of initialization, the size of the population along with values of crossover rates and mutation rates are assigned by the user.

2.1.2 Fitness Evaluation
A segmented image consists of positive (object) and negative (non-object) pixels. Ideally the segmentation of an image would result in an output image where positive pixels cover object pixels perfectly and the negative pixels cover non-object pixels perfectly. Based on this idea, we can view segmentation as a pixel-classification problem. The task of the segmentation program now becomes assignment of the right class to every pixel in the image. As such, we can apply a measure of classification accuracy to the problem of image segmentation. Every segmentation program can be expected to identify not only pixels belonging to the objects of interest (True Positives, TPs), but also to label some object pixels as non-object (False Negatives, FNs). Further, in addition to identifying non-object pixels (True Negatives, TNs), some pixels belonging to non-objects can be identified as object pixels (False Positives, FPs).
Therefore for an ideal segmentation, the number of FPs and FNs should be zero while the number of TPs and TNs should be exactly equal to the number of object and non-object pixels. If we normalize the value of TPs and TNs by the total number of object and non-object pixels respectively, their individual values would be 1 in the best case scenario and 0 in the worst case scenario. However, for the segmentation problem, achieving this is a challenging task, thus we define two more measures based on TPs, TNs, FPs and FNs called the False Positive Rate (FPR) and False Negative Rate (FNR). FPR is the proportion of non-object pixels that were erroneously reported as being object pixels. FNR is the proportion of object pixels that were erroneously reported as non-object pixels.
Therefore, for an ideal segmentation, the values of FPR and FNR should be zero. For finding the accuracy of a segmentation program, we use a pixel-based accuracy formula based on FPR and FNR. This formula reflects the training and validation accuracy for GPIS. It is as follows:
(1)
where FPR represents False Positive Rate and FNR represents False Negative Rate.
The above formula for accuracy extends the image segmentation problem to a pixel-classification problem. Therefore, ideally the value of accuracy should be 1 (or 100%) for a perfectly segmented image. We also see that the formula is monotonic, i.e. if image A is better segmented than image B then Accuracy (A) > Accuracy (B).
However, we further extend this formula by introducing a term that penalizes longer programs. The fitness function for GPIS is as follows:
(2)
where FPR represents False Positive Rate, FNR represents False Negative Rate, len represents the length of the program, and β is a scaling factor for the length of a program, such that β belongs to [0.004, 0.008]. We found this range sufficient for our purpose.

2.1.3 Termination Criteria
Termination of the GP is purely fitness based: the evolutionary cycle continues until there is no major change in fitness over 10 generations. To do this, we first determine a minimum acceptable fitness value based on our trial runs. This value was found to be 95% for the database in use. Until this fitness is achieved, the GP keeps running. Once it is reached, a mechanism of calculating cumulative means of the fitness of successive generations comes into play. If the absolute difference between the means of 10 successive generations is less than 5% of the highest fitness achieved, the GP stops. If, however, the GP is used on any other database, a default value of 90% is set. The termination criterion can be defined as follows:
|current fitness – mean fitness(10 gen)| < 0.05 × highest fitness

2.1.4 Parent Selection
Parent selection is done to select chromosomes that undergo diversification operations. In order to do this, we use a tournament selection scheme. It is chosen instead of rank selection as it is computationally more efficient. The size of the tournament window λ is kept at 10% of the size of the population. The number of parents selected is 50% of the size of the population.

2.1.5 Elitism
We use elitism as a means of saving the top 1% of the chromosomes of a population. The best 1% of the chromosomes in the population are copied without change to the next generation.

2.1.6 Diversification
We employ five genetic operators in total: one crossover and four mutation operators. These are selected probabilistically based on their respective rates of crossover and mutation.
Crossover: We use a 1-point crossover for our GP. Two parents are chosen randomly from the parent pool. A random location is chosen in each of the parent chromosomes. The subsequences before and after this location in the parents are exchanged, creating two offspring chromosomes.
Mutation: We use four mutation operators for our GP. There are three inter-genomic mutation operators, namely swap, insert and delete, and one intra-genomic mutation operator, alter, which typically alters the weight element of the selected gene. The gene to be mutated is randomly chosen from the selected parent chromosome.

2.1.7 Injection
In order to overcome loss of diversity in a population, we use an injection mechanism. We inject a fixed percentage of new randomly initialized programs into the population every n generations. In the current configuration, we inject 20% new programs every 5 generations.

2.1.8 Survivor Aggregation
The aim of this phase is to collect chromosomes that have qualified to be part of the next generation (parent, offspring, elite, injected) in order to build the population for the next generation. This phase works in two modes: non-injection and injection mode. In the non-injection mode, copies of all parent chromosomes (50%), offspring chromosomes (49%) and elite chromosomes (1%) form the population of the next generation. In the injection mode, since a fixed size population (20%) of new chromosomes is inserted into the population, the top 79% of the parent-offspring population is selected along with the elite set (1%) to form the population of the next generation.
The final set of parameter values used for GPIS is given in Table 2.
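The diversification operators of Section 2.1.6 and the population make-up of Section 2.1.8 can be sketched in Python as follows. The paper does not spell out how swap, insert and delete act on a chromosome or how the length limit is enforced after crossover, so the interpretations below (exchange two genes, add one gene, remove one gene, truncate to 15 genes) are assumptions, as are all function names. Genes are treated as opaque list items.

```python
import random

MAX_LEN = 15  # chromosome length limit from Section 2.1.1


def one_point_crossover(p1, p2, rng):
    """Cut each parent at its own random location and exchange the tails."""
    c1 = rng.randint(0, len(p1))
    c2 = rng.randint(0, len(p2))
    # Assumption: over-long children are truncated; empty ones fall back to a parent copy.
    child1 = (p1[:c1] + p2[c2:])[:MAX_LEN] or list(p1)
    child2 = (p2[:c2] + p1[c1:])[:MAX_LEN] or list(p2)
    return child1, child2


def swap_mutation(chromosome, rng):
    """Exchange the positions of two randomly chosen genes."""
    child = list(chromosome)
    if len(child) >= 2:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def insert_mutation(chromosome, new_gene, rng):
    """Insert a new gene at a random position, respecting the length limit."""
    child = list(chromosome)
    if len(child) < MAX_LEN:
        child.insert(rng.randint(0, len(child)), new_gene)
    return child


def delete_mutation(chromosome, rng):
    """Remove one randomly chosen gene, never emptying the chromosome."""
    child = list(chromosome)
    if len(child) > 1:
        del child[rng.randrange(len(child))]
    return child


def next_generation_counts(mu, injecting):
    """Head counts for survivor aggregation, rounded to the closest integer."""
    elite = max(1, round(0.01 * mu))
    if injecting:
        injected = round(0.20 * mu)
        return {"elite": elite, "injected": injected,
                "parent_offspring": mu - elite - injected}
    parents = round(0.50 * mu)
    return {"elite": elite, "parents": parents,
            "offspring": mu - elite - parents}


rng = random.Random(3)
a, b = one_point_crossover(list("ABCDE"), list("vwxyz"), rng)
```

For a population of 200, `next_generation_counts(200, False)` gives 100 parents, 98 offspring and 2 elite chromosomes, and `next_generation_counts(200, True)` gives 158 survivors, 2 elite and 40 injected, matching the 50/49/1 and 79/1/20 splits in the text. With the settings of Section 2.1.7, the injection mode would apply in every fifth generation.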

2.1.9 Output (Fittest Individual)
Once the termination criterion has been satisfied, the output of the GP is typically the "fittest" chromosome present in the final population. This chromosome is then tested on a set of unseen test images, as explained in Section 2.2. Our aim is to create a pool of such outputs (segmentation programs) which allows us to have multiple segmentation algorithms for the same database. This pool is created by subsequent runs of the GP.
Note: When we apply percentages, the results are rounded to the closest integers. In case of elitism, if 1% of the population is less than 1, one individual is retained.

2.2 Stage 2: Evaluation Methodology
As mentioned in the previous section, the output of Stage 1 gives us one chromosome, which was the fittest chromosome amongst the population of the final generation. The accuracy of the segmentations produced by this chromosome on the training images is known as the training accuracy of the run. The actual challenge for this individual is to produce similar segmentation accuracies on an unseen set of images known as the validation images.
In order to do this, we randomly select a fixed number of new images from outside the training set along with their corresponding GTs, from the image database. From this point onwards, we call this the validation set. Once the validation set is chosen, the "fittest chromosome" is applied on the entire set of images, one by one, and the segmentation accuracy for each image is calculated based on the accuracy formula (1) given in Section 2.1.2. Once this process ends, the average segmentation accuracy of the set, or validation accuracy of the run, is calculated.
We repeat the above process for various runs and calculate the overall training accuracy (average training accuracies of runs) and validation accuracy (average validation accuracies of runs) for the algorithm. From here on, we refer to the above as training accuracy and validation accuracy respectively.
The output of Stage 2 is a chromosome that performs equally well on both training and validation sets and produces high overall validation accuracy.

2.3 Experimental Setup
In order to test the effectiveness and efficacy of our algorithm, we tested the algorithm on a biomedical image database that consisted of HeLa cell images (in culture) of size 512 pixels × 384 pixels. The task of the algorithm was to segment the cells present in the images. The procedure for obtaining results using our algorithm is given below.
We also compare the results of our algorithm with those produced by GENIE Pro. The procedure used for obtaining results using GENIE Pro is also given below.

Table 2. Parameter settings for GPIS
Population size: µ                 200
Crossover Rate: Pc                 0.45
Swap Mutation Rate: Pms            0.25
Insert Mutation Rate: Pmi          0.25
Delete Mutation Rate: Pmd          0.2
Alter Mutation Rate: Pma           0.7
Scalability factor for length: β   0.005

2.3.1 Procedure for Training and Validation
In order to plan a run of the algorithm, we first decide the size of the training and validation sets. To do so, we define G as the global total number of images in a database, T as the size of the training set, V as the size of the validation set, and R as the number of times optimal individuals are evolved for the same database. The final values for the above used in the present configuration are: G = 1026, T = 30, V = 100 and R = 28.

Procedure for Obtaining Results using GPIS
Step 1. Randomly select T images and other V images from the G images in the database.
Step 2. Perform training on T images to choose the fittest individual for validation.
Step 3. Validate this individual on V images to check the applicability of this individual on unseen images. If the individual produces high validation accuracy, save it in the result set, else discard it.
Step 4. Repeat Steps 1 to 3, R times, producing a set of optimal individuals (result set).
Step 5. Calculate values of average training and validation accuracy of the result set.

Procedure for Obtaining Results using GENIE Pro
Step 1. Select the same T and V images from the G images in the database, used for the corresponding GPIS run.
Step 2. Load each of the T images as a base image and create a training overlay for each image by marking Foreground (object) and Background (non-object) pixels manually.
Step 3. Train on these manually marked training overlays using the in-built Ifrit Pixel Classifier.
Step 4. Apply the learned solution on V images to produce corresponding segmented images.

Step 5. Calculate validation accuracy for these V images using formula (1).
Step 6. Repeat Steps 1 to 5, R times, as for GPIS.
Step 7. Calculate values of average training and validation accuracy of the result set.

3. RESULTS
We have based our results on two criteria: effectiveness of the algorithm to accurately segment the given images, and efficiency of the algorithm in doing so.
Effectiveness is based on two measures, pixel accuracy of the evolved solution and the cell count rate (percentage of cell structures correctly identified). In order to calculate the cell count rate, we have categorized cells into two types: Type 1 and Type 2. Type 1 cells are those which can be identified by eye with relative ease. Type 2 cells are those which are relatively difficult to identify by eye. We also provide comparative results for effectiveness for GENIE Pro. This is presented in Section 3.1.
Efficiency reflects the time the algorithm takes to produce one individual of acceptable fitness. This is measured in terms of number of generations. These results are presented in Section 3.2.
We also briefly discuss one evolved program and provide the segmented images produced. This is presented in Section 3.3 and Figures 5 and 6.

3.1 Effectiveness
Tables 3 and 4 present the results obtained for training and validation accuracies of segmentation achieved for GPIS and GENIE Pro. These values represent each algorithm's ability to correctly classify each pixel in an image as an object or non-object pixel. We found that our algorithm performed better in segmenting the cells in the images as compared to GENIE Pro.
The second measure for effectiveness that we used was cell count rate. We extend the concept of TPs, TNs, FPs and FNs to object detection where a TP denotes an object that is correctly identified by the algorithm as a cell, FN denotes an object incorrectly identified as background, FP denotes a non-object incorrectly identified as a cell, and TN denotes a non-object correctly identified as the background. In order to consider an object as belonging to any of the above four options, a minimum of 70% of object pixels must correspond to that option. Cells identified were manually counted.
Similar to the accuracy formula, based on TPs, TNs, FPs and FNs, we can define the FPR and FNR for cell count. FPR is the proportion of non-cell structures that were erroneously reported as being cell structures. FNR is the proportion of cell structures that were erroneously reported as non-cell structures. The cell count rate formula used is as follows:
Cell Count Rate = (1-FPR) × (1-FNR) (3)

Table 3. Segmentation accuracy: GPIS Vs GENIE Pro
Algorithm    Training Data   Validation Data
GPIS         98.76%          97.01%
GENIE Pro    94.12%          93.12%

Table 4. Cell count rate: GPIS Vs GENIE Pro
Cell Count Measure   GPIS Training Data   GPIS Validation Data   GENIE Pro Training Data   GENIE Pro Validation Data
Detected Cells       98.24%               97.98%                 97.02%                    96.56%
Type 1 Cells         100%                 100%                   100%                      100%
Type 2 Cells         98.78%               98.22%                 97.49%                    96.89%
Undetected Cells     1.32%                1.55%                  2.12%                     2.25%

3.2 Efficiency
Table 5 reflects the efficiency of the process to produce the required results. We measure efficiency based on the number of generations taken by GPIS to produce one individual of minimum acceptable fitness. This acceptable fitness is 95% training accuracy. In our runs, we observed that GPIS never failed to produce an acceptable individual.
The experiments were performed on an Intel Pentium (R) 4 CPU, 3.06 GHz, 2GB RAM computer. To execute 1 generation, GPIS took on average 4.21 minutes. The average time taken for a complete run was approximately 513 minutes. The maximum time taken for a complete run was 580 minutes.
Since GPIS is designed to run as an offline tool and the time it takes to execute an evolved program is between 1-3 seconds, the period of evolution of an optimal program is within reasonable real world constraints. Also, the standard deviation for the number of generations is low. This shows that GPIS runs consistently to produce an optimal program within a tight window.

Table 5. Performance of GPIS based on number of generations
Statistical Measure    Number Of Generations
MEAN                   122.07
MEDIAN                 122
STANDARD DEVIATION     6.85
UPPER BOUND            138
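Formula (3) in Section 3.1 multiplies the complements of the two error rates. The Python sketch below applies that product to pixel counts from a predicted mask and its ground truth, using the FPR and FNR definitions of Section 2.1.2. Treating pixel accuracy this way is an assumption, suggested only by the remark that the cell count rate is "similar to the accuracy formula". The function names and the tiny example masks are hypothetical.

```python
def error_rates(predicted, truth):
    """Return (FPR, FNR) for two equally sized binary masks given as nested lists."""
    tp = tn = fp = fn = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            if t and p:
                tp += 1      # object pixel reported as object
            elif t:
                fn += 1      # object pixel reported as non-object
            elif p:
                fp += 1      # non-object pixel reported as object
            else:
                tn += 1      # non-object pixel reported as non-object
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return fpr, fnr


def product_rate(fpr, fnr):
    """(1 - FPR) x (1 - FNR), the form given in formula (3)."""
    return (1 - fpr) * (1 - fnr)


# Hypothetical 2 x 4 example: the right half of the image is the object.
truth = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
predicted = [[0, 1, 1, 1],
             [0, 0, 1, 0]]

fpr, fnr = error_rates(predicted, truth)
score = product_rate(fpr, fnr)
```

In this example one of four non-object pixels and one of four object pixels are misclassified, so both rates are 0.25 and the product is 0.5625. Because the two factors are multiplied, a program cannot score well by favouring one class: labelling every pixel as object drives FPR to 1 and the product to 0.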


3.3 Evolved Program
Figure 5 shows the chromosomal and genomic structure of an evolved program. The program evolved is a combination of filters and morphological operators. The first gene is a 6 × 6 Gaussian low pass filter with a sigma value of 0.8435 followed by a 4 × 4 averaging filter. The output image from gene 2 is eroded with a flat, disk-shaped structuring element of radius 2. A 6 × 6 averaging filter is again applied to the output image of the eroded image. Its output image undergoes a composite morphological operation of closing and opening with the same structuring element as above. Finally this image is converted to a binary output image using a threshold of 0.09022. The validation accuracy is calculated for this image.
Figure 6 shows implementation of this evolved program on two validation images along with corresponding results from GENIE Pro.



[GAUSS, d1, 0, 6, 0.8435] [AVER, io1, 0, 4, 0] [EROD, io2, 0, 0, 1]
[AVER, io3, 0, 6, 0] [CLOP, io4, 0, 0, 1] [THRESH, io5, 0, 0.09022, 0]

Genomic Structure               MATLAB® Implementation
                                d1 = input;
[GAUSS, d1, 0, 6, 0.8435]       h1 = fspecial('gaussian', [6 6], 0.8435);
                                io1 = imfilter(d1, h1);
[AVER, io1, 0, 4, 0]            h2 = fspecial('average', [4 4]);
                                io2 = imfilter(io1, h2);
[EROD, io2, 0, 0, 1]            SE1 = strel('disk', 2);
                                io3 = imerode(io2, SE1);
[AVER, io3, 0, 6, 0]            h3 = fspecial('average', [6 6]);
                                io4 = imfilter(io3, h3);
[CLOP, io4, 0, 0, 1]            io5 = imclose(io4, SE1);
[THRESH, io5, 0, 0.09022, 0]    output = im2bw(io5, 0.09022);

Segmentation accuracy on validation set: 99.04%
Number of operators used = 6
Average execution time = 1.252 seconds
Number of generations needed to converge = 114
Number of fitness evaluations = 10,532

Figure 5. An evolved program: (a) Chromosome for an evolved program, (b) Genomic structure for the evolved program, (c) Genomic structure and equivalent MATLAB® implementation of the evolved program, (d) Performance results for the evolved program
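The one-to-one mapping from genes to MATLAB® lines shown in Figure 5 can be mimicked by a small translator. The Python sketch below covers only the five operators that appear in the figure and reproduces the MATLAB® lines as printed there, with spacing normalized. The function name `to_matlab` and the table `SE_CODES` are hypothetical, and reading structuring element code 1 as `strel('disk', 2)` is an assumption drawn from the figure. For CLOP, the figure lists only the `imclose` call and the sketch follows the figure.

```python
# Assumption: structuring element code 1 stands for a disk of radius 2, as in Figure 5.
SE_CODES = {1: "strel('disk', 2)"}


def to_matlab(genes):
    """Translate [name, input1, input2, weight, extra] genes into MATLAB source lines."""
    lines = ["d1 = input;"]
    n_filters = 0
    se_vars = {}  # structuring element code -> MATLAB variable already defined
    for k, (name, src, _src2, weight, extra) in enumerate(genes, start=1):
        out = f"io{k}"
        if name in ("GAUSS", "AVER"):
            n_filters += 1
            h = f"h{n_filters}"
            if name == "GAUSS":
                lines.append(f"{h} = fspecial('gaussian', [{weight} {weight}], {extra});")
            else:
                lines.append(f"{h} = fspecial('average', [{weight} {weight}]);")
            lines.append(f"{out} = imfilter({src}, {h});")
        elif name in ("EROD", "CLOP"):
            if extra not in se_vars:
                se_vars[extra] = f"SE{len(se_vars) + 1}"
                lines.append(f"{se_vars[extra]} = {SE_CODES[extra]};")
            func = "imerode" if name == "EROD" else "imclose"
            lines.append(f"{out} = {func}({src}, {se_vars[extra]});")
        elif name == "THRESH":
            # The thresholding gene produces the final binary output image.
            lines.append(f"output = im2bw({src}, {weight});")
        else:
            raise ValueError(f"no translation sketched for operator {name}")
    return lines


# The chromosome of Figure 5 (a).
evolved = [
    ["GAUSS", "d1", 0, 6, 0.8435],
    ["AVER", "io1", 0, 4, 0],
    ["EROD", "io2", 0, 0, 1],
    ["AVER", "io3", 0, 6, 0],
    ["CLOP", "io4", 0, 0, 1],
    ["THRESH", "io5", 0, 0.09022, 0],
]
program = to_matlab(evolved)
```

Each gene yields one to three lines, consistent with the "few lines (1-3)" per gene mentioned in Section 2.1.1, and the structuring element defined for the erosion is reused by the later morphological gene.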

Figure 6. (a) Segmentation produced by GPIS using evolved program shown above on validation image 1 (Validation Accuracy = 99.21%, Cell Count Rate = 100%), (b) Segmentation produced by GENIE Pro on validation image 1 (Validation Accuracy = 95.46%, Cell Count Rate = 97.89%), (c) Segmentation produced by GPIS using evolved program shown above on validation image 2 (Validation Accuracy = 98.93%, Cell Count Rate = 100%), (d) Segmentation produced by GENIE Pro on validation image 2 (Validation Accuracy = 94.22%, Cell Count Rate = 96.45%)

4. CONCLUSIONS
In this paper, we propose a simple approach to the complex problem of image segmentation. The proposed algorithm, GPIS, uses genetic programming to evolve image segmentation programs from a pool of primitive image analysis operators. The evolved solutions are simple MATLAB® based image segmentation programs that are easy to read and implement. In addition, the algorithm does not require any a priori information about the objects to be segmented from the images. We have tested our algorithm on a biomedical image database and compared the results to another GA-based image segmentation algorithm, GENIE Pro. We found that our algorithm consistently produced better results: both the segmentation accuracy and the cell count rate were higher than those of GENIE Pro. It also produced an optimal solution within a reasonable time window. In addition, GPIS never failed to produce an optimal solution.

5. ACKNOWLEDGMENTS
We are grateful to Dr. Aida Abu-Baker and Ms Janet Laganiere from the CHUM Research Centre, Notre-Dame Hospital, Montreal for providing us with the images for the cell database. We would also like to thank Dr. James Lacefield from the University of Western Ontario, London for his help on this

6. REFERENCES
[1] B. Bhanu, S. Lee, and S. Das, “Adaptive Image Segmentation using Genetic and Hybrid Search Methods”, IEEE Transactions on Aerospace and Electronic Systems, Vol. 31, Issue 4, Oct. 1995, pp. 1268-1291.
[2] B. Bhanu and Y. Lin, “Object Detection in Multi-modal Images using Genetic Programming”, Applied Soft Computing, Vol. 4, Issue 2, 2004, pp. 175-201.
[3] B. Bhanu and Y. Lin, “Learning Composite Operators for Object Detection”, Proceedings of the Conference on Genetic and Evolutionary Computation, July 2002, pp. 1003-1010.
[4] S. P. Brumby, J. P. Theiler, S. J. Perkins, N. R. Harvey, J. J. Szymanski, and J. J. Bloch, “Investigation of Image Feature Extraction by a Genetic Algorithm”, Proceedings of SPIE, Vol. 3812, 1999, pp. 24-31.
[5] J. M. Daida, J. D. Hommes, T. F. Bersano-Begey, S. J. Ross, and J. F. Vesecky, “Algorithm Discovery using the Genetic Programming Paradigm: Extracting Low-contrast Curvilinear Features from SAR Images of Arctic Ice”, Advances in Genetic Programming II, P. J. Angeline, K. E. Kinnear, (Eds.), Chapter 21, The MIT Press, 1996, pp. 417-442.
[6] J. M. Daida, J. D. Hommes, S. J. Ross, A. D. Marshall, and J. F. Vesecky, “Extracting Curvilinear Features from SAR Images of Arctic Ice: Algorithm Discovery Using the Genetic Programming Paradigm”, Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Italy, IEEE Press, 1995, pp. 673-75.
[7] K. S. Fu and J. K. Mui, “A Survey on Image Segmentation”, Pattern Recognition, 13, 1981, pp. 3-16.
[8] P. Ghosh and M. Mitchell, “Segmentation of Medical Images using a Genetic Algorithm”, Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, 2006, pp. 1171-1178.
[9] N. Harvey, R. M. Levenson, and D. L. Rimm, “Investigation of Automated Feature Extraction Techniques for Applications in Cancer Detection from Multi-spectral Histopathology Images”, Proceedings of SPIE, Vol. 5032, 2003, pp. 557-556.
[10] D. Howard and S. C. Roberts, “A Staged Genetic Programming Strategy for Image Analysis”, Proceedings of the Genetic and Evolutionary Computation Conference, 1999, pp. 1047-1052.
[11] D. Howard, S. C. Roberts, and R. Brankin, “Evolution of Ship Detectors for Satellite SAR Imagery”, Proceedings of EuroGP'99, Vol. 1598, 1999, pp. 135-148.
[12] N. R. Pal and S. K. Pal, “A Review on Image Segmentation Techniques”, Pattern Recognition, 26, 1993, pp. 1277-1294.
[13] D. L. Pham, C. Xu, and J. L. Prince, “Survey of Current Methods in Medical Image Segmentation”, Annual Review of Biomedical Engineering, 2, 2000, pp. 315-337.
[14] R. Poli, “Genetic Programming for Feature Detection and Image Segmentation”, T. C. Fogarty (Ed.), Evolutionary Computation, Springer-Verlag, Berlin, Germany, 1996, pp. 110-125.
[15] M. E. Roberts and E. Claridge, “An Artificially Evolved Vision System for Segmenting Skin Lesion Images”, Proceedings of the 6th International Conference on Medical Image Computing and Computer-Assisted Intervention, Vol. 2878, 2003, pp. 655-662.
[16] W. Tackett, “Genetic Programming for Feature Discovery and Image Discrimination”, in S. Forrest (Ed.), Proceedings of 5th International Conference on Genetic Algorithm, 1993, pp. 303-311.
[17] Y. J. Zhang, “Influence of Segmentation over Feature Measurement”, Pattern Recognition Letters, 16(2), 1992, pp. 201-206.