# A Hybrid Genetic Algorithm for Two Types of Polygonal Approximation Problems

Bin Wang¹ and Chaojian Shi¹,²

¹ Department of Computer Science and Engineering, Fudan University, Shanghai, 200433, P. R. China
² Merchant Marine College, Shanghai Maritime University, Shanghai, 200135, P. R. China
wangbin.cs@fudan.edu.cn, cjshi@shmtu.edu.cn

Abstract. A hybrid genetic algorithm combined with split and merge techniques (SMGA) is proposed for two types of polygonal approximation of digital curves, i.e., the Min-# problem and the Min-ε problem. Its main idea is to apply two classical methods, the split and merge techniques, to repair infeasible solutions. In this scheme, an infeasible solution is not only repaired rapidly but also pushed to a locally optimal location in the solution space. In addition, unlike existing genetic algorithms, each of which solves only one type of polygonal approximation problem, SMGA solves both types. The experimental results demonstrate that SMGA is robust and outperforms other existing GA-based methods.

1 Introduction

In image processing, the boundary of an object can be viewed as a closed digital curve. How to represent it so as to facilitate subsequent image analysis and pattern recognition is a key issue. Polygonal approximation is a good representation method for closed digital curves. Its basic idea is that a closed digital curve is divided into a finite number of segments, and each segment is approximated by the line segment connecting its two end points. The whole curve is then approximated by the polygon formed by these line segments. Polygonal approximation is a simple and compact representation method that can approximate the curve with any desired level of accuracy. Therefore, this method is widely studied in image processing, pattern recognition, computer graphics, digital cartography, and vector data processing. In general, two types of polygonal approximation problems have attracted many researchers' interest. They are described as follows:

Min-# problem: Given a closed digital curve, approximate it by a polygon with a minimum number of line segments such that the approximation error does not exceed a given tolerance error ε.

Min-ε problem: Given a closed digital curve, approximate it by a polygon with a given number of line segments such that the approximation error is minimized.
Corresponding author.
D.-S. Huang, K. Li, and G.W. Irwin (Eds.): ICIC 2006, LNCIS 345, pp. 30–41, 2006. © Springer-Verlag Berlin Heidelberg 2006


Both of the above polygonal approximation problems can be formulated as combinatorial optimization problems. Since an exhaustive search for the optimal solution in the potential solution space has exponential complexity [1], many existing methods for polygonal approximation problems yield suboptimal results to save computational cost. Some existing methods are based on local search techniques. They can be classified into the following categories: (1) sequential tracing approach [2], (2) split method [3], (3) merge method [4], (4) split-and-merge method [5], and (5) dominant point method [6]. These methods work very fast, but their results may be very far from the optimal ones because they depend on the selection of starting points or on the given initial solutions.

In recent years, many nature-inspired algorithms, such as genetic algorithms (GA) [1,8,9,10,11], ant colony optimization (ACO) [12], and particle swarm optimization (PSO) [13], have been applied to the Min-# problem or the Min-ε problem and have produced promising approximation results. In this paper, we focus on using a GA-based method to solve polygonal approximation problems. The power of GA arises from crossover, which causes a structured yet randomized exchange of genetic material between solutions, with the possibility that 'good' solutions can generate 'better' ones. However, crossover may also generate infeasible solutions, namely, two feasible parents may generate an infeasible child. This arises especially in combinatorial optimization, where the encoding is the traditional bit string representation and the crossover is the general-purpose crossover [11]. Therefore, how to cope with infeasible solutions is the main problem in using GA-based methods for polygonal approximation problems. Among existing GA-based methods for polygonal approximation problems, two schemes are used to cope with infeasible solutions.
One is to modify the traditional crossover and constrain it to yield feasible offspring. Here, we term it the constraining method. Yin [8] and Huang [10] adopt this method for solving the Min-ε problem and the Min-# problem, respectively. Both of them adopt a modified version of the traditional two-cut-point crossover. In the traditional two-cut-point crossover (shown in Fig. 4), two crossover sites are chosen randomly, which may generate infeasible solutions. They modified it by choosing crossover points on the chromosome that maintain the feasibility of the offspring. However, this requires repeatedly testing candidate crossover points on the chromosome and is therefore expensive in time. Furthermore, in some cases, such crossover sites cannot be found for the Min-# problem. For solving the Min-ε problem, Ho and Chen [11] proposed a novel crossover, termed orthogonal-array crossover, which maintains the feasibility of the offspring. However, the complexity of this kind of crossover is also high, and it is suitable only for the Min-ε problem and not for the Min-# problem.

The other scheme for coping with infeasible solutions is the penalty function method. Yin [1] adopted this scheme for the Min-# problem. Its main idea is that a penalty function is added to the fitness function to decrease the survival probability of infeasible solutions. However, it is usually difficult to determine an appropriate penalty function. If the strength of the penalty function is too large, more time will be spent on finding feasible solutions than on searching for the optimum; if it is too small, more time will be spent on evaluating infeasible solutions [11].

To solve the above problems in coping with infeasible solutions, we propose a hybrid genetic algorithm combined with split and merge techniques (SMGA) for solving the Min-ε problem and the Min-# problem. The main idea of SMGA is that the traditional split and merge techniques are employed to repair infeasible solutions. SMGA has the following three advantages over existing GA-based methods.

(1) SMGA does not require developing a special penalty function, or modifying and constraining the traditional two-cut-point crossover to avoid yielding an infeasible solution. In SMGA, an infeasible solution is transformed into a feasible one through a simple repairing operator.

(2) SMGA combines the strong global search ability of GA with the strong local search ability of the traditional split and merge techniques. This improves the solution quality and convergence speed of GA.

(3) Unlike existing GA-based methods, which are designed for solving the Min-ε problem or the Min-# problem alone, SMGA is developed for solving both of them.

We use four benchmark curves to test SMGA, and the experimental results show its superior performance.

2 Problem Formulation

Definition 1. A closed digital curve $C$ can be represented by a clockwise ordered sequence of points, that is, $C = \{p_1, p_2, \ldots, p_N\}$. This sequence is circular, namely $p_{N+i} = p_i$, where $N$ is the number of points on the digital curve.

Definition 2. Let $\widehat{p_i p_j} = \{p_i, p_{i+1}, \ldots, p_j\}$ represent the arc starting at point $p_i$ and continuing through point $p_j$ in the clockwise direction along the curve. Let $\overline{p_i p_j}$ denote the line segment connecting points $p_i$ and $p_j$.

Definition 3. The approximation error between $\widehat{p_i p_j}$ and $\overline{p_i p_j}$ is defined as follows:

$$e(\widehat{p_i p_j}, \overline{p_i p_j}) = \sum_{p_k \in \widehat{p_i p_j}} d^2(p_k, \overline{p_i p_j}), \tag{1}$$

where $d(p_k, \overline{p_i p_j})$ is the perpendicular distance from point $p_k$ to the line segment $\overline{p_i p_j}$.


Definition 4. The polygon $V$ approximating the contour $C = \{p_1, p_2, \ldots, p_N\}$ is a set of ordered line segments $V = \{\overline{p_{t_1} p_{t_2}}, \overline{p_{t_2} p_{t_3}}, \ldots, \overline{p_{t_{M-1}} p_{t_M}}, \overline{p_{t_M} p_{t_1}}\}$, such that $t_1 < t_2 < \ldots < t_M$ and $\{p_{t_1}, p_{t_2}, \ldots, p_{t_M}\} \subseteq \{p_1, p_2, \ldots, p_N\}$, where $M$ is the number of vertices of the polygon $V$.

Definition 5. The approximation error between the curve $C = \{p_1, p_2, \ldots, p_N\}$ and its approximating polygon $V = \{\overline{p_{t_1} p_{t_2}}, \overline{p_{t_2} p_{t_3}}, \ldots, \overline{p_{t_{M-1}} p_{t_M}}, \overline{p_{t_M} p_{t_1}}\}$ is defined as

$$E(V, C) = \sum_{i=1}^{M} e(\widehat{p_{t_i} p_{t_{i+1}}}, \overline{p_{t_i} p_{t_{i+1}}}). \tag{2}$$
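Equations (1) and (2) can be evaluated directly from the point list. The following is a minimal sketch, not the authors' implementation; the function names, the 0-based indexing, and the toy square curve are assumptions made only for illustration, and the distance is taken to the line through the chord's end points.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def sq_dist_to_chord(p: Point, a: Point, b: Point) -> float:
    """Squared perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate chord (a == b): use the point distance
        return (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    cross = dx * (p[1] - a[1]) - dy * (p[0] - a[0])
    return cross * cross / length_sq


def arc_error(curve: Sequence[Point], i: int, j: int) -> float:
    """Eq. (1): error between the arc p_i..p_j and the chord p_i p_j."""
    n = len(curve)
    total, k = 0.0, i
    while k != j:  # the end point p_j lies on the chord and adds nothing
        total += sq_dist_to_chord(curve[k], curve[i], curve[j])
        k = (k + 1) % n
    return total


def polygon_error(curve: Sequence[Point], vertices: Sequence[int]) -> float:
    """Eq. (2): sum of arc errors over consecutive vertex pairs (circular)."""
    m = len(vertices)
    return sum(
        arc_error(curve, vertices[t], vertices[(t + 1) % m]) for t in range(m)
    )


# Hypothetical toy curve: a 2x2 square sampled at 8 boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(polygon_error(square, [0, 2, 4, 6]))  # polygon through the four corners
print(polygon_error(square, [0, 4]))        # degenerate "polygon": one diagonal
```

The sketch assumes the vertex indices are sorted and that the polygon has at least two distinct vertices.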

Then the two types of polygonal approximation problems are formulated as follows:

Min-# problem: Given a digital curve $C = \{p_1, p_2, \ldots, p_N\}$ and the error tolerance $\varepsilon$. Suppose $\Omega$ denotes the set of all the polygons which approximate the curve $C$. Let $S_P = \{V \mid V \in \Omega \wedge E(V, C) \le \varepsilon\}$. Find a polygon $P \in S_P$ such that

$$|P| = \min_{V \in S_P} |V|, \tag{3}$$

where $|P|$ denotes the cardinality of $P$.

Min-ε problem: Given a digital curve $C = \{p_1, p_2, \ldots, p_N\}$ and an integer $M$, where $3 \le M \le N$. Suppose $\Omega$ denotes the set of all the polygons which approximate the curve $C$. Let $S_P = \{V \mid V \in \Omega \wedge |V| = M\}$, where $|V|$ denotes the cardinality of $V$. Find a polygon $P \in S_P$ such that

$$E(P, C) = \min_{V \in S_P} E(V, C). \tag{4}$$
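To make the two formulations concrete, the sketch below solves both by exhaustive enumeration on a tiny curve. It is only an illustration of Eqs. (3) and (4): the function names and the toy square are assumptions, and the enumeration is exponential in the number of curve points, so it is not a practical method.

```python
from itertools import combinations


def sq_dist(p, a, b):
    """Squared perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    cross = dx * (p[1] - a[1]) - dy * (p[0] - a[0])
    return cross * cross / length_sq


def polygon_error(curve, vertices):
    """E(V, C) of Eq. (2) for sorted 0-based vertex indices."""
    n, m, total = len(curve), len(vertices), 0.0
    for t in range(m):
        i, j = vertices[t], vertices[(t + 1) % m]
        k = i
        while k != j:
            total += sq_dist(curve[k], curve[i], curve[j])
            k = (k + 1) % n
    return total


def solve_min_eps(curve, m):
    """Eq. (4): among all polygons with exactly m vertices, minimise E."""
    candidates = combinations(range(len(curve)), m)
    return min(candidates, key=lambda v: polygon_error(curve, v))


def solve_min_count(curve, eps):
    """Eq. (3): a smallest polygon whose error does not exceed eps."""
    n = len(curve)
    for m in range(3, n + 1):  # assumption: at least a triangle
        for v in combinations(range(n), m):
            if polygon_error(curve, v) <= eps:
                return v
    return tuple(range(n))  # fallback for curves with fewer than 3 points


# Hypothetical toy curve: a 2x2 square sampled at 8 boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(solve_min_eps(square, 4))
print(solve_min_count(square, 0.0))
```

On this toy curve both searches recover the four corners, since that is the only four-vertex polygon with zero error.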

3 Overview of Split and Merge Techniques

3.1 Split Technique

The traditional split technique is a very simple method for solving the polygonal approximation problem. It is a recursive method that starts with an initial curve segmentation. At each iteration, a split procedure splits a segment at a selected point, and this is repeated until the obtained polygon satisfies the specified constraint condition. The split procedure is as follows. Suppose that curve $C$ is segmented into $M$ arcs $\widehat{p_{t_1} p_{t_2}}, \ldots, \widehat{p_{t_{M-1}} p_{t_M}}, \widehat{p_{t_M} p_{t_1}}$, where each $p_{t_i}$ is a segment point. Then a split operation on curve $C$ is: for each point $p_i \in \widehat{p_{t_j} p_{t_{j+1}}}$, calculate its distance to the corresponding chord, $D(p_i) = d(p_i, \overline{p_{t_j} p_{t_{j+1}}})$. Seek a point $p_u$ on the curve that satisfies $D(p_u) = \max_{p_i \in C} D(p_i)$. Suppose that $p_u \in \widehat{p_{t_k} p_{t_{k+1}}}$. Then the arc $\widehat{p_{t_k} p_{t_{k+1}}}$ is segmented at the point $p_u$ into two arcs $\widehat{p_{t_k} p_u}$ and $\widehat{p_u p_{t_{k+1}}}$, and $p_u$ is added to the set of segment points. Fig. 1 shows a split process. The function of the split operator is to find a new possible vertex heuristically.
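The split operation described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names, the 0-based vertex indices, the caller-supplied stopping predicate `satisfied`, and the toy square are all assumptions. Squared distances are compared, which selects the same point as comparing distances.

```python
def sq_dist(p, a, b):
    """Squared perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    cross = dx * (p[1] - a[1]) - dy * (p[0] - a[0])
    return cross * cross / length_sq


def split_once(curve, vertices):
    """One split operation: promote the curve point farthest from its chord."""
    n, m = len(curve), len(vertices)
    best_d, best_k = -1.0, None
    for t in range(m):
        i, j = vertices[t], vertices[(t + 1) % m]
        k = (i + 1) % n
        while k != j:  # interior points of the arc p_i..p_j
            d = sq_dist(curve[k], curve[i], curve[j])
            if d > best_d:
                best_d, best_k = d, k
            k = (k + 1) % n
    if best_k is None:  # every curve point is already a vertex
        return list(vertices)
    return sorted(list(vertices) + [best_k])


def split_until(curve, vertices, satisfied):
    """Repeat the split operation until the caller's constraint holds."""
    vertices = list(vertices)
    while not satisfied(vertices):
        grown = split_once(curve, vertices)
        if len(grown) == len(vertices):  # nothing left to split
            break
        vertices = grown
    return vertices


# Hypothetical toy curve: a 2x2 square sampled at 8 boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(split_once(square, [0, 4]))
print(split_until(square, [0, 4], lambda v: len(v) >= 4))
```

Passing the constraint as a predicate keeps the sketch usable both for a target number of vertices and for an error tolerance.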


Fig. 1. Split operation

Fig. 2. Merge operation

3.2 Merge Technique

The merge technique is another simple method for yielding an approximating polygon of a digital curve. It is a recursive method that starts with an initial polygon whose vertices are all the points of the curve. At each iteration, a merge procedure merges two selected adjacent segments of the current polygon, and this is repeated until the obtained polygon satisfies the specified constraint condition. The merge procedure is as follows. Suppose that curve $C$ is segmented into $M$ arcs $\widehat{p_{t_1} p_{t_2}}, \ldots, \widehat{p_{t_{M-1}} p_{t_M}}, \widehat{p_{t_M} p_{t_1}}$, where each $p_{t_i}$ is a segment point. Then a merge operation on curve $C$ is defined as: for each segment point $p_{t_i}$, calculate its distance to the line segment that connects its two adjacent segment points, $Q(p_{t_i}) = d(p_{t_i}, \overline{p_{t_{i-1}} p_{t_{i+1}}})$. Select a segment point $p_{t_j}$ that satisfies $Q(p_{t_j}) = \min_{p_{t_i} \in V} Q(p_{t_i})$, where $V$ is the set of the current segment points. Then the two arcs $\widehat{p_{t_{j-1}} p_{t_j}}$ and $\widehat{p_{t_j} p_{t_{j+1}}}$ are merged into a single arc $\widehat{p_{t_{j-1}} p_{t_{j+1}}}$, and the segment point $p_{t_j}$ is removed from the set of the current segment points. Fig. 2 shows a merge process. The function of the merge operator is to remove a possibly redundant vertex heuristically.
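The merge operation described above can be sketched in the same style. Again this is only an illustration under stated assumptions: the function names, the 0-based indices, the lower bound of three vertices, and the use of a target vertex count as the stopping constraint are choices made for the example, not details given in the text.

```python
def sq_dist(p, a, b):
    """Squared perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    cross = dx * (p[1] - a[1]) - dy * (p[0] - a[0])
    return cross * cross / length_sq


def merge_once(curve, vertices):
    """One merge operation: remove the vertex nearest its neighbours' chord."""
    m = len(vertices)
    if m <= 3:  # assumption: never go below a triangle
        return list(vertices)

    def q(t):  # Q(p_t): squared distance of vertex t to the neighbours' chord
        return sq_dist(curve[vertices[t]], curve[vertices[t - 1]],
                       curve[vertices[(t + 1) % m]])

    drop = min(range(m), key=q)
    return [v for t, v in enumerate(vertices) if t != drop]


def merge_until(curve, target_m):
    """Merge technique: start from all curve points, stop at target_m vertices."""
    vertices = list(range(len(curve)))
    while len(vertices) > max(target_m, 3):
        vertices = merge_once(curve, vertices)
    return vertices


# Hypothetical toy curve: a 2x2 square sampled at 8 boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(merge_until(square, 4))
```

Ties in $Q$ are broken by taking the first minimal vertex, which is an arbitrary choice of this sketch.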

4 The Proposed Genetic Algorithm (SMGA)

4.1 Chromosome Coding Scheme and Fitness Function

The encoding mechanism maps each approximating polygon to a unique binary string, which represents a chromosome. Each gene of the chromosome corresponds to a point of the curve: the curve point is a vertex of the approximating polygon if and only if the value of its gene is 1. The number of genes whose value is 1 equals the number of vertices of the approximating polygon. For instance, given a curve $C = \{p_1, p_2, \ldots, p_{10}\}$ and the chromosome '1010100010', the approximating polygon that the chromosome represents is $\{\overline{p_1 p_3}, \overline{p_3 p_5}, \overline{p_5 p_9}, \overline{p_9 p_1}\}$.

Assume a chromosome $\alpha = b_1 b_2 \ldots b_N$. For the Min-ε problem, the fitness function $f(\alpha)$ is defined as follows:

$$f(\alpha) = E(\alpha, C). \tag{5}$$

For the Min-# problem, the fitness function $f(\alpha)$ is defined as follows:

$$f(\alpha) = \sum_{i=1}^{N} b_i. \tag{6}$$

Fig. 3. Mutation
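As an illustration of the coding scheme and of Eqs. (5) and (6), the sketch below decodes a bit string into vertex indices and evaluates both fitness functions. The function names, the 0-based indices (so $p_1$ is index 0), and the toy rectangle are assumptions of the example.

```python
def decode(chromosome):
    """Return the 0-based indices of the genes equal to '1' (the vertices)."""
    return [i for i, gene in enumerate(chromosome) if gene == "1"]


def fitness_min_count(chromosome):
    """Eq. (6): number of vertices; smaller is better."""
    return chromosome.count("1")


def sq_dist(p, a, b):
    """Squared perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    cross = dx * (p[1] - a[1]) - dy * (p[0] - a[0])
    return cross * cross / length_sq


def fitness_min_eps(chromosome, curve):
    """Eq. (5): approximation error of the decoded polygon; smaller is better."""
    vertices = decode(chromosome)
    n, m, total = len(curve), len(vertices), 0.0
    for t in range(m):
        i, j = vertices[t], vertices[(t + 1) % m]
        k = (i + 1) % n
        while k != j:  # interior points of the arc p_i..p_j
            total += sq_dist(curve[k], curve[i], curve[j])
            k = (k + 1) % n
    return total


# The chromosome from the text: vertices p1, p3, p5, p9 (indices 0, 2, 4, 8).
print(decode("1010100010"))
print(fitness_min_count("1010100010"))

# Hypothetical toy curve: a 3x2 rectangle sampled at 10 boundary points.
rect = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1),
        (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]
print(fitness_min_eps("1001010010", rect))  # genes set at the four corners
```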

For the above fitness functions, the smaller the function value, the better the individual.

4.2 Genetic Operators

Selection. Select two individuals from the population at random and keep the better one.

Mutation. Randomly select a gene on the chromosome, shift it one site to the left or right at random, and set the original gene site to 0 (shown in Fig. 3).

Crossover. Here, we use the traditional two-cut-point crossover: randomly select two sites on the chromosome and exchange the two chromosomes' substrings between the two selected sites. For example, given two parent chromosomes '1010010101' and '1011001010' and randomly selected crossover sites 4 and 7, the two children yielded by two-cut-point crossover are '1011001101' and '1010010010' (shown in Fig. 4).

4.3 Chromosome Repairing

Two-cut-point crossover may yield infeasible offspring. Here, we develop a method that uses the split and merge techniques introduced in Section 3 to repair infeasible offspring.

For Min-ε problem: Suppose that the specified number of sides of the approximating polygon is $M$. Then for an infeasible solution $\alpha$, we have $L(\alpha) \neq M$, where $L(\alpha)$ denotes the number of sides of the approximating polygon $\alpha$. The infeasible solution $\alpha$ can be repaired through the following process: if $L(\alpha) > M$, repeat the merge operation until $L(\alpha) = M$; if $L(\alpha) < M$, repeat the split operation until $L(\alpha) = M$.

For Min-# problem: Suppose that the specified error tolerance is $\varepsilon$. Then for an infeasible solution $\alpha$, we have $E(\alpha) > \varepsilon$, where $E(\alpha)$ is the approximation error. The infeasible solution $\alpha$ can be repaired through the following process: if $E(\alpha) > \varepsilon$, repeat the split operation until $E(\alpha) \le \varepsilon$.

Fig. 4. Two-cut-point crossover

Computational complexity: Suppose that the number of curve points is $n$ and the number of sides of the infeasible solution is $k$. From the definitions of the split and merge operations, the complexity of the split procedure is $O(n - k)$ and that of the merge procedure is $O(k)$.

For Min-ε problem: suppose that the specified number of sides of the approximating polygon is $m$. If $k < m$, repairing the current infeasible solution requires calling the split procedure $m - k$ times. Thus the complexity of the repairing process is $O((n - k)(m - k))$. If $k > m$, repairing the current infeasible solution requires calling the merge procedure $k - m$ times. Therefore, the complexity of the repairing process is $O(k(k - m))$.

For Min-# problem: it is difficult to compute the complexity of the repairing process exactly. Here, we give the complexity of the worst case. In the worst case, all the curve points have to be added to the approximating polygon to maintain the feasibility of the solution, in which case the approximation error is equal to 0. This requires calling the split procedure $n - k$ times. Therefore, the complexity of the repairing process in the worst case is $O((n - k)^2)$.

4.4 Elitism

Elitism is implemented by carrying the best chromosome over to the next generation unchanged.
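The operators of Sections 4.2 to 4.4 can be combined into one generation step, sketched below. Several details are assumptions of the sketch rather than statements of the paper: the order in which the operators are applied, the restriction of mutation to genes with value 1, the circular wrap-around at the chromosome ends, and the `repair` argument, a hypothetical callable standing in for the chromosome repairing of Section 4.3. The crossover sites are read as 1-based and inclusive, which reproduces the example in Section 4.2.

```python
import random


def tournament(population, fitness, rng):
    """Selection: draw two individuals at random and keep the better one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) <= fitness(b) else b


def two_cut_crossover(a, b, lo, hi):
    """Swap the genes at sites lo..hi (1-based, inclusive) of two parents."""
    return (a[:lo - 1] + b[lo - 1:hi] + a[hi:],
            b[:lo - 1] + a[lo - 1:hi] + b[hi:])


def mutate(chromosome, rng):
    """Shift one randomly chosen 1-gene a single site left or right."""
    ones = [i for i, g in enumerate(chromosome) if g == "1"]
    if not ones:
        return chromosome
    i = rng.choice(ones)  # assumption: only genes with value 1 are moved
    j = (i + rng.choice((-1, 1))) % len(chromosome)  # assumption: wrap around
    genes = list(chromosome)
    genes[i], genes[j] = "0", "1"
    return "".join(genes)


def next_generation(population, fitness, repair, rng, pc=0.7, pm=0.3):
    """One generation: elitism, then selection, crossover, mutation, repair."""
    children = [min(population, key=fitness)]  # elite survives unchanged
    while len(children) < len(population):
        a = tournament(population, fitness, rng)
        b = tournament(population, fitness, rng)
        if rng.random() < pc:
            lo, hi = sorted(rng.sample(range(1, len(a) + 1), 2))
            a, b = two_cut_crossover(a, b, lo, hi)
        for child in (a, b):
            if rng.random() < pm:
                child = mutate(child, rng)
            children.append(repair(child))
    return children[:len(population)]


# The crossover example from Section 4.2 (sites 4 and 7).
print(two_cut_crossover("1010010101", "1011001010", 4, 7))
```

A caller would supply a fitness function (smaller is better) and a repair function for the problem at hand, for example `next_generation(pop, lambda c: c.count("1"), repair, random.Random(0))`.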

5 Experimental Results and Discussion

To evaluate the performance of the proposed SMGA, we use four commonly used benchmark curves, shown in Fig. 5. Among these curves, (a) is a figure-8 curve, (b) is a chromosome-shaped curve, (c) is a curve with four semicircles, and (d) is a leaf-shaped curve. They contain 45, 60, 102 and 120 curve points, respectively. Their chain codes are given in [6].

Fig. 5. Four benchmark curves: (a) figure-8, (b) chromosome, (c) semicircle, (d) leaf

Two groups of experiments are conducted to evaluate the performance of SMGA. One applies SMGA to the Min-ε problem; the other applies SMGA to the Min-# problem. All the experiments are conducted on a computer with a Pentium-M 1.5 CPU under Windows XP. The parameters of SMGA are set as follows: population size $N_s = 31$, crossover probability $p_c = 0.7$, mutation probability $p_m = 0.3$, and maximum number of generations $G_n = 80$.
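Table 1. Experimental results of SMGA and EEA [11] for Min-ε problem

| Curve | M | Best ε (EEA) | Best ε (SMGA) | Average ε (EEA) | Average ε (SMGA) | Variance (EEA) | Variance (SMGA) |
|---|---|---|---|---|---|---|---|
| semicircle (N = 102) | 10 | 38.92 | 38.92 | 44.23 | 42.89 | 78.50 | 25.98 |
| | 12 | 26.00 | 26.00 | 29.42 | 27.80 | 4.68 | 2.05 |
| | 14 | 17.39 | 17.39 | 20.14 | 18.55 | 4.69 | 1.41 |
| | 17 | 12.22 | 12.22 | 14.46 | 13.37 | 2.31 | 1.11 |
| | 18 | 11.34 | 11.19 | 12.79 | 12.56 | 1.47 | 0.91 |
| | 19 | 10.04 | 10.04 | 11.52 | 11.22 | 0.97 | 0.50 |
| | 22 | 7.19 | 7.01 | 8.63 | 7.73 | 0.56 | 0.32 |
| | 27 | 3.73 | 3.70 | 4.87 | 4.05 | 0.57 | 0.15 |
| | 30 | 2.84 | 2.64 | 3.67 | 2.93 | 0.33 | 0.04 |
| figure-8 (N = 45) | 6 | 17.49 | 17.49 | 18.32 | 17.64 | 0.45 | 0.12 |
| | 9 | 4.54 | 4.54 | 4.79 | 4.71 | 0.15 | 0.06 |
| | 10 | 3.69 | 3.69 | 3.98 | 3.73 | 0.05 | 0.02 |
| | 11 | 2.90 | 2.90 | 3.19 | 3.15 | 0.04 | 0.01 |
| | 13 | 2.04 | 2.04 | 2.36 | 2.05 | 0.06 | 0.00 |
| | 15 | 1.61 | 1.61 | 1.87 | 1.69 | 0.04 | 0.01 |
| | 16 | 1.41 | 1.41 | 1.58 | 1.51 | 0.03 | 0.01 |
| chromosome (N = 60) | 8 | 13.43 | 13.43 | 15.56 | 13.99 | 2.42 | 1.26 |
| | 9 | 12.08 | 12.08 | 13.47 | 12.76 | 1.76 | 0.55 |
| | 12 | 5.82 | 5.82 | 6.75 | 5.86 | 0.88 | 0.00 |
| | 14 | 4.17 | 4.17 | 5.13 | 4.56 | 0.59 | 0.06 |
| | 15 | 3.80 | 3.80 | 4.27 | 4.07 | 0.14 | 0.04 |
| | 17 | 3.13 | 3.13 | 3.57 | 3.21 | 0.16 | 0.03 |
| | 18 | 2.83 | 2.83 | 3.04 | 2.95 | 0.05 | 0.01 |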


Fig. 6. The comparative results of SMGA and EEA [11] for Min-ε problem, where M is the specified number of sides of the approximating polygon and ε is the approximation error. EEA: (a) M = 18, ε = 11.34; (b) M = 22, ε = 7.19; (c) M = 27, ε = 3.73; (d) M = 30, ε = 2.84. SMGA: (e) M = 18, ε = 11.19; (f) M = 22, ε = 7.01; (g) M = 27, ε = 3.70; (h) M = 30, ε = 2.64.

5.1 For Min-ε Problem

Ho and Chen [11] proposed a GA-based method, the Efficient Evolutionary Algorithm (EEA), which adopts the constraining method to cope with infeasible solutions when solving the Min-ε problem. Here we use three curves, semicircle, figure-8 and chromosome, to test SMGA and compare it with EEA. For each curve and each specified number of sides M, the simulation conducts ten independent runs of SMGA and of EEA. The best solution, the average solution and the variance of the solutions over the ten independent runs are listed in Table 1. Some of the simulation results of SMGA and EEA are shown in Fig. 6, where M is the specified number of sides of the approximating polygon and ε is the approximation error. From Table 1 and Fig. 6, we can see that, for the same number of polygon sides, the best approximation error obtained by SMGA is never larger than that of EEA, and its average error and variance are smaller in every case. The average computation times of EEA for the three benchmark curves, semicircle, figure-8 and chromosome, are 0.185 s, 0.078 s and 0.104 s, respectively, while SMGA requires only 0.020 s, 0.011 s and 0.015 s, roughly seven to nine times less. SMGA therefore outperforms EEA in convergence speed.

5.2 For Min-# Problem

Yin [1] proposed a GA-based method for solving the Min-# problem (we term it YGA). YGA adopts the penalty function method to cope with infeasible solutions.

Table 2. Experimental results for SMGA and YGA [1] for Min-# problem

| Curve | ε | Best M (YGA) | Best M (SMGA) | Average M (YGA) | Average M (SMGA) | Variance (YGA) | Variance (SMGA) |
|---|---|---|---|---|---|---|---|
| leaf (N = 120) | 150 | 15 | 10 | 15.4 | 10.1 | 0.5 | 0.0 |
| | 100 | 16 | 12 | 16.2 | 12.6 | 0.3 | 0.0 |
| | 90 | 17 | 12 | 17.4 | 12.8 | 0.4 | 0.0 |
| | 30 | 20 | 16 | 20.3 | 16.0 | 0.3 | 0.0 |
| | 15 | 23 | 20 | 23.1 | 20.0 | 0.4 | 0.0 |
| chromosome (N = 60) | 30 | 7 | 6 | 7.6 | 6.0 | 0.2 | 0.0 |
| | 20 | 8 | 7 | 9.1 | 7.0 | 0.3 | 0.0 |
| | 10 | 10 | 10 | 10.4 | 10.0 | 0.4 | 0.0 |
| | 8 | 12 | 11 | 12.4 | 11.0 | 0.3 | 0.0 |
| | 6 | 15 | 12 | 15.4 | 12.0 | 0.4 | 0.0 |
| semicircle (N = 102) | 60 | 12 | 10 | 13.3 | 10.0 | 0.3 | 0.0 |
| | 30 | 13 | 12 | 13.6 | 12.1 | 0.4 | 0.0 |
| | 25 | 15 | 13 | 16.3 | 13.0 | 0.5 | 0.0 |
| | 20 | 19 | 14 | 19.5 | 14.0 | 0.3 | 0.0 |
| | 15 | 22 | 15 | 23.0 | 15.2 | 0.7 | 0.0 |

Fig. 7. The comparative results of SMGA and YGA [1] for Min-# problem, where ε is the specified error tolerance and M is the number of sides of the approximating polygon. YGA: (a) ε = 30, M = 20; (b) ε = 15, M = 23; (c) ε = 6, M = 15; (d) ε = 15, M = 22. SMGA: (e) ε = 30, M = 16; (f) ε = 15, M = 20; (g) ε = 6, M = 12; (h) ε = 15, M = 15.

Here, we run SMGA and YGA on three benchmark curves: leaf, chromosome and semicircle. For each curve and each specified error tolerance ε, the simulation conducts ten independent runs of SMGA and of YGA. The best solution, the average solution and the variance of the solutions over the ten independent runs are listed in Table 2. Some of the simulation results of SMGA and YGA are shown in Fig. 7, where ε is the specified error tolerance and M is the number of sides of the approximating polygon. From Table 2 and Fig. 7, we can see that, for the same error tolerance, SMGA yields approximating polygons with fewer sides than YGA on average in every case, and its best result never has more sides than that of YGA. The average computation times of YGA for the three benchmark curves, leaf, chromosome and semicircle, are 0.201 s, 0.09 s and 0.137 s, respectively, while SMGA requires only 0.025 s, 0.015 s and 0.023 s, roughly six to eight times less. SMGA therefore outperforms YGA in both solution quality and convergence speed.

6 Conclusion

We have proposed SMGA to solve two types of polygonal approximation of digital curves, the Min-# problem and the Min-ε problem. The proposed chromosome-repairing technique, which uses the split and merge techniques, effectively overcomes the difficult problem of coping with infeasible solutions. The simulation results show that SMGA outperforms existing GA-based methods that use other techniques for coping with infeasible solutions, on both types of polygonal approximation problems.

Acknowledgement