
A Novel PSO-Based Parameter Estimation for Total Variation Regularization

Saeid Fazli, Hamed Bouzari, Hamed Moradi Pour, Alireza Shayestehfard
Electrical Engineering Department
Zanjan University, Zanjan, IRAN
{fazli & h.bouzari & h.moradi & Shayestehfard}@znu.ac.ir

Abstract- In this paper, a novel approach for estimating the regularization parameter of the Total Variation (TV) method, based on Particle Swarm Optimization (PSO), is presented. Since this parameter has a great impact on how well TV works, many techniques have been proposed by researchers, but most rest on some assumption about the nature of the problem. This work suggests a new method in which PSO itself learns how to deal with this parameter without any prior knowledge, simply by tracking how changes of the parameter affect the performance of TV. Finally, experimental results are presented to show the performance of the proposed method in comparison to previous works.

I. INTRODUCTION

Total Variation (TV) is a well-known approach for estimating the solution of ill-posed problems. Hadamard introduced the concept of a well-posed problem in [1]: a problem is said to be well-posed if its solution exists, is unique, and depends continuously on the data; a problem that violates any of these conditions is ill-posed. Unlike many other approaches proposed for solving such inverse problems, the TV regularization technique takes into consideration that the data may be discontinuous; it does not require the solution to be smooth or continuous.

In this paper, we present an algorithm based on Particle Swarm Optimization (PSO) to estimate the solution of ill-posed problems by predicting the TV regularization parameter from its performance trajectory.

PSO is a heuristic search technique (considered an evolutionary algorithm by its authors [2]) that simulates the movements of a flock of birds aiming to find food. It is a stochastic optimization method often used to solve complex engineering problems such as structural and biomechanical optimizations (e.g., References [3-9]). Similar to evolutionary optimization methods, PSO is a derivative-free, population-based global search algorithm.

Among heuristic methods, PSO is popular for its ease of application and fast convergence to a near-optimal solution. PSO uses a population of solutions, called particles, which fly through the search space with directed velocity vectors to find better solutions. These velocity vectors have stochastic components and are dynamically adjusted based on historical and inter-particle information. Neural network training, voltage control and task assignment are some of the application areas of PSO.

In view of the successful applications of PSO, this paper aims at applying it to image processing inverse problems. After explaining our method, we show its good behavior on inverse problems such as deblurring.

II. TOTAL VARIATION MODELS

The history of L1 estimation procedures goes back to Galileo (1632) and Laplace (1793) and has also received a lot of attention from the robust statistics community [10]. The first to introduce Total Variation methods to Computer Vision tasks were Rudin, Osher and Fatemi (ROF) in their paper on edge-preserving image denoising [11]. The model is designed to remove noise and other unwanted fine-scale details while preserving sharp discontinuities (edges). The ROF model is defined as the following variational problem:

    min_u { ∫_Ω |∇u| dΩ + (1/(2λ)) ∫_Ω (u − f)² dΩ } ,    (1)

where Ω is the image domain, f is the observed image function, which is assumed to be corrupted by Gaussian noise, and u is the sought solution. The free parameter λ is used to control the amount of smoothing in u. The aim of the ROF model is to minimize the Total Variation of u:

    ∫_Ω |∇u| dΩ = ∫_Ω √( (∂u/∂x)² + (∂u/∂y)² ) dΩ .    (2)

Its main property is that it allows for sharp discontinuities in the solution while still being convex in u [11].

Similar to the ROF model, the TV-L1 model [12], [13], [14] is defined as the variational problem

    min_u { ∫_Ω |∇u| dΩ + λ ∫_Ω |u − f| dΩ } .    (3)

The difference compared to the ROF model is that the squared L2 data fidelity term has been replaced by the L1 norm.



Moreover, while the ROF model in its unconstrained formulation (1) poses a strictly convex minimization problem, the TV-L1 model is not strictly convex. This means that, in general, there is no unique global minimizer. The TV-L1 model also offers some desirable improvements. First, it turns out that the TV-L1 model is more effective than the ROF model in removing impulse noise (e.g. salt and pepper noise) [13]. Second, the TV-L1 model is contrast invariant: if u is a solution of (3) for a certain input image f, then cu is also a solution for cf for any c ∈ ℝ+. Therefore the TV-L1 model has a strong geometrical meaning, which makes it useful for scale-driven feature selection [15] and denoising of shapes [16].
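To make the models above concrete, the following Python sketch (our illustrative addition, not part of the original paper) evaluates discrete versions of the total variation (2) and of the ROF and TV-L1 objectives (1) and (3) on a pixel grid, assuming NumPy and forward differences with replicated borders:

    import numpy as np

    def total_variation(u):
        # Discrete form of (2): forward differences, borders replicated.
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        return np.sum(np.sqrt(ux**2 + uy**2))

    def rof_objective(u, f, lam):
        # Discrete form of (1): TV(u) + (1 / (2*lam)) * ||u - f||_2^2.
        return total_variation(u) + np.sum((u - f)**2) / (2.0 * lam)

    def tv_l1_objective(u, f, lam):
        # Discrete form of (3): TV(u) + lam * ||u - f||_1.
        return total_variation(u) + lam * np.sum(np.abs(u - f))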
TABLE I
NUMERICAL METHODS

Algorithms
Explicit time marching [11], [17]
Linearization of the EL equation [18], [19], [20]
Nonlinear primal-dual method [21]
Duality based methods [21], [22], [23], [24], [25], [26]
Non-linear multigrid methods [27], [28], [29], [30], [31]
First order schemes from convex optimization [32]
Second-order cone programming [33]
Graph cut methods [34], [25], [35]
III. SOLUTION OF TOTAL VARIATION MODELS

Computing the solution of Total Variation models is a challenging task. The main reason lies in the nondifferentiability of the L1 norm at zero. It is therefore not surprising that many contributions on this topic can be found in the literature. TABLE I gives a selected overview of numerical methods for solving Total Variation models. Describing all these approaches in detail is clearly beyond the scope of this paper. We rather proceed by restricting our investigations to the variational approach.

IV. PARTICLE SWARM OPTIMIZATION

Particle Swarm Optimization (PSO) was first introduced by Kennedy and Eberhart [36] as an optimization method for continuous nonlinear functions. It is a stochastic optimization technique based on individual improvement, social cooperation and competition in the population. PSO is inspired by the behavior of social models such as bird flocking or fish schooling.

Particles (design points) are distributed throughout the design space, and their positions and velocities are modified based on knowledge of the best solution found thus far by each particle in the 'swarm'. Attraction towards the best-found solution occurs stochastically and uses dynamically adjusted particle velocities. Particle positions and velocities are updated as

    x_{k+1}^i = x_k^i + v_{k+1}^i ,    (4)

    v_{k+1}^i = w_k v_k^i + c_1 r_1 (p_k^i − x_k^i) + c_2 r_2 (p_k^g − x_k^i) ,    (5)

where x_k^i represents the current position of particle i in design space and the subscript k indicates a (unit) pseudo-time increment. The point p_k^i is the best-found position of particle i up to time step k and represents the cognitive contribution to the search velocity v_k^i. The point p_k^g is the global best-found position among all particles in the swarm up to time step k and forms the social contribution to the velocity vector. The variable w_k is the particle inertia, which is reduced dynamically to decrease the search area in a gradual fashion (see [37], [38] for further details). The random numbers r_1 and r_2 are uniformly distributed in the interval [0, 1], while c_1 and c_2 are the cognitive and social scaling parameters, respectively (see [38], [39] for further details).

After the update step, the fitness function value f(x_k^i) is calculated for each particle based on its position (the candidate solution represented by the particle). The local best position p_k^i of each particle and the global best position p_k^g are updated using these fitness values.

Initialize Optimization
    Initialize algorithm constants
    Randomly initialize all particle positions and velocities
Perform Optimization
    For k = 1, number of iterations
        For i = 1, number of particles
            Evaluate analysis function f(x_k^i)
        End
        Check convergence
        Update p_k^i, p_k^g, and particle positions and velocities x_{k+1}^i, v_{k+1}^i
    End
Report Results

Figure 1. Pseudo-code for a PSO algorithm.
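Figure 1 can be turned into runnable code. The sketch below is an illustrative addition, not the authors' implementation; the function name and the linearly decreasing inertia schedule are our assumptions, while the default swarm size, iteration count and scaling parameters are taken from TABLE II. It implements the position and velocity updates (4) and (5) in Python with NumPy:

    import numpy as np

    def pso_minimize(f, bounds, n_particles=30, n_iters=20,
                     c1=1.5, c2=1.5, w_start=0.9, w_end=0.4, seed=0):
        # bounds = (lower, upper), each a sequence of length dim.
        rng = np.random.default_rng(seed)
        lo = np.asarray(bounds[0], dtype=float)
        hi = np.asarray(bounds[1], dtype=float)
        dim = lo.size
        x = rng.uniform(lo, hi, size=(n_particles, dim))  # positions x_k^i
        v = np.zeros_like(x)                              # velocities v_k^i
        p_best = x.copy()                                 # local bests p_k^i
        p_val = np.array([f(xi) for xi in x])
        g_best = p_best[np.argmin(p_val)].copy()          # global best p_k^g
        for k in range(n_iters):
            # Linearly decreasing inertia w_k (an assumed schedule).
            w = w_start + (w_end - w_start) * k / max(n_iters - 1, 1)
            r1 = rng.random((n_particles, dim))
            r2 = rng.random((n_particles, dim))
            v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # (5)
            x = np.clip(x + v, lo, hi)                                   # (4)
            vals = np.array([f(xi) for xi in x])
            better = vals < p_val
            p_best[better] = x[better]
            p_val[better] = vals[better]
            g_best = p_best[np.argmin(p_val)].copy()
        return g_best, float(p_val.min())

For example, pso_minimize(lambda z: float(np.sum(z**2)), ([-5.0, -5.0], [5.0, 5.0])) drives the swarm toward the origin of a simple quadratic bowl.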
V. PARTICLE SWARM OPTIMIZATION FOR THE TOTAL VARIATION REGULARIZATION PROBLEM

Having described the TV problem and the PSO algorithm, in this section we describe our method of using PSO to solve the TV problem.

As mentioned before, there is an important parameter in the TV regularization method which must be specified in some way so that it does not affect the solution negatively. We must be careful about the value of this parameter: if we choose it too big, the estimated solution will be over-smoothed, and if we choose it too small, the estimated solution will remain highly contaminated by noise.

Given that this parameter has a great impact on the final answer, the manner in which it changes also matters: after finding the right value for this parameter in one iteration, it is also important to determine it for the next iteration rather than simply assuming it constant.

To this end, we apply PSO to track the changes of this parameter, finding its best value in each iteration. The experimental results show that this method performs well in a particular application, deblurring in image processing; the outcomes are illustrated in the next section.
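The coupling between PSO and the TV solver can be sketched as follows. This is our reading of the method, not the authors' code: each particle encodes one candidate value of λ, and its fitness is the MSE of the TV restoration against the reference image (as specified in TABLE II). Here tv_deblur stands in for any TV solver (the paper uses a Conjugate Gradient based one), the λ search bounds are assumptions, and pso_minimize is the sketch given after Figure 1:

    import numpy as np

    def mse(a, b):
        return float(np.mean((a - b)**2))

    def estimate_lambda(g, reference, tv_deblur, lam_min=1e-3, lam_max=1.0):
        # Fitness of a candidate lambda: restore the degraded image g with
        # TV regularization and score the result against the reference.
        def fitness(lam_vec):
            u = tv_deblur(g, lam=float(lam_vec[0]))  # assumed solver interface
            return mse(u, reference)
        best, _ = pso_minimize(fitness, bounds=([lam_min], [lam_max]))
        return float(best[0])

In the paper this estimation is repeated per outer iteration, which yields the λ trajectory shown later in Figure 3; the sketch shows a single estimation for brevity.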
VI. EXPERIMENTAL RESULTS

The experimental parameters of the proposed method are given in TABLE II.

TABLE II
CONSTANT VALUES

Parameter Name                        Value
Size of the swarm                     30
Maximum number of iterations          20
Cognitive scaling parameter (c_1)     1.5
Social scaling parameter (c_2)        1.5
Fitness function f(x_k^i)             Mean Squared Error (MSE) (squared L2 norm)
Blurring type                         Average, 9x9 window size
Noise type                            Gaussian, SNR = 40
CG iterations                         100

In this experiment we find the deblurred version of an image using the proposed method. We blur the original Lena image by convolving it with an averaging window and then degrade it by adding Gaussian noise. By choosing the MSE as our fitness function, we perform minimization in each iteration. This procedure results in the selection of the best regularization parameter. The computation of TV is based on the Conjugate Gradient algorithm.
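The degradation step just described can be reproduced as follows. This is a sketch from the values in TABLE II; the use of scipy.ndimage.uniform_filter for the 9x9 averaging blur and the interpretation of SNR = 40 as decibels are our assumptions:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def degrade(image, window=9, snr_db=40.0, seed=0):
        # 9x9 averaging blur (TABLE II), then additive Gaussian noise scaled
        # so that the blurred signal-to-noise power ratio equals snr_db.
        blurred = uniform_filter(np.asarray(image, dtype=float), size=window)
        noise_power = np.mean(blurred**2) / 10.0**(snr_db / 10.0)
        rng = np.random.default_rng(seed)
        noise = rng.normal(0.0, np.sqrt(noise_power), size=blurred.shape)
        return blurred + noise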
In the first experiment, we solve the inverse problem using our method, and then solve it again with the regularization parameter fixed to the constant value found in the previous run.

As shown in Fig. 2, the reconstructed images are almost the same, but the regularization parameter in the constant-λ experiment was selected by trial and error, on the basis of the best answer. The proposed method, by contrast, finds the best regularization parameter without any prior knowledge of the system. As can be seen in Tables III and IV, the results of the proposed PSO-based method are better than those obtained with the formerly chosen constant value. This is mainly because, whenever PSO estimates the regularization parameter, it first tries to eliminate the blurring effect by choosing a small regularization parameter, and then tries to eliminate the noise effect by choosing a large one. This approach results in a better performance, and the estimation of the regularization parameter alone is enough to demonstrate the performance of the method in practice.

Figure 2. (Left to right, top to bottom) Lena (original); blurred and noisy; reconstructed with constant λ; reconstructed with PSO.

TABLE III
ERROR MEASUREMENTS WITH CONSTANT λ = 0.041

Measurement Type                                      Value
Improvement in Signal-to-Noise Ratio (ISNR)           7.3681 dB
Signal-to-Noise Ratio (SNR)                           22.5175 dB
Peak Signal-to-Noise Ratio (PSNR)                     29.6989 dB
Mean Squared Error (MSE) (squared L2 norm)            69.6936
Root Mean Squared Error (RMSE) (L2 norm)              8.3483
Mean Absolute Error (MAE) (L1 norm)                   5.2631
Maximum Absolute Difference (MAX) (L∞ norm)           86.5597

TABLE IV
ERROR MEASUREMENTS WITH PSO

Measurement Type                                      Value
Improvement in Signal-to-Noise Ratio (ISNR)           8.4123 dB
Signal-to-Noise Ratio (SNR)                           23.5617 dB
Peak Signal-to-Noise Ratio (PSNR)                     30.7431 dB
Mean Squared Error (MSE) (squared L2 norm)            54.7991
Root Mean Squared Error (RMSE) (L2 norm)              7.4026
Mean Absolute Error (MAE) (L1 norm)                   4.6894
Maximum Absolute Difference (MAX) (L∞ norm)           77.462
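For reference, the measurements reported in Tables III and IV follow the standard definitions below (a sketch, not the authors' code; x is the original image, y the degraded input, u the reconstruction, and the 8-bit peak value of 255 used for PSNR is an assumption):

    import numpy as np

    def error_measures(x, y, u, peak=255.0):
        # x: original, y: degraded input, u: reconstruction.
        err = u - x
        mse = float(np.mean(err**2))                 # MSE (squared L2 norm)
        return {
            "ISNR (dB)": 10 * np.log10(np.mean((y - x)**2) / mse),
            "SNR (dB)":  10 * np.log10(np.mean(x**2) / mse),
            "PSNR (dB)": 10 * np.log10(peak**2 / mse),
            "MSE":  mse,
            "RMSE": float(np.sqrt(mse)),             # L2 norm
            "MAE":  float(np.mean(np.abs(err))),     # L1 norm
            "MAX":  float(np.max(np.abs(err))),      # L-infinity norm
        }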
[Figure 3 plots the TV regularization parameter (vertical axis, approximately 0.022 to 0.042) against iteration (horizontal axis, 0 to 100).]

Figure 3. Changes of the TV regularization parameter due to PSO.

CONCLUSION

In this work we have shown that, with the help of the PSO algorithm, we can estimate the regularization parameter in the TV method. We have also discussed that the manner in which this parameter changes is very important, and that it should not be assumed constant throughout the whole process. The advantages of applying the proposed PSO algorithm to the TV problem are demonstrated in the paper by experimental results.

REFERENCES

[1] J. Hadamard. Sur les problèmes aux dérivées partielles et leur signification physique. Princeton University Bulletin, pages 49-52, 1902.
[2] R. C. Eberhart and Y. Shi. Comparison between genetic algorithms and particle swarm optimization. In V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben, editors, Proceedings of the Seventh Annual Conference on Evolutionary Programming, pages 611-619. Springer-Verlag, March 1998.
[3] G. Venter and J. Sobieszczanski-Sobieski. Multidisciplinary optimization of a transport aircraft wing using particle swarm optimization. In 9th AIAA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, Atlanta, GA, 2002.
[4] P. C. Fourie and A. A. Groenwold. The particle swarm algorithm in topology optimization. In Proceedings of the 4th World Congress of Structural and Multidisciplinary Optimization, Dalian, China, May 2001.
[5] R. C. Eberhart and Y. Shi. Particle swarm optimization: developments, applications and resources. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2001), Korea, 27-30 May 2001, pages 81-86. IEEE Press, New York.
[6] J. A. Reinbolt, J. F. Schutte, B. J. Fregly, B. I. Koh, R. T. Haftka, A. D. George, and K. H. Mitchell. Determination of patient-specific multi-joint kinematic models through two-level optimization. Journal of Biomechanics, 38:621-626, 2005.
[7] J. F. Schutte, J. A. Reinbolt, B. J. Fregly, R. T. Haftka, and A. D. George. Parallel global optimization with particle swarm algorithm. International Journal for Numerical Methods in Engineering, 61:2296-2315, 2004.
[8] D. Gies and Y. Rahmat-Samii. Particle swarm optimization for reconfigurable phase-differentiated array design. Microwave and Optical Technology Letters, 38:168-175, 2003.
[9] J. F. Schutte, B. I. Koh, J. A. Reinbolt, B. J. Fregly, R. T. Haftka, and A. D. George. Evaluation of a particle swarm algorithm for biomechanical optimization. Journal of Biomechanical Engineering, 127:465-474, 2005.
[10] P. Huber. Robust Statistics. Wiley, New York, 1981.
[11] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259-268, 1992.
[12] J.-F. Aujol, G. Gilboa, T. Chan, and S. Osher. Structure-texture image decomposition: modeling, algorithms, and parameter selection. Int. J. Comp. Vis., 67(1):111-136, 2006.
[13] M. Nikolova. A variational approach to remove outliers and impulse noise. J. Math. Imaging Vis., 20(1-2):99-120, 2004.
[14] T. Chan and S. Esedoglu. Aspects of total variation regularized L1 function approximation. SIAM J. Appl. Math., 65(5):1817-1837, 2004.
[15] T. Chen, W. Yin, X. Zhou, D. Comaniciu, and T. Huang. Total variation models for variable lighting face recognition. IEEE Trans. Pattern Anal. Mach. Intell., 28(9):1519-1524, 2006.
[16] M. Nikolova, S. Esedoglu, and T. Chan. Algorithms for finding global minimizers of image segmentation and denoising models. SIAM Journal of Applied Mathematics, 66(5):1632-1648, 2006.
[17] A. Marquina and S. Osher. Explicit algorithms for a new time dependent model based on level set motion for nonlinear deblurring and noise removal. SIAM J. Sci. Comput., 22:387-405, 2000.
[18] C. Vogel and M. Oman. Iteration methods for total variation denoising. SIAM J. Sci. Comp., 17:227-238, 1996.
[19] C. Vogel. A multigrid method for total variation-based image denoising. Progress in Systems and Control Theory, 1995.
[20] A. Chambolle and P.-L. Lions. Image recovery via total variation minimization and related problems. Numer. Math., 76:167-188, 1997.
[21] T. Chan, G. Golub, and P. Mulet. A nonlinear primal-dual method for total variation-based image restoration. SIAM J. Sci. Comp., 20(6):1964-1977, 1999.
[22] A. Chambolle. An algorithm for total variation minimization and applications. J. Math. Imaging Vis., 2004.
[23] J. Carter. Dual Methods for Total Variation-based Image Restoration. PhD thesis, UCLA, Los Angeles, CA, 2001.
[24] M. Hintermüller and K. Kunisch. Total bounded variation regularization as bilaterally constrained optimization problems. SIAM J. Appl. Math., 64(4):1311-1333, 2004.
[25] A. Chambolle. Total variation minimization and a class of binary MRF models. In Energy Minimization Methods in Computer Vision and Pattern Recognition, pages 136-152, 2005.
[26] M. K. Ng, L. Qi, Y.-F. Yang, and Y.-M. Huang. On semismooth Newton's method for total variation minimization. J. Math. Imaging Vis., 27:265-276, 2007.
[27] C. Frohn-Schauf, S. Henn, and K. Witsch. Nonlinear multigrid methods for total variation image denoising. Comput. Visual Sci., pages 199-206, 2004.
[28] A. Bruhn and J. Weickert. Towards ultimate motion estimation: combining highest accuracy with real-time performance. In Proc. 11th Int. Conf. Comp. Vis., pages 749-755, 2005.
[29] J. Savage and K. Chen. An improved and accelerated nonlinear multigrid method for total-variation denoising. J. Math. Imaging Vis., 82(8):1001-1015, 2005.
[30] T. Chan, K. Chen, and J. Carter. Iterative methods for solving the dual formulation arising from image restoration. Electronic Transactions on Numerical Analysis, 26:299-311, 2007.
[31] T. Chan and K. Chen. On a nonlinear multigrid algorithm with primal relaxation for the image total variation minimisation. Numerical Algorithms, 41:387-411, 2006.
[32] P. Weiss, L. Blanc-Féraud, and G. Aubert. Efficient schemes for total variation minimization under constraints in image processing. Technical report, INRIA, 2007.
[33] D. Goldfarb and W. Yin. Second-order cone programming methods for total variation-based image restoration. SIAM Journal on Scientific Computing, 27(2):622-645, 2005.
[34] J. Darbon and M. Sigelle. Image restoration with discrete constrained total variation, part I: fast and exact optimization. J. Math. Imaging Vis., 26(3):261-276, 2006.
[35] D. Goldfarb and W. Yin. Parametric maximum flow algorithms for fast total variation minimization. Technical report, Rice University, 2007.
[36] R. C. Eberhart and J. Kennedy. A new optimizer using particle swarm theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 1995.
[37] P. C. Fourie and A. A. Groenwold. The particle swarm algorithm in topology optimization. In Proceedings of the 4th World Congress of Structural and Multidisciplinary Optimization, Dalian, China, May 2001.
[38] J. F. Schutte, J. A. Reinbolt, B. J. Fregly, R. T. Haftka, and A. D. George. Parallel global optimization with particle swarm algorithm. International Journal for Numerical Methods in Engineering, 61:2296-2315, 2004.
[39] Y. Shi and R. C. Eberhart. Parameter selection in particle swarm optimization. In Evolutionary Programming VII, V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben, editors, Lecture Notes in Computer Science, vol. 1447, pages 591-600. Springer, Berlin, 1998.
