
10th Brazilian Symposium on Neural Networks

On Frequency Sensitive Competitive Learning for VQ Codebook Design


P. H. Espírito Santo, R. C. Albuquerque, D. C. Cunha and F. Madeiro
University of Pernambuco, Department of Electrical Engineering
{paulohugos, rebeccacamile, dccunha}@gmail.com, franciscomadeiro@yahoo.com.br

Abstract
Vector quantization (VQ) plays an important role in many image coding systems. The present paper examines the application of frequency sensitive competitive learning to VQ codebook design. Simulation results are presented for: the influence of the initialization on codebook design; the normalized entropy of the codevectors; the number of multiplications spent in codebook training; the evolution of the codebook at the end of each iteration of the algorithm; and the quality of the designed codebooks in terms of the peak signal-to-noise ratio of the reconstructed images. A comparison with both the LBG (Linde-Buzo-Gray) algorithm and non-frequency-sensitive competitive learning is presented.

1. Introduction

Multimedia communications, web browsing, integrated services digital networks (ISDN), storage of medical images and business documents, archiving of fingerprints, and transmission of remote sensing images obtained from satellites are examples of typical applications which have motivated intensive research and significant progress in image coding techniques. In those applications, the fundamental purpose of image compression is to reduce the number of bits needed to represent the image (while maintaining the necessary/acceptable image quality), in order to minimize storage and transmission requirements. In this scenario, vector quantization (VQ) plays an important role in numerous image coding systems, allowing high compression rates [1, 2]. According to Shannon's rate-distortion theory [3], better performance is achieved by coding vectors instead of scalars; that is, vector quantization presents some advantages over scalar quantization. The most traditional technique for designing vector quantizers is the Linde-Buzo-Gray (LBG) algorithm [4]. It produces good VQ codebooks from iterative updates of an initial codebook, driven by a training sequence. The LBG algorithm is a procedure that decreases the distortion at each pass over the training sequence; it is highly sensitive to the initial codebook in the sense that both the performance and the convergence speed depend on which initial codebook is used [5]. Competitive learning can also be applied to codebook design [6, 7, 10, 11, 12, 13].

The present paper is concerned with an evaluation of frequency sensitive competitive learning applied to vector quantization (VQ) codebook design. A comparative evaluation with the LBG algorithm and with a non-frequency-sensitive competitive learning algorithm is performed. The paper is organized as follows. Section 2 presents an overview of vector quantization. The algorithms under consideration are described in Section 3. Simulation results are presented in Section 4. Finally, conclusions are drawn in Section 5.

2. Vector Quantization
Vector quantization [1, 2] can be seen as an extension of scalar quantization to a multidimensional space. A vector quantizer (or multidimensional quantizer) is a mapping $Q$ from a vector $\mathbf{x}$ in the $K$-dimensional Euclidean space $\mathbb{R}^K$ into a finite subset $W$ of $\mathbb{R}^K$ containing $N$ distinct reproduction vectors. Thus,

$$Q : \mathbb{R}^K \to W, \quad (1)$$

where the codebook $W = \{\mathbf{w}_i;\ i = 1, 2, \ldots, N\}$ is the set of $K$-dimensional codevectors, also known as reconstruction vectors, template vectors or quantization vectors. Hence, it is assumed that the codebook size is $N$. The corresponding code rate of the vector quantizer, which measures the number of bits per vector component, is $R = \frac{1}{K} \log_2 N$. A vector quantizer can be considered as a combination of two separate mappings: a VQ encoder and a VQ decoder. The former follows the nearest neighbor rule to output the binary index of the codevector that presents the greatest similarity to the source vector to be coded.



In the latter, upon receiving the binary representation of index $i$, the decoder simply looks up the $i$-th codevector, $\mathbf{w}_i$, from a copy of the codebook $W$ and outputs $\mathbf{w}_i$ as the reproduction (reconstruction) of $\mathbf{x}$.

Codebook design plays a fundamental role in communications systems based on vector quantization. Techniques for codebook design attempt to produce a codebook that is optimum for a given source, in the sense that the average distortion in representing the input vectors by the corresponding codevectors is kept to a minimum. The most widely used technique for codebook design is the LBG algorithm [4]. Neural networks have also been successfully used for training vector quantizers [7, 10, 11, 12, 13].
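As an illustration, a minimal sketch of the nearest neighbor encoder and table lookup decoder described above (the NumPy-based functions and their names are illustrative, not taken from the paper):

```python
import numpy as np

def vq_encode(x, codebook):
    """Nearest neighbor rule: index of the codevector closest to x
    in squared Euclidean distance."""
    distances = np.sum((codebook - x) ** 2, axis=1)   # codebook: (N, K), x: (K,)
    return int(np.argmin(distances))

def vq_decode(index, codebook):
    """Table lookup: reproduce x by its codevector."""
    return codebook[index]

# Toy example with K = 2 and N = 4, so R = (1/K) log2(N) = 1 bit per component.
codebook = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
x = np.array([0.9, 0.2])
i = vq_encode(x, codebook)       # -> 2
x_hat = vq_decode(i, codebook)   # -> array([1., 0.])
```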

3. Codebook Design Algorithms

3.1. LBG Algorithm

The LBG algorithm, also known as the Generalized Lloyd Algorithm (GLA) or K-means algorithm, is the most widely used method for VQ codebook design. Using a representative vector sequence for codebook training, the LBG algorithm is described as follows. Let the iteration of the LBG algorithm be denoted by $n$. Given $K$, $N$ and a distortion threshold $\epsilon \geq 0$, the LBG algorithm [4] consists of the following steps:

Step (1) Initialization: given an initial codebook $W_0$ and a training set $X = \{\mathbf{x}_m;\ m = 1, 2, \ldots, M\}$, set $n = 0$ and $D_{-1} = \infty$.

Step (2) Partitioning: given $W_n$ (the codebook at the $n$-th iteration), assign each training vector (input vector) to the corresponding class (Voronoi cell) $S_i$ according to the nearest neighbor rule; determine the distortion

$$D_n = \sum_{i=1}^{N} \sum_{\mathbf{x}_m \in S_i} d(\mathbf{x}_m, \mathbf{w}_i). \quad (2)$$

Step (3) Convergence Test (stop criterion): if $(D_{n-1} - D_n)/D_n \leq \epsilon$, then stop, with $W_n$ representing the final (designed) codebook; else, continue.

Step (4) Codebook Updating: calculate the new codevectors as the centroids of the classes, which form $W_{n+1}$; set $n \leftarrow n + 1$ and go to Step (2).

The distortion decreases monotonically in the LBG algorithm, since the codebook is iteratively updated attempting to satisfy the centroid and nearest neighbor conditions. In the LBG algorithm, the distortion introduced by representing the training vectors by the corresponding codevectors (centroids) is monitored at each iteration. The stopping rule (convergence test) is based on that monitored distortion: the codebook training stops when $(D_{n-1} - D_n)/D_n \leq \epsilon$. The convergence speed of the LBG algorithm depends on the initial codebook.
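A compact sketch of Steps (1)-(4), assuming the squared Euclidean distance of Eq. (2); the function name and array layout are ours:

```python
import numpy as np

def lbg(training_set, initial_codebook, eps=0.001):
    """LBG / Generalized Lloyd algorithm: alternate nearest neighbor
    partitioning and centroid updates until the relative decrease of the
    distortion falls below the threshold eps."""
    codebook = initial_codebook.copy()   # shape (N, K)
    prev_distortion = np.inf             # D_{-1} = infinity
    while True:
        # Step (2) Partitioning: nearest neighbor assignment and distortion, Eq. (2)
        d2 = ((training_set[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        distortion = d2[np.arange(len(training_set)), labels].sum()
        # Step (3) Convergence test
        if distortion == 0 or (prev_distortion - distortion) / distortion <= eps:
            return codebook
        prev_distortion = distortion
        # Step (4) Codebook updating: centroids of the Voronoi cells
        for i in range(len(codebook)):
            cell = training_set[labels == i]
            if len(cell) > 0:            # empty cells keep their previous codevector
                codebook[i] = cell.mean(axis=0)
```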

3.2. Competitive Algorithm

In simple competitive learning applied to codebook design, at each presentation of a training vector only the winner (i.e., the weight vector most similar to the training vector) has its components updated. After the initialization of the weights (i.e., of the components of the $N$ $K$-dimensional codevectors), codebooks can be designed by the competitive algorithm (from now on denoted simply by CL) described as follows.

CL algorithm:
For $1 \leq n \leq n_{CL}$
  For $1 \leq m \leq M$
    Find the winner $\mathbf{w}_{i^*}(n, m)$:
    $$i^* = \arg\min_i d[\mathbf{x}(m), \mathbf{w}_i(n, m)].$$
    Update the winner according to
    $$w'_{i^*j}(n, m) = w_{i^*j}(n, m) + \Delta w_{i^*j}(n, m), \quad (3)$$
    where
    $$\Delta w_{i^*j}(n, m) = \alpha(n)\,[x_j(m) - w_{i^*j}(n, m)]. \quad (4)$$

In the previous description, $n_{CL}$ is the total number of iterations of the CL algorithm, $\mathbf{x}(m)$ is the $m$-th vector of the training set¹, while $\mathbf{w}_i(n, m)$ and $\mathbf{w}_{i^*}(n, m)$ denote, respectively, the $i$-th codevector and the winner when the presentation of the $m$-th training vector at the $n$-th iteration is considered. In its turn,

$$d[\mathbf{x}(m), \mathbf{w}_i(n, m)] = \sum_{j=1}^{K} [x_j(m) - w_{ij}(n, m)]^2 \quad (5)$$

denotes the squared Euclidean distance between the vectors $\mathbf{x}(m)$ and $\mathbf{w}_i(n, m)$, where $x_j(m)$ is the $j$-th component of the vector $\mathbf{x}(m)$ and $w_{ij}(n, m)$ is the $j$-th component of the vector $\mathbf{w}_i(n, m)$. In the expression describing the winner update, $\Delta w_{i^*j}$ is the modification introduced in the $j$-th component of the winner, $\alpha(n)$ is the learning rate or adaptation gain at the $n$-th iteration ($0 < \alpha(n) < 1$), $w_{i^*j}$ is the $j$-th component of the winner and $w'_{i^*j}$ is the updated version of that component².

¹ It is important to observe that, for notational convenience, $\mathbf{x}(m)$ is used in the description of the CL algorithm and $\mathbf{x}_m$ in the description of the LBG algorithm.
² Observe that the arguments $n$ and $m$ of $w_{i^*j}(n, m)$, $\Delta w_{i^*j}(n, m)$ and $w'_{i^*j}(n, m)$ were omitted to simplify the notation.


In the CL algorithm previously described, the learning rate decreases linearly with the iteration $n$, remaining constant along an iteration, i.e., the learning rate is kept constant during each complete presentation of the $M$ training vectors. Hence, the learning rate is given by

$$\alpha(n) = \alpha(1) + (n - 1)\,\frac{\alpha(n_{CL}) - \alpha(1)}{n_{CL} - 1}, \quad (6)$$

where $\alpha(1)$ and $\alpha(n_{CL})$ denote two parameters of the CL algorithm: the initial learning rate and the final learning rate, respectively.
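A minimal sketch of the CL training loop with the linear learning rate schedule of Eq. (6); the parameter names and default values are illustrative:

```python
import numpy as np

def cl_train(training_set, codebook, n_cl=50, alpha_first=0.9, alpha_last=0.01):
    """Simple competitive learning: for each training vector only the winner is
    moved toward it; alpha decreases linearly from alpha(1) to alpha(n_cl) and
    is held constant within an iteration (assumes n_cl > 1)."""
    codebook = codebook.copy()                       # shape (N, K)
    for n in range(1, n_cl + 1):
        # Linear learning rate schedule, Eq. (6)
        alpha = alpha_first + (n - 1) * (alpha_last - alpha_first) / (n_cl - 1)
        for x in training_set:
            # Winner: smallest squared Euclidean distance, Eq. (5)
            i_star = np.argmin(((codebook - x) ** 2).sum(axis=1))
            # Update only the winner, Eqs. (3) and (4)
            codebook[i_star] += alpha * (x - codebook[i_star])
    return codebook
```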

3.3. FSCL Algorithm

The main problem with simple competitive learning is that some neurons (codevectors, in the context of VQ) may have little or no chance of winning the competition. This can result in neurons that have not been sufficiently trained (under-utilized neurons). Based on the implementation of Grossberg's conscience principle [8], the technique denominated Frequency Sensitive Competitive Learning (FSCL), presented by Krishnamurthy et al. [9], attempts to train all codevectors equally, i.e., it tries to update all codevectors approximately the same number of times. In the FSCL algorithm, the frequency with which each codevector wins the competition is monitored. This information is used during the training to ensure that all codevectors have approximately the same opportunity to be updated. Precisely, the FSCL algorithm uses a modified distortion (distance) measure, presented in [9], incorporating the frequency with which each codevector is declared winner. Let $d[\mathbf{x}(m), \mathbf{w}_i(n, m)]$ be the distortion measure used by the CL algorithm. The modified distortion measure used by the FSCL algorithm is given by

$$d[\mathbf{x}(m), \mathbf{w}_i(n, m)] = f_i \sum_{j=1}^{K} [x_j(m) - w_{ij}(n, m)]^2, \quad (7)$$

where $f_i$ denotes the number of times the $i$-th codevector has been declared winner so far. Equation (7) shows that if a codevector wins frequently, its distortion $d$ will increase. As a consequence, its chance of winning the next competitions will diminish. Therefore, other codevectors with small values of $f_i$ will have the opportunity to win a competition (to be updated) in the next presentations of training vectors. This adaptive feature of the FSCL algorithm allows the codevectors to be updated approximately the same number of times during the training.
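A sketch of FSCL along the same lines as the CL loop above, with the winner chosen by the frequency-weighted distance of Eq. (7); initializing the win counts to one (so the weighted distance is well defined at the start) is an implementation choice of ours, not something stated in the paper or in [9]:

```python
import numpy as np

def fscl_train(training_set, codebook, n_it=50, alpha_first=0.9, alpha_last=0.01):
    """Frequency sensitive competitive learning: codevectors that win often get
    a larger (penalized) distance, so under-used codevectors also get updated."""
    codebook = codebook.copy()                        # shape (N, K)
    wins = np.ones(len(codebook))                     # f_i, win counts (see note above)
    for n in range(1, n_it + 1):
        alpha = alpha_first + (n - 1) * (alpha_last - alpha_first) / (n_it - 1)
        for x in training_set:
            d2 = ((codebook - x) ** 2).sum(axis=1)    # squared Euclidean distances
            i_star = np.argmin(wins * d2)             # Eq. (7): frequency-weighted distance
            wins[i_star] += 1
            codebook[i_star] += alpha * (x - codebook[i_star])
    return codebook, wins
```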

4. Simulation Results

Simulations were carried out for three traditional 256 x 256 images: Lena, Elaine and Peppers, depicted in Figure 1. For all simulations, the dimension of the vector quantization codebooks was $K = 16$, corresponding to image blocks of 4 x 4 pixels. Five coding rates, $R$ = 0.3125, 0.375, 0.4375, 0.5 and 0.5625 bpp, were considered, corresponding to codebook sizes $N$ = 32, 64, 128, 256 and 512. Given an image, thirty different codebooks were designed for each value of the coding rate. The distortion threshold for the LBG algorithm was $\epsilon = 0.001$.

Figure 1. Original images: (a) Lena, (b) Elaine, (c) Peppers.

The first set of simulations concerned the sensitivity of the algorithms to the initial codebooks, assessed by the coefficient of variation (CV) of the PSNR of the reconstructed images: the greater the sensitivity, the higher the value of the CV. Analyzing Tables 1, 2 and 3, one can notice that, in general, the smallest sensitivity to the initial codebook is presented by the CL algorithm.

Table 1. Coefficient of variation for the PSNR values of the reconstructed Elaine image.

  N     CL      FSCL    LBG
  32    0.174   0.255   0.272
  64    0.108   0.284   0.175
  128   0.113   0.296   0.187
  256   0.202   0.212   0.253
  512   2.064   2.496   1.810

Table 2. Coefficient of variation for the PSNR values of the reconstructed Lena image.

  N     CL      FSCL    LBG
  32    0.213   0.232   0.264
  64    0.157   0.305   0.284
  128   0.137   0.227   0.253
  256   0.114   0.176   0.264
  512   3.206   2.674   1.949


Table 3. Coefficient of variation for the PSNR values of the reconstructed Peppers image.

  N     CL      FSCL    LBG
  32    0.199   0.474   0.285
  64    0.127   0.243   0.218
  128   0.128   0.212   0.272
  256   0.144   0.178   0.327
  512   3.186   3.128   2.007
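For reference, the PSNR used throughout this section and the coefficient of variation reported in Tables 1-3 can be computed as sketched below for 8-bit images; expressing the CV as a percentage is an assumption about the tables' scaling, not something stated in the paper:

```python
import numpy as np

def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

def coefficient_of_variation(psnr_values):
    """CV of the PSNR over codebooks designed from different initializations:
    standard deviation divided by mean (here given in percent; see note above)."""
    values = np.asarray(psnr_values, dtype=float)
    return 100.0 * values.std() / values.mean()
```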

Simulations were also conducted in order to evaluate the normalized entropy of the codevectors, a metric that approaches one as the codevectors tend to be updated with the same frequency [14]. According to Tables 4, 5 and 6, the FSCL algorithm indeed has conscience, since it attempts to give all the codevectors the same chance to be adjusted: the FSCL algorithm leads to a training (codebook design) which produces codebooks with higher normalized entropy values when compared to the ones obtained by the CL algorithm.
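A sketch of the normalized entropy of the codevector update frequencies, under our reading of the metric (the entropy of the relative win frequencies divided by its maximum value, log2 N); the exact formula of [14] may differ in detail:

```python
import numpy as np

def normalized_entropy(win_counts):
    """Approaches 1 when all codevectors are updated with the same frequency."""
    counts = np.asarray(win_counts, dtype=float)
    p = counts / counts.sum()            # relative update frequencies
    p = p[p > 0]                         # skip never-updated codevectors (log of 0)
    return float(-(p * np.log2(p)).sum() / np.log2(len(counts)))
```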

Table 4. Normalized entropy for the reconstructed Elaine image.

  N     CL      FSCL
  32    0.876   0.929
  64    0.865   0.960
  128   0.895   0.958
  256   0.926   0.971
  512   0.944   0.965

Table 5. Normalized entropy for the reconstructed Lena image.

  N     CL      FSCL
  32    0.857   0.943
  64    0.810   0.940
  128   0.833   0.955
  256   0.784   0.950
  512   0.850   0.954

Table 6. Normalized entropy for the reconstructed Peppers image.

  N     CL      FSCL
  32    0.860   0.913
  64    0.830   0.933
  128   0.843   0.935
  256   0.810   0.945
  512   0.813   0.940

The best PSNR results for each of the three codebook design algorithms are presented in Figures 2, 3 and 4 for the Elaine, Lena and Peppers images, respectively. One can see that the best performance is presented by the CL algorithm and the worst PSNR results are associated with the FSCL codebooks. As an example, for Lena, a PSNR gain of about 0.5 dB is obtained by substituting the LBG codebook by the CL codebook at a coding rate of 0.5 bpp.

Figure 2. PSNR (dB) versus coding rate (bpp) for LBG, CL and FSCL algorithms: Elaine image.

Figure 3. PSNR (dB) versus coding rate (bpp) for LBG, CL and FSCL algorithms: Lena image.

Figure 4. PSNR (dB) versus coding rate (bpp) for LBG, CL and FSCL algorithms: Peppers image.

Figures 5 and 6 exhibit the PSNR values of the reconstructed images by the end of each iteration of the algorithms under consideration. It is observed that the CL algorithm produces the best codebooks (the ones that lead to the highest PSNR values for the reconstructed images), requiring, for that purpose, fewer iterations than both LBG and FSCL.

Figure 5. PSNR (dB) versus iteration number for LBG, CL and FSCL algorithms: Elaine image.

Figure 6. PSNR (dB) versus iteration number for LBG, CL and FSCL algorithms: Lena image.

Table 7 shows that, for all values of codebook size $N$ evaluated, the number of multiplications performed by the FSCL algorithm is slightly higher than that of the CL algorithm. It is also observed that both algorithms require a smaller number of multiplications when compared to the LBG algorithm. As an example, for $N = 128$, $1.258 \times 10^8$, $5.072 \times 10^7$ and $5.387 \times 10^7$ multiplications are performed by the LBG, CL and FSCL algorithms, respectively. It is worth mentioning that, in spite of using a smaller number of multiplications with respect to LBG, the CL algorithm leads to reconstructed images with higher values of PSNR.

Table 7. Number of multiplications performed by the LBG, CL and FSCL algorithms for the Elaine image as training set.

  N     LBG            CL             FSCL
  32    5.033 x 10^7   1.081 x 10^7   1.147 x 10^7
  64    7.550 x 10^7   2.556 x 10^7   2.713 x 10^7
  128   1.258 x 10^8   5.072 x 10^7   5.387 x 10^7
  256   2.181 x 10^8   1.011 x 10^8   1.073 x 10^8
  512   3.355 x 10^8   2.017 x 10^8   2.143 x 10^8

5. Conclusions

The present paper was concerned with an evaluation of frequency sensitive competitive learning applied to vector quantization (VQ) codebook design. The FSCL (frequency sensitive competitive learning) algorithm was compared to the LBG (Linde-Buzo-Gray) algorithm and to a non-frequency-sensitive competitive learning algorithm, referred to simply as the CL (competitive learning) algorithm. It was observed that the CL algorithm is the least sensitive to the initialization. Regarding the normalized entropy, simulation results have shown that the FSCL algorithm attempts to give all the codevectors the same chance to be adjusted: the FSCL algorithm exhibits normalized entropy values closer to one when compared to the CL algorithm. In the learning algorithms considered in the present paper, the learning rate decreases linearly with the iteration, remaining constant along an iteration, i.e., the learning rate is kept constant during each complete presentation of the training vectors. It was observed that the CL algorithm produces the best codebooks (the ones that lead to the highest PSNR values for the reconstructed images), requiring, for that purpose, fewer iterations than both LBG and FSCL. As a final comment, the number of multiplication operations required by the FSCL algorithm was slightly higher than that of the CL algorithm, but lower than that of LBG.

References
[1] Gersho, A. and Gray, R. M. Vector Quantization and Signal Compression. Kluwer Academic Publishers, Boston, MA, 1992.
[2] Gray, R. M. Vector Quantization. IEEE ASSP Magazine, pp. 4-29, April 1984.
[3] Berger, T. Rate Distortion Theory: A Mathematical Basis for Data Compression. Prentice-Hall, Englewood Cliffs, NJ, 1971.
[4] Linde, Y., Buzo, A. and Gray, R. M. An Algorithm for Vector Quantizer Design. IEEE Transactions on Communications, vol. COM-28, No. 1, pp. 84-95, January 1980.
[5] Lee, D., Baek, S., and Sung, K. Modified K-means Algorithm for Vector Quantizer Design. IEEE Signal Processing Letters, vol. 4, No. 1, pp. 2-4, January 1997.
[6] Madeiro, F., Vajapeyam, M. S., Morais, M. R., Aguiar Neto, B. G., and Alencar, M. S. Multiresolution Codebook Design for Wavelet/VQ Image Coding. Proceedings of the 15th International Conference on Pattern Recognition (ICPR'2000), vol. 3, Barcelona, Spain, pp. 79-82, September 2000.
[7] Kohonen, T. Self-Organization and Associative Memory (3rd ed.). Springer-Verlag, Berlin, 1984.
[8] Grossberg, S. Adaptive Pattern Classification and Universal Recoding I: Parallel Development and Coding of Neural Feature Detectors. Biological Cybernetics, vol. 23, pp. 121-134, 1976.
[9] Krishnamurthy, A. K., Ahalt, S. C., Melton, D. E., and Chen, P. Neural Networks for Vector Quantization of Speech and Images. IEEE Journal on Selected Areas in Communications, vol. 8, No. 8, pp. 1449-1457, October 1990.
[10] Chang, C.-H., Xu, P., Xiao, R., and Srikanthan, T. New Adaptive Color Quantization Method Based on Self-Organizing Maps. IEEE Transactions on Neural Networks, vol. 16, No. 1, pp. 237-249, January 2005.
[11] Wu, F. H. and Ganesan, K. Comparative Study of Algorithms for VQ Design Using Conventional and Neural-net Based Approaches. Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'89), vol. 2, pp. 751-754, May 1989.
[12] Nasrabadi, N. M. and Feng, Y. Vector Quantization of Images Based Upon the Kohonen Self-organizing Feature Maps. Proceedings of the IEEE International Conference on Neural Networks, vol. 1, pp. 101-108, July 1988.
[13] Chen, O. T.-C., Sheu, B. J., and Fang, W.-C. Image Compression Using Self-organization Networks. IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, No. 5, pp. 480-489, October 1994.
[14] Paliwal, K. K. and Ramasubramanian, V. Effect of Ordering the Codebook on the Efficiency of the Partial Distance Search Algorithm for Vector Quantization. IEEE Transactions on Communications, vol. 37, No. 5, pp. 538-540, May 1989.
