You are on page 1of 6

A Spatial Median Filter for Noise Removal in Digital Images

James Church, Dr. Yixin Chen, and Dr. Stephen Rice Computer Science and Information System, University of Mississippi jcchurch@olemiss.edu,{ychen,rice}@cs.olemiss.edu

Abstract
In this paper, six different image ltering algorithms are compared based on their ability to reconstruct noiseaffected images. The purpose of these algorithms is to remove noise from a signal that might occur through the transmission of an image. A new algorithm, the Spatial Median Filter, is introduced and compared with the current image smoothing techniques. Experimental results demonstrate that the proposed algorithm is comparable to popular image smoothing algorithms. In addition, a modication to this algorithm is introduced to achieve more accurate reconstructions over other popular techniques.

Each of the algorithms covered can be applied to one dimensional as well as two dimensional signals. Figure 1 demonstrates ve common ltering algorithms applied to an original image.

(a)

(b)

(c)

(d)

(e)

(f)

1. Smoothing Algorithms
The inexpensiveness and simplicity of point-andshoot cameras, combined with the speed at which budding photographers can send their photos over the Internet to be viewed by the world, makes digital photography a popular hobby. With each snap of a digital photograph, a signal is transmitted from a photon sensor to a memory chip embedded inside a camera. Transmission technology is prone to a degree of error, and noise is added to each photograph. Signicant work has been done in both hardware and software to improve the signal-to-noise ratio in digital photography. In software, a smoothing lter is used to remove noise from an image. Each pixel is represented by three scalar values representing the red, green, and blue chromatic intensities. At a pixel studied, a smoothing lter takes into account the pixels surrounding it in order to make a determination of a more accurate version of this pixel. By taking neighboring pixels into consideration, extreme noisy pixels can be ltered out. Unfortunately, extreme pixels can also represent original ne details, which can also be lost due to the smoothing process. This paper will examine four common smoothing algorithms and introduce a new smoothing algorithm.

Figure 1. Examples of common ltering approaches. (a) Original Image (b) Mean Filtering (c) Median Filtering (d) Root signal of Median Filtering (e) Component-wise Median Filtering (f) Vector Median Filtering The simplest of smoothing algorithms is the Mean Filter as dened in (1). The Mean Filter is a linear lter which uses a mask over each pixel in the signal. Each of the components of the pixels which fall under the mask are averaged together to form a single pixel. This new pixel is then used to replace the pixel in the signal studied. The Mean Filter is poor at maintaining edges within the image. 1 N xi N i=1

MEANFILT ER(x1 , ..., xN ) =

(1)

The median is a statistical concept whereby in a given sorted list of numbers, the median is the center value of the list. The use of the median in signal processing was rst introduced by J. W. Tukey[1]. If the count of the list is even, there could be multiple center values. If the count of the list is odd, there is one unique median value; therefore, it is convenient to use odd list

sizes when looking for a median. The Median Filter is performed by taking the magnitude of all of the vectors within a mask and sorting the magnitudes, as dened in (2). The pixel with the median magnitude is then used to replace the pixel studied. The Simple Median Filter has an advantage over the Mean lter in that it relies on median of the data instead of the mean. As single noisy pixel present in the image can signicantly skew the mean of a set. The median of a set is more robust with respect to the presence of noise. MEDIANFILT ER(x1 , ..., xN ) = MEDIAN( x1 2 , ..., xN 2 ) (2)

the signal studied. The Vector Median Filter is a wellresearched lter and popular due to the extensive modications that can be performed in conjunction with it. V MF(x1 , ..., xN ) = MIN( x1 xi , ..., xN xi )
i=1 i=1 N N

When ltering using the Simple Median Filter, an original pixel and the resulting ltered pixel of the sample studied are sometimes the exact same pixel. A pixel that does not change due to ltering is known as the root of the mask. It can be shown that after sufcient iterations of median ltering, every signal converges to a root signal[3]. The Component Median Filter, dened in (3), also relies on the statistical median concept. In the Simple Median Filter, each point in the signal is converted to a single magnitude. In the Component Median Filter each scalar component is treated independently. A lter mask is placed over a point in the signal. For each component of each point under the mask, a single median component is determined. These components are then combined to form a new point, which is then used to represent the point in the signal studied. When working with color images, however this lter regularly outperforms the Simple Median Filter. When noise affects a point in a grayscale image, the result is called salt and pepper noise. In color images, this property of salt and pepper noise is typical of noise models where only one scalar value of a point is effected. In this noise model, the Component Median Filter is more accurate than the Simple Median Filter. The disadvantage of this lter is that it will create a new signal point that did not exist in the original signal, which may be undesirable in some applications. MEDIAN(x1r , ..., xNr ) CMF(x1 , ..., xN ) = MEDIAN(x1g , ..., xNg ) (3) MEDIAN(x1b , ..., xNb ) The Vector Median Filter was developed by Astola, Haavisto, and Neuvo in 1990[2]. In the Vector Median Filter (4), a lter mask is placed over a single point. The sum of the vector magnitude differences using the L2 norm from each point to each other point within the mask is computed. The point with the minimum sum of vector differences is used to represent the point in

(4) For each of the four algorithms discussed, experimental results will be shown that indicate which algorithm is best suited for the purpose of impulse noise removal in digital color images. Those results will be compared to two new algorithms for noise removal: the Spatial Median Filter and the Modied Spatial Median Filter. In testing, we consider sets of images containing various amounts of articial noise. Impulse noise represents random spikes of energy that happen during the data transfer of an image. When applying noise, a percentage of the image is damaged by changing a randomly selected point channel to a random value from 0 to 255. The noise model, Inoisy , is given by xp x<p 2 1 x<p 3 3 x<p (5) where I is the original image, Ir , Ig , and Ib represent the original red, green, and blue component intensities of the original image, x, y [0, 1] and are random values, z [0, 255] and also random, and p [0, 1] represents the probability of noise in the image. I(i, j) (Ir (i, j), Ig (i, j), z) Inoisy (i, j) = (Ir (i, j), z, Ib (i, j)) (z, Ig (i, j), Ib (i, j))
1 y< 3 y< 2 3 y

2. Spatial Median Filter for Smoothing Images


When transferring an image, sometimes transmission problems cause a signal to spike, resulting in one of the three point scalars transmitting a incorrect value. This type of transmission error is called salt and pepper noise due to the bright and dark spots that appear on the image as a result of the noise. The ratio of incorrectly transmitted points to the total number of points is referred to as the noise composition of the image. The goal of a noise removal lter is to take a corrupted image as input and produce an estimation of the original with no foreknowledge of the characteristics of the noise nor the noise composition of the image. In images containing noise, there are two challenges. The rst challenge is determining noisy points. The second challenge is to determine how to adjust these points. In the Vector Median Filter (VMF), a point

in the signal is compared with the points surrounding it as dened by a lter mask. Each point in the mask lter is treated as a vector representing a point in a threedimensional space. Among these points, the summed vector distance from each point to every other point within the lter is computed. The point in the signal with the smallest vector distance amongst those points in the lter is the minimum vector median. The point in space that has the smallest distance to every other point is considered to be the best representative among the set. In the basic implementation of the Vector Median Filter, the vector median replaces the point in the signal currently studied. The original VMF approach does not consider if the current point is original data or not. If a point has a small summed vector distance, yet is not the minimum vector median, it is replaced anyway. The advantage of replacing every point achieves a uniform smoothing across the image. The disadvantage to replacing every point is that original data is sometimes overwritten. A good smoothing lter should simplify the image while retaining most of the original image shape and retain the edges. A benet of a smoothed image is a better size ratio when the image needs to be compressed. The Spatial Median Filter is a new noise removal lter. The Spatial Median Filter and the Vector Median Filter follow a similar algorithm and it will be shown that they have comparable results. To improve the quality of the results of the Spatial Median Filter, a new parameter will be introduced and experimental data is shown demonstrating the amount of improvement. The Spatial Median Filter is a uniform smoothing algorithm with the purpose of removing noise and ne points of image data while maintaining edges around larger shapes. The Spatial Median Filter is based on the spatial median quantile function developed by P. Chaudhuri in 1996, which is a L1 norm metric that measures the difference between two vectors [4]. R. Sering noticed a spatial depth could be derived by taking an invariant of the spatial median[5]. The Sering paper rst gave the notion that any two vectors of a set could be compared based on their centrality using the Spatial Median. Y. Vardi and C. Zhang have improved the spatial median by deriving a faster estimation formula[6]. The following is the basic algorithm determining the Spatial Median of a set of points, x1 , ..., xN : 1. For each vector x, compute S, which is a set of the sum of the spatial depths from x to every other vector. 2. Find the maximum spatial depth of this set, Smax 3. Smax is the Spatial Median of the set of points.

The spatial depth between a point and set of points is dened by,
N

Sdepth (X, x1 , ..., xN ) = 1

1 N 1

i=1

X xi X xi

(6)

The Spatial Median Filter is an unbiased smoothing algorithm and will replace every point that is not the maximum spatial depth among its set of mask neighbors. The Modied Spatial Median Filter attempts to address these concerns.

3. A Modied Spatial Median Filter


The Spatial Median Filter is similar to the Vector Median Filter in that in both lters, the vectors are ranked by some criteria and the top ranking point is used to the replace the center point. No consideration is made to determine if that center point is original data or not. The unfortunate drawback to using these lters is the smoothing that occurs uniformly across the image. Across areas where there is no noise, original image data is removed unnecessarily. In the Modied Spatial Median Filter, after the spatial depths between each point within the mask are computed, an attempt is made to use this information to rst decide if the masks center point is an uncorrupted point. If the determination is made that a point is not corrupted, then the point will not be changed. We rst calculate the spatial depth of every point within the mask and then sort these spatial depths in descending order. The point with the largest spatial depth represents the Spatial Median of the set. In cases where noise is determined to exist, this representative point then replaces the point currently located under the center of the mask. The point with the smallest spatial depth will be considered the least similar point of the set. By ranking these spatial depths in the set in descending order, a spatial order statistic of depth levels is created. The largest depth measures, which represent the collection of uncorrupted points, are pushed to the front of the ordered set. The smallest depth measures, representing points with the largest spatial difference among others in the mask (and possibly the most corrupted points), are pushed to the end of the list. We can prevent some of the smoothing by looking for the position of the center point in the spatial order statistic list. Let us consider a parameter T (where 1 T masksize), which represents the estimated number of original points under a mask of points. As stated earlier, points with high spatial depths (and supposedly uncorrupted points) are at the beginning of the list. Pixels

Root-Mean-Squared Error

with low spatial depths appear at the end. If the position of the center mask point appears within the rst T bins of the spatial order statistic list, then we can argue that while the center point is not the best representative point of the mask, it is still original data and should not be replaced. Two things should be noted about the use of T in this approach. When T is 1, this is the equivalent to the unmodied Spatial Median Filter. When T is equal to the size of the mask, the center point will always fall within the rst T bins of the spatial order statistic and every point is determined to be original. This is the equivalent of performing no ltering at all, since all of the points are left unchanged. The algorithm to detect the least noisy point depends on a number of conditions. First, the uncorrupted points should outnumber, or be more similar, to the corrupted points. If two or more similar corrupted points happen in close proximity, then the algorithm will interpret the occurrence as original data and maintain the corrupted portions. While T is an estimation of the average number of uncorrupted points under a mask of points, the experimental testing made no attempt to measure the impulse noise composition of an image prior to executing the lter.

The tests to determine the best mask size were conducted in this manner: 1. Each of the ten images in the collection was articially distorted with p=0.0, p=0.05, p=0.10, and p=0.20 noise composition, resulting in 40 images. 2. Each of the forty noisy images was then reconstructed using the SMF with mask sizes of N=3, N=5, and N=7 (the second argument, threshold T , is set to 1), resulting in 120 reconstructed images. 3. The Root-Mean-Squared Error was computed between all 120 reconstructed images and the originals. The RMSE is a simple estimation score of the difference between two images. An ideal RMSE would be zero, which means that the algorithm correctly identied each noisy point and also correctly derived the original data at that location in the signal.
Comparison of mask sizes and Average RMSE with various noise compositions
36 34 32 30 28 26 24 22

4. Experimental Results
To test the accuracy of the spatial median lter, we need three things: an uncorrupted image, the image with corruption applied by some means, and the estimated reconstruction of the original by the spatial median lter. To estimate the quality of a reconstructed image, we calculate the Root-Mean-Squared Error between the original and the reconstructed image. The Root-Mean-Squared Error (RMSE) for an original image I and reconstructed image R is dened by (7).
Iw Ih

N=3 N=5 N=7

20 0.00

0.05

0.10 0.15 Noise Composition (p)

0.20

0.25

Figure 2. Comparison of mask sizes and Average RMSE with various noise compositions

RMSE(I, R) =

1 Iwidth Iheight

i=0 j=0

I(i, j) R(i, j)

(7) The algorithm for the Modied Spatial Median Filter (MSMF) requires two parameters. The rst parameter considered is the size of the mask to use for each ltering operation. The second parameter, threshold T , represents the estimated number of original points for any given sample under a mask. A collection of ten photographs of various sizes was used in these tests. The images had a variety of textures and subject matter. The texture of the images had a larger impact on the threshold chosen than the window mask size.

As seen in Figure 2, a mask size of 3 clearly outperformed the other tested sizes of 5 and 7. Neither the amount of noise, the size of the image, nor the subject matter of the image had an effect on which mask size performed the best. Less thorough tests were run on higher mask sizes such as 9 and 11. With each increase in mask size, the RMSE of each test increased. Due to these preliminary tests, the mask size of 3 was used for the remainder of the testing. In testing for the optimal threshold value T to use for the reconstruction of an image, a database of 59,895 images was used. Each image in this set has the dimensions of 256 384 or 384 256. The images in this set contain a wide variety of subject matters and textures. Most of the images are photographs of various scenes, such as portrait shots, nature shots, animal shots, scenic shots of snow-capped mountains and sandy beaches. Some of the images are not photographs. Of these images, several are drawings on a clean white background.

For each image, eleven different noise levels were applied (from p=0.00 noise to p=0.50 noise in increments of 0.05), thus creating 658,845 images. The Modied Spatial Median Filter was applied to every image at all nine threshold T options. For each test, the RMSE for each point was computed. The tests were conducted on an Intel R Pentium R D dual core CPU with each processor running at 3 GHz clock speed with 1 GB of RAM. The algorithm was written in the language of C and compiled using the GNU C Compiler 4.11 with the speed optimization ag set. The operating system was Red Hat Fedora Core 5. In all, over 11.8 million tests to discover the optimal threshold T value of the Modied Spatial Median Filter were conducted using the N = 3 window mask. The algorithm for Modied Spatial Median Filter on a single point in an image is implemented in O(N 4 ) time, where N is the a dimension of the window mask size. The testing took roughly ten days to complete. The objective of these tests is to nd a threshold point within the spatial order statistic that produces the closest reconstruction of an original image that has been affected by impulse noise (i.e., produce a small RMSE when comparing an original image to a reconstructed image.)
T 1 2 3 4 5 6 7 8 9 # Optimal Images 34561 127854 80799 132581 81617 79664 24188 35168 62413 % Optimal Images 5.25% 19.41% 12.26% 20.12% 12.39% 12.09% 3.67% 5.34% 9.47% Avg 22.563 22.030 21.784 21.571 22.317 24.444 27.463 33.392 47.889 Med 21.943 21.454 21.265 21.079 21.801 23.841 27.237 34.161 51.753

Table 1. Number of images with best reproduction at each threshold using the RMSE Metric, as well as Average and Median RMSE across all images and noise models. Best values are in bold. Table (1) identies the Root-Mean-Squared Error among the different thresholds. The worst performing thresholds are 1, 7, 8, and 9. 5.25% of all the images work best at threshold value 1. Threshold value 1 is the unmodied version of the Spatial Median Vector. Threshold 1 does produce accurate results on its own, but it can easily be improved by selecting any threshold value from 2 to 5. It should be noted that while the unmodied Spatial Median Vector is shown to be comparable with the results of the unmodied Vector Median Filter, there are four other thresholds which produce more accurate reconstructions for 64.18% of all the images and noise models. Based on this observa-

tion, it is recommended not to use the unmodied Spatial Median Vector since the algorithm can be improved by simply using a threshold value 2, 3, 4, or 5. 9.47% of the images work best at threshold 9, which represents no ltering at all. Considering that one of the 11 noise models has no noise at all, representing 9.09% of all images, a result of 9.47% of images not requiring any ltering is acceptable. In all, 2,518 images of the 658,845 tested containing some noise still produced less accurate reconstructions using any ltering threshold. This accounts for 0.39% of all the images tested. When reviewing which threshold produces the best results, the threshold 4 clearly stands out. When this threshold is applied, the lter produces the lowest RMS error of reconstructed images across all eleven noise models at 20.12% of all images. It has the smallest average RMS error across all images, and it has the smallest median RMS error. It also outperforms the minimum Spatial Vector Median (threshold 1). By looking at the results for images in the threshold categories 3 and 5, the results are nearly the same. Both thresholds 3 and 5 outperform the minimum spatial median as well. Thresholds 3 and 5 account for 24.65% of all tested images. By noting that thresholds 3, 4, and 5 all have similar results, it can be concluded that threshold 4 effectively covers 44.77% of the tested images. Threshold level 5 has an interesting characteristic by having the smallest maximum RMS error, which means in the worst case, it will never produce an image with more than 82.021 RMS error. This worse-case-scenario is more accurate than threshold 1 at 91.179 (smooth everything) and much more accurate than threshold 9 at 101.921 (no ltering at all). Due to these results, it is proposed that 4 is the optimal threshold value for the parameter T and that 3 3 is the optimal mask size when the noise composition of the image is unknown. When comparing the Modied Spatial Median Filter, the Spatial Median Filter, the Vector Median Filter, the Component-Based Median Filter, the Simple Median Filter, and the Mean Filter, a 2000 image subset of the 59,895 images was used. These tests were executed on the same machine as the threshold discovery tests and took just under eight hours to complete. The worst performing lter of the six tests was the Mean lter. For all noise models containing at least p = 0.10 noise, it produced the least accurate results. The unmodied Vector Median Filter was only marginally better than the Mean lter. The most stable of the lters was the Component Median Filter, which had the best accuracy across all eleven tested noise models. For noise models containing p 0.15, we see that Modi-

ed Spatial Median Filter produced the most accurate images.

Comparison of Six Noise Filters Across 11 Noise Models


40

35

Median Root-Mean-Squared Error

Mean Filter Median Filter Component Median Filter Vector Median Filter Spatial Median Filter Modified Spatial Median Filter

(a)

30

25

20

(b)

15

10 0.0

0.1

0.2 0.3 Noise Composition (p)

0.4

0.5

(c) Figure 3. Comparison of Six Noise Filters over various noise compositions Figure 4. Demonstration of Modied Vector Median Filter: Original Image(a), 30% Noise Added(b), Reconstructed Image(c)

5. Conclusions
In this paper I have introduced two new lters for removing impulse noise from images and shown how they compare to four other well-known techniques for noise removal. First, four common noise ltering algorithms were discussed. Next, a Spatial Median Filter was proposed based on a combination of work on the Vector Median Filter and the Spatial Median quantile order statistic. Seeing that the order statistic could be utilized in order to make a judgment as to whether a point in the signal is considered noise or not, a Modied Spatial Median Statistic is proposed. The Modied Spatial Median Filter requires two parameters: A window size and a threshold T of the estimated non-noisy pixels under a mask. In the results, we nd the best threshold T to use in the Modied Spatial Median Filter and determined that the best threshold is 4 when using a 3 3 window mask size. Using these as parameters, this lter was included in a comparison of the Mean, Median, Component Median, Vector Median, and Spatial Median Noise Filters. In the broad comparison of noise removal lters, it was concluded that for images containing p 0.15 noise composition, the Modied Spatial Median Filter performed the best and that the Component Median Filter performed the best over all noise models tested.

6. Acknowlegements
We would like to thank the University of Mississippi for their support on this project.

References
[1] J. W. Tukey, (1974). Nonlinear (Nonsuperposable) Methods for Smoothing Data. Conference Record EASCON, p. 673. [2] Jaakko Astola, Petri Haavisto, and Yrj , (1990). Veco tor Median Filters. Proceedings of the IEEE, Vol. 78, No. 4, pp. 678-689. [3] N. C. Gallagher, Jr. and G. L. Wise, (1981). A Theoretical Analysis of the Properties of Median Filters. IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 29, pp. 1136-1141. [4] Probal Chaudhuri, (1996). On a Geometirc Notion of Quantiles for Multivariate Data. Journal of the American Statistical Assocation, Vol. 91, No. 434, pp. 862872. [5] Robert Sering, (2002). A Depth Function and a Scale Curve Based on Spatial Quantiles. Statistical Data Analysis Based on the L1-Norm and Related Methods. [6] Yehuda Vardi and Cun-Hui Zhang, (2000). The Multivariate L1-median and associated data depth. Proceedings of the National Academy of Sciences, Vol. 97, No. 4, pp. 1423-1426.