
Consider the following image, on 3 bits (values between 0 and 7) and the origin (0,0) underlined:

0 6 0 1 3
5 5 1 6 2
0 7 3 2 6
7 3 4 5 5
5 3 5 2 4

1. compute the binomial 3x3 filter and the result of the filtered image for the pixel at location (2,2)

2. compute, for the same location (2,2), the result of image filtering by the 3x3 median filter

3. same question but for the following weighted median filter:

2 3 2
3 5 3
2 3 2

4. same question for the following L filter: {1,3,4,6,9,6,4,3,1}

5. for the same location compute the result of filtering the image with the conditional mean filter having
the following parameters: L=3, th=2.5

6. the image from before is rotated by 30° around its center. write the composed transformation
matrix and compute the value of the rotated image at location (1,2) using bilinear interpolation.

7. compute the Sobel gradient magnitude and its direction for the central pixel of the above image

8. using the kmeans algorithm compute an optimal threshold and calculate the corresponding binary
image, for the following input image:

6 2 2 1 3
6 6 1 2 2
6 7 3 2 2
7 3 4 5 5
1 3 5 6 7
ANSWERS:

1. the binomial 3x3 filter has the following mask:

1/16*

1 2 1
2 4 2
1 2 1
location (2,2) is the center of the image and the result of filtering is given by the 2D convolution
operation. the neighborhood considered is:

5 1 6
7 3 2
3 4 5
and the result: (1*5+2*1+1*6+2*7+4*3+2*2+1*3+2*4+1*5)/16 =
(5+2+6+14+12+4+3+8+5)/16 = 59/16 = 3.69. if the result must stay on 3 bits, in the same 0-7 range, we
round to the nearest integer, so the final result is 4.
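
the computation above can be checked with a short sketch. the names `neighborhood`, `mask` and `binomial_response` are illustrative, not part of the exercise. because the binomial mask is symmetric, flipping it (as a strict 2D convolution requires) does not change the result, so a plain element-wise weighted sum is used here.

```python
# Neighbourhood of pixel (2,2) and the 3x3 binomial mask, both copied from the text.
neighborhood = [[5, 1, 6],
                [7, 3, 2],
                [3, 4, 5]]
mask = [[1, 2, 1],
        [2, 4, 2],
        [1, 2, 1]]


def binomial_response(patch, weights):
    """Weighted sum of the patch divided by the sum of the weights (16 here)."""
    total = sum(w * p
                for wrow, prow in zip(weights, patch)
                for w, p in zip(wrow, prow))
    norm = sum(sum(row) for row in weights)
    return total / norm


value = binomial_response(neighborhood, mask)
# Round to the nearest integer and clamp to the 3-bit range 0..7.
result = max(0, min(7, int(value + 0.5)))
print(value, result)
```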

2. the median filter constructs the sequence of values in the specified neighborhood. the sequence is
sorted and the result is represented by the median value (value at the center of the sorted sequence):

the neighborhood:

5 1 6
7 3 2
3 4 5
unsorted sequence: 5,1,6,7,3,2,3,4,5

sorted sequence: 1,2,3,3,4,5,5,6,7

median value and final result: 4
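
a minimal sketch of the same steps, assuming the neighborhood is given as a list of rows (the function name `median_of_patch` is only illustrative):

```python
def median_of_patch(patch):
    """Flatten the patch, sort it, and return the value at the center of the sorted sequence."""
    values = sorted(v for row in patch for v in row)
    return values[len(values) // 2]


# 3x3 neighbourhood of pixel (2,2), copied from the text.
neighborhood = [[5, 1, 6],
                [7, 3, 2],
                [3, 4, 5]]
median_value = median_of_patch(neighborhood)
print(median_value)
```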

3. the weighted median filter works the same as the median filter with the difference that the weights
tell us how many times the corresponding image value is repeated in the sequence

the neighborhood:

5 1 6
7 3 2
3 4 5
unsorted sequence (value x number of repetitions): 5 x2, 1 x3, 6 x2, 7 x3, 3 x5, 2 x3, 3 x2, 4 x3, 5 x2,
that is: 5,5,1,1,1,6,6,7,7,7,3,3,3,3,3,2,2,2,3,3,4,4,4,5,5

sorted sequence: 1,1,1,2,2,2,3,3,3,3,3,3,3,4,4,4,5,5,5,5,6,6,7,7,7

median value and final result: 3
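
the repetition step can be sketched as follows. this is only an illustration of the procedure described above; `weighted_median` is a hypothetical helper name.

```python
def weighted_median(patch, weights):
    """Repeat each pixel value as many times as its weight, sort, and take the middle element."""
    expanded = []
    for prow, wrow in zip(patch, weights):
        for value, count in zip(prow, wrow):
            expanded.extend([value] * count)
    expanded.sort()
    return expanded[len(expanded) // 2]


# Neighbourhood and weights copied from the text.
neighborhood = [[5, 1, 6],
                [7, 3, 2],
                [3, 4, 5]]
weights = [[2, 3, 2],
           [3, 5, 3],
           [2, 3, 2]]
wm_value = weighted_median(neighborhood, weights)
print(wm_value)
```

the weights sum to 25, so the middle element is the 13th value of the sorted sequence.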


4. the L filter constructs the sorted sequence and the final result is the weighted mean using the given
coefficients

the sorted sequence (as computed for the median filter): 1,2,3,3,4,5,5,6,7

the coefficients: {1,3,4,6,9,6,4,3,1}

the weighted mean: (1*1+3*2+4*3+6*3+9*4+6*5+4*5+3*6+1*7)/(1+3+4+6+9+6+4+3+1)=

=(1+6+12+18+36+30+20+18+7)/37=148/37=4
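
a short sketch of the L filter as described above (sort first, then apply the coefficients by rank); the name `l_filter` is an assumption made for the example:

```python
def l_filter(patch, coeffs):
    """Sort the pixel values, then return their weighted mean using the rank-ordered coefficients."""
    ordered = sorted(v for row in patch for v in row)
    return sum(c * v for c, v in zip(coeffs, ordered)) / sum(coeffs)


# Neighbourhood and coefficients copied from the text.
neighborhood = [[5, 1, 6],
                [7, 3, 2],
                [3, 4, 5]]
coefficients = [1, 3, 4, 6, 9, 6, 4, 3, 1]
l_value = l_filter(neighborhood, coefficients)
print(l_value)
```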

5. the conditional mean filter considers an LxL neighborhood around the current location. the final
result is the mean of only those pixels whose absolute difference from the central pixel is smaller
than th; pixels that differ from the center by th or more are left out of the average.

in our case the neighborhood is 3x3:

5 1 6
7 3 2
3 4 5
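knowing that th=2.5 and that the central value is 3, the pixels considered for the mean computation
are those with values between 0.5 and 5.5, i.e. 5, 1, 3, 2, 3, 4, 5 (the values 6 and 7 are excluded)

the final result is: (5+1+3+2+3+4+5)/7=23/7=3.29 rounded to 3.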

the simple average would be: 36/9=4.
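
the selection rule can be sketched like this, assuming a square LxL patch with odd L so that a central pixel exists (`conditional_mean` is an illustrative name):

```python
def conditional_mean(patch, th):
    """Average only the pixels whose absolute difference from the central pixel is below th."""
    size = len(patch)
    center = patch[size // 2][size // 2]
    kept = [v for row in patch for v in row if abs(v - center) < th]
    return sum(kept) / len(kept), kept


# 3x3 neighbourhood (L=3) copied from the text, with th=2.5.
neighborhood = [[5, 1, 6],
                [7, 3, 2],
                [3, 4, 5]]
mean_value, kept_pixels = conditional_mean(neighborhood, 2.5)
print(kept_pixels, mean_value, round(mean_value))
```

the central pixel always passes the test (its difference is 0), so the list of kept pixels is never empty.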

6. in order to perform a rotation around the center of the image we first move the center to the
origin with a translation, then we perform the rotation, and finally we apply the inverse
translation. the composed transformation will be:

T(dx,dy)*R(theta)*T(-dx,-dy), where T(-dx,-dy) is the translation that moves the center to the origin and
T(dx,dy) is its inverse. since the center of the 5x5 image is at (2,2), dx=dy=2.

in matrix form we have:

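\[
\begin{pmatrix} y_{out} \\ x_{out} \\ 1 \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos(\pi/6) & \sin(\pi/6) & 0 \\ -\sin(\pi/6) & \cos(\pi/6) & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & -2 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} y_{in} \\ x_{in} \\ 1 \end{pmatrix}
\]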

computing the matrix products we get:

\[
\begin{cases}
y_{out} = y_{in}\cos(\pi/6) + x_{in}\sin(\pi/6) - 2\cos(\pi/6) - 2\sin(\pi/6) + 2 \\
x_{out} = -y_{in}\sin(\pi/6) + x_{in}\cos(\pi/6) - 2\cos(\pi/6) + 2\sin(\pi/6) + 2
\end{cases}
\]

\[
\begin{cases}
y_{out} = 0.866\,y_{in} + 0.5\,x_{in} - 0.732 \\
x_{out} = -0.5\,y_{in} + 0.866\,x_{in} + 1.268
\end{cases}
\]

we know the location in the output image: yout = 2, xout=1, and we solve the above system for xin and
yin:

\[
\begin{cases}
y_{in} = 2.5 \\
x_{in} \approx 1.1
\end{cases}
\]

this means that the value of the rotated image at location (1,2) is the same as the value of the original
image at location (1.1,2.5)
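
the composition and the inverse mapping can be verified numerically. the sketch below assumes the same (y, x, 1) coordinate ordering as the matrices above; the helper names (`matmul`, `translation`, `rotation`, `apply`) are made up for the example. instead of solving the linear system by hand, it uses the fact that the inverse of a rotation by theta around the center is a rotation by -theta around the same center.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(dy, dx):
    return [[1, 0, dy], [0, 1, dx], [0, 0, 1]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]


def apply(m, y, x):
    """Apply a homogeneous 3x3 transform to the point (y, x)."""
    vec = (y, x, 1)
    out = [sum(r * v for r, v in zip(row, vec)) for row in m]
    return out[0], out[1]


theta = math.pi / 6  # 30 degrees
# Forward transform: T(2,2) * R(theta) * T(-2,-2)
forward = matmul(translation(2, 2), matmul(rotation(theta), translation(-2, -2)))
# Inverse transform: same translations, rotation by -theta
inverse = matmul(translation(2, 2), matmul(rotation(-theta), translation(-2, -2)))

y_in, x_in = apply(inverse, 2, 1)  # output location: y_out = 2, x_out = 1
print([round(v, 3) for v in forward[0]], [round(v, 3) for v in forward[1]])
print(round(y_in, 3), round(x_in, 3))
```

mapping the result back with `apply(forward, y_in, x_in)` should return the output location (2, 1) up to floating-point error.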

the problem is that these are not integer coordinates so we will use interpolation to estimate a value at
the non integer location

bilinear interpolation considers the 4 neighbors to estimate a value at the non integer location:

f(x,y) =a*b*f([x]+1,[y]+1) + (1-a)*b*f([x],[y] +1) + a*(1-b)*f([x]+1,[y]) + (1-a)*(1-b)*f([x],[y])

where: [] is the integer part, a and b are the fractional parts of x and y respectively, and f is the image value.

in our case:

5 1
7 3
with f(1,3)=5, f(2,3)=1, f(1,2)=7 and f(2,2)=3 (the central pixel of the image):

f(1.1,2.5) = 0.1*0.5*f(2,3)+0.9*0.5*f(1,3)+0.1*0.5*f(2,2)+0.9*0.5*f(1,2)=

=0.05*1+0.45*5+0.05*3+0.45*7 = 0.05+2.25+0.15+3.15=5.6, eventually rounded to 6 (if we
consider the output also in the same range as the input)
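
the interpolation formula can be sketched as a small function that takes the four neighbor values directly, which avoids any assumption about how the image axes are oriented. the argument names are illustrative; the values are read from the 2x2 patch shown above (top row 5 1, bottom row 7 3), where f(2,2)=3 is the central pixel of the image.

```python
def bilinear(f00, f10, f01, f11, a, b):
    """Bilinear interpolation from four neighbours.

    f00 = f([x], [y]),     f10 = f([x]+1, [y]),
    f01 = f([x], [y]+1),   f11 = f([x]+1, [y]+1),
    a and b are the fractional parts of x and y.
    """
    return (a * b * f11 + (1 - a) * b * f01
            + a * (1 - b) * f10 + (1 - a) * (1 - b) * f00)


# x = 1.1 -> [x] = 1, a = 0.1 ; y = 2.5 -> [y] = 2, b = 0.5
interpolated = bilinear(f00=7, f10=3, f01=5, f11=1, a=0.1, b=0.5)
# Round to the nearest integer and clamp to the 3-bit range 0..7.
rounded = max(0, min(7, int(interpolated + 0.5)))
print(interpolated, rounded)
```

the four weights a*b, (1-a)*b, a*(1-b) and (1-a)*(1-b) always sum to 1, so the result stays between the smallest and the largest neighbor value; after rounding it gives 6.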

7. the horizontal (gx) and the vertical (gy) components of the Sobel gradient are obtained by convolving
the image with the following two masks:

1 0 -1
2 0 -2
1 0 -1

1 2 1
0 0 0
-1 -2 -1

the neighborhood considered:

5 1 6
7 3 2
3 4 5
gx = 1*5+2*7+1*3-1*6-2*2-1*5=5+14+3-6-4-5=7

gy = 1*5+2*1+1*6-1*3-2*4-1*5 = 5+2+6-3-8-5 = -3

Sobel gradient magnitude: g=sqrt(gx*gx+gy*gy)=sqrt(49+9)=sqrt(58)=7.6 ~ 8

direction of the gradient: atan(gy/gx) = atan(-3/7) = -23.2° (about -0.405 rad)
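
a minimal sketch of the Sobel computation for the central pixel. as in the worked answer, the masks are applied element-wise without flipping; a strict convolution would flip both masks, which for these two masks only changes the sign of gx and gy and leaves the magnitude unchanged. `mask_response` is a hypothetical helper name, and `math.atan2` is used instead of atan(gy/gx) so that the case gx = 0 does not cause a division by zero.

```python
import math

# Masks and neighbourhood copied from the text.
gx_mask = [[1, 0, -1],
           [2, 0, -2],
           [1, 0, -1]]
gy_mask = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]
neighborhood = [[5, 1, 6],
                [7, 3, 2],
                [3, 4, 5]]


def mask_response(patch, mask):
    """Element-wise product of patch and mask, summed over the 3x3 window."""
    return sum(m * p
               for mrow, prow in zip(mask, patch)
               for m, p in zip(mrow, prow))


gx = mask_response(neighborhood, gx_mask)
gy = mask_response(neighborhood, gy_mask)
magnitude = math.hypot(gx, gy)                     # sqrt(gx^2 + gy^2)
direction_deg = math.degrees(math.atan2(gy, gx))   # angle in degrees
print(gx, gy, round(magnitude, 2), round(direction_deg, 1))
```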

8. we start by choosing an arbitrary threshold in the image range: 0-7. for example the starting threshold
= 3.5 (the middle of the image range)

original image (the values used in this solution; four pixels differ from the image printed in the question):

7 2 2 1 3
7 6 1 2 2
6 7 3 2 2
7 3 4 5 7
1 3 7 6 7

we compute the means of the pixels having values bigger (mb) and smaller (ms) than the
threshold

image pixels bigger than the threshold: 7,7,6,6,7,7,4,5,7,7,6,7 => mb = 76/12 = 6.33

image pixels smaller than the threshold: 2,2,1,3,1,2,2,3,2,2,3,1,3 => ms = 27/13 = 2.08

the new threshold is the mean of mb and ms: th = (6.33+2.08)/2 = 4.2

we repeat the operations from before until the new threshold is equal to the old one (convergence
condition), value that will also represent the final threshold to obtain the binary image.

so:

th=4.2

mb = (7+7+6+6+7+7+5+7+7+6+7)/11 = 72/11 = 6.55

ms = (2+2+1+3+1+2+2+3+2+2+3+4+1+3)/14 = 31/14 = 2.21

new th = (6.55+2.21)/2 = 4.38

next step using th = 4.38:

mb = (7+7+6+6+7+7+5+7+7+6+7)/11 = 72/11 = 6.55

ms = (2+2+1+3+1+2+2+3+2+2+3+4+1+3)/14 = 31/14 = 2.21

the new th = 4.38 is equal to the old one, so the algorithm stops here and we use 4.38 to obtain the binary
image: if the pixel value > th the pixel becomes 1, otherwise 0. the binary image will look like this:

1 0 0 0 0
1 1 0 0 0
1 1 0 0 0
1 0 0 1 1
0 0 1 1 1
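
the iteration can be sketched as below, using the image listed in this answer and the same starting threshold of 3.5. `iterative_threshold`, `tol` and `max_iter` are names chosen for the example; the stopping test compares successive thresholds with a small tolerance instead of exact equality. without intermediate rounding the threshold settles near 4.38, which gives the same binary image, since any threshold between 4 and 5 separates the pixels in the same way.

```python
# Image copied from the worked answer.
image = [[7, 2, 2, 1, 3],
         [7, 6, 1, 2, 2],
         [6, 7, 3, 2, 2],
         [7, 3, 4, 5, 7],
         [1, 3, 7, 6, 7]]


def iterative_threshold(img, start, tol=1e-9, max_iter=100):
    """Two-class k-means on grey levels: the threshold is the midpoint of the two class means."""
    pixels = [v for row in img for v in row]
    th = start
    for _ in range(max_iter):
        bigger = [v for v in pixels if v > th]
        smaller = [v for v in pixels if v <= th]
        if not bigger or not smaller:
            break  # one class is empty, the means are undefined
        mb = sum(bigger) / len(bigger)
        ms = sum(smaller) / len(smaller)
        new_th = (mb + ms) / 2
        if abs(new_th - th) < tol:
            break  # convergence: the threshold no longer changes
        th = new_th
    return th


threshold = iterative_threshold(image, 3.5)
binary = [[1 if v > threshold else 0 for v in row] for row in image]
print(round(threshold, 2))
for row in binary:
    print(*row)
```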
