1. Introduction
Auto-focus technology plays an important role in optical vision imaging systems. It has been
widely used in a variety of optical imaging systems, such as consumer cameras, industrial
inspection tools, microscopes, and scanners [1, 2]. Many kinds of auto-focus methods have
been studied since the 1990s. Generally, these methods can be categorized into active
auto-focusing methods and passive auto-focusing methods [3].
Active auto-focusing methods use some auxiliary optical devices to measure the position
of a reference point on the sample. For example, Liu et al. introduced a laser beam, a splitter,
and an extra CCD to the normal optical system and the sample position could be measured by
detecting the centroid of the reflected light spot on the sample [4, 5]. Additionally, Hsu et al.
embedded an astigmatic lens into the optical path to produce a focus error signal, which could
be converted to the sample’s defocus distance [6, 7]. However, timing and geometrical
fluctuations of the light source, together with mechanical errors, reduce the positioning
accuracy of such auto-focus systems, which are also expensive and complicated. Alternatively,
passive methods are based on a number of images taken by varying focus lens positions. The
advantage of passive methods over active methods is that they are simpler and less expensive.
Therefore, passive auto-focusing methods are widely used in vision applications.
The depth from focus (DFF) method and the depth from defocus (DFD) method are two
typical passive auto-focusing methods that are currently studied. The DFF-type methods are
based on the fact that the image formed by an optical system is focused at a particular
distance whereas objects at other distances are blurred or defocused [3]. DFF includes two
stages: the first stage is to determine the focus function that describes the degree of focus at
different positions and the second stage is to search and find the best focus position according
to the focus function. Researchers have proposed various focus functions, including the Sobel
gradient method [8], band pass filter-based technique [9], energy of Laplacian [10, 11], and
sum-modified Laplacian [12]. Wavelet transforms based on the discrete cosine transform
have also been developed in recent years [13–15]. Additionally, searching algorithms are
studied, which include the mountain climbing servo [16], the fast hill-climbing search [17],
and the modified fast climbing search with adaptive step size [18]. Very high accuracy can be
achieved by DFF methods. However, since all these methods need to acquire a large number
of images at different distances and calculate their focus values, they require long scanning
times and high power consumption, which limit their application.
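Several of the focus functions listed above reduce to simple image statistics. As a concrete illustration, a minimal Sobel-gradient (Tenengrad-style) focus measure can be sketched as follows; the function name and the slicing-based convolution are our own illustration, not the exact formulation of [8]:

```python
import numpy as np

def tenengrad(img: np.ndarray) -> float:
    """Sobel-gradient focus value: mean squared gradient magnitude.

    A well-focused image has strong edges, hence large Sobel responses
    and a large focus value; defocus suppresses the gradients.
    """
    img = img.astype(float)
    # 3x3 Sobel responses computed with array slicing (no external deps);
    # output pixel (r, c) corresponds to input center (r+1, c+1).
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return float(np.mean(gx**2 + gy**2))
```

A sharp step edge yields a larger focus value than the same intensity range spread out as a smooth ramp, which is the behavior a DFF search exploits.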
The DFD method is widely used in depth estimation and scene reconstruction; it can
measure the position of samples from just a few images. The DFD method directly
estimates the focus location from a measurement of the level of defocus. Therefore, the
efficiency of the method is high, which makes it suitable for real-time auto-focusing.
However, the accuracy of the DFD method is relatively low because the method needs to
build a model of the optical imaging system that is approximate and introduces theoretical
errors.
In this paper, we propose a new auto-focusing method whose low computational cost and
high accuracy make it suitable for real applications. First, the traditional DFD methods are improved by a
rapid calculation. Then, the improved DFD method is combined with the DFF method to form
a new fast and accurate auto-focus method. The combination method is verified by
experiments and the results show that the proposed auto-focusing method can decrease the
computational cost and achieve high accuracy for real application.
2. The improved DFD method
2.1 Conventional DFD methods
The conventional DFD methods are mainly based on the power spectrum in the frequency
domain or the point spread function (PSF) of the image in the spatial domain. Subbarao and
Surya proposed a general method in which an S-transform was applied to conduct
deconvolution in the frequency domain on two defocused images taken with different camera
settings [19, 20]. Favaro and Soatto applied a functional singular value decomposition to
compute the PSF [21]. Zhou et al. used a coded aperture [22, 23] that customized the PSF by
modifying the aperture shape with a complex statistical model for depth estimation. Hong et
al. analyzed the power spectrum from a novel aspect, namely, oriented heat-flow diffusion
[24], based on the fact that its strength and direction correspond to the amount of blur.
Generally, these conventional DFD methods require complex modeling or heavy
computational loads, which may lead to computation times comparable with or even longer
than those of DFF methods.
2.2 The improved DFD method
2.2.1 Defocus blurring analysis
Figure 1 shows the imaging situation of three adjacent points A, B, and C on three different
planes FP, IP1, and IP2. FP is the focal plane, and IP1 and IP2 are two defocused planes. A, B,
and C become three separate blurred spots of a certain shape and size because, according to
geometrical optics, they spread along the propagation direction of the light path. When IP1
and IP2 are far from FP, the three blurred spots will overlap
within a certain area on planes IP1 and IP2; this will produce a small region containing
information on multiple imaging points, which is the reason that the images on IP1 and IP2 are
fuzzy.
Suppose that the spread angles of A, B, and C on FP are the same and that there are no
energy losses in the spreading process, and let the radii of the blurred spots on IP1 and IP2 be
R1 and R2, respectively. Then all the points within the area of radius R1 (R2) around the
coordinate (x, y) on FP spread towards and overlap at the same coordinate (x, y) on IP1 (IP2). The
pixel value at any point on the imaging plane corresponds to the light intensity on it, and
generally the light intensity is assumed to be evenly distributed within the blurred spots. Setting
the area of the pixel (x, y) as S0, a fraction S0/(πR1²) of the light intensity at point A on FP
spreads to the position of pixel (x, y) on IP1. The contributions of points B and C, and of all
points within the area of radius R1 around the coordinate (x, y) on FP, can be deduced in the
same way. Thus, the pixel value at (x, y) on IP1 (IP2) is S0/(πR1²) (S0/(πR2²)) times the sum of
all the pixel values in the area of radius R1 (R2) around (x, y) on FP [25]. So far, the qualitative relationship between the
radius of the blurred spots and the corresponding pixel values has been established and this is
the basis of the calculation method presented in the next section.
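This relation can be checked numerically. The sketch below (our own illustration; the function name is hypothetical) computes a defocused pixel value as S0/(πR²) times the sum of the in-focus values inside the radius-R disk around the pixel. For a uniform focal-plane image, the defocused value stays close to the original, reflecting the assumed conservation of energy:

```python
import numpy as np

def defocused_pixel(img: np.ndarray, x: int, y: int, R: float) -> float:
    """Pixel value at (x, y) on a defocused plane: (S0 / (pi R^2)) times the
    sum of the in-focus pixel values inside the radius-R disk around (x, y).
    S0 (the area of one pixel) is 1 in pixel units."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    disk = (xx - x) ** 2 + (yy - y) ** 2 <= R ** 2
    return img[disk].sum() / (np.pi * R ** 2)
```

The small residual discrepancy comes from approximating the disk area πR² by a count of square pixels.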
2.2.2 The improved DFD calculation method
The two defocused images can be obtained in two ways. One is by changing the image
distance; in this way, the defocus distance (the distance between the current imaging plane
and the focal plane) in the image space can be calculated using the two defocused images.
The other is by changing the object distance; in this case, the defocus distance (distance
between the current object plane and the best imaging object plane) in the object space can
also be calculated. The improved DFD method is analyzed separately for these two cases
below; the choice between them is determined by the hardware structure of the auto-focus
system.
a) Improved DFD method by changing image distance
Figure 2 shows a scheme of the improved DFD method by changing image distance. P is an
object point on the object plane FP, which is blurred within a radius of R1 (R2) on the imaging
plane IP1 (IP2).
The radius of blurred spots can be calculated with the similar triangle principle:
R_1 = \frac{D}{2}\, d \left( \frac{1}{f} - \frac{1}{u} \right). \qquad (1)

where D is the lens diameter, and u, f, and v denote the objective distance, the focal
length, and the image distance of the optical imaging system, respectively. The parameter d is
the distance between the imaging plane and the focal plane. In the automatic focusing
imaging system, the imaging plane corresponds to IP1 or IP2 and the corresponding defocus
distance is d or d + Δd (Fig. 2). Therefore, the aim is to calculate d.
Suppose that u, f, and D are constants; the relationship between R1, R2, and d can be
expressed as:
R_1 = k\, d. \qquad (2)

R_2 = k\, (d + \Delta d). \qquad (3)

where k = (D/2)(1/f − 1/u) and Δd is the distance between IP1 and IP2. Figure 3 shows the
blurred spots on IP1 and IP2.
In Fig. 3(b), the pixel value at (x, y) on IP1 is denoted V1 and the area of the blurred spot S1;
in Fig. 3(c) the corresponding parameters on IP2 are V2 and S2; the area of the pixel (x, y) is
S0. According to the previous description, the pixel (x, y) on IP1 (IP2) includes the
information of all the points contained within the area of radius R1 (R2) around the
coordinates (x, y) on the focal plane FP; thus, V1 is S0/S1 times the sum of the
pixel values in the area S1 on FP, and V2 is S0/S2 times the sum of the pixel values in the area
S2 on FP. Since S2 includes S1, then

V_1 S_1 < V_2 S_2. \qquad (4)

\frac{R_1}{R_2} < \frac{V_2}{V_1}. \qquad (5)

d < \frac{V_2 / V_1}{1 - V_2 / V_1}\, \Delta d. \qquad (6)

Set

d_{\max} = \frac{V_2 / V_1}{1 - V_2 / V_1}\, \Delta d. \qquad (7)
Although the specific calculation formula of d has not been given, its maximum dmax can
be estimated from Eq. (7). Additionally, the calculation is simple and quick, which may
provide the possibility for a fast algorithm for auto-focus.
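The bound of Eq. (7) is a one-line computation. A minimal sketch follows (the function name and the guard on V2/V1 are our own illustration):

```python
def dmax_estimate(V1: float, V2: float, delta_d: float) -> float:
    """Upper bound on the defocus distance d from Eq. (7):
    d_max = (V2/V1) / (1 - V2/V1) * delta_d.

    Requires 0 < V2/V1 < 1, i.e. the pixel value on the more defocused
    plane IP2 is smaller than on IP1."""
    r = V2 / V1
    if not 0.0 < r < 1.0:
        raise ValueError("expected 0 < V2/V1 < 1 (IP2 more defocused than IP1)")
    return r / (1.0 - r) * delta_d
```

For example, with V2/V1 = 0.5 and Δd = 36 μm, the bound is d_max = 36 μm; this cheapness is what makes the rough focusing stage fast.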
In the theoretical deduction, the (x, y) location does not change across FP, IP1, and IP2. In
practice, however, it shifts according to the slope of the chief ray, as shown in Fig. 2, and the
error grows as the slope increases. In this paper, V1 and V2 are taken in the central area of the
images, where the slope of the chief ray is small, so the proposed method remains effective
with only a small error.
b) Improved DFD method by changing object distance
For the application of the DFD method by changing the object distance u, which is a variable,
the calculation of dmax in Eq. (7) must be converted into the calculation of the defocus
distance of the object plane in the object space. The converting scheme and process are
separately shown in Fig. 4(a) and Fig. 4(b).
Fig. 4. (a). The relevant parameters in the modeling change. ∆u denotes the variation of the
object distance u, corresponding to two images taken at two positions in the focusing process,
∆v is the variation of the image distance v caused by ∆u, dmax is the defocus distance in image
space, and uref denotes the defocus distance in the object space. (b) The process of the
modeling change.
When the object distance changes by a small amount ∆u, the corresponding variation ∆v
of the image distance can be calculated according to the Gaussian imaging formula,
\frac{1}{f} = \frac{1}{u + \Delta u} + \frac{1}{v + \Delta v}. \qquad (8)
In this case, ∆v corresponds to ∆d in Eq. (2). Then
d_{\max} = \frac{V_2 / V_1}{1 - V_2 / V_1}\, \Delta v. \qquad (9)
C = \frac{V_2 / V_1}{1 - V_2 / V_1}, \qquad K = \frac{u_0 - f}{u_2 - f}. \qquad (13)
In Eq. (13), the term C is calculated as in the previous case. In imaging systems that need
to be focused, u2 is normally close to u0, so K is approximately equal to 1. Even when u2 is
relatively far from u0, setting K = 1 still yields a useful estimate uref, because only 1/n of it is
used according to the focusing scheme proposed in Section 3. So,
u_{\mathrm{ref}} = \frac{V_2 / V_1}{1 - V_2 / V_1}\, \Delta u. \qquad (14)
u_{\mathrm{real},\max} = u_{\mathrm{ref}} = \frac{V_2 / V_1}{1 - V_2 / V_1}\, \Delta u. \qquad (15)
Similarly, the defocus distance in the object space cannot be calculated precisely, but its
maximum can be estimated from Eq. (15). Again, V1 and V2 are taken in the central area of
the images to reduce the error.
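The object-space estimate can be sketched numerically under the assumptions above (function names are our own illustration): `delta_v` evaluates Eq. (8) for the image-distance change caused by ∆u, and `uref_estimate` evaluates Eq. (14):

```python
def delta_v(f: float, u: float, du: float) -> float:
    """Image-distance change caused by an object-distance change du,
    from the Gaussian imaging formula 1/f = 1/u + 1/v (Eq. (8))."""
    v = 1.0 / (1.0 / f - 1.0 / u)             # image distance at u
    v2 = 1.0 / (1.0 / f - 1.0 / (u + du))     # image distance at u + du
    return v2 - v

def uref_estimate(V1: float, V2: float, du: float) -> float:
    """Defocus-distance bound in object space, Eq. (14):
    u_ref = (V2/V1) / (1 - V2/V1) * du, for 0 < V2/V1 < 1."""
    r = V2 / V1
    return r / (1.0 - r) * du
```

Note that increasing the object distance decreases the image distance, so ∆v carries the opposite sign of ∆u; only its magnitude matters when it stands in for ∆d in Eq. (2).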
3. The combination of DFF and the improved DFD
A focusing range can be acquired by the improved DFD method in a rapid calculation.
Furthermore, the goal of accurate automatic focusing can be achieved by the combination of
the improved DFD with the DFF method. The combination method can be divided into two
stages: the rough focusing stage and the fine focusing stage, as shown in Fig. 5. In the rough
focusing stage, at a certain position, two images of an object are taken with a certain interval
distance to estimate the current defocus distance by the improved DFD method. The next
position is taken with a step of 1/n times the estimated defocus distance (since the estimated
defocus distance does not equal the real defocus distance, we use 1/n of it for the sake of the
final accuracy, and n is usually set to 5-10). At the new position, we sample
the two images and estimate the current defocus distance again. The process is repeated until
the defocus distance is smaller than the default threshold, which depends on the optical
imaging system. Next, the fine focusing stage will be conducted. The DFF method is used to
search for the peak position of the focus function with a fixed step length that is less than the
depth of focus (DOF). The peak position of the focus function is the focusing position.
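The fine stage can be sketched with the grey-level-variance focus function used in the experiments of Section 4. The code below is a minimal illustration: the names are hypothetical, and a caller-supplied `capture(z)` routine stands in for imaging the sample at stage position z:

```python
import numpy as np

def grey_level_variance(img: np.ndarray) -> float:
    """Focus value used in the fine stage: variance of the grey levels.
    A sharply focused image has higher contrast, hence higher variance."""
    img = img.astype(float)
    return float(np.mean((img - img.mean()) ** 2))

def fine_focus(capture, positions):
    """Fixed-step peak search: image each candidate position and return the
    one whose focus value is largest. `capture(z)` returns the image taken
    with the stage at position z; `positions` are spaced by less than the
    depth of focus."""
    values = [grey_level_variance(capture(z)) for z in positions]
    return positions[int(np.argmax(values))]
```

Because the rough stage has already brought the system near focus, the scanned range (and hence the number of captured images) stays small.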
The conventional automatic focusing methods perform the searching using the same step
along the whole process. Additionally, the step value, usually a fraction of the DOF, is very
small to ensure accuracy. Thus, the searching efficiency is low and the local peak of the focus
function may be acquired owing to the small step length, which further lowers the efficiency
and may even cause focusing failure. In contrast to the conventional methods, the proposed
searching strategy divides the whole searching process into two stages. Large steps are used
in the rough searching stage and small steps in the fine searching stage, which can remove the
influence of the local peak. Meanwhile, high searching efficiency can be achieved.
Fig. 5. The combined focusing method.
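The rough focusing stage described above can likewise be sketched as a small simulation. The 1.5× overestimating DFD model in the test is an assumption chosen only to mimic an upper-bound estimator, and all names are our own:

```python
def rough_focus(d0, estimate_dmax, n=5, p_th=720.0, max_moves=50):
    """Rough focusing stage: estimate an upper bound d_max on the current
    defocus distance (improved DFD, two images per position), move the
    stage by d_max / n toward focus, and repeat until the estimate falls
    below the threshold p_th. Returns (residual defocus, number of moves)."""
    d, moves = d0, 0
    while moves < max_moves:
        d_est = estimate_dmax(d)      # improved-DFD bound, >= true defocus d
        if d_est < p_th:
            break                     # hand over to the fine (DFF) stage
        d -= d_est / n                # conservative 1/n step toward focus
        moves += 1
    return d, moves
```

With an estimator that overshoots the true defocus by 50%, each move removes a fixed fraction of the remaining defocus, so the stage converges geometrically rather than scanning the whole range at a fixed small step.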
In this experiment, auto-focusing was achieved by changing the object distance. The
entire auto-focus process was as follows: the sample on the translation stage was imaged by
the microscope and the images were analyzed by the computer, which then gave the
corresponding command to the motion controller; then, the motion controller adjusted the
translation stage’s vertical motion to change the object distance until it was at the best
imaging position.
Glass slides were used as the focusing targets. In order to test the accuracy of the
proposed method for estimating the defocus distance, the translation stage was driven to 18
known positions (the corresponding real defocus distance ureal was 720 x 1, 720 x 2, 720 x 3,
720 x 4, …., 720 x 18 μm, 0.36 μm/pulse), and at every position, two images were sampled
with a certain interval distance ∆u (36, 72, and 108 μm) to estimate the defocus distance.
4.2 Experimental data analysis
We tested 10 glass slides and acquired 10 groups of data. There were many similarities
between the groups, so we selected one group as follows.
4.2.1 Convergence analysis
Figure 7 shows the estimated defocus distance at the 18 positions, calculated by the improved
DFD method with a sampling interval distance ∆u of 36, 72, and 108 μm.
Fig. 7. Curve of the estimated defocus distance. The horizontal axis corresponds to the 18
sampling positions from small to large distances from the focal position, and the value is the
real corresponding defocused distance. The vertical axis denotes the corresponding estimated
defocus distance.
It can be seen that the estimated defocus distances increase with the sampling position,
and the estimated values are approximately linearly proportional to the real defocus
distances, which verifies that the proposed method is qualitatively correct. Furthermore, the
estimated defocus distances increase with ∆u, also in an approximately linear relationship.
It should be noted that for the ∆u = 36 μm case, the estimated defocus distance is less than
the real defocus distance, which seems to contradict the preceding theoretical deduction. The
probable reasons for this error are as follows. To reduce random error, multiple sets of V1 and
V2 in the central zone of the images are taken to calculate the defocus distances, and the
average of these defocus distances is used; this departs from the theoretical deduction and
may introduce error. Furthermore, the optical imaging system is assumed to be incoherent, so
that light intensities can be added directly. In reality it is partially coherent, which may also
lead to some error.
To further verify the proposed method, additional experiments were conducted on three
more objects, all of which are more complex focusing targets with complex depth variations:
a coin, a printed circuit board (PCB) with soldered chip pins, and an iron piece with an
irregular fracture surface, referred to as Obj 1, Obj 2, and Obj 3, respectively. The results are
listed in Table 1.
Table 1. Estimated defocus distance for the three objects.
As shown in Fig. 8, the relative errors in the three conditions (∆u = 36, 72, 108 μm) vary
widely and change with the sampling position. It is evident that the relative error increases
with the increase of ∆u. We can choose a reasonable ∆u to control the magnitude of the
relative error. Additionally, according to ∆u, we can expand or contract the estimated defocus
distance by a certain proportion to compensate for the inadequacy of the method. From Fig. 8,
it seems that the relative error of the improved DFD is still significant (the maximum is
175.4%); however, it is used only in the rough focusing stage, after which fine focusing is
implemented to ensure accuracy. The benefit of the rough focusing stage is the substantial
gain in efficiency.
4.2.3 Efficiency analysis of the combination of DFF and the improved DFD
In the searching stage of auto-focusing, the searching step is determined by the defocus
distance, which is calculated by the improved DFD method. Therefore, the number of search
steps is much smaller than in the DFF method, which is why the proposed combination
method has a higher efficiency. Next, some specific cases will be analyzed.
In the range from ureal = 12960 to ureal = 720 μm, when ∆u = 36 μm, n = 5, and Pth = 720
μm (about 10 DOF, where Pth is the default threshold of the defocus distance stated before,
often set as several times the value of DOF), 10 images were taken, 5 calculations were
required, and the translation stage needed to be driven 5 times (the 5 step values were 4287,
3162, 2510, 1430, and 727 μm). Similarly, when n = 10 and Pth = 720 μm, 20 images were
taken and 10 calculations were required, corresponding to 10 moves of the translation stage
(the step values were 2143, 1780, 1661, 1349, 1313, 1036, 959, 715, 573, and 364 μm).
For conventional DFF methods, under the same accuracy conditions, at least 17 images
need to be taken and 17 repetitions of the calculations are conducted (in this case, the
searching step is set as 720 μm, and the translation stage is driven with the step 17 times).
Furthermore, the single computation load (computation of the focus function value, Grey
Level Variance [11]) is more than that of the improved DFD (computation of the defocus
distance). Table 2 shows the comparison between the different methods.
Table 2. Comparison between the proposed combination method and the DFF method.
It can be seen that, by a rough estimate, the efficiency of the proposed method is at least
3-5 times that of the traditional DFF method. If we choose more suitable values of ∆u, n, and
Pth, then the efficiency will be further improved. Actually, the searching step in conventional
DFF methods is very small (much less than 720 μm in this experiment), thus the efficiency of
the proposed combination method is much more than 3-5 times that of the conventional DFF
method.
5. Conclusions
In this paper, a combination algorithm of DFF and improved DFD for auto-focusing is
proposed and experimentally demonstrated to be efficient and significantly more effective
than traditional methods. The proposed DFD-based auto-focus method can be applied either
by changing the image distance or by changing the object distance, which provides much
more flexibility when designing the structure of an auto-focusing system.
However, in real applications, accurate estimation of the defocus distance is not easy. An
inaccurate estimation will directly affect the searching efficiency or even lead to focusing
failure. Thus, the combination method is still worthy of further study.
Acknowledgments
The authors gratefully acknowledge financial support from the National Science and
Technology Major Project of the Ministry of Science and Technology of China (Grant No.
2014ZX07104), the National Key Foundation for Exploring Scientific Instrument of China
(2013YQ03065104), and the National Science and Technology Support Program of China
(2012BAI23B00).