Abstract. Image restoration using resolution expansion is important in many areas of image processing. This paper introduces a restoration method for low-resolution text images which produces expanded images with improved definition. This technique creates a strongly bimodal image with smooth regions in both the foreground and background, while allowing for sharp discontinuities at the edges. The restored image, which is constrained by the given low-resolution image, is generated by iteratively solving a nonlinear optimization problem. Low-resolution text images restored using this technique are shown to be both quantitatively and qualitatively superior to images expanded using the standard methods of linear interpolation and cubic spline expansion. Experimental results demonstrate that text images created by this new algorithm improve optical character recognition accuracy more than images obtained by existing expansion methods.

Key words: Bimodal distribution – Modeling – Restoration – Optimization – Text

1 Introduction

Text image resolution expansion has become increasingly important in a number of areas of image processing. Optical character recognition (OCR) of document images continues to be of great importance as we attempt to become a paperless society. Restoring text from video surveillance imagery is often crucial to law enforcement agencies. Digital video compression algorithms can benefit from successful text resolution expansion techniques. Video archival efforts also make use of OCR to assist with indexing and retrieval. Text observed in these types of images is often low-resolution and requires resolution expansion in order to improve OCR performance. Common interpolation methods, which were not designed specifically for text images, typically smooth over the important details and produce inadequate expansion. This paper proposes a new nonlinear restoration technique for text images, which creates smooth foreground and background regions while preserving sharp edge transitions.

There has been significant research in both the text enhancement and resolution expansion fields. Some text enhancement efforts focus on fixing broken or touching characters [18, 23]. Other methods [5, 24] use degraded samples to model the image and then solve for the restored image using a recursive technique. To remove the effects of noise, [11] uses a morphological filter and [1] uses pixel patterns to improve OCR results. A variety of methods have been proposed in order to improve contrast within text images including quadratic filters [13], soft morphological filters [8], non-linear mapping [16], and using a multi-resolution pyramid approach and fuzzy edge detectors [14]. Numerous resolution expansion methods have been published in the literature as well [2, 3, 6, 7, 9, 12, 15]. Linear interpolation tends to smooth the image data at transition regions and results in a high-resolution image that appears blurry. Cubic spline expansion allows for sharp transitions, but tends to produce a ringing effect at these discontinuities. The method proposed in [4] uses relative displacements in image sequences to improve resolution.

A number of research efforts investigated combining text enhancement with resolution expansion in order to improve low-resolution text images. Shannon interpolation is performed with text separation from the image background in [10] to improve the OCR accuracy of digital video. Deblurring is combined with linear interpolation in [19] to enhance document images obtained from digital cameras. Both of these methods essentially perform the text enhancement and resolution expansion as two separate steps. The proposed method computes resolution expansion using an algorithm specifically designed to enhance text images to perform both of these steps simultaneously.

The goal of resolution expansion is to create an expanded image with improved definition from observed low-resolution imagery. Acquisition of this low-resolution imagery can be modeled by averaging a block of pixels within a high-resolution image. Resolution expan-
Correspondence to: P.D. Thouin
P.D. Thouin, C.-I. Chang: A method for restoration of low-resolution document images 201
images expanded using standard methods.

The remainder of this paper is organized as follows. Section 2 describes the problem of image resolution expansion. In Sect. 3, three text scoring functions are introduced which exploit properties of text images and set a foundation for the image restoration method proposed in this paper. Section 4 derives the technique used to iteratively solve for the functional minimum that results in the restored image. Section 5 presents experiments using this technique and quantitatively compares the proposed method to other methods of image resolution expansion. Finally, a summary of this proposed technique is given in Sect. 6.

Fig. 1. High-resolution imaging system with R × C elements

2 Problem statement

The image acquisition process consists of converting a continuous image into discrete values obtained from a group of sensor elements. Each sensor element produces
arranged in a non-overlapping grid of square elements, smaller elements result in higher resolution imagery. A high-resolution imaging system is shown in Fig. 1 where the number of sensors is adequate to represent the desired text image. The majority of pixels within the image are either white or black, with a small number of gray pixels occurring at the edges. Figure 2 illustrates a low-resolution imaging system where the number of sensors has been reduced by a factor of q = 4 in both the horizontal and vertical directions. This low-resolution acquisition results in significant blockiness and is insufficient to accurately represent this image. Each sensor element effectively averages the image within its section of the grid, resulting in an increased amount of gray pixels.

Fig. 2. Low-resolution imaging system with R/q × C/q elements

Low-resolution imaging can therefore be thought of as block-averaging high-resolution images.

The effects of this block-averaging can be further observed by examining image histograms. Figure 3 plots the histograms of an original text image and its block-averaged image. The original image's histogram is bimodal, which is common to text images. The peak occurs at background (white) values, since the majority of pixels on a text page is background. There is a secondary peak representing the black letters, which corresponds
[Fig. 3: histograms of the original image and its block-averaged image; horizontal axis "Grayscale Values" (0–250); curves labeled "Original Image" and "Block-Averaged Image"]
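The block-averaging model of low-resolution acquisition described in Sect. 2 can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `block_average`, the 8 × 8 test patch, and the use of 0 for black and 255 for white are hypothetical choices, not taken from the paper.

```python
import numpy as np

def block_average(high_res, q):
    """Average each non-overlapping q x q block of a 2-D image."""
    rows, cols = high_res.shape
    if rows % q or cols % q:
        raise ValueError("image dimensions must be multiples of q")
    # Give every q x q block its own pair of axes, then average over them.
    blocks = high_res.reshape(rows // q, q, cols // q, q)
    return blocks.mean(axis=(1, 3))

# Hypothetical 8 x 8 binary patch: white (255) page with a black (0) stroke
# three pixels wide, so the stroke does not line up with the 4 x 4 sensor grid.
high = np.full((8, 8), 255.0)
high[:, 2:5] = 0.0

low = block_average(high, q=4)

# Count gray pixels (neither pure black nor pure white) before and after.
gray_high = np.count_nonzero((high > 0) & (high < 255))
gray_low = np.count_nonzero((low > 0) & (low < 255))
print(low)
print("gray pixels: high-res =", gray_high, " low-res =", gray_low)
```

In this small example the binary high-resolution patch contains no gray pixels, while every block-averaged pixel is gray, which mirrors the histogram change discussed above.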
$$\frac{\partial B(x)}{\partial x_{r,c}} = 4x_{r,c}^{3} - 6(\mu_{white} + \mu_{black})\,x_{r,c}^{2} + 2(\mu_{white}^{2} + 4\mu_{white}\mu_{black} + \mu_{black}^{2})\,x_{r,c} - 2\mu_{white}\mu_{black}(\mu_{white} + \mu_{black}) \tag{4}$$

Partial derivatives of the bimodal score B(x) are independent of neighboring pixels. The estimated means of the bimodal distribution, $\mu_{black}$ and $\mu_{white}$, are known a priori and their appropriate constants can be pre-computed.

3.2 The smoothness score

With the exception of edges, text images tend to be very smooth in both the foreground and background regions which results in neighbors with similar values. A smoothness score, which is computed for each block of pixels, is introduced to measure this feature. For this proposed algorithm, a simple statistic using only the four nearest neighbors of each pixel is used. Other more sophisticated smoothness measures could be implemented as well. The smoothness score S(x) used by this technique is given by

$$S(x) = \sum_{r,c} \left[ (x_{r-1,c} - x_{r,c})^{2} + (x_{r,c-1} - x_{r,c})^{2} + (x_{r,c+1} - x_{r,c})^{2} + (x_{r+1,c} - x_{r,c})^{2} \right] \tag{6}$$

where r and c are the row and column indices within the block being evaluated. The minimum value of S(x) = 0 occurs when all pixels have identical values.

The first and second partial derivatives are calculated in a straightforward manner. The first partial derivative of this smoothing score with respect to pixel $x_{r,c}$ is

$$\frac{\partial S(x)}{\partial x_{r,c}} = 8x_{r,c} - 2(x_{r-1,c} + x_{r,c-1} + x_{r,c+1} + x_{r+1,c}). \tag{7}$$

The second partial derivatives are nonzero only when

3.3 The average constraint score

It is reasonable to require that the average of a group of high-resolution pixels is close to the original value of the low-resolution pixel from which they were derived. For each block of low-resolution pixels, an average score A(x) is used to measure how well the restored high-resolution pixels meet the average constraint imposed by their corresponding low-resolution pixels. Figure 5 shows four low-resolution pixels whose values are $\mu_1$, $\mu_2$, $\mu_3$, and $\mu_4$. The q × q group of high-resolution pixels that are being restored from pixel $\mu_1$ are represented by $\{x^{(1)}_{r,c},\ 1 \le (r, c) \le q\}$. The average score for this 2 × 2 block is expressed by

$$A(x) = \sum_{i=1}^{4} \left[ \mu_i - \frac{1}{q^{2}} \sum_{r=1}^{q} \sum_{c=1}^{q} x^{(i)}_{r,c} \right]^{2} \tag{9}$$

where i is the index for the low-resolution pixels, $\mu_i$ is the value of each low-resolution pixel, and $x^{(i)}_{r,c}$ are the restored high-resolution pixels corresponding to pixel $\mu_i$. The initial high-resolution image formed by using pixel replication always has an average score of zero because it satisfies the constraint.

Fig. 5. Average score computation

The first partial derivative for the group of high-resolution pixels corresponding to pixel $\mu_i$ is equal to

$$\frac{\partial A(x)}{\partial x^{(i)}_{r,c}} = \frac{2}{q^{2}} \left[ \frac{1}{q^{2}} \sum_{r=1}^{q} \sum_{c=1}^{q} x^{(i)}_{r,c} - \mu_i \right] \tag{10}$$
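The scores of Sect. 3 can be written compactly as array operations. The sketch below is illustrative only and rests on several assumptions that are not in the text: the function names are hypothetical, gray values are scaled to [0, 1], image borders are handled by replicating edge pixels, and the bimodal score is assumed to have the form of a sum of $(x_{r,c} - \mu_{black})^{2}(x_{r,c} - \mu_{white})^{2}$, which is the form whose derivative agrees term by term with Eq. (4).

```python
import numpy as np

def bimodal_score(x, mu_black, mu_white):
    # Assumed form of B(x); its derivative expands to Eq. (4).
    return np.sum((x - mu_black) ** 2 * (x - mu_white) ** 2)

def bimodal_grad(x, mu_black, mu_white):
    # Eq. (4). The three constants depend only on the two means,
    # so they can be computed once before iterating.
    s = mu_white + mu_black
    p = mu_white * mu_black
    c2 = 6.0 * s
    c1 = 2.0 * (mu_white ** 2 + 4.0 * p + mu_black ** 2)
    c0 = 2.0 * p * s
    return 4.0 * x ** 3 - c2 * x ** 2 + c1 * x - c0

def smoothness_score(x):
    # Eq. (6) over the four nearest neighbours. Assumption: edges are
    # replicated, so a neighbour outside the image contributes zero.
    p = np.pad(x, 1, mode="edge")
    c = p[1:-1, 1:-1]
    return np.sum((p[:-2, 1:-1] - c) ** 2 + (p[1:-1, :-2] - c) ** 2
                  + (p[1:-1, 2:] - c) ** 2 + (p[2:, 1:-1] - c) ** 2)

def block_means(x, q):
    # Mean of every non-overlapping q x q block.
    r, c = x.shape
    return x.reshape(r // q, q, c // q, q).mean(axis=(1, 3))

def average_score(x, low, q):
    # Eq. (9): squared gap between each low-resolution pixel and its block mean.
    return np.sum((low - block_means(x, q)) ** 2)

def average_grad(x, low, q):
    # Eq. (10): every pixel of a block shares the same derivative.
    per_block = (2.0 / q ** 2) * (block_means(x, q) - low)
    return np.kron(per_block, np.ones((q, q)))

# Hypothetical 2 x 2 low-resolution patch expanded by pixel replication.
low = np.array([[0.2, 0.9], [0.6, 0.1]])
q = 4
x0 = np.kron(low, np.ones((q, q)))
print("average score of the replicated image:", average_score(x0, low, q))
print("smoothness score of a constant image:", smoothness_score(np.full((8, 8), 0.5)))
print("bimodal score of the replicated image:", bimodal_score(x0, 0.0, 1.0))
```

The usage lines check the two properties stated in the text: a constant image has a smoothness score of zero, and the pixel-replicated image meets the average constraint exactly.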
in the transformed domain, which is simply δ = Eδ′ in the pixel domain. For each iteration, the image update δ′ is determined. The iterations continue until convergence is reached, resulting in a desired restored image. An example of this iterative image restoration process is shown in Fig. 8. The original 4 × 4 block of pixels is expanded by a factor of q = 4 using pixel replication to produce a 16 × 16 high-resolution image shown in Fig. 8a. As the iterative restoration process proceeds in Figs. 8b–f, the image becomes more bimodal and smooth resulting in a greatly improved image. The majority of gray pixels that occur between characters are replaced with either black or white values, resulting in a strongly bimodal distribution. The resulting image is also smooth in both the foreground and background regions while maintaining the constraint that the average of each 4 × 4 block of high-resolution pixels is close to the original value of each corresponding low-resolution pixel. The minimization procedure is completed in 30 iterations for this image and the BSA score is plotted in Fig. 9d. The individual bimodal, smoothness, and average scores are shown separately in Figs. 9a–c. Initially, both the bimodal and smoothness scores are high because the image scores poorly for these measures. As the iterations proceed, the image becomes more bimodal and smooth and these two scores are reduced. The average score plotted in Fig. 9c originally has the minimum value at zero because the constraint is met exactly. The average score increases as the restoration proceeds, but this score is still significantly smaller than the bimodal and smoothness scores. Minimization of the BSA score produces a restored image that is the optimal combination of these bimodal, smoothness, and average measures.

Fig. 8a–f. Iterative text restoration example (recovered panel titles: (e) Iteration 15, (f) Iteration 30)

Fig. 9a–d. BSA score minimization: (a) Bimodal score, (b) Smoothness score, (c) Average score, (d) BSA score; horizontal axis: Iterations

5 Experimental results

The proposed BSA restoration algorithm was compared to several common expansion methods, including pixel replication, linear interpolation, and cubic spline expansion. In linear interpolation, a linear fit is calculated between all pixels within each column, and then repeated for all pixels within each row. These images naturally tend to be smooth, without sharp discontinuities, producing blurry results. Cubic spline expansion [3] approximates the given discrete low-resolution pixels as a smooth
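The behavior of the three scores during minimization can be reproduced in spirit with a much simpler optimizer. The sketch below is not the transformed-domain update described in Sect. 4: it swaps in plain projected gradient descent in the pixel domain. The weights `W_B`, `W_S`, `W_A`, the step size, the [0, 1] gray scale with means 0 and 1, the replicated-edge border handling, the assumed product form of the bimodal score, and the 4 × 4 test patch are all hypothetical choices made for illustration.

```python
import numpy as np

MU_BLACK, MU_WHITE = 0.0, 1.0      # assumed means on a [0, 1] gray scale
W_B, W_S, W_A = 1.0, 0.05, 8.0     # hypothetical weights, not from the paper

def block_means(x, q):
    r, c = x.shape
    return x.reshape(r // q, q, c // q, q).mean(axis=(1, 3))

def neighbour_sum(x):
    # Sum of the four nearest neighbours, replicating edges at the border.
    p = np.pad(x, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

def scores(x, low, q):
    # Bimodal (assumed product form), smoothness (Eq. 6), average (Eq. 9).
    b = np.sum((x - MU_BLACK) ** 2 * (x - MU_WHITE) ** 2)
    p = np.pad(x, 1, mode="edge")
    c = p[1:-1, 1:-1]
    s = np.sum((p[:-2, 1:-1] - c) ** 2 + (p[2:, 1:-1] - c) ** 2
               + (p[1:-1, :-2] - c) ** 2 + (p[1:-1, 2:] - c) ** 2)
    a = np.sum((low - block_means(x, q)) ** 2)
    return b, s, a

def total(x, low, q):
    b, s, a = scores(x, low, q)
    return W_B * b + W_S * s + W_A * a

def gradient(x, low, q):
    # Factored form of Eq. (4).
    g_b = 2.0 * (x - MU_BLACK) * (x - MU_WHITE) * (2.0 * x - MU_BLACK - MU_WHITE)
    # Gradient of the smoothness sum taken over all pixels; each neighbour
    # pair appears twice in that sum, hence the factor 4.
    g_s = 4.0 * (4.0 * x - neighbour_sum(x))
    # Eq. (10), shared by every pixel of a block.
    g_a = np.kron((2.0 / q ** 2) * (block_means(x, q) - low), np.ones((q, q)))
    return W_B * g_b + W_S * g_s + W_A * g_a

def restore(low, q, iterations=30, step=0.1):
    x = np.kron(low, np.ones((q, q)))          # start from pixel replication
    history = [scores(x, low, q)]
    for _ in range(iterations):
        # Gradient step, then projection back onto the valid gray range.
        x = np.clip(x - step * gradient(x, low, q), 0.0, 1.0)
        history.append(scores(x, low, q))
    return x, history

# Hypothetical 4 x 4 low-resolution patch with dark and light regions.
low = np.array([[0.9, 0.8, 0.9, 1.0],
                [0.3, 0.2, 0.6, 0.9],
                [0.2, 0.1, 0.3, 0.8],
                [0.9, 0.7, 0.8, 1.0]])
restored, history = restore(low, q=4)
for k in (0, 15, 30):
    print(k, ["%.4f" % v for v in history[k]])
```

Each printed row lists the bimodal, smoothness, and average scores at one iteration. The starting image has an average score of zero because pixel replication satisfies the constraint exactly. A first-order method like this one is chosen only for brevity and should not be expected to match the convergence behavior reported for the paper's method.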
Fig. 11a–e. Text restoration results for 4 × 4 degradation (recovered panel title: (a) Spline Image)

amount of gray pixels produces superior character separation evident in the word “conform”. This gray pixel de-

[Plot residue: curves labeled Original, Linear, Spline, and BSA; legend "O BSA better than Spline", "X BSA worse than Spline"; horizontal axis "Images"]
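The two simplest baselines named in Sect. 5, pixel replication and linear interpolation, can be sketched as follows. The interpolation is applied along each column and then along each row, as the text describes. The sampling grid (each low-resolution pixel placed at the center of its q × q block) and the border handling (holding the edge value) are assumptions, since the text does not specify them. All function names are hypothetical.

```python
import numpy as np

def pixel_replication(low, q):
    # Copy every low-resolution pixel into a q x q block.
    return np.kron(low, np.ones((q, q)))

def _interp_axis0(a, q):
    # Linearly interpolate each column of `a` onto a grid q times finer.
    n = a.shape[0]
    src = np.arange(n)
    # Assumed alignment: low-resolution sample i sits at the centre of
    # high-resolution rows i*q .. i*q + q - 1.
    dst = (np.arange(n * q) - (q - 1) / 2.0) / q
    out = np.empty((n * q, a.shape[1]))
    for c in range(a.shape[1]):
        # np.interp holds the first/last value outside the sample range.
        out[:, c] = np.interp(dst, src, a[:, c])
    return out

def linear_interpolation(low, q):
    tall = _interp_axis0(low, q)         # fit within each column first
    return _interp_axis0(tall.T, q).T    # then within each row

# Hypothetical patch with a vertical black-to-white edge.
low = np.array([[0.0, 1.0],
                [0.0, 1.0]])
rep = pixel_replication(low, q=4)
lin = linear_interpolation(low, q=4)
print(rep[0])
print(lin[0])
```

In this example pixel replication keeps the edge as a single step, while linear interpolation spreads it over several gray pixels, which is the blurring effect described in Sect. 5.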
6 Conclusions

In this paper, we present a new resolution expansion technique for the restoration of low-resolution grayscale text images. Bimodal, smoothness, and average (BSA) scores that measure desired properties observed in text images were introduced and combined to form a single scoring function. The restored image obtained by solving this nonlinear optimization problem is one which is strongly bimodal and smooth, while satisfying the average constraint score. The proposed BSA restoration technique was shown to be both qualitatively and quantitatively superior to the existing linear interpolation and cubic spline expansion techniques.

Although this technique was specifically designed for text images, it has also proven effective at restoring low-resolution fingerprint images. Text images have much in common with fingerprint images. Both have strongly bimodal histograms, are smooth in both the foreground and background regions, and have sharp edge transitions. With no modifications, the BSA algorithm was able to successfully restore fingerprint images. In addition to fingerprints, it is expected that the proposed algorithm will accurately expand tables, charts, graphs, and other line drawings.

In order to further improve the proposed BSA restoration technique, new measures may be added that make use of a priori knowledge. If the scanning resolution and font size within an image are known, the average stroke width in pixels can be computed. A new score could be created to measure how close a group of pixels is to this desired width. Additionally, different measures for the existing bimodal, smoothness, and average scores could be implemented. Another research area under exploration is optimally selecting the weights for the three scoring functions, by scanning the same document at multiple resolutions, restoring the low-resolution image, and comparing it to the corresponding scanned high-resolution image, which could potentially be of significant benefit. Additional research is also being considered to determine how to apply this method to images with non-uniform background such as video frames.

Acknowledgements. The first author wishes to thank G. H. Nickel of Los Alamos National Laboratory and C. S. Cumbee of the Department of Defense for technical discussions, and S. J. Dennis of the Department of Defense for his support of this research.

References

1. T. Akiyama, N. Miyamoto, M. Oguro, K. Ogura: Faxed document image restoration method based on local pixel patterns. Proc. SPIE, 3305: 253–262, 1998
2. T. C. Chen, R. J. P. de Figueiredo: Image decimation and interpolation techniques based on frequency domain analysis. IEEE Trans. on Commun., 32(4), 1984
3. H. S. Hou, H. C. Andrews: Cubic splines for image interpolation and digital filtering. IEEE Trans. on Acoust., Speech, and Signal Process., 26(6), December 1978
4. M. Irani, S. Peleg: Improving resolution by image registration. CVGIP: Graphical Models and Image Process., 53(3): 231–239, 1991
5. M. Y. Jaisimha, E. A. Riskin, R. Ladner, S. Werner: Model-based restoration of document images for OCR. Proc. SPIE, 2660: 297–308, 1996
6. N. B. Karayiannis, A. N. Venetsanopoulos: Image interpolation based on variational principles. Signal Process., 25: 259–288, 1991
7. R. G. Keys: Cubic convolution interpolation for digital image processing. IEEE Trans. on Acoust., Speech, and Signal Process., 29(6): 1153–1160, 1981
8. L. Koskinen, H. Huttunen, J. T. Astola: Text enhancement method based on soft morphological filters. Proc. SPIE, 2181: 243–253, 1994
9. A. D. Kulkarni, K. Sivaraman: Interpolation of digital imagery using hyperspace approximation. Signal Process., 7: 65–73, 1987
10. H. Li, O. E. Kia, D. S. Doermann: Text enhancement in digital video. Proc. SPIE, 3651: 2–9, 1999
11. J. Liang, R. M. Haralick, I. T. Phillips: Document image restoration using binary morphological filters. Proc. SPIE, 2660: 274–285, 1996
12. V. S. Nalwa: Edge-detector resolution improvement by image interpolation. IEEE Trans. Pattern Anal. Mach. Intell., 9(3): 446–451, 1987
13. G. Ramponi, P. Fontanot: Enhancing document images with a quadratic filter. Signal Process., 33: 23–34, 1993
14. F. Sattar, D. B. H. Tay: On the multiresolution enhancement of document images using fuzzy logic approach. ICPR98 Proc. Int. Conf. on Pattern Recognition, pp. 939–942, 1998
15. R. R. Schultz, R. L. Stevenson: A Bayesian approach to image expansion for improved definition. IEEE Trans. on Image Processing, 3(3): 233–242, 1994
16. Y. C. Shin, R. Sridhar, V. Demjanenko, P. W. Palumbo, J. J. Hull: Contrast enhancement of mail piece images. Proc. SPIE, 1661: 27–37, 1992
17. J. Skilling, R. K. Bryan: Maximum entropy image reconstruction: general algorithm. Mon. Not. R. Astron. Soc., 211: 111–124, 1984
18. P. Stubberud, J. Kanai, V. Kalluri: Adaptive image restoration of text images that contain touching or broken characters. ICDAR95 Proc. Int. Conf. Doc. Anal. Recognition, pp. 778–781, 1995
19. M. J. Taylor, C. R. Dance: Enhancement of document images from cameras. Proc. SPIE, 3305: 230–241, 1998
20. P. D. Thouin, C.-I. Chang: A Method for Restoration of Low-Resolution Text Images. Symp. Doc. Image Understanding, pp. 143–148, 1999
21. OCR Accuracy Report Version 5.3: University of Nevada, Las Vegas, NV, UNLV-ISRI
22. UW English Document Image Database I, Volume 1, Binary Images: University of Washington, Seattle, WA, 1993
23. A. P. Whichello, H. Yan: Linking broken character borders with variable sized masks to improve recognition. Pattern Recognition, 29(8): 1429–1435, 1996
24. M.Y. Yoon, S.W. J.S. Kim: Faxed image restoration using Kalman filtering. ICDAR95 Proc. Int. Conf. Doc. Anal. Recognition, pp. 677–681, 1995

Paul D. Thouin received his B.S. degree in electrical engineering from the University of Michigan, Ann Arbor in 1987. In 1993, he obtained the M.S.E.E. degree from George Washington University in Washington, D.C. He is currently a Ph.D. candidate at the University of Maryland Baltimore County majoring in electrical engineering. Mr. Thouin has been employed by the U.S. Department of Defense since 1987 where he is a Senior Engineer currently assigned to the Image Research Branch. His research interests include image enhancement, statistical modeling, document analysis, and pattern recognition. Mr. Thouin is a member of SPIE and Phi Kappa Phi.

Chein-I Chang received his B.S., M.S. and M.A. degrees from Soochow University, Taipei, Taiwan, 1973, the Institute of Mathematics at National Tsing Hua University, Hsinchu, Taiwan, 1975 and the State University of New York at Stony Brook, 1977, respectively, all in mathematics, and M.S., M.S.E.E. degrees from the University of Illinois at Urbana-Champaign in 1982 respectively and Ph.D. in electrical engineering from the University of Maryland, College Park in 1987. He was a visiting assistant professor from January 1987 to August 1987, assistant professor from 1987 to 1993, and is currently an associate professor in the Department of Computer Science and Electrical Engineering at the University of Maryland Baltimore County. He was a visiting specialist in the Institute of Information Engineering at the National Cheng Kung University, Tainan, Taiwan from 1994-1995. He is an editor for Journal of High Speed Network and the guest editor of a special issue on Telemedicine and Applications. His research interests include automatic target recognition, multispectral/hyperspectral image processing, medical imaging, information theory and coding, signal detection and estimation, and neural networks. Dr. Chang is a senior member of IEEE and a member of SPIE, INNS, Phi Kappa Phi and Eta Kappa Nu.