and Video
Kristofor Gibson, Dũng Võ, and Truong Nguyen

Space and Naval Warfare Systems Center Pacific, 53560 Hull St., San Diego, CA 92126 USA
Email: kris.gibson@navy.mil

Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093 USA
Email: dungtrungvo@gmail.com, nguyent@ece.ucsd.edu
Abstract—This paper proposes a novel method for single image dehazing that operates at a faster speed than current methods, for implementation in video enhancement. We provide a comparison of our proposed dehazing method with current state-of-the-art methods. We then consider the effect of compression by investigating the blocking and ringing artifacts that arise when any dehazing method is applied before or after compression. Based on an investigation with the JPEG model, we conclude that the best dehazing performance (with fewer artifacts) is achieved if the dehazing is applied before compression. Simulations for both JPEG images and H.264 compressed sequences validate our conclusion.

I. INTRODUCTION

Suppose there is a hypothetical situation where one needs to design a surveillance system that is capable of capturing images or video (visible and/or infrared) for near real-time surveillance by way of transmitting the imagery through a lossy compression method (e.g., JPEG for images or H.264 for video). Also suppose the surveillance system is positioned in an ocean environment. Since fog and haze are a common presence in the ocean environment due to the abundance of water vapor and temperature fluctuations, the imagery obtained by this surveillance system will suffer from low contrast at all spectral bands [1]. A typical choice for improving the acquired imagery (by removing fog/haze) is to apply contrast enhancements after the images are captured and compressed. This is commonplace because most systems are not equipped with on-board enhancement systems. We will call this type of system a post-enhancement, or Post. Another reasonable choice is to apply a contrast enhancement before compression. We call this pre-enhancement, or Pre. The question is: which is the better choice, the Pre or the Post method, and why?

In Section II we briefly explore common contrast enhancements that can be used to remove fog and haze. This section also introduces the physics model of haze and fog commonly used in current single image dehazing methods. It is also desirable for the contrast enhancement method to be fast enough to allow several images to be analyzed (by humans or computer vision systems) in a reasonable amount of time, or for near real-time video surveillance. Therefore we propose a single image dehazing method that is faster than current methods and yet effective in removing haze and fog in images in Section III. We then explore the effect of coding (e.g., JPEG compression) when applying any dehazing method before or after compression in Section IV. We extend our analysis to video by looking at motion vector estimation in Section V. We present results in Section VI with a simulation of images that support our analysis and then finish with a conclusion in Section VII.

II. BACKGROUND

There are contrast enhancement methods that can improve the quality of images based on image statistics. Examples of these statistical methods are histogram equalization [2], Grey-Level-Grouping [3], and Contrast Limited Adaptive Histogram Equalization (CLAHE) [4]. Methods like CLAHE were developed to be spatially adaptive (versus globally applied) in order to account for variable contrast degradations through a single image. Additionally, there are methods that try to increase local contrast in the pixel domain [5], [6]. These can help achieve satisfactory contrast enhancements because of their spatially adaptive nature. Other methods use a model of the human vision system (HVS) to make the recovered image consistent with how a human perceives it by enforcing color constancy [7], [8]. Instead of modeling the receiving end (human or machine), we will focus on more applicable methods that use an external physics model of fog and haze to estimate and recover a dehazed image [1], [9]–[13]. This model is attractive because it can further help to recover the depth of a scene, which can facilitate other processes such as object detection.

To account for the presence of compression, researchers have also implemented contrast enhancements in the transform domain [14]–[16]. What is not found in the literature is a joint investigation of dehazing and image/video compression. In addition, the choice between the Pre and Post methods has not been investigated. We will first look at the common model used for dehazing and then observe the effect of dehazing before (Pre) and after (Post) compression.

A. Fog and Haze Physical Model

The dichromatic model commonly used for representing fog and haze is

\hat{x}(m,n,\lambda) = x(m,n,\lambda)\, t(m,n,\lambda) + a(\lambda)\,(1 - t(m,n,\lambda)). \quad (1)
IV. EFFECTS OF JPEG COMPRESSION WITH DEHAZING

Let us take a step back and assume we have a perfect dehazing algorithm where t(m,n) and a are exactly known. We want to see what happens when we dehaze before or after lossy JPEG compression (assuming haze is present in the image). We will use JPEG compression with uniform quantization for the first part of our investigation. With JPEG, the input image is first converted into a YUV (or YCrCb) colorspace using a 3x3 projection matrix R:

(y, c_r, c_b)^T = R x. \quad (11)

In the Pre method, \hat{x} is dehazed to x before converting to YCrCb, so we simply have y as our luminance (instead of \hat{y}). If the chosen method is Post, then the resulting luminance is still the hazy \hat{y}.

Using our dichromatic model, the hazy luminance is

\hat{y}(m,n) = r_1^T \hat{x}(m,n)
            = r_1^T x(m,n)\, t(m,n) + r_1^T a\,(1 - t(m,n))
            = y(m,n)\, t(m,n) + a_y (1 - t(m,n)), \quad (12)

where y is the luminance of the non-hazy image pixel, r_1^T x(m,n), and the airlight projected onto the luminance channel is a_y = r_1^T a. (r_k^T is the k-th row of matrix R.) Before we ignore the C_r, C_b channels in this analysis, note that if the airlight is colorless, then the C_r and C_b channels will not contain any airlight information (r_k^T a = 0, ∀k = 2, 3). That is,

\hat{c}_r(m,n) = r_2^T x(m,n)\, t(m,n) \quad (13)

and

\hat{c}_b(m,n) = r_3^T x(m,n)\, t(m,n). \quad (14)

Once the luminance is obtained, the values are offset to lie within the range [-2^P, 2^P - 1] for (P+1)-bit data. For clarity we will not make a notational distinction, and from this point on we assume that the luminance values \hat{y} and y have had 2^P subtracted.

In JPEG compression, an N×N forward Type-II DCT is applied in non-overlapping blocks. The direct form of the (i,j)-th N×N block of the transformed luminance is

f_{i,j}(u,v) = K(u,v) \sum_{0 \le m,n \le N-1} y(Ni+m, Nj+n)\, C_u(m)\, C_v(n) \quad (15)

for u, v = 0, ..., N-1, where

C_s(k) = \cos\left( \frac{(2k+1)\, s\, \pi}{2N} \right), \quad (16)

K(u,v) = \alpha(u)\, \alpha(v), \quad (17)

and

\alpha(s) = \begin{cases} \sqrt{1/N}, & \text{if } s = 0 \\ \sqrt{2/N}, & \text{if } s \neq 0. \end{cases} \quad (18), (19)

We will use the framework shown above to explore how ringing and blocking artifacts are affected when using the Pre and Post methods.

A. Ringing Artifacts from Coding and Dehazing

When an image is decompressed, ringing artifacts will occur when frequency components are lost at the compression side. This loss is caused by the quantization of the DCT coefficients. Using (12) and (15), a DCT coefficient at frequency (u,v) of the hazy luminance at the (i,j)-th block is

\hat{f}_{i,j}(u,v) = K(u,v) \sum_{0 \le m,n \le N-1} \left[ y(Ni+m, Nj+n)\, t(Ni+m, Nj+n) + a_y (1 - t(Ni+m, Nj+n)) \right] C_u(m)\, C_v(n). \quad (20)

If we assume the depth is the same at every pixel within the (i,j)-th block (t_{i,j}(m,n) = t_{i,j}), then (20) becomes

\hat{f}_{i,j}(u,v) = K(u,v) \sum_{0 \le m,n \le N-1} \left[ t_{i,j}\, y(Ni+m, Nj+n) + a_y (1 - t_{i,j}) \right] C_u(m)\, C_v(n). \quad (21)
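To make the block-DCT-and-quantization machinery of (15)–(18) concrete, the sketch below (an illustration with assumed parameters, not the paper's code; NumPy assumed) builds the orthonormal Type-II DCT basis, applies uniform quantization as in the rounding of (24), and counts annihilated AC coefficients for a block with and without a haze attenuation t < 1, previewing the ringing argument that follows:

```python
import numpy as np

N = 8

def dct_matrix(N):
    # C[u, m] = alpha(u) * cos((2m+1) u pi / (2N)): the basis of (15)-(18)
    C = np.zeros((N, N))
    for u in range(N):
        alpha = np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
        for m in range(N):
            C[u, m] = alpha * np.cos((2 * m + 1) * u * np.pi / (2 * N))
    return C

C = dct_matrix(N)

def block_dct(block):
    return C @ block @ C.T              # separable 2-D DCT of one N x N block

def quantize(F, q):
    return np.round(F / q)              # uniform quantization with step q

rng = np.random.default_rng(0)
y = rng.normal(0.0, 30.0, (N, N))       # a zero-mean non-hazy luminance block
t = 0.3                                  # constant block transmission (assumed)
y_hazy = t * y                           # AC coefficients scale by t, as in (23)

q = 16.0
ac = np.ones((N, N), dtype=bool)
ac[0, 0] = False                         # exclude the DC component
killed_clear = int(np.sum(quantize(block_dct(y), q)[ac] == 0))
killed_hazy = int(np.sum(quantize(block_dct(y_hazy), q)[ac] == 0))
# Since the DCT is linear, hazy AC coefficients are t times smaller,
# so at least as many fall below q/2 and are set to zero:
assert killed_hazy >= killed_clear
```

The inequality holds deterministically here: scaling a block by t < 1 scales every DCT coefficient by t, so every coefficient annihilated in the clear block is also annihilated in the hazy one.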
Next we associate the zigzag frequency ν with the horizontal and vertical frequencies u and v, respectively. The operators Z_u(ν), Z_v(ν), and Z_ν(u,v) are given by

u = Z_u(\nu), \quad v = Z_v(\nu), \quad \nu = Z_\nu(u, v).

These operators allow us to parameterize (21) with ν as

\hat{f}^z_{i,j}(\nu) = \hat{f}_{i,j}(Z_u(\nu), Z_v(\nu)). \quad (22)

Thus the DC component of \hat{f}^z_{i,j}(\nu) is \hat{f}^z_{i,j}(0). For the AC components, 1 ≤ ν ≤ N²-1, the constant airlight term contributes only to the DC component, so we have

\hat{f}^z_{i,j}(\nu) = \sum_{0 \le m,n \le N-1} K(Z_u(\nu), Z_v(\nu))\, t_{i,j}\, y(Ni+m, Nj+n)\, C_{Z_u(\nu)}(m)\, C_{Z_v(\nu)}(n) = t_{i,j}\, f^z_{i,j}(\nu), \quad \text{for } 1 \le \nu \le N^2 - 1. \quad (23)

In (23) we see that the relationship between the AC components of the hazy block \hat{f}^z_{i,j} and its non-hazy counterpart f^z_{i,j} is an attenuation by t_{i,j} (assuming t_{i,j} is constant in the (i,j)-th block).

The quantization q(ν) applied to each DCT coefficient is

\hat{f}^q_{i,j}(\nu) = \lfloor \hat{f}^z_{i,j}(\nu) / q(\nu) + 1/2 \rfloor, \quad (24)

where the operator ⌊x + 1/2⌋ rounds x to the nearest integer. The quantization in (24) can annihilate (set to zero) an AC component h^z(ν) when

|h^z(\nu)| / q(\nu) < 1/2. \quad (25)

The probability that a hazy AC coefficient is annihilated is

P\left( |\hat{f}^z_{i,j}(\nu)| < q(\nu)/2 \right), \quad (26)

and the probability that a non-hazy AC coefficient is annihilated is

P\left( |f^z_{i,j}(\nu)| < q(\nu)/2 \right). \quad (27)

With (23) and (26) we can relate the probability using the non-hazy AC coefficient,

P\left( |\hat{f}^z_{i,j}(\nu)| < q(\nu)/2 \right) = P\left( |t_{i,j}\, f^z_{i,j}(\nu)| < q(\nu)/2 \right) = P\left( |f^z_{i,j}(\nu)| < \frac{q(\nu)}{2 t_{i,j}} \right) \quad (28)

(since t_{i,j} is always positive). Let us make another reasonable assumption to simplify our analysis by restricting 0 ≤ t_{i,j} < 1, because t_{i,j} = 1 means the distance to the camera is zero, which in practice never occurs. Under this restriction, shifting the threshold from q(ν)/2 to q(ν)/(2 t_{i,j}) increases the probability in (27) to that in (28) because of the monotonically increasing property of the cumulative distribution function, which gives us the inequality

P\left( |f^z_{i,j}(\nu)| < \frac{q(\nu)}{2} \right) < P\left( |f^z_{i,j}(\nu)| < \frac{q(\nu)}{2 t_{i,j}} \right). \quad (29)

With (28) and (29), we can say the probability that a hazy AC coefficient is annihilated is greater than the probability that a non-hazy AC coefficient is annihilated,

P\left( |\hat{f}^z_{i,j}(\nu)| < \frac{q(\nu)}{2} \right) > P\left( |f^z_{i,j}(\nu)| < \frac{q(\nu)}{2} \right). \quad (30)

Thus, using (30) and assuming t_{i,j} is constant in the (i,j)-th block, the probability of ringing artifacts using the Post method is greater than the probability of ringing using the Pre method.

B. Blocking Artifacts from Coding and Dehazing

The cause of blocking artifacts in lossy compression methods is the artificial boundaries induced by the block-based DCT between neighboring blocks [18]. To observe the severity of blocking we will compare the signal-to-noise ratios (SNR) at the reconstructed (decompressed) end of the system. To do this, we use the (i,j)-th reconstructed block y^r_{i,j}, which is a dequantized and inverse block DCT of f^q_{i,j}:

y^r_{i,j}(m,n) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} K(u,v)\, f^q_{i,j}(u,v)\, C_u(m)\, C_v(n)\, q(u,v), \quad (31)

where q(u,v) is the uniform quantization step at frequency (u,v). It has been shown in [19] that the reconstruction y^r_{i,j} can be characterized as the original signal plus reconstruction noise,

y^r_{i,j}(m,n) = y_{i,j}(m,n) + \epsilon_r, \quad (32)

where ε_r is the reconstruction noise with zero mean and variance σ_r². Likewise, the hazy reconstructed block is

\hat{y}^r_{i,j}(m,n) = t(m,n)\, y_{i,j}(m,n) + a_y (1 - t(m,n)) + \epsilon_r. \quad (33)

If we take the next step in this analysis by also adding camera noise ε_n to (12), then the image used for dehazing in the Pre method (no compression) is

\hat{y}^{Pre}_{i,j}(m,n) = t_{i,j}(m,n)\, y_{i,j}(m,n) + a_y (1 - t_{i,j}(m,n)) + \epsilon_n \quad (34)

and the hazy reconstructed image used for the Post method is

\hat{y}^{Post}_{i,j}(m,n) = t_{i,j}(m,n)\, y_{i,j}(m,n) + a_y (1 - t_{i,j}(m,n)) + \epsilon_n + \epsilon_r. \quad (35)

Now assume that we have a perfect dehazing function D(·), defined as

D(\hat{h}) = h = \frac{1}{t(m,n)} \left( \hat{h} - a_y (1 - t(m,n)) \right), \quad (36)

where the transmission t(m,n) and airlight a_y are exactly known. Using (34), (35), and (36), the dehazed signals are

D^{Pre}_{i,j} = D(\hat{y}^{Pre}_{i,j}(m,n)) = y_{i,j}(m,n) + \frac{\epsilon_n}{t_{i,j}(m,n)} \quad (37)

and

D^{Post}_{i,j} = D(\hat{y}^{Post}_{i,j}(m,n)) = y_{i,j}(m,n) + \frac{\epsilon_n + \epsilon_r}{t_{i,j}(m,n)}. \quad (38)

Note that in (37) the reconstruction error is not present because compression has not yet taken place. To complete the Pre analysis, we add reconstruction noise similar to (32) to represent the reconstructed version of D^{Pre}_{i,j}:

D^{Pre,r}_{i,j} = D^{Pre}_{i,j} + \epsilon_r. \quad (39)

With E[ε_n] = E[ε_r] = 0, the expected values of the D^{Pre,r} and D^{Post} signals are both equal to y_{i,j}(m,n). Interestingly, however, their variances differ and are calculated as

var[D^{Pre,r}_{i,j}] = \sigma^2_{y_{i,j}} + \frac{\sigma_n^2}{t^2_{i,j}(m,n)} + \sigma_r^2 \quad (40)

and

var[D^{Post}_{i,j}] = \sigma^2_{y_{i,j}} + \frac{\sigma_n^2 + \sigma_r^2}{t^2_{i,j}(m,n)}, \quad (41)

with var[y_{i,j}(m,n)] = σ²_{y_{i,j}}. More importantly, the Pre and Post SNR relationship at block (i,j) for the perfect dehazing function D, SNR^{Pre}_{i,j} and SNR^{Post}_{i,j} respectively, is

SNR^{Pre}_{i,j} = \frac{\sigma^2_{y_{i,j}}}{\frac{\sigma_n^2}{t^2_{i,j}(m,n)} + \sigma_r^2} \ge \frac{\sigma^2_{y_{i,j}}}{\frac{\sigma_n^2 + \sigma_r^2}{t^2_{i,j}(m,n)}} = SNR^{Post}_{i,j}. \quad (42)

[Figure: (a) Synthesized haze.]
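The variance relationship in (40)–(42) can be checked numerically. The following Monte Carlo sketch is our own illustration, not from the paper; it assumes Gaussian noises and treats the clean pixel value as fixed, so only the noise terms contribute to the variances being compared:

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 200_000
t = 0.4                          # block transmission, 0 <= t < 1 (assumed value)
sigma_n, sigma_r = 2.0, 3.0      # camera and reconstruction noise std (assumed)
y = 100.0                        # fixed clean pixel: compare noise variances only

eps_n = rng.normal(0, sigma_n, trials)   # camera noise, present before coding
eps_r = rng.normal(0, sigma_r, trials)   # reconstruction noise from coding

# Pre: dehaze first (amplifies only camera noise), then compress -> eq. (39)
d_pre = y + eps_n / t + eps_r
# Post: compress first, then dehaze (amplifies both noises) -> eq. (38)
d_post = y + (eps_n + eps_r) / t

var_pre, var_post = d_pre.var(), d_post.var()
# (40) predicts sigma_n^2/t^2 + sigma_r^2 = 34; (41) predicts
# (sigma_n^2 + sigma_r^2)/t^2 = 81.25, so Pre should come out smaller.
assert var_pre < var_post
```

Because the reconstruction noise in the Post path is divided by t < 1 while in the Pre path it is not, the Pre variance is strictly smaller, matching the SNR ordering in (42).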
[Figure: (a) Original video frame. (b) Dehazed video frame. Red box is region of interest used for comparisons.]

Fig. 5. Luminance PSNR vs. bitrate using H.264.

A plot of the results is shown in Fig. 5. Over all bitrates, the Pre method has a higher PSNR than the Post method.

For more results, please visit the following website: http://videoprocessing.ucsd.edu/~kgibson/PrePost.