Professional Documents
Culture Documents
Dept. Tecnologı́as de las Comunicaciones, ETSI Telecom., Universidad de Vigo, 36200 Vigo, Spain
email: jhernan@tsc.uvigo.es, fperez@tsc.uvigo.es
ABSTRACT
of applying a DCT transform to x[n] in a 8 8 pixels block ba-
sis. For copyright protection purposes, a watermark W [k] car-
A novel DCT-domain watermark extraction procedure for still im-
rying some hidden information (owner and image identification
ages that does not require the original image is presented. This
number, transaction date, etc.) is added to the original image in
method is based on the generalized Gaussian model, which in-
the DCT domain, obtaining as a result the watermarked version
cludes as a special case the cross-correlation-based watermark de-
Y [k] =4 X [k] + W [k].
tector structures, used so far in the literature. Optimal maximum
likelihood (ML) structures are given, which allow to analytically In the watermarking technique we analyze in this paper, the
assess the performance of watermarking methods in the DCT do- watermark W [k] is generated in the DCT domain employing a 2-
main within a statistical framework. These original theoretical re- dimensional multipulse amplitude modulation scheme [2, 3]. In
other words, W [k] can be expressed as the sum of N orthogonal
sults are validated with experiments that show a considerable im-
provement over the existing watermark extraction techniques. The f g
pulses Pi [k] Ni=1 N X
perceptual model used in the tests is also described. W [k ] = bi Pi [k]; (1)
i=1
1. INTRODUCTION 4
where the coefficients b = (b1 ; : : : ; bN ) are used to encode the
In recent years we have witnessed a striking proliferation of tech- hidden message. The modulation pulses Pi [k] N f g
i=1 are gener-
niques for representation, storage and distribution of digital mul- ated as a function of a secret key K , only known by the copyright
timedia information. Unfortunately, these developments have also owner. They are expressed as
[k]s[k]; k 2 Si
opened the gate to unathorized copying, distribution and manip-
ulation of data, mostly images. Specialized and costly hardware Pi [k] = 0; otherwise;
(2)
may alleviate the problem of images duplication, at the price of a
dramatic reduction in marketing possibilities –this is the crypto- where s[k] is a key-dependent pseudorandom sequence such that
graphic approach taken by pay TV channels, not foreseeable for 4 fS gN are also
s[k] 2 f 1; 1g; 8k, and the sets of indexes T = i i=1
scenarios such as Internet–. Watermarking techniques can at least
key-dependent and determine the spatial shape of the pulses. The
sequence [k ] is called the perceptual mask and indicates the max-
ensure that ownership information is invisibly embedded into the
image, thus preventing or deterring users from illegal uses.
imum allowable magnitude of the alteration that the coefficient
Although many watermarking methods have sprouted over the
X [k ] may suffer without achieving noticeable distortions. The sets
few past years, even with commercial products available, the re-
fS g N S \S
i i=1 are assumed to be non-overlapping, i.e. i j = ; i = ;8 6
sults up to date are quite discouraging, since there are freely avail-
able programs (e.g., unZign, Stirmark) that have succeeded in wip-
j , and sparsely spread over the whole image in a pseudorandom
fashion to provide security and robustness against cropping [2, 3].
Given a watermarked image Y [k] and the secret key K , first
ing the watermark away with little impact on the quality of the re-
sulting image. Parallel to this, the lack of theoretical analyses in
the presence of a watermark for that key is tried to be detected
most of the available literature makes it difficult to know the actual
in the so-called watermark detection test. If the result is positive,
limits in the performance of the various methods and to provide
then the watermark decoding procedure obtains an estimate of the
well-founded solutions which are the only way to eventually turn
message b. We will assume in this paper that no attacks aimed at
digital copyright protection into a mature discipline. In this paper
desynchronizing the watermark are performed. However, both the
we make a contribution in this direction by showing how water-
synchronization and watermark detection problems can be tackled
marking in the DCT domain (the most commonly used) can be
within the statistical framework presented in sections 2 to 4. The
dramatically improved by carefully modeling the problem and de-
perceptual model used in our particular watermarking scheme is
signing the proper watermark detector. We will assume throughout
given in Section 5. Section 6 is devoted to experimental results,
the paper that the original image is not known. While knowledge
while Section 7 presents our conclusions and future lines of re-
of the original image greatly simplifies the extraction procedure
search.
[1] it also narrows the range of possible applications.
Let x[n] be a two-dimensional sequence representing the lu-
minance of the original image, where n = (n1 ; n2 ). For the sake 2. STATISTICAL MODEL
of readability, we will use in the sequel this vector notation to rep-
resent two-dimensional discrete indexes. Let X [k] be the result Detector structures usually proposed for hidden information de-
coding in DCT-domain spread spectrum data hiding techniques
Work partially funded by CICYT under contract TIC-96-0500-C10-10 are based on the crosscorrelation between the watermarked image
Y [k] and the pseudorandom sequence s[k]. This scheme would When a binary antipodal constellation is used to encode M = 2N
be appropriate if noise –in watermarking, the original image– fol- possible messages, the ML detector structure is equivalent to a bit-
lowed a Gaussian distribution. However, the Gaussian assumption by-bit hard decisor, so the outputs of the decoder are
is inaccurate for DCT coefficients of common images. Some au-
thors have proposed the generalized Gaussian probability density ^bi = sgn(ri ); i 2 f1; : : : ; N g:
function (pdf)
Now let us analyze the performance of the watermark decod-
fx (x) = A e j xj :
c
(3) ing process in terms of the probability of bit error Pb . Obviously,
performance results strongly depend on image characteristics, so
as an alternative leading to improved statistical models [4]. Note we will obtain Pb conditioned to a given original image X [k] or, in
that the Gaussian and the Laplacian pdf’s are just special cases of other words, the probability of getting a bit error when a secret key
this expression, given by c = 2 and c = 1, respectively. Pre- is taken at random and is applied in both the watermarking and de-
coding processes. In this context, X [k] will be regarded as a deter-
T fS g
vious works in this field show that DCT coefficients at low fre-
quencies are reasonably well-modeled by a generalized Gaussian ministic signal while the sequence s[k] and the sets = i N i=1
distribution with c = 1=2. Coefficients at high frequencies are will be modeled statistically.
better approximated by a Gaussian distribution and sometimes by If the pseudorandom sequence s[n] is modeled as an i.i.d. two-
dimensional random process with marginal pdf fs (s), then, each
jS j
a Laplacian distribution.
The parameters A and in Eq. (3) can be expressed as sufficient statistic ri is the sum of i statistically independent
jS j f
contributions ( i is the cardinality of the set k; Pi [k] = 0 ). 6 g
(3=c)
1=2 4
Hence, by central limit theorem arguments, r = (r1 ; : : : ; rN )
= 1 ; A= c ; (4)
(1=c) 2 (1=c) can be accurately approximated as the output of a vector Gaussian
channel. Therefore, the probability of error conditioned to X [k ]
where is the standard deviation. Hence, the pdf is completely can be expressed as a function of the first and second order mo-
specified by c and . Let us define the sequence ments of r1 ; : : : ; rN . Let us define the two-dimensional sequence
4 X [8k + i; 8k + j ]; i; j 2 f0; : : : ; 7g;
Ci;j [k1 ; k2 ] = 4 jY [k] + [k]s[k]jc[k]
r [k ] = jY [k] [k]s[k]jc[k] ;
1 2
which results if we take the (i; j )-th DCT coefficient of every extracted from Eq. (3).
block. We will model each of these 64 sequences as the output 2
If the tiling generation process is such that each index k N2
of a two-dimensional i.i.d. random process whose marginal distri- S
belongs to i with probability 1=N for all i 2f g
1; : : : ; N and
bution follows Eq. (3), with parameters c(i; j ) and (i; j ). Let us assignments of indices to sets are performed independently, i.e.
also define the sequences c[k] as P rfk 2 Si ; m 2 Sj g = P rfk 2 Si gP rfm 2 Sj g; 8k 6=
m; i; j 2 f1; : : : ; N g, then after some algebraic manipulations it
4 c(k mod 8; k mod 8)
c[k] = can be proven [10] that
1 2
and [k] in a similar fashion. Thus, these two sequences indicate E [ri ] = 1
X E [r[k]] : (5)
N c[k]
the parameters c and associated with each sample X [k ]. k [k]
3. WATERMARK DECODER
V ar(ri ) = 1 V ar(r[k]) + N 1 X E 2 [r[k]] : (6)
X
N 2c[k ] N2 2c[k ]
Let us assume that M possible different messages can be encoded k [k] k [k]
with the vector b = (b1 ; 2 f
; bN ) and let bl ; l 1; ; M g Assume that bi = 1. Then, Y [k] = X [k] + [k ]s[k]; 8k 2 Si ,
denote the codeword associated to one of those messages. Also,
let Wl [k]; l2f g
1; : : : ; M be the watermark obtained from bl = and, as a consequence,
(bl;1 ; : : : ; bl;N ) using Eq. (1). Then, assuming the i.i.d. general-
ized Gaussian model for X [k], it can be easily shown that the op- r[k] = jX [k ] + 2[k]s[k]jc[k] jX [k]jc[k]:
timum decoder in the ML sense is the one that chooses the index s[k] follows a discrete uniform two-level distribution, s[k] 2
2f g
l 1; : : : ; M verifying
If
f 1; 1g, it can be easily shown that the mean and variance of r[k]
jY [k] Wm [k]jc[k] jY [k] Wl[k]jc[k] > 0; 8 m 6= l:
are
X
c[k]
E [r[k]] = 1 jX [k]j + 2[k]
h
c[k]
[k ]
k 2
Assuming that bl;i 2f g 8 2f g 2f
1; 1 ; l 1; : : : ; M ; i 1; : : : ; N , g
+ jX [k]j 2[k]
c[k] i
jX [k]jc[k]
this optimization problem is equivalent
P
to finding the codeword bl
which maximizes the expression N i=1 bl;i ri ; where the coeffi-
c[k]
V ar(r[k]) = 1 jX [k]j + 2[k]
h
cients ri are sufficient statistics for the detection problem and are 4
defined as c[k] i2
jX [k]j 2[k] :
4
ri =
X jY [k] + [k]s[k]jc[k] jY [k] [k]s[k]jc[k] :
k2Si
[k]c[k] These expressions can be applied in equations (5) and (6) to com-
pute the moments of ri . When bi = 1, it can be verified that
V ar(ri ) is given by Eq. (6) and E [ri ] is negative and its absolute which is a sum of statistically independent terms. Hence, applying
value is given by Eq. (5). When a binary antipodal constelation the central limit theorem we can infer that l(Y ) is approximately
with M = 2N is used to encode the hidden message, the probabil- Gaussian. Assuming that s[k] is an i.i.d. two-dimensional random
ity of bit error Pb of the ML decoder (a bit-by-bit hard decisor) is sequence with a discrete marginal distribution with two equiprob-
Pb = Q (SNR), where Q(x) = 4 p1 R 1 e t2 =2 dt and the signal able levels, s[k]2f g 8
1; 1 ; k, then we can easily prove that the
2 x
to noise ratio SNR is defined as mean and variance of l(Y ) conditioned to H0 are [10]
4
SNR = p
E [ri ] : (7) E [l(Y ) j H0 ] =
X
[k]c[k] jX [k ]jc[k]
V ar(ri ) k
1X
[k]c[k] jX [k] + [k ]jc[k]
4. WATERMARK DETECTOR 2
k
Now let us analyze the watermark detection test, in which we have
to decide whether a given image contains a watermark generated + jX [k ] [k]jc[k] (12)
with a certain key. The watermark detection problem can be math-
ematically formulated as the binary hypothesis test
H1 4 4
Let us define m1 = E [l(Y ) j H1 ] and 12 = V ar(l(Y ) j H1 ).
(Y ) >< ; (9) If H1 is decided in the detection test when l(Y ) > , then the
H0 probabilities of false alarm (PF ) and detection (PD ) are
where is the decision threshold and (Y ) is the likelihood func-
PF = Q + m1 ; PD = Q m1 : (16)
tion
1 1
(Y ) =
1 M
X f (Y j H1 ; bl ) Let us define the following “signal to noise ratio”
f (Y j H0 )
(10)
M l=1
4 m21 :
SNR1 =
If we assume that the coefficients of the original image X [k] 12
(17)
follow the generalized Gaussian model studied in Sect. 2, and that
the watermark does not carry hidden information, in other words, 2
If we denote by Q 1 (PF ) the value x R such that Q(x) = PF ,
that there is only one pulse (N = 1) and it is modulated by a then it can be easily proved, by examining the expressions in (16),
known value b1 = 1, then the log-likelihood function l(Y ) =
4 that
ln (Y ) has the form p
Figure 3: SNR as a function of c for Lena (256 256). [5] A. J. Ahumada and H. A. Peterson, “Luminance-model-
based DCT quantization for color image compression,” in
Human Vision, Visual Processing, and Digital Display III
(Proc. of the SPIE) (B. E. Rogowitz, ed.), 1992.
7. CONCLUSIONS AND FURTHER WORK
[6] J. A. Solomon, A. B. Watson, and A. J. Ahumada, “Visi-
Novel structures based on the use of generalized Gaussian models bility of DCT basis functions: Effects of contrast masking,”
have been proposed for the ML detection of DCT-domain water- in Proc. Data Compression Conference, (Snowbird, Utah
marks embedded in still images. By considering these models, (USA)), pp. 361–370, IEEE Computer Society Press, 1994.
we have been able to dramatically improve the performance of the [7] A. B. Watson, “Visual optimization of DCT quantization ma-
cross-correlation-based detectors that have been used up to date. trices for individual images,” in Proc., AIAA Computing in
In any case, we also have presented a theoretical analysis that al- Aerospace 9, (San Diego, California (USA)), pp. 286–291,
lows to assess the performance of DCT-based methods, measured American Institute of Aeronautics and Astronautics, 1993.
in terms of the bit error rate and the probabilities of false alarm
and detection, for a given image. The Gaussian detector is simply [8] A. N. Netravali and B. G. Haskell, Digital Pictures. Repre-
a particular case of the generalized model, so the analytical results sentation, Compression and Standards. New York: Plenum
given here are directly applicable. One immediate extension of Press, 1995.
our analysis is the consideration of channel codes which have been [9] M. Barni, F. Bartolini, V. Cappellini, and A. Piva, “A DCT-
already shown to considerably improve on spatial-domain water- domain system for robust image watermarking,” Signal Pro-
marking methods [11]. Another future line of research consists cesing, vol. 66, pp. 357–372, May 1998.
in fitting a generalized Gaussian model (from a discrete set of pa- [10] J. R. Hernández, Técnicas de Sellado Invisible para la Pro-
rameters c and ) to the DCT coefficients histogram so as to decide tección de los Derechos de Autor en Imágenes Digitales.
upon and apply the optimal detector structure. PhD thesis, Universidade de Vigo, 1998.
[11] J. R. Hernández, J. M. Rodrı́guez, and F. Pérez-González,
8. REFERENCES “Improving the performance of spatial watermarking of im-
ages using channel coding.” Accepted for publication in Sig-
[1] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon, “Se-
nal Procesing, Elsevier.
cure spread spectrum watermarking for multimedia,” IEEE
Transactions on Image Processing, vol. 6, pp. 1673–1687,
December 1997.