Professional Documents
Culture Documents
Media authentication is important in content delivery via untrusted Fig. 1 is the block diagram for our earlier image authentication
intermediaries, such as peer-to-peer (P2P) file sharing. Many dif- scheme [1] as well as the current work. We denote the source image
ferently encoded versions of a media file might exist. Our previous as x. We model the target image y by way of the space-varying
work applied distributed source coding not only to distinguish the two-state lossy channel in Fig. 2. The legitimate state of the channel
legitimate diversity of encoded images from tampering but also to performs lossy JPEG2000 or JPEG compression and reconstruction
localize tampered regions in an image already deemed to be inau- with peak signal-to-noise ratio (PSNR) of 30 dB or better. The tam-
thentic. In both cases, authentication requires a Slepian-Wolf en- pered state additionally includes malicious tampering. The channel
coded image projection that is supplied to the decoder. state variable Si is defined per nonoverlapping 16x16 block of im-
We extend our scheme to authenticate images that have under- age y. If any pixel in block Bi has been tampered with, Si = 1;
gone contrast, brightness, and affine warping adjustment. Our ap- otherwise, Si = 0.
proach incorporates an Expectation Maximization algorithm into the We now review the authentication system. The left-hand side of
Slepian-Wolf decoder. Experimental results demonstrate that the Fig. 1 shows that a pseudorandom projection (based on a randomly
proposed algorithm can distinguish legitimate encodings of authen- drawn seed Ks ) is applied to the original image x to produce projec-
tic images from illegitimately modified versions, despite arbitrary tion coefficients X, which are quantized to Xq . The authentication
contrast, brightness, and affine warping adjustment, using authenti- data comprise two parts, both derived from Xq . The Slepian-Wolf
cation data of less than 250 bytes per image. bitstream S(Xq ) is the output of a Slepian-Wolf encoder based on
Index Terms— Image authentication, distributed source coding, rate-adaptive low-density parity-check (LDPC) codes [6]. The much
Expectation Maximization smaller digital signature D(Xq , Ks ) consists of the seed Ks and
a cryptographic hash value of Xq signed with a private key. The
authentication data are generated by a server upon request. Each re-
1. INTRODUCTION sponse uses a different random seed Ks , which is provided to the
decoder as part of the authentication data. This prevents an attack
Media authentication is important in content delivery via untrusted which simply confines the tampering to the nullspace of the projec-
intermediaries, such as peer-to-peer (P2P) file sharing or P2P mul- tion. Based on the random seed, for each 16x16 nonoverlapping
ticast streaming. In these applications, many differently encoded block Bi , we generate a 16x16 pseudorandom matrix Pi by drawing
versions of the original file might exist. Moreover, transcoding and its elements independently from a Gaussian distribution N (1, σ2 )
bitstream truncation at intermediate nodes might give rise to further and normalizing so that ||Pi ||2 = 1. We choose σ = 0.2 empiri-
diversity. Intermediaries might also tamper with the media for a vari- cally. The inner product Bi , Pi is an element of X, quantized to
ety of reasons, such as interfering with the distribution of a particular an element of Xq .
file, piggybacking unauthentic content, or generally discrediting a The authentication decoder, on the right-hand side of Fig. 1,
distribution system. In previous work, we applied distributed source seeks to authenticate the image y with authentication data S(Xq )
coding (DSC) to image authentication to distinguish the diversity of and D(Xq , Ks ). It first projects y to Y in the same way as during
legitimate encodings from malicious manipulation [1] and demon- authentication data generation. A Slepian-Wolf decoder reconstructs
strated that the same framework can localize tampering in images Xq from the Slepian-Wolf bitstream S(Xq ) using Y as side infor-
deemed to be inauthentic [2]. We extended the system to be robust mation. Decoding is via joint bitplane LDPC belief propagation [7]
to contrast and brightness adjustment [3] and affine warping [4]. In initialized according to the known statistics of the legitimate chan-
this paper, we handle simultaneous contrast, brightness and affine nel state at the worst permissible quality for the given original image.
warping adjustment. Our approach lets the authentication decoder Then the image digest of Xq is computed and compared to the im-
learn adjustment parameters using an Expectation Maximization [5] age digest, decrypted from the digital signature D(Xq , Ks ) using
(EM) algorithm that generalizes the solutions in [3, 4]. a public key. If these two image digests are not identical, the re-
Section 2 reviews our image authentication system using dis- ceiver declares image y to be inauthentic. If they match, then Xq
tributed source codes [1]. In Section 3, we formalize the authentica- has been recovered. To confirm the authenticity of y, the receiver
tion problem and introduce our extension for image authentication verifies that the likelihood of the legitimate channel is greater than a
with parameter learning. The EM algorithmic details are given in certain threshold.
Section 4. Simulation results in Section 5 show that the proposed Since this second-pass comparison uses all available infor-
scheme can distinguish between authentic editings of images and il- mation, the threshold for the likelihood specifies how statistically
legitimately modified versions. similar the target image must be to the original to be declared
Original Image Target Image
x y
Two-state Channel
authentic. But the rate of the Slepian-Wolf bitstream S(Xq ) deter- Original
Image
mines whether the quantized image projection Xq is recovered at x Contrast and
Affine Warping JPEG2000/ Target
Brightness Image
all [8]. Accordingly, at the encoder, we select a Slepian-Wolf bit- Adjustment
and Cropping JPEG Si=0
y
rate just sufficient to successfully decode with both legitimate 30 dB
Malicious
JPEG2000 and JPEG reconstructed versions of x. At the decoder, Tampering Si=1
we choose a threshold for the likelihood for the second-pass com-
parison to distinguish between the different joint statistics induced
in the images by the legitimate and tampered channel states. Fig. 3. Space-varying two-state lossy channel with contrast and
brightness adjustment and affine warping .
Original
Image x Target Image
JPEG2000/JPEG Si=0
y
realigned to the original, with channel states Si labeled red if tam-
pered and blue if cropped out in y. The rest are legitimate cropped-in
Malicious
Si=1
states.
Tampering
The image authentication system described in Section 2 cannot
authenticate legitimate images that have undergone contrast, bright-
Fig. 2. Space-varying two-state lossy channel. ness, and affine warping adjustment, because the side information is
neither aligned with the corresponding authentication data nor com-
pensated for the contrast and brightness changes. Approaches sug-
3. CONTRAST, BRIGHTNESS, AND AFFINE WARPING gested in the prior art to overcome this problem involve generating
ADJUSTMENT MODEL affine-invariant features that serve as authentication data [9, 10, 11].
These features are invariant to some transformations, but would be
In this paper, we replace the two-state lossy channel in Fig. 2 with sensitive to others. We instead propose that the authentication de-
the one in Fig. 3. Now both the legitimate and tampered states of coder jointly estimate the adjustment parameters directly from the
the channel are affected by contrast and brightness adjustment and Slepian-Wolf bitstream S(Xq ) and the target image y using an EM
affine warping. In the legitimate state, we model the channel as algorithm.
A and b are transformation and translation parameters in affine The introduction of learning to the system in Fig. 1 requires a mod-
warping, respectively, α and β are contrast and brightness ad- ification of the Slepian-Wolf decoder block from a joint-bitplane
justment parameters, and z is noise introduced by compression LDPC decoder [7] to the parameter-learning Slepian-Wolf decoder
and reconstruction. To keep y the same size as x, it is padded shown in Fig. 5. As before, it takes the Slepian-Wolf bitstream
with black pixels (arbitrarily) and cropped. Fig. 4(a) shows a S(Xq ) and the target image y and yields the reconstructed image
source image “Lena” at 512x512 original resolution. The legiti- projection Xq . But it now also estimates the parameters via an
mate y in Fig. 4(b) is first contrast and brightness adjusted with EM algorithm. The E-step updates the a posteriori probability mass
(α, β) = (1.2, −20), rotated by 5 degrees around the image center, functions (pmf) Papp (Xq ) using the joint bitplane decoder and also
then cropped to 512x512, and finally JPEG2000 compressed and estimates corresponding coordinates for a subset of reliably-decoded
0.996 −0.087 projection pixels. The M-step updates the parameters based on the
reconstructed at 32 dB PSNR. Here, A =
0.087 0.996 corresponding coordinate pmfs, denoted Q(m) in Fig. 5. This loop
and b = [ 23 −21 ]T . The tampered y in Fig. 4(c) addition- of EM iterations terminates when hard decisions on Papp (Xq ) satisfy
ally includes malicious tampering. Fig. 4(d) shows the tampered y the constraints imposed by S(Xq ).
(a) (b) (c) (d)
Fig. 4. Test image “Lena” (a) x original, (b) y in legitimate state, (c) y in tampered state, (d) channel states Si (red: tampered, blue:
cropped-out) associated with the 16x16 blocks of realigned output in (c).
In the E-step, we fix the parameters A, b, α, and β at their cur- Target Image
y
rent hard estimates. The inverse adjustment is applied to the image
y to obtain a compensated image ycomp . If the parameters are accu- Contrast, Brightness,
and Affine Wapring
A,b, D , E Parameter
rate, ycomp would be closely aligned to the original image x in the Adjustment
Estimation
Q(m)
cropped-in region. We derive intrinsic pmfs for the image projec-
tion pixels Xq as follows. In the cropped-in region, we use Gaussian ycomp Corresponding
Coordinate Estimation
Reconstructed
distributions centered at the random projection values of ycomp , and Slepian-Wolf
Bitstream S(Xq) Joint Bitplane Papp(Xq)
Image Projection
Xq'
in the cropped-out region, we use uniform distributions. Then, we Decoder
run three iterations of joint bitplane LDPC decoding on the intrinsic
pmfs with the Slepian-Wolf bitstream S(Xq ) to produce extrinsic
Fig. 5. Slepian-Wolf decoder with parameter learning.
pmfs Papp ([Xq ]i = xq ).
We estimate the corresponding coordinates m(i) for those pro-
jection pixels for which maxxq Papp ([Xq ]i = xq ) > T = 0.995, the lower bound separately over these two sets of parameters. We
denoting this set of reliably-decoded projection indices as C. We model P (n(i) |m(i) , [xmax
q ]i , y; A, b) as a Gaussian distribution,
also denote the maximizing reconstruction value xq to be [xmax q ]i . 1 i.e., (n(i) − Am(i) − b) ∼ N (0, σ 2 I). Taking partial derivatives
(i)
The latent variable update can be written as Qi (m) := P (m = with respect to A and b, and setting to zero, we obtain the optimal
m|[xmax
q ]i , y, n(i) ; A, b, α, β), where n(i) is the set of top-left co- updates:
ordinates of the 16x16 projection blocks Bi in the original image ⎡ ⎤
x, and m(i) represents the corresponding set of coordinates in the ⎡ ⎤ .. ..
A11 A21 ⎢ . . ⎥
overcomplete random projection Y . ⎣ A12 A22 ⎦ := E(GT G)−1 E[GT ] ⎢ (i) ⎥
⎢ n(i) n2 ⎥ ,
In the M-step, we re-estimate the parameters A, b, α, and β by ⎣ 1
⎦
b1 b2 .. ..
holding the corresponding coordinate pmfs Qi (m) fixed and maxi- . .
mizing a lower bound of the log-likelihood function:
where
L(A, b, α, β) ≡ log P ([xmax
q ]i , n(i) , y; A, b, α, β) ⎡ (i) ⎤T
i∈C
··· m1 ···
⎛ ⎞ G = ⎣ ··· (i)
m2 ··· ⎦ .
··· 1 ···
= log ⎝ P ([xmax
q ]i , n(i) , y|m(i) ; A, b, α, β)P (m(i) )⎠
i∈C m(i) Similarly, we model P ([Xqmax ]i |y, m; α, β) as a quantized
Gaussian with mean at (Y (m) − β)/α), and variance σz2 /α2 . Set-
≥ Qi (m) log P ([xmax
q ]i , n(i) , y|m; A, b, α, β)
i∈C m
ting partial derivatives with respect to α and β to zero, we obtain the
optimality updates:
= Qi (m)
i∈C m |C| i∈C μiXY − i∈C j∈C μiX μjY
α :=
log P (n(i) |m, [xmax
q ]i , y; A, b) + log P ([xmax
q ]i , y|m; α, β) . |C| i∈C μiX 2 − i∈C j∈C μiX μjX
1 i
β := μY − αμiX ,
The lower bound is due to Jensen’s inequality and concavity of |C| i∈C
log(.). Note also that P ([xmax
q ]i , y|m(i) ; α, β) does not depend
on the parameters A and b, and P (n(i) |m, [xmax q ]i , y; A, b) does
not depend on the parameters α and β. Thus, we can maximize where μiX = Em∼Qi [E[X|Y (m), [xmax
q ]i ]],
μiY = Em∼Qi [Y (m)],
that C is nonempty, we make sure to encode a small portion
1 To guarantee
of the quantized image projection Xq with degree-1 syndrome bits. The μiX 2 = Em∼Qi [E[X 2 |Y (m), [xmax
q ]i ]],
decoder knows those values with probability 1 and includes their indices in
C. μiXY = Em∼Qi [Y (m)E[X|Y (m), [xmax
q ]i ]].
5. SIMULATION RESULTS −1
10