Retrieval of Bitmap Compression History

Published by ijcsis
The histogram of Discrete Cosine Transform coefficients contains information on the compression parameters for JPEGs and previously JPEG compressed bitmaps. In this paper we extend the work in [1] to identify previously compressed bitmaps and estimate the quantization table that was used for compression, from the peaks of the histogram of DCT coefficients. This can help in establishing bitmap compression history which is particularly useful in applications like image authentication, JPEG artifact removal, and JPEG recompression with less distortion. Furthermore, the estimated table calculates distortion measures to classify the bitmap as genuine or forged. The method shows good average estimation accuracy of around 92.88% against MLE and autocorrelation methods. In addition, because bitmaps do not experience data loss, detecting inconsistencies becomes easier. Detection performance resulted in an average false negative rate of 3.81% and 2.26% for two distortion measures, respectively.


Published by: ijcsis on Dec 04, 2010
Copyright:Attribution Non-commercial

(IJCSIS) International Journal of Computer Science and Information Security,Vol. 8 No. 8, 2010
Retrieval of Bitmap Compression History
Salma Hamdy, Haytham El-Messiry, Mohamed Roushdy, Essam Khalifa
Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
{s.hamdy, hmessiry, mroushdy, esskhalifa}@cis.asu.edu.eg
Abstract: The histogram of Discrete Cosine Transform coefficients contains information on the compression parameters for JPEGs and previously JPEG compressed bitmaps. In this paper we extend the work in [1] to identify previously compressed bitmaps and estimate the quantization table that was used for compression, from the peaks of the histogram of DCT coefficients. This can help in establishing bitmap compression history, which is particularly useful in applications like image authentication, JPEG artifact removal, and JPEG recompression with less distortion. Furthermore, the estimated table is used to calculate distortion measures that classify the bitmap as genuine or forged. The method shows good average estimation accuracy of around 92.88% against MLE and autocorrelation methods. In addition, because bitmaps do not experience data loss, detecting inconsistencies becomes easier. Detection performance resulted in an average false negative rate of 3.81% and 2.26% for the two distortion measures, respectively.
 Keywords: Digital image forensics; forgery detection; compression history; Quantization tables.
I. INTRODUCTION

Although JPEG is the most widely used image format, images are sometimes saved in an uncompressed raster form (bmp, tiff), and in most situations no knowledge of previous processing is available. Some applications are required to receive images as bitmaps with instructions for rendering at a particular size and without further information. The image may have been processed and perhaps compressed, and may contain severe compression artifacts. Hence, it is useful to determine the bitmap history: whether the image has ever been compressed using the JPEG standard, and what quantization tables were used. Most artifact removal algorithms [2-9] require knowledge of the quantization table to estimate the amount of distortion caused by quantization and avoid over-blurring. In other applications, knowing the quantization table can help avoid further distortion when recompressing the image. Some methods try to identify bitmap compression history using Maximum Likelihood Estimation (MLE) [10-11], by modeling the distribution of quantized DCT coefficients, such as the use of Benford's law [12], or by modeling acquisition devices [13].

Furthermore, due to the nature of digital media and advanced digital image processing techniques, digital images may be altered and redistributed very easily, forming a rising threat in the public domain. Hence, ensuring that media content is credible and has not been altered is becoming an important issue for governmental security and commercial applications. As a result, research is being conducted to develop authentication methods and tamper detection techniques. JPEG compression usually introduces blocking artifacts, and hence one of the standard passive approaches is to use inconsistencies in these blocking fingerprints as a reliable indicator of possible tampering [14]. These can also be used to determine what method of forgery was used.

In this paper we are interested in the authenticity of the image. We extend the work in [1] to bitmaps and use the proposed method to identify previously compressed bitmaps and estimate the quantization table that was used. The estimated table is then used to determine whether the image was forged by calculating distortion measures.

In section 2 we study the histogram of DCT AC coefficients of bitmaps and show how it differs for previously JPEG compressed bitmaps. We then validate that, without modeling rounding errors or calculating prior probabilities, quantization steps of previously compressed bitmaps can still be determined directly from the peaks of the approximated histograms of DCT coefficients. Results are discussed in section 3. Section 4 is for conclusions.

II. HISTOGRAM OF DCT COEFFICIENTS IN BITMAPS
In [1] we studied the histogram of quantized DCT coefficients and showed how it can be used to estimate quantization steps. Here, we study uncompressed images and validate that the approximated histogram of DCT coefficients can be used to determine compression history. A bitmap image involves no data loss, and hence all that is required to build an informative histogram is expected to be present in the coefficient histograms.

The first step is to decide whether the test image was previously compressed, because if the image is an original uncompressed one there is no compression data to extract. When the image is found to have a compression history, the next step is to estimate that history. For a grayscale image, compression history mainly means its quantization table, which is the focus of this paper. For a color image, this extends to estimating the color plane compression parameters, which include subsampling and the associated interpolation.
http://sites.google.com/site/ijcsis/ ISSN 1947-5500
 
Fig. 1(b) shows the approximated histogram H* of the DCT coefficient at position (3,3) of the luminance channel of an uncompressed Lena image, and Fig. 1(c) shows the histogram of the same image after being JPEG compressed with quality factor 80. It is clear that the latter contains periodic patterns that are not present in the uncompressed version. It was observed that the coefficient is very likely to have been quantized with a step equal to this period [15]. Now if that JPEG was stored in an uncompressed bitmap form, we expect the DCT coefficients to show the same behavior, because nothing is lost during this format change. This is evident in Fig. 1(d), which shows a histogram identical to the one in Fig. 1(c). Hence, similar to the argument in [1], if we closely observe the histogram H*(i,j) outside the main lobe, we notice that the maximum peak occurs at a value equal to the quantization step used to quantize X_q(i,j). This observation applies to most low frequency AC coefficients. Fig. 2(a) and (b) show |H|, the absolute histograms of DCT coefficients for Lena of Fig. 1(a) at frequencies (3,3) and (3,4), respectively. As for high frequencies, the maximum occurred at a value matching Q(i,j) when |X*(i,j)| > B (Fig. 2(c) and (d)), where B is as follows:
$$\Gamma = \left| X^{*}(i,j) - X_q(i,j) \right| \le B(i,j) = \sum_{u,v} 0.5\, c(u)\, c(v) \left| \cos\frac{(2u+1)\, i\pi}{16} \cos\frac{(2v+1)\, j\pi}{16} \right| \tag{1}$$

where X_q(i,j) is the quantized coefficient, X*(i,j) is the approximated quantized coefficient, Γ is the round-off error, and

$$c(\xi) = \begin{cases} 1/\sqrt{2} & \text{for } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$

See [1, 11].

Sometimes we do not have enough information to determine Q(i,j) for high frequencies (i,j). This happens when the histogram outside the main lobe decays rapidly to zero, showing no periodic structure. This reflects the small or zero value of the coefficient. In such cases, it can be useful to estimate as many of the low frequencies as possible and then search through lookup tables for a matching standard table.

Estimating the quantization table of a bitmap can help determine part of its compression history. If all (or most) of the low frequency steps are estimated to be ones, we can conclude that the image did not go through previous compression. High frequencies may bias the decision because they have very low contribution and do not provide a good estimate. Moreover, this method also works well for uncompressed or losslessly compressed tiff images. Fig. 3(d) shows the 96.7% correctly estimated Q table, obtained with the above method, of a tiff image taken from UCID [16]. The X's mark the "undetermined" coefficients.
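As a rough illustration of the peak rule described above, the following Python sketch estimates one quantization step from the approximated coefficients X*(i,j) collected over all 8×8 blocks, and then applies the "mostly ones" test for images that were never compressed. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `bound` argument (a stand-in for B(i,j) or for the half-width of the main lobe), the `min_count` cutoff, and the 0.8 fraction are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence


def estimate_step(coeffs: Iterable[float], bound: float = 1.0,
                  min_count: int = 1) -> Optional[int]:
    """Return the location of the highest histogram peak with |X*| > bound.

    coeffs: approximated DCT coefficients X*(i,j), one value per 8x8 block.
    bound:  assumed stand-in for B(i,j), i.e. the main lobe to skip.
    Returns None when nothing usable is left outside the main lobe.
    """
    # Absolute histogram |H| of the rounded coefficients.
    hist = Counter(abs(round(c)) for c in coeffs)
    outside = {v: n for v, n in hist.items() if v > bound}
    if not outside or max(outside.values()) < min_count:
        return None  # "undetermined" entry
    # Highest peak; ties go to the value closest to the main lobe.
    return min(outside, key=lambda v: (-outside[v], v))


def likely_never_compressed(low_freq_steps: Sequence[Optional[int]],
                            fraction: float = 0.8) -> bool:
    """True when most determined low-frequency steps are equal to one."""
    known = [s for s in low_freq_steps if s is not None]
    if not known:
        return False
    return sum(s == 1 for s in known) / len(known) >= fraction
```

For coefficients that cluster around multiples of 6, `estimate_step(values, bound=1.0)` would be expected to return 6, while a coefficient that is zero in every block yields `None`, which corresponds to an X entry in the estimated table.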
Now, to verify the authenticity of the image, we use the same distortion measures we used in [1]. The average distortion measure is calculated as a function of the remainders of the DCT coefficients with respect to the original Q matrix:

$$B_1 = \sum_{i=1}^{8} \sum_{j=1}^{8} \operatorname{mod}\big( D(i,j),\, Q(i,j) \big) \tag{2}$$

where D(i,j) and Q(i,j) are the DCT coefficient and the corresponding quantization table entry at position (i,j), respectively. An image block with a large average distortion value is very different from what it should be and is likely to belong to a forged image. Averaged over the entire image, this measure can be used to decide on the authenticity of the image.

In addition, the JPEG 8×8 "blocking effect" is still present to some extent in the uncompressed version, and hence the blocking artifact measure, BAM [14], can be used to estimate the distortion of the image. It is computed from the Q table as:

$$B_2(n) = \sum_{i=1}^{8} \sum_{j=1}^{8} \left| D(i,j) - Q(i,j)\, \operatorname{round}\!\left( \frac{D(i,j)}{Q(i,j)} \right) \right| \tag{3}$$

where $B_2(n)$ is the estimated blocking artifact for the n-th block.
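The two measures can be evaluated per block in a few lines. The sketch below is illustrative only and assumes each block is given as an 8×8 nested list of DCT coefficients D together with the estimated table Q; the function names and the `image_average` helper are hypothetical and do not come from the paper.

```python
from typing import Sequence

Block = Sequence[Sequence[float]]


def remainder_distortion(D: Block, Q: Block) -> float:
    """Equation (2): sum over the block of mod(D(i,j), Q(i,j))."""
    # Python's % takes the sign of the divisor, so results lie in [0, Q).
    return sum(d % q
               for row_d, row_q in zip(D, Q)
               for d, q in zip(row_d, row_q))


def blocking_artifact(D: Block, Q: Block) -> float:
    """Equation (3): distance of each coefficient to the nearest multiple of Q."""
    # Note: Python's round() sends exact halves to the even integer.
    return sum(abs(d - q * round(d / q))
               for row_d, row_q in zip(D, Q)
               for d, q in zip(row_d, row_q))


def image_average(per_block: Sequence[float]) -> float:
    """Average a per-block measure over the entire image."""
    return sum(per_block) / len(per_block)
```

The two measures weigh the same deviation differently: with a step of 6, a coefficient of 5 contributes 5 to the remainder measure but only 1 to the blocking artifact measure, since it lies one unit below the nearest multiple.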
Fig. 1. Histograms of X*(3,3). (a) Lena image. (b) Uncompressed. (c) JPEG compressed, Q(3,3) = 6. (d) Previously compressed bmp.

Fig. 2. (a) |X*(3,3)|, where H_max occurs at Q(3,3) = 6. (b) |X*(3,4)|, where H_max occurs at Q(3,4) = 10. (c) |X*(5,4)|, where H_max occurs at Q(5,4) = 22. (d) |X*(7,5)|, where H_max occurs at Q(7,5) = 41.
III. EXPERIMENTAL RESULTS AND DISCUSSION

A. Estimation Accuracy
Our testing image set consisted of 550 images collected from different sources (more than five camera models), in addition to some from the public domain Uncompressed Color Image Database (UCID), which provides a benchmark for image processing analysis [16]. Each of these images was compressed with four quality factors (60, 70, 80, and 90). Each compressed image was then decompressed and resaved as a bitmap. This yielded 550×4 = 2,200 untouched images. For each quality factor group, an image's histogram of DCT coefficients at one frequency was generated and used to determine the corresponding quantization step at that frequency according to section 2. This was repeated for all 64 histograms of DCT coefficients. The resulting quantization table was compared to the quality factor's known table, and the percentage of correctly estimated coefficients was recorded. Also, the estimated table was used in equations (2) and (3) to determine the image's average distortion and blocking artifact measures, respectively. These values were recorded and used later to set a threshold value for distinguishing forgeries from untouched images.
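The comparison step can be made concrete with the example of Fig. 3. The paper does not say how the "known table" for a quality factor is produced, so the sketch below assumes the common IJG (libjpeg) convention of scaling the example luminance table from Annex K of the JPEG standard; `scaled_table`, `accuracy`, and the `skip_undetermined` flag are hypothetical names, and `EST_QF80` is transcribed from Fig. 3(c) with `None` standing for the X entries.

```python
from typing import List, Optional, Sequence

# Example luminance quantization table from Annex K of the JPEG standard.
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scaled_table(qf: int) -> List[List[int]]:
    """Scale BASE_LUMA by quality factor (assumed IJG convention)."""
    scale = 5000 // qf if qf < 50 else 200 - 2 * qf
    return [[min(255, max(1, (b * scale + 50) // 100)) for b in row]
            for row in BASE_LUMA]


def accuracy(est: Sequence[Sequence[Optional[int]]],
             ref: Sequence[Sequence[int]],
             skip_undetermined: bool = True) -> float:
    """Percentage of entries in est that match ref."""
    correct = total = 0
    for row_e, row_r in zip(est, ref):
        for e, r in zip(row_e, row_r):
            if e is None and skip_undetermined:
                continue  # leave undetermined entries out of the score
            total += 1
            correct += (e == r)
    return 100.0 * correct / total


X = None  # undetermined entry
EST_QF80 = [  # estimated table of Fig. 3(c)
    [3, 4, 4, 6, 10, 16, 20, 24],
    [5, 5, 6, 8, 10, 23, 24, 22],
    [6, 5, 6, 10, 16, 23, 28, 22],
    [6, 7, 9, 12, 20, 35, 32, 25],
    [7, 9, 15, 22, 27, 44, 41, 31],
    [10, 14, 22, 26, 32, 42, 45, 37],
    [20, 26, 31, 35, 41, 48, 47, X],
    [29, 37, 38, 39, 45, 40, X, X],
]

print(round(accuracy(EST_QF80, scaled_table(80)), 1))  # 96.7
```

Under these assumptions, scoring only the determined entries gives 59 matches out of 61, which agrees with the 96.7% quoted for Fig. 3(d). Counting the three undetermined entries as errors (`skip_undetermined=False`) gives 59 out of 64 instead.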
Table 1 shows the accuracy of estimating all 64 entries using the proposed method for each quality factor, averaged over the whole set. It exhibits a behavior similar to that of JPEG images: as the quality factor increases, estimation accuracy increases steadily, with an expected drop for quality factors higher than 90 as the periodic structure becomes less prominent and the bumps are no longer separate enough. Overall, the estimation accuracy is higher than that of JPEG images [1]. We anticipated this because lossy compression tends to reduce the data available for making a good estimate. Average estimation time for all 64 entries, for images of size 640×480 and different QFs, was 52.7 seconds.

Estimating Q using MLE methods [10-11] is based on searching over all possible Q(i,j) for each DCT coefficient over the whole image, which can be computationally exhaustive for large files. Another method [12] proposed a logarithmic law and argued that the distribution of the first digit of DCT coefficients follows that generalized Benford's law. The method is based on re-compressing the test image with several quality factors and fitting the distribution of DCT coefficients of each version to the proposed law. The QF of the version having the least fitting artifact is chosen, and its corresponding Q table is the desired one. Of course, the above methods can only estimate standard compression tables. Although this approach may be accurate, it is time consuming. It also fails when the re-compression quantization step is an integer multiple of the original compression step size. Another method [17] calculates the autocorrelation function of the histogram of DCT coefficients. The displacement corresponding to the peak closest to the peak at zero is the value of Q(i,j), given that the peak is higher than the mean value of the autocorrelation function. The method eventually uses a hybrid approach: the low frequency coefficients are determined directly from the autocorrelation function, while the higher-frequency ones are estimated by matching the estimated part to standard JPEG tables scaled by a factor of s, which is determined from the known coefficients.
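For comparison, the autocorrelation rule attributed to [17] can be sketched as follows. This is only an illustration of the rule as summarized above, not the implementation of [17]: the function name, the `max_lag` limit, the local-peak test, and the choice to average over lags 0 to `max_lag` are assumptions.

```python
from collections import Counter
from typing import Iterable, Optional


def autocorr_step(coeffs: Iterable[float], max_lag: int = 64) -> Optional[int]:
    """Estimate Q(i,j) as the first autocorrelation peak after lag zero."""
    hist = Counter(round(c) for c in coeffs)  # signed histogram h(x)
    # r[k] = sum over x of h(x) * h(x + k), for lags 0..max_lag
    r = [sum(n * hist.get(x + k, 0) for x, n in hist.items())
         for k in range(max_lag + 1)]
    mean_r = sum(r) / len(r)
    for k in range(1, max_lag):
        is_peak = r[k] > r[k - 1] and r[k] >= r[k + 1]
        if is_peak and r[k] > mean_r:
            return k  # peak closest to the one at lag zero
    return None  # no periodic structure found
```

When the histogram is populated only at multiples of the step, the autocorrelation is zero at every other lag, so the first qualifying peak falls at the step itself. A histogram that decays monotonically produces no peak and the function returns `None`.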
Table 2 shows the estimation accuracy, while Table 3 shows the estimation time, for the different mentioned methods against ours. Note that accuracy was calculated for directly estimating only the first nine AC coefficients without matching. This is because the methods fail to estimate high frequency coefficients, as most of them are quantized to zero. On the other hand, the listed time is for estimating the nine coefficients and then retrieving the whole matching table from JPEG standard lookup tables. Maximum peak is faster than
(a) Test image.

(b) Estimated Q for uncompressed version (most low frequencies are ones):

 5  4  3  2  1  1  1  1
 4  1  1  1  1 10 10 10
 1  1  1  1  1 10 10 10
 1  1  1  1  1 10 10 10
 1  1  1  1 14 12 12 12
 1  1  1  1 12 13 11 11
 1  1  1  1 13 11 12 11
 1  1  1  1 13 12 12 12

(c) Estimated Q for previously compressed version with QF = 80:

 3  4  4  6 10 16 20 24
 5  5  6  8 10 23 24 22
 6  5  6 10 16 23 28 22
 6  7  9 12 20 35 32 25
 7  9 15 22 27 44 41 31
10 14 22 26 32 42 45 37
20 26 31 35 41 48 47  X
29 37 38 39 45 40  X  X

(d) Difference between (c) and original table for QF = 80:

 3  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  1  X
 0  0  0  0  0  0  X  X

Fig. 3. Estimating Q table for original and previously compressed tif image.

TABLE I. PERCENTAGE OF CORRECTLY ESTIMATED COEFFICIENTS FOR SEVERAL QFS

QF          60       70       80       90
BMP         82.07%   84.80%   87.44%   89.44%
JPEG [1]    72.03%   76.99%   82.36%   88.26%