Article history: Received 21 December 2012; Accepted 23 August 2013; Available online 4 September 2013

Keywords: Digital forensics; Feature fusion; Photorealism; Classifier combination; Image descriptors; Synthetic images; Voting method; Feature extraction

Abstract

The development of powerful and low-cost hardware devices, allied with great advances in content editing and authoring tools, has promoted the creation of computer generated images (CG) with a degree of unrivaled realism. Differentiating a photo-realistic computer generated image from a real photograph (PG) can be a difficult task to the naked eye. Digital forensics techniques can play a significant role in this task and, as a matter of fact, important research has been carried out by the scientific community in this regard. Most of the approaches focus on single image features aiming at detecting differences between real and computer generated images. However, with the current technology advances, there is no universal image characterization technique that completely solves this problem. In our work, we (1) present a complete study of several CG versus PG approaches; (2) create a large and heterogeneous dataset to be used as a training and validation database; (3) implement representative methods of the literature; and (4) devise automatic ways of combining the best approaches. We compared the implemented methods within the same validation environment, showing their pros and cons under a common benchmark protocol. We collected approximately 4850 photographs and 4850 CGs with large diversity of image content and quality, and implemented a total of 13 methods. Results show that this set of methods can achieve up to 93% accuracy when used without any form of machine learning fusion. The same methods, when combined through the implemented fusion schemes, can achieve an accuracy rate of 97%, representing a reduction of 57% in the classification error over the best individual result.
started to acquire normal child pornography photographs and alter them digitally in such a way that they seem to be computer generated. This is a scenario in which digital forensics can be very helpful to distinguish with confidence between CG and PG. Another typical scenario in which computer generated images can play a damaging role is image doctoring for political propaganda. In all cases, the validation of the authenticity of the images is therefore a major challenge in forensics [6].

In this problem, we consider a photograph to be any image originated from an acquisition device (e.g., camera, scanner) capturing a scene. In turn, a synthetic image is any scene partially or totally rendered by computer software. This is a standard definition in the literature. Some authors, such as [7] (c.f., Section 2.2.4), have also considered the actual content of the scene photographed/represented. However, in forensics, we can classify such situations as a recapture attack. Recapture attacks might be approached by using liveness detection, for instance, such as the work we have developed in [8,9]. One aspect that needs to be clear is that, under the CPPA terms, we need to identify the situation in which one captures a photograph involving child pornography and then alters it in the computer in such a way that it resembles a computer-generated one. The purpose of the content creator is to prevent someone from thinking the altered content is a photograph.

Fig. 1. (a) A computer generated image (CG) and (b) a photograph (PG).

For differentiating PGs and CGs, if each photograph had a reliable mark that indicated its source camera, then the problem would be solved [10]. However, it would require the watermark to be inserted in all existing photographs, clearly an impossibility for existing images captured with older devices. Non-intrusive approaches, therefore, constitute the most suitable option to solve the problem.

Researchers have approached this problem in various forms. The human visual system is quite complex and uses many visual features to classify a scene. Using it as inspiration, we can try to identify the visual features that distinguish between PG and CG and use them in our approach. Edges, colors and shapes are examples of visual characteristics that could be used [11]. To improve the current results, a first approach would be the use of new and relevant features. A second approach would be a novel way of using the existing methods, such as an ensemble of them.

Prior work reported classification accuracies higher than 95% [12,13] on this problem; however, Gloe et al. [14] created scenarios in which the accuracy of some of these methods was significantly lower than that reported in the papers. In a real scenario containing complex scenes, we believe that the methods would present lower accuracy rates. In a scope in which the objects of study can serve as criminal evidence, we need robust and scenario-invariant techniques. Thus, in the task of discerning between PG and CG, our contributions in this paper are:

- collection of a complex and diverse dataset of CG versus PG which will be freely available for benchmarking algorithms in the field;

We organize the remainder of this paper into five sections. In Section 2, we analyze the literature methods which address the problem of PG versus CG and, additionally, expose image descriptors that may be used in the formulation of new methods focused on the problem of this paper. In Section 3, we present our methodology and the implementation of each of them. In Section 4, we describe and analyze the results of each approach. Finally, in Section 5, we present some concluding remarks and suggestions for future work.

2. Related work

High quality computer graphics images emerged with the evolution of the field of Computer Graphics. Methods for distinguishing PGs and CGs date from the mid-2000s [7] and often use concepts from related areas [7,15-17].

Most of the existing proposals for distinguishing CGs and PGs in the literature contemplate two steps:

- identification and extraction of features that reveal the differences between the two classes (CG vs. PG);
- classification of images based on the set of obtained characteristics (features).

In addition to the concepts already explored in the area of identification of CG, many descriptors from related areas in image processing are promising for use in the problem but were not explored in previous works in the forensic literature.

The main difference among the various methods in the literature lies in the choice of the characteristics used to describe an image (the descriptor). The effectiveness of this process is fundamental to the good accuracy of a method. Here, we analyze existing descriptors regarding their applicability and effectiveness in the problem of CGs vs. PGs.

Areas such as Content-Based Image Retrieval (CBIR) have extensive research on feature extraction and image characterization [18] and thus present potential sources for our work. The characterization of an image can be based on several criteria: color histograms [19], texture [20,21], shape [22], edges [23], meshes (patches) [23], surface [24], among others.

We then describe the foundations extensively used in the area. In the following sections, we review the relevant work available in the literature and describe concepts explored in other areas that we apply to our set of solutions. Relevant works for combining descriptors and classifiers, which we also explore in this work, are directly explained in Section 3.

2.1. Fundamentals

A pattern recognition tool, or simply classifier, aims at establishing a classification criterion based on a set of reference data. Unlike pattern matching, the classifier does not seek to identify equality, but statistical similarities in the data. While the human eye is effective in certain aspects [4], classifiers play an important role in automatic data classification.
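To make this classification stage concrete, the following minimal sketch (our illustration, not code from the paper) trains and evaluates a binary CG-vs.-PG classifier from already-extracted feature vectors, using an SVM (one of the two classifiers discussed next); the variable names and the 50/50 split are assumptions.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# X: (n_images, m) matrix of descriptor values; y: labels (0 = CG, 1 = PG).
# Both are hypothetical arrays that any of the descriptors discussed below could produce.
def train_and_evaluate(X, y, seed=0):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=seed)
    clf = SVC(kernel="rbf")          # learn a decision surface from the reference data
    clf.fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))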
1278 E. Tokuda et al. / J. Vis. Commun. Image R. 24 (2013) 1276–1292
Two pattern recognition methods widely used in studies in the area of identification of CGs are the Linear Discriminant Analysis (LDA) and the Support Vector Machine (SVM) [25]. LDA creates a linear classification surface and is relatively simple to implement. SVM uses n-dimensional classification surfaces and its implementation is more complex, although normally more effective.

2.2. State-of-the-art CG vs. PG methods

In the following sections, we describe and discuss relevant methods found in the literature related to CG versus PG discrimination.

2.2.3. Wang and Moulin [13]

Wang and Moulin [13] explored the image differences in the frequency domain and proposed a model based on histograms of wavelet coefficients of the image. For performance improvement, band-pass filters were applied to the standard histogram in order to characterize it with a reduced amount of information. In the validation, the authors used an LDA classifier and a database of personal images. The authors reported an accuracy of around 100% for both types of images at a rate of 0.1% false negatives. According to the analysis in [28], this method is more precise than the approach described in [15] and obtained a significant reduction in the computation time.
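These wavelet-domain methods, and the wavelet-based approach of [15] mentioned above, build on statistics of wavelet subband coefficients. As a rough illustration of that kind of feature (not the exact feature set of [13] or [15]), the sketch below computes the first four moments of each detail subband with PyWavelets; the filter choice and the number of levels are assumptions.

import numpy as np
import pywt
from scipy.stats import skew, kurtosis

def wavelet_statistics(gray, wavelet="db4", levels=3):
    """First four moments of every detail subband of a grayscale image."""
    coeffs = pywt.wavedec2(gray.astype(float), wavelet, level=levels)
    feats = []
    for detail in coeffs[1:]:              # (horizontal, vertical, diagonal) per level
        for band in detail:
            v = band.ravel()
            feats += [v.mean(), v.var(), skew(v), kurtosis(v)]
    return np.asarray(feats)               # 4 statistics x 3 orientations x levels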
count the dependency between wavelet coefficients of adjacent pixels present in the images.

The average noise of all the images in the same class gives rise to the standard reference noise of that class. The classification is done by the value of the correlation between the residual image and the standard reference. The photographs were acquired from several camera models. The CGs were obtained from the Internet and were divided into two groups: images generated by Maya [2] and by 3DS Max [3]. The results indicate that PGs have low correlation with the standard references of both packages. The average performance of the method was 72% [28].

2.2.7. Dirik et al. [12]

Dirik et al. [12] proposed a method based on the color interpolation process present in digital cameras. Even if further processing is applied, the identification of the algorithm applied in the demosaicing step is still possible. The authors proposed that, to classify an image as PG, it is sufficient to identify a color interpolation process.

The method is based on the hypothesis that if a PG, interpolated by a Bayer filter [33], is re-interpolated by the same filter, it will suffer significantly fewer changes than if it were re-interpolated by another filter.

Dirik et al. [12] also explored the interference of the lenses in the digital photograph acquisition process. The chromatic aberration is related to the difference between the refractive indices for different wavelengths of light incident on the acquisition lens, which causes a misalignment between the color channels of the image. An alignment of the color channels means that there is dependence between them. An image with high mutual information between channels preserves the alignment of the color channels and, therefore, can be identified as having no traces of chromatic aberration.

A total of 1200 x 3 = 3600 images were gathered. Half of the data set was used to train an SVM classifier, and the other half was used to test it. The reported results are 98.1%, 89.3% and 99.6% using color interpolation characteristics, chromatic aberration and wavelet coefficients, respectively. Results considering the fusion of demosaicing and wavelet features reached an accuracy of 99.9%. Further experiments with JPEG compression showed a classification accuracy of 90%, validating the hypothesis that the chromatic aberration features are relevant, even for images with high compression ratios.

2.2.8. Gallagher and Chen [34]

Gallagher and Chen [34] also used traces of Bayer color interpolation to distinguish digital images from computer generated images. The algorithm makes no assumption about the identity or even about the linearity of the demosaicing algorithm. The only assumption is that all the interpolated pixels present a different set of variances from the original set of pixels.

To verify this hypothesis, the method seeks a periodicity in the variances of the diagonals of the image. Initially, a high-pass filter is applied to enhance the periodicity, if present. In the case of an image that suffered interpolation, the variances are expected to be periodic among different diagonals and constant along them. Subsequently, the variance of each diagonal is estimated by the maximum likelihood method. At the end, there is a vector with the variances for each diagonal, whose signal is analyzed through a Fourier transform.

A set of 2400 images was used, of which 1600 were PGs and 800 were CGs. The approach achieved 98.4% accuracy. Applying post-JPEG compression, the method had a result of approximately 82% classification accuracy. According to the authors, it was expected that the method would need large images (larger diagonals); however, even with small images, the method achieved a classification accuracy of 66%. One restriction of the method is that the images must be obtained directly from the source, i.e., not resulting from a post-acquisition processing such as rescaling, since this could destroy the characteristics associated with the color interpolation.

2.2.9. Li et al. [16]

Li et al. [16] proposed a method which uses the key ideas and methods of [15,17]. In their implementation, however, the authors did not use the wavelet decomposition and worked in the HSV color space. The authors consider a two-scale analysis of the image: the original and a reduced one. In each of the scales, an approximation of the second order derivative is computed. This differentiation is performed in four different directions. Variance and kurtosis statistics are calculated on these results. Although this differentiation assesses the approximate correlation in a given direction, it does not capture the correlation between different directions. To do so, they use a linear predictor, similar to that proposed in [15]. From each predicted value, the logarithm of the error is computed.

The authors assume that if PGs follow the proposed linear model, then the distribution of these errors serves as a criterion for the categorization of images. The first four statistical moments are used to characterize this distribution. For each scale, RGB color channel and orientation, two statistics from the differentiation matrix and four statistics from the distribution of errors are obtained. The method provides a total of 144 characteristics.

2.2.10. Peng et al. [35]

Peng et al. [35] proposed a method for the identification of natural images and computer generated graphics based on hybrid features. Initially, statistical features are extracted in the spatial and wavelet domains, such as the mean, variance, kurtosis, skewness, and median of the histograms of grayscale images. Then, fractal dimensions of the grayscale images and wavelet sub-bands are extracted as visual features. Finally, a Gaussian filter pre-processing is applied to the images prior to the computation of the photo response non-uniformity noise (PRNU) through a wavelet-based denoising filter, and physical features are calculated from the enhanced PRNU.

An SVM classifier is used in the identification process, achieving a classification accuracy of 97.3% for computer generated graphics and 91.28% for natural images.

2.2.11. Fan et al. [36]

Fan et al. [36] investigated factors that cause people to perceive images as either real or computer-generated. A set of photographs and computer-generated images was shown to computer-graphics experts and laypeople to judge their types; the images were categorized as original, modified to show only intrinsic reflectance components, and modified to show only intrinsic shading components. The experimental results demonstrated that visual realism depends not only on image properties, but also on the viewer's cognitive characteristics. Color and shading played important roles in visual realism. Although experts were able to outperform laypeople in the identification process, their advantage was limited to grayscale images.

2.2.12. Nguyen et al. [37]

The work described by Nguyen et al. [37] addressed the problem of discriminating between computer generated and photographic human faces. The method is based on an estimation of face asymmetry; however, the approach only works with frontal faces and it is very sensitive to the shape and illumination normalization stage. Face asymmetry was used as the main information in the identification process.

2.2.13. Isenberg [38]

Isenberg [38] analyzed several evaluation methods for non-photorealistic and illustrative rendering (referred to in the work
as NPR, non-photorealistic rendering), including qualitative and quantitative techniques. Some issues addressed in the work include the aspects that make an NPR technique able to successfully replicate a traditional technique for creating photorealistic content. The authors present a discussion on the implications of the use of non-photorealistic techniques, and on people's opinion about different non-photorealistic techniques as compared to traditional visual representations. The objective of the work is to evaluate how good some techniques are in generating digital content.

2.2.14. Summary

All the aforementioned methods that addressed the problem of CG vs. PG have been studied individually. Some studies have tested a simple union of a method with two or three other methods [12]. A robust comparison of methods for distinguishing between CG and PG, in which the execution environment and the training/testing data are the same, is not yet available to our knowledge, and this is a subject we explore in this paper.

2.3. Methods from related fields

There are several works which were not directly proposed for the problem of distinguishing PGs and CGs but that can be extended to deal with such a problem. In this sense, we analyzed some of them to be used as our basis.

2.3.1. Fractality [39]

An important concept in the area of image segmentation is fractality, proposed by Mandelbrot [40]. The fractal dimension of an image can be calculated by the box counting technique [39]. We cover a limited set E of R^n with disjoint boxes of side e. Let N_e(E) be the number of boxes. Suppose that E contains an infinite number of points, such as a curve or a surface, and that N_e(E) tends to infinity when e tends to 0. The rate of this growth as a function of the box size characterizes the dimension D.

Some authors state that the box counting dimension is only defined when the upper and lower boundaries coincide. The definition of the authors does not require the use of boxes as a measuring element. An alternative definition uses circles of radius e instead of boxes.

2.3.2. X-lets [41-43]

In the field of image processing, there is a series of transformations applicable to a set of data. Two-dimensional transforms can be analyzed from several aspects such as: multi-resolution (decomposition into multiple resolutions), location (in space and frequency), redundancy (whether the coefficients carry redundant information), directionality (vertical, horizontal and diagonal directions alone cannot effectively capture all the details of the image), and anisotropy (the windows must have different formats to capture every nuance of an image).

The wavelet transform lacks the last two characteristics, which are also responsible for the distinction between curvelets and contourlets. The curvelet transforms [43] were initially developed in the continuous domain. Multi-scale filters are applied, followed by ridgelets [44] processed in each subband. Adjustments were proposed for the discrete domain. A new implementation of curvelets was subsequently proposed by the authors without using the ridgelet transform. Shortly after the introduction of curvelets, some researchers developed numerical algorithms for their implementation and reported a series of practical successes [43]. These implementations are based on the original construction, which uses a pre-processing step involving a partitioning in phase and space followed by a ridgelet transform applied to data blocks.

In [43], the authors of the curvelet transform redesigned it to simplify its implementation. Two methods have been proposed: via an unequally spaced Fast Fourier Transform and via Fourier wrapping. The methods differ mainly in the choice of the spatial grid used at each scale and angle. Both return a table of curvelet coefficients indexed by a scaling parameter, an orientation parameter and a spatial location. For an array of size n x n, both implementations have complexity O(n^2 log n). The transforms are invertible, with inversion algorithms of similar complexity.

Complementarily, the contourlet transform [42] is also a multi-resolution and directional transform. Unlike curvelets, it was proposed directly in the discrete domain. The decomposition by a pyramidal filter bank is obtained by combining the Laplacian pyramid and a directional filter bank.

Finally, the shearlet transform [45] is newer and is similar in construction to the curvelets. Its mathematical basis is solid and, given its construction, it also enables multi-resolution analyses of images. The shearlet transform uses a continuous dilation with two parameters. This dilation is the product of a scaling matrix by a shear matrix. Therefore, the shearlet coefficients depend on scaling, shear and translation parameters.

2.3.3. Histogram of Shearlet Coefficients [46]

The Histogram of Shearlet Coefficients (HSC) [46] is an approach that takes advantage of the directionality of the shearlet transform to determine the distribution of the edges of the image.

First, a multi-scale shearlet decomposition is applied in order to capture the image information in different orientations and scales. Then, for each scale, a histogram with the same number of intervals as the number of orientations is computed. The value of each histogram interval is obtained by summing the absolute values of the shearlet coefficients. Finally, the histograms of all levels are concatenated and normalized.

In their work, the authors report promising results for face identification and texture classification.

2.3.4. Cooccurrence Matrix [47]

The Gray Level Cooccurrence Matrix (GLCM) [47] is a texture descriptor widely used in image analysis. Unlike measures calculated directly from the values of the original image (first-order moments), the GLCM regards the relationship of groups of pixels.

Descriptors based on cooccurrence matrices are obtained in two stages. First, directional operators (which define how the image should be traversed) are computed from the image. If each pixel of the image can take n values for each color channel, then the resulting matrix will have size n x n. Thus, for images that assume discrete values between 1 and 256, the related matrices have size 256 x 256 = 65,536 entries.

In the second step, statistics derived from the matrices are used as descriptors. For instance, Haralick et al. [47] proposed 14 separate measures. Clausi [48] analyzed the correlation between the texture measures proposed by Haralick et al. and concluded that, among the reported measures, there are at most five uncorrelated ones: contrast, dissimilarity, the inverse difference moment, and the normalized versions of the inverse difference and of the inverse difference moment.

2.3.5. Histogram of Oriented Gradients [49]

The Histogram of Oriented Gradients (HOG) counts the occurrences of oriented gradients over regions of an image [49]. HOG differs from other techniques in that it is calculated on a grid of uniformly spaced cells and uses normalization over overlapping regions.

In the HOG method, the first step is gamma correction. A gradient detector is applied and the direction of the gradient is determined for each pixel. For each region of fixed size (cell), the frequency of occurrences of each gradient direction is calculated and a histogram of gradient orientations is created. Cells may have an arbitrary form. The authors implemented the method
with cells in circular and rectangular formats. Each cell has its direction weighted by the intensity gradient. The cells are grouped into larger regions (blocks), a step in which the histograms are concatenated and normalized. The resulting feature vector is the concatenation of the block histograms.

2.3.6. Local Binary Patterns [50]

The human visual system is able to interpret nearly achromatic scenes, such as at low light levels. Color acts only as a suggestion for richer interpretations. Even when the color information is distorted, for example due to color blindness, the visual system still operates satisfactorily [51]. Intuitively, this suggests that, at least for our visual system, contrast and texture are distinct phenomena. However, the joint use of texture and contrast is popular in image analysis.

Local Binary Patterns (LBP) have been introduced as a supplementary measure to the contrast of an image [50]. The value is obtained by summing the values thresholded by the central pixel, weighted by their position relative to the central pixel. The algorithm originally employed a radius equal to 1 and an 8-connected neighborhood.

The major limitation of this approach is that it supports only small and fixed areas around the image. Patterns captured in an 8-connected neighborhood, for example, cannot capture structural details at larger scales. Moreover, the operator is not robust to subtle changes, such as in illumination direction [50]. Subsequently, the authors of [52] proposed multi-scale binary pattern operators. Their implementation is simple and consists of applying the operator in multiple neighborhoods.

2.3.7. Demosaicing [53]

Popescu and Farid [53] proposed a way to detect demosaicing and distinguish different types of interpolation employed by different models of cameras. The authors used this method to identify manipulations in an image.

Suppose an image has been interpolated by a specific interpolation algorithm. The pixels of one color channel will present a correlation pattern that differs from that of a tampered region. Thus, sample blocks can be used to identify regions that have suffered changes.

For simplicity, the authors assume that the interpolation algorithm, albeit unknown, is linear. The authors use an Expectation/Maximization (EM) algorithm for iteratively estimating the unknown parameters (the pixel neighborhood size and the correlation parameters among them). In the E step, the authors compute the probability that each sample belongs to the assumed linear model. In the M step, they estimate an interpolation model. The authors report a 97% classification accuracy in the identification of interpolation models using an LDA classifier.

3. Methodology

The CG vs. PG problem seems to be a problem in which there will be no silver-bullet image characterization process which, by itself, completely solves the problem. Experience from several past works in the literature corroborates this fact. As the rich image processing and computer vision image descriptors available in the literature explore complementary properties of digital images, it is natural to expect that such properties will help to capture different nuances and telltales related to the process of creating photorealistic images. In this paper, we explore such different properties and complementary descriptions for solving the problem and show that, when combining them in a proper way, considering not only a smart combination policy but also an appropriate normalization technique to put all descriptor values in a common domain, we can have promising results.

In general, the formulation of a classification method that employs pattern recognition is usually composed of two stages: feature extraction and classifier training.

A descriptor represents each image as an ordered set of features and can be seen as a mapping of the image information to an m-dimensional feature space in which m is the number of features the descriptor represents. Considering the descriptors we employ in this work, we have the lowest m with the fractality method, with m = 3, and the largest with the curvelet coefficients, with m = 2328.

The use of a large number of features can lead to the curse of dimensionality [54]. In addition, the data volume necessary to perform a statistically significant analysis increases exponentially with the number of data dimensions. A small number of parameters, on the other hand, may result in a poor and possibly misleading characterization. Here, we also assess whether the proposed descriptors incur the curse of dimensionality (especially when combining them through fusion) by means of Random Subspace Methods (RSM) [55].

In this work, we implemented and validated 17 approaches (see Table 1). Each of the implementations is explained below. The remainder of this section explains how they were implemented: the image acquisition (Section 3.1), the state-of-the-art methods (Section 3.2), the approaches from related fields (Section 3.3), which are tested for this particular problem for the first time, and the methods for data normalization and feature/classifier fusion (Section 3.4).

Table 1
Concepts used in the implemented methods. In the first column, we show the indices. In the second, the identifier used in our work. In the third, the main concept used by the method. In the last column, the related features.

  Index  Method  Basis                                 Feature
  1      Li      Second order differences [16]         Edges/Texture
  2      LSB     Camera noise [28]                     Acquisition
  3      LYU     Wavelet transform [15]                Edges/Texture
  4      POP     Interpolator predictor [53]           Acquisition
  5      BOX     Boxes counting [39]                   Auto-similarity
  6      CON     Contourlet transform [42]             Edges/Texture
  7      CUR     Curvelet transform [56]               Edges/Texture
  8      GLC     Cooccurrence matrix [47]              Texture
  9      HOG     Histogram of oriented grads. [49]     Shape
  10     HSC     Histogram of shearlet coeff. [46]     Curves
  11     LBP     Local binary patterns [50]            Edges/Texture
  12     SHE     Shearlet transform [45]               Edges/Texture
  13     SOB     Sobel operator [57]                   Edges
  14     FUS1    Concatenation                         Combination
  15     FUS2    Simple voting                         Combination
  16     FUS3    Weighted voting                       Combination
  17     FUS4    Meta-classification                   Combination

3.1. Image collection

For a more robust statistical analysis, we create a large and heterogeneous sample space, which translates into a large number of images and content diversity.

Among the photographs, we searched for both indoor and outdoor scenes and a variety of equipment sources, such as the personal datasets available online and already used in previous publications [58]. Among the computer-generated ones, we looked only for photo-realistic images. Additionally, we used images with a high degree of realism, such as those from the online challenge [59], to test our implementations. Section 4.2 gives more details on the collection of such a dataset.
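The lowest-dimensional descriptor mentioned above, the fractality-based BOX method (Section 2.3.1, m = 3), rests on a box-counting estimate of the fractal dimension. The sketch below is our own minimal numpy illustration of that estimate on a binary map; the construction of the map, the box sizes and the way the paper turns this into its three features are assumptions, not the authors' implementation.

import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32, 64)):
    """Estimate the box-counting (fractal) dimension of a 2-D binary mask."""
    counts = []
    for s in sizes:
        h, w = (mask.shape[0] // s) * s, (mask.shape[1] // s) * s   # crop to full boxes
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(max(int(blocks.any(axis=(1, 3)).sum()), 1))   # occupied boxes N(s)
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)         # N(s) ~ s^(-D)
    return -slope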
rectangular, of size 3 x 3 pixels. The direction of each pixel of the cell was weighted by the gradient intensity. Cells were grouped into blocks. Finally, the histograms of the blocks were concatenated and normalized. We used blocks of size 3 x 3 and histograms divided into 9 intervals.

3.3.9. SOB

First, we applied the Sobel operator [57] without thresholding. We chose, among all the local maxima in the intensity map, the 50 largest in magnitude. For each of the 50 points, we fit a centered 2D Gaussian in a square block whose side is a parameter we varied. We excluded those which had inadequate Gaussian fits and calculated the variances sigma_x and sigma_y of each Gaussian. We calculated the average variance, (sigma_x + sigma_y)/2, in order to eliminate the variation with respect to rotation. We obtained a total of 150 values as the resulting set of features.

3.3.10. LSB

Since acquisition equipment usually adds noise to the original image, we searched for traces of the internal processing of digital cameras. Our assumption is that this noise is correlated with the least significant bits (LSBs) of the image.

First, we obtained a map of the least significant bits of the image by dividing the value of each pixel by two. Then, we linearized the data and calculated the cooccurrence features [47] of these coefficients.

3.3.11. POP

Popescu and Farid [53] proposed a way to identify the demosaicing process. It was used as the basis for the implementation of another descriptor.

Let f(x, y) be the intensity of the pixel at position (x, y) in a particular color channel of the image. We base the descriptor on the assumption that f(x, y) belongs to one of two correlation models: M_1, if it is linearly correlated to its neighbors, or M_2 otherwise. Thus, if f(x, y) is linearly correlated to its neighbors, and \alpha = \{\alpha_{u,v} \mid -N \le u, v \le +N\} is the parameter set, then

f(x, y) = \sum_{u,v = -N}^{+N} \alpha_{u,v} f(x + u, y + v) + n(x, y)

In the Expectation step, we then compute

\Pr\{f(x, y) \in M_1 \mid f(x, y)\} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{r^2(x, y)}{2\sigma^2}\right)

where r(x, y) is the prediction residual. In the Maximization step, we minimize the quadratic error E, setting \partial E / \partial \alpha_{s,t} = 0 and obtaining

\sum_{u,v = -N}^{+N} \alpha_{u,v} \sum_{x,y} w(x, y) f(x + s, y + t) f(x + u, y + v) = \sum_{x,y} w(x, y) f(x + s, y + t) f(x, y)

It can be noted that these steps are iterative and, in the first iteration, \alpha is initialized, say, with value 0.5. The Expectation and Maximization steps are repeated until the difference between the parameters \alpha of iterations (i) and (i-1) is smaller than a given tolerance. Analyzing the periodicities of the p-map in the frequency domain, we extracted the four higher-order statistics of such p-maps in order to feed a classifier. This step goes beyond the direct (and not automatic) analysis described in [53].

As already studied in several works in the literature [66,54], feature combination normally brings together information from different characterization techniques with different domains (data range values for the dimensions, data types, etc.). In order to deal with this potential problem, before performing any fusion, we perform data normalization. Data normalization is a key step when dealing with large amounts of heterogeneous data [54], and in this paper we decided to evaluate this point for the PG vs. CG problem.

The choice here can be challenging when there is not enough information on the data distribution. We theoretically studied several normalization techniques, such as t-norms, z-scores [54] and w-scores [54], and then chose the simplest one: z-score normalization [67]. The reason is that the z-score remains one of the most popular normalization techniques available, and its calculation is straightforward:

z = \frac{x - \mu}{\sigma}     (1)

where x is a feature vector in the m-dimensional space, \mu is the feature vector mean and \sigma the feature vector standard deviation.

3.4.1. FUS1

Our first combination strategy was feature concatenation. Previous works such as [68,69] have explored this idea; however, none has combined more than three different methods. Each feature extraction method individually provides a set of features, which are combined into a single feature vector. The resulting vector is the input for the classifier (see Fig. 2).

One possible advantage is that it does not discard any data before the final judgment and all data is supplied to the classifier, which can take a better decision based on it. Another characteristic is that only one pattern recognition step is performed.

Since each image is represented by a set of features, the resulting matrix can become prohibitive (e.g., with each combined feature vector having 4100 dimensions, in our case). The classifier has the task of finding a separation surface in an m-dimensional space, with large m. The dynamic handling of such a data volume can be limited by the processing power of a computer.
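As a hedged illustration of the FUS1 strategy just described, the sketch below concatenates per-descriptor feature matrices, applies the z-score normalization of Eq. (1) and trains a single SVM. It is our own minimal version: the variable names, the scikit-learn pipeline and the RBF kernel are assumptions rather than the paper's implementation.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# feats_per_method: list of (n_images, m_k) arrays, one per descriptor (hypothetical);
# y: labels with 0 = CG and 1 = PG.
def concatenation_fusion(feats_per_method, y):
    X = np.hstack(feats_per_method)                            # one combined vector per image
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))   # z-score (Eq. (1)), then SVM
    clf.fit(X, y)
    return clf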
Fig. 3. Simple voting flowchart. (1) k descriptors are applied to the image. (2) Each classifier elects a class (vote). (3) The values for each class are summed (poll).

Fig. 4. Weighted voting method flowchart. (1) k descriptors are applied to the image. (2) Each of the k classifiers elects a class (vote) and each classifier has a weight w_i from the training phase. (3) All votes are summed, weighted by w_i.
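A minimal sketch of the two voting schemes pictured in Figs. 3 and 4 (our illustration; how the per-classifier weights w_i are derived from the training phase, e.g., from each classifier's training accuracy, is an assumption here):

import numpy as np

def simple_voting(votes):
    """votes: length-k sequence of class votes (0 = CG, 1 = PG), one per classifier (Fig. 3)."""
    return int(np.bincount(np.asarray(votes), minlength=2).argmax())

def weighted_voting(votes, weights):
    """weights: per-classifier weights w_i obtained in the training phase (Fig. 4)."""
    polls = np.zeros(2)
    for v, w in zip(votes, weights):
        polls[v] += w
    return int(polls.argmax())

# Example: three classifiers vote CG, PG, PG.
# simple_voting([0, 1, 1]) -> 1; weighted_voting([0, 1, 1], [0.9, 0.6, 0.7]) -> 1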
Table 2
Results for the method based on wavelet coefficients.

Table 3
Results for the method based on second order differences.

4.4.2. Li

The method proposed by Li et al. [16] achieved a high accuracy in this group, an average of 93%. We obtained a feature vector of size 48 for each color channel, totalizing 144 features. Table 3 summarizes these experiments.

Table 6
Results for the method based on curvelets.

  Curvelets   Accuracy (CG)   Accuracy (PG)   Variance (CG)   Variance (PG)   Average accuracy
  CUR         0.806           0.805           3.57E-04        1.52E-03        0.805
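The result tables in this section report, for each configuration, the accuracy per class (CG and PG), its variance across the validation runs, and the average of the two class accuracies. The sketch below reflects our reading of that bookkeeping; the exact number of runs and the splitting protocol are not reproduced here.

import numpy as np

def per_class_accuracy(y_true, y_pred, cls):
    m = (y_true == cls)
    return float((y_pred[m] == y_true[m]).mean())

def summarize_runs(accs_cg, accs_pg):
    """accs_cg, accs_pg: per-run accuracies collected for the CG and PG classes."""
    return {
        "CG": (np.mean(accs_cg), np.var(accs_cg)),        # accuracy, variance
        "PG": (np.mean(accs_pg), np.var(accs_pg)),
        "average": 0.5 * (np.mean(accs_cg) + np.mean(accs_pg)),
    }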
4.5.4. GLCM

Initially, we computed the cooccurrence matrix and then extracted the characteristics of Haralick et al. [47] for each color channel. We used a vector with 12 dimensions to represent each image. Comparing the size of the feature vector of this method with the contourlet transform output (696-d), whose size is suboptimal [42], the result was significantly worse than that of other methods, particularly the contourlet. Table 7 summarizes these experiments.

Table 7
Results for the method based on the cooccurrence matrix.

4.5.5. HOG

In the implementation based on the histogram of oriented gradients, we used from 9 to 16 orientations. We obtained an average accuracy of 74% in the best implementation. Table 8 summarizes these experiments.

Table 8
Results for the method based on the histogram of oriented gradients.

  HOG     Accuracy (CG)   Accuracy (PG)   Variance (CG)   Variance (PG)   Average accuracy
  HOG 1   0.735           0.700           1.02E-03        5.33E-04        0.718
  HOG 2   0.754           0.725           1.64E-04        8.77E-04        0.740

4.5.6. HSC

In the HSC [46] method, we varied the size of the block, the displacement of the sliding block, the number of levels, and the number of angles. We tested three sets of parameters. In all approaches, we used a displacement of 256 pixels and 8 levels.

In a first approach, we used blocks of size 256 x 256 and 8 angles. In a second approach, we used the same size and changed the number of angles to 16. Finally, in a last approach, we used eight angles, but with blocks of size 512 x 512. In the first and second approaches, we obtained four juxtaposed blocks, while in the last we obtained just one block. The method was applied independently to each color channel. Table 9 summarizes these experiments.

Table 9
Results for the method based on the histogram of shearlet coefficients. In each of the methods, we decomposed the image into three levels. The coefficients were obtained for each block, with the center shifted by 256 pixels.

  Shearlet Histogram   Accuracy (CG)   Accuracy (PG)   Variance (CG)   Variance (PG)   Average accuracy
  HSC 1                0.818           0.787           2.24E-04        4.17E-04        0.802
  HSC 2                0.830           0.785           5.09E-04        7.37E-04        0.808
  HSC 3                0.815           0.783           7.97E-04        2.89E-04        0.799

4.5.7. LBP

In the method based on Local Binary Patterns, the key parameter is the definition of the radius of the neighborhood. We defined three radii in our method, 1, 2 and 3, which resulted in 8, 16 and 24 neighbors, respectively. The best accuracy (about 87%) was obtained with a radius equal to 2. Table 10 summarizes these experiments.

Table 10
Results for the method based on the LBP descriptor.

4.5.8. LSB

We calculated the least significant bit plane for each color channel of the image, as in [29]. From each LSB map, we extracted the cooccurrence matrix and the Haralick et al. [47] features. The feature extraction is fast and yields four characteristics for each map of least significant bits. The average accuracy was 66%. Table 11 summarizes these experiments.

4.5.9. SHE

We implemented a method based on the work of [74] using different wavelet filters. We used Daubechies and Symmlet filters with different support sizes.

After the extraction of the shearlet coefficients for each color level, scale and direction, we used the mean, variance, skewness and kurtosis of the coefficients as descriptors. The accuracies obtained from each of the implementations differed slightly and were around 71%. Table 12 summarizes these experiments.

Table 12
Results for the method based on the shearlet coefficients. The first two rows were obtained using the Daubechies filter, whereas the third and fourth rows were obtained using the Symmlet filter.

  Shearlets   Accuracy (CG)   Accuracy (PG)   Variance (CG)   Variance (PG)   Average accuracy
  SHE 1       0.748           0.677           6.75E-04        1.73E-04        0.713
  SHE 2       0.748           0.676           1.57E-03        5.49E-04        0.712
  SHE 3       0.752           0.677           1.24E-03        3.28E-04        0.715
  SHE 4       0.747           0.674           7.05E-04        1.16E-03        0.710

4.5.10. SOB

The Sobel operator is commonly used as an edge detector. In our implementation, we used two variations: a Gaussian window with side 7 and a Gaussian window with side 9. The results differed slightly and were around 55%. Table 13 summarizes these experiments.
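Both the GLC descriptor (Sections 2.3.4 and 4.5.4) and the LSB descriptor (Section 4.5.8) ultimately rely on gray-level cooccurrence statistics. The sketch below is our own minimal numpy version of that computation for a single integer-valued channel and a single offset; the offsets, the quantization and the particular Haralick statistics kept (the paper uses a 12-dimensional vector) are illustrative assumptions.

import numpy as np

def glcm(channel, dx=1, dy=0, levels=256):
    """Normalized gray-level cooccurrence matrix for one integer-valued channel."""
    g = np.zeros((levels, levels), dtype=np.float64)
    h, w = channel.shape
    src = channel[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    dst = channel[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    np.add.at(g, (src.ravel(), dst.ravel()), 1)      # count pixel pairs (i, j)
    return g / g.sum()

def haralick_subset(p):
    """A few of the Haralick et al. [47] statistics: contrast, homogeneity, energy."""
    i, j = np.indices(p.shape)
    contrast = float(((i - j) ** 2 * p).sum())
    homogeneity = float((p / (1.0 + np.abs(i - j))).sum())
    energy = float((p ** 2).sum())
    return [contrast, homogeneity, energy]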
Table 16
Comparison of the results for the combination methods.

Fig. 6. Some images correctly identified by at least 9 out of 13 individual methods. CGs and PGs are shown in the first and second rows, respectively.

Table 17
Comparison among the implemented approaches for distinguishing CGs and PGs. For each of the seventeen methods, we show the number of dimensions m of the feature space, the accuracy for each class, and the average accuracy.

  Index   Method   m      CG      PG      Average accuracy
  1       BOX      3      0.541   0.568   0.554
  2       CON      696    0.918   0.887   0.902
  3       CUR      2328   0.806   0.805   0.805
  4       GLC      12     0.640   0.630   0.635
  5       HOG      256    0.754   0.720   0.740
the fact that combination methods use only the results of the individual methods and make their decision rules in agreement with the consensus of the methods.

The method with the highest number of hits was Li, with 8 hits out of 12. The concatenation, simple voting, weighted voting and meta-classification methods reached 9, 8, 9 and 10 hits, respectively. The results show that, in a complex scenario of distinguishing between CGs and PGs, fusion methods can improve the overall accuracy of the results (see Fig. 9).

Fig. 9. Samples of images from the Fake or Foto website. Images with green borders are PGs and images with brown borders are CGs. (For interpretation of the references to colour in this figure caption, the reader is referred to the web version of this article.)

5. Conclusions and future work

Technology has significantly changed the way we capture, create and store images. Current digital cameras allow instant images with the support of tools for acquiring and processing images embedded in the device itself. Editing and distributing a photo has also become a simple task, due to powerful image editors and the Internet.

In this context, images have been consolidated as an important means of communication. In parallel, advances in computer graphics have allowed imaging at levels of complexity never achieved before. Modern techniques and more powerful machines allow the creation of complex scenes, visually very close to reality. Images of this nature can confuse a naive user. The combination of the fact that PGs have a role as digital documents and the fact that CGs can be complex enough to confuse the viewer can create problems in a court of law. Proving the authenticity of an image to a jury can validate a piece of criminal evidence which, in turn, may implicate or acquit an individual.

Most existing studies that seek to distinguish PGs and CGs in the literature to date do not perform a complete and robust comparison with previous methods. The works often make comparisons with other methods only through the reported accuracies. Using the values reported by other methods leads to inconsistent comparisons. The test environments may vary, for instance, in the sets of training and test data. The methods make several hypotheses, to the point that, in a real scenario, we expect the results to be significantly worse [14]. In a context where the objects of study are potential criminal evidence, we need methods that are robust in every type of scenario.

This work aimed at creating a common test scenario, implementing some state-of-the-art methods and, more importantly, proposing and implementing several new methods from related fields, and performing a consistent comparison of them, discussing their complementarity when used in conjunction with data fusion techniques. Our tests used a common data set, the same training/testing parameters, and the same performance conditions. The implemented methods varied in the number of features used (3-2,328) and in computational performance. Among the methods evaluated, the methods of [15,16] showed the best individual results. We emphasize that the dataset collected for the validation of the methods will be freely available, allowing fair comparisons in the future.

There are a number of approaches addressing the PG versus CG problem. So far, however, there has been little effort to combine the existing techniques and assess their complementarity. In this paper, we discussed four methods for combining descriptors. The first method performs a simple junction of each feature array into a single set. The simple voting method performs the extraction and classification steps individually for each method and elects the class with the majority of votes. The weighted voting method uses information from the training phase to weight the final choice. The fourth method uses meta-classification, with the distances of the SVM scores as characteristics. The best results were obtained with the meta-classifier (97% accuracy), more than four percentage points above the best single method.

For a real classification scenario, if the user already has the image descriptors, then the implementation of the combination methods is simple. The joint use of descriptors is usually done by concatenation; however, our results indicated that a simple voting scheme can generate better results.

Future work opportunities include the extension of the methods to local regions of images (blocks), the application of more sophisticated normalization techniques such as the w-score [54], and the inclusion of other descriptors, among the many available. Another approach could be, instead of croppings, the use of resized images, since this operation is common on the Internet. Counter-forensic techniques are also targets of study, because we can use descriptors that are more robust to certain types of attacks. The case of images obtained by recapture, as discussed in [7], represents a relevant problem that could be explored in future work and, in this case, the separation of scene contents and concepts would be fundamental.

Acknowledgements

This work was partially supported by the São Paulo Research Foundation – FAPESP (Grants 2010/05647-4, 2010/13745-6, and 2011/22749-8), the National Counsel of Technological and Scientific Development – CNPq (Grants 304352/2012-8 and 307113/2012-4), and Microsoft.
References

[1] Nvidia, 2012. <http://www.nvidia.com/>.
[2] Maya, 2012. <http://usa.autodesk.com/maya/>.
[3] 3ds Max, 2012. <http://usa.autodesk.com/3ds-max/>.
[4] H. Farid, M.J. Bravo, Image forensic analyses that elude the human visual system, in: SPIE Symposium on Electronic Imaging (SEI), CA, 2010, pp. 754106–754106-10.
[5] H. Farid, Creating and detecting doctored and virtual images: implications to the child pornography prevention act, Tech. Rep. 2004-518, Dartmouth College, USA, 2004.
[6] A. Rocha, W. Scheirer, T. Boult, S. Goldenstein, Vision of the unseen: current trends and challenges in digital image and video forensics, ACM Computing Surveys (CSUR) 43 (2011) 26:1–26:42.
[7] T.-T. Ng, S.-F. Chang, J. Hsu, L. Xie, M.-P. Tsui, Physics-motivated features for distinguishing photographic images and computer graphics, in: ACM Multimedia (ACMMM), Singapore, 2005, pp. 239–248.
[8] A. da Silva Pinto, H. Pedrini, W.R. Schwartz, A. Rocha, Video-based face spoofing detection through visual rhythm analysis, in: 25th Conference on Graphics, Patterns and Images (SIBGRAPI), Ouro Preto, Brazil, 2012, pp. 221–228.
[9] W.R. Schwartz, A. Rocha, H. Pedrini, Face spoofing detection through partial least squares and low-level descriptors, in: International Joint Conference on Biometrics (IJCB), 2011, pp. 1–8.
[10] M.K. Johnson, H. Farid, Detecting photographic composites of people, in: International Workshop on Digital Watermarking (IWDW), China, 2007, pp. 19–33.
[11] T. Pouli, E. Reinhard, Image statistics and their applications in computer graphics, Tech. Rep., Eurographics State of the Art Report (STAR), 2010.
[12] E. Dirik, H. Sencar, N. Memon, Source camera identification based on sensor dust characteristics, in: IEEE Signal Processing Applications for Public Security and Forensics (SAFE), USA, 2007, pp. 1–6.
[13] Y. Wang, P. Moulin, On discrimination between photorealistic and photographic images, in: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), France, vol. 2, 2006, p. II.
[14] T. Gloe, M. Kirchner, A. Winkler, R. Bohme, Can we trust digital image forensics?, in: ACM Multimedia (ACMMM), Germany, 2007, pp. 78–86.
[15] S. Lyu, H. Farid, How realistic is photorealistic?, IEEE Transactions on Signal Processing (TSP) 53 (2) (2005) 845–850.
[16] W. Li, T. Zhang, E. Zheng, X. Ping, Identifying photorealistic computer graphics using second-order difference statistics, in: International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), China, vol. 5, 2010, pp. 2316–2319.
[17] W. Chen, Y. Shi, G. Xuan, Identifying computer graphics using HSV color model and statistical moments of characteristic functions, in: IEEE International Conference on Multimedia and Expo (ICME), China, 2007, pp. 1123–1126.
[18] Y. Rui, T.S. Huang, S.-F. Chang, Image retrieval: current techniques, promising directions, and open issues, Journal of Visual Communication and Image Representation 10 (1) (1999) 39–62.
[19] R. Chakravarti, X. Meng, A study of color histogram based image retrieval, in: IEEE International Conference on Information Technology (CIT), USA, 2009, pp. 1323–1328.
[20] W. Equitz, W. Niblack, Retrieving images from a database using texture algorithms from the QBIC system, Tech. Rep. RJ 9805, IBM Research, 1994.
[21] G. Elkharraz, S. Thumfart, D. Akay, C. Eitzinger, B. Henson, Texture features corresponding to human touch feeling, in: IEEE International Conference on Image Processing (ICIP), Egypt, 2009, pp. 1341–1344.
[22] S. Loncaric, A survey of shape analysis techniques, Pattern Recognition 31 (8) (1998) 983–1001.
[23] A.B. Lee, K.S. Pedersen, D. Mumford, The nonlinear statistics of high-contrast patches in natural images, International Journal of Computer Vision (IJCV) 54 (2003) 83–103.
[24] N. Sochen, R. Kimmel, R. Malladi, A general framework for low level vision, IEEE Transactions on Image Processing (TIP) 7 (3) (1998) 310–318.
[25] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, second ed., John Wiley & Sons, 1973.
[26] R.W. Buccigrossi, E. Simoncelli, Image compression via joint statistical characterization in the wavelet domain, IEEE Transactions on Image Processing (TIP) 8 (12) (1999) 1688–1701.
[27] P.P. Vaidyanathan, Quadrature mirror filter banks, M-band extensions and perfect-reconstruction techniques, in: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), USA, vol. 4, 1987, pp. 4–20.
[28] T.-T. Ng, S.-F. Chang, Identifying and prefiltering images – distinguishing between natural photography and photorealistic computer graphics, IEEE Signal Processing Magazine (SPM) 26 (2) (2009) 49–58.
[29] A. Rocha, S. Goldenstein, Progressive randomization process and equipment for multimedia analysis and reasoning, Patent PCT/BR2007/000156, World Intellectual Property Org. (WIPO), 2008.
[30] A. Rocha, S. Goldenstein, Progressive randomization: seeing the unseen, Elsevier Computer Vision and Image Understanding (CVIU) 114 (3) (2010) 349–362.
[31] S. Dehnie, T. Sencar, N. Memon, Identification of computer generated and digital camera images for digital image forensics, in: IEEE International Conference on Image Processing (ICIP), USA, 2006, pp. 2313–2316.
[32] J. Lukas, J. Fridrich, M. Goljan, Digital camera identification from sensor pattern noise, IEEE Transactions on Information Forensics and Security (TIFS) 1 (2) (2006) 205–214.
[33] B.E. Bayer, Color imaging array, U.S. Patent 3971065, 1976.
[34] A.C. Gallagher, T. Chen, Image authentication by detecting traces of demosaicing, in: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), USA, 2008, pp. 1–8.
[35] F. Peng, J. Liu, M. Long, Identification of natural images and computer generated graphics based on hybrid features, International Journal of Digital Crime and Forensics (IJDCF) 4 (2013) 1–16.
[36] S. Fan, T.-T. Ng, J.S. Herberg, B.L. Koenig, S. Xin, Real or fake?: human judgments about photographs and computer-generated images of faces, in: SIGGRAPH Asia, ACM, Singapore, 2012, pp. 17:1–17:4.
[37] Dang-Nguyen, G. Boato, F.G.B. De Natale, Discrimination between computer generated and natural human faces based on asymmetry information, in: 20th European Signal Processing Conference (EUSIPCO), Bucharest, Romania, 2012, pp. 1234–1238.
[38] T. Isenberg, Evaluating and validating non-photorealistic and illustrative rendering, in: Image and Video-Based Artistic Stylisation, Computational Imaging and Vision, vol. 42, Springer-Verlag, London, 2013, pp. 311–331.
[39] L.S. Liebovitch, T. Toth, A fast algorithm to determine fractal dimensions by box counting, Physics Letters A 141 (1989) 386–390.
[40] B. Mandelbrot, How long is the coast of Britain? Statistical self-similarity and fractional dimension, Science 156 (3775) (1967) 636–638.
[41] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, 1999.
[42] M.N. Do, M. Vetterli, Contourlets: a directional multiresolution image representation, in: IEEE International Conference on Image Processing (ICIP), USA, vol. 1, 2002, pp. I-357–360.
[43] E. Candes, L. Demanet, D. Donoho, L. Ying, Fast discrete curvelet transforms, Multiscale Modeling & Simulation 5 (3) (2006) 861–899.
[44] M. Do, M. Vetterli, The finite ridgelet transform for image representation, IEEE Transactions on Image Processing (TIP) 12 (1) (2003) 16–28.
[45] G. Kutyniok, W.-Q. Lim, Compactly supported shearlets are optimally sparse, Journal of Approximation Theory (2011) 1564–1589.
[46] W.R. Schwartz, R.D. da Silva, L.S. Davis, H. Pedrini, A novel feature descriptor based on the shearlet transform, in: IEEE International Conference on Image Processing (ICIP), Belgium, 2011, pp. 1053–1056.
[47] R.M. Haralick, K. Shanmugam, I. Dinstein, Textural features for image classification, IEEE Transactions on Systems, Man, and Cybernetics (SMC) 3 (6) (1973) 610–621.
[48] D.A. Clausi, An analysis of co-occurrence texture statistics as a function of grey level quantization, Canadian Journal of Remote Sensing (CJRS) 28 (1) (2002) 45–62.
[49] N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), USA, 2005, pp. 886–893.
[50] T. Ojala, M. Pietikäinen, T. Mäenpää, A generalized local binary pattern operator for multiresolution gray scale and rotation invariant texture classification, in: International Conference on Advances in Pattern Recognition (ICAPR), Brazil, 2001, pp. 399–408.
[51] H. Farid, A 3-D photo forensic analysis of the Lee Harvey Oswald backyard photo, Tech. Rep. TR2010-669, Dartmouth College, USA, 2010.
[52] T. Mäenpää, M. Pietikäinen, Multi-scale binary patterns for texture analysis, in: Scandinavian Conference on Image Analysis (SCIA), Sweden, 2003, pp. 885–892.
[53] A.C. Popescu, H. Farid, Exposing digital forgeries in color filter array interpolated images, IEEE Transactions on Signal Processing (TSP) 53 (10) (2005) 3948–3959.
[54] W. Scheirer, A. Rocha, R. Micheals, T. Boult, Robust fusion: extreme value theory for recognition score normalization, in: European Conference on Computer Vision (ECCV), USA, vol. 6313, 2010, pp. 481–495.
[55] T.K. Ho, The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 20 (8) (1998) 832–844.
[56] E.J. Candes, D.L. Donoho, Curvelets – A Surprisingly Effective Nonadaptive Representation for Objects with Edges, Vanderbilt University Press, 2000.
[57] R. Gonzalez, R. Woods, Digital Image Processing, third ed., Prentice-Hall, 2007.
[58] G.C. de Silva, T. Yamasaki, K. Aizawa, Sketch-based spatial queries for retrieving human locomotion patterns from continuously archived GPS data, IEEE Transactions on Multimedia (TMM) 11 (7) (2009) 1240–1253.
[59] Fake or Foto, 2012. <http://www.fakeorfoto.com/>.
[60] Curvelab, 2012. <http://www.curvelet.org>.
[61] Contourlet toolbox, 2012. <http://www.ifp.illinois.edu/minhdo/software/>.
[62] S.-M. Phoong, C. Kim, P. Vaidyanathan, R. Ansari, A new class of two-channel biorthogonal filter banks and wavelet bases, IEEE Transactions on Signal Processing (TSP) 43 (3) (1995) 649–665.
[63] Shearlab, 2012. <http://www.shearlab.org/indexsoftware.html>.
[64] T. Ojala, M. Pietikäinen, T. Mäenpää, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 24 (2002) 971–987.
[65] O. Ludwig, D. Delgado, V. Goncalves, U. Nunes, Trainable classifier-fusion schemes: an application to pedestrian detection, in: IEEE International Conference on Intelligent Transportation Systems (ITSC), 2009, pp. 1–6.
[66] A. Rocha, J.P. Papa, L.A.A. Meira, How far do we get using machine learning black-boxes?, International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) 26 (2) (2012) 1261001-1–1261001-23.
[67] R.J. Larsen, M.L. Marx, An Introduction to Mathematical Statistics and Its Applications, third ed., Pearson, 2000.
[68] A. Dirik, S. Bayram, H. Sencar, N. Memon, New features to identify computer generated images, in: IEEE International Conference on Image Processing (ICIP), USA, vol. 4, 2007, pp. 433–436.
[69] C.M. Bishop, Pattern Recognition and Machine Learning, first ed., Springer, 2006.
[70] Matlab, 2012. <http://www.mathworks.com/>.
[71] R, 2012. <http://www.r-project.org/>.
[72] C. Chang, C. Lin, LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST) 2 (2011) 1–27.
[73] T. Fawcett, An introduction to ROC analysis, Pattern Recognition Letters 27 (8) (2006) 861–874.
[74] J. Ma, G. Plonka, The curvelet transform, a review of recent applications, IEEE Signal Processing Magazine (SPM) 27 (2) (2010) 118–133.