
http://www.diva-portal.org

Postprint

This is the accepted version of a paper presented at the 11th International Conference on Information Systems and Advanced Technologies, ICISAT 2021, Virtual, Online, 27 December 2021 through 28 December 2021.

Citation for the original published paper:

Benhamza, H., Djeffal, A., Cheddad, A. (2021)
Image forgery detection review
In: Proceedings - 2021 International Conference on Information Systems and Advanced Technologies, ICISAT 2021. Institute of Electrical and Electronics Engineers Inc.
https://doi.org/10.1109/ICISAT54145.2021.9678207

N.B. When citing this work, cite the original published paper.

©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be
obtained for all other uses, in any current or future media, including reprinting/republishing
this material for advertising or promotional purposes, creating new collective works, for
resale or redistribution to servers or lists, or reuse of any copyrighted component of this work
in other works.

Permanent link to this version:


http://urn.kb.se/resolve?urn=urn:nbn:se:bth-22737
Image forgery detection review
Hiba Benhamza, LESIA Laboratory, Mohamed Khider University, Biskra, Algeria, hiba_benhamza@hotmail.fr
Abdelhamid Djeffal, LESIA Laboratory, Mohamed Khider University, Biskra, Algeria, abdelhamid_djeffal@yahoo.fr
Abbas Cheddad, Blekinge Institute of Technology, Sweden, abbas.cheddad@bth.se

Abstract: With the widespread use of digital documents in administrations, the fabrication and use of forged documents have become a serious problem. This paper presents a study and classification of the most important works on image and document forgery detection. The classification is based on document type, forgery type, detection method, validation dataset, evaluation metrics and obtained results. Most existing forgery detection works deal with images; few of them analyze administrative documents and go deeper to analyze their contents.

Keywords: Document forgery, Copy-move forgery, Splicing, imitation, forgery detection methods, Digital image forgery.

I. INTRODUCTION

Administrative documents are forms created to establish an identity, right or authorization. Falsification of documents or identity is therefore severely punishable by law. This concerns identity and authorization documents such as passports, identity cards and driving licenses. Fraudsters generally use these false documents to commit offences, such as travelling with a fake passport or visa, or forging diplomas to get a job. Moreover, digitalization, or digital transformation, has created new communication and optimization tools for business and administration management. With new technologies, it is easy to digitize documents using a scanner and save them in PDF format or as images. The new tendency of governments is to use digital documents instead of hard copies to optimize many administrative procedures.

Unfortunately, tampering with document images has become easier with the sophisticated tools brought by recent advances in technology and multimedia. Anyone can easily use available, low-cost professional image forgery tools to modify images such that they cannot be distinguished from authentic ones with the naked eye. Image forgery detection is categorized into two main types: active and passive methods. Active methods rely on digital signatures or watermarking. They depend on prior information taken from the original image. However, these methods are not used for all kinds of documents because they need special equipment, such as particular cameras, to embed a watermark or a signature. Alternatively, passive methods are used to detect forgery without previously inserted information [1]. These methods can treat two main types of image forgery: copy-move forgery (CMF) and image splicing, of which copy-move forgery is the one most commonly adopted by forgers. Many researchers have been interested in copy-move forgery, and several methods have been introduced in this field. CMF consists of copying a part of an image and pasting it into another region of the same image, while splicing combines multiple regions of different images to create a forged image [1], [2], [3]. There are two main categories of techniques used to detect CMF: block-based and keypoint-based methods. Each of these techniques involves a feature extraction phase and a matching process.

This report presents the main existing works related to image and document forgery detection and introduces their methods, results and discussion. The aim of this report is to point out the main challenges of this research field, to compare the proposed methods and to discuss their advantages and limits.

II. DIGITAL IMAGE FORGERY DETECTION METHODS

After the tremendous development in communication technologies in recent years, digital image forgery detection has become a centre of interest for many researchers trying to secure administrative and business activities. Indeed, several techniques have now been proposed for the detection of falsified documents, especially passive methods.

Image forgery detection is concerned with detecting manipulated images or verifying their authenticity. It is divided into two main categories: active and passive.

A. Passive methods

Passive forgery detection techniques do not need any signature or watermark to detect forgery; instead, they analyse the statistical distortion that a forgery leaves behind, although the detection of digital image falsification in this case is considered more difficult. This approach has become widely used because it does not need prior information about the image. It depends on the hypothesis that any modification of the image may change its consistency or statistical properties [4].

1) Copy-move forgery detection (CMFD)

Most of the research in the image forgery detection field is concentrated on copy-move forgery. In this section we explain CMFD and the common workflow of CMFD techniques.

The copy-move process is based on the idea of taking a part or parts of the image and pasting them elsewhere in the same image. Because the texture and color of these regions are similar, blending them into the background becomes easier.

In document forgery, CMF is generally used to hide or add information, see Fig. 1.

Fig. 1. A: Authentic image B: Forged image.
The CMFD process generally starts with a pre-processing step to enhance image features. The images can then be converted to grayscale and divided into blocks. After pre-processing, a feature extraction step is used to collect information representing the characteristics of the regions of interest in the image. After feature extraction, a matching phase is used to find similar features in the image. Finally, the potentially falsified regions in the image can be shown and identified by visualising the results of the CMFD process [5].

CMFD techniques are organized into two main types, block-based and keypoint-based:

a) Block-based method: This method depends on dividing the image into blocks of different shapes, including squares and circles, that can be non-overlapping or overlapping, to be used afterwards in the preprocessing stage. Then, two main steps are performed to detect forgery:

• Features are first extracted from these blocks and compared to determine their level of similarity.

i. Feature extraction step

There exist many techniques for feature extraction in block-based methods. We can cite for example:

• Discrete Cosine Transform (DCT), which is one of the most widely used techniques in CMFD. It is based on a frequency transform and is known for its robustness against noise addition and JPEG compression [5].

• Texture and intensity based features, used in images that contain certain patterns or textures, for example landscape pictures characterised by a certain density and smoothness, such as grass, trees, ground and sky [5].

ii. Matching step

To find similar blocks, the matching process compares the features of each block and then matches them to determine the manipulated area. Here are some of the matching techniques for block-based methods:

• Sorting: Sorting orders the features, and it is commonly used in the matching process of block-based approaches. The sorting techniques include KD-Tree, Lexicographical and Radix [5].

• Euclidean distance: The Euclidean distance measures the distance between two points or two vectors in Euclidean space. After arranging the calculated distances, the similar blocks are identified, and then the suspected regions are distinguished in the image [5].

iii. The main existing works:

In 2015, Beste Ustubıoglu et al. [6] presented a technique to detect copy-move forgery using color moments. First they divided the image into circular blocks. Then, they extracted feature vectors from the blocks using three color moments. Afterwards, the feature vector matrix is sorted lexicographically. To create a dataset, they used images from Google image search, then created fake images by duplicating some regions of an image and pasting them within the same image. They reported an accuracy of 0.9981 and a false positive ratio of 0.0205 for the proposed method.

In 2017, using the histogram of oriented gradients, Mahale Vivek Hilal et al. [7] presented an algorithm to detect copy-move forgery. Their methodology starts with a pre-processing step: first they convert images to grayscale, then, to find the intensity direction, they measure the gradient of the image, and after that they apply a Gaussian filter. Afterwards, they move to the feature extraction phase: in this step they divide the image into overlapping blocks of fixed size. After the image is divided into blocks, the Histogram of Oriented Gradient (HOG) is calculated for each block to obtain descriptor features. Then, a matching step is performed to identify the forged regions. They used the Euclidean distance with a threshold value to reach the decision. As for the dataset, they used a public dataset called COMOFOD. They tested their approach in three different experiments using three different dataset sizes. Their best result was a false acceptance rate of 0.82 and a false rejection rate of 0.17, obtained with 70 original images and 70 forged ones.

In 2018, using the discrete cosine transform, Mohammed Hazim Alkawaz et al. [8] presented a method for copy-move forgery detection. Using a block-based detection approach with different block sizes, this work aims to study the effect of block size on the effectiveness of detecting forged areas in terms of FP and FN. Their framework starts with an RGB image, which is converted to grayscale. After that, they partition the image into overlapping blocks between 4 × 4 and 8 × 8 pixels; then, for each block, they compute the 2D DCT coefficients and rearrange them into a feature vector using zigzag scanning. All blocks are sorted lexicographically, then block matching is carried out to find duplicated blocks using the Euclidean distance. For the implementation they used the CoMoFoD dataset.

In 2018, Badal Soni et al. [9] proposed a technique that uses Speeded Up Robust Features (SURF) for block-based feature extraction and, for matching, the Features from Accelerated Segment Test (FAST) keypoint method. First, they begin with a preprocessing stage where they convert the image to grayscale, then they divide it into n × n overlapping blocks. Then, to extract features, they extract SURF descriptors from each block. Afterwards, they move to the matching process, using FAST feature points to concatenate neighbouring blocks. Thereafter, the extracted features are matched using a 2NN matching procedure. Finally, they remove outliers if any exist. To evaluate the performance of their method, they used three different versions of the MICC dataset. They found that the proposed technique is robust against rotation and scaling attacks. They also tested the performance of the proposed system by measuring TPR, FPR and execution time in seconds, and obtained the following results: 97.4%, 8.6% and 9.2 seconds respectively for the MICC-F220 dataset; 94.5%, 9.8% and 28.6 seconds for the MICC-F2000 dataset; and 83.84%, 12.64% and 11.2 seconds for the MICC-F8 multi dataset.

In 2019, Tingge Zhu et al. [10] proposed a method based on LBP residue classes (LBPRC) and color regions (CR). The framework of the proposed algorithm detects and identifies suspected tampered areas. First, the image is divided into overlapping blocks. Afterwards, for each block they compute the LBPRC and CR, and then search for similar blocks within the same LBPRC and CR. In this stage, the suspicious regions are extracted from these matching pairs. Finally, depending on a given threshold, they locate the forged regions.
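The block-based workflow described in this section (overlapping blocks, DCT features, lexicographic sorting, Euclidean distance with a threshold) can be illustrated with a minimal sketch. It is not the implementation of any of the cited works: the function names, the block size, the thresholds and the synthetic test image are all assumptions chosen for illustration, and the DCT coefficients are flattened row by row instead of using zigzag scanning.

```python
import math
import random


def dct2(block):
    """Naive orthonormal 2D DCT-II of a square block (list of rows)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def block_features(image, size):
    """Return (feature_vector, (row, col)) for every overlapping block."""
    h, w = len(image), len(image[0])
    feats = []
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            block = [row[c:c + size] for row in image[r:r + size]]
            coeffs = dct2(block)
            # Row-major flattening (an assumption; zigzag order is also common).
            vec = tuple(round(v, 1) for row in coeffs for v in row)
            feats.append((vec, (r, c)))
    return feats


def find_duplicates(image, size=4, dist_thresh=0.5, min_shift=4.0):
    """Sort features lexicographically and compare neighbours in the sorted list."""
    feats = sorted(block_features(image, size))
    matches = []
    for (f1, p1), (f2, p2) in zip(feats, feats[1:]):
        similar = math.dist(f1, f2) <= dist_thresh
        far_apart = math.dist(p1, p2) >= min_shift  # ignore near-overlapping blocks
        if similar and far_apart:
            matches.append((p1, p2))
    return matches


def make_demo_image(forged=True, side=16, seed=0):
    """Hypothetical grayscale test image; optionally copy a 4x4 patch."""
    rng = random.Random(seed)
    img = [[rng.randint(0, 255) for _ in range(side)] for _ in range(side)]
    if forged:
        for dr in range(4):
            for dc in range(4):
                img[10 + dr][9 + dc] = img[1 + dr][1 + dc]
    return img


if __name__ == "__main__":
    print(find_duplicates(make_demo_image(forged=True)))
```

On the synthetic image, the block at (1, 1) and its pasted copy at (10, 9) end up adjacent after sorting, which is why only neighbouring entries need to be compared. Real images need a more tolerant threshold and a post-processing step, since smooth regions produce many near-identical blocks.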
To test the performance of their method, they used two databases: the CMH database and the CoMoFoD_small_v2 database. They employed recall, precision and F1 (a trade-off between precision and recall), and achieved 93.65%, 97.98% and 96.58% respectively on the CoMoFoD_small_v2 database.

b) Keypoint-based method: The keypoint-based method is not block-based. Keypoint features are extracted from distinctive areas such as edges and corners. Each feature is represented by a set of descriptors. Then, descriptors and features are collected to be classified and matched to identify duplicated regions in the image.

i. Feature extraction techniques

There are three categories of feature extraction for keypoint methods:

• Scale Invariant Feature Transform (SIFT).

• Speeded Up Robust Features (SURF).

• Harris Corner Detector.

SIFT is frequently used in the keypoint-based approach. Due to the effectiveness and the stability of this technique, it has been widely used in CMFD.

ii. Matching techniques

• Nearest neighbour is one of the similarity measurement methods used. It measures the similarity between two specific points in a vector space by calculating the distance between these two points. The points are considered similar when the distance does not exceed a certain threshold.

• Clustering is a technique that combines similar objects together and puts them into groups [5].

iii. The main existing works

In 2010, Xu Bo et al. [11] proposed a method for copy-move forgery detection using SURF keypoints to find duplicate regions in the image. The main point of using this algorithm is to produce fast results compared to other keypoint feature extraction techniques. First, they extract the interest points and assign each a unique orientation using the Haar wavelet. In a second step, they extract square regions around the interest points using SURF descriptors. To determine the duplication between two images, they used the Euclidean distance for all feature descriptors. In the experiments, they used an edited version of the Uncompressed Color Images Database (UCID). They tested robustness against scaling, rotation, blurring and noise. They found that the proposed method can accurately detect duplicate areas and can also resist the impact of noise, scaling and rotation.

In 2018, Hesham A. Alberry et al. [1] presented a copy-move forgery detection method in which they introduce a fast technique that optimizes SIFT and fuzzy C-means (FCM) clustering. The technique is based on the SIFT algorithm for feature extraction. The fuzzy C-means clustering method is used to reduce the time complexity of the SIFT algorithm. First, the keypoints are extracted by the SIFT method. Then, these keypoints are used to extract the feature descriptors. Afterwards, a matching stage is followed by a clustering algorithm that clusters the keypoints. For the experimental step, they used 573 pictures. They used the MICC-220 dataset plus their own data. They evaluated their method by measuring TPR, FPR and time complexity. To obtain the best results, three main parameters are used in the FCM algorithm: the number of clusters, the maximum number of clusters to create and the minimum amount of improvement. Their results depend on the dataset used: they noticed that the TPR on MICC-220 is superior to the one obtained on their own dataset, and that the former also exhibits a lower time complexity. This is perhaps due to the professionally forged images and the large number of high-resolution images in their dataset, compared to the MICC-220 dataset.

c) Other: Most existing algorithms focus on block-based methods, keypoint-based methods or both. Below, we survey other recently presented works using other detection methods.

In 2017, Junlin Ouyang et al. [12] proposed a new method using convolutional neural networks to detect copy-move forgery. Using a small sample of training data, they slightly modify a network architecture taken from existing models trained on a database such as ImageNet. To accomplish their work, they first built their own handcrafted dataset containing about 10000 images; they also used both the OXFORD and the UCID datasets. Subsequently, the convolutional neural network (CNN) was initialized while fine-tuning some of the parameters. Eventually, they obtain results by inputting test images into the trained model. They achieved good performance on both the OXFORD and the UCID datasets, with 2.32% and 2.43% test error respectively. However, they got very poor performance on the handcrafted database, with 42% test error, due to the random tampering operation.

For a deeper study of CMFD techniques see [13]. S. Teerakanok et al. presented a review and analysis of CMFD. In addition, they proposed a new CMFD framework that considers CMFD techniques as a single framework, no longer divided into block-based and keypoint-based techniques.

2) Splicing forgery: Splicing combines different elements from multiple images into a single image to create a fake one, see Fig. 2. The tampering traces cannot be seen and can hardly be tracked even if no post-processing is done [14]. Splicing can be detected by looking for its effect on the image statistics, by noticing the lighting levels in the image, or by searching for the boundaries of the extraneous region [3]. Recently, some methods have been presented to detect image splicing forgery.

Fig. 2. Splicing & Imitation forgery (panels: Splicing, Imitation).
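The nearest-neighbour matching step described for keypoint-based methods can be sketched as follows. This is an illustrative example only: the descriptors are small hand-written vectors standing in for real SIFT or SURF descriptors, the function name and parameters are hypothetical, and the acceptance rule (nearest distance clearly smaller than the second-nearest, often called a 2NN ratio test) is one common way to apply a threshold, assumed here rather than taken from a specific cited work.

```python
import math


def match_keypoints(keypoints, ratio=0.6, min_shift=10.0):
    """Match keypoints inside one image.

    keypoints: list of ((x, y), descriptor) tuples.
    A keypoint is matched to its nearest neighbour in descriptor space when
    that neighbour is much closer than the second-nearest one (ratio test)
    and the two keypoints are spatially separated.
    """
    matches = []
    for i, (pos_i, desc_i) in enumerate(keypoints):
        # Distances from keypoint i to every other keypoint, ascending.
        candidates = sorted(
            (math.dist(desc_i, desc_j), pos_j)
            for j, (pos_j, desc_j) in enumerate(keypoints)
            if j != i
        )
        if len(candidates) < 2:
            continue
        (d1, pos_1), (d2, _) = candidates[0], candidates[1]
        if d1 < ratio * d2 and math.dist(pos_i, pos_1) >= min_shift:
            pair = tuple(sorted((pos_i, pos_1)))
            if pair not in matches:  # each pair is reported once
                matches.append(pair)
    return matches


# Toy descriptors (assumed values): the first two are nearly identical,
# as would happen for a region copied and pasted within the same image.
demo_keypoints = [
    ((10, 10), (0.9, 0.1, 0.0, 0.20)),
    ((80, 40), (0.9, 0.1, 0.0, 0.21)),
    ((30, 60), (0.1, 0.8, 0.3, 0.00)),
    ((55, 20), (0.0, 0.2, 0.9, 0.40)),
    ((70, 75), (0.5, 0.5, 0.5, 0.50)),
]

if __name__ == "__main__":
    print(match_keypoints(demo_keypoints))
```

In practice the matched pairs would then be grouped, for example by clustering, and outliers removed before the duplicated regions are reported.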
In 2013, using both the discrete cosine transform (DCT) and Local Binary Patterns (LBP), Amani A. Alahmadi et al. [15] presented a new method to detect splicing forgery. First, they divided the input image into overlapping blocks. Then, for each block, they computed the LBP and transformed it into the frequency domain using the 2D DCT. Finally, they used the standard deviation to extract features, and used an SVM for classification.

They worked on the chromatic channel of the input image. They used the CASIA databases and the Columbia Uncompressed Image Splicing Detection Evaluation Dataset. To evaluate the performance of the proposed method, they calculated the accuracy, which reached 97%.

Most methods have been proposed to detect either splicing or CM forgery; however, in 2017, Chandhany Shyan et al. [16] proposed a method that aims to detect both splicing and CM forgery using the same dataset. This method combines the block discrete cosine transform (DCT) and Zernike moments. The process combines two main steps: finding image forgery using an SVM classifier, and classifying the output into one of the two forgery types.

The proposed method extracts the features of a color image based on a developed threshold method. First they used the DCT to transform non-overlapping blocks of an image into matrices, from which the discriminative features for forgery detection are extracted using an enhanced threshold method. Before that, to minimize the effect caused by the diversity of the image content, they applied a pre-processing step.

For copy-move forgery detection they used a feature extraction technique. Afterwards, they used the PatchMatch algorithm, implementing three steps: initialisation, propagation and random search. After the feature matching process, they used a post-processing step to increase the likelihood of detecting the forgery properly without raising a false alarm of copy-move forgery.

They used the CASIA dataset in their experiments. CASIA v1.0 consists of 1721 images, both authentic and tampered using splicing. CASIA v2.0 is larger: it contains 12614 color images, both authentic and tampered. The experimental results showed the effectiveness of the proposed method: it gave an accuracy of 99.50% at threshold T = 8 for the splicing forgery type and 87.50% for the copy-move type.

3) Imitation: This type of forgery is generally used for document forgery, where text is added or modified by trying to imitate the font style and size of the text in the document [17]. (Fig. 2.)

In 2010, J. Beusekom et al. [18] proposed a technique that detects text lines that were manipulated or added to a digital document. It is based on measuring the rotation and the alignment of the text to detect errors in these text-line features. They performed the following steps: extracting text lines, calculating the alignment lines, calculating distances between these lines and finally, based on these distances, classifying the lines as having usual or unusual alignment.

In 2013, Romain Bertrand et al. [17] presented a method based on character shape comparison, where each character is described by a feature vector. To compare two characters they calculate the distance (D) between two vectors. They use Hu invariant moments to extract descriptors. Then, the feature vectors of the same alphabetic class are compared using the Euclidean distance. They also relied on the alignment, rotation and size of characters, since a fake added character may show conception errors introduced during the production of the forgery, see Fig. 3.

Fig. 3. Conception errors (example strings 456, 5410, 4897): a. size issue, 6 is larger; b. skew issue, 4 is slanted; c. alignment issue, 8 is misaligned.

The experiments showed good results for the fraudulent document detection experiment (0.77 for recall and 0.82 for precision) but low precision in both the shape similarity experiment and outlier detection.

Other techniques: Other research was based on determining the source printer of a document to detect forgery. It was noticed that each printer has different texture characteristics that can help to detect whether the document has been manipulated. For example, in 2017, M. Tsai et al. [19] proposed a method based on printer sources. In their study, they used two systems: the first is based on the SVM method and the other is a CNN they developed.

The database was constructed using text documents and images. The text includes characters from different languages, such as the English character "e" and the Arabic character "Djim". They also used benchmark images such as Peppers, Lena and Baboon. The results showed high accuracy across the experiments: the lowest accuracy was 79.71% and the highest was 99.36% for different language characters and different CNN architectures, while the highest accuracy for images was 99.95%, using the Lena image. On the other hand, the SVM classification system showed good performance with high accuracy (accuracy=99.97) for the Peppers image.

In 2017, Francisco Cruz et al. [20] proposed a method whose aim was to detect regions tampered by a direct modification without any post-processing. Their method was based on the idea that the background of the forged image would not be coherent and consistent, and that the counterfeit region would appear different from the neighbouring intact regions. In their experiments they used a handmade dataset of 200 documents, each of which contains at least one forgery operation. In this way they collected 481 forgery instances of different types (copy-move, imitation and region cuts). They used an SVM as classifier in their experiments, with cross-validation. For the results, they showed that they were able to detect the forged regions with 7.38% and with a false positive ratio of 0.05%.

In 2017, H. Benhamza et al. [21] proposed a method to detect forgery of administrative digital documents based on an analytic study of scanned documents. To this end, they calculate the similarity of the document image according to the three basic components of the document: background texture, text, and stamp or signature. They divided the image into non-overlapping blocks of different sizes. Then, they calculated the mean and the standard deviation of each block to measure the homogeneity of the image. Using these measurements they extract a feature vector. They used the SVM machine learning technique on a
handmade dataset and the best accuracy they obtained was 85.24%.

In 2018, Shruti Ranjan et al. [22] presented a method for digital image forgery detection. They enhanced the quality of the images using histogram equalization. Afterwards, they removed noise using a median filter and segmented the images using K-means clustering. They extracted features using the Gray Level Co-occurrence Matrix (GLCM), a matrix used to analyse texture through a set of statistical measurements. The first classifier used was a linear kernel SVM; they then used an Artificial Neural Network (ANN) classifier, which showed good results compared to those given by the linear SVM (ANN accuracy = 96.4%).

Some methods have been proposed for image and document forgery detection based on multispectral image analysis. Multispectral cameras can capture extra information across the ultraviolet, visible, infrared and thermal ranges, which can help to detect whether an image has been manipulated.

For example, Muhammad Jaleed Khan et al. proposed in [23] a method based on Fuzzy C-means Clustering (FCM), using the publicly available UWA Writing Inks Database. Also, Yue Zheng et al. presented in [24] a data-device hash for image forgery detection. This new device is used to identify the forged areas in the image and can identify the camera source as well. In this work they used an adjusted CASIA database, which is a combination of pictures taken from the CASIA ITDE v1.0 and the CASIA ITDE v2.0 databases. The proposed method shows a high tamper detection rate (TDR) of 95.42%. It has also proven to be highly accurate in determining the type of camera with which the picture was taken, reaching an accuracy of 99%.

B. Active methods

Active forgery detection techniques, such as watermarking and digital signatures, are considered image protection tools. In these methods, a known authentication code is placed in the image before transmission, and its credibility is then examined by the receiver. The drawback is their limitation: only pre-processed images can be verified. The main advantage is that they are more certain and more reliable than passive forgery detection techniques [4], [25].

Steganography is an active method; some works have used it for image authentication and integrity verification. It is a means of confidential communication between a sender and a receiver. The sender inserts a secret message into a digital image called the cover image; after the image reaches the receiver, the receiver can extract the secret message from it.

In 2009, Abbas Cheddad et al. [26] proposed a technique that protects digital documents from falsification. The aim of this work was to propose a new approach, motivated by the security weaknesses displayed by existing techniques. It uses different techniques, such as the wavelet transform, to develop a secret message for digital document encryption.

In 2014, Ahmed Pahlavan Tafti et al. [27] proposed a method that uses cellular automata (CA) for image forgery detection, in which they proposed two scenarios. The first scenario uses cellular automata and lower-upper (LU) decomposition, and the second uses CA and singular value decomposition. Their aim in proposing this method was to protect digital images from tampering by including an encrypted and unpredictable key in the image. In the first scenario, they obtain the matrix corresponding to the original image, then perform lower-upper (LU) decomposition to obtain the L and U matrices. Afterwards, statistical information is computed and stored in an array list format. After that, they use cellular automata with an XOR local rule to embed the secret code into the LSB of each of the first eight pixels of the input image. The result is an image that contains the cipher key used to detect image forgery. In the second scenario, they used an RGB image instead of the grayscale image used in the first scenario. First, they obtained the red, blue and green matrices of the input image. Afterwards, they computed the eigenvalues and the eigenvectors of the red and the green channels to create a feature vector (key) that was then embedded into the blue matrix. They created their own dataset using official digital documents, medical images and portraits, in grayscale and RGB, in PNG and JPEG format. The results of the proposed method were analysed using different criteria, such as true and false alerts, time consumption and performance, and visual quality.

Other works focused on protecting the copyright of the image owner. To achieve that, they used watermarking techniques.

In 2018, Dayand G. Savaka et al. [28] proposed a hybrid method that combines the blind and non-blind watermarking techniques. They used the blind technique for the internal watermarking and the non-blind technique for the external watermarking. To achieve that, they followed two processes. First, in the embedding process, they embed the secret watermark into the internal cover image using DWT with the blind watermarking technique, which yields the internal watermark. Afterwards, they embed it into the external cover image using DWT with non-blind watermarking to get the hybrid watermarked image. For watermark extraction, they apply the reverse operation using a secret key to extract the secret watermark. To evaluate the performance of this technique, they used similarity measurements such as the correlation, structural similarity (SSIM) and PSNR between the original watermark and the extracted one. The results showed that this approach is robust against noise.

III. DISCUSSION

In TABLE I, we present a summary and a classification of the most important existing works on forgery detection. The classification is based on the feature extraction technique, the databases, the training method, the performance measurements and the obtained results.

Most works are on image forgery detection. For document forgery detection, the documents are considered as images, and particular characters are used to detect forgery, such as the "e" character for English and the "Djim" character for Arabic (the detected characters are processed as images).

Among the 21 analysed document works, only one [19] dealt with the Arabic language, and no related works are specific to
administrative documents that are characterised by CASIA, COMOFOD and UCID image databases are the
containing text, logos, stamps and signatures. most used sets in this field. While, for document images there
exist no existing sets yet as far as we know.
SVM is the most machine learning technique used for DCT and SURF are the most used techniques for feature
forgery detection for its accuracy and because it performs extraction.
well on datasets with limited number of samples.
Nevertheless, recent works such as [12] use deep learning. For performance measurements: TPR, FPR, Recall and
accuracy are used to evaluate the precision of forgery
detection methods.
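As an illustration, the measures named above can be derived from the four confusion-matrix counts of a binary forged/authentic decision. The following Python sketch is a minimal example under stated assumptions: the names `ConfusionCounts`, `count_outcomes` and `detection_metrics`, the label convention (1 = forged, 0 = authentic) and the toy labels are introduced here for illustration only and are not taken from any of the surveyed works. Note that recall and TPR are the same quantity.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary forgery detector (hypothetical helper type)."""
    tp: int  # forged samples correctly flagged as forged
    fp: int  # authentic samples wrongly flagged as forged
    tn: int  # authentic samples correctly accepted
    fn: int  # forged samples that were missed


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Count outcomes, assuming binary labels: 1 = forged, 0 = authentic."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == 1 and guess == 1:
            tp += 1
        elif truth == 0 and guess == 1:
            fp += 1
        elif truth == 0 and guess == 0:
            tn += 1
        else:  # truth == 1 and guess == 0 under the binary-label assumption
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute TPR (recall), FPR, precision and accuracy from the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "tpr": _ratio(c.tp, c.tp + c.fn),        # true positive rate = recall
        "fpr": _ratio(c.fp, c.fp + c.tn),        # false positive rate
        "precision": _ratio(c.tp, c.tp + c.fp),  # share of flagged samples that are forged
        "accuracy": _ratio(c.tp + c.tn, total),  # share of all correct decisions
    }


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    guess = [1, 1, 0, 0, 1, 0, 0, 1]
    for name, value in detection_metrics(count_outcomes(truth, guess)).items():
        print(f"{name}: {value:.2f}")
```

Reporting TPR together with FPR, rather than accuracy alone, matters when forged samples are rare, since a detector that accepts everything can still reach a high accuracy.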

TABLE I. Summary of the used methods in image and document forgery detection

| Forgery detection type | Forgery method | Author(s) | Techniques | File type | Dataset(s) | Evaluation metrics | Results |
|---|---|---|---|---|---|---|---|
| Passive | CMF | Beste Ustubioglu et al./ 2015 [6] | Colour moments | Image | handmade | accuracy and false positive ratio | 0.9981 and 0.0205 |
| Passive | CMF | Mahale Vivek Hilal et al./ 2017 [7] | Histogram of orientated gradient (HOG) | Image | CoMoFoD | False accepted rate (FAR), False rejected rate (FRR) | 0.82 FAR, 0.17 FRR |
| Passive | CMF | M. Hazim et al./ 2018 [8] | DCT | Image | CoMoFoD | FN, FP, precision and recall | precision=63.52%, recall=97.89% |
| Passive | CMF | Xu Bo et al./ 2010 [11] | SURF | Image | UCID | Scaling, Rotation, Blurring, Noise | |
| Passive | CMF | Badal Soni et al./ 2018 [9] | SURF and FAST | Image | MICC-F220, MICC-F2000 and MICC-F8multi | TPR, FPR and execution time | 97.4%, 8.6% and 9.2 sec |
| Passive | CMF | Tingge Zhu et al./ 2019 [10] | LBPRC and CR | Image | CMH and CoMoFoD_small_v2 | recall, precision and F1 | 93.65%, 97.98% and 96.58% |
| Passive | CMF | Hesham A. Alberry et al./ 2018 [1] | SIFT, Fuzzy C-means clustering | Image | MICC-220 and handmade | TPR, FPR and time complexity | MICC-220 (TPR=99.09%, FPR=9.09, TC=16min), handmade (TPR=71.69%, FPR=10.83%, TC=1h15min) |
| Passive | CMF | Junlin Ouyang et al./ 2017 [12] | CNN | Image | handmade, OXFORD and UCID | Error rate | 2.32%, 2.43% and 42% |
| Passive | Splicing | Amani A. Alahmadi et al./ 2013 [15] | DCT, LBP and SVM | Image | CASIA and Columbia | accuracy | 97% |
| Passive | Splicing | Choudhary Shyam et al./ 2017 [16] | Du, Zernike moments and SVM | Image | CASIA | accuracy | 99.5% (splicing), 87.5% (CMF) |
| Passive | Imitation | Romain Bertrand et al./ 2013 [17] | Hu invariant moment, Euclidean distance | Document | handmade | recall and precision | recall=0.77, precision=0.82 |
| Passive | Imitation | J. Beusekom et al. [18] | Branch-and-bound, Gaussian distribution and Bayesian formulation | Document | handmade | accuracy | 99.5% |
| Passive | Imitation | M. Tsai et al./ 2017 [19] | SVM and CNN | Image | Document, handmade and benchmarked image | accuracy | 99.36% for documents and 99.97% for images |
| Passive | Imitation | Francisco Cruz et al./ 2017 [20] | SVM | Document | handmade | TPR and FPR | 7.38% forged patches, 0.05% FP ratio |
| Passive | Other | H. Benhamza et al./ 2017 [21] | SVM | Document | handmade | accuracy | 85.24% |
| Passive | Other | Shruti Ranjan et al./ 2018 [22] | K-means clustering, SVM and ANN | Document | handmade | accuracy | 96.4% |
| Passive | Other | Muhammad Jaleed Khan et al./ 2018 [23] | Fuzzy C-means | Document | UWA Writing Inks | accuracy | 76% |
| Passive | Other | Yue Zheng et al./ 2019 [24] | SURF, DCT | Image | the modified CASIA | FRR, TDR | 1.5%, 95.42% |
| Active | | Abbas Cheddad et al./ 2009 [26] | Steganography / Jarvis kernel for halftoning | Image and Document | | | |
| Active | | Ahmed Pahlavan Tafti et al./ 2014 [27] | Cellular automata | Image | handmade | | |
| Active | | Dayanand G. Savakar et al./ 2018 [28] | Watermarking, DWT | | | correlation, PSNR and SSIM | |

IV. CONCLUSION

In this paper, we introduced image forgery detection, its methods and tools. We showed the most used techniques for the feature extraction and training stages, and we presented the most used datasets and the performance measurements used for evaluation. We found that few works deal with Arabic documents: only one does, and it treats Arabic in the same way as other languages.

This paper serves as a preliminary survey for future work on detecting forgery in Arabic administrative documents, which involves analysing such documents and preparing a training image dataset.

REFERENCES

[1] H. A. Alberry, A. A. Hegazy and G. I. Salama, “A fast SIFT based method for copy move forgery detection,” Future Computing and Informatics Journal, Elsevier, vol. 3, pp. 159–165, 2018.
[2] K. R. Revi and M. Wilscy, “Scale invariant feature transform based copy-move forgery detection techniques on electronic images - A survey,” in 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), 2017, pp. 2315–2318.
[3] N. Kanagavalli and L. Latha, “A survey of copy-move image forgery detection techniques,” in 2017 International Conference on Inventive Systems and Control (ICISC), 2017, pp. 1–6.
[4] G. K. Birajdar and V. H. Mankar, “Digital image forgery detection using passive techniques: A survey,” Digital Investigation, Elsevier, vol. 10, pp. 226–245, 2013.
[5] N. B. A. Warif, A. W. A. Wahab, M. Y. I. Idris, R. Ramli, R. Salleh, S. Shamshirband and K.-K. R. Choo, “Copy-move forgery detection: Survey, challenges and future directions,” Journal of Network and Computer Applications, Elsevier, vol. 75, pp. 259–278, 2016.
[6] B. Ustubioglu, V. Nabiyev, G. Ulutas, and M. Ulutas, “Image forgery detection using colour moments,” in 2015 38th International Conference on Telecommunications and Signal Processing (TSP). IEEE, 2015, pp. 540–544.
[7] M. V. Hilal, P. Yannawar, and A. T. Gaikwad, “Image inconsistency detection using histogram of orientated gradient (hog),” in 2017 1st International Conference on Intelligent Systems and Information Management (ICISIM). IEEE, 2017, pp. 22–25.
[8] M. H. Alkawaz, G. Sulong, T. Saba, and A. Rehman, “Detection of copy-move image forgery based on discrete cosine transform,” Neural Computing and Applications, vol. 30, no. 1, pp. 183–192, 2018.
[9] B. Soni, P. K. Das, and D. M. Thounaojam, “Improved block-based technique using surf and fast keypoints matching for copy-move attack detection,” in 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN). IEEE, 2018, pp. 197–202.
[10] T. Zhu, J. Zheng, Y. Lai, and Y. Liu, “Image blind detection based on lbp residue classes and color regions,” PloS one, vol. 14, no. 8, 2019.
[11] X. Bo, W. Junwen, L. Guangjie, and D. Yuewei, “Image copy-move forgery detection based on surf,” in 2010 International Conference on Multimedia Information Networking and Security. IEEE, 2010, pp. 889–892.
[12] J. Ouyang, Y. Liu, and M. Liao, “Copy-move forgery detection based on deep learning,” in 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISPBMEI). IEEE, 2017, pp. 1–5.
[13] S. Teerakanok and T. Uehara, “Copy-move forgery detection: A state of the art technical review and analysis,” IEEE Access, vol. 7, pp. 40 550–40 568, 2019.
[14] D. Vaishnavi and T. Subashini, “Recognizing image splicing forgeries using histogram features,” in 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC). IEEE, 2016, pp. 1–4.
[15] A. A. Alahmadi, M. Hussain, H. Aboalsamh, G. Muhammad, and G. Bebis, “Splicing image forgery detection based on dct and local binary pattern,” in 2013 IEEE Global Conference on Signal and Information Processing. IEEE, 2013, pp. 253–256.
[16] C. S. Prakash, A. Kumar, S. Maheshkar, and V. Maheshkar, “An integrated method of copy-move and splicing for image forgery detection,” Multimedia Tools and Applications, vol. 77, no. 20, pp. 26 939–26 963, 2018.
[17] R. Bertrand, P. Gomez-Krämer, O. R. Terrades, P. Franco, and J.-M. Ogier, “A system based on intrinsic features for fraudulent document detection,” in 2013 12th International Conference on Document Analysis and Recognition. IEEE, 2013, pp. 106–110.
[18] J. Van Beusekom, F. Shafait, and T. M. Breuel, “Text-line examination for document forgery detection,” International Journal on Document Analysis and Recognition (IJDAR), vol. 16, no. 2, pp. 189–207, 2013.
[19] M.-J. Tsai, Y.-H. Tao, and I. Yuadi, “Deep learning for printed document source identification,” Signal Processing: Image Communication, vol. 70, pp. 184–198, 2019.
[20] F. Cruz, N. Sidere, M. Coustaty, V. P. D’Andecy, and J.-M. Ogier, “Local binary patterns for document forgery detection,” in 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1. IEEE, 2017, pp. 1223–1228.
[21] H. Benhamza and A. Djeffal, “Détection des faux documents administratifs par machines à vecteurs supports,” Fifth International Conference on Image and Signal Processing and their Applications, University of Abdelhamid Ibn Badis, Mostaganem, Algeria, 3rd - 4th December, 2017.
[22] S. Ranjan, P. Garhwal, A. Bhan, M. Arora, and A. Mehra, “Framework for image forgery detection and classification using machine learning,” in 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI). IEEE, 2018, pp. 1–9.
[23] M. J. Khan, A. Yousaf, K. Khurshid, A. Abbas, and F. Shafait, “Automated forgery detection in multispectral document images using fuzzy clustering,” in 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 2018, pp. 393–398.
[24] Y. Zheng, Y. Cao, and C.-H. Chang, “A puf-based data-device hash for tampered image detection and source camera identification,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 620–634, 2019.
[25] R. Dobre, R. Preda, and A. Marcu, “Improved active method for image forgery detection and localization on mobile devices,” in 24th International Symposium for Design and Technology in Electronic Packaging. IEEE, 2018, pp. 255–260.
[26] A. Cheddad, J. Condell, K. Curran, and P. Mc Kevitt, “A secure and improved self-embedding algorithm to combat digital document forgery,” Signal Processing, vol. 89, no. 12, pp. 2324–2332, 2009.
[27] A. P. Tafti and H. Hassannia, “Active image forgery detection using cellular automata,” in Cellular Automata in Image Processing and Geometry. Springer, 2014, pp. 127–145.
[28] D. G. Savakar and A. Ghuli, “Robust invisible digital image watermarking using hybrid scheme,” Arabian Journal for Science and Engineering, vol. 44, no. 4, pp. 3995–4008, 2019.
