
International Journal of Advance Foundation and Research in Computer (IJAFRC)

Volume 2, Issue 6, June - 2015. ISSN 2348 4853

To Design And Algorithm Using Zero Watermarking With Stegnography For Text Document
Pradeep Kaur, PG Scholar, PTU Jalandhar
Pankaj Bhambri, Assistant Professor, GNDEC Ludhiana

ABSTRACT
With the wide use of communication technologies and the Internet, it has become extremely easy
to reproduce, communicate, and distribute digital content. As a result, authentication and
copyright protection issues have arisen. Besides image, audio, and video, text is the most widely used
medium for carrying data over the Internet. Plain text appears in many forms, such as books,
newspapers, web pages, advertisements, research papers, legal documents, letters, novels, poetry, and
many other documents. The security of text content is therefore a significant copyright protection issue
that cannot be overlooked. In this thesis, I propose a zero-watermarking approach to text watermarking:
a zero text watermarking algorithm based on the occurrence frequency of vowel ASCII characters and
articles for copyright protection of plain text. The watermarks used in the embedding process are short.
The embedding algorithm uses the frequency of vowel characters and articles to generate a specialized
author key. The extraction algorithm uses this key to extract the watermark and hence identify the
original copyright owner. A steganography technique is also used to provide better security for text
documents. Experimental results illustrate the effectiveness of the proposed algorithm on text
documents subjected to various tampering attacks performed by different independent attackers, and
the results are also compared with recent work on text watermarking.

I. INTRODUCTION

Securing digital content has gained remarkable importance in the current digital era.
The Internet has become an essential part of daily life for transferring different forms of data such as
emails, articles, newspapers, websites, images, audio, video, commercials, and opinion blogs.
Information on the Internet is mostly in the form of text, and the copyright protection of text is one of
the main concerns of its original author. Text is the most important and core part of legal documents,
reports, and journals, but its security has been seriously neglected. The threats of electronic publishing,
such as illegal copying and redistribution of copyrighted material, plagiarism, and other forms of
copyright violation, need to be explicitly addressed, particularly for plain text.
A. Information Hiding
Information hiding techniques are essential in a large number of application areas.
Digital audio, video, text, and images are increasingly furnished with imperceptible identification
marks. The identification mark can be an author name, copyright information, a serial number, or
any other message. Information hiding techniques help to prevent unauthorized copying and thus
help meet the requirements for securing digital content. Information hiding is a general term that
covers several subfields, such as cryptography, watermarking, and steganography.
17 | 2015, IJAFRC All Rights Reserved

www.ijafrc.org

B. Steganography
The word steganography comes from the Greek steganos, which means covered or secret, and graphy,
which means writing or drawing. Steganography therefore literally means covered writing. It is
the art and science of hiding information such that neither its presence nor the fact that a
communication is taking place can be detected [13]. Secret information is encoded in such a manner
that the very existence of this information is concealed.
C. Watermarking
Watermarking is a branch of information hiding that is used to hide additional information in digital
media such as image, audio, video, or text. Digital watermarking refers to the process of
embedding given watermark information in the content to be protected (a picture, audio, video, or
text) and later extracting that watermark information from it, in a way that is not
perceived by the human perceptual system. A digital watermark is a visible or invisible identification
code that is permanently embedded in the data to carry hidden information. It remains present in the
data even after the decryption process. It usually provides copyright protection by embedding a digital
signal or watermark information that is unique to the copyright owner to be protected. The watermark is
later used by certifying authorities to identify the original copyright owner.
D. Text Watermarking
Text is the most extensively used medium of communication on the Internet. The major
component of websites, books, newspapers, articles, and legal documents is simply plain text. Therefore,
plain text requires the utmost protection and security from copyright violators. In the past, a number of
digital watermarking algorithms have been proposed for images, audio, and video; however, digital
watermarking algorithms for plain text remain inadequate and ineffective.
Digital watermarking is the process of embedding a unique digital watermark in digital content to
protect it from illegal copying and copyright violations. The process of embedding a digital watermark
in, and extracting it from, a digital text document so that it uniquely identifies the original copyright
owner of that text is called digital text watermarking. Text watermarking abides by the same principles
as image, audio, or video watermarking. The watermark should remain resilient to random tampering
attacks, be undetectable to anybody but the original owner/author of the text, and be easily and fully
automatically reproducible by the watermark extraction algorithm. The main difficulty in text
watermarking is that plain text contains less redundant information than images, audio,
and video, and it is this redundancy that steganography and watermarking exploit for secret
communication.
E. Contribution towards thesis
A number of developments in the field of text watermarking have been made so far. This thesis
contributes to text watermarking by exploiting text constituents such as vowel characters and
articles. Watermarks built on these constituents remain resistant to tampering attacks. The main
contributions of this thesis are:

- A novel zero-watermarking approach has been adopted for text watermarking.
- Encryption, steganography, and watermarking are combined to provide a robust text
  watermarking solution.
- The resilience against attacks on text has been improved, and the watermark has been made
  robust to attacks of varying length.
- The algorithms are tested under insertion, deletion, and combined tampering attacks in both
  localized and dispersed forms.
- The proposed technique provides optimal results using vowel characters and articles.

II. BACKGROUND OF THE PROPOSED ALGORITHM


The proposed algorithm uses vowel characters to watermark the text document. The original owner of
the text generates a key using the watermark embedding algorithm. This is a zero-watermarking
algorithm: the text document remains unchanged when it is watermarked, because the author's key is
generated from properties of the text without modifying it. The text document is first analyzed, and
the articles in the text are identified. The average frequency of articles (AFP) is obtained, and on that
basis the text is partitioned. The most frequently occurring vowel characters are then counted to build
the MOV list, that is, the maximum occurring vowel characters list. This list is used to generate the
author key for a particular watermark given by the original owner.
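The partitioning and vowel-counting steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not specify how the article frequency determines partition boundaries, so the rule used here (close a partition after every `articles_per_partition` articles), the choice of one vowel per partition, the tie-breaking rule, and all function names are assumptions.

```python
import re
from collections import Counter

ARTICLES = {"a", "an", "the"}
VOWELS = "aeiou"

def partition_by_articles(text, articles_per_partition):
    """Split the text into word lists, closing a partition after every N-th article.

    The partition rule is an assumption made for illustration only.
    """
    words = re.findall(r"[a-z']+", text.lower())
    partitions, current, seen = [], [], 0
    for word in words:
        current.append(word)
        if word in ARTICLES:
            seen += 1
            if seen == articles_per_partition:
                partitions.append(current)
                current, seen = [], 0
    if current:  # keep the trailing words as a final partition
        partitions.append(current)
    return partitions

def max_occurring_vowels(partitions):
    """Build a MOV list: the most frequent vowel of each partition."""
    mov = []
    for words in partitions:
        counts = Counter(ch for word in words for ch in word if ch in VOWELS)
        # Ties are broken alphabetically; a partition without vowels yields "-".
        best = min(VOWELS, key=lambda v: (-counts[v], v)) if counts else "-"
        mov.append(best)
    return mov

sample = "The cat sat on a mat. An owl took the food to a tree."
parts = partition_by_articles(sample, articles_per_partition=2)
print(max_occurring_vowels(parts))  # prints ['a', 'o', 'o']
```

Note that the text itself is only read, never modified, which is the defining property of a zero-watermarking scheme.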
The proposed algorithm combines watermarking, steganography, and encryption. The original author
embeds the copyright information in a text, and the embedding algorithm generates the watermark key,
so the existence of the watermark remains hidden.
The watermarking process involves two stages: watermark embedding and watermark extraction.
Watermark embedding is done by the original author, and extraction is done later by the certifying
authority (CA) to prove ownership. The original copyright owner of the text inputs a watermark, and a
unique key is generated from the input text. This key is used later to extract the watermark whenever a
copyright conflict arises. In the proposed algorithm, the original watermark is not needed at the time of
watermark extraction, and the text is not altered by watermarking. The original owner
registers the copyright with the trusted certification authority, and that authority decides whenever
a copyright conflict arises.
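The two-stage flow (key generation by the author, extraction with only the key and the text) can be illustrated with a small round trip. The paper does not state how the watermark and the MOV list are combined into the key, so the XOR rule below is purely a hypothetical stand-in; the function names, the sample watermark, and the sample MOV lists are likewise assumptions.

```python
def generate_key(watermark, mov):
    """Derive an author key from the watermark and a text-derived MOV list.

    XOR-ing character codes is a hypothetical combination rule, chosen only
    because it is trivially reversible; it is not taken from the paper.
    """
    if not mov:
        raise ValueError("MOV list must not be empty")
    return [ord(ch) ^ ord(mov[i % len(mov)]) for i, ch in enumerate(watermark)]

def extract_watermark(key, mov):
    """Recover the watermark from the key and the MOV list of the disputed text."""
    if not mov:
        raise ValueError("MOV list must not be empty")
    return "".join(chr(k ^ ord(mov[i % len(mov)])) for i, k in enumerate(key))

# Embedding: the key is registered with the CA; the text itself is untouched.
mov_original = ["a", "o", "o", "e", "i"]
key = generate_key("PK2015", mov_original)

# Extraction on the unmodified text reproduces the watermark exactly.
print(extract_watermark(key, mov_original))  # prints PK2015

# After tampering, one partition's most frequent vowel has changed,
# so one watermark character is recovered incorrectly.
mov_attacked = ["a", "o", "e", "e", "i"]
print(extract_watermark(key, mov_attacked))  # prints PK8015
```

In this toy setting the extracted watermark degrades gradually as more partitions are disturbed, which is the behaviour that the watermark accuracy figures in the results section quantify.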
A typical watermarking process is shown in Figure 1, with the embedding scheme on the left and
the extraction scheme on the right.

Figure 1: A Typical Generic Watermarking Scenario With Watermark Embedding And Extraction
Processes Shown On The Left And Right Respectively

A) Embedding Algorithm
The algorithm that embeds the watermark in the text is called the embedding algorithm. It embeds
the watermark logically, without making any changes to the text document, and generates the
author key. Steganography is also applied to provide better security.
The flowchart for the embedding algorithm is shown in Figure 2.

Start
1. Load cover text and display its parameters
2. Load a watermark and display its parameters
3. Apply text watermarking and generate secret key
4. Apply attack on watermarked encrypted text
5. Load cover image
6. Apply steganography using LSB technique on attacked text and cover image
7. Save stego image obtained

Figure 2: Flowchart For The Watermark Embedding Algorithm.
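The LSB steganography step of the flowchart can be sketched as below. This is a generic least-significant-bit scheme, not necessarily the exact variant the authors used: the cover image is modeled as a flat sequence of 8-bit pixel values rather than a real image file, and the 4-byte length header, the bit order, and the function names are assumptions made so that the example is self-contained.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytearray:
    """Hide message in the least significant bit of successive pixel values."""
    # A 4-byte big-endian length header tells the extractor where to stop.
    payload = len(message).to_bytes(4, "big") + message
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return stego

def extract_lsb(stego: bytes) -> bytes:
    """Read the length header, then that many message bytes, from the LSBs."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + i] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)

cover = bytes(range(256)) * 2             # stand-in for 512 grayscale pixels
secret = "attacked text".encode("utf-8")  # text to hide in the cover
stego = embed_lsb(cover, secret)
print(extract_lsb(stego).decode("utf-8"))  # prints attacked text
```

Because only the lowest bit of each pixel changes, no pixel value moves by more than 1, which is why LSB embedding produces very small MSE and correspondingly high PSNR between cover and stego image.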
B) Extraction Algorithm
The extraction algorithm is used to extract the watermark from the text. It takes the key as input to
extract the watermark from the document. Using the articles and vowels of the entire document makes
the algorithm more resistant to attacks, so the watermark remains robust after various attacks. The
watermark extraction process is shown in Figure 3.
1. Load stego image
2. Extract attacked text by using LSB technique
3. Apply secret key
4. Apply de-watermarking and extract secret message
5. Analyze our result and display MSE, PSNR and accuracy of watermark

Figure 3: Flowchart for the watermark extraction algorithm.


III. RESULTS AND DISCUSSIONS
To evaluate the performance of the proposed algorithm, text samples were attacked by different
individuals. The attacks may alter the characteristics of the original files, but the overall theme of
the text remains the same. An attacker who wants to defeat the copyright performs attacks that alter
the text, and the resulting attack files differ in attack volume. The effect of the tampering attacks on
the text files was examined by evaluating the accuracy of the retrieved watermarks, with experiments
covering insertion and deletion attacks. Inserting data into and deleting data from the text are the most
common attacks on text documents. Further, the proposed algorithm is compared with a previous
algorithm that is based on prepositions.
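Insertion and deletion attacks of the kind described above can be simulated with a few lines of code. In the paper the attacks were carried out manually by different individuals, so the sketch below is only an assumed way of reproducing comparable tampering automatically; the function names, the `localized` switch, and the fixed random seed are illustrative choices.

```python
import random

def insertion_attack(words, extra_words, rng, localized=False):
    """Insert extra_words as one contiguous run (localized) or scattered (dispersed)."""
    attacked = list(words)
    if localized:
        pos = rng.randrange(len(attacked) + 1)
        attacked[pos:pos] = extra_words
    else:
        for word in extra_words:
            attacked.insert(rng.randrange(len(attacked) + 1), word)
    return attacked

def deletion_attack(words, count, rng, localized=False):
    """Delete count words as one contiguous run (localized) or scattered (dispersed)."""
    attacked = list(words)
    count = min(count, len(attacked))
    if localized:
        pos = rng.randrange(len(attacked) - count + 1)
        del attacked[pos:pos + count]
    else:
        # Delete from the highest index down so earlier indices stay valid.
        for index in sorted(rng.sample(range(len(attacked)), count), reverse=True):
            del attacked[index]
    return attacked

rng = random.Random(2015)  # fixed seed so the sketch is repeatable
original = "the quick brown fox jumps over the lazy dog".split()
attacked = insertion_attack(original, ["a", "very", "old"], rng)
attacked = deletion_attack(attacked, 2, rng, localized=True)
print(len(original), len(attacked))  # prints 9 10
```

Applying both functions in sequence gives a combined attack, and the ratio of inserted to deleted words controls the attack volume.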
To test the effect of tampering on the text, experiments were conducted to examine the accuracy of the
retrieved watermarks under insertion and deletion attacks. To further quantify the impact of
tampering on the watermark, the PSNR and MSE of the extracted watermarks were measured.
The proposed algorithm is also compared with the previous algorithm.
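The three reported measures can be computed as in the sketch below. The paper does not state on which signal MSE and PSNR are evaluated or how accuracy is defined, so this example assumes the standard definitions for 8-bit data (peak value 255) and a position-by-position character match for accuracy; all names are illustrative.

```python
import math

def mse(original, recovered):
    """Mean squared error between two equal-length sequences of numbers."""
    if len(original) != len(recovered) or not original:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, recovered)) / len(original)

def psnr(mse_value, peak=255):
    """Peak signal-to-noise ratio in dB for a given MSE (peak=255 assumes 8-bit data)."""
    if mse_value == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse_value)

def accuracy(original, recovered):
    """Percentage of positions at which the recovered watermark matches the original."""
    longest = max(len(original), len(recovered))
    if longest == 0:
        return 100.0
    matches = sum(a == b for a, b in zip(original, recovered))
    return 100 * matches / longest

print(round(psnr(0.0259), 1))                      # prints 64.0
print(round(accuracy("PK2015", "PK8015"), 2))      # prints 83.33
```

Under these assumptions an MSE of 0.0259 corresponds to a PSNR of about 64 dB, which is close to the 63.9911 reported for attack file 1 in Table 1 and suggests that the reported values follow the 8-bit definition.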
The performance of the proposed algorithm is shown in Table 1, which demonstrates the robustness of
the proposed algorithm and lists the accuracy for the different attack samples.
Table 1: Details of watermark accuracy of watermark (W1)

Parameters               Attack file 1   Attack file 2   Attack file 3   Average
Watermark Accuracy (%)   95              80              96              90
PSNR                     63.9911         63.7544         64.0555         63.9335
MSE                      0.0259          0.0274          0.0256          0.0263
Inserted Words           64              58              62              61.33
Deleted Words            36              42              38              38.66

Figure 4 shows the accuracy of watermark W1 used with the proposed algorithm, along with the
values of MSE and PSNR.

[Figure 4 plot: watermark accuracy, PSNR, and MSE for attack files A1, A2, and A3]


Figure 4: Accuracy Of Retrieved Watermark (W1) Under Tampering Attacks

Figure 5 shows the accuracy of watermark W2 used with the proposed algorithm, along with the
values of MSE and PSNR.

Figure 5: Accuracy of retrieved watermark W2 under tampering attacks


IV. COMPARATIVE RESULTS
The performance of the proposed algorithm is compared with a previous algorithm that is based on
prepositions. To facilitate the comparison, I used three different watermarks and three different
text files of varied length. The previous algorithm used longer watermarks, whereas I used
shorter ones. Table 2 shows the comparison of the two algorithms.
Table 2: Comparison of the two algorithms

Parameters               Previous algorithm    Proposed algorithm
                         W1        W2          W1        W2        W3
Watermark Accuracy (%)   75.92     67.22       90        74.49     97.50
Watermark Length         112       40          12        21        10

[Figure 6 bar chart: watermark accuracy of the previous and proposed algorithms for watermarks W1, W2, and W3]

Figure 6: Performance comparison of proposed algorithm with previous algorithm based on prepositions
Figure 6 compares the two algorithms in terms of watermark accuracy. The results show that the
watermark of the proposed algorithm is more robust even though the watermark length is shorter.
V. CONCLUSION
Digital watermarking is a very effective solution for authentication and copyright protection of digital
content and text documents. The security of text documents has gained very high importance. I have
therefore proposed a zero-watermarking algorithm for copyright protection of text documents. The
algorithm uses the occurrence frequency of articles and vowel characters in the text to protect it.
With its zero-watermarking approach, the algorithm provides a robust solution to the text watermarking
problem. It checks the occurrence frequency of each vowel character in a text and generates a
key using the intrinsic properties of the text. A steganography technique is also used, which provides
more security to the text files by hiding the text data within an image. The generated key is registered
with the CA, and when a conflict arises over copyright claims, the key is used to extract the
watermark from the digital content and identify the original owner.
I have tested the performance of the algorithm under tampering attacks, namely insertion and deletion
attacks, on 3 texts. I also compared the performance of the algorithm with the previous algorithm. The
results show that my algorithm is more robust even when the watermark length is shorter, and that it
is secure and efficient, with minimal computational requirements. The watermark remains resilient
after attacks, which makes it more efficient and robust.
VI. REFERENCES
[1] A. Khan et al., Optimizing Perceptual Shaping of a Digital Watermark Using Genetic Programming,
Iranian Journal of Electrical and Computer Engineering, vol. 3, 2004, pp. 144-150.
[2] Khan Asifullah, Intelligent perceptual shaping of a digital watermark, Diss. Ghulam Ishaq Khan
Institute of Engineering Sciences and Technology, 2006.
[3] Afzel Noore et al., Robust biometric image watermarking for fingerprint and face template
protection, IEICE Electronics Express, vol. 3, no. 2, 2006, pp. 23-28.
[4] B. Macq et al., A method of text watermarking using presuppositions, SPIE International
Conference on Security, Steganography, and Watermarking of Multimedia Contents, 2007.
[5] D. Huang and H. Yan, Interword distance changes represented by sine waves for watermarking text
images, IEEE Trans. Circuits and Systems for Video Technology, vol. 11-12, 2001, pp. 1237-1245.
[6] D. Neeta, K. Snehal, and D. Jacobs, Implementation of LSB Steganography and Its Evaluation for
various Bits, IEEE Conference paper, South Africa, December 2006, pp. 173-178.
[7] F. A. P. Petitcolas et al., Information hiding - A survey, Proceedings of the IEEE, vol. 87, no. 7, 1999,
pp. 1062-1077.
[8] Fahd N. Al-Wesabi, English Text Zero-Watermark Based on Markov Model of Letter Level Order
Two, vol. 3, 2012.
[9] Jaseena K.U. and Anita John, Text Watermarking using Combined Image and Text for
Authentication and Protection, International Journal of Computer Applications, vol. 20, 2011, pp.813.
[10] H. M. Meral et al., Natural language watermarking via morpho syntactic alterations, Computer
Speech and Language, vol. 23, 2009, pp. 107-125.
[11] H. M. Meral, E. Sevin et al., Syntactic tools for text watermarking, 19th SPIE Electronic Imaging
Conf. 6505: Security, Steganography, and Watermarking of Multimedia Contents, San Jose, 2007.
[12] J. Zhang, Q. Li, C. Wang, and J. Fang, A novel application for text watermarking in digital reading,
in Proceedings of International Conference on Artificial Intelligence and Computational
Intelligence (AICI 09), Shanghai, China, 2009, pp. 103-111.

[13] J Ingemar, et al., Secure spread spectrum watermarking for images, audio and video, Image
Processing International Conference, IEEE, vol. 3, 1996, pp. 243-246.
[14] J. T. Brassil, S. Low, and N. F. Maxemchuk, Copyright Protection for the Electronic Distribution of
Text Documents, Proceedings of the IEEE, vol. 87, no. 7, 1999, pp. 1181-1196.
[15] J. T. Brassil, S. Low, N. F. Maxemchuk, and L. O'Gorman, Electronic Marking and Identification
Techniques to Discourage Document Copying, IEEE Journal on Selected Areas in Communications,
vol. 13, no. 8, 1995, pp. 1495-1504.
[16] Z. Jalil, A. M. Mirza, and T. Iqbal, A Zero-Watermarking Algorithm for Text Documents using
Structural Components, International Conference on Information and Emerging Technologies
(ICIET), 2010.
[17] Zunera Jalil, M. Arfan Jaffar, and Anwar M. Mirza, A novel text watermarking algorithm using
image watermark, International Journal of Innovative Computing, Information, and Control, vol. 7,
2010, pp. 1255-1271.
[18] Zunera Jalil, Anwar M. Mirza, and Maria Sabir, Content based Zero-Watermarking Algorithm for
Authentication of Text Documents, International Journal of Computer Science and Information
Security, vol. 7, no. 2, February, 2010.
[19] Zunera Jalil et al., Improved Zero Text Watermarking Algorithm against Meaning Preserving
Attacks, World Academy of Science, Engineering and Technology, pp. 592-596.
