


01 Abstract
02 Introduction
03 History
04 Requirements of Hiding Digital Information
05 Steganographic Techniques
06 Related Work
07 Problem Statement
08 Scope Statement
09 Proposed Solution
10 Proposed Model
11 References

Ph.D. Thesis Proposal 1 By: KHAN FARHAN RAFAT


“we can scarcely imagine a time when there did not exist a necessity, or at least a desire, of
transmitting information from one individual to another in such a manner as to elude general

With every passing day, more and more people are switching to on-line, i.e. “always on”,
communication to perform their routine business and personal tasks. This rapid swing away
from time-consuming, complex manual procedures has forced government and business
communities to offer services such as home shopping, banking, billing and taxation to the
public, openly and on a 24/7 basis.

The gigantic global network of interconnected computer systems, commonly known as the
Internet and composed of expensive gadgetry and software services, is a vital driver of this
shift.

However, these instantly available on-line facilities are closely tied to issues concerning the
availability, integrity, confidentiality and authentication of information exchanged over
communication media. These issues have led to the evolution of information hiding techniques
such as Cryptography and Steganography for secure on-line and off-line communication.

Steganography, which dates back to the time of the ancient Greeks, has also found its way into
the field of Computer Science and is effectively being used alone or together with cryptography.


1. Introduction

It is difficult to say how people communicated in prehistoric times; however, one may
logically assume that the earliest forms of communication were sketches, and that narrating
these sketches led to an understanding of the sounds associated with them. That narration
might have paved the way for the evolution of Natural Languages (NL) [2].

The evolution of NL opened the door to a technological revolution that has brought
dramatic changes to the lives of people all over the world. One such change is the
introduction of the Internet to the public; originally developed for military use, it has grown
into a gigantic global system of interconnected computer networks.

The development of the Internet for the military is just a glimpse of the security concerns
associated with communication, which have been, and remain, a serious concern of the
public and private sectors alike.

Over the ages, various data hiding techniques have evolved to protect confidential
information from falling into hostile hands. These can be classified into two broad
categories, namely Cryptography and Steganography.

2. History

The word Steganography is derived from Greek and means covered (or hidden)
writing. While Cryptography concerns itself with making intelligible information
unintelligible, Steganography hides the very existence of that information.

Before giving a brief history of the related work on text-based steganography, it is
appropriate to introduce the terms frequently used in this context. The secret message to be
hidden is referred to as the embedded data, and the innocuous text / audio / image used for
embedding is called the cover. The resultant output object after embedding is referred to as
the stego-object. The key used in embedding the secret message is known as the stego-key. It
is assumed a priori that the sending and receiving ends have agreed upon a mutual key
exchange protocol / mechanism. Figure 1 depicts the model of a secure steganographic
system.

Figure 1 - Model of a Steganographic system

The recent interest in Steganography should not be linked to the news report published in
USA Today in 2000, which stated that terrorists might be using steganography to conceal
their secrets from law enforcement agencies. The history of steganography dates back to the
fifth century B.C., when it was practised in Greece by the prisoners of King Darius. Another
frequently quoted technique is the tattooing of a secret message on the shaved scalp of a
slave. Once the slave's hair had grown back, he was sent to the destination, where his head
was shaved again to read the camouflaged message. The Germans showed masterly expertise
during World War II: with the ‘microdot’ technique, messages were photographed and
reduced to the size of a period (full stop) [3][4].

Digital technology has given steganography a broader spectrum in which to flourish
compared with traditional ways of secret writing, such as writing messages in invisible
ink made from onion juice, alum, ammonia salts and other similar materials that darken
when held over a flame, a technique used extensively by the British and Americans during
the American Revolution [4]. Today a variety of digital media such as audio, image,
video and text provide a convenient way of hiding valuable information [5][6][7].

The fascinating attribute of digital text documents is that they are written, saved and
retrieved by the personal computer exactly as they are seen by the naked eye. This is in
contrast to other digital file formats such as image, video and audio, where the information
saved in the computer differs from that which is retrieved. Various techniques exist for
hiding data in a text file. It is, however, worth mentioning that text steganography is
considered the most challenging of all, since a text file carries none of the overhead
metadata that is often used for hiding information [7].

3. Requirements of Hiding Digital Information

A number of protocols and data embedding techniques exist that enable us to hide
information in a given object. However, all of these protocols and techniques must satisfy
the following requirements for steganography to be applied correctly:
3.1 The integrity of the hidden information after it has been embedded inside the stego
object must remain intact.
3.2 The stego object must remain unchanged or almost unchanged to the naked eye.
3.3 It is assumed that the attacker knows that secret data is hidden inside the stego object.


A number of digital media, including text, image, audio and video files together with
other types, are being used for hiding secret information.


5.1. Acronym [8]

Table 1

Acronym    Translation
l8         Too late
ASAP       As Soon As Possible
C          See
CM         Call Me
F2F        Face to face

In this method a word or phrase is either written in full or substituted with its
abbreviation, and that choice represents a binary zero or one of the secret information.
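
The substitution described above can be sketched in Python as follows. This is only an illustrative sketch: the convention that a full phrase stands for 0 and its acronym for 1, the function names, and the representation of the cover as a list of pre-split units are all assumptions, not details taken from [8].

```python
# Phrase -> acronym pairs taken from Table 1.
FULL_TO_ACRONYM = {
    "too late": "l8",
    "as soon as possible": "ASAP",
    "see": "C",
    "call me": "CM",
    "face to face": "F2F",
}
ACRONYMS = set(FULL_TO_ACRONYM.values())


def embed(cover_units, bits):
    """Replace a substitutable unit by its acronym wherever the next bit is 1."""
    bit_iter = iter(bits)
    stego_units = []
    for unit in cover_units:
        if unit in FULL_TO_ACRONYM:
            bit = next(bit_iter, "0")  # pad with 0 once the message is used up
            stego_units.append(FULL_TO_ACRONYM[unit] if bit == "1" else unit)
        else:
            stego_units.append(unit)
    return stego_units


def extract(stego_units):
    """Read one bit per substitutable unit: acronym -> 1, full form -> 0."""
    bits = []
    for unit in stego_units:
        if unit in ACRONYMS:
            bits.append("1")
        elif unit in FULL_TO_ACRONYM:
            bits.append("0")
    return "".join(bits)


cover = ["call me", "as soon as possible", "so we can meet", "face to face"]
stego = embed(cover, "101")
print(" ".join(stego))
print(extract(stego))
```

The capacity of such a scheme is one bit per substitutable phrase in the cover, so the message length is bounded by how many table entries the cover happens to contain.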

5.2. Change of Spelling [9]

Table 2

American Spelling    British Spelling
Favorite             Favourite
Criticize            Criticise
Fulfill              Fulfil


This method exploits the way words are spelled in British and American English for
hiding secret information bits.

5.3. Semantic Method [14]

Table 3

Word       Synonym
Big        Large
Small      Little
Chilly     Cool
Smart      Clever
Spaced     Stretched

Synonym substitution of words is used to hide the binary bits of secret information.
The synonym substitution may represent a single or multiple bit combination for the
secret information.
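
To illustrate the multiple-bit case, the following Python sketch lets each synonym group of four words carry two bits. The groups extend the first two rows of Table 3 with extra words ("huge", "vast", "tiny", "slight") that are hypothetical additions for illustration only, and the function names and bit assignment are likewise assumptions rather than the method of [14].

```python
# Each group holds four interchangeable words, so one word choice carries 2 bits.
# Only "big"/"large" and "small"/"little" come from Table 3; the rest is assumed.
GROUPS = [
    ["big", "large", "huge", "vast"],
    ["small", "little", "tiny", "slight"],
]
# word -> (group number, position inside the group)
INDEX = {word: (g, i) for g, group in enumerate(GROUPS) for i, word in enumerate(group)}


def embed(words, bits):
    """Swap each known word for the group member selected by the next 2 bits."""
    out, pos = [], 0
    for word in words:
        if word in INDEX and pos < len(bits):
            group_no, _ = INDEX[word]
            chunk = bits[pos:pos + 2].ljust(2, "0")  # pad a trailing odd bit
            out.append(GROUPS[group_no][int(chunk, 2)])
            pos += 2
        else:
            out.append(word)
    return out


def extract(words, n_bits):
    """Recover 2 bits from every known word, then trim to the message length."""
    bits = "".join(format(INDEX[w][1], "02b") for w in words if w in INDEX)
    return bits[:n_bits]


cover = "a big dog chased a small cat".split()
stego = embed(cover, "1101")
print(" ".join(stego))
print(extract(stego, 4))
```

Larger synonym groups raise the capacity per word, at the price of a higher risk that a substituted word reads unnaturally in context.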

5.4. HTML Tags [11][17]

HTML tags can be used in varying combinations, or with gaps and horizontal tabs placed
inside them, to represent a pattern of secret information bits.
5.4.1 Using white space in tags

Stego key:

<tag>, </tag>, or <tag/> … 0

<tag >, </tag >, or <tag /> … 1

<user >
<name>Alice</name >
<id >01</id>
</user>

Hidden Bit String: 101100
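
The stego key above can be applied mechanically, as the following Python sketch suggests. The function names and the regular expression are assumptions made for illustration, and the sample markup in the sketch includes a closing </user> tag, which is what supplies the sixth bit of the hidden string.

```python
import re

# Matches any single tag: opening, closing or self-closing.
TAG = re.compile(r"<[^<>]+>")


def embed_bits(markup, bits):
    """Put a space before the closing '>' (or '/>') of a tag to encode 1, none for 0."""
    bit_iter = iter(bits)

    def rewrite(match):
        body = match.group(0)[1:-1]
        slash = ""
        if body.endswith("/"):          # self-closing tag such as <tag/>
            body, slash = body[:-1], "/"
        bit = next(bit_iter, "0")       # tags beyond the message encode 0
        return "<" + body.rstrip() + (" " if bit == "1" else "") + slash + ">"

    return TAG.sub(rewrite, markup)


def extract_bits(markup):
    """Read one bit per tag according to the stego key."""
    bits = []
    for tag in TAG.findall(markup):
        body = tag[1:-1]
        if body.endswith("/"):
            body = body[:-1]
        bits.append("1" if body[-1:].isspace() else "0")
    return "".join(bits)


cover = "<user><name>Alice</name><id>01</id></user>"
stego = embed_bits(cover, "101100")
print(stego)
print(extract_bits(stego))
```

Because browsers and XML parsers ignore white space in this position, the rendered page is unchanged, although the file grows by one character for every 1 bit.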


5.5. XML Document [19]

XML is a preferred way of exchanging data between web-based applications. User-defined
tags are used to hide the actual message, or the placement of tags represents
the corresponding secret information bits. For example, to hide 01110, the following can
be used:

Stego key: <img></img> -> 0 ,
           <img/> -> 1

Stego data:
<img src="foo1.jpg"></img>
<img src="foo2.jpg"/>
<img src="foo3.jpg"/>
<img src="foo4.jpg"/>
<img src="foo5.jpg"></img>
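
A short Python sketch of this stego key is given below. The function names and the use of a regular expression in place of a full XML parser are assumptions chosen to keep the example small.

```python
import re

# Matches the opening (or self-closing) <img ...> tag, never the closing </img>.
IMG_OPEN = re.compile(r"<img\b[^<>]*>")


def embed_bits(bits, sources):
    """Emit one img element per bit: open/close pair for 0, empty-element form for 1."""
    lines = []
    for bit, src in zip(bits, sources):
        if bit == "1":
            lines.append(f'<img src="{src}"/>')
        else:
            lines.append(f'<img src="{src}"></img>')
    return "\n".join(lines)


def extract_bits(xml_text):
    """Recover the bits from the form in which each img element is written."""
    return "".join(
        "1" if tag.endswith("/>") else "0" for tag in IMG_OPEN.findall(xml_text)
    )


sources = [f"foo{i}.jpg" for i in range(1, 6)]
stego = embed_bits("01110", sources)
print(stego)
print(extract_bits(stego))
```

Both element forms are equivalent to an XML parser, so the hidden bits survive only as long as the document is not re-serialized by a tool that normalizes empty elements.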
5.6. IPv4 [20]

Figure 2

Figure 2 shows how the IP (version 4) header is organized. Three unused bits have
been marked (shaded) as places to hide secret information. One lies before the DF and
MF bits; the other two are the least significant bits of the Type of service field.
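
The three bit positions described above can be written to and read from a raw header as in the following Python sketch. It assumes a 20-byte IPv4 header held as bytes; the function names and the order in which the three bits are assigned are assumptions for illustration. The header checksum is not recomputed here, which a real sender would have to do, and on present-day networks the two low bits of the Type of service byte are used for congestion notification, so the sketch follows the classic layout described in the text.

```python
TOS_BYTE = 1      # second byte of the header: Type of service
FLAGS_BYTE = 6    # first byte of the flags / fragment offset word


def embed3(header, bits):
    """Hide 3 bits: bits[0] in the bit before DF/MF, bits[1:3] in the low TOS bits."""
    if len(header) < 20 or len(bits) != 3:
        raise ValueError("need a 20-byte header and exactly 3 bits")
    h = bytearray(header)
    h[FLAGS_BYTE] = (h[FLAGS_BYTE] & 0x7F) | (int(bits[0]) << 7)
    h[TOS_BYTE] = (h[TOS_BYTE] & 0xFC) | int(bits[1:3], 2)
    return bytes(h)


def extract3(header):
    """Read the 3 hidden bits back in the same order."""
    return str(header[FLAGS_BYTE] >> 7) + format(header[TOS_BYTE] & 0x03, "02b")


# Example header with the DF flag set (byte 6 = 0x40) and Type of service 0.
cover = bytes.fromhex("4500003c1c4640004006b1e6ac100a63ac100a0c")
stego = embed3(cover, "101")
print(stego.hex())
print(extract3(stego))
```

The masks leave the DF and MF flags and the upper six Type of service bits untouched, so only the three shaded positions differ between cover and stego header.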
5.7. The Transport Layer [20]
Figure 3


Every TCP segment begins with a fixed-format 20-byte header, the 13th and 14th
bytes of which are shown in Figure 3. The unused 6-bit field, indicated in shade,
can be used to hide secret information.
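
A Python sketch of writing and reading that 6-bit field follows. It assumes the classic layout in which the 13th and 14th bytes hold a 4-bit header length, 6 unused bits and 6 flag bits; the function names are illustrative, the TCP checksum is not recomputed, and later standards have since assigned some of these bits, so this is a sketch of the idea rather than something to run on live traffic.

```python
RESERVED_MASK = 0x0FC0   # the 6 bits between the header length and the flags
RESERVED_SHIFT = 6


def embed6(header, value):
    """Store a 6-bit value (0..63) in the unused field of a TCP header."""
    if len(header) < 20 or not 0 <= value < 64:
        raise ValueError("need a 20-byte header and a value in 0..63")
    word = int.from_bytes(header[12:14], "big")
    word = (word & ~RESERVED_MASK) | (value << RESERVED_SHIFT)
    return header[:12] + word.to_bytes(2, "big") + header[14:]


def extract6(header):
    """Read the 6-bit value back from the unused field."""
    word = int.from_bytes(header[12:14], "big")
    return (word & RESERVED_MASK) >> RESERVED_SHIFT


# Example header: header length 5 words, SYN and ACK flags set, other fields zero.
cover = bytes(12) + bytes.fromhex("5012") + bytes(6)
stego = embed6(cover, 0b101101)
print(stego[12:14].hex())
print(format(extract6(stego), "06b"))
```

Each segment therefore carries six hidden bits, while the header length and the six flag bits keep their original values.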

5.8. White Spaces [12][1


Table 4 Original Text

Table 5 Encoded Text

In this technique spaces between words, sentences or paragraphs are used to
represent bits of secret information.
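
A minimal Python sketch of the inter-word variant is shown below. The convention that one space encodes 0 and two spaces encode 1, and the function names, are assumptions made for illustration; the cited works may use other conventions.

```python
import re


def embed(cover, bits):
    """Encode bits in the gaps between words: one space = 0, two spaces = 1."""
    words = cover.split()
    if not words or len(bits) > len(words) - 1:
        raise ValueError("cover text has too few word gaps for the message")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        bit = bits[i] if i < len(bits) else "0"   # unused gaps stay single
        out.append(("  " if bit == "1" else " ") + word)
    return "".join(out)


def extract(stego, n_bits):
    """Read the first n_bits gaps back: a double space is 1, a single space is 0."""
    gaps = re.findall(r" +", stego)
    return "".join("1" if len(gap) == 2 else "0" for gap in gaps)[:n_bits]


stego = embed("the quick brown fox jumps", "1010")
print(repr(stego))
print(extract(stego, 4))
```

Note that the stego text is longer than the cover by one character for every 1 bit embedded, so the file length changes.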

5.9. Line Shifting [12][13]


Figure 4

This method hides information by shifting the text lines to some degree to represent
binary bits of secret information.

5.10. Word Shifting [13][1


Figure 5

Here, the distance between words is altered to hide bits of secret information.


5.11. Feature Coding [13]

This method hides the secret information bits by associating certain attributes to the
text characters.

5.12. Miscellaneous techniques [10]

Information can be hidden in a number of idiosyncratic ways by introducing
modifications or injecting intentional grammatical word/sentence errors
into the text. The following techniques / procedures can be employed in
this context:

5.12.1 Typographical errors - “tehre” rather than “there”.

5.12.2 Using abbreviations / acronyms - “yr” for “your” / “TC” in place of “Take
5.12.3 Transliterations – “gr8” rather than “great”.
5.12.4 Free form formatting - redundant carriage returns or irregular separation of
text into paragraphs, or by adjusting line sizes.
5.12.5 Use of emoticons for annotating text with feelings - “:)” to annotate a pun.
5.12.6 Colloquial words or phrases - “how are you and family” as “how r u n
5.12.7 Use of Mixed language - “We always commit the same mistakes again, and
’je ne regrette rien’!”.
5.13. MS Word Document [15]

This method makes use of the change tracking feature of MS Word for hiding
information, so that the stego-object appears to be a work of collaborative writing.
The bits to be hidden are first embedded in degenerated segments of the cover
document. The degenerated text is then revised, making the document look like
an edited piece of work.


Figure 6


6.1 All of the existing text-based encoding methods require either the original file or
knowledge of the original file's formatting to be able to decode the secret
6.2 Adding spaces between words and lines, adding HTML tags, or inserting data past the
end-of-file mark increases the file length, and all of these are equally eye-catching.

7.1 To evolve a steganographic technique that results in a zero-overhead stego file,
since to date no known example exists of hiding binary data in an ASCII
text document that results in a stego-file of length equal to that of the cover.
7.2 To suggest enhancements to existing text-based steganographic techniques.


The proposed solution will consist of:

8.1 Methods based on generating ASCII cover text corresponding to a given
8.2 Method based on altering a given ASCII text cover in order to encode the
message in it (see Figure 7).


Figure 7



[1]. Code Wars: Steganography, Signals Intelligence, and Terrorism. Maura Conway.
Knowledge, Technology and Policy (Special issue entitled ‘Technology and Terrorism’)
Vol. 16, No. 2 (Summer 2003): 45-62; reprinted in David Clarke (Ed.), Technology and
Terrorism. New Jersey: Transaction Publishers (2004): 171-191.

[2]. Elements of Cartography – 6th Edition (Student edition). Arthur H. Robinson, Joel L.
Morrison, Phillip C. Muehrcke, A. Jon Kimerling, Stephen C. Guptill, ISBN – 9-814-

[3]. Steganography: is it becoming a double-edged sword in computer security? Miss K.I.
Munro, University of the Witwatersrand.

[4]. Steganography. 2nd Lt. James Caldwell, U.S. Air Force, 2003, last
accessed November 14, 2008.

[5]. Algorithms for Audio Watermarking and Steganography. Nedeljko Cvejic, University
of Oulu, 2004.

[6]. Image Steganography: Concepts and Practice. M. Kharrazi, H. T. Sencar, N. Memon,
National University of Singapore (2004).

[7]. Techniques for data hiding. W. Bender, D. Gruhl, N. Morimoto, and A. Lu, IBM Systems
Journal, Vol. 35, Issues 3&4, pp. 313-336, 1996.

[8]. Text Steganography in SMS. Mohammad Shirali-Shahreza, M. Hassan Shirali-Shahreza,
0-7695-3038-9/07 © 2007 IEEE, DOI 10.1109/ICCIT.2007.100

[9]. Text Steganography by Changing Words Spelling. Mohammad Shirali-Shahreza, ISBN
978-89-5519-136-3, Feb. 17-20, 2008, ICACT 2008

[10]. Information Hiding Through Errors: A Confusing Approach. Mercan Topkara, Umut
Topkara, Mikhail J. Atallah, Purdue University

[11]. Adaptation of Text Steganographic Algorithms for HTML. Stanislav S. Barilnik, Igor
V. Minin, Oleg V. Minin, 8th International Siberian Workshop and Tutorials
EDM'2007, Session IV, JULY 1-5, ERLAGOL

[12]. Document Marking and Identification using Both Line and Word Shifting. S. H. Low,
N. F. Maxemchuk, J. T. Brassil, L. O'Gorman, AT&T Bell Laboratories, Murray Hill NJ
07974, 0743-166W95-1995 IEEE


[13]. Research on Steganalysis for Text Steganography Based on Font Format. Lingyun
Xiang, Xingming Sun, Gang Luo, Can Gan, School of Computer & Communication, Hunan
University, Changsha, Hunan P.R.China, 410082

[14]. A New Synonym Text Steganography. M. Hassan Shirali-Shahreza, Mohammad
Shirali-Shahreza. International Conference on Intelligent Information Hiding and Multimedia
Signal Processing, 978-0-7695-3278-3/08 © 2008 IEEE

[15]. A New Steganographic Method for Data Hiding in Microsoft Word Documents by a
Change Tracking Technique. Tsung-Yuan Liu and Wen-Hsiang Tsai (Senior Member),
1556-6013 © 2007 IEEE

[16]. Applied Cryptography, Second Edition: Protocols, Algorithms, and Source Code in C.
Bruce Schneier, Wiley Computer Publishing, John Wiley & Sons, Inc. ISBN: 0471128457,
Pub Date: 01/01/96

[17]. Steganography and Digital Watermarking. 2004. Jonathan Cummins, Patrick Diskin,
Samuel Lau and Robert Parlett, School of Computer Science, The University of

[18]. Digital Watermarking and Steganography, Second Edition. Ingemar J. Cox, Matthew L.
Miller, Jeffrey A. Bloom, Jessica Fridrich, Ton Kalker. Morgan Kaufmann Publishers is an
imprint of Elsevier. 30 Corporate Drive, Suite 400, Burlington, MA 01803, USA. ISBN

[19]. Steganography: A New Horizon for Safe Communication through XML. Aasma Ghani
Memon, Sumbul Khawaja and Asadullah Shah. Isra University, Hyderabad, Pakistan. Journal
of Theoretical and Applied Information Technology © 2005 – 2008

[20]. An Analysis of Steganographic Techniques. Richard Popa.
