

In the past, people used hidden tattoos, text hidden in wax-covered tablets, or invisible ink to convey secret messages. The recent growth in computational power and network technologies has propelled steganography to the forefront of today's security techniques, and these technologies provide easy-to-use communication channels for passing secret messages. This paper explores how to hide information in carriers such as digital text, JPEG images, and MP3 audio files so that the hidden message is not apparent to an observer, and how to detect the existence of hidden information. Steganography is an interesting subject that lies outside the mainstream cryptography and system administration that most of us deal with day after day. Steganography is the art and science of writing hidden messages in such a way that no one apart from the sender and intended recipient even realizes there is a hidden message. The word steganography is of Greek origin and means "covered, or hidden writing". The detection of steganographically encoded packages is called steganalysis. Modern steganography entered the world in 1985, when the personal computer was first applied to classical steganography problems. Steganography was widely used in historical times, especially before cryptographic systems were developed. The goal of steganography is to keep the presence of a message secret, or to hide the fact that communication is taking place. There are two methods for detecting steganographically encoded data: visual steganalysis and statistical steganalysis.

Steganography is the art and science of writing hidden messages in such a way that no one apart from the sender and intended recipient even realizes there is a hidden message. By contrast, cryptography obscures the meaning of a message, but it does not conceal the fact that there is a message. Today, the term steganography includes the concealment of digital information within computer files. For example, the sender might start with an ordinary-looking image file, then adjust the color of every 100th pixel to correspond to a letter in the alphabet, a change so subtle that someone who isn't actively looking for it is unlikely to notice it.

Over the past couple of years, steganography has been the source of a lot of discussion, particularly as it was suspected that terrorists connected with the September 11 attacks might have used it for covert communications. While no such connection has been proven, the concern points out the effectiveness of steganography as a means of obscuring data. Indeed, along with encryption, steganography is one of the fundamental ways by which data can be kept confidential. This article offers a brief introductory discussion of steganography: what it is, how it can be used, and the true implications it can have on information security.

What is Steganography?

The word steganography is of Greek origin and means "covered, or hidden writing". While we are discussing it in terms of computer security, steganography is really nothing new, as it has been around since the times of ancient Rome. For example, in ancient Rome and Greece, text was traditionally written on wax that was poured on top of wooden tablets. If the sender of the information wanted to obscure the message (for purposes of military intelligence, for instance), they would use steganography: the wax would be scraped off and the message would be inscribed or written directly on the tablet. Wax would then be poured on top of the message, thereby obscuring not just its meaning but its very existence.

According to one definition, steganography (also known as "steg" or "stego") is "the art of writing in cipher, or in characters, which are not intelligible except to persons who have the key; cryptography" [2]. In computer terms, steganography has evolved into the practice of hiding a message within a larger one in such a way that others cannot discern the presence or contents of the hidden message. In contemporary terms, steganography has evolved into a digital strategy of hiding a file in some form of multimedia, such as an image, an audio file (like a .wav or .mp3), or even a video file.

What is Steganography Used for?

Like many security tools, steganography can be used for a variety of reasons, some good, some not so good. Legitimate purposes can include things like watermarking images for reasons such as copyright protection. Digital watermarks (also known as fingerprinting, significant especially in copyrighting material) are similar to steganography in that they are overlaid in files, appear to be part of the original file, and are thus not easily detectable by the average person. Steganography can also be used as a substitute for a one-way hash value (where you take a variable-length input and create a fixed-length output string to verify that no changes have been made to the original input). Further, steganography can be used to tag notes to online images (like post-it notes attached to paper files). Finally, steganography can be used to maintain the confidentiality of valuable information, to protect the data from possible sabotage, theft, or unauthorized viewing [5].

Unfortunately, steganography can also be used for illegitimate reasons. For instance, someone trying to steal data could conceal it in another file or files and send it out in an innocent-looking email or file transfer. Furthermore, a person with a hobby of saving pornography, or worse, to their hard drive may choose to hide the evidence through the use of steganography. And, as was pointed out in the concern about terrorist use, it can serve as a means of covert communication. Of course, this can be both a legitimate and an illegitimate application. Here is a summary about hiding:

And about extracting:

Modern steganographic techniques

Modern steganography entered the world in 1985 with the advent of the Personal Computer applied to classical steganography problems. [3] Development following that was slow, but has since taken off, based upon the number of 'stego' programs available.

- Concealing messages within the lowest bits of noisy images or sound files.
- Concealing data within encrypted data. The data to be concealed is first encrypted before being used to overwrite part of a much larger block of encrypted data. This technique works most effectively where the decrypted version of the data being overwritten has no special meaning or use: some cryptosystems, especially those designed for filesystems, add random-looking padding bytes at the end of a ciphertext so that its size cannot be used to figure out the size of the original plaintext. Examples of software that use this technique include FreeOTFE and TrueCrypt.
- Chaffing and winnowing.
- Invisible ink.
- Null ciphers.
- Concealed messages in tampered executable files, exploiting redundancy in the i386 instruction set.
- Embedded pictures in video material (optionally played at slower or faster speed).
- Injecting imperceptible delays into packets sent over the network from the keyboard, a newer technique. Delays in keypresses in some applications (telnet or remote desktop software) can mean a delay in packets, and the delays in the packets can be used to encode data. There is no extra processor or network activity, so the steganographic technique is "invisible" to the user. This kind of steganography could be included in the firmware of keyboards, thus making it invisible to the system. The firmware could then be included in all keyboards, allowing someone to distribute a keylogger program to thousands without their knowledge. [5]
- Content-Aware Steganography, which hides information in the semantics a human user assigns to a datagram; these systems offer security against a non-human adversary/warden. [6]
- BPCS-Steganography, a steganographic method with very large embedding capacity.
- Blog-Steganography. Messages are fractionalized and the (encrypted) pieces are added as comments on orphaned web logs (or pin boards on social network platforms). In this case the selection of blogs is the symmetric key that sender and recipient are using. The carrier of the hidden message is the whole blogosphere.

Historical steganographic techniques

Steganography has been widely used in historical times, especially before cryptographic systems were developed. Examples of historical usage include:

- Hidden messages in wax tablets: in ancient Greece, people wrote messages on the wood, then covered it with wax so that it looked like an ordinary, unused tablet.
- Hidden messages on a messenger's body: also in ancient Greece. Herodotus tells the story of a message tattooed on a slave's shaved head, hidden by the growth of his hair, and exposed by shaving his head again. The message allegedly carried a warning to Greece about Persian invasion plans. This method has obvious drawbacks:
  1. It is impossible to send a message as quickly as the slave can travel, because it takes months to grow hair.
  2. A slave can only be used once for this purpose. (This is why slaves were used: they were considered expendable.)
- Hidden messages on paper, written in secret inks under other messages or on the blank parts of other messages.
- Microdots: during and after World War II, espionage agents used photographically produced microdots to send information back and forth. Since the dots were typically extremely small (the size of a period produced by a typewriter, or even smaller), the stegotext was whatever the dot was hidden within. If the dot was hidden within a letter or an address, the stegotext was some alphabetic characters; if it was under a postage stamp, the stegotext was the presence of the stamp. The problem with the WWII microdots was that they needed to be embedded in the paper and covered with an adhesive (such as collodion), which could be detected by holding a suspected paper up to a light and viewing it almost edge on. The embedded microdot would reflect light differently than the paper.
- More obscurely, during World War II, a spy for the Japanese in New York City, Velvalee Dickinson, sent information to accommodation addresses in neutral South America. She was a dealer in dolls, and her letters discussed how many of this or that doll to ship. The stegotext in this case was the doll orders; the "plaintext" being concealed was itself a codetext giving information about ship movements, etc. Her case became somewhat famous and she became known as the Doll Woman.
- Counter-propaganda: during the Pueblo Incident, US crew members of the USS Pueblo (AGER-2) research ship, held as prisoners by North Korea, communicated in sign language during staged photo ops to inform the United States that they had not defected, but had instead been captured by North Korea and were still loyal to the U.S. In other photos presented to the US, the crew members gave "the finger" to the unsuspecting North Koreans, in an attempt to discredit the pictures that showed them smiling and comfortable. [7]
- The one-time pad (OTP) is a theoretically unbreakable cipher that produces ciphertexts indistinguishable from random texts: only those who have the private key can distinguish these ciphertexts from any other perfectly random texts. Thus, any perfectly random data can be used as a covertext for theoretically unbreakable steganography. A modern example of OTP: in most cryptosystems, private symmetric session keys are supposed to be perfectly random (that is, generated by a good random number generator), even very weak ones (for example, shorter than 128 bits). This means that users of weak cryptography (in countries where strong cryptography is forbidden) can safely hide OTP messages in their session keys.
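
The one-time pad property described above can be illustrated with a minimal Python sketch. It assumes the pad is truly random, exactly as long as the message, and never reused. The function name `otp_xor` and the sample message are hypothetical and chosen only for illustration.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of equal length; the same call encrypts and decrypts."""
    if len(pad) != len(data):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"ships sail at dawn"           # illustrative plaintext
pad = secrets.token_bytes(len(message))   # random pad, shared in advance and used once
ciphertext = otp_xor(message, pad)        # looks like random bytes without the pad
recovered = otp_xor(ciphertext, pad)      # applying the same pad restores the message
```

Because the ciphertext is statistically indistinguishable from random bytes, it could plausibly sit wherever random data is expected, which is the point the text makes about session keys.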

The following formula provides a very generic description of the pieces of the steganographic process:

cover_medium + hidden_data + stego_key = stego_medium

In this context, the cover_medium is the file in which we will hide the hidden_data, which may also be encrypted using the stego_key. The resultant file is the stego_medium (which will, of course, be the same type of file as the cover_medium). The cover_medium (and, thus, the stego_medium) are typically image or audio files. In this article, I will focus on image files and will, therefore, refer to the cover_image and stego_image.

Before discussing how information is hidden in an image file, it is worth a fast review of how images are stored in the first place. An image file is merely a binary file containing a binary representation of the color or light intensity of each picture element (pixel) comprising the image.

Images typically use either 8-bit or 24-bit color. When using 8-bit color, there is a definition of up to 256 colors forming a palette for the image, each color denoted by an 8-bit value. A 24-bit color scheme, as the term suggests, uses 24 bits per pixel and provides a much better set of colors. In this case, each pixel is represented by three bytes, each byte representing the intensity of one of the three primary colors red, green, and blue (RGB), respectively. The Hypertext Markup Language (HTML) format for indicating colors in a Web page often uses a 24-bit format employing six hexadecimal digits, each pair representing the amount of red, green, and blue, respectively. The color orange, for example, would be displayed with red set to 100% (decimal 255, hex FF), green set to 50% (decimal 127, hex 7F), and no blue (0), so we would use "#FF7F00" in the HTML code.

The size of an image file, then, is directly related to the number of pixels and the granularity of the color definition. A typical 640x480 pixel image using a palette of 256 colors would require a file about 307 KB in size (640 x 480 bytes), whereas a 1024x768 pixel high-resolution 24-bit color image would result in a 2.36 MB file (1024 x 768 x 3 bytes).

To avoid sending files of this enormous size, a number of file formats and compression schemes have been developed over time, notably the Bitmap (BMP), Graphic Interchange Format (GIF), and Joint Photographic Experts Group (JPEG) file types. Not all are equally suited to steganography, however. GIF and 8-bit BMP files employ what is known as lossless compression, a scheme that allows the software to exactly reconstruct the original image. JPEG, on the other hand, uses lossy compression, which means that the expanded image is very nearly the same as the original but not an exact duplicate. While both methods allow computers to save storage space, lossless compression is much better suited to applications where the integrity of the original information must be maintained, such as steganography.
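
The size and color arithmetic above can be checked with a short Python sketch. The helper names `raw_image_bytes` and `html_color` are hypothetical, and the sizes are for uncompressed pixel data only (no file header or palette).

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the uncompressed pixel data in bytes (headers and palette ignored)."""
    return width * height * bits_per_pixel // 8


def html_color(red: int, green: int, blue: int) -> str:
    """Format 0-255 RGB intensities as a six-digit HTML hex color."""
    return f"#{red:02X}{green:02X}{blue:02X}"


palette_size = raw_image_bytes(640, 480, 8)      # 307200 bytes, about 307 KB
truecolor_size = raw_image_bytes(1024, 768, 24)  # 2359296 bytes, about 2.36 MB
orange = html_color(255, 127, 0)                 # "#FF7F00"
```
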
While JPEG can be used for stego applications, it is more common to embed data in GIF or BMP files.

The simplest approach to hiding data within an image file is called least significant bit (LSB) insertion. In this method, we can take the binary representation of the hidden_data and overwrite the LSB of each byte within the cover_image. If we are using 24-bit color, the amount of change will be minimal and indiscernible to the human eye. As an example, suppose that we have three adjacent pixels (nine bytes) with the following RGB encoding:

10010101 00001101 11001001
10010110 00001111 11001010
10011111 00010000 11001011

Now suppose we want to "hide" the following 9 bits of data (the hidden data is usually compressed prior to being hidden): 101101101. If we overlay these 9 bits over the LSB of the 9 bytes above, we get the following (the LSBs of the second, fourth, fifth, and sixth bytes have changed):

10010101 00001100 11001001
10010111 00001110 11001011
10011111 00010000 11001011

Note that we have successfully hidden 9 bits but at a cost of only changing 4, or roughly 50%, of the LSBs.

This description is meant only as a high-level overview. Similar methods can be applied to 8-bit color but the changes, as the reader might imagine, are more dramatic. Gray-scale images, too, are very useful for steganographic purposes. One potential problem with any of these methods is that they can be found by an adversary who is looking. In addition, there are other methods besides LSB insertion with which to insert hidden information.

Without going into any detail, it is worth mentioning steganalysis, the art of detecting and breaking steganography. One form of this analysis is to examine the color palette of a graphical image. In most images, there will be a unique binary encoding of each individual color. If the image contains hidden data, however, many colors in the palette will have duplicate binary encodings since, for all practical purposes, we can't count the LSB. If the analysis of the color palette of a given file yields many duplicates, we might safely conclude that the file has hidden information.

But what files would you analyze? Suppose I decide to post a hidden message by hiding it in an image file that I post at an auction site on the Internet. The item I am auctioning is real, so a lot of people may access the site and download the file; only a few people know that the image has special information that only they can read. And we haven't even discussed hidden data inside audio files! Indeed, the quantity of potential cover files makes steganalysis a Herculean task.
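
The nine-byte LSB insertion example given earlier can be reproduced with a minimal Python sketch. It is only an illustration of the overwrite step, not how any particular tool works: the function names `embed_lsb` and `extract_lsb` are hypothetical, and the sketch skips the compression, encryption, and image-format handling a real program would need.

```python
def embed_lsb(cover: bytes, bits: str) -> bytes:
    """Overwrite the least significant bit of each cover byte with one hidden bit."""
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the hidden data")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | int(bit)  # clear the LSB, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, n_bits: int) -> str:
    """Read back the least significant bit of the first n_bits bytes."""
    return "".join(str(b & 1) for b in stego[:n_bits])


# The three pixels (nine bytes) and the 9 hidden bits from the example in the text.
cover = bytes(int(b, 2) for b in (
    "10010101 00001101 11001001 "
    "10010110 00001111 11001010 "
    "10011111 00010000 11001011"
).split())
hidden = "101101101"

stego = embed_lsb(cover, hidden)
changed = sum(c != s for c, s in zip(cover, stego))

print(" ".join(f"{b:08b}" for b in stego))
# 10010101 00001100 11001001 10010111 00001110 11001011 10011111 00010000 11001011
print(changed)                           # 4
print(extract_lsb(stego, len(hidden)))   # 101101101
```

Only four of the nine bytes change because the other five already carried the desired bit in their LSB.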


FIGURE 1. The cover_image (5th wave.gif), hidden_data file (virusdetectioninfo.txt), and stego_key.

The following examples come from Andy Brown's S-Tools for Windows. S-Tools allows users to hide information in BMP, GIF, or WAV files. The basic scheme of the program is straightforward: you drag an image or audio file into the S-Tools active window to act as the cover_medium, drag the hidden_data file onto the cover_medium, and then provide a stego_key for encryption. The result is the stego_medium. All of this is shown in Figure 1:

1. I highlighted the GIF image file 5th wave.gif and dragged it to the S-Tools active window. Note that S-Tools reports that up to 138,547 bytes can be hidden in this image file.
2. I next highlighted a 14 KB text file called virusdetectioninfo.txt and dragged it onto the image file in S-Tools.
3. A dialog box pops up telling me that I am hiding 6,019 bytes of data and asks for a passphrase with which to encrypt the hidden text; the default secret key crypto scheme used by S-Tools is the International Data Encryption Algorithm (IDEA).

FIGURE 2. The original image file (left) and image file with embedded text (right), side by side.

FIGURE 3. Extracting hidden information from the image file.

Once the image file has been received, the user merely drags the file to S-Tools and right-clicks over the image, specifying the Reveal option. A dialog box will pop up requesting the passphrase. Figure 3 shows the information about the hidden archive file, which the user can then open.


The goal of steganography is to keep the presence of a message secret, or to hide the fact that communication is taking place. The goal of cryptography is to obscure a message or communication so that it cannot be understood. Steganography and cryptography make great partners, and it is common practice to use the two together.


As mentioned previously, steganography is an effective means of hiding data, thereby protecting the data from unauthorized or unwanted viewing. But stego is simply one of many ways to protect the confidentiality of data. It is probably best used in conjunction with another data-hiding method. When used in combination, these methods can all be a part of a layered security approach. Some good complementary methods include:

- Encryption: the process of passing data or plaintext through a series of mathematical operations that generate an alternate form of the original data known as ciphertext. The encrypted data can only be read by parties who have been given the necessary key to decrypt the ciphertext back into its original plaintext form. Encryption doesn't hide data, but it does make it hard to read!
- Hidden directories (Windows): Windows offers this feature, which allows users to hide files. Using it is as easy as changing the properties of a directory to "hidden" and hoping that no one displays all types of files in their explorer.
- Hiding directories (Unix): placing files in existing directories that already hold a lot of files, such as the /dev directory on a Unix implementation, or making a directory whose name starts with three dots (...) instead of the normal single or double dot.
- Covert channels: some tools can be used to transmit valuable data in seemingly normal network traffic. One such tool is Loki, which hides data in ICMP traffic (like ping).


There are two methods for detecting steganographically-encoded data: visual steganalysis and statistical steganalysis. The visual method compares a copy of the source file with the suspect file by running a hash against the source file and checking that it matches the hash on the suspect copy. Statistical steganalysis compares theoretically expected frequency distributions of message content with the frequency distribution of the suspected file. Because the covertext has to be modified to store the hidden data, there are usually detectable signs within the covertext's normal characteristics that can be used to reveal the hidden message. For example, when running a histogram on an image, there should be random spikes, but if the histogram is flat or has one large spike, it's likely the image contains hidden information.
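
The hash comparison and the histogram mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a working detector: it assumes the analyst has a copy of the original source file, the helper names `differs_from_source` and `value_histogram` are hypothetical, and SHA-256 is simply one reasonable hash choice. A mismatch only shows that the files differ; it does not reveal what, if anything, is hidden.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Hex digest of the data, used as a fingerprint of the file contents."""
    return hashlib.sha256(data).hexdigest()


def differs_from_source(source: bytes, suspect: bytes) -> bool:
    """True when the suspect copy no longer matches the known source file."""
    return sha256_hex(source) != sha256_hex(suspect)


def value_histogram(data: bytes) -> Counter:
    """Count how often each byte value occurs, as raw input for statistical tests."""
    return Counter(data)


source = bytes([0b10010101, 0b00001101, 0b11001001])
suspect = bytes([0b10010101, 0b00001100, 0b11001001])  # one LSB flipped

print(differs_from_source(source, suspect))  # True
print(differs_from_source(source, source))   # False
```

In practice the inputs would be the pixel or sample data read from the two files, and the histogram would be compared against the distribution expected for unmodified media of the same kind.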

There are tools available, such as Stegdetect, that analyze content for hidden information. Stegdetect is capable of detecting several different steganographic methods used to embed hidden information in JPEG images.

Steganography is used in map making, where cartographers add a nonexistent street or lake in order to detect copyright offenders. Steganography could also be used to hide the existence of sensitive files on storage media. Modern techniques use steganography as a watermark to inject encrypted copyright marks and serial numbers into electronic media such as books, audio, and video.


Steganography is useful for hiding messages for transmission. One of the major discoveries of this investigation was that each steganographic implementation carries with it significant trade-off decisions, and it is up to the steganographer to decide which implementation suits him/her best. Below, advantages and disadvantages of some steganographic techniques are discussed.

| Technique | Advantages | Disadvantages |
|-----------|------------|---------------|
| Least Significant Bit (LSB) Encoding | Hard to detect. Original image is very similar to altered image. Embedded data resembles Gaussian noise. | Message is hard to recover if image is subject to attack such as translation and rotation. |
| Low Frequency Encoding | Hard to detect as message and fundamental image data share same range. | Significant damage to picture appearance. Message difficult to recover. |
| Mid Frequency Encoding | Altered picture closely resembles original. Not susceptible to attacks such as rotation and translation. | Relatively easy to detect, as our project has shown. |
| High Frequency Domain Encoding | None | Image is distorted. Message easily lost if picture subject to compression such as JPEG. |

When properly implemented, steganography can be difficult to detect, but not impossible. Steganography detection can be used to prevent communication of malicious data.

Steganography is a really interesting subject and outside of the mainstream cryptography and system administration that most of us deal with day after day. But it is also quite real; this is not just something that's used in the lab or an arcane subject of study in academia. Stego may, in fact, be all too real: there have been several reports that the terrorist organization behind the September 11 attacks in New York City, Washington, D.C., and outside of Pittsburgh used steganography as one of their means of communication.

Steganography is a fascinating and effective method of hiding data that has been used throughout history. Methods exist that can be employed to uncover such devious tactics, but the first step is awareness that such methods even exist. There are many good reasons as well to use this type of data hiding, including watermarking or a more secure central storage method for such things as passwords or key processes. Regardless, the technology is easy to use and difficult to detect. The more you know about its features and functionality, the further ahead you will be in the game.


IEEE security and privacy article