
10CE022

INTRODUCTION

Chapter 1 Introduction
1.1 Introduction
With advancements in digital communication technology and the growth of computing power and storage, ensuring individual privacy has become increasingly challenging. The degree to which individuals value privacy differs from one person to another. Various methods have been investigated and developed to protect personal privacy. Encryption is probably the most obvious one, followed by steganography. Steganography is an old art that has been practiced since ancient times. The word comes from the Greek for covered or secret writing, and steganography is thus the art of hiding messages inside innocuous cover carriers, e.g. images, audio, video, text, or any other digitally represented code or transmission, in such a manner that the existence of the embedded messages is undetectable. The hidden message may be plaintext, ciphertext, or anything that can be represented as a bit stream. Encrypted data resembles noise and its presence is generally observable, whereas a steganographic message is not meant to be observable at all. Steganography and cryptography, though closely related, are not the same. The former aims to hide the existence of the message, whereas the latter scrambles a message so that it is illegible. The goal of steganography is to avoid drawing suspicion to the transmission of a hidden message. It hides messages inside other harmless messages in a way that does not allow an adversary even to detect that a second, secret message is present. If suspicion is raised, this goal is defeated. Discovering and rendering useless such covert messages is another art form, known as steganalysis. Information hiding has recently become important in a number of application areas. Digital audio, video, and pictures are increasingly furnished with distinguishing but imperceptible marks, which may contain a hidden copyright notice or serial number, or even help to prevent unauthorized copying directly.
Military communications systems make increasing use of traffic security techniques which, rather than merely concealing the content of a message using encryption, seek to conceal its sender, its receiver or its very existence. Similar techniques are used in some mobile phone systems and schemes proposed for digital electronics.

1.2 What is Steganography?


Since the rise of the Internet, one of the most important factors of information technology and communication has been the security of information. Cryptography was created as a technique for securing the secrecy of communication, and many different methods have been developed to encrypt and decrypt data in order to keep the message secret. Unfortunately, it is sometimes not enough to keep the contents of a message secret; it may also be necessary to keep the existence of the message secret. The technique used to achieve this is called steganography. Steganography is a subject that is rarely touched upon by most IT security enthusiasts. Most people do not see steganography as a potential threat, and some do not even know what it is. Steganography is the art and science of hiding information in ways that prevent the detection of hidden messages. In short, steganography is the practice of hiding private or sensitive information within something that appears to be nothing out of the ordinary. This is accomplished by hiding information in other information, thus hiding the existence of the communicated information.

Figure 1.1 Example of Steganography

Figure 1.2 General Scheme of Steganography

Steganography comes from the Greek words steganos (covered) and graptos (writing). The most common use of steganography is to hide a file inside another file. The goal of modern steganography is to keep the mere presence of a hidden message undetectable, but steganographic systems, because of their invasive nature, leave behind detectable traces in the cover medium. Even if the secret content is not revealed, its existence can be guessed, because modifying the cover medium changes its statistical properties, and eavesdroppers can detect the resulting distortions in the stego medium. When information or a file is hidden inside a carrier file, the data is usually encrypted with a password.

Steganography is often confused with cryptography because both are used to protect important information. The difference between the two is that steganography hides information so that it appears that no information is hidden at all. Steganography and cryptography are cousins in the spycraft family. Cryptography scrambles a message so that it cannot be understood. Steganography hides the message so that it cannot be seen. If a person views the object in which the information is hidden, he or she will have no idea that there is any hidden information, and will therefore not attempt to decrypt it.

Steganography in the modern sense of the word usually refers to information or a file that has been concealed inside a digital picture, video or audio file. What steganography essentially does is exploit human perception: human senses are not trained to look for files that have information hidden inside them, although there are programs available that can perform what is called steganalysis (detecting the use of steganography).

1.3 Terminology used in Steganography


In the field of steganography, some terminology has developed.

The adjectives "cover", "embedded" and "stego" were defined at the Information Hiding Workshop held in Cambridge, England. The term "cover" is used to describe the original, innocent message, data, audio, still image, video and so on. When referring to audio signal steganography, the cover signal is sometimes called the "host" signal. The information to be hidden in the cover data is known as the "embedded" data. The "stego" data is the data containing both the cover signal and the "embedded" information. Logically, the process of putting the hidden or embedded data into the cover data is known as embedding. Occasionally, especially when referring to image steganography, the cover image is known as the container.

The following formula provides a very generic description of the pieces of the steganographic process:

cover_medium + hidden_data + stego_key = stego_medium
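The formula can be read as a function of three inputs. The sketch below is only one possible reading, under the assumption that the stego key seeds a pseudo-random choice of byte positions and that the hidden bits replace the least significant bit at those positions. The names embed and extract are illustrative, and Python's random module is used for brevity even though it is not cryptographically secure.

```python
import random


def embed(cover_medium: bytes, hidden_bits: list, stego_key: str) -> bytes:
    """cover_medium + hidden_data + stego_key -> stego_medium (assumed LSB scheme)."""
    rng = random.Random(stego_key)                 # key-seeded position generator
    positions = rng.sample(range(len(cover_medium)), len(hidden_bits))
    stego = bytearray(cover_medium)
    for pos, bit in zip(positions, hidden_bits):
        stego[pos] = (stego[pos] & 0xFE) | bit     # overwrite the lowest bit only
    return bytes(stego)


def extract(stego_medium: bytes, n_bits: int, stego_key: str) -> list:
    """Regenerate the same positions from the key and read the bits back."""
    rng = random.Random(stego_key)
    positions = rng.sample(range(len(stego_medium)), n_bits)
    return [stego_medium[pos] & 1 for pos in positions]


cover = bytes(range(64))                           # stand-in for real cover data
hidden = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, hidden, "shared secret")
print(extract(stego, len(hidden), "shared secret") == hidden)
```

Without the key, an observer does not know which bytes carry the message, which is the role the stego key plays in the formula.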

STEGANOGRAPHY FUNDAMENTALS

Chapter 2 Steganography Fundamentals

2.1 General Block Diagram of Steganography

Figure 2.1 Block Diagram of Steganography

Description:

On the sender side:

Secret Message: The message that needs to be hidden behind a media file. The file can be audio, video, text or image.

Cover Message: Any type of media file behind which the secret message can be hidden.

Secret Key: A secret key is used to encrypt the message. It acts as a password that provides additional security. The key scheme can be symmetric or asymmetric. With a symmetric (shared) key, the same key is used at both the sender and the receiver side; with an asymmetric (public/private) key pair, the sender and the receiver use different keys to encrypt and decrypt the message.

Embedding Algorithm: The embedding algorithm is used to combine the cover message, the secret message and the key. The combined message is called the stego message.

Stego Message: The stego message is generated by the embedding algorithm. It is a combination of the cover message, the hidden message and the key. The stego message is transmitted over the network by the sender.

On the receiver side:

The receiver needs to detect whether the received message is a stego message or not. Steganalysis is the technique that detects whether a cover message contains any hidden data. Some tools are available that can identify a secret message hidden behind a received cover message. On the receiver side, there are two possibilities:

1. Received message is not a stego message: If the received message does not contain any hidden data, it is directly accessible at the receiver side and no further steganalysis is needed.
2. Received message is a stego message: If the received message contains hidden data, a message retrieving algorithm is required to extract the secret message from it. Once steganalysis has confirmed that some data is hidden behind the received message, the receiver must know the secret key, which is required to decrypt the message. No one except the intended receiver is able to decode the secret message, since doing so requires both the secret key and the message retrieving algorithm.

TYPES OF STEGANOGRAPHY

Chapter 3 Types of Steganography

Steganography can also be classified on the basis of carrier media. The most commonly used media are text, image, audio and video. There are mainly four types of steganography:
1. Image Steganography
2. Text Steganography
3. Audio Steganography
4. Video Steganography

3.1 Image Steganography


In this section we deal with data encoding in still digital images. In essence, image steganography is about exploiting the limited powers of the human visual system (HVS). Within reason, any plain text, cipher text, other images, or anything that can be embedded in a bit stream can be hidden in an image. Image steganography has come quite far in recent years with the development of fast, powerful graphical computers, and steganographic software is now readily available over the Internet for everyday users.

Some Guidelines to Image Steganography:


Before proceeding further, some explanation of image files is necessary. To a computer, an image is an array of numbers that represent light intensities at various points, or pixels. These pixels make up the image's raster data. An image size of 640 by 480 pixels, utilizing 256 colors (8 bits per pixel), is fairly common. Such an image would contain around 300 kilobytes of data.
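The size of the raster data follows directly from width times height times bits per pixel. A minimal sketch of that arithmetic is given below; the function name raw_image_bytes is illustrative, and file headers and palettes are ignored.

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the uncompressed raster data in bytes (headers and palette ignored)."""
    return width * height * bits_per_pixel // 8


# 640 x 480 pixels at 8 bits per pixel (256 colors)
small = raw_image_bytes(640, 480, 8)
print(small, "bytes =", small / 1024, "KiB")              # 307200 bytes = 300.0 KiB

# For comparison, a 1024 x 768 true-color image at 24 bits per pixel
large = raw_image_bytes(1024, 768, 24)
print(large, "bytes =", large / (1024 * 1024), "MiB")     # 2359296 bytes = 2.25 MiB
```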

Digital images are typically stored in either 24-bit or 8-bit per pixel files. 24-bit images are sometimes known as true color images. Obviously, a 24-bit image provides more space for hiding information; however, 24-bit images are generally large and not that common. A 24-bit image 1024 pixels wide by 768 pixels high would have a size in excess of 2 megabytes. As such large files would attract attention were they to be transmitted across a network or the Internet, image compression is desirable. However, compression brings with it other problems, as will be explained shortly.

Alternatively, 8-bit color images can be used to hide information. In 8-bit color images (such as GIF files), each pixel is represented as a single byte. Each pixel merely points to a color index table, or palette, with 256 possible colors. The pixel's value, then, is between 0 and 255. The image software merely needs to paint the indicated color on the screen at the selected pixel position. If using an 8-bit image as the cover image, many steganography experts recommend using images featuring 256 shades of grey as the palette, for reasons that will become apparent. Grey-scale images are preferred because the shades change very gradually between palette entries. This increases the image's ability to hide information.

When dealing with 8-bit images, the steganographer will need to consider the image as well as the palette. Obviously, an image with large areas of solid color is a poor choice, as variances created by embedded data might be noticeable. Once a suitable cover image has been selected, an image encoding technique needs to be chosen.

Image Compression:
Image compression offers a solution to large image files. The two kinds of image compression are lossless and lossy compression. Both methods save storage space but have differing effects on any uncompressed hidden data in the image. Lossy compression, as typified by JPEG (Joint Photographic Experts Group) format files, offers high compression but may not maintain the original image's integrity. This can impact negatively on any hidden data in the image. This is because the lossy compression algorithm may "lose" unnecessary image data, providing a close approximation to high-quality digital images but not an exact duplicate. Hence the term "lossy" compression.

Lossy compression is frequently used on true-color images, as it offers high compression rates. Lossless compression maintains the original image data exactly; hence it is preferred when the original information must remain intact. It is thus more favored by steganographic techniques. Unfortunately, lossless compression does not offer such high compression rates as lossy compression. Typical examples of lossless formats are CompuServe's GIF (Graphics Interchange Format) and Microsoft's BMP (Bitmap) format.

Image Encoding Techniques:


Information can be hidden in images in many different ways. Straight message insertion can be done, which will simply encode every bit of information in the image. More complex encoding can be done to embed the message only in "noisy" areas of the image, where it will attract less attention. The message may also be scattered randomly throughout the cover image. The most common approaches to information hiding in images are:
1. Least significant bit (LSB) insertion
2. Masking and filtering techniques

Each of these can be applied to various images, with varying degrees of success. Each of them suffers to varying degrees from operations performed on images, such as cropping, or resolution decrementing, or decreases in the color depth.

3.1.1 Least Significant bit insertion


The least significant bit insertion method is probably the best-known image steganography technique. It is a common, simple approach to embedding information in a graphical image file. Unfortunately, it is extremely vulnerable to attacks such as image manipulation. A simple conversion from GIF or BMP format to a lossy compression format such as JPEG can destroy the hidden information in the image.

DETAILED IMAGE STEGANOGRAPHY WORK BASED ON LSB INSERTION Flow Diagram-

Figure 3.1 Image Steganography based on LSB Insertion Algorithm

When applying LSB techniques to each byte of a 24-bit image, three bits can be encoded into each pixel, as each pixel is represented by three bytes. Such changes to the pixel bits are indiscernible to the human eye. For example, nine message bits can be hidden in three pixels. Assume the original three pixels are represented by the three 24-bit words below:

(00100111 11101001 11001000) (00100111 11001000 11101001) (11001000 00100111 11101001)

Let the nine message bits be 101101101. Inserting them into the three pixels, one bit per byte starting from the top left byte, would result in:

(00100111 11101000 11001001) (00100111 11001000 11101001) (11001001 00100110 11101001)

Only four of the nine least significant bits actually changed (those of the second, third, seventh and eighth bytes); the other five already held the required value. The main advantage of LSB insertion is that data can be hidden in the least and second least significant bits and the human eye would still be unable to notice it. When using LSB techniques on 8-bit images, more care needs to be taken, as 8-bit formats are not as forgiving of data changes as 24-bit formats are. Care needs to be taken in the selection of the cover image, so that changes to the data will not be visible in the stego image. Commonly known images (such as famous paintings, like the Mona Lisa) should be avoided. In fact, a simple picture of your dog would be quite sufficient. When modifying the LSBs in 8-bit images, the pointers to entries in the palette are changed. It is important to remember that a change of even one bit could mean the difference between a shade of red and a shade of blue. Such a change would be immediately noticeable on the displayed image, and is thus unacceptable. For this reason, data-hiding experts recommend using grey-scale palettes, where the differences between shades are not as pronounced. Alternatively, images consisting mostly of one color can be used, such as those with the so-called Renoir palette, so named because it comes from a 256-color version of Renoir's "Le Moulin de la Galette".
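The three-pixel example can be reproduced with a few lines of code. This is a minimal sketch of plain LSB substitution on raw bytes, not a complete image tool; the function names embed_lsb and extract_lsb are illustrative, and reading or writing an actual image file is left out.

```python
def embed_lsb(cover: bytes, bits: str) -> bytes:
    """Overwrite the least significant bit of each cover byte with one message bit."""
    if len(bits) > len(cover):
        raise ValueError("message is longer than the cover can hold")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | int(bit)
    return bytes(stego)


def extract_lsb(stego: bytes, n_bits: int) -> str:
    """Read back the least significant bits of the first n_bits bytes."""
    return "".join(str(b & 1) for b in stego[:n_bits])


# The three 24-bit pixels from the example, one byte per color component.
cover = bytes([
    0b00100111, 0b11101001, 0b11001000,
    0b00100111, 0b11001000, 0b11101001,
    0b11001000, 0b00100111, 0b11101001,
])
message_bits = "101101101"

stego = embed_lsb(cover, message_bits)
print(" ".join(format(b, "08b") for b in stego))
changed = sum(c != s for c, s in zip(cover, stego))
print("bytes changed:", changed)                     # bytes changed: 4
print(extract_lsb(stego, len(message_bits)))         # 101101101
```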

3.1.2 Masking and Filtering

Masking and filtering techniques hide information by marking an image in a manner similar to paper watermarks. Because watermarking techniques are more integrated into the image, they may be applied without fear of image destruction from lossy compression. By covering, or masking, a faint but perceptible signal with another to make the first non-perceptible, we exploit the fact that the human visual system cannot detect slight changes in certain temporal domains of the image. Technically, watermarking is not a steganographic form. Strictly, steganography conceals data in the image; watermarking extends the image information and becomes an attribute of the cover image, providing license, ownership or copyright details. Masking techniques are more suitable for use in lossy JPEG images than LSB insertion because of their relative immunity to image operations such as compression and cropping.

3.2 Text Steganography

Figure 3.2 Text steganography (diagram labels: Codebook, Encoder, Original Document)

The illegal distribution of documents through modern electronic means, such as electronic mail, allows infringers to make identical copies of documents without paying royalties or revenues to the original author. To counteract this possible wide-scale piracy, a method has been proposed for marking printable documents with a unique codeword that is indiscernible to readers but can be used to identify the intended recipient of a document just by examination of a recovered document. An additional application of text steganography suggested by Bender et al. is annotation, that is, checking that a document has not been tampered with. Hidden data in text could even be used by mail servers to check whether documents should be posted or not.

The marking techniques described are to be applied either to an image representation of a document or to a document format file, such as PostScript or TeX files. The idea is that a codeword (such as a binary number, for example) is embedded in the document by altering particular textual features. By applying each bit of the codeword to a particular document feature, we can encode the codeword. It is the type of feature that identifies a particular encoding method. Three features are described in the following subsections:

3.2.1 Line-Shift Coding:


In this method, text lines are vertically shifted to encode the document uniquely. Encoding and decoding can generally be applied either to the format file of a document, or the bitmap of a page image. By moving every second line of document either 1/300 of an inch up or down, it was found that line-shift coding worked particularly well, and documents could still be completely decoded, even after the tenth photocopy. However, this method is probably the most visible text coding technique to the reader. Also, line-shift encoding can be defeated by manual or automatic measurement of the number of pixels between text baselines. Random or uniform respacing of the lines can damage any attempts to decode the codeword. However, if a document is marked with line-shift coding, it is particularly difficult to remove the encoding if the document is in paper format. Each page will need to be rescanned, altered, and reprinted. This is complicated even further if the printed document is a photocopy, as it will then suffer from effects such as blurring, and salt-and-pepper noise.
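Line-shift coding can be sketched on a list of text baseline positions. The sketch assumes a page image at 300 dots per inch, so that a shift of one pixel corresponds to 1/300 of an inch, and assumes that the unshifted neighbouring lines serve as the reference for decoding. The function names and the convention that a 1 bit moves a line down are illustrative choices.

```python
def encode_line_shift(baselines, bits, shift=1):
    """Shift every second line down (bit 1) or up (bit 0) by `shift` pixels."""
    targets = list(range(1, len(baselines) - 1, 2))   # lines with a fixed neighbour on each side
    if len(bits) > len(targets):
        raise ValueError("not enough lines to carry the codeword")
    marked = list(baselines)
    for line, bit in zip(targets, bits):
        marked[line] += shift if bit else -shift
    return marked


def decode_line_shift(marked):
    """Compare the gap above and below each candidate line to recover its bit."""
    bits = []
    for line in range(1, len(marked) - 1, 2):
        gap_above = marked[line] - marked[line - 1]
        gap_below = marked[line + 1] - marked[line]
        if gap_above != gap_below:                    # an unshifted line carries no bit
            bits.append(1 if gap_above > gap_below else 0)
    return bits


baselines = [50 * i for i in range(7)]                # seven evenly spaced lines, in pixels
marked = encode_line_shift(baselines, [1, 0, 1])
print(marked)                                         # [0, 51, 100, 149, 200, 251, 300]
print(decode_line_shift(marked))                      # [1, 0, 1]
```

Because decoding only compares neighbouring gaps, it also shows why uniform respacing of the lines, as mentioned above, erases the codeword.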

3.2.2 Word-Shift Coding


In word-shift coding, code words are coded into a document by shifting the horizontal locations of words within text lines, while maintaining a natural spacing appearance. This encoding can also be applied to either the format file or the page image bitmap. The method, of course, is only applicable to documents with variable spacing between adjacent words, such as in documents that have been text-justified. As a result of this variable spacing, it is necessary to have the original image, or to at least know the spacing between words in the unencoded document.

The following is a simple example of how word-shifting might work. For each textline, the largest and smallest spaces between words are found. To code a line, the largest spacing is reduced by a certain amount, and the smallest is extended by the same amount. This maintains the line length, and produces little visible change to the text. Word-shift coding should be less visible to the reader than line-shift coding, since the spacing between adjacent words on a line is often shifted to support text justification. However, word-shifting can also be detected and defeated, in either of two ways. If one knows the algorithm used by the formatter for text justification, actual spaces between words could then be measured and compared to the formatter's expected spacing. The differences in spacing would reveal encoded data. A second method is to take two or more distinctly encoded, uncorrupted documents and perform page by page pixel-wise difference operations on the page images. One could then quickly pick up word shifts and the size of the word displacement. By respacing the shifted words back to the original spacing produced under the formatter, or merely applying random horizontal shifts to all words in the document not found at column edges, an attacker could eliminate the encoding. However, it is felt that these methods would be time-consuming and painstaking.
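The simple word-shift example described above (shrink the largest space, widen the smallest by the same amount) can be sketched as follows. Space widths are given in pixels, the function names are illustrative, and the sketch assumes the line really has variable spacing; if all spaces are equal, nothing changes.

```python
def encode_word_shift(spaces, amount=1):
    """Shrink the widest inter-word space and widen the narrowest by `amount`."""
    widest = max(range(len(spaces)), key=lambda i: spaces[i])
    narrowest = min(range(len(spaces)), key=lambda i: spaces[i])
    marked = list(spaces)
    marked[widest] -= amount
    marked[narrowest] += amount
    return marked


def spacing_differences(original, marked):
    """Decoding needs the original spacing: report how each space changed."""
    return [m - o for o, m in zip(original, marked)]


spaces = [6, 9, 7, 5, 8]                       # widths of the spaces on one text line
marked = encode_word_shift(spaces)
print(marked)                                  # [6, 8, 7, 6, 8]
print(sum(spaces) == sum(marked))              # True: the line length is preserved
print(spacing_differences(spaces, marked))     # [0, -1, 0, 1, 0]
```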

3.2.3 Feature Coding:


A third method of coding data into text is known as feature coding. This is applied either to the bitmap image of a document or to a format file. In feature coding, certain text features are altered, or not altered, depending on the codeword. For example, one could encode bits into text by extending or shortening the upward, vertical end lines of letters such as b, d, h, etc. Generally, before encoding, feature randomization takes place. That is, character end line lengths are randomly lengthened or shortened, then altered again to encode the specific data. This removes the possibility of visual decoding, as the original end line lengths would not be known. Of course, to decode, one requires the original image, or at least a specification of the change in pixels at a feature. Because documents usually contain a high number of features that can be altered, feature coding supports a high amount of data encoding. Also, feature encoding is largely indiscernible to the reader. Finally, feature encoding can be applied directly to image files, which removes the need for a format file.

When trying to attack a feature-coded document, it is interesting that a purely random adjustment of end line lengths is not a particularly strong attack on this coding method. Feature coding can be defeated by adjusting each end line length to a fixed value. This can be done manually, but would be painstaking. Although this process can be automated, it can be made more challenging by varying the particular feature to be encoded. To complicate the issue even further, word shifting might be used in conjunction with feature coding, for example. Efforts such as this can place enough impediments in the attacker's way to make the job difficult and time-consuming.

Example of Text Steganography:


A message could be arranged in the second letter of every word in a cover message.

Cover message: "Accepted your overture. Next Friday, about eleven, come away anywhere."

Hidden message: Cover blown
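Reading the hidden message back is a matter of taking the second letter of each word. A minimal sketch, in which the function name second_letters is illustrative and punctuation is simply stripped from each word:

```python
import string


def second_letters(cover_text: str) -> str:
    """Collect the second letter of every word, ignoring surrounding punctuation."""
    words = [word.strip(string.punctuation) for word in cover_text.split()]
    return "".join(word[1] for word in words if len(word) > 1)


cover = "Accepted your overture. Next Friday, about eleven, come away anywhere."
print(second_letters(cover))    # coverblown
```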

3.3 Audio Steganography


Embedding data in audio generally proceeds in four steps:

A. Alteration
In the first step, message bits are substituted for the target bits of the samples. Target bits are the bits located in the layer that we want to alter. This is a simple substitution that does not require the quality of the result to be measured.

B. Modification
This step is the most important and essential part of the algorithm. All the results and achievements that we expect depend on it, so efficient and intelligent algorithms are useful here. In this stage the algorithm tries to decrease the amount of error and improve transparency.

C. Verification
This stage acts as quality control. What the algorithm could do has been done, and now the outcome must be verified. If the difference between the original sample and the new sample is acceptable, the new sample is accepted. Otherwise it is rejected and the original sample is used in reconstructing the new audio file instead.

D. Reconstruction
The last step is the creation of the new audio file (stego file). This is done sample by sample. There are two possible inputs to this step: either the modified sample, or the original sample, which is identical to that of the host audio file. This is why we can claim that the algorithm does not alter all samples, or any predictable set of samples. Which samples are used and modified depends on the status of the samples (their environment) and on the decision of the intelligent algorithm.

Figure 3.3 General Diagram of Audio Steganography

This section presents some common methods used in audio steganography.

3.3.1 LSB Coding


Least significant bit (LSB) coding is the simplest way to embed information in a digital audio file. By substituting the least significant bit of each sampling point with a binary message, LSB coding allows for a large amount of data to be encoded. The following diagram illustrates how the message 'HEY' is encoded in a 16-bit CD quality sample using the LSB method:

Figure 3.4 Audio steganography based on LSB insertion algorithm

Standard LSB algorithm:
It performs bit-level manipulation to encode the message. The steps are as follows:
a. Receive the audio file in the form of bytes and convert it into a bit pattern.
b. Convert each character in the message into a bit pattern.
c. Replace the LSB of each audio sample with the next bit of the message.

In LSB coding, the ideal data transmission rate is 1 kbps per 1 kHz of sampling rate. In some implementations of LSB coding, however, the two least significant bits of a sample are replaced with two message bits. This increases the amount of data that can be encoded but also increases the amount of resulting noise in the audio file. Thus, one should consider the signal content before deciding on the LSB operation to use. For example, a sound file that was recorded in a bustling subway station would mask low-bit encoding noise. On the other hand, the same noise would be audible in a sound file containing a piano solo.

The main advantage of the LSB coding method is the low computational complexity of the algorithm. Its major disadvantage is that, as the number of LSBs used during coding increases or, equivalently, the depth of the modified LSB layer becomes larger, the probability of the embedded message being statistically detectable increases and the perceptual transparency of stego objects decreases. Low-bit encoding is therefore an undesirable method, mainly due to its failure to meet the steganography requirement of being undetectable.
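The steps of the standard LSB algorithm can be sketched on a list of integer samples. This is an illustrative sketch only: it assumes the samples have already been read from the audio file as signed integers, it embeds an ASCII message one bit per sample, and the function names are not taken from any particular tool.

```python
def text_to_bits(message: str) -> list:
    """Turn each character into its 8-bit pattern, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in message.encode("ascii")
            for shift in range(7, -1, -1)]


def embed_audio_lsb(samples: list, message: str) -> list:
    """Replace the LSB of successive samples with successive message bits."""
    bits = text_to_bits(message)
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the message")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit      # also correct for negative samples
    return stego


def extract_audio_lsb(stego: list, n_chars: int) -> str:
    """Collect 8 LSBs per character and rebuild the text."""
    bits = [sample & 1 for sample in stego[:8 * n_chars]]
    codes = [int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)]
    return bytes(codes).decode("ascii")


samples = [37 * i - 400 for i in range(24)]   # stand-in for 24 signed 16-bit samples
stego = embed_audio_lsb(samples, "HEY")
print(extract_audio_lsb(stego, 3))                          # HEY
print(max(abs(a - b) for a, b in zip(samples, stego)))      # never more than 1
```

At one bit per sample, CD audio sampled at 44.1 kHz would carry about 44.1 kbps per channel, which matches the rate of 1 kbps per 1 kHz quoted above.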

3.3.2 Phase Coding


Phase coding addresses the disadvantages of the noise-inducing methods of audio Steganography. Phase coding relies on the fact that the phase components of sound are not as perceptible to the human ear as noise is. Rather than introducing perturbations, the technique encodes the message bits as phase shifts in the phase spectrum of a digital signal, achieving an inaudible encoding in terms of signal-to-perceived noise ratio.

Figure 3.5 Phase coding in Audio Sine Wave Signal

The phase coding method breaks down the sound file into a series of N segments. A Discrete Fourier Transform (DFT) is applied to each segment to create a matrix of the phases and magnitudes. The phase difference between each pair of adjacent segments is calculated. An artificial absolute phase p0 is created for the first segment (s0), and new phase frames are created for all other segments so that the phase differences are preserved. The new phase and the original magnitude are combined to get each new segment, Sn. These new segments are then concatenated to create the encoded output, and the frequency content remains preserved. In order to decode the hidden information, the receiver must know the length of the segments and the data interval used. The first segment is detected as a 0 or a 1, and this indicates where the message starts. This method has many advantages over low-bit encoding, the most important being that it is undetectable to the human ear. Like all of the techniques described so far, though, its weakness is still its lack of robustness to changes in the audio data. Any single sound operation or change to the data would distort the information and prevent its retrieval.
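The procedure can be sketched with a naive DFT in pure Python. This is a simplified illustration rather than a production encoder: it assumes bit 0 is written as a phase of +pi/2 and bit 1 as -pi/2 in the lowest frequency bins of the first segment, that the segment length is known to the receiver, and that those bins carry non-negligible energy. All function names are illustrative.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def phase_encode(signal, bits, seg_len):
    if len(bits) >= seg_len // 2:
        raise ValueError("too many bits for this segment length")
    starts = range(0, len(signal) - seg_len + 1, seg_len)
    spectra = [dft(signal[s:s + seg_len]) for s in starts]
    mags = [[abs(c) for c in sp] for sp in spectra]
    phases = [[cmath.phase(c) for c in sp] for sp in spectra]
    # Phase differences between consecutive segments, to be preserved.
    deltas = [[phases[n][k] - phases[n - 1][k] for k in range(seg_len)]
              for n in range(1, len(spectra))]
    current = list(phases[0])
    for i, bit in enumerate(bits):
        k = i + 1                                    # skip the DC bin
        current[k] = -math.pi / 2 if bit else math.pi / 2
        current[seg_len - k] = -current[k]           # keep the output real-valued
    out = []
    for n in range(len(spectra)):
        if n > 0:
            current = [current[k] + deltas[n - 1][k] for k in range(seg_len)]
        out.extend(idft([mags[n][k] * cmath.exp(1j * current[k]) for k in range(seg_len)]))
    out.extend(signal[len(spectra) * seg_len:])      # leftover samples stay unchanged
    return out


def phase_decode(stego, n_bits, seg_len):
    spectrum = dft(stego[:seg_len])
    return [1 if cmath.phase(spectrum[k]) < 0 else 0 for k in range(1, n_bits + 1)]


signal = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t + 0.4) + 0.25 * math.cos(2.3 * t)
          for t in range(64)]
bits = [1, 0, 1, 1]
stego = phase_encode(signal, bits, seg_len=16)
print(phase_decode(stego, len(bits), seg_len=16) == bits)
```

Only the phases are modified; the magnitude of every bin in every segment is reused as it was, which is why the frequency content is preserved.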

3.3.3 Echo Hiding


Echo hiding embeds its data by introducing an echo into the source audio. Three parameters of the artificial echo are varied to hide the embedded data: the delay, the decay rate and the initial amplitude. As the delay between the original source audio and the echo decreases, it becomes harder for the human ear to distinguish between the two signals, until eventually the echo is heard merely as extra resonance. In addition, the offset is varied to represent the binary message to be encoded. One offset value represents a binary one, and a second offset value represents a binary zero. If only one echo were produced from the original signal, only one bit of information could be encoded. Therefore, the original signal is broken down into blocks before the encoding process begins. Once the encoding process is completed, the blocks are concatenated back together to create the final signal.
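The block-wise encoding just described can be sketched as follows. The delays of 2 and 4 samples and the decay of 0.5 are arbitrary illustrative values (real systems use delays on the order of a millisecond), the function names are hypothetical, and decoding is not shown.

```python
def add_echo(block, delay, decay):
    """Return the block plus a copy of itself delayed by `delay` samples and scaled by `decay`."""
    return [x + (decay * block[t - delay] if t >= delay else 0.0)
            for t, x in enumerate(block)]


def echo_encode(signal, bits, block_len, delay_zero=2, delay_one=4, decay=0.5):
    """Encode one bit per block by choosing between two echo offsets."""
    if len(bits) * block_len > len(signal):
        raise ValueError("signal is too short for the message")
    out = []
    for i, bit in enumerate(bits):
        block = signal[i * block_len:(i + 1) * block_len]
        out.extend(add_echo(block, delay_one if bit else delay_zero, decay))
    out.extend(signal[len(bits) * block_len:])      # remaining samples are left untouched
    return out


# Two blocks, each a single impulse, so the echo position is easy to see.
signal = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] * 2
stego = echo_encode(signal, [1, 0], block_len=8)
print(stego[:8])    # [1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]  echo at offset 4: bit 1
print(stego[8:])    # [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]  echo at offset 2: bit 0
```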

Figure 3.6 Echo Hiding Signal

The blocks are recombined to produce the final signal. The "one" echo signal is then multiplied by the "one" mixer signal and the "zero" echo signal is multiplied by the "zero" mixer signal. The two results are then added together to get the final signal. The final signal is less abrupt than the one obtained using the first echo hiding implementation, because the two mixer signals are complements of each other and ramp transitions are used within each signal. These two characteristics of the mixer signals produce smoother transitions between echoes. The following diagram summarizes the second implementation of the echo hiding process.

Figure 3.7 Echo Hiding process

To extract the secret message from the stego-signal, the receiver must be able to break up the signal into the same block sequence used during the encoding process. Then the autocorrelation function of the signal's cepstrum (the cepstrum is the forward Fourier transform of the signal's frequency spectrum) can be used to decode the message, because it reveals a spike at each echo time offset, allowing the message to be reconstructed. Much like phase encoding, this gives considerably better results than low-bit encoding and makes good use of research done so far in psychoacoustics. As with all sound file encoding, working in audio formats such as WAV is very costly, more so than with bitmap images, in terms of the ratio of file size to storage capacity. The transmission of audio files via e-mail or over the web is much less common than that of image files and so is more suspicious in comparison. Echo hiding allows for a high data transmission rate and provides superior robustness when compared to the noise-inducing methods.

3.4 Video Steganography

20

10CE022

TYPES OF STEGANOGRAPHY

A steganographic algorithm for compressed video is introduced here, operating directly in compressed bit stream. In a GOP, secret datas are embedded in I frame, and in P frames and in B frames. This proposed secure compressed video Steganographic architecture taking account of video statistical invisibility .The frame work is shown in the Figure 4.8

Figure 3.8 Video Steganography

This architecture consists of four functions: I, P and B frame extraction; the scene change detector; motion vector calculation; and the data embedder and steganalysis. The first function extracts the I, P and B frames from the MPEG video. Next, the scene change detector identifies the frames with maximum scene change. Because I frames in the MPEG standard are coded in intra-frame manner, the DC picture can be obtained by extracting the DC coefficients from the DCT coefficient codes. The data embedder then hides the secret message in the compressed video sequence without introducing perceptible distortion. Motion vectors in P and B frames can also be used for data hiding. In the proposed method, data are embedded in blocks based on motion vectors with large magnitude. Since the human visual system is less sensitive to distortion in regions that are temporally near to features of high luminance intensity, this property can be exploited for data hiding. Data are therefore not embedded in all blocks, but only in motion vectors with a magnitude above a threshold. A larger magnitude indicates faster temporal change and hence less visible degradation due to data hiding.
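
The threshold rule for motion vectors can be illustrated with a short Python sketch. It is a simplified model, not the proposed codec: motion vectors are assumed to be integer `(dx, dy)` pairs, the bit is assumed to go into the least significant bit of `dx`, and the names `selected`, `embed_bits` and `extract_bits` are hypothetical. The magnitude test ignores the bit that embedding may change, so the sender and the receiver select the same vectors.

```python
def selected(mv, threshold):
    """True if the vector is large enough to carry a bit."""
    dx, dy = mv
    base = dx & ~1  # ignore the LSB, which embedding may alter
    return base * base + dy * dy >= threshold * threshold


def embed_bits(vectors, bits, threshold):
    """Write bits into the LSB of dx for vectors above the threshold."""
    out, remaining = [], iter(bits)
    for dx, dy in vectors:
        if selected((dx, dy), threshold):
            bit = next(remaining, None)
            if bit is not None:
                dx = (dx & ~1) | bit
        out.append((dx, dy))
    return out


def extract_bits(vectors, threshold, count):
    """Read back the first `count` bits from the selected vectors."""
    bits = [dx & 1 for dx, dy in vectors if selected((dx, dy), threshold)]
    return bits[:count]
```

Small vectors, which describe slow-moving regions where changes would be more visible, are passed through untouched.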

Chapter 4 Steganalysis and tools

4.1 Steganalysis
Whereas the goal of steganography is to avoid drawing suspicion to messages hidden in other data, steganalysis aims to discover such covert messages and render them useless. Hiding information within electronic media requires alterations of the media properties that may introduce some form of degradation or unusual characteristics. These characteristics may act as signatures that broadcast the existence of the embedded message, thus defeating the purpose of steganography. Attacks on and analysis of hidden information may take several forms: detecting, extracting, and disabling or destroying hidden information. An attacker may also embed counter-information over the existing hidden information. Two methods are looked into here: detecting messages or their transmission, and disabling embedded information. These approaches (attacks) vary depending upon the methods used to embed the information into the cover media. Some amount of distortion and degradation may occur in carriers of hidden messages, even though such distortions cannot be detected easily by the human perceptual system. Such distortion may be anomalous compared with a "normal" carrier and, when discovered, may point to the existence of hidden information. Steganography tools vary in their approaches to hiding information. Without knowing which tool was used and which stego key, if any, was used, detecting the hidden information may become quite complex.

4.2 Detecting Hidden Information


Unusual patterns stand out and expose the possibility of hidden information. In text, small shifts in word and line spacing may be somewhat difficult for the casual observer to detect. However, appended spaces and "invisible" characters can be easily revealed by opening the file with a common word processor. The text may look "normal" if typed out on the screen, but if the file is opened in a word processor, the spaces, tabs, and other characters distort the text's presentation.

Images too may display distortions from hidden information. Selecting the proper combination of steganography tools and carriers is the key to successful information hiding. Some images may become grossly degraded with even small amounts of embedded information, and this visible noise will give away the existence of hidden information. The same is true of audio. Echoes and shadow signals reduce the chance of audible noise, but they can be detected with little processing.

Only after evaluating many original images and stego-images as to color composition, luminance, and pixel relationships do anomalies point to characteristics that are not "normal" in other images. Patterns become visible when evaluating many images used for applying steganography. Such patterns include unusual sorting of color palettes, relationships between colors in color indexes, and exaggerated "noise". One approach used to identify such patterns is to compare the original cover-images with the stego-images and note visible differences (a known-cover attack). Minute changes are readily noticeable when comparing the cover and stego-images. In making these comparisons with numerous images, patterns begin to emerge as possible signatures of steganography software. Some of these signatures may be exploited automatically to identify the existence of hidden messages and even the tools used in embedding them. With this knowledge base, even if the cover images are not available for comparison, the derived signatures are enough to imply the existence of a message and identify the tool used to embed it. However, in some cases recurring, predictable patterns are not readily apparent even if distortion between the cover and stego-images is noticeable.
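
A known-cover comparison can be sketched in a few lines of Python. The sketch assumes that both images are available as equal-length flat lists of 8-bit pixel values; the function name and the returned fields are hypothetical and are not taken from any particular steganalysis tool.

```python
def known_cover_diff(cover, stego):
    """Summarise how a stego-image differs from its known cover."""
    changed = [(c, s) for c, s in zip(cover, stego) if c != s]
    return {
        "changed": len(changed),
        "ratio": len(changed) / len(cover),
        # True when every altered pixel differs only in its lowest bit,
        # which is the footprint one would expect from simple LSB embedding.
        "lsb_only": bool(changed) and all((c ^ s) == 1 for c, s in changed),
    }
```

Collecting such summaries over many cover and stego pairs is one way the recurring signatures mentioned above could be noticed.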
A number of disk analysis utilities are available that can report and filter on hidden information in unused clusters or partitions of storage devices. A steganographic file system may also be vulnerable to detection through analysis of the system's partition information. Filters can also be applied to capture TCP/IP packets that contain hidden or invalid information in the packet headers. Internet firewalls are becoming more sophisticated and allow for much customization. Just as filters can be set to determine whether packets originate from within the firewall's domain and to check the validity of the SYN and ACK bits, so too can filters be configured to catch packets that carry information in supposedly unused or reserved space.
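
A filter of this kind can be sketched as follows. The sketch assumes the TCP header layout of RFC 9293, in which the low four bits of byte 12 are reserved and should be zero; the helper names are hypothetical, and a real firewall rule would be expressed in the firewall's own configuration language rather than in Python.

```python
import struct


def build_tcp_header(reserved=0, flags=0x12):
    """Build a 20-byte TCP header for testing (ports and numbers are arbitrary)."""
    offset_reserved = (5 << 4) | (reserved & 0x0F)  # data offset 5 words, then reserved bits
    return struct.pack("!HHIIBBHHH", 1234, 80, 1, 0, offset_reserved, flags, 8192, 0, 0)


def tcp_reserved_bits(header):
    """Return the value carried in the reserved bits of a TCP header."""
    if len(header) < 20:
        raise ValueError("TCP header must be at least 20 bytes")
    return header[12] & 0x0F


def is_suspicious(header):
    """Flag packets whose reserved bits are not all zero."""
    return tcp_reserved_bits(header) != 0
```

A packet built with `build_tcp_header(reserved=5)` would be flagged, while an ordinary SYN/ACK header with zeroed reserved bits would pass.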

4.3 Disabling Steganography


Detecting the existence of hidden information defeats steganography's goal of imperceptibility. Methods exist that produce results which are far more difficult to detect without the original image for comparison. At times the existence of hidden information may already be known, so detecting it is not always necessary. Disabling it and rendering it useless is then the next best alternative. With each method of hiding information there is a trade-off between the size of the payload (the amount of hidden information) that can be embedded and the survivability or robustness of that information to manipulation. The distortions in text caused by appended spaces and "invisible" characters can be easily revealed by opening the file with a word processor, and extra spaces and characters can be quickly stripped from text documents. The disabling or removal of hidden information in images comes down to image processing techniques. For LSB methods of inserting data, simply using a lossy compression technique, such as JPEG, is enough to render the embedded message useless. Images compressed with such a method are still pleasing to the human eye but no longer contain the hidden information. Tools exist to test the robustness of information hiding techniques in images. These tools automate image processing techniques such as warping, cropping, rotating, and blurring. Such tools and techniques should be used by those considering an investment in watermarking to provide a sense of security of copyright and licensing, just as password cracking tools are used by system administrators to test the strength of user and system passwords. If the password fails, the administrator should notify the password owner that the password is not secure. Hidden information may also be overwritten.
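
The fragility of LSB embedding can be shown with a small Python sketch. The `requantize` function is only a crude stand-in for lossy compression (it snaps each value to a multiple of `step`) and is not JPEG; all names here are assumptions made for illustration.

```python
def embed_lsb(pixels, bits):
    """Write one message bit into the lowest bit of each leading pixel."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(pixels, count):
    """Read the lowest bit of the first `count` pixels."""
    return [p & 1 for p in pixels[:count]]


def requantize(pixels, step=4):
    """Crude stand-in for lossy compression: round each value to a multiple of `step`."""
    return [min(255, (p + step // 2) // step * step) for p in pixels]
```

After requantization the pixel values change by at most a couple of grey levels, yet every lowest bit is reset, so the embedded message cannot be recovered.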
If information is added to some media such that the added information cannot be detected, then there exists some amount of additional information that may be added or removed within the same threshold which will overwrite or remove the embedded covert information. Audio and video are vulnerable to the same methods of disabling as images. Manipulation of the signals will alter embedded signals in the noise level (LSB), which may be enough to overwrite or destroy the embedded message. Filters can be used in an attempt to cancel out echoes or subtle signals, but this may not be as successful as expected.

Caution must be used in hiding information in unused space in files or file systems. File headers and reserved spaces are common places to look for out-of-place information. In file systems, unless the steganographic areas are in some way protected (as in a partition), the operating system may freely overwrite the hidden data, since the clusters are thought to be free. This is a particular annoyance with operating systems that do a lot of caching and create many temporary files. Utilities are also available which "clean" or wipe unused storage areas. In wiping, clusters are overwritten several times to ensure any data has been removed. Even in this extreme case, utilities exist that may recover portions of the overwritten information.

As with unused or reserved space in file headers, TCP/IP packet headers can also be reviewed easily. Just as firewall filters are set to test the validity of the source and destination IP addresses and of the SYN and ACK bits, so too can filters be configured to catch packets that carry information in supposedly unused or reserved space. If IP addresses are altered or spoofed to pass covert information, a reverse lookup in a domain name service (DNS) can verify the address. If the IP address is false, the packet can be dropped. Using this technique to hide information is risky, as TCP/IP headers may get overwritten in the routing process. Reserved bits can be overwritten and passed along without impacting the routing of the packet.

4.4 Steganography Tools

1 MP3Stego

MP3Stego will hide information in MP3 files during the compression process. The data is first compressed, encrypted and then hidden in the MP3 bit stream.
http://www.petitcolas.net/fabien/software/MP3Stego_1_1_17.zip

2 JPHide and JPSeek


JPHIDE and JPSEEK are programs which allow you to hide a file in a JPEG image. There are lots of versions of similar programs available on the internet, but JPHIDE and JPSEEK are rather special.
http://www.snapfiles.com/php/download.php?id=101911

3 GIFShuffle

The program gifshuffle is used to conceal messages in GIF images by shuffling the colour map, which leaves the image visibly unchanged. gifshuffle works with all GIF images, including those with transparency and animation, and in addition provides compression and encryption of the concealed message.
http://www.darkside.com.au/gifshuffle/
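
The idea of carrying a message in the order of a colour map can be sketched as follows. This is an illustration of permutation encoding in general and is not claimed to be gifshuffle's own algorithm: the message is assumed to be a non-negative integer, the palette entries are assumed to be distinct, and the function names are hypothetical.

```python
def encode(palette, number):
    """Reorder the palette so that the ordering itself encodes `number`."""
    items = sorted(palette)  # canonical order that both sides can recompute
    out = []
    for n in range(len(items), 0, -1):
        number, idx = divmod(number, n)
        out.append(items.pop(idx))
    if number:
        raise ValueError("message too large for this palette")
    return out


def decode(shuffled):
    """Recover the number from the order of the palette entries."""
    items = sorted(shuffled)
    digits = []
    for colour in shuffled:
        idx = items.index(colour)
        digits.append((idx, len(items)))
        items.pop(idx)
    number = 0
    for idx, n in reversed(digits):
        number = number * n + idx
    return number
```

A palette of n distinct colours has n! possible orderings, so a 256-colour map could in principle carry a message of well over a thousand bits while every pixel keeps its colour, provided the pixel indices are remapped to the new order.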

4 WbStego
WbStego is a tool that hides any type of file in bitmap images, text files, HTML files or Adobe PDF files. The file in which you hide the data is not visibly changed.
http://www.wbailer.com/wbstego

Chapter 5 Steganography Steps

5.1 Steganography Software:

Chapter 6 Application
Application of Steganography
Steganography can be used wherever data security is required.

6.1 Digital Watermarking


A digital watermark is a kind of marker covertly embedded in a noise-tolerant signal such as audio or image data. It is typically used to identify ownership of the copyright of such a signal. "Watermarking" is the process of computer-aided information hiding in a carrier signal; the hidden information should, but does not need to, contain a relation to the carrier signal. Digital watermarks may be used to verify the authenticity or integrity of the carrier signal or to show the identity of its owners. Digital watermarking may be used for a wide range of applications, such as:

Copyright protection
Source tracking (different recipients get differently watermarked content)
Broadcast monitoring (television news often contains watermarked video from international agencies)

Figure 6.1 Digital watermarking Process

The information to be embedded in a signal is called a digital watermark, although in some contexts the phrase digital watermark means the difference between the watermarked signal and the cover signal. The signal in which the watermark is to be embedded is called the host signal. A watermarking system is usually divided into three distinct steps: embedding, attack, and detection. In embedding, an algorithm accepts the host and the data to be embedded, and produces a watermarked signal. The watermarked digital signal is then transmitted or stored, usually transmitted to another person. If this person makes a modification, this is called an attack. While the modification may not be malicious, the term attack arises from the copyright protection application, where third parties may attempt to remove the digital watermark through modification. There are many possible modifications, for example lossy compression of the data, cropping an image or video, or intentionally adding noise. Detection is an algorithm which is applied to the attacked signal to attempt to extract the watermark from it. If the signal was unmodified during transmission, then the watermark is still present and may be extracted. In robust digital watermarking applications, the extraction algorithm should be able to produce the watermark correctly even if the modifications were strong. In fragile digital watermarking, the extraction algorithm should fail if any change is made to the signal.
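
The three steps can be sketched in Python as below. This is a toy model under stated assumptions: the host is a list of integer samples, each watermark bit is repeated in the lowest bit of `rep` consecutive samples, the attack simply flips the lowest bit at chosen positions, and detection takes a majority vote. The function names and the repetition scheme are hypothetical and only hint at what robustness means.

```python
def embed(host, mark, rep=3):
    """Embedding: repeat each watermark bit in the LSB of `rep` consecutive samples."""
    out = list(host)
    for i, bit in enumerate(mark):
        for j in range(rep):
            k = i * rep + j
            out[k] = (out[k] & ~1) | bit
    return out


def attack(signal, positions):
    """Attack: flip the LSB of the samples at the given positions."""
    out = list(signal)
    for k in positions:
        out[k] ^= 1
    return out


def detect(signal, length, rep=3):
    """Detection: recover each bit by majority vote over its `rep` copies."""
    bits = []
    for i in range(length):
        ones = sum(signal[i * rep + j] & 1 for j in range(rep))
        bits.append(1 if 2 * ones > rep else 0)
    return bits
```

With one flipped sample per group the majority vote still returns the original watermark, which is the robust behaviour. A fragile scheme would instead compare the received signal with the expected one and reject it after any change.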

6.2 Security Implementations:


The "secrecy" of the embedded data is essential in this area, and steganography has been applied to it. Steganography provides us with:
(A) Potential capability to hide the existence of confidential data.
(B) Hardness of detecting the hidden (i.e., embedded) data.
(C) Strengthening of the secrecy of the encrypted data.

It can use and sell IBES products as stand-alone or as a part of their bigger package. Hiding data on the network in case of a breach.

Example: In practice, when you use steganography, you must first select vessel data according to the size of the data to be embedded. The vessel should be innocuous. Then you embed the confidential data by using an embedding program (which is one component of the steganography software) together with some key. When extracting, you (or your party) use an extracting program (another component) to recover the embedded data with the same key. In this case you need a "key negotiation" before you start communication. Attaching a stego file to an e-mail message is the simplest example in this application area.
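
The role of the shared key can be illustrated with a short Python sketch. It assumes that the key seeds a pseudo-random choice of sample positions and that bits are stored in the lowest bit of each chosen sample; the names `positions`, `embed` and `extract` are hypothetical, and Python's `random` module is used only for illustration, since it is not a cryptographically secure generator.

```python
import random


def positions(key, vessel_len, count):
    """Derive `count` distinct sample positions from the shared key."""
    if count > vessel_len:
        raise ValueError("vessel too small for the data to be embedded")
    return random.Random(key).sample(range(vessel_len), count)


def embed(vessel, bits, key):
    """Embedding program: hide the bits at key-dependent positions."""
    out = list(vessel)
    for pos, bit in zip(positions(key, len(vessel), len(bits)), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out


def extract(stego, count, key):
    """Extracting program: read the bits back using the same key."""
    return [stego[pos] & 1 for pos in positions(key, len(stego), count)]
```

The size check mirrors the advice to choose the vessel according to the size of the embedded data, and a receiver without the key does not know which samples to read.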

6.3 Media Database System


Data can be stored as XML, Exif, XMP, PLUS, etc. Such metadata is used to record the date a photo was taken, its location, camera information and exposure. Keywords help in searching for images, and geotagging allows better sorting.

6.4 Usage in modern printers


Steganography is used by some modern printers, including HP and Xerox brand color laser printers. Tiny yellow dots are added to each page. The dots are barely visible and contain encoded printer serial numbers, as well as date and time stamps.

6.5 Other Applications


Steganography is also employed in various useful applications: by human rights organizations, as encryption is prohibited in some countries; in copyright control of materials; in enhancing the robustness of image search engines; and in smart IDs (identity cards), where an individual's details are embedded in their photograph. Other applications are video-audio synchronization, the safe circulation of secret data within companies, TV broadcasting, TCP/IP packets (for instance, a unique ID can be embedded into an image to analyze the network traffic of particular users), and checksum embedding. In medical imaging systems, a separation is considered necessary for confidentiality between patients' image data or DNA sequences and their captions, e.g. the physician, the patient's name, address and other particulars. A link must nevertheless be maintained between the image data and the personal information. Embedding the patient's information in the image could therefore be a useful safety measure and helps in solving such problems.

Steganography would provide an ultimate guarantee of authentication that no other security tool may ensure. Miaou et al. present an LSB embedding technique for electronic patient records based on bi-polar multiple-base data hiding. A pixel value difference between an original image and its JPEG version is taken to be a number conversion base. Inspired by the notion that steganography can be embedded as part of the normal printing process, the Japanese firm Fujitsu is developing technology to encode data into a printed picture that is invisible to the human eye but can be decoded by a mobile phone with a camera, as exemplified in Figure 6.2 (BBC News, 2007).

Figure 6.2: Use of Steganography in Mobile

Figure 6.3: Displays the application of deployment into a mobile phone

The process takes less than one second, as the embedded data is merely 12 bytes. Hence, users will be able to use their cellular phones to capture encoded data. Fujitsu charges a small fee for the use of its decoding software, which sits on the firm's own servers. The basic idea is to transform the image colour scheme prior to printing into its Hue, Saturation and Value (HSV) components, then embed into the Hue domain, to which human eyes are not sensitive. Mobile cameras can see the coded data and retrieve it. This application can be used for doctors' prescriptions, food wrappers, billboards, business cards and printed media.

Chapter 7 Advantages & Disadvantages


7.1 Advantages

1. Confidentiality:
An unauthorized person does not know that the sensitive data exists.

2. Survivability:
The data is not destroyed in transmission.

3. No detection:
It cannot easily be found out that data is hidden in a given file.

4. Visibility:
People cannot see any visible changes to the file in which the data is hidden.

5. It also provides copyright protection through digital watermarking.

7.2 Disadvantages of Steganography


1. A large amount of hidden data means a large file size, which may itself arouse suspicion.
2. Information can be leaked while it is being sent and received.
3. The confidentiality of the information is maintained by the algorithms, and if the algorithms are known then the protection is lost.
4. If this technique falls into the wrong hands, such as hackers, terrorists or criminals, it can be very dangerous for all.

Chapter 8 Conclusion & Future Extension

8.1 Conclusion & Future Extension


Steganography transmits secrets through apparently innocuous covers in an effort to conceal the existence of a secret. Digital image steganography and its derivatives are growing in use and application. In areas where cryptography and strong encryption are being outlawed, citizens are looking at steganography to circumvent such policies and pass messages covertly. As with the other great contests of the digital age (the battles between cryptographers and cryptanalysts, security experts and hackers, record companies and pirates), steganography and steganalysis will continually develop new techniques to counter each other. In the near future, the most important use of steganographic techniques will probably lie in the field of digital watermarking. Content providers are eager to protect their copyrighted works against illegal distribution, and digital watermarks provide a way of tracking the owners of these materials. Steganography might also become limited under laws, since governments have already claimed that criminals use these techniques to communicate. The possible uses of steganography techniques are as follows:

Hiding data on the network in case of a breach.
Peer-to-peer private communications.
Posting secret communications on the Web to avoid transmission.
Embedding corrective audio or image data in case corrosion occurs from poor connection or transmission.
