
Steganography is the art and science of concealing information in unremarkable cover media so as not to
arouse an eavesdropper’s suspicion. It is an application within the field of information security. As such,
steganography is characterized by a set of measures that rely on its strengths, and by countermeasures
(attacks) that exploit its weaknesses and vulnerabilities. Today,
computer and network technologies provide easy-to-use communication channels for steganography. The
aim of this project is to propose a modified high-capacity image steganography technique that depends on
the wavelet transform, with acceptable levels of imperceptibility and distortion in the cover image and a high
level of overall security.
Keywords: Steganography, security, wavelets, and information hiding.
The purpose of steganography is covert communication: hiding a message from a third party.
Steganography is often confused with cryptology because both are
used to protect important information. The difference between the two is that cryptology conceals only
the content of a message, whereas steganography hides the information so that it appears that no
information is hidden at all. Generally, in steganography, the
actual information is not maintained in its original format; it is converted into an alternative
equivalent multimedia file such as an image, video or audio file, which in turn is hidden within another object.
This apparent message (usually known as the cover text) is sent through the network to the recipient,
where the actual message is separated from it.
The majority of today’s steganographic systems use multimedia objects such as images, audio and video as
cover media because people often transmit digital pictures over email and other Internet communication. In
the modern approach, depending on the nature of the cover object, steganography can be divided into five types:
• Text Steganography
• Image Steganography
• Audio Steganography
• Video Steganography
• Protocol Steganography
Text Steganography
Since everyone can read, encoding text in neutral sentences is doubtfully effective. But taking the first
letter of each word of the previous sentence, we will see that it is possible and not very difficult. Hiding
information in plain text can be done in many different ways. Many techniques involve the modification of
the layout of a text, rules like using every n-th character or the altering of the amount of white space after
lines or between words. The last technique was successfully used in practice: even after a text had been
printed and copied on paper ten times, the secret message could still be retrieved. Another possible way
of storing a secret inside a text is using a publicly available cover source, a book or a newspaper, and
using a code which consists for example of a combination of a page number, a line number and a character
number. This way, no information stored inside the cover source will lead to the hidden message.
Discovering it relies solely on gaining knowledge of the secret key.
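The first-letter trick mentioned at the start of this section can be checked with a few lines of Python. This is only a minimal sketch; the helper name `first_letters` is introduced here for illustration and is not part of any library.

```python
def first_letters(sentence: str) -> str:
    """Return the first letter of every word in the sentence, lower-cased."""
    words = sentence.split()
    return "".join(w[0].lower() for w in words if w[0].isalpha())


# The opening sentence of this section acts as the cover text.
cover = ("Since everyone can read, encoding text in neutral sentences "
         "is doubtfully effective.")
hidden = first_letters(cover)
print(hidden)  # secretinside
```

The cover sentence reads naturally, yet its initials spell "secret inside".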
Image Steganography
To hide information, straight message insertion may encode every bit of information in the image or
selectively embed the message in “noisy” areas that draw less attention—those areas where there is a great
deal of natural colour variation. The message may also be scattered randomly throughout the image. A
number of ways exist to hide information in digital media. Common approaches include:
• Least significant bit insertion
• Masking and filtering
• Redundant Pattern Encoding
• Encrypt and Scatter
• Algorithms and transformations
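As an illustration of the first approach in the list, the following is a minimal sketch of least significant bit insertion. It assumes an 8-bit grayscale cover held in a NumPy array and writes the message bits into the first pixels in raster order; the function names `embed_lsb` and `extract_lsb` are hypothetical, and a practical scheme would also scatter and encrypt the bits.

```python
import numpy as np


def embed_lsb(cover: np.ndarray, message: bytes) -> np.ndarray:
    """Write the message bits into the least significant bit of the first pixels."""
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    flat = cover.flatten()  # flatten() returns a copy, the cover stays untouched
    if bits.size > flat.size:
        raise ValueError("message too long for this cover")
    # Clear the lowest bit of each used pixel, then set it to the message bit.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)


def extract_lsb(stego: np.ndarray, n_bytes: int) -> bytes:
    """Read n_bytes back from the least significant bits."""
    bits = stego.flatten()[:n_bytes * 8] & 1
    return np.packbits(bits).tobytes()


rng = np.random.default_rng(0)
cover_img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)  # stand-in cover image
stego_img = embed_lsb(cover_img, b"hi")
recovered = extract_lsb(stego_img, 2)
max_change = int(np.max(np.abs(stego_img.astype(int) - cover_img.astype(int))))
print(recovered, max_change)
```

Each pixel changes by at most one gray level, which is why LSB insertion is hard to see but also fragile under lossy compression.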

Audio Steganography
In a computer-based audio steganography system, secret messages are embedded in digital sound. The
secret message is embedded by slightly altering the binary sequence of a sound file. Existing audio
steganography software can embed messages in WAV, AU, and even MP3 sound files. Embedding secret
messages in digital sound is usually a more difficult process than embedding messages in other media,
such as digital images. For images, by comparison, masking is more robust than LSB insertion with respect
to compression, cropping, and some image processing: masking techniques embed information in significant
areas so that the hidden message is more integral to the cover image than just hiding it in the “noise”
level, which makes masking more suitable than LSB for, for instance, lossy JPEG images. Audio methods
range from simple schemes that insert information in the form of signal noise to more powerful methods
that exploit sophisticated signal processing techniques to hide information. Methods that are
commonly used for audio steganography include:
• LSB coding
• Parity coding
• Phase coding
• Spread spectrum
• Echo hiding
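Parity coding, from the list above, can be sketched as follows. The sketch assumes 16-bit PCM samples in a NumPy array and a fixed region size of four samples, both chosen here purely for illustration: one message bit is stored as the parity of the least significant bits of each region, and a single sample is adjusted only when the parity does not already match.

```python
import numpy as np


def embed_parity(samples: np.ndarray, bits: list, region: int = 4) -> np.ndarray:
    """Store one bit per region as the parity of the region's LSBs."""
    out = samples.copy()
    if len(bits) * region > out.size:
        raise ValueError("not enough samples for this message")
    for i, bit in enumerate(bits):
        block = out[i * region:(i + 1) * region]  # view into out
        parity = int(np.sum(block & 1) % 2)
        if parity != bit:
            block[0] ^= 1  # flip one LSB so the region parity matches the bit
    return out


def extract_parity(samples: np.ndarray, n_bits: int, region: int = 4) -> list:
    """Recover the bits by recomputing the parity of every region."""
    return [int(np.sum(samples[i * region:(i + 1) * region] & 1) % 2)
            for i in range(n_bits)]


rng = np.random.default_rng(1)
audio = rng.integers(-1000, 1000, size=32, dtype=np.int16)  # stand-in audio samples
secret_bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego_audio = embed_parity(audio, secret_bits)
decoded_bits = extract_parity(stego_audio, len(secret_bits))
largest_step = int(np.max(np.abs(stego_audio.astype(int) - audio.astype(int))))
print(decoded_bits, largest_step)
```

Compared with plain LSB coding, the sender is free to choose which sample in a region to alter; this sketch always picks the first one.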

Video Steganography
Video files are generally a collection of images and sounds, so most of the techniques presented for images
and audio can be applied to video files too. The great advantages of video are the large amount of data that
can be hidden inside and the fact that it is a moving stream of images and sounds. Therefore, any small but
otherwise noticeable distortions might go unobserved by humans because of the continuous flow of
information.

Protocol Steganography
The term protocol steganography refers to the technique of embedding information within messages and
network control protocols used in network transmission. In the layers of the OSI network model there exist
covert channels where steganography can be used [25]. An example of where information can be hidden is
in the header of a TCP/IP packet, in fields that are either optional or never used.
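As a rough illustration of hiding data in a header field, the sketch below packs a 20-byte TCP header with Python's `struct` module and places two hidden bytes in the urgent pointer, a field that receivers interpret only when the URG flag is set. The choice of this field, the port numbers and the helper names are assumptions made for illustration; the checksum is left at zero, so this is not a packet that could be sent as is.

```python
import struct

# Fixed part of a TCP header: source port, destination port, sequence number,
# acknowledgement number, data offset/reserved, flags, window, checksum,
# urgent pointer (network byte order, 20 bytes in total).
TCP_HEADER = struct.Struct("!HHIIBBHHH")
ACK_FLAG = 0x10  # URG (0x20) stays clear, so the urgent pointer is ignored


def build_header(src_port: int, dst_port: int, seq: int, hidden: bytes) -> bytes:
    """Pack a header whose otherwise unused urgent pointer carries two hidden bytes."""
    if len(hidden) != 2:
        raise ValueError("the urgent pointer holds exactly two bytes")
    (urgent_ptr,) = struct.unpack("!H", hidden)
    data_offset = 5 << 4  # header length of five 32-bit words, reserved bits zero
    return TCP_HEADER.pack(src_port, dst_port, seq, 0,
                           data_offset, ACK_FLAG, 8192, 0, urgent_ptr)


def read_hidden(header: bytes) -> bytes:
    """Recover the two hidden bytes from the urgent pointer field."""
    fields = TCP_HEADER.unpack(header[:TCP_HEADER.size])
    return struct.pack("!H", fields[8])


header = build_header(40000, 80, 12345, b"ok")
print(len(header), read_hidden(header))
```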

AN OVERVIEW OF THE WAVELET THEORY


In order to understand the wavelet transform better, we must know the Fourier transform in more detail.
The wavelet applications mentioned include numerical analysis, signal analysis, control applications and
the analysis and adjustment of audio signals. The Fourier transform is only able to retrieve the global
frequency content of a signal; the time information is lost. This is overcome by the short time Fourier
transform (STFT) which calculates the Fourier transform of a windowed part of the signal and shifts the
window over the signal. The short time Fourier transform gives the time-frequency content of a signal with
a constant frequency and time resolution due to the fixed window length. This is often not the most desired
resolution. For low frequencies, good frequency resolution is often more important than good time resolution.
For high frequencies, the time resolution is more important. A multi-resolution analysis becomes possible
by using wavelet analysis. The continuous wavelet transform is calculated analogously to the Fourier
transform, by the convolution between the signal and an analysis function. However, the trigonometric
analysis functions are replaced by a wavelet function. A wavelet is a short oscillating function which
contains both the analysis function and the window. Time information is obtained by shifting the wavelet
over the signal. The frequencies are changed by contraction and dilation of the wavelet function. The
continuous wavelet transform retrieves the time-frequency content information with an improved
resolution compared to the STFT. The discrete wavelet transform (DWT) uses filter banks to perform the
wavelet analysis. The discrete wavelet transform decomposes the signal into wavelet coefficients from
which the original signal can be reconstructed again. The wavelet coefficients represent the signal in
various frequency bands. The coefficients can be processed in several ways, giving the DWT attractive
properties over linear filtering.

Why do we need to transform signals?


Mathematical transformations are applied to signals to obtain further information from that signal that is
not readily available in the raw signal. We refer to a time-domain signal as a raw signal, and to a signal that has been
"transformed" by any of the available mathematical transformations as a processed signal. When we plot
time-domain signals, we obtain a time-amplitude representation of the signal. This representation is not
always the best representation of the signal for most signal-processing applications. In many cases,
the most distinguishing information is hidden in the frequency content of the signal. The frequency
spectrum of a signal is basically the frequency components (spectral components) of that signal; it
shows what frequencies exist in the signal. There are many transforms that
are used quite often by engineers and mathematicians. The Hilbert transform, short-time Fourier transform
(more about this later), Wigner distributions, the Radon transform, and of course our featured
transformation, the wavelet transform, constitute only a small portion of a huge list of transforms that are
at the engineer's and mathematician's disposal. Every transformation technique has its own area of
application, with advantages and disadvantages, and the wavelet transform (WT) is no exception.
Before studying any transform, we need some background on stationary and non-stationary
signals.
• Stationary signal
• Non-stationary signal
Stationary signal: Stationary signals are constant in their statistical parameters over time. If we look at a
stationary signal for a few moments and then wait an hour and look at it again, it would look essentially the
same, i.e. its overall level, amplitude distribution and standard deviation
would be about the same. Rotating machinery generally produces stationary vibration signals. Stationary
signals are further divided into deterministic and random signals. A stationary signal is illustrated
in the following figure:
Non-stationary signal: A signal whose frequency content changes in time. A signal whose frequency changes
constantly in time is known as a "chirp" signal and is one example of a non-stationary signal.

It can be explained with another example. The following figure plots a signal with four different frequency
components at four different time intervals, hence a non-stationary signal. The interval 0 to 300 ms has a
100 Hz sinusoid, the interval 300 to 600 ms has a 50 Hz sinusoid, the interval 600 to 800 ms has a 25 Hz
sinusoid, and finally the interval 800 to 1000 ms has a 10 Hz sinusoid.
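Since the figure is not reproduced here, the signal described above can be regenerated with a short NumPy sketch. The sampling rate of 1000 Hz and the use of cosines with unit amplitude are assumptions; only the interval boundaries and frequencies come from the text.

```python
import numpy as np

FS = 1000  # assumed sampling rate in Hz, i.e. one sample per millisecond
SEGMENTS = [(0, 300, 100), (300, 600, 50), (600, 800, 25), (800, 1000, 10)]

t = np.arange(1000) / FS  # one second of time stamps
signal = np.zeros(t.size)
for start_ms, end_ms, freq_hz in SEGMENTS:
    # With FS = 1000 the millisecond bounds are also sample indices.
    signal[start_ms:end_ms] = np.cos(2 * np.pi * freq_hz * t[start_ms:end_ms])


def dominant_frequency(segment: np.ndarray) -> float:
    """Frequency in Hz of the strongest FFT bin of a segment."""
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(segment.size, d=1 / FS)
    return float(freqs[np.argmax(spectrum)])


for start_ms, end_ms, _ in SEGMENTS:
    print(start_ms, end_ms, dominant_frequency(signal[start_ms:end_ms]))
```

Analysing each interval separately recovers the frequency that is active in it, which is exactly the time information that a single Fourier transform of the whole signal cannot provide.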
Fourier Transform:
The French mathematician J. Fourier showed that any periodic function can be expressed as an infinite
sum of periodic complex exponential functions. The FT decomposes a signal into complex exponential functions
of different frequencies. It is defined by the following two equations:

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-2j\pi f t}\, dt \tag{1}$$

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{2j\pi f t}\, df \tag{2}$$

In the above equations,


• t stands for time,
• f stands for frequency, and
• x denotes the signal at hand.
Note that x denotes the signal in time domain and the X denotes the signal in frequency domain. This
convention is used to distinguish the two representations of the signal. Equation (1) is called the Fourier
transform of x(t), and equation (2) is called the inverse Fourier transform of X(f), which is x(t).
Why is the Fourier transform not suitable for non-stationary signals?

The information provided by the integral corresponds to all time instances, since the integration is from
minus infinity to plus infinity over time. It follows that no matter where in time the component with
frequency "f" appears, it will affect the result of the integration equally. In other words, whether the
frequency component "f" appears at time t1 or t2, it will have the same effect on the integration. This is
why the Fourier transform is not suitable if the signal has a time-varying frequency, i.e., if the signal is
non-stationary. Only if the signal has the frequency component "f" at all times (for all "f" values) does the
result obtained by the Fourier transform make sense.

The Fourier transform thus tells us whether a certain frequency component exists or not. This
information is independent of where in time this component appears. It is therefore very important to know
whether a signal is stationary or not, prior to processing it with the FT.
Figure: Stationary signal and its Fourier transform

Figure: Non-stationary signal and its Fourier transform

Although the above two signals are different, their Fourier transforms are nearly the same. Thus, the
Fourier transform cannot distinguish the two signals very well. Therefore, the FT is not a suitable
tool for analyzing non-stationary signals, i.e., signals with time-varying spectra.
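This comparison can be reproduced numerically. The sketch below assumes a sampling rate of 1000 Hz and, for the stationary case, a sum of 10, 25, 50 and 100 Hz cosines that are present at all times (the composition of the original stationary signal is not given in the text, so this is an assumption). The non-stationary signal uses the four intervals described earlier.

```python
import numpy as np

FS = 1000  # assumed sampling rate in Hz
TONES = [10, 25, 50, 100]
t = np.arange(1000) / FS

# Stationary: all four tones are present for the whole second.
stationary = sum(np.cos(2 * np.pi * f * t) for f in TONES)

# Non-stationary: one tone at a time, in the intervals described in the text.
non_stationary = np.zeros(t.size)
for start, end, f in [(0, 300, 100), (300, 600, 50), (600, 800, 25), (800, 1000, 10)]:
    non_stationary[start:end] = np.cos(2 * np.pi * f * t[start:end])


def relative_spectrum(x: np.ndarray) -> np.ndarray:
    """Magnitude spectrum scaled so that its largest value is 1."""
    mag = np.abs(np.fft.rfft(x))
    return mag / mag.max()


spec_s = relative_spectrum(stationary)
spec_n = relative_spectrum(non_stationary)

# With 1000 samples at 1000 Hz, bin k corresponds to k Hz.
for f in TONES + [200]:
    print(f"{f:4d} Hz  stationary {spec_s[f]:.2f}  non-stationary {spec_n[f]:.2f}")
```

Both spectra show strong components at the same four frequencies and essentially nothing at 200 Hz, even though the two time-domain signals are very different.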

Problem with Fourier Transform:


The FT gives the spectral content of the signal, but it gives no information regarding where in time those
spectral components appear. Therefore, the FT is not a suitable technique for non-stationary signals, with one
exception: the FT can be used for non-stationary signals if we are only interested in what spectral components
exist in the signal, but not in where these occur. However, if this information is needed, i.e., if we
want to know which spectral components occur at what time (interval), then the Fourier transform is not the
right transform to use. This problem is addressed by the wavelet transform.
Short time Fourier Transform:

In the STFT, the signal is divided into short segments by a window of fixed length that is shifted over the
signal, and the Fourier transform of each windowed segment is computed. The fixed window length limits the
resolution, which motivates the multi-resolution approach used by the wavelet transform. There, we pass the
time-domain signal through various high-pass and low-pass filters, which filter out
either the high-frequency or the low-frequency portions of the signal. This procedure is repeated, each time
removing the portion of the signal corresponding to some band of frequencies.

Here is how this works: Suppose we have a signal which has frequencies up to 1000 Hz. In the first stage
we split the signal into two parts by passing it through a highpass and a lowpass filter (the filters
should satisfy certain conditions, the so-called admissibility condition), which results in two different
versions of the same signal: the portion of the signal corresponding to 0-500 Hz (low pass portion), and the
portion corresponding to 500-1000 Hz (high pass portion).

Then, we take either portion (usually the low pass portion) or both, and do the same thing again. This
operation is called decomposition.
Problem in STFT: we cannot know exactly what frequency exists at what time instance; we can only
know what frequency bands exist in what time intervals.
This limitation is analogous to the Heisenberg "uncertainty principle", which states that the momentum and
position of a moving particle cannot both be known exactly at the same time.
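The windowed Fourier analysis and its fixed resolution can be sketched with NumPy alone. The test signal (four tones in four intervals, sampled at an assumed 1000 Hz), the Hann window, the window length of 200 samples and the hop of 100 samples are all choices made for this illustration.

```python
import numpy as np

FS = 1000  # assumed sampling rate in Hz


def make_test_signal() -> np.ndarray:
    """100, 50, 25 and 10 Hz tones in consecutive intervals of a 1 s signal."""
    t = np.arange(1000) / FS
    x = np.zeros(t.size)
    for start, end, f in [(0, 300, 100), (300, 600, 50), (600, 800, 25), (800, 1000, 10)]:
        x[start:end] = np.cos(2 * np.pi * f * t[start:end])
    return x


def stft_magnitude(x: np.ndarray, window_len: int, hop: int) -> np.ndarray:
    """Magnitude of the Fourier transform of each windowed frame (frames x bins)."""
    window = np.hanning(window_len)
    starts = range(0, len(x) - window_len + 1, hop)
    frames = np.array([x[s:s + window_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))


WINDOW, HOP = 200, 100
mags = stft_magnitude(make_test_signal(), WINDOW, HOP)
freqs = np.fft.rfftfreq(WINDOW, d=1 / FS)  # bins are 5 Hz apart for a 200 ms window
dominant = freqs[np.argmax(mags, axis=1)]

for i, f in enumerate(dominant):
    print(f"{i * HOP:4d}-{i * HOP + WINDOW:4d} ms: strongest bin at {f:.0f} Hz")
```

A 200 ms window localises each tone only to within 200 ms and resolves frequency in 5 Hz steps. Shortening the window sharpens the timing but coarsens the frequency bins, and the reverse holds for a longer window.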
Higher frequencies are better resolved in time, and lower frequencies are better resolved in frequency. This
means that, a certain high frequency component can be located better in time (with less relative error) than
a low frequency component. Conversely, a low frequency component can be located better in
frequency than a high frequency component.

The Wavelet Transform:


The wavelet transform provides a time-frequency representation of a signal. (There are
other transforms which give this information too, such as the short time Fourier transform, Wigner
distributions, etc.) A particular spectral component occurring at a particular instant can be of special interest. In
these cases it may be very beneficial to know the time intervals in which these particular spectral components occur.
The wavelet transform is capable of providing the time and frequency information simultaneously, hence
giving a time-frequency representation of the signal.
Scale: The scale is inverse of frequency. That is, high scales correspond to low frequencies, and low scales
correspond to high frequencies. Consequently, the little peak in the plot corresponds to the high frequency
components in the signal, and the large peak corresponds to low frequency components (which appear
before the high frequency components in time) in the signal.
The wavelet transform (WT) has gained widespread acceptance in signal processing and image
compression. Because of their inherent multi-resolution nature, wavelet-coding schemes are especially
suitable for applications where scalability and tolerable degradation are important.
Discrete Wavelet Transform
In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet
transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key
advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location
information (location in time).
Haar wavelets
The first DWT was invented by the Hungarian mathematician Alfréd Haar. For an input represented by a
list of 2^n numbers, the Haar wavelet transform may be considered to simply pair up input values, storing
the difference and passing the sum. This process is repeated recursively, pairing up the sums to provide the
next scale, finally resulting in 2^n − 1 differences and one final sum.
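The pairing procedure just described can be written down directly. This is a minimal sketch of the unnormalized form (plain sums and differences, as in the description above); the function name and the ordering of the output, coarsest differences first, are choices made here.

```python
def haar_transform(values):
    """Return (final_sum, differences) for a list whose length is a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    sums = list(values)
    differences = []
    while len(sums) > 1:
        pairs = list(zip(sums[0::2], sums[1::2]))
        # Store the differences of this scale in front of the finer ones.
        differences = [a - b for a, b in pairs] + differences
        # Pass the sums on to the next, coarser scale.
        sums = [a + b for a, b in pairs]
    return sums[0], differences


total, details = haar_transform([4, 6, 10, 12, 8, 6, 5, 5])
print(total)    # 56
print(details)  # [8, -12, 4, -2, -2, 2, 0]
```

For the 2^3 = 8 inputs the result is one final sum and 2^3 − 1 = 7 differences, matching the count given above.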
• The wavelet transform decomposes a signal into a set of basis functions.
• These basis functions are called wavelets.
• Wavelets are obtained from a single prototype wavelet ψ(t), called the mother wavelet, by dilations and
shifting:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right)$$

where a is the scaling parameter and b is the shifting parameter.
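The effect of the scaling and shifting parameters can be seen numerically. The sketch below assumes the Haar function as mother wavelet and the common 1/√a normalization with a > 0. The sample grid and the values a = 4 and b = 2 are arbitrary illustration choices.

```python
import numpy as np


def haar_mother(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))


def scaled_shifted(t, a: float, b: float):
    """Dilate the mother wavelet by a and shift it by b, keeping unit energy."""
    if a <= 0:
        raise ValueError("the scale a must be positive")
    return haar_mother((np.asarray(t, dtype=float) - b) / a) / np.sqrt(a)


t = np.linspace(-5, 15, 20001)  # grid with a step of 0.001
dt = t[1] - t[0]
psi = scaled_shifted(t, a=4.0, b=2.0)

support = t[psi != 0]
energy = np.sum(psi ** 2) * dt
mean = np.sum(psi) * dt
print(support.min(), support.max(), energy, mean)
```

The wavelet now occupies roughly the interval from b = 2 to b + a = 6. It is stretched by the scale and moved by the shift, while its energy stays close to one and its mean close to zero.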


One-Dimensional Wavelet Decomposition
A single-level one-dimensional wavelet decomposition with respect to either a particular wavelet or
particular wavelet decomposition filters is illustrated in Figure 3. Starting from a signal s, two sets of
coefficients are computed: approximation coefficients cA1 and detail coefficients cD1. These vectors are
obtained by convolving s with the low-pass filter Lo_D for the approximation and with the high-pass filter
Hi_D for the detail, followed by dyadic decimation. The length of each filter is equal to 2N. If n is the length
of s, the filtered signals F and G (the outputs of Lo_D and Hi_D before decimation) are of length
n + 2N - 1, and the coefficients cA1 and cD1 are of length floor((n - 1)/2) + N.
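The convolve-then-decimate step and the stated lengths can be checked with a small sketch. It assumes Haar decomposition filters (so 2N = 2 and N = 1) and plain zero-padded convolution via `numpy.convolve`. A wavelet toolbox would normally extend the signal at its borders first, so coefficient values near the edges may differ from what such a toolbox returns.

```python
import numpy as np

# Haar decomposition filters, assumed here for illustration (length 2N with N = 1).
LO_D = np.array([1.0, 1.0]) / np.sqrt(2)
HI_D = np.array([-1.0, 1.0]) / np.sqrt(2)


def dwt_single_level(s, lo_d=LO_D, hi_d=HI_D):
    """Convolve with both filters, then keep every second sample."""
    f = np.convolve(s, lo_d)  # full convolution, length n + 2N - 1
    g = np.convolve(s, hi_d)
    return f[1::2], g[1::2]   # dyadic decimation


s = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
cA1, cD1 = dwt_single_level(s)

n, N = len(s), len(LO_D) // 2
filtered_len = len(np.convolve(s, LO_D))
print(filtered_len, n + 2 * N - 1)   # length of F and G
print(len(cA1), (n - 1) // 2 + N)    # length of cA1 and cD1
print(cA1 * np.sqrt(2), cD1 * np.sqrt(2))
```

With these filters the approximation coefficients are scaled pairwise sums and the detail coefficients are scaled pairwise differences of neighbouring samples.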
Multilevel 2-D Wavelet Decomposition
For images, an algorithm similar to the one-dimensional case is possible for two-dimensional
wavelets and scaling functions obtained from one-dimensional ones by tensor product [4]. This kind of
two-dimensional DWT leads to a decomposition of the approximation coefficients at level j into four
components: the approximation at level j+1, and the details in three orientations (horizontal, vertical, and
diagonal), as depicted in Figure 4.
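One level of this two-dimensional decomposition can be sketched with separable filtering: first along the rows, then along the columns. The sketch assumes the orthonormal Haar wavelet and an image with even height and width. The function name is hypothetical, and sign and naming conventions for the detail bands vary between libraries.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a separable 2-D Haar DWT: returns cA and (cH, cV, cD)."""
    a = np.asarray(image, dtype=float)
    if a.shape[0] % 2 or a.shape[1] % 2:
        raise ValueError("image height and width must be even")
    # Filter and decimate along each row (horizontal direction).
    lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    # Filter and decimate along each column (vertical direction).
    cA = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)  # approximation
    cH = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)  # horizontal detail
    cV = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)  # vertical detail
    cD = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)  # diagonal detail
    return cA, (cH, cV, cD)


image = np.arange(16, dtype=float).reshape(4, 4)  # small stand-in image
cA, (cH, cV, cD) = haar_dwt2(image)

energy_in = np.sum(image ** 2)
energy_out = sum(np.sum(band ** 2) for band in (cA, cH, cV, cD))
print(cA.shape, energy_in, energy_out)
```

Applying `haar_dwt2` again to `cA` gives the next level, which is how a multilevel decomposition is built up.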

Figure 5 describes the basic decomposition step for images using the 2D Wavelet transform. Also,
different levels of Wavelet transform were tried in this paper (up to 5). Increasing the levels will add
complexity and computational overhead, but the robustness of the steganography method will be enhanced
[10].
How is the DWT actually computed?

The DWT analyzes the signal at different frequency bands with different resolutions by decomposing the
signal into a coarse approximation and detail information. DWT employs two sets of functions, called
scaling functions and wavelet functions, which are associated with lowpass and highpass filters,
respectively. The decomposition of the signal into different frequency bands is simply obtained by
successive highpass and lowpass filtering of the time domain signal. The original signal x[n] is first passed
through a halfband highpass filter g[n] and a lowpass filter h[n]. After the filtering, half of the samples can
be eliminated according to Nyquist’s rule, since the signal now has a highest frequency of π/2 radians
instead of π. The signal can therefore be subsampled by 2, simply by discarding every other sample. This
constitutes one level of decomposition and can mathematically be expressed as follows:

$$y_{\text{high}}[k] = \sum_{n} x[n]\, g[2k - n]$$

$$y_{\text{low}}[k] = \sum_{n} x[n]\, h[2k - n]$$

where y_high[k] and y_low[k] are the outputs of the highpass and lowpass filters, respectively, after
subsampling by 2.
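The filter-and-subsample step can be evaluated literally as a sum of the form y[k] = Σ x[n]·filter[2k − n] and compared with convolution followed by discarding every other sample. The sketch below assumes Haar filters for h[n] and g[n] and treats samples outside the signal as zero; both are illustration choices.

```python
import numpy as np


def subband(x, filt) -> np.ndarray:
    """Evaluate y[k] = sum over n of x[n] * filt[2k - n], term by term."""
    n_out = (len(x) + len(filt)) // 2  # number of even indices in the full convolution
    y = np.zeros(n_out)
    for k in range(n_out):
        for n in range(len(x)):
            m = 2 * k - n
            if 0 <= m < len(filt):  # terms outside the filter support are zero
                y[k] += x[n] * filt[m]
    return y


h = np.array([1.0, 1.0]) / np.sqrt(2)   # assumed halfband lowpass filter (Haar)
g = np.array([1.0, -1.0]) / np.sqrt(2)  # assumed halfband highpass filter (Haar)
x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])

y_low = subband(x, h)
y_high = subband(x, g)

# The same result: filter by convolution, then keep every other sample.
y_low_ref = np.convolve(x, h)[::2]
y_high_ref = np.convolve(x, g)[::2]
print(y_low, y_high)
```

Feeding `y_low` back into the same two filters gives the next decomposition level.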
The frequency bands that are not very prominent in the original signal will have very low amplitudes, and
that part of the DWT signal can be discarded without any major loss of information, allowing data
reduction. Figure 4.2 illustrates an example of what DWT signals look like and how data reduction is
provided. Figure 4.2a shows a typical 512-sample signal that is normalized to unit amplitude. The
horizontal axis is the number of samples, whereas the vertical axis is the normalized amplitude. Figure
4.2b shows the 8-level DWT of the signal in Figure 4.2a. The last 256 samples in this signal correspond to
the highest frequency band in the signal; the previous 128 samples correspond to the second highest
frequency band and so on. It should be noted that only the first 64 samples, which correspond to the lower
frequencies of the analysis, carry relevant information and the rest of this signal has virtually no
information. Therefore, all but the first 64 samples can be discarded without any significant loss of
information. This is how the DWT provides a very effective data reduction scheme.
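The data-reduction idea can be tried on a synthetic signal. The sketch below assumes the orthonormal Haar wavelet and a smooth 512-sample test signal made of two low-frequency sinusoids. The signal and wavelet behind Figure 4.2 are not specified in the text, so the numbers printed here will not match that figure.

```python
import numpy as np


def haar_dwt(x, levels: int) -> np.ndarray:
    """Multilevel Haar DWT laid out as [approximation, coarsest ... finest details]."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return np.concatenate([approx] + details[::-1])


def haar_idwt(coeffs, levels: int) -> np.ndarray:
    """Inverse of haar_dwt for the same coefficient layout."""
    coeffs = np.asarray(coeffs, dtype=float)
    approx = coeffs[:len(coeffs) >> levels]
    pos = len(approx)
    for _ in range(levels):
        detail = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + detail) / np.sqrt(2)
        out[1::2] = (approx - detail) / np.sqrt(2)
        approx = out
    return approx


t = np.arange(512) / 512
signal = np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)
signal /= np.max(np.abs(signal))  # normalize to unit amplitude

coeffs = haar_dwt(signal, levels=8)
truncated = coeffs.copy()
truncated[64:] = 0.0              # keep only the first 64 coefficients
approximation = haar_idwt(truncated, levels=8)

kept_energy = np.sum(coeffs[:64] ** 2) / np.sum(coeffs ** 2)
rel_error = np.linalg.norm(signal - approximation) / np.linalg.norm(signal)
print(f"energy kept: {kept_energy:.4f}, relative error: {rel_error:.4f}")
```

For a smooth signal like this one, an eighth of the coefficients retains almost all of the energy. The reconstruction is close but not exact, which is why the discarded part is better described as carrying little information than none.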
