A Novel Steganalysis Algorithm


Wei Zeng, Haojun Ai, Ruimin Hu

National Engineering Research Center for Multimedia Software, 430072 Wuhan, China

zengwei1979@gmail.com

recognition by the Human Auditory System (HAS). The HAS has very low sensitivity to changes in the phase components of an audio signal. By using the phase components of a sound segment as a data space, a fairly large amount of data can be coded into the host signal. The embedded data is fairly transparent to the HAS if the relative relations between the phase components of preceding segments are well preserved. Modifying the offset of all the phase components results in no distortion to the sound signal.

Unlike LSB coding, phase coding is robust to small amounts of additive noise, since the noise does not distort the phase in most of the frequency slots. In phase coding, the waveform of the signal is more important than the absolute value of each data point.

Some steganography algorithms based on phase coding [1]-[4] have emerged recently. Compared with the steganalysis of other audio hiding algorithms, however, effective steganalysis methods for phase coding remain relatively unexplored. In this paper, we therefore propose a steganalysis algorithm for the typical phase coding algorithm proposed by Bender [1]. In the next section, we explain the phase steganography algorithm. In Section 3, we introduce the feature extraction and the steganalysis method. The experimental results are given in Section 4. Section 5 concludes the paper.

Abstract

Audio steganalysis has attracted increasing attention recently, and phase steganalysis is one of its most challenging research fields. In this paper, a novel algorithm to detect phase coding steganography in audio signals is proposed. It is based on analysis of the phase discontinuities and can be described as follows. First, it takes the FFT of a selected segment of audio, unwraps the phase of each sample, and extracts the phase difference between neighboring samples. Second, in order to monitor the change of the phase difference, it calculates five statistical features of the phase difference for steganalysis. Third, an SVM classifier is used for classification. A total of 800 varied audio files are used for training and testing in our experimental work. With various embedding parameters for the training and testing audios, the proposed algorithm achieves good classification, with a correct detection rate of up to 95%.

1. Introduction

Recently, digital watermarking and data hiding have become a vibrant research area. Various kinds of multimedia files can be downloaded freely from the Internet. Terrorists might have seen this as an opportunity to communicate secretly with each other. Thus, various steganalysis methods have emerged as a means to deter covert communication by terrorists. Steganalysis is the technique of deciding whether a medium carries hidden messages and, if possible, determining what the hidden messages are. In addition to preventing secret communication among terrorists, steganalysis serves as a way to judge the security performance of steganography techniques.

Audio steganography is a useful means for transmitting covert battlefield information via an innocuous cover audio signal. Phase coding is a coding scheme that introduces the least perceptible noise to the host. The offset of the phase of a sound is irrelevant to

DOI 10.1109/ALPIT.2007.41

The phase coding algorithm embeds data into an audio signal by taking advantage of the HAS response to phase information. Bender et al. [1] proposed phase coding algorithms based on the HAS sensitivity to phase. In their approach, the host audio sequence is divided into a set of equal-length segments and the DFT is computed for each segment, which is equivalent to computing the STFT. As described in [1], the first step of the phase coding algorithm is to compute the STFT of the current block in the host signal $x_m(n)$. This is performed by dividing $x_m(n)$ into a set of $L$ equal-


follows:

Extrinsic discontinuities

Extrinsic discontinuities are the result of the computation of the inverse tangent function, which gives values of the phase modulo $2\pi$. The arctangent function calculates phase angles limited to the range $-\pi$ to $\pi$ rad, although the true phase angles are not limited to this range. Consequently, any angle outside this range is wrapped around zero, which can be detected in practice by identifying phase jumps of up to $2\pi$ rad. An empirical way of unwrapping the phase is to detect where these jumps occur and add or subtract $2\pi$ accordingly. The disadvantage of this method is that it cannot discriminate whether the phase jumps are due to rapidly changing angles or to the wrapping ambiguity. The literature reports a number of methods for unwrapping the phase to give a smooth phase spectrum. There are also techniques for avoiding this source of discontinuity, including the differential of phase and the calculation of phase from a geometric analysis of the z-plane.
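The empirical unwrapping rule described above can be illustrated with a minimal Python sketch. The function name `unwrap` and the linear-ramp test signal are illustrative assumptions, not taken from the paper; the jump threshold of $\pi$ is the usual convention for this rule.

```python
import cmath
import math

def unwrap(phases):
    """Undo wrapping by adding or subtracting 2*pi after each detected jump.

    A jump is assumed whenever two neighbouring values differ by more
    than pi. This is the empirical rule, so it cannot tell a wrap from a
    genuinely fast phase change.
    """
    unwrapped = [phases[0]]
    offset = 0.0
    for previous, current in zip(phases, phases[1:]):
        step = current - previous
        if step > math.pi:
            offset -= 2 * math.pi
        elif step < -math.pi:
            offset += 2 * math.pi
        unwrapped.append(current + offset)
    return unwrapped

# Hypothetical test signal: a linear phase ramp, wrapped into (-pi, pi].
true_phase = [0.9 * k for k in range(20)]
wrapped = [cmath.phase(cmath.exp(1j * p)) for p in true_phase]
recovered = unwrap(wrapped)
```

Because the ramp advances by less than $\pi$ per step, every large jump in `wrapped` is a wrapping artifact and the original ramp is recovered; a signal whose true phase changes faster than that would be unwrapped incorrectly, which is the limitation noted above.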

Intrinsic discontinuities

The intrinsic discontinuities, found in phase

spectra, arise from properties of the physical system

that are responsible for generating the data under

analysis. The intrinsic discontinuities appear when

both the real and imaginary parts of the Fourier

spectrum of the signal are crossing zero

simultaneously. This second type of discontinuity is

due to the intrinsic nature of the signal itself and not

due to computational artifacts. There are two methods

of identifying the occurrence of intrinsic

discontinuities. From the complex Fourier spectrum of

the data, any simultaneous zero-crossing of the real and

imaginary components indicates the presence of a

discontinuity. A z-plane analysis of the data will give

rise to zeros on (or very close to) the unit circle. This

also reveals the existence of intrinsic

discontinuities.
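As a rough illustration of the first identification method, the sketch below flags DFT bins where the real and imaginary parts vanish together, that is, where the spectrum magnitude is numerically zero. The helper names, the relative tolerance, and the example sequence are assumptions made for this sketch. The example sequence $1 - z^{-4}$ has its zeros at the fourth roots of unity, which lie on the unit circle and coincide with the even bins of an 8-point DFT.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def near_zero_bins(x, rel_tol=1e-9):
    """Return the bins whose magnitude is negligible relative to the peak.

    At such bins both Re and Im of the spectrum are (numerically) zero,
    so the phase is undefined there. rel_tol is an assumed tolerance.
    """
    spectrum = dft(x)
    peak = max(abs(s) for s in spectrum)
    return [k for k, s in enumerate(spectrum) if abs(s) <= rel_tol * peak]

# Hypothetical sequence with zeros on the unit circle: 1 - z^-4.
signal = [1, 0, 0, 0, -1, 0, 0, 0]
bins = near_zero_bins(signal)
```

For real audio frames exact zeros are rare, so in practice the tolerance would have to be chosen with the noise level in mind.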

In this paper, we employ a conventional phase unwrapping algorithm to overcome the extrinsic discontinuities. Typical phase coding modifies the phase spectrum without regard to the phase discontinuities. Through the method described in the next section, we can therefore detect the change in phase continuity.

DFT is used to obtain the magnitude spectrum $M_i(k)$ and the phase spectrum $\phi_i(k)$, for $0 \le i \le L-1$.

The next step is to determine the phase difference between subblocks on a frequency-by-frequency basis:

$$\Delta\phi_i(k) = \phi_i(k) - \phi_{i-1}(k) \qquad (1)$$

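To embed a bit of information into the current block, the phase of the first subblock $\phi_0(k)$ is replaced with a unique phase signature corresponding to the desired data bit:

$$\phi'_0(k) = \begin{cases} \phi_{-1}(k), & \text{if } b(m) = -1 \\ \phi_{+1}(k), & \text{if } b(m) = +1 \end{cases} \qquad (2)$$

where $b(m)$ is the data bit for block $m$. The phase of each subsequent subblock is then replaced with this phase signature plus the sum of phase differences up to the given subblock. In this manner, the relative phase of each subblock is preserved, and the long-term phase of the block itself is maintained:

$$\phi'_i(k) = \phi'_0(k) + \sum_{j=1}^{i} \Delta\phi_j(k) \qquad (3)$$

The modified signal is obtained by computing the inverse DFT of each subblock using the original magnitude response $M_i(k)$ and the modified phase $\phi'_i(k)$.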
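The embedding steps of Eqs. (1) to (3) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the phase signature is assumed to be $\pm\pi/2$, one bit is written to every bin between DC and Nyquist, and a naive DFT stands in for an FFT.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

def embed_bit(block, num_subblocks, bit, signature=math.pi / 2):
    """Embed one bit (+1 or -1) into a block of real samples."""
    n = len(block) // num_subblocks
    subblocks = [block[i * n:(i + 1) * n] for i in range(num_subblocks)]
    spectra = [dft(s) for s in subblocks]
    magnitudes = [[abs(c) for c in s] for s in spectra]
    phases = [[cmath.phase(c) for c in s] for s in spectra]

    # Eq. (1): phase difference between consecutive subblocks, per bin.
    deltas = [[phases[i][k] - phases[i - 1][k] for k in range(n)]
              for i in range(1, num_subblocks)]

    # Eq. (2): overwrite the first subblock's phase with the signature.
    # Bins k and n-k get opposite signs so the output stays real; the DC
    # and Nyquist bins are left unchanged.
    new_phases = [phases[0][:]]
    for k in range(1, n // 2):
        new_phases[0][k] = bit * signature
        new_phases[0][n - k] = -bit * signature

    # Eq. (3): rebuild later subblocks from the new phase plus the deltas.
    for delta in deltas:
        new_phases.append([p + d for p, d in zip(new_phases[-1], delta)])

    # Inverse DFT with the original magnitudes and the modified phases.
    stego = []
    for mags, phs in zip(magnitudes, new_phases):
        spectrum = [m * cmath.exp(1j * p) for m, p in zip(mags, phs)]
        stego.extend(v.real for v in idft(spectrum))
    return stego

# Hypothetical host block: 32 samples split into 4 subblocks of 8.
host = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(32)]
stego = embed_bit(host, num_subblocks=4, bit=+1)
```

Only the phases change: the magnitude spectrum of every subblock in `stego` matches that of `host`, which is why the scheme leaves the magnitude response untouched.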

Compared with the encoding algorithm, decoding is much simpler to implement. The block and subblock lengths, the number of DFT points, and the data interval must be known at the receiver. The embedded bit is extracted by computing the DFT of the first subblock (length $n$), extracting its phase response $\phi_0(k)$, and comparing this response to $\phi_{-1}(k)$ and $\phi_{+1}(k)$; by choosing an appropriate threshold, the embedded data can be detected as 0 or 1.
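A minimal Python sketch of this decoding step is given below. It assumes phase signatures of $\pm\pi/2$, compares each bin's phase to both signatures by circular distance, and uses a decision threshold of zero on the accumulated score; all of these choices, like the function names, are assumptions for illustration.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def decode_bit(first_subblock, signature=math.pi / 2):
    """Return +1 or -1 depending on which signature the phases are closer to."""
    n = len(first_subblock)
    spectrum = dft(first_subblock)
    score = 0.0
    for k in range(1, n // 2):
        phase = cmath.phase(spectrum[k])
        # Circular distances to the +signature and -signature candidates.
        dist_plus = abs(cmath.phase(cmath.exp(1j * (phase - signature))))
        dist_minus = abs(cmath.phase(cmath.exp(1j * (phase + signature))))
        score += dist_minus - dist_plus
    # Assumed threshold: zero.
    return 1 if score > 0 else -1

# Hypothetical received subblock whose bins 1..3 all carry phase +pi/2.
received = [-sum(math.sin(2 * math.pi * k * t / 8) for k in range(1, 4))
            for t in range(8)]
bit = decode_bit(received)
```

Negating the subblock shifts every phase by $\pi$, so the same decoder returns the opposite bit for `[-v for v in received]`.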

3.1. Theory analysis

The definition of phase is:

$$f(\omega) = \tan^{-1}\left(\frac{\operatorname{Im}(S(\omega))}{\operatorname{Re}(S(\omega))}\right), \quad \text{where } 0 < f(\omega) < 2\pi \qquad (4)$$

phase of an initial audio segment with a reference

phase analysis via the Fourier transform, Ioannis


is adjusted in order to preserve the relative phase between segments. Based on Ioannis Paraskevas's theory, we know that phase coding corrupts the extrinsic continuities of the unwrapped phase in each segment, which changes the phase difference. The statistical analysis of the phase difference in each segment can therefore be used to monitor the change and to distinguish embedded signals from clean signals. We divide each signal into frames of a given length and then derive the phase differential spectra from the unwrapped phase spectra obtained via the FFT. Then, from each plot, five statistical features [6] are derived, namely variance, skewness, kurtosis, median, and mean absolute deviation, in order to compress the large amount of information that each spectrum conveys.
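The feature extraction described above might be sketched in Python as follows. Several details are assumptions, since the text does not fix them: the phase difference is taken between neighbouring frequency bins of the unwrapped spectrum, the moments are population (not sample) moments, kurtosis is not excess kurtosis, and the function names are hypothetical.

```python
import cmath
import math
import statistics

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def unwrap(phases):
    """Add or subtract 2*pi after every jump larger than pi."""
    out, offset = [phases[0]], 0.0
    for prev, cur in zip(phases, phases[1:]):
        if cur - prev > math.pi:
            offset -= 2 * math.pi
        elif cur - prev < -math.pi:
            offset += 2 * math.pi
        out.append(cur + offset)
    return out

def phase_differences(frame):
    """Differences between neighbouring bins of the unwrapped phase spectrum."""
    phases = unwrap([cmath.phase(c) for c in dft(frame)])
    return [b - a for a, b in zip(phases, phases[1:])]

def five_features(values):
    """Variance, skewness, kurtosis, median, mean absolute deviation.

    Assumes the input is not constant (otherwise the variance is zero
    and skewness/kurtosis are undefined).
    """
    mean = statistics.fmean(values)
    centered = [v - mean for v in values]
    m2 = statistics.fmean([c ** 2 for c in centered])
    m3 = statistics.fmean([c ** 3 for c in centered])
    m4 = statistics.fmean([c ** 4 for c in centered])
    return {
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "median": statistics.median(values),
        "mean_abs_dev": statistics.fmean([abs(c) for c in centered]),
    }

# Hypothetical 16-sample frame; the experiments use 1024-sample segments.
frame = [math.sin(0.7 * t) for t in range(16)]
features = five_features(phase_differences(frame))
```

The resulting five values per frame form the feature vector that is passed to the classifier.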

Furthermore,

), > 0

4. Experimental results

The proposed phase steganalysis technique is implemented and tested on a set of 800 16-bit WAV files (44.1 kHz, 20 s). The audio files include music (piano, symphony, violin, and rock), songs, speech (male and female), natural noise, etc. In phase coding, there are five embedding parameters: the embedded message, the block length N, the subblock length n, the phase modifier, and the number of frequency slots per bit.

In the phase coding algorithm, we must consider the phase dispersion caused by a break in the relationship of the phases between the frequency components. Minimizing phase dispersion constrains the data rate of phase coding. One cause of phase dispersion is the substitution of the phase $\phi_0(k)$ with the binary code. The magnitude of the phase modifier needs to be close to the original value in order to minimize dispersion.

The difference between phase modifier states should be maximized in order to minimize the susceptibility of the encoding to noise. In our modified phase representation, a 0-bit is $-\pi/2$ and a 1-bit is $+\pi/2$.

Another source of distortion is the rate of change of the phase modifier. With an N-point DFT, we can theoretically use up to N frequency slots of the phase matrix for coding. However, because of the noise in the decoded phase of a typical sound waveform, it is almost impossible to code one bit per frequency slot. Moreover, modifying the phase of every frequency component causes severe phase dispersion. By changing the phase more slowly and transitioning between phase changes, the audible distortion is greatly reduced. Here we set the interval of phase modification to 16 in each subblock.

In addition, to simplify the calculation, we choose the one segment of 1024 samples with the most power in each audio file for analysis. In our experiment, we use 200 clean audios and their stego audios as input to train the SVM, and test on another 600 clean audios and their stego audios. The block length N and subblock length n are N=512, n=128; N=512, n=256; N=1024, n=128;

such problem:

$$\min_{\omega, b, \xi} \; \frac{1}{2}\omega^T\omega + C\sum_{k=1}^{m}\xi_k \qquad (5)$$

subject to $y_k(\omega^T\phi(x_k) + b) \ge 1 - \xi_k$, $\xi_k \ge 0$, $k = 1, \dots, m$,

where the training data are mapped to a higher-dimensional space by the function $\phi$, and $C$ is a penalty parameter on the training error. For any test instance $x$, the decision function (predictor) is

$$f(x) = \operatorname{sgn}(\omega^T\phi(x) + b)$$

The training and test audios use the five statistical features derived from each plot, as described in Sec 3.2.

kernel is $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, $\gamma > 0$

and without hidden data can be viewed as a classification problem. In this paper, we use the Support Vector Machine (SVM) because of its excellent performance. We use a set of audios (stego and normal audios) as the training data to construct the SVM classifier.

SVM is based on Vapnik's statistical learning theory [7]. It creates a maximum-margin hyperplane that separates the training vectors of different classes. When the margin is maximized, the probabilistic test error bound is minimized. A non-linear classifier can be created by mapping the original input space into a higher-dimensional feature space using a non-linear kernel function. Some common kernels are the linear, polynomial, radial basis function, and sigmoid kernels.

We use the freely available package LIBSVM [8] and the radial basis function (RBF) kernel to train the SVM. The function

$$K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j) \qquad (6)$$

is called the kernel function.
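To make the role of the kernel concrete, the following Python sketch evaluates an RBF kernel and the corresponding decision function in its standard dual form, $f(x) = \operatorname{sgn}\left(\sum_k \alpha_k y_k K(x_k, x) + b\right)$. The support vectors, multipliers $\alpha_k$, bias, and $\gamma$ below are made-up values for illustration only; in practice they would come from a trained model such as one produced by LIBSVM.

```python
import math

def rbf_kernel(xi, xj, gamma):
    """RBF kernel exp(-gamma * ||xi - xj||^2), with gamma > 0."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)

def decide(x, support_vectors, labels, alphas, bias, gamma):
    """Dual-form SVM decision: sign of the kernel-weighted sum plus bias."""
    total = bias
    for sv, y, alpha in zip(support_vectors, labels, alphas):
        total += alpha * y * rbf_kernel(sv, x, gamma)
    return 1 if total >= 0 else -1

# Hypothetical one-dimensional model with one support vector per class
# (-1 = clean, +1 = stego).
support_vectors = [[0.0], [2.0]]
labels = [-1, +1]
alphas = [1.0, 1.0]
prediction_near_stego = decide([1.9], support_vectors, labels, alphas, 0.0, 1.0)
prediction_near_clean = decide([0.1], support_vectors, labels, alphas, 0.0, 1.0)
```

The kernel lets the decision function be evaluated without ever forming the mapping $\phi$ explicitly, which is why only $K$ and the support vectors appear in the code.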


N=2048, n=512; N=2048, n=1024 respectively. We choose N=512, n=128; N=1024, n=512; N=2048, n=512; N=2048, n=1024 for training, and test all other parameter combinations.

The accuracy of testing 600×2 audios is shown in Figure 1.
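The experimental setup analyzes only the 1024-sample segment with the most power in each file. A minimal Python sketch of that selection step follows; it assumes non-overlapping segments and measures power as the sum of squared samples, neither of which is specified in the text, and the function name is hypothetical.

```python
def highest_power_segment(samples, length=1024):
    """Return the non-overlapping segment with the largest sum of squares."""
    best_start, best_energy = 0, -1.0
    for start in range(0, len(samples) - length + 1, length):
        energy = sum(v * v for v in samples[start:start + length])
        if energy > best_energy:
            best_start, best_energy = start, energy
    return samples[best_start:best_start + length]

# Hypothetical toy signal with segment length 4 instead of 1024.
toy = [0] * 8 + [1, -1, 1, -1] + [0] * 4
segment = highest_power_segment(toy, length=4)
```

Working on a single high-power segment keeps the cost per file low, at the price of ignoring embedding that may be present elsewhere in the file.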

5. Conclusion

Phase coding is one of the most effective coding methods in terms of the signal-to-perceived-noise ratio. In this paper, we present a novel method to detect messages hidden by typical phase coding in audio signals. We use statistical analysis of the phase difference to monitor the phase discontinuities and an SVM classifier to capture the faint changes of phase caused by embedding. Experiments are conducted on a set of various types of audio, and the correct classification rate reaches 95%.

To monitor the statistical changes caused by other phase coding algorithms, future work may focus on finding more effective features in the audio signal. The choice of an appropriate classifier also needs further study.

References

[1] W. Bender, D. Gruhl, N. Morimoto, et al., Techniques for data hiding, IBM Systems Journal, 1996, vol. 35, no. 3&4: 313-336.

Using Phase Dispersion, IEE Seminar on Secure Images

and Image Authentication(2000/039), London, UK, April

2000, p.5.

6.1667%, 4.5%, 4.3333%, 5% respectively. The

missing rate of detecting 600 embedded audios is

shown in Figure 2.
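The alarm rate and missing rate discussed here can be computed from classifier outputs as in the Python sketch below. The label convention (+1 for stego, -1 for clean) and the function name are assumptions. The example counts are hypothetical and chosen only to show the arithmetic: with a denominator of 600 files, a rate of 6.1667% corresponds to 37 files and a rate of 5% to 30 files.

```python
def alarm_and_missing_rates(labels, predictions):
    """Return (alarm rate, missing rate) for +1 = stego, -1 = clean.

    Alarm rate: fraction of clean files flagged as stego.
    Missing rate: fraction of stego files classified as clean.
    """
    clean = [p for y, p in zip(labels, predictions) if y == -1]
    stego = [p for y, p in zip(labels, predictions) if y == 1]
    alarm = sum(1 for p in clean if p == 1) / len(clean)
    missing = sum(1 for p in stego if p == -1) / len(stego)
    return alarm, missing

# Hypothetical outcome: 600 clean files with 37 false alarms,
# 600 stego files with 30 misses.
labels = [-1] * 600 + [1] * 600
predictions = [1] * 37 + [-1] * 563 + [1] * 570 + [-1] * 30
alarm, missing = alarm_and_missing_rates(labels, predictions)
```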

Steganography by Amplitude or Phase Modification,

Security and Watermarking of Multimedia Contents V,

Proceedings of the SPIE, Volume 5020, 2003: 67-76.

[4] Akira Takahashi, Ryouichi Nishimura, Yôiti Suzuki, Multiple Watermarks for Stereo Audio Signals Using Phase-Modulation Techniques, IEEE Transactions on Signal Processing, Vol. 53, No. 2, February 2005: 806-815.

[5] Ioannis Paraskevas, Edward Chilton, Combination of

magnitude and phase statistical features for audio

classification, Acoustical Society of America, ARLO 5(3),

July 2004: 111-117.

[6] A. Papoulis, Probability and Statistics, Prentice-Hall, Englewood Cliffs, 1990: Chap. 12.

using various embedding parameters

New York: Springer-Verlag, 1995.

get high detecting accuracy. There is a trade-off between alarm rate and missing rate. Given this trade-off, we find it reasonable to train SVM models with audio embedded using N=2048, n=512, because both the alarm rate and the missing rate can be controlled. This conclusion rests on the assumption that each segment has embedded messages.

[8] C.C. Chang, C.J. Lin, "LIBSVM", library for support vector machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2007.
