
By Chandana Kaza
• Steganography
• Steganalysis
• Conventional methods
• Machine learning based steganalysis
• Experiments and results
• Transmit secret messages.

• Make the transferred secret messages undetectable.

• Embed messages in such a way that they are not detected.


• Passive warden: examines the message and determines whether it contains a hidden message.

• Active warden: alters the message, even when there is no trace of a secret message.
• Not suited
◦ Images with a low number of colors
◦ Images with unique semantic content

• Best suited
◦ Grayscale images
◦ Uncompressed scans of photographs
◦ Images captured by digital cameras

• Simple and straightforward.

• Embed the message into the least-significant-bit plane.

• Difficult to detect with the human eye.

[Figure: cover image and stego image]
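A minimal sketch of sequential LSB embedding in Python (NumPy assumed; the function name and message format are illustrative, not from the slides):

```python
import numpy as np

def embed_lsb(cover: np.ndarray, message_bits: np.ndarray) -> np.ndarray:
    """Sequentially replace the least significant bit of each pixel
    with one message bit. cover is a uint8 image, message_bits is 0/1."""
    flat = cover.flatten()  # flatten() returns a copy
    if len(message_bits) > len(flat):
        raise ValueError("message longer than cover capacity")
    n = len(message_bits)
    # Clear each target pixel's LSB, then OR in the message bit.
    flat[:n] = (flat[:n] & 0xFE) | message_bits.astype(np.uint8)
    return flat.reshape(cover.shape)
```

Because only the lowest bit plane changes, each pixel value moves by at most 1, which is why the change is invisible to the eye.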
• Detect the existence of steganography.

• Estimate the message length.

• Extract the hidden information.

• Achieved by exploiting differences between cover and stego files.


• 2 types of LSB embedding
◦ Sequential
◦ Non-sequential

Classification of steganalysis techniques
• Instance based
• Non-instance based
 The 2 Method [Pfitzmann and Westfeld]

 Splits images into segments

 Calculate 2 co-efficients for every segments

 Decide whether there is hidden message.

 Suitable for sequentisl LSB


• If the embedded message bit and the original LSB are different, flip the bit.

• Technique
◦ Let the pixel value be j.
◦ If j = 2i, after the flip j = 2i+1.
◦ If j = 2i+1, after the flip j = 2i.

• This combines the two pixel values 2i and 2i+1 into a pair of values (PoV); the two values differ only in the lowest bit, and embedding tends to equalize their frequencies.

• With h(j) the histogram of pixel values and expected pair frequency $n_i^* = (h(2i) + h(2i+1))/2$, the test statistic is

$$\chi^2 = \sum_{i=1}^{k} \frac{\left(h(2i) - n_i^*\right)^2}{n_i^*},$$

with k − 1 degrees of freedom, where k is the number of pairs of values.

• The embedding probability is $p = 1 - F_{\chi^2_{k-1}}(\chi^2)$; if p is close to 1, the image is embedded.
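A compact sketch of this pairs-of-values test using SciPy (a sketch under the definitions above; function and variable names are mine):

```python
import numpy as np
from scipy.stats import chi2

def chi_square_attack(pixels: np.ndarray) -> float:
    """Return the embedding probability p for a uint8 pixel array,
    following the Westfeld-Pfitzmann pairs-of-values test."""
    hist = np.bincount(pixels.flatten(), minlength=256)
    even, odd = hist[0::2], hist[1::2]
    expected = (even + odd) / 2.0
    # Keep only categories with nonzero expected frequency.
    mask = expected > 0
    stat = np.sum((even[mask] - expected[mask]) ** 2 / expected[mask])
    dof = mask.sum() - 1
    # p close to 1 => pair histograms equalized => likely embedded.
    return 1.0 - chi2.cdf(stat, dof)
```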


• Suitable for both sequential and non-sequential embedding.

• Exploits spatial correlation in images.

• Based on analyzing how the numbers of R and S groups change as the length of the message embedded in the LSB plane increases.
• Technique
◦ Consider an M×N image with pixel values from a set P.
◦ For grayscale images, P = {0, …, 255}.
◦ Divide the image into disjoint groups of n adjacent pixels.
◦ Define a discrimination function f(x₁, …, xₙ) ∈ ℝ that assigns a real number to each pixel group.

• The purpose of the function is to quantify the smoothness or "regularity" of the group of pixels.

• The noisier the group of pixels G = (x₁, …, xₙ), the larger the value of the discrimination function.
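For reference, the discrimination function used in Fridrich's RS paper is the sum of absolute differences between neighboring pixels:

$$f(x_1,\dots,x_n) = \sum_{i=1}^{n-1} \lvert x_{i+1} - x_i \rvert$$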
◦ Define an invertible operation F on P called flipping.
◦ LSB flipping F₁: 0↔1, 2↔3, …, 254↔255.
◦ Shifted LSB flipping F₋₁: −1↔0, 1↔2, …, 255↔256.
◦ The identity F₀(x) = x completes the family (needed for mask value 0 below).


• Depending on the above operations we define 3 types of pixel groups:
◦ Regular: G ∈ R ⇔ f(F(G)) > f(G)
◦ Singular: G ∈ S ⇔ f(F(G)) < f(G)
◦ Unusable: G ∈ U ⇔ f(F(G)) = f(G)

• where F(G) = (F(x₁), …, F(xₙ)), applied per pixel.

• The flipping pattern is captured by a mask M, an n-tuple with values −1, 0, 1: pixel i gets F_{M(i)}.

• Number of regular groups for mask M: R_M
• Number of singular groups: S_M
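A minimal sketch of counting R and S groups for one mask, assuming the sum-of-absolute-differences discrimination function above (names are illustrative):

```python
import numpy as np

def f(group: np.ndarray) -> int:
    """Discrimination function: sum of absolute neighbor differences."""
    return int(np.abs(np.diff(group.astype(int))).sum())

def flip(group: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Apply F_1 (0<->1, 2<->3, ...) where mask==1,
    F_-1 (-1<->0, 1<->2, ...) where mask==-1, identity where mask==0."""
    g = group.astype(int)
    f1 = g ^ 1                  # F_1: toggle the LSB
    fm1 = ((g + 1) ^ 1) - 1     # F_-1: shifted LSB flipping
    return np.select([mask == 1, mask == -1], [f1, fm1], default=g)

def count_rs(pixels: np.ndarray, mask: np.ndarray):
    """Count regular (R) and singular (S) groups of len(mask) pixels."""
    n = len(mask)
    flat = pixels.flatten()
    r = s = 0
    for i in range(0, len(flat) - n + 1, n):
        g = flat[i:i + n]
        d, d_flipped = f(g), f(flip(g, mask))
        if d_flipped > d:
            r += 1
        elif d_flipped < d:
            s += 1
    return r, s
```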
• Statistical hypothesis
◦ For a typical image, R_M ≅ R₋M and S_M ≅ S₋M.
◦ This hypothesis is violated after randomizing the LSB plane.

• Randomization of the LSB plane forces R_M → S_M, i.e., the difference R_M − S_M tends to zero as the message length increases.

• In the case of R₋M and S₋M, the opposite happens: their difference increases with the message length.
• p is the relative length of the embedded message.

• The stego image gives the initial measurements of the R and S groups: R_M(p/2), S_M(p/2), R₋M(p/2), and S₋M(p/2).

• The points R_M(p/2), R_M(1/2), R_M(1−p/2) and S_M(p/2), S_M(1/2), S_M(1−p/2) determine 2 parabolas.

• Now calculate the root of the quadratic equation (below) to estimate the message length.
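For completeness, the quadratic equation from Fridrich et al.'s RS paper (a reconstruction from the published method, which the slides omit): with d₀ = R_M(p/2) − S_M(p/2), d₁ = R_M(1−p/2) − S_M(1−p/2), d₋₀ = R₋M(p/2) − S₋M(p/2), and d₋₁ = R₋M(1−p/2) − S₋M(1−p/2), after rescaling the x-axis so that p/2 ↦ 0 and 1−p/2 ↦ 1, solve

$$2(d_1 + d_0)\,x^2 + (d_{-0} - d_{-1} - d_1 - 3d_0)\,x + d_0 - d_{-0} = 0,$$

take the root x with the smaller absolute value, and estimate the message length as p = x/(x − 1/2).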


• Learning denotes changes in a system that enable it to perform the same task more effectively the next time.

• Example: learning to classify an object as an instance of a class.


• The conventional methods rely on hypotheses chosen heuristically.

• If there are differences between the real model and the fixed model, they will fail.

• ML is used to reduce the errors introduced by fixed models.
• For hidden-information detection, simple classifiers are used.

• With machine learning, the quality of the classifiers improves, and successively more stable performance can be acquired.
• Hidden-information detection is treated as a classification process.

• Input: images.
• Output: class labels.

• The data set is built from the values produced by the χ² and RS methods.
• Training set: the portion of the data set used to fit (train) a model for prediction or classification of values that are known in the training set but unknown in (future) data.

• Test set: a set of data used only to assess the performance (generalization) of a fully specified classifier.

• Feature extraction: transforming the input data into a set of features in order to reduce redundant information.
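A minimal illustration of the split-and-evaluate workflow with scikit-learn (a stand-in here; the original experiments used WEKA, and the random data is for illustration only):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier  # WEKA's J48 is a C4.5 variant
from sklearn.metrics import accuracy_score

# X: one feature vector per image (e.g., chi-square or RS features);
# y: 1 = stego, 0 = clean.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 2)), rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = DecisionTreeClassifier().fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```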
• 24-bit color images are collected.

• Messages of different lengths are embedded into the images.

• Features are extracted using different methods for the sequential and non-sequential cases.

• Preprocessing is performed.

• Every image results in one instance, represented by a set of features in the data set.
• Build the experiment platform on WEKA.
• Test results with different machine learning methods:
◦ Naïve Bayes
◦ Bayes networks
◦ Decision trees
◦ kNN
◦ SVM
◦ Neural networks
• Apply an ML-based classifier to the POV3 algorithm.

• The LSB bit plane is treated as a sequence of pixel samples and split into 100 segments.

• For every segment, the χ² probability is calculated over all pixels from the first segment up to the current one.

• This yields 300 coefficients; simple χ² analysis then makes a decision according to a threshold.

• Our method instead uses these coefficients as features and constructs classifiers (see the sketch below).
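A sketch of this feature extraction, reusing the chi_square_attack function above. Computing 100 cumulative p-values per color channel would give 300 features for an RGB image; that per-channel reading of the "300 coefficients" is my assumption:

```python
import numpy as np

def chi_square_features(channel: np.ndarray, n_segments: int = 100) -> np.ndarray:
    """Cumulative chi-square probabilities over a growing pixel prefix.
    channel: flattened uint8 pixel samples of one color channel."""
    flat = channel.flatten()
    seg_len = len(flat) // n_segments
    # p-value of the pairs-of-values test on pixels 0..(i+1)*seg_len
    return np.array([
        chi_square_attack(flat[:(i + 1) * seg_len])
        for i in range(n_segments)
    ])
```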
• Precision ∝ 1 / image complexity
• Precision ∝ embed rate
◦ Because most of the pictures fall into the high complexity levels 2-4, ML-based methods generally perform better than the simple χ² test.

◦ We conclude that applying machine learning to the χ² method can effectively improve the accuracy.

◦ A classifier wrapped around conventional steganalysis may be a good solution for detecting sequential LSB steganography.
• We use the RS approach in this case (non-sequential embedding).

• The differences between R±M(p/2) and S±M(p/2) increase as the message length p increases.

• The features are calculated from these differences.

• The direct difference is not used, in order to reduce the bias between different images.

• We test this feature-based method with ML techniques, as sketched below.
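A sketch of two plausible normalized features, assuming the bias is reduced by dividing by the total number of groups (the slides do not specify the exact normalization):

```python
import numpy as np

def rs_features(pixels: np.ndarray, mask: np.ndarray):
    """Two normalized RS features: the relative R-S gaps for masks M and -M.
    Uses count_rs from the earlier sketch; normalizing by the total group
    count (rather than using the direct difference) is an assumption."""
    total = pixels.size // len(mask)         # number of pixel groups
    r_pos, s_pos = count_rs(pixels, mask)    # R_M, S_M
    r_neg, s_neg = count_rs(pixels, -mask)   # R_-M, S_-M
    return [(r_pos - s_pos) / total, (r_neg - s_neg) / total]
```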

• The main focus is on the change of embed rate and on the differences between images of different intrinsic complexities.
Precision (RMS)   Embed 0.1       Embed 0.2       Embed 0.5       Embed 1.0       Embed All (Mixed)

Naive Bayes       50.90% (0.59)   53.80% (0.57)   54.90% (0.45)   94.56% (0.31)   80.48% (0.37)
Bayes Net         89.21% (0.27)   95.85% (0.17)   99.35% (0.07)   99.45% (0.07)   95.44% (0.18)
kNN               92.16% (0.28)   97.55% (0.16)   99.45% (0.07)   99.70% (0.05)   96.38% (0.19)
J48               94.11% (0.27)   98.05% (0.14)   99.40% (0.08)   99.65% (0.06)   97.56% (0.14)
SMO               59.39% (0.64)   75.32% (0.50)   93.21% (0.26)   96.70% (0.18)   80.90% (0.44)
BP                53.10% (0.50)   54.10% (0.50)   52.70% (0.50)   56.80% (0.50)   80.00% (0.40)
Threshold RS      95.30%          98.75%          99.70%          87.11%          97.38%


From the table, we can see that J48 performs best in the mixed embed-rate case and achieves nearly 98% accuracy across all embed levels.

Although we use only two features, this result is comparable to the χ² case for sequential embedding and better than what threshold-based RS can do.
1. http://www.cs.waikato.ac.nz/ml/weka/

2. http://www.outguess.org/

3. S. Antani, R. Kasturi, and R. Jain. A survey on the use of pattern recognition methods for abstraction, indexing and retrieval of images and video. Pattern Recognition, 35(4):945-965, 2002.

4. G. Berg, I. Davidson, M.-Y. Duan, and G. Paul. Searching for hidden messages: Automatic detection of steganography. In IAAI, pages 51-56, 2003.

5. M. Morkel. Steganography and Steganalysis. ICSA Research Group, University of Pretoria, South Africa, January 2005.

6. Y.-M. Di, H. Liu, A. Ramineni, and A. Sen. Detecting hidden information in images: A comparative study. In 2nd Workshop on Privacy Preserving Data Mining (PPDM), 2003.

7. S. Dumitrescu, X. Wu, and Z. Wang. Detection of LSB steganography via sample pair analysis. In Information Hiding: 5th International Workshop, IH 2002, Revised Papers, Lecture Notes in Computer Science vol. 2578, pages 355-372, 2003.

8. J. J. Fridrich. Feature-based steganalysis for JPEG images and its implications for future design of steganographic schemes. In Information Hiding: 6th International Workshop, IH 2004, Revised Selected Papers, Lecture Notes in Computer Science vol. 3200, pages 67-81, 2004.
Thank you
