Steganography

By: Joe Jupin
Supervised by: Dr. Longin Jan Latecki

Overview
  Introduction
    Clandestine Communication
    Digital Applications of Steganography
  Background
    Uncompressed Images
    Compressed Images
    Steganalysis
    The Images Used
  Finding and Extracting Messages from Bitmaps
  Detecting Messages in jpegs
  Future Work

Introduction
  Clandestine Communication
    Cryptography
      Scrambles the message into cipher
    Steganography
      Hides the message in unexpected places
  Digital Applications of Steganography
    Can be hidden in digital data
      MS Word (doc)
      Web pages (htm)
      Executables (exe)
      Sound files (mp3, wav, cda)
      Video files (mpeg, avi)
      Digital images (bmp, gif, jpg)

Background
  Uncompressed Images
    Grayscale Bitmap images (bmp)
      256 shades of intensity from black to white
      Can be obtained from color images
      Arranged into a 2-D matrix
    Messages are hidden in the least significant bits (lsb)
      Matrix values change slightly
    Interested in patterns that form messages

  Message = Hello Stego!   Length = 12

  Character   Integer    Binary
  Space       32         00100000
  0–9         48 – 57    00110000 – 00111001
  A–Z         65 – 90    01000001 – 01011010
  a–z         97 – 122   01100001 – 01111010
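
The lsb idea on this slide can be illustrated with a short sketch. It is not the code used in this project: the function names, the row-major flattening of the pixel matrix into a list, and the most-significant-bit-first ordering of each character are all assumptions made for the example.

```python
def message_to_bits(message):
    """Turn an ASCII message into a list of bits, most significant bit first (assumed order)."""
    bits = []
    for byte in message.encode("ascii"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def embed(pixels, message, offset=0):
    """Return a copy of the flattened pixel list with the message written into the lsbs."""
    bits = message_to_bits(message)
    if offset + len(bits) > len(pixels):
        raise ValueError("message does not fit in the image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[offset + i] = (stego[offset + i] & ~1) | bit
    return stego


def extract(pixels, length, offset=0):
    """Read `length` characters back out of the lsbs, starting at pixel `offset`."""
    chars = []
    for c in range(length):
        value = 0
        for p in pixels[offset + 8 * c: offset + 8 * c + 8]:
            value = (value << 1) | (p & 1)
        chars.append(chr(value))
    return "".join(chars)


cover = list(range(256))                  # stand-in for a flattened grayscale image
stego = embed(cover, "Hello Stego!", offset=3)
print(extract(stego, 12, offset=3))
```

Each pixel changes by at most one intensity level, which is why the matrix values change only slightly.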

Background  Compressed Images  Grayscale jpeg images (jpg)     Joint Photographic Experts Group (jpeg) Converts image to YCbCr colorspace Divides into 8x8 blocks Uses Discrete Cosine Transform (DCT) – Obtain frequency coefficients – Scaled by quantization to remove some frequencies – High quality setting will not be noticed   Huffman Coding Affects the images statistical properties .

Background
  Steganalysis
    The Images Used
      From Star Trek Website
      1,000 color jpeg images
      320x240 or 240x320
      www.startrek.com
      There will be Klingons

Finding and Extracting Messages from Bitmaps  Problem Messages can be hidden in lsb‟s  May be anywhere in image  Cannot see message in image  Would take forever to be processed by a human  .

7 = in 76. In  Procedure contrast to cryptography. pattern in the present enumerated string    Define a „message‟ (pattern of message characters) Define „message characters‟ (used in messages) Use „stego stems‟ (patterns)  A test can be performed faster by using tiled samples . intercept and modify messages without being able to  Take a Boolean snapshot of even and odd pixels violate certain security premises guaranteed by a  Construct a string of all possible characters cryptosystem.Finding and Extracting Messages from Bitmaps Steganography is the art and science of communicating in a way which hides the existence of the communication. the goal of steganography is to hide messages  An n-pixel image has n-7 individual character inside other "harmless" a way that does not enumerations (320 xmessages 240 .793) allow "enemy" to even detect that there is a second  Use any character properties to match a message secret message [Markus Kuhn 1995-07-03]. the "enemy" is allowed to  Inject messages intowhere a images detect.

Finding and Extracting Messages from Bitmaps  Observation   Only considered linear unencrypted messages Trial performed on 100 grayscale bitmaps   97 clean 3 stego  Took an average of 9 seconds per image to find with 100% accuracy (no training -.cold)  Occasionally some garbage text at head or tail  Took an average of 3 seconds per image to test with 100% accuracy   Clean images had pattern scores of less than 10 Stego images had pattern scores of 31 or more .

Finding and Extracting Messages from Bitmaps  Conclusion Messages are detectible and extractible from non-encrypted uncompressed images  Linear messages can be found in any direction with more computation  This method can be foiled by hashing the message into the image  .

Detecting Messages in jpegs  Problem Cannot use an enumeration scheme to detect or find a message  May only be able to detect because of encoding schemes and encryption  Cannot see message in image  Statistical properties of an image change when a message is injected  .

Detecting Messages in jpegs
  Procedure
    Obtain the 4-level 2-D wavelet decomposition of the images
    Obtain the orientation decomposition of frequency space statistics
      72 features plus the class (0 = clean, 1 = stego)
      Includes: mean, variance, skewness and kurtosis of coefficients and error for prediction in subband
    Normalize the data by 0-1 min-max
    Train Fisher Linear Discriminant (FLD)
    Test the FLD threshold

  [Sample feature listing: mean, var, skw and krt of the V, H and D subband coefficients and of the prediction errors Ev, Eh and Ed, in groups labeled 12, 23 and 34, followed by the class]
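
The four statistics and the 0-1 min-max normalization can be sketched as follows. This is a minimal illustration, not the project's feature extractor: it takes a plain list of subband coefficients (or prediction errors) as input and leaves the wavelet decomposition itself out. Using population moments and the non-excess form of kurtosis is an assumption.

```python
import math


def moments(values):
    """Return (mean, variance, skewness, kurtosis) of a list of numbers.
    Population moments and non-excess kurtosis are assumed."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c ** 2 for c in centered) / n
    if var == 0:
        return mean, 0.0, 0.0, 0.0  # constant input: report 0 for the higher moments
    std = math.sqrt(var)
    skew = sum(c ** 3 for c in centered) / (n * std ** 3)
    kurt = sum(c ** 4 for c in centered) / (n * var ** 2)
    return mean, var, skew, kurt


def min_max_normalize(rows):
    """Scale each column of a feature table to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


print(moments([1, 2, 3, 4, 5]))
print(min_max_normalize([[0, 10], [5, 10], [10, 20]]))
```

Normalizing per column keeps a large-valued feature such as a variance from dominating the discriminant.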

Detecting Messages in jpegs
  Observation
    Trials performed on 2000 images
      1000 clean and 1000 stego
      Random selection of 1000 instances without replacement (500 each class)
      Messages in stego had sufficient size
    Results show overwhelming accuracy
      Bior3.1: True Neg 100%, True Pos 98.6%
      Rbio5.5: True Neg 99.8%, True Pos 98.8%
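
True-negative and true-positive rates like those above come from thresholding the FLD projection of each feature vector. The sketch below shows a generic two-class Fisher linear discriminant on made-up two-feature data. The midpoint threshold, the small ridge term and all function names are assumptions; the slides do not say how the threshold was chosen.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter(rows, mu):
    """Within-class scatter matrix of one class."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [v - m for v, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def project(w, row):
    return sum(wi * v for wi, v in zip(w, row))


def train_fld(clean, stego, ridge=1e-6):
    """Return (w, threshold): w solves Sw w = mu_stego - mu_clean, and the threshold
    is the midpoint of the projected class means (assumed choice)."""
    mu0, mu1 = mean_vector(clean), mean_vector(stego)
    s0, s1 = scatter(clean, mu0), scatter(stego, mu1)
    d = len(mu0)
    sw = [[s0[i][j] + s1[i][j] + (ridge if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    w = solve(sw, [b - a for a, b in zip(mu0, mu1)])
    return w, (project(w, mu0) + project(w, mu1)) / 2


def rates(w, threshold, clean, stego):
    """Return (true negative rate, true positive rate); class 1 = stego."""
    tn = sum(project(w, r) <= threshold for r in clean) / len(clean)
    tp = sum(project(w, r) > threshold for r in stego) / len(stego)
    return tn, tp


# Made-up, already normalized two-feature data for illustration only.
clean = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25], [0.05, 0.10]]
stego = [[0.80, 0.90], [0.90, 0.70], [0.85, 0.95], [0.70, 0.80]]
w, t = train_fld(clean, stego)
print(rates(w, t, clean, stego))
```

In practice the rates would be measured on held-out images rather than on the training data, as in the train/test splits described in these slides.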

Detecting Messages in jpegs  Conclusion Messages of sufficient size can be detected in stego images with great accuracy  Improved accuracy may be due to a large training set  1000 (800/200)  500 (400/100)   Restricted domain  Many similar images .

Detecting Messages in jpegs  Problems  Authors did not handle log of zero problem  Replaced with small value  Differing jpeg sizes need differing message sizes  Dynamic message injection .

Detecting Messages in jpegs
  Other Classifiers
    Tests were run on J4.8, SMO, Logistic and Naïve Bayes for bior3.1 and rbio5.5 with 80/20 split and default settings
    Results

Future Work  Would like to find optimal stems Pattern matching  Text mining  Cryptanalysis   Would like to optimize TestMsg code  C/assembly code .

References
  Petitcolas, F.A.P., Anderson, R., Kuhn, M.G. "Information Hiding - A Survey". July 1999. URL: http://www.cl.cam.ac.uk/~fapp2/publications/ieee99-infohiding.pdf (11/26/01 17:00)
  Farid, Hany. "Detecting Steganographic Messages in Digital Images". Department of Computer Science, Dartmouth College, Hanover NH 03755
  Farid, Hany. "Detecting Hidden Messages Using Higher Order Statistical Models". Department of Computer Science, Dartmouth College, Hanover NH 03755
  Lyu, Siwei and Farid, Hany. "Steganalysis Using Color Wavelet Statistics and One-Class Support Vector Machines". Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA
  Moby™ Words II. Copyright (c) 1988-93, Grady Ward. All Rights Reserved.

Spy Vs. Spy by Antonio Prohias from MAD Magazine

Have a good Winter Break!