RECOVERING GRAPHICS FILES
Dr. Abu Sayed Md. Mostafizur Rahaman
Professor
Department of Computer Science and Engineering
Jahangirnagar University
OUTCOME
 After reading this chapter and completing the exercises, you
will be able to:
 Describe types of graphics file formats
 Explain types of data compression
 Explain how to locate and recover graphics files
 Describe how to identify unknown file formats
 Explain copyright issues with graphics

Digital Forensics - Dr. Abu Sayed Md. Mostafizur Rahaman
1/27/2023

2
RECOGNIZING A GRAPHICS FILE
 Graphics files contain digital photographs, line art, three-dimensional images, and scanned replicas of printed pictures
 A graphics program creates one of three types of graphics files:
• Bitmap images are collections of dots, or pixels, in a grid format that
form a graphic.
• Vector graphics are based on mathematical instructions that define
lines, curves, text, ovals, and other geometric shapes.
• Metafile graphics are combinations of bitmap and vector images.
 Two types of programs to work with graphics files
• Graphics editors
• Image viewers

3
UNDERSTANDING BITMAP AND RASTER IMAGES
 Bitmap images
• Grids of individual pixels
 Raster images - also collections of pixels
• Pixels are stored in rows
• Better for printing

• Image quality
• Screen resolution - determines amount of
detail that is displayed.
• Resolution is related to the density of
pixels onscreen and depends on a
combination of hardware and software.
• Number of color bits used per pixel
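
To make the link between color bits per pixel and image data concrete, here is a minimal sketch. The function names and sample dimensions are assumptions chosen for illustration, and the size calculation ignores file headers and row padding that real bitmap formats add.

```python
def colors_for_depth(bits_per_pixel: int) -> int:
    """Number of distinct colors one pixel can represent at this bit depth."""
    return 2 ** bits_per_pixel


def raw_bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the uncompressed pixel data in bytes (headers/padding ignored)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to a whole number of bytes


if __name__ == "__main__":
    # Hypothetical 1024 x 768 image at three common bit depths.
    for depth in (1, 8, 24):
        print(depth, colors_for_depth(depth), raw_bitmap_bytes(1024, 768, depth))
```

More bits per pixel means more possible colors but proportionally more data for the same pixel grid.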

4
UNDERSTANDING VECTOR GRAPHICS
 Characteristics of vector graphics
• Use lines instead of dots
• Vector files store only the calculations for drawing lines and shapes
• A graphics program converts these calculations into an image
• Smaller than bitmap files (because they store only calculations)
• Preserve quality when the image is enlarged
 Example programs: CorelDRAW, Adobe Illustrator

5
UNDERSTANDING METAFILE GRAPHICS
 Metafile graphics combine raster and vector graphics
 Example
• Scanned photo (bitmap) with text (vector)
 Share advantages and disadvantages of both types
• When enlarged, bitmap part loses quality

 For example, if you enlarge a metafile graphic, the area created with a bitmap loses some resolution, but the vector-formatted area remains sharp and clear.

6
UNDERSTANDING GRAPHICS FILE FORMATS
 Standard bitmap file formats
• Portable Network Graphics (.png)
• Graphics Interchange Format (.gif)
• Joint Photographic Experts Group (.jpeg, .jpg)
• Tagged Image File Format (.tiff, .tif)
• Windows Bitmap (.bmp)
 Standard vector file formats
• Hewlett-Packard Graphics Language (.hpgl)
• AutoCAD (.dxf)
• EPS, WMF, EMF
To learn more about file formats, go to www.garykessler.net/library/file_sigs.html

You can also go to www.webopedia.com, type a file format (e.g., tga), and press Submit. Scroll down until you find a definition of this format.

7
UNDERSTANDING GRAPHICS FILE FORMATS
 Nonstandard graphics file formats
• Targa (.tga)
• Raster Transfer Language (.rtl)
• Adobe Photoshop (.psd) and Illustrator (.ai)
• Freehand (.fh9)
• Scalable Vector Graphics (.svg)
• Paintbrush (.pcx)
Go to www.webopedia.com, type a file format (e.g., tga), and press Submit. Scroll down until you find a definition of this format.

To learn more about file formats, go to www.garykessler.net/library/file_sigs.html

8
UNDERSTANDING DIGITAL PHOTOGRAPH FILE FORMATS
 Most, if not all, digital cameras produce digital photos in raw or Exif
format
 A forensics investigator might need to examine a digital photo created by a witness to an accident, for example
• Crimes such as child pornography might involve hundreds of digital photos of alleged victims, and knowing how to
analyze the data structures of graphics files can give you additional evidence for a case
 Examining the raw file format
• Raw file format
• Referred to as a digital negative
• Typically found on many higher-end digital cameras
• Sensors in the digital camera simply record pixels on the camera’s memory card
• Raw format maintains the best picture quality

9
UNDERSTANDING DIGITAL PHOTOGRAPH FILE FORMATS

 Examining the raw file format (cont’d)


• The biggest disadvantage is that it’s proprietary, and not all image viewers can display these formats
• The process of converting raw picture data to another format is
referred to as demosaicing
 Examining the Exchangeable Image File format
• Exchangeable Image File (Exif) format
• Commonly used to store digital pictures
• Developed by the Japan Electronics and Information Technology Industries Association (JEITA) as a standard for
storing metadata in JPEG and TIFF files
• When a digital photo is taken, information about the device (such as model,
make, and serial number) and settings (such as shutter speed, focal length,
resolution, date, and time) are stored in the graphics file.
• In addition, if the device has GPS capability, the latitude and longitude
location data might be recorded in the Exif section of the file.
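
As a hedged illustration of working with Exif location data: Exif GPS tags store latitude and longitude as degrees, minutes, and seconds (each a numerator/denominator pair) plus a reference letter such as N, S, E, or W. The sketch below converts such a value to decimal degrees. The function name and the sample coordinates are hypothetical, and extracting the tags from a file is assumed to be done by a separate tool.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref: str) -> float:
    """Convert ((deg_n, deg_d), (min_n, min_d), (sec_n, sec_d)) to decimal degrees.

    ref is the hemisphere letter: "N"/"E" are positive, "S"/"W" are negative.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value
    return float(value)


if __name__ == "__main__":
    # Hypothetical sample: 23 deg 52 min 48 sec North.
    print(dms_to_decimal(((23, 1), (52, 1), (48, 1)), "N"))
```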

10
UNDERSTANDING DIGITAL PHOTOGRAPH FILE FORMATS

 Examining the Exchangeable Image File format (cont’d)


• Exif format collects metadata
• Investigators can learn more about the type of digital camera and the
environment in which pictures were taken
• Viewing an Exif JPEG file’s metadata requires special programs
• Exif Reader, IrfanView, or ProDiscover
• Exif file stores metadata at the beginning of the file

11
AUTOPSY SHOWS EXIF INFORMATION

15
IMAGES IN DEPTH
 For more depth, take a graduate-level Digital Image Processing class

16
UNDERSTANDING DATA COMPRESSION
 Some image formats compress their data
• GIF and JPEG
 Others, like BMP, do not compress their data
• Use data compression tools for those formats
 Data compression
• Coding data from a larger to a smaller form
• Types
• Lossless compression and lossy compression

17
LOSSLESS AND LOSSY COMPRESSION
 Lossless compression
• Reduces file size without removing data
• Based on Huffman or Lempel-Ziv-Welch coding
• For redundant bits of data
• Utilities: WinZip, PKZip, StuffIt, and FreeZip
 Lossy compression
• Permanently discards bits of information
• Vector quantization (VQ)
• Determines what data to discard based on vectors in the graphics file
• Example: JPEG compression (note that the lzip utility is a lossless LZMA-based compressor, not a lossy one)
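
The difference between the two compression types can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's zlib module (DEFLATE, which combines LZ77 and Huffman coding, not LZW) for the lossless case, and it stands in for lossy compression by simply discarding the low bits of each byte, which is not vector quantization. The sample pixel values are made up.

```python
import zlib


def lossless_round_trip(data: bytes) -> tuple[bytes, bytes]:
    """Compress with zlib and decompress again; returns (compressed, restored)."""
    compressed = zlib.compress(data)
    return compressed, zlib.decompress(compressed)


def drop_low_bits(data: bytes, bits: int = 4) -> bytes:
    """Crude lossy step: permanently zero the lowest `bits` bits of every byte."""
    mask = 0xFF & ~((1 << bits) - 1)
    return bytes(b & mask for b in data)


# Hypothetical, highly redundant "pixel" data.
pixels = bytes([10, 10, 10, 11, 200, 200, 201, 200] * 64)
compressed, restored = lossless_round_trip(pixels)
quantized = drop_low_bits(pixels)

print(len(pixels), len(compressed))  # redundant data shrinks
print(restored == pixels)            # lossless: original fully recovered
print(quantized == pixels)           # lossy: discarded bits cannot be recovered
```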

18
LOSSLESS ENCODING: LEMPEL-ZIV

Coded Sequence:
0000 0001 0010 0011 1001 0100 1000 1100 1101

Dictionary   Content   Codeword
location               Binary value   LSB of
                       of location    Content
1            0         000            0
2            1         000            1
3            00        001            0
4            01        001            1
5            011       100            1
6            10        010            0
7            010       100            0
8            100       110            0
9            101       110            1

19
LOSSLESS DECODING: LEMPEL-ZIV
Coded Sequence:
0000 0001 0010 0011 1001 0100 1000 1100 1101

Dictionary   Codeword   Info code
location                Info code from   LSB of
                        dic position     Codeword
1            0000       -                0
2            0001       -                1
3            0010       0                0
4            0011       0                1
5            1001       01               1
6            0100       1                0
7            1000       01               0
8            1100       10               0
9            1101       10               1

Recovered Sequence:
0 1 00 01 011 10 010 100 101

20
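
The encoding and decoding tables above can be reproduced with a short sketch. It assumes the scheme the tables imply: each codeword is the 3-bit dictionary location of the longest already-seen prefix (000 when there is none) followed by the final bit of the new content. The function names are illustrative, and the sketch assumes the input splits cleanly into new phrases.

```python
def lz_encode(bits: str, index_width: int = 3) -> list[str]:
    """Encode a bit string into codewords: prefix location + last bit."""
    dictionary = {}          # content -> dictionary location (1-based)
    codewords = []
    phrase = ""
    for bit in bits:
        phrase += bit
        if phrase not in dictionary:
            prefix_location = dictionary.get(phrase[:-1], 0)  # 0 = empty prefix
            codewords.append(format(prefix_location, f"0{index_width}b") + bit)
            dictionary[phrase] = len(dictionary) + 1
            phrase = ""
    if phrase:
        raise ValueError("input ended in the middle of a phrase")
    return codewords


def lz_decode(codewords: list[str], index_width: int = 3) -> str:
    """Rebuild the bit string by replaying the dictionary construction."""
    contents = [""]          # location 0 holds the empty prefix
    for codeword in codewords:
        location = int(codeword[:index_width], 2)
        contents.append(contents[location] + codeword[index_width:])
    return "".join(contents[1:])


if __name__ == "__main__":
    source = "01000101110010100101"   # 0 1 00 01 011 10 010 100 101
    coded = lz_encode(source)
    print(" ".join(coded))
    print(lz_decode(coded) == source)
```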
LOCATING AND RECOVERING GRAPHICS FILES
 Operating system tools
• Time consuming
• Results are difficult to verify
 Digital forensics tools
• Image headers
• Compare them with good header samples
• Use header information to create a baseline analysis
• Reconstruct fragmented image files
• Identify data patterns and modified headers

21
IDENTIFYING GRAPHICS FILE FRAGMENTS
 If a graphics file is fragmented across areas on a disk, you
must recover all the fragments before re-creating the file.
Recovering any type of file fragments is called carving, also
known as salvaging outside North America.
 Digital forensics tools
• Can carve from file slack and free space
• Help identify image file fragments and put them together
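
A very simple form of carving can be sketched as a scan for a start marker and an end marker. This is an assumption-laden illustration, not how any particular forensics tool works: it treats FF D8 FF as the start of a JPEG and FF D9 as its end, and it assumes each file is stored contiguously. Fragmented files and embedded thumbnails would defeat it.

```python
START_MARKER = b"\xff\xd8\xff"   # JPEG start-of-image plus first byte of next marker
END_MARKER = b"\xff\xd9"         # JPEG end-of-image


def carve_jpegs(data: bytes) -> list[bytes]:
    """Return byte runs that begin with START_MARKER and end with END_MARKER."""
    carved = []
    start = data.find(START_MARKER)
    while start != -1:
        end = data.find(END_MARKER, start + len(START_MARKER))
        if end == -1:
            break                         # no closing marker: nothing more to carve
        stop = end + len(END_MARKER)
        carved.append(data[start:stop])
        start = data.find(START_MARKER, stop)
    return carved


if __name__ == "__main__":
    # Hypothetical free-space buffer containing two tiny fake JPEG runs.
    blob = (b"\x00" * 10 + b"\xff\xd8\xff\xe0AAAA\xff\xd9" + b"junk"
            + b"\xff\xd8\xff\xe1BB\xff\xd9" + b"\x00" * 5)
    for piece in carve_jpegs(blob):
        print(piece.hex(" "))
```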

22
REPAIRING DAMAGED HEADERS
 When examining recovered fragments from files in slack or free space
• You might find data that appears to be a header

 If header data is partially overwritten, you must reconstruct the header to make it readable
• Compare the hexadecimal values of known graphics file formats with the pattern of the file header you found

 Each graphics file format has a unique header value

 Example:
• A JPEG file has the hexadecimal header value FFD8, followed at offset 6 by the label JFIF (standard JPEG) or Exif

 Exercise:
• Investigate a possible intellectual property theft by a contract employee of Exotic Mountain Tour Service
(EMTS)
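
The JPEG header example above can be checked programmatically. The sketch below is a minimal illustration: the function name and return strings are assumptions, and it inspects only the first ten bytes (FF D8, the FF E0 or FF E1 marker, and the label at offset 6).

```python
def classify_jpeg_header(header: bytes) -> str:
    """Classify a file's leading bytes as JFIF, Exif, other JPEG, or not a JPEG."""
    if header[:2] != b"\xff\xd8":
        return "not a JPEG"
    marker = header[2:4]     # application segment marker
    label = header[6:10]     # label starts at offset 6
    if marker == b"\xff\xe0" and label == b"JFIF":
        return "JFIF"
    if marker == b"\xff\xe1" and label == b"Exif":
        return "Exif"
    return "JPEG (label not recognized)"


if __name__ == "__main__":
    print(classify_jpeg_header(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"))
    print(classify_jpeg_header(b"\xff\xd8\xff\xe1\x12\x34Exif\x00\x00"))
    print(classify_jpeg_header(b"zzzz\x00\x10zFIF\x00"))
```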

23
SEARCHING FOR AND CARVING DATA FROM
UNALLOCATED SPACE
 Steps
• Planning your examination
• Searching for and recovering digital photograph evidence
• Use ProDiscover Basic to search for and extract (recover) possible evidence
of JPEG files
• False hits are referred to as false positives

24
CASE STUDY
 Suppose you’re investigating a possible intellectual property theft by a contract employee of Exotic Mountain Tour
Service (EMTS). EMTS has just finished an expensive marketing and customer service analysis with Superior Bicycles,
LLC.
 Based on this analysis, EMTS plans to release advertising for its latest tour service with a joint product marketing
campaign with Superior Bicycles.
 Unfortunately, EMTS suspects that a contract travel consultant, Bob Aspen, might have given sensitive marketing data to
another bicycle competitor. EMTS is under a nondisclosure agreement with Superior Bicycles and must protect this
advertising campaign material.
 An EMTS manager found a USB drive on the desk Bob Aspen was assigned to.
 Your task is to determine whether the drive contains proprietary EMTS or Superior Bicycles data. The EMTS manager also
gives you some interesting information he gathered from the Web server administrator.
 EMTS filters all Web-based e-mail traffic traveling through its network and detects suspicious attachments.
 When a Web-based e-mail with attachments is received, the Web filter is triggered. The EMTS manager gives you two
screen captures, shown in Figures 8-5 and 8-6, of partial e-mails intercepted by the Web filter that lead him to believe Bob
Aspen might have engaged in questionable activities.

25
CASE STUDY- UNDERSTANDING
 For this examination, we need to search for all possible places data might
be hiding.

 To do this, in the next section you use ProDiscover’s cluster search function
with hexadecimal search strings to look for known data.

28
CASE STUDY (TO DO): SEARCHING FOR AND CARVING DATA FROM
UNALLOCATED SPACE
 At this time, you have little information on what to look for on the USB drive Bob Aspen used. You need to ask some
basic questions and make some assumptions based on available information to proceed in your search for information.

 The first message from terrysadler@goowy.com is addressed to baspen99@aol.com, which matches the contract
employee’s name, Bob Aspen. Next, look at the time and date stamps in this message. The first is 4 Feb 2015 9:21 PM,
and the second, farther down, is a header from Jim Shu with a time and date stamp of February 5, 2015, 5:17 AM -08:00.

 Therefore, it seems Jim Shu sent the original message, which was forwarded to the terrysadler@goowy.com account.
Because the timestamp for Jim Shu is later than the timestamp for terrysadler@goowy.com, Terry Sadler’s location
might be in a different time zone, somewhere west of Jim Shu, or one of the two e-mail servers’ time values is off,
because e-mail servers, not users, provide timestamps.

 Continuing with the first message, note that Jim is telling Terry to have Bob alter the file extensions from .txt to .jpg, and
the files are about new kayaks. The last line appears to be a previous response from terrysadler@goowy.com
commenting that Bob (assuming it’s Bob Aspen) can’t receive this message.

29
UNDERSTANDING ABOUT THE CASE
 With these collected facts and your knowledge of JPEG file structures, you can use the steps in the
following sections to determine whether these allegations are true.

 Planning Your Examination


• In the second e-mail from Jim Shu to Terry Sadler, Jim states, “So to view them you have to re-edit each file to the proper
JPEG header of offset 0x FF D8 FF E0 and offset 6 of 4A.” From this statement, you can assume that any kayak
photographs on the USB drive contain unknown characters in the first four bytes (offsets 0 through 3) and in the byte at
offset 6. Because this is all Jim Shu said about the JPEG files, you need to assume that the bytes at offsets 7, 8, and 9
still have the original correct information for the JPEG file.
• In “Examining the Exchangeable Image File Format,” you learned the difference between a standard JFIF JPEG and an Exif
JPEG file: The JFIF format has 0x FFD8 FFE0 in the first four bytes, and the Exif format has 0x FFD8 FFE1. Starting at
offset 6, the JPEG label is listed as JFIF or Exif. In the second e-mail, Jim Shu mentions 0x FF D8 FF E0, which is a JFIF
JPEG format. He also says to change the byte at offset 6 to 0x 4A, which is the uppercase letter “J” in ASCII.
• Because the files might have been downloaded to the USB drive, Bob Aspen could have altered or deleted them, so you
should be thorough in your examination and analysis. You need to search all sectors of the drive for deleted files, both
allocated space (in case Bob didn’t modify the files) and unallocated space.

31
SEARCHING FOR AND RECOVERING DIGITAL PHOTOGRAPH
EVIDENCE
 In this section, you learn how to use ProDiscover to search for and extract (recover)
possible evidence of JPEG files from the USB drive the EMTS manager gave you.

 The search string to use for this examination is “FIF.” Because it’s part of the label
name of the JFIF JPEG format, you might have several false hits if the USB drive
contains several other JPEG files. These false hits, referred to as false positives,
require examining each search hit to verify whether it’s what you are looking for.

 Next: move to the practical exercise
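
The “FIF” search described on this slide can be sketched outside ProDiscover as a plain byte search. This is only an illustration: the function names are hypothetical, and the filter rests on two assumptions, that “FIF” sits at offsets 7 through 9 of a JFIF header and that a file's first byte falls on a sector boundary (512 bytes here). Hits that fail the filter are treated as probable false positives, but every hit still needs manual review.

```python
def find_all(data: bytes, needle: bytes) -> list[int]:
    """Return the offset of every occurrence of needle in data."""
    hits = []
    position = data.find(needle)
    while position != -1:
        hits.append(position)
        position = data.find(needle, position + 1)
    return hits


def likely_header_starts(data: bytes, sector_size: int = 512) -> list[int]:
    """Keep only "FIF" hits whose implied header start is sector-aligned."""
    starts = []
    for hit in find_all(data, b"FIF"):
        start = hit - 7      # "FIF" occupies offsets 7-9 of a JFIF header
        if start >= 0 and start % sector_size == 0:
            starts.append(start)
    return starts


if __name__ == "__main__":
    # Hypothetical 2048-byte drive image: one stray "FIF" and one altered header.
    image = bytearray(2048)
    image[100:103] = b"FIF"                  # false positive inside other data
    image[512:522] = b"zzzz\x00\x10zFIF"     # altered JFIF header at sector 1
    print(find_all(bytes(image), b"FIF"))
    print(likely_header_starts(bytes(image)))
```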

32
REBUILDING FILE HEADERS
 Before attempting to edit a recovered graphics file
• Try to open the file with an image viewer first
 If the image isn’t displayed, you have to inspect and correct
the header values manually
 Steps
• Recover more pieces of file if needed
• Examine file header
• Compare with a good header sample
• Manually insert correct hexadecimal values
• Test corrected file
• Next: move to the practical exercise
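
The manual header fix from the case study (restore FF D8 FF E0 in the first four bytes and 4A at offset 6) can also be sketched in code. This is a minimal illustration with a hypothetical function name; it works on a copy of the data so that the recovered evidence itself is never modified.

```python
JFIF_FIRST_FOUR = b"\xff\xd8\xff\xe0"   # expected bytes at offsets 0-3
LETTER_J = 0x4A                         # expected byte at offset 6 ("J")


def repair_jfif_header(data: bytes) -> bytes:
    """Return a copy of data with the JFIF header bytes restored."""
    if len(data) < 10:
        raise ValueError("too short to contain a JFIF header")
    fixed = bytearray(data)             # copy; the original stays untouched
    fixed[0:4] = JFIF_FIRST_FOUR
    fixed[6] = LETTER_J
    return bytes(fixed)


if __name__ == "__main__":
    altered = b"zzzz\x00\x10zFIF\x00rest of file"
    print(repair_jfif_header(altered)[:11].hex(" "))
```

After a repair like this, the corrected copy should still be tested in an image viewer, as the steps above describe.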

33
RECONSTRUCTING FILE FRAGMENTS
 Locate the noncontiguous clusters that make up a deleted file
 Steps
• Locate and export all clusters of the fragmented file
• Determine the starting and ending cluster numbers for each
fragmented group of clusters
• Copy each fragmented group of clusters in their correct sequence to a
recovery file
• Rebuild the file’s header to make it readable in a graphics viewer
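
The copy step above can be sketched as follows. The function name, the cluster size, and the sample cluster ranges are all assumptions for illustration; in practice the ranges come from your examination of the drive, and the work is done on a forensic copy.

```python
def rebuild_from_clusters(image: bytes, cluster_ranges, cluster_size: int = 4096) -> bytes:
    """Concatenate cluster groups in the order given.

    cluster_ranges is a list of (first_cluster, last_cluster) pairs, inclusive,
    listed in the file's correct sequence rather than in on-disk order.
    """
    recovered = bytearray()
    for first, last in cluster_ranges:
        recovered += image[first * cluster_size:(last + 1) * cluster_size]
    return bytes(recovered)


if __name__ == "__main__":
    # Hypothetical image with 4-byte "clusters" so the result is easy to read.
    image = b"AAAABBBBCCCCDDDDEEEE"
    # The file starts in cluster 3 and continues in clusters 0-1.
    print(rebuild_from_clusters(image, [(3, 3), (0, 1)], cluster_size=4))
```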

34
RECONSTRUCTING FILE FRAGMENTS
 Go to the practical part

35
IDENTIFYING UNKNOWN FILE FORMATS
 Knowing the purpose of each format and how it stores data is
part of the investigation process
 The Internet is the best source
• Search engines like Google
• Find explanations and viewers
 Popular Web sites
• www.fileformat.info/format/all.htm
• http://extension.informer.com
• www.martinreddy.net/gfxl

36
ANALYZING GRAPHICS FILE HEADERS
 Necessary when you find files your tools do not recognize
 Use a hexadecimal editor such as WinHex
• Record hexadecimal values in the header and use them to define a file
type
 Example:
• The XIF file format is old, and little information is available
• The first 3 bytes of an XIF file are the same as those of a TIF file
• Build your own header search string
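
Recording header values and matching them against known ones can be sketched as a small lookup. The signature table below lists a few widely documented header values and is far from complete; the function names are illustrative. A format that shares only part of a known signature, as described for XIF and TIF, may come back as unknown until you add your own entry.

```python
KNOWN_SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"BM", "BMP"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
]


def header_hex(data: bytes, length: int = 8) -> str:
    """Hexadecimal view of the first bytes, as a hex editor would show them."""
    return data[:length].hex(" ").upper()


def identify(data: bytes) -> str:
    """Return the first known format whose signature matches the header."""
    for signature, name in KNOWN_SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


if __name__ == "__main__":
    sample = b"II*\x00\x08\x00\x00\x00"
    print(header_hex(sample), "->", identify(sample))
```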

37
UNDERSTANDING STEGANOGRAPHY IN GRAPHICS FILES
 Steganography hides information inside image files
• An ancient technique
 Two major forms: insertion and substitution
 Insertion
• Hidden data is not displayed when viewing host file in its associated
program
• You need to analyze the data structure carefully
• Example: Web page

38
UNDERSTANDING STEGANOGRAPHY IN GRAPHICS FILES
 Substitution
• Replaces bits of the host file with other bits of data
• Usually changes the last two least significant bits (LSBs)
• Detected with steganalysis tools (a.k.a. steg tools)
 You should inspect all files for evidence of steganography
 Clues to look for:
• Duplicate files with different hash values
• Steganography programs installed on suspect’s drive

41
USING STEGANALYSIS TOOLS
 Use steg tools to detect, decode, and record hidden data
 Detect variations of the graphic image
• When steganography is applied correctly, you cannot detect the hidden data in most cases
 Check to see whether the file size, image quality, or file
extensions have changed

43
UNDERSTANDING COPYRIGHT ISSUES WITH GRAPHICS
 Steganography has been used to protect copyrighted material
• By inserting digital watermarks into a file
 Digital investigators need to be aware of copyright laws
 Copyright laws for the Internet are not clear
• There is no international copyright law
 Check www.copyright.gov
• The U.S. Copyright Office identifies what can and can’t be covered under
copyright law in the U.S.

44
