
A Report on the use of Forensic Analysis Tools in

Backtrack and Kali

By
Kevin Pryce

B00052949

25/10/2013
Table of Contents
A Report on the use of Forensic Analysis Tools in Backtrack and Kali
Introduction
Section 1: Imaging Tools
Section 1:1. dd_rescue
Section 1:2. affconvert
Section 2: Information Gathering Tools
Section 2:1. fsstat
Section 2:2. affinfo
Section 3: Analysis Tools
Section 3:1. bulk_extractor
Section 3:2. scalpel
Section 3:3. recoverjpeg
Section 4: Hashing and Steganography
Section 4:1. md5deep
Section 4:2. stegdetect
Conclusions
References
Illustrations
Introduction

Backtrack (BT) is a Linux-based distribution, initially created by
Mati Aharoni and Max Moser, that contains a large selection of tools
used in digital forensics and penetration testing. The name comes
from the backtracking algorithm, which builds solutions
incrementally, discarding a candidate piece of information as soon
as it has been ruled out as a possible step towards a solution.
The first release was in Feb. 2006, with versions going up to BT 5
R3. In March 2013, Kali was released as the updated, slicker
version that replaces BT, including support and services. Older
versions will no longer have community support, as efforts will
concentrate on Kali. Backtrack-linux.org will remain as a reference
resource for an undetermined period of time while users migrate to
the new Kali.org platform. Both of these sites have a lot of
information for the user, as well as community support in the
forums. They are not exactly friendly to newcomers: information can
be difficult to find and slightly daunting when the user is
unfamiliar with the objectives of this distribution [1].
The objective of this report is to outline and demonstrate some of
the tools used in the digital forensics section of Kali/BT. The
student will outline the usage of each tool, then show, using
screenshots, how to use it, and hopefully have the correct output
to demonstrate that the tool works. To avoid confusion, of which
there can be a lot, the student will try to explain each tool in
300 to 400 words. This will give the reader an outline of the tool,
but is not meant to be a comprehensive guide to its usage. The
student will give links to the resources used, as well as useful
help commands and the manual pages included with most tools.
In an attempt to keep it simple, we will concentrate our attention
on one or two raw disk images (.dd). A raw image is a bit-for-bit
copy of the contents of a hard drive that we can use to test our
tools. This type of image file is produced by the dd imaging tool;
dd is one of the oldest imaging tools still in use and its format
is recognised by most analysis tools, so it is a widely accepted
format [2].
The examples we are using are from a testing suite that can be
found at SourceForge [3]. These images have been created with the
tester in mind and may have bugs written in to test the tool being
used.
Section 1: Imaging Tools

Section 1:1. dd_rescue


dd_rescue is used to recover data from a file or disk that may
have read errors. Its algorithm first passes over data that can be
easily recovered, and waits until all of the easily retrievable
data has been copied before trying to recover data from damaged
sectors. dd_rescue tries to cause as little damage as it can to the
already failing drive by making fast sweeps over the disk. This
prevents the program from spending extra time in corrupt areas,
which causes more damage to the already damaged surface, heads and
drive mechanics. It does not write zeros to bad sectors, allowing
data to be preserved longer.
The utility also has a logfile feature. The logfile is a very
important part of this tool, as it gives the user more flexibility
by allowing interrupt and resume options. Using the logfile can
help in recovering data from the same disk multiple times. This
increases the chance of retrieving all of the data, as the logfile
means only the still-needed blocks are read on the second and
successive runs.
For example, suppose we try to recover data from a failing hard
drive. The first pass of dd_rescue gives us our base image, which
has been recorded in the logfile; say we get 80% of the data from
the hard drive on this pass. On the second pass dd_rescue will try
to recover data only from the sectors on which it had previously
failed, bringing the recovered data up to, say, 90%. This can
continue until as much of the data as can be recovered has been
[4].
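The multi-pass idea can be sketched in Python. This is an illustrative toy, not dd_rescue's actual implementation; the simulated flaky_read drive, the block counts and the pass limit are all invented for the example:

```python
def recover(read_block, total_blocks, passes=3):
    """Toy sketch of multi-pass recovery: keep a log of unread blocks
    and retry only those blocks on each successive pass."""
    image = {}                           # block number -> recovered data
    missing = set(range(total_blocks))   # the "logfile": blocks not yet read
    for _ in range(passes):
        for blk in sorted(missing):
            data = read_block(blk)       # None stands in for a read error
            if data is not None:
                image[blk] = data
        missing.difference_update(image)  # later passes touch only bad blocks
        if not missing:
            break
    return image, missing

# Simulated failing drive: blocks 3 and 7 fail on the first attempt only.
attempts = {}
def flaky_read(blk):
    attempts[blk] = attempts.get(blk, 0) + 1
    return None if blk in (3, 7) and attempts[blk] == 1 else bytes([blk])

image, still_missing = recover(flaky_read, 10)
```

Healthy blocks are read exactly once; only the two failing blocks are touched again on the second pass, mirroring the 80%-then-90% progression described above.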
The tool was tested in the Kali distro. The student decided to use
an existing .dd file to mitigate disaster. This is a
straightforward tool and does not require complex commands to get
something done.

Step 1: Make sure that after you start the tool you navigate to
the folder containing the image you want to copy. In illustration
1 we can see the command to run dd_rescue.

Illustration 1: Verbose dd_rescue command.

Step 2: As shown in illustration 1, dd_rescue is called with -v,
which enables verbose mode and shows the working output.
8-jpeg-search.dd is the image that you want to copy and DDrescue is
the folder you want to put the rescued image into.
Step 3: In illustration 2 we can see the output from the verbose
command.

Illustration 2: dd_rescue output.

The output file hash was checked against the original image hash
to verify that the complete image was recovered.

Section 1:2. affconvert


affconvert is used to convert raw data files (.dd) to the Advanced
Forensic Format (AFF), an open-source, extensible imaging format.
It gives the user the ability to store disk images with or without
compression, allows disk images of any size to be stored, lets
metadata be stored within the disk image or separately, is a simple
tool to use, and works across multiple platforms.
One of the issues digital investigators encounter is that .dd
images may be very large but contain very little information in
relation to their size, so it is necessary to compress images. This
adds another layer of complication, as general-purpose compression
tools do not allow random access within a compressed file. AFF
seeks to mitigate this by storing files in a more compact form [5].
Mainstream forensic analysis tools such as EnCase are restricted
to using image files that are less than 2GB. With an image that is
10GB in size, for example, EnCase would have to split it into five
2GB files to be able to analyse the data. This is time consuming
and the associated metadata would be limited and of little use. AFF
images can be of any size.
affconvert stores data in pages of the same size, which allows for
the creation of smaller image files than other proprietary software
available. The pagesize is determined when the initial image is
created, but is usually 1MB or 16MB. The pages are then numbered
sequentially from page0 to page n. Because each page of an AFF
image is compressed, it can be quicker to read data from the file
than it is to read from an uncompressed drive.
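The paging idea can be sketched as follows. This is a toy model only: real AFF segments carry names, flags and metadata that this sketch ignores, and the sample image bytes are invented; only the 1MB pagesize comes from the text above:

```python
import zlib

PAGESIZE = 1 << 20  # 1MB pages, one of the sizes mentioned above

def to_pages(image: bytes) -> dict:
    """Split a raw image into fixed-size, sequentially numbered pages,
    compressing each page independently."""
    return {
        f"page{n // PAGESIZE}": zlib.compress(image[n:n + PAGESIZE])
        for n in range(0, len(image), PAGESIZE)
    }

def from_pages(pages: dict) -> bytes:
    """Reassemble the original image by decompressing page0..pageN in order."""
    return b"".join(zlib.decompress(pages[f"page{i}"])
                    for i in range(len(pages)))

# A mostly-empty image compresses to far less than its raw size.
image = b"\x00" * (2 * PAGESIZE) + b"data" * 1000
pages = to_pages(image)
```

Because each page is compressed independently, a single page can be decompressed without reading the whole file, which is why random access into a compressed AFF image stays fast.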
affconvert also generates MD5 and SHA-1 hashes that can be used to
verify that the converted contents of an AFF image match the
original checksums of the raw image.
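This verification step can be reproduced with Python's standard hashlib module; the byte strings here are invented stand-ins for the raw and converted image contents:

```python
import hashlib

def image_hashes(data: bytes):
    """MD5 and SHA-1 digests, as used to check a converted image
    against the original raw image."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()

raw = b"\x00" * 512 + b"example disk contents"  # stand-in for a .dd image
converted = bytes(raw)                          # stand-in for the AFF payload

md5_raw, sha1_raw = image_hashes(raw)
md5_conv, sha1_conv = image_hashes(converted)
```

If either digest pair disagrees, the converted image does not faithfully reproduce the original media.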
To summarise: affconvert converts raw disk images to the Advanced
Forensic Format. AFF is an alternative to other disk imaging
formats and allows the user some more flexibility. Designed as an
open-source format, it allows for open use across multiple
platforms. The AFF format uses less disk space than other image
formats and allows for metadata to be stored with the image.

Step 1: Make sure you are in the same directory as the .dd file
you want to convert, as shown in illustration 3.

Illustration 3: affconvert run command.

Step 2: In illustration 4 we can see the converted file and the
checksum values for the MD5 and SHA-1 hashing algorithms.

Illustration 4: affconvert showing results of command.


Section 2: Information Gathering Tools

Section 2:1 fsstat


fsstat is a simple tool for finding out the details of a file
system. It is part of The Sleuth Kit (TSK) and the Autopsy
forensics toolkit. It is very simple to use: just point it at the
file system image and run it. It could be used in the initial recon
of a file system, to figure out what format it is in [16].
The file system type is useful to begin with. Is it NTFS, FAT32,
FAT16? This can lead the examiner to conclusions as to how to
proceed. The operating system is also a good piece of information.
The output of this tool depends on the system being looked at.
Some systems will yield more detailed information, some less. FAT
file systems are displayed in a less detailed form and the data is
reported in sectors, not clusters [17].
In this example there is a minimum of information shown. Basic
metadata information includes the Master File Table (MFT), which is
the record of every file on the system. The MFT Mirror is an exact
duplicate of the first four records of the MFT. This gives access
to the MFT in the case of a single-sector failure [18].
Section 2:2 affinfo
affinfo is part of the Advanced Forensic Format (AFF) toolkit. It
is used to print detailed statistics about a file to screen. In the
example below we can see that the tool has picked out two bad
sectors in the file. It has recognised two image files on the .dd
image and that the sector size is 512 bytes. The tool also picks up
the checksum values in MD5 and SHA-1, and shows which file the
image was converted from, in the acquisition_commandline entry in
the image [23].
Used with the other tools in the suite, affinfo allows a forensic
investigator to keep consistency across the data being worked on.
After an image is converted to AFF, this tool can be used to
confirm the state of the image before continued testing. Logs can
be used to record and maintain data that could be used in a legal
situation, which supports the integrity of the format [24].
AFF is an open-source, platform-independent format that seeks to
provide the digital forensics examiner with the flexibility to
move between different disk image formats in a way that preserves
the integrity of the filesystem image. AFF provides all the tools
necessary to perform valid tests on the images, without some of
the restrictions that other formats carry. For example, EnCase can
only handle image files up to a 2GB limit.
Section 3: Analysis Tools

Section 3:1. bulk_extractor


bulk_extractor is a lightweight, fast forensics tool that can
extract information from a file system or disk image without
having to parse the actual file system. It extracts information
based on recognised formats such as credit card numbers, email
addresses, domain names, etc. This ability to ignore the file
system of a device allows the tool to be used across different
digital platforms. It can be used to extract information from SSDs,
hard drives and DVDs, for example.
The tool also provides the user with text files and histograms of
the information it finds. This makes the application suitable for
use in digital investigation scenarios. bulk_extractor has the
ability to run different search scanners in parallel, which makes
it fast and efficient. Two limitations: compressed data must be
decompressed before the tool is used, and the tool does not give
the names of the files the data was extracted from [6].
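The core idea, running pattern recognisers over raw bytes with no filesystem parsing, can be sketched with regular expressions. This is a toy version: bulk_extractor's real scanners are far more elaborate, and the sample "disk" bytes are invented:

```python
import re
from collections import Counter

# Toy recognisers in the spirit of bulk_extractor's scanners.
RECOGNISERS = {
    "email": re.compile(rb"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "url":   re.compile(rb"https?://[^\s\x00\"']+"),
}

def scan(raw: bytes) -> dict:
    """Run every recogniser over the raw bytes, ignoring any filesystem
    structure, and build a histogram of the features found."""
    hist = {name: Counter() for name in RECOGNISERS}
    for name, pattern in RECOGNISERS.items():
        for match in pattern.finditer(raw):
            hist[name][match.group()] += 1
    return hist

# Invented sample: raw bytes with no filesystem, just embedded strings.
disk = (b"\x00junk\x00contact alice@example.com or "
        b"http://example.org\x00alice@example.com")
features = scan(disk)
```

The histogram output mirrors bulk_extractor's behaviour of reporting how often each feature occurs, without saying which file it came from.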
The candidate decided to use the image 8-jpeg-search.dd again for
the example. It only serves as an example of the program running,
as the image file is specifically used to test a different tool
and does not contain the data that bulk_extractor looks for, such
as URLs, credit card numbers, etc. The candidate has also tested
bulk_extractor on a 128MB USB stick and the 12-carve-ext2.dd image,
achieving good results. Unfortunately the supposedly stable
version of BT that I have installed won't let me use Ksnapshot at
the moment, so you will have to take my word for it.
To get at the data that has been extracted by the tool, the best
starting point is the report.xml contained in the output folder
named in the initial run command. This records the exact number of
instances of each feature found. The user can then look at them in
more detail in the individual .txt files. A list of the created
files can be seen in the Phase 3 'Creating Histograms' dialogue of
bulk_extractor.

Step 1: Again, ensure that you give the actual path to the image
to be worked on, or, as in this case, work in the same folder. In
illustration 5 we see the command used to run bulk_extractor and
send the output to the extractoroutput folder.

Illustration 5: run command for bulk_extractor

Step 2: The -o option creates the folder 'extractoroutput', where
the extracted output will be placed. The last parameter is the
image file to be used. If you wanted to use this on a USB stick:
first run fdisk -l to find where the stick is located, usually
/dev/sdb1, and then point bulk_extractor at that device. This will
read the contents of the stick. In illustration 6 we can see the
folder that bulk_extractor has created and its report.

Illustration 6: output from bulk_extractor.

This tool cannot be used on compressed files or images, and it
will only find the given common string patterns, i.e. email
formats, credit card number formats and IP addresses.
It presents the information found in a readable way and in a
format that can be used in court. Named entity recognition is the
technology that bulk_extractor is based on [7].

Section 3:2. scalpel


scalpel is a data carving tool. It is used to extract data from
possibly corrupt or overwritten disk images. The tool uses header
and footer values to recognise the type of a file, which makes it
flexible across different file systems. The idea is to find the
start and the end of a file, the header and the footer, and then
'carve' the byte sequence between them into an external file.
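The header/footer carving idea can be sketched in a few lines of Python. This is illustrative only: scalpel's real carver handles work queues, per-type size limits and many formats, and the header/footer bytes and sample disk below are invented for the example:

```python
def carve(raw: bytes, header: bytes, footer: bytes, max_size=10_000_000):
    """Carve out every byte sequence that starts with `header` and ends
    with the next matching `footer`."""
    carved, pos = [], 0
    while True:
        start = raw.find(header, pos)
        if start == -1:                    # no more headers: done
            break
        end = raw.find(footer, start + len(header))
        if end == -1 or end - start > max_size:
            pos = start + 1                # unmatched header: skip past it
            continue
        carved.append(raw[start:end + len(footer)])
        pos = end + len(footer)
    return carved

# JPEG-style header/footer, like an uncommented scalpel.conf line gives.
JPEG_HEADER, JPEG_FOOTER = b"\xff\xd8\xff", b"\xff\xd9"
disk = (b"garbage" + JPEG_HEADER + b"picture-bytes" + JPEG_FOOTER +
        b"slack" + JPEG_HEADER + b"more" + JPEG_FOOTER)
files = carve(disk, JPEG_HEADER, JPEG_FOOTER)
```

Note that the carver never consults any filesystem metadata: it recovers both embedded files purely from their byte signatures.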
In this example we used the 8-jpeg-search.dd file to test scalpel.
Results were very good and it found twice as many JPEG files as
the recoverjpeg tool that we will look at later. The main task
here is that the user must edit the scalpel .conf file [8].
This directs the tool to look for specific file types, as shown
in illustration 7, which is the .conf file located in the
/etc/scalpel directory.
Illustration 7: uncommented search parameters.

When the .conf file is opened initially, all of these references
are commented out. The user must manually delete the hash tag so
that scalpel can use the uncommented parameters to search for the
header and footer information. The .conf offers a selection of the
most common file formats, and the user can also create special
entries for other extensions.
Consecutive runs over the same filesystem can yield more complete
file recovery without damaging the media. Complete file systems
can be reconstructed from multiple corrupted disks, as scalpel
logs each pass and can 'carve' data based on which headers and
footers it still needs to match. After the first pass over an
image, scalpel has a record of the location of all the headers and
footers in the image [9].
On the second and consecutive passes, scalpel will try to match
each header with the correct footer. scalpel uses work queues to
keep track of the chunks of the disk image that files are to be
carved from, following the uncommented .conf parameters.
scalpel is an invaluable tool in the forensics toolkit. It
provides the user with a fast, effective carving tool that does
not need huge amounts of processing power to complete its task. It
works on different filesystems and will retrieve files even when
all metadata is destroyed. It has been proven to recover data from
filesystems that have been reformatted multiple times [10].

Step 1: The -c option points scalpel to the edited .conf file and
-o points to the output folder. In illustration 8 we can see the
command pointing scalpel to the .conf file.

Illustration 8: using the .conf file.

Step 2: In illustration 9, on the next page, we can see the files


that scalpel has carved from the .dd image.
Illustration 9: output of what scalpel has found.

Section 3:3. recoverjpeg


We mentioned this tool earlier. Another one for your 'carving'
kit, recoverjpeg is easy to use and does exactly what it says: it
recovers JPEGs. recoverjpeg searches a disk image in blocks for a
JPEG structure, which it stores in a default format,
imagexxxxx.jpg, in the folder from which it was run [11]. In the
example shown below, recoverjpeg was executed on the raw image
file (.dd) stored on the desktop and the recovered JPEGs were
written to the same location.

Illustration 10: Output from recoverjpeg


Originally designed to recover images from a damaged hard drive,
recoverjpeg has been improved over the last few years to include a
file recovery option [12].
recoverjpeg works by searching for JPEG markers in a filesystem.
Each JPEG has a beginning marker called Start Of Image (SOI),
which is represented in bytes as 0xFF, 0xD8. We can see this
highlighted in the code snippet on the next page.
The first line of highlighted code checks that the first byte
equals 0xFF and the byte following it equals 0xD8; if not, the
scan does not start at that position. Otherwise, the program can
continue.
The second piece of code instructs the program to check for
consecutive 0xFF markers and discard them. After the initial 0xFF,
0xD8 marker is read, any further 0xFF bytes are used as padding in
the file and do not need to be scanned, as they contain nothing.
The 0xFF must be followed by a byte telling what type of marker it
is; the 0xD8 byte tells us that it is a start-of-image marker. The
third piece of highlighted code tells us that the end of the file
has been reached, using the End Of Image (EOI) marker 0xFF, 0xD9
[13].

Illustration 11: Code Sample
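The marker logic described above can be sketched in Python. This is a simplification: it pairs each SOI with the next EOI and skips the 0xFF-padding subtlety the real code handles, and the sample block data is invented:

```python
SOI = b"\xff\xd8"  # Start Of Image marker
EOI = b"\xff\xd9"  # End Of Image marker

def find_jpegs(raw: bytes):
    """Collect every byte run that opens with SOI and closes with the
    next EOI marker."""
    found, pos = [], 0
    while True:
        start = raw.find(SOI, pos)
        if start == -1:                # no SOI: nothing left to recover
            break
        end = raw.find(EOI, start + 2)
        if end == -1:                  # SOI with no EOI: incomplete file
            break
        found.append(raw[start:end + 2])
        pos = end + 2
    return found

# Invented block data: leading slack, one JPEG-like run, trailing slack.
blocks = b"\x00" * 16 + SOI + b"\xff\xff" + b"jpeg-payload" + EOI + b"\x00" * 8
images = find_jpegs(blocks)
```

As in the highlighted C snippet, everything hangs on the two-byte markers: the scan starts only at 0xFF 0xD8 and stops at 0xFF 0xD9.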


Section 4: Hashing and Steganography

Section 4:1 md5deep

This tool uses the MD5 hash function to create a fixed-size bit
string from a file. This string of letters and numbers (the longer
the string, the more complex the algorithm) is a way to identify
the file. It is a fingerprint of the file, unique to that file
because it is generated from the file's unique contents [14].
This works on the principle that it is infeasible to change the
data without changing the hash, that it is infeasible to generate
a message from a hash, and that it is infeasible to find different
messages with the same hash value [15].
The hash function must be one-way. That is, given a hash value, it
should be difficult to find a message that produces it; even the
most subtle difference in the message makes a huge difference in
the generated hash. This is pre-image resistance.
It must also be infeasible, given an input, to find another,
different input with the same hash. This is second pre-image
resistance.
Finally, it must be difficult to find any two different, separate
messages that have the same hash value. This is collision
resistance [19].
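The avalanche behaviour behind these properties is easy to demonstrate with Python's standard hashlib: changing a single letter of the input yields a completely different fixed-size digest. The sample strings are the classic "quick brown fox" test phrases:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the 128-bit MD5 digest as 32 hex characters."""
    return hashlib.md5(data).hexdigest()

original = b"The quick brown fox jumps over the lazy dog"
tampered = b"The quick brown fox jumps over the lazy cog"  # one letter changed

h1, h2 = md5_hex(original), md5_hex(tampered)
# Count how many of the 32 hex digits differ between the two digests.
differing = sum(a != b for a, b in zip(h1, h2))
```

Although the inputs differ by a single character, almost every digit of the digest changes, which is why a file's hash works as a tamper-evident fingerprint.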
No hashing function is ever fully secure against attack. What a
good hash does is increase the amount of time and resources an
attack takes, making it infeasible resource-wise. MD5 is no
longer considered safe, since it has been proven to have low
collision resistance. SHA-3 is currently considered the most
secure hashing algorithm [20].
The referenced page has a very detailed explanation of what is
going on under the hood, at reference point 3.1 [21]. The
illustration shows the size of the image, along with the hash.

Illustration 12: md5 hash of the .dd file


Section 4:2 stegdetect

This is an interesting bit of kit. The tool was tested on JPEGs
recovered from 8-jpeg-search.dd using scalpel. In the illustration
we can see that stegdetect has picked up a hidden jphide file, a
format used for hiding messages within pictures.

Illustration 13: showing hidden jphide file.

stegdetect is used to detect steganography, the art of hiding a
message or some other information in an image or text. We've all
done it at some stage: the secret message written in lemon juice
and magically revealed with a bit of heat from a candle. That is
the art of steganography.
In the screenshot above we can see that the tool pulled a jphide
file from the .dd image. This is a format used to turn a JPEG into
a steganographic device. There are two basic forms of
steganographic technique: technical, using invisible ink or
microdots, and linguistic, which is split into semagrams, which
use symbols and signs to convey a message, and open codes, where a
message is hidden in plain view and uncovered by ciphers or
specific jargon.
An example is changing every 500th bit in the bitmap of an image
to a different colour value, spelling out the message using the
colours as the letters. This is a simple enough technique, and
when it is combined with the use of online media storage and
viewing apps, a fragmented, undercover group could communicate
this way with a fairly low chance of detection [22].
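A toy version of this bit-twiddling technique can be sketched in Python, hiding one message bit in the least significant bit of every STEP-th byte of the cover data. STEP, the cover bytes and the secret are all invented for the example (the text's example uses every 500th bit):

```python
STEP = 8  # hide one message bit in every 8th cover byte

def embed(cover: bytes, message: bytes) -> bytearray:
    """Overwrite the least significant bit of every STEP-th byte with
    the next bit of the message (most significant bit first)."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    assert len(bits) * STEP <= len(cover), "cover too small for message"
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        pos = i * STEP
        stego[pos] = (stego[pos] & 0xFE) | bit  # change only the LSB
    return stego

def extract(stego: bytes, length: int) -> bytes:
    """Read the LSB of every STEP-th byte and reassemble `length` bytes."""
    bits = [stego[i * STEP] & 1 for i in range(length * 8)]
    return bytes(
        sum(bit << (7 - j) for j, bit in enumerate(bits[k * 8:k * 8 + 8]))
        for k in range(length)
    )

cover = bytes(range(256)) * 2  # stand-in for bitmap pixel data
secret = b"hi"
stego = embed(cover, secret)
```

Because each touched byte changes by at most one in value, the altered image is visually indistinguishable from the original, which is exactly what makes tools like stegdetect necessary.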
Conclusions
We have taken a small look at a selection of the forensic tools
available in the Backtrack/Kali distro. Starting with imaging
tools, a forensic examination would begin by trying to preserve
the original media state as much as possible. An initial
examination would try to make a copy of the media; in this case we
could assume the disk was damaged or corrupt and use dd_rescue.
This would give us a raw disk image that allows an investigator
to test the image while preserving the integrity of the original
media. By converting the image to the AFF format, two paths of
analysis can begin on the disk image.
Next we could use fsstat to get some basic information about the
disk image. This could help us to identify the file system type,
operating system version and volume type. affinfo will give more
detailed information about badflags, bad sectors and the pagesize
of the image, which may give an indication of where to begin. At
the early stages, gathering as much information as possible is
important, before more invasive techniques are used that may
affect the image's integrity.
We can then use bulk_extractor and scalpel to extract files. They
use different search parameters, allowing for a broader reference
base and so covering more possibilities. In scalpel we have the
option to specify one or many search entries, allowing us to be
flexible in situations of limited resources or fear of data loss,
and to search for common file formats. bulk_extractor searches
through files using named entity recognition to find string
references such as email or credit card number formats. It has the
added benefit of presenting its findings in a standardised text
format.
Other tools can then be employed to search for hidden files. In
our example we saw stegdetect find hidden files in a recognised
steganography format. md5deep is a hashing tool that would be used
to fingerprint a file if it was being transmitted or copied.
There are suites of tools, CAINE and EnCase for example, that
contain tools forming a logical, investigative structure for
different platforms. This allows the examiner to maintain
consistency in the format in which information is presented. It is
better to test an image using a suite of tools that was developed
with the format in mind.
Digital forensics is an emerging field in the digital world.
Access to information is becoming more important as the consumer
demands more. Demand for faster, better devices will drive plenty
of change in the years to come.
References

[1] “Kali Linux” Internet: http://www.kali.org/ [Oct. 15, 2013].
[2] “Ftp index of gnu.org” Internet: ftp://ftp.gnu.org/ [Oct. 15, 2013].
[3] “Digital Forensics Tool Testing Images” Internet: http://dftt.sourceforge.net/ [Oct. 15, 2013].
[4] “Ddrescue - Data recovery tool” Internet: http://www.gnu.org/software/ddrescue/ [Oct. 16, 2013].
[5] “Advanced Forensic Format: an Open, Extensible Format For Disk Imaging” Internet: http://cs.harvard.edu/malan/publications/aff.pdf [Oct. 15, 2013].
[6] “Digital Forensic Analysis Using BackTrack, Part 1” Internet: http://www.linuxforu.com/2011/03/digital-forensic-analysis-using-backtrack-part-1/ [Oct. 16, 2013].
[7] “Digital Corpora. For use in computer forensics education research” Internet: http://digitalcorpora.org/ [Oct. 16, 2013].
[8] “Using bulk_extractor for digital forensics triage and cross-drive analysis” Internet: http://simson.net/ref/2012/2012-08-08%20bulk_extractor%20Tutorial.pdf [Oct. 15, 2013].
[9] “Head the Ball” Internet: http://www.irongeek.com/i.php?page=man-pages/list [Oct. 15, 2013].
[10] “Scalpel: A Frugal, High Performance File Carver” Internet: http://dfrws.org/2005/proceedings/richard_scalpel.pdf [Oct. 15, 2013].
[11] “A full service computer security and digital forensics firm” Internet: http://www.digitalforensicssolutions.com/ [Oct. 15, 2013].
[12] “Ubuntu Manuals” Internet: http://manpages.ubuntu.com/manpages/recoverjpeg.1.html [Oct. 15, 2013].
[13] “A tool to recover lost files on damaged memory cards or USB drives” Internet: https://www.rfc1149.net/devel/recoverjpeg.html [Oct. 15, 2013].
[14] “ISO/IEC 10918-1 : 1993(E)” Internet: http://www.digicamsoft.com/itu/itu-t81-36.html [Oct. 15, 2013].
[15] “Recover lost JPEGs and MOV files on a bogus memory card or disk” Internet: https://github.com/samueltardieu/recoverjpeg [Oct. 16, 2013].
[16] “NTFS – New Technology File System” Internet: http://www.ntfs.com [Oct. 15, 2013].
[17] “Autopsy and The Sleuth Kit are open source digital investigation tools (a.k.a digital.....” Internet: http://www.sleuthkit.org [Oct. 16, 2013].
[18] “fsstat displays the details associated with a file system” Internet: http://man.he.net/man1/fsstat [Oct. 15, 2013].
[19] “Cryptographic Hash-Function Basics: Definitions, Implications, and Separations for Preimage Resistance, Second-Preimage Resistance, and Collision Resistance” Internet: http://www.cs.ucdavis.edu/~rogaway/papers/relates.pdf [Oct. 15, 2013].
[20] “HashDeep” Internet: https://github.com/jessek/hashdeep/blob/master/md5deep.1 [Oct. 17, 2013].
[21] “The MD5 Message-Digest Algorithm” Internet: http://www.ietf.org/rfc/rfc1321.txt [Oct. 17, 2013].
[22] “An Overview of Steganography for the Computer Forensics Examiner” Internet: http://www.garykessler.net/library/fsc_stego.html [Oct. 15, 2013].
[23] “AFF is an open and extensible file format to store disk images and associated metadata” Internet: https://github.com/simsong/AFFLIBv3 [Oct. 20, 2013].
[24] “The Advanced Forensics Format Library and Tools” Internet: http://simson.net/ref/2006/aff-ifip.pdf [Oct. 20, 2013].

Illustrations

Illustration Index
Illustration 1: Verbose dd_rescue command
Illustration 2: dd_rescue output
Illustration 3: affconvert run command
Illustration 4: affconvert showing results of command
Illustration 5: run command for bulk_extractor
Illustration 6: output from bulk_extractor
Illustration 7: uncommented search parameters
Illustration 8: using the .conf file
Illustration 9: output of what scalpel has found
Illustration 10: Output from recoverjpeg
Illustration 11: Code Sample
Illustration 12: md5 hash of the .dd file
Illustration 13: showing hidden jphide file
