
Computer Forensics: Identifying Disk Differences Broken Mirrors

http://blogs.sans.org/computer-forensics/2010/08/12/computer-fore...



Posted by Dave Hull on August 12, 2010 11:30 am Filed under Computer Forensics, Evidence Analysis, artifact analysis

One Friday afternoon I was greeted by a large package from FedEx. Inside the giant box was supposed to be a hard disk drive on which I was to conduct digital forensic analysis. Opening the box and removing a few handfuls of packing peanuts revealed a bubble-wrapped Dell tower. Obviously the clients, like most non-computer folks, didn't know they could remove the actual hard disk drive from the tower and send just that my way. After grabbing the paperwork for this job and filling out my own chain-of-custody documentation and evidence receipt, I cracked open the tower and saw the following inside:


Image 1: Double SATA, double fun

This job suddenly got more interesting and possibly less profitable for me. I wondered if the clients knew when they hired me for the job that the system contained two disks. I disconnected the lower drive from the system, added it to my evidence inventory as D1 and connected it to my write-blocker and then to the SIFT Workstation. After gathering the mmls output for the drive, I started the imaging process. The output from mmls looked something like this (actually, it looked nothing like this. I'm not going to expose details of the clients' system. For our purposes, this works):

    DOS Partition Table
    Offset Sector: 0
    Units are in 512-byte sectors

         Slot    Start        End          Length       Description
    00:  Meta    0000000000   0000000000   0000000001   Primary Table (#0)
    01:  -----   0000000000   0000000062   0000000063   Unallocated
    02:  00:00   0000000063   0000210923   0000210861   Dell Utilities FAT (0xde)
    03:  Meta    0000210924   0007952615   0007741692   Win95 Extended (0x0F)
    04:  Meta    0000210924   0000210924   0000000001   Extended Table (#1)
    05:  -----   0000210924   0000210986   0000000063   Unallocated
    06:  01:00   0000210987   0007952615   0007741629   NTFS (0x07)

Based on the interviews with my clients and the information I had about the case, I knew that the NTFS partition would be the primary focus of my investigation. I started the disk imaging process and went out for a walk-about. After dc3dd finished imaging and hashing the disk, I returned D1 to the Dell tower and removed the second hard disk drive, added it to the evidence inventory as D2 and repeated the process above.
Running mmls on D2 revealed the following information:

    DOS Partition Table
    Offset Sector: 0
    Units are in 512-byte sectors

         Slot    Start        End          Length       Description
    00:  Meta    0000000000   0000000000   0000000001   Primary Table (#0)
    01:  -----   0000000000   0000000062   0000000063   Unallocated
    02:  00:00   0000000063   0000210923   0000210861   Dell Utilities FAT (0xde)
    03:  Meta    0000210924   0007952615   0007741692   Win95 Extended (0x0F)
    04:  Meta    0000210924   0000210924   0000000001   Extended Table (#1)
    05:  -----   0000210924   0000210986   0000000063   Unallocated
    06:  01:00   0000210987   0007952615   0007741629   NTFS (0x07)
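Because mmls reports everything in 512-byte sectors, the byte offset and size of any slot follow from simple multiplication. The sketch below is illustrative only: the helper name slot_to_bytes is hypothetical, and the numbers are the sanitized NTFS slot values from the listings above, not the real case data.

```python
SECTOR_SIZE = 512  # mmls header: "Units are in 512-byte sectors"


def slot_to_bytes(start_sector, end_sector, sector_size=SECTOR_SIZE):
    """Return (byte_offset, byte_length) for an inclusive sector range."""
    length_sectors = end_sector - start_sector + 1
    return start_sector * sector_size, length_sectors * sector_size


# NTFS slot (06:) from the listing: Start 210987, End 7952615.
offset, length = slot_to_bytes(210987, 7952615)
print(offset, length)  # 108025344 3963714048
```

The resulting length, a little under 4 GB, is the size a carved copy of the NTFS partition should have. With a dd-style tool and bs=512, the same slot corresponds to skip=210987 and count=7741629.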


Astute readers will note that D2's partition table matches D1's exactly. I breathed a sigh of relief, as it appeared that D1 and D2 were a mirrored pair, likely running in a RAID 1 configuration. This meant that I could limit my analysis to a single drive and probably finish the job on the original timeline I'd promised, without losing money on the gig. Rather than image D2, I took an MD5 hash of it to verify that it matched D1. Sadly, it did not. If the drives were supposed
to be a mirrored pair, the synchronization was apparently broken.

At this point, I mounted D2's NTFS partition on the SIFT Workstation, viewed the contents of the root directory and compared them to the contents of the root directory of the NTFS image I'd collected already. Everything in the root directory was the same: file names, sizes, time stamps, sub-directory names and so on. My theory that this was a mirrored pair out of sync had some life in it.

I started imaging D2 and thought about ways of quickly isolating the differences between the two disks. My first thought was to run Jesse Kornblum's ssdeep against the disk images to see how similar they were. I tried this, but ssdeep complained, "Value too large for defined data type." Next I decided to carve out each partition, calculate MD5 hashes and compare them. One by one, the partitions were carved and their MD5 hashes compared. In the end, each partition's hash matched except for the NTFS partition.

Knowing that dc3dd has a hashwindow option, which can be used for piecewise hashing, I decided to use it to determine how similar the two NTFS partitions were. Here are the commands I used:

    dc3dd if=D1_ntfs.img hash=md5 hashwindow=10M hashlog=D1.hashlog of=/dev/null
    dc3dd if=D2_ntfs.img hash=md5 hashwindow=10M hashlog=D2.hashlog of=/dev/null

Note the output file is /dev/null. I didn't need to create new images; I'd already carved them out. All I wanted from this operation was to locate which portions of the two disk images were different. The result of running these two commands was a couple of text files containing MD5 hashes for 10 MB sections of the NTFS partitions. Here's a sample of the output:

    md5 0- 10485760: fdfd6a607ebef09871c3c51140e9eb40
    md5 10485760- 20971520: f1c9645dbc14efddc7d8a322685f26eb
    md5 20971520- 31457280: f1c9645dbc14efddc7d8a322685f26eb
    md5 31457280- 41943040: f1c9645dbc14efddc7d8a322685f26eb
    md5 41943040- 52428800: f1c9645dbc14efddc7d8a322685f26eb
    ...
    md5 482344960- 492830720: f1c9645dbc14efddc7d8a322685f26eb
    md5 492830720- 503316480: 00254e8b9cf9c6d3a1f6ba8040cf4782
    md5 503316480- 513802240: 348b1f2236220e4ab71e335385cb80fe
    md5 513802240- 524288000: f1c9645dbc14efddc7d8a322685f26eb
    ...
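For readers who want to see what piecewise hashing amounts to, here is a minimal Python sketch of the idea. It is not how dc3dd is implemented; the function names are hypothetical, and the output layout only approximates the hashlog sample above. The demo uses a tiny in-memory buffer instead of a disk image so that it runs anywhere.

```python
import hashlib
import io


def piecewise_md5(stream, window, chunk=1024 * 1024):
    """Yield (start, end, md5_hex) for each `window`-byte slice of a binary stream."""
    start = 0
    while True:
        digest = hashlib.md5()
        size = 0
        # Read the window in smaller chunks so large windows stay cheap on memory.
        while size < window:
            data = stream.read(min(chunk, window - size))
            if not data:
                break
            digest.update(data)
            size += len(data)
        if size == 0:
            return  # nothing left to hash
        yield start, start + size, digest.hexdigest()
        start += size
        if size < window:
            return  # short final window: end of input


def format_hashlog(stream, window):
    """Render the windows in a layout similar to the hashlog sample above."""
    return [f"md5 {s}- {e}: {d}" for s, e, d in piecewise_md5(stream, window)]


# Tiny stand-in for a partition image: 4 bytes of data followed by nulls.
demo = io.BytesIO(b"DATA" + b"\x00" * 20)
for line in format_hashlog(demo, window=8):
    print(line)
```

The second and third windows in the demo contain only nulls, so their digests are identical, which is the same pattern the long run of repeated hashes shows in the sample. For a real image, the stream would be the file opened in binary mode and the window would be 10 * 1024 * 1024.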

The first column of output shows the hashing algorithm that was used, then the starting and ending byte offsets, followed by the MD5 sum for the bytes in that range. Note that the first 10 MB section of the partition contained data and the next 460 MB contained no data, so the MD5 sums for each of those 10 MB sections were the same. Finally, between 470 and 480 MB, the partition contained something other than nulls, and the MD5 sums started to vary again.

I was curious to see how different the two hashlogs were, so I ran ssdeep against the first hashlog file and saved the result, then ran ssdeep against the second hashlog, comparing its result against the first:

    ssdeep D1.hashlog > D1.hashlog.ssdeep
    ssdeep -m D1.hashlog.ssdeep D2.hashlog
    /cases/20100808/D2.hashlog matches D1.hashlog.ssdeep:/cases/20100808/D1.hashlog (88)

According to ssdeep, the hashlogs from these two partitions score 88 out of 100 for similarity. I now had more evidence that these two partitions were (at one point in time) a mirrored pair. The next step was to locate the differences between the two partitions.
To do that, I used the diff command:

    diff D1.hashlog D2.hashlog
    1c1
    < md5 0- 10485760: fdfd6a607ebef09871c3c51140e9eb40
    ---
    > md5 0- 10485760: ef9f993a60a6a77114aab999091597ce
    48c48
    < md5 492830720- 503316480: 00254e8b9cf9c6d3a1f6ba8040cf4782
    ---
    > md5 492830720- 503316480: 547bd7c44930a1911cd6ce6f85b606df
    51c51
    < md5 524288000- 534773760: 705c8fc001d91cc32919d34d83127df6
    ---
    > md5 524288000- 534773760: 64772837bbb0502f98af41261bb3743e
    53c53
    < md5 545259520- 555745280: 272594145001e58f0b1dfba6e7a36ce1
    ---
    > md5 545259520- 555745280: 3497c9365449e8339c550b161ea98535
    55,56c55,56
    < md5 566231040- 576716800: b20266a7591cac2f2cfa9f8375a71761
    < md5 576716800- 587202560: f9b18be13c774fa009717101ec495afc
    ---
    > md5 566231040- 576716800: 58638effcded45e272d555def45351f8
    > md5 576716800- 587202560: c366890edd98ed67c381adc7c294dfb5
    58,61c58,61
    < md5 597688320- 608174080: 9a30b16c50fdd1e6b46c621cabde0ecd
    < md5 608174080- 618659840: 59f55c57bc15467e1734d8eab837b02c
    < md5 618659840- 629145600: cdcb01a465f188cff6e08b5189413f2e
    < md5 629145600- 639631360: 7981b144a85149fcac7fce2161d44278
    ---
    > md5 597688320- 608174080: dedffa8e94b137914ae70ee64d02ec5b
    > md5 608174080- 618659840: 9261d04130b4802a9ee6cfe50e5b3f2a
    > md5 618659840- 629145600: 46f66e9d9711815a521a3429179f3e42
    > md5 629145600- 639631360: 5696f83cead6b1fc80bf5b7819535f99
    63c63
    < md5 650117120- 660602880: 43aadd02600598fab034d091684c9dff
    ---
    > md5 650117120- 660602880: 9d6db2c17acc1c321493bd054510c1d1
    190c190
    < md5 1981808640- 1992294400: c8a0dc3bcbedc485c3ebfd06087a34d8
    ---
    > md5 1981808640- 1992294400: b6a289e4342258a016223eb4400f1c8c
    380c380
    < md5 TOTAL: a347712cc414e2f7ea23baedd929d620
    ---
    > md5 TOTAL: 2ba46718f54169305073b0bc469bc1e9
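The same comparison can be done programmatically, which also makes it easy to total up how many bytes are in question. This is a rough sketch, not a replacement for diff: the parser assumes the "md5 start- end: digest" layout seen in the hashlog output, and the function names are hypothetical. The demo lines are copied from the hashlog and diff output above.

```python
def parse_hashlog(lines):
    """Map (start, end) byte ranges to digests, skipping lines such as 'md5 TOTAL: ...'."""
    ranges = {}
    for line in lines:
        parts = line.split()
        # Assumed shape, taken from the output above: md5 <start>- <end>: <digest>
        if len(parts) == 4 and parts[1].endswith("-") and parts[2].endswith(":"):
            start = int(parts[1].rstrip("-"))
            end = int(parts[2].rstrip(":"))
            ranges[(start, end)] = parts[3]
    return ranges


def differing_ranges(log_a, log_b):
    """Return the sorted byte ranges whose digests differ between two hashlogs."""
    a, b = parse_hashlog(log_a), parse_hashlog(log_b)
    return sorted(r for r in a.keys() | b.keys() if a.get(r) != b.get(r))


d1 = [
    "md5 0- 10485760: fdfd6a607ebef09871c3c51140e9eb40",
    "md5 10485760- 20971520: f1c9645dbc14efddc7d8a322685f26eb",
    "md5 TOTAL: a347712cc414e2f7ea23baedd929d620",
]
d2 = [
    "md5 0- 10485760: ef9f993a60a6a77114aab999091597ce",
    "md5 10485760- 20971520: f1c9645dbc14efddc7d8a322685f26eb",
    "md5 TOTAL: 2ba46718f54169305073b0bc469bc1e9",
]
diffs = differing_ranges(d1, d2)
print(diffs)                                    # [(0, 10485760)]
print(sum(end - start for start, end in diffs))  # 10485760
```

Run against the full hashlogs, a function like this would list the twelve 10 MB windows that diff flags, and the byte total gives the amount of data left to examine.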


If you're anything like me, that may look like mind-numbing output. Let's review a few lines of it. First is the diff command itself, simple enough: compare D1.hashlog to D2.hashlog. The next line, 1c1, refers to line number one in each file; the c means that line one of the first file has changed in the second file. The next several lines of diff output follow this same format, then we see:

    55,56c55,56
    < md5 566231040- 576716800: b20266a7591cac2f2cfa9f8375a71761
    < md5 576716800- 587202560: f9b18be13c774fa009717101ec495afc
    ---
    > md5 566231040- 576716800: 58638effcded45e272d555def45351f8
    > md5 576716800- 587202560: c366890edd98ed67c381adc7c294dfb5

This means that lines 55 through 56 of file one have changed in file two. Based on this explanation, you can make sense of the rest of the output. Given diff's output, we've narrowed down the differences between the two ~4 GB files to 120 MB. We can now narrow in on the differences even more by repeating the process above using a smaller hashwindow and restricting dc3dd to those sections of the NTFS images where we know the differences reside.

The output in the hashlog files contains byte offsets. When we run dc3dd, we'll be working with blocks. We can specify our block size, divide the byte offsets given in diff's output by it, and drill down into each 10 MB section to locate the differences more precisely. Here's an example:

    dc3dd bs=512 if=D1_ntfs.img of=/dev/null count=20 hashwindow=1M hash=md5 hashlog=diff_1_D1.dd.hashlog
    warning: sector size not probed, assuming 512
    dc3dd 6.12.3 started at 2010-08-08 21:47:49 -0400
    command line: dc3dd bs=512 if=D1_ntfs.img of=/dev/null count=20 hashwindow=1M hash=md5 hashlog=diff_1_D1
    compiled options: DEFAULT_BLOCKSIZE=32768
    sector size: 512 (assumed)
    md5 0- 1048576: d5c912a902d74371aa06aafefe21674a
    md5 1048576- 2097152: b6d81b360a5672d80c27430f39153e2c
    ...

    dc3dd bs=512 if=D2_ntfs.img of=/dev/null count=20 hashwindow=1M hash=md5 hashlog=diff_1_D2.dd.hashlog
    warning: sector size not probed, assuming 512
    dc3dd 6.12.3 started at 2010-08-08 21:50:17 -0400
    command line: dc3dd bs=512 if=D2_ntfs.img of=/dev/null count=20 hashwindow=1M hash=md5 hashlog=diff_1_D2
    compiled options: DEFAULT_BLOCKSIZE=32768
    sector size: 512 (assumed)
    md5 0- 1048576: 042a76d72aaf721c2d49246a40d974df
    md5 1048576- 2097152: b6d81b360a5672d80c27430f39153e2c
    ...

(As printed, bs=512 with count=20 reads only the first 10,240 bytes; a full 10 MB window at that block size takes count=20480.)

I've abbreviated the output, but you can see that where previously we knew that there was a difference in the first 10 MB, now we know that difference is actually in the first MB. Now we're getting somewhere.
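The conversion from hashlog byte offsets to block arguments is plain integer division. Here is a small sketch of that arithmetic; the helper name dd_window_args is hypothetical, and it assumes the skip and count semantics of dd-style tools, where both are expressed in blocks of bs bytes.

```python
def dd_window_args(start_byte, end_byte, block_size=512):
    """Return (skip, count) in blocks for a byte range aligned to block_size."""
    if start_byte % block_size or end_byte % block_size:
        raise ValueError("range is not aligned to the block size")
    return start_byte // block_size, (end_byte - start_byte) // block_size


# First differing 10 MB window from the diff output: bytes 0 to 10485760.
print(dd_window_args(0, 10485760))           # (0, 20480)
# Second differing window: bytes 492830720 to 503316480.
print(dd_window_args(492830720, 503316480))  # (962560, 20480)
```

Each differing window from the diff output maps to one skip and count pair this way, so the finer-grained hashing pass only has to touch the 10 MB regions already known to differ.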
Let's zoom in on the first MB from each file and see how we can pinpoint the difference.

    dc3dd bs=512 if=D1_ntfs.img of=/dev/null hash=md5 hashwindow=512 count=20 hashlog=D1_1MB.hashlog
    dc3dd bs=512 if=D2_ntfs.img of=/dev/null hash=md5 hashwindow=512 count=20 hashlog=D2_1MB.hashlog

These dc3dd commands collect the MD5 hashes for every 512 bytes of data at the start of each NTFS partition. (At bs=512, the full first MB is count=2048; count=20 as printed covers the first 20 sectors, which include the one that differs.) Running diff on the two resulting hashlog files, we get the following result:

    diff D1_1MB.hashlog D2_1MB.hashlog
    17c17
    < md5 8192- 8704: 590693b0719f5a66787565fa3d795e05
    ---
    > md5 8192- 8704: dc1196943e31869bbcf12fe86f7d896c
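The narrowing procedure used so far (10 MB windows, then 1 MB, then 512 bytes) can be expressed as a short recursive loop. The sketch below is one possible way to automate it, not the method used in this case; the function names are hypothetical, and the demo uses small in-memory buffers with smaller windows so that it runs quickly.

```python
import hashlib
import io


def window_digest(stream, start, size):
    """MD5 of `size` bytes starting at byte offset `start` of a seekable stream."""
    stream.seek(start)
    return hashlib.md5(stream.read(size)).digest()


def narrow(a, b, start, end, windows=(10 * 2**20, 2**20, 512)):
    """Return the smallest-window byte ranges in [start, end) where a and b differ."""
    size, rest = windows[0], windows[1:]
    hits = []
    for offset in range(start, end, size):
        stop = min(offset + size, end)
        length = stop - offset
        if window_digest(a, offset, length) != window_digest(b, offset, length):
            if rest:
                # Re-examine only this window, using the next smaller window size.
                hits.extend(narrow(a, b, offset, stop, rest))
            else:
                hits.append((offset, stop))
    return hits


# Two 16 KB stand-in "images" that differ by one byte inside sector 16.
original = bytes(32 * 512)
changed = bytearray(original)
changed[8200] = 0xFF
result = narrow(io.BytesIO(original), io.BytesIO(bytes(changed)),
                0, len(original), windows=(4096, 512))
print(result)  # [(8192, 8704)]
```

Hashing is used here only to mirror the workflow in this post. When both images are at hand, comparing the bytes of each window directly would work equally well.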

Now we know the difference in the first MB of the NTFS partitions is somewhere between byte offset 8192 and 8704. At this point, we can easily carve out this section of each file and compare them.

    dc3dd if=D1_ntfs.img of=D1_8192-8704.img bs=512 skip=16 count=1 hash=md5
    warning: sector size not probed, assuming 512
    dc3dd 6.12.3 started at 2010-08-08 22:18:22 -0400
    command line: dc3dd if=D1_ntfs.img of=D1_8192-8704.img bs=512 skip=16 count=1 hash=md5
    compiled options: DEFAULT_BLOCKSIZE=32768
    sector size: 512 (assumed)
    md5 TOTAL: 590693b0719f5a66787565fa3d795e05
    1+0 sectors in
    1+0 sectors out
    512 bytes (512) copied (100%), 0.00191684 s, 261 K/s
    dc3dd completed at 2010-08-08 22:18:22 -0400

    dc3dd if=D2_ntfs.img of=D2_8192-8704.img bs=512 skip=16 count=1 hash=md5
    warning: sector size not probed, assuming 512
    dc3dd 6.12.3 started at 2010-08-08 22:18:42 -0400
    command line: dc3dd if=D2_ntfs.img of=D2_8192-8704.img bs=512 skip=16 count=1 hash=md5
    compiled options: DEFAULT_BLOCKSIZE=32768
    sector size: 512 (assumed)
    md5 TOTAL: dc1196943e31869bbcf12fe86f7d896c
    1+0 sectors in
    1+0 sectors out
    512 bytes (512) copied (100%), 0.00155415 s, 322 K/s
    dc3dd completed at 2010-08-08 22:18:42 -0400

Let's review the dc3dd commands above. We specify our block size (bs) as 512 bytes, and we want to skip to byte offset 8192 and collect a single block. Skip and count both take blocks as arguments, so we divide 8192 by 512 and get 16 for our skip value. Note also that the MD5 sums for the 512-byte sections we've carved out match the previous MD5 sums for that range. Now, to pinpoint the differences in these 512-byte sections, I run them through xxd, dumping them as hexadecimal files, and then diff those two files:


    xxd -g1 -u D1_8192-8704.img > D1_8192-8704.img.xxd
    xxd -g1 -u D2_8192-8704.img > D2_8192-8704.img.xxd
    diff D1_8192-8704.img.xxd D2_8192-8704.img.xxd
    1c1
    < 0000000: FF FF 00 07 00 00 00 00 7E 37 01 03 00 00 00 00  ........~7......
    ---
    > 0000000: FF FF 00 07 00 00 00 00 FF 3F 00 00 00 00 00 00  .........?......
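The final byte-level comparison can also be sketched in a few lines of Python. This is only an illustration of the xxd and diff step, with a hypothetical helper name; the two byte strings are the first 16 bytes of each carved sector, copied from the xxd output above.

```python
def byte_differences(a, b):
    """Return (offset, byte_a, byte_b) for each position where two equal-length buffers differ."""
    if len(a) != len(b):
        raise ValueError("buffers must be the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# First 16 bytes of each carved sector, taken from the xxd output.
d1 = bytes.fromhex("FF FF 00 07 00 00 00 00 7E 37 01 03 00 00 00 00")
d2 = bytes.fromhex("FF FF 00 07 00 00 00 00 FF 3F 00 00 00 00 00 00")

for offset, x, y in byte_differences(d1, d2):
    print(f"offset {offset}: {x:02X} -> {y:02X}")
```

Four bytes differ, at offsets 8 through 11 of the carved sector. Since the sector was carved starting at byte 8192, those correspond to byte offsets 8200 through 8203 of the NTFS partition image.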

That's it. Repeat this process and determine whether the differences between the two disks are relevant to the case. And of course, in true SANS fashion, now that you've seen the difficult way to do something like this, you should know that the SIFT Workstation includes vbindiff, which can solve this problem for you in a much more elegant way, though you may still run into instances where the files you are working with are too large for vbindiff. Here's a screen capture of vbindiff showing the above difference:

Image 2: vbindiff D1_ntfs.img D2_ntfs.img

Dave Hull is an incident responder and forensic investigator for a Fortune 10000 CIRT where he enjoys being the dumbest person in the room. Over the years he has worn many hats and still dons them from time to time. When he's not flipping bits, he enjoys teaching for the SANS Institute.

2 Comments
Andrew Hay
Posted August 12, 2010 at 1:00 pm | Permalink

Wow, great post. We need more walkthroughs like this ;)

Ken Pryor
Posted August 12, 2010 at 4:02 pm | Permalink

I agree with Andrew. What an excellent post! Thanks Dave! KP
