
The Windows File System

In this paper I will take a look at the current Windows file system,
NTFS, and explain the following in detail: the boot sector, the MFT,
files and their attributes, folders and the B+ tree data structure, and
possible attacks on NTFS.
The Boot Sector
The Windows file system begins with the boot sector. It is created
whenever you format an NTFS volume and is located in the first sector of
your Windows partition. The boot sector holds information about the
drive, which is recorded in the BIOS parameter block (often referred to
as the BPB). The BPB details information about the volume such as its
size and physical parameters. The boot sector also records the location
of the Master File Table and its backup ($MFT and $MFTMirr). The MFT
backup ($MFTMirr) acts as a fault tolerance mechanism; it holds a mirror
copy of the first four records or the first cluster of the Master File
Table. If any of those records in the MFT are corrupt, NTFS will refer
to the boot sector for the location of the mirror and use the mirror
copy not only to get the correct information but also to repair the MFT.
The boot sector is also responsible for passing control from the Master
Boot Record to the NT loader program. The boot process goes something
like this: BIOS >> MBR >> Boot Sector >> the NT Loader (NTLDR) >>
hardware detection >> Core OS loads (Ntoskrnl.exe) >> Services Start >>
Logon.
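The BPB fields described above can be read with a few lines of Python.
This is a minimal sketch, not a complete parser. The field offsets (0x0B
for bytes per sector, 0x0D for sectors per cluster, 0x28 for total
sectors, 0x30 and 0x38 for the $MFT and $MFTMirr cluster numbers) are
not given in this paper; they are the commonly documented NTFS layout
and are assumed here. The function name parse_ntfs_boot_sector is made
up for this example, and the sectors-per-cluster byte is treated as a
plain count.

```python
import struct


def parse_ntfs_boot_sector(sector: bytes) -> dict:
    """Decode a few BPB fields from the first sector of an NTFS volume."""
    if len(sector) < 512:
        raise ValueError("need at least one 512-byte sector")
    # Bytes 3..10 hold the OEM ID, which is "NTFS" padded with four spaces.
    if sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")

    # Assumed offsets: 0x0B = bytes per sector (u16), 0x0D = sectors per cluster (u8).
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    # Assumed offsets: 0x28 = total sectors, 0x30 = $MFT LCN, 0x38 = $MFTMirr LCN (u64 each).
    total_sectors, mft_lcn, mftmirr_lcn = struct.unpack_from("<QQQ", sector, 0x28)

    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "cluster_size": cluster_size,
        "total_sectors": total_sectors,
        "mft_lcn": mft_lcn,
        "mftmirr_lcn": mftmirr_lcn,
        # Byte offsets of the two tables, measured from the start of the volume.
        "mft_offset": mft_lcn * cluster_size,
        "mftmirr_offset": mftmirr_lcn * cluster_size,
    }
```

To try it on a real volume you would pass in the first 512 bytes of the
partition (for example from a disk image); reading a live volume
normally requires administrator rights.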
The Master File Table
The MFT is the core component of the NTFS file system. Through the
MFT the NTFS file system becomes a highly organized array of records
containing information describing the content of your file system. Every
instance of data on your hard disk is described within these records,
from the boot sector to your plain text file.
The first sixteen records of the MFT are dedicated to metadata files. The
metadata files define the structure of the MFT and essentially make it a
self-describing database. The use of metadata files in the MFT should
not be surprising; every database uses some form of metadata to define
its data structure. The metadata files that are stored within the first
sixteen records of the MFT are as follows:
The MFT
Rec. | File Name | Description
0 | $MFT | The Master File Table
1 | $MFTMirr | The Master File Table mirror
2 | $LogFile | A log file containing a list of transaction steps for NTFS recoverability
3 | $Volume | Information about the volume
4 | $AttrDef | Defines attributes
5 | . | The root folder
6 | $Bitmap | Cluster bitmap representing the volume
7 | $Boot | Boot sector (discussed above)
8 | $BadClus | Contains bad clusters for a volume
9 | $Secure | Contains security descriptors for all files within the volume
10 | $UpCase | Converts lowercase characters to Unicode uppercase characters
11 | $Extend | Used for various optional extensions (unique file ID, quota information, reparse point information, etc.)
12 - 15 | | Reserved for future use
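The table above maps directly onto a small lookup, shown here as a
sketch. The names METADATA_FILES and describe_record are made up for
this example, and the file names use the on-disk spellings $MFTMirr and
$UpCase. Records past 15 are only reported as lying outside the
metadata range, since where normal file records start is not settled.

```python
# Record number -> metadata file name, following the table above.
METADATA_FILES = {
    0: "$MFT",
    1: "$MFTMirr",
    2: "$LogFile",
    3: "$Volume",
    4: "$AttrDef",
    5: ".",
    6: "$Bitmap",
    7: "$Boot",
    8: "$BadClus",
    9: "$Secure",
    10: "$UpCase",
    11: "$Extend",
}


def describe_record(record_number: int) -> str:
    """Return a short label for an MFT record number."""
    if record_number < 0:
        raise ValueError("record numbers start at 0")
    if record_number in METADATA_FILES:
        return METADATA_FILES[record_number]
    if record_number <= 15:
        return "reserved"
    return "outside the metadata range"
```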
The location of these files is not fixed (save for the boot sector, which
must be located in the first sector of the partition). NTFS is a flexible
file system: in Windows XP, Microsoft moved the $LogFile and $Bitmap
metadata files to improve overall performance. In fact, nearly all of
the system files described above can be moved if needed to avoid bad
clusters.
NTFS stores every file or folder on your system as a record within the
MFT, starting at either record seventeen or record twenty-four. I give
two starting points here because there are two different views on the
subject. The Linux-NTFS project says that the MFT doesn't use records
seventeen through twenty-three, while ntfs.com says that file records
begin at record seventeen. I have not seen Microsoft give a specific
starting point for normal file records.
Files and their Attributes
In the MFT, normal records are made up of numerous fields called file
attributes. A file attribute describes some aspect of the file that is
contained within the MFT record. In more detail, the attributes are as
follows:
File Attributes
Standard Information:
Old school file attributes: read only, timestamp, link count etc.
Attribute List:
Almost like another metadata file. It gives the locations of all
attribute records that don't fit in the file's MFT record.
File name:
The name of the file. The long name can be up to 255 Unicode
characters, while the short name follows the old-school 8.3 format.
Additional names (required to meet the POSIX standard), or hard links,
are also stored here as file name attributes.
Data:
This attribute contains the actual data (if it is a small file) or
points to the extents on the disk that contain the data. It is
possible to have multiple data attributes per file.
Object ID:
A volume-unique identifier. Used by the distributed link tracking service.
Logged Tool Stream:
Similar to a data stream, but operations are logged to the NTFS log files.
This is used by EFS.
Reparse Point:
Used for symbolic links (yes, NTFS does have this capability), junction
points, volume mount points, and Remote Storage Server.
Index Root:
Used to implement folders and other indexes (to be explained below).
Index Allocation:
Used to implement the B-tree structure for large folders or other large
indexes (to be explained below).
Bitmap:
Used to implement the B-tree structure for large folders and other large
indexes.
Volume Information:
Used only in the $Volume system file. Contains the volume version.
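On disk, each attribute in a record is tagged with a numeric type code.
The sketch below lists codes for the attributes named above. The codes
themselves do not appear in this paper; they are the values commonly
documented for NTFS (for instance by the Linux-NTFS project) and should
be read as an assumption here. AttrType and attr_name are names made up
for this example.

```python
from enum import IntEnum


class AttrType(IntEnum):
    """Assumed NTFS attribute type codes for the attributes listed above."""
    STANDARD_INFORMATION = 0x10
    ATTRIBUTE_LIST = 0x20
    FILE_NAME = 0x30
    OBJECT_ID = 0x40
    VOLUME_INFORMATION = 0x70
    DATA = 0x80
    INDEX_ROOT = 0x90
    INDEX_ALLOCATION = 0xA0
    BITMAP = 0xB0
    REPARSE_POINT = 0xC0
    LOGGED_UTILITY_STREAM = 0x100  # the "Logged Tool Stream" above


def attr_name(code: int) -> str:
    """Return the attribute name for a type code, or a marker if unknown."""
    try:
        return AttrType(code).name
    except ValueError:
        return "UNKNOWN(0x%X)" % code
```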
As mentioned, with small files (usually no more than 1 KB), the data
resides in the MFT record as a resident attribute. In most cases the
file is too large to fit in the MFT record. In these instances, the data
is stored on the disk as a non-resident attribute, and the data
attribute in the MFT record contains the VCN-to-LCN mapping information
that points to the extents where the data resides (an extent, or data
run, is where the data is actually held on your hard disk). Using this
map, the MFT points to the physical location of each extent by recording
its starting Logical Cluster Number (the LCN is simply a numbered
ordering of all clusters on the volume) and the length of the extent.
Each extent must consist of a contiguous set of clusters on the disk.
NTFS orders the extents of each file logically (even though they may not
be physically contiguous) by assigning Virtual Cluster Numbers (VCNs),
which number the clusters of the file consecutively starting from 0.
For example, I have file A that is too large to fit in the MFT. NTFS writes
the data attribute of file A onto the hard disk starting at LCN 127. The
file takes up 5 clusters, but cluster number 130 is bad or occupied. The
file on disk would look like: | data | data | data | another file | data
| data |. A VCN-to-LCN description for this file would map VCNs 0, 1, 2,
3, 4 to LCNs 127, 128, 129, 131, 132. The MFT would point to LCN 127 as
the start of the first run, identify it as VCN 0 and record the length
of the run (3 clusters). It would then point to LCN 131 as the start of
the second run, identify it as VCN 3 and record the length of the run
(2 clusters).
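The file A example can be sketched as a list of runs plus a lookup
function. VCNs number a file's clusters consecutively, so the second run
begins at VCN 3. The names Run, FILE_A_RUNS and vcn_to_lcn are made up
for this example; real NTFS stores runs in a compact variable-length
encoding that is not shown here.

```python
from typing import List, NamedTuple, Optional


class Run(NamedTuple):
    vcn: int     # first virtual cluster number covered by the run
    lcn: int     # logical cluster number where the run starts on disk
    length: int  # number of contiguous clusters in the run


# File A: 3 clusters starting at LCN 127, then 2 clusters starting at LCN 131.
FILE_A_RUNS = [Run(vcn=0, lcn=127, length=3), Run(vcn=3, lcn=131, length=2)]


def vcn_to_lcn(runs: List[Run], vcn: int) -> Optional[int]:
    """Translate a file-relative cluster number to a volume cluster number."""
    for run in runs:
        if run.vcn <= vcn < run.vcn + run.length:
            return run.lcn + (vcn - run.vcn)
    return None  # the VCN lies past the end of the file


if __name__ == "__main__":
    for v in range(5):
        print(v, "->", vcn_to_lcn(FILE_A_RUNS, v))
```

Keeping the starting VCN in each run means a lookup never has to add up
the lengths of earlier runs.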
Folders and the B+ Tree Data Structure
Directories under NTFS are indexes that contain the filename attribute,
file reference, timestamp and file size for the files organized by that
index. Indexing and sorting the files speeds directory access, since
there is no need for NTFS to sort the entries every time you list the
contents of the directory. The duplicate attributes in the index also
save time, as NTFS doesn't need to look up that information in the MFT
every time the directory is accessed. Also, because the index contains
the file reference (a 64-bit number identifying each file), there is no
need to search through the MFT for the file.
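The 64-bit file reference can be taken apart as sketched below. The
split used here, with the low 48 bits holding the MFT record number and
the high 16 bits holding a sequence number, is not stated in this paper;
it is the commonly documented layout and is assumed. The function name
split_file_reference is made up for this example.

```python
from typing import Tuple


def split_file_reference(ref: int) -> Tuple[int, int]:
    """Split a 64-bit file reference into (record number, sequence number)."""
    if not 0 <= ref < 2 ** 64:
        raise ValueError("file reference must fit in 64 bits")
    record_number = ref & 0xFFFFFFFFFFFF  # low 48 bits: index into the MFT
    sequence_number = ref >> 48           # high 16 bits: reuse counter
    return record_number, sequence_number
```

Because the record number is an index into the MFT, a directory entry
leads straight to the file's record without a search.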
When a directory grows too large to fit into the limited space of its
MFT record, it expands from its entry onto the file system. NTFS creates
child indexes on the disk, referenced by the parent index in the MFT. To
expand the directory structure onto the disk NTFS implements a B+ tree
data structure, expanding 'out' rather than 'deep', which keeps a lookup
to a handful of node reads and allows for fast retrieval times.
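To show why a wide, shallow tree is quick to search, here is a generic
B+ tree lookup in Python. It is a sketch of the data structure only, not
of the NTFS on-disk index format. The classes Leaf and Internal, the
function lookup, and the file names and reference numbers in the sample
tree are all made up for this example. Names are upper-cased before
comparison, on the assumption that directory lookups ignore case.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Leaf:
    names: List[str]  # sorted, upper-cased file names
    refs: List[int]   # file reference for each name


@dataclass
class Internal:
    separators: List[str]                     # sorted keys
    children: List[Union["Internal", Leaf]]   # len(separators) + 1 children
    # Child i holds names n with separators[i-1] <= n < separators[i].


def lookup(node: Union[Internal, Leaf], name: str) -> Optional[int]:
    """Return the file reference stored for name, or None if absent."""
    key = name.upper()
    # Descend one level per step; each step is a binary search in one node.
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.separators, key)]
    i = bisect_left(node.names, key)
    if i < len(node.names) and node.names[i] == key:
        return node.refs[i]
    return None


# A two-level sample directory: one root node and two leaves.
ROOT = Internal(
    separators=["PAGEFILE.SYS"],
    children=[
        Leaf(names=["BOOT.INI", "NTLDR"], refs=[30, 31]),
        Leaf(names=["PAGEFILE.SYS", "WINDOWS"], refs=[32, 33]),
    ],
)
```

Calling lookup(ROOT, "ntldr") visits the root and one leaf. Adding more
entries per node widens the tree without adding levels, which is the
'out' rather than 'deep' growth described above.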
Possible Attacks on NTFS
Any unauthorized modification of file attributes is an attack on the
integrity of the Windows file system. This could include the modification
of the security descriptors or the timestamp for a certain file. Another
exploit within the Windows file system would be the abuse of alternate
data streams as a quick way to hide data. The virus Win2k.Stream is an
example of this kind of abuse; so is my hide program. Security
descriptors could also be completely bypassed by using another NTFS
driver to read the file system. The oft-referenced ntpasswd utility uses
this method to circumvent permissions when accessing the SAM file on
an NTFS drive.
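As an illustration of the alternate data streams mentioned above, a
stream is addressed by appending a colon and the stream name to the file
path. The sketch below builds such a path and writes to it. It only has
the intended effect on Windows with an NTFS volume; elsewhere the colon
is just part of an ordinary file name. The function names stream_path
and write_stream are made up for this example, and this is not how my
hide program or Win2k.Stream is implemented.

```python
def stream_path(file_path: str, stream_name: str) -> str:
    """Build the path that addresses a named stream of a file."""
    if not stream_name or ":" in stream_name:
        raise ValueError("stream name must be non-empty and contain no colon")
    return f"{file_path}:{stream_name}"


def write_stream(file_path: str, stream_name: str, payload: bytes) -> None:
    """Write payload into a named stream (Windows with NTFS only)."""
    with open(stream_path(file_path, stream_name), "wb") as handle:
        handle.write(payload)
```

Data written this way does not change the size that a normal directory
listing reports for the file, which is why streams make a quick hiding
place.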
Is there a need to attack NTFS or the MFT itself? Programs rarely
touch the file system directly. Any requests that you issue will be passed
into the kernel and then to the NT I/O manager. The I/O manager then calls
the NTFS file system driver, which in turn accesses the file system.
Because of this approach, an attack on the on-disk file system becomes
unnecessary. The cleaner method of attack, and one that you'll see in
rootkits, is to intercept the I/O request before it reaches the file
system by either hooking into the dispatch functions of the driver or
setting up a file system filter driver.
tools & links of interest:
ntpasswd | ntfs tools | ntfs progs | sysinternals
