
File System Implementation

OPERATING SYSTEM CONCEPTS: Tenth Edition


Chapter 13 & 14
Dr. Suleiman Abu Kharmeh


Layered File System
• Logical file system
– Maintains file structure via the FCB (File Control Block)
• File organization module
– Translates logical blocks to physical blocks
• Basic file system
– Converts a physical block to disk parameters (drive 1, cylinder 73, track 2, sector 10, etc.)
• I/O control
– Transfers data between memory and disk
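The "basic file system" step above can be sketched as a small conversion routine. This is only an illustration under assumed conditions: one block equals one sector, and the geometry (16 tracks per cylinder, 63 sectors per track) and the name `block_to_chs` are hypothetical, not taken from the slides.

```python
def block_to_chs(block, tracks_per_cylinder=16, sectors_per_track=63):
    """Map a linear physical block number to (cylinder, track, sector).

    Assumes one block == one sector and an illustrative disk geometry.
    """
    blocks_per_cylinder = tracks_per_cylinder * sectors_per_track
    cylinder, within_cylinder = divmod(block, blocks_per_cylinder)
    track, sector_index = divmod(within_cylinder, sectors_per_track)
    # Sectors are conventionally numbered starting from 1.
    return cylinder, track, sector_index + 1


print(block_to_chs(0))     # first block of the disk
print(block_to_chs(1008))  # first block of the second cylinder
```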


File system Units
• Sector – the smallest unit that can be accessed on a disk (typically 512 bytes)

• Block (or cluster) – the smallest unit that can be allocated to construct a file

• What is the actual size of a 1-byte file on disk?
– It takes at least one cluster,
– which may consist of 1 to 8 sectors,
– so a 1-byte file may require up to ~4KB of disk space.

• All file systems have default allocation
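
The rounding described above can be checked with a short calculation. A minimal sketch, assuming 512-byte sectors and a configurable number of sectors per cluster; the function name `allocated_bytes` is made up for illustration.

```python
def allocated_bytes(file_size, sectors_per_cluster=8, sector_size=512):
    """Disk space consumed by a file when space is handed out in whole clusters."""
    cluster_size = sectors_per_cluster * sector_size
    # Ceiling division; even a tiny file occupies at least one cluster.
    clusters = max(1, -(-file_size // cluster_size))
    return clusters * cluster_size


print(allocated_bytes(1))                         # 1-byte file, 8 sectors per cluster
print(allocated_bytes(1, sectors_per_cluster=1))  # 1-byte file, 1 sector per cluster
```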


Sector and Cluster File layout
• Block size = 1 cluster = 2KB



FCB – File Control Block
• Contains file attributes + block locations
– Permissions
– Dates (create, access, write)
– Owner, group, ACL (Access Control List)
– File size
– Location of file contents
• UNIX File System → I-node
• FAT/FAT32 → part of FAT (File Allocation Table)
• NTFS → part of MFT (Master File Table)
File protection in Unix
• Two different ways of thinking about it:
– access control lists (ACLs)
• for each object, keep a list of subjects and their allowed actions
– capabilities
• for each subject, keep a list of objects and the subject's allowed actions
• Both can be represented with the following matrix:

                      objects
            /etc/passwd   /home/gribble   /home/guest
subjects
  root      rw            rw              rw
  gribble   r             rw              r
  guest     r

(each row is a capability list; each column is an ACL)
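
The two views of the matrix can be sketched in a few lines. This is an illustrative model only: the dictionary layout and the helper names `capability_list` and `acl` are assumptions, and only the root and gribble rows of the matrix are used.

```python
# Access matrix: subject -> {object: allowed actions}
MATRIX = {
    "root":    {"/etc/passwd": "rw", "/home/gribble": "rw", "/home/guest": "rw"},
    "gribble": {"/etc/passwd": "r",  "/home/gribble": "rw", "/home/guest": "r"},
}


def capability_list(subject):
    """Row view: everything one subject may do."""
    return dict(MATRIX.get(subject, {}))


def acl(obj):
    """Column view: every subject allowed to touch one object."""
    return {subject: rights[obj] for subject, rights in MATRIX.items() if obj in rights}


print(capability_list("gribble"))
print(acl("/etc/passwd"))
```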
Access Lists and Groups in Unix
• Mode of access: read, write, execute
• Three classes of users on Unix / Linux
                        RWX
a) owner access    7 →  111
b) group access    6 →  110
c) public access   1 →  001

• Ask the manager to create a group (unique name), say G, and add some users to the group.
• For a file (say game) or subdirectory, define an appropriate access:

chmod 761 game
(command, permissions, file/directory)

• Attach the group association of a file/directory:

chgrp G game
(command, group, file/directory)
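
To see how the three octal digits map to the RWX bits, here is a small decoding sketch. The helper name `mode_to_string` is hypothetical; the mode 761 is the owner/group/public example from the slide.

```python
def mode_to_string(mode):
    """Turn a 3-digit octal mode (e.g. 0o761) into an 'rwxrw---x' style string."""
    parts = []
    for digit in f"{mode:03o}":  # one octal digit each for owner, group, public
        bits = int(digit)
        parts.append(
            ("r" if bits & 4 else "-")
            + ("w" if bits & 2 else "-")
            + ("x" if bits & 1 else "-")
        )
    return "".join(parts)


print(mode_to_string(0o761))  # owner 7, group 6, public 1
```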
A Sample UNIX Directory Listing

Linux tutorial part 2:


https://moodle.najah.edu/mod/url/view.php?id=511627

Partitions
• Disks are broken into one or more partitions.
• Each partition can have its own file system type (UFS, FAT, NTFS, …).
– UFS: Unix File System (Unix/Linux)
– FAT: File Allocation Table (Windows, older)
– NTFS: New Technology File System (Windows, recent)
• A disk or partition can also be used raw, without a file system.


A Disk Layout for A File System

[ Boot block | Super block | File descriptors (FCBs) | File data blocks ]

• Super block defines a file system
– size of the file system
– size of the file descriptor area
– start of the list of free blocks
– location of the FCB of the root directory
– other meta-data such as permissions and times


Block Allocation
• Allocation method: refers to how disk blocks are allocated for files.

• Example methods:
– Contiguous allocation
– Linked allocation
– Indexed allocation



Contiguous Block Allocation



Contiguous Allocation Method
• Contiguous allocation: each file occupies a set of contiguous blocks
– Pros:
• Best performance in most cases
• Simple – only the starting location (block #) and length (number of blocks) are required
– Problems include:
• Finding space on the disk for a file
• External fragmentation, which requires compaction off-line (downtime) or on-line
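
Both the simplicity and the fragmentation problem can be seen in a short sketch. It assumes a first-fit search over a list of free/used flags; first-fit is just one possible placement policy, and the names `first_fit` and `physical_block` are illustrative.

```python
def first_fit(free_map, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run = 0
    for i, is_free in enumerate(free_map):
        run = run + 1 if is_free else 0
        if run == length:
            return i - length + 1
    return None


def physical_block(start, length, logical_block):
    """With contiguous allocation, address translation is one addition."""
    if not 0 <= logical_block < length:
        raise IndexError("logical block outside the file")
    return start + logical_block


# True = free. Five blocks are free in total, but the longest run is three.
free_map = [False, True, True, False, True, True, True]
print(first_fit(free_map, 3))   # a 3-block file fits
print(first_fit(free_map, 4))   # a 4-block file does not: external fragmentation
```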




Linked Block Allocation
• Each file is a linked list of disk blocks; the blocks may be scattered anywhere on the disk
• Pros:
– Less fragmentation
– Flexible file allocation

• Cons:
– On an HDD, a sequential read requires a disk seek to jump to the next block (still not too bad…)
– Random reads are very inefficient: reaching a given block takes O(n) time (n = # of blocks in the file)
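
The O(n) cost of a random read follows directly from the list structure. A minimal sketch, assuming the "next block" pointers are modelled as a dictionary; the block numbers and the name `nth_block` are made up for illustration.

```python
def nth_block(next_pointer, first_block, n):
    """Find the n-th block of a file by following n links from its first block."""
    block = first_block
    for _ in range(n):  # one link (and potentially one disk read) per step
        block = next_pointer[block]
        if block is None:
            raise IndexError("file has fewer blocks than requested")
    return block


# Each disk block stores the number of the next block; None marks the end of the file.
next_pointer = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
print(nth_block(next_pointer, 9, 3))  # three links must be followed to reach logical block 3
```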


Indexed Block Allocation
• Maintain an array of pointers to blocks (the index block).

• Random access becomes as easy as sequential access!

• Used by the UNIX File System
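
For contrast with the linked scheme, a lookup through an index block is a single array access. A tiny sketch with made-up block numbers; the name `indexed_lookup` is hypothetical.

```python
def indexed_lookup(index_block, logical_block):
    """Translate a logical block number with one array access."""
    if not 0 <= logical_block < len(index_block):
        raise IndexError("logical block outside the file")
    return index_block[logical_block]


index_block = [9, 16, 1, 10, 25]  # entry i holds the disk block of logical block i
print(indexed_lookup(index_block, 3))
```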


Free Space Management
• What happens when a file is deleted?
→ We need to keep track of free blocks…

• Bit Vector (or Bitmap)
• Linked List




Bit Vector (= Bit Map)
• Pros
– Can be very efficient with hardware support
– We can find n free blocks at once.
• Cons
– Bitmap size grows as disk size grows. Inefficient if the entire bitmap can't be loaded into memory.
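
A bit vector can be sketched with a plain integer. This is an illustration under an assumed convention (bit = 1 means free, which the slides do not specify); the class name `FreeBitmap` and its methods are hypothetical.

```python
class FreeBitmap:
    """Free-space bit vector: bit b is 1 when block b is free (assumed convention)."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = (1 << n_blocks) - 1  # all blocks start out free

    def allocate(self, count):
        """Claim the `count` lowest-numbered free blocks in one scan."""
        found = [b for b in range(self.n_blocks) if (self.bits >> b) & 1][:count]
        if len(found) < count:
            raise RuntimeError("not enough free blocks")
        for b in found:
            self.bits &= ~(1 << b)  # clear the bit: block is now in use
        return found

    def free(self, blocks):
        """Called when a file is deleted: mark its blocks free again."""
        for b in blocks:
            self.bits |= 1 << b


bitmap = FreeBitmap(8)
first = bitmap.allocate(3)
bitmap.free([1])
second = bitmap.allocate(2)
print(first, second)
```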


Linked List
• Pros
– No need to keep a global table.

• Cons
– We have to access each block on the disk one by one to find more than one free block.
– Traversing the free list may require substantial I/O


Section 1, 2 and 3

UNIX file layout overview

[Figure: i-node with 15 pointers]


I-node
• The FCB (file control block) of UNIX

• Each i-node contains 15 block pointers
– 12 direct block pointers and 3 indirect (single, double, triple) pointers.

• Block size is 4K
→ Thus, with 12 direct pointers, the first 48K are directly reachable from the i-node.


I-node block indexing



I-node addressing space
Recall that the block size is 4K; assume each entry (block pointer) is 4 bytes:

An indirect block contains 1024 (= 4KB / 4 bytes) entries

• A single-indirect block can address
1024 * 4K = 4M data
• A double-indirect block can address
1024 * 1024 * 4K = 4G data
• A triple-indirect block can address
1024 * 1024 * 1024 * 4K = 4T data

Any block can be found with at most 3 indirections.
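
The arithmetic above can be turned into a small calculator that reports how many indirections a given byte offset needs. A sketch using the slide's parameters (4K blocks, 4-byte entries, 12 direct pointers); the function names are made up for illustration.

```python
BLOCK_SIZE = 4096
ENTRIES_PER_BLOCK = BLOCK_SIZE // 4  # 1024 pointers fit in one indirect block
N_DIRECT = 12


def indirection_level(offset):
    """Number of indirect blocks to traverse to reach the block holding `offset`."""
    block = offset // BLOCK_SIZE
    if block < N_DIRECT:
        return 0
    block -= N_DIRECT
    span = ENTRIES_PER_BLOCK  # data blocks covered by the single-indirect pointer
    for level in (1, 2, 3):
        if block < span:
            return level
        block -= span
        span *= ENTRIES_PER_BLOCK
    raise ValueError("offset beyond the maximum file size")


def max_file_size():
    """Bytes reachable through all 15 pointers."""
    blocks = N_DIRECT + ENTRIES_PER_BLOCK + ENTRIES_PER_BLOCK**2 + ENTRIES_PER_BLOCK**3
    return blocks * BLOCK_SIZE


print(indirection_level(48 * 1024 - 1))  # last directly reachable byte
print(indirection_level(48 * 1024))      # first byte behind the single-indirect block
print(max_file_size())
```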


File Layout in UNIX



Partition layout in UNIX

• Boot block
• Super block
• FCBs
– (I-nodes in Unix, FAT or MFT in Windows)
• Data blocks
File System Maintenance
• Format
– Create the file system layout: super block, i-nodes…
• Bad blocks
– Most disks have some; the number increases with age
– Keep them in a bad-block list
– “scandisk”
• De-fragmentation
– Rearrange blocks so that files are more contiguous
• Scanning
– After system crashes
– Correct inconsistent file descriptors


File Operations (system calls)

1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get attributes
10. Set attributes
11. Rename
Application File System Interaction

[Figure: the process control block holds an open file pointer array; its entries point into the system-wide open file table, which refers to file descriptors (metadata). On disk, the file system holds file system info, file descriptors, directories, and file data.]


open(file…) under the hood
fd = open( FileName, access)

1. Search the directory structure for the given file path
2. Copy file descriptors into an in-memory data structure
3. Create an entry in the system-wide open-file-table
4. Create an entry in the PCB
5. Return the file pointer to the user

[Figure: PCB → open file table → metadata → file system on disk; "Allocate & link up data structures"; "Directory look up by file path"]
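
The five steps can be mimicked with ordinary data structures. This is a toy model, not how any real kernel is written: the classes `Process` and `Kernel`, the dictionary standing in for the on-disk directory, and the field names are all assumptions made for illustration.

```python
class Process:
    def __init__(self):
        # Per-process table kept in the PCB: fd -> index into the system-wide table.
        self.fd_table = []


class Kernel:
    def __init__(self, directory):
        self.directory = directory   # stands in for the on-disk directory: path -> FCB
        self.open_file_table = []    # system-wide open-file table

    def open(self, process, path, access):
        fcb_on_disk = self.directory[path]        # 1. search the directory structure
        fcb_in_memory = dict(fcb_on_disk)         # 2. copy the descriptor into memory
        self.open_file_table.append(              # 3. system-wide open-file-table entry
            {"fcb": fcb_in_memory, "access": access, "offset": 0}
        )
        process.fd_table.append(len(self.open_file_table) - 1)  # 4. entry in the PCB
        return len(process.fd_table) - 1          # 5. return the file pointer (fd)


kernel = Kernel({"/home/a.txt": {"size": 10, "blocks": [7]}})
proc = Process()
fd = kernel.open(proc, "/home/a.txt", "r")
print(fd, kernel.open_file_table[proc.fd_table[fd]])
```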


read(file…) under the hood
read( fd, userBuf, size )

1. PCB: find the open file descriptor
2. Open file table: read( fileDesc, userBuf, size )
3. Metadata: logical → physical, then read( device, phyBlock, size )
4. Buffer cache: get the physical block into sysBuf, then copy to userBuf
5. Disk device driver


An Example Program Using File System Calls
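
As a stand-in illustration of the calls listed earlier (open, read, write, close), here is a small file-copy sketch using Python's thin wrappers over the system calls. It is not the program from the original slide; the function name `copy_file` and the buffer size are assumptions.

```python
import os
import tempfile


def copy_file(src, dst, buf_size=4096):
    """Copy src to dst using open/read/write/close; returns bytes copied."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            total = 0
            while True:
                chunk = os.read(in_fd, buf_size)
                if not chunk:          # empty result means end of file
                    break
                view = memoryview(chunk)
                while view:            # write() may accept fewer bytes than offered
                    written = os.write(out_fd, view)
                    view = view[written:]
                total += len(chunk)
            return total
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "in.bin")
    target = os.path.join(tmp, "out.bin")
    with open(source, "wb") as f:
        f.write(b"x" * 10000)
    print(copy_file(source, target))
```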

Virtual File Systems
• Virtual File Systems (VFS) on Unix provide an object-oriented way of implementing file systems
• VFS allows the same system call interface (the API) to be used for different types of file systems
– Separates file-system generic operations from implementation details
– The implementation can be one of many file system types, or a network file system
• Implements vnodes, which hold inodes or network file details
– Then dispatches each operation to the appropriate file system implementation routines
• The API is to the VFS interface, rather than to any specific type of file system
• For example, Linux has four object types:
– inode, file, superblock, dentry
• VFS defines a set of operations on the objects that must be implemented
– Every object has a pointer to a function table
• The function table has the addresses of the routines that implement each function on that object
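
The function-table idea can be sketched with dictionaries of callables. This is a simplified illustration, not the Linux VFS interface: the two toy "file systems" (`MEM_OPS`, `ZERO_OPS`) and the name `vfs_read` are hypothetical.

```python
def mem_read(state, size):
    """Toy file system 1: reads from bytes held in memory."""
    data = state["data"][state["pos"]:state["pos"] + size]
    state["pos"] += len(data)
    return data


def zero_read(state, size):
    """Toy file system 2: every read returns zero bytes, like a zero device."""
    return b"\0" * size


# One function table per file system type.
MEM_OPS = {"read": mem_read}
ZERO_OPS = {"read": zero_read}


def vfs_read(file_obj, size):
    """Generic layer: dispatch through the object's function table."""
    return file_obj["ops"]["read"](file_obj["state"], size)


mem_file = {"ops": MEM_OPS, "state": {"data": b"hello", "pos": 0}}
zero_file = {"ops": ZERO_OPS, "state": {}}
print(vfs_read(mem_file, 2), vfs_read(zero_file, 3))
```

The caller uses the same `vfs_read` for both objects; only the table each object points to differs.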
Schematic View of Virtual File System


Windows File System
• FAT
• FAT32
• NTFS



DOS/Windows File System

FAT
The File Allocation Table (FAT) file system was initially developed for the DOS operating system and was later supported by all versions of Microsoft Windows.
It originated with Microsoft's earlier operating system, MS-DOS, and was the predominant file system in Windows versions such as 95, 98, and ME.
All recent versions of Windows still support the FAT file system, although it is no longer popular.
FAT has several versions: FAT12, FAT16 and FAT32. Successive versions were named after the number of bits in a table entry: 12, 16 and 32.
FAT
• FAT == File Allocation Table
• The FAT is located at the top of the volume.
– Two copies are kept in case one becomes damaged.

• Cluster size is determined by the size of the volume.
– Why?
– It depends on the number of bits used to address each cluster: 16 or 32 (see later slides)
• Super block: stores basic info about the file system
– FAT version, location of boot files
– Total number of blocks
– Index of the root directory in the FAT

• File allocation table (FAT)
– Marks which blocks are free or in-use
– Linked-list structure to manage large files

• Data blocks: store file and directory data
– Each block is a fixed size (4KB – 64KB)
– Files may span multiple blocks
Volume Size vs. Cluster Size

Assume 20 bits are used to address each cluster; then:

Drive Size               Cluster Size        Sectors per Cluster
-----------------------  ------------------  -------------------
512MB or less            512 bytes           1
513MB to 1024MB (1GB)    1024 bytes (1KB)    2
1025MB to 2048MB (2GB)   2048 bytes (2KB)    4
2049MB and larger        4096 bytes (4KB)    8
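
The table follows from a simple rule: pick the smallest cluster size for which 2^20 clusters cover the volume. A sketch of that rule, assuming 512-byte sectors and a cap of 8 sectors per cluster as in the last table row; the function name `cluster_size_for` is made up.

```python
def cluster_size_for(volume_bytes, address_bits=20, sector_size=512,
                     max_sectors_per_cluster=8):
    """Smallest cluster size whose 2**address_bits clusters cover the volume."""
    max_clusters = 1 << address_bits
    sectors = 1
    while (sectors < max_sectors_per_cluster
           and max_clusters * sectors * sector_size < volume_bytes):
        sectors *= 2  # double the cluster size until the volume is covered
    return sectors * sector_size


MB = 1024 ** 2
for size_mb in (512, 513, 1025, 2049):
    print(size_mb, "MB ->", cluster_size_for(size_mb * MB), "bytes per cluster")
```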


FAT block indexing
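
FAT indexing can be sketched as a table in which entry i holds the number of the cluster that follows cluster i. The table contents and the marker values `FREE` and `EOC` below are illustrative assumptions, not the on-disk FAT encoding.

```python
FREE = 0   # hypothetical marker: cluster not in use
EOC = -1   # hypothetical marker: end of a file's cluster chain


def file_clusters(fat, first_cluster):
    """Follow the chain in the FAT starting at a file's first cluster."""
    chain = []
    cluster = first_cluster
    while cluster != EOC:
        if len(chain) >= len(fat):
            raise ValueError("corrupt FAT: chain contains a loop")
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# Index:  0     1     2  3     4    5  6     7
fat = [FREE, FREE, 5, FREE, EOC, 7, FREE, 4]
print(file_clusters(fat, 2))  # the directory entry would store the first cluster, 2
```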



FAT Limitations
• The entry that references a cluster is 16 bits
– Thus at most 2^16 = 65,536 clusters are accessible
– Partitions are limited in size to 2~4 GB.
– Too small for today’s hard disk capacity!

• For partitions over 200 MB, performance degrades rapidly (65K * 4K per cluster = 256 Mbytes of disk).
– Wasted space in each cluster increases, because the cluster size must grow so that 65K clusters can cover a large disk.

• Two copies of FAT…
→ still susceptible to a single point of failure!
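
The size limits above are just the number of addressable clusters times the cluster size. A quick check, with a hypothetical helper name:

```python
def max_partition_bytes(entry_bits, cluster_bytes):
    """Largest partition addressable with the given FAT entry width."""
    return (1 << entry_bits) * cluster_bytes


KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3
print(max_partition_bytes(16, 4 * KB) // MB, "MB with 4KB clusters")
print(max_partition_bytes(16, 32 * KB) // GB, "GB with 32KB clusters")
print(max_partition_bytes(16, 64 * KB) // GB, "GB with 64KB clusters")
```

Larger clusters raise the limit, but every file then wastes more of its last cluster.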


FAT32
Enhancements over FAT

• More efficient space usage
– By using smaller clusters.
– Why is this possible? 32-bit entries…
• More robust and flexible
– The root folder became an ordinary cluster chain, so it can be located anywhere on the drive.
– Can use the backup copy of the file allocation table.
– Less susceptible to a single point of failure.


Windows File System

NTFS (New Technology File System)

NTFS, or the NT File System, was introduced with the Windows NT operating system.
NTFS allows ACL-based permission control, which was the most important feature missing from the FAT file system.
Later versions of Windows, such as Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, and Windows Vista, also use NTFS.
NTFS has several improvements over FAT, such as security access control lists (ACLs), like Unix.


NTFS
• MFT == Master File Table
– Analogous to the FAT

• Design Objectives
1) Fault-tolerance
→ Built-in transaction logging feature.
2) Security
→ Granular (per file/directory) security support.
3) Scalability
→ Handling huge disks efficiently.


The Sun Network File System (NFS)

• An implementation and a specification of a software system for accessing remote files across LANs (or WANs)

• The implementation is part of the Solaris and SunOS operating systems running on Sun workstations, using an unreliable datagram protocol (UDP/IP) and Ethernet


NFS (Cont.)
• Interconnected workstations viewed as a set of independent machines with
independent file systems, which allows sharing among these file systems in a
transparent manner
– A remote directory is mounted over a local file system directory
• The mounted directory looks like an integral subtree of the local file system,
replacing the subtree descending from the local directory
– Specification of the remote directory for the mount operation is
nontransparent; the host name of the remote directory has to be provided
• Files in the remote directory can then be accessed in a transparent manner
– Subject to access-rights accreditation, potentially any file system (or directory
within a file system), can be mounted remotely on top of any local directory



NFS (Cont.)
• NFS is designed to operate in a heterogeneous environment of different machines, operating systems, and network architectures; the NFS specification is independent of these media

• This independence is achieved through the use of RPC primitives built on top of an External Data Representation (XDR) protocol used between two implementation-independent interfaces

• The NFS specification distinguishes between the services provided by a mount mechanism and the actual remote-file-access services


Old/backup



Unix Directory
• Internally, the same as a file.
• A file whose type field marks it as a directory,
– so that only the system has certain access permissions.
• Contains <file name, i-node number> tuples.


Unix Directory Example
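- How to look up /usr/bob/mbox?

Root directory:                   1 .   1 ..   4 bin   7 dev   14 lib   9 etc   6 usr   8 tmp
I-node 6:                         relevant data is in block 132
Block 132 (/usr directory):       6 .   1 ..   26 bob   17 jeff   14 sue   51 sam   29 mark
I-node 26:                        data for /usr/bob is in block 406
Block 406 (/usr/bob directory):   26 .   6 ..   12 grants   81 books   60 mbox   17 Linux

1. Looking up usr in the root directory gives i-node 6.
2. I-node 6 says the relevant data (the directory listing bob) is in block 132.
3. Looking up bob in block 132 gives i-node 26.
4. I-node 26 says the data for /usr/bob is in block 406.
5. Aha! Looking up mbox in block 406 gives i-node 60, which has the contents of mbox.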
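
The lookup can be replayed in code using the numbers from the example. This is a toy model: the dictionaries stand in for on-disk i-nodes and directory blocks, the root directory's block number (`ROOT_BLOCK = 0`) is an assumption because the example does not give it, and the name `lookup` is hypothetical.

```python
ROOT_INODE = 1
ROOT_BLOCK = 0  # assumed: the example does not say where the root directory is stored

# i-node number -> block holding that directory's entries
INODE_TO_BLOCK = {ROOT_INODE: ROOT_BLOCK, 6: 132, 26: 406}

# block number -> directory entries (<file name, i-node number> tuples)
DIR_BLOCKS = {
    ROOT_BLOCK: {".": 1, "..": 1, "bin": 4, "dev": 7, "lib": 14,
                 "etc": 9, "usr": 6, "tmp": 8},
    132: {".": 6, "..": 1, "bob": 26, "jeff": 17, "sue": 14, "sam": 51, "mark": 29},
    406: {".": 26, "..": 6, "grants": 12, "books": 81, "mbox": 60, "Linux": 17},
}


def lookup(path):
    """Resolve an absolute path to an i-node number, one component at a time."""
    inode = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:                    # handles the path "/"
            continue
        block = INODE_TO_BLOCK[inode]   # read the i-node to find the directory block
        inode = DIR_BLOCKS[block][name] # search that block for the next component
    return inode


print(lookup("/usr/bob/mbox"))
```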
Fast File System
Data and I-node placement
• Original (non-FFS) unix FS had two major problems:
– 1. data blocks are allocated randomly in aging file systems
• blocks for the same file allocated sequentially when FS is new
• as FS “ages” and fills, need to allocate blocks freed up when other files
are deleted
– problem: deleted files are essentially randomly placed
– so, blocks for new files become scattered across the disk!
– 2. inodes are allocated far from blocks
• all inodes at beginning of disk, far from data
• traversing file name paths, manipulating files, directories requires
going back and forth from inodes to data blocks

– BOTH of these generate many long seeks!


Cylinder groups
• FFS addressed these problems using notion of a
cylinder group
– disk partitioned into groups of cylinders
– data blocks from a file all placed in same cylinder group
– files in same directory placed in same cylinder group
– inode for file in same cylinder group as file’s data

• Introduces a free space requirement


– to be able to allocate according to cylinder group, the disk
must have free space scattered across all cylinders
– in FFS, 10% of the disk is reserved just for this purpose!
• good insight: keep disk partially free at all times!
Other FFS innovations

• Small blocks in the FS (1KB) caused two problems:
– low bandwidth utilization
– small maximum file size
• FFS fixes this by using a larger block (4KB)
– allows for very large files (a 1MB file needs no more than 2 levels of indirection)
– but introduces internal fragmentation
• there are many small files (i.e., <4KB)
– fix: introduce “fragments”
• 1KB pieces of a block
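
The effect of fragments on internal fragmentation can be estimated with a short calculation. A simplified sketch, assuming whole 4KB blocks are used for all but the tail of a file and the tail is stored in 1KB fragments; the function name `ffs_allocated_bytes` is made up.

```python
def ffs_allocated_bytes(file_size, block=4096, fragment=None):
    """Space used by a file: whole blocks, plus fragments for the tail if enabled."""
    full_blocks, tail = divmod(file_size, block)
    if tail == 0:
        return full_blocks * block
    if fragment is None:
        return (full_blocks + 1) * block      # tail wastes the rest of a full block
    tail_fragments = -(-tail // fragment)     # ceiling division
    return full_blocks * block + tail_fragments * fragment


print(ffs_allocated_bytes(100))                 # small file, no fragments
print(ffs_allocated_bytes(100, fragment=1024))  # same file with 1KB fragments
```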
