Professional Documents
Culture Documents
6
File Systems
(4 slots)
Chapter 4
Files
Directories
File System Implementation
File System Management and Optimization
Example File Systems
OS
Introduction
2/85
OS
Objectives
• Files
– File Naming
– File Structure
– File Types
– File Access
– File Attributes
– File Operations
– An Example Program using File System Calls
• Directories
– Single-Level Directories Systems
– Hierarchical Directory Systems
– Path Names
– Directory Operations
3/85
Objectives…
OS
4/85
Objectives…
OS
5/85
OS
Overview
• Problems
– Data in memory is lost when the process terminates
– It is frequently necessary for multiple processes to access (parts
of) the information at the same time.
– How to preserve data> Write them to file.
• 3 essential requirements for long-term information storage
– It must be possible to store a very large amount of information
– The information must survive the termination of process using it
– Multiple processes must be able to access the information
concurrently
• Disk and its two operations (read, write) are used to solve the
long-term storage problem.
6/85
OS
1- Files: Definitions
Files:
• are logical units of information (created by processes).
• are the abstraction (objects and mechanism), that models in a
convenient way the information stored and read it back on hardware
devices and are managed by OS.
• Processes can read existing files & create new ones if need be.
• Information stored in files must be persistent (not be affected by
process creation and termination).
File system
• The OS’s component that manages the information stored on the
storage devices and provides the users access to that information in a
convenient way.
• Hides the complexity of storage hardware devices.
• Provides the users a uniform logical view of the information stored
on these devices, based on the concept of file.
• Maps files onto physical devices.
7/85
OS
Files: Naming
8/85
OS
Files: File Structure
There are 3 common way possibilities :
– An unstructured sequence of bytes
• Byte sequences (maximum flexibility).
• Can put any thing they want and name file any way.
– Record sequence
• File contains sequence of fixed length records.
• It provides 3 basic operations: read, write, append.
– Tree
• A file consists of a tree of records, not necessarily all the same length, each
containing a key field in a fixed position in the record.
• The tree is sorted on the key field to allow rapid searching for a key.
• Provides the get operation to get the record with specifies key.
• OS decides the new record’s position.
9/85
OS
Files: File Structure Demo.
10/85
OS
Files: File Types
11/85
OS
Files: File Access
• Sequential access
– Reads the bytes or records in order, starting at the beginning,
but could not skip around and read them out of order
– Could be rewound
– Specific to magnetic tapes
– Operation: read
• Random access
– Reads the bytes or records out of order, or to access records
by key rather than by position ( key: unique data which helps
distiguishing information)
– Specific to disks
– Operation: seek
12/85
OS
Files: File Attributes
13/85
OS
Files: Operations
14/85
OS
Files: Copy file – Demo.
System calls
System calls
16/85
OS
2- Directories/ Folder: Definition
Directory:
• Is a way of organizing files.
• Imposes (quy định) a hierarchy of files.
• Consists of a collection of files and
subdirectories.
• Can be viewed as a special file
managed by the OS.
17/85
OS
Directories: Single-Level Directory Systems
• Is having one directory containing all files (root
directory).
• Applying on system that can be used by only one or
many users.
• Is adequate for simple dedicated applications.
• Advantages: Simplicity, ability to locate file quickly.
• Disadvantages: The file name can be duplicated in one/ many
users.
18/85
OS
Directories: Hierarchical Directory Systems
20/85
OS
21/85
OS
Directories: Operations
• Create
• Delete
• Opendir
• Closedir
• Readdir
• Rename
• Link: Allow a file to appear in more than one directory
• Unlink
22/85
OS
23/85
OS
FS Impl.: File System Layout
• File Systems
– Are stored on disks
– Disks can be divided up into one or more partitions, with
independent file systems on each partition.
Structure of a partition
• The 1st block in each partition is called boot block that is existed
even if the partition does not contain a bootable OS.
25/85
OS
• The 2nd block is the super block contains the all key parameters
about the file system including magic number (file system type),
the number of blocks in the file system, and the other key
administrative information. It will be read into memory when the
computer is booted or the file system is first touched.
• Next block is free blocks (ex: form of bitmap or a list of pointer)
• The following block are i-nodes, an array data structures, one
per file, telling all about the file (metadata of a file).
• After that might come the root directory, which contains the top
of the file system tree
• Finally, the remainder contains all the other directories and files
26/85
OS
27/85
OS
28/85
OS
29/85
OS
0 or -1
can be
selected
to mark
the last
disk
Tanenbaum, Fig. 4-11.
block.
31/85
OS
FS Impl.: Implementing Files
32/85
OS
FS Impl.: Implementing Files…
33/85
OS
34/85
OS
FS Impl.: i-nodes…
• Advantages
– The space is reserved because i-node need only be in memory
when the corresponding file is open.
– Allows the maximum number files that may be opened at
once.
– Is smaller than the space occupied by the file table.
• Disadvantages
– Each i-nodes has room for a fixed number of disk addresses
(not flexibility) What happens when a file grows beyond
this limit? Solution: reserve the last disk address not for
data block, but it is used to contains more disk block
addresses.
35/85
OS
36/85
OS
FS Impl.: i-nodes…
• Example:
An i-nodes has contain 10 direct addresses and 1 single indirected
addresses of 4 bytes each and all disk blocks are 1KB. What is the
largest possible file?
– 10 direct addresses with 1KB each block:
File size is managed by direct addresses: 10 * 1KB = 10KB
– 1 single indirected address points to a disk block (1KB)
Number of pointers can be managed by the indirect address:
1024/4 = 256
File size is managed by the indirect address:
256* 1KB= 256 KB
The largest possible file is 10 + 256 = 266 KB
37/85
OS
38/85
OS
40/85
OS
• Searching Directories
– Search linearly from beginning to end when a file name
has to be looked up → slow with extreme long
directories.
– To speed up: the hash table is used.
• Advantage: faster lookup.
• Disadvantages: more complex administrator.
– Applying to the large directories, the searching result
is cached.
• Before starting a search, a check is first made in the cache.
• If so, it can be located immediately. Otherwise, the
searching is started.
43/85
OS
File system with no shared files (tree) File system with shared files – Using links
Tanenbaum, Fig. 4-7. Directed Acrylic Grapg - DAG
Tanenbaum, Fig. 4-16. 44/85
OS
Advantages
- One i-node
- No new blocks allocated
Disadvantages:
- A mechanism for removing file is needed. If not, a
user can be billed (thông báo) for a “removed file”.
45/85
OS
Traditional/ Hard Linking: (a): Situation prior to linking. (b) After the link is
created. (c): After the original owner removes the file
Tanenbaum, Fig. 4-17.
46/85
OS
48/85
OS
FS Impl.: LFS…
• Solutions: using LFS
– Main idea: Structure the entire disk as a log.
– All writes are initially buffered in memory, and
periodically, all the buffered writes to the disk in a
single segment at the end of the log.
– A single segment
• Contains i-nodes, directory blocks, and data blocks, all
mixed together.
• At the start of each segment is a segment summary telling
what can be found .
– i-nodes are scattered all over the log (the blocks are
located in the usual way when an i-node is located).
49/85
OS
FS Impl.: LFS…
FS Impl.: LFS…
• Problems
– Disks are finitely large, thus when the log occupies the entire disk, no new
segments can be written.
– Many existing segments may have blocks that are no longer need but they
still occupy space is previously written segments.
Cleaner thread is used.
FS Impl.: LFS…
• Solutions
– Cleaner thread spends its time scanning the log circularly to
compact it.
• It starts out by reading the summary of first segment in the log to see which i-
nodes and files are there
• It then checks the current i-node map to see if the i-nodes are still current and file
blocks are still in use → is written in the next segment. Otherwise, that information
is discarded
• Cleaner moves along the log, removing old segments from the back and putting
any live data into memory for rewriting in the next segment
the disk is a big circular buffer, with the writer thread adding new
segment to the front and the cleaner thread removing old ones
from the back.
complexity
53/85
OS
54/85
OS
FS Impl.: JFS…
• Problems
– The first step is completed and the system crashes.
• The i-node and file blocks will not be accessible from any
file, but will also not be available for reassignment
– The second step is completed and the system crashes.
• Only blocks are lost
– If the order of operation is changed
• If the i-node is released first, then after rebooting, the i-
node may be reassigned, but the old directory entry will
continue to point to it, hence to the wrong file.
• If the blocks are released first, then a crash before i-node is
cleared (means the valid directory entry points to an i-node
listing blocks now is the free storage pool).
55/85
OS
56/85
OS
POSIX: Portable
Operating System
Interface for UNIX.
57/85
OS
58/85
OS
3- File System Management & Optimization
Disk Space Management
• Block size
– Files are stored in fixed-size blocks of bytes, not necessarily
adjacent
– How large the block should be?
• Large: Is efficiently read, but waste of HDD space
• Small:
– Is good use of HDD space
– Span multiple blocks
– Many disk accesses to read a file (reduce performance – waste time)
– A good compromise should be chosen
• Performance and space utilization are inherently in conflict
– Median size better then mean size (~2KB)
• Just read (1KB), just written (2.3 KB), and read and written (4.2KB)
59/85
OS
60/85
OS
1: Free block
0: Allocated block
62/85
OS
63/85
OS
64/85
OS
66/85
OS
Block number
(a) Consistence: the in-
use block can not be
in free block list.
(b) Missing block: no
real harm, waste
space and reduce the
capacity of disk.
Solution: the file
system checker adds
the missing block to
the free list
File system state
Tanenbaum, Fig. 4-24. 68/85
OS
70/85
OS
• Directory Consistency
– Both counts will agree → consistency
– The link count is higher: the file are removed, i-node will not be removed
(is not serious errors and cause waste space) → fixed by setting the link
count in the i-node to the correct value
– The link count is lower: 2 directory entries are linked to a file, but the i-
node present is only one (disaster and all blocks is release when the file is
removed → the one of directories points to an unused i-node) → force the
link count in i-node to the actual number of directory entries
71/85
OS
73/85
OS
74/85
OS
76/85
OS
77/85
OS
80/85
OS
A Unix i-node
Tanenbaum, Fig. 4-34.
84/85
OS
Summary
Q&A
85/85
OS