
Magnetic Disks

On older disks the number of sectors per track was the same for all cylinders.
The physics of the inner track sectors defined the maximum number of bytes per sector. From physics, the outer sectors could have stored more bytes than defined, as the areas are bigger.

Waste of space / capacity


Physical disk geometry
Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Magnetic Disks
Modern disks are divided into zones with more sectors in the outer zones than in the inner zones (zone bit recording).

Physical geometry (left) and corresponding virtual geometry example (right)


Figure from [Ta01 p.302], modified

This must be seen as two sectors

Magnetic Disks
Physical geometry: The true physical disk layout. With modern disks only the drive's internal electronics know about it.
CHS (for old disks) or not published any more

Virtual geometry: The disk layout published to the external world (device driver, operating system, user).
CHS (e.g. WD 18300 example)
LBA (logical block addressing): Disk sectors are just numbered consecutively without regard to the physical geometry.
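The relation between a virtual CHS geometry and consecutive LBA numbers can be sketched as below. This is a minimal illustration of the conventional mapping, assuming cylinders and heads count from 0 and sectors from 1; the geometry of 16 heads and 63 sectors per track is a hypothetical example, not taken from the slides.

```python
def chs_to_lba(c, h, s, heads_per_cylinder, sectors_per_track):
    """Map a (cylinder, head, sector) triple to a consecutive block number.

    Cylinders and heads count from 0, sectors count from 1.
    """
    if not (0 <= h < heads_per_cylinder and 1 <= s <= sectors_per_track):
        raise ValueError("head or sector outside the virtual geometry")
    return (c * heads_per_cylinder + h) * sectors_per_track + (s - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse mapping: consecutive block number back to (c, h, s)."""
    c, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    h, s0 = divmod(rest, sectors_per_track)
    return c, h, s0 + 1


# Hypothetical virtual geometry: 16 heads, 63 sectors per track.
HEADS, SECTORS = 16, 63
first = chs_to_lba(0, 0, 1, HEADS, SECTORS)     # very first sector of the disk
example = chs_to_lba(2, 3, 4, HEADS, SECTORS)
back = lba_to_chs(example, HEADS, SECTORS)
```

The first sector of the disk gets number 0, and converting a block number back yields the original triple.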

A disk is a random access storage device.



Magnetic Disks
Low level formatting: Creation of the physical geometry on the disk platters. Defective disk areas are masked out and replaced by spare areas. Done by the disk drive's internal software.

Partitioning: The disk is divided into independent partitions, each logically acting as a separate disk. A master boot record is defined in the first sector of the disk. Done by an application program.

High level formatting: A partition receives a boot block and an empty file system (free storage administration, root directory). Done by an application program or by an operating system administration tool.

Logical Disk Layout


Magnetic disks

File system

Figure from [Ta01 p.400], modified



File System Management


Storage Media
Magnetic Disks
Files and Directories
File Implementation
Directory Implementation
Free Block Management
File System Layout
Disk Performance
Floppy Disks

Files
A file is a named collection of related information recorded on secondary storage. [Sil00 p.346]
A file is a logical storage unit. It is an abstract data type. [Sil00 p.345, 347]
Files are an abstraction mechanism for storing information and retrieving it back later. [after Ta01 p.380]

File Structure
Files

Logical file structure examples [Ta01 p.382]



File Structure
Files

a) Byte sequence
Unstructured. The OS does not know or care what is in the file. Meaning imposed by application program. Maximum flexibility. Approach used by Unix and Windows.

b) Sequence of records (fixed-length)


Each record has some internal structure. Background idea: a read or write operation on secondary storage transfers one record.

c) Tree of records
Highly structured. Records may be of variable size. Access to a record through key (e.g. Pony). Lookup / read / write / append are performed by OS, not by application program. Approach used in large mainframe computers (commercial data processing systems).

File Access
Sequential Access
Simple and most common. Based on the tape model of a file. Data is processed in order (byte after byte or record after record). Operations: read, write, rewind. Records need not be of the same length (e.g. text files, with each line forming a record; remember Pascal readln, writeln).
Files

Record
Figure from [Sil00 p.355], modified

File Access
Direct Access
Files

Bytes or fixed-length logical records. Records are numbered. Access can be in no particular order. Access by record number. Based on disk model of a file. Useful for immediate access to large data records (e.g. database). Operations: read, write, seek.
Figure from [Sil00 p.355], modified: bytes or records numbered 1 to 12, with a file pointer that seek moves to any position.
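The difference between sequential and direct access can be sketched with fixed-length records in an in-memory file. This is only an illustration: the record length of 16 bytes, the helper names and the sample data are assumptions, and records are numbered from 0 here rather than from 1 as in the figure.

```python
import io

RECORD_SIZE = 16  # hypothetical fixed record length in bytes


def write_record(f, number, payload):
    """Overwrite record `number` (0-based) with payload padded to RECORD_SIZE."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload longer than one record")
    f.seek(number * RECORD_SIZE)            # position the file pointer
    f.write(payload.ljust(RECORD_SIZE, b"\0"))


def read_record(f, number):
    """Direct access: jump to record `number` and read exactly one record."""
    f.seek(number * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\0")


f = io.BytesIO()
for i, name in enumerate([b"ant", b"bee", b"cat", b"dog"]):
    write_record(f, i, name)

third = read_record(f, 2)                   # direct access, no reading of records 0 and 1

f.seek(0)                                   # "rewind" for sequential access
first = f.read(RECORD_SIZE).rstrip(b"\0")   # each read advances the file pointer
second = f.read(RECORD_SIZE).rstrip(b"\0")
```

With fixed-length records the byte position is simply the record number times the record length, which is why direct access suits this layout.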

File Access
Indexed Access
Index file holds keys. Keys point to records within relative file. Suited for tree structures.
Files

Example of index file and relative file, figure from [Sil00 p.358]
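A minimal sketch of indexed access follows, assuming the index can be modelled as a dictionary from key to record number and the relative file as fixed-length records. A real index file would typically be kept sorted on disk; the record length, names and sample records are hypothetical.

```python
import io

RECORD_SIZE = 24  # hypothetical fixed record length of the relative file


def build(records):
    """Return (index, relative_file) for a list of (key, payload) pairs."""
    relative_file = io.BytesIO()
    index = {}
    for number, (key, payload) in enumerate(records):
        if len(payload) > RECORD_SIZE:
            raise ValueError("payload longer than one record")
        relative_file.write(payload.ljust(RECORD_SIZE, b" "))
        index[key] = number                 # key points to a record number
    return index, relative_file


def lookup(index, relative_file, key):
    """Find the record number in the index, then access the record directly."""
    number = index[key]                     # raises KeyError for unknown keys
    relative_file.seek(number * RECORD_SIZE)
    return relative_file.read(RECORD_SIZE).rstrip(b" ")


index, relative_file = build([
    ("Adams", b"Adams, payroll"),
    ("Baker", b"Baker, research"),
    ("Smith", b"Smith, sales"),
])
record = lookup(index, relative_file, "Smith")
```

Only the small index has to be searched; the record itself is then fetched with a single direct access.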

File Names
Files

Name assigned by creation process


andrew 2day urgent! fig_14

Case sensitivity
Andrew andrew ANDREW
Unix: case sensitive. MS-DOS: not sensitive.

Two-part file names: basename.extension


readme.txt prog.c.Z lecture.doc
Extensions are often just conventions, not mandated by the operating system (although convenient when the OS knows about them).

File Attributes
Additional information about a file.
Which attributes exist depends on the operating system and the file system.

Assigned by the operating system. Stored in the file system


Some possible file attributes
Access rights: Who can access the file and in what way?
Creation date: Date of file creation
Text / binary flag: Whether the file content is text or binary
Temp flag: If set, the file is a temporary file and is deleted on process exit
Hidden flag: Whether or not the file name is displayed in listings
File type: Regular file or directory file or ...

File Types
Files

Regular files (Windows, Unix)

Directories (Windows, Unix)
Files for maintaining the logical structure of the file system

Block special files, character special files (Unix)

Text files (also termed ASCII files)


Contain bytes (words in Unicode) according to a standardized character set, such as EBCDIC, ASCII or Unicode. The content is directly printable (screen, printer). Data.

Binary files
Contents not intended to be printed (at least directly). Content has meaning only to those programs using the files. Program (binary executable) or data.

Directories
A directory is a named logical place to put files in.
Single-level directory
This is the directory entry for the file called records, pointing to the file content on the storage media.

Early operating systems (CP/M, MS-DOS 1.0)
Still used in tiny embedded systems
File names are unique

This is the file content of the file records.

Figure from [Sil00 p.360]

Directories
Two-level directory
user1 user2 user3 user4

root directory

sub directories

Hierarchical structure (tree of depth 1)
Absolute file names: /user1/test, /user3/test
Relative file names: test, ../user4/data
Path names: /user3
Absolute file names are unique

Figure from [Sil00 p.361], modified
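How a relative file name is turned into an absolute one can be sketched with Python's standard posixpath module. The helper name resolve and the choice of /user3 as current directory are assumptions for illustration; the resolution is purely textual and does not look at any real file system.

```python
import posixpath


def resolve(current_dir, name):
    """Turn a file name into a normalized absolute file name.

    Absolute names (leading "/") are kept; relative names are interpreted
    starting at current_dir. ".." steps up one directory level.
    """
    return posixpath.normpath(posixpath.join(current_dir, name))


cwd = "/user3"                              # hypothetical current directory
plain = resolve(cwd, "test")                # relative name in the current directory
sibling = resolve(cwd, "../user4/data")     # relative name via the parent directory
absolute = resolve(cwd, "/user1/test")      # absolute name is independent of cwd
```

The same relative name test refers to different files depending on the current directory, whereas an absolute file name always identifies exactly one file.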

Directories
Multi-level directory

Figure from [Sil00 p.363]

Multi-Level Directory
Directories

Generalization of two-level directory
Hierarchical structure of arbitrary depth
Tree structure, graph structure. Logical organization structure.

One root directory


Arbitrary number of sub (sub sub ...) directories

Efficient file search


Tree / Graph traversing routines. Much faster than sequential search.

Logical grouping
System files, user files, shared files, ...

Most common structure



Multi-Level Directory
Directories

Acyclic graph directory structure


Additional directory entries (links)
Shared directories
Shared files
More than one absolute name for a file (or a directory)
Dangling link problem

Figure from [Sil00 p.365]

Multi-Level Directory
Directories

General graph directory structure


Allowing links to point to directories creates the possibility of cycles.

Avoiding cycles:
Forbid any links to directories. No more shared directories then.
Use a cycle detection algorithm.

Figure from [Sil00 p.365]
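One possible cycle detection algorithm is a depth-first search over the directory graph, sketched below. The slides do not say which algorithm is meant, so this is only an example; the graph representation (a dictionary from directory name to linked subdirectories) and all names are assumptions.

```python
def has_cycle(graph, root):
    """Depth-first search; True if some directory can reach itself via links.

    graph maps a directory name to the list of directories it contains or
    links to. Directories not listed are treated as having no subdirectories.
    """
    WHITE, GREY, BLACK = 0, 1, 2            # unvisited, on current path, finished
    state = {}

    def visit(node):
        state[node] = GREY
        for child in graph.get(node, []):
            child_state = state.get(child, WHITE)
            if child_state == GREY:         # link back into the current path
                return True
            if child_state == WHITE and visit(child):
                return True
        state[node] = BLACK
        return False

    return visit(root)


# Acyclic graph: directory "c" is shared by "a" and "b".
shared = {"/": ["a", "b"], "a": ["c"], "b": ["c"]}
# General graph: a link from "c" back to "a" closes a cycle.
cyclic = {"/": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["a"]}

shared_has_cycle = has_cycle(shared, "/")
cyclic_has_cycle = has_cycle(cyclic, "/")
```

A shared directory alone does not form a cycle; only a link that leads back to a directory on the current path does.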

File System Management


Now turning from the user's view to the implementor's view. Users are concerned with how files are named, what operations are allowed and what the directories look like. Implementors are interested in how files and directories are stored on the disk, how the disk space is managed, and how to make everything work efficiently.


File Implementation
The most important issue in implementing files is how the available disk space is allocated to a file.

Contiguous Allocation
Linked Allocation
    Chained Blocks
    Chained Pointers
Indexed Allocation

Contiguous Allocation
File Implementation

Each file occupies a set of contiguous blocks on the disk. A file is defined by its disk address (first block) and by its length in block units.

Advantage

Simple implementation
For each file we just need to know its start block and its length.

Fast access
Access in one continuous operation. Minimum head seeks.

Disadvantage

Disk fragmentation
Problem of finding space for a new file. The final file size must be known in advance!

Contiguous Allocation
File Implementation

(a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed
Figure from [Ta01 p.401]

Contiguous Allocation
External Fragmentation
File Implementation

Free disk space is broken into chunks (holes) which are spread all over the disk. New files are put into available holes, often not filling them up entirely and thus leaving smaller holes. A big problem arises when the largest available hole is too small for a new file.

Internal Fragmentation
A file usually does not fill up its last block entirely, so the remaining space in the block is left unused.
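Both kinds of fragmentation can be made concrete with a small simulation. The sketch below assumes a first-fit search for holes (the slides do not prescribe a placement strategy), a hypothetical disk of 12 blocks, and hypothetical helper names.

```python
def first_fit(free_map, length):
    """Return the start of the first hole with `length` free blocks, or None."""
    if length <= 0:
        raise ValueError("length must be positive")
    run_start, run_len = 0, 0
    for block, is_free in enumerate(free_map):
        if is_free:
            if run_len == 0:
                run_start = block
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0
    return None


def allocate(free_map, length):
    """Contiguous allocation: mark a hole as used and return its start block."""
    start = first_fit(free_map, length)
    if start is not None:
        for block in range(start, start + length):
            free_map[block] = False
    return start


def release(free_map, start, length):
    for block in range(start, start + length):
        free_map[block] = True


def internal_fragmentation(file_size, block_size):
    """Unused bytes in the last block of a file."""
    return -file_size % block_size


free_map = [True] * 12                      # hypothetical disk with 12 blocks
a = allocate(free_map, 3)
b = allocate(free_map, 2)
c = allocate(free_map, 3)
d = allocate(free_map, 2)
release(free_map, b, 2)                     # remove two files, leaving holes
release(free_map, d, 2)

free_blocks = sum(free_map)                 # total number of free blocks
too_big = allocate(free_map, 5)             # fails: no hole is large enough
fits = allocate(free_map, 4)                # succeeds in the largest hole
unused = internal_fragmentation(2500, 1024) # bytes left unused in the last block
```

Six blocks are free in total, yet a file of five blocks cannot be placed because the largest hole has only four blocks. That is external fragmentation; the unused tail of the last block is internal fragmentation.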

Linked Allocation
File Implementation

Each file is a linked list of disk blocks. The blocks may be scattered anywhere on the disk. Each block has besides its data a pointer to the next block. The pointer is a number (a block number).

Chained blocks: each disk block holds a next pointer and data; the pointer of the last block is nil.

Linked Allocation
File Implementation

The file jeep starts with block 9. It consists of the blocks 9, 16, 1, 10, and 25 in this order.

Advantage

Simple implementation
Only first block number needed.

No external fragmentation
Files consist of blocks scattered on the disk. No more useless blocks on disk.

Figure from [Sil00 p.380]

Linked Allocation
Disadvantage
File Implementation

Free space management


Somehow all the free blocks must be recorded in some free-block pool.

Higher access time


More seeks to access the whole file owing to block scattering.

Space reduction
Some bytes of each block are needed for the pointer.

Reliability
If a pointer is broken, the remainder of the file is inaccessible.

Not efficient for random access


To get to block k we must walk along the chain.
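The cost of random access with chained blocks can be sketched with the file jeep from the earlier slide (blocks 9, 16, 1, 10, 25). The disk is simulated as a dictionary; the data bytes, the nil marker and the function names are assumptions for illustration.

```python
NIL = None  # end-of-chain marker

# Simulated disk: block number -> (next block number, data).
disk = {
    9: (16, b"j"),
    16: (1, b"e"),
    1: (10, b"e"),
    10: (25, b"p"),
    25: (NIL, b"!"),
}


def kth_block(disk, first_block, k):
    """Return (block number, disk reads) for the k-th block (0-based) of a file.

    Every hop needs a disk read, because the pointer is stored inside the block.
    """
    block, reads = first_block, 0
    for _ in range(k):
        next_block, _data = disk[block]
        reads += 1
        if next_block is NIL:
            raise IndexError("file has fewer than k+1 blocks")
        block = next_block
    return block, reads


def read_file(disk, first_block):
    """Sequential read of the whole file by following the chain."""
    data, block = b"", first_block
    while block is not NIL:
        block, chunk = disk[block]
        data += chunk
    return data


fourth, reads_needed = kth_block(disk, 9, 3)
content = read_file(disk, 9)
```

Reaching the fourth block already costs three disk reads before the wanted block itself is read, which is why chained blocks are inefficient for random access.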


Linked Allocation
File Implementation

In particular the last disadvantage of the chained blocks allocation method, its unsuitability for random access to files, led to the chained pointers allocation method. A table contains as many entries as there are disk blocks. The entries are numbered by block number. The block numbers of a file are linked in this table in chain manner (as with chained blocks). This table is called the file allocation table (FAT).
Figure from [Sil00 p.382]

Linked Allocation
File Implementation
Chained block allocation
Chained pointer allocation (FAT)
The FAT is stored on disk and is loaded into memory when the operating system starts up.

Figures from [Ta01 p.403,404], modified

Chained pointers
Advantage
Linked Allocation

Simple implementation
One simple table for both file allocation and free-block pool.

Whole block available for data


No more pointers taking away data space.

Suitable for random access


Although the principle of getting to block k did not change, the search (counting) is now done on the block numbers in the table, not on the blocks themselves.

Disadvantage

FAT takes up disk space and memory (when cached)


One table entry for each disk block. Table size proportional to disk size.

Higher access time (compared to contiguous allocation)


Still it needs many seeks to collect all the scattered blocks.
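A minimal sketch of the chained pointers method follows, reusing the block chain 9, 16, 1, 10, 25 of the file jeep as sample data. The marker values FREE and EOF, the table size of 32 entries and the numbers in the size estimate (20 GB disk, 1 KB blocks, 4-byte entries) are assumptions; real FAT variants use their own reserved values.

```python
FREE, EOF = -2, -1  # assumed marker values for "unused" and "last block"


def chain(fat, first_block):
    """List the blocks of a file by following table entries only."""
    blocks, block = [], first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def kth_block(fat, first_block, k):
    """Block k (0-based) of a file; the counting happens in the table."""
    return chain(fat, first_block)[k]


def append_block(fat, last_block):
    """Take the lowest free entry from the same table and link it to the file."""
    new_block = fat.index(FREE)             # raises ValueError if the disk is full
    fat[new_block] = EOF
    fat[last_block] = new_block
    return new_block


def fat_size_bytes(disk_bytes, block_bytes, entry_bytes):
    """One table entry per disk block."""
    return (disk_bytes // block_bytes) * entry_bytes


fat = [FREE] * 32                           # hypothetical disk with 32 blocks
fat[9], fat[16], fat[1], fat[10], fat[25] = 16, 1, 10, 25, EOF

blocks = chain(fat, 9)
fourth = kth_block(fat, 9, 3)
added = append_block(fat, 25)               # the file grows by one block
free_left = fat.count(FREE)
table_size = fat_size_bytes(20 * 2**30, 2**10, 4)
```

The one table serves as both the chain of each file and the free-block pool, and with the FAT in memory no disk read is needed to find block k. The price is the table itself: about 80 MB for the assumed 20 GB disk.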

Indexed Allocation
File Implementation

Each file is assigned an index block. The index block is an array of block numbers, listing in order the blocks belonging to the file. To get to block k of a file, one reads the kth entry of the index block.
Figure: the index block on disk holds the block numbers of the data blocks; unused entries are nil.

Indexed Allocation
File Implementation

The file jeep is described by index block 19. The index block has 8 entries of which 5 are used.

Index blocks are also called index nodes, short i-nodes or inodes.
Figure from [Sil00 p.383]
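The index block of jeep can be sketched as a plain array. The sketch assumes that the five used entries are the blocks 9, 16, 1, 10 and 25 named on the linked allocation slide; the nil marker and the function names are hypothetical.

```python
NIL = None  # marker for an unused index entry

# Index block 19 of the file jeep: 8 entries, 5 of them used.
# Assumption: the data blocks are those listed on the linked allocation slide.
index_block_19 = [9, 16, 1, 10, 25, NIL, NIL, NIL]


def kth_block(index_block, k):
    """Block k (0-based) of the file is simply the k-th entry of the index block."""
    if not 0 <= k < len(index_block) or index_block[k] is NIL:
        raise IndexError("block k is not part of the file")
    return index_block[k]


def unused_entries(index_block):
    """Entries that occupy space in the index block without pointing to data."""
    return sum(1 for entry in index_block if entry is NIL)


fourth = kth_block(index_block_19, 3)
wasted = unused_entries(index_block_19)
```

Finding block k is a single array lookup once the index block is in memory; the three nil entries are the wasted space of this small file.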

Indexed Allocation
Advantage
File Implementation

Good for random access


Fast determination of block k of a file.

Lower memory usage


Only the index blocks of files currently in use (open files) are loaded into memory.

Lower disk space usage


Only as many index blocks are needed as there are files in the file system.

Disadvantage

Free block management


A separate free-block pool must be available.

Index block utilization


Unused entries in the index block waste space.

Indexed Allocation
File Implementation

What if a file needs more blocks than there are entries in an index block?

Linked index blocks
The last entry in an index block points to another index block (chaining).

Multilevel index blocks


An entry does not point to the data, but points to a first-level index block (single indirect block) which then points to the data. Optionally, additional levels are available through second-level and third-level index blocks.

Combined scheme
Most entries point to the data directly. The remaining entries point to first-level and second-level and third-level index blocks. Used by Unix.

Indexed Allocation
File Implementation
Combined scheme example (Unix V7), from [Ta01 p.447], modified

Note: The inodes are not disk blocks, but records stored in disk blocks. The single / double / triple indirect blocks are disk blocks.
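How the combined scheme decides which path leads to block k of a file can be sketched as follows. The parameters are assumptions for illustration only: 10 direct entries in the inode and 256 block numbers per indirect block (for example 1 KB blocks with 4-byte block numbers). The function names are hypothetical.

```python
def classify(k, direct=10, per_block=256):
    """Return (level, offset) for logical block k of a file.

    level 0: found through a direct entry of the inode
    level 1, 2, 3: found through the single, double or triple indirect block
    offset: position of the block within the range covered by that level
    """
    if k < 0:
        raise ValueError("block number must not be negative")
    if k < direct:
        return 0, k
    k -= direct
    for level in (1, 2, 3):
        span = per_block ** level           # data blocks reachable at this level
        if k < span:
            return level, k
        k -= span
    raise ValueError("beyond the largest file the scheme can address")


def max_blocks(direct=10, per_block=256):
    """Largest file size in blocks under the assumed parameters."""
    return direct + per_block + per_block**2 + per_block**3


last_direct = classify(9)
first_single = classify(10)
first_double = classify(10 + 256)
first_triple = classify(10 + 256 + 256**2)
largest = max_blocks()
```

Small files are reached through the direct entries without any extra disk access; each indirect level adds one more index block to read, so the level also counts the additional disk accesses.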
