CE 466 System Programming

File System and Processing
Computer Engineering Department Yarmouk University 10/12/2008

1

Files
• A named collection of related info • Consists of a sequence of bits, bytes, lines, or records • For each file, the OS keeps the following info:
• • • • • Name Type Location Size Important dates
2

1

File Access Methods
• Sequential Access

• Direct Access

3

File system
• A file system provides a logical view of the file (data/information)

4

2

File system
• • • • Who maps the user (logical) view of the files to the blocks where the file’s data is stored? If not the OS, the user should!!! The OS provides a mapping between the logical and physical units of storage The OS provides a set of basic operations for file manipulation such as: create, open, read write, ….
5

Layered File System
• The mapping is implemented using a layered approach • Lower levels: device specific I/O control • Intermediate levels:
• Basic file system which issues commands to the appropriate device driver to read and write physical blocks (drive 1, track 2, ..) • File organization module translates logical block address to a physical one
6

3

Layered File System
• Each file’s logical blocks are numbered from 1 to N • Corresponding physical blocks could have any addresses (depends on allocation policy)

• Upper level (logical file system):
• Manages file system structures including:
• Directory structures to provide file organization module • File control block (FCB) that contains info about the file including location of file contents, ownership, permissions
7

Directories
• A file system may be divided into partitions • Each partition will have a directory structure • Directories store information such as name, size, and location of a file • Operations performed on directories include: search for a file, create a file, list a directory, traverse a file system
8

4

Directory Structure
• Collection of nodes that contains information about all files
Directory

Files

F1

F2

F3

F4

Fn

Both directory structures and files reside on disk
9

Directory Design Constraints
• Organize directory for:
• Efficiency in locating files quickly • Convenient naming for users
• 2 users may have the same file names • A file may have multiple names

• Logical grouping of files (all games, all c++ code, …)

10

5

Single-Level Directory
• A single directory for all users

• Naming problem • Grouping problem
11

Two-Level Directory
• Separate directory for each user

• Different users may have the same file name • Efficient searching • Sharing might be hard
12

6

Multi-Level Directory
• Efficient searching

• Grouping Capability
13

General Graph Directory

14

7

Directory Implementation
• Linear list of file names with pointer to the data blocks
• simple to program • time-consuming to execute

• Hash Table – linear list with hash data structure
• decreases directory search time • collisions – situations where two file names hash to the same location • fixed size

15

File System Structures
• On-disk structure: (nonvolatile)
• Boot control block, which contains information needed to boot the OS (boot block in UNIX file system ‘UFS’) • Partition control block, which contains partition details such as number of blocks, size of blocks, free block count, .. (superblock in UFS) • A directory structure used to organize the files • An FCB which contains file details (inode in UFS)
16

8

File System Structures

• • • • •

In memory: (volatile)
Info is used for file-system management and to improve performance Partition table Directory structures of recently accessed directories System-wide open-file table, which includes a copy of the FCB of each open file Per-process open-file table, which contains a pointer to the appropriate FCB in the systemwide open-file table

17

A Typical File Control Block (FCB)

File Permissions File Dates (create, access, write) File Owner, Group File Size File Data Blocks

18

9

Open-File Table
• Saves time since no need to search for the file every time an I/O operation is performed • When a file is opened, its info is added to the open-file table • The operations are performed using the index in the table • Info in the open-file table include: file pointer, open count,…
19

File System Operations (create a file)
• Create new file
• Reads the directory into memory • Knowing the directory structure, the logical file system creates a FCB • Updates the directory structure with the new FCB • Writes it back to disk

20

10

File System Operations (open a file)
• Before a file can be used for I/O, it must be opened:
• Open system call searches the system-wide openfile table to see if the file is already open • If the file is not already open
• Directory structure is searched • When file is found, its FCB is copied into the systemwide open-file table

21

File System Operations (open a file)
• Entry in the per-process open-file table with a pointer to the entry in the system-wide open-file table is added (also a pointer to the current location in the file is added) • The open file command returns a pointer (file descriptor in UFS) to the appropriate entry in the per-process file table • All file operations are performed by this pointer
22

11

Allocation Methods

• Refers to how disk blocks are allocated for files:
• Contiguous allocation • Linked allocation • Indexed allocation

23

Contiguous Allocation

24

12

Contiguous Allocation
• Each file occupies a set of contiguous blocks on the disk • Simple, only starting location and length (number of blocks) are required • Random access? • External fragmentation (dynamic storageallocation problem) • Files hard to grow
25

Modified Contiguous Allocation
• Known as extent-based systems • Initially, an extent (contiguous chunk of space is allocated • Another extent is allocated if need arises • File’s blocks are recorded as a location and block count plus a link to the first block in the next extent
26

13

Linked Allocation

27

Linked Allocation
• Each file is a linked list of disk blocks that may be scattered anywhere on the disk • Simple – need only starting address and no sizedeclaration is necessary • No fragmentation or waste of space • Sequential access only (no random access) • Each block consumes 4 bytes to point to the next block (4B in 512B/B .78% waste) • If one pointer is lost, the file state is not guaranteed
28

14

File-Allocation Table

29

File-Allocation Table (FAT)
• Used by MS-DOS and OS2 • A section of space at the beginning of each partition is set a side to contain the table • The table has entry for each disk block and is indexed by block number • Directory entry contains the block number of the first block of the file • The table entry contains the block number of the next block in the file • The last block of the file has a special end of file entry
30

15

Indexed Allocation

31

Indexed Allocation
• Solves the problem of direct access
• Brings the pointers into one location: the index block • Index block is an array of disk-block addresses • The ith entry in the index block points to the ith block in the file

• Directory contains the address of the index block
32

16

Indexed Allocation
• Each file has an index block
• If file is small few non null pointers (i.e. space is wasted more than with a linked list) • If file is big Index block may not be sufficient to hold enough pointers to file blocks

33

Index blocks and big files
• Linked scheme
• Have as many index block as needed to accommodate a file • Last pointer in the current index block points to the next index block

Index Block 0

Index Block 1

34

17

Index blocks and big files
• Multilevel index
• Use the first-level index block to point to a set of second-level index blocks • Second-level index block point to the file blocks • This approach could continue to a third of fourth level • 4 KB blocks can store 1 K of 4B pointers in an index block Two level of indexes allow 1M 4B pointers 4GB file
35

Multilevel index

M

outer-index

index table

file
36

18

Index blocks and big files
• Combined scheme
• Use a combination of direct blocks and indirect blocks • The indirect blocks may use multilevel index • UNIX use direct blocks, indirect blocks, double and triple indirect blocks

37

Combined Scheme: UNIX (4K bytes per block)

38

19

Free-Space Management
• Bit vector (n blocks)
… 678
bit[i] = 1 ⇒ block[i] free 0 ⇒ block[i] occupied

• Block number calculation
(number of bits per word) * (number of 0-value words) + offset of first 1 bit
39

Linked Free Space List on Disk

40

20

Standard ‘C’ File Read and Write
• Uses standard C library's input and output functions • Portable across all operating systems • Buffers read and write operations, making file operations faster and more efficient • Based on FILE data type (also called stream):
• Location in file (where to read or write next) • Read and write buffers
41

Opening Files
• To work with a file, we must open it first, using fopen()
• Returns a (FILE *) on success or NULL on failure
FILE* f_read; FILE* f_write; FILE* f_readwrite; FILE* f_append; f_read = fopen("/home/choo/data.txt", "r"); f_write = fopen("logfile", "w"); f_readwrite = fopen("/usr/local/lib/db/users", "r+"); f_append = fopen("/var/adm/messages", "a");

42

21

Closing Files
• When done working with the file, we need to close it using fclose()
• returns 0 on success and -1 on failure

• Closing a file does the following:
• Flushes unsaved changes to disk (OS disk cache) • Frees file descriptor and other file resources
if (!fclose(f_readwrite)) { perror("Failed closing file'/usr/local/lib/db/users':"); exit(1); }
43

Reading from a File
int c; char buf[201]; /*read a single character from the file*/ c = fgetc(f_read); /* read one line from the file */ fgets(buf, 201, stdin); /* place the given character back into file stream */ ungetc(c, stdin); /* check if the read/write head has reached EOF */ if (feof(f_read)) { printf("End of file reached\n"); } /* read one block of 120 characters */ if (fread(buf, 120, 1, f_read) != 1) { perror("fread"); }
44

22

Writing to a File
int c; char buf[201]; /* write the character 'a' to the given file. */ c = 'a'; fputc(c, f_readwrite); /* write the string "hello world" to the given file. */ strcpy(buf, "hello world"); fputs(buf, f_readwrite); /* write the string "hi there, mate" to the (screen) */ fprintf(stdout, "hi there, mate\n"); /* write out any buffered writes to the given file stream. */ fflush(stdout); /* write twice the string "hello, great world!\n" */ /* third parameter to fwrite():the number of blocks to write). */ strcpy(buf, "hello, great world. we feel fine!\n"); if (fwrite(buf, strlen(buf), 2, f_readwrite) != 2) { perror("fwrite"); }

45

Moving Read/Write Location
/* move the read/write pointer to position '30' */ /* first position in the file is '0', not '1'. */ fseek(f_read, 29L, SEEK_SET); /* move the read/write pointer of the file stream 25 characters*/ /* forward from its given location. */ fseek(f_read, 25L, SEEK_CUR); /* remember the current read/write pointer's position, /* move it to location '520' in the file, write the string /* "hello world", and move the pointer back to the /* previous location. */ long old_position = ftell(f_readwrite); if (old_position < 0) { perror("ftell"); exit(0); } if (fseek(f_readwrite, 520L, SEEK_SET) < 0) { perror("fseek(f_readwrite, 520L, SEEK_SET)"); exit(0); } fputs("hello world", f_readwrite); if (fseek(f_readwrite, old_position, SEEK_SET) < 0) { perror("fseek(f_readwrite, old_position, SEEK_SET)"); exit(0);}
46

23

Moving Read/Write Location (other functions)
#include <stdio.h> int fseek(FILE *stream, long offset, int whence); long ftell(FILE *stream); void rewind(FILE *stream); int fgetpos(FILE *stream, fpos_t *pos); int fsetpos(FILE *stream, fpos_t *pos);

47

Accessing Files Using System Calls
• • • • • Accessing files is done best using the standard C library functions Sometimes, more low-level operations are needed such as checking file permissions Unix treats different devices similar to accessing files Sometimes called “low level I/O” This form of I/O is UNBUFFERED i.e. each read/write request results in accessing disk (or device) directly to fetch/put a specific number of bytes.
48

24

File Descriptors
• A File descriptor is the basic system object to manipulate files • A File descriptor is an +ve integer used to access a memory area containing data about the open file • It is different for different files to uniquely identify a file

49

Creating Files
• To create a file use the creat() system call
int creat(char *file_name, int mode)

50

25

Creating Files
/usr/include/sys/stat.h:
#define S_IRWXU 0000700 /* -rwx------ */ #define S_IREAD 0000400 /* read permission, owner */ #define S_IRUSR S_IREAD #define S_IWRITE 0000200 /* write permission, owner */ #define S_IWUSR S_IWRITE #define S_IEXEC 0000100 /* execute/search permission, owner */ #define S_IXUSR S_IEXEC #define S_IRWXG 0000070 /* ----rwx--- */ #define S_IRGRP 0000040 /* read permission, group */ #define S_IWGRP 0000020 /* write " " */ #define S_IXGRP 0000010 /* execute/search " " */ #define S_IRWXO 0000007 /* -------rwx */ #define S_IROTH 0000004 /* read permission, other */ #define S_IWOTH 0000002 /* write " " */ #define S_IXOTH 0000001 /* execute/search " " */

51

Opening Files
• int open(char *filename, int flag, int perms)
• • Returns a file descriptor or -1 for when it fails O_APPEND, O_CREAT, O_EXCL, O_RDONLY, O_RDWR, O_WRONLY + others see online man pages or reference manuals

int fd_read; int fd_write; int fd_readwrite; int fd_append; /* Open the file /etc/passwd in read-only mode. */ fd_read = open("/etc/passwd", O_RDONLY); if (fd_read < 0) { perror("open"); exit(1); }
52

26

File Operations
• int close(int handle) • int read(int handle, char *buffer, unsigned length) • int write(int handle, char *buffer,unsigned length) • Length: is used to specify the number of bytes to be read or written into a file • The sizeof() is commonly used to specify the length

53

Example
/* program to read a list of floats from a binary file first byte of file is an integer saying how many floats in file. Floats follow after it */ /* command line */ #include<stdio.h> #include<fcntl.h> float bigbuff[1000];
54

27

Example
main(int argc, char **argv) { int fd; int bytes_read; int file_length; if ( (fd = open(argv[1],O_RDONLY)) = -1) { /* error file not open */.... perror("Datafile"); exit(1); } if ( (bytes_read = read(fd,&file_length, sizeof(int))) == -1) { /* error reading file */... exit(1); } if ( file_length > 999 ) {/* file too big */ ....} if ( (bytes_read = read(fd,bigbuff,file_length*sizeof(float))) == -1) { /* error reading open */... exit(1); } }
55

fsync()
• • int fsync(int fildes); To ensure that the file on the physical disk gets updated immediately

#include <unistd.h>/*declaration of fsync()*/ . . . . if (fsync(fd) == -1) { perror("fsync"); }

56

28

access()
• int access(const char *path, int amode);

• The return value is zero if the requested access is possible and -1 otherwise. • The parameter amode can take one of the following four symbolic values.
• • • • R_OK Test for read permission W_OK Test for write permission X_OK Test for execute permission F_OK Test for existence

• Check: stat() and chmod()
57

Other System Call
• off_t lseek(int fildes, off_t offset, int whence);

• lseek - reposition read/write file offset • int rename(const char *old, const char *new); • int unlink(const char *path);

58

29

Reading the Contents of a Directory
• DIR structure for directory reading is what the FILE structure is for files • contains information used by other calls to read the contents of the directory • When reading the contents of a directory, we need to open a directory (returns a DIR structure). • The data regarding a directory entry is returned in a dirent structure. • The only relevant field in this structure is d_name,
• A null-terminated character array, containing the name of the entry (a file or a directory).
59

Opening And Closing A Directory
#include <dirent.h> /* struct DIR, struct dirent, opendir().. */ /* open the directory "/home/users" for reading. */ DIR* dir = opendir("/home/users"); if (!dir) { perror("opendir"); exit(1); } When we are done reading from a directory, we can close it using the closedir() function: if (closedir(dir) == -1) { perror("closedir"); exit(1); }
60

30

Reading The Contents Of A Directory
/* this structure is used for storing the name of each entry in turn. */ struct dirent* entry; /* read the directory's contents, print out the name of each entry. */ printf("Directory contents:\n"); while ( (entry = readdir(dir)) != NULL) { printf("%s\n", entry->d_name); }

61

31

Sign up to vote on this title
UsefulNot useful