
Case study Topic name

Case Study Work submitted to

Visvesvaraya Technological University


in partial fulfilment of the requirements
for the award of degree of

Bachelor of Engineering
in
Computer Science & Engineering
Submitted by

Subhashini D (USN: 1KG19CS098)

Vadde Sneha (USN: 1KG19CS109)

Under the Guidance of

Mr. Santosh Kumar S


Assistant Professor
Department of CSE
KSSEM, Bengaluru

Department of Computer Science & Engineering


K.S. School of Engineering and Management
No. 15, Mallasandra, off Kanakapura Road, Bengaluru-560109
2021-22
ABSTRACT
A file system is a logical collection of files on a partition or disk. A partition is a
container for information and can span an entire hard drive if desired.
A hard drive can hold several partitions, each of which usually contains
only one file system, such as one housing the / (root) file system and
another containing the /home file system.
One file system per partition allows for the logical maintenance and management of
differing file systems.
Everything in Unix is considered to be a file, including physical devices such as
DVD-ROMs, USB devices, and floppy drives.
The Unix file system is one of the principal reasons for the success of Unix. From a
user and application point of view, the system is remarkably simple, uniform, and
easy to use. Perhaps more surprising, the file system abstraction is maintained even
within the kernel, that is, operating system routines not directly concerned with a
particular class of file system can (and do) treat files and directories uniformly using
the low-level Unix I/O interface.
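
The uniform interface can be illustrated from user space. The following is a minimal sketch, assuming a Unix-like system and Python's standard os module; the helper names roundtrip, via_pipe and via_regular_file are hypothetical and chosen only for this illustration. It drives a pipe and a regular disk file with exactly the same write and read calls.

```python
import os
import tempfile

def roundtrip(write_fd, read_fd, payload):
    """Write payload to one descriptor and read it back from another."""
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))

def via_pipe(payload):
    # A pipe is not a disk file, yet it is driven by the same calls.
    read_end, write_end = os.pipe()
    try:
        return roundtrip(write_end, read_end, payload)
    finally:
        os.close(read_end)
        os.close(write_end)

def via_regular_file(payload):
    # A regular file on disk: one descriptor writes, a second one reads.
    write_fd, path = tempfile.mkstemp()
    try:
        read_fd = os.open(path, os.O_RDONLY)
        try:
            return roundtrip(write_fd, read_fd, payload)
        finally:
            os.close(read_fd)
    finally:
        os.close(write_fd)
        os.unlink(path)

if __name__ == "__main__":
    print(via_pipe(b"hello"))
    print(via_regular_file(b"hello"))
```

Neither helper needs to know what kind of object sits behind the descriptor, which is the point the paragraph above makes about the kernel's own routines.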

This flexibility and uniform simplicity is a testament to soundly engineered interfaces
and abstractions, but is also due in large measure to the use of an object-oriented
approach to system building. None of the kernel code is written in an object-oriented
language of course: it's all still C. But, as we shall see, much of the file system code
uses hand-coded objects and methods.

The abstraction layer that presents file systems to the rest of the kernel and thence
to user programs is called the Virtual File System (VFS). We shall look at the key data
structures (objects) in the VFS below.

Though the Unix virtual file system abstraction is applied to many devices, the
easiest way to understand it is to go back to its origins as a model for organizing
information on a disk. From this point of view, each partition on a disk is looked upon
as a separate disk.
TABLE OF CONTENTS

Introduction

History

Features

Architecture

File Operations

Directory Hierarchy

Processes

Security

Conclusion
INTRODUCTION
The Unix file system can be defined as a framework that organizes and
stores large volumes of data so that they can be handled with ease. Its
basic element is the file: a collection of related data that is viewed
logically as a stream of bytes, together with attributes that hold
information about the file. The file system has two main components,
files and directories. The entire system follows a hierarchy in which
directories act as special files that contain other files. The
highest-level directory in the hierarchy is termed the root and is
denoted by ‘/’. There can be many subdirectories under this
directory.
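
The idea of a file as a byte stream plus attributes can be made concrete with a short sketch. It assumes Python's standard os and stat modules on a Unix-like system; the function names describe and demo and the file name note.txt are hypothetical. It asks the file system for a few attributes of a small file and of the directory that contains it.

```python
import os
import stat
import tempfile

def describe(path):
    """Return a few of the attributes the file system keeps for path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "inode": info.st_ino,
        "is_directory": stat.S_ISDIR(info.st_mode),
        "permissions": stat.filemode(info.st_mode),
    }

def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "note.txt")
        with open(path, "wb") as f:
            f.write(b"hello")  # the content is simply five bytes
        return describe(path), describe(directory)

if __name__ == "__main__":
    file_info, directory_info = demo()
    print(file_info)
    print(directory_info)
```

The same call works on both objects, which reflects the statement that a directory is a special kind of file.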
A Unix file system usually has the following directories directly under
the root:
bin: Short for binaries. This directory stores the commonly used
executable commands.
mnt: Traditionally used as a mount point for temporarily mounted
devices and file systems.
root: The root user's home directory.
tmp: Storage for temporary files. Because they are temporary, they
are removed periodically from the filesystem.
usr: Contains additional executable commands together with their
libraries and documentation.
home: Contains the home directories and files of ordinary users.
proc: Contains files that are related to system processes.
HISTORY
Early versions of Unix filesystems were referred to simply as FS. FS
only included the boot block, superblock, a clump of inodes, and the
data blocks. This worked well for the small disks early Unixes were
designed for, but as technology advanced and disks grew larger,
moving the head back and forth between the clump of inodes and the
data blocks they referred to caused thrashing. Marshall Kirk
McKusick, then a Berkeley graduate student, optimized this layout to
create the FFS (Fast File System) of BSD 4.2 by inventing cylinder
groups, which break the disk up into smaller chunks, with each group
having its own inodes and data blocks.
The intent of BSD FFS is to try to localize associated data blocks
and metadata in the same cylinder group and, ideally, all of the
contents of a directory (both data and metadata for all the files) in the
same or nearby cylinder group, thus reducing fragmentation caused
by scattering a directory's contents over a whole disk.
Some of the performance parameters in the superblock included the
number of tracks and sectors, disk rotation speed, head speed, and
alignment of the sectors between tracks. In a fully optimized system,
the head could be moved between close tracks to read scattered
sectors from alternating tracks while waiting for the platter to spin
around.
As disks grew larger and larger, sector-level optimization became
obsolete (especially with disks that used linear sector numbering and
variable sectors per track). With larger disks and larger files,
fragmented reads became more of a problem. To combat this, BSD
originally increased the filesystem block size from one sector to 1 K
in 4.0 BSD; and, in FFS, increased the filesystem block size from 1 K
to 8 K. This has several effects. The chance of a file's sectors being
contiguous is much greater. The amount of overhead to list the file's
blocks is reduced, while the number of bytes representable by any
given number of blocks is increased.
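
The effect of the block size on bookkeeping overhead is simple arithmetic. The sketch below is an illustration only: the 1,000,000-byte file is an arbitrary example and the function names are hypothetical. It counts how many block addresses a file needs at 1 K and at 8 K block sizes.

```python
K = 1024

def blocks_needed(file_size, block_size):
    """Number of blocks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)

def unused_bytes(file_size, block_size):
    """Bytes left unused at the end of the last allocated block."""
    return blocks_needed(file_size, block_size) * block_size - file_size

for block_size in (1 * K, 8 * K):
    count = blocks_needed(1_000_000, block_size)
    slack = unused_bytes(1_000_000, block_size)
    print(f"{block_size // K} K blocks: {count} block addresses, "
          f"{slack} bytes unused in the last block")
```

For this example the file needs 977 block addresses at 1 K but only 123 at 8 K, at the cost of more unused space in the final block (448 bytes versus 7616 bytes).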
FEATURES

When you use BPAM (the basic partitioned access method on z/OS) to
access a UNIX directory, it appears to the program as a PDS or PDSE
directory. A UNIX directory is divided into sequentially organized
files (members), each described by a directory entry. You can use the
BLDL and FIND macros to search a UNIX directory. You can code the
path name with or without a trailing slash.
When accessed in this way, UNIX files have the following
characteristics:
* BPAM treats UNIX files as members.
* Files can be regular files, character special files, hard or soft
(symbolic) link files, or named pipes.
* Each UNIX file has a unique name of 1 to 8 characters. File names
are case-sensitive.
* You can use BSAM or QSAM to read individual UNIX files in a
directory.
* You can add, rename, or delete UNIX members in a directory, but
not through BPAM.
ARCHITECTURE
The Unix file system has a hierarchical (or tree-like) structure with its
highest level directory called root (denoted by /, pronounced slash).
Immediately below the root level directory are several subdirectories,
most of which contain system files. Below this can exist system files,
application files, and/or user data files. Similar to the concept of the
process parent-child relationship, all files on a Unix system are related
to one another. That is, files also have a parent-child existence. Thus,
all files (except one) share a common parental link, the top-most file
(i.e. /) being the exception.
Below is a diagram (slice) of a "typical" Unix file system. As you can
see, the top-most directory is / (slash), with the directories directly
beneath being system directories. Note that as Unix implementations
and vendors vary, so will this file system hierarchy. However, the
organization of most file systems is similar.
While this diagram is not all inclusive, the following system files (i.e.
directories) are present in most Unix filesystems:
bin - short for binaries, this is the directory where many commonly
used executable commands reside
dev - contains device-specific files
etc - contains system configuration files
home - contains user directories and files
lib - contains shared library files
mnt - traditionally used as a mount point for temporarily mounted
devices
proc - contains files related to system processes
root - the root user's home directory (note this is different from /)
sbin - system binary files reside here. If there is no sbin directory on
your system, these files most likely reside in etc
tmp - storage for temporary files which are periodically removed from
the filesystem
usr - also contains executable commands
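
Because the hierarchy varies between implementations, it can be useful to check which of these directories exist on a given machine. The following is a small sketch using Python's os.path module; the function name present_directories is hypothetical, and the list of names is taken from the text above.

```python
import os

COMMON_DIRECTORIES = ("bin", "dev", "etc", "home", "lib", "mnt",
                      "proc", "root", "sbin", "tmp", "usr")

def present_directories(names=COMMON_DIRECTORIES, root="/"):
    """Map each name to True if root/name exists as a directory."""
    return {name: os.path.isdir(os.path.join(root, name)) for name in names}

if __name__ == "__main__":
    for name, found in present_directories().items():
        print(f"/{name:<5} {'present' if found else 'absent'}")
```

The result differs from system to system, which is consistent with the note that vendors organize the hierarchy slightly differently.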
FILE OPERATIONS
BASIC FILE SYSTEM OPERATIONS
By considering the fundamental operations we can carry out using file
systems, we can see the inherent issues associated with their
implementation.
INITIALIZATION:
We must be able to turn a newly allocated region of a disk into a file
system or part of a file system. In any file system there must be at
least one fixed location structure. In more traditional Unix file systems
there are many fixed location structures. These must be laid out in
order to allow one to manipulate files. We must also create an
empty root (top-level) directory within the file system.
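
Initialization can be sketched with a toy on-disk layout. Everything here is an assumption made for illustration: the 512-byte block size, the TOYF signature, the four-field superblock and the convention that an all-zero block is an empty directory do not correspond to any real Unix file system.

```python
import struct

BLOCK_SIZE = 512                       # assumed block size for the toy layout
MAGIC = b"TOYF"                        # hypothetical signature
SUPERBLOCK = struct.Struct("<4sIII")   # magic, block size, block count, clean flag

def format_image(block_count):
    """Return a bytearray holding a freshly initialized toy file system."""
    if block_count < 2:
        raise ValueError("need room for a superblock and a root directory")
    image = bytearray(BLOCK_SIZE * block_count)
    # Block 0 is the fixed-location structure that every later operation
    # starts from.
    SUPERBLOCK.pack_into(image, 0, MAGIC, BLOCK_SIZE, block_count, 1)
    # Block 1 is the root directory; in this toy layout an all-zero block
    # means "no entries", so nothing more needs to be written.
    return image

def read_superblock(image):
    """Decode the superblock, rejecting images that were never formatted."""
    magic, block_size, block_count, clean = SUPERBLOCK.unpack_from(image, 0)
    if magic != MAGIC:
        raise ValueError("not a toy file system image")
    return {"block_size": block_size, "block_count": block_count,
            "clean": bool(clean)}

if __name__ == "__main__":
    print(read_superblock(format_image(8)))
```

The superblock sits at a known offset so that later operations such as mounting can find it without any other information.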
MOUNTING:
Mounting is the act of making a physical file system representation on
some medium available for use by the operating system and programs
it may run. To do this, we must read the file system metadata from the
medium.

Sometimes the system to be mounted may be inconsistent due to
physical damage or due to premature termination of file system
processes when it was last mounted. Most file systems nowadays will
make a record of whether or not they are clean, that is whether or not
proper shutdown occurred and the state of the file system can be
considered to be consistent.

If a file system is known to be inconsistent, it must be checked (at
least) and repaired (if necessary and possible). It is possible that a file
system not marked as consistent will in fact be consistent, but this is
not the norm.

Journalling can allow us to more easily repair any inconsistencies that
may occur.
Even file systems marked as consistent may not be. The amount of
work needed to completely verify the consistency of a file system,
however, can be significant. The trade-off a file system makes
between how scrupulously to verify consistency and how quickly to
start up is an important engineering decision.

After verifying consistency of the file system, file system metadata
will be read and stored in active memory. Tables maintained in the
running operating system cache the file system metadata so that it is
more easily accessible.

UNMOUNTING:
The primary task achieved by unmounting is to flush the cached
metadata so that the representation on disk will be consistent and to
mark the system as being clean.
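
The mount and unmount cycle described above can be sketched with an in-memory model. This is a deliberately simplified illustration, not how any real kernel implements it: the class ToyFileSystem, its methods and the single "files" counter are all hypothetical, and check_and_repair stands in for a real consistency checker.

```python
class ToyFileSystem:
    """In-memory stand-in for an on-disk file system with a clean flag."""

    def __init__(self):
        self.disk = {"clean": True, "metadata": {"files": 0}}  # "on-disk" state
        self.cache = None                                      # in-memory copy

    def check_and_repair(self):
        # Placeholder for a real consistency check and repair pass.
        self.disk["clean"] = True

    def mount(self):
        if not self.disk["clean"]:
            self.check_and_repair()
        self.cache = dict(self.disk["metadata"])  # read metadata into memory
        self.disk["clean"] = False                # not clean again until unmounted

    def create_file(self):
        self.cache["files"] += 1                  # the change lives only in the cache

    def unmount(self):
        self.disk["metadata"] = dict(self.cache)  # flush the cached metadata
        self.disk["clean"] = True                 # record the orderly shutdown
        self.cache = None

    def crash(self):
        self.cache = None                         # cached changes are lost

if __name__ == "__main__":
    fs = ToyFileSystem()
    fs.mount()
    fs.create_file()
    fs.unmount()
    print(fs.disk)
```

If crash is called instead of unmount, the clean flag stays false and the cached change never reaches the disk, so the next mount knows that a check is required.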
DIRECTORY HIERARCHY
The Unix file system is a hierarchy. That is, it can be viewed as a tree
structure. Sub directories (folders) appear as branches emanating from
their parent directories (the folder containing the folder). The tree
allows for only one parent for each sub directory, but a parent
directory may contain many sub directories. The following figure
shows a tree diagram based on the WSU Unix file system. Vertical
lines represent the contents of a directory. A horizontal line indicates
that the sub directory is contained in the directory represented by the
vertical line it intersects. In the diagram, wsunix and www are both
contained in users (they are siblings).
THE ROOT DIRECTORY:
The root directory is represented by the symbol / . Note: the / symbol
is also used to separate names in a list of directory names called a
directory path name (or just path name). We will discuss path names
next.

PATH NAMES:
Path names are used to describe the location of directories and files in
the file system hierarchy. A path name is essentially a description of
the directories that must be passed through to get to a particular
directory. There are two ways to write path names. One way is to
refer to the desired directory by giving a path name that starts at the
root. Such a path name is said to be absolute. Another way is to refer
to the desired directory by giving a path name that starts at the user's
current working directory. This kind of path name is called relative.

ABSOLUTE PATH NAMES:


Absolute path names describe a path from the root to a particular
directory (folder). The first symbol in an absolute path name
is always the root symbol. Each directory that precedes the desired
directory is listed in turn. Additional slash characters are used to
separate the names in the list.
For example: the following hierarchy contains a directory
called mydir. The absolute path name for this directory
is: /users/wsunix/your-account-name/mydir The root contains users,
users contains wsunix, wsunix contains your-account-name, and
your-account-name contains the directory mydir.
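
The structure of this absolute path name can be inspected with Python's pathlib module. This is a small sketch that only manipulates the name itself; it does not require the directory to exist.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/users/wsunix/your-account-name/mydir")

print(path.is_absolute())   # True: the path starts at the root
print(path.parts)           # ('/', 'users', 'wsunix', 'your-account-name', 'mydir')
print(path.parent)          # /users/wsunix/your-account-name
print(path.name)            # mydir
```

The first element of parts is the root symbol, and each following element is one directory that must be passed through to reach mydir.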
THE HOME DIRECTORY:
Multi-user Unix environments assign each user a home directory. On WSU Unix your home
directory is /users/wsunix/your-account-name , where your-account-name is the account
name you use to log in to WSU Unix. Unix systems usually record your home directory in
the environment variable HOME, which the shell expands when you write $HOME.
WSU Unix will display the absolute path name of your home directory if you issue the
command echo $HOME regardless of where you are working in the hierarchy.
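
A program can read the same value that echo $HOME displays. The sketch below assumes Python's os module; the function name home_directory and its optional env parameter are hypothetical and exist only to make the lookup easy to try out.

```python
import os

def home_directory(env=None):
    """Return the home directory recorded in the HOME environment variable."""
    env = os.environ if env is None else env
    return env.get("HOME")

if __name__ == "__main__":
    print(home_directory())          # the value the shell prints for: echo $HOME
    print(os.path.expanduser("~"))   # usually the same path
```

The result does not depend on the current working directory, matching the behaviour described above.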
THE WORKING DIRECTORY:
In Unix, one directory is always the working (or current) directory. Think of it as the
currently open folder. The Unix command pwd (print working directory) displays the
absolute path name of this directory. When you first log on to a Unix system, your home
directory is your working directory.
. (dot) is the symbol used to represent the working directory; however, the dot rarely needs
to be typed in commands. The parent directory of the working directory is represented
symbolically as .. (double dot). The double-dot notation is not normally used in absolute
path names.
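
The meaning of the dot and double-dot symbols can be checked from a program. This is a minimal sketch using Python's os module; the function name working_directory_views is hypothetical.

```python
import os

def working_directory_views():
    """Show how . and .. relate to the current working directory."""
    cwd = os.getcwd()                       # the directory pwd reports
    return {
        "cwd": cwd,
        "dot": os.path.abspath("."),        # . names the working directory
        "dotdot": os.path.abspath(".."),    # .. names its parent
        "parent": os.path.dirname(cwd),
    }

if __name__ == "__main__":
    for label, value in working_directory_views().items():
        print(f"{label:<7} {value}")
```

In the output, the dot entry matches the working directory and the double-dot entry matches its parent.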

RELATIVE PATH NAMES:

Relative path names always start at the working directory. They never start with the root
symbol. Before looking at some examples, you must understand that when traversing down a
tree, you use the name of the directory you are passing through; when traversing up the tree,
you use the .. (double dot) symbol for each parent directory.
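
Resolving a relative path name against a working directory can be sketched with Python's posixpath module. The function name resolve is hypothetical, and the directories are the ones used as examples in this section; the code only manipulates names and ignores symbolic links.

```python
import posixpath

def resolve(working_directory, relative_path):
    """Turn a relative path into an absolute one by name manipulation only."""
    return posixpath.normpath(posixpath.join(working_directory, relative_path))

cwd = "/users/wsunix/your-account-name"

print(resolve(cwd, "mydir"))       # /users/wsunix/your-account-name/mydir
print(resolve(cwd, ".."))          # /users/wsunix
print(resolve(cwd, "../../www"))   # /users/www
```

Going down uses a directory name, while each .. moves one level up, so ../../www climbs from your-account-name to users and then descends into www.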
PROCESSES
In computer science, a File Processing System (FPS) is a way of
storing, retrieving and manipulating data that is held in various
files.
Files are used to store documents. Files are grouped by category, and
related files are named and arranged so that they are easy to find. In
a file processing system, anyone who needs to insert, delete, modify,
store or update data must know the hierarchy of the files involved.

Advantages of File Processing System:

Cost friendly:
Set-up and usage costs are minimal or zero. (In most cases, the
necessary tools are built into the computer.)

Easy to use:
File systems require only basic learning and understanding, and hence
can be used easily.

High scalability:
One can easily move from smaller to larger files as needs grow.

Disadvantages of File Processing System:

Slow access time:
Direct access to files is difficult: one needs to know the hierarchy of
folders that leads to a specific file, and navigating it takes time.

Presence of redundant data:
The same data can be present in two or more files, which takes up
more disk space.

Inconsistent data:
Because of data redundancy, copies of the same data stored in
different places might not match each other.

Data integrity problems:
Stored data should be consistent and correct, which means it must
satisfy certain constraints. A file processing system does not enforce
such constraints by itself.

Difficulty in recovery of corrupt data:
Recovering lost or corrupt data is very difficult in a file processing
system unless separate backups are kept.

Lack of atomicity:
Operations on the data should be atomic, i.e. either the operation
takes place as a whole or does not take place at all. A file processing
system does not guarantee this by itself.

Problem in concurrent access:
When several users operate on common data at the same time,
anomalies arise because there is no concurrency control.
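
The atomicity problem can be reduced for whole-file updates with a common technique: write the new contents to a temporary file and then rename it over the original. The sketch below assumes a POSIX system, where a rename within one file system replaces the target in a single step; the function name atomic_write is hypothetical.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the contents of path so readers see the old or the new data."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created next to the target so that the final
    # rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())     # ask the OS to push the bytes to storage
        os.replace(tmp_path, path)     # rename over the target in one step
    except BaseException:
        os.unlink(tmp_path)            # clean up the partial temporary file
        raise
```

This covers only the replacement of a single file. It does not provide concurrency control between several writers, which still needs locking or a database system.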
SECURITY
File system security is about making sure your users can only do
what you want them to be able to do. This means that you want
system programs to be secure and users to only be able to write where
you want them to be able to do so.
NFS Security:
Run NFS only when it is needed, and apply the latest patches. When
creating your /etc/exports file, be certain to use limited access flags
when possible, such as readonly or nosuid. Using fully qualified
hostnames helps ensure that only the hosts you intend can access the
filesystem.

Device Security:
Device files /dev/null, /dev/tty & /dev/console should be world
writeable but NEVER executable. Most other device files should be
unreadable and unwriteable by regular users.
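
The permission rule for device files can be checked by inspecting mode bits. The following is a sketch using Python's os and stat modules; the function names are hypothetical, and the three device paths are the ones named in the text.

```python
import os
import stat

def world_writable(mode):
    """True if users outside the owner and group may write."""
    return bool(mode & stat.S_IWOTH)

def executable_by_anyone(mode):
    """True if any of the three execute bits is set."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))

def audit_device(path):
    """Report the two properties the text asks for on a device file."""
    mode = os.stat(path).st_mode
    return {
        "path": path,
        "permissions": stat.filemode(mode),
        "world_writable": world_writable(mode),
        "executable": executable_by_anyone(mode),
    }

if __name__ == "__main__":
    for device in ("/dev/null", "/dev/tty", "/dev/console"):
        if os.path.exists(device):
            print(audit_device(device))
```

A report with executable set to True for one of these devices would contradict the recommendation above and deserves a closer look.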
Script Security:
Never write setuid/setgid shell scripts, because a user may be able to
break out of them. Instead, write a compiled program in a language
like C. Scripts should ALWAYS use full pathnames for the commands
they run.
Program Security:
Always get your programs from a known source. Verify via checksum
that a program has not been tampered with. If you are compiling your
own program, make sure that the compiler has not been tampered with
either.
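
Checksum verification can be sketched with Python's hashlib module. The choice of SHA-256 is an assumption, since the text does not name an algorithm, and the function names are hypothetical; the expected value must come from the program's publisher through a trusted channel.

```python
import hashlib
import hmac

def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file without loading it all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_checksum(path, expected_hex):
    """Compare the file's digest with the value published by the source."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A mismatch means the file differs from what the publisher released, whether through tampering or a corrupted download, and the program should not be installed.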
General Security Measures:
Create minimal writable filesystems (esp. system files/directories!).
Generally, users should only be able to write in their own directories,
and /tmp. In addition, there will be directories for a specific group to
write in. This way you control how each user can access specific areas
of the system.
Make sure that important files are only accessible by authorized
personnel. Use setuid/setgid only where necessary.
The COPS (Computer Oracle and Password System) audit tool will find
many of these problems.
CONCLUSION
In this case study on the Unix file system we covered several advanced
file management tasks. You know how to list hidden files, entire
directory trees, and directory names. You also learned how to copy
and remove a directory tree. Finally, two new commands, ln and find,
were introduced. With ln, you can create hard links and, with the -s
option, symbolic links, also called aliases on Macintosh or shortcuts
on Windows computers. You also can set search criteria when using
the find command to search the filesystem.
Some of the terms and concepts covered:
1. Recursive listing: A recursive listing of a directory is one that
repeatedly displays all subdirectories down the hierarchy, until the
last level of the directory tree is reached.
2. Symbolic link: A symbolic link is a name that points to another file
or directory. The target of the link can reside on another file system.
3. Link: A link is another name for a file. A link is similar to a
Macintosh alias or a Windows shortcut.
4. Hard link: A hard link is another name for a file.
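
The difference between the two kinds of link, and a recursive listing, can be demonstrated with a short sketch. It assumes a Unix-like system whose temporary directory supports both link types; the function name link_demo and the file names are hypothetical. The calls os.link and os.symlink correspond to ln and ln -s.

```python
import os
import tempfile

def link_demo():
    """Create a hard link and a symbolic link, then remove the original."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(target, "w") as f:
            f.write("data")
        os.link(target, hard)       # like: ln original.txt hard.txt
        os.symlink(target, soft)    # like: ln -s original.txt soft.txt
        result = {
            "hard_shares_inode": os.stat(hard).st_ino == os.stat(target).st_ino,
            "link_count": os.stat(target).st_nlink,
            "soft_is_symlink": os.path.islink(soft),
            "hard_is_symlink": os.path.islink(hard),
        }
        os.remove(target)
        # The hard link still reaches the data; the symbolic link now dangles.
        result["hard_survives"] = os.path.exists(hard)
        result["soft_survives"] = os.path.exists(soft)
        # Recursive listing of every file name below the directory.
        result["names"] = sorted(
            name for _, _, files in os.walk(d) for name in files
        )
        return result

if __name__ == "__main__":
    print(link_demo())
```

The hard link is a second name for the same inode, so the data survives removal of the original name, while the symbolic link only stores a path and stops resolving once that path is gone.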
