Chapter Three

File Systems and Management of Data Storage

File System Administration
• In a computer, a file system -- sometimes written filesystem -- is the
way in which files are named and where they are placed logically for
storage and retrieval.
• Without a file system, stored information wouldn't be isolated into
individual files and would be difficult to identify and retrieve. As data
capacities increase, the organization and accessibility of individual
files are becoming even more important in data storage.
• Digital file systems and files are named for and modeled after paper-
based filing systems using the same logic-based method of storing and
retrieving documents.
• File systems can differ between operating systems (OS), such as
Microsoft Windows, macOS and Linux-based systems. Some file
systems are designed for specific applications. Major types of file
systems include distributed file systems, disk-based file systems and
special-purpose file systems.
• Partitioning
Way of dividing up a disk into usable separate chunks of known size, i.e.,
virtual disks
Is the lowest possible level of disk management
There has to be at least one partition in Linux
• Systems that allow partitions implement them by writing a “label” at the
beginning of the disk
• Used to define the range of blocks included in each partition
• Usually, the label coexists with
Startup information such as boot block
Extra information such as a name or unique ID that identifies the
disk
– The device driver responsible for representing the disk reads the label
and uses the partition table to calculate the physical location of each
partition
– On disks with a classic MBR label, the partition table occupies the 64 bytes
that follow the first 446 bytes, which hold the boot code
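As a rough illustration of the label layout described above, the following sketch computes where each partition entry sits inside the first sector. It assumes the conventional MBR layout (446 bytes of boot code, four 16-byte entries, a 2-byte signature); the function name mbr_entry_offset is made up for this example.

```sh
# Classic MBR layout (assumed): 446 bytes boot code, 4 x 16-byte entries, 2-byte signature.
BOOT_CODE_BYTES=446
ENTRY_BYTES=16
ENTRY_COUNT=4

# Print the byte offset of primary partition entry N (1-4) within sector 0.
mbr_entry_offset() {
    echo $(( BOOT_CODE_BYTES + ($1 - 1) * ENTRY_BYTES ))
}

for n in 1 2 3 4; do
    echo "entry $n starts at byte $(mbr_entry_offset "$n")"
done

# Boot code, partition table and signature together fill one 512-byte sector.
sector_bytes=$(( BOOT_CODE_BYTES + ENTRY_COUNT * ENTRY_BYTES + 2 ))
echo "total: $sector_bytes bytes"
```

Because the table has room for only four 16-byte entries, a fifth partition cannot be described without a workaround.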
• A partition needs 16 bytes of data to be described
– Once four partitions are described, the partition table becomes full
• Extended partition
– Workaround to have more partitions
– Functions as a container for up to 15 additional logical partitions
• Why do we need partitions?
– To protect the system from users
– Dual boot configurations
– For backups

• Swap space
– Should be approximately the same size as the computer’s RAM
• Device entry schemes
– IDE disks
» /dev/hd[drive][partition]
» First disk: primary drive = a, slave = b
» Disk partitions are referenced by number
– SCSI disks
» /dev/sd[drive][partition]
– Entire primary disk
» /dev/hda or /dev/sda
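To make the naming scheme concrete, here is a small sketch that splits a device name into its disk and partition parts. It only covers the /dev/hdX and /dev/sdX forms listed above; the helper names disk_of and partition_of are hypothetical.

```sh
# Split a traditional device name such as /dev/sda2 into disk and partition number.
# Handles hdX/sdX names only; other naming schemes are not covered.
disk_of() { printf '%s\n' "${1%%[0-9]*}"; }        # strip everything from the first digit
partition_of() { printf '%s\n' "${1##*[!0-9]}"; }  # keep only the trailing digits

for dev in /dev/hda1 /dev/sda2 /dev/sda; do
    echo "$dev -> disk=$(disk_of "$dev") partition=$(partition_of "$dev")"
done
```

A name without a trailing number, such as /dev/sda, yields an empty partition field, which matches the convention that it refers to the entire disk.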
• A computer has a fixed amount of physical memory, and running programs often
need more than is available, so some memory contents are swapped out to disk.
• Swap space is space on a hard disk that serves as a substitute for physical memory.
• It is used as virtual memory and holds process memory images.
• Whenever the computer runs short of physical memory, it moves some memory
contents to swap space on disk.
• Swap space lets the operating system behave as if it had more RAM than it
actually has.
• When it is kept in a regular file rather than a partition, it is called a swap file.
• This interchange of data between virtual memory and real memory is called
swapping, and the space on disk is called “swap space”.
• Virtual memory is a combination of RAM and disk space that running processes
can use. 
• Swap space is the portion of virtual memory that is on the hard disk, used
when RAM is full. 
• Swap space can be useful to computers in various ways:
• It can be used as a single contiguous memory which reduces I/O operations to
read or write a file.
• Applications that are not used or are used less can be kept in a swap file.
• Having sufficient swap files helps the system keep some physical memory
free all the time.
• The space in physical memory which has been freed due to swap space can be
used by OS for some other important tasks.
• Operating systems such as Windows and Linux provide a certain amount of swap
space by default, which users can change according to their needs.
• If you do not want to use swap space, you can disable it altogether. If the
system then runs out of memory, however, the kernel will kill some processes
to free enough physical memory.
• Whether to use swap space is therefore up to the user.
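One way to see how much RAM and swap a Linux system has is to read /proc/meminfo. The sketch below does this with a hypothetical helper, meminfo_kib, and runs it against a small sample file so that it works on any machine; the sample values are invented.

```sh
# Print the value in KiB of one field (e.g. MemTotal, SwapTotal) from a
# meminfo-style file. Defaults to /proc/meminfo, which exists on Linux.
meminfo_kib() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Sample data (made up) so the sketch runs anywhere.
sample=$(mktemp)
printf 'MemTotal:  4194304 kB\nSwapTotal: 2097152 kB\n' > "$sample"

ram=$(meminfo_kib MemTotal "$sample")
swap=$(meminfo_kib SwapTotal "$sample")
echo "RAM: $ram KiB, swap: $swap KiB"
if [ "$swap" -lt "$ram" ]; then
    echo "swap is smaller than RAM"
fi
rm -f "$sample"
```

On a live Linux system, calling meminfo_kib SwapTotal without a file argument reads the real values; free -h and swapon --show report similar information.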
• fdisk
• It is a basic command-line partitioning tool
• It can be used to list, create and delete partitions
• For example, to list the partitions on /dev/sda:
$ sudo fdisk -l /dev/sda
Disk /dev/sda: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000c79c5

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          31      248976   83  Linux
/dev/sda2              32         522     3943957+   5  Extended
/dev/sda5              32         522     3943926   8e  Linux LVM
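The "Units" line of such a listing can be checked with simple arithmetic. This sketch assumes the geometry of a 4 GiB example disk (255 heads, 63 sectors per track, 512-byte sectors, 522 cylinders); the variable names are chosen for this example only.

```sh
# Assumed geometry: 255 heads, 63 sectors/track, 512-byte sectors, 522 cylinders.
HEADS=255
SECTORS_PER_TRACK=63
SECTOR_BYTES=512
CYLINDERS=522

# One cylinder holds heads * sectors-per-track sectors.
cylinder_bytes=$(( HEADS * SECTORS_PER_TRACK * SECTOR_BYTES ))
disk_bytes=$(( cylinder_bytes * CYLINDERS ))

echo "one cylinder = $cylinder_bytes bytes"
echo "$CYLINDERS cylinders = $disk_bytes bytes"
```

The whole cylinders add up to slightly less than 4294967296 bytes, because a partial trailing cylinder is not counted.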
• parted
• It is a fancier command-line tool that understands several label formats
• It can move and resize partitions in addition to simply creating and deleting
them.
• gparted is the GUI version.
Suggestions for partitioning
• Provide a separate partition for the file system containing the /home directory
• Give enough space for the OS
• On server systems, it may make sense to provide separate partitions
for /tmp, /var, and possibly /srv
• You should provide swap space of approximately the same size as the
computer’s RAM
• If there are several (physical) hard disks, it can be useful to spread the system
across the available disks
• Helps to increase the access speed to individual components
• Mounting file systems
Mounting attaches a file system at any point in the directory tree.
To mount a device on a chosen mount point:
mount /dev/device /directory/to/mount
To mount file systems automatically at boot, list them in /etc/fstab
To unmount a file system:
umount /directory/to/mount
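The following sketch shows what /etc/fstab entries typically look like and how the mount points can be read back from such a file. The two entries are hypothetical, the file is a temporary copy rather than the real /etc/fstab, and fstab_mount_points is a helper invented for this example.

```sh
# Hypothetical fstab content: device, mount point, type, options, dump, fsck order.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
# <device>   <mount point>  <type>  <options>  <dump>  <pass>
/dev/sda1    /boot          ext4    defaults   0       2
/dev/sda5    /home          ext4    defaults   0       2
EOF

# Print the mount point of every non-comment, non-empty entry.
fstab_mount_points() {
    awk '$1 !~ /^#/ && NF >= 2 { print $2 }' "$1"
}

mount_points=$(fstab_mount_points "$fstab")
echo "$mount_points"
rm -f "$fstab"

# With root privileges, a single file system would be attached and detached with:
#   mount /dev/sda5 /home
#   umount /home
```

Note that the unmount command is spelled umount, without the first "n".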
Disk usage
du: Provides the total size of all files in a directory
• Has many switches
– Example
-s switch summarizes the total
-h switch displays the size in a human-readable form
$ du -sh /usr/local/bin
4 T /usr/local/bin
df
• Helps to see the total disk space used and free on the host
$ df -h
Filesystem                       Size  Used  Avail  Use%  Mounted on
/dev/mapper/VolGroup00-LogVol01  178G   11G   159G    6%  /
/dev/sda1                         99M   37M    58M   39%  /boot
tmpfs                            910M     0   910M    0%  /dev/shm
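For scripts, the machine-friendly forms du -sk and df -P are easier to parse than the -h output. A minimal sketch, assuming a scratch directory created just for the demonstration; dir_kib and usage_percent are hypothetical helper names.

```sh
# Total size of a directory in KiB (du -k reports 1024-byte units).
dir_kib() { du -sk "$1" | cut -f1; }

# Use% of the file system holding the given path (df -P prints one line per file system).
usage_percent() { df -P "$1" | awk 'NR == 2 { print $5 }'; }

workdir=$(mktemp -d)
head -c 204800 /dev/zero > "$workdir/sample.bin"   # 200 KiB test file

size=$(dir_kib "$workdir")
pct=$(usage_percent "$workdir")
echo "$workdir uses $size KiB; its file system is $pct full"
rm -rf "$workdir"
```

The exact size reported depends on how the underlying file system allocates blocks, so it may differ from the file's length.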
Common commands for file management
• df: summarizes the free disk space by file system
• du: summarizes disk usage by directory
• ln: generates links between files
• tar: archiving utility
• find: finds files or patterns of files
• fsck: attempts to verify that all links and blocks are correctly tied together
• fdisk: manages partitions
• mke2fs: creates ext2/ext3/ext4 file systems
• mkswap: creates a swap area
Managing Disk Quotas

• Disk quota
Soft quota
– Not to be exceeded in the long term, but you can allow users an “overdraft” by setting
the hard quota to a higher value
– Within a certain period of time (the grace period), users must reduce the used space to below the soft quota
– If the hard quota limit is reached or the grace period is over, further write operations will
fail
– Quotas can be assigned per file system
• User quota (ext and XFS)
– Install the quota software
– Mark the file systems where quotas should be enforced by
» Including the usrquota mount option
# mount -o remount,usrquota /home
» Permanently including the option in the file system’s entry in /etc/fstab
/dev/hda5 /home ext2 defaults,usrquota
– The quota database is initialized using
# quotacheck -avu
» Creates the aquota.user database file in the partition’s root directory, in this
case /home/aquota.user
– Start the quota system
– edquota
• Used for setting quotas for various users
• Starts your favorite editor with a template where you can fill in the
soft and hard quotas for the file system
# edquota -u hugo
– Disk usage
# repquota -a
                Block limits                    File limits
User       used   soft   hard  grace     used  soft  hard  grace
root     166512      0      0           19562     0     0
tux        2304  10000  12000             806  1000  2000
hugo          …
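The relationship between usage, soft quota and hard quota can be sketched as a small decision function. This is only an illustration of the rules stated above, not how the kernel implements quotas; quota_state is a hypothetical name, and the third call uses invented numbers.

```sh
# Classify usage against soft and hard limits; a limit of 0 means "no limit".
quota_state() {
    used=$1 soft=$2 hard=$3
    if [ "$hard" -gt 0 ] && [ "$used" -ge "$hard" ]; then
        echo "hard limit reached: writes fail"
    elif [ "$soft" -gt 0 ] && [ "$used" -gt "$soft" ]; then
        echo "over soft quota: grace period running"
    else
        echo "ok"
    fi
}

quota_state 166512 0 0          # root: no limits set
quota_state 2304 10000 12000    # tux: below the soft quota
quota_state 11000 10000 12000   # hypothetical user between soft and hard
```

The sketch leaves out the grace timer itself: once the grace period expires, a user who is over the soft quota is treated as if the hard limit had been reached.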
• Group quotas (ext and XFS)
– Applies to all members of a group
– To make group quotas effective, the quota must be
set to all groups that the users in question are
members of
– Steps
• Mount option for group quotas is grpquota
– Creates the database file aquota.group
• To enable group quotas, use the commands shown for user quotas, replacing
the -u option with -g or adding -g to it
• Example
• # quotaon -auvg
– Activates all types of quota (user and group) on every file system
• $ quota -vg
– Displays group quotas for those groups that you are a member of
Logical Volume Management and RAID
• RAID
– It stands for Redundant Array of Inexpensive (or Independent) Disks
– Is a system that distributes or replicates data across multiple disks
– Advantage
• Helps to avoid data loss
• Minimizes the downtime associated with hardware failures (often to zero)
• Potentially increases performance
– Could be implemented by
• Dedicated hardware that presents a group of hard disks to the OS as a single
composite drive
– Dominant in the past because of
• » Lack of software alternatives (no direct OS support for RAID)
• » Hardware’s ability to buffer writes in some form of nonvolatile memory
• The OS itself (software RAID), which reads and writes multiple disks according to the rules of RAID
– Avoids problems that could arise due to RAID controller (hardware)
failure
– Can do two basic things
• Improve performance by “striping” data across multiple
drives
– Allows several drives to work simultaneously to supply or absorb a
single data stream
• Replicate data across multiple drives
– Decreases the risk associated with a single failed disk
– Two forms of replication
• Mirroring
– Data blocks are reproduced bit-for-bit on several different drives
– Fast but consumes more disk space
• Parity schemes
– One or more drives contain an error-correcting checksum of the
blocks on the remaining data drives
– Disk-space efficient but has lower performance
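To illustrate how a parity scheme can rebuild lost data, the sketch below uses plain XOR parity, which is the usual choice for single-parity arrays. Each "block" is reduced to a single byte with an arbitrary value; real arrays apply the same operation to whole blocks.

```sh
# Three data blocks, each represented here by one byte (values are arbitrary).
a=$(( 0x5A )); b=$(( 0xC3 )); c=$(( 0x0F ))

# Parity is the bitwise XOR of all data blocks in the stripe.
p=$(( a ^ b ^ c ))

# If the drive holding b fails, XOR of the survivors and the parity restores it.
recovered_b=$(( a ^ c ^ p ))

printf 'parity=0x%02X recovered_b=0x%02X\n' "$p" "$recovered_b"
```

The extra XOR work on every write is one reason parity schemes are slower than mirroring, while needing only one extra block per stripe instead of a full copy.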
– Levels
• RAID is traditionally described in terms of “levels”
• Specify the exact details of the parallelism and redundancy implemented by
an array
• Higher levels are not necessarily better
• Levels are just different configurations
• In the following description
– numbers identify stripes
– letters a, b, and c identify data blocks within a stripe
– blocks marked q and p are parity blocks
• Linear mode or just a bunch of disks (JBOD)
– Not a real RAID
– Concatenates the block addresses of multiple drives to create a single
larger virtual disk
– Provides no data redundancy or performance benefit
– These days it is achieved through a logical volume manager rather than
a RAID controller
RAID level 0
• Used strictly to increase performance
• Combines two or more drives of equal size
• Stripes data alternately among the disks in the pool
• Sequential reads and writes are spread among several disks
• Decreases write and access times
• Has lower reliability than separate disks, because the failure of any one drive destroys the array
RAID level 1
• It is the mirroring concept
• Writes are duplicated to two or more drives simultaneously
• Makes writes slower than they would be on a single drive
• Has a read speed comparable to RAID level 0
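The alternating placement used by RAID 0 can be sketched as a simple mapping from a logical block number to a disk and a stripe. The sketch assumes round-robin striping with one block per chunk and disks numbered from 0; stripe_location is a hypothetical helper.

```sh
# Map a logical block number to (disk, stripe) for simple round-robin striping.
# Assumes one block per chunk and disks numbered from 0.
stripe_location() {
    block=$1 ndisks=$2
    echo "disk=$(( block % ndisks )) stripe=$(( block / ndisks ))"
}

for blk in 0 1 2 3 4 5; do
    echo "block $blk -> $(stripe_location "$blk" 3)"
done
```

Consecutive blocks land on different disks, which is why a long sequential transfer can keep all drives busy at once.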

RAID level 1+0 and 0+1
» Are stripes of mirror sets (1+0) or mirrors of stripe sets (0+1)
» Are combinations of RAID 0 and RAID 1
» Are supported by many controllers and software
implementations
» The goal is to obtain the performance of RAID 0 and the redundancy of RAID 1
[Figure: RAID 0+1, a mirror of stripes built from two RAID 0 sets]
RAID level 5
» Stripes both data and parity information, adding redundancy
while simultaneously improving read performance
» In each stripe, writes data blocks to N-1 disks and the parity block to the
remaining disk; the disk that holds parity changes from stripe to stripe
» More efficient in its use of disk space than RAID 1
» If there are N drives in an array (at least three are required),
the capacity of N-1 of them is available for data
» The space efficiency of RAID 5 is therefore at least 67%,
whereas that of mirroring cannot be higher than 50%
RAID level 6
» Similar to RAID 5 but with two parity blocks per stripe
» Can withstand the complete failure of two drives without
losing data
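The space-efficiency figures can be reproduced with a little arithmetic. The sketch below is a rough calculator for equal-sized drives; raid_efficiency is a hypothetical helper, and results are rounded to whole percentages.

```sh
# Usable share of raw capacity, as a rounded percentage.
#   raid_efficiency LEVEL NDRIVES   (levels handled here: 1, 5, 6)
raid_efficiency() {
    level=$1 n=$2
    case $level in
        1) data=1 ;;            # n-way mirror: one drive's worth of data
        5) data=$(( n - 1 )) ;; # one drive's worth of parity
        6) data=$(( n - 2 )) ;; # two drives' worth of parity
        *) echo "unsupported level" >&2; return 1 ;;
    esac
    echo $(( (100 * data + n / 2) / n ))
}

echo "RAID 5, 3 drives: $(raid_efficiency 5 3)%"
echo "RAID 5, 8 drives: $(raid_efficiency 5 8)%"
echo "RAID 1, 2 drives: $(raid_efficiency 1 2)%"
echo "RAID 6, 4 drives: $(raid_efficiency 6 4)%"
```

Efficiency for the parity levels improves as drives are added, whereas a mirror never exceeds 50%.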
• Disk failure recovery
• JBOD and RAID 0 are no help when hardware problems occur
– Recovery is done from backups
– Other forms of RAID enter a degraded mode in which offending devices
are marked as faulty
• RAID 5 or two-disk RAID 1
– Can only tolerate the failure of a single device
– Once a failure has occurred, the array is vulnerable to a second failure
• Drawbacks of RAID 5
– Doesn’t protect against the accidental deletion of files
• Does not protect against controller failures, fires, hackers, or any
number of other hazards
• Doesn’t replace regular off-line backups
– Has weak write performance
• Writes data blocks to N-1 disks and a parity block to the remaining disk in each stripe
• Uses incremental updating
» Each random write expands into four operations: two reads and two writes
– Vulnerable to corruption
• If any block in a stripe falls out of sync with the parity block, it won’t be
detected in normal use
• The parity block may have been rewritten many times since the occurrence of
the original desynchronization
– This problem is known as the “write hole”
– Solution
• Scrubbing
• Validating parity blocks one by one while the array is relatively idle
• Drawbacks also apply to RAID 6
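The incremental update and the scrubbing check can both be sketched with XOR arithmetic. As before, each block is reduced to one byte with arbitrary values, XOR parity is assumed, and stripe_consistent is a hypothetical helper; real implementations work on whole blocks and on every stripe of the array.

```sh
# One stripe with three data blocks (one byte each) and its XOR parity.
d0=$(( 0x11 )); d1=$(( 0x22 )); d2=$(( 0x44 ))
parity=$(( d0 ^ d1 ^ d2 ))

# Incremental update of d1: read old data and old parity (two reads),
# then write new data and new parity (two writes).
new_d1=$(( 0x3C ))
parity=$(( parity ^ d1 ^ new_d1 ))
d1=$new_d1

# Scrub check: recompute parity from the data blocks and compare.
stripe_consistent() {
    [ $(( d0 ^ d1 ^ d2 )) -eq "$parity" ]
}

if stripe_consistent; then echo "stripe in sync"; fi

# Simulate a write hole: data changes but the parity write is lost.
d2=$(( 0x45 ))
if ! stripe_consistent; then echo "scrub found a stale parity block"; fi
```

The incremental update never reads d0 or d2, so a block that has silently drifted out of sync stays unnoticed until a scrub recomputes the parity from all data blocks.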
• mdadm: Linux software for RAID
– The standard software RAID implementation for Linux is
called md, the “multiple disks” driver
– mdadm is md’s front end
– Supports all the RAID configurations
• Logical volume management
• Limitations of partitions
• Partitions are inflexible; Hard to resize partitions
• Adding disks and partitions spread data; Makes consolidating backups
difficult
– Solution; Logical volume management
• Merges one or more partitions or devices into a single logical volume
group
• Allows volumes in a volume group to be created, resized, and deleted
dynamically
– Removes the need to unmount volumes or reboot the system to update
the partition map
• LVM system has three layers
– Physical volumes
• The bottom layer, consisting of disks, partitions, or RAID arrays
– Volume groups
• Created using physical volumes
• Consist of one or more physical volumes
– Logical volumes
• Created from the space within a volume group
• Are the LVM equivalent of partitions
• Can hold arbitrary file systems or swap space
• Logical volume management
– Create the logical volume
• The following creates a logical volume named web1, 100 GB in size, within the volume group DEMO
$ sudo lvcreate -L 100G -n web1 DEMO
Logical volume "web1" created
• The volume can now be accessed through
/dev/DEMO/web1
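The three LVM layers map onto a short sequence of commands. The sketch below prints the sequence as a dry run instead of executing it, since the real commands need root and a spare device. The partition /dev/sdb1 and the mount point /mnt/web1 are hypothetical; DEMO and web1 follow the example above.

```sh
# Dry run: print each command instead of executing it.
# Swap the body for "$@" (and run as root) to apply the steps for real.
run() { echo "+ $*"; }

# /dev/sdb1 and /mnt/web1 are hypothetical; DEMO and web1 follow the text above.
lvm_demo() {
    run pvcreate /dev/sdb1               # mark the partition as a physical volume
    run vgcreate DEMO /dev/sdb1          # build volume group DEMO from it
    run lvcreate -L 100G -n web1 DEMO    # carve out a 100 GB logical volume
    run mkfs.ext4 /dev/DEMO/web1         # put a file system on the volume
    run mount /dev/DEMO/web1 /mnt/web1   # attach it to the directory tree
}

lvm_demo
```

Keeping the commands behind a run wrapper makes it easy to review the plan before anything touches a disk.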
