You are on page 1of 45

Hello!

Network and System Administration


Kenesa B. (getkennyo@gmail.com)
Ambo University, Hachalu Hundessa Campus
Computer Science Dept
April, 2023
CHAPTER THREE
File Systems and Management of Data Storages

2
File system Administration
qData is commonly stored on magnetic or optical disk platters.
qThe platters are rotated (at high speeds).
• Data is read by heads, which are very close to the surface of the platter, without touching it!
• The heads are mounted on an arm (sometimes called a comb or a fork).
q Data is written in concentric circles called tracks. Track zero is (usually) on the outside.
• The time it takes to position the head over a certain track is called the seek time.
q Often the platters are stacked on top of each other, hence the set of tracks accessible
at a certain position of the comb forms a cylinder.
q Tracks are divided into 512 byte sectors, with more unused space (gap) between the
sectors on the outside of the platter.
§ When you break down the advertised access time of a hard drive, you will notice
that most of that time is taken by movement of the heads (about 65%) and
rotational latency (about 30%).

Kenesa B. (Ambo University) Network and System Admin 3


File system Administration
qVirtually all computer systems require some way to store data permanently;
• even so-called “diskless” systems do require access to certain files in order to boot, run and be
useful.
q Linux file system is generally a built-in layer of a Linux operating system used to
handle the data management of the storage.
• It manages the file name, file size, creation date, and much more information about a file.
q A file system is designed in a way so that it can manage and provide space for non-
volatile storage data.
• All file systems required a namespace that is a naming and organizational methodology.
q The namespace defines the naming process, length of the file name, or a subset of
characters that can be used for the file name.
• It also defines the logical structure of files on a memory segment, such as the use of directories
for organizing the specific files.
• Once a namespace is described, a Metadata description must be defined for that particular file.
Kenesa B. (Ambo University) Network and System Admin 4
Partitioning Disks
qDisk space is a finite and fixed resource.
• Each hard drive we purchase has a given capacity, and it is up to the system administrators to use
it effeciently.
q Storage devices in Linux (such as hard drives and USB drives) need to be structured in
some way before use.
• In most cases, large storage devices are divided into separate sections, which in Linux are referred
to as partitions.
q Partitioning is a means to divide a single hard drive into many logical drives.
• A partition is a contiguous set of blocks on a drive that are treated as an independant disk.
• A partition table is an index that relates sections of the hard drive to partitions.
• Different file systems and anticipated uses of the data on a disk require different kinds of
partitions.
q First, in order for a disk to be bootable, we require it to have a boot sector, a small
region that contains the code that the computer’s firmware can load into memory.
• such as the Basic Input/Output System (BIOS)
Kenesa B. (Ambo University) Network and System Admin 5
Partitioning Disks
qPartitions can be of type:
§ primary (maximum four), extended (maximum one) or logical (contained within
the extended partition).
q Each partition has a type field that contains a code.
q This determines the computers operating system or the partitions file system.
q hard disk devices are named /dev/hdx or /dev/sdx with x depending on the hardware
configuration.

Partition naming two (spindle) disks with partitions. Note that an


extended partition is a container holding logical drives.
Kenesa B. (Ambo University) Network and System Admin 6
Partitioning Disks
qWhy have multiple partitions?
§ Encapsulate your data. Since file system corruption is local to a partition, you
stand to lose only some of your data if an accident occurs.
§ Increase disk space efficiency. You can format partitions with varying block sizes,
depending on your usage.
• If your data is in a large number of small files (less than 1kb) and your
partition uses 4kb sized blocks, you are wasting 3kb for every file.
• you waste on average one half of a block for every file, so matching block size
to the average size of your files is important if you have many files.
§ Limit data growth. Runaway processes or maniacal users can consume so much
disk space that the OS no longer has room on the hard drive for its bookkeeping
operations.
• This will lead to disaster. By segregating space, you ensure that things other
than the operating system die when allocated disk space is exhausted.
Kenesa B. (Ambo University) Network and System Admin 7
Devices
qThe /dev directory contains the special device files for all the devices.
§ In Linux, partitions are represented by device files.
q These are phoney files located in /dev. Here are a few entries:
brw−rw−−−− 1 root disk 3, 0 May 5 1998 hda
brw−rw−−−− 1 root disk 8, 0 May 5 1998 sda
crw−−−−−−− 1 root tty 4, 64 May 5 1998 ttyS0
qA device file is a file with type c ( for "character" devices, devices that do not use the
buffer cache) or b (for "block" devices, which go through the buffer cache).
§ In Linux, all disks are represented as block devices only
q By convention, IDE drives will be given device names /dev/hda to /dev/hdd.
§ Hard Drive A (/dev/hda) is the first drive and Hard Drive C (/dev/hdc) is the third.
q Some devices in /dev: /dev/sda, /dev/sdb, … — SATA disks, /dev/hda, /dev/hdb, … —
Older IDE disks, /dev/ttyS01, … — Serial ports, /dev/audio — Audio port
Kenesa B. (Ambo University) Network and System Admin 8
Partitioning Disks (fdisk)
qfdisk stands (for “fixed disk or format disk“) is a most commonly used command-line
based disk manipulation utility for a Linux/Unix systems.
qWith the help of fdisk command you can view, create, resize, delete, change, copy and
move partitions on a hard drive
• using its own user friendly text based menu driven interface.
qThis tool is very useful in terms of creating space for new partitions, organising space
for new drives, reorganising an old drives and copying or moving data to new disks.
qIt allows you to create a maximum of four new primary partition and number of logical
(extended) partitions, based on size of the hard disk you have in your system.
qOther Partitioning Software:
• sfdisk: a command−line version of fdisk
• cfdisk: a cursor−based version of fdisk
• parted: Gnu partition editor
• Partition Magic: a commercial utility to create, resize, merge and convert partitions
• Disk Drake: a Perl/Gtk program to create, resize, and delete partitions
Kenesa B. (Ambo University) Network and System Admin 9
fdisk -l
qfdisk -l command list all existing disk partition on your system.
qIn the fdisk -l example below you can see that two partitions exist on /dev/sdb.
qThe first partition spans 31 cylinders and contains a Linux swap partition. The second
partition is much bigger.
root@laika:~# fdisk -l /dev/sdb
Disk /dev/sdb: 100.0 GB, 100030242816 bytes
255 heads, 63 sectors/track, 12161 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 31 248976 82 Linux swap/Solaris
/dev/sdb2 32 12161 97434225 83 Linux
§ The /proc/partitions file contains a table with major and minor number of partitioned devices, their
number of blocks and the device name in /dev.
§ Verify with /proc/devices to link the major number to the proper device using: cat /proc/partitions
Kenesa B. (Ambo University) Network and System Admin 10
Partition Types
q Primary Partitions
§ The number of partitions on an Intel−based system was limited from the very
beginning:
§ The original partition table was installed as part of the boot sector and held space
for only four partition entries.
§ These partitions are now called primary partitions.
qLogical Partitions
§ One primary partition of a hard drive may be subpartitioned. These are logical
partitions. This effectively allows us to skirt the historical four partition limitation.
§ The primary partition used to house the logical partitions is called an extended
partition and it has its own file system type.
§ Unlike primary partitions, logical partitions must be contiguous.
§ Each logical partition contains a pointer to the next logical partition, which
implies that the number of logical partitions is unlimited.
Kenesa B. (Ambo University) Network and System Admin 11
Partition Types
§ However, linux imposes limits on the total number of any type of partition on a
drive, so this effectively limits the number of logical partitions.
§ This is at most 15 partitions total on an SCSI disk and 63 total on an IDE disk.
q Swap Partitions
§ Every process running on your computer is allocated a number of blocks of RAM.
These blocks are called pages.
§ The set of in−memory pages which will be referenced by the processor in the very
near future is called a "working set."
§ Linux tries to predict these memory accesses (assuming that recently used pages
will be used again in the near future) and keeps these pages in RAM if possible.
§ If you have too many processes running on a machine, the kernel will try to free
up RAM by writing pages to disk. This is what swap space is for.
§ Consider this emergency memory and not extra memory.

Kenesa B. (Ambo University) Network and System Admin 12


Partitioning Disks with fdisk and parted
q If you would like to view all commands which are available for fdisk simply use the
following command by mentioning the hard disk name such as /dev/sda as shown
below.

q Type ‘m‘ to see the list of all available commands of fdisk which can be operated on
/dev/sda hard disk.
q To print all partition table of hard disk, you must be on command mode of specific
hard disk say /dev/sda.
§ From the command mode, enter ‘p‘ instead of ‘m‘, as you enter ‘p‘, it will print the specific /dev/sda
partition table.
Kenesa B. (Ambo University) Network and System Admin 13
Partitioning Disks with fdisk and parted
q If you would like to delete a specific partition (i.e /dev/sda9) from the specific hard
disk such as /dev/sda, you must be in fdisk command mode to do this.
q Next, enter ‘d‘ to delete any given partition name from the system.
§ As I enter ‘d‘, it will prompt me to enter partition number that I want to delete from /dev/sda hard
disk.
q Suppose I enter number ‘4‘ here, then it will delete partition number ‘4‘ (i.e.
/dev/sda4) disk and shows free space in partition table.
q Enter ‘w‘ to write table to disk and exit after making new alterations to partition table.
• The new changes would only take place after next reboot of system.
q Warning: Be careful, while performing this step, because using option ‘d‘ will
completely delete partition from system and may lost all data in partition
q If you’ve free space left on one of your device say /dev/sda and would like to create a
new partition under it,
§ then you must be in fdisk command mode of /dev/sda
Kenesa B. (Ambo University) Network and System Admin 14
Partitioning Disks with fdisk and parted
q After entering in command mode, now press “n” command to create a new partition
under /dev/sda with specific size.
• While creating a new partition, it will ask you two options ‘extended‘ or
‘primary‘ partition creation.
q Press ‘e‘ for extended partition and ‘p‘ for primary partition. Then it will ask you to
enter following two inputs.
• First cylinder number of the partition to be create.
• Last cylinder number of the partition to be created (Last cylinder, +cylinders or +size).
q You can enter the size of cylinder by adding “+5000M” in last cylinder. Here, ‘+‘ means
addition and 5000M means size of new partition (i.e 5000MB).
• Please keep in mind that after creating a new partition, you should run ‘w‘ command to alter and
save new changes to partition table and finally reboot your system to verify newly created
partition or use partprobe command.
q After the new partition is created, don’t skip to format the newly created partition
using ‘mkfs‘ command.
Kenesa B. (Ambo University) Network and System Admin 15
Partitioning Disks with fdisk and parted
q After formatting new partition, check the size of that partition using flag ‘s‘ (displays
size in blocks) with fdisk command.
• This way you can check size of any specific device.
q If you’ve deleted a logical partition and again recreated it, you might notice ‘partition
out of order‘ problem or error message like ‘Partition table entries are not in disk
order‘.
q To fix such partition order problems, and assign sda4 to the newly created partition
(in the above example), issue the ‘x‘ to enter an extra functionality section and then
enter ‘f‘ expert command to fix the order of partition table
q After, running ‘f‘ command, don’t forget to run ‘w‘ command to save and exit from
fdisk command mode. Once it fixed partition table order, you will no longer get error
messages.
q By default, fdisk command shows the boot flag (i.e. ‘*‘) symbol on each partition.

Kenesa B. (Ambo University) Network and System Admin 16


Partition Table
q Partition tables are data at the start of the disk and describe the partition boundaries
to the OS. There are two common standards for this:
q Master Boot Record (MBR):
• Also sometimes called DOS partition, old format
• Limited to 2 TB and limited to 4 partitions per disk.
• This can be overcome by creating a special extended partition which holds up to 16 “logical
drives”. In Linux, these will show up as /dev/sda5, …
• Can use fdisk manipulate partitions
q GUID Partition Table (GPT):
• New format
• Limited to 9 ZB and limited to 264 partitions.
• Can use gdisk to manipulate
• parted or its GUI counterpart gparted provide convenient interfaces for editing partitions.
Changing partitions can destroy all the data on the disk!
Kenesa B. (Ambo University) Network and System Admin 17
Partitioning Disks with fdisk and parted
q parted is an interactive tool, just like fdisk. Type help in parted for a list of commands
and options.
q This screenshot shows how to start parted to manage partitions on /dev/sdb.
[root@rhel71 ~]# parted /dev/sdb
GNU Parted 3.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted)
q We create an mbr label.
(parted) mklabel msdos>
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk
will be lost. Do you want to continue? Yes/No? yes
(parted) mklabel gpt
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk
will be lost. Do you want to continue? Yes/No? Y
(parted)
Kenesa B. (Ambo University) Network and System Admin 18
Partitioning Disks with fdisk and parted
q Once labeled it is easy to create partitions with parted.
(parted) print
Model: ATA VBOX HARDDISK (scsi)
Disk /dev/sdb: 8590MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
q This example shows how to create two primary partitions of equal size.
(parted) mkpart primary 0 50%
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? I
(parted) mkpart primary 50% 100%
§ Verify with print and exit with quit. Since parted works directly on the disk, there is no need to w(rite)
like in fdisk.
Kenesa B. (Ambo University) Network and System Admin 19
Swap Space
q If memory demand is too high, memory of some processes is transferred to disk
• Usually combined with scheduling: low priority processes are swapped out
q Swap space can take the form of a disk partition or a file.
• Users may create a swap space during installation or at any later time as desired.
q Swap space can be used for two purposes
§ to extend the virtual memory beyond the installed physical memory (RAM), and also for suspend-
to-disk support.
q If the amount of physical memory is less than the amount of memory required to run
all the desired programs, then it may be beneficial to enable swap.
• This avoids out of memory conditions , where the Linux kernel OOM killer (Out of Memory error)
mechanism will automatically attempt to free up memory by killing processes.
q To increase the amount of virtual memory to the required amount, add the necessary
difference (or more) as swap space.
q The biggest drawback of enabling swap is its lower performance.
Kenesa B. (Ambo University) Network and System Admin 20
Swap Space
q To check swap status, use:
$ swapon --show
q Or to show physical memory as well as swap usage:
$ free -h
q A swap partition can be created with most GNU/Linux partitioning tools.
§ Swap partitions are designated as type 82 on MBR and 0657FD6D-A4AB-43C4-84E5-
0933C84B4F4F on GPT.
q Swap partitions are created in a similar way to regular partitions, but before writing
the partition table to disk, you must use the "t" command to change the partition's
system id to "82" (Linux Swap). The default system id is "83" (Linux).
q To set up a partition as Linux swap area, the mkswap command is used
# mkswap /dev/sdxy
q Warning: All data on the specified partition will be lost.
Kenesa B. (Ambo University) Network and System Admin 21
Swap Space
q To enable the device for paging:
# swapon /dev/sdxy
q To enable this swap partition on boot, add an entry to /etc/fstab :
UUID=device_UUID none swap defaults 0 0
• where the device_UUID is the UUID of the swap space.
q To deactivate specific swap space:
# swapoff /dev/sdxy
q Alternatively use the -a switch to deactivate all swap space.
q Since swap is managed by systemd, it will be activated again on the next system
startup.
q To disable the automatic activation of detected swap space permanently, run
systemctl --type swap to find the responsible .swap unit and mask it.

Kenesa B. (Ambo University) Network and System Admin 22


Determining Disk Usage
q The df command is short for disk filesystem and will display statistics on the current
state of a file system.
• For example, you can view used, total, and available disk space. Also, you can use the command to
view filesystem types, the mount point, and much more.
q This command is extremely useful whenever you need to find more information
regarding disk usage space on a Linux distribution.
• Using the human-readable option, you can quickly see how much space your file systems have left.
q The syntax for the df command is straightforward to understand, accepting only two
arguments.
df [OPTIONS] [FILE]
§ [OPTIONS] is where you specify any additional options that you would like applied to the
command.
§ [FILE] is where you specify the file name for which you need the file system information. It is
optional and can accept multiple file names.
§ If you do not specify a filename, df will display information for all mounted file systems.
Kenesa B. (Ambo University) Network and System Admin 23
Determining Disk Usage
§ -a or --all will output duplicate, pseudo, and inaccessible file systems.
§ -B SIZE or --block-size=SIZE scales the sizes by the input size.
§ -h or --human-readable prints the sizes in a human-readable format.
§ -H or --si are the same as the option above, but the sizes are in the power of 1000.
§ -i or --inodes will list inode usage information instead of the block usage information.
§ -l or --local limits the results to local file systems. Otherwise, the default behavior of the command
will display remote file systems.
§ --no-sync disables the sync system call before getting data. It increases the speed of the command,
but the results may be out of date. This option is on by default.
§ --output=[FIELD_LIST] allows you to change the columns outputted for the file system information.
Replace [FIELD_LIST] with a comma-separated list of fields you wish to be outputted. We touch on the
columns you can use in the column section on this page.
§ --sync forces a sync before getting data. It can slow the speed of command, but the results should be
more accurate.
§ --total displays a total of all the filesystem sizes after processing them.
Kenesa B. (Ambo University) Network and System Admin 24
Determining Disk Usage
q The du (known as the “disk usage”) command is a standard Linux/Unix command that
allows a user to gain disk usage information quickly.
q It is best applied to specific directories and allows many variations for customizing
the output to meet your needs.
• As with most commands, the user can take advantage of many options or flags.
q Also, like many Linux commands, most users only use the same two or three flags to
meet their specific set of needs.
q The syntax for the du command is
du [OPTIONS]... [FILE]...
q When there is no FILE which has been specified, then du will automatically report the
usage of the disk memory in the current directory that we are working.
§ The memory usage of the disk is given in bytes of the given file or the directories memory usage
where the file is located and also the subdirectories of that file when we execute the command
without option.
Kenesa B. (Ambo University) Network and System Admin 25
Determining Disk Usage
q We can give multiple file names and directory names as input to the du command:
du ~/Documents ~/Pictures ~/.zsg
q du doesn’t work on the file or directory which we don’t possess the authorization for.
• The command will throw an error since the du command would not be able to read the directory.
§ -a - generates a report for all files in pathname.
§ -h - displays file sizes using more human-friendly units. Units used are:
• B (Bytes), KB (Kilobytes), MB (Megabytes), TB (Terabytes), (PB Petabytes),EB (Exabytes)
§ -k - displays file sizes in 1024-byte (1K) units.
§ -l num - displays only num levels of subdirectories.
§ -r - reports files which cannot be opened and directories which cannot be read; this is the default
behavior.
§ -s - does not display file size totals for subdirectories.
§ -t - displays the total amount of space used by all pathnames examined.
§ -x - displays file sizes for only those files contained on the same device as pathname.
Kenesa B. (Ambo University) Network and System Admin 26
Configuring Disk Quotas
q If you are the only person who uses the disk, there is no need to implement quota at
all.
• But if there are multiple users who use the same disk, quotas are the best ways to control the
individual users from monopolizing entire disk space.
q A user limited by disk quotas cannot use additional disk space beyond his limit.
• For example suppose there are four users; user a, user b, user c and user d. Without quota any
user can use entire disk space, leaving no space for other users.
• This situation is very common in shared environment such as web hosting, ISPs, file server, ftp
server etc.
q But if disk quota is enabled, no user can use disk space beyond his limit.

Kenesa B. (Ambo University) Network and System Admin 27


Configuring Disk Quotas
q Quota allows you to specify limits on two aspects of disk storage:
§ the number of inodes a user or a group of users may possess;
Ø inode - data structures that contain information about files in UNIX file systems. Because
inodes store file-related information, this allows control over the number of files that can be
created.
§ and the number of disk blocks that may be allocated to a user or a group of users.
q The idea behind quota is that users are forced to stay under their disk consumption
limit, taking away their ability to consume unlimited disk space on a system.
q If there is more than one file system which a user is expected to create files, then
quota must be set for each file system separately.
q To implement disk quotas, use the following steps:
§ Enable quotas per file system by modifying the /etc/fstab file.
§ Remount the file system(s).
§ Create the quota database files and generate the disk usage table then Assign quota policies.
Kenesa B. (Ambo University) Network and System Admin 28
Configuring Disk Quotas
q As root, using a text editor, edit the /etc/fstab file.
• E.g. Add the usrquota and/or grpquota options to the file systems that require quotas:
/dev/VolGroup00/LogVol02 /home ext3 defaults,usrquota,grpquota 1 2
q In this example, the /home file system has both user and group quotas enabled.
q After adding the usrquota and/or grpquota options, remount each file system whose
fstab entry has been modified.
• E.g. umount /work
mount /dev/vdb1 /work
q After each quota-enabled file system is remounted run the quotacheck command.
• For example, if user and group quotas are enabled for the /home file system, create the files in
the /home directory:
# quotacheck -cug /home
q The last step is assigning the disk quotas with the edquota command.
# edquota username
Kenesa B. (Ambo University) Network and System Admin 29
Logical Volume Management (LVM) and RAID
q Logical volume management is essentially a supercharged and abstracted version of
disk partitioning.
• It groups individual storage devices into “volume groups.”
q The blocks in a volume group can then be allocated to “logical volumes,” which are
represented by block device files and act like disk partitions.
q Logical Volume Manager (LVM) enables you to be much more flexible with your disk
usage than you can be with conventional old-style file partitions.
• For example, if your system logs have grown immensely, and you’ve run out of room on your /var
partition, increasing a partition size without LVM is a big pain.
• You would have to get another disk drive, create a /var mount point on there too, and copy all your
data from the old /var to the new /var disk location
q With LVM in place, you could add another disk, create a physical volume, and then add
the physical volume to the volume group that contains the /var partition.
q Then you’d use the LVM file system resizing tool to increase the file system size to
match the new partition size.
Kenesa B. (Ambo University) Network and System Admin 30
Logical Volume Management (LVM) component
q Physical Volumes
• The underlying physical storage unit of an LVM logical volume is a block device such as a partition
or whole disk.
• A physical volume (PV) is a partition or whole disk designated for LVM use.
• To use the device for an LVM logical volume the device must be initialized as a physical volume
q Volume Groups
• A volume group (VG) is a collection of physical volumes (PVs), which creates a pool of disk space
out of which logical volumes can be allocated.
• This creates a pool of disk space out of which logical volumes can be allocated.
• Within a volume group, the disk space available for allocation is divided into units of a fixed-size
called extents.
q Logical Volumes
• In LVM, a volume group is divided up into logical volumes.
• A logical volume represents a mountable storage device.
Kenesa B. (Ambo University) Network and System Admin 31
Logical Volume Management (LVM) and RAID
q Logical volumes are more flexible and powerful than disk partitions.
q Here are some of the magical operations a volume manager lets you carry out:
§ Move logical volumes among different physical devices
§ Grow and shrink logical volumes on the fly
§ Take copy-on-write “snapshots” of logical volumes
§ Replace on-line drives without interrupting service
§ Incorporate mirroring or striping in your logical volumes
q The components of a logical volume can be put together in various ways.
• Concatenation keeps each device’s physical blocks together and lines the devices up one after
another.
• Striping interleaves the components so that adjacent virtual blocks are actually spread over
multiple physical disks.
q By reducing single-disk bottlenecks, striping can often result in higher bandwidth and
lower latency.
Kenesa B. (Ambo University) Network and System Admin 32
Implementing LVM
q The top-level architecture of LVM is that
individual disks and partitions (physical
volumes) are gathered into storage pools
called volume groups.
• Volume groups are then subdivided into logical
volumes, which are the block devices that hold
filesystems.
q A physical volume needs to have an LVM
label applied with pvcreate.
q Applying such a label is the first step to
accessing the device through the LVM.
q In addition to bookkeeping information, the
label includes a unique ID to identify the
device.
q Look at the LVM commands in Linux
Kenesa B. (Ambo University) Network and System Admin 33
Implementing LVM
q “Physical volume” can be disks, but they can also be disk partitions or RAID arrays.
LVM doesn’t care.
q You can control LVM with either a large group of simple commands or with the single
lvm command and its various subcommands.
q A Linux LVM configuration proceeds in a few distinct phases:
§ Creating (defining, really) and initializing physical volumes
§ Adding the physical volumes to a volume group
§ Creating logical volumes on the volume group
q LVM commands start with letters that make it clear at which level of abstraction they
operate: pv commands manipulate physical volumes,
q vg commands manipulate volume groups, and lv commands manipulate logical
volumes.
q A few commands with the prefix lvm (e.g., lvmchange) operate on the system as a
whole.
Kenesa B. (Ambo University) Network and System Admin 34
RAID Concepts
q Even with backups, the consequences of a disk failure on a server can be disastrous.
q RAID, “redundant arrays of inexpensive disks,” is a system that distributes or
replicates data across multiple disks.
§ RAID is sometimes glossed as “redundant arrays of independent disks,” too.
q RAID not only helps avoid data loss but also minimizes the downtime associated with
hardware failures (often to zero) and potentially increases performance.
q There are two types of RAID that can be used on computer systems.
§ hardware RAID and software RAID.
q RAID can be implemented by dedicated hardware that presents a group of hard disks
to the operating system as a single composite drive.
q It can also be implemented simply by the operating system’s reading or writing
multiple disks according to the rules of RAID.
q there are six different RAID levels commonly used regardless of whether hardware or
software RAID is used.
Kenesa B. (Ambo University) Network and System Admin 35
Hardware RAID
q Hardware RAID has been predominant in the past for two main reasons:
§ lack of software alternatives (no direct OS support for RAID) and
§ hardware’s ability to buffer writes in some form of nonvolatile memory.
q In hardware RAID the disks have their own RAID controller with built-in software that
handles the RAID disk setup, and I/O.
q The controller is typically a card in one of the system’s expansion slots, or it may be
built onto the system board.
• The capabilities are generally supplied via a PCI card, although there are some instances of RAID
components integrated in to the motherboard.
• Hardware RAID may also be available in a stand-alone enclosure.
q The hardware RAID interface is transparent to Linux, so the hardware RAID disk array
looks like one giant disk.
q The operating system does not control the RAID level used, it is controlled by the
hardware RAID controller.
• Most dedicated servers use a hardware RAID controller
Kenesa B. (Ambo University) Network and System Admin 36
Software RAID
q Since the relationship of the disks to one another is defined within the operating
system instead of the firmware of a hardware device, this is called software RAID
q There is no reason to assume that a hardware-based implementation of RAID will
necessarily be faster than a software- or OS-based implementation.
§ Because the disks themselves are always the most significant bottleneck in a
RAID implementation
q In software RAID there is no RAID controller card.
q The OS is used to set up a logical array, and the operating system controls the RAID
level used by the system.
q Software RAID must be configured during system installation.
q Software based RAID is the most flexible form of RAID.
q It is easy to install and update and provides full capability on all or part of any drives
available to the system.
Kenesa B. (Ambo University) Network and System Admin 37
RAID Levels
q An array's RAID level refers to the relationship imposed on the component storage
devices
§ Striping: Striping is the process of dividing the writes to the array over multiple underlying disks
§ Chunk Size: When striping data, chunk size defines the amount of data that each chunk will
contain
§ Mirroring: making a mirror copy of disk which you want to protect and in this way you have two
copies of data. In time of failure, the controller uses a second disk to serve the data
§ Parity: Parity is a data integrity mechanism implemented by calculating information from the data
blocks written to the array - a redundancy check that ensures full protection of data
q RAID 0 – Striping without parity
• Combines two or more devices by striping data across them
• The advantage of this is that since the data is distributed, the whole power of each device can be
utilized for both reads and writes
• The theoretical performance profile of a RAID 0 array is simply the performance of an individual
disk multiplied by the number of disks
Kenesa B. (Ambo University) Network and System Admin 38
RAID Levels
• The usable capacity of the array is simply the combined capacity of all constituent drives
• The failure of a single device will bring down the entire array and all data will be lost
• if you are running a RAID 0 array, backups become extremely important

q RAID 1 – Disk Mirroring


• It is a configuration which mirrors data between two or more devices
• Anything written to the array is placed on each of the devices in the group
• This means that each device has a complete set of the available data, offering redundancy in case
of device failure
Kenesa B. (Ambo University) Network and System Admin 39
RAID Levels
• the theoretical read speed can still be calculated by multiplying the read speed of an individual
disk by the number of disks
• Mirroring remains popular due to its simplicity and high level of data availability.
• RAID level 1 is costly because you write the same information to all of the disks in the array, which
provides data reliability
q Raid 4 – Parity
• One of the disks is reserved for parity to protect data – min 3 disks required
• Parity information is calculated based on the content of the rest of the member disks in the array.
• This information can then be used to reconstruct data when one disk in the array fails.
• Reading and writing speed is fast, fault tolerant

Kenesa B. (Ambo University) Network and System Admin 40


RAID Levels
q Raid 5 – Striping with Parity
• In RAID 5, data is striped across disks in much the same way as a RAID 0 array
• However, for each stripe of data written across the array, parity information will be written to one
of the disks
• RAID 5 arrays handle the loss of any one disk in the array.
• It is a mathematically calculated value that can be used for error correction and data
reconstruction
• The disk that receives the calculated parity block instead of a data block will rotate with each
stripe that is written
• Like other arrays with striping, read performance benefits from the ability to read from multiple
disks at once
• The parity blocks allow for the complete reconstruction of data if this happens
• Since the parity is distributed (some less common RAID levels use a dedicated parity drive), each
disk has a balanced amount of parity information
• System performance can slow down considerably due to on-the-fly parity calculations
Kenesa B. (Ambo University) Network and System Admin 41
RAID Levels

q RAID 6
• RAID 6 uses an architecture similar to RAID 5, but with double parity information
• This means that the array can withstand any two disks failing
• Like other RAID levels that use striping, the read performance is generally good
• RAID 6 pays for the additional double parity with an additional disk's worth of capacity
• The calculation to determine the parity data for RAID 6 is more complex than RAID 5

Kenesa B. (Ambo University) Network and System Admin 42


RAID Levels
q RAID 10
• Traditionally, RAID 10 refers to a nested RAID,
• It is created by first setting up two or more RAID 1 mirrors, and then using those as components to
build a striped RAID 0 array across them
• This is sometimes now called RAID 1+0 to be more explicit about this relationship
• RAID 1+0 arrays have the high performance characteristics of a RAID 0 array,
• Instead of relying on single disks for each component of the stripe, a mirrored array is used,
providing redundancy

Kenesa B. (Ambo University) Network and System Admin 43


Creating RAID logical volumes
q You can create RAID1 arrays with multiple numbers of copies, according to the value
you specify for the -m argument.
• Similarly, you can specify the number of stripes for a RAID 0, 4, 5, 6, and 10 logical volume with
the -i argument.
q You can also specify the stripe size with the -I argument.
q For example, to create a 2-way RAID the following command creates a 2-way RAID1
array, named my_lv, in the volume group my_vg, that is 1G in size:
# lvcreate --type raid1 -m 1 -L 1G -n my_lv my_vg
q Create a RAID5 array with stripes:
q The following command creates a RAID5 array with three stripes and one implicit
parity drive, named my_lv, in the volume group my_vg, that is 1G in size.
• Note that you can specify the number of stripes similar to an LVM striped volume.
• The correct number of parity drives is added automatically.
# lvcreate --type raid5 -i 3 -L 1G -n my_lv my_vg
Kenesa B. (Ambo University) Network and System Admin 44
Application of Computers

THE END
45

You might also like