Activity 1:
Visit an IT organisation and observe the functioning of the I/O interface
and the data lines, control lines, and I/O bus architecture. Also, check
whether the I/O system used is isolated or memory-mapped.
12.6 RAID
RAID is the acronym for ‘redundant array of inexpensive disks’. There are
several approaches to redundancy that have different overhead and
performance. The Patterson, Gibson, and Katz 1987 paper introduced the
term RAID. It used a numerical classification for these schemes that has
become popular; in fact, the non-redundant disk array is sometimes called
RAID 0. One difficulty is discovering when a disk fails. Fortunately,
magnetic disks provide information about their correct operation: check
information is recorded in each sector to detect errors in that sector.
As long as whole sectors are transferred and this check information is
examined on each read, the electronics attached to the disk can discover
when a disk fails or loses information.
The levels of RAID are as follows:
12.6.1 Mirroring (RAID 1)
Mirroring, or shadowing, is the traditional solution to disk failure. It uses
twice as many disks. Every write goes to two disks at once, one non-redundant
disk and one redundant (mirror) disk, so that there are always two copies of
the data. If one disk fails, the system goes to its mirror to get the required
information. This technique is the most expensive solution.
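The behaviour described above can be illustrated with a minimal sketch. The
class MirroredPair and its methods are hypothetical names chosen for this
illustration, and each disk is modelled as a simple in-memory dictionary
rather than real hardware.

```python
class MirroredPair:
    """Toy model of a RAID 1 pair: every block is written to both disks."""

    def __init__(self):
        # Assumption: each "disk" is a dict mapping block number -> bytes.
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block_no, data):
        # A write goes to every disk in the pair that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block_no] = data

    def fail(self, disk_index):
        # Simulate a disk failure by marking it failed and losing its contents.
        self.failed[disk_index] = True
        self.disks[disk_index] = {}

    def read(self, block_no):
        # Read from the first working disk; if it has failed, use the mirror.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block_no]
        raise IOError("both disks in the pair have failed")


pair = MirroredPair()
pair.write(0, b"payroll")
pair.fail(0)                 # the non-redundant disk fails
recovered = pair.read(0)     # the read is served by the mirror disk
```

The cost is visible in the model: two dictionaries hold the same blocks, so
half of the total capacity stores redundant copies.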
12.6.2 Bit-interleaved parity (RAID 3)
Bit-interleaved parity is an error-detection technique in which character bit
patterns are forced into parity so that the total number of one (1) bits is always
Manipal University of Jaipur B1648 Page No. 272
Computer Architecture Unit 12
odd or even. This is done by adding a “1” or “0” bit to each byte as the
character/byte is transmitted. At the other end of the transmission, the
parity is checked for accuracy. Bit-interleaved parity (BIP) is also used at
the physical layer (high-speed transmission of binary data) to monitor errors.
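As a minimal sketch of the parity bit described above, the following
computes the extra bit for a byte and checks it at the receiving end. The
function names parity_bit and passes_check are hypothetical and are used
only for this illustration.

```python
def parity_bit(byte, even=True):
    """Return the extra bit that forces the total number of 1 bits
    (data bits plus parity bit) to be even, or odd if even=False."""
    ones = bin(byte).count("1")
    bit_for_even = ones % 2          # 1 only when the data has an odd count
    return bit_for_even if even else bit_for_even ^ 1


def passes_check(byte, bit, even=True):
    """Receiver-side check: recount the 1 bits including the parity bit."""
    total = bin(byte).count("1") + bit
    return total % 2 == (0 if even else 1)


data = 0b1011001                     # contains four 1 bits
sent_bit = parity_bit(data)          # even parity, so no extra 1 is needed
ok = passes_check(data, sent_bit)

corrupted = data ^ 0b0000001         # one bit flipped in transit
detected = not passes_check(corrupted, sent_bit)
```

A single parity bit detects any odd number of flipped bits but cannot say
which bit changed, and it misses an even number of flips.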
The cost of higher availability can be reduced to 1/N, where N is the number
of disks in a group. Instead of keeping a complete copy of the original data,
we keep only enough redundant information to restore what is lost in a
failure. Reads and writes go to all disks in the group, with one extra disk
holding the check information in case there is a failure.
RAID 3 is popular in applications with large data sets, for example
multimedia and several scientific codes. Parity is one such scheme. The
redundant disk can be thought of as holding the sum of all the data on the
other disks. When a disk fails, the data on all the good disks is
subtracted from the parity disk; what remains is the missing information.
The assumption is that failures are so rare that taking longer to recover
from a failure in exchange for less redundant storage is a good trade-off.
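The recovery step can be sketched as follows. The sketch assumes the "sum"
is taken modulo 2, that is, a bytewise XOR, in which case adding and
subtracting are the same operation. The helper name xor_blocks and the
sample disk contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR (sum modulo 2) of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, each holding one two-byte block (made-up contents).
data_disks = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]

# The parity disk holds the sum of the data on the other disks.
parity = xor_blocks(data_disks)

# Suppose the second data disk fails. "Subtracting" the surviving data
# from the parity leaves exactly the missing block.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

Note that the rebuild has to read every surviving disk in the group, which
is why recovery takes longer than with a mirror.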
Mirroring can be considered the special case of one data disk and one
parity disk (N = 1). In this case parity is accomplished simply by
duplicating the data, so mirrored disks have the advantage of simplifying
the parity calculation. The redundancy of N = 1 has the highest overhead
for increasing disk availability.
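Both claims can be checked with a small sketch, again assuming parity is a
bytewise XOR. With a single data disk the computed parity is identical to
the data, and the 1/N overhead is largest at N = 1. The function names are
hypothetical.

```python
from functools import reduce


def parity(blocks):
    """Bytewise XOR of equal-length blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def redundancy_overhead(n):
    """Check disks per data disk when one check disk protects n data disks."""
    return 1 / n


# With N = 1 the "parity" of the single data block is the block itself,
# so the check disk is simply a mirror copy.
single_data_disk = [b"\x5a\xc3"]
mirror_copy = parity(single_data_disk)

# Overhead for a mirror (N = 1) compared with larger parity groups.
overheads = {n: redundancy_overhead(n) for n in (1, 4, 8)}
```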
12.6.3 Block-interleaved distributed parity (RAID 5)
This level uses the same ratio of disks (data disks and check disks) as RAID
3, but data is accessed differently. In the prior organisation every access
went to all disks. Some applications would prefer to do smaller accesses,
allowing independent accesses to occur in parallel. That is the purpose of
this next RAID level. Since error-detection information in each sector is
checked on reads to see if data is correct, such “small reads” to each disk
can occur independently as long as the minimum access is one sector.
Writes are another matter. It would seem that each small write would
demand that all other disks be accessed to read the rest of the information
needed to recalculate the new parity. In an example with four data disks and
one parity disk, a “small write” would require reading the other three data
disks, adding the new information, and then writing the new parity to the
parity disk and the new data to the data disk.
The key to reducing this overhead is that parity is simply a sum of
information. By watching which bits change when we write the new
information, we need only change the corresponding bits on the parity
disk. We must read the old data, compare the old data with the new data to
see which bits change, read the old parity, change the corresponding bits,
and then write the new data and the new parity. Thus, a small write involves
four disk accesses to two disks instead of accessing all disks. This
organisation is RAID 4. RAID 4 supports mixtures of large reads, large
writes, small reads and small writes. One drawback of the system is that the
parity disk must be updated on every write, so it is the bottleneck for
sequential writes.
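The small-write shortcut can be sketched as below, assuming XOR parity and
a stripe of four data blocks. The function names and block contents are
hypothetical. The result is compared with a full recomputation to show
that both give the same parity.

```python
def xor(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(stripe):
    """Recompute parity from every data block in the stripe."""
    result = bytes(len(stripe[0]))
    for block in stripe:
        result = xor(result, block)
    return result


def small_write(stripe, parity, disk_index, new_data):
    """Update one data block and the parity using four disk accesses."""
    old_data = stripe[disk_index]                # access 1: read old data
    old_parity = parity                          # access 2: read old parity
    changed_bits = xor(old_data, new_data)       # which bits differ
    new_parity = xor(old_parity, changed_bits)   # flip the same parity bits
    new_stripe = list(stripe)
    new_stripe[disk_index] = new_data            # access 3: write new data
    return new_stripe, new_parity                # access 4: write new parity


stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"]
parity = full_parity(stripe)

new_stripe, new_parity = small_write(stripe, parity, 2, b"\xff\x00")
same_as_recomputed = new_parity == full_parity(new_stripe)
```

Only the target data disk and the parity disk are touched, whatever the
number of data disks in the group.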
To fix the parity-write bottleneck, the parity information is spread throughout
all the disks so that there is no single bottleneck for writes. This distributed
parity organisation is RAID 5. Figure 12.6 shows how data is distributed in
RAID 4 and RAID 5.
As the organisation on the right shows, in RAID 5 the parity associated with
each row of data blocks is no longer restricted to a single disk. This
organisation allows multiple writes to occur simultaneously as long as
the blocks and parity blocks involved are not located on the same disks. For
example, a write to block 8 on the right must also access its parity block
P2, thereby occupying the first and third disks. A second write to block 5
on the right, implying an update to its parity block P1, accesses the second
and fourth disks and thus could occur at the same time as the prior write to
block 8. RAIDs are playing an increasing role in storage systems.
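The block 8 and block 5 example can be reproduced with a short sketch. The
figure is not reproduced here, so the layout is an assumption: five disks,
four data blocks per row, and the parity block starting on the last disk
and moving one disk to the left on each row. This layout was chosen because
it agrees with the disk numbers quoted above. The function names are
hypothetical.

```python
def raid5_location(block_no, n_disks=5):
    """Return (data_disk, parity_disk), numbered from 1, for a data block.

    Assumed layout: parity for row r sits on disk (n_disks - 1 - r) mod
    n_disks (counting from 0), and data blocks fill the remaining disks
    from left to right.
    """
    data_per_row = n_disks - 1
    row = block_no // data_per_row
    parity_disk = (n_disks - 1 - row) % n_disks
    offset = block_no % data_per_row
    # Skip over the parity disk when placing data blocks in the row.
    data_disk = offset if offset < parity_disk else offset + 1
    return data_disk + 1, parity_disk + 1


def can_run_in_parallel(block_a, block_b, n_disks=5):
    """Two small writes can overlap if they touch disjoint sets of disks."""
    disks_a = set(raid5_location(block_a, n_disks))
    disks_b = set(raid5_location(block_b, n_disks))
    return disks_a.isdisjoint(disks_b)


write_8 = raid5_location(8)            # data on disk 1, parity P2 on disk 3
write_5 = raid5_location(5)            # data on disk 2, parity P1 on disk 4
parallel = can_run_in_parallel(8, 5)   # no disk in common
conflict = can_run_in_parallel(8, 9)   # same row, so both need P2's disk
```

In RAID 4 every write would need the single parity disk, so no two small
writes could overlap in this way.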
Self Assessment Questions
12. RAID is the acronym for _____________.
13. ______________ uses twice as many disks.