• Embed Doc
  • Readcast
  • Collections
  • CommentGo Back
Download
 
RAID is an acronym first defined by David A. Patterson, Garth A. Gibson and RandyKatz at the University of California, Berkeley in 1987 to describe a RedundantArray of Inexpensive Disks,[1] a technology that allowed computer users to achievehigh levels of storage reliability from low-cost and less reliable PC-class disk-drive components, via the technique of arranging the devices into arrays forredundancy.More recently, marketers representing industry RAID manufacturers reinvented theterm to describe a Redundant Array of Independent Disks as a means ofdisassociating a "low cost" expectation from RAID technology.[2]"RAID" is now used as an umbrella term for computer data storage schemes that candivide and replicate data among multiple hard disk drives. The differentschemes/architectures are named by the word RAID followed by a number, as in RAID0, RAID 1, etc. RAID's various designs all involve two key design goals: increaseddata reliability or increased input/output performance. When multiple physicaldisks are set up to use RAID technology, they are said to be in a RAID array. Thisarray distributes data across multiple disks, but the array is seen by thecomputer user and operating system as one single disk. RAID can be set up to serveseveral different purposes.Purpose and basicsRedundancy is achieved by either writing the same data to multiple drives (knownas mirroring), or writing extra data (known as parity data) across the array,calculated such that the failure of one (or possibly more, depending on the typeof RAID) disks in the array will not result in loss of data. A failed disk may bereplaced by a new one, and the lost data reconstructed from the remaining data andthe parity data. Organizing disks into a redundant array decreases the usablestorage capacity. For instance, a 2-disk RAID 1 array loses half of the totalcapacity that would have otherwise been available using both disks independently,and a RAID 5 array with several disks loses the capacity of one disk. Other typesof RAID arrays are arranged so that they are faster to write to and read from thana single disk.There are various combinations of these approaches giving different trade-offs ofprotection against data loss, capacity, and speed. RAID levels 0, 1, and 5 are themost commonly found, and cover most requirements.RAID 0 (striped disks) distributes data across several disks in a way that givesimproved speed and no lost capacity, but all data on all disks will be lost if anyone disk fails. Although such an array has no actual redundancy, it is customaryto call it RAID 0.RAID 1 (mirrored settings/disks) duplicates data across every disk in the array,providing full redundancy. Two (or more) disks each store exactly the same data,at the same time, and at all times. Data is not lost as long as one disk survives.Total capacity of the array equals the capacity of the smallest disk in the array.At any given instant, the contents of each disk in the array are identical to thatof every other disk in the array.RAID 5 (striped disks with parity) combines three or more disks in a way thatprotects data against loss of any one disk; the storage capacity of the array isreduced by one disk.RAID 6 (striped disks with dual parity) (less common) can recover from the loss oftwo disks.RAID 10 (or 1+0) uses both striping and mirroring. "01" or "0+1" is sometimesdistinguished from "10" or "1+0": a striped set of mirrored subsets and a mirroredset of striped subsets are both valid, but distinct, configurations.RAID can involve significant computation when reading and writing information.With traditional "real" RAID hardware, a separate controller does thiscomputation. In other cases the operating system or simpler and less expensivecontrollers require the host computer's processor to do the computing, whichreduces the computer's performance on processor-intensive tasks (see "SoftwareRAID" and "Fake RAID" below). Simpler RAID controllers may provide only levels 0
 
and 1, which require less processing.RAID systems with redundancy continue working without interruption when one (orpossibly more, depending on the type of RAID) disks of the array fail, althoughthey are then vulnerable to further failures. When the bad disk is replaced by anew one the array is rebuilt while the system continues to operate normally. Somesystems have to be powered down when removing or adding a drive; others supporthot swapping, allowing drives to be replaced without powering down. RAID with hot-swapping is often used in high availability systems, where it is important thatthe system remains running as much of the time as possible.RAID is not a good alternative to backing up data. Data may become damaged ordestroyed without harm to the drive(s) on which they are stored. For example, partof the data may be overwritten by a system malfunction; a file may be damaged ordeleted by user error or malice and not noticed for days or weeks; and, of course,the entire array is at risk of physical damage.[edit]PrinciplesRAID combines two or more physical hard disks into a single logical unit by usingeither special hardware or software. Hardware solutions often are designed topresent themselves to the attached system as a single hard drive, so that theoperating system would be unaware of the technical workings. For example, youmight configure a 1TB RAID 5 array using three 500GB hard drives in hardware RAID,the operating system would simply be presented with a "single" 1TB disk. Softwaresolutions are typically implemented in the operating system and would present theRAID drive as a single drive to applications running upon the operating system.There are three key concepts in RAID: mirroring, the copying of data to more thanone disk; striping, the splitting of data across more than one disk; and errorcorrection, where redundant data is stored to allow problems to be detected andpossibly fixed (known as fault tolerance). Different RAID levels use one or moreof these techniques, depending on the system requirements. RAID's main aim can beeither to improve reliability and availability of data, ensuring that importantdata is available more often than not (e.g. a database of customer orders), ormerely to improve the access speed to files (e.g. for a system that delivers videoon demand TV programs to many viewers).The configuration affects reliability and performance in different ways. Theproblem with using more disks is that it is more likely that one will fail, but byusing error checking the total system can be made more reliable by being able tosurvive and repair the failure. Basic mirroring can speed up reading data as asystem can read different data from both the disks, but it may be slow for writingif the configuration requires that both disks must confirm that the data iscorrectly written. Striping is often used for performance, where it allowssequences of data to be read from multiple disks at the same time. Error checkingtypically will slow the system down as data needs to be read from several placesand compared. The design of RAID systems is therefore a compromise andunderstanding the requirements of a system is important. Modern disk arraystypically provide the facility to select the appropriate RAID configuration.[edit]Standard levelsMain article: Standard RAID levelsA number of standard schemes have evolved which are referred to as levels. Therewere five RAID levels originally conceived, but many more variations have evolved,notably several nested levels and many non-standard levels (mostly proprietary).Following is a brief summary of the most commonly used RAID levels.[3] Spaceefficiency is given as amount of storage space available in an array of n disks,in multiples of the capacity of a single drive. For example if an array holds n=5drives of 250GB and efficiency is n-1 then available space is 4 times 250GB orroughly 1TB.LevelDescriptionMinimum # of disksSpace EfficiencyImageRAID 0"Striped set without parity" or "Striping". Provides improved
 
performance and additional storage but no redundancy or fault tolerance. Any diskfailure destroys the array, which has greater consequences with more disks in thearray (at a minimum, catastrophic data loss is twice as severe compared to singledrives without RAID). A single disk failure destroys the entire array because whendata is written to a RAID 0 drive, the data is broken into fragments. The numberof fragments is dictated by the number of disks in the array. The fragments arewritten to their respective disks simultaneously on the same sector. This allowssmaller sections of the entire chunk of data to be read off the drive in parallel,increasing bandwidth. RAID 0 does not implement error checking so any error isunrecoverable. More disks in the array means higher bandwidth, but greater risk ofdata loss.2nRAID 1'Mirrored set without parity' or 'Mirroring'. Provides fault tolerancefrom disk errors and failure of all but one of the drives. Increased readperformance occurs when using a multi-threaded operating system that supportssplit seeks, very small performance reduction when writing. Array continues tooperate so long as at least one drive is functioning. Using RAID 1 with a separatecontroller for each disk is sometimes called duplexing.2n/2RAID 2Hamming code parity. Disks are synchronized and striped in very smallstripes, often in single bytes/words. Hamming codes error correction is calculatedacross corresponding bits on disks, and is stored on multiple parity disks.3RAID 3Striped set with dedicated parity or bit interleaved parity or bytelevel parity.This mechanism provides fault tolerance similar to RAID 5. However, because thestrip across the disks is a lot smaller than a filesystem block, reads and writesto the array perform like a single drive with a high linear write performance. Forthis to work properly, the drives must have synchronised rotation. If one drivefails, the performance doesn't change.3n-1RAID 4Block level parity. Identical to RAID 3, but does block-level stripinginstead of byte-level striping. In this setup, files can be distributed betweenmultiple disks. Each disk operates independently which allows I/O requests to beperformed in parallel, though data transfer speeds can suffer due to the type ofparity. The error detection is achieved through dedicated parity and is stored ina separate, single disk unit.3n-1RAID 5Striped set with distributed parity or interleave parity. Distributedparity requires all drives but one to be present to operate; drive failurerequires replacement, but the array is not destroyed by a single drive failure.Upon drive failure, any subsequent reads can be calculated from the distributedparity such that the drive failure is masked from the end user. The array willhave data loss in the event of a second drive failure and is vulnerable until thedata that was on the failed drive is rebuilt onto a replacement drive. A singledrive failure in the set will result in reduced performance of the entire setuntil the failed drive has been replaced and rebuilt.3n-1RAID 6Striped set with dual distributed parity. Provides fault tolerance fromtwo drive failures; array continues to operate with up to two failed drives. Thismakes larger RAID groups more practical, especially for high availability systems.This becomes increasingly important because large-capacity drives lengthen thetime needed to recover from the failure of a single drive. Single parity RAIDlevels are vulnerable to data loss until the failed drive is rebuilt: the largerthe drive, the longer the rebuild will take. Dual parity gives time to rebuild thearray without the data being at risk if a (single) additional drive fails beforethe rebuild is complete.4n-2[edit]Nested levelsMain article: Nested RAID levelsMany storage controllers allow RAID levels to be nested: the elements of a RAIDmay be either individual disks or RAIDs themselves. Nesting more than two deep is
of 00

Leave a Comment

You must be to leave a comment.
Submit
Characters: ...
You must be to leave a comment.
Submit
Characters: ...