
A Beginner's Guide to RAID

Part 1: What is RAID?

RAID stands for Redundant Array of Inexpensive Disks. It is a system that allows multiple hard drives to be recognized as one, allowing for greater storage capacity, increased performance, data redundancy, or a mix of these three. It used to be the case that RAID was only for servers and high-end workstations, but today almost all southbridges have some sort of integrated RAID. It is also important to note that RAID is not a substitute for a good backup strategy.

Part 2: RAID types or levels


These are the most common RAID levels:

JBOD (concatenation of physical disks to create a single logical drive)
RAID 0 (striped set)
RAID 1 (mirrored set)
RAID 5 (striped set with distributed parity)
RAID 6 (striped set with double distributed parity)

JBOD
JBOD is not one of the numbered RAID levels but is a popular way of combining disks of mismatched sizes. In a JBOD array the disks are concatenated end to end to create one large logical volume. JBOD can make smaller drives more useful than they would be if used singly. JBOD goes by a few names and is available as a software RAID system in most operating systems.
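
As a rough illustration of the end-to-end concatenation described above, the sketch below maps a logical block number onto a disk and a block within that disk. The function name `jbod_locate` and the disk sizes are made up for the example; real JBOD implementations differ in detail.

```python
from bisect import bisect_right
from itertools import accumulate

def jbod_locate(disk_sizes, logical_block):
    """Map a 0-based logical block to (disk index, block on that disk).

    disk_sizes holds the size of each disk in blocks (hypothetical values).
    """
    ends = list(accumulate(disk_sizes))  # cumulative end of each disk
    if not 0 <= logical_block < ends[-1]:
        raise ValueError("block is outside the logical volume")
    disk = bisect_right(ends, logical_block)  # first disk whose end is past the block
    start = ends[disk - 1] if disk else 0     # where that disk begins in the volume
    return disk, logical_block - start

sizes = [100, 250, 80]            # three mismatched disks
print(sum(sizes))                 # size of the logical volume in blocks
print(jbod_locate(sizes, 100))    # first block of the second disk
```

Because the disks are simply laid end to end, mismatched sizes cost nothing: the volume is as large as all the disks added together.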

RAID 0
RAID 0 isn't really a level of RAID at all. It is not redundant, hence it has level 0 redundancy, i.e. none. RAID 0 stripes the data across the disks like this:

RAID 0 logical disk

Disk 0   Disk 1
A1       A2
A3       A4
A5       A6
A7       A8

If one drive fails: array and data destroyed
If two drives fail: array and data destroyed

Read performance:
Theoretical: number of drives x speed of the slowest disk
Reality: about 75% of theoretical at best

Write performance:
Theoretical: number of drives x speed of the slowest disk
Reality: about 75% of theoretical at best

Advantages:
Faster than one drive
Usually cheaper than one large drive, e.g. 2 x 500 GB drives are cheaper than one 1 TB drive
Can reach larger capacities than one drive can, e.g. two 1 TB drives can create a 2 TB array

Disadvantages:
No redundancy
Higher chance of array failure than one disk

Good usage scenario: web servers, read-only file servers, video editing, page files and temp storage.
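
The striping pattern in the diagram above and the theoretical speed rule can be sketched in a few lines. This is only an illustration: the function names are invented for the example, blocks are numbered from 0 internally, and one block per disk per row is assumed.

```python
def raid0_locate(block, num_disks):
    """Return (disk index, row on that disk) for a 0-based logical block."""
    return block % num_disks, block // num_disks

def raid0_layout(num_blocks, num_disks):
    """List the block labels (A1, A2, ...) that end up on each disk."""
    disks = [[] for _ in range(num_disks)]
    for block in range(num_blocks):
        disk, _row = raid0_locate(block, num_disks)
        disks[disk].append(f"A{block + 1}")
    return disks

def raid0_theoretical_speed(disk_speeds_mb_s):
    """Number of drives x speed of the slowest disk."""
    return len(disk_speeds_mb_s) * min(disk_speeds_mb_s)

print(raid0_layout(8, 2))
theory = raid0_theoretical_speed([100, 80])  # hypothetical drive speeds in MB/s
print(theory, 0.75 * theory)                 # theoretical, and the rough 75% figure
```

With eight blocks on two disks the layout matches the diagram: odd-numbered blocks land on Disk 0 and even-numbered blocks on Disk 1.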

RAID 1
RAID 1 is the first of the real RAID levels. It has level 1 redundancy. RAID 1 works by writing the same data to 2 drives. This is called mirroring.

RAID 1 logical disk

Disk 0   Disk 1
A1       A1
A2       A2
A3       A3
A4       A4

If one disk fails: data is still safe but the array runs degraded
If two disks fail: data is lost. Fatal array failure occurs

Read performance:
Theoretical: around 150% of single disk performance if a good controller is used
Reality: just over 100% of the slowest disk

Write performance:
Theoretical: around the same as a single disk
Reality: just under that if a software controller is used

Advantages:
All data is kept intact after a single disk failure
No real performance hit, as there is no parity data to be worked out

Disadvantages:
Inefficient use of space
Still runs the risk of failure, although the chance of both disks failing at once is quite low

Good usage scenario: general file servers
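
To put rough numbers on the failure risk of a stripe versus a mirror, here is a small sketch. It assumes each disk fails independently with the same probability over some period, and it ignores rebuilds and replacement disks; the 5% figure is purely hypothetical.

```python
def raid0_failure_probability(p, num_disks):
    """A RAID 0 array is lost as soon as any one disk fails."""
    return 1 - (1 - p) ** num_disks

def raid1_failure_probability(p, num_disks=2):
    """A RAID 1 array is lost only if every mirrored disk fails."""
    return p ** num_disks

p = 0.05  # hypothetical chance of one disk failing in the period
print(f"single disk:   {p:.4f}")
print(f"2-disk RAID 0: {raid0_failure_probability(p, 2):.4f}")
print(f"2-disk RAID 1: {raid1_failure_probability(p):.4f}")
```

Under these assumptions the stripe is nearly twice as likely to fail as a single disk, while the mirror is far less likely to fail than either.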

RAID 5
I must admit that I have a soft spot for RAID 5. It is my favorite RAID level. It has saved me on a number of occasions involving many gigs of data, including my AS ICT project the day before it had to be in, so I must say that I am slightly biased towards it.

RAID 5 uses block-level striping in a similar fashion to RAID 0. The crucial difference is that RAID 5 adds the additional step of working out a parity block for each stripe of data and storing it on the disk that is not being used to store data for that stripe. The disk used to store the parity changes with every stripe. This creates the distributed parity of RAID 5.

RAID 5 also suffers from a condition called the RAID 5 write hole. This happens when the system fails while there are still outstanding writes in the cache, and the parity of the stripe may become corrupt. If this is not fixed before one of the disks fails, it will lead to data corruption.

RAID 5 needs at least 3 drives to work. (A two-disk RAID 5 set is possible on some controllers but is not often used, as it negates a lot of the benefits of RAID 5.)

RAID 5 logical disk

Key: P = parity block, A = data block

Disk 0   Disk 1   Disk 2
A1       A2       P1
A3       P2       A4
P3       A5       A6
A7       A8       P4
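
RAID 5 parity is normally a byte-by-byte XOR of the data blocks in a stripe, which is what lets a lost block be rebuilt from the survivors. The sketch below shows the idea on tiny made-up blocks, together with one rotation rule that reproduces the parity positions in the diagram above. The function names are invented for the example, and real controllers may rotate parity differently.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_disk(stripe, num_disks):
    """Disk holding the parity for a 0-based stripe (rotation used in the diagram)."""
    return (num_disks - 1) - (stripe % num_disks)

# One stripe on a 3-disk array: two data blocks plus their parity.
a1 = b"\x0f\xf0\xaa"
a2 = b"\x33\x55\x01"
p1 = xor_blocks([a1, a2])

# Pretend the disk holding A2 has died: rebuild it from A1 and P1.
recovered = xor_blocks([a1, p1])
print(recovered == a2)
print([parity_disk(s, 3) for s in range(4)])  # parity disk for stripes 1 to 4
```

The same XOR runs on every write, which is why small scattered writes cost more on RAID 5 than on RAID 0.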

If one drive fails: array runs in degraded mode. All disk space is still available.
If two drives fail: array fails

Read performance:
Theoretical: roughly the same speed as RAID 0 unless a block fails a CRC check. Then the data is read from parity, causing a slight dip in performance.
Reality: if a good hardware controller is used, the performance is much the same as theoretical

Write performance:
Theoretical: if a large number of small changes are made, data can become backed up and performance can take a hit. If large files are written, performance can be very good.
Reality: very similar to the theoretical case but very dependent on the controller

Good usage scenario: gaming desktops, critical file servers

RAID 6
RAID 6 is a natural evolution of RAID 5. It adds a second set of parity data, so the equivalent of two disks is given over to parity. RAID 6 excels in larger disk sets where reliability is needed. For example, in a set of 4 disks RAID 6 would give you the space of just 2, meaning it is only as disk-efficient as RAID 1+0 and would not be as fast. In a 12-disk array, however, you would have the disk space of 10 of the disks and the array would be able to sustain 2 disk failures, whereas RAID 5 would be able to sustain only one, and in such a large array multiple failures become much more likely.
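
The space trade-offs mentioned above are easy to check with a little arithmetic. The sketch below assumes every disk in the array is the same size and reports usable space as a number of disks; the function name and level labels are made up for the example.

```python
def usable_disks(level, num_disks):
    """Usable capacity, in whole disks, for an array of equal-sized disks."""
    if level == "raid0":
        return num_disks                 # no redundancy at all
    if level == "raid1":
        return 1                         # every disk holds the same data
    if level == "raid5" and num_disks >= 3:
        return num_disks - 1             # one disk's worth of parity
    if level == "raid6" and num_disks >= 4:
        return num_disks - 2             # two disks' worth of parity
    if level == "raid10" and num_disks >= 4 and num_disks % 2 == 0:
        return num_disks // 2            # half the disks are mirrors
    raise ValueError(f"unsupported combination: {level} with {num_disks} disks")

for level, count in [("raid6", 4), ("raid10", 4), ("raid5", 12), ("raid6", 12)]:
    print(level, count, "disks ->", usable_disks(level, count), "usable")
```

At 4 disks RAID 6 and RAID 1+0 both leave 2 disks of space, while at 12 disks RAID 6 gives up only one more disk than RAID 5 in exchange for surviving a second failure.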

Nested raid levels


With nested RAID levels there can be many different combinations, almost too many, and I'm not going to cover them in much detail. The most common nested RAID levels are:

RAID 10
RAID 0+1
RAID 100
RAID 50
RAID 60

For more information go to:


http://en.wikipedia.org/wiki/Nested_RAID_levels

Non standard raid levels


There are a few non-standard RAID levels that you might come across. These are detailed below. They are not usually given a number like RAID 5 or RAID 1; instead they have names given to them by the company that created them.

Intel Matrix raid


Intel Matrix RAID is a system that combines more than one RAID level on the same two or more disks, without needing a separate set of disks for each RAID level. The diagram below uses a RAID 1 and a RAID 0 set:

Intel Matrix logical disk, RAID 0 and RAID 1 split

Partition   Disk 0   Disk 1
RAID 0      A1       A2
RAID 0      A3       A4
RAID 0      A5       A6
RAID 0      A7       A8
RAID 1      A1       A1
RAID 1      A2       A2
RAID 1      A3       A3
RAID 1      A4       A4

Additionally, you can use a RAID 5 set, which needs at least 3 disks. The example below uses 4:

Intel Matrix logical disk, RAID 0 and RAID 5 split

Partition   Disk 0   Disk 1   Disk 2   Disk 3
RAID 0      A1       A2       A3       A4
RAID 0      A5       A6       A7       A8
RAID 0      A9       A10      A11      A12
RAID 0      A13      A14      A15      A16
RAID 5      A1       A2       A3       P1
RAID 5      A4       A5       P2       A6
RAID 5      A7       P3       A8       A9
RAID 5      P4       A10      A11      A12
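
To see what such a split means for usable space, here is a rough sketch. It assumes equal-sized disks, each divided into a RAID 0 slice and a slice for the second level, with the RAID 1 case limited to two disks as in the first diagram. The function name and the sizes are hypothetical and are not taken from Intel's documentation.

```python
def matrix_volumes(num_disks, disk_gb, raid0_slice_gb, second_level):
    """Usable size in GB of the two volumes on a Matrix-style split."""
    if not 0 < raid0_slice_gb < disk_gb:
        raise ValueError("the RAID 0 slice must leave room for the second volume")
    rest = disk_gb - raid0_slice_gb          # per-disk space left for the second level
    if second_level == "raid1":
        if num_disks != 2:
            raise ValueError("this sketch only models a 2-disk mirror")
        second = rest                        # both disks hold the same data
    elif second_level == "raid5":
        if num_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        second = rest * (num_disks - 1)      # one slice's worth goes to parity
    else:
        raise ValueError("second_level must be 'raid1' or 'raid5'")
    return {"raid0": raid0_slice_gb * num_disks, second_level: second}

print(matrix_volumes(2, 500, 400, "raid1"))
print(matrix_volumes(4, 500, 200, "raid5"))
```

The appeal of the split is that the fast but fragile RAID 0 volume and the redundant volume share the same physical disks.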

Part 3: Controllers
There are two distinct types of RAID controller available today. These are:
1. Hardware
2. Software

Hardware controllers
Hardware controllers are expensive but they give you the best performance available. These controllers have a hardware processor that usually has accompanying RAM and flash storage. They can be thought of as a whole extra computer inside your computer. Hardware controllers are usually powered by I/O processors made by Intel or Freescale, although some, like 3ware and Areca cards, have proprietary RAID engines. Hardware controllers are made to be completely independent of and invisible to the system, visible to the end user only through the RAID BIOS and monitoring utilities, although most operating systems need drivers to see the card.

Luckily, if you are after a cheap, fully hardware RAID card you do not have to look far. The Revo card made by XFX is a fully hardware card and supports RAID 3. At the time of writing the card is available for around 30 - 40 and has the bonus of 64 MB of on-board cache. There are also cheap hardware RAID cards on eBay almost constantly.

Characteristics of a hardware raid card:


Expensive
Big (some cards, such as IBM's ServeRAID cards, can be 14 inches long)
Dedicated on-board I/O processor
Usually has on-board cache (either as a replaceable and upgradeable DIMM or soldered onto the board)

Software controller
Almost every motherboard shipped today has some form of software RAID controller on board. These software RAID controllers are usually integrated into the chipset, such as Intel southbridges ending in R (e.g. ICH5R, ICH6R, ICH7R, ICH8R and ICH9R) and almost all Nvidia nForce chipsets. Software controllers also come in the form of cheap controller cards that offer RAID functionality. The cards themselves do very little of the RAID work on the chip itself. It is usually provided by the driver, which offloads all of the processing to the CPU. This is not necessarily a bad thing, but large writes can tax the processor, and if you use your computer for gaming a large write can slow your game to a slide show in the worst cases. One of the better software RAID systems is Intel's RAID southbridges. These are incredibly well made controllers with a very mature driver and can easily achieve 300 MB/s throughput with the right hardware.

Part 4: Stripe sizes


On most controllers you have the ability to choose your stripe size. This may look confusing but it really is not. All you really need to know is what the different sizes are good for. There is a general rule of thumb that the stripe size should be twice the file system's block size. This is not necessarily true: with NTFS the block size is 4k, so the stripe size would be 8k, and this would mean poor performance with large files. Ideally, the best point for general usage is around 64k. Having a large stripe size does not affect storage space adversely. These are the typical stripe sizes found today:

4k 8k 16k 32k 64k 128k 256k 512k 1024k

The extremes of this range are only useful in very specialist cases. For the most part, 64k would be the most useful to most people. This assumes that you have chosen to use RAID 5. For RAID 0 you would do just fine using around the 8k mark.
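
One way to get a feel for stripe size is to count how many disks a single read or write ends up touching. The sketch below assumes a RAID 0 style layout in which the stripe size is the amount of data written to one disk before moving on to the next; the function name and the request sizes are hypothetical.

```python
KB = 1024

def disks_touched(offset, length, stripe_size, num_disks):
    """Count the disks involved in one request of `length` bytes at `offset`."""
    first_chunk = offset // stripe_size
    last_chunk = (offset + length - 1) // stripe_size
    return min(last_chunk - first_chunk + 1, num_disks)

# A small 4k request against two different stripe sizes on a 4-disk array.
print(disks_touched(0, 4 * KB, 64 * KB, 4))
print(disks_touched(0, 4 * KB, 8 * KB, 4))
# A 16k request on a 2-disk array with an 8k stripe is split over both disks.
print(disks_touched(0, 16 * KB, 8 * KB, 2))
# A 1 MB request spreads over every disk with a 64k stripe.
print(disks_touched(0, 1024 * KB, 64 * KB, 4))
```

Small requests stay on a single disk whatever the stripe size, while large requests are shared across the array, so the right choice depends on the typical size of the files being read and written.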

Produced by: Alistair Senior

With thanks to:
Supershanks: for pointing out another way of using Matrix RAID.
TiG: for giving me the idea to put in usage scenarios.
Streetster: spelling, punctuation and layout fixing.
And my mother: for not getting annoyed at finding hard drives scattered around the house.

Questions, comments, suggestions and hate mail to: al[at]alsenior.co.uk or PM 'alsenior' on the Hexus community forums. http://forums.hexus.net

All trademarks are properties of their respective owners. This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 2.0 UK: England & Wales License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.0/uk/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
