
© 2009 EMC Corporation. All rights reserved.

EMC Proven Professional



The #1 Certification Program in the information storage
and management industry

Data Protection: RAID
Chapter 3
Section 1: Storage System
Why RAID
o Performance limitations of a single disk drive
o An individual drive has a certain life expectancy
o Measured in MTBF (Mean Time Between Failures)
o The more HDDs in a storage array, the higher the probability
of a disk failure. For example:
o If the MTBF of a drive is 750,000 hours, and there are 100 drives in the array,
then the MTBF of the array becomes 750,000 / 100, or 7,500 hours
o RAID was introduced to mitigate this problem
o RAID provides:
o Increased capacity
o Higher availability
o Increased performance
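
The MTBF arithmetic above can be checked with a short sketch. It assumes, as the slide does, that drives are identical and fail independently, so the array MTBF is simply the drive MTBF divided by the number of drives. The function name array_mtbf_hours is hypothetical.

```python
def array_mtbf_hours(drive_mtbf_hours: float, drive_count: int) -> float:
    """Approximate MTBF of an array of identical, independent drives."""
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return drive_mtbf_hours / drive_count


mtbf = array_mtbf_hours(750_000, 100)
print(mtbf)        # 7500.0 hours, matching the slide
print(mtbf / 24)   # 312.5 days between expected drive failures
```

In other words, under these assumptions a 100-drive array would be expected to lose a drive roughly every 312 days, which is why some form of redundancy is needed.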
Chapter objectives
After completing this chapter, you will be able to:
o Describe what RAID is and the needs it addresses
o Describe the concepts upon which RAID is built
o Define and compare RAID levels
o Recommend the use of the common RAID levels based on
performance and availability considerations
o Explain factors impacting disk drive performance
RAID Array Components


[Figure: RAID array components. A host connects to a RAID array, in which a RAID controller manages hard disks grouped into physical arrays and logical arrays.]
RAID Implementations
o Hardware (usually a specialized disk controller card)
o Controls all drives attached to it
o Array(s) appear to the host operating system as a regular disk drive
o Provided with administrative software
o Software
o Runs as part of the operating system
o Performance is dependent on CPU workload
o Does not support all RAID levels
RAID Levels
o RAID 0: Striped array with no fault tolerance
o RAID 1: Disk mirroring
o Nested RAID (e.g., 1+0, 0+1)
o RAID 3: Parallel access array with dedicated parity disk
o RAID 4: Striped array with independent disks and a dedicated parity
disk
o RAID 5: Striped array with independent disks and distributed parity
o RAID 6: Striped array with independent disks and dual distributed
parity

Data Organization: Striping
[Figure: Striping. Each disk holds strips (Strip 1, Strip 2, Strip 3); the aligned strips across the disks form a stripe (Stripe 1, Stripe 2).]
RAID 0
o Data is distributed across the HDDs in the RAID set.
o Allows multiple strips of data to be read or written simultaneously,
and therefore improves performance.
o Does not provide data protection and availability in the event
of disk failures.
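
A minimal sketch of how striping could map logical blocks onto disks, assuming one block per strip and simple round-robin placement. The function name locate_block and the four-disk set are illustrative assumptions, not a description of any particular controller.

```python
def locate_block(logical_block: int, disk_count: int) -> tuple[int, int]:
    """Return (disk index, stripe index) for a logical block in a RAID 0 set."""
    disk = logical_block % disk_count      # consecutive blocks go to consecutive disks
    stripe = logical_block // disk_count   # a new stripe starts after every disk is used
    return disk, stripe


# Eight consecutive blocks on an assumed four-disk RAID 0 set.
for block in range(8):
    disk, stripe = locate_block(block, 4)
    print(f"block {block} -> disk {disk}, stripe {stripe}")
```

Because consecutive blocks land on different disks, a large request can be serviced by several disks at once. Losing any one disk, however, loses part of every stripe.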
[Figure: RAID 0. The host writes through the RAID controller, which stripes data blocks across the disks in the RAID set.]
RAID 1
o Data is stored on two different HDDs, yielding two copies of
the same data.
o Provides availability.
o In the event of HDD failure, access to data is still available from
the surviving HDD.
o When the failed disk is replaced with a new one, data is
automatically copied from the surviving disk to the new disk.
o Done automatically by the RAID controller.
o Disadvantage: The storage capacity required is twice the
amount of data stored.
o Mirroring is NOT the same as doing backup!
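
The behaviour described above can be illustrated with a toy model. This is a sketch only: the class MirroredPair and its methods are hypothetical, and the "disks" are in-memory dictionaries rather than real devices.

```python
class MirroredPair:
    """Toy RAID 1 set: two in-memory 'disks' holding identical blocks."""

    def __init__(self):
        self.disks = [{}, {}]            # block number -> data
        self.failed = [False, False]

    def write(self, block, data):
        # Every write is committed to each healthy disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block):
        # Serve the read from the first healthy disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both disks have failed")

    def fail(self, i):
        self.failed[i] = True
        self.disks[i] = {}

    def replace(self, i):
        # Rebuild: copy every block from the surviving disk to the new one.
        self.disks[i] = dict(self.disks[1 - i])
        self.failed[i] = False


pair = MirroredPair()
pair.write(0, b"payroll")
pair.fail(0)
print(pair.read(0))             # still served from the surviving disk
pair.replace(0)                 # rebuild onto the replacement disk
pair.write(0, b"corrupted")     # an unwanted write lands on both copies
print(pair.disks[0][0], pair.disks[1][0])
```

The last write shows why mirroring is not a backup: an unwanted change is applied to both copies at once, so there is no older version to restore.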
[Figure: RAID 1. The RAID controller writes each block from the host (Block 0, Block 1) to both disks of the mirrored pair.]
Nested RAID
o Combines the performance benefits of RAID 0 with the
redundancy benefit of RAID 1.
o RAID 0+1 Mirrored Stripe
o Data is striped across HDDs, then the entire stripe is mirrored.
o If one drive fails, the entire stripe is faulted.
o Rebuild operation requires data to be copied from each disk in the healthy
stripe, causing increased load on the surviving disks.
o RAID 1+0 Striped Mirror
o Data is first mirrored, and then both copies are striped across multiple
HDDs.
o When a drive fails, data is still accessible from its mirror.
o Rebuild operation only requires data to be copied from the surviving disk
into the replacement disk.
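
One way to see the difference between the two nested layouts is to count which double-drive failures each one survives. The sketch below assumes a four-disk set (disks 0 to 3) grouped as shown in the constants; the grouping and the function names are illustrative assumptions.

```python
from itertools import combinations

STRIPE_SETS = [{0, 1}, {2, 3}]    # RAID 0+1: two stripes that mirror each other
MIRROR_PAIRS = [{0, 1}, {2, 3}]   # RAID 1+0: two mirrored pairs, striped together


def survives_0_plus_1(failed: set) -> bool:
    # One failed drive faults its whole stripe, so one stripe must be fully intact.
    return any(not (stripe & failed) for stripe in STRIPE_SETS)


def survives_1_plus_0(failed: set) -> bool:
    # Each mirrored pair needs at least one surviving drive.
    return all(pair - failed for pair in MIRROR_PAIRS)


double_failures = [set(c) for c in combinations(range(4), 2)]
ok_01 = sum(survives_0_plus_1(f) for f in double_failures)
ok_10 = sum(survives_1_plus_0(f) for f in double_failures)
print(f"RAID 0+1 survives {ok_01} of {len(double_failures)} double failures")
print(f"RAID 1+0 survives {ok_10} of {len(double_failures)} double failures")
```

Under these assumptions RAID 0+1 survives 2 of the 6 possible double failures and RAID 1+0 survives 4 of 6, which is consistent with the rebuild behaviour described above.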
Nested RAID 0+1 (Striping and Mirroring)
[Figure: The RAID controller stripes Blocks 0 to 3 from the host across one set of disks (RAID 0), then mirrors the whole stripe to a second set of disks (RAID 1).]
Nested RAID 1+0 (Mirroring and Striping)
[Figure: The RAID controller mirrors each block from the host onto a pair of disks (RAID 1), then stripes Blocks 0 to 3 across the mirrored pairs (RAID 0).]
RAID Redundancy: Parity
[Figure: RAID parity. The host writes through the RAID controller to striped data disks plus a dedicated parity disk.]
Parity calculation: 4 + 6 + 1 + 7 = 18
The middle drive fails:
4 + 6 + ? + 7 = 18
? = 18 - 4 - 6 - 7
? = 1
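
The slide uses ordinary addition to make the idea easy to follow. Parity is actually computed with XOR operations, and the same recovery works there because XOR is its own inverse. A minimal sketch using the slide's values (the variable names are illustrative):

```python
from functools import reduce
from operator import xor

data_strips = [4, 6, 1, 7]            # values from the slide
parity = reduce(xor, data_strips)      # XOR parity instead of an arithmetic sum

# The drive holding the value 1 fails: XOR the survivors with the parity.
survivors = [4, 6, 7]
recovered = reduce(xor, survivors, parity)

print(parity)      # 4
print(recovered)   # 1
```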
RAID 3 and RAID 4

o Stripes data for high performance and uses parity for improved
fault tolerance.
o One drive is dedicated to parity information.
o If a drive fails, its data can be reconstructed from the parity
drive and the surviving data drives.
o For RAID 3, data reads and writes are done across the entire stripe.
o Provides good bandwidth for large sequential data access such as video
streaming.
o For RAID 4, data reads and writes can be done independently on a single
disk.
[Figure: RAID 3. The RAID controller stripes Blocks 0 to 3 from the host across the data disks and writes the generated parity (P 0 1 2 3) to a dedicated parity disk.]
RAID 5 and RAID 6
o RAID 5 is similar to RAID 4, except that the parity is distributed
across all disks instead of stored on a dedicated disk.
o This overcomes the write bottleneck on the parity disk.
o RAID 6 is similar to RAID 5, except that it includes a second
parity element to allow survival in the event of two disk
failures.
o The probability of this happening increases as the number of drives in
the array increases.
o Calculates both horizontal parity (as in RAID 5) and diagonal parity.
o Has a higher write penalty than RAID 5.
o Rebuild operation may take longer than on RAID 5.
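
To make "distributed parity" concrete, the sketch below prints one possible RAID 5 layout. The rotation rule (parity starts on the last disk and moves one disk to the left per stripe) is an assumption chosen for illustration, since real implementations use several different rotation schemes. The function name raid5_layout is hypothetical.

```python
def raid5_layout(stripe_count: int, disk_count: int) -> list[list[str]]:
    """Return one row per stripe; 'P' marks the parity strip, 'Dn' a data strip."""
    rows = []
    block = 0
    for stripe in range(stripe_count):
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (disk_count - 1 - stripe) % disk_count
        row = []
        for disk in range(disk_count):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(4, 4):
    print(" ".join(f"{cell:>3}" for cell in row))
```

Over four stripes each of the four disks holds parity exactly once, so parity updates are spread across the set instead of queuing on one dedicated disk as in RAID 4.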
[Figure: RAID 5. The RAID controller stripes blocks from the host across all disks and distributes the generated parity (P 0 1 2 3, P 4 5 6 7) across them instead of placing it on a dedicated disk.]
RAID Comparison
RAID | Min Disks | Storage Efficiency % | Cost | Read Performance | Write Performance
0 | 2 | 100 | Low | Very good for both random and sequential reads | Very good
1 | 2 | 50 | High | Good; better than a single disk | Good; slower than a single disk, as every write must be committed to two disks
3 | 3 | (n-1)*100/n, where n = number of disks | Moderate | Good for random reads and very good for sequential reads | Poor to fair for small random writes; good for large, sequential writes
5 | 3 | (n-1)*100/n, where n = number of disks | Moderate | Very good for random reads; good for sequential reads | Fair for random writes, slower due to parity overhead; fair to good for sequential writes
6 | 4 | (n-2)*100/n, where n = number of disks | Moderate, but more than RAID 5 | Very good for random reads; good for sequential reads | Good for small, random writes (has write penalty)
1+0 and 0+1 | 4 | 50 | High | Very good | Good
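
The storage efficiency column can be evaluated for a given number of disks with a short sketch. The function name storage_efficiency and the string labels for the RAID levels are illustrative assumptions; the formulas and minimum disk counts are the ones in the comparison.

```python
MIN_DISKS = {"0": 2, "1": 2, "3": 3, "5": 3, "6": 4, "1+0": 4, "0+1": 4}


def storage_efficiency(level: str, n: int) -> float:
    """Usable capacity as a percentage of raw capacity for n equal disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown RAID level: {level}")
    if n < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "0":
        return 100.0
    if level in ("1", "1+0", "0+1"):
        return 50.0                   # every block is stored twice
    if level in ("3", "5"):
        return (n - 1) * 100 / n      # one disk's worth of parity
    return (n - 2) * 100 / n          # RAID 6: two disks' worth of parity


for level, n in [("0", 4), ("1", 2), ("5", 5), ("6", 8), ("1+0", 4)]:
    print(f"RAID {level} with {n} disks: {storage_efficiency(level, n):.1f}%")
```

For parity RAID the efficiency improves as disks are added (80% for a five-disk RAID 5, 75% for an eight-disk RAID 6), while mirrored layouts stay at 50%.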
RAID Impacts on Performance
o Small (less than element size) write on RAID 3 and RAID 5
o Ep = E1 + E2 + E3 + E4 (XOR operations)
o If parity is valid, then: Ep new = Ep old - E4 old + E4 new (XOR operations)
o 2 disk reads (E4 old and Ep old) and 2 disk writes (E4 new and Ep new)
o Parity vs. mirroring
o Reading, calculating, and writing the parity segment introduces a penalty on every write operation
o The parity RAID penalty shows up as slower cache flushes
o Increased write load can cause contention and slower read response times
[Figure: Small write on a five-disk set (P0, D1, D2, D3, D4). The RAID controller reads E4 old and Ep old, computes Ep new = Ep old - E4 old + E4 new with 2 XOR operations, and writes E4 new and Ep new.]
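
The small-write parity update can be checked numerically. In the sketch below the strip values are arbitrary examples; the point is that updating the parity from only the old parity, the old E4, and the new E4 gives the same result as recomputing it from all four strips. With XOR, the "-" and "+" in the formula are the same operation.

```python
from functools import reduce
from operator import xor

strips = [0b1010, 0b0110, 0b0001, 0b1111]   # E1..E4, arbitrary example values
ep_old = reduce(xor, strips)                 # Ep = E1 + E2 + E3 + E4 (XOR)

e4_old, e4_new = strips[3], 0b0101

# Small write: read E4 old and Ep old (2 reads), then write E4 new and Ep new (2 writes).
ep_new = ep_old ^ e4_old ^ e4_new

# Cross-check against a full recomputation over the updated stripe.
strips[3] = e4_new
print(ep_new == reduce(xor, strips))         # True
```

The shortcut avoids reading E1, E2, and E3, but it still costs four disk I/Os for one host write.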
RAID Penalty Exercise
o Total IOPS at peak workload is 1200
o Read/Write ratio 2:1
o Calculate the disk IOPS requirement at peak activity for:
o RAID 1/0
o RAID 5
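
One way to work the exercise is sketched below. It assumes a write penalty of 2 for RAID 1/0 (every write is committed to two disks) and 4 for RAID 5 (2 disk reads and 2 disk writes per small write), that all writes are small random writes, and that cache effects are ignored. The function name disk_iops is hypothetical.

```python
def disk_iops(total_iops: float, read_parts: int, write_parts: int,
              write_penalty: int) -> float:
    """Back-end disk IOPS needed to serve a host workload on a RAID set."""
    parts = read_parts + write_parts
    reads = total_iops * read_parts / parts
    writes = total_iops * write_parts / parts
    return reads + write_penalty * writes


raid10 = disk_iops(1200, 2, 1, write_penalty=2)   # 800 + 2 * 400
raid5 = disk_iops(1200, 2, 1, write_penalty=4)    # 800 + 4 * 400
print(raid10)   # 1600.0
print(raid5)    # 2400.0
```

Under these assumptions the same 1200 host IOPS require 1600 disk IOPS on RAID 1/0 and 2400 disk IOPS on RAID 5.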


Additional Task
Discuss the impact of sequential and random I/O in different RAID
configurations.

Hot Spares
[Figure: RAID controller and hot spare disks.]
Chapter Summary
Key points covered in this chapter:
o What RAID is and the needs it addresses
o The concepts upon which RAID is built
o Some commonly implemented RAID levels
For more information visit http://education.EMC.com