
Section 1: Storage System

Data Protection: RAID

Chapter 3
EMC Proven Professional
The #1 Certification Program in the information storage and management industry
© 2009 EMC Corporation. All rights reserved.

Why RAID
o A single disk drive has limited performance
o An individual drive has a certain life expectancy
o Measured in MTBF (Mean Time Between Failures)
o The more HDDs there are in a storage array, the higher
the probability of a disk failure. For example:
o If the MTBF of a drive is 750,000 hours, and there are 100 drives
in the array, then the MTBF of the array becomes 750,000 / 100,
or 7,500 hours

o RAID was introduced to mitigate this problem

o RAID provides:
o Increased capacity
o Higher availability
o Increased performance
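
The MTBF arithmetic above can be checked with a short sketch. The function name `array_mtbf_hours` is made up for this illustration, and the model is the same simple one the slide uses: every drive has the same MTBF and drives fail independently.

```python
def array_mtbf_hours(drive_mtbf_hours: float, drive_count: int) -> float:
    """Approximate MTBF of an array: drive MTBF divided by the number of drives.

    Assumes identical drives that fail independently (the slide's simple model).
    """
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return drive_mtbf_hours / drive_count


mtbf = array_mtbf_hours(750_000, 100)
print(f"{mtbf:,.0f} hours, about {mtbf / 24:.1f} days")
# prints "7,500 hours, about 312.5 days"
```

Under this model, a 100-drive array can expect a drive failure roughly every 312 days, which is why some form of protection is needed.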

Chapter objectives
After completing this chapter, you will be able to:
o Describe what RAID is and the needs it addresses
o Describe the concepts upon which RAID is built
o Define and compare RAID levels
o Recommend the use of the common RAID levels
based on performance and availability
considerations
o Explain factors impacting disk drive performance


RAID Array Components


[Figure: RAID array components. A host connects to a RAID array, which contains a RAID controller and hard disks grouped into physical arrays and logical arrays.]


RAID Implementations
o Hardware (usually a specialized disk controller
card)
o Controls all drives attached to it
o Arrays appear to the host operating system as regular disk
drives
o Provided with administrative software

o Software
o Runs as part of the operating system
o Performance is dependent on CPU workload
o Does not support all RAID levels


RAID Levels
o RAID 0: Striped array with no fault tolerance
o RAID 1: Disk mirroring
o Nested RAID (e.g., 1+0, 0+1)
o RAID 3: Parallel access array with dedicated parity disk
o RAID 4: Striped array with independent disks and a
dedicated parity disk
o RAID 5: Striped array with independent disks and
distributed parity
o RAID 6: Striped array with independent disks and dual
distributed parity

Data Organization: Striping


[Figure: Striping. Each disk in the set is divided into strips (Strip 1, Strip 2, Strip 3); the strips at the same position across all disks form a stripe (Stripe 1, Stripe 2).]

RAID 0
o Data is striped across the HDDs in the RAID
set.
o Allows data on multiple disks to be read or written
simultaneously, and therefore improves
performance.
o Does not provide data protection or availability
in the event of a disk failure.
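
How striped data lands on the disks can be sketched as a simple round-robin mapping. This is only an illustration: the function name `locate_strip` and the four-disk set are assumptions, and real controllers also take the strip size and the starting offset into account.

```python
def locate_strip(strip_number: int, disk_count: int) -> tuple[int, int]:
    """Return (disk index, stripe index) for a strip in a RAID 0 set.

    Assumes plain round-robin placement starting at disk 0.
    """
    if disk_count < 1 or strip_number < 0:
        raise ValueError("need disk_count >= 1 and strip_number >= 0")
    stripe_index, disk_index = divmod(strip_number, disk_count)
    return disk_index, stripe_index


# Place strips 0..11 on a hypothetical four-disk RAID 0 set.
layout = {disk: [] for disk in range(4)}
for strip in range(12):
    disk, _stripe = locate_strip(strip, 4)
    layout[disk].append(strip)

for disk, strips in layout.items():
    print(f"disk {disk}: strips {strips}")
# disk 0: strips [0, 4, 8]
# disk 1: strips [1, 5, 9]
# disk 2: strips [2, 6, 10]
# disk 3: strips [3, 7, 11]
```

Because consecutive strips sit on different disks, a large request is serviced by all four disks at once. Losing any one disk loses every fourth strip, which is why RAID 0 offers no protection.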


RAID 0
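[Figure: RAID 0. The host sends numbered blocks to the RAID controller, which stripes them across the disks in the set (for example, blocks 1, 5, 9 on one disk, blocks 2, 6, 10 on the next, and blocks 3, 7, 11 on the next).]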

RAID 1
o Data is stored on two different HDDs, yielding two
copies of the same data.
o Provides availability.
o In the event of HDD failure, access to data is still
available from the surviving HDD.
o When the failed disk is replaced with a new one, data
is automatically copied from the surviving disk to the
new disk.
o Done automatically by the RAID controller.
o Disadvantage: The raw storage capacity required is
twice the amount of data stored.
o Mirroring is NOT the same as doing backup!

RAID 1

[Figure: RAID 1. The host writes Block 0 and Block 1 through the RAID controller, which stores a copy of each block on both disks.]

Nested RAID
o Combines the performance benefits of RAID 0 with the
redundancy benefit of RAID 1.
o RAID 0+1: Mirrored Stripe
o Data is striped across HDDs, then the entire stripe is mirrored.
o If one drive fails, the entire stripe is faulted.
o Rebuild operation requires data to be copied from each disk in
the healthy stripe, causing increased load on the surviving disks.

o RAID 1+0: Striped Mirror
o Data is first mirrored, and then both copies are striped across
multiple HDDs.
o When a drive fails, data is still accessible from its mirror.
o Rebuild operation only requires data to be copied from the
surviving disk into the replacement disk.
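
The difference in fault behavior between the two nested layouts can be made concrete by counting which two-disk failures each one survives. The sketch below assumes a hypothetical four-disk array (disks 0 to 3); the grouping constants and function names are made up for the illustration.

```python
from itertools import combinations

# Hypothetical four-disk array, disks numbered 0..3.
STRIPE_SETS = [{0, 1}, {2, 3}]   # RAID 0+1: two striped sets that mirror each other
MIRROR_PAIRS = [{0, 1}, {2, 3}]  # RAID 1+0: two mirrored pairs that are striped together


def raid01_survives(failed: set[int]) -> bool:
    """RAID 0+1 survives only if at least one whole stripe set has no failed disk."""
    return any(not (stripe_set & failed) for stripe_set in STRIPE_SETS)


def raid10_survives(failed: set[int]) -> bool:
    """RAID 1+0 survives if every mirrored pair still has one working disk."""
    return all(pair - failed for pair in MIRROR_PAIRS)


double_failures = [set(c) for c in combinations(range(4), 2)]
raid01_ok = sum(raid01_survives(f) for f in double_failures)
raid10_ok = sum(raid10_survives(f) for f in double_failures)

print(f"RAID 0+1 survives {raid01_ok} of {len(double_failures)} two-disk failures")
print(f"RAID 1+0 survives {raid10_ok} of {len(double_failures)} two-disk failures")
# RAID 0+1 survives 2 of 6 two-disk failures
# RAID 1+0 survives 4 of 6 two-disk failures
```

With the same number of disks, RAID 1+0 tolerates more double failures because a single failed drive takes down only its own mirror pair, not an entire stripe set.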

Nested RAID 0+1 (Striping and Mirroring)

[Figure: Nested RAID 0+1. The host writes blocks 0 to 3 through the RAID controller. The blocks are striped across a RAID 0 set (blocks 0 and 2 on one disk, blocks 1 and 3 on the other), and the whole striped set is mirrored with RAID 1 onto a second pair of disks holding the same blocks.]

Nested RAID 1+0 (Mirroring and Striping)

[Figure: Nested RAID 1+0. The host writes blocks 0 to 3 through the RAID controller. Each block is mirrored within a RAID 1 pair, and the mirrored pairs are striped together with RAID 0.]

RAID Redundancy: Parity


[Figure: RAID redundancy with parity. The host writes through the RAID controller to a set of data disks and a parity disk; the parity disk stores the sum of the data values in each stripe.]

Parity calculation: 4 + 6 + 1 + 7 = 18

The middle drive fails:

4 + 6 + ? + 7 = 18
? = 18 - 4 - 6 - 7
? = 1
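
The parity example above uses ordinary addition to keep the idea simple. In practice parity is computed with bitwise XOR, which is its own inverse, so the same operation both creates the parity and rebuilds a missing value. A minimal sketch, reusing the values 4, 6, 1, 7 from the example (the helper name `xor_parity` is made up here):

```python
from functools import reduce
from operator import xor


def xor_parity(values: list[int]) -> int:
    """XOR all values together; used for both parity creation and rebuild."""
    return reduce(xor, values, 0)


data = [4, 6, 1, 7]
parity = xor_parity(data)           # stored on the parity disk

# The drive holding the third value (1) fails.
survivors = data[:2] + data[3:]     # [4, 6, 7]
rebuilt = xor_parity(survivors + [parity])

print(f"parity = {parity}, rebuilt value = {rebuilt}")
# prints "parity = 4, rebuilt value = 1"
```

Note that the XOR parity (4) differs from the arithmetic sum (18) on the slide; only the recovery principle is the same.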

RAID 3 and RAID 4


o Stripes data for high performance and uses parity
for improved fault tolerance.
o One drive is dedicated to parity information.
o If a drive fails, its data can be reconstructed using
the data on the parity drive and the surviving drives.
o For RAID 3, data read / write is done across the
entire stripe.
o Provides good bandwidth for large sequential data access
such as video streaming.

o For RAID 4, data read/write can be done independently
on a single disk.

RAID 3

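[Figure: RAID 3. The host writes blocks 0 to 3; the RAID controller generates parity and stores Block 0, Block 1, Block 2, and Block 3 on the data disks and P0123 on the dedicated parity disk.]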


RAID 5 and RAID 6


o RAID 5 is similar to RAID 4, except that the parity
is distributed across all disks instead of stored on
a dedicated disk.
o This overcomes the write bottleneck on the parity disk.

o RAID 6 is similar to RAID 5, except that it includes
a second parity element to allow survival in the
event of two disk failures.
o The probability of two disk failures increases as the number
of drives in the array increases.
o Calculates both horizontal parity (as in RAID 5) and diagonal
parity.
o Has a higher write penalty than RAID 5.
o Rebuild operation may take longer than on RAID 5.
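
One way to picture distributed parity is to print where the parity strip lands in each stripe. The sketch below is an assumption-laden illustration: the function name `raid5_layout` is made up, and the rotation rule (parity starts on the last disk and moves one disk to the left per stripe) is only one of several conventions used by real implementations.

```python
def raid5_layout(disk_count: int, stripe_count: int) -> list[list[str]]:
    """Return one row per stripe, listing what each disk holds.

    Assumed rotation: parity starts on the last disk and shifts left
    by one disk for every subsequent stripe.
    """
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    rows = []
    next_block = 0
    for stripe in range(stripe_count):
        parity_disk = (disk_count - 1 - stripe) % disk_count
        blocks = list(range(next_block, next_block + disk_count - 1))
        next_block += disk_count - 1
        row = [f"B{b}" for b in blocks]
        # Label the parity strip after the data blocks it protects.
        row.insert(parity_disk, "P" + "".join(str(b) for b in blocks))
        rows.append(row)
    return rows


for row in raid5_layout(disk_count=5, stripe_count=2):
    print("  ".join(f"{cell:<6}" for cell in row))
```

For five disks this places P0123 on the last disk and P4567 on the fourth disk, so parity writes for different stripes go to different drives instead of queuing on one dedicated parity disk.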

RAID 5
[Figure: RAID 5. The host writes blocks 0 to 7; the RAID controller generates parity for each stripe and distributes it across the disks, so that P0123 and P4567 are stored on different disks alongside data blocks 0 to 7.]


RAID Comparison
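RAID level: 0
o Min disks: 2
o Storage efficiency %: 100
o Cost: Low
o Read performance: Very good for both random and sequential reads
o Write performance: Very good

RAID level: 1
o Min disks: 2
o Storage efficiency %: 50
o Cost: High
o Read performance: Good. Better than a single disk
o Write performance: Good. Slower than a single disk, as every write must be committed to two disks

RAID level: 3
o Min disks: 3
o Storage efficiency %: (n-1)*100/n, where n = number of disks
o Cost: Moderate
o Read performance: Good for random reads and very good for sequential reads
o Write performance: Poor to fair for small random writes. Good for large, sequential writes

RAID level: 5
o Min disks: 3
o Storage efficiency %: (n-1)*100/n, where n = number of disks
o Cost: Moderate
o Read performance: Very good for random reads. Good for sequential reads
o Write performance: Fair for random writes. Slower due to parity overhead. Fair to good for sequential writes

RAID level: 6
o Min disks: 4
o Storage efficiency %: (n-2)*100/n, where n = number of disks
o Cost: Moderate, but more than RAID 5
o Read performance: Very good for random reads. Good for sequential reads
o Write performance: Good for small, random writes (has write penalty)

RAID level: 1+0 and 0+1
o Min disks: 4
o Storage efficiency %: 50
o Cost: High
o Read performance: Very good
o Write performance: Good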
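
The storage efficiency formulas in the comparison can be evaluated for a given disk count with a small helper. This is a sketch only: the function name `storage_efficiency` and the level labels are assumptions, with the (n-1) formula applied to the single-parity levels and the (n-2) formula to RAID 6.

```python
def storage_efficiency(level: str, n: int) -> float:
    """Usable capacity as a percentage of raw capacity for n disks."""
    if n < 2:
        raise ValueError("a RAID set needs at least 2 disks")
    if level == "0":
        return 100.0
    if level in {"1", "1+0", "0+1"}:
        return 50.0
    if level in {"3", "5"}:            # one disk's worth of parity
        return (n - 1) * 100 / n
    if level == "6":                   # two disks' worth of parity
        return (n - 2) * 100 / n
    raise ValueError(f"unsupported RAID level: {level}")


for level, disks in [("0", 4), ("1+0", 4), ("5", 5), ("6", 8)]:
    print(f"RAID {level} with {disks} disks: {storage_efficiency(level, disks):.0f}%")
# RAID 0 with 4 disks: 100%
# RAID 1+0 with 4 disks: 50%
# RAID 5 with 5 disks: 80%
# RAID 6 with 8 disks: 75%
```

Parity levels become more space-efficient as disks are added, while mirrored levels stay at 50% regardless of the disk count.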

RAID Impacts on Performance


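[Figure: Small write on parity RAID. The array holds parity P0 and data disks D1, D2, D3, D4. To update element E4, the RAID controller reads Ep old and E4 old, performs 2 XOR operations, and writes Ep new and E4 new.]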

o Small (less than element size) write on RAID 3 & 5

o Ep = E1 + E2 + E3 + E4 (XOR operations)

o If parity is valid, then: Ep new = Ep old - E4 old + E4 new (XOR operations)

o 2 disk reads and 2 disk writes

o Parity vs. mirroring
o Reading, calculating, and writing the parity segment adds a penalty to every write operation
o The parity RAID penalty shows up as slower cache flushes
o Increased write load can cause contention and slower read response times
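
The small-write shortcut above can be verified with XOR directly: removing the old element from the parity and adding the new one gives the same result as recomputing parity over the whole stripe. The element values below are arbitrary, and the helper names are made up for the sketch.

```python
from functools import reduce
from operator import xor


def full_parity(elements: list[int]) -> int:
    """Ep = E1 xor E2 xor E3 xor E4, computed over the whole stripe."""
    return reduce(xor, elements, 0)


def updated_parity(ep_old: int, e_old: int, e_new: int) -> int:
    """Small-write update: needs only the old parity and the old element."""
    return ep_old ^ e_old ^ e_new


elements = [0x0F, 0x33, 0x55, 0xAA]    # E1..E4, arbitrary example values
ep_old = full_parity(elements)

e4_new = 0x5A
ep_new = updated_parity(ep_old, elements[3], e4_new)
recomputed = full_parity(elements[:3] + [e4_new])

print(f"Ep old = {ep_old:#04x}, Ep new = {ep_new:#04x}, match = {ep_new == recomputed}")
# prints "Ep old = 0xc3, Ep new = 0x33, match = True"
```

The shortcut touches only two disks (read E4 old and Ep old, write E4 new and Ep new), which is where the count of 2 reads and 2 writes per small write comes from.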

RAID Penalty Exercise


o Total IOPS at peak workload is 1200
o Read/Write ratio is 2:1
o Calculate the disk IOPS requirement at peak activity for:
o RAID 1/0
o RAID 5

Additional Task
Discuss the impact of sequential and
random I/O in different RAID
configurations
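
One possible way to work the IOPS part of the exercise is sketched below. It assumes a write penalty of 2 for RAID 1/0 (every write goes to two disks) and 4 for RAID 5 (2 disk reads and 2 disk writes per small write), and it ignores cache effects. The names `WRITE_PENALTY` and `disk_iops` are made up for the illustration.

```python
# Assumed write penalties: disk I/Os generated per host write.
WRITE_PENALTY = {"RAID 1/0": 2, "RAID 5": 4}


def disk_iops(total_iops: float, read_parts: int, write_parts: int, write_penalty: int) -> float:
    """Disk load = host reads + host writes multiplied by the write penalty."""
    parts = read_parts + write_parts
    reads = total_iops * read_parts / parts
    writes = total_iops * write_parts / parts
    return reads + writes * write_penalty


for level, penalty in WRITE_PENALTY.items():
    load = disk_iops(1200, read_parts=2, write_parts=1, write_penalty=penalty)
    print(f"{level}: {load:.0f} disk IOPS")
# RAID 1/0: 1600 disk IOPS
# RAID 5: 2400 disk IOPS
```

With a 2:1 ratio the host issues 800 reads and 400 writes per second, so under these assumptions RAID 5 needs 50% more back-end IOPS than RAID 1/0 for the same workload.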


Hot Spares

[Figure: Hot spares. Spare disks attached to the RAID controller.]


Chapter Summary
Key points covered in this chapter:
o What RAID is and the needs it addresses
o The concepts upon which RAID is built
o Some commonly implemented RAID levels


For more information visit http://education.EMC.com