
Cisco Storage Design Fundamentals

Version 3.0

Student Guide

Table of Contents
Course Introduction
Overview
Recommended Prerequisites
Course Outline
Cisco Certifications
Administrative Information
About Firefly

Lesson 1: SCSI and Fibre Channel Primer


Overview
SCSI Protocol Overview
SCSI Operations
Fibre Channel Overview
Fibre Channel Flow Control
Fibre Channel Addressing
Fabric Login
Standard Fabric Services

Lesson 2: Cisco MDS 9000 Introduction


Overview
Cisco Storage Solutions Overview
Airflow and Power
Software Packages and Licensing

Lesson 3: Architecture and System Components


Overview
System Architecture
Oversubscription and Bandwidth Reservation
Credits and Buffers

Lesson 4: The Multilayer SAN


Overview
Virtual SANs
How VSANs Work
Inter-VSAN Routing (IVR)
PortChannels
Intelligent Addressing
Cisco Fabric Services: Unifying the Fabric
Switch Interoperability

Lesson 5: Remote Lab Overview


Overview
System Memory Areas
CLI Overview
Fabric Manager and Device Manager
System Setup and Configuration
Using the MDS 9000 Remote Storage Labs

Lesson 6: Network-Based Storage Applications


Overview
Storage Virtualization Overview
Network-Based Storage Virtualization
Network-Hosted Applications
Network-Assisted Applications
Network-Accelerated Applications
Fibre Channel Write Acceleration

Lesson 7: Optimizing Performance


Overview
Oversubscription and Blocking
Virtual Output Queues
Fibre Channel Congestion Control
Quality of Service
Port Tracking
Load Balancing
SAN Performance Management

Lesson 8: Securing the SAN Fabric


Overview
SAN Security Issues
Zoning
Port and Fabric Binding
Authentication and Encryption
Management Security
End-to-End Security Design

Lesson 9: Designing SAN Extension Solutions


Overview
SAN Extension Applications
SAN Extension Transports
Extending SANs with WDM
Fibre Channel over IP
Extending SANs with FCIP
Cisco MDS 9000 IP Services Modules
High Availability FCIP Configurations
Using IVR for SAN Extension
SAN Extension Security
FCIP Performance Enhancements

Lesson 10: Building iSCSI Solutions


Overview
What's the Problem?
iSCSI Overview
MDS 9000 IP Services Modules
When to Deploy iSCSI
High-Availability iSCSI Configurations
iSCSI Security
iSCSI Target Discovery
Wide Area File Services




CSDF

Cisco Storage Design Fundamentals


Overview
CSDF is an intensive 2-day instructor-led training (ILT) lecture/lab course that provides learners with basic skills in designing Cisco storage networks. You will learn about and implement a broad range of features on the Cisco MDS 9000 platform, including Virtual SANs (VSANs), PortChannels, advanced security features, SAN extension with FCIP, and iSCSI solutions. In the lab, you will configure the switch from an out-of-the-box state and install the Cisco Fabric Manager GUI management application. You will configure VSANs, zones, PortChannels, and FCIP to implement a high-availability extended SAN design. This course provides an introduction to the MDS 9000 family for pre-sales engineers, system engineers, network engineers, and technical decision makers who need to design and implement SAN fabrics using MDS 9000 Family switches. Enrollment is open to Cisco SEs, Cisco channel partners, and customers.

Recommended Prerequisites
You will gain the most from this course if you have experience working with storage and storage networking technologies.

Course Outline
This slide shows the lessons in this course.

Course Overview
- SCSI and Fibre Channel Primer
- Introduction to the MDS 9000 Platform
- Architecture and System Components
- The Multilayer SAN
- System Areas and Lab Overview
- Network-Based Storage Applications
- Optimizing Performance
- Securing the SAN Fabric
- Designing SAN Extension Solutions
- Building iSCSI Solutions



Cisco Certifications
Cisco Storage Networking Certification Path
Enhance Your Cisco Certifications and Validate Your Areas of Expertise
Cisco Storage Networking Specialists (required exam and recommended training through Cisco Learning Partners):

Cisco Storage Networking Support Specialist
- Required exam: 642-354
- Prerequisite: Valid CCNA Certification
- Recommended training: MDS Configuration and Troubleshooting (MDSCT), Cisco Multiprotocol Storage Essentials (CMSE), Cisco Advanced Storage Implementation and Troubleshooting (CASI)

Cisco Storage Networking Design Specialist
- Required exam: 642-353
- Prerequisite: Valid CCNA Certification
- Recommended training: Cisco MDS Storage Networking Fundamentals (CMSNF or CSDF), Cisco Storage Design Essentials (CSDE)


The Cisco Storage Networking Certification Program is part of the Cisco Career Certifications program. The title of Cisco Qualified Specialist (CQS) is awarded to individuals who have demonstrated significant competency in a specific technology, solution area, or job role through the successful completion of one or more proctored exams. The CQS Storage Networking program consists of two parallel tracks:

- The Cisco Storage Networking Support Specialist (CSNSS) track is for systems engineers, network engineers, and field engineers who install, configure, and troubleshoot Cisco storage networks.
- The Cisco Storage Networking Design Specialist (CSNDS) track is for pre-sales systems and network engineers who design Cisco storage networks. IT managers and project managers will also benefit from this certification.



Cisco Certifications: CCIE Storage Networking


Cisco provides three levels of general certifications for IT professionals, with several different tracks to meet individual needs. Cisco also provides focused certifications for designated areas such as cable communications and security. There are many paths to Cisco certification, but only one requirement: passing one or more exams demonstrating knowledge and skill. For details, go to www.cisco.com/go/certifications.

CCIE certification in Storage Networking indicates expert-level knowledge of intelligent storage solutions over extended network infrastructure using multiple transport options such as Fibre Channel, iSCSI, FCIP, and FICON. Storage networking extensions allow companies to improve disaster recovery, optimize performance, and take advantage of network services such as volume management, data replication, and enhanced integration with blade servers and storage appliances.

There are no formal prerequisites for CCIE certification. Other professional certifications and/or specific training courses are not required. Instead, candidates are expected to have an in-depth understanding of the subtleties, intricacies, and challenges of end-to-end storage area networking. You are strongly encouraged to have 3-5 years of job experience before attempting certification. To obtain your CCIE, you must first pass a written qualification exam and then a corresponding hands-on lab exam.



Administrative Information
Please silence your cell phones.

Learner Introductions

- Your name
- Your company
- Skills and knowledge
- Brief history
- Objective


Please introduce yourself.



Course Evaluations

www.fireflycom.net/evals

Please take time to complete the course evaluations after the class ends. Your feedback helps us continually improve the quality of our courses.



About Firefly
Technology Focus
- Datacenter IP and Security
- Content Networking and WAN Optimization
- Storage Networking
- Business Continuance
- Optical Networking

Solutions Focus
- Integrated Data Center Solutions
- Core IP Services Provisioning
- Multiprotocol SANs
- Business Continuance: All Application Tiers
- Application Optimization
- Application Security

Services
- Global Delivery
- Curriculum Development
- State-of-the-Art Remote Labs and E-Learning
- Needs Assessment
- Consultative Education



Firefly MDS 9000 Training


Support Track: Cisco Storage Networking Support Specialist (CSNSS)
For Systems Engineers, Technical Consultants, and Field Engineers:
- MDS Configuration and Troubleshooting, Extended Edition (MDSCT + FCIP)
- Cisco Multiprotocol Storage Essentials (CMSE)
- Cisco Advanced Storage Implementation and Troubleshooting (CASI)
- Cisco Mainframe Storage Solutions (CMSS)

Design Track: Cisco Storage Networking Design Specialist (CSNDS)
For Systems Engineers, Technical Consultants, Storage Architects, and SAN Designers:
- Cisco Storage Design Fundamentals (CSDF)
- Cisco Storage Design Essentials (CSDE)
- Cisco Storage Design BootCamp (CSDF + CSDE)


Firefly CCIE-SAN Training


Firefly CCIE Storage KickStart
- Developed by Firefly
- Intensive 2-week training program designed for SAN Systems Engineers, Architects, and Support Engineers
- Also prepares students for the Cisco Storage Networking Support Specialist (CSNSS) certification exam
- Includes the contents of the MDSCT, CMSE, CASI, and CMSS courses, usually 14 days
- Taught only by senior professionals who have passed the CCIE Storage written exam

Firefly CCIE Storage Lab BootCamp


- Developed by Firefly
- 5-day intensive hands-on experience designed for students who have already passed the CCIE Written Exam
- Taught only by senior Firefly instructors who have achieved CCIE SAN





Lesson 1

SCSI and Fibre Channel Primer


Overview
This lesson provides a brief overview of the SCSI and Fibre Channel protocols.

Objectives
Upon completing this lesson, you will be able to explain the fundamentals of SCSI and Fibre Channel. This includes being able to meet these objectives:

- Describe SCSI technology
- Describe the operations of the SCSI protocol
- Explain why FC is a data transport technology that is well-suited to storage networks
- Explain the fundamental design of FC flow control
- Describe the two addressing schemes used on Fibre Channel networks
- Describe the session establishment protocols that are performed by N_Ports and F_Ports in a fabric topology
- List the standard services provided by fabric switches as defined by the FC specification

SCSI Protocol Overview


SCSI Protocol Overview
The SCSI protocol defines how commands, status, and data blocks are exchanged between initiators and targets. SCSI is a Block I/O protocol. The Initiator always sends the command, reads or writes data blocks to the Target and receives a final Response.
[Diagram: the SCSI client/server model. The Initiator contains the Application Client; the Target contains the Device Server, Tasks, and LUNs. Requests and Responses are exchanged over a delivery subsystem such as Parallel SCSI, FCP, or IP.]



SCSI Protocol Overview


The Small Computer System Interface (SCSI) is a standard that evolved from a proprietary design by Shugart Associates in the 1970s called the SASI bus. SCSI performs the heavy lifting of passing commands, status, and block data between platforms and storage devices.

One function of operating systems is to hide the complexity of the computing environment from the end user. Management of system resources, including memory, peripheral devices, display, context switching between concurrent applications, and so on, is generally concealed behind the user interface. The internal operations of the OS must be robust, closely monitor changes of state, ensure that transactions are completed within the allowable time frames, and automatically initiate recovery or retries in the event of incomplete or failed procedures. For I/O operations to peripheral devices such as disk, tape, optical storage, printers, and scanners, these functions are provided by the SCSI protocol, typically embedded in a device driver or in logic onboard a host adapter.

Because the SCSI protocol layer sits between the operating system and the peripheral resources, it has different functional components. Applications typically access data as files or records. Although these may ultimately be stored on disk or tape media in the form of data blocks, retrieval of a file requires a hierarchy of functions to assemble raw data blocks into a coherent file that can be manipulated by an application.

SCSI architecture defines the relationship between initiators (hosts) and targets (for example, disks or tape) as a client/server exchange. The SCSI-3 application client resides in the host and represents the upper layer application, file system, and operating system I/O requests. The SCSI-3 device server sits in the target device, responding to requests.




SCSI Parallel Technology


SCSI uses a parallel architecture in which data is sent simultaneously over multiple wires. SCSI is half-duplex - data travels in one direction at a time. On a parallel SCSI bus, a device must assume exclusive control over the bus in order to communicate. The SCSI Initiator then selects the SCSI Target and sends a Command to initiate a data transfer. At the end of the transfer, the device is de-selected and the bus is free.

- Parallel
- Half-duplex
- Shared bus
- Limited distance



SCSI Parallel Technology


The bus/target/LUN triad derives from parallel SCSI technology. The bus represents one of several potential SCSI interfaces installed in the host, each supporting a separate string of disks. The target represents a single disk controller on the string. The LUN designation allows for additional disks governed by a controller, for example a RAID device. The following are characteristics of parallel SCSI technology:

- SCSI uses a parallel architecture in which data is sent simultaneously over multiple wires.
- SCSI is half-duplex: data travels in one direction at a time.
- On a SCSI bus, a device must assume exclusive control over the bus in order to communicate. (SCSI is sometimes referred to as a simplex channel because only one device can transmit at a time.)




Multidrop Topology and Addressing


[Diagram: a multidrop parallel SCSI bus with a terminator at each end. The SCSI Initiator (I/O adapter) and several SCSI Targets (for example IDs 0, 5, and 6), each presenting one or more LUNs, share the data/address bus and the clock/control signal lines. Address = Bus : Target ID : LUN.]

A SCSI Initiator addresses its SCSI Target using the SCSI nexus: Bus : Target ID : LUN.

Multidrop Topology and Addressing


All of the devices on a SCSI bus are connected to a single cable. This is called a multidrop topology:

- Data bits are sent in parallel on separate wires.
- Control signals are sent on a separate set of wires.
- Only one device at a time can transmit: a transmitting device has exclusive use of the bus.
- A special circuit called a terminator must be installed at the end of the cable. The cable must be terminated to prevent unwanted electrical effects from corrupting the signal.

A multidrop topology has inherent limitations:


- Parallel transmission of data bits allows more data to be sent in a given time period, but data bits may arrive early or late (skew) and lead to data errors.
- Control signals, such as clock signals, are sent on a separate set of wires, which also makes synchronization more difficult.
- It is an inefficient way to use the available bandwidth, because only one communication session can exist at a time.
- Termination circuits are built into most SCSI devices, but the administrator often has to set a jumper on the device to enable termination. Incorrect cable termination can cause either a severe failure or intermittent, difficult-to-trace errors.
- To achieve faster data transfer rates, vendors doubled the number of data lines on the cable from 8 (narrow SCSI) to 16 (wide SCSI). Vendors have also increased the clock rate, which increased transfer rates but also increased the possibility of data errors due to skew or electrical interference.



- Parallel SCSI is limited to a maximum cable length of 25 m.

SCSI was designed to support a few devices at most, so its device addressing scheme is fairly simple, and not very flexible. SCSI devices use hard addressing:

- Each device has a series of jumpers that determine the device's physical address, or SCSI ID. The ID is software-configurable on some devices.
- Each device must have a unique SCSI ID. Before adding a device to the cable, the administrator must know the ID of every other device connected to the cable and choose a unique ID for the new device.
- The ID of each device determines its priority on the bus. For example, the SCSI Initiator with ID 7 always has a higher priority than the SCSI Target with ID 6. Because each device must have exclusive use of the bus while it is transmitting, ID 6 must wait until ID 7 has finished transmitting. Fixed priority makes it more difficult for administrators to control performance and quality of service. A small sketch of this addressing and priority model follows.
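The following minimal Python sketch is our own illustration, not part of the SCSI specification: it models the Bus : Target ID : LUN nexus and the simple narrow-SCSI rule that the device with the highest ID wins bus arbitration. The names and example values are hypothetical.

```python
# Illustrative only: Nexus and wins_arbitration are our own names, and the
# "highest ID wins" rule shown is the simple narrow-SCSI (IDs 0-7) case.
from collections import namedtuple

Nexus = namedtuple("Nexus", ["bus", "target_id", "lun"])

initiator = Nexus(bus=0, target_id=7, lun=0)    # host adapters are typically ID 7
raid_lun = Nexus(bus=0, target_id=6, lun=3)     # LUN 3 behind the controller at ID 6

def wins_arbitration(devices):
    """Of the devices contending for the bus, the highest SCSI ID wins."""
    return max(devices, key=lambda d: d.target_id)

print(wins_arbitration([initiator, raid_lun]))  # ID 7 (the initiator) wins
```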




SCSI-3 Architecture Model


[Diagram: the SCSI-3 architecture model. A shared command layer (SCSI Primary Commands plus the Block, Streaming, Enclosure Services, and Media Changer command sets) sits above interchangeable transports: the SCSI Parallel port driver and port (SCSI adapter), the SCSI-FCP port driver and Fibre Channel port (FC HBA), the iSCSI driver and Ethernet port (NIC), the SAS port driver and SAS serial port, and the SBP-2 port driver and IEEE-1394 (FireWire) port.]

SCSI-3 separates the physical interface, the transport protocols, and the SCSI command set.

The SCSI-3 Architecture Model


The SCSI-3 family of standards introduced several new variations of SCSI commands and protocols, including serial SCSI-3 and special command sets for the streaming and media handling required for tape. As shown in the diagram, the command layer is independent of the protocol layer, which is required to carry SCSI-3 commands between devices. This enables more flexibility in substituting different transports beneath the SCSI-3 command interface to the operating system. The SCSI Architecture Model (SAM) consists of four layers of functionality:

1. The physical interconnect layer specifies the characteristics of the physical SCSI link:
- FC-PH is the physical interconnect specification for Fibre Channel.
- Serial Storage Architecture (SSA) is a storage bus aimed primarily at the server market.
- IEEE 1394 is the FireWire specification.
- SCSI Parallel Interface (SPI) is the specification used for parallel SCSI buses.

2. The transport protocol layer defines the protocols used for session management:
- SCSI-FCP is the transport protocol specification for Fibre Channel.
- Serial Storage Protocol (SSP) is the transport protocol used by SSA devices.
- Serial Bus Protocol (SBP) is the transport protocol used by IEEE 1394 devices.




3. The shared command set layer consists of command sets for accessing storage resources:
- SCSI Primary Commands (SPC) are common to all SCSI devices.
- SCSI Block Commands (SBC) are used with block-oriented devices, such as disks.
- SCSI Stream Commands (SSC) are used with stream-oriented devices, such as tapes.
- SCSI Media Changer Commands (SMC) are used to implement media changers, such as robotic tape libraries and CD-ROM carousels.
- SCSI Enclosure Services (SES) defines commands used to monitor and manage SCSI device enclosures such as RAID arrays, including fan, power, and temperature monitoring.

4. The SCSI Common Access Method (CAM) defines the SCSI device driver application programming interface (API).




Serial SCSI-3 over Fibre Channel: SCSI-FCP


[Diagram: a SCSI Initiator (FC HBA, identified by its pWWN) and SCSI Targets (storage ports, each with a pWWN) presenting LUNs, all attached to a Fibre Channel fabric. SCSI traffic is carried in the payload of FC frames across the fabric.]
SCSI Commands, Data and Responses are carried in the payload of a frame from source to destination. In SCSI-FCP, the SCSI IDs are mapped to the unique worldwide name in each FC Port.

Serial SCSI-3 over Fibre Channel: SCSI-FCP


All of the devices are attached to the same fabric and connected via Fibre Channel links to one or more interconnected switches.

- The SCSI Initiator and SCSI Target ports are zoned together within the Fibre Channel switch.
- Each device logs in to the fabric (FLOGI) and registers itself with the Name Server in the switch.
- The FC HBA queries the Name Server and discovers other FC ports in the same zone as itself.
- The FC HBA then logs in to each Target port (PLOGI) and they exchange Fibre Channel parameters.
- The SCSI Initiator (SCSI-FCP driver) then logs in to the SCSI Target behind the FC Target port (PRLI), establishing a communication channel between SCSI Initiator and SCSI Target.
- The SCSI Initiator commences a SCSI operation by sending a SCSI Command Descriptor Block (CDB) down to the FC HBA with instructions to send it to a specific LUN behind a Target FC port (SCSI Target). The command is carried in the payload of the FC frame to the target FC port.
- The SCSI Target receives the CDB and acts upon it; usually this is a Read or Write command. Data is then carried in the payload of FC frames between SCSI Initiator and SCSI Target.
- Finally, when the operation is complete, the SCSI Target sends a Response back to the SCSI Initiator in the payload of an FC frame.

The sketch after this list walks through the same login and discovery sequence.
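The following minimal Python sketch is our own illustration of the sequence above, not a real API: the Fabric class, its methods, and the pWWN strings are hypothetical stand-ins for FLOGI, the Name Server query, PLOGI, and PRLI.

```python
# Illustrative sketch of the FLOGI -> Name Server query -> PLOGI/PRLI sequence.
# All names here (Fabric, flogi, initiator_login, the pWWN values) are hypothetical.

class Fabric:
    def __init__(self):
        self.name_server = {}   # pWWN -> FCID registrations
        self.zones = {}         # zone name -> set of member pWWNs

    def flogi(self, pwwn):
        """Fabric login: the switch assigns an FCID and registers the port."""
        fcid = 0x010000 + len(self.name_server) + 1   # toy FCID assignment
        self.name_server[pwwn] = fcid
        return fcid

    def query_name_server(self, pwwn, zone):
        """Return the logged-in pWWNs that are zoned together with the requester."""
        members = self.zones.get(zone, set())
        return [p for p in members if p != pwwn and p in self.name_server]

def initiator_login(fabric, initiator_pwwn, zone):
    fcid = fabric.flogi(initiator_pwwn)                        # FLOGI
    targets = fabric.query_name_server(initiator_pwwn, zone)   # Name Server query
    sessions = []
    for target_pwwn in targets:
        # PLOGI (port login) and PRLI (process login) would exchange FC and FCP
        # parameters here; this sketch just records that a session now exists.
        sessions.append((initiator_pwwn, target_pwwn))
    return fcid, sessions

fabric = Fabric()
fabric.zones["zone_a"] = {"10:00:00:00:c9:aa:bb:cc", "50:06:01:60:12:34:56:78"}
fabric.flogi("50:06:01:60:12:34:56:78")      # the target logs in and registers first
fcid, sessions = initiator_login(fabric, "10:00:00:00:c9:aa:bb:cc", "zone_a")
print(hex(fcid), sessions)
```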




SCSI Operations
SCSI Operation
SCSI specifies three phases of operation:
- Command: send the required command and parameters via a Command Descriptor Block (CDB)
- Data: transfer data in accordance with the command
- Response: receive confirmation of command execution
[Diagram: in a Read operation, the Initiator sends the Command, and the Target returns Data frames followed by a Response. In a Write operation, the Initiator sends the Command, the Target returns Xfer-Rdy, the Initiator sends Data frames, and the Target returns a Response.]

Phases of Operation
Every communication between SCSI Initiator and SCSI Target is formed by sequences of events called bus phases. Each phase has a purpose and is linked to other phases to execute SCSI commands and transfer data and messages back and forth.

The majority of the SCSI protocol is controlled by the SCSI Initiator. The SCSI Target is usually passive and waits for a command. Only the SCSI Initiator can initiate a SCSI operation, by selecting a SCSI Target and sending a CDB (Command Descriptor Block) to it.

If the CDB contains a Read command, the SCSI Target moves its heads into position and retrieves the data from its disk sectors. This data is returned to the SCSI Initiator. If the CDB contains a Write command, the SCSI Target prepares its buffers and returns an Xfer-Rdy. When the SCSI Initiator receives the Xfer-Rdy, it can commence writing data. Finally, when the operation is complete, the SCSI Target returns a Response to indicate a successful (or unsuccessful) data transfer. The sketch below models these phases.
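A minimal sketch of the Command, Data, and Response phases against an in-memory block store; the ScsiTarget class and its read/write methods are our own names, not part of any real SCSI stack.

```python
# Minimal model of the three SCSI phases against an in-memory "disk" of 512-byte blocks.
# ScsiTarget, read, and write are illustrative names only.

BLOCK_SIZE = 512

class ScsiTarget:
    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read(self, lba, count):
        # Command phase (Read CDB) -> Data phase (Target returns data) -> Response
        data = b"".join(self.blocks[lba:lba + count])
        return data, "GOOD"                    # data plus SCSI status

    def write(self, lba, data):
        # Command phase (Write CDB) -> Target signals Xfer-Rdy when buffers are ready
        # -> Data phase (Initiator sends data) -> Response
        for i in range(0, len(data), BLOCK_SIZE):
            self.blocks[lba + i // BLOCK_SIZE] = data[i:i + BLOCK_SIZE]
        return "GOOD"

target = ScsiTarget(num_blocks=8)
status = target.write(lba=0, data=b"\x42" * 1024)   # two 512-byte blocks
data, status = target.read(lba=0, count=2)
assert data == b"\x42" * 1024
```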




SCSI Command Descriptor Blocks


SCSI Command Descriptor Block (CDB), 10-byte form:

- Byte 0: Operation Code (Group Code + Command Code)
- Byte 1: Reserved / Service Action
- Bytes 2-5: Logical Block Address, MSB to LSB (transfer data starting at this LBA)
- Byte 6: Reserved
- Bytes 7-8: Transfer Length, MSB to LSB (number of SCSI blocks to be transferred)
- Byte 9: Control Byte

A Command is executed by the Initiator sending a CDB to a Target. In serial SCSI-3, the CDB is carried in the payload of the Command Frame.


SCSI Command Descriptor Blocks


SCSI commands are built from a common structure:

- Operation Code
- N bytes of parameters
- Control Byte

The Operation Code consists of a Group Code and a Command Code:

- The Group Code establishes the total command length.
- The Command Code establishes the command function.

The number of bytes of parameters (N) can be determined from the Operation Code byte, which is located in byte 0 of the Command Descriptor Block (CDB). The Control Byte, located in the last byte of a CDB, contains control bits that define the behavior of the command.

The Logical Block Address (LBA) is the absolute address of where the first block should be written (or read) on the disk. LBA 0 is the first sector on the disk volume or LUN, LBA 1 is the second sector, and so on, until the last sector of the disk volume or LUN is reached.

When the CDB is sent to a block device (disk), blocks are always 512 bytes long, and the Transfer Length contains the number of 512-byte blocks to be transferred. When the CDB is sent to a streaming device (tape), the block length is negotiated, and the Transfer Length contains the number of blocks to be transferred.

CDBs can be different sizes (6-byte, 10-byte, 12-byte, 16-byte, and so on) to accommodate larger disk volumes or transfer lengths. 10-byte CDBs are common; the sketch below packs one.
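As a concrete illustration of the 10-byte layout, the following Python sketch packs a READ(10) CDB. Opcode 0x28 is the standard READ(10) operation code; the helper name and example values are our own.

```python
# Packs a 10-byte READ(10) CDB following the layout above: byte 0 opcode, bytes 2-5
# LBA (big-endian), bytes 7-8 transfer length in blocks, byte 9 control.

import struct

def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    return struct.pack(
        ">BBIBHB",
        0x28,         # byte 0: operation code (READ 10)
        0x00,         # byte 1: flags / reserved
        lba,          # bytes 2-5: logical block address
        0x00,         # byte 6: reserved / group number
        num_blocks,   # bytes 7-8: transfer length in blocks
        0x00,         # byte 9: control byte
    )

cdb = build_read10_cdb(lba=0x1000, num_blocks=8)   # read 8 x 512-byte blocks
assert len(cdb) == 10 and cdb[0] == 0x28
```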




SCSI Commands
SCSI supports several specific commands for each media type, and primary commands that all devices understand. The following commands are of particular interest:
- REPORT LUNS: How many LUNs do you have?
- INQUIRY: What device are you?
- TEST UNIT READY: Is the LUN available?
- READ CAPACITY: What size is each LUN?



SCSI Commands
- REPORT LUNS is used by operating systems to discover the LUNs attached to a particular hardware address. It is typically sent by the Initiator to LUN 0.
- INQUIRY is used by the operating system to determine the capabilities of each LUN that was discovered with REPORT LUNS.
- TEST UNIT READY is used to check the condition of a particular LUN.
- READ CAPACITY is sent to each LUN in turn to obtain the size of each LUN.

The sketch after this list shows how these commands fit together in a typical discovery pass.
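The sketch below strings these commands together the way an operating system typically discovers storage at boot. The discover function and the FakeTarget stub are purely illustrative, not a real driver interface.

```python
# Illustrative discovery pass; discover() and FakeTarget are hypothetical names that
# simply mirror the SCSI command sequence described above.

def discover(target):
    luns = target.report_luns()                   # REPORT LUNS, sent to LUN 0
    inventory = []
    for lun in luns:
        info = target.inquiry(lun)                # INQUIRY: device type, vendor, model
        if target.test_unit_ready(lun):           # TEST UNIT READY: is the LUN available?
            blocks, block_size = target.read_capacity(lun)   # READ CAPACITY: LUN size
            inventory.append((lun, info, blocks * block_size))
    return inventory

class FakeTarget:
    """Stand-in with a single 1 GiB LUN so the function above can be exercised."""
    def report_luns(self):           return [0]
    def inquiry(self, lun):          return {"vendor": "ACME", "type": "disk"}
    def test_unit_ready(self, lun):  return True
    def read_capacity(self, lun):    return (2_097_152, 512)   # blocks, block size

print(discover(FakeTarget()))   # -> [(0, {...}, 1073741824)]
```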




Building an I/O Request


[Diagram: inside the server, an application's file I/O request flows down through the Volume Manager, File System, SCSI driver, and FC driver. The SCSI driver builds the CDB, and the FC driver places it in the payload of an FC frame (SOF, Header, Payload, CRC, EOF) addressed to the Target LUN, then awaits the Read or Write exchange (Command, Xfer-Rdy, Data, Response).]


Building an I/O Request


This slide explains the process of the Initiator talking to the Target (a brief sketch of the same layering follows):

1. The application makes a file I/O request to the Volume Manager.
2. The Volume Manager maps the volume to a SCSI ID and Target LUN.
3. The File System maps files to blocks and makes a block I/O request.
4. The command, LBA, block count, and LUN are sent to the SCSI driver.
5. The SCSI driver creates the CDB (Command Descriptor Block).
6. The FC driver creates a command frame with the CDB in the payload.
7. The FC driver sends the command frame to the Target LUN and awaits the response.
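A short Python sketch of the same layering, assuming 512-byte blocks and a READ(10) CDB; the FcFrame dataclass, function name, and FCID values are illustrative only.

```python
# Sketch of steps 1-7: map a file byte range to blocks, build the CDB, and wrap it
# in an FC command frame. FcFrame and the example FCIDs are hypothetical.

import struct
from dataclasses import dataclass

@dataclass
class FcFrame:
    s_id: int        # source FCID (the HBA's N_Port)
    d_id: int        # destination FCID (the target FC port)
    payload: bytes   # here: the SCSI CDB carried as the frame payload

def file_read_to_frame(file_offset: int, length: int, s_id: int, d_id: int) -> FcFrame:
    # File system / volume manager: map the byte range onto 512-byte blocks
    lba = file_offset // 512
    num_blocks = (length + 511) // 512
    # SCSI driver: build a 10-byte READ(10) CDB (opcode 0x28)
    cdb = struct.pack(">BBIBHB", 0x28, 0, lba, 0, num_blocks, 0)
    # FC driver: place the CDB in the payload of a command frame addressed to the target
    return FcFrame(s_id=s_id, d_id=d_id, payload=cdb)

frame = file_read_to_frame(file_offset=1_048_576, length=8192, s_id=0x0A1B2C, d_id=0x0A2000)
```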




Fibre Channel Overview


Fibre Channel Overview
Fibre Channel is a protocol used for efficiently transporting data between devices connected to the same fabric. Fibre Channel provides reliable and efficient data delivery with high throughput and low latency. Fibre Channel is the transport technology most commonly used for SANs today.
[Diagram: servers with FC HBAs connected both to an IP network and to a Fibre Channel fabric; storage devices are attached to the Fibre Channel fabric.]



Fibre Channel Overview


FC is a protocol used for efficiently transporting data between devices in the same fabric. It is the network interconnect technology that is most commonly used for SANs today.

Traditional storage technologies, such as SCSI, are designed for controlled, local environments. They support few devices and only short distances, but they deliver data quickly and reliably. Traditional data network technologies, such as Ethernet, are designed for chaotic, distributed environments. They support many devices and long distances, but delivery of data can be delayed (latency).

FC combines the best of both worlds. It supports many devices and longer distances, and it provides reliable and efficient data delivery with high throughput and low latency. Like SCSI, Fibre Channel is a block I/O protocol, delivering data blocks (usually 512 bytes long) between devices in the same fabric.

In the diagram, the network on the right, consisting of servers and storage devices connected by an FC network, is an FC SAN.




Fibre Channel Topologies


[Diagram: the three Fibre Channel topologies - Point-to-Point, Arbitrated Loop, and Switched Fabric.]

- Arbitrated Loop provides shared bandwidth at low cost.
- Switched Fabric provides aggregate bandwidth and scalability, but requires complex FC switches, which increase the cost.
- Most SANs today use the Switched Fabric topology.


Fibre Channel Topologies


Fibre Channel Protocol includes three basic SAN topologies.

Point-to-Point

- Exactly two FC ports connected together.
- Both devices have exclusive access to the full link bandwidth.

Arbitrated Loop

- Up to 126 FC ports connected together on a Private Loop (not connected to an FC switch).
- Up to 127 FC ports connected together on a Public Loop (connected via an FL_Port on an FC switch).
- All devices share the available bandwidth around the loop, so a practical limit might be only 20 or so devices.
- A device that wishes to communicate with another device must perform the following operations:

1. Arbitrate to gain control of the loop.
2. Open the port it wishes to communicate with.
3. Send or receive data frames.
4. Close the port.
5. Release the loop, ready for the next transfer.

Usually only two devices communicate at a time; the other FC ports in the loop are passive.


When the loop is broken, or a device is added or removed, the downstream FC port sends thousands of LIP primitive sequences to inform the other loop devices that the loop has been broken. The LIP (Loop Initialization Procedure) is used to assign (or re-assign) Arbitrated Loop Physical Addresses (AL_PAs) to each FC port on the loop. This operation is disruptive, and frames may be lost during this phase. Nowadays, most users connect FC-AL devices via an FC hub to minimize disruption.

Switched Fabric

- The Switched Fabric topology is the topology of choice for FC SANs.
- Each connected device has access to the full bandwidth of its link through the switch port it is connected to.
- The FC SAN can be expanded by adding more switches and increasing the number of ports for connected devices.
- The FC 24-bit addressing scheme allows for potentially 16,500,000 devices to be connected. A realistic number is a few thousand, because there can be a maximum of 239 switches in a single fabric and most switches today have a small number of ports each. (A small sketch of the 24-bit address layout follows.)
- Each FC switch must provide services for management of the SAN. These services include a Name Server, Domain Manager, FSPF topology database, Zoning Server, Time Server, and so on.
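The 24-bit address is conventionally split into Domain, Area, and Port bytes; the small sketch below, with our own helper name and an arbitrary example value, shows that layout.

```python
# Split a 24-bit FCID into its Domain (switch), Area, and Port bytes.

def split_fcid(fcid: int):
    domain = (fcid >> 16) & 0xFF   # identifies the switch in the fabric
    area = (fcid >> 8) & 0xFF      # identifies a group of ports on that switch
    port = fcid & 0xFF             # identifies the port (or AL_PA for loop devices)
    return domain, area, port

print(split_fcid(0x0A1B2C))   # -> (10, 27, 44): domain 0x0A, area 0x1B, port 0x2C
```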




Fibre Channel Switched Fabric Topology


[Diagram: multiple FC switches connected by Inter-Switch Links (ISLs), with hosts (FC HBAs) and storage devices attached to switch ports around the fabric.]

A fabric contains one or more switches, connected together via Inter-Switch Links (ISLs).


Fibre Channel Switched Fabric Topology


The Switched Fabric topology incorporates one or more high-bandwidth FC switches to handle data traffic among host and storage devices.

- Each switch is assigned a unique ID called a Domain. There can be a maximum of 239 switch domains in a fabric; however, McDATA imposes a 32-domain limit in its designs.
- FC switches are connected together via Inter-Switch Links (ISLs).
- Each device is exclusively connected to its FC port on the switch via a bidirectional, full-duplex link.
- All connected devices share the same addressing space within the fabric and can potentially communicate with each other.
- Frames flow from device to device via one or more FC switches. Each time a frame moves from switch to switch is called a hop. McDATA imposes a 3-hop limit in its designs, Brocade imposes a 7-hop limit, and Cisco imposes a 10-hop limit.




Fibre Channel Ports


Ports are intelligent interface points on the Fibre Channel network:
- Embedded in an FC Host Bus Adapter (HBA)
- Embedded in a fabric switch
- Embedded in a storage array or tape controller

A link connects exactly two ports together. FC ports are assigned a dynamic FCID.

[Diagram: a server's FC HBA, a switch, an array controller, and a tape device; each node contains ports, and links connect pairs of ports.]

Fibre Channel Ports


In data networking terminology, ports are often thought of as just physical interfaces where you plug in the cable. In FC however, ports are intelligent interfaces, responsible for actively performing critical network functions. The preceding graphic contains several ports. There are ports in the host I/O adapter (host bus adapter [HBA]), ports in the switch, and ports in the storage devices. FC terminology differentiates between several different types of ports, each of which performs a specific role on the SAN. You will encounter these terms often as you continue to learn about FC, so it is important that you learn to recognize the different port types. In addition to the common ports defined for FC, Cisco has developed some proprietary port types. Fibre Channel Ports are assigned a unique address, a Fibre Channel ID (FCID) at login time.




Standard Fibre Channel Ports


[Diagram: standard Fibre Channel port types and valid link combinations: NL-NL (devices on a loop or FC-AL hub), NL-FL (loop to switch), N-N (point-to-point), N-F (node to fabric switch), E-E (Inter-Switch Link between switches), and E-B (switch to WAN bridge).]

Standard Fibre Channel Ports

- An N_Port (Node Port) is a port on a node that connects to a fabric. I/O adapters and array controllers contain one or more N_Ports. N_Ports can also directly connect two nodes in a point-to-point topology.
- An F_Port (Fabric Port) is a port on a switch that connects to an N_Port.
- An E_Port (Expansion Port) is a port on a switch that connects to an E_Port on another switch.
- An FL_Port (Fabric Loop Port) is a port on a switch that connects to an arbitrated loop. Logically, an FL_Port is considered part of both the fabric and the loop. FL_Ports are always physically located on the switch. Note that FC hubs, although they obviously have physical interfaces, do not contain FC ports; hubs are basically just passive signal splitters and amplifiers. They do not actively participate in the operation of the network, and on an arbitrated loop the node ports manage all FC operations. Not all switches support FL_Port operation; for example, some McDATA switches do not support it.
- An NL_Port (Node Loop Port) is a port on a node that connects to another port in an arbitrated loop topology. There are two types of NL_Ports: private NL_Ports can communicate only with other loop ports; public NL_Ports can communicate with other loop ports and with N_Ports on an attached fabric.
- Note that the term L_Port (Loop Port) is sometimes used to refer to any port on an arbitrated loop topology. L_Port can mean either FL_Port or NL_Port; in reality, there is no such thing as an L_Port.




Nowadays, most ports are universal: they automatically sense what they are connected to and adopt the correct valid port type. However, it is good practice to lock down the port type to its intended function.




Fibre Channel Frame


Frame header (6 words of 4 bytes = 24 bytes): R_CTL, Destination Address (D_ID), CS_CTL, Source Address (S_ID), TYPE, Frame Control (F_CTL), SEQ_ID, DF_CTL, SEQ_CNT, OX_ID, RX_ID, Parameter Field.

- D_ID = where the frame is going to
- S_ID = where the frame is coming from
- TYPE = payload protocol
- SEQ_CNT and SEQ_ID = sequence identifiers
- OX_ID and RX_ID = exchange identifiers

Frame layout: SOF (4 bytes) | Header (24 bytes) | Data Field of 0-2112 bytes (optional headers 0-64 bytes, payload 0-2048 bytes, fill bytes 0-3) | CRC (4 bytes) | EOF (4 bytes).


Fibre Channel Frames


The maximum total length of an FC frame is 2148 bytes, or 537 words (a word = 4 bytes):

- A 4-byte SOF (Start of Frame) delimiter
- A 24-byte header
- A data payload that can vary from 0 to 2112 bytes (typically 2048 bytes for SCSI-FCP)
- A 4-byte CRC (Cyclic Redundancy Check) that is used to detect bit-level errors in the header or payload
- A 4-byte EOF (End of Frame) delimiter

The Header contains fields used for identifying and routing the frame across the fabric.

- R_CTL: Routing Control field; defines the frame's function.
- D_ID: Destination Address; the FCID of the FC port that the frame is being sent to.
- CS_CTL: Class Specific Control field; only used for Class 1 and Class 4.
- S_ID: Source Address; the FCID of the FC port that the frame has come from.
- TYPE: the Upper Layer Protocol data type contained in the payload; this is hex 08 for SCSI-FCP.
- F_CTL: Frame Control field; contains miscellaneous control information regarding the frame, including how many fill bytes there are (0-3).
- SEQ_ID: Sequence ID; the unique identifying number of the sequence within the exchange.
- DF_CTL: Data Field Control; defines the use of the optional headers. SCSI-FCP does not use optional headers.




- SEQ_CNT: Sequence Count; the number of the frame within a sequence. The first frame is hex 00.
- OX_ID: Originating Exchange ID; a unique identifying number provided by the source FC port.
- RX_ID: Responding Exchange ID; a unique identifying number provided by the destination FC port. OX_ID and RX_ID together define the Exchange ID.
- PARMS: Parameter Field; usually provides a relative offset into the ULP data buffer.

The frame payload consists of 3 elements:


- The payload itself, containing data or commands, is variable and can be up to 2112 bytes.
- The first 64 bytes of the payload can be used to incorporate optional headers. This would reduce the data payload size to 2048 bytes (2 KB). SCSI-FCP usually carries multiples of 512-byte blocks.
- The payload ends with 0-3 fill bytes. This is necessary because the smallest unit of data recognized by FC is a 4-byte word. However, the ULP is not aware of this FC requirement, and the data payload for a frame might not end on a word boundary. FC therefore adds up to 3 fill bytes to the end of the payload, as many as are needed to ensure that the payload ends on a word boundary.

The sketch below packs the 24-byte header laid out above.
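A minimal Python sketch that packs the six header words in the order given above; the default field values are arbitrary placeholders, with TYPE 0x08 taken from the text as the SCSI-FCP value.

```python
# Packs the 24-byte FC frame header: word 0 R_CTL|D_ID, word 1 CS_CTL|S_ID,
# word 2 TYPE|F_CTL, word 3 SEQ_ID|DF_CTL|SEQ_CNT, word 4 OX_ID|RX_ID, word 5 Parameter.

import struct

def pack_fc_header(d_id, s_id, r_ctl=0x06, cs_ctl=0, fc_type=0x08, f_ctl=0,
                   seq_id=0, df_ctl=0, seq_cnt=0, ox_id=0x1234, rx_id=0xFFFF, parameter=0):
    words = (
        (r_ctl << 24) | d_id,                        # R_CTL + 24-bit destination FCID
        (cs_ctl << 24) | s_id,                       # CS_CTL + 24-bit source FCID
        (fc_type << 24) | f_ctl,                     # TYPE + 24-bit frame control
        (seq_id << 24) | (df_ctl << 16) | seq_cnt,   # SEQ_ID, DF_CTL, SEQ_CNT
        (ox_id << 16) | rx_id,                       # exchange identifiers
        parameter,                                   # parameter / relative offset
    )
    return struct.pack(">6I", *words)

header = pack_fc_header(d_id=0x0A1B2C, s_id=0x0A1B2D)
assert len(header) == 24
```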




Fibre Channel Data Constructs


- Words are 4 bytes long and are the smallest unit of transfer in Fibre Channel.
- Frames are made up from several words.
- Sequences are unidirectional and contain one or more frames.
- Exchanges are bidirectional and contain three or more sequences.

[Diagram: a Read exchange between Initiator and Target made up of three sequences: the Command frame, the Data frames, and the Response frame.]

Fibre Channel Data Constructs


The preceding graphic shows a transaction between a host (Initiator) and a storage device (Target):

- The smallest unit of data is a word. Words consist of 32 bits (4 bytes) of data that are encoded into a 40-bit form by the 8b/10b encoding process.
- Words are packaged into frames. An FC frame is equivalent to an IP packet.
- A sequence is a series of frames sent from one node to another node. Sequences are unidirectional: a sequence is a set of frames that are issued by one node.
- An exchange is a series of sequences sent between two nodes. The exchange is the mechanism used by two ports to identify and manage a discrete transaction. The exchange defines an entire transaction, such as a SCSI read or write request. An exchange is opened whenever a transaction is started between two ports and is closed when the transaction ends. An FC exchange is equivalent to a TCP session.

The sketch after this list models the hierarchy.
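A tiny data model of the hierarchy, using our own class names rather than any FC library: one exchange holding the three sequences of a read transaction.

```python
# Exchange -> Sequences -> Frames, mirroring the constructs described above.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Frame:
    seq_cnt: int              # position of the frame within its sequence
    payload: bytes

@dataclass
class Sequence:               # unidirectional: all frames flow one way
    seq_id: int
    frames: List[Frame] = field(default_factory=list)

@dataclass
class Exchange:               # bidirectional: e.g. Command, Data, and Response sequences
    ox_id: int
    rx_id: int
    sequences: List[Sequence] = field(default_factory=list)

read_io = Exchange(ox_id=0x1234, rx_id=0x5678, sequences=[
    Sequence(seq_id=0, frames=[Frame(0, b"FCP_CMND: READ(10)")]),             # command
    Sequence(seq_id=1, frames=[Frame(0, b"data..."), Frame(1, b"data...")]),  # data
    Sequence(seq_id=2, frames=[Frame(0, b"FCP_RSP: GOOD")]),                  # response
])
```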




Fibre Channel Flow Control


Fibre Channel Flow Control
Flow control determines how data interchange is managed in a network. The flow control strategy used by Ethernet and other data networks can degrade performance:

- The transmitter does not stop transmitting packets until after the receiver's buffers overflow and packets are already lost.
- Lost packets must be retransmitted.
- Degradation can be severe under heavy traffic loads.

[Diagram: flow control in Ethernet. The transmitter (Tx) keeps sending data until the receiver (Rx) overflows and packets are lost; Rx then sends a PAUSE.]




Flow control is a mechanism for ensuring that frames are sent only when there is somewhere for them to go. Just as traffic lights are used to control the flow of traffic in cities, flow control manages the data flow in an FC fabric. Some data networks, such as Ethernet, use a flow-control strategy that can result in degraded performance:

- A transmitting port (Tx) can begin sending data packets at any time.
- When the receiving port's (Rx) buffers are completely filled and cannot accept any more packets, Rx tells Tx to stop or slow the flow of data.
- After Rx has processed some data and has some buffers available to accept more packets, it tells Tx to resume sending data.

This strategy results in lost packets when the receiving port is overloaded, because the receiving port tells the transmitting port to stop sending data after it has already overflowed. Lost packets must be retransmitted, which degrades performance. Performance degradation can become severe under heavy traffic loads.




What is Credit-Based Flow Control?


Fibre Channel uses a credit-based strategy:
When the receiver is ready to accept a frame, it sends a credit to the transmitter, giving the transmitter permission to send a frame. The receiver is always in control.

Benefits:
- Prevents loss of frames due to buffer overflow
- Maximizes link throughput and performance under high loads

Disadvantages:
- Long-distance links require many more credits

[Diagram: flow control in Fibre Channel. The receiver (Rx) returns a READY (credit) for each free buffer, and the transmitter (Tx) sends DATA only while it holds credits.]



What is Credit-Based Flow Control?


To improve performance under high traffic loads, FC uses a credit-based flow control strategy in which the receiver must issue a credit for each frame that is sent by the transmitter before that frame can be sent. A credit-based strategy ensures that the receiving port is always in control. The receiving port must issue a credit for each frame that is sent by the transmitter. This strategy prevents frames from being lost when the receiving port runs out of free buffers. Preventing lost frames maximizes performance under high traffic load conditions because the transmitting port does not have to resend frames. The preceding diagram illustrates a credit-based flow control process:

- The transmitting port (Tx) counts the number of free buffers at the receiving port (Rx).
- Before Tx can send a frame, Rx must notify Tx that Rx has a free buffer and is ready to accept a frame.
- When Tx receives the notification (called a credit), it increments its count of the number of free buffers at Rx.
- Tx only sends frames when it knows that Rx can accept them. When Tx sends a frame, it decrements the credit count.
- When the credit count falls to zero, Tx must stop sending frames and wait for another credit.

The sketch after this list models the credit counter.
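A minimal Python model of that counter; the Transmitter class and its method names are illustrative only.

```python
# Credit counter: gain a credit on each R_RDY, spend one per frame, stall at zero.

class Transmitter:
    def __init__(self, initial_credits):
        self.credits = initial_credits     # the BB_Credit value advertised by the peer

    def receive_r_rdy(self):
        """The receiver signalled a freed buffer (R_RDY): gain one credit."""
        self.credits += 1

    def try_send(self, frame):
        if self.credits == 0:
            return False                   # must wait for another R_RDY
        self.credits -= 1                  # one buffer is now in use at the receiver
        # ... frame goes onto the wire here ...
        return True

tx = Transmitter(initial_credits=2)
assert tx.try_send(b"frame 1") and tx.try_send(b"frame 2")
assert not tx.try_send(b"frame 3")         # out of credits: transmission stalls
tx.receive_r_rdy()                         # receiver freed a buffer
assert tx.try_send(b"frame 3")
```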




Types of Flow Control


Fibre Channel defines two types of flow control:
- Buffer-to-buffer: uses R_RDY, port to port across every link
- End-to-end: uses ACK, between the source and destination ports

[Diagram: buffer-to-buffer flow control operates on each link (N_Port to F_Port and E_Port to E_Port), while end-to-end flow control runs between the source and destination N_Ports.]



Types of Flow Control


FC defines two types of flow control:

- Buffer-to-buffer flow control takes place between two ports that are connected by an FC link, such as an N_Port and an F_Port, two E_Ports, or two NL_Ports. The receiving port at the other end of the link sends a primitive signal (4 bytes) called an R_RDY (Receiver Ready) to the transmitting port.
- End-to-end flow control takes place between the source port and the destination port. Whenever the receiving port receives a frame, it acknowledges that frame with an ACK frame (36 bytes).

Note that buffer-to-buffer flow control is performed between E_Ports in the fabric, but it is not performed between the incoming and outgoing ports in a given switch. In other words, FC buffer-to-buffer flow control is not used between two F_Ports or between an F_Port and an E_Port within a switch. FC standards do not define how switches route frames across the switch. Buffer-to-buffer flow control is used in the following situations:

- Class 1 connection request frames use buffer-to-buffer flow control, but Class 1 data traffic uses only end-to-end flow control.
- Class 2 and Class 3 frames always use buffer-to-buffer flow control.
- Class F service uses buffer-to-buffer flow control.
- In an Arbitrated Loop, every communication session is a virtual dedicated point-to-point circuit between a source port and a destination port, so there is little difference between buffer-to-buffer and end-to-end flow control. Buffer-to-buffer flow control alone is generally sufficient for arbitrated loop topologies.

End-to-end flow control is used in the following situations:


Classes 1, 2, 4, and 6 use end-to-end flow control.
Class 2 service uses both buffer-to-buffer and end-to-end flow control.
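As a study aid, the class-to-flow-control relationships described above can be collected into a small table. The Python mapping below simply restates the text (it is not an exhaustive treatment of every class of service).

    # Flow control mechanisms per class of service, as described in the text above.
    flow_control_by_class = {
        "Class 1 connect-request": {"buffer-to-buffer"},
        "Class 1 data":            {"end-to-end"},
        "Class 2":                 {"buffer-to-buffer", "end-to-end"},
        "Class 3":                 {"buffer-to-buffer"},
        "Class 4":                 {"end-to-end"},
        "Class 6":                 {"end-to-end"},
        "Class F":                 {"buffer-to-buffer"},
    }

    for service, mechanisms in sorted(flow_control_by_class.items()):
        print(f"{service}: {', '.join(sorted(mechanisms))}")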


Buffer-to-Buffer and End-to-End Flow Control

[Slide diagram: N_Port A and N_Port B attached to the fabric through F_Ports. Buffer-to-buffer flow control (R_RDY) operates on each link, while end-to-end flow control (ACK) operates between the two N_Ports; numbered steps 1-5 correspond to the description below.]

Buffer-to-Buffer and End-to-End Flow Control


The preceding diagram illustrates buffer-to-buffer flow control in Class 3:
1. Before N_Port A can transmit a frame, it must receive the primitive signal R_RDY from its attached F_Port. The R_RDY signal tells N_Port A that its F_Port has a free buffer.
2. When it receives the R_RDY signal, N_Port A transmits a frame.
3. The frame is passed through the fabric. Buffer-to-buffer flow control is performed between every pair of E_Ports, although this is not shown here.
4. At the other side of the fabric, the destination F_Port must wait for an R_RDY signal from N_Port B.
5. When N_Port B sends an R_RDY, the F_Port transmits the data frame.

End-to-end flow control is designed to overcome the limitations of buffer-to-buffer flow control. The diagram also illustrates end-to-end flow control in Class 2:
1. Standard buffer-to-buffer flow control is performed for each data frame.
2. After the destination N_Port B receives a frame, it waits for an R_RDY from the F_Port.
3. When N_Port B receives an R_RDY, it sends an acknowledgement (ACK) frame back to N_Port A.
4. At the other side of the fabric, the initiator F_Port must wait for an R_RDY signal from N_Port A.
5. When N_Port A sends an R_RDY, the F_Port transmits the ACK frame.


End-to-end flow control involves only the port at which a frame originates and the ultimate destination port, regardless of how many FC switches are in the data path. When end-to-end flow control is used, the transmitting port is responsible for ensuring that all frames are delivered. Only when the transmitting N_Port receives the last ACK frame in response to a sequence of frames sent does it know that all frames have been delivered correctly, and only then will it empty its ULP data buffers. If a returning ACK indicates that the receiving port has detected an error, the transmitting N_Port has access to the ULP data buffers and can resend all of the frames in the sequence.


Allocating Buffer Credits


Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time

Frame serialization time:
- 2Gb link rate of 2.125 Gbps = 4.7 ns/byte
- Frame size = 2048 data + 36 header + 24 IDLE = 2108 bytes
- Frame serialization time = 2108 x 4.7 ns = 9.9 µs, or approximately 10 µs per frame

[Slide diagram: a frame in flight on the 10 km link between the Initiator N_Port and the Target N_Port; serialization time is approximately 10 µs.]

Allocating Buffer Credits


You can calculate the number of credits required on a link to maintain optimal performance using the following formula:
Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time

Example
This diagram and the following two diagrams illustrate how the required number of BB_Credits are calculated for a 10km, 2Gbps FC link:

At a link rate of 2.125 Gbps, the time required to serialize (transmit) each byte is 4.7 ns. (Note that each byte is 10 bits on the wire due to 8b/10b encoding.) The maximum SCSI-FCP Fibre Channel payload size is 2048 bytes, because SCSI usually transfers multiple SCSI blocks of 512 bytes each. The payload size used in an actual customer environment would be based on the I/O characteristics of the customer's applications. You also need to account for the frame overheads:
- SOF (Start of Frame): 4 bytes
- FC header: 24 bytes
- CRC: 4 bytes
- EOF (End of Frame): 4 bytes
- IDLEs between frames: usually 6 IDLE words, or 24 bytes
This gives a total of 2108 bytes. The total serialization time at 2Gbps for a 2108-byte frame (including IDLEs) is 9.9 µs, or approximately 10 µs.
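The arithmetic above is easy to reproduce. The short Python sketch below is only an illustration of the calculation in the text, using the same assumed 2 KB payload and 2.125 Gbps line rate.

    # Frame serialization time at 2 Gbps (2.125 Gbaud, 8b/10b encoded).
    NS_PER_BYTE = 10 / 2.125             # 10 bits per encoded byte -> ~4.7 ns/byte

    payload  = 2048                      # SCSI-FCP data bytes
    overhead = 4 + 24 + 4 + 4            # SOF + FC header + CRC + EOF = 36 bytes
    idles    = 24                        # ~6 IDLE words between frames
    frame    = payload + overhead + idles        # 2108 bytes

    serialization_us = frame * NS_PER_BYTE / 1000
    print(f"{serialization_us:.1f} microseconds per frame")   # ~9.9 us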

Allocating Buffer Credits


Propagation delay:
- Speed of light in fiber is about 5 µs/km
- Time to transmit a frame across 10 km: 50 µs

Processing time:
- Assume the same as the de-serialization time: 10 µs

Response time:
- Time to transmit the R_RDY back across 10 km: 50 µs

Total latency = 50 µs + 10 µs + 50 µs = 110 µs

[Slide diagram: frame and R_RDY transit times across the 10 km link between the Initiator N_Port and the Target N_Port.]

The speed of light in a fiber optic cable is approximately 5 ns per metre, or 5 µs per kilometer, so each frame will take about 50 µs to travel across the 10 km link. The receiving port must then process the frame, free a buffer, and generate an R_RDY. This processing time can vary; for example, if the receiver ULP driver is busy, the frame might not be processed immediately. In this case, we can assume that the receiving port processes the frame immediately, so the processing time is equal to the time it takes to de-serialize the frame. Assume that the de-serialization time is equal to the serialization time: 10 µs. The receiving port then transmits a credit (R_RDY) back across the link. This response takes another 50 µs to reach the transmitter. The total latency on the link is equal to the frame serialization time plus the round-trip time across the link, or about 110 µs.


Allocating Buffer Credits (cont.)


Given a frame serialization time of 10 µs and a total latency of 110 µs, there could be up to 5 frames on the link at one time, 1 being processed, and 5 credits being returned. Therefore, at 2Gbps we require approximately 10 credits to maintain full bandwidth.

A good rule of thumb: at 2Gbps with a 2KB payload, you need approximately 1 credit per km.

[Slide diagram: several frames and returning R_RDY credits in flight simultaneously on the 10 km link between the Initiator N_Port and the Target N_Port.]

Given a frame serialization time of 10 µs and a total round-trip latency of 110 µs, there could be up to 5 frames on the link at one time, plus one being received and processed by the receiving port. In addition, 5 credits are being returned to the transmitting port down the other side of the link. In other words, ignoring the de-serialization time, approximately 10 buffer-to-buffer credits are required to make full use of the bandwidth of the 10 km link at 2Gbps with 2KB frames.
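Putting the numbers together, the credit requirement follows directly from the formula Credits = (Round_Trip_Time + Processing_Time) / Serialization_Time. The Python sketch below reproduces the 10 km, 2Gbps example from the text; the constants are the stated assumptions, not measured values.

    import math

    serialization_us      = 10     # time to put a 2108-byte frame on the wire
    propagation_us_per_km = 5      # speed of light in fibre, ~5 us/km
    distance_km           = 10
    processing_us         = 10     # assumed equal to the de-serialization time

    round_trip_us = 2 * distance_km * propagation_us_per_km   # frame out + R_RDY back
    credits = math.ceil((round_trip_us + processing_us) / serialization_us)
    print(credits)   # 11 -> roughly "1 credit per km" at 2Gbps with 2KB frames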


Fibre Channel Addressing


World-Wide Names
Every Fibre Channel port and node has a hard-coded address called a World Wide Name (WWN):
- Allocated to the manufacturer by the IEEE
- Coded into each device when it is manufactured
- 64 or 128 bits (64 bits is most common today)

The switch Name Server maps WWNs to FC addresses (FCIDs).

World-Wide Node Names (nWWNs) uniquely identify devices (nodes); World-Wide Port Names (pWWNs) uniquely identify each port in a device.

Example WWN: 20:00:00:45:68:01:EF:25 (WWNN 200000456801EF25)

Example WWNs from a dual-ported device:
- nWWN: 20:00:00:45:68:01:EF:25
- pWWN A: 21:00:00:45:68:01:EF:25
- pWWN B: 22:00:00:45:68:01:EF:25


World-Wide Names
WWNs are unique identifiers that are hard-coded into FC devices. Every FC port has at least one WWN. Vendors buy blocks of WWNs from the IEEE and allocate them to devices in the factory. WWNs are important for enabling fabric services because they are:

- Guaranteed to be globally unique
- Permanently associated with devices

These characteristics ensure that the fabric can reliably identify and locate devices, which is an important consideration for fabric services. When a management service or application needs to quickly locate a specific device:
1. The service or application queries the switch Name Server service with the WWN of the target device.
2. The Name Server looks up and returns the current port address (FCID) that is associated with the target WWN.
3. The service or application communicates with the target device using the port address.
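Conceptually the Name Server behaves like a lookup table keyed by WWN. The sketch below is a hypothetical illustration of steps 1-3; the WWN strings and FCIDs are invented for the example, and a real query is carried in frames sent to the Name Server's well-known address.

    # Hypothetical illustration of a Name Server WWN -> FCID lookup.
    name_server_db = {
        "21:00:00:45:68:01:ef:25": 0x010200,   # pWWN -> current FCID (example values)
        "22:00:00:45:68:01:ef:25": 0x010300,
    }

    def locate_device(pwwn):
        """Steps 1-2: query the Name Server and return the current FCID, if registered."""
        return name_server_db.get(pwwn.lower())

    fcid = locate_device("21:00:00:45:68:01:EF:25")
    if fcid is not None:
        print(f"Step 3: address frames to D_ID {fcid:06X}")   # 010200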


There are two types of WWNs:


nWWNs uniquely identify devices (nodes). Every host bus adaptor (HBA), array controller, switch, gateway, and FC disk drive has a single unique WWNN.
pWWNs uniquely identify each port in a device. A dual-ported HBA has three WWNs: one nWWN and a pWWN for each port.

nWWNs and pWWNs are both needed because devices can have multiple ports. On single-ported devices, the nWWN and pWWN are usually the same. On multi-ported devices, however, the pWWN is used to uniquely identify each port. Ports must be uniquely identifiable because each port participates in a unique data path. nWWNs are required because the node itself must sometimes be uniquely identified. For example, path failover and multiplexing software can detect redundant paths to a device by observing that the same WWNN is associated with multiple pWWNs. Cisco MDS switches use the following acronyms:

pWWN (Port WWN) nWWN (Node WWN)


Dynamic FCID Addressing


FCIDs are dynamically assigned to each device FC port by the switch during the Fabric Login process.
- Each switch in a fabric is assigned a Domain ID (1-239).
- The Port field is usually 00 for an N_Port, or the AL_PA for an NL_Port.
- The Area is usually tied to the physical switch port that the device is connected to, but this is restrictive; MDS switches instead assign the Area and Port of an FCID logically.
[Slide diagram: 24-bit FCID format]
- Bits 23-16: Domain
- Bits 15-08: Area
- Bits 07-00: Port (the AL_PA for a public loop device; a private loop device uses only the AL_PA, with the Domain and Area fields set to zero)

Dynamic FCID Addressing


FCIDs are dynamically assigned to each FC port by the switch when it receives a Fabric Login (FLOGI) from the device:

Each FC switch in the fabric is assigned a unique Domain ID from 1 to 239 (except McDATA switches, which assign only domains 97 to 127).
Traditional FC switches assign the Area ID based upon the physical port on the switch that the device is connected to. For example, a device connected to port 3 on the switch will receive an Area ID of hex 03, so the FCID is tied to the physical port on the switch.
The Port ID is usually hex 00 for an N_Port, or the AL_PA (Arbitrated Loop Physical Address) for an NL_Port. This means that every N_Port connected to the switch is reserved an entire area of 256 addresses, although it will only use 00. This is a wasteful use of addresses and one of the reasons why Fibre Channel cannot support the full 16.5 million addresses.
The Cisco MDS does not tie the Area to the physical port on the switch, but assigns the FCID logically in sequence, starting with an area of 00. The latest HBAs support flat addressing, and the Cisco MDS will combine the Area and Port fields together as a 16-bit Port ID field; each device is assigned an FCID in sequential order starting at 0000, 0001, and so on. Legacy devices will be assigned a fixed Port ID of 00 per Area, as described above.
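The Domain/Area/Port split is simply a byte-wise decomposition of the 24-bit FCID. The following Python helpers illustrate it; the FCID values are examples only.

    # Split a 24-bit FCID into Domain, Area, and Port fields, and rebuild one.
    def split_fcid(fcid):
        domain = (fcid >> 16) & 0xFF
        area   = (fcid >> 8)  & 0xFF
        port   = fcid & 0xFF            # AL_PA for an NL_Port, usually 00 for an N_Port
        return domain, area, port

    def make_fcid(domain, area, port):
        return (domain << 16) | (area << 8) | port

    print(split_fcid(0x0A0300))                   # (10, 3, 0): domain 0x0A, area 0x03, port 0x00
    print(f"{make_fcid(0x0A, 0x00, 0x01):06X}")   # 0A0001: the second FCID assigned when the
                                                  # Area and Port are treated as one 16-bit field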


The FC-AL Address Space


In a public (fabric-attached) loop:
- Public NL_Ports are assigned a full 24-bit fabric address when they log into the fabric.
- There are 126 AL_PA addresses available to NL_Ports in an arbitrated loop; the AL_PA 0x00 is reserved for the FL_Port (which is logically part of both the fabric and the loop).
- The Domain and Area fields are identical to those of the FL_Port to which the loop is connected.

In a private (isolated) loop:
- Private NL_Ports can communicate with each other based upon the AL_PA, which is assigned to each port during loop initialization.
- Private NL_Ports are not assigned a 24-bit fabric address, and the Domain and Area segments are not used.


Fabric Login
FCIDs are dynamically assigned to each device FC port by the switch during the Fabric Login (FLOGI) process.
- Each device will register (PLOGI) with the switch's Name Server.
- Initiators will query the Name Server for available targets, then send a PLOGI to each target to exchange FC parameters.
- The initiator will then log in to each target using Process Login (PRLI) to establish a channel of communication between them (an image pair).

[Slide diagram: the initiator node (N_Port A) and the target node (N_Port B) each perform FLOGI to their F_Ports and PLOGI to the Name Server; the initiator then performs PLOGI and PRLI with the target so that Process A and Process B can communicate.]


Before an N_Port can begin exchanging data with other N_Ports, three processes must occur:

The N_Port must log in to its attached F_Port. This process is known as Fabric Login (FLOGI). During FLOGI, both ports exchange Fibre Channel common parameters, such as buffer credits, buffer size, and the classes of service supported.
The initiator N_Port must log in to its target N_Port. This process is known as Port Login (PLOGI). This time the initiator and target ports exchange Fibre Channel common parameters, as before. If one port supports 2KB buffers but the other only supports 1KB buffers, they negotiate down to the lower value, i.e., 1KB buffers.
The initiator N_Port must exchange information about ULP support with its target N_Port to ensure that the initiator and target processes can communicate. This process is known as Process Login (PRLI). The parameters exchanged are specific to the Upper Layer Protocol (ULP). For instance, one port will state that it is an initiator, and the other must say that it is a target; if both ports are initiators, the PRLI is rejected.
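The three logins happen in a strict order, and each only makes sense after the previous one has succeeded. The sketch below is a simplified, hypothetical model of an initiator's view of that order; it does not model the actual ELS frame formats.

    # Simplified, hypothetical model of the initiator login sequence.
    def bring_up_initiator(fabric_available, peer_is_target):
        completed = []
        if not fabric_available:
            return completed
        completed.append("FLOGI")     # 1. log in to the fabric, obtain an FCID
        completed.append("PLOGI")     # 2. log in to the target N_Port, negotiate parameters
        if peer_is_target:
            completed.append("PRLI")  # 3. exchange ULP (FCP) parameters: image pair formed
        return completed              #    (PRLI is rejected if both ports are initiators)

    print(bring_up_initiator(True, True))    # ['FLOGI', 'PLOGI', 'PRLI']
    print(bring_up_initiator(True, False))   # ['FLOGI', 'PLOGI'] - PRLI rejected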


Fabric Login Analyzer Screenshot

Analyzer screenshot showing the contents of a FLOGI frame sent to the Fabric Login Server in the FC switch.

Fabric Login Analyzer Screenshot


The preceding image shows an analyzer trace that displays part of a fabric login sequence. The top of the trace shows the NOS-OLS-LR-LRR sequence that occurred while the link was being initialized. (NOS is off the screen) The right-hand panel shows the contents of the FLOGI frame from the N_Port to the FLOGI Server on the switch F_Port (FFFFFE). Useful information can be obtained by studying these analyzer traces:

Notice that at this time the N_Port does not yet have an address (00.00.00).
Notice also that the World Wide Port Name is the same as the World Wide Node Name. This is common in single-ported nodes.
The N_Port does not support Class 1, but it does support Classes 2 and 3.
The N_Port supports the Alternate Buffer Credit Management Method and can guarantee 2 BB_Credits at its receiver port.
You can see that this is a single-frame Class 3 sequence because the Start of Frame is SOFi3 and the End of Frame is EOFt, meaning that this initial first frame is also the last one in the sequence.


Registered State Change Notification (RSCN)


How can a host keep track of its targets?
- It can poll each device regularly, but this would incur high overhead, or
- It can register for State Change Notification (SCN) with the switch. The switch will send an RSCN whenever targets go offline or online, and the host will then query the Name Server to find out what has changed.
[Slide diagram: the host HBA and the storage controller each register with the Fabric Controller by sending SCR and receiving LS_ACC; when a link failure occurs, the Fabric Controller sends RSCNs to the registered devices.]

The Registered State Change Notification (RSCN) Process


Changes to the state of the fabric can affect the operation of ports. Examples of fabric state changes include:

- A node port is added to or removed from the fabric
- Inter-switch links (ISLs) are added to or removed from the fabric
- A membership change occurs in a zone

Ports must be notified when these changes occur.

The RSCN Process


The FC-SW standard provides a mechanism through which switches can automatically notify ports that changes to the fabric have occurred. This mechanism, known as the RSCN process, is implemented by a fabric service called the Fabric Controller. The RSCN process works as follows:

Nodes register for notification by sending a State Change Registration (SCR) frame to the Fabric Controller.
The Fabric Controller transmits RSCN commands to registered nodes when a fabric state change event occurs. RSCNs are transmitted as unicast frames because multicast is an optional service and is not supported by many switches.
Only nodes that might be affected by the state change are notified. For example, if the state change occurs within Zone A, and Port X is not part of Zone A, then Port X will not receive an RSCN.
Nodes respond to the RSCN with an LS_ACC frame.
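The register-and-notify pattern can be sketched in a few lines. The example below is a hypothetical model of the Fabric Controller's bookkeeping (the zone names and FCIDs are invented); it shows why only registered nodes that share a zone with the affected port receive an RSCN.

    # Hypothetical model of SCR registration and zone-filtered RSCN delivery.
    class FabricController:
        def __init__(self, zones):
            self.zones = zones          # zone name -> set of member FCIDs
            self.registered = set()     # ports that have sent an SCR

        def scr(self, fcid):
            self.registered.add(fcid)   # register for state change notification

        def state_change(self, affected_fcid):
            notified = []
            for members in self.zones.values():
                if affected_fcid in members:
                    # unicast an RSCN only to registered ports sharing a zone
                    notified += [p for p in members
                                 if p in self.registered and p != affected_fcid]
            return notified

    fc = FabricController({"ZoneA": {0x010200, 0x010300}, "ZoneB": {0x020100}})
    fc.scr(0x010300)
    fc.scr(0x020100)
    print([f"{p:06X}" for p in fc.state_change(0x010200)])   # ['010300']; ZoneB is not notified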


The RSCN message identifies the ports that were affected by the state change event, and it identifies the general nature of the event. After receiving an RSCN, the node can then use additional Link Services commands to obtain more information about the event. For example, if the RSCN specifies that the status of Port Y has changed, the nodes that receive the RSCN can attempt to verify the current (new) state of Port Y by querying the Name Server. The Fabric Controller will generate RSCNs in the following circumstances:

- A fabric login (FLOGI) from an Nx_Port.
- The path between two Nx_Ports has changed (for example, a change to the fabric routing tables that affects the ability of the fabric to deliver frames in order, or an E_Port initialization or failure).
- An implicit fabric logout of an Nx_Port, including an implicit logout resulting from loss of signal, link failure, or the fabric receiving a FLOGI from a port that had already completed FLOGI.
- Any other fabric-detected state change of an Nx_Port.
- Loop initialization of an L_Port in which the L_bit was set in the LISA sequence.
An Nx_Port can also issue a request to the Fabric Controller to generate an RSCN. For example, if one port in a multi-ported node fails, another port in that node can send an RSCN to notify the fabric about the failure.


Standard Fabric Services


[Slide diagram: standard fabric services (Domain Manager, Name Server, Unzoned Name Server, Zone Server, Configuration Server, Alias Server, Fabric Controller, Management Server, Key Server, Time Server) built on the Common Transport and Link Services, shown against the FC layer stack: FC-4 ULP mapping, FC-3 generic services, FC-2 framing and flow control, FC-1 encoding, FC-0 physical interface.]

The FC-SW-2 specification defines several services that are required for fabric management. These services include:

- Name Server
- Login Server
- Address Manager
- Alias Server
- Fabric Controller
- Management Server
- Key Distribution Server
- Time Server

The FC-SW-2 specification does not require that switches implement all of these services; some services can be implemented as an external server function. However, the services discussed in this lesson are typically implemented in the switch, as in Cisco MDS 9000 Family Switches.


The Domain Manager


[Slide diagram: the Domain Manager, responsible for fabric configuration, Principal Switch selection, Domain ID allocation, FC_ID allocation, and the FCID database and cache, interacting with the Management Services, VSAN Manager, WWN Manager, Port Manager, and Login Server.]

The Domain Manager


The Domain Manager is the logical function of a switch that is responsible for the assignment of addresses in a fabric. The Domain Manager is responsible for:

- Allocating domain IDs (requesting a domain ID, and assigning domain IDs to other switches if this switch is the Principal Switch)
- Allocating port addresses (FC_IDs)
- Participating in the Principal Switch selection process
- Performing the Fabric Build and Reconfiguration processes when the topology changes

The Domain Manager supports the Fabric Port Login Server, which is the service that N_Ports use when logging in to the fabric. When an N_Port logs into the fabric, it sends a FLOGI command to the Login Server. The Login Server then requests an FC_ID from the Domain Manager and assigns the FC_ID to the N_Port in its ACC reply to the FLOGI request. The preceding diagram shows how the Domain Manager interacts with other fabric services:

- The VSAN Manager provides the Domain Manager with VSAN configuration and status information.
- The WWN Manager tells the Domain Manager what WWN is assigned to the VSAN.
- The Port Manager provides the Domain Manager with information about the fabric topology (a list of E_Ports) and notifies the Domain Manager about E_Port state changes.
- The Login Server receives N_Port requests for FC_IDs during FLOGI.
- The Domain Manager interacts with management services to allow administrators to view and modify Domain Manager parameters.


The Name Server


The Name Server stores data about nodes, such as:
- FC_IDs
- nWWNs and pWWNs
- Fibre Channel operating parameters
- Supported protocols
- Supported classes of service

The Name Server supports soft zoning: it provides information only about nodes in the requestor's zone.

A distributed Name Server (dNS) resides in each switch:
- Responsible for entries associated with that switch's domain
- Maintains local data copies and updates them via RSCNs
- Sends RSCNs to the fabric when a local change occurs


The Name Server


The FC Name Server is a database implemented by the switch that stores information about each node, including:
- FC_IDs
- pWWNs and nWWNs
- FC operating parameters, such as supported ULPs and classes of service

The Name Server:
- Supports soft zoning by performing WWN lookups to verify zone membership
- Enforces zoning by providing information only about nodes in the requestor's zone
- Is used by management applications that need to obtain information about the fabric

Each switch in a fabric contains its own resident name server, called a distributed Name Server (dNS). Each dNS within a switch is responsible for the name entries associated with the domain assigned to the switch. The dNS instances synchronize their databases using the RSCN process. When a client Nx_Port wants to query the Name Service, it submits a request to its local switch via the well-known address of the Name Server. If the required information is not available locally, the dNS within the local switch responds to the request by making any necessary requests of other dNS instances contained in the other switches. The communication between switches that is performed to acquire the requested information is transparent to the original requesting client. Partial responses to dNS queries are allowed; if an entry switch sends a partial response back to an Nx_Port, it must set the partial response bit in the CT header.


Name Server Operations


When ports and nodes register with the Name Server, their characteristics are stored as objects in the Name Server database. The Port Identifier is the Fibre Channel port address identifier (FC_ID) assigned to an N_Port or NL_Port during fabric login (FLOGI). The Port Identifier is the primary key for all objects in a Name Server record. All objects are ultimately related back to this object. Because a node may have more than one port, the Node Name is a secondary key for some objects. There are three types of Name Server requests:

- Get Object: this request is used to query the Name Server.
- Register Object: only one object at a time can be registered with the Name Server. A client registers information in the Name Server database by sending a registration request containing a Port Identifier or Node Name.
- Deregister Object: only one global deregistration request is defined for the Name Server.

Name Server information is available, upon request, to other nodes, subject to zoning restrictions. If zones exist within the fabric, the Name Server restricts access to information in the Name Server database based on the zone configuration. When a port logs out of a fabric, the Name Server deregisters all objects associated with that port.


The Management Server


- A single access point for information about the fabric topology
- Information is provided without regard to zone
- Read-only access
- Services provided:
  - Fabric Configuration Service (FCS)
  - Zone Service
  - Unzoned Name Service


The Management Server


The FC Management Server provides a single access point for obtaining information about the fabric topology. Whereas the Name Server only provides information about ports configured within the zone of the port requesting information, the Management Server provides information about the entire fabric, without regard to zone. The Management Server allows SAN management applications to discover and monitor SAN components, but it does not allow applications to configure the fabric; the Management Server provides read-only access to its data. The Management Server provides the following services:

- The Fabric Configuration Service (FCS) supports configuration management of the fabric. This service allows applications to discover the topology and attributes of the fabric.
- The Zone Service provides zone information for the fabric, either to management applications or directly to clients.
- The Unzoned Name Service provides information about the fabric without regard to zones. This service allows management applications to see all the devices on the entire fabric.


Well-Known Addresses
Well-known addresses are reserved addresses for FC Services at the top of the 24-bit fabric address space
Broadcast Alias               FFFFFF           Mandatory
Fabric Login Server           FFFFFE           Mandatory
Fabric Controller             FFFFFD           Mandatory
Name Server                   FFFFFC           Optional
Time Server                   FFFFFB           Optional
Management Server             FFFFFA           Optional
QoS Facilitator               FFFFF9           Optional
Alias Server                  FFFFF8           Optional
Key Distribution Server       FFFFF7           Optional
Clock Synchronization Server  FFFFF6           Optional
Multicast Server              FFFFF5           Optional
Reserved                      FFFFF4-FFFFF0


Well-Known Addresses
Well-known addresses allow devices to reliably access switch services. All services are addressed in the same way as an N_Port is addressed. Nodes communicate with services by sending and receiving Extended Link Services commands (frames) to and from well-known addresses. Well-known addresses are the highest 16 addresses in the 24-bit fabric address space:

FFFFFF - Broadcast Alias
FFFFFE - Fabric Login Server
FFFFFD - Fabric Controller
FFFFFC - Name Server
FFFFFB - Time Server
FFFFFA - Management Server
FFFFF9 - Quality of Service Facilitator
FFFFF8 - Alias Server
FFFFF7 - Key Distribution Server
FFFFF6 - Clock Synchronization Server
FFFFF5 - Multicast Server
FFFFF4-FFFFF0 - Reserved

The first three services are mandatory in all FC switches; however, all FC switches today implement the first six services by default for ease of management.
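For quick reference, the well-known addresses can be kept in a small table. The Python mapping below restates the list above and adds nothing beyond the addresses already shown.

    # Fibre Channel well-known addresses, as listed above.
    WELL_KNOWN_ADDRESSES = {
        0xFFFFFF: "Broadcast Alias",
        0xFFFFFE: "Fabric Login Server",
        0xFFFFFD: "Fabric Controller",
        0xFFFFFC: "Name Server",
        0xFFFFFB: "Time Server",
        0xFFFFFA: "Management Server",
        0xFFFFF9: "Quality of Service Facilitator",
        0xFFFFF8: "Alias Server",
        0xFFFFF7: "Key Distribution Server",
        0xFFFFF6: "Clock Synchronization Server",
        0xFFFFF5: "Multicast Server",
    }

    def is_well_known(d_id):
        """True if a destination ID targets a fabric service rather than an N_Port."""
        return d_id in WELL_KNOWN_ADDRESSES or 0xFFFFF0 <= d_id <= 0xFFFFF4

    print(WELL_KNOWN_ADDRESSES[0xFFFFFC])   # Name Server
    print(is_well_known(0x010200))          # False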


Lesson 2

Cisco MDS 9000 Introduction


Overview
In this lesson, you will learn about the MDS 9000 Family of SAN switches, including an overview of the MDS chassis and line card modules.

Objectives
Upon completing this lesson, you will be able to identify the components of an MDS 9000 storage networking solution. This includes being able to meet these objectives:

- Identify the hardware components of the MDS 9000 platform
- Explain supported airflow and power configurations
- Explain the MDS 9000 licensing model

Cisco Storage Solutions Overview


MDS 9000 Family Product Line: First Generation
Industry Leading Innovation and Investment Protection across a Comprehensive Product Line
[Slide diagram: the first-generation MDS 9000 Family. Systems: MDS 9020, MDS 9120 and 9140, MDS 9216 and 9216i multilayer fabric switches, and MDS 9506 and 9509 multilayer directors. Modules: Supervisor-1, 16-port and 32-port FC, Multiprotocol Services (14+2), 8-port IPS (iSCSI + FCIP), and SSM virtualization. Management: Device and Fabric Manager, FM Server, Performance Manager, and Traffic Analyzer, all running on the MDS 9000 Family Operating System (SAN-OS).]

Multilayer switches are switching platforms with multiple layers of intelligent features, such as:

- Ultra-high availability
- Scalable architecture
- Comprehensive security features
- Ease of management
- Advanced diagnostics and troubleshooting capabilities
- Seamless integration of multiple technologies
- Multiprotocol support

Multilayer switches also offer a scalable architecture with highly available hardware and software. Based on the MDS 9000 Family Operating System and a comprehensive management platform called Cisco Fabric Manager, the MDS 9000 Family offers a variety of application line card modules and a scalable architecture from entry-level fabric switches to director-class systems. The Cisco MDS 9000 Family offers industry-leading investment protection across a comprehensive product line. The 9020 is a new low-cost 20-port FC switch providing 1/2/4 Gb/s at full line rate. This model currently has a single power supply, four fans, and front-to-rear airflow, and features non-disruptive software upgrades and management via the CLI or FM/DM. The current release, 2.1.2, does not support VSANs, but this is planned for release 3.0.0.


MDS 9000 Family Product Line: Second Generation


Industry Leading Innovation and Investment Protection across a Comprehensive Product Line
[Slide diagram: the second-generation MDS 9000 Family. Systems: the MDS 9513 Multilayer Director. Modules: Supervisor-2, 12-port, 24-port, and 48-port FC 1/2/4 Gb/s linecards (using 4 Gbps SFPs), and a 4-port 10 Gb/s FC linecard (using X2 transceivers). Management: Device and Fabric Manager, FM Server, Performance Manager, and Traffic Analyzer, all running on the MDS 9000 Family Operating System.]

In April 2006, Cisco introduced the MDS 9513 Multilayer Director and second-generation linecards. The MDS 9513 Multilayer Director is a new 13-slot chassis with two Supervisor-2 slots. (Note that the Supervisor-1 is not compatible with this chassis.) Supporting this architecture, but forward and backward compatible with the existing architecture, are the new 12-port, 24-port, and 48-port FC linecards that provide 1/2/4 Gb/s using new 4 Gb/s SFPs. The 4-port 10 Gb/s FC linecards are also forward and backward compatible with the existing architecture and provide four 10 Gb/s FC ports at full line rate using X2 transceivers.


MDS 9500 Series Directors


MDS 9506: 6 slots, up to 192 ports, 7 RU form factor, 6 per rack, 21.75 in. deep
MDS 9509: 9 slots, up to 336 ports, 14 RU form factor, 3 per rack, 18.8 in. deep
MDS 9513: 13 slots, up to 528 ports, 14 RU form factor, 3 per rack, 28 in. deep

MDS 9500 Series Directors


The Cisco MDS 9500 series multi-layer directors elevate the standard for director-class switches. Providing industry-leading availability, multiprotocol support, advanced scalability, security, non-blocking fabrics that are 10-Gbps ready, and a platform for storage management, the Cisco MDS 9500 Series allows you to deploy high-performance SANs with a lower total cost of ownership. Layering a rich set of intelligent features and hardware-based services onto a high-performance, protocol-agnostic switch fabric, the Cisco MDS 9500 Series of multilayer directors addresses the stringent requirements of large data-center storage environments. MDS 9500 Series switch chassis are available in three sizes: MDS 9513 (14) rack units, MDS 9509 (14) rack units and MDS 9506 (7) rack units.

MDS 9506 Chassis


The Cisco MDS 9506 Director supports the same director-class features as the Cisco MDS 9509, but with a more compact six-slot (7 RU) chassis design, because the power supplies are located at the rear. It has slots for two supervisor modules and four switching or services modules. The power supplies are installed in the back of the chassis for easy removal, with the Power Entry Modules (PEMs) in the front of the chassis for easy access. Up to six MDS 9506 chassis can be installed in a standard 42U rack, with up to 128 1/2-Gbps FC ports per chassis or up to 24 1-Gbps Ethernet ports per chassis for IP storage services applications. The resulting density of up to 768 FC ports in a single seven-foot (42 RU) rack is industry leading and optimizes the use of valuable data center floor space. Additionally, cable management is facilitated by the single-side position of both interface and power terminations.

MDS 9509 Chassis


The Cisco MDS 9509 Director has a 9-slot chassis with redundant supervisor modules, up to seven switching modules or six IPS modules, redundant power supplies, and a removable fan module. Slots 5 and 6 are reserved for the redundant supervisor modules, which provide control, switching, and local and remote management. The Cisco MDS 9509 supports an industry-leading port density per system, expandable up to 224 FC ports in a single chassis. Up to 48 Gigabit Ethernet ports can be configured when using the IP Storage Services module. Even though a 48 Gigabit Ethernet port configuration is physically possible in an MDS 9509, it is not likely to be deployed because it leaves room for only a single FC switching module. There are two system clock cards for added high availability. Dual redundant power supplies are located at the front of the chassis, so the MDS 9509 is only 18.8 inches deep.

MDS 9513 Chassis


The Cisco MDS 9513 Director has a 13-slot chassis with redundant Supervisor-2 modules, up to eleven switching modules or ten IPS modules, redundant 6KW power supplies, a removable fan module at the front, and additional removable fan modules at the rear for the fabric modules. Slots 7 and 8 are reserved for the redundant Supervisor-2 modules, which provide control, switching, and local and remote management. The Cisco MDS 9513 supports an industry-leading port density per system, expandable up to 528 FC ports in a single chassis. Up to 80 Gigabit Ethernet ports can be configured when using the IP Storage Services module. There are two new removable system clock modules at the rear for added high availability. Dual redundant 6KW power supplies are located at the rear of the chassis. The MDS 9513 has a revised airflow system at the rear of the chassis: in at the bottom and out at the top.


MDS 9513 Director


Redefining director-class storage switching:
- Ultra-high availability
- Multiprotocol support: FC, iSCSI, FCIP, FICON

Industry-leading port density:
- Up to 528 ports per chassis, up to 1584 ports per rack
- Software support in SAN-OS 3.0

New Supervisor-2 (required):
- Enhanced crossbar arbiter
- Dual BIOS, redundant bootflash
- Advanced security features

Redundant high-performance crossbar fabric modules

Redundant 6000W AC power supplies:
- Room-to-grow power for future application modules

Revised airflow:
- Bottom to top, at the rear of the chassis
- Front and rear fan trays

[Slide image: MDS 9513 rear view]

The MDS 9513 Director


The Cisco MDS 9513 Director has a 13-slot chassis with redundant supervisor modules, up to eleven switching modules or ten IPS modules, redundant power supplies, a removable linecard fan module at the front, and a removable fabric fan module at the rear. Slots 7 and 8 are reserved for the redundant Supervisor-2 modules, which provide control, switching, and local and remote management. The MDS 9513 supports an industry-leading port density per system, expandable up to 528 FC ports in a single chassis using eleven 48-port linecards. Up to 80 Gigabit Ethernet ports can be configured when using ten 8-port IP Storage Services modules. There are two removable system clock cards for added high availability. At the rear of the chassis there are two new fabric modules that contain redundant high-performance (2 Tbps) crossbars. The MDS 9513 has new redundant 6000W power supplies and features a revised airflow, from bottom to top at the rear of the chassis.


MDS 9000 Fabric Switches


MDS 9216A:
- Expands from 16 to 64 FC ports
- Modular upgrade to IP or intelligent SAN switching; full compatibility with the MDS 9500 Series

MDS 9216i:
- Expands from 14 to 62 FC ports, plus 2 GigE ports for FCIP and iSCSI
- Fully supports any of the new linecards

MDS 9120 and 9140:
- MDS 9120: 20-port Fibre Channel switch; MDS 9140: 40-port Fibre Channel switch
- Cost-effective, intelligent fabric switches for the SAN edge or small/medium FC SANs
- Small footprint, high density, fixed configuration
- Feature compatibility with the MDS 9506/9509/9216

MDS 9020:
- Low-cost 20-port 4Gbps FC switch with a free FMS license
- Supports non-disruptive firmware upgrades

MDS 9000 Fabric Switches


MDS 9000 fabric switches include the MDS 9216A, the MDS 9216i, the MDS 9100 Series, and the new MDS 9020. The MDS 9216 offers Fibre Channel expansion up to sixty-four (64) FC ports using the new 48-port module and can also accept any linecard, including the IP Storage Services (IPS) module or the SSM intelligent virtual storage module, for industry-leading performance in the mid-range SAN switch category. The MDS 9216i offers a built-in 14+2 module and can expand up to sixty-two (62) FC ports using the new 48-port FC linecard. In addition to the two built-in GbE ports that support FCIP and iSCSI, it can also accept the IP Storage Services (IPS) module. The Cisco MDS 9100 Series is a cost-effective intelligent fabric switch platform. These high-density, fixed-configuration switches take up a small 1U footprint while offering broad feature compatibility with the other MDS 9000 Family systems. The Cisco MDS 9020 low-cost entry-level switch supports 4Gbps FC ports and non-disruptive firmware upgrades. The MDS 9020 is provided with a free FMS license for efficient multi-fabric management.


Common Architecture:
Ease-of-Migration and Investment Protection

[Slide diagram: common architecture across the MDS 9216/9216i, MDS 9506/9509, and MDS 9513. First-generation modules: Supervisor-1, FC-16, FC-32, IPS-8, MPS 14+2, SSM. Second-generation modules: Supervisor-2, 12-port, 24-port, and 48-port FC, 4-port 10 Gb/s FC. All linecards are forward/backward compatible (with some feature considerations in mixed Vegas/Isola configurations; Supervisor-2 only in the 9513).]

Current generation: architectural support for up to 256 indexes, maximum planned system density of 240 ports, 1/2 Gb/s FC interfaces.

New generation: architectural support for up to 1,024 indexes, maximum planned system density of 528 ports, 1/2/4 Gb/s and 10G FC interfaces.

All first-generation and second-generation modules are forward and backward compatible. The first generation has architectural support for up to 256 indexes (destination ports), and the maximum planned system density is 240 ports (using the MDS 9509), although in practice it is 224 ports using seven 32-port linecards. However, using a mix of current and second-generation modules it is possible to increase this to 252 ports. Each supervisor module consumes two indexes, so a total of 4 indexes are used by supervisors on MDS 9500 switches. It is worth noting that each Gigabit Ethernet interface uses 4 indexes, so an IPS-8 would consume 32 indexes and a 14+2 would consume 22 indexes from the pool. The second-generation platform has architectural support for up to 1,024 indexes, and the maximum planned system density is currently 528 ports using eleven 48-port cards. However, if any one of the current-generation linecards is inserted into the 9513 chassis, the maximum number of indexes is reduced to 252. The 9513 chassis must use only the Supervisor-2 module; however, both Supervisor-1 and Supervisor-2 cards may be used in the current-generation 9506 and 9509.
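The index budget can be checked with simple arithmetic. The sketch below uses only the per-module figures quoted in this paragraph (2 indexes per supervisor, 1 per FC port, 4 per Gigabit Ethernet interface); it is an illustration, not a configuration or sizing tool.

    # Index accounting for a first-generation MDS 9500, using figures from the text.
    INDEXES_AVAILABLE = 256
    INDEX_COST = {"supervisor": 2, "fc_port": 1, "gige_port": 4}

    def indexes_used(supervisors, fc_ports, gige_ports):
        return (supervisors * INDEX_COST["supervisor"]
                + fc_ports * INDEX_COST["fc_port"]
                + gige_ports * INDEX_COST["gige_port"])

    # Example: dual supervisors and seven 32-port FC linecards in an MDS 9509.
    print(indexes_used(supervisors=2, fc_ports=7 * 32, gige_ports=0))   # 228 of 256

    # A 14+2 module: 14 FC ports + 2 GigE ports = 22 indexes, as stated above.
    print(indexes_used(0, 14, 2))                                       # 22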


Airflow and Power


MDS 9000 Series Fan Modules and Airflow
- Hot-swappable fan modules; easy installation and removal
- Integrated temperature management
- MDS 9120 and 9140: two rear-mounted, hot-swappable fan modules
- MDS 9216: front-mounted, hot-swappable fan tray with four fans

[Slide images: MDS 9120/9140 rear view and MDS 9216, showing airflow direction.]

MDS 9100 and 9200 Fan Modules and Airflow


The Cisco MDS 9100 Series switch supports two hot-swappable fan modules. The switch continues to run if a fan module is removed, as long as preset temperature thresholds have not been exceeded. This means you can swap out a fan module without having to bring the system down. The fan modules each have one status LED, and fan module status is also indicated on a front panel LED. The MDS 9216 switch supports a hot-swappable fan module with four fans. It provides 270 CFM (cubic feet per minute) of cooling, allowing 400 Watts of power dissipation per slot. Sensors on the supervisor module monitor the internal air temperature; if the air temperature exceeds a preset threshold, the environmental monitor displays warning messages. If one or more fans within the fan module fail, the Fan Status LED turns red. Individual fans cannot be replaced; you must replace the entire fan module. The MDS 9216 will continue to run if the fan module is removed, as long as preset temperature thresholds have not been exceeded, so you can swap out a fan module without having to bring the system down. The fan module is designed to be removed and replaced while the system is operating without presenting an electrical hazard or damage to the system, provided the replacement is performed promptly. Removal periods should be limited to a total of 5 minutes, depending on system temperature. Integrated temperature and power management facilities help to ensure increased uptime. Fan status LEDs indicate the condition of the fans on the module: if one or more fans fail, the fan status LED turns red; when all fans are operating properly, the LED is green.


MDS 9500 Series Fan Modules and Airflow


MDS 9506 and 9509 airflow:
- Hot-swappable, front-mounted fan tray; easy installation and removal
- Sensors monitor system temperature; a temperature rise or fan failure generates an event
- Recommended to replace the fan tray at the earliest opportunity: replace the fan tray within 3 minutes or receive a critical warning, followed by shutdown in 2 more minutes if the fan tray is still not replaced
- MDS 9506: 6 fans; MDS 9509: 9 fans; MDS 9513: 15 fans, plus an additional fabric fan tray at the rear
- MDS 9513 airflow: bottom to top, at the rear of the chassis

MDS 9500 Fan Modules and Airflow


The MDS 9500 Series supports hot-swappable fan modules that are easily installed in or removed from the front of the chassis. They provide 85 cubic feet per minute (CFM) of airflow per slot with 410 Watts of power dissipation per slot. The MDS 9506 has a fan module with six fans, and the Cisco MDS 9509 has a fan module with nine fans. Sensors on the supervisor module monitor the internal air temperature; if the air temperature exceeds a preset threshold, the environmental monitor displays warning messages. If one or more fans within the module fail, the Fan Status LED turns red and the module must be replaced. When all fans are operating properly, the LED is green. If the fan LED is red, the fan assembly may not be seated properly in the chassis, in which case remove the fan assembly and reinstall it. After reinstalling, if the LED is still red, there is a failure on the fan assembly. Fan LED status indication is provided on a per-module basis: if one fan fails, the module is considered failed. The switch can continue to run with the fan module removed for a maximum of 5 minutes, provided the temperature thresholds are not exceeded. This allows you to swap out a fan module without having to bring the system down. The fan module is designed to be removed and replaced while the system is operating without presenting an electrical hazard or damage to the system, provided the replacement is performed promptly. Install the fan module into the front chassis cavity with the status LED at the top. Push the fan module in to ensure that its connector mates with the chassis, and tighten the captive installation screws. If the switch is powered on, listen for the fans; you should immediately hear them operating.


Power Management
MDS switches have dual power supplies*:
- Hot-swappable for easy installation and removal

Power supply modes:
- Redundant mode (the default): the power capacity of the lower-capacity supply is used, so sufficient power will be available in case of a PSU failure.
- Combined mode (non-redundant): twice the power capacity of the lower-capacity supply is used, so sufficient power may not be available in case of a power supply failure.
- Only modules with sufficient power are powered up.
- Power is reserved for the supervisors and fan assemblies.
- After the supervisors, modules are powered up starting at slot 1.

* The MDS 9020 has a single integral power supply.
2006 Cisco Systems, Inc. All rights reserved.

13

Power Management
Power supplies are configured in redundant mode by default, but they can also be configured in a combined (non-redundant) mode. In redundant mode, the chassis uses the power capacity of the lower-capacity power supply, so that sufficient power is available in case of a single power supply failure. In combined mode, the chassis uses twice the power capacity of the lower-capacity power supply; sufficient power may not be available in case of a power supply failure in this mode. If there is a power supply failure and the real power requirements for the chassis exceed the power capacity of the remaining power supply, the entire system is reset automatically to prevent permanent damage to the power supply. In either mode, power is reserved for the supervisor and fan assemblies. Each supervisor module has roughly 220 watts in reserve, even if there is only one installed, and the fan module has 210 watts in reserve. In the case of insufficient power, after the supervisors and fans are powered, line card modules are given power from the top of the chassis down. After the reboot, only those modules that have sufficient power are powered up. If the real power requirements do not trigger an automatic reset, no module is powered down; instead, no new module is powered up. In all cases of power supply failure, removal, and so on, a syslog message is printed, a Call Home message is sent if configured, and an SNMP trap is sent.
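The difference between the two modes is simply the arithmetic used to compute the available power budget. A minimal sketch, assuming two supplies with example wattages (this is an illustration of the rule above, not a sizing tool):

    # Available power budget under the two power-supply modes described above.
    def available_power(psu_watts, mode="redundant"):
        lower = min(psu_watts)
        if mode == "redundant":
            return lower          # budget that survives the loss of either supply
        if mode == "combined":
            return 2 * lower      # larger budget, but not failure-tolerant
        raise ValueError(f"unknown mode: {mode}")

    supplies = [2500, 2500]       # e.g. two 2500 W supplies in an MDS 9509
    print(available_power(supplies, "redundant"))   # 2500
    print(available_power(supplies, "combined"))    # 5000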


MDS 9000 Power Supplies


MDS 9020:
- Integral power supply
- 100 Watt AC supply: 100W @ 100-240 VAC

MDS 9100:
- Removable power supplies at the rear of the chassis
- 300 Watt AC supply: 300W @ 100-240 VAC

MDS 9216:
- Removable power supplies at the rear of the chassis
- 845 Watt AC supply: 845W @ 100-240 VAC

MDS 9000 Power Supply Modules


The MDS 9500 Series supports redundant, hot-swappable power supplies with AC input; each is capable of supplying sufficient power to the entire chassis should one power supply fail. The power supplies monitor their output voltage and provide status to the supervisor module. To prevent the unexpected shutdown of an optional module, the power management software only allows a module to power up if adequate power is available. The power supplies can be configured to be redundant or combined. By default, they are configured as redundant, so that if one fails, the remaining power supply can still power the entire system. Condition LEDs give visual indications of the installed modules and their operation. The Cisco MDS 9020 switch has a single integral power supply only.


MDS 9500 Power Supplies


MDS 9506:
- Removable power supplies at the rear of the chassis
- 1900 Watt AC supply: 1900W @ 200-240 VAC, 1050W @ 100-120 VAC

MDS 9509:
- Removable power supplies at the front of the chassis
- 2500 Watt AC supply: 2500W @ 200-240 VAC, 1300W @ 100-120 VAC
- 3000 Watt AC supply (new)

MDS 9513:
- Removable power supplies at the rear of the chassis
- 6000 Watt AC supply: 6000W @ 200-240 VAC

MDS 9500 Power Supply Modules


The MDS 9500 Series supports redundant, hot-swappable power supplies with AC input; each is capable of supplying sufficient power to the entire chassis should one power supply fail. The power supplies monitor their output voltage and provide status to the supervisor module. To prevent the unexpected shutdown of an optional module, the power management software only allows a module to power up if adequate power is available. The power supplies can be configured to be redundant or combined. By default, they are configured as redundant, so that if one fails, the remaining power supply can still power the entire system. Condition LEDs give visual indications of the installed modules and their operation. The Cisco MDS 9509 Director supports the following types of power supplies:
- A 2500 Watt AC power supply with an AC input and DC output. This power supply requires 220 VAC to deliver 2500 Watts of power; if powered by 110 VAC, it delivers only 1300 Watts. This supply has a current rating of 20 amps for circuit breakers but a 16 amp maximum draw under normal conditions.
- A new 3000 Watt AC power supply with an AC input and a DC output, also available for the MDS 9509.
All three power supplies appear very similar from the outside, with the handle, air vent, power switch, condition LEDs, and captive screws in the same locations. However, make note of the different power input connections, where the differences are removable or permanently attached cords or a terminal block connection. The MDS 9513 power supplies are located at the rear of the chassis and provide 6000W of power at a nominal voltage of 220 VAC.


Software Packages and Licensing


Software Licensing
- Enforced software licensing started with SAN-OS 1.3
- Includes the Standard Package, which is free
- Five additional license packages:
  - Enterprise Package
  - SAN Extension over IP (FCIP)
  - Mainframe (FICON)
  - Fabric Manager Server (FMS)
  - Storage Services Enabler (SSE)

Standard Package (free) includes:
- Fibre Channel and iSCSI
- iSCSI Server Load Balancing
- VSANs and Zoning
- PortChannels
- FCC and Virtual Output Queuing
- Diagnostics (SPAN, RSPAN, etc.)
- Fabric Manager and Device Manager
- SNMPv3, SSH, SSL, SFTP
- SMI-S 1.10 and FDMI
- Role-based access control
- RADIUS and TACACS+, MS CHAP
- RMON, Syslog, Call Home
- Brocade native interop modes 2 and 3
- McData native interop mode 4
- NPIV (N_Port ID Virtualization)
- IVR over FCIP
- Command Scheduler
- IPv6 (management and IP services)

Features may be evaluated free for 120 days.


The Cisco MDS 9000 Family SAN-OS is the underlying system software that powers the award-winning Cisco MDS 9000 Family Multilayer Switches. SAN-OS is designed for storage area networks (SANs) in the best traditions of Cisco IOS Software to create a strategic SAN platform of superior reliability, performance, scalability, and features. In addition to providing all the features that the market expects of a storage network switch, the SAN-OS provides many unique features that help the Cisco MDS 9000 Family to deliver low total cost of ownership (TCO) and a quick return on investment (ROI).

Common Software Across All Platforms


The SAN-OS runs on all Cisco MDS 9000 Family switches, from multilayer fabric switches to multilayer directors. Using the same base system software across the entire product line enables Cisco Systems to provide an extensive, consistent, and compatible feature set on the Cisco MDS 9000 Family. Most Cisco MDS 9000 Family software features are included in the base switch configuration. The standard software package includes the base set of features that Cisco believes are required by most customers for building a SAN. However, some features are logically grouped into add-on packages that must be licensed separately.


Bundled Packages for SAN-OS 3.0


Enterprise Package:
- Enhanced security features: VSAN-based access control, LUN zoning and read-only zones, Port Security, host/switch authentication (FC-SP), digital certificates (IKE X.509), IPsec security for iSCSI and FCIP*
- Advanced traffic engineering: QoS and zone-based QoS, Extended Credits*, FC Write Acceleration and Read Acceleration, SCSI Flow Statistics, Enhanced IOD
- Enhanced VSAN functionality: Inter-VSAN Routing, IVR with FCID NAT

SAN Extension over IP Package (FCIP):
- FCIP protocol, FCIP compression, hardware-based FCIP compression*, hardware-based FCIP encryption*, FCIP Write Acceleration, FCIP Tape Acceleration and Read Acceleration, SAN Extension Tuner

Mainframe Package (FICON):
- FICON protocol and CUP management, FICON VSANs and intermixing, FICON Tape Acceleration (read/write), switch cascading and fabric binding

Storage Services Enabler Package (SSE):
- SANTap protocol, NASB (Network Accelerated Serverless Backup), FAIS (Fabric Application Interface Standard)

Fabric Manager Server (FMS):
- Multiple physical fabric management, centralized fabric discovery services, continuous MDS health and event monitoring, long-term historical data collection, performance reports and charting for hot-spot analysis, performance prediction and server summary reports, web-based operational view, threshold monitoring, configurable RRD, data collection auto-update and event forwarding

* Requires the MPS (14+2) module or the 9216i

Bundled Software Packages for SAN-OS 3.0


The SAN-OS feature packages are:

- Enterprise Package: adds a set of advanced features which are recommended for all enterprise SANs.
- SAN Extension over IP Package: enables FCIP for IP Storage Services and allows the customer to use the IP Storage Services to extend SANs over IP networks.
- Mainframe Package: adds support for the FICON protocol. FICON VSAN support is provided to help ensure that there is true hardware-based separation of FICON and open systems. Switch cascading, fabric binding, and intermixing are also included in this package.
- Fabric Manager Server Package: extends Cisco Fabric Manager by providing historical performance monitoring for network traffic hotspot analysis, centralized management services, and advanced application integration for greater management efficiency.
- Storage Services Enabler Package: enables network-hosted storage applications to run on the Cisco MDS 9000 Family Storage Services Module (SSM). A Storage Services Enabler package must be installed on each SSM.

The SAN-OS Software package fact sheets are available at http://www.cisco.com/en/US/products/hw/ps4159/ps4358/products_data_sheets_list.html.


Unique Attributes of MDS Licensing

Simple value-based packaging
  Feature-rich Standard Package (no extra charge)
  Simple bundles for advanced features that provide significant value
  All upgrades included in support pricing

High availability
  Non-disruptive installation
  No single point of failure
  120-day grace period for enforcement

Ease of use
  Seamless electronic licenses
  No separate software images for licensed features
  Licenses installed on the switch at the factory
  Automated license key installation
  Centralized License Management Console provides a single point of license management for all switches

Unique Attributes of MDS Licensing


License usability can be a nightmare with existing products. Customers have concerns about compromising availability with disruptive software installations for licensed features. License management is a notorious problem. Cisco license packages require a simple installation of an electronic licenseno software installation or upgrade is required. Licenses can also be installed on the switch in the factory. MDS switches store license keys on the chassis SPROM, so license keys are never lost even during a switch software reinstall. Cisco Fabric Manager includes a centralized license management console that provides a single interface for managing licenses across all MDS switches in the fabric, reducing management overhead and preventing problems due to improperly maintained licensing. In the event that an administrative error does occur with licensing, the switch provides a grace period before the unlicensed features are disabled, so there is plenty of time to correct the licensing issue. All licensed features may be evaluated for a period of up to 120 days before a license is required.


Multilayer Intelligent Storage Platform

Enterprise-Class Management
  FM, DM, PM, Traffic Analyzer, PAA, advanced diagnostics, integration

Intelligent Storage Services
  Virtualization, replication, volume management

SAN Consolidation
  VSANs, PortChannels, security, traffic engineering, QoS, FCC

Multiprotocol
  FC, FICON, FCIP, iSCSI

High Availability Infrastructure
  Hardware redundancy, hot-swap, non-disruptive upgrade

The Cisco MDS 9000 Series is the first multilayer intelligent storage platform.

High-availability infrastructure: redundant power and cooling, redundant supervisor modules with stateful failover, hot-swap modules, and non-disruptive software upgrades for the MDS 9500 platform give you 99.999% availability.
Multiprotocol: iSCSI enables integration of mid-range servers into the SAN, FICON enables integration of mainframe systems with complete isolation of FICON and FC ports, and FCIP enables cost-effective DR solutions.
SAN consolidation: intelligent infrastructure services like virtual SANs (VSANs), PortChannels, per-VSAN FSPF routing, QoS, FCC, and robust security enable stable, scalable, and secure enterprise SAN consolidation.
Intelligent storage services: network-based services for resource virtualization, volume management, data mobility, and replication lower TCO and increase ROI.
Enterprise-class management: integrated device, fabric, and performance management improves management productivity and easily integrates with existing enterprise management frameworks like IBM Tivoli and HP OpenView.


Q: How do we build a 3000 port fabric?

A: Six MDS 9513 Directors



Question: How do we build a 3000-port fabric? Answer: With six MDS 9513 Directors. The MDS 9513 has the largest port capacity (528 ports) of any Fibre Channel switch or director on the market today.


Lesson 3

Architecture and System Components


Overview
This lesson describes the hardware and software architecture of the MDS 9000 storage networking platform.

Objectives
Upon completing this lesson, you will be able to describe the hardware architecture and components of the MDS 9000 Family of switches. This includes being able to meet these objectives:

Describe the system architecture of the MDS 9000 platform
Explain how to design fabrics using full-rate and oversubscribed line cards
Explain how buffer credits are allocated on MDS 9000 line card modules

System Architecture
MDS Integrated Crossbar

Investment protection
  Ability to support new line cards
  Multiprotocol support in one system

Centralized crossbar switch architecture
  Highly scalable system: aggregate bandwidth up to 2.2 Tbps
  Flexibility to support speed mismatches (1 Gbps, 2 Gbps, 4 Gbps, 10 Gbps)
  Arbiter schedules frames fairly to ensure consistent latency
  Virtual Output Queuing for optimal crossbar performance

High port density
  Flexibility to support higher-density line cards
  Fewer devices to purchase and manage
  Increase in usable FC ports: no wasted internal ports, minimal switch interconnects

MDS Integrated Crossbar


The integrated crossbar provides investment protection through its ability to support new line cards, including new transports, and through multiprotocol support in one system (Fibre Channel, Internet Small Computer Systems Interface [iSCSI], and Fibre Channel over IP [FCIP]). It is also highly scalable, supporting up to 1.44 terabits per second (Tbps) of aggregate bandwidth on the MDS 9506 and 9509 and up to 2.2 Tbps on the MDS 9513. High port density means fewer devices to purchase and manage. Minimal switch interconnects increase the number of usable ports, so common equipment such as power supplies, supervisors, and chassis is amortized over more ports. The integrated crossbar provides 80 Gbps of bandwidth per slot on the MDS 9506 and 9509 and up to 100 Gbps per slot on the MDS 9513, and it features a redundant OOB management channel on the backplane.


There is an aggregate 720-Gbps multiprotocol crossbar on each supervisor module used in the MDS 9506 and 9509, whereas the MDS 9513 uses Crossbar Fabric modules located at the rear of the chassis that provide a total aggregate bandwidth of 2.2 Tbps. All MDS chassis can operate on a single crossbar at full bandwidth on all attached ports without blocking. A technique called Virtual Output Queuing (VOQ) is deployed for optimal crossbar performance; VOQ resolves head-of-line blocking issues for continuous data flow.
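To make the head-of-line blocking idea concrete, the following minimal Python sketch (an illustration only, not the MDS ASIC implementation; the class and method names are invented for this example) keeps one queue per destination at an ingress port, so a frame waiting for a congested destination never delays frames bound for other destinations.

from collections import defaultdict, deque

class IngressPort:
    def __init__(self):
        # One FIFO per destination port index: the "virtual output queues".
        self.voq = defaultdict(deque)

    def enqueue(self, dest_port, frame):
        self.voq[dest_port].append(frame)

    def dequeue_for(self, dest_port):
        # Called when the arbiter grants a credit for dest_port.
        q = self.voq[dest_port]
        return q.popleft() if q else None

ingress = IngressPort()
ingress.enqueue(dest_port=5, frame="frame-A")  # destination 5 is congested
ingress.enqueue(dest_port=7, frame="frame-B")  # destination 7 has credits available

# The arbiter grants destination 7 first; frame-B is not stuck behind frame-A.
print(ingress.dequeue_for(7))  # frame-B
print(ingress.dequeue_for(5))  # frame-A, sent later once destination 5 has a credit

In the switch, the arbiter plays the role of the grant logic: it releases a frame from a VOQ only when the destination port has a buffer credit available.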


MDS 9506 & 9509 Crossbar Architecture

[Figure: block diagram of the MDS 9506/9509 crossbar architecture showing the Active and Standby Supervisors (each with microprocessor, Flash card, Ethernet and console interfaces, and an integrated crossbar), the crossbar backplane, and representative line cards connected to it: S-16 and S-32 FC line cards, the MPS 14+2 multiprotocol module, the IPS-8 FCIP and iSCSI IP Services module, and the I-32 SSM Storage Services Module.]

MDS 9506 and 9509 Crossbar Architecture


The MDS 9000 Series Multilayer switches were designed from the ground up around a collection of sophisticated application-specific integrated circuit (ASIC) chips. Unlike other fabric vendors, whose products are largely based on single-ASIC designs, the MDS 9000 family of products is far more powerful and flexible, both in terms of the features supported today and the ability to evolve and grow with customers' changing needs. Because of its sophisticated, modular, multi-ASIC design, the MDS 9000 Series Multilayer switch is capable of supporting many protocols and services, including Fibre Channel, FCIP, iSCSI, FICON, and virtualization services, all concurrently and in the same chassis.

The hot-swappable, modular line card design provides a high degree of flexibility and allows for ease of expansion. Each line card in the MDS 9500 Series director has redundant high-speed paths across the backplane to the high-performance crossbar fabrics located on the redundant supervisor modules, thus providing a five-nines level of availability. Although one supervisor is Active and the other Standby, both crossbars are always active and capable of routing frames.

The graphic illustrates a block diagram of a possible switch implementation, including:

Dual supervisor modules (Supervisor-1 or Supervisor-2) containing the crossbar, microprocessor, flash memory, and console and Ethernet interfaces
An FC line card capable of supporting the Fibre Channel and FICON protocols; examples are the 16-port and 32-port line cards
An IP Services line card capable of supporting IP storage services and protocols such as FCIP and iSCSI
An MPS 14+2 line card with 14 FC ports and two GigE ports supporting iSCSI and three FCIP tunnels per GigE port

An SSM line card, capable of performing virtualization services, snapshots, replication, and SCSI third-party copy services to support NASB (Network Assisted Serverless Backup)

Frames arriving at an interface are decoded, conditioned, possibly virtualized, and passed to the forwarding ASIC (F), then stored in the appropriate Virtual Output Queue (Q) until the arbiter (A) decides that a credit is available at the destination port and the frame can continue its journey. The frame then leaves the VOQ, passes through the up interface (I/F) and across one of the crossbars, and travels down to the destination line card and straight out of the appropriate interface. Notice that all line cards have an identical architecture from the F ASIC upward, so all frames crossing the crossbar have already been conditioned and processed and have an identical structure, regardless of their underlying protocol (FC, FICON, iSCSI, or FCIP).

The internal architecture of the MDS 9216 is very similar to that of the MDS 9500 in that many of the same internal components are utilized. There are, however, several key differences:

The MDS 9216 has a fixed, non-redundant supervisor card that provides arbitration and supports a single modular card, although it can be of any type.
The MDS 9216 does not use a crossbar fabric. In a two-slot design there is no need for, or advantage to, a switching crossbar. Instead, the two cards are connected to each other through the high-speed backplane, and to themselves through an internal loopback interface.


MDS 9506 & 9509 Crossbar Architecture

[Figure: each line card slot connects to the 720-Gbps crossbar on each Supervisor module through dual 20-Gbps paths, giving each line card 80 Gbps of bandwidth to the crossbars.]

Each supervisor module has an onboard 720-Gbps crossbar: 360 Gbps transmit (Tx) and 360 Gbps receive (Rx). Therefore, in a dual-supervisor installation, the MDS 9000 system has an aggregate total bandwidth of up to 1.44 Tbps.

Each installed line card in a dual-supervisor configuration has 80 Gbps of bandwidth available to the supervisor crossbars. Each card connects through dual 20-Gbps paths to each supervisor crossbar, with each path providing 20 Gbps in each direction. Data is load-shared across both crossbars when dual supervisor modules are installed. Both crossbars are active-active, and frames from a line card travel across either one crossbar or the other. The arbiter function schedules frame delivery at over 1 billion frames per second and routes frames over one crossbar or the other.


MDS 9513 Crossbar Architecture

[Figure: the eleven MDS 9513 line card slots connect to dual Crossbar Fabric Modules through 25-Gbps channels, for a total crossbar fabric bandwidth of 2.2 Tbps.]

MDS 9513 Crossbar Architecture


Only Supervisor-2 modules can be fitted to the MDS 9513. Supervisor-2 modules also have integral crossbars, but they are not used when installed in the MDS 9513. Instead, dual redundant Crossbar Fabric modules, situated at the rear of the chassis, provide a total aggregate bandwidth of 2.2 Tbps.

Each line card has four 25-Gbps channels connecting to the Crossbar Fabric modules, providing a total bandwidth of 100 Gbps per slot in each direction; across the eleven line card slots, this yields the 2.2-Tbps aggregate figure. Both crossbars are active-active, and frames from a line card travel across either one crossbar or the other. The arbiter function schedules frame delivery at over 1 billion frames per second and routes frames over one crossbar or the other.


MDS 9513 System Diagram

[Figure: MDS 9513 system diagram showing the dual Arbiter and Crossbar Fabric modules and the internal path on the line cards (48-port OSM, 24-port, 16-port, 12-port FRM, and 4-port 10G): MAC/PHY interfaces feed the Forwarding and Fwd/VOQ ASICs (with TCAM, adjacency, buffer, and VOQ resources), which connect to each Crossbar Fabric through 25-Gbps channels (50 Gbps per crossbar).]

Each of the eleven line card slots on the MDS 9513 has two 2.5-Gbps serial links to each Arbiter ASIC. In addition, each Supervisor slot has one each, making a total of 24 2.5-Gbps serial links to the Arbiter ASICs. These links are used to communicate with the central arbiter to request and grant permission for a frame to cross the crossbar.

Each line card Fwd/VOQ ASIC is connected to each of the Crossbar Fabrics via a pair of dual redundant 25-Gbps channels, providing a total of 50 Gbps to each crossbar. A second dual redundant 50-Gbps pair of channels provides the return path from the Crossbar Fabric to the other Fwd/VOQ ASIC. Each channel comprises eight 3.125-Gbps serial links for transmit and eight 3.125-Gbps serial links for receive.

Frames arrive at the line card MAC/PHY interface and are forwarded to the Fwd/VOQ ASIC, where they are stored in a buffer and associated with a destination VOQ. The Fwd/VOQ ASIC requests permission from the arbiter to deliver a frame to the destination port. When the arbiter has received a credit from the destination device, it grants permission for the frame to be sent across one of the crossbar fabrics. When permission is granted, the frame leaves the VOQ in the Fwd/VOQ ASIC, travels along one of the 25-Gbps channels to one of the Crossbar Fabrics, then returns via one of the 25-Gbps return channels and exits through the MAC/PHY ASIC on the appropriate line card.

All frames travel across the crossbar fabric, regardless of where the source and destination ports are located on the ASICs or line cards. This provides a consistent latency of approximately 20 microseconds per frame and minimizes jitter, which can occur in other vendors' products.


MDS 9513 Crossbar Fabric

Redundant Crossbar Fabric
  Active/Active operation balances the load across both crossbars
  Rapid failover in case of failure ensures no loss of frames

High-bandwidth, non-blocking architecture
  Each Crossbar Fabric supports dual 25-Gbps channels from each line card
  Total crossbar bandwidth = 2.2 Tbps
  A single crossbar fabric still provides sufficient bandwidth for all line cards

High-performance centralized architecture
  Ensures consistent latency across the switch
  Supports up to 1024 indexes (destination interfaces)
  Enhanced high-performance arbiter schedules frames at over 1 billion per second

MDS 9513 Crossbar Fabric


Both MDS 9513 Crossbar Fabric modules are located at the rear of the chassis and provide a total aggregate bandwidth of 2.2 Tbps. Each fabric module is connected to each of the line cards via dual redundant 25Gbps channels making a total of 100Gbps per slot. A single fabric crossbar module can support full bandwidth on all connected ports in a fully loaded MDS 9513 without blocking. The arbiter schedules frames at over 1 billion frames per second, ensuring that blocking will not occur even when the ports are fully utilized.


Hot-Swappable Supervisors

Dual supervisors
  Active and Standby
  Hot-swappable
  Stateful standby keeps in sync with all major management and control protocols of the active supervisor

Non-disruptive upgrades
  Load and activate new software without disrupting traffic
  Standby supervisor maintains the previous version of code while the active supervisor is updated

The Cisco MDS 9500 Series of Multilayer Directors supports two Supervisor modules in the chassis for redundancy. Each Supervisor module consists of a Control Engine and a Crossbar Fabric.

The Control Engine is the central processor responsible for the management of the overall system. In addition, the Control Engine participates in all of the networking control protocols, including all Fibre Channel services. In a redundant system, two Control Engines operate in an active/standby mode. The Control Engine that is in standby mode is actually in a stateful-standby mode, such that it keeps in sync with all major management and control protocols that the active Control Engine maintains. While the standby Control Engine is not actively managing the switch, it continually receives information from the active Control Engine. This allows the state of the switch to be maintained between the two Control Engines. Should the active Control Engine fail, the secondary Control Engine will seamlessly resume its function.

The Crossbar Fabric is the switching engine of the system. The crossbar provides a high-speed matrix of switching paths between all ports within the system. A crossbar fabric is embedded within each Supervisor module. The two crossbar fabrics operate in a load-shared, active-active mode. Each crossbar fabric has a total switching capacity of 720 Gbps and serves 80 Gbps of bandwidth to each slot on the MDS 9506 and 9509. Since each switching module of the Cisco MDS 9506 or 9509 does not consume more than 80 Gbps of bandwidth to the crossbar, the system will operate at full performance even with one Supervisor module. In a fully populated MDS 9500, the system will not experience any disruption or any loss of performance with the removal or failure of one Supervisor module.

The Supervisor module is hot-swappable. In a dual Supervisor module system, this allows the module to be removed and replaced without causing disruption to the rest of the system.


Supervisor-2 Module Features

High-performance integrated crossbar
  Active when installed in an MDS 9506 or MDS 9509 chassis
  Bypassed when installed in an MDS 9513 chassis
  Supports up to 48 Gbps of front-panel bandwidth per slot

Enhanced crossbar arbiter
  1024 destination indexes per chassis
  Supports mix and match of gen-1 and gen-2 modules
  Up to 252 ports when any gen-1 modules are present
  Up to 528 ports when only gen-2 modules are present

Front panel interfaces
  Console port, management Ethernet port (10/100/1000), COM1 port, CompactFlash slot, USB ports (2)

PowerPC management processor
  Provides increased performance and lower power consumption vs. Supervisor-1

The MDS 9513 requires Supervisor-2

Supervisor-2 Module Features


Supervisor-2 is an upgraded version of Supervisor-1 with additional Flash memory, RAM, and NVRAM and a redundant BIOS. It can be used in any MDS 9500 chassis: 9506, 9509, or 9513.

All frames pass directly from the line card ASICs across the crossbar and out to their destination interfaces; frame flow is not regulated by the supervisor. Supervisor-2 uses a new PowerPC management processor to provide FC services to connected devices, for example FSPF, zoning, the Name Server, the FLOGI server, security, VSANs, and IVR.

When used in an MDS 9506 or 9509, the integral crossbar is used. When used in the MDS 9513, the integral crossbar is bypassed and the Crossbar Fabric modules are used instead. Supervisor-2 supports 1024 destination indexes, providing up to 528 ports in the MDS 9513 when only gen-2 modules are used. If any gen-1 modules are installed in the MDS 9513, then only 252 ports can be used.


Switch CPU Resources: Critical for Scalability

FC switch CPU and memory resources are critical to SAN scalability and resiliency
Cisco has addressed this need with powerful processing capabilities in the MDS 9000 Family
Inadequate CPU resources can have major adverse effects on SAN operations:
  SAN scalability: additional CPU resources are required for each new neighbor switch, logged-in device, propagated zone set, RSCN-registered device, and so on
  SAN resiliency: without adequate CPU resources, fabric reconvergence from a fault can take excessive time or even fail altogether, because a high number of computations is required
  SAN security: additional CPU resources are required for security features such as SSH and SSL access, encryption, FC-SP fabric authentication, and port binding

Switch CPU Resources


All Fibre Channel switches must provide a number of distributed services for their connected devices. These include the distributed Name Server, Login Server, Time Server, Management Server, FSPF, zoning server, and so on. These services are provided by the active supervisor and must respond to requests in a timely manner; the faster the better, because otherwise the fabric may appear unresponsive and, in extreme conditions, may hang for a period of time. For this reason, the MDS contains processors with many times the performance of competitive FC switches.

The very nature of a SAN is to fan out the connectivity of fewer storage subsystems to numerous server connections. While performance is important, so are the capabilities of a switched fabric to provide services, including congestion avoidance, preferential services, and blocking avoidance. Cisco provides a full SAN switching product line with the Cisco MDS 9000 Series, a line that is optimized to build scalable SAN infrastructures and to provide industry-leading performance, resiliency, security, and manageability. Independent switch performance validation testing has proven that the Cisco MDS family's performance capabilities consistently outperform competing products.


Oversubscription and Bandwidth Reservation


Oversubscription Overview

Each device FC port must be connected to a single FC switch port, but not all devices can utilize the full bandwidth available to them
A 2-Gbps FC port can provide 200 MB/s of bandwidth
Servers rarely require more than 25 MB/s
Oversubscription allows several devices to share the available bandwidth
ISL oversubscription is typically 7:1
  200 MB/s shared by 7 servers = approximately 28 MB/s average per server

[Figure: seven server HBAs attached to an edge switch sharing a single 2-Gbps ISL with 200 MB/s of bandwidth.]

Oversubscription Overview
Fibre Channel standards dictate that in a fabric topology, each attached FC device port must be attached to its own dedicated FC switch port. Today's switch ports support 1 Gbps, 2 Gbps, 4 Gbps, and 10 Gbps, but the connected device usually cannot utilize the full bandwidth available to it. A 2-Gbps port can provide 200 MB/s of bandwidth in each direction, a total of 400 MB/s per port. Servers often have internal bandwidth limitations, and applications rarely require more than 25 MB/s today. This is changing with the introduction of PCI Express motherboards, which replace the old parallel PCI bus with multiple 2.5-Gbps serial channels to each slot. If the application is capable of demanding it, each PCI Express channel can fully utilize a 2-Gbps Fibre Channel port. However, today most servers require less than 25 MB/s.

Oversubscription allows several devices to share the available bandwidth. ISL oversubscription is typically 7:1, with seven servers sharing the total bandwidth of a 2-Gbps FC port: 200 MB/s shared by 7 servers is approximately 28 MB/s average per server.
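As a quick illustration of this arithmetic, the short Python sketch below (the function name is invented for this example) divides the usable ISL bandwidth among the hosts that share it, using the figures quoted above.

def avg_bandwidth_per_host(link_mb_per_s, hosts):
    # Average bandwidth (MB/s) each host sees if all hosts are active at once.
    return link_mb_per_s / hosts

isl_bandwidth = 200.0  # MB/s usable on a 2-Gbps Fibre Channel link
fan_out = 7            # servers sharing the ISL (7:1 oversubscription)
print(round(avg_bandwidth_per_host(isl_bandwidth, fan_out), 1))  # 28.6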


16-Port Full-Rate Mode Line Card

[Figure: the 16-port line card has four port groups of four ports; each port group shares 10 Gbps of internal bandwidth.]

The 16-port line card has no oversubscription
  Supports Full Rate Mode (FRM) at 1 Gbps or 2 Gbps
  4x 1 Gbps = 4 Gbps, 4x 2 Gbps = 8 Gbps per port group (1:1)
Use when the device requires full bandwidth at up to 200 MB/s
  Suitable for storage arrays and ISLs between switches
Up to 255 buffer credits per FC interface, fully configurable
  Plus 145 performance buffers per port
  Default 255 credits for E_Ports, 16 credits for Fx_Ports

The 16-Port Full-Rate Mode Line Card


The 16-port line card operates in Full Rate Mode. Each port on the line card can deliver up to 2 Gbps. There are 4 ports in a port group, so the total bandwidth requirement could be 8 Gbps per port group. The internal path to the forwarding ASIC provides 10 Gbps, so more than enough bandwidth is available. The 16-port line card is suitable for any device that requires full 2-Gbps bandwidth, for example storage arrays or ISLs to other switches. Each port has up to 255 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 255 credits; F_Ports are allocated 16 credits but may be configured with up to 255 credits. An additional 145 performance buffers are available when required.



32-Port Oversubscribed Mode Line Card

[Figure: the 32-port line card has eight port groups of four ports; each port group shares 2.5 Gbps of internal bandwidth.]

The 32-port line card has limited internal bandwidth
  Oversubscribed by design (OSM): 1.6:1 at 1 Gbps, 3.2:1 at 2 Gbps
  4x 2 Gbps = 8 Gbps of demand per port group; 8 / 2.5 = 3.2
Provides twice as many ports for approximately the same price
Suitable for connecting servers that require less than 62 MB/s average bandwidth
12 buffer credits (fixed) per FC interface

The 32-Port Oversubscribed Mode Line Card


The 32-port line card is designed to provide twice as many ports (32 ports) for nearly the same price as a 16-port line card. However, to make space, some internal components are removed. The 32-port line card operates in Oversubscribed Mode. Each port on the line card can deliver 2 Gbps. There are 4 ports in a port group, so the total bandwidth requirement could be 8 Gbps per port group. However, the internal path to the forwarding ASIC provides only 2.5 Gbps, so 4 ports must share 2.5 Gbps, or 250 MB/s. With 250 MB/s shared by 4 ports, each device should not exceed 62.5 MB/s on average; however, one port could be operating at 100 MB/s, another at 20 MB/s, another at 60 MB/s, and another at 70 MB/s, provided the total group bandwidth does not exceed 250 MB/s. The 32-port line card is suitable for any device that does not require full 2-Gbps bandwidth, for example servers and tape drives that demand less than 62 MB/s average bandwidth. Each port has only 12 buffer credits, which are not configurable.
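The oversubscription ratio is simply the worst-case demand of a port group divided by the bandwidth of its internal path. The minimal Python sketch below (the helper name is invented for this example) reproduces the 3.2:1 and 1.6:1 figures quoted above.

def oversubscription_ratio(ports, port_speed_gbps, group_bandwidth_gbps):
    # Worst-case demand from all ports in the group divided by the group's internal bandwidth.
    return (ports * port_speed_gbps) / group_bandwidth_gbps

print(oversubscription_ratio(4, 2.0, 2.5))   # 3.2 -> 3.2:1 at 2 Gbps on the 32-port card
print(oversubscription_ratio(4, 1.0, 2.5))   # 1.6 -> 1.6:1 at 1 Gbps
print(oversubscription_ratio(4, 2.0, 10.0))  # 0.8 -> effectively 1:1 on the 16-port card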


Generation-2 Fibre Channel Modules

Four modules address key SAN consolidation requirements:

12-Port 1/2/4-Gbps FC Module
  Full-rate 4-Gbps performance for ISLs and the highest-performance server and tape applications
24-Port 1/2/4-Gbps FC Module
  Full-rate 2-Gbps performance for enterprise storage connectivity and high-performance server applications
48-Port 1/2/4-Gbps FC Module
  Shared-bandwidth 2-Gbps performance for mainstream server applications
4-Port 10-Gbps FC Module
  Full-rate 10-Gbps performance for ISL consolidation and high-bandwidth metro connectivity

Maximum subscription ratio with all ports active:

  Speed      12-Port   24-Port   48-Port   4-Port 10G
  1 Gbps     1:1       1:1       1:1       N/A
  2 Gbps     1:1       1:1       2:1       N/A
  4 Gbps     1:1       2:1       4:1       N/A
  10 Gbps    N/A       N/A       N/A       1:1

Generation-2 Fibre Channel Modules


Four new second-generation modules provide much more flexibility when configuring ports.

12-port 1/2/4-Gbps Fibre Channel module, providing full-rate bandwidth on every port at 1, 2, and 4 Gbps
24-port 1/2/4-Gbps Fibre Channel module, providing 2:1 oversubscription at 4 Gbps and full-rate bandwidth on each port at 1 Gbps and 2 Gbps
48-port 1/2/4-Gbps Fibre Channel module, providing 4:1 oversubscription at 4 Gbps, 2:1 oversubscription at 2 Gbps, and full-rate bandwidth at 1 Gbps on each port
4-port 10-Gbps Fibre Channel module, providing full-rate 10-Gbps bandwidth on every port
These subscription ratios can be reproduced from the port-group figures, as shown in the sketch below.
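A minimal sketch, assuming the port-group layout described later in this lesson (four port groups per module, roughly 12 Gbps of shared bandwidth per group); it derives the same subscription ratios listed in the table above.

MODULES = {               # module name: (ports per port group, shared Gbps per group)
    "12-port": (3, 12),
    "24-port": (6, 12),
    "48-port": (12, 12),
}

def ratio(ports, speed_gbps, group_gbps):
    # "1:1" while demand fits in the group; otherwise demand divided by group bandwidth.
    demand = ports * speed_gbps
    return "1:1" if demand <= group_gbps else "%g:1" % (demand / group_gbps)

for name, (ports, group_gbps) in MODULES.items():
    print(name, {s: ratio(ports, s, group_gbps) for s in (1, 2, 4)})
# 12-port {1: '1:1', 2: '1:1', 4: '1:1'}
# 24-port {1: '1:1', 2: '1:1', 4: '2:1'}
# 48-port {1: '1:1', 2: '2:1', 4: '4:1'}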

10-Gbps modules use 64b/66b encoding, which is incompatible with the 8b/10b encoding used by modules operating at 1/2/4 Gbps, so 10-Gbps and 1/2/4-Gbps ports cannot be mixed.


Port Groups on Generation-2 Line Cards

[Figure: port-group boundaries on the 48-port, 24-port, and 12-port Generation-2 line cards.]

Each line card has 4 port groups, denoted by screen-printed borders
Each port group has 12 Gbps of shared bandwidth
Ports can be configured to have dedicated bandwidth (1 Gbps, 2 Gbps, or 4 Gbps)
Remaining ports share the unused bandwidth

Port Groups
Each port group is clearly marked on the line cards with screen-printed borders. Each port group has 12Gbps of internal bandwidth available. Any port can be configured to have dedicated bandwidth at 1Gbps, 2Gbps or 4Gbps. All remaining ports in the port group share any remaining unused bandwidth. Any port in dedicated bandwidth mode has access to extended buffers. Any port in shared bandwidth mode has only 16 buffer credits.


12-Port Full-Rate Mode Line Card

[Figure: the 12-port line card has four port groups of three ports; each port group shares 12 Gbps of internal bandwidth.]

3 ports per port group; each port group shares 12 Gbps of bandwidth
Full Rate Mode (FRM) at 1, 2, and 4 Gbps (3x 4 Gbps = 12 Gbps per port group; 1:1)
Suitable for 4-Gbps storage array ports
Suitable for ISLs between switches (16x 4 Gbps = 64-Gbps PortChannel)

The 12-Port Full-Rate Mode Line Card


The 12-port line card module operates in Full Rate Mode. Each port on the line card can deliver up to 4 Gbps. There are 3 ports in a port group, so the total bandwidth requirement could be 12 Gbps per port group. Each port group shares 12 Gbps of internal bandwidth, so full bandwidth is available to every port. The 12-port line card is suitable for any device that requires full 4-Gbps bandwidth, for example storage arrays or ISLs to other switches. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250 credits. An additional 2488 extended buffers and 512 performance buffers are available per module to ports configured in dedicated mode. Also, 144 proxy buffers are available per module.



24-Port Oversubscribed Mode Line Card

[Figure: the 24-port line card has four port groups of six ports; each port group shares 12 Gbps of internal bandwidth.]

6 ports per port group; each port group shares 12 Gbps of bandwidth
Full Rate Mode at 1 and 2 Gbps (6x 1 Gbps = 6 Gbps, 6x 2 Gbps = 12 Gbps; 1:1)
2:1 oversubscription at 4 Gbps (6x 4 Gbps = 24 Gbps; OSM)
Suitable for storage arrays that require less than 200 MB/s of bandwidth and for ISLs between switches

The 24-Port Oversubscribed Mode Line Card


The 24-port line card module operates in Oversubscribed Mode. Each port on the line card can deliver up to 4 Gbps. There are 6 ports in a port group, so the total bandwidth requirement could be 24 Gbps per port group. Each port group shares 12 Gbps of internal bandwidth, so with every port at 4 Gbps the ports are up to 2:1 oversubscribed. At 1 Gbps and 2 Gbps there is enough bandwidth to provide all ports with full bandwidth. The 24-port line card module is suitable for any device that requires less than 200 MB/s of bandwidth, for example storage arrays or ISLs to other switches. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250 credits. Also, 144 performance buffers are available per module.


48-Port Oversubscribed Mode Line Card

[Figure: the 48-port line card has four port groups of twelve ports; each port group shares 12 Gbps of internal bandwidth.]

12 ports per port group; each port group shares 12 Gbps of bandwidth
Full Rate Mode at 1 Gbps (12x 1 Gbps = 12 Gbps; 1:1)
2:1 oversubscription at 2 Gbps (12x 2 Gbps = 24 Gbps)
4:1 oversubscription at 4 Gbps (12x 4 Gbps = 48 Gbps)
Suitable for servers that require less than 100 MB/s average bandwidth

The 48-Port Oversubscribed Mode Line Card


The 48-port line card module operates in Oversubscribed Mode. Each port on the line card can deliver up to 4 Gbps. There are 12 ports in a port group, so the total bandwidth requirement could be 48 Gbps per port group. The internal path provides 12 Gbps, so with every port at 4 Gbps the ports are up to 4:1 oversubscribed, and at 2 Gbps all ports are 2:1 oversubscribed. At 1 Gbps there is enough bandwidth to provide all ports with full bandwidth. The 48-port line card module is suitable for any device that requires less than 100 MB/s of bandwidth, for example servers or tape drives that require less than 100 MB/s on average. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250 credits. Also, 144 performance buffers are available per module.


4-Port 10-Gbps Full-Rate Mode Line Card

[Figure: the 4-port 10-Gbps line card has four port groups of one port each; each port group has 12 Gbps of internal bandwidth.]

1 port per port group; each port group has 12 Gbps of bandwidth
Full Rate Mode at 10 Gbps (1x 10 Gbps = 10 Gbps; 1:1)
Suitable for ISLs between switches
  4 ports x 10 Gbps = 40-Gbps PortChannel per card
  With 4 cards, 4 cards x 4 ports x 10 Gbps = 160-Gbps PortChannel between switches

The 4-Port 10Gbps Full-Rate Mode Line Card


The 4-port 10G FC line card module operates in Full Rate Mode. Each port on the line card can deliver up to 10 Gbps. Each 10-Gbps port has its own port group, so the total bandwidth requirement could be 10 Gbps per port group. The internal path to the forwarding ASIC provides 12 Gbps, so more than enough bandwidth is available. The 4-port 10G FC line card is suitable for any device that requires full 10-Gbps bandwidth, for example ISLs to other switches. Up to 16 ports (across four 10G FC line cards) may be placed in a PortChannel, providing up to 160 Gbps of PortChannel bandwidth. Each port has up to 250 configurable buffer credits. By default, an E_Port or TE_Port is allocated the full 250 credits; F_Ports are allocated 16 credits but may be configured with up to 250 credits. An additional 2488 extended buffers and 512 performance buffers are available per module to ports configured in dedicated mode. Also, 144 proxy buffers are available per module.


Port Bandwidth Reservation

Allows for greater flexibility in deploying oversubscribed modules
Second-generation modules only
Ports within a port group can be allocated 1, 2, or 4 Gbps of guaranteed bandwidth
  The interface port mode becomes dedicated at 1, 2, or 4 Gbps
Other ports share the unused bandwidth
  Shared ports have 16 BB_Credits
Ports can be taken out of service to release credits and resources

Example: 24-port FC module
  One port dedicated at 4 Gbps (2-250 credits), another port dedicated at 2 Gbps, one port taken out of service
  The remaining ports share the unused 6 Gbps of the port group's 12 Gbps

Port Bandwidth Reservation


Bandwidth reservation provides maximum flexibility when configuring ports on second-generation modules:

Any port in a port group can be allocated 1 Gbps, 2 Gbps, or 4 Gbps of dedicated bandwidth
All remaining ports in the port group share any remaining unused bandwidth
Ports in dedicated bandwidth mode have access to a pool of 2488 extended buffers and 512 performance buffers
Ports in shared bandwidth mode have only 16 buffer credits
Ports can be taken out of service to release credits and resources to the remaining ports in the port group (the arithmetic is illustrated in the sketch below)
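A minimal sketch, assuming the 12-Gbps port-group figure used throughout this lesson: it computes how much bandwidth remains for the shared-rate ports once some ports have dedicated allocations (the function name is invented for this example).

GROUP_BANDWIDTH_GBPS = 12  # shared bandwidth per Generation-2 port group

def shared_bandwidth_left(dedicated_gbps):
    # Bandwidth left for the shared-rate ports once dedicated allocations are subtracted.
    used = sum(dedicated_gbps)
    if used > GROUP_BANDWIDTH_GBPS:
        raise ValueError("dedicated bandwidth exceeds the port group's 12 Gbps")
    return GROUP_BANDWIDTH_GBPS - used

# One port dedicated at 4 Gbps and one at 2 Gbps, as in the 24-port example above:
print(shared_bandwidth_left([4, 2]))  # 6 -> 6 Gbps left for the remaining active ports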

Best Practice for Configuring Ports


Shared to dedicated: configure in this order: speed, rate-mode, mode, credit
Dedicated to shared: configure in this order: credit, rate-mode, speed, mode
Port-mode restrictions:
  Auto/E mode cannot be configured in shared rate-mode
  FL mode is not supported on the 4-port 10-Gbps module
  TL mode is not supported on any Generation-2 module


Line Card Default Configurations

  Line Card     Speed (Gbps)   Rate-mode   Port Mode
  16-port       Auto 1/2       Dedicated   Auto (Fx/E/TE)
  32-port       Auto 1/2       Shared      Auto (Fx)
  12-port       Auto 1/2/4     Dedicated   Auto (Fx/E/TE)
  24-port       Auto 1/2/4     Shared      Fx
  48-port       Auto 1/2/4     Shared      Fx
  4-port 10Gb   10 Gb only     Dedicated   Auto (Fx/E/TE)

10-Gbps and 1/2/4-Gbps ports cannot be mixed:
  1/2/4 Gbps uses 8b/10b encoding
  10 Gbps uses 64b/66b encoding

Line Card Default Configurations


Line cards operate in two different rate modes, dedicated and shared. All ports on dedicated rate mode line card modules (16-port, 12-port, and 4-port 10G FC) have access to full bandwidth per port. All ports on shared rate mode line card modules (32-port, 24-port, and 48-port) share the port group's internal bandwidth (2.5 Gbps per group on the 32-port card, 12 Gbps per group on the Generation-2 cards). Any port can be configured for dedicated bandwidth; all remaining ports in the port group then share any remaining unused bandwidth.


Recommended Uses of FC Switch Line Card Modules

Traditional core-edge topology
  Typical 15:1 or higher oversubscription
  32-port edge switches

MDS 9000 core-edge topology
  Same ISL oversubscription
  4-port 10-Gb FRM line card for core ISLs
  12/16-port FRM line card for storage and ISLs
  24/32/48-port OSM line card for host and tape connectivity
  MDS 9500 with mixed cards

MDS 9000 collapsed core
  MDS 9500 with FRM cards, or MDS 9216 with integrated FRM ports and an additional OSM line card
  ISL oversubscription in a core-edge design is much greater than line card oversubscription
  Lower oversubscription than core-edge

Recommended Uses of FC Switch Line Card Modules


Use Full-Rate Mode line cards for:
  Storage connectivity
  ISLs
  Core switches (if deploying a core-edge topology)

Use Oversubscribed Mode line cards to reduce the cost of deploying:
  Server connectivity
  Tape connectivity
  Edge switches (if deploying a core-edge topology)

The Oversubscribed Mode line cards are designed to allow cost-effective consolidation of a core-edge topology into a collapsed core:
  The Oversubscribed Mode line cards serve the function of the edge switches.
  In core-edge topologies, the oversubscription of the ISLs between the core and edge switches is significantly greater than the oversubscription of the MDS 9000 Oversubscribed Mode line cards. In other words, a collapsed-core topology with Oversubscribed Mode line cards has less oversubscription than a typical core-edge topology.
  The Full-Rate Mode line cards are used for ISLs and storage connectivity, where oversubscription is not desirable. In a core-edge topology, at least one Full-Rate Mode line card is typically deployed in each edge switch for ISLs to the core.
  Gen-2 shared-bandwidth line cards allow SAN engineers to tune the performance required per end device.

Credits and Buffers

Second-Generation Module Credits and Buffers

Buffer-to-buffer credits
  Up to 250 buffer credits per port
  E_Port default = 250; Fx_Port default = 16
  6144 credits shared across the module in dedicated rate-mode
  Availability depends on rate-mode and port-mode
    For example, in the 48-port line card: all interfaces configured with 125 credits (48x 125 = 6000), or 46 interfaces with 120 plus 2 interfaces with 240 (46x 120 = 5520, + 480 = 6000)
  A maximum of 16 credits can be configured in shared rate-mode

Performance buffers
  Up to 145 extra buffer credits per port
  Shared among all ports in the module, not guaranteed
  Supported on the FRM 12-port 4-Gbps and 4-port 10-Gbps line cards

More credits can be shared by taking interfaces out of service

Buffer-to-buffer credits:
  Availability depends on rate-mode and port-mode
  A maximum of 16 credits can be configured in shared rate-mode
  Approximately 6000 credits are shared across the module in dedicated rate-mode; for example, in a 48-port module, all interfaces configured with 125 credits, or 46 interfaces with 120 credits plus 2 interfaces with 240 credits

Performance buffers:
  Minimum/maximum/default: 1/145/145
  Shared among all ports in the module, not guaranteed
  Supported on the 12-port 4-Gbps and 4-port 10-Gbps modules

Credits can be shared by taking interfaces out of service.


Second-Generation Buffer Credit Allocation

Total of 6144 buffers per line card module
User credits may be configured in the range 2 to 250
Default 16 credits for Fx ports and 250 credits for E/TE ports
Ports configured in dedicated mode may use extended buffer credits

Per-module buffer breakdown (total 6144 buffers in each case):

  Module                        Proxy/Reserved   Performance   Extended (dedicated only)   User-configurable credits
  4-port 10G (dedicated mode)   144              512           2488                        3000 (4x 250)
  12-port 1/2/4G (dedicated)    144              512           2488                        3000 (12x 250)
  24-port 1/2/4G (shared)       144              --            --                          6000 (24x 250)
  48-port 1/2/4G (shared)       144              --            --                          6000 (48x 125)

Second-Generation Buffer Credit Allocation


Each second-generation line card module has a total of 6144 buffers available. By default, Fx ports are allocated 16 buffer credits and E/TE ports are allocated 250 credits; however, any port can be configured with between 2 and 250 buffer credits. The 4-port 10-Gbps FC module and the 12-port FC module operate in dedicated rate mode and have access to an additional 2488 extended buffer credits and 512 performance buffers, shared across all ports in the module; each port can be configured with a maximum of 145 additional performance buffers. The 24-port module operates in shared rate mode, and each port can be configured with between 2 and 250 buffer credits. The 48-port module operates in shared rate mode, and each port can be configured with between 2 and 125 buffer credits.
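A minimal sketch of the credit bookkeeping above (the constants and the function name are illustrative, using the 6000-credit user-configurable pool from the table): it checks whether a per-port credit plan fits the module.

USER_CREDIT_POOL = 6000  # user-configurable credits per module (see the table above)
MAX_PER_PORT = 250       # per-port upper limit

def plan_is_valid(credits_per_port):
    # True if every port is within 2..250 credits and the module pool is not exceeded.
    return (all(2 <= c <= MAX_PER_PORT for c in credits_per_port)
            and sum(credits_per_port) <= USER_CREDIT_POOL)

print(plan_is_valid([125] * 48))              # True:  48 x 125 = 6000
print(plan_is_valid([120] * 46 + [240] * 2))  # True:  46 x 120 + 2 x 240 = 6000
print(plan_is_valid([250] * 48))              # False: 48 x 250 = 12000 exceeds the pool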


Best Practices

Match the switch port speed to the connected device
  Configure the port speed to 1, 2, or 4 Gbps, or set auto-sensing with a maximum of 2 Gbps
  Configuring 4 Gbps will reserve 4 Gbps of bandwidth regardless of the autonegotiated port speed
Configure the rate mode: dedicated or shared
  Dedicated ports reserve bandwidth; shared ports share the remaining bandwidth
Configure the port mode
  F, FL, E, TE, TL, SD, or ST
Configure buffer-to-buffer credits
Take any unused interfaces out of service
  This frees up resources and any spare credits or bandwidth

Best Practices for Configuring Second-Generation Line Cards


1. Match the switch port speed to the port speed of the connected device and lock down the port speed to 1 Gbps, 2 Gbps, or 4 Gbps.
2. Alternatively, ports may be configured in auto-sense mode with a maximum of 2 Gbps and will connect at 1 Gbps or 2 Gbps.
3. Configure the rate mode for the port: dedicated or shared. Dedicated ports have dedicated 1-, 2-, or 4-Gbps bandwidth; shared ports share any remaining unused bandwidth left over in the port group.
4. Configure the port mode: F, FL, E, TE, TL, SD, or ST.
5. Configure buffer-to-buffer credits. One buffer credit is required for each 1 km of link distance at 2 Gbps with a 2-KB frame payload.
6. Take any unused ports out of service to free up resources and any spare credits or bandwidth.
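The credit rule in step 5 can be turned into a quick estimate. The sketch below assumes the 1-credit-per-km figure quoted above for 2 Gbps with 2-KB frames and, as an assumption for illustration, scales the requirement proportionally with link speed; the function name is invented for this example.

import math

def credits_needed(distance_km, speed_gbps=2.0):
    # 1 credit per km at 2 Gbps with a 2-KB frame payload (rule of thumb from step 5).
    # Scaling with speed is an assumption: faster links serialize frames sooner,
    # so proportionally more frames are in flight over the same distance.
    credits_per_km_at_2g = 1.0
    return math.ceil(distance_km * credits_per_km_at_2g * (speed_gbps / 2.0))

print(credits_needed(50))        # 50 credits for a 50-km link at 2 Gbps
print(credits_needed(50, 4.0))   # 100 credits at 4 Gbps (assumed proportional scaling)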


Lesson 4

The Multilayer SAN


Overview
In this lesson, you will learn how to build scalable, intelligent SAN fabrics using VSANs, IVR, PortChannels, intelligent addressing, and interoperability with other vendors' switches.

Objectives
Upon completing this lesson, you will be able to create a high-level SAN design with MDS 9000 switches. This includes being able to meet these objectives:

Explain the benefits of VSANs
Explain how VSANs are implemented
Explain how IVR enables sharing of resources across VSANs
Explain how PortChannels provide high-availability inter-switch links
Explain how the addressing features of the MDS 9000 simplify SAN management
Explain the purpose of the CFS protocol
Explain how the MDS 9000 interoperates with third-party switches

Virtual SANs

VSANs Address the Limitations of Common SAN Deployments

Virtual Storage Area Networks (VSANs)
  VSANs are virtual fabrics: ports within a physical fabric are allocated to create isolated virtual fabrics
  SAN islands are virtualized onto a common SAN infrastructure
  A VSAN on FC is similar to a VLAN on Ethernet
  Fabric services are isolated within a VSAN
  Fabric disruption is limited to the VSAN
  Statistics are gathered per VSAN
Independent physical SAN islands are virtualized onto a common SAN infrastructure (the Cisco MDS 9000 Family with the VSAN service)

VSANs Address the Limitations of Common SAN Deployments


Today, many SAN environments consist of numerous islands of connectivity. Commonly deployed SAN islands are physically isolated environments consisting of one or more interconnected switches, where each island is typically dedicated to a single application or to multiple related applications. A SAN island may be independently managed by a separate administration team, while strict isolation from faults is achieved through physical network separation. However, because this physical isolation restricts access by other networks and users, the sharing of critical storage assets and the economic savings of storage consolidation are limited.

VSAN functionality is a feature developed by Cisco that leverages the advantages of isolated SAN fabrics while addressing the limitations of isolated SAN islands. VSANs provide a method for allocating ports within a physical fabric to create virtual fabrics. Independent physical SAN islands are virtualized onto a common SAN infrastructure. An analogy is that VSANs on Fibre Channel (FC) networks are like VLANs on Ethernet networks. Separate fabric services are available on each VSAN, because each VSAN is a virtual fabric, as are statistics, which are gathered on a per-VSAN basis.


VSANs Reduce Infrastructure Costs

Dynamic provisioning and resizing
Improved port utilization
Non-disruptive (re)assignment
Shared ISL bandwidth

Separate SAN islands (example)
  16-port switches: ports required 70, ports deployed 96, ISL ports 16 (7:1 fan-out), ports stranded 10; net 96 ports deployed for 70 used
  32-port switches: ports required 40, ports deployed 64, ISL ports 0, ports stranded 24; net 64 ports deployed for 40 used

Consolidated with VSANs (70-port fabric in Red_VSAN, 40-port fabric in Blue_VSAN)
  Ports required 70 + 40, ports deployed 128, ISL ports 0, ports still assignable 18 (able to add more switching modules too); net 110 ports consumed for 110 used

VSANs Reduce Infrastructure Costs


VSANs allow dynamic provisioning and resizing of virtualized SAN islands. Virtual fabrics are built to meet initial port requirements. This not only allows for good port utilization, but also for dynamic resizing of virtual fabrics to meet actual, rather than projected, needs. With individual fabrics, port counts are dictated to some degree by the hardware configurations available. Provisioning ports logically, rather than physically, allows assignment of only as many ports as are needed. Stranded ports (ports unneeded on an isolated fabric) are also reduced or eliminated.

Ports can be (re)assigned to VSANs non-disruptively. ISLs become Enhanced ISLs (EISLs) carrying tagged traffic from multiple VSANs. ISL bandwidth is securely shared between VSANs, which reduces the cost of excessive ISLs. EISLs carry only permitted VSANs, which can limit the reach of individual VSANs. Each port can belong to only one VSAN, and there is no leakage between VSANs. Inter-VSAN Routing (IVR) must be used to exchange traffic between two different VSANs.
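The port-count comparison in the slide reduces to simple arithmetic. The Python sketch below (the function name is invented for this example) computes the stranded ports for the isolated islands and the ports left assignable after consolidating onto VSANs, using the figures quoted above.

def stranded_ports(deployed, required, isl_ports):
    # Ports that are neither used by devices nor consumed by ISLs on an isolated island.
    return deployed - required - isl_ports

# Two separate physical islands built from fixed-size switches:
print(stranded_ports(deployed=96, required=70, isl_ports=16))  # 10 stranded ports
print(stranded_ports(deployed=64, required=40, isl_ports=0))   # 24 stranded ports

# Consolidated onto one physical fabric carved into two VSANs (no dedicated ISL ports):
deployed, required = 128, 70 + 40
print(deployed - required)  # 18 ports still assignable to either VSAN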


VSANs Constrain Fault Impacts

Create a VSAN for each application
Isolate traffic flows by assigning different FSPF costs
Fabric services are replicated and maintained on a per-VSAN basis
Any disruption is contained within the VSAN, for example:
  FSPF fabric reconfiguration
  Misbehaving HBA or device
  RSCN broadcast
  Active zone set change

[Figure: a fabric event, such as an HBA generating erroneous control frames, disrupts only its own VSAN; the other VSANs are protected.]

VSANs Constrain Fault Impacts


VSANs sectionalize the fabric to increase availability. All fabric services are replicated and maintained on a per-VSAN basis, including name services, notification services, and zoning services. This means that fabric events are isolated on a per-VSAN basis. This isolation provides high availability by protecting unaffected VSANs from events on a single VSAN within the physical fabric. The faults are constrained to the extents of the affected VSAN and only affect devices within that VSAN. Protection is provided from events like:

Misbehaving HBA or controller
Fabric rebuild event
Zone set change

Fabric recovery from a disruptive event is also per-VSAN, resulting in faster reconvergence due to the smaller scope.


SAN Consolidation with VSANs

VSANs enable highly resilient, high port density, manageable SAN designs
Leverage VSANs as a replacement for multiple separate physical fabrics
Separate VSANs provide hardware-based traffic isolation and security
Very high port-density platforms minimize the number of switches required
Eliminates the wasted ports of the SAN island approach
Link bundling and QoS within the fabric optimize resource usage and traffic management

[Figure: a consolidated main data center fabric of Cisco MDS 9506, 9216, 9513, and 9509 switches connected by VSAN trunks and trunk bundles, serving server groups A and B, a backup VSAN, and a shared storage pool.]

SAN Consolidation with VSANs


One of the key enablers for SAN consolidation on the MDS platform is the Virtual SANs (VSANs) feature. VSANs completely isolate groups of ports in the fabric, allowing virtual fabrics to replace multiple physical SAN fabrics as the means to secure and scale applications. VSANs allow high-density switch platforms to replace inefficient workgroup fabric switches. In addition to VSANs, features like PortChannels (link aggregation) and QoS allow IT to optimize resource usage and manage traffic within the fabric. IT can ensure that applications get the resources they need without having to physically partition the fabric.


VSAN Advantages

Good ROI
  Leverage VSANs as a replacement for multiple separate physical fabrics
  Reduce the number of switches
  Increase port density

Availability
  Disruptions and I/O pauses are confined to the local VSAN
  Increase fabric stability

Scalability
  Fabric services are per VSAN
  Reduce the size of the FC distributed database
  FC_IDs can be reused

Security
  Separate VSANs provide hardware-based traffic isolation and security

[Figure: a VSAN-enabled fabric with VSAN trunks serving Department/Customer A, Department/Customer B, a management VSAN, and shared storage.]

VSAN Advantages
VSANs allow implementation of multiple logical SANs over a common fabric, which eliminates costs associated with separate physical fabrics. The virtual fabrics exist on the same physical infrastructure, but are isolated from each other. Each VSAN contains zones and separate (replicated) fabric services, which improves:

Availability, through the isolation of virtual fabrics from fabric-wide faults and reconfigurations
Scalability, through replicated fabric services per VSAN, support for 256 VSANs, and centralized management capability
Security, through fabric isolation

256 VSANs is not a hard limit. The VSAN header is 12 bits long and supports up to 4096 VSANs, and that number can be reached in the future as larger-scale SAN deployments increase. Note that the total number of VSANs that can be configured is 256, but the numbering can be anywhere between 1 and 4093 for the reasons mentioned above. The FC_ID contains an 8-bit field for domains, and a few values are reserved, leaving the 239-domain (switch) limitation per SAN, with each switch getting its own Domain ID. With Cisco's VSAN technology, this limitation now applies per VSAN, which means that domains (and hence FC_IDs) can be reused across VSANs. This enables the deployment of much larger-scale SANs than are available currently.


How VSANs Work

VSAN Primary Functions

The VSAN feature consists of two primary functions:

Hardware-based isolation of traffic
  No special drivers or configuration required for end nodes
  Traffic is tagged at the FC ingress port and carried across EISLs
  A VSAN tagged header (VSAN_ID) is added at the ingress port to indicate membership and removed at the egress port
  An Enhanced ISL (EISL) trunk between Trunking E_Ports (TE_Ports) carries tagged traffic from multiple VSANs

Independent fabric services for each VSAN
  FSPF, zone server, name server, management server, principal switch selection, and so on
  Services are run, managed, and configured independently

[Figure: two switches connected by an EISL trunk; each maintains separate Fibre Channel services for the Blue VSAN and the Red VSAN; the VSAN tag is added at the ingress port and removed at the egress port.]

VSAN Primary Functions


The VSAN feature consists of two primary functions:

Hardware-based isolation of tagged traffic belonging to different VSANs, which requires no special drivers or configuration at the end nodes, such as hosts and disks. Traffic is tagged at the Fibre Channel ingress port (Fx_Port) and carried across EISL links between MDS 9000 switches. Because VSANs use explicit frame tagging, they can be extended over the metro or WAN; the MDS 9000 Family IP Storage module carries the tags in Fibre Channel over Internet Protocol (FCIP) for greater distances. Fibre Channel, and therefore VSANs, can easily be carried across dark fiber. However, VSANs add 8 bytes of header, which may be a concern for channel extenders: a channel extender may consider the frame invalid and drop it, and dense wavelength division multiplexing (DWDM) switches may also count such frames as invalid but still pass them. Qualification is ongoing within Cisco to validate the various extension methods.

The creation of an independent instance of Fibre Channel fabric services for each newly created VSAN. These services include the zone server, name server, management server, principal switch selection, and so on. Each service runs independently in each VSAN and is independently managed and configured.
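To illustrate how little configuration is involved at the switch (and none at the end nodes), the following is a minimal CLI sketch of creating a VSAN and assigning a port to it. The VSAN number, name, and interface are hypothetical examples, and exact prompts and output vary by SAN-OS release.

    switch# config terminal
    switch(config)# vsan database
    switch(config-vsan-db)# vsan 10 name Engineering    ! create VSAN 10
    switch(config-vsan-db)# vsan 10 interface fc1/1     ! assign port fc1/1 to VSAN 10
    switch(config-vsan-db)# exit
    switch(config)# exit
    switch# show vsan membership                        ! verify port-to-VSAN assignments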


VSAN Attributes
- 256 VSANs per switch; 239 switches per VSAN
- Traffic is isolated within its own VSAN, with control over each incoming and outgoing port
- Each frame in the fabric is uniquely tagged with a VSAN_ID header at the ingress port
- The VSAN_ID is maintained across TE_Ports and stripped away across E_Ports; VSAN and priority in the EISL header support QoS
- FC_IDs can be reused across other VSANs
- Increases switch granularity, simplifies migration, eases management
[Figure: A single Cisco MDS 9509 chassis logically partitioned into Fabric 10 (Domain ID 0x61, 44 ports), Fabric 20 (Domain ID 0x94, 24 ports), Fabric 30 (Domain ID 0x33, 12 ports), and Fabric 1, the default (Domain ID 0x12, 8 ports).]

VSAN Attributes
VSANs help achieve traffic isolation in the fabric by adding control over each incoming and outgoing port. There can be up to 256 VSANs in the switch and 239 switches per VSAN. This effectively helps with network scalability because the fabric is no longer limited to 239 Domain_IDs; they can be reused within each VSAN. To uniquely identify each frame in the fabric, the frame is labeled with a VSAN_ID on the ingress port; the VSAN_ID is stripped away across E_Ports but maintained across TE_Ports. By carrying the VSAN and priority in the header, quality of service (QoS) can be properly applied. The VSAN_ID is always stripped away at the other edge of the fabric. If an E_Port is capable of carrying multiple VSANs, it becomes a trunking E_Port (TE_Port). VSANs also facilitate the reuse of address space by creating independent virtual SANs, therefore increasing the available number of addresses and improving switch granularity. Without VSANs, an administrator needs to purchase separate switches and links for separate SANs, and the system granularity is at the switch level, not at the port level. VSANs are easy to manage: to move or change users, you only need to change the configuration of the SAN, not its physical structure, and to move devices between VSANs, you simply change the configuration at the port level; no physical moves are required.


VSAN Numbering Rules


VSAN 1 (default VSAN)
- Automatically configured by the switch as the default VSAN
- All ports are originally in VSAN 1
- Always present, cannot be deleted
VSAN 2 through 4093
- User-configurable VSANs
- A maximum of 254 VSANs can be created in this number range
VSAN 4094 (isolated VSAN)
- Used to isolate ports whose port VSAN has been deleted
- Not propagated across switches
- Always present, cannot be deleted
[Figure: Two Cisco MDS 9000 Family switches with VSAN service connected by a Trunking E_Port (TE_Port) link carrying configured VSANs 10, 20, and 30; a host whose port is in VSAN 4094 (isolated VSAN) is isolated from the fabric.]

VSAN Numbering Rules


There are certain rules that have to be followed when creating VSANs. VSAN 1, for instance, is automatically configured by the switch as the default VSAN. All ports that are configured are originally put into VSAN 1 until specifically configured into another VSAN number. The VSAN numbers ranging from 2 through 4093 are the user-configurable VSANs. Even though there are many more possible numbers in this range, a maximum of 254 VSANs can be created here. VSAN 4094 is a reserved special VSAN called the isolated VSAN. It is used to temporarily isolate the ports whose VSAN has been deleted. VSAN 4094 is not propagated across switches, is always present, and cannot be deleted. In the figure, VSAN 30 is not propagated across the EISL, because it is not configured on the local switch even though it is configured on the remote switch. Instead of the host device on the local switch being able to connect to the remote switch, it has been placed in the isolated VSAN 4094 because the port's VSAN (VSAN 30) has been deleted from the local switch configuration.
Note: VSAN 0 and VSAN 4095 are reserved and not used.


TE_Ports and EISLs


Trunking E_Port (TE_Port):
- Carries tagged frames from multiple VSANs
- Only understood by Cisco MDS 9000 switches
- Trunks all VSANs (1-4093) by default; the VSAN allowed list defines which frames are allowed
- Can be optionally disabled for E_Port operation; has a native VSAN assignment for E_Port operation
- Not to be confused with port aggregation (PortChannels)
Enhanced ISL (EISL):
- Link created by connecting two TE_Ports
- Superset of ISL functionality
- Also carries per-VSAN control protocol information: FSPF, distributed name server, zoning updates, etc.

TE_Ports and EISLs


Trunking E_Ports (TE_Ports) have the following characteristics:

- TE_Ports can pass tagged frames belonging to multiple VSANs.
- TE_Ports are only supported by Cisco MDS 9000 switches.
- By default, TE_Ports can pass all VSAN traffic (1-4093). The passing of traffic for specific VSANs can be disabled.
- By default, E_Ports are assigned as part of VSAN 1.
- TE_Ports allow for the segregation of SAN traffic and should not be confused with port aggregation (referred to by some vendors as trunking).

Enhanced ISLs (EISLs) are ISLs that connect two TE_Ports:


An EISL is created when two TE_Ports are connected. EISLs offer a superset of ISL functionality and carry per-VSAN control protocol information.
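As a rough illustration of how an EISL trunk might be configured on an MDS interface, consider the sketch below. The interface number and VSAN allowed list are placeholders, and exact defaults depend on the SAN-OS release.

    switch# config terminal
    switch(config)# interface fc2/1
    switch(config-if)# switchport mode E              ! inter-switch link
    switch(config-if)# switchport trunk mode on       ! negotiate trunking (TE_Port)
    switch(config-if)# switchport trunk allowed vsan 10
    switch(config-if)# switchport trunk allowed vsan add 20
    switch(config-if)# no shutdown
    switch# show interface fc2/1 trunk vsan           ! verify which VSANs are trunked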


WWN-Based VSANs
Port-based VSANs:
- VSAN membership is based on the physical port of the switch
- Reconfiguration is required when a server or storage device moves to another switch
WWN-based VSANs (SAN-OS 2.0):
- VSAN membership is based on the pWWN of the server or storage device
- Fabric-wide distribution of the configuration using CFS
- No reconfiguration is required when a host or storage device moves
[Figure: With port-based VSANs, moving a host from switch SW1 to SW2 requires reconfiguration on SW2; with WWN-based VSANs, the move requires no reconfiguration.]

WWN-Based VSANs
With the introduction of SAN-OS 2.0, VSAN membership may now be defined based on the world wide name (WWN) of hosts and storage devices, as well as by switch port. With WWN-based VSAN membership, hosts and targets can be moved from one port to any other port anywhere in the MDS fabric without requiring manual reconfiguration of the port VSANs.
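A minimal sketch of how WWN-based membership might be configured using the dynamic port VSAN membership (DPVM) feature is shown below. The pWWN and VSAN number are hypothetical, and command availability depends on the SAN-OS release.

    switch# config terminal
    switch(config)# dpvm enable
    switch(config)# dpvm database
    switch(config-dpvm-db)# pwwn 21:00:00:e0:8b:01:02:03 vsan 10   ! bind this host pWWN to VSAN 10
    switch(config-dpvm-db)# exit
    switch(config)# dpvm activate                                  ! activate the DPVM database
    switch(config)# dpvm distribute                                ! distribute it fabric-wide using CFS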


Inter-VSAN Routing (IVR)


IVR Overview
Inter-VSAN Routing (IVR) allows selective routing between specific members of two or more VSANs:
- Preserves VSAN benefits
- Selectively allows traffic flow
- Shares resources, e.g., tape
- A transit VSAN isolates the WAN infrastructure, resolves the problems of merged fabrics, and keeps FC control frames within each VSAN
[Figure: Hosts in VSAN 10 reach devices in VSAN 20 across a transit VSAN carried over FC or FCIP.]

IVR Overview
VSANs are like virtual switches. They improve SAN scalability, availability, and security by allowing multiple SANs to share a common physical infrastructure of switches and ISLs. These benefits are derived from the separation of Fibre Channel services in each VSAN and the isolation of traffic between VSANs. Data traffic isolation between the VSANs also inherently prevents sharing of resources attached to a VSAN, for example robotic tape libraries. Using IVR, resources across VSANs are accessed without compromising other VSAN benefits. When IVR is implemented, data traffic is transported between specific initiators and targets on different VSANs without merging the VSANs into a single logical fabric. FC control traffic does not flow between VSANs, nor can initiators access any resources across VSANs other than the designated ones. IVR allows valuable resources like tape libraries to be easily shared across VSANs, and IVR used in conjunction with FCIP provides more efficient business continuity and disaster recovery solutions. IVR works for both FC and FCIP links. Using IVR, a backup server in VSAN 10 could access a tape library in VSAN 20 by configuring the switches involved to allow traffic between these devices, by VSAN and pWWN. Because the other nodes are not configured for IVR, they are unable to access each other.


Single-Switch IVR Designs


[Figure: A media server in VSAN 10 and a tape library in VSAN 20, both attached to the same switch, are placed in a common IVR zone; all other hosts and storage remain confined to their own VSANs.]
Single-switch IVR designs:
- Simplest IVR topology: one switch in the path
- Transit VSAN not required
- An IVR zone and zoneset permit selective access across VSAN boundaries

Single-Switch IVR Designs


An IVR path is a set of switches and ISLs through which a frame from an end-device in one VSAN can reach another end-device in some other VSAN. Multiple paths can exist between two such end-devices. The simplest example of an IVR topology is one involving two VSANs and a single switch. In addition to the normal zones and zonesets that exist within each VSAN, IVR supports the creation of an IVR zone and zoneset, which allows selective access between the devices in two VSANs. In the example, the backup media server in VSAN 10 is allowed to access the tape library in VSAN 20. All other devices are restricted to their respective VSANs.
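The sketch below shows how such an IVR zone and zoneset might be configured on the switch. The zone name, zoneset name, pWWNs, and VSAN numbers are illustrative placeholders, and exact syntax can vary by SAN-OS release.

    switch# config terminal
    switch(config)# ivr enable
    switch(config)# ivr zone name TapeBackup_IVZ
    switch(config-ivr-zone)# member pwwn 21:00:00:e0:8b:aa:bb:cc vsan 10   ! media server in VSAN 10
    switch(config-ivr-zone)# member pwwn 50:06:04:82:bf:d0:11:22 vsan 20   ! tape library in VSAN 20
    switch(config-ivr-zone)# exit
    switch(config)# ivr zoneset name TapeBackup_IVZS
    switch(config-ivr-zoneset)# member TapeBackup_IVZ
    switch(config-ivr-zoneset)# exit
    switch(config)# ivr zoneset activate name TapeBackup_IVZS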


IVR with Unique Domain IDs


Prior to SAN-OS 2.1, Domain IDs must be unique within an IVR zoneset. If Domain IDs are unique:
- Frames are routed from one VSAN to another with no added latency
- The S_ID and D_ID are unchanged across VSANs
- All VSANs belong to a single Autonomous Fabric (AFID=1)
[Figure: Autonomous Fabric ID=1. A host with fcid 05.02.01 in VSAN 10 (Domain 0x05) reaches storage with fcid 06.03.04 in VSAN 20 (Domain 0x06) across transit VSAN 99, which is carried over an FCIP tunnel between switches S1 and S2 (Domains 0x0A, 0x5F, 0x5C, and 0x14). In VSAN 10, VSAN 99, and VSAN 20 the frame keeps S_ID 05.02.01 and D_ID 06.03.04; only the VSAN tag is rewritten.]

IVR with Unique Domain IDs


Unique Domain IDs are required for all switches involved in IVR. In this way, a frame moving from fcid 05.02.01 in VSAN 10 to fcid 06.03.04 in VSAN 20 will retain the same source and destination FCIDs as it crosses two or more VSANs. Whenever a frame enters a Cisco MDS 9000 switch, it is tagged with a VSAN header indicating the native VSAN of the port. In the case of IVR, when the destination FCID resides in a different VSAN, the tag will be rewritten at the ingress port of the IVR border switch. In the figure, assume that a frame is destined from fcid 05.02.01 in VSAN 10 to fcid 06.03.04 in VSAN 20. The left-most switch, with Domain ID 5, applies a VSAN 10 ID tag. The next switch performs a VSAN rewrite, changing the VSAN tag to 99. The last switch changes the VSAN tag to 20. The process is reversed on the return path.

VSAN Rewrite Table


Each IVR-enabled switch maintains a copy of the VSAN Rewrite Table. The table can hold up to 4096 entries. Each entry includes the following information:

- Switch identifier
- Current VSAN ID
- Source Domain
- Destination Domain
- Next-hop VSAN (rewritten VSAN)


IVR with Overlapping Domain IDs


If Domain IDs overlap, IVR NAT will rewrite the S_ID and/or D_ID in the frame header and route the frame to its destination in a different VSAN.
- From SAN-OS 2.1, Domain IDs do not have to be unique within an IVR zoneset
- All VSAN IDs must be unique within the same Autonomous Fabric
[Figure: Autonomous Fabric ID=1. VSAN 10 and VSAN 20 each contain a switch with Domain 0x05. A host with fcid 05.02.01 in VSAN 10 reaches storage with fcid 05.03.04 in VSAN 20 across transit VSAN 99 (FCIP tunnel between switches S1 and S2). IVR NAT at the border switches rewrites the header: S_ID 05.02.01 / D_ID 06.03.04 in VSAN 10 and VSAN 99 becomes S_ID 06.02.01 / D_ID 05.03.04 in VSAN 20.]

IVR with Overlapping Domain IDs


IVR-2, introduced in SAN-OS 3.0, offers several enhancements over previous versions of IVR:

- Removes the unique VSAN ID and Domain ID requirement
- Integrates with QoS, LUN zoning, and read-only zoning
- Provides automatic IVR configuration propagation throughout the fabric (AUTO mode)
- Provides automatic IVR topology discovery
- Licensed with the Enterprise and SAN Extension (with IPS 4 or 8 installed) packages

In the example, notice that VSAN 10 has a switch with Domain ID 05 and so does VSAN 20. Therefore, IVR NAT must provide a proxy entry in VSAN 10 for the VSAN 20 device 05.03.04 and renumber it as 06.03.04. A frame from VSAN 10 fcid 05.02.01 is written with a destination fcid of 06.03.04 and routed via the transit VSAN 99 to VSAN 20. As the frame arrives at the border switch in VSAN 20, the destination in the frame header is rewritten as 05.03.04 and the frame is routed to its destination port. Notice that with SAN-OS 2.1 there is only one Autonomous Fabric, so all VSAN IDs must be unique within that Autonomous Fabric.
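A minimal sketch of enabling IVR with NAT and automatic topology discovery is shown below; it assumes SAN-OS 2.1 or later and CFS distribution, and should be treated as illustrative rather than a complete procedure.

    switch# config terminal
    switch(config)# ivr enable                 ! enable IVR on this border switch
    switch(config)# ivr nat                    ! allow overlapping Domain IDs (FCID rewrite)
    switch(config)# ivr distribute             ! distribute the IVR configuration using CFS
    switch(config)# ivr vsan-topology auto     ! discover the IVR topology automatically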


Autonomous Fabrics and Overlapping VSAN IDs

From SAN-OS 3.0:
- You can configure up to 64 separate Autonomous Fabrics
- All VSAN IDs must be unique within the same Autonomous Fabric, but the same VSAN ID can exist in a different Autonomous Fabric
- IVR NAT will rewrite the S_ID and/or D_ID and route frames as before
[Figure: VSAN 10 in Autonomous Fabric 1 is joined to VSAN 10 in Autonomous Fabric 2 across transit VSAN 99 (AFID 1, FCIP tunnel between switches S1 and S2). A host with fcid 05.02.01 (Domain 0x05) reaches storage with fcid 05.03.04 (Domain 0x05); IVR NAT rewrites S_ID 05.02.01 / D_ID 06.03.04 in AFID 1 to S_ID 06.02.01 / D_ID 05.03.04 in AFID 2.]

Autonomous Fabrics and Overlapping VSAN IDs


From SAN-OS 3.0 with IVR-2 you can configure up to 64 separate Autonomous Fabrics. Each VSAN in a single physical switch must only belong to one Autonomous Fabric. IVR must know about the topology of the IVR-enabled switches in the fabric to function properly. You can specify the topology in two ways:

- Manual configuration: configure the IVR topology manually on each IVR-enabled switch
- Automatic mode: uses CFS configuration distribution to dynamically learn and maintain up-to-date information about the topology of the IVR-enabled switches in the network

In the example, VSAN 10 on the left is joined to VSAN 10 on the right via a Transit VSAN 99. This would be illegal in a single Autonomous Fabric so both sides are configured in separate Autonomous Fabrics 1 and 2. Notice that IVR NAT now rewrites the AFID in the EISL frame header from AFID1:VSAN 10 to AFID2:VSAN 10 as it passes through the IVR edge switches.


IVR Service Groups


(SAN-OS 3.0)
- A group of unique VSAN IDs and Domain IDs
- Up to 16 Service Groups supported, with a total of 64 AFID-VSAN combinations
- Each VSAN in a switch must belong to one and only one Autonomous Fabric ID (AFID)
- A single switch can be in multiple AFIDs
- AFIDs in a switch cannot share a VSAN ID
- The default AFID is 1 and can be changed via the CLI or Fabric Manager
- A VSAN ID can be reused in different AFIDs without merging that VSAN, as long as the AFIDs do not share a switch
- IVR control traffic is distributed among all Service Groups
- IVR data traffic is contained within each Service Group
[Figure: Switches S1, S2, and S3 (swwn1, swwn2, swwn3) participate in AFIDs 1 through 4, with VSAN 1 reused in several AFIDs alongside other VSANs such as 100 and 103.]

IVR Service Groups


IVR Service Groups are defined as groups of unique VSAN IDs and Domain IDs within an Autonomous Fabric. VSAN IDs (and Domain IDs) can be the same as long as they reside in different AFIDs. With IVR-2, there can be a total of 16 IVR Service Groups, the allowed AFID range is 1-64, and the AFID is now used in the routing decision. With IVR-1, prior to SAN-OS 2.1, zoning had to be performed at each switch in the configuration, but IVR-2 uses CFS to distribute IVR zoning to IVR-2 enabled switches, so IVR-2 zoning can be done on one switch only and propagated to the IVR-2 fabric. Notice that IVR control traffic (e.g., the IVR topology database) is distributed to all IVR-enabled switches across all configured IVR Service Groups, but IVR data traffic (e.g., frames moving from VSAN to VSAN) is contained within each IVR Service Group.


IVR Best Practices


- Encourage use of non-overlapping domains across all VSANs
- For large installations, try not to have IVZ members spread across many switches; it wastes resources
- Allow for multiple paths between the IVZ members
- Set the default zone policy to deny and avoid using the force option when activating the IVZS
- Make sure that exactly the same IVR topology is applied to all IVR-enabled switches
- Configure IVR to use Cisco Fabric Services
- Use Cisco Fabric Manager to configure IVR


IVR Best Practices


The following are recommended best practices for implementing IVR:

- While it is not strictly required to have unique Domain IDs across VSANs for switches that are not participating in IVR, unique Domain IDs are recommended because they simplify fabric design and management.
- Because the VSAN rewrite table is limited to 4096 entries, and because entries are per domain, not per end device, it is best to minimize the number of switches that contain IVZ members in very large implementations.
- Implement redundant path designs whenever possible.
- In normal FC environments, it is generally considered a best practice to set the default zone policy to deny. Because members of IVZs cannot exist in the default zone, activation of an IVZS using the force option may lead to traffic disruption if IVZ members previously existed under a default zone policy of permit.
- Make sure that exactly the same IVR topology is applied to all IVR-enabled switches.
- Using Cisco Fabric Manager to configure IVR can help avoid errors and will ensure that the same IVR configuration is applied to all IVR-enabled switches.


Competing Technologies: Logical SANs


[Figure: Fabrics A, B, and C, each with application hosts and backup servers, are joined by redundant external multiprotocol routers into logical SANs LSAN_1 and LSAN_2 so that a backup media server can reach a shared tape library.]
External routers or gateways:
- Add latency to every frame
- Consume ISL ports
- Are difficult to manage
- Are a single point of failure

Competing Technologies: Logical SANs


Brocade offers a proprietary solution called Logical SANs, or LSANs. This feature allows traffic between devices that would otherwise be isolated in separate fabrics. This implementation makes sense for Brocade's small-to-medium-sized business customers, who typically have a significant investment in smaller, legacy switches deployed in workgroup SANs. LSAN implementation requires the purchase of at least one proprietary multiprotocol router; the diagram shows a redundant configuration, which requires two of the special-purpose routers. LSANs use a multiprotocol router to:

- Join fabrics without merging them
- Perform NAT to join separate address spaces
- Perform functions similar to iFCP gateways


PortChannels
PortChannels
A PortChannel is a logical bundling of ISLs:
- Multiple links are combined into one aggregated link
- More reliable than FSPF equal-cost routing
- Can span line cards for higher availability
- Higher throughput: up to 160 Gbps per PortChannel (16 x 10 Gbps)
- No distance limitations
- Up to 16 ISLs per PortChannel
- Up to 128 PortChannels per switch
[Figure: A single PortChannel between two MDS switches, and multiple PortChannels between two MDS switches.]

A PortChannel is a logical bundling of identical links. PortChannels (link bundling) enable multiple physical links to be combined into one aggregated link, and the bandwidth of these links is aggregated into this logical link. There may be a single PortChannel between switches or multiple PortChannels between switches. PortChannels provide a point-to-point connection over multiple interswitch link (ISL) E_Ports or extended interswitch link (EISL) TE_Ports. PortChannels increase the aggregate bandwidth of an ISL by distributing traffic among all functional links in the channel, which decreases the cost of the link between switches. PortChannels also provide high availability on an ISL: if one of the physical links fails, traffic previously carried on this link is switched to the remaining links. A basic configuration sketch appears after the list below. PortChannels are known in the industry by other names, such as the following:

- ISL trunking (Brocade Communications Systems)
- Port bundling
- Aggregated channels
- Channel groups
- Channeling
- Bundles
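The following is a minimal sketch of manually building a two-link PortChannel on an MDS switch; the channel number and interface range are hypothetical, and the same configuration would normally be applied at both ends of the bundle.

    switch# config terminal
    switch(config)# interface port-channel 10
    switch(config-if)# switchport mode E            ! the bundle acts as a single (E)ISL
    switch(config-if)# exit
    switch(config)# interface fc1/1 - 2             ! two physical ISLs, ideally on different modules
    switch(config-if)# channel-group 10 force       ! add the links to PortChannel 10
    switch(config-if)# no shutdown
    switch# show port-channel summary               ! verify member links and status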


FSPF Routing
[Figure: Four switches with Domains 10, 11, 12, and 13 connected by 1 Gbps links; Servers A, B, and C attach to Domain 10 and storage attaches to Domain 13. FSPF routing table from Domain 10 to 13: path 1 (10-11-13, cost 2000) carries Servers A and C; path 2 (10-12-13, cost 2000) carries Server B. FSPF link costs: 1 Gbps = 1000, 2 Gbps = 500, 4 Gbps = 250.]
- FSPF builds a routing table between each domain in the fabric
- FSPF chooses the least-cost path and routes all frames along it, but here there are two equal-cost paths: paths 1 and 2 both have a cost of 2000
- FSPF applies a round-robin algorithm to share the load between connected devices: Servers A and C share path 1 and Server B is allocated path 2
- All frames from Server A to storage will be carried across path 1, so path 1 will carry a different load than path 2
- FSPF does NOT load balance across equal-cost paths on non-Cisco switches

FSPF Routing
When Fibre Channel switches are joined together with ISLs, FSPF builds a routing table which is distributed to all switches using Link State updates. The routing table is a list of every possible path between any two domains in the Fibre Channel fabric. Each path is assigned a cost based upon the speed of the link:

- 1 Gbps = 1000
- 2 Gbps = 500
- 4 Gbps = 250

FSPF then chooses the least-cost path between any two domains. All frames are sent along the least-cost path and all other possible paths are ignored. Every time an ISL is added or removed, FSPF issues a Build Fabric (BF) command to rebuild the routing table.
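For reference, the sketch below shows one way to inspect the FSPF topology database and to override the cost of a link on an MDS switch; the VSAN number, interface, and cost value are placeholders rather than recommendations.

    switch# show fspf database vsan 10        ! display link state records for VSAN 10
    switch# config terminal
    switch(config)# interface fc1/1
    switch(config-if)# fspf cost 1000 vsan 10 ! manually set this link's FSPF cost in VSAN 10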


FSPF Routing Issues


[Figure: A third 1 Gbps link is added between Domains 10 and 11. FSPF routing table from Domain 10 to 13: path 1 (10-11-13, cost 2000) Server A; path 2 (10-12-13, cost 2000) Server B; path 3 (10-11-13, cost 2000) Server C. FSPF link costs: 1 Gbps = 1000, 2 Gbps = 500, 4 Gbps = 250.]
- To provide more bandwidth, we could add another link between switches
- FSPF again chooses the least-cost path and routes all frames along it
- There are now three equal-cost paths: paths 1, 2, and 3 all have a cost of 2000
- FSPF re-applies the round-robin algorithm to share the load between connected devices: Server A is allocated path 1, Server B path 2, and Server C path 3
- All frames from Server A to storage will still be carried across path 1, so path 1 will carry a different load than path 2 or path 3
- FSPF still does not load balance across equal-cost paths on non-Cisco switches

FSPF Routing Issues


Often, we would add a second ISL between switches to provide more bandwidth. Both ISLs would have the same cost, if the link speed is the same. If there are equal cost paths, then the paths will be shared among connected devices and allocated on a round-robin basis. Once a path has been allocated to a device, then all frames will use that path even though another equal cost path may be available. If the path is congested, then other equal cost paths will not be used. FSPF does not load balance across equal cost paths; it only shares equal cost paths to connected devices. This can lead to one path being congested and another equal cost path underutilized.

Exchange-Based Load Balancing


Cisco MDS switches, by default, will load balance across equal-cost paths based upon the Fibre Channel exchange. This provides better granularity and balances the load across equal-cost paths.


FSPF and Port Channels


[Figure: The two 1 Gbps links between Domains 10 and 11 are bundled into a PortChannel. FSPF routing table from Domain 10 to 13: path 1 (10-11-13, cost 1000) carries Servers A, B, and C; path 2 (10-12-13, cost 2000) carries none. PortChannel link cost = FSPF link cost / number of links, i.e. 1000 / 2 = 500. FSPF link costs: 1 Gbps = 1000, 2 Gbps = 500, 4 Gbps = 250.]
- To provide more bandwidth, we could add another link between switches
- FSPF rebuilds the routing table between each domain in the fabric
- FSPF again chooses the least-cost path and routes all frames along it, but there is now only one least-cost path: path 1, which has a new cost of 1000
- All frames from Domain 10 to 13 will follow path 1
- By default, the PortChannel will load balance across all links in the PortChannel
- If a link fails within the PortChannel, the FSPF cost doesn't change and frames will continue to flow through the PortChannel

When two or more ISLs are placed in a PortChannel, the bundle is seen as a single path by FSPF, and its cost is calculated as the cost of each link divided by the number of links in the PortChannel. Cisco MDS switches provide exchange-based load balancing across all links within the PortChannel.


Flapping Links

[Figure: Link State Records are flooded across every ISL in the fabric when a link flaps.]
- A flapping link can cause FSPF recalculation
- FSPF will rebuild the topology database when the link goes down, and again when it comes up
- On a failing GBIC or link, this can happen several times a second
- Results in wide-scale disruption to the fabric

Flapping Links
PortChannels can handle some types of hardware failures better than ISLs that do not belong to a PortChannel. For example, if a flapping link exists between the two middle directors outside of a PortChannel, FSPF overhead is incurred. In this case, each time the ISL goes down or comes up, all of the switches in the fabric recalculate the cost of each of their FSPF links by exchanging Link State Records on every (E)ISL interface. Switches synchronize databases by sending Link State Records (LSRs) in a Link State Update (LSU) SW_ILS extended link service command. When a switch receives an LSU, it compares each LSR in the LSU with its current topology database. If the new LSR is not present in the switch's link state database, or if the new LSR is newer than the existing LSR, the LSR is added to the database. Cisco uses a modified Dijkstra algorithm that performs a very fast computation of the FSPF topology database. When a link flaps, the LSUs are flooded and then the path calculation occurs. While Cisco MDS switches handle flapping links more efficiently than most competitors, placing ISLs within a PortChannel can completely eliminate the overhead associated with FSPF recalculation caused by a flapping link.


Flapping Links with PortChannels

- A flapping link within a PortChannel results in no FSPF recalculation
- Frames continue to flow across the remaining links in the PortChannel
- Fabric stability is maintained

Flapping Links with PortChannels


If, instead, the three middle links in the diagram are part of a PortChannel, FSPF overhead is virtually eliminated. The PortChannel is represented as a single path in the FSPF routing table, and no FSPF path cost recalculation is performed when a link fails in a PortChannel. This is true as long as at least one functioning link remains in the PortChannel.


PortChannel Protocol
Used for exchanging configuration information between switches to automatically configure and maintain PortChannels (SAN-OS 2.0):
- Provides a consistency check of the configuration at both ends
- Simplifies PortChannel configuration
- Automatically creates a PortChannel between switches
- With the PortChannel Protocol, misconfigured ports are isolated instead of suspended
[Figure: Channel group 10 between switches A and B; the plug-and-play functionality of the PortChannel Protocol allows the individual A3-B3 link to be dynamically added to the PortChannel.]

The PortChannel Protocol


The PortChannel Protocol (PCP) was introduced, and is enabled by default, in SAN-OS 2.0. PCP uses the FC Exchange Peer Parameters (EPP) protocol to exchange configuration information between switches in order to automatically configure and maintain PortChannels. PCP contains two sub-protocols:

- Bringup protocol: misconfiguration detection and synchronization of port bringup
- Autocreate protocol: automatic aggregation of ports into PortChannels

PCP is exchanged only on FC and FCIP interfaces. The autocreate protocol is run to determine whether a port can aggregate with other ports to form a channel group. Both the local and peer ports have to be autocreate enabled for the autocreate protocol to be attempted, and more than one port needs to be autocreate enabled for aggregation to be attempted. A port cannot both be manually configured to be part of a PortChannel and have autocreate enabled; these two configurations are mutually exclusive. Autocreate-enabled ports need to have the same compatibility parameters to be aggregated: speed, mode, trunk mode, port VSAN, allowed VSANs, and port and fabric binding configuration.
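A minimal sketch of enabling PortChannel autocreation on a group of interfaces is shown below; the interface range is a placeholder, and both ends of the links would need the same treatment.

    switch# config terminal
    switch(config)# interface fc1/1 - 4
    switch(config-if)# channel-group auto       ! let the autocreate protocol aggregate these links
    switch(config-if)# no shutdown
    switch# show port-channel database          ! verify the automatically created channel group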


Load Balancing
VSANs have two load-balancing options:
- Flow-based: all traffic from a given source to a given destination follows the same path (load sharing)
- Exchange-based: FC exchanges from a given source to a given destination are load balanced across multiple PortChannel links and FSPF equal-cost paths
The load-balancing option is configured on a per-VSAN basis and applies to both FSPF and PortChannels. Some hardware/software combinations perform better with flow-based load balancing (e.g., HP CA with EVA). Exchange-based load balancing is the default. Devices in the MDS family do not split exchanges (one exchange per ISL), allowing for guaranteed in-order delivery over the WAN.
[Figure: A Read command exchange between a source and destination, consisting of the command, data Sequences 1 through 3, and the response.]

Load Balancing
Load balancing is configured for each VSAN in an MDS 9000 fabric. There are two load-balancing methods: flow-based and exchange-based. Flow-based load balancing sends all traffic with the same src_id-dst_id pair along the same path. Exchange-based load balancing ensures that members of the same SCSI exchange follow the same path; it is the default and is appropriate for most environments. Load balancing is configured on a VSAN-by-VSAN basis, and whichever method is chosen is applied to both FSPF and PortChannels. Some hardware/software combinations can perform better with flow-based load balancing. For example, HP EVA storage subsystems, when coupled with Continuous Access (CA) software, are sensitive to the out-of-order exchanges that are possible with exchange-based load balancing. These devices, while rare, do perform significantly better with flow-based load balancing.
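The sketch below shows how the per-VSAN load-balancing attribute might be set; the VSAN numbers are placeholders. The src-dst-id option selects flow-based behavior and src-dst-ox-id selects exchange-based behavior.

    switch# config terminal
    switch(config)# vsan database
    switch(config-vsan-db)# vsan 10 loadbalancing src-dst-id      ! flow-based, e.g. for sensitive arrays
    switch(config-vsan-db)# vsan 20 loadbalancing src-dst-ox-id   ! exchange-based (the default)
    switch(config-vsan-db)# exit
    switch# show vsan 10                                          ! displays the load-balancing attribute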


Cisco Exchange-Based Load Balancing


- Load balance across 16 FSPF logical paths
- One logical path; up to 16 physical links
- Exchange-based load balancing (S_ID - D_ID - OX_ID) maintains IOD in a stable fabric
[Figure: Traffic from S_ID 10.02.00 to D_ID 20.01.00 is load balanced across equal-cost FSPF logical paths and across the physical links of a PortChannel.]

Exchange-Based Load Balancing


By default, MDS 9000 family switches load balance traffic across equal cost FSPF paths and across the links in a PortChannel, on the basis of a FC exchange. All frames within a given exchange follow the same path. In the example, traffic going from the FC source ID 10.02.00 to the FC destination ID 20.01.00 will be load balanced across the equal cost FSPF logical links and within the PortChannel physical links. All frames within a given FC exchange will follow the same path. It is possible that exchanges could be delivered out of order. Because an exchange represents an upper layer transaction, e.g., a SCSI read or write operation, most devices are not sensitive to exchange re-ordering.


Interoperability with PortChannels


[Figure: A Cisco MDS 9509 and MDS 9216 joined by a Cisco PortChannel, a Brocade 3800 and 12000 joined by a Brocade trunk, and four FSPF equal-cost paths between the Cisco and Brocade switches.]
- PortChannels are proprietary and are not supported between Cisco MDS switches and other vendors' switches
- Not compatible with other vendors' trunking
- Standard ISL flow control must be configured on the Brocade switch

Interoperability with PortChannels


Brocade's trunking feature is comparable to Cisco PortChannels, but Cisco PortChannels and Brocade trunking are not supported between MDS and Brocade switches. Brocade uses a proprietary flow control technique called VC (Virtual Channel) flow control. When an ISL comes up between an MDS switch and a Brocade switch, the ISL is negotiated during ELP for standards-based buffer-to-buffer flow control: the MDS rejects Brocade's proprietary VC flow control protocol and negotiates standards-based buffer-to-buffer flow control instead. All Brocade-to-Brocade ISLs can still use Brocade's VC flow control protocol, and MDS-to-MDS ISLs can still be trunking EISLs and PortChannels.


Best Practices
- Use PortChannels wherever possible
- Place single ISLs in a PortChannel (non-disruptive scalability)
- Configure links on different switching modules for redundancy and high availability
- Use the same Channel_ID at both ends
  - Not a requirement, but it makes management easier
- Ensure that each end terminates at a single switch
  - This is a requirement
- Quiesce links before removing them from a PortChannel
  - Avoids traffic disruption
  - Not needed from SAN-OS 2.0 onwards
- Use the in-order-guarantee feature only when required

Best Practices
Use PortChannels whenever possible. PortChannels:

- Reduce CPU usage from the levels required to maintain multiple neighbors
- Provide an independent recovery mechanism, faster than FSPF
- Are completely transparent to upper-layer protocols
- Can be nondisruptively scaled by adding links

Follow these guidelines when implementing PortChannels:

- Spread PortChannel links across different switching modules. As a result, should a switching module fail, the PortChannel can continue to function as long as at least one link remains functional.
- Try to use the same Channel_ID at both ends of the PortChannel. While the PortChannel number is only locally significant, this practice helps identify the PortChannel more easily within the fabric.
- PortChannels are point-to-point logical links. Ensure that all links in a PortChannel connect the same two switches or directors.
- In order to prevent frame loss, it is best to quiesce a link before disabling it from a PortChannel.
- When difficulties arise with configuring PortChannels, the problem is often the result of inconsistently configured links. All links within the PortChannel require the same attributes for the PortChannel to come up. Use the show port-channel consistency detail command to identify link configuration inconsistencies (see the example after this list).


- Use the in-order-delivery feature only when necessary. In-order-delivery adds latency because it deliberately holds frames in the switch. It also consumes more switch memory, because it stacks the frames at the egress port.
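For reference, these are the kinds of show commands that can be used to verify a PortChannel and spot inconsistently configured member links; the output produced by a real switch depends on the configuration and SAN-OS release.

    switch# show port-channel summary               ! one-line status per PortChannel
    switch# show port-channel database              ! member interfaces and their state
    switch# show port-channel consistency detail    ! highlights mismatched link attributes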


Intelligent Addressing
Dynamic FCID Assignment Problems
[Figure: A host with WWN1 logs into the fabric and the switch directory server assigns FCID1 to WWN1; after an event such as a port change, host reboot, or switch reboot, the same WWN1 is assigned a different FCID2.]
Non-Cisco FCID assignment:
- Dynamically assigned by default
- Can change if a device is removed and re-added to the fabric
- Can change if the switch Domain ID changes

FCID Assignment Problems


FCIDs are normally assigned dynamically by an FC switch when devices (N_Ports), including hosts, disks, and tape arrays, log into the fabric. FCIDs can therefore change as devices are removed from and added to the fabric. This diagram shows a simplified depiction of a host logging into the fabric and receiving an FCID from the switch:

After the N_Port has established a link to its F_Port, the N_Port obtains a port address by sending a Fabric Login (FLOGI) Link Services command to the switch Login Server (at Well-Known Address 0xFFFFFE). The FLOGI command contains the WWN of the N_Port in the payload of the frame. The Login Server sends an Accept (ACC) reply that contains the N_Port address in the D_ID field. The initiator N_Port then contacts the target N_Port using the FCID of the target.

In the event of a port change, host reboot or switch reboot, previous FCID assignments have the potential to change.


FCID Target Binding


Class    I   H/W Path               Driver    S/W State  H/W Type   Description
fc       0   0/1/2/0                td        CLAIMED    INTERFACE  HP Mass Storage Adapter
fcp      1   0/1/2/0.1              fcp       CLAIMED    INTERFACE  FCP Domain
ext_bus  3   0/1/2/0.1.19.0.0       fcparray  CLAIMED    INTERFACE  FCP Array Interface
target   6   0/1/2/0.1.19.0.0.0     tgt       CLAIMED    DEVICE
disk     3   0/1/2/0.1.19.0.0.0.0   sdisk     CLAIMED    DEVICE     HP OPEN-8  /dev/dsk/c4t0d0 /dev/rdsk/c4t0d0
disk     10  0/1/2/0.1.19.0.0.0.7   sdisk     CLAIMED    DEVICE     HP OPEN-8  /dev/dsk/c4t0d7 /dev/rdsk/c4t0d7
target   7   0/1/2/0.1.19.0.0.1     tgt       CLAIMED    DEVICE
disk     18  0/1/2/0.1.19.0.0.1.7   sdisk     CLAIMED    DEVICE     HP OPEN-9  /dev/dsk/c4t1d7 /dev/rdsk/c4t1d7

FCID target binding:
- HP/UX and AIX map block devices to FCIDs
- FCIDs are non-persistent by default
- Can jeopardize high availability

FCID Target Binding


Some operating systems, such as HP/UX v11.0 and AIX v4.3 and v5.1, map block devices (such as file systems) by default to the dynamically assigned FCIDs. As each Fibre Channel target device is attached to the operating system, the FCID is used as the identifier, not the WWN as in many other operating systems. The problem with the target-binding method employed by HP/UX and AIX is that FCIDs are dynamically assigned, non-persistent identifiers. There are several possible cases where a new FCID may be assigned to a storage device, thereby invalidating the binding held by a given server. These cases might involve a simple move of a storage device to a new port, or a port failure requiring the storage device to be moved to a different switch port. It could even be something as simple as a SAN switch being rebooted. All of these conditions may cause new FCIDs to be assigned to existing storage devices. A SAN designer must pay close attention to this detail when deploying HP/UX and AIX servers in a SAN, because this binding method can represent a significant risk to availability. IBM AIX v5.2 and later versions include a new feature called dynamic tracking of FC devices that can detect the change of a target FCID and remap the target without any intervention. However, AIX v4.3 and v5.1 do not possess this feature and are still widely used.


Intelligent Addressing Services


The Cisco MDS 9000 employs three intelligent addressing services:
FCID address caching
- A cache of FCID addresses is maintained by default in RAM
- Devices moved from one port to another within a switch retain the same FCID
- FCID assignments do not survive a system reboot
Persistent FCID allocation
- Stores FCIDs in NVRAM, enabling FCIDs to persist across switch reboots
- Reduces the management complexity and availability risks associated with some HP/UX and AIX servers
- Persistent FCIDs can be selectively added or purged
- Enabled by default from SAN-OS 2.0
Static FCID assignment
- Allows greater administrative control over FCID assignment
- The area and port octets in the FCID can be manually configured
- Requires static Domain ID assignment
- Useful when migrating HP-UX and AIX servers

Intelligent Addressing Services


The Cisco MDS 9000 Family of Multilayer Directors and Fabric Switches delivers several intelligent addressing services that reduce the complexity and eliminate the availability risk associated with deploying HP/UX and AIX servers in a Fibre Channel SAN. All of these services are included in the base MDS 9000 software feature set at no additional cost and work together to give the SAN designer several options when designing the SAN addressing scheme.

Cisco MDS 9000 switches use a default addressing mechanism that assigns FCIDs in sequential order. An MDS switch maintains an active cache of assignments based on the WWN of the FCID receiver. This active cache is used to reassign the same FCID to a device even after it temporarily goes offline or is moved to another port in the switch. The cache mechanism is active at all times and is enabled by default. When a device is moved from one port to another port on the same switch, the device is automatically assigned the same FCID. This capability allows storage devices that are being used by HP/UX and AIX hosts to be easily moved to other ports on the switch as necessary and be assured of the same FCID assignment. For example, if a switch port or SFP failure were to occur, the device connection could simply be moved to another port and would assume the same FCID. There is no pre-configuration required to use this feature; it is enabled by default.

The FCID address cache is maintained in switch dynamic memory, so when a switch is powered off, the cache assignments are not maintained. To address this issue, Cisco has also provided the ability to assign persistent FCID addresses to connected devices. The persistent FCID allocation feature assigns FCIDs to devices and records the binding in non-volatile memory. As new devices are attached to the switch, the WWN-to-FCID mapping is stored in persistent non-volatile memory.

This binding remains intact until it is explicitly purged by the switch administrator. The persistent FCID allocation option can be applied globally or for individual VSANs. This feature reduces the management complexity and availability risks associated with deploying HP/UX and AIX servers. The persistent FCID allocation feature is enabled on a per-VSAN basis, allowing different VSANs to have different addressing policies or practices.

The Cisco MDS 9000 Family also supports static FCID assignments. Using static FCID assignments, the area and port octets in the FCID are manually assigned by the administrator. This feature allows SAN administrators to use custom numbering or addressing schemes to divide the FCID domain address space among available SAN devices. It is particularly useful for customers who migrate from other vendors' switches, because they can retain the same FCIDs after migration. Because the Domain ID is the first octet of the FCID, the administrator must assign a static Domain ID to the switch in order to specify the entire FCID. Therefore, in order to statically assign FCIDs on a given switch, that switch must first be configured with a static Domain ID. The static FCID assignment feature is enabled on a per-VSAN basis, and static Domain IDs must be assigned on a per-VSAN basis for each switch in the VSAN. Static FCID assignment eases the migration of HP-UX and AIX servers from a legacy fabric to an MDS fabric: the MDS switches can be configured with the same FCIDs as the legacy fabric, eliminating the need to remap storage targets on the servers.
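The following is a minimal sketch of enabling persistent FCIDs and pinning a specific FCID to a device in one VSAN. The Domain ID, pWWN, and FCID values are hypothetical placeholders chosen for illustration.

    switch# config terminal
    switch(config)# fcdomain domain 64 static vsan 10         ! static Domain ID, required for static FCIDs
    switch(config)# fcdomain fcid persistent vsan 10          ! keep WWN-to-FCID bindings across reboots
    switch(config)# fcdomain fcid database
    switch(config-fcid-db)# vsan 10 wwn 50:06:04:82:bf:d0:11:22 fcid 0x400100   ! pin this target's FCID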


Flat FCID Assignment


Older HBAs use area bits for port IDs:
- The entire area is reserved for a single port; 255 addresses are wasted
- FCID layout: Domain (bits 23-16), Area (bits 15-08), Port (bits 07-00)
FCID flat assignment mode:
- Port and area bits are both used for port IDs: Domain (bits 23-16), Port IDs (bits 15-00)
- Increases the scalability of the FCID address space
FCID auto assignment mode (default):
- Auto-detects the capability of the HBA in most cases
- The MDS maintains a list of HBAs, identified by their Company IDs (OUIs), that require a complete area during fabric login
- Supports HBAs that use both area and port bits for port IDs
- Supports HBAs that use only area bits for port IDs
The Cisco MDS logically assigns FCIDs; they are not tied to the physical port.

Flat FCID Assignment


Non-Cisco FC switches tie the Area ID to the physical port on the switch. N_Ports are normally assigned an entire area with a Port ID of 00, which severely restricts the number of ports within a switch to 256 (the 8-bit area value). Cisco MDS switches assign Port IDs based upon the entire 16-bit value, thereby lifting the restriction of 256 ports per switch. Fibre Channel standards require a unique FCID to be allocated to each N_Port attached to an Fx_Port. Some HBAs assume that only the area octet will be used to designate the port number; in other words, such an HBA assumes that no two ports have the same area value. When a target is assigned an FCID that has the same area value as the HBA but a different port value, the HBA fails to discover the target. To isolate these HBAs in a separate area, switches in the Cisco MDS 9000 Family follow a different FCID allocation scheme depending on the addressing capability of each HBA. By default, the FCID allocation mode is auto. In auto mode, only HBAs without interoperability issues are assigned FCIDs with both area and port bits; this is known as flat FCID assignment. All other HBAs are assigned FCIDs with a whole area (port bits set to 0). However, in some cases it may be necessary to explicitly disable flat FCID assignment mode if the switch cannot correctly detect the capability of the HBA.


Distributed Device Alias Services (DDAS)


(SAN-OS 2.0)
- Simplify SAN configuration and management tasks
- User-friendly CLI/FM commands and outputs
- Fabric-wide distribution ensures no reconfiguration when a device is moved across VSANs
- Unique aliases minimize zone merge issues
- Human-readable names (aliases) replace cryptic WWNs
[Figure: Cryptic WWNs WWN1 = 12:22:67:92:86:92:15:34 and WWN2 = 02:12:35:86:93:08:64:43 become the human-readable aliases Alias1 = Server-Oracle-ERP and Alias2 = Array-OLTP.]

Distributed Device Alias Services


Distributed Device Alias Services (DDAS) simplifies SAN configuration and management tasks. User-friendly alias names can be employed in Fabric Manager and the CLI. By distributing device aliases fabric wide, no reconfiguration is required when a device is moved across VSANs. Zone merge issues are minimal when using unique aliases fabric wide. Future releases of SAN-OS will include the capability to dynamically assign aliases.
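A minimal sketch of defining device aliases for the two WWNs shown in the figure is given below; whether a commit is required depends on the device alias distribution mode in use.

    switch# config terminal
    switch(config)# device-alias database
    switch(config-device-alias-db)# device-alias name Server-Oracle-ERP pwwn 12:22:67:92:86:92:15:34
    switch(config-device-alias-db)# device-alias name Array-OLTP pwwn 02:12:35:86:93:08:64:43
    switch(config-device-alias-db)# exit
    switch(config)# device-alias commit          ! distribute the database fabric-wide with CFS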


N_Port Identifier Virtualization (NPIV)


(SAN-OS 3.0)
- Provides the ability to assign multiple port IDs to a single N_Port
- Multiple applications on the same port can use different IDs in the same VSAN
- Allows zoning and port security to be implemented at the application level
- Designed for virtual server environments
[Figure: A virtual server FC node running Email, Web, and Print applications shares a single FC HBA. Three FLOGIs through the N_Port controller to the F_Port on the MDS switch yield three N_Port IDs (mapped to LUNs 1-3), three FCIDs, three name server entries, and three virtual devices, all sharing a single FC port.]

N_Port Identifier Virtualization


Fibre Channel standards define that an FC HBA N_Port must be connected to one and only one F_Port on a Fibre Channel switch. When the device is connected to the switch, the link comes up and the FC HBA sends a FLOGI command containing its pWWN to the FC switch, requesting a Fibre Channel ID. The switch responds with a unique FCID based upon the Domain ID of the switch, the Area ID, and the Port ID. This is fine for servers with a single operating environment, but it is restrictive for virtual servers that may have several operating environments sharing the same FC HBA: each virtual server requires its own FCID. NPIV provides the ability to assign a separate FCID to each virtual server that requests one through its own FLOGI command.
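Enabling NPIV is a single global command from SAN-OS 3.0; the sketch below also shows one way the multiple logins on a single port might be verified. Exact output is release-dependent.

    switch# config terminal
    switch(config)# npiv enable          ! allow multiple FLOGIs (N_Port IDs) per F_Port
    switch(config)# exit
    switch# show flogi database          ! each virtual server appears as a separate login entry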


Cisco Fabric Services: Unifying the Fabric


Cisco Fabric Services (CFS)
- Protocol for configuration and discovery of fabric-wide services
- Distributes configuration information to all switches in the fabric
- Distribution is global to all CFS-enabled switches regardless of VSAN
- Communication is in-band over an FC link, or out-of-band over IP as a last resort
Benefits:
- Fast and efficient distribution
- Single point of configuration with fabric-wide consistency
- Plug-and-play SANs
- Session-based management
[Figure: The configuration for fabric-wide services is distributed by the CFS protocol to every switch in the fabric.]

The Cisco Fabric Services protocol distributes configuration information for WWN-based VSAN members, Distributed Device Alias Services, port security, Call Home, Network Time Protocol (NTP), AAA servers, Inter-VSAN Routing zones, syslog servers, role policies, and Fibre Channel timers to all switches in a fabric. From SAN-OS 3.0, CFS first attempts to distribute in-band over the FC EISLs between switches and, as a last resort, uses an out-of-band IP connection if one is available.
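The commands below sketch how CFS distribution state might be inspected and how CFS over IP could be enabled from SAN-OS 3.0; treat the exact output and availability as release-dependent.

    switch# show cfs status              ! CFS distribution enabled/disabled on this switch
    switch# show cfs application         ! which applications (ntp, device-alias, ivr, ...) use CFS
    switch# config terminal
    switch(config)# cfs ipv4 distribute  ! allow out-of-band CFS distribution over IPv4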


CFS Applications
- Consistent syslog server, Call Home, and Network Time Protocol (NTP) configuration throughout the fabric aids in troubleshooting and SLA compliance
- Distributed port security, RADIUS/TACACS+, and Role-Based Access Control (RBAC) information for simpler security management
- Fabric-wide VSAN timer and IVR topology information propagation from a single switch
- Distributed Device Alias Service (DDAS) allows fabric-wide aliases, simplifying SAN administration
[Figure: CFS at the center, distributing VSAN timers, IVR, DDAS, syslog, Call Home, NTP, port security, RADIUS, and RBAC configuration.]

CFS Applications
The Cisco Fabric Services protocol aids in the administration, management, and deployment of configuration settings SAN-wide. Consistent syslog server, Call Home, and NTP configuration throughout the fabric aids in troubleshooting and SLA compliance. CFS-distributed port security, RADIUS/TACACS+, and RBAC information enhances and simplifies security by providing consistent and comprehensive security settings. Fabric-wide IVR and VSAN timer information propagated from a single switch via CFS provides uniformity across the fabric. Fabric-wide distributed device aliasing simplifies SAN administration by providing consistent names for devices throughout the fabric, based upon the pWWN, regardless of VSAN.


Switch Interoperability
Overview of Switch Interoperability

[Figure: Proprietary features that do not interoperate between vendors: MDS 9000 EISL and PortChannels, Brocade VC flow control, and McData Open Trunking.]
Switches utilize their proprietary feature sets, so different vendors' switches often cannot interoperate with each other. Cisco MDS switches support five modes:
- Cisco Native Mode: supports all Cisco proprietary features
- Interop Mode 1: FC-SW-2 compatible with all other vendors
- Interop Mode 2: legacy Brocade support for 16-port switches
- Interop Mode 3: legacy Brocade support for larger switches
- Interop Mode 4 (SAN-OS 3.0): legacy McData support

Interoperability allows devices from multiple vendors to communicate across a SAN fabric. Fibre Channel standards (e.g., Fibre Channel Methodologies for Interconnect, FC-MI 1.92) have been put in place to guide vendors towards common external Fibre Channel interfaces. If all vendors followed the standards in the same manner, then interconnecting different products would become a trivial exercise. However, some aspects of the Fibre Channel standards are open to interpretation and include many options for implementation. In addition, vendors have extended the features laid out in the standards documents to add advanced capabilities and functionality to their feature set. Since these features are often proprietary, vendors have had to implement interoperability modes to accommodate heterogeneous environments.


Standard Interoperability Mode 1


[Figure: An MDS 9509 in a fabric with a Brocade 2800, Brocade 3900, McDATA 6064, and CNT/Inrange FC/9000.]
- Standard Interop mode (Interop mode 1) requires all other switches in the fabric to be in Interop mode 1
- Enables MDS 9000 switches to interoperate with McData, Brocade, and QLogic switches that are FC-SW-2 compatible
- Reduces the feature set supported by all switches
- Requires rebooting of third-party switches
- Can require a disruptive restart of an MDS VSAN
- Interop modes affect only the VSAN for which they are configured

Interop Mode 1
The standard interoperability mode (Interop mode 1) enables the MDS to interoperate with third party switches that have been configured for interoperability. Interop 1 mode allows the MDS to communicate over a standard set of protocols with these switches. In Interop mode 1, the feature set supported by vendors in standard interoperability mode is reduced to a subset that can be supported by all vendors. This is the traditional way vendors achieve interoperability. Most non-Cisco switches require a reboot when configured into standard interoperability mode. On Cisco switches, Interop mode is set on a VSAN rather than the whole switch. As a result, an individual VSAN may need to be restarted disruptively to implement interop 1 mode, but the entire switch does not require a reset.
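Because interop mode is a per-VSAN attribute on the MDS, enabling it can look like the sketch below; the VSAN number is a placeholder, and the suspend/no suspend step illustrates the potentially disruptive restart of that one VSAN mentioned above.

    switch# config terminal
    switch(config)# vsan database
    switch(config-vsan-db)# vsan 100 interop 1     ! modes 2, 3, and 4 are selected the same way
    switch(config-vsan-db)# vsan 100 suspend       ! disruptive restart of VSAN 100 only
    switch(config-vsan-db)# no vsan 100 suspend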


Legacy Interoperability Modes


Legacy interop modes allow the MDS 9000 to integrate seamlessly with Brocade and McData switches that are running in native mode: they do not restrict the use of proprietary features and do not require a reboot of the switch.
Interop Mode 2 - supports Brocade switches with 16 or fewer ports (e.g. 2100/2400/2800/3200); native core PID format = 0
Interop Mode 3 - supports Brocade switches with more than 16 ports (e.g. 3900/12000/24000); native core PID format = 1
Interop Mode 4 (SAN-OS 3.0) - supports McData switches and directors (e.g. McData 6140)

Legacy Interop Modes


MDS switches can operate in legacy modes that allow integration into existing Brocade or McData fabrics without rebooting or reconfiguring the switches or a reduction of the feature set.

Interop Mode 2: This mode allows seamless integration with specific Brocade switches (2100/2400/2800/3800 series) running in their own native mode of operation. Interop Mode 2 enables MDS switches to interoperate with older Brocade switches that use the restrictive core PID format (Core PID = 0), which permits only 16 devices per domain. This format is common in Brocade fabrics that do not contain a 3900 or 12000 switch.

Interop Mode 3: This mode allows seamless integration with specific Brocade switches (3900 and 12000) running in their own native mode of operation. Interop Mode 3 enables MDS switches to interoperate with newer Brocade switches that use the less restrictive core PID format (Core PID = 1), which permits up to 256 devices per domain. This format is common in Brocade fabrics containing at least one 3900, 12000, or later model. In such fabrics, all remaining 2800/3800 switches must be explicitly set to Core PID = 1, which is a disruptive operation requiring a reboot of every such switch.

Interop Mode 4: This mode allows seamless integration with McData switches and directors running in their own native mode of operation.
Note: Brocade Fabric switches with port counts higher than 16 (models 3900 and 12000) require that the core PID value be set to 1. Earlier models, with 16 or fewer ports, set the core PID to 0. These older Brocade switches allocated one nibble of the FCID/PID area field (0x0-0xF) for port numbers, limiting the port count to 16. When the core PID is set to 1, the allocated byte in the FCID/PID allows the use of port numbers 0x00-0xFF.


Inter-VSAN Routing and Interop Modes


Slide diagram: an MDS IVR edge switch joins a Brocade fabric in native mode (backup server, Brocade switches, and storage array in VSAN 100, Interop Mode 2 or 3) to a McData fabric in native mode (McData director and tape library in VSAN 200, Interop Mode 4).
Use IVR to seamlessly back up data in a Brocade fabric to a tape library in a McData fabric
IVR is supported by MDS 9100, 9200, and 9500 switches and is included in the Enterprise license package
Enables true SAN consolidation of storage and tape devices across the enterprise

Using IVR with Interop Modes


This example shows how VSANs and IVR can effectively be used to allow a backup server in a Brocade fabric to seamlessly back up data to a tape library in a McData fabric. The Brocade switches are connected to an MDS interface in VSAN 100, which is placed in Interop Mode 2 or 3, depending on the Brocade switch model and core PID type. The McData director is connected to an MDS interface in VSAN 200, which is placed in Interop Mode 4. IVR is fully compliant with Fibre Channel standards and is completely transparent to the transfer of frames from one fabric to another. This unique feature of Cisco MDS switches allows, for the first time, different vendors' fabrics to be joined in a SAN without having to disruptively put each switch in Interop Mode 1. This enables true SAN consolidation of storage and tape devices across the enterprise.
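The following hedged sketch outlines the IVR zoning steps for this example; the zone and zone set names and pWWNs are placeholders, and the IVR VSAN topology must also be defined (manually or with auto-topology) before the zone set is activated:

switch# config terminal
switch(config)# ivr enable
switch(config)# ivr zone name TapeBackup
switch(config-ivr-zone)# member pwwn 21:00:00:e0:8b:01:02:03 vsan 100   (backup server HBA)
switch(config-ivr-zone)# member pwwn 50:06:04:82:bf:d0:54:32 vsan 200   (tape library port)
switch(config-ivr-zone)# exit
switch(config)# ivr zoneset name IVR_Backup
switch(config-ivr-zoneset)# member TapeBackup
switch(config-ivr-zoneset)# exit
switch(config)# ivr zoneset activate name IVR_Backup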


Lesson 5

Remote Lab Overview


Overview
This lesson provides an introduction to the hands-on labs in this course.

Objectives
Upon completing this lesson, you will be familiar with MDS 9000 management. This includes being able to meet these objectives:

Identify the system memory areas of the MDS 9000 supervisor
Describe the features of the MDS 9000 CLI
Describe the basic features of Cisco Fabric Manager and Device Manager
Explain how to perform the initial configuration of an MDS 9000 switch
Access the MDS 9000 remote storage labs

System Memory Areas


System Memory Areas
Bootflash: (internal flash) - Kickstart image, System image, license files
Slot0: (external flash)
System: (RAM) - Linux system space, SAN-OS, running-config
NVRAM: - boot parameters (Kickstart + System), startup-config (copy run start copies the running-config here)
Volatile: - temporary file space
Log: - logfile
Modflash: (SSM flash)

The Bootflash: contains the Kickstart and System images used for booting the MDS. All config changes made by the CLI or FM/DM are instantly active and held in the running-config. #copy run start saves the running-config to the startup-config in NVRAM. The startup-config is loaded when the switch is rebooted. Temporary files may be stored in the Volatile: system area.

The Cisco MDS contains an internal Bootflash: used for holding the current bootable images, Kickstart and System. License files are also stored here, but the Bootflash: can also be used for storing any file, including copies of the startup-config. MDS 9500 supervisors also have an external flash memory slot called Slot0: that is used for transferring image files between switches. The SSM linecard also contains an internal Modflash: used for storing application images.

The system RAM memory is used by the Linux operating system, and a Volatile: file system is used for storing temporary files. Any changes made to the switch operating parameters or configuration are instantly active and held in the running-configuration in RAM. All data stored in RAM is lost when the MDS is rebooted, so an area of non-volatile RAM (NVRAM) is used for storage of critical data. The most critical of these is the running-configuration for the switch. The running-configuration should be saved to the startup-configuration in NVRAM using the CLI command #copy run start, so that the configuration is preserved across a switch reboot.

During the switch boot process, it is essential that the switch knows where to find the Kickstart and System images and what they are called. Two boot parameters held in NVRAM point to these two files.
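For example, the following EXEC-mode commands (the backup file name is a placeholder) save the configuration, make a backup copy on Bootflash:, and display the stored boot parameters:

switch# copy running-config startup-config
switch# copy startup-config bootflash:backup-config.txt
switch# dir bootflash:
switch# show boot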


Boot Sequence
Both the Kickstart image and the System image need to be present for a successful boot. The boot parameters point to the location of both the Kickstart and System images; the boot process will fail if the parameters are wrong or the images are missing. The install command simplifies the process and checks for errors.

BIOS - runs POST, loads the Bootloader
Bootloader - gets the Kickstart boot parameters, verifies the Kickstart image and loads it (loader> prompt)
Kickstart - loads the Linux kernel and drivers, gets the System boot parameters, verifies the System image and loads it (switch(boot)# prompt)
System - loads SAN-OS, checks the file systems, loads the startup-config (switch# prompt)

The NVRAM boot parameters point to the images on Bootflash: (internal flash), e.g. system30.img and kickstart30.img:
#boot system bootflash:system30.img
#boot kickstart bootflash:kickstart30.img

Boot Sequence
When the MDS is first switched on, or during a reboot, the system BIOS on the Supervisor module first runs POST (power-on self test) diagnostics and then loads the Bootloader bootstrap function. The boot parameters are held in NVRAM and point to the location and name of both the Kickstart and System images. The Bootloader obtains the location of the Kickstart file, usually on Bootflash:, and verifies the Kickstart image before loading it. The Kickstart loads the Linux kernel and device drivers and then needs to load the System image. Again, the boot parameters in NVRAM should point to the location and name of the System image, usually on Bootflash:. The Kickstart then verifies the System image and loads it. Finally, the System image loads the SAN-OS, checks the file systems, and proceeds to load the startup-config, containing the switch configuration, from NVRAM.

If the boot parameters are missing or specify an incorrect name or location, the boot process will fail at that stage. If this happens, the administrator must recover from the error and reload the switch. The #install all command is a script that greatly simplifies the image installation procedure and checks for errors and the upgrade impact before proceeding.
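A minimal sketch of setting the boot parameters manually, using the image names from the figure above, and of the preferred install all method (verify the exact image file names for your release):

switch# config terminal
switch(config)# boot kickstart bootflash:kickstart30.img
switch(config)# boot system bootflash:system30.img
switch(config)# exit
switch# copy running-config startup-config
switch# install all kickstart bootflash:kickstart30.img system bootflash:system30.img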


CLI Overview
CLI Overview
Command-Line Interface (CLI)
Multiple connection options and protocols:
Direct console or serial link (VT100)
Secure shell access, SSH (encrypted)
Terminal Telnet, TCP/IP over Ethernet or Fibre Channel


There are multiple connection options and protocols available to manage the MDS 9000 Family switches via the CLI. The initial configuration must be done using VT100 console access, which can be a direct connection or a serial link connection such as a modem. Once the initial configuration is complete, you can access the switch using either Secure Shell or Telnet. The Secure Shell (SSH) protocol provides a secure, encrypted means of access. Terminal Telnet access involves a TCP/IP out-of-band (OOB) connection through the 10/100-Mbps Ethernet management port or an in-band connection via IP over FC.

You can access the MDS 9000 Family of switches for configuration, status, or management through the console port, or initiate a Telnet session through the OOB Ethernet management port or through the in-band IP over FC management feature. The console port is an asynchronous port with a default configuration of 9600 bps, 8 data bits, no parity, and 1 stop bit. This port is the only means of accessing the switch after the initial power up until an IP address is configured for the management port. Once an IP address is configured, you can Telnet to the switch through the Mgmt0 management interface on the supervisor card. In-band IP over FC is used to manage remote switches through the local Mgmt0 interface.
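As an illustration, a minimal sketch of the commands used to bring up the management interface and enable SSH and Telnet access (addresses are placeholders; verify against the configuration guide for your release):

switch# config terminal
switch(config)# interface mgmt 0
switch(config-if)# ip address 10.1.1.2 255.255.255.0
switch(config-if)# no shutdown
switch(config-if)# exit
switch(config)# ip default-gateway 10.1.1.1
switch(config)# ssh key rsa 1024       (generate a key before enabling SSH)
switch(config)# ssh server enable
switch(config)# telnet server enable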


CLI Features
Structured hierarchy, easier to remember
Style consistent with IOS software
Commands may be abbreviated
Help facility:
Context-sensitive help (?)
Command completion (Tab)
Command history buffer (arrow keys)
Console error messages
Command Scheduler with support for running shell scripts
Support for command variables and aliases
Configuration changes must be explicitly saved before reboot:
copy running-config startup-config (abbreviated to: copy run start)

CLI Features
The CLI enables you to configure every feature of the switch. More than 1700 combinations of commands are available and are structurally consistent with the style of Cisco IOS software CLI. The CLI help facility provides:

Context-sensitive help - provides a list of commands and associated arguments. Type ? at any time, or type part of a command and then ?.
Command completion - the Tab key completes the keyword you have started typing.
Console error messages - identify problems with any switch commands that are incorrectly entered so that they may be corrected or modified.
Command history buffer - allows recall of long or complex commands or entries for reentry, review, or correction.
MDS Command Scheduler - provides a UNIX cron-like facility in the SAN-OS that allows the user to schedule a job at a particular time or periodically.

Configuration changes must be explicitly saved, and configuration commands are serialized for execution across multiple SNMP sessions. To save the configuration, enter the copy running-config startup-config command to save the new configuration into nonvolatile storage. Once this command is issued, the running and startup copies of the configuration are identical. Every configuration command is logged to the RADIUS server.


CLI Modes
EXEC mode (switch# prompt):
Show system information, run debug, copy and delete files, get directory listings for bootflash: and slot0: (e.g. dir, copy, show flogi, show fcns, debug fspf)

Configuration mode (config terminal):
Configure features that affect the switch as a whole (e.g. interface, fcdomain, zoneset)

Configuration submode:
Configure switch sub-parameters (e.g. the interface fc, fcip, iscsi, mgmt, and port-channel submodes with commands such as switchport, shut, and no shut, or the fcdomain and zoneset database submodes)

Use exit to move back one level and end to return directly to the switch# prompt.

CLI Modes
Switches in the MDS 9000 Family have three command mode levels:

User EXEC mode Configuration mode Configuration submodes

The commands available to you depend on the mode that you are in. To obtain a list of available commands, type a ? at the system prompt. From the EXEC mode, you can perform basic tests and display system information. This includes operations other than configuration such as show and debug. Show commands display system configuration and information. Debug commands enable printing of debug messages for various system components. Use the config or config terminal command from EXEC mode to go into the configuration mode. The configuration mode has a set of configuration commands that can be entered after a config terminal command, in order to set up the switch. The CLI commands are organized hierarchically, with commands that perform similar functions grouped under the same level. For example, all commands that display information about the system, configuration, or hardware are grouped under the show command, and all commands that allow you to configure the switch are grouped under the config terminal command, which includes switch sub-parameters at the configuration submode level. To execute a command, you enter the command by starting at the top level of the hierarchy. For example, to configure a Fibre Channel interface, use the config terminal command. Once you are in configuration mode, issue the interface command. When you are in the interface submode, you can query the available commands there.
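For example, a short session (the interface number is a placeholder) that moves from EXEC mode into the interface configuration submode and back:

switch# config terminal
switch(config)# interface fc1/1
switch(config-if)# switchport speed 2000
switch(config-if)# no shutdown
switch(config-if)# end
switch# show interface fc1/1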

Useful CLI Commands


# copy run start - saves the active config in NVRAM
# dir bootflash: - lists files stored on bootflash:
# erase bootflash:temp - erases a file stored on bootflash:
# copy slot0:tmp bootflash:temp.txt - copies a file and changes the name
# debug flogi - monitors all FLOGI operations
# no debug all - switches off debug
# show tech-support - gathers switch info for support
# show tech-support > tempfile - saves the output in volatile:tempfile
# gzip volatile:tempfile - compresses tempfile
# copy volatile:tempfile slot0:temp - copies the file to the external flash card
# config t - enters config mode to change settings
(config)# int fc x/y - configures a specific interface
(config-if)# switchport speed 1000 - configures the port as a 1-Gbps port

Useful CLI Commands


The top part of the list shows useful commands that can be entered in EXEC mode. Changes to the configuration can only be made by entering configuration mode first and then entering the appropriate commands. More information can be found in the Cisco MDS Command Reference guide.


Useful CLI Show Commands


# show environment power - check power ratings
# show interface brief - summary of all interfaces
# show interface fc x/y - detailed info about an interface
# show module - detailed status of all modules
# show hardware - detailed hardware status
# show version - view current software versions
# show license usage - list installed licenses and their status
# show running-config - view active switch settings
# show vsan - lists all created VSANs
# show vsan membership - lists interfaces by VSAN
# show zoneset active - shows all active zones and zonesets
# show flogi database - lists all devices logged in to the MDS
# show fcns database - lists all name server entries


Useful CLI Show Commands


The show commands are too extensive to list individually so here are some common ones. More information can be found by looking in the Cisco MDS Command Reference guide.


Command Aliases
Replaces complex command strings with an alias name (introduced in SAN-OS 3.0)
Command aliases persist across reboots
Commands being aliased must be typed in full without abbreviation
A command alias always takes precedence over CLI keywords

(config)# cli alias name gigint interface gigabitethernet

Command Aliases
Some commands can require a lot of typing. An example of this is gigabitethernet that can sometimes be shortened to gig, but it is sometimes useful to group several commands and subcommands together. This can be done using Command Aliases. Command Aliases are saved in NVRAM so can persist across reboots. When creating an alias, the individual commands must be typed in full without abbreviation. If you define an alias, it will take precedence over CLI keywords starting with the same letters, so be careful when using abbreviations.
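For example, using the cli alias name command shown above, a long show command can be shortened (the alias name is arbitrary):

switch# config terminal
switch(config)# cli alias name shflogi show flogi database
switch(config)# end
switch# shflogi        (now runs show flogi database)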


Command Scheduler
Helps schedule configuration and maintenance jobs on any MDS switch (introduced in SAN-OS 2.0)
Schedule jobs on a one-time basis or periodically:
One-time mode - the job is executed once at a pre-defined time
Periodic mode - the job is executed daily, weekly, monthly, or at a configurable (delta) interval
The MDS date and time must be accurately configured
Scheduled jobs may fail if an error is encountered, for example if a license has expired or a feature is disabled
All jobs are executed non-interactively

Command Scheduler
The Cisco MDS SAN-OS provides a UNIX cron-like facility called the Command Scheduler. Jobs can be defined listing several commands that are to be executed in order. Jobs can be scheduled to run at the same time every day, week, or month, or at a configurable interval (delta). All jobs are executed non-interactively, without administrator response. Be aware that a job may fail if a command that it issues is disabled or no longer supported, for example because a license has expired. The job will fail at the point of error, and all subsequent commands will be ignored.
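A minimal scheduler sketch (the job name, schedule name, and time are examples; verify the syntax against the command reference) that saves the configuration every night:

switch# config terminal
switch(config)# scheduler enable
switch(config)# scheduler job name save-config
switch(config-job)# copy running-config startup-config
switch(config-job)# exit
switch(config)# scheduler schedule name nightly
switch(config-schedule)# job name save-config
switch(config-schedule)# time daily 23:00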


Fabric Manager and Device Manager


Installing Fabric Manager
Point a browser at the management IP address of the MDS switch assigned at initial setup
Click the link to Install Fabric Manager (requires Java Web Start)
Installs Java applets from the web server running on the MDS switch
Follow the prompts

FM Server and Performance Manager run as Windows services (or as daemons on Unix)

Installing Fabric Manager


The Fabric Manager is an SNMP-based device management application with a Java web-based GUI used to view and configure multiple MDS 9000 Family director and fabric switches. The software is downloaded automatically to the end user's (management) workstation using Java Web Start. Secure SNMPv3 communications are used to get and set switch parameters.

Open a browser window and, in the address bar, enter the IP address of the switch you wish to manage. The MDS switch will respond with a single web page from the MDS web server. If the Java Runtime Environment with Java Web Start has not already been installed, the web page will include a link in red pointing to the Sun website for downloading Java. The Cisco Fabric Manager web page contains two links for downloading the Java applets for Fabric Manager and Device Manager. Click on the Install Fabric Manager link and follow the on-screen prompts.

Cisco Fabric Manager is used to show the entire fabric containing all the switches, hosts, and storage devices. Cisco Device Manager is used to manage a single switch. To open Device Manager, double-click the green icon for a switch in the Fabric Manager topology view.


Fabric Manager Features


Configuration, Diagnosis, Monitoring
Switch-embedded Java application, downloaded via web browser, installed and updated automatically by Java Web Start, runs on the client's workstation
SNMPv3 secures communications with switches
Discovers the FC fabric and displays a topology map
Enables rapid multi-switch configuration
Summary view including in-line bar charting, sorting, and drill-down capabilities
Chart, print, or output to file
Views: Fabric View, Device View, Summary View

Fabric Manager Features


The Fabric Manager provides three management views and a Performance Manager traffic analysis interface. The Fabric View displays a map of your network fabric, including Cisco MDS 9000 switches, hosts, and storage devices. The Device View displays a graphic representation of the switch configuration and provides access to statistics and configuration information for a single switch. The Summary View displays a summary of xE_Ports (interswitch links), Fx_Ports (fabric ports), and Nx_Ports (attached hosts and storage) on a single switch. For more detailed information, Performance Manager included with the Fabric Manager Server license provides detailed traffic analysis by capturing data with the Cisco Port Analyzer Adapter. This data is compiled into various graphs and charts which can be viewed with any web browser.


Fabric Manager Features


FM + DM = CLI - Debug
All features supported: VSANs, zones, PortChannels, RMON alerts, event filters, SNMP users and roles
Real-time and historical statistics monitoring
Fabric troubleshooting and analysis tools: switch health, end-to-end connectivity, configuration, zone merge, traceroute
Device View provides status at a glance: fan, power, supervisor, and switching module status indicators; port status indicators
Views: Fabric View, Device View, Summary View

Fabric Manager discovers network devices and creates a topology map with VSAN and zone visualization. VSAN/zone and switch trees are also available to simplify configuration. Immediately after the Fabric View is opened, the discovery process begins. Using information gathered from a seed switch (MDS 9000 Family), including name server registrations and FC-GS-3 fabric configuration server information, the Fabric Manager can draw a fabric topology in a user-customizable map. Because of the source of this information, any third-party devices, such as other fabric switches that support the FC-GS and FC-GS-3 standards, are discovered and displayed on the topology map. Vendor Organizational Unique Identifier (OUI) values are translated to derive the manufacturer of third-party devices, such as QLogic Corp., EMC Corp., or JNI Corp.

Fabric Manager provides an intuitive user interface to a suite of network analysis and troubleshooting tools. One of those tools is the Device Manager, which is a complementary graphical user interface designed for configuring, monitoring, and troubleshooting specific switches within the SAN fabric.


System Setup and Configuration


Initial Setup
The initial setup routine is performed via a connection to the switch console port (console port parameters: 9600 8-N-1). The initial setup routine prompts you for the IP address and other configuration information necessary for the switch to communicate over the management interface.

Switch# setup
This setup utility will guide you through the basic configuration of the system. Setup configures only enough connectivity for management of the system.
Press Enter in case you want to skip any dialog. Use ctrl-c at anytime to skip away remaining dialogs.
Would you like to enter the basic configuration dialog (yes/no): yes


Initial Setup
The first time that you access a switch in the Cisco MDS 9000 Family, it runs a setup program that prompts you for the IP address and other configuration information necessary for the switch to communicate over the supervisor module Ethernet interface. This information is also required if you plan to configure and manage the switch. The IP address must first be set up in the CLI when the switch is powered up for the first time, so that the Cisco MDS 9000 Fabric Manager can reach the switch.

The console needs a rollover RJ-45 cable. There is a switch on the supervisor module of the MDS 9500 Series switches that, if placed in the out position, will allow the use of a straight-through cable. The switch is shipped in the in position by default and is located behind the LEDs.

In order to set up a switch for the first time, you must obtain the administrator password, which is used to get network administrator access through the CLI. The Simple Network Management Protocol version 3 (SNMPv3) user name and password are used when you log on to the Fabric Manager and should be identified as soon as possible. The switch name will become the prompt when the switch is initialized, and the management Ethernet port IP address and subnet mask need to be known for out-of-band access.


Setup Defaults
Configure default switchport interface state (shut/noshut) [shut]:
Configure default switchport trunk mode (on/off/auto) [on]:
Configure default zone policy (permit/deny) [deny]:
Enable full zoneset distribution (yes/no) [n]:
Enable FCID persistence in all the VSANs on this switch (yes/no) [n]:
Would you like to edit the configuration (yes/no) [no]:
Use this configuration and save it? (yes/no) [y]:

The management interface is active at this point All Fibre Channel and Gigabit Ethernet interfaces are shut down Select yes to use and save the configuration The Setup Routine can be accessed from the EXEC mode of the CLI with the # setup command


Setup Defaults
It is recommended to have the switch interfaces come up administratively disabled, or shut. This approach ensures that the administrator has to configure each interface as needed and then enable it with the no shut command, resulting in a more controlled environment. Switch trunk port mode should be on; two connected E_Ports will not trunk if one end has trunk mode off. The default zoning policy of deny will make all interfaces on a switch inoperable until a zone is created and activated; interfaces in the default zone cannot communicate with each other. This policy can be used for greater security. If the permit policy is enabled, then all ports in the default zone will be able to communicate with each other.

The system will ask if you would like to edit the configuration that was just printed out. Any configuration changes made to a switch are immediately enforced but are not saved. If no edits are needed, you will be asked if you want to use this configuration and save it as well. Since [y] (yes) is the default selection, pressing Return will activate this function, and the configuration becomes part of the running-config and is copied to the startup-config. This also ensures that the kickstart and system boot images are automatically configured, so you do not have to run a copy command after this process. A power loss will restart the switch using the startup-config, which has everything saved that has been configured to non-default values. If you do not save the configuration at this point, none of your changes will be preserved the next time the switch is rebooted.


Using the MDS 9000 Remote Storage Labs


MDS 9000 Remote Storage Labs
24x7x365 support for training events and customer demos
Full console and desktop access
30 student pods, 60 MDS 9000 switches
Real live equipment, not a simulation


The labs are located in Nevada and used extensively, 24 hours a day during MDS lab based training courses throughout the world. The lab interface provides login authentication, and full access to switch consoles and desktop access on each server. The labs currently contain over 30 pods each containing two MDS switches, two servers, JBOD storage and PAA for diagnostics.


Remote Storage Lab Interface

Point a browser at www.labgear.net
Enter Username and Password
Click Console to access the MDS CLI
Click Desktop to access the W2K server


Remote Storage Lab Interface


Open a browser and in the address bar type www.labgear.net. This will open the main window to authenticate your session. Enter your labgear pod username and password; these will be assigned to you by your instructor. To access the MDS switch console, click on the green Console button. To access the desktop of either of the Win2K servers, click on the Desktop button and enter the username and password (username: administrator, password: cisco).


CSDF Labs
Lab 1: Initial Switch Config
Lab 2: Accessing Disks via Fibre Channel
Lab 3: Configuring High Availability SAN Extension
Lab 4: Configuring IVR for SAN Extension
Lab 5: Exploring Fabric Manager Tools
Lab 6: Implementing iSCSI


This slide shows the labs that you will perform in this course.


Lesson 6

Network-Based Storage Applications


Overview
This lesson explains how the MDS 9000 Storage Services Module (SSM) enables network-based storage applications.

Objectives
Upon completing this lesson, you will be able to explain how the MDS 9000 Storage Services Module (SSM) enables network-based storage applications. This includes being able to meet these objectives:

Explain the basics of SAN-based storage virtualization
Explain the value of network-based storage virtualization
Describe the network-hosted application services supported by the SSM
Describe the network-assisted application services supported by the SSM
Describe the network-accelerated application services supported by the SSM
Describe Fibre Channel Write Acceleration

Storage Virtualization Overview


Storage Services for SANs Today
Slide diagram: today's storage services are split between server-side functions (volume management; mirroring, striping, concatenation, and slicing coordinated with disk array groupings and across hosts; application integration; multipathing; each host with a different view of storage) and array-based functions (RAID, HA upgrades, multiple paths, snapshots within a disk array, array-to-array replication), all individually managed. LUN Mapping and LUN Masking provide paths between initiators and targets, and just-in-case provisioning leads to stranded capacity.

Storage Services for SANs Today


Storage services for SANs today are usually a hodge-podge of ad hoc solutions. Managing individual volumes and multipathing at the host level adds to the complexity of SAN administration, and each server requires its own investment in management and attention. SAN administrators typically over-provision storage in this scenario as a strategy to reduce the amount of time spent on resource management; unfortunately, this results in a lot of underutilized and wasted storage. Also in this scenario, redundancy and replication tasks are often achieved at the array level, often in a same-box-to-same-box configuration or by using a third-party software utility to replicate across heterogeneous storage. This adds an additional layer of complexity to the information lifecycle and to overall SAN management, and low-value data winds up living on expensive storage.


LUN Mapping and LUN Masking


LUN Masking: used in the storage array to mask or hide LUNs from servers that are denied access. The storage array makes specific LUNs available to server ports identified by their pWWN. RAID configuration and LUN Masking are performed in the storage arrays.
LUN Mapping: performed in the HBA; the server maps some or all visible LUNs to volumes.
Target identification: the server FC driver identifies the SCSI target ID with the pWWN of the target port, then associates each port with its Fibre Channel FCID. Command frames are then sent by the SCSI initiator (server) to the SCSI target (storage device).
In a heterogeneous SAN there may be several storage arrays and JBODs from different vendors, which are difficult to configure, costly to manage, and difficult to replicate and migrate data across.

LUN Mapping and LUN Masking


In most SAN environments it is essential that each individual LUN is discovered by only a single server HBA (Host Bus Adapter); otherwise the same volume will be accessed by more than one file system, leading to potential data loss or loss of security. There are basically three ways to ensure that this does not happen: LUN Masking in the storage array, LUN Mapping in the host, or LUN zoning in the MDS switch in the network.

LUN Masking is a feature of enterprise storage arrays that provides basic LUN-level security by allowing LUNs to be seen only by selected servers, identified by their pWWN. Each storage array vendor has its own management and proprietary techniques for LUN Masking in the array, so in a heterogeneous environment with arrays from different vendors, LUN management becomes more difficult. JBODs (Just a Bunch Of Disks) do not have a management function or controller, so they do not support LUN Masking.

LUN Mapping is a feature of FC HBAs that allows the administrator to selectively map only some of the LUNs that have been discovered by the HBA. LUN Mapping must be configured on every HBA, so in a large SAN this is a huge management task. Most administrators configure the HBA to automatically map all LUNs that have been discovered and perform LUN management in the array (LUN Masking) or in the network (LUN zoning) instead.

LUN zoning is a proprietary technique offered by Cisco MDS switches that allows LUNs to be selectively zoned to their appropriate host port. LUN zoning is usually used instead of LUN Masking in heterogeneous environments or where JBODs are installed.
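A minimal LUN zoning sketch (the pWWNs, LUN number, zone/zone set names, and VSAN are placeholders; LUN zoning is an Enterprise package feature):

switch# config terminal
switch(config)# zone name Server1_Array1 vsan 10
switch(config-zone)# member pwwn 21:00:00:e0:8b:05:05:05            (server HBA)
switch(config-zone)# member pwwn 50:06:01:60:10:60:14:f5 lun 0x5    (only LUN 5 on the array port)
switch(config-zone)# exit
switch(config)# zoneset name ZS1 vsan 10
switch(config-zoneset)# member Server1_Array1
switch(config-zoneset)# exit
switch(config)# zoneset activate name ZS1 vsan 10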


What is Virtualization?
Virtualization is the process of presenting a logical grouping or subset of computing resources so that they can be accessed in ways that give benefits over the original configuration.
Server virtualization is a way of creating several virtual machines from one computing resource.
Storage virtualization is a logical grouping of LUNs creating a common storage pool behind a virtualization layer.

What is Virtualization?
Virtualization is defined as the process of presenting a logical grouping or subset of computing resources so that they can be accessed in ways that give benefits over the original configuration. In a heterogeneous environment, LUN management can become very costly and time consuming. Storage Virtualization is sometimes used instead to create a common pool of all storage and perform LUN management within the network.

Server Virtualization is a way of creating several Virtual Machines from one computing resource Storage Virtualization is a logical grouping of LUNs creating a common storage pool


Symmetric Virtualization
All server ports are zoned with the virtualization appliance's virtual target port (T), so servers discover only one target. All storage ports are zoned with the appliance's virtual initiator port (I), so all storage ports are controlled by one initiator. All control and data frames are sent to the virtual target and terminated; the CDB and LUN are remapped and a new frame is sent to the real target.
Advantages: reduced complexity - a single point of management.
Disadvantages: all frames are terminated and remapped by the appliance and resent to their destination, adding latency per frame; all traffic passes through the appliance, making it a potential single point of failure and performance issue.

Symmetric Virtualization
In the symmetric approach, all I/Os and metadata are routed via a central virtualization storage manager. Data and control messages use the same path, which is architecturally simpler but has the potential to create a bottleneck. The virtualization engine does not have to live in a completely separate device. It may be embedded in the network, as a specialized switch, or it may run on a server. To provide alternate data paths and redundancy, there are usually two or more virtual storage management devices; this leads to issues of consistency between the metadata databases used to do the virtualization. The fact that all data I/Os are forced through the virtualization appliance restricts the SAN topologies that can be used and can cause bottlenecking. The bottleneck problem is often addressed by using caching and other techniques to maximize the performance of the engine; however, this again increases complexity and leads to consistency problems between engines.


Asymmetric Virtualization
Each server contains an agent that intercepts block I/O requests and sends the metadata (CDB and LUN) over the LAN to a Virtualization Manager. The Virtualization Manager remaps the CDB and LUN and returns it to the server. The server then sends the modified control frame to the storage target port. All subsequent data and response frames flow directly between initiator and target.
Advantages: data frames are sent directly to the storage port, giving low latency.
Disadvantages: requires an agent in the host to intercept the control frame; remapping the CDB and LUN adds latency to the first frame in each exchange; the Virtualization Manager could be a single point of failure.

Asymmetric Virtualization
In the asymmetric approach, the I/O is split into three parts:

First, the server intercepts the block I/O request.
Then it queries the metadata manager to determine the physical location of the data.
Then, the server stores or retrieves the data directly across the SAN.

The metadata can be transferred in-band, over the SAN, or out-of-band, over an Ethernet link; the latter is more common as it avoids IP metadata traffic slowing the data traffic throughput on the SAN, and because it does not require Fibre Channel HBAs that support IP. Each server which uses the virtualized part of the SAN must have a special interface or agent installed to communicate with the metadata manager in order to translate the logical data access to physical access. This special interface may be software or hardware. Initial implementations will certainly be software, but later implementations might use specialized HBAs, or possibly an additional adapter which works with standard HBAs.


Network-Based Storage Virtualization


Network-Based Storage Virtualization
Single point of management; servers are insulated from storage changes (data migration, highly resilient storage upgrades)
Capacity on demand: increased utilization, consolidation, legacy investment protection, heterogeneous storage network
Simplified data protection: snapshots, replication, different classes of storage for different purposes
Slide diagram: host-based services (application integration, multipathing) and array-based services (RAID, HA upgrades, multiple paths) remain, while LUN abstraction, mirroring, striping, snapshots, and replication move into the virtualization layer on the MDS switches.

Network based virtualization offers substantial benefits that overcome the challenges of traditional SAN management solutions. Network based virtualization means that management is now consolidated into a single point and simplified - hosts and storage are now independent of the various management solutions.

Servers are no longer responsible for volume management and data migration.
Network-based virtualization enables real-time provisioning of storage, reducing the waste and overhead of over-provisioning storage.
Legacy and heterogeneous storage assets can be consolidated and fully utilized.
Data is better protected by simplified snapshot and replication techniques.
It is easier to assign different classes of data to different classes of storage.

What are some existing approaches to storage virtualization? How is the MDS series a superior solution?


Network-Based Storage Applications


Storage virtualization today: host-based apps (application integration, multi-pathing, volume management); array-based apps (RAID/volume management, multiple paths, snapshot, replication).
Network-based virtualization: host-based apps (application integration, multi-pathing); network-based apps (volume management, snapshot, replication); array-based apps (RAID, multiple paths).
Customer benefit: information lifecycle management, increased storage utilization, improved business continuance.
Proof points: simplified management; non-disruptive data migration across tiered storage; heterogeneous storage pooling; flexible storage provisioning; support for point-in-time copy and replication; flexible data protection services.

Network-Based Storage Applications


Network-based applications today are provided by the following vendors:
EMC Invista
Incipient Network Storage Platform
Veritas Storage Foundation for Networks

Benefits:
Insulate servers: all storage changes, including upgrades to storage arrays, are seamless to the hosts.
Consolidation: different types of storage can accumulate through mergers and acquisitions, reorganizations, or vendor shift within an IT department. Network-based virtualization allows you to incorporate new storage seamlessly and maintain the same services and scripts.
Migration: the ability to move data seamlessly from one set of storage to another. (Note that some do this with a host-based volume manager - what if you have thousands of hosts?)
Secure isolation: instantiation of a Virtual LUN (VLUN) so it is only accessible within an administrator-defined VSAN or zone.
Provisioning: replaces just-in-case provisioning with just-in-time provisioning, and allows different classes of storage for different purposes.
Central storage: a central tool to manage all storage.


The Storage Services Module (SSM)


Fully distributed architecture providing huge aggregate performance
Embedded ASICs: multiple CPPs (Control Path Processors) and DPPs (Data Path Processors)
In-line SCSI processing up to 1.2 million IOPS
Integrated 32 Fibre Channel ports, 1/2 Gbps
Multiple paths from hosts through the virtualization engine down to the physical storage
Remote mirroring in case of local disaster
SSM applications: FC-WA (Fibre Channel Write Acceleration), FAIS (Fabric Application Industry Standard), NASB (Network-Assisted Serverless Backup), SANTap protocol

The Storage Services Module (SSM)


The SSM is an intelligent module that not only contains 32 1/2 Gbps FC ports but also multiple Control Path processors (CPP) and Data Path processors (DPP) used for hosting or assisting storage applications provided by a number of different partners. Cisco is working with best of breed partners to achieve optimized hardware for a leveraged solution. The SSM provides support for a number of storage features

FC-WA (Fibre Channel Write Acceleration) enhances the performance of write operations over long distances, e.g. array replication.
FAIS (Fabric Application Industry Standard) is a standards-based protocol used by external virtualization devices to communicate with the SSM through an open API (Application Programming Interface).
NASB (Network-Assisted Serverless Backup) is used with supporting backup software to move the data-mover function into the network and thereby reduce the CPU load on the application server or media server.
The SANTap protocol is used by a number of storage partners with external storage appliances to communicate with the SSM.


Network-Based Virtualization Techniques


Network-Hosted: partner software resides on the MDS (SSM Storage Services Module).
Network-Assisted: partner software resides on arrays, an external server, or an appliance.
Network-Accelerated: partner software is accelerated by a Cisco engine or agent (e.g. Cisco X-Copy).
Potential network virtualization applications: heterogeneous volume management, data migration, heterogeneous replication/copy services, Continuous Data Protection (CDP), asynchronous replication, serverless backup, FC Write Acceleration.

Network-Based Virtualization Techniques


Three types of Network-Based storage virtualization techniques implemented by the MDS SSM module are Network-Hosted, Network-Assisted and Network-Accelerated.

The network-hosted technique is implemented by installing Cisco partner software on the SSM module in the MDS; the network device hosts the software that performs the virtualization function for the application.
The network-assisted technique is implemented by installing Cisco partner software on a separate appliance or external server; the network device assists the software that performs the virtualization function for the application.
The network-accelerated technique uses a function on the Cisco SSM to accelerate the partner application. Serverless backup is a typical network application that is accelerated by the X-Copy function running on the MDS SSM; another such function is Fibre Channel Write Acceleration.


Network-Hosted Applications
Network-Hosted Services
MDS 9000 Storage Services Module (SSM): ASIC-based innovation; open, standards-based platform; hosts multiple partner applications.
Network-Hosted: FAIS-based open API (T11); volume management, data migration, copy services (e.g. EMC Invista).
Network-Assisted: SANTap protocol; heterogeneous storage replication, continuous log-based data protection, online data migration, storage performance/SLA monitoring.
Network-Accelerated: standard FC protocols; serverless backup, FC Write Acceleration, synchronous acceleration.

With the SSM, Cisco introduced an open, standards-based platform for enabling intelligent fabric applications.

SSM hardware: a dual-function module with 32 FC ports and embedded virtualization engine processors; purpose-built ASICs optimize the virtualization functions, providing high performance with a highly available, scalable, and fully distributed architecture; any-to-any virtualization (no need to connect hosts or storage directly into one of the FC ports); multiple best-of-breed partners for flexibility and investment protection.

There are four key customer benefits of this intelligent fabric applications platform:
First, it is an open, standards-based solution for enabling multiple partner applications.
Second, it provides feature velocity by reducing the development cycle.
Third, it has a modular software architecture for running multiple applications simultaneously.
Finally, it provides investment protection by delivering real-world applications today with the flexibility to enable advanced functions using software.


Cisco and EMC Virtualization Example


Slide diagram: heterogeneous Unix and Windows servers attach through the Cisco MDS 9000 Series to EMC/IBM/HP/Hitachi storage arrays.
Cisco provides: the Data Path Cluster (DPC) on the Cisco Storage Services Module (SSM), the Cisco Storage Services Enabler license, and the Cisco MDS 9000 Family of Fibre Channel switches (scalable hardware, high performance, embedded diagnostics, multiprotocol, a platform for intelligent services, VSAN scaling features).
EMC provides: the Control Path Cluster (CPC) on a CX700, EMC Invista software, cabinet, and meta-storage. EMC Invista delivers network-based volume management (creation, presentation, and management of volumes), online data mobility, heterogeneous clones, and point-in-time copies.

Cisco and EMC Virtualization Example


Cisco has partnered with major storage software vendors to enable disk virtualization on dedicated hardware on the MDS 9000. By providing dedicated hardware in the network to perform virtualization services, we can deliver greater performance and resilience. EMC's implementation is pictured here.


Cisco and EMC Virtualization Example (Cont.)


Slide diagram: the Invista Control Processor (volume management, data migration, copy services) uses an independent control path to program the data path, process exceptions, and maintain the virtual-to-physical mapping. It communicates with the data path on the SSM via FAIS; data traffic flows through the SSM data path processors, while control traffic stays on the control path.
Cisco MDS benefits: high-performance fast path; fully distributed intelligence; integrated, HA architecture; multiprotocol integration; comprehensive security; troubleshooting and diagnostics.

Invista requires two components: intelligent switches from vendors such as Cisco, Brocade, and McDATA, along with a separate appliance or set of them. This set, known as the Control Path Cluster (CPC), builds what amounts to routing tables and maintains metadata. The tables are used by the intelligent switches to rewrite FC and SCSI addresses at wire speed. That capability makes the architecture highly scalable, but more complex as well. EMC Invista is installed on an external Control Path Cluster (CX700), providing:

Volume management
Data migration
Copy services

EMC Invista manages the control path in the Control Processor, while data flows directly between host and storage through the Data Path Processors located on the SSM module in the Cisco MDS. The benefits of performing virtualization on the SSM module are:

Full integration into the fast, high-bandwidth, redundant crossbar
High availability and redundancy
Minimal latency and high throughput
Comprehensive centralized security
A centralized solution that is easier to manage


Fabric Application Industry Standard (FAIS)


FAIS is an ANSI T11 standards-based effort to create a common application programming interface (API) for fabric applications to run on an underlying hardware platform. The objective of FAIS is to help developers move storage and data management applications off application hosts and storage arrays and onto intelligent storage fabric-based platforms. FAIS was co-authored by Cisco. It is pronounced "face."


Network-Assisted Applications
Network-Assisted Services
MDS 9000 Storage Services Module (SSM): ASIC-based innovation; open, standards-based platform; hosts multiple partner applications.
Network-Hosted: FAIS-based open API (T11); volume management, data migration, copy services (e.g. EMC Invista).
Network-Assisted: SANTap protocol; heterogeneous storage replication, continuous log-based data protection, online data migration, storage performance/SLA monitoring.
Network-Accelerated: standard FC protocols; serverless backup, write acceleration, synchronous replication.

Intelligent storage services are also provided on the SSM module by a large number of storage partners. Each network-based appliance communicates with the SSM module through the SANTap protocol. Network-assisted applications include:

Heterogeneous storage replication
Continuous data protection
Data migration
Storage performance monitoring


Out-of-Band Appliances
Advantages: the appliance is not in the primary I/O path.
Limitations: the appliance requires host-based software agents consuming CPU, memory, and I/O; it adds latency to the initial I/O request; it potentially compromises I/O performance by issuing twice as many I/Os; and it has limited interoperability with other appliances or disk array features.
Slide diagram: hosts with software agents intercept each I/O command and send it to the appliance; the host sends the I/O command to the target once the appliance has acknowledged it.

Out-of-Band Appliances
When a separate storage appliance is connected to the network, it has one prime advantage, in that the appliance is not in the main data path and so is not perceived as a bottleneck. The limitations of this approach are many:

Each host must have a software agent installed on the server to intercept I/O requests and redirect them to the appliance. If the administrator fails to install the agent on every server, that server will attempt to communicate directly with its storage instead of via the appliance, possibly leading to data loss.
Each intercepted I/O request must be directed to an appliance that is usually connected on the LAN, and therefore adds latency to every I/O operation.
When the appliance is connected in-band over Fibre Channel, this results in additional I/O traffic across the FC SAN.
Every solution is proprietary and not defined by standards, so each appliance cannot interoperate with another.


In-Band Appliances
Advantages: does not require host agents; the appliance is in the primary data path.
Limitations: disruptive insertion of the appliance into the data path; potential performance bottleneck because all frames flow through the appliance; the appliance adds latency to all frames; limited interoperability with other appliances or disk array features; the appliance can be a single point of failure.
Slide diagram: the host sends all I/O to the appliance; the appliance intercepts the I/O and sends it to the target.

In-Band Appliances
When a separate external storage appliance is connected in-band the advantage is that host based software agents are no longer required. However, there are several limitations to this approach:

The appliance cannot be added to the SAN without causing massive disruption.
All data between each of the servers and the storage must now pass through the appliance, adding latency to every frame and becoming a potential bottleneck in a busy SAN.
The appliance becomes a virtual target for all SCSI-based communication; it receives all SCSI I/O and sends it to the appropriate storage devices by creating a virtual initiator.
The appliance can become a single point of failure, although most solutions offered today are clustered.
Every solution is proprietary and not defined by standards, therefore each appliance cannot interoperate with other appliances.


Network-Assisted Solutions Using SANTap


SANTap is a protocol between an MDS and a storage appliance SANTap sends a copy of I/O, transparently to a storage appliance without impacting the primary data path All SANTap comunication is based upon industry standard SCSI commands Advantages: Eliminates disruption of adding an appliance to the SAN Eliminates the need for host agents Appliance is not in the primary I/O path No added latency, no bottleneck Enables on-demand storage services Target SANTap
2006 Cisco Systems, Inc. All rights reserved.

Hosts
FC FC HBA FC HBA FC HBA
HBA

Appliance not in the primary data path

Host issues I/O command to target SANTap sends copy of I/O to appliance

Appliance
No I/O disruption SAN
FC

FC

22

Network-Assisted Solutions Using SANTap


SANTap is a protocol that is used to pass data between an MDS and a storage appliance. SANTap sends a copy of the FC frame containing SCSI I/O transparently to a separate storage appliance. The original FC frame containing the SCSI payload is sent directly to its target with no additional latency or disruption. SANTap enables appliance-based storage applications without impacting the primary I/O:

The integrity, availability, and performance of the primary I/O is maintained
Seamless insertion and provisioning of appliance-based storage applications
Storage services can be added to any server/storage device in the network without any rewiring
Incremental model to deploy appliance-based applications; easy to revert back to the original configuration
No disruption of the primary I/O from the server to the storage array (integrity, availability, and performance preserved)
Addresses the scalability issue for appliance-based storage applications
Investment protection

Storage applications enabled by SANTap include:


Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring

SANTap-Based Fabric Applications

[Table: SANTap development partners and their applications. Partner names accompany the following application descriptions:]
Heterogeneous async replication over extended distances with advanced data compression functionality
Disk-based Continuous Data Protection (CDP) for zero-backup windows with an ability to restore to any point in time
Heterogeneous async replication over extended distances with data consistency
Heterogeneous asynchronous replication and CDP
Heterogeneous asynchronous replication and CDP
Heterogeneous asynchronous replication

SANTap-Based Fabric Applications


Cisco is working through several storage partners to provide intelligent storage applications through externally connected appliances.

Network-Accelerated Applications

MDS 9000 Storage Services Module (SSM):
ASIC-based innovation
Open, standards-based platform
Hosts multiple partner applications

Network-Hosted (FAIS-based open API, T11; e.g., Invista):
Volume management
Data migration
Copy services

Network-Assisted (SANTap protocol):
Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring

Network-Accelerated (standard FC protocols):
Serverless backup
FC Write Acceleration
Synchronous replication

The SSM module provides a number of network-accelerated intelligent services that enhance the standard Fibre Channel protocols. These are:

Network-Accelerated Serverless Backup (NASB)
Fibre Channel Write Acceleration (FC-WA)
Network-based synchronous replication

SSM and NASB

Network-Accelerated Serverless Backup: instead of media servers, the MDS (with SSM) moves data directly from disk to tape.

[Figure: Serverless backup today versus network-accelerated serverless backup. Application servers, media servers, a disk array, and a tape library are attached to the SAN; with NASB, the SSM in the fabric performs the data movement.]

Backup models compared:
LAN Based: data moved over the LAN; application server moves the data
LAN Free: data moved over the SAN; application server moves the data
Server Free: data moved over the SAN; application server not in the data path; dedicated media server moves the data
Serverless Backup: data moved over the SAN; application server not in the data path; the fabric moves the data

SSM and NASB


Instead of expensive dedicated media servers performing the function of backing up data from storage to tape, the SSM provides the media server function using standards-based SCSI 3rd Party Copy.

Customer benefits:

Lower TCO
Offload I/O and CPU work from media servers to the SSM
Reduce server administration and management tasks

Higher Performance and Reliability
Each SSM delivers up to 16 Gbps throughput
SSM is integrated in a highly available MDS platform

Investment Protection
No changes to the existing backup environment
SSM data movement can be enabled with software

Network-Accelerated Serverless Backup Development Partners

CA BrightStor
CommVault Systems Galaxy Backup and Recovery
EMC Legato Networker
VERITAS NetBackup
IBM Tivoli Storage Manager

Cisco is working with five vendors who are all at different stages in qualifying their backup solution with the MDS 9000 network-accelerated serverless backup solution.

Fibre Channel Write Acceleration

Performance of DR/BC applications is inhibited by distance across the WAN:
Latency degrades with greater distance
Databases are very sensitive to latency
Only write I/Os are affected
I/Os per second (IOPS) and application performance diminish with distance
Disk I/O service time increases with latency
FC-WA improves write performance over the WAN

[Figure: FC-based replication, how far? A graph of increased service time (round-trip response) versus distance shows the minimum tolerable performance level and the maximum tolerable distance (latency). The write exchange between initiator and target across the WAN consists of FCP_WRITE, XFER_RDY, FCP_DATA, and FCP_RSP, with Write Acceleration applied at each end.]

SCSI standards define the way a SCSI Initiator shall communicate with a SCSI Target. This consists of four phases:

1. The Initiator sends a SCSI Write Command to the SCSI Target LUN, containing a CDB with the command, LBA, and block count.
2. When the SCSI Target is ready to receive data, it responds with Xfer Ready.
3. When the SCSI Initiator receives Xfer Ready, it starts sending data to the SCSI Target.
4. Finally, when the SCSI Target has received all the data, it returns a Response or Status to the SCSI Initiator.

This constitutes two round trip journeys between the SCSI Initiator and SCSI Target.

Command - Xfer Ready - Data - Response

In a data centre environment, distances are short, so the round-trip time is low and latency is reduced. In a WAN environment, where distances are much longer, the SCSI Initiator cannot send data until it receives Xfer Ready after the first round-trip journey. As distances increase, this considerably impacts write performance. Fibre Channel Write Acceleration spoofs Xfer Ready in the MDS switch. When the original SCSI Command is sent by the SCSI Initiator through the MDS switch to the SCSI Target, the MDS responds immediately with an Xfer Ready. The SCSI Initiator can now immediately send data toward the SCSI Target instead of waiting for the true Xfer Ready to be received. Meanwhile, the SCSI Command is received by the Target, which responds with the real Xfer Ready. When the Target MDS switch receives the data, it passes the data on to the Target. Finally, the SCSI Target sends a Response or Status back to the Initiator in the normal way. In a typical environment, several SCSI operations are taking place between the SCSI Initiator and SCSI Target simultaneously, so these operations are interleaved, maximizing performance and minimizing latency.
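As a rough illustration only, the following sketch shows how write acceleration might be enabled for a SCSI flow on the SSM using SCSI flow services. The flow ID, VSAN numbers, and pWWNs are hypothetical, and the command forms are recalled from SAN-OS 2.x/3.x SCSI flow services, so they should be checked against the configuration guide for the release in use.

switch(config)# scsi-flow flow-id 4 initiator-vsan 101 initiator-pwwn 21:00:00:e0:8b:05:76:28 target-vsan 102 target-pwwn 50:06:04:82:bf:d0:54:52
switch(config)# scsi-flow flow-id 4 write-acceleration buffers 1024
switch# show scsi-flow

The first command defines the accelerated initiator/target flow, the second enables write acceleration for that flow with a buffer count, and show scsi-flow verifies the configuration.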

SSM and Network-Accelerated FC-WA

Fibre Channel Write Acceleration reduces round-trip delays and extends distances for DR/BC applications:
Optimize bandwidth for DR
Increase distance between the primary site and the remote site
Minimizes application latency
Investment protection: transport agnostic (DWDM, CWDM, SONET/SDH, dark fiber)

Up to 30% performance improvement was seen by a major financial services company over a 125 km distance.

Primary application: synchronous replication

[Figure: Primary data center and DR data center, each with an SSM, connected by FC-WA across the extension link.]

SSM and Network-Accelerated FC-WA


The primary application for FC-WA is synchronous replication of data between storage arrays in the main and remote data centres. Tests have shown that over a 125 km distance, there is up to a 30% improvement in write performance.

Evolution to a Multilayer Storage Utility Model

[Figure: Three phases of storage network evolution.]
Phase 0, Isolated SANs and midrange DAS: homogeneous SAN islands (Engineering, ERP, HR) plus midrange applications (e.g., Microsoft) on direct-attached storage.
Phase 1, High-end and midrange consolidation: a multilayer storage network with VSANs, HA, WAN/FCIP, security, multiprotocol support, QoS, HA management, and scalability, with pooled disk and tape.
Phase 2, Network-hosted storage applications: a multilayer storage utility adding storage virtualization, LAN-free backup, dynamic provisioning, data mobility, and storage classes over pooled disk and tape.

The Multilayer Storage Utility


This slide describes Cisco's vision of storage networking evolution from the homogeneous SAN island model to the Multilayer Storage Utility. Historically, storage networks have been built in physically isolated islands (Phase 0) to address several technical and non-technical issues of older storage networking technologies, such as:

Maintain isolation from fabric events or configuration errors
Provide isolated management of island infrastructures
Driven by bad experiences of large multi-switch fabrics

This model is also associated with very high costs and a high level of complexity. To help customers overcome the limitations of building homogeneous SAN islands, Cisco has delivered new storage networking technologies and services aimed at enabling IT organizations to consolidate heterogeneous disk, tape, and hosts onto a common storage networking infrastructure (Phase 1). By introducing new intelligent storage network services, Cisco enables customers to scale storage networks beyond today's limitations while delivering the utmost in security and resiliency. An innovative infrastructure virtualization service called Virtual SANs (VSANs) alleviates the need to build isolated SAN islands by replicating such isolation virtually within a common, cost-optimized physical infrastructure. The intelligent Multilayer Storage Utility (Phase 2) involves leveraging Cisco Multilayer Storage Networking as the base platform for delivering next-generation storage services. With the intelligent multilayer storage utility, the storage network is viewed as a system of distributed intelligent network components unified through a common API to deliver a platform for network-based storage services.

Network-based storage services offer several attractive opportunities for further cost optimization of the storage infrastructure. To achieve the Multilayer Storage Utility, Cisco is partnering with industry leaders as well as with the most promising start-up companies to offer complete solutions to customers. Network-hosted storage products from EMC, Veritas, and IBM, as well as SANTap solutions developed in partnership with companies like Topio, Kashya, or Alacritus, are excellent examples of how Cisco is delivering in this space.


Lesson 7

Optimizing Performance
Overview
In this lesson, you will learn how to design high-performance SAN fabrics using FSPF traffic management, load balancing, Virtual Output Queues, Fibre Channel Congestion Control, and Quality of Service.

Objectives
Upon completing this lesson, you will be able to engineer SAN traffic on an MDS 9000 fabric. This includes being able to meet these objectives:

Define oversubscription and blocking
Explain how Virtual Output Queues solve the head-of-line blocking problem
Explain how the MDS 9000 handles fabric congestion
Explain how QoS is implemented in an MDS 9000 fabric
Explain how port tracking mitigates performance issues due to failed links
Explain how to configure traffic load balancing on an MDS 9000 SAN fabric
Describe the MDS 9000 tools that simplify SAN performance management

Oversubscription and Blocking

Oversubscription helps determine fabric design. Blocking is avoidable with proper design and switch hardware. Cisco MDS switches are completely non-blocking.

[Figure: Five hosts on 2 Gbps ports share paths to Array C (2 Gbps port) and Array D (1 Gbps port), illustrating 5:1 oversubscription and head-of-line blocking.]

It is important to fully understand two fundamental SAN design concepts: oversubscription and blocking. Although these terms are often used interchangeably, they relate to very different concepts. Oversubscription and blocking considerations are critical when designing a fabric topology. Oversubscription is a normal part of any SAN design and is essentially required to help reduce the cost of the SAN infrastructure. Oversubscription refers to the fan-in ratio of available resources, such as ISL bandwidth or disk array I/O capacity, to the consumers of the resource. For example, many SAN designs have inherent design oversubscription as high as 12:1 hosts-to-storage, as recommended by disk subsystem vendors. A general rule of thumb relates oversubscription to the cost of the solution: the higher the oversubscription, the less costly the solution. Blocking, often referred to as head-of-line (HOL) blocking, within a SAN describes a condition where congestion on one link negatively impacts the throughput on a different link. In this example, congestion on the link connecting Array D is negatively impacting the flow of traffic between Host A and Array C. This is discussed in more detail later in this section.

Cisco MDS Performance Advantages

Completely non-blocking
Maximum throughput
Consistent latency
All ports and flows are serviced evenly
Quality of Service
Non-Blocking Switch Architecture


A switch is said to be blocking when inefficient hardware design causes ingress traffic to be blocked or stalled, due to preceding traffic that is destined to slower or congested receivers. Blocking represents a condition where ingress and egress bandwidth capacity exist, but the switch is unable to forward at the desired rate due to hardware or queuing inefficiencies. Through the use of a technology called Virtual Output Queuing (VoQ), this problem has been overcome and none of the Cisco MDS 9000 Family of switches suffers from this blocking effect. The MDS 9000 platform is also the only Fibre Channel switching platform today to support Quality of Service (QoS).

Miercom Performance Validation

"Overall, the Cisco MDS 9509 proved to be a non-blocking, fully redundant architecture with excellent performance in every topology used."
"...excellent performance regardless of frame size: 98.67% line rate for small frames, full line rate with large frames."
"Regardless of the load, minimum latency for both frame sizes was very consistent."
"Most impressive was the ability to sustain traffic at a much higher throughput rate than other switches."
(Miercom)

Maximum Throughput
"The MDS 9509 showed excellent performance regardless of frame size. It achieved near line rate with small frames (98.67%) and full line rate with large frames, both with 100% intended load. [...] Furthermore, the MDS 9509 was able to sustain traffic at a much higher throughput rate for minimum- and maximum-sized frames while maintaining a more consistent latency than other switches tested. More impressive was the distribution of the traffic flows, which varied +/- 0.01 MB/s for small frames and +/- 0.005 MB/s for large frames."

Consistent Latency
"The MDS 9509 exhibited excellent throughput and latency for all frame sizes in this test. [...] Regardless of the load, the minimum latency for both frame sizes was very consistent. For small frames, it varied from 7.2 to 52.9 microseconds under 100% intended load. For large frames, the latency ranged from 19.7 to 218.9 microseconds under 100% intended load. [...] Whenever other switches tested receive frames at a rate exceeding their capability, their buffers fill and their latency increases dramatically."

Source: Performance Validation Test - Cisco MDS 9509, by Miercom at Spirent SmartLabs, Calabasas, California, December 2002.
http://cisco.com/application/pdf/en/us/guest/products/ps4358/c1244/cdccont_0900aecd800cbd65.pdf

Virtual Output Queues

Head-of-Line Blocking
A storage array connected to one switch port wants to transfer data to three servers.

[Figure: Frames destined for servers A, B, and C arrive from the storage array and queue in a single input buffer on the ingress port.]

Head-of-Line Blocking
The example illustrates a scenario where a storage array is connected to a switch via a single link. Without VOQ technology, traffic from a storage array destined for three servers, A, B and C, will flow into a switch and be placed into a single input buffer as shown. Assuming the servers are capable of receiving data transfers from the storage arrays at a sufficient rate, there should not be any problems.

Head-of-Line Blocking (Cont.)

A slowdown on one server prevents the others from receiving data frames: the classic head-of-line blocking scenario.

[Figure: Frames for the slow server back up in the single input buffer, blocking frames destined for the other servers.]

Should there be a problem or slowdown with one of the servers, the storage array may be prevented from sending data to the remaining servers as quickly as it is capable. This is a classic HOL blocking condition.

Virtual Output Queues

Solution: the MDS 9000 implements 1024 virtual output queues (VOQs) for every ingress port on each linecard (256 VOQs with 4 levels of QoS per queue). VOQs alleviate congestion and head-of-line blocking conditions.

[Figure: Frames arriving on the ingress port are sorted into per-destination virtual output queues instead of a single input buffer.]

Virtual Output Queues


The MDS 9000 Family switches implement a sophisticated VOQ mechanism to alleviate HOL blocking conditions. Virtual Output Queuing occurs at each ingress port on the switch. There are effectively 1024 Virtual Output Queues available for every ingress port, supporting four levels of Quality of Service and up to 256 egress ports for every ingress port.

Intelligent Central Arbiter

A sophisticated and fast central arbiter provides fairness and selection among VOQs. The central arbiter schedules over 1 billion frames per second.

[Figure: The central arbiter grants transmission from the virtual output queues on the ingress port toward the egress ports.]

Intelligent Central Arbiter


The MDS 9000 implements an intelligent central arbiter to:

Monitor the input queues and the egress ports
Provide for fairness
Allow unobstructed flow of traffic destined for un-congested ports
Absorb bursts of traffic
Alleviate conditions that might lead to HOL blocking

Without a central arbiter, there would be a potential to starve certain modules and ports. The central arbiter maintains the traffic flow - like a traffic cop. The Cisco MDS Arbiter can schedule frames at over 1 billion frames per second. (1 billion = 1000 million)

Fibre Channel Congestion Control

FCC is a feature that detects and reacts to network congestion. The network self-adapts intelligently to the specific congestion event:
Maximized throughput
Avoids head-of-line blocking
Protocol customized for FC lossless networks

[Figure: Three Cisco MDS 9000 switches in series. Traffic congestion at switch 3 causes a congestion control message to be sent from switch 3 to switch 1.]

Fibre Channel Congestion Control (FCC) is used to gracefully alleviate congestion using intelligent feedback mechanisms within the fabric. FCC is a feature designed to throttle data at its source if the destination port is not responding correctly. It is a Cisco proprietary protocol that makes the network react to a congestion situation. The network adapts intelligently to the specific congestion situation, maximizing the throughput and avoiding head of line (HOL) blocking. The protocol has been customized for lossless networks such as Fibre Channel. FCC consists of the following three elements:

Congestion Detection: performed by analyzing the congestion of each output port in the switch.
Congestion Signaling: performed with special packets called Edge Quench (EQ).
Congestion Control: performed through rate limiting of the incoming traffic.
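As a minimal sketch, FCC is enabled with a single configuration command on each switch; the syntax shown assumes SAN-OS 2.x and should be verified for the release in use.

switch# config terminal
switch(config)# fcc enable
switch(config)# end
switch# show fcc

Because FCC operates between switches, it should be enabled consistently on all MDS switches in the fabric.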

FCC Detection, Signaling, and Limiting

[Figure: Senders S1 and S2 (1 Gbps each) attached to Switch A send to receivers R1 (50 Mbps) and R2 (1 Gbps) attached to Switch B. Congestion is detected at Switch B, a congestion signal is sent back to Switch A, and Switch A rate-limits the offending sender.]

FCC Detection, Signaling and Limiting


In the scenario shown on this slide:

S1 is sending frames into the fabric at 1 Gbps.
R1 is only receiving frames at 50 Mbps and does not drain FC frames fast enough.
Congestion occurs at the egress port toward R1 as the buffers start to fill up.
As the buffers fill, frames back up into the previous buffer and congestion is detected at the ingress port of Switch B.
A congestion signal is sent to Switch A, the source of the traffic for the troubled receiver, on the appropriate linecard.
S1 is rate limited to a level that R1 can sustain, matching the rate of flow into and out of the switch.

The MDS 9000 switch monitors traffic from each host for congestion, and FCC is activated when and if congestion is detected. An Edge Quench message is sent out, and the offending host traffic is cut in half for each quench message received. There is no need for an un-quench message, because traffic usually builds back up slowly.

Quality of Service

QoS Design Considerations
How can I provide priority for critical storage traffic flows?

Quality of Service:
Avoids and manages network congestion and sets traffic priorities across the network
Provides predictable response times
Manages delay- and jitter-sensitive applications
Controls loss during bursty congestion

[Figure: iSCSI hosts on the IP LAN, an IP WAN, and FC SANs with FC-attached hosts and storage, all candidates for end-to-end QoS.]

Quality of Service (QoS) includes mechanisms that support the classification, marking, and prioritization of network traffic. QoS concepts and technology were originally developed for IP networks. The MDS 9000 family of switches extends QoS capabilities into the storage networking domain for IP SANs as well as Fibre Channel SANs. No other switch on the market today is capable of prioritizing Fibre Channel traffic. Classification involves identifying and splitting traffic into different classes. Marking involves setting bits in a frame or packet to let other network devices know how to treat the traffic. Prioritization involves queuing strategies designed to avoid congestion and provide preferential treatment. In a storage network, examples of classification, marking, and prioritization schemes might include:

Classification: classify all traffic in a particular VSAN, or all traffic bound for a particular destination FCID, or all traffic entering a particular FCIP tunnel.
Marking: set particular bits in the IP header or the EISL VSAN header.
Prioritization: utilize queuing strategies such as Deficit Weighted Round Robin (DWRR) or Class Based Weighted Fair Queuing (CBWFQ) to give preference based on certain markings.

QoS features enable networks to control and predictably service a variety of networked applications and traffic types. The goal of QoS is to provide better and more predictable network service by providing dedicated bandwidth, controlled jitter and latency, and improved loss characteristics.


Applications deployed over storage networks increasingly require quality, reliability, and timeliness assurances. In particular, applications that use voice, video streams, or multimedia must be carefully managed within the network to preserve their integrity. QoS technologies allow IT managers and network managers to:

Predict response times for end-to-end network services
Manage jitter-sensitive applications, such as audio and video playbacks
Manage delay-sensitive traffic, such as real-time voice
Control loss in times of inevitable bursty congestion
Set traffic priorities across the network
Support dedicated bandwidth
Avoid and manage network congestion

Managing QoS becomes increasingly difficult because many applications deliver unpredictable bursts of traffic. For example, usage patterns for web, e-mail, and file transfer applications are virtually impossible to predict, yet network managers need to be able to support mission-critical applications even during peak periods.

QoS for Fibre Channel

Three priority queues for data traffic
Absolute priority for control traffic
Flows classified based on input interface, destination device alias, or source/destination FCID or pWWN
QoS only functions during periods of congestion; FCC must be enabled
Follows the Differentiated Services (DiffServ) model defined in RFCs 2474 and 2475

[Figure: During congestion on a shared link, a high-priority flow is serviced ahead of a low-priority flow.]

QoS for Fibre Channel


MDS 9000 Family switches support QoS for Fibre Channel through the classification and prioritization of FC control traffic and data traffic. The QoS implementation in the Cisco MDS 9000 Family follows the Differentiated Services (DiffServ) model. The DiffServ standard is defined in RFCs 2474 and 2475. Data traffic can now be prioritized in three distinct levels of service differentiation: low, medium or high, while control traffic is given absolute priority. You can apply QoS to ensure that FC data traffic for latency-sensitive applications receives higher priority than traffic for throughput-intensive applications like data warehousing. Flows are classified based on one or more of the following attributes:

Input interface
Source FCID
Destination FCID
Source pWWN
Destination pWWN

QoS only functions during periods of congestion. To achieve the greatest benefit, QoS requires that FCC be enabled, and requires two or more switches in the path between the initiators and targets. Data traffic QoS for Fibre Channel is not enabled by default, and requires the Enterprise Package license. However, absolute priority for control traffic is included in the base SAN-OS license, and is enabled by default.
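The following is a hedged sketch of flow-based FC QoS configuration in the SAN-OS 2.x style; the class-map and policy-map names, the pWWN, and the VSAN number are hypothetical examples.

switch(config)# qos enable
switch(config)# qos class-map OLTP-CLASS match-any
switch(config-cmap)# match destination-wwn 50:06:04:82:bf:d0:54:52
switch(config-cmap)# exit
switch(config)# qos policy-map OLTP-POLICY
switch(config-pmap)# class OLTP-CLASS
switch(config-pmap-c)# priority high
switch(config-pmap-c)# exit
switch(config-pmap)# exit
switch(config)# qos service policy OLTP-POLICY vsan 10

The class map selects the flow (here, by destination pWWN), the policy map assigns it the high data-traffic priority, and the service policy applies the policy to a VSAN.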

QoS for Fibre Channel

[Figure: An OLTP server (low throughput; bursty, random I/O; low latency requirement) and a backup server (high throughput; sequential, streaming I/O; not latency sensitive) share a congested path to a disk. At each ingress port, traffic destined for the interface is classified against a class map, placed into the VOQ for the absolute-priority, high, medium, or low queue, and scheduled onto the transmit queue: the absolute priority queue is always serviced first, and the three DWRR queues are serviced with example weights of 50%, 30%, and 20%.]

Traffic classification: source or destination pWWN, source or destination FCID, source interface, or destination device alias.
A class map is mapped to a DSCP value of 0-63 (46 reserved) or to a policy map.

Transaction processing, a low volume, latency-sensitive application, requires quick access to requested information. Backup processing requires high bandwidth but is not sensitive to latency. In a network that does not support service differentiation, all traffic is treated identically: it experiences similar latency and gets similar bandwidth. With the QoS capability of the MDS 9000 platform, data traffic can now be prioritized in three distinct levels of service differentiation (low, medium, or high), while control traffic is given absolute priority. You can apply QoS to ensure that FC data traffic for latency-sensitive applications receives higher priority over traffic for throughput-intensive applications like data warehousing. In the example, the Online Transaction Processing (OLTP) traffic arriving at the switch is marked with a high priority level through classification (class map) and marking (policy map). Similarly, the backup traffic is marked with a low priority level. The traffic is sent to the corresponding priority queue within a Virtual Output Queue (VOQ). A Deficit Weighted Round Robin (DWRR) scheduler configured in the first switch ensures that high priority traffic is treated better than low priority traffic. For example, DWRR weights of 60:30:10 imply that the high priority queue is serviced at six times the rate of the low priority queue. This guarantees lower delays and higher bandwidth for high priority traffic if congestion sets in. A similar configuration in the second switch ensures the same traffic treatment in the other direction. If the ISL is congested when the OLTP server sends a request, the request is queued in the high priority queue and is serviced almost immediately because the high priority queue is not congested. The scheduler assigns it priority over the backup traffic in the low priority queue. Note that the absolute priority queue always gets serviced first; there is no weighted round robin.
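The DWRR weights mentioned above are set per switch; a minimal sketch (SAN-OS 2.x style syntax, with weights chosen to match the 60:30:10 example) looks like this:

switch(config)# qos enable
switch(config)# qos dwrr-q high weight 60
switch(config)# qos dwrr-q medium weight 30
switch(config)# qos dwrr-q low weight 10

The absolute priority queue used for control traffic is not configured here; it is always serviced first, ahead of the three DWRR data queues.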


Traffic Classification and Queuing


MDS 9000 supports one absolute priority queue and three DWRR queues. By default, control traffic is placed in the priority queue. When the priority queue is empty, the scheduler checks the DWRR queues. DWRR supports the weighted fair distribution of bandwidth when servicing queues that contain variable-length packets. DWRR transmits from the higher priority queues without starving the lower-priority queues by keeping track of lower-priority queue under-transmission and compensating in the next round.

In the classic DWRR algorithm, the scheduler visits each non-empty queue and determines the number of bytes in the packet at the head of the queue. The deficit counter is incremented by the value of the quantum. If the size of the packet at the head of the queue is greater than the deficit counter, then the scheduler moves on to service the next queue. If the size of the packet at the head of the queue is less than or equal to the deficit counter, then the deficit counter is reduced by the number of bytes in the packet and the packet is transmitted on the output port. The scheduler continues to dequeue packets until either the size of the packet at the head of the queue is greater than the deficit counter or the queue is empty. If the queue is empty, the value of the deficit counter is set to zero. When this occurs, the scheduler moves on to service the next non-empty queue.

In short, DWRR provides preferential, or weighted, round robin scheduling without starving other queues.

Zone-Based QoS

Zone-based QoS complements the standard QoS data-traffic classification by WWN or FCID
Zone-based QoS helps simplify configuration and administration by using the familiar zoning concept
QoS parameters are distributed as a zone attribute

[Figure: Zone A carries the high-priority flow and Zone B the low-priority flow across a congested link; the QoS priority travels with each zone as a zone attribute.]

Zone-Based QoS
With zone-based QoS, QoS parameters are distributed as a zone attribute. This simplifies administration of QoS by providing the ability to classify and prioritize traffic by zone, instead of by initiator-target pair. Zone-based QoS is supported in both Basic and Enhanced zoning modes. QoS parameters are distributed as vendor-specific attributes. Zone-based QoS cannot be combined with flow-based QoS.

Note: Zone-based QoS is a licensed feature; it requires the Enterprise Package.

Note: Zone-based QoS may cause traffic disruption upon a zone-QoS configuration change (and activation) if in-order delivery is enabled.
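A sketch of attaching a QoS priority as a zone attribute follows; the zone name, members, and VSAN are hypothetical, and the attribute syntax assumes SAN-OS 2.x style zoning configuration.

switch(config)# zone name OLTP_Zone vsan 10
switch(config-zone)# attribute qos priority high
switch(config-zone)# member pwwn 21:00:00:e0:8b:05:76:28
switch(config-zone)# member pwwn 50:06:04:82:bf:d0:54:52

After the zone set containing this zone is activated, the QoS priority is distributed with the zone, so every initiator-target pair in the zone inherits the same priority.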

Port Rate Limiting

Rate-limiting ingress traffic:
Match ingress flow to available bandwidth
Prevent devices from flooding the SAN
Limit traffic contending for WAN links
Limit traffic on oversubscribed interfaces
Configured as a percentage of available ingress bandwidth
QoS feature must be enabled

Example: 50% maximum input to fc1/1. Supported on MDS 9100 Series switches, the MDS 9216i, and the MPS 14+2.

Port Rate Limiting


A port rate limiting feature is available on 2nd-generation hardware, e.g. the MPS 14+2, MDS 9216i, and MDS 9100 series switches, with SAN-OS 1.3 or higher. This feature helps control the bandwidth of individual FC ports. Rate limiting can be useful in the following situations:

Preventing malicious or malfunctioning devices from flooding the SAN
Limiting traffic contending for WAN links, e.g. storage replication ports
Limiting ingress traffic on oversubscribed-mode interfaces

Port rate limiting is also referred to as ingress rate limiting because it controls ingress traffic into an FC port. The feature controls traffic flow by limiting the number of frames that are transmitted out of the exit point on the MAC. Port rate limiting works on all Fibre Channel ports.

Note: Port rate limiting can be configured only on Cisco MDS 9100 Series switches, the MDS 9216i, and the MPS 14+2. The command can be configured only if the following conditions are true:

The QoS feature is enabled using the qos enable command.
The command is issued on a 2nd-generation Cisco MDS 9216i or 9100 series switch.

The rate limit ranges from 1 to 100% and the default is 100%. To configure the port rate limiting value, use the switchport ingress-rate interface configuration command, as shown below.
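A minimal rate-limiting sketch using the commands named above (the interface and percentage are illustrative):

switch(config)# qos enable
switch(config)# interface fc1/1
switch(config-if)# switchport ingress-rate 50

This limits ingress traffic on fc1/1 to 50 percent of the available port bandwidth.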

QoS in Single-Switch Fabrics

QoS scheduling occurs in the VOQ of the ingress port
Effective when multiple flows on one ingress port contend for the same egress port
Can improve latency and/or bandwidth of higher priority flows

[Figure: Three single-switch examples: OSM-attached hosts sending to a disk, JBOD disks sending to a host, and a host sending to multiple JBOD disks.]

QoS Designs
Fibre Channel QoS is effective for some configurations, but not for others. To understand why, it is important to realize that the QoS scheduler operates within the Virtual Output Queue (VOQ) of an ingress port. Because QoS scheduling occurs at the ingress port, for QoS to be effective it is important that all competing traffic enter a switch through the same ingress port, somewhere before the common point of congestion. The diagram illustrates three configurations where FC QoS might be beneficial in a single-switch design:

Multiple devices attached to the same quad on a host-optimized 32-port FC line card. In this configuration, the group of four oversubscribed mode (OSM) ports is serviced by the same QoS scheduler, so internally they appear to be connected to the same ingress port. The common point of congestion would be the storage port.
A multi-disk JBOD attached to an FL port, sending data to a host on the same switch. In this configuration, there are multiple devices on the same ingress port, each with a unique FCID and pWWN which can be used for QoS classification. The common point of congestion would be the host, if for example we had a 2 Gbps JBOD and a 1 Gbps host HBA.
A host sending multiple flows, each of which enters on a common ingress port and is destined for a distinct FCID or pWWN within the JBOD. In this configuration, QoS can improve the latency of a higher priority flow, but cannot improve the bandwidth because the host is not QoS aware, so all of the flows would get an equal share of the bandwidth regardless of the DWRR weights.


Because the QoS scheduler operates within the VOQ of the ingress ports, Fibre Channel QoS is not beneficial in all configurations. Two configurations where FC QoS is not effective in the current MDS 9000 QoS implementation are:

Multiple devices attached to full rate mode (FRM) ports on a 16-port FC line card contending for the same egress port. In this configuration, where two hosts are both sending data to the storage array, there would be no benefit to giving one host a QoS priority higher than the other, because the central arbiter already provides fairness to the two ingress ports that are contending for the common egress port.
Multiple devices with a common ingress port (the ISL on the rightmost switch), but multiple egress ports. QoS would not provide a benefit; however, FCC will still alleviate congestion on the ISL.

QoS Designs for FCIP

Traffic going into an FCIP tunnel is marked with a DSCP value
Marked at the egress of the FCIP interface; separate markings for control and data
The downstream IP network must implement and enforce a QoS policy based on the marking
DSCP can be any value from 0 to 63 (default 0); DSCP 46 is reserved for Expedited Forwarding

[Figure: An FC host and target connected across an IP WAN through an FCIP tunnel (FCIP 10); frames entering the tunnel are marked with DSCP 46.]

QoS Designs for FCIP


In a SAN extension environment utilizing FCIP, it is possible for high priority storage traffic to be contending for the same WAN resources as other, lower priority storage and/or data traffic. In such situations, it may be desirable to implement QoS in order to provide a higher level of service for particular storage traffic flows. Traffic flowing into an FCIP tunnel can be classified at the egress FCIP interface and marked with a DSCP value. By default, the IPS module creates two TCP connections for each FCIP link. One connection is used for data frames and the other is used only for Fibre Channel control frames (i.e., Class F switch-to-switch protocol frames). This arrangement is used to provide low latency for all control frames. The FCIP QoS feature specifies the DSCP value to mark all IP packets using the TOS field in the IP header. The control DSCP value applies to all FCIP frames in the control TCP connection and the data DSCP value applies to all FCIP frames in the data connection. If the FCIP link has only one TCP connection, the data DSCP value is applied to all packets in that connection. Once marked, it is then up to the devices in the downstream IP network to implement and enforce the QoS strategy. The MDS 9000 does not implement IP QoS for ingress FCIP frames since it is an end device in the IP network.
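As a hedged sketch of the marking described above, the DSCP values for the control and data TCP connections are set on the FCIP interface; the interface number and DSCP values are illustrative, and the exact subcommand form should be verified for the SAN-OS release in use.

switch(config)# interface fcip 10
switch(config-if)# qos control 46 data 26

Here the control connection is marked with DSCP 46 (Expedited Forwarding) and the data connection with DSCP 26; the downstream IP network must then queue and forward based on those markings.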

End-To-End QoS

Priority for critical storage traffic flows:
VSANs and high-density switches allow for a collapsed core design
Traffic engineering makes the collapsed core design a feasible solution
FCC performs congestion detection, signaling, and congestion control
End-to-end QoS priority schemes can be designed to meet customer requirements

Each VSAN has its own routing process and associated metrics per link and can therefore make independent routing decisions. Hosts are assigned to Virtual SANs, and each Virtual SAN is allocated fabric resources, such as bandwidth, independently.

[Figure: iSCSI hosts and FC SANs interconnected over an IP LAN and FCIP/DWDM/SONET links, with VSAN trunks, multi-path load balancing, per-VSAN FSPF link metrics, and QoS applied to iSCSI, Fibre Channel, and FCIP traffic.]

End-to-End QoS
The Cisco MDS 9000 Family introduces VSAN technology for hardware-based intelligent frame processing, and advanced traffic management features such as Fibre Channel Congestion Control (FCC) and fabric-wide quality of service (QoS), enabling the migration from SAN islands to collapsed-core and enterprise-wide storage networks. The MDS 9000 family of switches provides several tools that allow SAN administrators to engineer resource allocation and recovery behavior in a fabric. These tools can be used to provide preferential service to a group of hosts or to utilize cost-effective, wide-area bandwidth first and use an alternate path during a fabric fault.

VSANs provide a way to group traffic. VSANs can be selectively grafted or pruned from EISL trunks.
PortChannels support link aggregation to create virtual EISL trunks.
FSPF provides deterministic routing through the fabric. FSPF can be configured on a per-VSAN basis to select preferred and alternate paths.

With the MDS 9000 family of switches, QoS concepts and technology that were originally developed for IP networks have been extended into the storage networking domain for IP SANs as well as Fibre Channel SANs. No other switch on the market is capable of prioritizing Fibre Channel traffic as effectively or comprehensively. Classification involves identifying and splitting traffic into different classes. Marking involves setting bits in a frame or packet to let other network devices know how to treat the traffic. Prioritization involves queuing strategies designed to avoid congestion and provide preferential treatment. The Cisco MDS 9000 Family enables the design of comprehensive end-to-end traffic priority schemes that satisfy customer requirements.

Port Tracking

Unique to the MDS 9000
Failure of link 1 (the tracked WAN/MAN link) immediately brings down link 2 (the linked port)
Triggers faster recovery where redundant links exist
Tracked ports are continually monitored
Tracked ports can be FC, PortChannel, GigE, or FCIP interfaces; linked ports must be FC
Failover software responds faster to a link failure: milliseconds versus tens of seconds, not dependent on TOVs or RSCNs

[Figure: The linked FC port (2) connects the local switch to a storage device, while the tracked port (1) faces the WAN/MAN; when the tracked link fails, the linked port is brought down.]

Tracking and Redirecting Traffic


The Port Tracking feature is unique to the Cisco MDS 9000 Family of switches. This feature uses information about the operational state of one link (usually an ISL) to initiate a failure in another link (usually one that connects to an edge device). This process of converting the indirect failure to a direct failure triggers a faster recovery process where redundant links exist. When enabled, the port tracking feature brings down the configured links based on the failed link and forces the traffic to be redirected to another redundant link. Generally, hosts and storage arrays can instantly recover from a link failure on a link that is immediately connected to a switch (direct link). However, recovering from an indirect link failure between switches in a local, WAN, or MAN fabric with a keep-alive mechanism is dependent on factors such as the time out values (TOVs) and registered state change notification (RSCN) information. In tests with port tracking enabled, failover occurred in approximately 150 milliseconds, compared to more than 25 seconds without the port tracking feature enabled. In the diagram, when the direct link (2) to the storage array fails, recovery can be immediate. However, when the WAN/MAN link (1) fails, recovery depends on TOVs, RSCNs, and other factors. The port tracking feature monitors and detects failures that cause topology changes and brings down the links connecting the attached devices. When you enable this feature and explicitly configure the linked ports, the SAN-OS software monitors the tracked ports and alters the operational state of the linked ports upon detecting a link state change. Port tracking is a feature of SAN-OS 2.0. It is included in the base license package at no additional cost.

Port Tracking Terminology

A tracked port is a port whose operational state is continuously monitored. The operational state of the tracked port is used to alter the operational state of one or more linked ports. Fibre Channel, PortChannel, FCIP, and Gigabit Ethernet interfaces can be tracked. Generally, interfaces in E and TE port modes are tracked, although Fx ports can also be tracked.
A linked port is a port whose operational state is altered based on the operational state of one or more tracked ports. Only a Fibre Channel port can be linked.
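A minimal port-tracking sketch (SAN-OS 2.x style syntax; interface numbers are hypothetical) in which the FC port to the local storage array is the linked port and the FCIP WAN interface is the tracked port:

switch(config)# port-track enable
switch(config)# interface fc1/2
switch(config-if)# port-track interface fcip 10

With this configuration, if the tracked FCIP link goes down, fc1/2 is brought down immediately so that the attached array fails over without waiting for timeouts.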

Load Balancing

Configuring Logical Paths
How can I provide preferred paths for a subset of hosts and storage devices in my SAN? SAN traffic engineering uses a combination of features to provide preferred selection of logical paths:
VSANs can be selectively grafted or pruned from EISL trunks.
FSPF can be configured on a per-VSAN basis to select preferred and alternate paths.
PortChannels provide link aggregation to yield virtual EISL trunks.

[Figure: Two switches connected by a 4-link (8 Gbps) PortChannel configured as an EISL (TE_Ports) and by additional E_Port links; per-VSAN FSPF metrics (for example, different costs for VSAN 10 and the VSAN 20 backup traffic on each path) steer each VSAN onto its preferred path.]

Configuring Logical Paths


The MDS 9000 Series switches provide a number of features that can be used alone or in combination to classify and select logical paths, including:

VSAN allowed lists, which permit VSANs to be selectively added to or removed from EISL trunks
FSPF link cost, which can be configured on a per-VSAN basis for the same physical link, providing preferred and alternate paths
PortChannels, which provide link aggregation and thus logical paths that can be preferred for routing purposes

The implementation of VSANs gives the SAN designer more control over the flow of traffic and its prioritization through the network. Using the VSAN capability, different VSANs can be prioritized and given access to specific paths within the fabric on a per-application basis. Using VSANs, traffic flows can be engineered to provide efficient usage of network bandwidth.

One level of traffic engineering allows the SAN designer to selectively enable or disable a particular VSAN from traversing any given common VSAN trunk (EISL), thereby creating a restricted topology for the particular VSAN. A second level of traffic engineering is derived from independent routing configurations per VSAN. The implementation of VSANs dictates that each configured VSAN support a separate set of fabric services. One such service is the FSPF routing protocol, which can be independently configured per VSAN. Therefore, within each VSAN topology, FSPF can be configured to provide a unique routing configuration and resultant traffic flow. Using the traffic engineering capabilities offered by VSANs allows greater control over traffic within the fabric and higher utilization of the deployed fabric resources. A configuration sketch follows.
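A sketch of the first and third techniques (VSAN allowed lists and PortChannel aggregation); the PortChannel number, interface range, and VSAN numbers are hypothetical, and the syntax is the SAN-OS 2.x style.

switch(config)# interface fc1/1 - 4
switch(config-if)# channel-group 1 force
switch(config-if)# exit
switch(config)# interface port-channel 1
switch(config-if)# switchport mode E
switch(config-if)# switchport trunk allowed vsan 10
switch(config-if)# switchport trunk allowed vsan add 20

The four member links form a single logical EISL, and the allowed list restricts that trunk to carrying only VSANs 10 and 20.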

Traffic Engineering Designs for SAN Extension

How can I utilize cost-effective, wide-area bandwidth first, while providing an alternate path during a fabric fault?
Preferred paths can be configured per VSAN; if one path fails, the other path automatically takes over.
Use the cost-effective wide-area bandwidth first; use the alternate path during a fabric fault.

[Figure: VSAN 10 (OLTP) and VSAN 20 (Email) span two sites over an OC-48 link (VSAN 10 cost = 100, VSAN 20 cost = 200) and an OC-3 link (VSAN 10 cost = 200, VSAN 20 cost = 100).]

Traffic Engineering Designs for SAN Extension


In addition to tuning FSPF link costs for traffic engineering in local SAN fabrics, FSPF can be particularly beneficial in SAN extension environments where multiple paths exist between separate SAN fabrics. In the diagram, there are two equal cost FCIP paths between the two VSANs on either side. The FCIP interfaces are configured as trunking so that they can carry traffic for multiple VSANs. By tuning the per-VSAN FSPF link costs, we are able to give each VSAN a dedicated preferred path, while allowing for transparent failover in the event of a link failure. In other SAN extension environments, FSPF could be used to give preference to a high speed link, for example an OC-48 link, while allowing a slower OC-3 link to provide a failover path. Or preference could be given to the more cost-effective link, allowing for failover to a more expensive path in the event of a fabric fault.
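A sketch of the per-VSAN FSPF costs from the diagram, assuming the OC-48 path is reached through interface fcip 10 and the OC-3 path through fcip 20 (the interface numbers are hypothetical):

switch(config)# interface fcip 10
switch(config-if)# fspf cost 100 vsan 10
switch(config-if)# fspf cost 200 vsan 20
switch(config-if)# exit
switch(config)# interface fcip 20
switch(config-if)# fspf cost 200 vsan 10
switch(config-if)# fspf cost 100 vsan 20

With these costs, VSAN 10 (OLTP) prefers the OC-48 path and VSAN 20 (Email) prefers the OC-3 path; if either link fails, FSPF reroutes both VSANs onto the surviving path.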

SAN Performance Management

Ad Hoc SAN Performance Management
Vendor-specific tools offer limited performance management capabilities
Probes and analyzers in the data path are intrusive, disruptive, and expensive

[Figure: An ad hoc mix of switch counters and limited switch performance management tools, host-based metrics, QoS probes in the data path, and a protocol analyzer in the data path, spread across the IP LAN, IP WAN, and FC SANs.]

Ad Hoc SAN Performance Management


True end-to-end SAN performance management is a daunting task for system administrators. Host-based tools, switch-based tools, and expensive in-band probes and analyzers all provide data, but using vendor-specific tools becomes increasingly difficult and time consuming when the SAN and the applications it supports begin to scale. In-band appliances in the data path are expensive and disruptive, and may even mask some performance symptoms by retiming the signal, making analysis of performance data even more problematic.

SPAN

Non-intrusive copy of all traffic from a port
Directed to an SD_Port within the local or a remote MDS switch
Traffic redirected to the Cisco Port Analyzer Adapter (PAA)
Also compatible with off-the-shelf FC protocol analyzers

[Figure: A SPAN source port on the source switch is copied to an SD_Port connected to an FC analyzer; with RSPAN, the copy is tunnel-encapsulated across the MDS FC network to a destination switch via an ST port.]

Switched Port Analyzer (SPAN)


SPAN allows a user to make a copy of all traffic and direct it to another port within the switch. This copy is not intrusive to any of the connected devices and is facilitated in hardware, thereby alleviating any unnecessary CPU load. Using the SPAN feature, a user could connect a Fibre Channel analyzer such as a Finisar analyzer to an unused port on the switch and then simply use SPAN to make a copy of the traffic from a port under analysis and send it to the analyzer in a non-intrusive and non-disruptive fashion. SPAN features include the following:

Non-intrusive and non-disruptive tool used with a Fibre Channel analyzer
Ability to copy all traffic from a port and direct it to another port within the switch
Totally hardware-driven; no CPU burden
Up to 16 SPAN sessions within a switch
Each session can have up to four unique sources and one destination port
Filter the SPAN source based on receive-only traffic, transmit-only traffic, or bidirectional traffic

The Fibre Channel port that is to be analyzed is designated the SPAN source port. A copy of all Fibre Channel traffic flowing through this port is sent to the SD_Port. This includes traffic traveling in or out of the Fibre Channel port, that is, in the ingress or egress direction. The SD_Port is an independent Fibre Channel port, which receives this forwarded traffic and in turn sends it out for analysis to an externally attached Fibre Channel analyzer.
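A minimal SPAN sketch (hypothetical interface numbers, SAN-OS 2.x style syntax), in which fc1/16 is the SD_Port connected to the analyzer and traffic entering fc1/1 is copied to it:

switch(config)# interface fc1/16
switch(config-if)# switchport mode SD
switch(config-if)# switchport speed 2000
switch(config-if)# no shutdown
switch(config-if)# exit
switch(config)# span session 1
switch(config-span)# source interface fc1/1 rx
switch(config-span)# destination interface fc1/16

The rx keyword copies only ingress traffic on the source port; tx or omitting the direction would capture egress or bidirectional traffic, respectively.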


SPAN Applications
Using the SPAN feature, you can conduct detailed troubleshooting on a particular device without any disruption. In addition, a user may want to take a sample of traffic from a particular application host for proactive monitoring and analysis, a process that can easily be accomplished with the SPAN feature. Remote Switched Port Analyzer (RSPAN) further increases the capability of the SPAN feature. With RSPAN, a user has the ability to make a copy of traffic from a source port or VSAN to a port on another connected switch. Debugging protocols supported by SPAN and RSPAN include FSPF, PLOGI, exchange link parameter (ELP), and others. Examples of data analysis that can be performed with SPAN and RSPAN include:

Traffic on a particular VSAN on a TE_Port
Application-specific analysis using an analyzer

An important debugging feature SPAN provides is that multiple users can share an SD_Port and analyzer. Also, the MDS 9000 can copy traffic on a single port at line rates. MDS 9000 can use SPAN with both unicast and multicast traffic. Dropped frames are not SPAN-ed. SPAN-ed frames will be dropped if the sum bandwidth of sources exceeds the speed of the destination port.

Performance Manager

Fabric-wide, historical performance reporting
Browser-based display
Summary and drill-down reports for detailed analysis
Integrates with Cisco Traffic Analyzer
Requires the Fabric Manager Server license

Performance Manager
Performance Manager, like Fabric Manager, runs as a Windows service. It monitors network device statistics and displays historical information in a web-based GUI. Performance Manager provides detailed traffic analysis by capturing data with the Cisco Port Analyzer Adapter (PAA). This data is compiled into various graphs and charts which can be viewed with any web browser. It presents recent statistics in detail and older statistics in summary. Performance Manager has three parts:

Definition: traffic flows are defined by manual edits or by using a Fabric Manager configuration wizard to create a configuration file.
Collection: Performance Manager reads the configuration file and collects the desired information.
Presentation: Performance Manager generates web pages to present the collected data.

Performance Manager can collect a variety of data about ISLs, host ports, storage ports, route flows, and site-specific statistical collection areas. It relies on captured data flows through the use of the PAA, Fabric Manager Server, and the Traffic Analyzer. Using it as an FC traffic analyzer, a user can drill down to the distribution of read versus write I/O, average frame sizes, LUN utilization, and so on. Using it as an FC protocol analyzer, a user has access to frame-level information for analysis. The Summary page presents the top 10 hosts, ISLs, storage devices, and flows by combined average bandwidth for the last 24-hour period. This period changes on every polling interval, which is unlikely to change the average by much but could affect the maximum value. The intention is to provide a quick summary of the fabric's bandwidth consumption and highlight any hot spots.

Performance Manager (Cont.)


Goals:
Ability to scale to large fabrics with 12 months of data
Provide early warning of traffic problems
Ability to see all packets on an individual interface
Ability to diagnose traffic problems
Simple to set up and use

Hybrid approach:
Aggregate traffic collected using SNMP and stored persistently in a round-robin database
Use SPAN, PAA, and NTOP to capture packets for diagnosing traffic problems

The purpose of Performance Manager is to monitor network device statistics historically and provide this information graphically using a web browser. It presents recent statistics in detail and older statistics in summary. The deployment goals of Performance Manager are to scale to large fabrics with 12 months of data, provide an early warning system for potential traffic problems, see all packets on an individual interface, diagnose traffic problems, and be simple to set up and use. In order to achieve these goals, Cisco implemented a hybrid approach: first, aggregate fabric traffic information is retrieved using SNMP and stored persistently in a round-robin database; then SPAN, the PAA, and NTOP are used to capture packets for diagnosing traffic problems. Performance Manager is a tool that can:

Scale to large fabrics
Scale to multi-year histories
Perform data collection without requiring inband local/proxy HBA access
Tolerate poor IP connectivity
Provide SNMPv3 support
Have zero-administration databases
Provide site customization capabilities
Accommodate fabric topological changes
Integrate and share data with external tools
Run on multiple operating systems
Integrate with Fabric Manager


Performance Manager Collected Data


Rx/Tx Bytes for:
ISLs
Host Ports
Storage Ports
Flows

Bytes (and frames) sent from sources to destinations, configured based on active zones, for example:
Host1 -> Storage1
Host1 -> Storage2
Storage1 -> Host1
Storage2 -> Host1

Flows need to be defined on the correct linecard(s)



Performance Manager Collected Data


Performance Manager collects receive and transmit byte data. This data is available for ISLs, host and storage ports, flows, and so on. A flow is a count of bytes and frames sent from a particular source to a particular destination. Use the active zones to configure flows. For instance, given:

Zone A: Host1, Storage1, Storage2
Zone B: Host2, Storage1

Possible Flows:

Host1->Storage1
Host1->Storage2
Host2->Storage1
Storage1->Host1
Storage1->Host2
Storage2->Host1


Traffic Analyzer
Cisco customized version of ntop
Free for download from CCO
Live or offline analysis
Provides SCSI-based information about storage network devices
FC-enhanced public-domain tools


Traffic Analyzer
Cisco Traffic Analyzer is a Cisco customization of the popular ntop network traffic monitoring tool. Traffic Analyzer allows for live or offline analysis and displays information about storage and the network. Traffic Analyzer is a Fibre Channel-enhanced version of public-domain tools. Traffic Analyzer is not suitable for accounting, because frames may be dropped on the SD_Port, by the PAA, or on the host. Traffic Analyzer can be downloaded free from Cisco Connection Online (CCO).


Traffic Analyzer Statistics Collected


Overall Network Statistics:
Total bandwidth used
Tx/Rx bandwidth per VSAN
Tx/Rx bandwidth per N_Port

N_Port-Based Statistics:
Per-LUN statistics
Traffic breakdown by time
Class-based traffic breakdown

Session-Based Statistics:
SCSI sessions (I_T_L nexus)
FICON sessions (in progress)
Other FC sessions

VSAN-Based Statistics:
Traffic breakdown by VSAN
VSAN configuration statistics
Domain-based statistics

And more


Traffic Analyzer Statistics Collected


Traffic Analyzer collects a large amount of statistical information. Some of the statistics collected are:

Overall network statistics: total bandwidth used, Tx/Rx bandwidth per VSAN, Tx/Rx bandwidth per N_Port
Session-based statistics: SCSI sessions (I_T_L nexus), FICON sessions (in progress), other FC sessions
N_Port-based statistics: per-LUN statistics, traffic breakdown by time, class-based traffic breakdown
VSAN-based statistics: traffic breakdown by VSAN, VSAN configuration statistics, domain-based statistics


Traffic Analyzer How It Works


Hooks up with the Port Analyzer Adapter (PAA):
Captures traffic much like Fabric Analyzer
Different tool for analyzing traffic
Modification of ntop

Software runs on a host (PC, Mac, etc.):
No new switch software
User interface is a web browser

Requires the modified Port Analyzer Adapter (PAA-2):
Provides original length for truncated frames
Captures more data with less bandwidth
Preserves data privacy without compromising statistics accuracy


Traffic Analyzer How It Works


The Traffic Analyzer hooks up with the Port Analyzer Adapter (PAA). It captures traffic much like the Fabric Analyzer and is yet another tool for analyzing traffic. The Traffic Analyzer is a modification of ntop (see ntop.org). Traffic Analyzer software runs on a host (PC, Mac, etc.) rather than on the switch, so no new switch software is required. The user interface is simply a web browser. Traffic Analyzer must be used with a modified Port Analyzer Adapter. The newer PAA-2 provides the original length of truncated frames, captures more data with less bandwidth, and preserves data privacy without compromising statistics accuracy. Older PAAs are not field-upgradeable.


Notification and Logging Services

Robust fault monitoring = quicker problem resolution

RMON:
Set alarms based on one or more parameters, such as port utilization, CPU utilization, and memory utilization
Specify actions to be taken based on alarms: logging, SNMP traps, or log-and-trap

Syslog:
Log information for monitoring and troubleshooting
Capture accounting records
See a complete picture of events

Call Home:
Notification of critical system events
Example: alert when switch ports are congested
Flexible message formats: email, pager, XML
Integrates with RMON and Syslog (SAN-OS 2.0+)

Call Home
Call Home can be configured to provide alerts when switch ports become congested, so that performance can be monitored remotely and action can be taken promptly. The Call Home functionality is available directly through the Cisco MDS 9000 Family. It provides multiple Call Home profiles (also referred to as Call Home destination profiles), each with separate potential destinations. Each profile may be predefined or user-defined. A versatile range of message formats is supported, including standard email-based notification, pager services, and XML message formats for automated XML-based parsing applications. The Call Home function can even leverage support from Cisco Systems or another support partner; for example, if a component failure is detected, a replacement part can be on order before the SAN administrator is even aware of the problem. Flexible message delivery and format options make it easy to integrate specific support requirements. The Call Home feature offers the following advantages:

Integration with established monitoring systems like RMON and Syslog
Comprehensive and more robust fault monitoring
Quicker problem resolution
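A minimal Call Home configuration sketch is shown below. The contact details, SMTP server address, and email addresses are hypothetical placeholders, and the available destination profiles and exact syntax depend on the SAN-OS release.

    switch(config)# callhome
    switch(config-callhome)# email-contact storage-admin@example.com
    switch(config-callhome)# destination-profile full_txt email-addr support@example.com
    switch(config-callhome)# transport email smtp-server 192.0.2.25
    switch(config-callhome)# transport email from mds-switch@example.com
    switch(config-callhome)# enable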


RMON Threshold Manager


Use the options on the Device Manager Events menu to configure and monitor Simple Network Management Protocol (SNMP), Remote Monitor (RMON), Syslog, and Call Home alarms and notifications. SNMP provides a set of preconfigured traps and informs that are automatically generated and sent to the destinations (trap receivers) chosen by the user. Use the RMON Threshold Manager to configure event thresholds that will trigger log entries or notifications. The RMON groups that have been adapted for use with Fibre Channel include the AlarmGroup and EventGroup. The AlarmGroup provides services to set alarms. Alarms can be set on one or multiple parameters within a device. For example, an RMON alarm can be set for a specific level of CPU utilization or crossbar utilization on a switch. The EventGroup allows configuration of events (actions to be taken) based on an alarm condition. Supported event types include logging, SNMP traps, and log-and-trap.

Syslog
The system message logging software saves messages in a log file or directs the messages to other devices. This feature provides the following capabilities:

Logging information for monitoring and troubleshooting
Selection of the types of logging information to be captured
Selection of the destination of the captured logging information

By default, the switch logs normal but significant system messages to a log file and sends these messages to the system console. Users can specify which system messages should be saved based on the type of facility and the severity level. Messages are time-stamped to enhance real-time debugging and management. Syslog messages are categorized into seven severity levels, from debug to critical events. Users can limit the severity levels that are reported for specific services within the switch. For example, Syslog can be configured to report only debug events for the FSPF service but record all severity-level events for the Zoning service. A unique feature of the Cisco MDS 9000 Family switches is the ability to send accounting records to the Syslog service. The advantage of this feature is consolidation of both types of messages for easier correlation. For example, when a user logs into a switch and changes an FSPF parameter, Syslog and RADIUS provide complementary information that portrays a complete picture of the event. Syslog can store a chronological log of system messages locally or send messages to a central Syslog server. Syslog messages can also be sent to the console for immediate use. These messages can vary in detail depending on the configuration chosen.
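As an illustration, the sketch below forwards messages to a central Syslog server and adjusts per-facility severity levels. The server address is a placeholder, and facility names and exact syntax can vary by SAN-OS release.

    ! Forward messages up to severity 6 (informational) to a central server
    switch(config)# logging server 192.0.2.50 6
    ! Keep only severity 2 (critical) and more severe messages in the local log file
    switch(config)# logging logfile messages 2
    ! Per-facility severity levels: verbose zone events, quieter FSPF
    switch(config)# logging level zone 6
    switch(config)# logging level fspf 3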


Lesson 8

Securing the SAN Fabric


Overview
This lesson explains how to secure an MDS 9000 SAN fabric using zoning, port and fabric binding, device authentication, and management security.

Objectives
Upon completing this lesson, you will be able to design an end-to-end SAN security solution. This includes being able to meet these objectives:

Describe the most common security issues facing SANs
Explain how zoning contributes to the security of a SAN solution
Explain how port and fabric binding contribute to the security of a SAN solution
Explain how device authentication contributes to the security of a SAN solution
Explain how to secure management data paths
Explain the best practices for end-to-end SAN security design

SAN Security Issues


SAN Security Challenges
SAN security is often overlooked as an area of concern:
Application integrity and security is addressed, but not the back-end storage network carrying the actual data
SAN extension solutions now push SANs outside datacenter boundaries

Not all compromises are intentional:
Accidental breaches can still have the same consequences

SAN security is only one part of a complete datacenter solution:
Host access security: one-time passwords, auditing, VPNs
Storage security: data-at-rest encryption, LUN security
Datacenter physical security

Threats illustrated in the figure include external DoS or other intrusion, privilege escalation or unintended privilege, application tampering (trojans, etc.), theft, unauthorized internal connections, and data tampering.

SAN Security Challenges


Traditionally, SANs have been considered secure primarily because SAN deployments had been limited to a subset of a single data center: in essence, an isolated network. Fibre Channel (FC) has been considered secure on the basis that FC networks have been isolated from other networks. Application security has long been the focus of IT professionals, while back-end storage and storage networks have often been ignored from a security perspective. Today SANs often span beyond a single datacenter. SAN extension technologies such as DWDM, CWDM, and FCIP can be used to connect devices in multiple datacenters to storage in multiple datacenters. Transport technologies such as iSCSI decrease the cost associated with attaching hosts to a SAN and therefore accelerate the rate at which devices are connected to a SAN. There are many potential threats to network security. Often these threats are perceived as external, whereby some outside entity (i.e., a hacker or cracker) attempts to break in to a network to steal data, read confidential information, or simply wreak havoc with an organization's business operations. While these external entities pose a significant threat to network security, internal entities frequently pose a far greater threat and typically are not adequately addressed by network security defense mechanisms. SAN security is an important part of a complete datacenter security solution. SAN security attempts to protect both data in transport (storage networking security) and data at rest (storage data security).


SAN Security Vulnerabilities


Fabric and target threats (unauthorized target access):
Compromised application data
Compromised LUN integrity
Compromised application performance
Unplanned downtime, costly data loss

Fabric protocol threats (unauthorized fabric services, protocol threats):
Compromised fabric stability
Compromised data security
Disruptive topology changes
Unplanned downtime, instability, poor I/O performance, costly data loss

SAN management threats (clear-text passwords, no audit of access attempts, accidental or intentional harmful management activity over the out-of-band Ethernet management connection):
Disruption of switch processing
Compromised fabric stability
Compromised data integrity and secrecy
Loss of service, LUN corruption, data corruption, data theft or loss

SAN Security Vulnerabilities


Security for the SAN has often been a matter of security by obscurity, as IT organizations relied on the inherent security of the data center to protect their storage. With the expansion of SANs outside the heavily defended data center perimeter, more robust security is needed to protect storage resources and the storage fabric. In addition, other vendors' platforms feature weak management security, with insecure management protocols and no ability to restrict management access.


Fabric Security Tiers

Host-based:
LUN mapping
Standard OS security

Fabric-based:
VSANs
Zoning
LUN zoning
Read-only zones
Port mode security
Port binding
Fabric binding
Authentication
Management security
Role-Based Access Control

Array-based:
LUN masking

Fabric Security Tiers


SAN security can be implemented at three distinct tiers:

Host
Fabric
Array

Security measures at each of the three tiers can be used by storage administrators to achieve the level of security needed for a particular environment. If an entire SAN is located within a single, physically secure data center, a more relaxed suite of security measures might be chosen. However, SANs are commonly being extended beyond the confines of the corporate data center, and thus implementing multiple security mechanisms at all three tiers is both warranted and necessary.


Fabric Security Limitations


Host and/or array LUN security: no fabric enforcement
Soft zoning: spoof the WWN or FCID and gain access
WWN-based hard zoning: spoof the WWN and gain access
Port-based zoning: occupy the port and gain access
Port security (WWN binding): spoof the WWN and occupy the port to gain access
DH-CHAP: need full authentication to gain access

Fabric Security Limitations


Traditional SAN security methods include LUN security on hosts and storage arrays as well as zoning in the fabric. As a new generation of SAN technology becomes available, additional security features can be deployed to close long-standing security vulnerabilities. Each security mechanism and its limitations are described below:

Host and/or array LUN security: Host and array LUN security does not rely on fabric enforcement and thus has limited effectiveness. By itself LUN security is not adequate to safeguard a SAN, but host LUN security and array LUN security can be used in conjunction with other security measures to create an effective security policy.
Soft zoning: Soft zoning is perhaps the oldest and most commonly deployed security method within SANs. Primarily it protects hosts from accidentally accessing targets with which they are unauthorized to communicate. However, soft zoning provides no fabric enforcement. If a host can learn the FCID of a target, soft zoning will not prevent that host from accessing the target.
Port-based zoning: Port-based zoning is applied to every FC frame that is switched, so it has a level of fabric enforcement not provided by soft zoning. However, the security provided by port-based zoning can be circumvented simply by gaining physical access to an authorized port.
WWN-based zoning: WWN-based zoning applies switching logic to frames based on their factory burned-in WWN rather than the physical port the device is connected to. WWN-based zoning can be defeated through the spoofing of WWNs, which is relatively trivial to accomplish.
Port security: Prevents unauthorized fabric access by binding specific WWNs to one or more given switch ports. In order to defeat port security, a hacker would need to both spoof the device WWN and access the specific port or ports that device is authorized to use.

DH-CHAP: Enforces fabric and device access through an authentication method during the fabric login phase. DH-CHAP offers an excellent method of securing SAN fabrics, but it does not provide encryption services. Standardized encryption services for SANs will soon be available from multiple vendors, including Cisco. Depending on the security protocols you have implemented, PPP authentication using MS-CHAP can be used with or without Authentication, Authorization, and Accounting (AAA) security services. If you have enabled AAA, PPP authentication using MS-CHAP can be used in conjunction with both TACACS+ and RADIUS. MS-CHAP V2 authentication is the default authentication method used by the Microsoft Windows 2000 operating system. Support of this authentication method on Cisco routers enables users of the Microsoft Windows 2000 operating system to establish remote PPP sessions without needing to first configure an authentication method on the client. MS-CHAP V2 authentication introduces an additional feature not available with MS-CHAP V1 or standard CHAP authentication: the change-password feature. This feature allows the client to change the account password if the RADIUS server reports that the password has expired.


Comprehensive Security Solutions


Fabric and target security:
VSAN-based security: only allow access to devices within the attached VSAN
Hardware-enforced zoning, complementary to VSAN segregation (LUN zoning, read-only zoning)
Device authentication using DH-CHAP

Fabric protocol security:
Port and fabric binding
Port mode security
Switch-to-switch authentication (FC-SP)

SAN management security:
AAA security with RADIUS and TACACS+
SSH for secure console sessions
Secure GUI access with SNMPv3 authentication and encryption
RBAC and logging for audit controls
Out-of-band Ethernet management connection

Comprehensive Security Solutions


While other SAN switch vendors have recently introduced features like hardware-enforced world-wide name (WWN) zoning, port and fabric binding, and device authentication, VSANs add a critical layer of protection by isolating ports on a per-department or per-application basis. The MDS supports standard Cisco role-based access control (RBAC) on a per-VSAN basis, providing a fine (but easily managed) level of granularity in assigning access permissions. Management data paths are secured with authenticated and encrypted SSH, SSL, and SNMPv3 sessions. All of this can be managed via centralized AAA services like Remote Authentication Dial In User Service (RADIUS) and Terminal Access Controller Access Control System (TACACS+), reducing management overhead and ensuring more consistent application of security policies across the enterprise.


Zoning
MDS Advanced Zoning Features

Used to control host access to storage devices within a VSAN
MDS 9000 supports both hard and soft zoning:
Soft zoning enforced by Name Server query-responses
Hard zoning enforced on every frame by the forwarding ASIC

Zoning options:
pWWN (attached Nx_Port)
FCID
FC Alias (within a VSAN)
Device Alias (global within a SAN)
fWWN (switch port-based zoning)
Interface (for example, fc1/2)
sWWN and port
LUN

Fully compliant with FC-GS-3, FC-GS-4, FC-SW-2, FC-SW-3, and FC-MI
Fabric Manager supports Zone Merge Analysis:
Prevents fabric merge failures due to zone database mismatch

MDS Advanced Zoning Features


Zoning is a mechanism to control access to devices within a Fibre Channel fabric. On the Cisco MDS 9000 family of switches and routers, zoning is enforced separately in each VSAN. The MDS 9000 supports both hard and soft zoning. Soft zoning is enforced through selective query responses made by the Fibre Channel name server. Hard zoning is applied to all data traffic by the forwarding ASIC. Zoning can be based on port and fabric WWN, FCID, interface, and LUNs. Cisco's implementation of zoning is fully compliant with FC standards including FC-GS-3, FC-SW-2, and FC-MI, and with SAN-OS 2.0 or higher the MDS supports the FC-GS-4 and FC-SW-3 standards, which allow greater consistency of zoning parameters across the fabric. Prior to bringing up an ISL between two switches, Fabric Manager can conduct a zone merge analysis to determine whether the two switches' zones can be successfully merged.
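For illustration, a minimal zoning configuration might look like the sketch below. The WWNs, zone names, and VSAN number are hypothetical, and the exact syntax can vary by SAN-OS release.

    ! Define a single-initiator zone in VSAN 10 (hypothetical pWWNs)
    switch(config)# zone name Host1_Array1 vsan 10
    switch(config-zone)# member pwwn 21:00:00:e0:8b:01:02:03
    switch(config-zone)# member pwwn 50:06:01:60:10:20:30:40
    ! Add the zone to a zoneset and activate it
    switch(config)# zoneset name ZS_Prod vsan 10
    switch(config-zoneset)# member Host1_Array1
    switch(config)# zoneset activate name ZS_Prod vsan 10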


MDS Zoning Services


All zoning services offered by Cisco are implemented in hardware ASICs:
No dependence on whether a mix of WWNs and Port_IDs is used in a zone; all hardware based
WWN-based zoning implemented in software with hardware enforcement; WWNs are translated to FCIDs to be frame-filtered

Hardware-Based Zoning Details:
Dedicated high-speed port filters called TCAMs (Ternary CAMs) filter each frame in hardware and reside in front of each port
Support up to 20,000 programmable entries consisting of zones and zone members
Up to 8000 zones per fabric
Very deep frame filtering for new innovative features
Wire-rate filtering performance: no impact regardless of the number of zones or zone entries
Optimized programming during zoneset activation: incremental zoneset updates
RSCNs contained within zones in a given VSAN
Selective default zone behavior: default is deny, configured per VSAN

MDS Zoning Services


Data plane traffic is secured with VSANs, guaranteeing segregation of traffic across shared fabrics, and with zoning to satisfy traffic segregation requirements within a VSAN. Hardware-based ACLs provide further granularity for advanced security options. The Cisco MDS 9509 leverages Cisco's experience securing the world's most sensitive data networks to deliver the industry's most secure storage networking platform. VSANs and zoning within the MDS 9000 Family of products are two powerful tools to aid the SAN designer in building robust, secure, and manageable networking environments while optimizing the use and cost of switching hardware. In general, VSANs are used to divide a redundant physical SAN infrastructure into separate virtual SAN islands, each with its own set of Fibre Channel fabric services. Because each VSAN supports an independent set of Fibre Channel services, a VSAN-enabled infrastructure can house numerous applications without concern for fabric resource or event conflicts between these virtual environments. Once the physical fabric has been divided, zoning is then used to implement a security layout within each VSAN that is tuned to the needs of each application within that VSAN.


Enhanced Zone Server


Basic mode:
Represents the zone server behavior of the GS-3/SW-2 standards
Supported in pre-2.0 SAN-OS releases

Enhanced mode:
Represents the zone server behavior of the GS-4/SW-3 standards
Available with SAN-OS 2.0 and greater
QoS parameters distributed as part of zone attributes
Consistent full-zone database across the fabric
Support for attributes in the standard
Consistent zoning policy across the fabric
Unique vendor type
Reduced payload for activation requests

Enhanced Zone Server


Starting with SAN-OS 2.0, the MDS supports the FC-GS-4 and FC-SW-3 standards, which allow greater consistency of zoning parameters across the fabric.


Securing Hosts and Storage


Host/array-based LUN security:
LUN Mapping in the host
LUN Masking in the storage array
Used to control host access to storage LUNs
Not enforced in the fabric
Prevents contention for storage resources and data corruption
Protection from unintentional security breaches
Lack of centralized management

Securing Hosts and Storage


LUN security is used to determine which hosts are allowed to access which storage volumes. LUN-level security is required in order to give fabric administrators the ability to control access to storage resources below the port level, such as accessing disks within a JBOD or logical volumes within a RAID array. LUN security can be enforced at the host or in the storage array. All HBAs sold today support LUN Mapping, and most intelligent storage arrays also allow administrators to restrict the hosts that can access each LUN with LUN Masking. LUN-level access could also be enforced by a router or switch in the SAN. LUN security is primarily used to prevent multiple hosts from accessing the same storage resources and thereby causing data corruption. Unless the storage array also supports LUN-level access control, these security techniques are voluntary. Host-level LUN security does not prevent an unauthorized host from connecting to the fabric and accessing storage resources.
Note Different vendors use different terminology to describe LUN security. Typical terms used include: LUN security, LUN masking, LUN mapping, Storage Domains. In this course we use the generic term LUN Security.

Example of Host-Based LUN Security: The HBA utility on the red host could be configured to only communicate with the WWN associated with storage port A. Example of Array-Based LUN Security: Storage Port B could be configured to only accept frames from the WWN associated with the blue host port.


LUN Zoning
LUN Zoning is the ability to zone an initiator with a subset of LUNs offered by a target (introduced in SAN-OS 1.2):
Use with storage arrays that lack LUN masking capability
Use instead of LUN masking to centralize management in heterogeneous storage environments
Can be managed centrally from CFM

The figure illustrates a LUN-zoned exchange: report_LUNs returns 10 LUNs available; report_size LUN_1 returns LUN_1 is 50 GB; report_size LUN_3 returns LUN_3 is unavailable.

LUN Zoning
Disk arrays typically have multiple Logical Units on them. Standard FC Zoning extends down to the switch port level or down to the WWN of the port, but not down to the LUN level. This means that any fabric containing disk arrays with multiple LUNs needs security policies configured on both the disk array (or multiple disk arrays) and on the FC switches themselves. LUN Zoning is a feature specific to switches in the Cisco MDS 9000 Family introduced in SAN-OS 1.2 that allows zoning to extend to individual LUNs within the same WWN. This means that the centralized zoning policy configured on the FC switches can extend to hardware-enforcing zoning down to individual LUNs in disk arrays. In the top half of the diagram LUN zoning allows the switch to grant the host access to disks 1 & 2 while preventing the host from accessing all other disks in the array.
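A sketch of how LUN zoning might be expressed in the CLI follows. The WWNs and LUN numbers are hypothetical, and the exact member syntax depends on the SAN-OS release.

    ! Zone the host with only LUNs 0 and 1 of the target port (hypothetical values)
    switch(config)# zone name Host1_Array1_LUN01 vsan 10
    switch(config-zone)# member pwwn 21:00:00:e0:8b:01:02:03
    switch(config-zone)# member pwwn 50:06:01:60:10:20:30:40 lun 0x0
    switch(config-zone)# member pwwn 50:06:01:60:10:20:30:40 lun 0x1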


Read-Only Zoning
Read-Only Zoning leverages the hardware-based frame processing of the MDS 9000 Family (introduced in SAN-OS 1.2):
Use for backup servers and snapshots
Especially useful for media servers that need high-speed, block-level access to rich content for broadcast, bypassing the NAS service
Does not work for certain file system types (e.g., NTFS)

The figure shows a streaming media server in a read-only zone: FCP_READ and FCP_DATA are permitted, while FCP_WRITE is blocked.

Read-Only Zoning
Standard FC Zoning is used to permit devices to communicate with each other. Standard FC Zoning cannot perform any advanced filtering, for example, blocking or allowing specific I/O operations such as a Write I/O command. The Cisco MDS 9000 Family provides the ability to enforce read-only zones in hardware. That is, the switch can enforce read-only access to a given device (e.g., a disk) and will block any write requests. Read-only zoning filters FC4-Command frames based on whether the command is a read or write command. When used in conjunction with LUN Zoning, read-only or read-write access can be granted for specific hosts to specific LUNs. Read-Only Zoning was introduced with SAN-OS 1.2. This functionality is available on every port across the entire Cisco MDS 9000 product family. On the bottom half of the diagram a streaming video server is granted read-only access to the storage array, thus preventing inadvertent or malicious corruption of data. Certain operating systems use file systems that depend on the ability to write to disks (e.g., many Windows file systems). Such file systems may not function properly when placed in a read-only zone.
Note Read-Only Zones requires the Enterprise License Package.
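A hedged sketch of configuring a read-only zone is shown below. The zone name and VSAN are hypothetical, and as noted above the attribute requires the Enterprise license; exact syntax can vary by SAN-OS release.

    ! Mark the zone as read-only so write commands are blocked in hardware
    switch(config)# zone name MediaServer_Content vsan 10
    switch(config-zone)# attribute read-only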


VSAN Best Practices


Use VSANs to isolate each application
Use IVR to allow resource sharing across VSANs
Suspend VSAN 1:
Move unused ports to VSAN 4094
Do not configure zones in VSAN 1
Set the default zone policy to deny
Prevents WWN spoofing on an unused port

The figure shows an isolated HR_DB application in VSAN 201, a CUST_DB application in VSAN 202, and the default VSAN 1.

VSAN Best Practices


Cisco recommends the following VSAN best practices:

Use VSANs to isolate each application whenever feasible.
Use IVR to allow resource sharing across VSANs; this allows complete isolation of each application.
Place all unused ports in VSAN 4094.
Because ports are in VSAN 1 by default, suspend VSAN 1, do not configure any zones, and set the default zone policy to deny. This will prevent WWN spoofing on unconfigured ports.
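A minimal CLI sketch of the VSAN 1 hardening steps listed above is shown below; exact syntax can vary by SAN-OS release, and moving unused ports out of VSAN 1 is done separately through normal VSAN membership configuration.

    ! Suspend VSAN 1 and ensure its default zone denies traffic
    switch(config)# vsan database
    switch(config-vsan-db)# vsan 1 suspend
    switch(config)# no zone default-zone permit vsan 1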


Zoning Best Practices


Use zoning services to isolate servers:
Use Single Initiator Zoning - configure one zone per HBA
Use read-only zones for read-only targets, e.g., snapshot volumes
Use LUN zoning to centralize LUN security
Set default-zone policies to deny

Configure zones from only one or two switches:
Active zoneset will propagate to all switches in the fabric
Prevents confusion and potential errors due to conflicting full zonesets
If only one zoneset is needed, configure on one switch only
Can recover full zoneset from active zoneset if that switch fails

Zoning Best Practices


Cisco recommends the following zoning best practices:

Zoning should always be deployed in an FC fabric. Typically one zone is configured per HBA communicating with storage; this is called Single Initiator Zoning. Depending on the particular environment, port-based or WWN-based zoning may be selected, although WWN zoning provides more convenience and less security than port-based zoning. Port security features can be used to harden WWN-based zones. Read-only zones should be applied to LUNs that will not be modified by initiators. LUN zoning can be used to augment or replace array-based LUN security. Set the default zone policy to deny to prevent inadvertent initiator access to a target.

Only 1 or 2 switches should be used to configure zoning. This will help prevent confusion due to conflicting zonesets or the activation of an incomplete zoneset. If only one zoneset is needed (i.e. the active zoneset), you can configure the full zoneset on one switch only. In the event that switch goes down and the full zoneset is lost, you can easily recover the full zoneset from the active zoneset.


Port and Fabric Binding


Port Mode Security
Only allow edge ports to form F_Ports or FL_Ports
Limit users who can change port mode via RBAC
Port mode security best practices:
Use port mode security on all switch ports
Shut down all unused ports
Place unused ports in VSAN 4094

The figure contrasts ports locked to E_Port mode, Fx_Port mode (F and FL only), and F_Port mode (F only) with ports left in Auto mode, which accept any port type.

Port Mode Security


Security and convenience are often at odds when administering a SAN. For example, the most convenient port mode setting is Auto, which allows any type of device to log in to a given switch port. While convenient, Auto mode could enable a user to intentionally or inadvertently misuse the fabric. A more secure practice is to specifically configure a switch port to only allow a connection from an expected device type. In the diagram, the top left port is configured to only function as an E_Port. If a host were to try to use this port to access the fabric, the switch would not allow the connection. Similarly, the storage array in the left half of the diagram is connecting through a switch port configured as either an F or FL port. The switch will only allow an N_Port or NL_Port to connect to the fabric through this port. In high-security environments, port mode security should be used on all switch ports.
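The sketch below locks an ISL port to E mode and an edge port to Fx mode. Interface numbers are hypothetical, and exact syntax can vary by SAN-OS release.

    ! Lock an ISL to E_Port mode only
    switch(config)# interface fc1/1
    switch(config-if)# switchport mode E
    ! Lock an edge port so that only F or FL logins are accepted
    switch(config)# interface fc1/2
    switch(config-if)# switchport mode Fx
    ! Shut down an unused port
    switch(config)# interface fc1/15
    switch(config-if)# shutdown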


Port and Fabric Binding


Security feature that binds a remote entity to a given switch port
The remote entity can be a host, target, or a switch
Remote entities are identified by WWN
Checked on all VSANs where activated
Failure results in link-level login failure
Prevents S_ID spoofing
Included with the Enterprise license
Auto-learn mode to ease configuration

The figure shows a binding database that maps Port X to WWN 1, Port Y to WWN 2, and Port Z to WWN 3.

Port and Fabric Binding


The port security feature restricts access to a switch port, allowing only authorized devices to connect to that port, and blocks all other access. Authorized devices may be hosts, targets, or other switches, and are identified by their World Wide Names (WWNs). Port security checks are conducted on all VSANs that have the feature activated. Port security is available upon installation of the Enterprise license. Typically, any Fibre Channel device in a SAN can attach to any SAN switch port and access SAN services based on zone membership. Port Security is a feature that was introduced into the Cisco MDS 9000 family in SAN-OS 1.2 and is used to prevent unauthorized access to a switch port by binding specific WWNs as having access to one or more given switch ports. When Port Security is enabled on a switch port, all devices connecting to that port must be in the port-security database and must be listed in the database as bound to that port. If both of these criteria aren't met, the port won't achieve an operationally active state and the devices connected to the port will be denied access to the SAN. In the case of a storage device or host, the port name (pWWN) or node name (nWWN) can be used to lock authorized storage devices to a specific switch port. In the case of an E_Port/TE_Port, the switch name (sWWN) is used to bind authorized switches to a given switch port. When Port Security is enabled on a port:

Login requests from unauthorized Fibre Channel devices (Nx ports) and switches (xE ports) are rejected. All intrusion attempts are reported to the SAN administrator.

Copyright 2006, Cisco Systems, Inc.

Securing the SAN Fabric

253

The auto-learn option allows for rapid migration to Port Security when it is being activated for the first time. Rather than manually securing each port, auto-learn allows for automatic population of the port-security database based on an inventory of currently connected devices.
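A sketch of enabling port security in a single VSAN follows. The pWWN, interface, and VSAN values are hypothetical, and auto-learn and activation behavior can differ between SAN-OS releases.

    ! Enable the port security feature and bind a device pWWN to a port
    switch(config)# port-security enable
    switch(config)# port-security database vsan 10
    switch(config-port-security)# pwwn 21:00:00:e0:8b:01:02:03 interface fc1/2
    ! Activate the database for the VSAN (auto-learn can populate it from connected devices)
    switch(config)# port-security activate vsan 10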


Port Security Best Practices


Use port mode assignments:
Lock (E)ISL ports to E_Port mode
Lock access ports to Fx_Port mode

Use port security features everywhere:
Bind devices to the switch as a minimum level of security
Bind devices to a port as an optimal configuration
Consider binding to a group of ports in case of port failure
Bind switches together at ISL ports; bind to a specific port, not just the switch

Use FC-SP authentication for switch-to-switch fabric access:
Use device-to-switch authentication when available
Use unique passwords for each FC-SP connection
Use RADIUS or TACACS+ for centralized FC-SP password administration

Port Security Best Practices


Port security best practices include the use of port mode assignments:

Lock (E)ISL ports to only be (E)ISL ports
Lock initiator and target ports down to F or FL mode

When higher levels of security are desired, use port security features:

Bind devices to the switch as a minimum level of security
Bind devices to a port as an optimal configuration
Consider binding to a group of ports in case of port failure
Bind switches together at ISL ports; bind to a specific port, not just the switch

Use FC-SP authentication for switch-to-switch fabric access:

Use device-to-switch authentication when available. FC-SP-based authentication should be considered mandatory in a secure SAN in order to prevent access to unauthorized data via spoofed or hijacked WWNs, where traditional Port Security would be vulnerable.

Use unique passwords for each FC-SP connection. Use RADIUS or TACACS+ for centralized FC-SP password administration:

RADIUS or TACACS+ authentication is recommended for fabrics with more than five FC-SP-enabled devices.


Authentication and Encryption


WWN Identity Spoofing
Zoning provides segregation, but lacks any form of authentication
Circumventing zones through impersonation of a member (identity spoofing) is both possible and relatively trivial to do

http://www.emulex.com/ts/fc/docs/wnt2k/2.00/pu.htm


WWN Identity Spoofing


While zones provide a good method of segregating groups of hosts and disks within a SAN, they do not effectively protect a SAN from a malicious attack. Most zones today are based on WWNs, which are relatively trivial to spoof. Here we see an HBA configuration utility that allows an administrator to override the factory-set device WWN. Such a utility might be used by a hacker to circumvent WWN-based zoning and thus gain unauthorized access to a fabric. Both Emulex and QLogic provide tools to change the WWN of the host bus adapter.


Host and Switch Authentication


Prevents accidental and/or malicious devices from joining a secure SAN:
Hosts
Switches
Initial phase focused on authentication
DH-CHAP between devices
Centralized RADIUS or TACACS+ server
MS-CHAP now supported in SAN-OS 3.x
Based on FC-SP security protocols

The figure shows trusted hosts and storage subsystems authenticating to the fabric with FC-SP (DH-CHAP), backed by a centralized RADIUS or TACACS+ server.

Host and Switch Authentication


Support for device authentication was introduced into the Cisco MDS 9000 family in SAN-OS 1.3. This and subsequent releases support data integrity (tamper-proofing) and authentication (non-repudiation) for both switch-to-switch and host-to-switch communication. Authentication is based on the Challenge Handshake Authentication Protocol (CHAP) with Diffie-Hellman (DH) extensions (DH-CHAP). Cisco's implementation of DH-CHAP supports node-to-switch and switch-to-switch authentication. Authentication can be performed locally in the switch or remotely through a centralized RADIUS or TACACS+ server. If the authentication credentials cannot be ascertained or the authentication check fails, a switch or host will be blocked from joining an FC fabric. Secure switch control protocols prevent accidental and/or malicious devices from joining a secure SAN. Because of the distributed nature of the FC protocol, a user can create a DoS attack by maliciously or accidentally connecting hosts and switches into an existing fabric. This is especially a concern when deploying a geographically dispersed enterprise-wide or campus-wide fabric. The MDS 9000 addresses this with the host-to-switch and switch-to-switch authentication features proposed by T11 in the FC-SP specification. Before two entities start exchanging control and data frames, they mutually authenticate each other by using an external RADIUS or TACACS+ server.
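For reference, a minimal DH-CHAP configuration sketch is shown below. The passwords and peer WWN are hypothetical placeholders, and exact syntax can vary by SAN-OS release.

    ! Enable FC-SP and set the local DH-CHAP password
    switch(config)# fcsp enable
    switch(config)# fcsp dhchap password L0calS3cret
    ! Configure the password expected from a peer device (hypothetical sWWN)
    switch(config)# fcsp dhchap devicename 20:00:00:0d:ec:0a:0b:0c password P33rS3cret
    ! Require FC-SP authentication on a specific interface
    switch(config)# interface fc1/1
    switch(config-if)# fcsp on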


Security for IP Storage Traffic


CHAP provides authentication of iSCSI hosts
IPsec provides end-to-end authentication, data integrity, and encryption
Hardware-based, high-performance solution: MDS 9216i or MPS-14/2 module

The figure shows iSCSI traffic between an iSCSI server and the primary site secured with CHAP authentication and IPsec, and FCIP traffic between the primary site and a remote site across the IP WAN secured with IPsec.

Security for IP Storage Traffic


IP Security (IPsec) is available for FCIP and Internet Small Computer Systems Interface (iSCSI) traffic over the Gigabit Ethernet ports on the Multiprotocol Services modules and the Cisco MDS 9216i. The proven IETF-standard IPsec capabilities offer secure authentication, data encryption for privacy, and data integrity. The Internet Key Exchange Version 1 (IKEv1) and IKEv2 protocols are used for dynamically setting up the security associations for IPsec, using preshared keys for remote-side authentication.


Management Security
SAN Management Security Vulnerabilities
SAN management threats:
Disruption of switch processing
Compromised fabric stability
Compromised data integrity and secrecy
Loss of service, LUN corruption, data corruption, data theft or loss

SAN management vulnerabilities:
Unsecured console access
Unsecured GUI application access
Unsecured API access
Privilege escalation / unintended privilege
Lack of audit mechanisms
Clear-text passwords
No audit of access attempts
Accidental or intentional harmful management activity over the out-of-band Ethernet management connection

SAN Management Security Vulnerabilities


In an environment where security is only as strong as the weakest link, SAN management security is often overlooked as one of the most vulnerable and dangerous points of compromise. An attack that hijacks a management session or even accidental management activity can create serious consequences that impact the integrity of the fabric and the data assets it provides. Points of SAN management security exposure include:

Unsecured console access
Unsecured GUI application access
Unsecured API access
Privilege escalation / unintended privilege
Lack of audit mechanisms


Securing Management Access

The figure shows the data path from the management client to the switch, protected at each point by encryption (secure protocols), a firewall, a management VLAN, a management VSAN, IP ACLs, and RBAC.

Securing Management Access


To fully secure management data paths, you need to implement security measures at all points in the data path:

Use secure protocols. SNMPv3, SSH, and SSL provide strong authentication and encrypted sessions. Disable SNMPv2, Telnet, and HTTP.
Use VPNs for remote management.
Always implement firewalls between the management network and the Internet. Intrusion Detection Systems (IDS) should also be included in the solution. In a large company, consider implementing an internal firewall to isolate the management network from the rest of the company LAN.
Use a private management VLAN to isolate management traffic.
Implement IP ACLs to restrict access to mgmt0.
Management VSANs can be configured to create a logical SAN for management traffic only.
Use role-based access control (RBAC) to restrict user permissions.


Secure Management Protocols


Simple Network Management Protocol (SNMP):
SNMPv3 used to communicate between switch and GUI management applications
Supports encryption and authentication
SNMPv1 and v2 also supported for legacy applications
More than 50 MIBs supported

Secure Shell (SSH) v2:
Encrypts and authenticates traffic between switch and management station
Used for CLI sessions instead of Telnet
SSH host key pair: RSA, RSA1, DSA, or AES

Secure Management Protocols


The Cisco MDS 9000 Family of switches supports an extensive SNMP facility, including traps. MDS 9000 switches use SNMPv3, which supports encryption and authentication. SNMPv1 and v2 are also supported for legacy applications. The Cisco MDS 9000 Family of switches supports over 50 SNMP MIBs, allowing secure management from both Cisco GUI management applications and third-party applications. SSHv2 (Secure Shell version 2) encrypts CLI traffic between the client and the MDS 9000, authenticates communication between client and host, and prevents unauthorized access. The MDS 9000 platform supports the Rivest, Shamir, and Adleman (RSA1 and RSA) and Digital Signature Algorithm (DSA) public key algorithms, as well as the Advanced Encryption Standard (AES). SSH should be used instead of Telnet.
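A hedged sketch of enabling these secure management protocols is shown below. The key length, user name, and passphrases are hypothetical, and exact syntax can vary by SAN-OS release.

    ! Generate an SSH host key, enable SSH, and disable Telnet
    switch(config)# ssh key rsa 1024
    switch(config)# ssh server enable
    switch(config)# no telnet server enable
    ! Create an SNMPv3 user with authentication and privacy (hypothetical credentials)
    switch(config)# snmp-server user monitor network-operator auth sha AuthPass123 priv PrivPass123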


Role-Based Access Control (RBAC)


RBAC allows different users and groups to be granted appropriate levels of access to management interfaces.

Predefined roles:
Network-operator (read only): exec mode file system commands, show commands and diagnostics
Network-admin (read/write): access to all CLI commands

Customized roles:
Access to subsets of CLI commands

VSAN-based RBAC:
Deploy on a VSAN basis, by department or by administrative function
The network administrator configures and manages the overall network
VSAN administrators configure and manage their own VSAN only

The figure shows separate Finance, Engineering, and Email VSANs, each managed by its own VSAN administrator.

Role-Based Access Control


RBAC allows different administrative users and groups to be granted different levels of access to management interfaces. Some administrators might be given read-only access to permit device monitoring, others might be given the ability to change port configurations, while only a few trusted administrators are given the ability to change fabric-wide parameters. With SAN-OS version 1.3.1 and above, customers are able to define roles on a per-VSAN basis. This enhanced granularity allows different administrators to be assigned to manage different SAN domains.

Role-Based Security Best Practices


Cisco supports RBAC for MDS switches, allowing different administrative users and groups to be granted different levels of access to management interfaces. Some administrators might be given read-only access to permit device monitoring, others might be given the ability to change port configurations, while only a few trusted administrators are given the ability to change fabric-wide parameters. Users have the union of access permissions from all roles assigned to them. These roles can be assigned to either CLI or SNMP users. Two roles are predefined: network-admin and network-operator. Other roles can be created, with CLI commands enabled or blocked selectively for that particular role.

262

Cisco Storage Design Fundamentals (CSDF) v3.0

Copyright 2006, Cisco Systems, Inc.

VSAN-Based RBAC
With SAN-OS version 1.3.1 and higher, customers are able to define roles on a per-VSAN basis. This enhanced granularity allows different administrators to be assigned to manage different SAN domains, as defined by VSANs. A network administrator is responsible for overall configuration and management of the network, including platform-specific configuration, configuration of roles, and role assignment. Matching the VSANs to the existing operational structure makes it easier to map user roles to realistic groupings of operational responsibility. VSAN-based roles limit the reach of individual VSAN administrators to the resources within their logical domain. In addition, efficient grouping of commands into roles, and assignment of roles to users, allows mapping of user accounts to practical roles, which reduces the likelihood of password sharing among operational groups.
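A sketch of a custom, VSAN-scoped role might look like the following. The role name, VSAN range, and user credentials are hypothetical, and exact rule syntax can vary by SAN-OS release.

    ! Create a role limited to configuration and show commands in VSANs 10-19
    switch(config)# role name finance-admin
    switch(config-role)# rule 1 permit config
    switch(config-role)# rule 2 permit show
    switch(config-role)# vsan policy deny
    switch(config-role-vsan)# permit vsan 10-19
    ! Assign the role to a user account (hypothetical credentials)
    switch(config)# username fin-ops password S3cr3tPw role finance-admin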


AAA Services
Authentication
User access with ID and password

Authorization
Role level or set of privileges

Accounting
Log of the user's management session

Centrally stored access information
Covers needs for various applications:
CLI login (Telnet/SSH/console/modem)
SNMP (authentication and accounting)
iSCSI (CHAP authentication)
FC-SP (DH-CHAP authentication)

AAA Services
AAA services consist of authentication, authorization, and accounting facilities for CLI.

Authentication refers to the authentication of users to access a specific device. Within the Cisco MDS 9000 Family switches, RADIUS and TACACS+ can be used to centralize the user accounts for the switches. When a user tries to log on to the switch, the switch will validate the user via information gathered from the central RADIUS or TACACS+ server. Authorization refers to the scope of access that users receive once they have been authenticated. Assigned roles for users can be stored in a RADIUS or TACACS+ server along with a list of actual devices that each user should have access to. Once the user has been authenticated, the switch can then refer to the RADIUS or TACACS+ server to determine the extent of access the user will have within the switched network. Accounting refers to the ability to log all commands entered by a user. These command logs are sent to the RADIUS or TACACS+ server and placed in a master log. This log can then be parsed to trace a user's activity and create usage reports or change reports. All exchanges between a RADIUS or TACACS+ server and a RADIUS or TACACS+ client switch can be encrypted using a shared key for added security.

RADIUS and TACACS+ are protocols used for the exchange of attributes or credentials between an AAA server and a client device (in this case, the switch). RADIUS and TACACS+ cover authentication, authorization, and accounting needs for various applications, including: CLI login via Telnet, SSH, console, and modem; SNMP accounting; iSCSI CHAP authentication; and FC-SP DH-CHAP authentication. Separate policies can be specified for each application. The MDS 9000 also has the ability to send RADIUS accounting records to the system log (syslog) service. The advantage of this feature is the consolidation of messages for easier parsing and correlation.
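The following is a minimal AAA configuration sketch using a RADIUS server group. The server addresses, keys, and group name are hypothetical, and exact syntax can vary by SAN-OS release.

    ! Define a RADIUS server and a server group, then use it for login authentication
    switch(config)# radius-server host 192.0.2.10 key MySharedKey
    switch(config)# aaa group server radius RadServers
    switch(config-radius)# server 192.0.2.10
    switch(config)# aaa authentication login default group RadServers
    ! TACACS+ must be enabled before TACACS+ servers can be configured
    switch(config)# tacacs+ enable
    switch(config)# tacacs-server host 192.0.2.11 key MyTacacsKey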

Centralizing Administration
Use RADIUS and/or TACACS+ for:
SNMP and CLI users
iSCSI CHAP
FC-CHAP

RADIUS and TACACS+ deployments:
Datacenter routers and switches
Dial/VPN servers
Terminal servers
Network management stations

Improved security due to central control in applying access rules
Use redundant servers
Connect RADIUS/TACACS+ to LDAP or Active Directory servers to centralize all accounts enterprise-wide

The figure shows Cisco MDS 9000 Family switches using a Windows 2000 IAS RADIUS server backed by Microsoft Active Directory, a RADIUS server backed by an LDAP server, and a Linux TACACS+ server backed by an RDBMS server.

Centralizing Administration
SAN administration must be limited to qualified and authorized individuals to assure proper configuration of the devices and the fabric. Enterprise-wide security administration is enabled through support for RADIUS and TACACS+ servers in the MDS 9000 family. The use of RADIUS or TACACS+ allows user accounts and roles to be applied uniformly across the enterprise, both simplifying administrative tasks and increasing security by providing centralized control for application of access rules. In addition, the switch can record management accounting information, logging each management session in a switch. These records may then be used to generate reports for troubleshooting purposes and user accountability. Accounting data can be recorded locally, on the switch itself, or by RADIUS servers. RADIUS is a standards-based protocol defined by RFC 2865 and several associated RFCs. RADIUS uses UDP for transport. TACACS+ is a Cisco client-server protocol which uses TCP (port 49) for transport. The addition of TACACS+ support in SAN-OS enables the following advantages over RADIUS authentication:

The TCP transport protocol provides reliable transfers with a connection-oriented protocol.
TACACS+ provides independent, modular AAA facilities; authorization can be done without authentication.
TACACS+ encrypts the entire protocol payload between the switch and the AAA server to ensure higher data confidentiality; the RADIUS protocol only encrypts passwords.


End-to-End Security Design


Intelligent SAN security elements shown in the figure:
Device/SAN management security via SSH, SFTP, SNMPv3, and RBAC
Secure in-transit protocols: CHAP and IPsec for iSCSI hosts
VSANs provide secure isolation
RADIUS or TACACS+ server for authentication
Port binding and DH-CHAP for fabric access
Hardware-based zoning via port and WWN, LUN zoning, read-only zones
FC SANs interconnected over FCIP, DWDM, or SONET

End-to-End Security Design and Best Practices


The MDS 9000 platform provides a full suite of intelligent security functions that, when deployed, enable a truly secure SAN environment.

Secure SAN management is achieved via role-based access. It includes customizable roles that apply to CLI, SNMP, and web-based access, along with full accounting support. Secure management protocols such as SSH, SFTP, and SNMPv3 ensure that outside connection attempts to the MDS 9000 network are valid and secure. Secure switch control protocols that leverage the IPsec ESP (Encapsulating Security Payload) specification provide Fibre Channel Security Protocol (FC-SP) services, with DH-CHAP authentication used between switches and devices. MDS 9000 support for RADIUS and TACACS+ AAA services helps ensure user, switch, and iSCSI host authentication for the SAN. Secure VSANs and hardware-enforced zoning restrictions using port IDs and World Wide Names provide additional layers of device access and isolation security.

Security measures implemented in this scenario include:


- DH-CHAP capable HBAs installed in all hosts to enable authenticated fabric access
- Port-mode security on all switch ports
- Port security on all switch ports
- Database cluster server groups use their own VSAN to provide traffic isolation
- Array-based LUN security

MDS Intelligent Security Solutions

Level of security (lowest to highest layer):
- Management access: SSHv2, SNMPv3, SSL; centralized AAA with RADIUS and TACACS+; Role-Based Access Controls (RBAC); VSAN-based RBAC; IP ACLs
- Traffic isolation and device access controls: VSANs, hardware zoning, LUN zoning, read-only zones, port security, fabric binding
- Device authorization and authentication: host/switch authentication for FC and FCIP, iSCSI CHAP authentication, MS-CHAP authentication, digital certificates
- Data integrity and encryption: security for data-in-motion (IPsec for iSCSI and FCIP)

These layers reflect the evolution of MDS security solutions.

MDS Intelligent Security Solutions


Cisco offers the industry's most comprehensive set of security features in the MDS 9000 Family:

- No impact on switch performance; data path features are all hardware-based
- Traditional hard and soft zoning as well as advanced LUN and read-only zones are available on MDS devices
- Port mode security is an excellent way to limit unauthorized access to the fabric
- Port security binds device WWNs to one or more switch ports
- DH-CHAP provides device authentication services
- IPsec provides integrity and security for in-transit data

All security features are easily managed through Cisco's Fabric Manager application.


Lesson 9

Designing SAN Extension Solutions


Overview
In this lesson, you will learn how to effectively deploy SAN extension solutions on the MDS 9000 platform, including key applications and environments, high-availability features, performance enhancements, Inter-VSAN Routing, and optical solutions.

Objectives
Upon completing this lesson, you will be able to identify issues and solutions for SAN extension. This includes being able to meet these objectives:

- Identify applications for SAN extension
- Identify network transports for SAN extension
- Explain design configurations for SAN extension over DWDM and CWDM
- Define FCIP
- Explain design configurations for SAN extension using FCIP
- Describe the features of the MDS 9000 IP Services Modules
- Explain how to build highly available FCIP configurations
- Explain how IVR increases the reliability of SAN extension links
- Explain how to secure extended SANs
- Explain the options available for optimizing performance of low-cost FCIP transports

SAN Extension Applications

Data Backup and Restore
- Data is backed up to a remote data center
- Backup is accessible directly over the MAN/WAN
- Reduces the Recovery Time Objective (RTO)
- Much faster than standard offsite vaulting (trucking in tapes)
- Ensures data integrity, reliability, and availability
- Leverages the infrastructure of existing facilities

(Figure: backup traffic flows from the local data center across the WAN to the remote data center.)

Data Backup and Restore


Remote backup is a core application for FCIP. It is sometimes known as remote vaulting. In this approach, data is backed up using standard backup applications, such as Veritas NetBackup or Legato Celestra Power, but the backup site is located at a remote location. FCIP is an ideal solution for remote backup applications because:

- FCIP is relatively inexpensive compared to optical storage networking.
- Enterprises and Storage Service Providers (SSPs) can provide remote vaulting services using existing IP WAN infrastructures.
- Backup applications are sensitive to high latency, but in a properly designed SAN the application can be protected from problems with the backup process by using techniques such as snapshots and split mirrors.


Data Replication
- Data is continuously synchronized across the network
- Data can be mirrored for multiple points of access
- Enables rapid failover to the remote data center for 24/7 data availability
- Reduces RTO as well as the Recovery Point Objective (RPO)

(Figure: replication traffic flows from the local data center across the WAN to the remote data center.)

Data Replication
The primary type of application for an FCIP implementation is a disk replication application used for business continuance or disaster recovery. Examples of this type of application include:

- Array-based replication schemes such as EMC Symmetrix Remote Data Facility (SRDF), Hitachi TrueCopy, IBM Peer-to-Peer Remote Copy (PPRC), and HP/Compaq Data Replication Manager (DRM)
- Host-based replication schemes such as VERITAS Volume Replicator (VVR)


Data Replication (Cont.)

Asynchronous and synchronous replication:
- Need transport solutions that address different levels of bandwidth and latency requirements
- Example: multi-hop replication

(Figure: multi-hop example with synchronous replication over DWDM and asynchronous replication over the WAN.)

Replication applications can run in synchronous mode, where a disk write is not acknowledged until the remote copy is complete, or in asynchronous mode, where disk writes are acknowledged before the remote copy is completed. Applications using synchronous replication are very sensitive to latency and might suffer unacceptable performance, so customer requirements should be carefully weighed when deploying an FCIP link in a synchronous environment. FCIP can be suitable for synchronous replication when run over local Metro Ethernet or short-haul WDM transport.
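A back-of-the-envelope sketch makes this latency sensitivity concrete. It assumes roughly 5 microseconds of one-way propagation per kilometer of fiber (the figure quoted later in this lesson) and one round trip per acknowledged write; real replication engines pipeline multiple outstanding writes, so treat the numbers only as an illustration.

    # Impact of distance on a synchronous write that must wait for the remote
    # acknowledgement. Assumes ~5 us/km propagation and one round trip per write.
    PROP_US_PER_KM = 5.0

    def sync_write_penalty(distance_km: float, local_write_ms: float = 0.5):
        rtt_ms = 2 * distance_km * PROP_US_PER_KM / 1000.0
        total_ms = local_write_ms + rtt_ms
        max_serial_iops = 1000.0 / total_ms     # one outstanding write at a time
        return rtt_ms, total_ms, max_serial_iops

    for km in (10, 100, 1000):
        rtt, total, iops = sync_write_penalty(km)
        print(f"{km:>5} km: RTT {rtt:5.1f} ms, write {total:5.1f} ms, <= {iops:6.0f} serial IOPS")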


SAN Extension Transports

Dark Fiber
Dark fiber is viable over data center or campus distances:
- Single-mode fiber up to 10 km at 2 Gbps (longwave optics)
- Multimode fiber up to 300 m at 2 Gbps
- Joined switches form a single fabric
- The fabric will segment if there is a link failure (a disruptive event)

(Figure: two FC SANs joined over dark fiber, with diverse paths for availability.)

Dark Fiber
The type of fiber used defines maximum link distances for connecting Fibre Channel ports over dark fiber:

- Single-mode 9 micron fiber will support 10 km at 1 Gbps, 2 Gbps, or 4 Gbps.
- Multimode 50 micron fiber will support 500 m at 1 Gbps, 300 m at 2 Gbps, and 150 m at 4 Gbps.
- Multimode 62.5 micron fiber will support 350 m at 1 Gbps, 150 m at 2 Gbps, and 75 m at 4 Gbps.

When two switches are joined together with an ISL, they merge fabrics and become part of the same fabric with a shared address space, shared services and a single principal switch. This is a disruptive event and FSPF will build a new routing table which is distributed to all switches within the fabric. If there is a link failure, then the single fabric will segment into two separate fabrics, each with their own address space, FC services and each with their own principal switch. FSPF must disruptively build a routing table for each segmented fabric once again.


DWDM
DWDM enables up to 32 channels to share a single fiber pair:
- Divides a single beam of light into discrete wavelengths (lambdas)
- Each signal can be carried at a different rate (2.5 Gbps, 10 Gbps)
- Dedicated bandwidth for each multiplexed channel, ~1 nm spacing
- DWDM transponders can support multiple protocols and speeds
- Point-to-point distance limited to approximately 200 km

(Figure: routers, FC switches, and SONET equipment feed transponders on a DWDM multiplexer/demultiplexer pair.)

DWDM
A single fiber pair connecting two FC switches through an ISL provides a single channel (wavelength of light) between the two switches. DWDM enables up to 32 channels to share a single fiber pair by dividing the light into discrete wavelengths, or lambdas, separated by approximately 1 nm around the 1550 nm wavelength. Each DWDM lambda can carry a full-duplex FC, ESCON, FICON, or Ethernet channel. DWDM transponders convert each channel into its dedicated lambda and multiplex it onto a 2.5 Gbps or 10 Gbps link between DWDM multiplexers. DWDM signals can be amplified, and point-to-point distances are approximately 200 km.
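The aggregate capacity difference between DWDM and CWDM follows directly from the channel counts and rates quoted in this lesson; a trivial sketch:

    # Aggregate capacity per fiber pair, using figures from this lesson
    # (DWDM: 32 lambdas at up to 10 Gbps; CWDM: 8 lambdas at 2 Gbps,
    # with a 2.5 Gbps per-channel ceiling).
    def fiber_pair_capacity(channels: int, gbps_per_channel: float) -> float:
        return channels * gbps_per_channel

    print("DWDM:", fiber_pair_capacity(32, 10), "Gbps")   # 320 Gbps
    print("CWDM:", fiber_pair_capacity(8, 2), "Gbps")     # 16 Gbps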


CWDM
CWDM allows eight channels to share a single fiber pair:
- Each channel uses a different color SFP or GBIC
- Provides 8 times the bandwidth over a pair of fibers (8 x 2 Gbps = 16 Gbps)
- CWDM is much less costly than DWDM
- Channel spacing is only 20 nm
- CWDM multiplexers are passive, un-powered optical devices (prisms)
- Maximum distance is approximately 100 km; the signal cannot be amplified

(Figure: eight wavelengths, 1470 / 1490 / 1510 / 1530 / 1550 / 1570 / 1590 / 1610 nm, multiplexed onto one fiber pair between two OADM muxes.)

CWDM
CWDM is much less costly than DWDM because the channel spacing is only 20 nm and requires much less precise optics. CWDM provides 8 channels between two CWDM multiplexers over a single fiber pair. CWDM multiplexers are usually un-powered devices containing a very accurate prism to multiplex 8 separate wavelengths of light onto a single fiber pair. Maximum distance is approximately 100 km.


SONET / SDH
SONET and SDH support longer distances than WDM:
- Robust network management and troubleshooting
- Significant installed infrastructure
- Variety of protection schemes
- Limited bandwidth in some service areas (without use of DWDM)

(Figure: MDS 9000 switches carrying Fibre Channel over an ONS 15454 with SL line cards.)

SONET/SDH
SONET (North America) and SDH (rest of the world) are managed optical technologies that support much longer distances than CWDM or DWDM, and are typically used for city-to-city or country-to-country links.


Cisco End-to-End Storage Solutions

- Routed IP network: backup transport service, low-end applications
- SONET/SDH: synchronous/asynchronous replication, tape vaulting, mid-range applications
- CWDM/DWDM: synchronous replication, high-end applications

Cisco End-to-End Storage Solutions

Cisco provides a large number of flexible solutions to extend communication between remote data centers.


Extending SANs with WDM

DWDM vs CWDM

DWDM offers:
- High scalability: 32 channels (lambdas)
- Higher cost
- Low, predictable latency
- Built-in protection schemes
- Network management software and per-lambda performance monitoring
- ONS 15454 MSTP buffer credit spoofing
- Protocol independence, enabling a wide range of solutions
- Highest bandwidth and performance: up to 10 Gbps per channel (lambda)

CWDM offers:
- Lower cost than DWDM; CWDM GBICs and SFPs are used
- Less scalable: 8 channels (lambdas) maximum, 2.5 Gbps channels
- Simple deployment (passive components)
- Less electronics (just SFPs)
- Shorter distances based on the power of the SFPs; no amplification
- Relatively inexpensive way to get low-latency, high-bandwidth connectivity

DWDM Transport
DWDM offers ample bandwidth, performance, and scalability. DWDM is protocol-independent, so it more easily caters to future protocols and growth. For example, DWDM is the only solution that can accommodate ESCON solutions along with FICON and FC. DWDM's strongest qualities are:

- Very high scalability
- Very low, predictable latency
- Moderately long distances

DWDM distances are still limited by the application and the flow-control mechanisms of each particular protocol. For example, the high number of BB_Credits on the Cisco MDS 9000 (255 credits on the 16-port cards) allows an MDS-to-MDS link over a theoretical distance of 255km at 2Gb FC line rates. Synchronous replication requires high bandwidth and low latency, and is therefore well-suited for optical DWDM infrastructures in which FC is channeled directly over an optical network for long distances.
Note The highest distance tested by Cisco with synchronous replication is 239Km.
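The rule of thumb behind these figures, roughly one BB_credit per kilometer at 2 Gbps, can be sketched as follows. The propagation delay and frame-size assumptions are mine and the result is approximate; real designs should rely on vendor planning tools.

    # Credits needed to keep a long link full: enough BB_credits to cover one
    # round trip of full-size frames. Assumes ~5 us/km propagation and
    # ~2148-byte frames (2112-byte payload plus SOF/header/CRC/EOF), 8b/10b coded.
    import math

    PROP_US_PER_KM = 5.0
    FRAME_BITS = 2148 * 10
    LINE_RATE_GBAUD = {1: 1.0625, 2: 2.125, 4: 4.25}   # FC line rates incl. 8b/10b

    def credits_needed(distance_km: float, speed: int) -> int:
        frame_time_us = FRAME_BITS / (LINE_RATE_GBAUD[speed] * 1000.0)
        rtt_us = 2 * distance_km * PROP_US_PER_KM
        return math.ceil(rtt_us / frame_time_us)

    print(credits_needed(255, 2))    # ~253: consistent with 255 credits ~ 255 km at 2 Gbps
    print(credits_needed(3500, 2))   # ~3463: consistent with the ~3500 km extended-credit figure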

The primary trade-off for DWDM is cost:


- DWDM equipment is much more expensive than other solutions.
- Dark fiber can be expensive to lease.
- DWDM services are not as widely available as SONET/SDH, so companies might need to implement and manage their own DWDM solutions.
- DWDM does not have the same robust management capabilities as SONET/SDH, which can also increase the cost of management.

CWDM Transport
CWDM applications are similar to those for DWDM: low-latency, high-bandwidth applications like synchronous replication. However, DWDM provides much more scalability than CWDM, and DWDM can be used over much longer distances because DWDM signals can be amplified. The primary advantage of CWDM is its low cost. CWDM components are passive, un-powered devices. CWDM is always much cheaper than DWDM, and is often even cheaper than SONET/SDH. For those who have access to dark fiber and have limited scalability needs, CWDM is a relatively inexpensive way to get low-latency, high-bandwidth connectivity.


CWDM SAN Extension Design

- PortChannel of 4 x 1 Gbps links over two diverse paths, using 1 Gbps CWDM SFPs and MUX-4 multiplexers (one fiber pair per path)
- HA: resilience against a fiber cut through client protection
- 4-member PortChannel across 2 x 2 diverse paths; load balance by Src/Dst (or Src/Dst/OXid)
- A fiber cut halves capacity from 4 Gbps to 2 Gbps but does not alter the fabric topology (no FSPF route change)
- Use MUX-8 to double capacity
- Cheaper than DWDM, but less capacity; distance limited to approximately 100 km (no amplification); maximum channel bandwidth is 2.5 Gbps

CWDM SAN Extension Design


CWDM takes dark fiber SAN extension one step further. CWDM allows up to eight 1 Gbps or 2 Gbps channels (colors) to share a single fiber pair. Each channel uses a different colored SFP or GBIC. These channels are networked with a variety of wavelength-specific add-drop multiplexers to enable an assortment of ring or point-to-point topologies.

The CWDM wavelengths cannot be amplified and so are limited in distance according to the number of joins and drops. A typical CWDM SFP has a 30 dB power budget, so it can reach up to ~90 km in a point-to-point topology or around 40 km in a ring topology. As with dark fiber, protection against failure is provided by the switches at the end points. Diverse optical paths must be employed to ensure high availability, and port channels are recommended to maintain fabric stability through path failures.

As with dark fiber, CWDM links do not add any latency beyond that incurred by the speed of light through fiber. As such, CWDM links are ideally suited to the low-latency requirements of synchronous replication applications.

CWDM specifications:
- 8-channel WDM at 20 nm spacing (compared with DWDM at <1 nm spacing): 1470, 1490, 1510, 1530, 1550, 1570, 1590, and 1610 nm
- Special colored SFPs used in FC switches
- Muxing done in a CWDM Optical Add/Drop Multiplexer (OADM), a passive unpowered device built from mirrors and prisms
- 30 dB power budget (36 dB typical) on single-mode fiber: ~90 km point-to-point or ~40 km ring
- Not amplifiable via Erbium-Doped Fiber Amplification (EDFA)
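As a rough illustration of how the 30 dB power budget translates into the ~90 km point-to-point and ~40 km ring figures above, the sketch below assumes illustrative values for fiber attenuation near 1550 nm, OADM insertion loss, and design margin; these specific numbers are not from the course material.

    # Rough CWDM reach from an optical power budget. The 30 dB budget is from the
    # text above; the per-km fiber loss, per-OADM insertion loss, and margin are
    # assumed values for illustration only.
    def cwdm_reach_km(power_budget_db: float = 30.0,
                      mux_count: int = 2,
                      mux_insertion_loss_db: float = 2.5,   # assumed per OADM pass
                      fiber_loss_db_per_km: float = 0.25,   # assumed near 1550 nm
                      margin_db: float = 3.0) -> float:
        usable_db = power_budget_db - mux_count * mux_insertion_loss_db - margin_db
        return max(0.0, usable_db / fiber_loss_db_per_km)

    print(f"point-to-point: ~{cwdm_reach_km():.0f} km")                 # ~88 km
    print(f"ring (more OADM passes): ~{cwdm_reach_km(mux_count=6):.0f} km")  # ~48 km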

Common DWDM SAN Extension Design

- Two 2 x 2 Gbps PortChannels between SAN fabrics over an ONS 15454 DWDM ring
- One member from each PortChannel is routed over the top fiber (wavelengths 1 and 3); the other member is routed over the bottom fiber (wavelengths 2 and 4)

The Cisco ONS 15454 MSTP has SAN certification:
- Up to 800 km at 2 Gbps, 1600 km at 1 Gbps, with R_RDY spoofing
- Very high capacity: 32 wavelengths at 10 Gbps per fiber pair
- Typically used for synchronous replication up to ~200-300 km (vendor qualified)
- Fibre Channel over optical: very low latency (the speed of light is the only factor), about 5 us/km
- Client protection recommended for failover recovery (classic dual-fabric design): PortChannels add resilience to each fabric; augment or replace with other DWDM protection schemes (splitter or Y-cable)
- NOTE: This is a very common deployment method, many times just point-to-point and not a ring

DWDM SAN Extension Design


DWDM allows up to 32 channels (lambdas) to share a single fiber pair. Each of these channels can operate at up to 10 Gbps. In contrast to the passive multiplexing nature of CWDM, DWDM platforms (such as the ONS 15454) are intelligent, offering a variety of protection schemes to guard against failures in the fiber plant. These can be used in isolation or in concert with client protection through port channeling on the MDS 9000.

As with other pure Fibre Channel over optical solutions, DWDM links only incur latency from the speed of light through fiber. DWDM is thus the most popular choice for enterprise deployment of synchronous replication between metro data centers. DWDM technology is capable of spanning great distances between nodes, which is why it is used for applications like undersea optical cables. Key characteristics of DWDM include:

- Higher density than CWDM: 32 lambdas (channels) in a narrow band around 1550 nm at 100 GHz spacing (<1 nm)
- EDFA amplifiable over longer distances
- Carriage of 1 or 2 Gbps FC, FICON, GigE, 10 GigE, ESCON, and IBM GDPS traffic from data center to data center
- Protection options: client, splitter, or line card


DWDM Solution for Mainframe Interconnect

Mainframe Interconnect: GDPS Example

(Figure: two sites, each with a Coupling Facility, Sysplex Timer, and ESCON/FICON storage, interconnected by DWDM over ONS 15454 platforms.)

NOTE: GDPS2 on the ONS 15454 is being tested in March 2006. MSTP will be the first optical equipment to have this accreditation.

DWDM Solution for Mainframe Interconnect


This GDPS configuration defines a multi-site Parallel Sysplex environment, providing a single solution designed to manage storage subsystem processes such as mirroring. The sites are typically connected by a DWDM link, which provides a high-speed multi-channel link for the Coupling Facilities, Sysplex Timers, and ESCON or FICON director switches. Disaster-tolerant GDPS configurations typically use DWDM along with multiple Coupling Facilities, Sysplex Timers, and ESCON/FICON director switches for redundancy.

Sysplex
Sysplex, which stands for System Complex, is a collection of processor complexes formed by coupling multiple processors running multiple OS images, using channel-to-channel adapters or ESCON/FICON fiber optic links. Sysplex is a loosely coupled clustering technology supported on the IBM S/390 and zSeries processors.

Parallel Sysplex
Parallel Sysplex is a bundle of products (announced in April of 1994) that enables parallel transaction processing in a Sysplex. Parallel processing in a Sysplex is the ability to simultaneously process a particular workload on multiple processor complexes, each of which may have multiple processors. Parallel Sysplex is a tightly-coupled clustering technology.

Geographically Dispersed Parallel Sysplex (GDPS)


GDPS is an extension to the Parallel Sysplex architecture that allows the systems in a Parallel Sysplex to be located at geographically remote sites. GDPS enables business continuity and disaster recovery solutions.


GDPS is primarily intended to create disaster-tolerant system configurations by spreading out the components of a Parallel Sysplex across multiple locations. The main focus of GDPS automation is to ensure that a consistent copy of the data is available at another site, and that the remote data can be quickly brought online in the event of a local failure. Consistent data simply means that, from an application's perspective, the secondary disks contain all updates up to a specific point in time, and no updates beyond that point.


ONS 15454 MSTP

Fully reconfigurable, intelligent DWDM:
- Carrier-class DWDM transport: TDM/Ethernet/SAN on an integrated DWDM platform
- Integrated planning, install, and turn-up; automatically compensates for fiber issues
- Reconfigurable networking: 32-channel Reconfigurable Optical Add-Drop Mux (ROADM), C-band tunable 10G lasers
- Cisco's core DWDM transmission platform: provides DWDM capabilities for Cisco's TDM, IP, and SAN products; end-to-end management of Cisco IP DWDM networking; directly connects IP wavelengths
- Industry-leading advancements: 80-channel capable solution, qualified for 40G transport

For more information, attend the course Selling and Designing Cisco ONS 15454 MSTP Optical Solutions.

The Cisco ONS 15454 Multi-Service Transport Platform (MSTP)


Cisco's ONS 15454 is a carrier-class DWDM platform capable of using the same chassis for DWDM, SONET/SDH, or both. The MSTP provides a best-in-class integrated network design and planning software package named Metro Planner that offers more than optical link budget calculations: Metro Planner provides everything needed to set up an optical network element for DWDM, including uploadable configuration files for each node, installation drawings, and bills of materials.

After the configuration file is uploaded to a node, the Automatic Power Control (APC) function takes over and automatically keeps the network equalized. APC automatically compensates for fiber degradation over time and equalizes channel powers at each node.

The MSTP ROADM (Reconfigurable Optical Add-Drop Multiplexer) has been a very popular feature since its introduction in 2004. ROADM popularity is due to its ability to reduce initial costs and lower ongoing operating expenses. ROADMs take the place of static OADMs, which may need to be upgraded to support increases in service demand. ROADMs can coexist in networks with OADMs and all other optical components.

The MSTP originally offered 32 channels in the C-band of the optical spectrum. This range will increase in a future release to 104 channels, utilizing the L-band in addition to the C-band.


ONS 15454 MSTP Data Muxponder Cards

2.5G Data Muxponder card:
- Data and storage aggregation of up to 8 clients (FC, FICON, ESCON, and Ethernet) on a 2.5G lambda
- Certification for EMC, Hitachi, IBM, HP, Brocade, and Cisco MDS
- Buffer-to-buffer credits for up to 1,600 km

10G Data Muxponder card:
- Data and storage aggregation of up to 8 clients on a single 10G lambda channel: 8 x GigE, 8 x 1G FC/FICON/ISC-1, 4 x 2G FC/FICON/ISC-3, 2 x 4G FC, or any mix of the above
- Certification for EMC, Hitachi, IBM, and HP
- Buffer-to-buffer credits for up to 1,400 km

Service cards for the ONS 15454 MSTP: TXP, MXP, DMXP, XPON, ADM.

ONS 15454 MSTP Cards


The Cisco ONS 15454 MSTP provides intelligent DWDM capability for metro and regional networks. By supporting distances of greater than 600 km, the Cisco ONS 15454 MSTP provides an ideal solution for secondary data centers, as well as for backup sites located outside of the disaster radius. The Cisco ONS 15454 Multiservice Transport Platform (MSTP) can deliver any service type to any location in a metropolitan-area (metro) or regional DWDM network. It can be configured to support any DWDM topology, point-to-point or ring, and supports complex traffic patterns even across multiple networks. The many available service cards cover virtually any protocol type; the Data Muxponder cards allow the multiplexing of Ethernet- and SAN-based protocols.

2.5G Data Muxponder


The 2.5G Data Muxponder card aggregates a mix and match of Data and SAN client inputs (GE, FICON, Fibre Channel, and ESCON) into one 2.5 Gbps DWDM signal. The client interface supports the following payload types:

- 2G FC
- 1G FC
- 2G FICON
- 1G FICON
- GigE
- ESCON


10G Data Muxponder


The 10G Data Muxponder cards aggregate a mix of Data and SAN client inputs (GigE, FICON, and Fibre Channel) into one 10Gbps DWDM signal. The MXP_MR_10DME_C card features a tunable 1550-nm C-band laser on the trunk port. The laser is tunable across 80 wavelengths on the ITU grid. Each card features eight client ports and one DWDM trunk port. The cards support aggregation of the following signal types:

- 1-Gigabit Fibre Channel
- 2-Gigabit Fibre Channel
- 4-Gigabit Fibre Channel
- 1-Gigabit Ethernet
- 1-Gigabit ISC-Compatible (ISC-1)
- 2-Gigabit ISC-Peer (ISC-3)

Other Services Supported on the MSTP


- Aggregated lower-rate TDM services from DS1/E1 over 2.5-Gbps and 10-Gbps wavelengths
- SONET/SDH wavelength and aggregated services: OC-3/STM-1 to OC-768/STM-256
- Data services: private-line, switched, and wavelength-based, from Ethernet to 10 Gigabit Ethernet (10 GE LAN and WAN physical layer)
- Storage services: 1-, 2-, 4-, and 10-Gbps Fibre Channel, FICON, ESCON, ETR/CLO, ISC-1, ISC-3
- Video services: D1 and high-definition television (HDTV)
- Digital-wrapper technology (defined in ITU-T G.709) for enhanced wavelength management and extended optical reach with integrated Forward Error Correction (FEC) and Enhanced FEC


Fibre Channel over IP

Common SAN Extension Drawbacks
In the past, SAN extension solutions were:
- Channel extenders: proprietary solutions
- Designed for high-end applications
- Limited in distance
- Expensive to implement and manage

A solution was required that addressed the needs of both high-end and midrange applications. That solution is FCIP, a protocol for transporting FC frames over an IP network.

(Figure: a main data center connected to remote offices over ATM, Ethernet, and dark fiber using proprietary channel extenders.)

Common SAN Extension Drawbacks


Previous SAN extension solutions consisted of channel extenders that were based on proprietary technologies. These products were primarily designed for high-end applications, using transports like Metro Ethernet and dark fiber. These options were limited in distance and often expensive to implement and manage. Conversely, more cost-effective solutions, like FC over ATM, could not support high-end applications.


What is FCIP?
- Fibre Channel over Internet Protocol
- Allows SAN islands to be interconnected over IP networks
- FC frames are encapsulated in TCP/IP and sent through the tunnel
- TCP/IP is used as the underlying transport to provide flow control and in-order delivery of error-free data
- Each interconnection forms an FCIP tunnel; each GigE port supports up to 3 FCIP tunnels
- The result is a fully merged Fibre Channel fabric

What is FCIP?
FCIP is a mechanism that allows SAN islands to be interconnected over IP networks. The connection is transparent to Fibre Channel, and the result of an FCIP link between two fabrics is a single, fully merged Fibre Channel fabric. TCP/IP is used as the underlying transport to provide congestion control and in-order delivery of error-free data.

FCIP is specified by the Internet Engineering Task Force (IETF) IP Storage (IPS) Working Group (RFC 3821). The specification defines the encapsulation of Fibre Channel frames transported over TCP/IP. The result of the encapsulation is a virtual Fibre Channel link that connects Fibre Channel devices and fabric elements across IP networks.

When FCIP connectivity is implemented in the switch instead of in a separate bridge device, standard B_Ports are not used. In the MDS implementation, each end of the FCIP link is associated with a Virtual E_Port (VE_Port), forming a Virtual ISL (VISL). VE_Ports communicate over a VISL using standard FC SW_ILS frames, just as E_Ports communicate between two switches. VE_Ports and TVE_Ports behave exactly as E_Ports and TE_Ports. For example:

- [T]VE_Ports negotiate the same parameters as E_Ports, including Domain ID selection, FSPF, and zones.
- [T]VE_Ports can be members of a PortChannel.
- TVE_Ports carry multiple VSANs.


IETF IP Frame Encapsulation of FC

Encapsulation method used by the IPS Module (byte counts shown in parentheses):

- Application data is handed to SCSI: SCSI | DATA
- SCSI is carried in an FC frame: SOF | FC Hdr | SCSI | DATA | CRC | EOF
- The FC frame is carried in FCIP: IP (20) | TCP (20) | FCIP Hdr (28) | SOF (4) | FC Hdr (24) | SCSI/DATA (FC payload 0-2112) | CRC (4) | EOF (4)

(Figure: the frame is encapsulated at the local FC SAN, carried through the FCIP tunnel, and de-encapsulated at the remote FC SAN.)

IETF IP Frame Encapsulation of FC


An application that wants to send data to a storage device passes the request down to the SCSI driver in the server. The SCSI driver attaches a CDB (Command Descriptor Block) to the first frame, containing the SCSI command, type, LBA (Logical Block Address), and block count. The CDB and/or data are sent to the FC device driver with a request to send the data to a SCSI target and LUN at a particular FC address.

The FC driver carries the CDB, LUN, and SCSI command or data in the payload of an FC frame. The FC driver attaches an FC header containing the S_ID and D_ID addresses and the sequence and exchange IDs. The FC driver then computes a CRC (Cyclic Redundancy Check) over the contents of the FC header and payload and attaches it to the FC frame. Finally, the FC driver attaches a SOF (Start of Frame) and EOF (End of Frame) and sends the frame from the HBA N_Port along the link to the F_Port on the MDS switch.

The MDS switch receives the frame and routes it to the appropriate FCIP gateway interface on the IPS line card or MPS 14+2. The FCIP gateway attaches an FCIP header, fully encapsulates the FC frame in TCP/IP, and sends it in IP packets across the FCIP tunnel between the two FCIP gateways.


When the packets reach the destination FCIP Gateway, the procedure is reversed as each of the headers is stripped off.
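Using the byte counts from the figure above, a small sketch can show how much per-frame overhead the encapsulation adds; Ethernet framing and TCP options are ignored here, so treat the percentages as approximate.

    # Per-frame FCIP encapsulation efficiency, from the byte counts in the figure.
    IP_HDR, TCP_HDR, FCIP_HDR = 20, 20, 28
    SOF, FC_HDR, CRC, EOF = 4, 24, 4, 4

    def fcip_efficiency(fc_payload_bytes: int) -> float:
        fc_frame = SOF + FC_HDR + fc_payload_bytes + CRC + EOF
        on_wire = IP_HDR + TCP_HDR + FCIP_HDR + fc_frame
        return fc_payload_bytes / on_wire

    for payload in (512, 1024, 2112):            # FC payload can be 0-2112 bytes
        print(f"{payload:>4} B payload -> {fcip_efficiency(payload):.1%} efficient")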


Extending SANs with FCIP

Potential FCIP Environments
- FCIP over Metro Ethernet: short distance (tens of km), high bandwidth, low latency
- FCIP over SONET or SDH: medium distance (hundreds of km), medium bandwidth, medium latency
- FCIP over an IP routed WAN: long distance (thousands of km), low bandwidth, high latency

Potential FCIP Environments


With regard to SAN applications:

- Synchronous replication requires high bandwidth and low latency; FCIP over Metro Ethernet or SONET/SDH can be used for synchronous replication. The FC line card for the Cisco ONS 15454 supports a feature called BB_Credit spoofing that allows SONET/SDH to carry FC with no loss of performance over thousands of kilometers.
- Asynchronous replication consumes less bandwidth and can tolerate more latency, so FCIP over SONET/SDH can provide a cost-effective solution in addition to supporting longer distances.
- Remote vaulting applications (which resemble standard backup applications, but with the backup device at a remote location, such as an SSP) can involve longer distances but may require deterministic latency. For these solutions, FCIP over SONET can be the most effective solution.
- Host-based mirroring solutions generally have less stringent bandwidth and latency requirements. FCIP over SONET or FCIP over an IP routed WAN can be suitable infrastructures for these applications.


Hub-and-Spoke Configuration

- Point-to-point: a single FCIP tunnel connects two sites
- Hub-and-spoke: corporate HQ acts as the hub for FCIP tunnels to multiple remote sites
- Each GigE port supports 3 FCIP tunnels

Hub and Spoke Configuration


The MDS 9000 IP Services Modules can be deployed in a hub-and-spoke configuration to connect multiple remote sites to the corporate data center, with IPS-8 modules at the data center hub and MDS 9216i switches with 14+2 or IPS-4 line cards at the remote sites. FCIP tunnels are formed by connecting exactly two FCIP interfaces together in a peer-to-peer configuration over an IP network. Each GigE port supports three FCIP interfaces, so each one can form an FCIP tunnel with an FCIP interface at a remote site, creating a hub-and-spoke configuration. The IPS-8 line card supports eight GigE ports, so up to 24 FCIP tunnels may be created.

Two other legacy Cisco IP storage products are also designed for remote site configurations:

- The SN 5428-2 Storage Router supports FCIP as well as iSCSI and a workgroup FC switch in a single box, and can be used for small remote office deployments.
- The FCIP Port Adapter (FCPA) provides an FC interface for Cisco 7200 and 7400 Series routers.


QoS Enables Cost-Effective SAN Extension

Cisco's SAN QoS mechanisms include:
- Fibre Channel Congestion Control (FCC)
- Queuing and prioritization in the FC network
- Queuing and prioritization by the IPS module
- Queuing and prioritization in the IP network

Benefits:
- Critical data has priority on the network
- Latency-sensitive applications get priority
- Bandwidth can scale dynamically

(Figure: frames classified at the Fibre Channel ingress port are rate-limited, classified, marked, and scheduled into a priority queue and DWRR queues, with QoS applied end to end from headquarters across the IP WAN to the remote and backup sites.)

QoS Enables Cost-Effective SAN Extension


QoS provides priority for FC traffic within the SAN and across an IP WAN. Latency-sensitive applications like OLTP must be given a higher priority than less sensitive applications so that, when congestion occurs, their data receives priority scheduling.


FCIP Advantages

Advantages:
- Low-cost connectivity solution
- Ubiquitous connectivity (IP)
- No fixed distance limitation
- Not reliant on Fibre Channel buffer credits
- Integrates easily into existing network management schemes
- Granular scalability by upgrading the underlying transport

Disadvantages:
- Higher latency than CWDM/DWDM
- The fully merged fabric will segment if the WAN connection fails
- Need to reserve bandwidth across the shared IP network (QoS)
- Many proprietary product options based upon a standard

FCIP Advantages
FCIP provides a low-cost connectivity solution for SAN extension. There are many product options (Cisco offers three different FCIP solutions), and TCP/IP service is universally available. FCIP has no fixed distance limitation; the maximum distance is largely dependent on the quality of the underlying transport and the application's latency requirements.

Because FCIP is based on IP, FCIP solutions can be easily integrated with existing network management tools and practices. Smaller organizations that do not have existing optical networking expertise will find FCIP attractive.

FCIP is often considered a low-end solution, but it also offers granular scalability by providing the ability to upgrade the underlying IP transport. For example, with the IPS-8 module, a company could start with a small DS-1 or DS-3 connection and later add additional DS-1, DS-3, or OC-n service. FCIP links can be bound together in PortChannels to provide bandwidth aggregation.


MDS FCIP Connectivity

No external gateways:
- Tightly integrated with the MDS crossbar architecture
- Fast failover when paths fail
- Easier to manage

Use multiple FCIP links for added redundancy:
- Leverage FC PortChannels on the same switch
- Leverage FSPF load balancing of links on different switches
- Leverage traffic engineering across FCIP links
- PortChannels provide large aggregate bandwidth (for example, 8 Gbps of FCIP connectivity)

Isolate end points using VSANs:
- Remove the risk of fabric segmentation
- Restrict unnecessary traffic from traversing the WAN/MAN (a VSAN is not carried across FCIP if it is not required at the remote site)
- Use Inter-VSAN Routing to enable further isolation while allowing connectivity between selected devices

MDS FCIP Connectivity Advantages


MDS 9000 IPS modules allow customers to leverage FC features like PortChannels, FSPF load-balancing, and traffic engineering to implement high availability while maintaining high performance on FCIP links. The FCIP links can be isolated using VSANs and Inter-VSAN Routing (IVR) to isolate fabric events and prevent unnecessary traffic from traversing the WAN or MAN.


Cisco MDS 9000 IP Services Modules

IPS-8:
- 8 GigE ports
- Maximum FCIP connectivity: 8 Gbps
- High-end iSCSI
- Software compression
- Write acceleration
- Tape acceleration

MPS 14+2:
- 14 FC ports plus 2 GigE ports
- Primarily designed for FCIP; can also be used for iSCSI
- Hardware compression
- Write acceleration
- Tape acceleration
- Hardware encryption

The IP Storage Services (IPS-8) module provides eight Gigabit Ethernet ports to support iSCSI and FCIP. The ports are hot-swappable, small form-factor pluggable (SFP) LC-type Gigabit Ethernet interfaces. Modules can be configured with either short or long wavelength SFPs for connectivity up to 550 m and 10 km, respectively. Each port can be configured in software to support the iSCSI and/or FCIP protocols simultaneously, while also supporting the features available on other switching modules, including VSANs, security, and traffic management.

512 MB of buffer capacity is shared between port pairs, allowing all ports to achieve gigabit speeds, and performance tuning options such as TCP window size help to ensure full storage transport performance over WAN distances. The IPS supports 2-link Ethernet PortChannels, FC PortChannels, and the Virtual Router Redundancy Protocol (VRRP) to enhance link utilization and availability. VLAN support through the IPS module enables the MDS 9000 system to leverage the reliability functions of the existing IP network to which it attaches.

IPS module interfaces operate at full line rate for all protocols with frame sizes of 1 KB or greater, and the module supports all standard Fibre Channel line card features except FC interfaces; Fibre Channel interfaces are provided by other modules, such as the 16- or 32-port switching line cards. The IPS module can be simultaneously configured for iSCSI and FCIP operation, where it supports iSCSI initiator to Fibre Channel target functionality as well as an FCIP gateway with up to three FCIP tunnels per port, or a maximum of 24 per line card (IPS-8). This concurrent multiprotocol flexibility helps enable investment protection through seamless migration to new technologies.


The Multiprotocol Services Module (MPS 14+2)


The Cisco Multiprotocol Services module contains 14 FC ports and two GigE ports. Like all the Cisco IPS line cards, the Cisco Multiprotocol Services module supports both iSCSI and FCIP, and is optimized for SAN extension with several new features:

- High-performance hardware-based compression: on low-speed wide-area network (WAN) links, each GigE port supports up to 70 Mbps of application throughput with a 30:1 compression ratio; on high-speed WAN links, each GigE port supports up to 1.5 Gbps of application throughput with a 10:1 compression ratio.
- Hardware-based IPsec supports gigabit-speed encryption for secure SAN extension.
- FCIP Tape Acceleration improves the performance of remote backup applications.
- Extended distance capability, with 255 buffer credits per FC port and up to 3500 extended buffer credits on a single FC port.

The Multiprotocol Services line card is also available as a fixed module in the Cisco MDS 9216i fabric switch.
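As a rough feasibility check, the sketch below relates application throughput, compression ratio, and the WAN bandwidth actually consumed, using the figures quoted above. The function and the final example are illustrative; real compression ratios depend entirely on the data being sent.

    # WAN bandwidth consumed for a given application throughput and compression ratio.
    def wan_bandwidth_needed(app_throughput_mbps: float, ratio: float) -> float:
        return app_throughput_mbps / ratio

    print(wan_bandwidth_needed(70, 30))     # ~2.3 Mbps on a low-speed WAN link
    print(wan_bandwidth_needed(1500, 10))   # ~150 Mbps on a high-speed WAN link
    print(wan_bandwidth_needed(100, 2))     # typical 2:1 data mix: 100 Mbps app -> 50 Mbps WAN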


MPS 14+2 Extended Credits

- Up to 3500 BB_credits on a single port
- Allows DWDM and SONET/SDH links up to ~3500 km at 2 Gbps
- Available on the 14+2 Multiprotocol Services card and the MDS 9216i
- Part of the Enterprise Package

(Figure: within a 4-port group, one port configured with 2400 or 3500 BB_credits while the remaining ports keep 400 credits each or are disabled; the long-haul port drives a DWDM or SONET/SDH link of up to ~3500 km.)

MPS 14+2 Extended Credits


The extended credit feature allows up to 3500 FC BB_credits to be configured on a single FC port if the remaining three ports in the quad are disabled. This feature is available on the MDS 9216i and the 14+2 Multiprotocol Services line card, and it allows customers to extend DWDM and SONET/SDH links up to about 3500 km without performance degradation.

Up to 2400 credits can be assigned to any port of a quad while keeping the remaining three ports active with 400 credits each (the standard 255 credits plus the hidden performance credits). When more than 2400 credits are assigned to a port, the remaining three ports in the quad are disabled. Ports 13 and 14 cannot be configured as long haul, so the extended credits feature can be used on a maximum of 3 ports per line card. This feature is included in the Enterprise license package.
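The per-quad allocation rule described above can be summarized as a simple check. The sketch below is illustrative only; the function name and return values are mine, not a SAN-OS interface.

    # Extended-credit rule for a 4-port group (quad) on the MPS 14+2, as described
    # above: <= 2400 credits leaves the other three ports active with 400 each;
    # 2401-3500 credits disables the rest of the quad.
    def quad_allocation(extended_port_credits: int) -> dict:
        if extended_port_credits <= 2400:
            return {"extended_port": extended_port_credits,
                    "other_ports": "active, 400 credits each"}
        if extended_port_credits <= 3500:
            return {"extended_port": extended_port_credits,
                    "other_ports": "disabled"}
        raise ValueError("maximum is 3500 BB_credits on a single port")

    print(quad_allocation(2400))
    print(quad_allocation(3500))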


IPS Module Hardware Architecture

(Figure: IPS line card internal bandwidth. GigE ports feed SiByte processors and forwarding ASICs over 3.2 Gbps channels; the forwarding ASICs connect to the queuing ASIC over 8 Gbps channels, and the queuing ASIC connects over 20 Gbps channels to the 720 Gbps crossbars on the redundant supervisor modules.)

IPS Module Hardware Architecture


The Cisco MDS 9000 Series Multilayer switches were designed from the ground up around a collection of sophisticated application-specific integrated circuits (ASICs). The hot-swappable, modular line card design provides a high degree of flexibility and allows for ease of expansion. Each line card in a Cisco MDS 9500 Series director has redundant high-speed paths across the backplane to the high-performance crossbar fabrics located on the redundant supervisor modules.

The figure illustrates the internal bandwidth in each Cisco MDS module. Frames enter through the physical interface and are sent to the forwarding ASIC along a 3.2 Gbps channel where, at wire rate, the routing decision is made. Frames then pass to buffers in the queuing ASIC, which forwards frames to the crossbar along 20 Gbps channels. Each crossbar on the supervisor module is capable of 720 Gbps of full-duplex bandwidth, providing a total aggregate bandwidth of 1.44 Tbps on the MDS 9506 and 9509, or up to 2.2 Tbps on the MDS 9513.


GigE Interfaces
- Each GigE port supports three FCIP interfaces and an iSCSI interface
- An IPS-8 can support up to 24 FCIP tunnels plus iSCSI concurrently
- FCIP interfaces can be combined in PortChannels for HA and load balancing

(Figure: each GigE port carries three FCIP interfaces, bound to an FCIP profile, plus an iSCSI interface; the FCIP interfaces appear as VE_Ports and can be members of a PortChannel across the IP network.)

GigE Interfaces
Each GigE port supports three FCIP interfaces and an iSCSI interface simultaneously, sharing 1 Gbps of available bandwidth. An IPS-8 line card has eight GigE ports, so it can support 24 FCIP tunnels and up to 1600 iSCSI connections concurrently. Each FCIP interface is associated with a Virtual E_Port (VE_Port) on the FCIP gateway. FCIP interfaces can belong to FC PortChannels for high availability and exchange-based load balancing.


IPv6 Support (new in SAN-OS 3.0)

Extended addressing capability:
- Reduces the need for private addresses and NAT
- IP address size increased from 32 to 128 bits, represented as eight 16-bit fields, for example 2003:FAB7:1234:5678:9ABC:DEF0:1357:2468
- IPv4 can be embedded in IPv6: 10.1.2.3 is represented as 0:0:0:0:0:FFFF:10.1.2.3

IPv6 is supported on all GigE ports and the management interface. Standard applications are IPv6-ready: DNS, RADIUS, TACACS+, ACLs, FCIP, iSCSI, IPFC, tftp, ftp, sftp, telnet, ssh, scp, snmp, ping, traceroute, and so on. Some applications are awaiting compliance: IPsec, IKE, and virtualization.

IPv6 Support
IPv6 was introduced with SAN-OS 3.0, providing an extended addressing capability. IPv6 is supported on all MDS GigE ports and management interfaces.

- The IP address size is increased from 32 bits to 128 bits.
- IPv4 addresses can be embedded in IPv6 for compatibility with legacy networks.
- IPv6 reduces the need for private addresses and NAT.
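As a quick illustration of the address formats mentioned above, Python's standard ipaddress module can display the full 128-bit representation and the IPv4-mapped form; this is a generic illustration of the notation, not an MDS feature.

    # The 0:0:0:0:0:FFFF:10.1.2.3 form shown above is the standard IPv4-mapped
    # IPv6 notation.
    import ipaddress

    v6 = ipaddress.IPv6Address("::ffff:10.1.2.3")
    print(v6.exploded)        # 0000:0000:0000:0000:0000:ffff:0a01:0203
    print(v6.ipv4_mapped)     # 10.1.2.3
    print(ipaddress.IPv6Address("2003:FAB7:1234:5678:9ABC:DEF0:1357:2468").packed.hex())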


High Availability FCIP Configurations

Drawbacks of Gateway-Based FCIP
- Collapses geographically distant SAN fabrics into a single fabric, reducing availability and performance; all FC control traffic (FSPF, RSCN, Name Server, zoning, and so on) potentially traverses the WAN link
- Multiple standalone gateways required for high availability
- No load balancing
- Additional management interface requirements
- Limited traffic management; relies strictly on IP routers for QoS

(Figure: two FC SAN fabrics joined by standalone FCIP gateways across an IP network.)

Drawbacks of Gateway-Based FCIP


Gateway-based FCIP solutions require multiple boxes for high-availability, each of which may need to be managed independently. Load-balancing is not possible with two independent boxes, and traffic cannot be managed at the application level. These implementations also suffer from performance and reliability issues because the FCIP gateways allow all FC control traffic to pass over the WAN. A flapping WAN link will cause severe disruptions to the local SAN devices.


HA Pitfalls
- Multiple standalone gateways are required for FCIP redundancy: costly and difficult to manage
- No native HA capabilities, or proprietary HA schemes
- No network-level HA or load balancing
- Failovers are slow and disruptive to the fabric
- Increased response time before failovers

(Figure: two FC SAN fabrics joined by pairs of standalone FCIP gateways and parallel FCIP tunnels across an IP network.)

HA Pitfalls
Gateway-based FCIP solutions require multiple boxes for high-availability, each of which may need to be managed independently. HA is provided by a proprietary clustering scheme instead of network-based resiliency. With this configuration, load-balancing is not possible, and failover results in loss of data in transit and causes disruptive FSPF recalculation in the end point fabrics.


MDS FCIP High Availability: Parallel Tunnels
- FCIP is integrated into IPS line cards on MDS switches
- Best practice is to use two IPS modules on two separate MDS switches
- Tunnels appear as virtual E_Ports, or TE_Ports with VSANs; the EISL header enables VSAN trunking
- FSPF will re-route traffic if an FCIP tunnel fails
- Protects against a port failure, IPS module failure, switch failure, link failure, or IP WAN service failure

(Figure: VSANs A and B extended between sites over parallel FCIP tunnels, each forming a virtual ISL or EISL across an IP routed WAN.)

MDS FCIP High Availability: Parallel Tunnels


A resilient configuration with IPS modules starts with two IPS modules on two separate MDS switches on each side of the FCIP link. This configuration protects against the failure of a port, an IPS module, a switch, a link, and a WAN IP service.


MDS FCIP High Availability: PortChannels
- PortChannels enable multiple EISLs to be logically aggregated into a single virtual EISL
- Recovery is done at the PortChannel level, not at the FSPF routing level, so recovery is faster and non-disruptive
- Provides exchange-based or flow-based load balancing
- QoS can be implemented in the FC SAN
- L2/L3 mechanisms are also used to protect, load balance, and apply QoS inside the IP cloud

(Figure: two FCIP tunnels bundled using a PortChannel to form a single virtual EISL across an IP routed WAN.)

MDS FCIP High Availability: PortChannels


To achieve non-disruptive failover, PortChannels are used to aggregate two or more (E)ISLs into a virtual (E)ISL. Within a PortChannel, the failure of a link is non-disruptive, resulting in no lost data. Load-balancing can be performed per-flow or per-exchange at the FC level, and FC QoS policies can be applied. L2/L3 mechanisms are also used to protect, load balance and potentially traffic-engineer the IP cloud.
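As an illustration of the difference between the two load-balancing modes, the sketch below hashes either the source/destination pair (flow-based) or the source/destination/OX_ID triple (exchange-based) onto a member link. This is a conceptual sketch, not the MDS hashing implementation; the hash function and addresses are arbitrary.

    # Flow-based hashing pins a host/target pair to one link; exchange-based
    # hashing spreads that pair's exchanges across links while keeping all
    # frames of a single exchange on the same link (preserving order).
    from zlib import crc32

    def pick_link(s_id: int, d_id: int, ox_id: int, links: int,
                  exchange_based: bool = True) -> int:
        key = (s_id, d_id, ox_id) if exchange_based else (s_id, d_id)
        return crc32(repr(key).encode()) % links

    s_id, d_id, links = 0x010203, 0x020304, 2
    print([pick_link(s_id, d_id, ox, links, exchange_based=False) for ox in range(4)])  # same link
    print([pick_link(s_id, d_id, ox, links, exchange_based=True) for ox in range(4)])   # spread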


Using IVR for SAN Extension

Flapping Links
- A flapping link (for example, a bad SFP or cable) causes a momentary failure of the virtual ISL: the switch in Fabric A loses its connection with the switch in Fabric B
- The single fabric segments, causing massive SAN disruption: each fabric nominates its own principal switch and rebuilds its own FSPF routing table
- When the virtual ISL is reconnected, both fabrics try to merge again: the merged fabric nominates a single principal switch and rebuilds the FSPF routing table

(Figure: FC SAN Fabrics A and B joined by standalone FCIP gateways over an IP network, with a bad SFP or cable causing the link to flap.)

Flapping Links
The diagram above illustrates a problem with other vendors' FCIP implementations. Because FCIP merges the fabrics at both ends of the link, disruptions in the WAN link will cause fabric reconfiguration events to occur on both sides. All devices will be affected by the disruption. In the case of a flapping link, disruption can be crippling.


Protecting Remote SANs with IVR
- VSANs enable each data center to be isolated into separate fabrics
- The WAN infrastructure is placed in a separate transit VSAN
- IVR is used to allow selective connectivity between devices in each VSAN
- Any disruption in the transit VSAN will not affect the VSANs in the data centers

(Figure: Backup Host 1 in VSAN A and Backup Host 2 in VSAN B at Data Centers 1 and 2 reach Tape Library 1 in VSAN C and Tape Library 2 in VSAN D at the disaster recovery center through FCIP tunnels in transit VSAN X.)

Protecting Remote SANs with IVR


The example above shows multiple virtual fabrics with host devices (VSANs A and B) and tape devices (VSANs C and D) with transit VSAN (VSAN X) providing connectivity over a DWDM link. With IVR, traffic from VSAN A host devices travels through transit VSAN X to VSAN C tape devices. This configuration facilitates remote tape vaulting to multiple tape devices at the DR site. If the link between switches goes down, disruption is isolated to VSAN X. Traffic from VSAN A to VSAN C and VSAN B to VSAN D will be interrupted, but the adverse effects of the RCF will not be replicated in the local fabrics.


MDS FCIP High Availability: VSANs
- VSAN segregation limits the traffic traversing the WAN link
- Limits the extent of potential disruption due to a WAN link failure
- Use VSANs for complete isolation of WAN ports
- Use IVR for selective connectivity between devices in each VSAN
- Use PortChannels to provide load balancing and fast failover

(Figure: VSANs A and B connect to VSANs C and D through FCIP tunnels carried in transit VSANs X and Y, with the tunnels bundled into PortChannels between IPS modules.)

MDS FCIP High Availability: VSANs

VSANs should be used to segregate the ports that need to access the WAN link, preventing disruptions in the WAN from propagating throughout the local SANs at either end. IVR should be used to keep the WAN ports completely isolated while still providing selective connectivity, further isolating disruptions.


SAN Extension Security

How Are FCIP Tunnels Secured?
- Competing FCIP gateway products do not support integrated encryption
- Encryption must be performed by an external router or VPN appliance
- Traffic is still vulnerable from the SAN to the WAN edge
- Wastes LAN/WAN resources
- If existing equipment can't support gigabit-speed encryption, new routers or VPN appliances must be purchased

(Figure: traffic is encrypted only at the WAN routers, leaving the path from the SAN to the WAN edge in the clear.)

Optical DWDM, CWDM, and SONET/SDH links are considered relatively secure due to the inherent difficulty of tapping into optical fiber. However, security for FCIP tunnels that are routed over public IP networks is a serious issue. For regulated institutions such as financial companies, health care providers, and schools, encryption of data transmitted over public networks is not just a good idea; it is a requirement.

FCIP gateway products on the market today do not provide integrated encryption. Users must rely on routers or VPN appliances at the WAN edge to encrypt storage traffic. Not only does this leave storage traffic vulnerable to interception up to the WAN edge, but it may require users to buy yet more equipment if the existing routers or VPN appliances cannot support gigabit-speed storage traffic in addition to existing WAN traffic loads.


Integrated Security for FCIP


- Standards-based IPSec protocol for comprehensive security of FCIP and iSCSI traffic
- End-to-end authentication, data integrity, and encryption
- Hardware-based, high-performance solution: MDS 9216i or MPS 14+2 module

End-to-End Encryption: No External VPN Required

Figure: With IPSec on the MDS switches, FCIP traffic is encrypted end to end across the WAN.

Integrated Security for FCIP


The MDS Multiprotocol Services module (14+2 card) provides hardware-accelerated IPSec on both GigE ports. With this solution, the MDS is an IPSec end-point and terminates the IP traffic, providing end-to-end data integrity and authentication, unlike VPN devices, which terminate and forward traffic on behalf of other devices. Capable of full 1-Gbps line rate, the encryption engine is based on standards: IETF RFC 3723 ("Securing Block Storage Protocols over IP"), Encapsulating Security Payload (ESP), all major cryptographic algorithms (AES, DES, 3DES, SHA1-HMAC, MD5-HMAC, and AES-XCBC-HMAC), and key management with Internet Key Exchange (IKE) v1 and v2.


FCIP Performance Enhancements


FCIP Issues for Mid-Range Applications
High-latency networks:
- Higher latency impacts application performance
- With tape backup, it can also reduce effective throughput
- How do we enhance the performance of SAN extension over long distances?

Low-bandwidth interconnects (DS-3 to OC-12):
- Can we use OC-3 or DS-3 links?
- How do we reduce the cost of bandwidth for SAN extension?

FCIP Issues for Mid-Range Applications


IP routed WANs, DS-3s, and OC-3s are readily available and low cost, but many FCIP solutions don't allow customers to take full advantage of these options. The high latency of routed IP networks can impact application performance and reduce effective throughput, forcing IT organizations to choose between poor performance and higher recurring costs.


FCIP Compression
Figure: FCIP frame layout (Ethernet header, IP header, TCP header and options, FCIP header, FC frame, Ethernet CRC32), showing the portion of the frame that is carried compressed.

- Software compression for IPS-4 and IPS-8 with SAN-OS 1.3, designed for OC-3 service
- Hardware compression with the 14+2 card and MDS 9216i with SAN-OS 2.0, designed for gigabit-speed service
- Three compression modes supported for different WAN bandwidth links and compression ratios

FCIP Compression
Compression is used as a mechanism to increase overall throughput on slow-speed WAN links. The achievable compression ratio depends on the nature of the data. The use of data compression allows users to achieve two major objectives. The first is the ability to reduce the amount of overall traffic on a particular WAN link: when a data rate equal to the WAN link speed is compressed, the total amount of data on the WAN link is reduced, allowing the link to be shared with other IP traffic. The second is the ability to carry an effective data rate greater than the WAN link speed, increasing throughput for the FCIP traffic itself.

The IPS modules use the IPPCP/LZS (RFC 2395) lossless compression algorithm. Compression is applied to the payload carried inside the TCP segment (the FCIP header and the encapsulated FC frame), as shown in the figure. The resulting compressed IP frame can therefore still be routed through an IP network and remain subject to Access Control Lists (ACLs) and QoS mechanisms based on IP addresses and TCP port numbers.

The type of data in the data stream determines the overall achievable compression ratio for a given compression method. Typical data mixes should achieve around 2:1 compression. Testing compression with data comprised of all 0x0s or 0xFs or other repeating patterns will artificially increase the resulting compression ratio and will probably not be representative of the ratio you can achieve with real user data. To better compare compression methods, use either an industry-standardized test file or a test file that is representative of the real data that will be sent through the FCIP tunnel.
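To make the arithmetic concrete, the short Python sketch below (not part of the course material; the link speed and ratio are assumed example values) estimates the effective FC-side throughput that a WAN link can carry for a given compression ratio.

def effective_throughput_mbps(wan_link_mbps, compression_ratio):
    """Effective pre-compression (FC-side) data rate a WAN link can carry.

    wan_link_mbps     -- usable WAN link rate in Mbps (e.g. 45 for a DS-3)
    compression_ratio -- achieved ratio, e.g. 2.0 for the typical 2:1 mix
    """
    return wan_link_mbps * compression_ratio

# Assumed example: a DS-3 (~45 Mbps) with the typical 2:1 ratio quoted in the
# text carries roughly 90 Mbps of uncompressed FC traffic.
print(effective_throughput_mbps(45.0, 2.0))   # -> 90.0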


FCIP without Write Acceleration


- A normal SCSI write requires two round trips: WRITE > XFER_RDY, then DATA > RSP
- Over an FCIP tunnel this doubles the latency
- The problem is even worse for applications that restrict the number of outstanding I/Os, such as tape backup

Figure: Ladder diagram of a normal SCSI write between an initiator and a target across an FCIP tunnel; the FCP_WRITE/XFER_RDY and FCP_DATA/FCP_RSP exchanges each consume a WAN round trip.

FCIP Write Acceleration


Write Acceleration is a SCSI protocol-spoofing mechanism that is designed to improve application performance by reducing the overall service time for SCSI write I/Os and replicated write I/Os over distance. Write Acceleration reduces the number of FCIP WAN round trips per SCSI FCP write I/O. It was introduced in SAN-OS 1.3.

FCIP without Write Acceleration


Most SCSI FCP Write I/O exchanges consist of two or more round trips between the host initiator and the target array or tape. The protocol for a normal SCSI FCP Write is as follows:

1. The host initiator issues a SCSI Write command (FCP_WRITE), which includes the total size of the write.
2. The target responds with an FCP Transfer Ready (FCP_XFER_RDY). This tells the initiator how much data the target is willing to receive in the next write sequence.
3. The initiator sends FCP data frames up to the amount specified in the previous FCP_XFER_RDY.
4. The target responds with a SCSI status response (FCP_RSP) frame if the I/O completed successfully.

Each FCIP link can be filled with a number of concurrent or outstanding I/Os. These I/Os can originate from a single source or a number of sources. The FCIP link is filled when the number of outstanding I/Os reaches a certain ceiling. The ceiling is mostly determined by the RTT, write size, and available FCIP bandwidth.
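As a rough illustration of that ceiling, the Python sketch below (my own simplification, not from the course; the link speed, RTT, and write size are assumptions) estimates how many concurrent write I/Os are needed to keep an FCIP link full.

import math

def outstanding_ios_to_fill_link(link_mbps, rtt_ms, write_size_kb, round_trips_per_io=2):
    """Rough estimate of concurrent I/Os needed to keep an FCIP link busy.

    Each write moves write_size_kb of data but is 'parked' for
    round_trips_per_io * rtt_ms of protocol latency (2 round trips for a
    normal SCSI write, 1 with Write Acceleration).
    """
    bytes_per_second = link_mbps * 1_000_000 / 8
    service_time_s = (round_trips_per_io * rtt_ms) / 1000.0
    bytes_in_flight_needed = bytes_per_second * service_time_s
    return math.ceil(bytes_in_flight_needed / (write_size_kb * 1024))

# Assumed example: 155 Mbps (OC-3), 20 ms RTT, 64 KB writes
print(outstanding_ios_to_fill_link(155, 20, 64))      # normal SCSI write -> 12
print(outstanding_ios_to_fill_link(155, 20, 64, 1))   # with Write Acceleration -> 6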


FCIP with Write Acceleration


- Write Acceleration spoofs XFER_RDY, so a write needs only a single round trip over the WAN
- Can spoof up to 32 MB of outstanding I/Os
- Enables twice the distance at the same latency
- For some applications, can achieve more than twice the throughput

Figure: Ladder diagrams comparing a normal SCSI write (two WAN round trips) with Write Acceleration, where the local MDS returns XFER_RDY immediately so that only the FCP_WRITE/FCP_DATA and FCP_RSP exchange crosses the WAN.

FCIP with Write Acceleration


The protocol for Write Acceleration differs as follows:

1. After the initiator issues a SCSI FCP Write, an FCP_XFER_RDY is immediately returned to the initiator by the MDS 9000.
2. The initiator can now immediately send data to its target across the FCIP tunnel. The data is received by the remote MDS and buffered.
3. At the remote end, the target, which has no knowledge of Write Acceleration, responds with an FCP_XFER_RDY. The MDS does not allow this to pass back across the WAN.
4. When the remote MDS receives the FCP_XFER_RDY, it allows the buffered data to flow to the target.
5. Finally, when all data has been received, the target issues an FCP_RSP response (status), acknowledging the end of the operation (the FC exchange).

Write Acceleration will increase write I/O throughput and reduce I/O response time in most situations, particularly as the FCIP RTT increases.
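A back-of-the-envelope comparison (my own sketch with assumed numbers, not Cisco figures) of per-write service time with and without Write Acceleration:

def write_service_time_ms(rtt_ms, write_accel, local_latency_ms=0.5):
    """Approximate service time for one SCSI write over FCIP.

    Without Write Acceleration: 2 WAN round trips (WRITE/XFER_RDY, DATA/RSP).
    With Write Acceleration:    1 WAN round trip (XFER_RDY is spoofed locally).
    local_latency_ms is a small assumed allowance for fabric and target time.
    """
    wan_round_trips = 1 if write_accel else 2
    return wan_round_trips * rtt_ms + local_latency_ms

for rtt in (5, 20, 50):
    print(rtt, write_service_time_ms(rtt, False), write_service_time_ms(rtt, True))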


FCIP Tape Acceleration


- Tape drives cannot handle high WAN latencies: the tape cannot be kept streaming, which causes shoe-shining
- Write Acceleration alone cannot keep the tape streaming, because tape drives allow only one outstanding I/O
- Tape Acceleration is an enhancement to Write Acceleration:
  - Spoofs the FCP Response so the next write operation is not delayed
  - Extends tape buffering onto the IPS modules
  - The IPS modules act as a proxy tape device and a proxy backup host

Figure: Throughput (MB/s) versus RTT (ms) for standard FCIP, FCIP with Write Acceleration, and FCIP with Tape Acceleration, alongside a ladder diagram showing the spoofed XFER_RDY and FCP_RSP exchanges.

FCIP Tape Acceleration


Increasing numbers of customers are realizing the benefits of tape backup over the WAN in terms of centralizing tape libraries and maintaining central control over backups. With increasing regulatory oversight of data retention, this is becoming increasingly important.

One issue that customers often face is that tape drives have limited buffering that is often not sufficient to handle WAN latencies. Even with Write Acceleration, each drive can support only one outstanding I/O. When the tape drive writes a block, it issues an FCP_RSP status to tell the initiator to send more data, and the initiator then responds with another FCP_WRITE command. If the latency is too high, the tape drive won't receive the next data block in time and must stop and rewind the tape. This shoe-shining effect not only increases the time it takes to complete the backup job (potentially preventing it from completing within any reasonable time frame) but also decreases the life of the tape drive. Write Acceleration alone is not sufficient to keep the tape streaming: it halves the total RTT for an I/O, but the initiator must still wait to receive FCP_RSP before sending the next FCP_WRITE.

FCIP Tape Acceleration is an enhancement to Write Acceleration that extends tape buffering onto the IPS modules. The local IPS module proxies as a tape library and the remote IPS module proxies as a backup server. The local IPS sends FCP_RSP back to the host immediately after receiving each block, and data is buffered on both IPS modules to keep the tape streaming. A flow-control scheme avoids overflowing the buffers, which allows the IPS to compensate for changes in WAN latency or tape speed.
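To see why a single outstanding I/O throttles a tape drive, here is a rough Python model (my own sketch; the block size is an assumption) of the best-case throughput when each block must wait for FCP_RSP before the next write is issued.

def tape_throughput_mbps(rtt_ms, block_kb=256, wan_round_trips_per_block=1):
    """Upper bound on tape write throughput with a single outstanding I/O.

    Each block costs at least wan_round_trips_per_block * RTT of idle time
    before the next block can be sent (1 round trip with Write Acceleration,
    2 without). Tape Acceleration removes the WAN wait by spoofing FCP_RSP
    locally, so throughput is no longer bounded by the RTT.
    """
    seconds_per_block = (wan_round_trips_per_block * rtt_ms) / 1000.0
    return (block_kb / 1024.0) / seconds_per_block   # MB/s

for rtt in (10, 30, 50, 100):
    print(rtt, round(tape_throughput_mbps(rtt), 2))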


The graph on this slide shows the effects of Write Acceleration and Tape Acceleration. The tests were conducted with Legato NetWorker 7.0 running on dual 3-GHz Xeon CPUs with 2 GB of memory and Windows 2000 Advanced Server, with an IBM Ultrium TD2-LTO2 tape drive. Cisco has tested the Tape Acceleration feature with tape devices from IBM, StorageTek, ADIC, Quantum, and Sony, as well as VERITAS NetBackup, Legato NetWorker, and CommVault, and is currently working with CA. Backup application vendors will provide matrices of supported tape libraries and drives.


FCIP Tape Read Acceleration


Improves tape restore performance over the WAN (SAN-OS 3.0):
- The tape doesn't stop while waiting for the next Read command
- The target-side MDS pre-fetches data from the tape and caches it on the MDS
- Data is continuously streamed across the FCIP WAN link

Figure: Ladder diagram of FCP_READ/FCP_DATA/FCP_RSP exchanges across the FCIP tunnel, with the remote MDS issuing pre-fetch reads to the tape.

FCIP Tape Read Acceleration


Different performance issues occur when restoring data across a WAN.

A tape drive is a sequential storage medium, so blocks stream off the tape as it passes the head. The backup server issues Read commands to the tape target device requesting a number of SCSI 512-byte blocks. The tape starts to move, reads the data into its buffers, and then stops to wait for the next command. Meanwhile, the backup server receives the data blocks and issues a new Read command for the next x blocks in sequence. The tape starts up again, reads the blocks, and so on.

FCIP Tape Read Acceleration performs a read-ahead to pre-fetch the data and keep the tape moving. Let's assume that the backup server issues a Read command for the first x blocks. This command is sent to the tape, the tape starts up and reads the blocks into its buffer, and the data is sent back to the backup server. Meanwhile, before the tape has stopped moving, the MDS at the remote site issues another Read command for the next x blocks in sequence, and these blocks are sent over the FCIP tunnel to buffers in the MDS at the local data center. When the local MDS receives a command from the backup server to read the next x blocks, it consumes the command and returns the data that it has already buffered.

By pre-fetching data and keeping the tape moving, FCIP Tape Read Acceleration will dramatically improve read performance over a WAN.
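The toy Python sketch below (my own illustration, not Cisco code; the window size and block names are assumptions) shows the read-ahead idea: the remote side keeps pulling the next blocks so the local side can answer each Read from its buffer.

from collections import deque

def read_with_prefetch(tape_blocks, window=2):
    """Toy model of Tape Read Acceleration: the 'remote MDS' keeps up to
    `window` block-groups pre-fetched so the 'local MDS' can answer the
    backup server's next Read from its buffer instead of crossing the WAN."""
    buffered = deque()                 # blocks already pulled across the WAN
    tape = iter(tape_blocks)           # the sequential tape image

    def prefetch():
        while len(buffered) < window:
            try:
                buffered.append(next(tape))   # remote MDS reads ahead
            except StopIteration:
                break

    prefetch()
    while buffered:
        yield buffered.popleft()       # served locally, no WAN round trip
        prefetch()                     # keep the tape moving

# Example: the backup server "reads" six block-groups in order.
print(list(read_with_prefetch(["blk0", "blk1", "blk2", "blk3", "blk4", "blk5"])))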


Dynamic TCP Windowing


- MWS = bandwidth x RTT
- The MDS IP Services module dynamically calculates the MWS (SAN-OS 1.3):
  - The administrator sets max-available-bw and an initial RTT
  - The MDS recalculates the RTT during idle time
- The IPS modules support a maximum MWS of 32 MB

Figure: Source and destination GigE ports connected across a dedicated 45-Mbps WAN link. With a one-way latency of 12 ms, RTT = 12 ms x 2 = 24 ms.

Dynamic TCP Windowing


The TCP Maximum Window Size (MWS) is derived from the product of the maximum bandwidth and the RTT: MWS = (max bandwidth x RTT x 0.9375) + 4 KB. In SAN-OS 1.3 and higher, you cannot configure the TCP MWS directly on the MDS 9000 IP Services module. Instead, you tell the IPS the maximum bandwidth of the link (the max-available-bw parameter) and configure an initial RTT value. The MDS automatically recalculates the RTT during idle periods, so the RTT varies dynamically according to network conditions, such as IP routing changes. The MWS is then dynamically recalculated from the configured bandwidth and the measured RTT, and the IPS also adjusts the MWS automatically if FCIP compression is used. The TCP MWS can grow up to 32 MB, which allows the IPS to support long distances at gigabit speeds. On earlier versions of SAN-OS, you must configure the RTT manually.
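The small Python sketch below simply evaluates the formula quoted above (it is my own illustration; the 45-Mbps/24-ms example comes from the slide).

def tcp_mws_bytes(max_bw_mbps, rtt_ms):
    """MWS = (max bandwidth x RTT x 0.9375) + 4 KB, capped at 32 MB,
    following the formula quoted in the text."""
    bdp_bytes = (max_bw_mbps * 1_000_000 / 8) * (rtt_ms / 1000.0)
    mws = bdp_bytes * 0.9375 + 4 * 1024
    return min(mws, 32 * 1024 * 1024)

# Example from the slide: a dedicated 45-Mbps link with a 24-ms RTT.
print(round(tcp_mws_bytes(45, 24)))   # about 130 KB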


Traditional TCP Congestion Avoidance


Quick review: traditional TCP (simplified)
- Exponential slow start: the congestion window (cwnd) doubles each RTT, so throughput is low during this period
- Linear congestion avoidance above the slow-start threshold: cwnd grows by roughly one segment per RTT
- On packet loss, a retransmission signals congestion: cwnd is halved and the slow-start threshold is adjusted

Figure: Packets sent per round trip (congestion window) plotted against round trips, showing slow start, linear congestion avoidance, and the window being halved at each loss event.

Traditional TCP Congestion Avoidance


This diagram shows how the traditional window-sizing mechanism in TCP relates to the congestion window (cwnd) and RTT. This mechanism was designed for handling small numbers of packets in a lossy network. When the network becomes congested, packets are dropped. Traditional TCP has a tendency to overreact to packet drops by halving the TCP window size. The resulting reduction in speed that can occur in traditional TCP implementations is unacceptable to many storage applications.


TCP Packet Shaping


- The administrator must configure the min-available-bw parameter to enable the packet shaper and determine the aggressiveness of the recovery
- The slow-start threshold is initialized to 95% of the MWS; cwnd reaches 95% of the MWS after one RTT
- The shaper is engaged during the first RTT at min-available-bw, which also acts as the minimum threshold
- Above the threshold, congestion avoidance grows cwnd by 2 segments per RTT up to the maximum window size

Figure: Packets sent per round trip (congestion window) plotted against round trips, showing the shaper ramping to 95% of the MWS within one RTT and falling back to the min-available-bw floor after a retransmission.

TCP Packet Shaping


The IPS module implements a modified TCP windowing algorithm called a packet shaper. When you configure a Gigabit Ethernet port, you specify the minimum available bandwidth, and TCP uses this value as the slow-start threshold. The packet shaper ramps up to this threshold within one RTT. From there, the TCP stack uses linear congestion avoidance, increasing throughput at the rate of 2 segments per RTT until the maximum window size is reached.

When congestion occurs, the MDS 9000 TCP implementation is more aggressive during recovery than traditional TCP: the congestion window drops to the min-available-bandwidth value. The degree of aggressiveness during recovery is therefore proportional to the min-available-bandwidth configuration.

Note that if conventional TCP traffic shares the same link with FCIP, the conventional TCP flows recover more slowly, so the bandwidth allocation will strongly favour the FCIP traffic. To make FCIP behave more fairly, use a lower min-available-bandwidth value to force FCIP to restart at a lower rate. In other words, use the min-available-bandwidth parameter to determine how aggressively FCIP should behave.
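The toy Python simulation below (my own simplified model of the behaviour described above, not Cisco code; the window sizes and loss point are assumptions) contrasts the two recovery styles.

def simulate_cwnd(rounds, mws, loss_at, shaper_floor=0.0):
    """Simplified congestion-window trace in segments per RTT.

    shaper_floor == 0 models traditional TCP: slow start up to ssthresh,
    then +1 segment per RTT, halving cwnd and ssthresh on loss.
    A positive shaper_floor models the packet shaper described above:
    start at the floor, grow +2 segments per RTT, fall back to the floor on loss.
    """
    cwnd = shaper_floor if shaper_floor else 1.0
    ssthresh = mws * 0.95 if shaper_floor else mws / 2
    trace = []
    for rtt in range(rounds):
        trace.append(round(cwnd, 1))
        if rtt in loss_at:
            ssthresh = max(cwnd / 2, 1.0)
            cwnd = shaper_floor if shaper_floor else ssthresh
        elif shaper_floor:
            cwnd = min(cwnd + 2, mws)
        else:
            cwnd = min(cwnd * 2, ssthresh) if cwnd < ssthresh else min(cwnd + 1, mws)
    return trace

print(simulate_cwnd(12, mws=64, loss_at={6}))                    # traditional TCP
print(simulate_cwnd(12, mws=64, loss_at={6}, shaper_floor=48))   # packet shaper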


FCIP Tuning Hassles


To achieve the desired throughput, a number of parameters need to be tuned:
- TCP parameters: maximum and minimum available bandwidth (for the packet shaper) and round-trip time (RTT)
- Number of outstanding I/Os for the application
- SCSI transfer size

How is this usually done?
- Standard traffic generation tools, e.g. Iometer
- Requires testing with real hosts and targets

FCIP Tuning Hassles


To maximize throughput on FCIP links, a number of parameters need to be tuned, including:

- TCP parameters (maximum bandwidth, round-trip time)
- Number of outstanding I/Os for the application
- SCSI transfer size

To determine these parameters, users need to use standard traffic generation tools like IOMeter to generate test data and measure response time and throughput. This requires test hosts and test targets, and must be redone every time the environment changes.


SAN Extension Tuner


- Assists in tuning by generating various SCSI traffic workloads
- Built into the IPS port; creates a virtual N_Port on the IPS port that can act as both initiator and target
- User-specified I/O size, transfer size, and number of concurrent I/Os
- Can simulate targets that issue multiple XFER_RDYs for large write commands
- Measures throughput and response time per I/O over the FCIP tunnels

Figure: Virtual N_Ports (10:00:00:00:00:00:00:01 and 11:00:00:00:00:00:00:03) created on Gigabit Ethernet ports at each end of an FCIP tunnel across the WAN/MAN.

SAN Extension Tuner


The SAN Extension Tuner (SET) is a lightweight tool built into the IPS port itself to assist in tuning by generating various SCSI traffic workloads. The SET creates a virtual N_Port on the IPS port that can act as both initiator and target and mimics SCSI read/write commands (FICON is not supported). The user can specify the SCSI transfer size and number of outstanding I/Os, and can simulate targets that do multiple FCP_XFER_RDYs for large write commands. The SET measures throughput and I/O latency over the FCIP tunnels and determines the optimal number of concurrent I/Os for maximum throughput.


MDS Performance Advantages


- I/O performance: Tape and Write Acceleration, packet shaper
- WAN bandwidth utilization: compression
- Security: IPSec encryption
- Traffic management: IVR, SAN Extension Tuner

Figure: Primary and backup data centers, each with an MDS with the 14+2 module, connected across the WAN/MAN.

MDS Performance Advantages


The MDS platform provides three key features that are designed to squeeze the most performance out of cost-effective IP WANs:

- Compression options add implementation flexibility by allowing bandwidth to be used more effectively. Designed specifically to enable customers to leverage sub-gigabit transports in SAN-OS 1.3, compression scales to gigabit speeds with SAN-OS 2.0 and the 14+2 line card.
- Write Acceleration increases performance by spoofing the SCSI XFER_RDY command to reduce round trips and lower latency. This feature can double the usable distance without increasing latency, and for applications that allow few outstanding I/Os, like tape backup, it can double the effective throughput.
- An optimized TCP stack keeps the pipe full by dynamically recalculating the MWS based on changing conditions and by implementing a packet-shaping algorithm to allow fast TCP starts.


Lesson 10

Building iSCSI Solutions


Overview
In this lesson, you will learn how to effectively deploy iSCSI solutions on the MDS 9000 platform, including key applications and environments, high availability features, security features, and deployment considerations.

Objectives
Upon completing this lesson, you will be able to explain how iSCSI can be used to enable migration of mid-range applications to the SAN. This includes being able to meet these objectives:

- Explain the problems that iSCSI is designed to solve
- Describe the iSCSI protocol
- Describe how iSCSI is implemented on the MDS 9000 IP Services modules
- Explain how to deploy iSCSI effectively
- Explain how to configure high availability for iSCSI
- Explain how to secure iSCSI environments
- Explain how to simplify management of iSCSI environments with target discovery
- Explain where Wide Area File Services (WAFS) is effective

What's the Problem?


Distributed Storage
Problem: customers want to consolidate storage
- Distributed storage is difficult to manage
- As storage devices increase, backup windows increase
- The data center may have extra capacity that isn't being utilized

Figure: Workgroup servers with unreliable, error-prone local backups and exceeded backup windows, alongside a data center with spare tape and disk capacity.

Distributed Storage
At a corporate headquarters, how is backup accomplished? In an environment where DAS storage dominates, someone has to load and collect tapes for each device, which easily becomes a storage management nightmare. Backup windows can easily be exceeded and normal operations can be affected as a result, delaying the opening of the business day. This is a growing problem for many businesses today. At the same time, the data center may have a good storage management scheme and applications already in place, as well as unallocated disk space. What is needed is a way to connect these distributed workgroup servers to the data center.


Branch Offices
Problem: branch offices are located at greater distances
- Lack of resources to manage storage
- Inconsistent backup at each site
- Compliance with data security and retention regulations (e.g. banks, schools, clinics)

Figure: Branch offices with unmanaged backups, few management resources, and regulatory compliance issues, connected to the data center.

Branch Offices
Branch offices can also pose a storage management issue for the enterprise. With typically too few management resources to manage storage at remote sites, backups are conducted on an ad hoc basis, often leaving the company out of compliance with data security and retention regulations.


Mid-Range Applications
Need a cost-effective SAN solution for mid-range applications:
- Mid-range applications have low bandwidth requirements, typically 10-20 MB/s on average
- They can't fully utilize FC resources: 2-Gb FC with 15 MB/s per port = only 7.5% bandwidth utilization
- They have a higher latency tolerance, and FC attachment is costly
- Typical uses: web server farms, application server farms, branch offices

Figure: Today's data center, with N-tier applications (web servers, application servers, database servers, mainframe, IP communications) attached through the IP switching network and its application optimization and security layers (firewall, IDS, content switch, SSL, cache), and an MDS 9500-based storage network with RAID and tape.

Mid-Range Applications
While FC SANs have dramatically increased operational efficiency for high-end application storage, the high cost of FC has prevented these benefits from migrating down to mid-range applications. Mid-range applications don't need the same high bandwidth and low latency as high-end applications, so it is often difficult to achieve ROI in a reasonable timeframe by implementing FC for mid-range applications. As a result, many applications in the enterprise, such as file, web, and messaging servers, are managed separately, either via DAS or NAS, keeping management costs high. At the same time, the customer's investment in FC SANs is not fully realized.

Inside the data center there are a number of different tiers of servers. Two of those tiers are web server farms and application server farms. These servers are typically numerous, yet they have low bandwidth requirements and can tolerate more latency than database servers, so it is often not considered cost-effective to migrate them to FC SANs. Assuming 2-Gb FC ports, with each host sustaining an average of 15 MB/s per port, only 7.5% of the available bandwidth is being utilized.


iSCSI Overview
What is iSCSI?
- Internet Small Computer Systems Interface (iSCSI): a SCSI transport protocol carried over TCP/IP
- Encapsulates SCSI commands and data into IP packets; TCP is the underlying transport and provides congestion control and in-order delivery of error-free data
- Allows iSCSI hosts to access native iSCSI targets, and to access FC SAN storage targets via a gateway
- Provides seamless integration of mid-range servers into the SAN
- Can use standard Ethernet NICs or iSCSI HBAs

Figure: iSCSI encapsulation - Ethernet header (18 bytes), IP header (20 bytes), TCP header (20 bytes), iSCSI header (48 bytes), followed by the SCSI commands and data.

What is iSCSI?
Internet Small Computer Systems Interface (iSCSI) is a transport protocol that operates on top of TCP and encapsulates SCSI-level commands and data into a TCP/IP byte stream. It is a means of transporting SCSI over TCP/IP, providing an interoperable solution that can take advantage of existing IP-based infrastructures and management facilities and address distance limitations. Mapping SCSI I/O over TCP ensures that high-volume storage transfers get in-order delivery of error-free data with congestion control. This allows IP hosts to gain access to previously isolated Fibre Channel-based storage targets. iSCSI is an end-to-end protocol with human-readable SCSI device (node) naming. It includes base components such as IPSec connectivity security, authentication for access configuration, discovery of iSCSI nodes, a process for remote boot, and iSCSI MIB standards. The iSCSI protocol was defined by the IP Storage Working Group of the Internet Engineering Task Force (IETF); draft version 20 was approved by the Internet Engineering Steering Group (IESG).
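Using the header sizes from the slide, the short Python sketch below (my own illustration; it ignores TCP options and optional digests, which is an assumption) shows how much of each Ethernet frame is actual SCSI payload.

HEADERS_BYTES = {"ethernet": 18, "ip": 20, "tcp": 20, "iscsi": 48}  # from the slide

def iscsi_efficiency(scsi_payload_bytes):
    """Fraction of each Ethernet frame that is SCSI payload."""
    overhead = sum(HEADERS_BYTES.values())
    return scsi_payload_bytes / (scsi_payload_bytes + overhead)

# Example: a payload sized to fit a standard 1500-byte IP MTU.
payload = 1500 - (HEADERS_BYTES["ip"] + HEADERS_BYTES["tcp"] + HEADERS_BYTES["iscsi"])
print(payload, round(iscsi_efficiency(payload), 3))   # 1412 bytes, roughly 0.93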


Advantages of iSCSI
- Cost-effective technology for connecting low-end and mid-range servers, clients, and storage devices
- Enables iSCSI hosts to communicate with iSCSI storage, and with FC storage through a gateway
- Builds on SCSI and TCP/IP technology
- Leverages the benefits of IP: knowledge and skills, infrastructure, security tools, QoS and traffic engineering, network management tools, R&D investment, and ubiquitous access

Advantages of iSCSI
iSCSI leverages existing IP networks. Users can therefore benefit from their own experience with IP as well as the industry's experience with IP technologies. This includes:

- Economies from using a standard IP infrastructure, products, and services across the organization.
- Experienced IP staff to install and operate these networks. With minimal additional training, IP staff in remote locations can be expected to maintain iSCSI-based servers.
- Management tools that already exist for IP networks, reducing the need to learn new tools or protocols.
- Traffic across the IP network can be secured using standards-based solutions such as IPsec.
- QoS can be used to ensure that SAN traffic is not affected by the potentially unreliable nature of IP. QoS exists today in the IP infrastructure and can be applied end to end across the IP network to give SAN traffic priority over less time-sensitive traffic.
- iSCSI is compatible with existing IP LAN and WAN infrastructures; iSCSI devices support an Ethernet or Gigabit Ethernet interface to connect to standard LAN infrastructures.


The Software-Based iSCSI Model


- iSCSI is a network service enabled through the use of an iSCSI software driver and optional hardware
- The host's internal TCP/IP stack consumes CPU resources during data transfer
- Error handling is performed by the driver, consuming even more CPU resources
- Inexpensive solution

Figure: Software iSCSI stack on the host - applications, file system, block device, SCSI generic layer, iSCSI driver, TCP/IP stack, and NIC driver over a standard NIC adapter.

The Software-Based iSCSI Model


iSCSI drivers are normally free and provide a very low-cost solution for customers that do not require high performance or low latency. The iSCSI driver performs all SCSI processing, TCP/IP processing, and error recovery. An iSCSI driver running on a 1-GHz CPU spends nearly 95% of its CPU cycles moving data through a NIC at 1 Gbps. Today's host CPUs run at around 3 GHz and have correspondingly more processing power, so they use only around 30% of their CPU cycles moving data at 1 Gbps, leaving the remaining 70% for running applications and the operating system.
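A crude way to read those numbers (my own extrapolation, not from the course) is to treat software iSCSI as roughly fixed work per gigabit moved, so CPU utilization scales inversely with clock speed:

def sw_iscsi_cpu_util(cpu_ghz, throughput_gbps=1.0, baseline_ghz=1.0, baseline_util=0.95):
    """Estimate CPU utilization for software iSCSI, assuming the work per
    gigabit is constant (anchored to the 95%-at-1-GHz figure quoted above)."""
    util = baseline_util * (baseline_ghz / cpu_ghz) * throughput_gbps
    return min(util, 1.0)

print(round(sw_iscsi_cpu_util(1.0), 2))   # ~0.95 at 1 GHz
print(round(sw_iscsi_cpu_util(3.0), 2))   # ~0.32 at 3 GHz, near the ~30% quoted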


TCP and iSCSI Offload Engines


- Hardware implementation of iSCSI within a specialized NIC; offloads TCP and iSCSI processing into hardware
  - Full offload: iSCSI and TCP offload (iSCSI HBA)
  - Partial offload: TCP offload only (TOE)
- Relieves the host CPU of iSCSI and TCP processing; it does not necessarily increase performance unless the CPU is busy
- Wire-rate iSCSI performance; useful only when the host must support high sustained loads

Figure: Host stack with a dedicated TOE/iSCSI adapter - applications, file system, block device, and SCSI generic layer on the host, with iSCSI and TCP/IP processing moved onto the adapter hardware.

TCP and iSCSI Offload Engines


Some NICs have a TCP Offload Engine (TOE) to offload TCP/IP processing from the host CPU and reduce the CPU load. Some cards provide partial offload and some provide full offload:

- Partial-offload TOE cards offload TCP/IP processing to the TOE but pass all errors (packet loss) to the driver running on the host CPU. In a lossy network, partial-offload TOEs may perform worse than a plain NIC.
- Full-offload TOE cards offload both TCP/IP processing and error recovery to the TOE card. The host CPU is still responsible for SCSI and iSCSI processing.

iSCSI HBAs offload TCP/IP processing and iSCSI processing to co-processors and custom ASICs on the HBA. Although relatively expensive, iSCSI HBAs provide lower latency and higher throughput than iSCSI software drivers or TOE cards. Now that host CPUs have more processing power, it is usually more cost-effective to use software iSCSI drivers than NICs with TOE.


iSCSI Drivers and Offload Engines


Different approaches to iSCSI initiators:
- iSCSI driver with a standard network card: SCSI, iSCSI, TCP, and IP are all processed in the server
- NIC with a TCP offload engine (TOE): TCP/IP is processed in hardware; SCSI and iSCSI remain on the host
- iSCSI HBA: both TCP/IP and iSCSI are processed in hardware, leaving only the applications and file system on the host

Figure: Side-by-side protocol stacks for a standard NIC, a TOE, and an iSCSI HBA, showing which layers run on the server and which run in adapter hardware.

iSCSI Drivers and Offload Engines


iSCSI drivers running on the host perform all SCSI and iSCSI processing using the host CPU through a standard NIC. As I/O loads increase, the host consumes more CPU cycles and struggles to deliver throughput. In a congested IP network where packets are frequently discarded, the TCP stack running on the host CPU must also recover the lost packets.

Some NICs have a TCP Offload Engine (TOE) to offload TCP/IP processing from the host CPU and reduce the CPU load; again, some cards provide partial offload and some provide full offload. Partial-offload TOE cards pass all errors (packet loss) back to the driver running on the host CPU, so in a lossy network they may perform worse than a plain NIC. Full-offload TOE cards offload both TCP/IP processing and error recovery, but the host CPU is still responsible for SCSI and iSCSI processing.

To achieve maximum performance, it is necessary to offload TCP/IP processing, iSCSI processing, and error recovery from the host CPU onto an iSCSI HBA. The host is then responsible only for SCSI processing.


iSCSI Concepts
Network Entity:
- The iSCSI initiator or iSCSI target system

iSCSI Node:
- Identified by an iSCSI node name
- Initiator node = host; target node = storage
- A target node contains one or more LUNs

Network Portal:
- Identified by an IP address and subnet mask
- Provides network access over TCP/IP (Ethernet, wireless, etc.)

Figure: An initiator network entity with its iSCSI node and network portal communicating across the IP network with target network entities, each with their own nodes and portals.

iSCSI Concepts
SCSI standards define a client server relationship between the SCSI Initiator and the SCSI Target. iSCSI standards define these as the Network Entity. The iSCSI Network Entity contains an iSCSI Node which is either the Initiator or Target. iSCSI Nodes are identified by an iSCSI Node Name. If the Target Node is a storage array, it may contain one or more SCSI LUNs. iSCSI Initiator Nodes communicate with iSCSI Target Nodes through Network Portals. Network Portals connect to the IP network and are identified by an IP Address. It is worth noting that Network portals can also be wireless ethernet ports.


iSCSI Node Names


iSCSI Node Name:
- Associated with iSCSI nodes, not adapters
- Up to 255 bytes, human-readable string (UTF-8 encoding)
- Used for iSCSI login and target discovery

iSCSI name types:
- IQN (iSCSI Qualified Name): a unique identifier assigned by a naming authority, for example iqn.1987-05.com.cisco.storage.backup.server1, where the date (yyyy-mm) is when the domain was acquired and the reversed domain name identifies the naming authority
- EUI (extended unique identifier, IEEE EUI-64): a unique identifier assigned by the manufacturer, for example eui.0200123456789abc

iSCSI Node Names


Every iSCSI node is identified by an iSCSI node name in one of two formats:

- iqn: iSCSI Qualified Name, up to 255 bytes, a human-readable UTF-8 encoded string
- eui: Extended Unique Identifier, an 8-byte hexadecimal number defined and allocated by the IEEE

Although both formats can be used, typically the iSCSI driver will use the iqn format, while the eui format is used by manufacturers of native iSCSI devices.
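As a quick illustration of the two naming formats, here is a small Python sketch (mine, not part of the course; the patterns are simplified approximations of the naming rules) that checks whether a string looks like an IQN or an EUI name.

import re

# iqn.yyyy-mm.<reversed-domain>[:<optional-suffix>], e.g. iqn.1987-05.com.cisco.storage.backup.server1
IQN_RE = re.compile(r"^iqn\.\d{4}-\d{2}\.[a-z0-9.-]+(:[^\s]+)?$", re.IGNORECASE)
# eui. followed by 16 hex digits (IEEE EUI-64), e.g. eui.0200123456789abc
EUI_RE = re.compile(r"^eui\.[0-9a-f]{16}$", re.IGNORECASE)

def iscsi_name_type(name):
    """Classify an iSCSI node name as 'iqn', 'eui', or 'invalid'."""
    if len(name.encode("utf-8")) > 255:
        return "invalid"
    if IQN_RE.match(name):
        return "iqn"
    if EUI_RE.match(name):
        return "eui"
    return "invalid"

print(iscsi_name_type("iqn.1987-05.com.cisco.storage.backup.server1"))  # iqn
print(iscsi_name_type("eui.0200123456789abc"))                          # eui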


MDS 9000 IP Services Modules


Native iSCSI Deployment
- iSCSI hosts can communicate directly with native iSCSI storage devices over an IP network through a standard Ethernet NIC
- iSCSI servers can be fitted with iSCSI HBAs to offload iSCSI and TCP processing from the host CPU
- Most NAS filers now support iSCSI as well as the NFS and CIFS protocols
- iSCSI is most suitable for hosts running applications that are not latency sensitive and have a low throughput requirement

Figure: iSCSI hosts with NICs, iSCSI servers with iSCSI HBAs, a NAS filer, and native iSCSI storage all attached to the IP network.

Native iSCSI Deployment


iSCSI hosts can communicate directly with native iSCSI storage devices over an IP network through a standard Ethernet NIC or through wireless. As the data load increases, the host CPU spends more time processing iSCSI and moving data byte by byte. iSCSI servers can be fitted with iSCSI HBAs to offload iSCSI and TCP processing from the host CPU. More and more mid-range native iSCSI storage arrays are coming onto the market, making native iSCSI deployment an inexpensive reality. Most NAS filers now support block I/O through iSCSI as well as file I/O through the NFS and CIFS protocols. iSCSI is suitable for hosts running applications that are not latency sensitive and do not have a large throughput requirement.


iSCSI Gateways
- iSCSI gateways allow iSCSI hosts and servers to communicate with Fibre Channel storage devices
- The MDS 9216i, MPS 14+2, and IPS line cards all provide iSCSI gateways
- iSCSI is provided for free in the standard license

Figure: iSCSI hosts (with NICs or iSCSI HBAs), a NAS filer, and native iSCSI storage on the IP network, with FC servers and FC storage on the FC SAN reached through the iSCSI gateway.

iSCSI Gateways
Most enterprises already have data centres with Fibre Channel SANs and FC Storage Arrays but they cannot be accessed directly from iSCSI hosts. iSCSI gateways allow iSCSI hosts and servers to communicate with Fibre Channel storage devices. Cisco MDS 9216i, MPS 14+2 and IPS linecards all provide an iSCSI to FC gateway function. iSCSI is provided for free on MDS switches in the standard license.


Standalone Router Implementations


- The standalone router-based approach has been the most common so far
- Separate management interfaces and separate sets of security policies
- Less highly available

Figure: FC hosts and iSCSI hosts attached through standalone iSCSI gateways/routers.

Standalone Router Implementations


Although some vendors now offer native iSCSI storage, most iSCSI implementations today use a gateway or router-based approach that makes FC storage available to iSCSI hosts. Typical iSCSI gateway/router implementations are appliance-based or are small multiprotocol switches with a handful of ports. The Cisco SN5428-2 was an example of this approach, as a standalone workgroup SAN switch that provided FC-to-iSCSI routing. Although this approach has some viable applications, like small companies and remote offices, it has scalability issues in the datacenter:

- This approach requires implementing a new set of devices (at least two for high availability), and possibly adding more devices if more network capacity is needed.
- It means separate management interfaces and, even worse, separate security policies.
- It is also typically less highly available, because the high-availability hardware features expected in a data center SAN switch are often not viable in a small, low-cost router product.
- It is potentially better suited to WAN-based branch offices; gateways are not a good fit for metro-based branch offices, such as schools, clinics, and banks.


Integrated iSCSI
- Multiprotocol SAN switch with a single SAN fabric
- Single management interface and a single set of security policies
- Tightly integrated and highly available
- Designed for the data center and for backup from remote offices

Figure: FC hosts and iSCSI hosts attached directly to MDS 9500 directors with IPS modules.

Integrated iSCSI
The Cisco iSCSI solution for data centers is the IP Services (IPS) Module series for the Cisco MDS 9000 platform. This approach integrates iSCSI and FC (along with FCIP and FICON) into a single multiprotocol SAN switch. This provides higher availability because iSCSI is supported on the highly available MDS 9000 platform. This provides a single management interface, a single point of control for security, and unifies iSCSI and FC storage into a single SAN fabric. This approach is designed to meet the availability, manageability and scalability requirements of the data center.


MDS iSCSI Gateway Function


Figure: MDS iSCSI gateway data path. The iSCSI initiator exchanges iSCSI messages (Ethernet 18-byte, IP 20-byte, TCP 20-byte, and iSCSI 48-byte headers around the SCSI commands and data) with an iSCSI virtual target on the IPS line card. A paired FC virtual initiator exchanges FC frames (SOF 4 bytes, FC header 24 bytes, 0-2048 bytes of SCSI commands and data, CRC 4 bytes, EOF 4 bytes) with the FC target, translating between SCSI Command/R2T/Data/Response on the iSCSI side and Command/XFER_RDY/Data/Status on the FC side.

MDS iSCSI Gateway Function


The iSCSI Gateway function is included within the MDS 9216i, MPS 14+2 and IPS-8 linecard. The iSCSI Gateway provides a virtual iSCSI Target that communicates with the iSCSI Initiator. iSCSI messages are received, the SCSI commands and data are extracted and passed to a virtual FC Initiator which builds a FC frame around the payload. The virtual FC Initiator in the iSCSI Gateway communicates with the FC Target in the storage array or Tape drive. The iSCSI Gateway function is performed in ASICs to minimize latency and provide maximum throughput.


IPS Modules: iSCSI Features


- iSCSI initiator to FC target mapping
- High availability options: VRRP, pWWN aliasing, options for mid-range and high-end applications
- Security: RADIUS support, IP ACLs, VSANs and VLANs, integrated FC and iSCSI zoning, IPSec
- Ease of deployment: dynamic initiator and target discovery, proxy initiators, iSNS server
- Single management interface for FC and IP

Figure: iSCSI-enabled IP hosts with iSCSI drivers reach FC storage through a Catalyst 6500 IP network, the Cisco IPS module, and the MDS 9000 FC fabric.

IPS Modules iSCSI Features


The IPS modules support mapping of iSCSI initiators (hosts) to FC targets (storage). They provide initiator and target discovery and LUN mapping to simplify deployment, and they integrate iSCSI and FC security policies by supporting iSCSI initiator membership in VSANs and zones. Unlike FC hosts, iSCSI hosts can belong to multiple VSANs. CHAP authentication is supported, with centralized account management via RADIUS. The IPS modules support a range of HA features for both mid-range and high-end storage, including VRRP, iSCSI trespass, proxy initiators, and Ethernet PortChannels. Cisco provides network boot drivers that work with the IPS module to support this key data center application. All MDS management services are supported on both FC and IP interfaces. Cisco Fabric Manager and Device Manager are used to manage the IPS modules, providing a single management interface for the entire storage network.


When to Deploy iSCSI


iSCSI Fan-In: Scenario 1

Scenario 1: few hosts, moderate bandwidth
- 30 hosts x 50 MB/s = 1500 MB/s of required bandwidth, across 60 device connections
- iSCSI: 2 IPS cards x 8 ports x 100 MB/s = 1600 MB/s, giving only a 2:1 fan-in of hosts to iSCSI ports
- FC: 2 line cards x 8 quads x 250 MB/s = 4000 MB/s
- Cost: roughly comparable ($ versus $); iSCSI is somewhat cheaper once FC HBAs are factored in

Figure: 30 hosts attached either through iSCSI NICs to IPS ports or through FC HBAs to FC line cards.

iSCSI Fan-In Scenario 1


For smaller applications requiring few ports, FC can be as cost-effective as iSCSI if there is an existing FC infrastructure. In the scenario shown here, with 30 ports requiring 50 MB/s per port, it would be somewhat less expensive to use iSCSI when the cost of FC HBAs is factored in, but the cost difference will not be significant. Because each host needs 50 MB/s, the fan-in ratio of hosts to iSCSI ports is only 2:1. In addition, the two 32-port FC modules would provide more than twice the required bandwidth, which would allow for growth in I/O requirements.


iSCSI Fan-In: Scenario 2

Scenario 2: many hosts, low bandwidth
- 100 hosts x 15 MB/s = 1500 MB/s of required bandwidth, across 200 device connections
- iSCSI: still only 2 IP cards required, thanks to a roughly 6:1 fan-in of hosts to iSCSI ports
- FC: 8 FC line cards plus 200 HBA ports required
- Cost: iSCSI is far cheaper ($ versus $$$$)

Figure: 100 hosts attached either through iSCSI NICs to two IPS cards or through FC HBAs to eight FC line cards.

iSCSI Fan-In Scenario 2


iSCSI is most cost-effective with high fan-in. In this scenario, the total bandwidth requirement is still 1500 MB/s, but this time there are 100 hosts that each need 15 MB/s of bandwidth. An FC-only solution would require eight 32-port FC modules (four 32-port modules per fabric in a redundant fabric configuration), and would also require 200 HBA ports. However, due to the 6.25:1 fan-in across the IP network, you would need only 2 IPS blades, and host connectivity could be provided by standard Gigabit Ethernet (or even 100BASE-T) NICs. In this scenario, it is far more cost-effective to use iSCSI.
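The Python sketch below reproduces that fan-in arithmetic (my own illustration; it assumes roughly 100 MB/s of usable bandwidth per GigE port and 8-port IPS modules).

import math

def ports_needed(hosts, mb_per_host, port_mb=100.0):
    """GigE ports needed so aggregate host demand fits the port bandwidth."""
    return math.ceil(hosts * mb_per_host / port_mb)

# Scenario 2 from the text: 100 hosts at 15 MB/s each.
needed = ports_needed(100, 15)        # 15 ports of raw bandwidth
modules = math.ceil(needed / 8)       # rounded up to 8-port IPS modules -> 2
ports = modules * 8                   # 2 modules = 16 GigE ports
print(needed, modules, f"fan-in {100 / ports:.2f}:1")   # 6.25:1, as quoted above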


iSCSI Fan-In Ratios


- Fan-in/fan-out ratios are an important aspect of optimal SAN designs
- High fan-in ratios make IP SANs very cost-effective:
  - Typical iSCSI fan-in = 10:1 to 20:1 (10 MB/s to 20 MB/s per host)
  - Typical FC fan-in = 4:1 to 10:1 (20 MB/s to 50 MB/s per host)

Figure: Many iSCSI hosts sharing one IPS port versus a smaller number of FC hosts sharing one FC port.

It is desirable to have high fan-in ratios in an IP SAN design, in part because they are more cost-effective, and in part because of the low port density of IP gateways and line cards.


Cost-Effective DAS Consolidation


- The MDS IP line card provides seamless iSCSI/FC SAN integration
- iSCSI-enabled hosts capitalize on the existing IP infrastructure investment, at less than half the cost of FC attachment
- A line card upgrade protects the MDS chassis investment
- FC SAN resources can be fully utilized, with a common management infrastructure

Figure: iSCSI-enabled IP hosts reach FC storage and backup assets in the FC SAN through a Catalyst 6500 and the Cisco IPS module in an MDS 9509.

Cost-Effective DAS Consolidation


iSCSI is an ideal solution for many mid-range and low-end applications. It is a low-cost transport that leverages the existing investment in IP infrastructure. The MDS 9000 IP Services (IPS) line cards integrate iSCSI into the core SAN, allowing iSCSI hosts to utilize existing FC storage resources. The IPS line cards provide line-rate Gigabit Ethernet (GigE) performance, allowing high fan-in ratios (more iSCSI hosts per GigE port) and reducing the cost per host. The IPS provides options for both low-cost and high-end multipathing, providing a range of high-availability solutions to suit the needs of different applications. Lastly, operational management for iSCSI storage is integrated with FC storage management, instead of being isolated on separate boxes.


High-Availability iSCSI Configurations


GigE Interfaces
- Each GigE port supports three FCIP interfaces and an iSCSI interface
- An IPS-8 can support up to 24 FCIP tunnels plus iSCSI concurrently
- Each iSCSI interface will support approximately 200 connections
- GigE ports can be joined into an Ethernet PortChannel for HA, between odd/even pairs (1-2, 3-4, 5-6, 7-8)

Figure: Two GigE ports, each carrying an FCIP profile with three FCIP interfaces plus an iSCSI interface, bundled into an Ethernet PortChannel towards the IP network.

GigE Interfaces
Each GigE port supports three FCIP interfaces and an iSCSI interface simultaneously, sharing 1 Gbps of bandwidth. Tests have shown that each iSCSI interface will support up to 200 iSCSI connections, although it is worth noting that all of those iSCSI hosts would share the same 1 Gbps of bandwidth. GigE ports can be joined using an Ethernet PortChannel for high availability. On the IPS-8 and MPS 14+2 line cards, odd/even pairs of ports share the same SiByte ASIC and resources.


Low-End HA Design Topology


Low-end HA design topology:
- Low-cost design with adapter teaming on the host (client-side data network omitted)
- VRRP provides redundancy across IPS modules
- pWWN aliasing provides redundancy on the FC side
- A single NIC remains a single point of failure, and there is no load balancing

Figure: An iSCSI host attached through a VRRP pair of IPS ports, with pWWN aliasing towards the FC target.

Low-End HA Design Topology


In cost-sensitive environments, Cisco MDS 9000 Family features can be used to provide redundancy for iSCSI sessions. One of these features is the Virtual Router Redundancy Protocol (VRRP). VRRP provides redundant gateway services: should a Gigabit Ethernet port on the IPS module fail, another Gigabit Ethernet port on a redundant IPS module resumes the iSCSI service and continues to provide access for the affected iSCSI sessions.

Another feature provided by the IPS module is pWWN aliasing, which provides recovery capability on the Fibre Channel end of the solution. Using pWWN aliasing, failover capability is provided to a redundant FC port in the event of a failure on the active FC port that is connected to the physical storage target. A requirement for this solution is that both FC ports must have access to the same LUNs and provide redundant paths to the physical storage residing on the FC SAN. Not all storage arrays provide the required active-active LUN capability across multiple storage subsystem interfaces; consult your storage vendor for details on this feature.


High-End HA Design Topology


High-end HA design topology:
- Highly available, redundant fabric design (redundant VSANs)
- Host multipathing software provides fully redundant paths with active-active load balancing
- Ethernet PortChannels add link-level redundancy on the IPS module

Figure: A multipathed iSCSI host attached through Ethernet PortChannels to redundant VSANs and on to the FC storage.

High-End HA Design Topology


Many Fibre Channel SAN designers believe that the highest levels of redundancy and availability are achieved through the use of redundant fabric design topologies, which provide complete isolation from fabric-wide disruptions. IP SAN designs can provide the same levels of redundancy and availability as Fibre Channel-based SANs; with the Cisco family of switches, this fabric isolation can also be implemented using VLAN and VSAN capabilities. Furthermore, when redundant fabric designs are combined with director-class switches, the levels of fault tolerance and availability are even higher. At the upper end of the HA design spectrum for IP SANs, redundant fabric designs are combined with multipathing software to provide active-active load balancing and nearly instantaneous failover in the event of a component failure. The use of Ethernet PortChannels can further enhance network resiliency by providing link-level redundancy in the event of a port failure on an MDS IPS module.


VRRP
- Two Gigabit Ethernet ports are placed in a VRRP group with one virtual IP address
- If the active VRRP port fails, the peer reconnects to the same virtual IP address across the second port
- Provides front-end redundancy

Figure: Two iSCSI hosts (iqn.host-1 and iqn.host-2) reach the FC SAN through a VRRP pair of IPS ports sharing virtual IP address 10.1.1.1.

VRRP
Virtual Router Redundancy Protocol (VRRP) is a router-based protocol that dynamically handles redundant paths, making failures transparent to applications. Two ports are placed into a VRRP group that is assigned a single virtual IP address, and the external router connects to the IPS via that virtual IP address. This enables transparent failover of an iSCSI volume from one IPS port to any other IPS port, either locally or on another Cisco MDS 9000 Family switch. VRRP provides redundancy in front of the MDS switch but can take up to 20 seconds to fail over.


pWWN Aliasing
- Provides back-end redundancy
- Each FC storage port is mapped to a virtual iSCSI target
- pWWN aliasing maps a secondary pWWN to the same virtual target
- The trespass feature, for mid-range storage arrays, exports LUNs from the active port to the passive port

Figure: An iSCSI host reaches virtual targets iqn.disk-1 and iqn.disk-2, which map to redundant FC storage ports pWWN P1 and pWWN P2 on the FC SAN.

pWWN Aliasing
Virtual iSCSI targets can be associated with a secondary pWWN on the FC target. This can be used when the physical Fibre Channel target is configured to have a LUN visible across redundant ports. When the active port fails, the secondary port becomes active and the iSCSI session switches to use the new active port. iSCSI transparently switches to the secondary port without impacting the iSCSI host; all other I/Os are terminated with a check-condition status, and the host retries them. If both the primary and secondary pWWNs are available, then both pWWNs can be used, and each session may use either pWWN.

For mid-range storage arrays, the trespass feature is available to export LUNs, on an active port failure, from the active to the passive port of a statically imported iSCSI target. In physical Fibre Channel targets that are configured to have LUNs visible over two Fibre Channel N-ports, when the active port fails, the passive port takes over. However, some physical Fibre Channel targets require that a trespass command be issued to export the LUNs from the active port to the passive port. When the active port fails, the passive port becomes active, and if the trespass feature is enabled, the MDS issues a trespass command to the target to export the LUNs on the new active port. The iSCSI session then switches to the new active port, and the exported LUNs are accessed over it. pWWN aliasing and trespass provide redundancy behind the MDS switch.


Host-to-Storage Multipathing
Redundant I/O design with multipathing software is a best practice:
- Error detection, dynamic failover, and recovery
- Active/active or active/passive operation
- Transparent to applications on the server

Figure: A host with multiple (iSCSI) NICs and multipathing software, attached through redundant Ethernet switches and redundant MDS 9000 fabrics to an FC storage array with redundant controller ports.

Host to Storage Multipathing


Multipath storage products can provide a large spectrum of features and functions that affect the performance, availability, accessibility, configurability, and serviceability of the storage subsystem and system I/O. Due to the cost impact of redundancy and stringent network requirements, administrators may choose to implement redundancy at only one component level. At each individual component level there must be robust management and monitoring techniques built into the network so that switchover can occur with minimal downtime.

In a typical multipathing implementation, each path may traverse a separate fabric to complete the connection between initiator and target. A failure anywhere in a chosen path can cause a failover event to occur, so multipathing software must provide proactive monitoring and fast failover should a utilized path fail. During a failure event, it is important either for the network recovery mechanisms to maintain access to all devices (targets, LUNs) or for the multipathing implementation to recognize and recover from any loss in connectivity.

The key objective for redundancy design is to maintain access at the application layer and minimize any outages. A combination of multipathing software and iSCSI network redundancy is used to ensure true application-layer protection.


iSCSI Server Load Balancing (iSLB)


Create a pool of IPS ports
Load balance servers to ports from the pool

Without iSLB:
Manually configure the iSCSI configuration on multiple switches
Static assignment of hosts to IPS ports, with active/backup redundancy
Manually zone iSCSI host WWNs with FC target WWNs

With iSLB:
CFS automatically distributes the iSCSI configuration to multiple switches
iSLB provides dynamic load distribution with active/active redundancy
Simplified zoning by automating the setup of iSCSI-specific attributes

[Figure: iSCSI servers are load-balanced by iSLB across a pool of IPS ports on redundant MDS switches connecting to FC arrays.]

iSCSI Server Load Balancing


iSLB provides dynamic load distribution across pooled Gigabit Ethernet ports on different line cards in the same MDS switch. This removes the need to manually assign iSCSI hosts to IPS ports or to manually zone iSCSI hosts with FC targets.
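The load-distribution idea can be sketched as assigning each iSCSI initiator to the least-loaded port in a configured pool. This is a conceptual model only; the port names and the simple connection-count metric are assumptions, not the switch's actual algorithm.

    # Conceptual sketch of pooled-port load distribution (not the iSLB
    # implementation): assign each initiator to the least-loaded IPS port.
    from collections import Counter

    def assign_initiators(initiators, port_pool):
        load = Counter({port: 0 for port in port_pool})
        assignment = {}
        for initiator in initiators:
            port = min(port_pool, key=lambda p: load[p])  # least-loaded port
            assignment[initiator] = port
            load[port] += 1
        return assignment

    # Example with made-up port and host names.
    ports = ["gig2/1", "gig2/2", "gig5/1", "gig5/2"]
    hosts = ["iqn.host-%d" % i for i in range(1, 10)]
    print(assign_initiators(hosts, ports))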


iSCSI Security
iSCSI Access Control
MDS 9000 Bridges Security Domains for IP SANs
IP Domain: VLANs, CHAP, ACLs, IPSec
Management Domain: SNMP, AAA, RBAC, SSH
FC Domain: VSANs, Zoning, Port Security

Multiple levels of security



iSCSI Access Control


When considering security in IP SANs, it is important to look at the overall picture. IP SANs touch several overlapping security domains and therefore may require the use of several security mechanisms, including:

IP Domain: VLANs, ACLs, CHAP, IPSec
Management Domain: AAA, SNMP, RBAC, SSH
Fibre Channel Domain: VSANs, Zoning, Port Security

The MDS 9000 family of switches provides the security features, intelligence capabilities, and processing capacity needed to bridge these security domains. While it is not a requirement to implement all of these security features, it is a recommended best practice to implement multiple levels of security. For example, iSCSI CHAP authentication is not required, but it can be used in combination with FC-based zoning to create a more secure IP SAN.
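As an illustration of one of these layers, the sketch below shows the CHAP response calculation defined in RFC 1994, on which iSCSI CHAP is based: the response is an MD5 digest of the identifier, the shared secret, and the challenge. The secret and challenge values here are invented for the example.

    # Minimal sketch of the RFC 1994 CHAP response calculation:
    # response = MD5(identifier || shared-secret || challenge)
    import hashlib
    import os

    def chap_response(identifier, secret, challenge):
        return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

    secret = b"sharedSecret123"     # example secret shared by switch and host
    challenge = os.urandom(16)      # random challenge sent by the authenticator
    identifier = 1

    # The initiator computes the response; the switch (or a RADIUS server)
    # computes the same value and compares the two.
    print(chap_response(identifier, secret, challenge).hex())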


Centralized Security
Centralized AAA services via RADIUS and TACACS+ servers
Single AAA database for:
iSCSI CHAP authentication
FC-CHAP authentication
CLI/SNMP accounts (RBAC)
SNMPv3

[Figure: a RADIUS server (labeled as a local RADIUS server on the MDS) provides centralized AAA; the MDS authenticates iSCSI servers (CHAP), FC servers (FC-CHAP), and management-server access (RBAC) against it, with FC targets behind the fabric.]

Centralized Security
The MDS 9000 platform provides centralized AAA services by supporting RADIUS and TACACS+ servers. With iSCSI, RADIUS can be used to implement a single highly available AAA database for:

iSCSI CHAP authentication
FC-CHAP authentication
CLI/SNMP accounts (RBAC)
SNMPv3


iSCSI Access Control Model

IPS module supports IP-based access controls:
IP ACLs
VLAN trunking
CHAP authentication for iSCSI initiators

iSCSI virtual targets provide additional iSCSI access controls:
Advertise targets on specific interfaces ("I will only be advertised on gig 5/3")
Permit access to specific iSCSI initiators ("I will only permit access to iqn.host1")

iSCSI Access Control Model


During an iSCSI login, the iSCSI initiator and target each have the option to authenticate the other. By default, the IPS module allows either CHAP authentication or no authentication from iSCSI hosts. CHAP authentication can be enabled globally for all IPS module interfaces or on a per-interface basis.

You can control access to each statically mapped iSCSI target by specifying the list of IPS ports on which it will be advertised and the list of iSCSI initiator node names allowed to access it. By default, iSCSI targets are advertised on all Gigabit Ethernet interfaces, subinterfaces, PortChannel interfaces, and PortChannel subinterfaces. Also by default, static virtual iSCSI targets are not accessible to any iSCSI host; you must explicitly configure accessibility before a virtual iSCSI target can be accessed by any host. The initiator access list can contain one or more initiators, each identified by one of the following (a sketch of the resulting check follows the list):

iSCSI node name
IP address and subnet
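The sketch below models this initiator access-list check: an initiator is permitted either by iSCSI node name or by IP address and subnet. The names and subnet are examples only, not a real configuration.

    # Illustrative check of a virtual-target initiator access list.
    import ipaddress

    def initiator_allowed(node_name, ip, allowed_names, allowed_subnets):
        if node_name in allowed_names:
            return True
        addr = ipaddress.ip_address(ip)
        return any(addr in ipaddress.ip_network(net) for net in allowed_subnets)

    allowed_names = {"iqn.1987-05.com.cisco:host1"}   # example node name
    allowed_subnets = ["10.1.10.0/24"]                # example subnet

    print(initiator_allowed("iqn.1987-05.com.cisco:host1", "192.0.2.9",
                            allowed_names, allowed_subnets))   # True, by name
    print(initiator_allowed("iqn.other:host9", "10.1.10.22",
                            allowed_names, allowed_subnets))   # True, by subnet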


FC Access Control Model


[Figure: an iSCSI host on VLAN_10 ("What's a VSAN?!") is represented in the fabric as a virtual FC host (pWWN) in VSAN_10.]

By default, all iSCSI initiators belong to the port VSAN of their iSCSI interface (VSAN 1)
iSCSI initiators can be assigned to VSANs by pWWN
iSCSI initiators can belong to multiple VSANs

FC Access Control Model


The iSCSI specifications do not define VSANs, and iSCSI hosts know nothing about VSANs. However, the MDS 9000 extends these concepts from the Fibre Channel domain into the iSCSI domain, providing an inherent transparency to both protocols. By default, iSCSI initiators are members of the port VSAN of their iSCSI interface, which defaults to VSAN 1. The port VSAN of an iSCSI interface can be modified. iSCSI initiators can be members of more than one VSAN. The IPS module creates one Fibre Channel virtual N_Port in each VSAN to which the host belongs.
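A small sketch of this membership model: an initiator defaults to the port VSAN of its iSCSI interface (VSAN 1) unless explicitly assigned, and receives one virtual N_Port per VSAN it belongs to. The initiator name and VSAN numbers below are illustrative.

    # Conceptual model of iSCSI initiator VSAN membership (not MDS code).
    DEFAULT_PORT_VSAN = 1

    def initiator_vsans(explicit_vsans=None):
        # Explicit assignment overrides the default port VSAN.
        return sorted(explicit_vsans) if explicit_vsans else [DEFAULT_PORT_VSAN]

    def virtual_nports(initiator, vsans):
        # One virtual FC N_Port is created per VSAN the initiator belongs to.
        return {vsan: "%s@vsan%d" % (initiator, vsan) for vsan in vsans}

    # Example: an initiator explicitly assigned to VSANs 10 and 20.
    print(virtual_nports("iqn.host-1", initiator_vsans({10, 20})))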


Zoning iSCSI Initiators


[Figure: an iSCSI host on VLAN_10 appears as a virtual FC host (identified by IQN or IP address) in iSCSI_Zone1 within VSAN_10.]

VSANs and zones are FC access control mechanisms
MDS 9000 extends VSANs and zoning into the iSCSI domain
iSCSI initiator access is subject to VSAN and zoning rules

Zoning iSCSI Initiators


Zoning is a Fibre Channel access control mechanism for devices within a SAN, or in the case of the MDS 9000, within a VSAN. The MDS 9000's zoning implementation extends the VSAN and zoning concepts from the Fibre Channel domain to cover the iSCSI domain as well. This extension includes both iSCSI and Fibre Channel features and provides uniform, flexible access control across a SAN. iSCSI initiators are subject to the rules and enforcement of VSANs and zoning.

By default, dynamically mapped iSCSI initiators are placed in VSAN 1. If the default zone policy in VSAN 1 is set to permit, iSCSI initiators could access any unzoned targets in VSAN 1. Generally speaking, setting the default zone policy to permit is not recommended.
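The access rule can be modeled simply: an iSCSI initiator reaches a target only if both are in the same VSAN and share at least one zone in that VSAN's active zoneset (assuming the default zone policy is deny). The sketch below uses invented names and a deliberately simplified data model.

    # Simplified model of VSAN plus zoning enforcement (not MDS code).
    def can_access(initiator, target, vsan_of, active_zones):
        if vsan_of.get(initiator) != vsan_of.get(target):
            return False                      # different VSANs never talk
        return any(initiator in members and target in members
                   for members in active_zones.get(vsan_of[initiator], []))

    # Example: iSCSI_Zone1 in VSAN 10 contains the host and the disk.
    vsan_of = {"iqn.host-1": 10, "pwwn-disk-1": 10}
    active_zones = {10: [{"iqn.host-1", "pwwn-disk-1"}]}
    print(can_access("iqn.host-1", "pwwn-disk-1", vsan_of, active_zones))  # True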


IPSec
IPSec for secure VPN tunnels:
Authentication and encryption
Site-to-site VPNs for FCIP tunnels
Site-to-site VPNs for iSCSI connections
Hardware-based IPSec on the 14+2 module
[Figure: iSCSI hosts reach remote FC storage through a site-to-site VPN for the IP SAN, and FC fabrics are linked through a site-to-site VPN for the FCIP interconnect.]

IPSec
The IPSec protocol creates secure tunnels between a pair of hosts, between a pair of gateways, or between a gateway and a host. IPSec supports session-level and packet-level authentication and encryption using a variety of algorithms, such as MD5 and SHA-1 for message authentication and DES and 3DES for encryption. Session-level authentication ensures that devices are authorized to communicate and verifies that devices are who they say they are, while packet-level authentication ensures that data has not been altered in transit. Applications for IPSec VPNs in the SAN include:

Site-to-site VPNs for FCIP SAN interconnects
Site-to-site VPNs for IP SANs (iSCSI hosts accessing remote FC storage)


VLANs
Use VLANs to secure data paths at the edge of the IP network
VLAN-to-VSAN mapping
Private VLANs

[Figure: iSCSI hosts isolated in an iSCSI VLAN mapped to an iSCSI VSAN; FCIP traffic carried in an FCIP VLAN mapped to an FCIP VSAN, with FC devices behind the fabric.]

VLANs
Within each data center or remote site, VLANs can be used to provide dedicated paths for IP storage traffic. VLANs can be used to:

Protect iSCSI traffic along the data path from the hosts to the SAN fabric
Provide dedicated paths for FC extension over FCIP by extending VLANs from the SAN fabric to edge routers

In addition to providing security, using VLANs to isolate iSCSI and FCIP data paths enhances the network administrator's ability to provide dedicated bandwidth to SAN devices and allows more effective application of QoS parameters.


iSCSI Target Discovery


[Figure: an iSCSI host queries across the IP network ("Hello out there, where are my targets?!") and the MDS replies with the accessible targets ("Here you go, mate: iqn.your.target.disk1, iqn.your.target.disk2").]

iSCSI target discovery:
Uses the iSCSI SendTargets command to query the target
When there are few devices and the target IP address is known, use static configuration (point-to-point)
In larger iSCSI designs with multiple IP connections, use iSNS (Internet Storage Name Service) or SLPv2 (Service Location Protocol)

The SCSI protocol is still used for LUN discovery



The goal of iSCSI discovery is to allow an initiator to find the targets to which it has access, and at least one address at which each target may be accessed. Ideally, this should be done with as little configuration as possible. The iSCSI discovery mechanisms deal only with target discovery; the SCSI protocol is used for LUN discovery. In order to establish an iSCSI session with an iSCSI target, the initiator needs the target's IP address, TCP port number, and iSCSI target name.

The goal of the iSCSI discovery mechanisms is to provide low-overhead support for small iSCSI setups and scalable discovery solutions for large enterprise setups. There are therefore several methods for finding targets, ranging from configuring a list of targets and addresses on each initiator (no discovery at all) to configuring nothing on each initiator and allowing it to discover targets dynamically. There are currently three basic ways to allow iSCSI host systems to discover the presence of iSCSI target storage controllers:

Static configuration
iSCSI SendTargets command
Zero-configuration methods such as the Service Location Protocol (SLPv2) and/or the Internet Storage Name Service (iSNS)

The diagram above shows the SendTargets method, which is most often used today with simple iSCSI solutions. An iSNS server can be used to scale iSCSI target discovery in larger deployments; the iSNS server component is included beginning with SAN-OS 2.0.
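For illustration, the sketch below parses the key=value text of a SendTargets=All response as defined in RFC 3720 (TargetName and TargetAddress keys). The target names and addresses are made up.

    # Sketch of parsing a SendTargets=All text response (RFC 3720 format).
    def parse_sendtargets(response):
        targets, current = {}, None
        for line in response.strip().splitlines():
            key, _, value = line.partition("=")
            if key == "TargetName":
                current = value
                targets[current] = []
            elif key == "TargetAddress" and current:
                addr, _, _tag = value.partition(",")  # strip portal group tag
                targets[current].append(addr)
        return targets

    response = """TargetName=iqn.your.target.disk1
    TargetAddress=10.1.1.10:3260,1
    TargetName=iqn.your.target.disk2
    TargetAddress=10.1.1.10:3260,1"""
    print(parse_sendtargets(response))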


iSNS Client and Server Support


The iSNS server is integrated into the MDS
An external iSNS server is also supported via the MDS iSNS client

Enables an integrated solution to configure and manage both Fibre Channel and iSCSI devices:
Device registration, discovery, and state change notification
Distributed, highly available solution
Discovery Domains mapped to FC zones
Discovery Domain Sets mapped to FC zonesets
No need for dual access-control configuration

[Figure: left, MDS switches acting as iSNS clients registering with an external iSNS server; right, the iSNS server integrated into the MDS switches, serving iSCSI servers, FC servers, and FC targets.]

iSNS Client and Server Support


Zero-configuration discovery methods allow iSCSI hosts to discover accessible targets without requiring an administrator to explicitly configure each host to point to its targets. iSNS is rapidly becoming the industry-standard zero-configuration protocol for iSCSI environments and is supported by the Microsoft iSCSI client. iSNS provides name services for iSCSI and iFCP SANs.

Without iSNS, each iSCSI host must be configured to point to each target portal, which can be very time-consuming and error-prone in a large deployment. With iSNS, iSCSI target devices register their addresses and attributes with a central iSNS server. Initiators can then query the iSNS server to identify accessible targets. iSNS also includes a state change notification protocol that notifies iSCSI devices when the list of accessible targets changes.

With native iSCSI storage, each target is a separate portal. The MDS 9000 IPS module acts as a portal for all virtual FC targets configured on that switch. This means that host configuration is relatively simple with a single IPS module, but it becomes increasingly complex as more IPS modules are added. When native iSCSI targets are added to the mix, iSNS is even more essential for scaling the iSCSI deployment.

The Cisco IPS module includes both iSNS client and iSNS server support. If an external iSNS server such as the Microsoft iSNS server is used, the MDS registers all virtual iSCSI targets with the external iSNS server. If the MDS iSNS server is used, iSCSI hosts discover targets by querying the iSNS server in the MDS switch. The iSNS databases are distributed and synchronized in a multi-switch fabric.

iSNS also supports Discovery Domains (DDs) and Discovery Domain Sets (DDSs), which are similar to zones and zonesets. One advantage of the MDS iSNS server over other iSNS servers is that the MDS automatically maps the active zoneset to the active iSNS DDS, eliminating the need for dual access-control configuration.
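The zoneset-to-DDS mapping can be illustrated with a toy query: because the active zoneset drives the active Discovery Domain Set, an initiator's iSNS query returns only the targets it shares a zone with. The zone contents below are invented.

    # Toy model of an iSNS query scoped by zone-derived Discovery Domains.
    active_zoneset = {
        "iSCSI_Zone1": {"iqn.host-1", "iqn.disk-1"},
        "iSCSI_Zone2": {"iqn.host-2", "iqn.disk-2"},
    }

    def isns_query(initiator):
        visible = set()
        for members in active_zoneset.values():   # zones act as discovery domains
            if initiator in members:
                visible |= members - {initiator}
        return sorted(visible)

    print(isns_query("iqn.host-1"))   # ['iqn.disk-1'] only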

Wide Area File Services


Typical Enterprise
Data protection risks
Regulatory compliance issues
Management challenges
High costs: $20k-$30k/yr per branch

[Figure: branch, regional, and remote offices, each with local IT, NAS/DAS storage, files, and backup, connected to the data center over the wide area network; islands of storage.]

Typical Enterprise
In a typical enterprise environment, several branch offices connect to the data center over the WAN. Each branch office is responsible for data protection and backup of critical data, leading to concerns about regulatory compliance. Each branch office also requires local technical support and management of the infrastructure, leading to high deployment costs.


WAFS Solution and Data Migration


Data migrated to the data center
Reduced IT management costs
Deploy WAAS at each office

[Figure: WAAS deployed at the branch, regional, and remote offices and at the data center; file data from the offices is migrated across the wide area network to NAS/SAN storage and backup at the data center.]

WAFS Solution and Data Migration


To solve management issues and reduce IT management costs, data is migrated to the data center. WAAS appliances that provide WAFS services are deployed at each Branch Office and at the Data Center to provide access to files from each branch office.


Centralization and Consolidation


Storage consolidated in the data center
Centralized IT management and backup strategy
Files cached in WAAS and accessed locally

[Figure: storage consolidated in the data center with centralized backup; WAAS clusters at the branch, regional, and remote offices cache files locally; a web-based WAAS Manager provides central administration across the wide area network.]

Centralization and Consolidation


By consolidating storage in the data center, data can be centrally managed and backed up. However, without WAFS, file access from each branch office would be slow. The WAAS appliance will cache files locally in each branch office, providing much faster access and improved performance.


Data Flows in the Data Center


SCSI is a block I/O protocol
NFS and CIFS are file I/O protocols
The file system maps files to blocks, so that files on the storage device are accessed using block I/O

[Figure: data-center data flows. A Windows client (CIFS) and a Unix client (NFS) access a NAS filer (with its own file system and storage) and a NAS head (protocol conversion and file system only, no local storage) that fronts an FC storage array. An iSCSI host with its own file system reaches iSCSI storage and FC storage through an iSCSI gateway. An FC application server with an HBA and FC tape attach to the FC SAN.]

Data Flows in the Data Center


Many different File I/O and Block I/O protocols are used throughout the data center. File I/O protocols like NFS and CIFS are used to transfer files between clients and NAS filers across the LAN.

Unix clients use NFS
Windows clients use CIFS

Block I/O protocols like SCSI are used to transfer blocks between SCSI Initiators and SCSI Targets.

iSCSI is used to transport SCSI commands and data across the LAN
Fibre Channel is used to transport SCSI commands and data across the SAN

The File System is a table that is used to map files to blocks. The data center is a complex environment with many different File I/O and Block I/O protocols used to transfer data to and from storage devices. In this environment it is important to understand where the data is located and where the file system is located.

NAS filers connect to the LAN and have their own file system and local storage. They respond to file I/O protocols like NFS and CIFS.
A NAS head is a NAS filer without local storage. NAS heads bridge the LAN and the SAN: they respond to file I/O protocols and map those requests through the file system to the block I/O protocols used to access FC storage on the SAN.
An iSCSI gateway allows iSCSI hosts on the LAN to access FC storage on the SAN. Note that the file system is now on the iSCSI host.


An FC application server responds to file I/O requests from clients on the LAN and retrieves data using block I/O from FC storage LUNs across the SAN. In this case, the file system resides on the FC application server.
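As a toy illustration of the file-system mapping discussed above, the sketch below translates a file-level request (name, offset, length) into the block-level reads it implies. The block size and block map are invented for the example.

    # Toy file-to-block translation: file I/O in, block I/O out.
    BLOCK_SIZE = 4096
    block_map = {"report.doc": [120, 121, 987]}   # file -> list of block numbers

    def file_read(name, offset, length):
        """Return the block-level reads implied by one file-level read."""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        return [("READ_BLOCK", block_map[name][i]) for i in range(first, last + 1)]

    # Example: a 300-byte read starting at offset 4000 spans blocks 120 and 121.
    print(file_read("report.doc", 4000, 300))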


Data Flows across the WAN


CIFS is a very chatty protocol: more than 1300 round-trip transactions to and from the file system to load a 1 MB file
As the distance between client and file system increases, latency increases and files take longer to load

[Figure: Windows clients (CIFS) and Unix clients (NFS) at branch offices access a NAS filer and iSCSI storage in the data center across the WAN.]

Data Flows across the WAN


In a data center environment, distances are short, so latencies are relatively low. When the Windows client is located across the WAN, at some distance from its server, latencies increase dramatically and files take longer to load. CIFS is a notoriously chatty protocol: over 1300 round-trip transactions take place between the client and the file system in the NAS filer just to load a 1 MB file.
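A back-of-the-envelope check of these numbers, assuming every transaction is fully serialized (in practice some requests overlap, but latency still dominates over bandwidth):

    # Rough upper bound on the latency cost of a chatty protocol over the WAN.
    round_trips = 1300            # transactions to load a 1 MB file
    rtt_s = 0.080                 # 80 ms WAN round-trip time

    latency_cost = round_trips * rtt_s   # time spent only waiting on round trips
    print("%.0f seconds of pure round-trip delay" % latency_cost)   # ~104 s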


Cisco Wide-Area Application Services

[Figure: Cisco WAAS architecture.
L7: Application Optimization - Video, Web, File Services, Local Services, Other Apps; Unified Management
L4: Transport Optimization - Data Redundancy Elimination (DRE), TCP Flow Optimizations (TFO), Content Distribution
Application Classification and Policy Engine; Logical and Physical Integration; Security; Monitoring; Quality of Service
Network Infrastructure - Core Routing and Switching Services]

Cisco Wide Area Application Services (WAAS)


Cisco Wide-Area Application Services (WAAS) is a powerful combination of the new Cisco Wide Area Application Engines (WAE) and integrated network modules. WAAS includes, and is the replacement for, Wide-Area File Services (WAFS). WAAS offers distributed enterprises with multiple branch offices the benefits of centralized infrastructure and simple remote access to applications, storage, and content. Cisco WAAS includes best-in-class protocol optimizations, caching, content distribution, and streaming media technologies. The technology overcomes bandwidth and latency limitations associated with TCP/IP and client-server protocols and allows you to consolidate your distributed servers and storage into centrally managed data centers, while offering LAN-like access to remote users. Benefits include:

Reduce TCO and improve asset management through centralized rather than distributed infrastructure
Improve data management and protection by keeping a master copy of all files and content at the data center
Improve the ability to meet regulatory compliance objectives
Raise employee productivity by providing faster access to shared information
Protect investment in existing WAN deployments


WAFS Performance
[Figure: bar charts comparing Native WAN, Cisco FE (WAFS), and Native LAN times in seconds for Word open and save (1 MB Word file, T1, 80 ms) and Excel open and save (2 MB Excel file, T1, 80 ms).]

Cisco WAAS shows 5x to 12x faster performance as compared to the WAN, and performance similar to the LAN, for typical operations on Office applications

WAFS Performance
The diagram above compares file open and file save times for a WAFS-enabled site versus a direct-access WAN site. Even when a file is not cached on the local Wide Area Application Engine, the WAFS performance enhancements use roughly one-third of the WAN resources that a native WAN request for the same file would consume.
Note: All graphs and statistics are examples only; actual performance will vary depending on network design, server design, and application design.
