Cisco Storage Design Fundamentals (CSDF)
Version 3.0
Student Guide
Table of Contents
Course Introduction
Overview
Recommended Prerequisites
Course Outline
Cisco Certifications
Administrative Information
About Firefly
Recommended Prerequisites
You will gain the most from this course if you have experience working with storage and storage networking technologies.
Course Outline
This slide shows the lessons in this course.
Course Overview
SCSI and Fibre Channel Primer
Introduction to the MDS 9000 Platform
Architecture and System Components
The Multilayer SAN
System Areas and Lab Overview
Network-Based Storage Applications
Optimizing Performance
Securing the SAN Fabric
Designing SAN Extension Solutions
Building iSCSI Solutions
Cisco Certifications
Cisco Storage Networking Certification Path
Enhance Your Cisco Certifications and Validate Your Areas of Expertise
Cisco Storage Networking Specialists
Support track:
Prerequisite: Valid CCNA Certification
Recommended training through Cisco Learning Partners: MDS Configuration and Troubleshooting (MDSCT), Cisco Multiprotocol Storage Essentials (CMSE), Cisco Advanced Storage Implementation and Troubleshooting (CASI)
Design track:
Prerequisite: Valid CCNA Certification
Required exam: 642-353
Recommended training: Cisco MDS Storage Networking Fundamentals (CMSNF or CSDF), Cisco Storage Design Essentials (CSDE)
The Cisco Storage Networking Certification Program is part of the Cisco Career Certifications program. The title of Cisco Qualified Specialist (CQS) is awarded to individuals who have demonstrated significant competency in a specific technology, solution area, or job role, through the successful completion of one or more proctored exams. The CQS Storage Networking program consists of two parallel tracks:
The Cisco Storage Networking Support Specialist (CSNSS) track is for systems engineers, network engineers, and field engineers who install, configure, and troubleshoot Cisco storage networks. The Cisco Storage Networking Design Specialist (CSNDS) track is for pre-sales systems and network engineers who design Cisco storage networks. IT managers and project managers will also benefit from this certification.
Cisco provides three levels of general certifications for IT professionals, with several different tracks to meet individual needs. Cisco also provides focused certifications for designated areas such as cable communications and security. There are many paths to Cisco certification, but only one requirement: passing one or more exams demonstrating knowledge and skill. For details, go to www.cisco.com/go/certifications. CCIE certification in Storage Networking indicates expert-level knowledge of intelligent storage solutions over extended network infrastructure using multiple transport options such as Fibre Channel, iSCSI, FCIP, and FICON. Storage Networking extensions allow companies to improve disaster recovery, optimize performance, and take advantage of network services such as volume management, data replication, and enhanced integration with blade servers and storage appliances. There are no formal prerequisites for CCIE certification. Other professional certifications and/or specific training courses are not required. Instead, candidates are expected to have an in-depth understanding of the subtleties, intricacies, and challenges of end-to-end storage area networking. You are strongly encouraged to have 3-5 years of job experience before attempting certification. To obtain your CCIE, you must first pass a written qualification exam and then a corresponding hands-on lab exam.
Administrative Information
Please silence your cell phones.
Learner Introductions
Your name
Your company
Skills and knowledge
Brief history
Objective
Course Evaluations
www.fireflycom.net/evals
Please take time to complete the course evaluations after the class ends. Your feedback helps us continually improve the quality of our courses.
About Firefly
Technology Focus
Datacenter IP and Security
Content Networking and WAN Optimization
Storage Networking
Business Continuance
Optical Networking
Solutions Focus
Integrated Data Center Solutions
Core IP Services Provisioning
Multiprotocol SANs
Business Continuance: All Application Tiers
Application Optimization
Application Security
Services
Global Delivery
Curriculum Development
State-of-the-Art Remote Labs and E-Learning
Needs Assessment
Consultative Education
Cisco Multiprotocol Storage Essentials (CMSE)
Cisco Advanced Storage Implementation and Troubleshooting (CASI)
Cisco Mainframe Storage Solutions (CMSS)
Cisco Storage Design Essentials (CSDE)
Cisco Storage Design BootCamp (CSDF + CSDE)
Lesson 1
Objectives
Upon completing this lesson, you will be able to explain the fundamentals of SCSI and Fibre Channel. This includes being able to meet these objectives:
Describe SCSI technology
Describe the operations of the SCSI protocol
Explain why FC is a data transport technology that is well-suited to storage networks
Explain the fundamental design of FC flow control
Describe the two addressing schemes used on Fibre Channel networks
Describe the session establishment protocols that are performed by N_Ports and F_Ports in a fabric topology
List the standard services provided by fabric switches as defined by the FC specification
(Diagram: the SCSI client-server model: the Application Client in the initiator sends Requests to the Device Server in the target, which manages Tasks and LUNs and returns Responses.)
SCSI uses a parallel architecture in which data is sent simultaneously over multiple wires. SCSI is half-duplex: data travels in one direction at a time. On a SCSI bus, a device must assume exclusive control over the bus in order to communicate. (SCSI is sometimes referred to as a simplex channel because only one device can transmit at a time.)
(Diagram: a SCSI Initiator with an FC HBA addresses SCSI Targets; each target interface has a SCSI ID, e.g. IDs 0, 5, and 6, and presents one or more LUNs, e.g. LUN 0 through LUN 3.)
A SCSI Initiator addresses its SCSI Target using the SCSI Nexus: Bus : Target ID : LUN
Data bits are sent in parallel on separate wires. Control signals are sent on a separate set of wires. Only one device at a time can transmit; a transmitting device has exclusive use of the bus. A special circuit called a terminator must be installed at the end of the cable. The cable must be terminated to prevent unwanted electrical effects from corrupting the signal.
Parallel transmission of data bits allows more data to be sent in a given time period, but data bits may arrive early or late (skew) and lead to data errors. The fact that control signals, such as clock signals, are sent on a separate set of wires also makes synchronization more difficult. It is an inefficient way to use the available bandwidth, because only one communication session can exist at a time. Termination circuits are built into most SCSI devices, but the administrator often has to set a jumper on the device to enable termination. Incorrect cable termination can cause either a severe failure or intermittent, difficult-to-trace errors. To achieve faster data transfer rates, vendors doubled the number of data lines on the cable from 8 (narrow SCSI) to 16 (wide SCSI). Vendors have increased the clock rate, which increased the transfer rates, but this also increased the possibility of data errors due to skew or electrical interference.
SCSI was designed to support a few devices at most, so its device addressing scheme is fairly simple, and not very flexible. SCSI devices use hard addressing:
Each device has a series of jumpers that determine the device's physical address, or SCSI ID. The ID is software-configurable on some devices. Each device must have a unique SCSI ID. Before adding a device to the cable, the administrator must know the ID of every other device connected to the cable and choose a unique ID for this new device. The ID of each device determines its priority on the bus. For example, the SCSI Initiator with ID 7 always has a higher priority than the SCSI Target with ID 6. Because each device must have exclusive use of the bus while it is transmitting, ID 6 must wait until ID 7 has finished transmitting. Fixed priority makes it more difficult for administrators to control performance and quality of service.
(Diagram: the SCSI-3 architecture model: a common SCSI command set rides over interchangeable transport protocols, such as the SBP-2 port driver over IEEE-1394 (FireWire) ports, and over physical interfaces such as a parallel SCSI adapter, FC HBA, NIC, or SAS interface.)
*SCSI-3 Separation of physical interface, transport protocols, and SCSI Command Set
1. The physical interface layer consists of the hardware interconnects shown above, such as the parallel SCSI adapter, FC HBA, NIC, and SAS interface.
2. The transport protocol layer defines the protocols used for session management: SCSI-FCP is the transport protocol specification for Fibre Channel. Serial Storage Protocol (SSP) is the transport protocol used by SSA devices. Serial Bus Protocol (SBP) is the transport protocol used by IEEE-1394 devices.
3. The shared command set layer consists of command sets for accessing storage resources: SCSI Primary Commands (SPC) are common to all SCSI devices. SCSI Block Commands (SBC) are used with block-oriented devices, such as disks. SCSI Stream Commands (SSC) are used with stream-oriented devices, such as tapes. SCSI Media Changer Commands (SMC) are used to implement media changers, such as robotic tape libraries and CD-ROM carousels. SCSI Enclosure Services (SES) defines commands used to monitor and manage SCSI device enclosures like RAID Arrays, including fans, power and temperature monitoring.
4. The SCSI Common Access Method (CAM) defines the SCSI device driver application programming interface (API).
(Diagram: an FC link carries FC frames between the initiator's FC HBA and the SCSI Target interfaces; each FC port is identified by its pWWN, and each target presents its LUNs, e.g. LUN 0 through LUN 3.)
SCSI Commands, Data and Responses are carried in the payload of a frame from source to destination. In SCSI-FCP, the SCSI IDs are mapped to the unique worldwide name in each FC Port.
The SCSI Initiator and SCSI Target ports are zoned together within the Fibre Channel switch. Each device logs in to the fabric (FLOGI) and registers itself with the Name Server in the switch. The FC-HBA queries the Name Server and discovers other FC ports in the same zone as itself. The FC HBA then logs in to each Target port (PLOGI) and they exchange Fibre Channel parameters. The SCSI Initiator (SCSI-FCP driver) then logs in to the SCSI Target behind the FC Target port (PRLI) and establishes a communication channel between SCSI Initiator and SCSI Target. The SCSI Initiator commences a SCSI operation by sending a SCSI Command Descriptor Block (CDB) down to the FC HBA with instructions to send it to a specific LUN behind a Target FC port (SCSI Target). The command is carried in the payload of the FC Frame to the target FC port. The SCSI Target receives the CDB and acts upon it. Usually this would be a Read or Write command. Data is then carried in the payload of the FC Frame between SCSI Initiator and SCSI Target. Finally, when the operation is complete, the SCSI Target will send a Response back to the SCSI Initiator in the payload of a FC Frame.
SCSI Operations
SCSI specifies three phases of operation
Command: send the required command and parameters via a Command Descriptor Block (CDB)
Data: transfer data in accordance with the command
Response: receive confirmation of command execution
(Diagram: a Read operation: the Initiator sends a Command, and the Target returns Data followed by a Response. A Write operation: the Initiator sends a Command, the Target returns Xfer-Rdy, the Initiator sends Data, and the Target returns a Response.)
Phases of Operation
Every communication between SCSI Initiator and SCSI Target is formed by sequences of events called bus phases. Each phase has a purpose and is linked to other phases to execute SCSI commands and transfer data and messages back and forth. The majority of the SCSI protocol is controlled by the SCSI Initiator. The SCSI Target is usually passive and waits for a command. Only the SCSI Initiator can initiate a SCSI operation, by selecting a SCSI Target and sending a CDB (Command Descriptor Block) to it. If the CDB contains a Read command, the SCSI Target moves its heads into position and retrieves the data from its disk sectors. This data is returned to the SCSI Initiator. If the CDB contains a Write command, the SCSI Target prepares its buffers and returns an Xfer-Rdy. When the SCSI Initiator receives the Xfer-Rdy, it can commence writing data. Finally, when the operation is complete, the SCSI Target returns a Response to indicate a successful (or unsuccessful) data transfer.
10-byte CDB layout:
Byte 0: Operation Code (Group Code + Command Code)
Byte 1: Reserved / Service Action
Bytes 2-5: Logical Block Address, MSB to LSB (transfer data starting at this LBA)
Byte 6: Reserved
Bytes 7-8: Transfer Length (MSB, LSB)
Byte 9: Control
A Command is executed by the Initiator sending a CDB to a Target. In serial SCSI-3, the CDB is carried in the payload of the Command Frame
Group Code establishes the total command length. Command Code establishes the command function.
The number of Bytes of parameters (N) can be determined from the Operation Code byte which is located in byte 0 of the Command Descriptor Block (CDB). The Control Byte, which is located in the last byte of a Command Descriptor Block, contains control bits that define the behavior of the command. The Logical Block Address is an absolute address of where the first block should be written (or read) on the disk. LBA 00 is the first sector on the disk volume or LUN, LBA 01 is the second sector and so on, until we reach the last sector of the disk volume or LUN. When the CDB is sent to a block device (Disk), blocks are always 512 Bytes long. The Transfer Length contains the number of 512 Byte blocks to be transferred. When the CDB is sent to a streaming device (Tape), the block length is negotiated. The Transfer Length contains the number of blocks to be transferred.
CDBs can be different sizes (6-byte, 10-byte, 12-byte, 16-byte, etc.) to accommodate larger disk volumes or transfer lengths. 10-byte CDBs are common.
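To make the layout concrete, here is a minimal sketch (not from the course materials) that packs a 10-byte CDB in Python; the opcode value 0x28 for READ(10) is standard SCSI, and the helper name is my own.

```python
import struct

def build_read10_cdb(lba: int, block_count: int) -> bytes:
    """Pack a 10-byte READ(10) CDB using the layout shown above."""
    READ10_OPCODE = 0x28      # standard READ(10) operation code
    return struct.pack(
        ">BBIBHB",
        READ10_OPCODE,        # byte 0: Operation Code (Group + Command Code)
        0,                    # byte 1: reserved / service action
        lba,                  # bytes 2-5: Logical Block Address, MSB first
        0,                    # byte 6: reserved
        block_count,          # bytes 7-8: Transfer Length in 512-byte blocks
        0,                    # byte 9: Control
    )

# Read 8 blocks (4 KB) starting at LBA 2048
cdb = build_read10_cdb(2048, 8)
assert len(cdb) == 10
print(cdb.hex())              # 28000000080000000800
```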
SCSI Commands
SCSI supports several specific commands for each media type, and primary commands that all devices understand. The following commands are of particular interest:
REPORT LUNS: How many LUNs do you have?
INQUIRY: What device are you?
TEST UNIT READY: Is the LUN available?
READ CAPACITY: What size is each LUN?
SCSI Commands
REPORT LUNS is used by operating systems to discover the LUNs attached to a particular hardware address. It is typically sent by the Initiator to LUN 0. INQUIRY is used by the operating system to determine the capabilities of each LUN that was discovered with REPORT LUNS. TEST UNIT READY is used to check the condition of a particular LUN. READ CAPACITY is sent to each LUN in turn to obtain the size of each LUN.
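The discovery order above can be sketched in a few lines of Python; send_cdb here is a hypothetical transport callable standing in for the HBA driver, not a real API:

```python
def discover_target(send_cdb):
    """Sketch of the LUN discovery sequence described above.
    send_cdb(lun, command) is a hypothetical request/response helper."""
    luns = send_cdb(0, "REPORT LUNS")        # always sent to LUN 0 first
    for lun in luns:
        send_cdb(lun, "INQUIRY")             # what kind of device is it?
        send_cdb(lun, "TEST UNIT READY")     # is the LUN available?
        send_cdb(lun, "READ CAPACITY")       # how big is it?
    return luns
```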
1. Application makes a File I/O request to the Volume Manager
2. Volume Manager maps the volume to a SCSI ID and Target LUN
3. File System maps Files to Blocks and makes a Block I/O request
4. Command, LBA, Block count, and LUN are sent to the SCSI driver
5. SCSI driver creates the CDB (Command Descriptor Block)
6. FC driver creates a command frame with the CDB in the payload
7. FC driver sends the command frame to the Target LUN and awaits the response
(Diagram: the CDB travels in the payload of an FC frame: SOF | Header | Payload | CRC | EOF.)
(Diagram: the three FC topologies: Point-to-Point, with two devices connected directly; Arbitrated Loop, with devices sharing a loop; and Switched Fabric, with hosts and storage connected through an FC fabric.)
Arbitrated Loop provides shared bandwidth at low cost Switched Fabric provides aggregate bandwidth and scalability but requires complex FC switches, which increase the cost Most SANs today use the Switched Fabric topology
Point-to-Point
Exactly two FC ports connected together. Both devices have exclusive access to the full link bandwidth
Arbitrated Loop
Up to 126 FC ports connected together on a Private Loop (not connected to an FC Switch). Up to 127 FC ports connected together on a Public Loop (connected via an FL_Port on an FC Switch). All devices share the available bandwidth around the loop, so a practical limit might be only 20 or so devices. A device that wishes to communicate with another device must perform the following operations.
1. Arbitrate to gain control of the Loop
2. Open the port it wishes to communicate with
3. Send or receive Data frames
4. Close the port
5. Release the loop, ready for the next transfer
Usually only two devices communicate at a time; the other FC ports in the loop are passive.
When the loop is broken or a device is added or removed, the downstream FC port sends thousands of LIP primitive sequences to inform the other loop devices that the loop has been broken. The LIP (Loop Initialization Procedure) is used to assign (or re-assign) Arbitrated Loop Physical Addresses (AL_PAs) to each FC Port on the loop. This operation is disruptive, and frames may be lost during this phase. Nowadays, most users would connect FC-AL devices via an FC hub to minimize disruption.
Switched Fabric
The topology of choice for FC SANs. Each connected device has access to the full bandwidth on its link through the switch port it is connected to. The FC SAN can be expanded by adding more switches and increasing the number of ports for connected devices. The FC 24-bit addressing scheme allows for potentially 16,500,000 devices to be connected. A realistic number is a few thousand, because there can be a maximum of 239 switches in a single fabric and most switches today have a small number of ports each. Each FC switch must provide services for management of the SAN. These services include a Name Server, Domain Manager, FSPF Topology Database, Zoning Server, Time Server, etc.
(Diagram: hosts with FC HBAs and storage devices connected to multiple switches in an FC Fabric, with the switches joined by an ISL.)
A Fabric contains one or more switches, connected together via Inter Switch Links (ISLs)
Each switch is assigned a unique ID called a Domain. There can be a maximum of 239 switch domains in a fabric; however, McDATA imposes a 32-domain limit in its designs. FC Switches are connected together via Inter-Switch Links (ISLs). Each device is exclusively connected to its FC port on the switch via a bi-directional Full Duplex link. All connected devices share the same addressing space within the fabric and can potentially communicate with each other. Frames flow from device to device via one or more FC Switches. Each move from switch to switch is called a hop. McDATA imposes a 3-hop limit in its designs, Brocade imposes a 7-hop limit, and Cisco imposes a 10-hop limit.
(Diagram: nodes such as a server with an FC HBA, a tape device, and a storage array controller connect to switch ports over FC links.)
(Diagram: port types: host and storage array NL_Ports share an FC-AL hub that connects to the switch's FL_Port; N_Ports connect to F_Ports over links; E_Ports connect switches over an Inter-Switch Link; a B_Port connects to a WAN bridge.)
An N_Port (Node Port) is a port on a node that connects to a fabric. I/O adapters and array controllers contain one or more N_Ports. N_Ports can also directly connect two nodes in a point-to-point topology.
An F_Port (Fabric Port) is a port on a switch that connects to an N_Port.
An E_Port (Expansion Port) is a port on a switch that connects to an E_Port on another switch.
An FL_Port (Fabric Loop Port) is a port on a switch that connects to an arbitrated loop. Logically, an FL_Port is considered part of both the fabric and the loop. FL_Ports are always physically located on the switch. Note that FC hubs, although they obviously have physical interfaces, do not contain FC ports. Hubs are basically just passive signal splitters and amplifiers. They do not actively participate in the operation of the network. On an arbitrated loop, the node ports manage all FC operations. Not all switches support FL_Port operation. For example, some McDATA switches do not support FL_Port operation.
An NL_Port (Node Loop Port) is a port on a node that connects to another port in an arbitrated loop topology. There are two types of NL_Ports: Private NL_Ports can communicate only with other loop ports; public NL_Ports can communicate with other loop ports and with N_Ports on an attached fabric. Note that the term L_Port (Loop Port) is sometimes used to refer to any port on an arbitrated loop topology. L_Port can mean either FL_Port or NL_Port. In reality, there is no such thing as an L_Port.
Nowadays, most ports are universal: they automatically sense what they are connected to and adopt the correct port type. However, it is good practice to lock down the port type to its correct function.
(Diagram: FC frame format: SOF (4 bytes) | Header (24 bytes) | Optional Headers (0-64 bytes) | Data Payload (0-2048 bytes) | Fill (0-3 bytes) | CRC (4 bytes) | EOF (4 bytes); fields such as OX_ID live in the header.)
An FC frame consists of:
A 4-byte SOF (Start of Frame) delimiter
A 24-byte header
A data payload that can vary from 0 to 2112 bytes; typically 2048 bytes for SCSI-FCP
A 4-byte CRC (Cyclic Redundancy Check) that is used to detect bit-level errors in the header or payload
A 4-byte EOF (End of Frame) delimiter
The Header contains fields used for identifying and routing the frame across the fabric.
R_CTL: Routing Control field; defines the frame's function.
D_ID: Destination Address; the FCID of the FC Port to which the frame is being sent.
CS_CTL: Class Specific Control field; only used for Class 1 and 4.
S_ID: Source Address; the FCID of the FC Port from which the frame has come.
TYPE: the Upper Layer Protocol data type contained in the payload; this is hex 08 for SCSI-FCP.
F_CTL: Frame Control field; contains miscellaneous control information regarding the frame, including how many fill bytes there are (0-3).
SEQ_ID: Sequence ID; the unique identifying number of the Sequence within the Exchange.
DF_CTL: Data Field Control; defines the use of the Optional Headers. SCSI-FCP doesn't use Optional Headers.
SEQ_CNT: Sequence Count; the number of the frame within a sequence. The first frame is hex 00.
OX_ID: Originating Exchange ID; a unique identifying number provided by the source FC Port.
RX_ID: Responding Exchange ID; a unique identifying number provided by the destination FC Port. OX_ID and RX_ID together define the Exchange ID.
PARMS: Parameter field; usually provides a relative offset into the ULP data buffer.
The payload itself, containing data or commands, is variable and can be up to 2112 bytes. The first 64 bytes of the payload can be used to incorporate optional headers, which reduces the data payload size to 2048 bytes (2 KB). SCSI-FCP usually carries multiples of 512-byte blocks. The payload ends with 0-3 fill bytes. This is necessary because the smallest unit of data recognized by FC is a 4-byte word. However, the ULP is not aware of this FC requirement, and the data payload for a frame might not end on a word boundary. FC therefore adds up to 3 fill bytes to the end of the payload, as many as are needed to ensure that the payload ends on a word boundary.
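The fill-byte rule is a one-line calculation; a minimal sketch in Python, assuming only the 4-byte word rule described above:

```python
def fill_bytes(payload_len: int) -> int:
    """Number of fill bytes (0-3) that pad the payload to a 4-byte word."""
    return (4 - payload_len % 4) % 4

assert fill_bytes(2048) == 0   # 512-byte blocks are already word-aligned
assert fill_bytes(2045) == 3
assert fill_bytes(2046) == 2
```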
(Diagram: a SCSI Read as an FC Exchange between Initiator and Target: Sequence 1 carries the Command, Sequence 2 the Data, and Sequence 3 the Response.)
The smallest unit of data is a word. Words consist of 32 bits (4 bytes) of data that are encoded into a 40-bit form by the 8b/10b encoding process. Words are packaged into frames. An FC frame is equivalent to an IP packet. A sequence is a series of frames sent from one node to another node. Sequences are unidirectional; in other words, a sequence is a set of frames that are issued by one node. An exchange is a series of sequences sent between two nodes. The exchange is the mechanism used by two ports to identify and manage a discrete transaction. The exchange defines an entire transaction, such as a SCSI read or write request. An exchange is opened whenever a transaction is started between two ports and is closed when the transaction ends. An FC exchange is equivalent to a TCP session.
(Diagram: reactive flow control: the transmitter (Tx) keeps sending Data after the receiver's (Rx) buffers fill; the PAUSE arrives too late, and packets are lost.)
Flow control is a mechanism for ensuring that frames are sent only when there is somewhere for them to go. Just as traffic lights are used to control the flow of traffic in cities, flow control manages the data flow in an FC fabric. Some data networks, such as Ethernet, use a flow-control strategy that can result in degraded performance:
A transmitting port (Tx) can begin sending data packets at any time. When the receiving port's (Rx) buffers are completely filled and cannot accept any more packets, Rx tells Tx to stop or slow the flow of data. After Rx has processed some data and has some buffers available to accept more packets, it tells Tx to resume sending data.
This strategy results in lost packets when the receiving port is overloaded, because the receiving port tells the transmitting port to stop sending data after it has already overflowed. Lost packets must be retransmitted, which degrades performance. Performance degradation can become severe under heavy traffic loads.
Benefits:
Prevents loss of frames due to buffer overflow Maximizes link throughput and performance under high loads
Disadvantages:
Long-distance links require many more credits
(Diagram: credit-based flow control: the receiver signals READY when it has free buffers, and the transmitter (Tx) decrements its credit count for each DATA frame sent.)
The transmitting port (Tx) counts the number of free buffers at the receiving port (Rx). Before Tx can send a frame, Rx must notify Tx that Rx has a free buffer and is ready to accept a frame. When Tx receives the notification (called a credit), it increments its count of the number of free buffers at Rx. Tx only sends frames when it knows that Rx can accept them. When Tx sends a frame, it decrements the credit count. When the credit count falls to zero, Tx must stop sending frames and wait for another credit.
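The transmitter-side accounting just described can be modeled directly; a minimal sketch, with class and method names of my own choosing:

```python
class BBCreditCounter:
    """Transmitter-side view of buffer-to-buffer credit accounting."""

    def __init__(self, initial_credits: int):
        self.credits = initial_credits   # free buffers granted by Rx at login

    def can_send(self) -> bool:
        return self.credits > 0

    def on_frame_sent(self) -> None:
        if self.credits == 0:
            raise RuntimeError("must wait for a credit before sending")
        self.credits -= 1                # one Rx buffer is now in use

    def on_r_rdy(self) -> None:
        self.credits += 1                # Rx freed a buffer and sent a credit

tx = BBCreditCounter(initial_credits=2)
tx.on_frame_sent()
tx.on_frame_sent()
assert not tx.can_send()                 # credit count is zero: Tx stalls
tx.on_r_rdy()
assert tx.can_send()                     # a returned credit unblocks Tx
```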
(Diagram: end-to-end flow control spans the whole path from the source N_Port through F_Ports and E_Ports to the destination N_Port; buffer-to-buffer flow control operates per link.)
Buffer-to-buffer flow control takes place between two ports that are connected by an FC link, such as an N_Port and an F_Port, or two E_Ports, or two NL_Ports. The receiving port at the other end of the link sends a primitive signal (4 bytes) called an R_RDY (Receiver Ready) to the transmitting port. End-to-end flow control takes place between the source port and the destination port. Whenever the receiving port receives a frame, it acknowledges that frame with an ACK frame (36 bytes).
Note that buffer-to-buffer flow control is performed between E_Ports in the fabric, but it is not performed between the incoming and outgoing ports in a given switch. In other words, FC buffer-to-buffer flow control is not used between two F_Ports or between an F_Port and an E_Port within a switch. FC standards do not define how switches route frames across the switch. Buffer-to-buffer flow control is used in the following situations:
Class 1 connection request frames use buffer-to-buffer flow control, but Class 1 data traffic uses only end-to-end flow control. Class 2 and Class 3 frames always use buffer-to-buffer flow control. Class F service uses buffer-to-buffer flow control. In an Arbitrated Loop, every communication session is a virtual dedicated point-to-point circuit between a source port and destination port. Therefore, there is little difference between buffer-to-buffer and end-to-end flow control. Buffer-to-buffer flow control alone is generally sufficient for arbitrated loop topologies.
Classes 1, 2, 4, and 6 use end-to-end flow control. Class 2 service uses both buffer-to-buffer and end-to-end flow control.
(Diagram: R_RDY primitives flow back across each link for buffer-to-buffer flow control, while an ACK from the destination N_Port provides end-to-end flow control back to the source N_Port.)
End-to-end flow control involves only the port at which a frame originates and the ultimate destination port, regardless of how many FC switches are in the data path. When end-to-end flow control is used, the transmitting port is responsible for ensuring that all frames are delivered. Only when the transmitting N_Port receives the last ACK frame in response to a sequence of frames sent does it know that all frames have been delivered correctly, and only then will it empty its ULP data buffers. If a returning ACK indicates that the receiving port has detected an error, the transmitting N_Port has access to the ULP data buffers and can resend all of the frames in the sequence.
(Diagram: a frame being serialized onto a 10 km link between the Initiator N_Port and Target N_Port; serialization takes about 10 µs.)
Example
This diagram and the following two diagrams illustrate how the required number of BB_Credits is calculated for a 10 km, 2 Gbps FC link:
At a link rate of 2.125 Gbps, the time required to serialize (transmit) each byte is 4.7 ns. (Note that each byte is 10 bits due to 8b/10b encoding.) The maximum SCSI-FCP Fibre Channel payload size is 2048 bytes, because SCSI usually transfers multiple SCSI blocks of 512 bytes each. The payload size used in an actual customer environment would be based on the I/O characteristics of the customer's applications. You also need to account for the frame overheads: the SOF (Start of Frame), 4 bytes; the FC header, 24 bytes; the CRC, 4 bytes; the EOF (End of Frame), 4 bytes; plus the IDLEs between frames, usually 6 IDLEs or 24 bytes. This gives a total of 2108 bytes. The total serialization time at 2 Gbps for a 2108-byte frame (including idles) is 9.9 µs, or approximately 10 µs.
Processing time: assume the same as the de-serialization time, 10 µs.
Response time: time to transmit the R_RDY back across the 10 km link, 50 µs.
(Diagram: the frame crosses the 10 km link while the Initiator N_Port waits for the returning R_RDY.)
The speed of light in a fibre optic cable is approximately 5 ns per metre, or 5 µs per kilometre, so each frame will take about 50 µs to travel across the 10 km link. The receiving port must then process the frame, free a buffer, and generate an R_RDY. This processing time can vary; for example, if the receiver ULP driver is busy, the frame might not be processed immediately. In this case, we can assume that the receiving port will process the frame immediately, so the processing time is equal to the time it takes to de-serialize the frame. Assume that the de-serialization time is equal to the serialization time: 10 µs. The receiving port then transmits a credit (R_RDY) back across the link. This response takes another 50 µs to reach the transmitter. The total latency on the link is equal to the frame serialization time plus the round-trip time across the link, or about 110 µs.
A good rule of thumb: at 2 Gbps with a 2 KB payload, you need approximately 1 credit per km.
(Diagram: with the link kept full, five frames are in flight on the 10 km link between the Initiator N_Port and Target N_Port while their R_RDYs return on the other fibre.)
Given a frame serialization time of 10 µs and a total round-trip latency of 110 µs, there could be up to 5 frames on the link at one time, plus one being received and processed by the receiving port. In addition, 5 credits are being returned to the transmitting port down the other side of the link. In other words, ignoring the de-serialization time, approximately 10 buffer-to-buffer credits are required to make full use of the bandwidth of the 10 km link at 2 Gbps with 2 KB frames.
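The arithmetic in this example is easy to reproduce; a sketch under the same assumptions (2108 bytes on the wire per frame including overheads and idles, 10 bits per encoded byte, 5 µs per km propagation delay):

```python
def bb_credits_to_fill_link(link_km: float, gbps: float = 2.125,
                            wire_bytes: int = 2108) -> float:
    """Round-trip time divided by the frame serialization time."""
    serialize_us = wire_bytes * 10 / (gbps * 1000)   # 8b/10b: 10 bits/byte
    round_trip_us = 2 * link_km * 5.0                # 5 us per km, each way
    return round_trip_us / serialize_us

print(f"{bb_credits_to_fill_link(10):.1f}")  # ~10 credits: 1 per km at 2 Gbps
```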
Example WWNs from a dual-ported device:
nWWN:   20:00:00:45:68:01:EF:25
pWWN A: 21:00:00:45:68:01:EF:25
pWWN B: 22:00:00:45:68:01:EF:25
World-Wide Names
WWNs are unique identifiers that are hard-coded into FC devices. Every FC port has at least one WWN. Vendors buy blocks of WWNs from the IEEE and allocate them to devices in the factory. WWNs are important for enabling fabric services because they are globally unique and permanently assigned to the device.
These characteristics ensure that the fabric can reliably identify and locate devices, which is an important consideration for fabric services. When a management service or application needs to quickly locate a specific device: 1. The service or application queries the switch Name Server service with the WWN of the target device 2. The Name Server looks up and returns the current port address (FCID) that is associated with the target WWN 3. The service or application communicates with the target device using the port address
nWWNs uniquely identify devices (Nodes). Every host bus adaptor (HBA), array controller, switch, gateway, and FC disk drive has a single unique WWNN. pWWNs uniquely identify each port in a device. A dual-ported HBA has three WWNs: one nWWN and a pWWN for each port.
nWWNs and pWWNs are both needed because devices can have multiple ports. On single-ported devices, the nWWN and pWWN are usually the same. On multi-ported devices, however, the pWWN is used to uniquely identify each port. Ports must be uniquely identifiable because each port participates in a unique data path. nWWNs are required because the node itself must sometimes be uniquely identified. For example, path failover and multiplexing software can detect redundant paths to a device by observing that the same WWNN is associated with multiple pWWNs. Cisco MDS switches use the acronyms nWWN for the node WWN (also written WWNN) and pWWN for the port WWN (also written WWPN).
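The example dual-ported device above shows the convention: all three WWNs share the same vendor-assigned portion and differ only in the leading octet. A small illustrative sketch (the node/port interpretation of the leading octet is inferred from that one example, so treat it as illustrative rather than a general rule):

```python
def wwn_parts(wwn: str):
    """Split a colon-separated WWN into its leading octet and the rest."""
    octets = wwn.split(":")
    return octets[0], ":".join(octets[1:])

wwns = ["20:00:00:45:68:01:EF:25",   # nWWN
        "21:00:00:45:68:01:EF:25",   # pWWN A
        "22:00:00:45:68:01:EF:25"]   # pWWN B

# All three share the same trailing portion, so they belong to one node
assert len({wwn_parts(w)[1] for w in wwns}) == 1
for w in wwns:
    print(wwn_parts(w))
```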
(Diagram: the 24-bit fabric address (FCID) is divided into three 8-bit fields: Domain, Area, and Port.)
Each FC switch in the fabric is assigned a unique Domain ID from 1 to 239 (except McDATA switches, which assign only domains 97 to 127). Traditional FC switches assign the Area ID based upon the physical port on the switch that the device is connected to. For example, a device connected to port 3 on the switch will receive an Area ID of hex 03; the FCID is therefore tied to the physical port on the switch. The Port ID is usually hex 00 for an N_Port, or the AL_PA (Arbitrated Loop Physical Address) for an NL_Port. This means that every N_Port connected to the switch is reserved an entire area of 256 addresses, although it will only use 00. This is a wasteful use of addresses and one of the reasons why Fibre Channel cannot support the full 16.5 million addresses. The Cisco MDS does not tie the Area to the physical port on the switch, but assigns the FCID logically in sequence, starting with an area of 00. The latest HBAs support Flat Addressing, and the Cisco MDS will combine the Area and Port fields together as a 16-bit Port ID field. Each device is assigned an FCID in sequential order starting at 0000, 0001, etc. Legacy devices will be assigned a fixed Port ID of 00 per Area as defined above.
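Because the FCID splits on byte boundaries, extracting the Domain, Area, and Port fields is simple bit arithmetic; a minimal sketch:

```python
def parse_fcid(fcid: int):
    """Split a 24-bit FCID into its Domain, Area, and Port fields."""
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF

# An N_Port on switch domain 0x0A, area 0x03 (port field 0x00)
print(parse_fcid(0x0A0300))   # (10, 3, 0)
```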
Public NL_Ports are assigned a full 24-bit fabric address when they log into the fabric. There are 126 AL_PA addresses available to NL_Ports in an arbitrated loop; the AL_PA 0x00 is reserved for the FL_Port (which is logically part of both the fabric and the loop). The Domain and Area fields are identical to those of the FL_Port to which the loop is connected.
Private NL_Ports can communicate with each other based upon the AL_PA, which is assigned to each port during loop initialization. Private NL_ports are not assigned a 24-bit fabric address, and the Domain and Area segments are not used.
Fabric Login
FCIDs are dynamically assigned to each device FC Port by the switch during the Fabric Login (FLOGI) process. Each device will register (PLOGI) with the switch's Name Server. Initiators will query the Name Server for available targets, then send PLOGI to the target to exchange FC parameters. The Initiator will log in to each Target using Process Login (PRLI) to establish a channel of communication between them (Image Pair).
(Diagram: the login sequence: the Initiator Node and Target Node each send FLOGI to their F_Ports and PLOGI to the Name Server; the Initiator then sends PLOGI and PRLI to the Target N_Port, establishing an image pair between Process A and Process B.)
Before an N_Port can begin exchanging data with other N_Ports, three processes must occur:
The N_Port must log in to its attached F_Port. This process is known as Fabric Login (FLOGI). During FLOGI, both ports exchange Fibre Channel common parameters, i.e., buffer credits, buffer size, classes of service supported, etc. The Initiator N_Port must then log in to its target N_Port. This process is known as Port Login (PLOGI). This time the initiator and target ports exchange Fibre Channel common parameters as before. If one port supports 2 KB buffers but the other only supports 1 KB buffers, they will negotiate down to the lowest common value, i.e., 1 KB buffers. Finally, the Initiator N_Port must exchange information about ULP support with its target N_Port to ensure that the initiator and target process can communicate. This process is known as Process Login (PRLI). The parameters exchanged are specific to the Upper Layer Protocol (ULP). For instance, one port will state that it is an Initiator, and the other must say that it is a Target. If both ports are Initiators, the PRLI is rejected.
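The strict FLOGI, then PLOGI, then PRLI ordering can be sketched from the initiator's side; fabric.exchange and fabric.name_server_lookup are hypothetical helpers, not a real API, and 0xFFFFFE is the well-known address of the Fabric Login Server:

```python
def initiator_login(fabric, target_pwwn):
    """Sketch of the login order described above; fabric.exchange and
    fabric.name_server_lookup are hypothetical helpers."""
    # 1. Fabric Login: obtain an FCID and exchange link-level parameters
    my_fcid = fabric.exchange("FLOGI", to=0xFFFFFE)
    # 2. Port Login: exchange common parameters with the target port
    #    (mismatched values negotiate down, e.g. 2 KB vs 1 KB buffers -> 1 KB)
    target_fcid = fabric.name_server_lookup(target_pwwn)
    fabric.exchange("PLOGI", to=target_fcid)
    # 3. Process Login: establish the initiator/target image pair
    fabric.exchange("PRLI", to=target_fcid)
    return my_fcid, target_fcid
```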
Analyzer screenshot showing the contents of a FLOGI frame sent to the Fabric Login Server in the FC switch.
Notice that at this time the N_Port does not yet have an address (00.00.00). Notice also that the World Wide Port Name is the same as the World Wide Node Name. This is common in single-ported nodes. The N_Port does not support Class 1, but it does support Classes 2 and 3. The N_Port supports the Alternate Buffer Credit Management Method and can guarantee 2 BB_Credits at its receiver port. You can see that this is a single-frame Class 3 sequence because the Start of Frame is SOFi3 and the End of Frame is EOFt, meaning that this initial first frame is also the last one in the sequence.
46
Link failure
FC
SCR SCR
FC
HBA
LS_ACC LS_ACC
SCR SCR
LS_ACC LS_ACC
Host
2006 Cisco Systems, Inc. All rights reserved.
Storage
36
RSCNs are generated when a fabric state change occurs, for example when:
A node port is added to or removed from the fabric
Inter-switch links (ISLs) are added to or removed from the fabric
A membership change occurs in a zone
Nodes register for notification by sending a State Change Registration (SCR) frame to the Fabric Controller. The Fabric Controller transmits RSCN commands to registered nodes when a fabric state change event occurs. RSCNs are transmitted as unicast frames because multicast is an optional service and is not supported by many switches. Only nodes that might be affected by the state change are notified. For example, if the state change occurs within Zone A, and Port X is not part of Zone A, then Port X will not receive an RSCN. Nodes respond to the RSCN with an LS_ACC frame.
The RSCN message identifies the ports that were affected by the state change event, and it identifies the general nature of the event. After receiving an RSCN, the node can then use additional Link Services commands to obtain more information about the event. For example, if the RSCN specifies that the status of Port Y has changed, the nodes that receive the RSCN can attempt to verify the current (new) state of Port Y by querying the Name Server. The Fabric Controller will generate RSCNs in the following circumstances:
A fabric login (FLOGI) from an Nx_Port.
A change to the path between two Nx_Ports (e.g., a change to the fabric routing tables that affects the ability of the fabric to deliver frames in order, or an E_Port initialization or failure).
An implicit fabric logout of an Nx_Port, including implicit logout resulting from loss-of-signal, link failure, or the fabric receiving a FLOGI from a port that had already completed FLOGI.
Any other fabric-detected state change of an Nx_Port.
Loop initialization of an L_Port, when the L_bit was set in the LISA Sequence.
An Nx_Port can also issue a request to the Fabric Controller to generate an RSCN. For example, if one port in a multi-ported node fails, another port in that node can send an RSCN to notify the fabric about the failure.
(Diagram: fabric services, including the Domain Manager, Name Server, Alias Server, Key Server, and Time Server, sit above the FC layer stack: FC-4 ULP Mapping, FC-3 Generic Services, FC-2 Framing and Flow Control, FC-1 Encoding, FC-0 Physical Interface.)
The FC-SW-2 specification defines several services that are required for fabric management. These services include:
Name Server Login Server Address Manager Alias Server Fabric Controller Management Server Key Distribution Server Time Server
The FC-SW-2 specification does not require that switches implement all of these services; some services can be implemented as an external server function. However, the services discussed in this lesson are typically implemented in the switch, as in Cisco MDS 9000 Family Switches.
Domain Manager
Fabric Configuration, Principal Switch Selection, Domain ID Allocation
FC_ID Allocation
The Domain Manager is responsible for:
Allocating domain IDs (requesting a domain ID, and assigning domain IDs to other switches if this switch is the Principal Switch)
Allocating port addresses (FC_IDs)
Participating in the Principal Switch selection process
Performing the Fabric Build and Reconfiguration processes when the topology changes
The Domain Manager supports the Fabric Port Login Server, which is the service that N_Ports use when logging in to the fabric. When an N_Port logs in to the fabric, it sends a FLOGI command to the Login Server. The Login Server then requests an FC_ID from the Domain Manager and assigns the FC_ID to the N_Port in its ACC reply to the FLOGI request. The preceding diagram shows how the Domain Manager interacts with other fabric services:
The VSAN Manager provides the Domain Manager with VSAN configuration and status information. The WWN Manager tells the Domain Manager what WWN is assigned to the VSAN. The Port Manager provides the Domain Manager with information about the fabric topology (a list of E_Ports) and notifies the Domain Manager about E_Port state changes. The Login Server receives N_Port requests for FC_IDs during FLOGI. The Domain Manager interacts with management services to allow administrators to view and modify Domain Manager parameters.
Supports soft zoning
Provides information only about nodes in the requestor's zone
Distributed Name Server (dNS) resides in each switch
Responsible for entries associated with that switch's domain
Maintains local data copies and updates via RSCNs
Sends RSCNs to the fabric when a local change occurs
The Name Server database stores: FC_IDs; WWPNs and WWNNs; and FC operating parameters, such as supported ULPs and Classes of Service.
The Name Server:
Supports soft zoning by performing WWN lookups to verify zone membership
Enforces zoning by only providing information about nodes in the requestor's zone
Is used by management applications that need to obtain information about the fabric
Each switch in a fabric contains its own resident name server, called a distributed Name Server (dNS). Each dNS within a switch is responsible for the name entries associated with the domain assigned to the switch. The dNS instances synchronize their databases using the RSCN process. When a client Nx_Port wants to query the Name Service, it submits a request to its local dNS via the well-known address for the Name Server. If the required information is not available locally, the dNS within the local switch responds to the request by making any necessary requests of the other dNS instances contained in the other switches. The communication between switches that is performed to acquire the requested information is transparent to the original requesting client. Partial responses to dNS queries are allowed. If an entry switch sends a partial response back to an Nx_Port, it must set the partial response bit in the CT header.
Get Object: This request is used to query the Name Server Register Object: Only one object at a time can be registered with the Name Server. A Client registers information in the Name Server database by sending a registration request containing a Port Identifier or Node Name. Deregister Object: Only one global deregistration request is defined for the Name Server.
Name Server information is available, upon request, to other nodes, subject to zoning restrictions. If zones exist within the fabric, the Name Server restricts access to information in the Name Server database based on the zone configuration. When a port logs out of a fabric, the Name Server deregisters all objects associated with that port.
The Fabric Configuration Service (FCS) supports configuration management of the fabric. This service allows applications to discover the topology and attributes of the fabric. The Zone Service provides zone information for the fabric, either to management applications or directly to clients. The Unzoned Name Service provides access to information about the fabric without regard to zones. This service allows management applications to see all the devices on the entire fabric.
Well-Known Addresses
Well-known addresses are reserved addresses for FC Services at the top of the 24-bit fabric address space:
FFFFFF  Broadcast Alias               Mandatory
FFFFFE  Fabric Login Server           Mandatory
FFFFFD  Fabric Controller             Mandatory
FFFFFC  Name Server                   Optional
FFFFFB  Time Server                   Optional
FFFFFA  Management Server             Optional
FFFFF9  QoS Facilitator               Optional
FFFFF8  Alias Server                  Optional
FFFFF7  Key Distribution Server       Optional
FFFFF6  Clock Synchronization Server  Optional
FFFFF5  Multicast Server              Optional
FFFFF4-FFFFF0  Reserved
Well-Known Addresses
Well-known Addresses allow devices to reliably access switch services. All services are addressed in the same way as an N_Port is addressed. Nodes communicate with services by sending and receiving Extended Link Services commands (frames) to and from Well-Known Addresses Well-known addresses are the highest 16 addresses in the 24-bit fabric address space:
FFFFFF - Broadcast Alias FFFFFE - Fabric Login Server FFFFFD - Fabric Controller FFFFFC - Name Server FFFFFB - Time Server FFFFFA - Management Server FFFFF9 - Quality of Service Facilitator FFFFF8 - Alias Server FFFFF7 - Key Distribution Server FFFFF6 - Clock Synchronization Server FFFFF5 - Multicast Server FFFFF4FFFFF0 - Reserved
The first three services are mandatory in all FC switches; however, all FC switches today implement the first six services by default for ease of management.
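Because the addresses are fixed, a node can resolve a service with a simple lookup; a sketch using the values listed above:

```python
WELL_KNOWN_ADDRESSES = {
    0xFFFFFF: "Broadcast Alias",
    0xFFFFFE: "Fabric Login Server",
    0xFFFFFD: "Fabric Controller",
    0xFFFFFC: "Name Server",
    0xFFFFFB: "Time Server",
    0xFFFFFA: "Management Server",
}

def is_well_known(fcid: int) -> bool:
    """Well-known addresses occupy the top 16 slots of the 24-bit space."""
    return 0xFFFFF0 <= fcid <= 0xFFFFFF

print(WELL_KNOWN_ADDRESSES[0xFFFFFC])   # Name Server
print(is_well_known(0x0A0300))          # False: an ordinary N_Port FCID
```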
Lesson 2
Objectives
Upon completing this lesson, you will be able to identify the components of an MDS 9000 storage networking solution. This includes being able to meet these objectives:
Identify the hardware components of the MDS 9000 platform
Explain supported airflow and power configurations
Explain the MDS 9000 licensing model
(Diagram: the MDS 9000 Family: 16-port and 32-port FC line cards and the SSM virtualization module, managed by Device and Fabric Manager, FM Server, Performance Manager, and Traffic Analyzer, all running the MDS 9000 Family Operating System.)
Multilayer switches are switching platforms with multiple layers of intelligent features, such as:
Ultra High Availability
Scalable Architecture
Comprehensive Security Features
Ease of Management
Advanced Diagnostics and Troubleshooting Capabilities
Seamless Integration of Multiple Technologies
Multi-protocol Support
Multilayer switches also offer a scalable architecture with highly available hardware and software. Based on the MDS 9000 Family Operating System and a comprehensive management platform called Cisco Fabric Manager, the MDS 9000 Family offers a variety of application line card modules and a scalable architecture from an entry-level fabric switch to director-class systems. The Cisco MDS 9000 Family offers industry-leading investment protection across a comprehensive product line. The 9020 is a new low-cost 20-port FC switch providing 1/2/4 Gb/s at full line rate. This model currently has a single power supply, four fans, and front-to-rear airflow. It features nondisruptive software upgrades and management via the CLI or FM/DM. The current release, 2.1.2, does not support VSANs, but this is planned for release 3.0.0.
In April 2006, Cisco introduced the MDS 9513 Multilayer Director and second generation linecards. The 9513 Multilayer Director Switch is a new 13-slot chassis with two Supervisor-2 slots. (Note that the Supervisor-1 is not compatible with this chassis.) The new 12-port, 24-port, and 48-port FC linecards support this architecture, are forward and backward compatible with the existing architecture, and provide 1/2/4 Gb/s using new 4 Gb/s SFPs. The 4-port 10 Gb/s FC linecards are also forward and backward compatible with the existing architecture and provide 4x 10 Gb/s FC ports at full line rate using X2 GBICs.
The maximum number of available FC ports provides industry-leading port density per rack, with up to 768 ports in a single seven-foot (42 RU) rack, optimizing the use of valuable data center floor space. Additionally, cable management is facilitated by the single-side position of both interface and power terminations.
Redundant high-performance Crossbar Fabric Modules
Redundant 6000W AC power supplies, with room-to-grow power for future application modules
Revised airflow: bottom to top at the rear of the chassis, with front and rear fan trays
MDS 9216i: the 14+2-port switch expands from 14 to 62 FC ports, plus 2 GigE ports for FCIP and iSCSI, and fully supports any of the new linecards.
MDS 9020: low-cost 4 Gbps 20-port FC switch with a free FMS license.
Common Architecture: Ease of Migration and Investment Protection
The Supervisor-2, 12-port, 24-port, and 48-port FC modules, MPS 14+2, 4-port 10 Gb/s FC module, and SSM are all forward/backward compatible across the MDS 9506, 9509, and 9513.
Current generation: architectural support for up to 256 indexes; max planned system density of 240 ports; 1/2 Gb/s FC interfaces.
New generation: architectural support for up to 1,024 indexes; max planned system density of 528 ports; 1/2/4 Gb/s and 10 Gb/s FC interfaces.
All first generation and second generation modules are forwards and backwards compatible. The first generation has architectural support for up to 256 indexes (destination ports), and the max planned system density is 240 ports (using the MDS 9509), although in practice it is 224 using 7x 32-port linecards. However, using a mix of current and second generation modules, it is possible to increase this to 252 ports. Each supervisor module consumes two indexes, so a total of 4 indexes are used by supervisors on MDS 9500 switches. It is worth noting that each Gigabit interface uses 4 indexes, so an IPS-8 would consume 32 indexes and a 14+2 would consume 22 indexes from the pool. The second generation platform has architectural support for up to 1024 indexes, and the max planned system density is currently 528 ports using 11x 48-port cards. However, if any one of the current linecards is inserted into the 9513 chassis, the maximum number of indexes is reduced to 252. The 9513 chassis must use only the Supervisor-2 module; however, both Supervisor-1 and Supervisor-2 cards may be used in the current generation 9506 and 9509.
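The index budget can be checked with the per-module costs stated above (1 index per FC port, 4 per Gigabit Ethernet interface, 2 per supervisor module); a minimal sketch:

```python
def indexes_used(fc_ports: int = 0, gige_ports: int = 0,
                 supervisors: int = 0) -> int:
    """Index cost per the text: 1 per FC port, 4 per GigE port,
    2 per supervisor module."""
    return fc_ports + 4 * gige_ports + 2 * supervisors

print(indexes_used(fc_ports=14, gige_ports=2))        # MPS 14+2 module: 22
print(indexes_used(gige_ports=8))                     # IPS-8 module: 32
# MDS 9509 with dual supervisors and 7x 32-port linecards
print(indexes_used(fc_ports=7 * 32, supervisors=2))   # 228 of 256 indexes
```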
(Diagram: airflow through the MDS 9216 chassis.)
The hot-swappable front-mounted fan tray allows easy installation and removal. Sensors monitor system temperature, and a temperature rise or fan failure generates an event. It is recommended to replace the fan tray at the earliest opportunity: replace it within 3 minutes or a critical warning is issued, and the system shuts down 2 minutes later if the fan tray has still not been replaced.
Power Management
MDS switches have dual power supplies* that are hot-swappable for easy installation and removal. Power supply modes:
Redundant mode (default): power capacity is that of the lower-capacity supply, so sufficient power will be available in case of a PSU failure.
Combined mode (non-redundant): twice the power capacity of the lower-capacity supply; sufficient power may not be available in case of a power supply failure. Only modules with sufficient power are powered up. Power is reserved for the Supervisors and fan assemblies; after the supervisors, modules are powered up starting at slot 1.
* The MDS 9020 has a single integral power supply.
Power Management
Power supplies are configured in redundant mode by default, but they can also be configured in a combined (non-redundant) mode. In redundant mode, the chassis uses the power capacity of the lower-capacity power supply, so that sufficient power is available in case of a single power supply failure. In combined mode, the chassis uses twice the power capacity of the lower-capacity power supply. Sufficient power may not be available in case of a power supply failure in this mode. If there is a power supply failure and the real power requirements for the chassis exceed the power capacity of the remaining power supply, the entire system will be reset automatically to prevent permanent damage to the power supply. In either mode, power is reserved for the Supervisor and fan assemblies. Each supervisor module has roughly 220 watts in reserve, even if there is only one installed, and the fan module has 210 watts in reserve. In the case of insufficient power, after supervisors and fans are powered, line card modules are given power from the top of the chassis down. After the reboot, only those modules that have sufficient power will be powered up. If the real power requirements do not trigger an automatic reset, no module will be powered down; instead, no new module will be powered up. In all cases of power supply failure, removal, etc., a syslog message is printed, a call home message is sent if configured, and an SNMP trap is sent.
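The difference between the two modes is just how the available capacity is computed; a minimal sketch using the rules stated above:

```python
def available_power(psu_a_watts: int, psu_b_watts: int,
                    redundant: bool = True) -> int:
    """Redundant mode budgets to the lower-capacity supply; combined
    mode budgets to twice the lower-capacity supply."""
    lower = min(psu_a_watts, psu_b_watts)
    return lower if redundant else 2 * lower

# Mixed 2500 W and 3000 W supplies in an MDS 9509 chassis
print(available_power(2500, 3000))                   # 2500 (redundant mode)
print(available_power(2500, 3000, redundant=False))  # 5000 (combined mode)
```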
MDS 9100: removable power supplies at the rear of the chassis; 300W AC supply (300W @ 100-240 VAC).
MDS 9216: removable power supplies at the rear of the chassis; 845W AC supply (845W @ 100-240 VAC).
MDS 9509: removable power supplies at the front of the chassis; 2500W AC supply (2500W @ 200-240 VAC, 1300W @ 100-120 VAC); new 3000W AC supply.
MDS 9513: removable power supplies at the rear of the chassis; 6000W AC supply (6000W @ 200-240 VAC).
Enforced software licensing started with SAN-OS 1.3. It includes the standard license package (free) plus five additional license packages:
Enterprise package
SAN Extension over IP (FCIP)
Mainframe (FICON)
Fabric Manager Server (FMS)
Storage Services Enabler (SSE)
Standard package features:
Fibre Channel and iSCSI
iSCSI Server Load Balancing
VSANs and Zoning
PortChannels
FCC and Virtual Output Queuing
Diagnostics (SPAN, RSPAN, etc.)
Fabric Manager and Device Manager
SNMPv3, SSH, SSL, SFTP
SMI-S 1.10 and FDMI
Role-based access control
RADIUS and TACACS+, MS CHAP
RMON, Syslog, Call Home
Brocade native interop modes 2 and 3
McDATA native interop mode 4
NPIV (N_Port ID Virtualization)
IVR over FCIP
Command Scheduler
IPv6 (management and IP services)
The Cisco MDS 9000 Family SAN-OS is the underlying system software that powers the award-winning Cisco MDS 9000 Family Multilayer Switches. SAN-OS is designed for storage area networks (SANs) in the best traditions of Cisco IOS Software to create a strategic SAN platform of superior reliability, performance, scalability, and features. In addition to providing all the features that the market expects of a storage network switch, the SAN-OS provides many unique features that help the Cisco MDS 9000 Family to deliver low total cost of ownership (TCO) and a quick return on investment (ROI).
Enterprise Package: adds a set of advanced features that are recommended for all enterprise SANs.
SAN Extension over IP Package: enables FCIP for IP Storage Services and allows the customer to use the IP Storage Services to extend SANs over IP networks.
Mainframe Package: adds support for the FICON protocol. FICON VSAN support is provided to help ensure that there is true hardware-based separation of FICON and open systems. Switch cascading, fabric binding, and intermixing are also included in this package.
Fabric Manager Server Package: extends Cisco Fabric Manager by providing historical performance monitoring for network traffic hotspot analysis, centralized management services, and advanced application integration for greater management efficiency.
Storage Services Enabler Package: enables network-hosted storage applications to run on the Cisco MDS 9000 Family Storage Services Module (SSM). A Storage Services Enabler package must be installed on each SSM.
High availability: non-disruptive installation; no single point of failure; 120-day grace period for enforcement.
Ease of use: seamless electronic licenses; no separate software images for licensed features; licenses installed on the switch at the factory; automated license key installation; a centralized License Management Console provides a single point of license management for all switches.
SAN consolidation: VSANs, PortChannels, security, traffic engineering, QoS, FCC.
Multiprotocol: FC, FICON, FCIP, iSCSI.
The Cisco MDS 9000 Series is the first multilayer intelligent storage platform.
High-availability infrastructure: redundant power and cooling, redundant supervisor modules with stateful failover, hot-swap modules, and non-disruptive software upgrades for the MDS 9500 platform give you 99.999% availability.
Multiprotocol: iSCSI enables integration of mid-range servers into the SAN, FICON enables integration of mainframe systems with complete isolation of FICON and FC ports, and FCIP enables cost-effective DR solutions.
SAN consolidation: intelligent infrastructure services like virtual SANs (VSANs), PortChannels, per-VSAN FSPF routing, QoS, FCC, and robust security enable stable, scalable, and secure enterprise SAN consolidation.
Intelligent storage services: network-based services for resource virtualization, volume management, data mobility, and replication lower TCO and increase ROI.
Enterprise-class management: integrated device, fabric, and performance management improves management productivity and easily integrates with existing enterprise management frameworks like IBM Tivoli and HP OpenView.
Question: How do we build a 3000 port fabric? Answer: Using six MDS 9513 directors. The MDS 9513 has the largest port capacity (528 ports) of any Fibre Channel switch or director in the market today.
Lesson 3
Objectives
Upon completing this lesson, you will be able to describe the hardware architecture and components of the MDS 9000 Family of switches. This includes being able to meet these objectives:
Describe the system architecture of the MDS 9000 platform
Explain how to design fabrics using full-rate and oversubscribed line cards
Explain how buffer credits are allocated on MDS 9000 line card modules
System Architecture
MDS Integrated Crossbar
Investment protection: ability to support new line cards; multiprotocol support in one system.
Centralized crossbar switch architecture: a highly scalable system with aggregate bandwidth up to 2.2 Tbps across the external interfaces.
The MDS 9506 and 9509 use an aggregate 720-Gbps multiprotocol crossbar per supervisor module, while the MDS 9513 has new Crossbar Fabric modules, located at the rear of the chassis, that provide a total aggregate bandwidth of 2.2 Tbps. All MDS chassis can operate on a single crossbar at full bandwidth on all attached ports without blocking. A technique called Virtual Output Queuing (VOQ) is deployed for optimal crossbar performance; VOQ resolves head-of-line blocking issues for continuous data flow.
[Diagram: MDS 9500 internal architecture. Active and standby supervisor modules (microprocessor, flash card, Ethernet and console interfaces) connect through the crossbars to line cards, each built from MAC (M), forwarding (F), virtualization (V), and virtual output queue (Q) ASICs behind an up interface (I/F).]
The diagram includes:
Dual supervisor modules (Supervisor-1 or Supervisor-2) containing the crossbar, microprocessor, flash memory, and console and Ethernet interfaces
An FC line card capable of supporting the Fibre Channel and FICON protocols; examples are the 16-port and 32-port line cards
An IP Services line card capable of supporting IP storage services and protocols like FCIP and iSCSI
An MPS 14+2 line card with 14 FC ports and two GigE ports, supporting iSCSI and three FCIP tunnels per GigE port
An SSM line card, capable of performing virtualization services, snapshots, replication and SCSI 3rd Party copy services to support NASB (Network Assisted Serverless Backup)
Frames arriving at an interface are decoded, conditioned, possibly virtualized, and passed to the forwarding ASIC (F), then stored in the appropriate virtual output queue (Q) until the arbiter (A) decides that a credit is available at the destination port and that the frame can continue its journey. The frame leaves the VOQ, passes through the up interface (I/F), crosses one of the crossbars, and goes down to the destination line card and straight out of the appropriate interface. Notice that all line cards have an identical architecture from the F ASIC up, so all frames crossing the crossbar have already been conditioned and processed and have an identical structure, regardless of their underlying protocol: FC, FICON, iSCSI, or FCIP.

The internal architecture of the MDS 9216 is very similar to that of the MDS 9500 in that many of the same internal components are used. There are, however, several key differences:
The MDS 9216 has a fixed, non-redundant supervisor card that provides arbitration and supports a single modular card, which can be of any type. The MDS 9216 does not use a crossbar fabric; in a two-slot design there is no need for, or advantage to, a switching crossbar. Instead, the two cards are connected to each other through the high-speed backplane, and to themselves through an internal loopback interface.
[Diagram: MDS 9506/9509 switching fabric. Each line card has 80 Gbps of bandwidth toward the supervisors, connecting to each 720-Gbps supervisor crossbar over dual 20-Gbps paths.]
Each supervisor module has an onboard 720-Gbps crossbar: 360 Gbps transmit (Tx) and 360 Gbps receive (Rx). Therefore, in a dual-supervisor installation, the MDS 9000 system has an aggregate total bandwidth of up to 1.44 Tbps. Each installed line card in a dual-supervisor configuration has 80 Gbps of bandwidth available to the supervisor crossbars: each card connects through dual 20-Gbps paths to each supervisor crossbar, with each path carrying 20 Gbps in each direction. Data is load-shared across both crossbars when dual supervisor modules are installed. Both crossbars are active-active, and frames from a line card travel across one crossbar or the other. The arbiter function schedules frame delivery at over 1 billion frames per second and routes frames over either crossbar.
[Diagram: MDS 9513 switching fabric. Eleven line card slots (80-Gbps cards plus a 100-Gbps Generation-2 card) connect to the rear Crossbar Fabric modules over 25-Gbps channels.]
[Diagram: Generation-2 line card data paths. A 48-port OSM line card, a 12-port FRM line card, and a 4-port 10G line card each contain MAC/PHY, forwarding (with TCAM), buffer, and VOQ stages connected to the fabric through up and down interfaces.]
Each of the eleven line card slots on the MDS 9513 has 2x 2.5-Gbps serial links to each Arbiter ASIC. In addition, each supervisor slot has one each, making a total of 24x 2.5-Gbps serial links to the Arbiter ASICs. These are used to communicate with the central arbiter to request and grant permission for a frame to cross the crossbar.

Each line card Fwd/VOQ ASIC is connected to each of the Crossbar Fabrics via a pair of dual-redundant 25-Gbps channels, providing a total of 50 Gbps to each crossbar. A second dual-redundant 50-Gbps pair of channels provides the return path from the Crossbar Fabric to the other Fwd/VOQ ASIC. Each channel comprises 8x 3.125-Gbps serial links for transmit and 8x 3.125-Gbps for receive.

Frames arrive at the line card MAC/PHY interface and are forwarded to the Fwd/VOQ ASIC, where they are stored in a buffer and associated with a destination VOQ. The Fwd/VOQ ASIC requests permission from the arbiter to deliver a frame to the destination port. When the arbiter has received a credit from the destination device, it grants permission for the frame to be sent across one of the crossbar fabrics. When permission is granted, the frame leaves the VOQ in the Fwd/VOQ ASIC along one of the 25-Gbps channels to one of the Crossbar Fabrics, then returns via one of the 25-Gbps return channels and out through the MAC/PHY ASIC on the appropriate line card.

All frames travel across the crossbar fabric, regardless of where the source and destination ports are located on the ASICs or line cards. This provides a consistent latency of approximately 20 us per frame and minimizes the jitter that can occur in other vendors' products.
Hot-Swappable Supervisors
Dual supervisors
Active and standby, hot-swappable. The stateful standby keeps in sync with all major management and control protocols of the active supervisor.
Non-disruptive upgrades
Load and activate new software without disrupting traffic. The standby supervisor maintains the previous version of code while the active supervisor is updated.
The Cisco MDS 9500 Series of Multilayer Directors supports two Supervisor modules in the chassis for redundancy. Each Supervisor module consists of a Control Engine and a Crossbar Fabric.

The Control Engine is the central processor responsible for the management of the overall system. In addition, the Control Engine participates in all of the networking control protocols, including all Fibre Channel services. In a redundant system, two Control Engines operate in active/standby mode. The standby Control Engine is actually in a stateful-standby mode: it keeps in sync with all major management and control protocols that the active Control Engine maintains. While the standby Control Engine is not actively managing the switch, it continually receives information from the active Control Engine, so the state of the switch is maintained between the two. Should the active Control Engine fail, the standby Control Engine seamlessly takes over its function.

The Crossbar Fabric is the switching engine of the system. The crossbar provides a high-speed matrix of switching paths between all ports within the system. A crossbar fabric is embedded within each Supervisor module, and the two crossbar fabrics operate in a load-shared, active-active mode. Each crossbar fabric has a total switching capacity of 720 Gbps and serves 80 Gbps of bandwidth to each slot on the MDS 9506 and 9509. Since no switching module of the Cisco MDS 9506 or 9509 consumes more than 80 Gbps of bandwidth to the crossbar, the system operates at full performance even with one Supervisor module. In a fully populated MDS 9500, the system will not experience any disruption or loss of performance with the removal or failure of one Supervisor module.

The Supervisor module is hot-swappable. In a dual-Supervisor system, this allows the module to be removed and replaced without disrupting the rest of the system.
Oversubscription Overview
Fibre Channel standards dictate that in a fabric topology, each attached FC device port must be attached to its own dedicated FC switch port. Today's switch ports support 1-Gbps, 2-Gbps, 4-Gbps, and 10-Gbps speeds, but connected devices cannot usually utilize the full bandwidth available to them. A 2-Gbps port can provide 200 MB/s of bandwidth in each direction, a total of 400 MB/s per port. Servers often have internal bandwidth limitations, and applications rarely require more than 25 MB/s today. This is changing with the introduction of PCI Express motherboards, which replace the old parallel PCI bus with multiple 2.5-Gbps serial channels to each slot; if the application is capable of demanding it, each PCI Express channel can fully utilize a 2-Gbps Fibre Channel port. However, today most servers require less than 25 MB/s.

Oversubscription allows several devices to share the available bandwidth. ISL oversubscription is typically 7:1, with seven servers sharing the total bandwidth of a 2-Gbps FC port: 200 MB/s shared by 7 servers = approximately 28 MB/s average per server.
[Diagram: 16-port Full Rate Mode (FRM) line card with four port groups.]
Suitable for storage arrays and ISLs between switches. Up to 255 buffer credits per FC interface, fully configurable, plus 145 performance buffers per port. Default: 255 credits for E_Ports, 16 credits for Fx_Ports.
[Diagram: 32-port Oversubscribed Mode (OSM) line card with eight port groups of four ports. Each group shares 2.5 Gbps of bandwidth, giving 1.6:1 oversubscription at 1 Gbps and 3.2:1 at 2 Gbps (8 Gbps offered / 2.5 Gbps = 3.2).]
Suitable for connecting servers that require less than 62 MB/s average bandwidth. 12 buffer credits (fixed) per FC interface.
Generation-2 Switching Modules
12-port 1/2/4-Gbps Fibre Channel module providing 4-Gbps full-rate bandwidth on every port.
24-port 1/2/4-Gbps Fibre Channel module providing 2:1 oversubscription at 4 Gbps and full-rate bandwidth on each port at 1 Gbps and 2 Gbps.
48-port 1/2/4-Gbps Fibre Channel module providing 4:1 oversubscription at 4 Gbps, 2:1 oversubscription at 2 Gbps, and full-rate bandwidth at 1 Gbps on each port.
4-port 10-Gbps Fibre Channel module providing 10-Gbps full-rate bandwidth on every port.
10G modules use 64b/66b encoding, which is incompatible with modules operating at 1/2/4 Gbps using 8b/10b encoding.
[Diagram: port-group layout on the 48-port, 24-port, and 12-port Generation-2 modules.]
Each line card has 4 port groups, denoted by screen-printed borders.
Each port group has 12 Gbps of shared bandwidth.
Ports can be configured to have dedicated bandwidth (1, 2, or 4 Gbps).
Remaining ports share the unused bandwidth.
Port Groups
Each port group is clearly marked on the line cards with screen-printed borders. Each port group has 12Gbps of internal bandwidth available. Any port can be configured to have dedicated bandwidth at 1Gbps, 2Gbps or 4Gbps. All remaining ports in the port group share any remaining unused bandwidth. Any port in dedicated bandwidth mode has access to extended buffers. Any port in shared bandwidth mode has only 16 buffer credits.
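On Generation-2 modules, the per-port-group allocation can be inspected from the CLI. A minimal sketch (module number is illustrative):

    switch# show port-resources module 1    ! per port group: total and available bandwidth,
                                            ! buffer credits, and the rate mode of each port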
[Diagram: 12-port 1/2/4-Gbps module with four port groups of three ports.]
3 ports per port group; each port group shares 12 Gbps of bandwidth. Full-rate mode at 1/2/4 Gbps. Suitable for 4-Gbps storage array ports and for ISLs between switches. Example: 16x 4 Gbps = 64-Gbps PortChannel.
[Diagram: 24-port 1/2/4-Gbps module with four port groups of six ports.]
6 ports per port group; each port group shares 12 Gbps of bandwidth. Full-rate mode at 1 and 2 Gbps; 2:1 oversubscription at 4 Gbps. Suitable for storage arrays that require less than 200 MB/s of bandwidth, and for ISLs between switches.
[Diagram: 48-port 1/2/4-Gbps module with four port groups of twelve ports.]
12 ports per port group; each port group shares 12 Gbps of bandwidth. Full-rate mode at 1 Gbps; 2:1 oversubscription at 2 Gbps; 4:1 oversubscription at 4 Gbps. Suitable for servers that require less than 100 MB/s average bandwidth.
[Diagram: 4-port 10-Gbps module with four port groups of one port each.]
1 port per port group; each port group shares 12 Gbps of bandwidth. Full-rate mode at 10 Gbps. Suitable for ISLs between switches. Example: 4 ports x 10 Gbps = 40-Gbps PortChannel.
Example: on a 24-port FC module, dedicate one port to 4 Gbps, dedicate another port to 2 Gbps, and take one port out of service (see the configuration sketch below).
Auto/E mode cannot be configured in shared rate-mode. FL mode is not supported on the 4-port 10-Gbps module. TL mode is not supported on any Generation-2 module.
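A minimal CLI sketch of the example above, assuming a Generation-2 24-port module in slot 2 (interface numbers are illustrative):

    switch# config terminal
    switch(config)# interface fc2/1
    switch(config-if)# switchport rate-mode dedicated   ! reserve bandwidth from the port group
    switch(config-if)# switchport speed 4000            ! dedicate 4 Gbps
    switch(config-if)# exit
    switch(config)# interface fc2/2
    switch(config-if)# switchport rate-mode dedicated
    switch(config-if)# switchport speed 2000            ! dedicate 2 Gbps
    switch(config-if)# exit
    switch(config)# interface fc2/3
    switch(config-if)# shutdown                         ! the port must be down before removal
    switch(config-if)# out-of-service                   ! release its resources to the port group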
MDS 9500 with FRM cards; MDS 9216 with integrated FRM ports and an additional OSM line card. ISL oversubscription >> line card oversubscription.
4-port 10-Gbps FRM line card for core ISLs; 12/16-port FRM line card for storage and ISLs; 24/32/48-port OSM line card for host/tape connectivity.
OSM line cards serve server connectivity, tape connectivity, and edge switches (if deploying a core-edge topology).
The Oversubscribed Mode line cards are designed to allow cost-effective consolidation of a core-edge topology into a collapsed core:
The Oversubscribed Mode line cards serve the function of the edge switches. In core-edge topologies, the oversubscription of the ISLs between the core and edge switches is significantly greater than the oversubscription of the MDS 9000 Oversubscribed Mode line cards. In other words, a collapsed-core topology with Oversubscribed Mode line cards has less oversubscription than a typical core-edge topology. The Full-Rate Mode line cards are used for ISLs and storage connectivity, where oversubscription is not desirable. In a core-edge topology, at least one Full-Rate Mode line card is typically deployed in each edge switch for ISLs to the core. Gen-2 shared-bandwidth line cards allow SAN engineers to tune the performance required per end device.
Buffer-to-buffer credits:
Depend on rate-mode and port-mode. A maximum of 16 credits can be configured in shared rate-mode; ~6000 credits are shared across the module in dedicated rate-mode. For example, on a 48-port module, all interfaces can be configured with 125 credits, or 40 interfaces with 120 each plus 2 interfaces with 225.
Performance Buffers:
Min/max/default: 1/145/145. Shared among all ports in the module (not guaranteed). Supported on the 12-port 4-Gbps and 4-port 10-Gbps modules.
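A minimal sketch of configuring extended receive buffer credits on a dedicated rate-mode port (credit value and interface are illustrative; allowed ranges depend on the module and rate mode):

    switch(config)# interface fc3/1
    switch(config-if)# switchport rate-mode dedicated
    switch(config-if)# switchport fcrxbbcredit 125    ! draw from the module's shared credit pool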
Buffer allocation per module (6144 buffers in total per module, including 144 proxy/reserved buffers and 144 performance buffers):

Module            Mode             Extended buffer credits (dedicated mode only)
12-port 1/2/4G    Dedicated Mode   3000 (12 x 250)
24-port 1/2/4G    Shared Mode      6000 (24 x 250)
48-port 1/2/4G    Shared Mode      6000 (48 x 125)
4-port 10G        Dedicated Mode   3000 (4 x 750)
Best Practices
Match the switch port speed to the connected device: configure the port speed to 1, 2, or 4 Gbps, or set auto-sensing with a maximum of 2 Gbps.
Configuring 4 Gbps will reserve 4 Gbps of bandwidth regardless of the auto-negotiated port speed.
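A minimal sketch of the auto-sensing best practice (interface is illustrative):

    switch(config)# interface fc4/1
    switch(config-if)# switchport speed auto max 2000   ! auto-negotiate, but reserve only 2 Gbps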
Lesson 4
Objectives
Upon completing this lesson, you will be able to create a high-level SAN design with MDS 9000 switches. This includes being able to meet these objectives:
Explain the benefits of VSANs
Explain how VSANs are implemented
Explain how IVR enables sharing of resources across VSANs
Explain how PortChannels provide high-availability inter-switch links
Explain how the addressing features of the MDS 9000 simplify SAN management
Explain the purpose of the CFS protocol
Explain how the MDS 9000 interoperates with third-party switches
Virtual SANs
VSANs Address the Limitations of Common SAN Deployments
Virtual Storage Area Networks (VSAN)
VSANs are virtual fabrics: ports within a physical fabric are allocated to create isolated virtual fabrics, so SAN islands are virtualized onto a common SAN infrastructure. A VSAN on FC is similar to a VLAN on Ethernet. Fabric services are isolated within a VSAN, fabric disruption is limited to the VSAN, and statistics are gathered per VSAN.
[Diagram: consolidating SAN islands built from 32-port switches. One island nets 96 ports for 70 used (10 stranded); another nets 64 ports for 40 used (24 stranded). Consolidated as a 70-port Red_VSAN and a 40-port Blue_VSAN on a common fabric: 70+40 ports required, 128 deployed, 0 ISL ports, 18 ports assignable (able to add more switching modules too); net 110 ports for 110 used. Benefits: dynamic provisioning/resizing, improved port utilization, non-disruptive (re)assignment, shared ISL bandwidth.]
Ports can be (re)assigned to VSANs non-disruptively. ISLs become Enhanced ISLs (EISLs) carrying tagged traffic from multiple VSANs. ISL bandwidth is securely shared between VSANs, which reduces cost of excessive ISLs. EISLs only carry permitted VSANs, which can limit the reach of individual VSANs. Each port can belong to only one VSAN, and there is no leakage between VSANs. InterVSAN Routing (IVR) must be used to exchange traffic between two different VSANs.
[Diagram: a fabric event. An HBA generates erroneous control frames, but other VSANs are protected.]
Fabric recovery from a disruptive event is also per-VSAN, resulting in faster reconvergence due to the smaller scope.
[Diagram: VSAN trunks and VSAN trunk bundles connecting Cisco MDS 9506 switches for groups A and B.]
VSAN Advantages
Good ROI: leverage VSANs as a replacement for multiple separate physical fabrics; reduce the number of switches; increase port density.
Availability: disruptions and I/O pauses are confined to the local VSAN, increasing fabric stability.
Scalability: fabric services run per VSAN; reduce the size of the FC distributed database; FC_IDs can be reused.
Security: separate VSANs provide hardware-based traffic isolation and security.
[Diagram: a VSAN-enabled fabric with VSAN trunks and a management VSAN serving Department/Customer A, Department/Customer B, and shared storage.]
VSAN Advantages
VSANs allow implementation of multiple logical SANs over a common fabric, which eliminates costs associated with separate physical fabrics. The virtual fabrics exist on the same physical infrastructure, but are isolated from each other. Each VSAN contains zones and separate (replicated) fabric services, which improves:
Availability, through the isolation of virtual fabrics from fabric-wide faults and reconfigurations
Scalability, through replicated fabric services per VSAN, support for 256 VSANs, and centralized management capability
Security, through fabric isolation
256 VSANs is not a hard limit. The VSAN header field is 12 bits long and supports up to 4096 VSANs, and we can grow to that number in the future as larger-scale SAN deployments increase. Note that while the total number of VSANs that can be configured is 256, the numbering can be anywhere between 1 and 4093 for the reasons mentioned above. FCIDs contain an 8-bit field for domains, and a few values are reserved, leaving the 239-domain (switch) limitation per SAN, with each switch getting its own Domain ID. With Cisco's VSAN technology, this limitation now applies per VSAN, meaning that domains (and hence FCIDs) can be reused across VSANs. This enables the deployment of much larger-scale SANs than are currently available.
[Diagram: an Enhanced ISL (EISL) trunk carries tagged traffic from multiple VSANs, with separate Fibre Channel services for the Blue VSAN and the Red VSAN.]
VSAN Attributes
256 VSANs per switch; 239 switches per VSAN. Traffic is isolated within its own VSAN.
Control over each incoming and outgoing port.
Each frame in the fabric is uniquely tagged with a VSAN_ID header on the ingress port.
The VSAN_ID is maintained across TE_Ports and stripped away across E_Ports. VSAN and priority in the EISL header support QoS.
[Diagram: VSAN 10, VSAN 20, VSAN 30, and VSAN 1 (default) within a single Cisco MDS 9509 chassis.]
VSAN Attributes
VSANs help achieve traffic isolation in the fabric by adding control over each incoming and outgoing port. There can be up to 256 VSANs in the switch and 239 switches per VSAN. This effectively helps with network scalability, because the fabric is no longer limited to 239 Domain_IDs; they can be reused within each VSAN.

To uniquely identify each frame in the fabric, the frame is labeled with a VSAN_ID on the ingress port. The VSAN_ID is stripped away across E_Ports but maintained across TE_Ports. By carrying the VSAN and priority in the header, quality of service (QoS) can be properly applied. The VSAN_ID is always stripped away at the other edge of the fabric. If an E_Port is capable of carrying multiple VSANs, it becomes a trunking E_Port (TE_Port).

VSANs also facilitate the reuse of address space by creating independent virtual SANs, thereby increasing the available number of addresses and improving switch granularity. Without VSANs, an administrator needs to purchase separate switches and links for separate SANs; the system granularity is at the switch level, not at the port level. VSANs are easy to manage: to move or change users, you only need to change the configuration of the SAN, not its physical structure, and to move devices between VSANs you simply change the configuration at the port level; no physical moves are required.
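A minimal sketch of creating VSANs and assigning a port (names and interface are illustrative):

    switch# config terminal
    switch(config)# vsan database
    switch(config-vsan-db)# vsan 10 name Red_VSAN
    switch(config-vsan-db)# vsan 20 name Blue_VSAN
    switch(config-vsan-db)# vsan 10 interface fc1/1   ! move the port into VSAN 10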
Trunking can be optionally disabled for E_Port operation. A native VSAN is assigned for E_Port operation. Not to be confused with port aggregation (PortChannels).
TE_Ports can pass tagged frames belonging to multiple VSANs. TE_Ports are only supported by Cisco MDS 9000 switches. By default, TE_Ports can pass all VSAN traffic (1-4093), and the passing of traffic for specific VSANs can be disabled. By default, E_Ports are assigned to VSAN 1. TE_Ports allow for the segregation of SAN traffic and should not be confused with port aggregation (referred to by some vendors as trunking).
An EISL is created when two TE_Ports are connected. EISLs offer a superset of ISL functionality; they carry per-VSAN control protocol information.
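A minimal sketch of bringing up an EISL restricted to two VSANs (interface and VSAN numbers are illustrative):

    switch(config)# interface fc1/16
    switch(config-if)# switchport mode E
    switch(config-if)# switchport trunk mode on               ! negotiate a TE_Port / EISL
    switch(config-if)# switchport trunk allowed vsan 10
    switch(config-if)# switchport trunk allowed vsan add 20   ! carry only VSANs 10 and 20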
WWN-Based VSANs
[Diagram: port-based VSANs vs. WWN-based VSANs (SAN-OS 2.0). With port-based membership, moving a host from SW1 to SW2 requires reconfiguration on SW2; with WWN-based membership, the move needs no reconfiguration.]
Port-based VSANs: membership is based on the physical switch port, so reconfiguration is required when a server or storage device moves to another switch.
WWN-based VSANs: membership is based on the pWWN of the server or storage device, with fabric-wide distribution of the configuration using CFS; no reconfiguration is required when a host or storage device moves.
WWN-Based VSANs
With the introduction of SAN-OS 2.0, VSAN membership may now be defined based on the world wide name (WWN) of hosts and storage devices, or by switch port. With WWN-based VSAN membership, hosts and targets can be moved from one port to any other port anywhere in the MDS fabric without requiring manual reconfiguration of the port VSANs.
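SAN-OS implements WWN-based membership as Dynamic Port VSAN Membership (DPVM). A minimal sketch with a hypothetical pWWN; exact syntax varies by release:

    switch(config)# dpvm enable
    switch(config)# dpvm database
    switch(config-dpvm-db)# pwwn 21:00:00:e0:8b:01:02:03 vsan 10   ! hypothetical host pWWN
    switch(config)# dpvm activate
    switch(config)# dpvm distribute    ! push the database fabric-wide over CFS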
[Diagram: a transit VSAN carries traffic between VSANs across the WAN.]
A transit VSAN isolates the WAN infrastructure, resolves problems with merged fabrics, and keeps FC control frames within the VSAN.
IVR Overview
VSANs are like virtual switches. They improve SAN scalability, availability, and security by allowing multiple SANs to share a common physical infrastructure of switches and ISLs. These benefits derive from the separation of Fibre Channel services in each VSAN and the isolation of traffic between VSANs. However, data traffic isolation between VSANs also inherently prevents sharing of resources attached to a VSAN, such as robotic tape libraries.

Using IVR, resources across VSANs can be accessed without compromising other VSAN benefits. When IVR is implemented, data traffic is transported between specific initiators and targets on different VSANs without merging the VSANs into a single logical fabric. FC control traffic does not flow between VSANs, nor can initiators access any resources across VSANs other than the designated ones. IVR allows valuable resources like tape libraries to be easily shared across VSANs, and IVR used in conjunction with FCIP provides more efficient business continuity and disaster recovery solutions. IVR works for both FC and FCIP links.

Using IVR, a backup server in VSAN 10 could access a tape library in VSAN 20 by configuring the switches involved to allow traffic between these devices, by VSAN and pWWN. Nodes not configured for IVR remain unable to access each other across VSANs.
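A minimal sketch of the backup-server example, assuming SAN-OS IVR commands and hypothetical pWWNs (exact syntax varies by release):

    switch(config)# ivr enable
    switch(config)# ivr distribute                    ! share the IVR config over CFS
    switch(config)# ivr zone name TapeShare
    switch(config-ivr-zone)# member pwwn 10:00:00:00:c9:aa:bb:01 vsan 10   ! backup server
    switch(config-ivr-zone)# member pwwn 50:06:0b:00:00:11:22:01 vsan 20   ! tape library
    switch(config)# ivr zoneset name Backup_ZS
    switch(config-ivr-zoneset)# member TapeShare
    switch(config)# ivr zoneset activate name Backup_ZS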
[Diagram: an IVR zone spanning VSAN 10 and VSAN 20 gives a backup server access to a tape library while the rest of each VSAN stays isolated.]
[Diagram: IVR across a transit VSAN. A host (fcid 05.02.01, domain 0x05) in VSAN 10 reaches a target (fcid 06.03.04, domain 0x06) in VSAN 20 through transit VSAN 99 over an FCIP tunnel. The frame keeps S_ID 05.02.01 and D_ID 06.03.04 (AFID 1) across all three VSANs.]
IVR routing information includes: switch identifier, current VSAN ID, source domain, destination domain, and next-hop VSAN (rewritten VSAN).
From SAN-OS 2.1, Domain IDs do not have to be unique within an IVR zoneset
All VSAN IDs must be unique within the same Autonomous Fabric
[Diagram: IVR NAT within Autonomous Fabric ID 1. Both VSAN 10 and VSAN 20 contain a switch with domain 0x05. The host (fcid 05.02.01) in VSAN 10 addresses the VSAN 20 target (fcid 05.03.04) through proxy fcid 06.03.04; in VSAN 20 the frame header is rewritten to S_ID 06.02.01 and D_ID 05.03.04.]
IVR NAT: removes the unique VSAN ID and Domain ID requirement; integrates with QoS, LUN zoning, and read-only zoning; provides automatic IVR configuration propagation throughout the fabric (AUTO mode) and automatic IVR topology discovery; licensed with the Enterprise and SAN Extension (with IPS-4 or IPS-8 installed) packages.
In the example, notice that VSAN 10 has a switch with Domain ID 05 and so does VSAN 20. Therefore IVR NAT must provide a proxy entry in VSAN 10 for VSAN20 Device 05.03.04 and renumber it as 06.03.04. A frame from VSAN 10 fcid 05.02.01 is written with a destination fcid of 06.03.04 and routed via the transit VSAN 99 to VSAN 20. As the frame arrives at the border switch in VSAN 20, the frame header is rewritten as 05.03.04 and routed to its destination port. Notice that with SAN-OS 2.1 there is only one Autonomous Fabric so all VSAN IDs must be unique within the same Autonomous Fabric.
From SAN-OS 3.0, you can configure up to 64 separate Autonomous Fabrics. All VSAN IDs must be unique within the same Autonomous Fabric, but the same VSAN ID can exist in a different Autonomous Fabric. IVR NAT rewrites the S_ID and/or D_ID and routes frames as before.
[Diagram: IVR between Autonomous Fabrics. VSAN 10 in AFID 1 is joined to VSAN 10 in AFID 2 through transit VSAN 99; IVR NAT rewrites the AFID and FCIDs as frames pass through the IVR edge switches.]
Manual configuration: configure the IVR topology manually on each IVR-enabled switch.
Automatic mode: uses CFS configuration distribution to dynamically learn and maintain up-to-date information about the topology of the IVR-enabled switches in the network.
In the example, VSAN 10 on the left is joined to VSAN 10 on the right via a Transit VSAN 99. This would be illegal in a single Autonomous Fabric so both sides are configured in separate Autonomous Fabrics 1 and 2. Notice that IVR NAT now rewrites the AFID in the EISL frame header from AFID1:VSAN 10 to AFID2:VSAN 10 as it passes through the IVR edge switches.
While it is not strictly required to have unique Domain IDs across VSANs for switches that are not participating in IVR, unique Domain IDs are recommended because they simplify fabric design and management. Because the VSAN rewrite table is limited to 4096 entries, and because entries are per-domain, not per-end-device, it is best to minimize the number of switches that contain IVZ members in very large implementations. Implement redundant path designs whenever possible.

In normal FC environments, it is generally considered a best practice to set the default zone policy to deny. Because members of IVZs cannot exist in the default zone, activation of an IVZS using the force option may lead to traffic disruption if IVZ members previously existed under a default zone policy of permit.

Make sure that exactly the same IVR topology is applied to all IVR-enabled switches. Using Cisco Fabric Manager to configure IVR can help avoid errors and ensures that the same IVR configuration is applied to all IVR-enabled switches.
[Diagram: multiprotocol routers joining Fabrics B and C, with application hosts, backup servers A and B, a backup media server, and LSAN_1/LSAN_2.]
Multiprotocol routers: add latency to every frame; consume ISL ports; are difficult to manage; are a single point of failure.
Multiprotocol routers join fabrics without merging them, perform NAT to join separate address spaces, and perform functions similar to iFCP gateways.
PortChannels
A PortChannel is a logical bundling of ISLs:
Multiple links are combined into one aggregated link
More reliable than FSPF equal-cost routing
Can span line cards for higher availability
Higher throughput: up to 160 Gbps per PortChannel (16 x 10 Gbps)
No distance limitations
Up to 16 ISLs per PortChannel
Up to 128 PortChannels per switch
A PortChannel is a logical bundling of identical links. PortChannels (link bundling) enable multiple physical links to be combined into one aggregated link, whose bandwidth is the aggregate of the individual links. There may be a single PortChannel or multiple PortChannels between switches. PortChannels provide a point-to-point connection over multiple interswitch link (ISL) E_Ports or extended interswitch link (EISL) TE_Ports. PortChannels increase the aggregate bandwidth of an ISL by distributing traffic among all functional links in the channel, which decreases the cost of the link between switches. PortChannels also provide high availability on an ISL: if one of the physical links fails, traffic previously carried on that link is switched to the remaining links. PortChannels are known in the industry by other names, such as the following:
ISL trunking (Brocade Communications Systems)
Port bundling
Aggregated channels
Channel groups
Channeling
Bundles
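A minimal sketch of building a two-link PortChannel across different modules (interfaces and channel number are illustrative):

    switch(config)# interface fc1/1
    switch(config-if)# channel-group 10        ! creates PortChannel 10 if it does not exist
    switch(config-if)# exit
    switch(config)# interface fc2/1            ! second link on a different module for HA
    switch(config-if)# channel-group 10
    switch(config-if)# exit
    switch(config)# interface port-channel 10
    switch(config-if)# no shutdown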
FSPF Routing
[Diagram: four switches, domains 10 through 13, joined by 1-Gbps ISLs. FSPF routing table from domain 10 to 13: path 1 (10-11-13), cost 2000, Servers A and C; path 2 (10-12-13), cost 2000, Server B.]
FSPF builds a routing table between each domain in the fabric.
FSPF chooses the least-cost path and routes all frames along it; here there are two equal-cost paths, since paths 1 and 2 both have a cost of 2000.
FSPF applies a round-robin algorithm to share the load between connected devices: Servers A and C share path 1, and Server B is allocated path 2.
All frames from Server A to storage are carried across path 1, so path 1 will carry a different load than path 2.
FSPF does NOT load balance across equal-cost paths on non-Cisco switches.
FSPF Routing
When Fibre Channel switches are joined together with ISLs, FSPF builds a routing table that is distributed to all switches using link state updates. The routing table is a list of every possible path between any two domains in the Fibre Channel fabric, and each path is assigned a cost based upon the speed of the link.
FSPF then chooses the least-cost path between any two domains. All frames are sent along the least-cost path, and all other possible paths are ignored. Every time an ISL is added or removed, FSPF issues a Build Fabric (BF) command to rebuild the routing table.
[Diagram: a third ISL is added, creating a second 10-11 link. FSPF routing table from domain 10 to 13: path 1 (10-11-13), cost 2000, Server A; path 2 (10-12-13), cost 2000, Server B; path 3 (10-11-13), cost 2000, Server C.]
To provide more bandwidth, we could add another link between switches.
FSPF again chooses the least-cost path and routes all frames along it; there are now three equal-cost paths, since paths 1, 2, and 3 all have a cost of 2000.
FSPF re-applies the round-robin algorithm to share the load between connected devices: Server A is allocated path 1, Server B path 2, and Server C path 3.
All frames from Server A to storage are still carried across path 1, so path 1 will carry a different load than paths 2 and 3.
FSPF still does not load balance across equal-cost paths on non-Cisco switches.
[Diagram: the parallel ISLs are bundled into PortChannels. FSPF routing table from domain 10 to 13: path 1 (10-11-13) now costs 1000 and carries Servers A, B, and C; path 2 (10-12-13) costs 2000 and carries none. PortChannel link cost = FSPF link cost / number of links, i.e. 1000 / 2 = 500.]
To provide more bandwidth, we could add another link between switches.
FSPF rebuilds the routing table between each domain in the fabric and again chooses the least-cost path, but there is now only one least-cost path: path 1, with a new cost of 1000.
All frames from domain 10 to 13 follow path 1.
By default, the PortChannel load balances across all links in the PortChannel.
If a link fails within the PortChannel, the FSPF cost does not change, and frames continue to flow through the PortChannel.
When two or more ISLs are placed in a PortChannel, the bundle is seen as a single path by FSPF, and its cost is calculated as the cost of each link divided by the number of links in the PortChannel. Cisco MDS switches provide exchange-based load balancing across all links within the PortChannel.
Flapping Links
A flapping link can cause FSPF recalculation: FSPF rebuilds the topology database when the link goes down and again when it comes up. With a failing GBIC or link, this can happen several times a second, resulting in wide-scale disruption to the fabric.
Flapping Links
PortChannels can handle some types of hardware failures better than ISLs not belonging to a PortChannel. For example, if a flapping link exists between the two middle directors outside of a PortChannel, FSPF overhead is incurred: each time the ISL goes down or comes up, all of the switches in the fabric recalculate the cost of each of their FSPF links by exchanging Link State Records on every (E)ISL interface.

Switches synchronize databases by sending Link State Records (LSRs) in a Link State Update (LSU) SW_ILS extended link service command. When a switch receives an LSU, it compares each LSR in the LSU with its current topology database. If the new LSR is not present in the switch's link state database, or if the new LSR is newer than the existing LSR, the LSR is added to the database. Cisco uses a modified Dijkstra algorithm that performs a very fast computation of the FSPF topology database. When a link flaps, the LSUs are flooded and then the path calculation occurs.

While Cisco MDS switches handle flapping links more efficiently than most competitors, placing ISLs within a PortChannel can completely eliminate the overhead associated with FSPF recalculation caused by a flapping link.
A flapping link within a PortChannel results in no FSPF recalculation: frames continue to flow across the remaining links in the PortChannel, and fabric stability is maintained.
PortChannel Protocol
Used for exchanging configuration information between switches to automatically configure and maintain PortChannels (introduced in SAN-OS 2.0). The PortChannel Protocol provides a consistency check of the configuration at both ends, simplifies PortChannel configuration, and can automatically create a PortChannel between switches.
[Diagram: PortChannel Protocol between switches A and B using channel group 10. With the PortChannel Protocol, misconfigured ports are isolated instead of suspended, and plug-and-play functionality allows the A3-B3 individual link to be dynamically added to the PortChannel.]
Bringup protocol: misconfiguration detection and synchronization of port bringup.
Autocreate protocol: automatic aggregation of ports into PortChannels.
PCP is exchanged only on FC and FCIP interfaces. The autocreate protocol is run to determine whether a port can aggregate with other ports to form a channel group. Both the local and peer ports have to be autocreate-enabled for the autocreate protocol to be attempted, and more than one port needs to be autocreate-enabled for aggregation to be attempted. A port cannot both be manually configured to be part of a PortChannel and have autocreate enabled; these two configurations are mutually exclusive. Autocreate-enabled ports need to have the same compatibility parameters to be aggregated: speed, mode, trunk mode, port VSAN, allowed VSANs, and port and fabric binding configuration.
Load Balancing
[Diagram: a SCSI read exchange between source and destination: command, data sequences 1-3, and response; one exchange per ISL.]
The load-balancing option is configured on a per-VSAN basis and applies to both FSPF and PortChannels. Some hardware/software combinations perform better with flow-based load balancing (e.g., HP CA with EVA). Exchange-based load balancing is the default; devices in the MDS family do not split exchanges, allowing guaranteed in-order delivery (IOD) over the WAN.
Load Balancing
Load balancing is configured for each VSAN in an MDS 9000 fabric. There are two load-balancing methods: flow-based and exchange-based. Flow-based load balancing sends all traffic with the same src_id-dst_id pair along the same path. Exchange-based load balancing ensures that members of the same SCSI exchange follow the same path. Exchange-based load balancing is the default and is appropriate for most environments.

Load balancing is configured on a VSAN-by-VSAN basis, and whichever method is chosen is applied to both FSPF and PortChannels. Some hardware/software combinations can perform better with flow-based load balancing. For example, HP EVA storage subsystems coupled with Continuous Access (CA) software are sensitive to the out-of-order exchanges that are possible with exchange-based load balancing. These devices, while rare, perform significantly better with flow-based load balancing.
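A minimal sketch of setting the load-balancing method per VSAN (VSAN numbers are illustrative):

    switch(config)# vsan database
    switch(config-vsan-db)# vsan 10 loadbalancing src-dst-id      ! flow-based (e.g., for HP CA with EVA)
    switch(config-vsan-db)# vsan 20 loadbalancing src-dst-ox-id   ! exchange-based (the default)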
[Diagram: exchange-based load balancing (S_ID, D_ID, OX_ID) between S_ID 10.02.00 and D_ID 20.01.00 maintains in-order delivery in a stable fabric.]
PortChannels are proprietary and are not supported between Cisco MDS switches and other vendors' switches (e.g., a Brocade 3800 or 12000), nor are they compatible with other vendors' trunking. Standard ISL flow control must be configured on the Brocade switch.
Best Practices
Use PortChannels wherever possible; place even single ISLs in a PortChannel for non-disruptive scalability.
Configure links on different switching modules for redundancy and high availability.
Use the same Channel_ID at both ends; this is not a requirement, but it makes management easier.
Best Practices
Use PortChannels whenever possible. PortChannels:
Reduce CPU usage from the levels required to maintain multiple neighbors
Provide an independent recovery mechanism, faster than FSPF
Are completely transparent to upper-layer protocols
Can be non-disruptively scaled by adding links
Spread PortChannel links across different switching modules; should a switching module fail, the PortChannel can then continue to function as long as at least one link remains functional. Try to use the same Channel_ID at both ends of the PortChannel: while the PortChannel number is only locally significant, this practice helps identify the PortChannel more easily within the fabric. PortChannels are point-to-point logical links, so ensure that all links in a PortChannel connect the same two switches or directors. To prevent frame loss, it is best to quiesce a link before disabling it from a PortChannel. When difficulties arise with configuring PortChannels, the problem is often the result of inconsistently configured links; all links within the PortChannel require the same attributes for the PortChannel to come up. Use the show port-channel consistency detail command to identify link configuration inconsistencies.
Use the in-order-delivery feature only when necessary. In-order-delivery adds latency because it deliberately holds frames in the switch. It also consumes more switch memory, because it stacks the frames at the egress port.
Intelligent Addressing
Dynamic FCID Assignment Problems
[Diagram: dynamic FCID assignment. The directory server binds WWN1 to FCID1; after an event (port change, host reboot, or switch reboot), WWN1 may be assigned a different FCID (FCID2).]
After the N_Port has established a link to its F_Port, the N_Port obtains a port address by sending a Fabric Login (FLOGI) Link Services command to the switch Login Server (at Well-Known Address 0xFFFFFE). The FLOGI command contains the WWN of the N_Port in the payload of the frame. The Login Server sends an Accept (ACC) reply that contains the N_Port address in the D_ID field. The initiator N_Port then contacts the target N_Port using the FCID of the target.
In the event of a port change, host reboot or switch reboot, previous FCID assignments have the potential to change.
With persistent FCID allocation, the WWN-to-FCID binding is stored in non-volatile memory. This binding remains intact until it is explicitly purged by the switch administrator. The persistent FCID allocation option can be applied globally or for individual VSANs. This feature reduces the management complexity and availability risks associated with deploying HP-UX and AIX servers. The persistent FCID allocation feature is enabled on a per-VSAN basis, allowing different VSANs to have different addressing policies or practices.

The Cisco MDS 9000 Family also supports static FCID assignments, in which the area and port octets in the FCID are manually assigned by the administrator. This feature allows SAN administrators to use custom numbering or addressing schemes to divide the FCID domain address space among available SAN devices. It is particularly useful for customers who migrate from other vendors' switches, because they can retain the same FCIDs after migration. Because the Domain ID is the first octet of the FCID, the administrator must assign a static Domain ID to the switch in order to specify the entire FCID; therefore, to statically assign FCIDs on a given switch, that switch must first be configured with a static Domain ID. The static FCID assignment feature is enabled on a per-VSAN basis, and static Domain IDs must be assigned on a per-VSAN basis for each switch in the VSAN.

Static FCID assignment eases the migration of HP-UX and AIX servers from a legacy fabric to an MDS fabric: the MDS switches can be configured with the same FCIDs as the legacy fabric, eliminating the need to remap storage targets on the servers.
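A minimal sketch of persistent and static FCID configuration (domain, VSAN, pWWN, and FCID values are illustrative):

    switch(config)# fcdomain domain 5 static vsan 10    ! static Domain ID, required for static FCIDs
    switch(config)# fcdomain fcid persistent vsan 10    ! keep WWN-to-FCID bindings across events
    switch(config)# fcdomain fcid database
    switch(config-fcid-db)# vsan 10 wwn 21:00:00:e0:8b:01:02:03 fcid 0x050200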
[Diagram: FCID format. Bits 23-16: Domain; bits 15-08: Area; bits 07-00: Port. The Cisco MDS logically assigns FCIDs; they are not tied to the physical port.]
[Diagram: device aliases (SAN-OS 2.0). Alias1 and Alias2 map to WWN1 and WWN2. Fabric-wide distribution ensures no reconfiguration when a device is moved across VSANs, and unique aliases minimize zone merge issues.]
[Diagram: NPIV (SAN-OS 3.0). Three FCIDs, three name server entries, and three virtual devices all share a single FC port on an F_Port of the MDS switch.]
CFS Protocol
Benefits: fast and efficient distribution; a single point of configuration with fabric-wide consistency; plug-and-play SANs; session-based management.
The Cisco Fabric Services Protocol distributes configuration information for WWN-based VSAN membership, Distributed Device Alias Services, Port Security, Call Home, Network Time Protocol (NTP), AAA servers, Inter-VSAN Routing zones, syslog servers, role policies, and Fibre Channel timers to all switches in a fabric. From SAN-OS 3.0, CFS first attempts to distribute in-band using FC EISLs between switches and, as a last resort, uses an out-of-band IP connection if available.
CFS Applications
Consistent Syslog Server, Call Home, and Network Time Protocol (NTP) configuration throughout the fabric aids in troubleshooting and SLA compliance
Distributed Port Security, RADIUS/TACACS+, and Role-Based Access Control (RBAC) information for simpler security management
Fabric-wide VSAN timer and IVR topology information propagation from a single switch
Distributed Device Alias Service (DDAS) allows fabric-wide aliases, simplifying SAN administration
[Diagram: CFS distributes VSAN timers, IVR, DDAS, syslog, Call Home, and NTP configuration.]
CFS Applications
The Cisco Fabric Services Protocol aids in the administration, management, and deployment of configuration settings SAN-wide. Consistent Syslog Server, Call Home, and NTP configuration throughout the fabric aids in troubleshooting and SLA compliance. CFS-distributed Port Security, RADIUS/TACACS+, and RBAC information enhances and simplifies security by providing consistent and comprehensive security settings. Fabric-wide IVR and VSAN timer information propagation from a single switch via CFS provides uniformity across the fabric. Fabric-wide distributed device aliasing simplifies SAN administration by providing consistent names for devices throughout the fabric, based upon the pWWN, regardless of VSAN.
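A minimal sketch of a CFS-distributed device alias (hypothetical alias name and pWWN):

    switch(config)# device-alias database
    switch(config-device-alias-db)# device-alias name Tape1 pwwn 50:06:0b:00:00:11:22:01
    switch(config-device-alias-db)# exit
    switch(config)# device-alias commit    ! CFS distributes the database fabric-wide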
Switch Interoperability
Overview of Switch Interoperability
[Diagram: proprietary trunking features (Open Trunking, PortChannels) do not interoperate between MDS 9000, Brocade, and McData switches.]
Switches utilize their proprietary feature sets, so different vendors' switches often cannot interoperate with each other. Cisco MDS switches support 5 modes:
Cisco Native Mode: supports all Cisco proprietary features
Interop Mode 1: FC-SW-2 compatible with all other vendors
Interop Mode 2: legacy Brocade support for 16-port switches
Interop Mode 3: legacy Brocade support for larger switches
Interop Mode 4 (SAN-OS 3.0): legacy McData support
Interoperability allows devices from multiple vendors to communicate across a SAN fabric. Fibre Channel standards (e.g., Fibre Channel Methodologies for Interconnect, FC-MI 1.92) have been put in place to guide vendors towards common external Fibre Channel interfaces. If all vendors followed the standards in the same manner, then interconnecting different products would become a trivial exercise. However, some aspects of the Fibre Channel standards are open to interpretation and include many options for implementation. In addition, vendors have extended the features laid out in the standards documents to add advanced capabilities and functionality to their feature set. Since these features are often proprietary, vendors have had to implement interoperability modes to accommodate heterogeneous environments.
Standard interop mode (Interop Mode 1) requires all other switches in the fabric to be in Interop Mode 1. It enables MDS 9000 switches to interoperate with McData, Brocade, and QLogic switches that are FC-SW-2 compatible. It reduces the feature set supported by all switches, requires rebooting of third-party switches, and can require a disruptive restart of an MDS VSAN. Interop modes affect only the VSAN for which they are configured.
Interop Mode 1
The standard interoperability mode (Interop mode 1) enables the MDS to interoperate with third party switches that have been configured for interoperability. Interop 1 mode allows the MDS to communicate over a standard set of protocols with these switches. In Interop mode 1, the feature set supported by vendors in standard interoperability mode is reduced to a subset that can be supported by all vendors. This is the traditional way vendors achieve interoperability. Most non-Cisco switches require a reboot when configured into standard interoperability mode. On Cisco switches, Interop mode is set on a VSAN rather than the whole switch. As a result, an individual VSAN may need to be restarted disruptively to implement interop 1 mode, but the entire switch does not require a reset.
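A minimal sketch of enabling interop mode 1 on one VSAN (VSAN number is illustrative; suspending and resuming the VSAN is disruptive to that VSAN only):

    switch(config)# vsan database
    switch(config-vsan-db)# vsan 10 interop 1
    switch(config-vsan-db)# vsan 10 suspend
    switch(config-vsan-db)# no vsan 10 suspend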
Interop Mode 2: supports Brocade switches with 16 or fewer ports (e.g., 2100/2400/2800/3200); native core PID format = 0 (e.g., MDS to a Brocade 2800).
Interop Mode 3: supports Brocade switches with more than 16 ports (e.g., 3900/12000/24000); native core PID format = 1 (e.g., MDS to a Brocade 3900).
Interop Mode 4 (SAN-OS 3.0): supports McData switches and directors (e.g., MDS to a McData 6140).
Interop Mode 2: This mode allows seamless integration with specific Brocade switches (2100/2400/2800/3800 series) running in their own native mode of operation. Interop Mode 2 enables MDS switches to interoperate with older Brocade switches that utilize a restrictive PID format (core PID = 0) that permits only 16 devices per domain. This restrictive PID format, also referred to as CORE PID FORMAT (CORE PID=0), is common in Brocade fabrics that do not have a 3900 or 12000 switch in the fabric.

Interop Mode 3: This mode allows seamless integration with specific Brocade switches (3900 and 12000) running in their own native mode of operation. Interop Mode 3 enables MDS switches to interoperate with newer Brocade switches that utilize a less restrictive PID format (core PID = 1) that permits up to 256 devices per domain. This format, referred to as CORE PID FORMAT (CORE PID=1), is common in Brocade fabrics with at least one 3900, 12000, or later model in the fabric. In such cases, all the other 2800/3800 switches need to be set to CORE PID=1 explicitly, which is a disruptive operation requiring a reboot of every switch.

Interop Mode 4: This mode allows seamless integration with McData switches and directors running in their own native mode of operation.
Note: Brocade Fabric switches with port counts higher than 16 (models 3900 and 12000) require that the core PID value be set to 1. Earlier models, with 16 or fewer ports, set the core PID to 0. These older Brocade switches allocated one nibble of the FCID/PID area field (0x0-F) for port numbers, limiting the port count to 16. When the core PID is set to 1, the allocated bytes in the FCID/PID allow the use of port numbers 0x00-FF.
[Diagram: a backup server, storage array, and tape library connected across Brocade and McData fabrics through IVR.]
Use IVR to seamlessly back up data in a Brocade fabric to a tape library in a McData fabric. IVR is supported by MDS 9100, 9200, and 9500 switches and is included in the Enterprise license package. It enables true SAN consolidation of storage and tape devices across the enterprise.
Lesson 5
Objectives
Upon completing this lesson, you will be able to become familiar with MDS 9000 management. This includes being able to meet these objectives:
Identify the system memory areas of the MDS 9000 supervisor
Describe the features of the MDS 9000 CLI
Describe the basic features of Cisco Fabric Manager and Device Manager
Explain how to perform the initial configuration of an MDS 9000 switch
Access the MDS 9000 remote storage labs
Slot0: external flash
Volatile: temporary file space
Log: logfile
Modflash: SSM flash
The Bootflash: contains the Kickstart and System images used for booting the MDS.
All configuration changes made by the CLI or FM/DM are instantly active and held in the running-config.
copy run start saves the running-config to the startup-config in NVRAM.
The startup-config is loaded when the switch is rebooted.
Temporary files may be stored in the Volatile: system area.
The Cisco MDS contains an internal Bootflash: used for holding the current bootable images, Kickstart and System. License files are also stored here, but the Bootflash: can also be used for storing any file, including copies of the startup-config. MDS 9500 supervisors also have an external flash memory slot, called Slot0:, that is used for transferring image files between switches. The SSM line card contains an internal Modflash: used for storing application images. The system RAM is used by the Linux operating system, and a Volatile: file system is used for storing temporary files.

Any changes made to the switch operating parameters or configuration are instantly active and held in the running-configuration in RAM. All data stored in RAM is lost when the MDS is rebooted, so an area of non-volatile RAM (NVRAM) is used for storage of critical data; the most critical of these is the running-configuration for the switch. The running-configuration should be saved to the startup-configuration in NVRAM using the CLI command copy run start, so that the configuration is preserved across a switch reboot.

During the switch boot process, it is essential that the switch knows where to find the Kickstart and System images and what they are called. Two boot parameters held in NVRAM point to these two files.
Boot Sequence
Both the Kickstart image and the System image need to be present for a successful boot. The boot parameters point to the location of both images, and the boot process will fail if the parameters are wrong or the images are missing. The install command simplifies the process and checks for errors.
[Diagram: boot stages, from the bottom up]
BIOS: runs POST; loads the Bootloader.
Bootloader: gets the Kickstart boot parameters; verifies the Kickstart image and loads it (loader> prompt).
Kickstart: loads the Linux kernel and drivers; gets the System boot parameters; verifies the System image and loads it (switch(boot)# prompt).
System: loads SAN-OS; checks the file systems; loads the startup-config (switch# prompt).
The images (e.g., system30.img and kickstart30.img) reside on the internal Bootflash:, and SAN-OS runs in the Linux system space in RAM.
Boot Sequence
When the MDS is first switched on, or during a reboot, the system BIOS on the Supervisor module first runs POST (power-on self test) diagnostics and then loads the Bootloader bootstrap function. The boot parameters are held in NVRAM and point to the location and name of both the Kickstart and System images. The Bootloader obtains the location of the Kickstart file, usually on Bootflash:, and verifies the Kickstart image before loading it. The Kickstart loads the Linux kernel and device drivers and then needs to load the System image; again, the boot parameters in NVRAM should point to its location and name, usually on Bootflash:. The Kickstart verifies the System image and loads it. Finally, the System image loads SAN-OS, checks the file systems, and proceeds to load the startup-config, containing the switch configuration, from NVRAM.

If the boot parameters are missing or have an incorrect name or location, the boot process will fail at the last stage. If this happens, the administrator must recover from the error and reload the switch. The install all command is a script that greatly simplifies the upgrade procedure and checks for errors and the upgrade impact before proceeding.
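A minimal sketch of setting the boot parameters and running an upgrade, using the image names from the diagram above:

    switch(config)# boot kickstart bootflash:kickstart30.img
    switch(config)# boot system bootflash:system30.img
    switch(config)# exit
    switch# copy running-config startup-config    ! save the boot parameters to NVRAM
    switch# install all kickstart bootflash:kickstart30.img system bootflash:system30.img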
CLI Overview
Command-Line Interface (CLI)
Multiple connection options and protocols
Direct console or serial link: VT100
Secure shell access: SSH (encrypted)
Terminal Telnet: TCP/IP over Ethernet or Fibre Channel
There are multiple connection options and protocols available to manage the MDS 9000 Family switches via the CLI. The initial configuration must be done using VT100 console access. VT100 console access can be a direct connection or a serial link connection such as a modem. Once the initial configuration is complete, you can access the switch using either Secure Shell or Telnet. The Secure Shell (SSH) protocol provides a secure, encrypted means of access. Terminal Telnet access involves a TCP/IP Out-of-Band (OOB) connection through the 10/100-Mb Ethernet port or an in-band connection via IP over FC. You can access the MDS 9000 Family of switches for configuration, status, or management through the console port, or you can initiate a Telnet session through the OOB Ethernet management port or through the in-band IP over FC management feature. The console port is an asynchronous port with a default configuration of 9600 bps, 8 data bits, no parity, and 1 stop bit. This port is the only means of accessing the switch after the initial power-up, until an IP address is configured for the management port. Once an IP address is configured, you can Telnet to the switch through the management (Mgmt0) interface on the supervisor card. In-band IP over FC is used to manage remote switches through the local Mgmt0 interface.
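As a minimal sketch (the addresses are placeholders), bringing up the management interface from the console so that Telnet or SSH can then be used might look like this:

switch# config terminal
switch(config)# interface mgmt 0
switch(config-if)# ip address 10.1.1.2 255.255.255.0
switch(config-if)# no shutdown
switch(config-if)# exit
switch(config)# ip default-gateway 10.1.1.1
switch(config)# ssh server enable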
CLI Features
Structured hierarchy, easier to remember
Style consistent with IOS software
Commands may be abbreviated
Help facility
Context-sensitive help (?)
Command completion (Tab)
Command history buffer (using arrow keys)
Console error messages
Command Scheduler with support for running shell scripts
Support for command variables and aliases
Configuration changes must be explicitly saved before reboot
copy running-config startup-config (abbreviated to: copy run start)
CLI Features
The CLI enables you to configure every feature of the switch. More than 1700 combinations of commands are available and are structurally consistent with the style of Cisco IOS software CLI. The CLI help facility provides:
Context-sensitive help: provides a list of commands and associated arguments. Type ? at any time, or type part of a command and then type ?.
Command completion: the Tab key completes the keyword you have started typing.
Console error messages: identify problems with any switch commands that are incorrectly entered, so that they may be corrected or modified.
Command history buffer: allows recall of long or complex commands or entries for re-entry, review, or correction.
MDS Command Scheduler: provides a UNIX cron-like facility in the SAN-OS that allows the user to schedule a job at a particular time or periodically.
Configuration changes must be explicitly saved, and configuration commands are serialized for execution across multiple SNMP sessions. To save the configuration, enter the copy running-config startup-config command from the EXEC mode prompt to save the new configuration into nonvolatile storage. Once this command is issued, the running and the startup copies of the configuration are identical. Every configuration command is logged to the RADIUS server.
CLI Modes
EXEC mode:
Show system information, run debug commands, copy and delete files, and get a directory listing of Bootflash:
Configuration mode:
Configure features that affect the switch as a whole
Configuration submode:
Configure switch sub-parameters
[Figure: CLI mode hierarchy. The switch prompt (switch#) indicates EXEC mode, with commands such as copy and debug fspf. Configuration mode contains feature commands such as fspf database. Configuration submodes cover interface types (fc, fcip, iscsi, mgmt, port-channel) and their switchport parameters, including shut and no shut. exit steps back one level; end returns directly to EXEC mode.]
CLI Modes
Switches in the MDS 9000 Family have three command mode levels:
The commands available to you depend on the mode that you are in. To obtain a list of available commands, type a ? at the system prompt. From the EXEC mode, you can perform basic tests and display system information. This includes operations other than configuration such as show and debug. Show commands display system configuration and information. Debug commands enable printing of debug messages for various system components. Use the config or config terminal command from EXEC mode to go into the configuration mode. The configuration mode has a set of configuration commands that can be entered after a config terminal command, in order to set up the switch. The CLI commands are organized hierarchically, with commands that perform similar functions grouped under the same level. For example, all commands that display information about the system, configuration, or hardware are grouped under the show command, and all commands that allow you to configure the switch are grouped under the config terminal command, which includes switch sub-parameters at the configuration submode level. To execute a command, you enter the command by starting at the top level of the hierarchy. For example, to configure a Fibre Channel interface, use the config terminal command. Once you are in configuration mode, issue the interface command. When you are in the interface submode, you can query the available commands there.
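For example, a short illustrative transcript of drilling into the hierarchy and querying the commands available at the interface submode:

switch# config terminal
switch(config)# interface fc1/1
switch(config-if)# ?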
[Slide: annotated CLI examples, covering: saving the active config in NVRAM; listing files stored on bootflash:; erasing a file stored on bootflash:; copying a file and changing its name; monitoring all FLOGI operations; switching off debug; gathering switch information for support; saving output in volatile:tempfile; compressing the tempfile; copying the file to an external flash card; entering config mode to change settings; configuring a specific interface; configuring a port as a 1-Gb port.]
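The commands behind those annotations are not preserved in this copy; a plausible reconstruction (file and interface names are placeholders) is:

switch# copy running-config startup-config
switch# dir bootflash:
switch# delete bootflash:oldimage.img
switch# copy bootflash:file1 bootflash:file2
switch# debug flogi event
switch# no debug all
switch# show tech-support > volatile:tempfile
switch# gzip volatile:tempfile
switch# copy volatile:tempfile.gz slot0:tempfile.gz
switch# config terminal
switch(config)# interface fc1/1
switch(config-if)# switchport speed 1000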
Command Aliases
Replaces complex command strings with an alias name
Command aliases persist across reboots
Commands being aliased must be typed in full without abbreviation
A command alias always takes precedence over CLI keywords

(config)# cli alias name gigint interface gigabitethernet
Command Aliases
Some commands can require a lot of typing. An example is the gigabitethernet keyword, which can sometimes be abbreviated to gig; it is also often useful to group several commands and subcommands together. This can be done using command aliases. Command aliases are saved in NVRAM, so they persist across reboots. When creating an alias, the individual commands must be typed in full, without abbreviation. If you define an alias, it will take precedence over CLI keywords starting with the same letters, so be careful when using abbreviations.
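Using the alias from the slide, a short illustrative session might look like this (the module/port number is a placeholder); typing the alias expands to the full interface command:

switch# config terminal
switch(config)# cli alias name gigint interface gigabitethernet
switch(config)# gigint 2/1
switch(config-if)#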
Command Scheduler
Helps schedule configuration and maintenance jobs on any MDS switch
Jobs can be scheduled on a one-time basis or periodically
One-time mode: the job is executed once, at a pre-defined time
Periodic mode: the job is executed daily, weekly, monthly, or at a configurable interval (delta)
The MDS date and time must be accurately configured
Scheduled jobs may fail if an error is encountered, for example if a license has expired or a feature is disabled
Command Scheduler
The Cisco MDS SAN-OS provides a UNIX cron-like facility called the Command Scheduler. Jobs can be defined listing several commands that are to be executed in order. Jobs can be scheduled to run at the same time every day, week, or month, or at a configurable frequency (delta). All jobs are executed non-interactively, without administrator response. Be aware that a job may fail if a command that it issues is disabled or no longer supported, for example because a license has expired. The job will fail at the point of error, and all subsequent commands will be ignored.
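As an illustrative sketch (job and schedule names are placeholders), a job that saves the configuration every night might be defined like this:

switch# config terminal
switch(config)# scheduler enable
switch(config)# scheduler job name save_config
switch(config-job)# copy running-config startup-config
switch(config-job)# end
switch# config terminal
switch(config)# scheduler schedule name nightly
switch(config-schedule)# job name save_config
switch(config-schedule)# time daily 23:00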
FM Server runs as a Windows service
Performance Manager runs as a Windows service
Both run as daemons on UNIX
Summary View
Summary View
Device View provides status at a glance
Fan, power, supervisor, and switching module status indicators
Port status indicators
Fabric Manager discovers network devices and creates a topology map with VSAN and zone visualization. VSAN/zone and switch trees are also available to simplify configuration. Immediately after the Fabric View is opened, the discovery process begins. Using information gathered from a seed switch (MDS 9000 Family), including name server registrations and FC-GS-3 Fabric Configuration Server information, the Fabric Manager can draw a fabric topology in a user-customizable map. Because of the source of this information, any third-party devices, such as other fabric switches, that support the FC-GS and FC-GS-3 standards are discovered and displayed on the topology map. Vendor Organizational Unique Identifier (OUI) values are translated to derive the manufacturer of third-party devices, such as QLogic Corp., EMC Corp., or JNI Corp. Fabric Manager provides an intuitive user interface to a suite of network analysis and troubleshooting tools. One of those tools is the Device Manager, a complementary graphical user interface designed for configuring, monitoring, and troubleshooting specific switches within the SAN fabric.
Initial Setup
The first time that you access a switch in the Cisco MDS 9000 Family, it runs a setup program that prompts you for the IP address and other configuration information necessary for the switch to communicate over the supervisor module Ethernet interface. This information is also required if you plan to configure and manage the switch. The IP address must first be set up in the CLI when the switch is powered up for the first time, so that the Cisco MDS 9000 Fabric Manager can reach the switch. The console needs a rollover RJ-45 cable. There is a switch on the supervisor module of the MDS 9500 Series switches that, if placed in the out position, allows the use of a straight-through cable. This switch is shipped in the in position by default and is located behind the LEDs. In order to set up a switch for the first time you must obtain the administrator password, which is used to get network administrator access through the CLI. The Simple Network Management Protocol version 3 (SNMPv3) user name and password are used when you log on to the Fabric Manager, and should be identified as soon as possible. The switch name will become the prompt when the switch is initialized, and the management Ethernet port IP address and subnet mask need to be known for out-of-band access.
Setup Defaults
Configure default switchport interface state (shut/noshut) [shut]:
Configure default switchport trunk mode (on/off/auto) [on]:
Configure default zone policy (permit/deny) [deny]:
Enable full zoneset distribution (yes/no) [n]:
Enable FCID persistence in all the VSANs on this switch (yes/no) [n]:
Would you like to edit the configuration (yes/no) [no]:
Use this configuration and save it? (yes/no) [y]:
The management interface is active at this point
All Fibre Channel and Gigabit Ethernet interfaces are shut down
Select yes to use and save the configuration
The setup routine can be accessed from the EXEC mode of the CLI with the # setup command
Setup Defaults
It is recommended to have the switch interfaces come up administratively disabled, or shut. This approach ensures that the administrator has to configure each interface as needed and then enable it with the no shut command, resulting in a more controlled environment. The switch port trunk mode should be on: two connected E_Ports will not form a trunk if either end port has the trunk mode off. The default zoning policy of deny will make all interfaces on a switch inoperable until a zone is created and activated; interfaces in the default zone cannot communicate with each other. This policy can be used for greater security. If the permit policy is enabled, then all ports in the default zone will be able to communicate with each other. The system will ask if you would like to edit the configuration that was just printed out. Any configuration changes made to a switch are immediately enforced but are not saved. If no edits are needed, then you will be asked if you want to use this configuration and save it as well. Since [y] (yes) is the default selection, pressing Return activates this function, and the configuration becomes part of the running-config and is copied to the startup-config. This also ensures that the kickstart and system boot images are automatically configured, so you do not have to run a copy command after this process. A power loss will restart the switch using the startup-config, which has everything saved that has been configured to nondefault values. If you do not save the configuration at this point, none of your changes will be present the next time the switch is rebooted.
The labs are located in Nevada and are used extensively, 24 hours a day, during MDS lab-based training courses throughout the world. The lab interface provides login authentication, full access to switch consoles, and desktop access on each server. The labs currently contain over 30 pods, each containing two MDS switches, two servers, JBOD storage, and a PAA for diagnostics.
Point your browser at www.labgear.net
Enter your username and password
Click Console to access the MDS CLI
Click Desktop to access the W2K server
CSDF Labs
Lab 1: Initial Switch Config
Lab 2: Accessing Disks via Fibre Channel
Lab 3: Configuring High Availability SAN Extension
Lab 4: Configuring IVR for SAN Extension
Lab 5: Exploring Fabric Manager Tools
Lab 6: Implementing iSCSI
This slide shows the labs that you will perform in this course.
Lesson 6
Objectives
Upon completing this lesson, you will be able to explain how the MDS 9000 Storage Services Module (SSM) enables network-based storage applications. This includes being able to meet these objectives:
Explain the basics of SAN-based storage virtualization
Explain the value of network-based storage virtualization
Describe the network-hosted application services supported by the SSM
Describe the network-assisted application services supported by the SSM
Describe the network-accelerated application services supported by the SSM
Describe Fibre Channel Write Acceleration
Traditional array-based storage management:
Volume management
Individually managed
Just-in-case provisioning
Stranded capacity
Snapshot within a disk array
Array-to-array replication
LUN Mapping
The server then maps some or all visible LUNs to volumes.
Target Identification
The server FC driver identifies the SCSI Target ID with the pWWN of the target port, then associates each port with its Fibre Channel FCID.
Command frames are then sent by the SCSI Initiator (server) to the SCSI Target (storage device). In a heterogeneous SAN, there may be several storage arrays and JBODs from different vendors:
Difficult to configure
Costly to manage
Difficult to replicate and migrate data
What is Virtualization?
The process of presenting a logical grouping or subset of computing resources so that they can be accessed in ways that give benefits over the original configuration.
Server Virtualization is a way of creating several Virtual Machines from one computing resource.
Storage Virtualization is a logical grouping of LUNs creating a common storage pool.
What is Virtualization?
Virtualization is defined as the process of presenting a logical grouping or subset of computing resources so that they can be accessed in ways that give benefits over the original configuration. In a heterogeneous environment, LUN management can become very costly and time consuming. Storage Virtualization is sometimes used instead to create a common pool of all storage and perform LUN management within the network.
Symmetric Virtualization
All server ports are zoned with the Virtualization Appliance virtual target port (T); servers only discover one target
All storage ports are zoned with the Virtualization Appliance virtual initiator port (I); all storage ports are controlled by one initiator
All control and data frames are sent to the virtual target and terminated; the CDB and LUN are remapped and a new frame is sent to the real target

Advantages:
Reduced complexity; single point of management

Disadvantages:
All frames are terminated and remapped by the appliance and resent to their destination
Adds latency per frame
All traffic passes through the appliance
Potential single point of failure
Potential performance issue
Symmetric Virtualization
In the symmetric approach, all I/Os and metadata are routed via a central virtualization storage manager. Data and control messages use the same path, which is architecturally simpler but has the potential to create a bottleneck. The virtualization engine does not have to live in a completely separate device. It may be embedded in the network, as a specialized switch, or it may run on a server. To provide alternate data paths and redundancy, there are usually two or more virtual storage management devices; this leads to issues of consistency between the metadata databases used to do the virtualization. The fact that all data I/Os are forced through the virtualization appliance restricts the SAN topologies that can be used and can cause bottlenecking. The bottleneck problem is often addressed by using caching and other techniques to maximize the performance of the engine; however, this again increases complexity and leads to consistency problems between engines.
Asymmetric Virtualization
Each server contains an agent that intercepts Block I/O requests and sends the metadata (CDB and LUN) to a Virtualization Manager on the LAN
The Virtualization Manager remaps the CDB and LUN and returns it to the server
The server now sends the modified control frame to the storage target port
All subsequent data and response frames flow directly between Initiator and Target

Advantages:
Data frames are sent directly to the storage port
Low latency

Disadvantages:
Requires an agent in the host to intercept the control frame
Remapping of the CDB and LUN adds latency to the first frame in each exchange
The Virtualization Manager could be a single point of failure
Asymmetric Virtualization
In the asymmetric approach, the I/O is split into three parts:
First, the server intercepts the Block I/O request.
Next, it queries the metadata manager to determine the physical location of the data.
Then, the server stores or retrieves the data directly across the SAN.
The metadata can be transferred in-band, over the SAN, or out-of-band, over an Ethernet link; the latter is more common, as it avoids IP metadata traffic slowing the data traffic throughput on the SAN, and because it does not require Fibre Channel HBAs that support IP. Each server that uses the virtualized part of the SAN must have a special interface or agent installed to communicate with the metadata manager in order to translate logical data access to physical access. This special interface may be software or hardware. Initial implementations will certainly be software, but later implementations might use specialized HBAs, or possibly an additional adapter that works with standard HBAs.
Network-based virtualization offers substantial benefits that overcome the challenges of traditional SAN management solutions. Network-based virtualization means that management is consolidated into a single point and simplified: hosts and storage become independent of the various management solutions.
Servers are no longer responsible for volume management and data migration
Network-based virtualization enables real-time provisioning of storage, reducing the waste and overhead of over-provisioning
Legacy and heterogeneous storage assets can be consolidated and fully utilized
Data is better protected by simplified snapshot and replication techniques
It is easier to assign different classes of data
What are some existing approaches to storage virtualization? How is the MDS series a superior solution?
NETWORK-BASED VIRTUALIZATION
Host-Based Apps: app integration, multi-pathing
Network-Based Apps: volume management, snapshot, replication
Array-Based Apps: RAID, multiple paths

Customer benefits: Information Lifecycle Management, increased storage utilization, improved business continuance

Proof points:
Simplified management
Non-disruptive data migration across tiered storage
Heterogeneous storage pooling
Flexible storage provisioning
Support for point-in-time copy and replication
Flexible data protection services
Network-hosted virtualization products include EMC Invista, the Incipient Network Storage Platform, and Veritas Storage Foundation for Networks.
Benefits:
Insulate servers: all storage changes, including upgrades to storage arrays, are seamless to the hosts.
Consolidation: different types of storage can accumulate through mergers and acquisitions, reorganizations, or vendor shifts within an IT department. Network-based virtualization allows you to incorporate new storage seamlessly and maintain the same services and scripts.
Migration: the ability to move data seamlessly from one set of storage to another. (Some environments do this with a host-based volume manager, which does not scale well to thousands of hosts.)
Secure isolation: instantiation of a Virtual LUN (VLUN) so that it is only accessible within an administrator-defined VSAN or zone.
Just-in-time provisioning: solves the problem of just-in-case provisioning.
Different classes of storage for different purposes.
Central storage management: a central tool to manage all storage.
FC-WA (Fibre Channel Write Acceleration) enhances the performance of write operations over long distances, e.g., array replication.
FAIS (Fabric Application Interface Standard) is a standards-based protocol used by external virtualization devices to communicate with the SSM through an open API (Application Programming Interface).
NASB (Network-Assisted Serverless Backup) is used with supporting backup software to move the data-mover function into the network and thereby reduce the CPU load on the application server or media server.
The SANTap protocol is used by a number of storage partners with external storage appliances to communicate with the SSM.
Network-Assisted: partner software resides on arrays, an external server, or an appliance
Network-Accelerated: partner software is accelerated by a Cisco engine or agent (e.g., Cisco X-Copy)

Potential network virtualization applications:
Heterogeneous volume management
Data migration
Heterogeneous replication / copy services
Continuous Data Protection (CDP)
The network-hosted technique is implemented by installing Cisco partner software on the SSM module in the MDS; the network device hosts the software that performs the virtualization function for the application. The network-assisted technique is implemented by installing Cisco partner software on a separate appliance or external server; here the network device assists the software that performs the virtualization function. The network-accelerated technique uses a function on the Cisco SSM to accelerate the partner application. Serverless backup is a typical network application that is accelerated by the X-Copy function running on the MDS SSM; another such function is Fibre Channel Write Acceleration.
Network-Hosted Applications
Network-Hosted Services
MDS 9000 Storage Services Module (SSM)
ASIC-based innovation
Open, standards-based platform
Hosts multiple partner applications
Network-Assisted
SANTap protocol
Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring
Network-Accelerated
Standard FC protocols
Serverless Backup
FC Write Acceleration
Synchronous replication
Network-Hosted: EMC Invista
With the SSM, Cisco introduced an open, standards based platform for enabling intelligent fabric applications.
SSM hardware:
Dual-function module with 32 FC ports and embedded Virtualization Engine processors
Purpose-built ASICs optimize virtualization functions, providing high performance with a highly available, scalable, and fully distributed architecture
Any-to-any virtualization (no need to connect hosts or storage directly into one of the FC ports)
Multiple best-of-breed partners for flexibility and investment protection
There are four key customer benefits of this intelligent fabric applications platform:
First, it is an open, standards-based solution for enabling multiple partner applications.
Second, it provides feature velocity by reducing the development cycle.
Third, it has a modular software architecture for running multiple applications simultaneously.
Finally, it provides investment protection by delivering real-world applications today, with the flexibility to enable advanced functions in software.
Scalable hardware
High performance
Embedded diagnostics
Multiprotocol platform for intelligent services
VSAN scaling features

EMC Invista:
Network-based volume management (creation, presentation, and management of volumes)
Online data mobility
Heterogeneous clones
Point-in-time copies
Cisco Provides
Data Path Cluster (DPC)
Cisco Storage Services Module (SSM)
Cisco Storage Services Enabler license
Cisco MDS 9000 Family of Fibre Channel switches
EMC Provides
Control Path Cluster (CPC) on CX700
EMC Invista software
Cabinet
Meta-storage
Invista requires two components: intelligent switches from vendors such as Cisco, Brocade, and McDATA, along with a separate appliance, or a set of them. This set, known as the Control Path Cluster (CPC), builds what amounts to routing tables and maintains metadata. The tables are used by the intelligent switches to rewrite FC and SCSI addresses at wire speed. That capability makes the architecture highly scalable, but more complex as well. EMC Invista is installed on an external Control Path Cluster (CX700). EMC Invista manages the control path in the Control Processor, while data flows directly between host and storage through the Data Path processors located on the SSM module in the Cisco MDS. The benefits of performing virtualization on the SSM module are:
Full integration into the fast, high-bandwidth, redundant crossbar
High availability and redundancy
Minimal latency and high throughput
Comprehensive centralized security
A centralized solution that is easier to manage
Network-Assisted Applications
Network-Assisted Services
MDS 9000 Storage Services Module (SSM)
ASIC-based innovation
Open, standards-based platform
Hosts multiple partner applications
Network-Assisted
SANTap protocol
Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring
Network-Accelerated
Standard FC protocols
Serverless Backup
Write Acceleration
Synchronous replication
Network-Hosted: EMC Invista
Intelligent storage services are also provided on the SSM module by a large number of storage partners. Each network-based appliance communicates with the SSM module through the SANTap protocol. Network-assisted applications include:
Heterogeneous storage replication
Continuous data protection
Data migration
Storage performance monitoring
Out-of-Band Appliances
Advantages:
The appliance is not in the primary I/O path

Limitations:
The appliance requires host-based software agents, consuming CPU, memory, and I/O
The appliance adds latency to the initial I/O request
Potentially compromises I/O performance by issuing twice as many I/Os
Limited interoperability with other appliances or disk array features
Out-of-Band Appliances
When a separate storage appliance is connected to the network, it has one prime advantage, in that the appliance is not in the main data path and so is not perceived as a bottleneck. The limitations of this approach are many:
Each host must have a software agent installed on the server to intercept I/O requests and redirect them to the appliance. If the administrator fails to install the agent on every server, then that server will attempt to communicate directly with its storage instead of via the appliance, possibly leading to data loss.
Each intercepted I/O request must be directed to an appliance that is usually connected on the LAN, which adds latency to every I/O operation.
When the appliance is connected in-band over Fibre Channel, this results in additional I/O traffic across the FC SAN.
Every solution is proprietary and not defined by standards, so appliances cannot interoperate with one another.
In-Band Appliances
Advantages:
Does not require host agents

The appliance is in the primary data path

Limitations:
Disruptive insertion of the appliance into the data path
Potential performance bottleneck, because all frames flow through the appliance
The appliance adds latency to all frames
Limited interoperability with other appliances or disk array features
The appliance can be a single point of failure
In this design, the host sends all I/O to the appliance; the appliance intercepts the I/O and sends it to the target.
In-Band Appliances
When a separate external storage appliance is connected in-band, the advantage is that host-based software agents are no longer required. However, there are several limitations to this approach:
The appliance cannot be added to the SAN without causing massive disruption.
All data between each of the servers and the storage must now pass through the appliance, adding latency to every frame and becoming a potential bottleneck in a busy SAN.
The appliance becomes a virtual target for all SCSI-based communication. It receives all SCSI I/O and sends it to the appropriate storage devices by creating a virtual initiator.
The appliance can become a single point of failure, although most solutions offered today are clustered.
Every solution is proprietary and not defined by standards, so each appliance cannot interoperate with other appliances.
With SANTap, the host issues its I/O command to the target as normal, and SANTap sends a copy of the I/O to the appliance; there is no disruption of the primary I/O.
The integrity, availability, and performance of the primary I/O from the server to the storage array are maintained
Seamless insertion and provisioning of appliance-based storage applications
Storage services can be added to any server/storage device in the network without any rewiring
An incremental model for deploying appliance-based applications; easy to revert to the original configuration
Addresses the scalability issue for appliance-based storage applications
Investment protection
Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring
[Table: SANTap partner applications (partner names not preserved in this copy): heterogeneous async replication over extended distances with advanced data compression; disk-based Continuous Data Protection (CDP) for zero-backup windows with the ability to restore to any point in time; heterogeneous async replication over extended distances with data consistency; heterogeneous asynchronous replication and CDP; heterogeneous asynchronous replication.]
Network-Accelerated Applications
Network-Accelerated Services
MDS 9000 Storage Services Module (SSM)
ASIC-based innovation
Open, standards-based platform
Hosts multiple partner applications
Network-Assisted
SANTap protocol
Heterogeneous storage replication
Continuous log-based data protection
Online data migration
Storage performance/SLA monitoring
Network-Accelerated
Standard FC protocols
Serverless Backup
FC Write Acceleration
Synchronous replication
Network-Hosted: EMC Invista
The SSM module provides a number of network-accelerated intelligent services that enhance the standard Fibre Channel protocols. These are:
Network-Assisted Serverless Backup (NASB)
Fibre Channel Write Acceleration (FC-WA)
Network-based synchronous replication
Instead of media servers, the MDS (with SSM) moves data directly from disk to tape.
LAN-based: data moved over the LAN; the application server moves the data
LAN-free: data moved over the SAN; the application server moves the data
Server-free: data moved over the SAN; the application server is not in the data path; a dedicated media server moves the data
Serverless backup: data moved over the SAN; the application server is not in the data path; the fabric moves the data
Offload I/O and CPU work from media servers to the SSM
Reduce server administration and management tasks
Investment Protection
No changes to the existing backup environment
SSM data movement can be enabled with software
VERITAS NetBackup
Cisco is working with five vendors who are all at different stages in qualifying their backup solution with the MDS 9000 network-accelerated serverless backup solution.
[Figure: SCSI write exchange between Initiator and Target: the initiator sends the write command, the target returns XFER_RDY, the initiator sends FCP_DATA, and the target returns FCP_RSP; the exchange takes two round trips.]
SCSI standards define the way a SCSI Initiator shall communicate with a SCSI Target. This consists of four phases:
The Initiator sends a SCSI Write command to the SCSI Target LUN, containing a CDB with the command, LBA, and block count.
When the SCSI Target is ready to receive data, it responds with XFER_RDY.
When the SCSI Initiator receives XFER_RDY, it starts sending data to the SCSI Target.
Finally, when the SCSI Target has received all the data, it returns a Response (status) to the SCSI Initiator.
This constitutes two round trip journeys between the SCSI Initiator and SCSI Target.
In a data center environment, distances are short, so the round-trip time is low and latency is reduced. In a WAN environment, where distances are much longer, the SCSI Initiator cannot send data until it receives XFER_RDY after the first round trip. As distances increase, this considerably impacts write performance. Fibre Channel Write Acceleration spoofs XFER_RDY in the MDS switch. When the original SCSI command is sent by the SCSI Initiator through the MDS switch to the SCSI Target, the MDS responds immediately with an XFER_RDY. The SCSI Initiator can now immediately send data to the SCSI Target instead of waiting for the true XFER_RDY to be received. Meanwhile the SCSI command is received by the Target, which responds with the real XFER_RDY. When
the Target MDS switch receives the data, it passes the data on to the Target. Finally, the SCSI Target sends a Response (status) back to the Initiator in the normal way. In a typical environment, several SCSI operations take place between the SCSI Initiator and SCSI Target simultaneously, so these operations are interleaved, maximizing performance and minimizing latency.
Optimize bandwidth for DR
Increase distance between the primary site and the remote site
Minimize application latency
Investment protection: transport agnostic (DWDM, CWDM, SONET/SDH, dark fiber)

Up to 30% performance improvement was seen by a major financial services company over a 125 km distance.

Primary application: synchronous replication
SAN islands were traditionally built to:
Maintain isolation from fabric events or configuration errors
Provide isolated management of island infrastructures
This model was driven by bad experiences with large multi-switch fabrics
This model is also associated with very high costs and a high level of complexity. To help customers overcome the limitations of building homogeneous SAN islands, Cisco has delivered new storage networking technologies and services aimed at enabling IT organizations to consolidate heterogeneous disk, tape, and hosts onto a common storage networking infrastructure (Phase 1). By introducing new intelligent storage network services, Cisco enables customers to scale storage networks beyond today's limitations while delivering the utmost in security and resiliency. An innovative infrastructure virtualization service called Virtual SANs (VSANs) alleviates the need to build isolated SAN islands by replicating such isolation virtually within a common, cost-optimized physical infrastructure. The intelligent Multilayer Storage Utility (Phase 2) involves leveraging Cisco Multilayer Storage Networking as the base platform for delivering next-generation storage services. With the intelligent multilayer storage utility, the storage network is viewed as a system of distributed intelligent network components unified through a common API to deliver a platform for network-based storage services.
Network-based storage services offer several attractive opportunities for further cost optimization of the storage infrastructure. To achieve the Multilayer Storage Utility, Cisco is partnering with the industry leaders as well as with the most promising start-up companies to offer complete solutions to customers. Network-hosted storage products from EMC, Veritas, and IBM, as well as SANTap solutions developed in partnership with companies like Topio, Kashya, or Alacritus, are excellent examples of how Cisco is delivering in this space.
Lesson 7
Optimizing Performance
Overview
In this lesson, you will learn how to design high-performance SAN fabrics using FSPF traffic management, load balancing, Virtual Output Queues, Fibre Channel Congestion Control, and Quality of Service.
Objectives
Upon completing this lesson, you will be able to engineer SAN traffic on an MDS 9000 fabric. This includes being able to meet these objectives:
Define oversubscription and blocking
Explain how Virtual Output Queues solve the head-of-line blocking problem
Explain how the MDS 9000 handles fabric congestion
Explain how QoS is implemented in an MDS 9000 fabric
Explain how port tracking mitigates performance issues due to failed links
Explain how to configure traffic load balancing on an MDS 9000 SAN fabric
Describe the MDS 9000 tools that simplify SAN performance management
[Figure: Hosts A and B attach to the fabric through 2 Gbps ports; Array C attaches through a 2 Gbps port and Array D through a 1 Gbps port.]
It is important to fully understand two fundamental SAN design concepts: oversubscription and blocking. Although these terms are often used interchangeably, they relate to very different concepts. Oversubscription and blocking considerations are critical when designing a fabric topology. Oversubscription is a normal part of any SAN design and is essentially required to help reduce the cost of the SAN infrastructure. Oversubscription refers to the fan-in ratio of available resources, such as ISL bandwidth or disk array I/O capacity, to the consumers of the resource. For example, many SAN designs have inherent design oversubscription as high as 12:1 hosts-to-storage, as recommended by disk subsystem vendors. A general rule of thumb relates oversubscription to the cost of the solution: the higher the oversubscription, the less costly the solution. Blocking, often referred to as head-of-line (HOL) blocking, within a SAN describes a condition where congestion on one link negatively impacts the throughput on a different link. In this example, congestion on the link connecting Array D is negatively impacting the flow of traffic between Host A and Array C. This is discussed in more detail later in this section.
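As a purely illustrative calculation: twelve hosts, each attached at 2 Gbps, fanned in to a single 2 Gbps storage port give the 12:1 ratio mentioned above. The hosts could collectively offer 24 Gbps to a 2 Gbps resource, which is acceptable in practice only because each host typically uses a small fraction of its link bandwidth.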
Completely non-blocking
Maximum throughput
Consistent latency
All ports and flows are serviced evenly
Quality of Service
Most impressive was the ability to sustain traffic at a much higher throughput rate than other switches
Miercom
Consistent Latency
The MDS 9509 exhibited excellent throughput and latency for all frame sizes in this test. [...] Regardless of the load, the minimum latency for both frame sizes was very consistent. For small frames, it varied from 7.2 to 52.9 microseconds under 100% intended load. For large frames, the latency ranged from 19.7 to 218.9 microseconds under 100% intended load. [...] Whenever other switches tested receive frames at a rate exceeding their capability, their buffers fill and their latency increases dramatically.
Maximum Throughput
The MDS 9509 showed excellent performance regardless of frame size. It achieved near line rate with small frames (98.67%) and full line rate with large frames, both with 100% intended load. [...] Furthermore, the MDS 9509 was able to sustain traffic at a much higher throughput rate for minimum- and maximum-sized frames while maintaining a more consistent latency than other switches tested. More impressive was the distribution of the traffic flows, which varied +/- 0.01 MB/s for small frames and +/- 0.005 MB/s for large frames.
Source: Performance Validation Test - Cisco MDS 9509, by Miercom at Spirent SmartLabs, Calabasas, California, December 2002: http://cisco.com/application/pdf/en/us/guest/products/ps4358/c1244/cdccont_0900aecd800cbd65.pdf
Head-of-Line Blocking
The example illustrates a scenario where a storage array is connected to a switch via a single link. Without VOQ technology, traffic from a storage array destined for three servers, A, B and C, will flow into a switch and be placed into a single input buffer as shown. Assuming the servers are capable of receiving data transfers from the storage arrays at a sufficient rate, there should not be any problems.
Should there be a problem or slowdown with one of the servers, the storage array may be prevented from sending data as quickly as it is capable, to the remaining servers. This is a classic HOL blocking condition.
Virtual Output Queues and the central arbiter:
Monitor the input queues and the egress ports
Provide for fairness
Allow unobstructed flow of traffic destined for un-congested ports
Absorb bursts of traffic
Alleviate conditions that might lead to HOL blocking
Without a central arbiter, there would be a potential to starve certain modules and ports. The central arbiter maintains the traffic flow - like a traffic cop. The Cisco MDS Arbiter can schedule frames at over 1 billion frames per second. (1 billion = 1000 million)
Traffic congestion
Fibre Channel Congestion Control (FCC) is used to gracefully alleviate congestion using intelligent feedback mechanisms within the fabric. FCC is a feature designed to throttle data at its source if the destination port is not responding correctly. It is a Cisco proprietary protocol that makes the network react to a congestion situation. The network adapts intelligently to the specific congestion situation, maximizing the throughput and avoiding head of line (HOL) blocking. The protocol has been customized for lossless networks such as Fibre Channel. FCC consists of the following three elements:
Congestion Detection: performed by analyzing the congestion of each output port in the switch.
Congestion Signaling: performed with special packets called Edge Quench (EQ).
Congestion Control: performed through rate limiting of the incoming traffic.
S1 is sending frames into the fabric at 1 Gbps.
R1 is only receiving frames at 50 Mbps and does not drain FC frames fast enough.
Congestion occurs at the egress port toward R1 as the buffers start to fill up.
As the buffers fill, frames back up into the previous buffer, and congestion is detected at the ingress port of Switch B.
A congestion signal is sent to Switch A, the switch nearest the source, identifying the troubled receiver on the appropriate linecard.
S1's traffic is rate limited to a level that R1 can sustain, matching the rate of flow into and out of the switch.
The MDS 9000 switch monitors traffic from each host for congestion and FCC is activated when and if congestion is detected. The quench on edge message is sent out and the offending host traffic will be cut in half for each quench message received. There is no need for an un-quench message, because traffic usually builds back up slowly.
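A minimal configuration sketch for the feature just described (the exact options available vary by SAN-OS release; this is illustrative): FCC is enabled globally, and its state can then be inspected.

switch# config terminal
switch(config)# fcc enable
switch(config)# end
switch# show fcc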
Quality of Service
QoS Design Considerations
How can I provide priority for critical storage traffic flows?
Quality of Service:
Avoids and manages network congestion and sets traffic priorities across the network
Provides predictable response times
Manages delay- and jitter-sensitive applications
Controls loss during bursty congestion
Quality of Service (QoS) includes mechanisms that support the classification, marking, and prioritization of network traffic. QoS concepts and technology were originally developed for IP networks. The MDS 9000 family of switches extends QoS capabilities into the storage networking domain, for IP SANs as well as Fibre Channel SANs. No other switch on the market today is capable of prioritizing Fibre Channel traffic. Classification involves identifying and splitting traffic into different classes. Marking involves setting bits in a frame or packet to let other network devices know how to treat the traffic. Prioritization involves queuing strategies designed to avoid congestion and provide preferential treatment. In a storage network, examples of classification, marking, and prioritization schemes might include:
Classification: classify all traffic in a particular VSAN, all traffic bound for a particular destination FCID, or all traffic entering a particular FCIP tunnel.
Marking: set particular bits in the IP header or the EISL VSAN header.
Prioritization: utilize queuing strategies such as Deficit Weighted Round Robin (DWRR) or Class-Based Weighted Fair Queuing (CBWFQ) to give preference based on certain markings.
QoS features enable networks to control and predictably service a variety of networked applications and traffic types. The goal of QoS is to provide better and more predictable network service by providing dedicated bandwidth, controlled jitter and latency, and improved loss characteristics.
Applications deployed over storage networks increasingly require quality, reliability, and timeliness assurances. In particular, applications that use voice, video streams, or multimedia must be carefully managed within the network to preserve their integrity. QoS technologies allow IT managers and network managers to:
Predict response times for end-to-end network services
Manage jitter-sensitive applications, such as audio and video playback
Manage delay-sensitive traffic, such as real-time voice
Control loss in times of inevitable bursty congestion
Set traffic priorities across the network
Support dedicated bandwidth
Avoid and manage network congestion
Managing QoS becomes increasingly difficult because many applications deliver unpredictable bursts of traffic. For example, usage patterns for web, e-mail, and file transfer applications are virtually impossible to predict, yet network managers need to be able to support mission-critical applications even during peak periods.
Three priority queues for data traffic
Absolute priority for control traffic
Flows classified based on input interface, destination device alias, or source/destination FCID or pWWN
QoS only functions during periods of congestion
FCC must be enabled
Follows Differentiated Services (DiffServ) model defined in RFCs 2474 and 2475
Flows can be classified by:
Input interface
Source FCID
Destination FCID
Source pWWN
Destination pWWN
QoS only functions during periods of congestion. To achieve the greatest benefit, QoS requires that FCC be enabled, and requires two or more switches in the path between the initiators and targets. Data traffic QoS for Fibre Channel is not enabled by default, and requires the Enterprise Package license. However, absolute priority for control traffic is included in the base SAN-OS license, and is enabled by default.
[Figure: QoS in action. An OLTP server (low throughput; bursty, random I/O; low-latency requirement) and a backup server (high throughput; sequential, streaming I/O; not latency-sensitive) share a congested path to the disk. A class map classifies each flow; frames are queued in the high-, medium-, or low-priority queues of the VOQ, while control traffic uses the absolute-priority queue (PQ). DWRR schedulers service the three data queues with weights such as 50%, 30%, and 20%. Traffic classification can use source or destination pWWN, source or destination FCID, source interface, or destination device alias; a class map is mapped to a DSCP value of 0-63 (46 is reserved) or to a policy map.]
Transaction processing, a low-volume, latency-sensitive application, requires quick access to requested information. Backup processing requires high bandwidth but is not sensitive to latency. In a network that does not support service differentiation, all traffic is treated identically: it experiences similar latency and gets similar bandwidth. With the QoS capability of the MDS 9000 platform, data traffic can now be prioritized into three distinct levels of service differentiation (low, medium, or high), while control traffic is given absolute priority. You can apply QoS to ensure that FC data traffic for latency-sensitive applications receives higher priority than traffic for throughput-intensive applications like data warehousing. In the example, the Online Transaction Processing (OLTP) traffic arriving at the switch is marked with a high priority level through classification (class map) and marking (policy map). Similarly, the backup traffic is marked with a low priority level. The traffic is sent to the corresponding priority queue within a Virtual Output Queue (VOQ). A Deficit Weighted Round Robin (DWRR) scheduler configured in the first switch ensures that high priority traffic is treated better than low priority traffic. For example, DWRR weights of 60:30:10 imply that the high priority queue is serviced at six times the rate of the low priority queue. This guarantees lower delays and higher bandwidth to high priority traffic if congestion sets in. A similar configuration in the second switch ensures the same traffic treatment in the other direction. If the ISL is congested when the OLTP server sends a request, the request is queued in the high priority queue and is serviced almost immediately, as the high priority queue is not congested. The scheduler assigns it priority over the backup traffic in the low priority queue. Note that the absolute priority queue is always serviced first; there is no weighted round robin.
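A configuration along these lines might look like the following sketch (the class, policy, and VSAN names/numbers and the WWN are placeholders, and the match criteria available depend on the SAN-OS release):

switch# config terminal
switch(config)# qos enable
switch(config)# qos class-map oltp match-any
switch(config-cmap)# match source-wwn 21:00:00:e0:8b:05:36:22
switch(config-cmap)# exit
switch(config)# qos policy-map app-priority
switch(config-pmap)# class oltp
switch(config-pmap-c)# priority high
switch(config-pmap-c)# exit
switch(config-pmap)# exit
switch(config)# qos service policy app-priority vsan 1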
Zone-Based QoS
Zone-based QoS complements the standard QoS data-traffic classification by WWN or FCID
Zone-based QoS helps simplify configuration and administration by using the familiar zoning concept
QoS parameters are distributed as a zone attribute
Zone-Based QoS
With zone-based QoS, QoS parameters are distributed as a zone attribute. This simplifies administration of QoS by providing the ability to classify and prioritize traffic by zone, instead of by initiator-target pair. Zone-based QoS is supported in both Basic and Enhanced zoning modes. QoS parameters are distributed as vendor-specific attributes. Zone-based QoS cannot be combined with flow-based QoS.
Note: Zone-based QoS is a licensed feature; it requires the Enterprise Package.
Note: Zone-based QoS may cause traffic disruption upon zone-QoS configuration change (and activation) if in-order delivery is enabled.
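A brief sketch of assigning the attribute (the zone name and VSAN number are placeholders):

switch# config terminal
switch(config)# zone name ZoneA vsan 2
switch(config-zone)# attribute qos priority high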
Example: limit ingress on fc1/1 to a maximum of 50% of line rate. Supported on MDS 9100 Series switches, the MDS 9216i, and the MPS-14/2.
Prevent malicious or malfunctioning devices from flooding the SAN
Limit traffic contending for WAN links, e.g., storage replication ports
Limit ingress traffic on oversubscribed-mode interfaces
Port rate limiting is also referred to as ingress rate limiting because it controls ingress traffic into an FC port. The feature controls traffic flow by limiting the number of frames that are transmitted out of the exit point on the MAC. Port rate limiting works on all Fibre Channel ports. Note: port rate limiting can be configured only on Cisco MDS 9100 Series switches, the MDS 9216i, and the MPS-14/2. This command can be configured only if the following conditions are true:
The QoS feature is enabled, using the qos enable command.
The command is issued on a second-generation Cisco MDS 9216i or 9100 Series switch.
The rate limit ranges from 1 to 100% and the default is 100%. To configure the port rate limiting value, use the switchport ingress-rate interface configuration command.
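For illustration (the interface and percentage are placeholders):

switch# config terminal
switch(config)# qos enable
switch(config)# interface fc1/1
switch(config-if)# switchport ingress-rate 50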
QoS scheduling occurs in the VOQ of the ingress port
Effective when multiple flows on one ingress port contend for the same egress port
Can improve the latency and/or bandwidth of higher-priority flows
QoS Designs
Fibre Channel QoS is effective for some configurations, but not for others. To understand why, it is important to realize that the QoS scheduler operates within the Virtual Output Queue (VOQ) of an ingress port. Because QoS scheduling occurs at the ingress port, for QoS to be effective, all competing traffic must enter the switch through the same ingress port, somewhere before the common point of congestion. The diagram illustrates three configurations where FC QoS might be beneficial in a single-switch design:

Multiple devices attached to the same quad on a host-optimized 32-port FC line card. In this configuration, the group of four oversubscribed-mode (OSM) ports is serviced by the same QoS scheduler, so internally they appear to be connected to the same ingress port. The common point of congestion would be the storage port.
A multi-disk JBOD attached to an FL port, sending data to a host on the same switch. In this configuration, there can be multiple devices on the same ingress port, each with a unique FCID and pWWN that can be used for QoS classification. The common point of congestion would be the host, if for example we had a 2 Gbps JBOD and a 1 Gbps host HBA.
A host sending multiple flows, each of which enters on a common ingress port and is destined for a distinct FCID or pWWN within the JBOD. In this configuration, QoS can improve the latency of a higher-priority flow, but cannot improve its bandwidth: the host is not QoS-aware, so all of the flows get an equal share of the bandwidth regardless of the DWRR weights.
Because the QoS scheduler operates within the VOQ of the ingress ports, Fibre Channel QoS is not beneficial in all configurations. Two configurations where FC QoS is not effective in the current MDS 9000 QoS implementation are:

Multiple devices attached to full-rate-mode (FRM) ports on a 16-port FC line card, contending for the same egress port. In this configuration, where two hosts are both sending data to the storage array, there would be no benefit to giving one host a higher QoS priority than the other, because the central arbiter provides fairness between the two ingress ports that are contending for the common egress port.
Multiple devices with a common ingress port (the ISL on the rightmost switch), but multiple egress ports. QoS would not provide a benefit; however, FCC will still alleviate congestion on the ISL.
The downstream IP network must implement and enforce a QoS policy based on the marking
DSCP can be any value from 0 to 63; the default is 0
DSCP 46 is reserved for Expedited Forwarding
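As a sketch of marking FCIP traffic (the interface number and DSCP values are placeholders), control and data traffic can be marked separately on the FCIP interface:

switch# config terminal
switch(config)# interface fcip 10
switch(config-if)# qos control 46 data 26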
End-To-End QoS
Priority for critical storage traffic flows
VSANs and high-density switches allow for a collapsed-core design
Traffic engineering makes the collapsed-core design a feasible solution
FCC performs congestion detection, signaling, and congestion control
End-to-end QoS priority schemes can be designed to meet customer requirements
[Figure: Collapsed-core fabric with VSAN trunks; iSCSI hosts enter over the IP LAN and FC hosts over the FC SAN, with per-VSAN FSPF link costs (for example 50, 150, and 250) steering each VSAN over different paths.] Each VSAN has its own routing process and associated metrics per link, and can therefore make independent routing decisions. Hosts are assigned to Virtual SANs, and each Virtual SAN is allocated fabric resources, such as bandwidth, independently.
End-to-End QoS
The Cisco MDS 9000 Family introduces VSAN technology for hardware-based intelligent frame processing, plus advanced traffic management features such as Fibre Channel Congestion Control (FCC) and fabric-wide quality of service (QoS), enabling the migration from SAN islands to collapsed-core and enterprise-wide storage networks. The MDS 9000 family of switches provides several tools that allow SAN administrators to engineer resource allocation and recovery behavior in a fabric. These tools can be used to provide preferential service to a group of hosts, or to utilize cost-effective wide-area bandwidth first and an alternate path during a fabric fault.
VSANs provide a way to group traffic. VSANs can be selectively grafted or pruned from EISL trunks.
PortChannels support link aggregation to create virtual EISL trunks.
FSPF provides deterministic routing through the fabric. FSPF can be configured on a per-VSAN basis to select preferred and alternate paths.
With the MDS 9000 family of switches, QoS concepts and technology that were originally developed for IP networks have been extended into the storage networking domain for IP SANs as well as Fibre Channel SANs. No other switch on the market is capable of prioritizing Fibre Channel traffic as effectively or comprehensively. Classification involves identifying and splitting traffic into different classes. Marking involves setting bits in a frame or packet to let other network devices know how to treat the traffic. Prioritization involves queuing strategies designed to avoid congestion and provide preferential treatment. The Cisco MDS 9000 Family enables the design of comprehensive end-to-end traffic priority schemes that satisfy customer requirements.
Port Tracking
(Diagram: port tracking; the tracked port crosses the WAN/MAN, and the linked port is a local FC port.)
Unique to the MDS 9000
Failure of link 1 immediately brings down link 2
Triggers faster recovery where redundant links exist
Tracked ports are continually monitored
Tracked ports can be FC, PortChannel, GigE, or FCIP interfaces; linked ports must be FC
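As a sketch of how this might look in the SAN-OS CLI (interface numbers are hypothetical; verify syntax against your release), the linked FC port is configured to track the WAN-facing interface:

    ! Enable the port tracking feature
    port-track enable
    ! fc1/2 is the linked port; bring it down if the tracked FCIP interface fails
    interface fc1/2
      port-track interface fcip 10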
Load Balancing
Configuring Logical Paths
How can I provide preferred paths for a subset of hosts and storage devices in my SAN? SAN traffic engineering uses a combination of features to provide preferred selection of logical paths:
VSANs can be selectively grafted or pruned from EISL trunks.
FSPF can be configured on a per-VSAN basis to select preferred and alternate paths.
PortChannels provide link aggregation to yield virtual EISL trunks.
VSAN allowed lists, which permit VSANs to be selectively added to or removed from EISL trunks.
FSPF link cost, which can be configured on a per-VSAN basis for the same physical link, providing preferred and alternate paths.
PortChannels, which provide link aggregation and thus logical paths that can be preferred for routing purposes.
The implementation of VSANs gives the SAN designer more control over the flow of traffic and its prioritization through the network. Using the VSAN capability, different VSANs can be prioritized and given access to specific paths within the fabric on a per-application basis. Using VSANs, traffic flows can be engineered to provide efficient use of network bandwidth. One level of traffic engineering allows the SAN designer to selectively enable or disable a particular VSAN from traversing any given common VSAN trunk (EISL), thereby creating a restricted topology for that VSAN. A second level of traffic engineering is derived from independent routing configurations per VSAN. The implementation of VSANs dictates that each configured VSAN support a separate set of fabric services. One such service is the FSPF routing protocol, which can be independently configured per VSAN. Therefore, within each VSAN topology, FSPF can be configured to provide a unique routing configuration and resultant traffic flow. Using the traffic engineering capabilities offered by VSANs allows greater control over traffic within the fabric and higher utilization of the deployed fabric resources.
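For example, per-VSAN FSPF link costs might be set so that VSAN 10 prefers one ISL while VSAN 20 prefers another. A hedged SAN-OS sketch (interface and cost values are hypothetical):

    ! On the first ISL: cheap for VSAN 10, expensive for VSAN 20
    interface fc1/1
      fspf cost 10 vsan 10
      fspf cost 500 vsan 20
    ! On the second ISL: the mirror image
    interface fc1/2
      fspf cost 500 vsan 10
      fspf cost 10 vsan 20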
Vendor-specific tools offer limited performance management capabilities
Probes and analyzers in the data path are intrusive, disruptive, and expensive
(Diagram: probes and analyzers placed in the data path of the IP LAN, IP WAN, and FC SANs.)
SPAN
Non-intrusive copy of all traffic from a port
Directed to an SD_Port within the local or a remote MDS switch
Traffic redirected to the Cisco Port Analyzer Adapter (PAA)
Also compatible with off-the-shelf FC protocol analyzers
(Diagram: a SPAN source switch copies traffic across the MDS FC network to a destination switch, where an SD_Port feeds an FC analyzer.)
Non-intrusive and non-disruptive tool used with the Fibre Channel analyzer
Ability to copy all traffic from a port and direct it to another port within the switch
Totally hardware-driven; no CPU burden
Up to 16 SPAN sessions within a switch
Each session can have up to four unique sources and one destination port
Filter the SPAN source based on receive-only traffic, transmit-only traffic, or bidirectional traffic
The Fibre Channel port that is to be analyzed is designated the SPAN source port. A copy of all Fibre Channel traffic flowing through this port is sent to the SD_Port. This includes traffic traveling in or out of the Fibre Channel port, that is, in the ingress or egress direction. The SD_Port is an independent Fibre Channel port, which receives this forwarded traffic and in turn sends it out for analysis to an externally attached Fibre Channel analyzer.
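A minimal SAN-OS sketch of a SPAN session (interface numbers are hypothetical): the destination port is placed in SD mode, and the session copies both directions of the source port:

    ! Configure the destination port as an SD_Port
    interface fc3/16
      switchport mode SD
      switchport speed 2000
      no shutdown
    ! Create the SPAN session: copy fc1/5 in both directions to the SD_Port
    span session 1
      source interface fc1/5 rx
      source interface fc1/5 tx
      destination interface fc3/16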
SPAN Applications
Using the SPAN feature, you can conduct detailed troubleshooting on a particular device without any disruption. In addition, a user may want to take a sample of traffic from a particular application host for proactive monitoring and analysis, a process that can easily be accomplished with the SPAN feature. Remote Switched Port Analyzer (RSPAN) further increases the capability of the SPAN feature. With RSPAN, a user has the ability to make a copy of traffic from a source port or VSAN to a port on another connected switch. Examples of data analysis that can be performed with SPAN and RSPAN include debugging of protocols such as FSPF, PLOGI, and exchange link parameter (ELP).
An important debugging feature SPAN provides is that multiple users can share an SD_Port and analyzer. Also, the MDS 9000 can copy traffic on a single port at line rate, and can SPAN both unicast and multicast traffic. Dropped frames are not SPANed, and SPANed frames will be dropped if the aggregate bandwidth of the sources exceeds the speed of the destination port.
Performance Manager
Fabric-wide, historical performance reporting
Browser-based display
Summary and drill-down reports for detailed analysis
Integrates with Cisco Traffic Analyzer
Requires the Fabric Manager Server license
Performance Manager, like Fabric Manager, runs as a Windows service. It monitors network device statistics and displays historical information in a web-based GUI. Performance Manager provides detailed traffic analysis by capturing data with the Cisco Port Analyzer Adapter (PAA). This data is compiled into various graphs and charts which can be viewed with any web browser. It presents recent statistics in detail and older statistics in summary. Performance Manager has three parts:
Definition: traffic flows are defined by manual edits or by using a Fabric Manager configuration wizard to create a configuration file.
Collection: Performance Manager reads the configuration file and collects the desired information.
Presentation: Performance Manager generates web pages to present the collected data.
Performance Manager can collect a variety of data about ISLs, host ports, storage ports, route flows, and site-specific statistical collection areas. It relies on captured data flows through the use of the PAA, Fabric Manager Server, and the Traffic Analyzer. Using it as an FC traffic analyzer, a user can drill down to the distribution of read vs. write I/O, average frame sizes, LUN utilization, and so on. Using it as an FC protocol analyzer, a user has access to frame-level information for analysis. The Summary page presents the top 10 hosts, ISLs, storage ports, and flows by combined average bandwidth for the last 24-hour period. This period changes on every polling interval; this is unlikely to change the average by much, but it could affect the maximum value. The intention is to provide a quick summary of the fabric's bandwidth consumption and highlight any hot spots.
Hybrid approach:
Aggregate traffic is collected using SNMP and stored persistently in a round-robin database.
SPAN, the PAA, and NTOP are used to capture packets for diagnosing traffic problems.
The purpose of Performance Manager is to monitor network device statistics historically and provide this information graphically using a web browser. It presents recent statistics in detail and older statistics in summary. The deployment goals of Performance Manager are to scale to large fabrics with 12 months of data, provide an early warning system for potential traffic problems, see all packets on an individual interface, diagnose traffic problems, and be simple to set up and use. To achieve these goals, Cisco implemented a hybrid approach: first, aggregate fabric traffic information is retrieved using SNMP and stored persistently in a round-robin database; then SPAN, the PAA, and NTOP are used to capture packets for diagnosing traffic problems. Performance Manager is a tool that can:
Scale to large fabrics
Scale to multi-year histories
Perform data collection without requiring in-band local/proxy HBA access
Tolerate poor IP connectivity
Provide SNMPv3 support
Offer zero-administration databases
Provide site customization capabilities
Accommodate fabric topology changes
Integrate and share data with external tools
Run on multiple operating systems
Integrate with Fabric Manager
Flows measure bytes (and frames) sent from sources to destinations. Flows are configured based on active zones. Possible flows, for example:
Host1 -> Storage1
Host1 -> Storage2
Storage1 -> Host1
Storage2 -> Host1
Traffic Analyzer
Cisco-customized version of ntop
Free for download from CCO
Live or offline analysis
Provides SCSI-based information about storage network devices
FC-enhanced public-domain tools
Cisco Traffic Analyzer is a Cisco customization of the popular ntop network traffic monitoring tool. Traffic Analyzer allows for live or offline analysis, and displays information about storage and the network. Traffic Analyzer is a Fibre Channel-enhanced version of public-domain tools. Traffic Analyzer is not well suited for accounting, because frames may be dropped on the SD_Port, by the PAA, or on the host. Traffic Analyzer can be downloaded free from Cisco Connection Online (CCO).
Overall Network Statistics
  Total bandwidth used
  Tx/Rx bandwidth per VSAN
  Tx/Rx bandwidth per N_Port
Session-Based Statistics
  SCSI sessions (I_T_L nexus)
  FICON sessions (in progress)
  Other FC sessions
N_Port-Based Statistics
  Per-LUN statistics
  Traffic breakdown by time
  Class-based traffic breakdown
VSAN-Based Statistics
  Traffic breakdown by VSAN
  VSAN configuration stats
  Domain-based statistics
And more
RMON: Specify actions to be taken based on alarms: logging, SNMP traps, or log-and-trap.
Syslog: Log information for monitoring and troubleshooting. Capture accounting records. See a complete picture of events.
Call Home: Flexible message formats (email, pager, XML). Integrates with RMON and Syslog (SAN-OS 2.0+).
Call Home
Call Home can be configured to provide alerts when switch ports become congested, so performance can be monitored remotely and action taken promptly. The Call Home functionality is available directly through the Cisco MDS 9000 Family. It provides multiple Call Home profiles (also referred to as Call Home destination profiles), each with separate potential destinations. Each profile may be predefined or user-defined. A versatile range of message formats is supported, including standard email-based notification, pager services, and XML message formats for automated XML-based parsing applications. The Call Home function can even leverage support from Cisco Systems or another support partner; for example, if a component failure is detected, a replacement part can be on order before the SAN administrator is even aware of the problem. Flexible message delivery and format options make it easy to integrate specific support requirements. The Call Home feature offers the following advantages:
Integration with established monitoring systems like RMON and Syslog
Comprehensive and more robust fault monitoring
Aids in quicker problem resolution
Syslog
The system message logging software saves messages in a log file or directs the messages to other devices. This feature provides the following capabilities:
Logging information for monitoring and troubleshooting
Selection of the types of logging information to be captured
Selection of the destination of the captured logging information
By default, the switch logs normal but significant system messages to a log file and sends these messages to the system console. Users can specify which system messages should be saved based on the type of facility and the severity level. Messages are time-stamped to enhance real-time debugging and management. Syslog messages are categorized into seven severity levels, from debug to critical events. Users can limit the severity levels that are reported for specific services within the switch. For example, Syslog can be configured to report only debug events for the FSPF service but record all severity-level events for the zoning service. A unique feature of the Cisco MDS 9000 Family switches is the ability to send accounting records to the Syslog service. The advantage of this feature is the consolidation of both types of messages for easier correlation. For example, when a user logs into a switch and changes an FSPF parameter, Syslog and RADIUS provide complementary information that portrays a complete picture of the event. Syslog can store a chronological log of system messages locally or send messages to a central Syslog server. Syslog messages can also be sent to the console for immediate use. These messages can vary in detail depending on the configuration chosen.
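A brief sketch of the corresponding SAN-OS commands (the server address and severity levels are hypothetical):

    ! Send system messages to a central syslog server
    logging server 192.168.10.20
    ! Log FSPF messages up to debug severity (7)
    logging level fspf 7
    ! Log zone server messages up to informational severity (6)
    logging level zone 6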
Lesson 8: Securing the SAN Fabric
Objectives
Upon completing this lesson, you will be able to design an end-to-end SAN security solution. This includes being able to meet these objectives:
Describe the most common security issues facing SANs
Explain how zoning contributes to the security of a SAN solution
Explain how port and fabric binding contribute to the security of a SAN solution
Explain how device authentication contributes to the security of a SAN solution
Explain how to secure management data paths
Explain the best practices for end-to-end SAN security design
(Diagram: SAN security threats, including privilege escalation / unintended privilege, application tampering (trojans, etc.), theft, unauthorized connections (internal), and data tampering.)
Clear-text passwords
No audit of access/attempts
Out-of-band Ethernet management connection
Host-based:
  LUN mapping
  Standard OS security
Array-based:
  LUN masking
Fabric-based:
  VSANs
  Zoning
  LUN zoning
  Read-only zones
  Port mode security
  Port binding
  Fabric binding
  Authentication
  Management security
  Role-based access control
Security measures at each of the three tiers can be used by storage administrators to achieve the level of security needed for a particular environment. If an entire SAN is located within a single, physically secure data center, a more relaxed suite of security measures might be chosen. However, SANs are commonly extended beyond the confines of the corporate data center, so implementing multiple security mechanisms at all three tiers is both warranted and necessary.
Host and/or Array LUN Security: Host and array LUN security does not rely on fabric enforcement and thus has limited effectiveness. By itself, LUN security is not adequate to safeguard a SAN, but host LUN security and array LUN security can be used in conjunction with other security measures to create an effective security policy.
Soft Zoning: Soft zoning is perhaps the oldest and most commonly deployed security method within SANs. Primarily it protects hosts from accidentally accessing targets with which they are not authorized to communicate. However, soft zoning provides no fabric enforcement. If a host can learn the FCID of a target, soft zoning will not prevent that host from accessing the target.
Port-Based Zoning: Port-based zoning is applied to every FC frame that is switched, so it has a level of fabric enforcement not provided by soft zoning. However, the security provided by port-based zoning can be circumvented simply by gaining physical access to an authorized port.
WWN-Based Zoning: WWN-based zoning applies switching logic to frames based on their factory burned-in WWN rather than the physical port the device is connected to. WWN-based zoning can be defeated through the spoofing of WWNs, which is relatively trivial to accomplish.
Port Security: Prevents unauthorized fabric access by binding specific WWNs to one or more given switch ports. To defeat port security, an attacker would need to both spoof the device WWN and access the specific port or ports that the device is authorized to use.
DH-CHAP: Enforces fabric and device access through an authentication method during the fabric login phase. DH-CHAP offers an excellent method of securing SAN fabrics, but it does not provide encryption services. Standardized encryption services for SANs will soon be available from multiple vendors, including Cisco. Depending on the security protocols you have implemented, PPP authentication using MS-CHAP can be used with or without Authentication, Authorization, and Accounting (AAA) security services. If you have enabled AAA, PPP authentication using MS-CHAP can be used in conjunction with both TACACS+ and RADIUS. MS-CHAP V2 authentication is the default authentication method used by the Microsoft Windows 2000 operating system. Support of this authentication method on Cisco routers enables users of Microsoft Windows 2000 to establish remote PPP sessions without needing to first configure an authentication method on the client. MS-CHAP V2 authentication introduces an additional feature not available with MS-CHAP V1 or standard CHAP authentication: the change-password feature. This feature allows the client to change the account password if the RADIUS server reports that the password has expired.
Zoning
MDS Advanced Zoning Features
Zoning is used to control host access to storage devices within a VSAN. The MDS 9000 supports both hard and soft zoning:
Soft zoning is enforced by Name Server query-responses
Hard zoning is enforced on every frame by the forwarding ASIC
Zoning Options
pWWN (attached Nx_Port)
FCID
FC alias (within a VSAN)
Device alias (global within a SAN)
fWWN (switch port-based zoning)
Interface (for example, fc1/2)
Fully compliant with FC-GS-3, FC-GS-4, FC-SW-2, FC-SW-3, and FC-MI
Fabric Manager supports Zone Merge Analysis
Prevents fabric merge failures due to zone database mismatch
Dedicated high-speed port filters called TCAMs (Ternary CAMs) filter each frame in hardware and reside in front of each port:
Support up to 20,000 programmable entries consisting of zones and zone members
Up to 8000 zones per fabric
Very deep frame filtering for new innovative features
Wire-rate filtering performance: no impact regardless of the number of zones or zone entries
Optimized programming during zoneset activation: incremental zoneset updates
RSCNs contained within zones in a given VSAN
Selective default zone behavior: the default is deny, configured per VSAN
Enhanced mode
Represents the zone server behavior of the GS-4/SW-3 standard
Available with SAN-OS 2.0 and greater
QoS parameters distributed as part of the zone attribute
Consistent full-zone database across the fabric
Support for attributes in the standard
Consistent zoning policy across the fabric
Unique vendor type
Reduced payload for activation requests
LUN mapping in the host; LUN masking in the storage array
Used to control host access to storage LUNs
Not enforced in the fabric
Prevents contention for storage resources and data corruption
Protection from unintentional security breaches
Lack of centralized management
Example of host-based LUN security: the HBA utility on the red host could be configured to communicate only with the WWN associated with storage port A.
Example of array-based LUN security: storage port B could be configured to accept frames only from the WWN associated with the blue host port.
LUN Zoning
LUN Zoning is the ability to zone an initiator with a subset of LUNs offered by a target:
Use with storage arrays that lack LUN masking capability
Use instead of LUN masking to centralize management in heterogeneous storage environments
Can be managed centrally from CFM
(Diagram: the host issues report_LUNs and is told 10 LUNs are available; report_size LUN_1 returns "LUN_1 is 50GB"; report_size LUN_3 returns "LUN_3 is unavailable".)
Disk arrays typically have multiple logical units on them. Standard FC zoning extends down to the switch port level or down to the WWN of the port, but not down to the LUN level. This means that any fabric containing disk arrays with multiple LUNs needs security policies configured both on the disk array (or multiple disk arrays) and on the FC switches themselves. LUN Zoning is a feature specific to switches in the Cisco MDS 9000 Family, introduced in SAN-OS 1.2, that allows zoning to extend to individual LUNs behind the same WWN. This means that the centralized zoning policy configured on the FC switches can extend hardware-enforced zoning down to individual LUNs in disk arrays. In the top half of the diagram, LUN zoning allows the switch to grant the host access to disks 1 & 2 while preventing the host from accessing all other disks in the array.
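A hedged sketch of LUN zoning in the SAN-OS CLI (the zone name, WWNs, VSAN, and LUN number are hypothetical):

    zone name oracle_lun_zone vsan 10
      ! The initiator (host HBA)
      member pwwn 21:00:00:e0:8b:05:d6:ac
      ! The target port, restricted to a single LUN
      member pwwn 50:06:04:82:bf:d0:54:32 lun 0x1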
Read-Only Zoning
Read-Only Zoning leverages the hardware-based frame processing of the MDS 9000 Family
Use for backup servers and snapshots
Especially useful for media servers that need high-speed access to rich content for broadcast; block-level access bypasses the NAS service
Does not work for certain file system types (e.g., NTFS)
Standard FC zoning is used to permit devices to communicate with each other. Standard FC zoning cannot perform any advanced filtering, for example, blocking or allowing specific I/O operations such as a write I/O command. The Cisco MDS 9000 Family provides the ability to enforce read-only zones in hardware. That is, the switch can enforce read-only access to a given device (e.g., a disk) and will block any write requests. Read-only zoning filters FC-4 command frames based on whether the command is a read or a write. When used in conjunction with LUN zoning, read-only or read-write access can be granted for specific hosts to specific LUNs. Read-only zoning was introduced with SAN-OS 1.2. This functionality is available on every port across the entire Cisco MDS 9000 product family. In the bottom half of the diagram, a streaming video server is granted read-only access to the storage array, thus preventing inadvertent or malicious corruption of data. Certain operating systems use file systems that depend on the ability to write to disks (e.g., many Windows file systems). Such file systems may not function properly when placed in a read-only zone.
Note: Read-Only Zoning requires the Enterprise License Package.
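A sketch of a read-only zone in the SAN-OS CLI (the name, VSAN, and WWNs are hypothetical):

    zone name media_ro_zone vsan 10
      ! Mark the zone read-only; write I/O between members is blocked in hardware
      attribute read-only
      member pwwn 21:00:00:e0:8b:05:d6:ac
      member pwwn 50:06:04:82:bf:d0:54:32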
Use VSANs to isolate each application whenever feasible; use IVR to allow resource sharing across VSANs while keeping each application completely isolated. Place all unused ports in VSAN 4094. Because ports are in VSAN 1 by default, suspend VSAN 1, do not configure any zones in it, and set the default zone policy to deny. This will prevent WWN spoofing on unconfigured ports.
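Two of these recommendations translate directly to the SAN-OS CLI; a hedged sketch (verify syntax against your release):

    ! Suspend VSAN 1 so unconfigured ports cannot participate in it
    vsan database
      vsan 1 suspend
    ! Set the default zone policy for VSAN 1 to deny
    no zone default-zone permit vsan 1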
Zoning should always be deployed in an FC fabric. Typically one zone is configured per HBA communicating with storage; this is called single initiator zoning. Depending on the particular environment, port-based or WWN-based zoning may be selected, although WWN zoning provides more convenience and less security than port-based zoning. Port security features can be used to harden WWN-based zones. Read-only zones should be applied to LUNs that will not be modified by initiators. LUN zoning can be used to augment or replace array-based LUN security. Set the default zone policy to deny to prevent inadvertent initiator access to a target.
Only 1 or 2 switches should be used to configure zoning. This will help prevent confusion due to conflicting zonesets or the activation of an incomplete zoneset. If only one zoneset is needed (i.e. the active zoneset), you can configure the full zoneset on one switch only. In the event that switch goes down and the full zoneset is lost, you can easily recover the full zoneset from the active zoneset.
Port mode security best practices:
Limit users who can change port mode via RBAC
Use port mode security on all switch ports
Shut down all unused ports
Place unused ports in VSAN 4094
Login requests from unauthorized Fibre Channel devices (Nx ports) and switches (xE ports) are rejected. All intrusion attempts are reported to the SAN administrator.
The auto-learn option allows for rapid migration to Port Security when it is activated for the first time. Rather than manually securing each port, auto-learn automatically populates the port-security database based on an inventory of currently connected devices.
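A hedged sketch of port security in the SAN-OS CLI (the WWN, interface, and VSAN are hypothetical; exact syntax varies by SAN-OS release):

    ! Manually bind a device pWWN to a specific switch port
    port-security database vsan 10
      pwwn 21:00:00:e0:8b:05:d6:ac interface fc1/3
    ! Activate port security for the VSAN (alternatively, auto-learn can
    ! populate the database from currently connected devices)
    port-security activate vsan 10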
Use unique passwords for each FC-SP connection
Use RADIUS or TACACS+ for centralized FC-SP password administration
Lock (E)ISL ports to only be (E)ISL ports
Lock initiator and target ports down to F or FL mode
When higher levels of security are desired, use port security features:
Bind devices to a switch as a minimum level of security
Bind devices to a port as an optimal configuration
Consider binding to a group of ports in case of port failure
Bind switches together at ISL ports; bind to a specific port, not just the switch
Use device-to-switch authentication when available. FC-SP-based authentication should be considered mandatory in a secure SAN in order to prevent access to unauthorized data via spoofed or hijacked WWNs, where traditional port security would be vulnerable.
Use unique passwords for each FC-SP connection, and use RADIUS or TACACS+ for centralized FC-SP password administration. RADIUS or TACACS+ authentication is recommended for fabrics with more than five FC-SP-enabled devices.
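A sketch of enabling DH-CHAP in the SAN-OS CLI (the password and interface are hypothetical):

    ! Enable the FC-SP feature and set the local DH-CHAP password
    fcsp enable
    fcsp dhchap password S3cureFabric
    ! Require authentication on a specific port
    interface fc1/1
      fcsp on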
(Diagram: trusted hosts and storage subsystems authenticate to the fabric using FC-SP (DH-CHAP); the switches validate credentials against RADIUS or TACACS+ servers.)
CHAP Authentication
(Diagram: CHAP authentication between a primary site and a remote site across the IP WAN.)
CHAP provides authentication of iSCSI hosts
IPsec provides end-to-end authentication, data integrity, and encryption
Hardware-based, high-performance solution (MDS 9216i or MPS-14/2 module)
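A hedged sketch of iSCSI CHAP in the SAN-OS CLI (the username and password are hypothetical; verify syntax against your release):

    ! Require CHAP for iSCSI logins
    iscsi authentication chap
    ! Create a CHAP user/password for the iSCSI initiator
    username iscsiuser password S3cretCHAP iscsi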
Management Security
SAN Management Security Vulnerabilities
SAN Management Threats
Threats:
Disruption of switch processing
Compromised fabric stability
Compromised data integrity and secrecy
Loss of service, LUN corruption, data corruption, data theft or loss

Vulnerabilities:
Unsecured console access
Unsecured GUI application access
Unsecured API access
Privilege escalation / unintended privilege
Lack of audit mechanisms
Clear-text passwords
No audit of access/attempts
Out-of-band Ethernet management connection
Management VLAN
Use secure protocols: SNMPv3, SSH, and SSL provide strong authentication and encrypted sessions. Disable SNMPv2, Telnet, and HTTP.
Use VPNs for remote management.
Always implement firewalls between the management network and the Internet. Intrusion Detection Systems (IDS) should also be included in the solution. In a large company, consider implementing an internal firewall to isolate the management network from the rest of the company LAN.
Use a private management VLAN to isolate management traffic.
Implement IP ACLs to restrict access to mgmt0.
Management VSANs can be configured to create a logical SAN for management traffic only.
Use role-based access control (RBAC) to restrict user permissions.
Network Administrator: configures and manages the overall network
Customized Roles: access to subsets of CLI commands
VSAN Administrators: configure and manage their VSAN only
VSAN-Based RBAC
With SAN-OS version 1.3.1 and higher, customers are able to define roles on a per-VSAN basis. This enhanced granularity allows different administrators to be assigned to manage different SAN domains, as defined by VSANs. A Network Administrator is responsible for overall configuration and management of the network, including platform-specific configuration, configuration of roles, and role assignment. Matching VSANs to the existing operational structure makes it easy to map user roles to realistic groupings of operational responsibility. VSAN-based roles limit the reach of individual VSAN administrators to the resources within their logical domain. In addition, efficient grouping of commands into roles, and assignment of roles to users, allows mapping of user accounts to practical roles, which reduces the likelihood of password sharing among operational groups.
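A sketch of a per-VSAN role in the SAN-OS CLI (the role name and VSAN number are hypothetical):

    ! A role that can configure only VSAN 10
    role name vsan10-admin
      description OLTP VSAN administrator
      rule 1 permit config
      vsan policy deny
        permit vsan 10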
AAA Services
Authentication: user access with ID and password
Authorization: role level or set of privileges
Accounting: log of user management sessions
AAA services consist of authentication, authorization, and accounting facilities for CLI.
Authentication refers to the authentication of users to access a specific device. Within the Cisco MDS 9000 Family switches, RADIUS and TACACS+ can be used to centralize the user accounts for the switches. When a user tries to log on to the switch, the switch validates the user via information gathered from the central RADIUS or TACACS+ server.
Authorization refers to the scope of access that users receive once they have been authenticated. Assigned roles for users can be stored in a RADIUS or TACACS+ server, along with a list of the devices that each user should have access to. Once the user has been authenticated, the switch can then refer to the RADIUS or TACACS+ server to determine the extent of access the user will have within the switched network.
Accounting refers to the ability to log all commands entered by a user. These command logs are sent to the RADIUS or TACACS+ server and placed in a master log. This log can then be parsed to trace a user's activity and create usage or change reports. All exchanges between a RADIUS or TACACS+ server and a client switch can be encrypted using a shared key for added security.
RADIUS and TACACS+ are protocols used for the exchange of attributes or credentials between an AAA server and a client device. RADIUS and TACACS+ cover authentication, authorization, and accounting needs for various applications, including: CLI login via Telnet, SSH, console, and modem; SNMP accounting; iSCSI CHAP authentication; and FC-SP DH-CHAP authentication. Separate policies can be specified for each application. The MDS 9000 also has the ability to send RADIUS accounting records to the system log (syslog) service. The advantage of this feature is the consolidation of messages for easier parsing and correlation.
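A minimal sketch of pointing authentication and accounting at a RADIUS server in the SAN-OS CLI (the address and key are hypothetical):

    ! Define the RADIUS server and shared secret
    radius-server host 192.168.10.30 key MySharedSecret
    ! Use RADIUS for login authentication and for accounting records
    aaa authentication login default group radius
    aaa accounting default group radius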
Centralizing Administration
Use RADIUS and/or TACACS+ for:
SNMP and CLI users
iSCSI CHAP
FC-SP DH-CHAP
Improved security due to central control in applying access rules
Use redundant servers
Connect RADIUS/TACACS+ to LDAP or Active Directory servers to centralize all accounts enterprise-wide
SAN administration must be limited to qualified and authorized individuals to assure proper configuration of the devices and the fabric. Enterprise-wide security administration is enabled through support for RADIUS and TACACS+ servers on the MDS 9000 family. The use of RADIUS or TACACS+ allows user accounts and roles to be applied uniformly across the enterprise, both simplifying administrative tasks and increasing security by providing centralized control over the application of access rules. In addition, the switch can record management accounting information, logging each management session in a switch. These records may then be used to generate reports for troubleshooting purposes and user accountability. Accounting data can be recorded locally, on the switch itself, or by RADIUS servers. RADIUS is a standards-based protocol defined by RFC 2865 and several associated RFCs. RADIUS uses UDP for transport. TACACS+ is a Cisco client-server protocol that uses TCP (port 49) for transport. The addition of TACACS+ support in SAN-OS enables the following advantages over RADIUS authentication:
The TCP transport protocol provides reliable transfers with a connection-oriented protocol.
TACACS+ provides independent, modular AAA facilities; authorization can be done without authentication.
TACACS+ encrypts the entire protocol payload between the switch and the AAA server to ensure higher data confidentiality; the RADIUS protocol encrypts only passwords.
Hardware-based zoning via port and WWN
LUN zoning
Read-only zones
Secure SAN management is achieved via role-based access. It includes customizable roles that apply to CLI, SNMP, and web-based access, along with full accounting support. Secure management protocols like SSH, SFTP, and SNMPv3 ensure that outside connection attempts to the MDS 9000 network are valid and secure. Secure switch control protocols that leverage the IPsec ESP (Encapsulating Security Payload) specifications yield Fibre Channel Security Protocol (FC-SP) protection; DH-CHAP authentication is used between switches and devices. MDS 9000 support of RADIUS and TACACS+ AAA services helps to ensure user, switch, and iSCSI host authentication for the SAN. Secure VSANs and hardware-enforced zoning restrictions using port IDs and World Wide Names provide layers of device access and isolation security to the SAN.
DH-CHAP-capable HBAs installed in all hosts to enable authenticated fabric access
Port-mode security on all switch ports
Port security on all switch ports
Database cluster server groups use their own VSAN to provide traffic isolation
Array-based LUN security
Level of Security (layered):

Management access:
  SSHv2, SNMPv3, SSL
  Centralized AAA with RADIUS, TACACS+
  Role-Based Access Control (RBAC)
  VSAN-based RBAC
  IP ACLs

Traffic isolation and device access controls:
  VSANs
  Hardware zoning
  LUN zoning
  Read-only zones
  Port security
  Fabric binding

Device authorization and authentication:
  Host/switch authentication for FC and FCIP
  iSCSI CHAP authentication
  MS-CHAP authentication
  Digital certificates
No impact on switch performance; data path features are all hardware-based
Traditional hard and soft zoning, as well as advanced LUN and read-only zones, are available on MDS devices
Port mode security is an excellent way to limit unauthorized access to the fabric
Port security binds device WWNs to one or more switch ports
DH-CHAP provides device authentication services
IPsec provides integrity and security for in-transit data
All security features are easily managed through Cisco's Fabric Manager application.
Lesson 9: Designing SAN Extension Solutions
Objectives
Upon completing this lesson, you will be able to identify issues and solutions for SAN extension. This includes being able to meet these objectives:
Identify applications for SAN extension
Identify network transports for SAN extension
Explain design configurations for SAN extension over DWDM and CWDM
Define FCIP
Explain design configurations for SAN extension using FCIP
Describe the features of the MDS 9000 IP Services Modules
Explain how to build highly available FCIP configurations
Explain how IVR increases the reliability of SAN extension links
Explain how to secure extended SANs
Explain the options available for optimizing performance of low-cost FCIP transports
(Diagram: backup from a local datacenter to a remote datacenter across the WAN.)
FCIP is relatively inexpensive compared to optical storage networking
Enterprises and Storage Service Providers (SSPs) can provide remote vaulting services using existing IP WAN infrastructures
Backup applications are sensitive to high latency, but in a properly designed SAN the application can be protected from problems with the backup process by using techniques such as snapshots and split mirrors
Data Replication
Data is continuously synchronized across the network
Data can be mirrored for multiple points of access
Enables rapid failover to the remote datacenter for 24/7 data availability
Reduces the Recovery Time Objective (RTO) as well as the Recovery Point Objective (RPO)
The primary type of application for an FCIP implementation is a disk replication application used for business continuance or disaster recovery. Examples of this type of application include:
Array-based replication schemes such as EMC Symmetrix Remote Data Facility (SRDF), Hitachi TrueCopy, IBM Peer-to-Peer Remote Copy (PPRC), or HP/Compaq Data Replication Manager (DRM).
Host-based replication schemes such as VERITAS Volume Replicator (VVR).
Replication applications can be run in a synchronous mode, where an acknowledgement of a disk write is not sent until the remote copy is done, or in an asynchronous mode, where disk writes are acknowledged before the remote copy is completed. Applications that are using synchronous copy replication are very sensitive to latency delays and might be subject to unacceptable performance. Customer requirements should be carefully weighed when deploying an FCIP link in a synchronous environment. FCIP can be suitable for synchronous replication when run over local Metro Ethernet or short-haul WDM transport.
Dark Fiber
The type of fiber used defines maximum link distances for connecting Fibre Channel ports over dark fiber:
Single-mode 9-micron fiber supports 10 km distances at 1 Gbps, 2 Gbps, or 4 Gbps.
Multimode 50-micron fiber supports 500 m at 1 Gbps, 300 m at 2 Gbps, and 150 m at 4 Gbps.
Multimode 62.5-micron fiber supports 350 m at 1 Gbps, 150 m at 2 Gbps, and 75 m at 4 Gbps.
When two switches are joined together with an ISL, they merge fabrics and become part of the same fabric, with a shared address space, shared services, and a single principal switch. This is a disruptive event, and FSPF will build a new routing table, which is distributed to all switches within the fabric. If there is a link failure, the single fabric will segment into two separate fabrics, each with its own address space, FC services, and principal switch. FSPF must then disruptively build a routing table for each segmented fabric once again.
DWDM
DWDM enables up to 32 channels to share a single fiber pair
Divides a single beam of light into discrete wavelengths (lambdas)
Each signal can be carried at a different rate (2.5 Gbps, 10 Gbps)
Dedicated bandwidth for each multiplexed channel; ~1 nm spacing
DWDM transponders can support multiple protocols and speeds
Point-to-point distance limited to approximately 200 km
A single fiber pair connecting two FC switches together through an ISL provides a single channel (wavelength of light) between the two switches. DWDM enables up to 32 channels to share a single fiber pair by dividing the light into discrete wavelengths, or lambdas, separated by approximately 1 nm spacing around the 1550 nm wavelength. Each DWDM lambda supports a full-duplex FC, ESCON, FICON, or Ethernet channel. DWDM transponders convert each channel into its dedicated lambda and multiplex it onto a 2.5-Gbps or 10-Gbps link between DWDM multiplexers. DWDM signals can be amplified, and point-to-point distances are approximately 200 km.
CWDM
CWDM allows eight channels to share a single fiber pair
Each channel uses a different color SFP or GBIC
Provides 8 times the bandwidth over a pair of fibers (8 x 2 Gb = 16 Gb)
CWDM is much less costly than DWDM
Channel spacing is only 20 nm
CWDM multiplexers are passive, un-powered optical devices (prisms)
Maximum distance is approximately 100 km; the signal cannot be amplified
CWDM is much less costly than DWDM because the channel spacing is only 20 nm and much less precise. CWDM provides 8 channels between two CWDM multiplexers over a single fiber pair. CWDM multiplexers are usually un-powered devices containing a very accurate prism to multiplex 8 separate wavelengths of light onto a single fiber pair. Maximum distance is approximately 100 km.
SONET / SDH
SONET and SDH support longer distances than WDM:
Robust network management and troubleshooting
Significant installed infrastructure
Variety of protection schemes
Limited bandwidth in some service areas (without use of DWDM)
SONET (North America) or SDH (rest of the world) is a managed optical technology that supports much longer distances than CWDM or DWDM. It is typically used city to city or country to country.
CWDM offers:
Lower cost than DWDM
CWDM GBICs and SFPs are used
Less scalable: 8 channels (lambdas) maximum, 2.5G channels
Simple deployment (passive components)
Less electronics (just SFPs)
Shorter distances, based on the power of the SFPs; no amplification
Relatively inexpensive way to get low-latency, high-bandwidth connectivity
DWDM Transport
DWDM offers ample bandwidth, performance, and scalability. DWDM is protocol-independent, so it more easily caters to future protocols and growth. For example, DWDM is the only solution that can accommodate ESCON solutions along with FICON and FC. DWDM's strongest qualities are:
Very high scalability
Very low, predictable latency
Moderately long distances
DWDM distances are still limited by the application and the flow-control mechanisms of each particular protocol. For example, the high number of BB_Credits on the Cisco MDS 9000 (255 credits on the 16-port cards) allows an MDS-to-MDS link over a theoretical distance of 255 km at 2-Gb FC line rates (as a rule of thumb, roughly one BB_Credit is consumed per kilometer of link distance at 2 Gbps). Synchronous replication requires high bandwidth and low latency, and is therefore well suited for optical DWDM infrastructures in which FC is channeled directly over an optical network for long distances.
Note: The longest distance tested by Cisco with synchronous replication is 239 km.
DWDM equipment is much more expensive than other solutions. Dark fiber can be expensive to lease.
DWDM services are not as widely available as SONET/SDH, so companies might need to implement and manage their own DWDM solutions. DWDM does not have the same robust management capabilities as SONET/SDH, which can also increase the cost of management.
CWDM Transport
CWDM applications are similar to those for DWDM: low-latency, high-bandwidth applications like synchronous replication. However, DWDM provides much more scalability than CWDM, and DWDM can be used over much longer distances because DWDM can be amplified. The primary advantage of CWDM is its low cost. CWDM components are passive, un-powered devices. CWDM is always much cheaper than DWDM, and is often even cheaper than SONET/SDH. For those who have access to dark fiber and have limited scalability needs, CWDM is a relatively inexpensive way to get low-latency, high-bandwidth connectivity.
Use MUX-8 to double capacity
Cheaper than DWDM; less capacity
Distance limitation approximately 100 km; no amplification
Maximum channel bandwidth is 2.5G
8-channel WDM at 20 nm spacing (cf. DWDM at <1 nm spacing): 1470, 1490, 1510, 1530, 1550, 1570, 1590, 1610 nm
Special colored SFPs used in FC switches
Muxing done in a CWDM Optical Add/Drop Multiplexer (OADM), a passive unpowered device (just mirrors and prisms)
30 dBm power budget (36 dBm typical) on SM fiber: ~90 km point-to-point or ~40 km ring
Not amplifiable via Erbium-Doped Fiber Amplification (EDFA)
(Diagram: MDS switches connected through an ONS 15454 DWDM ring; one member from each PortChannel is routed over the top fiber (Nos. 1 & 3).)
Up to 800 km @ 2G, 1600 km @ 1G with R_Rdy spoofing
Very high capacity: 32 wavelengths @ 10G per fiber pair
Typically used for synchronous replication up to ~200-300 km (vendor qualified): Fibre Channel over optical
Very low latency (speed of light is the only factor): 5 us/km
Client protection recommended
Failover recovery (classic dual-fabric design): PortChannels add resilience to each fabric; augment or replace with other DWDM protection schemes (splitter or Y-cable)
NOTE: This is a very common deployment method, many times just point-to-point and not a ring!
Higher density than CWDM: 32 lambdas (channels) in a narrow band around 1550 nm at 100-GHz spacing (<1 nm)
EDFA-amplifiable over longer distances
Carriage of 1- or 2-Gbps FC, FICON, GigE, 10GigE, ESCON, IBM GDPS, data center to data center
Protection options: client, splitter, or line card
(Diagram: Parallel Sysplex with Sysplex Timer and ESCON/FICON storage.)
NOTE: GDPS2 on the ONS 15454 is being tested in March 06. MSTP will be the first optical equipment to have this accreditation.
Sysplex
Sysplex, which stands for System Complex, is a collection of processor complexes formed by coupling multiple processors running multiple OS images, using channel-to-channel adapters or ESCON/FICON fiber-optic links. Sysplex is a loosely coupled clustering technology. Sysplex is supported on the IBM S/390 and zSeries processors.
Parallel Sysplex
Parallel Sysplex is a bundle of products (announced in April of 1994) that enables parallel transaction processing in a Sysplex. Parallel processing in a Sysplex is the ability to simultaneously process a particular workload on multiple processor complexes, each of which may have multiple processors. Parallel Sysplex is a tightly-coupled clustering technology.
GDPS is primarily intended to create disaster-tolerant system configurations by spreading out the components of a Parallel Sysplex across multiple locations. The main focus of GDPS automation is to ensure that a consistent copy of the data is available at another site, and that the remote data can be quickly brought online in the event of a local failure. Consistent data simply means that, from an application's perspective, the secondary disks contain all updates up to a specific point in time, and no updates beyond that point.
15454 MSTP
Service Cards
1-Gigabit Fibre Channel
2-Gigabit Fibre Channel
4-Gigabit Fibre Channel
1-Gigabit Ethernet
1-Gigabit ISC-Compatible (ISC-1)
2-Gigabit ISC-Peer (ISC-3)
Aggregated lower-rate TDM services from DS1/E1 over 2.5-Gbps and 10-Gbps wavelengths
SONET/SDH wavelength and aggregated services: OC-3/STM-1 to OC-768/STM-256
Data services: private-line, switched, and wavelength-based, from Ethernet to 10 Gigabit Ethernet (10GE LAN and WAN physical layer)
Storage services: 1-, 2-, 4-, and 10-Gbps Fibre Channel, FICON, ESCON, ETR/CLO, ISC-1, ISC-3
Video services: D1 and high-definition television (HDTV)
Digital-wrapper technology (defined in ITU-T G.709) for enhanced wavelength management and extended optical reach with integrated Forward Error Correction (FEC) and Enhanced FEC
What is FCIP?
Fibre Channel over Internet Protocol
Allows SAN islands to be interconnected over IP networks
FC frames are encapsulated in TCP/IP and sent through the tunnel
TCP/IP is used as the underlying transport to provide flow control and in-order delivery of error-free data
Each interconnection forms an FCIP tunnel
Each GigE port supports up to 3 FCIP tunnels
FCIP is a mechanism that allows SAN islands to be interconnected over IP networks. The connection is transparent to Fibre Channel, and the result of an FCIP link between two fabrics is a single, fully merged Fibre Channel fabric. TCP/IP is used as the underlying transport to provide congestion control and in-order delivery of error-free data. FCIP is a specification developed by the Internet Engineering Task Force (IETF) IP Storage (IPS) Working Group. The specification defines the encapsulation of Fibre Channel frames transported by TCP/IP. The result of the encapsulation is a virtual Fibre Channel link that connects Fibre Channel devices and fabric elements across IP networks. When FCIP connectivity is implemented in the switch instead of in a separate bridge device, standard B_Ports are not used. In the MDS implementation, each end of the FCIP link is associated with a Virtual E_Port (VE_Port), forming a Virtual ISL (VISL). VE_Ports communicate over a VISL using standard FC SW_ILS frames, just as E_Ports communicate between two switches. VE_Ports and TVE_Ports behave exactly like E_Ports and TE_Ports. For example:
[T]VE_Ports negotiate the same parameters as E_Ports, including Domain ID selection, FSPF, and zones.
[T]VE_Ports can be members of a PortChannel.
TVE_Ports carry multiple VSANs.
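A hedged sketch of bringing up an FCIP link in the SAN-OS CLI (IP addresses and numbers are hypothetical; verify syntax against your release):

    ! Enable the FCIP feature
    fcip enable
    ! Assign an IP address to the GigE port that will carry the tunnel
    interface gigabitethernet 2/1
      ip address 192.168.50.1 255.255.255.0
      no shutdown
    ! The FCIP profile binds the tunnel to the local IP address
    fcip profile 10
      ip address 192.168.50.1
    ! The FCIP interface (a VE_Port) points at the remote gateway
    interface fcip 10
      use-profile 10
      peer-info ipaddr 192.168.50.2
      no shutdown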
(Diagram: FCIP encapsulation. The original FC frame, SOF | FC header (24 bytes) | SCSI data | CRC (4 bytes) | EOF (4 bytes), is wrapped in an FCIP header and carried behind a TCP header (20 bytes) and an IP header (20 bytes) across the FCIP tunnel.)
When the packets reach the destination FCIP Gateway, the procedure is reversed as each of the headers is stripped off.
(Diagram: FCIP tunnels running over Metro Ethernet, SONET/SDH, and an IP routed WAN.)
Synchronous replication requires high bandwidth and low latency and is well suited for FCIP. FCIP over Metro Ethernet or SONET/SDH can be used for synchronous replication. The FC line card for the Cisco ONS 15454 supports a feature called BB_Credit spoofing that allows SONET/SDH to carry FC with no loss of performance over thousands of kilometers. Asynchronous replication consumes less bandwidth and can tolerate more latency, so FCIP over SONET/SDH can provide a cost-effective solution in addition to supporting longer distances. Remote vaulting applications, which resemble standard backup applications but place the backup device at a remote location (such as at an SSP), can require longer distances as well as deterministic latency; for these solutions, FCIP over SONET can be the most effective option. Host-based mirroring solutions are generally applications with less stringent bandwidth and latency requirements. FCIP over SONET or FCIP over an IP routed WAN can be suitable infrastructures for these applications.
(Diagram: point-to-point and hub-and-spoke FCIP configurations connecting corporate HQ with remote sites.)
The SN 5428-2 Storage Router supports FCIP as well as iSCSI and a workgroup FC switch in a single box, and can be used for small remote office deployments. The FCIP Port Adapter (FCPA) provides an FC interface for Cisco 7200 and 7400 Series routers.
(Diagram: FCIP QoS. Traffic from backup servers is classified, marked, and scheduled into output queues before crossing the IP WAN to the backup site.)
Benefits:
Critical data has priority on the network
Latency-sensitive apps get priority
Bandwidth can scale dynamically
FCIP Advantages
Advantages:
Low-cost connectivity solution
Ubiquitous connectivity (IP)
No fixed distance limitation
Not reliant on Fibre Channel buffer credits
Integrates easily into existing network management schemes
Granular scalability by upgrading the underlying transport
Disadvantages:
Higher latency than CWDM/DWDM
Fully merged fabric will segment if the WAN connection fails
Need to reserve bandwidth across the shared IP network (QoS)
Many proprietary product options based upon a standard
FCIP provides a low-cost connectivity solution for SAN extension. There are many product options (Cisco offers three different FCIP solutions), and TCP/IP service is universally available. FCIP has no fixed distance limitation. The maximum distance is largely dependent on the quality of the underlying transport and the application's latency requirements. Because FCIP is based on IP, FCIP solutions can be easily integrated with existing network management tools and practices. Smaller organizations that do not have existing optical networking expertise will find FCIP attractive. FCIP is often considered a low-end solution, but it also offers granular scalability by providing the ability to upgrade the underlying IP transport. For example, with the IPS-8 module, a company could start with a small DS-1 or DS-3 connection and later add additional DS-1, DS-3, or OC-n service. FCIP links can be bound together in PortChannels to provide bandwidth aggregation.
Use Inter-VSAN routing to enable further isolation while allowing connectivity between selected devices
IPS-8 module:
  8 GigE ports; maximum FCIP connectivity of 8 Gbps
  High-end iSCSI
  Software compression
  Write acceleration
  Tape acceleration

MPS-14/2 module:
  14 FC ports plus 2 GigE ports
  Primarily designed for FCIP; can also be used for iSCSI
  Hardware compression
  Write acceleration
  Tape acceleration
  Hardware encryption
The IP Storage Services (IPS-8) module provides eight Gigabit Ethernet ports to support iSCSI and FCIP. The ports are hot-swappable, small form-factor pluggable (SFP) LC-type Gigabit Ethernet interfaces. Modules can be configured with either short- or long-wavelength SFPs for connectivity up to 550 m and 10 km, respectively. Each port can be configured in software to support the iSCSI protocol and/or the FCIP protocol simultaneously, while also supporting the features available on other switching modules, including VSANs, security, and traffic management. 512 MB of buffer capacity is shared between port pairs, allowing all ports to achieve gigabit speeds, and performance-tuning options such as TCP window size help to ensure full storage transport performance over WAN distances. The IPS supports 2-link Ethernet PortChannels, FC PortChannels, and Virtual Router Redundancy Protocol (VRRP) to enhance link utilization and availability. VLAN support through the IPS module enables the MDS 9000 system to leverage the reliability functions of the existing IP network that it attaches to. IPS module interfaces operate at full line rate for all protocols with a frame size of 1K bytes or greater, and the module supports all standard Fibre Channel line card features except FC interfaces; Fibre Channel interfaces are provided on other modules, such as the 16- or 32-port switching line cards. The IPS module can be simultaneously configured for iSCSI and FCIP operation, where it supports iSCSI initiator to Fibre Channel target functionality as well as an FCIP gateway with up to three FCIP tunnels per port, or a maximum of 24 per line card (IPS-8). This concurrent multiprotocol flexibility helps enable investment protection through seamless migration to new technologies.
High-performance hardware-based compression: on low-speed WAN links, each GigE port supports up to 70-Mbps application throughput with a 30:1 compression ratio; on high-speed WAN links, each GigE port supports up to 1.5-Gbps application throughput with a 10:1 compression ratio.
Hardware-based IPSec supports gigabit-speed encryption for secure SAN extension.
FCIP Tape Acceleration improves performance of remote backup applications.
Extended distance capability, with 255 buffer credits per FC port and up to 3500 extended buffer credits on a single FC port.
The Multiprotocol Services line card is also available as a fixed module in the Cisco MDS 9216i fabric switch.
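The relationship between WAN line rate, compression ratio, and deliverable application throughput in the figures above can be sketched as follows (a back-of-the-envelope Python illustration; the function and the example figures are my own, not Cisco test data):

    def application_throughput(wan_mbps, compression_ratio, engine_limit_mbps):
        # Application data carried = line rate x compression ratio,
        # capped by what the compression engine itself can process.
        return min(wan_mbps * compression_ratio, engine_limit_mbps)

    # A DS-3 (~45 Mbps) at a sustained 10:1 ratio could in principle carry
    # ~450 Mbps of application data, if the engine keeps up.
    print(application_throughput(45, 10, 1500))  # 450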
(Diagram: FC-attached hosts grouped into port groups, each group sharing bandwidth)
(Diagram: IPS line card architecture; GigE ports feed SiByte processors and forwarding ASICs, through a queuing ASIC at 20 Gbps, to redundant supervisor modules with 720-Gbps crossbars)
GigE Interfaces
Each GigE port supports three FCIP interfaces and an iSCSI interface
An IPS-8 can support up to 24 FCIP tunnels plus iSCSI concurrently
FCIP interfaces can be port channeled for HA and load balancing
(Diagram: each GigE port carries three FCIP interfaces, each bound to an FCIP profile and presenting a VE_Port, plus an iSCSI interface; FCIP interfaces can be members of a PortChannel)
GigE Interfaces
Each GigE port supports three FCIP interfaces and an iSCSI interface simultaneously, sharing 1 Gbps of available bandwidth. An IPS-8 line card has eight GigE ports, so it can support 24 FCIP tunnels and up to 1600 iSCSI connections concurrently. Each FCIP interface is associated with a Virtual E_Port (VE_Port) on the FCIP gateway. FCIP interfaces can belong to FC PortChannels for high availability and exchange-based load balancing.
IPv6 Support
Extended addressing capability
Reduces the need for private addresses and NAT
IP address size increased from 32 to 128 bits
Represented as eight 16-bit fields, e.g. 2003:FAB7:1234:5678:9ABC:DEF0:1357:2468
IPv4 can be embedded in IPv6: 10.1.2.3 is represented as 0:0:0:0:0:FFFF:10.1.2.3
IPv6 supported on all GigE ports and the management interface
Standard applications are IPv6-ready: DNS, RADIUS, TACACS, ACLs, FCIP, iSCSI, IPFC, tftp, ftp, sftp, telnet, ssh, scp, snmp, ping, traceroute, etc.
IPv6 Support
IPv6 was introduced with SAN-OS 3.0, providing an extended addressing capability. IPv6 is supported on all MDS GigE ports and management interfaces.
IP address size is increased from 32 bits to 128 bits
IPv4 can be embedded in IPv6 for compatibility with legacy networks
Reduces the need for private addresses and NAT
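Python's standard ipaddress module can be used to explore both representations (a small illustrative snippet, not part of the course material):

    import ipaddress

    # Full eight-field form of an IPv6 address.
    addr = ipaddress.IPv6Address("2003:FAB7:1234:5678:9ABC:DEF0:1357:2468")
    print(addr.exploded)

    # IPv4-mapped form (::ffff:a.b.c.d) embeds a legacy IPv4 address.
    mapped = ipaddress.IPv6Address("::ffff:10.1.2.3")
    print(mapped.ipv4_mapped)  # recovers 10.1.2.3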
Multiple standalone gateways required for high availability
No load balancing
Additional management interface requirements
Limited traffic management; relies strictly on IP routers for QoS
(Diagram: standalone FCIP gateways connecting two FC SAN fabrics across an IP network)
HA Pitfalls
Multiple standalone gateways required for FCIP redundancy; costly and difficult to manage
No native HA capabilities, or proprietary HA schemes
No network-level HA or load balancing
Failovers are slow and disruptive to the fabric
Increased response time before failovers
(Diagram: redundant pairs of standalone FCIP gateways and FCIP tunnels linking two FC SAN fabrics across IP)
HA Pitfalls
Gateway-based FCIP solutions require multiple boxes for high availability, each of which may need to be managed independently. HA is provided by a proprietary clustering scheme instead of network-based resiliency. With this configuration, load balancing is not possible, and failover results in loss of data in transit and causes disruptive FSPF recalculation in the endpoint fabrics.
FSPF will re-route traffic if an FCIP tunnel fails
Protects against a port failure, IPS module failure, switch failure, link failure, and IP WAN service failure
(Diagram: redundant FCIP tunnels between IPS modules carry VSANs A and B between sites)
(Diagram: two FCIP tunnels bundled using a PortChannel to form a single virtual EISL between IPS modules)
When the virtual ISL is reconnected, both fabrics try to merge again
Merged fabric nominates a single principal switch
Rebuilds the FSPF routing table
(Diagram: a bad SFP or cable on the link between FC SAN Fabric A and FC SAN Fabric B causes the merged fabric to flap)
Flapping Links
The diagram above illustrates a problem with other vendors' FCIP implementations. Because FCIP merges the fabrics at both ends of the link, disruptions in the WAN link will cause fabric reconfiguration events to occur on both sides. All devices will be affected by the disruption. In the case of a flapping link, the disruption can be crippling.
(Diagram: an FCIP tunnel carried in transit VSAN X connects a backup host to remote FC storage)
(Diagram: FCIP tunnels carried in transit VSANs X and Y, bundled into PortChannels between IPS modules, interconnect VSANs B, C, and D)
(Diagram: storage traffic encrypted across the WAN)
Optical DWDM, CWDM, or SONET/SDH links are considered relatively secure due to the inherent difficulty of tapping into optical fiber. However, security on FCIP tunnels that are routed over public IP is a serious issue. For regulated institutions like financial companies, health care, and schools, encryption of data transmitted over public networks is not just a good idea, it is a requirement. FCIP gateway products on the market today do not provide integrated encryption. Users must rely on routers or VPN appliances at the WAN edge to encrypt storage traffic. Not only does this still leave storage traffic vulnerable to interception up to the WAN edge, but it may require users to buy yet more equipment if the existing routers or VPN appliances can't support gigabit-speed storage traffic in addition to existing WAN traffic loads.
(Diagram: SAN extension over low-bandwidth WAN links, DS-3 to OC-12)
Low-bandwidth interconnects:
Can we use OC-3 or DS-3?
How do we reduce the cost of bandwidth for SAN extension?
FCIP Compression
(Frame layout: Eth Header | IP Header | TCP Header | TCP Opts | FCIP Header | FC Frame | Eth CRC32, with the FCIP header and encapsulated FC frame compressed)
Hardware compression with the 14+2 card and MDS 9216i, from SAN-OS 2.0
Designed for gigabit-speed service
Three compression modes supported for different WAN bandwidth links and compression ratios
FCIP Compression
Compression is used as a mechanism to increase overall throughput on slow-speed WAN links. The achievable compression ratio depends on the nature of the data. The use of data compression allows users to achieve two major objectives. The first is the ability to reduce the amount of overall traffic on a particular WAN link. This is achieved when a data rate equal to the WAN link speed is compressed, thus reducing the total amount of data on the WAN link and allowing the WAN link to be used by other IP traffic. The IPS modules use the IPPCP/LZS (RFC 2395) lossless compression algorithm for compressing data. IPPCP/LZS compresses only the TCP payload of the FCIP frame (the FCIP header and the encapsulated FC frame), as shown here. This allows the resulting compressed IP frame to be routed through an IP network and still be subject to Access Control Lists (ACLs) and QoS mechanisms based on IP address and TCP port numbers. The type of data in the data stream determines the overall achievable compression ratio for a given compression method. Typical data mixes should achieve around 2:1 compression. Testing compression with data composed of all 0x00s or 0xFFs or other repeating patterns will artificially increase the resultant compression ratio and will probably not be representative of the compression ratio that you can achieve with real user data. In order to better compare compression methods, use either an industry-standardized test file or a test file that is representative of the real data that will be sent through the FCIP tunnel.
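That caveat is easy to demonstrate. In the sketch below, Python's zlib stands in for the LZS engine (an illustrative substitution, not the algorithm the IPS actually uses) to show how repeating test patterns exaggerate the measured ratio:

    import os
    import zlib

    def ratio(data):
        # Compression ratio = original size / compressed size.
        return len(data) / len(zlib.compress(data))

    print(f"all zeros: {ratio(bytes(1_000_000)):.0f}:1")       # wildly high
    print(f"random:    {ratio(os.urandom(1_000_000)):.2f}:1")  # ~1:1, incompressible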
(Diagram: a SCSI write across an FCIP tunnel; the FCP_WRITE, XFER_RDY, FCP_DATA, and FCP_RSP exchanges each cross the WAN)
Doubles latency; the problem is even worse for applications that restrict the number of outstanding I/Os, such as tape backup
The host initiator issues a SCSI Write command (FCP_WRITE), which includes the total size of the write. The target responds with an FCP Transfer Ready (FCP_XFER_RDY); this tells the initiator how much data the target is willing to receive in the next write sequence. The initiator sends FCP data frames up to the amount specified in the previous FCP_XFER_RDY. The target responds with a SCSI status response (FCP_RSP) frame if the I/O completed successfully.
Each FCIP link can be filled with a number of concurrent or outstanding I/Os. These I/Os can originate from a single source or a number of sources. The FCIP link is filled when the number of outstanding I/Os reaches a certain ceiling. The ceiling is mostly determined by the RTT, write size, and available FCIP bandwidth.
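That ceiling is essentially the bandwidth-delay product divided by the write size, as the rough Python sketch below illustrates (the figures are examples, not course data):

    def outstanding_io_ceiling(link_mbps, rtt_ms, write_kb):
        # I/Os that must be in flight to fill the pipe:
        # bandwidth-delay product (bytes) / write size (bytes).
        bdp_bytes = (link_mbps * 1e6 / 8) * (rtt_ms / 1000)
        return bdp_bytes / (write_kb * 1024)

    # An OC-3 (~155 Mbps) link, 20 ms RTT, 64 KB writes:
    print(f"{outstanding_io_ceiling(155, 20, 64):.1f} outstanding I/Os")  # ~5.9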
(Diagram: Write Acceleration; the local MDS returns XFER_RDY immediately, so FCP_WRITE, FCP_DATA, and FCP_RSP cross the WAN in a single round trip)
After the initiator issues a SCSI FCP Write, an FCP_XFER_RDY is immediately returned to the initiator by the MDS 9000. The initiator can now immediately send data to its target across the FCIP tunnel. The data is received by the remote MDS and buffered. At the remote end, the target, which has no knowledge of Write Acceleration, responds with an FCP_XFER_RDY. The MDS does not allow this to pass back across the WAN. When the remote MDS receives the FCP_XFER_RDY, it allows the data to flow to the target. Finally, when all data has been received, the target issues an FCP_RSP response, or status, acknowledging the end of the operation (FC Exchange).
Write Acceleration will increase write I/O throughput and reduce I/O response time in most situations, particularly as the FCIP RTT increases.
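The latency arithmetic behind this is simple, as the hedged sketch below shows (two WAN round trips per write without acceleration, one with; the 25 ms RTT is just an example figure):

    def write_time_ms(rtt_ms, write_accel=False):
        # Without acceleration: one round trip for XFER_RDY, one for data/status.
        # With acceleration: XFER_RDY is spoofed locally, saving a round trip.
        round_trips = 1 if write_accel else 2
        return round_trips * rtt_ms

    rtt = 25  # example WAN round-trip time in ms
    print(write_time_ms(rtt))        # 50 ms per write
    print(write_time_ms(rtt, True))  # 25 ms; throughput roughly doubles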
(Diagram and graph: accelerated versus unaccelerated write sequences across the FCIP tunnel, and throughput in MB/s falling as RTT in ms grows when acceleration is off)
The graph on this slide shows the effects of Write Acceleration and Tape Acceleration. The tests were conducted with Legato NetWorker 7.0 running on a dual-Xeon 3-GHz CPU with 2 GB of memory and Windows 2000 Advanced Server, with an IBM Ultrium TD2-LTO2 tape drive. Cisco has tested the Tape Acceleration feature with tape devices from IBM, StorageTek, ADIC, Quantum, and Sony, as well as VERITAS NetBackup, Legato NetWorker, and CommVault, and is currently working with CA. Backup application vendors will provide matrices of supported tape libraries and drives.
(Diagram, new in SAN-OS 3.0: Tape Read Acceleration; the remote MDS pre-fetches FCP_READ data so subsequent reads are answered from buffers without a WAN round trip)
A tape drive is a sequential storage medium, so the blocks stream off the tape as it passes the head. The backup server issues read commands to the tape target device requesting a number of SCSI 512-byte blocks. The tape starts to move, reads the data into buffers, and then stops, waiting for the next command. Meanwhile, the backup server receives the data blocks and issues a new read command for the next x blocks in sequence. The tape starts up again, reads the blocks, and so on. FCIP Tape Read Acceleration performs a read-ahead to pre-fetch the data and keep the tape moving. Let's assume that the backup server issues a read command to read the first x blocks. This command is sent to the tape, the tape starts up and reads the blocks into the buffer, and the data is sent back to the backup server. Meanwhile, before the tape has stopped moving, the MDS at the remote site issues another read command to read the next x blocks in sequence into the buffer, and these blocks are sent over the FCIP tunnel to buffers in the MDS at the local data center. When the local MDS receives a command from the backup server to read the next x blocks, it consumes the command and sends the data that it has already buffered.
By pre-fetching data and keeping the tape moving, FCIP Tape Read Acceleration will dramatically improve read performance over a WAN.
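An illustrative model of that gap (my own simplification; the block size, drive time, and RTT are made-up figures) shows why hiding the WAN round trip matters:

    def tape_read_mbps(block_kb, drive_ms, rtt_ms, prefetch=False):
        # Without read-ahead, every read waits a WAN round trip on top of the
        # drive's own time; with pre-fetch, the round trip is hidden.
        gap_ms = drive_ms if prefetch else drive_ms + rtt_ms
        return (block_kb / 1024) / (gap_ms / 1000)  # MB per second

    print(f"{tape_read_mbps(256, 5, 40):.1f} MB/s")                 # ~5.6, RTT-bound
    print(f"{tape_read_mbps(256, 5, 40, prefetch=True):.1f} MB/s")  # ~50, streaming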
cwnd halved on packet loss; retransmission signals congestion; Slow Start threshold adjusted
Exponential Slow Start (2x packets per RTT); low throughput during this period
(Graph: TCP congestion window versus round trips for a standard TCP stack)
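A toy simulation of the behaviour in this graph (doubling during slow start, linear growth afterwards, cwnd halved on loss; a sketch of the textbook algorithm, not of any particular stack):

    def tcp_cwnd(rounds, ssthresh=32, loss_at=()):
        cwnd, history = 1, []
        for r in range(rounds):
            history.append(cwnd)
            if r in loss_at:
                ssthresh = max(cwnd // 2, 1)
                cwnd = ssthresh            # cwnd halved on packet loss
            elif cwnd < ssthresh:
                cwnd *= 2                  # slow start: double each RTT
            else:
                cwnd += 1                  # congestion avoidance: +1 each RTT
        return history

    print(tcp_cwnd(12, loss_at={6}))
    # [1, 2, 4, 8, 16, 32, 33, 16, 17, 18, 19, 20]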
Slow Start threshold initialized to 95% of the Maximum Window Size (MWS)
cwnd reaches 95% of MWS after one RTT
(Graph: congestion window versus round trips for the optimized TCP stack)
TCP parameters (maximum bandwidth, round-trip time RTT)
Number of outstanding I/Os for the application
SCSI transfer size
To determine these parameters, users need to use standard traffic generation tools like IOMeter to generate test data and measure response time and throughput. This requires test hosts and test targets, and must be redone every time the environment changes.
Measures throughput and response time per I/O over the FCIP tunnels
(Diagram: a virtual N-port, e.g. 10:00:00:00:00:00:00:01, on a GigE interface generates test traffic across the FCIP tunnel over the WAN/MAN)
Compression options add to implementation flexibility by allowing bandwidth to be used more effectively. Designed specifically to enable customers to leverage sub-gigabit transports in SAN-OS 1.3, compression can scale to gigabit speeds with SAN-OS 2.0 and the new 14+2 line card. Write Acceleration increases performance by spoofing the SCSI XFER_RDY command to reduce round trips and lower latency. This feature can double the usable distance without increasing latency. For applications that allow few outstanding I/Os, like tape backup, Write Acceleration can double the effective throughput. An optimized TCP stack keeps the pipe full by dynamically recalculating the MWS based on changing conditions, and by implementing a packet-shaping algorithm to allow fast TCP starts.
Lesson 10
Objectives
Upon completing this lesson, you will be able to explain how iSCSI can be used to enable migration of mid-range applications to the SAN. This includes being able to meet these objectives:
Explain the problems that iSCSI is designed to solve
Describe the iSCSI protocol
Describe how iSCSI is implemented on the MDS 9000 IP Services Modules
Explain how to deploy iSCSI effectively
Explain how to configure high availability for iSCSI
Explain how to secure iSCSI environments
Explain how to simplify management of iSCSI environments with target discovery
Explain where Wide Area File Services (WAFS) is effective
Distributed Storage
At a corporate headquarters, how is backup accomplished? In an environment where DAS storage dominates, someone has to load and collect tapes for each device, which easily constitutes a storage management nightmare. Backup windows can easily be exceeded, and normal operations can be affected as a result, delaying the opening of the business day. This is a growing problem for many businesses today. At the same time, the data center may have a good storage management scheme and applications already in place, as well as unallocated disk space. What is needed is a way to connect these distributed workgroup servers to the data center.
Branch Offices
Problem: branch offices located at greater distances
Lack of resources to manage storage
Inconsistent backup at each site
Compliance with data security and retention regulations, e.g. banks, schools, clinics
(Diagram: branch offices with unmanaged backups and regulatory compliance issues connect to the data center)
Branch Offices
Branch offices can also pose a storage management issue for the enterprise. With typically too few management resources to manage storage at remote sites, backups are conducted on an ad-hoc basis, often leaving the company out of compliance with data security and retention regulations.
Mid-Range Applications
Need a cost-effective SAN solution for mid-range applications
Mid-range apps have low bandwidth requirements, typically 10-20 MB/s average
Typical uses: web server farms, application server farms, branch offices
(Diagram: today's data center, with N-tier applications, content switches, caches, SSL offload, web, application, and DB servers, mainframe, IP communications, and tape on the storage network)
Mid-Range Applications
While FC SANs have dramatically increased operational efficiency for high-end application storage, the high cost of FC has prevented these benefits from migrating down to mid-range applications. Mid-range applications don't need the same high levels of bandwidth and low levels of latency as high-end applications, so it is often difficult to achieve ROI in a reasonable timeframe by implementing FC for mid-range applications. As a result, many applications in the enterprise, such as file, web, and messaging servers, are managed separately, either via DAS or NAS, keeping management costs high. At the same time, the customer's investment in FC SANs is not fully realized. Inside the data center, there are a number of different tiers of servers. Two of those tiers are web server farms and application server farms. These servers are typically numerous, yet have low bandwidth requirements and can tolerate higher amounts of latency than database servers. It is often not considered cost-effective to migrate these servers to FC SANs. Assuming 2-Gbps FC ports, with each host sustaining an average of 15 MBps per port, only 7.5% of the available bandwidth is being utilized.
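The arithmetic behind that figure, and behind the fan-in scenarios on the following slides, can be sketched as below (Python; 200 MB/s as the usable rate of a 2-Gbps FC port and 15 MB/s per host are the example numbers from the text):

    def port_utilization(hosts_per_port, host_mbps=15, port_mbps=200):
        # Fraction of an FC port's bandwidth consumed by the attached hosts.
        return hosts_per_port * host_mbps / port_mbps

    print(f"{port_utilization(1):.1%}")   # 7.5%: one host per dedicated FC port
    print(f"{port_utilization(12):.1%}")  # 90.0%: twelve iSCSI hosts share a port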
iSCSI Overview
What is iSCSI?
internet Small Computer Systems Interface (iSCSI)
SCSI transport protocol carried over TCP/IP
Encapsulates SCSI commands and data into IP packets
TCP is the underlying transport
Provides congestion control and in-order delivery of error-free data
Allows iSCSI hosts to access iSCSI native targets
Allows iSCSI hosts to access FC SAN storage targets via a gateway
Provides seamless integration of mid-range servers into the SAN
Can use standard Ethernet NICs or iSCSI HBAs
(Frame layout: Ethernet header, 18 bytes | IP header, 20 bytes | TCP header, 20 bytes | iSCSI header, 48 bytes | SCSI data)
What is iSCSI?
Internet Small Computer Systems Interface (iSCSI) is a transport protocol that operates on top of TCP and encapsulates SCSI-level commands and data into a TCP/IP byte stream. It is a means of transporting SCSI packets over TCP/IP, providing an interoperable solution that can take advantage of existing IP-based infrastructures and management facilities while addressing distance limitations. Mapping SCSI I/O over TCP ensures that high-volume storage transfers have in-order delivery and error-free data with congestion control. This allows IP hosts to gain access to previously isolated Fibre Channel based storage targets. iSCSI is an end-to-end protocol with human-readable SCSI device (node) naming. It includes base components such as IPSec connectivity security, authentication for access configuration, discovery of iSCSI nodes, a process for remote boot, and iSCSI MIB standards. The iSCSI protocol was defined by an IP Storage Working Group through the Internet Engineering Task Force (IETF). Draft version 20 was recently approved by the Internet Engineering Steering Group (IESG).
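A quick sketch of what those header sizes mean for wire efficiency (a simplification that ignores Ethernet frame limits and treats one SCSI payload as one PDU; the function is my own illustration):

    HEADERS = {"ethernet": 18, "ip": 20, "tcp": 20, "iscsi": 48}

    def efficiency(payload_bytes):
        # Payload's share of the bytes on the wire, given 106 bytes of headers.
        overhead = sum(HEADERS.values())
        return payload_bytes / (payload_bytes + overhead)

    print(f"{efficiency(512):.1%}")   # small blocks: ~82.8% efficient
    print(f"{efficiency(8192):.1%}")  # larger transfers amortize headers: ~98.7%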
Advantages of iSCSI
Cost-effective technology for connecting low-end and mid-range servers, clients, and storage devices
Enables iSCSI hosts to communicate with iSCSI storage
Enables iSCSI hosts to communicate with FC storage, through a gateway
Advantages of iSCSI
iSCSI leverages existing IP networks. Users can therefore benefit from their experience with IP as well as the industry's experience with IP technologies. This includes:
Economies from using a standard IP infrastructure, products, and services across the organization
Experienced IP staff to install and operate these networks; with minimal additional training, IP staff in remote locations can be expected to maintain iSCSI-based servers
Management tools already exist for IP networks, reducing the need to learn new tools or protocols
Traffic across the IP network can be secured using standards-based solutions such as IPsec
QoS is used to ensure that SAN traffic is not affected by the potentially unreliable nature of IP; QoS exists today in the IP infrastructure and can be applied end to end across the IP network to give SAN traffic priority over less time-sensitive traffic
iSCSI is compatible with existing IP LAN and WAN infrastructures; iSCSI devices support an Ethernet or Gigabit Ethernet interface to connect to standard LAN infrastructures
Relieves host CPU resources from iSCSI and TCP processing
Does not necessarily increase performance; it helps only if the CPU is busy
Wire-rate iSCSI performance
Useful only when the host must support high sustained loads
(Diagram: host stack from file system through block device, SCSI, generic iSCSI, and TCP/IP to the NIC driver, with dedicated hardware on a TOE adapter)
Partial-offload TOE cards offload TCP/IP processing to the TOE but pass all errors (packet loss) to the driver running on the host CPU. In a lossy network, partial-offload TOEs may perform worse than a standard NIC. Full-offload TOE cards offload both TCP/IP processing and error recovery to the TOE card. The host CPU is still responsible for SCSI and iSCSI processing.
iSCSI HBAs offload TCP/IP processing and iSCSI processing to co-processors and custom ASICs on the iSCSI HBA. Although relatively expensive, iSCSI HBAs provide lower latency and higher throughput than iSCSI software drivers or TOE cards. Now that host CPUs offer more performance, it is usually more cost-effective to use software iSCSI drivers than NICs with TOE.
(Diagram: layers processed in hardware for a standard NIC, a TOE, and an iSCSI HBA; the HBA runs everything below the application and file system in hardware)
To achieve maximum performance, it is necessary to offload TCP/IP processing, iSCSI processing, and error recovery from the host CPU onto the iSCSI HBA. The host is still responsible for SCSI processing.
iSCSI Concepts
iSCSI Node
Identified by iSCSI Node Name
Initiator node = host; target node = storage
Target node contains one or more LUNs
Network Portal
Identified by IP address and subnet mask
Network access: TCP/IP over Ethernet, wireless, etc.
(Diagram: initiator and target Network Entities, each containing an iSCSI Node reached through Network Portals across the IP network)
iSCSI Concepts
SCSI standards define a client-server relationship between the SCSI initiator and the SCSI target. The iSCSI standards define each of these as a Network Entity. An iSCSI Network Entity contains an iSCSI Node, which is either the initiator or the target. iSCSI Nodes are identified by an iSCSI Node Name. If the target node is a storage array, it may contain one or more SCSI LUNs. iSCSI initiator nodes communicate with iSCSI target nodes through Network Portals. Network Portals connect to the IP network and are identified by an IP address. It is worth noting that Network Portals can also be wireless Ethernet ports.
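A minimal data model of these terms, purely for illustration (the class and field names are my own, not an API):

    from dataclasses import dataclass, field

    @dataclass
    class NetworkPortal:
        ip_address: str        # the portal's identity on the IP network
        tcp_port: int = 3260   # iSCSI's well-known TCP port

    @dataclass
    class ISCSINode:
        node_name: str         # iqn. or eui. name identifying the node
        role: str              # "initiator" or "target"
        luns: list = field(default_factory=list)  # targets may expose LUNs

    @dataclass
    class NetworkEntity:
        node: ISCSINode
        portals: list          # the node is reached through its portals

    target = NetworkEntity(
        node=ISCSINode("iqn.1987-05.com.cisco:array1", "target", luns=[0, 1]),
        portals=[NetworkPortal("192.0.2.10")],
    )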
iqn.1987-05.com.cisco.storage.backup.server1
Date = yyyy-mm when the domain was acquired, followed by the reversed domain name of the naming authority
iqn: iSCSI Qualified Name; up to 255 bytes; human-readable UTF-8 encoded string
eui: Extended Unique Identifier; 8-byte hexadecimal number defined and allocated by the IEEE
Although both formats can be used, typically the iSCSI driver will use the iqn format and the eui format will be used by manufacturers of native iSCSI devices.
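A small helper that pulls the iqn structure apart (the function and its checks are illustrative only, not from any iSCSI library):

    def parse_iqn(name):
        """Split 'iqn.yyyy-mm.reversed-domain[:optional-string]' into parts."""
        if not name.startswith("iqn."):
            raise ValueError("not an iqn-format name")
        date, _, rest = name[4:].partition(".")
        year, _, month = date.partition("-")
        authority, _, suffix = rest.partition(":")
        return {"year": year, "month": month,
                "naming_authority": authority, "suffix": suffix}

    print(parse_iqn("iqn.1987-05.com.cisco.storage.backup.server1"))
    # {'year': '1987', 'month': '05',
    #  'naming_authority': 'com.cisco.storage.backup.server1', 'suffix': ''}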
(Diagram: iSCSI servers on the IP network accessing native iSCSI storage directly)
iSCSI is most suitable for hosts running applications that are not latency-sensitive and have a low throughput requirement
iSCSI Gateways
iSCSI gateways allow iSCSI hosts and servers to communicate with Fibre Channel storage devices
MDS 9216i, MPS 14+2, and IPS line cards all provide iSCSI gateways
iSCSI is provided for free in the standard license
(Diagram: an iSCSI gateway bridges iSCSI servers on the IP network to FC storage on the FC SAN, alongside NAS filers and FC servers)
iSCSI Gateways
Most enterprises already have data centers with Fibre Channel SANs and FC storage arrays, but these cannot be accessed directly from iSCSI hosts. iSCSI gateways allow iSCSI hosts and servers to communicate with Fibre Channel storage devices. The Cisco MDS 9216i, MPS 14+2, and IPS line cards all provide an iSCSI-to-FC gateway function. iSCSI is provided for free on MDS switches in the standard license.
(Diagram: standalone iSCSI gateways/routers placed between iSCSI hosts and the FC fabrics)
This approach requires implementing a new set of devices (at least two devices for high availability), and possibly adding more devices if more network capacity is needed. It means separate management interfaces and, even worse, separate security policies. It is also typically less highly available, because the high-availability hardware features that one expects in a data center SAN switch are often not viable in a small, low-cost router product. It is potentially better for WAN-based branch offices. Gateways are not a good fit for metro-based branch offices, such as schools, clinics, and banks.
Integrated iSCSI
Multiprotocol SAN switch
Single SAN fabric
Single management interface
(Diagram: FC and iSCSI hosts attached to a single multiprotocol MDS fabric)
Designed for the data center and backup from remote offices
Integrated iSCSI
The Cisco iSCSI solution for data centers is the IP Services (IPS) Module series for the Cisco MDS 9000 platform. This approach integrates iSCSI and FC (along with FCIP and FICON) into a single multiprotocol SAN switch. This provides higher availability because iSCSI is supported on the highly available MDS 9000 platform. This provides a single management interface, a single point of control for security, and unifies iSCSI and FC storage into a single SAN fabric. This approach is designed to meet the availability, manageability and scalability requirements of the data center.
(Diagram: the IPS line card acts as an FC virtual initiator, translating the iSCSI exchange of SCSI Command, R2T, Data, and Response into the FC exchange of SCSI Command, XFER_RDY, Data, and Status between the iSCSI host and the FC target; the iSCSI PDU carries IP, TCP, and iSCSI headers of 20, 20, and 48 bytes)
Security:
RADIUS support
IP ACLs
VSANs and VLANs
Integrated FC and iSCSI zoning
IPSec
Ease of deployment:
Dynamic initiator and target discovery
Proxy initiators
iSNS server
(Diagram: iSCSI hosts reach the MDS 9000 FC fabric across the IP network through a Catalyst 6500)
(Diagram, scenario 1: few hosts at 2:1 fan-in, 60 device connections; iSCSI and FC costs are comparable: $ vs. $)
iSCSI Fan-In: Scenario 2
Scenario 2: Many hosts, low bandwidth
100 hosts x 15MB/s = 1500MB/s
(Diagram, scenario 2: 100 iSCSI hosts concentrated at 6:1 fan-in; iSCSI cost $ vs. FC cost $$$$)
(Diagram: many iSCSI hosts concentrated onto a few FC ports)
It is desirable to have high fan-in ratios in an IP SAN design, in part because they are more cost-effective, and in part because of the low port density of IP gateways and line cards.
(Diagram: iSCSI-enabled IP hosts reach backup assets on the FC SAN through a Catalyst 6500)
(Diagram: each GigE port carries three FCIP interfaces, bound to FCIP profiles, plus an iSCSI interface into the IP network)
GigE Interfaces
Each GigE port supports three FCIP interfaces and an iSCSI interface simultaneously, sharing 1 Gbps of bandwidth. Tests have shown that each iSCSI interface will support up to 200 iSCSI connections, although it is worth noting that all iSCSI hosts would share the same 1 Gbps of bandwidth. GigE ports can be joined using an Ethernet PortChannel for high availability. On the IPS-8 and MPS 14+2 line cards, odd/even pairs of ports share the same SiByte ASIC and resources.
(Diagrams: iSCSI high-availability options include VRRP, pWWN aliasing, and redundant VSANs)
VRRP
Two Gigabit Ethernet ports are in a VRRP group with one virtual IP address
If the active VRRP port fails, the peer reconnects to the same virtual IP address across the second port
Provides front-end redundancy
(Diagram: iSCSI host iqn.host-2 reaches the FC SAN through a VRRP virtual IP address shared by two GigE ports)
VRRP
The Virtual Router Redundancy Protocol (VRRP) dynamically handles redundant paths, making failures transparent to applications. Two ports are placed into a VRRP group that is assigned a single virtual IP address. The external router connects to the IPS via the virtual IP address. This enables transparent failover of an iSCSI volume from one IPS port to any other IPS port, either locally or on another Cisco MDS 9000 Family switch. VRRP provides redundancy in front of the MDS switch but can take up to 20 seconds to fail over.
pWWN Aliasing
Provides back-end redundancy
Each FC storage port is mapped to a virtual iSCSI target
pWWN aliasing maps a secondary pWWN to the same virtual target
Trespass feature for mid-range storage arrays: exports LUNs from the active to the passive port
(Diagram: iSCSI host iqn.host-1 accesses a virtual target backed by redundant FC storage ports)
pWWN Aliasing
Virtual iSCSI targets can be associated with a secondary pWWN on the FC target. This can be used when the physical Fibre Channel target is configured to have a LUN visible across redundant ports. When the active port fails, the secondary port becomes active and the iSCSI session switches to use the new active port. iSCSI transparently switches to the secondary port without impacting the iSCSI host. All other I/Os are terminated with check-condition status, and the host retries them. If both the primary and secondary pWWNs are available, then both pWWNs can be used; each session may use either pWWN. For mid-range storage arrays, the trespass feature is available to enable the export of LUNs, on an active port failure, from the active to the passive port of a statically imported iSCSI target. In physical Fibre Channel targets that are configured to have LUNs visible over two Fibre Channel N-ports, when the active port fails, the passive port takes over. However, some physical Fibre Channel targets require that a trespass command be issued to export the LUNs from the active port to the passive port. When the active port fails, the passive port becomes active, and if the trespass feature is enabled, the MDS issues a trespass command to the target to export the LUNs on the new active port. The iSCSI session switches to use the new active port, and the exported LUNs are accessed over it. pWWN aliasing and trespass provide redundancy behind the MDS switch.
Host-to-Storage Multipathing
Redundant I/O design with multipathing s/w is a best practice:
Error detection, dynamic failover, and recovery
Active/active or active/passive operation
Transparent to applications on the server
(Diagram: a host with multiple iSCSI NICs and multipathing software connects through redundant Ethernet switches to redundant MDS 9000 fabrics)
Load Balancing with iSLB (SAN-OS 3.0)
Create a pool of IPS ports; load balance servers to ports from the pool
(Diagram: iSLB distributes iSCSI servers across IPS ports in front of the FC arrays)
Without iSLB:
Manually configure iSCSI configuration on multiple switches
Static assignment of hosts to IPS ports, with active/backup redundancy
Manually zone iSCSI host WWN with FC target WWN
With iSLB:
CFS automatically distributes iSCSI configuration to multiple switches
iSLB provides dynamic load distribution with active/active redundancy
Simplified zoning by automating setup of iSCSI-specific attributes
iSCSI Security
iSCSI Access Control
MDS 9000 Bridges Security Domains for IP SANs
IP Domain: VLANs, ACLs, CHAP, IPSec
Management Domain: AAA, SNMP, RBAC, SSH
Fibre Channel Domain: VSANs, Zoning, Port Security
The MDS 9000 family of switches provides the security features, intelligence capabilities, and processing capacity needed to bridge these security domains. While it is not a requirement to implement all of these security features, it is a recommended best practice to implement multiple levels of security. For example, iSCSI CHAP authentication is not required, but it can be used in combination with FC-based zoning to create a more secure IP SAN.
Centralized Security
Local RADIUS server on the MDS
Centralized AAA services via RADIUS and TACACS+ servers
Single AAA database for:
iSCSI CHAP authentication
FC-CHAP authentication
(Diagram: RADIUS provides CHAP for iSCSI servers, FC-CHAP for FC servers and targets, and RBAC for the management server)
Centralized Security
The MDS 9000 platform provides centralized AAA services by supporting RADIUS and TACACS+ servers. With iSCSI, RADIUS can be used to implement a single, highly available AAA database for both iSCSI CHAP and FC-CHAP authentication.
(Diagram: iSCSI initiators assigned to VSANs)
By default, all iSCSI initiators belong to the port VSAN of their iSCSI interface (VSAN 1)
iSCSI initiators can be assigned to VSANs by pWWN
iSCSI initiators can belong to multiple VSANs
VSANs and zones are FC access control mechanisms
MDS 9000 extends VSANs and zoning into the iSCSI domain
iSCSI initiator access is subject to VSAN and zoning rules
IPSec
IPSec for secure VPN tunnels:
Authentication and encryption
Site-to-site VPNs for FCIP tunnels
Site-to-site VPNs for iSCSI connections
Hardware-based IPSec on the 14+2 module
(Diagram: IPSec tunnels protect FCIP and iSCSI traffic between sites)
IPSec
The IPSec protocol creates secure tunnels between a pair of hosts, between a pair of gateways, or between a gateway and a host. IPSec supports session-level and packet-level authentication using a variety of algorithms, such as MD5, SHA-1, DES, and 3DES. Session-level authentication ensures that devices are authorized to communicate and verifies that devices are who they say they are, while packet-level authentication ensures that data has not been altered in transit. Applications for IPSec VPNs in the SAN include:
Site-to-site VPNs for FCIP SAN interconnects
Site-to-site VPNs for IP SANs (iSCSI hosts accessing remote FC storage)
VLANs
Use VLANs to secure data paths at the edge of the IP network
VLAN-to-VSAN mapping
Private VLANs
(Diagram: an iSCSI VLAN at the network edge mapped to an iSCSI VSAN in the fabric, alongside an FCIP VSAN)
VLANs
Within each data center or remote site, VLANs can be used to provide dedicated paths for IP storage traffic. VLANs can be used to:
Protect iSCSI traffic along the data path from the hosts to the SAN fabric
Provide dedicated paths for FC extension over FCIP by extending VLANs from the SAN fabric to edge routers
In addition to providing security, using VLANs to isolate iSCSI and FCIP data paths enhances the network administrator's ability to provide dedicated bandwidth to SAN devices and allows more effective application of QoS parameters.
The goal of iSCSI discovery is to allow an initiator to find the targets to which it has access, and at least one address at which each target may be accessed. Ideally, this should be done using as little configuration as possible. The iSCSI discovery mechanisms deal only with target discovery; the SCSI protocol is used for LUN discovery. In order for an iSCSI initiator to establish an iSCSI session with an iSCSI target, the initiator needs the IP address, TCP port number, and iSCSI target name. The goal of the iSCSI discovery mechanisms is to provide low-overhead support for small iSCSI setups and scalable discovery solutions for large enterprise setups. Thus, there are several methods that may be used to find targets, ranging from configuring a list of targets and addresses on each initiator and doing no discovery at all, to configuring nothing on each initiator and allowing the initiator to discover targets dynamically. There are currently three basic ways to allow iSCSI host systems to discover the presence of iSCSI target storage controllers:
Static configuration
The iSCSI SendTargets command
Zero-configuration methods such as the Service Location Protocol (SLPv2) and/or the Internet Storage Name Service (iSNS)
The diagram above shows the SendTargets method. This method is most often used today with simple iSCSI solutions. However, iSNS server will be used in the future to scale iSCSI target discovery. SAN-OS 2.0 has the iSNS server component.
Enables an integrated solution to configure and manage both Fibre Channel and iSCSI devices:
Device registration, discovery, and state change notification
Distributed, HA solution
Discovery Domains mapped to FC zones
Discovery Domain Sets mapped to FC zone sets
No need for dual access-control configuration
(Diagram: iSNS servers on the MDS switches serve iSNS clients on the iSCSI servers, alongside FC servers and targets)
(Diagram: islands of storage; branch, regional, and remote offices each maintain local IT, NAS/DAS, files, and backup, separate from the data center)
Typical Enterprise
In a typical enterprise environment, several branch offices connect to the Data Center over the WAN. Each branch office is responsible for data protection and backup of critical data leading to concerns for regulatory compliance. Each branch office requires local technical support and management of the infrastructure leading to high costs of deployment.
(Diagram: backup consolidated at the data center under a single IT admin, with the remote office relieved of local backup)
Centralized IT management and backup strategy
Files cached in WAAS and locally accessed
(Diagram: a web-based WAAS Manager; files are cached at the remote office cluster while master copies and backup remain at the data center)
(Diagram: where the file system sits for an FC storage array, iSCSI storage, and an FC application server)
Block I/O protocols like SCSI are used to transfer blocks between SCSI Initiators and SCSI Targets.
iSCSI is used to transport SCSI commands and data across the LAN
Fibre Channel is used to transport SCSI commands and data across the SAN
The File System is a table that is used to map files to blocks. The data center is a complex environment with many different File I/O and Block I/O protocols used to transfer data to and from storage devices. In this environment it is important to understand where the data is located and where the file system is located.
NAS filers connect to the LAN and have their own file system and local storage; they respond to file I/O protocols like NFS and CIFS
A NAS head is a NAS filer without local storage; NAS heads bridge the LAN and the SAN, responding to file I/O protocols and mapping them through the file system to the block I/O protocols used to access FC storage on the SAN
An iSCSI gateway allows iSCSI hosts on the LAN to access FC storage on the SAN; note that the file system is now on the iSCSI host
An FC Application server responds to File I/O requests from the Client on the LAN and will retrieve data using Block I/O from FC storage LUNs across the SAN. This time, the FC Application Server contains the File System.
(Diagram: a NAS filer serving CIFS and NFS; Unix clients use NFS)
(Diagram: the network infrastructure provides core routing and switching services plus application classification and policy, logical and physical integration, security, monitoring, and quality of service)
Reduce TCO and improve asset management through centralized rather than distributed infrastructure
Improve data management and protection by keeping a master copy of all files and content at the data center
Improve the ability to meet regulatory compliance objectives
Raise employee productivity by providing faster access to shared information
Protect investment in existing WAN deployments
WAFS Performance
(Chart: time in seconds to open a Word document over the native WAN, with Cisco FE, and on the native LAN)
Cisco WAAS shows 5x to 12x faster performance as compared to the WAN, and similar performance to the LAN, for typical operations on Office applications
WAFS Performance
The above diagram shows the comparison, on file open and file close, of a WAFS-enabled site versus a direct-access WAN site. Even when a file is not cached on the local Wide Area Application Engine, the WAFS performance enhancements use roughly one third of the WAN bandwidth that a native WAN request for the same file would use.
Note All graphs and statistics are examples only, actual performance will vary depending on network design, server design and application design.