
Defining SAN

Storage networks are distinguished from other forms of network storage by the low-level access method they use. Data traffic on these networks closely resembles that of internal disk buses such as ATA and SCSI. In a storage network, a server issues a request for specific blocks, or data segments, from specific disk drives. This method is known as block storage: direct access to arbitrary blocks in computer disk storage. Block storage is normally abstracted by a file system or database management system for use by applications and end users. The physical or logical volumes accessed via block I/O may be devices internal to a server, directly attached via SCSI or Fibre Channel, or remote devices accessed via a storage area network (SAN) using a protocol such as iSCSI or AoE. Database management systems often implement their own block I/O for improved performance and recoverability compared to layering the DBMS on top of a file system. The device acts much like an internal drive, accessing the specified block and sending the response across the network.

In more traditional file storage access methods, such as SMB/CIFS or NFS, a server issues a request for an abstract file as a component of a larger file system managed by an intermediary computer. The intermediary determines the physical location of the abstract resource, accesses it on one of its internal drives, and sends the complete file across the network.

Most storage networks use the SCSI protocol for communication between servers and devices, though they do not use its low-level physical interface. Typical SAN physical interfaces include 1Gbit, 2Gbit, and 4Gbit Fibre Channel and, in limited cases, 1Gbit iSCSI. The SCSI protocol information is carried over the lower-level protocol via a mapping layer. For example, most SANs in production today run some form of SCSI over Fibre Channel, as defined by the "FCP" mapping standard. iSCSI is a similar mapping method designed to carry SCSI information over IP.
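The block-access model described above can be sketched in a few lines of code. This is purely an illustrative sketch (the function name, path, and block size are assumptions, not anything defined in the text): storage is addressed by logical block address (LBA), with no notion of files at this layer.

```python
BLOCK_SIZE = 512  # bytes per logical block; 512 is the classic SCSI/ATA size


def read_blocks(device_path, lba, count, block_size=BLOCK_SIZE):
    """Block-level access: return `count` blocks starting at logical block
    address `lba`, the way a SAN initiator addresses storage. Any file
    system or database lives in a layer above this one."""
    with open(device_path, "rb") as dev:
        dev.seek(lba * block_size)
        return dev.read(count * block_size)
```

On Linux, the same call works against an ordinary file or a real block device node, which is essentially what a file system driver does on top of a SAN-attached LUN.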

Benefits
Sharing storage usually simplifies storage administration and adds flexibility, since cables and storage devices do not have to be physically moved to shift storage from one server to another. Note, though, that with the exception of SAN file systems and clustered computing, SAN storage is still a one-to-one relationship: each device (or Logical Unit Number (LUN)) on the SAN is "owned" by a single computer (or initiator). In contrast, Network Attached Storage (NAS) allows many computers to access the same set of files over a network. It is now possible to combine SAN and NAS using a NAS head. SANs also tend to increase storage capacity utilization, since multiple servers can share the same growth reserve.

Other benefits include the ability for servers to boot from the SAN itself. This allows for quick and easy replacement of faulty servers, since the SAN can be reconfigured so that a replacement server uses the LUN of the faulty server. The process can take as little as half an hour and is a relatively new idea being pioneered in newer data centers. A number of emerging products are designed to facilitate and speed up this process still further. For example, Brocade Communications Systems offers an Application Resource Manager product which automatically provisions servers to boot off a SAN, with typical-case load times measured in minutes. While this area of technology is still new, many view it as the future of the enterprise data center.

SANs also tend to enable more effective disaster recovery. (A disaster recovery plan covers the data, hardware, and software critical for a business to restart operations in the event of a natural or human-caused disaster. It should also include plans for coping with the unexpected or sudden loss of key personnel, although this is not covered in this article, whose focus is data protection processes.) A SAN-attached storage array can replicate data belonging to many servers to a secondary storage array, which can be local or, more typically, remote. The goal of disaster recovery is to place copies of data outside the radius of effect of an anticipated threat, so the long-distance transport capabilities of SAN protocols such as Fibre Channel and FCIP are required to support these solutions. (The physical-layer options for the traditional direct-attached SCSI model could only support a few meters of distance: not nearly enough to ensure business continuance in a disaster.) Demand for this SAN application increased dramatically after the September 11th attacks in the United States and with the regulatory requirements associated with Sarbanes-Oxley and similar legislation.

Newer SANs allow duplication functionality such as "cloning" and "snapshotting," which enables real-time duplication of a LUN for the purposes of backup, disaster recovery, or system duplication. With higher-end database systems this can occur without downtime and is geographically independent, limited primarily by available bandwidth and storage. Cloning creates a complete replica of the LUN in the background (consuming I/O resources in the process), while snapshotting stores only the original states of any blocks that are changed after the "snapshot" (also known as the delta blocks) from the original LUN, and does not significantly slow the system. Over time, however, snapshots can grow to be as large as the original system, so they are normally recommended only for temporary storage. The two types of duplication are otherwise identical: a cloned or snapshotted LUN can be mounted on another system for execution, for backup to tape or another device, or for replication to a distant point.
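The clone/snapshot distinction above can be made concrete with a minimal copy-on-write sketch (the class and method names here are invented for illustration): only the original state of a block that is later overwritten, the delta block, is preserved.

```python
class ToyLun:
    """A LUN modeled as a list of blocks, with copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []  # one {index: original block} dict per snapshot

    def snapshot(self):
        """O(1): no data is copied at snapshot time."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Copy-on-write: save the original (delta) block into any snapshot
        # that has not preserved it yet, then overwrite in place.
        for snap in self.snapshots:
            snap.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        """Snapshot view: the saved delta block if present, else live data."""
        return self.snapshots[snap_id].get(index, self.blocks[index])
```

This is why a snapshot is nearly free to create but can grow toward the size of the original LUN as more blocks change, exactly as described above.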

Disk controllers
The driving force behind the SAN market in the enterprise space is the rapid growth of highly transactional data that requires high-speed, block-level access to the hard drives (such as data from email servers, databases, and high-usage file servers). Historically, enterprises would have "islands" of high-performance SCSI RAID storage locally attached to each application server. These islands would be backed up over the network, and when the application data exceeded the maximum amount the individual server could store, the end user would often have to upgrade the server to keep up.

The disk controllers used in enterprise SAN environments are designed to provide applications with block-level access to high-speed, reliable "virtual hard drives" (or LUNs). In addition, modern SANs allow enterprises to intermix FC SATA drives with their FC SCSI drives. SATA drives have lower performance, a higher failure rate, higher capacity, and lower prices than SCSI. This allows enterprises to maintain multiple tiers of data that migrate over time to different types of media. For example, many enterprises relegate files that are rarely accessed to FC SATA while keeping frequently used data on FC SCSI.

Another feature of most enterprise disk controllers is an I/O cache. This feature allows higher overall performance when writing to the controller and, in some cases (such as contiguous file access where read-ahead is enabled), when reading from it.
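The tiering policy described above (hot data on FC SCSI, cold data on FC SATA) amounts to a simple classification by access frequency. A toy sketch; the function name and threshold are invented policy knobs, not values from the text:

```python
def assign_tiers(access_counts, hot_threshold=100):
    """Map each data item to a storage tier based on how often it is
    accessed. `access_counts` is {name: accesses per period}; items at or
    above the (arbitrary) threshold stay on the fast tier."""
    return {
        name: "FC SCSI" if count >= hot_threshold else "FC SATA"
        for name, count in access_counts.items()
    }
```

A real controller would migrate data gradually and hysteretically rather than reclassifying on every sample, but the decision rule is the same.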

SAN types
SANs are normally built on an infrastructure specially designed to handle storage communications. Thus, they tend to provide faster and more reliable access than the higher-level file protocols used by NAS.

The most common SAN technology by far is Fibre Channel networking with the SCSI command set. A typical
Fibre Channel SAN is made up of a number of Fibre Channel switches which are connected together to form a
fabric. A fabric is similar in concept to a segment in a local area network. Today, all major SAN equipment
vendors also offer some form of Fibre Channel routing solution, and these bring substantial scalability benefits to
the SAN architecture by allowing data to cross between different fabrics without merging them. However, most of
these offerings use proprietary protocol elements, and the top-level architectures being promoted are radically
different. When extending Fibre Channel over long distances for disaster recovery solutions, it can be mapped
over other protocols. For example, products exist to map Fibre Channel over IP (FCIP) and over SONET/SDH. It
can also be extended natively using signal repeaters, high-power laser media, or multiplexers such as
DWDMs.

An alternative SAN protocol is iSCSI, which uses the same SCSI command set over TCP/IP (and, typically, Ethernet); in this case the switches are Ethernet switches. The iSCSI standard was ratified in 2003, so it has not yet had time to gather broad industry support. Fibre Channel has existed in production environments for over a decade and has already been widely deployed as strategic network infrastructure, so it will take iSCSI quite some time to make significant inroads into the installed base of Fibre Channel SANs. It also underperforms Fibre Channel significantly and may not be suitable for enterprise deployments. As a result, iSCSI is generally seen as more of a competitor to NAS protocols such as CIFS and NFS.

Another alternative to iSCSI is the ATA-over-Ethernet (AoE) protocol, which embeds the ATA protocol inside raw Ethernet frames. While a raw Ethernet protocol like AoE cannot be routed without something else performing the encapsulation, it does provide a simple discovery model with low overhead.

Connected to the SAN will be one or more servers (hosts) and one or more disk arrays, tape libraries, or other storage devices. In the case of a Fibre Channel SAN, the servers use special Fibre Channel host bus adapters (HBAs) and optical fiber; iSCSI SANs normally use Ethernet network interface cards, and often specialized TOE cards. Storage area networks come in two kinds: centralized and distributed.

Compatibility
One of the early problems with Fibre Channel SANs was that the switches and other hardware from different
manufacturers were not entirely compatible. Although the basic storage protocols (such as FCP) were always
quite standard, some of the higher-level functions did not interoperate well. Similarly, many host operating systems would react badly to other OSes sharing the same fabric. Many systems were pushed to market before standards were finalized, and vendors innovated around the standards.

The combined efforts of the members of the Storage Networking Industry Association (SNIA) improved the situation during 2002 and 2003. Today most vendor devices, from HBAs to switches and arrays, interoperate nicely, though there are still many high-level functions that do not work between different manufacturers' hardware. While this work is substantially complete for Fibre Channel, the process has only just begun for other SAN protocols such as iSCSI. Interoperability
at the IP layer is not a problem, but higher layer functions still need substantial integration work, and this is likely
to take years.

SANs at work
SANs are primarily used in large scale, high performance enterprise storage operations. It would be unusual to
find a Fibre Channel disk drive connected directly to a SAN. Instead, SANs are normally networks of large disk
arrays. SAN equipment is relatively expensive; therefore, Fibre Channel host bus adapters are rare in desktop
computers. The iSCSI SAN technology is expected to eventually produce cheap SANs, but it is unlikely that this
technology will be used outside the enterprise data center environment. Desktop clients are expected to continue
using NAS protocols such as CIFS and NFS. The exception to this may be remote replication sites. Remote
replication enables the data center environment to exist in multiple locations for disaster recovery and business
continuity purposes. The performance issues inherent in iSCSI are likely to limit its deployment to lower-tier
applications, with Fibre Channel remaining incumbent for high performance systems.

SANs in a Small Office / Home Office (SOHO)


With the increasing role of digital media in all phases of life and its effect on storage needs, it is natural that SANs have begun to enter the SOHO market. Historically, this market was dominated by NAS systems, but SOHO is poised to become a major market for SAN infrastructure as SOHO performance requirements rise.

Systems such as film scanners and video editing applications require performance that cannot be provided by traditional file servers. For example, motion picture film at 2048x1556 requires more than 300 MBytes/s for each real-time stream, and several of these streams can be required simultaneously. As a result, several gigabits per second can be required, which creates a problem for standard NAS technologies. In addition, these systems need to work with the same files collaboratively, so the files cannot be distributed across different file servers or DAS connections.

Instead of having many computers connected to the network, each requiring low bandwidth and with only the server stressed under heavy traffic, the SOHO "real-time" area needs to integrate only a few systems, but all of them require high-bandwidth access to the same files. These problems are addressed very well by 4Gbit Fibre Channel SAN infrastructures, where the aggregate bandwidth for sequential I/O operations is extremely high.
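The 300 MBytes/s figure can be checked with back-of-the-envelope arithmetic. The frame packing assumed below (10-bit RGB stored in 32-bit words, a common DPX-style layout, so 4 bytes per pixel at 24 frames per second) is an assumption of this sketch, not stated in the text:

```python
width, height = 2048, 1556   # 2K film scan resolution from the text
bytes_per_pixel = 4          # assumed: 10-bit RGB packed into a 32-bit word
fps = 24                     # standard film frame rate

bytes_per_frame = width * height * bytes_per_pixel   # 12,746,752 bytes
stream_rate = bytes_per_frame * fps                  # bytes per second

print(stream_rate / 1e6)     # ~305.9 MB/s, i.e. "more than 300 MBytes/s"
```

Even one such stream saturates Gigabit Ethernet several times over, which is why the text points to 4Gbit Fibre Channel rather than standard NAS links.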

RAID stands for Redundant Array of Inexpensive (or sometimes "Independent") Disks.

RAID is a method of combining several hard disk drives into one logical unit (two or more disks grouped together
to appear as a single device to the host system). RAID technology was developed to address the fault-tolerance
and performance limitations of conventional disk storage. It can offer fault tolerance and higher throughput levels
than a single hard drive or group of independent hard drives. While arrays were once considered complex and
relatively specialized storage solutions, today they are easy to use and essential for a broad spectrum of
client/server applications.
History

RAID technology was first defined by a group of computer scientists at the University of California at Berkeley
in 1987. The scientists studied the possibility of using two or more disks to appear as a single device to the host
system.
Although the array's performance was better than that of large, single-disk storage systems, reliability was
unacceptably low. To address this, the scientists proposed redundant architectures to provide ways of achieving
storage fault tolerance. In addition to defining RAID levels 1 through 5, the scientists also studied data striping --
a non-redundant array configuration that distributes files across multiple disks in an array. Often known as RAID
0, this configuration actually provides no data protection. However, it does offer maximum throughput for some
data-intensive applications such as desktop digital video production.

The driving factors behind RAID

A number of factors are responsible for the growing adoption of arrays for critical network storage.
More and more organizations have created enterprise-wide networks to improve productivity and streamline
information flow. While the distributed data stored on network servers provides substantial cost benefits, these
savings can be quickly offset if information is frequently lost or becomes inaccessible. As today's applications
create larger files, network storage needs have increased proportionately. In addition, accelerating CPU speeds
have outstripped data transfer rates to storage media, creating bottlenecks in today's systems.
RAID storage solutions overcome these challenges by providing a combination of outstanding data availability,
extraordinary and highly scalable performance, high capacity, and recovery with no loss of data or interruption of
user access.
By integrating multiple drives into a single array -- which is viewed by the network operating system as a single disk drive -- organizations can create cost-effective, minicomputer-sized solutions of up to a terabyte or more of storage.

RAID Levels

There are several different RAID "levels" or redundancy schemes, each with inherent cost, performance, and
availability (fault-tolerance) characteristics designed to meet different storage needs. No individual RAID level is
inherently superior to any other. Each of the five array architectures is well-suited for certain types of applications
and computing environments. For client/server applications, storage systems based on RAID levels 1, 0/1, and 5
have been the most widely used. This is because popular NOSs such as Windows NT® Server and NetWare
manage data in ways similar to how these RAID architectures perform.

RAID 0 - RAID 1 - RAID 2 - RAID 3 - RAID 4 - RAID 5 - RAID 0/1 (or RAID 10)

RAID 0

Data striping without redundancy (no protection).

• Minimum number of drives: 2


• Strengths: Highest performance.
• Weaknesses: No data protection; if one drive fails, all data is lost.

DRIVE 1 DRIVE 2
Data A Data A
Data B Data B
Data C Data C
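Striping distributes consecutive logical blocks round-robin across the drives (in the table, each lettered data item is split into stripes that land on both drives). A minimal address-mapping sketch, with an invented function name and a stripe unit of one block:

```python
def raid0_locate(lba, ndisks):
    """Round-robin striping: logical block `lba` lands on disk
    (lba % ndisks) at physical offset (lba // ndisks). Consecutive blocks
    therefore hit different drives and can be transferred in parallel,
    which is where RAID 0's performance comes from."""
    return lba % ndisks, lba // ndisks
```

With two disks, logical blocks 0, 1, 2, 3 map to disk 0 and disk 1 alternately, each at steadily increasing physical offsets.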
RAID 1

Disk mirroring.

• Minimum number of drives: 2


• Strengths: Very high performance; Very high data protection; Very minimal penalty on write
performance.
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is
required.

Mirroring (one host adapter)           Duplexing (two host adapters)

Standard Host Adapter                  Host Adapter 1    Host Adapter 2
DRIVE 1        DRIVE 2                 DRIVE 1           DRIVE 2
Data A         Data A                  Data A            Data A
Data B         Data B                  Data B            Data B
Data C         Data C                  Data C            Data C
Original Data  Mirrored Data           Original Data     Mirrored Data

RAID 2

No practical use.

• Minimum number of drives: Not used in LAN


• Strengths: Previously used in environments requiring RAM-style error correction (known as Hamming Code) and in disk drives before the use of embedded error correction.
• Weaknesses: No practical use; the same performance can be achieved by RAID 3 at lower cost.

RAID 3

Byte-level data striping with dedicated parity drive.

• Minimum number of drives: 3


• Strengths: Excellent performance for large, sequential data requests.
• Weaknesses: Not well-suited for transaction-oriented network applications; Single parity drive does not
support multiple, simultaneous read and write requests.

RAID 4

Block-level data striping with dedicated parity drive.

• Minimum number of drives: 3 (Not widely used)


• Strengths: Data striping supports multiple simultaneous read requests.
• Weaknesses: Write requests suffer from the same single parity-drive bottleneck as RAID 3; RAID 5 offers equal data protection and better performance at the same cost.

RAID 5

Block-level data striping with distributed parity.


• Minimum number of drives: 3
• Strengths: Best cost/performance for transaction-oriented networks; Very high performance, very high
data protection; Supports multiple simultaneous reads and writes; Can also be optimized for large,
sequential requests.
• Weaknesses: Write performance is slower than RAID 0 or RAID 1.

DRIVE 1 DRIVE 2 DRIVE 3


Parity A Data A Data A
Data B Parity B Data B
Data C Data C Parity C
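The rotation in the table above (parity on drive 1 for stripe A, drive 2 for stripe B, drive 3 for stripe C) is simply a modulo placement. A sketch with invented names:

```python
def raid5_layout(nstripes, ndisks):
    """Return a per-stripe map of which drive holds parity ('P') and which
    hold data ('D'). Rotating the parity drive across stripes avoids the
    dedicated-parity-drive bottleneck of RAID 3 and RAID 4."""
    return [
        ["P" if disk == stripe % ndisks else "D" for disk in range(ndisks)]
        for stripe in range(nstripes)
    ]

for row in raid5_layout(3, 3):
    print(" ".join(row))
```

For three stripes on three disks this prints the diagonal pattern shown in the table: P on drive 1, then drive 2, then drive 3.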

RAID 0/1 - RAID 10

Combination of RAID 0 (data striping) and RAID 1 (mirroring).


Note that RAID 10 is another name for RAID (0+1) or RAID 0/1.

• Minimum number of drives: 4


• Strengths: Highest performance, highest data protection (can tolerate multiple drive failures).
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is
required; Requires minimum of four drives.

DRIVE 1 DRIVE 2 DRIVE 3 DRIVE 4


Data A Data A mA mA
Data B Data B mB mB
Data C Data C mC mC
Original Data Original Data Mirrored Data Mirrored Data

Types Of RAID

There are three primary array implementations: software-based arrays, bus-based array adapters/controllers, and
subsystem-based external array controllers. As with the various RAID levels, no one implementation is clearly
better than another -- although software-based arrays are rapidly losing favor as high-performance, low-cost array
adapters become increasingly available. Each array solution meets different server and network requirements,
depending on the number of users, applications, and storage requirements.

It is important to note that all RAID code is based on software. The difference among the solutions is where that software code is executed: on the host CPU (software-based arrays) or offloaded to an on-board processor (bus-based and external array controllers).

Software-based RAID
  Primarily used with entry-level servers, software-based arrays rely on a
  standard host adapter and execute all I/O commands and mathematically
  intensive RAID algorithms in the host server CPU. This can slow system
  performance by increasing host PCI bus traffic, CPU utilization, and CPU
  interrupts. Some NOSs, such as NetWare and Windows NT, include embedded
  RAID software. The chief advantage of this embedded RAID software has been
  its lower cost compared to higher-priced RAID alternatives. However, this
  advantage is disappearing with the advent of lower-cost, bus-based array
  adapters.
  Advantages: Low price; requires only a standard controller.

Hardware-based RAID (bus-based array adapters/controllers)
  Unlike software-based arrays, bus-based array adapters/controllers plug
  into a host bus slot [typically a 133 MByte (MB)/sec PCI bus] and offload
  some or all of the I/O commands and RAID operations to one or more
  secondary processors. Originally used only with mid- to high-end servers
  due to cost, lower-cost bus-based array adapters are now available
  specifically for entry-level server network applications.

  In addition to offering the fault-tolerant benefits of RAID, bus-based
  array adapters/controllers perform connectivity functions that are similar
  to standard host adapters. By residing directly on a host PCI bus, they
  provide the highest performance of all array types. Bus-based arrays also
  deliver more robust fault-tolerant features than embedded NOS RAID
  software. As newer, high-end technologies such as Fibre Channel become
  readily available, the performance advantage of bus-based arrays compared
  to external array controller solutions may diminish.
  Advantages: Data protection and performance benefits of RAID; more robust
  fault-tolerant features and increased performance versus software-based
  RAID.

External hardware RAID (external array controllers)
  Intelligent external array controllers "bridge" between one or more server
  I/O interfaces and single- or multiple-device channels. These controllers
  feature an on-board microprocessor, which provides high performance and
  handles functions such as executing RAID software code and supporting data
  caching.

  External array controllers offer complete operating system independence,
  the highest availability, and the ability to scale storage to
  extraordinarily large capacities (up to a terabyte and beyond). These
  controllers are usually installed in networks of stand-alone Intel-based
  and UNIX-based servers as well as clustered server environments.
  Advantages: OS independence; builds super-high-capacity storage systems
  for high-end servers.

Server Technology Comparison


UDMA
  Best suited for: Low-cost, entry-level servers with limited expandability.
  Advantages: Uses low-cost ATA drives.

SCSI
  Best suited for: Low- to high-end servers where scalability is desired.
  Advantages: Performance up to 160 MB/s; reliability; connectivity to the
  largest variety of peripherals; expandability.

Fibre Channel
  Best suited for: Server-to-server campus networks.
  Advantages: Performance up to 100 MB/s; dual active loop data path
  capability; infinitely scalable.

Parity

The concept behind RAID is relatively simple. The fundamental premise is the ability to recover data on-line in the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity can be thought of as a sum of the data across all of the drives in an array. Recovery from a drive failure is achieved by reading the remaining good data and checking it against the parity data stored by the array. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does not use parity because all data is completely duplicated (mirrored). RAID 0, used only to increase performance, offers no data redundancy at all.

A + B + C + D = PARITY
1 + 2 + 3 + 4 = 10

If the drive holding C fails:

1 + 2 + X + 4 = 10
        7 + X = 10
            X = 10 - 7
            X = 3

The missing data (X = 3) is recovered from the remaining data and the parity.
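Real arrays implement this "addition" as a bytewise XOR rather than integer arithmetic, which avoids carries and is its own inverse: XORing the parity with the surviving blocks yields the missing block. A minimal sketch (function name invented):

```python
def xor_parity(blocks):
    """Bytewise XOR of equal-length blocks: the RAID parity operation."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


data = [b"\x01", b"\x02", b"\x03", b"\x04"]   # the 1, 2, 3, 4 from above
parity = xor_parity(data)

# The drive holding b"\x03" fails; rebuild it from parity plus survivors.
survivors = [data[0], data[1], data[3]]
recovered = xor_parity([parity] + survivors)
```

The recovery step is exactly the arithmetic shown above, just carried out bit by bit instead of with decimal sums.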

Fault tolerance

RAID technology does not prevent drive failures. However, RAID does provide insurance against disk drive
failures by enabling real-time data recovery without data loss.
The fault tolerance of arrays can also be significantly enhanced by choosing the right storage enclosure.
Enclosures that feature redundant, hot-swappable drives, power supplies, and fans can greatly increase storage
subsystem uptime based on a number of widely accepted measures:

• MTDL:
Mean Time to Data Loss. The average time before the failure of an array component causes data to be lost
or corrupted.

• MTDA:
Mean Time between Data Access (or availability). The average time before non-redundant components
fail, causing data inaccessibility without loss or corruption.

• MTTR:
Mean Time To Repair. The average time required to bring an array storage subsystem back to full fault
tolerance.

• MTBF:
Mean Time Between Failures. Used to measure the average reliability/life expectancy of a computer component. MTBF is not as well-suited for measuring the reliability of array storage systems as MTDL, MTTR, or MTDA (see above), because it does not account for an array's ability to recover from a drive failure. In addition, the enhanced enclosure environments used with arrays to increase uptime can further limit the applicability of MTBF ratings for array solutions.
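For a single-parity array, MTDL can be estimated from the per-drive MTBF and the repair time: data is lost when a second drive fails while the first is still being rebuilt. The classic approximation MTBF^2 / (N * (N-1) * MTTR), which assumes independent, exponentially distributed failures, is a simplification offered here as a sketch, not a formula from the text:

```python
def estimate_mtdl(mtbf_hours, ndisks, mttr_hours):
    """Approximate mean time to data loss for an N-disk, single-parity
    array: the first failure occurs at rate N/MTBF, and data is lost only
    if one of the remaining N-1 drives fails inside the MTTR window."""
    return mtbf_hours ** 2 / (ndisks * (ndisks - 1) * mttr_hours)


# Example: five 500,000-hour drives with a 24-hour rebuild window.
print(estimate_mtdl(500_000, 5, 24))
```

Note how strongly MTTR matters: halving the rebuild time doubles the estimated MTDL, which is why hot-swappable drives and background rebuilds pay off.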

Redundant Array of Inexpensive Disks (RAID)


Introduction

Low-cost, high-performance 3.5-inch hard-disk drives dominate storage systems for all practical purposes. But the storage and reliability requirements of high-end enterprise systems exceed what is available from single drives. That is where Redundant Array of Inexpensive Disks (RAID) systems become essential. RAID technology was developed to address the fault-tolerance and performance limitations of conventional disk storage, offering fault tolerance and higher throughput than a single hard drive or a group of independent hard drives.

RAID Systems Overview

The dropping cost and rising capacity of high-end 3.5-in. drives, plus the falling cost of RAID controllers, have brought RAID to low-end servers. Motherboards with integrated SCSI and ATA RAID controllers are readily available. RAID controllers support one or more standard RAID configurations, specified as RAID 0 through RAID 5. Proprietary or experimental configurations have been given designations above RAID 6, and some configurations, like RAID 10, are a combination of RAID 1 and RAID 0. All of the configurations except RAID 0 provide some form of redundancy. RAID controllers with redundancy often support hot-swappable drives, allowing the removal and replacement of a failed drive. Typically, information on a replacement drive must be rebuilt from the information on the other drives before the system is again redundant and able to handle another drive failure.

To understand how a RAID system works, see the figure below. The letters A through I indicate where data is placed on a disk, along with the parity and error-correction code (ECC) support. The diagrams show the minimum disk configuration for each approach, with the exception of RAID 0, which can operate with only two drives.
Also known as striping, RAID 0 uses any number of drives. Data is written sequentially by sector or block across the drives. This provides high read and write performance because many operations can be performed simultaneously, even when a sequential block of information is being processed. The downside is the lack of redundancy. Striping is commonly combined with other RAID architectures to provide both high performance and redundancy.

Called mirroring or duplexing, RAID 1 has 100% overhead, requiring two drives to store information normally stored on only one. Although writing speed is the same as with a single drive, it is possible to have twice the read transfer rate because information is duplicated and the drives can be accessed independently. Also, there is no loss of write performance when a drive is lost, and no rebuild delay as with other RAID architectures. RAID 1 is the most expensive approach when it comes to disk usage.

The RAID 2 configuration employs Hamming Code ECC to provide redundancy. It has a simple controller design, but no commercial implementations exist, partly due to the high ratio of ECC disks to data disks.

Striped data with the addition of a parity disk is used by the RAID 3 configuration. It has a high read/write transfer rate, but the controller design is complex and difficult to accomplish as software RAID. Also, the transaction rate of the system is the same as that of a single disk drive.

RAID 4 utilizes independent disks with shared parity. This configuration has a very high read transaction rate and aggregate read rate, and it has a low ratio of ECC disks to data disks. Unfortunately, it has a poor write transaction rate and a complex controller design. Most implementations prefer RAID 3 to RAID 4.

The RAID 5 configuration is a set of independent disks with distributed parity. Unlike RAID 3 and RAID 4, the contents of the parity disk are spread across all of the disks instead of being concentrated on a single disk. RAID 5 has high read performance with a medium write transaction rate. It has a good aggregate transfer rate, making it one of the most common approaches to RAID, even as a software RAID solution. This configuration, however, has one of the most complex controller designs, and the individual block transfer rate is the same as that of a single disk. Rebuilding a replacement drive is difficult and time consuming, but some controllers implement this feature so that it takes place in the background, with degraded application throughput until rebuilding has completed.

Essentially, RAID 6 is RAID 5 plus an extra parity drive. One parity is generated across the disks, while the second parity is computed independently. The two independent parities provide high fault tolerance, permitting multiple drive failures. Controller design is very complex, even compared to RAID 5, and RAID 6 has very poor write performance because two parity updates must be made for each write.

Many RAID controllers place disks on independent channels to allow simultaneous access to multiple drives. Likewise, large memory caches significantly improve controller performance.

RAID Levels

There are several different RAID levels or redundancy schemes, each with inherent cost, performance, and
availability (fault-tolerance) characteristics designed to meet different storage needs. No individual RAID level is
inherently superior to any other. Each of the five array architectures is well-suited for certain types of applications
and computing environments. For client/server applications, storage systems based on RAID levels 1, 0/1, and 5
have been the most widely used.

RAID Level 0 (Non-Redundant)

A non-redundant disk array, or RAID level 0, has the lowest cost of any RAID organization because it does not employ redundancy at all. This scheme offers the best write performance since it never needs to update redundant information, but it does not have the best read performance. Redundancy schemes that duplicate data, such as mirroring, can perform better on reads by selectively scheduling requests on the disk with the shortest expected seek and rotational delays. Without redundancy, any single disk failure will result in data loss. Non-redundant disk arrays are widely used in supercomputing environments where performance and capacity, rather than reliability, are the primary concerns.

Sequential blocks of data are written across multiple disks in stripes. The size of a data block, which is known as the stripe width, varies with the implementation, but is always at least as large as a disk's sector size. When it comes time to read back this sequential data, all disks can be read in parallel. In a multi-tasking operating system, there is a high probability that even non-sequential disk accesses will keep all of the disks working in parallel.

• Minimum number of drives: 2


• Strengths: Highest performance
• Weaknesses: No data protection; if one drive fails, all data is lost
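The striping scheme above can be sketched as a simple address calculation. The disk count and the round-robin mapping below are illustrative, not tied to any particular controller:

```python
NUM_DISKS = 4  # hypothetical array size

def raid0_map(logical_block: int) -> tuple:
    """Return (disk index, block offset on that disk) for a logical block."""
    disk = logical_block % NUM_DISKS      # stripes rotate across the disks
    offset = logical_block // NUM_DISKS   # position within each disk
    return disk, offset

# Sequential blocks 0..7 land on disks 0,1,2,3,0,1,2,3, so a large
# sequential read can be serviced by all four disks in parallel.
for block in range(8):
    print(block, raid0_map(block))
```

Because the mapping is pure arithmetic, a multi-tasking workload with many outstanding requests also tends to spread its accesses over all disks, as noted above.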

RAID Level 1 (Mirrored)

The traditional solution, called mirroring or shadowing, uses twice as many disks as a non-redundant disk array.
Whenever data is written to a disk the same data is also written to a redundant disk, so that there are always two
copies of the information.When data is read, it can be retrieved from the disk with the shorter queuing, seek and
rotational delays. If a disk fails, the other copy is used to service requests. Mirroring is frequently used in
database applications where availability and transaction time are more important than storage efficiency.

• Minimum number of drives: 2


• Strengths: Very high performance; Very high data protection; Very minimal penalty on write performance.
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is
required.

RAID Level 2 (Memory Style)

Memory systems have provided recovery from failed components with much less cost than mirroring by using
Hamming codes. Hamming codes contain parity for distinct overlapping subsets of components. In one version of
this scheme, four disks require three redundant disks, one less than mirroring. Since the number of redundant
disks is proportional to the log of the total number of the disks on the system, storage efficiency increases as the
number of data disks increases. If a single component fails, several of the parity components will have inconsistent values, and the failed component is the one held in common by each incorrect subset. The lost information is recovered by reading the other components in a subset, including the parity component, and setting the missing bit to 0 or 1 to create the proper parity value for that subset. Thus, multiple redundant disks are needed to identify the failed disk, but only one is needed to recover the lost information.

If you are unfamiliar with parity, you can think of the redundant disk as holding the sum of all data on the other disks. When a disk fails, you can subtract all the data on the good disks from the parity disk; the remaining information must be the missing information. Parity is simply this sum modulo 2.

A RAID 2 system would normally have as many data disks as the word size of the computer, typically 32. In addition, RAID 2 requires the use of extra disks to store an error-correcting code for redundancy. With 32 data disks, a RAID 2 system would require 7 additional disks for a Hamming-code ECC. For a number of reasons, including the fact that modern disk drives contain their own internal ECC, RAID 2 is not a practical disk array scheme.
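The disk counts quoted above follow from the Hamming bound: the number of check disks k must satisfy 2^k >= m + k + 1 for m data disks. A small sketch; the extra disk for double-error detection (SECDED) is an assumption used here to match the 32-data-disk figure:

```python
def hamming_check_disks(m: int, secded: bool = False) -> int:
    """Smallest k with 2**k >= m + k + 1, optionally plus one SECDED disk."""
    k = 1
    while 2 ** k < m + k + 1:
        k += 1
    return k + (1 if secded else 0)

print(hamming_check_disks(4))                # 3 check disks for 4 data disks
print(hamming_check_disks(32, secded=True))  # 7 disks for 32 data disks
```

Since k grows roughly with the logarithm of the number of disks, storage efficiency improves as the array gets larger, as stated above.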

• Minimum number of drives: Not used in LAN


• Strengths: Previously used for error correction in RAM (known as Hamming code) and in disk drives before the use of embedded error correction.
• Weaknesses: No practical use; Same performance can be achieved by RAID 3 at lower cost.

RAID Level 3 (Bit-Interleaved Parity)

One can improve upon memory-style ECC disk arrays by noting that, unlike memory component failures, disk
controllers can easily identify which disk has failed. Thus, one can use a single parity disk rather than a set of parity disks to recover lost information.

In a bit-interleaved parity disk array, data is conceptually interleaved bit-wise over the data disks, and a single parity disk is added to tolerate any single disk failure. Each read request accesses all data disks, and each write request accesses all data disks and the parity disk. Thus, only one request can be serviced at a time. Because the parity disk contains only parity and no data, it cannot participate in reads, resulting in slightly lower read performance than for redundancy schemes that distribute the parity and data over all disks. Bit-interleaved parity disk arrays are frequently used in applications that require high bandwidth but not high I/O rates. They are also simpler to implement than RAID levels 4, 5, and 6.

Here, the parity disk is written in the same way as the parity bit in normal Random Access Memory (RAM), where it is the Exclusive Or of the 8, 16 or 32 data bits. In RAM, parity is used to detect single-bit data errors, but it cannot correct them because there is no information available to determine which bit is incorrect. With disk drives, however, we rely on the disk controller to report a data read error. Knowing which disk's data is missing, we can reconstruct it as the Exclusive Or (XOR) of all remaining data disks plus the parity disk.

• Minimum number of drives: 3


• Strengths: Excellent performance for large, sequential data requests.
• Weaknesses: Not well-suited for transaction-oriented network applications; Single parity drive does not
support multiple, simultaneous read and write requests.
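The XOR reconstruction described above can be sketched in a few lines; the disk contents here are toy values:

```python
def xor_bytes(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

data = [b"\x0f\x12", b"\xa0\x33", b"\x5c\x01"]  # three data disks (toy blocks)
parity = xor_bytes(data)                        # contents of the parity disk

# Disk 1 fails; the controller knows which disk is gone and rebuilds it
# from the surviving data disks plus the parity disk.
recovered = xor_bytes([data[0], data[2], parity])
assert recovered == data[1]
```

The same XOR identity underlies RAID 4 and RAID 5 recovery as well; only the placement of the parity differs.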

RAID Level 4 (Block-Interleaved Parity)

The block-interleaved parity disk array is similar to the bit-interleaved parity disk array except that data is interleaved across disks in blocks of arbitrary size rather than in bits. The size of these blocks is called the striping unit.
Read requests smaller than the striping unit access only a single data disk. Write requests must update the
requested data blocks and must also compute and update the parity block. For large writes that touch blocks on all
disks, parity is easily computed by exclusive-or'ing the new data for each disk. For small write requests that
update only one data disk, parity is computed by noting how the new data differs from the old data and applying
those differences to the parity block. Small write requests thus require four disk I/Os: one to write the new data,
two to read the old data and old parity for computing the new parity, and one to write the new parity. This is
referred to as a read-modify-write procedure. Because a block-interleaved, parity disk array has only one parity
disk, which must be updated on all write operations, the parity disk can easily become a bottleneck. Because of
this limitation, the block-interleaved distributed parity disk array is universally preferred over the block-
interleaved, parity disk array.

• Minimum number of drives: 3 (Not widely used)


• Strengths: Data striping supports multiple simultaneous read requests.
• Weaknesses: Write requests suffer from same single parity-drive bottleneck as RAID 3; RAID 5 offers
equal data protection and better performance at same cost.
For small writes, the performance will decrease considerably. To understand the cause for this, a one-block write
will be used as an example.

• A write request for one block is issued by a program.


• The RAID software determines which disks contain the data and parity blocks, and which blocks they are in.
• The disk controller reads the data block from disk.
• The disk controller reads the corresponding parity block from disk.
• The data block just read is XORed with the parity block just read.
• The data block to be written is XORed with the parity block.
• The data block and the updated parity block are both written to disk.

It can be seen from the above example that a one block write will result in two blocks being read from disk and
two blocks being written to disk. If the data blocks to be read happen to be in a buffer in the RAID controller, the
amount of data read from disk could drop to one, or even zero blocks, thus improving the write performance.
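The parity shortcut behind the read-modify-write steps above is that the new parity equals the old parity XOR the old data XOR the new data, so only the target data disk and the parity disk are touched. A sketch with illustrative block contents:

```python
def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

old_data   = b"\x11\x22"   # block being overwritten
other_data = b"\x44\x88"   # block on the untouched data disk
old_parity = xor(old_data, other_data)

new_data = b"\x33\x55"
# Two reads (old data, old parity) and two writes (new data, new parity):
new_parity = xor(xor(old_parity, old_data), new_data)

# The shortcut agrees with recomputing parity over every data disk:
assert new_parity == xor(new_data, other_data)
```

This is why a one-block write costs four disk I/Os rather than touching the whole stripe.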

RAID Level 5 (Block-Interleaved Distributed Parity)

The block-interleaved distributed-parity disk array eliminates the parity disk bottleneck present in the block-
interleaved parity disk array by distributing the parity uniformly over all of the disks. An additional, frequently
overlooked advantage to distributing the parity is that it also distributes data over all of the disks rather than over
all but one. This allows all disks to participate in servicing read operations in contrast to redundancy schemes
with dedicated parity disks, in which the parity disk cannot participate in servicing read requests. Block-interleaved distributed-parity disk arrays have the best small read and large write performance of any redundant disk array. Small write requests are somewhat inefficient compared with redundancy schemes such as mirroring, however, due to the need to perform read-modify-write operations to update parity. This is the major performance weakness of RAID level 5 disk arrays.

The exact method used to distribute parity in block-interleaved distributed-parity disk arrays can affect performance. The following figure illustrates the left-symmetric parity distribution. Each square corresponds to a stripe unit, and each column of squares corresponds to a disk. P0 computes the parity over stripe units 0, 1, 2 and 3; P1 computes parity over stripe units 4, 5, 6 and 7; and so on.

A useful property of the left-symmetric parity distribution is that whenever you traverse the striping units sequentially, you will access each disk once before accessing any disk twice. This property reduces disk conflicts when servicing large requests.
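The left-symmetric placement just described can be sketched as a pair of mapping functions, assuming n disks with one parity unit per stripe; the layout function is an illustration of the scheme, not any vendor's implementation:

```python
N = 5  # hypothetical disk count

def parity_disk(stripe: int) -> int:
    """Parity rotates leftwards, one disk per stripe."""
    return (N - 1 - stripe) % N

def data_disk(logical_unit: int) -> int:
    """Disk holding a logical data unit under left-symmetric placement."""
    stripe = logical_unit // (N - 1)         # N-1 data units per stripe
    within = logical_unit % (N - 1)
    return (parity_disk(stripe) + 1 + within) % N

# Traversing units sequentially touches each disk once before any disk
# is touched twice -- the property noted above.
print([data_disk(u) for u in range(5)])
```

With N = 5 the first five sequential units land on disks 0, 1, 2, 3 and 4, so a large sequential request keeps every spindle busy.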

• Minimum number of drives: 3


• Strengths: Best cost/performance for transaction-oriented networks; Very high performance, very high
data protection; Supports multiple simultaneous reads and writes; Can also be optimized for large,
sequential requests.
• Weaknesses: Write performance is slower than RAID 0 or RAID 1.

RAID Level 6 (P+Q Redundancy)

Parity is a redundancy code capable of correcting any single, self-identifying failure. As larger disk arrays are considered, multiple failures become possible and stronger codes are needed. Moreover, when a disk fails in a parity-protected disk array, recovering the contents of the failed disk requires successfully reading the contents of all non-failed disks. The probability of encountering an uncorrectable read error during recovery can be significant. Thus, applications with more stringent reliability requirements require stronger error-correcting codes. One such scheme, called P+Q redundancy, uses Reed-Solomon codes to protect against up to two disk failures using the bare minimum of two redundant disks. P+Q redundant disk arrays are structurally very similar to block-interleaved distributed-parity disk arrays and operate in much the same manner. In particular, P+Q redundant disk arrays also perform small write operations using a read-modify-write procedure, except that instead of four disk accesses per write request, P+Q redundant disk arrays require six disk accesses due to the need to update both the `P' and `Q' information.

RAID Level 10 (Striped Mirrors)

RAID 10 is now used to mean the combination of RAID 0 (striping) and RAID 1 (mirroring). Disks are mirrored in pairs for redundancy and improved performance, and then data is striped across multiple disks for maximum performance. RAID 10 uses more disk space to provide redundant data than RAID 5. However, it also provides a performance advantage by reading from all disks in parallel while eliminating the write penalty of RAID 5. In addition, RAID 10 gives better performance than RAID 5 while a failed drive remains unreplaced. Under RAID 5, each attempted read of the failed drive can be performed only by reading all of the other disks. On RAID 10, a failed disk can be recovered by a single read of its mirrored pair.

• Minimum number of drives: 4


• Strengths: Highest performance, highest data protection (can tolerate multiple drive failures).
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is
required; Requires minimum of four drives.

Compound RAID Levels

There are times when more than one type of RAID must be combined in order to achieve the desired effect. In general, this consists of RAID-0 combined with another RAID level. The primary reason for combining multiple RAID architectures is to get either a very large, or a very fast, logical disk. The list below contains a few examples; it is not the limit of what can be done.

RAID-1+0

RAID Level 1+0 (also called RAID-10) is the result of RAID-0 applied to multiple RAID-1 arrays. This creates a very fast, stable array. In this array, it is possible to have multiple disk failures without losing any data, and with a minimum performance impact. To recover from a failed disk, it is necessary to replace the failed disk and rebuild that disk from its mirror. For two-drive failures, the probability of survival is 66% for a 4-disk array, and approaches 100% as the number of disks in the array increases.

RAID-0+1

RAID Level 0+1 is the result of RAID-1 applied to multiple RAID-0 arrays. This creates a very fast array. If the RAID-0 controllers (hardware or software) are capable of returning an error for data requests to failed drives, then this array has all the abilities of RAID-10. If an entire RAID-0 array is disabled when one drive fails, this becomes only slightly more reliable than RAID-0. To recover from a failed disk, it is necessary to replace the failed disk and rebuild the entire RAID-0 array from its mirror. This requires much more disk I/O than is required to recover from a disk failure in RAID-10. It should be noted that some enterprise-level RAID controllers are capable of tracking which drives in a RAID-0 array have failed, and rebuilding only that drive; these controllers are very expensive. For two-drive failures, the probability of survival is 33% for a 4-disk array, and approaches 50% as the number of disks in the array increases. This RAID level is significantly less reliable than RAID-1+0, because the structure is inherently less reliable in a multi-disk failure, combined with the longer time to reconstruct after a failure (due to a larger amount of data needing to be copied). The longer time increases the probability of a second disk failing before the first disk has been completely rebuilt.
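The survival figures quoted for both compound levels can be derived by counting which second failure is fatal. A sketch for an n-disk array (n even) mirrored in pairs:

```python
from fractions import Fraction

def raid10_survival(n: int) -> Fraction:
    """Chance a second random failure is survivable: it must miss the
    one disk that mirrors the already-failed drive."""
    return Fraction(n - 2, n - 1)

def raid01_survival(n: int) -> Fraction:
    """Chance a second random failure is survivable: it must land in the
    RAID-0 set already disabled by the first failure."""
    return Fraction(n // 2 - 1, n - 1)

print(raid10_survival(4), raid01_survival(4))        # 2/3 (~66%) vs 1/3 (~33%)
print(float(raid10_survival(100)), float(raid01_survival(100)))
```

As n grows, the RAID-1+0 figure tends to 1 while the RAID-0+1 figure tends to 1/2, matching the limits stated above.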

RAID-3+0

RAID Level 3+0 is the result of RAID-0 applied to multiple RAID-3 arrays. This will improve the performance
of a RAID-3 array, and allow multiple RAID-3 arrays to be dealt with as a single logical device. RAID-3+0 has
reliability similar to RAID-3, with improved performance. This type of array is most commonly found when
combining multiple hardware RAID devices into a single logical device.

RAID-5+0

RAID Level 5+0 (also called RAID-53 for some unknown reason) is the result of RAID-0 applied to multiple
RAID-5 arrays. This will improve the performance of a RAID-5 array, and allow multiple RAID-5 arrays to be
dealt with as a single logical device. The reliability of this type of array is similar to that of a RAID-1+0 array, but
it has the performance impacts of RAID-5. This type of array is most commonly found when combining multiple
hardware RAID devices into a single logical device.

Types of RAID Systems

There are three primary array implementations: software-based arrays, bus-based array adapters/controllers, and subsystem-based external array controllers. As with the various RAID levels, no one implementation is clearly better than another -- although software-based arrays are rapidly losing favor as high-performance, low-cost array adapters become increasingly available. Each array solution meets different server and network requirements, depending on the number of users, applications, and storage requirements. It is important to note that all RAID code is based on software. The difference among the solutions is where that software code is executed -- on the host CPU (software-based arrays) or offloaded to an on-board processor (bus-based and external array controllers).

Software based RAID

Primarily used with entry-level servers, software-based arrays rely on a standard host adapter and execute all I/O
commands and mathematically intensive RAID algorithms in the host server CPU. This can slow system
performance by increasing host PCI bus traffic, CPU utilization, and CPU interrupts. Some NOSs such as
NetWare and Windows NT include embedded RAID software. The chief advantage of this embedded RAID
software has been its lower cost compared to higher-priced RAID alternatives. However, this advantage is
disappearing with the advent of lower-cost, bus-based array adapters. The major advantages of software-based RAID are its low cost and the fact that it requires only a standard controller.

Hardware based RAID

Unlike software-based arrays, bus-based array adapters/controllers plug into a host bus slot (typically a 133
MByte (MB)/sec PCI bus) and offload some or all of the I/O commands and RAID operations to one or more
secondary processors. Originally used only with mid- to high-end servers due to cost, lower-cost bus-based array
adapters are now available specifically for entry-level server network applications. In addition to offering the
fault-tolerant benefits of RAID, bus-based array adapters/controllers perform connectivity functions that are
similar to standard host adapters. By residing directly on a host PCI bus, they provide the highest performance of
all array types. Bus-based arrays also deliver more robust fault tolerant features than embedded NOS RAID
software.

The advantages are the data protection and performance benefits of RAID, along with more robust fault-tolerant features and increased performance compared to software-based RAID.
External Hardware RAID Card

Intelligent external array controllers bridge between one or more server I/O interfaces and single or multiple device channels. These controllers feature an onboard microprocessor, which provides high performance and
handles functions such as executing RAID software code and supporting data caching. External array controllers
offer complete operating system independence, the highest availability, and the ability to scale storage to
extraordinarily large capacities (up to a terabyte and beyond). These controllers are usually installed in networks
of stand-alone Intel-based and UNIX-based servers as well as clustered server environments. The advantages are operating system independence and the ability to build super-high-capacity storage systems for high-end servers.

RAID Performance Modeling

As RAID architectures are being developed, their performance is analyzed using either simulation or analytical
methods. Simulators are generally handcrafted, which entails code duplication and potentially unreliable systems.
Equally problematic is that RAID architectures, once defined, are difficult to prove correct without extensive
simulations.

Analytical Modeling

Analytical models of RAID operations generally utilize drive characteristics to model typical operations. Key
factors such as Seek Time (time to move to a specified track), Rotational Latency (time to rotate disk to required
sector) and Transfer Time (time to read or write to disk) are used to derive mathematical expressions of the time
taken for different operations. Such expressions are usually independent of any particular hardware, and are
derived from analysis of typical actions performed during a given RAID operation. Most analytical models of
RAID are based on other models of the underlying disks driving the array. Using these drive characteristics,
queuing models can be introduced to model the arrival of tasks at each disk in a RAID, as such tasks are handed
out by the RAID controller. Combining these queued disks with arrival rates for a single disk in normal operation
provides a starting point from which more complicated RAID systems can be modeled.
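A minimal sketch of such an analytical model, combining seek time, rotational latency and transfer time into an expected per-request service time; the drive parameters below are hypothetical, not taken from any real disk:

```python
AVG_SEEK_MS   = 8.0     # hypothetical average seek time
RPM           = 7200    # hypothetical spindle speed
TRANSFER_MB_S = 80.0    # hypothetical sustained media transfer rate

def expected_service_ms(request_kb: float) -> float:
    """Expected time to service one request: seek + rotation + transfer."""
    rotational_ms = 0.5 * 60_000 / RPM                      # half a revolution on average
    transfer_ms = request_kb / 1024 / TRANSFER_MB_S * 1000  # media transfer
    return AVG_SEEK_MS + rotational_ms + transfer_ms

# Small random requests are dominated by seek and rotational latency;
# large requests are dominated by transfer time.
print(round(expected_service_ms(4), 2))     # 4 KB request
print(round(expected_service_ms(1024), 2))  # 1 MB request
```

Per-disk expressions like this become the service-time distribution fed into the queuing model for each disk in the array.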

Simulation

The most notable simulation tool for RAID systems is the previously mentioned RAIDframe system. The system offers multiple ways to test any RAID architecture using its generic DAG representation. The simulator is able to use utilization data gathered from real-world systems as input, as well as accepting real requests from outside systems. It can also be used to generate a Linux software driver to test an architecture in a real-world system. One downside to this approach is that, since all RAID operations are performed in software, the controller software can easily become a bottleneck (up to 60% of total processing due to software) and hence skew the results. Another simulation tool presented in the literature (and available for academic use) is the RAID-Sims application. It is built on top of a simulator for single disks, known as DiskSim, developed at the University of Michigan. Unlike RAIDframe, RAID-Sims is labeled as very difficult to use, and has no generic interface. One other tool the authors are aware of is Sims-SAN, an environment for simulating and analyzing Storage Area Networks in which RAID usually serves as the underlying storage.

Parity & Fault Tolerance

The concept behind RAID is relatively simple. The fundamental premise is to be able to recover data on-line in
the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity is an addition
of all the drives used in an array. Recovery from a drive failure is achieved by reading the remaining good data
and checking it against parity data stored by the array. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does
not use parity because all data is completely duplicated (mirrored). RAID 0, used only to increase performance,
offers no data redundancy at all. RAID technology does not prevent drive failures. However, RAID does provide
insurance against disk drive failures by enabling real-time data recovery without data loss. The fault tolerance of
arrays can also be significantly enhanced by choosing the right storage enclosure. Enclosures that feature
redundant, hot-swappable drives, power supplies, and fans can greatly increase storage subsystem uptime based
on a number of widely accepted measures:

Mean Time to Data Loss (MTDL). The average time before the failure of an array component causes data to be
lost or corrupted.

Mean Time between Data Access / Availability (MTDA). The average time before non-redundant components
fail, causing data inaccessibility without loss or corruption.

Mean Time To Repair (MTTR). The average time required to bring an array storage subsystem back to full fault
tolerance.

Mean Time Between Failure (MTBF). Used to measure computer component average reliability/life
expectancy. MTBF is not as well-suited for measuring the reliability of array storage systems as MTDL, MTTR
or MTDA because it does not account for an array's ability to recover from a drive failure. In addition, enhanced
enclosure environments used with arrays to increase uptime can further limit the applicability of MTBF ratings
for array solutions.

Our Implementation

In one of our driver projects related to Storage Area Networks (SAN), we have implemented the RAID-01 &
RAID-10. The RAID-01 (or RAID 0+1) is a mirrored pair (RAID-1) made from two stripe sets (RAID-0); hence
the name RAID 0+1, because it is created by first creating two RAID-0 sets and adding RAID-1. If you lose a
drive on one side of a RAID-01 array, then lose another drive on the other side of that array before the first side is
recovered, you will suffer complete data loss. It is also important to note that all drives in the surviving mirror are
involved in rebuilding the entire damaged stripe set, even if only a single drive was damaged. Performance is severely degraded during recovery unless the RAID subsystem allows adjusting the priority of recovery. However, shifting the priority toward production will lengthen recovery time and increase the risk of the kind of catastrophic data loss mentioned earlier.

RAID-10 (or RAID 1+0) is a stripe set made up from N
mirrored pairs. Only the loss of both drives in the same-mirrored pair can result in any data loss and the loss of
that particular drive is 1/Nth as likely as the loss of some drive on the opposite mirror in RAID-01. Recovery only
involves the replacement drive and its mirror so the rest of the array performs at 100% capacity during recovery.
Also, since only the single drive needs recovery, bandwidth requirements during recovery are lower and recovery takes far less time, reducing the risk of catastrophic data loss. The most appropriate RAID configuration for a
specific file system or database table space must be determined based on data access patterns and cost versus
performance tradeoffs. RAID-0 offers no increased reliability. It can, however, supply performance acceleration at
no increased storage cost. RAID-1 provides the highest performance for redundant storage, because it does not require read-modify-write cycles to update data, and because multiple copies of data may be used to accelerate read-intensive applications. Unfortunately, RAID-1 requires at least double the disk capacity of RAID-0. Also, if more than two copies of the data exist, RAID-1 arrays may be constructed to endure the loss of multiple disks without interruption. Parity RAID allows redundancy at a lower total storage cost. The read-modify-write it requires, however, will reduce total throughput for small write operations (read-only or extremely read-intensive applications are fine). The loss of a single disk will cause read performance to be degraded while the system reads all other disks in the array and re-computes the missing data. Additionally, it does not support losing multiple disks, and cannot be made more redundant.

The 7 Layers of the OSI Model


The OSI, or Open System Interconnection, model defines a networking framework for implementing protocols in
seven layers. Control is passed from one layer to the next, starting at the application layer in one station,
proceeding to the bottom layer, over the channel to the next station and back up the hierarchy.

Application (Layer 7): This layer supports application and end-user processes. Communication partners are identified, quality of service is identified, user authentication and privacy are considered, and any constraints on data syntax are identified. Everything at this layer is application-specific. This layer provides application services for file transfers, e-mail, and other network software services. Telnet and FTP are applications that exist entirely in the application level. Tiered application architectures are part of this layer.

Presentation (Layer 6): This layer provides independence from differences in data representation (e.g., encryption) by translating from application to network format, and vice versa. The presentation layer works to transform data into the form that the application layer can accept. This layer formats and encrypts data to be sent across a network, providing freedom from compatibility problems. It is sometimes called the syntax layer.

Session (Layer 5): This layer establishes, manages and terminates connections between applications. The session layer sets up, coordinates, and terminates conversations, exchanges, and dialogues between the applications at each end. It deals with session and connection coordination.

Transport (Layer 4): This layer provides transparent transfer of data between end systems, or hosts, and is responsible for end-to-end error recovery and flow control. It ensures complete data transfer.

Network (Layer 3): This layer provides switching and routing technologies, creating logical paths, known as virtual circuits, for transmitting data from node to node. Routing and forwarding are functions of this layer, as well as addressing, internetworking, error handling, congestion control and packet sequencing.

Data Link (Layer 2): At this layer, data packets are encoded and decoded into bits. It furnishes transmission protocol knowledge and management and handles errors in the physical layer, flow control and frame synchronization. The data link layer is divided into two sublayers: the Media Access Control (MAC) layer and the Logical Link Control (LLC) layer. The MAC sublayer controls how a computer on the network gains access to the data and permission to transmit it. The LLC layer controls frame synchronization, flow control and error checking.

Physical (Layer 1): This layer conveys the bit stream -- electrical impulse, light or radio signal -- through the network at the electrical and mechanical level. It provides the hardware means of sending and receiving data on a carrier, including defining cables, cards and physical aspects. Fast Ethernet, RS232, and ATM are protocols with physical layer components.

http://www.geocities.com/SiliconValley/Monitor/3131/ne/osimodel.html

Application (User Interface)
• used for applications specifically written to run over the network
• allows access to network services that support applications
• directly represents the services that directly support user applications
• handles network access, flow control and error recovery
• Example apps are file transfer, e-mail, NetBIOS-based applications
Protocols: DNS; FTP; TFTP; BOOTP; SNMP; RLOGIN; SMTP; MIME; NFS; FINGER; TELNET; NCP; APPC; AFP; SMB
Components: Gateway

Presentation (Translation)
• translates from application to network format and vice-versa
• all different formats from all sources are made into a common uniform format that the rest of the OSI model can understand
• responsible for protocol conversion, character conversion, data encryption / decryption, expanding graphics commands, data compression
• sets standards for different systems to provide seamless communication from multiple protocol stacks
• not always implemented in a network protocol
Components: Gateway; Redirector

Session ("syncs and sessions")
• establishes, maintains and ends sessions across the network
• responsible for name recognition (identification) so only the designated parties can participate in the session
• provides synchronization services by planning check points in the data stream => if a session fails, only data after the most recent checkpoint need be transmitted
• manages who can transmit data at a certain time and for how long
• Examples are interactive login and file transfer connections; the session would connect and re-connect if there was an interruption; recognize names in sessions and register names in history
Protocols: NetBIOS; Named Pipes; Mail Slots; RPC
Components: Gateway

Transport (packets; flow control & error-handling)
• additional connection below the session layer
• manages the flow control of data between parties across the network
• divides streams of data into chunks or packets; the transport layer of the receiving computer reassembles the message from packets
• "train" is a good analogy => the data is divided into identical units
• provides error-checking to guarantee error-free data delivery, with no losses or duplications
• provides acknowledgment of successful transmissions; requests retransmission if some packets don't arrive error-free
• provides flow control and error-handling
Protocols: TCP; ARP; RARP; SPX; NWLink; NetBIOS / NetBEUI; ATP
Components: Gateway; Advanced Cable Tester; Brouter

Network (addressing; routing)
• translates logical network addresses and names to their physical address (e.g. computername ==> MAC address)
• responsible for
  o addressing
  o determining routes for sending
  o managing network problems such as packet switching, data congestion and routing
• if a router can't send a data frame as large as the source computer sends, the network layer compensates by breaking the data into smaller units; at the receiving end, the network layer reassembles the data
• think of this layer stamping the addresses on each train car
Protocols: IP; ARP; RARP; ICMP; RIP; OSPF; IGMP; IPX; NWLink; NetBEUI; DDP; DECnet
Components: Brouter; Router; Frame Relay Device; ATM Switch; Advanced Cable Tester

Data Link (data frames to bits)
• turns packets into raw bits 100101 and at the receiving end turns bits into packets
• handles data frames between the Network and Physical layers
• the receiving end packages raw data from the Physical layer into data frames for delivery to the Network layer
• responsible for error-free transfer of frames to the other computer via the Physical layer
• this layer defines the methods used to transmit and receive data on the network: the wiring, the devices used to connect the NIC to the wiring, the signaling involved to transmit / receive data and the ability to detect signaling errors on the network media
Protocols:
  o Logical Link Control: error correction and flow control; manages link control and defines SAPs
  o Media Access Control: communicates with the adapter card; controls the type of media being used
  o 802.1 OSI Model
  o 802.2 Logical Link Control
  o 802.3 CSMA/CD (Ethernet)
  o 802.4 Token Bus (ARCnet)
  o 802.5 Token Ring
  o 802.12 Demand Priority
Components: Bridge; Switch; ISDN Router; Intelligent Hub; NIC; Advanced Cable Tester

Physical (hardware; raw bit stream)
• transmits the raw bit stream over the physical cable
• defines cables, cards, and physical aspects
• defines NIC attachments to hardware, how cable is attached to the NIC
• defines techniques to transfer the bit stream to the cable
Protocols: IEEE 802; IEEE 802.2; ISO 2110; ISDN
Components: Repeater; Multiplexer; Hubs (passive, active); TDR; Oscilloscope; Amplifier

Network Orientation
Peer to Peer Networks

• No dedicated server or hierarchy, also called a workgroup.


• Usually 10 or fewer workstations.
• Users act as their own administrator and security.
• Computers are in same general area.
• Limited growth.

Server Based Networks

• 10 or more users.
• Employs specialized servers.
1. File and Print
2. Application
3. Mail
4. Fax
5. Communications (gateways)
• Central administration.
• Greater security.
• Centralized backup.
• Data Redundancy.
• Supports many users

Combination Networks

• Combines the features of both Peer to Peer and Server based networks
• Users can share resources among themselves as well as access server-based resources.

Network Topologies
There are 4 basic topologies with variations

Bus Topology

• Bus consists of a single linear cable called a trunk.


• Data is sent to all computers on the trunk. Each computer examines EVERY packet on the wire to
determine who the packet is for and accepts only messages addressed to them.
• Bus is a passive topology.
• Performance degrades as more computers are added to the bus.
• Signal bounce is eliminated by a terminator at each end of the bus.
• Barrel connectors can be used to lengthen cable.
• Repeaters can be used to regenerate signals.
• Usually uses Thinnet or Thicknet
o both of these require 50 ohm terminator
• good for a temporary, small (fewer than 10 people) network
• But it's difficult to isolate malfunctions, and if the backbone goes down, the entire network goes down.
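The delivery rule above (every computer examines every packet and keeps only its own) can be sketched as follows; the station class, MAC addresses, and frame layout are illustrative, not from the text:

```python
# Sketch of bus-topology delivery: every frame reaches every station,
# and each station keeps only frames addressed to it (or broadcast).

BROADCAST = "FF:FF:FF:FF:FF:FF"

class Station:
    def __init__(self, mac):
        self.mac = mac
        self.inbox = []

    def receive(self, frame):
        # A station accepts a frame only if it is the addressee
        # (or the frame is a broadcast); everything else is ignored.
        if frame["dst"] in (self.mac, BROADCAST):
            self.inbox.append(frame)

def send_on_bus(stations, src, dst, payload):
    # On a bus, the signal propagates along the whole trunk:
    # every attached station examines the frame.
    frame = {"src": src, "dst": dst, "data": payload}
    for station in stations:
        station.receive(frame)

stations = [Station(f"00:00:00:00:00:0{i}") for i in range(1, 4)]
send_on_bus(stations, src=stations[0].mac,
            dst=stations[2].mac, payload="hello")
print([len(s.inbox) for s in stations])   # [0, 0, 1]
```

This is also why bus performance degrades as computers are added: every station spends time examining traffic that is not for it.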

Star Topology

• Computers are connected by cable segments to a centralized hub.


• Signal travels through the hub to all other computers.
• Requires more cable.
• If hub goes down, entire network goes down.
• If a computer goes down, the network functions normally.
• most scalable and reconfigurable of all topologies

Ring Topology

• Computers are connected on a single circle of cable.


• usually seen in a Token Ring or FDDI (fiber optic) network
• Each computer acts as a repeater and keeps the signal strong => no need for repeaters on a ring topology
• No termination required => because it's a ring

• Token passing is used in Token Ring networks. The token is passed from one computer to the next, only
the computer with the token can transmit. The receiving computer strips the data from the token and sends
the token back to the sending computer with an acknowledgment. After verification, the token is
regenerated.
• relatively easy to install, requiring minimal hardware
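The token-passing sequence described above (only the token holder transmits; the receiver marks the frame; the sender strips it and regenerates the token) can be sketched like this; the station names and frame fields are illustrative:

```python
# Minimal sketch of token passing on a ring. Only the station holding
# the token may transmit; the frame circulates the whole ring and is
# stripped by the sender, which then regenerates the token.

def token_ring_transmit(ring, sender, receiver, data):
    """Pass the token around `ring` until it reaches `sender`, let it
    transmit, and report (acknowledged, token_hops)."""
    hops = 0
    holder = 0                       # index of the station holding the token
    while ring[holder] != sender:    # pass the token until sender holds it
        holder = (holder + 1) % len(ring)
        hops += 1
    # Sender transmits; the frame circulates to the receiver, which
    # copies the data and sets an acknowledgment bit; the frame then
    # returns to the sender, which strips it off the ring.
    frame = {"dst": receiver, "data": data, "ack": False}
    idx = holder
    while True:
        idx = (idx + 1) % len(ring)
        if ring[idx] == receiver:
            frame["ack"] = True      # receiver acknowledges
        if idx == holder:
            break                    # frame is back at the sender
    return frame["ack"], hops

ring = ["A", "B", "C", "D"]
ack, hops = token_ring_transmit(ring, sender="C", receiver="A", data="hi")
print(ack, hops)   # True 2
```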

Mesh

• The mesh topology connects each computer on the network to the others
• Meshes use a significantly larger amount of network cabling than do the other network topologies, which
makes it more expensive.
• The mesh topology is highly fault tolerant.
o Every computer has multiple possible connection paths to the other computers on the network, so
a single cable break will not stop network communications between any two computers.
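The cabling cost noted above can be made concrete: a full mesh needs a dedicated link between every pair of computers, i.e. n*(n-1)/2 links for n nodes. A quick sketch:

```python
# Link count for a full mesh: each of n computers connects to the
# other n-1, and each link is shared by two endpoints.

def mesh_links(n: int) -> int:
    return n * (n - 1) // 2

for n in (4, 10, 50):
    print(n, "computers ->", mesh_links(n), "links")
# 4 computers -> 6 links
# 10 computers -> 45 links
# 50 computers -> 1225 links
```

The quadratic growth is why the hybrid mesh below meshes only the servers rather than every workstation.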

Star Bus Topology

• Several star topologies linked with a linear bus.


• No single computer can take the whole network down. If a single hub fails, only the computers and hubs
connected to that hub are affected.

Star Ring Topology

• Also known as star wired ring because the hub itself is wired as a ring. This means it's a physical star,
but a logical ring.
• This topology is popular for Token Ring networks because it is easier to implement than a physical ring,
but it still provides the token passing capabilities of a physical ring inside the hub.
• Just like in the ring topology, computers are given equal access to the network media through the passing of the token.
• A single computer failure cannot stop the entire network, but if the hub fails, the ring that the hub controls
also fails.

Hybrid Mesh

• most important aspect is that a mesh is fault tolerant


• a true mesh is expensive because of all the wire needed
• another option is to mesh only the servers that contain information that everyone has to get to. This way
the servers (not all the workstations) have fault tolerance at the cabling level.

Connecting Network Components


Primary Cable Types
• Coaxial Cable
• Twisted-pair
o UTP - Unshielded Twisted Pair
o STP - Shielded Twisted Pair
• Fiber-optic

Coaxial Cable
• Consists of a solid or stranded copper core surrounded by insulation, a braided shield and an insulating
jacket.

• Braided shield prevents noise and crosstalk.


• More resistant to interference and attenuation than twisted pair cabling.
• Both thin and thick cables can use (see pp. 80-81 for pics)
o BNC cable connectors,
o BNC barrel connectors
o BNC T connectors
o BNC terminators.
• Plenum (fire resistant) graded cable can be used in false ceilings of office space or under the floor.
• Can transmit data, voice and video.
• Offers moderate security ----> better than UTP/STP

Thinnet - RG-58 cable


• 0.25" thick.
• Uses
o BNC twist connector,
o BNC barrel connectors
o BNC T connectors
o 50 ohm terminators
• Can carry signals 185 meters or 607 feet.
• Types: (pics on page 78)

Coaxial Cable Types

RG-8 and RG-11: Thicknet (50 ohms)

RG-58 family:
  RG-58 /U: solid copper core (50 ohms)
  RG-58 A/U: Thinnet, stranded copper (50 ohms)
  RG-58 C/U: Thinnet, military grade (50 ohms)

RG-59: broadband / cable TV video cable (75 ohms)

RG-62 A/U: ARCnet cable (93 ohms); RG-62 A/U is the standard ARCnet cable, but
ARCnet can also use fiber optic or twisted pair.

• each cable must have a terminator whose impedance matches the cable type
• impedance = current resistance measured in ohms
• terminators are resistors that prevent signal bounce or echo.

Here are some limitations of 10Base2 Ethernet:

• Length of trunk segment may be up to 607 feet.


• A maximum of 30 workstations is allowed per trunk.
• There may be no more than 1024 workstations per network.
• Entire network trunk length can't exceed 3035 feet (925 meters)
• The minimum cable length between workstations is 20 inches.
• The Ethernet 5-4-3 Rule for connecting segments is 5 trunk segments can be connected, with 4 repeaters
or concentrators, with no more than 3 populated segments (on coaxial cable).
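The limits above can be checked mechanically. A sketch of such a validator follows; the function name, dictionary layout, and message strings are illustrative, not from any standard tool:

```python
# Check a proposed 10Base2 layout against the limits listed above.

LIMITS = {
    "max_segment_ft": 607,       # trunk segment length
    "max_nodes_per_trunk": 30,
    "max_network_ft": 3035,      # total trunk length
    "max_segments": 5,           # the 5-4-3 rule: 5 segments,
    "max_repeaters": 4,          # 4 repeaters,
    "max_populated": 3,          # 3 populated segments
}

def check_10base2(segments, repeaters):
    """segments: list of (length_ft, node_count) tuples."""
    errors = []
    if len(segments) > LIMITS["max_segments"]:
        errors.append("more than 5 trunk segments")
    if repeaters > LIMITS["max_repeaters"]:
        errors.append("more than 4 repeaters")
    if sum(1 for _, nodes in segments if nodes > 0) > LIMITS["max_populated"]:
        errors.append("more than 3 populated segments")
    if sum(length for length, _ in segments) > LIMITS["max_network_ft"]:
        errors.append("total trunk exceeds 3035 ft")
    for length, nodes in segments:
        if length > LIMITS["max_segment_ft"]:
            errors.append(f"segment of {length} ft exceeds 607 ft")
        if nodes > LIMITS["max_nodes_per_trunk"]:
            errors.append(f"{nodes} nodes exceeds 30 per trunk")
    return errors

# 5 segments, 4 repeaters, 3 populated: legal under the 5-4-3 rule
layout = [(600, 25), (600, 20), (600, 10), (600, 0), (600, 0)]
print(check_10base2(layout, repeaters=4))   # []
```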

Thicknet - RG-8 and RG-11 coaxial cable


• 0.5" thick
• used for 10Base5 networks, linear bus topology
• transmits at 10 Mbps
• Uses DIX or AUI (Attachment Unit Interface) connector - also known as DB-15 connector to connect to
external transceivers.
• Vampire taps are used to attach a transceiver to the thicknet trunk.
• Can carry signals 500 meters or 1640 feet.
• much less flexible and far more bulky and harder to install than thinnet
• better security than thinnet
• better resistance to electrical interference than thinnet.
• MORE expensive than thinnet.

Twisted-Pair Cable
• Consists of two insulated copper wires twisted around each other.
• Twisting cancels out electrical noise from adjacent pairs (crosstalk) and external sources.
• Uses RJ-45 telephone-style connectors (larger than a telephone connector, with eight wires vs. the
telephone's four).
• Generally inexpensive.
• Easy to install.

Unshielded Twisted Pair (UTP)


• Maximum cable length is 100 meters or 328 feet (10BaseT).
• Types:
1. Cat 1 Voice grade telephone cable.
2. Cat 2 Data grade up to 4 Mbps, four twisted pairs.

Category 3 and above is needed for Ethernet networks. Cat 3, 4, and 5 use RJ-45 connectors
3. Cat 3 Data grade up to 10 Mbps, four pairs w/3 twists/ft.
4. Cat 4 Data grade up to 16 Mbps, four twisted pairs.
5. Cat 5 Data grade up to 100 Mbps, four twisted pairs.

This is the cheapest cable to put in. Exam questions ALWAYS take this as a given.

Here are some limitations of 10BaseT Ethernet:

• Workstations may be no more than 328 feet from the concentrator port.
• 1,023 stations are allowed on a segment without bridging.
• The minimum cable length between workstations is 8 feet.

Other Drawbacks

• UTP is particularly susceptible to crosstalk, which is when signals from one line get mixed up with
signals from another.
• easily tapped (because there is no shielding)
• 100 meters is the shortest maximum run of the common cable types => attenuation is the biggest problem here.

Shielded Twisted Pair (STP)


• Uses a woven copper braid jacket and a higher quality protective jacket. Also uses foil wrap between and
around the wire pairs.
• Much less susceptible to interference and supports higher transmission rates than UTP.
• Shielding makes it somewhat harder to install.
• same 100 meter limit as UTP.
• harder to tap
• used in AppleTalk and Token Ring networks

Fiber Optic Cable


• Consists of a small core of glass or plastic surrounded by a cladding layer and jacket.
• Fibers are unidirectional (light only travels in one direction) so two fibers are used, one for sending and
one for receiving. Kevlar fibers are placed between the two fibers for strength.
• Good for very high speed, long distance data transmission.
• NOT subject to electrical interference.
• Cable can't be tapped and data stolen => high security.
• Most expensive and difficult to work with.
• can transmit at 100 Mbps and way up to 2 Gbps
• up to 2000 meters without a repeater.
• Supports data, voice and video.
• needs specialized knowledge to install => expensive all round.

Cable Type Comparisons


• 10BaseT: 10 Mbps; 100 meters; easy installation; highly susceptible to interference; least expensive; 1 computer per segment.
• 100BaseT: 100 Mbps; 100 meters; easy installation; highly susceptible to interference; more expensive than 10BaseT.
• STP: 16 to 155 Mbps; 100 meters; moderately easy installation; somewhat resistant to interference; more expensive than Thinnet or UTP.
• 10Base2 (Thinnet): 10 Mbps; 185 meters; medium difficulty; somewhat resistant to interference; inexpensive; 30 nodes per segment; 1024 nodes per network.
• 10Base5 (Thicknet): 10 Mbps; 500 meters; more difficult to install than Thinnet; more resistant to interference than most cable; more expensive than most cable; 100 nodes per segment; 300 nodes per network.
• Fiber Optic: 100 Mbps up to 2 Gbps; 2000 meters; most difficult to install; not susceptible to electronic interference; most expensive type of cable.

Signal Transmission

Baseband Transmission -- Digital

• Baseband transmission uses digital signaling over a single frequency.


• Entire communication channel is used to transmit a single signal.
• Flow is bi-directional. Some can transmit and receive at the same time.
• Baseband systems use repeaters to strengthen attenuated signals.

Broadband Transmission -- Analog

• Broadband uses analog signaling over a range of frequencies.


• Signals are continuous and non-discrete.
• Flow is uni-directional and so two frequency channels or two separate cables must be used.
o if enough bandwidth is available, multiple analog transmission systems such as cable TV AND
network transmissions can be on the same cable at the same time.
o if this is the case, ALL devices must be tuned to use only certain frequencies
• Uses amplifiers for signal regeneration.

Helpful mnemonic to remember the difference:

Baseband is "BEDR"

Bidirectional
Entire channel taken up
Digital
Repeaters used to strengthen signal

IBM Cabling

• Uses AWG standard wire size.


• Connected with proprietary IBM unisex connectors.
• Defines cables as types
Type 1 (STP, shielded twisted-pair)
• used for computers and MAUs
• 101 m
• 16 Mbps; 260 computer limit

Type 2 (STP)
• voice and data
• 100 m

Type 3 (UTP, voice grade)
• most common Token Ring cable
• 45 m
• 4 Mbps; 72 computer limit

(These three cable types can be used in Token Ring networks.)

Type 5 (Fiber-optic)
• industry standard
• used to connect MSAUs together

Type 6 (STP, data patch)
• used to extend Type 3 cables from one computer to the MSAU
• limited to 1/2 the distance of Type 1 cable

Type 8 (STP flat, carpet grade)
• used under floors

Type 9 (STP, plenum grade)
• used in ceiling space

Important Cabling Considerations


Installation Logistics

• How easy is the cable to work with?

Shielding

• Is the area "noisy"?


• Do you need plenum grade cable => more expensive

Crosstalk

• Where data security is important this is a problem


• Power lines, motors relays and radio transmitters cause crosstalk

Transmission Speed (part of the bandwidth)

• Transmission rates are measured in Mbps


• 10 Mbps is common
• 100 Mbps is becoming common
• Fiber can go well over 100 Mbps but costs and requires experts to install.

Cost

• Distance costs you money

Attenuation
• Different cables can only transmit so far without causing too many errors

Wireless Local Area Networks


• Used where cable isn't possible - remote sites; also when mobility is important.
• Use transceivers or access points to send and receive signals between the wired and wireless network.

There are 4 techniques for transmitting data

• Infrared transmission consists of four types;


1. Line of sight
2. Scatter: good within 100 ft.
3. Reflective
4. Broadband optical telepoint: used for multimedia requirements; as good as cable.
• Laser requires direct line-of-sight.
• Narrow-band (single frequency) radio
o Cannot go through steel or load-bearing walls.
o Requires a service handler.
o Limited to 4.8 Mbps
• Spread-spectrum radio
o Signals over a range of frequencies.
o Uses hop timing for a predetermined length of time.
o Coded for data protection.
o Quite slow; limited to 250 Kbps.

Point to Point Transmission

• Transfers data directly from PC to PC (NOT through cable or other peripherals)


• Uses a point to point link for fast error-free transmission.
• Penetrates objects.
• Supports data rates from 1.2 to 38.4 Kbps up to
o 200 feet indoors or
o 1/3 of a mile with line-of-sight transmission.
• Also communicates with printers, bar code readers, etc.

Multipoint Wireless Bridge

• Provides a data path between two buildings.


• Uses spread-spectrum radio to create a wireless backbone up to three miles.

Long-Range Wireless Bridge

• Uses spread-spectrum technology to provide Ethernet and Token-Ring bridging for up to 25 miles.
• This costs less than T1, but T1 will transmit at 1.544 Mbps
Mobile Computing

• Uses wireless public carriers to transmit and receive using;


o Packet-radio communication.
o Uplinked to satellite, broadcast only to device which has correct address.
o Cellular networks.
o CDPD same as phone, subsecond delays only, real time transmission, can tie into cabled network.
o Satellite stations.
o Microwave, most common in USA, 2 X directional antennas, building to building, building to
satellite
• Slow transmission rate: 8 Kbps - 19.2 Kbps

Network Adapter Cards


The role of the network adapter card is to:

• Prepare data from the computer for the network cable


• Send the data to another computer
• Control the flow of data between the computer and the cabling system

NICs contain hardware and firmware (software routines in ROM) programming that implements the

• Logical Link Control and


• Media Access Control

functions of the Data Link layer of the OSI

Preparing Data

• data moves along paths in the computer called a BUS - can be 8, 16, 32 bits wide.
• on network cable, data must travel in a single bit stream in what's called a serial transmission (because
one bit follows the next).
• The transceiver is the component responsible for translating parallel (8, 16, 32-bit wide) into a 1 bit wide
serial path.
• A unique network address or MAC address is coded into chips in the card
• card uses DMA (Direct Memory Access) where the computer assigns memory space to the NIC
o if the card can't move data fast enough, the card's buffer RAM holds it temporarily during
transmission or reception of data
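The transceiver's parallel-to-serial conversion can be sketched in a few lines; the functions here are illustrative (real hardware does this with a shift register, and Ethernet actually transmits least significant bit first):

```python
# Sketch of the transceiver's job: bytes arrive over the bus in
# parallel (8, 16, or 32 bits at a time) and must leave the NIC as a
# single serial bit stream, one bit after the next.

def to_serial(data: bytes) -> str:
    # most significant bit first, for readability
    return "".join(format(byte, "08b") for byte in data)

def from_serial(bits: str) -> bytes:
    # the receiving NIC regroups the stream into bytes
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

stream = to_serial(b"OK")
print(stream)               # 0100111101001011
print(from_serial(stream))  # b'OK'
```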

Sending and Controlling Data

The NICs of the two computers exchanging data agree on the following:

1. Maximum size of the groups of data being sent


2. The amount of data to be sent before confirmation
3. The time intervals between send data chunks
4. The amount of time to wait before confirmation is sent
5. How much data each card can hold before it overflows
6. The speed of the data transmission
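The six items above are the parameters both cards must settle on before data flows. A sketch of such a negotiation follows; the field names, units, and min/max rules are illustrative assumptions, not taken from any NIC specification:

```python
# Sketch: two NICs agree on transfer parameters by taking the most
# conservative value each side can support.

from dataclasses import dataclass

@dataclass
class LinkParameters:
    max_frame_bytes: int     # 1. maximum size of each group of data
    window_frames: int       # 2. frames sent before a confirmation
    inter_frame_gap_us: int  # 3. interval between data chunks
    ack_timeout_ms: int      # 4. wait before confirmation is sent
    buffer_bytes: int        # 5. how much data a card holds
    speed_mbps: int          # 6. transmission speed

def negotiate(a: LinkParameters, b: LinkParameters) -> LinkParameters:
    # Slower card wins: smaller frames/windows/buffers/speed, and the
    # longer gap between frames.
    return LinkParameters(
        max_frame_bytes=min(a.max_frame_bytes, b.max_frame_bytes),
        window_frames=min(a.window_frames, b.window_frames),
        inter_frame_gap_us=max(a.inter_frame_gap_us, b.inter_frame_gap_us),
        ack_timeout_ms=min(a.ack_timeout_ms, b.ack_timeout_ms),
        buffer_bytes=min(a.buffer_bytes, b.buffer_bytes),
        speed_mbps=min(a.speed_mbps, b.speed_mbps),
    )

fast = LinkParameters(1514, 8, 10, 50, 32768, 100)
slow = LinkParameters(1514, 4, 96, 200, 16384, 10)
print(negotiate(fast, slow).speed_mbps)   # 10
```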
Network Card Configuration

• IRQ: a unique setting that requests service from the processor.

IRQ # Common Use I/O Address


IRQ 1 Keyboard
IRQ 2(9) Video Card
IRQ 3 Com2, Com4 2F0 to 2FF
IRQ 4 Com1, Com3 3F0 to 3FF
IRQ 5 Available (Normally LPT2 or sound card )
IRQ 6 Floppy Disk Controller
IRQ 7 Parallel Port (LPT1)
IRQ 8 Real-time clock
IRQ 9 Redirected IRQ2 370 - 37F
IRQ 10 Available (maybe primary SCSI controller)
IRQ 11 Available (maybe secondary SCSI controller)
IRQ 12 PS/2 Mouse
IRQ 13 Math Coprocessor
IRQ 14 Primary Hard Disk Controller
IRQ 15 Available (maybe secondary hard disk controller)
• Base I/O port: Channel between CPU and hardware
o specifies a channel through which information flows between the computer's adapter card and the
CPU. Ex. 300 to 30F.
o Each hardware device must have a different base I/O port

• Base Memory address: Memory in RAM used for buffer area


o identifies a location in the computer's RAM to act as a buffer area to store incoming and outgoing
data frames. Ex. D8000 is the base memory address for the NIC.
o each device needs its own unique address.
o some cards allow you to specify the size of the buffer ( 16 or 32 k, for example)
• Transceiver:
o sometimes selected as on-board or external. External usually will use the AUI/DIX connector:
Thicknet, for example
o Use jumpers on the card to select which to use

Data Bus Architecture

The NIC must

• match the computer's internal bus architecture and


• have the right cable connector for the cable being used

• ISA (Industry Standard Architecture): original 8-bit and later 16-bit bus of the IBM-PC.
• EISA (Extended Industry Standard Architecture): Introduced by consortium of manufacturers and
offers a 32-bit data path.
• Micro-Channel Architecture (MCA): Introduced by IBM in its PS/2 line. Functions as either 16 or 32
bit.
• PCI (Peripheral Component Interconnect): 32-bit bus used by Pentium and Apple Power-PC's.
Employs plug and play.

Improving Network Card Performance

• Direct Memory Access (DMA):


o data is moved directly from the network adapter card's buffer to computer memory.
• Shared Adapter Memory:
o network adapter card contains memory which is shared with the computer.
o The computer identifies RAM on the card as if it were actually installed on the computer
• Shared System Memory:
o the network adapter selects a portion of the computer's memory for its use.
o MOST common
• Bus Mastering:
o the adapter card takes temporary control of the computer's bus, freeing the CPU for other tasks.
o moves data directly to the computer's system memory
o Available on EISA and MCA
o can improve network performance by 20% to 70%
• RAM buffering:
o Ram on the adapter card acts as a buffer that holds data until the CPU can process it.
o this keeps the card from being a bottleneck
• On-board microprocessor:
o enables the adapter card to process its own data without the need of the CPU

Wireless Adapter Cards

• Used to create an all-wireless LAN


• Add wireless stations to a cabled LAN
• uses a wireless concentrator, which acts as a transceiver to send and receive signals

Remote-Boot PROMS (Programmable Read Only Memory)

• Enables diskless workstations to boot and connect to a network.


• Used where security is important.

OSI: The Network Layer

OSI Background
Created by the International Organization for Standardization (ISO) to develop standards for data networking,
the Open System Interconnection (OSI) protocols represent an international standardization program that
facilitates multivendor equipment interoperability. This paper will familiarize you with common terms and
introduce you to the core concepts of open systems networking. In an OSI network there are four significant
architectural entities: hosts, areas, a backbone, and a domain. A domain is any portion of an OSI network that is
under common administrative authority. Within any OSI domain, one or more areas can be defined. An area is a
logical entity; it is formed by a set of contiguous routers and the data links that connect them. All routers in the
same area exchange information about all of the hosts that they can reach. The areas are connected to form a
backbone. All routers on the backbone know how to reach all areas. The term end system (ES) refers to any
nonrouting host or node; intermediate system (IS) refers to a router. These terms are the basis for the OSI End
System-to-Intermediate System (ES-IS) and Intermediate System-to-Intermediate System (IS-IS) protocols, both of
which are discussed later in this document.

OSI Network-Layer Services and Protocols


Two types of OSI network-layer services are available: Connectionless Network Service (CLNS) and Connection-
Oriented Network Service (CONS). CLNS uses a datagram data transfer service and does not require a circuit to
be established before data is transmitted. In contrast, CONS does require a circuit to be established before
transmitting data. While CLNS and CONS define the actual services provided to the OSI transport layer entities
that operate immediately above the network layer, Connectionless Network Protocol (CLNP) and Connection-
Oriented Network Protocol (CONP) name the protocols that these services use to convey data at the network
layer. CLNP is the OSI equivalent of IP. Knowledge of OSI network addressing is the next step toward an
understanding of routing. OSI network addresses are variable-length entities designed to handle networks of
virtually any type and size. OSI addressing encompasses two primary concepts: Network Service Access Points
(NSAPs) and Network Entity Titles (NETs). NSAPs specify usage points at which network-layer services can be
acquired. If there are multiple network-layer service users (for example, OSI transport protocols Transport
Protocol 3 [TP-3] and Transport Protocol 4 [TP-4]) in a particular ES, then that ES will have multiple NSAP
addresses. In contrast, NETs specify network-layer entities or processes. NET entities represent the active agents
that operate within the network layer to carry out assigned functions. CLNP is a network-layer entity and would
therefore have an associated NET. NSAP and NET structure is very similar; in fact, in an ES, they typically differ
only in the last byte, called the selector. The NSAP selector is used to distinguish between logical entities on the
host (a transport entity in an ES or a network entity in an IS). NSAPs are hierarchical addresses consisting of two
parts: an initial domain part (IDP) and a domain-specific part (DSP). The IDP consists of authority and format
identifier (AFI) and initial domain identifier (IDI) parts. The AFI provides information about the structure and
content of the IDI and DSP fields, including whether the IDI is of variable length and whether the DSP uses
decimal or binary notation. The IDI further specifies an entity that can assign values to the DSP portion of the
address.When used in an environment where the OSI IS-IS protocol is used for routing, the DSP specifies the
area, the station ID within the area, and the selector (port) number. Figure 1 illustrates the NSAP address format
for use with IS-IS routing.

Figure 1: NSAP Address Format for Use With IS-IS Routing
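The field split described above (area, station ID, selector) can be sketched by slicing an NSAP written in the dotted-hex form commonly used with IS-IS. The parser and the AFI-49 (private addressing) example address below are illustrative assumptions, not taken from the text:

```python
# Split an NSAP (dotted-hex form) into the IS-IS routing fields:
# everything up to the last 7 bytes is the area address (AFI + area),
# the next 6 bytes are the station (system) ID, and the final byte is
# the NSAP selector that distinguishes logical entities on the host.

def parse_nsap(nsap: str):
    digits = nsap.replace(".", "")
    selector = digits[-2:]       # last byte: selector (port)
    system_id = digits[-14:-2]   # 6 bytes: station ID within the area
    area = digits[:-14]          # remainder: AFI + area address
    return {"area": area, "system_id": system_id, "selector": selector}

print(parse_nsap("49.0001.1921.6800.1001.00"))
# {'area': '490001', 'system_id': '192168001001', 'selector': '00'}
```

A selector of 00 conventionally denotes the network entity itself, i.e. a NET rather than an NSAP for a transport user, matching the note above that a NET and an NSAP typically differ only in the last byte.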

OSI Routing Protocols


The OSI protocol suite includes several routing protocols and one router discovery protocol (ES-IS). Although not
explicitly a routing protocol, ES-IS is included in this section because it is commonly used with routing protocols
to provide end-to-end data movement through an internetwork. Routing within an area is called level 1 routing;
routing between areas is called level 2 routing. An IS that can route only within areas is known as a level 1 IS. A
level 1 IS needs to know only about the ESs and other level 1 ISs in its own level 1 area and about the nearest
level 2 IS that it can use to forward traffic out of its own area. Figure 2 illustrates the level 1 view of the routing
domain.

Figure 2: Level 1 View of the Routing Domain

An IS that can route between areas is called a level 2 IS. A level 2 IS must understand the topology of the areas in
which it resides, other level 2 ISs in its routing domain, and how to reach all other level 1 areas. Figure 3
illustrates the level 2 view of the routing domain.

Figure 3: Level 2 View of the Routing Domain

In OSI networks, each ES lives in a particular area. An ES discovers an IS by listening to "hello" messages
exchanged as part of the ES-IS protocol (explained in the next section). When an ES wants to send a packet to
another ES, it sends the packet to any directly connected Level 1 IS in its area. The IS looks up the destination
address and forwards the packet along the best route. If the destination address is an ES in another area, the Level
1 IS sends the packet to the nearest Level 2 IS. Forwarding through Level 2 ISs continues until the packet reaches
a Level 2 IS in the destination area. Within the destination area, Level 1 ISs forward the packet along the best path
of Level 1 ISs until the destination ES is reached. Figure 4 illustrates the CLNP routing process.

Figure 4: CLNP Routing

Network-layer and routing protocols are both involved in the routing process; these protocols are discussed in the
next two sections.

ES-IS

ES-IS is the means through which an ES becomes acquainted with an IS. It is a very simple protocol that makes
use of three types of messages: end-system hellos (ESHs), intermediate-system hellos (ISHs), and redirects. An
ESH announces the presence of an ES. An ESH is sent by all ESs to a special data-link layer address that all ISs
on that network segment listen to. An ISH announces the presence of an IS. An ISH is sent by all ISs to a special
data link-layer address that all ESs on that segment listen to. Both ESHs and ISHs provide network-layer and
data link-layer addresses for the source nodes. An IS sends a redirect to an ES to tell the ES that there is a more
efficient path to the destination. Figure 5 shows an instance in which a redirect message instructs ES1 to send a
packet to IS2 instead of IS1. At time 1, ES1 sends a packet to IS1. IS1's optimal path information, compiled with
the help of routing protocols, specifies that the packet should be forwarded out the same port as the one from
which the packet was received. In this case, the best path is really through IS2, which is directly accessible to
ES1. At time 2, after it has forwarded the original packet to IS2, IS1 sends a redirect message to ES1 telling it that
IS2 is a better route for datagrams destined for ES2. At time 3, ES1 directs a new packet to IS2.

Figure 5: Redirect Message Example


Where an ES is connected to an IS via a point-to-point connection, ISHs and redirects are not necessary. The ES
simply sends the IS periodic ESHs to let the IS know its network-layer address. The IS can then announce to the
rest of the network that it can forward datagrams to that ES.

Where an ES is connected to a LAN, more complicated (but still relatively simple) operations are required. All
ESs send ESHs, and all ISs send ISHs. ESHs allow ISs to identify all ESs on the LAN; ISHs allow ESs to identify
all ISs on the LAN. ESs maintain two caches: an IS cache that contains data link-layer addresses for all ISs on the
LAN and a destination cache that contains the network layer/data link-layer address mappings for all destination
ESs. When an ES needs to transmit to a destination ES, it first checks its destination cache. If the destination ES is
listed in the cache, the source ES addresses and sends the packet accordingly. If the destination ES is not in the
destination cache, the source ES looks in its IS cache. If the IS cache is not empty, the source ES selects an IS
from the cache and addresses its packet to that IS. In other words, the ES sends the packet to any directly
connected IS in its area. The IS may or may not be the first step along the optimal path to the destination. If the IS
determines that the next hop is another IS on the ES's LAN, it forwards the packet to that IS and sends the ES a
redirect message. If the IS determines that the destination ES is on the source ES's LAN, it forwards the packet to
the destination ES and sends a redirect message to the source ES. If the IS cache is empty and there is no
appropriate entry in the destination cache, the ES sends the packet to a multicast address indicating all ESs. All
ESs on the LAN receive the multicast and examine the network-layer address. If an ES sees a network-layer
address matching its own, it accepts the packet and sends an ESH to the source ES. All ESs without a matching
network-layer address discard the packet. Figure 6 shows a flowchart of ES-IS operations.

Figure 6: ES-IS Operations
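The destination-resolution logic described above can be sketched in code. This is an illustrative model only; the cache structures, names, and return values are assumptions, not part of the ES-IS standard.

```python
# Hypothetical sketch of ES-IS destination resolution on a LAN.
# destination_cache maps NSAP -> data link-layer address of a destination ES;
# is_cache maps IS identifiers -> data link-layer addresses of known ISs.

def resolve_next_hop(dest_nsap, destination_cache, is_cache):
    """Return (link_address, needs_multicast) for a packet to dest_nsap."""
    # 1. Prefer a known network-to-link-layer mapping for the destination ES.
    if dest_nsap in destination_cache:
        return destination_cache[dest_nsap], False
    # 2. Otherwise hand the packet to any directly connected IS; the IS will
    #    forward it and send a redirect if a better first hop exists.
    if is_cache:
        return next(iter(is_cache.values())), False
    # 3. No IS known: multicast to all ESs; the matching ES replies with an ESH.
    return "ALL_ES_MULTICAST", True
```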


IS-IS

IS-IS is the standard intradomain routing (routing within a domain) protocol in the OSI protocol suite. It is a link
state protocol, meaning that it calls for each IS to "meet" its neighbor ISs and proliferate information about the
state of each neighbor link to all other ISs. Each IS stores these link state advertisements (LSAs) and can compute
optimal routes to each ES from the complete topological knowledge they yield. IS-IS is a cost-based routing
protocol. In other words, each IS that runs IS-IS must be configured with a cost for each attached link. LSAs
include costs to allow straightforward calculation of optimal routes. LSA distribution is a critical part of IS-IS
operations. All ISs must receive LSAs from all other ISs, or topological information is not complete. LSAs are
flooded to all IS ports except those on which the LSA was received. LSAs also include remaining lifetime and
sequence number fields. ISs use these fields to help determine whether received LSAs might be duplicates, too
old, or otherwise inappropriate. ISs send LSAs at regular intervals and when the following special events occur:

• When an IS discovers that its link to a neighbor is down
• When an IS discovers that it has a new neighbor
• When an IS discovers that the cost of a link to an existing neighbor has changed
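The duplicate and staleness checks described above, based on the remaining lifetime and sequence number fields, can be sketched as follows. Field names and the database structure are assumptions for illustration, not the on-the-wire LSA format.

```python
# Illustrative check an IS might apply to a received LSA before installing it
# and flooding it onward to all ports except the one it arrived on.

def accept_lsa(lsa, database):
    """Decide whether a received LSA should be installed and flooded onward."""
    if lsa["remaining_lifetime"] <= 0:
        return False                      # expired on arrival
    stored = database.get(lsa["origin"])
    if stored and stored["sequence"] >= lsa["sequence"]:
        return False                      # duplicate or older than what we hold
    database[lsa["origin"]] = lsa         # install; caller floods to other ports
    return True
```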

Once LSAs have been distributed appropriately, an algorithm must be run to compute optimal paths to each ES.
The algorithm most often chosen for this task is the Dijkstra algorithm. The Dijkstra algorithm iterates on the
length of a path, examining the LSAs of all ISs working outward from the host IS. At the end of the computation,
a connectivity tree yielding the shortest paths (including all intermediate hops) to each IS is formed. When a level
1 IS receives a packet, it examines the destination area address in the network-layer header. If this address
matches the level 1 IS's area address, the IS routes based on the ID portion of the address. Otherwise, the IS
forwards the packet to the closest level 2 IS. Within an area, a level 1 IS receiving a packet will look in its routing
table to see if an entry exists for the destination ES. If an entry exists, the IS forwards the packet appropriately. If
an entry does not exist, the packet is either dropped or forwarded to a default IS designated for such purposes.
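The Dijkstra computation described above can be sketched over a cost graph derived from the LSA database. The graph representation here is an illustrative assumption; each node maps to its neighbors and configured link costs.

```python
import heapq

# Minimal Dijkstra sketch: graph maps each IS to {neighbor: link_cost}.
# Iterates outward from the source IS by increasing path length, as described.

def shortest_paths(graph, source):
    """Return {node: total_cost} of least-cost paths from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                      # stale heap entry; already improved
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist
```

A full implementation would also record the predecessor of each node to recover the intermediate hops that form the connectivity tree.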

Integrated IS-IS

Integrated IS-IS is an implementation of the IS-IS protocol for routing multiple network protocols. Today,
Integrated IS-IS standards exist that support the CLNP and IP protocols. Like all integrated routing protocols,
Integrated IS-IS calls for all routers to run a single routing algorithm. LSAs sent by routers running Integrated IS-
IS include all destinations running either IP or CLNP network-layer protocols. Protocols such as the Address
Resolution Protocol (ARP) and the Internet Control Message Protocol (ICMP) for IP and the ES-IS protocol for
CLNP still must be supported by routers running Integrated IS-IS. Standard IS-IS packets must be modified to
support multiple network-layer protocols. IS-IS packet formats were designed to support the addition of new
fields without a loss of compatibility with nonintegrated versions of IS-IS. The fields added to IS-IS to
support integrated routing do the following:

• Tell ISs which network-layer protocols are supported by other ISs
• Tell ISs whether end stations running other protocols can be reached
• Include any other required network-layer, protocol-specific information

Most internetworks running Integrated IS-IS support three different IS configurations: those running only IP,
those running only CLNP, and those running both IP and CLNP. ISs running only one of the two protocols ignore
information concerning the other protocol. In fact, such ISs will refuse to recognize other ISs as neighbors unless
they have at least one protocol in common. ISs running both protocols can and will become neighbors with the
other IS types.
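The neighbor rule described above — ISs refuse to form an adjacency unless they share at least one network-layer protocol — reduces to a simple set intersection. A one-line sketch, with protocol names as illustrative strings:

```python
# Two ISs may become neighbors only if their sets of supported
# network-layer protocols (e.g. "IP", "CLNP") intersect.

def can_be_neighbors(protocols_a, protocols_b):
    """Return True if the two ISs share at least one protocol."""
    return bool(set(protocols_a) & set(protocols_b))
```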

Interdomain Routing

Interdomain routing (routing between domains) is philosophically different from intradomain routing; hence the
separation of these protocols into a new category. The primary philosophical difference is that intradomain
routing typically assumes a trusted environment in which constant communication within a single organization
occurs. By contrast, interdomain routing often occurs between different organizations that want distinct and
essential controls over information sent and received. Communication often is not as frequent and typically is
subjected to additional scrutiny. The simplest type of interdomain routing is static routing. In static routing
systems, routes between domains are manually established and removed. Because it involves much more
administrative overhead than dynamic routing, static routing is most often used when very few routes must be
maintained.

Cisco's OSI Implementation


Cisco Systems was the first company to support dynamic interdomain routing within OSI environments.
Currently, Cisco's OSI implementation provides both static and dynamic packet forwarding and routing and
adheres to relevant ISO protocol specifications, including:

• ISO 8473 (CLNP)
• ISO 8348 (CLNS)
• ISO 8348/Ad2 (NSAP addressing)
• ISO 8208 (packet-level CONS)
• ISO 8802-2 (frame-level services on LAN media)
• ISO 8881 (CONS over ISO 8802-2)
• ISO 7776 (Link Access Procedure, Balanced)
• ISO 9542 (ES-IS)
• ISO 10589 (IS-IS)

Integrated IS-IS extensions for IP as defined in RFC 1195 also are supported. Users can perform CLNP routing
over Ethernet, Fiber Distributed Data Interface (FDDI), Token Ring, and serial line networks. Cisco's OSI
implementation is also compliant with the United States Government Open Systems Interconnection Profile (US-
GOSIP) Version 2 specification, and Cisco is the first router vendor to be certified and registered with the
National Institute of Standards and Technology (NIST).

Interoperability

The ability of protocol implementations to work with other implementations of the same protocol (often called
interoperability) is a critical feature of any OSI implementation. Cisco's OSI implementation is highly
interoperable, having been proven so in OSI interoperability demonstrations with AT&T, Data General, DEC,
Frontier Technologies, HP, IBM, Intel, NCR, Novell, OSIWare, Spider, Sun, Tandem, Touch, Unisys, and
Wollongong. Cisco routers are able to interoperate with equipment from each of these vendors, a fact that is
particularly noteworthy in the case of AT&T, which many people believe has the largest installed base of CLNP
end systems. Cisco also participated successfully in a European pilot demonstration of CLNP-protocol-based
inter-domain routing (see Figure 7).

Figure 7: European CLNP Pilot


As networks grow larger, administrative control of network access becomes increasingly important. Such control
is particularly important in OSI networks, which were designed to provide a rich feature set in support of large,
heterogeneous networks. Cisco provides many features designed to enhance administrative control of OSI
networks. These features are described in the next two sections.

Route Redistribution

Cisco routers support information sharing between multiple routing protocols and between multiple instances of
the same routing protocol. Such sharing is known as route redistribution and is supported among all of Cisco's
routing protocols. Route redistribution ensures that routing can occur in networks that run multiple routing
protocols. Over time, Cisco has enhanced its route redistribution support to improve administrative control over
methods by which routing information moves between routing domains. To ease configuration of route
redistribution, Cisco created route maps. A route map is a set of instructions that tell the router how routing
information is to be redistributed between two routing protocols or between two instances of the same routing
protocol. Route maps contain an ordered list of match conditions. Each item in the list is matched in turn against
any route that is a candidate for redistribution. When a match is found, the action associated with that entry is
performed. The route can be permitted (redistributed) or denied (not redistributed), but the action also can
mandate the use of certain administrative information (called route tags) that can be attached to routing data to
augment routing decisions. Route maps also can mandate the use of certain route metrics or route types and even
can modify the route's destination in outgoing advertisements. Where different networks share similar
redistribution needs, network administrators can conserve memory and save time by using the same route map for
more than one protocol pair.
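The route-map evaluation described above — an ordered list of match conditions tried in turn, each with a permit/deny decision and optional attribute changes — can be modeled as follows. This is a simplified illustration of the concept, not Cisco IOS route-map syntax; the data structures are assumptions.

```python
# A route map is modeled as an ordered list of
# (match_predicate, permit, set_fields) entries. Candidate routes are
# plain dicts; set_fields supplies tags, metrics, or other attributes
# attached when the route is permitted.

def apply_route_map(route, route_map):
    """Return the (possibly modified) route if permitted, else None."""
    for matches, permit, set_fields in route_map:
        if matches(route):
            if not permit:
                return None                     # denied: not redistributed
            return {**route, **set_fields}      # permitted, attributes applied
    return None                                 # no match: not redistributed
```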

Route maps give network managers unprecedented control over the ways that routing information is propagated
in their networks. Redistribution configuration files that use route maps are easy to create, understand, and
modify. Using route maps, Cisco users are able to build larger, more robust, reliable networks, with better traffic
control than ever before.

OSI Filtering

Cisco offers advanced filtering features that provide additional administrative control of traffic flow in an OSI
network. There are four components to a Cisco OSI filter:

• Address templates
• Template aliases
• Filter sets
• Filter expressions

Address templates are applied to NSAP addresses to provide flexible filtering based on all or a portion of the
address. The simplest template is an address itself. Wildcard notation can be used in an address template to denote
a match with anything, and both address prefix and suffix matching are possible. These features are particularly
useful with variable-length NSAP addresses. Both bit- and byte-level matching are also possible.

Because NSAP addresses can be relatively lengthy, address templates sometimes can become unwieldy. In these
cases, address templates can be assigned names called template aliases. Template aliases allow repetitive use of
address templates without concern for typing mistakes and other problems. Aliases are more meaningful to human
administrators than alphanumeric NSAP addresses are, so it is easier to look at a template alias and know what it
denotes. Finally, when an address changes, administrators can simply modify the template alias.

A filter set is a named collection of address templates with associated permit/deny indications. Filter expressions
are Boolean combinations of filter sets, other filter expressions, and certain logical operators (AND, OR, XOR,
and NOT). Filter expressions allow filtering combinations not possible with simple filter sets. Further, they permit
matches on source address.

Filter sets and filter expressions can be applied to inbound or outbound CLNP datagrams, IS-IS adjacencies
(IS-IS routers that are on the same network segment), ISO-IGRP adjacencies (ISO-IGRP routers that are on the
same segment), ES-IS adjacencies (ESs and ISs that are on the same segment), and route redistribution. Together,
they provide an extensive set of OSI filtering capabilities designed to ease network administration while saving
time and reducing the possibility of configuration errors.
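The exact-match, wildcard, prefix, and suffix template forms described above can be sketched as follows. The `*` wildcard syntax and the hex-string address representation are illustrative assumptions, not Cisco's actual template notation.

```python
# Illustrative address-template matcher over an NSAP address written as a
# hex string. Supports the template forms described in the text: an exact
# address, a bare wildcard, and prefix or suffix matching.

def template_matches(template, nsap):
    """Return True if the NSAP address matches the template."""
    if template == "*":
        return True                             # match anything
    if template.endswith("*"):
        return nsap.startswith(template[:-1])   # prefix match
    if template.startswith("*"):
        return nsap.endswith(template[1:])      # suffix match
    return nsap == template                     # exact address
```

A filter set would pair such templates with permit/deny flags, and filter expressions would combine filter sets with AND, OR, XOR, and NOT.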

Integrated and Interdomain Routing

In addition to Cisco's support of Integrated IS-IS, its standard IS-IS implementation still can run simultaneously
in the same router with other routing protocols. For example, users can use IS-IS to route CLNP and Enhanced
IGRP to route IP. Both routing processes (IS-IS and Enhanced IGRP) operate autonomously in any router. This
approach, which is often called ships-in-the-night routing, creates multiple logical routers within a single physical
router. Physical routers analyze all incoming datagrams, identify the indicated network-layer protocol in each, and
assign the packet to the appropriate logical router for processing. In addition to Integrated IS-IS, Cisco continues
to offer its ISO-IGRP implementation. ISO-IGRP is another integrated routing protocol that accomplishes the
same purpose as Integrated IS-IS. The primary difference between the two is that ISO-IGRP is a distance-vector
protocol, whereas Integrated IS-IS is a link-state protocol. ISO-IGRP also gave Cisco the distinction of being the
first company to offer dynamic interdomain routing for CLNP. An ISO-IGRP network can connect two or more
IS-IS domains. Route redistribution ensures that IS-IS routes can pass through the "foreign" environment without
information loss. Static routes provide users with yet another way to effect interdomain routing in CLNP
environments.
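The ships-in-the-night model described above amounts to a dispatch step: the physical router identifies the network-layer protocol of each incoming datagram and hands it to the corresponding logical routing process. A minimal sketch, with illustrative names:

```python
# Sketch of ships-in-the-night routing: logical_routers maps a network-layer
# protocol name to the handler for that logical router (e.g. IS-IS for CLNP,
# Enhanced IGRP for IP). Each process operates autonomously.

def dispatch(packet, logical_routers):
    """Route the packet via the logical router registered for its protocol."""
    handler = logical_routers.get(packet["protocol"])
    if handler is None:
        return "dropped"                  # no routing process for this protocol
    return handler(packet)
```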

Other Features

To provide monitoring and troubleshooting capability, the Cisco CLNP implementation supports both ping and
trace commands. Ping commands are used to test the reachability of remote nodes. Trace commands allow an
administrator to discover the path a packet takes when it traverses the network. In addition to these helpful and
often-used commands, the show and debug commands display such information as the contents of the routing
cache, lists of ES and IS neighbors, traffic statistics, and significant CLNP event occurrences. These capabilities
constitute the industry's most robust set of CLNP monitoring and diagnostic features and, for the user, they
translate into less time spent debugging network problems.

Routing paths through a network can be of equal cost. This is particularly common in the case of serial
interfaces, because the speed of the lines is often the same. Rather than simply using one path and subjecting
traffic on that line to possible delay, Cisco supports per-packet load sharing between equal-cost paths. In other
words, packets can be multiplexed in a round-robin fashion on up to four equal-cost paths. This technique
provides better response through superior bandwidth utilization.

X.500 is the OSI name service protocol. Because X.500 implementations are not yet commonplace, Cisco
offers system administrators a static name-to-address translation capability. This feature allows administrators to
use convenient names rather than 20-byte NSAP addresses in all router commands. Administrators provide the
router with name/NSAP address pairs, which are used for name-to-address translation. Domain Name System
(DNS) support for NSAP addresses, as defined in RFC 1348, is currently in transition. Cisco is tracking the
transition and will support the standard that emerges. When the standard is complete, administrators will simply
load the name-to-NSAP mapping into a DNS database. Thereafter, when a name that is not in the NSAP name
database is encountered, a DNS lookup is executed automatically.
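The per-packet round-robin load sharing described in this section can be sketched as a simple cycle over the available equal-cost paths. Path names are illustrative.

```python
import itertools

# Sketch of per-packet round-robin load sharing: each call to the selector
# returns the next of up to four equal-cost paths, so successive packets
# are multiplexed across all paths rather than queued on one.

def make_path_selector(paths):
    """Return a function that yields the next path in round-robin order."""
    cycle = itertools.cycle(paths[:4])    # at most four equal-cost paths
    return lambda: next(cycle)
```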

X.25 Switching

Cisco's support of ISO 8208 (CONS) provides the ability to extend X.25 switching to different media, such as
Ethernet, Token Ring, and FDDI. CONS specifies the implementation of packet-level X.25 over the Logical Link
Control 2 (LLC2) connection-oriented data link service on LAN media. LAN-based OSI nodes can be connected
both to one another and to remote OSI-based DTE devices via X.25 public data networks (PDNs) or point-to-point
lines. Figure 8 shows examples of each of these Cisco CONS configurations.

Figure 8: Example Cisco CONS Configurations


Conclusion

Cisco offers a feature-rich, robust, highly compatible, time-proven OSI routing solution that will continue to
support multivendor equipment interoperability. CLNP and CONP are two of the more than 20 protocols that can be
simultaneously routed and bridged by any of Cisco's routers. Cisco enriches the implementation of each of these
protocols with value-added features that provide ease of use, security, enhanced management, and optimized
performance for networks ranging in size from PC LAN environments to very large-scale, enterprise-wide
networks.

Copyright 1996 © Cisco Systems Inc.