Professional Documents
Culture Documents
Packet Sceduling For Crossbar Fabric PDF
Packet Sceduling For Crossbar Fabric PDF
QoS classes
Itamar Elhanany, Michael Kahane, and Dan Sadot Accompanying the rapid pace of
Ben-Gurion University Internet traffic is the increasing popular-
ity of multimedia applications sensitive
to mean packet delay and jitter. To pro-
vide differentiated services, developers
T
he infrastructure required to verse, and repeatedly reconfigures the categorize incoming packets into QoS
govern Internet traffic volume, crosspoint for successive transmissions. classes. For each QoS class, the router
which doubles every six months,
consists of two complementary
elements: fast point-to-point Multiterabit networks will require
links and high-capacity switches and innovative queuing strategies
routers. Dense wavelength division mul-
tiplexing (DWDM) technology, which and high-performance scheduling
permits transmission of several wave-
lengths over the same optical media, will algorithms to meet the future
enable optical point-to-point links to packet-scheduling challenge.
achieve an estimated 10 terabits per sec-
ond by 2008. However, the rapid growth
of Internet traffic coupled with the avail- Optical versus electrical must not exceed the acclaimed latency
ability of fast optical links threatens to A major debate exists on whether next- and jitter requirements.
cause a bottleneck at the switches and generation multiterabit routing scenarios For routers to operate under heavy traf-
routers. should use optical or electrical switch fab- fic loads while supporting QoS, designers
rics. At first glance, all-optical switches must implement smarter scheduling
SWITCH FABRICS are the straightforward solution because schemes, a difficult task given that routers
Switches and routers consist of line they enable high-speed multiterabit aggre- usually configure the crosspoint and trans-
cards and a switch fabric. A network gation and concentration. However, mit data on a per-data-unit basis. The
processor that interfaces to the physical emerging router functions, such as qual- scheduling algorithm must make a new
data links, decodes incoming traffic, and ity-of-service (QoS) provisioning, require configuration decision with each incom-
applies traffic shaping, filtering, and that packets be bufferedtypically at the ing data unit. In an ATM switch, where
other policy functions resides on each network processorsuntil the scheduler the data unit is a 53-byte cell, the algorithm
line card, which is associated with a port. grants transmission. Because dynamic must issue a scheduling decision every 168
Actual data transmission across ports optical buffering is impractical, delayed ns at 2.5-Gbit-per-second line rates. As 10-
occurs on the switch fabric, which transmissions currently dictate using opto- Gbit-per-second port rates become stan-
includes a crosspoint, a scheduler, and electric and electro-optical conversions. dard for high-end routers, the decision
buffers. The crosspoint is a configurable Switching nodes, which entail simpli- time reduces fourfold, to a mere 42 ns.
interconnecting element that dynamically fied buffer and transmission manage-
establishes links between input and out- ment, more efficiently process network QUEUING STRATEGIES
put ports. The scheduler configures the protocols deploying fixed packet sizes The switch fabric core embodies the
crosspoint, allows buffered data to tra- such as asynchronous transfer mode crosspoint element responsible for
104 Computer
matching N input and output ports.
Currently, routers incorporate electrical
crosspoints but optical crosspoint solu- Port processor N N crosspoint
tions, such as those based on dynamic
Input port 1 Output port 1
DWDM, show promise. Regardless of
the underlying technology, the basic func-
tionality of determining the crosspoint
configuration and transmitting data Prioritized virtual
remains the same. output queuing
Input queuing
Port processor
We can generally categorize switch-
queuing architectures as input- or out-
put-queued. Network processors store Input port N Output port N
arriving packets in input-queued switches
in FIFO buffers that reside at the input
or ingressport until the processor sig- Prioritized virtual
nals them to traverse the crosspoint. output queuing
A disadvantage of input queuing is that
a packet at the front of a queue can pre-
vent other packets from reaching poten- Crosspoint scheduler
tially available destinationor egress
ports, a phenomenon called head-of-line
(HOL) blocking. Consequently, the over-
all switching throughput degrades sig- Figure 1. Router-switch architecture consisting of virtual output queuing buffering, crosspoint,
nificantly: For uniformly distributed and scheduling modules. The network processor automatically classifies packets arriving at the
Bernoulli iid traffic flows, the maximum input ports to queues corresponding to their destination. The crosspoint scheduler receives
achievable throughput using input queu- transmission requests from the various queues and determines which queue is granted
ing is 58 percent of the switch core capac- transmission at each time slot. Packets accordingly traverse the crosspoint to their designated
ity. output ports and depart the switch.
Virtual output queuing (VOQ) entirely
eliminates HOL blocking. As Figure 1 of 10- and 40-Gbit-per-second rates, how- minimize packet loss resulting from
shows, every ingress port in VOQ main- ever, makes acceleration by N infeasible. buffer overflow, and
tains N separate queues, each associated Trading space for time by deploying support strict QoS requirements in
with a different egress port. The network space-division multiplexing (SDM) tech- accordance with diverse data classes.
processor automatically classifies and niques requires a dedicated path within
stores packets upon arrival in a queue cor- the switch fabric for each input-output Intuitively, these objectives appear con-
responding to their destination. VOQ thus pair. However, SDM implies having tradictory. Temporarily maximizing input-
assures that held-back packets do not O(N2) internal paths, only N of which output matches, for example, may not
block packets destined for available out- can be used at any given time because at result in optimal bandwidth allocation in
puts. Assigning several prioritized queues most, N input ports simultaneously terms of QoS, and vice versa. Scheduling
instead of one queue to each egress port transmit to N output ports. This require- is clearly a delicate task of assigning
renders per-class QoS differentiation. ment renders SDM impractical for large ingress ports to egress ports while opti-
port densities such as N = 64. mizing several performance parameters.
Output queuing Moreover, as port density and bit rates
Output-queuing strategies directly SCHEDULING APPROACHES increase, the scheduling task becomes
transfer arriving packets to their desig- The main challenge of packet sched- increasingly complex because more deci-
nated egress ports. A contention resolu- uling is designing fast yet clever algo- sions must be made during shorter time
tion mechanism handles cases in which rithms to determine input-output frames. Advanced scheduling schemes
two or more ingress ports simultaneously matches that, at any given time: exploit concurrency and distributed com-
request packet transmission to the same putation to offer a faster, more efficient
egress port. maximize switch throughput uti- decision process.
A time-division-multiplexing (TDM) lization by matching as many input-
solution accelerates the switch fabric inter- output pairs as possible, PIM and RRM
nal transmission rates by N with respect to minimize the mean packet delay as Commonly deployed scheduling algo-
the port bit rates. The growing prevalence well as jitter, rithms derive from parallel iterative
matching (PIM), an early discipline devel- Request. All unmatched inputs send using global contention logic. Although
oped by the Digital Equipment Corp. for requests to every output to which more difficult to implement, this scheme
a 16-port, 1-Gbit-per-second switch they have packets to send. yields lower packet delay at the expense
(http://www.cs.washington.edu/homes/ Grant. Each output selects a request- of longer switching intervals. It also
tom/atm.html). PIM and its popular ing input that coincides with a pre- introduces lower connectivity, global
derivatives use randomness to avoid star- defined priority sequence. A pointer contention resolution, and inherent QoS
vation and maximize the matching pro- indicates the current location of the support.
cess. Unmatched inputs and outputs highest-priority elements and, if
contend during each time slot in a three- accepted, increments (modulo N) to
step process. one beyond the granted input. ultiterabit packet-switched net-
106 Computer