
ON THE WIRE

IP QoS: Traveling in First Class on the Internet

Chris Metz • Cisco Systems • chmetz@cisco.com

84 MARCH • APRIL 1999 http://computer.org/internet/ IEEE INTERNET COMPUTING

The global reach and ubiquity of the Internet have created a transport and delivery vehicle for all sorts of applications. Some new ones, like voice over IP (VoIP) and packetized video, can present multimedia data in real time. Others that are low-bandwidth and text-based may include a “high-priority” label because they process mission-critical business information.

In both cases, the network must handle the application packets in a special way so that the data is delivered to the end user ahead of other traffic. But the Internet and, more generally, IP networks offer no easy way to either identify such packets or subsequently give them special handling.

This situation is beginning to change. Indeed, the concept of Quality of Service (QoS), that is, the network capability to provide a nondefault service to a subset of the aggregate traffic, has now entered the IP lexicon [1]. IP QoS will no doubt have a significant economic impact as the Internet evolves from a best-effort connection engine into a universal transport and service-delivery medium for large volumes of voice, real-time, and corporate business data.

The Problem with IP: One Class Only

Until recently, IP networks supported one service class: best effort. The network would make its best attempt to deliver packets to their destinations but with no guarantees and no special resources allocated for any of the packets. The reasons IP has never had any notion of QoS are various.

First, the original TCP/IP protocol suite was built on the idea of fair and equitable access to all and no special treatment for anyone. With the exception of the communicating endpoints, no connection state was to be maintained anywhere in the network. If a packet did not arrive safely at the destination, it was up to the source to retransmit the original packet.

Second, the internal workings of early routers (and many in current operation) used a first-in, first-out (FIFO) queuing strategy. If more packets arrived than the router could handle and the queue filled up, newly arriving packets (the tail of the queue) were dropped.

Third, the impending collapse of the Internet brought on by increased traffic and FIFO-based routers was headed off at the pass by changes to TCP that allowed it to adapt its sending rate to available network capacity [2]. Specifically, TCP started clocking its sending rate to the arrival and frequency of acknowledgments sent by the receiver. If the network was congested, the sender reduced its information transfer rate; if the network had available capacity, it increased its sending rate.

Finally, there was no driving need to re-architect the TCP/IP protocol suite to support QoS since there were no applications that really needed it. The dominant applications have been and continue to be HTTP, FTP, and e-mail, all of which use TCP and can therefore adapt their sending rates to whatever capacity the network offers.

Early Mechanisms for Differentiation

Support for QoS over IP-based networks is not an issue that has lacked attention or interest. In their seminal 1992 paper, Clark, Shenker, and Zhang outlined an architecture to support real-time traffic flows over a packet data network [3]. In addition to describing different service classes the network should support (guaranteed, predictive, and best-effort), the paper describes two important mechanisms that are still used today.

Token bucket filter. The first was a token bucket filter that characterizes the application traffic load receiving a particular service. As shown in Figure 1, it can be conceptualized as a bucket of depth B that is replenished with tokens, or credits, at a rate of R tokens per second. When a packet arrives at the router, some number of tokens (based on the packet size) are subtracted from the bucket. A packet cannot be sent unless there are sufficient tokens in the bucket. A token bucket filter allows a source to transmit a burst of packets equal to the total number of tokens in the bucket, which is less than or equal to B.

A traffic source is said to conform to the parameters of the token bucket filter if it sends packets at a rate less than or equal to R. Therefore the network can easily understand and enforce traffic characterized by a token bucket filter because conforming traffic will never exceed R·t + B for any increment of time equal to t. Each implementation must decide what to do with nonconforming traffic. The token bucket filter is used in many router implementations to quantify and enforce the treatment a particular flow will receive from the network.

Figure 1. Token bucket filter. The flow of packets through a router can be enforced by establishing a “bucket” of depth B that is replenished with tokens or credits at a rate of R tokens per second, and the quantification can be used to enforce the treatment of a particular flow.

Weighted fair queuing. The second important mechanism described by Clark, Shenker, and Zhang was a weighted-fair-queuing (WFQ) algorithm used to schedule packets for outbound transmission from the routers or switches. WFQ is a variant of the fair-queuing (FQ) algorithm in which individual packets of a flow are time-stamped based on their arrival time at the router, their scheduled departure time from the router, and their length [4]. The departure queue of the WFQ scheduler is reordered every time a new packet arrives so the packets with the smallest time stamps are transmitted first. The original FQ algorithm provides a fair share (1/N) of the available bandwidth to each of N flows, while the weighted variant allows a particular flow to receive more than its fair share of bandwidth.

Two properties make the WFQ scheduler an ideal choice for supporting real-time traffic flows. First, a particular flow is guaranteed its allocated share of the bandwidth irrespective of the behavior of other flows traveling through the same router. Second, a WFQ scheduler is work-conserving, which means that the router will always transmit available packets and therefore the link is never idle while packets are waiting. Moreover, Parekh and Gallager proved that a network of routers could quantifiably bound delay for traffic conforming to a token bucket filter and scheduled using WFQ [5]. Sophisticated scheduling algorithms like WFQ, which are essential for QoS support, are currently implemented in many advanced routers and switches.

Reservations Required

As the number of real-time applications grew, so did the realization that best-effort service was inadequate to support them.

IntServ. In 1994, the Internet community began work to define an Integrated Services Architecture (IntServ) that would extend the existing IP architectural model to support both real-time and best-effort traffic flows [6]. The IntServ architecture defines a flow as a stream of packets with common source addresses, destination addresses, and port numbers. IntServ suggested that for a flow to receive a desired level of service in terms of quantifiable bandwidth or delay, it is necessary to install and maintain flow-specific state in the network. Of course, a router has only a finite amount of buffers and CPU, and is attached to links with a maximum bandwidth. Thus each router in the network would have to exercise a degree of discretionary control over what flows would be allocated what resources based on available capacity. This idea that the network might deny service because of insufficient resources ran contrary to the notion of the connectionless, best-effort, “send-packets-whenever” kind of service offered by traditional IP.

The basic components of the IntServ architecture are traffic control, traffic classes, and the setup protocol. Traffic control includes admission control, which checks to see if the resources in the host or router can support a particular service; the packet classifier, which examines the source address, destination address, and port fields in each packet to determine what class the packet belongs to; and the packet scheduler, which schedules the packet for transmission on the outbound link.

IntServ supports two traffic classes in addition to best-effort service: guaranteed service supports real-time traffic flows that require a quantifiable bound on delay; controlled load approximates a best-effort service over an uncongested network.

The setup protocol enables a host or application to request a specific amount of resources from the network. It delivers the reservation request generated by the application to each router’s traffic control component.

RSVP. The de facto setup protocol in the IntServ architecture is the Resource Reservation Protocol, otherwise known as RSVP. With RSVP, the application source (the sender) transmits a Path message along the routed path to the unicast or multicast destination (the receiver). The purpose of the Path message is twofold: to mark the routed path between the sender and receiver and to collect information about the QoS viability of each router along that path.

Upon receiving the Path message, the destination host or hosts can gauge what services the network can support (for example, guaranteed service or controlled load) and then generate an RSVP reservation (Resv) message. The Resv message contains traffic and QoS objects that are processed by the traffic control component of each router as it follows the reverse path upstream toward the sender. If the router has sufficient capacity, then resources along the path back toward the receiver are reserved for that flow. If resources are not available, RSVP error messages are generated and returned to the receiver.
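As a rough illustration of the reservation step, the sketch below carries a Resv request hop by hop from the receiver toward the sender and applies a simple bandwidth check at each router. The `Router` class, its `capacity_bps` field, and the `process_resv` function are hypothetical names invented for this example; they are not part of RSVP or of any router software, and real admission control weighs more than a single bandwidth figure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Router:
    """Hypothetical router with one outbound link (illustration only)."""
    name: str
    capacity_bps: int                                        # reservable bandwidth
    reserved: Dict[str, int] = field(default_factory=dict)   # flow id -> bps

    def available_bps(self) -> int:
        return self.capacity_bps - sum(self.reserved.values())

    def admit(self, flow_id: str, rate_bps: int) -> bool:
        """Admission control: accept the flow only if capacity remains."""
        if rate_bps > self.available_bps():
            return False
        self.reserved[flow_id] = rate_bps    # install per-flow state
        return True


def process_resv(path_sender_to_receiver: List[Router],
                 flow_id: str, rate_bps: int) -> Optional[str]:
    """Carry a Resv request along the reverse of the Path route.

    Returns None if every hop admits the flow, otherwise the name of the
    router that refused (the point where an error would be generated).
    """
    for router in reversed(path_sender_to_receiver):
        if not router.admit(flow_id, rate_bps):
            return router.name
    return None


if __name__ == "__main__":
    path = [Router("edge-a", 1_000_000), Router("core", 128_000),
            Router("edge-b", 1_000_000)]
    print(process_resv(path, "voip-1", 64_000))   # None: every hop admits
    print(process_resv(path, "voip-2", 64_000))   # None: core link is now full
    print(process_resv(path, "voip-3", 64_000))   # core: request refused there
```

In this sketch a refused request leaves state behind at the hops that had already accepted it (here `edge-b`). The example does not model cleanup; an implementation would depend on explicit teardown or on reservation state expiring when it is no longer refreshed.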

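The traffic description that accompanies an IntServ reservation is expressed in token bucket terms, so the token bucket filter described earlier (depth B, fill rate R) is worth seeing in code. The following is a minimal sketch, not a reference implementation: the `TokenBucket` class and its method names are hypothetical, the choice of one token per byte is an assumption made for the example, and the treatment of nonconforming packets is left to the caller, as the text notes each implementation must decide.

```python
class TokenBucket:
    """Token bucket filter with fill rate R and depth B (illustrative sketch).

    Assumption for this example: one token pays for one byte.
    """

    def __init__(self, rate: float, depth: float, now: float = 0.0) -> None:
        self.rate = rate        # R, tokens added per second
        self.depth = depth      # B, maximum tokens the bucket can hold
        self.tokens = depth     # start with a full bucket
        self.last = now         # time of the last refill

    def _refill(self, now: float) -> None:
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now

    def conforms(self, packet_bytes: int, now: float) -> bool:
        """Return True and spend tokens if the packet fits, else False."""
        self._refill(now)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False            # nonconforming: policy is up to the caller


if __name__ == "__main__":
    bucket = TokenBucket(rate=1000, depth=1500)   # R = 1000 B/s, B = 1500 B
    print(bucket.conforms(1500, now=0.0))   # True: a burst up to B is allowed
    print(bucket.conforms(100, now=0.0))    # False: the bucket is empty
    print(bucket.conforms(500, now=0.5))    # True: 0.5 s * R refilled 500 tokens
```

Because the bucket never holds more than B tokens and gains at most R·t tokens in an interval of length t, the bytes accepted in that interval cannot exceed R·t + B, which is the bound quoted in the text.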



The per-flow reservation state maintained in the routers will be deleted unless RSVP Path and Resv messages are periodically sent by the sender and receivers, respectively.

The work of the IETF’s IntServ and RSVP working groups culminated in RFCs 2205 through 2216, which document the IntServ architecture, its components, and the RSVP protocol. In addition, a number of vendors, including industry stalwarts Microsoft, Cisco, and Intel, are currently shipping or will soon ship RSVP-capable products [7].

Open issues. However, while RSVP was touted as the solution to IP’s QoS shortcomings, its applicability and scalability over large networks, in particular the Internet, are limited [8]. For example, consider a core router in a large ISP supporting 10,000 VoIP flows set up using RSVP. Since RSVP is unidirectional, the router would have to maintain state information on 10,000 flows in each direction while processing frequent RSVP refresh messages.

In addition, the current version of RSVP lacks both adequate security mechanisms to prevent unauthorized parties from instigating theft-of-service attacks and policy control, that is, techniques to authenticate and authorize applications or end users wishing to reserve resources.

Other efforts to deliver better service over IP networks include placing IP traffic over ATM virtual connections that support QoS. This lets an IP flow receive a quantifiable level of service commensurate with the QoS specified for the ATM VC. Unfortunately, this support is limited to the ATM portion of the end-to-end path, which is mostly confined to backbone networks. The interaction of the IP QoS semantics defined in the IntServ model with those supported by various datalink layers (for example, ATM) is being addressed in the Integrated Services over Specific Link Layers (ISSLL) working group of the IETF.

Random Discard

Another popular approach to QoS is to minimize the depth of router queues by intelligently discarding packets.

Recall that routers using FIFO queues drop all packets at the tail of the queue when the queue fills. TCP sources use packet drops as an implicit signal of network congestion and reduce their information transfer rates accordingly. Although this allows overflowing router queues to drain, it also reduces the rate at which TCP sources send data, leading to low network utilization and reduced throughput. TCP sources then detect available network capacity and ramp up their sending rates. This results in full queues, tail drops, and another round of reduced sending rates.

This cycle of inefficiency is called global synchronization. The remedy is a packet-discarding strategy called Random Early Detection (RED). The idea behind RED is quite simple: packets are randomly discarded with increasing probability as the size of the queue grows.

RED defines a minimum queue depth (minTH), a maximum queue depth (maxTH), and a time-based average queue length (AvgLen), as shown in Figure 2. If AvgLen < minTH, then no packets are dropped; if AvgLen > maxTH, then all new packets are dropped; if minTH < AvgLen < maxTH, then packets are randomly dropped with increasing probability as AvgLen increases.

Figure 2. Random Early Detection, or RED, queue. (Labels: router queue, minTH, AvgLen, maxTH.)

This probabilistic queue management scheme gracefully instructs some TCP sources to reduce their sending rates so that the router queue does not overflow. This allows the router to support new TCP connections, handle periodic bursts of data, and maintain high network utilization.

RED is implemented on many routers and has been collectively endorsed by the Internet community as a sound queue management strategy for improving and maintaining high network utilization during periods of congestion [9].

Differentiated Services

Differentiated Services (DiffServ) is the current approach for supporting IP QoS. A number of factors have driven its design.

First, the solution has to scale. To achieve this, individual host-to-host microflows are aggregated into a single larger aggregate flow, and that single aggregate flow then receives special treatment. This type of aggregated behavior is not unlike the process of packet forwarding performed inside an IP router, in which all packets sharing a common destination prefix are forwarded to a single next-hop router.

Second, the solution should be applicable to all applications and should not require a special control protocol or new application programming interfaces, as is the case with RSVP.

Third, router and switch technologies are advancing rapidly, with OC-48 (2.4 Gbit/s) line rates supported today and OC-192 (10 Gbit/s) coming soon. Core routers and switches operating at these speeds do not need to be burdened with the instantiation of per-flow or per-customer state. A more efficient and scalable option is to provision per-class or per-service state.

Finally, ISPs are desperate to offer a portfolio of services their customers will pay for. QoS is one of them.

The approach taken by DiffServ (DS) is to classify individual microflows at the edge of the network into one of several unique service classes (such as gold, silver, and bronze) and then apply a per-class service in the middle of the network. The classification is performed at the network ingress based on an analysis of one or more fields in the packet. The packet is then marked (turning on some code points, or bits, in the packet header) as belonging to a particular service class and then injected into the network. The core routers that forward the packet examine the code points in the packet header to determine how the packet should be treated (for example, what transmission queue the packet should be placed in). To accomplish this, the DiffServ architecture defines several components [10].

■ The DS-field is a bit pattern contained in the header of each packet that denotes the service (termed per-hop behavior or PHB) the packet should receive at each hop as it is forwarded through the network [11]. The type-of-service (TOS) field in IPv4 and the traffic class field in IPv6 have each been redefined as the DS-field. The 8-bit DS-field contains 6 bits for DS code points (DSCP) and 2 bits that are currently undefined.
■ The per-hop behavior (PHB) defines the service the packet receives at each hop as it is forwarded through the network. A PHB may be expressed in relative (compared to other PHBs) or absolute (such as bandwidth or delay) terms.
■ A behavior aggregate (BA) is a group of packets with the same DSCP. A PHB is applied to each BA inside the network.
■ The boundary router is positioned at the edge of a DiffServ-capable network. This device is responsible for packet classification, metering, packet marking, and possibly traffic conditioning (such as policing or shaping). Network administrators are responsible for configuring the classifier, which defines the fields to be examined in each packet, and any other actions (for example, dropping packets that do not conform to a token bucket filter) deemed necessary to deliver the service to the external customer. The functions defined for the boundary router can be performed on a router, firewall, or even a host.
■ Interior nodes can be core switches or routers that provide the PHB based on the DSCP bits contained in the DS-field. These devices typically employ a queue management and scheduling discipline to provide the PHB. RED and WFQ are examples of mechanisms used by routers and switches to support a PHB.

Figure 3 illustrates the DiffServ boundary and interior functions. Packets enter the network through an ingress boundary router. Each packet passes through a multifield (MF) classifier, which works with a traffic meter to determine the next action to be performed. The role of the traffic meter is to measure the packet’s conformance with a traffic profile agreed upon by the network provider and the customer. In-profile packets, that is, those that fall inside the parameters of the profile, will be treated differently from out-of-profile packets. The DSCP bits in the DS-field may then be marked and the packet conditioned (for example, shaped or dropped) before entering the network.

Figure 3. The DiffServ boundary and interior elements. (Boundary functions: MF classifier, traffic meter, marker, traffic conditioner. Interior functions: BA classifier, queuing. Path shown: ingress router, core router, core router, egress router.)

The core routers in Figure 3 contain a simple BA classifier that determines the PHB to be applied to the packet. All packets belonging to a BA are handled the same way. Again, the PHB is an externally observable behavior performed on each node that is realized through internal queue management and scheduling techniques. In addition, observe that the complex “high-touch” per-packet processing is only performed at the edge of the network by the boundary or ingress device. The net result of the DiffServ machinery is that a particular aggregate flow is provided with a special service as it traverses the network.

The IETF DiffServ Working Group is finishing work on two PHBs: expedited forwarding (EF) and assured forwarding (AF).

The EF PHB was designed to support low-loss, low-delay, and low-jitter connections. It appears as a point-to-point virtual leased line (VLL) service between endpoints with a peak bandwidth. To minimize jitter and delay, packets must spend little or no time in router queues. Therefore the EF PHB requires that the traffic be conditioned to conform to the peak rate at the boundary, and that the network of routers be provisioned such that this peak rate is less than the minimum packet departure rate at each router in the network. The EF PHB uses a single DSCP code point to indicate that the packet should be placed in a high-priority queue on the outbound link of each router hop.

The AF PHB defines four relative classes of service, with each class supporting three levels of drop precedence. Twelve distinct DSCP bit combinations define the AF classes and the drop precedence within each class. When congestion is encountered at a router, packets with a higher drop precedence will be discarded ahead of those with a lower drop precedence. The four AF classes define no specific bandwidth or delay constraints other than that AF class 1 is distinct from AF class 2, and so on.

As a means of providing a scalable and coarse level of service suitable for ISP-size and enterprise networks, DiffServ holds much promise. It is certainly more scalable than the fine-grained, per-flow approach of RSVP/IntServ and does not require new applications or extensive router upgrades. Moreover, DiffServ gives network providers some degree of latitude in deploying and operating different network infrastructures (for example, ATM or routers) that can support different PHBs.

What’s Next?

Still, some issues remain. For example, how will network policies (such as filters, rules, and parameters) be managed and installed on a potentially large number of boundary components? Possibilities include manual configuration, Simple Network Management Protocol (SNMP), Lightweight Directory Access Protocol (LDAP), and Common Open Policy Service (COPS). Work has now begun in the Policy Framework Working Group of the IETF to define a framework and schemata for managing network QoS policies.

Another issue concerns extending DiffServ across network or ISP boundaries. Obviously this requires some form of bilateral agreement and coordinated network configuration between two consenting providers to support each other’s BA PHBs. This could be implemented by having a bandwidth broker in each domain manage DiffServ policies and then communicate that information across ISP boundaries [12].

What’s next on the IP QoS front? The ISSLL working group is studying ways in which RSVP/IntServ and DiffServ can interwork. Efforts are under way to evolve RSVP so that it can set up and maintain state in the network that can benefit aggregate traffic flows transported over large ISP backbones. One application involves the use of RSVP as a means of setting up MPLS explicit paths between ingress and egress routers for traffic engineering purposes. Another application is to employ RSVP extensions to reserve network resources for an aggregate number of microflows such as a bundle of VoIP calls. The Internet2 community has undertaken creation of the Qbone to understand how it can employ IP QoS mechanisms over next-generation high-speed networks. In the long run it will most likely be a combination of the aforementioned solutions along with more bandwidth that will enable the Internet to offer QoS.

IP QoS Resources on the Web

IETF IntServ Working Group http://www.ietf.org/html.charters/intserv-charter.html
IETF ISSLL Working Group http://www.ietf.org/html.charters/issll-charter.html
IETF DiffServ Working Group http://www.ietf.org/html.charters/diffserv-charter.html
Random Early Detection (RED) Queue Management http://www-nrg.ee.lbl.gov/floyd/red.html
Internet2 Qbone http://www.internet2.edu/qos/qbone/
RSVP http://www.isi.edu/rsvp/
QoS Forum http://www.stardust.com/qosform/

REFERENCES

1. P. Ferguson and G. Huston, Quality of Service: Delivering QoS on the Internet and in Corporate Networks, Wiley Computer Publishing, 1998.
2. V. Jacobson, “Congestion Avoidance and Control,” Computer Comm. Rev., Vol. 18, No. 4, Aug. 1988, pp. 314-329; also available at ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.
3. D. Clark, S. Shenker, and L. Zhang, “Supporting Realtime Applications in an Integrated Services Packet Network: Architecture and Mechanisms,” ACM Sigcomm Proc., 1992.
4. A. Demers, S. Shenker, and S. Keshav, “Analysis and Simulation of a Fair Queuing Algorithm,” ACM Sigcomm Proc., 1989.
5. A. Parekh and R. Gallager, “A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks: The Multiple Node Case,” IEEE/ACM Trans. on Networking, Apr. 1994, pp. 137-150.
6. B. Braden, S. Shenker, and D. Clark, “Integrated Services in the Internet Architecture: an Overview,” RFC 1633, IETF IntServ Working Group, available at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1633.txt.
7. G. Gaines and M. Festa, “A Survey of RSVP/QoS Implementations,” update 2, RSVP Working Group, 1 July 1998, available at http://www.iit.nrc.ca/IETF/RSVP_survey/ietf_rsvp-qos_survey_02.txt.
8. A. Mankin, ed., “Resource ReSerVation Protocol (RSVP) Version 1 Applicability Statement: Some Guidelines on Deployment,” RFC 2208, IETF RSVP Working Group, available at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2208.txt.
9. B. Braden et al., “Recommendations on Queue Management and Congestion Avoidance in the Internet,” RFC 2309, IETF, available at http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2309.txt.
10. S. Blake et al., “An Architecture for Differentiated Services,” RFC 2475, IETF DiffServ Working Group, available at ftp://ftp.isi.edu/in-notes/rfc2475.txt.
11. K. Nichols, “Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers,” RFC 2474, IETF DiffServ Working Group, available at ftp://ftp.isi.edu/in-notes/rfc2474.txt.
12. K. Nichols, V. Jacobson, and L. Zhang, “A Two-bit Differentiated Services Architecture for the Internet,” Internet Draft, Nov. 1997, available at http://www-nrg.ee.lbl.gov/papers/2bitarch.pdf.

