

Interdomain Traffic Engineering with BGP

Article  in  IEEE Communications Magazine · June 2003


DOI: 10.1109/MCOM.2003.1200112 · Source: IEEE Xplore


All content following this page was uploaded by Steve Uhlig on 26 February 2013.



TOPICS IN INTERNET TECHNOLOGY

Interdomain Traffic Engineering with BGP


Bruno Quoitin, Cristel Pelsser, and Louis Swinnen, University of Namur
Olivier Bonaventure and Steve Uhlig, Université Catholique de Louvain

ABSTRACT

Traffic engineering is performed by means of a set of techniques that can be used to better control the flow of packets inside an IP network. We discuss the utilization of these techniques across interdomain boundaries in the global Internet. We first analyze the characteristics of interdomain traffic on the basis of measurements from three different Internet service providers and show that a small number of sources are responsible for a large fraction of the traffic. Across interdomain boundaries, traffic engineering relies on a careful tuning of the route advertisements sent via the Border Gateway Protocol. We explain how this tuning can be used to control the flow of incoming and outgoing traffic, and identify its limitations.

INTRODUCTION

Initially developed as a network that connected a small number of research networks, the Internet has become a worldwide data network that is used for mission-critical applications. Supporting such mission-critical applications across the global Internet implies several important challenges. The first challenge is the size of the Internet. The Internet is a large decentralized network that connected about 160 million hosts in June 2002. Furthermore, these hosts are organized in about 13,000 distinct domains, a domain corresponding roughly to one company or one Internet service provider (ISP). All these domains are interconnected to form the global Internet. The Border Gateway Protocol (BGP) is used to route the IP packets exchanged between domains. There are basically two types of domains. Stub domains contain hosts that produce or consume IP packets. These domains do not carry IP packets that are not produced by or destined to their hosts. Transit domains interconnect different domains, and carry IP packets that are produced by and/or destined to external domains. Additional details on the relationships between domains may be found in [1].

The second challenge is that the research Internet was designed with best effort service in mind where connectivity was the most important issue. Today, connectivity is taken for granted, and the best effort service is used for mission-critical applications with stringent service level agreements (SLAs). To meet these SLAs, several ISPs rely on traffic engineering [2] to better control the flow of IP packets. Large ISPs often need to engineer the flow of packets inside their own domain to reduce congestion by better distributing the traffic on all their links. Several techniques have been developed during the last few years. Some require the utilization of multiprotocol label switching (MPLS) to forward IP packets, while others only require tuning of the traditional IP routing protocols used inside the ISP network. Besides optimizing the flow of packets inside their network, most ISPs also need to better control the flow of their interdomain traffic (i.e., IP packets that cross the boundaries between distinct ISPs). Today, MPLS is not used across interdomain boundaries, and the only solution to engineer the flow of interdomain traffic is to tune the configuration of the BGP routing protocol. This tuning is often done on a trial-and-error basis and suffers from limitations, as will be shown in the rest of this article.

In this article we first introduce the operation of the BGP protocol. We provide recent results about the characteristics of interdomain traffic. Finally, we describe in detail several interdomain traffic engineering techniques and show their limitations.

BGP ROUTING IN THE INTERNET

Internet routing is handled by two distinct protocols with different objectives. Inside a single domain, link state intradomain routing protocols distribute the entire network topology to all routers and select the shortest path according to a metric chosen by the network administrator. Across interdomain boundaries, the interdomain routing protocol is used to distribute reachability information and to select the best route to each destination according to the policies specified by each domain administrator. For scalability reasons, the interdomain routing protocol is only aware of the interconnections between distinct domains; it does not know any information about the content of each domain.

BGP [3, 4] is the current de facto standard interdomain routing protocol. In BGP terminology, a domain is called an autonomous system (AS). BGP is a path-vector protocol that works by sending route advertisements. A route advertisement indicates the reachability of a network (i.e., a network address and a netmask representing a

122 0163-6804/03/$17.00 © 2003 IEEE IEEE Communications Magazine • May 2003


block of contiguous IP addresses; e.g., 192.168.0.0/24 represents a block of 256 addresses between 192.168.0.0 and 192.168.0.255) because this network belongs to the same AS as the advertising router or because a route advertisement for this network was received from another AS. Besides the reachable network and the IP address of the router that must be used to reach this network (known as the next hop), a route advertisement also contains the AS path attribute, which contains the list of all the transit ASs that must be used to reach the announced network. The length of the AS path can be considered as the route metric. A route advertisement may also contain several optional attributes such as the local-pref, multi-exit discriminator (MED), or communities attributes [3, 4]. An important point to note about BGP is that if a BGP router of ASx sends a route announcement for network N to a neighbor BGP router of ASy, this implies that ASx accepts forwarding the IP packets to destination N on behalf of ASy.

There are two variants of BGP [3, 4]. The eBGP variant is used to announce the reachable prefixes on a link between routers that are part of distinct ASs (e.g., R51 and R14 in Fig. 1). The iBGP variant is used to distribute inside an AS the best routes learned from neighboring ASs.

■ Figure 1. A simple Internet. [Diagram: six interconnected ASs, AS 1 to AS 6, containing routers R11–R14, R21–R28, R31–R36, R41–R45, R51, and R61, respectively.]

Inside a single domain, all routers are considered equal, and the intradomain routing protocol announces all known paths to all routers. In contrast, in the global Internet, all ASs are not equal, and an AS will rarely agree to provide a transit service for all its connected ASs toward all destinations. Therefore, BGP allows a router to be selective in the route advertisements it sends to neighbor eBGP routers. To better understand the operation of BGP, it is useful to consider a simplified view of a BGP router, as shown in Fig. 2.

A BGP router processes and generates route advertisements as follows. First, the administrator specifies, for each BGP peer, an input filter (Fig. 2, left) that is used to select the acceptable advertisements. For example, a BGP router could select only the advertisements with an AS path containing a set of trusted ASs. Once a route advertisement has been accepted by the input filter, it is placed in the BGP routing table, possibly after updating some of its attributes. The BGP routing table thus contains all the acceptable routes received from the BGP neighbors.

Second, on the basis of the BGP routing table, the BGP decision process (Fig. 2, center) will select the best route toward each known network. Based on the next hop of this best route and the intradomain routing table, the router

■ Figure 2. Simplified operation of a BGP router. [Diagram: route advertisements from Peer1 to PeerN pass through per-peer inbound filters and attribute manipulation into the BGP routing table. The BGP decision process (1. Highest LOCAL-PREF; 2. Shortest AS-PATH; 3. Lowest MED; 4. eBGP over iBGP; 5. Nearest IGP neighbor) selects the best routes, which populate the forwarding table and pass through per-peer outbound filters and attribute manipulation toward Peer1 to PeerN.]
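
The five criteria listed in Fig. 2 can be read as a chain of filters applied to the candidate routes for one prefix. The following is a minimal Python sketch of that reading, not the behavior of any particular router: the `Route` class, its field names, the default local-pref of 100, the documentation prefix 192.0.2.0/24, and the private AS numbers are all assumptions introduced for illustration. Real implementations add further tie-breakers and apply restrictions (for instance on when MED values are comparable) that are ignored here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified BGP route; all field names are assumptions."""
    prefix: str
    as_path: Tuple[int, ...]        # AS numbers, nearest AS first
    local_pref: int = 100           # assumed default value
    med: int = 0
    learned_via_ebgp: bool = True   # False means the route came via iBGP
    igp_cost: int = 0               # intradomain cost to reach the next hop


def keep_best(routes: List[Route], key: Callable[[Route], int]) -> List[Route]:
    """One filtering step: keep only the routes with the smallest key."""
    best = min(key(r) for r in routes)
    return [r for r in routes if key(r) == best]


# The criteria of Fig. 2, each expressed as "smaller is better".
CRITERIA: List[Callable[[Route], int]] = [
    lambda r: -r.local_pref,                    # 1. highest local-pref
    lambda r: len(r.as_path),                   # 2. shortest AS path
    lambda r: r.med,                            # 3. lowest MED
    lambda r: 0 if r.learned_via_ebgp else 1,   # 4. eBGP over iBGP
    lambda r: r.igp_cost,                       # 5. nearest IGP neighbor
]


def decision_process(routes: Sequence[Route]) -> Route:
    """Select one best route among candidates for the same prefix."""
    if not routes:
        raise ValueError("no candidate route")
    candidates = list(routes)
    for criterion in CRITERIA:
        if len(candidates) == 1:
            break                   # later criteria are not evaluated
        candidates = keep_best(candidates, criterion)
    # Any remaining tie is broken arbitrarily (first route) in this sketch.
    return candidates[0]


if __name__ == "__main__":
    short = Route("192.0.2.0/24", as_path=(64513, 64516))
    prepended = Route("192.0.2.0/24", as_path=(64514, 64516, 64516, 64516))
    print(decision_process([prepended, short]).as_path)
```

Expressing every criterion as a key to minimize keeps the loop uniform: a criterion is evaluated only when more than one route survived the previous one, and raising the local-pref of a route overrides any difference in AS path length.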



will install a route toward this network inside its forwarding table. This table is then looked up for each received packet and indicates the outgoing interface that must be used to reach the packet's destination.

Third, the BGP router will use its output filters (Fig. 2, right) to select, among the best routes in the BGP routing table, the routes that will be advertised to each BGP peer. At most one route will be advertised for each reachable destination. The BGP router will assemble and send the corresponding route advertisement messages after a possible update of some of their attributes.

The input and output filters used in combination with the BGP decision process are the key mechanisms that allow a network administrator to support within BGP the business relationships between two ASs. Many types of business relationships can be supported by BGP. Two of the most common are the customer-to-provider and peer-to-peer relationships [1]. To understand how these two relationships are supported by BGP, consider Fig. 1. If AS5 is AS1's customer, AS5 will configure its BGP router to announce its routes to AS1. AS1 will accept these routes and announce them to its peer (AS4) and upstream provider (AS2). AS1 will also announce to AS5 all the routes it receives from AS2 and AS4. If AS1 and AS4 have a peer-to-peer relationship on the link between R13 and R43, router R13 will only announce on this link the internal routes of AS1 and the routes received from AS1's customer (i.e., AS5). The routes received from AS2 will be filtered and thus not announced on the R13–R43 link by router R13. Due to this filtering, AS1 will not carry traffic from AS4 toward AS2.

CHARACTERISTICS OF INTERDOMAIN TRAFFIC

Important elements to consider when engineering the interdomain traffic of an AS are the characteristics of this traffic. Informal discussions with network operators on this topic indicate that often a small number of ASs are responsible for a large fraction of the total traffic received or sent by a given ISP. However, while there are many studies on the topology of the Internet [1, references therein] or the evolution of the BGP routing tables [5, references therein] as well as many studies on the packet-level characteristics of the traffic (e.g., [6] among many other papers), few papers analyze the traffic and its topological distribution together [7–9].

In the framework of a detailed analysis of interdomain traffic, we have collected several traces of all the traffic received or sent through the border routers of three stub ISPs. For practical reasons, it was unfortunately not possible to collect a trace at the same time with three ISPs. The first trace was collected during one entire week in December 1999. This trace covers all the interdomain links of BELNET. The trace contains all the interdomain traffic received by BELNET, the Belgian Internet provider for universities and research laboratories. During this period, BELNET received 2.1 Tbytes of data from 4243 distinct ASs. The second trace was collected during five consecutive days in April 2001 at the border routers of YUCOM, an ISP based in Belgium that provides dialup access to home users. Again, this trace covers all the interdomain links of this ISP. During this five-day period, YUCOM received IP packets corresponding to 1.1 Tbytes of data from 7669 distinct ASs. The last trace was collected during one day at the border routers of the Pittsburgh Supercomputing Center (PSC) in March 2002. PSC provides access to the Internet and Internet2 for organizations in western Pennsylvania. The trace captured all the interdomain traffic sent by PSC through its border routers. During the studied day, the border routers of PSC sent IP packets corresponding to 574 Gbytes of data to 11,791 distinct ASs. The difference in the number of ASs for each studied ISP is mainly due to the number of ASs in the Internet at the time of the measurement. In December 1999, BELNET had 6298 distinct ASs in its routing table, while YUCOM knew 10,560 ASs in May 2001, and PSC knew almost 12,000 ASs in March 2002.

The above description of each ISP reveals an important fact. Each ISP exchanges IP packets with a large fraction of the Internet over a period of a few days. Based on this information alone, interdomain traffic engineering would appear difficult since an AS would need to influence most of the Internet to control its traffic. Fortunately, a closer look at the traffic exchanged with each AS reveals several interesting points.

In Fig. 3, we show the cumulative distribution of the interdomain traffic received by BELNET and YUCOM and sent by PSC. To plot this figure, we classified the traffic in each trace on the basis of its source and destination AS. A similar result would have been found at the prefix level (see [9] for such an analysis with BELNET).

■ Figure 3. Cumulative distribution of the traffic for each studied ISP. [Plot: cumulative traffic (%) versus ASs (%) on a logarithmic axis from 0.01 to 100, with one curve each for BELNET, YUCOM, and PSC.]

A first point to note about Fig. 3 is that the studied ISPs do not exchange the same amount of traffic with each remote AS. The 10 (resp. 100) largest sources of traffic for YUCOM contribute more than 30 percent (resp. 72 percent) of the traffic received by this ISP. Similarly,



the 10 (resp. 100) largest sources of traffic for BELNET contribute 22 percent (resp. 64 percent) of the traffic it receives during one week. For PSC, the concentration of the traffic sinks is even more pronounced, as the 10 (resp. 100) largest destinations receive 38 percent (resp. 78 percent) of the total traffic sent by PSC. Reference [8] mentions a similar distribution for the interdomain traffic of a large tier 1 ISP.
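
The concentration figures quoted above (share of the 10 or 100 largest ASs, and the cumulative curves of Fig. 3) can be derived from per-AS byte counts. The following Python sketch shows one way to do so; it is not the authors' measurement tooling, and the function names, the input format (a mapping from AS number to bytes), and the sample values are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple


def top_n_share(bytes_per_as: Dict[int, int], n: int) -> float:
    """Fraction of the total traffic exchanged with the n largest ASs."""
    total = sum(bytes_per_as.values())
    if total == 0:
        return 0.0
    largest = sorted(bytes_per_as.values(), reverse=True)[:n]
    return sum(largest) / total


def cumulative_distribution(bytes_per_as: Dict[int, int]) -> List[Tuple[float, float]]:
    """Points (percent of ASs, percent of traffic), largest ASs first."""
    volumes = sorted(bytes_per_as.values(), reverse=True)
    total = sum(volumes)
    if total == 0:
        return []
    points = []
    running = 0
    for rank, volume in enumerate(volumes, start=1):
        running += volume
        points.append((100.0 * rank / len(volumes), 100.0 * running / total))
    return points


if __name__ == "__main__":
    # Synthetic byte counts keyed by (private) AS number; not measured data.
    sample = {64512: 500, 64513: 300, 64514: 100, 64515: 60, 64516: 40}
    print(top_n_share(sample, 2))
    print(cumulative_distribution(sample))
```

Sorting by decreasing volume before accumulating is what makes the curve show how few ASs account for most of the traffic; plotting the first coordinate on a logarithmic axis gives the presentation used in Fig. 3.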
Another important point to mention about the interdomain traffic exchanged by the studied ISPs is the distance (measured in AS hops) between the remote ASs and each studied ISP. Figure 4 shows, for each ISP, the percentage of its interdomain traffic that was produced by or sent to remote ASs as a function of their distance measured in AS hops. This analysis shows that the studied ISPs only exchange a small fraction of their traffic with their direct peers (AS hop distance of 1). Most of the packets are exchanged with ASs that are only a few AS hops away. For the BELNET trace, most of the traffic is produced by sources located three and four AS hops away, while YUCOM mainly receives traffic from sources that are two and three AS hops away. PSC, on the other hand, sends traffic to ASs located up to four AS hops away.

■ Figure 4. Per-AS hop distribution of traffic. [Plot: total traffic (%) versus AS hop distance from 1 to 9, with one series each for BELNET, YUCOM, and PSC.]

This analysis has two important implications for interdomain traffic engineering. First, although an AS will exchange packets with most of the Internet, only a small number of ASs are responsible for a large fraction of interdomain traffic. This implies that an AS willing to engineer its interdomain traffic could move a large amount of traffic by influencing a small number of distant ASs. Second, the sources or destinations of interdomain traffic are not direct peers, but they are only a few AS hops away. This implies that interdomain traffic engineering solutions should be able to influence ASs a few hops beyond their upstream providers or direct peers.

INTERDOMAIN TRAFFIC ENGINEERING

Interdomain traffic engineering requirements are diverse and often motivated by the need to balance the traffic on links with other ASs and to reduce the cost of carrying traffic on these links. These requirements depend on the connectivity of an AS with others, but also on the type of business handled by this AS.

Connectivity between ASs is mainly composed of two types of relationships. The most frequent relationship between ASs is the customer-to-provider relationship, where a customer AS pays to use a link connected to its provider. This relationship is the origin of most of the interdomain cost of an AS. A stub AS usually tries to maintain at least two of these links for performance and redundancy reasons [1]. In addition, larger ASs typically try to obtain peer-to-peer relationships with other ASs and then share the cost of the link with the other AS. Negotiating the establishment of those peer-to-peer relationships is often a complicated process since technical and economic factors, as discussed in [10], need to be taken into account.

Moreover, an AS will want to optimize the way traffic enters or leaves its network, based on its business interests. Content providers that host a lot of Web or streaming servers and usually have several customer-to-provider relationships with transit ASs will try to optimize the way traffic leaves their networks. On the contrary, access providers that serve small and medium enterprises, and dialup or xDSL users typically wish to optimize how Internet traffic enters their networks. Finally, a transit AS will try to balance the traffic on the multiple links it has with its peers.

Optimizing the way traffic enters or leaves a network means favoring one link over another to reach a given destination or receive traffic from a given source. This type of interdomain traffic engineering can be performed by tweaking the BGP routers of the AS. In order to understand how BGP can be used to control the way traffic enters, leaves or crosses an AS, a better understanding of the BGP decision process is required.

A BGP router receives one route toward each destination from each of its peers. To select the best route among this set of routes, a BGP router relies on a set of criteria called the decision process. Most BGP routers apply a decision process similar in principle to the one shown in Fig. 2. The routes with the same destination are analyzed by the criteria in the sequence indicated in Fig. 2. These criteria act as filters, and the Nth criterion is only evaluated if more than one route has passed the (N – 1)th criterion. It should be noted that most BGP implementations allow the network administrator to optionally disable some of the criteria of the BGP decision process.

CONTROL OF THE OUTGOING TRAFFIC

To control how the traffic leaves its network, an AS must be able to choose which route will be used to reach a particular destination through its peers. Since an AS controls the decision process on its BGP routes, it can easily influence the selection of the best path. Two techniques are frequently used.

A first technique is to rely on the local-pref attribute. This optional attribute is only distributed inside an AS. It can be used to rank routes and is the first criterion of the BGP decision process (Fig. 2). For example, consider a stub AS with two links toward one upstream provider: a high bandwidth and a low bandwidth link. In this case, the BGP router of this AS could be configured to insert a low local-pref to routes learned via the low bandwidth link and a higher value to routes learned via the high bandwidth link. A similar situation can occur for a stub AS connected to a cheap and a more expensive upstream provider.

In practice, the manipulation of the local-pref attribute can also be based on passive or active measurements. Recently, a few companies have implemented solutions [11] that allow multihomed stub ASs and content providers to engineer their interdomain traffic. These solutions usually measure the load on each interdomain link, and some rely on active measurements to evaluate the performance of interdomain paths. Based on these measurements and some knowledge of the Internet topology (obtained either through a central server or from the BGP router to which they are attached), they attach appropriate values of the local-pref attribute to indicate which route should be considered as the best route by the BGP routers.

A second technique, often used by large transit ISPs, is to rely on the intradomain routing protocol to influence how a packet crosses the transit ISP. As shown in Fig. 2, the BGP decision process will select the nearest IGP neighbor when comparing several equivalent routes received via iBGP. For example, consider in Fig. 1 that router R27 receives one packet whose destination is R45. The BGP decision process of router R27 will compare two routes toward R45, one received via R28 and the other received via R26. By selecting router R28 as the exit border router for this packet, AS2 will ensure that this packet will consume as few resources as possible inside its own network. If a transit AS relies on a tuning of the weights of its intradomain routing protocol as described in [12], this tuning will indirectly influence its outgoing traffic.

CONTROL OF THE INCOMING TRAFFIC

The first method that can be used to control the traffic that enters an AS is to rely on selective advertisements and announce different route advertisements on different links.¹ For example, in Fig. 1, if AS1 wanted to balance the traffic coming from AS2 over the links R11–R21 and R13–R27, it could announce only its internal routes on the R11–R21 link and only the routes learned from AS5 on the R13–R27 link. Since AS2 would only learn about AS5 through router R27, it would be forced to send the packets whose destination belongs to AS5 via router R27. However, a drawback of this solution is that if link R13–R27 fails, AS2 would not be able to reach AS5 through AS1. This is not desirable, and it should be possible to utilize link R11–R21 for the packets toward AS5 at that time without being forced to change the routes that are advertised on this link.

¹ It should be noted that such behavior is considered a wrong behavior in peer-to-peer relationships by some ISPs.

A variant of the selective advertisements is the advertisement of more specific prefixes. This advertisement relies on the fact that an IP router will always select in its forwarding table the most specific route for each packet (i.e., the matching route with the longest prefix). For example, if a forwarding table contains both a route toward 16.0.0.0/8 and a route toward 16.1.2.0/24, a packet whose destination is 16.1.2.200 would be forwarded along the second route. This fact can also be used to control the incoming traffic. In the following example, we assume that prefix 16.0.0.0/8 belongs to AS3 and that several important servers are part of the 16.1.2.0/24 subnet. If AS3 prefers to receive the packets toward its servers on the R24–R31 link, it would advertise both 16.0.0.0/8 and 16.1.2.0/24 on this link and only 16.0.0.0/8 on its other external links. An advantage of this solution is that if link R24–R31 fails, then subnet 16.1.2.0/24 would still be reachable through the other links.

Another method would be to allow an AS to indicate a ranking among the various route advertisements that it sends. Since the length of the AS path is the second criterion in the BGP decision process (Fig. 2), a possible way to influence the selection of routes by a distant AS is to artificially increase the length of the AS path attribute. Coming back to Fig. 1, assume that AS6's primary interdomain link is link R61–R45, while link R61–R36 is only used as a backup link. In this case, AS6 would announce its routes normally on the primary link (i.e., with an AS path of AS6) but would attach its own AS number several times instead of once in the AS path attribute (e.g., AS6 AS6 AS6) on the R61–R36 link. The route advertised on the primary link will be considered as the best route by all routers that do not rely on manually configured settings for the weight and local-pref attributes. This technique can be combined with selective advertisements. For example, an AS could divide its address space into two prefixes p1 and p2 and advertise prefix p1 without prepending and prefix p2 with prepending on its first link and the opposite on its second link.

The last method to allow an AS to control its incoming traffic is to rely on the MED attribute. This optional attribute can only be used by an AS multiconnected to another AS to influence the link that should be used by the other AS to send packets toward a specific destination. It should, however, be noted that the utilization of the MED attribute is usually subject to a negotiation between the two peering ASs, and some ASs do not take the MED attribute into account in their decision process.

COMMUNITY-BASED TRAFFIC ENGINEERING

In addition to these techniques, several ISPs have been using the communities attribute to give their customers finer control over the redistribution of their routes. The communities attribute is an optional attribute that can be attached to routes. This attribute can contain several 32-bit-wide community values. Community values are often used to attach optional information to routes such as a code representing the city where the route was received or a code indicating whether the route was received from a peer or a customer. The community values can also be used for traffic engineering purposes. In this case, predefined community values can be attached to routes in order to request actions such as not announcing the route to a specified set of peers, prepending the AS-path when announcing the route to a specified set of peers or setting the



local-pref. However, this technique relies on an ad hoc definition of community values and manual configurations of BGP filters, which makes it difficult to use and subject to errors.

The Internet Engineering Task Force (IETF) is currently considering the definition of a new standard type of extended communities called redistribution communities [13] to solve the drawbacks of the utilization of classical communities to do traffic engineering. These redistribution communities can be attached to routes to influence the redistribution of those routes by the upstream AS. The redistribution communities attached to a route contain both the traffic engineering action to be performed and the BGP peers affected by this action. One of the supported actions allows an AS to indicate to its upstream peer that it should not announce the attached route to some of its BGP peers.

Another type of action allows an AS to request its upstream to perform AS-path prepending when redistributing a route to a specified peer. To understand the usefulness of such redistribution communities, let us consider again Fig. 1, and assume that AS6 receives a lot of traffic from AS1 and AS2 and that it would like to receive the packets from AS1 (resp. AS2) on the R45–R61 (resp. R36–R61) link. AS6 cannot achieve such a traffic distribution by performing AS-path prepending itself. However, this becomes possible with the redistribution communities by requesting AS4 to perform the prepending when announcing the AS6 routes to external peers. AS6 could thus advertise to AS4 its routes with a redistribution community that indicates that this route should be prepended two times when announced to AS2. With this redistribution community, AS4 would advertise path AS4:AS4:AS6 to AS2 and path AS4:AS6 to AS1. AS2 would thus receive two routes toward AS6: AS4:AS4:AS6 and AS3:AS6, and would select the route via AS3. AS1, on the other hand, would select the AS4:AS6 route that is shorter than the AS2:AS3:AS6 route.

DISCUSSION

The sections above describe several techniques that can be used by ISPs to engineer their interdomain traffic. However, there are some limitations to be considered when deploying those techniques.

A first point to note is that the control of the outgoing traffic with BGP is based on the selection of the best route among the available routes. This selection can be performed on the basis of various parameters, but is limited by the diversity of routes received from upstream providers, which depends on the connectivity and the policy of these ASs.

The control of incoming traffic is based on careful tuning of the advertisements sent by an AS. This tuning can cause several problems. First, an AS that advertises more specific prefixes or has divided its address space into distinct prefixes to announce them selectively will advertise a number of prefixes larger than required. All these prefixes will be propagated throughout the global Internet and will increase the size of the BGP routing tables of potentially all ASs in the Internet. Reference [5] reports that more specific routes constitute more than half of the

of their BGP routing tables, several large ISPs have started to install filters to ignore the BGP advertisements corresponding to more specific prefixes. The deployment of those filters implies that the more specific prefixes will not be announced by those large ISPs, and thus the technique will become much less effective.

When considering the manipulation of the AS-path attribute, we have mentioned that it can be used on backup links. It is sometimes also used to better balance the traffic ([5] reports that AS-path prepending affected 6.5 percent of the BGP routes in November 2001). However, in practice it can be difficult to predict the outcome of performing AS-path prepending on a given interdomain link. Usually, ISPs that rely on AS-path prepending select the amount of prepending on a trial-and-error basis.

The redistribution communities can provide a finer granularity than AS-path prepending or selective announcements. In practice, it can be expected that those communities will be used to influence the redistribution of routes toward large transit ISPs with a large number of customers. For example, consider YUCOM, discussed earlier. This ISP has two major upstream providers that allow it to reach the entire Internet. These two providers are then each connected to several tier 1 ISPs that provide most of their connectivity. Figure 5 provides a subset of the Internet topology as seen by YUCOM on the basis of the BGP advertisements it received from its two providers. In this figure, we show the three largest tier 1 ISPs that were connected to YUCOM's providers. Based on the BGP advertisements received by YUCOM, it appears that both providers sent advertisements for routes reachable via one of those tier 1 ISPs (tier 1B in Fig. 5), while only one of those providers sent advertisements for routes reachable via each of the two other tier 1 ISPs. In addition to this topological information, Fig. 5 also reports the number of distinct ASs reachable via each tier 1 ISP via the two providers of YUCOM. For tier 1C, this number indicates that provider 2 sent to YUCOM BGP advertisements toward 2470 distinct ASs that are reachable via tier 1C.

Figure 5 reveals two interesting pieces of information. First, each tier 1 ISP provides connectivity and thus announces routes toward a large number of ASs. In total, the three largest tier 1 providers announce more than 8500 ASs. Second, the studied ISP learns routes toward more than 2000 different ASs reachable via tier 1B via its two upstream providers. By using redistribution communities targeted at those large tier 1 ISPs, our ISP could influence the redistribution of its routes to a large number of ASs with only a few communities. For example, the studied ISP could utilize a single redistribution community to request its first upstream provider to announce its local routes with AS-path prepending only toward tier 1B. The result of this modified advertisement by the first provider will be that the traffic coming from ASs attached to tier 1B would be received through the other provider. Another point to mention concerning the usefulness of the redistribution communities is that, as shown in Fig. 4, most sources of traffic are locat-
entries in a BGP table. Faced with this increase ed at only a few AS hops away. The redistribu-

tion communities can directly influence sources located two AS hops away
and indirectly sources three or four AS hops away.

■ Figure 5. Subset of the interdomain topology seen from the studied ISP and
number of different ASes advertised by each tier-1 ISP.
[Figure 5 diagram: nodes Tier 1A, Tier 1B, Tier 1C, Provider 1, Provider 2,
and Stub; each provider-to-tier-1 link is labeled with an #AS count.]

A last point to note concerning the techniques that require changes to the
attributes of BGP advertisements is that any (small) change to an
attribute will force the route advertisement to be redistributed to
potentially the entire Internet. Although it would be possible to define
techniques relying on measurements to dynamically change the BGP
advertisements of an AS for traffic engineering purposes, a widespread
deployment of such techniques would increase the number of BGP messages
exchanged and could lead to BGP instabilities. Any dynamic interdomain
traffic engineering technique that involves frequent changes to the values
of BGP attributes should be studied carefully before being deployed.

CONCLUSION

In this article, we describe several techniques that are used today to
control the flow of packets in the global Internet. We first describe the
current organization of the Internet and the key role played by BGP.

We then discuss the characteristics of interdomain traffic based on long
traces covering all the interdomain links of three distinct ISPs. Two
common characteristics appear in those traces. First, although the
Internet is composed of about 13,000 ASs today, a small percentage of
those ASs contribute to a large fraction of the traffic received or sent
by those ISPs. Second, those highly active sources or destinations are
located only a few AS hops away, although the adjacent ASs are only
responsible for a small fraction of the total traffic.

We explain how BGP is tuned today for interdomain traffic engineering
purposes. We have shown that an AS has more control over its outgoing than
over its incoming traffic. Several techniques can be used to control the
incoming traffic, but they have limitations. The selective advertisements
and the more specific prefixes have the drawback of increasing the size of
the BGP routing tables. With AS path prepending, it can be difficult to
select the appropriate value of prepending to achieve a given goal.
Finally, we show how redistribution communities could allow an AS to
flexibly influence the redistribution of its routes toward indirectly
connected ISPs.

ACKNOWLEDGMENTS

This work was supported by the European Commission within the IST ATRIUM
project. We would like to thank C. Rapier (PSC), B. Piret (YUCOM), and M.
Roger (BELNET) for their traffic traces. We also thank S. De Cnodder, C.
Filsfils, A. Danthine, and the anonymous reviewers for their useful
comments.

REFERENCES

[1] L. Subramanian et al., “Characterizing the Internet Hierarchy from
Multiple Vantage Points,” INFOCOM 2002, June 2002.
[2] D. Awduche et al., “Overview and Principles of Internet Traffic
Engineering,” IETF RFC3272, May 2002.
[3] Y. Rekhter and T. Li, “A Border Gateway Protocol 4 (BGP-4),” Internet
draft, draft-ietf-idr-bgp4-17.txt, work in progress, May 2002.
[4] J. Stewart, BGP4: Interdomain Routing in the Internet, Addison Wesley,
1999.
[5] A. Broido, E. Nemeth, and K. Claffy, “Internet Expansion, Refinement
and Churn,” Euro. Trans. Telecommun., Jan. 2002.
[6] K. Thompson, G. Miller, and R. Wilder, “Wide-Area Internet Traffic
Patterns and Characteristics,” IEEE Network, vol. 11, no. 6, Nov./Dec.
1997.
[7] W. Fang and L. Peterson, “Inter-AS Traffic Patterns and Their
Implications,” IEEE Global Internet Symp., Dec. 1999.
[8] N. Feamster, J. Borkenhagen, and J. Rexford, “Controlling the Impact of
BGP Policy Changes on IP Traffic,” AT&T tech. memo, 011106-02, Nov. 2001.
[9] S. Uhlig and O. Bonaventure, “Implications of Interdomain Traffic
Characteristics on Traffic Engineering,” Euro. Trans. Telecommun., Jan.
2002.
[10] S. Bartholomew, “The Art of Peering,” BT Tech. J., vol. 18, no. 3,
July 2000.
[11] S. Borthick, “Will Route Control Change the Internet?,” Bus. Commun.
Rev., Sept. 2002.
[12] B. Fortz, J. Rexford, and M. Thorup, “Traffic Engineering with
Traditional IP Routing Protocols,” IEEE Commun. Mag., Oct. 2002.
[13] O. Bonaventure et al., “Controlling the Redistribution of BGP Routes,”
Internet draft, draft-ietf-ptomaine-bgp-redistribution-01.txt, work in
progress, Aug. 2002.

BIOGRAPHIES

OLIVIER BONAVENTURE (Bonaventure@info.ucl.ac.be) received his Ph.D. from the
University of Liege, Belgium. His research interests include interdomain
routing and traffic engineering, IP-based mobile networks, and network
measurements. He was with the University of Namur, Belgium, and is
currently a professor at the Université Catholique de Louvain, Belgium.

CRISTEL PELSSER (cpe@info.fundp.ac.be) received her M.S. degree in computer
science from the University of Namur in 2001. She currently works as a
researcher at the university. Her research interests include interdomain
traffic engineering and routing.

BRUNO QUOITIN (bqu@info.fundp.ac.be) received his M.S. degree in computer
science from the University of Namur in 1999. He worked in industry for
two years and is currently a researcher at the university. His research
interests include interdomain routing and traffic engineering.

LOUIS SWINNEN (lsw@info.fundp.ac.be) received his M.S. degree in computer
science from the University of Namur in 2001. His research interests
include interdomain routing, network simulations, and simulator
development. He also gives some courses at the Haute Ecole Mosane
d’Enseignement Superieur, Belgium.

STEVE UHLIG (suh@info.ucl.ac.be) received his M.S. degree in computer
science from the University of Namur in 2000.
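
The AS6 example discussed with the redistribution communities (AS4 prepending only toward AS2) can be traced with a small Python sketch. This is only an illustration under stated assumptions: the `announce` and `best_route` helpers and the `prepend_for` dictionary are hypothetical stand-ins for a redistribution community, not an encoding taken from the article or from any BGP implementation, and route selection is reduced to AS-path length alone. "Prepended two times" is read here as the announcing AS appearing twice in the path, which matches the AS4:AS4:AS6 path given in the text.

```python
from typing import Dict, List, Optional

ASPath = List[int]


def announce(route: ASPath, local_as: int, peer_as: int,
             prepend_for: Optional[Dict[int, int]] = None) -> ASPath:
    """Return the AS path that local_as sends to peer_as.

    prepend_for is a hypothetical stand-in for a redistribution community:
    it maps a peer AS number to how many times local_as should appear at
    the head of the path (1 is a normal announcement).
    """
    times = (prepend_for or {}).get(peer_as, 1)
    return [local_as] * times + route


def best_route(candidates: List[ASPath]) -> ASPath:
    """Pick the shortest AS path (all other BGP decision steps are ignored)."""
    return min(candidates, key=len)


def fmt(path: ASPath) -> str:
    """Render a path in the AS4:AS6 notation used in the text."""
    return ":".join(f"AS{n}" for n in path)


as6_route: ASPath = [6]   # route originated by AS6
community = {2: 2}        # assumption: ask AS4 to appear twice toward AS2

as4_to_as2 = announce(as6_route, 4, 2, community)
as4_to_as1 = announce(as6_route, 4, 1, community)
as3_to_as2 = announce(as6_route, 3, 2)

as2_best = best_route([as4_to_as2, as3_to_as2])
as2_to_as1 = announce(as2_best, 2, 1)
as1_best = best_route([as4_to_as1, as2_to_as1])

print("AS2 selects", fmt(as2_best))
print("AS1 selects", fmt(as1_best))
```

Under these assumptions, AS2 prefers the route via AS3 and AS1 prefers the route via AS4, as described in the example.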
