
Chapter 4

Network layer functions

❒ transport packet from sending to receiving hosts
❒ network layer entity in every host, router

[Figure: protocol stacks along a path. End hosts: application, transport, network, data link, physical. Routers: network, data link, physical.]

Network Layer functions:

❒ path determination: route taken by packets from source to dest. Routing algorithms
❒ forwarding: move packets from router’s input to appropriate router output
❒ Call setup (VC networks): Set-up routes state before sending packet
Network Layer 4-1 Network Layer 4-2

Interplay between routing and forwarding

[Figure: the routing algorithm fills the local forwarding table; the value in the arriving packet’s header (0111) is looked up to choose among output links 1, 2, 3]

local forwarding table
header value    output link
0100            3
0101            2
0111            2
1001            1

Connection setup
❒ 3rd important function in some network architectures: ATM, frame relay, X.25
❒ before datagrams flow, two end hosts and intervening routers establish virtual connection
❍ routers get involved
❒ network vs transport layer connection service:
❍ network: between two hosts (may also involve intervening routers in case of VCs)
❍ transport: between two processes


Network service model
Q: What service model for “channel” transporting datagrams from sender to receiver?

Example services for individual datagrams:
❒ guaranteed delivery
❒ guaranteed delivery with less than 40 msec delay

Example services for a flow of datagrams:
❒ in-order datagram delivery
❒ guaranteed minimum bandwidth to flow
❒ restrictions on changes in inter-packet spacing

Network layer service models:

Network        Service       Guarantees ?                                   Congestion
Architecture   Model         Bandwidth            Loss   Order   Timing     feedback
Internet       best effort   none                 no     no      no         no (inferred via loss)
ATM            CBR           constant rate        yes    yes     yes        no congestion
ATM            VBR           guaranteed rate      yes    yes     yes        no congestion
ATM            ABR           guaranteed minimum   no     yes     no         yes
ATM            UBR           none                 no     yes     no         no

Network layer connection and connection-less service
❒ datagram network provides network-layer connectionless service
❒ VC network provides network-layer connection service
❒ analogous to the transport-layer services, but:
❍ service: host-to-host
❍ no choice: network provides one or the other
❍ implementation: in network core

Virtual circuits
“source-to-dest path behaves much like telephone circuit”
❍ performance-wise
❍ network actions along source-to-dest path
❒ call setup, teardown for each call before data can flow
❒ each packet carries VC identifier (not destination host address)
❒ every router on source-dest path maintains “state” for each passing connection
❒ link, router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service)


VC implementation

a VC consists of:
1. path from source to destination
2. VC numbers, one number for each link along path
3. entries in forwarding tables in routers along path
❒ packet belonging to VC carries VC number (rather than dest address)
❒ VC number can be changed on each link.
❍ New VC number comes from forwarding table

Forwarding table

[Figure: VC numbers 12, 22, 32 on successive links of a path; interface numbers 1, 2, 3 on the router]

Forwarding table in northwest router:

Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

Routers maintain connection state information!
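The VC forwarding table above can be sketched as a dictionary lookup. This is only an illustration, assuming the table is keyed by (incoming interface, incoming VC #); the names `VC_TABLE` and `forward_vc` are hypothetical and not part of any router API.

```python
# Hypothetical in-memory copy of the northwest router's VC table:
# (incoming interface, incoming VC #) -> (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_interface, in_vc):
    """Return (outgoing interface, new VC #) for an arriving packet."""
    try:
        return VC_TABLE[(in_interface, in_vc)]
    except KeyError:
        # No state for this VC: call setup never installed an entry.
        raise LookupError(
            f"no VC state for interface {in_interface}, VC {in_vc}"
        ) from None


out_interface, out_vc = forward_vc(1, 12)
print(out_interface, out_vc)  # 3 22
```

The packet leaves with the new VC number, which is how the number changes from link to link while the router keeps per-connection state.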

Virtual circuits: signaling protocols
❒ used to set up, maintain, and tear down VC
❒ used in ATM, frame-relay, X.25
❒ not used in today’s Internet

[Figure: call setup between two hosts: 1. Initiate call, 2. incoming call, 3. Accept call, 4. Call connected, 5. Data flow begins, 6. Receive data]

Datagram networks
❒ no call setup at network layer
❒ routers: no state about end-to-end connections
❍ no network-level concept of “connection”
❒ packets forwarded using destination host address
❍ packets between same source-dest pair may take different paths

[Figure: datagram exchange between two hosts: 1. Send data, 2. Receive data]


Forwarding table (4 billion possible entries)

Destination Address Range                        Link Interface
11001000 00010111 00010000 00000000
  through                                        0
11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000
  through                                        1
11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000
  through                                        2
11001000 00010111 00011111 11111111

otherwise                                        3

Longest prefix matching

Prefix Match                       Link Interface
11001000 00010111 00010            0
11001000 00010111 00011000         1
11001000 00010111 00011            2
otherwise                          3

Examples
DA: 11001000 00010111 00010110 10100001    Which interface?
DA: 11001000 00010111 00011000 10101010    Which interface?
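The two example lookups can be checked with a short sketch of longest prefix matching. It assumes addresses and prefixes are handled as bit strings, exactly as printed in the table; `PREFIX_TABLE` and `lookup` are illustrative names, and a real router would use a trie or TCAM rather than a linear scan.

```python
# Prefixes copied from the table above; spaces are only for readability.
PREFIX_TABLE = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits):
    """Return the link interface whose prefix is the longest match."""
    addr = dest_bits.replace(" ", "")
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in PREFIX_TABLE:
        bits = prefix.replace(" ", "")
        if addr.startswith(bits) and len(bits) > best_len:
            best_len, best_interface = len(bits), interface
    return best_interface


print(lookup("11001000 00010111 00010110 10100001"))  # 0
print(lookup("11001000 00010111 00011000 10101010"))  # 1
```

The second address matches both the 21-bit prefix (interface 2) and the 24-bit prefix (interface 1); the longer match wins.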

Datagram or VC network: why?

Internet (datagram)
❒ data exchange among computers
❍ “elastic” service, no strict timing req.
❒ “smart” end systems (computers)
❍ can adapt, perform control, error recovery
❍ simple inside network, complexity at “edge”
❒ many link types
❍ different characteristics
❍ uniform service difficult

ATM (VC)
❒ evolved from telephony
❒ human conversation:
❍ strict timing, reliability requirements
❍ need for guaranteed service
❒ “dumb” end systems
❍ telephones
❍ complexity inside network

Router Architecture Overview

Two key router functions:
❒ run routing algorithms/protocol (RIP, OSPF, BGP)
❒ forwarding datagrams from incoming to outgoing link
Input Port Functions

Physical layer: bit-level reception
Data link layer: e.g., Ethernet (see chapter 5)
Decentralized switching:
❒ given datagram dest., lookup output port using forwarding table in input port memory
❒ goal: complete input port processing at ‘line speed’
❒ queuing: if datagrams arrive faster than forwarding rate into switch fabric

Three types of switching fabrics

[Figure: three types of switching fabrics]

Switching Via Memory

First generation routers:
❒ traditional computers with switching under direct control of CPU
❒ packet copied to system’s memory
❒ speed limited by memory bandwidth (2 bus crossings per datagram)

[Figure: Input Port, Memory, Output Port connected by the System Bus]

Switching Via a Bus

❒ datagram from input port memory to output port memory via a shared bus
❒ bus contention: switching speed limited by bus bandwidth
❒ 32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers


Switching Via An Interconnection Network

❒ overcome bus bandwidth limitations
❒ Banyan networks, other interconnection nets initially developed to connect processors in multiprocessor
❒ advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric.
❒ Cisco 12000: switches 60 Gbps through the interconnection network

Output Ports

❒ Buffering required when datagrams arrive from fabric faster than the transmission rate
❒ Scheduling discipline chooses among queued datagrams for transmission

Output port queueing

❒ buffering when arrival rate via switch exceeds output line speed
❒ queueing (delay) and loss due to output port buffer overflow!

How much buffering?

❒ RFC 3439 rule of thumb: average buffering equal to “typical” RTT (say 250 msec) times link capacity C
❍ e.g., C = 10 Gbps link: 2.5 Gbit buffer
❒ Recent recommendation: with N flows, buffering equal to $\mathrm{RTT} \cdot C / \sqrt{N}$
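The buffering arithmetic can be checked with a few lines. This sketch assumes the "recent recommendation" is the commonly cited RTT·C/√N rule (the square root is an assumption about the garbled slide formula), and the flow count of 10,000 is a made-up example value.

```python
import math


def buffer_rule_of_thumb(rtt_s, capacity_bps):
    """Buffer size in bits: RTT times link capacity."""
    return rtt_s * capacity_bps


def buffer_with_flows(rtt_s, capacity_bps, n_flows):
    """Buffer size in bits under the assumed RTT * C / sqrt(N) rule."""
    return rtt_s * capacity_bps / math.sqrt(n_flows)


rtt = 0.250        # 250 msec
capacity = 10e9    # 10 Gbps

print(buffer_rule_of_thumb(rtt, capacity) / 1e9)       # 2.5 (Gbit)
print(buffer_with_flows(rtt, capacity, 10_000) / 1e6)  # 25.0 (Mbit)
```

With many concurrent flows the required buffer shrinks by a factor of √N, here from 2.5 Gbit to 25 Mbit.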
Input Port Queuing
❒ Fabric slower than input ports combined -> queueing may occur at input queues
❒ Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
❒ queueing delay and loss due to input buffer overflow!

The Internet Network layer
Host, router network layer functions:

Transport layer: TCP, UDP

Network layer:
Routing protocols
•path selection
•RIP, OSPF, BGP
IP protocol
•addressing conventions
•datagram format
•packet handling conventions
ICMP protocol
•error reporting
•router “signaling”
forwarding table

Link layer

physical layer

IP datagram format

Header fields (each row of the header is 32 bits wide):
ver: IP protocol version number
head. len: header length (bytes)
type of service: “type” of data
length: total datagram length (bytes)
16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
time to live: max number remaining hops (decremented at each router)
upper layer: upper layer protocol to deliver payload to
header checksum
32 bit source IP address
32 bit destination IP address
Options (if any): E.g. timestamp, record route taken, specify list of routers to visit.
data (variable length, typically a TCP or UDP segment)

how much overhead with TCP?
❒ 20 bytes of TCP
❒ 20 bytes of IP
❒ = 40 bytes + app layer overhead

IP Fragmentation & Reassembly
❒ network links have MTU (max.transfer size) - largest possible link-level frame.
❍ different link types, different MTUs
❒ large IP datagram divided (“fragmented”) within net
❍ one datagram becomes several datagrams
❍ “reassembled” only at final destination
❍ IP header bits used to identify, order related fragments

[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]
IP Fragmentation and Reassembly

Example
❒ 4000 byte datagram
❒ MTU = 1500 bytes

One large datagram becomes several smaller datagrams:

              length   ID   fragflag   offset
original      4000     x    0          0
fragment 1    1500     x    1          0      (1480 bytes in data field)
fragment 2    1500     x    1          185    (offset = 1480/8)
fragment 3    1040     x    0          370

IP Addressing: introduction
❒ IP address: 32-bit identifier for host, router interface
❒ interface: connection between host/router and physical link
❍ routers typically have multiple interfaces
❍ host typically has one interface
❍ IP addresses associated with each interface

223.1.1.1 = 11011111 00000001 00000001 00000001
              223       1        1        1

[Figure: router with three interfaces (223.1.1.4, 223.1.2.9, 223.1.3.27) connecting hosts 223.1.1.1, 223.1.1.2, 223.1.1.3; 223.1.2.1, 223.1.2.2; 223.1.3.1, 223.1.3.2]
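The fragment lengths and offsets in the example can be reproduced with a small sketch. It assumes a 20-byte IP header with no options and that the offset field counts 8-byte units; the function name `fragment` and the dictionary keys are illustrative only.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram of total_length bytes into fragments that fit mtu."""
    payload = total_length - header_len
    # Data per fragment must be a multiple of 8 bytes (except the last one).
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset_bytes = 0
    while offset_bytes < payload:
        data_len = min(max_data, payload - offset_bytes)
        more_fragments = offset_bytes + data_len < payload
        fragments.append({
            "length": data_len + header_len,
            "fragflag": 1 if more_fragments else 0,
            "offset": offset_bytes // 8,
        })
        offset_bytes += data_len
    return fragments


for frag in fragment(4000, 1500):
    print(frag)
# {'length': 1500, 'fragflag': 1, 'offset': 0}
# {'length': 1500, 'fragflag': 1, 'offset': 185}
# {'length': 1040, 'fragflag': 0, 'offset': 370}
```

The 4000-byte datagram carries 3980 bytes of data, split as 1480 + 1480 + 1020, and each fragment gets its own 20-byte header.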

IP Addressing
❒ Address can be divided in two parts: NetID | HostID
❒ NetID identifies the network
❒ HostID identifies the host within the network

Hosts within the same network have the same NetId

[Figure: a Network containing a Host]

Subnets
❒ IP address:
❍ subnet part (high order bits)
❍ host part (low order bits)
❒ What’s a subnet ?
❍ device interfaces with same subnet part of IP address
❍ can physically reach each other without intervening router

[Figure: network consisting of 3 subnets: 223.1.1.x, 223.1.2.x, 223.1.3.x]


Subnets

Recipe
❒ To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet.

[Figure: the three subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24]

Subnet mask: /24

Subnets

How many?

[Figure: three routers interconnected by point-to-point links, with interfaces in 223.1.1.x, 223.1.2.x, 223.1.3.x, 223.1.7.x, 223.1.8.x, 223.1.9.x]

IP Addresses

given notion of “network”, let’s re-examine IP addresses:
“class-full” addressing:

class   leading bits   remaining 32-bit layout   address range
A       0              network, host             1.0.0.0 to 127.255.255.255
B       10             network, host             128.0.0.0 to 191.255.255.255
C       110            network, host             192.0.0.0 to 223.255.255.255
D       1110           multicast address         224.0.0.0 to 239.255.255.255

Counting up
❒ 32 bit IP address:
❍ $2^{32}$ = 4,294,967,296 theoretical IP addresses
❒ class A:
❍ $2^{7}-2$ = 126 networks [0.0.0.0 and 127.0.0.0 reserved]
❍ $2^{24}-2$ = 16,777,214 maximum hosts
• 2,113,928,964 addressable hosts (49.22% of max)
❒ class B
❍ $2^{14}$ = 16,384 networks
❍ $2^{16}-2$ = 65,534 maximum hosts
• 1,073,709,056 addressable hosts (24.99% of max)
❒ class C
❍ $2^{21}$ = 2,097,152 networks
❍ $2^{8}-2$ = 254 maximum hosts
• 532,676,608 addressable hosts (12.40% of max)

[Figure: “The IP address Pie!”: pie chart of the address space split among Class A, Class B, C, D, E]
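The class counts above are just products of network and host counts, which a few lines can recompute. This is a sketch using the slide's own numbers; `CLASSES` and `addressable_hosts` are illustrative names.

```python
TOTAL_ADDRESSES = 2 ** 32

# class -> (number of networks, maximum hosts per network), as on the slide
CLASSES = {
    "A": (2 ** 7 - 2, 2 ** 24 - 2),
    "B": (2 ** 14, 2 ** 16 - 2),
    "C": (2 ** 21, 2 ** 8 - 2),
}


def addressable_hosts(cls):
    """Networks times hosts-per-network for one address class."""
    networks, hosts = CLASSES[cls]
    return networks * hosts


for cls in CLASSES:
    count = addressable_hosts(cls)
    share = 100 * count / TOTAL_ADDRESSES
    print(f"class {cls}: {count:,} hosts, {share:.2f}% of all addresses")
```

Note that rounding to two decimals prints 25.00% for class B, whereas the slide's 24.99% appears to be truncated rather than rounded.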


Special Addresses
❒ Network Address:
❍ An address with the HostID bits set to 0 identifies the network with the given NetID (used in routing tables)
❍ examples:
• class B network: 131.175.0.0
• class C network: 193.17.31.0

[Figure: hosts 193.17.31.76, 193.17.31.55, 193.17.31.45 on network 193.17.31.0]

Special Addresses
❒ Direct Broadcast Address:
❍ Address with all HostID bits set to 1 is the broadcast address of the network identified by NetID.
❍ example: 193.17.31.255

[Figure: hosts 193.17.31.76, 193.17.31.55, 193.17.31.45 on network 193.17.31.0]

IP addressing: CIDR

CIDR: Classless InterDomain Routing
❍ subnet portion of address of arbitrary length
❍ address format: a.b.c.d/x, where x is # bits in subnet portion of address

subnet part                  host part
11001000 00010111 0001000    0 00000000

200.23.16.0/23

IP addresses: how to get one?

Q: How does a host get IP address?

❒ hard-coded by system admin in a file
❍ Windows: control-panel->network->configuration->tcp/ip->properties
❍ UNIX: /etc/rc.config
❒ DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
❍ “plug-and-play”
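The a.b.c.d/x notation can be explored with Python's standard `ipaddress` module. This is a minimal sketch; the helper `in_subnet` and the two test addresses are made up for illustration.

```python
import ipaddress

net = ipaddress.ip_network("200.23.16.0/23")

print(net.prefixlen)      # 23 bits in the subnet part
print(net.netmask)        # 255.255.254.0
print(net.num_addresses)  # 512 = 2**9 addresses in the host part


def in_subnet(address, cidr):
    """True if address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


print(in_subnet("200.23.17.42", "200.23.16.0/23"))  # True
print(in_subnet("200.23.18.1", "200.23.16.0/23"))   # False
```

Because the prefix is 23 bits, the block covers both 200.23.16.x and 200.23.17.x, which a classful /24 boundary could not express.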
DHCP: Dynamic Host Configuration Protocol
Goal: allow host to dynamically obtain its IP address from network server when it joins network
Can renew its lease on address in use
Allows reuse of addresses (only hold address while connected and “on”)
Support for mobile users who want to join network (more shortly)
DHCP overview:
❍ host broadcasts “DHCP discover” msg
❍ DHCP server responds with “DHCP offer” msg
❍ host requests IP address: “DHCP request” msg
❍ DHCP server sends address: “DHCP ack” msg

DHCP client-server scenario

[Figure: three-subnet network (hosts A, B, E) with a DHCP server; an arriving DHCP client needs an address in this network]

DHCP client-server scenario

DHCP server: 223.1.2.5, arriving client (messages in time order)

DHCP discover
src: 0.0.0.0, 68
dest: 255.255.255.255, 67
yiaddr: 0.0.0.0
transaction ID: 654

DHCP offer
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddr: 223.1.2.4
transaction ID: 654
Lifetime: 3600 secs

DHCP request
src: 0.0.0.0, 68
dest: 255.255.255.255, 67
yiaddr: 223.1.2.4
transaction ID: 655
Lifetime: 3600 secs

DHCP ACK
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddr: 223.1.2.4
transaction ID: 655
Lifetime: 3600 secs

IP addresses: how to get one?

Q: How does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...              …..                                   ….
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23


Hierarchical addressing: route aggregation

Hierarchical addressing allows efficient advertisement of routing information:

[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) attach to Fly-By-Night-ISP. Fly-By-Night-ISP and ISPs-R-Us both attach to the Internet.]

Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16”

Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

[Figure: Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) attach to Fly-By-Night-ISP. Organization 1 (200.23.18.0/23) now attaches to ISPs-R-Us.]

Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
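The two advertisements interact through longest prefix matching, which the following sketch illustrates with the standard `ipaddress` module. The route list mirrors the slide's advertisements; `ROUTES`, `next_isp`, and the sample destination addresses are assumptions made for the example.

```python
import ipaddress

# The ISP's /20 block splits into eight /23 blocks, one per organization.
isp_block = ipaddress.ip_network("200.23.16.0/20")
org_blocks = list(isp_block.subnets(new_prefix=23))
print(org_blocks[0], org_blocks[1], org_blocks[7])
# 200.23.16.0/23 200.23.18.0/23 200.23.30.0/23

# Advertisements seen by the rest of the Internet after Organization 1 moves.
ROUTES = [
    (ipaddress.ip_network("200.23.16.0/20"), "Fly-By-Night-ISP"),
    (ipaddress.ip_network("199.31.0.0/16"), "ISPs-R-Us"),
    (ipaddress.ip_network("200.23.18.0/23"), "ISPs-R-Us"),
]


def next_isp(address):
    """Pick the advertiser with the longest matching prefix, or None."""
    ip = ipaddress.ip_address(address)
    matches = [(net.prefixlen, isp) for net, isp in ROUTES if ip in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0])[1]


print(next_isp("200.23.18.5"))  # ISPs-R-Us (the /23 beats the /20)
print(next_isp("200.23.20.5"))  # Fly-By-Night-ISP
```

Organization 1's addresses still fall inside the aggregated /20, but the more specific /23 advertisement takes precedence.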

IP addressing: the last word...

Q: How does an ISP get block of addresses?
A: ICANN: Internet Corporation for Assigned Names and Numbers
❍ allocates addresses
❍ manages DNS
❍ assigns domain names, resolves disputes

NAT: Network Address Translation

[Figure: rest of Internet on one side of the NAT router (138.76.29.7 outside, 10.0.0.4 inside); local network (e.g., home network) 10.0.0/24 with hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 on the other]

All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
Datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)


NAT: Network Address Translation
❒ Motivation: local network uses just one IP address as far as outside world is concerned:
❍ range of addresses not needed from ISP: just one IP address for all devices
❍ can change addresses of devices in local network without notifying outside world
❍ can change ISP without changing addresses of devices in local network
❍ devices inside local net not explicitly addressable, visible by outside world (a security plus).

NAT: Network Address Translation
Implementation: NAT router must:
❍ outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
. . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
❍ remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
❍ incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table

NAT: Network Address Translation

NAT translation table
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……

1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   (S: 10.0.0.1, 3345; D: 128.119.40.186, 80)
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   (S: 138.76.29.7, 5001; D: 128.119.40.186, 80)
3: Reply arrives dest. address: 138.76.29.7, 5001
   (S: 128.119.40.186, 80; D: 138.76.29.7, 5001)
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   (S: 128.119.40.186, 80; D: 10.0.0.1, 3345)

NAT: Network Address Translation

❒ 16-bit port-number field:
❍ 60,000 simultaneous connections with a single LAN-side address!
❒ NAT is controversial:
❍ routers should only process up to layer 3
❍ violates end-to-end argument
• NAT possibility must be taken into account by app designers, eg, P2P applications
❍ address shortage should instead be solved by IPv6
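The four steps of the NAT example can be traced with a toy translation table. This is a sketch under simplifying assumptions: ports are handed out sequentially starting at 5001 (chosen to match the slide), entries never expire, and the class `NatRouter` with its method names is hypothetical.

```python
class NatRouter:
    """Toy NAT: rewrites (IP, port) pairs using a translation table."""

    def __init__(self, wan_ip, first_port=5001):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.lan_to_wan = {}  # (LAN IP, LAN port) -> WAN port
        self.wan_to_lan = {}  # WAN port -> (LAN IP, LAN port)

    def outgoing(self, src, dst):
        """Rewrite the source of a datagram leaving the local network."""
        if src not in self.lan_to_wan:
            self.lan_to_wan[src] = self.next_port
            self.wan_to_lan[self.next_port] = src
            self.next_port += 1
        return (self.wan_ip, self.lan_to_wan[src]), dst

    def incoming(self, src, dst):
        """Rewrite the destination of a reply, or return None if unmapped."""
        wan_ip, wan_port = dst
        if wan_ip != self.wan_ip or wan_port not in self.wan_to_lan:
            return None
        return src, self.wan_to_lan[wan_port]


nat = NatRouter("138.76.29.7")
server = ("128.119.40.186", 80)

print(nat.outgoing(("10.0.0.1", 3345), server))
# (('138.76.29.7', 5001), ('128.119.40.186', 80))
print(nat.incoming(server, ("138.76.29.7", 5001)))
# (('128.119.40.186', 80), ('10.0.0.1', 3345))
```

An unsolicited datagram to an unmapped port returns `None`, which is the root of the NAT traversal problem.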


NAT traversal problem
❒ client wants to connect to server with address 10.0.0.1
❍ server address 10.0.0.1 local to LAN (client can’t use it as destination addr)
❍ only one externally visible NATted address: 138.76.29.7
❒ solution 1: statically configure NAT to forward incoming connection requests at given port to server
❍ e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000

[Figure: Client outside the NAT router (138.76.29.7) cannot address server 10.0.0.1 behind it]

NAT traversal problem
❒ solution 2: Universal Plug and Play (UPnP) Internet Gateway Device (IGD) Protocol. Allows NATted host to:
❍ learn public IP address (138.76.29.7)
❍ add/remove port mappings (with lease times)

i.e., automate static NAT port map configuration

[Figure: NATted host 10.0.0.1 talks IGD to the NAT router (10.0.0.4 inside, 138.76.29.7 outside)]

NAT traversal problem
❒ solution 3: relaying (used in Skype)
❍ NATed client establishes connection to relay
❍ External client connects to relay
❍ relay bridges packets between two connections

[Figure: 1. connection to relay initiated by NATted host (10.0.0.1, behind NAT router 138.76.29.7); 2. connection to relay initiated by client; 3. relaying established]

Getting a datagram from source to dest.

forwarding table in A
Dest. Net.   next router   Nhops
223.1.1                    1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2

IP datagram: misc fields | source IP addr | dest IP addr | data

❒ datagram remains unchanged, as it travels source to destination
❒ addr fields of interest here

[Figure: host A (223.1.1.1), host B (223.1.1.3), host E (223.1.2.3) and a router with interfaces 223.1.1.4, 223.1.2.9, 223.1.3.27]


Getting a datagram from source to dest.

forwarding table in A
Dest. Net.   next router   Nhops
223.1.1                    1
223.1.2      223.1.1.4     2
223.1.3      223.1.1.4     2

IP datagram: misc fields | 223.1.1.1 | 223.1.1.3 | data

Starting at A, send IP datagram addressed to B:
❒ look up net. address of B in forwarding table
❒ find B is on same net. as A
❒ link layer will send datagram directly to B inside link-layer frame
❍ B and A are directly connected

Getting a datagram from source to dest.

forwarding table in A: as above

IP datagram: misc fields | 223.1.1.1 | 223.1.2.3 | data

Starting at A, dest. E:
❒ look up network address of E in forwarding table
❒ E on different network
❍ A, E not directly attached
❒ routing table: next hop router to E is 223.1.1.4
❒ link layer sends datagram to router 223.1.1.4 inside link-layer frame
❒ datagram arrives at 223.1.1.4
❒ continued…..
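Host A's decision in the two cases above (deliver directly or hand to the next router) can be sketched as a table lookup. This assumes /24 networks so that the network part is simply the first three octets; `FORWARDING_TABLE_A` and `next_hop` are illustrative names.

```python
# Forwarding table in A: destination network -> next router
# (None means the network is directly attached).
FORWARDING_TABLE_A = {
    "223.1.1": None,
    "223.1.2": "223.1.1.4",
    "223.1.3": "223.1.1.4",
}


def next_hop(dest_ip, table=FORWARDING_TABLE_A):
    """Return the IP address the link layer should deliver the frame to."""
    network = dest_ip.rsplit(".", 1)[0]  # first three octets (assumes /24)
    if network not in table:
        raise LookupError(f"no route to {dest_ip}")
    router = table[network]
    return dest_ip if router is None else router


print(next_hop("223.1.1.3"))  # 223.1.1.3 (B is on A's own network)
print(next_hop("223.1.2.3"))  # 223.1.1.4 (E is reached via the router)
```

In both cases the IP datagram itself is unchanged; only the link-layer frame's destination differs.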

Getting a datagram from source to dest.

forwarding table in router
Dest. Net   router   Nhops   interface
223.1.1     -        1       223.1.1.4
223.1.2     -        1       223.1.2.9
223.1.3     -        1       223.1.3.27

IP datagram: misc fields | 223.1.1.1 | 223.1.2.3 | data

Arriving at 223.1.1.4, destined for 223.1.2.2
❒ look up network address of E in router’s forwarding table
❒ E on same network as router’s interface 223.1.2.9
❍ router, E directly attached
❒ link layer sends datagram to 223.1.2.2 inside link-layer frame via interface 223.1.2.9
❒ datagram arrives at 223.1.2.2!!! (hooray!)

Graph abstraction

[Figure: graph with nodes u, v, w, x, y, z and a cost on each link]

Graph: G = (N,E)

N = set of routers = { u, v, w, x, y, z }

E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

Remark: Graph abstraction is useful in other network contexts
Example: P2P, where N is set of peers and E is set of TCP connections
Graph abstraction: costs

[Figure: graph with nodes u, v, w, x, y, z and a cost on each link]

• c(x,x’) = cost of link (x,x’)
- e.g., c(w,z) = 5
• cost could always be 1, or inversely related to bandwidth, or inversely related to congestion

Cost of path $(x_1, x_2, x_3, \dots, x_p) = c(x_1,x_2) + c(x_2,x_3) + \dots + c(x_{p-1},x_p)$

Question: What’s the least-cost path between u and z ?

Routing algorithm: algorithm that finds least-cost path

Routing Algorithm classification

Global or decentralized information?
Global:
❒ all routers have complete topology, link cost info
❒ “link state” algorithms
Decentralized:
❒ router knows physically-connected neighbors, link costs to neighbors
❒ iterative process of computation, exchange of info with neighbors
❒ “distance vector” algorithms

Static or dynamic?
Static:
❒ routes change slowly over time
Dynamic:
❒ routes change more quickly
❍ periodic update
❍ in response to link cost changes

A Link-State Routing Algorithm

Dijkstra’s algorithm
❒ net topology, link costs known to all nodes
❍ accomplished via “link state broadcast”
❍ all nodes have same info
❒ computes least cost paths from one node (“source”) to all other nodes
❍ gives forwarding table for that node
❒ iterative: after k iterations, know least cost path to k dest.’s

Notation:
❒ c(x,y): link cost from node x to y; = ∞ if not direct neighbors
❒ D(v): current value of cost of path from source to dest. v
❒ p(v): predecessor node along path from source to v
❒ N': set of nodes whose least cost path definitively known

Dijkstra’s Algorithm

Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N' :
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
Dijkstra’s algorithm: example

Step   N'       D(v),p(v)   D(w),p(w)   D(x),p(x)   D(y),p(y)   D(z),p(z)
0      u        2,u         5,u         1,u         ∞           ∞
1      ux       2,u         4,x                     2,x         ∞
2      uxy      2,u         3,y                                 4,y
3      uxyv                 3,y                                 4,y
4      uxyvw                                                    4,y
5      uxyvwz

[Figure: graph with nodes u, v, w, x, y, z and a cost on each link]

Dijkstra’s algorithm: example (2)

Resulting shortest-path tree from u:

[Figure: shortest-path tree rooted at u over nodes v, w, x, y, z]

Resulting forwarding table in u:
destination   link
v             (u,v)
x             (u,x)
y             (u,x)
w             (u,x)
z             (u,x)
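The example table can be reproduced with a short program. This sketch uses a heap-based variant of Dijkstra's algorithm rather than the linear scan in the pseudocode, and the link costs in `GRAPH` are read off the figure and the example table, so treat them as an assumption; `dijkstra` and `first_hop` are illustrative names.

```python
import heapq

# Link costs as read from the example figure (assumed symmetric).
GRAPH = {
    "u": {"v": 2, "x": 1, "w": 5},
    "v": {"u": 2, "x": 2, "w": 3},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1, "z": 5},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"x": 1, "w": 1, "z": 2},
    "z": {"w": 5, "y": 2},
}


def dijkstra(graph, source):
    """Return (D, p): least costs from source and predecessor of each node."""
    dist = {node: float("inf") for node in graph}
    pred = {}
    dist[source] = 0
    done = set()  # plays the role of N'
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue  # stale heap entry
        done.add(w)
        for v, cost in graph[w].items():
            if v not in done and d + cost < dist[v]:
                dist[v] = d + cost
                pred[v] = w
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


def first_hop(pred, source, dest):
    """Walk predecessors back from dest to find the neighbor of source."""
    node = dest
    while pred[node] != source:
        node = pred[node]
    return node


dist, pred = dijkstra(GRAPH, "u")
print(dist)  # {'u': 0, 'v': 2, 'w': 3, 'x': 1, 'y': 2, 'z': 4}
for dest in "vxywz":
    print(dest, ("u", first_hop(pred, "u", dest)))
```

The first hops match the forwarding table in u: only v is reached over (u,v); every other destination leaves over (u,x).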

Dijkstra’s algorithm, discussion

Algorithm complexity: n nodes
❒ each iteration: need to check all nodes, w, not in N'
❒ n(n+1)/2 comparisons: $O(n^2)$
❒ more efficient implementations possible: $O(n \log n)$
Oscillations possible:
❒ e.g., link cost = amount of carried traffic

[Figure: four nodes A, B, C, D with traffic 1, 1, e entering; link costs shown initially and after each of three routing recomputations, with the chosen routes flipping direction each time]

Distance Vector Algorithm

Bellman-Ford Equation (dynamic programming)
Define
$d_x(y)$ := cost of least-cost path from x to y

Then

$d_x(y) = \min_v \{ c(x,v) + d_v(y) \}$

where min is taken over all neighbors v of x
Bellman-Ford example

[Figure: graph with nodes u, v, w, x, y, z and a cost on each link]

Clearly, $d_v(z) = 5$, $d_x(z) = 3$, $d_w(z) = 3$

B-F equation says:

$d_u(z) = \min \{ c(u,v) + d_v(z),\ c(u,x) + d_x(z),\ c(u,w) + d_w(z) \} = \min \{ 2 + 5,\ 1 + 3,\ 5 + 3 \} = 4$

Node that achieves minimum is next hop in shortest path ➜ forwarding table

Distance Vector Algorithm

❒ $D_x(y)$ = estimate of least cost from x to y
❒ Node x knows cost to each neighbor v: c(x,v)
❒ Node x maintains distance vector $D_x = [D_x(y) : y \in N]$
❒ Node x also maintains its neighbors’ distance vectors
❍ For each neighbor v, x maintains $D_v = [D_v(y) : y \in N]$

Distance vector algorithm (4)

Basic idea:
❒ From time-to-time, each node sends its own distance vector estimate to neighbors
❒ Asynchronous
❒ When a node x receives new DV estimate from neighbor, it updates its own DV using B-F equation:

$D_x(y) \leftarrow \min_v \{ c(x,v) + D_v(y) \}$ for each node $y \in N$

❒ Under minor, natural conditions, the estimate $D_x(y)$ converge to the actual least cost $d_x(y)$

Distance Vector Algorithm (5)

Iterative, asynchronous: each local iteration caused by:
❒ local link cost change
❒ DV update message from neighbor
Distributed:
❒ each node notifies neighbors only when its DV changes
❍ neighbors then notify their neighbors if necessary

Each node:
  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if DV to any dest has changed, notify neighbors


Distance vector example: three nodes with c(x,y) = 2, c(y,z) = 1, c(x,z) = 7

$D_x(y) = \min\{c(x,y) + D_y(y),\ c(x,z) + D_z(y)\} = \min\{2+0,\ 7+1\} = 2$
$D_x(z) = \min\{c(x,y) + D_y(z),\ c(x,z) + D_z(z)\} = \min\{2+1,\ 7+0\} = 3$

Each table lists, per row ("from"), the distance vector that node currently holds for x, y, z ("cost to"). Time runs left to right.

node x table
        cost to        cost to        cost to
from    x  y  z        x  y  z        x  y  z
x       0  2  7        0  2  3        0  2  3
y       ∞  ∞  ∞        2  0  1        2  0  1
z       ∞  ∞  ∞        7  1  0        3  1  0

node y table
        cost to        cost to        cost to
from    x  y  z        x  y  z        x  y  z
x       ∞  ∞  ∞        0  2  7        0  2  3
y       2  0  1        2  0  1        2  0  1
z       ∞  ∞  ∞        7  1  0        3  1  0

node z table
        cost to        cost to        cost to
from    x  y  z        x  y  z        x  y  z
x       ∞  ∞  ∞        0  2  7        0  2  3
y       ∞  ∞  ∞        2  0  1        2  0  1
z       7  1  0        3  1  0        3  1  0
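The three-node example can be simulated to confirm the final tables. This sketch runs the distance vector update in synchronous rounds, which is a simplification of the asynchronous algorithm on the slides; `COST` and `run_distance_vector` are illustrative names.

```python
INF = float("inf")

# Link costs of the three-node example.
COST = {
    "x": {"y": 2, "z": 7},
    "y": {"x": 2, "z": 1},
    "z": {"x": 7, "y": 1},
}


def run_distance_vector(cost):
    """Iterate the Bellman-Ford update until no distance vector changes."""
    nodes = sorted(cost)
    # Initially each node knows only the cost to itself and its neighbors.
    dv = {
        x: {y: 0 if x == y else cost[x].get(y, INF) for y in nodes}
        for x in nodes
    }
    changed = True
    while changed:
        changed = False
        # Snapshot = the vectors every node "sent" at the start of the round.
        received = {x: dict(vec) for x, vec in dv.items()}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(cost[x][v] + received[v][y] for v in cost[x])
                if best != dv[x][y]:
                    dv[x][y] = best
                    changed = True
    return dv


final = run_distance_vector(COST)
for node, vector in final.items():
    print(node, vector)
# x {'x': 0, 'y': 2, 'z': 3}
# y {'x': 2, 'y': 0, 'z': 1}
# z {'x': 3, 'y': 1, 'z': 0}
```

The direct x-z link of cost 7 is replaced by the two-hop path through y of cost 3, as in the rightmost tables.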

Distance Vector: link cost changes

Link cost changes:
❒ node detects local link cost change
❒ updates routing info, recalculates distance vector
❒ if DV changes, notify neighbors

[Figure: nodes x, y, z with c(y,z) = 1, c(x,z) = 50; c(x,y) changes from 4 to 1]

“good news travels fast”
At time t0, y detects the link-cost change, updates its DV, and informs its neighbors.
At time t1, z receives the update from y and updates its table. It computes a new least cost to x and sends its neighbors its DV.
At time t2, y receives z’s update and updates its distance table. y’s least costs do not change and hence y does not send any message to z.

Distance Vector: link cost changes

Link cost changes:
❒ good news travels fast
❒ bad news travels slow - “count to infinity” problem!
❒ 44 iterations before algorithm stabilizes: see text

[Figure: nodes x, y, z with c(y,z) = 1, c(x,z) = 50; c(x,y) changes from 4 to 60]

Poisoned reverse:
❒ If Z routes through Y to get to X :
❍ Z tells Y its (Z’s) distance to X is infinite (so Y won’t route to X via Z)
❒ will this completely solve count to infinity problem?
Comparison of LS and DV algorithms

Message complexity
❒ LS: with n nodes, E links, O(nE) msgs sent
❒ DV: exchange between neighbors only
❍ convergence time varies

Speed of Convergence
❒ LS: $O(n^2)$ algorithm requires O(nE) msgs
❍ may have oscillations
❒ DV: convergence time varies
❍ may be routing loops
❍ count-to-infinity problem

Robustness: what happens if router malfunctions?
LS:
❍ node can advertise incorrect link cost
❍ each node computes only its own table
DV:
❍ DV node can advertise incorrect path cost
❍ each node’s table used by others
• error propagate thru network

Hierarchical Routing

Hierarchical Routing

Our routing study thus far - idealization
❒ all routers identical
❒ network “flat”
… not true in practice

scale: with 200 million destinations:
❒ can’t store all dest’s in routing tables!
❒ routing table exchange would swamp links!

administrative autonomy
❒ internet = network of networks
❒ each network admin may want to control routing in its own network

Hierarchical Routing

❒ aggregate routers into regions, “autonomous systems” (AS)
❒ routers in same AS run same routing protocol
❍ “intra-AS” routing protocol
❍ routers in different AS can run different intra-AS routing protocol

gateway routers
❒ special routers in AS
❒ run intra-AS routing protocol with all other routers in AS
❒ also responsible for routing to destinations outside AS
❍ run inter-AS routing protocol with other gateway routers


Intra-AS and Inter-AS routing

Gateways:
•perform inter-AS routing amongst themselves
•perform intra-AS routing with other routers in their AS

[Figure: ASes A, B, C with gateway routers A.a, A.c, B.a, C.b. Gateway A.c runs both inter-AS and intra-AS routing at the network layer, above the link layer and physical layer.]

Intra-AS and Inter-AS routing

[Figure: path from Host h1 in AS A to Host h2 in AS B: intra-AS routing within AS A, inter-AS routing between A and B, intra-AS routing within AS B]

❒ We’ll examine specific inter-AS and intra-AS Internet routing protocols shortly

Interconnected ASes

[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c). Inside a router, the intra-AS routing algorithm and the inter-AS routing algorithm both feed the forwarding table.]

❒ forwarding table configured by both intra- and inter-AS routing algorithm
❍ intra-AS sets entries for internal dests
❍ inter-AS & intra-AS sets entries for external dests

Inter-AS tasks

❒ suppose router in AS1 receives datagram destined outside of AS1:
❍ router should forward packet to gateway router, but which one?

AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
Job of inter-AS routing!

[Figure: AS1, AS2, AS3 and their routers, as above]
Example: Setting forwarding table in router 1d

❒ suppose AS1 learns (via inter-AS protocol) that subnet x reachable via AS3 (gateway 1c) but not via AS2.
❒ inter-AS protocol propagates reachability info to all internal routers.
❒ router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c.
❍ installs forwarding table entry (x,I)

[Figure: AS1, AS2, AS3 topology; subnet x attached to AS3.]

Example: Choosing among multiple ASes

❒ now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2.
❒ to configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x.
❍ this is also job of inter-AS routing protocol!

[Figure: AS1, AS2, AS3 topology; subnet x reachable through both AS3 and AS2.]

Example: Choosing among multiple ASes

❒ hot potato routing: send packet towards closest of two routers.

Learn from inter-AS protocol that subnet x is reachable via multiple gateways
--> Use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways
--> Hot potato routing: choose the gateway that has the smallest least cost
--> Determine from forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table

Routing in the Internet

❒ The Global Internet consists of Autonomous Systems (AS) interconnected with each other:
❍ Stub AS: small corporation: one connection to other AS’s
❍ Multihomed AS: large corporation (no transit): multiple connections to other AS’s
❍ Transit AS: provider, hooking many AS’s together
❒ Two-level routing:
❍ Intra-AS: administrator responsible for choice of routing algorithm within network
❍ Inter-AS: unique standard for inter-AS routing: BGP
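
The hot potato steps (learn the candidate gateways, look up the intra-AS cost to each, pick the cheapest, install (x,I)) can be sketched in a few lines of Python. This is only an illustration: the function name, the cost values and the interface labels are assumptions, not taken from the slides.

```python
def hot_potato_entry(subnet, gateways, intra_as_cost, interface_to):
    """Return the (subnet, interface) entry chosen by hot potato routing.

    gateways      -- gateways through which the inter-AS protocol says subnet is reachable
    intra_as_cost -- least-cost path cost to each gateway, from the intra-AS protocol
    interface_to  -- outgoing interface on the least-cost path to each gateway
    """
    # Choose the gateway with the smallest least-cost path from this router.
    best_gateway = min(gateways, key=lambda g: intra_as_cost[g])
    return (subnet, interface_to[best_gateway])


# Hypothetical view from router 1d: x is reachable via gateways 1c (to AS3) and 1b (to AS2).
costs = {"1c": 2, "1b": 3}          # assumed intra-AS costs
ifaces = {"1c": "I1", "1b": "I2"}   # assumed interface labels
entry = hot_potato_entry("x", ["1c", "1b"], costs, ifaces)
print(entry)  # ('x', 'I1')
```

Note that the choice ignores everything beyond the gateway: the cost inside AS2 or AS3 plays no role, which is exactly what "hot potato" means.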



Internet AS Hierarchy

[Figure: Inter-AS border (exterior gateway) routers; Intra-AS interior (gateway) routers.]

Intra-AS Routing

❒ also known as Interior Gateway Protocols (IGP)
❒ most common Intra-AS routing protocols:
❍ RIP: Routing Information Protocol
❍ OSPF: Open Shortest Path First
❍ IGRP: Interior Gateway Routing Protocol (Cisco proprietary)


RIP (Routing Information Protocol)

❒ distance vector algorithm
❒ included in BSD-UNIX Distribution in 1982
❒ distance metric: # of hops (max = 15 hops)

[Figure: routers A, B, C, D and subnets u, v, w, x, y, z.]

From router A to subnets:

destination   hops
u             1
v             2
w             2
x             3
y             3
z             2

RIP advertisements

❒ distance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement)
❒ each advertisement: list of up to 25 destination subnets within AS



RIP: Example

[Figure: networks w, x, y, z connected by routers A, D, B, C.]

Routing/Forwarding table in D

Destination Network   Next Router   Num. of hops to dest.
w                     A             2
y                     B             2
z                     B             7
x                     --            1
….                    ….            ....

RIP: Example

Advertisement from A to D

Dest   Next   hops
w      -      1
x      -      1
z      C      4
….     …      ...

Routing/Forwarding table in D (after the advertisement: the entry for z changes from next router B, 7 hops, to next router A, 5 hops)

Destination Network   Next Router   Num. of hops to dest.
w                     A             2
y                     B             2
z                     A             5
x                     --            1
….                    ….            ....
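
The table update in this example is one distance-vector step: D adds one hop to each distance A advertises and keeps the result if it is shorter than what it already has. A minimal Python sketch follows; the function and variable names are assumptions for illustration, and accepting any update that comes from the current next router is standard RIP behaviour rather than something shown on the slide.

```python
INFINITY = 16  # RIP's "infinite" distance


def process_advertisement(table, neighbor, advertised):
    """Apply one RIP advertisement to a routing table.

    table      -- dict: destination -> (next router, hops); None means directly attached
    neighbor   -- the router that sent the advertisement
    advertised -- dict: destination -> hops as seen from the neighbor
    """
    for dest, hops in advertised.items():
        new_hops = min(hops + 1, INFINITY)  # one extra hop to reach the neighbor
        current = table.get(dest)
        # Accept a new destination, a shorter route, or any update
        # from the router we already use as next hop for this destination.
        if current is None or new_hops < current[1] or current[0] == neighbor:
            table[dest] = (neighbor, new_hops)
    return table


# Table in D and the advertisement from A, as in the example.
table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}
advert_from_a = {"w": 1, "x": 1, "z": 4}
process_advertisement(table_d, "A", advert_from_a)
print(table_d["z"])  # ('A', 5)
```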

RIP: Link Failure and Recovery

If no advertisement heard after 180 sec --> neighbor/link declared dead
❍ routes via neighbor invalidated
❍ new advertisements sent to neighbors
❍ neighbors in turn send out new advertisements (if tables changed)
❍ link failure info quickly (?) propagates to entire net
❍ poison reverse used to prevent ping-pong loops (infinite distance = 16 hops)

RIP Table processing

❒ RIP routing tables managed by application-level process called route-d (daemon)
❒ advertisements sent in UDP packets, periodically repeated

[Figure: two hosts, each running routed above Transport (UDP), network (IP) with forwarding table, link, physical.]



RIP message

bit:  0       7 8      15 16               31
      Command (1-6) | Version (1) | 0
      Address family (2)          | 0
      IP address
      0
      0
      Metric
      (the five rows from Address family to Metric form one 20-byte route entry)
      Up to 24 more routes with same 20 bytes format

Command: 1=request to send all or part of the routing table; 2=reply (3-6 obsolete or non documented)
Address family: 2=IP addresses
metric: distance of emitting router from the specified IP address in number of hops (valid from 1 to 15; 16=infinite)

Message size

❍ 8 bytes UDP header
❍ 4 bytes RIP header
❍ 20 bytes x up to 25 entries
❒ total: maximum of 512 bytes UDP datagram
❒ 25 entries: too little to transfer an entire routing table
❍ more than 1 UDP datagram generally needed
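
To check the size arithmetic, the message layout can be packed with Python's struct module. This is a sketch of the version 1 format as laid out above; the function name and the example address are assumptions chosen for illustration.

```python
import socket
import struct

# Network byte order ("!"), no padding.
RIP_HEADER = struct.Struct("!BBH")     # command, version, must-be-zero
RIP_ENTRY = struct.Struct("!HH4sIII")  # address family, zero, IP address, zero, zero, metric


def build_rip_message(command, routes, version=1):
    """Pack a RIP message from a list of (dotted IP address, metric) pairs."""
    if len(routes) > 25:
        raise ValueError("a RIP message carries at most 25 entries")
    message = RIP_HEADER.pack(command, version, 0)
    for ip, metric in routes:
        # Address family 2 = IP addresses.
        message += RIP_ENTRY.pack(2, 0, socket.inet_aton(ip), 0, 0, metric)
    return message


# A full reply (command = 2) with 25 entries for a placeholder address.
full_reply = build_rip_message(2, [("192.0.2.0", 1)] * 25)
print(RIP_ENTRY.size)       # 20
print(len(full_reply))      # 504 = 4 + 25 * 20
print(len(full_reply) + 8)  # 512 once the 8-byte UDP header is added
```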

Initialization

❒ When routing daemon started, send special RIP request on every interface
❍ command = 1 (request)
❍ metric set to 16 (infinite)
❒ This asks for complete routing table from all connected routers
❍ allows discovery of adjacent routers!

OSPF (Open Shortest Path First)

❒ “open”: publicly available
❒ uses Link State algorithm
❍ LS packet dissemination
❍ topology map at each node
❍ route computation using Dijkstra’s algorithm
❒ OSPF advertisement carries one entry per neighbor router
❒ advertisements disseminated to entire AS (via flooding)
❍ carried in OSPF messages directly over IP (rather than TCP or UDP)
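
Once flooding has given every node the same topology map, each router runs Dijkstra's algorithm locally. The sketch below shows that computation on a small made-up topology; the node names and link costs are assumptions, and a real OSPF implementation would also record the next hop for each destination.

```python
import heapq


def dijkstra(graph, source):
    """Return the least cost from source to every reachable node.

    graph -- dict: node -> dict of neighbor -> link cost (the topology map)
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical four-router topology with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
print(dijkstra(topology, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```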



OSPF “advanced” features (not in RIP)

❒ security: all OSPF messages authenticated (to
prevent malicious intrusion)
❒ multiple same-cost paths allowed (only one path in
RIP)
❒ For each link, multiple cost metrics for different
TOS (e.g., satellite link cost set “low” for best effort;
high for real time)
❒ integrated uni- and multicast support:
❍ Multicast OSPF (MOSPF) uses same topology data
base as OSPF
❒ hierarchical OSPF in large domains.

Hierarchical OSPF

❒ two-level hierarchy: local area, backbone.
❍ Link-state advertisements only in area
❍ each node has detailed area topology; only knows direction (shortest path) to nets in other areas.
❒ area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers.
❒ backbone routers: run OSPF routing limited to backbone.
❒ boundary routers: connect to other AS’s.

Internet inter-AS routing: BGP

❒ BGP (Border Gateway Protocol): the de facto standard
❒ BGP provides each AS a means to:
1. Obtain subnet reachability information from neighboring ASs.
2. Propagate reachability information to all AS-internal routers.
3. Determine “good” routes to subnets based on reachability information and policy.
❒ allows subnet to advertise its existence to rest of Internet: “I am here”



BGP basics

❒ pairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections: BGP sessions
❍ BGP sessions need not correspond to physical links.
❒ when AS2 advertises a prefix to AS1:
❍ AS2 promises it will forward datagrams towards that prefix.
❍ AS2 can aggregate prefixes in its advertisement

[Figure: AS1, AS2, AS3 topology with eBGP sessions between ASes and iBGP sessions inside each AS.]

Distributing reachability info

❒ using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
❍ 1c can then use iBGP to distribute new prefix info to all routers in AS1
❍ 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session
❒ when router learns of new prefix, it creates entry for prefix in its forwarding table.

[Figure: same topology with eBGP and iBGP sessions marked.]

Path attributes & BGP routes

❒ advertised prefix includes BGP attributes.
❍ prefix + attributes = “route”
❒ two important attributes:
❍ AS-PATH: contains ASs through which prefix advertisement has passed: e.g., AS 67, AS 17
❍ NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS)
❒ when gateway router receives route advertisement, uses import policy to accept/decline.

BGP route selection

❒ router may learn about more than 1 route to some prefix. Router must select route.
❒ elimination rules:
1. local preference value attribute: policy decision
2. shortest AS-PATH
3. closest NEXT-HOP router: hot potato routing
4. additional criteria
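
The first three elimination rules for BGP route selection amount to a lexicographic comparison, which the following Python sketch makes explicit. The Route class, the field names and the example values are assumptions for illustration; the sketch also assumes that a higher local preference value wins, as in standard BGP, and it leaves out the "additional criteria".

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    local_pref: int  # policy value set by the local AS
    as_path: tuple   # ASs the advertisement has passed through
    next_hop: str    # router leading to the next-hop AS


def select_route(routes, igp_cost):
    """Pick one route by applying the elimination rules in order.

    igp_cost -- dict: NEXT-HOP router -> intra-AS cost to reach it
    """
    return min(
        routes,
        key=lambda r: (
            -r.local_pref,         # 1. highest local preference
            len(r.as_path),        # 2. shortest AS-PATH
            igp_cost[r.next_hop],  # 3. closest NEXT-HOP (hot potato)
        ),
    )


# Hypothetical candidate routes to prefix x.
r1 = Route("x", 100, ("AS2", "AS17"), "2a")
r2 = Route("x", 100, ("AS3",), "3a")
r3 = Route("x", 100, ("AS2",), "2a")
costs = {"2a": 3, "3a": 2}  # assumed intra-AS costs

best = select_route([r1, r2, r3], costs)
print(best.next_hop)  # 3a: r2 ties with r3 on AS-PATH length, but 3a is closer
```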



BGP messages

❒ BGP messages exchanged using TCP.
❒ BGP messages:
❍ OPEN: opens TCP connection to peer and authenticates sender
❍ UPDATE: advertises new path (or withdraws old)
❍ KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs OPEN request
❍ NOTIFICATION: reports errors in previous msg; also used to close connection

BGP routing policy

[Figure: provider networks A, B, C; customer networks W, X, Y. Legend: provider network, customer network.]

❒ A,B,C are provider networks
❒ X,W,Y are customer (of provider networks)
❒ X is dual-homed: attached to two networks
❍ X does not want to route from B via X to C
❍ .. so X will not advertise to B a route to C


BGP routing policy (2)

[Figure: provider networks A, B, C; customer networks W, X, Y. Legend: provider network, customer network.]

❒ A advertises path AW to B
❒ B advertises path BAW to X
❒ Should B advertise path BAW to C?
❍ No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
❍ B wants to force C to route to W via A
❍ B wants to route only to/from its customers!

Why different Intra- and Inter-AS routing?

Policy:
❒ Inter-AS: admin wants control over how its traffic routed, who routes through its net.
❒ Intra-AS: single admin, so no policy decisions needed
Scale:
❒ hierarchical routing saves table size, reduced update traffic
Performance:
❒ Intra-AS: can focus on performance
❒ Inter-AS: policy may dominate over performance
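
B's rule in the BGP routing policy example ("route only to/from its customers") can be written as a one-line export check. The sketch below is an assumption-laden illustration, not how any real BGP implementation expresses policy: the function name and arguments are made up, and X is taken to be B's only customer in the figure.

```python
def should_export(learned_from, advertise_to, customers):
    """Export a route only if it came from a customer or is going to a customer."""
    return learned_from in customers or advertise_to in customers


b_customers = {"X"}  # assumed: X is the only customer network attached to B

# B learned path AW from provider A.
print(should_export("A", "X", b_customers))  # True:  B advertises BAW to X
print(should_export("A", "C", b_customers))  # False: B does not advertise BAW to C
```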
