You are on page 1of 37

Chapter 4 Chapter 4: Network Layer

Network Layer Chapter goals:


 understand principles behind network layer
services:
 network layer service models
A note on the use of these ppt slides:  forwarding versus routing
We’re making these slides freely available to all (faculty, students, readers).
They’re in PowerPoint form so you can add, modify, and delete slides
Computer Networking:  how a router works
(including this one) and slide content to suit your needs. They obviously
represent a lot of work on our part. In return for use, we only ask the A Top Down Approach  routing (path selection)
following:
5th edition.
 If you use these slides (e.g., in a class) in substantially unaltered form,  dealing with scale
that you mention their source (after all, we’d like people to use our book!) Jim Kurose, Keith Ross
 If you post any slides in substantially unaltered form on a www site, that
you note that they are adapted from (or perhaps identical to) our slides, and
Addison-Wesley, April  advanced topics: IPv6, mobility
note our copyright of this material. 2009.
 instantiation, implementation in the Internet
Thanks and enjoy! JFK/KWR

All material copyright 1996-2009


J.F Kurose and K.W. Ross, All Rights Reserved
Network Layer 4-1 Network Layer 4-2

Network layer
Chapter 4: Network Layer
 transport segment from application
transport

sending to receiving host network


data link
 4. 1 Introduction  4.5 Routing algorithms physical

 Link state
 on sending side network network
 4.2 Virtual circuit and network
data link data link

 Distance Vector
encapsulates segments data link
physical physical

datagram networks physical network network

 Hierarchical routing
into datagrams data link data link

 4.3 What’s inside a physical physical

 4.6 Routing in the  on rcving side, delivers


router network network

Internet segments to transport data link


physical
data link
physical
 4.4 IP: Internet layer
network
data link
 RIP physical
Protocol application

 OSPF  network layer protocols network transport


 Datagram format data link network
network

 BGP in every host, router network physical data link


data link
physical
 IPv4 addressing data link
physical
physical

 ICMP  4.7 Broadcast and  router examines header


 IPv6 multicast routing fields in all IP datagrams
passing through it
Network Layer 4-3 Network Layer 4-4
Interplay between routing and forwarding
Two Key Network-Layer Functions
routing algorithm

 forwarding: move analogy:


packets from router’s local forwarding table

input to appropriate  routing: process of header value output link


0100 3

router output planning trip from 0101


0111
2
2
1001 1
source to dest
 routing: determine
route taken by  forwarding: process value in arriving
packet’s header

packets from source of getting through 0111 1

to dest. single interchange 3 2

 routing algorithms

Network Layer 4-5 Network Layer 4-6

Connection setup Network service model


Q: What service model for “channel” transporting
 3rd important function in some network architectures:
datagrams from sender to receiver?
 ATM, frame relay, X.25
 before datagrams flow, two end hosts and intervening Example services for Example services for a
routers establish virtual connection individual datagrams: flow of datagrams:
 routers get involved  guaranteed delivery  in-order datagram
 network vs transport layer connection service:  guaranteed delivery delivery
 network: between two hosts (may also involve with less than 40 msec  guaranteed minimum
intervening routers in case of VCs) delay bandwidth to flow
 transport: between two processes
 restrictions on
changes in inter-
packet spacing

Network Layer 4-7 Network Layer 4-8


Network layer service models: Chapter 4: Network Layer
Guarantees ?
Network Service Congestion  4. 1 Introduction  4.5 Routing algorithms
Architecture Model Bandwidth Loss Order Timing feedback
 4.2 Virtual circuit and  Link state

Internet best effort none no no no no (inferred datagram networks  Distance Vector


via loss)  Hierarchical routing
 4.3 What’s inside a
ATM CBR constant yes yes yes no
router  4.6 Routing in the
rate congestion
ATM VBR guaranteed yes yes yes no  4.4 IP: Internet
Internet
rate congestion  RIP
Protocol
ATM ABR guaranteed no yes no yes  OSPF
minimum  Datagram format
 BGP
ATM UBR none no yes no no  IPv4 addressing
 ICMP  4.7 Broadcast and
 IPv6 multicast routing

Network Layer 4-9 Network Layer 4-10

Network layer connection and


connection-less service Virtual circuits
 datagram network provides network-layer “source-to-dest path behaves much like telephone
circuit”
connectionless service  performance-wise
 VC network provides network-layer  network actions along source-to-dest path
connection service
 analogous to the transport-layer services,  call setup, teardown for each call before data can flow
 each packet carries VC identifier (not destination host
but: address)
 service: host-to-host  every router on source-dest path maintains “state” for
 no choice: network provides one or the other each passing connection
 link, router resources (bandwidth, buffers) may be
 implementation: in network core
allocated to VC (dedicated resources = predictable service)

Network Layer 4-11 Network Layer 4-12


VC implementation Forwarding table VC number

12 22 32

a VC consists of: 1
2
3

1. path from source to destination


Forwarding table in interface
2. VC numbers, one number for each link along number
path northwest router:
3. entries in forwarding tables in routers along Incoming interface Incoming VC # Outgoing interface Outgoing VC #

path 1 12 3 22
2 63 1 18
 packet belonging to VC carries VC number 3 7 2 17
(rather than dest address) 1 97 3 87
… … … …
 VC number can be changed on each link.
 New VC number comes from forwarding table
Routers maintain connection state information!
Network Layer 4-13 Network Layer 4-14

Datagram networks
Virtual circuits: signaling protocols
 no call setup at network layer
 routers: no state about end-to-end connections
 used to setup, maintain teardown VC
 no network-level concept of “connection”
 used in ATM, frame-relay, X.25
 packets forwarded using destination host address
 not used in today’s Internet  packets between same source-dest pair may take
different paths

application
transport 5. Data flow begins 6. Receive data application application
transport application
network 4. Call connected 3. Accept call transport
network transport
data link 1. Initiate call 2. incoming call network
data link network
physical data link 1. Send data 2. Receive data
physical data link
physical
physical

Network Layer 4-15 Network Layer 4-16


4 billion
Forwarding table possible entries Longest prefix matching
Destination Address Range Link Interface Prefix Match Link Interface
11001000 00010111 00010 0
11001000 00010111 00010000 00000000 11001000 00010111 00011000 1
through 0 11001000 00010111 00011 2
11001000 00010111 00010111 11111111 otherwise 3

11001000 00010111 00011000 00000000


Examples
through 1
11001000 00010111 00011000 11111111 DA: 11001000 00010111 00010110 10100001 Which interface?

11001000 00010111 00011001 00000000


through 2 DA: 11001000 00010111 00011000 10101010 Which interface?
11001000 00010111 00011111 11111111

otherwise 3

Network Layer 4-17 Network Layer 4-18

Datagram or VC network: why? Chapter 4: Network Layer


Internet (datagram) ATM (VC)  4. 1 Introduction  4.5 Routing algorithms
 data exchange among  evolved from telephony
 4.2 Virtual circuit and  Link state
computers
 human conversation:  Distance Vector
 “elastic” service, no strict
datagram networks
 strict timing, reliability  Hierarchical routing
timing req.
requirements  4.3 What’s inside a
 “smart” end systems router  4.6 Routing in the
 need for guaranteed
(computers)
service  4.4 IP: Internet
Internet
 can adapt, perform  RIP
 “dumb” end systems Protocol
control, error recovery  OSPF
 telephones  Datagram format
 simple inside network,  BGP
 complexity inside  IPv4 addressing
complexity at “edge”
network  ICMP  4.7 Broadcast and
 many link types
 IPv6 multicast routing
 different characteristics
 uniform service difficult
Network Layer 4-19 Network Layer 4-20
Router Architecture Overview Input Port Functions
Two key router functions:
 run routing algorithms/protocol (RIP, OSPF, BGP)
 forwarding datagrams from incoming to outgoing link

Physical layer:
bit-level reception
Data link layer: Decentralized switching:
e.g., Ethernet  given datagram dest., lookup output port
see chapter 5 using forwarding table in input port
memory
 goal: complete input port processing at
‘line speed’
 queuing: if datagrams arrive faster than
forwarding rate into switch fabric

Network Layer 4-21 Network Layer 4-22

Three types of switching fabrics Switching Via Memory


First generation routers:
 traditional computers with switching under direct
control of CPU
packet copied to system’s memory
 speed limited by memory bandwidth (2 bus
crossings per datagram)
Input Memory Output
Port Port

System Bus

Network Layer 4-23 Network Layer 4-24


Switching Via An Interconnection
Switching Via a Bus Network

 overcome bus bandwidth limitations


 datagram from input port memory  Banyan networks, other interconnection nets
to output port memory via a shared initially developed to connect processors in
bus multiprocessor
 bus contention: switching speed  advanced design: fragmenting datagram into fixed
limited by bus bandwidth length cells, switch cells through the fabric.
 32 Gbps bus, Cisco 5600: sufficient  Cisco 12000: switches 60 Gbps through the
speed for access and enterprise interconnection network
routers

Network Layer 4-25 Network Layer 4-26

Output Ports Output port queueing

 Buffering required when datagrams arrive from


fabric faster than the transmission rate  buffering when arrival rate via switch exceeds
 Scheduling discipline chooses among queued output line speed
datagrams for transmission  queueing (delay) and loss due to output port
buffer overflow!
Network Layer 4-27 Network Layer 4-28
Input Port Queuing
How much buffering?
 Fabric slower than input ports combined -> queueing
may occur at input queues
 RFC 3439 rule of thumb: average buffering
 Head-of-the-Line (HOL) blocking: queued datagram
equal to “typical” RTT (say 250 msec) times
at front of queue prevents others in queue from
link capacity C moving forward
 e.g., C = 10 Gps link: 2.5 Gbit buffer  queueing delay and loss due to input buffer overflow!
 Recent recommendation: with N flows,
buffering equal to RTT. C
N

Network Layer 4-29 Network Layer 4-30

Chapter 4: Network Layer The Internet Network layer


Host, router network layer functions:
 4. 1 Introduction  4.5 Routing algorithms
 Link state Transport layer: TCP, UDP
 4.2 Virtual circuit and
datagram networks  Distance Vector
Routing protocols IP protocol
 Hierarchical routing
 4.3 What’s inside a •path selection •addressing conventions
 4.6 Routing in the •datagram format
router Network •RIP, OSPF, BGP
•packet handling conventions
 4.4 IP: Internet
Internet layer forwarding
 RIP ICMP protocol
Protocol table
•error reporting
 OSPF •router “signaling”
 Datagram format
 BGP
 IPv4 addressing
Link layer
 ICMP  4.7 Broadcast and
 IPv6 multicast routing physical layer

Network Layer 4-31 Network Layer 4-32


IP datagram format
Chapter 4: Network Layer IP protocol version
number 32 bits total datagram
header length head. type of length (bytes)
 4. 1 Introduction  4.5 Routing algorithms (bytes) ver
len service
length
for
 Link state “type” of data fragment
 4.2 Virtual circuit and 16-bit identifier flgs fragmentation/
offset reassembly
datagram networks  Distance Vector max number time to upper header
 Hierarchical routing
remaining hops live layer checksum
 4.3 What’s inside a (decremented at
router  4.6 Routing in the each router) 32 bit source IP address

 4.4 IP: Internet


Internet upper layer protocol 32 bit destination IP address
 RIP to deliver payload to E.g. timestamp,
Protocol Options (if any)
 OSPF record route
 Datagram format how much overhead data taken, specify
 BGP with TCP? (variable length,
 IPv4 addressing list of routers
 ICMP  4.7 Broadcast and  20 bytes of TCP typically a TCP to visit.
multicast routing or UDP segment)
 IPv6  20 bytes of IP
 = 40 bytes + app
layer overhead
Network Layer 4-33 Network Layer 4-34

IP Fragmentation & Reassembly IP Fragmentation and Reassembly


 network links have MTU
(max.transfer size) - largest length ID fragflag offset
possible link-level frame. Example =4000 =x =0 =0
 different link types, fragmentation:  4000 byte
One large datagram becomes
different MTUs in: one large datagram datagram
out: 3 smaller datagrams several smaller datagrams
 large IP datagram divided  MTU = 1500 bytes
(“fragmented”) within net
length ID fragflag offset
 one datagram becomes
=1500 =x =1 =0
several datagrams 1480 bytes in
reassembly
 “reassembled” only at final data field length ID fragflag offset
destination =1500 =x =1 =185
 IP header bits used to offset =
identify, order related 1480/8 length ID fragflag offset
fragments =1040 =x =0 =370

Network Layer 4-35 Network Layer 4-36


Chapter 4: Network Layer IP Addressing: introduction
223.1.1.1
 IP address: 32-bit
 4. 1 Introduction  4.5 Routing algorithms identifier for host, 223.1.2.1
223.1.1.2
 4.2 Virtual circuit and  Link state router interface 223.1.1.4 223.1.2.9
datagram networks  Distance Vector
 interface: connection
 Hierarchical routing 223.1.2.2
 4.3 What’s inside a between host/router 223.1.1.3 223.1.3.27

router  4.6 Routing in the and physical link


 4.4 IP: Internet
Internet  router’s typically have
 RIP multiple interfaces 223.1.3.1 223.1.3.2
Protocol
 OSPF  host typically has one
 Datagram format
 BGP interface
 IPv4 addressing
 4.7 Broadcast and  IP addresses
 ICMP
associated with each 223.1.1.1 = 11011111 00000001 00000001 00000001
 IPv6 multicast routing
interface
223 1 1 1

Network Layer 4-37 Network Layer 4-38

Subnets Subnets 223.1.1.0/24


223.1.2.0/24

223.1.1.1
 IP address:
223.1.2.1
Recipe
 subnet part (high
223.1.1.2
order bits)  To determine the
223.1.1.4 223.1.2.9
 host part (low order subnets, detach each
bits) 223.1.1.3 223.1.3.27
223.1.2.2 interface from its
 What’s a subnet ? host or router,
subnet creating islands of
 device interfaces with
same subnet part of IP 223.1.3.1 223.1.3.2 isolated networks.
address Each isolated network
 can physically reach is called a subnet. 223.1.3.0/24
each other without
intervening router network consisting of 3 subnets Subnet mask: /24

Network Layer 4-39 Network Layer 4-40


Subnets 223.1.1.2
IP addressing: CIDR
How many? 223.1.1.1 223.1.1.4
CIDR: Classless InterDomain Routing
223.1.1.3  subnet portion of address of arbitrary length
 address format: a.b.c.d/x, where x is # bits in
223.1.9.2 223.1.7.0
subnet portion of address

223.1.9.1 223.1.7.1
223.1.8.1 223.1.8.0

223.1.2.6 223.1.3.27 subnet host


part part
223.1.2.1 223.1.2.2 223.1.3.1 223.1.3.2 11001000 00010111 00010000 00000000
200.23.16.0/23
Network Layer 4-41 Network Layer 4-42

IP addresses: how to get one? DHCP: Dynamic Host Configuration Protocol

Q: How does a host get IP address? Goal: allow host to dynamically obtain its IP address
from network server when it joins network
Can renew its lease on address in use
 hard-coded by system admin in a file Allows reuse of addresses (only hold address while connected
an “on”)
 Windows: control-panel->network->configuration-
Support for mobile users who want to join network (more
>tcp/ip->properties
shortly)
 UNIX: /etc/rc.config
DHCP overview:
 DHCP: Dynamic Host Configuration Protocol:
 host broadcasts “DHCP discover” msg
dynamically get address from as server
 DHCP server responds with “DHCP offer” msg
 “plug-and-play”
 host requests IP address: “DHCP request” msg
 DHCP server sends address: “DHCP ack” msg
Network Layer 4-43 Network Layer 4-44
DHCP client-server scenario
DHCP client-server scenario DHCP server: 223.1.2.5 DHCP discover
arriving
client
src : 0.0.0.0, 68
dest.: 255.255.255.255,67
yiaddr: 0.0.0.0
transaction ID: 654
A DHCP 223.1.2.1
223.1.1.1 DHCP offer
server
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
223.1.1.2 yiaddrr: 223.1.2.4
223.1.1.4 223.1.2.9 transaction ID: 654
Lifetime: 3600 secs
B DHCP request
223.1.2.2 arriving DHCP
223.1.1.3 223.1.3.27 E client needs
src: 0.0.0.0, 68
dest:: 255.255.255.255, 67
address in this yiaddrr: 223.1.2.4
223.1.3.1 223.1.3.2 transaction ID: 655
network time Lifetime: 3600 secs

DHCP ACK
src: 223.1.2.5, 67
dest: 255.255.255.255, 68
yiaddrr: 223.1.2.4
transaction ID: 655
Lifetime: 3600 secs

Network Layer 4-45 Network Layer 4-46

IP addresses: how to get one? Hierarchical addressing: route aggregation


Q: How does network get subnet part of IP Hierarchical addressing allows efficient advertisement of routing
information:
addr?
A: gets allocated portion of its provider ISP’s Organization 0

address space 200.23.16.0/23


Organization 1
“Send me anything
200.23.18.0/23 with addresses
ISP's block 11001000 00010111 00010000 00000000 200.23.16.0/20
Organization 2 beginning
200.23.20.0/23 . Fly-By-Night-ISP 200.23.16.0/20”
Organization 0 11001000 00010111 00010000 00000000 200.23.16.0/23 .
. . Internet
Organization 1 11001000 00010111 00010010 00000000 200.23.18.0/23 .
Organization 2 11001000 00010111 00010100 00000000 200.23.20.0/23
Organization 7 .
... E.. E. E. 200.23.30.0/23
“Send me anything
Organization 7 11001000 00010111 00011110 00000000 200.23.30.0/23 ISPs-R-Us
with addresses
beginning
199.31.0.0/16”

Network Layer 4-47 Network Layer 4-48


Hierarchical addressing: more specific IP addressing: the last word...
routes
ISPs-R-Us has a more specific route to Organization 1 Q: How does an ISP get block of addresses?
Organization 0 A: ICANN: Internet Corporation for Assigned
200.23.16.0/23
Names and Numbers
“Send me anything
with addresses  allocates addresses
Organization 2 beginning
. 200.23.16.0/20”  manages DNS
200.23.20.0/23 Fly-By-Night-ISP
.
. . Internet  assigns domain names, resolves disputes
.
Organization 7 .
200.23.30.0/23
“Send me anything
ISPs-R-Us
with addresses
Organization 1 beginning 199.31.0.0/16
or 200.23.18.0/23”
200.23.18.0/23

Network Layer 4-49 Network Layer 4-50

NAT: Network Address Translation NAT: Network Address Translation

rest of local network  Motivation: local network uses just one IP address as
Internet (e.g., home network) far as outside world is concerned:
10.0.0/24 10.0.0.1
 range of addresses not needed from ISP: just one IP
10.0.0.4 address for all devices
10.0.0.2
138.76.29.7  can change addresses of devices in local network
without notifying outside world
10.0.0.3
 can change ISP without changing addresses of
All datagrams leaving local Datagrams with source or devices in local network
network have same single source destination in this network
 devices inside local net not explicitly addressable,
NAT IP address: 138.76.29.7, have 10.0.0/24 address for
different source port numbers source, destination (as usual) visible by outside world (a security plus).

Network Layer 4-51 Network Layer 4-52


NAT: Network Address Translation NAT: Network Address Translation
Implementation: NAT router must: NAT translation table
2: NAT router 1: host 10.0.0.1
WAN side addr LAN side addr
sends datagram to
 outgoing datagrams: replace (source IP address, port changes datagram
138.76.29.7, 5001 10.0.0.1, 3345 128.119.40.186, 80
#) of every outgoing datagram to (NAT IP address, source addr from
…… ……
new port #) 10.0.0.1, 3345 to
138.76.29.7, 5001,
. . . remote clients/servers will respond using (NAT updates table
S: 10.0.0.1, 3345
D: 128.119.40.186, 80
IP address, new port #) as destination addr. 10.0.0.1
1
S: 138.76.29.7, 5001
 remember (in NAT translation table) every (source 2 D: 128.119.40.186, 80 10.0.0.4
10.0.0.2
IP address, port #) to (NAT IP address, new port #)
translation pair 138.76.29.7 S: 128.119.40.186, 80
D: 10.0.0.1, 3345 4
S: 128.119.40.186, 80
D: 138.76.29.7, 5001 3 10.0.0.3
 incoming datagrams: replace (NAT IP address, new 4: NAT router
3: Reply arrives changes datagram
port #) in dest fields of every incoming datagram dest. address: dest addr from
with corresponding (source IP address, port #) 138.76.29.7, 5001 138.76.29.7, 5001 to 10.0.0.1, 3345
stored in NAT table
Network Layer 4-53 Network Layer 4-54

NAT: Network Address Translation NAT traversal problem


 client wants to connect to
 16-bit port-number field: server with address 10.0.0.1
10.0.0.1
 60,000 simultaneous connections with a single  server address 10.0.0.1 local Client
LAN-side address! to LAN (client can’t use it as ?
destination addr)
 NAT is controversial:  only one externally visible 10.0.0.4
NATted address: 138.76.29.7
 routers should only process up to layer 3 138.76.29.7 NAT
 solution 1: statically router
 violates end-to-end argument configure NAT to forward
• NAT possibility must be taken into account by app incoming connection
designers, eg, P2P applications requests at given port to
 address shortage should instead be solved by server
IPv6  e.g., (123.76.29.7, port 2500)
always forwarded to 10.0.0.1
port 25000
Network Layer 4-55 Network Layer 4-56
NAT traversal problem NAT traversal problem
 solution 2: Universal Plug and  solution 3: relaying (used in Skype)
Play (UPnP) Internet Gateway 10.0.0.1  NATed client establishes connection to relay
Device (IGD) Protocol. Allows  External client connects to relay
NATted host to: IGD
 relay bridges packets between to connections
 learn public IP address 10.0.0.4
(138.76.29.7)
138.76.29.7 NAT
 add/remove port mappings 2. connection to
router relay initiated 1. connection to
(with lease times) by client relay initiated
10.0.0.1
by NATted host
3. relaying
i.e., automate static NAT port Client
established
map configuration 138.76.29.7 NAT
router

Network Layer 4-57 Network Layer 4-58

Chapter 4: Network Layer ICMP: Internet Control Message Protocol

 4. 1 Introduction  4.5 Routing algorithms  used by hosts & routers to


communicate network-level Type Code description
 4.2 Virtual circuit and  Link state 0 0 echo reply (ping)
information
 Distance Vector
3 0 dest. network unreachable
datagram networks  error reporting:
3 1 dest host unreachable
 Hierarchical routing unreachable host, network,
 4.3 What’s inside a port, protocol
3 2 dest protocol unreachable
router  4.6 Routing in the 3 3 dest port unreachable
 echo request/reply (used 3 6 dest network unknown
 4.4 IP: Internet
Internet by ping) 3 7 dest host unknown
 RIP  network-layer “above” IP: 4 0 source quench (congestion
Protocol control - not used)
 OSPF  ICMP msgs carried in IP
 Datagram format datagrams 8 0 echo request (ping)
 BGP 9 0 route advertisement
 IPv4 addressing  ICMP message: type, code plus
 ICMP  4.7 Broadcast and first 8 bytes of IP datagram 10 0 router discovery
11 0 TTL expired
 IPv6 multicast routing causing error
12 0 bad IP header

Network Layer 4-59 Network Layer 4-60


Traceroute and ICMP Chapter 4: Network Layer
 Source sends series of  When ICMP message  4. 1 Introduction  4.5 Routing algorithms
UDP segments to dest arrives, source calculates  Link state
RTT  4.2 Virtual circuit and
 First has TTL =1
datagram networks  Distance Vector
 Second has TTL=2, etc.  Traceroute does this 3
 Hierarchical routing
 Unlikely port number times  4.3 What’s inside a
 When nth datagram arrives Stopping criterion router  4.6 Routing in the
to nth router:  UDP segment eventually  4.4 IP: Internet
Internet
 Router discards datagram arrives at destination host  RIP
Protocol
 And sends to source an  Destination returns ICMP  OSPF
ICMP message (type 11,  Datagram format
“host unreachable” packet  BGP
code 0)  IPv4 addressing
(type 3, code 3)
 Message includes name of  ICMP  4.7 Broadcast and
 When source gets this
router& IP address multicast routing
ICMP, stops.  IPv6

Network Layer 4-61 Network Layer 4-62

IPv6 IPv6 Header (Cont)


 Initial motivation: 32-bit address space soon Priority: identify priority among datagrams in flow
to be completely allocated. Flow Label: identify datagrams in same “flow.”
(concept of“flow” not well defined).
 Additional motivation: Next header: identify upper layer protocol for data
 header format helps speed processing/forwarding
 header changes to facilitate QoS
IPv6 datagram format:
 fixed-length 40 byte header
 no fragmentation allowed

Network Layer 4-63 Network Layer 4-64


Other Changes from IPv4 Transition From IPv4 To IPv6
 Checksum: removed entirely to reduce  Not all routers can be upgraded simultaneous
processing time at each hop  no “flag days”

 Options: allowed, but outside of header,  How will the network operate with mixed IPv4 and

indicated by “Next Header” field IPv6 routers?

 ICMPv6: new version of ICMP  Tunneling: IPv6 carried as payload in IPv4


 additional message types, e.g. “Packet Too Big” datagram among IPv4 routers
 multicast group management functions

Network Layer 4-65 Network Layer 4-66

Tunneling Tunneling
A B E F A B E F
Logical view: tunnel Logical view: tunnel

IPv6 IPv6 IPv6 IPv6 IPv6 IPv6 IPv6 IPv6

A B E F A B C D E F
Physical view: Physical view:
IPv6 IPv6 IPv4 IPv4 IPv6 IPv6 IPv6 IPv6 IPv4 IPv4 IPv6 IPv6

Flow: X Src:B Src:B Flow: X


Src: A Dest: E Dest: E Src: A
Dest: F Dest: F
Flow: X Flow: X
Src: A Src: A
data Dest: F Dest: F data

data data

A-to-B: E-to-F:
B-to-C: B-to-C:
IPv6 IPv6
IPv6 inside IPv6 inside
IPv4 IPv4
Network Layer 4-67 Network Layer 4-68
Interplay between routing, forwarding
Chapter 4: Network Layer
 4. 1 Introduction  4.5 Routing algorithms routing algorithm

 4.2 Virtual circuit and  Link state


local forwarding table
datagram networks  Distance Vector
header value output link
 Hierarchical routing 0100 3
 4.3 What’s inside a 0101 2

router  4.6 Routing in the 0111


1001
2
1

 4.4 IP: Internet


Internet
 RIP
Protocol value in arriving
 OSPF packet’s header
 Datagram format
 BGP 0111 1
 IPv4 addressing
 ICMP  4.7 Broadcast and 3 2
 IPv6 multicast routing

Network Layer 4-69 Network Layer 4-70

Graph abstraction Graph abstraction: costs


5 5 • c(x,x’) = cost of link (x,x’)
v 3 w 3
5 v w
2 5 - e.g., c(w,z) = 5
2
u z u
2
3
1 2 1 z • cost could always be 1, or
1 3
x y 2 1
x y 2 inversely related to bandwidth,
Graph: G = (N,E) 1 or inversely related to
1
congestion
N = set of routers = { u, v, w, x, y, z }
Cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)
E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

Question: What’s the least-cost path between u and z ?


Remark: Graph abstraction is useful in other network contexts

Example: P2P, where N is set of peers and E is set of TCP connections Routing algorithm: algorithm that finds least-cost path

Network Layer 4-71 Network Layer 4-72


Routing Algorithm classification Chapter 4: Network Layer
Global or decentralized Static or dynamic?  4. 1 Introduction  4.5 Routing algorithms
information? Static:
 4.2 Virtual circuit and  Link state
Global:
 routes change slowly datagram networks  Distance Vector
 all routers have complete
topology, link cost info over time  Hierarchical routing
 4.3 What’s inside a
 “link state” algorithms Dynamic: router  4.6 Routing in the
Decentralized:  routes change more  4.4 IP: Internet
Internet
 router knows physically- quickly  RIP
Protocol
connected neighbors, link  OSPF
 periodic update  Datagram format
costs to neighbors  BGP
 iterative process of  in response to link  IPv4 addressing
ICMP  4.7 Broadcast and
computation, exchange of cost changes 
info with neighbors  IPv6 multicast routing
 “distance vector” algorithms

Network Layer 4-73 Network Layer 4-74

A Link-State Routing Algorithm Dijsktra’s Algorithm


1 Initialization:
Dijkstra’s algorithm Notation: 2 N' = {u}
 net topology, link costs  c(x,y): link cost from node 3 for all nodes v
known to all nodes x to y; = ∞ if not direct 4 if v adjacent to u
 accomplished via “link neighbors 5 then D(v) = c(u,v)
state broadcast” 6 else D(v) = ∞
 D(v): current value of cost
 all nodes have same info 7
of path from source to
8 Loop
 computes least cost paths dest. v
9 find w not in N' such that D(w) is a minimum
from one node (‘source”) to
 p(v): predecessor node 10 add w to N'
all other nodes
along path from source to v 11 update D(v) for all v adjacent to w and not in N' :
 gives forwarding table
 N': set of nodes whose 12 D(v) = min( D(v), D(w) + c(w,v) )
for that node 13 /* new cost to v is either old cost to v or known
least cost path definitively
 iterative: after k 14 shortest path cost to w plus cost from w to v */
known
iterations, know least cost 15 until all nodes in N'
path to k dest.’s
Network Layer 4-75 Network Layer 4-76
Dijkstra’s algorithm: example Dijkstra’s algorithm: example (2)
Resulting shortest-path tree from u:
Step N' D(v),p(v) D(w),p(w) D(x),p(x) D(y),p(y) D(z),p(z)
0 u 2,u 5,u 1,u ∞ ∞
1 ux 2,u 4,x 2,x ∞ v w
2 uxy 2,u 3,y 4,y u
3 uxyv 3,y 4,y
z
4 uxyvw 4,y x y
5 uxyvwz
Resulting forwarding table in u:
5 destination link
v 3 w v (u,v)
2 5
x (u,x)
u 2 z
1 y (u,x)
3
1
x y 2 w (u,x)
1 z (u,x)
Network Layer 4-77 Network Layer 4-78

Dijkstra’s algorithm, discussion Chapter 4: Network Layer


Algorithm complexity: n nodes
 4. 1 Introduction  4.5 Routing algorithms
 each iteration: need to check all nodes, w, not in N
 4.2 Virtual circuit and  Link state
 n(n+1)/2 comparisons: O(n2)
datagram networks  Distance Vector
 more efficient implementations possible: O(nlogn)
 Hierarchical routing
 4.3 What’s inside a
Oscillations possible:  4.6 Routing in the
router
 e.g., link cost = amount of carried traffic Internet
 4.4 IP: Internet
 RIP
A Protocol
1
A A A  OSPF
1+e 2+e 0 0 2+e 2+e 0  Datagram format
D B D B D B D B  BGP
0 0 1+e 1 0 0 1+e 1  IPv4 addressing
0 C e 0 0 1 e
C C 1+e 0 C  ICMP  4.7 Broadcast and
1 1
 IPv6 multicast routing
e … recompute … recompute … recompute
initially
routing
Network Layer 4-79 Network Layer 4-80
Distance Vector Algorithm Bellman-Ford example
Bellman-Ford Equation (dynamic programming) 5
Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3
v 3 w
Define 2 5
u z B-F equation says:
dx(y) := cost of least-cost path from x to y 2
3
1
1 du(z) = min { c(u,v) + dv(z),
x y 2
1 c(u,x) + dx(z),
Then c(u,w) + dw(z) }
= min {2 + 5,
1 + 3,
dx(y) = min
v
{c(x,v) + dv(y) } 5 + 3} = 4
Node that achieves minimum is next
where min is taken over all neighbors v of x hop in shortest path ➜ forwarding table
Network Layer 4-81 Network Layer 4-82

Distance Vector Algorithm Distance vector algorithm (4)


 Dx(y) = estimate of least cost from x to y Basic idea:
 From time-to-time, each node sends its own
 Node x knows cost to each neighbor v: distance vector estimate to neighbors
c(x,v)  Asynchronous
 Node x maintains distance vector Dx =  When a node x receives new DV estimate from
[Dx(y): y є N ] neighbor, it updates its own DV using B-F equation:
 Node x also maintains its neighbors’ Dx(y) ← minv{c(x,v) + Dv(y)} for each node y ∊ N
distance vectors  Under minor, natural conditions, the estimate
 For each neighbor v, x maintains Dx(y) converge to the actual least cost dx(y)
Dv = [Dv(y): y є N ]

Network Layer 4-83 Network Layer 4-84


Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} Dx(z) = min{c(x,y) +
= min{2+0 , 7+1} = 2 Dy(z), c(x,z) + Dz(z)}
Distance Vector Algorithm (5) node x table = min{2+1 , 7+0} = 3
cost to cost to
x y z x y z
Iterative, asynchronous: Each node:
each local iteration caused x 0 2 7 x 0 2 3

from

from
by: y ∞∞ ∞ y 2 0 1
wait for (change in local link z ∞∞ ∞ z 7 1 0
 local link cost change
node y table
 DV update message from cost or msg from neighbor) cost to
neighbor x y z y
2 1
Distributed: x ∞ ∞ ∞
recompute estimates x z

from
y 2 0 1 7
 each node notifies
z ∞∞ ∞
neighbors only when its DV
node z table
changes if DV to any dest has cost to
 neighbors then notify x y z
changed, notify neighbors
their neighbors if
x ∞∞ ∞
necessary

from
y ∞∞ ∞
z 71 0
time
Network Layer 4-85 Network Layer 4-86

Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} Dx(z) = min{c(x,y) +


= min{2+0 , 7+1} = 2 Dy(z), c(x,z) + Dz(z)}
node x table = min{2+1 , 7+0} = 3 Distance Vector: link cost changes
cost to cost to cost to
x y z x y z x y z Link cost changes:
x 0 2 7 x 0 2 3 x 0 2 3 1
 node detects local link cost change y
from

from

from

y ∞∞ ∞ y 2 0 1 y 2 0 1 4 1
z ∞∞ ∞ z 7 1 0  updates routing info, recalculates
z 3 1 0 x z
node y table distance vector 50
cost to cost to cost to  if DV changes, notify neighbors
x y z x y z x y z y
2 1
x ∞ ∞ ∞ x 0 2 7 x 0 2 3 At time t0, y detects the link-cost change, updates its DV,
x z
from
from

from

y 2 0 1 y 2 0 1 y 2 0 1 7 and informs its neighbors.


z ∞∞ ∞ z 7 1 0
“good
z 3 1 0 At time t1, z receives the update from y and updates its table.
node z table news
cost to cost to It computes a new least cost to x and sends its neighbors its DV.
cost to travels
x y z x y z x y z
fast” At time t2, y receives z’s update and updates its distance table.
x ∞∞ ∞ x 0 2 7 x 0 2 3 y’s least costs do not change and hence y does not send any
from

from
from

y ∞∞ ∞ y 2 0 1 y 2 0 1 message to z.
z 71 0 z 3 1 0 z 3 1 0
time
Network Layer 4-87 Network Layer 4-88
Distance Vector: link cost changes Comparison of LS and DV algorithms

Link cost changes: Message complexity Robustness: what happens


 good news travels fast
60
y  LS: with n nodes, E links, if router malfunctions?
 bad news travels slow - 4 1 O(nE) msgs sent LS:
“count to infinity” problem! x z  DV: exchange between
50  node can advertise
 44 iterations before
neighbors only
incorrect link cost
algorithm stabilizes: see  convergence time varies
 each node computes only
text its own table
Speed of Convergence
Poisoned reverse:  LS: O(n2) algorithm requires DV:
 If Z routes through Y to O(nE) msgs  DV node can advertise
get to X :  may have oscillations incorrect path cost
 Z tells Y its (Z’s) distance each node’s table used by
 DV: convergence time varies 
to X is infinite (so Y won’t
 may be routing loops
others
route to X via Z)
• error propagate thru
 will this completely solve  count-to-infinity problem
network
count to infinity problem?
Network Layer 4-89 Network Layer 4-90

Chapter 4: Network Layer Hierarchical Routing


Our routing study thus far - idealization
 4. 1 Introduction  4.5 Routing algorithms
 Link state
 all routers identical
 4.2 Virtual circuit and
datagram networks  Distance Vector  network “flat”

 4.3 What’s inside a


 Hierarchical routing … not true in practice
router  4.6 Routing in the
Internet scale: with 200 million administrative autonomy
 4.4 IP: Internet
 RIP destinations:  internet = network of
Protocol networks
 OSPF  can’t store all dest’s in
 Datagram format routing tables!  each network admin may
 BGP
 IPv4 addressing  routing table exchange want to control routing in its
 ICMP  4.7 Broadcast and own network
would swamp links!
 IPv6 multicast routing

Network Layer 4-91 Network Layer 4-92


Hierarchical Routing Interconnected ASes
 aggregate routers into Gateway router
regions, “autonomous 3c
 Direct link to router in 3a 2c
systems” (AS) 3b
AS3
2a
another AS 1c
2b
 routers in same AS run AS2
1a 1b
same routing protocol 1d AS1
 forwarding table
 “intra-AS” routing configured by both
protocol
intra- and inter-AS
 routers in different AS Intra-AS
can run different intra- Routing
Inter-AS
Routing routing algorithm
algorithm algorithm
AS routing protocol  intra-AS sets entries
Forwarding for internal dests
table
 inter-AS & intra-As
sets entries for
external dests
Network Layer 4-93 Network Layer 4-94

Inter-AS tasks AS1 must:


Example: Setting forwarding table in router 1d
 suppose router in AS1 1. learn which dests are
receives datagram reachable through  suppose AS1 learns (via inter-AS protocol) that subnet
destined outside of AS2, which through x reachable via AS3 (gateway 1c) but not via AS2.
AS1: AS3
 inter-AS protocol propagates reachability info to all
 router should 2. propagate this
forward packet to reachability info to all internal routers.
gateway router, but routers in AS1  router 1d determines from intra-AS routing info that
which one? Job of inter-AS routing! its interface I is on the least cost path to 1c.
 installs forwarding table entry (x,I)

x
3c
3a 2c 3c
3b 2a 3a 2c
AS3 2b 3b 2a
1c AS2
AS3 2b
1c AS2
1a 1b
1d AS1 1a 1b AS1
1d
Network Layer 4-95 Network Layer 4-96
Example: Choosing among multiple ASes Example: Choosing among multiple ASes
 now suppose AS1 learns from inter-AS protocol that  now suppose AS1 learns from inter-AS protocol that
subnet x is reachable from AS3 and from AS2. subnet x is reachable from AS3 and from AS2.
 to configure forwarding table, router 1d must  to configure forwarding table, router 1d must
determine towards which gateway it should forward determine towards which gateway it should forward
packets for dest x. packets for dest x.
 this is also job of inter-AS routing protocol!  this is also job of inter-AS routing protocol!
 hot potato routing: send packet towards closest of
two routers.
x
3c Determine from
3a 2c Learn from inter-AS
Use routing info
Hot potato routing:
3b 2a from intra-AS forwarding table the
AS3 2b protocol that subnet
protocol to determine
Choose the gateway interface I that leads
1c AS2 x is reachable via
costs of least-cost
that has the to least-cost gateway.
multiple gateways smallest least cost
1a 1b paths to each Enter (x,I) in
1d AS1 of the gateways forwarding table

Network Layer 4-97 Network Layer 4-98

Chapter 4: Network Layer Intra-AS Routing

 4. 1 Introduction  4.5 Routing algorithms  also known as Interior Gateway Protocols (IGP)
 4.2 Virtual circuit and  Link state  most common Intra-AS routing protocols:
datagram networks  Distance Vector

 4.3 What’s inside a


 Hierarchical routing  RIP: Routing Information Protocol
router  4.6 Routing in the
 OSPF: Open Shortest Path First
 4.4 IP: Internet
Internet
Protocol  RIP  IGRP: Interior Gateway Routing Protocol (Cisco
 Datagram format
 OSPF proprietary)
 BGP
 IPv4 addressing
 ICMP  4.7 Broadcast and
 IPv6 multicast routing

Network Layer 4-99 Network Layer 4-100


Chapter 4: Network Layer RIP ( Routing Information Protocol)

 4. 1 Introduction  4.5 Routing algorithms  distance vector algorithm


 4.2 Virtual circuit and  Link state  included in BSD-UNIX Distribution in 1982
datagram networks  Distance Vector
 distance metric: # of hops (max = 15 hops)
 Hierarchical routing
 4.3 What’s inside a
router  4.6 Routing in the
From router A to subnets:
 4.4 IP: Internet
Internet
u destination hops
 RIP v
Protocol u 1
 OSPF A B w v 2
 Datagram format w 2
 BGP
 IPv4 addressing x 3
 ICMP  4.7 Broadcast and x y 3
 IPv6 multicast routing z C D z 2
y

Network Layer 4-101 Network Layer 4-102

RIP advertisements RIP: Example


z
 distance vectors: exchanged among
neighbors every 30 sec via Response w x y
A D B
Message (also called advertisement)
 each advertisement: list of up to 25 C
destination subnets within AS Destination Network Next Router Num. of hops to dest.
w A 2
y B 2
z B 7
x -- 1
…. …. ....
Routing/Forwarding table in D

Network Layer 4-103 Network Layer 4-104


RIP: Example
RIP: Link Failure and Recovery
Dest Next hops
w - 1 Advertisement
x - 1 from A to D If no advertisement heard after 180 sec -->
z C 4 neighbor/link declared dead
…. … ...
z  routes via neighbor invalidated
w x y  new advertisements sent to neighbors
A D B  neighbors in turn send out new advertisements (if
tables changed)
C  link failure info quickly (?) propagates to entire net
Destination Network Next Router Num. of hops to dest.
 poison reverse used to prevent ping-pong loops
w A 2
y B 2 (infinite distance = 16 hops)
z B A 7 5
x -- 1
…. …. ....
Routing/Forwarding table in D Network Layer 4-105 Network Layer 4-106

RIP Table processing Chapter 4: Network Layer


 RIP routing tables managed by application-level  4. 1 Introduction  4.5 Routing algorithms
process called route-d (daemon)  4.2 Virtual circuit and  Link state

 advertisements sent in UDP packets, periodically datagram networks  Distance Vector

repeated  Hierarchical routing


 4.3 What’s inside a
router  4.6 Routing in the
routed routed
 4.4 IP: Internet
Internet
 RIP
Transprt Transprt Protocol
(UDP) (UDP)  OSPF
 Datagram format
network forwarding forwarding network  BGP
 IPv4 addressing
(IP) table table (IP)  4.7 Broadcast and
 ICMP
link link multicast routing
 IPv6
physical physical

Network Layer 4-107 Network Layer 4-108


OSPF (Open Shortest Path First) OSPF “advanced” features (not in RIP)

 “open”: publicly available


 security: all OSPF messages authenticated (to
 uses Link State algorithm
prevent malicious intrusion)
 LS packet dissemination
 topology map at each node
 multiple same-cost paths allowed (only one path in
 route computation using Dijkstra’s algorithm
RIP)
 For each link, multiple cost metrics for different
 OSPF advertisement carries one entry per neighbor
TOS (e.g., satellite link cost set “low” for best effort;
router high for real time)
 advertisements disseminated to entire AS (via
 integrated uni- and multicast support:
flooding)  Multicast OSPF (MOSPF) uses same topology data
 carried in OSPF messages directly over IP (rather than TCP base as OSPF
or UDP  hierarchical OSPF in large domains.
Network Layer 4-109 Network Layer 4-110

Hierarchical OSPF Hierarchical OSPF


 two-level hierarchy: local area, backbone.
Link-state advertisements only in area
 each nodes has detailed area topology; only know
direction (shortest path) to nets in other areas.
 area border routers: “summarize” distances to nets
in own area, advertise to other Area Border routers.
 backbone routers: run OSPF routing limited to
backbone.
 boundary routers: connect to other AS’s.

Network Layer 4-111 Network Layer 4-112


Chapter 4: Network Layer Internet inter-AS routing: BGP

 4. 1 Introduction  4.5 Routing algorithms  BGP (Border Gateway Protocol): the de


 4.2 Virtual circuit and  Link state facto standard
datagram networks  Distance Vector
 BGP provides each AS a means to:
 Hierarchical routing
 4.3 What’s inside a 1. Obtain subnet reachability information from
router  4.6 Routing in the neighboring ASs.
 4.4 IP: Internet
Internet 2. Propagate reachability information to all AS-
 RIP internal routers.
Protocol
 OSPF 3. Determine “good” routes to subnets based on
 Datagram format
 BGP reachability information and policy.
 IPv4 addressing
 ICMP  4.7 Broadcast and  allows subnet to advertise its existence to
 IPv6 multicast routing rest of Internet: “I am here”

Network Layer 4-113 Network Layer 4-114

BGP basics Distributing reachability info


 pairs of routers (BGP peers) exchange routing info
 using eBGP session between 3a and 1c, AS3 sends
over semi-permanent TCP connections: BGP sessions
prefix reachability info to AS1.
 BGP sessions need not correspond to physical
links.  1c can then use iBGP do distribute new prefix
 when AS2 advertises a prefix to AS1: info to all routers in AS1
 AS2 promises it will forward datagrams towards  1b can then re-advertise new reachability info
that prefix. to AS2 over 1b-to-2a eBGP session
 AS2 can aggregate prefixes in its advertisement  when router learns of new prefix, it creates entry
for prefix in its forwarding table.

eBGP session eBGP session


3c iBGP session
3c iBGP session
3a 2c 3a 2c
3b 2a 3b 2a
AS3 2b AS3 2b
1c AS2
1c AS2
1a 1b 1a 1b
AS1 1d AS1 1d
Network Layer 4-115 Network Layer 4-116
Path attributes & BGP routes BGP route selection
 advertised prefix includes BGP attributes.  router may learn about more than 1 route
 prefix + attributes = “route” to some prefix. Router must select route.
 two important attributes:  elimination rules:
 AS-PATH: contains ASs through which prefix
1. local preference value attribute: policy
advertisement has passed: e.g, AS 67, AS 17
decision
 NEXT-HOP: indicates specific internal-AS router
to next-hop AS. (may be multiple links from 2. shortest AS-PATH
current AS to next-hop-AS) 3. closest NEXT-HOP router: hot potato routing
 when gateway router receives route 4. additional criteria
advertisement, uses import policy to
accept/decline.

Network Layer 4-117 Network Layer 4-118

BGP messages BGP routing policy


legend: provider
 BGP messages exchanged using TCP. B network
 BGP messages: X
W A
customer
 OPEN: opens TCP connection to peer and C network:
authenticates sender
Y
 UPDATE: advertises new path (or withdraws old)
 KEEPALIVE keeps connection alive in absence of  A,B,C are provider networks
UPDATES; also ACKs OPEN request
 X,W,Y are customer (of provider networks)
 NOTIFICATION: reports errors in previous msg;
 X is dual-homed: attached to two networks
also used to close connection
 X does not want to route from B via X to C
 .. so X will not advertise to B a route to C

Network Layer 4-119 Network Layer 4-120


BGP routing policy (2) Why different Intra- and Inter-AS routing ?
legend: provider Policy:
B network
X  Inter-AS: admin wants control over how its traffic
W A
customer routed, who routes through its net.
C network:
 Intra-AS: single admin, so no policy decisions needed
Y
Scale:
 A advertises path AW to B  hierarchical routing saves table size, reduced update
 B advertises path BAW to X traffic
 Should B advertise path BAW to C? Performance:
 No way! B gets no “revenue” for routing CBAW  Intra-AS: can focus on performance
since neither W nor C are B’s customers  Inter-AS: policy may dominate over performance
 B wants to force C to route to w via A
 B wants to route only to/from its customers!
Network Layer 4-121 Network Layer 4-122

Chapter 4: Network Layer Broadcast Routing


 deliver packets from source to all other nodes
 4. 1 Introduction  4.5 Routing algorithms  source duplication is inefficient:
 4.2 Virtual circuit and  Link state

datagram networks  Distance Vector duplicate


duplicate
R1 creation/transmission R1
 Hierarchical routing
 4.3 What’s inside a duplicate

router  4.6 Routing in the R2 R2

 4.4 IP: Internet


Internet
 RIP R3 R4 R3 R4
Protocol
 OSPF
 Datagram format source in-network
 BGP duplication duplication
 IPv4 addressing
ICMP  4.7 Broadcast and

multicast routing  source duplication: how does source
 IPv6
determine recipient addresses?
Network Layer 4-123 Network Layer 4-124
In-network duplication Spanning Tree
 flooding: when node receives brdcst pckt,  First construct a spanning tree
sends copy to all neighbors  Nodes forward copies only along spanning
 Problems: cycles & broadcast storm
tree
 controlled flooding: node only brdcsts pkt
A
if it hasn’t brdcst same packet before A

 Node keeps track of pckt ids already brdcsted B B


c c
 Or reverse path forwarding (RPF): only forward
pckt if it arrived on shortest path between D D
node and source F E F E

 spanning tree G G

 No redundant packets received by any node (a) Broadcast initiated at A (b) Broadcast initiated at D

Network Layer 4-125 Network Layer 4-126

Spanning Tree: Creation Multicast Routing: Problem Statement


 Center node
 Goal: find a tree (or trees) connecting
 Each node sends unicast join message to center
routers having local mcast group members
node
 tree: not all paths between routers used
 Message forwarded until it arrives at a node already
belonging to spanning tree  source-based: different tree from each sender to rcvrs
 shared-tree: same tree used by all group members
A A
3
B B
c c
4
2
D D
F E F E
1 5
G G
(a) Stepwise construction (b) Constructed spanning
of spanning tree tree
Shared tree Source-based trees
Network Layer 4-127
Approaches for building mcast trees Shortest Path Tree
Approaches:  mcast forwarding tree: tree of shortest
 source-based tree: one tree per source
path routes from source to all receivers
 Dijkstra’s algorithm
 shortest path trees
 reverse path forwarding S: source LEGEND
 group-shared tree: group uses one tree R1 2
1 R4 router with attached
 minimal spanning (Steiner) group member
R2
 center-based trees 5
router with no attached
3 4
R5 group member
…we first look at basic approaches, then specific R3 6 link used for forwarding,
i
protocols adopting these approaches R6 R7 i indicates order link
added by algorithm

Reverse Path Forwarding Reverse Path Forwarding: example


S: source
LEGEND
R1
 rely on router’s knowledge of unicast R4 router with attached
shortest path from it to sender group member
R2
 each router has simple forwarding behavior: router with no attached
R5 group member
R3 datagram will be
if (mcast datagram received on incoming link R6 R7 forwarded
on shortest path back to center) datagram will not be
forwarded
then flood datagram onto all outgoing links
else ignore datagram • result is a source-specific reverse SPT
– may be a bad choice with asymmetric links
Reverse Path Forwarding: pruning Shared-Tree: Steiner Tree
 forwarding tree contains subtrees with no mcast
group members  Steiner Tree: minimum cost tree
 no need to forward datagrams down subtree connecting all routers with attached group
 “prune” msgs sent upstream by router with no members
downstream group members
 problem is NP-complete
S: source LEGEND
 excellent heuristics exists
R1 router with attached
R4
group member  not used in practice:
R2 router with no attached  computational complexity
P group member
P  information about entire network needed
R5 prune message
P links with multicast  monolithic: rerun whenever a router needs to
R3
R6 R7 forwarding join/leave

Center-based trees Center-based trees: an example


 single delivery tree shared by all Suppose R6 chosen as center:
 one router identified as “center” of tree
LEGEND
 to join:
R1 router with attached
 edge router sends unicast join-msg addressed R4
3 group member
to center router
R2 router with no attached
 join-msg “processed” by intermediate routers 2 group member
and forwarded towards center 1
R5 path order in which join
 join-msg either hits existing tree branch for messages generated
R3
1 R7
this center, or arrives at center R6

 path taken by join-msg becomes new branch of


tree for this router
Internet Multicasting Routing: DVMRP DVMRP: continued…
 soft state: DVMRP router periodically (1 min.)
 DVMRP: distance vector multicast routing
protocol, RFC1075 “forgets” branches are pruned:
 mcast data again flows down unpruned branch
 flood and prune: reverse path forwarding,
 downstream router: reprune or else continue to
source-based tree receive data
 RPF tree based on DVMRP’s own routing tables
 routers can quickly regraft to tree
constructed by communicating DVMRP routers
 no assumptions about underlying unicast
 following IGMP join at leaf
 initial datagram to mcast group flooded  odds and ends
everywhere via RPF  commonly implemented in commercial routers
 routers not wanting group: send upstream prune  Mbone routing done using DVMRP
msgs

Tunneling PIM: Protocol Independent Multicast


Q: How to connect “islands” of multicast  not dependent on any specific underlying unicast
routers in a “sea” of unicast routers? routing algorithm (works with all)
 two different multicast distribution scenarios :

Dense: Sparse:
 group members  # networks with group
physical topology logical topology
densely packed, in members small wrt #
 mcast datagram encapsulated inside “normal” (non-multicast-
“close” proximity. interconnected networks
addressed) datagram  bandwidth more  group members “widely
 normal IP datagram sent thru “tunnel” via regular IP unicast to plentiful dispersed”
receiving mcast router  bandwidth not plentiful
 receiving mcast router unencapsulates to get mcast datagram
Consequences of Sparse-Dense Dichotomy: PIM- Dense Mode
Dense Sparse: flood-and-prune RPF, similar to DVMRP but
 group membership by  no membership until  underlying unicast protocol provides RPF info
routers assumed until routers explicitly join for incoming datagram
routers explicitly prune  receiver- driven
 less complicated (less efficient) downstream
 data-driven construction construction of mcast flood than DVMRP reduces reliance on
on mcast tree (e.g., RPF) tree (e.g., center-based) underlying routing algorithm
 bandwidth and non-  bandwidth and non-group-
 has protocol mechanism for router to detect it
group-router processing router processing
is a leaf-node router
profligate conservative

PIM - Sparse Mode PIM - Sparse Mode


 center-based approach sender(s):
 router sends join msg  unicast data to RP,
to rendezvous point R1 which distributes down R1
R4 R4
(RP) join RP-rooted tree join
 intermediate routers R2  RP can extend mcast R2
join join
update state and tree upstream to
forward join R5 R5
join source join
 after joining via RP, R3 R7 R3 R7
R6  RP can send stop msg R6
router can switch to
if no attached
source-specific tree all data multicast rendezvous all data multicast rendezvous
from rendezvous receivers from rendezvous
 increased performance: point point
less concentration, point  “no one is listening!” point
shorter paths
Chapter 4: summary
 4. 1 Introduction  4.5 Routing algorithms
 4.2 Virtual circuit and  Link state

datagram networks  Distance Vector


 Hierarchical routing
 4.3 What’s inside a
router  4.6 Routing in the
 4.4 IP: Internet
Internet
 RIP
Protocol
 OSPF
 Datagram format
 BGP
 IPv4 addressing
 ICMP  4.7 Broadcast and
 IPv6 multicast routing

Network Layer 4-145

You might also like