Shivkumar Kalyanaraman
Rensselaer Polytechnic Institute
shivkuma@ecse.rpi.edu
Based in part upon slides of Prof. Raj Jain (OSU), S. Keshav (Cornell), J. Kurose (U Mass)
Overview
RIP, RIPv2, EIGRP
OSPF, PNNI, IS-IS: LS efficiency & robustness
Link state distribution, DB synchronization, NBMAs etc
RIP: Routing Information Protocol
Uses hop count as metric (max: 15 hops; 16 means infinity)
Tables (vectors) “advertised” to neighbors every 30 s.
Each advertisement: up to 25 entries
No advertisement for 180 sec: neighbor/link declared dead
routes via neighbor invalidated
new advertisements sent to neighbors (Triggered
updates)
neighbors in turn send out new advertisements (if
tables changed)
link failure info quickly propagates to entire net
poison reverse used to prevent ping-pong loops (infinite
distance = 16 hops)
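The update rules above can be sketched in a few lines. This is a minimal illustration, not the RIP packet format or timer logic: the function names, the table layout (destination -> (hops, next hop)) and the router names used in any call are assumptions made for the example.

```python
INFINITY = 16  # RIP treats a hop count of 16 as "unreachable"


def merge_advertisement(table, neighbor, advertised):
    """Merge a neighbor's distance vector into our routing table.

    table:      destination -> (hops, next_hop)
    advertised: destination -> hops as reported by `neighbor`
    Returns True if the table changed (which would prompt a triggered update).
    """
    changed = False
    for dest, hops in advertised.items():
        new_hops = min(hops + 1, INFINITY)
        current = table.get(dest)
        if current is None:
            # Learn a new destination only if it is actually reachable.
            if new_hops < INFINITY:
                table[dest] = (new_hops, neighbor)
                changed = True
        elif new_hops < current[0] or (current[1] == neighbor and new_hops != current[0]):
            # Strictly better route, or changed news from the current next hop.
            table[dest] = (new_hops, neighbor)
            changed = True
    return changed


def advertisement_for(table, neighbor):
    """Build the vector sent to `neighbor`, applying poison reverse:
    routes learned via that neighbor are advertised back with distance 16."""
    return {dest: (INFINITY if next_hop == neighbor else hops)
            for dest, (hops, next_hop) in table.items()}
```

Poison reverse only stops two-node ping-pong loops; larger loops can still count up to 16.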
RIPv1 Problems (Continued)
❑ Split horizon/poison reverse does not guarantee
to solve count-to-infinity problem
❑ 16 = infinity => RIP for small networks only!
❑ Slow convergence
❑ Broadcasts consume non-router resources
❑ RIPv1 does not support subnet masks (VLSMs)
❑ No authentication
RIPv2
❑ Why ? Installed base of RIP routers
❑ Provides:
❑ VLSM support
❑ Authentication
❑ Multicasting
❑ “Wire-sharing” by multiple routing domains,
❑ Tags to support EGP/BGP routes.
❑ Uses reserved fields in RIPv1 header.
❑ First route entry replaced by authentication info.
EIGRP (Enhanced Interior Gateway Routing Protocol)
Cisco proprietary; successor of IGRP, Cisco's earlier alternative to RIP
Several metrics (delay, bandwidth, reliability, load etc)
Uses its own reliable transport directly over IP (not TCP) to exchange routing updates
Loop-free routing via Distributed Updating Alg. (DUAL)
based on diffused computation
Freeze entry to particular destination
Diffuse a request for updates
Other nodes may freeze/propagate the diffusing
computation (tree formation)
Unfreeze when updates received.
Tradeoff: temporary un-reachability for some
destinations
Link State vs. Distance Vector
Link State (LS) advantages:
More stable (aka fewer routing loops)
Faster convergence than distance vector
Easier to discover network topology, troubleshoot
network.
Can do better source-routing with link-state
Type & Quality-of-service routing (multiple route
tables) possible
Link State Protocols
❑ Key: Create a network “map” at each node.
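Once every node holds the same map, each one runs a shortest-path computation over it. A minimal sketch of that step, assuming the map is a plain dictionary (router -> {neighbor: cost}); the function name and data layout are illustrative, not taken from any protocol specification.

```python
import heapq


def shortest_paths(lsdb, source):
    """Dijkstra over a link-state map.

    lsdb: router -> {neighbor: link cost}
    Returns router -> (total cost, first hop from `source`).
    """
    best = {source: (0, None)}
    heap = [(0, source, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry; a cheaper path was already found
        for nbr, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            # The first hop is the neighbor itself when leaving the source.
            hop = nbr if node == source else first_hop
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, nbr, hop))
    return best
```

The first-hop column is what ends up in the forwarding table; the rest of the path is recomputed independently by each downstream router from the same map.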
Link State Issues
❑ Reliable Flooding: sequence #s, age
❑ LSA types, Neighbor discovery and maintenance
(hello)
❑ Efficiency in Broadcast LANs, NBMA, Pt-Mpt
subnets: designated router (DR) concept
❑ Areas and Hierarchy
❑ Area types: Normal, Stub, NSSA: filtering
❑ External Routes (from other ASs), interaction
with inter-domain routing.
❑ Advanced topics: incremental SPF algorithms
Reliable Flooding…
Topology Dissemination
❑ A.k.a. LSP distribution
❑ 1. Flood LSPs on links except incoming link
❑ Requires at most 2E transfers for a network with E
edges
❑ 2. Sequence numbers to detect duplicates
❑ Why? Routers/links may go down/up
❑ Issue: wrap-around; after a wrap, the larger sequence
number is not necessarily the most recent!
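The two rules above (flood on all links except the incoming one, drop duplicates by sequence number) can be simulated to check the 2E bound. A minimal sketch under simplifying assumptions: a single LSP, loss-free links, and a hypothetical `flood` function operating on an undirected edge list.

```python
from collections import deque


def flood(links, origin, seq):
    """Flood one LSP from `origin`; return (per-node sequence numbers, transmissions).

    links: iterable of (u, v) undirected edges.
    """
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    seen = {node: -1 for node in neighbors}  # highest seq accepted so far
    seen[origin] = seq
    queue = deque((origin, nbr) for nbr in sorted(neighbors[origin]))
    sent = 0
    while queue:
        sender, receiver = queue.popleft()
        sent += 1
        if seq <= seen[receiver]:
            continue  # duplicate or stale copy: drop, do not re-flood
        seen[receiver] = seq
        for nbr in sorted(neighbors[receiver]):
            if nbr != sender:  # every link except the incoming one
                queue.append((receiver, nbr))
    return seen, sent
```

Each node forwards the LSP only on first acceptance, so every link carries it at most once in each direction, which is where the 2E bound comes from.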
Sequence Number Space Organization
❑ Circular space: comparisons can become cyclic, e.g. S1 > S2 > S3 > S1
❑ Accidental bit errors in switch memory caused this
problem in ARPANET
❑ Lollipop sequence: Start with S0, increment till you
reach circle and then view it as a circular space
❑ No ambiguity in lollipop handle
❑ Linear space: OSPFv2.
❑ If Smax reached, explicitly delete Smax LSA before
wrapping around
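The cyclic-ordering hazard of a circular space is easy to reproduce. A minimal sketch, assuming a 6-bit space and the common "newer if less than half the circle ahead" rule; the function names and the three sample values are chosen for illustration only.

```python
def circular_newer(a, b, bits=6):
    """True if `a` counts as newer than `b` in a circular space of 2**bits values."""
    half = 1 << (bits - 1)
    return a != b and ((a - b) % (1 << bits)) < half


def linear_newer(a, b):
    """In a linear space, newer simply means numerically larger."""
    return a > b


# Three values spaced about a third of the circle apart each look newer
# than the previous one, so no single "most recent" instance exists.
cycle = (circular_newer(22, 0), circular_newer(44, 22), circular_newer(0, 44))
```

With such a triple in circulation, every router keeps replacing one instance with the "newer" next one, so flooding never settles. A linear space cannot form such a cycle, at the cost of needing an explicit procedure when the maximum is reached.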
Topology Dissemination (Continued)
❑ Checksum field:
❑ Drop packet if in error, get retransmission from
neighbor
❑ Age field (similar to TTL)
❑ Number of seconds since LSA originated
❑ Periodically incremented after acceptance
❑ Originating router refreshes LSA after 30 min
❑ Delete if Age = MaxAge
❑ Low age field + large seq # => that LSA is
flapping or frequently changing …
Recovering from a partition
On partition, LSP databases can get out of sync
OSPF Router-LSA: Scenario
Neighbor Discovery & Relationship
Every OSPF router sends out 'hello' packets
Hello packets used to determine if neighbor is up
Hello packets sent periodically (short intervals)
HelloInterval = 10s (in example)
Initial DB synchronization
Hello:
Packet Format
Router-LSA:
Database Synchronization
❑ LS Database (LSDB): collection of the Link State
Advertisements (LSAs) accepted at a node.
❑ This is the “map” for Dijkstra algorithm
❑ When the connection between two neighbors comes up,
the routers must wait for their LS DBs to be synchronized.
❑ Else routing loops and black holes due to inconsistency
❑ OSPF technique:
❑ Source sends only LSA headers, then
❑ Neighbor requests the LSAs for which the source has a more recent instance.
❑ The requested LSAs are then sent over
❑ After sync, the neighbors are said to be “fully adjacent”
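The header exchange can be sketched as a simple set comparison. This is a deliberate simplification: here an LSA instance is identified by a key and a sequence number only, whereas real OSPF also compares checksum and age. The function name and the dictionary layout are assumptions for the example.

```python
def lsas_to_request(local_db, neighbor_headers):
    """Return the keys of LSAs to request from a neighbor.

    local_db:         LSA key -> sequence number we hold
    neighbor_headers: LSA key -> sequence number the neighbor described
    An LSA is requested when we lack it or the neighbor's copy is newer.
    """
    return sorted(key for key, seq in neighbor_headers.items()
                  if key not in local_db or seq > local_db[key])
```

For example, with `local_db = {"R1": 5, "R2": 3}` and neighbor headers `{"R1": 5, "R2": 4, "R3": 1}`, only `R2` and `R3` are requested; the identical `R1` instance is never retransmitted.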
Recap: IP Subnet Model
❑ Each subnet assigned one
or more address prefixes.
❑ Each address prefix is
called an IP subnet
❑ IP routes to subnets, not to
individual hosts
❑ Two hosts on different IP
subnets have to go through
one or more routers.
❑ Even if they are on the
same “physical” network
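Whether two hosts can deliver to each other directly is purely a prefix test. A minimal sketch using Python's standard `ipaddress` module; the function name and the sample addresses in the usage note are illustrative.

```python
import ipaddress


def direct_delivery(addr_a, addr_b, prefixes):
    """True if some configured subnet prefix contains both addresses."""
    a = ipaddress.ip_address(addr_a)
    b = ipaddress.ip_address(addr_b)
    return any(a in net and b in net
               for net in map(ipaddress.ip_network, prefixes))
```

With prefixes `["10.1.1.0/24", "10.1.2.0/24"]` on one physical LAN, `10.1.1.5` and `10.1.1.200` can talk directly, while `10.1.1.5` and `10.1.2.5` must go through a router even though they share the wire.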
IP Subnet Model (Contd)
❑ Two hosts or routers on
a common subnet can
send packets “directly”
to one another
❑ Two routers cannot
exchange routing
information directly
unless they have one or
more IP subnets in
common
❑ These assumptions will be
strained as we study
OSPF adjacency
operation over different
subnets
Broadcast Media Issues
❑ Multiple (N) OSPF routers attached to a common subnet
❑ Problems:
❑ One “physical link” vs N*(N-1)/2 “adjacencies”
❑ How many “links” to be counted for Dijkstra algo?
Broadcast net: # links for Dijkstra
❑ Each router is assumed to be “linked” to every other
router for the purposes of Dijkstra.
❑ Hello protocol optimization:
❑ Each node multicasts Hello to 224.0.0.5 (multicast
address “AllSPFRouters”)
❑ The Hello multicast message also indicates acks for
other routers’ Hellos by listing their RouterIDs
❑ “Link” relationship for purposes of Dijkstra maintained
by each node sending a single Hello packet, instead of
N packets.
❑ What about “flooding adjacencies”, i.e.,
❑ To whom should a router send (flood) LSAs when it generates
or learns a new LSA?
❑ Does it need to synchronize DBs with all nodes ?
Flooding Adjacencies : option 1
❑ Using Router-LSAs …
❑ O(N) Router-LSAs, with O(N^2) adjacency info
❑ Multicast of Router-LSAs does not solve O(N^2) DB
synchronization issue
Flooding Adjacencies: option 2
❑ New LSA-type: Network-LSA …
❑ O(N) Router-LSAs + 1 network-LSA + O(N) adjacencies
❑ Converted O(N^2) adjacency problem into O(N) problem
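The saving is easy to quantify. A small sketch, counting each router pair once and considering only the designated router (no backup); the function names are illustrative.

```python
def full_mesh_adjacencies(n):
    """Every pair of the n routers on the subnet synchronizes directly."""
    return n * (n - 1) // 2


def dr_only_adjacencies(n):
    """Every other router synchronizes with the designated router only."""
    return max(n - 1, 0)


# Growth of the two models for a few subnet sizes: (n, full mesh, via DR).
comparison = [(n, full_mesh_adjacencies(n), dr_only_adjacencies(n))
              for n in (2, 5, 10, 50)]
```

For 10 routers this is 45 pairwise adjacencies against 9; for 50 routers, 1225 against 49.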
Recap: O(N^2) model ⇒ O(N) model
DR, BDR … continued
❑ Backup DR (BDR) also syncs with all routers, and
takes over if DR dies (typically 5 s wait)
❑ Total: 2N – 3 adjacencies (N – 1 to the DR, N – 2 more to the BDR)
❑ Multicast-based optimization:
❑ New LSAs, Hellos sent to AllSPFRouters avoids
DR re-advertising new information
❑ LSA acks sent to AllDRouters (224.0.0.6) avoids separate
copies to be sent to DR and BDR
❑ DR election:
❑ First router on net = DR, second = BDR
❑ RouterPriority: [0, 255] indicated in Hello packet =>
highest priority router becomes DR
❑ If network is partitioned and healed, the two DRs
are reduced to one by looking at RouterPriority
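The priority rule can be sketched as a sort. This is a simplified picture of the election, assuming all routers start at the same time: real OSPF elects the BDR first and does not preempt a DR that is already in place. The function name and the tie-break on highest Router ID are stated here as assumptions of the sketch.

```python
import ipaddress


def elect_dr_bdr(routers):
    """routers: list of (router_id, priority), router_id in dotted-quad form.

    Routers with priority 0 are ineligible. Highest priority wins, with ties
    broken by numerically highest Router ID. Returns (dr, bdr); either may be None.
    """
    eligible = sorted(
        (r for r in routers if r[1] > 0),
        key=lambda r: (r[1], int(ipaddress.IPv4Address(r[0]))),
        reverse=True,
    )
    dr = eligible[0][0] if eligible else None
    bdr = eligible[1][0] if len(eligible) > 1 else None
    return dr, bdr
```

The same comparison is what lets two DRs be reduced to one after a partition heals.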
Network-LSA Example: Summary
DR
What if subnet does not support
broadcast?
❑ Non-Broadcast Multiple Access (NBMA) media
❑ NBMA segments may support more than 2 routers, and
allow any two routers to communicate directly, but do not
support data-link broadcast/mcast capability
❑ E.g.: X.25, SMDS, Frame Relay, ATM etc.
❑ Connection-oriented (VC-based) communication
❑ Each VC is costly => setting up full mesh for Hellos is
prohibitively expensive
❑ Two flooding adjacency models in OSPF:
❑ Non-Broadcast Multiple Access (NBMA) model
❑ Point-to-Multipoint (pt-mpt) Model
❑ Different tradeoffs…
NBMA Subnet Model
❑ Neighbor discovery: manually configured
❑ Dijkstra SPF views NBMA as a full mesh!
❑ Most routers assigned a RouterPriority = 0
❑ Other routers: eligible to become DRs =>
❑ IDs of all routers on the NBMA are configured
❑ Maintain VCs and Hellos with all routers eligible to
become DRs (RouterPriority > 0)
❑ Enables election of new DR if current one fails
❑ Only the DR and BDR maintain VCs and Hellos with all
routers on the NBMA
❑ DB synchronization works same as broadcast subnet
❑ Flooding in NBMA always goes through DR
❑ Multicast not available to optimize LSA flooding.
❑ DR generates network-LSA just like broadcast subnet
NBMA vs Pt-Mpt Subnet Model
❑ Key assumption in NBMA model:
❑ Each router on the subnet can communicate with
every other (same as IP model)
❑ But this requires a “full mesh” of expensive PVCs at
the lower layer!
❑ Many organizations have a hub-and-spoke PVC setup,
a.k.a. “partial mesh”
❑ Conversion into NBMA model requires multiple IP
subnets, and complex configuration (see fig on next
slide)
❑ OSPF’s pt-mpt subnet model breaks the rule that two
routers on the same network must be able to talk directly
❑ Can turn partial PVC mesh into a single IP subnet
Partial Mesh F-Relay: NBMA model
Partial Mesh F-Relay: pt-mpt model
Pt-Mpt Subnet Model
❑ Each router: single OSPF interface, but multiple neighbor
relationships
❑ Note that neighbor relationships are not formed with nodes to
which no direct PVC exists.
❑ Key differences:
❑ No DRs or BDRs! Just hellos over the PVCs. Make
sure that the communication is bi-directional.
❑ I.e. Partial mesh is viewed in Dijkstra as a partial
mesh. Full mesh view not forced like in NBMA model.
❑ Sometimes auto-configuration is possible.
❑ Loss in efficiency because the DB synchronization has to
be done between every peer.
❑ O(n^2) if full mesh. So, in true full PVC mesh situations,
it is better to operate the subnet as an NBMA
Hierarchical Routing
Why Hierarchy?
❑ Information hiding (filtered) => computation,
bandwidth, storage saved => efficiency => scalability
❑ Address abstraction vs Topology Abstraction
❑ Multiple paths possible between two areas
Hierarchical OSPF
Area
❑ Configured area ID
❑ A set of address prefixes
❑ Do not have to be contiguous
❑ So a prefix can be in only one area
❑ A set of router IDs
❑ Router functions may be interior, inter-area, or
external
Hierarchical OSPF
Two-level hierarchy: local area, backbone.
Link-state advertisements only in area
each node has detailed area topology; it only knows the
direction (shortest path) to nets in other areas.
Two-level restriction avoids count-to-infinity issues in
backbone routing.
Area border routers (ABR): “summarize” distances to
nets in own area, advertise to other Area Border routers.
Backbone routers: use DV-style routing between
backbone routers
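The DV-style step at area boundaries amounts to adding two distances. A minimal sketch, assuming each ABR advertises a summarized cost per prefix and the receiving router already knows its own shortest-path cost to each ABR; the function name, data layout and sample prefixes are illustrative.

```python
def inter_area_routes(cost_to_abr, summaries):
    """Pick the best ABR for each prefix in another area.

    cost_to_abr: ABR -> our intra-area cost to reach it
    summaries:   ABR -> {prefix: cost advertised by that ABR}
    Returns prefix -> (total cost, chosen ABR).
    """
    best = {}
    for abr, routes in summaries.items():
        for prefix, advertised in routes.items():
            total = cost_to_abr[abr] + advertised
            if prefix not in best or total < best[prefix][0]:
                best[prefix] = (total, abr)
    return best
```

The router never sees the other area's topology, only a distance per prefix, which is why this part behaves like distance-vector routing.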
Summary-LSA Example
Externals and Aggregation 1
❑ A full ISP routing table has approximately 100K
routes!
❑ But will you do anything differently if you know
all of them and have a single ISP?
❑ Multiple ISP situations call for complex OSPF
and BGP design
❑ Never redistribute IGPs into BGP! (later…)
❑ Redistribute BGP into IGPs with extreme care
Externals & Aggregation 2
❑ In an enterprise
❑ Limit externals from subordinate domains
(e.g., RIP) to be within area (area-scope)
❑ Flood only in area 0 and in area with ASBR
Type 1 and Type 2 externals
❑ Type 2:
❑ Default type for routes distributed into OSPF
❑ EGP costs very different from IGP costs
❑ Exit based on external (EGP) cost only
❑ Type 1
❑ Needs to be set explicitly: not default
❑ IGP costs can be compared and summed
❑ Selects exit based on internal + external costs
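The difference between the two metric types shows up when two exits (ASBRs) advertise the same external destination. A minimal sketch, assuming each candidate is described by an internal cost to the ASBR and the external cost it advertised; the function name and tuple layout are illustrative. For Type 2 the sketch breaks ties on internal cost, which is an addition to the rule stated above.

```python
def pick_exit(exits, metric_type):
    """Choose the exit router for an external destination.

    exits: list of (asbr, internal_cost, external_cost)
    metric_type: 1 or 2
    """
    if metric_type == 1:
        # Type 1: internal and external costs are comparable, so add them.
        key = lambda e: e[1] + e[2]
    else:
        # Type 2: external cost decides; internal cost only breaks ties.
        key = lambda e: (e[2], e[1])
    return min(exits, key=key)[0]
```

With exits `[("ASBR1", 30, 5), ("ASBR2", 1, 20)]`, Type 1 selects `ASBR2` (total 21 against 35) while Type 2 selects `ASBR1` (external 5 against 20), even though it is much farther away internally.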
Stubbiness: A Means of
Controlling Externals
Normal Areas
IS-IS Terminology
Functional Comparison
❑ Protocols are recognizably similar in function and
mechanism (common heritage)
❑ Link state algorithms
❑ Two level hierarchies
❑ Designated Router on LANs
❑ Widely deployed (ISPs vs enterprises)
❑ Multiple interoperable implementations
❑ OSPF more “optimized” by design (and therefore
significantly more complex)
❑ IS-IS not designed from the start as an IP routing
protocol (and is therefore a bit clunky in places)
Sample comparison points
❑ Encapsulation
❑ OSPF runs on top of IP => relies on IP fragmentation
for large LSAs
❑ IS-IS runs directly over L2 (next to IP) =>
fragmentation done by IS-IS
❑ Media support
❑ Both protocols support LANs and point-to-point links in
similar ways
❑ IS-IS supports NBMA in a manner similar to OSPF pt-mpt
model: as a set of point-to-point links
❑ OSPF NBMA mode is configuration-heavy and risky
(all routers must be able to reach DR; bad news if VC
fails)
Packet Encoding
❑ OSPF is “efficiently” encoded
❑ Positional fields, 32-bit alignment
❑ Only LSAs are extensible (not Hellos, etc.)
❑ Unrecognized types not flooded. Opaque-LSAs
recently introduced.
[Figure: IS-IS PDU header fields, 1 byte each: Length Indicator; Version/Protocol ID Extension; ID Length; R R R + PDU Type; Version; Reserved; followed by TLV Fields]
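The TLV fields that follow the IS-IS header are what make its encoding extensible: each item carries a one-byte type and a one-byte length, so a receiver can step over types it does not recognize. A minimal parsing sketch; the function name is illustrative and the byte values in the usage note are made up for the example.

```python
def parse_tlvs(data):
    """Split a byte string into (type, value) pairs.

    Each TLV is: 1 byte type, 1 byte length, then `length` bytes of value.
    Raises ValueError if the data ends in the middle of a TLV.
    """
    tlvs = []
    i = 0
    while i < len(data):
        if i + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[i], data[i + 1]
        value = data[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, value))
        i += 2 + length  # skip to the next TLV, known type or not
    return tlvs
```

For instance, `parse_tlvs(bytes([1, 2, 0xAA, 0xBB, 129, 1, 0xCC]))` yields two items, types 1 and 129, without the parser needing to understand either value.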
More detailed comparison provided
as a reference (not covered in
class)…
Private Network to Node Interface (PNNI)
❑ Link State Routing Protocol for ATM Networks
PNNI Features
❑ Scales to very large networks.
❑ Supports hierarchical routing.
❑ Supports QoS.
❑ Supports multiple routing metrics and attributes.
❑ Uses source routed connection setup.
❑ Operates in the presence of partitioned areas.
❑ Provides dynamic routing, responsive to changes in
resource availability.
❑ Separates the routing protocol used within a peer group
from that used among peer groups.
❑ Interoperates with external routing domains, not
necessarily using PNNI.
❑ Supports both physical links and tunneling over VPCs.
PNNI Terminology (partial)
❑ Peer group: A group of nodes at the same hierarchy
❑ Border node: one link crosses the boundary
❑ Logical group node: Representation of a group as a single point
❑ Child node: Any node at the next lower hierarchy level
❑ Parent node: LGN at the next higher hierarchy level
❑ Logical links: links between logical nodes
❑ Peer group leader (PGL): Represents a group at the next higher
level.
❑ Node with the highest "leadership priority" and highest
ATM address is elected as a leader.
❑ PGL acts as a logical group node.
❑ Uses same ATM address with a different selector value.
❑ Peer group ID: Address prefixes up to 13 bytes
PNNI Terminology
Hierarchical Routing: PNNI
Source Routing
❑ Source specifies route as a list of all intermediate
systems in the route. Abstracts out area hops.
❑ Designated Transit List (DTL): source route
across each level of hierarchy
❑ Entry switch of each peer group specifies
complete route through that group
❑ Set of DTLs and manipulations implemented
as a stack
❑ DTL example: next slide
DTL Example
Crank back and Alternate Path Routing
❑ If a call fails along a particular route:
❑ It is cranked back to the originator of the top DTL
❑ The originator finds another route or
❑ Cranks back to the generator of the higher level
source route
Summary