
High Availability

for IPTV

Søren Andreasen
CCIE #3252
Cisco DK



Agenda

• IPTV Architecture overview


• High Availability (HA) in the core and edge
– Router Level HA with SSO
– Protocol HA with NSF/GR (IGP, BGP, Multicast)
– MPLS HA (LDP NSF/GR, BGP + label NSF, FRR, PW Redundancy)
– Fast Convergence (Multicast FC, fast channel change, LDP-IGP sync, LDP session protection)

• Video Source Redundancy


• Conclusion

Tripleplay world

[Architecture diagram]
Policy and Service Management Layer: Portal, Billing, Broadband Policy Manager, Subscriber Identity Database, Policy Definition.
Transport: residential access (ETTx, Cable, DSL, PON; RGW and STBs) → Ethernet access and aggregation (ISG/BRAS, Local CO, PE-R/PE-S) → IP/MPLS/Ethernet metro (Metro CO, PE-R; distributed model: L2 PW, VPLS, L3 VPN; centralized model: H-VPLS, L3 VPN) → IP/MPLS Core.
Service/Application Layer: Service Control Engine, iFrame Cache, Hosted Business Apps (Storage, Centrex, Security, Gaming), VoD, VoIP, Video Broadcast.
High Availability

• End-to-end protection against network failures


• Preserve end-to-end network service connectivity
• Isolate control plane failures from forwarding
plane
• Separation of control and forwarding plane should
allow forwarding to continue while control plane
recovers (NSF)

HA Mechanisms

• Component and Device Level Resiliency


– Hardware and software component resiliency
• Distributed line cards, route processors, modular operating software
– Stateful Switch-Over between RPs (SSO)
– Control/forwarding plane decoupling; Non-Stop
Forwarding (NSF)
– Graceful Restart of protocols (GR)

• Network Level Resiliency


– Optimized convergence algorithms, speeding up network
recovery

Device Level HA:
NSF With SSO

• Stateful Switch-Over (SSO): zero interruption to protocol sessions
– Active RP synchronizes information with the standby RP
– Session state is maintained on the standby RP for high-availability-aware protocols
– The standby RP takes control when the active RP is compromised

• Non-Stop Forwarding (NSF): minimal or no packet loss
– Packet forwarding continues during reestablishment of peering relationships
– No route flaps between participating neighbor routers

[Diagram: Route Processors RP1 (ACTIVE) and RP2 (STANDBY) connected across the chassis bus to the line cards]

SSO in action

    HA-Router(config)#redundancy
    HA-Router(config-red)#mode ?
      rpr       Route Processor Redundancy
      rpr-plus  Route Processor Redundancy Plus
      sso       Stateful Switchover (Hot Standby)
    HA-Router(config-red)#mode sso

    HA-Router#show redundancy
    Active GRP in slot 0:
    Standby GRP in slot 9:
    Preferred GRP: none
    Operating Redundancy Mode: SSO
    Auto synch: startup-config running-config
    switchover timer 3 seconds [default]

GR/NSF Fundamentals

• Non-Stop Forwarding (NSF) is a way to continue forwarding packets while the control plane is recovering from a failure.
• Graceful Restart (GR) is a way to rebuild forwarding information in routing protocols when the control plane has recovered from a failure.

GR/NSF Fundamentals (without GR/NSF)

• Router A loses its control plane for some period of time.
• It will take some time for Router B to recognize this failure and react to it.

[Diagram: Routers A and B, each split into a control plane and a data plane]

GR/NSF Fundamentals (without GR/NSF)

• During the time that A has failed and B has not yet detected the failure, B continues forwarding traffic through A.
• Once A's control plane resets, the data plane resets as well, and this traffic is dropped.
• NSF reduces or eliminates the traffic dropped while A's control plane is down.

[Diagram: A's restarting control plane resets its data plane while B is still forwarding towards A]

GR/NSF Fundamentals

• If A is NSF capable, the control plane will not reset the data plane when it restarts.
• Instead, the forwarding information in the data plane is marked as stale.
• Any traffic B sends to A is still switched based on the last known forwarding information.

[Diagram: A's restarting control plane marks the data plane's forwarding information as stale instead of resetting it]
GR/NSF Fundamentals

• While A's control plane is down, the routing protocol hold timer on B counts down.
• A has to come back up and signal B before B's hold timer expires, or B will route around it.
• When A comes back up, it signals B that it is still forwarding traffic and would like to resync.

[Diagram: B's hold timer counting down from 15 while A recovers]

GR/NSF Design Goals

• No Link Flap
• CEF table sync to the standby RP
• LC CEF Table not cleared on switchover
• Packet forwarding during switchover while routing
is converging on Standby RP
• Maintain peer relationship, no adjacency flapping
with peers
• Limit the restart to a local event, not a network-wide one

Ultimate Goal: Achieve 0% Packet Loss


IS-IS GR/NSF

• Use the nsf command under the router isis configuration mode to enable graceful restart (configured on both routers A and B):

    router isis
     nsf
     ....

• show isis nsf can be used to verify that graceful restart is operational:

    router#show isis nsf
    NSF is ENABLED, mode 'cisco'
    RP is ACTIVE, standby ready, bulk sync complete
    NSF interval timer expired (NSF restart enabled)
    Checkpointing enabled, no errors
    Local state:ACTIVE, Peer state:STANDBY HOT, Mode:SSO

OSPF GR/NSF

• Use the nsf command under the router ospf configuration mode to enable graceful restart (configured on both routers A and B):

    router ospf 100
     nsf
     ....

• show ip ospf can be used to verify that graceful restart is operational:

    router#show ip ospf
    Routing Process "ospf 100" with ID 10.1.1.1
    ....
    Non-Stop Forwarding enabled, last NSF restart 00:02:06 ago (took 44 secs)

    router#show ip ospf neighbor detail
    Neighbor 3.3.3.3, interface address 170.10.10.3
    ....
    Options is 0x52
    LLS Options is 0x1 (LR), last OOB-Resync 00:02:22 ago
BGP GR/NSF

• Use the bgp graceful-restart command under the router bgp configuration mode to enable graceful restart (configured on both peers):

    router bgp 65000
     bgp graceful-restart
     ....

    router bgp 65501
     bgp graceful-restart
     ....

• show ip bgp neighbors can be used to verify that graceful restart is operational:

    router#show ip bgp neighbors x.x.x.x
    ....
    Neighbor capabilities:
    ....
    Graceful Restart Capabilty:advertised and received
    Remote Restart timer is 120 seconds
    Address families preserved by peer:
      IPv4 Unicast, IPv4 Multicast

Multicast HA:
PIM/SSM in the Core

How Triggered PIM Join(s) work when the Active Route Processor fails:

1. The Active Route Processor receives periodic PIM Joins in steady state
2. The Active Route Processor fails
3. The Standby Route Processor takes over
4. A "special" PIM Hello is sent out
5. This triggers adjacent PIM neighbors to resend PIM Joins, refreshing the state of the distribution tree(s) and preventing them from timing out

[Diagram: active and standby Route Processors above the line cards; periodic PIM Joins arriving in steady state, the "special" PIM Hello after switchover, and the triggered PIM Joins from the neighbors]
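
As a sketch (interface name assumed), the PIM/SSM side of this needs nothing beyond standard multicast configuration on a redundant-RP router; the triggered joins after a switchover come with SSO itself:

    redundancy
     mode sso
    !
    ip multicast-routing
    ip pim ssm default          ! use the default SSM range 232.0.0.0/8
    !
    interface GigabitEthernet0/0
     ip pim sparse-mode         ! PIM on every core-facing link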

Multicast HA:
IGMP at the edge
Provides seamless forwarding of multicast traffic during Supervisor switchovers.

How it works:

1. A global sync of the IGMP data structures occurs when the standby Supervisor comes online
2. Periodic syncs are performed as changes occur
3. The active Supervisor fails
4. The standby Supervisor takes over
5. The synced IGMP data structures are used to provide Layer-2 Non-Stop Forwarding (NSF) of multicast traffic

[Diagram: IGMP data structures on the active Supervisor copied to the standby via the initial global sync and subsequent periodic syncs]

MPLS HA – Component Resiliency

• MPLS HA features extend NSF with SSO for:
– LDP (for L3VPNs and L2VPNs)
– BGP (L3VPNs)
– RSVP (for TE)

• Minimal disruption to the MPLS forwarding plane due to route processor control plane failures
– Includes MPLS control plane failures (LDP, BGP, RSVP)

MPLS HA – Network Resiliency

• MPLS control plane protocol enhancements to improve failure detection time and network convergence
• Graceful Restart (GR)
– LDP, BGP, RSVP
• Fast Convergence
– LDP-IGP sync, LDP session protection
• TE FRR
– Link protection
– Node protection
• Pseudowire Redundancy
MPLS Graceful Restart + NSF/SSO

No MPLS HA support:
[Diagram: PE - P - PE. (1) The P router's control plane fails; (2) its peers' control-plane sessions reset; (3) the data plane resets and the LSPs through the P router are torn down.]

MPLS HA support (NSF/SSO + Graceful Restart):
[Diagram: PE - P - PE. (1) The P router's control plane fails and (2) the peers' sessions reset, but the data plane keeps forwarding and the LSPs stay up.]

LDP NSF/GR

• LDP NSF/SSO/GR is also known as LDP NSF
• LDP NSF allows a route processor to recover from a disruption in the LDP control plane without losing its MPLS forwarding state
• LDP NSF works with LDP sessions between directly connected peers as well as with targeted sessions
• LDP HA key elements:
– Checkpointing local label bindings (synchronization between RPs)
• On devices with route processor redundancy
– LDP graceful restart capability
• On participating PEs, RRs, and P routers

LDP NSF/GR Key Elements:
Checkpointing Local Label Bindings

• The checkpointing function is enabled by default
• The checkpointing function copies the active RP's LDP local label bindings to the backup RP over the IPC channel
• For the first round, all labels are copied from the active to the backup RP
• Periodic incremental updates are then done to reflect routes that have been learned or removed, and/or labels that have been allocated or freed
• Label bindings on the backup RP are marked "checkpointed"
• This marking is removed when the backup becomes the active RP

[Diagram: RP A (active) checkpointing local label bindings to RP B (standby) over IPC, above the line cards]

LDP NSF/GR Operation
    Rb(config)#mpls ldp graceful-restart

[Diagram: LSRa - LSRb (primary/standby RPs) - LSRc. At session reset, LSRa and LSRc log "LDP GR: 20.0.0.0/8 (resp. 10.0.0.0/8): updating binding from 1.1.1.1:0, inst 1: marking stale"; at LDP restart they log "LDP GR: ...: refreshing stale binding from 1.1.1.1:0, inst 1 -> inst 2".]

• LDP paths are established and LDP GR is negotiated
• When the RP fails on LSRb, communication between the peers is lost; LSRb encounters an LDP restart, while LSRa and LSRc encounter an LDP session reset
• LSRa and LSRc mark all the label bindings from LSRb as stale, but continue to use the same bindings for MPLS forwarding
• LSRa and LSRc attempt to reestablish an LDP session with LSRb
• LSRb restarts and marks all of its forwarding entries as stale; at this point, LDP is in restarting mode
• LSRa and LSRc reestablish LDP sessions with LSRb, but keep their stale label bindings; at this point, LDP is in recovery mode
• All routers re-advertise their label binding information; stale flags are removed as labels are relearned, and the new LFIB table is ready
BGP NSF/GR for Inter-AS and CSC
Enable MPLS Non-Stop Forwarding on an interface that uses BGP as the label distribution protocol (Inter-AS and CSC environments):

    mpls bgp forwarding

[Diagram: AS1_PE1 - ASBR-1 (AS #100) - eBGP IPv4 + Labels - ASBR-2 (AS #200) - AS2_PE1; mpls bgp forwarding is configured on the inter-AS link at both ASBRs]
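
A minimal ASBR-1 sketch, using the AS numbers from the figure but with assumed interface and addressing:

    interface GigabitEthernet0/0
     description inter-AS link to ASBR-2
     ip address 192.0.2.1 255.255.255.252   ! assumed addressing
     mpls bgp forwarding                    ! label forwarding on the BGP-labeled link
    !
    router bgp 100
     bgp graceful-restart                   ! preserve labeled forwarding across an RP switchover
     neighbor 192.0.2.2 remote-as 200
     neighbor 192.0.2.2 send-label          ! advertise MPLS labels with IPv4 routes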

MPLS TE Fast Re-Route (FRR)
Link Protection
• IP routing protocols (e.g., OSPF, BGP) may be tuned to converge within a few seconds
• Some traffic (e.g., voice) requires more aggressive convergence times
– Typically 50 ms or less
• MPLS TE FRR offers protection against network failures:
– Link protection
– Node protection

[Diagrams: link protection - a backup path over R3 around the protected R2-R4 link on the R1-R2-R4-R5 path; node protection - a backup path over R3-R4-R5 around the protected node R6 on the R1-R2-R6-R7-R8 path, with R9 as an alternate]

FRR Link Protection

• Creation of a next-hop (NHOP) backup tunnel
– A parallel path around the protected link to the next hop
• On link failure detection, the Point of Local Repair (PLR) swaps the label and pushes the backup label
– Traffic is sent over the backup path
• The PLR notifies the TE Head End (HE), which triggers global TE path re-optimization

[Diagram: protected LSP R1 (HE) → R2 (PLR) → R4 → R6 using labels 10 and 11; NHOP backup tunnel from R2 via R3 to R4 (R5 as an alternate), carrying backup label 20 pushed on top of the protected LSP's label 11]
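
A configuration sketch for the PLR (the tunnel number, router IDs, and explicit path are assumptions, not taken from the figure):

    ! On R2 (PLR): NHOP backup tunnel to R4, assumed router ID 10.0.0.4
    ip explicit-path name AVOID_R2_R4_LINK enable
     next-address 10.0.0.3                  ! via R3
     next-address 10.0.0.4
    !
    interface Tunnel100
     ip unnumbered Loopback0
     tunnel mode mpls traffic-eng
     tunnel destination 10.0.0.4
     tunnel mpls traffic-eng path-option 1 explicit name AVOID_R2_R4_LINK
    !
    interface GigabitEthernet0/1
     description protected link R2-R4
     mpls traffic-eng backup-path Tunnel100 ! use Tunnel100 when this link fails
    !
    ! On R1 (HE), the primary tunnel must request protection:
    !  interface Tunnel1
    !   tunnel mpls traffic-eng fast-reroute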

FRR Node Protection

• Creation of a next-next-hop (NNHOP) backup tunnel
– A parallel path around the protected node to the next-next-hop
• A node failure triggers the Point of Local Repair (PLR) to swap the label and push the backup label
– Traffic is sent over the backup path around the failed node
• The PLR notifies the TE Head End (HE), which triggers global TE path re-optimization

[Diagram: protected LSP R1 (HE) → R2 (PLR) → R6 → R7 → R8 using labels 10, 11, 12; NNHOP backup tunnel from R2 via R3-R4-R5 to R7, carrying backup labels 20/21 pushed on top of label 12 to bypass the protected node R6; R9 as an alternate]

FRR in Triple Play with P2MP-TE
Basic RSVP-TE P2MP Tunnel Setup

[Diagram: end-to-end path Source → CE → PE (Service Edge) → MPLS Core (P routers) → PE (Distribution/Access) → CE → Receiver. Multicast packets enter the primary RSVP tunnel at the head-end PE, travel label-switched (multicast packet + MPLS label) across the core, and exit to Layer 2 switches at the edges; a backup RSVP tunnel follows a disjoint path through the core.]
Multiple Points of Failure Still Exist

[Diagram: the same end-to-end path (Source, Service Edge, Core, Distribution/Access, Receiver) with failure marks (X) on every segment: CE-PE attachment circuits, PE routers, core links, and P routers, along both the primary and backup RSVP tunnels]

Pseudo Wire HA

[Diagram: CE1 at Site1 attached to PE1/PE2; pseudowires across the P1-P4 core to PE3/PE4; CE2 at Site2]

• Failure in the provider core: mitigated with link redundancy and FRR
• PE router failure: another PE, or NSF/SSO
• PE-CE attachment circuit failure: needs a pair of attachment circuits end-to-end
• CE router failure: redundant CEs

Pseudowire Redundancy

[Diagram: the primary pseudowire PE1 → PE3 towards CE2 fails (x); a backup pseudowire PE1 → PE4 takes over towards CE3 at Site2]

    pe1(config)#int e 0/0.1
    pe1(config-subif)#encapsulation dot1q 10
    pe1(config-subif)#xconnect <PE3 router ID> <VCID> encapsulation mpls
    pe1(config-subif-xconn)#backup peer <PE4 router ID> <VCID>
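
To check which pseudowire is active after configuring the backup peer, show mpls l2transport vc lists both VCs and their state (output varies by release, so only the command is shown):

    pe1#show mpls l2transport vc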

MPLS HA
[Diagram: Site1 (10.1.1.0/24) dual-homed to PE1/PE2 (primary/standby); P1-P4 core with a failed link (x); Site2 (10.1.2.0/24) dual-homed to PE3/PE4]

• Link failures: solved using redundancy in the network design plus MPLS TE FRR and/or a fast IGP
• Node failures: solved using redundant nodes and meshing of connections
– Also using MPLS TE FRR node protection
• Line card failures
– Hardware redundancy: line card redundancy (1+1, 1:N, 1:1)
• PE router (RP) failures
– Dual-RP systems with NSF/SSO
– HA software provides a seamless transition
HA in MPLS L3VPN Environment
[Diagram: primary/standby PEs for 10.1.1.0/24 and 10.1.2.0/24 connected across the P1-P4 core; LDP in the core, iBGP via route reflectors RR1/RR2]

• LDP and BGP use TCP as a reliable transport mechanism for their protocol messages
• The TCP session between two LDP/BGP peers may go down due to a HW/SW failure (RP switchover)
• Use IGP, BGP, and LDP NSF/GR
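
A consolidated sketch of what "IGP, BGP, and LDP NSF/GR" means on a dual-RP PE (process ID and AS number assumed):

    redundancy
     mode sso
    !
    router ospf 1
     nsf                        ! OSPF graceful restart
    !
    router bgp 65000
     bgp graceful-restart       ! GR capability, negotiated with the RRs and peers
    !
    mpls ldp graceful-restart   ! LDP GR; local bindings are checkpointed to the standby RP by default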

HA in L2VPN Networks

[Diagram: Site1 attached to PE1 with a primary pseudowire to PE3 and a standby pseudowire to PE4, across the P1-P4 core to Site2]

• The TCP session between two LDP peers may go down due to a HW/SW failure (RP switchover)
• If PE3 fails, traffic will be dropped
• Use PW redundancy, LDP NSF/GR, and IGP NSF/GR
Fast Convergence:

• Too much to cover; not possible in 45 minutes
• Layer 2 tuning
– SONET alarms
– Interface timers
• CEF optimizations
• IGP FC
– Fast IGP timers, iSPF, fast flooding, fast hellos, etc.
• BGP FC
– Next-hop tracking, BGP timers
• BFD (Bidirectional Forwarding Detection)
• Multicast Fast Convergence
– Multicast timers, fast channel change
• LDP
– LDP-IGP sync, LDP session protection
(a sample IGP/BFD tuning sketch follows this list)
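
A sketch of the IGP and BFD pieces, with hypothetical timer values that must be tuned and validated per platform:

    router ospf 100
     timers throttle spf 10 100 5000      ! initial/hold/max SPF delay (msec)
     timers throttle lsa all 10 100 5000  ! LSA generation throttling
     bfd all-interfaces                   ! register OSPF with BFD
    !
    interface GigabitEthernet0/0
     ip ospf dead-interval minimal hello-multiplier 4  ! sub-second hellos
     bfd interval 50 min_rx 50 multiplier 3            ! ~150 ms failure detection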

Multicast Subsecond Convergence

First, what it's NOT:
A single feature or set of CLI commands

So what is it?
Various components seamlessly working together to provide an end-to-end fast-convergence solution

Objective:
Achieve and meet the requirements for sub-second channel change, pause, resume, etc.

Multicast Subsecond Convergence:
The Video Challenge - Channel Change Time & Source Redundancy

Network components and their contributions to channel-change delay and slow convergence:

1. Set Top Box (STB): performance in processing the channel change request
2. Network signaling: latency to join or resume the new IP video feed
3. Encryption: delay incurred if encryption is used
4. DSLAM signaling: latency to terminate the previous IP video feed
5. Multicast distribution tree building: network signaling latency to rebuild tree(s) during link/node failures in the core
6. Video sources (primary/backup): time for the video feed to recover during a primary-source failover
Multicast Subsecond Convergence
Areas Addressed Today:
1 Set Top Box (STB) performance in processing channel
change request
• Vendor dependent
2 Network Signaling latency to join or resume new IP video feed
• Multicast Fast Join/Leave for Fast Channel Change
3 Delay incurred if using encryption
• Vendor dependent
4 DSLAM Signaling latency to terminate previous IP video feed
• Vendor dependent
5 Network Signaling latency to rebuild tree(s) in the core
• IGP timers
• RPF Interval timer - ip multicast rpf interval <seconds> [list <acl> | route-map <route-map>]
• PIM RPF Failover timer - ip multicast rpf backoff <min> <max> [disable]
• PIM Router Query interval timer - ip pim query-interval <period> [msec]
• PIM Triggered Join(s)
• IGMP High Availability
6 Time for video feed to recover during primary source failover
• Anycast Sources (and other related methodologies)
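
The core timers from item 5, shown with hypothetical values (interface name assumed):

    ip multicast rpf interval 1          ! scan for RPF changes every second
    ip multicast rpf backoff 100 500     ! react to an RPF change within 100-500 msec
    !
    interface GigabitEthernet0/0
     ip pim query-interval 250 msec      ! sub-second PIM hellos for faster neighbor-loss detection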
Fast Join/Leave for Faster Channel Change

Problem Description:
In networks where bandwidth is constrained between
multicast routers and hosts (like in xDSL deployments),
fast channel changes can easily lead to
bandwidth oversubscription, resulting in a temporary
degradation of traffic flow for all users.

Solution:
Reduce the leave latency during a channel change
by extending the IGMPv3 protocol.

Benefits:
• Faster channel changing without BW oversubscription
• Improved diagnostics capabilities

© 2005 Cisco Systems, Inc. All rights reserved.


44
Multicast Fast Join/Leave for
Faster Channel Change

How it works:
• Relies on IGMPv3
• The router tracks both the user and the channel(s) being watched
• When a user leaves a channel that no one else is watching, the router immediately prunes the channel off the interface, versus a leave latency of up to 3 seconds with IGMPv2 and up to 180 seconds with IGMPv1!

Configuration (first introduced in 12.0(29)S):

    interface Ethernet0
     ip pim sparse-mode
     ip igmp version 3
     ip igmp explicit-tracking

[Diagram: users "David" and "John" sending IGMP reports on interface E0]

Explicit-tracking state:

    Int  Channel               User
    E0   10.0.0.1, 239.1.1.1   "David"
    E0   10.0.0.1, 239.2.2.2   "John"
    E0   10.0.0.1, 239.3.3.3   "David"
Fast Convergence:
LDP-IGP Sync => LDP Down but IGP Still Up

1. Data stream between CE1 and CE2 traverses the SP cloud
2. The LDP adjacency goes down between P4 and P3, but the IGP still points to P3 as the best path
3. Labeled traffic keeps following the IGP best path through P3 and is dropped, because the label bindings on that path are gone

[Diagram: PE1 - P1 - P3/P4 - PE4 core with P2 as an alternate path and RR1/RR2; primary and standby PEs serve Network x (CE1) and Network y (CE2)]
LDP Down but IGP Still Up
with the IGP-LDP Sync Feature

1. Data stream between CE1 and CE2 traverses the SP cloud
2. The LDP adjacency goes down between P4 and P3, but the IGP still points to P3 as the best path
3. P1 chooses P2 as the preferred IGP path, because a max-metric is now advertised for the broken link and P2 offers the lower metric

[Diagram: same topology as the previous slide (PE1/PE2, P1-P4, RR1/RR2, PE3/PE4); the broken link advertises max-metric, shifting traffic to P2]
IGP-LDP Sync

• The IGP-LDP Sync feature synchronizes the state of the IGP and LDP between two routers
• If an LDP adjacency between two routers goes down, the routers advertise a max-metric via the IGP for the link connecting them
• This way, the upstream router can choose an alternate path, if one is available
• The following configuration needs to be enabled on the routers:

    P4(config)#router ospf 1
    P4(config-router)#mpls ldp sync
    P4(config)#mpls ldp igp sync holddown <msecs>

– If no holddown is configured, no timeout is applied, i.e. the IGP will wait forever

Fast Convergence:
LDP Session Protection

• There are two discovery mechanisms for LDP peers:
1. LDP Basic Discovery: discovery of directly connected neighbors via link hellos
2. LDP Extended Discovery: discovery of non-directly connected neighbors via targeted hellos
• If session protection is enabled, the establishment of a directly connected LDP session triggers Extended Discovery with the neighbor
• With targeted hellos, the session stays up even when the link goes down
• There is no need to re-establish an LDP session with the link neighbor and relearn prefix label bindings when the link recovers

    p1(config)#mpls ldp session protection ?
      duration  Period to sustain session protection after loss of link discovery
      for       Access-list to specify LDP peers
      vrf       VRF Routing/Forwarding instance information

[Diagram: P1 and P3 maintain Basic Discovery over their direct link and Extended Discovery via P2/P4; mpls ldp session protection is configured]
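
An example of enabling it (the ACL number, peer address, and duration are assumptions):

    p1(config)#access-list 10 permit 10.0.0.3
    p1(config)#mpls ldp session protection for 10 duration 300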
Video Source Redundancy: Two Approaches

Primary-Backup "Heartbeat":
• Two sources; one active (sourcing content), the second in standby mode (not sourcing content)
• A heartbeat mechanism lets the sources communicate with each other
• Only one copy is on the network at any instant
• A single multicast tree is built per the unicast routing table
• Uses only the required bandwidth
• Receiver functionality is simpler: the receiver is aware of only one source; fail-over logic is handled between the sources
• This approach requires the network to have fast IGP and PIM convergence

Hot-Hot:
• Two sources, both active and sourcing multicast into the network
• No protocol between the two sources
• Two copies of the multicast packets are in the network at any instant
• Two multicast trees on (almost) redundant infrastructure
• Uses 2x the network bandwidth
• Receiver is smarter: it is aware of / configured with two feeds (S1,G1), (S2,G2) / (*,G1), (*,G2), joins both, and is always receiving both feeds
• This approach does not require fast IGP and PIM convergence

Video Source Redundancy:
Heartbeat Model
[Diagram: primary and secondary copies of each of three sources (Source 1, 2, 3) feed the national backbone through service-edge PEs; each primary/secondary pair exchanges a heartbeat, and only the active source injects traffic, which the P/PE backbone distributes to the regional backbones and on to the residences. An X marks a failed primary (Source 1), with its secondary taking over.]

Video Source Redundancy:
Hot-Hot Video Delivery Model
[Diagram: both copies of each source (Source 1, 2, 3) are active at the service edge and inject traffic simultaneously; two multicast trees run across the national backbone (P/PE routers) to the regional backbones.]

Multicast Source Redundancy
Using Anycast Sources

How is source redundancy achieved in the network?
• Enable SSM on all routers
• Have R1 and R2 advertise the same prefix for each source segment
• R3 and R4 follow the best path towards the source based on IGP metrics
• Say R3's best path to SF is through R1, and the source in SF suddenly fails
• R3's IGP will reconverge and trigger SSM joins towards R2 in NY

[Diagram: anycast sources in SF and NY, both addressed 1.1.1.1, behind R1 and R2; STBs behind R3 and R4 send IGMPv2 joins, and the last-hop router sends the SSM join "to the nearest 1.1.1.1/32"]
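
A minimal sketch of the anycast-source side (interface and addressing assumed): R1 and R2 each connect to a source configured with the same address 1.1.1.1 and advertise that subnet into the IGP, so receivers' SSM joins for (1.1.1.1, G) follow the IGP metric to the nearest source:

    interface GigabitEthernet0/0
     description link to anycast video source 1.1.1.1
     ip address 1.1.1.2 255.255.255.0
     ip pim sparse-mode
    !
    router ospf 1
     network 1.1.1.0 0.0.0.255 area 0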
Conclusion

• High Availability involves many more components in Metro Ethernet and DSL/PON deployments, such as authentication/application servers
– Distributed DSLAMs
• The slowest component is a bottleneck for the entire system
• Be careful when combining High Availability with Fast Convergence: race conditions are possible!
