
BRKACI-2003

Cisco ACI Multi-Pod Design and Deployment

John Weston – Technical Marketing Engineer


Data Center Networking
CCIE 22370
Session Objectives

At the end of this session, participants should be able to:

 Articulate the different deployment options to interconnect Cisco ACI networks (Multi-Pod vs. Multi-Site)
 Understand the functionality and the specific design considerations associated with the ACI Multi-Pod fabric option

Initial assumption:
 The audience already has a good knowledge of the main ACI concepts (Tenant, BD, EPG, L2Out, L3Out, etc.)
Agenda
• ACI Network and Policy Domain Evolution
• ACI Multi-Pod Deep Dive
  - Overview, Use Cases and Supported Topologies
  - APIC Cluster Deployment Considerations
  - Inter-Pod Connectivity Deployment Considerations
  - Control and Data Planes
  - Connecting to the External Layer 3 Domain
  - Network Services Integration
  - Migration Scenarios
Cisco Webex Teams

Questions?
Use Cisco Webex Teams (formerly Cisco Spark)
to chat with the speaker after the session

How
1. Find this session in the Cisco Events Mobile App
2. Click “Join the Discussion”
3. Install Webex Teams or go directly to the team space
4. Enter messages/questions in the team space

cs.co/ciscolivebot#BRKACI-2003
ACI Network and Policy Domain Evolution
Introducing: Application Centric Infrastructure (ACI)

[Diagram: an application policy — Web, App, and DB EPGs with QoS and filters, plus an Outside connection in the tenant VRF — defined on the APIC (Application Policy Infrastructure Controller) and rendered on an ACI fabric with an integrated GBP VXLAN overlay]
ACI Anywhere
Fabric and Policy Domain Evolution

• ACI 1.0 – Single Pod fabric (leaf/spine)
• ACI 2.0 – Multi-Pod fabric: multiple networks (Pods) in a single availability zone (fabric), interconnected over an IPN with MP-BGP EVPN and managed by a single APIC cluster
• ACI 3.0 – Multi-Site: multiple availability zones (fabrics) in a single region
• ACI 3.1/3.2 – Remote Leaf and vPod: extend an availability zone (fabric) to remote locations over IP
• Future – ACI extensions to multi-cloud ‘and’ multi-region policy management
Fabric and Policy Domain Evolution
Deployment Options
Single APIC Cluster / Single Fabric:
• Stretched Fabric (one ACI fabric spanning DC1 and DC2)
• Multi-Pod (from the 2.0 release): Pods interconnected by an IPN, MP-BGP EVPN between Pods, one APIC cluster

Multiple APIC Clusters / Multiple Fabrics:
• Multi-Fabric (with L2 and L3 DCI between independent fabrics)
• Multi-Site (3.0 release, Q3CY17): fabrics interconnected over IP, MP-BGP EVPN between sites
ACI Multi-Site
Overview
(For more information on ACI Multi-Site, see BRKACI-2125)

• Separate ACI fabrics (availability zones) with independent APIC clusters
• The ACI Multi-Site Orchestrator pushes cross-fabric configuration to multiple APIC clusters, providing scoping of all configuration changes
• MP-BGP EVPN control plane between sites
• VXLAN data-plane encapsulation across sites over a generic IP network
• End-to-end policy definition and enforcement
Typical Requirement
Creation of Two Independent Fabrics/AZs

[Diagram: Fabric ‘A’ (AZ 1) and Fabric ‘B’ (AZ 2), with application workloads deployed across the two availability zones]
Typical Requirement
Creation of Two Independent Fabrics/AZs

[Diagram: Multi-Pod Fabric ‘A’ (AZ 1), ‘classic’ active/active across Pod ‘1.A’ and Pod ‘2.A’, connected via ACI Multi-Site to Multi-Pod Fabric ‘B’ (AZ 2), ‘classic’ active/active across Pod ‘1.B’ and Pod ‘2.B’; application workloads deployed across availability zones]
ACI Multi-Pod Deep Dive
Overview, Use Cases and Supported Topologies
ACI Multi-Pod
 Multiple ACI Pods connected by an IP Inter-Pod Network (IPN); each Pod consists of leaf and spine nodes
 Managed by a single APIC cluster, with up to 50 msec RTT between Pods
 Single management and policy domain; the whole Multi-Pod fabric represents a single availability zone
 Per-Pod forwarding control plane (IS-IS, COOP, MP-BGP) for fault isolation; MP-BGP EVPN between Pods
 VXLAN data-plane encapsulation between Pods
 End-to-end policy enforcement
Single Availability Zone with Maintenance & Configuration Zones
Scoping ‘Network Device’ Changes
 Maintenance Zones: groups of switches managed as an “upgrade” group
 Configuration Zones (e.g. Zone ‘A’ and Zone ‘B’ in the Multi-Pod fabric) can span any required set of switches; the simplest approach may be to map a configuration zone to an availability zone. Applies to infrastructure configuration and policy only.
Reducing the Impact of Configuration Errors
Introducing Configuration Zones

 Three different zone deployment modes:
  - Enabled (default): updates are immediately sent to all nodes that are part of the zone. Note: a node that is not part of any zone is equivalent to a node in a zone set to Enabled.
  - Disabled: updates are postponed until the zone deployment mode is changed (or the node is removed from the zone)
  - Triggered: sends the postponed updates to the nodes in the zone
 The deployment mode can be configured for an entire Pod or for a specified set of leaf switches
 The GUI allows changing the deployment mode, selecting an entire Pod or specific leaf switches, and showing the changes not yet applied to a Disabled zone
Single Availability Zone with Tenant Isolation
Isolation for ‘Virtual Network and Application’ Changes

[Diagram: ACI Multi-Pod fabric with Inter-Pod Network and APIC cluster; Tenant ‘Prod’ and Tenant ‘Dev’ are separate configuration/change domains]

 The ACI ‘Tenant’ construct provides a domain of application and associated virtual-network policy change
 Domain of operational change for an application (e.g. production vs. test)
ACI Multi-Pod
Supported Topologies
• Intra-DC: Pods interconnected at 10G*/40G/100G
• Two DC sites directly connected: dark fiber/DWDM (up to 50 msec RTT**)
• Three (or more) DC sites directly connected: dark fiber/DWDM (up to 50 msec RTT**)
• Multiple sites interconnected by a generic L3 network (up to 50 msec RTT**)

* 10G only with QSA adapters on EX/FX spines
** 50 msec support added in SW release 2.3(1)
ACI Multi-Pod
SW/HW Support and Scalability Values
 All existing Nexus 9000 hardware is supported for leaf and spine nodes
 Maximum number of supported ACI leaf nodes (across all Pods):
  - Up to 80 leaf nodes with a 3-node APIC cluster
  - 200 leaf nodes with a 4-node APIC cluster (from ACI release 4.1)
  - 300 leaf nodes with a 5-node APIC cluster
  - 400 leaf nodes with a 7-node APIC cluster (from ACI release 2.2(2e))
  - Maximum of 200 leaf nodes per Pod
  - Up to 6 spines per Pod
 Maximum number of supported Pods:
  - 4 in the 2.0(1)/2.0(2) releases
  - 6 in the 2.1(1) release
  - 10 in the 2.2(2e) release
  - 12 in the 3.0(1) release
APIC Cluster Deployment Considerations
APIC – Distributed Multi-Active Database

 The database is replicated across APIC nodes
 One copy is ‘active’ for every specific portion (shard) of the database
 Processes are active on all nodes (not active/standby)
 The database is distributed as one active plus two backup instances (shards) for every attribute
APIC Cluster Deployment Considerations
Single Pod Scenario

3-node cluster:
 The APIC will allow read-only access to the DB when only one node remains active (standard DB quorum)
 A hard failure of two nodes causes all shards to be in ‘read-only’ mode (a reboot etc. heals the cluster after the APIC nodes are back up)

5-node (or larger) cluster:
 Additional APIC nodes increase the system scale (up to 7* nodes supported) but do not add more redundancy
 A hard failure of two nodes would cause inconsistent behaviour across shards (some in ‘read-only’ mode, some in ‘read-write’ mode)
APIC Cluster Deployment Considerations
Multi-Pod – 2 Pods Scenario

3-node cluster (two APIC nodes in Pod 1, one in Pod 2, up to 50 msec RTT):
 Pod isolation scenario: changes are still possible on the APIC nodes in Pod 1 but not in Pod 2
 Pod hard failure scenario: the recommendation is to activate a standby node to make the cluster fully functional again

5-node cluster (spread across both Pods):
 Pod isolation scenario: same considerations as with a single Pod (different behaviour across shards)
 Pod hard failure scenario: may cause the loss of information for the shards replicated across the APIC nodes in the failed Pod. It is possible to restore the whole fabric state to the latest configuration snapshot (‘ID Recovery’ procedure – needs BU and TAC involvement).
APIC Cluster Deployment Considerations
What about a 4-Node APIC Cluster? (ACI 4.1(1) Release)

 Intermediate scalability values compared to a 3- or 5-node cluster (up to 170–200 leaf nodes supported)
 Pod isolation scenario: same considerations as with 5 nodes (different behaviour across shards)
 Pod hard failure scenario:
  - No chance of total loss of information for any shard
  - A standby node can be brought up in the second site to regain full majority for all the shards
APIC Cluster Deployment Considerations
Deployment Recommendations
 Main recommendation: deploy a 3-node APIC cluster when fewer than 80 leaf nodes are deployed across Pods
 From 4.1(1), a 4-node cluster can be deployed if it meets the scalability requirements
 When 5 (or 7) nodes are really needed for scalability reasons, follow the rule of thumb of never placing more than two APIC nodes in the same Pod (when possible). For a 5-node cluster:
  - 2 Pods*: 3 + 2
  - 3 Pods: 2 + 2 + 1
  - 4 Pods: 2 + 1 + 1 + 1
  - 5 Pods: 1 + 1 + 1 + 1 + 1
  - 6+ Pods: one APIC node in each of the first five Pods

* ‘ID Recovery’ procedure possible for recovering lost information
Inter-Pod Connectivity Deployment Considerations
ACI Multi-Pod
Inter-Pod Network (IPN) Requirements

 Not managed by the APIC; must be separately configured (day-0 configuration)
 The IPN topology can be arbitrary; it is not mandatory to connect to all spine nodes
 Main requirements (consolidated in the sketch below):
  - PIM Bidir multicast, needed to handle Layer 2 BUM* traffic
  - OSPF, to peer with the spine nodes and learn VTEP reachability
  - Increased MTU, to handle VXLAN-encapsulated traffic
  - DHCP relay

* Broadcast, Unknown unicast, Multicast
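A minimal NX-OS-style baseline for an IPN node covering the OSPF, PIM, and MTU pieces (DHCP relay is shown later, in the auto-provisioning section); interface numbers and addresses are illustrative assumptions, not values mandated by ACI:

feature ospf
feature pim

router ospf IPN
  router-id 192.168.0.1            ! assumed router-id

! Sub-interface toward a spine; Multi-Pod uses dot1q tag 4
interface Ethernet1/1.4
  mtu 9150                         ! sized for VXLAN overhead (see MTU section)
  encapsulation dot1q 4
  ip address 192.168.1.1/31        ! assumed point-to-point addressing
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode               ! PIM Bidir RP configuration shown later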
ACI Multi-Pod and MTU
Different MTU Meanings

1. Data-plane MTU: the MTU of the traffic generated by endpoints (servers, routers, service nodes, etc.) connected to ACI leaf nodes
   • Need to account for 50B of overhead (VXLAN encapsulation) for inter-Pod communication
2. Control-plane MTU: for CPU-generated traffic, such as EVPN across Pods
   • The default value is 9000B and can be tuned to the maximum MTU value supported in the IPN
ACI Multi-Pod and MTU
Tuning CP MTU for EVPN Traffic across Pods

 The control-plane MTU can be set leveraging the “CP MTU Policy” on the APIC (modifying the default 9000B value)
 The required MTU in the IPN then depends on this setting and on the data-plane MTU configuration
 Always need to consider the VXLAN encapsulation overhead for data-plane traffic (50/54 bytes)
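As a worked example under assumed values: with a 9000B data-plane MTU, inter-Pod VXLAN traffic needs 9000 + 50 = 9050B (9054B with an 802.1Q tag), so IPN links are commonly configured as in the sketch below:

interface Ethernet1/1.4
  mtu 9150    ! >= endpoint MTU (9000B assumed) + 54B VXLAN/dot1q overhead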

ACI Multi-Pod and QoS
Inter-Pod QoS Behavior
• Traffic across Pods should be consistently prioritized (as happens within a Pod)
• To achieve this end-to-end consistent behavior, a DSCP-to-CoS mapping must be configured in the ‘infra’ tenant
  - Allows traffic received on the spines from the IPN to be classified based on the outer DSCP value
  - Without the DSCP-to-CoS mapping, classification for the same traffic will be CoS-based (preserving the CoS value in the IPN is harder)
• The traffic can then also be properly treated inside the IPN (classification/queuing)
• Recommended to always prioritize at least policy and control-plane traffic
[Diagram: spines in Pod ‘A’ set the outer DSCP field (e.g. CS5) based on the configured mapping; the IPN classifies and queues the traffic; spines in Pod ‘B’ set the iVXLAN CoS field based on the configured mapping]
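A minimal NX-OS-style sketch of classifying ACI traffic in the IPN on the outer DSCP value (CS5 here, matching the diagram; the class names, qos-group value, and attachment point are illustrative assumptions). The ACI-side DSCP-to-CoS translation itself is configured in the ‘infra’ tenant on the APIC; this covers only the IPN side:

class-map type qos match-any ACI-CTRL
  match dscp 40                    ! CS5, per the DSCP-to-CoS mapping above
policy-map type qos ACI-CLASSIFY
  class ACI-CTRL
    set qos-group 3                ! steer into a priority queue (assumed)
interface Ethernet1/1.4
  service-policy type qos input ACI-CLASSIFY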

Inter-Pod Connectivity
Frequently Asked Questions

Q: What platforms can or should I deploy in the IPN?
 Nexus 9200s and 9300-EX, but also any other switch or router supporting all the IPN requirements
 First-generation Nexus 9300s/9500s are not supported as IPN nodes

Q: Can I use a 10G connection between the spines and the IPN network?
 Yes, with QSA adapters supported on the ACI spine devices
 Available from the 2.1(1h) release on EX/FX-based hardware
 No plans to introduce support for first-generation spines (including the 9336-PQ ‘baby spine’)
Inter-Pod Connectivity
Frequently Asked Questions (2)

Q: I have two sites connected with dark fiber/DWDM circuits; can I connect the spines back-to-back?
 No, because of the multicast requirement for Layer 2 multi-destination inter-Pod traffic

Q: Do I need a dedicated pair of IPN devices in each Pod?
 No, a single pair of IPN devices can be used, but before the 2.1(1h) release this mandates the use of 40G/100G inter-Pod links
Control and Data Planes
ACI Multi-Pod
Auto-Provisioning of Pods
(For more information on how to set up an ACI fabric from scratch, see BRKACI-2004)

1. APIC Node 1 is connected to a leaf node in the ‘seed’ Pod 1
2. Discovery and provisioning of all the devices in the local seed Pod
3. Provisioning of the spine interfaces facing the IPN and of the EVPN control-plane configuration
4. Spine 1 in Pod 2 connects to the IPN and generates DHCP requests
5. The DHCP requests are relayed by the IPN devices back to the APIC in Pod 1
6. The DHCP response reaches Spine 1, allowing its full provisioning
7. Discovery and provisioning of all the devices in the remote Pod
8. APIC Node 2, connected to a leaf node in Pod 2, joins the cluster (single APIC cluster across Pods)
9. Other Pods are discovered and provisioned following the same procedure
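A minimal sketch of the DHCP relay piece on the IPN sub-interfaces facing the new Pod’s spines, which is what makes step 5 work; the relay targets are APIC node addresses in the seed Pod (values assumed for illustration, mirroring the IPN configuration example later in this session):

feature dhcp

interface Ethernet1/1.4            ! sub-interface toward a Pod 2 spine
  ip dhcp relay address 10.1.0.2   ! APIC node addresses in the seed Pod
  ip dhcp relay address 10.1.0.3   ! (assumed values)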

ACI Multi-Pod
IPN Control Plane

 Separate IP address pools for VTEPs are assigned by the APIC to each Pod (e.g. 10.0.0.0/16 for Pod 1, 10.1.0.0/16 for Pod 2)
  - Summary routes are advertised toward the IPN via OSPF
  - IS-IS convergence events local to a Pod are not propagated to remote Pods
 Spine nodes perform mutual IS-IS/OSPF redistribution, injecting the other Pods’ summary routes into the local IS-IS process
  - Needed for local VTEPs to communicate with remote VTEPs

Resulting leaf-node underlay VRF in Pod 1:
IP Prefix 10.1.0.0/16 → Next-Hop Pod1-S1, Pod1-S2, Pod1-S3, Pod1-S4
ACI Multi-Pod
Inter-Pod MP-BGP EVPN Control Plane
 MP-BGP EVPN is used to sync endpoint (EP) and multicast group information between Pods
  - All remote-Pod entries are associated to a proxy VTEP next-hop address (not part of the local TEP pool); e.g. Pod ‘A’ learns EP3 and EP4 via Proxy B, while Pod ‘B’ learns EP1 and EP2 via Proxy A
  - Same BGP AS across all the Pods (single BGP ASN)
 iBGP EVPN sessions between spines in separate Pods
  - Full mesh of MP-iBGP EVPN sessions between local and remote spines (default behavior)
  - Optional route-reflector deployment (recommended: one RR in each Pod for resiliency)
ACI Multi-Pod
Inter-Pod Data Plane
(Policy and network information is carried across Pods in the VXLAN header: VTEP IP, VNID, Class-ID, tenant packet)

1. EP1 (VM1) sends traffic destined to remote EP2 (EP1 and EP2 are in EPGs with a contract configured on the APIC)
2. EP2 is unknown on the local leaf; the traffic is encapsulated to the local Proxy A spine VTEP (adding the S_Class information)
3. The spine encapsulates the traffic to the remote Proxy B spine VTEP (spine entry: EP2 → Proxy B)
4. The receiving spine encapsulates the traffic to the local leaf where EP2 resides (COOP entry: EP2 → Leaf 4)
5. The leaf learns the remote EP1 location and enforces the policy
6. If the policy allows it, EP2 receives the packet
ACI Multi-Pod
Inter-Pod Data Plane (2)

7. VM2 (EP2) sends traffic back to remote VM1 (EP1)
8. The leaf enforces the policy at ingress (it has already learned the remote EP1 location) and, if allowed, encapsulates the traffic directly to the remote leaf node where EP1 resides
9. That leaf learns the remote VM2 location (no need to enforce the policy again)
10. VM1 receives the packet
ACI Multi-Pod
Inter-Pod Data Plane (3)

11. From this point on, EP1-to-EP2 communication is encapsulated leaf to leaf (VTEP to VTEP), and the policy is always applied at the ingress leaf (this holds for both L2 and L3 communication)
ACI Multi-Pod
Use of Multicast for Inter-Pod Layer 2 BUM* Traffic

 Ingress replication for BUM traffic is not supported with Multi-Pod
 PIM Bidir is the only validated and supported option:
  - Scalable: only a single (*,G) entry is created in the IPN for each BD
  - Fast-convergent: no requirement for data-driven multicast state creation
 A spine is elected authoritative for each bridge domain (e.g. BD1 with GIPo1 225.1.1.128):
  - It generates an IGMP join for (*, GIPo1) on a specific link toward the IPN
  - It always sends and receives BUM traffic on that link, whether the traffic originates in the local Pod or in a remote Pod

* BUM: Broadcast, Unknown unicast, Multicast
ACI Multi-Pod
Use of Multicast for Inter-Pod BUM Traffic

1. VM1 in BD1 generates a BUM frame
2. BD1 has the associated multicast group MG1; the traffic is flooded intra-Pod via one multi-destination tree
3. Spine 2, designated for MG1, sends the traffic toward the IPN
4. The IPN replicates the traffic to all the Pods that joined MG1 (optimized delivery to Pods)
5. In the remote Pod, the BUM frame is flooded along one of the trees associated to MG1
6. VM2 receives the BUM frame
ACI Multi-Pod
PIM Bidir for BUM – Supported Topologies
Option 1 – Full mesh between remote IPN devices:
 Create full-mesh connections between the IPN devices
 More costly for geo-dispersed Pods, as it requires more links between sites
 Alternatively, connect the local IPN devices with a port-channel interface (for resiliency)
 In both cases, it is critical to ensure that the preferred path toward the RP from any IPN device is not via a spine

Option 2 – Directly connect local IPN devices:
 The recommendation is to increase the OSPF cost of the interfaces between the IPN devices and the spines, for example:

interface Ethernet1/49.4
  description L3 Link to Pod1-Spine1
  mtu 9150
  encapsulation dot1q 4
  ip address 192.168.1.1/31
  ip ospf cost 100
  ip ospf network point-to-point
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode
  ip dhcp relay address 10.1.0.2
  ip dhcp relay address 10.1.0.3
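The PIM Bidir RP referenced above is commonly deployed as a redundant ‘phantom RP’: the RP address belongs to a loopback subnet advertised by the IPN devices but is not assigned to any of them. A minimal sketch under assumed addressing, using ACI’s default 225.0.0.0/15 GIPo range:

! On IPN1 (primary): the /30 wins the longest-prefix match for the RP subnet
interface loopback1
  ip address 192.168.100.1/30      ! RP 192.168.100.2 is NOT assigned anywhere
  ip router ospf IPN area 0.0.0.0
  ip pim sparse-mode

ip pim rp-address 192.168.100.2 group-list 225.0.0.0/15 bidir

! On IPN2 (backup): same loopback subnet with a shorter mask, e.g. /29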

Connecting to the External Layer 3 Domain
Connecting ACI to the Layer 3 Domain
‘Traditional’ L3Out on the Border Leaf Nodes

[Diagram: border leaf nodes connecting to WAN edge (PE) routers via an L3Out]

• Connection to the WAN edge devices at the border leaf nodes
• VRF-Lite hand-off for extending L3 multi-tenancy outside the ACI fabric
• Host-route advertisement out of the ACI fabric is supported from release 4.0
Connecting ACI to the Layer 3 Domain
‘GOLF’ Design

[Diagram: spines connecting to ‘GOLF’ WAN edge routers (ASR 9000, ASR 1000, Nexus 7000) with a VXLAN data plane, optionally through a DCI (OTV/VPLS) layer]

 Direct or indirect connection from the spines to the WAN edge routers
 Better scalability: one protocol session for all VRFs, no longer constrained by the border-leaf hardware tables
 VXLAN hand-off with MP-BGP EVPN
 Simplified tenant L3Out configuration
 Support for host-route advertisement out of the ACI fabric
 VRF configuration automation on the GOLF router through OpFlex exchange
Connecting Multi-Pod to the Layer 3 Domain
‘Traditional’ L3Out on the Border Leaf Nodes

 A Pod does not need to have a dedicated WAN connection (i.e. other Pods can offer transit services to it)
 Multiple WAN connections can be deployed across Pods
 Outbound traffic: by default, VTEPs always select a WAN connection in the local Pod, based on the preferred metric
 For a Pod without a local L3Out (e.g. Pod 3), outbound traffic flows are by default hashed across the L3Outs of the remote Pods
‘Traditional’ L3Out on the BL Nodes
Influencing Outbound Traffic

 In some scenarios it may be required to ensure the L3Out in a given Pod is the preferred outbound path to reach an external destination for all the endpoints deployed across Pods
 This should be tunable per VRF, or even per specific external destination
 Route-maps can be leveraged to tune the Local-Preference of external prefixes received on the BL nodes from the WAN edge routers (see the sketch below)
 Example: increase the Local-Preference for 100.100.100.0/24 in Pod 1 and for 200.200.200.0/24 in Pod 2, so that 100.100.100.0/24 is reachable via Pod 1 and 200.200.200.0/24 via Pod 2
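A conceptual route-map showing the equivalent logic (names and values are illustrative; in ACI this is expressed through route-map/route-profile policies on the L3Out rather than typed on the leaf CLI):

ip prefix-list PFX-A seq 5 permit 100.100.100.0/24

route-map WAN-IN permit 10
  match ip address prefix-list PFX-A
  set local-preference 150         ! prefer this Pod’s L3Out for PFX-A
route-map WAN-IN permit 20         ! everything else keeps the default (100)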

Influencing the Outbound Path
BGP Local-Preference Option

[Diagram: Pod 1’s L3Out1 receives external routes with Local-Preference 150, Pod 2’s L3Out2 with Local-Preference 100; endpoints in both Pods therefore prefer the WAN exit through Pod 1]
Host Route Advertisement (ACI 4.0)
Regular L3Outs

[Diagram: endpoints 192.168.1.201 (Pod 1) and 192.168.1.202 (Pod 2) in the same 192.168.1.0/24 subnet; each Pod’s L3Out advertises the BD subnet 192.168.1.0/24 plus the /32 host routes of the locally connected endpoints — 192.168.1.201/32 from Pod 1, 192.168.1.202/32 from Pod 2]
Host Route Advertisement Overview
• Can be enabled per bridge domain
• Border leaves download host routes for the endpoint entries in the COOP database on the spines
• Border leaves only download host routes for endpoints connected to the local Pod
• A host route is withdrawn from the border leaf if the endpoint moves to another Pod or times out
• L3Out route-maps can be used to filter (permit or deny) BD subnet routes, host routes, and host-route ranges, as sketched below
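A conceptual sketch of suppressing the /32 host routes while still advertising the BD subnet (illustrative names; in ACI this logic is configured as a route-map policy on the L3Out):

! Match any /32 inside the BD subnet (192.168.1.0/24 assumed)
ip prefix-list BD-HOST-ROUTES seq 5 permit 192.168.1.0/24 ge 32

route-map L3OUT-OUT deny 10        ! filter the host routes
  match ip address prefix-list BD-HOST-ROUTES
route-map L3OUT-OUT permit 20      ! still advertise the BD subnet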

Network Services Integration
ACI Multi-Pod
Network Services Integration Models

 Active/standby pair deployed across Pods
  - No issues with asymmetric flows, but causes traffic hair-pinning across the IPN
 Independent active/standby pair deployed in each Pod
  - Only for the perimeter-FW use case, and assuming a proper solution is adopted to keep ingress/egress traffic flows symmetric
 FW cluster deployed across Pods
  - Supported from ACI release 3.2
  - Requires the use of a service graph with PBR
Active/Standby Pair across Pods – PBR

[Diagram: active FW (L3 mode) in Pod 1, standby in Pod 2; the PBR policy is applied on the leaf nodes and redirects both east-west and north-south flows through the active FW, hair-pinning across the IPN when needed]
Active/Standby Independent Pairs in each Pod – Symmetric PBR

 Optimized behavior when the FW located in the same Pod as the destination endpoint is selected by the PBR policy
 Sub-optimal traffic flow when the FW located in the remote Pod is selected by the PBR policy
Active/Standby Independent Pairs in each Pod – Location-Based PBR (ACI 3.1 Release)

 The PBR policy is applied in the Pod where the destination endpoint resides, so the local FW pair is always selected
 Combined with host-route advertisement, inbound traffic also takes the optimal path: the WAN routing table contains 192.168.1.201/32 via the Pod 1 border leaf nodes and 192.168.1.202/32 via the Pod 2 border leaf nodes
Active/Active Cluster across Pods (ACI 3.2)
Anycast IP/MAC with PBR

 All the active FW nodes share the same IP/MAC identity, so any one of them can be picked as the PBR destination
 By default, one of the nodes local to the Pod is selected (based on the IS-IS metric toward the anycast IP address)
Without the Anycast IP/MAC Feature

This does NOT work without the anycast service:
 The service leaf nodes in each Pod report the ASA external (10.1.1.1) and internal (10.1.2.1) interfaces as locally attached
 The spines in Pod 1 and Pod 2 therefore cannot resolve whether 10.1.1.1 and 10.1.2.1 are reachable via the service leaves in Pod 1 or in Pod 2
With the Anycast IP/MAC Feature

Works with the anycast service starting from 3.2:
 The spines in each Pod prefer the local anycast endpoint (e.g. 10.1.1.1 via the local service leaf), keeping the remote Pod only as a backup path
Anycast IP/MAC Overview
• Use case: both the firewall and ACI operate at L3, with PBR in use
• With the anycast IP/MAC service:
  - The user configures an anycast IP/MAC as the PBR destination
  - COOP maintains local and remote entries for the anycast endpoint
  - Each Pod prefers the local anycast endpoint over the remote one
• The anycast service is NOT supported on first-generation leaf switches (non-EX/FX)
Multi-Pod and Virtual Machine Manager (VMM) Integration
ACI Multi-Pod and VMM Integration
[Diagram: a single VMM domain (DC1, vSwitch1) spanning hypervisors in Pod 1 and Pod 2]

• A single VMM domain is created across Pods
• The logical switch is extended across the hypervisors that are part of the same stretched cluster
• Support for all intra-cluster functions (vSphere HA/FT, DRS, etc.)
Migration Scenarios
Adding Pods to an Existing ACI Fabric

1. Add connections from the existing fabric’s spines to the IPN network
2. Connect and auto-provision the other Pod(s)
3. Distribute the APIC nodes across Pods
Migration Scenarios
Converting a Stretched Fabric to Multi-Pod

 Requires re-cabling of the physical interconnection (especially when DWDM circuits must be reused)
 Requires re-addressing the VTEP address space for the second Pod – a disruptive procedure, as it requires a clean install of the second Pod
 Not internally QA-validated, and not recommended
Conclusions and Q&A
ACI Multi-Pod & Multi-Site
A Reason for Both

[Diagram: Multi-Pod Fabric ‘A’ (AZ 1), ‘classic’ active/active across Pod ‘1.A’ and Pod ‘2.A’, connected via ACI Multi-Site to Multi-Pod Fabric ‘B’ (AZ 2), ‘classic’ active/active across Pod ‘1.B’ and Pod ‘2.B’; application workloads deployed across availability zones]
Useful Links
• ACI Multi-Pod white paper
  https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-737855.html
• ACI Multi-Site white paper
  https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html
Complete Your Online Session Survey
• Please complete your online session survey after each session
• Complete 4 session surveys and the overall conference survey (available from Thursday) to receive your Cisco Live T-shirt
• All surveys can be completed via the Cisco Events Mobile App or at the Communication Stations

Don’t forget: Cisco Live sessions will be available for on-demand viewing after the event at ciscolive.cisco.com
Continue Your Education
• Demos in the Cisco Showcase
• Walk-in self-paced labs
• Meet the Engineer 1:1 meetings
• Related sessions
Thank you
