You are on page 1of 184

#CiscoLive

ACI Multi-Site Architecture and


Deployment
Max Ardica, Principal Engineer
@maxardica
DGTL-BRKACI-2125

#CiscoLive
Session Objectives

At the end of the session, the participants should be able to:


• Articulate the different deployment options to interconnect Cisco
ACI networks (Multi-Pod and Multi-Site) and when to choose one
vs. the other
• Understand the functionalities and specific design considerations
associated to the ACI Multi-Site architecture
Initial assumption:
• The audience already has a good knowledge of ACI main concepts
(Tenant, BD, EPG, L2Out, L3Out, etc.)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Agenda
• Introduction
• Inter-Site Connectivity Deployment Considerations
• ACI Multi-Site Orchestrator (MSO) Architecture
• Provisioning Tenant Policies on MSO
• ACI Multi-Site Control and Data Plane
• Connecting to the External L3 Domain
• Tenant Routed Multicast (TRM) Support
• Network Services Integration
• Multi-Pod and Multi-Site Integration
• Conclusion
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 4
Introduction
ACI Anywhere
Fabric and Policy Domain Evolution

ACI Single Pod Fabric ACI Multi-Site Cloud ACI


ISN
Fabric ‘A’ Fabric ‘n’

MP-BGP - EVPN


ACI 2.0 - Multiple Networks ACI 3.1/4.0 - Remote Leaf
(Pods or Availability Zones) and vPod extends a Fabric
in a single Fabric (Region) to remote locations

ACI 1.0 - ACI 3.0 – Multiple Fabrics ACI 4.1 & 4.2 – ACI
ACI Multi-Pod Fabric (Regions) interconnected in
ACI Remote Leaf
Leaf/Spine Single Extensions to Public Cloud
Pod Fabric the same Multi-Site
IPN
Pod ‘A’ Pod ‘n’
Orchestrator domain

MP-BGP - EVPN


APIC Cluster

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 6
Multi-Pod or Multi-Site?

That is the question…

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 7
And the answer is…

BOTH!

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 8
Terminology

• Pod – A Leaf/Spine network sharing a common control plane (ISIS, BGP, COOP, …)
§ Pod == Availability Zone
• Fabric – Scope of an APIC Cluster, it can be one or more PODs
§ Fabric == Region
• Multi-Pod – Single APIC Cluster with multiple leaf spine networks
§ Multi-Pod == Multiple Availability Zones within a Single Region (Fabric)
• Multi-Fabric – Multiple APIC Clusters + associated Pods (you can have Multi-Pod
with Multi-Fabric)*
§ Multi-Fabric == Multi-Site == a DC infrastructure with multiple regions

* Available from ACI release 3.2(1)


#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 9
Systems View (How do these things relate)
Change and Network Fault Domain Isolation
Active Workloads Layer 3 Active Workloads
Layer 2 & Layer 3 Inter Region Layer 2 & Layer 3

ACI Multi-Site
Orchestrator

Multi-Pod Fabric ‘A’ (Region 1) Multi-Pod Fabric ‘B’ (Region 2)

Fabric NetworkFault Domain Fabric Network Fault Domain Fabric Network Fault Domain Fabric Network Fault Domain

Pod A.1 (AZ 1) Pod A.2 (AZ 2)


Application1 Pod B.1 (AZ 1) Pod B.2 (AZ 2)
Application 2
workloads deployed workloads deployed
Application Policy Change Domain Application Policy Change Domain
across Pods (AZs) across Pods (AZs)

Common Namespace (IP, DNS, Active Directory…)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 10
Typical Requirement
Creation of Two Independent Fabrics/Regions
Data Center 1 Data Center 2

Multi-Pod Fabric ‘A’


(Region 1)
‘Classic’ Active/Active (L2 and L3)

Pod ‘1.A’ Pod ‘2.A’

L3 Only ACI Multi-Site L3 Only

Multi-Pod Fabric ‘B’


(Region 2)
‘Classic’ Active/Active (L2 and L3)

Pod ‘1.B’ Pod ‘2.B’

#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public
But wait! Could I deploy
Multi-Site also to handle
more typical Multi-Pod
use cases?

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 12
Multi-Site for Active/Active Application Deployments?
Multi-Site for Active/Active Application Deployments

• ACI Multi-Site allows to extend connectivity and


policies between separate APIC domains
• Layer 3 only across sites
• Layer 2 with and without BUM flooding

• Keep in mind some specific considerations before


deploying Multi-Site for “classic” Active/Active
application deployments (i.e. same application
components deployed across sites)
• Loss of change and network fault domain isolation
across separate ACI domains
• Creation of separate VMM domains by design (loss of
intra-cluster functionalities like DRS, vSphere FT/HA,
…)
• Specific service node insertion deployment
considerations (use of separate service nodes per
fabric, limited support for service nodes clustering
across sites, …)
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 13
ACI Multi-Site and VMM Integration
Separate VMM per Site
ISN

Site 1 VMM Domain VMM Domain Site 2


DC1 DC2

VMM 1 VMM 2

HV vSwitch1
HV HV Managed by HV HV
vSwitch2 HV
VMM 1
HV Cluster 1 Managed HV Cluster 2
by VMM 2

• Typical deployment model for an ACI Multi-Site


• Creation of separate VMM domains in each site, which are then exposed to the Multi-
Site Orchestrator
• No support for DRS, vSphere HA/FT

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 14
ACI Multi-Site and Network Services Deployment options fully
supported with ACI Multi-Pod
Integration Models
ISN

• Active and Standby pair deployed across Pods


• Limited supported options

Active Standby

ISN
• Active/Active FW cluster nodes stretched across Sites
(single logical FW)
• Requires the ability of discovering the same MAC/IP info
in separate sites at the same time
Active/Active Cluster
• Not supported
Active Active

ISN • Recommended deployment model for ACI Multi-Site


• Option 1: supported for N-S traffic flows when the FW
is connected in L3 mode to the fabric
• Option 2: supported for N-S and E-W traffic flows with
the use of Service Graph with Policy Based Redirection
Active/Standby Active/Standby #CiscoLive (PBR) © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 15
ACI Multi-Pod For more Information on
ACI Multi-Pod:
Overview DIGACI-2003
VXLAN Data Plane

Pod ‘A’
Inter-Pod
Pod ‘n’
Network

MP-BGP - EVPN


Up to 50 msec RTT

APIC Cluster
IS-IS, COOP, MP-BGP IS-IS, COOP, MP-BGP

• Multiple ACI Pods connected by an IP Inter-Pod L3 • Forwarding control plane (IS-IS, COOP) fault
network, each Pod consists of leaf and spine nodes isolation
• Up to 50 msec RTT supported between Pods • Data Plane VXLAN encapsulation between Pods
• Managed by a single APIC Cluster • End-to-end policy enforcement
• Single Management and Policy Domain
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 16
ACI Multi-Pod
Most Common Use Cases

§ Need to scale up a single ACI fabric above


the number of leaf nodes supported in a Pod
single Pod Inter-Pod
Leaf Nodes Network
§ Desire to divide an ACI fabric into smaller
network fault domains (AZs)
§ Handling 3-tiers physical cabling layout (for Spine Nodes
example traditional N7K/N5K/N2K
deployments)

§ True Active/Active DC deployments


Pod 1 Pod 2
Single VMM domain across DCs (stretched ESXi
Metro Cluster, vSphere HA/FT, DRS initiated
workload mobility,…)
Deployment of Active/Standby or Active/Active
clustered network services (FWs, SLBs) across DCs
DB APIC Cluster
Web/App Web/App
Application clustering (lots of L2 BUM extension
required across Pods)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 17
ACI Multi-Site and Network Services
Overview
VXLAN

IP Network

Site 1 Site 2
MP-BGP - EVPN

Multi-Site Orchestrator

REST
GUI
API

§ Separate ACI Fabrics with independent APIC § Data Plane VXLAN encapsulation across
clusters sites
§ MP-BGP EVPN control plane between sites § End-to-end policy definition and
enforcement

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 18
ACI Multi-Site
Most Common Use Cases
• Scale-up DC network model • Data Centre Interconnect (DCI)
Building a very large intra-DC network (above 500* leaf nodes) Extend connectivity/policy between ‘loosely coupled’ DC sites
Disaster Recovery and IP mobility use cases

*500 leaf nodes supported in a Multi-Pod fabric starting with ACI 4.2(4) release

• Cloud ACI • SP 5G Telco DC/Cloud**


Integration between on-prem and public clouds Centralized DC Orchestration for 5G
SR-MPLS/MPLS Handoff on Border Leaf nodes

**MSO ©
3.0(1) and ACI 5.0(1) releases (and beyond)
#CiscoLive 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 19
Inter-Site Connectivity
Deployment
Considerations
Agenda

• Inter-Site Network Functional Requirements


• Multi-Site MTU Size Considerations
• Inter-Site Connectivity FAQs
• Inter-Site QoS
• Back-to-Back Topology and CloudSec Support

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 21
Inter-Site Network
Functional
Requirements
Inter-Site Network (ISN) Functional Requirements
Sub-interfaces
(VLAN tag 4) Inter-Site Network

F
OSP

MP-BGP - EVPN

Multi-Site
Orchestrator

• Not managed by APIC or MSO, must be independently configured (day-0 configuration)


• IP topology can be arbitrary, not mandatory to connect all the spine nodes to the ISN
• ISN main functional requirements:
ü OSPF on the first hop routers to peer with the spine nodes and exchange TEP address reachability
Must use sub-interfaces (with VLAN tag 4) toward the spines

ü No multicast requirement for BUM traffic forwarding across sites


ü Increased end-to-end MTU support (at least 50/54 extra Bytes)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 23
Multi-Site MTU Size
Considerations
ACI Multi-Site and MTU Size
Different MTU Meanings

1. Data Plane MTU: MTU of the traffic


generate by endpoints (servers, routers, 2 ISN
service nodes, etc.) connected to ACI leaf MP-BGP
nodes
Need to account for 50B of overhead (VXLAN
encapsulation) for inter-site communication
2. Control Plane MTU: for CPU generated Multi-Site
traffic like MP-BGP sessions across sites 1
Orchestrator
1 1
Control plane traffic is not VXLAN encapsulated
The default value is 9000B, can be tuned to
match the maximum MTU value supported in
the ISN

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 25
ACI Multi-Site and MTU Size
Tuning MTU Size for EVPN Control Plane Traffic

Configurable MTU

ISN

MP-BGP

Multi-Site
Orchestrator

Modify the default 9000B


MTU value (if needed)

§ Control Plane MTU can be set leveraging the “CP MTU Policy” on APIC
The setting applies to all the control plane traffic generated by ACI leaf/spine nodes
§ The required MTU in the ISN would then depend on this setting and on the Data Plane MTU
configuration
Always need to consider the VXLAN encapsulation overhead for data plane traffic

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 26
Inter-Site Connectivity
FAQs
Inter-Site Connectivity
Frequently Asked Questions

§ Any network device capable of routing traffic and


What platforms can or should I supporting packets with increased MTU size can be
1 deploy in the ISN? deployed in the ISN
§ Need sub-interfaces support for the ISN devices directly
connected to the spines

§ No, ingress replications is performed by the ACI spine


Do I need to run L3 multicast nodes to forward BUM traffic across sites
2
inside the ISN? § This function is only required for the BDs that are stretched
across sites with BUM flooding enabled

§ No, the only officially supported configuration consists in


Can I use a Layer 2 only deploying the ISN nodes as L3 network devices
3 infrastructure as ISN? (particularly the ISN devices connected to the spines)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 28
Inter-Site Connectivity
Frequently Asked Questions (2)

§ No, the network providing ISN services for Multi-Site could


Do I need to deploy a dedicated also be used for other functions
4 infrastructure as ISN? § It is recommended (but not mandatory) to use a dedicated
VRF for providing ISN connectivity

§ No, the bandwidth required between sites mostly depends


Is there a minimum bandwidth I
5 on the amount of east-west connectivity expected
should deploy between sites? between sites

§ OSPF is mandatory but only between the spines and the


Is OSPF the only protocol first L3 hop ISN devices
6 supported in the ISN network? § It is quite common for customers to redistribute OSPF into
a different routing protocol running inside the ISN

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 29
Inter-Site QoS
ACI Multi-Site and QoS
Intra-Site QoS Behavior

CoS Value DEI Bit Class of Service

2 0 Level 1 user data


1 0 Level 2 user data
• ACI Fabric supports multiple classes of services
0 0 Level 3 user data • Six* user configurable classes of services for
2 1 Level 4 user data
user data traffic
3 1 Level 5 user data
Level3 is the default class (CoS value 0)

5 1 Level 6 user data • Four reserved classes of service


3 0 APIC controller traffic APIC controller traffic
SPAN traffic
4 0 SPAN traffic
Control plane traffic
5 0 Control Plane Traffic
Traceroute traffic
6 0 Traceroute
7 0/1 Reserved • Each class is configured at the fabric level and
mapped to a hardware queue
*Note: 3 user classes are supported before ACI 4.0(1) release

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 31
Inter-Site QoS Behavior
• Traffic across sites should be treated consistently as it happens intra-site
• To achieve this end-to-end consistent behavior it is required to perform DSCP-to-
CoS mapping on the spines (Spines-to-ISN and ISN-to-Spines)
DSCP-to-CoS mapping configuration directly exposed on MSO from 3.0(2) release
• The traffic can then be properly treated inside the ISN (classification/queuing)

APIC Site 1 (‘infra’ Tenant) APIC Site 2 (‘infra’ Tenant)

Traffic classification
and queuing
Spines set the outer Spines set the iVXLAN
DSCP field based on the CoS field based on the
configured mapping configured mapping
ISN
Site 1 Site 2

MP-BGP - EVPN

Multi-Site
Orchestrator

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public
Back-to-Back Topology
and CloudSec Support
ACI Multi-Site
Spines in Separate Sites Connected Back-to-Back

Physical or logical
connections
• Back-to-back connections only supported
Dark fiber/DWDM between 2 sites from ACI 3.2(1) release
10G*/40G/100G
A site cannot be ‘transit’ for communication
between other sites
Requires physical or logical (i.e. use of PWs)
Multi-Site
point-to-point connections (no support for an
Orchestrator intermediate Layer 2 infrastructure)
Best practice recommendation is to create a full
mesh topology
‘Square’ topology is also supported
Site 1 Site 2

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 34
ACI Multi-Site
CloudSec Encryption for VXLAN Traffic
Encrypted Fabric to Fabric Traffic
[GCM-AES-256-XPN (64-bit PN)])
CloudSec = “TEP-to-TEP MACSec”

VTEP IP MACSec VXLAN Tenant Packet

VTEP Information
in Clear Text
Inter-Site Network

MP-BGP - EVPN

Multi-Site
Orchestrator

Supported from ACI 4.0(1) release for FX line cards and 9332C/9364C platforms

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 35
ACI Multi-Site
Orchestrator (MSO)
Architecture
Agenda

• Multi-Site Orchestrator Architecture


• Types of MSO Clusters
• Comparison between MSO and APIC functionalities
• Inter-Version Support
• Recommended MSO Upgrade Procedures
• Scalability Considerations
• Provisioning Tenant Policies on MSO

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 37
Multi-Site Orchestrator
Architecture
ACI Multi-Site
Multi-Site Orchestrator (MSO) Cluster

• Three MSO nodes are clustered and run concurrently


REST
GUI
API (active/active)
Typical database redundancy considerations (minority/majority rules)
ACI Multi-Site Orchestrator Cluster
Up to 150 msec RTT latency supported between MSO nodes

150 msec
MSO nodes can be connected to separate IP subnets
MSO Node 1 MSO Node 2 MSO Node 3
RTT (max)
• The MSO nodes do not need to be connected to the ACI
1 sec RTT leaf nodes
(max)
IP network Best practice recommendation is to spread the MSO nodes in
different physical locations to minimize the risk of concurrent MSO
node failures
….. • OOB Mgmt connectivity to the APIC clusters deployed
Site 1 Site 2 Site n
in separate sites
Up to 1 sec RTT latency between MSO and APIC nodes

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 39
ACI Multi-Site Orchestrator (MSO) Cluster
Microservice Architecture

API UI

API Gateway

Schema Execution Consistency APIC APIC


APIC
Site Service
Service Engine Service

MongoDB

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 40
Types of MSO Clusters
ACI Multi-Site Orchestrator
1 - VM Based MSO Cluster (OVA)

• Supported from the beginning (MSO release 1.0(1))


• Each Cisco ACI Multi-Site Orchestrator node is
ACI Multi-Site Orchestrator Cluster
packaged in a VMware vSphere virtual appliance
(OVA)
MSO Node1 MSO Node2 MSO Node3

ESXi Hypervisor ESXi Hypervisor ESXi Hypervisor


• For high availability, you should deploy each Cisco
ACI Multi-Site Orchestrator virtual machine on its
IP network
own VMware ESXi host
• Requirements for MSO Release 1.2(x) and above:

….. Site n
VMware ESXi 6.0 or later
Site 1 Site 2 Minimum of eight virtual CPUs (vCPUs), 24 Gbps of
memory, and 100 GB of disk space

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 42
MSO 2.2(3)
ACI Multi-Site Orchestrator Release
2 - Cisco Application Service Engine (CASE) Based MSO Cluster

• CASE is the new Cisco compute platform deployed for hosting applications
• MSO can be installed as an App running on a cluster of three CASE nodes
• Recommended MSO cluster deployment option going forward
New functionalities currently available: GUI based upgrade/downgrade, Proxy-Server Configuration

MSO App on CASE MSO App on CASE VM form MSO App on CASE AMI form
1 2 factor (on premises) 3
physical form factor factor (on AWS)*

ACI Multi-Site Orchestrator Cluster ACI Multi-Site Orchestrator Cluster

MSO Node1 MSO Node2 MSO Node3 MSO Node1 MSO Node2 MSO Node3
CASE Physical CASE Physical CASE Physical CASE VM on CASE VM on CASE VM on
Host Host Host ESXi/KVM ESXi/KVM ESXi/KVM

IP network IP network
IP network

Site 1
….. Site 1 Site 2
….. Site n …..
Site 2 Site n
Site 1 Site 2 Site n

*MSO on AWS to manage on-prem ACI


fabrics planned for a future release

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 43
Comparison between
MSO and APIC
functionalities
APIC vs. Multi-Site Orchestrator Functions

• Central point of management and • Complementary to APIC (not a “super-APIC”)


configuration for the Fabric • Provisioning of day-0 infra configuration for the
• Responsible for all Fabric local functions establishment of intersite control and data planes
• Fabric discovery and bring up • Provisioning and managing of tenant policies
• Fabric access policies • Recommendation is to use MSO for defining all the
• Domains creation (VMM, Physical, etc.) tenant policies (at least for all the tenants existing in
• L3Out configuration (logical nodes, logical multiple fabrics or requiring intersite connectivity)
interfaces, etc.) • Can import tenant configuration from APIC cluster
• Maintains runtime data (VTEP address, VNID, domains (brownfield integration)
Class_ID, GIPo, etc.) • End-to-end visibility and troubleshooting (future)
• No participation in the fabric control and data • No participation in the fabric control and data
planes planes

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 45
ACI Multi-Site Orchestrator
UCSD and Ansible Integration
From MSO 2.1(1)
UCSD 6.6 and Ansible Main Functions
MSO
UCSD 6.6 'Homegrown’ Site Management
Ansible
Orchestration Tool Site Infra config and test connectivity
Modules
MSC site inventory
APIC site management (cross-launch)
MSO APIs User Management
Tenant Lifecycle and Site Association
Schema and Template lifecycle (AP, EPGs, Contracts, VRF, BD, etc … )
L3Out and External EPG
Deploy Tenants and Schemas to sites
Monitoring MSC and Management
Import brownfield tenant policies and deploy across sites

….. Troubleshooting

Site 1 Site 2 Site n

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 46
Inter-Version Support
ACI Multi-Site Orchestrator
MSO and APIC Release Dependency (Pre-MSO 2.2(1) Release)

• Pre-MSO 2.2(1) release, MSO and ACI


releases have to be aligned
Inter-Site Network
For example MSO 1.2(x) is used with all sites
running ACI release 3.2(x)
• Different ACI versions across sites are only
supported during an ACI SW upgrade
procedure
ACI 3.2(x) ACI 3.2(x) ACI 3.2(x) ACI 3.2(x)
• The supported order of upgrade is:
1. APIC firmware
For each Site
2. Switches firmware
3. MSO As last task
MSO 1.2(x)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 48
ACI Multi-Site Inter-Version MSO 2.2(1)
Release
Decoupling MSO and APIC Releases

• MSO “inter-version” support is available with


Inter-Site Network MSO release 2.2(1)
• Different ACI versions across sites can be
supported at steady state
• MSO gets visibility into what functionalities are
supported in each fabric (based on the specific
ACI 3.2(6) ACI 4.0(1) ACI 4.1(1) ACI 4.2(1) ACI releases)
Preventing the deployment of unsupported
functionalities
• Best practice recommendation is to always run
MSO 2.2(1) and beyond the latest MSO version available

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 49
ACI Multi-Site Inter-Version MSO 2.2(1)
Release
Decoupling MSO and APIC Releases

1 2

Version check during the “Save” operation of a template Version check at deployment time for a template

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 50
Recommended MSO
Upgrade Procedures
ACI Multi-Site Orchestrator
Recommended Upgrade Options

• MSO App running on top of SE provides a


GUI upgrade procedure
The upgrade image can be fetched from via
HTTP or from a local host

• For upgrading the MSO OVA cluster or Old Cluster


Rollback Config to
New Cluster
the Backup File
migrating to the MSO App cluster running
3
on SE, the recommendation is to
export/import the backup file between
clusters
Export Import
Backup 1 2 Backup

Backup File

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 52
Scalability
Considerations
MSO Scalability Considerations
Continuous Scale Improvements
ACI Release 3.0 ACI Release 3.1 ACI Release 3.2 ACI Release 5.0

MSO 1.0 MSO 1.1 MSO 1.2 MSO 3.0


Number Of Sites 5 8 10 12

Max Leafs (across sites) 250 800 1200 500 per Site

Tenants 100 200 300 400

VRF 400 400 800 1000



BD 800 2,000 3,000 4,000

EPGs 800 2,000 3,000 4,000

Contracts 1,000 2,000 3,000 4,000

L3Out External EPGs 500 500 500 500

Isolated EPGs N/A 400 400 400

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 54
MSO Scalability Considerations
Scale Parameter Stretched Objects
Sites 12
400 leaf nodes per site (500 if
Leaf scale
deploying Multi-Pod with ACI 4.2(4))
Tenants 400
VRFs 1000 • Starting from ACI release 4.1(1) only the scalability
IP Subnets 8000 values for the ‘stretched objects’ are documented
BD 4000 • The total number of objects (stretched + local site)
EPGs 4000 cannot exceed the maximum values published in
Endpoints 100000 (learned from other sites) the scalability guide
Contracts 4000 • For more information please refer to:
L3Outs External EPGs 500 (prefixes)
https://www.cisco.com/c/en/us/td/docs/switches/datacenter/ac
IGMP Snooping 8,000 i/apic/sw/4-x/verified-scalability/Cisco-ACI-Verified-Scalability-
Guide-424.html
MSO Schemas 80
5 à up to MSO 2.2(4)
MSO Templates per Schema
10 à from MSO 2.2(4)
500 à up to MSO 2.2(2)
MSO Objects per Schema
1000 à from MSO 2.2(3)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 55
Provisioning Tenant
Policies on MSO
Agenda

• MSO to Define Tenant Policies


• Schemas and Templates Definition and Deployment
Considerations
• Importing Existing Policies from APIC Domains

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 57
ACI Multi-Site Orchestrator
Defining Tenant Policies

• The Multi-Site Orchestrator represents the single pane of glass


from where provisioning tenant policies
• Applies both to site local and stretched objects configuration
• MSO represents the single source of truth for the configuration
of the managed tenants
• At any given time, the tenant APIC configuration should be in sync with
MSO for the specific MSO managed objects (and their properties)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 58
ACI Multi-Site
MSO Schema and Templates

Schema
§ Template = ACI policy definition
Tenant1
(ANP, EPGs, BDs, VRFs, etc.) Stretched Tenant1
Template
§ Schema = container of Templates sharing
a common use-case
• As an example, a schema can be dedicated
to a Tenant
§ The template is currently the atomic unit
of change for policies
• Such policies are concurrently pushed to
one or more sites
Site 1 Site 2
§ Scope of change: policies in different
templates can be pushed to separate EFFECTIVE
POLICY
EFFECTIVE
POLICY
sites at different times

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 59
ACI Multi-Site
MSO Schema and Templates
§ Flexible way of referencing objects
• Inside the same template
Schema 1 Schema 2
• Across templates in the same schema
Template 1 (Tenant 1) Template 1 (Tenant 2) • Across templates in different schemas
§ All objects defined inside a schema are visible and
can be referenced via the drop-down list
EPG1 BD1 EPG2 BD2 § For inter-schema references need to digit at least 3
letters of the object’s name to be displayed
§ Contract supported between EPGs of different tenants

Template 2 (Tenant common) Template 2 (Tenant 1) § Every tenant can consume objects from the
predefined ‘common’ tenant
§ In order to properly plan how to deploy objects, be
VRF1 C1 aware of the following scalability values
• 10 templates per schema from MSO 2.2(4) release, 5
before that
• 500 objects per schema supported up to MSO 2.2(2),
1000 objects per schema from MSO 2.2(3)
• Every object that can be defined in a template counts
(EPGs, BDs, VRFs, Contracts, etc.)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 60
Schema Design (Typical Deployment)
One Template per Site, plus a ‘Stretched’ Template
Schema Site 1

ANP1 Site 1 Template


(Tenant1)
EPG1 EPG2 BD1 BD2

ANP1 Site 2 Template Site 2


(Tenant1)
EPG3 EPG4 BD3 BD4

ANP1 Site 3 Template


(Tenant1)
EPG5 EPG6 BD5 BD6 Site 3

ANP1 VRF

EPG7 BD7 C1 C2

Contracts

Stretched Template (Tenant1)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 61
ACI Multi-Site Orchestrator
Defining Policies in a Template

Green Field Deployment Import Policies from an Existing Fabric

Site 1
Site 1 Site 1 Site 1

1a 1b 2b
2a
Site 2
Site 2 Site 2 Site 2

2 2 1

Site 1 Site 2 Site 1 Site 2


Green Field Green Field Existing Fabric Green Field
1a. Model new tenant and policies to a common template on MSO 1. Import existing tenant policies from site 1 to new common and
and associate the template to both sites (for stretched objects) site-specific templates on MSO
1b. Model new tenant and policies to site-specific templates and 2a. Associate the common template to both sites (for stretched objects)
associate them to each site 2b. Associate site-specific templates to each site
2. Push policies to the ACI sites 3. Push the policies back to the ACI sites
#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public
ACI Multi-Site Orchestrator
Defining Policies in a Template (2)

Import Policies from Multiple Existing Fabrics

§ In the current implementation, MSO does not


Site 1
Site 1

2a 2b
Site 2
Site 2
allow diff/merge operations on policies from
different APIC domains
1 1 § It is still possible to import policies for the
3 same tenant from different APIC domains,
under the assumption those are no
conflicting
Site 1 Site 2
• Tenant defined with the same Name
Existing Fabric Existing Fabric • Name and policies for existing stretched
1. Import existing tenant policies from site 1 and site 2 to new
objects are also common
common and site-specific templates on ACI MSO
2a. Associate the common template to both sites (for stretched objects)
2b. Associate site-specific templates to each site
3. Push the policies back to the ACI sites #CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 63
ACI Multi-Site Control
and Data Plane
Agenda

• Namespace Normalization and Shadow Objects


• Per Bridge Domain Behavior
• Underlay and Overlay Control Plane Considerations
• Data Plane Communication across Sites
• Simplify ACI Policy Application
• Preferred Group
• vzAny

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 65
Namespace
Normalization and
Shadow Objects
ACI Multi-Site
Software and Hardware Requirements

• ACI Multi-Site introduced from release 3.0(1)

• Support all ACI leaf switches (1st Generation, -EX, -FX and Can have only a subset
newer) Inter-Site of spines connecting to
Network (ISN) the IP network
• Only –EX spine (or newer) to connect to the ISN
1st Gen 1st Gen
• 9332C, 9364C and 9316D non modular spines -EX -EX
also supported
• 1st generation spines (including 9336PQ)
not supported
• Can still leverage those for intra-site leaf
to leaf communication

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 67
ACI Multi-Site
Network and Identity Extended between Fabrics

Network information carried across Identity information carried across


Fabrics (Availability Zones) Fabrics (Availability Zones)

VTEP IP VNID Class-ID Tenant Packet


No Multicast Requirement in
Backbone, Head-End
Replication (HER) for any
Inter-Site Network Layer 2 BUM traffic)

MP-BGP - EVPN

Multi-Site
Orchestrator

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 68
ACI Multi-Site
Inter-Site Policies and Spines’ Translation Tables

Site 1 Spines Translation Table Site 2 Spines Translation Table


§ Inter-Site policies defined on the ACI
Multi-Site Orchestrator are pushed to ISN
the respective APIC domains
• End-to-end policy consistency
VNID à 16678781 VNID à 15434256
• Creation of ‘Shadow’ EPGs to locally Class-ID: 49153 Class-ID: 32770
represent the policies
§ Inter-site communication requires the
installation of translation table entries on
the spines (namespace normalization) EP1 EP2
Site 2
§ Translation entries are populated in Site 1

different cases: EP1


C
EP2 EP1
EPG
C
EP2
EPG
EPG EPG
• Stretched EPGs/BDs VRF VNID: 16678781 VRF VNID: 16678781 VRF VNID: 15434256 VRF VNID: 15434256
BD VNID: 13543235 BD VNID: 15434518 BD VNID: 13762843 BD VNID: 12753426

• Creation of a contract between not Class-ID: 49153 Class-ID: 31564


‘Shadow’
Class-ID: 32770 Class-ID: 36784

stretched EPGs EPGs

• Preferred Group or vzAny deployments

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 69
Per Bridge Domain
Behavior
ACI Multi-Site
Intra-VRF Layer 3 Communication across Sites

MSO
Site 1 Site 2

Tenant 1
• Stretch tenant/VRF across ACI fabrics
• BDs/EPGs defined as site local objects
VRF1 Shadow Objects
BD-Red and Subnet 1 BD-Red and Subnet 1 BD-Red and BD-Green
EPG-Red EPG-Red

• Configuration of policy between EPGs in


Contract C1 separate fabrics to enable intra-VRF
Shadow Objects
Layer 3 inter-site connectivity
BD-Green and Subnet 2 BD-Green and Subnet 2 • Creation of shadow BDs/EPGs in remote
EPG-Green EPG-Green site(s)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 71
ACI Multi-Site
Intra-VRF Layer 3 Communication across Sites (Stretched EPGs)

Site 1
MSO
Site 2 • Stretch tenant/VRF and EPG across ACI
fabrics
Tenant 1 • BDs are also defined in a template
associated to both sites but with “L2 Stretch”
VRF1 option disabled
BD-Red and Subnet 1 BD-Red and Subnet 2 Allows to assign a different IP subnet to the BDs in
EPG-Red separate sites

BD-Red and BD-Green


Contract C1

• Layer 3 communication between sites for


BD-Green and Subnet 3 BD-Green and Subnet 4 intra-EPG and inter-EPG flows
EPG-Green
• No requirement for shadow objects creation

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 72
ACI Multi-Site
Inter-VRF Layer 3 Communication across Sites (Shared Services)

MSO
Site 1 Site 2
• VRF/BD/EPG locally defined in each site
Tenant 1 VRF route leaking
BD-Red and BD-Green
Shadow Objects

VRF1 VRF1
Shadow Objects
• Inter-VRF communication across sites
BD-Red and Subnet 1 BD-Red
BD-Red and
and Subnet
Subnet 11
(shared services)
EPG-Red EPG-Red
EPG-Red
• Route leaking between VRFs (requires
subnet configured under the provider EPG)
Shadow Objects Contract C1

VRF2 VRF2 • Supported within the same stretched tenant


BD-Green and Subnet 2 but also between different tenants
BD-Green and Subnet 2
EPG-Green EPG-Green • Creation of shadow VRFs/BDs/EPGs in
remote site(s)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 73
ACI Multi-Site
Layer 2 Extension across Sites

MSO
Site 2
• Stretch tenant/VRF but also BDs/EPGs
Site 1
across ACI fabrics
Tenant 1 • BUM forwarding can be controlled on a
BD basis
VRF1 Required only for establishing pure L2
BD-Web and Subnet 1 (BUM Forwarding DISABLED) communication across sites (DB clustering
EPG-Web using L2 multicast or broadcast, for example)

BD-DB
Contract C1
IP mobility (and live migration) can be
BD-DB and Subnet 2 (BUM Forwarding ENABLED) supported without enabling BUM forwarding
EPG-DB
BD-Web

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 74
Underlay and Overlay
Control Plane
Considerations
ACI Multi-Site
BGP Inter-Site Peers
• Spines connected to the Inter-Site Network perform
two main functions:
Inter-Site
1. Establishment of MP-BGP EVPN peerings with spines in
Network
remote sites
Anycast VTEP Addresses:
O-UTEP & O-MTEP § One dedicated Control Plane address (EVPN-RID) is
assigned to each spine running MP-BGP EVPN
2. Forwarding of inter-sites data-plane traffic
§ Anycast Overlay Unicast TEP (O-UTEP): assigned to all the
EVPN-RID 4 spines connected to the ISN and used to source and
receive L2/L3 unicast traffic
EVPN-RID 1 § Anycast Overlay Multicast TEP (O-MTEP): assigned to all
EVPN-RID 2 EVPN-RID 3
the spines connected to the ISN and used to receive L2
BUM traffic

• EVPN-RID, O-UTEP and O-MTEP addresses are


assigned from the Multi-Site Orchestrator and must be
routable across the ISN

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 76
ACI Multi-Site
Exchanging TEP Information across Sites IP Network Routing Table

O-UTEP A, O-MTEP A
EVPN-RID S1-S4
O-UTEP B, O-MTEP B
Filter out the
advertisement of internal
EVPN-RID S5-S8
• OSPF peering between spines and TEP pools into the ISN

Inter-Site network
Inter-Site
• Mandates the use of L3 sub-interfaces (with OSPF Network OSPF
VLAN 4 tag) between the spines and the ISN
• Exchange of External Spine TEP S1 S2 S3 S4 S5 S6 S7 S8
addresses (EVPN-RID, O-UTEP and IS-IS to OSPF
mutual redistribution
O-MTEP) across sites TEP Pool 1 TEP Pool 2
Internal TEP Pool information not needed to
establish inter-site communication (should
be filtered out on the first-hop ISN router) Multi-Site
Orchestrator

Use of overlapping internal TEP Pools Site 1 Site 2


across sites is fully supported
Leaf Routing Table Leaf Routing Table
IP Prefix Next-Hop IP Prefix Next-Hop
O-UTEP B, Site1-S1, Site1-S2, Site1-S3, O-UTEP A, Site2-S5, Site2-S6, Site2-S7,
O-MTEP B, EVPN- Site1-S4 O-MTEP A, EVPN- Site2-S8
RID S5-S8 RID S1-S4

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 77
ACI Multi-Site
Inter-Site MP-BGP EVPN Control Plane

S3-S4 Table S5-S8 Table


EP1 Leaf 1 EP2 Leaf 4
• MP-BGP EVPN used to communicate MP-BGP EVPN
EP2 O-UTEP B
Endpoint (EP) information across Sites EP1 O-UTEP A

MP-iBGP or MP-EBGP peering options


supported
Inter-Site
Remote host route entries (EVPN Type-2) Network
are associated to the remote site Anycast O-UTEP A O-UTEP B
O-UTEP address
S1 S2 S3 S4 S5 S6 S7 S8
• Automatic filtering of endpoint Multi-Site

information across Sites COOP


Orchestrator
COOP
Host routes are exchanged across sites only
if there is a cross-site contract requiring
communication between endpoints EP2
EP1
Site 1 Site 2

Define and push inter-site policy


EP1 C EP2
EPG EPG
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 78
Data Plane
Communication across
Sites
ACI Multi-Site
Inter-Sites Unicast Data Plane Policy information
carried across Pods

VTEP IP VNID Class-ID Tenant Packet

S2 has remote info for EP2 and S6 translates the VNID and
EP1 Leaf 4 encapsulates traffic to remote EP2 S2-L4-TEP
3 Class-ID to local values
EP2 O-UTEP B O-UTEP B Address (also Inter-Site and sends traffic to the EP1 O-UTEP A
changes src TEP to be O-UTEP
A) Network local leaf

Site 1 Site 2
O-UTEP A VXLAN Inter-Site unicast traffic O-UTEP B
Proxy A
sourced from O-UTEP A and Proxy B
S1 S2 S3 S4 destined to O-UTEP B S5 S6 S7 S8

EP1 e1/3 2
Multi-Site
Orchestrator
4 EP2
EP1
e1/1
O-UTEP A

5 10.10.10.0/24 Proxy B
20.20.20.0/24 Proxy A
EP1 sends traffic Leaf learns remote Site
EP2 unknown, traffic is 1 to EP2 location info for EP1
encapsulated to the local Proxy
EP1 EP2
A Spine VTEP (adding S_Class EP1
EPG C
EP2
EPG 20.20.20.20
10.10.10.10
information)
6
3 4 If policy allows it, EP2
2
receives the packet
Proxy-A O-UTEP B S2-L4-TEP
1 6
S1-L4-TEP O-UTEP A O-UTEP A

20.20.20.20 20.20.20.20 20.20.20.20 20.20.20.20


= VXLAN Encap/Decap
20.20.20.20

10.10.10.10 10.10.10.10 10.10.10.10 10.10.10.10 10.10.10.10


#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 80
ACI Multi-Site Policy information (EP1’s Class-ID)
Inter-Sites Unicast Data Plane (2) carried across Pods

VTEP IP VNID Class-ID Tenant Packet

S3 translates the VNID


and S_Class to local EP1 S1-L4-TEP
values and sends traffic EP2 O-UTEP A Inter-Site S6 rewrites the S-VTEP
to the local leaf Network to be O-UTEP B

9
Site 1 10 Site 2
O-UTEP A VXLAN Inter-Site unicast traffic O-UTEP B
sourced from O-UTEP B and
S1 S2 S3 S4 destined to O-UTEP A S5 S6 S7 S8
EP1 e1/3
EP2 O-UTEP B Multi-Site EP1 O-UTEP A
Orchestrator
** Proxy A
8 * Proxy B
11 Leaf applies the policy and, if
Leaf learns remote Site allowed, encapsulates traffic to
location info for EP2 EP1 EP2 remote O-UTEP address
EP1 EP2
EPG C EPG 20.20.20.20
10.10.10.10
12 7
EP1 receives the packet EP2 sends traffic back
10 9 8
to remote EP1
S1-L4-TEP O-UTEP A O-UTEP A
12
O-UTEP B O-UTEP B S2-L4-TEP 7
10.10.10.10 10.10.10.10 10.10.10.10 10.10.10.10
= VXLAN Encap/Decap
10.10.10.10

20.20.20.20 20.20.20.20 20.20.20.20 20.20.20.20 20.20.20.20


#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 81
ACI Multi-Site
Inter-Sites Unicast Data Plane (3)

From this point EP1 to EP2 communication is encapsulated Leaf to Remote Spine O-UTEPs in both directions

Inter-Site
Network

Site 1 Site 2
O-UTEP A Flows between O-UTEP A and O- O-UTEP B
UTEP B (and vice versa)
S1 S2 S3 S4 S5 S6 S7 S8
Multi-Site
Orchestrator
**

EP2 e2/5
EP1 e1/3 EP1 EP1 EP2 EP2 EP1 O-UTEP A
EPG C EPG 20.20.20.20
EP2 O-UTEP B 10.10.10.10 * Proxy B

Proxy A

= VXLAN Encap/Decap

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 82
ACI Multi-Site
Layer 2 BUM Traffic Data Plane

S3 is elected as Multi-Site forwarder for GIPo 1


BUM traffic à it creates an unicast VXLAN S7 translates the VNID and the
packet with O-UTEP A as S_VTEP and GIPo values to locally
Multicast O-MTEP B as D_VTEP Inter-Site significant ones and associates
Network the frame to an FTAG tree

3 4
O-UTEP A Inter-Site BUM traffic sourced O-MTEP B
from O-UTEP A and destined to
S1 S2 S3 S4 O-MTEP B S5 S6 S7 S8
BUM frame is flooded along the
Multi-Site tree associated to GIPo. VTEP
2
Orchestrator 5 learns VM1 remote location
*
*
EP1 O-UTEP A
BUM frame is associated to
GIPo1 and flooded intra-
site via the corresponding * Proxy B
FTAG tree
EP1 EP2
1 6
GIPo1 = Multicast Group EP1 generates a BUM EP2 receives the
associated to EP1’s BD frame BUM frame

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 83
Layer 3 Only
Communication across
Sites
ACI Multi-Site
Never Mixing ISN and L3Out Paths for E-W Routed Communication

• Routed communication between different ACI fabrics can be


established in two ways:
1. The first and recommended way is by using VXLAN tunnels established via the Inter-
Site Network
2. Through the L3Out connections deployed in each fabric and leveraging the connectivity
services of the external Layer 3 network domain

• Mixing those two approaches is highly discouraged because it may


lead to unexpected dropping of traffic

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 85
ACI Multi-Site
L3 Only across Sites

Why not using just normal routing across independent fabrics?

Independent ACI Fabrics (no ACI Multi-Site)

L3Out L3Out

Need to apply a contract Need to apply a contract


WAN between internal EPG and
between internal EPG and
Ext-EPG associated to the Ext-EPG associated to the
L3Out in Fabric 1 L3Out in Fabric 2
Mandates the use of a multi-VRF
capable backbone network (VRF-Lite,
MPLS-VPN, etc.) to extend multiple
VRFs across fabrics
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 86
ACI Multi-Site
Never Mixing ISN and L3Out Paths for E-W Routed Communication

Inter-Site
Network

Multi-Site
Site 1 Orchestrator Site 2
O-UTEP A O-UTEP B

S1
Proxy A
S2 S3 S4 S5 S6Proxy BS7 S8

BL1 BL2 BL3 BL4

20.20.20.0/24 30.30.30.0/24
L3Out L3Out
EP1 EP2 EP3
10.10.10.10 20.20.20.20 30.30.30.30
20.20.20.0/24 BL3,BL4
30.30.30.0/24 BL1,BL2

WAN

1 Initial Condition: EP2 is communicating to EP3 via the L3Out path

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 87
ACI Multi-Site
Never Mixing ISN and L3Out Paths for E-W Routed Communication
EP2-EP3 traffic is dropped on the
EP1-EP3 communication is spine because of lack of translation
successfully established entries for EP2 EPG
Inter-Site
Network

Multi-Site
Site 1
O-UTEP A
Orchestrator
X O-UTEP B Site 2

S1
Proxy A
S2 S3 S4 S5 S6Proxy BS7 S8

EP1 EP3
EPG C EPG

20.20.20.0/24 30.30.30.0/24
L3Out L3Out
EP1 EP2 EP3
10.10.10.10 20.20.20.20 30.30.30.30
20.20.20.0/24 BL3,BL4
30.30.30.0/24 Proxy A
10.10.10.0/24 Proxy B

WAN
EP3 EPG subnet route pointing to
spine proxy is installed on all the
leaf where the VRF is deployed
2 A contract between EP1 and EP3 EPG is configured on MSO to allow E-W
communication via the ISN
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 88
Simplify Policy Application
Preferred Group and vzAny
ACI Multi-Site MSO 2.0(2)
Release
Simplify Policy Enforcement: Preferred Groups
Contract required to
Multi-Site Preferred Group communicate with EPG(s)
external to the Preferred Group

App DB
C1 Non-PG
EPG
C2
Free Web
communication

§ ”VRF unenforced” not supported with Multi-Site


§ Multi-Site Preferred Group configuration from the Multi-Site Orchestrator is supported from MSO 2.0(2)
release
• Creates ‘shadow’ EPGs and translation table entries ‘under the hood’ to allow ‘free’ inter-site communication
• 250 Preferred Groups supported as MSO release 2.2(3), 1000 from MSO release 2.2(4)
§ Typically desired in legacy to ACI migration scenarios

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 90
Simplify Policy Enforcement
Preferred Groups for E-W and N-S Flows

Inter Site § Adding internal EPGs and External EPGs


Network
(associated to L3Outs) to the Preferred Group
Site 1 Site 2 allows to enable free east-west and north-
south connectivity
§ When adding the Ext-EPG to the Preferred
Group:
L3Out L3Out
Site 1 Site 2 • Can’t use 0.0.0.0/0 for classification, needs more
EP1 EP2
Ext-EPG Ext-EPG
specific prefixes
• As workaround it is possible to use 0.0.0.0/1 and
Multi-Site Preferred Group 128.0.0.0/1 to achieve the same result
EPG1 EPG2
• Must ensure Ext-EPG is a stretched object
Ext-EPG

On MSO

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 91
Simplify Policy Enforcement MSO 2.2(4)
Release
vzAny Support
What is vzAny? Logical object representing all the EPGs in a VRF

Use case 1: Many-to-One communication Use case 2: Enable free communication


(Shared Services) inside a VRF
vzAny (VRF1)
vzAny (VRF1) vzAny (VRF1)
VRF1 or
VRF-Shared
EPG1 EPG2 EPG1 EPG2 EPG1 EPG2
Shared
C C1 P EPG C C1 P
Permit-Any
EPG3 Ext-EPG
No Service-Graph Ext-EPG
support in 2.2(4)
No Service-Graph
support in 2.2(4)

§ Multiple EPGs part of a specific VRF1 consume the § vzAny provides and consumes a contract with an associated
services provided by a shared EPG (part of VRF1 or “Permit-any” filter
of a VRF-shared)
§ Use ACI fabric only for network connectivity without policy
§ VRF-shared can be part of the same tenant or of a enforcement
different tenant
§ Equivalent to “VRF unenforced”

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 92
ACI Multi-Site and vzAny MSO 2.2(4)
Release
Many-to-One Communication (Shared Services)

Inter Site
Network

Site1 Site2

Ext-EPG Ext-EPG
L3Out-Site1 L3Out-Site2

EPG1 EPG3 Shared-EPG

Shared-Resource

• Proper translation entries are created on the spines of both fabrics to enable east-
west communication
• Supported also for Shared Services behind an L3Out

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 93
ACI Multi-Site and vzAny MSO 2.2(4)
Release
Enable Inter-Site Free Communication Inside a VRF

Inter Site
Network

Site1 Site2

Ext-EPG Ext-EPG
L3Out-Site1 L3Out-Site2

EPG1 EPG2

• Proper translation entries are created on the spines of both fabrics to enable east-
west communication
• Supported also for connecting to the external Layer 3 domain

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 94
Connecting to the
External L3 Domain
Agenda

• Different Types of L3Outs


• Deploying External EPG(s) Associated to the L3Out
• Solving Asymmetric Routing Issues with the External
Network
• Intersite L3Out Support

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 96
Different Types of L3Outs
Connecting to the External Layer 3 Domain
‘Traditional’ L3Outs on the BL Nodes (Recommended Option)

Client

L3Out WAN Edge WAN


Routers

ff
-o
Connecting to WAN Edge routers from Border Leaf nodes

nd

Ha
VRF-Lite hand-off for extending L3 multi-tenancy outside

i te

L
the ACI fabric

F-
VR
§ Up to 800 L3Outs/VRFs currently supported on the same BL
Border Leafs
nodes pair
• Support for host routes advertisement out of the ACI Fabric
from ACI release 4.0(1)
§ Enabled at the BD level
• Support for L3 Multicast and Shared L3Out

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 98
ACI Multi-Site and Border Leaf L3Outs
Deployment Options
Dedicated pair of WAN edge routers Shared pair of WAN edge routers

ISN ISN

WAN WAN

§ BLs on each ACI site connect to a separate pair of WAN edge § BLs of different sites connect to a common pair of WAN
routers for communication with the WAN edge routers for communication with the WAN
§ Most common deployment model for ACI fabrics § Typical deployment model when Multi-Site is used for
geographically dispersed scaling up the fabric in a single DC location

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 99
Connecting to the External Layer 3 Domain ACI 5.0(1)
Release
SR-MPLS/MPLS Hand-Off on the BL Nodes

NCS5500, NCS540/ 560 or


ASR9K with ACI 5.0(1) release
Client

WAN Edge WAN


MPLS tagged Routers
traffic

• Connecting to WAN Edge routers from Border Leaf nodes

PN
• Typically connected directly

EV
• Possible to connect through a Segment Routing enabled network

GP
-B
• Single control plane session (MP-BGP EVPN) for all tenant VRFs

MP
• BGP EVPN address family to carry DC prefixes, MPLS label for VRF
Border Leafs (VPN label) and color community
• MPLS tagged traffic between the BL nodes and the WAN Edge
routers
FX, FX2, GX • Only using a single MPLS label (VPN label) with directly connected
Leaf models WAN Edge routers
• VPN + SR labels used with a SR enabled network in the middle
• No current support for Layer 3 Multicast communication

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 100
ACI Multi-Site and Border Leaf L3Outs ACI 5.0(1)
Release
Deployment Options
SR-MPLS/MPLS hand-Off for N-S Traffic SR-MPLS/MPLS hand-Off for Inter-DC Flows

ISN ISN

SR-MPLS/MPLS SR-MPLS/MPLS

§ No EPGs, BDs or VRFs are stretched across ACI sites


Different VRFs in each site are mapped to a common VRF in the MPLS
§ SR-MPLS/MPLS Hand-Off used in each site to enabled backbone
communicate with external resources (through an MPLS § Use of SR-TE/Flex-Algo for inter-DC flows
enabled backbone)
§ WAN teams can monitor inter-DC flows using existing IP/SR
based monitoring tools
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 101
Connecting to the External Layer 3 Domain
‘GOLF’ L3Outs (VRF High Scale Use Cases)
= VXLAN Encap/Decap
Different WAN
Hand-Off options:
VRF-Lite, MPLS-
VPN, LISP*
Client

WAN
N
E VP
-B GP
MP
Fle
x OTV/VPLS
• Connecting to WAN Edge devices at Spine nodes
Op
(directly or indirectly)
GOLF Routers
(ASR 9000, ASR 1000,
§ VXLAN data plane with MP-BGP EVPN control plane
Nexus 7000) • High scale tenant L3Out support
• Automated configuration with OpFlex
• Support for host routes advertisement out of the ACI
Fabric
§ Enabled at the VRF level
• No support for L3 Multicast or Shared L3Out (every
tenant VRF requires its own L3Out)
• GOLF is still supported but the strong recommendation
is to use BL L3Outs instead
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 102
ACI Multi-Site and ‘GOLF’ L3Outs
Deployment Options
Distributed GOLF Routers Shared GOLF Routers (from ACI 3.1)

WAN WAN
GOLF Routers GOLF Routers GOLF Routers

MP-BGP MP-BGP
MP-BGP ISN MP-BGP EVPN ISN EVPN
EVPN EVPN

§ Each ACI sites utilizes a separate pair of GOLF routers for § Common pair of GOLF routers shared by all sites for
communication with the WAN communication with the WAN
§ Local EVPN peering between spines and GOLF routers § GOLF routers can be connected to the ISN or directly to
the spines
§ GOLF routers connect to the ISN or directly to the spines

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 103
ACI Multi-Site and ‘GOLF’ L3Outs
Must Use Host-Route Advertisement for Stretched BDs with GOLF L3Outs
10.1.0.10/32 à G1, G2 Traffic destined
10.1.0.20/32 à G3, G4 10.1.0.0/24 à G1-G4


to 10.1.0.20

WAN WAN
G1 G2 G3 G4 G1 G2 G3 G4

ISN ISN

10.1.0.0/24 10.1.0.0/24
.10 .20 .10 .20

§ Host-route advertisement into the WAN for stretched BDs § Without host-route advertisement traffic destined to a
stretched IP subnet may enter the ‘wrong site’
§ Ensures that ingress traffic is always delivered to the ‘right site’
§ A site can’t be used as ‘transit’ for traffic destined to an
endpoint part of a stretched BD and remotely located (traffic is
dropped on the receiving spines)
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 104
Deploying External EPG(s)
Associated to the L3Out
ACI Multi-Site and L3Out
Stretching or Not Stretching the Ext-EPG?

• The Ext-EPG can be defined in a


Inter-Site Network template associated to multiple sites
(stretched object)
The Ext-EPG must then be mapped to the local
L3Outs in the “site level” section of the template
configuration
L3Outs remain independent objects defined in
each site
Ext-EPG
L3Out Site 1 L3Out Site 2
• Recommended when the L3Outs in the
separate sites provide access to a
BD-Red BD-Green
common set of external resources (as
WAN
the WAN)
Simplifies the policy definition and external traffic
classification
Still allows to apply route-map polices on each
L3Out (since we have independent APIC domains)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 106
ACI Multi-Site and L3Out
Stretching or Not Stretching the Ext-EPG?

Inter-Site Network
• Separate Ext-EPGs can be defined in
templates mapped to separate sites (non
stretched objects)
Each Ext-EPG can be mapped to the local L3Out in
the “global” or “site level” section of the template
configuration
Ext-EPG Ext-EPG • Allows to apply different policies to each
L3Out Site 1 L3Out Site 2
Ext-EPGs at different time
BD-Red BD-Green
• Can still use the same 0.0.0.0/0 network
WAN
configuration for classification on both
sites
• May require enablement of Intersite L3Out

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 107
Solving Asymmetric Routing
Issues with the External
Network
ACI Multi-Site and L3Out
Endpoints Normally Use Local L3Outs for Outbound Traffic

Inter-Site Network

Site 1 Site 2

Web-EPG C1 Ext-EPG

L3Out L3Out
Site 1 Site 2
10.10.10.10 Active/Standby IP Subnet 10.10.10.11
IP Subnet 10.10.10.0/24 Active/Standby
10.10.10.0/24

Traffic dropped
because of lack of state
in the FW

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 109
Solving Asymmetric Routing Issues ACI 4.0(1)
Release
Use of Host-Routes Advertisement

Inter-Site Network

Site 1 Site 2

Web-EPG C1 Ext-EPG

L3Out L3Out
Site 1 Site 2
10.10.10.10 Host routes 10.10.10.11
10.10.10.10/32
Active/Standby Active/Standby Host routes
10.10.10.11/32
*Alternative could be
running an overlay solution
(LISP, GRE, etc.) Host-routes
injected into the Enabled on MSO
WAN* at the BD level in
each site
• Ingress optimization requires host-routes advertisement on the L3Out
§ Native support on ACI Border Leaf nodes available from ACI release 4.0(1)
§ Supported also on GOLF L3Outs (enabled #CiscoLive
at the VRF level) © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 110
Solving Asymmetric Routing Issues ACI 4.2(1)
Release
Use of Active/Standby FW Pair Deployed across Sites

Inter-Site Network

Site 1 Site 2

Web-EPG C1 Ext-EPG

L3Out L3Out
Site 1 Site 2
10.10.10.10 10.10.10.11
Active Standby
IP Subnet
10.10.10.0/24

• Inbound and outbound flows are forced through the site with the active perimeter FW node
• Common scenario in a Multi-Pod deployment, less desirable with Multi-Site
• Requires Intersite L3Out support (ACI#CiscoLive
release 4.2(1)) © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 111
Intersite L3Out Support
Problem Statement
Behavior before ACI Release 4.2(1)

Supported Design
✓ Not Supported Design

Inter-Site Network Inter-Site Network


X

L3Out L3Out L3Out


Site 1 Site 2 Site 1

WAN, Mainframes, WAN,WAN


Mainframes,
FW/SLB, etc… FW/SLB, etc…

Note: the same consideration applies to both Border Leaf L3Outs and GOLF L3Outs
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 113
ACI Multi-Site and L3Out
Support of Intersite L3Out

• Starting with ACI Release 4.2(1) it is possible for


Inter-Site Network endpoints in a site to send traffic to resources (WAN,
Mainframes, FWs/SLBs, etc.) accessible via a remote
L3Out connection
• External prefixes are exchanged across sites via MP-
MP-BGP VPNv4/VPNv6

BGP VPNV4/VPNv6 sessions between spines


• Traffic will be directly encapsulated to the TEP of the
L3Out
Site 1
remote BL nodes
• The BL nodes will get assigned an address part of an
WAN, Mainframes,
WAN additional (configurable) prefix that must be routable across
FW/SLB, etc…
the ISN

• Same solution will also support transit routing across


sites (L3Out to L3Out)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 114
ACI Multi-Site and Intersite L3Out ACI 4.2(1)
Release
Supported Scenarios

Inter-Site Network Inter-Site Network

L3Out L3Out L3Out


Site 1 Site 1 Site 2

WAN, Mainframes,
WAN
FW/SLB, etc…
WAN
WAN, Mainframes, WAN, Mainframes,
FW/SLB, etc… FW/SLB, etc…

• Endpoint to remote L3Out communication (intra-VRF) • Inter-site transit routing (intra-VRF)


• Endpoint to remote L3Out communication (inter-VRF) • Inter-site transit routing (inter-VRF)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 115
ACI Multi-Site and Intersite L3Out
Introduction of a Routable TEP Pool MSO Infra Config

Inter-Site
Network

Site 1 Site 2
Address taken from
the new routable Next-Hop to
TEP pool reach P1 is BL
TEP

BL RTEP

L3Out
Site 1
External prefix P1 EP

WAN, Mainframes,
FW/SLB, etc…

• The BL TEP is normally taken from the original TEP pool assigned during the fabric bring-up procedure
• Since we don’t want to assume that the original TEP pool can be reached across the ISN, a separate
routable TEP pool is introduced to support intersite L3Out
• The routable TEP pool can be directly configured on the Multi-Site Orchestrator
• One or more routable TEP pools can be configured (pool size is /22 to /29)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 116
ACI Multi-Site and Intersite L3Out
Control Plane

Inter-Site
Red Ext
EPG C EPG Network

Site 1 Site 2
• Next-Hop to reach P1 is
RR RR MP-BGP VPNv4/VPNv6 RR RR BL routable TEP (RTEP)
• Info to rewrite remote
L3Out’s VRF L3VNI

BL RTEP

L3Out
Site 1
External prefix P1
EP
WAN, Mainframes,
VPNv4 FW/SLB, etc…

• External prefix advertisements received via the L3Out are redistributed to the leaf nodes in the local site
via MP-BGP VPNv4/VPNv6 through the RRs in the spines (normal ACI intra-fabric behavior)
• MP-BGP VPNv4 advertisements are also used to distribute this information to the remote sites
• The prefixes are then redistributed inside the remote sites via VPNv4/VPNv6 by the RR spines
• The next-hop VTEP for the prefixes is the BL routable TEP (RTEP) that received the routes from the external network
• Associated to the prefix information are the info to rewrite the VRF L3VNI value to match the one in the remote site
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 117
ACI Multi-Site and Intersite L3Out
No VNID/S-Class Translations on Receiving Spines

No VNID/S-Class Inter-Site
translations applied
Network

Site 1 Site 2
Rewrite of remote
L3Out’s VRF VNID
information

BL RTEP

L3Out
Site 1
External prefix P1 EP

WAN, Mainframes,
FW/SLB, etc…

• Differently from regular inter-site (east-west) communication between endpoints, the spines on
the receiving site are not involved in VNID/Class-ID translations for communication to external
prefixes
• BGP program the rewrite of VNIDs of remote site routes directly on ToRs

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 118
ACI Multi-Site and Intersite L3Out
Data Plane (EndPoint to L3Out)

VXLAN traffic treated as Inter-Site Modify the SIP of the


transit and simply routed VXLAN packet to be
toward the BL (no Network the Site’s O-UTEP
VNID/sClass translation)

Site 1 Site 2
Decapsulate traffic
and perform L3 Insert remote VRF
lookup in the right L3VNI value in the
VRF VXLAN header

BL RTEP Derive D-Class


based on the
Remote EP learning L3Out locally installed
is always disabled on Site 1 external prefix and
this BL node External prefix P1 apply policy* EP
WAN, Mainframes,
FW/SLB, etc… *True for both intra-VRF and inter-VRF flows

• VXLAN tunnel is established directly between the leaf in site 2 and the BL in site 1
• The spines in the source site still translate the SIP to be the site’s O-UTEP
• The spines in the destination site simply route the VXLAN packet toward the destination BL node

• The BL node uses the L3VNI info in the VXLAN header to perform the lookup in the right VRF
• Remote endpoint learning always disabled on the BL nodes to avoid learning the wrong sClass
info (as there is no translation happening on the receiving spines)
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 119
ACI Multi-Site and Intersite L3Out
Data Plane (L3Out to EndPoint)
Spine performs the lookup
in the COOP database
Spine performs the EP and encapsulates traffic
lookup in the COOP Inter-Site toward the local leaf
database and
encapsulates traffic Network
toward site 2 O-UTEP

Site 1 O-UTEP Site 2


Incoming traffic is
always sent to the proxy Decapsulate traffic and
spine (since remote EP perform L3 lookup in the
learning is disabled) right VRF

BL RTEP
Derive D-Class for
local endpoints
L3Out and apply policy
Site 1
EP
WAN, Mainframes,
FW/SLB, etc…

• Traffic received from the WAN hit the BL node and is always sent to the spine proxy
• The local spine performs the lookup for the destination endpoint and encapsulates to the O-UTEP of the
destination site (after changing the SIP in the VXLAN header to match the local O-UTEP)
• The receiving spine performs the lookup in the COOP DB and S-Class/VNID translations (as in regular Multi-
Site data-plane) à the traffic is encapsulated to the local leaf
• The local leaf decapsulates the packet, performs the L3 lookup, applies the policy and sends traffic to the
endpoint (if allowed)
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 120
Intersite L3Out Deployment
Considerations
ACI Multi-Site and Intersite L3Out
Stretched Ext-EPG - Outbound Flows

• When peering BGP with the external router,


Inter-Site Network
if all BGP parameters are the same, the local
L3Out is preferred by default for outbound
flows
The default behavior can be modified by tuning a
BGP parameter (Local-Preference, MED, …) for
the received external prefixes
Ext-EPG Notice that Local-Preference is only propagated
L3Out Site 1 L3Out Site 2 inside a BGS AS, so can’t be used if separate
sites are peering EBGP
BD-Red BD-Green

WAN • When peering OSPF or EIGRP with the


external router, local L3Out is preferred by
default (best IS-IS metric to the local BL
nodes)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 123
ACI Multi-Site and Intersite L3Out
Stretched Ext-EPG - Outbound Flows

Inter-Site Network • When peering BGP with the external router,


if all BGP parameters are the same, local
L3Out is preferred by default
• This is not the case if the BGP attributes
172.16.1.0/24 (like AS-Path for example) are better for a
via BL in Site 2 prefix received in a given site
• A user may not be able to modify this
Ext-EPG
L3Out Site 1
behavior by applying inbound route-map on
L3Out Site 2
the BL nodes and would lose control on
BD-Red outbound traffic path
172.16.1.0/24 172.16.1.0/24
WAN
received with received with
AS-Path is considered before Local-Preference
longest AS-Path shortest AS-Path
and MED in the BGP route selection algorithm

172.16.1.1

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 124
ACI Multi-Site and Intersite L3Out
Stretched Ext-EPG - Inbound Flows

• If Intersite L3Out is enabled for the WAN


isolation use case, it is required to
Inter-Site Network advertise the BD Red subnet out of the
local and remote L3Outs
• Without additional tuning it may happen
that inbound traffic is steered toward the
site where the BD is not deployed
This may cause asymmetric traffic path also for
not stretched BD
Ext-EPG
L3Out Site 1 L3Out Site 2 • Enabling host-based advertisement is a
BD-Red
possible solution, if there are not scalability
concerns on the amount of host routes
WAN advertised externally
Advertise BD-Red prefix
Advertise BD-Red prefix
into the external network • Policies can also be applied on the L3Out
into the external network
of each site to make one path preferable
compared to the other

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 125
ACI Multi-Site and Intersite L3Out
Current Restrictions

• The following restrictions currently apply to the Inter-Site L3Out functionality:

Ø Not supported when deploying GOLF L3Outs: this implies that a local GOLF L3Out connection
is always required to enable outbound connectivity for endpoints deployed in a given site
Ø Not supported between Remote Leaf pairs associated to separate sites

Ø Not supported for some specific L3 multicast deployment scenarios (external source with PIM
ASM/SSM or external receiver with PIM ASM)
Ø Use of a remote L3Out connection after redirecting traffic through a local service node via
PBR is not supported as of ACI 5.0(1) release
Ø Currently CloudSec and the configuration of Routable TEP pools are mutually exclusive
No support for Remote Leaf and/or Intersite L3Out deployments

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 126
Tenant Routed
Multicast (TRM)
Support with Multi-Site
Agenda

• TRM Forwarding Behavior


• PIM Sparse Mode
• Anycast RP Deployment Options
• Control and Data Planes
• Multicast Data Plane Traffic Filtering

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 131
TRM Forwarding Behavior
Tenant Routed Multicast with Multi-Site ACI 4.0(2)
Release
Deployment Considerations

• Supported from ACI 4.0(2) release only on 2nd Gen leaf switches (EX/FX and beyond)

• Support for PIM-ASM and PIM-SSM Tenant routed multicast


• For PIM-ASM external RP or RP in the fabric are possible options (the latter from ACI 5.0(1) release)
• When using external RP, all sites must have reachability to the RP via a local L3Out

• Each BL node runs PIM in active mode and forms neighborship with other BLs in the same
site and with the external router(s)

• Supports sources attached to the fabric and external sources (reachable via local L3Out)
• BDs with L3 Multicast sources or receivers may or may not be stretched across the sites
• External sources must be reachable independently from each site via a local L3Out

• Supports receivers attached to the fabric and external receivers (reachable via local or
remote L3Out)
• BDs with L3 Multicast receivers or receivers may or may not be stretched across the sites

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 133
Tenant Routed Multicast with Multi-Site
L2 Multicast over Multi-Site (Supported since ACI 3.0(1) Release)
• Stretched BDs with BUM Traffic Enabled (no PIM configuration required)
• Within a site the L2 multicast traffic is encapsulated to the BD GIPo multicast address
• The spine elected as Designated Forwarder (DF) replicates the stream to each remote sites where
the BD is stretched (traffic sent to the O-MTEP address of each remote site)
• At the receiving spine the multicast stream is sent down the FTAG tree to the receiving site BD GIPo
multicast address
Inter-Site
Network

Site 2 O-MTEP
HREP tunnel destination:
Site 2 O-MTEP

Site 1
BD1 VNID à 16514962
BD1 GIPo à 225.0.195.240

Site 2
BD1 VNID à 16711545
BD1 GIPo à 225.1.128.160
BD1 BD1 BD1 BD1 BD1
Site 1 Source R1 NOT Receiver NOT Receiver R2 Site 2

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 135
Tenant Routed Multicast with Multi-Site
L3 Multicast over Multi-Site (Source Inside the Fabric)
• Built as Routing-First Approach (decrement TTL at source and destination ACI leaf nodes)
• L3 Multicast is always sent to the VRF GIPo within a site (existing behavior)
• Between sites it is sent over the HREP tunnel to the O-MTEP address of the remote sites
where the VRF is stretched (the VXLAN header will include the source site VRF VNID)
• L3 Multicast at the receiving site will be sent in the VRF GIPo of the receiving site

Inter-Site
Network

Site 2 O-MTEP
Site1 VRF VNID à 2293762
HREP tunnel dest: Site 2 O-MTEP

Site 2 Decrement TTL


Decrement TTL
VRF VNIDà 2621443
VRF VRF GIPo à 225.1.232.16
Site 1
L3Out VRF VNIDà 2293762 L3Out
VRF VRF GIPo à 225.1.248.32 VRF

BD1 BD1 BD2Decrement TTL BD1 BD2 BD2


VRF VRF VRF VRF VRF VRF
VRF
Site 1 Source R1 R2 R3 R4 Site 2

R2 #CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 136
Tenant Routed Multicast with Multi-Site
L3 Multicast over Multi-Site (Source Outside the Fabric)

• Local L3Out must be used to receive traffic from an external source


• Multicast traffic from external sources dropped on the spines (to avoid traffic duplication)

Multicast traffic from external Multicast traffic from external


sources is dropped on the spines Inter-Site sources is dropped on the spines
and not sent over HREP tunnels Network and not sent over HREP tunnels

Site 2 Decrement TTL


VRF VNIDà 2621443
VRF VRF GIPo à 225.1.232.16
Site 1
L3Out VRF VNIDà 2293762 L3Out
VRF VRF GIPo à 225.1.248.32 VRF
Decrement TTL
BD2 Decrement TTL BD1 BD2 BD2
VRF
Site 1 VRF VRF VRF
VRF
R1 R2 R3 Site 2

Source
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 137
PIM Sparse Mode
Anycast RP Deployment Options
Why Anycast RP (Rendezvous Point)?
• Anycast RP is a feature used in PIM SM networks to provide redundancy and load
balancing for the RP
• Anycast RP allows for two or more RPs in the network to share the same anycast IP
address à if an anycast RP device fails another RP can take over (redundancy).
• Multicast routers in the network can join the shared (RP tree) to any of the anycast RP
members (load balancing)
• Joins are forwarded to an RP based on the unicast routing table à joins will be sent to
closest RP (based on metric) or load balanced over ECMP links
• Anycast RP also provides fast failover convergence because the IGP selects the best path
the RP so convergence should match that of the IGP

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 139
Tenant Routed Multicast with Multi-Site
External Anycast RP with MSDP
• Different sources may register with different RP routers (based on routing information)
• Source Active (SA) messages exchanged between MSDP peers to share info of the active sources for
each (S,G)
Inter-Site
Network

PIM Data PIM Data


Register Site 2 using Register
RP 1.1.1.1
Site 1 using
RP 1.1.1.1
L3Out-1 L3Out-2
Source 1 Source 2
Site 1 Site 2

PIM Enabled Core

MSDP
MSDP IP1 MSDP IP2
RP: 1.1.1.1 RP: 1.1.1.1
#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 140
Tenant Routed Multicast with Multi-Site
External Anycast RP with Anycast RP PIM (RFC 4610)
• Different sources may register with different RP routers (based on routing information)
• PIM Register messages exchanged between external RPs to share info of the active sources for each (S,G)

Inter-Site
Network

PIM Data PIM Data


Register Site 2 using Register
RP 1.1.1.1
Site 1 using
RP 1.1.1.1
L3Out-1 L3Out-2
Source 1 Source 2
Site 1 Site 2

PIM Enabled Core

PIM
PIM IP1 PIM IP2
RP: 1.1.1.1 RP: 1.1.1.1
#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 141
Tenant Routed Multicast
Fabric RP: Anycast RP in the Fabric

10.99.99.99 10.99.99.99
RP RP BL
BL
L3Out

• Fabric RP is a feature added in ACI release 4.0(1)


• When Fabric RP is configured a loopback interface is created on all
border leaf nodes with the IP address of the fabric RP
PIM Enabled Core
• The RP address is an anycast address on all BLs and the behavior
is the same as PIM anycast where all BLs are members on the
anycast list
• The ACI fabric looks like a single RP to the PIM routers in the core

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 142
Tenant Routed Multicast with Multi-Site ACI 5.0(1)
Release
Fabric RP in a Multi-Site Domain (ACI 5.0(1) Release)
• ACI release 5.0(1) adds support for Fabric RP in a Multi-Site domain
• Anycast-RP Register messages are forwarded across sites via the ISN
• Remote RPs receive the Anycast-RP Register messages and learn source information
Anycast-RP Register
inner source: 1.1.1.103
Inter-Site
inner dest: 224.0.0.13 Network
Outer dest: site 2 O-MTEP

PIM register Register


Anycast-RP Site 2 O-MTEP
inner source:
inner source: 1.1.1.103
1.1.1.103
inner dest: 224.0.0.13
inner dest: 224.0.0.13
PIM Data Register outer dest:
outer dest: VRF
VRF GIPo
GIPo
source: 10.50.1.1
dest: 10.99.99.99
RP RP RP RP
1.1.1.103

L3Out-1 L3Out-2
Source 1 Enter RP address
Site 1 Specify it is a Fabric RP
Site 2
Multicast Route-MAP (Optional)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 143
PIM Sparse Mode
Control and Data Planes
Source Inside, Receivers Inside
and Outside, Fabric RP
Control and Data Planes

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 145
Tenant Routed Multicast with Multi-Site
Source Inside, Receivers Inside and Outside, Fabric RP (Control Plane)
§ A receiver is connected to a leaf node in a site § A receiver is connected to the external network site
and sends an IGMP Join for group G and sends an IGMP Join for group G
§ (*,G) state is propagated within the site to all BLs § (*,G) state is propagated via PIM Joins toward the RP
via COOP on the BL of a site (based on routing)
Inter-Site
Network

COOP
RP RP (*,G) state
(*,G) state (*,G) state
BL11 BL21
IGMP Join
for G
L3Out-1 Advertise L3Out-2
Source S source’s IP Receiver
Site 1 PIM (*,G)
Subnet Site 2
Join

(*,G) state (*,G) state

PIM (*,G) IGMP Join


Join for G
Receiver
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 146
Tenant Routed Multicast with Multi-Site
Source Inside, Receivers Inside and Outside, Fabric RP (Control Plane, cont.)
§ The source connected to site 1 starts sending § The BL node in site 1 sends an Anycast-RP Register
traffic to group G message to all the remote sites (where the VRF is
§ (S,G) state is created on FHR, which also sends a extended)
PIM Register message to the local BL node (RP) § The remote BL nodes receive that message and install
Inter-Site (S,G) state
Network

Anycast-RP
PIM Data Register Message
Register*

(S,G) state RP (*,G) state RP (*,G) state


(*,G) state
(S,G) state
BL11 (S,G) state BL21
MC traffic
to G
L3Out-1 L3Out-2
Source S Receiver
Site 1 Site 2

* PIM data register packets are unicast packets (sent (*,G) state (*,G) state
from first-hop router to the RP), with PIM protocol
number (103) set in the IP header

Receiver
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 147
Tenant Routed Multicast with Multi-Site
Source Inside, Receivers Inside and Outside, Fabric RP (Data Plane)
§ The multicast stream is replicated inside the local § The spine receiving the stream in the remote site
site to all the leaf nodes where the VRF is deployed forwards it into the site
and by the DF spine to all the remote sites where § The stream gets replicated to all the local leaf nodes
the VRF is stretched where the VRF is deployed and reaches the receiver
§ The local BL forwards the stream via the L3Out to Inter-Site
the external receiver Network

Site 2 O-MTEP

RP (*,G) state RP (*,G) state


(*,G) state
(S,G) state
BL11 (S,G) state BL21
MC traffic
to G
L3Out-1 L3Out-2
Source S Receiver
Site 1 Site 2

(*,G) state

Receiver
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 148
Source Outside, Receivers
Inside, Fabric RP
Control and Data Planes

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 153
Tenant Routed Multicast with Multi-Site
Source Outside, Receivers Inside, Fabric RP (Control Plane)
§ Receivers are connected to leaf nodes in each site § (*,G) state is propagated within each site to all local
and send an IGMP Join for group G BLs via COOP
§ (*,G) state is created on the leaf nodes where the
receivers are connected (LHRs)
Inter-Site
Network

COOP COOP
(*,G) state RP (*,G) state RP (*,G) state
(*,G) state
BL11 BL21
IGMP Join
IGMP Join
for G
for G L3Out-1 L3Out-2
Receiver Receiver
Site 1 Site 2

Source
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 154
Tenant Routed Multicast with Multi-Site
Source Outside, Receivers Inside, Fabric RP (Control Plane, cont.)
§ The source connected to the external network § The BL node in site 1 installs (S,G) state and sends an
starts sending traffic to group G Anycast-RP Register message to all the remote sites
§ (S,G) state is created on FHR, which also sends a (where the VRF is extended)
PIM Register message toward the RP in one of § The BL node in all sites send PIM (S,G) Joins on the
the sites (based on routing) Inter-Site L3Outs toward the source
Network

Anycast-RP
Register Message

RP (*,G) state RP
(*,G) state
(S,G) state
BL11 (S,G) state BL21

L3Out-1 L3Out-2
Receiver PIM (S,G) Receiver
Site 1 PIM (S,G)
PIM Data
Join Site 2
Register*
Join
PIM (S,G)
* PIM data register packets are unicast packets (sent (S,G) state (S,G) state Join (S,G) state
from first-hop router to the RP), with PIM protocol
number (103) set in the IP header
PIM (S,G)
MC traffic
Join
to G
Source
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 155
Tenant Routed Multicast with Multi-Site
Source Outside, Receivers Inside, Fabric RP (Data Plane)
§ The multicast stream is replicated in the external § There is a filter applied on the spine nodes to ensure
network and sent toward the BL nodes of all the sites that the multicast stream is not forwarded to remote
that sent PIM (S,G) Joins sites via ISN
§ The stream is replicated inside each site to all the leaf § Mandates the existence of a local L3Out for
nodes where the VRF is deployed Inter-Site receiving traffic from an external source
Network

Multicast traffic from external


sources is dropped on the spines
and not sent over HREP tunnels

RP (*,G) state RP
(*,G) state
(S,G) state
BL11 (S,G) state BL21

L3Out-1 L3Out-2

`
Receiver Receiver
Site 1 Site 2

(S,G) state (S,G) state (S,G) state

MC traffic
to G
Source
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 156
Multicast Data Plane Traffic
Filtering
Multicast Traffic Filtering
General Deployment Considerations

• Prior to release 5.0(1), ACI only supported control plane filtering options for
multicast traffic
IGMP report filters, PIM Join prune filters, RP filters

• The multicast traffic filtering features in ACI 5.0(1) add support for filtering
multicast traffic in the data plane
Data plane filtering is controlled by user defined route-map configuration
Sources and destination (receivers) filtering is supported
Configuration applied on APIC for a single ACI fabric or on MSO for a Multi-Site
deployment

• Multicast traffic filtering current limitations:


Supported only for IPv4
Multicast filtering is done at the BD level and apply to all EPGs within the BD
Source-Specific Multicast (SSM) is not supported for source filtering and is supported only for
receiver filtering

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 170
Multicast Data Plane Traffic Filtering
Source Filtering

• Source route-map filters are applied to the BD where the multicast


source is connected
• Source multicast filters apply to routed multicast traffic
BD1
route-map block-G1
• All multicast traffic is routed (not switched) for PIM enabled BDs,
1 deny G1 including streams for receivers in the same BD as the source
2 permit all
As a result, multicast will also be filtered for these receivers

G1 G1 G1 G1

BD1 BD1 BD2 BD3

Source S1: 10.1.1.1 Receiver for G1 Receiver for G1 Receiver for G1


Sending to G1

Receivers in any BD will not receive multicast for G1

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 171
Multicast Data Plane Traffic Filtering
Source Filtering Use Cases

• Sources in BD1 are allowed to send streams to group G1


• Sources in BD2 are allowed to send streams to group G2
BD1 • A source connected to BD2 should never be allowed to send streams
to group G1
S1àG1 Source BD filter is applied to BD2 to only allow streams to group G2

BD2 BD2
route-map block-G1
1 deny G1
2 permit all
S2àG2
S3àG1

Filters reference route-maps. Route-maps can


match on combinations of groups and sources

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 172
Multicast Data Plane Traffic Filtering
Destination Filtering
• Destination filtering does not prevent receivers from joining a multicast group
• Destination route-map filter is applied in data plane on the BD where the receivers are
connected
• The receiving BD VLAN will not be added to OIF in hardware for the (S,G) mroute
• Other BDs can still receive the multicast stream
BD3
route-map block-G1
1 deny G1
2 permit all

G1 G1 G1

BD1 BD2 BD3

Source S1: 10.1.1.1 Receiver G1 Receiver G1


Sending to G1

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 173
Multicast Data Plane Traffic Filtering
Destination Filtering Use Cases

BD3
route-map allow-s1-g1
BD1 1 permit S1, G1
BD3
• Sources in BD1 and BD2 send multicast
S1àG1 G1àR1 to the same group G1
(from S1 only)
• Receivers in BD3 should only receive
multicast from sources in BD1
• Receivers in BD4 should only receive
BD2 BD4 multicast from sources in BD2
• Receiver filters can match both on
group and source to achieve this result
S2àG1 G1àR2
(from S2 only)
BD4
route-map allow-s2-g1
1 permit S2, G1

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 174
Network Services
Integration
Agenda

• Integration Models
• Use of Service Graph and PBR
• North-South Traffic Flows
• East-West Traffic Flows

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 176
Integration Models
ACI Multi-Site and Network Services Deployment options fully
supported with ACI Multi-Pod
Integration Models
ISN

• Active and Standby pair deployed across Pods


• Limited supported options

Active Standby

ISN
• Active/Active FW cluster nodes stretched across Sites
(single logical FW)
• Requires the ability of discovering the same MAC/IP info
in separate sites at the same time
Active/Active Cluster
• Not supported
Active Active

ISN • Recommended deployment model for ACI Multi-Site


• Option 1: supported for N-S traffic flows when the FW
is connected in L3 mode to the fabric
• Option 2: supported for N-S and E-W traffic flows with
the use of Service Graph with Policy Based Redirection
Active/Standby Active/Standby #CiscoLive (PBR) © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 178
Independent Service Node Instances across Sites
Use of Service Graph and Policy Based Redirection

§ The PBR policy applied on a leaf switch can only redirect traffic to a service node deployed
in the local site
Requires the deployment of independent service node function in each site
Various design options to increase resiliency for the service node function: per site Active/Standby pair, per site
Active/Active cluster, per site multiple independent Active nodes

§ HW dependencies:
Mandates the use of EX/FX or newer leaf nodes (both for compute and service leaf switches)

§ SW dependencies:
ACI release 3.2(1)
ØOnly a single node PBR policy with FW supported
ØConsumer side enforcement in 3.2 release (for E-W flows)
ACI release 4.0(1)
ØMultiple node PBR policy with FW + LB supported in 4.0 release
ØProvider side enforcement in 4.0 release

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 179
Use of Service Graph and Policy Based Redirection
Resilient Service Node Deployment in Each Site

Active/Standby Cluster Active/Active Cluster Independent Active Nodes


Site1 Site1 Site1

L3 Mode L3 Mode L3 Mode L3 Mode L3 Mode Active/Standby


Active/Standby Cluster Active/Active Cluster Active Node 1 Active Node 2 Node 3

• The Active/Standby pair represents a • The Active/Active cluster represents a • Each Active node represent a unique
single MAC/IP entry in the PBR policy single MAC/IP entry in the PBR policy MAC/IP entry in the PBR policy
• Spanned Ether-Channel Mode supported • Use of Symmetric PBR to ensure each
with Cisco ASA/FTD platforms flow is handled by the same Active node
All ASA/FTD nodes must be connected in both directions
to the same leaf nodes pair
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 180
Multi-Site and Network Services Integration
Single-Node Service Graph with PBR
• Single-Node FW with PBR (used for both traffic directions)
EPG
Contract EPG • FW connected in one-arm or two-arms mode
Client Server
PBR • ACI 3.2(1): PBR policy applied on the compute leaf (N-S
flows) or on the consumer leaf (E-W flows)
• ACI 4.0(1): PBR policy applied on the compute leaf (N-S
flows) or on the provider leaf (E-W flows)

EPG EPG EPG EPG


Client Contract Server Client Contract Server
PBR only for
Non-PBR return traffic

VIP SNAT VIP No SNAT

• LB connected in one-arm or two-arms mode • LB with PBR for return traffic supported from ACI
4.0(1) release
• No need for PBR at all, LB connected to BDs/EPGs
as a regular endpoint • LB connected in one-arm or two-arms mode
• LB could also be connected to an L3Out (may • PBR policy applied on the compute leaf (N-S flows)
require intersite L3Out support) or on the provider leaf (E-W flows)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 181
Multi-Site and Network Services Integration
Dual-Node Service-Graph with PBR (Main Use Cases)

EPG EPG EPG EPG


Client Contract Server Client Contract Web
PBR Non-PBR PBR only for
PBR
return traffic

VIP SNAT VIP No SNAT

• PBR policy only to redirect traffic to the FW, no • PBR policy to redirect traffic to the FW, second
need for PBR for the LB (LB connected to PBR policies for the LB (for return traffic)
BDs/EPGs as a regular endpoint)
• FW and LB in single-arm or dual-arms mode
• FW and LB in single-arm or dual-arms mode
• Supported from ACI release 4.0(1)
• Supported from ACI release 4.0(1)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 182
Use of Service Graph and
PBR
North-South and East-West
Use of Service Graph and Policy Based Redirection
North-South and East-West Use Cases

North-South • Best practice recommendations for both North-South and East-


West use cases:
VRF1
§ Service Node deployed in ‘one arm’ mode (‘two-arms’ mode also
L3 L3
supported but not preferred)
WAN L3Out
WAN Web-EPG § Service-BD must be stretched across sites (BUM flooding
WAN can/should be disabled)
Edge

Web-BD
§ Ext-EPG must also be a stretched object, mapped to the individual
Ext-BD Service-BD
L3Outs defined in each site
§ Web-BD and App-BD can be stretched across sites or locally
defined in each site
East-West • North-South use case
VRF1 § Intra-VRF only support as of ACI 5.0(1) release
L3 L3 L3
• East-West use case
Web-EPG App-EPG
§ Supported intra-VRF or inter-VRFs/Tenants
§ Requires to configure the subnet under the Consumer EPG from
Web-BD Service-BD App-BD ACI release 4.0(1) (under the Provider EPG in 3.2(x) releases)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 184
Use of Service Graph and Policy Based Redirection
North-South Communication – Inbound Traffic

Inter Site
Network

Site1 Site2
Compute leaf
always applies
the PBR policy Compute leaf
always applies
EPG EPG the PBR policy
Ext C Web

Consumer Provider
(Provider) (Consumer)

L3Out-Site1 L3Out-Site2
10.10.10.10 10.10.10.11
L3 Mode L3 Mode
Active/Standby Active/Standby

• Inbound traffic can enter any site when destined to a stretched subnet (if ingress optimisation is not deployed or
possible)
• PBR policy must always be applied on the compute leaf node where the destination endpoint is connected
§ Requires the VRF to have the default policies for enforcement preference and direction
§ Supported only intra-VRF in ACI release 3.2
§ Ext-EPG and Web EPG can indifferently be provider or consumer of the contract
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 185
Use of Service Graph and Policy Based Redirection
North-South Communication – Outbound Traffic

Inter Site
Network

Site1 Site2
Compute leaf
always applies
the PBR policy Compute leaf
always applies
EPG EPG the PBR policy
Ext C Web

Consumer Provider
(Provider) (Consumer)

L3Out-Site1 L3Out-Site2
L3Out-Site2
10.10.10.10 10.10.10.11
L3 Mode L3 Mode
Active/Standby Active/Standby

• PBR policy always applied on the same compute leaf where it was applied for inbound traffic
• Ensures the same service node is selected for both legs of the flow
• Different L3Outs can be used for inbound and outbound directions of the same flow

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 186
Use of Service Graph and Policy Based Redirection
East-West Communication (1)

Inter Site
Network

Site1 Site2

Provider leaf
always applies
the PBR policy
EPG EPG
Web C App
Provider Consumer

L3 Mode L3 Mode EPG


EPG Active/Standby
Active/Standby
Web App

§ EPGs can be locally defined or stretched across sites


§ EPGs can be part of the same VRF or in different VRFs (and/or Tenants)
§ PBR policy is always applied on the leaf switch where the Provider* endpoint is connected
• The Consumer leaf always redirects traffic to a local service node
• Mandates to configure the subnet under the Consumer EPG
*Before ACI 4.0(1) release, the PBR policy was applied on the Consumer side
#CiscoLive © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 187
Use of Service Graph and Policy Based Redirection
East-West Communication (2)

Inter Site
Network

Site1 Site2

Provider leaf
always applies
Consumer leaf
the PBR policy does not apply
EPG EPG
Web C App the PBR policy

Provider Consumer

L3 Mode L3 Mode EPG


EPG Active/Standby
Active/Standby
Web App

§ The Consumer leaf must not apply PBR policy to ensure proper traffic stitching to the FW
node that has built connection state
§ Ensures both legs of the flow are handled by the same service node

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 188
Multi-Pod and Multi-Site
Integration
Agenda

• Main Use-Cases
• Connectivity between Pods and Sites
• Inter-Site and Intra-Site EVPN Sessions
• Inter-Site Data Plane Communication
• Internal TEP Pools Filtering

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 190
ACI Multi-Pod and Multi-Site For More Information on how
to setup ACI Multi-Pod +
Main Use Cases Multi-Site from scratch:
DIGACI-2291

§ Adding a Single Pod Fabric as a ‘DR Site’ to the Multi-Site domain

§ Converting a single Pod Fabric (already added to MSO) to a Multi-Pod fabric

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 191
ACI Multi-Pod and Multi-Site
Connectivity between Pods and Sites
Single external network used for
IPN and ISN

IPN/ISN

1st Gen 1st Gen

APIC Cluster
Pod ‘A’ Pod ‘B’

Site 1 Site 2

§ Only 2nd generation spines must be connected to the external network


• Need to add 2nd gen spines in each Pod (at least two per Pod) and migrate connections to the IPN from 1st gen
spines to 2nd gen spines
§ Single ‘infra’ L3Out and set of uplinks to carry both Multi-Pod and Multi-Site East-West traffic

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 192
ACI Multi-Pod and Multi-Site
Connectivity between Pods and Sites

IP WAN
Separate networks
used for IPN and ISN
IPN

Site 2
1st Gen 1st Gen

APIC Cluster
Pod ‘A’ Pod ‘B’

Site 1 Site 2

§ Only 2nd generation spines must be connected to the external network


• Need to add 2nd gen spines in each Pod (at least two per Pod) and migrate connections to the IPN from 1st gen
spines to 2nd gen spines
§ Single ‘infra’ L3Out and set of uplinks to carry both Multi-Pod and Multi-Site East-West traffic

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 193
Connectivity between Pods and Sites
Not Supported Topology

Separate uplinks
between spines IP WAN
and external
networks

IPN

Site 2
1st Gen 1st Gen

APIC Cluster
Pod ‘A’ Pod ‘B’

Site 1 Site 2

§ Only 2nd generation spines must be connected to the external network


• Need to add 2nd gen spines in each Pod (at least two per Pod) and migrate connections to the IPN from 1st gen
spines to 2nd gen spines
§ Single ‘infra’ L3Out and set of uplinks to carry both Multi-Pod and Multi-Site East-West traffic

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 194
ACI Multi-Pod and Multi-Site
BGP Spine Roles

§ Spines in each Multi-Pod fabric can have one


of those two roles:
1. BGP Speakers: establish EVPN peerings with BGP
IPN/ISN speakers in remote sites and with BGP Forwarders
in the local site (intra- and inter- Pods)
BGP Forwarders
BGP Forwarders
BGP
BGP Forwarders
Forwarders
• Recommended to deploy two speakers per
BGP Speakers
BGP Speakers Multi-Pod fabric (in separate Pods)

1st Gen 1st Gen


• Explicitly configured on MSO
• Multi-Site Speaker must be a Multi-Pod spine
as well
2. BGP Forwarders: establish BGP EVPN peerings
APIC Cluster
with BGP speakers in the local site
Pod ‘B’
• All the spines that are not speakers implicitly
become forwarders
Site 1

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 195
ACI Multi-Pod and Multi-Site
Inter-Site and Intra-Site EVPN Sessions
= Inter-Site MP-BGP EVPN Peering (Speaker-to-Speaker)
= Intra-Site MP-BGP EVPN Peering (Speaker-to-Forwarders)

Site 2
IP
BGP BGP
Speaker Speaker

IP WAN

IPN

BGP Forwarder Forwarder Forwarder Forwarder BGP


Speaker Speaker

Site 1
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 196
ACI Multi-Pod and Multi-Site
Inter-Site L2/L3 Unicast Traffic

Site 2
IP
O-UTEP-S2

IP WAN
EP1
Site 2 Spine Table
EP1 Leaf 1

EP2 O-UTEP-S1P1 Site 1-Pod2 Spine Table


EP3 O-UEP-S1P3 EP4 Leaf 4

EP4 O-UTEP-S1P2
IPN EP1 O-UTEP-S2

EP2 O-UTEP-S1P1

EP3 O-UTEP-S1P3
Site 1-Pod3 Spine Table
Site 1-Pod1 Spine Table O-UTEP-S1P1 O-UTEP-S1P2 O-UTEP-S1P3 EP3 Leaf 4
EP2 Leaf 1
EP1 O-UTEP-S2
EP1 O-UTEP-S2
EP2 O-UTEP-S1P1
EP3 O-UTEP-S1P3
EP4 O-UTEP-S1P2
EP4 O-UTEP-S1P2

EP2 EP4 EP3

Site 1
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 197
ACI Multi-Pod and Multi-Site
Inter-Site L2 BUM Traffic
= BUM sent via Ingress Replication
= BUM sent via PIM-Bidir

Site 2
IP
O-MTEP-S2

IP WAN
EP1

Use PIM-Bidir for


Replication

BUM originated in the


IPN
local Pod, so use Ingress
Replication toward the Only forwarding
Remote Site the BUM frame
into the local Pod

DF DF DF

BUM
Frame
EP2 EP4 EP3

Site 1
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 198
ACI Multi-Pod and Multi-Site
Inter-Site L2 BUM Traffic
= BUM sent via Ingress Replication
Ingress Replicated = BUM sent via PIM-Bidir

Site 2
IP to O-MTEP address
of Site 1
DF

BUM
Frame IP WAN
EP1

Use PIM-Bidir for


EP1 generates Use PIM-Bidir for Replication
a L2 BUM frame L2 BUM frame Replication
cannot be re-injected
into the IPN IPN

O-MTEP-S1

EP2 EP4 EP3

Site 1
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 199
ACI Multi-Pod and Multi-Site
Internal TEP Pools Filtering
Filter out the
advertisement of
TEP Pool advertisement not
10.1.0.0/16
the TEP Pool into needed for inter-Sites
Site 2
IP the backbone
communication

IP WAN
Filter out the
All Pods TEP Pools
TEP Pool Site 2 advertisement of
the TEP Pools into advertised into the
10.1.0.0/16 IPN
the backbone

IPN
TEP Pool Pod1 10.1.0.0/16 10.3.0.0/16 TEP Pool Pod3
10.2.0.0/16
10.1.0.0/16 TEP Pool Pod2 10.3.0.0/16
10.2.0.0/16

Site 1
#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 200
Conclusion
Multi-Pod and Multi-Site
Complementary Architectures
Active Workloads Layer 3 Active Workloads
Layer 2 & Layer 3 Inter Region Layer 2 & Layer 3

ACI Multi-Site
Orchestrator

Multi-Pod Fabric ‘A’ (Region 1) Multi-Pod Fabric ‘B’ (Region 2)

Fabric NetworkFault Domain Fabric Network Fault Domain Fabric Network Fault Domain Fabric Network Fault Domain

Pod A.1 (AZ 1) Pod A.2 (AZ 2)


Application1 Pod B.1 (AZ 1) Pod B.2 (AZ 2)
Application 2
workloads deployed workloads deployed
Application Policy Change Domain Application Policy Change Domain
across Pods (AZs) across Pods (AZs)

Common Namespace (IP, DNS, Active Directory…)

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 202
Where to Go for More Information

ü ACI Multi-Pod White Paper


http://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-
infrastructure/white-paper-c11-737855.html?cachemode=refresh

ü ACI Multi-Pod Configuration Paper


https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-
infrastructure/white-paper-c11-739714.html

ü ACI Multi-Pod and Service Node Integration White Paper


https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-
infrastructure/white-paper-c11-739571.html

ü ACI Multi-Site White Paper


https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-
infrastructure/white-paper-c11-739609.html

ü ACI Multi-Site and Service Node Integration White Paper


https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-
infrastructure/white-paper-c11-743107.html

#CiscoLive DGTL-BRKACI-2125 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 203
Thank you

#CiscoLive
#CiscoLive

You might also like