
#CLUS

ACI Troubleshooting:
Endpoints

Andy Gossett, DCBU ACI Escalation


@agccie
BRKACI-2641

Cisco Webex Teams
Questions?
Use Cisco Webex Teams to chat
with the speaker after the session

How
1 Find this session in the Cisco Live Mobile App
2 Click “Join the Discussion”
3 Install Webex Teams or go directly to the team space
4 Enter messages/questions in the team space

Webex Teams will be moderated by the speaker until June 16, 2019.
cs.co/ciscolivebot#BRKACI-2641

#CLUS © 2019 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
8:00 a.m. 120min 8:00 a.m. 120min

BRKACI-1001 BRKACI-3545
9:30 a.m. 60min 9:30 a.m. 60min
BRKACI-2641 BRKACI-2642

1:00 p.m. 60min 11:00 a.m. 60min

BRKACI-2643 BRKACI-2644
2:30 p.m. 60min

BRKACI-2645

4:00 p.m. 90min 1:00 p.m. 120min

BRKACI-2934 BRKACI-2271

Agenda
• ACI Endpoint Learning
• Configuration Options
• Endpoint Learning Troubleshooting Tips

Acronyms/Definitions
Acronym        Definition
ACI            Application Centric Infrastructure
ACL            Access Control List
APIC/IFC       Application Policy Infrastructure Controller / Insieme Fabric Controller
BD             Bridge Domain
COOP           Council of Oracle Protocol
ECMP           Equal Cost Multipath
EP             Endpoint
EPG            Endpoint Group
EPM            Endpoint Manager
EPMC           Endpoint Manager Client (LC component)
FTEP/VTEP      Fabric/Virtual or VXLAN Tunnel Endpoint
LPM            Longest Prefix Match
MDT            Multicast Distribution Tree
pcTag          Policy Control Tag
PL             Physical Local
sclass         Source class (source pcTag)
SVI            Switch Virtual Interface
TC             Topology Change
VL             Virtual Local
VNID           Virtual Network Identifier
VXLAN/iVXLAN   Virtual Extensible LAN / Insieme VXLAN
XR             VXLAN Remote

Reference Slide
Endpoint Learning
What is an ACI Endpoint
Depends on who’s counting…

An endpoint is a MAC with one or more IPv4 (/32) or IPv6 (/128) addresses
  fvCEp    <epg-dn>/cep-00:00:00:00:0a
  fvIp     <epg-dn>/cep-00:00:00:00:0a/ip-[10.0.0.10]
  coop db  mac: 00:00:00:00:0a
           count: 1
           ip0 : 10.0.0.10

An endpoint is a MAC, IPv4 (/32), or IPv6 (/128) address
  Spine: two hardware entries
  Endpoint            Synthetic IP
  00:00:00:00:00:0a   28.186.73.78
  10.0.0.10           21.215.190.9
What is an ACI Endpoint
Why the count matters

Count as #Mac w/ one or more IPs, or as #Mac + #IP

450K max
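The two counting methods can be compared with a minimal sketch. The inventory below is made-up illustration data, and the function names are hypothetical; nothing here queries a real fabric.

```python
# Hypothetical endpoint inventory: MAC -> list of learned IPs (illustrative data only).
endpoints = {
    "00:00:00:00:00:0a": ["10.0.0.10"],
    "00:00:00:00:00:0b": ["10.0.0.11", "10.0.0.12"],
    "00:00:00:00:00:0c": [],  # L2-only endpoint, no IP learned
}

def count_as_mac_with_ips(eps):
    """One endpoint per MAC, regardless of how many IPs it owns."""
    return len(eps)

def count_as_mac_plus_ip(eps):
    """Every MAC and every IP is counted as its own entry."""
    return len(eps) + sum(len(ips) for ips in eps.values())

mac_view = count_as_mac_with_ips(endpoints)
entry_view = count_as_mac_plus_ip(endpoints)
print(mac_view, entry_view)
```

The same three hosts yield different totals depending on the view, which is why a scale limit has to be read against the counting method in use.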
Classical Learning
Encap + Interface => VLAN
VLAN => VRF

Frame fields (as drawn): L4/Payload | Proto | DIP | SIP | 802.1Q | SMAC | DMAC

L2 Forwarding for (VLAN, DMAC)
L2 Learning for (VLAN, SMAC) => (Interface)
L3 Forwarding for (VRF, DIP)

L2 Forwarding:
(VLAN, DMAC) Miss => Flood
(VLAN, DMAC) Gateway MAC => Route
(VLAN, DMAC) Hit => Destination Port
config on destination port + VLAN determines egress encap (tagged or untagged)

L3 Forwarding (Longest Prefix Match):
(VRF, DIP) Miss => Drop
(VRF, DIP) Hit => Adjacency
Might be Glean or packet rewrite (SMAC, DMAC, VLAN, etc…), may include destination port in
adjacency or require second L2 lookup on new DMAC
Classical Learning

LPM Routes
• Connected/direct routes manually configured
• Static/dynamic routing protocols to learn prefixes

Host Routes (IP Endpoints)
• Glean adjacency for connected routes to punt frame and generate ARP request
• ARP/ND used to create MAC to IP binding and install host route into routing table

Route           Adj
10.1.1.101/32   …
20.1.1.101/32   …
10.1.1.0/24     Glean
20.1.1.0/24     Glean

ARP Packet: DMAC | SMAC | Eth: 0x0806 | Hdr/Opcode | Sender MAC | Sender IP | Target MAC | Target IP

Hosts: 10.1.1.101/24, 20.1.1.101/24
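A minimal sketch of the longest-prefix-match behavior described for classical learning: a /32 host route wins over the connected /24, and a connected subnet without a host route falls back to the glean adjacency. The table contents and the adjacency strings are illustrative assumptions.

```python
import ipaddress

# Hypothetical routing table: a /32 host route learned via ARP/ND plus
# connected /24 routes that resolve to a glean adjacency.
ROUTES = {
    ipaddress.ip_network("10.1.1.0/24"): "glean",
    ipaddress.ip_network("20.1.1.0/24"): "glean",
    ipaddress.ip_network("10.1.1.101/32"): "adj:host-10.1.1.101",
}

def lpm_lookup(dip, routes=ROUTES):
    """Return the adjacency of the most specific matching prefix, or 'drop' on a miss."""
    addr = ipaddress.ip_address(dip)
    matches = [net for net in routes if addr in net]
    if not matches:
        return "drop"
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]

print(lpm_lookup("10.1.1.101"), lpm_lookup("20.1.1.101"), lpm_lookup("30.1.1.1"))
```

A linear scan keeps the sketch short; real forwarding hardware uses dedicated lookup structures.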
ACI Learning (Physical Local - PL)
EPGs and L3 Learning

Encap + Interface => EPG
EPG => BD
BD => VRF

Frame fields (as drawn): L4/Payload | Proto | DIP | SIP | 802.1Q | SMAC | DMAC

L2 Forwarding for (BD, DMAC)
L2 Learning for (BD, SMAC) => (EPG, Interface)
L3 Learning for (VRF, SIP) => (EPG, Interface)
L3 Forwarding for (VRF, DIP)

L2 Forwarding:
(BD, DMAC) Miss => (Flood/Proxy+Drop)
(BD, DMAC) Gateway MAC => Route
(BD, DMAC) Hit => Adjacency

L3 Forwarding (Longest Prefix Match):
(VRF, DIP) Miss => Drop
Proxy/Glean for BD subnets
(VRF, DIP) Hit => Adjacency

Adjacency contains dst EPG, encap information, dst VTEP or port, etc…
More in upcoming slides
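The physical-local learning chain (encap + interface to EPG, EPG to BD, BD to VRF, then a MAC learn keyed by BD and an IP learn keyed by VRF) can be sketched as below. The class, its attribute names, and the sample mappings are hypothetical and only model the lookups named on the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Binding:
    epg: str
    interface: str

class LeafTables:
    """Toy model of a leaf's local learning tables (not a real switch API)."""

    def __init__(self, encap_to_epg, epg_to_bd, bd_to_vrf):
        self.encap_to_epg = encap_to_epg  # (interface, encap) -> EPG
        self.epg_to_bd = epg_to_bd        # EPG -> BD
        self.bd_to_vrf = bd_to_vrf        # BD -> VRF
        self.mac_table = {}               # (BD, MAC) -> Binding
        self.ip_table = {}                # (VRF, IP) -> Binding

    def learn(self, interface, encap, smac, sip=None):
        epg = self.encap_to_epg[(interface, encap)]
        bd = self.epg_to_bd[epg]
        vrf = self.bd_to_vrf[bd]
        binding = Binding(epg, interface)
        self.mac_table[(bd, smac)] = binding      # L2 learn for every frame
        if sip is not None:                       # L3 learn: caller passes SIP for routed/ARP frames
            self.ip_table[(vrf, sip)] = binding
        return bd, vrf

leaf = LeafTables(
    encap_to_epg={("eth1/1", "vlan-101"): "e1"},
    epg_to_bd={"e1": "bd1"},
    bd_to_vrf={"bd1": "v1"},
)
leaf.learn("eth1/1", "vlan-101", "0000.0000.000a", "10.1.1.101")
```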
ACI Learning (ARP)
Optimize Forwarding (ARP Flooding disabled)

Encap + Interface => EPG
EPG => BD
BD => VRF

Frame fields (as drawn): Target IP | Target MAC | Sender IP | Sender MAC | Hdr/Opcode | ethtype ARP | 802.1Q | SMAC | DMAC

L2 Learning for (BD, SMAC) => (EPG, Interface)
L2 Learning for (BD, ARP SMAC) => (EPG, Interface)
L3 Learning for (VRF, ARP Sender IP) => (EPG, Interface)
L3 Forwarding for (VRF, ARP Target IP)

ARP L3 Forwarding
L3 forwarding based on ARP target IP field with miss sent to spine proxy
(VRF, ARP Target IP) Miss => Proxy
(VRF, ARP Target IP) Hit => Adjacency
ACI Learning (Virtual Local - VL)

Frame layout (as drawn): Inner Header | VXLAN | Outer Header
Inner Header: L4/Payload | Proto | DIP | SIP | ethtype | SMAC | DMAC
VXLAN: VNID | Rsvd
Outer Header: Proto UDP | DIP | SIP | 802.1Q | SMAC | DMAC
Outer header callouts: Fabric TEP and Host VTEP (IP addresses), Infra VLAN (802.1Q), Infra BD MAC and Host MAC (MAC addresses)

External VNID => EPG
EPG => BD
BD => VRF

L2 Forwarding for (BD, DMAC)
L2 Learning for (BD, SMAC) => (EPG, Tunnel)
L3 Learning for (VRF, SIP) => (EPG, Tunnel)
L3 Forwarding for (VRF, DIP)

Figure labels: VXLAN Tunnel, AVS/AVE/OVS
iVXLAN Header

Frame layout: OUTER (MAC Header | 802.1Q | IPv4 Header | UDP Header) | iVXLAN Header | INNER (MAC Header | IPv4 Header | UDP Header | PAYLOAD) | FCS

Bits 0-31:   Flags (including DL, E, SP, DP) | Source Group
Bits 32-63:  Virtual Network Identifier (VNID) | Reserved

Abbr.          Name                                Description
DL             Do not learn                        Informs remote leaf that it should not perform dataplane learning from this frame
E              Exception                           Set when frame has gone through proxy path
SP             Source-policy-applied               Policy has already been applied to this frame
DP             Destination-policy-applied          (DP and SP are always set together)
sclass/pcTag   Source group (policy-control tag)   16-bit policy control tag representing the EPG that sourced the frame
ACI Learning (Remote - XR)

Frame layout (as drawn): Inner Header | iVXLAN | Outer Header
Inner Header: L4/Payload | Proto | DIP | SIP | ethtype | SMAC | DMAC
iVXLAN: flags | EPG (pcTag) | VNID: BD or VRF VNID (based on routed or switched)
Outer Header: Proto UDP | DIP (Dst Leaf VTEP) | SIP (Src Leaf VTEP) | 802.1Q (Fabric QoS) | SMAC, DMAC (Internal MAC)

L2 Forwarding for (BD, DMAC)
L2 Learning for (BD, SMAC) => (EPG, Tunnel)
L3 Learning for (VRF, SIP) => (EPG, Tunnel)
L3 Forwarding for (VRF, DIP)
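A hedged sketch of remote (XR) learning on the receiving leaf. It assumes that a frame carrying a BD VNID produces a remote MAC learn and a frame carrying a VRF VNID produces a remote IP learn, and that the DL flag suppresses both. The `IvxlanFrame` type and its field names are hypothetical; no real header bit layout is modeled.

```python
from dataclasses import dataclass

@dataclass
class IvxlanFrame:
    src_vtep: str             # outer SIP (source leaf VTEP)
    vnid_kind: str            # "bd" for switched traffic, "vrf" for routed traffic
    vnid: int
    pctag: int                # source EPG (sclass)
    smac: str
    sip: str
    dont_learn: bool = False  # DL flag

def remote_learn(frame, xr_mac, xr_ip):
    """Install an XR entry pointing at the source VTEP unless DL is set."""
    if frame.dont_learn:
        return False
    binding = (frame.pctag, frame.src_vtep)
    if frame.vnid_kind == "bd":
        xr_mac[(frame.vnid, frame.smac)] = binding
    else:
        xr_ip[(frame.vnid, frame.sip)] = binding
    return True

xr_mac, xr_ip = {}, {}
routed = IvxlanFrame("10.0.0.3", "vrf", 2555909, 10932, "0000.0000.000a", "10.1.1.101")
learned = remote_learn(routed, xr_mac, xr_ip)
```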
ACI Learning
Learning Exceptions
• No IP EP learning if routing is disabled on the BD
• No IP EP learning on external BD’s (Layer-3 Outside interfaces)
• No IP EP learning on Infra VLAN VXLAN/Opflex traffic
• No IP learning of shared service prefixes outside of our VRF

LPM Routes (Same as Classical)
• Pervasive SVI Routes (BD Subnets)
• Static and dynamic routing protocols on L3Out

Figure labels: VXLAN Tunnel between host and fabric on Infra VLAN (AVS/AVE/OVS); Static/Dynamic Routing on L3Out (WAN/Internet)
ACI Learning

Frame                     Forwarding Operation   Learn
Non-IP/IP                 Bridged                MAC
ARP                       -                      MAC (sender-HW), IP (sender-IP)
IPv4 Unicast              Routed                 MAC, IP
IPv6 Unicast              Routed                 MAC, IP
IPv6 Neighbor Discovery                          MAC, IP

Leaf Endpoint Database
Remote IP Entries (VRF):           (VRF, IP)
Remote MAC Entries (BD):           (VRF, BD, MAC)
Local MAC and IP Entries (Encap):  (VRF, BD, VLAN/VXLAN, MAC)
                                   (VRF, BD, VLAN/VXLAN, IP)

Endpoint Entry
- EPG (pcTag)
- Interface/Tunnel
- Control flags

Relationship: one MAC Entry to multiple IP Entries
ACI Learning (COOP and EP Sync)

• Local learn on leaf from dataplane packet
• COOP citizen (leaf) update to oracle (spine) for local EP learn
• COOP sync between oracles (spines)
• Spines learn all endpoints through COOP
• Remote learn on leaf from dataplane packet
• EP sync between vPC peers for remote learns
• EP sync between vPC peers for local learns (both orphan and vPC ports)

Figure labels: vPC Domain 1, vPC Domain 2
ACI Learning: Review
• MAC learning for all frames
• IP learning for routed packets and ARP packets
• No IP learning on frames received on L3Out or Infra VLAN
• All local endpoint learns are published to COOP
• Spines have full knowledge of all fabric endpoints
• Proxy forwarding is available for any fabric endpoint, so a remote endpoint miss on a leaf
carries no forwarding penalty
Moves and Bounce

Initial State
Topology: leaf101, leaf102, leaf103, leaf104; Host A on leaf101/102 (vpc1), Host B on leaf104 (eth1/1)

Spines
Addr  Interface  Detail
A     tun1001    leaf101/102 vTEP
B     tun4       leaf104 TEP

Leaf101/102
Addr  Interface  Detail
A     vpc1       local vpc
B     tun4       XR -> leaf104

Leaf 103
Addr  Interface  Detail
-     -          -

Leaf 104
Addr  Interface  Detail
A     tun1001    XR -> leaf101/102 VIP
B     eth1/1     local learn
Moves and Bounce

1. Host A moves to leaf-103
2. Local learn from 1st packet: learn on leaf103, published to coop
3. Spines receive event and update
4. Bounce set on old leaf101/102
!  leaf104 still points to old tunnel

Spines
Addr  Interface  Detail
A     tun3       leaf103 TEP (was tun1001, leaf101/102 vTEP)
B     tun4       leaf104 TEP

Leaf101/102
Addr  Interface     Detail
A     tun3, bounce  XR -> leaf103 with bounce bit set (was vpc1, local vpc)
B     tun4          XR -> leaf104

Leaf 103
Addr  Interface  Detail
A     eth1/1     local learn

Leaf 104
Addr  Interface  Detail
A     tun1001    XR -> leaf101/102 VIP
B     eth1/1     local learn
Moves and Bounce

1. host B sends packet to host A
2. leaf101/102 bounce to leaf103
3. leaf103 learns host B to leaf104

Spines
Addr  Interface  Detail
A     tun3       leaf103 TEP
B     tun4       leaf104 TEP

Leaf101/102
Addr  Interface     Detail
A     tun3, bounce  XR -> leaf103 with bounce bit set
B     tun4          XR -> leaf104

Leaf 103
Addr  Interface  Detail
A     eth1/1     local learn
B     tun4       XR -> leaf104

Leaf 104
Addr  Interface  Detail
A     tun1001    XR -> leaf101/102 VIP
B     eth1/1     local learn
Moves and Bounce

4. host A sends packet to host B
5. leaf104 updates XR to leaf103

Spines
Addr  Interface  Detail
A     tun3       leaf103 TEP
B     tun4       leaf104 TEP

Leaf101/102
Addr  Interface     Detail
A     tun3, bounce  XR -> leaf103 with bounce bit set
B     tun4          XR -> leaf104

Leaf 103
Addr  Interface  Detail
A     eth1/1     local learn
B     tun4       XR -> leaf104

Leaf 104
Addr  Interface  Detail
A     tun3       XR -> leaf103 TEP (was tun1001, XR -> leaf101/102 VIP)
B     eth1/1     local learn
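The move-and-bounce sequence can be traced with a small simulation. The table shape (`kind`, `target`) and the function names are hypothetical; the entries mirror the state after Host A has moved to leaf103 while leaf104 still holds the old XR entry.

```python
# Hypothetical leaf tables after Host A moved from leaf101/102 to leaf103.
tables = {
    "leaf101/102": {"A": ("bounce", "leaf103")},
    "leaf103": {"A": ("local", "eth1/1")},
    "leaf104": {"A": ("xr", "leaf101/102"), "B": ("local", "eth1/1")},
}

def deliver(src_leaf, dst, tables, max_hops=4):
    """Follow XR/bounce entries until a local entry is found; return the leaf path."""
    path = [src_leaf]
    leaf = src_leaf
    for _ in range(max_hops):
        kind, target = tables[leaf][dst]
        if kind == "local":
            return path
        leaf = target
        path.append(leaf)
    raise RuntimeError("forwarding loop or missing entry")

def learn_remote(leaf, addr, src_leaf, tables):
    """Dataplane learn: point the XR entry for addr at the leaf that sourced the frame."""
    tables[leaf][addr] = ("xr", src_leaf)

before = deliver("leaf104", "A", tables)          # stale XR entry: traffic bounces via the old leaf
learn_remote("leaf104", "A", "leaf103", tables)   # Host A replies, leaf104 relearns
after = deliver("leaf104", "A", tables)
print(before, after)
```

The bounce entry keeps traffic flowing through the old leaf until return traffic from Host A refreshes the remote learn on leaf104.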
Aging

Addr  Time-left    Reset-count  Hit
A     15 second    224          Yes   (before timer reset)
A     900 second   225          No    (after timer reset)

• Hardware maintains a hit-bit for each entry, which is set whenever a frame is received from
the corresponding source address
• If no packet is seen within the timeout, the entry is aged and removed from hardware
• Otherwise, if the leaf has received a frame and the hit-bit is set, software resets the timer
and the hit-bit, and the entry is not aged out
• For local IP endpoints, at 75% of the endpoint timer, host tracking sends 3x ARP/ND to verify
that the endpoint is still present
• ARP/ND reply resets timer for both IP and MAC
• No regular ARP/ND required if traffic is regularly received!
• Support for silent hosts to verify IP is still present
• No response and endpoint will eventually age-out
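A simplified model of hit-bit aging, assuming software inspects the hit-bit on every tick and that the probe threshold is the point where 75% of the timer has elapsed. The class and function names are hypothetical, and the 900-second value is the local retention default quoted in this deck.

```python
from dataclasses import dataclass

@dataclass
class AgingEntry:
    timeout: int = 900      # local endpoint retention default (seconds)
    time_left: int = 900
    hit: bool = False
    reset_count: int = 0

def on_frame(entry):
    """Hardware sets the hit-bit when a frame arrives from the source address."""
    entry.hit = True

def tick(entry, seconds):
    """Advance time and return 'reset', 'aged', 'probe' or 'ok'."""
    entry.time_left -= seconds
    if entry.hit:
        # Traffic was seen: software resets both the timer and the hit-bit.
        entry.time_left = entry.timeout
        entry.hit = False
        entry.reset_count += 1
        return "reset"
    if entry.time_left <= 0:
        return "aged"
    if entry.time_left <= entry.timeout * 0.25:
        return "probe"  # 75% of the timer elapsed: host tracking would send ARP/ND
    return "ok"
```

For example, an entry that sees no traffic for 700 seconds reaches the probe stage, and a single frame before the next check restores the full timer.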
VPC Aging

vPC peer connected to hosts A and B via peer link
Addr  Hit  Flags
A     No   local, vpc-attached
B     No   peer-attached

vPC peer with orphan host B attached
Addr  Hit  Flags
A     No   local, vpc-attached
B     No   local

Host A: vpc host. Host B: Orphan host.

• For vpc, both leaves in the vpc domain have to age out the entry before it
is removed. This applies to remote and local entries
• For orphan ports, as soon as the local leaf ages it out it is deleted from
both switches.
VPC Aging

1. When vpc endpoint is aged, set local-aged flag and send update to peer
2. Peer-aged flag set indicating that peer has aged the entry. Will be deleted
once local leaf ages it out as well.

vPC peer receiving the update (step 2)
Addr  Hit  Flags
A     No   local, vpc-attached, peer-aged
B     No   peer-attached

vPC peer that aged the entry (step 1)
Addr  Hit  Flags
A     No   local, vpc-attached, local-aged
B     No   local

Host A: vpc host
VPC Aging

3. Endpoint is locally-aged, send update to peer. Since both local-aged and
peer-aged are set, delete entry
4. Receive peer-aged from peer. Since both local-aged and peer-aged are set,
delete entry

vPC peer in step 3
Addr  Hit  Flags
A     No   local, vpc-attached, peer-aged, local-aged
B     No   peer-attached

vPC peer in step 4
Addr  Hit  Flags
A     No   local, vpc-attached, local-aged, peer-aged
B     No   local

Host A: vpc host
VPC Aging

1. When orphan port is locally-aged, simply delete and send update to peer
2. Orphan port deleted as soon as peer ages it out

vPC peer receiving the update (step 2)
Addr  Hit  Flags
A     No   local, vpc-attached
B     No   peer-attached

vPC peer with the orphan port (step 1)
Addr  Hit  Flags
A     No   local, vpc-attached
B     No   local, local-aged

Host B: Orphan host
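The vPC aging rule reduces to a flag check: a vPC-attached entry is deleted only once both `local-aged` and `peer-aged` are present, while an orphan-port entry is deleted on the first aging event. The sketch below is a minimal model with hypothetical function names.

```python
def age_vpc_entry(flags, event):
    """Apply 'local-aged' or 'peer-aged' to a vPC endpoint's flag set.

    Returns (new_flags, deleted)."""
    new_flags = set(flags) | {event}
    deleted = {"local-aged", "peer-aged"} <= new_flags
    return new_flags, deleted

def age_orphan_entry(flags, event):
    """Orphan-port endpoints are removed as soon as the owning leaf ages them."""
    return set(flags) | {event}, True

flags, deleted_first = age_vpc_entry({"local", "vpc-attached"}, "local-aged")
flags, deleted_second = age_vpc_entry(flags, "peer-aged")
```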
Configuration Options
Nerd Knobs
Timers – Endpoint Retention Policy

Timer    Default   Applied at BD   Applied at VRF
Local    900 sec   Mac and IP      -
Bounce   630 sec   Mac             IP
Remote   300 sec   Mac             IP
Move     256/sec   -               -
Hold     300 sec   -               -

XR MACs are always learned at BD level
XR IP’s are always learned at VRF level

• If moves/sec exceed rate then learning is disabled on BD for the hold time
as a protection mechanism for software components (epm/epmc/coop)
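The move and hold timers interact as a simple rate limiter. This sketch assumes a one-second counting window aligned to whole seconds, which is a simplification; `MoveGuard` and its methods are hypothetical names, and the defaults are the Move and Hold values from the table.

```python
class MoveGuard:
    """Disable learning on a BD for `hold` seconds once moves/sec exceeds `rate`."""

    def __init__(self, rate=256, hold=300):
        self.rate = rate
        self.hold = hold
        self.window_start = 0
        self.moves = 0
        self.disabled_until = 0

    def learning_enabled(self, now):
        return now >= self.disabled_until

    def record_move(self, now):
        second = int(now)
        if second != self.window_start:
            # New one-second window: restart the move counter.
            self.window_start = second
            self.moves = 0
        self.moves += 1
        if self.moves > self.rate:
            self.disabled_until = now + self.hold

guard = MoveGuard(rate=3, hold=300)  # tiny rate so the example trips quickly
for _ in range(4):
    guard.record_move(10.0)
```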

Timers – Endpoint Retention Policy
Custom Aging Timers at BD level
Timers – Endpoint Retention Policy
Custom Aging Timers at VRF level
Issue #1
Switch independent NIC team and load/spreading (Misconfigured Host)

Host with IP: C and two NICs: eth2-1 (mac: A) and eth2-2 (mac: B)

1. ARP on eth2-1 with mac A, IP C
2. Source traffic for flow-X from B
3. Source traffic for flow-Y from A

• Each routed IP frame triggers a new IP learn within the fabric and endpoint
is rapidly moving between mac A and mac B
• Possibly no perceived impact on dataplane traffic, however high CPU on
leaf. If NIC is between two leaves, then may see coop process high on
spine as well.
Issue #1 (Available in 3.2(1))
Fix – Enable Rogue Endpoint Detection

System -> System Settings -> Endpoint Controls -> Rogue EP Control

• An endpoint is marked as Rogue if it moves over the multiplication factor within
the detection interval.
• Endpoint is programmed as static to prevent new local learns and DL bit is
set for all frames to prevent XR updates.
• Fault raised for endpoints detected as rogue.

Note, this is not a fix but allows operators an opportunity to protect their fabric and get
notified of misconfigured hosts
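The rogue rule (more moves than the multiplication factor inside the detection interval) can be sketched with a sliding window. The default values used below (60 seconds, factor 4) are illustrative assumptions rather than values quoted on the slide, and `RogueDetector` is a hypothetical name.

```python
from collections import deque

class RogueDetector:
    def __init__(self, detection_interval=60, multiplication_factor=4):
        # Both defaults are illustrative assumptions.
        self.interval = detection_interval
        self.factor = multiplication_factor
        self.moves = {}     # endpoint -> deque of move timestamps
        self.rogue = set()

    def record_move(self, endpoint, now):
        """Record a move; return True once the endpoint is considered rogue."""
        if endpoint in self.rogue:
            return True     # treated as static: further moves are ignored
        stamps = self.moves.setdefault(endpoint, deque())
        stamps.append(now)
        # Drop moves that fall outside the detection interval.
        while stamps and now - stamps[0] > self.interval:
            stamps.popleft()
        if len(stamps) > self.factor:
            self.rogue.add(endpoint)
        return endpoint in self.rogue

detector = RogueDetector()
fast = [detector.record_move("10.1.1.101", t) for t in range(5)]
```

An endpoint flapping between two MACs on every routed frame would cross this threshold almost immediately, while an occasional legitimate move never does.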
Issue #1
Fix – Enable Rogue Endpoint Detection

Example Fault
• Fault is raised under the node and can also be seen under System faults.
Issue #1
Fix – Enable Rogue Endpoint Detection

Check EPM flag on leaf


fab4-leaf101# show system internal epm endpoint ip 10.1.1.101

MAC : 0000.0000.000a ::: Num IPs : 1


IP# 0 : 10.1.1.101 ::: IP# 0 flags : rogue|
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Phy If : 0x16000002 ::: Tunnel If : 0
Interface : port-channel3
Flags : 0x80080c05 ::: sclass : 10932 ::: Ref count : 5
EP Create Timestamp : 12/31/1969 19:00:00.000000
EP Update Timestamp : 05/13/2019 19:58:26.310178
EP Flags : local|vPC|IP|MAC|sclass|rogue|
::::

Issue #1
What about EP Loop Protection? Not RECOMMENDED

• Action is potentially
disruptive to other stable
endpoints.
• BD Learn disable prevents
new learns on the entire
BD
• Port disable may impact a
critical port such as fabric-
interconnect or DCI link.
No mechanism to prioritize
a host port.

Issue #2
Old IP never times out after new IP is assigned to host

fab4-leaf101# show endpoint ip 10.1.1.101


Legend:
s - arp H - vtep V - vpc-attached p - peer-aged
R - peer-attached-rl B - bounce S - static M - span
D - bounce-to-proxy O - peer-attached a - local-aged L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
VLAN/ Encap MAC Address MAC Info/ Interface
Domain VLAN IP Address IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
3028 vlan-101 0000.0000.000a LV po3
ag:v1 vlan-101 169.254.8.62 LV po3
ag:v1 vlan-101 10.1.1.101 LV po3
Issue #2 (Available in 2.1(1))
Fix: Enable IP Aging Policy

System -> System Settings -> Endpoint Controls -> IP Aging

• For aging, an endpoint is a MAC with one or more IP addresses. If the MAC is
active then all IPs learned on the MAC will remain active.
• IP Aging policy performs aging on each IP individually
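The difference between endpoint-level aging and per-IP aging can be shown with one function. The 900-second timeout and the timestamps are assumptions chosen for illustration; `surviving_ips` is a hypothetical helper.

```python
def surviving_ips(mac_active, ip_last_seen, now, timeout=900, ip_aging=False):
    """Return the IPs that stay in the endpoint table.

    ip_last_seen maps each IP learned on the MAC to the last time it was seen."""
    if not ip_aging:
        # Endpoint-level aging: an active MAC keeps every IP alive.
        return sorted(ip_last_seen) if mac_active else []
    # IP aging policy: each IP is aged on its own.
    return sorted(ip for ip, seen in ip_last_seen.items() if now - seen < timeout)

# The old link-local address was last seen long ago; the new address is fresh.
seen = {"169.254.8.62": 0, "10.1.1.101": 5000}
without_policy = surviving_ips(True, seen, now=5100)
with_policy = surviving_ips(True, seen, now=5100, ip_aging=True)
```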
Issue #3
Misconfigured host/L4-L7 service triggers unexpected learn
Initial Working State
Topology: hosts A and B on the service leaf, host C on the border leaf, IP: X reachable via the L3Out

Border Leaf (BL)
Addr  Interface  Detail
A     tun1       XR -> Service Leaf
B     tun1       XR -> Service Leaf
C     eth1/1     local learn

Service Leaf (SL)
Addr  Interface  Detail
A     eth1/1     local learn
B     eth1/2     local learn
C     tun6       XR -> Border Leaf

IP X represents a prefix that is learned on the L3Out.
During stable state, the service leaf would have an
LPM route pointing to the border leaf for this prefix
Issue #3
Misconfigured host/L4-L7 service triggers unexpected learn
1. Host-A sends pkt with source-IP X (dmac, smac, SIP-X, DIP-C)
2. Triggers a learn on service leaf
3. Triggers a learn on border leaf

Border Leaf (BL)
Addr  Interface  Detail
A     tun1       XR -> Service Leaf
B     tun1       XR -> Service Leaf
C     eth1/1     local learn
X     tun1       XR -> Service Leaf

Service Leaf (SL)
Addr  Interface  Detail
A     eth1/1     local learn
B     eth1/2     local learn
C     tun6       XR -> Border Leaf
X     eth1/1     local learn
Issue #3
Misconfigured host/L4-L7 service triggers unexpected learn
1. Host-C sends pkt with destination-IP X (dmac, smac, SIP-C, DIP-X)
2. BL has learned IP X toward SL
3. Packet incorrectly sent to SL instead of L3Out

Border Leaf (BL)
Addr  Interface  Detail
A     tun1       XR -> Service Leaf
B     tun1       XR -> Service Leaf
C     eth1/1     local learn
X     tun1       XR -> Service Leaf

Service Leaf (SL)
Addr  Interface  Detail
A     eth1/1     local learn
B     eth1/2     local learn
C     tun6       XR -> Border Leaf
X     eth1/1     local learn

Same problem if Host-B tries to send packet to IP X.
All connectivity to this IP is broken
Issue #3 (Available in 1.1(1))
Fix: Limit IP Learning to Subnet
Tenant -> Networking -> Bridge Domain
• Default setting for new BDs created in 2.3(1e) and 3.0(1k) and above.
Issue #3
Fix: Limit IP Learning to Subnet (Partial Fix)
1. Local off-subnet learn is ignored (packet: dmac, smac, SIP-X, DIP-C)
2. Packet is still forwarded to BL
3. Triggers a learn on border leaf

Border Leaf (BL)
Addr  Interface  Detail
A     tun1       XR -> Service Leaf
B     tun1       XR -> Service Leaf
C     eth1/1     local learn
X     tun1       XR -> Service Leaf

Service Leaf (SL)
Addr  Interface  Detail
A     eth1/1     local learn
B     eth1/2     local learn
C     tun6       XR -> Border Leaf

Limit IP learning to subnet prevents off-subnet learn
on local leaf but border leaf cannot apply off-subnet
logic on XR frame since BD information is not present
in packet, only VRF VNID in iVXLAN header
Issue #3 (Available in 2.2(2) and 3.0(2))
Fix: Enforce Subnet Check
System -> System Settings -> Fabric Wide Settings
Issue #3 (Available in 2.2(2) and 3.0(2))
Fix: Enforce Subnet Check

1. Local off-subnet learn is ignored (packet: dmac, smac, SIP-X, DIP-C)
2. XR off-subnet for all BDs in VRF is ignored

• This feature is available only for Gen2 switches and above
• This implicitly enables the local subnet check whether or not it is enabled on the BD
(i.e., Limit IP Learning to Subnet on the BD is no longer required).
• For remote learns, the IP is only learned if the IP belongs to at least one
BD in the VRF.
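The two checks can be sketched with the standard `ipaddress` module: a local learn is accepted only if the IP falls in a subnet of the receiving BD, and a remote learn only if it falls in a subnet of at least one BD in the VRF. The BD names and subnets are illustrative, and the function names are hypothetical.

```python
import ipaddress

def allow_local_learn(ip, bd_subnets):
    """Local learn: the IP must belong to one of this BD's subnets."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(s, strict=False) for s in bd_subnets)

def allow_remote_learn(ip, vrf_bd_subnets):
    """Remote learn: the IP must belong to a subnet of at least one BD in the VRF.

    vrf_bd_subnets maps each BD in the VRF to its list of subnets."""
    return any(allow_local_learn(ip, subnets) for subnets in vrf_bd_subnets.values())

vrf = {"bd1": ["10.1.1.1/24"], "bd2": ["10.1.2.1/24"]}
print(allow_local_learn("10.1.2.5", vrf["bd1"]), allow_remote_learn("10.1.2.5", vrf))
```

The remote check is necessarily coarser than the local one because the receiving leaf only sees the VRF VNID, not the source BD.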
Issue #4
Stale Endpoint on Border Leaf

Traffic from L3out destined to Host-A is bounced through leaf101

Leaf101
Addr  Interface     Detail
A     tun3, bounce  XR -> leaf103 with bounce bit set

Leaf 103
Addr  Interface  Detail
A     eth1/1     local learn

Border Leaf
Addr  Interface  Detail
A     tun1       XR -> leaf101 TEP

• In initial state, Host-A has triggered an XR learn on the border leaf. Let’s
assume in this example that Host-A was communicating with Host-B.
• Host-A then moves to leaf103. It no longer sends any frames to Host-B but
continues sending frames out the L3out toward the border leaf.
• Leaf101 maintains a bounce-entry for Host-A
Issue #4
Stale Endpoint on Border Leaf

Eventually bounce entry times out

Leaf101
Addr  Interface  Detail
A     -          Bounce entry timed out (was tun3, bounce: XR -> leaf103 with bounce bit set)

Leaf 103
Addr  Interface  Detail
A     eth1/1     local learn

Border Leaf
Addr  Interface  Detail              Hit
A     tun1       XR -> leaf101 TEP   Yes (HIT bit set, but move ignored due to DL bit)

• Leaf103 is a Gen1 leaf and the VRF is in ingress enforcement. Due to a hardware
restriction on Gen1, traffic sent to the L3Out has the DL (don’t-learn) bit set in the
iVXLAN header.
• When the border leaf receives the frame, it updates the aging hit bit but does not update
the learn entry since the DL bit is set.
• Eventually, the bounce entry on leaf101 will time out but the border leaf will still have an XR
entry pointing to leaf-101. Any traffic destined to host-A will be dropped
Issue #4
Stale Endpoint on Border Leaf

Traffic from L3out toward Host-A is sent to leaf-101
Leaf-101 drops the packet

Leaf101
Addr  Interface  Detail
A     -          Bounce entry timed-out

Leaf 103
Addr  Interface  Detail
A     eth1/1     local learn

Border Leaf
Addr  Interface  Detail              Hit
A     tun1       XR -> leaf101 TEP   Yes

Entry on BL is now stale. It points to leaf-101 which is not
where Host-A exists
Issue #4 (Available in 2.2(2) and 3.0(1))
Fix: Disable Remote Endpoint Learning on Border Leaf
System -> System Settings -> Fabric Wide Settings

• No XR IP learning on Border Leaf
• L3Out deployed with VRF in ingress policy enforcement mode
• Prevents stale endpoint caused by Gen1 sending traffic to L3Out with DL bit set
• Note, routed multicast will still trigger an XR IP learn on Border Leaf with Gen2 switches
Stale Endpoint Software Fix
Feature: EP Announce on Bounce Delete
Leaf101
Addr  Interface     Detail
A     tun3, bounce  XR -> leaf103 with bounce bit set

Border Leaf
Addr  Interface  Detail              Hit
A     tun1       XR -> leaf101 TEP   Yes

• Let’s consider the same scenario as Issue #4. Host-A moved from leaf101 to
leaf103, a bounce entry for Host-A is present on leaf101, and some flow is resetting the
XR hit-bit on the border leaf toward leaf101
Stale Endpoint Software Fix
Feature: EP Announce on Bounce Delete
Bounce timer expires, Send EP Announce Delete
Triggers XR delete on any leaf still pointing to leaf101

Leaf101
Addr  Interface  Detail
A     -          Bounce entry timed-out (was tun3, XR -> leaf103)

Border Leaf
Addr  Interface  Detail
A     -          Deleted by announce (was tun1, XR -> leaf101 TEP)

• Enabled by default in 3.2.2 and above, no configuration required

• Supports Gen1 and Gen2

• Prevents stale endpoint issues
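
The announce-on-bounce-delete behavior can be sketched as a table sweep: when the bounce entry on the old leaf expires, every leaf whose XR entry still points at that old leaf drops it. The table shape and `expire_bounce` are hypothetical, and the real announce mechanism is not modeled.

```python
def expire_bounce(old_leaf, addr, fabric):
    """Delete the bounce entry on old_leaf and purge XR entries that still point to it.

    fabric maps leaf name -> {addr: (kind, target)}. Returns the purged leaves."""
    fabric[old_leaf].pop(addr, None)
    purged = []
    for leaf, table in fabric.items():
        if table.get(addr) == ("xr", old_leaf):
            del table[addr]
            purged.append(leaf)
    return purged

fabric = {
    "leaf101": {"A": ("bounce", "leaf103")},
    "leaf103": {"A": ("local", "eth1/1")},
    "border": {"A": ("xr", "leaf101")},
}
purged = expire_bounce("leaf101", "A", fabric)
```

After the purge the border leaf has no entry for Host-A, so the next packet takes the proxy path instead of being sent to the stale leaf.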

Issue #5
I have no control over the devices connected to the network…
• Some environments must support VM with multiple NICs that perform their own routing OR
allow users to spin up their own virtual routers, load-balancers, or virtual firewalls
• There are supported design recommendations to address each scenario, however it is too
difficult or not possible to address each in the current network
• Can we just do traditional IP learning?

Figure labels: Users routing through their own firewalls; Servers IP load-sharing; Dynamic load-balancers; Virtual routers
Issue #5 (Available in 4.0(1))
Fix: Disable IP Dataplane Learning on the VRF
Tenant -> Networking -> VRFs -> IP Dataplane learning

• Local MAC learning still occurs via dataplane
• Remote MAC learning still occurs via dataplane for Gen2
• BD L2 hardware proxy is required to support Gen1 since remote MAC
learning will not occur
• Local IPs are only learned via ARP/ND control plane
• Remote IPs are not learned from unicast
• Remote IPs are still learned from routed multicast packets
Issue #5
What about Disable IP Dataplane Learning on the BD?
Tenant -> Networking -> Bridge Domains
• Disabling IP Dataplane learning on the BD is only tested/supported
for service graph BDs with PBR
• In 3.1 and above with Gen2, this feature is auto-enabled on the
PBR node EPG, so disabling on BD is not required with PBR

Not recommended to disable
Endpoint Control Best Practices
• Run 3.2 or above to take advantage of EP Announce Delete
• Per BD, enable Limit IP Learning to Subnet
• Enable Global IP Aging
• Enable Global Enforce Subnet Check (not applicable for Gen1)
• If Gen1 leaf present, enable Disable Remote EP Learn on Border Leaf

Endpoint Learning
Troubleshooting
Tips
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
• Start with the basics:
☐ Verify EPG/BD/VRF basic config
☐ What leaf/port is the host connected?
☐ Is the vlan-encap deployed to the leaf?
☐ Is the port a member of the vlan?
☐ Is the SVI present with gateway config?
☐ Is the endpoint learned?

Host A: 10.1.1.101, 0000.0000.000A, EPG: e1, BD: bd1, VRF: v1

If we were learning the endpoint in the fabric, we could quickly tell which leaf/port
it was connected and, most likely, it would be able to ping its gateway…
Packet Walk Checklist
Problem: Host-A cannot ping the gateway
☐ Is the endpoint learned?

Skip to the last step first, since it can validate all other steps

Check EP Tracker in APIC UI

Check for endpoint on APIC CLI

fab4-apic1# show endpoint ip 10.1.1.101

Legends:
(P):Primary VLAN
(S):Secondary VLAN

Total Dynamic Endpoints: 0
Total Static Endpoints: 0
Packet Walk Checklist
Problem: Host-A cannot ping the gateway

Validate static path attachment and encap. In this example, vpc on node-101/102
and VLAN encap 101 (VRF: v1)
Packet Walk Checklist
Problem: Host-A cannot ping the gateway

Ensure the BD is associated to the EPG

Also (not shown), ensure the BD is associated to the VRF

Network faults may require you to verify your access policy configuration
(AEP, phy domain, vlan pool, switch/interface selectors)

Ensure there are no faults for the EPG that might have stopped
deployment to your leaf.
Packet Walk Checklist
 Is the vlan-encap deployed?
 Is the port a member of the vlan?

Problem: Host-A cannot ping the gateway

Port-channel ag_po1001 with id Po3 and member interface Eth1/3:

fab4-leaf101# show port-channel extended | egrep ag_po1001
3 Po3(SU) ag_po1001 LACP Eth1/3(P)

Get the PI vlan for the encap (FD) and the BD vlans:

fab4-leaf101# vsh_lc -c 'show system internal eltmc info vlan access_encap_vlan 101' | egrep "vlan_id"
vlan_id: 3028 ::: hw_vlan_id: 3009
vlan_id: 3028 ::: isEpg: 1
bd_vlan_id: 3027 ::: hwEpgId: 12766

Verify my interface is forwarding for both EPG and BD vlans:

fab4-leaf101# show vlan id 3028 extended
VLAN Name                             Encap            Ports
---- -------------------------------- ---------------- ------------------------
3028 ag:app:e1                        vlan-101         Eth1/3, Eth1/4, Eth1/6,
                                                       Po3, Po4

fab4-leaf101# show vlan id 3027 extended
VLAN Name                             Encap            Ports
---- -------------------------------- ---------------- ------------------------
3027 ag:bd1                           vxlan-15958069   Eth1/3, Eth1/4, Eth1/6,
                                                       Po3, Po4
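The port membership check can also be scripted against captured CLI text. The following is a minimal sketch, assuming the `show vlan id <n> extended` output has been saved as a string in the layout shown on the slide; the helper name `vlan_ports` is hypothetical and not part of any Cisco tooling.

```python
import re

# Output copied from the slide (EPG vlan 3028); the Ports column wraps onto a second line.
SAMPLE = """\
VLAN Name                             Encap            Ports
---- -------------------------------- ---------------- ------------------------
3028 ag:app:e1                        vlan-101         Eth1/3, Eth1/4, Eth1/6,
                                                       Po3, Po4
"""

def vlan_ports(output: str) -> set:
    """Collect interface names (EthX/Y, PoN) from every line, including wrapped ones."""
    ports = set()
    for line in output.splitlines():
        ports.update(re.findall(r"\b(?:Eth\d+/\d+|Po\d+)\b", line))
    return ports

# Po3 is the port-channel id found with 'show port-channel extended'.
print("Po3" in vlan_ports(SAMPLE))
```

The same function can be run on the BD vlan (3027) output to confirm the interface is a member of both vlans.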

Packet Walk Checklist
 Is the SVI present with gateway config?
 Is the endpoint learned?

Problem: Host-A cannot ping the gateway

Remember, vlan-3027 is the vlan for bd1:

fab4-leaf101# show ip interface vlan 3027
IP Interface Status for VRF "ag:v1"
vlan3027, Interface status: protocol-up/link-up/admin-up, iod: 1028, mode: pervasive
IP address: 10.1.1.1, IP subnet: 10.1.1.0/24
IP broadcast address: 255.255.255.255
IP primary address route-preference: 1, tag: 0

Queries EPM state directly (fast):

fab4-leaf101# show system internal epm endpoint ip 10.1.1.101
<none>

Same command used on APIC, queries epm MIT state:

fab4-leaf101# show endpoint ip 10.1.1.101
Legend:
s - arp H - vtep V - vpc-attached p - peer-aged
R - peer-attached-rl B - bounce S - static M - span
D - bounce-to-proxy O - peer-attached a - local-aged L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
VLAN/ Encap MAC Address MAC Info/ Interface
Domain VLAN IP Address IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
<none>

Packet Walk Checklist
 Is the endpoint learned?
 Is the correct subnet pushed?
 Is learning enabled?

Problem: Host-A cannot ping the gateway

fab4-leaf101# show system internal epm vlan 3027 detail | egrep "Learn|fwd_mode|BD Subnet"
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
fwd_mode : route,bridge ::: fwd_ctrl : mdst-flood,ip-lrn-pfx-check,
BD Subnet ip_pfx-1 : 10.1.1.1/24

fab4-leaf101# vsh_lc -c 'show system internal epmc vlan 3027 detail' | egrep "Learn|fwd_mode|BD Subnet"
fwd_mode : route,bridge ::: fwd_ctrl : mdst-flood,ip-lrn-pfx-check, ::: bridge_mode: mac ::: unk_mac_ucast: proxy
Learning disabled :no
BD Subnet ip_pfx-1 : 10.1.1.1/24

Both epm (sup component) and epmc (LC component) have routing enabled on the BD and learning is enabled. Also BD subnet list contains our prefix.

Gen2 only, ensure that learning is globally enabled in Hal:

fab4-leaf101# vsh_lc -c 'show system internal epmc global-info' | egrep "Hal Learn"
Hal Learn Disabled : No

fab4-leaf101# vsh_lc -c 'show platform internal hal learn learn' | egrep status
status : Enabled
status_reason : None
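When these checks have to be repeated across many BDs or leaves, the `key : value ::: key : value` layout of the epm output is easy to turn into a dictionary. This is a rough sketch that assumes the text has already been collected from the leaf; `parse_epm_fields` is a hypothetical helper name, and the field names are only those visible in the output above.

```python
def parse_epm_fields(output: str) -> dict:
    """Split 'key : value ::: key : value' lines into a flat dict."""
    fields = {}
    for line in output.splitlines():
        for chunk in line.split(":::"):
            key, sep, value = chunk.partition(":")
            if sep:  # skip fragments that carry no 'key : value' pair
                fields[key.strip()] = value.strip()
    return fields

# Output copied from 'show system internal epm vlan 3027 detail' on the slide.
SAMPLE = """\
Valid : Yes ::: Incomplete : No ::: Learn Enable : Yes
fwd_mode : route,bridge ::: fwd_ctrl : mdst-flood,ip-lrn-pfx-check,
BD Subnet ip_pfx-1 : 10.1.1.1/24
"""

fields = parse_epm_fields(SAMPLE)
learning_ok = fields.get("Learn Enable") == "Yes"
routing_ok = "route" in fields.get("fwd_mode", "").split(",")
print(learning_ok, routing_ok, fields.get("BD Subnet ip_pfx-1"))
```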

Packet Walk Checklist
 Is the endpoint learned?
 Is the correct subnet pushed?
 Is learning enabled?

Problem: Host-A cannot ping the gateway

• Under what conditions do we expect learning to be disabled?

Remember, if moves per second exceed BD configured policy, learning will temporarily be disabled!

Endpoint Retention Policy

Timer   Default  Applied at BD  Applied at VRF
Local   900 sec  Mac and IP     -
Bounce  630 sec  Mac            IP
Remote  300 sec  Mac            IP
Move    256/sec  -              -
Hold    300 sec  -              -
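To make the Move and Hold interaction concrete, here is a small model of it. It is only an illustration under stated assumptions: moves are counted per whole second, and exceeding the move frequency disables learning for the hold interval starting at the end of that second. The actual accounting inside the leaf may differ, and `learning_disabled_windows` is a hypothetical name.

```python
from collections import Counter

MOVE_FREQ = 256  # moves per second (default from the table)
HOLD_SEC = 300   # hold interval in seconds (default from the table)

def learning_disabled_windows(move_times, move_freq=MOVE_FREQ, hold=HOLD_SEC):
    """Return (start, end) windows, in seconds, during which learning would be held off."""
    per_second = Counter(int(t) for t in move_times)
    windows = []
    for second in sorted(per_second):
        if per_second[second] > move_freq:
            if windows and second < windows[-1][1]:
                continue  # already inside an earlier hold window
            windows.append((second + 1, second + 1 + hold))
    return windows

# 300 moves inside second 10 exceed 256/sec; 100 moves inside second 20 do not.
moves = [10 + i / 300 for i in range(300)] + [20 + i / 100 for i in range(100)]
print(learning_disabled_windows(moves))
```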

Packet Walk Checklist
 Is the endpoint learned?
 Are we receiving the frame?

Problem: Host-A cannot ping the gateway

What tools do we have to help? SPAN, ELAM (ELAM-Assistant App)

fab4-leaf101# show endpoint mac 0000.0000.000a
Legend:
s - arp H - vtep V - vpc-attached p - peer-aged
R - peer-attached-rl B - bounce S - static M - span
D - bounce-to-proxy O - peer-attached a - local-aged L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
VLAN/ Encap MAC Address MAC Info/ Interface
Domain VLAN IP Address IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
291/ag:v1 vlan-102 0000.0000.000a LV po3

We did learn the MAC, but in the wrong vlan. Misconfigured host

• We got lucky that the vlan-encap the host was sending in was configured on the leaf, else the frame would have been dropped and no MAC learn triggered
• Why wasn’t the IP learned? Limit IP Learning to Subnet enabled by default, vlan-102 in a different BD or unicast routing disabled on that BD
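The reasoning behind the missing IP learn can be written down as a simple predicate. This is a simplified sketch, not the leaf's actual logic: it only models the two conditions named above (unicast routing on the BD, and Limit IP Learning to Subnet). The function name and the second BD subnet (`10.1.2.1/24`) are assumptions for illustration.

```python
import ipaddress

def ip_learn_expected(ip, bd_subnets, unicast_routing=True, limit_to_subnet=True):
    """Return True if a leaf would be expected to learn `ip` on a BD (simplified)."""
    if not unicast_routing:
        return False  # with routing disabled only the MAC is learned
    if not limit_to_subnet:
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(s, strict=False) for s in bd_subnets)

# Host-A in the intended BD (bd1, subnet 10.1.1.1/24 from the epm output).
print(ip_learn_expected("10.1.1.101", ["10.1.1.1/24"]))
# Host-A tagged into a BD with a different subnet (hypothetical value).
print(ip_learn_expected("10.1.1.101", ["10.1.2.1/24"]))
```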
Packet Walk Checklist  Is the endpoint learned?

Fixed: Host-A can ping the gateway


Fixed the host config and now
we’re learning the IP!
fab4-leaf101# show endpoint ip 10.1.1.101
Legend:
s - arp H - vtep V - vpc-attached p - peer-aged
R - peer-attached-rl B - bounce S - static M - span
D - bounce-to-proxy O - peer-attached a - local-aged L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
VLAN/ Encap MAC Address MAC Info/ Interface
Domain VLAN IP Address IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
3028 vlan-101 0000.0000.000a LV po3
ag:v1 vlan-101 10.1.1.101 LV po3

fab4-leaf101# show system internal epm endpoint ip 10.1.1.101

MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags :
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Phy If : 0x16000002 ::: Tunnel If : 0
Interface : port-channel3
Flags : 0x80000c05 ::: sclass : 10932 ::: Ref count : 5
EP Create Timestamp : 05/17/2019 02:14:09.965041
EP Update Timestamp : 05/17/2019 02:14:09.965041
EP Flags : local|vPC|IP|MAC|sclass|
::::

Remember that epm/epmc treat an endpoint as a MAC with one or more IPs, so MAC is also displayed for local IP endpoints
Packet Walk Checklist  Does coop have the endpoint?

Fixed: Host-A can ping the gateway


Bonus validation
Verify endpoint in coop using VRF vnid and IP address:

fab4-spine201# show coop internal info ip-db key 2555909 10.1.1.101

IP address : 10.1.1.101
Vrf : 2555909
Flags : 0
EP bd vnid : 15958069
EP mac : 00:00:00:00:00:0A
Publisher Id : 10.0.128.93
Record timestamp : 06 09 2019 13:32:53 827717825
Publish timestamp : 06 09 2019 13:32:53 828777370
Seq No: 0
Remote publish timestamp: 12 31 1969 19:00:00 0
URIB Tunnel Info
Num tunnels : 1
Tunnel address : 10.0.128.95
Tunnel ref count : 1
::::

EP bd vnid / EP mac: Mac and BD VNID
Tunnel address: pTEP/vTEP/eTEP of leaf/pod/site

• Endpoint must be in coop in order for proxy lookups to work. This is critical
for XR miss for both intra/inter-pod and intra/inter-site. You should see the
same state on all spines.
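Checking that every spine holds the same state is tedious by eye. One way to sketch the comparison is to parse the `key : value` lines of the ip-db output per spine and report the fields that differ. The helper names are hypothetical, and the `spine202` text is a made-up mismatch used only to show the idea; the slide shows output from spine201 only.

```python
def parse_coop(output: str) -> dict:
    """Parse 'key : value' lines from coop ip-db output into a dict."""
    record = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            record[key.strip()] = value.strip()
    return record

def inconsistent_fields(per_spine: dict, keys=("EP bd vnid", "EP mac", "Tunnel address")):
    """Return the keys whose values differ between spines."""
    records = [parse_coop(text) for text in per_spine.values()]
    return [k for k in keys if len({r.get(k) for r in records}) > 1]

spine201 = "EP bd vnid : 15958069\nEP mac : 00:00:00:00:00:0A\nTunnel address : 10.0.128.95\n"
# Hypothetical second spine pointing at a different tunnel address.
spine202 = "EP bd vnid : 15958069\nEP mac : 00:00:00:00:00:0A\nTunnel address : 10.0.128.93\n"

print(inconsistent_fields({"spine201": spine201, "spine202": spine201}))
print(inconsistent_fields({"spine201": spine201, "spine202": spine202}))
```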
Packet Walk Checklist  Does coop have the endpoint?

Fixed: Host-A can ping the gateway


Bonus validation

Verify endpoint is in coop using BD VNID and mac address:

fab4-spine201# show coop internal info repo ep key 15958069 00:00:00:00:00:0A | egrep "^Vrf|^Tunnel nh|^EP|num of active|^Real"
EP bd vnid : 15958069
EP mac : 00:00:00:00:00:0A
Vrf vnid : 2555909
Tunnel nh : 10.0.128.95
num of active ipv4 addresses : 4
num of active ipv6 addresses : 1
Real IPv4 EP : 10.1.1.101
Real IPv4 EP : 10.1.1.102
Real IPv4 EP : 10.1.1.103
Real IPv4 EP : 10.1.1.104
Real IPv6 EP : 2001:0000:0000:0000:0000:0000:0000:0065

Tunnel nh: Tunnel next-hop
Real IPv4/IPv6 EP: IPv4/IPv6 addresses tied to this MAC

Endpoint Learning Troubleshooting Review
 Verify logical config (EPG/BD/VRF and contracts)
 Verify no network faults under the EPG that would prevent the encap from being
deployed
 Verify that the leaf has the encap deployed
 Verify that the port is a member of the vlan
 Verify that the SVI is present on the leaf with the proper subnets
 Verify that local leaf is learning the endpoint
 Verify learning is enabled on the BD
 Verify software components have the correct BD prefixes programmed
 Verify the leaf is receiving the frame on expected interface and encapsulation
 Verify that endpoint is present in coop and coop has correct tunnel address

Recommended Troubleshooting Apps
https://aciappcenter.cisco.com/

EnhancedEndpointTracker

The EnhancedEndpointTracker is a Cisco ACI application that maintains a database of endpoint events on a per-node basis allowing for unique fabric-wide analysis. The application can be configured to analyze, notify, and automatically remediate various endpoint events. This gives ACI fabric operators better visibility and control over the endpoints in the fabric.

ELAM Assistant

The ELAM Assistant performs ELAM to capture a packet and decode the result.

ELAM is a built-in tool that captures a single packet at the ASIC level to check forwarding decision details. It is typically used by Cisco TAC as it requires a deep knowledge of each ACI ASIC to both perform and correctly understand the resulting output.

This app wraps the differences between each ACI ASIC and provides a UI to perform an ELAM capture for those who don't have access to ASIC level information. It then decodes the results of the ELAM capture in a user-friendly format.

Enhanced Endpoint Tracker
Active endpoint count and fast search
Start/Stop the monitor
Uptime of the monitor and number of queued events to process
Health/history of the monitor itself

Enhanced Endpoint Tracker
Fast search for IP or MAC
~150ms for search to complete

Enhanced Endpoint Tracker

Historical tables to browse various events along with browsing all endpoints in the fabric
Top moves in the fabric, quickly see any unstable/misconfigured endpoints

Enhanced Endpoint Tracker
Full details of current state of endpoint within the fabric including local and XR learns
Also per-node detailed history, move events, rapid/offsubnet/stale/and clear events
History of where endpoint was learned or if it was deleted from the fabric

Enhanced Endpoint Tracker

Clear problem endpoints on multiple nodes quickly

Thank you

Appendix
Packet Walk Checklist

Problem: Host-A cannot ping Host-B

 Is this frame bridged or routed?
 Am I learning Host-A and Host-B IPs in the fabric?
 Do we have a remote learn for Host-B on ingress leaf or are we using proxy-path?
 Do the spines have Host-B entry programmed to handle proxy forwarding?
 For the leaf that is performing policy enforcement, do I have the appropriate contract?

Subtle but important. If bridged then we need to check MAC endpoints, if routed we need to check IP…

Leaves: leaf101 leaf102 leaf103

Host  IP          MAC             EPG  BD
A     10.1.1.101  0000.0000.000A  e1   bd1
B     10.1.2.102  0000.0000.000B  e2   bd2

VRF: v1
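For the first checklist question, a quick way to reason about bridged versus routed is to compare the destination with the sender's subnet. This is a simplified sketch from the host's point of view, assuming a /24 mask on Host-A (consistent with the bd1 subnet 10.1.1.1/24 used in this session); the function name and the same-subnet address `10.1.1.102` are illustrative assumptions.

```python
import ipaddress

def bridged_or_routed(src_ip, dst_ip, src_prefix_len=24):
    """Classify a flow as seen by the sending host (simplified)."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{src_prefix_len}").network
    return "bridged" if ipaddress.ip_address(dst_ip) in src_net else "routed"

# Host-A to Host-B: different subnets, so check IP endpoints.
print(bridged_or_routed("10.1.1.101", "10.1.2.102"))
# Host-A to a hypothetical host in the same subnet: check MAC endpoints.
print(bridged_or_routed("10.1.1.101", "10.1.1.102"))
```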

Packet Walk Checklist
 Am I learning Host-A and Host-B IPs in the fabric?

Problem: Host-A cannot ping Host-B

We can check the endpoint directly on the APIC. If not present, then repeat previous local learn troubleshooting

fab4-apic1# show endpoint ip 10.1.1.101
<snip>
Dynamic Endpoints:
Tenant : ag
Application : app
AEPg : e1

End Point MAC     IP Address                               Node       Interface
----------------- ---------------------------------------- ---------- ------------------------------
00:00:00:00:00:0A 10.1.1.101                               101 102    vpc ag_po1001

fab4-apic1# show endpoints ip 10.1.2.102
<snip>
Dynamic Endpoints:
Tenant : ag
Application : app
AEPg : e2

End Point MAC     IP Address                               Node       Interface
----------------- ---------------------------------------- ---------- ------------------------------
00:00:00:00:00:0B 10.1.2.102                               103        eth1/5

Packet Walk Checklist
 Do we have a remote learn for Host-B on ingress leaf or are we using proxy-path?

Problem: Host-A cannot ping Host-B

Leaf-101 (ingress leaf) does not have an XR IP learn for Host-B:

fab4-leaf101# show endpoint ip 10.1.2.102
Legend:
s - arp H - vtep V - vpc-attached p - peer-aged
R - peer-attached-rl B - bounce S - static M - span
D - bounce-to-proxy O - peer-attached a - local-aged L - local
+-----------------------------------+---------------+-----------------+--------------+-------------+
VLAN/ Encap MAC Address MAC Info/ Interface
Domain VLAN IP Address IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+
<none>

Ensure that the route has pervasive flag for ‘pervasive BD’:

fab4-leaf101# show ip route 10.1.2.0 vrf ag:v1
IP Route Table for VRF "ag:v1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

10.1.2.0/24, ubest/mbest: 1/0, attached, direct, pervasive
*via 10.0.208.64%overlay-1, [1/0], 00:24:38, static, tag 4294967295
recursive next hop: 10.0.208.64/32%overlay-1

Next-hop IP is spine anycast IPv4 Proxy:

fab4-leaf101# show isis dteps vrf overlay-1 | grep 10.0.208.64
10.0.208.64 SPINE N/A PHYSICAL,PROXY-ACAST-V4
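The pervasive-route check lends itself to a small parser when many prefixes need to be verified. Below is a minimal sketch that assumes the `show ip route` text has the shape shown on the slide; `pervasive_route` is a hypothetical helper and the regular expressions only cover the IPv4 lines visible here.

```python
import re

# Route lines copied from the slide output.
SAMPLE = """\
10.1.2.0/24, ubest/mbest: 1/0, attached, direct, pervasive
*via 10.0.208.64%overlay-1, [1/0], 00:24:38, static, tag 4294967295
recursive next hop: 10.0.208.64/32%overlay-1
"""

def pervasive_route(output: str):
    """Return (prefix, is_pervasive, next_hop) for the first route entry found."""
    prefix = flags = next_hop = None
    for line in output.splitlines():
        line = line.strip()
        m = re.match(r"(\d+\.\d+\.\d+\.\d+/\d+), ubest/mbest: \S+(.*)", line)
        if m and prefix is None:
            prefix = m.group(1)
            flags = [f.strip() for f in m.group(2).split(",") if f.strip()]
        m = re.match(r"\*via (\d+\.\d+\.\d+\.\d+)%", line)
        if m and next_hop is None:
            next_hop = m.group(1)
    return prefix, flags is not None and "pervasive" in flags, next_hop

print(pervasive_route(SAMPLE))
```

The returned next hop is the address to look up in `show isis dteps vrf overlay-1` to confirm it is the spine proxy anycast.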

Packet Walk Checklist
 Do the spines have Host-B entry programmed to handle proxy?

Problem: Host-A cannot ping Host-B

First, we need the VNID for the VRF to validate routed flow.

We can get the vrf vnid from the leaf:

fab4-leaf101# moquery -c fvCtxDef -x 'query-target-filter=eq(fvCtxDef.ctxDn,"uni/tn-ag/ctx-v1")'
scope : 2555909

fab4-leaf101# vsh_lc -c 'show system internal eltmc info vrf ag:v1' | egrep vnid: | head -1
overlay_index: 0 ::: vnid: 2555909

We can get the vrf vnid from the APIC cli:

fab4-apic1# moquery -d uni/tn-ag/ctx-v1 | egrep scope
scope : 2555909

We can get the vrf vnid from the APIC UI:

Tenant -> Networking -> VRFs
Packet Walk Checklist
 Do the spines have Host-B entry programmed to handle proxy?

Problem: Host-A cannot ping Host-B

Spine has the entry in coop (should validate each spine):

fab4-spine201# show coop internal info ip-db key 2555909 10.1.2.102
IP address : 10.1.2.102
Vrf : 2555909
Flags : 0x2
EP bd vnid : 16187409
EP mac : 00:00:00:00:00:0B
Publisher Id : 10.4.0.2
Record timestamp : 12 31 1969 19:00:00 0
Publish timestamp : 12 31 1969 19:00:00 0
Seq No: 0
Remote publish timestamp: 05 17 2019 02:22:08 814730181
URIB Tunnel Info
Num tunnels : 1
Tunnel address : 10.0.16.94
Tunnel ref count : 1

The tunnel address can be one of several different types of TEPs:
• Physical TEP within same pod
• VPC TEP within same pod
• Anycast External IP for remote pod or site
• RemoteLeaf PTEP

In this case, this is leaf103 PTEP:

admin@fab4-apic1:~> acidiag fnvread | grep 10.0.16.94
103 1 fab4-leaf103 SAL19069BUY 10.0.16.94/32 leaf
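Mapping a coop tunnel address back to a node can be scripted from `acidiag fnvread` text. The sketch below assumes the column order seen on the slide (id, pod, name, serial, ip/mask, role); `node_for_tep` is a hypothetical helper name. A result of `None` would suggest the address is not a physical TEP of a registered node and is one of the other TEP types listed above.

```python
def node_for_tep(fnvread_output: str, tep: str):
    """Return (node_id, name, role) for the node whose TEP matches, else None."""
    for line in fnvread_output.splitlines():
        cols = line.split()
        # Assumed column order: id, pod, name, serial, ip/mask, role
        if len(cols) >= 6 and cols[4].split("/")[0] == tep:
            return int(cols[0]), cols[2], cols[5]
    return None

# Line copied from the slide output.
SAMPLE = "103 1 fab4-leaf103 SAL19069BUY 10.0.16.94/32 leaf\n"

print(node_for_tep(SAMPLE, "10.0.16.94"))
```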

Packet Walk Checklist
 For the leaf that is performing policy enforcement, do I have the appropriate contract?

Problem: Host-A cannot ping Host-B

Which Leaf applies the contract?

• Ingress leaf applies contract if remote endpoint is known so packet does not have to be forwarded all the way through the fabric
• Egress leaf applies contract if packet was sent via spine proxy. (Will focus on leaf-103)
• Border leaf in ingress policy enforcement does not apply contract unless application EPG is deployed locally.

To Verify Contract

 VRF VNID
 Source EPG pcTag (Host-A)
 Destination EPG pcTag (Host-B)
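The first two bullets reduce to a single decision, which can be sketched as below. This is an illustration only: the border leaf case is deliberately not modeled, and `policy_enforcement_leaf` is a hypothetical name. The lookup values are the VRF VNID and pcTags shown elsewhere in this session.

```python
def policy_enforcement_leaf(ingress_has_remote_learn: bool) -> str:
    """Return which leaf is expected to apply the contract (border leaf case not modeled)."""
    # Remote endpoint known on ingress: the destination pcTag is available, enforce there.
    # Otherwise the packet goes via spine proxy and the egress leaf enforces.
    return "ingress" if ingress_has_remote_learn else "egress"

# leaf101 had no XR learn for Host-B (10.1.2.102), so the egress leaf is the place to look.
print(policy_enforcement_leaf(ingress_has_remote_learn=False))

# The three values needed to verify the contract, taken from the session output.
lookup = {"vrf_vnid": 2555909, "src_pctag": 49155, "dst_pctag": 16389}
print(f"show zoning-rule scope {lookup['vrf_vnid']}")
```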

Packet Walk Checklist
 For the leaf that is performing policy enforcement, do I have the appropriate contract?

Problem: Host-A cannot ping Host-B

Host-A local EPM entry on leaf101 contains source pcTag:

fab4-leaf101# show system internal epm end ip 10.1.1.101

MAC : 0000.0000.000a ::: Num IPs : 1
IP# 0 : 10.1.1.101 ::: IP# 0 flags :
Vlan id : 3028 ::: Vlan vnid : 8292 ::: VRF name : ag:v1
BD vnid : 15958069 ::: VRF vnid : 2555909
Phy If : 0x16000002 ::: Tunnel If : 0
Interface : port-channel3
Flags : 0x80004c05 ::: sclass : 49155 ::: Ref count : 5
EP Create Timestamp : 05/17/2019 02:14:09.965041
EP Update Timestamp : 05/17/2019 03:46:08.819921
EP Flags : local|vPC|IP|MAC|sclass|timer|
::::

Host-B local EPM entry on leaf103 contains dest pcTag:

fab4-leaf103# show system internal epm endpoint ip 10.1.2.102

MAC : 0000.0000.000b ::: Num IPs : 1
IP# 0 : 10.1.2.102 ::: IP# 0 flags :
Vlan id : 279 ::: Vlan vnid : 8293 ::: VRF name : ag:v1
BD vnid : 16187409 ::: VRF vnid : 2555909
Phy If : 0x1a004000 ::: Tunnel If : 0
Interface : Ethernet1/5
Flags : 0x80004c04 ::: sclass : 16389 ::: Ref count : 5
EP Create Timestamp : 05/17/2019 02:21:47.612351
EP Update Timestamp : 05/17/2019 03:45:01.836174
EP Flags : local|IP|MAC|sclass|timer|
::::

Packet Walk Checklist
 For the leaf that is performing policy enforcement, do I have the appropriate contract?

Problem: Host-A cannot ping Host-B

fab4-leaf103# show zoning-rule scope 2555909
Rule ID SrcEPG DstEPG FilterID operSt  Scope   Action
======= ====== ====== ======== ======  =====   ======
4419    0      0      implicit enabled 2555909 deny,log
4420    0      0      implarp  enabled 2555909 permit
4421    0      15     implicit enabled 2555909 deny,log
4535    0      49154  implicit enabled 2555909 permit

contract_parser.py is available since 3.2.2:

fab4-leaf103# contract_parser.py --vrf ag:v1
Key:
[prio:RuleId] [vrf:{str}] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags][contract:{str}] [hit=count]

[16:4535] [vrf:ag:v1] permit any epg:any tn-ag/bd-bd2(49154) [contract:implicit] [hit=0]
[16:4420] [vrf:ag:v1] permit arp epg:any epg:any [contract:implicit] [hit=0]
[21:4419] [vrf:ag:v1] deny,log any epg:any epg:any [contract:implicit] [hit=5157]
[22:4421] [vrf:ag:v1] deny,log any epg:any pfx-0.0.0.0/0(15) [contract:implicit] [hit=0]

fab4-leaf103# show logging ip access-list internal packet-log deny | egrep 10.1.2.102 | head
[ Fri May 17 04:02:02 2019 634490 usecs]: CName: ag:v1(VXLAN: 2555909), VlanType: Unknown, Vlan-Id: 0, SMac: 0x000c0c0c0c0c, DMac:0x000c0c0c0c0c, SIP: 10.1.1.101, DIP: 10.1.2.102, SPort: 0, DPort: 0, Src Intf: Tunnel14, Proto: 1, PktLen: 98
<snip>
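To see why the ping lands on the implicit deny, the rules above can be walked in priority order. The following is a simplified model, not the hardware lookup: it assumes a lower priority value is evaluated first, treats pcTag 0 as "any", and reduces the filter to a protocol name. The `Rule` type and `first_match` are hypothetical names; the rule values are transcribed from the contract_parser.py output and the pcTags from the EPM entries for Host-A (49155) and Host-B (16389).

```python
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    prio: int
    rule_id: int
    src: int     # source pcTag, 0 = any
    dst: int     # destination pcTag, 0 = any
    proto: str   # "any" or a protocol name such as "arp"
    action: str

RULES = [
    Rule(16, 4535, 0, 49154, "any", "permit"),
    Rule(16, 4420, 0, 0, "arp", "permit"),
    Rule(21, 4419, 0, 0, "any", "deny,log"),
    Rule(22, 4421, 0, 15, "any", "deny,log"),
]

def first_match(rules, src, dst, proto) -> Optional[Rule]:
    """Return the matching rule with the lowest priority value, if any."""
    for rule in sorted(rules, key=lambda r: r.prio):
        if rule.src in (0, src) and rule.dst in (0, dst) and rule.proto in ("any", proto):
            return rule
    return None

# ICMP from Host-A (pcTag 49155) to Host-B (pcTag 16389)
hit = first_match(RULES, src=49155, dst=16389, proto="icmp")
print(hit.rule_id, hit.action)
```

In this model the flow falls through to rule 4419, which is consistent with the hit counter and the packet-log deny entry shown above.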

Packet Walk Checklist
 For the leaf that is performing policy enforcement, do I have the appropriate contract?

Problem: Host-A cannot ping Host-B

In this instance the contract was missing. Add the proper consumer/provider and/or VzAny/preferred group updates to allow communication between the two EPGs

Traffic from Host-A (pcTag 49155) to Host-B (pcTag 16389):

fab4-leaf101# show zoning-rule scope 2555909 | egrep "Rule|===|16389"
Rule ID SrcEPG DstEPG FilterID operSt  Scope   Action
======= ====== ====== ======== ======  =====   ======
4735    49155  16389  7        enabled 2555909 permit
4700    49155  16389  default  enabled 2555909 permit
4736    16389  49155  default  enabled 2555909 permit
6137    16389  49155  6        enabled 2555909 permit

fab4-leaf101# contract_parser.py --vrf ag:v1 --epg tn-ag/ap-app/epg-e1


Key:
[prio:RuleId] [vrf:{str}] action protocol src-epg [src-l4] dst-epg [dst-l4] [flags][contract:{str}] [hit=count]

[7:6137] [vrf:ag:v1] permit ip tcp tn-ag/ap-app/epg-e2(16389) tn-ag/ap-app/epg-e1(49155) eq 80 [contract:uni/tn-ag/brc-c1] [hit=0]


[7:4735] [vrf:ag:v1] permit ip tcp tn-ag/ap-app/epg-e1(49155) eq 80 tn-ag/ap-app/epg-e2(16389) [contract:uni/tn-ag/brc-c1] [hit=0]
[9:4736] [vrf:ag:v1] permit any tn-ag/ap-app/epg-e2(16389) tn-ag/ap-app/epg-e1(49155) [contract:uni/tn-ag/brc-c1] [hit=0]
[9:4700] [vrf:ag:v1] permit any tn-ag/ap-app/epg-e1(49155) tn-ag/ap-app/epg-e2(16389) [contract:uni/tn-ag/brc-c1] [hit=220,+10]
