You are on page 1of 94

Day-2 Telemetry

Network Insights for ACI/NX-OS

Karishma Gupta, Technical Marketing Engineer


Intent Based Networking Group

BRKDCN-2712
Cisco Webex Teams

Questions?
Use Cisco Webex Teams to chat
with the speaker after the session

How
1 Find this session in the Cisco Events Mobile App
2 Click “Join the Discussion”
3 Install Webex Teams or go directly to the team space
4 Enter messages/questions in the team space

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Main Message

“You can't manage what you


don’t measure. You can't
measure what you don't see”

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 4
Session Abstract
The session provides the journey and current state on modern telemetry infrastructure
supporting Day 2 operations.

The Cisco’s Network Insights offering will provide a common tooling for ACI and NX-OS to
efficiently deliver information for Day 2 operations, all built on top of a scale-out micro-
services architecture.

In addition to the infrastructure and architecture that enables Network Insights, model based
software and hardware telemetry will be put in contrast as well as the various telemetry data
consumption models. Where data comes un-throttled from the switches forwarding chip
(ASIC), how we receive this mass of information and talk about the benefits, the trade-offs,
and challenges across ingestion, retention, correlation, and visualization.
The architecture discussion will be complemented with network telemetry concepts, future
directions, and use cases with real-world examples of the concepts, capabilities and key
differentiators.

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 5
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 6
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 7
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 8
Network Visibility Is Hard

Hard to Operationalize
SNMP
Incomplete
Syslog
Unstructured
CLI
Device-Specific

Slow

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 9
Why is support Is my configuration
asking for access to at the supported
my setup three times scale?
If someone else hit for this issue?
this issue and its
fixed, why not notify
that It could impact
me!
How many times do I
need to gather
these, always rolling
over logs to resolve
this case?
Are my network
security patches up
to date?

Is the code I’m


running since 2016 Are my network
still recommended? resources healthy,
underlay/overlay,
routing, etc?

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 10
Network Telemetry Frees the Data

Where Data Is Created Where Data Is Useful

Sensing & As Much Useful Data


measurement As Efficiently as Possible

Storage &
analysis

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 11
Key Telemetry Characteristics

Push not Pull Efficient Delivery

Analytics-ready Tool-Chain consumption


UDP Data and Integration

Data-model Driven Structure and


Consistent format Automation

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 12
“Pushing” More Data Really Does Work Better
Counters CPU load
400 30
Thousands

300 20%
20 14%
200 8%
10 7% 7% 7%
100
0
0
1 2 3
5s 10s 15s 20s Destinations

Time to collect all data

Interface counters
(In/Out pkts, In/Out Discards, In/Out Errors)

MemAllocated
Telemetry
SNMP
0 5 10 15 20 25
Seconds
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 13
Why This Matters Now
What hasn’t changed What has changed

Use Cases Trends


• Real time statistics
• Network Health • Centralized / Software-defined
• Anomaly detection • Speed
• Troubleshooting / Remediation • Scale
• SLAs, Performance Tuning
• Capacity Planning Capabilities
• Security

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 14
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 15
Getting the Information out of the Network

• What are the mechanisms for getting information out of the network? – i.e.,
“Telemetry Sources”
• This session: Software and Hardware Streaming Telemetry

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 16
Nexus Software Telemetry

Software Telemetry Data

UDP

• Provides visibility to control-plane protocol state, environmental info,


counters, etc.
• Data retrieved from DME (system object model) or NX-API (structured CLI
ouput)
• Limited data-plane visibility, no flow-level visibility
• Not designed for high frequency export

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 17
Streaming Software Telemetry Platform Support

Nexus Platform DME NX-API Release


3000 with 8GB+ DRAM ✓ ✓ 7.0(3)I7(1)
9200/9300 ✓ ✓ 7.0(3)I5(1)
9500 ✓ ✓ 7.0(3)I5(1)
5000/5500/6000 X X N/A
7000/7700 X ✓ 8.3(1)

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 18
ASIC-Specific Telemetry Outputs

• Different ASICs support different hardware telemetry capabilities


• Different hardware telemetry types use different output formats
• No standard in industry (some are evolving)
• Generally requires normalization/conversion into common, structured
format for consumption

Hardware Telemetry Data

Telemetry Receiver
UDP

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 19
Flow Table (FT)

• Collects full flow information plus metadata


o 5-tuple flow info
o Interface/queue info
o Flow start/stop time
o Flow latency

• 32K flow table entries per ASIC slice


• Direct hardware export

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 20
Flow Table Events (FTE)

• Triggers notifications based on criteria / thresholds met by


data-plane packet flows
• Collects full flow information plus metadata
• 5-tuple flow info with timestamp
• Interface/queue info
• Buffer drop indication
• Forwarding drop, ACL drop, policer drop indication
• Latency/burst threshold exceeded indication
• Direct hardware export, with flow-level and global throttling

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 21
Streaming Statistics Export (SSX)

• Streams ASIC statistics at rapid cadence based on user config


• Interface counters (packets/bytes/drops)
• Ingress/Egress queue depth
• Ingress/Egress queue drops
• Egress queue microbursts
• Buffer depth
Periodically
• User defines streaming parameters : pull values
SSX ASIC Stats
• which statistics, how often, and to which collector
• Direct export from ASIC to front-panel port SSX Export

Lookup
Pipeline

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 22
Hardware Telemetry Platform Support

Platform FT FTE SSX

9300/9500-EX ✓ X X

9300/9500-FX ✓ ✓ X

9364C X X ✓

9300-FX2 ✓ ✓ ✓

9300-GX ✓ ✓ ✓

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 23
A Very Basic Analytics Platform Architecture

Dashboard Visualize, Alert, Automate

Storage Index, Search, Store

Collection Ingest, Aggregate, Normalize

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 24
Network Insights - Enabling Proactive Action

Sources of
Ingest and Process Derive Insights Recommend Action
Telemetry Data

Config File
PSIRTs Complex
Syslog Tech- Config / Scale / Process
correlation
Support Hardening
Accounting RIB FIB
Metadata
Logs Debug Logs
CLI extraction
Streaming
Telemetry Environmentals
Event Topology Cores Correlate
History Field notices Network SW-Version /
Consistency against
Checkers Database Protocols PSirts
Mac Table

Increase Availability and Performance


Leverage Knowledge Base of Digitized Known Issues
ACI | NX-OS
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 25
Network Insights Applications

Apps
DCNM APIC

Platform
App Hosting Framework App Hosting Framework
App Store App Store

Data collection and ingestion Data correlation and analysis Data visualization and action

Visibility Insights Proactive Troubleshooting


Learn from your network and See problems before Find root cause faster with
recognize anomalies your end users do granular details

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 26
Download application from the App Store
Common App Store for ACI and NXOS - https://dcappcenter.cisco.com/

Network Insights Suite

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 27
Day 2 Operations Stack

Assurance
• Policy/ Control/Data plane Assurance
Network Assurance Engine (NAE)
• Incident and Problem Management
• Compliance and Audit O
P
Proactive Maintenance S
• Fabric health and maintenance based on
Network
global Insights Advisor (NIA)
Cisco advisories T
• Network security maintenance based on
PSIRTs and known vulnerabilities A
C
Troubleshooting
K
• Fabric Health monitoring
Network Insights Resources (NIR)
• Fabric wide resource monitoring
• Anomaly detection

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 28
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 29
Data Center Visibility Use Cases

Network Health Path and Latency Network Performance


CPU and memory Measurement
• • Interface utilization
utilization • End-to-end visibility
• Buffer monitoring
• Forwarding table utilization • Path tracing over time
• Microburst detection
• Protocol state and events • Flow latency monitoring
• Drop event correlation
• Environmental data

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 30
Shorten Time to Remediation
for Troubleshooting
Locate Locate Root cause

Identify Root cause


Network
Identify Automated with Operations
Artificial
Intelligence

Problem Problem
lifecycle lifecycle
Network Remediate NIR Remediate
Operations

Network Insights:
Resources

Packet drops | Latency | Reachability | Routing | Micro Bursts


BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 31
Increase Speed and Agility
for Capacity Planning
Trends Trends Forecast

Readings Forecast
Network
Readings Automated with Operations
statistical models

Capacity Capacity
planning planning
Network Adjust NIR Adjust
Operations

Network Insights:
Resources

Bandwidth | Ports | TCAM | Configuration limits


BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 32
Empower Your Team
with Proactive Monitoring
Impact Analysis Impact Analysis Notify

Detect Notify
Network
Detect Automated with Operations
Anomaly detection

Incident Incident
lifecycle lifecycle
Network Resolve NIR/NIA Resolve
Operations

Network Insights:
Resources and Advisor

Anomaly detection | Pro-active alerts | Predictive failure


BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 33
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 34
Network Insights Resources

• Analysis and correlation of software and hardware telemetry data


with focus on Day 2 network operations use-cases
• Focus on identifying anomalies and providing quick drill-down to
specific issues

NIR

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 35
Telemetry and Analytics Deployment Models
Cisco Turnkey Custom

Network
Insights
Resources UI
Custom Third
Party

Data Lake

Software and Hardware


Telemetry Telemetry Software
Telemetry
Receiver

Telemetry Sources
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 36
Complexity of Building a Telemetry Platform

Too many Complex


Auto scaling Building
open-source interactions Long term
of the telemetry
options to between support
application dashboards
pick from services

Investing in a software development team

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 37
How Can NIR Help with Day 2 Operations?
Monitor fabric-wide and node-
Resources specific resource utilization

Track CPU & memory


Environmental consumption, monitor power and
temperature

Network Monitor network bandwidth


Insights Statistics utilization, packet drops, and
network protocol statistics
Resources
Track flow paths, identify
Flows applications experiencing high
latency or packet drops

Correlate changes to events,


Events identify faults

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 38
First, We Need Data!

• Software Telemetry • Hardware Telemetry


• Provides visibility into: • Provides visibility into:
• Resource utilization • Data-plane flow information
• Environmental data • Flow path data

• Interface counters • Flow statistics

• Control-plane protocol stats &


events
NIR
Telemetry Sources Telemetry Data

Software Telemetry
Hardware Telemetry

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 39
NIR Architecture NIR
NIR GUI

REST APIs REST Client


Telemetry
APIC / DCNM
Manager

Data Lake

Anomaly &
Data Lake
Correlation
Connector
Engines

Message Bus (Kafka) Kafka Client

Telemetry Sources (Roadmap)


ACI/NX-OS Telemetry Collectors
Hardware & Software

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 40
Operational Intelligence Engine for Network Insights

Dynamic Correlation
Correlate information across Dynamic Correlation
data sources

Proactive Alerts
See problems before end users Proactive Alerts
do and alert

Failure Prediction &


Corrective Action Failure Prediction and
Ability to predict failure and Corrective Action
provide corrective action

Intelligent Insights Intelligent Insights


Ability to discover
information with ease

Increase Availability and Performance


BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 41
Correlation Engine
Correlate normalized telemetry data streams from Transformation
Receiver

Configs
End-to-end Flow Path
LLDP

Buffer and End-to-end Path Latency


Queue stats
Flow Correlation based on Buffer Occupancy and
details timestamp and drops along Flow Path
matching 5-tuple

Pipelines

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 42
NIR Integration with External Systems

REST APIs exposed to provide Kafka topic(s) (Roadmap)


data to third-party tools
• Normalized pre-correlated
• Anomalies data
• Resources
• Post-correlated data
• Events
• Nodes

{ REST }

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 43
Forwarding of Anomalies

• Every anomaly is treated as a fault


• Every fault is written to Kafka topic
• 3rd party applications (like ServiceNow) can subscribe to these
topic to retrieve the faults and process/analyze them further

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 44
Let’s start the Day
with an Overview

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 45
Select time range

Dashboard view to retrieve


historical data

Dashboard -
intended to
quickly drill down
to issues

Anomalies by
Type and Severity

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 46
My Database
is/was slow!

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 47
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 48
Flow Visibility and Flow Anomalies
Flows (5-Tuples) visibility
Shooting in dark using using Cisco ASIC’s
Ping/Traceroute/SPAN hardware telemetry
Without NIR capabilities
Ping/Traceroute may not Correlates and tells you
take same path as Service what caused these
flow in Fabric - anomalies – buffer drops,
troubleshooting is not queue drops, QOS drops,
accurate policy, ACL, policer,
With NIR forwarding drops

Does not provide historical


data Keeps historical data and
shows path and topology
for flows in fabric
Reactive and not proactive
Proactive and not reactive

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 49
I need to Capacity
Plan

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 50
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 51
Capacity planning
Operational, configuration, hardware, environmental
Stats Collection resource utilization, interface and routing protocol stats,
flow records

• Baselines statistics and studies pattern to identify


Trending ‘normal’ behavior
• Provide Trending information

• When Utilization exceeds thresholds


Anomaly
• On sudden rate of change in the utilization of these
Detection
resources

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 52
My switch is not
forwarding any
Traffic to
Destination X

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 53
BGP statistics

NIR will ingest and correlate BGP data per node which includes -
1) Number of BGP Sessions per switch
2) Total number of Neighbors per switch
3) Per Neighbor information on operational state, address family, connection
attempts, prefixes sent and accepted paths
4) Anomalies and trends for the above data. Anomalies bubbled up in
dashboard for BGP - For example: connection retries and connection
drop counts, sudden increase/decrease in prefixes received/sent

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 54
BGP Statistics
BGP Protocol data and Anomalies

NIR recommendations

BGP Anomalies in
Dashboard

Trends and neighbors with prefix count

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 55
I see CRC errors
on my interface.
Could it be
because of
Interface Optics?

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 56
DOM Anomaly

© 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public
Other use
cases!

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 58
Interface Errors
• Capture Stats, Errors, Drops, Utilization and Rate of all
Stats Collection
Interfaces

• Baselines behavior of every interface


Anomaly • Raise anomaly if Interface Utilization exceed thresholds
Detection • Anomalies for CRC errors, DOM anomalies, Interface
drops, QOS drops

• Correlate DOM to CRC errors


• Check if Stomped CRC to Fabric receiving CRC errored
Correlate
packets
& Diagnose
• Correlate Platform counters e to hardware errors
• Correlate Flow drops to Interface errors

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 59
Monitoring vPC
• vPC State and stats, vPC Domain State, Peer State,
Stats Collection Role, Orphan Ports
• Operational state

• vPC configured but peer-link members down


Anomaly • Partially down anomaly if one leg of port-channel is
Detection down
• Detection of Split-Brain

Correlate If same EPs are not learnt across vPC legs, this is
& Diagnose correlated to vPC inconsistency

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 60
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 61
Network Insights Advisor

• Provides deployment-relevant supportability information and


advisories
• Focus on actionable recommendations based on known issues and
Cisco best common practices

NIA
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 62
How Can NIA Help with Day 2 Operations?
Deployment-specific
recommendations & best practices,
Advisories upgrade impact
analysis/Experience*

Inbox function/Smart Inbox*,


proactive EOL/EOS
Notices announcements, new Field
Notices, new software/SMUs

Network
Alert to known defects, PSIRTs,
Insights Anomalies Flow State Validator

Advisor
System hardening checks, version-
Compliance specific scale limits monitoring (NIR
-> NIA) to generate advisory *

TAC assist, Tech support to Cloud,


Diagnostics Fast Start

* Roadmap BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 63
First, We Need Data!

Network Cisco
Provides: Provides:
• Running config of all devices • Best practices updates
• “show tech” from all devices • PSIRTs, FNs, EOS/EOL
(including APIC) • Software release notifications
• Digitized signatures of known
defects
NIA
Network On-prem Data Cloud Data Cisco

User-specified
Every 24h
interval

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 64
NIA Architecture

NIA Cisco NIA Cloud Service

NIA GUI TAC Services

EOL/EOS

Correlation / Secure PSIRT/FNs


Analytics Internet
Engines Connection BCP/Recommendations
Internet
Software Image Repo
Database
On-Prem Data Sources Statistics

“show run”
Data Collection
“show tech”

On-Premise Cisco Cloud

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 65
How does NIA detect known issues?

Hardening
Check

Tech Storage
Support Data Sources Signature Advisory NIA – GUI
and ‘show
Matching Services
run’
Insights DB
collection

Updated periodically with Bugs/PSIRTs


signatures from the cloud detection

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 66
NX-OS
Flow State Validator – SW/HW State Validation
for a Flow Check
Input flow info
Software and
hardware
consistency
on each
switch

Once complete,
view details

Optionally run Bug


Scan/Collect Logs

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 67
Alert me about Field Notices/EOL/EOS/Recommended
Code Notifications
s Affected devices: 3
Leaf 1, Leaf 2, Leaf 3
3 Alert / Inform Anomaly: EOS of current software

Recommend:
Upgrade S/W to NXOS 7.0(3)I7(3)

NIA
Push
Notification Insight DB
Fabric
1
Monitor
4 Implement

p Affected
devices
s S/W 2 Identify Switches
Notify
p p p

Remediate
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 68
TAC Assist
Helps with log collection directly from the app. These logs can then be attached to an
SR. Optionally upload to Cloud

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 69
Upgrade impact
When a software upgrade advisory is generated, an upgrade impact can be measured
from NIA per switch

Advisory to
Upgrade

Number of Hops
to upgrade

Measure
Upgrade Impact

Non-
Disruptive
disruptive

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 70
NIA Inbox
Used to send important notifications to end users
• New App Version, new NXOS software, new APIC/DCNM releases
• In future, will be used by Cisco to communicate with customers on best practices,
scripts, FAQs

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 71
NI Base 2.0.1 $0
ACI 4.2(3), DCNM 11.3(1)

• Infra details
Collect basic info about customer Hardware/Software and send to Cisco
• TAC Assist
Allows user to collect Tech-Support from NI app for specific or set of nodes
• Tech Support to Cloud
Allows user to upload Tech-Support to Cloud. These logs can then be accessed by
TAC searchable using Serial #
• Fast Start
Enables TAC to pull data from customer premise using NI App. No manual intervention
is needed

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 72
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 73
DEMOS on NIR/NIA
Scale, Software,
Hardware Support
Network Insights Resources Scale – NIR 2.1

Single fabric for ACI (Roadmap for Multiple fabrics


Fabrics - NIR 2.2)
Multiple fabric support for NXOS

100 leaf switches (APIC/ACI)


# of Switches 250 switches (DCNM/NX-OS)

10,000 5-tuple flows/second with Services


Flow Monitoring Engine/Cluster of compute nodes

Target 30 days for software telemetry


Data Retention Target 7 days for hardware telemetry

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 76
Network Insights Resources 2.1
Software and Hardware Support

Nexus
Nexus Nexus
9300 /
Telemetry Type 9300-EX / 9364C /
9500 1st
FX / FX2 9332C
Gen
Software
Yes Yes Yes
Telemetry

Flow Telemetry No Yes No

ACI Software Version APIC / ACI Release - 4.2(3) / 14.2(3) AND 3.2(8) / 13.2(8)

7.0(3)I7(6) - [SW Telemetry]


Standalone Software Versions DCNM release 11.3(1)
9.3(2) - [SW and HW Telemetry]

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 77
Network Insights Advisor 2.0.1
Software, Hardware Support and Scale
Nexus 9300 / 9500 1st Gen and Cloud Scale –
Nexus 3000 (All Models) – NXOS only
ACI and NXOS

Minimum ACI Software Version APIC / ACI Release - 4.2(3) / 14.2(3)

Minimum Standalone Software DCNM release 11.3(1) NX-OS release 7.0(3)I7(1) or later
Versions
FSV – 9.3(3) onwards

250 switches – NXOS


100 Nodes - ACI

NIA not supported on:


• Cisco Nexus 9500 Series switches with -R line
cards
• Cisco Nexus 3600 Series switches

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 78
Compute
Requirements for
NIR/NIA
Supported fromACI

Cisco Application Services Engine ACI DCNM


4.2

Modern Scale-out application services stack to host Day-2 Operations applications

Network Network 3rd Party apps


insights Assurance Engine * *

2.2 GHz 10 core CPU x 2


256 GB memory
2.4 TB x 4 HDD
10G/25G/40G connect

Network automation Scale-out cluster SE-CL-L3

* In planning
BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 80
ACI/APIC Compute Requirements for NIR 2.1

Software Telemetry

Existing APIC Cluster (M3/L3)

Software + Flow Analytics

Existing APIC Cluster + 3-Node Services


Engine Cluster

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 81
ACI/APIC Compute Requirements for NIA 2.0.1

Up to 20 Nodes

Existing APIC Cluster (M3/L3)

21 - 100 Nodes

Existing APIC Cluster + 3-Node Services


Engine Cluster

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 82
DCNM/NX-OS Compute Requirements for
Network Insights
Hardware Recommendations for Deployments up to 80 Switches and 2000 Flows
Node Deployment CPU Memory Storage Network
Mode
Cisco DCNM OVA/ISO 16 vCPUs 32G 500G HDD 3x NIC
Computes (x3) OVA/ISO 32 vCPUs 64G 500G HDD 3x NIC

Hardware Recommendations for Deployments from 81 to 250 Switches and 10000 Flows
Node Deployment CPU Memory Storage Network
Mode
Cisco DCNM OVA/ISO 16 vCPUs 32G 500G HDD 3x NIC
Computes (x3) ISO/Service 40 vCPUs 256G 2.4TB HDD 3x NIC*
Engine

* Network card: Quad-port 10/25G

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 83
Licensing
Network Insights and Assurance: Licensing

Premier Subscription Tier includes NIR/NIA/NAE Licenses +


Everything ‘Advantage’ License

Day2 Ops Subscription bundle includes NIR/NIA/NAE


Licenses

A La Carte Subscription-based Licenses

Day2 Ops bundle info - https://nexus9kaci.cisco.com/subscriptions-licensing#day-2-operations

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 85
Network Insights and Assurance:
Licensing
NIR/NIA/NAE license is part of Premier Edition License
Premier tier and separate License is not required

D2 Ops bundle or Add-on Licenses are supported &


Advantage tier need to be purchased separately

D2 Ops bundle or Add-on Licenses are supported &


Essentials tier need to be purchased separately

NIR/NIA is subscription only license and supports per


License model Leaf licensing model for ACI and per node Licensing
model for DCNM
Licensing and ordering guide: https://www.cisco.com/c/en/us/td/docs/data-center-analytics/network-insights/1-x/licensing-guide/NIR-NIA-Licensing-Guide-r0.html

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 86
Agenda

• Introduction to Data Center Telemetry


• Operationalizing Telemetry
• Network Insights Use Cases
• Network Insights Resources
• Network Insights Advisor
• Sizing, Demos, Licensing
• Key Takeaways

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 87
Network Insight Telemetry Applications
Providing Pervasive Network Health Visibility & Enabling Proactive Insights

New Apps

Network Availability Network Health


Network Insights Network Insights
NIA Advisor NIR Resources
Proactive Software Recommendations/Notifications Physical/Logical Network Capacity & Utilization
Issue Vulnerability Detection & Remediation Data & Control Plane & Environmental Health

Enhance Availability, Uptime & Network Wide Visibility


BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 88
Key Takeaways

• Nexus leads the industry in Telemetry capabilities


• Combination of Software and Hardware streaming provides
deepest level of Network Visibility
• Platforms for consuming, analyzing, visualizing Telemetry data
available or being developed for both ACI and NX-OS
• Both Cisco turnkey solutions and custom/third-party integrations
exist today

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 89
Main Message

You can't manage what you don't measure.


You can't measure what you don't see

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 90
Complete your
online session
survey • Please complete your session survey
after each session. Your feedback
is very important.
• Complete a minimum of 4 session
surveys and the Overall Conference
survey (starting on Thursday) to
receive your Cisco Live t-shirt.
• All surveys can be taken in the Cisco Events
Mobile App or by logging in to the Content
Catalog on ciscolive.com/emea.

Cisco Live sessions will be available for viewing on


demand after the event at ciscolive.com.

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 91
Continue your education

Demos in the
Walk-In Labs
Cisco Showcase

Meet the Engineer


Related sessions
1:1 meetings

BRKDCN-2712 © 2020 Cisco and/or its affiliates. All rights reserved. Cisco Public 92
Thank you

You might also like