
Macallan Software

Architecture Overview

Macallan Platform Software Team

November 2013

EDCS-1340217 (rev 18)

© 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
Macallan Highlights
• Macallan is the next generation modular enterprise Ethernet switch (next generation Cat4500)
– Centralized forwarding with Doppler ASICs (Doppler D, G & E)
– A series of new chassis, supervisors and line cards
– Converged Operating System (Polaris)
– Converged access (wired and wireless)
– Instant access and VSS
Contents
• Architecture overview
– System architecture
– Control plane
– Data plane
• Major platform modules
System Architecture
Macallan System Block Diagram
[Block diagram: power supplies, fan tray, supervisors and line cards connected over the backplane]
Chassis
• New chassis: 4 slot, 7 slot and 10 slot
• High availability design in data plane and control plane
• High speed data link
– 320G per slot on 4 & 7 slot
– 240G per slot on 10 slot
• Standard control link
– High speed PCIe link
– I2C
Passport Supervisor
[Block diagram: control block (CPU & FPGA), forwarding engines, power block and uplinks]
Supervisor
• Since Macallan is a centralized forwarding system, the
supervisor contains all the control plane features and data
plane features
– In-Service Software Upgrade (ISSU): the linecards continue to
pass traffic when the system is being upgraded
• The system can expand (both the capacity and features)
when we build new supervisors
– The linecards will not need to change
• We planned to build two supervisors
– Passport: low end, Doppler D based
– Imperial: high end, Doppler E based
Line Card Architecture Example (Bell 48x1GE UPOE)
• Doppler G in stub mode for forwarding (low cost)
• New 1588 support
• Powerful POE features
• No CPU, no memory (lower HW cost and a more simplified software architecture)
Control Plane
APM 883208 ARM CPU
• 8 cores (2.0 GHz)
• 2 memory controllers (up to 8 GB memory)
• 5 PCIe controllers
• Embedded ARM cores and security engines for packet processing acceleration
Innovation with ARM CPU
• ARM CPU is selected for Macallan
– Macallan is one of the early adopters of ARM CPU
in Cisco
– The pioneering work on the ARM CPU will help to
build up the ARM ecosystem in Cisco
CPU Architecture with Passport
[Diagram: the RP (with the Ardbeg FPGA, M3) connects over PCIe and the EOBC Ethernet switch to three Doppler D ASICs on the supervisor (each with A15/A7 and M3 cores), to the standby supervisor, and to the Doppler G ASICs (A15/A7 and M3 cores) on the line cards]
CPU Architecture with Passport (Cont.)
| Board | Chip | CPU (Cores) | OS | Ethernet | Application |
|---|---|---|---|---|---|
| RP | APM883208 | (8) | Polaris | Yes | main CPU |
| Sup | Doppler D | A15(4)/A7(4) | TBD (Polaris Lite?) | Yes | AVC, BFD, IPSLA |
| Sup | Doppler D | M3(4) | Firmware | No | IODMA, SysCtrl, 1588/PTP |
| Sup | Ardbeg | M3(4) | Firmware | No | IODMA |
| Linecard | Doppler G | A15(2) | - | - | Not used (no memory) |
| Linecard | Doppler G | A7(4) | - | - | Not used (no memory) |
| Linecard | Doppler G | M3(4) | Firmware | No | IODMA, POE, SysCtrl, 1588/PTP |
Embedded A15/A7 Support
| Component | Description |
|---|---|
| Application | Various data plane assistance |
| Forwarding Infra | Packet driver; FED Lite |
| Platform Infra | OS TBD (Polaris Lite?) |
| CPU | 4xA15 cores and 4xA7 cores |
| Memory | 1 GB (surface mount) |
| Network | 1 Gb MAC to EOBC switch |
| Disk | No disk (NFS from main RP) |
| Console | No console (only for debugging on the engineering board) |
Embedded CPU Management
[Diagram: the RP manages, over PCIe (memory and register access), M3 Core 0 in each Doppler D/G (up to 6 Doppler G on a line card); M3 Core 0 in Doppler D/G 0 also has ROM and eMMC access, and each Doppler carries additional M3 and A15/A7 cores]

Embedded CPUs Boot Sequence
M3 Core 0 in Doppler D/G 0:
• Run Sboot0 image off ROM upon reset
• Verify and load Sboot1 image from eMMC
• Compare the Sboot1 version with RP
• Upload all the images to RP if the local version is selected (SR)
• RP downloads Sboot1 image if the version in RP is selected
• Initialize DMC and FTN
• Load from eMMC or download from RP the M3 application images and A15/A7 uboot image
• Validate these images
• Bring A15/A7 and other M3 cores out of reset
• All M3 and A15/A7 start running
M3 Core 0 in other Doppler D/G:
• RP downloads Sboot1 image
• RP resets M3 Core 0; Sboot0 runs and finds the Sboot1 image in the SRAM
• Initialize DMC and FTN
• Download from RP the M3 application images and A15/A7 uboot image
• Validate these images
• Bring A15/A7 and other M3 cores out of reset
• All M3 and A15/A7 start running
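The Sboot1 version-selection step above can be sketched as follows. This is a minimal illustration, not the real firmware: the function name, the "local wins on newer version" rule, and the tuple version encoding are assumptions for the sketch; the actual logic lives in Sboot0/Sboot1 and the RP download protocol.

```python
# Hedged sketch of the boot image selection for M3 Core 0 in Doppler D/G 0:
# compare the local (eMMC) Sboot1 version against the RP's copy and pick a
# source. Version tuples and the comparison rule are illustrative only.
def select_sboot1(local_version: tuple, rp_version: tuple) -> str:
    """Return which Sboot1 image the boot flow proceeds with.

    If the local version is selected, the local images are uploaded to the
    RP (Silent Roll); otherwise the RP downloads its Sboot1 image to the M3.
    """
    return "local-emmc" if local_version >= rp_version else "rp-download"

# Examples (hypothetical version numbers):
assert select_sboot1((1, 2), (1, 1)) == "local-emmc"
assert select_sboot1((1, 0), (1, 1)) == "rp-download"
```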
Embedded CPUs Shutdown Handling
• When RP goes down
– All the embedded CPUs on the sup go down
– The embedded CPUs on the LC are not impacted
• When an embedded CPU goes down
– If it is restartable, it is restarted after the reset reason/core dump is collected
– If it is not restartable, the board is rebooted
• If it is on the sup, the sup is brought down
• If it is on the LC, the LC is rebooted (and the corresponding IOMD on the sup is killed)
PCIe Network
EOBC Network
EOBC Network Addressing
| Component | Host Name (Logical Slot 0) | IP Address (Logical Slot 0) | Host Name (Logical Slot 1) | IP Address (Logical Slot 1) |
|---|---|---|---|---|
| eth1 | N/A | 100.0.0.200/16 | N/A | 100.1.0.200/16 |
| RP Complex | rp0-0 | 10.0.1.0/16 | rp1-1 | 10.1.1.1/16 |
| FP Complex | fp0-0 | 10.0.2.0/16 | fp1-1 | 10.1.2.1/16 |
| CC Complex | cc[0..9]-0 | 10.0.3.[0..9]/16 | cc[0..9]-1 | 10.1.3.[0..9]/16 |
| Doppler D0 | N/A | 10.0.2.208 | N/A | 10.1.2.208 |
| Doppler D1 | N/A | 10.0.2.209 | N/A | 10.1.2.209 |
| Doppler D2 | N/A | 10.0.2.210 | N/A | 10.1.2.210 |

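The addressing pattern in the table can be captured in a small sketch. The complex-id mapping (RP=1, FP=2, CC=3) and the Doppler D host offsets at 208+ are read off the table above; they are an inference from the listed addresses, not a documented API.

```python
# Sketch of the EOBC addressing pattern implied by the table:
# 10.<logical_slot>.<complex_id>.<instance>/16, with Doppler D ASICs
# sitting in the FP complex at host addresses 208 onward.
COMPLEX_ID = {"rp": 1, "fp": 2, "cc": 3}  # inferred from the table

def eobc_addr(complex_name: str, logical_slot: int, instance: int = 0) -> str:
    """Return the EOBC IP for a complex instance in a given logical slot."""
    return f"10.{logical_slot}.{COMPLEX_ID[complex_name]}.{instance}/16"

def doppler_d_addr(logical_slot: int, index: int) -> str:
    """Doppler D n maps to host 208 + n in the FP complex subnet."""
    return f"10.{logical_slot}.2.{208 + index}"

# Examples matching the table:
assert eobc_addr("rp", 0, 0) == "10.0.1.0/16"
assert eobc_addr("cc", 1, 5) == "10.1.3.5/16"
assert doppler_d_addr(1, 2) == "10.1.2.210"
```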
File System Structure
| Partition | File System | Size | Alias | Use |
|---|---|---|---|---|
| /dev/flash1 | Ext2 | 512 MB | /crashinfo | Crashinfo files and core files |
| /dev/flash2 | Ext2 | ~12+ GB | /flash (same as /misc/scratch) | For IOS bootflash: |
| /dev/flash3 | Ext2 | 4 MB | /lic0 | Primary license file |
| /dev/flash4 | Ext2 | 4 MB | /lic1 | Secondary license file |
| /dev/flash5 | Ext2 | 32 MB | /obfl | OBFL |
| /dev/flash6 | Ext2 | 256 MB | /dpu | For three A15/A7's |

Data Plane
Passport: Centralized 4 Slot Chassis with DopplerD/G
Passport: Centralized 7 Slot Chassis with DopplerD/G
Passport: Centralized 10 Slot Chassis with DopplerD/G
Passport Bandwidth (Simplified View)
[Diagram: each Doppler D ((100-10-1)x2G) on the sup connects to the 720 Gbps stacks (2 rings) and, with 24 SerDes, to the sup<->line card backplane (32 links per slot on the 4 & 7 slot chassis, 24 on the 10 slot); each Doppler G ((100-20)G, 8 NIF) on a line card has 6 links to the active sup and 6 to the standby]
Line Card Bandwidth
• The line card bandwidth varies in different chassis and grows with the supervisor
• For example, the bandwidth of a 24x10GE line card:
| | 4 slot | 7 slot | 10 slot |
|---|---|---|---|
| Passport (D&G) | 3:2 oversubscription (line rate on DG, 3:2 on D) | 3:1 oversubscription (2:1 on DG, 3:2 on D) | Not supported |
| Imperial (E&G) | Line rate | 2:1 oversubscription (3:2 on EG, 4:3 on E) | 3:1 oversubscription (2:1 on EG, 3:2 on E) |
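The oversubscription ratios above are just front-panel bandwidth divided by per-slot bandwidth toward the supervisor. A small sketch of the arithmetic, using figures from the tables in this deck:

```python
# Oversubscription ratio = front-panel bandwidth / per-slot uplink bandwidth.
# E.g. a 24x10GE card is 240G of front panel; rate-limited to 160G/slot on
# the Passport 4-slot chassis, that is 240/160 = 3:2.
from fractions import Fraction

def oversub(front_panel_gbps: int, slot_gbps: int) -> Fraction:
    """Return the oversubscription ratio as an exact fraction."""
    return Fraction(front_panel_gbps, slot_gbps)

assert oversub(24 * 10, 160) == Fraction(3, 2)   # 3:2 on Passport 4-slot
assert oversub(48 * 10, 80) == Fraction(6, 1)    # 6:1, 48x10G on 80G/slot
```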
Passport: Centralized BW with DopplerD/G
| LC Type | # of G | Serdes (4 slot) | Serdes (7 slot) | Serdes (10 slot) | BW/slot (4 slot) | BW/slot (7 slot) | BW/slot (10 slot) |
|---|---|---|---|---|---|---|---|
| 48x1G | 1 | 6x10G | 6x10G | 6x10G | Line rate | Line rate | Line rate |
| 12x10G | 2 | 12x10G | 12x10G | 8x10G | Line rate | D 3:2 (*80G) | D/G 2:1/3:2 (*50G) |
| 24x10G | 4 | 24x10G | 12x10G | 8x10G | D 3:2 (*160G) | D/G 3:1/2:1 (*80G) | D/G 4:1/3:1 (*50G) |
| 48x10G | 6 | 24x10G | 12x10G | 8x10G | D/G 4:1/4:3 (*160G) | D/G 6:1/4:1 (*80G) | D/G 8:1/8:1 (*50G) |
| 24xmGig+24x1G | 4 | 24x10G | 12x10G | 8x10G | Line rate (*160G) | Line rate (*80G) | m=2.5G: G 1.5:1; m=5G: D/G 8:5/3:1 (*50G) |
(* Rate limited to)

Case Studies
Active/Standby with Doppler D/G
10G port: each Doppler G is connected to only one Doppler D
• Active sup can reach any port:
– Through the same Doppler D
– Through other Doppler Ds on the same sup
– Through mux to the standby uplink
• Each port can only be reached through one Doppler D
• Both ingress and egress are pinned
• Only one switch link is used to reach a given port in G
• The PCH header between Doppler D and G carries the sub-port (G's port), so G knows which port to send the packet to on egress
[Diagram: Doppler D1 on the sup connects to G1 and G2 on LC 1, which serve front panel ports 1-12]
40G with Single Connection
• A 40G front panel port must have a 40G SLI; it cannot use a 2x10G or 3x10G bundle
• Works the same way as 10G mode
[Diagram: Doppler Ds connect to Doppler Gs over 40G and 2x10G SLIs; the front panel ports are 40G]
Macallan: Centralized Active Standby
[Diagram: the Doppler D 3-way switch points to the cross link for Path#2; Path#1 and Path#2 are both 40G]
Congestion Management (1)
• Internal Flow Control: within the ASIC, use a stall or credit scheme at congestion points to limit buffering and achieve fairness
• External Flow Control:
– Port Based Flow Control (PFC)
– Class Based Flow Control (CBFC)
[Diagram: congestion points 1-16 across Doppler G (ingress/egress NIF, IQS, AQM, ESM, OCI, SLI) and Doppler D (SIF, BCN generation, IQS, AQM, ESM, OCI, ingress/egress NIF); pause frames arrive from and are sent to the network]
Congestion Management (2)
• Port Based Flow Control:
– LC NIF receives pause frame (1)
– Ingress NIF passes the PFC to egress NIF (2)
– Egress NIF arbiter stops requesting data from the egress port FIFO (EPF) for the duration defined in the pause message (3)
• Class Based Flow Control:
– LC NIF receives pause frame (1)
– Ingress NIF passes the flow control info to the egress scheduler manager (ESM) (4)
– ESM stops sending data to the corresponding port/queue defined in the pause frame
Congestion Management (3)
• OCI
– AQM (active queue management) compares the egress arrival rate against a threshold per port/queue; if it crosses the threshold (5)
– AQM sends a flow control message to the local SLI OCI
– OCI on the master Doppler G sends OCI frames to the supervisor Doppler D (6)
– Supervisor D receives the OCI data on the ingress SLI (7) and passes it to the ESM (egress scheduler manager)
– ESM stops scheduling to the corresponding SLI port/queue (8, 9)
Congestion Management (4)
• Backward Congestion Notification (BCN)
– Doppler D AQM compares the arrival rate per SLI port/queue; when it exceeds the limit (10)
– AQM sends a flow control message to the local stack interface (SIF) (11)
– SIF generates BCN to all ingress IQS (12)
– IQS in Doppler D receives the BCN, combines it with the ingress flow control status, and sends per-port flow control info to the egress SLI (13)
– Egress SLI generates an OCI message and sends it to the LC (14)
– LC Doppler G receives the OCI and generates flow control to IQS (15)
– IQS flow controls back to the ingress NIF
– Ingress NIF passes the flow control info to the egress NIF, which sends a pause frame to the network port (16)
Major Platform Modules
Polaris for Macallan (Centralized)
[Diagram: on each supervisor CPU (active and standby), the I/O Complex (CMAN-CC, IOMDs), RP Complex (IOSd, CMAN-RP, FMAN-RP, infra services) and FP Complex (FMAN-FP, FED, FWD infra, drivers) run on the Linux kernel over the platform HW and Doppler ASICs; the supervisors and the LC module HW connect over the backplane / system HW]
Macallan 4K: modular with central forwarding architecture
Architecture Baseline
• Macallan software is based on Polaris
– The kernel and tool chain are based on MontaVista CGE 7
• Linux kernel for ARMv8 CPU (version 3.10, 64 bit, big endian)
• 32/64 bit big endian applications
– We will run all applications in the 32 bit mode
– We will migrate to 64 bit mode when Polaris migrates to 64 bit
» The Macallan team will migrate the platform specific IOMD and FED
• The platform modules are based on Rudy
– FED 2.0 (including the Fed-lite library)
– VSS and Fex infra
Infrastructure Plane
[Diagram: IOSd and CMAN RP communicate over chasfs with CMAN FP, CMAN CC and the IOMDs; the drivers sit on the kernel above the platform hardware; PI components call platform-provided PD hooks]
Control Plane
[Diagram: IOSd (control plane protocols (L2/L3..), license mgmt, thread mgmt, interface (IDB), fast path drivers), WCM (wireless control plane) and SANET (session manager) connect to FMAN RP through platform SHIMs and the punt path]
Forwarding Plane
[Diagram: FMAN RP drives FMAN FP, which drives the FED (Forwarding Engine Driver) controlling multiple Doppler (D/E/G) ASICs; PI code calls platform-provided PD hooks]
Management Plane
[Diagram: CLI, SNMP and WEBUI sit on CRIMSON, above the control / forwarding / infra / services planes]
Macallan CM Architecture
[Diagram: the active CMRP (on the active RP) and the standby CMRP (on the standby RP) each manage CMCC instances for sup A, sup B and each LC, plus a CMFP; each CMCC pairs with the IOMD for its board, above the sup A, sup B and LC hardware]
Macallan CM Architecture (Cont.)
• Defined CM architecture
– Studied the CM architectures of ASR1K, UEA/Rudy and Overlord. There are similarities and differences among the 3 platforms
– Macallan CM is based mainly on Overlord because it is a superset and best fits the requirements of the Macallan system
– The Macallan CM has been reviewed and agreed upon by the BinOS team and the Polaris team
• Fully engaged with the Polaris team on the future CM architecture
Macallan CM Architecture (Cont..)
• Adopted the ASR1K CPLD control model
– All BinOS platforms adopted the ASR1K CPLD for a consistent interface between hardware and software (CM)
– Worked with the Macallan hardware team to define Ardbeg registers that provide a hardware/software interface compatible with the ASR1K CPLD
• Based on Jackpot CPLD
IOMD Architecture
[Diagram: in the CC complex, the IOMD connects to CMAN-CC and, via the OIR shim and PM shim, to IOSD and SW PM in the RP complex over chasfs; FMAN-RP, FMAN-FP and FED exchange L2/L3 config, status and stats, interface config, status and stats, fast link notifications and OIR events]
IOMD Architecture (Cont.)
• Feature definitions
– Interface (Layer 1) features, including port config,
port status (including interrupt), and port stats
FED 2.0 Architecture
FED in Macallan
[Diagram: in the RP/FP complex, the VSS manager and FED Lite-D (Doppler D library: SLI, SIF) sit beside CMAN-FP; the sup CC complex runs FED Lite-D (Doppler D library: NIF, SLI) and the LC CC complexes run FED Lite-G (Doppler G library: NIF, SLI) under their IOMDs; the interfaces cover fast link notification, PCIe discovery and interrupt dispatch, the chasfs role/inventory interface and VSL link mgmt; the SIF kernel connects Doppler D and Doppler G]
FED in Macallan (Cont.)
| Interface | Description |
|---|---|
| 1 | FED-Lite D library: NIF (uplink); SIF |
| 2 | FED-Lite G library: NIF; SIF |
| 3 | Fast link status notification |
| 4 | PCIe device discovery, interrupt dispatch |
| 5 | chasfs interface: role, inventory etc. |
| 6 | VSS interaction for VSL link mgmt |

Function Partitions
• FED provides two libraries
– FED Lite G – for Doppler G
– FED Lite D – for Doppler D
• IOMD manages Layer 1, including port config,
monitoring, interrupts, and stats
Silent Roll
• Macallan SR summary
– SR driver contents
• M3 images
• .so (for FED, IOMD and CM)
• .ko (drivers)
• .xml (PIO pin map)
– SR storage : eMMC (both Supervisor and linecard)
– SR manager : an adaptation of the NG3K installer for the
modular system
• Find more details at Macallan Silent Roll
Architecture (EDCS-1440260)
Green Features
• Efficient power management
– Dynamic power management based on the load (the
number of LC) in the chassis
• Put the power supply in the standby mode
• Operate the power supply in the peak performance range
• Efficient cooling management
– Dynamic cooling management based on the load (the
number of LC), the temperature and air flow in the chassis
• Adjust fan speed
• Selectively power off the fans
HA Architecture
[Diagram: the active and standby RPs each run IOSd, IOMDs and the other BinOS processes; IOS state syncs over RF/CF, chasfs syncs via rsync between the chassis managers, FMAN-RP/FMAN-FP state replicates through the TDL database, and FED and the CPLD (Ardbeg) sit on the HW HA infra]
VSS Centralized
[Diagram: the active and standby chassis each run IOSd, VSS Mgr, FMAN-RP, FMAN-FP, chasfs, CMAN-RP, CMAN-CC, FED and IOMDs, with RM/RMC and VSS Mgr clients on the LC complexes; the chassis connect over ICA and the VSL; SSO active apps, chassis apps, stateless apps and standby apps are kept in HA sync]
Packaging
| Complex | Type | Content |
|---|---|---|
| RP | rpbase | Kernel, initramfs, platform drivers, init scripts |
| RP | rpios-universalk9 | IOSd and associated libraries |
| RP | rpcontrol | Control plane processes (e.g., cman, fman) |
| RP | rpaccess | Software for router access |
| RP | WCM | Wireless Controller |
| FP | espbase | FP control process and drivers |
| FP | espdp | A15/A7 images |
| FP | espdriver | SR drivers and libraries |
| CC | sipbase | IO control process |
| CC | sipspa | IO drivers (IOMD) |
| CC | sipdriver | SR drivers and libraries |
ISSU
• Polaris supports two software upgrade modes
– Consolidated Package Mode (aka Super Package)
• Similar to classic IOS ISSU on Cat4k and Cat6k
• Will support this mode on Macallan
• ISSU switchover goal: the same amount of traffic loss as
the standard HA switchover
– Sub-Package Mode
• Each package can be individually upgraded
• Will not support this mode on Macallan
ISSU (Cont.)
• ISSU boundaries
– Software packages
• A message that communicates across a package boundary has ISSU implications
• Use TDL because it has built-in versioning support (build time syntax check)
– ioctl
– Packet format (punt-n-inject)
– Kernel and major libraries
System Scalability
Scalability
• Macallan supports Instant Access (IA), Converged Access (CA), and a combination of IA and CA
| | IA Only | CA Only | IA + CA |
|---|---|---|---|
| Passport | 1500 ports | 250 APs, 4000 clients | 1200 IA ports, 250 APs, 4000 clients |
| Imperial | 3000 ports | 500 APs, 8000 clients | 2400 IA ports, 500 APs, 8000 clients |
Scalability (Cont.)
• Macallan can support UPOE (Universal Power Over Ethernet) on every port in the chassis
– The maximum power on a UPOE port is 60W
– On a 10 slot chassis, Macallan delivers 23040 (8x48x60) Watts for all UPOE ports
• Macallan external power shelf
– Provides an additional 25.6 kW
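The 23040 W figure above is simply line cards times ports times per-port power. A one-line sketch of the arithmetic:

```python
# UPOE power budget: 8 line cards (10-slot chassis, 2 slots hold sups)
# x 48 ports per card x 60 W per UPOE port = 23040 W.
def upoe_budget_watts(line_cards: int, ports_per_lc: int = 48,
                      watts_per_port: int = 60) -> int:
    """Total power needed to run every port at full UPOE."""
    return line_cards * ports_per_lc * watts_per_port

assert upoe_budget_watts(8) == 23040
```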
CLI
Macallan CLI Challenge
• Conflicting requirements
– Macallan CLI needs to be consistent with
current Cat4500
• Macallan will be marketed as the next generation
Cat4500
– Macallan CLI needs to be consistent with
other platforms on Polaris
• Otherwise, it defeats the goal of Polaris
Macallan CLI Challenge (Cont.)
• Architecture decisions
– For applications and features (Layer2, Layer 3,
QoS, and Netflow etc.), will follow Polaris
• They are very similar since they are IOS based
– For platform commands (e.g., show module, show
inventory), will follow Cat4k
• There are big differences between Cat3k/Cat4k and
even bigger differences between the routing platforms
and switch platforms
Macallan CLI Challenge (Cont..)
– For interface naming, will follow Cat4k
• Will use the three level notation of Switch/Slot/Port (all 1-based)
– The Switch # will be 1 for a non-VSS system
– Switch # 1 and 2 are for VSS, and 100 onwards for Fex
– For ISSU, we need both "request platform software" and "issu loadversion/runversion", as we will continue to support both the install mode and bundle boot mode
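The Switch/Slot/Port convention above can be sketched as follows. The `GigabitEthernet` prefix and the helper names are illustrative examples, not the platform's actual naming code; the switch-number ranges are the ones stated above.

```python
# Sketch of the three-level Switch/Slot/Port naming (all 1-based):
# switch 1 on a non-VSS system, 1-2 for VSS members, 100+ for FEX.
def classify_switch(switch: int) -> str:
    """Classify a switch number per the ranges described above."""
    if switch >= 100:
        return "FEX"
    if switch in (1, 2):
        return "chassis"  # VSS member 1 or 2; always 1 on non-VSS
    raise ValueError(f"invalid switch number: {switch}")

def ifname(switch: int, slot: int, port: int) -> str:
    """Illustrative interface name using the hypothetical GigE prefix."""
    return f"GigabitEthernet{switch}/{slot}/{port}"

assert ifname(1, 2, 48) == "GigabitEthernet1/2/48"
assert classify_switch(100) == "FEX"
```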
Macallan CLI Challenge (Cont…)
– For "RP, FP, RP-0, RP-1" references, we will allow them in debugging commands and logging
• These references will be used in the "show platform" command chain
• These references will be used in log messages
• Find more details at Polaris/Macallan: CLI Support changes/estimates (EDCS-1439018)
References
• Macallan Hardware Architecture Specification (EDCS-1191896)
• Macallan Software System Functional Specification (EDCS-1417467)
• Macallan Software Functional Architecture and System (EDCS-1253619)
• Polaris VSS/FSS Requirements and Architecture (EDCS-1308266)
• ECSG Macallan Software Development wiki page
Back Up Slides
Data Path Acceleration
Data Path Acceleration Architecture
• Problem definition
– Hard to reach high packet processing rate in the Linux user space due
to context switch overhead and Linux network stack overhead
• Solution
– Dedicate cores to packet processing (i.e., data plane CPU)
• Run-to-completion model (no scheduling)
• Direct HW access with polling (no interrupt)
• Optimal memory/buffer allocation scheme
– Use case
• AVC
– Reference design
• Intel DPDK (Data Plane Development Kit) for Intel CPU
• Linaro ODP (Open Data Plane) for ARM
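The run-to-completion model above can be illustrated with a small sketch. This is not DPDK or ODP code; the queue, handler and budget names are illustrative, and Python stands in for what would be a dedicated C worker pinned to a data plane core:

```python
# Minimal sketch of a run-to-completion poll loop: a dedicated worker polls
# an RX queue in a tight loop (no interrupts, no scheduler) and runs the
# full packet pipeline inline before polling again.
from collections import deque

def run_to_completion(rx_queue: deque, handler, budget: int = 4):
    """Poll up to `budget` packets per pass and process each to completion."""
    processed = []
    while rx_queue:                      # a real worker would spin forever
        for _ in range(min(budget, len(rx_queue))):
            pkt = rx_queue.popleft()     # direct HW access would go here
            processed.append(handler(pkt))  # whole pipeline runs inline
    return processed

out = run_to_completion(deque([b"a", b"b", b"c"]), lambda p: p.upper())
assert out == [b"A", b"B", b"C"]
```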
Linaro Fastpath
Intel DPDK
DPU Case Study
Scope
• DPU in this study refers to the A15/A7
complex in Doppler D
– Not M3
– Not A15/A7 in Doppler G
• We study two types of protocols
– Aliveness protocols – keep alive with peers
• BFD, IPSLA VO, CFM, Flexlink(?)
– AVC – assist packet classification
• AVC, FNF, firewall(?)
BFD Requirements
• Assume a 50 ms detection timer and a 20 ms interval; each session requires 50 packets/second, tx and rx
• Assume 200 sessions (marketing); we need to support 10k packets/second, tx and rx
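The scaling numbers above follow directly from the transmit interval: a 20 ms interval is 1000/20 = 50 packets/second per session in each direction, so 200 sessions need 10,000 packets/second, tx and rx.

```python
# BFD packet-rate arithmetic: per-session rate = 1000 / interval_ms
# packets/second in each direction, scaled by the session count.
def bfd_pps(interval_ms: float, sessions: int) -> float:
    """Aggregate BFD packets/second (per direction) for `sessions`."""
    per_session = 1000.0 / interval_ms
    return per_session * sessions

assert bfd_pps(20, 1) == 50.0        # one session at a 20 ms interval
assert bfd_pps(20, 200) == 10_000.0  # the 200-session marketing target
```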
BFD HWO (Hardware Offload)
• HWO is part of the IOS BFD architecture
• BFD HWO is developed on many platforms
– BFD HWO is implemented by FPGA on Cat4k/Cat6k
– BFD HWO is implemented by LC CPU on N7k
– BFD HWO is implemented by network processor
on some routers
• What about Macallan?
– DPU is a natural choice
BFD HWO Features
| BFD HWO Features | DPU support |
|---|---|
| Send and receive all BFD packets through hardware | DPU tx/rx packets via Doppler core |
| Send/punt select packets to the BFD state machine for session creation, deletion and timer update | DPU tx/rx packets to RP via Doppler core |
| Keep track of missed packets and report timeouts to the BFD state machine on RP | DPU implements the timeout logic; DPU triggers an interrupt on timeout |
| Collect statistics for packets sent and received | DPU uploads stats via EOBC |
Macallan BFD Arch with DPU
[Diagram: on the RP, IOSD BFD talks to FMAN and FED BFD over TDL messages; FED configures the Doppler and the DPU BFD via register access over PCIe and IPC messages over the EOBC switch; BFD packets flow through the punt-n-inject driver and select packets are punted to the RP; the DPU collects stats and raises a Doppler interrupt on timeout; the other Dopplers forward packets to this DPU's Doppler core]
Macallan BFD Arch with DPU (Cont.)
• Only the DPU in one Doppler D is used
– The other two Doppler Ds will forward packets to this Doppler D's DPU
• No FED knowledge is needed on the DPU
• HA
– The BFD in IOSd will sync to standby
– Does BFD HWO need to sync?
• Indirect: DPU-IOSd-IOSd-DPU
• Direct: DPU-DPU
• Since BFD is not supported on Cat3k, BFD is a new feature to the FED
• BFD HWO can send a message to IOSd to report a neighbor status change. It can also trigger an interrupt to accelerate the report; in that case, an ISR is needed to collect the info and notify the BFD in IOSd quickly
Firmware Consideration
• One alternative is to run firmware on A15/A7
– Each core runs one application in a tight loop without an OS
• Just like M3
– May work if all the applications are similar to BFD
• Disadvantages
– Complexity: the cores share memory, interrupts, the MAC and all the IO devices. Managing them correctly across all the cores amounts to building a mini OS
• It actually requires more resources to develop this solution once we have more than one application
– Scalability: super fast for one core, but does not scale to multiple cores
• What if an application needs to be multithreaded?
FNF
• DPU assistance to FNF
– Collect NF data
– Export them to the collector
– Manage the expiration of the flow
• DPU/RP interaction
– RP programs the NF TCAM
– The DPU in each Doppler D will be active
• Need more investigation
Macallan AVC Arch with DPU
[Diagram: FED's NBAR control plane and NBAR PD ('show AVC') configure the NBAR data plane on the DPU via TDL messages, IPC messages over the EOBC switch, register access over PCIe and inter-thread communication; the NBAR data plane maintains a flow table that syncs to the other DPUs; packets arrive through the Doppler core punt-n-inject driver; the other two Dopplers have the same architecture]
AVC with DPU
• This architecture is based on the Polaris NG3K
AVC architecture for wired ports
– DPU is used to assist in the data plane processing
– Since DPU will do the packet inspection, we will
not tightly integrate AVC with FNF (like the NG3K)
DPU Requirements Summary
| Requirements | BFD | AVC |
|---|---|---|
| OS | Linux (Polaris?) | Linux |
| Multiple DPU | No | Yes |
| Distributed FED (FED components on DPU including Doppler programming) | No | No |
DPU Case Study References
• EDCS-845615: IOS BFD Offload Software
Design Specification
• EDCS-902869: Polaris NG3K AVC for Wired
Ports High Level Software Architecture
System Flow on Critical Events
Linecard Insert/Remove
[Diagram: the kernel driver detects the OIR event by polling the hardware; CMAN-CC updates chasfs; CMAN-RP spawns/kills the IOMD and notifies IOSD (OIR notification, DPIDB interface updates) via chasfs update/notification; FMAN-RP, FMAN-FP and FED are updated in the RP/FP complexes]
Linecard Online
[Diagram: CMAN-CC brings the IOMD online; chasfs updates propagate to CMAN-RP, and IOSD receives the OIR notification via chasfs update/notification]
Link Up/Down
[Diagram: the kernel detects link up/down by polling the hardware; FED sends a fast link notification to the IOMD, FMAN-FP and FMAN-RP, and IOSD is notified]
Future Macallan
Doppler E and Imperial
• Doppler E will not include A15 cores due to space constraints
• A data plane processor (DPP) is needed to support packet acceleration for Doppler E
– Option 1: one DPP for all the Doppler Es
– Option 2: merge the DPP into the main RP
– One DPP per Doppler E (too expensive)
CPU Architecture with Imperial (Option 1)
[Diagram: the RP connects over PCIe and the EOBC Ethernet switch to the Doppler Es on the supervisor (each with A7 and M3 cores only), a separate DPP, a System FPGA (M3), the standby supervisor, and the Doppler G line cards (A15/A7 and M3 cores)]
CPU Architecture with Imperial (Option 2)
[Diagram: same as Option 1, except the DPP function is merged into the main RP (RP/DPP) and there is no separate DPP device]
CPU Architecture with Imperial
| Board | Chip | CPU | OS | BIPC | Application |
|---|---|---|---|---|---|
| RP | Intel or ARM | | Polaris | Yes | Option 1: everything; Option 2: a portion of the cores dedicated to DP acceleration |
| DPP | ARM | | Polaris w/ DP acceleration | Yes | Option 1 only |
| Sup | Doppler E | A7(4) | ?? | No | |
| Sup | Doppler E | M3(4) | Firmware | No | IODMA |
| Sup | System FPGA | M3(4) | Firmware | No | IODMA |
| Linecard | Doppler G | A15(2) | - | - | Not used (no memory) |
| Linecard | Doppler G | A7(4) | - | - | Not used (no memory) |
| Linecard | Doppler G | M3(4) | Firmware | No | IODMA, POE, 1588, ? |
Imperial: Option 1 vs. Option 2
| | Option 1 | Option 2 |
|---|---|---|
| Cost | More expensive (needs memory and USB flash for the second CPU); if we put the DPP with its memory and USB flash on a DB (make it a FRU), the customer can choose to pay for more data processing capability | Lowest cost |
| Software complexity | More complicated because there are two OSes running; however, still simpler than the Passport model, where there are 3 DPP instances | Simplest |
| Ease of use | More complicated because the customer will inevitably be exposed to the second CPU (e.g., debugging high CPU utilization in the DPP, collecting core files from the DPP); more images for ISSU and SR | Easiest |
Imperial: Centralized BW with DopplerE/G
| LC Type | # of G | Serdes (4 slot) | Serdes (7 slot) | Serdes (10 slot) | BW/slot (4 slot) | BW/slot (7 slot) | BW/slot (10 slot) |
|---|---|---|---|---|---|---|---|
| 48x1G | 1 | 6x10G | 6x10G | 6x10G | Line rate | Line rate | Line rate |
| 12x10G | 2 | 12x10G | 12x10G | 12x10G | Line rate | Line rate | Line rate |
| 24x10G | 4 | 24x10G | 16x10G | 12x10G | Line rate | 3:2 oversub | 2:1 oversub |
| 48x10G | 6 | 30x10G | 12x10G | 12x10G | 8:5 oversub | 4:1 oversub | 4:1 oversub |
| 24xmGig+24x1G | 4 | 24x10G | 15x10G (8+7) | 12x10G | Line rate | Line rate for m=2.5 and 5 | m=2.5G: line rate; m=5G: 3:2 |
Imperial: Centralized 4 Slot Chassis with DopplerE/G
Imperial: Centralized 7 Slot Chassis with DopplerE/G
Imperial: Centralized Arch : 10-slot, 80G/Slot
Misc.
Doppler Family for Macallan
Doppler G Doppler D Doppler E
Platforms 2K/IE/Stub/CS NG4K LE/Next3K EBG/NG4K SUP
Process 28 nm 28 nm 16nm
First Silicon CY'Q2 14 CY'Q2 14 CY’Q1 15
Cost/ Target $40 $220 ~$315
Power/ Target ~25W (9W conf) ~110W ~105W
Die Size ~140mm2 ~480mm2 ~480mm2
Bandwidth 80G / 120Mpps 200G/ 300Mpps 640G / 720Mpps
Ports 52 (8x10G) 144 (20x10G) 390 (48x10G+4x40G)
Fabric/Stk/Stb 80G Stk, 60G Stb 720G Stk, 160G StbM 720G F/Stk, 480G StbM
Packet Buffer 6MB 32MB(16+16) 36MB
CPU Complex A15*2,A7*4,M3*4 A15*4,A7*4,M3*4 M3*4, A7*4
MAC 32K 64K Up to 416K
FIB v4/v6 8K/4K 64K/64K Up to 416K/416K
Multicast 8K/4K 32K/16K Up to 416K/416K
ACL v4/v6 4K/2K 48K/24K 64K/32K
Netflow v4/v6 32K/16K 128K/64K Up to 128K/128K bidir
AP/Client/DTLS 100/3K/20G 1000/16K/80G 1000/16K/80G
Adjacencies 16K 64K 96K
BD/LIF 4K/1K 4K/4K 8K/16K
ECC HW int HW int HW+Cntrl(FIB/L2/L3)
PSV capacity 1.0X 1.0X 1.2X
Punt and Inject
BinOS HA Architecture
[Diagram: on both active and standby, pvp.sh and pman.sh launch IOSd, the apps and cmand; cmand uses the CPLD driver, chasfs (control, process, oir dirs), TDL (Table_def API), Elcaro and issu_boottime.sh; the two sides exchange epoch state over HW signals]
Macallan Chassis Family
[Photos: chassis with power supply, fan tray, supervisor and line card slots]
Johnny, 4 Slot, 6 RU (Phase 1)
• 17.22" x 10.0" x 16.25"
• 2 Supervisors, 2 Line Cards
• 4 x 3.2KW Power Supplies, 1 Fan Tray
• Support Power-Shelf Option
• Bandwidth: 320G/Slot (Centralized), 480G/Slot (Distributed)
Chivas, 7 Slot, 10 RU (Phase 1)
• 17.22" x 17.45" x 16.25"
• 2 Supervisors, 5 Line Cards
• 8 x 3.2KW Power Supplies, 1 Fan Tray
• Support Power-Shelf Option
• Bandwidth: 320G/Slot (Centralized), 480G/Slot (Distributed)
Highland, 10 Slot, 13 RU (Phase 2)
• 17.22" x 22.7" x 16.25"
• 2 Supervisors, 8 Line Cards
• 8 x 3.2KW Power Supplies, 1 Fan Tray
• Support Power-Shelf Option
• Bandwidth: 240G/Slot (Centralized); Distributed not planned
Dalmore: Power Shelf, 3 RU (Phase 2)
• 17.22" x 22.7" x 16.25"
• 8 x 3.2KW Power Supplies
Sup Control Func Blocks
[Diagram: supervisor control functional blocks]
Branch
[Timeline: Macallan_dev and FED2.0_dev branch off Polaris_dev; Macallan DTHO Nov 13 and Feb 14, FCS May 14; Polaris_dev releases 3.14, 3.15, 3.16 and 3.17 sync with mcp_dev at 3/2014, 7/2014, 11/2014 and 3/2015]
Polaris Engagement
• CLI
• Silent Roll
• Switch PM
• ISSU
• VSS/IA
• ARM Infra
