
NEW EBOOK:

DATA CENTER NETWORK AUTOMATION. SIMPLIFIED.
Mike Capuano
Table of Contents

Introduction

Why Every Size IT Team Should Strive to Implement a Software-Defined Data Center
  Public vs. Private Cloud
  Software-Defined Data Center
  Pluribus Networks – Putting SDDC and Private Cloud Within Reach

A Foundation of Open Networking for Network Automation
  Open Networking Has Matured
  Open Networking Switches Are Becoming More Powerful
  Support has Matured
  Brownfield or Greenfield
  Bottom Line

The "Easy Button" for SDN Control of Physical and Virtual Data Center Networks
  The Underlay
  Controller-based vs. Controllerless SDN Underlay Automation
  The Overlay
  A New Approach to Overlay Fabric Networking
  Underlay and Overlay Unified – It Just Works

Network Analytics Without Probes, TAPs and Packet Brokers
  Traditional Approaches to Analytics
  Another Approach to Telemetry and Analytics
  Insight Analytics

The Importance of Network Segmentation for Security and Multi-Tenancy
  Automating Network Segmentation
  More Efficient Utilization of Security Devices
  Integrated Telemetry and Analytics

Summary

Additional Resources



Introduction
We are living in a multi-cloud world where certain workloads work best in specific cloud
environments – large hyperscale public clouds, private hosted clouds or on-premises
(on-prem) or colocation (colo)-based private clouds. A recent survey by IDC found that
approximately 75% of workloads will reside in hosted private cloud or in on-prem or
colo-based data centers supporting end-user-owned private clouds. In light of this, data
center operators need to build IT foundations for private cloud that are agile and that enable
IT teams to focus on outcomes that support the digital transformation of their businesses.
Thus, on-prem or colo-based single-site or multi-site data centers must be fully automated
and orchestrated as software-defined data centers (SDDC) to support private cloud.
Unfortunately, while compute and storage automation and virtualization have made leaps
and bounds over the last decade, software-defining and virtualizing the data center network
has continued to be extremely complicated and expensive. This has put SDDC and private
cloud out of reach for lean IT teams operating small and medium single-site or multi-site data
centers. This eBook takes a look at these challenges and outlines some solutions that can
help small IT teams achieve SDDC cost effectively.



CHAPTER 1
Why Every Size IT Team Should Strive to
Implement a Software-Defined Data Center

In a fast-moving and increasingly digital world, businesses of all sizes are working toward digital transformation (DX) to stay competitive and grow profitably (Figure 1). This means rethinking existing application workloads and identifying new applications that might be required for DX.

Digital Investments Are Paying Off

[Chart: Financial Impact of Digital Transformation (2013-2017 CAGR) – revenue growth and profit growth for digital manufacturers vs. non-digital manufacturers. Digital manufacturers show positive growth on both measures, while non-digital manufacturers show declines.]

Figure 1: IDC's "DX Reinvention — The Race to the Future Enterprise," Doc #DR2019_GS4_MW, March 2019

Public vs. Private Cloud

On this DX journey, the allure of the hyperscale public cloud is undeniable: no need for infrastructure, spin up workloads quickly, easily spin them down (if you remember) and scale nearly infinitely. This has driven more and more businesses to use public hyperscalers for development and testing (dev and test), as well as production workloads, either sponsored by IT or as shadow IT projects directly out of business units. That said, many IT teams have now built experience and an understanding of which type of workloads make sense in the public cloud and which do not, based on criteria such as cost, performance, security, data privacy and more.

As a result, while many businesses continue to move workloads to hyperscalers, they are also making decisions to keep specific workloads on-prem or in colo facilities running on end-user-owned data center infrastructure, or alternatively, moving to private hosted clouds run by managed service providers (MSPs) and cloud service providers (CSPs). In fact, IDC's "Cloud Repatriation Accelerates in a Multicloud World," Doc #US44185818, August 2018, showed that 81% of enterprises had initiated some sort of repatriation activity (pulling back workloads from the public cloud). So, while the growth of hyperscale public cloud continues, and even as we see enterprises consolidating or closing down their data center real estate, it is clear that on-prem and especially colo data center infrastructure owned by end-users, as well as hosted private cloud, are both growing and here to stay as part of a multi-cloud world. As can be seen in figure 2, from the same IDC survey, the majority of workloads will be in private cloud environments in the "ideal" state.

Substantial Workload Shifts to Cloud Environments

Private cloud is a strong focus for on- and off-premises solutions. Respondents were asked what percent of their organization's applications are currently deployed in the following venues, and what percent would ideally be deployed in each venue if they could start over tomorrow without legacy IT decisions (each column sums to 100):

Venue                          Today    Ideal State
SaaS (public cloud)             11%       13%
IaaS/PaaS (public cloud)        10%       12%
Hosted Private Cloud            14%       17%
On-Premises Private Cloud       30%       31%
Off-Premises Non-Cloud          10%       10%
On-Premises Non-Cloud           26%       17%

IDC's annotations: for public cloud, focus shifts from efficiencies and cost savings to speed and performance; application churn, particularly for non-cloud on-premises apps, favors private clouds; and cloud management becomes the glue for the increasingly complex application portfolio.

Figure 2: The majority of workloads will reside in on-prem or hosted private cloud environments. From IDC's "Cloud Repatriation Accelerates in a Multicloud World," Doc #US44185818, August 2018

Software-Defined Data Center

What is important, then, whether you are an end-user business owner of data center infrastructure or a non-hyperscale MSP or CSP, is to ensure that your on-prem or colocation data center infrastructure can provide a cost-effective, high-performance and highly automated SDDC foundation to support private cloud. SDDCs generally consist of software-defined networking (SDN) of the physical network (underlay) along with virtualized network (overlay), compute and storage, all coordinated by a higher-level orchestrator such as vCenter, OpenStack or Kubernetes.

Compute and storage virtualization, and now containerization, have made leaps and bounds over the last decade. Yet software-defining and virtualizing the data center network has continued to be extremely complicated and expensive – putting SDDC and private cloud out of reach for lean IT teams operating small and medium single-site or multi-site data centers. This is unfortunate because digital transformation is critical, and it is clear that businesses that have automated their data center networks with SDN have seen clear economic benefits, as highlighted in the Enterprise Strategy Group's survey results displayed in figure 3.

How would you characterize your company's timeliness developing and launching new products and services, relative to its competition? (percent of respondents)

                                                  Implementing SDN (N=556)   Not implementing SDN (N=3,340)
Usually significantly ahead of our competition            49%                        13%
Often ahead of our competition                            39%                        32%
Usually in line with or behind our competition            11%                        55%

Figure 3: Organizations that deploy SDN in the data center are significantly more competitive. From Enterprise Strategy Group, 2018

SDN provides new levels of network automation to accelerate IT transformation:

• 97% of transformed companies have committed to SDN
• SDN users are 3.5 times more likely to be significantly ahead of their competitors in time to market (49% versus 13%)
• 2.5 times more SDN users made excellent progress enabling a rapidly elastic data center environment (46% versus 18%)

Pluribus Networks – Putting SDDC and Private Cloud Within Reach

At Pluribus Networks, we believe there is an approach based on open networking and open source principles for businesses of all sizes to achieve SDN, network virtualization and network analytics cost-effectively – putting SDDC and private cloud within reach of every IT team, even for small data centers and lightly staffed IT teams. It is clear that breaking through the barrier of automating the network is the critical hurdle to supporting SDDC and private cloud.



CHAPTER 2
A Foundation of Open Networking
for Network Automation

In Chapter 1, I talked about the fact that, while many workloads are moving to hyperscale public clouds, many will continue to run either in data centers with end-user owned infrastructure (on-prem or in colos) or in hosted private clouds. I also reviewed the business benefits of transforming to an SDDC and, in particular, focused on the challenges of providing a software-defined underlay and virtual overlay networking infrastructure, which has been the Achilles heel for IT teams in terms of achieving SDDCs.

Given the clear benefits of SDDC transformation, is there an affordable and simple approach to get there? Is there an "Easy Button" that makes it feasible for even small and medium data center operators? We believe the answer is clearly yes.

To understand how, I will look at three sets of questions:

1: Should I deploy open networking or go with a vertically integrated vendor when it comes time to upgrade, expand, migrate or consolidate my data center and to support SDDC? How much risk is there in open networking, and what is the support model?

2: Do I want my leaf-and-spine physical network to be deployed as an SDN fabric, or am I comfortable with box-by-box configuration, operations and troubleshooting? Is the cost and complexity of deploying SDN worth the effort? What are the various approaches?

3: Do I want to create a virtualized network overlay fabric that abstracts the physical network into a set of software-based virtual tunnels between all switches, supported by software-based routers, switches and load balancers, that offers the ability to establish new network topologies and services in seconds? Is the cost and complexity of deploying a virtual network worth the effort? What are the various approaches?

These questions can be asked in any order, and often can and should all be asked and investigated in parallel. Here I will focus on the first set of questions, about open networking, and then address SDN and network virtualization in Chapter 3.

Open Networking Has Matured

Over the last decade, some customers have been reluctant to take a risk on open networking because of its perceived immaturity and concerns with service and support in a model where software comes from one company and hardware from another. This is juxtaposed against the tremendous benefits that have been achieved from disaggregating software from hardware, including driving capital costs down by up to 50% and, more importantly, speeding hardware and software innovation by enabling separate innovation paths and leveraging an open source community approach. For example, Pluribus utilizes Free Range Routing (FRR), an open source codebase that sits under the auspices of the Linux Foundation and gives us the base routing code for our Netvisor® ONE Network Operating System (NOS). Because we leverage this core set of code from FRR, we are able to apply our software engineering resources to quickly innovate around the edges, focusing on key use cases for our customers and our unique approach to SDN and network virtualization, and contribute code back upstream for others to leverage. There is no doubt that disaggregation itself speeds innovation: unlike with vertically integrated vendors where hardware and software are highly intertwined, there is no hardware dependency that increases complexity and slows down feature velocity. Pluribus and other open software-only solutions can innovate quickly in a DevOps model and issue frequent software releases with new capabilities, while the hardware is innovated in parallel by the likes of chip vendors such as Broadcom and system-level hardware vendors such as Edgecore, Dell EMC and Celestica.

Open networking has been widely deployed by the hyperscalers and is now moving into the mainstream as IT teams become more comfortable with the technology's performance and quality, as well as support from open networking vendors. As a result, open networking – also referred to as "white box" or "bare metal" switching – is growing faster than other switching categories (figure 4).

Open networking adoption with bare metal switching grows

[Chart: Data Center Ethernet Switch Port Forecast by Type (CY17-CY23) – bare metal vs. general purpose, purpose-built and blade switch ports.] Key takeaways:

• Bare metal switches to reach 31% of ports shipped in CY23, up from 20% in CY18
• Expected YoY revenue growth of 46% for CY19
• Open networking switch ecosystems selling to enterprise are expanding
• Open switch designs certified by the Open Compute Project continue to increase
• The maturity and availability of options for switch OSs continues to improve

Source: IHS Markit Data Center Networks Intelligence Service, March 2019

Figure 4: White box switching is growing faster than any other switching category

For example, AT&T has publicly committed to the white box and open source path across multiple places in its network. Many other institutions, from cloud service providers to K-12 school districts and local governments to enterprises, have deployed open networking with great success.

Pluribus is deployed with over 240 customers today, including deployments in over 60 virtualized (NFVi) 4G/5G mobile cores of Tier 1 service providers, where our software is carrying the traffic of hundreds of millions of mobile data subscribers. These sorts of mission-critical, large-scale deployments have allowed the software and hardware technology to mature and be hardened.



Open Networking Switches Are Becoming More Powerful

This large number of deployments has led to important feedback going to the open networking hardware vendors and resulted in rapid innovation as well as increased performance, not only in terms of the data plane but also control plane processing power, memory and architectural innovations. For example, 32x 100 GbE white box switches can now be sourced with Intel Xeon Broadwell 12-core processors, 8/16/32G of RAM and 256G or larger SSDs, providing a powerful server-like platform that complements high-performance Broadcom network processing units (NPUs) like the Trident 3. The system-level architecture of these platforms, shown in figure 5, has also matured, with some manufacturers implementing two parallel 10G network interfaces between the Intel CPU and the Broadcom ASIC, providing high-speed links to support significant control and management plane traffic. This has resulted in not only wire-rate performance in the data plane but also the ability to run containerized workloads with significant traffic throughput in the control plane, such as tens of virtual routers with high performance, making these switches suitable for the most demanding single- or multi-tenant network environments.

Next-Gen Open Networking/White Box Switching: Enabling Consolidation of Multiple Infrastructure Layers

Switch designs now leverage server design elements:
• Powerful multicore x86 CPU architectures
• Internal 10G NICs (dedicated links between the CPU complex and the switch ASIC)
• Onboard solid state storage
• Baseboard management controller (BMC)
• Trusted platform module (TPM)

This enables the NOS to offer containerized/virtualized applications with high-speed data workloads, e.g., network services (such as a virtual firewall leveraging the internal 10G NICs), application flow telemetry and PCAP capture, and multi-tenant routing engines.

Figure 5: White box switches become more server-like and include dual 10G internal NICs

In spite of all of our talk of network virtualization, we will always need the physical underlay for connectivity. Our view at Pluribus is that if this latent server-like processing power is being deployed anyway, it should not go to waste. With clever software, one can leverage these platforms to run containerized applications like SDN, network virtualization, virtual network functions (VNFs), virtual routers, network analytics and more. This novel approach is covered in more detail in Chapter 3.



Support has Matured

Customers have become more comfortable with support from open networking software and hardware vendors. The support model does depend on the vendor partnership structure, but these models have been set up, exercised and polished over the last decade. For example, in the case of Pluribus' partnership with Dell EMC, which has an extremely large global sales and support infrastructure, Dell EMC will take first- and second-level support, with Pluribus providing third-level technical software support. In the case of Edgecore or Celestica, Pluribus takes first- and second-line support and brings in the hardware vendors if needed. Pluribus has a follow-the-sun model, with 24×7 support out of our offices in Santa Clara, California and Bangalore, India.

Brownfield or Greenfield

Any solution can be used to build a greenfield leaf-and-spine data center network once basic proof-of-concept lab testing is complete. However, in many instances, IT teams will want to insert a few leaf switches at the top of one or two racks into a brownfield data center to get a feel for open networking performance, stability and usability. In such a case, the data center network might have a pair of existing spine switches from a traditional vertically integrated vendor like Cisco, Arista or Juniper. Many open network operating systems, including Pluribus' Netvisor ONE OS, are designed to use standards-based Layer 2 and Layer 3 protocols and can easily insert into such a scenario. An exception is when open networking solutions use a centralized SDN controller running on multiple servers to hold network state and program the switches with the OpenFlow protocol. In this case, the spine switches must be replaced with white box spines running the same OpenFlow-based OS that is running on the leaves, effectively limiting this type of solution to greenfield-only deployments. Another approach that effectively requires greenfield environments is hardware-bound SDN implementations like Cisco ACI. ACI requires specific switches with specific hardware, typically requiring a rip-and-replace of existing infrastructure to deploy. The hardware dependency adds a layer of complexity and fragility that can hamstring and overwhelm IT teams trying to deploy SDN.

Bottom Line

Open networking provides tremendous innovation and has been operating in mission-critical networks around the globe for a number of years – the code has been stressed and hardened in real-world deployments at scale. For example, the traffic from hundreds of millions of mobile subscribers is running through the Pluribus Netvisor ONE OS and Adaptive Cloud Fabric™ today, as we are deployed in over 60 virtualized 4G and 5G mobile cores and over 240 customer environments. In addition to being hardened, the cost-to-performance ratio and feature velocity of the hardware and software available from open networking solutions is compelling. There has never been a better time to take a hard look at this technology in pursuit of lowering your CapEx, benefiting from modern automation, breaking free of vendor lock-in and enjoying an increasing rate of innovation.



CHAPTER 3
The “Easy Button” for SDN Control of
Physical and Virtual Data Center Networks
(Especially for Space- and Cost-Constrained Environments)

In Chapter 2, I posed three questions regarding automating the data center network in pursuit of building an SDDC to support private cloud. There I focused on question 1 – should I deploy open networking? – and highlighted how open networking switches have developed powerful, server-like control planes featuring multi-core CPUs complemented with plenty of RAM and flash storage.

In this chapter I will focus on two more sets of questions:

• Do I want my leaf-and-spine physical network to be deployed as an SDN fabric, or am I comfortable with box-by-box configuration, operations and troubleshooting? Are the cost and complexity of deploying SDN worth the effort? What are the various approaches?

• Do I want to create a virtualized network overlay fabric that abstracts the physical network into a set of software-based virtual tunnels between all switches, supported by software-based routers, switches and load balancers, that offers the ability to establish new network topologies and services in seconds? Is the cost and complexity of deploying a virtual network worth the effort? What are the various approaches?

The Underlay

When first rolling out a data center leaf-and-spine network, one needs to deploy physical switches for connectivity. Typical deployments have two 10 GbE/25 GbE connections from each server to a top-of-rack (TOR) switch, which in turn connects via 100 GbE uplink to a spine switch. In an open networking world, the network operator loads their choice of open source NOS via the Open Networking Install Environment (ONIE) onto the switch, with the NOS running on the switch CPU (e.g., from Intel) and programming the forwarding ASIC (e.g., from Broadcom).

In a leaf-and-spine network without SDN automation, each of these switches must be configured, managed and monitored individually, typically through a command line interface (CLI), which consumes a lot of time from deeply technical networking experts. There are DevOps tools like Red Hat Ansible that can help automate some steps, but fundamentally provisioning, operations and troubleshooting still happen box by box. On the other hand, SDN can completely automate the underlay and make 10, 20, 30 or more switches look like one logical programmable entity. In other words, instead of managing 32 switches, for example, the operator manages a single logical switch with a single IP management address, dramatically increasing agility, simplifying management, reducing human errors and enabling less skilled techs to operate the network.
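To make that operational difference concrete, here is a minimal Python sketch contrasting the two models. The switch names, REST endpoints and payloads are hypothetical, invented for illustration rather than taken from any vendor's actual API; the point is simply that box-by-box automation still touches every device, while an SDN fabric exposes one logical entity.

    import requests

    SWITCHES = [f"leaf-{i:02d}.example.net" for i in range(1, 33)]  # 32 switches

    def add_vlan_box_by_box(vlan_id: int) -> None:
        """Box-by-box model: one session per switch, 32 chances to drift or fail."""
        for switch in SWITCHES:
            requests.post(f"https://{switch}/api/v1/vlans",
                          json={"id": vlan_id}, timeout=5)

    def add_vlan_fabric_wide(vlan_id: int) -> None:
        """SDN fabric model: one call to a single logical management address.
        The fabric itself propagates the change to every member switch."""
        requests.post("https://fabric.example.net/api/v1/vlans",
                      json={"id": vlan_id, "scope": "fabric"}, timeout=5)

    if __name__ == "__main__":
        add_vlan_fabric_wide(100)  # one operation instead of 32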



Controller-based vs. Controllerless SDN Underlay Automation

How automation is achieved is where things get interesting. The more traditional SDN approach is based on the Open Networking Foundation's (ONF) OpenFlow protocol. In this architecture, all the network fabric state is kept in a centralized SDN controller, physically separate from the network switches, and the forwarding tables in the switches are programmed via the OpenFlow protocol. Examples of solutions that use this approach come from Big Switch Networks or leverage open source controllers such as OpenDaylight (ODL) or ONOS. To achieve high availability, typical best practice dictates three redundant controllers per data center location. This is certainly a workable approach for a single large data center, but it can run into economic feasibility challenges with smaller or geographically distributed sites, because the network team needs to pay for and deploy three servers along with licensing for three controllers at every site. Two sites equals six controllers, three sites equals nine controllers and so on.

The alternate approach is to deploy a distributed SDN function that runs as an application in the user space of every switch in the network – leveraging the power of distributed computing – with the state of the fabric distributed into an extremely space-efficient micro database resident on each switch. This is the approach Pluribus takes with our Netvisor ONE and Adaptive Cloud Fabric. With this "controllerless" approach, there is no set of external controllers and thus no associated hardware or software controller costs or unnecessary consumption of space and power, which can be problematic in constrained environments. There are a number of other benefits that come with the controllerless approach, including the ability to easily insert into brownfield networks, improved resiliency with in-band control and the ability to seamlessly stretch across geographically distributed sites regardless of distance, with no need for multiple controllers at every site.

Once SDN is deployed, the underlay can be managed as a single fabric, so it's easy to make configuration changes or troubleshoot across all switches in the fabric with a single command or query. SDN can be used to set up any topology for the underlay, including Layer 2 or Layer 3 leaf and spine, and automation increases agility and reduces operational costs and human errors significantly.

The Overlay

Once you have the underlay established, the next step is to virtualize the network by creating an overlay network. The overlay network is constructed by creating software tunnels between virtualized endpoints with an encapsulation technology such as VXLAN, software-based switches and software-based routers. This method of software abstraction is agnostic to the physical server connections and the underlying network topology.



Overlay network virtualization has a number of important benefits, including scalability, multi-tenant security and operational agility.

Underlay networks, including OpenFlow-based networks that only implement VLAN-based flow redirection without an overlay, are typically limited to 4094 VLANs, while VXLAN-based overlays enable scaling to over 16 million virtual network segments. This is especially valuable for multi-tenant service provider networks because it enables each tenant to control its own VLAN numbering and scale independently of other tenants, up to the full range of 4094 VLANs per tenant.
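The scale difference follows directly from the header formats, which are standardized: an 802.1Q VLAN tag carries a 12-bit VLAN ID (with two reserved values), while the VXLAN header carries a 24-bit VXLAN Network Identifier (VNI). A two-line check in Python:

    # 802.1Q: 12-bit VLAN ID, values 0 and 4095 reserved -> 4094 usable VLANs
    print(2**12 - 2)   # 4094
    # VXLAN: 24-bit VNI -> over 16 million virtual network segments
    print(2**24)       # 16777216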
Overlay networks also improve security in those multi-tenant networks by providing an additional layer of abstraction and full isolation between customers, as shown in figure 6. That level of isolation is also highly valuable in other use cases, including separating Internet of Things (IoT) traffic from mission-critical corporate traffic to reduce attack surfaces and efficiently utilizing firewall ports and other security devices. In chapter 5, I discuss the importance and benefits of network segmentation in greater detail.

Overlays also enable far better scalability and service agility when stretching services across multiple sites and arbitrary topologies, because they decouple logical service provisioning from the underlying network topology and take advantage of the optimized resilience and efficient link utilization of standard Layer 3 underlay technologies, such as equal-cost multipath (ECMP) load balancing. With a virtualized overlay network running completely in software, new network services can be quickly established without having to touch the underlay.

Figure 6: Overlay networking creates software-based networks that can be defined per tenant



Like SDN for the underlay, there are multiple approaches to network virtualization. One approach has the software switches and routers running on the same servers that run the application workloads. In this case, the VXLAN tunnels terminate into VXLAN tunnel endpoints (VTEPs) that run on those same servers as well. The pricing model for this approach is typically on a per-processor basis for each of the servers, often adding thousands of extra dollars per CPU. Then, in addition, a number of separate external servers are needed for management, network controllers, edge services gateways and more. This means extra cost, space and power for the servers, as well as multiple software licenses that must be purchased. Again, like SDN for the underlay, these overlay controllers need to be deployed in clusters of three for redundancy. Finally, because the packet processing in these solutions consumes host compute power that would otherwise support application workloads, many in the industry are advocating that SmartNICs be deployed. These "smart" network interface cards (NICs) include CPUs and memory to offload network processing from the main server CPUs, increasing processing power but adding significant expense per server along with complexity, requiring multiple integration and configuration steps. Solutions that take this approach include Juniper Contrail, Nokia Nuage and VMware NSX. Some data center operators will see benefits to this approach that outweigh the added cost and complexity, but generally that will only be true for larger data center environments where having multiple servers for management and control is acceptable and where there are usually large IT teams that can manage integration complexity. In those cases, the software overlay solutions can be deployed over a Pluribus SDN-controlled underlay or even over a standard manually configured underlay.

A New Approach to Overlay Fabric Networking

The alternative approach for overlay fabric networking is to leverage the power of the CPU and the packet processing power of the forwarding ASIC in the TOR switches that are being deployed anyway. Just like the SDN underlay, this is the approach that Pluribus takes with our Adaptive Cloud Fabric for the overlay fabric. In this case, the VXLAN tunnels terminate on the white box TOR switch and leverage the specialized ASIC from Broadcom to hardware-accelerate the packet processing and termination of VXLAN tunnels into VTEPs.



Figure 7: Leveraging the distributed compute power of white box switches to implement SDN and network virtualization. The diagram contrasts the traditional and controller-based models – where packet analytics, network monitoring infrastructure, network virtualization/segmentation and SDN underlay control all require external systems around switches that run only packet processing and an L2/L3 NOS – with the controllerless model, where SDN control, network virtualization, analytics and even data center gateway (DCGW) routing run directly on the white box switch's Intel CPU alongside the Broadcom packet processing ASIC.

Layer 2 and Layer 3 unicast and multicast services are distributed throughout the overlay fabric using an Anycast Gateway approach, again leveraging the distributed processing power of the switches. Not only does this eliminate the per-CPU license expense and the optional per-server SmartNIC expense of other network virtualization solutions, it is also controllerless – again, no external controllers are needed – reducing space, power and cost, which is especially critical in smaller data center environments.

In the traditional model, each data center (DC1, DC2, DC3) requires large amounts of external hardware and software licenses to automate the underlay and overlay: per-site analytics servers, packet brokers, VIM controller clusters, overlay controller clusters and underlay controller clusters, a bookended leaf-spine design with closed protocols and a greenfield-only fit, plus optional SmartNICs in every server for network virtualization data plane acceleration. In the controllerless model, the processing power of the switches themselves is leveraged to save cost, space and power, with a single UNUM management and Insight Analytics instance spanning all sites over any WAN.

Figure 8: A dramatic reduction in cost, space and power for constrained environments, and a pre-integrated solution that is simple for resource-constrained IT teams to deploy.



Underlay and Overlay Unified – It Just Works

In the case of Pluribus' Netvisor ONE OS and Adaptive Cloud Fabric, the typical deployment is with unified SDN control of the underlay and overlay fabric, with no external controllers needed, as shown in figures 7 and 8. Not only is this approach very cost effective, with extremely low space and power consumption, the unification of these two automation layers results in a simple deployment with minimal integration required. The solution works out of the box, from zero-touch provisioning (ZTP) of the switches to building an SDN-automated underlay to deploying the virtualized network fabric. It is simple, fast and efficient, as you can see in this short four-minute video that shows the deployment of a simple four-switch fabric using our UNUM management system – UNUM Day-0 Automation. Once deployed, the solution is easily integrated through a northbound RESTful API with orchestration systems such as vCenter, Red Hat OpenStack or Kubernetes.

Data centers are becoming more distributed into colocation facilities and even out to the edge of the network – what we call distributed cloud. Fundamentally, the industry is seeing more and smaller data centers moving workloads closer to users and things to improve customer experience and enable new latency-sensitive applications. Most approaches to SDN control of the underlay and overlay are complex, costly, consume a lot of space and power, and are not really well designed for smaller and geographically distributed mini data center environments. By using the processing power of the switches that are being deployed anyway for physical connectivity, Pluribus enables a very cost-, space- and power-efficient SDN underlay and virtualized overlay, which makes it feasible to deploy an SDDC in small/medium data centers and constrained edge data center deployments.

This is not the final layer of automation we need, however, so in Chapter 4 I will talk about the importance of network analytics and how, once again, the power of open networking switches can be leveraged with clever software to deliver cost-effective yet very granular network telemetry.


CHAPTER 4
Network Analytics Without Probes,
TAPs and Packet Brokers

In Chapter 3, I wrote about a controllerless implementation for SDN automation of the underlay and a virtualized network overlay fabric that leverages the distributed processing power of open networking switches. The result of this approach is a very efficient and highly integrated network automation solution for smaller data center environments where traditional SDN approaches are simply too expensive, consume too much space and power and struggle to span geographically distributed multi-site or edge data center locations.

This novel approach is very powerful and necessary, but not sufficient – there is another layer of functionality required to support comprehensive data center automation. In order to monitor the network and quickly identify and troubleshoot performance issues, granular telemetry on every flow that traverses the fabric is essential. In fact, major vendors like Cisco, with their Tetration offering, have strongly validated the need for application analytics for today's modern applications. But these traditional approaches are not optimal for smaller environments, as they require a set of external test access points (TAPs), probes and packet brokers that effectively overlay the network fabric, not to mention a number of servers to execute the analytics. This results, again, in high cost and space and power consumption, as well as additional complexity.

Traditional Approaches to Analytics

Traditional switches and routers switch billions of packets per second between servers and clients at sub-microsecond latencies using custom ASICs, but have limited capability to record enough telemetry detail to provide a truly useful picture of network performance over time. It is a very similar story for OpenFlow-based switches, which use merchant silicon but have insufficient telemetry. As such, external TAPs and monitoring networks have to be built to get a sense of what is actually going on in the infrastructure. The figure below shows what monitoring today looks like.



Figure 9: A traditional approach to network telemetry and analytics requires external TAPs, probes and packet brokers – selectively placed taps and mirroring ports feeding an expensive dedicated fabric that aggregates and filters traffic toward the monitoring tools.

This is where challenges arise. A typical data center network that connects servers runs a combination of 10, 25, 40 and 100 GbE today. These switches typically have many servers connected to them that are pumping traffic at high speed.

Some possible approaches to instrumenting the network today are as follows:

1. Provision a copper or fiber optic TAP at every link and divert a copy of every packet to a packet broker fabric, which in turn routes traffic to the monitoring tools. With fiber optic TAPs and passives, every packet is mirrored, and the monitoring tools need to deal with a few Tb/s, or 1B+ packets per second, from each switch. However, the reality is that this approach is impossibly expensive, and thus no one deploys it.

2. Selectively place copper or fiber optic TAPs at uplinks or edge ports. Mirror these edge packets to a packet broker fabric, which in turn routes traffic to the monitoring tools. While this is less costly, it means the inner network becomes a black hole with no visibility. Many of us have learned the hard way over time that without 100% visibility, you can't fix a problem very efficiently. In addition, even this selective deployment of hardware makes the cost go up dramatically as more switches are deployed and require monitoring – the monitoring fabric needs more capacity, and the monitoring software gets more complex and needs more hardware resources.

3. Use the networking switches themselves to selectively sample traffic (e.g., sFlow with standard hardware or NetFlow with proprietary hardware) and send this traffic and flow information to monitoring tools. This approach is built upon the premise of sampling, where the sampling rates are typically 1 in 5,000 to 10,000 packets – any more than this runs into
scale challenges. This approach is better than nothing, but does not really have enough raw detail to attain a full picture of the network, as the sketch below illustrates.
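A quick back-of-the-envelope calculation shows why sampling cannot give a full picture: short flows are very likely to be missed entirely. This is a generic probability sketch (assuming independent 1-in-N packet sampling), not a model of any particular sFlow or NetFlow deployment.

    def p_flow_invisible(packets: int, rate: int = 5000) -> float:
        """Probability that no packet of a flow is sampled at 1-in-rate sampling."""
        return (1 - 1 / rate) ** packets

    # A short, 100-packet flow is almost never seen at 1-in-5,000 sampling:
    print(f"{p_flow_invisible(100):.1%}")     # ~98.0% chance of zero samples
    # Even a 10,000-packet flow escapes completely about 1 time in 7:
    print(f"{p_flow_invisible(10_000):.1%}")  # ~13.5%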
Another Approach to Telemetry and Analytics

As described in the previous chapter, it makes sense in constrained environments to leverage the distributed processing power of white box switches when possible. Similar to what can be done with SDN and network virtualization, one can write clever software that leverages the CPU and memory of the switch, as well as the packet processing ASIC, to monitor every TCP connection, including traffic within VXLAN tunnels, at wire speed across the entire fabric. This allows the tracking of all east-west and north-south traffic flows to expose important network and application performance characteristics and quickly isolate and fix problems.

Specifically, rich telemetry from the SDN and virtual network fabric can be gathered, where each switch in the fabric collects the metadata for every flow, stores some amount of metadata on the switch and then sends the bulk of the metadata to an analytics application via REST API. In particular, more recent open networking switches feature dual 10G NICs that run between the CPU and the network processing ASIC, providing plenty of throughput to transport the data to the CPU. The bulk of the processing happens on the local OS instance, and only metadata is peeled off and sent to the analytics application, which allows this solution to scale to billions of flows.

With this approach one can effectively capture every TCP flow across the fabric at wire speed, including TCP connection states (SYN, SYNACK, EST, FIN, etc.) by service, client, domain and many other options over time, and store the metadata in a repository for deep analysis. Also, multiple options to tag IP addresses, VLANs, MAC addresses and switch ports with metadata/contextual tags can be offered, and then one can aggregate or filter flows based on the custom tags. In addition to flows, of course, it is important to have port telemetry and device diagnostics via a selection of searchable options such as fabric node, switch port, vport (virtual port) and state, including a dashboard of all ports in the fabric. This is extremely valuable as it provides real-time and/or historical data analytics to identify performance concerns, root-cause network outages or quickly understand security threats like DDoS attacks.
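As a concrete illustration of consuming such flow metadata, the Python sketch below polls a telemetry REST endpoint and filters connections by TCP state and a contextual tag. The endpoint path and field names are invented for illustration; a real deployment would use the vendor's documented REST API or an IPFIX export into a collector.

    import requests

    ANALYTICS_API = "https://analytics.example.net/api/v1/flows"  # hypothetical

    def fetch_flows(state: str = "SYN", tag: str | None = None) -> list[dict]:
        """Pull flow metadata records, then filter by TCP state and custom tag."""
        records = requests.get(ANALYTICS_API, timeout=10).json()
        return [
            flow for flow in records
            if flow.get("tcp_state") == state
            and (tag is None or tag in flow.get("tags", []))
        ]

    # Example: half-open (SYN-only) connections toward IoT-tagged endpoints --
    # a typical first signal of a SYN-flood style DDoS attack.
    for flow in fetch_flows(state="SYN", tag="iot")[:10]:
        print(flow["client_ip"], "->", flow["server_ip"], flow["bytes"])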
"We had a large DDoS attack against the main district website. With Pluribus, we were able to determine in minutes that it was a couple of IP addresses from a nearby college and shut it down quickly before it had any impact."
– Sr. System Engineer, Large Midwest K-12 School District

Insight Analytics

The Pluribus Netvisor ONE has implemented the novel software approach described above, leveraging the CPU, memory and packet processing ASIC to provide comprehensive flow and switch telemetry.



Performance metrics are stored within the fabric and delivered as lightweight metadata that can be viewed using the CLI from the fabric, or delivered via API or IPFIX to other monitoring systems, security information and event management (SIEM) platforms or the Pluribus Insight Analytics solution (figure 10), which is an optional software module in our UNUM product offering. This solution can store from 100 million up to 2.5 billion flows over a time window and has an analytics engine and a rich set of reports that allow network operations teams to drill down to a single flow and identify performance issues or bad actors. Its powerful search engine UI and simple query syntax can help isolate and filter specific flows among millions in a fraction of a second. This helps teams quickly identify and rectify performance issues and supports regular reporting to senior management.

Figure 10: One of many prepackaged reporting displays in UNUM Insight Analytics

As applications become more distributed, with both east-west and north-south traffic, and services are deployed within private clouds, the ability to monitor each and every connection is of paramount importance for both performance and security reasons (find more on security in chapter 5). Given the amount of data, traditional sampled-data network analytics sources do not scale. More traditional packet monitoring solutions designed to overcome this limitation unfortunately require significant hardware overlay infrastructures that are expensive, complex and consume space and power – not ideal for smaller data center environments. The best approach is to leverage the distributed processing power of the network switches themselves with some clever software to give the data sources and analytics tools the ability to observe every packet and flow at a fraction of the cost of traditional hardware-based solutions.



CHAPTER 5
The Importance of Network Segmentation
for Security and Multi-Tenancy

We can't talk about data center networking automation without addressing the topic of security. Firewalls sit at the perimeter of the data center network and can protect against north-south traffic entering the data center fabric, but do not address east-west traffic and threats moving laterally once inside the network. For example, sophisticated malware can hide within encrypted data and be missed by conventional firewalls, and once inside can create significant damage. IoT is one example of an application with many new endpoints generating traffic and a potentially immature security model that results in a new, large attack surface. There are already examples of successful attacks through IoT devices, such as an attack on a casino via a WiFi-connected fish tank temperature sensor, as well as a massive retail attack via a WiFi-connected HVAC system.

Consequently, the industry has moved toward leveraging virtual routing and forwarding instances (VRFs) to segment the network and isolate traffic. VRFs can be deployed in a traditional underlay or on top of a VXLAN-based overlay. In either case, one of the challenges traditional networking solutions face is the complexity of provisioning and deploying the VRFs. If deployed in the underlay, it is necessary to configure multiple VRFs per switch on multiple switches across the data center or campus – a nightmare of complexity that is very prone to human error. In addition, because of the heavy protocol exchange in a typical VRF implementation, traditional solutions run into VRF scale challenges. Similarly, setting up a VXLAN fabric using, for example, BGP-EVPN requires N x tens of steps per switch, and then adding VRFs on top of that adds another N x tens of steps per switch.

Automating Network Segmentation

On the other hand, Pluribus' open SDN approach with the Adaptive Cloud Fabric sets up a mesh of VXLAN tunnels automatically. Once deployed, VRFs can be programmed to run across the fabric on every switch within a VXLAN segment with a single atomic command. Literally only one command is needed – a dramatic simplification. In addition, Pluribus' VRF scale is limited only by hardware, because the Adaptive Cloud Fabric's SDN approach does not need the protocol exchange typically required by VRFs.
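To illustrate what a single fabric-wide operation per segment means in practice, here is a hedged Python sketch: one fabric-scoped API call per tenant VRF, rather than a BGP-EVPN step sequence repeated on every switch. The endpoint, payload fields and VNI numbers are hypothetical and invented for illustration; the actual Adaptive Cloud Fabric command and API syntax is documented by Pluribus.

    import requests

    FABRIC_API = "https://fabric.example.net/api/v1"  # hypothetical endpoint

    # One VRF per tenant, each isolating its own VXLAN segments (VNIs).
    TENANTS = {
        "corp":   {"vnis": [10100, 10101], "anycast_gw": "10.1.0.1/24"},
        "iot":    {"vnis": [10200],        "anycast_gw": "10.2.0.1/24"},
        "guests": {"vnis": [10300],        "anycast_gw": "10.3.0.1/24"},
    }

    for name, cfg in TENANTS.items():
        # A single fabric-scoped operation programs the VRF on every switch
        # participating in the segment -- no per-switch configuration loop.
        requests.post(f"{FABRIC_API}/vrfs",
                      json={"name": name, "scope": "fabric", **cfg},
                      timeout=5)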



Ultimately, this simple-to-deploy and highly scalable network segmentation approach significantly reduces system attack surfaces so that endpoints only see the resources and services necessary to perform their tasks, limiting accessibility and mitigating risk. With the programmability and ease of use of the Adaptive Cloud Fabric, network or security team members can quickly add segments and VRFs to control traffic flowing across the fabric without having to reconfigure the underlying physical network infrastructure.

More Efficient Utilization of Security Devices

This segmentation also allows the more efficient use of firewalls, intrusion detection systems (IDS), intrusion prevention systems (IPS) and other physical or virtual security devices. Instead of deploying many of these devices throughout the network, they can be centralized for easy management, while also allowing the pooling and sharing of these typically expensive resources. Furthermore, traffic can be steered to specific resources, such as a separate IoT analytics system or external cloud services, for processing, with confidence that it is separated from higher-value traffic.

Figure 11: VRFs distributed across the fabric with Anycast Gateways to better leverage pooled firewall resources



Segmentation can also be used for multi-tenancy. There are different levels of segmentation, and the Adaptive Cloud Fabric is unique in its ability to offer not only a VXLAN overlay supported by VRFs for segmentation across the data and control planes, but also deep slicing. Deep slicing leverages a construct called vNETs (virtual networks) that slices the fabric across the data, control and management planes. Deep slicing allows each tenant, if desired, to use its own automation tools to control its slice fabric-wide. For example, if you are a regional cloud service provider that deployed ACF across five data centers, you could define a slice for one customer with a set of physical or virtual ports at three of those five data centers, and that tenant could configure its slice as it sees fit.
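As a sketch of what such a tenant slice might look like when expressed declaratively (the field names below are illustrative only, not the actual vNET configuration schema), a slice binds selected ports at selected sites and delegates control of all three planes to the tenant:

    # Hypothetical slice definition for one tenant of a five-site fabric.
    tenant_slice = {
        "vnet": "tenant-a",
        "sites": ["dc1", "dc3", "dc5"],               # 3 of the 5 data centers
        "ports": {"dc1": [9, 10], "dc3": [1], "dc5": [4, 5, 6]},
        "planes": ["data", "control", "management"],  # deep slicing
        "delegated_admin": "tenant-a-ops",            # tenant runs its own tools
    }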
Integrated Telemetry and Analytics

Finally, as discussed in Chapter 4, the integrated telemetry monitors every TCP connection and flow at wire speed, including traffic within overlay tunnels orchestrated by the Adaptive Cloud Fabric, to expose important network and application behavior characteristics. Visibility is provided on a per-segment basis, with complete separation of data for compliance requirements. By enabling a comprehensive view of the fabric, the integrated visibility greatly improves situational awareness while eliminating the costs and complexity associated with hardware-based monitoring tools.



Summary
The world is moving to a hybrid multi-cloud model, with IDC estimating that 75% of workloads
will be deployed in on-prem, colo-based or hosted private clouds for cost, security, performance
and data sovereignty reasons. These on-prem, colo and hosted private cloud environments
require a completely automated data center foundation – the software-defined data center or
SDDC. While storage and compute virtualization and automation have made great strides over
the last decade, networking has lagged. Data center networking automation solutions today
have been designed for large data centers operated by large IT teams that have the budget to
buy, and the resources to integrate and deploy, layers of external hardware and software to
achieve network automation and virtualization. Unfortunately, these traditional approaches do
not suit the larger universe of data center and private cloud operators that have smaller IT teams.

The Pluribus Netvisor ONE operating system and Adaptive Cloud Fabric have been designed
to deliver a superior level of network automation to small IT teams while simultaneously fitting
into cost-, space- and power-constrained environments. Pluribus Networks takes advantage of
the underutilized distributed computational power, memory and packet processing inherent
in leaf-and-spine network switches distributed across one or multiple data center sites. By
leveraging these resources, Pluribus delivers a unique “controllerless” approach to SDN
automation of the physical network, provides a service-rich and secure VXLAN virtual fabric
and enables comprehensive and granular telemetry and analytics. Not only is the solution
cost-, space- and power-efficient, it is unified and pre-integrated, so it just works out of the box.
Supporting well-known orchestration systems, including VMware vCenter, Red Hat OpenStack
and Kubernetes, Pluribus Networks puts fully automated SDDC within reach for small IT teams
with constrained physical environments – the Easy Button for SDDC.



Additional Resources
Webinar Replay: Realizing the SDDC: Simple, Affordable SDN and Network Virtualization for
Any Size Data Center
Web Page: Netvisor ONE Operating System
Web Page: Adaptive Cloud Fabric
Web Page: Insight Analytics

Copyright © 2019 Pluribus Networks, Inc. All Rights Reserved. Netvisor is a registered trademark, and The Pluribus Networks logo, Pluribus Networks, Freedom, Adaptive Cloud Fabric, and VirtualWire are trademarks of Pluribus Networks, Inc. All other brands and product names are registered and unregistered trademarks of their respective owners. Pluribus Networks reserves the right to make changes to its technical information and specifications at any time, without notice. This informational document may describe features that may not be currently available, may be part of a future release, or may require separate licensed software to function as described.

Pluribus Networks, Inc.
www.pluribusnetworks.com
