Paragon Automation – Application Positioning Document

Table of Contents

1 Introduction and Scope
2 Paragon Automation Blueprint
2.1 Overview
2.2 Paragon Pathfinder: Traffic Engineering
2.3 Paragon ATOM: Service Orchestration, Workflow, EMS and Compliance
2.4 Paragon Active Assurance: Active Assurance
2.5 Paragon Insights: Telemetry Collector, Root Cause Analysis, Remediation Actions
3 Use-cases with Deviations from the Guidelines
3.1 CSD Replacement
3.1.1 CSD Basics
3.1.2 Anuta ATOM as Replacement for CSD
3.1.3 When Is It Apt to Position Anuta ATOM in Place of CSD?
3.1.4 What Products to Position for Service Activation and Testing in a Customer Network Without CSD?
3.1.5 Points to Leverage When Positioning Anuta ATOM in Lieu of CSD
3.2 Junos Space Replacement (EMS)
3.2.1 Junos Space Background and Basics
3.2.2 Junos Space Positioning for JAWS
3.2.3 What EMS Features Can Anuta ATOM Be Positioned for?
3.2.4 Points to Leverage When Positioning Anuta ATOM in Lieu of Junos Space
4 Real-world Scenarios
4.1 Example: How to Troubleshoot Real-world Network Issues with Paragon Automation
4.1.1 Label Conflict
4.1.2 No Path Available; No State for Constraints
4.1.3 Latency Spike
4.1.4 Device Unreachable
4.1.5 IP Conflict
4.1.6 Intermittent Blackholes
4.1.7 Link Health/Errors
4.1.8 Node Health/Errors
4.1.9 Tree State Failure/Replication Failure
5 Appendix: Customer Input
5.1 Background
5.2 High Level Goals as Communicated by the Customer
5.3 High Level Automation Requirements as Communicated by the Customer
5.4 PCE High Level Principles Requirements as Described by the Customer

Juniper Business Use Only

1 Introduction and Scope


This document clarifies the responsibilities and positioning of the various Paragon
applications. This is important since some of the applications overlap in terms of
functionality.
The document starts by describing a blueprint composed of the Paragon applications and
explains their respective responsibilities in addressing customer needs. It also briefly
describes specific scenarios where you can consider deviating from this blueprint: for
example, a scenario where 95% of the customer use-case can be solved with one
application, and we cannot justify proposing an additional application to contribute the last
5%.
Since this blueprint is based on a real customer RFI for an end-to-end ecosystem, the
requirements from that customer RFI are documented in Appendix: Customer Input for the
sake of completeness. The RFI contains several fault scenarios which are important to be
able to detect, isolate and fix with the offered solution. In the present document these fault
scenarios have been used as examples, and we show how to handle them using a
combination of Paragon applications.
This document mainly involves Pathfinder, Active Assurance, Insights and ATOM. Planner
does not overlap with any of the other applications and is therefore only briefly mentioned in
chapter 2 and occasionally later.

2 Paragon Automation Blueprint

2.1 Overview
The Paragon Automation components covered in this blueprint are:
• Paragon ATOM
• Paragon Pathfinder
• Paragon Insights
• Paragon Active Assurance
• Paragon Planner
The most important requirement from a customer perspective is to be able to solve real-
world problems. This is why Paragon focuses on common use-cases and how to solve them.
The key use-cases are shown in Figure 1 below.

Figure 1 Paragon use-cases

Each Paragon application has a special responsibility, shown below from a high-level
perspective in Table 1. By making the responsibility of each application clearer, we will
reduce the confusion in cases where the functionalities overlap slightly and minimize
overlapping development of the applications to accommodate customer requests.
Some of the above use-cases are “stand-alone”, meaning that only a single Paragon
application is needed. Some of the others may include several Paragon applications.
Table 1 below provides a high-level view of which applications are responsible for solving
specific use-cases. Most use-cases have only one checkmark, meaning that only one
application is commonly needed to address that use-case. Some of the use-cases require
two or more applications; this is shown by checkmarks in multiple columns of the table.

Use-cases | Description | Planner, ATOM, Active Assurance, Insights, Pathfinder
Device turn-up/ZTP | Onboard new devices | ✓
Device LCM | SW upgrades | ✓
Audit, Compliance management | Check and enforce versions, configurations using rules and templates | ✓
Service modeling (orchestration) | Stateless service orchestration using YANG | ✓
Service activation testing | Run tests, provide birth certificate | ✓
Service-centric assurance (monitoring) | Monitor network/LSPs end-to-end, monitor services | ✓
Device monitoring | CPU, mem, load, link flaps | (1) ✓
Latency-based routing | Place traffic on optimal paths to meet latency demands | ✓
Autonomous capacity optimization | Place traffic on paths that meet bandwidth demands | ✓
AI-based predictive analytics | Predict failures, flag anomalies, correlation | ✓
Closed loop remediation | Combine active and passive assurance to detect issues and take actions to fix them | ✓ ✓
Assured network slicing | Transport slice management with active assurance | ✓ ✓
AI-based network analytics, root cause | Use of resource graphs to correlate issues and highlight root cause | ✓
Low code workflow automation | Workflows with manual decision points, etc. | ✓
Fault management | Ingestion of device and service notifications via SNMP, webhooks, etc. | (2) ✓
Coordinated maintenance | Fix issues without traffic impact | ✓
Strategic network planning | Plan activities for transport networks | ✓
Risk-failure analysis, scenario planning | Simulate different scenarios to ensure that the network meets customer needs, including “what if” scenarios | ✓
Network migrations | For example: Migrate from IPv4 to IPv6 | ✓
Network-wide integrity checks | Routing/switching and IP addressing integrity checks (correlated between all devices in a routing domain) | ✓
Service activation and testing | Provision a service with ATOM and then also test the service with PAA | ✓ ✓

Table 1 Positioning of applications.

(1) Insights is the main solution to be used for device monitoring. When required for specific
situations, ATOM may also monitor device health and status, mainly when needed for workflows or
lifecycle management activities. See chapter 3 for more information. For use-cases/scenarios NOT
mentioned in chapter 3, please reach out to PLM for a discussion around the use of ATOM instead
of Insights.
(2) Insights is the main solution to be used for receiving alerts (for fault management). When
required for specific situations, ATOM may ingest SNMP traps; see chapter 3 for more information.
For use-cases/scenarios NOT mentioned in chapter 3, please reach out to PLM for a discussion
around the use of ATOM instead of Insights.

See Figure 2 for an integrated setup utilizing the different applications.

Figure 2 Fully integrated and automated workflows.

All Paragon applications can work either stand-alone or integrated using flexible APIs. This
also means that if a customer uses products from some other vendor(s) for certain tasks and
only needs selected Paragon applications to be integrated into their environment, that will
work perfectly fine. Based on use-cases and customer needs, the integration can be
implemented as part of a customer project by Juniper Professional Services.
There are several examples of integrating two or more of the applications. For example, in a
PoC setup for a large service provider in EMEA, ATOM provisions VPNs, maps these to
Pathfinder-created LSPs and deploys Paragon Insights playbooks to start monitoring the
VPN PE–CE BGP sessions. Another example is a PoC setup for an ASEAN service provider
that has integrated Service Now with ATOM, which in turn controls Paragon Active
Assurance and Paragon Insights.
Below, the responsibilities of the different applications are detailed further.

2.2 Paragon Pathfinder: Traffic Engineering


1. Learns about the network and LSP path state via the Path Computation Element
Protocol (PCEP) and by listening to BGP-LS updates.
2. Offers real-time LSP path computation, provisioning and optimization for the
transport network.
3. Can provide information about which LSPs exist to other solutions (for example,
ATOM) as needed through APIs.

Further clarifications regarding Pathfinder from a Paragon perspective:


• Pathfinder “owns” the SR-TE/LSPs. Some changes to the LSPs can be made without
changing the VPNs, meaning that Insights can interact directly with Pathfinder to put
nodes into maintenance and have LSPs recalculated.
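To illustrate what "having LSPs recalculated" means when a node is put into maintenance, the
following is a minimal sketch, not Pathfinder's actual algorithm or API. It assumes a plain
shortest-path computation (Dijkstra) over a hypothetical four-node topology; the node names,
link metrics and the function shortest_path are all invented for illustration.

```python
import heapq


def shortest_path(links, src, dst, excluded=frozenset()):
    """Dijkstra over an undirected weighted topology, ignoring excluded nodes."""
    graph = {}
    for a, b, cost in links:
        # Links touching a node under maintenance are not usable.
        if a in excluded or b in excluded:
            continue
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None  # no path satisfies the constraints


# Hypothetical topology: (node, node, metric)
LINKS = [("PE1", "P1", 10), ("P1", "PE2", 10), ("PE1", "P2", 15), ("P2", "PE2", 15)]

normal = shortest_path(LINKS, "PE1", "PE2")
during_maintenance = shortest_path(LINKS, "PE1", "PE2", excluded={"P1"})
```

In this sketch the VPN configuration is never touched: only the path of the LSP changes, which
is why such a change can be made without involving ATOM.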

2.3 Paragon ATOM: Service Orchestration, Workflow, EMS and Compliance
In the customer RFI example described in Appendix: Customer Input, ATOM is doing the
VPN configurations. Each time a new VPN customer comes on board, ATOM will:
1. Configure the PEs involved with the corresponding VRF configuration. It will query
Pathfinder about which LSP meshes exist and will also add configuration of which
mesh to map the VPN to. Any pre-checks are done by ATOM.
2. Add a VLAN interface on the PE towards the PAA Test Agent (which is connected to
each PE in this scenario), create a corresponding VLAN on the PAA Test Agent via
PAA Control Center and add that VLAN to the customer VRF.
3. Configure a TWAMP responder on the customer CPE.
4. Add the customer CPE responder IP address to the PAA responder inventory via
Control Center.
5. Start an activation test towards the customer CPE via the Control Center API.
6. Add the customer CPE to the Insights device group so that Insights starts collecting
metrics.
7. Remove the customer VLAN interface on the Test Agent.
8. Via the Control Center API start an LSP monitor that will monitor the LSP end-to-end
between the Test Agents connected to the different PEs.
9. Any post-checks will be done by ATOM.
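The sequence above can be sketched as an ordered workflow. This is only an illustration of the
ordering and of which system each step targets; it does not use any real ATOM, Pathfinder, PAA
or Insights API. The class Onboarding, the function onboard_vpn_customer and all names passed
to them are hypothetical, and each step merely records what a real integration would call. The
early exit on a failed activation test is an assumption based on the pass/fail behavior
described for Paragon Active Assurance.

```python
from dataclasses import dataclass, field


@dataclass
class Onboarding:
    customer: str
    pe: str
    cpe: str
    log: list = field(default_factory=list)

    def step(self, target, action):
        # A real integration would call the target system's API here;
        # this sketch only records the call.
        self.log.append((target, action))


def onboard_vpn_customer(job, activation_test_passes=True):
    job.step("ATOM", "pre-checks")
    job.step("Pathfinder", "query existing LSP meshes")
    job.step("ATOM", f"configure VRF on {job.pe} and map VPN to LSP mesh")
    job.step("ATOM", f"add VLAN interface on {job.pe} towards Test Agent")
    job.step("PAA Control Center", "create corresponding VLAN on Test Agent")
    job.step("ATOM", "add VLAN to customer VRF")
    job.step("ATOM", f"configure TWAMP responder on {job.cpe}")
    job.step("PAA Control Center", f"add {job.cpe} to responder inventory")
    job.step("PAA Control Center", f"start activation test towards {job.cpe}")
    if not activation_test_passes:
        # A failed test means the service is not handed over to the customer.
        return False
    job.step("Insights", f"add {job.cpe} to device group")
    job.step("PAA Test Agent", "remove customer VLAN interface")
    job.step("PAA Control Center", "start end-to-end LSP monitor")
    job.step("ATOM", "post-checks")
    return True


job = Onboarding(customer="acme", pe="PE1", cpe="CPE1")
activated = onboard_vpn_customer(job)
```

Recording the steps in a log makes it easy to see that ATOM drives the whole sequence while the
other applications each perform only their own part.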

Further clarifications regarding ATOM from a Paragon perspective:


• ATOM should not be sold as a generic telemetry collector or an assurance solution.
Telemetry collection is done by Insights, and SNMP notifications should also, in
general, go to Insights.
• ATOM should not be sold as a tool for service activation testing (SAT). ATOM will tell
Paragon Active Assurance to run activation tests.
• ATOM is not the “brain” of the solution: it does not decide which remediation actions to
take, nor does it drive closed-loop control. This is the role of Insights. However, Insights
might tell ATOM to make changes.
• ATOM is the EMS. This means that ZTP of new devices as well as SW upgrades of
the devices is part of ATOM’s responsibility. This does not mean that ATOM takes on
full FCAPS responsibilities (FCAPS = fault, configuration, accounting, performance,
security), as Insights is responsible for ingesting device fault events and performance
management metrics.
• ATOM is responsible for compliance. If the customer is interested in compliance use-
cases, ATOM is the solution to sell.
• ATOM “owns” the VPN config, so changes to the VPNs must always go via ATOM.

2.4 Paragon Active Assurance: Active Assurance


1. Runs test suites for activation tests started by ATOM as part of service activation and
provides test results to ATOM with a pass or fail indication. If the test fails, the
service cannot be activated or handed over to the customer.
2. If the test passes, ATOM will tell Active Assurance to monitor the LSP meshes end-to-
end between all the PEs by sending UDP flows between Test Agent interfaces into
the SR-TE LSPs (note that monitors are also started by ATOM as part of fulfillment).
3. Sends a notification to Insights if end-to-end thresholds are violated for any of the
LSPs, with information enabling unique identification of the LSP having issues.
4. Streams real-time metrics to Insights via Kafka for correlation and root-cause
analysis, as well as for prediction and anomaly detection.
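As an illustration of item 3, the sketch below checks per-LSP KPIs against thresholds and builds
one notification per violating LSP, including the LSP identity. It is a simplified stand-in, not
the PAA monitor or its notification format: the function find_violations, the KPI names
delay_ms and loss_pct, the threshold values and the LSP names are all assumptions.

```python
import json


def find_violations(measurements, max_delay_ms, max_loss_pct):
    """Return one entry per LSP whose end-to-end KPIs exceed a threshold."""
    violations = []
    for lsp, kpis in measurements.items():
        reasons = []
        if kpis["delay_ms"] > max_delay_ms:
            reasons.append("delay")
        if kpis["loss_pct"] > max_loss_pct:
            reasons.append("loss")
        if reasons:
            # The LSP name is included so the receiver can identify it uniquely.
            violations.append({"lsp": lsp, "violated": reasons, "kpis": kpis})
    return violations


# Hypothetical end-to-end measurements between Test Agents
measurements = {
    "PE1->PE2": {"delay_ms": 12.0, "loss_pct": 0.0},
    "PE1->PE3": {"delay_ms": 48.5, "loss_pct": 0.2},
}

violations = find_violations(measurements, max_delay_ms=30.0, max_loss_pct=0.1)
notifications = [json.dumps(v, sort_keys=True) for v in violations]
```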

Further clarifications regarding Active Assurance from a Paragon perspective:

• During the fulfillment phase (activation test results), PAA can provide test results back
to ATOM. However, once the test has passed and the service is operational, PAA provides
KPIs and notifications to Insights instead of ATOM.

2.5 Paragon Insights: Telemetry Collector, Root Cause Analysis, Remediation Actions
1. Paragon Insights is configured by ATOM (or by the user) to monitor the health of
network devices via a variety of ingest methods.
2. If an anomaly is detected, it will be highlighted in the UI and an alert can be sent via
Slack, Teams, Kafka, email etc. to the customer system. REST API calls can also be
made to other systems.
3. User-defined Actions (UDAs) can be created by the customer. These are written in
Python and so are very flexible. For example, the UDA code can run troubleshooting
playbooks, gather additional information from routers, issue operational commands
on routers or modify the configuration of routers, or run troubleshooting tests with
Paragon Active Assurance. For more complex actions, a workflow engine will be
invoked via ATOM, which will manage the workflow.
4. Receives notifications from Paragon Active Assurance if end-to-end monitor KPI
thresholds are violated, which might include LSP identity.
a. Then requests information from Pathfinder about the details of the LSP, and
correlates this problem with other LSPs having problems, and with device
health information.
5. Applies rules for remediation action directly to devices or via ATOM, starts
troubleshooting tests with PAA to gather more insight into the problem, etc.
6. Resource trees in Paragon Insights allow for the description of dependencies
between services and network components (for example, VPN dependent on an LSP
dependent on interfaces, which are dependent on PFEs, which are dependent on line
cards). These dependencies are then used to enable root cause analysis.
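The dependency chain in item 6 can be sketched as a small graph walk. This is an illustrative
simplification, not the Insights resource tree implementation: the resource names, the
DEPENDS_ON mapping and the function root_causes are hypothetical, and the rule used here (follow
unhealthy dependencies downwards and report the deepest unhealthy ones) is just one plausible
way to narrow down a root cause.

```python
# Each resource lists the resources it depends on (hypothetical names).
DEPENDS_ON = {
    "vpn:acme": ["lsp:pe1-pe2"],
    "lsp:pe1-pe2": ["if:pe1:et-0/0/1", "if:pe2:et-0/0/3"],
    "if:pe1:et-0/0/1": ["pfe:pe1:0"],
    "if:pe2:et-0/0/3": ["pfe:pe2:0"],
    "pfe:pe1:0": ["linecard:pe1:0"],
    "pfe:pe2:0": ["linecard:pe2:0"],
}


def root_causes(resource, unhealthy, depends_on=DEPENDS_ON):
    """Follow unhealthy dependencies from `resource` and return the deepest ones."""
    if resource not in unhealthy:
        return set()
    causes = set()
    for dependency in depends_on.get(resource, []):
        causes |= root_causes(dependency, unhealthy, depends_on)
    # If no dependency is unhealthy, this resource itself is the likely cause.
    return causes or {resource}


unhealthy = {"vpn:acme", "lsp:pe1-pe2", "if:pe1:et-0/0/1", "pfe:pe1:0"}
causes = root_causes("vpn:acme", unhealthy)
```

In this example the VPN, the LSP and one interface all report problems, but the walk narrows
them down to a single PFE, so one root cause is flagged rather than four separate symptoms.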

Further clarifications regarding Insights from a Paragon perspective:


• Insights is the “brain” of the system, deciding which operational actions to take. The
actions might be taken by communicating with ATOM, which will then make the
actual change, or potentially directly with devices or with Pathfinder.
• Insights should be the ingest point for KPIs from PAA for visualization and
analytics/correlation.
• While Insights has some workflow capabilities, Insights is not the workflow engine of
the Paragon solution. This is ATOM.

3 Use-cases with Deviations from the Guidelines

3.1 CSD Replacement

3.1.1 CSD Basics

Connectivity Services Director (CSD) is a Juniper product (which has reached EoL) having
the following main functions:
• Provisioning and activating L2/L3 VPNs across MPLS and Carrier Ethernet networks.
• Basic service troubleshooting, service statistics, and performance measurement
according to the ITU-T Y.1731 specification.
CSD reached EoS in July 2021. It will reach the end of engineering support in December
2022, and therefore there is currently no active feature development work being accepted
by Engineering towards CSD.

3.1.2 Anuta ATOM as Replacement for CSD

Anuta ATOM has been introduced as an alternative to CSD that can be offered to customers
seeking service provisioning support in their networks. Sufficient feature parity has been
introduced in Anuta ATOM to ensure that the account teams can continue to position Anuta
ATOM with the same feature set as CSD.
The following service templates were part of CSD and are now made available in Anuta
ATOM:
✓ ELine-Dot1q-SingleVLAN
✓ ELine-PortBased
✓ ELine-QinQ-AllVLAN
✓ ELine-QinQ-VLANRange-CCC
✓ ELine-BGP-Port-Based
✓ ELAN-BGP-Dot1Q-SingleVLAN
✓ ELAN-BGP-PortBased
✓ ELAN-BGP-QinQ-AllVLAN
✓ ELAN-Hub-Spoke-QinQ-AllVLAN
✓ L3VPN-OSPF-STATIC L3 VPN (Full Mesh)
✓ L3VPN-BGP-STATIC L3 VPN (Full Mesh)
✓ L3VPN-OSPF-Static (Hub-Spoke-1-Interface)
✓ L3VPN-BGP-Static (Hub-Spoke-1-Interface)
✓ EVPN-VXLAN
✓ EVPN-ETREE
✓ ELine-Dot1q-SingleVLAN-CCC
✓ ELine-Dot1q-SingleVLAN-Ext-CCC
✓ ELine-QinQ-AllVLAN-CCC
✓ ELine-QinQ-AllVLAN-Ext-CCC
✓ ELine-QinQ-VLANRange
✓ ELine-QinQ-VLANRange-Ext-CCC
✓ Eline-BGP-QinQ-AllVLAN
✓ Eline-BGP-Dot1q-SingleVLAN
✓ ELAN-BGP-Dot1q-Normalized-VLAN-None
✓ ELAN-BGP-QinQ-AllVLAN-Normalized-All
✓ ELAN-BGP-Dot1q-Normalized-VLAN-None
✓ ELAN-BGP-QinQ-Range-Normalized-VLAN
✓ ELAN-Hub-Spoke-QinQ-AllVLAN-No
For more details on the parameters covered in each of the service templates, refer to
<Detailed document of service templates> (not part of this document).
Note: The CSD service templates are tried and tested in customer networks and can
be regarded as Juniper-recommended service templates that can be implemented in
customer networks. If customers need modifications to these templates, account
teams can reach out to their respective Professional Services (PS) BDMs to conduct
an assessment and provide an estimate of professional services charges for such
modifications.
Besides providing basic service templates, CSD also has very limited service testing and
monitoring functionality. The same functionality has now been introduced in Anuta ATOM to
maintain parity between CSD and ATOM. These service monitoring and testing capabilities
are outlined in the table at the end of section 3.1.5.
Note: The service monitoring and testing capabilities in ATOM are strictly limited and
exist mainly for the benefit of accounts that are looking to position ATOM in various
scenarios as a replacement for CSD. There is no roadmap for enhancing these basic
service monitoring features in ATOM.
In scenarios where customers are looking for service provisioning, monitoring, and testing
(assurance) capabilities which are not currently part of ATOM, be sure to position Paragon
Active Assurance (PAA) and Paragon Insights along with ATOM.

• PAA is a feature-rich, active test and monitoring solution, providing end-to-end
service insights for service activation testing, quality monitoring, and troubleshooting.
PAA has an active roadmap for adding the latest in-demand features for service
assurance.
• Paragon Insights is a flexible and powerful solution for device telemetry collection,
prediction, anomaly detection and root-cause analysis.
To conclude: Anuta ATOM provides the functionality needed to replace CSD in any customer
network.

3.1.3 When Is It Apt to Position Anuta ATOM in Place of CSD?

• Accounts that had already positioned CSD along with a hardware deal but could not
close the deal due to CSD EoL: Such accounts can position Anuta ATOM instead of
CSD. In case of any questions with respect to discount asks to match a net price that
was agreed upon for CSD, please reach out to the PLM team with adequate
documentation (a prior quote highlighting the discounts and net price provided for
CSD).
• Customers who already have CSD in their network and are now looking to expand
with additional licenses: Such accounts can position Anuta ATOM. For service
migration and other queries, reach out to the PLM team.

3.1.4 What Products to Position for Service Activation and Testing in a Customer Network Without CSD?

In customer networks where CSD was never positioned as a service provisioning platform in
the past, be sure to position the following products for the unique capabilities that they
provide to the customer network:
• Anuta ATOM: For provisioning L2/L3 VPNs and other key services, using an efficient
technique of service modeling.

• Paragon Active Assurance: For end-to-end service insights for service activation
testing, quality monitoring, and troubleshooting.

3.1.5 Points to Leverage When Positioning Anuta ATOM in Lieu of CSD

CSD was an early entrant to the market, designed to offer an automated way to provision
services in customer networks. At the time, both the service provider and enterprise sectors
were moving to IP/MPLS networks, and there was a need for automated provisioning of L2
Ethernet and L3 (IP-based) VPN services deployed over pseudowires or LSPs. CSD was developed
to meet those market requirements and customer needs.

However, technology has developed rapidly since then. Today, it is not enough for a service
provisioning platform to provision services automatically; it should also cater to multivendor
networks, interface with external apps or tools in the customer network via APIs, offer easy
means to develop any kind of service regardless of the type of platform in the underlying
infrastructure, and most importantly, be deployable in the customer network, in a public
cloud, or in a private cloud with adequate scale capabilities.
Anuta ATOM provides all these capabilities, along with feature parity with CSD.
The following table provides a comparison between Anuta ATOM and CSD:

Category | Junos Space CSD | Anuta ATOM (Advanced 2 License)
---------|-----------------|--------------------------------
Service provisioning pre-built templates | ✓ | ✓
Basic service monitoring dashboards & service functional audits | ✓ | ✓
Device & config management and inventory management (Junos Space) | ✓ | ✓
Service lifecycle management with pre- and post-checks with in-built workflow capabilities | X | ✓
Service modeling | X | ✓
Northbound APIs | Partial | ✓
Compliance management | X | ✓
Multivendor device support | X | ✓
Horizontally scalable and cloud-native | X | ✓
Micro-services architecture | X | ✓
Integration with Paragon | X | ✓

3.2 Junos Space Replacement (EMS)

3.2.1 Junos Space Background and Basics

Junos Space has been a dedicated EMS platform positioned at Juniper for more than 10 years
(since 2013 or earlier). Junos Space is a Juniper homegrown product that was positioned
for management of networking devices (including routers, switches and security devices).

The Junos Space solution can be broken down into two parts:
• The core EMS platform which provides all the generic functions of network management.
o This is the Junos Space platform.
• Plug-and-play management applications that solve specialized problems.
o The following plug-and-play applications were developed in-house: Connectivity
Services Director (CSD), Network Director (ND) and Security Director (SD). A few
other apps such as CPP and ICEaaa (which were very specifically developed for
customers) were also supported by Junos Space.
o External plug-and-play applications such as Service Now were also supported by
Space (however, Service Now has reached EoL).
Junos Space has now reached a stage where it is mostly regarded as a legacy application,
due to the non-availability of a cloud-native architecture approach, the lack of scale, and the
lack of an aggressive roadmap to add advanced features (such as comprehensive support
of Netconf, YANG, gRPC and so on).

3.2.2 Junos Space Positioning for JAWS

Typically, Junos Space has been positioned as an on-prem network management platform
for enterprise, service provider, and security businesses, owing to the support it provides
with its plug-and-play applications ND, CSD, and SD respectively. The CSD application that
was owned by JAWS has now been declared end of life, and Anuta ATOM is now being
positioned to cover that functionality (as discussed in section 3.1). The ND and SD apps,
however, continue to live since they are still being positioned actively by their respective
business units.
JAWS has made it clear that we can no longer continue positioning a legacy platform for
basic element management functionality, and has therefore reduced investment in the Junos
Space platform. Instead, the product to be positioned for “EMS functionality” for JAWS
customers (WAN customers) is henceforth Anuta ATOM.
Anuta ATOM has fairly good feature parity with Junos Space and has an active roadmap to
cover additional features, as needed for Juniper customers.
It must be noted that platforms such as MX, PTX, and ACX will have limited support (no new
features and functionality added) on Junos Space going forward. Junos Space also does
not support Evo-based platforms. Hence any new or upcoming platform that has Evo
as its OS cannot be managed by Junos Space. Anuta ATOM, on the other hand, will
support MX, PTX, and ACX platforms regardless of whether they are running Junos or Evo.
Note: Junos Space continues to be positioned as an on-prem EMS platform for
Enterprise and Security business units. The ND and SD apps continue to be active
and will continue to be positioned along with Junos Space. For any details on ND, SD
and positioning of Junos Space for campus and security requirements, reach out to
the Junos Space PLMs from AIDE and Security BUs.

3.2.3 What EMS Features Can Anuta ATOM Be Positioned for?

The table below indicates the EMS features that are supported by Anuta ATOM:

Category | Details
---------|--------
ZTP and device discovery | DHCP-based ZTP, SSH, SNMP, Netconf, IP-range discovery, dual routing engine discovery, Telnet, hostname discovery, IP subnet discovery
Device operations | Manual addition, modification, and deletion of devices, RMA, device-specific diagnostics, device groups
Inventory management | Physical inventory, license inventory, software inventory, inventory status
Image management | Image workflows, image upload, image deployment/upgrade
Config management | Config version tagging, config diffs, detection of out-of-band config changes from the changelog feature (config snippet management), config backup and restore
Fault and performance management (3) | Outages, events, threshold alarms, notifications, service level alarms, data collection and reporting, charts
Network monitoring (4) | Managing thresholds, sending events, managing SNMP data collection, managing outages, configuring notifications, database backup, outage charts/graphs
Topology | Automated discovery of network (L2) topology (devices and interconnections), tagging devices with geographical coordinates
Compliance | Device-based configuration compliance and service compliance, auto-remediation
Platform capabilities | RBAC, job management, audit logs, integration with AD, LDAP, TACACS

(3) Note: ATOM is to be positioned around Fault and Performance Management when needed as part
of a replacement of Junos Space. If the customer needs features that are not already available
in Junos Space and that will require development in ATOM, Paragon Insights should be considered.
(4) Note: ATOM is to be positioned around Network Monitoring when needed as part of a direct
replacement of Junos Space. If the customer needs features that are not already available in
Junos Space and that will require development in ATOM, Paragon Insights should be considered.

3.2.4 Points to Leverage When Positioning Anuta ATOM in Lieu of Junos Space

Junos Space has long been a much-desired platform for achieving EMS functionality such
as config management, inventory, and image management for Juniper platforms. However,
owing to the technology developments that have taken place in recent years, customers
are now looking to own an EMS platform that will not only help them in achieving the
fundamental automation requirements, but eventually also assist them in advanced
automation.
Customers now look for automation platforms that are scalable, can support multivendor
networks, can integrate with external or existing apps, and most importantly have the ability
to be deployed in the customer network, or in a public cloud, or in a private cloud
supporting EMS features on thousands of devices.
All the above-mentioned features are provided by Anuta ATOM, along with feature parity
with Junos Space.
The following table provides a comparison between Anuta ATOM and Junos Space:

Features | Junos Space | Anuta ATOM
---------|-------------|-----------
Device management | ✓ | ✓
Inventory management | ✓ | ✓
Script management | ✓ | ✓
Fault & performance management | ✓ | ✓
Northbound APIs | ✓ | ✓
Compliance management | X | ✓
Multivendor device support | X | ✓
Horizontally scalable and cloud-native | X | ✓
Microservices architecture | X | ✓

Note: As seen above, ATOM does provide some fault and performance management
features, areas in which Paragon Insights should generally be positioned. However, Paragon
Insights is not required to obtain telemetry data that is already supported by Junos Space
EMS; that data is also supported by ATOM in order to meet the requirements for a replacement.
If, on the other hand, any new development is required in ATOM to meet new customer
requirements in the fields of fault and performance management or active testing, we recommend
seriously considering the addition of Paragon Insights and Paragon Active Assurance to the
proposal instead.

4 Real-world Scenarios

4.1 Example: How to Troubleshoot Real-world Network Issues with Paragon Automation

A customer asked how Paragon Automation can be used to troubleshoot and isolate various
key problems they see in their network. This section gives some suggestions, going through
some important real-world problem scenarios that can be detected, isolated and fixed with
Paragon Automation.

4.1.1 Label Conflict

The key scenario in which a label conflict can occur is when multiple nodes (through
misconfiguration) advertise the same index for the same SID (for example, two nodes are
accidentally configured with the same node index).
When a router detects such a collision, it applies the conflict resolution scheme defined
by the IETF and also generates a syslog message. Insights ingests this message either as
native syslog or as a streaming telemetry event (gRPC), for example as shown below.
A playbook is created on Insights that listens to such events and flags them via any of the
supported channels (UI, Slack, Kafka etc.). In addition, a user-defined action (UDA) is
defined that reconfigures the node index on one of the two conflicting nodes with an
"emergency" index value reserved for this purpose. This is on the assumption that Insights
does not know which of the two nodes has been configured incorrectly but can at least
reconfigure one of them to resolve the conflict.

component_id: 65535
sub_component_id: 0
path: sensor_1004:/junos/events/:/junos/events/:eventd
sequence_number: 5831
timestamp: 1551326098693
sync_response: false
key: __timestamp__
uint_value: 1551326098694
key: __junos_re_stream_creation_timestamp__
uint_value: 1551326098693
key: __junos_re_payload_get_timestamp__
uint_value: 1551326098693
key: __junos_re_event_timestamp__
uint_value: 1551326098693
key: __prefix__
str_value: /junos/events/event[id='RPD_ISIS_PREFIX_SID_CNFLCT' and type='2' and facility='3']/
key: timestamp/seconds
uint_value: 1551326098
key: timestamp/microseconds
uint_value: 692834
key: priority
uint_value: 3
key: pid
uint_value: 3211
key: message
str_value: RPD_ISIS_PREFIX_SID_CNFLCT: ISIS detected L1 prefix segment index '9' originated by 'p1.iad' for prefix 128.49.106.9/32 conflicting with self-originated L1 prefix segment index '99' for prefix 128.49.106.9/32
key: daemon
str_value: rpd
key: hostname
str_value: pe1.iad
key: __prefix__
str_value: /junos/events/event[id='RPD_ISIS_PREFIX_SID_CNFLCT' and type='2' and facility='3']/attributes[key='isis-level']/
key: value
str_value: 1
key: __prefix__
str_value: /junos/events/event[id='RPD_ISIS_PREFIX_SID_CNFLCT' and type='2' and facility='3']/
key: logoptions
int_value: 9
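
As an illustration of what a user-defined action might do with the event above, the following Python sketch extracts the conflicting indices, the remote node and the prefix from the message text. It is not taken from Paragon Insights: the function name, the regular expression and the returned field names are assumptions written only against the sample message in this section, and real messages may differ in wording.

```python
import re

# Quotes around values may be straight or typographic, depending on how the log was captured.
QUOTE = "['\u2018\u2019]"
NOT_QUOTE = "[^'\u2018\u2019]"

# Hypothetical pattern, written against the sample message in this section only.
CONFLICT_RE = re.compile(
    r"RPD_ISIS_PREFIX_SID_CNFLCT: ISIS detected L(?P<level>[12]) prefix segment index "
    + QUOTE + r"(?P<remote_index>\d+)" + QUOTE
    + r" originated by " + QUOTE + r"(?P<remote_node>" + NOT_QUOTE + r"+)" + QUOTE
    + r" for prefix (?P<prefix>\S+) conflicting with self-?originated"
    + r" L[12] prefix segment index " + QUOTE + r"(?P<local_index>\d+)" + QUOTE
)


def parse_sid_conflict(message, hostname):
    """Return the conflict details as a dict, or None if the message does not match."""
    # Collapse line wraps and repeated spaces before matching.
    match = CONFLICT_RE.search(" ".join(message.split()))
    if match is None:
        return None
    return {
        "reporting_node": hostname,            # node that raised the syslog event
        "remote_node": match["remote_node"],   # node advertising the conflicting index
        "prefix": match["prefix"],
        "remote_index": int(match["remote_index"]),
        "local_index": int(match["local_index"]),
        "isis_level": int(match["level"]),
    }


sample = (
    "RPD_ISIS_PREFIX_SID_CNFLCT: ISIS detected L1 prefix segment index '9' "
    "originated by 'p1.iad' for prefix 128.49.106.9/32 conflicting with "
    "self-originated L1 prefix segment index '99' for prefix 128.49.106.9/32"
)
conflict = parse_sid_conflict(sample, hostname="pe1.iad")
print(conflict)
```

The parsed result gives the action the two candidate nodes (the reporting node and the remote node), which is the minimum it needs before reconfiguring one of them with the reserved "emergency" index.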

4.1.2 No Path Available; No State for Constraints

The above two conditions are detected in a similar way. There are two cases to consider:
distributed path computation and centralized path computation on the PCE (Pathfinder).

(a) Distributed path computation case


A syslog message is generated by the router as follows:
RPD_SPRING_TE_COMPUTE_LSP_COMPUTE_FAIL
This is the case whether (i) the LSP/policy was initially up and running but no path was
feasible following a topology change, or (ii) no path was feasible when path setup was first
requested. Insights consumes this message either as native syslog or via streaming
telemetry (Junos supports streaming telemetry for syslog messages). A playbook is created
that listens for such messages, sets an alert on the dashboard and publishes the alert to
other channels.
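
The listen, alert and publish flow described above can be sketched in plain Python as follows. This is only a conceptual model and not the Paragon Insights rule or playbook syntax: the `Alert` class, the `WATCHED_EVENTS` table, the severities and the channel callables are all hypothetical, and the message body in the example is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Alert:
    hostname: str
    event_id: str
    severity: str
    text: str


# Event IDs are taken from this document; the severities are illustrative assumptions.
WATCHED_EVENTS = {
    "RPD_SPRING_TE_COMPUTE_LSP_COMPUTE_FAIL": "major",
    "RPD_ISIS_PREFIX_SID_CNFLCT": "critical",
}


def to_alert(event: Dict[str, str]) -> Optional[Alert]:
    """Turn a syslog event into an Alert if its event ID is on the watch list."""
    message = event.get("message", "")
    event_id = message.split(":", 1)[0].strip()
    severity = WATCHED_EVENTS.get(event_id)
    if severity is None:
        return None
    return Alert(event.get("hostname", "unknown"), event_id, severity, message)


def dispatch(event: Dict[str, str], channels: List[Callable[[Alert], None]]) -> bool:
    """Publish the alert to every channel; return True if an alert was raised."""
    alert = to_alert(event)
    if alert is None:
        return False
    for publish in channels:
        publish(alert)
    return True


# Stand-ins for the dashboard and an external channel such as Slack or Kafka.
dashboard: List[Alert] = []
external: List[Alert] = []

raised = dispatch(
    {"hostname": "pe1.iad",
     "message": "RPD_SPRING_TE_COMPUTE_LSP_COMPUTE_FAIL: <message text>"},
    channels=[dashboard.append, external.append],
)
ignored = dispatch({"hostname": "pe1.iad", "message": "SOME_OTHER_EVENT: text"},
                   channels=[dashboard.append])
```

Modelling each channel as a callable keeps the matching logic independent of where the alert ends up, which mirrors the way one playbook can feed several notification channels.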

(b) Centralized path computation (Paragon Pathfinder)


If no path is available, because (i) the LSP/policy was initially up and running but then
following a topology change no path was feasible or (ii) no path was feasible when path
setup was first requested, an asynchronous event notification is sent by Pathfinder to
Insights. A playbook is written for Insights to ingest the notification, set an alert on the
dashboard to visualize the problem in the Paragon Automation UI and publish the alert to
other channels as needed. If the problem is detected during initial LSP setup, the problem is
indicated in response to the API call via REST to ATOM.

4.1.3 Latency Spike

Paragon Active Assurance (PAA) Test Agents are deployed in each PoP, and test packets
are sent between the Test Agents in order to test full-mesh connectivity between all PEs in
the network. PAA streams the KPIs to Insights via Kafka. The LSP KPIs will be picked up by
Pathfinder and visualized on the Paragon Automation topology map. If an issue arises,
Paragon Active Assurance will also send a notification to Insights that there is an anomaly,
including a way to identify which LSP is affected. In this way, Insights will know which PE-to-
PE flows are affected by the latency increase.
Test packets that fail the configured KPIs (for example, latency and packet loss) are also
shown on the Active Assurance dashboard. The dashboard displays KPIs as a function of
time.
To understand which link on the LSP is having issues, each router is configured to measure
the latency to its directly connected neighbors via TWAMP. Insights will also ingest the link-
level latency values and store the values in its time-series database. Insights is configured
with a static alert threshold or can learn the expected value via ML (3-sigma, K-means or
Holt-Winters). If an anomalous value is observed, this is flagged and shown on the
dashboard and published to the desired alert channels. End-to-end LSP issues will be
correlated in Insights with issues on specific links, as well as with device-level information to
point out the root cause and take defined actions.
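
To make the 3-sigma option concrete, here is a minimal Python sketch of a rolling 3-sigma check over link latency samples. It illustrates the general statistical technique only; how Insights implements its own learning is not described in this document, and the function name, window size and sample values are assumptions.

```python
from statistics import mean, stdev


def three_sigma_anomalies(samples, window=20, k=3.0):
    """Return (index, value) pairs lying more than k standard deviations
    from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat baseline: treat any change at all as anomalous.
            if samples[i] != mu:
                anomalies.append((i, samples[i]))
            continue
        if abs(samples[i] - mu) > k * sigma:
            anomalies.append((i, samples[i]))
    return anomalies


# Hypothetical link latency in milliseconds: a stable baseline, then one spike.
latency_ms = [10.0 if i % 2 == 0 else 10.2 for i in range(30)] + [25.0, 10.0, 10.2, 10.0]
spikes = three_sigma_anomalies(latency_ms)
print(spikes)
```

A static threshold is simpler to reason about, while a learned baseline such as this one adapts to links whose normal latency differs widely, for example metro links versus intercontinental links.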

4.1.4 Device Unreachable

As already described in section 4.1.3 (Latency Spike), Paragon Active Assurance Test
Agents will be deployed in each PoP, and test packets will be sent from the Test Agents in
order to test full-mesh connectivity between all PEs in the network. Test packets that fail the
configured KPIs (for example, latency and packet loss) are flagged on the Active Assurance
dashboard, and a notification is also sent to Insights for analysis and remediation.

4.1.5 IP Conflict

An IP address conflict can be detected by the integrity checks performed on router
configurations by Paragon Planner.

4.1.6 Intermittent Blackholes

Intermittent blackholes can be detected by the Juniper Resiliency Interface on the routers.
The routers send reports about exception packets, including metadata and a reason code (for
example, unknown MPLS label or MTU exceeded), via IPFIX to Insights. Corresponding
playbooks are created in Insights to highlight such blackhole scenarios. Actions will be taken
depending on the nature of the blackhole. For example, if the blackhole is due to MTU
exceeded, a user-defined action could inspect the MTU value configured on the interface
and run an MTU test with PAA Test Agents; if the test confirms an issue, the action makes a
REST call to ATOM to reconfigure the problematic device. In addition, if the MTU values on
links are being monitored by means of a playbook, outlier detection will highlight a link
whose MTU value is an outlier compared to the rest of the population.
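
One simple way to think about outlier detection on MTU values is to compare each link against the most common value in the population. The Python sketch below does exactly that; it is an assumption about what such a check could look like, not the algorithm Insights uses, and the link names and MTU values are invented for the example.

```python
from collections import Counter


def mtu_outliers(link_mtu):
    """Return the links whose MTU differs from the most common MTU in the population."""
    if not link_mtu:
        return {}
    common_mtu, _count = Counter(link_mtu.values()).most_common(1)[0]
    return {link: mtu for link, mtu in link_mtu.items() if mtu != common_mtu}


# Hypothetical inventory: link name -> configured MTU in bytes.
links = {
    "pe1:et-0/0/0": 9192,
    "pe1:et-0/0/1": 9192,
    "p1:et-0/0/0": 9192,
    "p1:et-0/0/1": 1514,   # left at a smaller value by mistake
    "pe2:et-0/0/0": 9192,
}
suspects = mtu_outliers(links)
print(suspects)
```

A mode-based comparison suits MTU because the expected value is usually one agreed number per network, so statistical spread matters less than simple disagreement with the majority.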

4.1.7 Link Health/Errors

Link statistics, including error statistics, will be monitored by Insights. Issues will be shown
on the Paragon Automation dashboard. If BFD is being used on the link, then BFD session
failure will take down the IGP adjacency, which will automatically take the link out of action.
However, the link may also be suffering from a gray failure that is insufficient to bring the
BFD session down but is sufficient to cause some degree of packet loss (which will be detected by PAA).
In such a situation, there are two main scenarios:
(i) In a Traffic Engineering deployment, Insights makes a REST API call to Pathfinder to
request immediate maintenance on the link. This triggers Pathfinder to divert all the TE
LSPs/policies passing through the link.
(ii) In a non-TE deployment, a user-defined action will be created to either put the link in
admin-down state or to set the IGP metric to maximum.
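
The decision between the two scenarios can be written down as a small function. This Python sketch only models the choice described above; the enum names, the function signature and the 0.5% loss threshold are assumptions, and the actual REST call to Pathfinder or the device reconfiguration is deliberately left out.

```python
from enum import Enum


class LinkAction(Enum):
    NO_ACTION = "no action"
    BFD_HANDLES_IT = "BFD failure has already taken the link out of the IGP"
    REQUEST_PATHFINDER_MAINTENANCE = "request immediate maintenance on the link via Pathfinder"
    RUN_USER_DEFINED_ACTION = "set the link admin-down or set the IGP metric to maximum"


def choose_link_action(bfd_session_up, loss_pct, te_deployment, loss_threshold_pct=0.5):
    """Pick a remediation for one link.

    bfd_session_up: False if BFD has failed (pass True if BFD is not used on the link).
    loss_pct: packet loss in percent, as measured by active probes.
    loss_threshold_pct: hypothetical threshold above which the link counts as gray-failed.
    """
    if not bfd_session_up:
        return LinkAction.BFD_HANDLES_IT
    if loss_pct < loss_threshold_pct:
        return LinkAction.NO_ACTION
    if te_deployment:
        return LinkAction.REQUEST_PATHFINDER_MAINTENANCE
    return LinkAction.RUN_USER_DEFINED_ACTION


te_choice = choose_link_action(bfd_session_up=True, loss_pct=2.0, te_deployment=True)
non_te_choice = choose_link_action(bfd_session_up=True, loss_pct=2.0, te_deployment=False)
print(te_choice.value)
print(non_te_choice.value)
```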

4.1.8 Node Health/Errors

Node KPIs such as CPU utilization, memory utilization and protocol health will be monitored
by Insights and visualized in the Paragon Automation UI. These node KPIs will be correlated
with PAA LSP metrics. In a Traffic Engineering deployment, if problems are detected,
Insights will make a REST API call to Pathfinder to request immediate maintenance on the
node. This triggers Pathfinder to divert all the TE LSPs/policies passing through the node. In
a non-TE deployment, a user-defined action could be created to set the IGP overload bit.

4.1.9 Tree State Failure/Replication Failure

An Insights playbook will be created that issues MPLS P2MP ping commands on a periodic
basis. This results in lsping packets being sent to all egress nodes of the multicast tree in
order to verify the integrity of the data plane. Lsping failure to an egress node will be
highlighted on the Insights dashboard. A user-defined action will check if the egress node is
reachable in general from the ingress node (by using Active Assurance probe statistics).
This acts as a first level of automated triage and serves to distinguish between a problem
with the multicast tree in particular and a more general connectivity problem. Based on the
analysis, different actions can be defined.
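
The first-level triage described above amounts to combining two observations per egress node. The following Python sketch shows that combination under stated assumptions: the function name, the input dictionaries and the three result labels are hypothetical, and collecting the lsping results and Active Assurance probe statistics is outside its scope.

```python
def triage_p2mp(lsping_ok, unicast_reachable):
    """Classify each egress node of a multicast tree.

    lsping_ok: egress node -> True if the periodic P2MP lsping succeeded.
    unicast_reachable: egress node -> True if probes from the ingress node
        reach it in general; nodes missing from this dict count as unreachable.
    """
    verdicts = {}
    for node, ping_ok in lsping_ok.items():
        if ping_ok:
            verdicts[node] = "healthy"
        elif unicast_reachable.get(node, False):
            # Node is reachable, so the fault is specific to the multicast tree.
            verdicts[node] = "multicast tree problem"
        else:
            verdicts[node] = "general connectivity problem"
    return verdicts


# Hypothetical results for three egress PEs of one tree.
verdicts = triage_p2mp(
    lsping_ok={"pe2": True, "pe3": False, "pe4": False},
    unicast_reachable={"pe2": True, "pe3": True, "pe4": False},
)
print(verdicts)
```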

5 Appendix: Customer Input


This appendix is for information only, showing what a real customer RFI related to
automation might look like.

5.1 Background
The customer RFI for their new Global Converged Core network covered 234 nodes (156 PEs
and 78 P routers). The hardware proposed by Juniper was MX10003 as PE and MX10003 or
ACX7100 as P router. The network will run OSPF and SR and will provide L3VPN, L2VPN
(EVPNs), MVPNs and IP peering services. This will be a global deployment across all
regions: EMEA, AMER and APAC. The customer is looking for a resilient (HA) solution.
Another similar request for a transport network in Asia consisted of 2000 metro nodes and
150 core nodes, with requirements on the automation solution for geo redundancy and high
availability.

5.2 High Level Goals as Communicated by the Customer


[HLG1] This turnkey solution would have external open APIs to integrate with other system
components including Customer Portal, Ciena Blue Planet (e2e orchestration), Service Now,
Capacity Planning Tool (WANDL, etc.).
[HLG2] New global network will offer a self-service customer experience. This customer
experience will initially be focused on Internet and VPN products. Some of the portal options
will be:
1. Select available global locations and deploy ports
2. Instantiate a service and attach to ports
3. Add additional service(s) to a port
4. Select a path with constraints (premium, best effort, disjoint, etc.) to be used by a service
5. Choose bandwidth required
6. Predictive bandwidth optimization and upgrade
[HLG3] New global network will make use of new technology including Segment Routing,
Flex Algo, SR Policy.
[HLG4] New global network will make use of a Path Optimizer and PCE for traffic
engineering purposes.
[HLG5] New global network will adopt intent based, zero touch network principles.
[HLG6] New global network will adopt DevOps practices.
[HLG7] New global network would like to make use of an ecosystem that is based on an
open framework that provides a platform for visual and low code automation activities to
take place (Python).
[HLG8] New global network would like to make use of an ecosystem that allows for service
provisioning & orchestration.
[HLG9] New global network would like to make use of an ecosystem that allows for visual
workflow and closed loop automation.
[HLG10] New global network would like to make use of an ecosystem that allows for multi-
vendor provisioning, config management and compliance.
[HLG11] New global network would like to make use of an ecosystem that allows for
diagnostics and service assurance with analytics.

5.3 High Level Automation Requirements as Communicated by the Customer
[HLR1] Solution must be based on a micro-services architecture that can be deployed on a
public cloud.
[HLR2] Solution must abide by good software engineering and abstraction principles
(example: network, pull/push Kafka integration).
[HLR3] Solution must offer flexible perpetual licensing.
[HLR4] Ecosystem must support a multi-vendor network environment (all component
functionalities must be vendor agnostic).
[HLR5] Ecosystem must support network configuration and state modelling (YANG - OC; all
models defined in http://ops.openconfig.net/branches/models/master/).
[HLR6] Ecosystem must support service configuration modelling (YANG).
[HLR7] Ecosystem must support resource modelling (YANG).
[HLR8] Ecosystem must support resource discovery and create a multi-view topology (L1,
L2, L2/L3 Service, BGP, IGP/Flex-Algo, etc.).

[HLR9] Ecosystem must support intent based (WF modify intent and direct to device network
configuration).
[HLR10] Ecosystem must support protocols and APIs as defined in high level architecture.
[HLR11] Ecosystem must support non-strategic protocols such as SNMP, NETCONF, PCEP,
etc.
[HLR12] Ecosystem must support config store to enable review/approval/rollback –
automate or on-demand.
[HLR13] Ecosystem must support service pre-checks and post-checks.
[HLR14] Ecosystem must support local domain(s) service orchestration as well as interface
with a higher-level service orchestrator (example: e2e network slicing across unique service
domains).
[HLR15] Ecosystem must support a service assurance model that includes fault/event
management, performance management including synthetic probing, service
validation/health/KPI monitoring, diagnostics, visual troubleshooting.
[HLR16] Ecosystem must improve resource MTTR (Advanced logging collection for TAC
Engagement, Automated RMA, etc.).
[HLR17] Ecosystem must support workflow automation for but not limited to software
upgrades, provisioning and bulk changes, alerting, network self-healing/closed loop
automation, troubleshooting and MOPs.
[HLR18] Ecosystem must comply with and simplify design, plan & build processes
(Standardised resource/service specification and catalogue, Supply Chain, Bill of Material,
Status management).
[HLR19] Ecosystem must support resource/service provisioning and activation (ZTP,
automated model based turn up/turn down is in scope).
[HLR20] Ecosystem must support resource monitoring & assurance.
[HLR21] Ecosystem must offer advanced visualization and dashboard capabilities.
[HLR22] Ecosystem must make use of AI/ML analytics (local or external integration).
[HLR23] Ecosystem components must come pre-integrated and should require minimal to
no development effort from CUSTOMER.
[HLR24] Ecosystem must come prebuilt with physical and logical device, service templates
as well as fulfilment, assurance rules and recipes. It should be easy to modify these to suit
CUSTOMER service offerings.

5.4 PCE High Level Principles Requirements as Described by the Customer
For the sake of simplicity, the term PCE is used here to describe a function that
computes/optimizes and programs a path on a network endpoint.
The high-level network principles are:
• GNV is an underlay with Centralised Traffic Engineered overlays.
• GNV will support multicast traffic.
• GNV will be able to carry and route unencapsulated IPv[46] traffic.
• GNV can deliver high priority traffic in the event that Centralised Traffic Engineering fails.
• GNV will route around all network failures in under 2 minutes.

The following high-level requirements apply:


[PCE1] SR Policy is to be used for traffic forwarding (static, dynamic, on-demand).
[PCE2] Any PCE path state changes must be introduced via SR Policy construct.
[PCE3] PCE must be able to steer L2 (PW3, EVPN) and L3 service traffic.
[PCE4] PCE must manage and prevent bandwidth congestion (with and without label
compression).
[PCE5] PCE must be able to compute and invoke constrained paths (latency, disjointness,
best-effort).
[PCE6] PCE must support inter-domain traffic steering.
[PCE7] PCE must use telemetry (OC) data for path computation (interface, topology).
