
ONAP Learnings

Dec 28th, 2018

Presenter: Srinivasa Addepalli


Contributions from: Deepak, Kannan, Itohan, Haibin, Lianhao, Kiran, Ritu, Dileep, Ruoyu, Libo
Disclaimer: A few feedback items are already known to the community and may be duplicates. A few others may stem from a lack of understanding of ONAP capabilities.
Background

• Feedback is from
- ONAP architects, developers & integration engineers
- ONAP users who try to onboard and bring up their own VNFs
• Some of the feedback is already known to the ONAP community; please
treat it as supporting data
Feedback Areas

• ONAP Value statement


• ONAP Installation
• ONAP Pre-onboarding of services
• ONAP service onboarding and Life cycle management of service instances
• ONAP Size
• ONAP regression
• Miscellaneous
• Q&A
Common feedback
Positives
• Active community
• Good number of tutorials (video and text)
• Sample VNFs and use cases to jump-start VNF vendors and ONAP users
• Immediate help from integration engineers and the community in general
• Features are being added at a good pace

Challenges / Opportunities
• ONAP documentation is not comprehensive enough; users need to depend on the wiki, but the wiki is not organized for ONAP users.
• Wiki is not well organized on a release basis
• Knowledge base is lacking: can't find issues faced by ONAP users, how they got fixed, etc.
• ONAP size is too big, not developer friendly
• Not leveraging industry best tools/practices
• No example/sample use case that showcases many features of ONAP
• Lack of GUI/CLI support (or bugs in it) for a few operations, which forces users to access services via curl/Postman.
• Design tools and flexible workflows are expected to reduce development time and provide flexibility, but they seem to have the opposite impact (possibly due to a lack of documentation/training materials)
• Creating a new use case with new VNFs takes too long (even with HEAT template/env files already tested on OpenStack).
ONAP Value Statement
Some common perceptions of ONAP
• It is only for operators
• It is another abstraction on top of OpenStack – only OpenStack sites are supported.
• It just adds VNF onboarding features to existing OpenStack
• It can manage only a few cloud sites; it is not meant for hundreds of sites.

Need to change perception


• It is a Multi-Cloud abstraction of various technologies – such as K8S, Azure, AWS
• It can be used by enterprises in other use cases such as private 5G (CBRS).
• It can be used in scenarios where a service spans multiple cloud regions.
• Scenarios where a given VNF needs to be brought up multiple times in multiple sites.
• Scenarios where WAN connections are required.
• Scenarios where VNFs are placed in locations that are closer to the consumer and in locations where the right HW is present.
• Scenarios where a service is expected to be monitored as a single entity (monitoring of each workload of the VNFs of the service)
• Scenarios where analytics are performed across workloads/VNFs of a service.
• Scenarios where scale-out, load sharing, traffic distribution (not only across nodes in a site, but across multiple sites) are needed
ONAP Value Statement – feedback/suggestions

• Messaging: blog post and white paper detailing the value of ONAP over K8S,
OpenStack and other service orchestrators.
• A comprehensive use case (using sample VNFs) that shows the value of ONAP with
respect to the following:
- Service that consists of multiple VNFs, where each VNF has one or more workloads.
- Service where VNFs are to be placed in different locations.
- Service where a VNF is instantiated multiple times in different locations.
- Service where VNF workloads are placed in locations having the right hardware features.
- Showcase monitoring of a service instance (consisting of multiple VNFs with several workloads)
- Showcase analytics of a service instance (example: CPU usage, memory usage, traffic characteristics of
all workloads of all VNFs on a per-service-instance basis, SLA compliance, etc.)
- Showcase scale-out of workloads of a given VNF of the service
- Showcase traffic load sharing across multiple workload instances.
- Showcase that VNF workloads are always up and running (even after a crash)
- Showcase update of new configuration in all workload instances
- Showcase a service that has VNFs spanning OpenStack, K8S and Azure/AWS cloud regions
ONAP Installation Challenges & Opportunities

Bringing up ONAP took more than a week

Goal should be to bring up ONAP in minutes!

Challenges and Feedback/Suggestions
• Challenge: No clear documentation on hardware resource requirements (for deploying on bare-metal) or on the type of instances to be taken in public clouds to bring up ONAP
- Suggestion: The Getting started guide (https://onap.readthedocs.io/en/casablanca/release/index.html#getting-started-with-onap) should include the requirements to deploy ONAP on bare-metal servers as containers and on public cloud deployments.
• Challenge: ONAP installation always fails to get images in time
- Suggestion: It appears that a local dockerhub (a local image registry) is needed, but this is not clear from the documents. Add instructions to create the local dockerhub before ONAP installation is started.
• Challenge: Running the entire ONAP in one VM (for cost reasons) was tough to make work
- Suggestion: Clear documentation on the challenges and ways to mitigate them would be good.
- Suggestion: Documentation on using public cloud CaaS facilities


ONAP Installation Challenges & Opportunities
Challenges
“Spent two weeks bringing up ONAP due to dependency challenges between various components (example: OOF and MUSIC)”
• Currently, dependencies are handled using initContainers with pod-ready checks and job-completion checks
• A few components declare the service ready based on TCP probes, but they should check for actual application readiness.
• Higher-level services need to wait for the readiness of multiple lower-level services before they run and make themselves available. Complexity increases with the number of dependency levels, so this is not done well in the individual helm charts

Feedback/Suggestions
• Readiness checks need to consider the readiness of the application in the container/Pod.
• Don't make initialDelaySeconds too big, as this delays ONAP installation. Use failureThreshold appropriately so that a lower initialDelaySeconds value can be used.
• Avoid dependency checks in each individual project's helm charts. Consider using ARGO (https://argoproj.github.io/), which consolidates all dependencies in one place, makes debugging easier, and keeps them independent of future ONAP packaging profiles (where the components of ONAP differ from one package to another).
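The trade-off between initialDelaySeconds and failureThreshold can be illustrated with a little arithmetic. The sketch below is a simplified model, not Kubernetes behavior in full: it assumes probes fire exactly at initialDelaySeconds and then every periodSeconds, and it ignores timeoutSeconds, successThreshold and scheduling jitter. The function names and the example numbers are made up for illustration.

```python
import math


def first_ready_time(app_start_s, initial_delay_s, period_s):
    """Time (seconds) of the first probe that finds the application ready.

    Simplified model: probes fire at initial_delay_s, then every period_s.
    """
    if app_start_s <= initial_delay_s:
        # The very first probe already succeeds, but not before the delay.
        return initial_delay_s
    probes_needed = math.ceil((app_start_s - initial_delay_s) / period_s)
    return initial_delay_s + probes_needed * period_s


def last_tolerated_probe(initial_delay_s, period_s, failure_threshold):
    """Time of the last probe that may still fail before the threshold is hit.

    Relevant for probes whose failure triggers a restart (liveness/startup).
    """
    return initial_delay_s + (failure_threshold - 1) * period_s


if __name__ == "__main__":
    # Hypothetical component that is actually up after 20 seconds.
    print(first_ready_time(20, initial_delay_s=120, period_s=10))
    print(first_ready_time(20, initial_delay_s=10, period_s=10))
    # A short delay with a generous threshold still tolerates a slow start.
    print(last_tolerated_probe(10, period_s=10, failure_threshold=12))
```

In this model a large initial delay makes every component wait the full delay even when it starts quickly, while a short delay combined with a higher failureThreshold covers the slow-start case without penalizing the fast one.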
ONAP Installation Challenges & Opportunities
Challenges and Feedback/Suggestions
• Challenge: “Needed to change the SO INFRA BPMN config-map from v1 to v2”

From
“endpoint: http://so-openstack-adapter.onap:8087/services/rest/v1/vnfs”
to
“endpoint: http://so-openstack-adapter.onap:8087/services/rest/v2/vnfs”
- Suggestion: Ensure that the default config-map works for all use cases with various functional capabilities.
ONAP Operation Challenges & Opportunities
Challenges and Feedback/Suggestions
• Challenge: Lack of documentation on user management is forcing people to use default user names and passwords for all NB operations.
- Suggestion: There is no user and RBAC management document in the “User Guides” section of readthedocs; add one.
- Suggestion: Provide CLI/GUI to do user and RBAC management.
• Challenge: Following industry best practices such as password-less authentication
- Suggestion: Good to support JWT
- Suggestion: Good to support FIDO2
• Challenge: No monitoring support for ONAP micro-services
- Suggestion: Prometheus support for monitoring ONAP micro-services
- Suggestion: Guidelines on what metrics to expose, how to expose them, and how they can be used to troubleshoot long-term issues.
• Challenge: Comprehensive tracing to troubleshoot issues
- Suggestion: Guidelines on logging for tracing and/or usage of service mesh technologies
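As a rough idea of what "exposing metrics" for Prometheus could look like, the sketch below renders a request counter in the Prometheus text exposition format using only the standard library. The class, the metric name `onap_requests_total` and its labels are hypothetical and not part of any ONAP component; the sketch also assumes label values contain no characters that need escaping. A real service would more likely use a Prometheus client library and serve this text over HTTP.

```python
from collections import Counter


class RequestMetrics:
    """Counts handled requests per (service, HTTP status). Hypothetical helper."""

    def __init__(self):
        self._counts = Counter()

    def observe(self, service, status):
        """Record one handled request."""
        self._counts[(service, status)] += 1

    def render(self):
        """Return the counters in Prometheus text exposition format."""
        lines = [
            "# HELP onap_requests_total Requests handled, by service and HTTP status.",
            "# TYPE onap_requests_total counter",
        ]
        for (service, status), count in sorted(self._counts.items()):
            lines.append(
                f'onap_requests_total{{service="{service}",status="{status}"}} {count}'
            )
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = RequestMetrics()
    metrics.observe("so", "200")
    metrics.observe("so", "200")
    metrics.observe("so", "500")
    print(metrics.render(), end="")
```

Counting by service and status is one possible answer to the "what metrics to expose" question: a rising share of error statuses for one service is a simple long-term troubleshooting signal.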
ONAP Pre-onboarding Challenges & Opportunities

It took quite a bit of time to make the pre-onboarding steps work

Challenges
ONAP users are expected to know curl and Postman to create various ONAP resources such as
- Complexes, cloud regions, cloud owners (OpenStack tenants), ONAP tenants, associating ONAP tenants with cloud regions/owners, associating ONAP tenants with the services they control and with the customers of those services
Specific challenges:
• Curl commands (more than 6 curl commands are needed) – see the vFW use case here: https://onap.readthedocs.io/en/casablanca/submodules/integration.git/docs/docs_vfwHPA.html#docs-vfw-hpa
• No proper description of the above resources in readthedocs
• Hiding the complexity in Robot makes people think that they need to update Robot for every use case.

Feedback/Suggestions
• ONAP concepts document detailing terminology and how/when to use the various ONAP resources.
• CLI/GUI for all ONAP resources:
- Complex management
- Cloud region & owner management
- ONAP tenant management
- Subscriber management
- ONAP tenant and service relationship management
- ONAP subscriber to service relationship management
• Updating Robot to use CLI/GUI.
• TLS support to communicate with cloud regions.
ONAP Service onboarding and LCM of service instances

Challenges and Feedback/Suggestions
• Challenge: There is a GUI to create the VLM (Vendor License Model), VSP and Service, but there is no documentation on why this is being done, what entitlements and feature groups are, or what information VNF vendors are expected to provide (many follow the examples blindly).
- Suggestion: Improve documentation in the readthedocs user guide.
• Challenge: It is not clear what VNF vendors are expected to provide for testers and approvers to test and approve the VNF.
- Suggestion: More information in the readthedocs user guide would help
• Challenge: VID is not flexible enough to create a service instance that uses OOF for homing and Multi-Cloud. Hence, the team had to use the RESTful API to SO to instantiate the service. Since this RESTful API requires service model UUIDs, one needs to refer to the output CSAR to get hold of them. Too many manual operations. It is concerning that these steps are being accepted by the community.
- Suggestion: Always use CLI/GUI. If something can't be done via CLI/GUI, it has to be fixed before declaring a use case successful.
- Suggestion: Fix VID to provide flexibility
ONAP Service onboarding and LCM of service instances

Challenges and Feedback/Suggestions
• Challenge: Associating a policy with a service, customer or resource (VNF) is complicated. Too many manual actions (policy models can be uploaded using the GUI, but not the actual policies of the models)
- Suggestion: Improve documentation in the readthedocs user guide on how to create policies
- Suggestion: Allow the flexibility of creating policies even before services are created or onboarded. Having two operations is good: creation of policy records, and then associating them with a service/VNF/customer.
- Suggestion: Ensure that the ‘association’ operation does not require the user to go through output CSAR files or debug logs of SO.
• Challenge: Need a single VID command/GUI action to bring up the entire service; currently multiple actions are expected to be taken.
- Suggestion: Create one generic BPMN which does all actions.
ONAP Service onboarding and LCM of service instances

Challenges/requirements (Feedback/Suggestions: each item below is a new feature suggestion)
• Could not find a way to monitor workloads of the VNFs of a service at the service instance level
• Could not find a way to monitor compute nodes of each cloud site
• A way to save power by consolidating workloads and switching off unused nodes
• A way to predict traffic for proactive scale-out
• A way to predict faults
• A way to do root cause analysis of service misbehavior, considering infrastructure (nodes, storage, network, etc.)
ONAP Size challenges and Opportunities

Size is too big for its features


- Duplicate orchestration implementations
- Not leveraging existing solutions
- Many DB technologies
- Too many DB instances of same DB technology
- Java (JVM) takes too much memory
- Wrappers (e.g., not leveraging Kafka directly)
- Copy & paste errors in CPU, memory and disk requirements
- Overuse of micro-services (size concerns as well as too many REST API calls for a given operation)
- No flexibility to create smaller packages based on requirements, such as:
  - Deployments that only have OpenStack cloud regions
  - Deployments that only have K8S-based cloud regions
  - Deployments that only have services with HEAT or Helm based VNFs
  - Deployments that only have services that use ETSI TOSCA
  - Deployments that can leverage a service mesh (e.g., ISTIO)
  - Deployments that always use VNFMs for VNF LCM: scale-out, monitoring and analytics
  - Deployments that always use EMS for VNF configuration
ONAP Size challenges and Opportunities

Opportunities
- Avoid duplicate instances of DBs
- Enforce only few DB technologies
- Avoid duplicate instances of monitoring infrastructure
- Leverage a service mesh (such as ISTIO) and avoid duplicating its functions in ONAP
- Reduce the usage of wrappers on very popular technologies
- Provide utilities to measure resource requirements, and ways to detect resource requirements
and auto-tune at run time
- Provide various ONAP packages based on deployment requirements
- Serverless architecture for API driven micro-services
ONAP regression and opportunities

Challenge
- During Casablanca integration, it was felt that the quality of the release had gone down compared with the Beijing release
- Need to continuously ensure that working features continue to work

Opportunity
- Automation (similar to Robot) of use case verification:
  - E2E setup (including cloud regions, ONAP)
  - Pre-onboarding steps
  - Onboarding steps
  - Instantiation steps
  - Generate traffic
  - Verify that the VNF is working
  - LCM actions – restart and scale-out
- Run it every day, identify issues and auto-generate emails/JIRA stories.
Orchestration gaps – To be discussed

• Dependency management among workloads of a VNF: DAG representation & implementation of the same
• Ability to bring up a VNF multiple times based on criteria as part of
service instantiation. Use case: vCDN
• Common parameters definition across VNFs of a service
• Common parameters definition across services
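For the DAG item above, a minimal sketch of dependency-ordered bring-up is shown below using Python's standard `graphlib` module (Python 3.9+). The workload names and the `bring_up_waves` helper are hypothetical; the sketch only shows how a dependency graph yields "waves" of workloads that can be started together, not how ONAP would implement it.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def bring_up_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group workloads into waves; each wave depends only on earlier waves.

    depends_on maps a workload to the set of workloads it needs first.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    # Hypothetical VNF with four workloads.
    example = {
        "firewall": {"db"},
        "generator": {"firewall"},
        "sink": {"firewall"},
    }
    for number, wave in enumerate(bring_up_waves(example), start=1):
        print(number, wave)

    try:
        bring_up_waves({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("cycle detected")
```

Keeping the graph in one place also makes invalid definitions visible early, since a cyclic dependency is rejected before any workload is started.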
Next steps

• JIRA stories for known items


• Bring items that require architecture team attention to the appropriate
architecture task forces
• Work with the community to brainstorm on a comprehensive use case that
leverages many features of ONAP
• Identify gaps in realizing the vfw-hpa use case with CLI/GUI
- Without using curl commands, kubectl exec, getting data from logs for further
operations, or modifying helm charts
• E2E automation (including installing OpenStack cloud regions) of realizing a
use case, and using it for regression testing – work with the integration team.