CBIS 20.xxx
This document is intended for use by Nokia's customers (“You”) only, and it may not be used except for
the purposes defined in the agreement between You and Nokia (“Agreement”) under which this document
is distributed. No part of this document may be used, copied, reproduced, modified or transmitted in any
form or means without the prior written permission of Nokia. If you have not entered into an Agreement
applicable to the Product, or if that Agreement has expired or has been terminated, You may not use
this document in any manner and You are obliged to return it to Nokia and destroy or delete any copies
thereof.
The document has been prepared to be used by professional and properly trained personnel, and You
assume full responsibility when using it. Nokia welcomes your comments as part of the process of
continuous development and improvement of the documentation.
This document and its contents are provided as a convenience to You. Any information or statements
concerning the suitability, capacity, fitness for purpose or performance of the Product are given solely
on an “as is” and “as available” basis in this document, and Nokia reserves the right to change any such
information and statements without notice. Nokia has made all reasonable efforts to ensure that the
content of this document is adequate and free of material errors and omissions, and Nokia will correct
errors that You identify in this document. Nokia's total liability for any errors in the document is strictly
limited to the correction of such error(s). Nokia does not warrant that the use of the software in the Product
will be uninterrupted or error-free.
NO WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY OF AVAILABILITY, ACCURACY, RELIABILITY, TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, IS MADE IN RELATION TO THE CONTENT OF THIS DOCUMENT. IN NO EVENT WILL NOKIA BE LIABLE FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO SPECIAL, DIRECT, INDIRECT, INCIDENTAL OR CONSEQUENTIAL OR ANY LOSSES, SUCH AS BUT NOT LIMITED TO LOSS OF PROFIT, REVENUE, BUSINESS INTERRUPTION, BUSINESS OPPORTUNITY OR DATA THAT MAY ARISE FROM THE USE OF THIS DOCUMENT OR THE INFORMATION IN IT, EVEN IN THE CASE OF ERRORS IN OR OMISSIONS FROM THIS DOCUMENT OR ITS CONTENT.
This document is Nokia proprietary and confidential information, which may not be distributed or disclosed
to any third parties without the prior written consent of Nokia.
Nokia is a registered trademark of Nokia Corporation. Other product names mentioned in this document
may be trademarks of their respective owners.
This product may present safety risks due to laser, electricity, heat, and other sources of danger.
Only trained and qualified personnel may install, operate, maintain or otherwise handle this product
and only after having carefully read the safety information applicable to this product.
The safety information is provided in the Safety Information section in the “Legal, Safety and
Environmental Information” part of this document or documentation set.
Nokia is continually striving to reduce the adverse environmental effects of its products and services. We
would like to encourage you as our customers and users to join us in working towards a cleaner, safer
environment. Please recycle product packaging and follow the recommendations for power use and proper
disposal of our products and their components.
If you have questions regarding our Environmental Policy or any of the environmental services we offer, please contact Nokia for additional information.
CBIS Acceptance Test Procedures (ATP) - R20GA, R20SP1 and R20SP2
Contents
1 Summary of Changes..................................................................................................................................... 6
3 Scope of Document........................................................................................................................................ 9
3.1 Overview................................................................................................................................................... 9
3.2 Assumptions..............................................................................................................................................9
3.3 10 Test Execution Tips.............................................................................................................................9
3.4 Prerequisites........................................................................................................................................... 10
3.5 Test Outcomes....................................................................................................................................... 10
3.6 CLI Deprecation - Important information................................................................................................ 10
1 Summary of Changes
The following tables list the issues and dates of the last publications of the CBIS documentation set.
Discovery Center Repository | Document Set | Version | Date | Update Frequency
Please ensure that the correct and relevant information/procedure is used according to the specific
release.
For additional information please contact the CBIS Hardware Certification team.
3 Scope of Document
3.1 Overview
These are generic test cases that are used by delivery teams for CBIS acceptance at customer sites.
The test cases are optional; customize these ATPs per customer, as there is no need to perform all of them. In addition, the appendix (the CLI appendix included with this document) contains all relevant command examples.
Note: Multi-line code shown in this manual cannot be pasted directly from a PDF file, because the PDF splits the code into two or more lines, causing the pasted code to fail when applied. In these cases, copy the code into a plain-text editor, rejoin the lines, and then copy and paste the corrected code to the server.
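As a non-interactive alternative to the note above, split lines can be rejoined with standard tools. A minimal sketch; the pasted text below is a stand-in for real clipboard content, and the command it forms is only illustrative:

```shell
# Rejoin a command that a PDF paste has split across lines.
# The sample text stands in for pasted clipboard content.
pasted='openstack server create --flavor m1.small
--image cirros --network net1 demo-vm'
fixed=$(printf '%s' "$pasted" | tr '\n' ' ')
echo "$fixed"
```

The rejoined single-line command can then be pasted to the server.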
3.2 Assumptions
The delivery teams have experience with OpenStack and are trained with the CBIS product.
1. Time saving:
6. ATP Kick-off deck – Create a slide deck to kick off the ATP window. This kick-off should include the scope of testing, a reminder of the value of the system, a reminder that this is a testing phase and defects will be found, and instructions on how to perform the ATP.
7. ATP User Manual – This book.
8. Pre-run scripts – Ideally, you should pre-run the scripts before users try to execute them. You are familiar with the system, so your review of the scripts will catch steps that are not obvious or are incorrect. This helps ensure a much more smoothly run ATP.
9. Report and Track Defects – Ensure that users report defects into a defect tracking system, including logins, URLs, and steps to recreate, and that they know how to set severity and priority values if appropriate.
10. Coordinate schedule – Ensure that the ATP testing schedule has been coordinated after the initial installation and that the setup is running normally and is configured correctly.
3.4 Prerequisites
Successful CBIS installation per the high and low level design agreed upon with the customer. In
detail:
• Hardware and Networking equipment installation and configuration, according to the CBIS
Installation Manual.
• (Hypervisor) Undercloud Physical Server installation.
• CBIS Manager installation over the Hypervisor using the CBIS Manager manual.
• Image to be used for testing.
• All Stacks.
• Appendix 2: Artifact Files for ATP Tests on page 174 in the sub-section titled Patch Files on
page 174
• Pass – Passed OK
• Partially passed – Passed with comments (can be corrected in a later release or as a roadmap commitment)
• Failed – The test case did not execute successfully
• Skipped – The test was passed over and not run
Description Install the CBIS Undercloud Physical Server, Undercloud VM, and Overcloud.
Test Case ID
Objective Verify that the CBIS Undercloud and Overcloud have installed successfully.
Estimated Duration ~8 hours, depending on the setup scale and selected options, and preferably run during the previous night.
Supported From Version All
Prerequisite/s
a. Follow the relevant CIQ (Customer Information Questionnaire) and HLD (High Level
Design).
b. Follow the hardware configuration sections in the CBIS Installation manual (per
hardware type).
2. Use the CBIS Manager manual and install the CBIS Undercloud Physical Server.
3. Use the CBIS Manager and deploy the CBIS Undercloud VM and Overcloud.
Note: The above 2 steps are normally executed within a single installation deploy activity. Desired settings can be prepared in advance on another setup and imported from a saved JSON file, or entered online based on the relevant platform default template, while consulting the CBIS Manager manual.
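The saved-settings JSON mentioned in the note can be sanity-checked before import. A minimal sketch; the file name, path, and content here are hypothetical stand-ins:

```shell
# Create a stand-in settings file for illustration only.
printf '{"deployment": {"name": "cbis", "version": "20"}}' > /tmp/settings.json

# Validate that the saved settings file parses as JSON before importing it.
if python3 -m json.tool /tmp/settings.json > /dev/null; then
    echo "settings.json: valid JSON"
else
    echo "settings.json: invalid JSON"
fi
```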
Note: When CBIS is installed with Nuage, the ION delivery/NPI should be
involved as Nuage binaries and licenses are provided by ION teams. To support
TLS for CBIS external services (Horizon/Keystone, Zabbix, Kibana), follow the
Overcloud TLS manual configuration steps.
Note: If a secured communication setup is required, run the appropriate security hardening sections after deployment and then execute the ATP; no additional steps are required.
Note: Testing IPv4 or IPv6 addresses in the rest of the test cases depends on the
type of installation in the current test case (either IPv4 or IPv6).
Expected Result
1. The CBIS Undercloud Physical Server has been deployed successfully.
2. The CBIS Undercloud and Overcloud are successfully installed and all steps are marked successful (Green).
3. CBIS automated verification steps have completed successfully.
• Verification steps are executed automatically by the NOVL tool (Node Validation
Tool).
• NOVL Log can be found in Undercloud VM under /var/log/cbis/cbis_novl_
res.log.
• Check that the CBIS controllers and computes are configured per the high and low
level design (NOVL will show this).
• Undercloud VM is accessible via SSH from the customer network.
• Controllers are available via SSH from the customer network.
• All passwords in the system are set per the user_config.yaml file.
Status OK | not OK
Comments
Test Case ID
Supported From Version All
Test Execution 1. Open the OpenStack Horizon Dashboard and check various windows.
2. Open a CLI session to the Undercloud Physical Server (HV). Open tmux. SSH to the
Undercloud using:
ssh stack@uc
6. Create a new user and login to the OpenStack Horizon Dashboard as follows:
a. From the CBIS Manager - External Tools tab, open Horizon and login to the
OpenStack Horizon Dashboard using the admin user.
b. Navigate to Horizon Dashboard > Identity > Projects > Create a Project.
c. Enter a name and click Create Project.
d. Navigate to Horizon Dashboard > Identity > Users > Create a User.
e. Enter the User Name and Password.
f. Select the Primary Project that you created in step c, above. The Role should remain
_member_.
g. Click Create User.
h. Logout from the OpenStack Horizon Dashboard.
i. Re-login with the newly created user.
7. Check the connection between the Undercloud and the Overcloud nodes by running this
command:
8. From the CBIS Manager - External Tools tab, connect to Zabbix https://[OpenStack
Horizon Dashboard IP address]/zabbix. The user and password are as configured in
CBIS Manager >LCM >Installation> Overcloud > Security Configuration.
9. If applicable, check the Ceph status on one of the compute or controller nodes as follows. From the CBIS Manager - External Tools tab, open the Ceph Storage dashboard.
10. If applicable, check the external storage status (login to NetApp or EMC controllers).
8. The Zabbix UI is accessible. No alerts should be shown. If alerts are shown, they require specific examination to determine whether they are expected.
9. From one of the controllers check that the Ceph health status is OK by running the
command:
sudo ceph -s
Note: If applicable (that is, when Nuage is installed with the CBIS version) and all is working correctly with no alerts: when integrating with Nuage SDN, the Nuage components are not monitored by Zabbix. In addition, Nuage SDN High Availability requires additional physical hosts (with the current level of integration between CBIS and Nuage, the VSD and VSC are installed as VMs on the Undercloud host).
Status OK | not OK
Comments
4.1.3 CRUD (Create, Update and Delete) Virtual Resources in OpenStack (Tenants,
Images, Networks, VMs, Stacks and vRouters)
Test Case ID
Estimated Duration 2 Hours
Supported From Version All
Prerequisite/s Retrieve ATP images and YAML files from the CloudBand team (OLCS, FTP, Support
Team).
3. Make sure that the tenant has been created by executing the following command:
4. Create a new user for the tenant that you created in the previous steps.
5. Download the image that will be used for testing locally (to the Undercloud VM).
6. Load the image to the Overcloud (use either the command below or the Horizon dashboard).
Example:
7. Create a few flavors on the setup (use either the command below or the Horizon dashboard).
8. Create an external network on the setup as shown in the following (use either the commands below or the Horizon dashboard):
a. stacks_ATP1_yaml.yml creates:
Note: Ensure that there is connectivity between the VMs on all the
networks.
11. Create a floating network and then deploy via Horizon (Navigate to Project >
Orchestration > Stacks) a stack that creates 1 x external network with a floating IP on
it, (use the attached file - stack_ATP3_yaml.yml):
12. The external IP addresses should be routable IPs that can reach the tester computer.
d. Ensure that there is 1 VM on each network connected to the router with the following
steps:
a. Edit the stack_ATP3_yaml.yml file and ensure that the “segmentation_id:” value is not already allocated to the resources ProviderNet1 and ProviderNet2.
13. Check that the router name per the stack_ATP3_yaml.yml file is RouterSB as follows:
a. Run a long endless ICMP connectivity check (ping) to the floating IP address of the
VM from the Tester's computer.
b. While running the ICMP connectivity check, reboot the controller that was found
active in step (12-e).
c. The ICMP traffic is expected to break for a short time (1-2 sec) and then continue.
14. Create 1 VM, and perform the following VM lifecycle actions: start/stop/restart.
Status OK | not OK
Comments
Test Case ID
Objective Verify that volumes can be created, attached, detached, and deleted.
Supported From Version All
Test Execution 1. Create 1 volume and 2 instances (volume type should be tripleo-ceph).
3. Ensure that the new device was added to the instance using the fdisk -l or lsblk commands (usually it is /dev/vdb).
4. Create the file system and mount-point as follows:
5. Create or copy files to the VM (you can use the dd utility for writing).
6. Verify the md5sum of the file which you have copied/created.
Example
7. Detach the volume from the instance and attach it to the other instance.
8. Connect to the new VM and check that the volume is attached using fdisk -l or lsblk
commands.
9. Create the file system and mount-point as follows:
10. Check that all files that you created/copied in step 6 exist.
11. Detach the volume from the instance.
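The write-and-verify flow of steps 5-6 and 10 can be sketched locally as follows; the directory and file name below are stand-ins for the mount point and test file created on the attached volume:

```shell
# Write a test file with dd and record its checksum (as in steps 5-6).
mnt=$(mktemp -d)                       # stand-in for the volume mount point
dd if=/dev/zero of="$mnt/testfile" bs=1M count=4 2>/dev/null
sum_before=$(md5sum "$mnt/testfile" | awk '{print $1}')

# After detaching and re-attaching the volume (step 10), re-check the checksum.
sum_after=$(md5sum "$mnt/testfile" | awk '{print $1}')
[ "$sum_before" = "$sum_after" ] && echo "checksum OK"
```

On the real setup, the second md5sum is run on the second instance after the volume has been re-attached and mounted there.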
Status OK | not OK
Comments
4.1.4.2 CRUD Storage on Storage Node - Ceph (Create, Attach, Detach, Delete)
Test Case ID
Objective Verify that volumes can be created, attached, detached, and deleted on storage nodes
Supported From Version All
3. Make sure that the new device was added to the instance using the fdisk -l or lsblk commands (usually it is /dev/vdb).
4. Create the file system and mount-point as follows:
5. Create or copy files to the VM (you can use the dd utility for writing).
6. Verify the md5sum of the file that you copied/created.
Example
7. Detach the volume from the instance and attach it to the other instance.
8. Connect to the new VM and check that the volume is attached using fdisk -l or lsblk
commands.
9. Create the file system and mount-point as follows:
10. Check that all files that you created/copied in step 6 exist.
11. Detach the volume from the instance.
Status OK | not OK
Comments
Description Deploy CBIS with OSDs on SSD disks as a new Ceph pool (volumes-fast)
Test Case ID
Objective Verify that Ceph builds the new pool volumes-fast, which includes SSD drives.
Estimated Duration
Supported From Version 18.0
Note: In addition to the above, ensure that the fast disk has been partitioned into 10G volumes, one for each of the local regular OSDs on the storage compute, by running the following command:
Each of the 10G volumes is a journal for the regular OSD on the storage compute; verify this by running the following command:
4. Reboot the storage nodes. After the storage boots, check that all OSDs are up in the same order as before the reboot.
5. Check the volume fast pool usage before creating a volume.
ceph df
d. Ensure that only the volume fast pool usage has been changed.
a. As Overcloudrc execute:
ceph df
a. In the image below, the volume status shows "available".
c. In the image below, the volume status has now changed to "in-use".
h. Scale in/out storage compute with fast pools according to the CBIS Manager
manual. Fast pools should be built as needed on the relevant devices.
Status OK | not OK
Comments
Test Case ID
Objective Verify that Ceph builds multiple pools according to user definitions.
Estimated Duration 1 day.
Supported From Version 18.5
Prerequisite/s The CBIS Node is installed successfully with storage nodes (AirFrame RM/OR) and at least 2 Ceph pools.
Test Execution 1. Check that the Ceph pools have been created according to your definitions.
2. Ensure that the pools' OSDs are really split into different pools.
3. Reboot one storage node.
4. Create an instance on each of the pools.
5. Attach volumes to the instances.
6. Write to the volumes.
7. Create instance/volume snapshots.
8. Create instances from those snapshots.
c. Use the command “ceph-disk list | grep osd” to see the OSD disk mapping.
4. Ensure that every instance/volume was created under the right pool by using the
following command: rbd ls <pool-name>.
Status OK | not OK
Comments
Test Case ID
Estimated Duration 20 minutes.
Test Execution 1. Log on to Horizon and navigate to Share Types in the Share tab as an admin user.
2. Click Create Share Type and enter cephfstype in the Name box.
3. Create Share volume in your Project > Share > Shares > Create Share.
4. Go to Share Overview, view the share properties, and take a capture of or copy the Path value.
5. Download a Cloud image of CentOS 7 or any other image on which you can install the ceph-fuse utilities.
6. Create an image out of the cloud image.
7. Create a storage network with the same VLAN used for storage and a range of IPs in that subnet, via: Admin > Network > Networks > Create Network.
8. Launch a VM instance based on the previously created image.
9. Add the storage interface from Horizon and initiate it.
10. Connect to the launched VM (from one of the controllers), after identifying your network identifier.
11. Check the storage interface status by using the ip command. In most cases it will be down.
12. Run the dhclient <interface name> command to bring the interface up.
13. Validate the ping from the VM to one of the storage IP addresses (of the controllers).
14. Install ceph-common and ceph-fuse utilities on the launched VM.
15. Create a Rule for share and for cephclient.
16. Create a cephclient keyring file for the cephclient that will be used to connect to the Ceph cluster from the VM (or from other VMs that will use the same share).
17. Copy the created file, ceph.client.cephclient.keyring, to the VM that was
previously created, to the directory /etc/ceph. Make sure that file mode is set to 644.
18. Validate that the file content is correct (assuming it was copied to the VM) from within the VM.
19. Identify the virtual IP of the Ceph storage network.
20. Create a ceph.conf file in the VM file system, where you want to consume the share.
21. Mount the share by using the ceph-fuse utility inside the VM.
22. Check that the mount point is ready, using the command: df -hT.
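The keyring permission requirement from step 17 can be verified with a short check; the file below is a local stand-in for /etc/ceph/ceph.client.cephclient.keyring inside the VM, and its content is a placeholder:

```shell
# Create a stand-in keyring file and enforce mode 644 as step 17 requires.
keyring=$(mktemp)
printf '[client.cephclient]\n\tkey = PLACEHOLDER\n' > "$keyring"
chmod 644 "$keyring"

# Verify the mode and that the file is non-empty (as in step 18).
mode=$(stat -c %a "$keyring")
[ "$mode" = "644" ] && [ -s "$keyring" ] && echo "keyring OK"
```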
Status OK | not OK
Comments
Description Create, Attach, Detach, and Delete volumes on NetApp external storage
Test Case ID
Objective Verify that volumes can be created, attached, detached, and deleted on NetApp external
storage.
Estimated Duration 1 Hour
Supported From Version All
c. Create a mount-point:
7. Create or copy files to the VM (you can use the dd utility for writing). You can use the
following example:
8. Verify the md5sum of the file which you copied/created. For example:
10. Write/copy to the volume while the 2 iSCSI paths are disabled.
11. Create a volume with a different size from the other volumes and attach it to the third instance.
12. Re-enable the 2 iSCSI paths:
13. Ensure that all iSCSI paths are active and running for all volumes.
14. Migrate the VM while copying/writing to it.
15. Detach volumes from instances.
16. Change volume-type from "tripleo_netapp" to "tripleo-ceph".
17. Delete volumes.
Status OK | not OK
Comments
This is also covered by the CRUD (Create, Update and Delete) Virtual Resources in OpenStack
(Tenants, Images, Networks, VMs, Stacks and vRouters) on page 15 test case.
Test Case ID
Objective Verify IPv4, IPv6 and Dual Stack Virtual Networking Support.
Estimated Duration 1 Hour.
Supported From Version All.
Prerequisite/s The CBIS Cluster is installed successfully and tenant, image and flavor are already created.
Test Execution This is a manual procedure for the CRUD (Create, Update and Delete) Virtual Resources in
OpenStack (Tenants, Images, Networks, VMs, Stacks and vRouters) on page 15 stack
creation.
As Tenant:
IPv4 net
IPv6 net
Status OK | not OK
Comments
Test Case ID
Objective Creating SR-IOV instance with direct ports under NIC 1 (infra NIC).
Estimated Duration 1 hour
Supported From Version 20 SP2
Prerequisite/s The CBIS Cluster is installed successfully with SR-IOV mapping from physnet X to NIC 1
port 1 and/or NIC 1 port 2.
By default, the SR-IOV mapping maps physnet 1 to NIC 1 port 1 and NIC 2 port 1 and
physnet 2 is mapped to NIC 1 port 2 and NIC 2 port 2.
Test Execution From the admin tenant, perform the following actions:
a. SSH to the SR-IOV compute on which the instance you just created resides.
b. Run sudo ip link show <interface name> and compare the MAC addresses that the instance interfaces received with the MAC addresses shown under sudo ip link show <interface name>.
The <interface name> is taken from what was configured in the user_config.yaml.
By default, the interfaces that are mapped to physnet 1 are NIC 1 port 1 and NIC 2 port 1.
6. Delete the instance and re-check using sudo ip link show <interface name>
that the virtual functions (VFs) are cleared.
vf 82 MAC fa:16:3e:9e:d3:23, vlan 404, spoof checking on, link-state enable, trust off
Note: After a VF is cleared from the system, the last MAC address is still
shown. This is a known behaviour and should not affect any functionality.
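The re-check in step 6 can be scripted by grepping the ip link output for the instance MAC. A minimal sketch using the sample VF line from the note above (on a live compute, vf_line would come from sudo ip link show <interface name>):

```shell
# Sample "ip link show" VF line, taken from the note above.
vf_line='vf 82 MAC fa:16:3e:9e:d3:23, vlan 404, spoof checking on, link-state enable, trust off'
mac='fa:16:3e:9e:d3:23'

# Search the VF listing for the instance MAC address.
if printf '%s\n' "$vf_line" | grep -qi "$mac"; then
    echo "MAC listed (expected even after clearing - see note above)"
else
    echo "MAC not listed"
fi
```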
4. The instance should be created and be in a running state. Inside the instance, when
running ifconfig, you should see the IP address that you configured in the Neutron
subnet or port. Run ifconfig <SR-IOV interface> within the instance to obtain the MAC
address (ether).
5. You should see the allocated VF by running, from within the compute on which you raised the instance, the command “ip link show <MAC address>”. This is the same MAC address taken from the SR-IOV port inside the instance.
6. The instance should be deleted, and after deleting the instance the VF should be unallocated right away. To validate this, run “ip link show <MAC address>”; the command should return no results.
Status OK | not OK
Comments
Test Case ID
Objective Send OSPF multicast traffic from a SR-IOV VM on compute X to a SR-IOV VM on compute
Y. Check (without trust enabled on the VF), that the traffic is not retrieved on the receiver
VM and then check again (with trust now enabled on the VF), that the traffic flows to the
receiver VM.
Estimated Duration 1 hour
Supported From Version 20 SP2
This way, the trust state (on or off) will be modified, as the enable_trust.service service re-configures the VFs with trust set to ON.
This is optional because, even without the help of the enable_trust.service service, you can change the trust state of a VF on the fly. The main issue is that such configuration changes are reset after a reboot.
To configure the CBIS Cluster with a trust state for the VFs of certain port(s):
From CLI:
Note: If trust: true is not set, then the default will be trust: false
sriov_mapping:
  - physnet: ref:common_network_config.physnets.physnet1
    port: ref:common_network_config.nic_1_port_1.name
    trust: true
  - physnet: ref:common_network_config.physnets.physnet2
    port: ref:common_network_config.nic_1_port_2.name
    trust: true
  - physnet: ref:common_network_config.physnets.physnet1
    port: ref:common_network_config.nic_2_port_1.name
    trust: true
  - physnet: ref:common_network_config.physnets.physnet2
    port: ref:common_network_config.nic_2_port_2.name
    trust: true
As the option to enable or disable trust on certain port(s) is not available in the CBIS Manager at this time, a small workaround is needed. The workaround requires installing the CBIS Cluster from scratch (that is, starting from the Undercloud VM phase, inclusive).
Note: The CBIS Cluster deployment can be split into several phases:
• Undercloud
• Hardware scan and generate templates
• Overcloud
1. Install the Undercloud VM from the CBIS Manager normally as you would even without
trust.
2. SSH or console to the Undercloud VM and configure trust: true in the user_config.yaml under /home/stack, as explained in the above CLI instructions.
3. After adding trust: true inside the user_config.yaml, continue with the rest of the installation from the UI.
Example
iperf2 (2.08) can be used to generate OSPF multicast UDP packets. The example refers to iperf, but any tool (built into Linux or not) that performs the same operation will do.
Note: Make sure the image used for creating the VMs contains a tool
which can sniff packets such as tcpdump. For example: RedHat 7.4
image with built-in tcpdump.
4. Create 2 instances using the ports created above. Make sure that each VM is scheduled on a different SR-IOV compute by using the --availability-zone parameter.
5. Validate that the virtual functions (VFs) from the proper interface are being used. To do
this, perform the following actions:
a. Connect via SSH or console to one of the instances that you just created and run
ifconfig to obtain the MAC address of the SR-IOV interface. Remember the MAC
address.
b. Connect via SSH to the SR-IOV compute which holds the instance from where you
have just taken the MAC address.
7. Connect via SSH or console to both of the instances that you have created. To simplify the process, designate the VM whose trust state was not changed as VM A, and the VM whose trust state was recently changed/validated as VM B.
From VM A (which can be with trust ON or trust OFF; it does not matter for this scenario), generate OSPF multicast packets, for example: iperf -c 224.0.0.5 -u
Note: iperf will translate the OSPF multicast IP address to its multicast MAC
address 01:00:5e:00:00:05
Note: Keep running the iperf OSPF multicast traffic from VM A till the end of
the test.
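The IP-to-MAC translation in the note follows the standard IPv4 multicast mapping: the low 23 bits of the group address are appended to the 01:00:5e prefix. It can be reproduced as:

```shell
# Map an IPv4 multicast group address to its multicast MAC address.
ip=224.0.0.5
IFS=. read -r o1 o2 o3 o4 <<EOF
$ip
EOF
# Only the low 23 bits of the IP are used, so the top bit of the
# second octet is masked off before it enters the MAC.
printf '01:00:5e:%02x:%02x:%02x\n' "$((o2 & 127))" "$o3" "$o4"
```

For 224.0.0.5 this yields 01:00:5e:00:00:05, matching the note.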
Now run:
“<BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP>”.
9. After VM B is configured with the allmulti flag and with trust on, the OSPF multicast packets sent from VM A with the destination IP 224.0.0.5 should now be seen in the tcpdump on VM B.
10. The instances should be deleted and the VFs trust state should not change.
Status OK | not OK
Comments
Test Case ID
Objective Verify Unmanaged Virtual Networks Support (no IPAM – IP Address Management).
Estimated Duration 1 Hour
Supported From Version All
Test Execution This is a manual procedure for the “CRUD Virtual Resources in OpenStack” stack creation.
Note: The relevant port is identified based on the subnet_id and the ip_address above.
| binding_vnic_type | normal |
| created_at | 2019-06-04T12:30:40Z |
| data_plane_status | None |
| description | |
| device_id | 5119cea1-c05f-4ff2-8d8d-b958bee477ad |
| device_owner | compute:sriov_zone |
| dns_assignment | None |
| dns_domain | None |
| dns_name | None |
| extra_dhcp_opts | |
| fixed_ips | ip_address='10.10.10.4', subnet_id='35e5fae5-
cf63-4ada-8260-8ce338469009' |
| id | 99335665-0d74-47e8-b8d7-bb3de5850a84 |
| mac_address | fa:16:3e:92:58:c9 |
| name | |
| network_id | 4012596a-9ead-43a7-85d1-141aad025ceb |
| port_security_enabled | False |
| project_id | aa251e43bedb4dfaa22e023c5b03fa3f |
| qos_policy_id | None |
| revision_number | 9 |
| security_group_ids | |
| status | ACTIVE |
| tags | |
| trunk_details | None |
| updated_at | 2019-06-04T12:59:35Z |
+-----------------------+--------------------------------------
-------------------------------------+
Note: The following command is issued on the compute node, where the
recently created VM is located.
9. Check that all iptables rules related to this port are updated with no security enforced:
Status OK | not OK
Comments
Test Case ID
Objective Check the ability to configure the VGT filter on Mellanox SR-IOV ports.
Estimated Duration ~8 hours.
Prerequisite/s Install CBIS on a platform with Mellanox NICs and VLAN ranges in the flat physnets fields.
Follow these steps:
1. Go to: CBIS Manager > Installation > Overcloud > VLANs > Physical Networks VLAN Ranges.
2. You should also set the VGT allowed VLAN ranges as follows. Go to: CBIS Manager > Installation > Customize host groups > SR-IOV-Performance-Compute > SR-IOV per port configuration and VGT allowed VLAN ranges.
3. The flat network should be enabled at the switch side (VLAN ranges as well):
a. In the CBIS Manager Installation tab, under the Customize host group SR-IOV compute, go to SR-IOV per port, select the flat physnet in the port physnet mapping, and configure the number of VFs that you need.
b. Go to "VGT allowed VLAN ranges". For every port, select the allowed VLAN range (adding the VLAN ranges enables the VGT+ feature).
Test Execution 1. First, create 2 VMs on different compute hosts (use the same flat network but different ports for the VMs).
Note: Both IPs from both the interfaces should be on the same subnet.
Note: The sub-interface VLANs should not be contained in the allowed VGT+ range that was configured during the installation.
3. Ping between the 2 VMs, using the sub interfaces IP address. Pinging using the
sub interface IP address will automatically result in routing of the ping through, so
communication using the blocked vlan is guaranteed. result=unsuccessful ping.
4. Negative step > Create another 2 VMs similar to the previous step but this time use a
sub interface that is in the allowed range, result=successful ping.
5. Ensure that there is connectivity between the “pure” flat host and the VGT+ host.
Note: You need 2 zones: 1 zone with VGT+ and 1 zone without VGT+.
6. Scale out a new SR-IOV compute. Test the functionality of this feature on the newly
added server. The feature should work successfully (disallowed ranges should not
pass traffic).
7. Scale in another compute from the same host group. The feature should work
successfully.
8. Check that the functionality of the feature is not affected by power off/on and reboot
actions. The feature should work successfully.
9. Traffic should be blocked between a non-flat VM and a flat VM. Test this by
creating a non-flat network and a VM based on that network.
10. Creating a large number of vNICs and VMs should work normally.
To test:
Note: All these SR-IOV ports can be configured to use VGT+ mode while
each VM consumes a large number of vNICs [>20].
Note: Creation and deletion of the VMs should work with no problems.
According to the "Number of VFs" configured at the installation, create the maximum
number of VFs per host (default ~45).
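The sub interface setup inside each VM can be sketched as follows (the interface name, VLAN ID, and addresses are assumptions; for the negative test, pick a VLAN outside the allowed VGT+ range):

```shell
# Run inside the first guest; eth1 is assumed to be the SR-IOV flat-network port.
VLAN_ID=200                        # assumed VLAN, outside the allowed VGT+ range
SUB_IF="eth1.${VLAN_ID}"
sudo ip link add link eth1 name "${SUB_IF}" type vlan id "${VLAN_ID}" 2>/dev/null
sudo ip addr add 192.168.50.11/24 dev "${SUB_IF}" 2>/dev/null
sudo ip link set "${SUB_IF}" up 2>/dev/null

# On the second VM, repeat with 192.168.50.12/24, then ping across:
ping -c 4 -W 1 192.168.50.12 || echo "ping blocked (expected for a disallowed VLAN)"
```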
Status OK | not OK
Comments
Test Case ID
Estimated ~ 1 hour
Duration
Supported CBIS 20
From Version
Prerequisite The setup is installed with at least one host with the ARP responder set to "on" and one set
to "off".
Test Execution 1. After the installation is complete, deploy a VM on each of the hosts.
2. Ping (cross-compute) different VMs from an ARP responder "on" host group.
3. Verify that the first ping to a non "ARP responder" VM takes slightly longer than
subsequent pings (due to the initial ARP resolution).
4. Run tcpdump on both VMs ("ARP on" and "ARP off") to see the incoming ARP
requests.
5. Create a dual-stack network and test this feature in a dual-stack situation.
6. Test the functionality of this feature after performing a scale in/out procedure: scale in
a compute and scale it out into a host group with the ARP responder set to off.
7. Test the functionality of this feature after creating a new custom template.
8. Repeat this test in DPDK, OVS, and SR-IOV computes, to check that it works in all
different types of computes.
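Steps 2-4 can be sketched as follows (interface names and addresses are assumptions):

```shell
# Steps 2/3: time the first ping vs. later pings from a VM in the
# ARP-responder-on host group (10.0.0.12 is a placeholder peer address).
ping -c 3 -W 1 10.0.0.12 || echo "substitute the peer VM's address"

# Step 4: inside each VM, watch the incoming ARP requests on the tenant
# interface (eth0 is an assumption); with the ARP responder on, requests are
# answered locally and should not reach the peer VM.
sudo timeout 30 tcpdump -n -i eth0 arp 2>/dev/null \
  || echo "tcpdump must run inside the guest"
```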
Status OK | not OK
Comments
Test Case ID
Objective Verify that the DPDK - Active/Active Bond Mode works as designed.
Estimated ~ 1 hour
Duration
Supported CBIS 20
From Version
Prerequisite Ensure that the CBIS Manager setup is installed with a host group and a DPDK zone where
the Bond mode is Active-Active.
Test Execution 1. After the installation is complete, check that the Tenant bond mode is set to active-
active: first SSH to the target compute and then run the command:
"ovs-appctl bond/show"
"bond_mode: balance-tcp"
2. Check the infra bond:
"less /proc/net/bonding/infra-bond"
In the output, the infra bond mode should be "active-backup"; verify correct port
bonding by identifying the port names (for example ens5f0-ens5f1).
3. Test the functionality of the tenant bond, first verifying correct port bonding.
4. Check the bond operation: run a continuous ping to the target VM sitting on the active-
active compute and, in parallel, watch the state of the bond. Set the active interface
down. When the interface goes down, you should see that the active interface is
replaced immediately by the other interface, and the ping that was sent should not be
affected.
5. Set the interface back up and see that it recovers.
6. After the interface recovers, create more VMs to re-check the bond, and after
creating the VMs test connectivity between them.
Expected 1. The installation was completed, and the Tenant bond mode was set to active-active. The
Result output was :
"bond_mode: balance-tcp"
2. The command was run, and in the output, the infra bond mode was "active-backup", and
ports bonding was correct.
3. Ports bonded correctly.
4. The bond operation was checked. A continuous ping to the target VM sitting on active-
active compute was run. The active interface was set down. The active interface
was replaced immediately with the other interface, and the ping that we sent was not
affected.
5. The interface back up was set, and it recovered.
6. More VMs were created after the interface recovered; the bond was re-checked and
connectivity between the VMs was verified.
Status OK | not OK
Comments
Test Case ID
Objective Verify that VMs deploy on SR-IOV computes and use SR-IOV networks.
Estimated 1 Hour
Duration
Supported All
From Version
Prerequisite Image with SR-IOV support is available (Image level optimization may be required). The
image has the iperf binary already installed.
Example:
a. Create a network:
b. Create a subnet:
Note: Keep track of the port IDs; they will be needed to boot the VMs.
3. Instantiate 2 VMs on 2 computes with SR-IOV support (VMs should have an interface on
the SR-IOV network).
iperf -s
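With the server side started by iperf -s above, the client side can be sketched as (the server address is an assumption):

```shell
# On the second VM, run the iperf client against the first VM's SR-IOV IP
# (192.168.10.11 is a placeholder; use the real address), reporting every
# 5 seconds for 30 seconds.
timeout 40 iperf -c 192.168.10.11 -t 30 -i 5 \
  || echo "substitute the server VM's SR-IOV IP address"
```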
Status OK | not OK
Comments
Test Case ID
Objective Verify that VMs can be deployed with functioning VXLAN networks.
Estimated 1 Hour
Duration
Supported All
From Version
3. Create subnet:
5. Make sure that all VMs on all computes received an IP address.
6. Perform connectivity tests between each VM and all other VMs.
7. Perform a file transfer between each VM and all other VMs.
4. 5 VMs created.
5. VMs get IPs from the correct namespace.
6. Ping has no packets lost between VMs.
7. Files can be transferred between VMs.
Status OK | not OK
Comments
Test Case ID
Objective Verify that there is network connectivity between OVS and SR-IOV computes.
Estimated 1 Hour
Duration
Supported All
From Version
Prerequisite The CBIS Cluster is installed successfully with OVS and SR-IOV computes.
Status OK | not OK
Comments
Test Case ID
Estimated 1 Hour
Duration
Supported All
From Version
Status OK | not OK
Comments
Description VXLAN connectivity testing via a router over both IPv4 and IPv6 networks.
Test Case ID
Objective Verify that VM networks can communicate via a router over both IPv4 and IPv6 networks.
Estimated 1 Hour
Duration
Supported All
From Version
Test Execution 1. Create 3 VXLAN networks, each network with 2 subnets (IPv4 & IPv6).
• The default route on each instance can be checked with the following commands:
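A sketch of the usual route checks inside each instance:

```shell
# IPv4 default route of the instance:
ip route show default

# IPv6 default route of the instance:
ip -6 route show default
```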
Status OK | not OK
Comments
Test Case ID
Objective Verify that Floating IPs can be assigned and test network connectivity using these IPs.
Supported All
From Version
Status OK | not OK
Comments
Test Case ID
Objective Verify that live migration is successful for VMs with VXLAN networks and there is no packet
loss during the migration.
Supported All
From Version
Status OK | not OK
Comments
Description Live Migrate VMs with VXLAN networks and Floating IPs.
Test Case ID
Objective Verify that live migration is successful for VMs with VXLAN networks and Floating IPs and
there is no packet loss during the migration.
Supported All
From Version
Status OK | not OK
Comments
Test Case ID
Objective Verify that live migration is successful for VMs using OVS-DPDK networks and there is no
packet loss during the migration.
Supported All
From Version
a. Each VM moves from its physical server to another compute in the same
host group.
b. Each VM still has 3 VXLAN networks.
c. There might be packet loss due to the community bug: https://bugs.launchpad.net/
neutron/+bug/1815989
Status OK | not OK
Comments
Description Infrastructure networks connectivity testing with interface failover over the 1st NIC.
Test Case ID
Objective Verify that infrastructure networks continue operating when a bonded interface link is
disconnected.
Estimated 1 Hour
Duration
Supported All
From Version
Test Execution 1. Set the source to the overcloudrc admin user as follows:
source ~/overcloudrc
2. From the Undercloud VM, check OVS active port of infra-bond in each compute.
3. Run a constant connectivity check (ping for example) from one of the controllers on
one of the infra networks (for example internal network usually 172.31.0.x) to one of the
computes.
4. Disconnect the active bond port of the target compute either by disconnecting the cable
directly from the server or by logging into the connected switch and shutting down the
appropriate port.
5. Ensure that the Active Interface in the bond has been changed and now the active port
is the one that was inactive before the disconnection.
6. Validate that the connectivity check is still running and there are no dropped packets.
7. Login to the Zabbix portal and validate that the alarms interface <downed active
interface name> appears on the <server name>.
8. Reconnect the compute port that was disconnected.
9. Login again to the Zabbix portal and validate that the alarms interface <downed active
interface name> on the <server name> has disappeared. Note that it may take a few
minutes for the alarms to disappear and reset.
10. Repeat steps 3-7 for the new active port.
11. From the Undercloud VM, check OVS active port of infra-bond in each controller.
12. Run a constant connectivity check (ping for example) from one of the computes on one
of the infra networks (for example internal network usually 172.31.0.x) to one of the
controllers.
13. Disconnect the active bond port of the target controller either by disconnecting the cable
directly from the server or by logging into the connected switch and shutting down the
appropriate port.
14. Ensure that the Active Interface in the bond has been changed and now the active port
is the one that was inactive before the disconnection.
15. Validate that the connectivity check is still running and there are no dropped packets.
16. Login to the Zabbix portal and validate that the alarms interface <downed active
interface name> appears on the <server name>.
17. Reconnect the controller port that was disconnected.
18. Login again to the Zabbix portal and validate that the alarms interface <downed active
interface name> on the <server name> has disappeared. Note that it may take a few
minutes for the alarms to disappear and reset.
19. Repeat steps 10-14 for the new active port.
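The active-port checks in steps 2 and 11 can be sketched as follows (the node name and SSH user are assumptions; heat-admin is assumed as the overcloud login user):

```shell
# From the Undercloud VM: query a node's infra-bond and extract the active
# slave line ("overcloud-ovscompute-0" and "heat-admin" are placeholders).
ssh -o ConnectTimeout=5 heat-admin@overcloud-ovscompute-0 \
  "sudo ovs-appctl bond/show infra-bond" 2>/dev/null \
  | grep 'active slave' \
  || echo "run from the Undercloud VM with the real node name"
```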
Expected 1. After sourcing to overcloudrc, you should see the prompt change to [stack@undercloud
Result (overcloudrc) ~]$
2. The active port is listed. Example: active slave mac: 24:8a:07:38:1a:f1
(ens6f1)
3. The connectivity check runs constantly and successfully.
Expected Output:
Expected Output:
9. Check in Zabbix that the “interface down” alarms have been cleared and are not
showing. Note that it may take a few minutes for the alarms to disappear and reset.
10. The results are the same as results 3-7 for the new active port.
11. The active port is listed. Example: active slave mac: 24:8a:07:38:1a:f1
(ens6f1)
12. The connectivity check runs constantly and successfully.
13. The interface should change to state DOWN as shown here:
Expected Output:
Expected Output:
18. Check in Zabbix that the “interface down” alarms have been cleared and are not
showing. Note that it may take a few minutes for the alarms to disappear and reset.
19. The results are the same as results 10-14 for the new active port.
Status OK | not OK
Comments
Test Case ID
Objective Verify that tenant networks continue operating when a bonded interface link is disconnected.
Estimated 1 Hour
Duration
Supported All
From Version
Test Execution 1. Set the source to the overcloudrc admin user as follows:
source ~/overcloudrc
Note: If you use hardware type AirFrame, Dell, or HP, you have to direct the
VMs to an OVS compute.
4. From the Undercloud VM, check OVS active port of tenant-bond in each compute.
5. Run a constant connectivity check (ping for example) from the namespace of the VXLAN
network to the VMs IP address. This connectivity check needs to run constantly from this
point until the end of the test.
6. Disconnect the active bond port of the target compute either by disconnecting the cable
directly from the server or by logging into the connected switch and shutting down the
appropriate port.
7. Ensure that the Active Interface in the bond has been changed and now the active port
is the one that was inactive before the disconnection.
8. Validate that the connectivity check is still running and there are no dropped packets.
9. Login to the Zabbix portal and validate that the alarms interface <downed active
interface name> appears on the <server name>.
10. Reconnect the compute port that was disconnected.
11. Login again to the Zabbix portal and validate that the alarms interface <downed active
interface name> on the <server name> has disappeared. Note that it may take a few
minutes for the alarms to disappear and reset.
12. Repeat steps 3-7 for the new active port.
Expected 1. After sourcing to overcloudrc, you should see the prompt change to [stack@undercloud
Result (overcloudrc) ~]$
2. VXLAN networks have been created successfully.
3. VMs have been created successfully.
4. The active port is listed. Example: active slave mac: 24:8a:07:38:1a:f1
(ens6f1)
5. The connectivity check should run constantly and successfully.
6. The interface should change to state DOWN as shown here:
Expected Output:
Expected Output:
11. Check in Zabbix that the “interface down” alarms have been cleared and are not
showing. Note that it may take a few minutes for the alarms to disappear and reset.
Status OK | not OK
Comments
Test Case ID
Objective Verify that the VXLAN networks are isolated and that broadcast traffic from one network is
not present on any other network.
Estimated 1 Hour
Duration
Supported All
From Version
Test Execution 1. Create 2 internal VXLAN networks (for example vxlan-net-1 and vxlan-net-2).
2. Create 1 VM (VM-1) on compute 1 using vxlan-net-1, 1 VM (VM-2) on compute 2 using
vxlan-net-1, 1 VM (VM-3) on compute 3 using vxlan-net-2.
3. Login to VM-1 via SSH or console and send a constant ARP broadcast.
4. Let the broadcast ARP traffic run until the end of the test.
5. Login to all the 3 used computes and run the command tcpdump:
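A sketch of the capture in step 5 (the tenant interface name is an assumption):

```shell
# On each of the 3 computes, capture ARP broadcasts on the tenant interface
# ("vxlan-tenant-if" is a placeholder; use the compute's real tenant interface).
# The broadcast should appear only on the computes hosting vxlan-net-1 VMs.
sudo timeout 60 tcpdump -n -e -i vxlan-tenant-if arp 2>/dev/null \
  || echo "substitute the compute's tenant interface name"
```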
Status OK | not OK
Comments
4.1.6.12 HugePages
Test Case ID
Estimated 1 Hour
Duration
Supported All
From Version
Prerequisite/s The CBIS cluster is installed with at least 1 compute node with 1G HugePages enabled.
BOOT_IMAGE=/boot/vmlinuz-3.10.0-327.28.3.el7.x86_64
root=UUID=b977c2c1-dc3c-4853-a353-0598c429fce6 ro console=tty0
console=ttyS0,115200n8 crashkernel=auto rhgb quiet intel_iommu=on
default_hugepagesz=1G hugepagesz=1G hugepages=87 isolcpus=2-11,13-
23,26-35,37-47
Test Execution 1. Show that HugePages is configured on both the compute and OpenStack (in which the
host is part of the relevant huge pages host aggregate).
+----+-------------+-------------------+------------------------------------------------------------+-----------------------------+
| Id | Name        | Availability Zone | Hosts                                                      | Metadata                    |
+----+-------------+-------------------+------------------------------------------------------------+-----------------------------+
| 1  | HugePages1G | -                 | <the compute on which it was enabled should appear here>   | 'hw:mem_page_size=1048576'  |
+----+-------------+-------------------+------------------------------------------------------------+-----------------------------+
2. Create a matching flavor.
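The compute-side part of step 1 can be confirmed directly from /proc:

```shell
# Kernel boot parameters should include default_hugepagesz=1G hugepagesz=1G
# (as in the BOOT_IMAGE line above):
cat /proc/cmdline

# Allocated, free, and page-size counters for HugePages:
grep -i huge /proc/meminfo
```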
Status OK | not OK
Comments
Test Case ID
Estimated 2 hours.
Duration
Supported All
From Version
• AggregateInstanceExtraSpecsFilter
• NUMATopologyFilter
After checking which computes are SR-IOV computes, arbitrarily select 1 SR-IOV
compute on which you will create VMs.
3. Login to the SR-IOV compute on which you will create VMs and run the following
commands to determine which physical interface belongs to which NUMA node (0 or 1):
cat /sys/class/net/ens1f0/device/numa_node
cat /sys/class/net/ens1f1/device/numa_node
cat /sys/class/net/ens12f0/device/numa_node
cat /sys/class/net/ens12f1/device/numa_node
Remember: When PCPUs are isolated, some PCPUs are taken from NUMA
0 and others from NUMA 1. The system always takes an even number of
PCPUs from each NUMA according to the pCPU siblings.
Example
• In the case of 6 isolated PCPUs (default), the system takes 4 PCPUs
from one NUMA and 2 more PCPUs from the other NUMA.
• In detail: the first siblings pair from NUMA 0 (2 pCPUs), the second
siblings pair from NUMA 1 (2 pCPUs), and finally one more siblings pair
from NUMA 0 (2 pCPUs).
• This is important to understand in case you receive a "not enough
resources available" error when creating a VM.
5. Ensure that the SR-IOV compute that you are using is cleared from VMs. If not, remove
all VMs from that compute.
6. Via CLI or Horizon, create 2 provider VLAN networks, one for each NUMA,
depending on the corresponding physical interface (from physnet 1 to physnet 4).
For each network, also create a subnet and a port.
7. Create a flavor with the number of available CPUs in NUMA-0 with SR-IOV and CPU
pinning flags.
a. To calculate how many CPUs are available in NUMA-0, see the following
examples.
b. If the compute has 48 CPUs and 6 CPUs are isolated, then 42 CPUs are left.
c. Knowing that 4 of the isolated CPUs are from NUMA-0 and 2 from NUMA-1,
NUMA-0 has 20 available CPUs and NUMA-1 has 22.
Examples
8. Create VM with the number of available CPUs in NUMA-0, using the matching network
for the NUMA.
To validate which PCPUs the VM uses:
sudo -i
virsh list
Check that all the PCPUs used by the VM are taken from NUMA 0.
Example
9. Create a flavor with the number of available CPUs in NUMA-1 minus one, with SR-IOV
and CPU pinning flags.
Examples
10. Create VM with the number of available CPUs in NUMA-1 minus one, using the
matching network for the NUMA.
Example
11. Create a flavor with 2 CPUs, with SR-IOV and CPU pinning flags.
Example
12. Create VM with 2 CPUs for NUMA-1, using the matching network for the NUMA.
Example
13. Create a flavor with 1 CPU, with SR-IOV and CPU pinning flags.
Example
14. Create VM with 1 CPU for NUMA-1, using the matching network for the NUMA.
Example
Status OK | not OK
Comments
4.1.6.14 OVS-DPDK
Test Case ID
Objective Verify that the system deploys the VM with ovs-dpdk networking.
Estimated 5 hours
Duration
Prerequisite/s CBIS node is deployed successfully with at least one DPDK dedicated compute.
Test Execution 1. Make sure you have an availability-zone with DPDK computes only (at least one DPDK
compute). If not, create an availability-zone DPDK_zone and add at least one DPDK
compute into that zone.
2. Make sure you have a flavor with HugePages extra spec. If the DPDK host group is
configured with 1G HugePages (default) the flavor needs to be configured accordingly
with 1G HugePages extra spec.
3. Make sure you have a Security Group which allows ingress and egress ICMP packets.
4. Create a VM on the DPDK_zone availability-zone using the permissive security group
and the flavor with the HugePages metadata.
5. Wait until the VM:
• is ACTIVE
• is running
• its operating system is fully loaded
6. Send ICMP (ping) packets to the VM.
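Steps 1-4 can be sketched with the OpenStack CLI (all names, sizes, and the 1G page-size value are assumptions):

```shell
# 1. Availability zone containing only DPDK computes (names are placeholders).
openstack aggregate create --zone DPDK_zone DPDK_zone
openstack aggregate add host DPDK_zone overcloud-dpdkcompute-0.localdomain

# 2. Flavor carrying the 1G HugePages extra spec (value in KiB).
openstack flavor create --ram 4096 --vcpus 2 --disk 10 \
  --property hw:mem_page_size=1048576 dpdk-hp-flavor

# 3. Security group allowing ICMP in and out.
openstack security group create permissive
openstack security group rule create --protocol icmp --ingress permissive
openstack security group rule create --protocol icmp --egress permissive

# 4. VM on the DPDK availability zone (image/network names are placeholders).
openstack server create --flavor dpdk-hp-flavor --image cirros \
  --network tenant-net --security-group permissive \
  --availability-zone DPDK_zone dpdk-vm-1 \
  || echo "run where the openstack CLI and overcloudrc are available"
```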
Status OK | not OK
Comments
Test Case ID
Objective Verify that NOVA CPU pinning operates as expected. Items checked:
• Simple CPU pinning with different number of CPUs (up to the number in NUMA).
• CPU pinning with NUMA awareness (positive and negative).
Estimated 2 Hours
Duration
Supported All
From Version
Prerequisite/s The latest CBIS Manager Manual is available and the CBIS Cluster is installed successfully.
Note: You need to account for how many CPUs were allocated to the OS.
If the compute has total of 48 CPUs and you have allocated 6 isolated CPUs for the OS, the
VMs will be able to use only 42 CPUs.
If you wish to verify how many isolated CPUs were allocated for the OS, check in /home/
stack/user_config.yaml.
2. Check that each VM VCPU uses different CPUs and the CPUs are not overlapping.
3. To connect to the compute use the command below (depending on the type of the
compute):
c_ovs<compute number> or
c_sriov<compute number>
for i in `sudo virsh list | grep -vE '"--"|Name' | awk '{print $2}'`; do
  echo; echo $i:; sudo virsh dumpxml $i | grep cpu
done
4. Try to exceed the maximum number of CPUs allowed by the NUMAs. Create an extra
VM after all the CPUs are already used by existing VMs.
5. After exceeding the NUMAs CPU capacity, delete several VMs to free CPUs. Re-create
more VMs and verify that the CPUs are reused.
6. Leave 2 CPUs free on each NUMA and try to create a VM with 3 or more VCPUs.
7. Open one of the dedicated CPU pinning VM consoles and run the command "yes > /
dev/null &".
8. Connect using SSH to the compute which holds the VM and run "top".
Status OK | not OK
Comments
Description Create enough VMs to verify CPU allocation ratio limits are not exceeded
Test Case ID
Objective Verify that the CPU allocation ratio limits are not exceeded
Estimated 1 Hour
Duration
Supported All
From Version
Choose a compute you want to validate CPU allocation ratio on and make sure this
compute is empty from VMs.
It is important to understand that the CPU allocation ratio is meaningless when using CPU
pinning. Therefore, you should use a flavor without CPU pinning metadata.
If CBIS is already deployed check the current cpu allocation ratio in the computes.
c <compute number>
sudo grep 'cpu_allocation_ratio' /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
Wait 2 minutes for the services to restart and check if they are running.
c <compute number>
sudo -S grep CPUAffinity= /etc/systemd/system.conf | sed -e 's/\<CPUAffinity\>=//g'
3. Create 1 or more VMs that will allocate all the available CPUs for the VMs.
Note: It does not matter how it is done. It can be with many VMs that have 1
VCPU each or even 1 VM that uses a flavor with all the VCPUs.
4. With cpu_allocation_ratio=1.0 configured in nova you can create 1:1 ratio of VCPUs.
Meaning, if you have 42 available PCPUs for the VMs you will be able to create
maximum of 42 VCPUs. After that if you try to create more VMs it should fail.
Check that you cannot create more VMs after reaching the full allocation of the PCPUs.
This is how to check the total CPUs against the used CPUs in a compute:
source /home/cbis-admin/overcloudrc
openstack host show overcloud-(ovs|sriovperformancecompute)-(desired compute number).localdomain
5. From within the compute you are working on, change the CPU allocation ratio to 2.0.
To connect to the compute, use the command below (depending on the type of the
compute):
c_ovs <compute number> or c_sriov <compute number>
In /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf, change
the cpu_allocation_ratio=1.0 parameter to cpu_allocation_ratio=2.0 and restart
the nova_compute and nova_libvirt services.
Wait 1 minute for the services to restart and check that they are running.
Check that additional VMs can be created according to the new ratio.
7. Check that you cannot create more VMs after reaching the full allocation of the PCPUs.
This is how to check the total CPUs against the used CPUs in a compute:
In /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf, change
the cpu_allocation_ratio=1.0 parameter to cpu_allocation_ratio=2.0 and restart
the nova_compute and nova_libvirt services:
sudo docker restart nova_compute
sudo docker restart nova_libvirt
Wait 2 minutes for the services to restart and check that they are running:
sudo docker ps
9. Delete the VMs
Expected 1. Deployment was successful, cpu_allocation_ratio has a value of 1.0, and the services
Result are active/running.
2. Example case HPc7k: 48 cores - 6 dedicated = 42; 42 vCPUs are available for VM
instances (ratio: 1.0).
3. The maximum number of VMs (1-vCPU flavor) were created successfully.
4. VM creation fails.
Status OK | not OK
Comments
Objective The software RAID 1 feature enables the configuration of the /dev/sda and /dev/sdb
physical disks in RAID 1, providing redundancy and preventing the operating
system from failing upon a single disk failure.
Estimated 60 minutes ~ (not including the CBIS deployment and preparations to obtain another disk).
Duration
Prerequisite/s Software RAID can only be configured on controllers or computes and cannot be configured
on storage nodes or monitoring servers.
Software RAID 1 is only available for the Airframe RM/OR family from RM/OR 17 and
above.
Software RAID can only be enabled on a fresh CBIS deployment and cannot be enabled
post-deployment. However, using the custom template feature, you can scale out a new
compute host group with software RAID 1 enabled (only relevant for computes).
Software RAID 1 requires the /dev/sdb disk to be unused and unpartitioned. This means
that software RAID 1 cannot work alongside other features that may allocate the /dev/sdb
disk for their own use, such as the ELK DB disk and local storage. Local storage (if
used; default off) can be placed on a disk other than /dev/sdb or /dev/sda, and
the ELK DB can likewise be configured on a disk other than /dev/sdb or /dev/
sda (assuming there are three or more disks in the servers).
4. Continue with the rest of the CBIS deployment configuration and deploy CBIS.
5. Once CBIS is successfully deployed, select the under-test servers that have software
RAID 1 configured (controller or compute).
6. Physically unplug the sda disk, (to simulate a failed disk).
7. Follow the CBIS Operation Manual - Replace a failed disk in Software RAID 1
procedure to properly remove the failed disk from the operating system.
8. Follow the CBIS Operation Manual - Adding a new disk procedure to add a new disk
to replace the failed disk. (If the disk is not new, it has to be unpartitioned and formatted
like any new out of the box disk).
9. Check the RAID 1 status using cat /proc/mdstat.
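Besides /proc/mdstat, mdadm can report the mirror's detailed member state (the md device name is an assumption):

```shell
# Overall software-RAID state; a healthy RAID 1 shows [UU].
cat /proc/mdstat 2>/dev/null || echo "no md devices on this host"

# Detailed member state of the mirror (/dev/md0 is a placeholder device):
sudo mdadm --detail /dev/md0 2>/dev/null || echo "substitute the real md device"
```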
Status OK | not OK
Comments
Description Check network connectivity via a router while rebooting the active controller.
Test Case ID
Objective Verify that network connectivity via a router is operational after rebooting the active
controller.
Supported All
From Version
Test Execution
Note: Unless requested differently, all the following test commands should
be executed from the Undercloud VM while sourced to the admin tenant
(overcloudrc).
source ~/overcloudrc
b. Create permissive security group rules and attach them to the created security
group:
7. Perform ICMP (ping) continual connectivity check between the 2 created VMs.
8. Reboot the controller where the L3 agent is active.
Note: The following steps request that the active controller is
rebooted. Before, during, and after the reboot, it is important to constantly monitor
the VMs' ICMP connectivity to check that no unexpected packet drops or
connectivity termination occur.
sudo reboot
Note: It may take several minutes for the other l3-agent controller to show
active.
e. Check that the ICMP (ping) continual connectivity check between the VMs is still
running.
9. Wait for the rebooted controller to finish its reboot process and become fully active and
operational (normally, it takes between 10 to 15 minutes) and validate that its L3 agent is
Alive and UP.
Note: Even after the rebooted controller is accessible via SSH it takes several
minutes for the L3 agent of that controller to become Alive and UP.
Note: If there are unusual packet drops, check the network quality before
continuing to the next steps.
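The L3-agent liveness check in steps 8-9 can be sketched as:

```shell
# From the Undercloud VM, sourced to overcloudrc: list the L3 agents and
# check the Alive and State columns for each controller.
openstack network agent list --agent-type l3 \
  || echo "run where the openstack CLI and overcloudrc are available"
```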
8. The controller successfully reboots, and during the reboot the ICMP (ping)
connectivity check should continue running without packet drops. Meanwhile,
another controller should take over as the active l3-agent controller.
9. The controller should be back up and its L3 agent should be Alive and UP.
Status OK | not OK
Comments
Description Power-off active controller and verify CBIS functionality during and after the failure.
Test Case ID
Estimated 1 Hour.
Duration
Supported All.
From Version
Test Execution 1. Create 2 IPv4 VXLAN networks and 2 IPv6 VXLAN networks. Each network on different
subnet.
2. Create a router and add the networks interfaces (the default gateway of the networks).
3. Create a security group that allow ingress and egress ICMP traffic for both IPv4 and
IPv6 to all addresses.
4. Create VM for each network with the created security group.
5. Create 4 volumes.
Note: The volume size is set according to the user requirements and if there
are no specific requirements, use the default values.
6. Attach and mount each volume to each VM and check the network connectivity and
volume accessibility of the VMs.
7. Run a continuous ping connectivity check between the IPv4 VMs and between the IPv6
VMs. The connectivity check needs to run throughout the entire test.
8. Check which controller is running the internal API VIP as follows:
b. The IP address that will be presented is the internal API VIP. To find which controller
holds the internal API VIP, from one of the controllers run the command:
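The check in step b can be sketched as (the VIP address is an assumption):

```shell
# On each controller, see whether the internal API VIP is configured on a
# local interface (172.31.0.10 is a placeholder; use the VIP found in step a).
ip -o addr show 2>/dev/null | grep -F '172.31.0.10' \
  && echo "this controller holds the internal API VIP" \
  || echo "the VIP is not on this controller"
```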
Note: In this test, we will be calling the controller which holds the internal API
VIP, the PRIMARY controller, and the rest of the controllers will be referred to
as SECONDARY controllers.
$ sudo shutdown +5
16. Verify that there is still connectivity to the previously deployed VMs.
17. At this point, the internal API VIP should have failed-over to a new controller. Perform
the same steps as previously mentioned in this test to obtain the new PRIMARY
controller.
18. Login to the new PRIMARY controller.
19. Capture and save the “sudo pcs status” command output.
20. Check that Zabbix, Kibana and Horizon are working.
21. Create VMs (SR-IOV and OVS) and check the connectivity to the VMs.
22. Start the powered off controller using the same method that was used to power it
off.
Note: It takes approximately 10-12 minutes (depending on the hardware type
and shape) for the controller to become fully functional again. Note that it might
take a few more minutes for the controllers to sync. This depends on how much
time the controller was down and how much data needs to be synchronized to
the started controller.
Note: You might see at the bottom of the sudo pcs status output, one or
more failed actions. Note that these contain a history of previous pacemaker-
controlled services that failed. These do not reflect the status of the services
and should be ignored in this specific test.
12. The “sudo ceph -s” command output should show “HEALTH_OK”
13. Zabbix, Horizon, Ceph and Kibana portals are all reachable and are operating. Zabbix
shows alarms (in case the trigger requirement is met). Kibana receives logs.
14. The controller is powered off.
15. The connectivity check to the VMs should succeed. No packets should be lost
while powering off the controller. However, if there is minimal packet loss during the
controller power off, check whether it is within the acceptable packet loss threshold of
the VNF.
16. You should see that the internal API now resides on a new controller.
17. Successfully logged in to the new PRIMARY controller.
18. The “sudo pcs status” command should show all the services that were residing on
the powered off controller as Stopped except for galera and redis.
a. For galera, 2 containers should show Primary and the container that is on the
powered off controller should display as Stopped.
b. For the redis containers 1 should act as Secondary, 1 should act as Primary and the
one that was on the powered off controller should display as Stopped.
c. For the other containers that are not redis or galera, the containers should show 2
out of 3 Started and 1 out of 3 should display as Stopped.
Note: You might notice at the bottom of the sudo pcs status output, one
or more failed actions. These contain a history of previous pacemaker-
controlled services that failed. They do not reflect the status of the services
and should be ignored in this specific test.
Note: The “sudo ceph -s” command should show “WARNING”. This
is because one Ceph Monitoring is now down and Ceph reports this. If
this is the only warning, then this is expected and Ceph should still work
normally.
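The per-state expectations above can be tallied quickly from the container section of the pcs output. A minimal sketch, run here against hypothetical sample output (on a live controller, pipe the real sudo pcs status instead of the here-document):

```shell
# Hypothetical sample of the container section of `sudo pcs status`,
# standing in for live output with controller-2 powered off.
sample="$(cat <<'EOF'
 galera-bundle-0    Primary   overcloud-controller-0
 galera-bundle-1    Primary   overcloud-controller-1
 galera-bundle-2    Stopped   overcloud-controller-2
 redis-bundle-0     Primary   overcloud-controller-0
 redis-bundle-1     Secondary overcloud-controller-1
 redis-bundle-2     Stopped   overcloud-controller-2
 haproxy-bundle-0   Started   overcloud-controller-0
 haproxy-bundle-1   Started   overcloud-controller-1
 haproxy-bundle-2   Stopped   overcloud-controller-2
EOF
)"
# Tally each state; with one controller down, all Stopped entries
# should belong to that controller.
echo "$sample" | awk '{count[$2]++} END {for (s in count) print s, count[s]}' | sort
```

With the sample above this prints 2 Primary galera members, a Primary/Secondary redis pair, 2 Started haproxy containers and 3 Stopped containers, matching the expectations in steps 18a-c.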
19. Kibana and Zabbix portals are reachable and operating. Zabbix shows alarms (if a
trigger condition is met). Kibana receives logs.
20. The VMs are created and reply to the connectivity check successfully.
21. Start the powered off controller.
22. You should be able to login to the previously powered off controller.
23. The “sudo docker ps” command output should show the same information as before.
24. The “sudo pcs status” command should show all the containers as Started except
for galera and redis.
Note: All 3 galera containers should show Primary and from the 3 redis
containers 2 should act as Secondary and 1 as Primary.
Note: You might notice one or more failed actions at the bottom of the sudo
pcs status output. These contain a history of previously failed pacemaker-
controlled services. They do not reflect the current status of the services
and should be ignored in this specific test.
Status OK | not OK
Comments
4.2.1.3 Storage Volume Recovery (NetApp, EMC, HCI-Ceph, SN-Ceph failovers and CBIS monitoring)
Test Case ID
Objective Verify that instances with attached volumes auto-evacuate after a compute shutdown.
Test Execution 1. Create an OVS VXLAN VM and identify the compute on which the VM was created.
a. Check for the new volume index – the last volume in list:
Note: It is not mandatory to fill all the volume capacity, although this is
possible if required. Several Megabytes will suffice.
Note: Both graceful (the poweroff command) and ungraceful (pulling the
compute's power) shutdowns are valid options for shutting down the compute.
Note: It may take several minutes for the VM to fully rebuild on the new
compute.
sudo df -h
Note: If the mounted directory does not appear in the output of df -h, mount
the directory again.
11. Verify that the created data still resides inside the mounted directory.
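Steps 10-11 above can be sketched as follows. The mount point and file names are hypothetical, and a temporary directory stands in for the real mounted volume so the sketch is self-contained:

```shell
# Sketch of steps 10-11: confirm the directory is still mounted, then
# verify the previously created data is intact. A temp directory stands
# in for the real mount; file names are hypothetical.
mnt=$(mktemp -d)
echo "test payload" > "$mnt/data.txt"
md5sum "$mnt/data.txt" > "$mnt/data.md5"     # recorded before the failover

# On the VM you would first confirm the mount is present:
#   sudo df -h | grep vol_test    # if absent, mount the directory again
# Then check that the created data is unchanged:
md5sum -c "$mnt/data.md5" && echo "data intact"
```

Recording a checksum before the failover and re-checking it afterwards is a stronger verification than only confirming the file still exists.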
Status OK | not OK
Comments
Test Case ID
Estimated 1 Hour.
Duration
Supported All
From Version
If the alarms are real, fix them and see that the alarms are cleared from Monitoring >
Dashboard
If Zabbix shows alarms that are false positives, there are 2 options:
ICMP, SSH and Zabbix agent alarms should be raised in Zabbix. Ceph alarms may be
raised as well in this scenario.
Note: Under Zabbix > Configuration > Hosts, open a host such as
overcloud-ovscompute-1.localdomain and then access the ‘Triggers’ tab. All
the triggers that may raise an alarm when a condition is met for that
compute are shown.
11. Fix the issues that you created in the previous step (such as powering up the compute)
and verify that after approximately 10 minutes all the alarms raised after shutting
down the compute disappear from Zabbix > Monitoring > Dashboard.
12. Check the past events under Zabbix > Monitoring > Problems and validate that you see
all the alarms that were raised in the previous steps.
13. Open the Horizon GUI portal and log in (a past bug caused Zabbix/Vitrage to kill
httpd processes).
Status OK | not OK
Comments
Description The Zabbix Metrics Exporter is a feature for exporting metrics from a Zabbix server
into a CSV file periodically.
Test Case ID
Objective On each of the controllers, verify that the Zabbix metrics are exported and contain
the expected information.
The following columns are expected to be present in the csv files:
• timestamp
• hostID
• hostname
• templateItemID
• itemID
• name
• key
• units
• value
• error
Estimated 20 minutes.
Duration
Supported 18.0
From Version
Prerequisite/s It is advised to wait at least 3 hours after CBIS deployment before executing this
test, to allow several metrics to accumulate in the csv files on the
controllers.
Test Execution 1. Login to one of the controllers. Check the existence of the directory
/var/log/zabbix/metrics.
2. In that folder, check that the last 10 files, ordered by date, were created every 15
minutes up to the current operating system time.
3. Check that the last 10 files end with “.csv” and not with “inprogress” (except
possibly the last one). If the last one is “inprogress”, wait for it to become “.csv”
(up to a 15 minute wait).
4. “less” the last csv file (must end with “.csv”). Example:
less zbx_metrics_11012018_0930.csv.
2. The last 10 files were created every 15 minutes, until current time.
3. The last 10 files end with “.csv”.
4. The last line is not truncated and contains the expected columns:
• timestamp
• hostID
• hostname
• templateItemID
• itemID
• name
• key
• units
• value
• error
Note: All columns are expected to contain data, except the "units" and
"error" columns, which remain empty when there is no relevant data to
display.
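One way to check that the last line is complete is to count its comma-separated fields against the 10 expected columns. The sample line below is hypothetical; the real check against the newest file is shown in the trailing comment:

```shell
# Sketch: the last CSV line should have all 10 expected columns
# (timestamp..error). The sample line is hypothetical; a trailing
# empty "error" field still counts as a column.
line='1541496000,10084,overcloud-controller-0,23296,29401,CPU utilization,system.cpu.util,%,3.5,'
n=$(printf '%s\n' "$line" | awk -F',' '{print NF}')
[ "$n" -eq 10 ] && echo "last line has all 10 columns"

# On a controller, the same check against the real file:
#   tail -1 /var/log/zabbix/metrics/zbx_metrics_11012018_0930.csv | awk -F',' '{print NF}'
```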
Example:
Status OK | not OK
Comments
Description Zabbix is pre-defined with the "General HW Templates" template, which monitors SNMP
traps coming from the hardware switches.
By default, the SNMP traps coming from the switch cause Zabbix to raise the
following alarms:
Test Case ID
Objective Verify that Zabbix monitors the SNMP traps that were pushed from the switches.
Estimated 1 Hour.
Duration
Test Execution 1. Configure the switch to send SNMP traps to the public VIP of the controllers. The public
VIP is the same IP address as the OpenStack Horizon web portal.
• Community
• Access
• Trap destination
• SNMP version
NOTICE: Retrieving SNMP traps from the switch can be configured either
PRE CBIS deployment from the CBIS Manager (recommended) or
POST CBIS deployment.
2. For PRE CBIS deployment from the CBIS Manager, perform the following steps and
then skip to step 6 (Create Zabbix alarm...).
a. From CBIS Manager navigate to CBIS Installation > Overcloud > Zabbix
Optional Parameters and check these 3 parameters:
b. Click on the + icon and add your switch name, switch IP address, community string
and switch SNMP port.
c. After all the CBIS Manager parameters are configured, click on Deploy to start the
CBIS deployment.
d. To continue, skip to step 6 (Create Zabbix alarm...)
3. For POST CBIS deployment, continue with Step 4 (Login to the Zabbix portal...)
Note: The following two steps, #4 (Login to the Zabbix portal) and #5 (Create
and configure a “Host” for SNMP alarms), are only to be performed for post
CBIS deployment.
e. Add SNMP interfaces and the port (default 161) and type the IP address of the
switch that sends the SNMP traps.
• Click Select. From the new window, open the Group drop-down list.
• After the 2 templates are selected, click on the upper Add to apply the linked
templates.
g. Click Macros and within the Macro field enter {$SNMP_COMMUNITY}. In the
Value field, write the community string as configured in the switch (the value
"public" is only an example).
Note: After SNMP Availability becomes active (green), Zabbix should send
alarms about problematic interfaces in the switch (link down or administratively
down). Initially it takes up to an hour for Zabbix to collect all the information
(items and triggers).
6. Create a Zabbix alarm by administratively shutting down an active interface in the switch.
7. Cancel the alarm by turning the switch interface back on.
Note: An alarm should be received in the “system status”, “host status” and
“last 20 issues” sections.
Status OK | not OK
Comments
Test Case ID
Estimated ~15 minutes.
Duration
Supported CBIS 20
From Version
Status OK | not OK
Comments
Test Case ID
Supported 20 SP2
From Version
Status OK | not OK
Comments
Test Case ID
Objective Verify that SNMP traps are sent for each alarm activity.
Estimated 1 Hour
Duration
Supported All
From Version
Prerequisite The latest CBIS Manager Manual is available and the CBIS Cluster is installed successfully.
Follow the CBIS Operation Manual procedure “Registering for SNMP Traps”, and
configure the IP address to which you want to send the SNMP traps.
2. Raise an alarm in Zabbix (set Zabbix trigger on).
3. Use the MIB browser program or Wireshark to examine the SNMP traps.
Status OK | not OK
Comments
4.2.2.7 ELK
Test Case ID
Supported All
From Version
Prerequisite/s ELK is enabled in the CBIS Manager before the Overcloud installation as follows:
CBIS Manager > CBIS Installation > Overcloud > General Optional Parameters >
Deploy ELK
Note: If the values are not as expected, refresh the web-page and check
again.
• cloud-*
• ipmitool-*
• ceph-*
• metricbeat-*
• openstack-*
• vtop*
c. Make sure the cloud-* filter is selected and logs are displayed.
Note: If the pre-defined filters do not return logs, it may be because the filters
have no logs to show. You can also create a wider custom filter: navigate to
Management page > Index Patterns > Create Index Pattern. For example, to
filter in all the logs, create the filter '*'.
Note: If logs are still not showing, the time range may be too small. For
example, if the Kibana time range is configured to monitor only the last 5
minutes, increase the time range to obtain potentially more results.
Status OK | not OK
Comments
Description When deploying CBIS, the user can configure target rsyslog servers. After CBIS
deployment, selected logs are sent via UDP port 514 from the CBIS Overcloud hosts to the
configured rsyslog servers.
Test Case ID
Objective Verify that the rsyslog server receives log files from the CBIS Overcloud.
Supported All
From Version
Prerequisite/s 1. Rsyslog is enabled in the CBIS Manager before the CBIS Overcloud deployment.
Navigate as follows:
CBIS Manager > CBIS Installation > Overcloud > ELK Optional Parameters >
Rsyslog Servers > Type the Rsyslog server IP address > Click ADD
2. The UDP port 514 (syslog) is opened in the firewall, and the rsyslog server receives
traffic on port 514.
Test Execution Connect to the rsyslog server and check the logs sent from CBIS.
• /var/log/containers/neutron/server.log
• /var/log/containers/nova/nova-compute.log
• /var/log/messages
Each file contains the paths of the log files which are sent to the rsyslog
server.
Expected All the relevant logs from all the CBIS hosts are sent through UDP port 514 (default) to the
Result rsyslog server.
Even if the server is not configured with rsyslog, the UDP port 514 packets should still arrive
at their destination.
To check if the syslog packets reached the rsyslog server run from the rsyslog server the
command:
Note: The way that the logs are presented in the rsyslog server is determined by
the end user who initiated and configured the rsyslog server.
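The UDP syslog path itself can be sanity-checked with a loopback sketch: a small listener stands in for the rsyslog server and a client plays the Overcloud host. Port 5514 is used here so no root is needed (the real default is 514), and python3 is used as the socket helper; the hostnames and message text are hypothetical:

```shell
# Loopback sketch of the UDP syslog path (listener = stand-in rsyslog server).
msgfile=$(mktemp)
timeout 5 python3 -c '
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 5514)); s.settimeout(4)
print(s.recvfrom(1024)[0].decode())
' > "$msgfile" &
sleep 1
# Send one syslog-style datagram, as an Overcloud host would over UDP 514.
python3 -c '
import socket
msg = b"<13>Jan 11 09:00:00 overcloud-compute-0 nova-compute: rsyslog test message"
socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(msg, ("127.0.0.1", 5514))
'
wait
cat "$msgfile"
```

On a real deployment, the same idea applies end to end: capture traffic on the rsyslog server (for example with tcpdump on UDP 514) while the Overcloud hosts generate logs.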
Status OK | not OK
Comments
Description By default, user actions made inside the OpenStack Horizon web portal are not audited.
When deploying CBIS, the user can enable the Horizon Audit Logging feature, which
logs all of the user's actions.
Test Case ID
Supported 18.0
From Version
Prerequisite/s • The setup is installed with the parameter Enable Horizon Audit Logging enabled.
• In this test, the user performs an action in the UI and checks the horizon log on the
controllers (only the controller that holds the internal API VIP receives the
logs). The Horizon log location is /var/log/containers/horizon/horizon.log
Note: This feature can be used along with rsyslog and ELK, but in the ATP
it is tested separately to keep the focus on it. This is not a prerequisite in itself.
Status OK | not OK
Comments
Zabbix constantly monitors the following statistics for all the VMs:
VM System Statistics
• Total memory
• Used memory
• Free memory
• CPU time
• CPU user time
• CPU system time
• Number of CPUs
• Number of used CPUs
• CPU utilization
• Total disk space
• Number of adapters
• Uptime
• State
VM Disk Statistics
VM Network Statistics
Alarms
Apart from the constant VMs statistics monitoring, Zabbix is set to raise an alarm for the
following triggers:
Note: More information can be obtained from the CBIS Operation Manual
document under Monitoring > Zabbix Monitoring in CBIS > VM
Level Monitoring.
Test Case ID
Supported All
From Version
Test Execution 1. Create a flavor with 1 VCPU, 1024 MB RAM and 100G disk space.
2. Initiate a VM using the created flavor.
3. Login to Zabbix from CBIS Manager > EXTERNAL TOOLS > Zabbix or alternatively
navigate to https://<OpenStack_External_VIP>/zabbix
4. Inside Zabbix navigate to Monitoring > Latest data.
6. After 60 seconds Zabbix is expected to trigger the alarm: High usage of CPU on VM:
<VM ID>
7. Stop the CPU consumption on the VM.
Note: If you used the command yes > /dev/null & remember to kill it!
(Using killall might also work.)
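The CPU load for this test is typically generated with yes > /dev/null &, as the Note suggests. A minimal sketch of starting and cleanly stopping it (the 1-second sleep is only for the sketch; on the real VM, leave the load running until the alarm fires, roughly 60 seconds):

```shell
# Generate full load on one core, then stop it (steps 5 and 7).
yes > /dev/null &
load_pid=$!
sleep 1                           # on the real VM: leave running ~60s
kill "$load_pid"                  # stop the CPU consumption
wait "$load_pid" 2>/dev/null || true
kill -0 "$load_pid" 2>/dev/null || echo "load generator stopped"
```

Capturing the PID and killing it directly avoids the risk of killall matching an unrelated process.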
8. Within 60 seconds of the moment the VM VCPU drops below 70%, the Zabbix alarm High
usage of CPU on VM:<VM ID> is expected to disappear.
5. The VM should reach 90%+ CPU utilization. It should stay like this until the Zabbix alarm
High usage of CPU on VM: <VM ID> is shown (the estimated time for the alarm to trigger is
60 seconds).
6. The alarm High usage of CPU on VM: <VM ID> should show in the Zabbix dashboard.
8. The alarm High usage of CPU on VM: <VM ID> disappeared from the Zabbix dashboard.
Status OK | not OK
Comments
Test Case ID
Objective Verify that the customer can provide the CA and Server certificates, specifically to support the
TLS service.
Estimated 4 Hours.
Duration
Supported 20 SP2
From Version
Prerequisite/s
Status OK | not OK
Comments
Description CBIS is automatically installed with TLS enabled. Connections between the client and server
are encrypted using TLS. Accessing any of the external web-interfaces such as Horizon,
Zabbix and Kibana is done only via HTTPS. Trying to access one of the web-interfaces via
HTTP will automatically redirect you to HTTPS.
Test Case ID
Objective Verify that CBIS is installed with TLS and that only HTTPS access is available to the CBIS GUIs.
Supported 18.0
From Version
Status OK | not OK
Comments
Description The CBIS platform offers the possibility of performing security hardening of the system as
a post-installation automated procedure. The post-installation hardening is done using the
Ansible framework.
The hardening includes multiple configurable parameters and defines the set of supported
security specifications (STIG, CIS, ANSSI and others), and their default value settings.
• RPM Hardening
• Password Hardening
• Boot Hardening
• Advanced Intrusion Detection Environment (AIDE) Configuration
• Kernel Hardening
• File/Directory Permissions
• Audit Rules
• SSH Hardening
• TLS Hardening
• Web Hardening
• NTP Configuration
• User Management
• OpenStack Hardening Configuration
• Miscellaneous
Note: In check mode, you can check every supported hardening task without
applying the changes on the system.
Navigate to the CBIS Manager > SECURITY > Security Hardening Check
Mode as shown here:
Test Case ID
Objective Deploy the security hardening check mode and verify that the ssh ClientAliveInterval
can be changed on all hosts, Undercloud and the Undercloud Physical Server.
Supported CBIS 20
From Version
Test Execution 1. Check that the RHEL-07-040190 / CIS-5.2.13 - Terminate SSH session after a period
of inactivity - Idle Timer security task is not yet configured on the system, by
executing the following:
• For all hosts, run the following from the Undercloud VM:
CBIS Manager > SECURITY > Security Hardening > In the Task selection – check
mode Security Tasks, select Specific TAG(s) and write the tag as shown here:
Navigate to CBIS Manager > Security > Security Hardening Check Mode >
SSH hardening Check Mode, ensure that the RHEL-07-040190 / CIS-5.2.13 -
Terminate SSH session after a period of inactivity - Idle Timer security task button is
enabled, and set the interval to 60 (sec) as shown here:
Click DEPLOY.
3. Check that the RHEL-07-040190 / CIS-5.2.13 - Terminate SSH session after a period of
inactivity - Idle Timer security task is still not configured on the system after running
the check-mode, by executing the following:
• For all hosts, run the following from the Undercloud VM:
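The per-host check amounts to looking for the ClientAliveInterval directive in sshd_config (the exact salt one-liner from the source is not reproduced here). A self-contained sketch against a sample config file; the real-host command is in the trailing comment:

```shell
# Sketch: before hardening, the ClientAliveInterval directive should be
# absent from sshd_config, so the grep output is empty. A sample config
# stands in for /etc/ssh/sshd_config here.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Port 22
PermitRootLogin no
EOF
grep -i '^ClientAliveInterval' "$cfg" || echo "not configured"

# On a real host:
#   sudo grep -i '^ClientAliveInterval' /etc/ssh/sshd_config
```

After the hardening is deployed (not check mode), the same grep is expected to return ClientAliveInterval 60 instead of an empty result.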
Expected 1. The result returned from the salt command, from the Undercloud and from the Undercloud
Result Physical Server should be empty (provided that the security hardening was not run
before).
2. The Log window is displayed as follows:
Note: Check that the changed column contains progressive numbering; this
means that the value will be applied when the hardening takes place.
3. The result returned from the salt command, from the Undercloud and from the
Undercloud Physical Server should be empty (provided that the security hardening
was not run before).
Status OK | not OK
Comments
Description The CBIS platform offers the possibility of performing security hardening of the system as
a post-installation automated procedure. The post-installation hardening is done using the
Ansible framework.
The hardening includes multiple configurable parameters and defines the set of supported
security specifications (STIG, CIS, ANSSI and others), and their default value settings.
• RPM Hardening
• Password Hardening
• Boot Hardening
• Advanced Intrusion Detection Environment (AIDE) Configuration
• Kernel Hardening
• File/Directory Permissions
• Audit Rules
• SSH Hardening
• TLS Hardening
• Web Hardening
• NTP Configuration
• User Management
• OpenStack Hardening Configuration
• Miscellaneous
Note: In check mode, you can check every supported hardening task without
applying the changes on the system.
Test Case ID
Objective Deploy the security hardening and verify that the ssh ClientAliveInterval changed on
all hosts, Undercloud and the Undercloud Physical Server.
Supported CBIS 20
From Version
Test Execution 1. Check that the RHEL-07-040190 / CIS-5.2.13 - Terminate SSH session after a period
of inactivity - Idle Timer security task is not yet configured on the system, by
executing the following:
• For all hosts, run the following from the Undercloud VM:
2. Run the Security Hardening for the Disable the idle timer - keep the idle SSH
session active security task, and navigate as follows:
CBIS Manager > SECURITY > Security Hardening > In the Task selection – Security
Tasks, select Specific TAG(s) and write the tag as shown here:
Navigate to CBIS Manager > Security > Security Hardening Check Mode >
SSH hardening Check Mode, ensure that the RHEL-07-040190 / CIS-5.2.13 -
Terminate SSH session after a period of inactivity - Idle Timer security task button is
enabled, and set the interval to 60 (sec) as shown here:
Click DEPLOY.
3. Check whether the Terminate SSH session after a period of inactivity - Idle Timer
security task is configured on the system after the run, by executing
the following:
• For all hosts, run the following from the Undercloud VM:
4. ssh to the Undercloud Physical Server and from there ssh to the Undercloud VM. Wait
60 seconds and see that the session was terminated.
Expected 1. The result returned from the salt command, from the Undercloud and from the Undercloud
Result Physical Server should be empty (provided that the security hardening was not run
before).
2. The Log window is displayed as follows:
Note: Check that the changed column contains progressive numbering; this
means that the value will be applied when the hardening takes place.
3. The result returned from the salt command, from the Undercloud and from the
Undercloud Physical Server should be empty (provided that the security hardening
was not run before).
4. The session terminated towards the Undercloud after 60 seconds.
Status OK | not OK
Comments
Description The CBIS platform offers the possibility of performing security hardening of the system as
a post-installation automated procedure. The post-installation hardening is done using the
Ansible framework.
Hardening Rollback is supported for most of the hardened tasks in the system (see the
security hardening documentation).
• RPM Hardening
• Password Hardening
• Boot Hardening
• Advanced Intrusion Detection Environment (AIDE) Configuration
• Kernel Hardening
• File/Directory Permissions
• Audit Rules
• SSH Hardening
• TLS Hardening
• Web Hardening
• NTP Configuration
• User Management
• OpenStack Hardening Configuration
• Miscellaneous
Navigate to the CBIS Manager > SECURITY > Security Hardening as shown here:
Test Case ID
Objective Deploy the security hardening rollback and verify that the ssh ClientAliveInterval was
disabled on all hosts, Undercloud and the Undercloud Physical Server.
Supported CBIS 20
From Version
Test Execution 1. Check that the RHEL-07-040190 / CIS-5.2.13 - Disable the idle timer - keep the idle
SSH session active security task is configured on the system, by executing the
following:
• For all hosts, run the following from the Undercloud VM:
CBIS Manager > Security > Security Hardening Rollback > In the Task selection –
Security Tasks select Specific TAG(s) and write the tag as shown here:
Navigate to CBIS Manager > Security > Security Hardening Rollback > SSH
hardening and ensure that the RHEL-07-040190 / CIS-5.2.13 – Disable the idle timer -
keep the idle SSH session active security task button is enabled, as shown here:
Click DEPLOY.
3. Check that the RHEL-07-040190 / CIS-5.2.13 – Disable the idle timer - keep the idle
SSH session active security task has been removed from the system after running the
security hardening Rollback, by executing the following:
• For all hosts, run the following from the Undercloud VM:
4. ssh to the Undercloud Physical Server and from there ssh to the Undercloud VM. Wait
60 seconds and see that the session was terminated.
Expected 1. The result returned from the salt command and from the Undercloud and the
Result Undercloud Physical Server should be ClientAliveInterval 60.
2. The Log window is displayed as follows:
Note: Check that the changed column contains progressive numbering; this
means that the value will be applied when the hardening takes place.
3. The result returned from the salt command and from the Undercloud and the
Undercloud Physical Server should be empty.
4. The session terminated towards the Undercloud after 60 seconds.
Status OK | not OK
Comments
Description Networks which are associated with different Tenants have no connectivity between them.
Test Case ID
Objective Verify that different tenants can use the same CIDR values, but that these tenants have no
network connectivity between their VMs.
Estimated 1 Hour.
Duration
Test Execution 1. Create two networks where each network is associated with a different tenant but
includes the same subnet (same CIDR).
2. Create a security group which allows incoming/outgoing ICMP packets.
3. Create a VM for each tenant using its designated network and the previously created
"Allow ICMP security group".
4. Ping from one VM to the other’s network address.
Status OK | not OK
Comments
Test Case ID
Objective Verify that different users in different roles can be authenticated through an external LDAPS
server.
Estimated 2 Hours.
Duration
Supported 20 SP2
From Version
a. Configure users under a group specific to the OpenStack users (use LDAPAdmin
for that).
d. Users are created and should be visible in the LDAP management interface.
2. Inspect /var/log/cbis/ansible/ansible.log
The Ansible playbook completed successfully and LDAP was set up correctly.
sudo -i
cd /etc/keystone/domains/
cat keystone.LDAP.conf
3. Ensure that the default domain and LDAP domain are present, as follows:
. overcloudrc
openstack domain list
4. Ensure that users under the default domain are displayed, as follows:
5. Ensure that users that have been configured on the LDAP Server are displayed, as
follows:
. overcloudrc
openstack user list --domain ldap
7. Ensure that the project is presented as part of the LDAP domain as follows:
8. Using the UUID obtained earlier, (see steps 4 and 5), assign one of the LDAP users as
an admin role for the project just created:
9. Login to the Horizon interface with the admin LDAP user, specifying the LDAP domain
and the password configured under LDAP.
10. Using this user, create resources such as a network, subnet and volume, and create VMs.
11. On the OpenStack CLI assign one of the LDAP users as a member role for the project
just created:
13. Using this user, create resources such as a network, subnet and volume, and create VMs.
14. Login to Horizon with the admin user defined under the default domain.
15. Create VMs with user resources such as network, subnet and volume.
Status OK | not OK
Comments
For a detailed description of this procedure, see the CBIS Operations Manual, DN09259995.
Test Case ID
Objective Verify that a complete snapshot of the Undercloud VM is successfully backed up.
Estimated 10 min.
Duration
Test Execution 1. Follow the "Undercloud Backup" procedure in the CBIS Manager manual.
2. After a successful Undercloud VM backup procedure, check that the backup file exists in
the backup directory. The default backup directory (if it has not been manually changed)
should be /root/cbis-backup in the Undercloud Physical Server.
Note:
• The operator should store the backup files in an external repository to avoid a
situation where the Undercloud Physical Server is damaged and the backup
files which reside in the Undercloud Physical Server are lost.
• The Undercloud should be backed up manually each time a new node is
added or removed (compute, storage), or if a controller node is replaced.
Status OK | not OK
Comments
For a detailed description of this procedure, see the CBIS Operations Manual, DN09259995.
Description If the Undercloud hardware fails, new hardware of the same type will replace it, and the
Undercloud VM will be restored from the backup snapshot. Perform an Undercloud VM
Restore using the CBIS Manager.
Test Case ID
Estimated 30 min.
Duration
Prerequisite/s An appropriate Undercloud VM backup file is available and resides in the Undercloud
Physical Server.
Test Execution 1. Select one compute from the CBIS Cluster computes. This compute will be erased
purposely and then restored as part of the Undercloud VM restore procedure.
Note: If, for any reason you are unable to SSH to the selected compute,
select a different compute to which you can SSH.
Example
4. Validate that the nova compute server is deleted and does not exist in the nova servers
list:
5. Follow the "Undercloud Restore" procedure as shown in the CBIS Manager manual.
Note: You need to enter the Backup directory where the backup file resides
for the backup(s) to show in the "Backup file to restore" drop-down list.
source ~/stackrc
11. Connect with SSH to the compute that was erased before the Undercloud VM
restore.
12. Create a VM on the restored compute.
Status OK | not OK
Comments
Description The Overcloud is backed up, once a day at 2:00 AM, (by default or as user defined in
installation) using the CBIS backup solution. The Overcloud backup is a complete backup of
the database taken from all the controllers.
Backups are stored on the Undercloud VM at /mnt/backup which is mapped to the NFS
mountpoint. This is configured in the CBIS Manager under CBIS Installation > Overcloud
> Backup Optional Configuration.
Note: The default NFS mountpoint is the Undercloud Physical Server
/root/backup directory.
Test Case ID
Estimated 30 minutes.
Duration
Prerequisite/s As the Overcloud database backup is performed automatically once a day at 2:00 AM, you
must wait for at least 1 backup iteration before backup files are available.
Test Execution 1. Wait until you have at least 1 backup file under /mnt/backup in the Undercloud VM.
This requires waiting 1 backup iteration which takes place at 2:00 AM each day.
2. Check that the backup files exist in the Undercloud VM /mnt/backup directory.
a. The backup file is taken simultaneously from each controller, separately for HA
purposes. Therefore, it is expected that a directory for each Overcloud controller will
be seen:
b. Inside each Overcloud controller directory there are more directories named by the
date and time that the backup was taken:
c. In addition, inside each of the directories represented by date and time the encrypted
db_backup.enc backup file should appear:
df | grep /mnt/backup
Note: If using an external NFS server (not the default Undercloud Physical
Server), make sure to allow access from the Undercloud VM to the NFS server
(ports, firewall, etc.).
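The directory layout described in steps 2a-c can be verified with a short find pipeline. A sketch against a simulated backup tree (the controller names and date stamp are taken from the document's conventions; a temp directory stands in for /mnt/backup):

```shell
# Sketch of step 2: each controller directory under the backup mount
# should contain dated directories with a db_backup.enc file.
backup=$(mktemp -d)
for c in 0 1 2; do
    mkdir -p "$backup/overcloud-controller-$c/2018.12.05.02.00.02"
    touch "$backup/overcloud-controller-$c/2018.12.05.02.00.02/db_backup.enc"
done
# One encrypted backup per controller is expected per iteration:
find "$backup" -name db_backup.enc | wc -l

# On the Undercloud VM, also confirm /mnt/backup is the NFS mount:
#   df | grep /mnt/backup
```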
Expected 1. The setup was up and running from 2:00 AM to at least 3:00 AM, (According to the
Result default backup time, or 1 hour after the manually defined backup time).
2. Assuming the setup was working correctly at 2:00 AM, it is expected that there will be
backup files under /mnt/backup in the Undercloud VM.
3. The /mnt/backup directory is mapped to the NFS mountpoint as configured in the
CBIS Manager.
4. The backup files should reside in the NFS mountpoint.
Status OK | not OK
Comments
Test Case ID
Objective Verify that the Overcloud database restore is performed correctly.
Estimated ~ 2 hours. The procedure duration depends on the size of setup and the amount of data.
Duration
Supported CBIS 20
From Version
Prerequisite/s A valid backup: it was taken on the same CBIS cluster. A backup file cannot be used on
new CBIS deployments.
Test Execution 1. Open CBIS Manager > CBIS OPERATIONS and verify that the Overcloud database
restore option is displayed.
2. Enter the Backup Folder Location. The location of the folder containing the backup file
(db_backup.enc).
/root/backup/overcloud-controller-0/2018.12.05.04.00.02
3. On the setup, create a number of VMs (the user can choose from 5 to 10). Create at
least 1 volume and attach it to 1 VM. Mount the volume and create a large file on it.
Record the md5sum of the file. Check that all VMs are up and running.
4. Make sure you have the relevant backup file in the Hypervisor under /root/backup
and in the Undercloud VM under /mnt/backup.
sudo -i
mysql
DROP DATABASE nova;
• At the CBIS Manager, enter the operation section, click on the Overcloud
database restore tab, fill the Backup Folder Location field, enter the Backup
Password as set in the user_config.yaml.
The Overcloud DB should be restored to the previous working version without the
issue.
7. Verify that all VMs created in step 3 are also restored and are up and running.
8. Verify that the large volume is attached and undamaged by comparing the md5sum
with the md5sum recorded in step 3.
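The md5sum comparison in steps 3 and 8 can be sketched as below. A temporary file stands in for the large file on the attached volume, and the "restore" is elided, so the sketch is self-contained:

```shell
# Sketch: record a checksum of the large file before the restore and
# verify it afterwards (a temp file stands in for the volume data).
vol=$(mktemp -d)
dd if=/dev/zero of="$vol/large_file" bs=1024 count=64 2>/dev/null
md5sum "$vol/large_file" > "$vol/large_file.md5"   # before the restore

# ... Overcloud database restore happens here ...

md5sum -c "$vol/large_file.md5" && echo "volume data undamaged"
```

Keep the recorded .md5 file somewhere outside the volume under test, so the comparison is still possible if the volume itself is damaged.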
Expected 1. The CBIS Manager > CBIS OPERATIONS page was reached and the Overcloud database
Result restore option was displayed.
2. The Backup Folder Location, the location of the folder containing the
backup file (db_backup.enc), was entered.
3. Running VMs on the setup were created. The volume was attached to one of the VMs. A
large file was created successfully. All VMs were up and running.
4. The relevant backup file was in the Hypervisor under /root/backup and in the
Undercloud VM under /mnt/backup.
5. The nova database was dropped.
6. The Overcloud database restore procedure was performed.
7. All VMs created in step 3 were restored and were up and running.
8. The large volume was attached and undamaged; the md5sum matched the value recorded
in step 3.
9. Sanity checks and several tests, such as creating VMs, were performed. Everything
was working.
10. Zabbix had NO unexpected alarms.
Status OK | not OK
Comments
Test Case ID
Objective Verify that the compute has been successfully removed from the Overcloud.
Supported All
From Version
Test Execution 1. If the CBIS Cluster is an HCI (Hyper-converged infrastructure) system (meaning
the Ceph OSDs reside on the computes) check the Ceph status before the scale in
operation. From one of the controllers, execute:
sudo ceph -s
Note: It is not advisable to start the compute scale-in operation if the Ceph status
isn't HEALTH_OK. If the Ceph status returns HEALTH_WARN, make sure
to acknowledge the warning. If the warning is acknowledged and accepted,
continue with the scale-in operation. Otherwise, fix the Ceph issues and then
continue with the scale-in operation. If the Ceph status returns HEALTH_ERR,
you must fix the Ceph error and only then continue with the scale-in
operation.
4. Check that the scaled-in compute does not exist in the nova servers list.
5. If the scaled-in host is an HCI compute, check that all the OSDs are removed as
expected and that the Ceph status is HEALTH_OK or as it was before the scale-in
operation. From one of the controllers, execute:
sudo ceph -s
sudo ceph osd tree
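As a sketch, the HEALTH_OK gate used throughout these steps can be scripted. This is not part of the ATP itself; the captured file below stands in for live sudo ceph -s output, which you would capture on a controller.

```shell
# Stand-in for live output; on a controller run: sudo ceph -s > /tmp/ceph-status.txt
cat > /tmp/ceph-status.txt <<'EOF'
  cluster:
    id:     3f8a0000-0000-0000-0000-000000000000
    health: HEALTH_OK
EOF

# Succeeds only when the status dump reports HEALTH_OK
check_ceph_health() {
    grep -q 'health: HEALTH_OK' "$1"
}

if check_ceph_health /tmp/ceph-status.txt; then
    echo "Ceph is HEALTH_OK - safe to proceed with the scale-in"
else
    echo "Ceph is degraded - fix or acknowledge before scaling in"
fi
```

The same check can be reused before any scale-in, scale-out, maintenance or reboot operation in this chapter.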
Expected 1. The Ceph status is acknowledged and accepted before starting the scale in procedure.
Result 2. The scale in operation finished successfully.
3. The compute has been removed from baremetal (ironic).
4. The compute has been removed from nova servers list.
5. Ceph health is either HEALTH_OK or as it was before the scale-in operation. The
compute and its OSDs are removed from the sudo ceph osd tree table.
Note: Step 5 above is only relevant if the CBIS Cluster is an HCI (Hyper-
converged infrastructure) system (that is, the Ceph OSDs reside on the
computes).
Status OK | not OK
Comments
Test Case ID
Objective Verify that a compute has been successfully added to the Overcloud.
Supported All
From Version
Prerequisite/s • An available unused compute for the scale-out operation (you will need its IPMI
address). If there is none, you need to scale in a compute before the scale-out
operation.
• CBIS allows the user to automatically execute the security hardening on the scaled-
out compute by enabling the "Run Security Hardening Post Scale out" radio button
in the scale-out operation page in CBIS Manager. This is only relevant if the security
hardening was applied to the CBIS Cluster before the scale-out operation.
Test Execution 1. If the CBIS Cluster is an HCI (Hyper-converged infrastructure) system (meaning the
Ceph OSDs reside on the computes) check the Ceph status before the scale-out
operation. From one of the controllers, execute:
sudo ceph -s
Note: It is not advisable to start the compute scale-out operation if the Ceph
status is not HEALTH_OK. If the Ceph status returns HEALTH_WARN, make
sure to acknowledge the warning. If the warning is acknowledged and
accepted, continue to the scale-out operation. Otherwise, fix the Ceph issues
and then continue with the scale-out operation. If the Ceph status returns
HEALTH_ERR, it is required to fix the Ceph error and only then continue with
the scale-out operation.
4. Check that the scaled-out compute exists in the nova servers list.
sudo ceph -s
sudo ceph osd tree
7. Initiate a VM on the scaled-out compute using a permissive security group (only relevant
to OVS/DPDK VMs).
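Step 7 can be sketched as below. Every name (security group, image, flavor, network, compute host, VM) is illustrative rather than taken from the ATP, and DRY_RUN=1 only prints the commands so the sequence can be reviewed before running it against a real cloud.

```shell
# Dry-run wrapper: echo the command instead of executing it
DRY_RUN=1
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Permissive security group allowing ICMP and SSH (names are illustrative)
run openstack security group create atp-permissive
run openstack security group rule create --protocol icmp atp-permissive
run openstack security group rule create --protocol tcp --dst-port 22 atp-permissive

# Boot the VM on the scaled-out compute (the zone:host hint is admin-only)
run openstack server create --image redhat-7.4-ATP.qcow2 --flavor m1.small \
    --network tenant-net --security-group atp-permissive \
    --availability-zone nova:scaled-out-compute-0 atp-vm
```

Set DRY_RUN=0 (with the relevant credentials sourced) to actually execute the commands.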
Expected 1. The Ceph status is acknowledged and accepted before starting the scale-out procedure.
Result 2. The scale out operation finished successfully.
3. The compute is shown in baremetal (ironic).
4. The compute is shown in the nova servers list.
5. A connection was made with SSH to the scaled-out compute.
6. Ceph health is either HEALTH_OK or as it was before the scale-out operation. The
number of presented OSDs, as shown in sudo ceph osd tree, corresponds to the
number of physical disks of the added compute.
Note: Step 6 above is only relevant if the CBIS Cluster is an HCI (Hyper-
converged infrastructure) system (meaning the Ceph OSDs reside on the
computes).
Status OK | not OK
Comments
Test Case ID
Objective Verify that a storage node is successfully removed from the Overcloud.
Estimated ~1 hour
Duration
Note: The duration of scaling in a storage node depends on the number of objects
and the amount of data within the Ceph cluster, which determines the time it
takes for Ceph to finish rebalancing. These variables make it hard to give an
accurate estimated duration for the scale-in process. As an example, it may
take several hours if the Ceph cluster is 50% allocated.
Supported 18.0
From Version
Test Execution 1. Check the Ceph status before the scale-in operation. From one of the controllers
execute:
sudo ceph -s
Note: It is not advisable to start the storage node scale-in operation if the Ceph
status is not HEALTH_OK. If the Ceph status returns HEALTH_WARN,
make sure to acknowledge the warning. If the warning is acknowledged and
accepted as a valid warning, continue with the scale-in operation. Otherwise,
fix the warning root cause and then continue with the scale-in operation. If the
Ceph status returns HEALTH_ERR, it is mandatory to fix the Ceph error and
only then continue with the scale-in operation.
3. Check that the scaled-in storage node does not exist in the baremetal (ironic) database.
From the Undercloud VM, execute:
4. Check that the scaled-in storage node does not exist in nova servers list. From the
Undercloud VM, execute:
5. Check that all the OSDs are removed as expected and that the Ceph status returns
HEALTH_OK, or that the Ceph status is as it was before the scale-in operation. From one
of the controllers, execute:
sudo ceph -s
sudo ceph osd tree
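The check in step 5 can be partly automated. In this sketch, osd-tree.txt stands in for live sudo ceph osd tree output and storage-2 is an illustrative name for the scaled-in node; neither comes from the ATP itself.

```shell
# Stand-in capture; on a controller run: sudo ceph osd tree > /tmp/osd-tree.txt
cat > /tmp/osd-tree.txt <<'EOF'
ID CLASS WEIGHT  TYPE NAME          STATUS
-1       3.2     root default
-3       1.6     host storage-0
 0  hdd  0.8         osd.0          up
-5       1.6     host storage-1
 1  hdd  0.8         osd.1          up
EOF

# Succeeds if the named host still appears in the CRUSH tree
host_in_tree() { grep -q "host $2\$" "$1"; }

if host_in_tree /tmp/osd-tree.txt storage-2; then
    echo "storage-2 still present - scale-in incomplete"
else
    echo "storage-2 removed from the CRUSH tree"
fi
```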
Expected 1. The Ceph status is acknowledged and accepted before starting the scale-in procedure.
Result 2. The scale in operation finished successfully.
3. The storage node has been removed from the baremetal (ironic).
4. The storage node has been removed from nova servers list.
5. The Ceph health is either HEALTH_OK or as it was before the scale-in operation. The
storage node and its OSDs are removed from the sudo ceph osd tree table.
6. The storage node has been removed from Zabbix.
Status OK | not OK
Comments
Description With CBIS, you can add a storage node to an existing working CBIS system via the CBIS
Manager. By adding more storage nodes, the capacity of the Ceph cluster is increased as
well as the maximum IOPS.
Test Case ID
Objective Verify that a storage node has been successfully added to the Overcloud.
Note: Enabling "Run Security Hardening Post Scale out" is relevant if the
security hardening was applied on the CBIS Cluster before the scale-out
operation.
Test Execution 1. Check the Ceph status before the scale out operation.
sudo ceph -s
3. Check that the scaled-out storage node exists in the baremetal (ironic) database. From
the Undercloud VM, execute:
4. Check that the scaled-out storage node exists in nova servers list. From the Undercloud
VM, execute:
sudo ceph -s
sudo ceph osd tree
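The OSD-count comparison in the expected results can be scripted as follows. The tree snippet, host name and expected disk count are illustrative stand-ins; feed the script real sudo ceph osd tree output and the disk count of the node you actually added.

```shell
# Stand-in capture; on a controller run: sudo ceph osd tree > /tmp/osd-tree.txt
cat > /tmp/osd-tree.txt <<'EOF'
-3  host storage-2
 4      osd.4   up
 5      osd.5   up
 6      osd.6   up
EOF
NEW_HOST=storage-2     # illustrative name of the scaled-out node
EXPECTED_DISKS=3       # physical disks on that node

# Count osd.* entries between the host line and the next host line
osd_count=$(awk -v h="host $NEW_HOST" \
    '$0 ~ h {f=1; next} /host /{f=0} f && /osd\./ {n++} END{print n+0}' \
    /tmp/osd-tree.txt)
echo "OSDs on $NEW_HOST: $osd_count (expected $EXPECTED_DISKS)"
```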
Expected 1. The Ceph status is acknowledged and accepted before starting the scale-out procedure.
Result 2. The scale-out operation has finished successfully.
3. The storage node is shown in baremetal (ironic).
4. The storage node is shown in nova servers list.
5. Connected with SSH to the scaled out storage node.
6. The Ceph health is either HEALTH_OK or as it was before the scale-out operation. The
number of presented OSDs, as shown in sudo ceph osd tree, corresponds to the
number of physical disks of the added storage node.
Status OK | not OK
Comments
Description A compute node may need to be set to maintenance (gracefully shut down) to allow
the replacement of faulty HW components such as disks, NICs or memory, or any other
maintenance that must be performed on a single node at a time, so that the system High
Availability features retain normal operation.
The CBIS Manager Maintenance Mode Compute, Storage and Monitoring feature enables
you to set/unset compute, storage and monitoring nodes to/from maintenance mode.
Note: As of CBIS R20, this feature enables you to set to maintenance several
non-HCI computes at once (as this does not affect storage redundancy), and/or a
single storage / HCI compute at a time.
As replicating Ceph data may take a long time with large storage, when you have planned
larger activities involving several storage nodes one after the other, with enough storage
redundancy, it is advisable to execute the storage node set/unset to maintenance without
the Replicate Ceph data option, and enable this option only when unsetting the last
storage node from maintenance. This enables replication by the end of the procedure and
the return to a normal cluster storage state.
Test Case ID
Objective Verify that a compute node can be set to maintenance (gracefully shutdown) and can be
powered up, easily and with minimal effect on the rest of the CBIS cluster.
Estimated ~2 hours
Duration
Test Execution 1. If the required node is an HCI (Hyper-converged infrastructure) compute (meaning
the Ceph OSDs reside on the computes), check the Ceph status before the set to
maintenance operation. From one of the controllers, execute:
sudo ceph -s
2. Check that all the CBIS web portals such as Kibana, Zabbix, Horizon, Vitrage (inside
Horizon up to V19A) and Ceph manager dashboard are accessible and viable. Access
the web portals via CBIS Manager > External Tools.
3. Check the Zabbix Dashboard for any existing failures; preferably fix them before
starting, or record the status so that you can verify that the cluster returned to the
same condition after the unset from maintenance. You can also track the events
reported during the procedure using Zabbix > Monitoring > Problems > History > Last 2
days.
4. Invoke Maintenance Mode Compute – Set to maintenance, select the required
compute node, and if needed, change the default selected parameters and click
DEPLOY. Track the execution log and verify that it ends successfully.
5. Verify that the compute was set to maintenance and shut down using openstack
baremetal node list, and check the Zabbix and Ceph Dashboards for any existing
failures. Now you can perform any required HW maintenance.
6. Run the Maintenance Mode Compute – Unset from maintenance, select the required
compute node, and if needed, change the default selected parameters and click
DEPLOY. Track the execution log and verify that it ends successfully.
7. Verify that the compute was unset from maintenance and powered up using openstack
baremetal node list, and check the Zabbix and Ceph Dashboards for any existing
failures compared to the previous state before the start of this test.
8. You may now create new VMs and/or migrate back any previously removed VMs.
9. Check that the VMs are up and running and reply to ICMP packets (ping).
10. Once again, access the CBIS web portals such as Kibana, Zabbix, Horizon, Vitrage
(inside Horizon up to V19A) and the Ceph manager dashboard and check that they are
accessible and viable as they were before the Set to maintenance procedure.
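The maintenance-state checks in steps 5 and 7 boil down to reading the Maintenance column of openstack baremetal node list. A sketch follows; the captured table and node names are illustrative, not output from a real cluster.

```shell
# Stand-in for: openstack baremetal node list > /tmp/bm-list.txt
# Columns here: | name | power state | maintenance |
cat > /tmp/bm-list.txt <<'EOF'
| overcloud-compute-3 | power off | True  |
| overcloud-compute-4 | power on  | False |
EOF

# Succeeds if the named node's Maintenance column reads True
node_in_maintenance() {
    awk -F'|' -v n="$2" '$2 ~ n {gsub(/ /,"",$4); print $4}' "$1" \
        | grep -q True
}

node_in_maintenance /tmp/bm-list.txt compute-3 \
    && echo "compute-3 is in maintenance" \
    || echo "compute-3 is active"
```

After the unset from maintenance (step 7), the same check is expected to fail, confirming the node is back in service.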
Status OK | not OK
Comments
Description The Reboot Servers operation in CBIS Manager enables you to gracefully reboot specific
compute, storage, monitoring and controller node/s, as well as reboot all nodes belonging
to an aggregate, node type or entire cluster, or any user selected combination out of the
above.
Reboot activity is performed one node at a time out of the selected list, according to a
suggested order that reduces traffic and storage failures: first controllers, then storage,
then HCI, and finally all non-HCI computes together.
Reboot is used as part of the upgrade procedure and in any other case where rebooting
nodes is needed.
Test Case ID
Objective Verify that the cluster nodes can be gracefully rebooted, easily and with minimal effect on
the rest of the CBIS cluster.
Estimated ~2 hours.
Duration
Test Execution 1. If the desired node is an HCI (Hyper-converged infrastructure) compute (meaning
the Ceph OSDs reside on the computes), check the Ceph status before the reboot
operation. From one of the controllers, execute:
sudo ceph -s
Note: It is not advisable to start the reboot server operation if the Ceph
status is not HEALTH_OK. If the Ceph status returns HEALTH_WARN,
make sure to acknowledge the warning. If the warning is acknowledged
and accepted, continue to the reboot server operation. Otherwise, fix the
Ceph issues and then continue with the reboot server operation. If the Ceph
status returns HEALTH_ERR, you need to fix the Ceph error and only then
continue with the reboot server operation.
2. Check that all the CBIS web portals such as Kibana, Zabbix, Horizon, Vitrage (inside
Horizon up to V19A) and Ceph manager dashboard are accessible and viable. Access
the web portals via CBIS Manager > External Tools.
3. Check the Zabbix Dashboard for any existing failures; preferably fix them before
starting, or record the status so that you can verify that the cluster returned to the
same condition after the reboot. You can also track the events reported during the
procedure using Zabbix > Monitoring > Problems > History > Last 2 days.
4. Invoke Reboot Servers, select the desired combination of server node/s, aggregates,
etc., change the default selected parameters if needed, and click DEPLOY. Track the
execution log and verify that it ends successfully.
5. Verify that all selected computes were rebooted. Check the Zabbix and Ceph
Dashboards for any existing failures and compare to the previous state before the start
of this test.
6. You may now create new VMs and/or migrate back any previously removed VMs.
7. Check that the VMs are up and running and reply to ICMP packets (ping).
8. Once again, access the CBIS web portals such as Kibana, Zabbix, Horizon, Vitrage
(inside Horizon up to V19A) and the Ceph manager dashboard and check that they are
accessible and viable as they were before the Reboot Servers procedure.
9. Optionally repeat for other types of servers or groups.
Status OK | not OK
Comments
Test Case ID
Objective Verify that the Replace Controller action is performed via the CBIS Manager.
Estimated ~ 4 hours.
Duration
Prerequisites The setup should be installed and one server should be left out to be wiped as a candidate
for the new controller.
Test Execution 1. After the installation is complete, deploy a VM on each of the hosts.
2. Verify ping (cross compute) different VMs.
3. Verify on one of the controllers that the Ceph health is OK and that the PCS status is
OK.
4. Perform Replace Controller using CBIS Manager under the CBIS OPERATIONS menu
using the disk wiped server.
5. Verify ping (cross compute) that different VMs are not lost during the Replace Controller
operation.
6. Verify that the Replace Controller operation has passed OK and check the logs.
7. Deploy a VM and verify ping (cross compute) that different VMs are not lost after the
Replace Controller operation has been performed.
8. Connect to the new Controller from the Undercloud and verify that the Ceph health is
OK and that the PCS status is OK.
9. Verify that the old controller has been removed from the baremetal node list in the
Undercloud and from the Nova service-list in the Overcloud.
Status OK | not OK
Comments
Description In some cases, a CBIS Cluster will need to be shipped between locations. The CBIS Cluster
Shutdown/Power Up procedure allows the user to gracefully shutdown the CBIS Cluster to
re-locate it with minimum disruption.
In addition, the CBIS Cluster Shutdown/Power Up procedure should also be used for any
kind of maintenance that is required on the entire CBIS Cluster.
Test Case ID
Objective Verify that a CBIS cluster can be put into a maintenance mode and can be powered up after
a graceful shutdown procedure.
Estimated 5 hours.
Duration
Supported All
From Version
Prerequisite/s 1. The CBIS Cluster has been installed successfully with 3 controllers.
2. The CBIS Manager Manual is available.
3. The Leaf/Spine switch backups have been completed and saved outside the CBIS
cluster. This is relevant when the CBIS Cluster is relocated and when the CBIS Cluster
will be connected to other switches and reconfiguring the switches is required.
Test Execution 1. Check that all the CBIS web portals, such as Kibana, Zabbix, Horizon, Vitrage (only
up to CBIS R19A; not available in CBIS R20) and the Ceph Manager dashboard, are
accessible and viable. Access the web portals via CBIS Manager > External Tools as
shown here:
2. Create a security group that will allow ingress/egress ICMP packets (ping).
3. Create a VM on each unique CBIS host group. For example, if the existing host groups
are SriovPerformanceCompute, DpdkPerformanceCompute and OvsCompute, create
at least 1 x OVS VXLAN VM on the OVS compute, 1 x SR-IOV VM on the SR-IOV
compute and 1 x OVS VLAN VM on the DPDK compute using a flavor with HugePages.
Use the created permissive security group.
4. Validate that the VMs are up and running and reply to the ICMP request (ping).
5. Follow the Manual Graceful Shutdown and Startup procedure (only in versions up to
CBIS 19) in the CBIS Operation Manual document.
Note: From CBIS 19A onwards, use the CBIS Manager manual featuring the
Cluster Shut Down and Powerup Plugin.
6. Verify that all cluster nodes are powered off (using ping to their IPMI addresses), then
continue to the startup phase.
7. Check that the VMs are still up and running and reply to the ICMP packets (ping).
8. Once again, access the CBIS web portals such as Kibana, Zabbix, Horizon, Vitrage
(inside Horizon) and the Ceph manager dashboard and check that they are accessible
and viable as they were before the CBIS Cluster Shutdown/Power Up procedure.
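The power-off verification in step 6 can be looped over the node addresses. The IPMI address list below is illustrative, and DRY_RUN=1 prints the ping commands instead of sending them, so the loop can be rehearsed safely.

```shell
IPMI_ADDRS="10.0.0.11 10.0.0.12 10.0.0.13"   # illustrative addresses
DRY_RUN=1

check_addr() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ ping -c 2 -W 2 $1"
    elif ping -c 2 -W 2 "$1" >/dev/null 2>&1; then
        echo "$1 still answers - node may not be powered off"
    else
        echo "$1 unreachable - powered off as expected"
    fi
}

for addr in $IPMI_ADDRS; do
    check_addr "$addr"
done
```

Set DRY_RUN=0 and substitute the real IPMI addresses of your cluster to perform the check.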
Status OK | not OK
Comments
Note: The given patches are dummy patches, meaning they will not perform
any changes in the system but will be written in the list of applied patches after
loading.
Test Case ID
Estimated 1 hour
Duration
• Dummy patch files (can be downloaded separately from NOLS; see Appendix 2: Artifact
Files for ATP Tests on page 174 in the sub-section titled Patch Files on page 174).
Test Execution
Note: Follow the Patch Management procedure in the CBIS Manager document
to install/rollback the given dummy patches.
1. Install CBIS-20.100.1-DUMMY-PP1.tar.gz
2. Install CBIS-20.100.1-DUMMY-PP1.tar.gz.sha1
3. Rollback CBIS-20.100.1-DUMMY-PP2.tar.gz
4. Rollback CBIS-20.100.1-DUMMY-PP2.tar.gz.sha1
Status OK | not OK
Comments
Description With Multi CBIS clusters, the user can manage other CBIS Managers with the ability to
add, remove, edit and manage patches on each of the CBIS clusters using the Multi CBIS
Operations feature.
Test Case ID
Estimated 1 hour.
Duration
• Dummy patch files and their corresponding sha1 checksum files can be downloaded
separately from NOLS. See Appendix 2: Artifact Files for ATP Tests on page 174 in
the sub-chapter titled Patch Files on page 174.
Test Execution
Note: Follow the Multi CBIS Operations procedure from the CBIS Manager
manual. The Multi CBIS Operations assumes that the procedure Manage Multi
CBIS was already executed and that there is at least one remote CBIS cluster in
the Multi CBIS management clusters list.
Status OK | not OK
Comments
Open a tmux session to keep the session persistent and to allow sharing it in case of failure.
Create a session with tmux new -s <name>; to list and re-attach to an existing session:
tmux ls
tmux a -t <name/number of session (left column of the ls output)>
ssh stack@uc
• Source the credential file for the relevant project in the Undercloud:
Then execute:
• List of VMs:
Execute:
Execute:
Execute:
Execute:
Execute:
5.5 Flavors
• List of Flavors:
Execute:
• Create Flavor:
Execute:
Example:
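The exact flavor commands are not reproduced above; a typical sequence, with illustrative flavor names and sizes, looks like this (printed rather than executed so it can be reviewed first):

```shell
run() { echo "+ $*"; }   # print-only stand-in for executing the command

run openstack flavor list
run openstack flavor create --vcpus 2 --ram 4096 --disk 40 m1.atp
# HugePages extra spec for VMs on DPDK computes (standard Nova property)
run openstack flavor set m1.atp --property hw:mem_page_size=large
```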
5.6 Images
• List of Images:
Execute:
• Create Image:
Execute:
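The exact image commands are not reproduced above; a typical sequence using the ATP image named in Appendix 6.1 looks like this (printed rather than executed):

```shell
run() { echo "+ $*"; }   # print-only stand-in for executing the command

run openstack image list
run openstack image create --disk-format qcow2 --container-format bare \
    --file redhat-7.4-ATP.qcow2 redhat-7.4-ATP
```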
5.7 Volumes
• List of volumes:
Execute:
• Create a volume:
Execute:
• Attach a volume:
Execute:
• Detach a volume:
Execute:
Execute:
Execute:
Execute:
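The exact volume commands are not reproduced above; a typical list/create/attach/detach sequence, with illustrative volume and server names, looks like this (printed rather than executed):

```shell
run() { echo "+ $*"; }   # print-only stand-in for executing the command

run openstack volume list
run openstack volume create --size 10 atp-vol
run openstack server add volume atp-vm atp-vol
run openstack server remove volume atp-vm atp-vol
```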
For creating multiple VMs with network and subnet, execute the following (VMs are spread
between computes in a round-robin):
• Create multiple VMs with VLAN network and subnet (physnet1, for example)
For creating multiple VMs with network and subnet, execute the following (VMs are spread
between computes in a round-robin):
• Stack List:
Execute:
• Create tenant:
Execute:
Execute:
• Create tenant:
Execute:
5.12 Networks
• List Networks:
Execute:
• List Subnets:
Execute:
Execute:
• Create a new VXLAN network and subnet (physnet0 > 1st NIC):
The above will print the new network's parameters to stdout. You can always use neutron
net-show <net id> to list all details.
The above will print the new subnet's parameters to stdout. You can always use neutron
subnet-show <sub id> to list all details.
• Unmanaged Network:
If you want to create a neutron network but disable DHCP for the instances, just use:
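A sketch of such an unmanaged network, with illustrative names and address range: the subnet is created with DHCP disabled so Neutron does not hand out addresses to the instances (printed rather than executed):

```shell
run() { echo "+ $*"; }   # print-only stand-in for executing the command

run openstack network create unmanaged-net
run openstack subnet create --network unmanaged-net \
    --subnet-range 192.168.50.0/24 --no-dhcp unmanaged-subnet
```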
• Create a new VLAN network and subnet (physnet0 > 1st NIC):
The above will print the new network's parameters to stdout. You can always use neutron
net-show <net id> to list all details.
The above will print the new subnet's parameters to stdout. You can always use neutron
subnet-show <sub id> to list all details.
For direct ports (SR-IOV), we use computes that have been deployed as SR-IOV. There, all NICs
other than the first NIC are dedicated to serving instances on direct ports.
Example:
The above will print the relevant parameters for the created port. You can also use neutron
port-show <id> to get the detailed parameter list.
An external network should be set up with the correct subnet/VLAN/gateway of the environment.
To associate a Floating IP from the defined external network, make sure that:
– The instance to which you would like to associate the Floating IP has a private neutron
network attached to the router.
– You have an external network defined and the router is set as a gateway with this network.
With neutron floatingip-list, you will see the list of floating IPs and their status.
+--------------------------------------+------------------+---------------------+---------+
| id                                   | fixed_ip_address | floating_ip_address | port_id |
+--------------------------------------+------------------+---------------------+---------+
| 184a5054-8bab-4eb5-b12e-bdae8f13941b |                  | 172.16.19.155       |         |
+--------------------------------------+------------------+---------------------+---------+
The next step is to find the ID of the instance port that you would like to associate the floating IP:
Then:
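The association flow above can be sketched as below; the external network, server name and floating IP are illustrative (the IP echoes the sample table), and the commands are printed rather than executed:

```shell
run() { echo "+ $*"; }   # print-only stand-in for executing the command

run openstack floating ip create external-net   # allocate a floating IP
run openstack port list --server atp-vm         # find the instance port ID
run openstack server add floating ip atp-vm 172.16.19.155
```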
• VM_Connectivity_validation_extend.py script.
• stack_ATP3_yaml.yml
6.1 Images
In the ATP, there are 2 images that can be used (note that the ATP usually refers to only one). They
are:
• redhat-7.4-ATP.qcow2
user: root
password: password
• stack_ATP1_yaml.yml
• stack_ATP2_yaml.yml
• stack_ATP3_yaml.yml
• CBISHF_0.5_FRAMEWORK_DUMMY_19A_1.tar.gz
• CBISHF_0.5_FRAMEWORK_DUMMY_19A_1.tar.gz.sha1
• CBISHF_0.5_FRAMEWORK_DUMMY_19A_2.tar.gz
• CBISHF_0.5_FRAMEWORK_DUMMY_19A_2.tar.gz.sha1
6.4 Scripts
Using the following scripts may help simplify some of the tests:
• Automatically check connectivity between ALL VMs, or just VMs with a specific name
string, using the VM_Connectivity_validation_extend.py script.
• Automatically check System Health properties (for CBIS 19A and on) using
cbis-11852_system_health_validation.sh