
NOVA

By Anand Nande
AGENDA ITEMS
● What is OpenStack NOVA?

● Its Components

● AMQP and NOVA

● VM state transitions

● Host Aggregates: NUMA, CPU-Pinning

● Server Groups: Affinity, Anti-Affinity

● Migration: Stages, Types, Security, How to interact with live-migration?

● Debugging and Troubleshooting


What's NOVA?

It's the big daddy of all the OpenStack projects

One of the core components - it has been in OpenStack since the very first release

Provides compute-as-a-service and all the plumbing required to run VMs on top

Inherits long-standing qemu/kvm virtualization features and principles

Built on top of a messaging-based architecture

Pluggable/hybrid hypervisor support: xen, lxc, hyperv, esx, docker
NOVA components

[Diagram: NOVA API, NOVA Scheduler, NOVA Compute, NOVA Conductor,
NOVA ConsoleAuth, and NOVA novncproxy, all connected through the message queue]

Components interact via the MQ | external services interact via the REST API
➢ NOVA API:

○ nova-api is responsible for providing an API for users and services to interact with NOVA.

○ For example: spawning an instance from Horizon or the NOVA CLI.

➢ NOVA Scheduler:

○ Using filters, dispatches requests for new virtual machines to the correct node.

➢ openstack-nova-compute: Runs on each node to create and terminate virtual instances. The compute service interacts with the hypervisor to launch new instances, and ensures that the instance state is maintained in the Compute database.

➢ openstack-nova-conductor: Provides database-access support for Compute nodes (thereby reducing security risks).

➢ openstack-nova-consoleauth: Handles console authentication.

➢ openstack-nova-novncproxy: Provides a VNC proxy for browsers (enabling VNC consoles to access virtual machines).
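
On a deployed cloud you can verify these components and their liveness with the nova CLI - a sketch (hostnames are illustrative, output trimmed):

[stack@instack ~]$ nova service-list
| Binary           | Host       | Zone     | Status  | State |
| nova-conductor   | controller | internal | enabled | up    |
| nova-scheduler   | controller | internal | enabled | up    |
| nova-consoleauth | controller | internal | enabled | up    |
| nova-compute     | compute-0  | nova     | enabled | up    |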
What are a keypair and a security group?

➢ Keypair: On standard cloud images of Linux operating systems like Ubuntu and
Fedora, SSH access is restricted to public-key authentication. Instead of
authenticating with a password, you authenticate with a private key that
corresponds to a public key installed on the instance.

➢ Security groups are sets of IP filter rules that are applied to an instance's
networking, i.e. they decide which network traffic to allow or deny. For example:
deny "ssh" access to a specific instance. They are project-specific, and project
members can edit the default rules for their group and add new rule sets. All
projects have a "default" security group, which is applied to instances that have
no other security group defined.
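
A minimal sketch of both concepts with the nova CLI (key, image, and instance names are illustrative):

[stack@instack ~]$ nova keypair-add mykey > mykey.pem                      # private key lands in mykey.pem
[stack@instack ~]$ chmod 600 mykey.pem
[stack@instack ~]$ nova secgroup-add-rule default tcp 22 22 0.0.0.0/0      # allow ssh into the "default" group
[stack@instack ~]$ nova boot --flavor 1 --image <image-id> --key-name mykey --security-groups default test_vm
[stack@instack ~]$ ssh -i mykey.pem cirros@<instance-ip>                   # key-based login, no password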
➢ NOVA call for authentication with keystone:

○ Keystone returns an authentication token along with the service catalog.

REQ: curl -i 'http://10.65.234.1:5000/v2.0/tokens' -X POST -H "Accept: application/json" -H "Content-Type: application/json" -H "User-Agent: python-novaclient" -d '{"auth": {"tenantName": "admin", "passwordCredentials": {"username": "admin", "password": "{SHA1}121c3faea23dd4467fc992f1b77f6eacf8587ed5"}}}'
➢ Keystone response (token + service catalog):

RESP BODY: {"access": {"token": {"issued_at": "2015-05-30T11:05:03.054462", "expires": "2015-05-30T12:05:03Z", "id": "{SHA1}7781e321bfbfbf909ae44027ef60cb92ccce8f2e", "tenant": {"enabled": true, "description": "admin tenant", "name": "admin", "id": "97787e34dc0d4f2b8fc04034eed3594c"},

"serviceCatalog": [{"endpoints_links": [], "endpoints": [{"adminURL": "http://10.65.234.1:8774/v2/97787e34dc0d4f2b8fc04034eed3594c", "region": "RegionOne", "publicURL": "http://10.65.234.1:8774/v2/97787e34dc0d4f2b8fc04034eed3594c", "internalURL": "http://10.65.234.1:8774/v2/97787e34dc0d4f2b8fc04034eed3594c", "id": "42142cca01fd4bc382ac9f95c204e116"}], "type": "compute", "name": "nova"},

{"endpoints_links": [], "endpoints": [{"adminURL": "http://10.65.234.1:9696/", "region": "RegionOne", "publicURL": "http://10.65.234.1:9696/", "internalURL": "http://10.65.234.1:9696/", "id": "466354cac1094127ac0617cf75dd1494"}], "type": "network", "name": "neutron"},

{"endpoints_links": [], "endpoints": [{"adminURL": "http://10.65.234.1:9292", "region": "RegionOne", "publicURL": "http://10.65.234.1:9292", "internalURL": "http://10.65.234.1:9292", "id": "43c49fe7dd8f4315af848b48a53021c1"}], "type": "image", "name": "glance"},

{"endpoints_links": [], "endpoints": [{"adminURL": "http://10.65.234.1:8776/v1/97787e34dc0d4f2b8fc04034eed3594c", "region": "RegionOne", "publicURL": "http://10.65.234.1:8776/v1/97787e34dc0d4f2b8fc04034eed3594c", "internalURL": "http://10.65.234.1:8776/v1/97787e34dc0d4f2b8fc04034eed3594c", "id": "30ce33a6d05e4a80b8a0e22ada52abdb"}], "type": "volume", "name": "cinder"},
[...]
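
The token id from this response is what every subsequent API call carries in the X-Auth-Token header, and the serviceCatalog tells the client which endpoint to hit. A hand-rolled sketch against the compute endpoint above (substitute the real token id):

REQ: curl -s 'http://10.65.234.1:8774/v2/97787e34dc0d4f2b8fc04034eed3594c/servers' -H "Accept: application/json" -H "X-Auth-Token: <token-id-from-keystone>"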
➢ Required details to boot instance:
○ instance name
○ glance image
○ flavor ID
○ network ID
○ security group

➢ NOVA call to boot an instance:

[root@dhcp209-220 ~]# nova boot --flavor 1 --image 2d946232-5773-48df-b8bb-7677f8b6e0fe --nic net-id=97bd405a-77e3-4ef8-836e-8ad1ddb3ee63 --security-groups default pratik_test_instance

[...]
REQ: curl -i 'http://10.65.209.220:8774/v2/27513fe577364ce594d48f629f7b74fd/servers' -X POST -H "Accept: application/json" -H "Content-Type: application/json" -H "User-Agent: python-novaclient" -H "X-Auth-Project-Id: admin" -H "X-Auth-Token: {SHA1}fde39ed28acaf2d30788fced000970f9c7f65dfb" -d '{"server": {"name": "pratik_test_instance", "imageRef": "2d946232-5773-48df-b8bb-7677f8b6e0fe", "flavorRef": "1", "max_count": 1, "min_count": 1, "networks": [{"uuid": "97bd405a-77e3-4ef8-836e-8ad1ddb3ee63"}], "security_groups": [{"name": "default"}]}}'
[...]
NOVA and AMQP interaction
● The API services process REST requests, which typically involve database
reads/writes, optionally sending RPC messages to other Nova services, and
generating responses to the REST calls.

● RPC messaging is done via the `oslo.messaging` library, an abstraction on top of
message queues.

● Most of the major nova components can be run on multiple servers and have a
manager that listens for RPC messages. The one major exception is nova-compute,
where a single process runs on the hypervisor it manages.

[Diagram: REST requests enter via the API; RPC flows through oslo.messaging -
oslo_messaging.rpc.dispatcher on the receiving side,
oslo.messaging._drivers.impl_rabbit as the transport driver]

Dig further?
On a compute node:
* nova-manage logs errors | grep oslo
* nova.conf > [oslo_messaging_rabbit] section
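
That section typically carries the broker connection details - a sketch with illustrative values:

[oslo_messaging_rabbit]
rabbit_host=controller.localdomain
rabbit_port=5672
rabbit_userid=guest
rabbit_password=guest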
NOVA and AMQP (rabbit) interaction
* Nova uses direct, fanout, and topic-based exchanges. [1]

* Each Nova service (for example Compute, Scheduler, etc.) creates two queues at
initialization time: one accepts messages with routing key 'NODE-TYPE.NODE-ID'
(example: compute.hostname);

* the other accepts messages with the generic routing key 'NODE-TYPE' (for
example: compute).

The former is used specifically when Nova-API needs to direct commands to a
specific node, like 'destroy instance'.

[Diagrams: a broader view and a zoomed-in view of the exchange/queue topology]

[1] https://www.rabbitmq.com/tutorials/amqp-concepts.html
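
You can see these queues on the broker with rabbitmqctl (the exact names depend on your hostnames):

[root@controller ~]# rabbitmqctl list_queues name | grep -E '^(compute|scheduler|conductor)'
compute
compute.compute-0.localdomain
scheduler
scheduler.controller.localdomain
conductor
conductor.controller.localdomain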
VM State Transitions
! Let's see how they dance !

[Flow diagram - the boot sequence across services:
1. Horizon sends the API request to NOVA API
2. NOVA API sends the authentication request to Keystone
3. Authentication ACK; NOVA API validates that the provided data is correct
4.-6. The glance image is requested and downloaded (openstack-glance-api /
openstack-glance-registry)
5. NOVA Conductor updates the DB
7. NOVA Scheduler selects a compute host (Compute 'A' / 'B' / 'C')
8.-9. Volume handling via the Cinder services (openstack-cinder-api /
-scheduler / -volume)
10. Update DB]

11. create port (allocate MAC - IP) - neutron-server
12. notify l2 agent
13. configure local VLAN, OVS flows (-l2 agent on the compute node)
14. send port up notification (RPC: l2 agent to Neutron)
15. port up (RPC: Neutron to NOVA)
16. instance booted.
Host Aggregates and Availability Zones

http://wordpress-anande.rhcloud.com/2016/05/24/digging-into-aggregate-groups-and-availability-zones/
NUMA and CPU Pinning
How to interact with numa nodes and monitor usage?

Example showing sub-optimal memory:CPU alignment on the NUMA nodes of a compute

After running `numad`, the memory:CPU alignment is adjusted to the best fit

But we don't use 'numad' in OpenStack

http://bit.ly/1suSHm5
cpu_pinning + NUMA + Live_Migration + ? Coffee

Now let's understand NOVA Flavor extra_specs

hw:cpu_policy=shared V/S hw:cpu_policy=dedicated

More on NUMA:
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-NUMA-NUMA_and_libvirt.html
- https://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/virt-driver-numa-placement.html
- https://access.redhat.com/solutions/2046753

More on cpu-pinning:
- https://access.redhat.com/solutions/2191071
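
A sketch of setting those extra_specs on a flavor with the nova CLI (flavor name and sizes are illustrative):

[stack@instack ~]$ nova flavor-create m1.pinned auto 4096 20 4             # 4 vCPU / 4G RAM / 20G disk
[stack@instack ~]$ nova flavor-key m1.pinned set hw:cpu_policy=dedicated   # each vCPU pinned to a dedicated pCPU
[stack@instack ~]$ nova flavor-key m1.pinned set hw:numa_nodes=1           # confine guest CPUs+RAM to one NUMA node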
What are Server Groups?

http://wordpress-anande.rhcloud.com/2016/03/17/host-aggregate-groups-and-server-groups/
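
A minimal sketch of anti-affinity in action (image id is a placeholder):

[stack@instack ~]$ nova server-group-create antigrp anti-affinity          # note the returned group uuid
[stack@instack ~]$ nova boot --flavor 1 --image <image-id> --hint group=<group-uuid> vm1
[stack@instack ~]$ nova boot --flavor 1 --image <image-id> --hint group=<group-uuid> vm2   # scheduled to a different host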
What's Live Migration from a qemu perspective?

● Take the guest state from one QEMU process on one hypervisor and transfer it to
another qemu process on another hypervisor, while the guest is running.

● The guest shouldn't realize the world is changing beneath its feet.

● The guest might notice some degraded performance, though - only for a few
seconds (ideally) due to the dirty-page-tracking taking place.
Identical qemu processes for the same guest exist on both hosts while the migration runs - Hypervisor/Compute-1 (PID 19997) and Hypervisor/Compute-2 (PID 21200):

qemu 19997 1 0 May17 ? 00:04:06 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name mini1,debug-threads=on -S -machine pc-i440fx-2.4,accel=kvm,usb=off,vmport=off -cpu Haswell-noTSX -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 05708fb7-672e-4493-a316-e3765f37eedc -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-mini1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/mininet-vm-x86_64.vmdk,format=vmdk,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a0:1b:bb,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device vmware-svga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
Let's drill down further..

Stage 1: Mark all RAM dirty

Stage 2: Keep sending the RAM pages dirtied since the last iteration; stop when
some low watermark or condition is reached

Stage 3: Stop the guest, transfer the remaining dirty RAM, continue execution on
the destination qemu process
… but this is just what happens at the qemu level.

[Diagram sequence - the NOVA view:
- nova-scheduler checks whether DEST meets SRC's specs to accommodate the
incoming instance: free RAM, TIME sync, CPU_map xml, same subnet
- DEST matches, SRC starts migrating
- the vm's state is copied over iteratively until it runs on DEST
- finally, a RARP is sent so the VM retains its ip-address]
Many ways to migrate - it's your call
[stack@instack ~]$ nova help | grep -E '(migr|eva)'

evacuate              Evacuate server from failed host.
live-migration        Migrate running server to a new machine.
migrate               Migrate a server. The new host will be
host-evacuate         Evacuate all instances from failed host.
migration-list        Print a list of migrations.
host-servers-migrate  Migrate all instances of the specified host to
host-evacuate-live    Live migrate all instances of the specified
Credits : Stephen Gordon
When security is of concern
● By default, live-migration moves the
guest over an unsecured TCP channel.

● There are a few alternatives to
unsecured TCP for the libvirtd socket
communication:
– TLS for encryption and X.509
client certificates for
authentication
– GSSAPI/Kerberos for
authentication and encryption
– TLS for encryption and
Kerberos for authentication
live_migration_uri=qemu+ACCESSTYPE://USER@%s/system

ACCESSTYPE = tcp (unencrypted) or tls (encrypted)

USER = the user who has access to the compute service - “nova”

Example: live_migration_uri=qemu+tls://nova@%s/system
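
On the libvirt side this maps to the listen settings in /etc/libvirt/libvirtd.conf - a sketch of the TLS + X.509 variant (certificate locations are the libvirt defaults; deployment specifics vary):

listen_tls = 1                  # accept TLS connections (port 16514)
listen_tcp = 0                  # refuse the unencrypted transport
tls_no_verify_certificate = 0   # require valid client certificates
# CA/server/client certs are expected under /etc/pki/CA and /etc/pki/libvirt by default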
PCI passthrough
● nova.conf > scheduler_default_filters=..., PciPassthroughFilter

● Add the device whitelist on the compute node (see the sketch after this list)

● Create a flavor with the pci property

● Boot an instance

ISSUE: no live-migration support; the admin needs to detach/attach the pNIC.

● Proposal (not accepted, though) to detach_nic_from_src + emulate + migrate +
attach_nic_on_dest.
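
A sketch of those nova.conf pieces (the vendor/product ids here belong to a hypothetical NIC - substitute the output of lspci -nn for your device):

# on the compute node (nova.conf)
pci_passthrough_whitelist={"vendor_id":"8086","product_id":"10fb"}
# on the controller (nova.conf)
pci_alias={"vendor_id":"8086","product_id":"10fb","name":"nic1"}

[stack@instack ~]$ nova flavor-key m1.large set "pci_passthrough:alias"="nic1:1"   # request one such device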
How to interact with Live Migration?

Check the status of the migration using 'virsh':

# virsh domjobinfo <domain>

Job type:          Unbounded
Time elapsed:      4000 ms
Data processed:    5.093 MiB
Data remaining:    1015.707 MiB
Data total:        1.008 GiB
Memory processed:  5.093 MiB
Memory remaining:  1015.707 MiB
Memory total:      1.008 GiB
Constant pages:    460504
Normal pages:      78809
Normal data:       307.848 MiB
Expected downtime: 30 ms
# virsh suspend <instance-name>

The simplest and crudest mechanism for ensuring a guest migration completes is
to simply pause the guest CPUs. This prevents the guest from continuing to dirty
memory, and thus even on the slowest network it ensures the migration completes
in a finite amount of time.

# virsh domjobabort <instance-name>

Cancel an ongoing live-migration

# virsh migrate-set-speed <domain> <bandwidth_in_MiB/s>

Tune the bandwidth to the workload: multiple small VMs with less RAM, or one
large VM with more RAM
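
A quick usage sketch (mini1 is the guest from the earlier ps output):

# virsh migrate-set-speed mini1 100     # cap the migration at 100 MiB/s
# virsh migrate-getspeed mini1          # confirm the current limit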
Future of Live Migration

- Instances with direct pci passthrough

- Split network plane for live migration

- Abort an ongoing live migration

- Pause a migration

- Check the destination host when migrating or evacuating


Debugging
● LIVE MIGRATION: virsh qemu-monitor-command, virsh qemu-monitor-event

● If the instance crashes – coredump: virsh dump (see the sketch after this list)

● Enable libvirt logging to understand the lower-level interactions between
libvirt and qemu
– libvirtd.conf
– log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object
1:util 1:qemu_monitor"
– log_outputs="1:file:/var/log/libvirt/libvirtd.log"

● NOVA debug (and verbose if required) logging:
– sed -i 's/debug=False/debug=True/g' /etc/nova/nova.conf
– sed -i 's/verbose=False/verbose=True/g' /etc/nova/nova.conf
– Get compute.log, conductor.log, scheduler.log for analysis
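
A sketch of grabbing that coredump and querying the monitor (the instance name is illustrative - take it from virsh list):

# virsh dump instance-00000001 /var/tmp/instance1.core --memory-only   # guest core for crash analysis
# virsh qemu-monitor-command instance-00000001 --hmp "info migrate"    # raw migration stats from qemu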
Troubleshooting
● Enable debugging and check for keywords like WARN, ERROR, STOP, FAIL in the
logs and on STDOUT with --debug/--verbose on the commands (if the CLI can be
used to reproduce the issue)

● Check for existing bugzillas (bugzilla.redhat.com & bugs.launchpad.net).

● Try to reproduce the issue in your test environment with the exact component
versions (and hardware if necessary).

● Discuss it with your team.

● Reach out to mailing lists and IRC channels.

● Reach out directly to engineering by opening a bugzilla with all the
information and regularly following up on it.
QUESTIONS
THANK YOU
