Understanding your options: Deployment topologies for High Availability (HA) with OpenStack
Piotr Siwczak, September 10, 2012
When I joined Mirantis, it strongly affected my approach, as I realized that all my ideas
involving a farm of dedicated compute nodes plus one or two controller nodes were
wrong. While it can be a good approach from the standpoint of keeping everything
tidily separated, in practice we can easily mix and match workhorse components
without overloading OpenStack (e.g. nova-compute with nova-scheduler on one
host). It turns out that in OpenStack, “controller node” and “compute node” can have
variable meanings, given how flexibly OpenStack components can be deployed.
In general, one can assume that each OpenStack deployment needs to contain at
least three types of nodes (with a possible fourth), which my colleague Oleg Gelbukh
has outlined:
• Endpoint node: This node runs load balancing and high availability services that
may include load-balancing software and clustering applications. A dedicated load-
balancing network appliance can serve as an endpoint node. A cluster should have
at least two endpoint nodes configured for redundancy.
• Controller node: This node hosts communication services that support operation
of the whole cloud, including the queue server, state database, Horizon dashboard,
and possibly a monitoring system. This node can optionally host the nova-scheduler
service and the API servers load balanced by the endpoint node. At least two
controller nodes must exist in a cluster to provide redundancy. The controller node
and endpoint node can be combined in a single physical server, but this requires
changing the configuration of the nova services to move them off the ports used by
the load balancer (see the configuration sketch after this list).
• Compute node: This node hosts a hypervisor and virtual instances, and provides
compute resources to them. The compute node can also serve as the network
controller for the instances it hosts, if a multihost network scheme is in use. It can also
host non-demanding internal OpenStack services, such as the scheduler, glance-api, etc.
• Volume node: This is used if you want to use the nova-volume service. This node
hosts the nova-volume service and also serves as an iSCSI target.
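As a rough sketch of the port change mentioned in the controller node item above, the nova API services can be moved to non-default ports (or bound to the management address only), leaving the well-known ports free for the co-located load balancer. The option names below are standard nova.conf options; the addresses and port numbers are made-up examples, not values from any particular deployment:

```
# nova.conf fragment -- freeing the default API ports for a co-located load balancer
# (addresses and port numbers are illustrative only)
osapi_compute_listen=192.168.2.11     # listen on this node's management address
osapi_compute_listen_port=18774       # HAProxy binds the public VIP on the usual 8774
ec2_listen=192.168.2.11
ec2_listen_port=18773                 # HAProxy owns 8773 on the VIP
```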
While the endpoint node’s role is obvious—it typically hosts the load-balancing
software or appliance providing even traffic distribution to OpenStack components
and high availability—the controller and compute nodes can be set up in many
different ways, ranging from “fat” controller nodes which host all the OpenStack
internal daemons (scheduler, API services, Glance, Keystone, RabbitMQ, MySQL) to
“thin,” which host only those services responsible for maintaining OpenStack’s state
(RabbitMQ and MySQL). Compute nodes can then take on some of OpenStack’s internal
processing by hosting API services and scheduler instances.
At Mirantis we have deployed service topologies for a wide range of clients. Here I’ll
provide a walk-through of them along with diagrams, and take a look at the different
ways in which OpenStack can be deployed. (Of course, service decoupling can go even
further.)
In this deployment variation, a hardware load-balancer appliance is used to provide
a connection endpoint to the OpenStack services. API servers and nova-scheduler
instances are deployed on compute nodes, while glance-registry instances and
Horizon are deployed on controller nodes.
All the native Nova components are stateless web services; this allows you to scale
them by adding more instances to the pool (see the Mirantis blog post on scaling API
services for details). That’s why we can safely distribute them across a farm of
compute nodes. The database and message queue server can be deployed on both
controller nodes in a clustered fashion (my earlier post shows ways to do it). Even
better: the controller node now hosts only platform components that are not
OpenStack internal services (MySQL and RabbitMQ are standard Linux daemons), so
the cloud administrator can afford to hand their administration off to an external
entity, such as a dedicated database team or an existing RabbitMQ cluster. This way,
the central controller node disappears and we end up with a bunch of compute/API
nodes, which we can scale almost linearly.
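As a minimal sketch of what this means on each compute/API node: every nova service simply points at the clustered state services, so any number of such nodes can be added behind the load balancer. The host names and credentials below are assumptions, not values from a real deployment:

```
# nova.conf fragment, identical on every compute/API node (illustrative values only)
# nova-api and nova-scheduler keep no local state; all state lives in MySQL and RabbitMQ
sql_connection=mysql://nova:secret@db-vip.example.local/nova
rabbit_host=rabbit-vip.example.local
```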
In this deployment, endpoint nodes are combined with controller nodes. API services
and nova-scheduler instances are also deployed on the controller nodes, and the
controller tier can be scaled by adding nodes and reconfiguring HAProxy. Two
instances of HAProxy are deployed to ensure high availability; detection of failures
and promotion of a given HAProxy from standby to active can be done with tools such
as Pacemaker and Corosync/Heartbeat.
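A minimal sketch of such a setup is shown below. The addresses, node names, and the single nova-api section are assumptions (each load-balanced OpenStack service would get an analogous listen block); the Pacemaker resource simply keeps the shared virtual IP on whichever HAProxy host is currently active:

```
# /etc/haproxy/haproxy.cfg -- minimal sketch; IPs and node names are made up
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

listen nova-api
    bind 192.168.0.10:8774          # virtual IP owned by the active HAProxy node
    balance roundrobin
    server ctrl-01 192.168.1.11:8774 check
    server ctrl-02 192.168.1.12:8774 check
```

```
# Pacemaker resource (crm shell) for the shared virtual IP
crm configure primitive p_vip ocf:heartbeat:IPaddr2 \
    params ip=192.168.0.10 cidr_netmask=24 \
    op monitor interval=10s
```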
I’ve shown the service distributions across physical nodes that Mirantis has
implemented for various clients. However, sysadmins can mix and match services in
completely different ways to suit their needs. The diagram below shows, based on our
experience at Mirantis, how OpenStack services can be distributed across different
node types.
The main load on the endpoint node is generated by the network subsystem. This type
of node requires a lot of CPU performance and network throughput. It is also useful
to bond network interfaces for redundancy and increased bandwidth, if possible.
The cloud controller can be fat or thin. The minimum configuration it can host
includes those pieces of OpenStack that maintain the system state: the database and
the AMQP server. A redundant configuration of the cloud controller requires at least
two hosts, and we recommend the use of network interface bonding for network
redundancy and RAID1 or RAID10 for storage redundancy. The following configuration
can be considered the minimum for a controller node:
• 8GB RAM
Compute nodes require as much memory and CPU power as possible. Requirements
for the disk system are not very restrictive, though the use of SSDs can increase
performance dramatically (since instance filesystems typically reside on the local
disk). It is possible to use a single disk in a non-redundant configuration; in case
of failure, replace the disk and return the server to the cluster as a new compute
node.
Block volumes are shared via the iSCSI protocol, which means a high load on the
network subsystem. We recommend at least two bonded interfaces for iSCSI data
exchange, possibly tuned for this type of traffic (jumbo frames, etc.).
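One way to do this (a sketch for Debian/Ubuntu with the ifenslave package; interface names, addressing, and the bonding mode are assumptions) is shown below:

```
# /etc/network/interfaces fragment -- two NICs bonded for iSCSI traffic, with jumbo frames
auto bond0
iface bond0 inet static
    address 10.0.2.21
    netmask 255.255.255.0
    mtu 9000                  # jumbo frames; every switch on the path must allow this MTU
    bond-slaves eth2 eth3
    bond-mode 802.3ad         # LACP; requires matching switch configuration
    bond-miimon 100
```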
Network topology
The public network exposes the Virtual IPs of the endpoint node, which clients use to
connect to the OpenStack service APIs. The public network is usually isolated from
the private networks and the management network. A public/corporate network is a
single class C network from the cloud owner’s public network range (for public clouds
it is globally routed).
The private network is a network segment connected to all the compute nodes; all
the bridges on the compute nodes are connected to this network. This is where
instances exchange their fixed IP traffic. If VlanManager is in use, this network is
further segmented into isolated VLANs, one per project existing in the cloud. Each
VLAN contains an IP network dedicated to this project and connects virtual instances
that belong to this project. If a FlatDHCP scheme is used, instances from different
projects all share the same VLAN and IP space.
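For reference, the choice between the two schemes comes down to a few nova.conf flags on the nodes running nova-network; the interface names and VLAN range below are illustrative, not recommendations:

```
# nova.conf fragment -- VlanManager: one isolated VLAN and IP range per project
network_manager=nova.network.manager.VlanManager
vlan_interface=eth1          # physical interface carrying the per-project VLANs
vlan_start=100               # first VLAN id handed out to projects

# FlatDHCP alternative -- all projects share one bridge and IP space
#network_manager=nova.network.manager.FlatDHCPManager
#flat_interface=eth1
#flat_network_bridge=br100
```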
The management network connects all the cluster nodes and is used to exchange
internal data between components of the OpenStack cluster. This network must be
isolated from private and public networks for security reasons. The management
network can also be used to serve the iSCSI protocol exchange between the compute
and volume nodes if the traffic is not intensive. This network is a single class C
network from a private IP address range (not globally routed).
The iSCSI network is not required unless your workload involves heavy processing on
persistent block storage. In this case, we recommend iSCSI on dedicated wiring to
keep it from interfering with management traffic and to potentially introduce some
iSCSI optimizations like jumbo frames, queue lengths on interfaces, etc.
In high availability mode, all the OpenStack central components need to be put
behind a load balancer. For this purpose, you can use dedicated hardware or an
endpoint node. An endpoint node runs high-availability/load-balancing software and
hides a farm of OpenStack daemons behind a single IP. The following table shows the
placement of services on different networks under the load balancer:
| OpenStack component | Network placement | Host placement | Remarks |
| --- | --- | --- | --- |
| nova-api, glance-api, glance-registry, keystone-api, Horizon | public network | controller/compute | Since these services are accessed directly by users (API endpoints), it is logical to put them on the public net. |
| nova-scheduler | mgmt network | controller/compute | |
| nova-compute | mgmt network | compute | |
| nova-network | mgmt network | controller/compute | |
| MySQL | mgmt network | controller | Including replication/HA traffic |
| RabbitMQ | mgmt network | controller | Including RabbitMQ cluster traffic |
| nova-volume | mgmt network | volume | |
| iSCSI | mgmt network (dedicated VLAN) or separate iSCSI network | volume | In case of high block storage traffic, you should use a dedicated network. |
Conclusion
When architectures are distributed, you need to properly spread traffic across many
instances of a service and also provide replication of stateful resources (like MySQL
and RabbitMQ). The OpenStack folks haven’t provided any documentation on this so
far, so Mirantis has been trying to fill this gap by producing a series of posts on scaling
platform and API services.