
CONFIDENTIAL

Metaswitch Products Azure


Deployment Design Guide

VC3-603 - Version 8.5 - Issue 1-1445

December 2022

A Microsoft Company
Azure Deployment Design Guide (V8.5) CONFIDENTIAL

Notices
Copyright © 2022 Microsoft. All rights reserved.

This manual is issued on a controlled basis to a specific person on the understanding that no part of
the product code or documentation (including this manual) will be copied or distributed without prior
agreement in writing from Metaswitch Networks and Microsoft.

Metaswitch Networks and Microsoft reserve the right to, without notice, modify or revise all or part of
this document and/or change product features or specifications and shall not be responsible for any
loss, cost, or damage, including consequential damage, caused by reliance on these materials.

Metaswitch and the Metaswitch logo are trademarks of Metaswitch Networks. Other brands and
products referenced herein are the trademarks or registered trademarks of their respective holders.

Contents

1 Introduction
1.1 About this document
1.2 Relevant product versions
2 Background information for deploying in Azure
2.1 Azure deployment models
2.1.1 Public cloud
2.1.2 Hybrid cloud
2.2 Availability (Azure public cloud)
2.3 Security
2.4 Performance
2.5 Regulatory requirements
3 Planning your Azure deployment
3.1 Product deployment footprint
3.2 Storage requirements
3.2.1 Overview of Azure data storage
3.2.2 Per-product data storage requirements
3.3 System availability
3.4 Backup and recovery
3.5 User access control
3.6 Networking in Azure
3.6.1 Connectivity outside Azure
3.6.2 Networking within Azure
3.6.3 Networking requirements for VMs
3.6.4 Routing traffic to instances
3.6.5 DNS in Azure
3.6.6 SSH access in Azure
3.6.7 VXLANs in Azure
4 Management, automation and monitoring in Azure
4.1 Overview of deployment and lifecycle management
4.2 Monitoring Azure deployments

1 Introduction

1.1 About this document


This document explains the considerations for deploying Metaswitch Voice Core and UC products in
an Azure public cloud environment. It is focused on cases where Metaswitch software is delivered
as a licensed product, deployed in Azure and operated by a customer or partner. It does not discuss
any elements of deploying Metaswitch products through the Azure Marketplace either as standalone
software or as managed applications/services.

This document does not discuss pricing or cost implications of the Azure services used by Metaswitch
products.

1.2 Relevant product versions


This document applies to the product versions as set out in the table below.

Table 1: Relevant product versions

Product                                     Version
Metaswitch Deployment Manager (MDM)         V3.3.0+
Distributed Capacity Manager (DCM)          V4.1+
Perimeta                                    V4.9.20+
Secure Distribution Engine (SDE)            V2.3+
ServiceIQ Management Platform (SIMPL) VM    V6.10+


2 Background information for deploying in Azure


This section of the guide highlights some key considerations for deploying Metaswitch solutions in
Azure. It helps you understand which Azure platform features are relevant for your deployment and
determine if your deployment is suitable for Azure.

This document assumes you are familiar with the concept of Azure tenants, subscriptions and
resource groups.

2.1 Azure deployment models


There are several different topologies for deploying workloads in Azure and the choice of topology
affects where workloads are run. You must understand these topologies to decide how to deploy your
solution in Azure.

It is important that you understand the availability implications of each choice because different Azure
resources come with different service-level agreements as outlined in the Azure documentation linked
below.

2.1.1 Public cloud


Azure lets you control the specification of the infrastructure on which your workloads are
deployed and gives you some control over where the workload is deployed (that is, the region and
availability zone your workload is running in). You do not need to manage the availability, maintenance or
upgrade of hardware. Consider the SLAs of the Azure services you intend to use to determine if
running a particular workload in the public cloud is appropriate.

Azure public cloud regional variances

When selecting the Azure region to deploy to you should consider the following:

• Capacity - Each region has a maximum capacity, which affects the types of services you can
deploy under different circumstances. Some regions allow you to reserve capacity.
• Availability zone support - Not all regions support availability zones; see Azure regions with
availability zones - Azure documentation | Microsoft Docs.
• VM SKUs (if deploying on virtual machines) - The availability of VM sizes may vary by region and
availability zone; see Products available by region | Microsoft Azure.
• Fault Domain count (if your deployment does not use availability zone redundancy) - See
https://github.com/MicrosoftDocs/azure-docs/blob/main/includes/managed-disks-common-fault-domain-region-list.md.

2.1.2 Hybrid cloud


Azure allows deploying Azure resources into dedicated hardware integrated with Azure. These
options allow you greater control over the physical location of resources and what runs on those
resources while using the same tools and management interfaces across the whole Azure platform.


Azure Stack

Azure Stack (Azure Stack | Microsoft Azure) allows you to deploy Azure workloads in private clouds or
edge locations as well as the Azure public cloud in a consistent way. This gives greater control over
where data is processed and transmitted and can be used to optimize data workflows.

Azure Stack is a good option if you have existing cloud compute resources you want to use alongside
Azure public cloud resources. It provides one consistent way to deploy and manage workloads on
your hardware and in Azure public cloud.

Azure Stack Edge

Azure Stack Edge (Azure Stack Edge | Microsoft Azure) provides Azure managed hardware that can
be deployed where you choose and integrated seamlessly with Azure public cloud.

Azure Stack Edge is the best choice if you require private compute resources, possibly in a specific
location, but do not want to manage that hardware yourself. The hardware is provided as-a-service:
workloads are managed in the same way as Azure public cloud and you control where your workloads
are deployed.

2.2 Availability (Azure public cloud)


Azure provides several ways to add redundancy to a deployment and the most appropriate solution
will depend on the service you want to provide as well as local consumer and data protection laws. All
Azure resources come with a service level agreement (SLA) that you must factor into your availability
and uptime calculations. For more details, see Azure reliability | Microsoft Azure and Architecting for
resiliency and availability - Azure Architecture Center | Microsoft Docs.

Carrier-grade availability inherently requires that the application is deployed across multiple regions,
which in turn requires careful thought about how the application's data will be managed across
those regions. If you need to protect against failures of Azure regions, you can use the geographic
redundancy mechanisms built into Metaswitch products (where supported). For more information, see
Geographic redundancy across Azure regions on page 7.

When distributing nodes between Availability Zones or Regions, the deployment must be correctly
configured for automatic failover. For more information on designing for resilience in Azure, see
Principles of the reliability pillar - Azure Architecture Center | Microsoft Docs.

In non-production environments where preserving service after the infrastructure or software fails is
not required, a non-highly available solution can be deployed, possibly to a single region or availability
zone.

Azure Availability Zones within regions


If your Azure region supports Availability Zones, you can deploy with Metaswitch products spread
redundantly across Availability Zones to guard against data center failures. Your deployment design
should take into account the latency of hops between nodes in different zones, and additional cost of
traffic between zones.
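
As a minimal sketch of spreading instances across zones, the following Python assigns VMs to Availability Zones in round-robin order. The VM names and the three zone labels are illustrative assumptions, not values taken from any Metaswitch product or tool.

```python
from itertools import cycle


def spread_across_zones(vm_names, zones):
    """Assign each VM to a zone in round-robin order."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    # zip stops when vm_names is exhausted; cycle repeats the zone list.
    return dict(zip(vm_names, cycle(zones)))


# Hypothetical VM names and zone labels, for illustration only.
placement = spread_across_zones(["vm-1", "vm-2", "vm-3"], ["1", "2", "3"])
print(placement)
```

With more VMs than zones, the assignment wraps around, so some zones host more than one instance; the latency and inter-zone traffic cost considerations above still apply.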


Geographic redundancy across Azure regions


Some Metaswitch products support geographically redundant (GR) deployments, where traffic is split
over two or more Azure regions to provide protection against the failure of one region. The Metaswitch
products installed in these regions are configured to be geographically redundant in the usual way for
each product.

Note that geographic redundancy is not a substitute for redundancy within each region.
Geographically redundant deployments are designed to counteract very rare and serious region
failures (for example, natural disasters or multiple coincident system failures within a region).
Metaswitch considers that in most cases, the GR recovery mechanisms are too disruptive to be used
to protect against individual failures within a region, which are typically more frequent.

2.3 Security
Azure Security Center (Azure Security Center | Microsoft Azure) provides a suite of tools to help you
secure your Azure deployment. Azure Security Center is used with public cloud and hybrid cloud
deployments to improve the security of your deployment and monitor enterprise compliance. Azure
security features are not enabled on Metaswitch products by default. You must determine which Azure
features you need and enable them.

General information on Azure network security is available at Azure network security | Microsoft
Azure. You should ensure network security is considered as part of your deployment design process.

We recommend that you use Azure Active Directory for authentication and authorization where
possible. For more information, see User access control on page 14.

Azure Key Vault (Azure Key Vault documentation | Microsoft Docs) can be used to store all secrets
relating to the deployment. This may include any SSH keys.

You should avoid exposing IP addresses and ports to the public internet if those endpoints do
not require internet connectivity to provide service. You can find which ports and IP addresses to
protect in the firewall documentation for your Metaswitch product(s). Use Azure security features (for
example, Azure Bastion or Azure Firewall) to protect them. These features are discussed further in
SSH access in Azure on page 18.

2.4 Performance
VM performance in Azure depends on the specification level chosen for the VM and other
components.

Performance testing in Azure is discussed in Performance testing - Azure Architecture Center |
Microsoft Docs.

Note that some resources or services are subject to limits. This is discussed in Azure subscription
limits and quotas - Azure Resource Manager | Microsoft Docs.


2.5 Regulatory requirements


Note that data in the Azure public cloud is stored by Microsoft. You must be aware of any data
sovereignty laws or similar laws that apply in your jurisdiction and ensure compliance with those laws.
You must also consider lawful intercept and other applicable laws.


3 Planning your Azure deployment


This section provides guidance for planning your deployment in Azure. It should help you to determine
what Azure resources your deployment requires.

3.1 Product deployment footprint


This section lists the deployment type and the minimum resource requirements for each Metaswitch
product in Azure. When resource requirements for your deployment change, Azure allows you to
increase or decrease your resource usage according to your needs.

The table below gives an overview of how Metaswitch products are deployed in Azure. For further
details, see the per-product subsections.

Table 2: Product deployment types

Product                                     VM spec             Deployment model    Minimum number of VMs
Distributed Capacity Manager                Standard_D2s_v3     Public cloud        Lab: 1; Production: 2
MDM VM (small footprint, not production)    Standard_D4s_v3     Public cloud        1
MDM VM (medium footprint)                   Standard_D8s_v3     Public cloud        3
Perimeta (low capacity, ISC only)           Standard_D2s_v4     Public cloud        1
Perimeta (medium capacity)                  Standard_D8s_v4     Public cloud        1
Perimeta (high capacity, SSC only)          Standard_D16s_v4    Public cloud        1
Secure Distribution Engine (SDE)            Standard_D8s_v4     Public cloud        2
ServiceIQ Management Platform (SIMPL) VM    Standard_D2s_v4     Public cloud        1


Note:

Azure VM specs, also known as VM SKUs or VM sizes, vary over time as new ones are released
and old ones are retired. If you want to use a VM SKU different from the ones listed above,
consider:

• Whether the VM SKU is available in your target Azure public cloud region; see Products
available by region | Microsoft Azure.
• The CPU and memory requirements for the product. You should aim for the same values as the
recommended options above.
• The maximum number of NICs and expected network bandwidth.

The number of NICs required will depend on your IP network design and how you want to
separate traffic. Some products have a fixed number of NICs required; others allow you to use
separate NICs or share traffic between NICs for certain types of traffic.
• Whether the VM SKU supports Encryption at Host, which is recommended for all VMs
(Supported VM sizes - Virtual Machines | Microsoft Docs).
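
The checks in this note can be sketched as a small comparison routine. This is only an illustration: the `SkuSpec` type, the `sku_concerns` function, the `Example_SKU` name and all numeric figures are assumptions made for the example, and real values must be confirmed in the Azure documentation for your region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkuSpec:
    name: str
    vcpus: int
    memory_gib: int
    max_nics: int
    encryption_at_host: bool
    in_target_region: bool


def sku_concerns(candidate, recommended, nics_needed):
    """Return the reasons a candidate SKU may be unsuitable (empty if none)."""
    concerns = []
    if not candidate.in_target_region:
        concerns.append("not available in the target region")
    if candidate.vcpus != recommended.vcpus:
        concerns.append("vCPU count differs from the recommended SKU")
    if candidate.memory_gib != recommended.memory_gib:
        concerns.append("memory differs from the recommended SKU")
    if candidate.max_nics < nics_needed:
        concerns.append("too few NICs for the network design")
    if not candidate.encryption_at_host:
        concerns.append("Encryption at Host not supported")
    return concerns


# Placeholder figures for illustration; confirm real values before use.
recommended = SkuSpec("Standard_D8s_v4", 8, 32, 4, True, True)
candidate = SkuSpec("Example_SKU", 8, 32, 2, True, True)
print(sku_concerns(candidate, recommended, nics_needed=3))
```

An empty list means none of the listed considerations flagged a problem; it does not replace checking the product documentation for the SKU you choose.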

MDM
MDM is deployed as an Azure Virtual Machine.

• MDM clusters of 3 are mandatory for production environments.
• Standalone MDM is supported only in non-production deployments.

MDM must be deployed using SIMPL VM. SIMPL VM will manage MDM's storage requirements.

DCM
The Metaswitch Distributed Capacity Manager (DCM) is deployed as an Azure Virtual Machine.

Production deployments require at least two DCM VMs in each site. Lab deployments can have a
single DCM VM.

Perimeta
Perimeta is deployed as an Azure Virtual Machine.

• Both standalone and high availability instances are supported.
• Medium capacity SSCs and MSCs are supported from Perimeta V4.9.35.
• High capacity SSCs are supported from Perimeta V5.2.

SDE
SDE is deployed as an Azure Virtual Machine using SIMPL VM.


SIMPL VM
SIMPL VM is deployed as an Azure Virtual Machine. It is typically the first application deployed
in a new deployment (because it is used to deploy and manage other Metaswitch VMs). If your
deployment has products deployed and managed by SIMPL VM in multiple Azure regions, we
recommend deploying one SIMPL VM in each region.

3.2 Storage requirements


This section contains details of the storage requirements of Metaswitch products in Azure. All of the
storage options discussed here have Azure-managed server side encryption enabled automatically.

• Overview of Azure data storage on page 11 contains an overview of the Azure storage
technologies used by Metaswitch products.
• Per-product data storage requirements on page 12 outlines the requirements of each
Metaswitch product.

SIMPL VM will automatically configure storage requirements for products based on the SDF provided
- see Creating and editing an SDF in the SIMPL VM Product Deployment and Management Guide for
information on how to configure an SDF.

3.2.1 Overview of Azure data storage


Metaswitch products make use of two Azure data storage facilities: Managed Disks and Azure Blob
Storage.

Managed disks

Azure Managed disks (Azure Disk Storage overview - Azure Virtual Machines | Microsoft Docs) are
block-level storage, managed by Azure, and used with Azure Virtual Machines. Managed disks are
locally redundant by default.

When provisioning managed disks there are two high level choices to be made: size and SKU. Each
product has its own requirements highlighted below with further details in the product documentation.

Managed disks offer 3 levels of encryption discussed in Overview of managed disk encryption options
- Azure Virtual Machines | Microsoft Docs. These can be applied to ensure your data is secure and
that your deployment is aligned with any policy or legal requirements.

Azure Blob Storage

Azure Blob Storage (ABS) (About Blob (object) storage - Azure Storage | Microsoft Docs) is a cloud
storage solution optimized for storing large amounts of unstructured data. ABS provides various levels
of redundancy, from local to Availability Zone redundancy, as well as three levels of encryption.


3.2.2 Per-product data storage requirements

DCM

The Metaswitch Distributed Capacity Manager (DCM) requires a managed disk of 16 GiB. We
recommend a standard SSD.

MDM

Each MDM VM requires a 40 GiB OS disk for persistent storage.

To accommodate this, SIMPL VM creates a managed disk for each MDM VM based on the
deployment-size set in the SDF.

Perimeta

Each Perimeta VM requires a managed disk for persistent storage.

Size should be chosen as described in Perimeta data storage. Most use cases should use 64 GiB
disks, which corresponds to size '6' in disk SKU names (for example, P6).

Choice of SKU depends to a large extent on availability. We recommend Premium SSD Managed
Disks.

SDE

Each SDE VM requires an 80 GiB managed disk for persistent storage.

SIMPL VM

Each SIMPL VM requires a 30 GiB managed disk for persistent storage and an external managed data
disk of 128 GiB. We recommend using Premium SSD Managed Disks with locally-redundant storage
(LRS).
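
The per-product figures above can be combined into a rough estimate of total managed disk capacity. The sketch below uses the disk sizes stated in this section; the VM counts in `example_deployment` are a hypothetical deployment chosen for illustration, and the Perimeta figure assumes the typical 64 GiB disk.

```python
# Per-VM managed disk sizes in GiB, taken from the per-product requirements above.
DISKS_GIB = {
    "DCM": [16],
    "MDM": [40],
    "Perimeta": [64],        # typical size; see Perimeta data storage
    "SDE": [80],
    "SIMPL VM": [30, 128],   # persistent storage disk plus external data disk
}


def total_disk_gib(vm_counts):
    """Sum the managed disk capacity for the given number of VMs per product."""
    return sum(sum(DISKS_GIB[product]) * count
               for product, count in vm_counts.items())


# Hypothetical deployment, for illustration only.
example_deployment = {"DCM": 2, "MDM": 3, "Perimeta": 2, "SDE": 2, "SIMPL VM": 1}
print(total_disk_gib(example_deployment))
```

This counts provisioned disk capacity only; it does not account for Azure Blob Storage, backups or snapshots.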

3.3 System availability


All Azure resources come with a defined service level agreement (SLA). You must consider the
SLA of every component in your deployment to determine if the deployment meets your availability
requirements.
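
As a worked illustration of combining SLAs, the sketch below multiplies availabilities for components that must all be up and applies the standard parallel formula for redundant copies. The 99.9% figures are placeholders rather than actual Azure SLA values, and the redundancy formula assumes failures are independent, which real deployments only approximate.

```python
from math import prod


def serial_availability(slas):
    """All components must be up: multiply the individual availabilities."""
    return prod(slas)


def redundant_availability(sla, copies):
    """At least one of `copies` independent instances must be up."""
    return 1 - (1 - sla) ** copies


# Illustrative figures only; look up the real SLA of each Azure service.
vm_sla = 0.999
disk_sla = 0.999
single_path = serial_availability([vm_sla, disk_sla])
two_paths = redundant_availability(single_path, 2)
print(f"single path: {single_path:.6f}, two redundant paths: {two_paths:.6f}")
```

Chaining components lowers the composite figure below any individual SLA, while redundancy raises it, which is why every component in the deployment needs to be included in the calculation.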

Distributed Capacity Manager


The Metaswitch Distributed Capacity Manager (DCM) can be deployed in a single Azure region, or in
a geographically redundant deployment of up to three Azure regions. Production deployments require
two DCMs per region. We recommend two DCMs per region for lab deployments.

MDM
MDM is always deployed as a pool of three instances per site. This provides tolerance to a single
instance failure in each site.


MDM can also be deployed across multiple sites, with a cluster of three MDMs in each site. This
provides geo-redundancy via per-VM data, topology and DNS information replication across each site
in the deployment.

Perimeta

Call survivability and service availability

Perimeta can be deployed as a high availability system or as a standalone system.

If you deploy Perimeta as a standalone system, any failure of either the Azure infrastructure
or the software itself (that is, anything that would normally cause a software protection switch (SPS))
causes calls on the instance to be dropped. Instead, networks should be designed for service availability.

If deploying Perimeta as standalone, it is still possible to provide highly available solutions on Azure.
By distributing instances across Availability zones and regions it is possible to make failures of the
software independent. Combining this with active monitoring (for example, using OPTIONS polls)
allows users to immediately redial in the event of a call failure.
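
One way to picture an OPTIONS poll is the sketch below, which builds a minimal SIP OPTIONS request following RFC 3261 and classifies a response status line. It does not send anything on the network. The host names, the `monitor` user part and the rule that any 2xx response counts as healthy are assumptions made for the example, not Perimeta behavior or configuration.

```python
import uuid


def build_options_request(target_host, target_port, local_host, local_port):
    """Build a minimal SIP OPTIONS request (RFC 3261) as text."""
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 branch magic cookie
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:monitor@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and a blank line to end the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def is_healthy(status_line):
    """Treat any 2xx response as healthy (an assumption; adjust to your policy)."""
    parts = status_line.split(" ", 2)
    return len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].startswith("2")


# Hypothetical addresses, for illustration only.
request = build_options_request("sbc.example.com", 5060, "192.0.2.10", 5060)
print(request)
```

A monitoring element would send such a request periodically and stop routing new calls to an instance that fails to answer.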

Diagnostics

The standard Perimeta troubleshooting tools described in Diagnosing problems in Perimeta
Operations and Maintenance are available. Additionally, the Azure platform records events relating
to VM health (for example, Azure platform issues) and Perimeta sends logs to Azure Monitor, both of
which are useful for determining the cause of any problems.

Secure Distribution Engine


SDE is deployed as a high availability system in Azure.

SIMPL VM
SIMPL VM is only used when deploying or updating other Metaswitch products and is not essential
to provide service. Each Azure region your deployment uses should have a SIMPL VM. These are all
managed independently.

On failure of a SIMPL VM, the VM can be recreated to restore function.

3.4 Backup and recovery


Azure provides the backup mechanisms as described in Cloud Backup Services | Microsoft Azure.
These can be used to regularly back up critical resources.

Guidance on Azure backup and disaster recovery plans can be found in Backup and disaster recovery
for Azure applications - Azure Architecture Center | Microsoft Docs.

Most Metaswitch products implement an application-level backup and restore mechanism. This
provides a disaster recovery process for Metaswitch products.


3.5 User access control


Metaswitch recommends that you manage users with Azure Active Directory. This allows users to log
into Metaswitch products using the same credentials they would use for other Azure and Microsoft
services.

MDM requires on-instance local users. MDM does not support Azure Active Directory or third-party
RADIUS servers.

Perimeta offers two alternatives to Azure Active Directory:

• Using a RADIUS server. This can be Azure Active Directory, a third-party RADIUS server in Azure
or an on-premises RADIUS server.
• Using on-instance Perimeta user management.

The Metaswitch Distributed Capacity Manager (DCM) requires using on-instance local users. It does
not support Azure Active Directory or RADIUS.

3.6 Networking in Azure

3.6.1 Connectivity outside Azure


There are three main ways of obtaining connectivity outside of the Azure cloud. They are listed here in
order of suitability for providing voice services in Azure:

1. ExpressRoute

• ExpressRoute is a dedicated network connection between your data center and Azure. It
requires you or a partner to have a presence in an exchange location where Microsoft is also
present.
• This is the preferred option and effectively extends your internal network into Azure. See
ExpressRoute documentation | Microsoft Docs for details.
2. Microsoft Azure Peering Service (MAPS)

• As with ExpressRoute, this requires a physical connection, from you or a partner, to an
exchange where Microsoft is present. For MAPS, your network is peered with the Azure
one and you connect to Azure resources through public IP addresses, but with an SLA and
guaranteed QoS.
• The flavor of MAPS relevant to Metaswitch customers is Azure Internet peering for
Communication services.
3. Public internet

• There is no SLA or QoS commitment with this option.
• In general, connecting over the public internet is not recommended for voice traffic. It may be
an option in some circumstances.

You must decide which option is most appropriate for your deployment.


3.6.2 Networking within Azure


Network traffic within Azure is segregated using Azure Virtual Networks (VNets) (Azure Virtual Network |
Microsoft Docs). These ensure isolation of traffic and are scoped within a single Azure region.
Each Azure Virtual Machine has an associated virtual network which handles all traffic to and from
that VM. A VM can only be associated with a single VNet. Traffic within a VNet is segregated using
subnets.

It is possible to peer VNets to allow traffic to freely flow from one to another. This provides flexibility in
the networks that can be used but requires firewall and Network Security Group rules to be carefully
considered and updated to ensure that traffic remains confined to an appropriate subnet.

You will need to implement an IP network design for the selected products in your deployment. This
includes deciding how many VNets you are going to use, and which subnets you require on those
VNets. These are some items you will need to consider:

• The networking requirements of the products you are deploying, and how you want to separate
traffic. Some products have a fixed number of NICs required, while others allow you to use
separate NICs or share traffic between NICs for certain types of traffic. If using separate NICs, you
will need to decide whether you are happy with these traffic types being on the same subnet, or
whether you also want subnet separation. It is typical to want all management traffic separated on
a separate subnet.
• The VM specs of the products you are deploying, and the maximum number of NICs and expected
network bandwidth of those NICs.
• How you want to manage network security with the use of Network Security Groups. Each product
manual set has a firewall rule table with its requirements. You need to add Network Security
Groups to your subnets and add rules to these to satisfy the connectivity and security requirements
of your deployment.
• Whether any of your products will be exposed to the internet with a public IP. We do not
recommend using public IP addresses for management access.
• Whether any of your products requires VXLANs.
• What VNet peering you need to other sites.

Attention:

By default, Azure routes network traffic within the same VNet between all subnets. This behavior
must be overridden using Network Security Groups. This maintains the segregation of traffic
between management, external and internal networks.
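
The effect of this default can be sketched with a simplified model of Network Security Group evaluation, in which rules are processed in priority order (lowest number first) and the first match decides. The model ignores ports, protocols and traffic direction, and the address ranges and the custom rule at priority 200 are assumptions made for the example. The two default rules mirror Azure's built-in AllowVnetInBound (priority 65000) and DenyAllInBound (priority 65500) rules.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    priority: int      # lower numbers are evaluated first
    source: str        # CIDR range
    destination: str   # CIDR range
    allow: bool


def is_allowed(rules, src, dst):
    """Return the decision of the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if (ip_address(src) in ip_network(rule.source)
                and ip_address(dst) in ip_network(rule.destination)):
            return rule.allow
    return False  # simplification: deny when nothing matches


# Assumed address plan: VNet 10.0.0.0/16, management 10.0.0.0/24, access 10.0.1.0/24.
default_rules = [
    Rule(65000, "10.0.0.0/16", "10.0.0.0/16", True),   # models AllowVnetInBound
    Rule(65500, "0.0.0.0/0", "0.0.0.0/0", False),      # models DenyAllInBound
]
# Hypothetical custom rule that blocks the access subnet from reaching management.
custom_rules = [Rule(200, "10.0.1.0/24", "10.0.0.0/24", False)]

print(is_allowed(default_rules, "10.0.1.4", "10.0.0.4"))
print(is_allowed(default_rules + custom_rules, "10.0.1.4", "10.0.0.4"))
```

With only the defaults, traffic from the access subnet reaches the management subnet; the custom deny rule, having a lower priority number, takes precedence and restores the segregation.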

As an example, a typical single-site Perimeta deployment would use a single VNet with 3 subnets:

• Management - used by all products for shared management traffic
• Access - a public internet facing subnet for SIP signaling and media traffic, open only to voice
traffic with an Access Network Security Group
• Core signaling and media - for SIP signaling and media traffic between the internal interfaces of
Perimeta and other Metaswitch products, open only to those IP addresses and ports for voice
traffic.
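
A three-subnet layout like this can be planned with Python's standard `ipaddress` module. The 10.0.0.0/16 address space and the /24 subnet size are assumptions chosen for illustration; the usable-address figure reflects the five addresses Azure reserves in every subnet.

```python
from ipaddress import ip_network

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves five addresses in each subnet

vnet = ip_network("10.0.0.0/16")  # assumed VNet address space
subnet_names = ["management", "access", "core"]

# Take the first three /24 ranges of the VNet, one per subnet.
subnets = dict(zip(subnet_names, vnet.subnets(new_prefix=24)))

for name, net in subnets.items():
    usable = net.num_addresses - AZURE_RESERVED_PER_SUBNET
    print(f"{name}: {net} ({usable} usable addresses)")
```

Sizing each subnet for the expected number of network interfaces, plus headroom for scaling, avoids having to re-address the VNet later.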

DCM

The Metaswitch Distributed Capacity Manager (DCM) requires a network interface on the
management network. DCM must be able to connect to all the products that it licenses over this
management network.

MDM

MDM requires a single management network interface. This should be on the same management
subnet as the products MDM is managing. However, you can allow network interfaces on the same
subnet to communicate with each other on all ports.

If you are deploying a multi-site MDM across multiple regions, you must add VNet peering between
your VNets. This allows MDMs in different regions to communicate with each other over the
management subnets.

When you set up a firewall for MDM you must ensure you configure rules following the MDM
documentation. For full details of the firewall configuration that MDM requires, see MDM firewall
configuration in the Metaswitch Deployment Manager Overview Guide.

Perimeta

Perimeta on Azure requires a minimum of three network interfaces, and as such three subnets. These
are:

• A core interface to softswitches or other internal network elements
• A general access interface supporting SIP subscribers and peers
• A management interface

Perimeta supports a management interface plus up to eight interfaces for carrying service traffic.
For example, in deployments supporting Microsoft Teams Direct Routing, Metaswitch expects the
following network interfaces:

• One interface towards the Microsoft Teams PSTN hub (signaling and media)
• (Optional) One interface towards the public internet for media bypass (media only)
• One interface towards your carrier datacenters (signaling and media)
• One management interface

When you set up a firewall or Network Security Group for Perimeta you must ensure you configure
rules following the Perimeta documentation. For full details of the traffic flows that Perimeta requires,
see Traffic information for firewall configuration in Perimeta Network Integration Guide and Firewall
and security group configuration in the Perimeta Initial Setup and Commissioning Guide for your
deployment.


SDE

SDE on Azure requires a minimum of three network interfaces:

• An internal interface to connect SDE's backends.
• An external interface to connect to VTEP peers.
• A management interface.

SIMPL VMs

SIMPL VM requires one network interface on the management subnet.

3.6.3 Networking requirements for VMs


Perimeta instances larger than 2 vCPUs require the use of accelerated networking for performance
reasons. Accelerated networking is equivalent to SR-IOV in private clouds, because it allows the VM
to communicate directly with the NIC.

You do not need to plan redundant network interfaces because Metaswitch products rely on Azure's
redundancy features to protect against networking failure. For example, you do not need to select a
Perimeta port group scheme that supports redundant ports. For more information about the Azure
redundancy mechanisms that Metaswitch recommends, see Availability (Azure public cloud) on page
6.

3.6.4 Routing traffic to instances

Public addresses

Public addresses are a chargeable resource in Azure that can be assigned to VMs. If a public IP
address is not assigned to a VM, outbound connectivity is still possible and Azure dynamically assigns
an available IP address that is not dedicated to a resource. Instead of using public IP addresses for
management access, we recommend that you set up an Azure VPN Gateway; see VPN Gateway |
Microsoft Docs.

Public IP addresses have two SKUs: Standard and Basic. See Public IP addresses in Azure | Microsoft Docs.
Because only Standard public IP addresses support availability zones, you should use this SKU for
any public IP address associated with a Metaswitch product.

Load balancing

Azure does not provide a SIP-aware (layer 7) load balancer. Any load balancing of SIP traffic in Azure
is done via DNS records (for example, DNS SRV weightings) or Azure Traffic Manager.
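
To illustrate how DNS SRV weightings spread load, the following sketch shows a simplified client-side selection in the spirit of RFC 2782: only the records with the lowest priority value are considered, and one of them is picked at random in proportion to its weight. The host names, port, and weights are hypothetical, and real SIP clients may implement the selection differently.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is preferred
    weight: int    # relative share among records of equal priority
    port: int
    target: str


def pick_target(records, rng):
    """Pick one record: lowest priority first, then weighted random choice."""
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical records: two active SBCs sharing load 70/30, plus a backup.
records = [
    SrvRecord(10, 70, 5060, "sbc1.example.com"),
    SrvRecord(10, 30, 5060, "sbc2.example.com"),
    SrvRecord(20, 100, 5060, "sbc-backup.example.com"),
]
rng = random.Random(1)  # fixed seed so the run is repeatable
counts = Counter(pick_target(records, rng).target for _ in range(1000))
print(counts)
```

In this sketch the backup record is never chosen while a lower-priority-value record exists. A real client would move on to it only after the preferred targets fail.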

Azure Traffic Manager

Azure Traffic Manager (Traffic Manager - Cloud Based DNS Load Balancing | Microsoft Azure) is a
DNS-based traffic load balancer. It allows distributing traffic to public-facing applications across Azure
regions and is the way Metaswitch expects most instances of SBC applications to be addressed using
DNS. Traffic Manager is not SIP-aware; it therefore cannot ensure that all messages on a SIP dialog
are sent to the same endpoint.


Traffic Manager IP addresses that are the target of FQDNs can be updated immediately in the case
of an instance failure. Traffic Manager can also resolve FQDNs based on geographic location. This
means that a deployment that spans the world needs to expose only a single FQDN that maps to
different IP addresses depending on the location of the querying entity. This simplifies routing for
applications like Microsoft Teams Direct Routing hosted in an SBC.

3.6.5 DNS in Azure


Azure DNS (DNS | Microsoft Azure) can be used to manage all DNS records for deployments
of Metaswitch products in Azure. Metaswitch products rely on DNS to perform routing and load
balancing. It is therefore important that you understand how Azure DNS is configured and maintained.

3.6.6 SSH access in Azure


If you need SSH access to your Azure VM or container workloads over the public internet, use one of the following:

• Azure VPN Gateway (VPN Gateway - virtual networks | Microsoft Azure) provides a standard VPN
for connecting to your Azure VNets. Azure VPN Gateway is Metaswitch's recommended option.
• Azure Bastion (Azure Bastion | Microsoft Azure) provides a mechanism to connect to VMs over
SSH through the Azure Portal and so avoids exposing the workload's SSH port on the public
internet.

3.6.7 VXLANs in Azure

Virtual extensible Local Area Network (VXLAN) tunnels allow Ethernet (layer 2) traffic to be
transferred over an IP (layer 3) network. Certain Metaswitch products support the use of VXLANs,
allowing them to be deployed as a high availability system in Azure, and granting access to
networking features not normally supported in Azure.

VXLANs are an encapsulation of an Ethernet frame as documented in RFC 7348. They can be used
to provide a complete layer 2 network (known as the overlay network) over the top of an existing layer
3 network (the underlay network).

Figure 1: Simple VXLAN network


In normal VXLAN operation, the underlay network is invisible to the servers. VXLAN Tunnel End
Points (VTEPs) encapsulate the layer 2 overlay packet before forwarding it across the layer 3
underlay network.

Each VXLAN overlay network is uniquely identified with a VXLAN Network Identifier (VNI).
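
As an illustration of the encapsulation described above, the following sketch builds and parses the 8-byte VXLAN header defined in RFC 7348, which carries the 24-bit VNI. It is not taken from any Metaswitch product. The function names are hypothetical, and the outer Ethernet, IP, and UDP headers that a real VTEP also adds are omitted.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348: the VNI field is valid
VXLAN_HEADER_LEN = 8


def encapsulate(vni, inner_frame):
    """Prefix an inner layer 2 frame with a VXLAN header for the given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: 8 flag bits followed by 24 reserved bits.
    # Word 1: 24-bit VNI followed by 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def decapsulate(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    if len(packet) < VXLAN_HEADER_LEN:
        raise ValueError("packet is shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", packet[:VXLAN_HEADER_LEN])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8, packet[VXLAN_HEADER_LEN:]


# Hypothetical VNI and a placeholder for the inner Ethernet frame.
packet = encapsulate(5000, b"inner-ethernet-frame")
vni, frame = decapsulate(packet)
print(vni, frame)
```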

Perimeta and the Secure Distribution Engine both support the use of VXLANs.

VXLANs in Perimeta

For two Perimeta VM instances to function as a high availability pair, they must have floating IP
addresses for signaling and media that are claimed by the current primary instance. The primary
instance claims these addresses by sending gratuitous ARPs (GARPs) to peers.

However, Azure does not support floating virtual IP addresses, and Azure VNets do not support
sending GARPs. In order to function as a high availability system in Azure, Perimeta must
encapsulate these GARPs in VXLAN tunnels.
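
For readers unfamiliar with GARPs, the following sketch builds the 28-byte ARP payload of a gratuitous ARP request, in which the sender and target IP addresses are both the address being claimed. It illustrates the standard ARP layout only, not Perimeta's implementation. The MAC and IP addresses are hypothetical, and the Ethernet header and VXLAN encapsulation are omitted.

```python
import socket
import struct


def build_garp(mac, ip):
    """Build a gratuitous ARP request payload claiming `ip` for `mac`."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        mac_bytes,    # sender hardware address: the instance claiming the address
        ip_bytes,     # sender protocol address: the floating IP address
        b"\x00" * 6,  # target hardware address: unset in a gratuitous request
        ip_bytes,     # target protocol address: same as the sender address
    )


# Hypothetical primary instance MAC and floating service address.
garp = build_garp("00:11:22:33:44:55", "10.0.2.100")
print(garp.hex())
```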

Service traffic is also sent through VXLAN tunnels when deploying Perimeta as a high availability
system in Azure. As such, network elements connected to Perimeta must be capable of handling
encapsulated VXLAN traffic.

To use VXLANs, Perimeta must be configured with the following:

• VXLAN tunnels.
• Service interfaces for the overlay network.
• Service interfaces for the underlay network.

Overlay service interfaces

Overlay service interfaces represent Perimeta's connection to the overlay network. In addition to other
service interface configuration, they are configured with a VXLAN tunnel. When packets are sent out
from an overlay service interface, they are encapsulated and passed to the underlay service interface,
based on the address configured on the overlay service interface's VXLAN tunnel.

Underlay service interfaces

Underlay service interfaces represent Perimeta's connection to the underlay network. In addition to
other service interface configuration, they must be configured with the named per-instance address of
a VXLAN tunnel.

When used to create a high availability system in Perimeta, underlay service interfaces are configured
with the pair of per-instance IP addresses of both instances of the high availability pair.

Example packet flow in a high availability Perimeta system in Azure

In the following example, Perimeta functions as a VTEP, decapsulating a packet received from a
VTEP peer on the core underlay network, and then re-encapsulating it before forwarding it to a VTEP
peer on the access underlay network.


Figure 2: Packet flow in a high availability Perimeta system in Azure

1. A packet arrives at VTEP Peer 1. VTEP Peer 1 encapsulates the packet with a VXLAN header.
2. VTEP Peer 1 sends the encapsulated packet to Perimeta's core underlay interface. This is the
VXLAN tunnel.
3. The packet is decapsulated and passed to the core overlay service interface. It is then forwarded
to the access overlay service interface, and then to the access underlay service interface.
4. The packet is encapsulated with a VXLAN header and sent from the access underlay service
interface to the VTEP Peer 2 on the access underlay network.
5. VTEP Peer 2 decapsulates the packet and forwards it to the next network element.
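
The steps above can be modelled as follows. This is a simplified sketch, not Perimeta's implementation: the `Tunnel` and `VxlanPacket` classes, the addresses, and the VNIs are all hypothetical, and the use of a different VNI on the core and access sides is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tunnel:
    local_underlay: str   # this VTEP's address on the underlay network
    remote_underlay: str  # the peer VTEP's address on the underlay network
    vni: int


@dataclass(frozen=True)
class VxlanPacket:
    outer_src: str
    outer_dst: str
    vni: int
    inner: bytes


def encapsulate(tunnel, inner):
    """Wrap an overlay packet for transport across the underlay network."""
    return VxlanPacket(tunnel.local_underlay, tunnel.remote_underlay, tunnel.vni, inner)


def decapsulate(tunnel, packet):
    """Unwrap a packet, checking that it arrived on the expected tunnel."""
    if packet.outer_dst != tunnel.local_underlay or packet.vni != tunnel.vni:
        raise ValueError("packet does not belong to this tunnel")
    return packet.inner


def relay(core_tunnel, access_tunnel, packet):
    """Steps 3 and 4: decapsulate on the core side, re-encapsulate on the access side."""
    inner = decapsulate(core_tunnel, packet)
    return encapsulate(access_tunnel, inner)


# Hypothetical tunnels as seen from each VTEP.
peer1_tunnel = Tunnel("10.0.1.10", "10.0.1.4", 100)   # VTEP Peer 1 to the core underlay
core_tunnel = Tunnel("10.0.1.4", "10.0.1.10", 100)    # core underlay interface
access_tunnel = Tunnel("10.0.2.4", "10.0.2.10", 200)  # access underlay interface
peer2_tunnel = Tunnel("10.0.2.10", "10.0.2.4", 200)   # VTEP Peer 2

sent = encapsulate(peer1_tunnel, b"SIP message")    # steps 1 and 2
relayed = relay(core_tunnel, access_tunnel, sent)   # steps 3 and 4
delivered = decapsulate(peer2_tunnel, relayed)      # step 5
print(relayed.outer_dst, relayed.vni, delivered)
```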

Restrictions

VXLANs must only be used to create high availability systems in Perimeta. Perimeta does not support
all VXLAN features, and must not be used as a generic VTEP. Perimeta does not support:

• IP multicast traffic on the underlay network.
• Multicast or unknown-unicast traffic in the overlay network.
• Using VXLAN tunnels outside of poll mode. As VXLANs can only be used in poll mode, all tunnel
configuration must be manually removed before switching a system to interrupt mode.

VXLANs in the Secure Distribution Engine

When deployed in Azure, each SDE VM uses VXLANs to route traffic: external VXLANs and an
internal VXLAN. SDE functions as a VTEP, decapsulating, directing and re-encapsulating traffic received
from VTEP peers (such as VTEP-capable routers on the external network, or Perimeta Session
Controllers on the internal network).

To use VXLANs, SDE must have the following configuration:

• In the Solution Definition File (SDF) for your SDE deployment:
    • Underlay addresses for SDE's external network.
    • Underlay and overlay addresses for SDE's internal network.
    • VNIs for SDE's internal network.
    • The details of a single peer VTEP or a pair of peer VTEPs (such as routers) on SDE's external
      network.
• In the sde_service configuration document:
    • Overlay addresses for SDE's internal and external networks.
• In the sde_backends configuration document:
    • The details of peer VTEPs (such as underlay addresses of Perimeta Session Controllers) on
      SDE's internal network.

Overlay addresses

Overlay addresses represent SDE's connection to the overlay network. When packets are sent out
from an overlay address, they are encapsulated and passed on based on the address configured on
an overlay network VXLAN tunnel.

Underlay addresses

Underlay addresses represent SDE's connection to the underlay network. Underlay addresses
correspond to the per-instance addresses of the SDE peers in a high availability pair.

VXLANs in Flow Steerer

In Azure, Flow Steerer functions as a VTEP, decapsulating, directing and re-encapsulating traffic
received from VTEP peers (such as VTEP-capable routers on the external network, and Perimeta
Session Controllers on the internal network).

When deployed in Azure, Flow Steerer requires the following VXLANs to route traffic:

• One external VXLAN between Flow Steerer and the router for each VLAN.
• One internal VXLAN between Flow Steerer and your MSCs.

To use VXLANs, Flow Steerer must have the following configuration:

• In the Solution Definition File (SDF) for your Flow Steerer deployment:
    • Flow Steerer's IP addresses.
    • The IP addresses of the routers Flow Steerer will connect to.
    • The VNI for Flow Steerer's internal network.
• In the flowsteerer_backends.yaml document:
    • Overlay and underlay addresses for the MSCs Flow Steerer will connect to.
• In the flowsteerer_vlans.yaml document:
    • VLAN IDs for the external network.


4 Management, automation and monitoring in Azure


This section outlines methods of managing Metaswitch deployments in Azure. It covers the tools and
workflows that your operators will need to use.

Note that some Azure services require an agent to be installed. Not all agents are supported by all
Metaswitch products. Details of supported agents are available in the documentation for individual
products.

4.1 Overview of deployment and lifecycle management


You can deploy Perimeta with Azure tooling (for example, ARM templates or the Azure CLI) or with
SIMPL VM. All other Metaswitch products are deployed using SIMPL VM.

SIMPL VM
Metaswitch's SIMPL VM is used to manage other Metaswitch VMs on Azure public cloud. SIMPL
provides a unified and consistent way to deploy, commission, and update Metaswitch products.

Azure tooling (Perimeta only)


Deploying using Azure tooling is a user-driven operation and is done in one of two ways:

• Using the Azure CLI. This is the most manual approach as all Azure resources must be created
and linked together by the user. This provides flexibility to the user but at the cost of convenience.
• Using ARM templates. ARM templates provide a parameterized way to deploy components and
their dependencies. A user fills out a parameters file and provides this file and a template file to
Azure to create the product and its dependencies. Using ARM templates is a declarative process.
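
As an illustration of the parameters file mentioned above, the following sketch generates one in the standard ARM deployment parameters format. The parameter names and values (`vmName`, `managementSubnetName`, `adminUsername`) are hypothetical. The real names are defined by the template you are deploying.

```python
import json

# Schema URL for ARM deployment parameter files.
ARM_PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#"
)


def build_parameters(values):
    """Wrap plain name/value pairs in the ARM parameters file structure."""
    return {
        "$schema": ARM_PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in values.items()},
    }


# Hypothetical parameter names; check the template for the names it expects.
params = build_parameters(
    {
        "vmName": "perimeta-a",
        "managementSubnetName": "subnet-mgmt",
        "adminUsername": "operator",
    }
)
parameters_json = json.dumps(params, indent=2)
print(parameters_json)
```

A file with this content is typically supplied alongside the template file, for example through the `--template-file` and `--parameters` options of `az deployment group create`.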

Both mechanisms rely on the user having access to the appropriate Azure VM image.

If new software can be uploaded to Perimeta instances (for example, if SFTP connectivity is possible),
upgrading and applying efixes can be done with Perimeta's standard procedures for upgrading and
applying efixes. If new software cannot be uploaded directly, Perimeta can download new software
versions and efixes from an Azure storage account. This uses Perimeta's orchestration API to
download the software to the instance and apply it.

4.2 Monitoring Azure deployments

Logs
Azure Monitor is used to collect and view logs generated by Metaswitch products. This provides a
single place to view and analyze logs from your deployment. All Metaswitch products supported in
Azure integrate with Azure Monitor except Metaswitch Distributed Capacity Manager (DCM).


Logging and auditing in the Azure platform is covered in Azure security logging and auditing |
Microsoft Docs. Azure provides mechanisms for security logging across Azure resources and allows
you to audit and analyze those logs using a range of tools.

Azure has an Activity Log for all resources to record health events and updates. This integrates with
Azure Resource Health monitoring. As well as user-initiated changes, these logs also display issues
affecting your resources that are caused by the Azure platform.
