VMware NSX-v
Hands-on Guide
Version 1.0
This document can only continue to be successful if we have people contributing with their experience
and knowledge. I would also like to take this opportunity to thank the people that have already helped in
the past year with the creation of the current content:
Kevin Barrass, Nimish Desai, Ray Budavari, Francis Guillier, Dimitri Desmidt, Brad Hedlund, Michael
Haines, Shachar Bobrovskye, Michael Moor, Tiran Efrat, Marcos Hernandez .
Contents
12. Create Firewall Rules that Blocked Your Own VC .................... 139
12.1 How is This Related to NSX? .................................................................................139
12.2 How Can We protect Ourselves from this Situation? ...............................................141
12.3 What if we made a mistake and do not yet have access to the VC? ........................143
This is a mandatory configuration. Registering the NSX Manager with vCenter injects a plugin
into the vSphere Web Client for consumption of NSX functionalities within the Web
management platform.
While trying to register to vCenter or to configure the Lookup Service you might see this error:
The most common causes of failures to register the NSX Manager with vCenter or to configure the SSO Lookup Service are:
Verify connectivity from the NSX Manager to vCenter. Ping from the NSX Manager to vCenter using
both the IP address and the Fully Qualified Domain Name (FQDN). Check for static routes or the
presence of a default route on the NSX Manager:
Verify NSX Manager can successfully resolve the vCenter DNS name. Ping from NSX Manager to
vCenter with FQDN:
If this does not work verify the DNS configuration on the NSX Manager.
Go to Manage -> Network -> DNS Servers:
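As a quick sanity check of the name resolution itself, a short script can be run from any machine that uses the same DNS servers as the NSX Manager. This is only an illustrative sketch; the vCenter FQDN shown is a placeholder for your own.

```python
import socket

def can_resolve(fqdn: str) -> bool:
    """True if this machine's configured DNS can resolve the given name."""
    try:
        socket.gethostbyname(fqdn)
        return True
    except socket.gaierror:
        return False

# "vcenter.corp.local" is a placeholder; substitute your vCenter FQDN.
print(can_resolve("vcenter.corp.local"))
print(can_resolve("localhost"))  # sanity check: resolves on any normal host
```

If the FQDN resolves here but not from the NSX Manager CLI, the DNS server list configured on the NSX Manager is the likely culprit.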
If you have a firewall between the NSX Manager and vCenter, verify it allows SSL communication on
TCP/443 (also allow ping for connectivity checks).
A complete list of the communication ports and protocols used for VMware NSX for vSphere is
available at the links below:
kb.vmware.com/kb/2079386
or
https://communities.vmware.com/docs/DOC-28142
Verify that the time is synchronized between vCenter and the NSX Manager.
Any of the following issues encountered during the deployment of the NSX-v Controller cluster may
cause the deployment to fail and the instantiated Controller nodes to be deleted after a few minutes.
The first area to investigate is the “Task Console” on vCenter. From an analysis of the entries
displayed on the console, it is clear that first the Controller virtual machine is “powered on”,
but then it gets powered off and deleted. But why?
2.1 Troubleshooting
The tech support file can be a very large text file, so finding an issue is as challenging as looking
for a needle in a haystack. What should we look for?
My best advice is to start with something we know, the name of the Controller node that was
first instantiated and then deleted. This name was assigned to the Controller node after the
completion of the deployment wizard.
When you find the name, use the down-arrow key and start reading:
This error tells us we have a connectivity issue; it appears that if the Controller node can’t
connect to the NSX Manager during the deployment process, it is automatically deleted.
The next question is: why do I have connectivity issues? In my case the NSX Controller and the
NSX Manager run in the same IP subnet.
The answer is found in the manual Static IP pool object that was created for the Controller
cluster.
In this lab I work with a /16 subnet (mask 255.255.0.0), but in the IP pool object I mistakenly
assigned a prefix length of 24.
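The effect of the wrong prefix length can be reproduced with Python's ipaddress module (the addresses below are hypothetical lab values): with the mistaken /24, the NSX Manager appears to be outside the Controller's subnet, so the Controller cannot reach it directly.

```python
import ipaddress

manager = ipaddress.ip_address("172.16.10.5")            # hypothetical NSX Manager IP
controller = ipaddress.ip_interface("172.16.20.10/24")   # prefix mistakenly set to 24

print(manager in controller.network)   # False: the Manager looks off-subnet

corrected = ipaddress.ip_interface("172.16.20.10/16")    # the intended /16
print(manager in corrected.network)    # True: both hosts share the same subnet
```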
This was just an example of how to troubleshoot an NSX-v Controller node deployment; there may be
other reasons that can cause a similar problem.
Host preparation is the process in which the NSX Manager triggers the installation of the NSX
kernel modules (also known as NSX VIBs) on a vSphere cluster and builds the NSX control
plane fabric.
Before the host preparation process we need to complete the following steps (discussed in the
previous sections):
Registering the NSX Manager with vCenter.
Deploying the NSX Controllers.
Three components are involved during the NSX host preparation: vCenter, NSX Manager, EAM
(ESX Agent Manager).
vCenter Server:
Management of vSphere compute infrastructure.
NSX Manager:
Provides the single point of configuration and REST API entry-points in a vSphere environment
for NSX.
© 2013 VMware, Inc. All rights reserved.
Page 15 of 208
NSX-v Hands-on Guide
The message clearly indicates that “Agent VIB module not installed” on one or more hosts.
We can check the vSphere ESX Agent Manager for errors:
“vCenter home > vCenter Solutions Manager > vSphere ESX Agent Manager”
On “vSphere ESX Agent Manager”, check the status of “Agencies” prefixed with “_VCNS_153”.
If any of the agencies has a bad status, select the agency and view its issues:
We need to check the associated log /var/log/esxupdate.log (on the ESXi host) for more details
on host preparation issues.
Log into the ESXi host where you have the issue and run “tail /var/log/esxupdate.log” to view the
log.
From the log it becomes clear that the issue may be related to DNS name resolution.
Solution:
Configure the DNS settings in the ESXi host for the NSX host preparation to succeed.
Solution:
NSX-v requires a list of ports to be open in order for the host preparation to succeed.
The complete list can be found in:
https://communities.vmware.com/docs/DOC-28142
Solution:
Use a different port for EAM:
Change the port to 80 in eam.properties under
\Program Files\VMware\Infrastructure\tomcat\webapps\eam\WEB-INF\
Run this command on the ESXi host to check for an active message bus connection:
esxcli network ip connection list | grep 5671 (message bus TCP connection)
3.7 The NSX Manager has a direct link to download the VIBs as a zip
file
https://$nsxmgr/bin/vdn/vibs/5.5/vxlan.zip
3. Increasing the local storage setting of the Flash Player will also speed up the Web Client.
Adobe has an online tool to view and change the local storage setting:
http://www.macromedia.com/support/documentation/en/flashplayer/help/settings_manager07.html
Note: if the PC you are using to connect to vCenter does not have access to the Internet, this Adobe
link will not work. Thanks to Micha Novak and Yaniv Yaakov (my team colleagues) for this tip.
When the Web Client loads the blue screen, quickly right-click with your mouse, then
click the Settings button.
The network port requirements for VMware NSX for vSphere can be found in this KB:
kb.vmware.com/kb/2079386
https://www.rfc-editor.org/rfc/rfc7348.txt
This command will capture all the traffic sent from the local VTEP toward the physical switch
and save it in a file named cap2 with pcap format. While running this command, ping from one
guest 192.168.1.1 to another guest 192.168.1.2 (hosted in a different ESXi host) to generate
some traffic.
With WinSCP we can copy the pcap file from the ESXi host to a Windows PC and open it with
Wireshark.
We can see UDP traffic from VTEP 192.168.64.130 to VTEP 192.168.64.131, destined to port
8472 (VXLAN), but where is the VXLAN header?
Wireshark can display VXLAN traffic, but we need to tell it to decode this UDP payload as
VXLAN.
Right-click the frame and choose “Decode As…”
5.3 Conclusions
When configuring VXLAN on a Distributed Switch, keeping the default MTU of 1600 will keep you on
the safe side.
6. Teaming Policy
Teaming policies allow the NSX vSwitch to load balance the traffic across different physical
NICs (pNICs). The NSX Reference Design Guide (available at
https://communities.vmware.com/docs/DOC-27683) contains a table with the different teaming
policy configuration options.
At first glance of the table, we can see that only some of the supported teaming options imply
the creation of Multiple VTEPs (on the same ESXi host).
Multiple VTEPs – two or more VTEP kernel interfaces that can be created in an NSX vSwitch.
In a Multiple VTEPs deployment we will have 1:1 mapping with the physical uplinks of the
vSwitch. That means each VTEP will send/receive traffic on a specific pNIC interface.
In our example VTEP1 will map to pNIC1 and VTEP2 will map to pNIC2.
It is important to note that all VXLAN traffic originated from VTEP1 goes out through pNIC1, and
all encapsulated traffic destined to VTEP1 is received on pNIC1 (the same holds for VTEP2 and
pNIC2).
Multiple VTEPs are useful when we have more than one physical link that we would like to use for
VXLAN traffic and the upstream switches do not support LACP (or it is not configured): in that
case the use of multiple VTEPs allows the traffic to be balanced across the physical links.
Configuration of the multiple VTEPs is done on the Network & Security > Installation >
Configure VXLAN tab.
Note: for the creation of multiple VTEPs it is required to select SRCID or SRCMAC as VMKNic
teaming policy during the VXLAN configuration of an ESXi cluster.
In this example we can see how 4 VTEPs are going to be created. This number comes from
the number of physical uplinks configured in the vDS.
VTEP1 will then send this traffic to pNIC1 (since VTEP1 is pinned to this uplink in our specific
example).
When VM2, with portID2, connects and generates green traffic, the NSX vSwitch will pick a
different VTEP to send out this traffic.
We will use another VTEP since the NSX vSwitch will see a different portID as the source, and
VTEP1 already has traffic. VTEP2 will hence forward this traffic to pNIC2.
Now VM3, from portID3, connects and sends yellow traffic. The NSX vSwitch will randomly
pick one of the VTEPs to handle this traffic.
Both VTEP1 and VTEP2 already have the same number of VM connections (one on each), so
there is no preference for who will be selected in terms of port-group balancing.
In this example, VTEP1 was chosen for this and forwards traffic to pNIC1.
Positive aspects: Very simple and there is no need to configure any LACP on the upstream
switch.
Negative aspects: If VM1 doesn’t generate heavy traffic, and VM2 is generating heavy VM
traffic, the usage of the physical links will not be balanced.
When VM2 with MAC2 connects and generates Green traffic, the NSX vSwitch will pick a
different VTEP to send this traffic out.
We will use the other VTEP since the NSX vSwitch sees a different MAC address as the source and
VTEP1 already has traffic. VTEP2 will forward this traffic to pNIC2.
At this point we are using both of the physical uplinks.
When VM3 with MAC3 connects and sends yellow traffic, the NSX vSwitch will randomly pick
one of the VTEPs to handle this traffic.
Both VTEP1 and VTEP2 already have the same number of VM connections, so there is no
preference for which one will be selected in the context of MAC address balancing.
Positive points: very simple, no need to configure any LACP on the upstream switch.
Negative points: if VM1 doesn’t generate heavy traffic and VM2 sources very heavy VM traffic,
the utilization of the physical uplinks will not be balanced.
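The pinning behavior of both SRCID and SRCMAC can be sketched in a few lines: each source identifier (virtual port ID or source MAC) is deterministically mapped to one VTEP, which is in turn mapped 1:1 to a pNIC. The modulo mapping below is an illustrative assumption, not the actual ESXi implementation.

```python
def pick_vtep(source_id: int, num_vteps: int) -> int:
    """Deterministically pin a source (port ID or MAC, as an int) to a VTEP index."""
    return source_id % num_vteps

NUM_VTEPS = 2  # one VTEP per pNIC, as in the examples above

# Three VMs identified by port IDs 1, 2, 3:
print([pick_vtep(pid, NUM_VTEPS) for pid in (1, 2, 3)])  # [1, 0, 1]
```

Notice that per-source pinning balances the number of sources, not the amount of traffic, which is exactly the negative point called out above.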
Starting from the ESXi 5.5 release, VMware improved the hashing method for LACP to support up to
20 different hash algorithms. vSphere 5.5 supports these load balancing types:
1. Destination IP address
2. Destination IP address and TCP/UDP port
3. Destination IP address and VLAN
4. Destination IP address, TCP/UDP port and VLAN
5. Destination MAC address
6. Destination TCP/UDP port
7. Source IP address
8. Source IP address and TCP/UDP port
9. Source IP address and VLAN
10. Source IP address, TCP/UDP port and VLAN
11. Source MAC address
12. Source TCP/UDP port
13. Source and destination IP address
14. Source and destination IP address and TCP/UDP port
15. Source and destination IP address and VLAN
16. Source and destination IP address, TCP/UDP port and VLAN
17. Source and destination MAC address
18. Source and destination TCP/UDP port
19. Source port ID
20. VLAN
The Source or Destination IP hash is derived from the VTEP IP address located in the outer IP
header of the VXLAN frame.
Every time the hash is calculated with the Source or Destination IP method (option 1 or 7), the
VTEP IP address is used.
Selecting LACPv2 (also referred to as “Enhanced LACP”) as teaming policy between an ESXi host
and the ToR switch leads to the creation of one VTEP only.
In this example we have 2 physical uplinks connected to one physical upstream switch. Those
uplinks are bundled together in a single “logical uplink”, which explains why a single VTEP is
created.
In this scenario we are selecting the IP Hash algorithm for LACPv2. We have two ESXi hosts,
esx1 and esx2. When VM1 connects to NSX vSwitch on host1 and generates Red traffic toward
VM2, the traffic is sent to VTEP1 (the only VTEP we have in the source ESXi host).
Then the NSX vSwitch calculates the Hash value based on Source VTEP IP1 or Destination VTEP
IP2 and as a result of this Hash value it selects pNIC1.
When the physical switch connected to esx2 receives the frame, it performs a similar hash
calculation (assuming the same IP Hash algorithm is also locally configured on the physical
switch) and selects one of the physical links (in this example pNIC1).
Now VM3, connected to the NSX vSwitch on esx1, tries to send green traffic to VM4, also connected
to esx2; VTEP1 will again handle this traffic. The NSX vSwitch will calculate the hash based on
the source IP (VTEP1), the destination IP (VTEP2), or both.
In any case, the result will be the election of the same pNIC1, since this is the same hash that
was calculated when VM1 sent traffic to VM2.
In this scenario we can see that both traffic flows originated from VM1 and VM3 are using the
same pNIC1 uplink.
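This behavior can be illustrated with a toy hash (CRC32 here is only a stand-in for the real ESXi algorithm): because every VXLAN frame between these two hosts carries the same outer VTEP IP pair, the IP-hash always selects the same uplink.

```python
import zlib

def ip_hash_uplink(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
    """Toy stand-in for an IP-hash teaming policy over the outer IP headers."""
    return zlib.crc32(f"{src_ip}->{dst_ip}".encode()) % num_uplinks

vtep1, vtep2 = "192.168.64.130", "192.168.64.131"
flow_vm1 = ip_hash_uplink(vtep1, vtep2, 2)  # VM1 -> VM2 (red traffic)
flow_vm3 = ip_hash_uplink(vtep1, vtep2, 2)  # VM3 -> VM4 (green traffic)
print(flow_vm1 == flow_vm3)  # True: same outer IP pair, same pNIC
```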
When using L4 information, the hash is calculated based on the source port or
destination port (options 2, 4, 6, 8). In VXLAN that means the hash is derived from the
values in the outer UDP header.
VMware creates a random UDP source port value based on the L2/L3/L4 headers present in the
original frame.
As a result of this method, every time a different flow (identified by the original L2, L3 and L4
values) is established between VMs, a different random UDP source port will be generated.
Now when VM1 and VM3 send traffic, the load-balancing algorithm may select different pNICs
(the more flows are originated, the more evenly the uplinks are utilized).
Note: both uplinks can be utilized also for flows originated from the same VM, as long as they
are associated with different types of communications (for example, HTTP and FTP flows).
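The idea can be sketched as follows (a simplified assumption, not the exact VMkernel code): the outer UDP source port is derived from a hash of the inner frame's headers, so different inner flows generally get different outer source ports and can therefore hash to different uplinks.

```python
import zlib

def outer_udp_src_port(inner_flow: tuple) -> int:
    """Derive a stable outer UDP source port from the inner L2/L3/L4 headers."""
    # VXLAN implementations commonly pick the source port from the ephemeral range.
    return 49152 + zlib.crc32(repr(inner_flow).encode()) % 16384

http_flow = ("mac1", "mac2", "172.16.10.1", "172.16.10.2", "tcp", 51000, 80)
ftp_flow  = ("mac1", "mac2", "172.16.10.1", "172.16.10.2", "tcp", 51001, 21)

# The same flow always yields the same port, keeping its packets in order on one uplink:
print(outer_udp_src_port(http_flow) == outer_udp_src_port(http_flow))  # True
# Distinct flows (here HTTP vs. FTP from the same VM) will in general yield different
# ports, and can therefore be balanced onto different uplinks.
```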
To know exactly what ESXi uplink is going to be used for traffic sourced by a given VM, it is
possible to use the following command after connecting SSH to the ESXi host where that VM is
located:
Type esxtop and then press ‘n’ (shortcut of network).
The VM named “web-sv-01a” is pinned to vmnic0. vmk3 is the VMkernel interface used for
VXLAN traffic and is pinned to vmnic0.
Note: in vSphere, vmnicX represents a physical uplink of the ESXi host (also previously called a
pNIC).
6.10 Conclusion
The Controller cluster in the NSX platform is the control plane component that is responsible
for managing the switching and routing modules in the hypervisors.
The use of the Controller cluster in managing VXLAN based logical switches eliminates the need
for IP multicast in the underlay network.
Each Controller Node is assigned a set of roles that define the type of tasks the node can
implement. By default, each Controller Node is assigned all roles.
API provider: Handles HTTP web service requests from external clients (NSX Manager) and
initiates processing by other Controller Node tasks.
Persistence Server: Stores data from the NSX Manager APIs and vDS devices that must be
persisted across all Controller Nodes in case of node failures or shutdowns.
Logical manager: Monitors when endhosts arrive or leave the vDS devices and configures the
vDS forwarding states to implement logical connectivity and policies.
Switch manager: Maintains management connections for one or more vDS devices.
Directory server: Manages VXLAN and the distributed logical routing directory of information.
Any multi-node HA mechanism has the potential for a “split brain” scenario in which a cluster is
partitioned into two or more groups, and those groups are not able to communicate. In this
scenario, each group might assume control of all tasks under the assumption that the other
nodes have failed. NSX uses leader election to solve this split-brain problem. One of the
Controller Nodes is elected as a leader for each role, which requires a majority vote of all active
and inactive nodes in the cluster.
The leader for each role is responsible for allocating tasks to individual Controller Nodes and
determining when a node has failed. Since election requires a majority of all nodes, it is not
possible for two leaders to exist simultaneously within a cluster, preventing a split brain
scenario. The leader election mechanism requires a majority of all cluster nodes to be
functional at all times.
Below is an example of a 3-node NSX Controller cluster and the role election across the node members.
The different majority number scenarios depend on the number of deployed Controller Cluster
nodes. It is evident how deploying 2 nodes (traditionally considered an example of a redundant
system) would increase the scalability of the Controller Cluster (since at steady state two nodes
would work in parallel) without providing any additional resiliency. This is because with 2
nodes, the majority number is 2 and that means that if one of the two nodes were to fail, or
they lost communication with each other (dual-active scenario), neither of them would be able
to keep functioning (accepting API calls, etc.). The same considerations apply to a deployment
with 4 nodes that cannot provide more resiliency than a cluster with 3 elements (even if
providing better performance).
Note: currently NSX-v 6.1 supports only a 3-node Controller cluster for production deployments.
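The majority arithmetic behind these considerations is simple enough to verify in a few lines:

```python
def majority(n: int) -> int:
    """Quorum size: strictly more than half of the n cluster nodes."""
    return n // 2 + 1

for n in (1, 2, 3, 4, 5):
    print(f"{n} nodes: quorum {majority(n)}, tolerates {n - majority(n)} failure(s)")
```

The output shows why 2 nodes tolerate 0 failures (no better than 1 node) and 4 nodes tolerate only 1 failure (no better than 3), matching the resiliency discussion above.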
The next part of troubleshooting the NSX Controllers is based on the VMware NSX MH 4.1 User Guide:
https://my.vmware.com/web/vmware/details?productId=418&downloadGroup=NSX-MH-412-
DOC
Ensure that the Controllers are installed on systems that meet the minimum requirements.
On each Controller:
The CLI command “request system compatibility-report” provides informational details that
determine whether a Controller system is compatible with the Controller requirements.
The NSX Manager continually checks whether all Controller Clusters are accessible. If a
Controller Cluster is currently in disconnected status, your diagnostic efforts and log review
should be focused on the time immediately after the Controller Cluster was last seen as
connected.
This NSX “Controller nodes status” screenshot shows the status between the NSX Manager and
each Controller, not the overall Controller cluster status.
So even if all Controllers are in “Normal” state, as in the figure below, that doesn’t mean
the overall cluster status is OK.
Join status: verify that this node has completed the process of joining the cluster.
Majority status: check whether this node is part of the cluster majority.
Cluster ID: all cluster members must share the same cluster ID.
The current status of the Controller node’s intra-cluster communication connections can be
determined by running:
If a Controller node is a Controller Cluster majority leader, it will be listening on port 2878 (as
indicated by the Y in the “listening” column).
The other Controller nodes will have a dash (-) in the “listening” column.
The next step is to check whether the Controller Cluster majority leader has any open
connections as indicated by the number in the “open conns” column. On a properly functioning
Controller, the open connections should be the same as the number of other Controller nodes
in the Controller Cluster (e.g. in a three-node Controller Cluster, the Controller Cluster majority
leader should show two open connections).
The command show control-cluster history will allow you to see a history of Controller Cluster-
related events on this node including restarts, upgrades, Controller Cluster errors and loss of
majority.
This section covers issues that may be encountered when attempting to join a new Controller
Node to an existing Controller Cluster. An explanation of why the issue occurs and instructions
on how to resolve the issue are also provided.
Symptom: joining a new Controller node to a Controller Cluster may fail when all of the
existing Controllers are disconnected.
As we can see, controller-1 and controller-2 are disconnected from the NSX Manager.
When we try to add a new Controller node we get this error message:
Explanation:
If n nodes have joined the NSX Controller Cluster, then a majority (strictly greater than 50%) of
those n nodes must be alive and connected to each other, before any new data can be written
to the system. This means that if you have a Controller Cluster of 3 nodes, 2 of them must be
alive and connected in order for new data to be written in NSX.
In our case, to add a new Controller node to the cluster we need at least one member of the
cluster to be in “Normal” state.
Symptom: the join control-cluster CLI command hangs without ever completing the join
operation.
Explanation:
The IP address passed into the join control-cluster command was incorrect, and/or does not
refer to a currently live Controller node.
Make sure that the 192.168.110.201 node is part of the existing controller cluster.
Resolution:
Use the IP address of a properly configured Controller that is reachable across the network.
Symptom:
The join control-cluster CLI command fails.
Explanation:
If you have a Controller configured as part of a Controller Cluster, and that Controller has been
disconnected from the Controller Cluster for a long period of time (perhaps it was taken offline
or shut down), and during that time, the other Controllers in that Controller Cluster were
removed from the Controller Cluster and formed a new Controller Cluster, then the long-
disconnected Controller will not be allowed to rejoin the Controller Cluster that it left, because
that original Controller Cluster is gone.
The following event log message in the new Controller Cluster indicates that something like this
has happened:
Resolution:
You must issue the join control-cluster command with the force option on the old Controller to
force it to clear its state and join the new Controller Cluster with a fresh start.
Note: the forced join command deletes the previously joined node with the same IP.
When a Controller cluster majority issue arises, it is very difficult to spot from the NSX
Manager GUI. For example, the current state of the Controllers from the NSX Manager point of
view is that all the members are in “Normal” state.
Node 1 and Node 2 are part of the cluster and share the roles between them; for some reason
Node 3 is disconnected from the majority of the cluster.
From Node 1’s perspective, it is the leader (it has the Y in the “listening” column) and has one
open connection from Node 2, as shown below:
To recover from this scenario, Node 3 needs to rejoin the majority of the cluster; the IP address
used for the join must be Node 1’s, because it is the leader of the majority.
In this scenario all NSX Controller nodes failed or had been deleted. Do we need to start from
scratch?
The assumption is that in our environment we have already deployed NSX Edge, DLR and we
have VMs actively connected to logical switches. The desire would be to preserve all those
configurations.
Step 1:
Step 2:
Step 3:
Sync the newly deployed NSX Controllers (in unicast mode) with the current state of our NSX deployment.
The show network connection output shown in the preceding block is an example from a
healthy Controller. If you find some of these missing, it’s likely that NSX didn’t get past its install
phase. Here are some misconfigurations that can cause this:
Bad management address or listen IP
You’ve set an incorrect IP as the management-address, or as the listen-ip for one of the roles
(like switch_manager or api_provider).
NSX attempts to bind to the specified address, and fails early if it cannot do so. You’ll see log
messages in cloudnet_cpp.log.ERROR like:
E0506 01:20:17.099596 7188 dso-deployer.cc:516] Controller component installation of rpc-broker
failed: Unable to bind a RPC port $tags:tracing:3ef7d1f519ffb7fb^
E0506 01:20:17.100162 7188 main.cc:271] RPC deployment subsystem not installed; exiting.
$tags:tracing:3ef7d1f519ffb7fb^
Or in cloudnet_cpp.log.WARNING:
W0506 01:22:27.721777 7694 ssl-socket.cc:530] SSLSocket failed to bind to 172.1.1.1:6632: Cannot
assign requested address
Note that if you are using DHCP for the IP addresses of your controller nodes (not
recommended or supported), the IP address could have changed since the last time you
configured it.
Verify that the IP addresses for switch_manager and api_provider are what they are supposed
to be by performing the CLI command:
<switch_manager|api_provider> listen-ip
to determine whether the IPs listed correspond to the IPs of the Controllers in the Controller
Cluster.
Out of disk space
The Controller may be out of disk space; use the ‘‘show status’’ command to check.
Use show system statistics graph <datasource> for output in graphical format.
As an example, the following output shows the RRD statistics for the datasource disk_ops:write
associated with the disk sda1 on the Controller in a tabular form:
# show system statistics disk-sda1/disk_ops:write
Time Write
12:29 0.74
12:28 0.731429
12:27 0.617143
8. Edge ECMP
This post was written by Roie Ben Haim and Max Ardica, with a special thanks to Jerome Catrouillet,
Michael Haines, Tiran Efrat and Ofir Nissim for their valuable input
In this section we will describe the Equal Cost Multi-Path functionality (ECMP) introduced in
VMware NSX release 6.1 and discuss how it addresses the requirements of scalability,
redundancy and high bandwidth. ECMP has the potential to offer substantial increases in
bandwidth by load-balancing traffic over multiple paths as well as providing fault tolerance for
failed paths. This is a feature that is available on physical networks and that has also been
introduced for virtual networking as well. ECMP uses a dynamic routing protocol to learn the
next-hop towards a final destination and to converge in case of failures. For a great demo of
how this works, you can start by watching this video, which walks you through these
capabilities in VMware NSX.
https://www.youtube.com/watch?v=Tz7SQL3VA6c
To keep pace with the growing demand for bandwidth, the data center must meet scale out
requirements, which provide the capability for a business or technology to accept increased
volume without redesign of the overall infrastructure. The ultimate goal is avoiding the “rip and
replace” of the existing physical infrastructure in order to keep up with the growing demands of
the applications. Data centers running business critical applications need to achieve near 100
percent uptime. In order to achieve this goal, we need the ability to quickly recover from
failures affecting the main core components. Recovery from catastrophic events needs to be
transparent to end user experiences.
ECMP with VMware NSX 6.1 allows you to use up to a maximum of 8 ECMP Paths
simultaneously. In a specific VMware NSX deployment, those scalability and resilience
improvements are applied to the “on-ramp/off-ramp” routing function offered by the Edge
Services Gateway (ESG) functional component, which allows communication between the
logical networks and the external physical infrastructure.
External users’ traffic arriving from the physical core routers can use up to 8 different paths (E1-
E8) to reach the virtual servers (Web, App, DB).
In the same way, traffic returning from the virtual servers hits the Distributed Logical Router
(DLR), which can choose up to 8 different paths to get to the core network.
When a traffic flow needs to be routed, the round robin algorithm picks one of the links as the
path for all traffic of this flow. Sending all the packets of a flow through the same path keeps
them in order. Once the next-hop is selected for a particular source IP and destination IP pair,
it is stored in the route cache, and all subsequent packets of this flow follow the same path.
There is a default IPv4 route cache timeout of 300 seconds; if an entry is inactive for this
period of time, it becomes eligible for removal from the route cache. Note that these settings can
be tuned for your environment.
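A per-flow route cache with an inactivity timeout can be sketched as follows (the structure and values are illustrative, not NSX internals):

```python
import time

CACHE_TIMEOUT = 300.0  # default IPv4 route cache timeout, in seconds

class RouteCache:
    def __init__(self, paths):
        self.paths = paths
        self.cache = {}  # (src, dst) -> (path index, last-used timestamp)

    def next_hop(self, src, dst, now=None):
        now = time.monotonic() if now is None else now
        entry = self.cache.get((src, dst))
        if entry and now - entry[1] < CACHE_TIMEOUT:
            path = entry[0]                            # flow stays on its cached path
        else:
            path = hash((src, dst)) % len(self.paths)  # pick a path for a new flow
        self.cache[(src, dst)] = (path, now)
        return self.paths[path]

rc = RouteCache(["E1", "E2"])
first = rc.next_hop("192.168.100.86", "172.16.10.10", now=0.0)
again = rc.next_hop("192.168.100.86", "172.16.10.10", now=100.0)
print(first == again)  # True: packets of the same flow follow the same path
```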
The DLR will choose a path based on a Hashing algorithm of Source IP and Destination IP.
In order to work with ECMP the requirement is to use a dynamic routing protocol: OSPF or BGP.
If we take OSPF for example, the main factor influencing the traffic outage experience is the
tuning of the OSPF timers.
OSPF neighbors exchange Hello messages; the Hello interval determines how often a Hello is sent.
A second timer, the Dead interval, determines how long to wait before declaring an OSPF neighbor
down, and it is the main factor that influences the convergence time. The Dead interval is usually
4 times the Hello interval, but the OSPF (and BGP) timers can be set as low as 1 second (Hello
interval) and 3 seconds (Dead interval) to speed up the traffic recovery.
In the example above, the E1 NSX Edge has a failure; the physical routers and the DLR detect E1 as
dead at the expiration of the Dead timer and remove their OSPF neighborship with it. As a
consequence, the DLR and the physical router remove the routing table entries that originally
pointed to the specific next-hop IP address of the failed ESG.
As a result, all corresponding flows on the affected path are re-hashed through the remaining
active units. It’s important to emphasize that network traffic that was forwarded across the
non-affected paths remains unaffected.
With ECMP it’s important to have introspection and visibility tools in order to troubleshoot
potential points of failure. Let’s look at the following topology.
A user outside our Data Center would like to access the Web Server service inside the Data
Center. The user IP address is 192.168.100.86 and the web server IP address is 172.16.10.10.
This User traffic will hit the Physical Router (R1), which has established OSPF adjacencies with
E1 and E2 (the Edge devices). As a result R1 will learn how to get to the Web server from both
E1 and E2 and will get two different active paths towards 172.16.10.10. R1 will pick one of the
paths to forward the traffic to reach the Web server and will advertise the user network subnet
192.168.100.0/24 to both E1 and E2 with OSPF.
E1 and E2 are NSX for vSphere Edge devices that also establish OSPF adjacencies with the DLR.
E1 and E2 will learn how to get to the Web server via OSPF control plane communication with
the DLR.
From the DLR perspective, it acts as a default gateway for the Web server. This DLR will form an
OSPF adjacency with E1 and E2 and have 2 different OSPF routes to reach the user network.
From the DLR we can verify OSPF adjacency with E1, E2.
From this output we can see that the DLR has two Edge neighbors: 192.168.100.3 and
192.168.100.10. The next step will be to verify that ECMP is actually working.
show ip route
The output from this command shows that the DLR learned the user network 192.168.100.0/24
via two different paths, one via E1 = 192.168.10.1 and the other via E2 = 192.168.10.10.
Now we want to display all the packets that were captured by an NSX for vSphere Edge
interface.
In the example below, in order to display the traffic passing through interface vNic_1, excluding
OSPF protocol control packets, we need to type this command:
“debug packet display interface vNic_1 not_ip_proto_ospf”
We can see an example with a ping running from host 192.168.100.86 to host 172.16.10.10
Capture traffic
If we would like to display only the captured traffic destined to the specific IP address
172.16.10.10, the capture command would look like: “debug packet display interface vNic_1 dst_172.16.10.10”
ECMP currently implies stateless behavior. This means that there is no support for stateful
services such as the Firewall, Load Balancing or NAT on the NSX Edge Services Gateway.
Up to NSX 6.1.1 release, the Edge Firewall and ECMP could not be turned on at the same time
on the NSX edge device. Note however, that the Distributed Firewall (DFW) is unaffected by
this.
Starting from the NSX release 6.1.2 the Edge Firewall is not disabled automatically on ESG
when ECMP is enabled. It is hence recommended to turn off Firewall when deploying the ESG
in ECMP mode, in order to avoid traffic drops caused by asymmetric routing.
In Asymmetric routing, a packet between a source and a destination traverses one path and
takes a different path when it returns to the source.
Starting from version 6.1, the NSX Edge can work with ECMP – Equal Cost Multipath. The
deployment of ECMP makes it very likely that traffic flows in an asymmetric fashion between the
Edges and the DLR or between the Edges and the physical routers.
ECMP with Asymmetric routing is not a problem by itself, but will cause problems when more
than one NSX Edge is in place and stateful services are inserted in the path of the traffic.
Stateful services such as firewall, load balancing and Network Address Translation (NAT) can’t
work with asymmetric routing.
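Why that is can be shown with a minimal sketch (plain Python, not NSX code): two independent stateful firewalls, where only the edge that saw the TCP SYN will accept the rest of the flow:

```python
# Two independent stateful edges: each one only knows the sessions it saw.
class StatefulEdge:
    def __init__(self, name):
        self.name = name
        self.sessions = set()  # flows for which this edge saw the TCP SYN

    def forward(self, flow, syn=False):
        if syn:
            self.sessions.add(flow)   # new session recorded on this edge only
            return "forwarded"
        # A non-SYN packet that matches no known session is treated as invalid.
        return "forwarded" if flow in self.sessions else "dropped"

e1, e2 = StatefulEdge("E1"), StatefulEdge("E2")
flow = ("192.168.100.86", "172.16.10.10", 50000, 80)  # one flow, both directions

outbound = e1.forward(flow, syn=True)   # client SYN routed out via E1
reply = e2.forward(flow)                # the DLR hashes the reply to E2
print(outbound, reply)                  # forwarded dropped
```

The reply is dropped on E2 for the exact reason discussed in this section: E2 never saw the SYN, so the packet matches no session.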
A user from the external network tries to access a Web VM inside the Data Center. The firewall
is enabled on both E1 and E2 which, being deployed independently, do not synchronize firewall
state information. The traffic passes through the E1 Edge, goes from E1 to the DLR, traverses
the NSX distributed firewall and reaches the Web VM.
When the Web VM replies back, the traffic will hit the DLR default gateway. DLR has two ECMP
paths to route the traffic (via E1 or E2).
If the DLR chooses the path via E2, the traffic will reach E2 and will be dropped!
The reason for this is that E2 is not aware of the state of the session started at E1; the
reply packet from the Web VM arrives at E2, where it cannot be matched to any existing session
for that flow.
From E2’s perspective this is a new session, and any new TCP session should start with a SYN;
since this packet is not the beginning of a session, E2 will drop it!
Note: the NSX Distributed Firewall is not part of this problem. The NSX Distributed Firewall is
implemented at the vNIC level, so all traffic enters and leaves through the same vNIC and there
is no asymmetric routing at that level. Incidentally, this is also the reason why, when we
vMotion a VM, the firewall rules and connection state move with the VM itself.
Starting from version 6.1 when we enable ECMP on NSX Edge we get the following message:
But the BIG difference is that the Firewall service is NOT disabled by default (the message is
displayed as a consequence of a bug that is fixed in release 6.1.3).
Even if you have an “Any, Any” default firewall rule with an “Accept” action, traffic flows may
still be dropped because of the previously explained asymmetric routing problem!
You will not see these dropped packets even in Syslog or Log Insight!
From the point of view of the end user, some of the sessions work just fine (the sessions that
do not follow an asymmetric path), while other sessions are dropped (the asymmetric ones).
The place I found where we can see that packets are dropped because of session state is the
output of the command:
show tech-support
8.9 Conclusions
When you enable ECMP and you have more than one NSX Edge in your topology, go to the Firewall
service and disable it yourself; otherwise you will spend many hours troubleshooting
inconsistent connectivity.
The NSX Edge Cluster Connects the Logical and Physical worlds and usually hosts the NSX Edge
Services Gateways and the DLR Control VMs.
There are deployments where the Edge Cluster may contain the NSX Controllers as well.
In this section we discuss how to design an Edge Cluster to survive the failure of an ESXi host
or of an entire physical chassis and to lower the outage time.
In the figure below we deploy the NSX Edges E1 and E2 in ECMP mode, where they are
active/active from both the control plane and data plane perspectives. The DLR Control VMs run
active/passive, while both E1 and E2 run a dynamic routing protocol with the active DLR
Control VM.
When the DLR learns a new route from E1 or E2, it will push this information to the NSX
Controller cluster. The NSX Controller will update the routing tables in the kernel of each ESXi
host running this DLR instance.
In the scenario where the ESXi host that contains the Edge E1 fails:
The active DLR will update the NSX Controller to remove E1 as next hop, the NSX Controller
will update the ESXi hosts and, as a result, the “Web” VM traffic will be routed to Edge E2.
The time it takes to re-route the traffic depends on the dynamic protocol convergence time.
In the specific scenario where the failed ESXi or Chassis contained both the Edge E1 and the
active DLR, we would instead face a longer outage in the forwarded traffic.
The reason for this is that the active DLR is down and cannot detect the failure of the Edge E1
and accordingly update the Controller. The ESXi will continue to forward traffic to Edge E1 until
the passive DLR becomes active, learns that the Edge E1 is down and updates the NSX
Controller.
By default, when we deploy an NSX Edge or DLR in active/passive mode, the system takes care of
creating a DRS anti-affinity rule that prevents the active and passive VMs from running on the
same ESXi host.
We need to build new DRS rules, as these default rules will not prevent the dual-failure
scenario described above.
The figure below describes the network logical view for our specific example. This topology is
built from two different tenants where each tenant is being represented with a different color
and has its own Edge and DLR.
Note connectivity to the physical world is not displayed in the figure below in order to simplify
the diagram.
The physical Edge Cluster has four ESXi hosts that are distributed over two physical chassis:
We start by creating a container for all the ESXi hosts in Chassis A; this container is
configured as a DRS Host Group.
Click the Add button and call this group “Chassis A”.
The container type needs to be “Host DRS Group”; add the ESXi hosts running on Chassis A
(esxcomp-01a and esxcomp-02a).
Create another DRS group called Chassis B that contains esxcomp-01b and esxcomp-02b:
We need to create a container for the VMs that will run in Chassis A. At this point we are just
naming the group; we are not actually placing the VMs in Chassis A.
DRS Rules:
Now we need to take the DRS objects we created before, “Chassis A” and “VM to Chassis A”, and
tie them together. The next step is to do the same for “Chassis B” and “VM to Chassis B”.
Click the Add button in DRS Rules and enter a name such as “VMs Should Run on
Chassis A”.
In the Type field select “Virtual Machines to Hosts”, because we want to bind the VM group to
the host group.
Below the VM group selection we need to select the group & hosts binding enforcement type.
If we choose the “Must” option, then in the event of the failure of all the ESXi hosts in this
group (for example if Chassis A had a critical power outage), the other ESXi hosts in the
cluster (Chassis B) would not be considered by vSphere HA as a viable option for the recovery of
the VMs. The “Should” option allows other ESXi hosts to be used as a recovery option.
Now the problem with the current DRS rules and the VM placement in this Edge cluster is that
the Edge and DLR Control VM are actually running in the same ESXi host. We need to create
anti-affinity DRS rules.
An Edge and DLR that belong to the same tenant should not run in the same ESXi host.
In the case of a failure of one of the ESXi hosts we don’t face the problem where Edge and DLR
are on the same ESXi host, even if we have a catastrophic event of a chassis A or B failure.
Note:
The Control VM can be moved to the compute cluster, which avoids this design consideration.
Thanks to Francis Guillier and Tiran Efrat for the overview and feedback.
One of the most important NSX Edge features is NAT.
With NAT (Network Address Translation) we can change the source or destination IP address and
TCP/UDP port. Combining NAT and firewall rules can lead to confusion when we try to determine
the correct IP address to which to apply a firewall rule.
To create the correct rule we need to understand the packet flow inside the NSX Edge in detail.
In NSX Edge we have two different types of NAT: Source NAT (SNAT) and Destination NAT (DNAT).
9.1 SNAT
SNAT allows translating an internal IP address (for example a private IP address as described
in RFC 1918) to a public external IP address.
In figure below, the IP address for any VM in VXLAN 5001 that needs outside connectivity to the
WAN can be translated to an external IP address (this mapping is configured on the Edge). For
example, VM1 with IP address 172.16.10.11 needs to communicate with the Internet over the WAN, so the
NSX Edge can translate it to a 192.168.100.50 IP address configured on the Edge external
interface.
Users in the external network are not aware of the internal Private IP address.
9.2 DNAT
Below is the outline of the Packet flow process inside the Edge. The important parts are where
the SNAT/DNAT Action and firewall decision action are being taken.
We can see from this process that the ingress packet will evaluate against FW rules before
SNAT/DNAT translation.
Note: the actual packet flow details are more complicated with more action/decisions in Edge
flow, but the emphasis here is on the NAT and FW functionalities only.
Because of this packet flow, the firewall rule for SNAT needs to be applied to the internal IP
address object and not to the IP address translated by the SNAT function. For example, when
VM1 172.16.10.11 needs to communicate with the WAN, the firewall rule needs to be:
Because of this packet flow, the firewall rules for DNAT need to be applied to the public IP
address object and not to the private IP address used after the DNAT translation. When a user
from the WAN sends traffic to 192.168.100.51, the packet is first checked against this firewall
rule, and then NAT changes the destination IP address to 172.16.10.11.
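The ordering just described (firewall decision before DNAT on ingress) can be sketched as follows. This is an illustrative model of the pipeline, not Edge code, using the addresses from the example:

```python
# Illustrative ingress pipeline: the firewall decision is taken before DNAT,
# so rules must match the pre-NAT (public) destination address.
DNAT = {"192.168.100.51": "172.16.10.11"}   # public -> private, from the example

# Firewall rule written against the PUBLIC address: this is the correct form.
FW_RULES = [{"dst": "192.168.100.51", "action": "accept"}]

def ingress(packet):
    # Step 1: firewall lookup on the packet as received (pre-NAT destination).
    action = next((r["action"] for r in FW_RULES if r["dst"] == packet["dst"]), "drop")
    if action != "accept":
        return "dropped"
    # Step 2: DNAT rewrites the destination afterwards.
    return DNAT.get(packet["dst"], packet["dst"])

print(ingress({"src": "10.0.0.1", "dst": "192.168.100.51"}))   # 172.16.10.11

# A rule written against the PRIVATE address never matches the pre-NAT packet:
FW_RULES = [{"dst": "172.16.10.11", "action": "accept"}]
print(ingress({"src": "10.0.0.1", "dst": "192.168.100.51"}))   # dropped
```

The second run shows the common mistake: a rule on the private address silently drops all inbound traffic, because the firewall never sees the post-NAT destination.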
Users from outside need to access an internal web server connecting to its public IP address.
The server internal IP address is 172.16.100.11, the NAT IP address is 192.168.100.6.
The first step is creating the external IP address on the Edge. This IP is a secondary address,
because the Edge already has a primary IP address configured in the 192.168.100.0/24 subnet.
Now pay attention to the firewall rules on the Edge: a user coming from the outside will try to
access the internal server by connecting to the public IP address 192.168.100.6. This implies
that the firewall rule needs to allow this access.
There are several ways to verify NAT is functioning as originally planned. In our example, users
from any source address access the public IP address 192.168.100.6, and after the NAT
translation the packet destination IP address is changed to 172.16.10.11.
show nat
We can see that the packet received by the Edge is destined to the 192.168.100.6 address, while
the return traffic is originated from a different IP address, 172.16.10.11 (the private IP
address). That means DNAT translation is happening here.
Capturing on the Edge internal interface vNic_1, we can see that the destination IP address has
been changed to 172.16.10.11 because of the DNAT translation:
All the servers in VXLAN segment 5001 (associated with the IP subnet 172.16.10.0/24) need to
leverage SNAT translation (in this example to IP address 192.168.100.3) on the outside interface
of the Edge to be able to communicate with the external network.
SNAT Configuration:
Show nat
The user originates a connection to the Web Server on destination port TCP/222 but the NSX
Edge will change it to TCP/22.
In this specific scenario, we want to create the two following SNAT rules.
SNAT Rule 1:
The IP addresses for the devices part of VXLAN 5001 (associated to the IP subnet
172.16.10.0/24) need to be translated to the Edge outside interface address 192.168.100.3.
SNAT Rule 2:
Web-SRV-01a on VXLAN 5001 needs its IP address 172.16.10.4 to be translated to the Edge
outside address 192.168.100.4.
In the configuration example above, traffic will never hit rule number 4 because 172.16.10.4 is
part of subnet 172.16.10.0/24, so its IP address will be translated to 192.168.100.3 (and not the
desired 192.168.100.4).
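The shadowing effect can be reproduced with a small first-match sketch (illustrative only, using Python's standard `ipaddress` module):

```python
import ipaddress

# First-match SNAT table, ordered as in the misconfigured example:
# the broad /24 rule sits above the host-specific rule and shadows it.
snat_rules = [
    ("172.16.10.0/24", "192.168.100.3"),   # whole subnet
    ("172.16.10.4/32", "192.168.100.4"),   # host rule, never reached
]

def translate(src_ip):
    for prefix, public_ip in snat_rules:
        if ipaddress.ip_address(src_ip) in ipaddress.ip_network(prefix):
            return public_ip   # first match wins
    return src_ip              # no rule: address unchanged

print(translate("172.16.10.4"))   # 192.168.100.3, not the desired .4

# After re-ordering (most specific rule first) the host rule matches:
snat_rules.reverse()
print(translate("172.16.10.4"))   # 192.168.100.4
```

This is the same fix the re-ordering below applies: place the more specific NAT rule above the broader one.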
After re-order:
The NSX load balancing service is specially designed for cloud with the following characteristics:
The load balancing services natively offered by the NSX Edge satisfy the needs of the majority
of application deployments. This is because the NSX Edge provides a large set of
functionalities:
Support for any TCP application, including, but not limited to, LDAP, FTP, HTTP, HTTPS
Support for UDP applications starting from NSX SW release 6.1.
Multiple load balancing distribution algorithms available: round-robin, least
connections, source IP hash, URI
Multiple health checks: TCP, HTTP, HTTPS including content inspection
Persistence: Source IP, MSRDP, cookie, ssl session-id
Connection throttling: max connections and connections/sec
L7 manipulation, including, but not limited to, URL block, URL rewrite, content
rewrite
Optimization through support of SSL offload
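As an illustration of two of the distribution algorithms listed above, here is a minimal sketch (not the Edge implementation) of round-robin and source-IP-hash member selection, using two illustrative pool members:

```python
import hashlib
from itertools import cycle

members = ["172.16.10.11", "172.16.10.12"]   # two illustrative pool members

# Round-robin: hand out members in turn.
rr = cycle(members)
def round_robin():
    return next(rr)

# Source IP hash: the same client is always sent to the same member,
# which also gives a simple form of persistence.
def source_ip_hash(client_ip):
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return members[h % len(members)]

picks = [round_robin() for _ in range(4)]
print(picks)   # ['172.16.10.11', '172.16.10.12', '172.16.10.11', '172.16.10.12']
```

Round-robin spreads consecutive requests evenly, while source-IP hash trades even distribution for client stickiness; the Edge lets you pick the algorithm per pool.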
Note: the NSX platform can also integrate load-balancing services offered by 3rd party vendors.
This integration is out of the scope for this paper.
In terms of deployment, the NSX Edge offers support for two types of models:
One-arm mode (called proxy mode): this scenario is highlighted in the figure below and
consists of deploying an NSX Edge directly connected to the logical network for which it
provides load-balancing services.
1. The external client sends traffic to the Virtual IP address (VIP) exposed by the load
balancer.
2. The load balancer performs two address translations on the original packets
received from the client: Destination NAT (D-NAT) to replace the VIP with the IP address of
one of the servers deployed in the server farm and Source NAT (S-NAT) to replace the client
IP address with the IP address identifying the load-balancer itself. S-NAT is required to force
through the LB the return traffic from the server farm to the client.
3. The server in the server farm replies by sending the traffic to the LB (because of the
S-NAT function previously discussed).
4. The LB performs again a Source and Destination NAT service to send the traffic to the
external client, leveraging its VIP as the source IP address.
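The four steps above can be sketched as two address rewrites (an illustrative model, not Edge code; the LB and member addresses are hypothetical):

```python
# One-arm LB sketch: the LB rewrites both destination (VIP -> member) and
# source (client -> LB), so the member's reply naturally returns to the LB.
VIP, LB_IP = "172.16.10.10", "172.16.10.9"   # LB_IP is a hypothetical address
MEMBER = "172.16.10.11"                      # hypothetical selected member

def lb_to_server(packet):
    # D-NAT: VIP -> selected member; S-NAT: client -> the LB itself.
    return {"src": LB_IP, "dst": MEMBER, "orig_client": packet["src"]}

def server_reply(request):
    # The member replies to the LB, which restores the original addresses
    # before sending the traffic back to the client.
    return {"src": VIP, "dst": request["orig_client"]}

fwd = lb_to_server({"src": "192.168.100.86", "dst": VIP})
back = server_reply(fwd)
print(back)   # {'src': '172.16.10.10', 'dst': '192.168.100.86'}
```

Note how the member only ever sees the LB address as the client: this is exactly the loss of client visibility discussed next, and why the X-Forwarded-For insertion mentioned below exists.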
The advantage of this model is that it is simpler to deploy and flexible as it allows deploying LB
services (NSX Edge appliances) directly on the logical segments where they are needed without
requiring any modification on the centralized NSX Edge providing routing communication to the
physical network. On the downside, this option requires provisioning more NSX Edge instances
and mandates the deployment of Source NAT that does not allow the servers in the DC to have
visibility into the original client IP address.
Note: the LB can insert the original IP address of the client into the HTTP header before
performing S-NAT (a function named “Insert X-Forwarded-For HTTP header”). This provides the
servers visibility into the client IP address but it is obviously limited to HTTP traffic.
Inline mode (called transparent mode) requires instead deploying the NSX Edge inline to the
traffic destined to the server farm. The way this works is shown in Figure below.
1. The external client sends traffic to the Virtual IP address (VIP) exposed by the
load balancer.
2. The load balancer (centralized NSX Edge) performs only Destination NAT (D-
NAT) to replace the VIP with the IP address of one of the servers deployed in the server
farm.
3. The server in the server farm replies to the original client IP address and the
traffic is received again by the LB since it is deployed inline (and usually as the default
gateway for the server farm).
4. The LB performs Source NAT to send traffic to the external client leveraging
its VIP as source IP address.
This deployment model is also quite simple and allows the servers to have full visibility into
the original client IP address. At the same time, it is less flexible from a design perspective,
as it usually forces using the LB as the default gateway for the logical segments where the
server farms are deployed; this implies that only centralized (and not distributed)
routing can be adopted for those segments. It is also important to notice that in this case
LB is another logical service added to the NSX Edge already providing routing services
between the logical and the physical networks. As a consequence, it is recommended to
increase the form factor of the NSX Edge to X-Large before enabling load-balancing services.
In terms of scalability and throughput figures, the NSX load balancing services offered by
each single NSX Edge can scale up to (best case scenario):
Throughput: 9 Gbps
Concurrent connections: 1 million
New connections per sec: 131k
Below are some deployment examples of tenants with different applications and different
load balancing needs. Notice how each of these applications is hosted on the same Cloud
with the network services offered by NSX.
The load balancing service can be fully distributed across tenants. This brings
multiple benefits:
Each tenant has its own load balancer.
Each tenant configuration change does not impact other tenants.
Load increase on one tenant’s load balancer does not impact the scale of other tenants’
load balancers.
Each tenant load balancing service can scale up to the limits mentioned above.
The same tenant can mix its load balancing service with other network services such
as routing, firewalling, VPN.
We will add to this lab an NSX Edge Services Gateway (ESG) for the load balancer function.
The ESG (highlighted with the red line) is deployed in one-arm mode and exposes the VIP
172.16.10.10 to load-balance traffic to the Web-Tier-01 segment.
In our lab the appliance size is Compact, but we should choose the right size according to the
amount of traffic expected:
Configure the Edge interface and IP address; since this is one-arm mode we have only one
interface:
Enable Load Balance in the ESG, go to Load Balance and click Edit:
Add a name, in the Type select HTTPS and Enable SSL Passthrough:
In the Algorithm field select ROUND-ROBIN, keep the default https monitor, and add the two
server members to monitor:
In this step we glue all the configuration parts together: we tie the application profile to
the pool and assign the Virtual IP address:
Now we can check that the load balancer is actually working by connecting to the VIP address
with a client web browser.
When we refresh our web browser client we see that we hit 172.16.10.12, web-sv-02a:
5. Pool is in transparent mode but the Edge doesn’t sit in the return path
# show log
#########################################################
#########################################################
##########################################################
####################################################################
One-Arm-LB-0> show service loadbalancer pool
———————————————————————–
Loadbalancer Pool Statistics:
POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)
##########################################################################
rq[f=808202h,i=0,an=00h,rx=4m53s,wx=,ax=] rp[f=008202h,i=0,an=00h,rx=4m53s,wx=,ax=]
s0=[7,8h,fd=13,ex=] s1=[7,8h,fd=14,ex=] exp=4m52s
0x5fe50a22960: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=09
age=0s calls=2 rq[f=c08200h,i=0,an=00h,rx=20s,wx=,ax=]
rp[f=008002h,i=0,an=00h,rx=,wx=,ax=] s0=[7,8h,fd=1,ex=] s1=[7,0h,fd=-1,ex=] exp=20s
———————————————————————–
From the GUI we can see the effect in the members pool status:
To fulfill the specific requirements listed above, it is possible to deploy devices performing a
“bridging” functionality that enables communication between the “virtual world” (logical
switches) and the “physical world” (non virtualized workloads and network devices connected
to traditional VLANs).
NSX offers this functionality in software through the deployment of NSX L2 Bridging allowing
VMs to be connected at layer 2 to a physical network (VXLAN to VLAN ID mapping), even if the
hypervisor running the VM is not physically connected to that L2 physical network.
Figure above shows an example of L2 bridging, where a VM connected in logical space to the
VXLAN segment 5001 needs to communicate with a physical device deployed in the same IP
subnet but connected to a physical network infrastructure (in VLAN 100). In the current NSX-v
implementation, the VXLAN-VLAN bridging configuration is part of the distributed router
configuration; the specific ESXi host performing the L2 bridging functionality is hence the one
where the Control VM for that distributed router is running. In case of failure of that ESXi
host, the ESXi host running the standby Control VM (which becomes active once it detects the
failure of the active one) takes over the L2 bridging function.
Independently from the specific implementation details, below are some important
deployment considerations for the NSX L2 bridging functionality:
The VXLAN-VLAN mapping is always performed in 1:1 fashion. This means traffic for
a given VXLAN can only be bridged to a specific VLAN, and vice versa.
A given bridge instance (for a specific VXLAN-VLAN pair) is always active only on a
specific ESXi host.
However, through configuration it is possible to create multiple bridge instances
(for different VXLAN-VLAN pairs) and ensure they are spread across separate ESXi hosts.
This improves the overall scalability of the L2 bridging function.
The NSX Layer 2 bridging data path is entirely performed in the ESXi kernel, and not
in user space. Once again, the Control VM is only used to determine the ESXi host where
a given bridging instance is active, and not to perform the bridging function.
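The 1:1 constraint from the list above can be expressed as a simple validation check; this is an illustrative sketch of the rule, not NSX code:

```python
# Sketch: a VXLAN maps to exactly one VLAN and vice versa, so a configuration
# tool could validate new bridge instances like this before accepting them.
bridges = {}        # vxlan_id -> vlan_id
used_vlans = set()

def add_bridge(vxlan_id, vlan_id):
    if vxlan_id in bridges or vlan_id in used_vlans:
        raise ValueError("a VXLAN or VLAN may appear in only one bridge instance")
    bridges[vxlan_id] = vlan_id
    used_vlans.add(vlan_id)

add_bridge(5002, 100)   # the VXLAN 5002 <-> VLAN 100 pair used in this section
```

Attempting to bridge the same VLAN (or the same VXLAN) a second time would be rejected, mirroring the platform's 1:1 mapping rule.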
In this scenario we would like to bridge between an App VM connected to VXLAN 5002 and a
virtual machine connected to VLAN 100.
Bridging configuration is done at the DLR level. In this specific example, the DLR name is
Distributed-Router:
Now a VM on the Logical Switch App-Tier-01 can communicate with a physical or virtual machine
on VLAN 100.
Currently in NSX-V 6.1 we can’t enable routing on the VXLAN logical switch that is bridged to a
VLAN.
In other words, the default gateway for devices connected to the VLAN can’t be configured on
the distributed logical router:
The big difference is VXLAN 5002 is no longer connected to the DLR LIF, but it is connected
instead to the NSX Edge.
The DLR Control VM can work in high availability mode: if the active DLR Control VM fails, the
standby Control VM takes over, which means the bridge instance will move to a new ESXi host
location.
Most of the issues I ran into were caused by the bridged VLAN missing from the trunk interface
configured on the physical switch.
The physical server is connected to VLAN 100; the App VM is connected to VXLAN 5002 on esx-01b.
The active DLR Control VM is located on esx-02a, so the bridging function will be active on this
ESXi host.
Both ESXi hosts have two physical NICs: vmnic2 and vmnic3.
The transport VLAN carries all VNI (VXLAN) traffic and is forwarded on the physical switch in
VLAN 20.
On physical switch-2 port E1/1 we must configure a trunk port and allow both VLAN 100 and
VLAN 20.
Note: Port E1/1 will carry both VXLAN and VLAN traffic.
We need to know where the active DLR Control VM is located (if we have HA). Inside this ESXi
host the bridging happens in kernel space. The easiest way to find it is to look at the
“Configuration” section in the “Manage” tab.
Note: when the DLR Control VM is powered off (if HA is not enabled), the bridging function on
that ESXi host stops, to prevent loops.
We can see that the Control VM is located on esx-02a.corp.local.
SSH to this ESXi host, find the Vdr Name of the DLR Control VM:
11.5.1 ~ # xxx-xxx -I -l
##############################################################################
##############################################################################
From this output we can see that there is no MAC address learning yet.
After connecting a VM to Logical Switch App-Tier-01 and pinging a VM in VLAN 100, we can see
MAC addresses from both VXLAN 5002 and VLAN100 segments:
General info:
One VDS for the Edge and Compute workloads.
We have two physical links on each ESXi host.
Two VTEPs with Source_Port as the teaming mode.
All VLANs are trunked to all physical links.
Topology:
VM1 is located on the Compute cluster.
VM2 and the active Control VM are on the MNG/Edge cluster.
VM3 is on the MNG/Edge cluster.
What does not work:
Bridging is not working if the VM located on VLAN 100 is not on the same host as the active
Control VM.
From the net-vdr command, in all tests the bridge can see all the MAC addresses of VM1, VM2 and
VM3.
When it is not working, with pktcap-uw I can see the ARP sent out from VM1 to VM2 on:
1. ESXi1
2. ESXi3 on vmnic0, vmnic1 (VLAN traffic)
3. ESXi2 on vmnic0, vmnic1 (VLAN traffic), but the ARP is not received on the VM2 vDS port-ID or
inside Windows with Wireshark.
Solution:
HP Network loop protection interacts with NSX Bridging, blocking the traffic.
Depending on the role of the VC (HP Virtual Connect) Ethernet port, VC can use several loop avoidance mechanisms. A
VC Ethernet port can be an uplink, a downlink, or a stacking link. VC Ethernet uplink ports connect to
external LAN switches. VC Ethernet downlink ports connect to server NIC ports. VC Ethernet stacking
Link ports connect to other VC Ethernet modules.
If you disable HP loop protection, the NSX bridge starts working.
Working on daily tasks with firewalls can sometimes lead to a situation where you end up
blocking access to the management of your firewall.
This situation is very challenging, regardless of the vendor you are working with.
The end result of this scenario is that you are unable to access the firewall management to
remove the rules that are blocking you from reaching the firewall management!
Think of a situation where you deploy a distributed firewall into each of your ESX hosts in a
cluster, including the management cluster where you have your vCenter server located.
And then you change the Default Rule from the default “Allow” value to “Block” (as shown
below):
Let me show you an example of what you’ve done by implementing this rule:
Like the poor guy above blocking himself from his tree, by implementing this rule, you have
blocked yourself from managing your vCenter.
Put your vCenter (and other critical virtual machines) in an exclusion list.
Any VM on that list will not receive any distributed firewall rules.
Go to the Networking & Security tab and click on NSX Manager.
Click on Manage:
Go to the “Exclusion List” tab and click on the green plus button.
That’s it! Now your VC is excluded from any enforced firewall rules.
12.3 What if we made a mistake and do not yet have access to the
VC?
We can use the NSX Manager REST API to revert to the default firewall ruleset.
By default the NSX Manager is automatically excluded from DFW, so it is always possible to
send API calls to it.
Using a REST Client or cURL:
https://addons.mozilla.org/en-US/firefox/addon/restclient
https://$nsxmgr/api/4.0/firewall/globalroot-0/config
After receiving status code 204, we have reverted to the default DFW policy with the default
rule set to allow.
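A sketch of that call with Python's standard library, assuming the revert is issued as an HTTP DELETE against the URL above and that basic authentication is used (the manager address and credentials are placeholders):

```python
import base64
import urllib.request

# Placeholder manager address and credentials: substitute your own.
NSX_MGR = "nsxmgr.corp.local"
token = base64.b64encode(b"admin:password").decode()

# An HTTP DELETE on the firewall config URL reverts the DFW to its default policy.
req = urllib.request.Request(
    url=f"https://{NSX_MGR}/api/4.0/firewall/globalroot-0/config",
    method="DELETE",
    headers={"Authorization": "Basic " + token},
)
# urllib.request.urlopen(req) would actually send the call; a 204 status
# code confirms that the configuration was reverted.
```

A browser REST client such as the one linked above achieves the same thing by setting the method, URL and Authorization header manually.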
Now we can access our VC! As we can see, we have reverted to the default policy, but don’t
panic: we saved the policy earlier.
We will need to change the last rule from Block to Allow to fix the problem.
Client PC -> NSX Manager | 443/TCP | NSX Manager Admin Interface | HTTPS
REST Client -> NSX Manager | 443/TCP | NSX Manager REST API | HTTPS
REST Client -> NSX Controller | 443/TCP | NSX Controller REST API | HTTPS
NSX Manager -> vCenter Server | 443/TCP | vSphere Web Access | HTTPS
NSX Manager -> vCenter Server | 902/TCP | vSphere Web Access | VMware Internal
NSX Manager -> ESXi Host | 443/TCP | Management and provisioning connection | HTTPS
NSX Manager -> ESXi Host | 902/TCP | Management and provisioning connection | VMware Internal
NSX Manager -> Distributed Firewall | 443/TCP | Management and provisioning connection | HTTPS
NSX Manager -> Distributed Firewall | 902/TCP | Management and provisioning connection | VMware Internal
NSX Controller -> ESXi Host | 8672/TCP | User World Agent connection | VMware Internal
VXLAN Tunnel End Point (VTEP) -> VXLAN Tunnel End Point (VTEP) | 8472/UDP | Transport network encapsulation between VTEP end points | VXLAN
NSX Manager & NSX Controller -> NTP Time Server | 123/TCP+UDP | NTP client connection | NTP
NSX Controller -> NSX Controller | 2878, 2888, 3888/TCP | State sync between controllers | Zookeeper
Some of the steps here can and should be done via the NSX GUI, vRealize Operations Manager
6.0 and vRealize Log Insight, so treat this as an educational post.
There are lots of CLI commands in this post :-). To view the output of a CLI command you can scroll right.
5. Verify the NSX control plane from the ESXi hosts and NSX Controllers.
VMs web-sv-01a and web-sv-02a reside on different compute hosts, esxcomp-01a and esxcomp-02a
respectively.
From esxcomp-01a run the command esxtop then press "n" (Network):
esxcomp-01a # esxtop
PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PKTRX/s MbRX/s %DRPTX %DRPRX
33554433 Management n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00
50331649 Management n/a DvsPortset-0 0.00 0.00 0.00 0.00 0.00 0.00
50331651 Shadow of vmnic0 n/a DvsPortset-0 0.00 0.00 0.00 0.00 0.00 0.00
50331652 vmk0 vmnic0 DvsPortset-0 5.87 0.01 1.76 0.00 0.00 0.00
50331653 vmk1 vmnic0 DvsPortset-0 0.59 0.01 0.98 0.00 0.00 0.00
50331654 vmk2 vmnic0 DvsPortset-0 0.00 0.00 0.39 0.00 0.00 0.00
50331655 vmk3 vmnic0 DvsPortset-0 0.20 0.00 0.39 0.00 0.00 0.00
50331656 35669:db-sv-01a.eth0 vmnic0 DvsPortset-0 0.00 0.00 0.00 0.00 0.00 0.00
50331657 35888:web-sv-01a.eth vmnic0 DvsPortset-0 4.89 0.01 3.72 0.01 0.00 0.00
50331658 vdr-vdrPort vmnic0 DvsPortset-0 2.15 0.00 0.00 0.00 0.00 0.00
In line 12 we can see that "web-sv-01a.eth0" is shown; another important piece of information
is its "Port-ID". The "Port-ID" is the unique identifier for each virtual switch port; in our
example web-sv-01a.eth0 has Port-ID "50331657".
Find the vDS name:
esxcomp-01a # esxcli network vswitch dvs vmware vxlan list
VDS ID VDS Name MTU Segment ID Gateway IP Gateway MAC Network Count Vmknic Count
3b bf 0e 50 73 dc 49 d8-2e b0 df 20 91 e4 0b bd Compute_VDS 1600 192.168.250.0 192.168.250.2 00:50:56:09:46:07 4 1
50331657 68 0
50331658 vdrPort 0
From line 4 we see that a VM is connected to VXLAN 5001 on port ID 50331657, which is the same port ID as that of VM web-sv-01a.eth0.
Verification in esxcomp-01b:
PORT-ID   USED-BY               TEAM-PNIC  DNAME         PKTTX/s  MbTX/s  PKTRX/s  MbRX/s  %DRPTX  %DRPRX
33554433  Management            n/a        vSwitch0         0.00    0.00     0.00    0.00    0.00    0.00
50331649  Management            n/a        DvsPortset-0     0.00    0.00     0.00    0.00    0.00    0.00
50331651  Shadow of vmnic0      n/a        DvsPortset-0     0.00    0.00     0.00    0.00    0.00    0.00
50331652  vmk0                  vmnic0     DvsPortset-0     2.77    0.00     1.19    0.00    0.00    0.00
50331653  vmk1                  vmnic0     DvsPortset-0     0.59    0.00     0.40    0.00    0.00    0.00
50331654  vmk2                  vmnic0     DvsPortset-0     0.00    0.00     0.00    0.00    0.00    0.00
50331655  vmk3                  vmnic0     DvsPortset-0     0.00    0.00     0.00    0.00    0.00    0.00
50331656  35663:web-sv-02a.eth  vmnic0     DvsPortset-0     3.96    0.01     3.57    0.01    0.00    0.00
50331657  vdr-vdrPort           vmnic0     DvsPortset-0     2.18    0.00     0.00    0.00    0.00    0.00

esxcomp-01b # esxcli network vswitch dvs vmware vxlan network port list --vds-name Compute_VDS --vxlan-id=5001
Switch Port ID  VDS Port ID  VLAN ID
--------------  -----------  -------
50331656        69           0
50331657        vdrPort      0
At this point we have verified that the VMs are located on the right ESXi hosts as shown in the topology diagram. It is now time to start the actual troubleshooting steps.
The easy way to find out is by pinging from the VTEP IP address in esxcomp-01a to the VTEP in esxcomp-01b. Before doing that, let's find out the VTEP IP addresses.
esxcomp-01a # esxcfg-vmknic -l
Interface  Port Group/DVPort  IP Family  IP Address      Netmask        Broadcast        MAC Address        MTU   TSO MSS  Enabled  Type
vmk0       16                 IPv4       192.168.210.51  255.255.255.0  192.168.210.255  00:50:56:09:08:3e  1500  65535    true     STATIC
vmk1       26                 IPv4       10.20.20.51     255.255.255.0  10.20.20.255     00:50:56:69:80:0f  1500  65535    true     STATIC
vmk2       35                 IPv4       10.20.30.51     255.255.255.0  10.20.30.255     00:50:56:64:70:9f  1500  65535    true     STATIC
vmk3       44                 IPv4       192.168.250.51  255.255.255.0  192.168.250.255  00:50:56:66:e2:ef  1600  65535    true     STATIC
From line 6 we can tell that the VTEP IP address for vmk3 (MTU 1600) is 192.168.250.51.
Another command to find out the VTEP IP address is:
esxcomp-01a # esxcli network vswitch dvs vmware vxlan vmknic list --vds-name=Compute_VDS
Vmknic Name  Switch Port ID  VDS Port ID  Endpoint ID  VLAN ID  IP              Netmask        IP Acquire Timeout  Multicast Group Count  Segment ID
-----------  --------------  -----------  -----------  -------  --------------  -------------  ------------------  ---------------------  -------------
vmk3         50331655        44           0            0        192.168.250.51  255.255.255.0  0                   0                      192.168.250.0

esxcomp-01b # esxcli network vswitch dvs vmware vxlan vmknic list --vds-name=Compute_VDS
Vmknic Name  Switch Port ID  VDS Port ID  Endpoint ID  VLAN ID  IP              Netmask        IP Acquire Timeout  Multicast Group Count  Segment ID
-----------  --------------  -----------  -----------  -------  --------------  -------------  ------------------  ---------------------  -------------
vmk3         50331655        46           0            0        192.168.250.53  255.255.255.0  0                   0                      192.168.250.0
The VTEP IP for esxcomp-01b is 192.168.250.53. Now let's add this info to our topology.
From esxcomp-01b:
esxcomp-01b # esxcli network ip route ipv4 list -N vxlan
The two ESXi hosts in this example have VTEP IP addresses in the same L2 segment, so they both have the same default gateway; but with this command we can also verify routing for VTEPs in different subnets.
The ping is sourced from the VXLAN IP stack with a packet size of 1570 bytes and the don't-fragment bit set to 1:
esxcomp-01a # vmkping ++netstack=vxlan -d -s 1570 192.168.250.53
The ping is successful.
If the ping with "-d" doesn't work but it works without "-d", that is a clue of an MTU problem. Check the MTU configuration on the physical switches.
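The MTU arithmetic behind these numbers can be sketched in plain shell; the 50-byte overhead figure assumes the standard VXLAN encapsulation without an outer VLAN tag:

```shell
# VXLAN encapsulation overhead: outer Ethernet (14) + outer IP (20) + UDP (8) + VXLAN header (8)
OVERHEAD=$((14 + 20 + 8 + 8))
echo "VXLAN overhead: $OVERHEAD bytes"

# A standard 1500-byte guest packet therefore needs at least this much underlay MTU
echo "Required underlay MTU: $((1500 + OVERHEAD)) bytes"

# The test ping (-s 1570) builds an IP packet of 1570 + 8 (ICMP) + 20 (IP) bytes,
# deliberately close to the 1600-byte VTEP MTU
echo "Test packet size: $((1570 + 8 + 20)) bytes"
```

If the 1598-byte don't-fragment ping fails while a default-size ping succeeds, the underlay MTU is smaller than 1600 somewhere along the path.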
Because the VTEPs on the different ESXi hosts in this example are part of the same L2 domain, we can view the ARP entries for the other VTEPs:
From esxcomp-01a:
esxcomp-01a # esxcli network ip neighbor list -N vxlan
The process responsible for communicating with the NSX Controller is called netcpad.
The ESXi host uses the VMkernel management interface to create this secure channel over TCP/1234; the traffic is encrypted with SSL.
Routing: routes learned from the DLR Control VM (explained in the next post).
Based on this information the Controller learns the network state and builds its directory services.
To learn how the Controller cluster works and how to fix problems in the cluster itself, please refer to the NSX Controller Cluster Troubleshooting section.
For two VMs to be able to talk to each other we need to ensure that the NSX control plane is working fine. In this lab we have 3 NSX Controller nodes.
Verification commands need to be issued from both the ESXi and the Controller sides.
1. On the ESXi hosts, list the established connections to the Controller nodes on TCP/1234:
# esxcli network ip connection list | grep 1234
tcp  0  0  192.168.210.51:54153  192.168.110.202:1234  ESTABLISHED  35185  newreno  netcpa-worker
tcp  0  0  192.168.210.51:34656  192.168.110.203:1234  ESTABLISHED  34519  newreno  netcpa-worker
tcp  0  0  192.168.210.51:41342  192.168.110.201:1234  ESTABLISHED  34519  newreno  netcpa-worker
tcp  0  0  192.168.210.56:16580  192.168.110.202:1234  ESTABLISHED  34517  newreno  netcpa-worker
tcp  0  0  192.168.210.56:49434  192.168.110.203:1234  ESTABLISHED  34678  newreno  netcpa-worker
tcp  0  0  192.168.210.56:12358  192.168.110.201:1234  ESTABLISHED  34516  newreno  netcpa-worker
2. If you have a firewall between the ESXi hosts and the NSX Controller nodes, TCP/1234 needs to be open.
Verify again:
esxcomp-01a # /etc/init.d/netcpad status
Verify that the control plane is enabled on esxcomp-01a and the connection is up for VXLAN 5001:
esxcomp-01a # esxcli network vswitch dvs vmware vxlan network list --vds-name Compute_VDS
VXLAN ID  Multicast IP               Control Plane                         Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
5003      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.202 (up)   2           0                0
5001      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.201 (up)   2           3                0
5000      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.202 (up)   1           3                0
5002      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.203 (up)   1           2                0
Verify that the control plane is enabled on esxcomp-01b and the connection is up for VXLAN 5001:
esxcomp-01b # esxcli network vswitch dvs vmware vxlan network list --vds-name Compute_VDS
VXLAN ID  Multicast IP               Control Plane                         Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
5001      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.201 (up)   2           3                0
5000      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.202 (up)   1           0                0
5002      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.203 (up)   1           2                0
5003      N/A (headend replication)  Enabled (multicast proxy, ARP proxy)  192.168.110.202 (up)   1           0                0
Check that esxcomp-01a has learned the ARP information of remote VMs in VXLAN 5001:
esxcomp-01a # esxcli network vswitch dvs vmware vxlan network arp list --vds-name Compute_VDS --vxlan-id=5001
IP MAC Flags
From this output we can see that esxcomp-01a has learned the ARP info of web-sv-02a.
Check that esxcomp-01b has learned the ARP information of remote VMs in VXLAN 5001:
esxcomp-01b # esxcli network vswitch dvs vmware vxlan network arp list --vds-name Compute_VDS --vxlan-id=5001
IP MAC Flags
From this output we can see that esxcomp-01b has learned the ARP info of web-sv-01a.
esxcomp-01a:
Knows web-sv-01a VM is running in VXLAN 5001, its IP address 172.16.10.11 and its MAC address
00:50:56:a6:7a:a2.
esxcomp-01b:
Knows web-sv-02a VM is running in VXLAN 5001, its IP address 172.16.10.12 and its MAC address
00:50:56:a6:a1:e3.
To answer this question we need to find out first the answer to another question: what does the NSX
controller know?
To find out who is managing VXLAN 5001, SSH to one of the NSX Controller nodes, for example 192.168.110.202:
nsx-controller # show control-cluster logical-switches vni 5001
Line 3 says that 192.168.110.201 is managing VXLAN 5001, so the next commands will be run from 192.168.110.201:
nsx-controller # show control-cluster logical-switches vni 5001
From this output we learn that VXLAN 5001 has 4 VTEPs connected to it and a total of 6 active connections.
At this point I would like to point you to an excellent blog with lots of information on what is happening under the hood in NSX.
The author's name is Dmitri Kalintsev, and the relevant post on his blog is: NSX for vSphere: Controller "Connections" and "VTEPs"
From Dmitri’s post:
"1. When a VM running on that host connects to VNI's dvPg and its vNIC transitions into "Link Up" state; and
2. When DLR kernel module on that host needs to route traffic to a VM on that VNI that’s running
on a different host."
We are not routing traffic between VMs, so the DLR is not part of the game here.
From this output we can see that both VTEPs, esxcomp-01a (line 5) and esxcomp-01b (line 3), are seen by the NSX Controller on VXLAN 5001.
The MAC address output in this command refers to the VTEPs’ MAC.
Find out the MAC addresses associated with the VMs, as learned by the NSX Controller:
nsx-controller # show control-cluster logical-switches mac-table 5001
Find out the ARP entries associated with the VMs, as learned by the NSX Controller:
nsx-controller # show control-cluster logical-switches arp-table 5001
To understand how the Controller has learned this info, read my post NSX-V IP Discovery. In cases where ARP entries are not learned, restarting the netcpad process can fix the issue:
esxcomp-01a # /etc/init.d/netcpad restart
The NSX Controller knows where the VMs are located, along with their IP and MAC addresses. It seems like the control plane is working just fine.
Before starting to capture all over the place, let's try to narrow down where we think the problem could be.
When a VM connects to a Logical Switch, there are a few security services that packets originating from the VM must traverse; each service is represented by a different slot ID.
SLOT 1: the Switch Security module (swsec) captures DHCP ACK and ARP messages to learn the VM IP address. This info is then forwarded to the NSX Controller cluster.
We need to check whether VM traffic is successfully transmitted through the NSX Distributed Firewall, which means through slot 2.
The capture command will need the SLOT 2 filter name for web-sv-01a.
From esxcomp-01a:
esxcomp-01a # summarize-dvfilter
~~~snip~~~~
vNic slot 2
name: nic-35888-eth0-vmware-sfw.2
agentName: vmware-sfw
vmState: Detached
failurePolicy: failClosed
slowPathID: none
vNic slot 1
name: nic-35888-eth0-dvfilter-generic-vmware-swsec.1
agentName: dvfilter-generic-vmware-swsec
vmState: Detached
failurePolicy: failClosed
slowPathID: none
We can see in line 4 that the VM name is web-sv-01a, in line 5 that the filter is applied at slot 2, and in line 6 we have the filter name: nic-35888-eth0-vmware-sfw.2
Local CID 2
Destroying session 25
From the output of this command (line 12) we can tell that ICMP packets are not passing this filter, because we have 0 dumped packets.
We found our smoking gun :-)
Local CID 2
Now we can see at line 6 that we have 6 dumped packets. We can open the web-sv-01a_before.pcap capture file:
esxcomp-01a # tcpdump-uw -r web-sv-01a_before.pcap
20:15:31.389158 IP 172.16.10.11 > 172.16.10.12: ICMP echo request, id 3144, seq 18628, length 64
20:15:32.397225 IP 172.16.10.11 > 172.16.10.12: ICMP echo request, id 3144, seq 18629, length 64
20:15:33.405253 IP 172.16.10.11 > 172.16.10.12: ICMP echo request, id 3144, seq 18630, length 64
20:15:34.413356 IP 172.16.10.11 > 172.16.10.12: ICMP echo request, id 3144, seq 18631, length 64
20:15:35.421284 IP 172.16.10.11 > 172.16.10.12: ICMP echo request, id 3144, seq 18632, length 64
20:15:36.429219 IP 172.16.10.11 > 172.16.10.12: ICMP echo request, id 3144, seq 18633, length 64
Looking back at this section, we intentionally skipped step 3, "Configuration issue". Had we checked the configuration settings upfront, we would probably have noticed this problem immediately.
esxcfg-vmknic -l
esxcli network vswitch dvs vmware vxlan network port list --vds-name Compute_VDS --vxlan-id=5001
esxcli network vswitch dvs vmware vxlan network list --vds-name Compute_VDS
esxcli network vswitch dvs vmware vxlan network arp list --vds-name Compute_VDS --vxlan-id=5001
/etc/init.d/netcpad (status|start|restart)
One of the most challenging problems in managing large networks is the complexity of security
administration. “Role-based access control (RBAC) is a method of regulating access to computer
or network resources based on the roles of individual users within an enterprise. In this context,
access is the ability of an individual user to perform a specific task, such as view, create, or
modify a file. Roles are defined according to job competency, authority, and responsibility
within the enterprise”
Within NSX we have four built-in roles, and we can map users or groups to one of those NSX roles. Instead of assigning roles to individual users, the preferred way is to assign roles to groups.
Organizations create user groups for proper user management. After integration with SSO, NSX
Manager can get the details of groups to which a user belongs.
Within NSX Manager we have four pre-built RBAC roles covering different NSX permissions and areas of the NSX environment.
The four NSX built-in roles are: Auditor, Security Administrator, NSX Administrator and Enterprise Administrator:
Whenever we want to assign a role in NSX, we can assign it to an SSO user or group. When the Lookup Service is not configured, group-based role assignment will not work (i.e. users from that group will not be able to log in to NSX).
The reason is that we cannot fetch any group information from the SSO server. The group-based authentication provider is only available when the Lookup Service is configured. Logins by users that are explicitly assigned a role on NSX are not affected. This means that the customer has to assign roles to users individually and cannot take advantage of SSO groups.
For NSX, the vCenter SSO server is one of the identity providers for authentication. A prerequisite for authentication on NSX is that the user or group has been assigned a role on NSX.
Note: NTP/DNS must be configured on the NSX Manager for lookup service to work.
Note: The domain account must have AD read permission for all objects in the domain tree.
The event log reader account must have read permissions for security event logs.
In this example, I will use Microsoft Active Directory as the user identity source. In "Active Directory Users and Computers" I created four different groups. The groups have the same names as the NSX roles to make life easier: Auditor, Security Administrator, NSX Administrator, Enterprise Administrator.
We created four AD users and added each user to a different AD group. For example, the
nsxadmin user is associated with the group NSX Administrator. This association is done by
clicking on the Add button:
In the same way, it is possible to associate other users to the defined AD groups:
username: AD groups:
Go to “Network & Security” tab and double click on the “NSX Manager”
Note: Configure Domain is not needed for RBAC; it is only needed if we want to use identity firewall rules based on user or group information.
Fill Name and NetBIOS name fields with appropriate information of your Domain Name and
NetBIOS name:
Enter the LDAP (i.e. AD) server IP address or hostname and a domain account (username and password):
Click Next. NSX Manager will try to connect to the LDAP (i.e. AD) server using the above info. If the result is successful, the screenshot on the next page will appear.
This configuration allows the NSX Manager to read Active Directory “Security Event Log”; this
log contains information about the users that logon/logoff from the domain. NSX can use this
information to improve user identity firewall rules.
Now we can map the Active Directory groups to the pre-built NSX Manager roles.
Here we can select whether we want to map a specific AD user to an NSX role, or an AD group to a role.
In this example we use an AD group, so we created an AD group called Auditor. The input format here is:
"group_name"@domain.name. Let's start with the Auditor group; this group has "Read Only" permission:
Select one of the NSX roles; for the Auditor AD group we chose Auditor.
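The "group_name"@domain.name input format can be sanity-checked with a simple pattern; the group and domain below are hypothetical examples, not taken from the lab:

```shell
# Role assignment entries take the form group_name@domain.name
entry="Auditor@corp.local"

# One "@" separating a non-empty group from a dotted domain name
if echo "$entry" | grep -Eq '^[^@]+@[^@]+\.[^@]+$'; then
    echo "valid: $entry"
else
    echo "invalid: $entry"
fi
```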
We can limit the scope of this group to a specific object (port group, datacenter, NSX Edge), in
this example no restrictions are applied:
The login was successful, but where has the "Network & Security" tab gone?
So far we have configured everything on the NSX Manager side, but we didn't take care of the vCenter permission configuration for that group. Confusing?
vCenter has its own role for each group. We need to configure a role for each AD group we configured. These settings determine what the user can do in the vCenter environment.
Let’s start by configuring the Auditor Role for Auditor AD group. We know this group is for
“Read Only” in the NSX Manager, so it will make sense to give this group “Read Only”
permission also for the vCenter environment.
Go to vCenter -> Manage -> Permissions and click the green button:
We need to choose a role from the Assigned Role list; if we select No-Access we will not be able to log in to vCenter, so we need to choose something from "Read-Only" to "Administrator".
Select “Read Only” from the Assigned Role drop down list and click on the “Add” button from
“User and Group”:
From the Domain drop-down select your domain name (in our lab the domain is "CORP"), choose your Active Directory group from the list (Auditor in this example) and click the "Add" button:
We can verify that auditor1 can’t change any other vCenter configuration:
Now test the secadmin user, mapped to the "NSX Security" role; this user cannot make any NSX infrastructure-related change, such as adding a new NSX Controller node:
When logging in as the nsxadmin user, mapped to the NSX Administrator role, we can see that the user can add new Controller nodes:
But the nsxadmin user cannot change or even see any configured firewall rules:
A user that is a member of more than one group gains the combined permissions of all those groups.
For example: if a user is member of the “Auditor” and “NSX Security” groups, the result will be
that the user has read only permission on all NSX infrastructure tasks but also gains access to all
security related areas in NSX.
Summary
In this section we demonstrated the different NSX Manager roles. We configured Microsoft Active Directory as the external database source for user identity.
During November I had the opportunity to take the NSX Advanced bootcamp with one of the brilliant PSO Architects in the NSX field, Kevin Barrass. This section is based on Kevin's lecture; I added screenshots and my own experience.
Upgrading NSX can be very easy if planned right, or very frustrating if we try to take shortcuts in the process. In this section I will try to document all the steps needed for completing an NSX-v upgrade.
Before starting the upgrade procedure, the following pre-upgrade steps must be taken into consideration:
How many times have you faced an issue during the upgrade process, wasted hours of troubleshooting, certain you worked exactly as guided, opened a support ticket and got the answer: "you are hitting a known upgrade issue and the workaround is written in the release notes". RTFM, feeling dumb...?
Download any of your favorite MD5 tools; I'm using the free winMd5Sum.
Compare the MD5 sum you calculate against the official MD5 value on the VMware download site.
http://www.nullriver.com/
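As a sketch of that check using the stock md5sum utility instead of winMd5Sum (the file and expected value below are stand-ins, not a real VMware checksum):

```shell
# Stand-in for the downloaded upgrade bundle
echo "test" > /tmp/bundle.tar.gz

# Compute the local MD5 and compare it against the value published on the download page
SUM=$(md5sum /tmp/bundle.tar.gz | awk '{print $1}')
EXPECTED="d8e8fca2dc0f896fd7cb4cb0031ba249"   # md5 of "test\n", standing in for VMware's value

if [ "$SUM" = "$EXPECTED" ]; then
    echo "MD5 OK - safe to upload"
else
    echo "MD5 MISMATCH - re-download the bundle"
fi
```

Only upload the bundle to NSX Manager once the two values match.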
Again, this one comes from the field. The scenario: you complete the upgrade process and now face an issue. How do you know the issue wasn't there before you started the upgrade?
Do not assume everything is working before you start to touch the infrastructure. Check it!!!
Note the current versions of NSX Manager, vCenter, ESXi and Edges. Verify you can log into:
NSX Manager Web UI
vCenter and see NSX Manager in Plugin
ESG, DLR control VM’s
Check that DRS is enabled on the clusters, validate that vMotion functions correctly, and check the host connection state with vCenter.
During an NSX upgrade, in some situations a cluster with 2 hosts or less can cause issues with DRS/Admission Control/Anti-Affinity rules. My recommendation for a successful upgrade process is to work with 3 hosts in each cluster you plan to upgrade.
Starting from 6.0.4 we have a special API call to take a snapshot of a controller:
https://NSXManagerIPAddress/api/2.0/vdn/controller/controllerID/snapshot
For example, with curl (credentials and output filename are placeholders):
curl -k -u admin:PASSWORD https://NSXManagerIPAddress/api/2.0/vdn/controller/controllerID/snapshot -o controller-snapshot
Some browsers may remove the .gz extension. If the file looks like:
VMware-NSX-Manager-upgrade-bundle-6.1.0-X.X.gz
change it to:
VMware-NSX-Manager-upgrade-bundle-6.1.0-2107742.tar.gz
Otherwise you will get an error after completing the upload of the upgrade bundle to NSX Manager:
To begin the NSX Manager upgrade, open the NSX Manager web interface and select the
Upgrade section:
Note: NSX Manager will reboot during the upgrade process, but the forwarding path of VM workloads will not be affected during this step, unless we are using user identity with the Distributed Firewall and a new user logs in while the NSX Manager is down.
The upgrade process consists of two steps: validating the tar.gz image and starting the actual upgrade.
When NSX Manager finishes the validation process, the upgrade can start:
After completing the upgrade, confirm the version from the Summary Tab of the NSX Manager
Web UI:
During the upgrade of the NSX Controller cluster, the upgrade file is downloaded to each node;
the process will then start to upgrade node1, then node2 and at the end node3.
During the upgrade of the NSX Controller cluster we will face this state:
Node1: completed the upgrade (to release 6.1 in this specific example)
Node2: Is rebooting
Node3: In Normal state but still running the old version 6.0.0.
Result: we have only one active node running 6.1, and this node has lost cluster majority because of the version mismatch with the other running controller node. As a consequence, the Controller cluster is down in this situation.
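The majority rule behind this is simple arithmetic: a Zookeeper-style cluster of N nodes needs floor(N/2)+1 members in agreement to stay up. A quick sketch:

```shell
# Majority (quorum) size for an N-node cluster
majority() { echo $(( $1 / 2 + 1 )); }

echo "3-node cluster majority: $(majority 3)"
# Mid-upgrade only one node (node1, on 6.1) is usable: 1 < 2, so the cluster is down
```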
While the Controller cluster is down, it is possible to have issues when performing live migration of VMs between ESXi hosts (vMotion). This is the case if a VM is migrated to a specific ESXi host and it is the first VM on that host in a given VXLAN segment; the other VTEPs won't be able to receive this information from the Controller (since it is down), which implies that multi-destination traffic (such as ARP requests) originated by VMs in that VXLAN segment will never be sent to the ESXi host the VM moved to. This issue may be exacerbated in virtual environments leveraging DRS, as vMotion can happen dynamically and without user intervention.
To limit the impact of this issue, my recommendation is to change the DRS setting to manual to
control the occurrences of vMotion events during the NSX Controller update process!!
Note: After completing the controller upgrade, change it back to the previous configuration.
Another issue may occur if the DLR Control VM gets a dynamic routing update because of a
topology change (for example a new route is added or removed); in this case, this information
cannot be communicated to the NSX Controller cluster, which implies that the kernel
forwarding tables on the ESXi hosts cannot be updated. The update will then be performed as
soon as the Controller cluster is activated.
Back to our example: when controller node-2 completes its reboot, we have two controllers upgraded and running the same 6.1 version. At that point we regain cluster majority (and the Controller cluster is reactivated), even though controller node-3 still needs to finish its upgrade and reboot.
When all three controller nodes have completed rebooting, the cluster is fully upgraded and functional.
During the upgrade of the NSX clusters, each ESXi host requires a reboot, but there will be no impact on the data plane for VMs because they will be automatically evacuated to other hosts by DRS.
If DRS is disabled, the vSphere admin will need to manually move the VMs and then reboot the ESXi host.
This is the main reason why admission control with only 2 hosts in the cluster may prevent the automatic host upgrade. My recommendation is to avoid 2-host clusters, or to manually evacuate a host and put it into maintenance mode.
If you have manually created an anti-affinity rule for the Controller nodes (in the current NSX release this is not done automatically, as it is for example for the DLR Control VMs), a cluster with 3 hosts will prevent the upgrade.
To solve the problem, disable this anti-affinity rule by unchecking "Enable rule" so that the automatic host upgrade can proceed, and enable it again after the upgrade is completed.
With the default anti-affinity rules for Edges/DLR, 2 hosts will prevent the upgrade. Uncheck "Enable rule" on the Edge anti-affinity rules to allow the automatic host upgrade, and enable it again after the upgrade is completed.
If an upgrade is available for a Cluster, an “Update” link is available in the NSX UI. When the
upgrade is initiated, NSX Manager updates the NSX VIBs on each ESXi host.
Task view will reveal what happens while the upgrade process is running:
This process can affect the forwarding plane; we can minimize the traffic outage by deploying multiple Edges working in ECMP mode.
If an upgrade is available for Guest Introspection / Data Security, an upgrade link is available in the NSX UI.
Follow NSX installation guide for specific details on upgrading Guest Introspection / Data
Security.
The previous NSX Manager backup is only valid for the previous release.
Microsoft NLB can work in two different modes: unicast or multicast.
With unicast mode, NSX and Microsoft NLB work together, and a VM in the same or a different VXLAN was able to ping the NLB IP address.
With multicast mode, NSX-v will not work without a statically set ARP entry.
I found that the VDR will not learn the NLB multicast MAC address (03:xx:xx:xx:xx:xx).
This is because of RFC 1812: RFC 1812 - Requirements for IP Version 4 Routers
"A router MUST not believe any ARP reply that claims that the Link Layer address of another host or
router is a broadcast or multicast address"
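The RFC's restriction concerns multicast Link Layer addresses, and whether a MAC is multicast is determined by the least significant bit of its first octet. A quick sketch (the 03 prefix used by NLB has that bit set; the full addresses below are made-up examples):

```shell
# A MAC address is multicast when the low-order bit of the first octet is 1
is_multicast() {
    first=$(( 0x$(echo "$1" | cut -d: -f1) ))
    echo $(( first & 1 ))
}

echo "03:bf:01:01:01:01 multicast bit: $(is_multicast 03:bf:01:01:01:01)"
echo "00:50:56:a6:7a:a2 multicast bit: $(is_multicast 00:50:56:a6:7a:a2)"
```

This is why the DLR refuses to learn the NLB address from ARP and a static entry is needed.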
Also read this VMware KB:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10
06558
Solution:
Configure a static ARP entry in the DLR with the command:
net-vdr -nbr -a -i dstIp -m destMac -n lifname vdrName
where dstIp is the NLB VIP address and destMac is the NLB multicast MAC address.