
NMX Server Redundancy Setup

© 2017 Harmonic. All rights Reserved.


Table of Contents

1. Introduction of NMX Redundancy
1.1 What is NMX Redundancy
1.2 How do the 2 NMX servers decide which is the active controller of the head end equipment
2. How to setup NMX redundancy
3. What are the prerequisites for NMX Redundancy
4. Preparing the System for Configuring Redundancy
5. Step by Step instructions for setting up Redundancy
6. What happens on NMX server failure
7. Master NMX Recovery
8. Set Backup As Master (Role Swap)
9. Other menus in NMX Redundancy
10. Stale Catalog detection
11. Rebooting NMX redundancy setup
12. Redundancy Setup Considerations
13. Troubleshooting Tips
    1. Redundancy setup failure or Replication failure alert popping up
    2. If the Redundancy Wizard complains about NIC (Network Interface Card) Priority
    3. Log Files
    4. Not able to launch domain manager

2016 Harmonic Inc. CONFIDENTIAL Page 2 of 22


1. Introduction of NMX Redundancy

1.1 What is NMX Redundancy


NMX redundancy ensures that when there is a failure of a SQL server or an NMX process, or if
there is a network outage to a master server controller, a head-end administrator can still control
and manage Harmonic equipment successfully using a Backup NMX server. NMX redundancy
also provides a Backup control server when the master server controller is removed for scheduled
maintenance.

Specifically, NMX redundancy sets up database replication of the Master NMX server database
onto the Backup NMX server database. The two servers maintain identical databases using SQL
replication mechanisms, but only the master server has all the processes to control head-end
devices. The master and the Backup servers exchange messages for arbitration purposes, and
when the active server fails, the Backup server takes over.

Replication can only be set for NMX Catalogs. There is no support for SMS (reports) catalog
replication. The SMS catalog is copied from the Master NMX to the Backup NMX during redundancy
setup, and no data is pushed to the Backup when SMS data changes. Instead, the Backup
NMX pulls the SMS data every day at 6:00 AM using an automated SQL Server job
(PullSMSData). You can see this job on the Backup NMX from SQL Server Management Studio
(SSMS) -> SQL Server Agent -> Jobs. This job only runs on the Backup NMX when the Master NMX
is active. If the Backup takes over before 6:00 AM, all the changes made to the SMS
catalog on the Master NMX are lost for that day. Because this SMS data is not critical to the
customer, this is not a big issue; the system is designed this way.

Please note that when the redundancy is set and Master NMX is active, the catalog name on the
Backup would be NMX_REPLICATION. SMS catalog name is the same on Master as well as
Backup NMX.

1.2 How do the 2 NMX servers decide which is the active controller of the head
end equipment
NMX Master and Backup servers are peers that monitor each other through a constant
messaging mechanism (heartbeat messages). When one of the servers does not receive a
heartbeat message, that NMX server first pings its own gateway to confirm that it is itself
connected to the network.

If the Backup NMX server is connected to the network and loses 3 consecutive heartbeat
messages from the Master server, it takes control by becoming the active server. This is
considered a fail-over. Note that in this case only the Backup NMX is monitoring the devices,
and there is no protection available if the active Backup NMX were to fail.
So as soon as you see this happen, it is suggested that you perform either the Master Recovery
or the Set Backup as Master operation to put the NMX system back in redundancy.

If the Master NMX server is connected to the network and loses 3 consecutive heartbeat
messages from the Backup server, it retains control. As soon as the Backup NMX comes back
up, it starts sending heartbeat (HB) messages to the Master again. On detecting them, the Master
NMX server puts the Backup NMX in Standby mode without any user intervention on the Master.

Once the Backup server takes control, it renames and prepares the databases under the standard
database names that the customer originally gave them on the Master. All the connected client
applications time out and shut down until reconnected.



When the Backup takes over as the active server after the fail-over, it asserts two alarms: one
indicating that the Master active server has failed, and another to indicate that the Backup server
has taken over successfully and has become active.

The frequency at which the two servers exchange HB messages is set at 5 seconds.
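
The arbitration rules in this section can be summarized in a short sketch. It is illustrative only
and is not NMX code: the class name, role labels and return values are hypothetical, and the
behaviour when the gateway ping fails (the miss is not counted) is an assumption, since the text
does not say what an isolated server does.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 5   # from the text: HB messages are exchanged every 5 seconds
MISSED_LIMIT = 3           # from the text: 3 consecutive lost heartbeats


@dataclass
class PeerMonitor:
    """Hypothetical model of one server watching its peer."""
    role: str              # "master" or "backup" (illustrative labels)
    missed: int = 0

    def on_interval(self, heartbeat_received: bool, gateway_reachable: bool) -> str:
        """Return the decision for one heartbeat interval."""
        if heartbeat_received:
            self.missed = 0
            return "ok"
        if not gateway_reachable:
            # Assumption: a server that cannot reach its own gateway treats
            # itself as disconnected and does not count the miss.
            return "isolated"
        self.missed += 1
        if self.missed < MISSED_LIMIT:
            return "waiting"
        # The Backup fails over; the Master simply keeps control.
        return "take_over" if self.role == "backup" else "retain_control"


backup = PeerMonitor(role="backup")
decisions = [backup.on_interval(False, True) for _ in range(MISSED_LIMIT)]
print(decisions)  # the third consecutive miss triggers the take-over
```

With a 5-second interval, three consecutive misses span roughly 15 seconds; the actual fail-over
time on a real system may differ.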

2. How to setup NMX redundancy


Use the Setup menu item from the Domain Manager top-level 'Server Fail Safe' menu to set up
the redundancy. This is a four-step process in which the actual Backup machine is identified,
validated to be identical to the Master server, and set up with the databases from the Master
server, and the virtual IP address is applied to the active Master NMX. All the databases that are
associated with the Master active controller are copied using the SQL Server replication
mechanism known as publisher-subscriber replication.

Determine in advance whether an SMS server is needed. If it is, add it before NMX
redundancy is set up to avoid duplicating the effort, since redundancy will have to be
removed if these databases are to be added later.

Keep a backup copy of all the databases before setting up redundancy as a precaution.

If the server has multiple NIC cards, ensure that the default gateway is defined for the
management NIC interface (to view the default gateway, use ipconfig from the command
prompt).

Please ensure that the Master and Backup NMX servers are configured with the same time zone;
the NMX redundancy wizard does not currently validate this.

Ensure that the PC Server name is less than 15 characters long (NETBIOS limit). Having
longer names will cause NMX Redundancy setup to fail.

The NMX redundancy wizard runs many validations on the Master and Backup NMX; the full list of
validations is given in section 3.
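
Before running the wizard, the naming rules can be pre-checked by hand or with a few lines of
script. The sketch below is illustrative only and is not part of NMX: the function name is
hypothetical, the case-insensitive comparison is an assumption, and the rules are taken from the
note above and from the SAME_PC_NAME and DEFAULT_PC_NAME checks listed in section 3.

```python
DEFAULT_PC_NAME = "NMX-SERVER"  # name rejected by the DEFAULT_PC_NAME check
MAX_NAME_LEN = 15               # the note above asks for names shorter than this


def check_pc_names(master: str, backup: str) -> list[str]:
    """Return a list of naming problems; an empty list means the names look fine."""
    problems = []
    for label, name in (("Master", master), ("Backup", backup)):
        if len(name) >= MAX_NAME_LEN:
            problems.append(f"{label} name '{name}' is not shorter than {MAX_NAME_LEN} characters")
        # Assumption: names are compared without regard to case.
        if name.upper() == DEFAULT_PC_NAME:
            problems.append(f"{label} name must not be {DEFAULT_PC_NAME}")
    if master.upper() == backup.upper():
        problems.append("Master and Backup names must differ")
    return problems


# Hypothetical machine names, used only for illustration.
print(check_pc_names("NMX-MASTER01", "NMX-BACKUP01"))
```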

After NMX redundancy is set up, use the Auto-restart menu under the Server Fail Safe top level to
set up both the Master and Backup for auto restart and provide the appropriate login and
password information. Typically the Windows Administrator login is used for this purpose. The
password used cannot be an empty string. If any user other than Administrator is specified, then
that user must belong to the Administrator group on that machine.

After you go through the wizard successfully, the active NMX transitions through various states
and turns green, and the Backup NMX turns yellow and stays in the Standby state. If the Backup
shows ‘Unreachable’ or ‘Stopped’, ping it from the Master to ensure that the Backup is reachable.
If the ping is successful, try shutting down and restarting the Master.

The Virtual IPs setup screen allows the user to enter virtual IPs and select the physical MACs to
which they need to be assigned. The user can select a virtual IP for every physical NIC card
on the PC.
The Main option on the VIP menu identifies the management NIC.

Please ensure that the virtual IP addresses belong to the same subnet as the selected Master and
Backup NICs and are not already assigned on the network. This check is now handled
by the validation wizard.
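
The subnet part of this rule can be checked ahead of time with Python's standard ipaddress
module. This is a minimal sketch, not the wizard's own implementation; the function names and
the example addresses are hypothetical, and it does not test whether the virtual IP is already in
use on the network.

```python
import ipaddress


def vip_in_nic_subnet(vip: str, nic_ip: str, netmask: str) -> bool:
    """True if the virtual IP lies in the subnet of the given NIC address."""
    network = ipaddress.ip_network(f"{nic_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(vip) in network


def check_vip(vip: str, master_nic_ip: str, backup_nic_ip: str, netmask: str) -> bool:
    """True if the VIP shares a subnet with both NICs and is not one of their own addresses."""
    return (
        vip_in_nic_subnet(vip, master_nic_ip, netmask)
        and vip_in_nic_subnet(vip, backup_nic_ip, netmask)
        and vip not in (master_nic_ip, backup_nic_ip)
    )


# Example addresses are placeholders, not values from a real installation.
print(check_vip("192.168.10.50", "192.168.10.11", "192.168.10.12", "255.255.255.0"))
```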



Please check whether any hosts files are present on the servers and, if so, that they resolve the
IP to the name correctly.

When setting up on a network WITHOUT any DNS servers, make sure that the hosts file is
properly set up with the IP and PC names for the peer machine, i.e., the hosts file on the
Master should have entries for the Backup PC and vice versa. This check is now handled by
the validation wizard.

These virtual IP addresses are assigned on the active Domain after successful domain startup
and remain assigned until shutdown or a redundancy switch-over happens. Please note, however,
that disabling and enabling the NIC card using Windows Network settings will remove the virtual IP
assignment. The user should refrain from doing so on an active Domain.

3. What are the prerequisites for NMX Redundancy


NMX redundancy can now be set up whether the domain is stopped or running. Similarly,
NMX redundancy can be removed whether the domain is running or stopped.

NMX automatically checks various validation rules upon redundancy setup. Once
redundancy is set up, these checks can also be performed manually by the user by clicking the
Server Fail Safe -> Validate Backup Configuration option.

The table below shows all the checks that NMX verifies for you at the time of
redundancy setup or post setup.

There are two ways to trigger the redundancy validation wizard.


1) Server Fail Safe -> Validate Backup Configuration
2) Server Fail Safe -> Setup

The redundancy validator carries out various checks on the Master and Backup NMXs. If it finds
any errors, it reports them and the user is given the option to fix them. It runs a
comprehensive list of checks during redundancy setup so that, once redundancy is set up, the
chances of replication failure are very low. The user is given an option to save the
redundancy validation results to a file.

Here is the full list of checks the validator carries out.


S.No.  Tag Name              AutoFix  Type   Check Time  Rule Description
1      GENERIC_ERROR         FALSE    ERROR  Any Time    General Error
2      DOMAIN_CONFIGURED     FALSE    ERROR  Any Time    Domain IP on Local PC
3      NMX_DSN               FALSE    ERROR  Post Setup  Master and Backup should have same NMX DSN
4      SMS_DSN               FALSE    ERROR  Post Setup  Master and Backup should have same SMS DSN
5      VALID_NICS            FALSE    ERROR  Post Setup  Valid Master and Backup NIC Interfaces
6      NDDS_ID               TRUE     ERROR  Any Time    Master and Backup should have same NDDS ID
7      SINGLE_GATEWAY        FALSE    ERROR  Any Time    At least one gateway is configured on NMX
8      BACKUP_ACTIVE         FALSE    ERROR  Any Time    Backup NMX should not be Active
9      SQL_NAME_DSN          FALSE    ERROR  Any Time    SQL Server Name in DSN belong to Local PC
10     DB_SQL_NAME           TRUE     ERROR  Any Time    SQL server name is matching machine name
11     NMX_VERSION           FALSE    ERROR  Any Time    Master and Backup should have same NMX Version
12     SQL_VERSION           FALSE    ERROR  Any Time    Master and Backup should have same SQL Version
13     REDUNDANCY_MODE       FALSE    ERROR  Any Time    Backup NMX should not be already in Redundancy
14     WINDOWS_VERSION       FALSE    ERROR  Any Time    Master and Backup should have same Windows Version
15     DOT_NET_FRAMEWORK     FALSE    ERROR  Any Time    NMX has correct version of .Net Framework
16     IP_PING               FALSE    ERROR  Any Time    Ping Peer Domain IP
17     GW_IP_PING            FALSE    ERROR  Any Time    Ping Gateway from Peer Domain
18     NAME_PING             FALSE    ERROR  Any Time    Ping Machine Name from Peer Domain
19     PING_OWN_GW           FALSE    ERROR  Any Time    Ping NMX Gateway from own Domain
20     SAME_PC_NAME          FALSE    ERROR  Any Time    Master PC name should not be identical to Backup name
21     DEFAULT_PC_NAME       FALSE    ERROR  Any Time    PC name should not be same as NMX-SERVER
22     BACKUP_NMX_CATALOG    TRUE     INFO   Any Time    NMX Catalog should not exist on Backup DB
23     BACKUP_SMS_CATALOG    TRUE     INFO   Any Time    SMS Catalog should not exist on Backup DB
24     ENABLE_DCOM           TRUE     ERROR  Any Time    DCOM setting should be enabled on NMX
25     NIC_PRIORITY          FALSE    ERROR  Any Time    NMX Domain Nic should have highest priority
26     NIC_DHCP_ENABLED      FALSE    ERROR  Any Time    NMX Domain Nic should have DHCP disabled
27     SQL_PORT              TRUE     ERROR  Any Time    Master and Backup should have same SQL listening port
28     WINDOWS_HOST_FILE     TRUE     ERROR  Any Time    Windows Host file has right entries
29     NMX_USER_PASSWORD     TRUE     INFO   Any Time    Master and Backup should have same NMX User Password
30     TEXTCOPY_PATH         FALSE    ERROR  Any Time    NMX has TextCopy in system path
31     FIRMWARE_VERSIONS     FALSE    ERROR  Any Time    Master NMX has all the required Firmware Versions
32     TFTP_FOLDER_SYNC      TRUE     ERROR  Any Time    Backup NMX should have all the Master NMX Firmware Versions
33     REPLICATION_STATUS    TRUE     ERROR  Running     Replication Status
34     VIRTUAL_IP_PING       TRUE     INFO   Running     Can Ping Virtual IPs
35     SYSTEM_TIME_MATCHING  FALSE    ERROR  Any Time    Master and Backup NMX have same time
36     LICENSE_CHECK         FALSE    ERROR  Any Time    Is Backup License Matching Master License

Background color indicates that the check can be auto-fixed by NMX (AutoFix is TRUE).
Strike-through indicates that the check has been removed.

The redundancy wizard shows all the validation results to the user in a spread sheet. If any
validation fails and cannot be fixed automatically, the user needs to intervene and correct those
errors manually. After correcting them, the user can revalidate or set up the redundancy again.
If there are failed validations that can be fixed automatically, the wizard provides a fix option
to the user. Please note that the auto fix button is disabled while any manual validation failure
is present; it is enabled only after the user corrects those errors.



There are 3 types of error categories in the spread sheet.
1. Manual Fix – These must be manually fixed by the user, after which redundancy setup is
restarted. (e.g.: NIC Priority, Windows version mismatch, NMX version mismatch etc.)
2. Auto Fix – If there are no manual fix items present, auto fix items can be fixed by
NMX and redundancy can proceed (e.g.: NDDS ID, NMX Password, SQL server
name)
3. Info – These are warnings that the user must be aware of; redundancy can proceed even
if info items are present. (e.g.: DHCP enabled, catalog exists on Backup, firmware version
mismatch)
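
The relationship between the three categories and the state of the fix button can be sketched as
follows. This is an illustration under stated assumptions, not the wizard's actual logic: the
CheckResult type and function names are hypothetical, and the mapping from the table's AutoFix
and Type columns to a category is assumed (failed INFO rows are treated as Info, failed ERROR rows
as Auto Fix or Manual Fix depending on AutoFix).

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Hypothetical record for one row of the validation results."""
    tag: str
    passed: bool
    auto_fix: bool   # AutoFix column of the checks table
    severity: str    # Type column: "ERROR" or "INFO"


def classify(result: CheckResult) -> str:
    """Map one result to passed / info / auto_fix / manual_fix (assumed mapping)."""
    if result.passed:
        return "passed"
    if result.severity == "INFO":
        return "info"
    return "auto_fix" if result.auto_fix else "manual_fix"


def fix_button_enabled(results: list[CheckResult]) -> bool:
    """Fix is offered only when auto-fix items exist and no manual-fix item remains."""
    kinds = [classify(r) for r in results]
    return "auto_fix" in kinds and "manual_fix" not in kinds


results = [
    CheckResult("NDDS_ID", passed=False, auto_fix=True, severity="ERROR"),
    CheckResult("IP_PING", passed=True, auto_fix=False, severity="ERROR"),
]
print(fix_button_enabled(results))   # only an auto-fix failure is present
results.append(CheckResult("NIC_PRIORITY", passed=False, auto_fix=False, severity="ERROR"))
print(fix_button_enabled(results))   # a manual-fix failure now blocks the button
```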

Check Site ID matches on both servers - This site id can be verified by visually looking at the
HLORManager.ini file on the two machines that are going to participate in the NMX redundancy. If
they are different, the software will correct them during the setup process.
Check NMX software version on the two servers - The software version can be verified either
by looking under the Harmonic registry entry or by launching the Domain manager GUI and
checking the version in the about box.

Verify Network connectivity – Ensure that Master can ping Backup NMX by machine name, and
vice versa. If proper DNS servers are not configured, please add hosts file entries to enable
machine name pinging.

Machine name and SQL name - For NMX redundancy to work correctly, each of the two servers
must have distinct hostnames and distinct SQL server names. Validation wizard makes sure this
is true.

Embedded software - When NMX redundancy is set up, the system copies files.ini, which is
a listing of the embedded software present on the Master server, to the Backup server for its use
when fail over happens. If the embedded software on the Master and Backup NMX does not match,
the validation wizard warns the user, and on domain start up NMX automatically
syncs the Master firmware onto the Backup NMX. After redundancy setup, whenever new
embedded software is manually added to the TFTP directory, it must be copied to both the Master
and Standby machines. Alternatively, use the Tools->Transfer Software option from the Master DM GUI.

Backup and Restore of catalogs - When NMX redundancy has been applied and the catalogs
backed up, the redundancy information will also be backed up, including virtual address settings.
Care must be taken to restore these catalogs only on systems that have identical settings for the
restored database to work correctly.

Connections to serial devices - When setting up NMX redundancy, the user must verify that the
serial device connections that exist in the system can be controlled from either server. This can
be achieved either directly, if the serial device supports two or more connection ports, or by the
use of 3rd party port servers that simulate serial COM connections like the DigiServer port. Please
refer to the corresponding manuals for using such devices. Failure to comply with this requirement
will result in switches being in timeout after server fail over.

Outlook email account – The user should ensure that identical outlook email setup exists on the
Backup NMX PC, such that when the Backup server becomes active, the NMX can successfully
continue to send VOD statistics and alarm statistics reports.



4. Preparing the System for Configuring Redundancy
Before you begin configuring NMX redundancy, you need to follow some basic procedures to
prepare the system.

To install NMX redundancy on servers and test access

Note: When you perform a new NMX installation you must restart your computer and login to
Domain Manager and create a new catalog. Refer to the procedure shown below.

1. Make sure ECL license is configured on both Master and Backup NMX.

2. Install the same version of NMX software on the Backup server.

3. Configure the default gateway and ensure that the Backup and master servers can
access it.

4. Provide different computer names for the master and Backup server. If you have the
DNS server established, skip step 5.

5. Edit the hosts file in each computer so that both computers can access each other by
computer name and IP numeric address.

The hosts file is located at C:\windows\system32\drivers\etc\hosts

• On the Master computer, add a line with the Backup machine's management IP
address and its name.

• On the Backup computer, add a line to this file with the Master machine's
management IP address and its name.

6. Test the access (by pinging) by numeric IP address and access by computer name to
verify connections.

• From the Command window, enter ping <computername>. If the system responds
with "unknown host," the computers are not set up correctly.

7. Open the Domain Manager and stop the domain manager on the Backup server if it is
running.

8. Back up all databases on the Master server. (NMX and SMS catalogs)

9. On the Master computer, click on Database -> Catalog Management to open the
Database Configuration.

The Database configuration window opens.

10. Create all necessary database catalogs on the Master server, including the NMX and
SMS catalogs.
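
For steps 5 and 6 above, the following sketch shows what the hosts file lines look like and how
name resolution might be checked from a script. It is illustrative only: the machine names and
IP addresses are placeholders, and both helper functions are hypothetical. A hosts file entry is
an IP address followed by whitespace and the host name.

```python
import ipaddress
import socket


def hosts_entry(ip: str, name: str) -> str:
    """Build one hosts file line; raises ValueError if the IP address is malformed."""
    ipaddress.ip_address(ip)
    return f"{ip}\t{name}"


def resolves_to(name: str, expected_ip: str) -> bool:
    """True if the computer name resolves to the expected IPv4 address on this machine."""
    try:
        return socket.gethostbyname(name) == expected_ip
    except socket.gaierror:
        return False   # comparable to the "unknown host" response from ping


# Placeholder values: line to add on the Master, then on the Backup.
print(hosts_entry("192.0.2.12", "NMX-BACKUP01"))
print(hosts_entry("192.0.2.11", "NMX-MASTER01"))
```

After editing the hosts file, calling resolves_to("NMX-BACKUP01", "192.0.2.12") on the Master
(with the real name and address) gives a quick check in addition to the ping test in step 6.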



5. Step by Step instructions for setting up Redundancy

1. In the Domain Manager, select Server Fail Safe > Setup. The NMX Redundancy Setup
Wizard opens.

2. Read the first screen, ensure that the system is compliant with the suggested guidelines,
and then click Next.

3. In Step 2 of the wizard, define the Backup server details.

Enter the Backup server IP address.



4. Click Next. The Redundancy Configuration Validation dialogue box appears.
This dialogue box presents a check list of passed and failed validations for both Master and
Backup servers. If there are no failed validations this dialogue box is skipped. However, each
entry that appears in the Failed Validations section is classified as either fixable manually
or fixable automatically. There can also be warnings, which are informational only and do not
impact the redundancy setup process.

Manual failed validations cannot be fixed by NMX; these issues must be fixed by the user before
the redundancy setup process can go forward. If any manual fail items are detected, the
fix button is disabled. If only automatic fail items are present, the fix button is enabled, and
the user can click it and continue with the redundancy setup.

If all of the validations are passed, then click on Close.


5. Click Fix to correct any failed validations. Otherwise, you have to manually correct the failed
validation.

6. Optionally, Click Save to file... in order to save the validation results to an *.xml file so you can
troubleshoot the validation issue later.

7. Click Close to proceed to the third step of the Redundancy Setup Wizard.
The Define Virtual IPs dialogue box appears.

8. Click Add to define a unique virtual IP address. The virtual IP address is not mandatory, but is
needed if external clients are connected to the system and are designed to handle the
concept of server redundancy. The virtual IP address is assigned to active NMX NICs only.
When a failover occurs this virtual IP is moved to the Backup server by NMX.

9. In the Virtual IP field, add an IP address. The virtual IP address must belong to the same
subnet and cannot be already assigned on the network.


10. You can define one virtual IP address for each NIC on the Master. The virtual IP address
maps to the MAC address. When you shut down the Domain Manager, the virtual IP
address is released.

11. From the Master MAC Address drop-down list, select the MAC address of a NIC card on
the master server. The virtual IP address is allocated when you start the domain, not during
redundancy setup.

12. From the Backup MAC Address drop-down list, select the MAC address of the
corresponding NIC card on the Backup server. If you want to add another virtual IP address
for a different NIC, click Add, and follow the same process for the Master and Backup.

13. Select the Main check box to show which virtual address corresponds to the management
NIC. Main signifies the management NIC of the NMX server.

14. Click OK. The next step of the wizard shows the process of establishing the replication
between the Master and Backup databases and updates the registry keys. If the setup is
successful, you will see a Redundancy Setup Completed Successfully message.

15. Click Finish to close the wizard. Once you have set up NMX redundancy, you will see three
columns in the Domain Manager window: Component, State on Master, and State on
Backup. The last two columns display the state of the component on the master and
Backup PCs. The status of any firmware copying is displayed at the bottom of the Domain
Manager interface. During the copying process, the master server is accessible but the
Backup server is not available.

16. (Optional) Test the setup by forcing a switch, followed by recovering the master server to
ensure and verify the redundancy setup is working correctly.



Once the redundancy setup is complete, on domain start the Master NMX domain comes up first
and its state goes green. Before the Backup goes into the “Standby” state, the firmware
versions are synced; this can take a few minutes depending on the difference in firmware
versions between the Master and Backup NMX. Note that only Master versions are synced to the
Backup. If the Backup has additional firmware, it is not synced back to the Master during this time.

Once the firmware version sync is done, the virtual IP is set, and then the Backup NMX goes
into “Standby” mode.

6. What happens on NMX server failure


Once the Backup domain manager decides to take over, it stops the replication engine from
copying data to the replication database. The replicated databases are then renamed and
prepared for use as standard databases. After this the NMX domain manager triggers itself for a
data load and follows the entire standard steps to come up to the post begin management stage.
Also, a red blinking alert, "Backup Mode," appears in the lower right corner of Domain Manager
GUI.



Once the Backup has taken over, the system is back up and fully functional, but there is no server
redundancy available. It is recommended that the user find the cause of the failure and fix it as
soon as possible.

After you restore the failed master server, you should immediately perform one of the following
operations:

Master Recovery: Recover the master server. Choose this option if the restored Master server is
better or faster than the Backup server. This operation, performed when the domain is not
running, copies the Backup database to the Master database and sets the Master to "Active" and
the Backup to "Standby."

Set Backup as Master: Set the Backup server as the master server. Choose this role-swapping
option if you want to set the Backup server as the new Master server, and set the failed and
restored master server as the new Backup server. This function is available while the server is
running.

All the DSM GUI clients will time out and shut down when a fail over occurs. The TCP clients will
also detect a timeout, and after the fail over the Backup NMX will be available for reconnection.
Since the virtual IP address migrates to the server that is actually in control, the
GUI and the clients need to know only one IP address to connect to.

The Backup PC can be in one of 2 modes: ‘Backup Standby’ or ‘Backup Active’. When in ‘Backup
Active’ mode, two menu items, ‘Master Recovery’ and ‘Set Backup as Master’, are available to the
user.

When the Backup takes over after detecting a failure, it asserts two alarms: one to indicate that
the Master has failed and another to indicate that the Backup server has taken over successfully.
These alarms can be seen in the alarm viewer at the site level on the Backup server that is
active.



If an NMX redundancy switch-over happens while new software is being downloaded to the
devices (in progress and not completed), the Backup will try, when it takes over, to
set the devices back to the original software version, because the TFTP server of the
system also changes when the NMX redundancy switch takes place. Hence care should be
taken when downloading new code to the devices.

The user also has the option to manually switch to the Backup NMX by clicking the Server Fail
Safe -> Switch To Backup option. This option is only enabled when the server is running. This
step follows exactly the same procedure as an automatic failure switch-over.
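
The conditions under which each operation in this section can be performed are summarized in
the sketch below. It is a reading aid, not NMX code: the function, the mode labels and the
parameter names are hypothetical, and tying Switch To Backup to the state in which the Master is
active is an assumption (the text only says the option is enabled when the server is running).

```python
def available_actions(mode: str, domain_running: bool, master_available: bool = True) -> list[str]:
    """Return the fail-safe operations that the text says can be performed.

    mode is "master_active" or "backup_active" (illustrative labels).
    """
    actions = []
    if mode == "master_active" and domain_running:
        # Manual switch-over is only enabled while the server is running.
        actions.append("Switch To Backup")
    if mode == "backup_active" and master_available:
        # The role swap may be performed while the server is running.
        actions.append("Set Backup as Master")
        if not domain_running:
            # Master Recovery is performed only when the domain is not running.
            actions.append("Master Recovery")
    return actions


print(available_actions("backup_active", domain_running=True))
print(available_actions("backup_active", domain_running=False))
```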

7. Master NMX Recovery


Master Recovery
This option is used to reinstate the redundancy as per the initial configuration, and is only
available when the domain is in ‘Backup Active’ mode and the master PC is available for use. Master
recovery cannot be done while the domain is running; to recover the Master, you need to stop the
domain first.

This Operation will run the recovery process that will copy the Backup catalogs to the Master NMX
and set back the mode on Master to ‘Master Active’ and set Backup state to ‘Standby’.

8. Set Backup As Master (Role Swap)
This option is available only on the Backup domain manager when the domain is in ‘Backup
Active’ mode and master PC is available.

This Operation will set the active Backup as the new master, while the failed master will be set as
the new Backup server. The role swap operation could be invoked even when the server is
running.



9. Other menus in NMX Redundancy
Auto Restart Preferences
When logging into the Master server, a menu item named ‘Auto-restart’ is available under
‘PC Fail Safe’. This menu item allows the user to activate/deactivate auto restart of NMX and
define login parameters for both servers.



Remove Redundancy
1. From Domain Manager, select Server Fail Safe > Remove (This menu option is visible
only if NMX redundancy has been set.)

2. The Remove Redundancy dialog box opens.

3. Enter a new site ID for the backup machine.

Note: The site ID is a unique number for each machine except in the case of NMX redundancy,
in which case both machines share the same ID.

4. Click OK.

When the operation completes, a message indicating that the NMX redundancy was
removed successfully displays.

This option is available only when the server is not running. This operation will remove the entire
redundancy definitions, setting back all PCs and releasing the Backup for normal use.

The user will be prompted to enter a distinct site ID for the Backup server.
Redundancy can be removed when Master is active or when Backup is active.
If redundancy is removed when Master is active, the catalog on the Master is in up-to-date state.
If redundancy is removed when Backup is active, the catalog on the Backup is in up-to-date state.



10. Stale Catalog detection
A catalog is considered stale if there is any indication that the catalogs on the Backup NMX are
out of sync with Master NMX catalogs. When this is detected, “Replication Failed” alert will be
flashed on DM gui.

If there is a Replication Failed alarm, the manual “Switch To Backup” option pops up a dialog box
with a stale catalog warning along with the last replication time of the catalog. If the user is
willing to proceed, clicking OK continues.

After an automatic fail-over, if the domain manager detects a stale catalog on the Backup, the
domain will not come up. The user should manually start the domain; NMX then pops up a
dialog box with a stale catalog warning along with the last replication time of the catalog. In
this case the user has two options. If the catalog on the Master is believed to be up to date, the
user can go back to the Master catalog by removing the redundancy manually and reconfiguring
it. If the user knows that no change was made to the catalog during that period, the user can go
ahead with the stale catalog on the Backup domain.

Backup NMX having Stale catalog can cause Service outages, but the big problem is to detect the
stale catalog.

Our approach to detecting stale catalogs is for the Master domain to periodically (every 5 seconds)
insert a timestamp (the current time) into a database table (Domain_ProductVersion). The Backup
NMX periodically (every 5 seconds) checks the timestamp in its replicated copy of the table and
compares it against the current time. If the two are more than 2 minutes apart, Backup database
replication is considered to be out of order. When this is detected, a “Replication Failed” alert
is flashed on the DM GUI.
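
The comparison described above can be sketched in a few lines. This is an illustration, not the
NMX implementation: the function name is hypothetical, and reading the replicated timestamp from
the database is left out. Taking the absolute difference, so that a Backup clock running behind
the Master is also caught, is an assumption.

```python
from datetime import datetime, timedelta

STALE_THRESHOLD = timedelta(minutes=2)   # from the text: a 2-minute gap
CHECK_INTERVAL_S = 5                     # from the text: checked every 5 seconds


def replication_is_stale(last_replicated: datetime, now: datetime) -> bool:
    """True if the replicated timestamp and the current time are more than 2 minutes apart."""
    return abs(now - last_replicated) > STALE_THRESHOLD


# Illustrative values: a timestamp written by the Master and two Backup clock readings.
stamp = datetime(2017, 1, 1, 12, 0, 0)
print(replication_is_stale(stamp, stamp + timedelta(seconds=5)))
print(replication_is_stale(stamp, stamp + timedelta(minutes=3)))
```

Because the check compares a time written by the Master with the Backup's own clock, it depends on
both servers having matching system time, which is why the validator includes the
SYSTEM_TIME_MATCHING check.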

11. Rebooting NMX redundancy setup


Rebooting the setup that is participating in redundancy is very tricky. Let us look at different ways
of rebooting machines that are participating in redundancy.
1. When Master NMX is active, it is safe to reboot Backup NMX.
2. When Master NMX is active, and Backup is in Standby mode, you cannot simply reboot
Master server. If you do that, Backup NMX will take over. If you don’t mind Backup taking
over, then manually switch to Backup and then restart Master NMX. It is recommended
that you then do a Master Recovery process to put the system back into redundancy.
3. If the idea is to reboot both machines, then the proper order is to first shut down Backup
NMX, then reboot Master NMX and start the domain on the Master, and then
start the Backup NMX.
4. If the Backup is Active, it is safe to reboot Master NMX

12. Redundancy Setup Considerations


1. Once you set up redundancy and it is active, you cannot add servers, add or remove
element managers, update PC settings, change catalogs, restore catalogs or upgrade
your NMX version. You must remove NMX redundancy before taking any of these actions.
2. Do not disable a NIC card from your network configuration while your system is in
redundancy mode or the virtual IP address will be lost.
3. Do not configure two NIC cards on the same PC subnet or the TFTP daemon will
malfunction.



13. Troubleshooting Tips

1. Redundancy setup failure or Replication failure alert popping up


• Check that both the Master and Backup server are configured with the same time
and time zone; the redundancy wizard does not check for the same time zone.

2. If the Redundancy Wizard complains about NIC (Network Interface Card) Priority
• Change the NIC order by going to: Control Panel > Network and Internet > Network
Connections. Select Advanced > Advanced Settings. In the Connections box, use the
arrows to move the required NIC to top of the list.

3. Log Files
• To help you and Harmonic Support identify any NMX issues, the DomainManager.log
file is generated with any critical failures and saved to:
C:\ProgramData\Harmonic\NMX\SharedFiles
Send it to Harmonic Support for further debugging and investigation.
Mark each file name with Master or Backup so that support can identify which
log file belongs to which server.

4. Not able to launch domain manager


• If you are not able to launch domain manager on either the Master NMX or the
Backup NMX, you can manually remove the redundancy using registry settings.
Contact support for details if you get into this situation. The method is not described
here because it is very risky, and you have no way to know which NMX has
the up-to-date catalogs.
