Abstract
1. Introduction
High availability (HA) is a primary design goal in any communications
infrastructure. Definitions of HA vary, but a typical contractual requirement is
“4 nines,” or 99.99%, which corresponds to about 52 minutes of downtime per year.
A central feature of high-availability designs is the ability to maintain a signal
path in the event of component failure. Often this is done with patented or
proprietary applications [1,2]. This paper presents an extensible,
easy-to-implement approach to an HA design using commercially available software.
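The downtime figure quoted above follows directly from the availability percentage. The short Python sketch below reproduces the arithmetic, assuming a 365-day year; the function name is illustrative and does not come from any tool used in this work.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(availability: float) -> float:
    """Return the downtime budget in minutes per year.

    `availability` is a fraction between 0 and 1 (0.9999 for "4 nines").
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


# Compare the downtime budgets for three, four, and five nines.
for target in (0.999, 0.9999, 0.99999):
    print(f"{target:.3%} available: {annual_downtime_minutes(target):.1f} min/year")
```

For 99.99% the budget is roughly 52.6 minutes per year, so a single manual re-route that takes tens of minutes can consume most of the annual allowance.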
2. Problem Statement
SCPC satellite links and circuit-switched networks are harder to protect with
redundant equipment than an IP-based signal train. IP networks can have
redundant equipment positioned in line in the signal train because they allow for
multiple simultaneous connections. Switching protocols such as Spanning Tree
allow for loop-free multiple switch connections, and routing protocols redirect
traffic in the event of link failure. None of these mechanisms is available to this
maritime customer. The signal must route to a single defined port at the customer
premises. This means that if a component fails, the original circuit has to be torn
down and a new path created to the customer demarc through manual intervention.
The goal is to reduce the time it takes to accomplish this.
3. Test Setup
A test network simulating this customer’s network was set up as shown in Fig. 2.
The abstraction layer controls the entire terrestrial circuit-switched network and
the modem RF chain. A modem was put into failure mode; the correlation engine
sensed this, put a new modem online, configured it for operation, and re-routed
the signal through the terrestrial network by issuing commands to the
equipment. This approach shows great promise for reducing downtime and increasing
availability in similar communications networks.
A set of Comtech CDM-570L modems was provisioned on the “teleport” side
of the test network. To simulate an RF connection, transmit RF was connected
to receive RF, and each modem was configured so that it would lock onto its own
carrier when it transmitted.
The customer end of the network was simulated with a Fireberd serial tester. A
test pattern from the Fireberd traverses the network to the modem and back to the
tester via the modem RF loop. If the modem or terrestrial path fails, the path
must be torn down and a new one created, because the signal must arrive at a
specified customer demarc (the Fireberd in this case).
In order to accomplish this, the abstraction layer must perform the following
actions:
Fig. 3 shows the beginning of the correlation script. The script is activated when
any critical alarm appears on the modem. The definition of what constitutes a
critical alarm can be specially tailored so the script is only activated when
desired.
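The idea of tailoring the trigger condition can be illustrated with a small Python sketch. It is not the DataMiner correlation rule itself; the `Alarm` fields, the element name, and the parameter names are hypothetical and serve only to show how a narrow filter keeps the script from firing on unrelated alarms.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Alarm:
    element: str    # equipment that raised the alarm (hypothetical naming)
    parameter: str  # alarmed parameter (hypothetical naming)
    severity: str   # e.g. "warning", "major", "critical"


def should_trigger(alarm: Alarm,
                   watched_elements: Set[str],
                   trigger_parameters: Set[str]) -> bool:
    """Return True only for critical alarms on watched modems and parameters."""
    return (
        alarm.severity == "critical"
        and alarm.element in watched_elements
        and alarm.parameter in trigger_parameters
    )


# Hypothetical rule: react only to transmit-side faults on the primary modem.
watched = {"modem1"}
parameters = {"tx_scrambler", "tx_carrier"}

print(should_trigger(Alarm("modem1", "tx_scrambler", "critical"), watched, parameters))
print(should_trigger(Alarm("modem1", "fan_speed", "critical"), watched, parameters))
```

Keeping the watched elements and parameters as data rather than hard-coded conditions makes it straightforward to widen or narrow the rule without touching the decision logic.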
The correlation script executes pre-defined automation scripts, which are created
in the Automation module of DataMiner. Fig. 4 shows these scripts. In
sequence, they (1) re-route the path, (2) turn the failed modem off, and (3) turn
the spare modem on.
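The ordering of the three steps can be sketched in Python as follows. This is a simplified model under stated assumptions, not the automation scripts of Fig. 4: equipment is represented by an in-memory dictionary, all function and key names are hypothetical, and the choice to stop at the first failed step is an assumption made for the illustration.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], bool]]

# Hypothetical in-memory stand-in for the terrestrial path and the two modems.
state: Dict[str, object] = {"path": "primary", "modem1_tx": True, "modem2_tx": False}


def reroute_path() -> bool:
    """Step 1: deactivate the primary terrestrial path and activate the backup."""
    state["path"] = "backup"
    return True


def failed_modem_off() -> bool:
    """Step 2: stop the failed modem from transmitting."""
    state["modem1_tx"] = False
    return True


def spare_modem_on() -> bool:
    """Step 3: start transmitting on the spare modem."""
    state["modem2_tx"] = True
    return True


def run_failover(steps: List[Step]) -> List[str]:
    """Run the steps in order, stopping at the first one that reports failure.

    Returns the names of the steps that completed.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed


steps: List[Step] = [
    ("re-route path", reroute_path),
    ("failed modem off", failed_modem_off),
    ("spare modem on", spare_modem_on),
]
completed = run_failover(steps)
print(completed)
print(state)
```

Returning the list of completed steps gives an operator a record of how far the sequence progressed if one of the commands does not succeed.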
Fig. 5 shows the first of these automation scripts, which deactivates the primary
terrestrial path and activates the backup path.
Figure 5: Automation script which deactivates the primary terrestrial path and
activates the secondary
The two other scripts, which turn off the primary modem transmitter and turn on
the backup modem transmitter, are not shown here in the interest of space.
4. Results
The test setup was configured as described above. A modem fault was simulated
on modem 1 by turning the transmit scrambler off, which generated a critical
alarm. The script worked as configured when the alarm was generated: the
terrestrial path was re-routed, modem 1 stopped transmitting, and modem 2
started transmitting.
1. The signal path must route to a specific customer interface, and multiple
simultaneous paths are not possible.
2. The signal train components cannot be configured to fail over to an
alternate path by themselves.
3. There are sufficient spares to support a 1:1 sparing ratio.
1. Extend the correlation and automation script so that 1:N sparing can be
accommodated—i.e., make the script complex enough to configure an
element from scratch and put it in the signal train.
2. Operationally test this approach in a live network, to determine if it has
a measurable effect on circuit availability.
References