
mcRNC Brief Description and Introduction to Basic Troubleshooting

BCN-B
PHYSICAL DESIGN OF ONE MODULE

[Figure: front view and rear view of one module]


CONFIGURATION STANDARD

• STEP1 (S1-B2): TWO (2) MODULES
• STEP3 (S3-B2): FOUR (4) MODULES
• STEP7 (S7-B2): EIGHT (8) MODULES


SITE VIEW
INTERNAL VIEW OF MOTHERBOARD AND ADD-IN CARDS
mcRNC Example Configuration
Box Controller Node Ethernet interfaces (BCN-B)
Provided interfaces and supported standards

• 2x USB: software download
• 1x RJ-45: hardware maintenance, debugging interfaces
• 1x SFP: tracing
• 9x SFP+: 7x BCN interconnect, 2x UTRAN interfaces (10GE external ports SFP+ 11, SFP+ 12)
• 10x SFP: UTRAN interfaces (SFP13 – SFP22)
• 4x RJ-45: alarm and sync interfaces, EM (not used by mcRNC), DCN
BCN-B ETHERNET INTERFACES
BACKPLANE CONNECTIVITY
EXTERNAL CONNECTIVITY

O&M CONNECTIVITY

• CFPUs and EIPUs are the only add-in cards that implement the transport and IP layers (ref. TCP/IP protocol)
• CFPUs are directly linked to the LAN1 ports for the OAM connection to OMS and NetAct
• EIPUs are directly linked to SFP11 to SFP22
USEFUL COMMANDS TO UPDATE KNOWLEDGE AND TROUBLESHOOTING

• Know the EIPUs in quantity and positions
> show hardware inventory list brief
• Know which EIPUs serve which roles and which SFP ports they are connected to
> show networking interface
• Check backplane inter-module cabling
1. Go to directory /opt/nokiasiemens/SS_QNTools/script
2. Execute command ./zcablecheck.sh
• How to check mcRNC clustering (BACKPLANE)
1. Go to directory /opt/nokiasiemens/SS_QNTools/script
2. Execute command ./zbackb.sh
• How to check SFP port state
Note: Each BCN1=m10=1; BCN2=m20=2; BCN3=m30=3; BCN4=m40=4 etc
#zjane 1 ZLAI
#zjane 2 ZLAI
#zjane 3 ZLAI
#zjane 4 ZLAI
etc
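The per-module port check above follows a simple pattern, so the command list can be generated instead of typed by hand. The following is a minimal sketch, assuming the box index passed to zjane equals the BCN module number as the note suggests. The function name sfp_check_commands is hypothetical, and the code only builds command strings; it does not run anything on the mcRNC.

```python
def sfp_check_commands(module_count):
    """Build one 'zjane <n> ZLAI' command string per BCN module."""
    if module_count < 1:
        raise ValueError("module_count must be at least 1")
    # Assumption: box index n equals the BCN module number (BCN1 -> 1, ...).
    return [f"zjane {n} ZLAI" for n in range(1, module_count + 1)]


if __name__ == "__main__":
    # Example: a four-module configuration.
    for command in sfp_check_commands(4):
        print(command)
```

Generating the list keeps the check consistent across two-, four- and eight-module configurations.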
• mcRNC HW STATUS
1. Check status of add-in cards
#hwcli --t
2. Check all add-in cards and managed objects
#hwcli

• mcRNC temperature status
1. Go to LMP
#ssh root@lmp-1-1-1
#ipmitool sdr
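The ipmitool sdr listing mixes temperatures with other sensor types, so it can help to filter the temperature rows from a saved copy of the output. The sketch below assumes the common three-column layout of ipmitool sdr (name | reading | status). The sensor names and values in SAMPLE are invented for illustration and were not captured from a BCN module; parse_sdr_temperatures is a hypothetical helper name.

```python
def parse_sdr_temperatures(sdr_text):
    """Return {sensor_name: (celsius, status)} for temperature rows only."""
    readings = {}
    for line in sdr_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue  # not a 'name | reading | status' row
        name, reading, status = fields
        if not reading.endswith("degrees C"):
            continue  # skip fan speeds, voltages and disabled sensors
        readings[name] = (float(reading.split()[0]), status)
    return readings


# Illustrative input only: sensor names and values are assumptions.
SAMPLE = """\
Inlet Temp       | 27 degrees C      | ok
CPU Temp         | 61 degrees C      | ok
Fan 1            | 3200 RPM          | ok
PSU Temp         | disabled          | ns
"""

if __name__ == "__main__":
    for name, (celsius, status) in parse_sdr_temperatures(SAMPLE).items():
        print(f"{name}: {celsius:.0f} C ({status})")
```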
HARDWARE ITEMS

BCN-B
All add-in cards in the mcRNC look the same – CFPU, CSPU, USPU & EIPU

Processor add-in card (BMPP2-B) – Octeon 2 variant B

Memory module for processor add-in card


Add-in filler card (BFC-A)
• Dummy module with no electrical components
• Placed in empty card slots to ensure proper cooling of the BCN module
Hard disk drive carrier AMC (HDSAM-A)
• AMC (HDSAM-A) is a mid-size (single-width, 4 HP) AMC module
• Provides serial attached SCSI (SAS) storage in the system
• HDSAM-A is equipped with a 2.5-inch small form factor serial attached SCSI (SAS) hard disk drive
• Hard disk drive needs to be acquired separately
BCN AMC filler (BAMF-A)
• AMC filler is a dummy module with no electrical components
• Empty AMC bays must always be equipped with AMC fillers to ensure proper cooling of the BCN module
• AMC filler also acts as an EMC shield
AC power distribution unit (BAPDU-A)
• Used in 19-inch cabinet installation
• Takes its input power from the site power supply (180-264 V)
• Eight circuit breakers installed in the front panel
• One PDU provides eight outputs
• Can power up to eight BCN modules if the two PSUs in each module take power from two different PDUs
• Can power up to four BCN modules if the two PSUs in each module take power from the same PDU
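The two capacity figures above follow from the output count: each BCN module has two PSUs and each PDU has eight outputs. As a rough sketch of that arithmetic, the hypothetical helper pdus_needed below estimates how many PDUs a site needs. It assumes one PDU output per PSU and ignores breaker ratings and site power limits, which the source does not describe.

```python
import math

OUTPUTS_PER_PDU = 8   # from the text: one PDU provides eight outputs
PSUS_PER_MODULE = 2   # from the text: two PSUs in each BCN module


def pdus_needed(module_count, separate_feeds):
    """Return how many PDUs are needed to power module_count BCN modules."""
    if module_count < 1:
        raise ValueError("module_count must be at least 1")
    if separate_feeds:
        # Each PDU of a pair feeds one PSU per module: 8 modules per pair.
        pairs = math.ceil(module_count / OUTPUTS_PER_PDU)
        return 2 * pairs
    # Both PSUs of a module share one PDU: 8 // 2 = 4 modules per PDU.
    modules_per_pdu = OUTPUTS_PER_PDU // PSUS_PER_MODULE
    return math.ceil(module_count / modules_per_pdu)


if __name__ == "__main__":
    for modules in (2, 4, 8):
        print(modules, pdus_needed(modules, True), pdus_needed(modules, False))
```

For eight modules both wiring schemes use two PDUs; the difference is that separate feeds keep every module powered if one PDU fails.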
AC power supply units (BAFE-B)
• 1200-watt redundant AC power unit, variant B
• Located on the rear of the BCN module
• Hot swappable and has an IEC 320 C20 type input which operates on 230 VAC
• Two outputs to BCN module
• Main output with 12V for all BCN electronics including HW management
• Standby output with 3.3V for BCN HW management
Main fan (BMFU-A/BMFU-B)
• For cooling the BCN
• Contains two dual-fans
• Located on the rear of the BCN module
• Fan speed is controlled by the hardware management system to regulate the temperature within the BCN
• BMFU-A is used in BCN-A with max rotation speed 3700 rpm
• BMFU-B is used in BCN-B with max rotation speed 4000 rpm
• Dimensions (H x W x D): 142 mm x 140 mm x 75 mm
Fan for the AMCs (BAFU-A)
• For cooling the AMCs that are installed in BCN
• Located on the rear of the BCN module
• Fan speed is controlled by the hardware management system to regulate the temperature within the BCN
• Dimensions (H x W x D): 95 mm x 75 mm x 105 mm
Air filter (BAFI-A)

• Located at the front of the BCN module in the cooling air inlet
• Prevents dust from accumulating inside the equipment
• Meets the NEBS GR-63-CORE and GR-78-CORE requirements
Network Architecture
Z-Commands

Check associations for all external connectivity (ACCESS (IUB) & CORE)
> show troubleshooting z-commands zassoc
Check Bidirectional Forwarding Detection (BFD) status
> show troubleshooting z-commands zbfd
Check IUB transport
> show troubleshooting z-commands zbts
List measurement files waiting for transfer to OMS
> show troubleshooting z-commands zifo meafile -f
List measurement files in temporary working directory
> show troubleshooting z-commands zifo meafile -r
List measurement files transferred to OMS
> show troubleshooting z-commands zifo meafile -t
Check SFP port status
> show troubleshooting z-commands zjane box 10 zlai
Check SFP interface to port mapping
> show troubleshooting z-commands zjane box 10 zmap
Check switch running configuration (main & extension)
> show troubleshooting z-commands zjane box 10 zr
Check L2 packet count statistics
> show troubleshooting z-commands zjane box 10 zs port portnumber 0/7 targetswitch e
Check detailed L2 packet count statistics
> show troubleshooting z-commands zjane box 10 zss show-switch-port-counters port-name portname SFP13
Check up times of BCN and add-in cards
> show troubleshooting z-commands zjane box 10 zu
Check memory utilization (press Ctrl+C to stop)
> show troubleshooting z-commands zmem
Check recovery group states
> show troubleshooting z-commands zrg state <option>
where <option> is one of:
-all - list all recovery groups
-d - list disabled recovery groups
-e - list enabled recovery groups
-l - list locked recovery groups
-u - list unlocked recovery groups
Symptom Data Reports

- A symptom data report is one of the items you must provide when requesting support from Nokia or the support team. It falls under the category of problem reporting.

- It can also guide your own investigation and troubleshooting, if you can read the logs.
AGENDA

SYMPTOM DATA REPORTS

COLLECTING STANDARD SYMPTOM REPORTS

LISTING THE STANDARD SYMPTOM REPORTS

COPYING STANDARD SYMPTOM REPORTS TO REMOTE MACHINE

DELETING THE STANDARD SYMPTOM REPORTS

PRACTICAL SESSION
SYMPTOM DATA REPORTS

Standard symptom report is a framework used to collect symptom/behavior data from Multicontroller RNC to support troubleshooting of problems.

The framework collects standard symptoms data by running individual or multiple plugins.

Plugins are specific internal programs designed to do checks, monitor and collect logs. Some of the plugins are put into groups, thus it is possible to collect the symptom data on the group level too.

Most of the Multicontroller RNC configuration data and logs essential for investigation of the problem can be collected with the standard symptom plugin report group-RNC and subreport-MessageMonitor.
group-RNC is a group plugin which contains all the plugins below:
subreport-rnchw
subreport-rncinfo
subreport-rncipconfig
subreport-rncsignaling
subreport-rncrnw
subreport-rnchas
subreport-rncalarm
subreport-rncuplane
subreport-rncmon
subreport-ipmgmt
subreport-networkresiliency

The name of the plugin defines its function and expected log.
The framework allows easy enhancement with additional plugins, if the available standard plugins are not sufficient.
The collected symptoms data is stored in the /mnt/backup/stdsymp directory.
The data must be collected as soon as possible after an abnormal situation has taken place and before any recovery action is performed, such as Multicontroller RNC restarts or replacing hardware.
This is important because the stored information about the problem may be overwritten over time or lost because of recovery actions.
The standard symptom report group-RNC is expected to complete in around 15 minutes, depending on the Multicontroller RNC configuration.
Recovery actions can be started, if needed, as soon as symptom report generation is completed.
COLLECTING STANDARD SYMPTOM REPORTS

Save or collect the standard symptom reports.
To save (collect) the standard symptom report, enter the following command:
> save symptom-report <name> include <plugin>

The basic collection that Nokia typically asks for is:
> save symptom-report RNC260 include group-RNC include subreport-MessageMonitor

NB:
(1) name
(a) cannot exceed 25 characters and must start with a letter
(b) cannot be duplicated
(2) If you collect a symptom report while logged in to CFPU-0, it is stored in the /mnt/backup/stdsymp directory under CFPU-0
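The naming rules in the NB can be checked before the command is entered, which avoids a rejected collection at the moment the data matters most. The following is a minimal sketch that checks only the rules stated above (length, first character, duplicates); whether other characters are restricted is not stated in the source, so it is not checked. The helper names validate_report_name and build_save_command are hypothetical.

```python
def validate_report_name(name, existing_names=()):
    """Return a list of rule violations; an empty list means the name is usable."""
    problems = []
    if len(name) > 25:
        problems.append("name exceeds 25 characters")
    first = name[:1]
    if not (first.isascii() and first.isalpha()):
        problems.append("name must start with a letter")
    if name in existing_names:
        problems.append("name is already used by another report")
    return problems


def build_save_command(name, plugins):
    """Build the save command string with one 'include' per plugin."""
    includes = " ".join(f"include {plugin}" for plugin in plugins)
    return f"save symptom-report {name} {includes}"


if __name__ == "__main__":
    name = "RNC260"
    issues = validate_report_name(name, existing_names=["RNC259"])
    if issues:
        print("; ".join(issues))
    else:
        print(build_save_command(name, ["group-RNC", "subreport-MessageMonitor"]))
```

The list of existing names would come from the output of show symptom-report all.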
LISTING THE STANDARD SYMPTOM REPORTS

List the symptom reports
To list all the standard symptom reports, enter the following command:
> show symptom-report all

List the detailed contents of a particular symptom report
> show symptom-report <name>

Display the list of available plugins
> show symptom-report plugin-list

List the group of plugins based on the group names
> show symptom-report group group-RNC
COPYING STANDARD SYMPTOM REPORTS TO REMOTE MACHINE

OPTION-1
Use SCP to copy the standard symptom reports to a remote machine.
Note that the report files are stored in the /mnt/backup/stdsymp directory.
Switch to bash shell (type exit later on to return to fsclish).
> shell bash full
# scp /mnt/backup/stdsymp/<report_name>.tar <username>@<external server IP>:<external server folder path>
Note that you can use the wildcard symbol "*" in the command to copy all the report files from the /mnt/backup/stdsymp directory.

OPTION-2
Use FileZilla FTP Client
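For OPTION-1, the scp arguments are easy to get wrong by hand, especially the user@host:path target. The sketch below only shows how the arguments fit together; the user name, address and remote folder in the example are placeholders, and build_scp_argv is a hypothetical helper. Whether Python is available on the node is not stated in the source, so treat this as an illustration rather than a tool to run on the mcRNC.

```python
import glob
import shlex

REPORT_DIR = "/mnt/backup/stdsymp"  # report location given in the text


def build_scp_argv(username, host, remote_dir, report_name=None,
                   report_dir=REPORT_DIR):
    """Return the scp argument list for one report, or for all reports."""
    if report_name is None:
        # Expand the "*" wildcard here: an argument list is not shell-expanded.
        sources = sorted(glob.glob(f"{report_dir}/*.tar"))
        if not sources:
            raise FileNotFoundError(f"no report files in {report_dir}")
    else:
        sources = [f"{report_dir}/{report_name}.tar"]
    return ["scp", *sources, f"{username}@{host}:{remote_dir}"]


if __name__ == "__main__":
    # Placeholder user, address and folder, for illustration only.
    argv = build_scp_argv("support", "192.0.2.10", "/tmp/reports", "RNC260")
    print(shlex.join(argv))
```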
DELETING THE STANDARD SYMPTOM REPORTS

Because storage space on the RNC is limited, symptom reports may have to be deleted once their data has been used.

Delete a particular standard symptom report.
To delete a particular standard symptom report, enter the following command:
> delete symptom-report <name>

Delete all the standard symptom reports.
To delete all the standard symptom reports, execute the following command:
> delete symptom-report all
REPLACING A FAULTY HARD DISK

Replacing AMC Hard Disk Drive (HDD)

The hard disk drive is a mechanical device and it wears out in 3 to 4 years.
Hard disk drive and hard disk drive AMC are separate spare parts.
These are the main steps to replace HDD:

1. Identify in which mcRNC module the HDD to be replaced is located
2. Shut down the CFPU in the mcRNC module of the HDD to be replaced
3. Remove the AMC from the AMC bay
4. Uninstall the faulty HDD
5. Install the new HDD
6. Insert the AMC into the AMC bay
7. Activate the CFPU with the new HDD

Before you start
CAUTION! Electrostatic discharge (ESD) may damage components in the module or other units.
Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
Removing AMC
Procedure
1. Identify in which mcRNC module the HDD to be replaced is located.

2. Log into the CFPU node where the HDD is NOT faulty.

3. Create a fallback build.
> save sw-manage app-sw-mgmt fb-build build-name <build-name>

4. Prepare the mcRNC module with the HDD to be replaced.
a) Lock the CFPU node where the HDD is located.
> set has lock force managed-object /CFPU-<x>
b) Power off the node where the faulty hard disk drive is located.
> set has power off managed-object /CFPU-<x>
5. Gently pull the hot swap handle on the front panel of the AMC.
Do not pull the handle out all the way yet. Pulling the handle notifies the hardware management system that you are going to remove the AMC and tells it to finish all processes. The hot swap LED starts flashing.

6. Wait until the hot swap LED turns solid blue. This may take a few seconds.

7. Pull the hot swap handle again more firmly and slide the AMC out of the bay.

8. If you are not installing another AMC immediately, install an AMC filler into the empty AMC bay.
This is to ensure adequate cooling and a proper EMC shield in the module.
Removing faulty HDD
Procedure

1. Place the AMC so that the faulty hard disk drive side is facing down. Unscrew the four screws on the metal bracket of the AMC module, then turn the module over carefully while holding the hard disk drive.

2. Disconnect the faulty hard disk drive.
Detach the faulty hard disk drive from the connector by pulling it gently.
Installing new HDD
Procedure

1. Connect the new hard disk drive to the SAS connector of HDSAM-A.
Connect the new hard disk drive to the SAS connector in the HDSAM-A by pushing it gently.

2. Turn the AMC over and attach the new hard disk drive to the AMC with four screws.
Tighten the screws so that their heads are in line with the metal bracket.

3. Install the AMC module back into the AMC bay.
Installing AMC HDD
Procedure

1. Check that the EMC gasket is correctly in place and that its contacts are clean.

2. Insert the AMC into the bay, sliding it along the guide rails as shown in the figure below. Make sure that the AMC is firmly seated in the module’s connectors.

3. Press the hot swap handle firmly.
Wait until the blue hot swap LED turns off and the power LED turns solid green.

4. Enable network boot for the CFPU node with the new HDD.
To enable the network boot for the node with the new hard disk drive, enter the following commands:
a) Log in as root.
> set user username root
b) Power on the node.
# hwcli -np on <CFPU-X>
Wait a few seconds before proceeding to the next step.
c) Reset the node.
# hwcli -nr -B 3 <CFPU-X>
d) Exit root.
# exit
5. Disable the watchdog on the CFPU node with the new HDD.
To disable the watchdog on the node with the new hard disk drive through SSH, enter the following command:
ssh \ wdctl -d
6. Initialize the new HDD from the other CFPU node where the hard disk is not faulty.
To initialize the new disk, enter the following fsclish command:
> initialize hw
The following output is displayed:
Hardware successfully initialized
Note: To run the initialization script and display the console output, the space bar
must be pressed several times after entering the command.

7. Reboot the CFPU node with the new HDD from the local HDD.
Enter the following commands:
> set user username root
# hwcli -nr -B 2 <CFPU-X>
# exit

The node will restart and synchronize the Distributed Replicated Block Devices (DRBD). You can enter watch -n 10 cat /proc/drbd to see how the synchronization is progressing. If the watch -n 10 cat /proc/drbd command fails, run cat /proc/drbd instead.
Note: set user username root must be executed before watch -n 10 cat /proc/drbd.

Do not restart the node during the DRBD synchronization. The initialization of the new disk is not complete until the synchronization has finished successfully.

Once synchronization is complete, the oos value for all blocks will be 0.
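The completion rule above (every oos value equals 0) can be checked mechanically on a saved copy of the /proc/drbd output. This is a minimal sketch, assuming the DRBD 8.x style of /proc/drbd in which each resource reports an "oos:<number>" field. The SAMPLE text is illustrative and was not captured from an mcRNC; the helper names are hypothetical.

```python
import re

OOS_PATTERN = re.compile(r"\boos:(\d+)")


def out_of_sync_values(proc_drbd_text):
    """Return every oos counter found in the text, in order of appearance."""
    return [int(value) for value in OOS_PATTERN.findall(proc_drbd_text)]


def sync_complete(proc_drbd_text):
    """True only if at least one oos field was found and all of them are 0."""
    values = out_of_sync_values(proc_drbd_text)
    # An empty result means nothing was parsed, so do not report success.
    return bool(values) and all(value == 0 for value in values)


# Illustrative input only (assumed DRBD 8.x layout).
SAMPLE = """\
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:1048576 dw:1048576 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:2097152
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
"""

if __name__ == "__main__":
    print(out_of_sync_values(SAMPLE))
    print("synchronized" if sync_complete(SAMPLE) else "still synchronizing")
```

Treating empty input as "not complete" is deliberate, so a failed cat /proc/drbd is never mistaken for a finished synchronization.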

8. Check that serving and backup CMF (Cluster Management Functionality) are working normally. This can be done by comparing Managed Objects CFPU-0 and CFPU-1. Enter:
> show cmf status recovery-unit node-name <mo-name>

Compare the blocks; they should match for both managed objects.
9. Unlock the node with the new hard disk drive.
Enter: set has unlock managed-object
Example:
> set has unlock managed-object /CFPU-1
The following output is displayed: /CFPU-1 unlocked successfully.
Note: Clear the hard-disk failure alarm manually, then observe the node for a period to confirm that the alarm does not return.
Thank you
