
Board Overheated

OPERATING INSTRUCTIONS

123/1543-CRX 901 49/1 Uen C


Copyright

© Ericsson AB 2007, 2014. All rights reserved. No part of this document may be
reproduced in any form without the written permission of the copyright owner.

Disclaimer

The contents of this document are subject to revision without notice due to
continued progress in methodology, design and manufacturing. Ericsson shall
have no liability for any error or damage of any kind resulting from the use
of this document.

123/1543-CRX 901 49/1 Uen C | 2014-09-26


Contents

1 Overview
1.1 Description
1.2 Prerequisites

2 Procedure

3 Further Information

1 Overview

This instruction concerns alarm handling.

1.1 Description
The alarm is a primary alarm. The alarm is issued by the MO PlugInUnit.

The possible alarm causes and fault locations are explained in Table 1.

Table 1 Alarm Causes

Alarm Cause: The air surrounding the node is too warm.
  Description: The Fan unit is not able to decrease the temperature to normal values.
  Fault Reason: The physical environment of the node is too warm.
  Fault Location: In the environment outside the subrack.
  Impact: Too high temperature can damage the hardware. Traffic is affected.

Alarm Cause: The Fan unit is not working correctly.
  Description: The Fan unit is not cooling the boards in the subrack.
  Fault Reason: There can be one or more Fan alarms. The Fan problem is the main reason for this alarm.
  Fault Location: The fault is located in the Fan unit.
  Impact: The subrack is not being cooled. Boards can be damaged.

Alarm Cause: The board has too high load.
  Description: The board is too hot.
  Fault Reason: The process (traffic) load on the board is too high.
  Fault Location: The fault is located in the configuration and/or dimensioning of the board.
  Impact: The high temperature can damage the board. Traffic is affected.

Alarm Cause: The hardware on the board is faulty.
  Description: The board is faulty.
  Fault Reason: Some hardware component is too hot because this or another component is faulty.
  Fault Location: The board.
  Impact: The overheated component can be damaged and traffic affected.

Alarm Cause: There is at least one empty slot in the subrack.
  Description: If no real board is present in a slot, a dummy board must be inserted.
  Fault Reason: Electromagnetic shielding and cooling airflow is not optimal.
  Fault Location: The positioning of the boards.
  Impact: The high temperature can damage the board. Traffic is affected.

Alarm Cause: Dirty or damaged dust filter. (Applies in an EvoC node only.)
  Description: The dust filter used by the Power Fan Module (PFM) is either dirty or is damaged.
  Fault Reason: Accumulated dust in the input air filters or possibly damaged filters.
  Fault Location: The dust filter used by the PFM in an EvoC node.
  Impact: The high temperature can damage the board. Traffic is affected.

The alarm is issued if the temperature of a board becomes higher than the
maximum allowed temperature. That is, the alarm is issued if one or more
temperature sensors on a board detect that the current temperature is too
high. This can be caused by a number of hardware faults, either on the board
itself or in the Fan units in the subrack. The processor load can also affect
the temperature on the board. The alarm can therefore be issued for several
different reasons, and sometimes a combination of two reasons, for example,
decreased capacity in the Fan unit combined with increased load on the board
processor.

The additional information (additionalInfo) of the alarm contains the ID of the
sensor that first detected the high temperature. It also contains the value of
the temperature at the time of the alarm. This value remains for the whole
duration of the alarm, regardless of further changes in board temperature.

The alarm ceases when the temperature of the sensor decreases and remains
below the alarm cease limit.

If another sensor exceeds its corresponding alarm limit when this alarm is still
active, no second alarm is issued. All such sensors must remain below their
corresponding alarm cease limits for the alarm to cease.
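
The issue and cease behavior described above can be illustrated with a minimal sketch. It is not the node's implementation: the class names, the sensor IDs, and the numeric limits are hypothetical, and the requirement that a sensor "remains" below its cease limit is simplified to a single reading below that limit.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class SensorLimits:
    alarm_limit: float  # hypothetical value: above this, the sensor trips
    cease_limit: float  # hypothetical value: below this, the sensor has cooled


@dataclass
class BoardOverheatedAlarm:
    limits: Dict[str, SensorLimits]
    active: bool = False
    # Frozen when the alarm is issued; not updated while the alarm is active.
    additional_info: Optional[dict] = None
    # Sensors that exceeded their alarm limit and have not yet cooled down.
    _hot: Set[str] = field(default_factory=set)

    def update(self, readings: Dict[str, float]) -> bool:
        """Apply one round of sensor readings and return the alarm state."""
        for sensor_id, temp in readings.items():
            lim = self.limits[sensor_id]
            if temp > lim.alarm_limit:
                self._hot.add(sensor_id)
                if not self.active:
                    # Only the first sensor issues the alarm; later ones do not.
                    self.active = True
                    self.additional_info = {"sensor": sensor_id, "temperature": temp}
            elif temp < lim.cease_limit:
                self._hot.discard(sensor_id)
        if self.active and not self._hot:
            # Every tripped sensor is below its cease limit: the alarm ceases.
            self.active = False
            self.additional_info = None  # assumption: cleared on cease
        return self.active
```

A sensor whose temperature lies between its cease limit and its alarm limit keeps its previous state, which is why a second hot sensor can keep the alarm active after the first one has cooled.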

If the board is an EvoET board in an EvoC node, the temperature is not affected
by the intensity of the processor load. For this board type, the temperature is
instead affected by the location of the board. If the board is located in slot 23 or
higher, it must be moved to a slot with a lower slot number.

If several boards are overheated, that is, if this alarm is issued on several
instances of the MO PlugInUnit, it is likely that the fault is not on the boards,
but in the Fan unit(s) or in the physical environment of the node (including fire).
Damaged boards must be replaced.
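
As a rough illustration of this rule of thumb, the sketch below groups the boards that report the alarm by subrack and suggests where to look first. The function name, the input format, and the subrack names in the usage note are hypothetical; the grouping by subrack is an assumption based on the procedure's advice to check other boards in the same subrack.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def likely_fault_location(alarmed_units: Iterable[Tuple[str, int]]) -> Dict[str, str]:
    """Suggest a first place to look, per subrack.

    alarmed_units holds one (subrack, slot) pair for each PlugInUnit
    instance on which the Board Overheated alarm is active.
    """
    per_subrack = Counter(subrack for subrack, _slot in alarmed_units)
    return {
        subrack: (
            "Fan unit or physical environment"
            if count > 1
            else "board, load, or slot position"
        )
        for subrack, count in per_subrack.items()
    }
```

For example, `likely_fault_location([("MS", 10), ("MS", 12), ("ES-1", 5)])` points to the Fan unit or environment for subrack `MS` and to the single board for `ES-1`. The result is only a hint for where to start; it does not replace the procedure.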

If the alarm is not acted on, the overheated board can be damaged and the
capacity of the node will be affected.

It is probably necessary to visit the node.

1.2 Prerequisites
This section provides information on the documents, tools and conditions that
apply to the procedure.

1.2.1 Documents

Before starting this procedure, ensure that you have read the following
documents:

• System Safety Information

• Personal Health and Safety Information

2 Procedure

Do the following:

1. Investigate whether there are any empty slots in the subrack. If there is an
empty slot where no real board is needed, insert a dummy board in the
slot. If the alarm ceases, exit this procedure. If the alarm does not cease,
continue with the next step.

2. Lock the board, using the instruction Lock Board. The Lock type is Hard
lock. When the board is locked, the processor load is reduced and the
temperature decreases.

3. Restart the board, using the instruction Restart Board. RestartRank is Cold
with Test and Restart Reason is UNPLANNED_COLD_WITH_HW_TEST.

4. Unlock the board, using the instruction Unlock Board. If the alarm is not
issued again, exit this procedure. If the alarm is issued again, continue
with the next step.

5. Look for Fan alarms that indicate that the Fan is faulty or that there is
overheating. If there are any such alarms, act on them, using the relevant
instruction.

6. Investigate whether the alarm is issued for other boards in the node, in
particular, in the same subrack. If so, there is an overheating problem in
the node, but no corresponding alarm on node level is issued. Investigate
this problem at the node. If the alarms do not cease, contact the next level
of maintenance support.

7. If the alarm is issued for just one board and the board is faulty, lock the
board again and replace it, using the instruction for replacing that board.

8. Unlock the board. If the alarm is not issued again, exit this procedure.

9. If the node is an EvoC node, inspect the filters in front of and below the
Power Fan Modules (PFMs). If necessary, clean or replace them, using the
instruction Replacing the Dust Filters in BFD 538. Wait for two minutes. If
the alarm ceases, exit this procedure. If the alarm does not cease, continue
with the next step.

10. If the board is an EvoET board in an EvoC node, and if the board is
located in slot 23 or higher, move it to a slot with a lower slot number using
the instruction Replacing an EvoET Board. If the alarm ceases, exit this
procedure. If the alarm does not cease, continue with the next step.

11. Contact the next level of maintenance support. The cause of the alarm can
be a combination of high processor load and high ambient temperature, or
a faulty configuration of the board. See Section 3.
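
Steps 9 and 10 apply only under certain conditions. The sketch below encodes those conditions so they are easy to check at a glance. The function name and the string values for node and board type are hypothetical; the slot threshold of 23 is taken from step 10.

```python
from typing import List

# From step 10: an EvoET board in slot 23 or higher must be moved.
FIRST_UNSUITABLE_EVOET_SLOT = 23


def conditional_steps(node_type: str, board_type: str, slot: int) -> List[str]:
    """Return the node-specific steps (9 and 10) that apply to this board."""
    actions: List[str] = []
    if node_type == "EvoC":
        actions.append(
            "Step 9: inspect the PFM dust filters; clean or replace if necessary"
        )
        if board_type == "EvoET" and slot >= FIRST_UNSUITABLE_EVOET_SLOT:
            actions.append(
                "Step 10: move the board to a slot with a lower slot number"
            )
    return actions
```

For a board that is not in an EvoC node the function returns an empty list, and the procedure continues directly with step 11.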

3 Further Information

The processor load on a board depends on the software configuration on the
board. Each board is designed for a certain maximum processor load, and an
incorrect software configuration on the board can contribute to this alarm. The
temperature on the board processor is affected by the air temperature and
can be affected by the processor load.

You can monitor the temperature on the board using the command boardtemp.
