
The Reliability Approaches and Requirements for IC Component in Telecom System


Guofujun, Zhoujiang, Xieronghua
Component Reliability Dept., 2012 Lab
Huawei Tech. Co., Ltd.
Shenzhen, China
(86) –(755)- 896-50813, guofujun@huawei.com

Abstract: Telecom systems and their reliability are introduced. Reliability approaches used in telecom equipment design and manufacture are summarized, such as IC supplier process management, incoming quality control, failure rate and wear-out lifetime evaluation, online monitoring of the equipment process, reliability burn-in and screening, and field failure analysis and improvement. Telecom equipment is becoming more complicated and more integrated, which brings engineering challenges to board design because IC components suffer high electrical, temperature and mechanical stress. Flexible, low cost engineering approaches are introduced and requirements for the IC industry are proposed.

Keywords- telecom; IC component; reliability

I. TELECOM SYSTEM AND ITS RELIABILITY INTRODUCTION

A telecom system is for information exchange. It enables communication between people or terminals anywhere and anytime, through media conversion and data transmission based on defined protocols. In a telecom system, terminals such as telephones and computers are linked by wire, radio, optical or other electromagnetic means and formed into a network to convey voice, images, data or any other information.

Figure 1. Telecom system diagrammatic view

One of the most important attributes of a telecom system is high availability, which requires system down time to be as short as possible. Generally, about 0.0001% to 0.01% down time is allowed, which is about 0.5 to 50 minutes in one year. The most effective way to reduce down time is to improve telecom system reliability.

II. TELECOM SYSTEM RELIABILITY APPROACHES

Telecom system reliability is an end to end program. Reliability requirements should flow smoothly through the supply chain of telecom carrier, equipment provider and IC supplier in a top-down way. Reliability approaches should be employed at all levels of IC, board, equipment and system in a bottom-up way. Partnerships also need to be formed between equipment provider and IC supplier, and between equipment provider and telecom carrier, to improve telecom system reliability.

Figure 2. Telecom end to end reliability program

III. IC LEVEL RELIABILITY APPROACHES

The main ideas in IC level reliability approaches are as follows.
• Reducing-Integrating: reduce the component count by integrating board circuits into an ASIC (Application Specific Integrated Circuit).
• Sourcing-Evaluating: select suppliers with a good quality history, source ICs built on a mature process, and evaluate IC reliability for the target application.
• Monitoring-Improving: monitor process and lot-to-lot differences to make sure incoming ICs are under control, and improve the process if a defect is found or the failure rate is high.
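As a cross-check of the availability discussion in Section I, the sketch below converts an allowed down time percentage into minutes per year. It is only an arithmetic illustration; the function name is hypothetical and a 365-day year is assumed. A down time of roughly 0.5 to 50 minutes per year corresponds to about 0.0001% to 0.01% of the year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(downtime_percent: float) -> float:
    """Convert an allowed down time (percent of the year) to minutes per year."""
    return downtime_percent / 100.0 * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Illustrative down time budgets, from loose to strict.
    for pct in (0.01, 0.001, 0.0001):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% down time -> {minutes:.2f} minutes per year")
```

With these assumptions, 0.001% down time (often called "five nines" availability) comes out at a little over 5 minutes per year.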

978-1-4577-1680-5/12/$26.00 ©2012 IEEE 4B.5.1


A. Integrate circuit as ASIC to improve reliability

On a PCB (printed circuit board) there are all kinds of components; the board's failure rate is, simply, the sum of the failure rates of all these components. One way to improve the board's reliability, or reduce its failure rate, is to reduce the number of components, and one way to reduce the number of components is to integrate a local function module circuit into an ASIC.

Figure 3. Integrate circuit into ASIC

Field data show that the failure rate of an ASIC is generally lower than that of the module it replaces (before integration into the ASIC). Of course, the ASIC should be well designed on a mature process.

B. ASIC DFR and DFT

Since the ASIC is designed by the telecom equipment manufacturer itself, reliability and testability should be considered in the design phase.

1) DFR (design for reliability):

Cover enough wafer level failure mechanisms, like EM (electromigration), SM (stress migration), NBTI (negative bias temperature instability), HCI (hot carrier injection), TDDB (time-dependent dielectric breakdown) and so on.

Make sure wear out life and failure rate meet requirements by wafer level simulation and reliability test. For example, the EM wear out life of metal lines and vias should be over 10 years under max current density and temperature, with a failure rate of less than 0.1%.

2) DFT (design for testability):

Cover enough test items: for example, stuck-at, at-speed transition, at-speed path delay, Iddq, bridging fault, n-detect and so on for logic circuits; and inversion coupling, idempotent coupling, dynamic coupling, data retention, write mask, active neighborhood pattern sensitive coupling and so on for memory circuits.

Keep test coverage high; for example, stuck-at test coverage should be at least 98% for logic circuits.

C. IC selecting and sourcing

In addition to ASICs, most IC components are sourced from outside. To improve telecom equipment reliability, the first and most important step is supplier selection and IC sourcing.

Select IC suppliers with a good quality history.

Select and source ICs with a mature and reliable package. For example, the PLCC (plastic leaded chip carrier) package, though mature enough, is not recommended for high-end telecom equipment, because field data show that it has a high failure rate.

Select and source ICs built on a mature process. It is better to adopt a technology node 1~2 years later than the date given in the ITRS (International Technology Roadmap for Semiconductors) roadmap. For example, use of 90nm process ICs is restricted in 2004 even though the ITRS roadmap shows the node as mature in that year; it is better to begin using it in 2005 or 2006. If a 90nm IC is necessary for some telecom product, enhanced evaluation and test are needed. Use of processes with a poor quality history is also restricted.

Select and source ICs for the target application. For example, ICs used in telecom equipment in regions like the Arctic Circle and Everest must endure extreme cold; ICs used in submarine equipment should have a long wear out life; ICs used in rail applications must endure vibration.

D. IC reliability qualification

Any new IC component must be qualified before use. Qualification means evaluating and testing the IC component to judge whether or not it meets telecom equipment requirements, using the items below.

1) Reliability evaluation:

Reliability test and failure rate evaluation: HTOL (High Temperature Operating Life), HTSL (High Temperature Storage Life), TC (Temperature Cycling), Humidity (one of HAST (Highly Accelerated Temperature and Humidity Stress Test), Unbiased HAST or Autoclave) and ESD/Latch-up, performed by the supplier, are necessary items per JEDEC standards. Any failure within the sample size specified in JESD47 [1] fails the qualification. For an IC component on a completely new process, the failure rate should be ≤50 FIT; roughly, this allows 0 failures out of 3 lots × 77 pcs in HTOL, given an activation energy Ea of 0.7eV, 60% C.L. (confidence level), a use temperature of 55ºC and a stress temperature of 125ºC.

Wear-out lifetime evaluation: for EM, SM, NBTI, HCI, TDDB and other wafer level failure mechanisms, the lifetime at 0.1% cumulative failure under max temperature and Vdd should be ≥10 years (taken as about 100,000 hours). AC life equals DC life multiplied by a duty factor; the duty factor varies with process and supplier, and typical values are 2 for NBTI and ≥50 for HCI.

2) Reliability testing:

Electrical: I/V curve, ESD/Latch-up, open/short, key parameters and function are tested.

Mechanical: dimension and co-planarity, wire bond pull, ball shear, die shear, lid pull, package bending, etc.

Physical: for the package, defects like wire bond shorts or opens, attachment voids and delamination, Flip-Chip under-fill cracks, foreign particles and other defects are inspected. For the die, defects like metallization corrosion, foreign particles and other defects are checked. Here is an example: an abnormality was found in the mold compound with SAM (Scanning Acoustic Microscope). It is the "knit-line" phenomenon, which occurs during injection of the hot liquid mold compound where the density of silica filler in the compound is altered; it does not affect function, and no failure was found in reliability testing.
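The ≤50 FIT criterion in the reliability evaluation item of Section III.D can be checked with a short calculation. The sketch below is a minimal illustration, assuming an Arrhenius acceleration model, the usual zero-failure chi-square upper bound, and an HTOL duration of 1000 hours. The 1000-hour duration is an assumption and is not stated in the text; the function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Arrhenius acceleration factor between use and stress temperature (deg C)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k))


def fit_upper_bound_zero_failures(sample_size: int, stress_hours: float,
                                  acceleration: float, confidence: float) -> float:
    """Upper bound on failure rate in FIT (failures per 1e9 device-hours), zero failures.

    For zero failures the chi-square term chi2(CL, 2 dof) / 2 reduces to -ln(1 - CL).
    """
    equivalent_device_hours = sample_size * stress_hours * acceleration
    return -math.log(1.0 - confidence) / equivalent_device_hours * 1e9


if __name__ == "__main__":
    af = arrhenius_acceleration(ea_ev=0.7, t_use_c=55.0, t_stress_c=125.0)
    # 3 lots x 77 pcs; 1000 h HTOL duration is an assumption, not from the text.
    fit = fit_upper_bound_zero_failures(3 * 77, 1000.0, af, confidence=0.60)
    print(f"Acceleration factor: {af:.1f}")
    print(f"Failure rate upper bound: {fit:.1f} FIT")
```

Under these assumptions the bound lands close to 50 FIT, which is consistent with the "roughly" in the text.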

Figure 4. Abnormal mold compound

Stressed test: component level TC, HTSL and humidity tests, followed by validation on PCB. Note that suppliers may already have done these tests in the qualification phase, but they do them per process family group. We, as a telecom equipment manufacturer, do these tests on the specific part code to be used in our product. Another reason is that some suppliers are not able, or do not want, to do these tests.

E. IC process control

Many ways are used to control the IC process at the supplier site, like regular audits of the IC production line, regular reviews of supplier process reports, remote online monitoring of real time process data, and sometimes sending people to the supplier site if necessary.

Cover key processes and parameters, like Vt, Poly-CD (Critical Dimension) and metal line resistance for the wafer process, and die attachment, bond pull and shear strength, and molding voids for the package process. Monitor that processes and parameters are under control, with Cpk (process capability index) over 1.33 or 1.67 for key processes. Keep enough check points and online sampling.

F. IC incoming quality inspection and monitor

After an IC is qualified, it can be used in any telecom product. To make sure every incoming lot is acceptable and under control, lot by lot quality inspection and monitoring are necessary.

Samples, 5 pcs (pieces)/lot for instance, are inspected. If one or more samples fail inspection, the lot is rejected and the supplier concerned must review its process. Tightened, reduced or skip-lot sampling is used based on component type, lot size and AQL (Acceptable Quality Level).

Incoming inspection and monitor items also cover electrical, mechanical and physical aspects, the same as those in the qualification phase.

Seasonal reliability monitor reports from IC suppliers are reviewed to make sure IC reliability is under control. Normally, monitor items are also the same as those in qualification, but with a reduced sample size.

G. IC failure analysis and improvement

Assembly and field failures are inevitable for IC components. If a failure happens in the assembly phase before equipment delivery, or in the board R&D phase, failure analysis is necessary to find the root cause and correct it. If a failure happens in the field, both failure analysis and reliability evaluation are necessary to decide what measure should be adopted, like rectification, recall, or simply ignoring it if it is only a random failure and the failure rate is within target.

Here is an example of SDDV (Stress Driven Diffusive Voiding) or SM: a field failure after 1~2 years of operating time, with a failure rate of ~5%, no function and no output, and Tc (case temperature) over 70ºC at room ambient temperature. The failure mechanism is SDDV, or stress migration/voiding, resulting in increased via resistance and local high temperature. The root cause is that the Al-Si metal lines and vias are sensitive to stress. The improvement measure is adding Cu to the Al-Si. For test and verification, stressed tests at over 150ºC for thousands of hours show no failure with Al-Si-Cu.

Figure 5. Metal via void

IV. BOARD LEVEL RELIABILITY APPROACHES

Board level reliability approaches come down to three ideas:

• Accepting-Absorbing: accept the reality that IC parameter variation is inevitable, and absorb and tolerate IC process variation through board level design.
• Avoiding-Compensating: any IC has its weaknesses; avoid them, or else compensate for them by adding a small circuit or designing in a special way.
• Derating-Improving: improve reliability and extend life by derating environmental stress, on the assumption that the IC is poor in quality even though it is actually strong enough.

A. Board level design for IC process variation

The IC process and all kinds of its parameters vary lot by lot; theoretically, they should stay far from the spec limits. Unfortunately, in practice there is some probability that certain lots or individual units are very near to a spec limit (SL: spec limit, LSL: lower spec limit, USL: upper spec limit), without enough margin, and there are even some outliers beyond the spec limits. If IC process control at the supplier site and incoming inspection at the telecom system manufacturing site do not work well, these outliers will bring the board or equipment down.

Figure 6. Units near to SL and Outliers

IC parameters vary with ambient temperature and operating life time. In addition, there are inevitably differences between ICs from different suppliers that can substitute for each other at a given location on the board.
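The Cpk targets in Section III.E (over 1.33 or 1.67) and the concern in Section IV.A about units near the spec limits can be related with a short calculation. The sketch below is only an illustration, assuming the monitored parameter is normally distributed. The function names, the sample Vt readings and the spec limits are hypothetical and not from the paper.

```python
import math
from statistics import mean, stdev


def cpk(samples, lsl: float, usl: float) -> float:
    """Process capability index: distance to the nearer spec limit over 3 sigma."""
    mu = mean(samples)
    sigma = stdev(samples)  # sample standard deviation
    return min(usl - mu, mu - lsl) / (3.0 * sigma)


def tail_ppm_beyond_nearer_limit(cpk_value: float) -> float:
    """Expected fraction beyond the nearer spec limit, in ppm, for a normal process."""
    z = 3.0 * cpk_value
    return 0.5 * math.erfc(z / math.sqrt(2.0)) * 1e6


if __name__ == "__main__":
    # Hypothetical Vt readings in volts with hypothetical limits of 0.44 V and 0.56 V.
    vt_samples = [0.50, 0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.51, 0.49, 0.50]
    print(f"Cpk of sample lot: {cpk(vt_samples, lsl=0.44, usl=0.56):.2f}")
    for target in (1.33, 1.67):
        ppm = tail_ppm_beyond_nearer_limit(target)
        print(f"Cpk {target}: about {ppm:.2f} ppm beyond the nearer limit")
```

Under the normality assumption, moving from a Cpk of 1.33 to 1.67 reduces the expected fraction beyond the nearer limit from tens of ppm to well under 1 ppm. Real outliers often do not follow the normal model, which is one reason incoming inspection is still needed.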

a. Normal distribution b. Shift with temperature

c. Shift with life time d. Difference between suppliers or ICs

Figure 7. IC process and parameter shift

To absorb all these variations, the board must be designed for IC process and parameter variation. Different methods apply in the board design phase and the board test phase: worst case simulation is used in the board design phase, and corner device (or corner process device, corner wafer device) testing is used in the board test phase.

1) Worst case simulation:

Worst case simulation is circuit simulation with special models that integrate the component's temperature, voltage and life time characteristics. Temperature, voltage and life time are adjustable in the simulation, and in board circuit design practice they are set to the worst condition. This method works well for circuits with analog ICs.

Worst case simulation example: an IC has a key parameter R with a spec limit of 100Ω±1%, that is, a process tolerance of 1Ω (100×1%). In normal simulation only the typical value of 100Ω is used, or sometimes the spec limits of 101Ω and 99Ω. In worst case simulation, 102.5Ω and 97.5Ω are used instead: given a shift ΔR.t of 0.8Ω from temperature change, ΔR.l of 0.5Ω from life time change and ΔR.v of 0.2Ω from voltage change, the max is 102.5Ω (100+1+0.8+0.5+0.2) and the min is 97.5Ω (100-1-0.8-0.5-0.2).

2) Corner device testing:

A corner device is manufactured by the supplier to verify the IC process. Its function is the same as that of a normal device, but its process is deliberately adjusted to the spec limit. For example, the supplier can adjust the Vt_n, Vt_p and Poly-CD processes to get TTT (Typical), SSS (Slow) and FFF (Fast) corner devices.

Figure 8. Corner device

Corner devices are normally for internal use at the supplier site only, but a telecom manufacturer can also use them at board level to verify circuit design margin. This method works well for circuits with both analog and digital ICs. It is better if temperature and voltage stress are applied at the same time. Combinations of SSS with high temperature and low voltage, and FFF with low temperature and high voltage, are recommended.

Figure 9. Corner device, temperature and voltage combinations recommended

B. Board level application design for IC reliability

Every IC has its weaknesses; the board circuit designer or component reliability engineer needs to know and avoid them, and to improve IC reliability through board level design. Different ICs call for different methods; some examples follow.

Flash and EEPROM, data retention and endurance: these ICs can only retain data for a limited time, 10 years for instance, so telecom equipment with a service life over 10 years should rewrite the data or replace the device before the deadline, with enough time margin. These ICs can also only sustain a limited number of repeated data changes (program/erase cycles) before failure, 100,000 cycles for instance, so the board circuit design should reduce program/erase cycles.

ICs with RAM cells, soft error: if a soft error happens, that is, a bit 0 or 1 in a RAM cell is changed by alpha-particle or cosmic-ray radiation, the board circuit should be able to detect the error and then correct it. PC (Parity Check), CRC (Cyclic Redundancy Check) or other error checking can be used for error detection; for error correction, methods such as data backup and renewal, periodic refresh, reset and ECC (error correcting codes) can be used. Soft error detection and correction are important for telecom equipment used at high altitude sites like the Tibetan Plateau.

Logic IC, input pin not used or external device disconnected: a definite logic level should be asserted with a pull-up or pull-down resistor.

IC with lidless package: the thermal designer should pay close attention to mechanical stress if a heat spreader is attached, since poor design may result in die crack failure.

ESD protection: it is better not to use ICs with a low ESD level, like 300V HBM (Human Body Model) or 200V CDM (Charged Device Model) and below; otherwise board circuit protection should be designed to make sure the IC is not damaged by humans or machines. ICs with too low an ESD level, like 150V, are prohibited.

C. Board level stress derating design for IC reliability

For electrical components and equipment, a low failure rate and long lifetime can be achieved by reducing stress, like temperature, voltage and mechanical stress.
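The worst case simulation example in Section IV.A (a 100Ω part with 1% process tolerance plus temperature, life time and voltage shifts) amounts to a simple linear stack-up. The sketch below reproduces that arithmetic; the function name and dictionary keys are hypothetical, and a real worst case simulation would use the component models described in the text rather than a plain sum.

```python
def worst_case_limits(nominal: float, process_tolerance: float, shifts: dict) -> tuple:
    """Return (min, max) of a parameter when all shifts add in the same direction."""
    total_shift = process_tolerance + sum(shifts.values())
    return nominal - total_shift, nominal + total_shift


if __name__ == "__main__":
    nominal_ohm = 100.0
    process_tolerance_ohm = nominal_ohm * 0.01  # 1% spec limit -> 1 ohm
    shifts_ohm = {
        "temperature": 0.8,  # delta R.t
        "life_time": 0.5,    # delta R.l
        "voltage": 0.2,      # delta R.v
    }
    r_min, r_max = worst_case_limits(nominal_ohm, process_tolerance_ohm, shifts_ohm)
    print(f"Worst case range: {r_min:.1f} ohm to {r_max:.1f} ohm")
```

The linear sum is the most pessimistic combination; whether a statistical combination would be acceptable instead is a design decision the paper does not address.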

Figure 10. Failure rate with time with different level stress

Field data show that the failure rate of a board or module in telecom equipment increases as temperature stress increases. Industry data from Telcordia standard SR-332 [2] show that IC failure rate also increases with temperature stress.

a. Board failure rate from 2000+ sites, temperature measured at equipment air exhaust vent

b. IC failure rate of 1Gbits DRAM with Ea=0.45eV per Telcordia standard SR-332

Figure 11. Failure rate with temperature

Derating is a design method that keeps stress below the component spec limit or rating value in order to improve component reliability. For example, if an FPGA's recommended operating temperature rating is 85ºC and its absolute maximum rating is 125ºC, then with derating it should be operated below 85ºC through system thermal design. In practice there may be different derating grades, which can be expressed as 'Rating-5ºC', 'Rating-10ºC' or 'Rating-15ºC', or as 'Rating × 95%', 'Rating × 90%' or 'Rating × 80%'. The choice of derating grade depends on IC type and equipment requirements. Besides temperature, derating can also be applied to frequency, voltage and other important stresses or IC operating parameters.

V. EQUIPMENT AND HIGHER LEVEL RELIABILITY APPROACHES

Equipment level reliability approaches are introduced only briefly here. The strategy is to detect defects and hold them back before the equipment is shipped to the field, using burn-in and screening to make sure shipped equipment has a low early failure rate.

In addition, learning from failures, feeding information back to board level, and forming a reliability methodology for later equipment versions are also targets in this phase.

• FMEA (Failure Mode and Effects Analysis): reliability prediction, FMEA, and reliability allocation from equipment, board and module down to IC level.
• FIT: fault injection test.
• Reliability growth: used to find board weaknesses and then improve them in the R&D phase, with HALT (Highly Accelerated Life Test) or other destructive tests.
• Environmental test: used to make sure the board or equipment is suited to the target market environment, like high or low temperature, high humidity, low air pressure, salt fog, vibration and so on.
• Assembly inline monitor: keep variation under control.
• Burn-in and screening: screen out early life failures.
• Ongoing reliability test: as equipment is shipped lot by lot, samples are tested to make sure reliability is monitored.

There are also higher level approaches, not described in detail here since they are less related to IC component reliability, like backup at equipment, system and telecom network level (two or more units mirror each other; if the active one goes down, the standby one takes over).

VI. TELECOM ENGINEERING CHALLENGES AND REQUIREMENTS FOR IC

Telecom equipment is getting more complicated and integrated in function, smaller in size, denser in power and lower in cost.

For example, to save energy, some telecom carriers shut down half of the air conditioners in the equipment room, resulting in high temperature for boards and ICs; to save space in crowded cities, outdoor equipment shrinks in size so that it can be installed on building tops or at the roadside; and some equipment is used in harsh environments, which makes the equipment engineering difficult.

These bring engineering challenges in equipment and board design, and result in stress on the IC, including mechanical, temperature and electrical stress. New technologies and approaches are employed in equipment and board level design to reduce stress; even so, telecom systems still pose some requirements for the IC industry.

The requirements below are for technical discussion only. From a commercial and business point of view, all of them can be met by using high grade IC components or high cost board level design.
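Looking back at the temperature derating grades in Section IV.C, the sketch below estimates how much a grade such as 'Rating-10ºC' might lower the temperature-driven part of the failure rate. It assumes a plain Arrhenius temperature factor with Ea = 0.45eV, the value quoted for the DRAM curve in Figure 11b, and an 85ºC rating as reference. The function name is hypothetical, and SR-332 defines its own full procedure, which is not reproduced here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def temperature_factor(ea_ev: float, temp_c: float, ref_temp_c: float) -> float:
    """Failure rate at temp_c relative to ref_temp_c under an Arrhenius assumption."""
    temp_k = temp_c + 273.15
    ref_k = ref_temp_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / ref_k - 1.0 / temp_k))


if __name__ == "__main__":
    rating_c = 85.0  # recommended operating temperature rating from the FPGA example
    for derating_c in (5.0, 10.0, 15.0):
        ratio = temperature_factor(0.45, rating_c - derating_c, rating_c)
        print(f"Rating-{derating_c:.0f}C: failure rate x {ratio:.2f} relative to rating")
```

Under these assumptions, a 10ºC derating lowers the temperature-driven failure rate by roughly one third. The actual benefit depends on the activation energy of the dominant failure mechanism.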

A. Temperature spec definition

In datasheets, the temperature spec of almost all ICs is expressed as Ta (ambient), Tc (case) or Tj (junction); sometimes Tb (ball or board) is used. These are widely used and well accepted. Accordingly, thermal resistances θja, θjc and θjb are defined, and their measurement methods are introduced in the JESD51 standards.

As electronic equipment becomes more integrated, the IC ambient temperature increases, and there is little margin between the IC's real time operating temperature and its spec limit. Sometimes the IC may run beyond its spec limit.

1) Engineering challenge:

Without an unambiguous definition of Ta, Tc or Tj, it is difficult to know the IC operating conditions accurately, and it is not easy to judge whether or not the IC runs beyond its temperature spec limit.

Below is a running board. Temperature scanning with an infrared thermal imager shows a temperature gradient on its surface, especially around the power module and IC components. Some suppliers only provide Ta as the IC temperature spec and roughly define Ta as 'air temperature surrounding device', but which point can be defined as 'ambient' in this case: point A, B, C or D?

Figure 12. Board temperature scanning with infrared thermal imager

Below is a MOSFET. Thermal simulation results and real test data show a large temperature gradient on its surface. If Tc is defined at point A on the package mold compound, it is far below the spec limit; but if it is defined at point B on the metal pad, which is about 30ºC hotter, it may be beyond the spec limit.

(MOSFET: TO220AB, Rg=24.3Ω, Vd=20V, Vg=3.37V, Id=0.05A, P=1.0W; Simulation result, Tc=67ºC at point A and Tc=100ºC at point B; Real time test data, Tc=71ºC at point A and Tc=104ºC at point B)

Figure 13. MOSFET thermal simulation and temperature gradient

Below is a widely used model to calculate Tj, that is, Tj=Tc+θjc×Qc or Tj=Ta+θja×Qc, where Qc is the part of the power P dissipated through the top of the IC case. The problems are: the exact proportion of the power dissipation that Qc represents is not known; θja is obtained per JESD51 and does not reflect the real environment, since thermal resistance varies with air flow; and Ta and Tc are not unambiguously defined, as mentioned above. With all these uncertainties, accurate calculation of Tj is impossible for users.

Figure 14. IC with Tj calculate model

Figure 15. Thermal resistance and air flow

For most ICs, two parameters are used as the temperature spec, like Ta and Tc, or Ta and Tj. Users may be confused if Ta runs beyond spec while Tj is still far below spec, since they have always been taught that the important parameter is Tj, which reflects and is linked to IC reliability.

Additionally, some ICs do not clearly define or distinguish between the recommended operating rating and the absolute max rating.

2) Requirement for IC:

The temperature spec should be unambiguously defined AND easy to use.

Principles for 'definitely defined and easily used':

• Only one parameter is used as the temperature spec for the user.
• What the user measures is what the user needs: no calculation, or as little calculation as possible, for the user.

Good choices following the principles above:

• Tc is the best choice and can be defined as the temperature at the center of the IC package top surface. When a heatsink is used, the supplier can define it case by case at any point, provided the point is unambiguous for the user;

a. Tc test point b. Tc test point with heatsink

Figure 16. Tc definition and test point

• Tj can be considered for highly complicated and integrated ICs, like CPUs and FPGAs, only when there is

a temperature sensor integrated in the IC and the temperature can be read out directly by a board level circuit.
• Ta can be considered only when it is defined by industry standards and widely accepted by suppliers. SR-332 defines Ta as “the temperature 0.5 inch above the surface of the device or …the average board temperature may be used…”. SR-332 would be better if more definite information were added; for example, the definitions in fig b or c below can be used if there is a heatsink attached to the IC or forced air flow in the equipment. Distance d can be proportional to airflow, like 10mm with 1m/s downwind airflow, or simply be made a constant.

a. Ta in SR-332

b. Ta with air flow

c. Ta with heatsink and air flow

Figure 17. Ta definition and test point

If Tc (or Ta) is selected as the only temperature spec parameter, the supplier can establish the relationship between Tj and Tc (or Ta) at its own site if Tj is necessary. For example, suppose that to keep an IC healthy, Tj must be kept below 115ºC. The supplier can conduct tests and research; if these show that Tc=Tj-15ºC under worst conditions, then only a Tc of 100ºC needs to appear in the datasheet as the temperature spec for the user, and the IC is still healthy enough.

B. Temperature Stress

With temperature rising in equipment, some ICs become a bottleneck in board level thermal design, like most commercial grade FPGAs, with a junction temperature limit of only 85ºC, or only 100ºC for industrial grade.

1) Engineering challenge:

It is difficult to cool the equipment down, thermal design cost is getting higher, and ICs under high temperature stress have a high failure rate.

Board level approaches are employed to reduce power density and make thermal design easier. The IC related approaches are: shut down modules of the IC when they do not need to work; reduce the operating frequency as far as possible; use ICs with a low supply voltage; use ICs with a lower thermal resistance package; use ICs with a wide operating temperature range.

2) Requirement for IC:

The IC can endure higher temperature stress or has a wide operating temperature range. It has high intrinsic reliability even when operated close to its spec limit for an extended term.

IC level approaches can be employed at the supplier site, like improving IC design and process to get low power consumption, and reducing static and leakage current/power with substrate insulator technology.

Figure 18. Substrate insulator to reduce substrate leakage

C. Mechanical Stress- Adhesive strength

Mechanical stress comes from the heatsink, board bending and equipment assembly, and can result in IC failure if the board level structure is not well designed and the IC itself is poor in mechanical strength.

An IC with a package lid has two main failure modes if a heatsink is attached to it: the heatsink dropping off, or the IC package lid dropping off. When adhesive strength F1 is larger than F2, the heatsink may drop off from the IC under mechanical stress. Otherwise the lid may drop off from the die and substrate.

Figure 19. IC with heatsink attached on its package lid

a. Heatsink dropping off b. Package lid dropping off

Figure 20. Poor adhesive failures

1) Engineering challenge:

Select and evaluate the adhesive between IC and heatsink in board level thermal design, and avoid mechanical stress during board assembly, equipment transportation and installation.

2) Requirement for IC:

Enough adhesive strength, as follows.
• Enough surface adhesive strength, given that the user may attach a heatsink to the IC, to avoid the heatsink dropping off under mechanical stress. Adhesive strength is a function of package molding surface energy, roughness, Logo/Mark process and so on.
• Enough adhesive strength between lid and die/substrate to avoid the lid dropping off.
• A lid with four pillars is not a good design; the lid should be attached all around the substrate, and should not be attached to the die, to avoid die crack failure.
• The adhesive strength spec is written into the IC datasheet to make board level design easy.

D. Mechanical Stress- Compressive load

An IC may suffer mechanical stress from the compressive load of a heatsink. In some applications, the equipment shell (or housing/case) also exerts compressive force on the IC. Solder ball collapse and package substrate crack are the main failure modes under excessive compressive load.

a. Compressive load

b. Solder ball collapse

c. Substrate crack

Figure 21. Compressive load and failures

1) Engineering challenge:

Evaluate the effect of compressive load, relieve pressure with adhesive or an elastic pad, and avoid excessive compressive stress.

2) Requirement for IC:

Enough strength to endure compressive load; the max compressive load spec is written into the IC datasheet; form an industry standard if possible.

E. Mechanical Stress- Bending strain

An IC may suffer bending strain if compressive force is not uniformly loaded on it. PCB bending also causes strain on the IC.

a. Force not uniformly loaded

b. PCB bending

Figure 22. Bending strain resulted

Solder ball crack, die fracture and die crack are the main failure modes under excessive bending.

a. Solder ball crack

b. Die fracture

c. Die crack

Figure 23. Bending strain failures

1) Engineering challenge:

Evaluate the effect of bending strain, relieve strain with adhesive or an elastic pad, lay out the IC perpendicular to the PCB bending direction and far from the PCB edge, fix the PCB and reduce its bending, and avoid excessive mechanical stress during board assembly, equipment transportation and installation.

Bending failure risk increases if the IC is laid out in the same direction as the PCB bending, as device a in the figure below, or too close to the PCB edge, as device b.

Figure 24. Layout and PCB bending

2) Requirement for IC:

Enough strength to endure bending strain; the max bending strain spec, like 300uε (micro-strain), 500uε or 1000uε, is written into the IC datasheet.

F. Electrical Stress

Telecom equipment may be used with its supply power switched on and off frequently, like equipment driven by solar power, or equipment that only works during the daytime and can be switched off at night. Sometimes, to save energy, some modules in the equipment are switched off temporarily and switched on for a while, repeatedly in off-on-off-on cycles.

1) Engineering challenge:

In the situations above, ICs in telecom equipment and modules also suffer frequent power on/off cycling. How does cycling affect IC reliability? Can the IC work long enough to meet the equipment service life requirement?

Frequent power on/off cycling affects IC reliability mainly in two ways. One is thermal shock or temperature stress: temperature increases at power on, and the IC cools down at power off. The other is electrical stress: its effect can be ignored for the few cycles of normal use, but not for very large numbers like 100,000~1,000,000 cycles.

IC level and board level power on/off tests can be conducted at high frequency, like 10 on/off cycles/hour, to evaluate the risk, but a suitable acceleration model is needed. Power on/off cycling is not exactly the same as Temperature Cycling (JESD22-A104) or Power and Temperature Cycling (JESD22-A105).

2) Requirement for IC:

Max power on/off cycles allowed.

G. Wear out life

In the reliability qualification phase, IC wear-out lifetime is evaluated. Collecting these evaluation data and analyzing the relationship between lifetime and process node shows the trend that wear out life generally declines as the IC process shrinks, from hundreds of years to decades. Now it is only about 10 to 20 years at the 45nm, 28nm and 22nm nodes.

Data from several suppliers show that the wafer level EM wear out life of their devices is less than 10 years; however, this is at the absolute max temperature of 125ºC, and the devices can still work for 10 years under normal conditions.

Figure 25. Wear out life declines with IC process shrinking

1) Engineering challenge:

The service life of some equipment is 10~20 years, like high-end routers, or 20~30 years for equipment that is difficult to maintain, like submarine equipment. If an IC's wear out life is less than 10 years, it will be a challenge for telecom equipment design.

2) Requirement for IC:

Keep wear out life above 10 years.

CONCLUSION

Telecom systems and their reliability are introduced, and reliability approaches are summarized. To enable communication anywhere and anytime while staying highly available and reliable, telecom systems are confronted with engineering challenges, which in turn pose requirements for IC components.

REFERENCES

[1] JESD47, Stress-Test-Driven Qualification of Integrated Circuits, JEDEC Solid State Technology Association, Revision H, 2011
[2] SR-332, Reliability Prediction Procedure for Electronic Equipment, Telcordia Technologies Special Report, Issue 3, 2011

