
T. I. Băjenescu, M. I. Bâzu
Reliability of Electronic Components
Springer-Verlag Berlin Heidelberg GmbH
T. I. Băjenescu, M. I. Bâzu

Reliability of
Electronic Components
A Practical Guide to Electronic
Systems Manufacturing

With 212 Figures and 105 Tables

Springer
Prof. Eng. Titu I. Băjenescu, M. Sc.
13, Chemin de Riant-Coin
CH-1093 La Conversion
Switzerland

Ph. D. Marius I. Bâzu


IMT Bucharest
CP Box 38-160
Romania
E-mail: mbazu@imt.ro

ISBN 978-3-642-63625-7

Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme


Băjenescu, Titu I.: Reliability of electronic components : a practical guide to electronic
systems manufacturing / T. I. Băjenescu ; M. I. Bâzu. - Springer-Verlag Berlin Heidelberg GmbH

ISBN 978-3-642-63625-7 ISBN 978-3-642-58505-0 (eBook)


DOI 10.1007/978-3-642-58505-0

This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilm or in other ways, and storage in data banks. Duplication
of this publication or parts thereof is permitted only under the provisions of the German
Copyright Law of September 9, 1965, in its current version, and permission for use must always
be obtained from Springer-Verlag. Violations are liable for prosecution under the German
Copyright Law.
© Springer-Verlag Berlin Heidelberg 1999
Originally published by Springer-Verlag Berlin Heidelberg New York in 1999
Softcover reprint of the hardcover 1st edition 1999
The use of general descriptive names, registered names, trademarks, etc. in this publication does
not imply, even in the absence of a specific statement, that such names are exempt from the
relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready by authors


Cover-design: MEDIO, Berlin
SPIN: 10721349 62/3020 - 5 4 3 2 1 0 - Printed on acid-free paper
Foreword

The first detailed studies of electronic components reliability were undertaken to
improve the performance of communications and navigational systems used by the
American army. The techniques then developed were subsequently refined and
applied to equipment used for many other applications where high reliability was of
paramount importance - for example in civil airline electronic systems.
The evolution of good and reliable products is the responsibility of technical and
professional persons, engineers and designers. These individuals cannot succeed
unless they are given adequate opportunity to apply their arts and mysteries so as to
bring the end-product to the necessary level of satisfaction. Few managements,
however, are yet aware of the far greater potential value of the reliability of their
products or services. Yet customer satisfaction depends, in most cases, far more on
the reliability of performance than on quality in the industrial sense.
There was a time when reliable design could be prescribed simply as "picking
good parts and using them right". Nowadays the complexity of systems,
particularly electronic systems, and the demand for ultrahigh reliability in many
applications mean that sophisticated methods based on numerical analysis and
probability techniques have been brought to bear - particularly in the early stages
of design - on determining the feasibility of systems. The growing complexity of
systems as well as the rapidly increasing costs incurred by loss of operation, have
brought to the fore aspects of reliability of components; components and materials
can have a major impact on the quality and reliability of the equipment and systems
in which they are used. The required performance parameters of components are
defined by the intended application. Once these requirements are established, the
necessary derating is determined by taking into account the quantitative
relationship between failure rate and stress factors. Component selection should not
just be based only on data sheet information, because not all parameters are always
specified and/or the device may not conform to some of them.
When a system fails, it is not always easy to trace the reason for its failure.
However, once the reason is determined, it is frequently due either to a poor-quality
part, to abuse of the system or of a part (or parts) within it, or to a
combination of these. Of course, failure to operate according to expectations can
occur because of a design fault, even though no particular part has failed. Design is
intrinsic to the reliability of a system. One way to enhance the reliability is to use
parts having a history of high reliability. Conversely, classes of parts that are
"failure-suspect" - usually due to some intrinsic weakness of design materials -
can be avoided. Even the best-designed components can be badly manufactured. A
process can go awry, or - more likely - a step involving operator intervention can
result in an occasional part that is substandard, or likely to fail under nominal
stress. Hence the process of screening and/or burn-in to weed out the weak part is a
universally accepted quality control tool for achieving high reliability systems.
Technology plays an important role in the reliability of the component concerned,
because each technology has its own advantages and weaknesses with respect both
to performance parameters and to reliability.
Moreover, for integrated circuits - for example - the selection of the packaging
form is particularly important [inserted or surface-mounted devices; plastic quad
flatpack, fine pitch; hermetic devices (ceramic, cerdip, metal can); thermal
resistance, moisture problems, passivation, stress during soldering, mechanical
strength], as well as the number and type of pins.
Electronic component qualification tests are imperative, and cover
characterisation, environmental and special tests as well as reliability tests; they
must be supported by intensive failure analysis to investigate relevant failure
mechanisms. The science of parts failure analysis has made much progress since
the recognition of reliability and quality control as a distinctive discipline.
However, a new challenge has arisen - that of computer failure analysis, with
particular emphasis on software reliability. Clearly, a computer can fail because of
a hardware failure, but it can also fail because of a programming defect, though the
components themselves are not defective. Testing both parts and systems is an
important, but costly part of producing reliable systems.
Electrostatic discharge (ESD) induced failures in semiconductor devices are a
major reliability concern; although improved process technology and device design
have raised the overall reliability levels achieved by every type of device family,
the failure mechanisms due to ESD - especially those associated with the Charged
Device Model (CDM), the Machine Model (MM), etc. - are still not fully
understood.
Recent reliability studies of operating power semiconductor devices have
demonstrated that the passage of a high-energy ionising particle from cosmic or
other radiation sources through the semiconductor structure may cause a permanent
electrical short-circuit between the device's main terminals.
In electromigration failure studies, it is generally assumed that electromigration-
induced failures may be adequately modelled by a log-normal distribution; but
several research works have shown the shortcomings of this model and have
indicated the possible applicability of the logarithmic distribution of extreme
values.
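To make the lognormal assumption concrete: given a median time to failure t50 and a shape parameter σ, the cumulative fraction failed at time t follows from the standard normal CDF. A minimal Python sketch - the function name and all parameter values are illustrative assumptions, not data from the book:

```python
import math

def lognormal_cdf(t, t50, sigma):
    """Cumulative fraction failed at time t for a lognormal
    time-to-failure distribution with median t50 and shape sigma."""
    z = (math.log(t) - math.log(t50)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative electromigration example: median life 1e5 h, sigma = 0.5
t50, sigma = 1.0e5, 0.5
for t in (1.0e4, 5.0e4, 1.0e5):
    print(f"F({t:.0e} h) = {lognormal_cdf(t, t50, sigma):.6f}")
```

By construction the function returns 0.5 at t = t50; estimating t50 and σ from accelerated-test data is precisely where the modelling debate mentioned above arises.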
The reliability problems of electronic devices, the parameters influencing lifetime
and the degradation processes leading to failure have rapidly gained importance.
The natural enemies of electronic parts are heat, vibrations and excess voltage.
Thus a logical tool in the reliability engineer's kit is derating - designing a circuit,
for example, to operate semiconductors well below their permitted junction
temperatures and maximum voltage ratings.
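The quantitative relationship between failure rate and thermal stress that underlies derating is commonly described by the Arrhenius model. A minimal sketch, assuming an illustrative activation energy of 0.7 eV and typical junction temperatures (none of these figures come from the book):

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_af(t_use_c, t_stress_c, ea_ev):
    """Acceleration factor between a stress junction temperature and a
    derated use temperature, per the Arrhenius model."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_use - 1.0 / t_stress))

# Derating example: running the junction at 85 °C instead of 125 °C,
# with an assumed activation energy of 0.7 eV
af = arrhenius_af(85.0, 125.0, 0.7)
print(f"Temperature-activated failure rate reduced by a factor of about {af:.1f}")
```

With these assumed values, derating the junction from 125 °C to 85 °C reduces the temperature-activated failure rate by roughly an order of magnitude - the kind of quantitative trade-off the derating guidelines refer to.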
Concerning the noise problem and reliability prediction of metal-insulator-metal
(MIM) capacitors, the MIM system may generally be a source of partial discharges
if inhomogeneities such as gas bubbles are present. If a ramp voltage is applied, a
number of current fluctuations occurring in the system are experimentally
observable in many capacitors. In the time domain, the current fluctuations appear
with random amplitude and random times between two consecutive pulses.
Electric charge is transferred through this system, and its value reaches as much as
1 pC. This charge is sufficient to cause irreversible changes in the polyethylene
terephthalate insulating layers. The occurrence of current pulses is used as a
reliability indicator.
... And the catalogue of reliability problems of active and passive electronic
components, integrated or not, could be continued with various other problems and
aspects.
Classic examples of ultrahigh reliability systems can be found both in military
applications and in systems built for NASA; certain supersystems - of which
only one or very few of a kind will be built - must rely more on parts quality
control, derating, and redundancy than on reliability prediction methods.
Young people who are beginning their college studies will pursue their professional
careers entirely in the 21st century. What skills must those engineers
have? How should they be prepared to excel as engineers in the years to come?
The present book - a practical guide to electronic systems manufacturing - tries
to give practical answers to some particular reliability aspects and problems of
electronic components.

The authors
To my wife Andrea - for her patience and encouragement throughout this
project - and daughter Christine - a many faceted gem whose matchless
brilliance becomes more abundant with every passing year.

To my wife Cristina and to my parents.


Acknowledgements

A book could not be produced without the co-operation of a large number of
people. The authors owe primarily a collective debt of gratitude to Dr. Dietrich
Merkle (Springer Verlag) who provided the practical assistance towards bringing
the book to completion. Thanks are due to Ph. D. Eng. Mihai Grecescu, Eng. A.
Jurca, M. Sc., Eng. F. Durand, and Dipl. Chem. Cristina Bâzu, who critically read
and corrected the manuscript - or portions of it, at various stages of its preparation
- and also the staff of the publisher (with special thanks to Mrs Sigrid Cuneus) for
their valuable assistance. The authors would like to acknowledge the kind
permission of all publishers cited in the text to reproduce the respective
illustrations and/or tables from different books and/or papers.
Preface

The last decades have generated extremely strong forces for the advancement of
reliable products beyond the current state of the art. The obvious technical require-
ments of the American military forces (during World War II, the Korean, Vietnam
and Gulf conflicts), but also of the American and European Space Programmes,
have resulted in vastly improved reliability in machinery components. New ap-
proaches to components as well as to system reliability will be required for the next
generation. Product reliability can only be realised by combining the proper uses of
compatible materials, processes, and design practices.
While it is not possible to test reliability into a product, testing can be instru-
mental in identifying and eliminating potential failures while not adversely affect-
ing good components. Unfortunately, product reliability is often compromised by
economic considerations. Optimising the product reliability involves special con-
sideration applied to each of the three life intervals: infant mortality period, useful
life and wearout.
Infant failures should be eliminated from the device population by controlled
screening¹ and burn-in procedures. Two major types of defects are dominant in the
infant mortality period: quality defects and latent defects². Adequate derating
factors and design guidelines should be employed to minimise stress-related
failures during the normal operating lifetime of the product. Finally, the effects of
component wearout should be eliminated by timely preventive maintenance.

¹ The most proficient method for determining representative efficiency factors for conventional
screening tests - and subsequently an optimal screening effectiveness programme - involves a
five-step approach: (i) Determine the dominant failure modes experienced in each technology
and package configuration, as well as the impact of variables such as device complexity on
those failure mode distributions. (ii) Investigate the types and magnitudes of stress that activate
the various failure mechanisms and associated failure modes. Relate these stresses and stress
magnitudes to those specified in conventional screens. (iii) Examine screening data for each
technology to establish the range and the average reject rates actually experienced in the
conventional screening tests. (iv) Analyse the field experience of screened devices to determine for
each technology the screening escape rates and the types of failure modes that escape screening.
(v) Combine all reliability information to formulate efficiency factors for individual screening
tests for various technologies and package configurations; these efficiency factors can be
merged with screening cost information to determine overall screening effectiveness.
² A quality defect is one that may be found by employing normal quality control inspection
equipment and procedures without stressing the component. A latent defect is one that will
escape the normal quality control procedures and requires component stressing in order to be
detected by inspection at the propagated failure level. A well-planned inspection station utilising
detailed criteria, proper instrumentation and trained personnel will exhibit high inspection
efficiency. However, no inspection is perfect, and a 100% efficiency is impossible to attain.
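The three life intervals discussed above - infant mortality, useful life and wearout - together form the well-known bathtub hazard curve. One simple way to reproduce its shape is to add a decreasing Weibull hazard (infant mortality, shape < 1), a constant useful-life failure rate, and an increasing Weibull hazard (wearout, shape > 1). A sketch with purely illustrative parameters, not data from the book:

```python
def bathtub_hazard(t, infant=(0.5, 100.0), constant=1e-4, wearout=(3.0, 5000.0)):
    """Total hazard rate at time t (hours) as a sum of three terms:
    a Weibull infant-mortality term (shape < 1, decreasing),
    a constant useful-life failure rate, and
    a Weibull wearout term (shape > 1, increasing)."""
    b1, eta1 = infant
    b3, eta3 = wearout
    h_infant = (b1 / eta1) * (t / eta1) ** (b1 - 1.0)
    h_wearout = (b3 / eta3) * (t / eta3) ** (b3 - 1.0)
    return h_infant + constant + h_wearout

# Hazard is high early, flat in mid-life, and rises again at wearout
for t in (10.0, 1000.0, 8000.0):
    print(f"h({t:7.0f} h) = {bathtub_hazard(t):.2e} per hour")
```

Plotting h(t) over a few thousand hours yields the characteristic profile: high early failure rate, a long flat region, and a rising wearout tail.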
In many ways, the story of reliability is intimately linked to the exploration of
space. With the space race that followed the orbiting of Sputnik in 1957, there has
been a great effort for improvement in the reliability of components, controls, bat-
teries and energy cells, computers, etc. The extraordinary reliability required for the
flights to the moon, while recognised, has come to be taken for granted as a famil-
iar part of our lives. Although not as breathtaking as the spectacular flash of the
great rockets, modern computers are equally important in making space flight
possible. Just as it is a fact that the modern supercomputer is possible only because
of the existence of extremely reliable components, it is also true that the assessment
of the reliability of components is often possible only because of the existence of
modern supercomputers. The reliability of electronic components has improved
dramatically over the past decades and will increase just as strongly in the future, since the
pressure from component users is forcing the component manufacturers to continu-
ously set new and stringent reliability goals.
The bipolar technology is the most mature semiconductor technology available.
As a result, many of the early failure mechanisms observed in these devices have
been totally eliminated; the devices no longer have failure mechanisms specific to
them and, for the most part, failure mechanisms in bipolar devices tend to be
universal across all device structures. However, the importance of a particular
failure mechanism is dependent on the application of the device. Common failure
mechanisms affecting all bipolar devices are contamination, corrosion, electrical
stress, electromigration, diffusion, insulating oxide breakdown, radiation and stress
relief migration. Packaging and interconnection processes have a dominant influ-
ence on bipolar device reliability.
A part of the great improvement in reliability of manufactured products is due to
the fact that during the last fifty years methods for the assessment and prediction of
reliability have been developed. Looking ahead, we foresee enormous technical
challenges from manufacturing to design. The era of deep submicron is confronting
us with problems that require new approaches. In packaging, multi-chip module has
become feasible, but how to make it economically competitive remains a major
undertaking. Much work needs to be done in interconnect technology, modelling,
extraction, and simulation. The semiconductor industry will continue to progress
well past the turn of the century; progress will generally be determined by a series
of practical limits or challenges, rather than fundamental limits. However, over-
coming these challenges will become increasingly difficult, and the industry will
continue to struggle against perhaps the most important limit of all to its growth:
the cost. The "giga-chip era" has begun; a new challenging approach to ULSI reli-
ability is now needed in response to the "paradigm shift" now being brought about
by simple scaling limitations, increased process complexity, and diversified ULSI
application to advanced multimedia and personal digital assistant (PDA) systems. A
good example of this shift is the new movement from simple failure analysis by
sampling the output of a manufacturing line to the "building-in-reliability" ap-
proach based on identifying and controlling the causes for reduced reliability.
Future technologies in information science will rely on systems with increasing
complexity and on structures with decreasing size.
The higher complexity of modern systems has imposed the development of
microsystems, integrating mechanical structures and multifunctional materials with
microcircuits on the same chip, the key element of the so-called "second silicon
revolution". The monolithic integration of sensors, actuators, optical devices,
valves leads to new devices - the microsystems - having a higher reliability, be-
cause the failure mechanisms linked to bond wires are virtually eliminated. And the
higher integration degree reduces not only bond pads and bond wires, but also the
number of system interconnects, with beneficial effects on the overall reliability.
As structure sizes continue to decrease, the physical and
technological limits of semiconductor nanostructures point to the use of molecules
and atoms in information science. In particular, organic molecules are very attrac-
tive because they can be engineered with very large complexity, and their elec-
tronic and optical properties can be controlled technologically.
Our book should be viewed as a "matter of fact" text on a practical reliability
guide to electronics manufacturing of complex systems rather than a work on the
theory of components reliability, and - as such - it constitutes only a partial survey
(thus, for example, it ignores RF and microwave devices and circuits which are the
heart of wireless products) of some particular aspects of the more commonly
encountered practical reliability problems. The aim of this book is to contribute
to the new approaches and to the understanding and development of electronic
component reliability. The underlying objective of the book is to better understand
why components fail, addressing the needs of engineers who will apply reliability
principles in component design, manufacture, testing, and field service.
This book is designed to present such information at a level suitable for students
in the final year and diploma courses, but it is also very useful both for electronic
systems manufacturing specialists and users, and for doctoral candidates.
Although the material of the book is not developed to the level generally
reached in postgraduate studies, it would be a suitable introduction to the subject, to
be followed by a more detailed examination of particular topics. This book took an
awfully long time to be written; much of the material put together over several
years has been discarded and new chapters have been added for the final English
version.
Our book is the first attempt to compile a volume specifically focused on the re-
liability problems of electronic and/or telecommunications components; it presents
an ample synthesis of specific reliability information in the field, and is addressed
to the electronic engineer who is concerned with equipment and component reli-
ability, and who will encounter a variety of practical, mathematical and scientific
problems in addition to those arising from his own particular branch of engineering.
The result is a reference work that should be invaluable to those involved in the
design and/or in the test of these highly challenging and interesting types of com-
plex electronic systems.
The book attempts to take stock of this domain and to summarise the present
knowledge on semiconductor failure modes, degradation and mechanisms,
knowledge derived from the studies of numerous workers in the field. For
completeness the book also includes a survey of accelerated testing, achieving
better reliability, total quality problems, screening tests and prediction methods
(allowing the reliability of a future electronic system to be evaluated by starting
from predictions on the reliability of each component). A detailed alphabetical
index, a glossary, two
acronym lists (international organisations and useful abbreviations), three reliability
dictionaries and a rich specific bibliography at the end of each chapter, and a
general one at the end of the book, round out the picture of the information offered
by the book.

The authors
Contents

1 INTRODUCTION

1.1 Definition of reliability

1.2 Historical development perspective 2

1.3 Quality and reliability 3

1.4 Economics and optimisation 5

1.5 Probability; basic laws 5


1.5.1 Probability distributions 6
1.5.2 Basic reliability distribution theory 9

1.6 Specific terms 11


1.6.1 The generalised definition of λ and MTBF 13

1.7 Failures types 15


1.7.1 Failures classification 16

1.8 Reliability estimates 17

1.9 "Bath-tub" failure curve 19

1.10 Reliability of electronic systems 20


1.10.1 Can the batch reliability be increased? 20
1.10.2 What is the utility of screening tests? 21
1.10.3 Derating technique 24
1.10.4 About the testability of electronic and telecommunication systems 25
1.10.5 Accelerated ageing methods for equipped boards 26
1.10.6 Operational failures 27
1.10.7 FMEA/FMECA method 29
1.10.8 Fault tree analysis (FTA) 30
1.10.8.1 Monte Carlo techniques 30
1.10.9 Practical recommendations 32
1.10.10 Component reliability and market economy 33

1.11 Some examples 35

References 37

2 STATE OF THE ART IN RELIABILITY 43


2.1 Cultural features 44
2.1.1 Quality and reliability assurance 44
2.1.2 Total quality management (TQM) 46
2.1.3 Building-in reliability (BIR) 48
2.1.4 Concurrent engineering (CE) 49
2.1.5 Acquisition reform 50

2.2 Reliability building 51


2.2.1 Design for reliability 51
2.2.2 Process reliability 52
2.2.2.1 Technological synergies 53
2.2.3 Screening and burn-in 54
2.2.3.1 Burn-in 56
2.2.3.2 Economic aspects of burn-in 59
2.2.3.3 Other screening tests 60
2.2.3.4 Monitoring the screening 61

2.3 Reliability evaluation 65


2.3.1 Environmental reliability testing 66
2.3.1.1 Synergy of environmental factors 68
2.3.1.2 Temperature cycling 70
2.3.1.3 Behavior in a radiation field 72
2.3.2 Life testing with noncontinuous inspection 73
2.3.3 Accelerated testing 75
2.3.3.1 Activation energy depends on the stress level 77
2.3.4 Physics of failure 78
2.3.4.1 Drift, drift failures and drift behaviour 81
2.3.5 Prediction methods 83
2.3.5.1 Prediction methods based on failure physics 84
2.3.5.2 Laboratory versus operational reliability 86

2.4 Standardisation 87
2.4.1 Quality systems 87
2.4.2 Dependability 87

References 87

3 RELIABILITY OF PASSIVE ELECTRONIC PARTS 93

3.1 How parts fail 93

3.2 Resistors 94
3.2.1 Some important parameters 97
3.2.2 Characteristics 98
3.2.3 Reasons for inconstant resistors [3.8] ... [3.10] 100
3.2.3.1 Carbon film resistors (Fig. 3.4) 101
3.2.3.2 Metal film resistors 101
3.2.3.3 Composite resistors (on inorganic basis) 101
3.2.4 Some design rules 101
3.2.5 Some typical defects of resistors 102
3.2.5.1 Carbon film resistors 104
3.2.5.2 Metal film resistors 104
3.2.5.3 Film resistors 105
3.2.5.4 Fixed wirewound resistors 105
3.2.5.5 Variable wirewound resistors 105
3.2.5.6 Noise behaviour 105

3.3 Reliability of capacitors 105


3.3.1 Introduction 105
3.3.2 Aluminium electrolytic capacitors 107
3.3.2.1 Characteristics 108
3.3.2.2 Results of reliability research studies 110
3.3.2.3 Reliability data 111
3.3.2.4 Main failures types 111
3.3.2.5 Causes of failures 112
3.3.3 Tantalum capacitors 112
3.3.3.1 Introduction 112
3.3.3.2 Structure and properties 113
3.3.3.3 Reliability considerations 115
3.3.3.4 ΔC/C0 variation with temperature 116
3.3.3.5 The failure rate and the product CD 117
3.3.3.6 Loss factor 117
3.3.3.7 Impedance at 100 Hz 117
3.3.3.8 Investigating the stability of 35 V tantalum capacitors 117
3.3.3.9 The failure rate model 121
3.3.4 Reliability comparison 121
3.3.5 Another reliability comparison 123
3.3.6 Polyester film / foil capacitors 124
3.3.6.1 Introduction 124
3.3.6.2 Life testing 125
3.3.6.3 λ as a function of temperature and load 126
3.3.6.4 Reliability conclusions 127
3.3.7 Wound capacitors 129
3.3.8 Reliability and screening methods [3.37] [3.38] 131

3.4 Zinc oxide (ZnO) varistors [3.39] ... [3.45] 132


3.4.1 Pulse behaviour of ZnO varistors 134
3.4.2 Reliability results 138

3.5 Connectors 138


3.5.1 Specifications profile 139
3.5.2 Elements of a test plan 140

References 141

4 RELIABILITY OF DIODES 145

4.1 Introduction 145

4.2 Semiconductor diodes 146


4.2.1 Structure and properties 146
4.2.2 Reliability tests and results 146
4.2.3 Failure mechanisms 148
a. Mechanical failure mechanisms 148
b. Electrical failure mechanisms 148
4.2.4 New technologies 149
4.2.5 Correlation between technology and reliability 150
4.2.6 Intermittent short-circuits 153

4.3 Z diodes 154


4.3.1 Characteristics 154
4.3.2 Reliability investigations and results 155
4.3.3 Failure mechanisms 158
4.3.3.1 Failure mechanisms of Z diodes 159
4.3.3.2 Design for reliability 160
4.3.3.3 Some general remarks 161
4.3.3.4 Catastrophic failures 162
4.3.3.5 Degradation failures 162

4.4 Trans-Zorb diodes 163


4.4.1 Introduction 163
4.4.2 Structure and characteristics 163

4.5 Impatt (IMPact Avalanche and Transit-Time) diodes 163


4.5.1 Reliability test results for HP silicon single drift Impatt diodes 165
4.5.2 Reliability test results for HP silicon double drift Impatt diodes 166
4.5.3 Factors affecting the reliability and safe operation 166

References 169

5 RELIABILITY OF SILICON TRANSISTORS 171


5.1 Introduction 171

5.2 Technologies and power limitations 172


5.2.1 Bipolar transistors 173
5.2.2 Unipolar transistors 173

5.3 Electrical characteristics 175


5.3.1 Recommendations 176
5.3.2 Safety Limits 176
5.3.3 The du/dt phenomenon 177

5.4 Reliability characteristics 178

5.5 Thermal fatigue 180

5.6 Causes of failures 182


5.6.1 Failure mechanisms 182
5.6.2 Failure modes 183
5.6.3 A check-up for the users 185
5.6.4 Bipolar transistor peripherics 185

5.7 The package problem 185

5.8 Accelerated tests 186


5.8.1 The Arrhenius model 187
5.8.2 Thermal cycling 188

5.9 How to improve the reliability 190

5.10 Some recommendations 191

References 193

6 RELIABILITY OF THYRISTORS 197


6.1 Introduction 197

6.2 Design and reliability 199


6.2.1 Failure mechanisms 199
6.2.2 Plastic and hermetic package problems 202
6.2.3 Humidity problem 204
6.2.4 Evaluating the reliability 204

6.2.5 Thyristor failure rates 206

6.3 Derating 207

6.4 Reliability screens by General Electric 209

6.5 New technology in preparation: SITH 210

References 213

7 RELIABILITY OF INTEGRATED CIRCUITS 215

7.1 Introduction 215

7.2 Reliability evaluation 219


7.2.1 Some reliability problems 219
7.2.2 Evaluation of integrated circuit reliability 219
7.2.3 Accelerated thermal test 221
7.2.4 Humidity environment 222
7.2.5 Dynamic life testing 223

7.3 Failure analysis 224


7.3.1 Failure mechanisms 224
7.3.1.1 Gate oxide breakdown 225
7.3.1.2 Surface charges 226
7.3.1.3 Hot carrier effects 226
7.3.1.4 Metal diffusion 226
7.3.1.5 Electromigration 227
7.3.1.6 Fatigue 228
7.3.1.7 Aluminium-gold system 229
7.3.1.8 Brittle fracture 229
7.3.1.9 Electrostatic Discharge (ESD) 229
7.3.2 Early failures 230
7.3.3 Modeling IC reliability 231

7.4 Screening and burn-in 233


7.4.1 The necessity of screening 233
7.4.2 Efficiency and necessity of burn-in 235
7.4.3 Failures at screening and burn-in 237

7.5 Comparison between the IC families TTL Standard and TTL-LS 240

7.6 Application Specific Integrated Circuits (ASIC) 240

References 241

8 RELIABILITY OF HYBRIDS 247

8.1 Introduction 247

8.2 Thin-film hybrid circuits 250


8.2.1 Reliability characteristics of resistors 250
8.2.2 Reliability of throughout-contacts 251

8.3 Thick-film hybrids 252


8.3.1 Failure types 253
8.3.2 Reliability of resistors and capacitors 254
8.3.3 Reliability of "beam-leads" 254

8.4 Thick-film versus thin-film hybrids 257

8.5 Reliability of hybrid ICs 259

8.6 Causes of failures 261

8.7 Influence of radiation 264

8.8 Prospect outlook of the hybrid technology 264

8.9 Die attach and bonding techniques 270


8.9.1 Introduction 270
8.9.2 Hybrid package styles 271

8.10 Failure mechanisms 274

References 275

9 RELIABILITY OF MEMORIES 277

9.1 Introduction 277

9.2 Process-related reliability aspects 283

9.3 Possible memories classifications 288

9.4 Silicon On Insulator (SOI) technologies 290


9.4.1 Silicon on sapphire (SOS) technology 291

9.5 Failure frequency of small geometry memories 291



9.6 Causes of hardware failures 292


9.6.1 Read only memories (ROMs) 294
9.6.2 Small geometry devices 296

9.7 Characterisation testing 296


9.7.1 Timing and its influence on characterisation and test 298
9.7.2 Test and characterisation of refresh 298
9.7.2.1 Screening tests and test strategies 299
9.7.3 Test-programmes and -categories 301
9.7.3.1 Test categories 301
9.7.3.2 RAM failure modes 302
9.7.3.3 Radiation environment in space; hardening approaches 303

9.8 Design trends in microprocessor domain 305

9.9 Failure mechanisms of microprocessors 306

References 310

10 RELIABILITY OF OPTOELECTRONICS 313

10.1 Introduction 313

10.2 LED reliability 316

10.3 Optocouplers 318


10.3.1 Introduction 318
10.3.2 Optocouplers ageing problem 318
10.3.3 CTR degradation and its cause 320
10.3.4 Reliability of optocouplers 321
10.3.5 Some basic rules for circuit designers 323

10.4 Liquid crystal displays 324


10.4.1 Quality and reliability of LCDs 325

References 327

11 NOISE AND RELIABILITY 329

11.1 Introduction 329



11.2 Excess noise and reliability 330

11.3 Popcorn noise 331

11.4 Flicker noise 333


11.4.1 Measuring noise 333
11.4.2 Low noise, long life 333

11.5 Noise figure 334

11.6 Improvements in signal quality of digital networks 336

References 336

12 PLASTIC PACKAGE AND RELIABILITY 339

12.1 Historical development 339

12.2 Package problems 341


12.2.1 Package functions 342

12.3 Some reliabilistic aspects of the plastic encapsulation 343

12.4 Reliability tests 344


12.4.1 Passive tests 345
12.4.2 Active tests 346
12.4.3 Life tests 347
12.4.4 Reliability of intermittent functioning plastic encapsulated ICs 349

12.5 Reliability predictions 352

12.6 Failure analysis 353

12.7 Technological improvements 354


12.7.1 Reliability testing of PCB equipped with PEM 356
12.7.2 Chip-Scale packaging 356

12.8 Can we use plastic encapsulated microcircuits (PEM) in high reliability applications? 357

References 359

13 TEST AND TESTABILITY OF LOGIC ICS 363

13.1 Introduction 363

13.2 Test and test systems 364


13.2.1 Indirect tests 365

13.3 Input control tests of electronic components 365


13.3.1 Electrical tests 366
13.3.2 Some economic considerations 367
13.3.3 What is the cost of the absence of tests? 368

13.4 LIC selection and connected problems 369


13.4.1 Operational tests of memories 370
13.4.2 Microprocessor test methods 371
13.4.2.1 Self-testing 371
13.4.2.2 Comparison method 371
13.4.2.3 Real time algorithmic method 372
13.4.2.4 Registered patterns method 372
13.4.2.5 Random test of microprocessors 373

13.5 Testability of LICs 373


13.5.1 Constraints 374
13.5.2 Testability of sequential circuits 374
13.5.3 Independent and neutral test laboratories 375

13.6 On the testability of electronic and telecommunications systems 376

References 379

14 FAILURE ANALYSIS 381

14.1 Introduction [14.1] ... [14.25] 381

14.2 The purpose of failure analysis 383


14.2.1 Where are the failures discovered? 383
14.2.2 Types of failures 384

14.3 Methods of analysis 386


14.3.1 Electrical analysis 386
14.3.2 X-ray analysis 387
14.3.3 Hermeticity testing methods 388
14.3.4 Conditioning tests 388
14.3.5 Chemical means 388

14.3.6 Mechanical means 389


14.3.7 Microscope analysis 389
14.3.8 Plasma etcher 389
14.3.9 Electron microscope 389
14.3.10 Special means 390

14.4 Failure causes 392

14.5 Some examples 393

References 410

15 APPENDIX 413
15.1 Software-package RAMTOOL++ [15.1] 413
15.1.1 Core and basic module RJ Trecker 413
15.1.2 RM analyst 414
15.1.3 Mechanicus (Maintainability analysis) 414
15.1.4 Logistics 414
15.1.5 RM FFT-module 415
15.1.6 PPoF-module 415

15.2 Failure rates for components used in telecommunications 415

15.3 Failure types for electronic components [15.2] 418

15.4 Detailed failure modes for some components 419

15.5 Storage reliability data [15.3] 420

15.6 Failure criteria. Some examples 420

15.7 Typical costs for the screening of plastic encapsulated ICs 421

15.8 Results of 1000 h HTB life tests for CMOS microprocessors 421

15.9 Results of 1000 h HTB life tests for linear circuits 422

15.10 Average values of the failure rates for some IC families 422

15.11 Activation energy values for various technologies 423

15.12 Failures at burn-in 424

References 424

GENERAL BIBLIOGRAPHY 425

RELIABILITY GLOSSARY 455

LIST OF ABBREVIATIONS 473

POLYGLOT DICTIONARY OF RELIABILITY TERMS 481

INDEX 501
List of figures and tables

Figures

Fig. 1.1 Elements of the product quality


Fig. 1.2 Factors influencing the purchasing of an equipment: a some years ago;
b today (in %)
Fig. 1.3 The optimum zone of the best compromise price/reliability: a first
investment costs; b operation costs; c total costs
Fig. 1.4 Relationship between the probability density function f(x) and the
cumulative distribution function F(x)
Fig. 1.5 Relationship of shapes of failure rate (A), failure density (B), and
reliability function (C)
Fig. 1.6 Reliability and probability of failure
Fig. 1.7 Failure classification
Fig. 1.8 Part base failure rate versus stress and temperature
Fig. 1.9 The "bath-tub" failure curve of a large population of statistically
identical items, for two ambient temperatures θ2 > θ1, for electronic components
Fig. 1.10 Variation of failure rate as a function of IC complexity
Fig. 1.11 Failure mechanisms detectable with the aid of screening tests
Fig. 1.12 Typical defects in an electronic system, arisen during the useful life
Fig. 1.13 Product development chart with scheduled FTA inputs
Fig. 1.14 Effect of TTR and ITF on mission performance
Fig. 1.15 Possible testing scenario, from input control to system testing. To
reduce the duration required for each developing step, specific testing methods
will be developed
Fig. 2.1 Corrective action in quality and reliability assurance programme
Fig. 2.2 Information flow between the quality assurance department and others
departments
Fig. 2.3 An example of the structure for quality and reliability activity
Fig. 2.4 The relationship between supplier and customer in a total quality
system
Fig. 2.5 Elements of a concurrent engineering (CE) analysis
Fig. 2.6 Distribution of contamination sources for semiconductor wafers
Fig. 2.7 Typical curves for the difference Cr - Cs. Curve A shows a
situation where burn-in does not pay off, i.e. the total cost using burn-in is always
greater than the cost without burn-in, irrespective of the burn-in period; curve
B demonstrates that a burn-in lasting about two days (48 h) gives the maximum
economic benefit [2.42]

Fig. 2.8 Flow-chart of MOVES


Fig. 2.9 Fuzzy set: triangle-shaped membership function with five regions
Fig. 2.10 Failure rates ratios of different component families at environment
temperatures of +40°C and +70°C [2.70]
Fig. 2.11 The median number of temperature cycles producing the failure of
50% of a component batch (Nm) vs. temperature range (ΔT)
Fig. 2.12 Failure mechanisms at temperature + vibrations. Appearance of the
second failure mechanism after 10⁴ temperature cycles
Fig. 2.13 Comparison between: a the reliability fingerprint (RF) for a current
batch and b the fingerprint of the reference batch (RFref)
Fig. 2.14 Screening and reliability evaluation performed by using the model
described by the relations (2.29) and (2.30)
Fig. 2.15 Emergence possibilities of the semiconductors defects
Fig. 2.16 Superposition of physics of failure intrinsic reliability models with
field failure data, in the useful period
Fig. 2.17 The physics of failure modelling approach
Fig. 3.1 Overall life characteristic curve
Fig. 3.2 Time behaviour domain of 100 carbon film resistors (1 MΩ / 0.25 W;
nominal power). Prescribed limit value ΔR/R = 1%
Fig. 3.3 Drift data for metal film resistors in accordance with MIL-R-10509: t
- operating time; θK - body temperature (°C); ΔR - resistance variation (%)
Fig. 3.4 Parameters variation by ageing depending on the following parameter:
a nominal value; b operating power; c nominal charge [3.9]
Fig. 3.5 Minimisation charging curve for: a) carbon film resistors; b) metal
film resistors; c) wirewound resistors. P = permitted region: the area with the best
ratio reliability/costs and with optimal safety working reserves. Utilisation of
resistors in this area is very frequent, since a reliability deterioration is normally
not expected. D = doubtful region: in this area the resistors work without
going beyond the nominal values, but not with the optimum reliability. F =
forbidden region: in this area the nominal values are exceeded and the resistors are
overloaded
Fig. 3.6 200 kΩ carbon film resistor time behaviour at different normal
operating temperatures (mean values, alternating voltage)
Fig. 3.7 Failure rate dependence on the operation temperature, for different
derating ratios and at a relative humidity ~ 60%
Fig. 3.8 Noise variation for 1) metal resistor; 2) carbon resistor and 3)
wirewound resistor
Fig. 3.9 Impedance and residual current variation for a 68 µF / 15 V electrolytic
capacitor at an environmental temperature of +70°C: charged with nominal d.c.
voltage; without charge
Fig. 3.10 Guaranteed lifetime for French aluminium electrolytic capacitors (m
hours at n°C, categories from A to G), unaffected by encapsulation and voltage
Fig. 3.11 Possible lifetime for the categories A to G, for different utilisations: A
(Ø > 6.5 mm / 1000 h / 70°C); B (Ø > 6.5 mm / 2000 h / 70°C); C (Ø > 6.5 mm /
1000 h / 85°C); D (Ø > 6.5 mm / 2000 h / 85°C); E (Ø > 6.5 mm / 5000 h / 85°C);
F (Ø > 6.5 mm / 10,000 h / 85°C); G (U > 100 V / 6.5 mm < Ø < 14 mm / 2000 h /
125°C)

Fig. 3.12 Operation principle of the tantalum capacitor


Fig. 3.13 The residual current curve of the tantalum capacitor CTS 13 (10 µF / 25 V)
Fig. 3.14 Time dependence of the residual current for a tantalum capacitors
group operating at an environmental temperature of +85°C. A) After zero
operation hours; B) after 1000 operation hours; C) after 4000 operation hours; D)
after 8000 operation hours
Fig. 3.15 Reliability of tantalum capacitor (the hatched zones are theoretically
estimated)
Fig. 3.16 ΔC/C0 variation between 25 and 85°C, at nominal voltages from 6 to
40 V
Fig. 3.17 Interdependence of CU and λ. M = mean failure rate
Fig. 3.18 Measured values of tantalum capacitor impedance, at different
nominal voltages (f = 100 Hz)
Fig. 3.19 The main type of graphical display for the obtained results
Fig. 3.20 Results of the stability investigations of tantalum capacitors from
various manufacturers (L, M, N, 0)
Fig. 3.21 Tantalum capacitors breakdown voltage, for various manufacturers
Fig. 3.22 Comparison between electrolytic and tantalum capacitors, at different
nominal voltages (f = 100 Hz). A - aluminium electrolytic capacitor 10 µF / 35 V /
70°C. B - tantalum capacitor 8 µF / 35 V / 85°C
Fig. 3.23 Increase of the median lifetime with the reduction of operating
voltage, at +85°C. Criterion: A - IR > 0.04 µA/(µF × V); B - IR > 0.02 µA/(µF × V)
Fig. 3.24 Distribution of ΔC during the damp heating test without load at
40°C, RH 90-95%, for 21 days: 100 V: x = 2.1%, n = 460;
250 V: x = 2.6%, n = 855;
400 V: x = 2.8%, n = 505;
600 V: x = 2.8%, n = 458
Fig. 3.25 Distribution of tan δ after the damp heat test without load at 40°C,
RH 90-95%, for 21 days: (a) before the test: x = 36 × 10⁻⁴, n = 505; (b) after the
test: x = 38 × 10⁻⁴, n = 505
Fig. 3.26 Capacitance variation of the 100 nF polystyrene capacitor with
plastic cover
Fig. 3.27 Comparison between the limitation voltages for different peak pulse
currents: a) 39 V metal-oxide varistor; b) 39 V Trans-Zorb
Fig. 3.28 The mean decrease of breakdown voltage BV after the pulse tests
(measured after 10 pulses, each having a duration of 1 µs): a) 39 V metal-oxide
varistor; b) 39 V Trans-Zorb
Fig. 3.29 Oscilloscope pictures: a) 39 V Trans-Zorb; b) 39 V metal-oxide
varistor. Pulse test conditions: 50 A/µs with a rise time of 4 kV/µs. Vertical scale:
50 V/div.; horizontal scale: 2 ns/div
Fig. 3.30 Some typical electrical values of a varistor on the U-I curve
Fig. 3.31 The varistor polarisation. a) The U-I characteristic. b) Modification
of the varistor voltage (+ in the pulse direction; - in the opposite pulse direction)
Fig. 3.32 Evolution of the leakage current during the operating test time: 1 - in
the opposite direction of the pulse; 2 - in the direction of the pulse; 3 -
comparative curve, without pulses
Fig. 3.33 Distribution of connectors on the users' market
Fig. 3.34 Time behaviour of CuNi 9Sn2 connectors

Fig. 4.1 Comparison between failure rates of silicon rectifier diodes, for
different stresses: d.c. loading on the barrier layer and operation under capacitive load
Fig. 4.2 The failure causes (in %) of the silicon rectifier diodes
Fig. 4.3 Failure rate versus normalised temperature of the barrier layer,
according to MIL-HDBK-217; 1 - silicon diode; 2 - germanium diode; 3 - Z-diode
Fig. 4.4 "Superrectifier" technology with "glass" of plastic materials (General
Instrument Corp.). 1 - brazed silicon structure; 2 - sinterglass passivation; 3 -
non-inflammable plastic case
Fig. 4.5 The "double plug" technology. 1 - glass tube; 2 - structure; 3 - plug
Fig. 4.6 Planar structure in the standard technology. 1 - silver excrescence
assuring the anode contact; 2 - SiO2 passivation assuring the protection of the pn
junction at the surface; 3 - metallisation of the cathode contact
Fig. 4.7 Standard technology with the two plugs (FeNi alloy). 1 - connection; 2
- structure; 3 - hermetically closed glass body; 4 - plug; 5 - silver outgrowth
assuring the anode contact; 6 - cavity about 200 µm wide; 7 - welding
Fig. 4.8 Technology "without cavity", with mesa structure. 1 - metallisation of
the anode contact; 2 - metallisation of the cathode contact; 3 - SiO2 passivation
assuring the protection of the junction on the lateral parts of the structure
Fig. 4.9 Technology "without cavity", with the two silvered tungsten plugs. 1 -
structure; 2 - welded contact; 3 - hermetically sealed glass body
Fig. 4.10 Intermediate technology between "standard" and "without cavity":
this is a planar structure, but of bigger dimensions. 1 - passivation oxide; 2 -
glassivation; 3 - cathode contact (metallisation)
Fig. 4.11 Intermediate technology: the glass body is in contact with the
glassivation
Fig. 4.12 Behaviour of different Z diodes while ageing after storage at +70°C.
Beyond 20,000 hours, the 6.3 V Z diode no longer operates reliably
Fig. 4.13 Behaviour at ageing of the breakdown voltages of Z diodes, measured
at -ID = 1 mA and 20 mA: A) Tj = 135°C; B) Tj = 90°C
Fig. 4.14 Impatt diode chip in hermetically sealed package, with copper stud at
bottom serving as terminal and heatsink. Other terminal is at top
Fig. 4.15 Effect of junction temperature on failure rate for Ea = 1.8 eV
Fig. 4.16 The influence of circuit load resistance on output power for either a
pulsed or CW Impatt in a circuit which resonates the diode at a single frequency
f0. The pulsed or d.c. operating current is kept fixed at I0
Fig. 5.1 Failure rate vs. virtual junction temperature [5.10]
Fig. 5.2 Correlation between the damage speed, expressed by the failure rate
(λ, in 10⁻⁵/h), and the inverse of the temperature, 1/T (in 10⁻³/K)
Fig. 5.3 Voltage dependence of the median time (lognormal distribution).
Experimental data were obtained from four samples withdrawn from the same
batch of bipolar transistors undergoing a life test at the same temperature, at the
same dissipated power (Pmax ), but at different combination Ui, Ii (where Ui x Ii =
Pmaxfor all samples)
Fig. 5.4 Temperature range vs. number of cycles till failure (for power
transistors encapsulated in package TO-3)
Fig. 5.5 Temperature range vs. number of cycles till failure (for power
transistors encapsulated in package TO-220)

Fig. 5.6 Correlation between failure rate and normalised junction temperature.
For transistors with dissipated power higher than 1W at an environmental
temperature of 25°C, the values must be multiplied by 2
Fig. 5.7 Failure rate vs. junction temperature for various reliability levels of
power transistors
Fig. 6.1 Two-transistor analogue of pnpn structures
Fig. 6.2 Passivation and glassivation (National Semiconductor document). The
passivation is a process protecting against humidity and surface contaminants
by means of a doped vitreous silicon oxide film: 1 diffusion; 2 substrate; 3
glassivation; 4 conductive line; 5 metal; 6 passivation
Fig. 6.3 Estimated λ of a standard SCR depending on junction temperature,
reverse and/or forward voltage, and failure definition, for a maximum rated
junction temperature of +100°C
Fig. 6.4 Estimated λ of a standard SCR depending on junction temperature,
reverse and/or forward voltage, and failure definition, for a maximum rated
junction temperature of +125°C
Fig. 6.5 Estimated λ of a standard SCR depending on junction temperature,
reverse and/or forward voltage, and failure definition, for a maximum rated
junction temperature of +150°C
Fig. 6.6 Simplified structural simulation model of SITH
Fig. 6.7 Potential distribution in SITH along channel axis
Fig. 6.8 Electron energy distribution along channel axis
Fig. 6.9 Barrier height versus gate bias
Fig. 7.1 Evolution of the metallisation technology and corresponding allowed
current densities
Fig. 7.2 Main sequences of the planar process: a starting material; b deposition
of an epitaxial n layer; c passivation (with an oxide layer); d photolithography; e
diffusion of a p+ layer; f metallisation
Fig. 7.3 A log(ΔV/V0) vs. log t plot for the hot-carrier degradation mechanism
Fig. 7.4 Plot of the Arrhenius model for A = 1 and Ea = 1.1 eV
Fig. 7.5 Comparison of data referring to early failures and long term failures: a)
typical domain of long term failure mechanisms for commercial plastic
encapsulated ICs; b) domain of early failures for bipolar commercial SSI/MSI; c)
domain of early failures of commercial MOS LSI [7.21]
Fig. 7.6 Replacement rate of commercial TTL ICs in plastic package (in RlT,
during infant mortality period) [7.21]
Fig. 7.7 Monte-Carlo reliability simulation procedure for ICs
Fig. 7.8 Failure distribution for bipolar monolithic ICs
Fig. 7.9 Failure distribution for MOS ICs
Fig. 7.10 Failure distribution for COS/MOS ICs
Fig. 8.1 The place of hybrid circuits in the general framework of
microelectronics
Fig. 8.2 Drift of tantalum nitride resistors, under load, is smaller than 0.1%
after 10³ working hours
Fig. 8.3 Stability of tantalum nitride resistors depending on the number of
cycles of damp heat
Fig. 8.4 The results of high temperature storage of tantalum nitride resistors,
at various temperatures

Fig. 8.5 Noise characteristics of Birox 1400 pastes before and after laser
adjustment, depending on the resistor surface (for Birox 1400, 17S, and 17G
pastes of Du Pont better noise figures may be obtained)
Fig. 8.6 Evaluation of the relative costs for the thick- and thin-film integrated
circuits
Fig. 8.7 The experience of users (A ... L) versus predicted failure rates
Fig. 8.8 Primary causes of failures of small power hybrid circuits
Fig. 8.9 The primary causes of the failures (power hybrid circuits)
Fig. 8.10 Statistical reliability data for hybrid circuits
Fig. 8.11 Without a cooling radiator, the enamelled layer works at a lower
temperature than that of an equivalent aluminium oxide chip. As a consequence,
with a cooling radiator the aluminium oxide has a better power dissipation. 1 -
enamelled layer; 2 - aluminium oxide; 3 - beryllium oxide
Fig. 8.12 A good example of thick-film circuit: a band filter (Ascom Ltd.,
Berne)
Fig. 8.13 Conductive lines printed on ceramic substrate: drying at +150°C;
baking of the conductive lines at +850°C
Fig. 8.14 Printing of the first resistor paste; drying at +150°C
Fig. 8.15 Printing of the second resistor paste; drying at +150°C; pastes baking
at +850°C
Fig. 8.16 Printing the protection layer (glazing); drying at +150°C; baking the
glazing at +500°C
Fig. 8.17 Printing the soldering (which remains wet for component mounting);
mounting of capacitors; reflow-soldering
Fig. 8.18 Measuring of all capacitors; calculation of nominal values of resistors
(97% of nominal value); ageing of substrate (70 hours at +150°C)
Fig. 8.19 Fine adjustment of resistors at nominal value
Fig. 8.20 Mounting of the active components; mounting of connections
Fig. 8.21 Pre-treatment of integrated circuits for thick-film hybrids
Fig. 8.22 Chip mounting
Fig. 8.23 Beam lead attachment requires thermocompression bonding or
parallel gap welding to the substrate metallisation
Fig. 9.1 Decrease of device dimensions in the years 1970 to 2010 [9.3]
Fig. 9.2 Development of molecular electronics/photonics from conventional
electronics and optics [9.3]
Fig. 9.3 Trend of DRAM device parameters [9.5]
Fig. 9.4 Increase of process steps due to device complexity [9.5]
Fig. 9.5 Record density trend in DRAM and other media [9.5]
Fig. 9.6 Another possible classification of semiconductor memories. (PLA:
programmable logic array)
Fig. 9.7 Illustration of a soft error
Fig. 9.8 Defects in digital MOS and in linear and digital bipolar technology ICs
[9.20]
Fig. 9.9 Generation of electron-hole pairs in the gate and field oxides (PG =
polysilicon gate)
Fig. 10.1 Classification of optoelectronic semiconductor components
[10.1][10.2]
Fig. 10.2 A typical red LED cross-section

Fig. 10.3 Basic large-area-contact LED structure [10.3]


Fig. 10.4 System model for an optocoupler [10.1][10.7][10.9]
Fig. 10.5 Effect of varying the stress to monitor ratio (M) on CTR
Fig. 10.6 IRED output versus time slope prediction curves, assuming a virtual
initial time of 50 hours
Fig. 10.7 Optical response curve of liquid crystal cell. Vth = threshold voltage
(threshold at which response is 10% of maximum); Vsat = saturation voltage
(voltage at which response is 90% of maximum)
Fig. 10.8 LCD failure rate λ dependence on the time t; typical lifetime: 50,000 h,
λ ≈ 10⁻⁶/h for Us = 5 V, Tamb = 25°C
Fig. 11.1 Typical burst noise observed at the collector of a transistor [11.16]
Fig. 11.2 Equivalent current generators
Fig. 11.3 Sequence of the proposed lot acceptance reliability test programme
Fig. 11.4 Noise characterisation of an operational amplifier [11.26]
Fig. 12.1 Results of destructive tests performed with thermal shocks (MIL-STD-
883, method 1011, level C, -65°C ... +125°C) for various package types [12.12]: 1 -
epoxy without die protection; 2 - silicone with detrimental package protection; 3 -
epoxy with die protection; 4 - silicone with normal die protection; 5 - ceramic
package; 6 - phenol package with die protection; 7 - flat pack
Fig. 12.2 Results of temperature cycling tests for various types of plastic
encapsulation [12.15]; to be noted the good behaviour of encapsulant no. 6 (epoxy
A, without die protection) and, especially, the remarkable behaviour of
encapsulant no. 5 (epoxy B, without die protection)
Fig. 12.3 Lognormal distribution of failures for transistors encapsulated in
silicone resin. Test stress: ambient temperature TA = 100°C, relative humidity r.h.
= 97% [12.49]
Fig. 12.4 Average lifetime for an integrated circuit, plastic encapsulated (DIL,
14 pins) vs. [RH]²
Fig. 12.5 Dependence of the acceleration factor on the duty cycle, having as
parameter the die over-temperature [12.61]; test conditions: 85°C/85% r.h. (192
hours cycle)
Fig. 13.1 The productivity gap between the expected chips and the design tools
can be transferred on the chip only with a clever combination of intellectual
property. (Source: Sematech)
Fig. 14.1 Scheme for performing a failure analysis
Fig. 14.2 Detail of a memory chip
Fig. 14.3 Detail of a metallisation
Fig. 14.4 Detail from Fig.l4.4, at a higher enlargement
Fig. 14.5 Contact of a connection wire
Fig. 14.6 Distribution offailures for a semiconductor device
Case 1:
Fig. 14.7 TTL Integrated circuit 944. Overcharge of an extender input
Case 2:
Fig. 14.8 DTL integrated circuit 9936, good at the input control, but failed at
the control of equipped cards (pin 13 interrupted). On opening the case, the path
was found to be melted and the input diode shorted
Case 3:

Fig. 14.9 Integrated circuit 936. Electrical overcharge: pads of the output
transistors are melted
Case 4:
Fig. 14.10 DTL integrated circuit 9946, defect at electrical control of equipped
cards (inputs 1 and 2 overcharged)
Case 5:
Fig. 14.11 Optocoupler: the failure mode is an open circuit of the
phototransistor; the emitter solders are interrupted. Because the optocouplers
passed a 100% electrical control, it seems that no mechanical defects occurred. To
reach the aluminium pad (leading to the emitter windows), the glass passivation
layer was removed and the failure mechanism was discovered: the metallisation
surrounding the emitter area was burned by an overload current produced by
scratching of the pad during the manufacturing process. Only a small portion of
the pad remained good, allowing it to pass the electrical control. When the
optocoupler was used, the pad was burned and the failure occurred
Case 6:
Fig. 14.12 Aluminium and oxide removal during ultrasonic soldering
Case 7:
Fig. 14.13 Local damage of the protection layer during ultrasonic soldering
Case 8:
Fig. 14.14 TTL IC 7410: two inputs found defective at the electrical functioning
control of equipped cards. The silicon was broken under the contact zone (a rare
defect, produced by incorrect manipulation during the manufacturing process)
Case 9:
Fig. 14.15 Local removal of aluminium at testing, below a thermocompression
area
Case 10:
Fig. 14.16 Break of an aluminium wire (ultrasonic bond)
Case 11:
Fig. 14.17 Crack in a crystal
Case 12:
Fig. 14.18 Break of a die
Case 13:
Fig. 14.19 TTL IC 7400 (X170): output 8 is defective at the electrical control of
equipped cards. One may notice the short circuit between the contact wires
soldered at pins 8 and 7, respectively
Case 14:
Fig. 14.20 Failures of diodes after a test at temperature cycling [14.34]. Causes:
wrongly centred dies and wrong alignment at diode mounting
Case 15:
TTL IC 7475 (flip-flop with complementary outputs). Normal operation was
observed only for temperatures between 25 and 40°C. At temperatures higher than
40°C, the output level is unstable. The phenomenon is produced by the contact
windows being insufficiently opened at the open collector output transistors. (Fig.
14.21 ... 14.23 Metallised dies. Fig. 14.24 Dies with metallisation removed.)
Case 16:
Bipolar LSI IC type HAI-4602-2: electrostatic discharges. There are no
differences between the handling precautions for bipolar and MOS ICs, because

both categories are sensitive to electrostatic discharges. SEM pictures show the
areas affected by electrostatic discharge (Fig. 14.25... 14.27)
Case 17:
Partial view of the metallisation layer of a ROM die, longitudinal section
(Fig. 14.28... 14.31)
Case 18:
Fig. 14.32 Notches formed during metallisation corrosion
Case 19:
Fig. 14.33 Excellent metallisation of a collector contact window of a TTL IC
(X5000)
Case 20:
Fig. 14.34 Excellent covering of the metallisation over an oxide step (X9000)
Case 21:
Fig. 14.35 Wrong thinning of a metallisation pad over an oxide step (X10000)
Case 22:
Hybrid circuit voltage regulator with power transistor at the output. Melted
connection at the emitter of the power transistor. This failure mechanism may be
avoided if the manufacturer does not forget to specify in the catalogue sheet that a
capacitor with good high frequency characteristics must be mounted at the
regulator input (Fig. 14.36 ... 14.38)
Fig. 14.38 An error occurred: the output voltage is higher than the input voltage.
To avoid the failure, a blocking diode must be mounted between the input and
output (a detail not mentioned by the manufacturer).
Case 23:
Small signal transistors with wire bonding defects
Fig. 14.39 Bad solder of a connection wire
Fig. 14.40 Edge solder joint
Fig. 14.41 Short circuit of the base wire with the crystal
Case 24:
Fig. 14.42 Electrical opens of a metallic pad (RAM chip), produced by
electromigration
Case 25:
Fig. 14.43 Typical example of popcorn noise at an operational amplifier
Case 26:
Fig. 14.44 Silicon dissolution in aluminium (X 11000)
Case 27:
Fig. 14.45 Dissolution of silicon in aluminium. To be noted the change of
orientation in horizontal plane (100) (X 1700)
Case 28:
Fig. 14.46 Hole in a gate oxide, leading to a short circuit between metallisation
and substrate (X 5000)
Case 29:
Fig. 14.47 Hole in a gate oxide, leading to a short circuit between metallisation
and substrate (X 5000)
Case 30:
Fig. 14.48 Crystallisation of a point defect in a thermally grown SiO2 (X 4400)
Case 31:

Fig. 14.49 Surface separation of an aluminium metallisation covering an oxide


step (X 16000)
Case 32:
Fig. 14.50 Image of a biased transistor, evidenced by the potential contrast
method (X 1000)
Case 33:
Fig. 14.51 Discontinuity of a metallisation pad, evidenced by the potential
contrast method (X 500)
Case 34:
Metal or ceramic packages may be opened by polishing, cutting, soldering or
hitting at a certain point, carefully, so as not to damage the die. The pictures show
the opened metal packages of two hybrid circuits with multiple dies. The solder
joints are the weak points of the system (Fig. 14.52-14.53)
Case 35:
Fig. 14.54 For the plastic packages, the opening is difficult. If input short
circuits or opens have been found in previous investigations, one may establish
with X-ray radiography, before opening the package, whether the defect is at the
connection between the pin and the die [14.26]

Tables

Table 1.1 Relationships used in reliability modelling


Table 1.2 Comparison between control costs (expressed in $) of a defective
component
Table 1.3 Ratio effectiveness / cost for screening tests
Table 1.4 Limit values of the three tested parameters
Table 1.5 Experimental data, before and after reliability tests (RT)
Table 2.1 The evolution of the reliability field
Table 2.2 Actual domains in the reliability of semiconductor devices
Table 2.3 The principles of TQM
Table 2.4 The core elements of a building-in reliability approach
Table 2.5 Screening procedure for ICs class B (MIL-STD-883)
Table 2.6 Selection of the reliable items at screening, for a batch of 15 items
(fuzzy method with 5 regions)
Table 2.7 Climatic conditions for using in fixed post unprotected to bad
weather
Table 2.8 Climatic conditions for using in fixed post protected to bad weather
Table 2.9 Experiments on temperature cycling
Table 2.10 Comparison of the sensitivity in a radiation field, for components
manufactured by various technology types
Table 2.11 Simulated noncontinuous inspection for Menon data
Table 2.12 Comparison of estimated value obtained by various methods
Table 2.13 Rapid estimation of the reliability level for the current batch
presented in Fig. 2.13 (fuzzy comparison method with 5 regions)
Table 2.14 Models obtained from the model described by the relations (2.31)
and (2.32)

Table 2.15 SYRP prediction vs. accelerated life test (ALT) results
[SYRP/ALT in each column]
Table 2.16 Comparison of reliability prediction procedures
Table 3.1 Resistors; fixed; power
Table 3.2 Resistors; variable; power
Table 3.3 Comparison between metal film and carbon film resistors (general
specifications; charge 0.1 ... 2 W)
Table 3.3 Correlation between storage duration and new forming process
(reactivation) for wet aluminium electrolytic capacitors, for different nominal
voltages and diameters
Table 3.4 Criteria for aluminium electrolytic capacitors drift failures (DIN
41240,41332)
Table 3.5 Tantalum capacitor impedance as a function of frequency
Table 3.6 Correction factor OR for various values of the series resistance Rs
Table 3.7 Aluminium electrolytic capacitors versus tantalum capacitors
Table 3.8 Tested quantities and failures in life testing at +85°C, 1.5 UN, max.
7000h
Table 3.9 Estimated λ under derated conditions
Table 3.10 Tested quantities and catastrophic failures in climatic tests
Table 3.11 Percentages outside requirements after the damp heating test
without load: 40°C, RH 90-95%, 21 days
Table 3.12 Percentages outside requirements after the damp heating test
without load: 40°C, RH 90-95%, 21 days
Table 3.13 Percentages outside requirements after the accelerated damp
heating test preceded by the rapid temperature change test: 55°C, RH 95-100%, 2
days
Table 3.14 Breakdown voltage and field strength at breakdown
Table 4.1 Results of a comparative reliability study on 400 mW Z diodes,
alloyed and diffused, respectively
Table 4.2 Compared reliability of Z diodes (% defects, after 168 hours
operation, at Pmax)
Table 4.3 Mean temperature coefficient (in %/°C) of the Z diodes, between
+25°C and +125°C
Table 4.4 Reliability comparisons at the component level
Table 4.5 Failure rates, predicted and observed
Table 4.6 Catastrophic failures
Table 4.7 Degradation failures
Table 4.8 Catastrophic failures, FRD cards.
Table 4.9 The distribution of the typical failure modes
Table 5.1 The main technologies used to manufacture silicon transistors
Table 5.2 Main bonding techniques for silicon transistors
Table 5.3 Technological variants for power transistors
Table 5.4 Bipolar vs. VMOS transistors
Table 5.5 Dilatation coefficients
Table 5.6 Failure sources (in %) for power transistors encapsulated in TO-3
and TO-220
Table 5.7 Testing conditions for temperature cycling testing of cases TO-3
and TO-220

Table 6.1 Failure mechanisms and associated stresses


Table 6.2 Description of the device parameters for simulations
Table 7.1 Predictions for Si CMOS technology development: 1994, 1995 and
1997 editions of the National Technology Roadmap for Semiconductors
Table 7.2 Acceleration factors at an operating temperature of 125°C vs. 25°C
Table 7.3 Acceleration factors for various activation energies and testing
temperatures vs. a testing temperature of 55°C
Table 7.4 Results of oxygen plasma treatment
Table 7.5 Incidence of main failure mechanisms (in %) arising in infant
mortality period
Table 7.6 Corresponding costs for various percentages of failed ICs
Table 7.7 Screening tests for aerospace and defense applications (MIL-STD-
783)
Table 7.8 A comparison between various reliability tests: efficiency, failure
percentages, cost (MIL-STD-883, class B)
Table 7.9 Failures arising from a screening sequence
Table 7.10 Failure rates for transistors and ICs
Table 7.11 Distribution of failure causes (in %) for various utilisation fields
Table 7.12 A comparison between two bipolar IC families: LS vs. TTL
Standard
Table 8.1 Some data on layers
Table 8.2 Usual causes and modes of failure of thick-film hybrids
Table 8.3 Some encapsulation techniques
Table 8.4 The efficiency of screening tests (MIL-STD 883, method 5004, class
B)
Table 8.5 Typical failure rates of components for hybrids (FIT), versus the
working temperature (°C). [Recommended for use only in cost evaluation and
circuit classification, since the data are strongly process-dependent]
Table 8.6 Properties of thick-film substrates [8.25]
Table 8.7 Properties of thin-film substrates [8.25]
Table 8.8 Die attach - diode chips
Table 8.9 Comparative λ for various bonding techniques (in %/1000 h) [8.25]
Table 9.1 X86 microprocessor chronology [9.7]
Table 9.2 Some semiconductor memories types
Table 9.3 Pareto ranking of failure causes in 3400 failed VLSI devices
[9.19]
Table 9.4 Historical perspective of the dominant causes of failures in devices
[9.18]
Table 9.5 EPROM failure mechanisms [9.25]
Table 9.6 Incoming inspection testing versus characterisation [9.23]
Table 9.7 Some typical characteristics of the two types of testing [9.23]
Table 11.1 Measurement results
Table 12.1 The properties of some moulding compounds
Table 12.2 A comparison between the 1979-1992 decrease of failure rates (in
FITs) for plastic and ceramic packages, respectively
Table 12.3 Surface leakage current produced by humidity on a Si/Al test
structure
Table 12.4 The effect of humidity on the time to pad interruption (that
is 50% corrosion); the pad has a width of 4 µm and a thickness of 1 µm
Table 12.5 Relationship between the duty cycle and the equilibrium state (test
conditions: over-temperature of 20°C, duty cycle 0.15, 85°C and 85% r.h.)
Table 12.6 A history of failure rate improvements (in FITs) for plastic
encapsulated ICs
Table 12.7 Results of a reliability test program: high humidity testing in a non-
saturating autoclave (108°C, 90% RH). SOIC = Small outline IC package, SLCC =
Silicone junction coated IC, CerDIP = Ceramic dual-in-line package (hermetic)
Table 12.8 Results of reliability tests performed by IEEE Gel Task Force
Table 13.1 Classification of defects depending on their effects [13.1][13.2]
Table 13.2 Average indicative figures of the parameters A ... F and the unit
cost for discrete components, linear and digital ICs [13.5]
Table 14.1 Working plan for a failure analysis for semiconductor components
Table 14.2 Trap characterisation from DLTS spectra
Table 14.3 Examples for the usage of a Scanning Electron Microscope (SEM)
Chap. 15
Table 15.2 Failure rates for components used in telecommunications
Table 15.3 Failure types for electronic components [15.2]
Table 15.4 Detailed failure modes for some components
Table 15.5 Storage reliability data [15.3]
Table 15.6 Typical costs for the screening of plastic encapsulated ICs (in Swiss
francs) [15.4]
Table 15.7 Failure criteria. Some examples
Table 15.8 Results of 1000 h HTB life tests for 8 bit CMOS microprocessors
encapsulated in ceramics, type NSC 800 [15.5]
Table 15.9 Results of 1000 h HTB life tests for linear circuits encapsulated in
plastic [15.5]
Table 15.10 Average values of the failure rates for some IC families
Table 15.11 Activation energy values for various technologies
Table 15.12 Failures at burn-in [15.8]
1 Introduction

1.1
Definition of reliability

Reliability is a relatively new concept, which rounds off quality control and is
linked to the study of quality itself. Simply explained, reliability is the ability
of an item to work properly; it is its property of not failing during operation. One
may say that reliability is the operational certainty for a stated time interval.
This definition is however imperfect because, although it contains the time factor,
it does not define a precisely measurable quantity.
As the first reliability studies were made in the USA, at the beginning the
American definition was adopted: reliability is the probability that a
certain product does not fail for a given period of time, and for certain opera-
tional and environmental conditions. The reliability of an element (or of an as-
sembly) is today defined as the probability that an item will perform its required
function under given conditions for a stated time interval.¹ Component reli-
ability involves the study of both reliability physics and reliability statistics. They
make an important contribution to a better understanding of the ways in which
components fail, and of how the failures develop in time. This provides an
invaluable background for understanding and assessing the real-world failure
patterns of component reliability that come to us from field failure studies. The
effort of researchers has been concentrated on establishing lifetime patterns for
individual component types (or for individual failure mechanisms). Reliability is a
collective name for those measures of quality that reflect the effect of time in the
storage or the use of a product, as distinct from those measures that show the state
of the product at the time of delivery.
In the general sense, reliability is defined as the ability of an item to perform a
required function under stated conditions for a stated period of time.

¹ Although this definition corresponds to a concept rich in information, it has one
disadvantage. Because of the need to specify a defined operation time of the respective
item, the reliability has a different value for each time interval. That is why it is necessary
to define other quantities, depending not only on the operation time, but also on the mean
interval between failures (MTBF - mean time between failures) and on the failure rate
per hour (λ). Nevertheless, it is not sufficient to indicate the failure rate for a certain
constructive element if the operational and environmental conditions (on which the
failure rate depends) are not simultaneously given.

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999

The stated conditions include the total physical environment (mechanical,
electrical and thermal conditions). Perform means that the item does not fail. The
stated time interval can be very long (twenty years, as for telecommunication
equipment), long (a few years) or short (a few hours or weeks, as for space re-
search equipment). This parameter might also be, for example, the mileage (of
an automobile) or the number of cycles (of a relay unit).

1.2
Historical development perspective

The first studies concerning electronic equipment and its reliability were
made with the purpose of improving military avionics and the radar
systems of the army. The mathematical formulation of reliability and its utili-
sation for material tests originate in ideas born during the Second World War,
when Wernher von Braun and his colleagues worked on the V1 missiles. They
started from the idea that a chain cannot be more resistant than its weakest link.
The studied object was a simple rocket; nevertheless they registered failure after
failure, a constructive element failing each time, although the components had
been submitted to detailed control tests. The difficulties appeared less because of
systematic errors than because of the multiple failure possibilities arising from the
interaction of the numerous component parts, which acted simultaneously. So
they came up with the idea that all the constructive elements must play a role in
the reliability evaluation.
The reliability of individual parts is usually characterised by their failure rate λ,
giving the number of failures per unit of time. The mathematician Erich
Pieruschka was invited to this debate and he stated, for the first time, that if the
chance of one element to survive is 1/x, the chance of an ensemble composed of n
identical elements to survive is 1/x^n. In exponential terms, we can write, for a
constant failure rate: the reliability of an isolated constructive element is
exp(-λt) and, consequently, the reliability of n elements is exp(-nλt). In the
general case (in exponential form or not), the reliability of an element is
calculated with:
R = 1/x. (1.1)
The reliability of an ensemble formed of n elements connected in series will be:
Rs = R^n = 1/x^n. (1.2)
Therefore, the reliability of a series circuit formed of n elements will be:
Rs = R1 · R2 · R3 · ... · Rn = ∏_{i=1}^{n} Ri. (1.3)
This equation is known as the "theorem of the product of reliabilities". It was also
established that the reliability of one constructive element must be much greater
than the required system reliability. That is why new constructive elements have
been elaborated, with higher reliability, and finally - for the V1 rocket - an overall
reliability of 75% was obtained.
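The product theorem (1.3) is easy to check with a short numerical sketch; the element reliabilities below are invented purely for illustration:

```python
# Series-system reliability, Eq. (1.3): Rs = R1 * R2 * ... * Rn.
# The element reliabilities below are invented for illustration.
from functools import reduce

def series_reliability(reliabilities):
    """Product of the individual element reliabilities."""
    return reduce(lambda a, b: a * b, reliabilities, 1.0)

# Ten identical elements with R = 0.97 each: Rs = 0.97**10, well below 0.97
print(series_reliability([0.97] * 10))
# Each element must be far better than the required system reliability:
# one hundred elements of R = 0.999 still give only Rs ~ 0.905
print(series_reliability([0.999] * 100))
```

For n identical elements the sketch reduces to Eq. (1.2), Rs = R^n, which makes the weakest-link argument quantitative.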

Since that time, complexity - especially that of electronic systems - has
been growing continuously. This explains why all engineers - if they desire to
reach and remain at the top of the new technologies - and manufacturers - if
they do not want to lose collaborations because of differing interpretations of
the reliability concept - must learn how to use the new methods.

1.3
Quality and reliability

To clarify the problems from the beginning - although they are inseparably
bound - we distinguish some very important properties of electronic systems.
The German Society for Quality (DGQ) defines quality as the condition that
makes an object or a functional element correspond to the pre-established re-
quirements. Another definition says: quality is the measure in which a compo-
nent corresponds to the properties guaranteed by the manufacturer, beginning
with the moment of delivery to the client. In the following, we understand by
quality a measure of the degree to which a device conforms to applicable specifi-
cation and workmanship standards. Quality is characterised by the acknowl-
edged percentage of defects in the studied batch. The quality of components is
determined by design quality and manufacturing quality, taking into account an
optimum compromise between requirements and costs. We distinguish, too, be-
tween "design quality" and "quality of the finished object".
Product testing must assure that each unit satisfies the requirements. These
tests can be made on the entire lot, or on samples. If the consequential costs
(which can arise from the utilisation of defective elements) substantially exceed
the test costs, testing the entire lot with a programmable tester instead of testing
samples can increase the certainty of the test results.
Since an operational defect can never be excluded for a given time interval, an
operation without errors can be foreseen only with a certain probability. There-
fore, the bases of reliability theory are probability theory and statistics. It must
also be taken into account that reliability depends directly on the manufacturing
manner, and also greatly depends on the utilisation mode of the item. This is
underlined by the fact that for reliability not only the number of elements from
the first series which fail is important, but also the deviations of their
characteristics. We must know for what time period the initial characteristics
are preserved, how great the variation over time of the deviations is, what
percentage of failures occurs during the first operation hours, what the failure
rate over the operation time is, what the shape of the survival function is, and
finally which statistical distribution can be associated with it. All these
characteristics are represented in Fig. 1.1.
Reliability is the decision criterion for a component which fulfils all the
quality requirements. Do not forget that the user can make an important
contribution to prolonging (or shortening) the lifetime of the component. In the
past, system designers imposed drastic quality conditions, trying to obtain a
greater certainty that the constructive elements satisfy the specifications of the
certificate

[Figure: block diagram of the elements of product quality: % defects (AQL, LTPD, ppm); time-independent and time-dependent parts; parameter distribution; parameter stability; evaluation]
Fig. 1.1 Elements of the product quality

[Figure: two charts comparing the relative weights of quality, service and price]
Fig. 1.2 The factors influencing the purchasing of an equipment: a some years ago; b today

of guarantee. Today, the designers demand acceptance tests that complete the
quality inspection; this is requested to make sure that the manufacturer's specifi-
cations are valid and applicable initially, at incoming inspection, but also later,
after a longer operation time. Some years ago, the factors influencing the purchase
of equipment or of a system had the ratios shown in Fig. 1.2a. Today, these ratios

have changed into those shown in Fig. 1.2b. It can be seen that reliability and
quality together make up a total of 50%.

1.4
Economics and optimisation

It is known that improving system reliability reduces maintenance costs. In
accordance with the DIN 40042 standard, maintainability is defined as a quantity
that estimates the degree to which a studied element is able to be maintained or
restored to a state permitting it to fulfil the specified function. Another definition
(MIL-STD 721 D) of maintainability is: a characteristic of design and installation
expressed as the probability that an item will be retained in or restored to a
specified condition within a given period of time, when the maintenance is
performed in accordance with the prescribed procedures and resources.

[Figure: cost curves versus reliability, from 0 to 1.0]
Fig. 1.3 The optimum zone of the best compromise price/reliability: a first investment costs; b operation costs; c total costs

Already in the planning phase or in the design phase of a new product, the
probability that the desired product will stay within the limits of the overall
planned costs must be maximised. Not only an optimal reliability, but also an
optimum compromise between price and reliability (Fig. 1.3) is sought. It can
be seen that if the pursued goal is correctly established, reliability acts in the
sense of price reduction.

1.5
Probability; basic laws

Modern reliability principles are mainly based upon statistics and probability.
Therefore, in the following some elementary concepts are reviewed.
There are two main definitions of probability: the classical definition and
the relative-frequency definition. In the classical definition, if an event can occur

in N mutually exclusive and equally likely ways, and if n of these outcomes are of
one kind A, the probability of A is n/N. For example, the probability of a head or
a tail in the toss of a coin is 1/2. The classical definition is not widely used in real
applications.
In the relative-frequency definition of probability, a random experiment is re-
peated n times under uniform conditions, and a particular event E is observed to
occur in f of the n trials. The ratio f/n is called the "relative frequency" of E for
the first n trials. If the experiment is repeated a sufficiently large number of times,
the ratio f/n for the event E approaches the value P, the probability of the event
E. This definition indicates that the probability is a number between 0 and 1:
0 ≤ P ≤ 1. (1.4)
There are three basic laws (for complementation, for addition and for multipli-
cation):
Law of complementation. If the probability that the event A does not occur is
P(Ā), then:
P(A) + P(Ā) = 1 (1.5)

P(Ā) = 1 - P(A). (1.6)

Conditional probabilities. P(A|B) is the probability that event A will occur
given that event B occurs. [P(B|A) is the same statement for B.] If events A
and B are statistically independent, then P(A|B) = P(A) and P(B|A) = P(B).
Law of multiplication. The intersection (multiplication) is an event that occurs
if each one of the events A and B occurs and is written P(A ∩ B). Then:
P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B). (1.7)
If the events A and B are statistically independent, the law of multiplication
reduces to:
P(A ∩ B) = P(A) P(B). (1.8)
Law of addition. The union (addition) is an event that occurs if at least one of
the events A or B (or both) occurs and is written P(A ∪ B). Then:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B). (1.9)
If A and B are mutually exclusive, i.e., if A occurs then B cannot, and if B oc-
curs then A does not, then P(A ∩ B) = 0.
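The three laws can be illustrated with a few lines of Python; the event probabilities below are arbitrary illustration values, chosen so that A and B are independent:

```python
# The basic probability laws (Eqs. 1.5-1.9); the event
# probabilities below are arbitrary illustration values.
p_a, p_b = 0.6, 0.5          # P(A), P(B)
p_a_and_b = p_a * p_b        # P(A and B); A, B taken as independent (Eq. 1.8)

# Law of complementation (Eq. 1.6)
p_not_a = 1 - p_a

# Law of addition (Eq. 1.9)
p_a_or_b = p_a + p_b - p_a_and_b

print(round(p_not_a, 6), round(p_a_or_b, 6))   # 0.4 0.8
```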

1.5.1
Probability distributions

A probability distribution describes how often different values of a given charac-
teristic are expected to occur. These distributions may be either discrete or conti-
nuous. Discrete random variables assume distinct values, such as the integers,
while continuous random variables assume any value within a defined range.
Discrete distributions. If f(x) generates the probability that a random variable X
will take on certain discrete values, it is called a probability function. The bino-
mial probability function is a discrete distribution, which is associated with re-
peated trials of the same event. For an event (for example: success or no failure)
whose probability of occurrence on any trial is p, the probability of non-
occurrence is 1-p. Then f(x) is defined by
f(x) = {n! / [x!(n-x)!]} p^x (1-p)^(n-x) (1.10)
where x = 0, 1, 2, ..., n. This function describes the number of successful trials
expected in a series of n independent trials.
The Poisson probability function, in addition to being an approximation of the
binomial probability function when n is large and p is very small, is a useful prob-
ability function in its own right to describe the occurrence of isolated or rare
events. The Poisson probability function is expressed as
f(x) = [λ^x exp(-λ)] / x! (1.11)
where x = 0, 1, 2, ... is the number of times the rare event occurs, and λ is the
average number of times the event occurs.
When x = 0, the Poisson reduces to the reliability formula, or negative expo-
nential:
R = exp(-λt). (1.12)
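Both probability functions, and the Poisson approximation of the binomial for large n and small p, can be sketched numerically; n, p and λ below are illustration values:

```python
# Binomial (Eq. 1.10) and Poisson (Eq. 1.11) probability functions.
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Probability of exactly x occurrences in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Probability of exactly x rare events; lam = average number of events."""
    return lam**x * exp(-lam) / factorial(x)

# For large n and small p the Poisson with lam = n*p approximates the binomial:
n, p = 1000, 0.002
for x in range(4):
    print(x, binomial_pmf(x, n, p), poisson_pmf(x, n * p))

# For x = 0 the Poisson reduces to R = exp(-lam), Eq. (1.12):
assert abs(poisson_pmf(0, 2.0) - exp(-2.0)) < 1e-15
```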
Continuous distributions. When a random variable X is free to take any value
within an interval, the resulting probability distribution is continuous. Fig. 1.4
shows the relationship between the probability density function f(x) and the cu-
mulative distribution function F(x):
F(x) = P(X ≤ x). (1.13)
[Figure: density curve f(x), with the shaded area up to x₁ equal to F(x₁), here 80%]
Fig. 1.4 Relationship between the probability density function f(x) and the cumulative distribu-
tion function F(x)

For a continuous distribution:

f(x) = dF(x) / dx. (1.14)
It should be clear from these last statements that:
f(x) ≥ 0 (1.15)

∫_{-∞}^{+∞} f(x)dx = 1. (1.16)

The cumulative distribution function never decreases as the variable increases:



F(x₁) ≤ F(x₂) for x₁ < x₂ (1.17)

and

F(-∞) = 0, F(+∞) = 1 (1.18)

∫_{-∞}^{x₁} f(x)dx = F(x₁) (1.19)

as shown in Fig. 1.4.


Engineers are familiar, in general, with the normal distribution, which is the ba-
sis for many statistical techniques. The probability density function for the normal
distribution is
f(x) = [1/(σ√(2π))] exp[-(x-μ)²/(2σ²)] (1.20)
where x ranges from -∞ to +∞, μ is the mean, and σ is the standard deviation.
The cumulative distribution function F(x) for the normal distribution cannot be
integrated as an algebraic equation, but has been evaluated by numerical inte-
gration techniques and is extensively tabulated in books on statistics.
In the lognormal distribution, ln X is normally distributed. The density func-
tion is:
f(x) = [1/(σx√(2π))] exp[-(ln x - μ)²/(2σ²)]. (1.21)
The failure rate function Z(t) begins at zero, rises to a maximum and then de-
creases (this is the only density for which this occurs). An equation for the loca-
tion of this maximum was derived by Sweet (1990). The maximum lies in a finite
interval for all positive values of the standard deviation of the associated normal
random variable.
The Weibull distribution was developed to study the fatigue of materials. The
density function for the Weibull distribution is:
f(x) = [β(x-γ)^(β-1)/η^β] exp[-(x-γ)^β/η^β] (1.22)

and

F(x) = 1 - exp[-(x-γ)^β/η^β] (1.23)

where η is the scale parameter, β is the shape parameter, and γ is the location
parameter. If the failures can start as soon as the devices are operated, then γ = 0.
The β parameter of the Weibull distribution is important in determining the failure
rate:
for β < 1, the failure rate is decreasing; for β = 1, the failure rate is constant;
and for β > 1 the failure rate is increasing. Therefore, the Weibull distribution can
be used to characterise components that are subject to infant mortality, random
failures, or wearout failures.
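The influence of the shape parameter β on the failure rate can be checked numerically; a minimal sketch with γ = 0 and an arbitrary scale η = 1000 h, chosen only for illustration:

```python
# Weibull failure rate Z(t) = f(t)/R(t), from Eqs. (1.22)-(1.23) with
# location gamma = 0 and an arbitrary scale eta (illustration values).
from math import exp

def weibull_hazard(t, beta, eta):
    """Z(t) = f(t)/R(t); reduces to beta * t**(beta-1) / eta**beta."""
    f = (beta * t**(beta - 1) / eta**beta) * exp(-(t / eta)**beta)
    r = exp(-(t / eta)**beta)          # R(t) = 1 - F(t)
    return f / r

eta = 1000.0                           # hours, illustration value
for beta, regime in [(0.5, "infant mortality"),
                     (1.0, "random failures (constant rate)"),
                     (2.0, "wearout")]:
    z1 = weibull_hazard(100.0, beta, eta)
    z2 = weibull_hazard(900.0, beta, eta)
    if abs(z2 - z1) < 1e-12:
        trend = "constant"
    elif z2 < z1:
        trend = "decreasing"
    else:
        trend = "increasing"
    print(f"beta = {beta}: Z(t) is {trend} -> {regime}")
```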

1.5.2
Basic reliability distribution theory

Almost every discussion on reliability begins and ends with the statement of fail-
ure rates for either components or systems. Some very basic and interesting reli-
ability equations can be developed. For example, if the probability of a successful
event is represented by R(t) and the probability of an unsuccessful event (a
failure) is represented by F(t):
F(t) = ∫_0^t f(t)dt (1.24)
and the probability of success is:
R(t) = 1 - ∫_0^t f(t)dt. (1.25)
F(t) is the distribution function for the probability of failure (the probability
that a device will fail until the time moment t). R(t) is the distribution function for
the probability of success (the probability that a device will not fail until the time
moment t). The probability that failures will occur between any times t₁ and t₂ can
be calculated from the probability function

P = ∫_{t₁}^{t₂} f(t)dt (1.26)

and since all devices and systems have a finite lifetime:

P = ∫_0^∞ f(t)dt = 1. (1.27)

The density function f(t) may be derived from (1.25), by differentiating:
f(t) = -dR(t)/dt = -R'(t). (1.28)

Another expression that is always part of every reliability discussion is the mean
time to failure, MTTF, used for non-repairable systems. The mean time between
(successive) failures, MTBF, is used if the system recovers to the same state after
each failure [1.33]. MTBF values must be computed with different reliability
distributions for different time periods between failures. By using the mathema-
tical expectation theorem, MTTF can be expressed as:

MTTF = ∫_0^∞ t f(t)dt. (1.29)

The instantaneous failure rate, Z(t), can be calculated for any probability dis-
tribution by taking the ratio of the failure density function f(t) and the reliability
function R(t):
Z(t) = f(t) / R(t). (1.30)
Figure 1.5 shows the relationship between the shapes of the failure rate, failure
density and reliability functions. One of the most used distributions is the negative
exponential distribution. The reliability formula:
R = exp(-λt) (1.31)

where λ is the failure rate and t is the time, can be derived from the Poisson dis-
tribution by using the first term of this distribution (for x = 0). The probability
density function of the exponential distribution is:
f(t) = λ exp(-λt). (1.32)
The MTTF can be calculated with (1.29). Making the substitution for f(t):

MTTF = ∫_0^∞ t λ exp(-λt)dt = λ ∫_0^∞ t exp(-λt)dt = 1/λ. (1.33)
[Figure: three stacked plots versus time: A the failure rate Z(t), with a useful-life (constant failure rate) region where Z(t) = λ followed by a wearout (increasing failure rate) period; B the failure density f(t), starting at f(0); C the reliability function]
Fig. 1.5 Relationship of shapes of failure rate (A), failure density (B), and reliability function
(C)

One can see that the MTTF of the negative exponential is equal to the recipro-
cal of the failure rate; this relationship holds only for the negative exponential
distribution. The failure rate of the negative exponential distribution is:
Z(t) = f(t)/R(t) = [λ exp(-λt)] / exp(-λt) = λ. (1.34)
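The constant hazard rate (1.34) and the MTTF relation (1.33) can be verified with a short numerical sketch; the failure rate λ is an illustration value:

```python
# Negative exponential distribution: f(t) = lam*exp(-lam*t),
# R(t) = exp(-lam*t); Z(t) = lam (Eq. 1.34) and MTTF = 1/lam (Eq. 1.33).
# The failure rate lam is an illustration value.
from math import exp

lam = 2e-6                                 # failures per hour

def reliability(t):
    return exp(-lam * t)

def density(t):
    return lam * exp(-lam * t)

def hazard(t):
    return density(t) / reliability(t)

# The hazard rate is constant, independent of the age of the item:
print(hazard(0.0), hazard(1e5))            # both ~ 2e-06

# Crude numerical check of MTTF = integral of R(t)dt = 1/lam:
# left Riemann sum over [0, 10^7 h] with a ~500 h step.
dt = 1.0 / lam / 1000.0
mttf = sum(reliability(i * dt) * dt for i in range(20_000))
print(mttf, 1.0 / lam)                     # both ~ 5e+05 hours
```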

[Figure: R(t) decreasing and F(t) increasing versus time, with R(t) + F(t) = 1; at t = m (= MTBF), F(t) = 0.63 and R(t) = 0.37]
Fig. 1.6 Reliability and probability of failure

Very often, as a first approximation, it is assumed that electronic compo-
nents follow an exponential distribution. One important characteristic of this dis-
tribution is that the failure rate is independent of time. This allows the
combination of devices and hours to be varied in the unit-hours of reliability
testing. For example, if 100 000 unit-hours are required for testing, 100 units can
be tested for 1000 hours, or 10 units can be tested for 10 000 hours to demonstrate
a given reliability level. Failure rates are then usually expressed as percent/1000
hours or 10⁻⁵/h. If one cannot use the exponential distribution (meaning that the
failure rate is not constant in time), then the reliability cannot be expressed in
percent/1000 hours. If we presume that the relation λ = constant is valid only for
a limited time interval, beyond this time interval λ is time dependent, particularly
operation time dependent. That is why, if the MTBF is greater than the operation
time period in which λ was presumed constant, then the MTBF can be considered
as the inverse of the failure rate λ. Conversely, if the operation time is greater
than the MTBF, from the reliability test of a batch of components one can deduce
whether a component survives to the MTBF value, expressed in hours. The
survival probability of the batch beyond the MTBF value is approximately 37%
(Fig. 1.6), where:
R(MTBF) = exp(-MTBF/MTBF) = exp(-1) ≈ 0.37. (1.35)
This means that, after a test time t = MTBF (in hours), about 37% of all
units that began the test remain. The phenomenon that finally limits the life of a
component is the wearout (Fig. 1.5 A).
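The 37% survival figure and the unit-hours equivalence can be checked with a few lines; batch size and failure rate are illustration values:

```python
# Survival at t = MTBF for a constant failure rate, Eq. (1.35):
# R(MTBF) = exp(-1) ~ 0.37. Batch size and lam are illustration values.
from math import exp

lam = 1e-5                 # failure rate in 1/h
mtbf = 1 / lam             # 100 000 h

survivors = round(100 * exp(-lam * mtbf))
print(f"of 100 units, about {survivors} survive to t = MTBF")   # about 37

# Unit-hours equivalence under the exponential assumption:
# 100 units * 1000 h and 10 units * 10 000 h give the same 10^5 unit-hours.
assert 100 * 1000 == 10 * 10_000 == 100_000
```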

1.6
Specific terms

To avoid misunderstandings, we must clarify from the beginning the cha-
racteristic terms and expressions of the reliability vocabulary. At the end of this
book you will find a glossary with the most frequently used reliability terms. Here
only some important notions will be presented.
A device or an item is any component, electronic element, assembly or equip-
ment that can be considered individually. It is a functional or structural unit,
which is considered as an entity for investigations. It may consist of hardware
and/or software and also include, if necessary, human resources.

As far as reliability is concerned (that is, the ability of an item to remain
functional under given conditions for a stated time interval), the following notions
must be remembered: the probability, the requirements that must be satisfied, the
operation conditions, and the operating time. As reliability is defined as a
probability (of non-failure), it can be expressed through a mathematical model or
equation.
To complete the reliability notion, it is necessary to explain some other terms,
such as failure and operating time. The failure is the termination of the ability of
an item to perform a required function. The operating time is the period of time
during which an item performs its required function. For a non-repairable item,
the operating time until failure is named lifetime. If the operating time is equal to
the mean operating time (the mean of the exponential distribution of failures), this
means that of 100 initially working items it is probable that only 37 items (ex-
pected value) will remain functional (and not 50 items, as might erroneously be
expected from a mean value).
The hazard rate Z(t) is the instantaneous failure rate of items having survived
to time t. The hazard rate multiplied by the time segment dt, that is Z(t)dt, repre-
sents the conditional probability of failure in that small time increment.
It can be shown that
Z(t) = f(t) / R(t) = f(t) / [1 - F(t)]. (1.36)
The failure rate Z(t) is defined as the ratio of the number of failures occurring
in the time interval to the number of survivors at the beginning of the time inter-
val, divided by the length of the time interval. It is a measure of the instantaneous
speed of failure. The unit for Z(t) is the number of failures per time unit; the most
often used unit is 10⁻⁹/h, or the FIT (Failure In Time). A variety of other names
are used in the reliability literature for the hazard rate (such as, for example,
instantaneous failure rate, mortality rate, instantaneous hazard rate, rate of
occurrence of failure ROCOF, etc.). We will use synonymously the terms hazard
rate Z(t) and failure rate λ(t), defined as the defect density f(t) divided by the
fraction of working elements; at the observation moment, this fraction is
1 - F(t), where F(t) is the failure probability. Also
λ(t) = f(t) / [1 - F(t)] for each t ≥ 0 with F(t) < 1. (1.37)
Considering these simplified conditions, the inverse of the failure rate defines
the mean time between failures MTBF, measured in hours. In general, we can
distinguish four types of failure rates: observed, extrapolated, estimated and pre-
dicted.
The elementary relationships between failure rate, failure distribution functions,
failure density functions, reliability, mean time between failures (MTBF), mean
time to repair (MTTR), and availability should be clearly understood and are
summarised in Table 1.1. For their derivation see [1.22] [1.24] [1.46] [1.94] and
[1.103].

1.6.1
The generalised definition of the failure rate (λ) and of the
mean time between failures (MTBF)

The failure rate can be deduced considering that the test begins at the moment
t = 0 with n₀ components. After a time t, ns components survive and nf
components have failed:
n₀ = ns + nf. (1.38)

Table 1.1 Relationships used in reliability modelling

f(t) = [n(t_i) - n(t_i + Δt_i)] / (N Δt_i),    t_i < t ≤ t_i + Δt_i

Z(t) = [n(t_i) - n(t_i + Δt_i)] / [n(t_i) Δt_i],    t_i < t ≤ t_i + Δt_i

R(t) = 1 - F(t)

F(t) = ∫_0^t f(x)dx

R(t) = exp[-∫_0^t Z(x)dx]

f(t) = Z(t) exp[-∫_0^t Z(x)dx]

Z(t) = f(t) / R(t)

MTBF = ∫_0^∞ t f(t)dt = ∫_0^∞ R(t)dt

MTBF = lim_{s→0} R*(s),    R*(s) = Laplace transform of R(t)

A(0, T) = (1/T) ∫_0^T A(t)dt    A(t) = pointwise (or instantaneous) availability, defined as the probability that the system is operational at the time t, regardless of how often it has failed before; A(0, T) = interval availability, the fraction of time the system is operational during t ∈ (0, T)

A = lim_{t→∞} A(t) = MTBF / (MTBF + MTTR)    A = steady-state availability, the probability that the system will be operational at any random point of time
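The steady-state availability relation of Table 1.1 can be illustrated with a short sketch; the MTBF and MTTR values are invented for illustration:

```python
# Steady-state availability, from Table 1.1: A = MTBF / (MTBF + MTTR).
# The numbers below are invented for illustration.
def steady_state_availability(mtbf_h, mttr_h):
    """Long-run fraction of time the system is operational."""
    return mtbf_h / (mtbf_h + mttr_h)

# A unit failing on average every 50 000 h and repaired in 8 h:
a = steady_state_availability(50_000, 8)
print(f"A = {a:.6f}")                      # ~ 0.999840

# Halving the repair time raises availability without changing reliability:
assert steady_state_availability(50_000, 4) > a
```

The second point is the practical content of Sect. 1.4: availability can be improved either by raising the MTBF (reliability) or by lowering the MTTR (maintainability).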

The failure rate is given by dnf/dt. This ratio can be interpreted as the number of
components which fail in the unit of time. As ns components survived, the failure
rate per component is
λ = (1/ns)(dnf/dt). (1.39)
The reliability at the time t can be expressed as the probability of non-failure for
the interval (0, t]. While of the initial n₀ only ns remained:
R(t) = ns/n₀. (1.40)
Differentiating, we obtain:
dR(t)/dt = -(1/n₀)(dnf/dt) (1.41)
and
dnf/dt = -n₀[dR(t)/dt]. (1.42)
From (1.39) and (1.42) it results:
λ = (1/ns)[-n₀(dR/dt)]. (1.43)
But, in accordance with (1.40), λ(t) (in s⁻¹) becomes:
λ = -[1/R(t)][dR(t)/dt]. (1.44)
This relation has general validity if nothing is known about the variation in
time of λ. The only restrictive condition is that λ must always be positive and
that R(t) must be a monotone decreasing function. By integrating the relation
(1.44) between 0 and t, we obtain:

∫₀^t λ dt = −∫₁^R dR(t)/R = −ln R(t). (1.45)

For t = 0 and R = 1 we have

R(t) = exp[−∫₀^t λ dt]. (1.46)

In electronics, the problem is simplified if we consider λ constant; in this case:

R(t) = exp(−λt) (1.47)

and, in accordance with (1.28), we have

f(t) = λ·exp(−λt). (1.48)
It can also be proved that, for a working interval (t, t+Δt), the reliability is given
by:

exp[−λ∫ₜ^(t+Δt) dt] = exp(−λΔt). (1.49)

Obviously, the working moment (the age) in the expression (1.49) is not
important, but only the time interval Δt, measured from a certain reference
moment at which the item was still in operation. If Δt represents the duration of
an experiment, then for this experiment the components have the same reliability
at different ages. In statistics, the mean value of a given distribution f(t) is
obtained from the first-order moment of f(t), namely t·f(t), the integral being
calculated from t = 0 to t = ∞. From the mean of the failure times, the good
operation time, MTBF (for repairable systems) or MTTF (for nonrepairable
systems), is calculated. The general expression of MTBF (respectively MTTF) is:

m = ∫₀^∞ t·f(t)dt. (1.50)
o
With the aid of the relation (1.28), we can write:

m = −∫₀^∞ t·[dR(t)/dt]dt (1.51)

and, by partial integration:

m = −[t·R(t)]₀^∞ + ∫₀^∞ R(t)dt. (1.52)

For t = 0, we have R(t) = 1 and t·R(t) = 0. If t grows, R(t) decreases; we can
then find a value k which satisfies the inequality

R(t) < exp(−kt). (1.53)

Since

lim(t→∞) exp(−kt) = 0 (1.54)

it follows that

lim(t→∞) t·R(t) = 0. (1.55)

So we obtain the expression

m = ∫₀^∞ R(t)dt. (1.56)

If λ = constant, then

m = ∫₀^∞ exp(−λt)dt = 1/λ. (1.57)

The expressions (1.44), (1.46) and (1.56) are generally valid, for any time distri-
bution of λ.
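The constant-λ case can be checked numerically: the minimal sketch below (the value of λ is an assumed, illustrative figure) evaluates R(t) from (1.47) and approximates m = ∫R(t)dt from (1.56), which should approach 1/λ as stated in (1.57).

```python
import math

def reliability(lam, t):
    """Reliability R(t) = exp(-lambda * t) for a constant failure rate (Eq. 1.47)."""
    return math.exp(-lam * t)

def mtbf_numeric(lam, dt=0.1, t_max=10000.0):
    """MTBF m = integral of R(t) dt (Eq. 1.56), trapezoidal approximation."""
    n = int(t_max / dt)
    total = 0.0
    for i in range(n):
        t0, t1 = i * dt, (i + 1) * dt
        total += 0.5 * (reliability(lam, t0) + reliability(lam, t1)) * dt
    return total

lam = 0.002  # failures per hour (illustrative value)
print(reliability(lam, 100))  # survival probability after 100 h
print(mtbf_numeric(lam))      # should approach 1/lambda = 500 h
```

The upper integration limit only needs to be large enough that R(t) has decayed to a negligible value, in agreement with (1.53)-(1.55).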

1.7
Failure types

One may distinguish three failure types. (Note that handling and transport
failures, and failures caused by faulty use, are not taken into account.) They
appear even if the user does not make any error.
First, there are failures that appear during the early period of component life
and are called early (infant) failures. They can be explained by faulty
manufacture and insufficient quality control in production. They can be
eliminated by a systematic screening test.
Wearout failures, the second category, constitute an indicator of component
ageing.
Accidental failures, the third category, can be eliminated neither by a screening
test nor by an optimal use policy (maintenance). They can be provoked by
sudden voltage increases that strongly influence component quality and
reliability. These failures appear erratically, accidentally, unforeseeably.

1.7.1
Failure classification

The most useful and frequent classifications are:


• Depending on cause:
- failure due to an incorrect assembling
- failure due to an inherent weakness
- infant mortality failures
- wearout failures
• Depending on phenomenon speed:
- sudden failure
- progressive failure
• Depending on technical complexity:
- total failure
- partial failure
- intermittent failure
• Depending on emergence manner:
- catastrophic failure
- degradation failure.

(Diagram: failure categories grouped by cause of failure, e.g. caused by
inherent weakness; by emergence and test, e.g. revealed by an interruption of
operation or by a test programme; and by nature of failure.)

Fig. 1.7 Failure classification
Figure 1.7 gives a general picture of the most usual failure categories. Being
familiar with the real failure mechanisms facilitates both the selection of the
best components and their correct use, and helps reliability growth in general.

1.8
Reliability estimates

Two methods are generally used to make reliability estimates: (i) the parts count
method and (ii) the parts stress analysis method.
The parts count method requires less information, generally that dealing with
the quantity of different part types, the quality level of the parts, and the
operational environment. This method is applicable in the early design phase and
during bid/proposal formulation.
Parts stress analysis requires the greatest amount of detail and is applicable
during the later design phases, where actual hardware and circuits are being
designed.
Whichever method is used, the objective is to obtain a reliability estimate that
is expressed as a failure rate; from this basic figure, R(t) and MTBF may be
developed. Calculation of the failure rate for an electronic assembly, unit or
system requires knowledge of the failure rate of each part contained in the item
of interest. If we assume that the item will fail when any of its parts fails,
the failure rate of the item will equal the sum of the failure rates of its
parts. This may, in general, be expressed as:

λ_item = Σ λ_i  (i = 1, ..., n), (1.58)

where λ_i = failure rate of the ith part, and n = number of parts.
Parts count reliability prediction [1.3][1.4]: the information needed to use the
method is: (i) generic part types (including complexity of microelectronics) and
quantities; (ii) part quality levels; and (iii) equipment environment. The general
expression for the equipment failure rate with this method is:

λ = Σ N_i·(λ_G·π_Q)_i  (i = 1, ..., n), (1.59)

for a given environment, where:
λ = total equipment failure rate (failures/10⁶ h)
λ_G = generic failure rate for the ith generic part (failures/10⁶ h)
π_Q = quality factor for the ith generic part
N_i = quantity of the ith generic part
n = number of different generic part categories.
The information needed to compute equipment failure rates using equation (1.59)
applies if the entire equipment is used in a single environment. If the equipment
comprises several units operating in different environments, equation (1.59)
should be applied to the portion of the equipment in each environment. These
environment-equipment failure rates are then added to determine the total
equipment failure rate.
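The parts count calculation of (1.59) amounts to a weighted sum over part categories. The sketch below uses an invented parts list; the λ_G and π_Q values are assumptions for illustration, not handbook figures.

```python
# Parts count prediction (Eq. 1.59): lambda = sum(N_i * lambda_G_i * pi_Q_i).
# The part list and factor values below are illustrative, not handbook data.
parts = [
    # (quantity N, generic failure rate lambda_G in failures/1e6 h, quality factor pi_Q)
    (24, 0.010, 1.0),   # e.g. film resistors
    (12, 0.030, 1.5),   # e.g. ceramic capacitors
    (4,  0.120, 2.0),   # e.g. small-signal transistors
]

lam_total = sum(n * lam_g * pi_q for n, lam_g, pi_q in parts)
print(f"Equipment failure rate: {lam_total:.3f} failures/1e6 h")

mtbf_hours = 1e6 / lam_total  # constant-rate assumption, MTBF = 1/lambda
print(f"MTBF: {mtbf_hours:.0f} h")
```

For equipment spread over several environments, the same sum would simply be evaluated once per environment and the results added, as described above.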
Parts stress analysis method. Part failure models vary with different part types,
but their general form is:

λ_part = λ_B·π_E·π_A·π_Q·π_N (1.60)

where:
λ_B = base failure rate
π_E = environmental adjustment factor; it accounts for the influence of environ-
ments other than temperature and is related to operating conditions (vibration,
humidity, etc.)
π_A = application adjustment factor; it depends on the application of the part and
takes into account secondary stress and application factors
π_N = additional adjustment factors
π_Q = quality factor; it accounts for the degree of manufacturing control with
which the part was fabricated and tested before shipment to the user.
The value of λ_B is obtained from reduced part test data for each generic part
type. Generally, data are presented in the form of failure rate vs. normalised
stress and temperature factors (Fig. 1.8). The values of applied stress relative
to the rated stress represent the variables over which design control can be
exercised and which influence part reliability.

(Plot: part base failure rate λ_B versus temperature, with one curve for each
applied stress level, stress level 3 > stress level 2 > stress level 1. The
stress levels represent final values of applied stress (voltage, current, etc.).)

Fig. 1.8 Part base failure rate versus stress and temperature
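How the adjustment factors of (1.60) combine can be sketched as follows; the factor values are assumptions chosen for illustration, not taken from any handbook.

```python
# Parts stress model (Eq. 1.60): lambda_part = lambda_B * pi_E * pi_A * pi_Q * pi_N.
# All factor values below are illustrative assumptions, not handbook data.
def part_failure_rate(lam_base, pi_e=1.0, pi_a=1.0, pi_q=1.0, pi_n=1.0):
    """Scale the base failure rate by the adjustment factors."""
    return lam_base * pi_e * pi_a * pi_q * pi_n

# The same part (base rate 0.05 failures/1e6 h) in a benign ground environment
# versus a harsher airborne environment:
lam_ground = part_failure_rate(0.05, pi_e=1.0, pi_q=1.5)
lam_airborne = part_failure_rate(0.05, pi_e=6.0, pi_q=1.5)
print(lam_ground, lam_airborne)  # the harsher environment multiplies the rate
```

Because the model is purely multiplicative, each factor scales the base rate independently, which is why handbook tables can list the π factors separately per environment, application and quality level.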

It should be noted that there are certain fundamental limitations associated with
reliability estimates. The basic information used in part failure rate models is
averaged over a wide data base involving many persons and a variety of data
collection methods and conditions, which prevents exact co-ordination and
correlation. The user is cautioned to use the latest part failure data available,
as part failure rates are continuously improving.

1.9
"Bath-tub" failure curve

The time between successive failures is a continuous random quantity. From the
probabilistic standpoint, this random quantity can be fully determined if the distri-
bution function is known. These failure models are related to life test results and
failure rates via probability theory.
Figure 1.9 shows a typical failure rate versus time curve, the well-known "bath-
tub" curve. In the region of infant mortality, the high failure rate is attributed
to gross built-in flaws which soon cause the parts to fail. After this zone, under
certain circumstances, the failure rate remains constant; this is the useful
operating life. These part failure rates are usually summed to calculate the
inherent system reliability. Finally, whatever wear or ageing mechanisms are
involved, they occur in the wearout period (here the failure rate increases
rapidly).

(Plot: failure rate λ(t) versus time, showing the infant mortality region, the
constant-failure-rate useful life and the wearout period; the curve for the
higher ambient temperature θ₂ lies above that for θ₁.)

Fig. 1.9 The "bath-tub" failure curve of a large population of statistically
identical electronic components, for two ambient temperatures θ₂ > θ₁

The "bath-tub" failure curve gives a good insight into the life-cycle reliability
performance of an electronic system. Depending on their physical meaning, the
random quantities obtained can have different probability distribution laws
(exponential, normal, Weibull, gamma, Rayleigh, etc.). Over the burn-in period of
operation, the bath-tub curve can be represented by gamma and/or Weibull laws;
over the normal period of operation, by the exponential distribution; over the
wearout period of operation, by gamma and normal distributions. Thus, most
component failure patterns involve a superposition of different distribution laws.
Consequently, with the aid of the above laws, a failure density function, a relia-
bility function and an MTBF expression can be obtained. In practice, this is a
very difficult task, hence approximation and much judgement are involved. Each
observer may consequently give a different solution to any distribution.
Voices claiming that the "bath-tub" failure rate curve does not hold water any-
more [1.112] must also be reviewed.
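The three regions of the bath-tub curve can be illustrated with the Weibull hazard rate z(t) = (β/η)(t/η)^(β−1): a shape parameter β < 1 gives a decreasing rate (infant mortality), β = 1 a constant rate (useful life) and β > 1 an increasing rate (wearout). The parameter values below are illustrative.

```python
# Weibull hazard rate z(t) = (beta/eta) * (t/eta)**(beta - 1).
# beta < 1: decreasing rate (infant mortality); beta = 1: constant rate
# (useful life); beta > 1: increasing rate (wearout). Values are illustrative.
def weibull_hazard(t, beta, eta):
    return (beta / eta) * (t / eta) ** (beta - 1)

cases = [(0.5, "infant mortality"), (1.0, "useful life"), (3.0, "wearout")]
for beta, label in cases:
    z_early = weibull_hazard(100.0, beta, eta=1000.0)
    z_late = weibull_hazard(900.0, beta, eta=1000.0)
    print(f"beta={beta}: z(100)={z_early:.2e}, z(900)={z_late:.2e}  ({label})")
```

A superposition of such laws, as described above, is what produces the full bath-tub shape over a component's life.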

As has been seen, the task of reliability modelling may be difficult, and the
best a reliability engineer can do is to analyse a system through a simple
modelling configuration.

1.10
Reliability of electronic systems

Firstly, the components decisively determine the operational reliability of
electronic devices and equipment. The equipment manufacturer must take
precautions before including the components in the respective equipment; to
achieve this, incoming inspections at acquisition are used, with the aim of
preventing component failures. In this chapter, from the vast field concerning
the reliability of electronic systems, only the problems related to the
reliability of electronic components will be treated.
Before mounting, the components must be tested (high temperature storage,
temperature cycles, mechanical shocks, humidity tests, etc.). Even the component
manufacturers think that the components that passed these tests are more
"reliable", but they forget that screening tests also reveal a certain
unreliability of these components. The reliability of a product is given by the
design, the quality of the materials and the fabrication process. Consequently,
the economic solution to reliability problems can be found only in a strong
co-operation between user and manufacturer. Only if the user knows the most
important characteristics of the component, and if the manufacturer knows the
operational conditions, can the component reliability be achieved within the
framework of the reliability of the finished equipment. The reliability of a
component can be tested outside as well as inside the equipment; the reliability
of a subassembly is intrinsic to its construction. If the manufacturer observes
this simple rule, and the user places each constructive element adequately, the
avoidance of unpleasant surprises is guaranteed.
For a correct selection, concerning reliability, economics and suitability, from
the vast market supply, the engineer needs all possible information about the
current behaviour of the components used and about the predicted defects and
failures. This comprises the influence of the environmental conditions and of
the operating stresses on the component parameters, such as the predicted
failure rate and lifetime. Until now, such data have existed only to an
unsatisfactory extent. For the user, the component failure rate resulting from
operational errors is the most important reliability criterion. For the
evaluation of data concerning the failure rate, it must be taken into account
that these data are determined under various conditions, on the basis of
different assumptions. These elements also constitute the calculation basis for
normal operating conditions, by using various models. The failure mechanisms
give important information about the reliability.

1.10.1
Can the batch reliability be increased?

The reliability of a batch of components can be increased in three different
ways, which may be used separately or combined. The first is the so-called
pre-ageing, which can be applied to all components before the incoming
inspection. Pre-ageing eliminates a part of the early failures and gives the
surviving components a stable behaviour during the operation time. This type of
pre-ageing has nothing to do with the pre-ageing performed, for example, by the
component manufacturer as part of the manufacturing process, for establishing
the normal operating properties. To increase the reliability with pre-ageing,
it is necessary to know the incoming inspection conditions and the stress
conditions during pre-ageing. In general, pre-ageing is realised at a loading
different from the subsequent operating conditions; in order not to increase the
pre-ageing time needlessly, usually a load greater than the operational load
value (the nominal load) is selected. The loading must not be too great, because
otherwise the component can reach the failure limit and will not show the
desired behaviour at the incoming inspection.
Another way is operational derating, or devaluation, which contributes to a
substantial increase in reliability. The problem of tolerance limits, which can
also influence the system reliability, must be mentioned too. When using this
method, one must pay attention to excursions beyond the limits, since an optimal
efficiency can be obtained only while the parts are inside the established
limits. Exceeding these limits can operate inversely, reducing the reliability.
In order not to allow these variations to perturb the system function, the
circuit designer must establish tolerance limits that harmonise with the
parameter variations. To define these tolerance limits, the density function and
the long-term behaviour of the respective parameters must be known. From the
modification of the distribution function over the lifetime, those parameters
that exceed the prescribed limits can be identified. Knowledge of this behaviour
of the parameters allows either selecting the parameters with regard to the
prescribed limits, or establishing the limits that must not be exceeded during
operation.

1.10.2
What is the utility of screening tests?

To obtain a high reliability level of the equipment, it is recommended: (i) to
buy screened components, whenever possible; (ii) to perform screening at the
incoming inspection level, if buying screened components is not possible; and
(iii) to incorporate the screening specifications (internal and/or external) in
the purchasing procedures, foreseeing sanctions.
The screening tests can be performed at all fabrication levels of the equipment
(components, equipped PCBs, subsystems and systems), but at very different
prices. The motive behind these tests is the desire to obtain the best possible
reliability at reasonable prices.
Among the tests usually put into practice, first should be mentioned the
characterisation test, allowing the careful measurement, on a sample, of the
most important parameters for the considered application.
The characterisation is also used to determine the parameters in view of the
design and of the manufacturer qualification for each circuit type. Some users
make a periodical characterisation of samples, since certain non-specified
parameters may vary as a result of (unknown) changes occurring in the
manufacturing process of the respective component.
The second type of test, the good/bad test, is a trial performed at 100%, which
verifies the important parameters and functions for the respective application
[1.90]. These tests vary depending on the user, on the circuit complexity, and
on the exigencies of use. In general, at least static d.c. tests and functional
good/bad tests are performed on 100% of the circuits. The limits can be those of
the manufacturer (indicated in the product data sheet), or those of the user
[1.12]. These measurements include at least the input currents, the output
voltages and the total dissipated power.
The functional tests require the use of a programmed sequence of inputs, which
forces the circuit to operate in all possible states (for example, the
verification of the truth table). To be noted that the principal difference
between SSI, LSI and VLSI ICs is the failure probability per component.
Generally, to avoid circuit degradation when its complexity grows 10 times, it
is necessary that the failure rate per gate diminishes 10 times (Fig. 1.10).

(Plot, logarithmic scales: amount of failures per 1000 circuits, from 0.00001 to
100, versus number of IC gates, from 10 to 10⁴, across the SSI, MSI, LSI and
VLSI ranges.)

Fig. 1.10 Variation of failure rate as a function of IC complexity

All the bibliographical sources agree that the selection level is what allows an
economical approach to electronic system reliability. Table 1.2 presents a
comparison of the costs for four selection levels and three product categories.
It follows that it is more economical to identify and eliminate a defective
component at the incoming inspection than at the equipped-PCB controls.
An empirical rule says that these costs grow by an order of magnitude at each
successive control level. The more advanced the selection level, the more
important the costs. As a result, it is recommended [1.5] to use 100% incoming
inspection, justifying this unusual procedure through a detailed economic
analysis.

Table 1.2 Comparison between control costs (expressed in $) of a defective component

Product type   Incoming inspection   Control of   System control   At the user, during
               of the components     the PCBs     in the factory   the warranty period
General use     6                     15            15               150
Industrial     12                     75           135               645
Military       21                    150           360              3000

(Matrix diagram relating failure mechanisms (electrical instability, thermal
mismatch, external failures, encapsulation failures, seal failures,
contamination, wire and solder failures, surface and substrate failures,
manufacturing failures, mounting substrate failures) to the screening tests that
can detect them: optical (internal and external visual inspection), mechanical
(centrifugation, shock, vibration), thermal (high temperature storage, thermal
cycles, thermal shocks, burn-in) and electrical/other (X-rays, hermeticity
tests).)

Fig. 1.11 Failure mechanisms detectable with the aid of screening tests

At present, the greatest part of the available data concerning pre-ageing (or
selection) refers deliberately to components and, particularly, to ICs. The
principal result (Fig. 1.11) is the revelation of some failure mechanisms and,
implicitly, of some new selection procedures for the defective, non-satisfactory
or marginal items, or those with likely early failures (potentially defective
items). All these definitions are presented in MIL-S-19500.
Until the end of the 80's, the plastic encapsulated devices were used only if
the environmental variations were relatively reduced and the reliability
performance was reasonable. The progress obtained in the 90's produced the
so-called "Acquisition reform" (see Section 2.1.5).
The tests constituting the screening must have the best effectiveness/cost
ratio. An analysis of these tests is given in Table 1.3.
Besides these aspects, there are other elements that have a certain influence on
the cost/reliability ratio [1.37]:
• the relations between the manufacturer and the user;
• the confidence level granted to the provider;
• the inspections performed by the user at the provider's premises;
• the utilisation of a unique set of specifications;
• the centralised supply, on the basis of a plan that includes several providers.

Table 1.3 Effectiveness / cost ratio for screening tests

Test                      Effectiveness      Cost          Effectiveness and cost
High temperature storage  Very poor          Very low      Good
Temperature cycles        Poor               Very low      Very good
Thermal shocks            Poor               Very low      Good
Burn-in                   Good / very good   High          Fair / good
Damp heat *)              Poor / very poor   Rather high   Poor
Thermal vacuum            Very poor          High          Very poor
Sinusoidal vibrations     Poor               Very high     Very poor
Random vibrations         Poor               Very high     Very poor
Mechanical shocks         Very poor          Very high     Very poor
Automatic impacts         Very poor          Low           Poor
Centrifugation            Poor               Very high     Poor

*) Does not concern plastic encapsulation

Other compromises can be found, i.e. (i) better equipment reliability for a
given cost; and (ii) a more favourable equipment cost for a given reliability
level. In this way each producer can optimise the equipment and the reliability
cost, depending on objectives and possibilities.

1.10.3
Derating technique

One of the most used methods to improve the reliability of equipped printed
circuit boards (PCBs) is the derating technique: the mounted component is
exposed to voltages, currents and temperatures far below the nominal operating
values, and in this way an increase of the lifetime of the respective component
is obtained. The derating values can be found from the manufacturer or in
failure rate handbooks such as the CNET Handbook [1.36] or MIL-HDBK-217 [1.76].
These data, in which the values corresponding to the prescriptions are taken as
a parameter, can provide specific failure rates for each of the operating
conditions. So one must begin with a study of the operating conditions of the
system, by evaluating, as a percentage of the nominal values, the voltage, the
load and the temperature for each component. With the aid of the given tables,
the value for the specific operating conditions can be determined, and the sum
of the failure rates can be found, with a tolerance of approximately 10%
allowing the solder joints, the connections, etc. to be taken into account.
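The bookkeeping described above can be sketched as follows; the derating curve and all numeric values are invented for illustration, not handbook data.

```python
# Sketch of the derating bookkeeping: each component's failure rate is looked
# up for its actual stress ratio (here a toy quadratic law, not handbook data),
# the rates are summed, and ~10% is added for solder joints and connections.
# All numeric values are illustrative assumptions.
def derated_rate(nominal_rate, stress_ratio):
    """Toy derating curve: rate shrinks as applied/rated stress drops."""
    return nominal_rate * stress_ratio ** 2  # assumed quadratic stress law

components = [
    # (nominal failure rate in failures/1e6 h, applied stress / rated stress)
    (0.50, 0.5),   # resistor run at 50% of rated power
    (1.20, 0.7),   # capacitor run at 70% of rated voltage
    (2.00, 0.6),   # transistor run at 60% of rated dissipation
]

board_rate = sum(derated_rate(lam, s) for lam, s in components)
board_rate *= 1.10  # ~10% allowance for solder joints, connections, etc.
print(f"{board_rate:.3f} failures/1e6 h")
```

Running each component well below its rated stress, as the derating technique requires, visibly shrinks the summed board failure rate compared with operation at nominal load.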
On demand, special selection tests (thermal cycles, high temperatures, thermal
shocks, vibrations) can be foreseen.
By using a minimum number of components operating far below the nominal values,
the circuit designer himself may settle the circuit reliability.
If the reliability problem is correctly treated, any apparatus, device or
equipment can be decomposed into modules, subsystems and units, ensuring for
each element the best reliability level, so that the desired reliability of the
ensemble can be obtained.

1.10.4
About the testability of electronic and telecommunication systems

Testability is a methodology based on the application of adequate
recommendations and on structural techniques, with the aim of facilitating not
only the test and the diagnosis of electronic and telecommunication systems at
the prototype and production levels, but also the preventive and curative
maintenance. All these aspects must be very efficient from the technical point
of view and must have a moderate price from the financial point of view.
The most effective way to succeed is to pursue a testability policy from which a
better, faster and cheaper production may result.
By conceiving, as early as possible, equipped PCBs that can be easily tested,
the following results are obtained:
• investments in test equipment are small, because lower performances are
sufficient, provided that the required quality level is maintained;
• the design, better adapted to testing, facilitates the programming, making it
faster and more comprehensible;
• the diagnosis, being more obvious, is given rapidly, by personnel with little
qualification;
• reducing the costs and times contributes to diminishing the development time
and the duration of the production cycle.
The 1960s represented the discrete components era; the functional complexity
grew with the number of components. It was the age in which tests were made
manually, with measurement instruments.
During the 1970s, functional testers permitted an effective go/no-go test for
the good PCBs. For PCBs with failures, the diagnosis was long (defect by defect)
and expensive, since very specialised operators were needed.
At the beginning of the 1980s, the age of in situ testing appeared. The PCB test
is a conformity test (the right component at the appropriate place); the
diagnosis is faster and more advantageous as the quantities grow. An easy
diagnosis leads to a cheap production.

Nowadays the tests represent 35-45% of the production costs (testing is not a
productive operation, because the tested PCB has added value not before, but
only after, the tests). Enterprises, being exposed on the market to
international competition, must:
• design more quickly (to be present early on the market);
• produce more rapidly (to shorten the putting into fabrication);
• produce cheaper (to be competitive);
• produce the best quality (to reduce the cost of non-quality), in order to
maintain the commercial position on the market and to enlarge the sphere of
sales.
The solution to the problem: (i) to select a testability policy that permits the
achievement of all these objectives; (ii) to design products that are easily
testable.
For the future, the following tendencies are important:
• growth of the integration degree;
• a deeper individualisation of the components;
• new technologies for the production of equipped PCBs (with surface mounted
components, SMC);
• world-wide extension of the market.
The last tendency supposes:
• a maximum of functions in a minimum of volume (reducing the possibilities of
physical access to the elementary functions);
• an intrinsic intelligence of the PCB at the component level (the concept of
component is directly related to the undissociable nature of its constituents).
The resulting conclusion is the same: to succeed, companies must have a wise
testability policy.
Testability is based on recommendations that justify why and how to proceed. It
is the designer's duty to decide on the application method, depending on the
quantitative and qualitative requirements of the project; technical and
financial comparisons help him to select the best compromise. During the
selection of the specifications for a future product, a commercial company
spends in reality only 1% to 3% of the project budget; at the same time, by
selecting the orientations, it decides on, and implicitly engages, the
allocation of 70% of the budget.

1.10.5
Accelerated ageing methods for equipped boards

The following procedure is recommended:
• Visual control; rough electrical testing.
• 10 temperature cycles (-40°C / +70°C), with a speed of 4°C/minute and a
break of maximum 10 minutes. During cooling, the bias will be disconnected.
• 24 hours burn-in at ambient temperature or, even better, at +40°C
("debugging"), with periodic switching "on" and "off".
• Final electrical testing.

These methods are complementary to the screening performed at the component
level.

1.10.6
Operational failures

In the past, the reliability of a system was usually quantified based on the
results obtained from laboratory tests, the testing conditions being chosen to
simulate, as closely as possible, the real operational conditions.
Unfortunately, various constraints, such as equipment cost and lack of knowledge
about real operational conditions, mean that the results of laboratory tests are
rather far from the real operational results. This explains why direct research
on the operational behaviour is always desirable. But this operation is not as
simple as it seems at first sight. Before performing the study of system
reliability, some other problems must be solved, allowing results as close as
possible to the real case to be obtained. These problems can be divided into
three categories:
• practical problems
• mathematical problems
• data processing problems
Further on, we will try to describe these problems and to find viable solutions.
Practical problems
Theoretically, collecting information on system or equipment operation is
simple. One must only fill in a form each time the system is connected or
disconnected, and each time a failure occurs.
However, experience shows that this procedure is not a simple one when the
entire life of the equipment must be covered. Moreover, the form is often too
sophisticated for the personnel required to fill it in, or the time constraints
are important. In all these cases, the obtained information is affected by
serious doubts. The solution is to be extremely cautious and careful in defining
the required information. It is important to correlate this information with the
defined purpose and to instruct the personnel not only in how to fill in the
form, but also about the purpose of this operation, so that they are aware that
a high degree of confidence in the information is extremely important.
However, even for well-organised collecting systems, with well-trained and
motivated personnel, some uncertainties may arise: writing errors, or
misinterpretations of the handwriting. Other problems are connected with the
real cause of the replacement of some components, not to speak of the time
elapsed between the failure moment and the moment the failure is reported.
Eventually, it is quite frequent that no explanation is found for a system
failure: at the subsequent repair of the system, no defect is identified (in
Fig. 1.12 this case was not included). There are many possible explanations for
such a situation, but, essentially, lack of information must be the cause.

(Bar chart: relative frequency, on a scale of 0 to 40, of typical defects by
component category: connectors, capacitors, semiconductors, resistors, various,
ICs, solder joints.)

Fig. 1.12 Typical defects in an electronic system, arising during the useful life

The most frequently mentioned causes are intermittent contacts and problems
linked with the tolerance of the values and with the connectors, respectively.
A computerised system for data processing in real time may be a useful tool for
eliminating false information.
Mathematical problems
If a failure arise, most systems are repaired and continue to work. Repairing
means to replace failed components. However, the system is not "as new", be-
cause the not failed components were not replaced. But it is not either "as bad as
the old one". This distinction between "as new" and "as the old one" is important,
because the failure rate of components is not, generally, constant in time! Conse-
quently, the "failure intensity" of the system failures may become constant only
after a transitory period of time, when the "failure intensity" may increase, de-
crease, or may have sudden peaks, depending on the component failure rate.
Because a time-variable failure rate is expected and because the system is repairable, the traditional analysis methods are not appropriate. Moreover, the number of defects for the population of fallible components is small. Consequently, the use of traditional analysis is difficult and methods based on the concept of stochastic processes are needed. For this purpose, the renewal process is appropriate.
Taking into account that electronic systems often have early failures, a practical approximation based on the bimodal exponential distribution is an adequate way to solve the problem.
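The bimodal (mixed) exponential approximation mentioned above can be sketched in a few lines; the weak-subpopulation fraction and the two failure rates used below are illustrative values, not data from the book:

```python
import math

def reliability_bimodal(t, weak_fraction, lam_weak, lam_main):
    """Bimodal exponential survival function: a weak subpopulation
    (weak_fraction) fails with the high early rate lam_weak, while the rest
    of the population follows the main constant rate lam_main
    (rates in failures/hour, t in hours)."""
    return (weak_fraction * math.exp(-lam_weak * t)
            + (1 - weak_fraction) * math.exp(-lam_main * t))

# Illustrative values (assumed): 5% weak items, lam_weak = 1e-3/h, lam_main = 1e-6/h
R_100 = reliability_bimodal(100, 0.05, 1e-3, 1e-6)      # early life
R_10000 = reliability_bimodal(10000, 0.05, 1e-3, 1e-6)  # after the weak items die out
```

The apparent failure rate of the mixture decreases with time as the weak items are eliminated, which is exactly the early-failure behaviour described above.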
Data processing problems
In a discussion about operational results, there is a tendency to concentrate exclusively on failure reporting problems. However, there are other problems linked to the failed population, such as the processing of huge amounts of data, an operation accomplished by computers. To solve this kind of problem, three files must be created:
• a file for the population, containing the information needed to calculate the number of functioning systems as a function of the operational period, with the assurance that the reported failures belong to the studied population,
• a file identifying the system structure and describing details on the system
components,
• a file containing details on the observed failures.
Normally, the information about system operation is not structured, and individual "translation" software must be created for each company. If the company has a well-organised system, in accordance with the requirements of the analysed system, this problem can be easily solved.

1.10.7
FMEA/FMECA method

Whenever the failure rate (predicted reliability) of critical components of a system, especially of systems using redundancy, is to be analysed, a failure analysis must be performed. The method, known as FMEA (Failure mode and effect analysis) or FMECA (Failure mode, effect and criticality analysis), is a systematic investigation of the influence of possible defects on the reliability of a component and of the influence of this component on other elements of the system. The investigation takes into account the various failure modes and their causes, allowing the potential dangers to be determined. The efficiency of the measures proposed for reducing the probability of appearance of these failures is also investigated. The FMEA/FMECA method takes into account not only failures, but also errors and mistakes.
FMEA/FMECA is performed upstream by a development engineer, with the help of a reliability engineer. Further on, details about the procedure are presented.
Step 1. A description of the function of the studied element (such as a transistor, a resistor, etc.) is given. If possible, references to the reliability block diagram of the system are made.
Step 2. A hypothesis about a possible failure mode is made. Here, the phase of the mission of the studied system must be taken into account, because a failure or a mistake in an early operational period can be easily avoided. For each element, all possible defects must be considered, one by one.
Step 3. The possible cause must be described for each possible defect identified at step 2. This is used for calculating the probability of appearance (step 8) and for elaborating the necessary protection measures (step 6). A failure mode (short circuit, open circuit, parameter drift, etc.) may have various causes. Moreover, a primary defect or a secondary defect (produced by another defect) may arise. All independent causes must be identified and carefully investigated.
Step 4. The symptoms for the failure mode presumed at step 2 and the possibilities
to localise the failure must be given. Also, a short description of the repercussions
of the failures for the studied element and for other elements must be made.
Step 5. A short description of the effects of the failure mode (presumed at step 2)
on the reliability of the entire studied system must be performed.
Step 6. A short description of the proposed measures for reducing the effect of the
failure and the probability of its appearance, and allowing the continuance of
system mission, must be given.

Step 7. The importance of the presumed failure mode for the reliability of the whole system must be estimated. The estimation figures usually cover the following range:
1 - no influence (sure)
2 - partial failure (noncritical)
3 - total failure (critical)
4 - overcritical failure (catastrophic)
This fuzzy-type estimation is based on the skill of the reliability engineer.
Step 8. For each presumed failure mode (step 2), the probability of failure (or the
estimated failure rate) must be calculated, taking into account the causes identified
at step 3. The usual evaluation range contains the following fuzzy type items:

• A - frequently
• B - probably
• C - less probably
• D - improbably
• E - very improbably
Step 9. The previous observations are reviewed, stimulating new ideas, especially about the necessary corrective actions.
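The nine steps can be condensed into a small criticality ranking, in the spirit of FMECA; the failure modes, the severity figures (step 7) and the numeric mapping of the occurrence classes A ... E (step 8) below are hypothetical illustrations, not values from the book:

```python
# Assumed numeric mapping of the occurrence classes A (frequent) ... E (very improbable)
OCCURRENCE = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}

# Hypothetical failure modes with severity figures 1..4 (step 7) and
# occurrence classes (step 8)
failure_modes = [
    {"mode": "short circuit",   "severity": 3, "occurrence": "C"},
    {"mode": "open circuit",    "severity": 4, "occurrence": "D"},
    {"mode": "parameter drift", "severity": 2, "occurrence": "B"},
]

# Criticality = severity x occurrence; ranking highlights the modes that
# most deserve the corrective actions of step 9
for fm in failure_modes:
    fm["criticality"] = fm["severity"] * OCCURRENCE[fm["occurrence"]]

ranked = sorted(failure_modes, key=lambda fm: fm["criticality"], reverse=True)
```

With these invented figures, the short circuit (3 x 3 = 9) ranks above the other two modes (each 8).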

1.10.8
Fault tree analysis (FTA)

FTA is a system engineering technique that provides an organised, illustrative approach to the identification of high-risk areas [1.111]. FTA is applied to systems where safety and/or operational failure modes are of concern. It is not suggested that FTA replace other forms of system analysis; rather, it can be utilised in conjunction with inductive techniques. FTA does not solve complex design problems or reveal overstressed parts, but provides the analyst with a qualitative system evaluation procedure enabling detection of system failure modes, potential safety hazards, and subsystems with high failure rates. An FTA should be prepared for the preliminary system design review and once again during the critical design review. This allows design changes resulting from the analysis to be incorporated in a cost-effective manner before the equipment goes into service.
Usually, the construction of a fault tree begins with the definition of the top undesired event (the system failure); the causes are then indicated and connected to the top event by conventional logic gates. The procedure is repeated for each of the causes, and the causes of the causes, etc., until all the events have been considered (Fig. 1.13).
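Assuming independent basic events, the gate logic of such a tree can be evaluated numerically; the small tree and the probabilities below are invented for illustration, not taken from the book:

```python
def or_gate(*p):
    """OR gate: the output event occurs if any input event occurs
    (independent input events assumed)."""
    q = 1.0
    for pi in p:
        q *= 1.0 - pi
    return 1.0 - q

def and_gate(*p):
    """AND gate: the output event occurs only if all input events occur."""
    prod = 1.0
    for pi in p:
        prod *= pi
    return prod

# Hypothetical top event: the system fails if the power supply fails OR
# both redundant channels fail (probabilities are illustrative)
p_system = or_gate(0.01, and_gate(0.05, 0.05))
```

Here the AND gate expresses the redundancy: the two channels contribute only 0.05 x 0.05 = 0.0025 to the top event probability.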

1.10.8.1
Monte Carlo techniques

There is more than one fault tree simulation programme developed to describe systems and provide quantitative performance results. The Monte Carlo technique has the ability to include considerations that would be very difficult to include in analytical calculations. The programme views the system represented by the fault tree as a statistical assembly of independent basic input events. The output is a randomly calculated time to failure (TTF) for each basic block, based on the assigned MTBF. The system is tested, as each basic input event fails, to detect system failure within the mission time. A time to repair (TTR) is predicted, based on the MTTR values with detection times, and a new TTF value is assigned to each failed basic input event to permit failure after repair (Fig. 1.14).

[Flow chart spanning the conceptual, system development, equipment development, production and operational use phases, linking feasibility studies and logistics concepts, system analysis/optimisation/synthesis and definition, detailed equipment design (layouts, parts lists, drawings, support data), fabrication/assembly/test/inspection/deployment, and field operation and maintenance to the conceptual, system, equipment and in-service design reviews]

Fig. 1.13 Product development chart with scheduled FTA inputs

[Timeline: alternating TTF and TTR intervals over the mission period, ending in a system failure]

Fig. 1.14 Effect of TTR and TTF on mission performance

The process continues until the mission period is reached or the system fails. A new set of randomly selected values is assigned to the basic blocks and the programme is rerun. After a significant number of such trials, the user obtains:
• the system probability of failure;
• the probability of success;
• the identification of subsystem/component contributions to system failure;
• recorded subsystem failures, for performance comparisons.

Some useful outputs which can be obtained are:
• probability of system failure/success;
• total number of system failures;
• average failure mission time;
• system availability;
• mean and total downtime;
• sequential listing of basic input failures which cause system failure for each
first mission period system failure;
• weighting of all basic block failure paths and of all logical gates which cause
system failure directly;
• failure path weighting of basic blocks which are in a failed state when system
failure occurs;
• rank plus listing of basic input failure and availability performance results,
including optional weighting or cost effectiveness information if desired;
• number of times repair is attempted and completed for each basic input event;
• number of times each basic input and logical gate causes a shutdown.
In accomplishing the foregoing programme outputs, Monte Carlo simulates the tree and - by using the input data - randomly selects rate data from assigned statistical distribution parameters. One great advantage of Monte Carlo is the technical and engineering alternatives that can be implemented to produce a sensitivity analysis, leading to the selection and scheduling of system improvements.
Around 1944 Fermi, von Neumann and Ulam used the Monte Carlo technique to perform neutron-diffusion calculations. Shortly after this, fault tree simulation programmes were implemented by using a straightforward Monte Carlo sampling technique called direct simulation. Here events were generated with frequencies equal to their true occurrence frequencies. This type of simulation meets the requirement that the most probable combinations happen most frequently.
The crucial advantage of the Monte Carlo method is that this technique is (by orders of magnitude) less sensitive to the complexity of the system, thus enabling the analysis of models that otherwise cannot be approached. The basic idea of the method is the generation - by using a computer - of a large sample of histories, from which all the information required about the system is obtained [1.21].
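A minimal sketch of the direct-simulation idea for one repairable block (exponential TTF and TTR, a simplifying reading of the TTF/TTR scheme above, not the full programme; all numeric values are assumed for illustration):

```python
import random

def simulate_mission(mtbf, mttr, mission, trials=20000, seed=1):
    """Direct Monte Carlo for one repairable block: alternate exponentially
    distributed times to failure (mean MTBF) and times to repair (mean MTTR)
    until the mission period is reached. Returns the mean number of failures
    per mission and the mean downtime fraction."""
    rng = random.Random(seed)
    total_failures = 0
    total_down = 0.0
    for _ in range(trials):
        t = 0.0
        while True:
            ttf = rng.expovariate(1.0 / mtbf)   # random TTF for this history
            if t + ttf >= mission:
                break                           # block survives to mission end
            t += ttf
            total_failures += 1
            ttr = rng.expovariate(1.0 / mttr)   # random TTR after the failure
            down = min(ttr, mission - t)        # repair may overrun the mission
            total_down += down
            t += down
    return total_failures / trials, total_down / (trials * mission)

# Illustrative values (assumed): MTBF = 1000 h, MTTR = 10 h, 5000 h mission
mean_failures, downtime_fraction = simulate_mission(1000.0, 10.0, 5000.0)
```

For these values the simulated mean number of failures per mission approaches mission/(MTBF + MTTR), which is the expected renewal-theory result.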

1.10.9
Practical recommendations

• Prepare an initial list of all components. It is known that, roughly, the reliability of a system is determined by the reliability of its components. Consequently, the supplier of the components must be carefully chosen. For economical reasons (time and money), the number of components must be drastically diminished. Often, to determine the component quality, only a check of the producer specifications is needed, and this check can be made via a data bank. For doubtful cases, reliability tests must be organised (damp heat, temperature cycling, thermal shocks, vibrations). For memories, microprocessors and, generally, for LSI and VLSI circuits, sophisticated and expensive test systems are needed. Moreover, even during manufacturing, one must establish whether the input control is made 100% or by samples. In the latter case, one must state whether the LTPD method or the AQL method is used, and the exact control values for each component.
• State the quality and reliability requirements before starting the manufacturing. It is advisable to pre-determine the MTBF value of the future product, the implications for the warranty costs and for the market chances of the future product, etc. Even the best product may be discarded if a failure arises after 2-3 weeks and, consequently, the producer must restart the manufacture once more. Even during the manufacturing process, the order of magnitude of the future failures must be estimated.
• State a control strategy, prepare all the details of the control specifications and demand a manufacturing process with easy access to the measurement and control points, with reduced maintenance and small costs.
• Organise reliability analyses periodically. Even during the manufacturing, analyses must be performed, to determine the potential reliability of the project.
• Perform early tests with increased stress level for some prototypes. The pur-
pose of these tests is to identify the weak points of the design, for operational
conditions, but also for all higher stresses stated in the product specification.
This stress catalogue (shocks, vibrations, high temperatures and humidity, du-
ration, etc.) must be prepared before starting the manufacturing.
• Form work teams and regularly inform the manufacturing, sales and public relations departments about the specified problems and about the progress obtained in manufacturing the new product.
• Make a design review, involving the head of the manufacturing department, the sales engineer, the control team, etc.
• If the inherent reliability of the components is too small, the derating technique must be used. The result will be a decrease of the system failure rate and an increase of the lifetime. If the price of the common component is taken as unity, the price of the component with high reliability (tested 100%, according to MIL-STD-883D) increases at least 1.2 ... 1.5 times. For very high reliability components (military use, etc.), the cost may be multiplied up to 5 ... 20 times.
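The sampling control mentioned in the first recommendation (LTPD or AQL methods) can be illustrated by the operating characteristic of a single-sampling plan under the binomial model; the plan (n = 50, c = 1) and the two lot-quality levels below are assumed for illustration:

```python
from math import comb

def accept_probability(n, c, p):
    """Probability that a lot with true fraction defective p is accepted by a
    single-sampling plan: draw n items, accept if at most c are defective
    (binomial model, lot assumed much larger than the sample)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Hypothetical plan: sample n = 50 items, accept on at most c = 1 defective
p_accept_good = accept_probability(50, 1, 0.005)   # lot quality near an AQL
p_accept_bad = accept_probability(50, 1, 0.05)     # lot quality near an LTPD
```

A good plan accepts lots at the AQL with high probability while rejecting lots at the LTPD with high probability; tightening n and c trades inspection cost against this discrimination.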

1.10.10
Component reliability and market economy

Two factors are the most important in the development of a new product: the marketing and the manufacturing. The marketing is a strategic activity, because the life cycle must be correlated with the cost. In this respect, the duration of the design phase and the number of iterations required for developing a high quality product must be drastically reduced. In the manufacturing field, the number of iterations till the development of a new process must also be diminished. For both factors, an important problem is that of testing.

As the product life cycles diminish, the manufacturing services come under increasingly high pressure, which considerably influences the testing activities. Consequently, new testing strategies must be elaborated, to help the testing engineers in this new situation. Using adaptive methods can reduce the testing duration. A computerised system for collecting the testing data will be very helpful. Elaborating testing strategies for a new product demands a deep study of the various testing variants. For each product, a strategic testing plan, applied during the entire product life (starting from the design phase, passing through the experimental model and prototype phases, till the manufacturing phase), must be elaborated.
A possible variant of testing strategy is presented in Fig. 1.15. One of the necessary conditions for such a strategy is to involve the reliability, quality, testing and manufacturing engineers from the first phase until the end of the development process. This new concept, named concurrent engineering, becomes more and more common for companies involved in the semiconductor industry. Details are presented in Chap. 2. In the following, elements of a testing strategy for developing a new electronic system will be presented.

[Flow chart from input control through in situ testing and analysis of fabrication defects to system testing]

Fig. 1.15 Possible testing scenario, from input control to system testing. To reduce the duration required for each development step, specific testing methods will be developed

For each component family, an acceptable quality level (AQL) must be defined. This AQL may be assured in three ways: i) quality certification at the provider, allowing the component user to avoid component testing; ii) control of a limited number of samples, if the quality level is close to the specifications; iii) 100% testing, when the required quality level is far superior to the quality level assured by the provider. In the last two cases, a testing programme is needed. For LSI circuits, no standard testing programme is available and consequently the user's own specialists must develop a specific testing programme.
Further on, the quality of the equipped cards must be carefully controlled, by computer-aided methods. For each new equipped card, the development department creates a new file. The testing engineer creates in situ tests, based on functional testers. A new concept is to design the testability of the system as well, with self-test programmes inserted in the system. Finally, the system is tested before delivery, with the software that will be used in operational life. With this method, short installation times may be obtained.

1.11
Some examples

To better understand how to apply some of the preceding notions, several practical examples are given in the next pages. They will help the reader to get a better and more complete image of the reliability problems. It will be considered that λ = constant.
Example 1.1 - A certain number of tape recorders has been operated 20 000 hours. During this time 8 repairs have been made. If λ = constant, then MTBF is 20 000 : 8 = 2500 hours, and the failure rate is 8 : 20 000 hours = 0.0004 failures per operating hour.
Example 1.2 - For a tested sample, the failure rate will have a likely value evaluated on the basis of the sample data; λ is calculated with the ratio

λ = (number of failures) : (total operation time) (1.61)

A sample of 10 items was taken and, after 250 operation hours, 2 failures were recorded; the remaining 8 items survived - without failures - a 2000 hours test. We may write:

λ = 2 : [(2 x 250) + (8 x 2000)] = 2 : 16 500 = 0.0001212 failures/hour =

12.12%/1000 hours (1.62)

Example 1.3 - For a sample of 15 transistors, 3 parameters are tested:
• ICBO (residual collector-base current, with IE = 0 and UCB = constant);
• hob (output admittance for small signal, short-circuited input, common base);
• hfb (small-signal current amplification, short-circuited output, common base).

Table 1.4 Limit values of the three tested parameters

No.  Parameter     Max. value for the reliability tests (RT)
                   before RT    after RT
1    ICBO (µA)     1.2          5.0
2    hob (mΩ⁻¹)    1.0          2.0
3    1 + hfb       0.05         0.065

These parameters should not exceed the limit values indicated in Table 1.4. The
measured data (before and after the reliability tests) are given in Table 1.5.
Before we calculate λ, we must remember some rules:

• If - at the end of the reliability test - an item exceeds the maximum prescribed limit value, the item must be considered as defective.
• The items which exceed the prescribed limits before the reliability test will not be considered in the calculus of the failure rate.
• If - for an item - several parameters have been affected, it will be considered that a single failure (and only one) has occurred.
• If - during the intermediate controls - some items are identified as exceeding the failure limits, they will also be counted as failed items, even if later they no longer exceed the prescribed limit values.
For the calculus of the operation hours, it is considered that the respective item
is failed immediately after the last measurement.

Table 1.5 Experimental data, before and after reliability tests (RT)

Component no.            Parameter   Before RT   After RT
1                        1           1.15        5.0
                         2           0.8         2.2
                         3           0.04        0.06
2                        1           0.9         5.2
                         2           0.5         1.2
                         3           0.04        0.07
12                       1           1.1         4.8
                         2           1.2         1.8
                         3           0.035       0.068
3, 4 ... 11, 13 ... 15   1, 2, 3     O.K.        O.K.

With these rules in mind and taking into account that items 1 and 2 failed after 200 hours (item 12 exceeded the hob limit already before the test and is excluded from the calculus), for a 1000 hours test it results:

λ = 2 / [(2 x 200) + (12 x 1000)] = 16.12 x 10⁻⁵ failures/hour =

= 16.12%/1000 h (1.63)
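The counting rules of Example 1.3 can be applied mechanically to the data of Table 1.5; the keys "icbo", "hob" and "hfb" (the latter standing for the quantity 1 + hfb) are just labels for this sketch:

```python
# Limits from Table 1.4
limits_before = {"icbo": 1.2, "hob": 1.0, "hfb": 0.05}
limits_after = {"icbo": 5.0, "hob": 2.0, "hfb": 0.065}

# (values before RT, values after RT) for the three non-O.K. components
measured = {
    1:  ({"icbo": 1.15, "hob": 0.8, "hfb": 0.04},
         {"icbo": 5.0, "hob": 2.2, "hfb": 0.06}),
    2:  ({"icbo": 0.9, "hob": 0.5, "hfb": 0.04},
         {"icbo": 5.2, "hob": 1.2, "hfb": 0.07}),
    12: ({"icbo": 1.1, "hob": 1.2, "hfb": 0.035},
         {"icbo": 4.8, "hob": 1.8, "hfb": 0.068}),
}

excluded, failed = [], []
for comp, (before, after) in measured.items():
    if any(before[p] > limits_before[p] for p in before):
        excluded.append(comp)    # out of limits before RT: not counted
    elif any(after[p] > limits_after[p] for p in after):
        failed.append(comp)      # counted as one single failure per item

# The failed items ran 200 h each; the 12 surviving items ran the full 1000 h
lam = len(failed) / (len(failed) * 200 + 12 * 1000)
```

Running the rules over the table excludes component 12 (its hob value of 1.2 already exceeded the pre-test limit of 1.0) and counts components 1 and 2 as the two failures, reproducing λ = 16.12 x 10⁻⁵ failures/hour.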
Example 1.4 - A reliability test with 100 items gives after 5000 hours the following result: after 2000 h - one failure; after 4000 h - two failures. What is the value of the mean operation time?

λ = 3 / [(1 x 2000) + (2 x 4000) + (97 x 5000)] = 6.06 x 10⁻⁶ failures/hour

and

MTBF = 1/λ = 0.165 x 10⁶ hours ≈ 19 years (1.64)


Example 1.5 - A batch of 15 items is tested until failure; the time intervals until failure are: 400; 500; 280; 600; 1000; 700; 530; 615; 690; 580; 290; 350; 450; 560 and 720 hours.
The total test time is:

Σ ti (i = 1 ... 15) = 8265 hours (1.65)

and MTBF = 8265 / 15 = 551 hours.


Example 1.6 - For a certain electronic component, λ = 0.1 failures per 1000 hours. What survival probability R(t) results for t = 150 hours and for t = 900 hours, respectively?
In accordance with equation (1.12), R(t) = exp(-λt), it results:

R(150) = exp(-0.0001 x 150) = 0.98511 (1.66)

and for t = 900 hours: R(900) = exp(-0.09) = 0.91393. It can be seen that R(900) < R(150).
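Example 1.6 can be checked with the exponential survival law in a few lines:

```python
import math

def survival(lam_per_hour, t_hours):
    """Constant failure rate: R(t) = exp(-lambda * t), Eq. (1.12)."""
    return math.exp(-lam_per_hour * t_hours)

lam = 0.1 / 1000        # 0.1 failures per 1000 hours = 1e-4 failures/hour
R150 = survival(lam, 150)
R900 = survival(lam, 900)
```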

References

1.1 AFCIQ (1983): Données de fiabilité en stockage des composants électroniques
1.2 Ambrozy, A. (1982): Electronic Noise. McGraw-Hill, New York
1.3 Arsenault, J. E.; Roberts, J. A. (1980): Reliability and maintainability of electronic systems. Computer Science Press
1.4 Arsenault, J. E. (1980): Screening. In: Reliability and maintainability of electronic systems, pp. 304-320. Computer Science Press, Rockville
1.5 Bajenesco, T. I. (1975): Quelques aspects de la fiabilité des microcircuits avec enrobage plastique. Bulletin ASE/UCS (Switzerland), vol. 66, no. 16, pp. 880-884
1.6 Bajenesco, T. I. (1978): Initiation à la fiabilité en électronique moderne. Masson, Paris
1.7 Bajenescu, T. I. (1978): Zuverlässigkeit in der Elektronik. Seminar at the University of Berne (Switzerland), November 6
1.8 Bajenescu, T. I. (1979): Elektronik und Zuverlässigkeit. Hallwag Verlag, Bern & Stuttgart
1.9 Bajenescu, T. I. (1981): Wirtschaftliche Alternativen zu "Burn-in"-Verfahren. Fachsitzungsprogramm Productronica 81, Munich
1.10 Bajenescu, T. I. (1981): Grundlagen der Zuverlässigkeit anhand von Bauelementezuverlässigkeit. Elektronik Produktion & Prüftechnik, May-September issues
1.11 Bajenescu, T. I. (1981): Qu'est-ce que le "burn-in"? Electronique, no. 11, pp. EL1-EL3
1.12 Bajenescu, T. I. (1982): Contrôle d'entrée et fiabilité des composants électroniques. L'Indicateur Industriel no. 1, pp. 17-19
1.13 Bajenescu, T. I. (1983): Quelques aspects économiques du "burn-in". La Revue Polytechnique (Switzerland), no. 1439, pp. 667-669
1.14 Bajenescu, T. I. (1983): Pourquoi les tests de déverminage des composants? Electronique, no. 4, pp. EL8-EL11
1.15 Bajenescu, T. I. (1984): Relais und Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 1, pp. 17-23
1.16 Bajenescu, T. I. (1985): Einige Gedanken über Qualitäts- und Zuverlässigkeitssicherung in der Elektronikindustrie. Aktuelle Technik (Switzerland), no. 3, pp. 17-20
1.17 Bajenescu, T. I. (1989): La testabilité: pourquoi et comment. La Revue Polytechnique (Switzerland), no. 1514, p. 884
1.18 Bajenescu, T. I. (1992): Quality Assurance and the "Total Quality" concept. Optimum Q no. 2 (April), pp. 10-14
1.19 Bajenescu, T. I. (1993): Einige Aspekte der Zuverlässigkeitssicherung in der Elektronik-Industrie. London
1.20 Bajenescu, T. I. (1993): Wann kommt der nächste Überschlag? Schweizer Maschinenmarkt no. 40, pp. 74-81
1.21 Bajenescu, T. I. (1998): On the spare parts problem. Proceedings of Optim '98, Braşov (Romania)
1.22 Barlow, R. E.; Proschan, F. (1965): Mathematical theory of reliability. J. Wiley and Sons, Inc., New York

1.23 Bazovsky, I. (1961): Reliability theory and practice. Prentice Hall, Inc.
1.24 Beckmann, P. (1968): Elements of applied probability theory. Harcourt, Brace and World, Inc., New York
1.25 Bellcore, TR-332 (1995): Reliability prediction procedure for electronic equipment. 4th Edition, Bellcore, Livingston, NJ
1.26 Bell Laboratories (1975): EMP engineering and design principles. Bell Telephones
1.27 Beneking, H. (1991): Halbleiter-Technologie. Teubner Verlag, Stuttgart
1.28 Berger, M. C. (1980): Expérience pratique de déverminage de composants électroniques. Actes du second colloque international sur la fiabilité et la maintenabilité, Perros-Guirec-Trégastel, September 8-12
1.29 Birolini, A. (1997): Quality and reliability of technical systems (second edition). Springer-Verlag, Berlin
1.30 Blanks, L. (1992): Reliability procurement & use: from specification to replacement. John Wiley & Sons, Inc.
1.31 Blanquart, P. (1978): Intérêt de la normalisation des modèles de composants par un organisme international. Electronica, Munich, November 10
1.32 Brombacher, A. C. (1992): Reliability by design: CAE techniques for electronic components and systems. J. Wiley and Sons, Chichester
1.33 Cătuneanu, V. M.; Mihalache, A. N. (1989): Reliability fundamentals. Elsevier, Amsterdam
1.34 Christou, A. (1994): Reliability of Gallium Arsenide monolithic microwave integrated circuits. John Wiley & Sons, Inc.
1.35 Christou, A. (1994): Integrating reliability into microelectronics manufacturing. John Wiley, Design and Measurement in Electronic Engineering Series
1.36 CNET RDF 93 (1993): Recueil de données de fiabilité des composants électroniques. CNET, Lannion; also as British Telecom Reliability Handbook HRD5, and Italtel Reliability Prediction HDBK IRPHB93
1.37 Le Compte, M. (1980): Modes et taux de défaillance des circuits intégrés. Actes du second colloque international sur la fiabilité et la maintenabilité, Perros-Guirec-Trégastel, September 8-12, p. 491
1.38 Crosby, P. B. (1971): Qualität kostet weniger. Verlag A. Holz
1.39 Danner, F.; Lombardi, J. J. (1971): Setting up a cost-effective screening program for ICs. Electronics, vol. 44 (30 August), pp. 44-47
1.40 Dhillon, B. S. (1986): Human reliability. Pergamon, New York
1.41 DIN 40039: Ausfallraten Bauelemente
1.42 Dorey, P. et al. (1990): Rapid reliability assessment of VLSIC. Plenum Press
1.43 Dubi, A. et al. (1995): Monte Carlo modeling of reliability systems. Proceedings of ESREDA EC&GA meeting and seminar, Helsinki, May 16-18
1.44 Dull, H. (1976): Zuverlässigkeit und Driftverhalten von Widerständen. Radio Mentor no. 7, pp. 73-79
1.45 Ekings, J. D. (1978): Burn-in forever? Proceedings of the Annual Reliability and Maintainability Symp., pp. 286-293
1.46 Feller, W. (1968): An introduction to probability theory and its applications. John Wiley & Sons, Inc., New York
1.47 Fiorescu, R. A. (1986): A new approach to reliability prediction is needed. Quality and Reliability Engineering Internat., vol. 2, pp. 101-106
1.48 Friedman, M. A.; Tran, P. (1992): Reliability techniques for combined hardware/software systems. Proc. Annual Reliability and Maintainability Symp., pp. 290-293
1.49 Frost, D. F.; Poole, K. F. (1989): RELIANT: A reliability analysis tool for VLSI interconnects. IEEE Solid-State Circuits, vol. 24, pp. 458-462
1.50 Gallace, L. J. (1974): Reliability - an introduction for engineers. RCA ST-6342, Sommerville, N.J.
1.51 Goldthwaite, L. R. (1961): Failure-rate study for the log-normal life time model. Proc. Seventh Nat. Symp. on Reliab. and Quality Control in Electronics, Philadelphia, Pa., January

1.52 Graf, R. (1974): Electronics data book. D. Van Nostrand, New York
1.53 Guillard, A. (1980): Le déverminage de composants: est-ce utile? Bilan d'une expérience. Actes du second Colloque International sur la Fiabilité et la Maintenabilité, Perros-Guirec-Trégastel, September 8-12
1.54 Hakim, E. B. (1988): Microelectronic reliability, Tome II. Artech House, London
1.55 Hannemann, R. J. et al. (1994): Physical architecture of VLSI systems. John Wiley & Sons, Inc.
1.56 Harrison, R.; Ushakov, I. (1994): Handbook of reliability engineering. John Wiley & Sons, Inc.
1.57 Henley, E. J.; Kumamoto, H. (1992): Probabilistic risk assessment. IEEE Press, Piscataway, N. J.
1.58 Hernandez, D. et al. (1978): Optimisation coût-fiabilité des composants - l'exemple du lanceur Ariane. Actes du Colloque International sur la Fiabilité et la Maintenabilité, Paris, June 19-23
1.59 Hnatek, E. (1973): Epoxy packages increase IC reliability at no extra cost. Electronic Engineering, February, pp. 66-68
1.60 Hnatek, E. (1977): High-reliability semiconductors: paying more doesn't always pay off. Electronics, vol. 50, pp. 101-105
1.61 Hoel, P. G. (1962): Introduction to mathematical statistics. John Wiley & Sons, Inc.
1.62 IEC 1709 (1996): Electronic components reliability - Reference condition for failure rates and stress models for conversion
1.63 IEEE-STD 493-1980: Recommended practice for the design of reliable industrial and commercial power systems
1.64 Information about semiconductor grade moulding compounds. Dow Corning Corporation, Midland, Michigan, 48640 USA
1.65 Jensen, F.; Petersen, N. (1982): Burn-in - an engineering approach to the design and analysis of burn-in procedures. John Wiley & Sons, Inc.
1.66 Jensen, F. (1995): Electronic component reliability. John Wiley & Sons, Inc.
1.67 Kohyama, S. (1990): Very high speed MOS devices. Oxford Science Publications
1.68 Kulhanec, A. (1980): Kriterien für die Konfiguration eines Burn-in-Systems. Elektronik Produktion & Prüftechnik, February, pp. 11-14
1.69 La fiabilité des grands systèmes électroniques et le contrôle d'entrée. Bulletin SAQ (Switzerland), vol. 9 (1975), pp. 9-10
1.70 Locks, M. O. (1973): Reliability, maintainability & availability assessment. Hayden Book Co., Inc., Rochelle Park, New Jersey
1.71 Lukis, L. W. F.: Reliability assessment - myths and misuse of statistics. Microelectronics and Reliability, vol. 11, no. 11, pp. 177-184
1.72 Mader, R.; Meyer, K.-D. (1974): Zuverlässigkeit diskreter passiver Bauelemente. In: Zuverlässigkeit elektronischer Bauelemente. VEB Deutscher Verlag für Grundstoffindustrie, pp. 93-105
1.73 Masing, W. (1974): Qualitätslehre. DGQ 19, Beuth Verlag, Berlin
1.74 Merz, H. (1980): Sicherung der Materialqualität. Verlag Technische Rundschau, Bern
1.75 Messerschmitt-Bölkow-Blohm (1986): Technische Zuverlässigkeit. 3rd Edition, Springer Verlag, Berlin
1.76 MIL-HDBK-217 (1991): Reliability prediction of electronic equipment. Edition F
1.77 MIL-HDBK-338: Electronic reliability design handbook; vol. I (1988); vol. II (1984)
1.78 MIL-S-19500: General specification for semiconductor devices. U. S. Department of Defense, Washington D. C.
1.79 Mood, A.; Graybill, F. A. (1963): Introduction to the theory of statistics. McGraw-Hill Co.
1.80 Myers, D. K. (1978): What happens to semiconductors in a nuclear environment? Electronics, 16th March, pp. 131-133
1.81 NASA CR-1126-1129 (1968): Practical reliability; vol. 1 to 4
1.82 NTT (1985): Standard reliability tables for semiconductor devices. Nippon Telegraph and Telephone Corporation, Tokyo

1.83 Novak, V.; Kadlec, J. (1972): Thermische Übertragung in integrierten Schaltungen. Fernmeldetechnik, vol. 12, no. 3, pp. 117-118
1.84 O'Connor, P. D. T. (1991): Practical reliability engineering. 3rd edn., John Wiley & Sons, Inc.
1.85 O'Connor, P. D. T. (1993): Quality and reliability: illusions and realities. Quality and Reliability Engineering Internat., vol. 9, pp. 163-168
1.86 Ott, H. W. (1988): Noise reduction techniques in electronic systems. 2nd edn., J. Wiley & Sons, Inc.
1.87 Pecht, M. (1994): Reliability predictions: their use and misuse. Proc. Annual Reliability and Maintainability Symp., pp. 386-387
1.88 Pecht, M. G.; Palmer, M.; Naft, J. (1987): Thermal reliability management in PCB design. Proc. Annual Reliab. and Maintainability Symp., pp. 312-315
1.89 Pecht, M. G. (1994): Integrated circuit, hybrid, and multichip module package design guidelines. John Wiley & Sons, Inc.
1.90 Pecht, M. G. (1994): Quality conformance and qualification of microelectronic package and interconnects. John Wiley & Sons, Inc.
1.91 Pecht, M. G. (1995): Plastic encapsulation of microcircuits. John Wiley & Sons, Inc.
1.92 Peck, D. S.; Trapp, O. D. (1978): Accelerated testing book. Technology Associates, Portola Valley, California
1.93 Pollino, E. (1989): Microelectronic reliability. Integrity, assessment and assurance. Tome II, Artech House, London
1.94 Polovko, A. M. (1968): Fundamentals of reliability theory. Academic Press, New York
1.95 Prasad, R. P. (1989): Surface mounted technology. Van Nostrand Reinhold
1.96 Robach, Ch. (1978): Le test en production. Conception des systèmes logiques tolérant les pannes. Grenoble, February
1.97 Robineau, J. et al. (1992): Reliability approach in automotive electronics. Int. Conf. ESREF, pp. 133-140
1.98 Rooney, J. P. (1989): Storage reliability. Proc. Annual Reliability and Maintainability Symp., pp. 178-182
1.99 Rubinstein, E. (1977): Independent test labs: Caveat Emptor. IEEE Spectrum, vol. 14, no. 6, pp. 44-50
1.100 Schaefer, E. (1980): Burn-in: Was ist das? Qualität und Zuverlässigkeit, vol. 25, no. 10, pp. 296-304
1.101 Schmidt-Brücken, H. (1961): Die Zuverlässigkeit sich verbrauchender Bauelemente. NTF vol. 24, pp. 188-204
1.102 Schwartz, Ph. (1981): Le burn-in: une garantie de la fiabilité des circuits intégrés. EI (France) no. 16, pp. 57-62
1.103 Shooman, M. L. (1968): Probabilistic reliability. An engineering approach. McGraw-Hill Book Co., New York
1.104 Siewiorek, D. P. (1991): Architecture of fault-tolerant computers, an historical perspective. Proc. IEEE, vol. 79, no. 12, pp. 1710-1734
1.105 Silberhorn, A. (1980): Äussere, einschränkende Einflüsse auf den Einsatz von VLSI-Bausteinen. Bulletin SEV/VSE, vol. 71, no. 2, pp. 54-56
1.106 Störmer, H. (1983): Mathematische Theorie der Zuverlässigkeit. Oldenbourg Verlag, Munich
1.107 Suich, R. C.; Patterson, R. L. (1993): Minimize system cost by choosing optimal subsystem reliability and redundancy. Proc. Annual Reliability and Maintainability Symp., pp. 293-297
1.108 Le Traon, J.-Y.; Treheux, M. (1977): L'environnement des matériels de télécommunications. L'écho des recherches, October, pp. 12-21
1.109 Tretter, J. (1974): Zum Driftverhalten von Bauelementen und Geräten. Qualität und Zuverlässigkeit (Germany), vol. 19, no. 4, pp. 73-79
1.110 Villemeur, A. (1993): Sûreté de fonctionnement des systèmes industriels. 2nd Edition, Eyrolles, Paris
1.111 Williams, S. D. G. (1980): Fault tree analysis. In: Arsenault, J. E.; Roberts, J. A. (eds.): Reliability and maintainability of electronic systems. Computer Science Press
1 Introduction 41

1.112 Wong, K. L. (1990): What is wrong with the existing reliability methods'? Quality and
Reliability Engineering Internat., vol. 6, pp. 251-258
1.113 Denson, W. K.; Keene Jr., S. J. (1998): A new reliability-prediction tool. Proceedings of
the Annual Reliability and Maintainability Symp., January 19-22, Anaheim, California
(USA), pp.15-22
1.114 Lin, D. L.; Welsher, T. L. (1998): Prediction of product failure rate due to event-related
failure mechanisms. Proceedings of the Annual Reliability and Maintainability Symp.,
January 19-22, Anaheim, California (USA), pp. 339--344
1.115 De Mari, A. (1968): An accurate numerical steady-state one-dimensional solution of the
pnjunction. Solid-St. Electron., vol. 11, pp. 33-39
1.116 Frohman-Bentchkowski, D.; Grove, A. S. (1969): Conductance of MOS transistors in
saturation. IEEE Trans. Electron. Dev., vol. 16, pp. 108-116
1.117 Sincell, J.; Perez, R. J.; Noone, P. J.; Oberhettinger, D. (1998): Redundancy verifiaction
analysis: an alternative to FMEA for low-cost missions. Proceedings of the Annual Reli-
ability and Maintainability Symp., January 19-22, Anaheim, California (USA), pp. 54-60
1.118 Grove, A. S.; Deal, B. E.; Snow, E. H.; Sah, C. T. (1965): Investigation of thermally
oxidized silicon surfaces using MOS structures. Solid-State Electron., vol. 8, pp. 145-165
1.119 Hauser, 1. J. R.; Littlejohn, M. A.(1968): Approximations for accumulation and inversion
space-charge layers in semiconductors. Solid-St. Electron., vol. 11, pp. 667-674
1.120 Leistiko, 0.; Grove, A. S.; Sah, C. T. (1965): Electron and hole mobility in inversion
layers on thermally oxidized silicon surfaces. IEEE Trans. Electron Dev., vol. 12, pp.
248-255
1.121 Hoffman, D. R. (1998): An overview of concurrent engineering. Proceedings of the An-
nual Reliability and Maintainability Symp., January 19-22, Anaheim, California (USA),
pp.I-7
1.122 Onodera, K. (1997): Effective techniques ofFMEA at each life-cycle stage. Proceedings of
the Annual Reliability and Maintainability Symp., January 13-16, Philadelphia, Pennsyl-
vania (USA), pp. 50--56
1.123 Gulati, R.; Dugan, J. B. (1997): A modulat approach for analyzing static & dynamic fault-
trees. Proceedings of the Annual Reliability and Maintainability Symp., January 13-16,
Philadelphia, Pennsylvania (USA), pp. 57--63
1.124 Price, C. 1.; Taylor, N. S. (1998): FMEA for multiple failures. Proceedings of the Annual
Reliability and Maintainability Symp., January 19-22, Anaheim, California (USA), pp.
43-47
1.125 Bowles, J. B. (1998): The new SAE FMEA standard. Proceedings of the Annual Reliabil-
ity and Maintainability Symp., January 19-22, Anaheim, California (USA), pp. 48-53
1.126 Upadhayayula, K.; Dasgupta, A. (1998): Guidelines for physics-of-failure based acceler-
ated stress testing. Proceedings of the Annual Reliability and Maintainability Symp.,
January 19-22, Anaheim, California (USA), pp. 345-364
1.127 Klyatis, L. M. (1997): One strategy of accelerated-testing technique. Proceedings of the
Annual Reliability and Maintainability Symp., January 13-16, Philadelphia, Pennsylvania
(USA), pp. 249-253
1.128 Epstein, G. (1998): Tailoring ESS startegies for effectiveness & efficiency. Proceedings of
the Annual Reliability and Maintainability Symp., January 19-22, Anaheim, California
(USA), pp. 37-42
1.129 Zimmer, W. J.; Keats, J. B.; Prairie, R. P. (1998): Characterization of non-monotone
hazard rates. Proceedings of the Annual Reliability and Maintainability Symp., January
19-22, Anaheim, California (USA), pp. 176--181
1.130 Zimmerman, P. (1997): Concurrent engineering approach to the development of the
TM6000. Proceedings of the Annual Reliability and Maintainability Symp., January 13-
16, Philadelphia, Pennsylvania (USA), pp. 13-17
1.131 Dugan, J. B.; Venkataraman, R. G. (1997): DIFtree: a software package for analyzing
dynamic fault-tree models. Proceedings of the Annual Reliability and Maintainability
Symp., January 13-16, Philadelphia, Pennsylvania (USA), pp. 64-70
42 1 Introduction

1.132 Anand, A.; Somani, A. K. (1998): Hierarchical analysis of fault trees with dependencies,
using decomposition. Proceedings of the Annual Reliability and Maintainability Symp.,
January 19-22, Anaheim, California (USA), pp. 69-75
1.133 Kocza, G.; Bossche, A. (1997): Automatic fault-tree synthesis and real-time tree trimming,
based on computer models. Proceedings of the Annual Reliability and Maintainability
Symp., January 13-16, Philadelphia, Pennsylvania (USA), pp. 71-75
2 State of the art in the reliability
of electronic components

Today, the manufacturing of electronic components is a highly dynamic process,
because the great demands imposed on the performance specifications of modern
devices dictate a quick rate of change for these products. Electronic components
and, especially, semiconductor devices have always been thought of as having the
potential to achieve high reliability and, consequently, many quality and
reliability techniques were developed particularly for these devices. Hence,
reliability research in this field stands at the front line in the battle for the
best products.
The evolution of the reliability field can be traced along the milestones of
semiconductor manufacturing history, as given by Birolini [2.15], Kuehn [2.48],
Knight [2.46] and Bâzu [2.12]. The "new wave" in the reliability field, arriving
after 1990, imposed some cultural changes, whose main features are shown in
Table 2.1.

Table 2.1 The evolution of the reliability field

Period       Main features                          Domain

1945-1960    Normal tests on finished products      Final inspections
             Collection of reliability data
             Failure analysis

1960-1975    Accelerated life tests                 Control
             Statistical process control (SPC)
             Physics of failure
             Reliability prediction

1975-1990    Failure prevention                     Assurance
             Process reliability
             Screening strategies
             Testing-in reliability

After 1990   Total quality management (TQM)         Management
             Concurrent engineering (CE)
             Building-in reliability
             Acquisition reform

These changes determine a new attitude toward the reliability field expressed by
the approaches in the main domains concerning the reliability of semiconductor
devices, domains listed in Table 2.2.

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999
44 2 State of the art in reliability

Further on, the new trends in each of these domains (cultural features, reliability
building, reliability evaluation, and standardisation) will be identified.

Table 2.2 Actual domains in the reliability of semiconductor devices


Cultural features        Quality and reliability assurance
                         Total quality management
                         Building-in reliability
                         Concurrent engineering
                         Acquisition reform

Reliability building     Design for reliability
                         Process reliability
                         Screening and burn-in

Reliability evaluation   Environmental reliability testing
                         Accelerated life tests
                         Physics of failure
                         Prediction methods

Standardisation          Quality systems
                         Dependability

2.1
Cultural features

Firstly, the basic approach describing the new wave in reliability and the cultural
features of the present period will be presented.

2.1.1
Quality and reliability assurance

Quality assurance means all the organisational and technical activities assuring
the quality of design and manufacturing of a product, while also taking into
account economic constraints.
Traditionally, quality assurance performs its function through inspection and
sorting operations. This strategy implicitly accepts that large amounts of
nonconforming material are produced. Consequently, the quality assurance
department assumes a police role, guarding against the nonconforming material.
The new quality assurance function, based on prevention by eliminating the
sources of nonconforming material, arose in the early 1970s.
The nonconforming material has two major causes: inadequate understanding of the
requirements and unsatisfactory processes. The quality assurance team must
determine, analyse and disseminate the requirements, both at the manufacturer and
in the customer's hands. In addition, it must determine the process capability,
bring it to the required level and hold it there.
The new approach, based on prevention, required a paradigm shift for most people,
since they were accustomed to inspection-based systems. In the past, when
problems arose, they called for more inspection rather than eliminating the root
causes and performing corrective action.
The key to this new quality and reliability assurance paradigm is the feedback
process, containing corrective action and preventive action. The distinction
between these types of action is:
• Corrective action deals with reliability problems found during the production
of a current item, which are solved by modifying the design, the manufacturing or
control instructions, or the marketing programs (Fig. 2.1).
• Preventive action deals with the response given by the manufacturer to the
corrective action; it is intended to eliminate the generic causes of product
unreliability.

Fig. 2.1 Corrective action in quality and reliability assurance programme

The reliability problems found during the field use phase can also be taken into
account for corrective and preventive actions. A very reliable link must be created
between all the teams involved in the quality and reliability assurance.
It is important that a reliability assurance program contains the following
elements:
• A set of strategic and tactical objectives.
• A reliability program with objectives for the different organisational
segments.
• A measurement process for the global system, complementary to the reliability
measurements performed by each organisational segment (design, manufacturing,
etc.).
• A very strong feedback process based on corrective and preventive actions.
The system for quality and reliability assurance must be described in an
appropriate handbook supported by the company management. In any case, the
reliability team must report only to the quality assurance manager (Fig. 2.3).
Further details about this subject are given in [2.15].
Fig. 2.2 Information flow between the quality assurance department and other
departments (preparation of the activity, leading, designing, material
purchasing, data processing, planning, management, manufacturing, sales, service,
dispatch, sub-suppliers)

Fig. 2.3 An example of the structure for quality and reliability activity

2.1.2
Total quality management (TQM)

At the end of the 1980s, a new approach, called total quality management (TQM),
was introduced. The definition given in August 1988 by the Department of Defence
of the USA, as reported by Yates and Johnson [2.74], considers TQM an application
of management methods and human resources with the purpose of controlling all
processes and achieving a continuous improvement of quality. This is the
so-called total quality approach. TQM demands teamwork, commitment, motivation
and professional discipline. It relies on people and involves everyone. In fact,
as Birolini [2.15] said, TQM is a refinement of the concept of quality assurance.
TQM is based on four principles, presented in Table 2.3.
The relationship between the customer and the manufacturer changes its content.
A real partnership is created (see Fig. 2.4), but this change must also occur at
the level of the other relationships, inside the company:
• The relationship customer/market - the fabrication facilities, including the
development, the manufacturing and the post-sale service, must be weighed
against the user requirements.
• The relationship marketing/development - the technical specifications must
derive from the marketing activity.
• The relationship development/fabrication - the results obtained by the
development must be transmitted to the manufacturing.
• The relationship sales/suppliers - the technical team and the purchasing team
must have a common opinion about the specifications.
• The relationship post-sale service/customer - the customer's requirements must
be satisfied.

Table 2.3 The principles of TQM

Principles               Explanations

Customer satisfaction    Total quality means satisfaction of the needs and
                         expectations of the customer

Plan - do - check - act  Known also as the Deming circle: plan what to do -
                         do it - check the results - act to prevent further
                         error or to improve the process

Management by facts      First collect objective data, then manage according
                         to these data

Respect for people       Assuming that all employees have a capacity for
                         self-motivation and creative thought

Fig. 2.4 The relationship between supplier and customer in a total quality system

Recently, a new tendency has appeared, trying to replace the term TQM with other
terms, such as CI (constant improvement) and TQL (total quality leadership)
[2.25].
2.1.3
Building-in reliability (BIR)

The implementation of TQM requires changes in the organisational environment. An
example is the role of the reliability group. Traditionally, this group defined
the testing requirements for a new product (correlated with operational
conditions), performed the stress tests and reported the final results.
Consequently, the reliability risk was assessed at the end of the development
process, and it was difficult for the reliability group to be involved in the
product development: only reactions to the development team - a team not
containing the reliability group - were allowed. The lack of an integrated
reliability effort leads to the cultivation of an organisational climate that
recognises winners and losers in the new-product introduction process. This can
lead to a tension between the new-product development team and the
reliability-engineering organisation that further limits its access to the
new-product development process. Hence, the reliability group focused on
reliability evaluation, which develops reaction rather than anticipation skills,
the latter being required by TQM.
Efforts were made to surpass the weak points of the traditional approach to
reliability improvement. As semiconductor devices became more reliable, the
problem of ever-rising costs and longer testing times began to be recognised.
BIR is a new concept, which arose on 27-29 March 1990, at the 28th edition of the
International Reliability Physics Symposium held in New Orleans, Louisiana
(USA). There is a shift in focus within the semiconductor reliability community
from the traditional reliability measurement models to building-in reliability.
It was felt that improvement in reliability can only be realised if emphasis is
placed on identifying and controlling the critical input variables (process and
control parameters) that affect the output variables (such as failure rates and
activation energies) [2.10]. The process of implementing BIR begins with looking
at the output variables and then working backwards to identify the key input
variables that have an impact on them. Eventually, the identified input
variables are monitored and a stable manufacturing process is obtained.
Experimental results proved that BIR is an effective approach [2.37]. The core
elements of a BIR approach are presented in Table 2.4.
For semiconductor devices, the BIR principles require an understanding of all
elements related to:
• The design (robust design, design for reliability and testability).
• The processing (process monitoring, materials characterisation, screening).
• The testing (final testing, periodic tests).
The implementation of a BIR approach involves too many cultural changes and too
many segments of the semiconductor and allied industries to evolve quickly
enough without significant assistance. Consequently, for the next years, the
testing-in reliability (TIR) approach remains an important tool and a complement
even for a BIR technology. This means that, together with the implementation of
a documented control of the input parameters, the reliability must be tested and
monitored on the manufacturing flow¹.

Table 2.4 The core elements of a building-in reliability approach

Element                        Details

Proact rather than react       Identify and eliminate or control the causes of
                               reduced reliability rather than test for and
                               react to the problem

Control the input parameters   Control the input parameters of the process
                               rather than test the results of the process

Integrate the reliability      Integrate the reliability-driven considerations
                               into all phases of manufacturing

Assess the reliability         Assess the reliability of the product on the
                               basis of a documented control of critical input
                               parameters and of the reliability-driven rules

2.1.4
Concurrent engineering (CE)

Concurrent engineering (CE) is a DoD (Department of Defence of the USA)
initiative for the defence industry, successfully used by commercial industries
too. It is a systematic approach to the integrated concurrent design of products
and their related processes, including manufacture and support. This approach is
intended to

Fig. 2.5 Elements of a concurrent engineering (CE) analysis: robust design,
manufacturability, testability, quality and reliability, and the environmental,
technical and economical requirements

¹ The BIR focus is on uncovering and understanding the causes of reduced
reliability and on finding ways to eliminate or control them. In doing so, the
approach offers not only new measures for product reliability, but also a
methodology for attaining ever-greater product reliability.
cause the developers, from the outset, to consider all elements of the product life
cycle from conception through disposal, including quality, cost, schedule and user
requirements (MIL-HDBK-59, Dec. 1988).
As Hoffman [2.41] points out, CE must include business requirements, human
variables and technical variables. All these elements are presented in Fig. 2.5
and must be taken into account starting with the design phase. The design team
contains specialists from various fields - designing, manufacturing, testing,
control, quality, reliability, service - working in parallel (in fact, another
name for CE is parallel engineering). Each specialist works part-time on a
project and is involved at each phase of the development process. A synergy of
the whole team must be realised: the final result surpasses the sum of the
individual contributions.
With CE, the number of iterations of a project is diminished and the time period
required to obtain a new product is shortened. An important change in mentality
must be performed: from toss it over the wall to a synergetic team. A strong
supporter of CE is the DoD, which encourages its contractors to lead the way.

2.1.5
Acquisition reform

In June 1994, the Department of Defence (DoD) of the USA abolished the use of
military specifications and standards in favour of performance specifications
and commercial standards in DoD acquisitions [2.25]. Consequently, in October
1996, MIL-Q-9858, Quality Program Requirements, and MIL-I-45208 A, Inspection
System Requirement, were cancelled without replacement. Moreover, contractors
will have to propose their own methods for quality assurance, when appropriate.
It is likely that ISO 9000 will become the de facto quality system standard. The
DoD policy allows the use of military handbooks only for guidance. Many
professional organisations (e.g. the IEEE Reliability Society) are attempting to
produce commercial reliability documents to replace the vanishing military
standards [2.35]. Besides them, there are a number of international standards
produced by IEC TC-56, some NATO documents, British documents and Canadian
documents. In addition to the new standardisation activities, Rome Laboratory
(USA) is also undertaking a number of research projects to help implement the
acquisition reform. However, there are voices, such as Demko [2.32], considering
that a logistic and reliability disaster is possible, because commercial parts,
standards and practice may not meet military requirements. For this purpose, IIT
Research Institute of Rome (USA) developed, in June 1997, SELECT, a tool that
allows users to quantify the reliability of commercial off-the-shelf (COTS)
equipment in severe environments [2.53]. Also, beginning with April 1994, a new
organism, called GIQLP (Government and Industry Quality Liaison Panel), made up
of government agencies, industry associations and professional societies, has
been intimately involved in the vast changes being made in the government
acquisition process [2.63].
A great effort was made for the reliability evaluation of Plastic Encapsulated
Microcircuits (PEM), which are considered typically commercial devices. The
current use of these devices is an example of reliability engineering responding
to both technology trends and customer policy [2.24]. The acquisition reform
policy encouraged the U.S. Military to use PEM over other packages. On the
technical side, users of PEM are employing Highly Accelerated Stress Testing
(HAST) and acoustic microscopy to screen out flawed devices. While the
reliability of PEM is constantly improving, the variability between suppliers
remains a problem. More details are given in chapter 12.

2.2
Reliability building

Reliability is built in at the design phase and during manufacturing. This means
that reliability concerns must be taken into account both at the design of the
process/product (the so-called design for reliability) and at the manufacturing
stage (process reliability). Special attention must be given to the last step of
the manufacturing process, the screening (or burn-in).
The component reliability is influenced by the materials, the concept and the
manufacturing process, but it also strongly depends on the incoming inspection
conditions, so not only the component manufacturer, but the equipment
manufacturer too must contribute greatly to the reliability growth of the
equipment. If the failure rate is constant during the operation period, this is
a consequence of a good component selection during the manufacturing process.
But there are also components that frequently fail without any previously
observed wearout effect. The early failures - usually produced as a consequence
of an inadequate manufacturing process - must be avoided from the beginning, in
the interest of the manufacturer as much as of the user. Unfortunately, this
wish is not always feasible, above all because physical and chemical phenomena
with unknown action can produce hidden defects which appear as early failures.

2.2.1
Design for reliability

This new concept is an important step in the implementation of the cultural
changes, being linked with concurrent engineering. First, the customer's voice
is to be considered in the design, being translated into an engineering function
[2.49]. Then, the design must be immune to the action of perturbing factors, and
this can be done with the so-called Taguchi methods. This means: (i) to develop
a metric capturing the function while anticipating possible deviations
downstream and (ii) to design a product that ensures the stability of the metric
in the presence of deviations. Finally, the design team must use reliable
prediction methods. In principle, design for reliability means to pass from
evaluate and repair to anticipate and design. An important contribution to the
development of design for reliability was given by the special issue on this
subject of IEEE Transactions on Reliability, June 1995, with papers covering the
various aspects of the subject. Taguchi [2.65] talked about developing a stable
technology by taking into account not only the predictable variations in
manufacturing and operation, but also the unknown or unproved ones. Other papers
treated the logic synthesis to handle electromigration and hot-carrier
degradation early in the design stage [2.60] or synergetic reliability
predictions to assess the potential failure mechanisms induced at each
manufacturing step [2.9]. Later, Yang and Xue [2.73] applied the fractional
experiment method to degradation testing and reliability design, based on a
safety-margin function. This is an improvement of the signal-to-noise ratio
defined by Taguchi, but with a clearer relationship with the reliability
measure.
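Taguchi's signal-to-noise ratio, which the safety-margin function cited above
refines, can be sketched for the "larger-the-better" case; the function name and
the breakdown-voltage readings below are hypothetical illustrations:

```python
import math

def sn_larger_the_better(values):
    """Taguchi signal-to-noise ratio (in dB) for a larger-the-better response:
    S/N = -10 * log10( mean(1 / y_i^2) )."""
    msd = sum(1.0 / (y * y) for y in values) / len(values)
    return -10.0 * math.log10(msd)

# Hypothetical breakdown-voltage readings (V) for two design variants;
# the variant with less spread scores the higher (better) S/N ratio.
variant_a = [38.0, 40.0, 39.0, 41.0]
variant_b = [30.0, 48.0, 33.0, 45.0]
```

A robust design maximises this ratio over the controllable design parameters, so
that the response stays high in spite of the perturbing (noise) factors.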
To address the impact of temperature, the following six-step physics-of-failure
method [2.74] should be used:
• Develop a thorough knowledge and understanding of the environment in which
the equipment will operate.
• Develop an understanding of the material properties and architectures used in
the design.
• Learn how products fail under various forms of degradation.
• Carefully examine field failure data to get information on how failures occur.
• Control manufacturing to reduce the variations that cause failure.
• Design the product to account for temperature related degradation of the
performance.
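For the temperature-related degradation addressed in the last step, the
Arrhenius model is the usual starting point. A sketch, assuming an illustrative
activation energy of 0.7 eV (the function and variable names are not from the
text):

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor between use and stress temperatures (in Celsius):
    AF = exp( (Ea/k) * (1/T_use - 1/T_stress) ), with temperatures in kelvin."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_use - 1.0 / t_stress))

# Illustrative case: Ea = 0.7 eV, use at 55 degC, life test at 125 degC;
# one stress hour then corresponds to several tens of use hours.
af = arrhenius_af(0.7, 55.0, 125.0)
```

The activation energy is failure-mechanism specific, which is exactly why the
six steps above insist on understanding materials and field-failure data before
extrapolating.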

2.2.2
Process reliability

A manufactured device is a collection of failure risks, depending on a large variety


of factors, such as: quality of materials, contamination, quality of chemicals and of
the packaging elements, etc. It may be noted that these factors are interdependent
and, consequently, the failure risks may be induced by each technological step or
by the synergy of these steps [2.8].

2.2.2.1
Technological synergies

Particle contamination is a good example of technological synergies, with two
effects inducing failure risks for the future device:
• The physical effect: the particles mask an area of the chip, hindering the
deliberate impurity doping process or producing the breakdown of the processed
layer.
• The chemical effect: the particle contaminant diffuses into the crystal,
producing electrical effects, such as soft I-V characteristics or premature
breakdown; the electrical effect may appear later, during device operation,
after the contaminant has migrated into the active area.
For the physical effect, a failure-risk synergy is obvious at the subsequent
manufacturing steps:
• at photolithography, the dust particles reaching the transparent areas of the
masks transfer their images onto the wafer;
• at etching, metallisation and ion implantation, the particles may produce
short circuits, open circuits, pinholes or localised areas with different
electrical properties.
For the chemical effect, a failure-risk synergy arises because the contaminants
containing alkali ions become active during the thermal processes (oxidation,
diffusion). Localised regions with ionic contamination arise, and the ions
migrating to the active areas of the device produce an increase of the leakage
currents or a drift of the threshold voltage (for MOS devices).
Some corrective actions may be used for removing these effects:
• contamination prevention, by identifying and avoiding the contamination
sources;
• wafer cleaning, with the most sophisticated methods for removing the
particles reaching the wafer.
The most important contamination sources are shown in Fig. 2.6.
In the back-end of the technological flow, the different constitutive elements
of an electronic component (die, package, and encapsulation) are coated with a
metallic layer, with the aim of fulfilling the prescriptions in accordance with
their requirements and of guaranteeing a high operational reliability. The most
important phases of this back-end part are:
• Die bonding to the package.
• Wire bonding from the conductive areas of the semiconductor die to the
conductive surface of the package.
• Electrical soldering (or Zn soldering) of the package on the socket.
The chemical structure and the cleanliness of the gold layer of the package and
of the die decisively influence all the manufacturing methods for semiconductor
components. Other important aspects are the capacity of the semiconductor die to
transfer the heat to the heat sink and the soldering resistance of the
electrical connections.
To control a manufacturing process means to maintain the quality of this process
over time, i.e. to assure the reliability of the process. The operations that
must be performed are evaluation, optimisation, qualification and monitoring. An
optimal process is first qualified and then, with the aid of the monitors, the
process can be kept under control. A specific tool is used for the evaluation,
namely statistical process control (SPC), containing such tools as cause-effect
diagrams (Ishikawa), Pareto diagrams, ANalysis Of VAriance (ANOVA), etc. To
optimise the process, Design of Experiments (DoE) must be used. After the
process is statistically controlled, one can act for the continuous process
improvement (CPI, or Kaizen, in Japanese), based on SPC.
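As an illustration of the SPC tools mentioned above, three-sigma control limits
for an individuals chart can be computed as follows; the function names and the
sample data are hypothetical:

```python
import math

def control_limits(samples):
    """Center line and three-sigma control limits for an individuals (X) chart.
    Sketch only: sigma is taken as the sample standard deviation; production
    SPC usually estimates sigma from the average moving range instead."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    sigma = math.sqrt(var)
    return mean - 3 * sigma, mean, mean + 3 * sigma

def out_of_control(samples):
    """Indices of the points falling outside the three-sigma limits."""
    lcl, _, ucl = control_limits(samples)
    return [i for i, x in enumerate(samples) if x < lcl or x > ucl]
```

A point outside the limits signals a special cause to be investigated; only when
all points vary within the limits is the process considered statistically
controlled and ready for continuous improvement.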
Recent studies suggest the use of test chips as an instrument for monitoring the
quality of each manufacturing step of VLSI chips and as a cost-effective
procedure for eliminating potentially "bad" wafers. The long-term reliability is
estimated on the basis of the test chips manufactured on the same wafer as the
fully functional chips.
Fig. 2.6 Distribution of contamination sources for semiconductor wafers (acids,
gases and solvents; static charge; wafer handling; personnel; masks; equipment)

In the 1970s, the use of new test structures for process monitoring was
initiated. By stressing these reliability test structures (used earlier in the
process and sensitive to specific failure mechanisms), more accurate information
about the reliability of the devices can be obtained, and in a shorter time than
with traditional methods. Because test structures are used, the extrapolation of
the results to the device level must be cautious. Since 1982, Technology
Associates has organised annually the wafer-level reliability (WLR) workshop,
where the WLR concept was developed. Tools were created allowing the
investigation of the reliability risks at the wafer level and the monitoring of
the process factors affecting the reliability. In a more general sense, WLR
problems are included in the process reliability concept. Hansen [2.40]
determined with a Monte Carlo simulation model the effectiveness of estimating
the wafer quality, in particular in terms of wafer yield. Reliability
predictions can be obtained from wafer test-chip measurements.
Details about the process reliability for particular types of electronic components will be given in chapters 3 to 10.

2.2.3
Screening and burn-in

The growing complexity of microelectronic components made it necessary to
elaborate more efficient test systems. Reliability screening is based on the study of
parameters which reveal the inherent weaknesses and the differences in capability
of parts that have not yet failed. For example, for some new types of integrated
circuits, produced in small series with insufficiently stable process parameters, it
can be hard to identify defects during the first operating hours (generally between
a hundred and a few thousand hours). Since these elements are often mounted in
different systems, these systems must be completed with supervision structures:
any real system should be redundantly designed, so that the errors can be
automatically corrected.

To better understand the role of screening tests in reliability estimation, an
example concerning the failure causes will be given. Assume that a printed
circuit board (PCB) carries 60 integrated circuits (ICs), and that the probability of
failure for an IC is 2%; all the ICs are considered statistically independent. It results
that the probability of finding at least one defective IC is 1 - 0.98^60 ≈ 0.7. Several
reasons can lead to component failures; for example, the components may be very
old, or they may be overloaded. In these cases, the screening tests make no sense. Other defects
result from the intrinsic weaknesses of the components. These weaknesses are
unavoidable and - within well defined limits - are accepted even by the
manufacturer. With the aid of electrical tests and/or operating tests (during
fabrication or before delivery), the components with such defects can be identified
and eliminated. Nevertheless, a small percentage 2 of components with
hidden defects remains, which - although still operational - have a low reliability and
negatively influence the reliability of the component batch. The role of the
screening tests is to identify the partially unreliable components, with defects that
do not lead immediately to non-operation. For each lot, the time dependence of
λ has the form - already presented - of the bathtub failure curve (Chap. 1). From
this point of view, the screening tests signify:

• Selection of the best lots.


• Elimination of the early failures from the selected lots.
For at least two reasons it is difficult to define a cost-effective screening sequence,
since: (i) it may activate failure mechanisms that would not appear in field
operation; (ii) it could introduce damage (transients, electrostatic discharges - ESD)
which may be the cause of further early failures.
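The PCB example above (60 independent ICs, 2% failure probability each) follows from the series-system rule, which can be checked numerically; a short sketch:

```python
def prob_at_least_one_defective(p_fail: float, n: int) -> float:
    """P(at least one of n independent components is defective)."""
    return 1.0 - (1.0 - p_fail) ** n

# The PCB example from the text: 60 ICs, 2% failure probability each.
p_board = prob_at_least_one_defective(0.02, 60)  # about 0.70
```

The same function shows how quickly the board-level defect probability grows with component count, which is the economic argument for screening at the component level.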
The following methods can be used:
• Rejecting inadequate batches in the early failure period.
• Sorting with the aid of electronic controls.
• Accelerated ageing of the medium level batches.
• Activating the catastrophic and drift failure modes.
• Using thermal, electrical and mechanical shocks (without exceeding the allowed
limits).
These methods can be applied in the following life stages of the products:
• At the level of components manufacturing.
• At the level of output control of the components, by the manufacturer.
• At the input control level, by the client.
• At the PCBs test level, with greater or smaller amplitudes of the stresses.
Generally, the selection is a 100% test (or a combination of 100% tests), the stress
factors being the temperature, the voltage, etc. followed by a parametric electrical

2 It is considered that the early failures vary between 1% and 3% for SSI/MSI ICs, and
respectively between 4% and 8% for LSI ICs [2.2(1996)]. The defective probability of a PCB
with about 500 components and 3000 solder joints can have the following average values [2.15]:
1-3% defective PCBs (1/3 assembling, 1/3 defective components, 1/3 components out of
tolerance) and 1.2 to 1.5 defects per defective PCB.

or functional control (performed 100%), with the aim of eliminating the defective
items, the marginal items or the items that will probably have early failures
(potentially unreliable items).
By definition, an accelerated test is a trial during which the stress levels applied to
the components are higher than those foreseen for the operational level; this stress
is applied with the aim of shortening the time necessary for observing the
behaviour of the component under stress.
Accelerated life testing is used to obtain information on the component
lifetime distribution (or on a particular component reliability parameter) in a timely
manner. To do this, a deep knowledge of the failure mechanisms - essential in all
reliability evaluations - is needed. In practice, the thermal test alone is not
sufficient for the reliability evaluation of a product; it is necessary to perform other
stress tests too (supposing that the stress is not "memorised", and consequently the
wearout does not exist).
The accelerated thermal test has an important disadvantage: there is a great
probability that the stress levels create failure mechanisms which do not usually
appear in normal operating conditions. On the other hand, for the comparative
evaluation of different component series this disadvantage does not exist. At any
rate, the accelerated thermal test is not a panacea for saving time or for elaborating
economical tests concerning the life testing and the behaviour of electronic
components.
The goal of screening tests can be realised in two ways: (a) the utilisation of the
maximum allowed load, since the components predestined to fail in the early failure
period are very sensitive to overloading; (b) the utilisation of several efficient
physical selection methods which can give information concerning any potential
weaknesses of the components (noise, non-linearity, etc.). In general, it can be said
that all selection tests and practical methods are described in MIL-STD 883. The
methods described in this handbook are too expensive for the usual industrial
purposes. It has been proved that the combination of different stresses to produce
the early failures of the elements, followed by a 100% electrical test, is optimal and
efficient, especially if the costs must be taken into account. Establishing the optimal
stresses (their sequence and duration) is a delicate problem, since the failures
depend on the integration degree, on the technology and on the manufacturing
methods. In the following, the most important test groups and their shortcomings
will be mentioned, without discussing the mechanical tests (acceleration, shocks,
vibrations).

2.2.3.1
Burn-in

The burn-in method (no. 1015.2 of MIL-STD 883D) belongs to the first test cate-
gory. Its goal is to detect latent flaws or defects that have a high probability of
coming out as infant mortality failures under field conditions. Although the major
defects may be found and eliminated in the quality and reliability assurance de-
partment of the manufacturer, some defects remain latent and may develop into
infant mortality failures over a reasonably short period of operation time (typically
between some days and a few thousand hours). It is not so simple to find

the optimum load conditions and burn-in duration 3, so that nearly all potential in-
fant mortality components are eliminated. There must be a substantial difference
between the lifetime of the infant mortality population and the lifetime of the main
(or long term) wearout population under the operating and environmental conditions
applied in burn-in [2.42]. The situation may differ depending on today's components,
on the new technologies, and on the custom-designed circuits. The trend is towards
monitored burn-in [2.59]. The temperature should be high, without exceeding
+150°C at the semiconductor crystal.
A clear distinction must be made between test and treatment. A test is a
sequence of operations for determining the manner in which a component is
functioning, a trial with previously formulated questions, without expecting
a detailed response. That is why the test time is short and the processing of the
results is made immediately. It is an attributive trial, which gives us information
of the type good/bad. As a treatment, the burn-in must eliminate the early
failures, delivering to the client the rest of the bathtub failure curve. We
distinguish three types of burn-in:
• Static burn-in: temperature stresses and electrical voltages are applied; all the
component outputs are connected through resistors to high or to low levels.
• Dynamic burn-in: temperature stresses and dynamic operation of components (or
groups of components).
• Power burn-in: operation at maximum load and at different ambient temperatures
(0 ... +150°C), together with the function test under the limits foreseen by the data
sheet for +25°C.
It is often difficult to decide when a static or a dynamic burn-in is more effective.
Should surface, oxide and metallisation problems be dominant, a static burn-in is
better; a dynamic burn-in activates practically all failure mechanisms. That is why
the choice must be made on the basis of practical results.
The static burn-in is used as control selection by the manufacturers and by the
users. Usually, according to MIL-STD 883D, a temperature of +125°C is applied
during 168 hours. Of all six basic tests specified by the method 1015.2, the
methods A and D are the most utilised (min. 168 h at the specified temperature).
The condition A foresees a static burn-in (only the supply voltages are present, so
that many junctions can be biased). This type is applied particularly if utilised
together with cooling, to bring forward the surface defects. The condition D is
frequently utilised for integrated circuits. The clock signal is active during the
whole burn-in period and exposes all the junctions as much to the direct voltages
as to the inverse voltages. All outputs are loaded to the maximum allowed value.
The direction in which the bias is applied will influence the power dissipation and
consequently the junction temperature of the device. However, in complex devices
there is very little distinction between stresses resulting from the two biasing
methods, since it becomes increasingly difficult to implement a clear-cut version of
either option.

3 Any application of a load over any length of time will use up component lifetime; there can
easily be situations where burn-in can use up an unacceptable portion of the main population
lifetime.

The static burn-in is particularly adequate for the selection of great quantities of
products, and is at the same time an economical procedure. The failure distribution
is dominated by the surface-, oxide- and metallisation-defect categories, resulting
from some type of contamination or corrosion mechanism 4.
The continuously growing number of LSI and VLSI ICs (memories, micropro-
cessors) has essentially contributed lately to disseminating the dynamic burn-in,
since the load can be easily regulated, the tests can be programmed, continuously
supervised and memorised, and the test results can be automatically and
statistically processed. The selection temperature usually varies between +100°C
and +150°C. Beyond a certain duration (normally between 48 and 240 hours,
depending on component and selection parameters), no further failure
diminishing occurs. The applied burn-in voltage also depends on duration; for
example, the same result can be obtained with the nominal voltage applied for 96
hours, or - with a higher applied voltage - after only 24 hours. But - as in the
case of temperature - the limit values must not be exceeded.
Another parameter for dynamic burn-in is the resolution, which determines the
maximum frequency of the stimuli sent to the components (for example, in the case
of ICs, a resolution of 100 ns corresponds to a frequency of 10 MHz). The best
solution is to come close to the effective operating frequency of the
component.
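The resolution-to-frequency relation quoted above is a simple reciprocal; as a one-line sketch:

```python
def max_stimulus_frequency_hz(resolution_s: float) -> float:
    """Maximum stimulus frequency allowed by a given pattern resolution (1/T)."""
    return 1.0 / resolution_s

# 100 ns resolution corresponds to 10 MHz, as in the text.
f_max = max_stimulus_frequency_hz(100e-9)
```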
MIL-STD 883 specifies clearly defined methods: class B (168 h / +125°C), class S,
for high reliability and special applications (240 h), etc., without any mention of the
particular manufacturers' methods or of the methods of IC users. Table 2.5 shows the
screening sequence according to MIL-STD-883, for ICs of class B quality.

Table 2.5 Screening procedure for ICs class B (MIL-STD-883)

Screening step Screening condition

Internal visual 100%


High-temperature storage 24 h / +150°C
Thermal cycling (20 x) -65°C to +150°C
Constant acceleration (only for hermetic packages) 30,000 g for 60 s
Reduced electrical tests 100%
Burn-in 168 h at +125°C
Electrical test 100%
Seal (fine/gross leak; only for hermetic ICs) 100%
External visual inspection 100%

4 Other defects include wirebond problems resulting from intermetallic formation and oxide
breakdown anomalies. Dynamic operation results in higher power dissipation, current densities
and chip temperature than the static burn-in configuration.

2.2.3.2
Economic aspects of burn-in

It is often asked whether one may replace the component burn-in with a burn-in
of equipped PCBs. The answer is negative, for three essential reasons:
• most equipped PCBs can't be exposed or operated at high temperatures;
• the hunting out of the early failures would have to be made through a repair and
renewal process, waiting for the failures to appear;
• at a reduced temperature, the acceleration time can't be extended to cover the
early failure period; by testing the equipped PCBs, the component itself can't be
tested in accordance with the complete data sheet specification.
Consequently, the component burn-in is the key to component reliability pro-
blems. The burn-in at the system level is recommended as a first step for burn-in
optimisation; by analysing the defects that appeared at this level, the utility of burn-in
for certain components can be better exploited. In fact, in most cases, the optimal
solution consists in a combination of burn-in at component level and at system level.
Although complementary, the equipped-PCB level is seldom utilised.
Theoretically, presuming that the environmental and selection conditions are
unchanged, a burn-in at system level must be optimised in relation to the
reliability and in relation to the costs. In the first case, the situation has some
ambiguities, since it is virtually impossible to eliminate all the weak components
with a burn-in. On the contrary, if one wishes the batch 5 to contain, after the burn-in,
only 1% of the potential failures, it is possible to determine the optimal duration
with the aid of a combination of analytical and graphical methods [2.42].
Concerning the burn-in optimisation costs, we can distinguish the following
parameters:
C_T - the total costs, in cost/equipment units;
C_1 - constant cost that can be expressed as units of costs per system (or units of
costs per equipment), independent of the burn-in duration and of the number of
failures recorded in this period (for example the burn-in installation and taking
down costs);
C_2 - costs that appear each time the equipment fails;
C_3 - costs depending on time, such as a) costs/equipment/day of ovens; b) costs
due to the delay of total production, for the number of days in which the systems
are submitted to burn-in; c) test and failure control costs (failure monitoring
costs);
C_4 - costs/failure/equipment for the systems under guarantee (repair cost at the
clients);
N_P - number of failures during the burn-in period;
N_b - number of failures after burn-in, during the guarantee period;
n - duration (number of days) of the equipment burn-in.

5 The assumptions of a good selection [2.2(1983)] are: (i) homogeneous batches; (ii) accelerated
ageing eliminates the early failures; (iii) accelerated ageing also eliminates the components
which normally should not fail during the first years of operation.

With these notations, we can write:

C_T = C_1 + N_P C_2 + n C_3 + N_b C_4.     (2.1)

If the burn-in is not performed and N_S is the number of failures without burn-in,
the total guarantee costs for an equipment are:

C_S = N_S C_4.     (2.2)

Fig. 2.7 Typical curves for the difference C_T - C_S (cost/equipment versus burn-in time, 0 to 6 days). Curve A shows a situation where burn-in does not pay off, i.e. the total costs using burn-in are always greater than the costs without burn-in, irrespective of the burn-in period; curve B demonstrates that a burn-in lasting about two days (48 h) gives the maximum economic benefit. [2.42]

It can be seen that the total costs C_T have a linear dependency on the
number of days in which the equipments are on burn-in, while the total
guarantee costs without burn-in, C_S, are a constant. If the difference C_T - C_S is
calculated using n as an independent variable, one obtains the curves plotted in
Fig. 2.7, corresponding to two different equipments.
For the curve A, the problem is to know whether the expected number of failures
(without burn-in) is acceptable. If the answer of the manufacturing firm is negative,
burn-in must be introduced at the system level, with a duration of 2-3 days, as being
more efficient. Certainly, the number of failures expected after this burn-in period,
and during the guarantee period, must be evaluated.
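The cost trade-off of Eqs. (2.1) and (2.2) can be explored numerically: evaluate C_T - C_S over a range of burn-in durations and pick the minimum. A minimal sketch, with entirely hypothetical cost figures and an assumed failure-versus-duration model:

```python
def total_cost_with_burnin(c1, c2, c3, c4, n_p, n_b, n_days):
    """C_T = C_1 + N_P*C_2 + n*C_3 + N_b*C_4  (Eq. 2.1)."""
    return c1 + n_p * c2 + n_days * c3 + n_b * c4

def cost_without_burnin(c4, n_s):
    """C_S = N_S*C_4  (Eq. 2.2): guarantee costs only."""
    return n_s * c4

# Hypothetical cost units and failure models (illustrative only).
C1, C2, C3, C4 = 10.0, 2.0, 5.0, 50.0
N_S = 4                                  # guarantee failures with no burn-in

def n_p(n):                              # failures caught during n days of burn-in
    return min(n, 3)

def n_b(n):                              # failures left for the guarantee period
    return max(0, N_S - n)

# Difference C_T - C_S for burn-in durations of 0..6 days (cf. Fig. 2.7).
diff = {n: total_cost_with_burnin(C1, C2, C3, C4, n_p(n), n_b(n), n)
           - cost_without_burnin(C4, N_S)
        for n in range(7)}
best_n = min(diff, key=diff.get)         # most profitable burn-in duration
```

With these assumed numbers the curve has a clear minimum at a few days, i.e. a "curve B" situation; a "curve A" situation would correspond to diff staying positive for every n.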
Any burn-in policy must be closely evaluated for each specific product leaving
the company.

2.2.3.3
Other screening tests

High temperature storage (stabilisation bake) - method 1008.1, MIL-STD 883D -
belongs to the second group of test methods and serves to stabilise the electrical
characteristics and the parameter drift. Although it is not considered to be a very
effective screen, it is inexpensive and a good instrument for surface-related
defects, accelerating the chemical degradation (contamination, substrate defects,
etc.). Usually, the tested components (the ICs are placed, pins down, on a metal
tray in the oven) remain for 24 hours at a temperature of +150°C (for an IC,
this temperature is much greater than the maximum allowed limit in operation).
The third group of tests is formed by the thermal cycles (method 1010.2, MIL-
STD 883D). This is a process that causes mechanical stresses, as the compo-
nents are alternately exposed to very high and very low temperatures. This
explains why the method can easily emphasise the potential defects of each tested
entity (capsule, marking, semiconductor surface, contact wires, structure soldering
defects, structure cracks). Thermal cycles are performed air-to-air in a two-
chamber oven (transfer from the high to the low temperature chamber, and vice versa,
using a lift). The non-biased ICs are placed on a metal tray (pins on the tray, to
avoid thermal voltage stress) and exposed to at least 10 thermal cycles (over the
temperature range -65°C ... +150°C), but 20 cycles are often used.
A typical cycle consists of a dwell time at the extreme temperatures (about 10 mi-
nutes), with a transfer time inferior to one minute. Should solderability be a
problem, an N2-protective atmosphere can be used. Normally, after the thermal
cycles a stabilisation at high temperature is made, with the aim of better localising
the defects.
The thermal shock belongs to the fourth group of methods (MIL-STD 883D, no.
1011.2). It is utilised to test the integrity of the connection wires (with important
dilatation coefficients, positive and negative). This method is similar to the
thermal cycles, but much harder, since the thermal transfer medium is not air, but
a transfer fluid able to produce the shock. The extreme temperatures must be
selected with care, because the thermal shock can destroy many constructive
elements, e.g. ceramic packages of ICs. We recommend not exceeding the extreme
temperatures of 0°C and +100°C. Even for these limits, the manufacturer must be
consulted.
The seal test (fine leak and gross leak) is performed to check the seal integrity of
the cavity around the chip in hermetically packaged ICs. For the fine leak, the ICs
are placed in a vacuum (1 h at 0.5 mm Hg) and stored in a helium atmosphere under
pressure (4 h at 5 atm), then placed under normal atmospheric conditions, in open air
(30 minutes); finally, a helium detector (required sensitivity 10^-8 atm cm³/s,
depending on the cavity volume) identifies any leakage.
For the gross leak, the ICs are placed in a vacuum (1 hour at 5 mm Hg) and then
stored for 2 hours under 5 atm in a fluorocarbon liquid. After a short exposure (2
minutes) in open air, the ICs are immersed in an FC-40 indicator bath at 125°C,
where the hermeticity is tested; the presence of a continuous stream of small
bubbles, or of two large bubbles from the same place within 30 seconds, indicates a
defect.

2.2.3.4
Monitoring the screening

Screening is an important step in the manufacturing of high reliability compo-
nents: the whole lot of finished devices undergoes a succession of tests, called a
screening sequence, intended to provoke the failure of low reliability components
(early failures, i.e. failures occurring during the first operating hours).

Consequently, the remainder of the lot has a better reliability. This is the ideal
case [2.15][2.2(1996)]. However, reports on components damaged after screening
have often been made. There are two sources for such an unlucky event: (i) the
screening sequence contains destructive tests; (ii) the electrical characterisation
does not succeed in eliminating the weak components.

Fig. 2.8 Flow-chart of MOVES (from the design of the screening sequence, through VERDECT and the screening sequence itself, to the screened lot)



To overcome these problems, a method was recently proposed [2.11]. The method
was called MOVES, an acronym for MOnitoring and VErifying a Screening
sequence. MOVES contains five procedures: VERDECT, LODRIFT, DISCRIM,
POSE and INDRIFT. One can say that, with MOVES, low reliability items move
away from a lot passed through a screening sequence. In Fig. 2.8, the flow chart of
MOVES is presented. From the designed screening sequence, VERDECT
(VERifying the DEstructive Character of a Test) identifies the destructive tests.
This category of tests must be substituted at the design review by non-destructive
tests activating the same failure mechanisms (e.g. thermal cycles may replace
thermal shocks). Then, the screening sequence is performed for all the N
components and the failed items are withdrawn. For the remainder of the lot,
LODRIFT (LOt DRIFT) can say if the drift of the lot - described by the mean of
each main electrical parameter - reaches the failure limit during the lifetime. If it
is so, the lot must be rejected.
If the answer is negative, the behaviour of individual items is to be investigated.
DISCRIM sets apart, by optimal discrimination, and eliminates the items which do
not follow the tendency of the whole lot; POSE (POSition of the Elements) identifies
the components which change their position in the parameter distribution at each
measuring moment; and INDRIFT (INDividual DRIFT) analyses the individual drift
of the main parameters of each component. Eventually, the failed items (nf) are
eliminated and for the remainder of the lot (N-nf) a higher reliability is obtained.
An improvement of the POSE method, using fuzzy logic, was recently
proposed [2.77].
Basically, with POSE, the electrical parameter drift of each item during the
screening sequence is carefully analysed after each screening test. For each
electrical parameter of the device, the value range is divided into five zones. The
position of the parameter value is noted at the beginning of the screening sequence
and then identified after each screening step. With an appropriate rule, the
movement of a parameter from one zone to another may be linked to the reliability of
the device. But the analysis is difficult to perform. The fuzzy logic may be useful in
this respect and, in the following, a method allowing the proper selection (and
removal) of the items which might fail in the future is presented.

Fig. 2.9 Fuzzy set: triangle-shaped membership function with five regions (membership μ(x) on the vertical axis)

The "mobility" of the parameter value after each screening test is investigated. A
triangle-shaped membership function with 5 regions (called: very small, small,
medium, high, very high, referring to the mobility value, m, with core values from
0.1 to 0.5) is used (Fig. 2.9), given by:

μ_i(x) = (x - r_i,l) / (r_i - r_i,l)   for r_i,l ≤ x ≤ r_i, and
μ_i(x) = (r_i,u - x) / (r_i,u - r_i)   for r_i < x ≤ r_i,u     (2.3)

where: r_i,l = r_i - 0.1; r_i,u = r_i + 0.1; r_1 = 0.1 (for very small), up to r_5 = 0.5 (for very high).
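The triangle-shaped membership function of Eq. (2.3) can be sketched directly (the 0.1 half-width follows from the core spacing given in the text):

```python
def triangular_membership(x, r_core, half_width=0.1):
    """Triangle-shaped membership function centred on r_core (cf. Eq. 2.3)."""
    r_lo, r_hi = r_core - half_width, r_core + half_width
    if x <= r_lo or x >= r_hi:
        return 0.0                              # outside the support
    if x <= r_core:
        return (x - r_lo) / (r_core - r_lo)     # rising edge
    return (r_hi - x) / (r_hi - r_core)         # falling edge
```

Evaluating this function for the five cores r_1 = 0.1 ... r_5 = 0.5 reproduces the overlapping triangles of Fig. 2.9.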
The "movement" of the parameter value from one zone to another is quantified by
the following rules:
• Initially, a "very small" mobility (m) is assigned to each device, with core value
0.1.
• Any "jump" from a zone to the next one is penalised by a doubling of m. This
multiplication factor becomes 3 and 4 for jumps over two or three zones,
respectively.
• A "jump back" of the same length as the initial jump does not modify m. If the jump
back is longer than the initial jump (two zones, instead of one), m is doubled. For
shorter jumps back (e.g. one zone, instead of two), m is diminished by 50%.
• If the parameter value remains in the same zone, m is each time diminished by
30%.
• Usually, the final screening test is a burn-in. It seems that the failures arising at
this test are indicative for the reliability. So, if a jump of one or two zones arises
at this final test, a value of 0.1 or 0.2, respectively, is added to m.
• Finally, the overall mobility (m) over the screening sequence is obtained for each
device. If this value is higher than 0.3, the device must be removed, because its
reliability is not high enough. Certainly, for various applications, other removal
limits may be established.
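The rules above can be turned into a small procedure. The sketch below is one literal reading of those rules; the published table was evidently computed with a few additional conventions not fully spelled out, so individual m values may differ slightly, while the keep/remove decision for the clear-cut items agrees:

```python
def mobility(zones, removal_limit=0.3):
    """Overall 'mobility' m for a device whose parameter visits the given zones
    (initial zone plus one entry per screening test); last test = burn-in."""
    m = 0.1                      # 'very small' starting mobility
    prev_jump = 0                # signed size of the previous non-zero move
    factor = {1: 2.0, 2: 3.0, 3: 4.0, 4: 4.0}
    for i in range(1, len(zones)):
        step = zones[i] - zones[i - 1]
        if step == 0:
            m *= 0.7             # staying in the same zone: m diminished by 30%
        elif prev_jump != 0 and step * prev_jump < 0:
            # a 'jump back' relative to the previous jump
            if abs(step) > abs(prev_jump):
                m *= 2.0         # longer jump back: m doubled
            elif abs(step) < abs(prev_jump):
                m *= 0.5         # shorter jump back: m diminished by 50%
            # equal-length jump back leaves m unchanged
        else:
            m *= factor[abs(step)]
        if i == len(zones) - 1 and abs(step) in (1, 2):
            m += 0.1 * abs(step)  # extra penalty for a jump at the final burn-in
        if step != 0:
            prev_jump = step
    return m, m > removal_limit
```

For item no. 7 of Table 2.6 (zones 1, 1, 1, 1) this yields m ≈ 0.035 and the device is kept; for item no. 4 (zones 4, 5, 3, 2) m exceeds the 0.3 limit and the device is flagged for removal, as in the table.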

Table 2.6 Selection of the reliable items at screening, for a batch of 15 items (fuzzy method with
5 regions)

Item       Initially       After T1       After T2       After T3
no.      Zone    m      Zone    m      Zone    m      Zone    m
1. 2 0.1 2 0.07 3 0.14 3 0.1
2. 4 0.1 4 0.07 3 0.14 3 0.1
3. 4 0.1 4 0.07 4 0.05 3 0.2
4. 4 0.1 5 0.2 3 0.3 2 0.7
5. 2 0.1 2 0.07 2 0.05 1 0.2
6. 2 0.1 1 0.07 2 0.14 1 0.27
7. 1 0.1 1 0.07 1 0.05 1 0.035
8. 3 0.1 3 0.07 4 0.14 2 0.48
9. 3 0.1 3 0.07 5 0.21 4 0.2
10. 5 0.1 5 0.07 5 0.04 4 0.18
11. 4 0.1 4 0.07 3 0.14 1 0.42
12. 5 0.1 5 0.07 4 0.14 4 0.1
13. 3 0.1 3 0.07 2 0.14 2 0.1
14. 3 0.1 4 0.2 2 0.4 2 0.28
15. 2 0.1 2 0.07 1 0.14 1 0.1

The procedure will be detailed for a case study. For a batch of 15 devices
undergoing a screening sequence with 3 tests (temperature cycling, acceleration
and burn-in), an electrical parameter is measured initially (i) and after each test
(T1, T2, T3). The results are presented in Table 2.6, together with the mobility
values (m) calculated following the rules previously presented. As a conclusion, the
"mobility" being higher than 0.3, the devices no. 4, 8 and 11 must be removed.
A new methodology to select an effective burn-in strategy for ICs used in
automotive applications is given by Tang [2.68]. The key is to analyse the failure
mechanisms for different technologies and to use the results, together with the IC
family data, to determine appropriate burn-in conditions for new ICs. The results
have shown that burn-in is useful for detecting wafer processing defects rather than
packaging defects.

2.3
Reliability evaluation

To deliver data concerning the reliability of components, it is necessary to establish
technical criteria linked to the basic parameters. To evaluate the degree of adequacy
to the pursued utilisation goal, one must agree on several quantitative magnitudes
that can be deduced from the component parameters. The causes of component
unreliability can be total failures or drift failures. The latter, in contrast to the total
failures, are statistically predictable and generally contribute to the perturbation of
the equipment where the components are included. In addition to these statistical
modifications, sudden and unpredictable changes appear.
The reliability of an electronic component is determined by the relation between
two basic elements: the stress (characterising the environment and the electrical
constraints) and the strength (expressing the capacity of the product, built by the
manufacturing process, to fulfil the task). There are four conceptual models for this
relation [2.25]:
• Stress-strength (the component fails when the stress surpasses the strength; the
model describes critical events, the strength being treated as a random variable).
• Damage-endurance (a stress produces an irreversibly accumulated damage; this
increasing damage does not degrade the performances of the component, but when
a threshold is exceeded, the failure occurs).
• Challenge-response (the failure of a component occurs only if the component is
functioning).
• Tolerance-requirement (the parameter drift may be tolerated if the operational
requirements are not exceeded).
This last conceptual model requires prediction models describing the relationship
between the parameter drift and the end of the device life. Such models have been
developed starting from the 70's [2.78] ... [2.80]. The model proposed by Ash and
Gorton [2.81] starts from the hypothesis that the physico-chemical reactions between
the impurities in the semiconductor volume, between the chip surface and the
package, etc., developed in the presence of environmental stress factors, produce
drifts of the electrical parameters, leading eventually to failure. One assumes that the electrical

and thermal stress follows an Arrhenius model. A burn-in period (t_B) is followed
by a functioning test at accelerated thermal stress (with the duration t_A), till the end
of life (t_E). On the basis of the parameter measurements - the initial value (p_I) and
the values measured at t_B (noted p_B) and at t_B + t_A (noted p_A) - the model gives
the drift at the end of life (t_E), noted Δp_E/p_I. The following relation is
obtained:
Δp_E / p_I = [Δp_A / (p_I ln r)] ln[r² + (r - 1)(t_E/t_B)] -

- (r + 1) exp[-(E_a/k)(T_O⁻¹ - T_A⁻¹)]     (2.4)

where: r = tA/tB, Ea = the smallest activation energy for the involved failure
mechanisms (in eV), To = operational temperature (in K), T A = temperature of the
accelerated testing (in K). The model may be used for the following cases:
• The parameter p has an initial value, Ph then a value after burn-in, PB, reaching
PA after accelerated testing and, eventually, PE at the end oflife.
• The parameter p has initially the value 0, having then successively the values PB,
PA and PE, after burn-in, after accelerated testing at the end oflife, respectively.
• The parameter p has initially the value -Ph reaches the value 0 after burn-in and
has the values PA and PE, after accelerated testing at the end oflife, respectively.
The model is completely characterised by the values measured initially and at the
moments tB and tA and by the activation energy.

2.3.1
Environmental reliability testing

Finding a correct definition of the environment is the first step in Environmental
Reliability Testing. For this purpose, an international document, namely IEC 721
"Classification of environmental conditions", may be used. The environmental
conditions are codified with three characters: a figure (from 1 to 7) indicating the
using mode, a letter indicating the environmental conditions, and again a figure
(from 1 to 6) indicating the severity degree. As examples, the climatic conditions
for use in a fixed post protected from bad weather (Table 2.7) and for a fixed post
unprotected from bad weather (Table 2.8) are given.
One may notice that, for the same severity degree, the climatic conditions for
use in an unprotected post are more severe. For instance, the maximum air
temperature is 40°C for 3K4, and 55°C for 4K4, respectively. Now we have all the
elements for expressing the environment of a device. First, the use type must be
settled, among the seven categories: 1 - storage, 2 - transport, 3 - use in a fixed post
protected from bad weather, 4 - use in a fixed post unprotected from bad weather,
5 - use in a terrestrial vehicle, 6 - use at sea, 7 - use in portable sets. Then,
the environmental conditions are indicated by letters: K - climatic, Z - special
climatic, B - biological, C - chemically active substances, F - contaminant fluids, M
- mechanical. Eventually, the severity degree (from 1 - small, to 6 - high) is
indicated. Some examples: 3Z1 - negligible heat irradiation from the environment,
3C1 - chemically active substances, 3M3 - mechanical conditions of vibrations /
shocks.
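As a small illustration of this coding scheme, a class such as 3K4 can be split into its three parts programmatically (a Python sketch; the function name is ours):

```python
def parse_iec721_code(code):
    """Split an IEC 721 environmental class (e.g. '3K4') into use mode
    (1-7), condition letter(s) and severity degree (1-6)."""
    use_mode = int(code[0])
    letters = ''.join(ch for ch in code[1:] if ch.isalpha())
    severity = int(''.join(ch for ch in code[1:] if ch.isdigit()))
    return use_mode, letters, severity
```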

Table 2.7 Climatic conditions for use in a fixed post protected from bad weather

Environmental agent                    Unit    3K1   3K2   3K3   3K4   3K5   3K6
Minimum air temperature                °C      +20   +15   +5    +5    -5    -25
Maximum air temperature                °C      +25   +30   +40   +40   +45   +55
Low relative humidity                  %       20    10    5     5     5     10
High relative humidity                 %       75    75    85    95    95    100
Variation rate of the air temperature  °C/min  0.1   0.5   0.5   0.5   0.5   0.5
Solar irradiation                      W/m²    500   500   700   700   700   1120

Table 2.8 Climatic conditions for use in a fixed post unprotected from bad weather

Environmental agent                    Unit    4K1   4K2   4K3   4K4   4K5   4K6
Minimum air temperature                °C      -20   -33   -50   -65   -20   -65
Maximum air temperature                °C      +35   +40   +40   +55   +55   +35
Low relative humidity                  %       20    15    15    40    40    20
High relative humidity                 %       100   100   100   100   100   100
Variation rate of the air temperature  °C/min  0.5   0.5   0.5   0.5   0.5   0.5
Solar irradiation                      W/m²    1120  1120  1120  1120  1120  1120

The climatic stresses [2.52][2.70] are expressed especially through effects
conditioned by the temperature (humidity, pressure, solar irradiation, etc.). In Fig.
2.10 the failure rates (at 40°C and 70°C) of some component families are compared. If,
in addition, the typical distribution of failed components during the operation time
is considered (namely 40% of the ICs, 40% of the active discrete components and
20% of the passive components), it can be said - for the newest electronic systems
(where the monolithic ICs and the hybrid circuits already represent a great part) -
that the failure rate doubles for each 10°C. This explains why it is necessary to
reduce the heating of the packages to a minimum.
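The "doubling each 10°C" rule of thumb corresponds roughly to an Arrhenius temperature dependence with an activation energy around 0.6 eV, as the following Python sketch shows (the activation energy value is illustrative):

```python
import math

K_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(Ea, T1_c, T2_c):
    """Acceleration factor of the failure rate between two ambient
    temperatures (in degC), assuming an Arrhenius temperature dependence."""
    T1, T2 = T1_c + 273.15, T2_c + 273.15
    return math.exp((Ea / K_EV) * (1.0 / T1 - 1.0 / T2))

# with Ea = 0.6 eV, the 40 degC -> 70 degC acceleration is close to the
# factor 8 predicted by "x2 per 10 degC"
af_40_70 = arrhenius_af(0.6, 40.0, 70.0)
```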

[Figure: ratio λ(70°C)/λ(40°C), on a scale from 0 to 10, for eight component
families: 1 - integrated circuits, 2 - capacitors, 3 - hybrid circuits,
4 - transistors, 5 - connectors, 6 - resistors, 7 - relays, 8 - coils]
Fig. 2.10 Failure rate ratios of different component families at environment temperatures of
+40°C and +70°C [2.70]

Concerning the effect of humidity, it must be observed that when the dew
point is reached, a water deposit is formed which produces surface corrosion. The
more ionised particles the condensed water contains (producing modifications of the
insulation resistance, capacitances and wafer dimensions, and water diffusion leading
to the growth of the failure rate of the components encapsulated in plastics),
the more important the corrosion is.
The air pressure influences the ventilation (heat evacuation) and the air exchange
(sensitivity to too rapid variations).
The solar radiation influences the material composition (through photochemical
processes) and thus also leads to a supplementary heating of the environmental air
(dilatation, mechanical effects, etc.).

2.3.1.1
Synergy of environmental factors

The environment is in fact a combination of environmental factors. Experimentally,
it was found that the combined effect of these factors is higher than the sum of the
individual effects, because a synergy of the stress factors occurs. It follows that one
must study not only the individual effects, but also the synergy of these factors. There are
stress factors that are strongly interdependent, such as solar radiation and temperature, or
functioning and temperature. But independent factors may also be outlined, such as
acceleration and humidity.
An analysis of the possible synergies must cover the main phases of the
component life: storage, transport and functioning.
Storage and transport. The stress factors arising at storage and transport are
those factors acting between the component manufacturing and its mounting in an
electronic system. In principle, the storage and transport period is short enough and
does not influence the component reliability. On the other hand, stocks of
components are made for special military and industrial purposes (weapons,
nuclear plants, etc.). For these applications, the storage period becomes important

for the reliability. The involved stresses are carefully analysed. As an example, for
the weapons stored by the U.S. Army at Anniston, the temperature varies daily by
at most 2°C and the humidity is 70% [2.82]. There are also storage areas in
tropical or arctic zones. For systems exposed to solar radiation, temperatures higher
than 75°C were measured, with temperature variations exceeding 50°C. For checking
the component reliability in these situations, studies of the behavior at
temperature cycling were performed (see Section 2.3.1.2). Other stress factors, such
as rain, fog, snow, fungus or bacteria, may act and must be investigated. At
transport, the same stress factors (temperature cycling, humidity) or specific ones
(mechanical shocks, etc.) may arise.
For all these factors, studies of the involved synergies were performed. An
example is given in [2.83], where the behavior at temperature and vibrations of
electronic equipment protecting the airplane against surface-to-air missiles is
investigated. Operational data were collected with a complex system (elaborated
by specialists from Westinghouse). This system contains 64 temperature sensors
(AD 590 MF, from Analog Devices) and 24 vibration sensors (PCB Piezotronics
303 A02 quartz accelerometers), mounted on two ALQ-131 systems used on the
F-15 fighter plane. The tests were performed between December 1989 and August
1990. The data were processed and laboratory tests were designed, based on the
obtained information, for the components with abnormal behavior. Eventually,
corrective actions were taken for improving the component reliability. The result
was that during the Gulf War (January 1991), the ALQ-131 equipment had a higher
reliability than previously.
Functioning. The essential difference between the storage and transport
environment and the functioning one is the presence of the bias. At first sight, it seems
that the only effect of the electrical factor is an increase of the chip temperature,
following the relation:

Tj = Ta + rth j-a · Pd    (2.5)

where Tj is the junction temperature, Ta - the ambient temperature, rth j-a - the
thermal resistance junction-ambient and Pd - the dissipated power. If the effect of
the electrical factor means only a temperature increase, then its effect must be the
same as an increase of the ambient temperature. Experimentally, it has been shown
that this hypothesis is not valid. The electrical factor has a thermal effect, but also a
specific electrical effect due to the electrical field or electrical current.
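Relation (2.5) is straightforward to apply; a minimal Python sketch, with illustrative values for a small-signal device:

```python
def junction_temperature(Ta, rth_ja, Pd):
    """Relation (2.5): Tj = Ta + rth_ja * Pd, with Ta in degC,
    rth_ja in K/W and Pd in W."""
    return Ta + rth_ja * Pd

# e.g. 25 degC ambient, 120 K/W junction-to-ambient, 0.5 W dissipated
tj = junction_temperature(25.0, 120.0, 0.5)  # -> 85 degC junction temperature
```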
Often, the components have to work in an intermittent regime. In these cases, the
phenomenon limiting the lifetime is the thermal fatigue of the metal contact,
produced by the synergy between the thermal factor (thermal effect of functioning)
and the mechanical factor, modeled with a Coffin-Manson-type relation [2.84]:

N = C (Δεp)^(-m)    (2.6)

where N is the number of functioning cycles, C and m are material constants and
Δεp is the thermo-mechanical strain, given by:

Δεp = L Δα ΔT / x    (2.7)

where L is the minimum dimension of the contact, Δα is the average of the
dilatation coefficients of the two interfaces, ΔT is the temperature variation and x is
the width of the contact. Experiments about intermittent functioning of rectifier

bridges (2000 cycles of 20 minutes each in on- and off-state, respectively)
emphasise [2.85] the main failure mechanisms: i) degradation of the contact
between silicon and electrodes and ii) contact interruptions. Therefore, the
intermittent functioning induces different failure mechanisms than the continuous
functioning.
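Assuming the Coffin-Manson-type form of relation (2.6) and the strain relation (2.7) as reconstructed above, the influence of the temperature swing on the number of cycles to failure can be sketched as follows (all numerical constants are hypothetical):

```python
def thermo_mechanical_strain(L, delta_alpha, delta_T, x):
    """Relation (2.7) as reconstructed: strain range of the contact from its
    minimum dimension L, mismatch of the dilatation coefficients,
    temperature swing delta_T and contact width x."""
    return L * delta_alpha * delta_T / x

def cycles_to_failure(strain, C=0.5, m=2.0):
    """Coffin-Manson-type form assumed for relation (2.6):
    N = C * strain**(-m); C and m are material constants
    (hypothetical values here)."""
    return C * strain ** (-m)

# a larger temperature swing gives a larger strain and fewer cycles
s_100K = thermo_mechanical_strain(1e-3, 1e-5, 100.0, 1e-4)
s_150K = thermo_mechanical_strain(1e-3, 1e-5, 150.0, 1e-4)
```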

2.3.1.2
Temperature cycling

The behavior of components at temperature cycling offers important information
both for manufacturers and for users. An experimental study [2.86] of this behavior
was performed for components encapsulated in various packages (see Table 2.9).

Table 2.9 Experiments on temperature cycling

Tmin (°C)  Tmax (°C)  Components encapsulated in the following cases:
                      (TO-39, TO-72, TO-18 plastic, TO-18 metal)
-55        +175       X   X
-40        +150       X   X   X
-40        +125       X
-40        +100       X
-25        +100       X
-25        +85        X
-10        +85        X

[Figure: Nm (log scale, 10 to 10^5) vs. temperature range ΔT (log scale),
with curves for TO-39, TO-18 metal, TO-72 and TO-18 plastic]

Fig. 2.11 The median number of temperature cycles producing the failure of 50% of a component
batch (Nm) vs. temperature range (ΔT)

The components were measured initially and after 50, 100, 200, 400, 500 and 1000
cycles. The failed components were carefully analysed and the populations affected
by each failure mechanism were established. For the component encapsulated in the
TO-39 case (bipolar RF transistor), a degradation of the chip solder was observed,
produced by the different dilatation coefficients of silicon and header, respectively.

This is a typical failure mechanism accelerated by temperature cycling. For the
component encapsulated in TO-72 (field effect transistor), an increase of the leakage
current appeared, produced by alkali ions from the Si/SiO2 interface. This
phenomenon is accelerated by the high temperature and is not specific for
temperature cycling. The same component, a phototransistor, was encapsulated in
TO-18 plastic and TO-18 metal, having as the main failure mechanism the increase
of the leakage currents due to the deterioration of the contact between the die and the
header. The results are shown in Fig. 2.11. As one can see, for the component
encapsulated in TO-72 the reliability does not depend on the number of
temperature cycles, because the failure mechanism is not specific for this test.
The failure distributions were found to be lognormal, having Nm and σ as
parameters. The relationship between the median number of temperature cycles
producing the failure of 50% of a component batch (Nm) and the temperature range
for temperature cycling (ΔT) is:

Nm = Φ exp(-a ΔT)    (2.8)

where Φ and a are constants and ΔT = Tmax - Tmin.
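The constants Φ and a of relation (2.8) can be fitted from two experimental points (ΔT, Nm) and then used to extrapolate to other temperature ranges; a Python sketch with illustrative numbers:

```python
import math

def fit_nm_model(dT1, Nm1, dT2, Nm2):
    """Fit phi and a of relation (2.8), Nm = phi * exp(-a * dT), from two
    (temperature range, median cycle number) observations."""
    a = math.log(Nm1 / Nm2) / (dT2 - dT1)
    phi = Nm1 * math.exp(a * dT1)
    return phi, a

def nm_predict(phi, a, dT):
    """Median number of cycles to failure predicted by relation (2.8)."""
    return phi * math.exp(-a * dT)

# illustrative points: Nm = 10^4 cycles at dT = 100 K, 10^3 at dT = 200 K
phi, a = fit_nm_model(100.0, 1.0e4, 200.0, 1.0e3)
```

An intermediate range such as ΔT = 150 K then yields a prediction between the two measured values.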
In an attempt to model the synergic action of temperature cycling and vibrations,
the following relation between the median number of temperature cycles till
failure (N) and the mechanical stress range (σr) has been found [2.87]:

N = c σr^(-m)    (2.9)

where c and m are material constants that may be calculated from experimental
data. But the relation (2.9) is valid only for a high number of temperature cycles.
For less than 10^4 cycles, it was found that another failure mechanism arises, as one
can see from Fig. 2.12. The passing from one mechanism to another is possible by
modifying the range of the mechanical stress.

[Figure: range of the mechanical stress (accelerated and normal levels) vs.
median number of cycles till failure]

Fig. 2.12 Failure mechanisms at temperature + vibrations. Appearance of the second failure
mechanism after 10^4 temperature cycles

2.3.1.3
Behavior in a radiation field

The most harmful environment for the semiconductor components is the nuclear one.
In Table 2.10, the sensitivity in a radiation field is shown for components
manufactured by various technology types.
Various failure mechanisms were investigated. The fast neutrons produce
current gain degradation and an increase of the saturation voltage for bipolar transistors,
by creating defects in the crystalline structure. The ionisation radiation generates
photocurrents in all reverse-biased PN junctions, producing modifications of the
logic states [2.88].
In 1992, a team of researchers from Hitachi elaborated two models for the
evaluation of the threshold drift and of the leakage current increase for CMOS devices
irradiated by γ rays (Co60). They stated that the defects are produced by the trapping
of the hole charge in the MOSFET gate and by the increase of the state density at the
Si/SiO2 interface.
For the threshold drift (ΔVTO), a linear model was proposed, described by:

ΔVTO(t) = -TC + A log t + IS    (2.10)

where TC is the threshold drift generated by the charge trapped in the oxide per
unitary dose, A is a coefficient linked to this phenomenon and IS is the threshold
drift generated by the charge of the interface states. So, a synergy of two failure
mechanisms is modeled.
The increase of the leakage current (IL) was modeled with the formula:

IL = K1 exp(-A1 t) + K2 exp(-A2 t)    (2.11)

where K1 and K2 are the leakage currents generated by the unitary dose and A1 and
A2 are constants.

Table 2.10 Comparison of the sensitivity in a radiation field, for components manufactured by
various technology types

Technology types               Neutrons         Ionisation radiation
                               (n/cm²)          Total dose        Transitory dose for
                                                (rads Si)         surviving (rads Si/s)
Bipolar transistors and JFET   10^10...10^12    10^4              10^10
Thyristors                     10^10...10^12    10^4              10^10
TTL IC                         10^14            10^6              >10^10
TTL low-power Schottky         10^14            10^6              >10^10
Linear IC                      10^13            5x10^4...10^5     >10^10
CMOS IC                        10^15            10^3...10^4       10^9
NMOS IC                        10^15            10^3              10^10
LED                            10^13            >10^5             >10^10
Isoplanar ECL                  >10^15           10^7              10^11

It should be noted that both models take into account the synergy between the irradiation
and the thermal factor, because the coefficients depend on temperature following an
Arrhenius model. For instance, for the coefficient A from (2.10):

A = A0 exp(-Ea/kT).    (2.12)

So, the superposition of temperature and ionisation radiation is accomplished.
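The two Hitachi models, relations (2.10) and (2.11), can be evaluated as follows (a Python sketch; the base-10 logarithm assumed in (2.10) and all coefficient values are illustrative assumptions, not values from the cited study):

```python
import math

def threshold_drift(t, TC, A, IS):
    """Relation (2.10): threshold voltage drift of an irradiated CMOS
    device; TC, A, IS as defined in the text (illustrative values only)."""
    return -TC + A * math.log10(t) + IS

def leakage_current(t, K1, A1, K2, A2):
    """Relation (2.11): leakage current as the sum of two decaying
    exponential components."""
    return K1 * math.exp(-A1 * t) + K2 * math.exp(-A2 * t)
```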

2.3.2
Life testing with noncontinuous inspection

The reliability tests are performed on samples withdrawn from a batch of compo-
nents. If the components are measured at foreseen inspection moments, when the
life tests are stopped, this is the method of noncontinuous inspection. On the con-
trary, if the components are measured permanently, during life testing, the method
is called continuous inspection. In most cases, the noncontinuous inspection, a
much cheaper method, is used. With this method, the failure moment is not accu-
rately known, being assimilated with the subsequent measuring moment. Further
on, a method for increasing the accuracy of the noncontinuous inspection will be
presented.
If n items were withdrawn from a batch of N components, (ak, bk), k = 1,2,...,i are
the time periods between two successive inspections, i is the total number of
inspections and m1, m2, ..., mi are the failures in each time period (ak, bk), then:

m1 + m2 + ... + mi = n
ak+1 = bk, k = 1,2,...,(i-1).    (2.13)


The exact failure moment is not known, but it can be restored by an iterative
procedure [2.89]. An example for a Weibull distribution will be presented [2.90].
The failure moments for a Weibull distribution with parameters β and θ are
given by:

tj = θ [-ln(1 - j/(N+1))]^(1/β), j = 1,2,...,N.    (2.14)

For the studied case, one may write:

tj = θ [-ln(1 - Cj/(N+1))]^(1/β), j = 1,2,...,N    (2.15)

where:
(2.16)
and:

Ak = (N+1) exp[-(ak/θ)^β]    (2.17)

Bk = (N+1) exp[-(bk/θ)^β]    (2.18)


The procedure contains the following steps.
1. The input data are:
- the inspection moments: ak, bk
- the failures occurred in each time period between two inspections: mk
- the sample volume: n
2. In a zero approximation, one assumes that all items fail in the middle of the
time period (ak, bk):

tj(0) = (ak + bk)/2    (2.19)

where:
k = 1, for 1 ≤ j ≤ m1,
k = 2, for m1+1 ≤ j ≤ m1+m2, etc.
3. The Weibull distribution parameters in the zero approximation (β(0) and θ(0)) are
calculated by using the Maximum Likelihood Estimation (MLE), with the iterative
Newton-Raphson method [2.93]. From the equation:

Σ(j=1..n) tj^β ln tj / Σ(j=1..n) tj^β - 1/β = (1/n) Σ(j=1..n) ln tj    (2.20)

one may calculate β(0). Then, θ(0) is obtained from:

θ(0) = [(1/n) Σ(j=1..n) tj^β]^(1/β).    (2.21)

4. Further on, the failure moments in the first approximation are calculated with the
relations (2.13)...(2.17).
5. The Weibull distribution parameters in the first approximation (β(1) and θ(1)) are
calculated.
6. The iterative process continues until |β(r) - β(r-1)| < ε, where r is the order number
of the approximation and ε is the foreseen accuracy.
7. For the values β(r) and θ(r) the failure rate is calculated with the formula:

λ(t) = (β(r)/θ(r)) (t/θ(r))^(β(r)-1).    (2.22)
The method (called NIMLE = Noncontinuous Inspection with Maximum Likelihood
Estimation) will be used for an example used for the first time by Menon [2.91]: a
sample of 20 items, withdrawn from a Weibull population of 1000 items, with β =
0.5 and θ = 2.7183. The failure moments for these 20 items were:
0.001 0.030 0.071 0.185 0.345 0.435 0.469 0.470 0.505 0.664
0.806 0.970 1.033 1.550 1.550 2.046 3.532 7.057 9.098 57.628
A noncontinuous inspection was simulated and the results are presented in Table 2.11.
The results obtained with the NIMLE method were compared with the results obtained
by Menon [2.91] and Cohen [2.92] (which is the MLE method), with the moment
method and with the graphical method (Table 2.12). From Table 2.12 one may note
that the NIMLE method, although it starts from incomplete data, allows one to obtain
surprisingly accurate results (especially referring to the β value). By using this method,
the handicap of the noncontinuous inspection (compared with the continuous
inspection) may be almost surpassed.

Table 2.11 Simulated noncontinuous inspection for Menon data

Time period     0-0.1    0.1-0.2  0.2-0.4  0.4-0.5  0.5-0.7   0.7-1.0  1.0-1.3
Failed devices  3        1        1        3        2         2        1
Time period     1.3-1.8  1.8-2.5  2.5-4.0  4.0-8.0  8.0-10.0  10-100
Failed devices  2        1        1        1        1         1

Table 2.12 Comparison of estimated values obtained by various methods

Param.  Real   Continuous inspection                            Noncontinuous
        value  Menon   Cohen (MLE)  Moment   Graphical          inspection
               [2.91]  [2.92]       method   method             (NIMLE)
β       0.50   0.57    0.51         0.43     0.63               0.52
θ       2.72   1.80    1.84         1.62     1.55               1.79
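The MLE step of the procedure, relations (2.20)-(2.22), can be sketched in Python; a simple bisection is used here instead of Newton-Raphson, as a robustness-oriented simplification:

```python
import math

def weibull_mle(times, lo=0.01, hi=20.0, tol=1e-9):
    """Solve the MLE equation (2.20) for beta by bisection, then obtain
    theta from (2.21); 'times' are (possibly restored) failure moments."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / n

    def g(beta):
        # left side of (2.20) minus its right side
        s1 = sum(t ** beta * lt for t, lt in zip(times, logs))
        s2 = sum(t ** beta for t in times)
        return s1 / s2 - 1.0 / beta - mean_log

    a, b = lo, hi
    for _ in range(200):
        mid = 0.5 * (a + b)
        if g(a) * g(mid) <= 0.0:
            b = mid
        else:
            a = mid
        if b - a < tol:
            break
    beta = 0.5 * (a + b)
    theta = (sum(t ** beta for t in times) / n) ** (1.0 / beta)
    return beta, theta

def weibull_hazard(t, beta, theta):
    """Failure rate, relation (2.22)."""
    return (beta / theta) * (t / theta) ** (beta - 1.0)
```

Applied to a sample generated with relation (2.14) from a known population, the estimator recovers the population parameters closely.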

2.3.3
Accelerated testing

The first accelerated tests were made in the early 60's and tried to shorten the time
period necessary to obtain significant results from life tests. The failure mecha-
nisms must be investigated with great care, because it is essential that the failure
mechanism acting at the higher stress level be the same as that acting at the normal
stress level. Accelerated life tests (ALT) with bias and temperature as stress factors
have been developed since the early 60's [2.57]. Data from ALT are processed with
the aid of life & stress models: Weibull or lognormal for the life models and Ar-
rhenius or inverse-power for the stress models [2.71][2.58]. Constant-stress tests,
with bias and temperature as stress factors, are used for quantitative determinations
of the failure rate. The activation energy (calculated from at least three constant-
stress tests performed at three different electrical and environmental conditions) is
the key parameter, allowing the extrapolation of ALT results to normal operational condi-
tions.
A step-stress test used for a qualitative determination of the failure rate was
proposed by Bazu [2.4], based on previous works [2.39][2.62]. This test was called
reliability fingerprint and allows obtaining a measure of the lot reliability. The
fingerprint of the lot reliability may be obtained as the response of the device to a
step-stress test: a single sample of 30-50 items, undergoing 4-10 hours at each
stress level, progressively increased, the number of failed devices at each stress
level giving the fingerprint. As one can see from Fig. 2.13, by comparing this
fingerprint with a reference fingerprint, obtained for a reference lot with the
reliability level determined from constant-stress tests (3 samples of 30-50 items,
each sample at a given stress level, 1000 hours), the failure rate level of the lot can
be estimated. The key of this method is to have a robust procedure for comparing
the fingerprint of the current batch with that of the reference batch. Currently, this
analysis is performed qualitatively by a human expert. But fuzzy logic allows

developing a better comparison method [2.77]. The range of the ten steps is
fuzzified by a triangle-shaped membership function with five regions, as shown in

[Figure: two bar charts of the failure percentage (0-8%) at each of the
steps 1-10: a) for the current batch, b) for the reference batch]

Fig. 2.13 Comparison between: a the reliability fingerprint (RF) for a current batch and b the
fingerprint of the reference batch (RFref)

Table 2.13 Rapid estimation of the reliability level for the current batch presented in Fig. 2.13
(fuzzy comparison method with 5 regions)

Reliability level  Reference batch  Current batch  Dominant
                   %      μ         %      μ       Reference batch  Current batch
very low           2      0.1       0      0
low                6      0.3       0      0
medium             10     0.5       6      0.3     medium
high               2      0.1       10     0.5                      high
very high          0      0         4      0.2

Fig. 2.9 and described by formula (2.3): very low, low, medium, high, very high,
referring to the reliability level of the batch. The method has been used for the two
RFs from Fig. 2.13. They are compared in order to evaluate the reliability level
of the current batch. The results are presented in Table 2.13. As a conclusion, the
current batch has a higher reliability level than the reference batch.
Eventually, a System for Rapid Estimation of the Reliability (SRER) was
created, based on constant- and step-stress tests [2.7].
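The fuzzy comparison idea can be sketched as follows; the triangular membership functions below (centers, width, labels) are illustrative choices, not the ones of [2.77]:

```python
def triangular_membership(x, centers=(1.0, 3.0, 5.0, 7.0, 9.0), width=2.0):
    """Degrees of membership of a failure percentage x to five triangular
    regions (very low ... very high); centers and width are illustrative."""
    return [max(0.0, 1.0 - abs(x - c) / width) for c in centers]

def dominant_region(x, labels=("very low", "low", "medium",
                               "high", "very high")):
    """Label of the region with the highest membership degree for x."""
    degrees = triangular_membership(x)
    return labels[degrees.index(max(degrees))]
```

For example, a 3% failure percentage at a given step is dominated by the "low" region, a 9.5% one by "very high".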
Lately, new tendencies have been observed in ALT: to increase the accelerated stress
level up to the highest possible value, counterbalanced by the idea of Barton [2.3],
who proposed a method for optimising ALT by minimising the maximum test-
stress. Another tendency is to perform accelerated tests early in the manufacturing
process (even at the wafer level).
It seems that the Arrhenius model must be corrected: the temperature is no
longer sufficient as an acceleration stress. The electric field must also be taken into
account. In an experiment performed on many samples withdrawn from the same
bipolar transistor batch, life tests at the same junction temperature but at different
electric fields were made [2.94]. Instead of obtaining the same reliability level,
as predicted by the Arrhenius model, an electric field dependence of the median time
for each sample (describing the lognormal distribution) was found:

tm = k U^(1/2)    (2.23)

where k is a constant and U is the applied voltage. The failure mechanism was the
formation of a diffusion channel and short-circuiting by spikes. These spikes
depend on the width of the collector-base space charge region, which is directly
proportional to U^(1/2). So, the formula (2.23) is verified by the physics of failure. A
generalised Arrhenius model, taking into account the electric field dependence, was
given by Bazu [2.96]:

tm = A U^(1/2) exp(Ea/kT).    (2.24)

This model proved to have a fair accuracy, being used for other types of
semiconductor devices.
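The generalised model (2.24) can be used to compute acceleration factors between two bias/temperature conditions; a Python sketch (the parameter values below are illustrative):

```python
import math

K_EV = 8.617e-5  # Boltzmann constant in eV/K

def tm_generalized(A, U, T, Ea):
    """Relation (2.24): tm = A * U**0.5 * exp(Ea/(k*T)); U in volts,
    T in kelvin, Ea in eV."""
    return A * math.sqrt(U) * math.exp(Ea / (K_EV * T))

def acceleration_factor(U_use, T_use, U_test, T_test, Ea):
    """Ratio of the median life at use conditions to the median life at
    accelerated test conditions (the constant A cancels out)."""
    return (tm_generalized(1.0, U_use, T_use, Ea)
            / tm_generalized(1.0, U_test, T_test, Ea))
```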

2.3.3.1
Activation energy depends on the stress level

In recent years, experiments have shown that a dependence of the standard
deviation on the stress level exists. The current procedure for processing statistical
data from a lognormal distribution (as is the case for most electronic
components) is based on the assumption that the standard deviation σ has the same
value at any stress level [2.97]. In fact, it has been proved that the standard
deviation (and, as a consequence, the activation energy) is dependent on the stress
level.
An example is the experiment on the effect of temperature and humidity on the
reliability of metallic pads [2.98]. Three test structures (metallic pads having a width
of 8 μm and an inter-pad distance of 8 μm, unprotected, to be more sensitive to the
effect of the environment) underwent tests at 70...130°C and 10...70% relative
humidity. The leakage currents induced by humidity in the presence of the
temperature were measured. The phenomenon leading to the increase of the leakage

current is the absorption of the water molecules on the structure surface. Three
models tried to explain the dependence of the surface conductivity (a parameter
linked to the structure reliability, because a high surface conductivity produces
failure by short-circuiting) on the temperature and humidity:

ln γ = ln A + B (1/T) + C ln H    (2.25)

ln γ = ln A + B (1/T) + C H    (2.26)

ln γ = ln A + B (1/T) + C H + D H (1/T)    (2.27)

where γ is the surface conductivity, T is the temperature (K), H is the relative
humidity and A, B, C and D are constants. In the model (2.25), proposed by Peck
and Zierdt, the dependence on relative humidity is expressed by a power law. In the
model (2.26), given by Weick, the dependence on relative humidity is an
exponential one. In the model (2.27), proposed by Sbar and Kozakiewicz, the same
dependence on temperature and relative humidity as for the model (2.26) is
considered, but there is a term linking temperature and relative humidity, proving
their synergy. For all the materials investigated (various metal layers), the model (2.27)
seemed to be the most accurate. But it has been observed that this model implies a
dependence of the activation energy on the level of humidity. So, this is a support
for the new theory of the stress dependence of the activation energy.
But there are other experiments supporting this theory. Schwartz [2.99] has
shown that for the electromigration of aluminium pads, the activation energy
depends on the temperature following a Gaussian law. A similar result was obtained
by Chan [2.20], for the standard deviation.
An experiment tried to evaluate the error committed by assuming a standard
deviation constant with the stress level, instead of the real case of a standard
deviation dependent on the stress level [2.95]. For a bipolar silicon
transistor, three samples of 50 items were withdrawn from the same batch and
introduced in accelerated life tests at maximum power and at the ambient
temperatures 80°C, 125°C and 150°C. As the stress dependence of the standard
deviation, the formula (2.28) was chosen:

σ = a + b/T + c/T²    (2.28)

The data processed under the hypothesis that the standard deviation is the same at
each stress level lead to a failure rate of 5 x 10^-6 h^-1, after 5000 hours at maximum
power and 25°C ambient temperature. If the standard deviation varies with the
stress level, a failure rate value of 7 x 10^-7 h^-1 (in the same conditions) is obtained.
If, for instance, the target value for the failure rate is 10^-6 h^-1, the constant standard
deviation hypothesis may lead to the rejection of a good batch!
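The sensitivity of a lognormal failure rate estimate to the assumed σ can be illustrated with a hazard computation (a Python sketch; the median life and σ values below are illustrative, not the data of [2.95]):

```python
import math

def lognormal_hazard(t, t_median, sigma):
    """Instantaneous failure rate of a lognormal life distribution with
    median t_median and shape parameter sigma."""
    z = (math.log(t) - math.log(t_median)) / sigma
    pdf = math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))
    sf = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))  # survival function
    return pdf / sf

# same median life, two different sigma assumptions: the estimated
# failure rate at 5000 h differs by orders of magnitude
h_sigma1 = lognormal_hazard(5000.0, 1.0e6, 1.0)
h_sigma2 = lognormal_hazard(5000.0, 1.0e6, 2.0)
```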

2.3.4
Physics of failure

Until now, no main wearout mechanism is known for a mature technology of
manufacturing semiconductor components. The discussed defects are attributable to
structural anomalies of the component. The knowledge of these structural weak-
nesses or of some constructive or technological insufficiencies and of their causes is a
useful premise for the continuous improvement of the semiconductor component reli-
ability.
There are several problems that must be solved in the next years. First, to
identify the acceleration laws for different stress factors. Then, to take
into account, at the design of the accelerated tests, the synergies between the stress
factors encountered in the operational environment. In this respect, a step was the
model proposed by Bazu and Tăzlăuanu [2.6]. The model can be used for designing
ALT with three or more stress factors:

(2.29)

where tm0 is a constant, K - Boltzmann's constant, Ea - the activation energy, ai, bi -
coefficients, Si - stress factors (other than temperature and electrical bias) and F is
given by:

F = Ta + rth j-a Pd + Σ(i=1..n) ci Si^di    (2.30)

where Ta is the ambient temperature, rth j-a - the thermal resistance between junction
and ambient, Pd - the dissipated power, ci and di - coefficients. The coefficients ai,
bi, ci and di may be calculated from experimental data. For instance, if there are
three stress factors (temperature, bias, humidity), n = 1, and the relations (2.29) and
(2.30) become:

(2.31)

(2.32)

From this generalised model, previously developed models may be obtained (see
Table 2.14).

Table 2.14 Models obtained from the model described by the relations (2.31) and (2.32)

S1   a1   b1   c1   d1   Model
0    -    -    -    -    Arrhenius
S1   0    0    1    1    Hakim-Reich [2.100]
S1   1    2    0    -    Lawson [2.100]
S1   1    m    0    -    Peck [2.101]
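The effective temperature F of relation (2.30) can be computed for an arbitrary set of additional stress factors; a Python sketch with hypothetical coefficients:

```python
def effective_temperature(Ta, rth_ja, Pd, stresses):
    """Relation (2.30): F = Ta + rth_ja*Pd + sum(c_i * S_i**d_i).
    'stresses' is a list of (S_i, c_i, d_i) tuples; the coefficients
    c_i and d_i are assumptions for illustration."""
    return Ta + rth_ja * Pd + sum(c * S ** d for S, c, d in stresses)
```

For example, adding one humidity-like stress term simply raises F above the self-heated junction temperature, which then enters the Arrhenius-type factor of (2.29).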

The model can be used for calculating the failure rate at various environmental and
electrical stresses, but also to design accelerated tests. Such tests may be useful for
screening or for evaluating the reliability level of a batch of components. In Fig.
2.14, two examples are presented:
• Screening: the point S1 is for normal test conditions; the equivalent duration of a
test performed at a higher stress level (described by the point S2) is obtained: the
point S3.

• Reliability evaluation: from an accelerated test, performed at the stress level E1,
a lognormal failure rate distribution, described by the parameter tm (point E2),
was obtained; by using the model, at the normal test condition described by the
point E3, the value of tm (point E4) may be obtained.

[Figure: tm (a.u.) vs. F (a.u., 100...160), showing the points S1-S3
(screening) and E1-E4 (reliability evaluation)]
Fig. 2.14 Screening and reliability evaluation performed by using the model described by the
relations (2.29) and (2.30)

[Figure 2.15, flowchart:
Absolutely faultless components | Electrically faultless, but with
constructive & technological weaknesses | Electrically faultless, but with
structure imperfections | Electrically faultless, but with structure
imperfections caused by exceeding the technological parameters
-> Components with structure defects
-> Overcharge / ageing possibilities; possibility of a local overcharge or
faulty operation at a structure defect
-> Potentially unreliable components
-> Defect mechanisms
-> Structural causes of the defect | Electrical causes of the defect]
Fig. 2.15 Emergence possibilities of the semiconductors defects

It is important to note that the model described by the relations (2.29) and (2.30)
can be used for various stress factors, such as: temperature cycling, pressure and
mechanical stress.
The accelerated testing is now the main tool for the determination of the
reliability level. Recent progress was obtained in this field. Clark et al. [2.23]
presented an approach to design ALT experiments, using multiple stresses, applicable
to low-cost, high-volume production items. Another method, developed by Klyatis
[2.45], is based upon physical modeling to demonstrate the influence of mechanical
and environmental factors under operating conditions.
In Fig. 2.15 a possible classification of the semiconductor defects depending on
their origin can be seen.
Initially, at an allowed load, almost always a constructive fault or a fabrication
flaw occurs first. Consequently, an ageing mechanism, i.e. a latent structural
weakness of the component, may come out. This is why the component structure is
continuously modified under the influence of its load, until finally its electrical
parameters exceed the allowed limits. When performing the defect analysis, the causal
relations are gradually discovered, starting the examination from the causes of the
electrical fault.
Because of their high package density and reduced utilisation voltages,
semiconductor components are exposed to various influences that produce failures
in operation and, unfortunately, often lead to their destruction. One must
distinguish between internal and external influences.
The external influences operate through direct inductive, capacitive or chemical
effects, or act on the very sensitive structure during component operation, i.e.
during commutation, current flow or connection to the electrical ground. In this
respect, the actions of short-circuits, of connecting and disconnecting shocks, of
atmospheric conditions, and of inductive and capacitive influences are to be
mentioned. Other possible effects are produced by UV and X-ray irradiation (for
example on REPROMs), by radio-wave irradiation and, under special conditions
[2.65], by the electromagnetic pulse (EMP). Since the EMP effect is complex [2.13]
and its action modes are multiple, the protection measures against EMP are not
simple. Up to a certain point, they are similar to the protection measures against
lightning. Taking into account the differences in rise times, frequencies, field
intensities and energies, and the fact that the involved area can cover some
millions of km², the anti-EMP protection has more severe requirements and its
realisation is more expensive.
The internal influences can act inside an electronic constructive element or on a
semiconductor structure, through induced currents, short-circuits and, in any
case, through inductive and particularly capacitive couplings.

2.3.4.1
Drift, drift failures and drift behaviour

The drift is a change in time of the parameters, under conditions determined by
stress, a change that does not lead immediately to failure. The drift is caused by
physical or chemical transformations [2.69][2.33][2.2 (1978)(1981)(1984)] that
happen inside or on the surface of an item. In general, the drift failures emerge
at the end of

the useful lifetime and in the wearout period, when the parameters exceed the
allowed tolerance limits for the nominal values corresponding to the normal
operation of the component. As a consequence of this parameter drift, overloads
may emerge and some of them can lead to total failures. Drift failures can be
tracked through periodical electrical characterisations of the components, before
the general failure occurs (for example, the growth of the resistance of carbon
resistors, the decrease of the capacitance of electrolytic capacitors, the growth
of the residual current of semiconductor components, etc.).
The drift behaviour can be tracked through long-term investigations and often
shows a dependency on the loading value. The drift can be eliminated through
ageing, so that a more stable behaviour is obtained during the useful lifetime. If
the drift behaviour of the component is known, the useful lifetime can be
determined by selecting the load and the loading conditions correspondingly.
These indirect methods of determining reliability data need only a short time,
which is why the long trials can be dispensed with.
At the design and re-design phases, the drift behaviour must be considered. This
makes it possible to reduce the development time and the costs.
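The indirect determination of the useful lifetime from the drift behaviour can be sketched as a simple extrapolation: fit the parameter measurements taken at the periodic characterisations and compute when the tolerance limit would be crossed. The resistor values and the +5 % limit below are illustrative assumptions, and a straight-line drift law is assumed for simplicity.

```python
def time_to_limit(times_h, values, limit):
    """Estimate when a linearly drifting parameter crosses its tolerance limit.

    Least-squares straight-line fit through periodic measurements, then
    extrapolation to the allowed limit. Returns None if the parameter is
    not drifting toward the limit.
    """
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_v = sum(values) / n
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_h, values)) / \
            sum((t - mean_t) ** 2 for t in times_h)
    intercept = mean_v - slope * mean_t
    if slope == 0 or (limit - intercept) / slope < 0:
        return None
    return (limit - intercept) / slope

# Carbon resistor whose resistance grows by about 0.5 ohm per 1000 h (illustrative):
t = [0, 1000, 2000, 3000]
r = [100.0, 100.5, 101.0, 101.5]
print(time_to_limit(t, r, limit=105.0))  # 10000 h to reach the +5 % limit
```

In practice the drift law is rarely linear, so the fitted model would be chosen from the observed degradation behaviour of the component type.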
Failure mechanism (FM) identification is essential for reliability accelerated
testing, because the obtained degradation laws must be extrapolated beyond the
time period of the test, and the extrapolation must be made separately for each
population affected by a FM. The subject is still modern, taking into account that
new failure mechanisms are being discovered and even the old ones are not
completely explained. Consequently, a series of tutorial papers on FM has been
published since 1991 by IEEE Transactions on Reliability, in almost every issue.
Most of these papers were written by specialists from the CALCE Electronic
Packaging Research Centre, a research team led by Michael Pecht at the University
of Maryland (USA). The following FM were investigated: overstress and wearout,
including quantitative models [2.26], excessive elastic deformation [2.27],
irreversible plastic deformation [2.28], brittle fracture [2.28], ductile fracture
[2.29], buckling [2.30], mechanical wear [2.34], creep, cycling fatigue [2.31],
material ageing due to interdiffusion, electromigration [2.75], irradiated
polymeric material [2.1], popcorning of plastic encapsulated components [2.38].
A method for predicting the effect of particular defects on the failure rate of
metal interconnections due to electromigration was proposed by Kemp et al. [2.44].
The defect of interest is missing material, which reduces the effective cross-
section of the conductor at the point of the defect.
Chick and Mendel [2.21] proposed a method for incorporating previous information
on FM into a lifetime model. The key idea of this approach is that wear, stress
and strain are more directly linked to the failure than the component age is. A
lifetime model based on previous information about wear allows using lifetime
data for similar components used under various operating conditions.
Trends in microsystems integrating electronic, microelectromechanical, electro-
optical and micro-fluidic devices are bringing miniaturisation close to its
physical limits, creating in this way a need for extensive reliability physics
efforts to identify and counter failure mechanisms in new devices [2.24].

2.3.5
Prediction methods

We are today far from the situation of the late 1960s, when the poor intrinsic quality
and reliability of components dominated the failures in electronic equipment. In-
trinsic reliability is the reliability a system can achieve based on the types of de-
vices and manufacturing processes used. The models are oriented to describe fail-
ures as originated by the intrinsic characteristics of the components in relation to
their use and application. If intrinsic failures in the useful life period are really due
to inherent defects (a residual of defects that did not surface in early life), the track
record of the component vendor is more meaningful in this respect than the generic
handbook values of λ. In fact, it is quite difficult to separate the contribution due to
the component from that induced by the application. In the 1960s, two approaches
were prevalent: a) a constant hazard rate is assigned to the components; b) the
influence of the environmental and loading conditions can be modelled using sim-
ple formulas or correction factors. In other words, the systems are assumed to be
simple series systems in a reliability sense, and the hazard rate of the system is the
sum of the hazard rates of all contributing components. The operating and stress
conditions of the individual components were often only partially known. Assigning
a constant hazard rate to a component is a fallacy. When a component fails, it is
either because it has been subjected to a freak overload during its useful life
period, or because it has reached the long-term wearout phase. The majority of
real-life failures are caused by external events, being in fact extrinsic failures
(rather than caused by any inherent deficiency of a component). Actually, among
the available data a small number of non-intrinsic failures almost always exists,
which cannot be positively identified as non-intrinsic. All field failures are
caused by randomly occurring defects accelerated by operational stresses.
Component failures (even those considered to be of a random nature) are due to
manufacturing defects or misuse that show their effects during operation.
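The 1960s parts-count assumption described above - a simple series system of components with constant hazard rates, whose system hazard rate is the sum of the component rates - can be sketched as follows; the λ values are illustrative, not handbook data.

```python
import math

def series_system_hazard(component_lambdas):
    """Parts-count prediction: for a series system of components with
    constant hazard rates, the system hazard rate is the sum of the
    component hazard rates."""
    return sum(component_lambdas)

def reliability(lam_per_h, hours):
    """Survival probability under the constant-hazard (exponential) model."""
    return math.exp(-lam_per_h * hours)

# Three components with lambdas in failures per hour (illustrative values):
lams = [0.5e-6, 1.2e-6, 0.3e-6]
lam_sys = series_system_hazard(lams)   # 2.0e-6 per hour
print(reliability(lam_sys, 10_000))    # ~0.980 after 10 000 h
```

This is exactly the modelling style the text criticises: it ignores freak overloads, wearout and the application-induced (extrinsic) failures discussed above.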

[Figure: hazard rate Z(t) versus lifetime, with and without burn-in]
Fig. 2.16 Superposition of physics of failure intrinsic reliability models with field failure data,
in the useful period

Reliability estimations constitute an assessment instrument for the reliability
level. They serve to determine the real reliability state of an item versus the
performances initially predicted for it. The most important purposes of the
reliability estimations are:
• Determination of the prescribed performances.
• Evaluation of the possibilities to obtain the item performances (in comparison
with the competing solutions).
• Selection of the optimal solution.
• A calculation of the system price (in comparison with other elements of the item).
• Maintainability and availability estimations.
The reliability estimations can be divided into three groups:
• First draft estimations.
• Design and prototype estimations.
• Series product estimations.
Further on, these estimations can be classified according to the level at which
they are made:
• General estimations.
• Per-function estimations.
• Analytical estimations.
For a dependable prediction of field component reliability, we need information
about the composite reliability (the sum of intrinsic and extrinsic failures - see
Fig. 2.16). Failure rate prediction requires knowledge of event statistics as well
as of device robustness. The benefits of dependable predictions can be summarised
as follows:
• forecasting the field reliability of a system;
• comparing the reliability of similar designs and making trade-offs aimed at
enhancing reliability through derating or design changes;
• predicting spare parts provisioning;
• predicting warranty costs;
• predicting logistic support (repair and maintenance facilities);
• surveying the company's competitiveness.

2.3.5.1
Prediction methods based on failure physics

Now, in the concurrent engineering era, Reliability Prediction Procedures (RPP)
become valuable tools for designers and users of any product. The designer needs
RPP to avoid the lag in feedback occurring when the predictions are made by a
reliability team. The user, who in turn is a designer of a complex system, wants
to have correct information about the part reliability. All the usual RPP (the
best known being MIL-HDBK-217) have some common features diminishing the
prediction accuracy [2.16]: (i) a constant failure rate model is used; (ii) the
failure mechanisms (FM) are not analysed.

RPPs [2.36][2.17] based on a physics-of-failure approach were proposed (Fig. 2.16)
for intrinsic component reliability, but - as the models primarily focus on a
failure mechanism basis - problems in satisfying reliability practitioners may
arise in the long-term wearout phase. The intrinsic infant mortality problem is a
question of defect elimination in the production process. The ability to forecast
and to compare the reliability of similar designs and to make design trade-offs
has been widely discussed in the specialised literature
[2.34][2.73][2.17][2.54][2.56].
Lately, improved RPP - with failure rate models other than the exponential and
starting from the physics of failure - were considered desirable and, as Wong
[2.72] pointed out, a change in direction for reliability engineering was made.
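A failure rate model other than the exponential, as advocated above, means a time-dependent hazard rate. A minimal sketch of the lognormal hazard rate, with illustrative tm and sigma values (not data from the text):

```python
import math

def lognorm_hazard(t, tm, sigma):
    """Hazard rate z(t) of a lognormal life distribution.

    tm is the median life, sigma the shape parameter; z(t) = pdf / (1 - cdf).
    """
    x = math.log(t / tm) / sigma
    pdf = math.exp(-0.5 * x * x) / (t * sigma * math.sqrt(2.0 * math.pi))
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return pdf / (1.0 - cdf)

# Unlike the constant-lambda model, the lognormal hazard varies with time:
for t in (1e2, 1e3, 1e4):
    print(t, lognorm_hazard(t, tm=1.0e4, sigma=1.5))
```

With these parameters the hazard first rises and later falls, a shape no constant failure rate can reproduce.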

[Figure: hazard rate Z(t) versus time, with a useful life period (λ ≈ 0) followed
by a wearout period related to component failure mechanisms; infant mortality is
ignored]
Fig. 2.17 The physics of failure modelling approach

Bazu [2.9] proposed an improved methodology, called SYRP, for predicting the
failure rate of a lot, with the following characteristics: (i) a lognormal
distribution for each FM involved is used, and (ii) the interaction (synergy)
between the technological factors, depending on the manufacturing and control
techniques, is considered. Failure risk coefficients (FRC), assessed at each
manufacturing step, are fuzzy sets (triangular membership function) and are
corrected at the subsequent manufacturing steps by considering the synergy of the
manufacturing factors. From the final FRC, the parameters of the lognormal failure
distribution are calculated for each potential FM. A comparison of SYRP
predictions and accelerated life test results is shown in Table 2.15.

Table 2.15 SYRP prediction vs. accelerated life test (ALT) results [SYRP/ALT in each column]

Batch  Parameter  Infant     Crystallographic  Purple plague    Metallisation
                  mortality  defects                            corrosion
B1     tm         100/50     10^3/2·10^3       2·10^4/1.5·10^4  -
       σ          2/2.1      1.5/1.5           0.83/0.8         -
B2     tm         100/500    -                 -                3.1·10^5/1.5·10^5
       σ          2/2.5      -                 -                1/0.8

One may notice that, although the data obtained from ALT are experimental data,
the SYRP prediction, made before performing ALT, seems to be fair enough.
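Under the simplifying assumption that each FM acts independently on the whole population (competing risks), the per-mechanism lognormal distributions produced by a SYRP-like prediction can be combined into an overall failed fraction; the tm/sigma pairs below are illustrative, not the Table 2.15 data.

```python
import math

def lognorm_cdf(t, tm, sigma):
    """CDF of a lognormal failure distribution with median tm and shape sigma."""
    return 0.5 * (1.0 + math.erf(math.log(t / tm) / (sigma * math.sqrt(2.0))))

def fraction_failed(t, mechanisms):
    """Fraction of the lot failed at time t when several FM compete.

    Each mechanism is assumed to act independently, so the per-mechanism
    survival probabilities multiply.
    """
    surv = 1.0
    for tm, sigma in mechanisms:
        surv *= 1.0 - lognorm_cdf(t, tm, sigma)
    return 1.0 - surv

# Two mechanisms with SYRP-style (tm, sigma) parameters, illustrative values:
mechs = [(2.0e4, 1.5), (1.5e4, 0.8)]
print(fraction_failed(1.0e4, mechs))
```

If, as in SYRP, each FM affects only a subpopulation of the lot, the survival probabilities would instead be weighted by the subpopulation fractions.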
A comparison of RPP, containing also the work done by Talmor [2.67], is given
in Table 2.16.

Table 2.16 Comparison of reliability prediction procedures

Procedure      Characteristics                             Physics of  Simplicity
                                                           failure     in use
MIL-HDBK-217F  Average value; relative simplified factors  No          Yes
CNET           Based on established data bank;             No          Yes
               relative simplified factors
Thomson-CSF    Based on accelerated tests; activation      No          Yes
               factors strictly chosen; accelerated
               tests and field data
RAC            Accelerated tests and field data            No          Yes
Plessey        Stress-strength                             No          Moderate
SYRP           Lognormal distribution; synergy of          Yes         Moderate
               technological factors

2.3.5.2
Laboratory versus operational reliability

What link exists between the reliability measured in the laboratory and the
operational reliability of components? It is known that the values established by
the component manufacturer depend on the test conditions, and that the operational
values depend on the incoming inspection conditions of the components used in
electronic systems. In practice, the differences between the two results can reach
one or two orders of magnitude. Of course, only the operational reliability
includes all the stresses and can demonstrate a sufficient reliability of the
component. As long as exact operational reliability knowledge is not available,
the ratio between the reliability measured in the laboratory and the operational
reliability will remain a permanent subject of discussion between manufacturer and
user. The component defect/failure rate can be reduced in the following ways:

• Avoiding detrimental transport and storage conditions.
• Avoiding mechanical stresses when mounting the components.
• Using reduced soldering temperatures and times.
• Loading the component at reduced dissipation power.
• Obtaining reduced internal temperatures of the equipment.

2.4
Standardisation

2.4.1
Quality systems

The set of international standards on quality management, known generically as
ISO 9000 (and produced by the International Organization for Standardization -
ISO), is firmly established as the way to go. In January 1997, there were
world-wide over 120 000 companies certified as compliant with ISO 9000
requirements, including almost 11 000 U.S. sites. Among them, three huge
companies: Ford, Chrysler and General Motors. In December 1995, NASA issued a
policy statement directing the use of quality systems compliant with ISO 9000
[2.25]. The ISO 9000 standards require a third party to audit and register a
manufacturing site. These third parties are accredited by agencies established by
national governments. DoD allows its contractors to propose ISO 9000 standards
for their quality management system, but does not require third-party
certification. In 1996, ISO 14001, the international standard on environmental
management systems, and ISO 14004, on general environmental management
guidelines, were officially adopted by 40 voting countries. The ISO 14000 series
parallels the ISO 9000 series, including the requirement for third-party
certification.

2.4.2
Dependability

Dependability is the official title of the Technical Committee (TC 56) of the
International Electrotechnical Commission (IEC), which produces international
standards on reliability, maintainability and availability. A fruitful
co-operation [2.14] was initiated between ISO and IEC, at the level of the
committees ISO TC 176 (quality), IEC TC 56 (dependability) and ISO TC 69
(statistics). Consequently, the IEC 300 series on dependability management is
directly linked with the ISO 9000 family [2.55].

References

2.1 Al-Sheikhly, M.; Christou, A. (1994): How radiation affects polymeric material. IEEE
Trans. Reliability, vol. 43, no. 4, December, pp. 551-556
2.2 Băjenescu, T. (1996): Fiabilitatea componentelor electronice (The reliability of electronic
components). Editura Tehnică, Bucharest, Romania
Băjenescu, T. I. (1978): Microcircuits. Reliability, incoming inspection, screening and
optimal efficiency. International Conference on Reliability and Maintainability, Paris, June
19-23
Băjenescu, T. I. (1981): Problèmes de la fiabilité des composants électroniques actifs
actuels. Masson, Paris

Băjenescu, T. I. (1984): Zuverlässigkeitsprobleme bei den Halbleiterspeichern und
Mikroprozessoren. Elektroniker (Switzerland) no. 9, pp. 25-34; no. 10, pp. 49-57
2.3 Barton, R. R. (1991): Optimal accelerated life-time plans that minimise the maximum test-
stress. IEEE Trans. Reliability, vol. 40, no. 2, June, pp. 166-172
2.4 Bazu, M. et al.(1983): Step-stress tests for semiconductor components. Proceedings of the
Annual Conference for Semiconductors, Oct. 6-8, Timisul de Sus, Romania, pp. 119-122
2.5 Bazu, M. et al. (1985): SRER - a System for Rapid Estimation of the Reliability. 6th Symp.
on Reliability in Electronics (Relectronic), Aug. 26-30, Budapest, Hungary, pp. 267-271
2.6 Bazu, M.; Tăzlăuanu, M. (1991): Reliability testing of semiconductor devices in a humid
environment. Proceedings of the Annual Reliability and Maintainability Symp., January
29-31, Orlando, Florida (USA), pp.237-240
2.7 Bazu, M.; Bacivarof, I. (1991): A method of reliability evaluation of accelerated aged
electron components. Proceedings of the Conference on Probabilistic Safety Assessment
and Management (PSAM), February, 1991, Beverly Hills, California (USA), pp. 357-361
2.8 Bazu, M. (1994): A synergetic approach on the reliability assurance for semiconductor
components. Ph. D. thesis, Politehnica University of Bucharest, Romania
2.9 Bazu, M. (1995): A combined fuzzy logic & physics-of-failure approach to reliability
prediction. IEEE Trans. Reliability, vol. 44 , no. 2, June, pp. 237-242
2.10 Bazu, M. (1996): Fuzzy-logic based reliability prediction for the building-in reliability
approach. In: Zimmermann, J.-D. and Dascalu, D. (eds.) Real word applications of
intelligent technologies. Editura Academiei Romane, pp. 124-128, Bucharest, Romania
2.11 Bazu M.; Dragan, M. (1997): MOVES - a method for monitoring and verifying the
reliability screening. Proceedings of the International Semiconductor Conference CAS'97,
October 7-11, Sinaia, Romania, pp. 345-348
2.12 Bazu, M. (1998): The reliability of semiconductor devices: an overview. Proc. of the 6th Int.
Conf. on Optimization of Electrical and Electronic Equipments, Brasov, Romania, May
14-15, 1998, pp. 785-788
2.13 Bell Lab. (1975): EMP engineering and design principles. Bell Telephones (USA)
2.14 Benski, C.(1996): Dependability standards: an international perspective. Proceedings of the
Annual Reliability and Maintainability Symp., Jan. 22-25, Las Vegas, Nevada (USA), pp.
13-16
2.15 Birolini, A. (1994): Quality and reliability of technical systems. Springer, Berlin,
Heidelberg, New York
2.16 Bowles, J. (1992): A survey of reliability prediction procedures for microelectronic devices.
IEEE Trans. Reliability, vol. 41, March, pp. 2-10
2.17 Brombacher, A. C. (1992): Reliability by design: CAE techniques for electronic
components and systems. John Wiley & Sons, Inc., Chichester
2.18 Caroli, J. et aI. (1996): R & M in an era of acquisition reform. Proceedings of the Annual
Reliability and Maintainability Symp., Jan. 22-25, Las Vegas, Nevada, pp. 1-6
2.19 Caruso, H. (1996): An overview of environmental reliability testing. Proceedings of the
Annual Reliability and Maintainability Symp., Jan. 22-25, Las Vegas, Nevada (USA), pp.
102-109
2.20 Chan, C. K. (1991): Temperature dependent standard deviation of log (failure time)
distributions. IEEE Trans. Reliability, vol. 40, no. 2, June, pp. 157-162
2.21 Chick, S. E.; Mendel, M. B. (1996): An engineering basis for statistical lifetime models
with an application to tribology. IEEE Trans. Reliability, vol. 45, June, pp. 208-215
2.22 Christou (1994): Reliability of Gallium Arsenide monolithic microwave integrated circuits.
John Wiley & Sons, Inc., Design and Measurement in Electronic Engineering Series
2.23 Clark, J.A. et al. (1997): An approach to designing accelerated life-testing experiments.
Proceedings of the Annual Reliability and Maintainability Symp., Jan. 13-16, Philadelphia,
Pennsylvania (USA), pp. 242-248
2.24 Coppola, A. (1996): The status of the reliability engineering discipline. Reliability Society
Newsletter, vol. 42, no. 2, April, pp. 10-14
2.25 Coppola, A. (1997): The status of the reliability engineering technology. Reliability Society
News, vol. 43, no. 2, April, pp. 7-10

2.26 Dasgupta, A.; Pecht, M. (1991): Material failure mechanisms and damage models. IEEE
Trans. Reliability vol. 40, no. 5, Dec., pp. 531-536
2.27 Dasgupta, A.; Hu, J. M. (1992): Failure mechanical models for excessive elastic
deformation. IEEE Trans. Reliability vol. 41, no. I, March, pp. 149-154
2.28 Dasgupta, A.; Hu, J. M. (1992): Failure mechanical models for plastic deformation. IEEE
Trans. Reliability vol. 41, no. 2, June, pp. 168-174 and Dasgupta, A., Hu, J. M. (1992):
Failure mechanical models for brittle fracture. IEEE Trans. Reliability vol. 41, no. 3, June,
pp.328-335
2.29 Dasgupta, A.; Hu, J. M. (1992): Failure mechanical models for ductile fracture. IEEE
Trans. Reliability vol. 41, no. 4, Dec., pp.489-495
2.30 Dasgupta, A.; Haslach Jr., H. W. (1993): Mechanical design failure mechanism for
buckling. IEEE Trans. Reliability vol. 42, no. 1, March., pp.9-16
2.31 Dasgupta, A. (1993): Failure mechanism models for cyclic fatigue. IEEE Trans. Reliability
vol. 42, no. 4, Dec., pp. 548-555
2.32 Demko, E. (1996): Commercial-Off the Shelf (COTS): challenge to military equipment
reliability. Proceedings of the Annual Reliability and Maintainability Symposium, Jan. 22-
25, Las Vegas, Nevada (USA), pp. 7-12
2.33 Dull, H. (1976): Zuverlässigkeit und Driftverhalten von Widerständen. Radio Mentor no. 7,
pp. 73-79
2.34 Engel, P. (1993): Mechanical failure mechanism models for mechanical wear. IEEE Trans.
Reliability vol. 42, no. 2, June, pp. 262-267; Fiorescu, R. A. (1986): A New Approach to
Reliability Prediction is Needed. Quality and Reliability Eng. Int., vol. 2, pp. 101-106
2.35 Ermer, D. (1996): Proposed new DoD standards for product acceptance. Proceedings of the
Annual Reliability and Maintainability Symp., Jan. 22-25, Las Vegas, Nevada (USA), pp.
24-29
2.36 Frost, D. F.; Poole, K. F. (1989): RELIANT: A Reliability Analysis Tool for VLSI
Interconnects. IEEE Solid-State Circuits, vol. 24, pp. 458-462
2.37 Giilateanu, L. et al. (1996): Stress and strain in automotive diodes - a RVT, IR and XR
study. Proceedings of the International Semiconductor Conference CAS'96, Oct. 9-12,
Sinaia, pp. 361-364
2.38 Gallo, A. A.; Munamarty, R. (1995): Popcorning: A failure mechanism in plastic-
encapsulated microcircuits. IEEE Trans. Reliability vol. 44, no. 3, Sept., pp.362-367
2.39 Hakim, E. (1963): Step-stress as an indicator for manufacturing process change. Solid State
Design, March, pp. 115-117
2.40 Hansen, C. (1997): Effectiveness of yield-estimation and reliability-prediction based on
wafer test-chip measurements. Proceedings of the Annual Reliability and Maintainability
Symp., Jan. 13-16, Philadelphia, Pennsylvania (USA), pp. 142-148
2.41 Hoffman, D. (1997): An overview of concurrent engineering. Proceedings of the Annual
Reliability and Maintainability Symp., Jan. 13-16, Philadelphia, Pennsylvania (USA), pp.
1-6
2.42 Jensen, F.; Petersen, N. E. (1982): Burn-in - an engineering approach to design and analysis of
burn-in procedures. John Wiley & Sons, Inc., Chichester
2.43 Jensen, F. (1995): Electronic component reliability. John Wiley & Sons, Inc.
2.44 Kemp, K. G.et al. (1990): The effects of particular defects on the early failure of metal
interconnects. IEEE Trans. Reliability, vol. 39 , no. 1, April, pp. 26-29
2.45 Klyatis, L.M. (1997): One strategy of accelerated testing technique. Proceedings of the
Annual Reliability and Maintainability Symp., Jan. 13-16, Philadelphia, Pennsylvania
(USA), pp.249-253
2.46 Knight, C. R. (1991): Four decades of reliability progress. Proceedings of the Annual
Reliability and Maintainability Symp., Jan. 29-31, Orlando, Florida (USA), pp.156-160
2.47 Kross, E. J.; Sicuranza, M. A. (1996): Commercial-components initiative: ground benign
systems - plastic encapsulated microcircuits. IEEE Trans. Reliability vol. 45, no. 2, June,
pp. 180-183
2.48 Kuehn, R. E. (1991): Four decades of reliability experience. Proceedings of the Annual
Reliability and Maintainability Symp., Jan. 29-31, Orlando, Florida (USA), pp.76-81

2.49 Kuo, W.; Oh, H. (1995): Design for reliability. IEEE Trans. Reliability vol. 44, no. 2, June,
pp.170-171
2.50 Li, J.; Dasgupta, A. (1993): Failure mechanism models for creep. IEEE Trans. Reliability
vol. 42, no. 3, Sept., pp.339-353
2.51 Li, J.; Dasgupta, A. (1994): Failure mechanism models for material ageing due to
interdiffusion. IEEE Trans. Reliability vol. 43, no. 1, March, pp.2-10
2.52 Lukis, L. W. F. (1972): Reliability assessment - myths and misuse of statistics.
Microelectronics and Reliability vol. 11, no. 11, pp. 177-184
2.53 Nicholls, D. (1996): Selection of equipment to leverage commercial technology (SELECT).
Proceedings of the Annual Reliability and Maintainability Symp., Jan. 22-25, Las Vegas,
Nevada (USA), pp. 84-90
2.54 O'Connor, P. D. T. (1993): Quality and reliability: illusions and realities. Quality and
Reliability Engineering Internat., vol. 9, pp. 162-168
2.55 O'Leary, D. J. (1996): International standards: their new role in a global economy.
Proceedings of the Annual Reliability and Maintainability Symp., Jan. 22-25, Las Vegas,
Nevada (USA), pp. 17-23
2.56 Pecht (1994): Quality Conformance and Qualification of Micro Electronic Package and
Interconnects. John Wiley & Sons, Inc.
2.57 Peck, D. S. (1961): Semiconductor reliability predictions from life distribution data.
Semiconductor Reliability, Reinhold Publishers, pp. 51-63
2.58 Peck, D. S.; Zierdt Jr., C. H.(1974): The reliability of semiconductor devices in the Bell
System. Proceedings of the IEEE, vol. 62, no. 2, Feb., pp. 185-211
2.59 Robineau, J. et al. (1992): Reliability Approach in Automotive Electronics. ESREF '92, pp.
133-140
2.60 Roy, K.; Prasad, S. (1995): Logic synthesis for reliability: an early start to controlling
electromigration & hot-carrier effects. IEEE Trans. Reliability, vol. 44, no. 2, June, pp.
251-255
2.61 Rudra, B.; Jennings, D. (1994): Failure mechanism models for conductive-filament
formation. IEEE Trans. Reliability vol. 43, no. 3, Sept., pp. 354-360
2.62 Ryerson, I. (1978): Reliability testing and screening: a general review paper.
Microelectronics and Reliability, no. 3, pp. 112-118, London
2.63 Schneider, C. (1997): The GIQLP-Product integrity's link to acquisition reform.
Proceedings of the Annual Reliability and Maintainability Symp., Jan. 13-16, Philadelphia,
Pennsylvania (USA), pp. 26-28
2.64 Jensen, F.; Petersen, N. E. (1982): Burn-In. Wiley, New York
2.65 Taguchi, G. (1995): Quality engineering (Taguchi methods) for the development of
electronic-circuit technology. IEEE Trans. Reliability, vol. 44, no. 2, June, pp. 225-229
2.66 Silberhorn (1980): Äussere, einschränkende Einflüsse auf den Einsatz von VLSI-
Bausteinen. Bulletin SEV/VSE vol. 71, no. 2, pp. 54-56
2.67 Talmor, M.; Arueti, S. (1997): Reliability prediction: the turn-on point. Proceedings of the
Annual Reliability and Maintainability Symp., Jan. 13-16, Philadelphia, Pennsylvania
(USA), pp. 254-262
2.68 Tang, S.-M. (1996): New burn-in methodology based on IC attributes, family IC burn-in
data and failure mechanism analysis. Proceedings of the Annual Reliability and
Maintainability Symp., Jan. 22-25, Las Vegas, Nevada (USA), pp. 185-190
2.69 Tretter (1974): Zum Driftverhalten von Bauelementen und Geräten. Qualität und
Zuverlässigkeit (Germany), vol. 19, no. 4, pp. 93-79
2.70 Le Traon, J.-Y.; Treheux, M. (1977): L'environnement des matériels de télécommu-
nications. L'écho des recherches, Oct., pp. 12-21
2.71 Tseng, S.-T.; Hsu, C.-H. (1994): Comparison of type-I & type-II accelerated life tests for
selecting the most reliable product. IEEE Trans. Reliability, vol. 43, Sept., pp. 503-510
2.72 Wong, K. L. (1993): A change in direction for reliability engineering is long overdue. IEEE
Trans. Reliability, vol. 42, no. 2, June, pp. 261-266

2.73 Yang, K.; Xue, J. (1997): Reliability design based on dynamic factorial experimental
model. Proceedings of the Annual Reliability and Maintainability Symp., Jan. 13-16,
Philadelphia, Pennsylvania (USA), pp. 320-326
2.74 Yates, W.; Johnson, R. (1997): Total Quality Management in U. S. DoD electronics
acquisition. Proceedings of the Annual Reliability and Maintainability Symp., Jan. 13-16,
Philadelphia, Pennsylvania (USA), pp. 571-577
2.75 Young, D.; Christou, A. (1994): Failure mechanism models for electromigration. IEEE
Trans. Reliability vol. 43, no. 2, June, pp. 186-192
2.76 Lall, P. (1996): Temperature as an input to microelectronics-reliability models. IEEE
Trans. Reliability, vol. 45, no. 1, pp. 3-9
2.77 Bazu, M. (1999): Reliability assessment based on fuzzy logic. International Conf. on
Computational Intelligence for Modelling, Control and Automation, CIMCA'99, Vienna,
Austria, February 17-19
2.78 Bosch, G. (1979): Model for failure rate curves. Microelectronics and Reliability, vol. 19,
pp. 371-379
2.79 Hallberg, O. (1977): Failure-rate as a function of time due to log-normal distributions of
weak parts. Microelectronics and Reliability, vol. 17, pp. 155-161
2.80 Moltoft, J. (1980): The failure rate function estimated from parameter drift measurement.
Microelectronics and Reliability, vol. 20, pp. 787-791
2.81 Ash, M.; Gorton, H. (1989): A practical end-of-life model for semiconductor devices. IEEE
Trans. on Reliability, October, pp. 485-493
2.82 Livesay, B. R. (1978): The reliability of electronic devices in storage environment. Solid
State Technology, October, pp. 63-68
2.83 Calatayud, R.; Szymkowiak, E. (1992): Temperature and vibration results from captive-
store flight-tests provide a reliability improvement tool. Proceedings of the Annual
Reliability and Maintainability Symp., Las Vegas, Nevada, January 21-23, pp. 266-271
2.84 Popa, E. et al. (1986): Thermal fatigue - a limitation of the reliability of medium power
semiconductor devices. Proceedings of the Annual Conference for Semiconductors,
October, Sinaia (Romania), pp. 247-250
2.85 Udrea, M. et al. (1988): Intermittent functioning - an efficient method for evaluating the
reliability of soldering systems. Proceedings of the Annual Conference for Semiconductors,
October, Sinaia (Romania), pp. 219-222
2.86 Bazu, M. et al. (1989): Behaviour of semiconductor components at temperature cycling.
Revue Roumaine des Sciences Techniques, January-March, pp. 151-155
2.87 Hu, J. et al. (1992): Role of failure mechanism identification in accelerated testing.
Proceedings of the Annual Reliability and Maintainability Symp., Las Vegas, Nevada,
January 21-23, pp. 181-188
2.88 Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. VDE-Verlag
2.89 Bazu, M. (1982): A mathematical model for the reliability of semiconductor devices. Elec-
tronics and Automatics, no. 4, pp. 151-157
2.90 Bazu, M. (1982): Reliability prediction for a Weibull population of semiconductor compo-
nents: the non-continuous inspection case. International Conference on Reliability, Varna
(Bulgaria), May
2.91 Menon, M.V. (1963): Estimation of the shape and scale parameters of the Weibull
distribution. Technometrics, no. 2, pp. 175-181
2.92 Cohen, A. C. (1965): Maximum likelihood estimation in the Weibull distribution based on
complete and on censored samples. Teclmometrics, no. 4, pp. 579-585
2.93 Thoman, D. et al. (1969): Inference on the parameters of the Weibull distribution.
Technometrics, no. 3, pp. 445-453
2.94 Bazu, M. et a!. (1987): Failure mechanisms accelerated by thermal and electrical stress.
Proceedings of the Annual Conference for Semiconductors, October, Sinaia (Romania), pp.
53-56
2.95 Bazu, M. (1992): Accelerated life test when the activation energy is a random variable.
Proc. of the Int. Semicond. Conf. CAS '92, October 5-10, Sinaia, pp. 245-248
92 2 State of the art in reliability

2.96 Bazu, M. (1990): A model for the electric field dependence of semiconductor device
reliability. 18th Int.Conf. on Microelectronics (MIEL). Ljubljana, Slovenia, May
2.97 Peck, D. S.; Trapp, O. D. (1987): Accelerated Testing Handbook. Technology Associates,
Portola Valley, California
2.98 Weick, W. (1980): Acceleration factors for IC leakage current in a steam environment.
IEEE Trans. on Reliability, June, pp. 109-114
2.99 Schwartz, J. A. (1987): Temperature dependent standard deviation of log (failure time)
distributions. J. Appl. Phys., vol. 61, pp. 801-805
2.100 Reich, B.; Hakim, E. (1972): Environmental factors governing field reliability of plastic
transistors and integrated circuits. International Reliability Physics Symp., pp. 82-87
2.101 Peck, D. S. (1986): Comprehensive model for humidity testing corelation. International
Reliabilty Physics Symp., pp. 44-50
2.102 Klinger, D. J. (1991): Humidity Acceleration Factor for Plastic Packaged Electronic-
Devices. Quality and Reliability Engineering International, vol. 7, pp. 365-370
2.1 03 Baw, M. (1982): Temperature dependence of the reliability of semiconductor components.
National Conference of Electronics, Telecommunications and Computers (CNETAC), No-
vember, pp. 1.81-1.85
2.104 Bazu, M. et al. (1987): Reliability of semiconductor components in the first hours of func-
tioning at high temperature. Electrotechnics, Automatics and Electronics, no. 1, pp. 10-15
2.105 Bazu, M.; Ilian, V. (1990): Accelerated testing of integrated circuits after storage. Scandi-
navian Reliability Engineers Symp., Nykoping, Sweden, October
2.106 Bazu, M. (1992): Synergetic effects in reliability. Optimum Q, no. 2, April, pp. 32-35
2.107 Bazu, M. (1992): Accelerated life test when the activation energy is a random variable.
Proc. of the Int. Semicond. Conf. CAS '92, October 5-10, Sinaia, pp. 245-248
2.108 Baw, M.; Bacivarof, I. (1989): On the Validity of the Arrhenius Model in the Accelerated
Testing of Semiconductor Devices Reliability. In: Aven, T. (ed.) Reliability Achievement,
Elsevier Science Publishers Ltd., pp. 151-157
2.109 Birolini, A. (1996): Reliability Analysis Techniques for Electronic Equipment and Systems.
Proceedings ofEuPac'96, Essen, January 31-February 2, 1996
2.110 Birolini, A. (1996): Reliability Engineering: Cooperation between University and Industry
at the ETH Zurich. Quality Engineering 8(4), pp. 659-674
2.111 Bora, J. S.: Limitations and Extended Applications of Arrhenius Equation in Reliability
Engineering. Microelectronics and Reliability vol. 18, pp. 241-242
2.112 Bowles, J. B., Klein, L. A. (1990): Comparison of Commercial Reliability-Prediction Pro-
grams. Proc. Ann. ReI. & Maint. Symp., pp. 450-455
2.113 Howes, M. J. and Morgan, D. V. (Eds.) (1981): Reliability and Degradation. J. Wiley, New
York
2.114 Pieruschka, E. (1963): Principles of Reliability. Prentice-Hall, Englewood Cliffs
2.115 Schaeffer, R. L. (1971): Optimum Age Replacement Policies With an Increasing Cost
Factor. Technometrics no. 13, pp. 139-144
2.116 Sinnadurai, N (1991): Environmental Testing and Component Reliability Observations in
Telecommunications Equipment Operated in Indian Climatic Condition's. Proceedings
ESREF'91, Bordeaux, pp. 55-63
3 Reliability of passive electronic parts

3.1 How parts fail

An understanding of how electronic parts fail¹ is essential for improving the
reliability of devices, as well as of the systems in which they are used. Such
failure analyses help in identifying device failure modes, mechanisms, and the
stress factors that influence degradation [3.1]. Early failures (infant mortality,
or failures during the burn-in or debugging period) occur at a high initial failure
rate λ - the number of failures of a part per unit of time. This failure rate
decreases rapidly and stabilises at a time TB, when the weak units have died out.
Early failures (poor solder joints, wire bonds, or hermetic seals; poor connections;
dirt or contamination on surfaces or in materials; chemical impurities in metal or
insulation; voids, cracks, and thin supports in insulation or protective coatings;
and incorrect positioning of parts) are characterised by a high but rapidly
decreasing hazard rate² and are caused by many factors, including the built-in
flaws of faulty workmanship, inferior materials and inadequate process controls,
transportation and assembly damage, and installation and test errors.
Stress-related failures exhibit a low and relatively constant failure rate λ and a
relatively constant hazard rate Z(t). The MTBF figure of merit is defined as 1/λ
and is a measure of λ during the useful life period, for comparing hardware parts.
These relatively constant failure and hazard rate periods (between times TB and TW
in Fig. 3.1) vary among hardware types.

¹ A device can fail in a catastrophic, degradation, or intermittent mode. Electrical failures are
usually opens, shorts, or parameter drift out of specification. The failure mechanism is the
basic chemical or physical change that results in an identifiable failure mode. Similar
definitions apply to mechanical failure modes and mechanisms. Part failures can be labelled as
early fallout, stress-related, and wearout. Early failures are often linked to design and
manufacturing flaws or to reliability screening escapes, but can be stress-related. Normal
operating stresses cause most failures that occur after the early ones. Wearout failures are
linked to ageing and deterioration.
² The hazard rate is defined as the rate of change of the number of parts that have failed at a
particular time, divided by the number of surviving parts.
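As a numerical illustration of the quantities just defined (failure rate λ and MTBF = 1/λ, valid in the constant-failure-rate region), the sketch below converts between the units commonly used for component failure rates. The example values are arbitrary, not data from this chapter.

```python
# Sketch: converting between common failure-rate units and MTBF,
# valid only in the constant-failure-rate (useful life) region.

def mtbf_hours(lambda_per_hour: float) -> float:
    """MTBF = 1/lambda for a constant failure rate (exponential model)."""
    return 1.0 / lambda_per_hour

def fit_to_per_hour(fit: float) -> float:
    """1 FIT = 1 failure per 10^9 device-hours."""
    return fit * 1e-9

def percent_per_1000h_to_per_hour(p: float) -> float:
    """e.g. 0.1 %/1000 h = 0.001 failures / 1000 h = 1e-6 per hour."""
    return p / 100.0 / 1000.0

lam = percent_per_1000h_to_per_hour(0.1)  # 0.1 %/1000 h
print(lam)               # 1e-06 (failures per hour)
print(mtbf_hours(lam))   # 1000000.0 (hours)
```

The same conversions apply to the λ values quoted later in this chapter (e.g. 10⁻⁷ h⁻¹ corresponds to an MTBF of 10⁷ h for a single part).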

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999
[Figure: λ(t) and Z(t) versus time - the overall life characteristic (bathtub) curve, showing quality (early) failures, stress-related failures between TB and TW, and wearout failures.]
Fig. 3.1 Overall life characteristic curve

3.2 Resistors

It is generally recognised - in accordance with the laws of chemical and physical
degradation - that increasing the electrical, thermal and mechanical stresses on
electronic parts decreases either the time to failure or the time required to
accumulate a given amount of degradation. Conversely, decreasing these stresses
diminishes the rate of degradation, reducing the probability of catastrophic
failure and thus improving reliability. Derating is defined as the practice of
limiting these stresses on electronic parts to levels far below their specified or
proven capabilities, in order to enhance reliability. The benefit of derating parts
is clearly established. Even the best parts, when operated at maximum rated stress
levels, do not have failure rates low enough for economical operation and
maintenance of complex systems. A major contributing factor to the success of many
reliability programmes has been a conservative design approach incorporating
substantial derating of parts.
When operated at high ambient temperatures, resistors change value with time.
These normal changes must be taken into account in the definition and design of
the equipment (or system), according to the required duration of operation.
Tables 3.1 and 3.2 [3.2][3.3][3.4] give basic information for the derating of
fixed and variable resistors. The specified derating ratios and applicable notes
will assist the designer in obtaining reliable operation of parts. It must be
emphasised that the designer should evaluate all parts against the requirements
of his application, since he is responsible for the implementation of adequate
deratings.
The derating factors indicate the maximum recommended stress values and do not
preclude further derating; when derating, the designer must determine the
difference between the part environment specification and the operating conditions
of the application, derate further if possible, and then apply the recommended
derating factor(s) contained herein.
For new devices it is advisable to obtain vendor life test data, the analysis of
which may suggest derating figures. This is particularly true for step-stress data,
where a distinct breakpoint may be identifiable, so that reducing stress beyond the
breakpoint will not significantly affect the device failure rate.
In any organisation - even those engaged in purely commercial manufacturing
- preferred parts lists (PPL) have an important place, because they offer the
opportunity, to the component specialist, of controlling the use of components of
known quality level. An ideal PPL should include an outline of the relative costs
of components [3.5]. It is vital that the designers give at an early stage a clear
statement of the components to be used. Non-standard part procurement is very
costly, particularly when qualification data has to be supplied.

Table 3.1 Resistors; fixed; power

Part & type                                                  Maximum stress ratio
Carbon composition per MIL-R-39008, RCR; MIL-R-11, RC        0.5
Film per MIL-R-39017, RLR; MIL-R-22684, RL                   0.5
Film per MIL-R-10509, RN; MIL-R-55182, RNR                   0.5
Power film per MIL-R-11804, RD                               0.7
Wirewound per MIL-R-39005, RBR; MIL-R-93, RB                 0.4
Power wirewound per MIL-R-39007, RWR; MIL-R-26, RW           0.5
Power wirewound per MIL-R-39009, RER; MIL-R-18546, RE        0.5

Table 3.2 Resistors; variable; power

Part & type                                                  Maximum stress ratio
Wirewound lead screw per MIL-R-39015, RTR; MIL-R-12934, RR   0.7
Potentiometer per MIL-R-12934, RR                            0.2
WW semi-precision per MIL-R-19, RA; MIL-R-39002, RK          0.3
WW power per MIL-R-22, RP                                    0.6
Non-WW trimmers per MIL-R-22097, RJ                          0.2
Composition pot per MIL-R-94, RV                             0.7
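A derating check against Tables 3.1 and 3.2 can be sketched as a simple table lookup: the derated power ceiling is the rated power multiplied by the maximum stress ratio. The dictionary below reproduces only a few of the ratios, and the part-type labels are illustrative names, not designators from the tables.

```python
# Sketch: checking a resistor's operating power against a derating
# ceiling, using maximum stress ratios of the kind given in Tables 3.1/3.2.

DERATING_RATIO = {
    "carbon_composition": 0.5,   # e.g. MIL-R-39008 / MIL-R-11
    "film": 0.5,                 # e.g. MIL-R-39017 / MIL-R-22684
    "power_wirewound": 0.5,      # e.g. MIL-R-39007 / MIL-R-26
}

def max_allowed_power(part_type: str, rated_power_w: float) -> float:
    """Derated ceiling = rated power x maximum stress ratio."""
    return rated_power_w * DERATING_RATIO[part_type]

def is_within_derating(part_type: str, rated_power_w: float,
                       operating_power_w: float) -> bool:
    return operating_power_w <= max_allowed_power(part_type, rated_power_w)

# A 0.25 W carbon composition resistor should dissipate at most 0.125 W:
print(max_allowed_power("carbon_composition", 0.25))  # 0.125
print(is_within_derating("film", 0.5, 0.3))           # False (0.3 W > 0.25 W)
```

Further environmental derating, as the text notes, would tighten these ceilings before the ratio is applied.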

We distinguish three classes of resistors:
• Carbon composition resistors
• Film resistors
• Wirewound resistors
These types are manufactured not only as fixed resistors, but also as variable
resistors.
Ceramic materials are generally used as the substrate for the production of film
resistors. Depending on the conductive film, there are carbon film (glassy carbon)
resistors and metal film resistors. For carbon film resistors the main parameters
are: the regularity of the film, the maintenance of a clean environment, the
monitoring of the deposition parameters (temperature, hydrocarbon concentration,
deposition time), and the surface activity of the ceramic body [3.6].
In the case of metal film resistors, nickel films are laid down by plating or
evaporation; often the evaporation of metals or alloys, for example an
80 Ni / 20 Cr alloy, is applied. To improve the specific properties of the film,
a third element, for example aluminium, can be incorporated. With a controlled gas
inclusion, the temperature coefficient can be strongly influenced. Insulating
layers prevent the penetration of hydrogen between the grains of the metallic
layer, which can be a cause of instability.
For protection against external mechanical and microclimatic influences, the
resistors are covered with a multilayer lacquer film. The connecting elements have
a strong influence on the thermal conductivity and are a key factor for the
loading capacity of a resistor. For larger dimensions and higher temperatures,
radiation and convection must be mentioned first. Equipment reliability and user
costs are strongly influenced by the solderability of the connecting elements. The
solderability must be independent of the storage duration before use of the
resistor, and must be adequate for the different soldering methods applied in
practice. The film covering the resistor protects against climatic influences and
mechanical stresses; it is made by compression, lacquer moulding, or synthetic
resin moulding. To achieve this, some important requirements must be fulfilled:
• high voltage stability
• a very good sealing system
• high insulation resistance
• maintenance of the properties over very long time periods (>10⁹ h), within the
limits of the operating temperature range
• absence of ions
• absence of fissures or cracks at extreme temperature variations
• very good adhesion
• an extremely uniform pigmentation and distribution of the filling substance
• a precise colour, making the resistor colour code possible.
For lacquers and resins, observing the following rules is compulsory:
• a good wetting facility
• good flowing properties
Investigations of the physical and technical data have demonstrated that a film
resistor can be affected by inadequate testing conditions. In this respect the
electrolytic destruction of the resistive film by the existing humidity and the
applied voltage is fundamental. The efficiency of even the hardest and most
compact protection layer is limited and depends on the intensity of the humidity
stresses, so that such layers behave unsatisfactorily at extreme humidity values.
For a uniform heat distribution over the whole resistor under load, the helixes
must be uniformly distributed over the whole length. If not, "hot spots" will
appear, and these can produce large variations of the resistance and rapid
failure. A resistor must be trimmed to its final value taking the tolerance limits
into account; this is important to eliminate unforeseeable errors, and is the best
guarantee for the reliability of the insulating coating film.
If we consider the multiple mechanical and thermal stresses simultaneously present
- already during the manufacturing process - and influencing the different
constitutive materials of these relatively simple components, we arrive at the
conclusion that an imbalance (in comparison with the initial state) will be
produced, and that this imbalance will increase with time. Suppose that, during
the manufacture of a lot of electronic parts originally intended to have the best
quality and reliability, a production error prevents some prescribed parameters
from satisfying the quality requirements. As long as the "error" is known, the
reliability is not necessarily negatively influenced: in this case we sort out all
the items that satisfy all high quality and reliability requirements, without
knowing why they behaved in this way. Due to a superposition effect, it is
possible for a second error to have been introduced; in that case the sorting
process will eliminate high quality components, too.
The parameter list given to the user by the component manufacturer must contain
the following important points:
• the foreseen type (possibly with special operating requirements)
• the smallest / highest value of the operating resistance
• the maximum load during operation
• the maximum temperature of the environment in which the resistor operates,
taking into account the heating produced by the other parts mounted on the
equipped card
• the maximum value of relative humidity during operation
• the operating mode (pulsed / continuous); for pulse operation, the exact form
of the pulse and its repetition frequency are needed
• the maximum foreseen duration of operation
• the mean operating time of the system during 24 hours
• data concerning the possibility of particularly unfavourable operation
• the foreseen failure criteria (what is understood by "failure", and the
maximum accepted limits)
• the desired statistical confidence. (To what percentage must the electronic
parts correspond to the data supplied by the producer? The data must generally be
as exact as possible but - for sudden variations - it is difficult to guarantee
more than 0.073%, i.e. 3σ, in a statistical sense. The specified confidence limit
cannot be greater than 91%.)

3.2.1 Some important parameters

The size of the resistance variation depends on temperature (compared with the
reference temperature) and is generally expressed in 10⁻⁶/°C. If the variation is
linear over the range of operating temperatures, it is called the temperature
coefficient; if the variation is non-linear, one speaks of the resistance /
temperature curve.
The so-called inherent noise is the consequence of the voltage produced inside
the resistor. Any element with a resistance R in thermal equilibrium has an
inherent white noise, common to all constructive elements.
The voltage coefficient of the resistor (expressed as a percentage) is the
measure of the variation of a resistance under voltage. The resistor value is
influenced by time, mechanical factors, humidity and operating conditions. If,
when a resistor is used in different temperature and humidity conditions, the
final resistance value does not differ very much from the initial value, one may
say that the resistor is stable.
The resistor stability expresses the variation of the resistance with time. It
depends strongly on the dissipated power and on the ambient temperature; the
higher the resistor temperature, the poorer the stability. On the other hand, the
permissible dissipated power depends on the prescribed ambient temperature.
Resistance variations occur because of nominal load and higher ambient
temperatures. For the limits indicated in Fig. 3.2 (ΔR/R = 1%), 10% of the carbon
film resistors show drift failures after a relatively short time; if, on the other
hand, the prescribed limits are ΔR/R = 5%, no failures occur. The resistance
variations are generally greater at the beginning, whereas after an operational
period of 5000 h they are so small that they can be neglected. Nagel [3.7] has
shown that the variation of the resistance of carbon or metal film resistors can
be determined if the reference values for load, ambient temperature and time are
known.

[Figure: ΔR/R × 100 (%) versus operating time (0 to 10,000 h).]
Fig. 3.2 Time behaviour domain of 100 carbon film resistors (1 MΩ / 0.25 W; nominal power).
Prescribed limit value ΔR/R = 1%

The √t law is considered valid for all resistors, although the drift is smaller
for low ohmic values than for high ohmic values. This law is still a method used
with good results to calculate the life duration of film resistors.
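As a sketch of how the √t law can be applied: if a drift ΔR/R = d₀ has been measured after a time t₀, the drift at a later time scales as √(t/t₀) under this model, and the law can be inverted to estimate the time at which a drift limit is reached. The numbers below are illustrative, not measured values from this chapter.

```python
import math

def drift_at(t_hours: float, t0_hours: float, drift0_percent: float) -> float:
    """Extrapolate resistance drift with the sqrt-t law:
    drift(t) = drift(t0) * sqrt(t / t0)."""
    return drift0_percent * math.sqrt(t_hours / t0_hours)

def time_to_limit(t0_hours: float, drift0_percent: float,
                  limit_percent: float) -> float:
    """Invert the law: time at which the drift reaches a given limit."""
    return t0_hours * (limit_percent / drift0_percent) ** 2

# Example: 0.5 % drift measured after 1000 h of operation.
print(drift_at(4000, 1000, 0.5))      # 1.0  (% after 4000 h)
print(time_to_limit(1000, 0.5, 1.0))  # 4000.0  (h to reach the 1 % limit)
```

Note the quadratic sensitivity of the inverted law: halving the measured drift quadruples the predicted time to a given limit.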

3.2.2 Characteristics

a) Carbon film resistors:
• reduced dependence on voltage and frequency;
• relatively good stability;
• resistance to loading under pulsed working conditions;
• small inherent noise; small negative temperature coefficient.
b) Metal film resistors (in comparison with carbon film resistors):
• great stability in time (Fig. 3.3); high reliability if properly used;
• much smaller temperature coefficient;
• very small noise factor;
• very small tolerances; smaller dispersion.

Table 3.3 Comparison between metal film and carbon film resistors (general specifications;
load 0.1 ... 2 W)

Parameter                          Carbon film      Metal film
Tolerance (%)                      40               25
Temperature coefficient (ppm/K)    -250 ... -1500   ±200
Stability (200 h) (%)              1.5 ... 3        0.2
Current noise (µV/V)               0.005 ... 3      0.01 ... 0.2

[Figure: operating time t (10² to 10⁶ h) versus body temperature (50 to 200°C), for given resistance variations.]
Fig. 3.3 Drift data for metal film resistors in accordance with MIL-R-10509: t = operating time;
θK = body temperature (°C); ΔR = resistance variation (%)

c) Wirewound resistors:
• smaller temperature coefficient; very reliable if proper care is taken to keep
the operating temperature within reasonable limits;
• hardly measurable temperature coefficient (if quality resistors are intended);
• very small inherent noise;
• susceptible to induction;
• poor properties at high frequency, but high dissipation power.
3.2.3 Reasons for resistor instability [3.8] ... [3.10]

[Figure: ΔR/R (%) versus operating time t (0 to 50,000 h), curves a, b and c.]
Fig. 3.4 Parameter variation through ageing, depending on: a nominal value; b operating power;
c nominal load [3.9]
3.2.3.1 Carbon film resistors (Fig. 3.4)

Certain impurities, which cannot always be avoided during the manufacturing
process, affect the quality of film resistors. For example, since the ceramic
substrate, the carbon film and the encapsulation cannot be kept entirely free of
ionic substances, these - together with humidity - are the premises for ion
migration and for indirect destruction of the film resistor as a result of
electrolytic phenomena [3.8].
Another cause of instability is an irregular local feldspar concentration at the
surface of the porcelain. These concentration differences may lead to a locally
thinner carbon film, which has a higher resistance in these regions. The results
are higher specific loads and strong local heating, which lead to the destruction
of the carbon film and to early failures. The best preventive measure is
underloading (derating).

3.2.3.2 Metal film resistors

The predominant degradation factor is oxidation, in accordance with the Arrhenius
law. Other parameters that influence the stability are the surface roughness,
chemical reactions between the materials used, the proportion of alkaline ions,
etc. The evaporation speed, the substrate temperature, the resistance value, the
temperature coefficient, etc. are determinant factors on which the film formation
depends. A certain temperature level must be exceeded before significant
variations are obtained.
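Since oxidation follows the Arrhenius law, the acceleration between a use temperature and a stress temperature can be sketched as AF = exp[(Ea/k)(1/T_use − 1/T_stress)], with T in kelvin. The activation energy and temperatures below are placeholder values for illustration, not data from this chapter.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Arrhenius acceleration factor between use and stress temperatures."""
    t_use = t_use_c + 273.15      # convert to kelvin
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_stress))

# Placeholder activation energy of 0.7 eV, 55 °C use vs 125 °C stress:
af = arrhenius_af(0.7, 55.0, 125.0)
print(round(af, 1))  # a factor of several tens
```

Failure times observed at the stress temperature are divided by AF to estimate times at the use temperature; the result is very sensitive to the assumed activation energy.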

3.2.3.3 Composite resistors (on inorganic basis)

The only ageing causes are the local thermal processes at the level of the
terminal contacts.

3.2.4 Some design rules

If a resistor is correctly designed, manufactured and used, it will normally have
the longest life of all the components included in an apparatus, equipment or
system. To actually reach this goal, the following rules must be observed:
• The designer must know whether the value of the resistance mounted in the real
physical circuit differs from that prescribed by the circuit design, and whether
this difference varies.
• The designer must make sure that the designed circuit - whose resistances have
values within the prescribed limits - will work correctly, and must check whether
the voltage coefficients, the temperature coefficients, and the drifts during
operation have been considered.
• If the preceding conditions are not respected, it must be expected that, in the
series manufacture of an item, at least some circuits will not operate well under
the known limit conditions.
• In order to choose the resistor type correctly, all the reliability factors and
also the price must be considered.
• The stability of reliable resistors is evaluated by the drift after 2000 working
hours, expressed in %/1000 h (or in %/year). The best method to improve
reliability is derating, especially of the dissipated power (Fig. 3.4).
• What is reliable for one user may not be so for another. Reliability in one
circuit may mean that the resistor must not drift in value by more than a small
percentage of its initial value over a period of many thousands of hours. In
another circuit a large percentage change may be entirely satisfactory; yet the
resistor that is reliable in the latter case may be very unreliable in the former.

3.2.5 Some typical defects of resistors

Resistor failures can be explained by one or more of the following factors:
• fatigue, interruptions
• design errors
• manufacturing errors
• inadequate utilisation.

[Figure: load level (0 to 1.0 of nominal) versus ambient temperature (°C), panels a, b and c, each marked with regions P, D and F.]
Fig. 3.5 Load limitation (derating) curves for: a) carbon film resistors; b) metal film
resistors; c) wirewound resistors. P = permitted region, the area with the best
reliability/cost ratio and with optimal safety working reserves; utilisation of resistors in
this area is very frequent, since a reliability deterioration is normally not expected. D =
doubtful region; in this area the resistors work without exceeding the nominal values, but not
with the optimum reliability. F = forbidden region; in this area the nominal values are
exceeded and the resistors are overloaded
[Figure: ΔR/R × 100 (%) versus time (0 to 6000 h), curves at operating temperatures from 70°C up to 192°C.]
Fig. 3.6 Time behaviour of a 200 kΩ carbon film resistor at different normal operating
temperatures (mean values, alternating voltage)

[Figure: λ (%/1000 h, logarithmic scale 0.01 to 0.1) versus operating temperature (0 to 180°C), for derating ratios (operating power / nominal power) of 0.2, 0.4, 0.6, 0.8 and 1.0.]
Fig. 3.7 Failure rate dependence on the operating temperature, for different derating ratios
and at a relative humidity ≤ 60%

Fatigue and interruptions are explained by the degradation of inorganic materials.
This degradation is produced by the migration of impurities into the substrate
layer and by the oxidation of the constitutive elements of the resistors, after
uninterrupted use over some years.
Defective design is rarely encountered in operating products. Random
manufacturing errors may appear if the producer employs a new material without
sufficient previous testing. Inadequate utilisation of the resistors can only be
blamed on the user.

3.2.5.1 Carbon film resistors

Defects generally appear rarely under normal temperature conditions and when the
nominal loading capabilities are not exceeded (defects show up as short-circuits
or interruptions). Small value resistors have a smaller drift failure rate than
high value resistors [3.10]. In Fig. 3.6 the mean variation ΔR/R of a 200 kΩ
carbon film resistor is represented (in %), over a given time, as a function of
the operating temperature. It can be seen that the unreliability zone begins
above 192°C. By testing this resistor type (at an ambient temperature of +70°C
and a nominal load of 50%) Stanley [3.11] obtained the following failure rates
(at a 60% confidence level):
• for values ≤ 1 MΩ, λ = 10⁻⁷ h⁻¹;
• for values > 1 MΩ, λ = 9 × 10⁻⁷ h⁻¹.
The data concerning operational (field) reliability are more favourable, reaching
values of the order of λ = 5 × 10⁻¹¹ h⁻¹.

3.2.5.2 Metal film resistors

The metal film is thinner than the layers used for manufacturing carbon film
resistors. Therefore the probability of fissures and "hot spots" arising - which
can lead to open-circuit failures - is much higher (among early failures,
interruptions are the most frequent phenomenon). Other frequent defects are
inhomogeneities of the resistive layer (for example, a locally too thin layer, a
defective disposition of the helixes, or badly cut grooves can lead to
intermittent contacts between two neighbouring helixes and, as a consequence, to
instability and a high noise level).
For thick film resistors, Russel [3.12] indicates the following failure rates (at
a 60% confidence level):
• sudden failures: λ = 9 × 10⁻⁸ h⁻¹
• drift failures: λ = 2.5 × 10⁻⁷ h⁻¹.
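If sudden and drift failures are treated as independent constant-rate processes, their rates add, and the survival probability over a mission time follows the exponential law R(t) = exp(−λt). The sketch below uses Russel's two rates; the 10,000 h mission time is an arbitrary example, not a figure from this chapter.

```python
import math

def survival_probability(lambda_per_hour: float, t_hours: float) -> float:
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-lambda_per_hour * t_hours)

# Russel's thick film resistor rates (60% confidence level):
lam_sudden = 9e-8    # sudden failures, per hour
lam_drift = 2.5e-7   # drift failures, per hour
lam_total = lam_sudden + lam_drift   # independent mechanisms: rates add

# Survival over an example mission time of 10,000 h:
print(round(survival_probability(lam_total, 1e4), 5))  # → 0.99661
```

The same addition of rates underlies series reliability models, where the λ values of all parts in a circuit are summed.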
In Fig. 3.7 the variation of the failure rate λ (in % per 1000 hours) with the
operating temperature (°C) is shown, for different derating ratios (operating
power / nominal power) and at a relative humidity ≤ 60%. The final λ value is
obtained by multiplying the values from Fig. 3.7 by the applicable correction
factors (see Table 3.7).

3.2.5.3 Film resistors

In the case of film resistors (whose film is neither metal nor carbon) there are
some possibilities of revealing early failures before putting the resistor into
operation. For well manufactured resistors under a small load, a decrease of λ
during the whole observation period can be recorded; some resistors show - at
small load - open-circuit failures only after 30 years of operation. Resistor
reliability depends not only on the load, but also on the resistance value.

3.2.5.4 Fixed wirewound resistors

The most frequent failure causes are short-circuits between two neighbouring
helixes and bad contact between wire and terminal, especially for thin wires.

3.2.5.5 Variable wirewound resistors

Contact interruptions, produced by corrosion between the wiper and the resistance
wire, can occur, because at high temperatures the wire can be strongly oxidised
[3.6].

3.2.5.6 Noise behaviour

Figure 3.8 depicts the noise behaviour of the three main resistor types. Most
resistor types may be purchased off-the-shelf as established reliability (ER)
devices. Various screening programmes, established on the basis of life test
criteria, are available; details may be found in [3.13].

3.3 Reliability of capacitors

3.3.1 Introduction

There are few electronic components which, while carrying out one and the same
function, can differ so much in their constitutive materials as the capacitor.
Structurally, capacitors consist of two electrodes separated by a dielectric
medium, serving to store an electrical charge. They are characterised by their
capacitance (with the given tolerance), by the dielectric constant of the
dielectric medium used, by the insulation time constant (in the case of
electrolytic capacitors, by the leakage current), by the loss factor, by a good
and reliable power dissipation, by a reliable working voltage, by the temperature
dependence, and by their constructive shape. The dielectric material is the most
important element; it determines the characteristics of a capacitor.

[Figure: noise (µV/V, 0 to 2.0) versus resistance R (MΩ, 0.001 to 0.1), curves 1 to 3.]
Fig. 3.8 Noise variation for 1) metal film resistor; 2) carbon resistor and 3) wirewound
resistor

Which are the criteria for selecting the right capacitor for the right application?
There are some important parameters such as:
• the required reliability for the entire device
• the system complexity
• the component failure rate and its time dependence
• the costs of a system failure
• the required precision of life duration prediction for different operating voltages
and temperatures
• the environmental component stress (chemical, electrical and mechanical
influences)
• the limitations concerning the dimensions and weight of components.
The capacitors can have fixed or variable capacities. In the first category there are
capacitors with paper, plastic (KS, KP, KT, KC), metallised plastic (MKP, MKT,
MKC, MKU), metallised paper (MP, MKV), mica³, synthetic film, polyester
film/foil, ceramic⁴, electrolytic, tantalum and special capacitors. The trimmers
belong to the second category.

³ Mica as a dielectric can withstand temperatures up to about 400°C before dehydration occurs,
but mica capacitors are limited by the sealing material. Silvered mica capacitors in Mycalex
cases will operate at about 130°C. Vitreous-glaze capacitors should operate satisfactorily at
150°C in sizes comparable to the mica capacitors.
⁴ High-permittivity ceramic dielectric capacitors cannot - in general - operate beyond 100°C
because of a degradation effect known as creep, which becomes apparent as a change in
capacitance with temperature; the mechanism of the change is not fully understood, but is
under study.

The prime objective of any system is that it must meet the basic operational
performance. High reliability, good maintainability, electromagnetic compati-
bility and other desirable goals are of course important, but they are secondary
factors. The struggle between the basic performance requirements and the
reliability and maintainability requirements is often reflected in the part selection
problems. Choosing the latest types of parts can improve performance and
sometimes leads to an increase of the reliability level. Great care is needed to
ensure the following: a) the new part (range) is indeed superior; b) the new part
will become a de facto standard and thereby multisourced; c) the new part is
qualifiable to a degree equal to standard parts of roughly similar function.
Capacitors may be purchased off-the-shelf as established reliability (ER)
devices for most parts. Various screening programmes, established on the basis of
life test criteria, are available. As an example, ceramic capacitors may be screened
at twice rated voltage for a specified time period, at a maximum rated temperature.
Details may be found in [3.14].

3.3.2
Aluminium electrolytic capacitors

Half-dry electrolytic wound capacitors (the most used) are formed from an
oxidised aluminium foil (anode and dielectric) and a conducting electrolyte
(cathode). A second aluminium foil is utilised as covering cathode layer. They are
available with two formed nonpolarised foils and they have large loss factor
(frequency and temperature dependent), a limited useful life, and are not too
reliable (λ = 10 to 50 FIT; drift, shorts, opens). If their utilisation cannot be
avoided, it is better to choose the types built with high quality requirements. In the
case of aluminium electrolytic capacitors for high requirements - besides the
unavoidable early failures - they have nearly always wearout failures, too. The
beginning of these wearout failures limits their usability. A reliability dependence
on the size of the case and of the electrolyte quantity was proved: the smaller the
capacitor, the shorter the useful life and the higher its failure rate.
The operating capacity of these types is limited by the existence of a
determined adequate electrolyte quantity. As a consequence of the diffusion, of
the ageing or of the decomposition, the active electrolyte quantity of a capacitor
system diminishes and leads to a growth of the loss factor (tan δ) or to a
diminution of the capacity. These modifications are important in the case of fluid
systems; for the solid semiconductor electrolytes, the changes are insignificant and
- generally - don't lead to the fault of the capacitor. The growth of the leakage
current over a certain limit serves as failure criterion for this capacitor type, since
tan δ and the capacity have only unimportant variations. Due to the structure and
to the operation mode of electrolytic capacitors, the voltage stresses at the nominal
value are not taken into account; the solution for accelerated testing is the
operation at high temperatures. In the case of operation at temperatures over the
guaranteed limit value given by the manufacturer, supplementary electrolyte

losses appear. One must remember that the life duration of a capacitor is inversely proportional to the specific loading:

q = U·C/V    (3.1)

where U = nominal voltage, in volts; C = capacity, in µF; V = volume, in cm³.
Studies have demonstrated that this relationship is valid only for a capacitor
batch having the same geometric dimensions and manufactured with the same
technology. The same studies have led to the conclusion that to evaluate the
natural life duration of electrolytic capacitors, a duration test at nominal
voltage is inadequate. Since these capacitors are rarely utilised in permanent
operating conditions, so-called "duration life" studies undertaken at nominal
voltage and maximum operating temperature (with the aim of estimating the total
natural lifetime) will lead to completely false results.
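As a numerical illustration of Eq. (3.1) - the capacitor values below are hypothetical, not taken from the text:

```python
def specific_loading(u_volts, c_microfarads, v_cm3):
    """Specific loading q = U*C/V (Eq. 3.1): nominal voltage (V) times
    capacity (uF), divided by case volume (cm^3)."""
    return u_volts * c_microfarads / v_cm3

# Hypothetical example: a 15 V, 68 uF capacitor in a 2 cm^3 case.
q = specific_loading(15, 68, 2.0)  # -> 510.0 V*uF/cm^3
```

Per the studies cited above, comparing q values is only meaningful within a batch of identical geometry and technology.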
The natural behaviour can be faithfully reproduced with periodic testing
methods [3.15][3.16]. According to these methods, the capacitors are submitted to
voltage and high temperature, with a certain periodicity, and then stored without
voltage. High temperatures accelerate the test. Aluminium electrolytic
capacitors remain the classical example of components with an increasing
failure rate, suffering a clear ageing phenomenon. One can say that the end of
the useful life is programmed in advance. Taking these circumstances into
account, many manufacturers indicate, for high temperatures, a maximum
life duration in years.
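The temperature acceleration mentioned above is often approximated, for aluminium electrolytic capacitors, by the rule of thumb that useful life roughly doubles for every 10°C reduction in operating temperature. This rule is an assumption not stated in the text, and the manufacturer's datasheet values take precedence; a sketch:

```python
def estimated_life_hours(rated_life_h, rated_temp_c, operating_temp_c):
    """Rule-of-thumb estimate (assumed, not from the source): useful life
    doubles for every 10 C the operating temperature lies below the
    temperature at which the lifetime is rated."""
    return rated_life_h * 2 ** ((rated_temp_c - operating_temp_c) / 10.0)

# A capacitor rated for 2000 h at 85 C, operated at 45 C:
life = estimated_life_hours(2000, 85, 45)  # -> 32000.0 h
```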

3.3.2.1
Characteristics

Miniaturisation has led to the reduction of the foil surface. To increase the
effective foil surface, chemical or electrochemical roughening (etching) is utilised,
which allows higher capacities to be obtained for the same volume, under
relatively harder reliability conditions. The capacity of this capacitor type strongly depends on
voltage, temperature and frequency. Because of the poor electrolyte conductivity at
temperatures below 0°C, the operational capacity is strongly affected (growth of
the capacitor impedance, expressed by increased dissipation factor and equivalent
series resistance values). The alternating current passing through the equivalent
series resistance can heat the aluminium electrolytic capacitor so much (in spite
of a reduced environmental temperature) that it is not possible to maintain the
capacitive properties necessary for the system operation.
As a result, the capacity variation with temperature is an important quality
criterion. To increase the lifetime and the reliability of a capacitor, it is
recommended to operate it at the lowest possible temperature and, for the same
reason, to mount this capacitor type in the zones with the lowest environmental
temperature. The highest storage temperature is +40°C, but an operating
temperature between 0 and +25°C should be preferred. Other
disadvantages of the electrolytic capacitors are: an important leakage current (a
consequence of the imperfect blocking behaviour), a strong dependence on

temperature, and an important dissipation factor. The most important parameters
are the impedance and the leakage current (Fig. 3.9), as well as the capacity
variation ΔC/C and tan δ.
The time variation of the residual current is determined especially by the metal
impurities of the aluminium foil and by the dielectric porosity. The leakage
current dependence on voltage and the time variation are determined by the
applied voltage during the oxidation of aluminium foil, by impurities, and by the
exchange effect between dielectric and impregnation electrolyte [3.6].
The temperature influence is much greater than that of the operating voltage.
The self-heating due to alternating-current loading adds to the environmental
temperature, so that for the ageing speed the sum (environmental temperature
plus self-heating) is decisive. The drift behaviour at high temperatures
furnishes valuable reliability data.
The stability of the electrical parameters of electrolytic capacitors is good only
if the capacitors are operated regularly. A long storage (1 to 2 years) contributes
to the growth of the leakage current; but, after applying the correct voltage for
some minutes, the leakage current will stabilise at a very small value (new forming

Fig. 3.9 Impedance Z(Ω) and normalised residual current (%) variation for an electrolytic capacitor 68µF / 15V at an environmental temperature of +70°C: charged with nominal d.c. voltage versus without charge (environmental temperature +70°C)

process, reactivating). One may say that a capacitor is in a storage state if the
applied voltage is smaller than 0.15·UN (UN = nominal voltage). For storage
temperatures between 15 and 40°C, the new forming process (reactivation) must be
applied after the number of years specified in Table 3.3.

Table 3.3 Correlation between storage duration and new forming process (reactivation) for wet
aluminium electrolytic capacitors, for different nominal voltages and diameters

Nominal voltage          Diameter > 6.5mm    Diameter < 6.5mm

UN ≤ 100V                4 years             1 year
100V < UN ≤ 300V         2 years             6 months
UN > 300V                1 year              6 months
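Table 3.3 can be encoded as a small lookup for maintenance planning; this sketch resolves the boundary cases (exactly 100V and 300V) into the milder rows, an assumption not fixed by the table itself:

```python
def reforming_interval(u_nominal_v, diameter_mm):
    """Storage time after which re-forming (reactivation) is due, per
    Table 3.3, for wet aluminium electrolytic capacitors stored at
    15...40 C. Boundary voltages are assigned to the lower-voltage row."""
    large = diameter_mm > 6.5
    if u_nominal_v <= 100:
        return "4 years" if large else "1 year"
    if u_nominal_v <= 300:
        return "2 years" if large else "6 months"
    return "1 year" if large else "6 months"

print(reforming_interval(63, 10))   # -> 4 years
print(reforming_interval(350, 10))  # -> 1 year
```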

3.3.2.2
Results of reliability studies

Reliability studies have led to a long series of results [3.15]...[3.29].
Concerning the lifetime, the wet aluminium electrolytic capacitors can be
classified [3.17] in the following seven categories:
Class A - guaranteed 1000 h at 70°C
Class B - guaranteed 2000 h at 70°C
Class C - guaranteed 1000 h at 85°C
Class D - guaranteed 2000 h at 85°C
Class E - guaranteed 5000 h at 85°C
Class F - guaranteed 10 000 h at 85°C
Class G - guaranteed 2000 h at 125°C.
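For quick design checks, the seven classes above can be captured in a lookup table (a sketch; the values are exactly those listed above, per [3.17]):

```python
# Guaranteed lifetime classes for wet aluminium electrolytic capacitors
# [3.17]: class letter -> (guaranteed hours, test temperature in C).
LIFETIME_CLASSES = {
    "A": (1000, 70), "B": (2000, 70),
    "C": (1000, 85), "D": (2000, 85),
    "E": (5000, 85), "F": (10000, 85),
    "G": (2000, 125),
}

def guaranteed_hours(cls):
    """Return (hours, temperature_C) guaranteed for a lifetime class."""
    return LIFETIME_CLASSES[cls]

print(guaranteed_hours("F"))  # -> (10000, 85)
```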

Fig. 3.10 Guaranteed lifetime (in hours, by n°C) for French aluminium electrolytic capacitors, categories A to G, unaffected by encapsulation and voltage

Fig. 3.11 Possible lifetime for the categories A to G, for different utilisations: A (∅ > 6.5mm / 1000h / 70°C); B (∅ > 6.5mm / 2000h / 70°C); C (∅ > 6.5mm / 1000h / 85°C); D (∅ > 6.5mm / 2000h / 85°C); E (∅ > 6.5mm / 5000h / 85°C); F (∅ > 6.5mm / 10 000h / 85°C); G (U > 100V / 6.5mm < ∅ < 14mm / 2000h / 125°C)

Fig. 3.10 shows the guaranteed lifetime for these capacitors, unaffected by
voltage and encapsulation. Fig. 3.11 gives the possible lifetime for
different case studies.
To calculate the failure rate of the dry aluminium electrolytic capacitors,
Durieux [3.17] - starting from the relationship λ = λb·ΠU·ΠQ·10⁻⁹ h⁻¹ (where ΠU
and ΠQ represent the environmental factor and the quality factor, respectively) -
proposed an adequate nomogram. λb depends on the charge ρ and on the
temperature.
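The Durieux relationship can be evaluated numerically once λb, ΠU and ΠQ have been read from the nomogram; the factor values used below are placeholders for illustration, not values from [3.17]:

```python
def failure_rate_per_hour(lambda_b, pi_u, pi_q):
    """lambda = lambda_b * Pi_U * Pi_Q * 1e-9 per hour (after Durieux
    [3.17]). lambda_b depends on the charge and on the temperature;
    Pi_U and Pi_Q are the environmental and quality factors."""
    return lambda_b * pi_u * pi_q * 1e-9

# Placeholder factors, for illustration only:
lam = failure_rate_per_hour(lambda_b=20.0, pi_u=2.0, pi_q=1.5)  # about 6e-8 /h
```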

3.3.2.3
Reliability data

Generally, for electrolytic capacitors the stress factors are the environmental
temperature (TE ≤ 40°C) and the allowed loading (for example the nominal
voltage and the permissible ripple current at 40°C).
In this respect, a failure rate and the corresponding stress duration (see the
normative documents DIN 41 240, DIN 41 257, and DIN 41 332) are always
indicated. Consequently, a constant failure rate may be considered. The failure
rates indicated in the mentioned normative documents include total failures as
well as parameter drifts. For electrolytic capacitors, short circuits and
interruptions are total failures. Concerning the drift failures, the documents DIN
41 240 and DIN 41 332 mention the criteria from Table 3.4.

3.3.2.4
Main failure types

During the operation time, the electrolytic capacitors are submitted to a multitude
of stresses. To evaluate the quality and reliability, we must consider not only the
electrical stresses due to the voltage and current, but also the mechanical and
microclimatic influences, caused mainly by the temperature and humidity of the
air [3.6][3.19][3.21].

Table 3.4 Criteria for aluminium electrolytic capacitor drift failures (DIN 41240, 41332)

Elements                                            Severe           Normal
                                                    specifications   specifications
* Growth of tan δ vs. the initial value             3                3
* Diminution of nominal capacity:
  - at UN up to 6.3V                                40%              50%
  - at UN between 10 and 25V                        30%              40%
  - at UN between 40 and 100V                       25%              30%
  - at UN between 160 and 450V                      20%              30%
* Growth of nominal capacity (vs. the upper limit)  XU               XU
* Impedance increases with a factor:
  - at UN ≤ 25V                                     4                4
  - at UN > 25V                                     3                3
* Leakage current                                   up to the        up to the
                                                    initial limit    initial limit
                                                    value            value

The main factors that influence the reliability are the oxide layer, the impregnation
layer, the foil porosity and the paper (the last two factors are common to all types
of capacitors). At oxide forming, for example, various hydrate modifications can
appear. The conductivity of the impregnating electrolyte directly affects the loss
factor and the impedance, the chemical combinations, and the stability of the
electrical values.
Capacitors with great stability, reduced dimensions and reduced corrosion
sensitivity, having simultaneously reduced dissipation factors and impedances,
may be obtained by using electrolytes with great ionic mobility, even in
water-poor media. The deposited volume of electrolyte directly influences the
lifetime. Non-hermetic encapsulation leads to a rapid modification of the electrical
parameters: a diminution of the electrolyte quantity or a modification of its
consistency leads to a growth of the loss factor, a diminution of the capacity and a
growth of the impedance.

3.3.2.5
Causes of failures

• Dielectric breakdown (failure of the dielectric medium)
• Important leakage currents leading to breakdown
• Decrease of the capacity
• Very important loss factor.
In most cases, the last two causes occur simultaneously. The lifetime and
reliability of aluminium electrolytic capacitors are strongly dependent on tempe-
rature.

3.3.3
Tantalum capacitors

3.3.3.1
Introduction

In the last two decades [3.30]...[3.36], the tantalum capacitors with solid
electrolyte have conquered large utilisation domains and - due to their
superiority - have partially replaced the aluminium electrolytic capacitors. In
comparison with
aluminium electrolytic capacitors, the indisputable qualities of tantalum capacitors
are a very good reliability, a favourable temperature and frequency behaviour, a
large temperature range, and relatively reduced dimensions. The factors that
influence the reliability are the environmental temperature TE, the operation
voltage UE, the series resistance Rs, and - for plastic encapsulated capacitors - the
air humidity.
Since drift failures appear only at limits known to each user, the failure rate
calculation for them is only optional; for other applications drift failures do not
arise. Until now, wearout processes for tantalum capacitors have not been
observed. In most cases, a diminution of the failure rate in time has been observed.

Fig. 3.12 Operation principle of the tantalum capacitor (Ta / metal-oxide rectifier structure)

Fig. 3.13 The residual current curve of the tantalum capacitor CTS 13 (10µF / 25V): current versus voltage U(V) for normal and reverse polarisation

3.3.3.2
Structure and properties

A tantalum capacitor (Fig. 3.12) is a metal-oxide rectifier used in its blocking
direction (which explains its polarisation). It is characterised by reduced dimensions,
good stability of the electrical parameters, very good high-frequency properties, long
lifetime, and a large temperature domain. Concerning its reliability, experience
shows that the failure rate is smaller than 10⁻⁸ h⁻¹, with a confidence level of 90%.
Two restrictions must however be underlined:
• small value of the reverse biased voltage (compared with the nominal voltage);
• reduced reliability of the solid electrolyte tantalum capacitor in pulse operation, for
example in circuits with small impedance, where overvoltages can lead to
total failures.

Fig. 3.14 Distribution (cumulative percentage) of the residual current IR (µA) for a group of tantalum capacitors operating at an environmental temperature of +85°C. A) After zero operating hours; B) after 1000 operating hours; C) after 4000 operating hours; D) after 8000 operating hours

Fig. 3.15 Reliability of the tantalum capacitor: operating time (h) versus residual current IR (mA); the hatched zones are theoretically estimated

Fig. 3.16 ΔC/C0 variation between 25 and 85°C, at nominal voltages from 6 to 40V

That is why in such cases it is proper to select tantalum capacitors with higher
nominal voltage, since the greater dielectric thickness can assure a higher
reliability for the applied stress. When such a capacitor is submitted to a voltage
with sudden variations (switching-on, switching-off, over-voltages, etc.),
momentary overcurrents arise. These great local current densities lead to thermal
modifications of the crystallographic state of the dielectric.
Fortunately, dielectric breakdown seldom occurs. On the contrary, for great
stresses in pulse operation (a source with an interpolated filter in a circuit without
series resistances) it is recommended to utilise fluid electrolyte tantalum
capacitors, due to the "self-repair" of the dielectric under voltage and to the
thermal exchange favoured by the electrolyte mobility. Lately, some progress has
been made in this respect.
The other disadvantage of tantalum capacitors is a consequence of the operation
principle: the capacitor is equivalent to a diode with a very large area, used in the
blocking direction. The residual current curve (Fig. 3.13) is a proof in this respect.
Since, in reality, the data are more complex, the limits of the reverse voltage
depend on structures and technologies.
Some useful recommendations:
• Don't overheat the connection wires.
• Fix the case rigidly on the printed circuit board (to avoid deterioration
produced by vibrations).
• Reserve sufficient space for component cooling.
• Don't mount a tantalum capacitor with reversed polarity.

3.3.3.3
Reliability considerations

A certain arbitrariness invariably rules in establishing the failure criteria (given
by exceeding foreseen limits) for reliability data. In many cases, the selected
criteria are not optimal, but the information concerning the time variations of the
parameters offers greater possibilities to the electronic circuit designer in
estimating the reliability of his apparatus working within different limits. For the
components used in electronics and/or in information technology, the failure
criteria are very different. Even within the same component group (such as the
capacitors) a generally available definition of the delivery tolerances does not
exist. For low-voltage electrolytic capacitors the delivery tolerances are up to
+100%, but in the case of tantalum capacitors only a tolerance of ±20% is
allowed. Concerning the lifetime, although establishing the delivery tolerances as
failure criterion is a real possibility, it is not sufficiently conclusive in various
cases, since the parameters enclosed within each of these limits must have only
small variations. Similar considerations apply in the case of other main
parameters: residual current, loss factor, isolation resistance, frequency behaviour.
In Fig. 3.14 the residual current variation of a batch of tantalum capacitors
operating at +85°C [3.28] is shown. By an empirical selection of the ordinate
scale, a plot of straight lines was obtained⁵. If this diagram is not used, the
time variation of the 50% value, represented by curve A (Fig. 3.15) [3.28], is
followed. Curve B represents the mean residual current variation of a capacitor
group operating at nominal voltage and 55°C. It is very likely that before 100 000
hours no manifest deterioration will appear. Curve C represents the variation
corresponding to the 95% value.

3.3.3.4
ΔC/C0 variation with temperature

The specifications generally allow a capacity variation of -10% between the
values measured at +20°C and -55°C, and a variation of +12% between +20°C and
+85°C. In practice, the measured variations are normally much smaller, because in
reality tantalum capacitors are more robust. The ΔC values are correlated with the
dielectric modifications during thermal processing. On the one hand,
beginning with 200°C, tantalum loses oxygen from the compound Ta2O5,
which leads to the formation of a vacancy (lacuna) zone, so that the conductivity of the
metal-oxide layer increases. On the other hand, the oxygen from the MnO2 fills up
this zone. An equilibrium is established, and the temperature-dependent capacity
variations are proportional to the thickness of the remaining vacancy zone (Fig.
3.16).
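The specification limits quoted above (-10% cold, +12% hot) can be expressed as a simple acceptance check; the helper name and example values are hypothetical:

```python
def capacity_drift_ok(dc_cold_pct, dc_hot_pct):
    """Check dC/C0 against the quoted limits: at least -10% at -55 C and
    at most +12% at +85 C, both relative to the value at +20 C."""
    return dc_cold_pct >= -10.0 and dc_hot_pct <= 12.0

print(capacity_drift_ok(-4.5, 6.0))   # -> True  (typical robust part)
print(capacity_drift_ok(-12.0, 6.0))  # -> False (too much cold drift)
```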

Fig. 3.17 Interdependence of the product CU and λM (λM = mean failure rate, in 10⁻¹¹/h; confidence level 60%)

Fig. 3.18 Measured values of the tantalum capacitor impedance Z(Ω), at different nominal voltages (f = 100Hz)

⁵ No matter how surprising this artifice may appear, it is nevertheless valid. The exact
mathematical expression of the distribution law is not important if a unique straight line is
obtained; it is only a question of clarity and graphic representation. It must also be observed
that the selected scale is satisfactory only between 2% and 98%, and Fig. 3.14 covers this
range.

3.3.3.5
The failure rate and the product CU

The product CU (capacity × voltage) is an important parameter of tantalum
capacitors, because these are the only capacitors characterised by a very great
value of the product CU at a small volume and reduced price. Their reliability is
strongly influenced by the capacity value (Fig. 3.17).

3.3.3.6
Loss factor

The typical value of the loss angle tangent tan δ is about 2% (at normal
environmental temperature and 100Hz) for small CU values, and increases up
to about 8% for very great capacities. Because the dielectric losses are small, the
value of tan δ depends essentially on the series resistance elements of the capacitor.
Small values of the series resistance are a proof that the manufacturer
masters the fabrication problems well.

3.3.3.7
Impedance at 100Hz

This parameter is more important than tan δ, because it allows a better evaluation
of the constructive qualities of tantalum capacitors (Fig. 3.18).

3.3.3.8
Investigating the stability of 35V tantalum capacitor

The investigation [3.33] includes capacitors of widely different types which have
all undergone the same test plan. For the user of capacitors it is important to be
able to compare the stability and quality of different types with their prices. The
purpose of the investigation is to compare the stability of 35V tantalum
capacitors from different manufacturers. With this in view, a 35V capacitor with a
high (47µF) and a low (1µF) capacity value was tested for each manufacturer.
Most attention has been paid to the long-term stability when exposed to humidity
conditions and to the load life test at high temperature.
The following exposures have been used.
Group number 1:
a) Transient test. 10 capacitors are connected in parallel and exposed to a 10Hz
square-wave voltage with low level 0V and high level 35V. The rate of
increase and decrease of the loading current is 20A/µs and the loading current is
limited by a resistance of 0.1Ω, corresponding to a maximum of 350A. The capacitors
are exposed to 3000 cycles (5 minutes).
b) Climatic sequence. This is a block of tests often used for components in the
IEC specifications. The various tests are carried out without measuring the
capacitors between the tests. In this sequence, there is no bias on the components.
The following five steps are carried out in the sequence:

• 16 hours at 85°C ± 2°C.
• 1 cycle (24 hours) of the IEC damp heat test Db. Immediately, the next step is
carried out.
• 2 hours at -25°C. After conditioning for minimum 2 hours at +25°C, the next
step is carried out.
• 1 hour at a pressure of 150mbar at +25°C. Immediately, the next step is carried
out.
• Perform the remaining 5 cycles of the damp heat test started in the second step.
Component measuring is carried out after minimum 1 hour and maximum 2
hours of conditioning at 25°C, 50% RH.
Group number 2:
Damp heat steady state. For 56 days the capacitors are exposed to 40°C ±
0.5°C and 93% ± 1% RH (corresponding to IEC Publication 68, test Ca, but with
tighter limits concerning the humidity condition), each one biased at nominal
voltage. The capacitors are connected in parallel, and they have a common series
resistor which allows a short-circuit current of 1...2A. The voltage is increased
gradually from 0 to 35V, during 2...3 minutes.
Group number 3:
Lifetest at 70°C. The test is carried out for 2000 hours at 70°C ± 2°C (IEC
Publication 68, test B). The components are biased at nominal d.c. voltage. The
voltage increase from 0 to 35V is made gradually, during 2...3 minutes. The
capacitors are connected in parallel, and they have a common series resistor which
allows a short-circuit current of 1...2A.
Group number 4:
Lifetest at maximum temperature. This test is carried out as in group number 3,
only the maximum allowed temperatures are used.
Group number 5:
a) Temperature coefficient of |Z|. The numerical value of the capacitor is
measured at 120Hz in the temperature range from -55°C to the maximum allowed
temperature.
b) |Z| as a function of frequency. The value of |Z| as a function of frequency
is measured at 25°C. Lowest frequency: 500Hz; highest frequency: 500kHz.
c) Surge test at 85°C. Each capacitor is connected in series with a 1000Ω ± 10%
resistor. At 85°C the capacitor/series-resistor combination is exposed to 1000
periods of the following voltage cycle: a) 40V for 0.5 minutes; b) 0V for 5.5
minutes.
Group number 6:
Breakdown voltage. Each capacitor is exposed to a d.c. voltage starting at 35V
and increasing at 1 V/s. Maximum series resistance is 3Ω. The value of the
voltage at breakdown must be recorded. All measurements are carried out at a
temperature of 25°C ± 1°C and RH of 50%. Conditioning time before measuring
is minimum 16 hours, except for the humidity test group, where the conditioning
time was minimum 1 hour and maximum 2 hours. The test equipment for measuring
C and tan δ has a measuring accuracy better than ±0.1%. During the measuring,
d.c. voltage is not applied to the component. The maximum value of the a.c.
voltage is 0.3V. Measuring of the leakage current is carried out 5 minutes after the
component is biased at nominal voltage according to the IEC specification. The
main type of graphical display is given in Fig. 3.19.

Fig. 3.19 The main type of graphical display for the obtained results: ΔC (%) on the abscissa; arrows mark the distributions A (initial), B (after transient test/500h) and C (after climatic sequence/2000h); a cross (+) marks a catastrophic failure, and the mean value of each distribution is indicated

A, B and C describe measurements carried out at different stages of the testing -
for instance: initial values (A) for a life test group and values after 500 hours (B)
and 2000 hours (C). In this case, the B and C values represent the change in
capacity from the initial value for each component. Consequently, in the B
measurement the changes in the capacitors ranged from -0.5% to +1.3%, and
the mean value of the change for all capacitors in the test group was +0.3%. The two
crosses indicate that two of the capacitors failed catastrophically in the period
from 500h to 2000h. In such cases, the values of the two capacitors are not taken into
account in the graphical presentation of measurement C, and the calculation of the
mean value includes only the capacitors that are still alive.
The results of the tan δ and leakage current measurements are given in a similar
way, the only difference being that the given values are the absolute ones, and not
the variations. For the leakage current no mean value is calculated, due to the fact
that a logarithmic axis is used. For the transient test, climatic sequence and surge
test measurements, only ΔC is given, because no significant changes were found
in tan δ and |Z| during these tests.
Notes: a) In the 56 days of the damp heat steady state test (group 2), some of the
types are overstressed, because the manufacturer specifies the capacitors for only 21
days of exposure. b) The surge test has been carried out for all capacitors at 46V.
This is an overstress for some of the tested types. Table 3.5 gives the
impedance as a function of frequency, as the ratio |Z| : |Z500|, where |Z| is the
mean impedance of the 5 tested capacitors, and |Z500| is the impedance, at the same
frequency, of an ideal capacitor (tan δ = 0) having a capacity equal to the mean
value of the 5 capacitors measured at 500Hz.
Fig. 3.20 Results of the stability investigations of tantalum capacitors from various manufacturers (L, M, N, O): ΔC after the transient test and climatic sequence, and tan δ after the load life test at maximum temperature (2000h), for low and high capacity values

Table 3.5 Tantalum capacitor impedance as a function of frequency

Manufacturer   µF      |Z| : |Z500|
                       5kHz     50kHz    500kHz
L              1       1.04     1.44     7.61
               10      1.02     2.13     23.0
M              0.33    1.04     1.06     1.76
               2.2     1.00     1.09     5.29
N              1       1.00     1.34     6.16
               47      1.09     2.71     102.0
O              2.2     1.11     1.38     4.1
               47      1.50     3.51     120.1

Comments on the results. Generally, the conclusions are: a) The compared
capacitors do not have the same capacity values. This should be taken into account
when comparing two different manufacturers. b) All manufacturers were aware of
the fact that the capacitors were ordered for testing purposes. The possibility that
some kind of sorting burn-in may have taken place cannot be excluded. c) The key
point is that all the capacitors have undergone the same test. Consequently, only a
few types of capacitors have been exposed to harder

conditions than recommended by the manufacturer (mainly in the damp heat test
and the surge test).

Fig. 3.21 Tantalum capacitor breakdown voltage (in multiples of Unom) for various
manufacturers (L, M, N, O), for low and high capacity values; 10 capacitors per type,
voltage increase speed: 1 V/s

3.3.3.9
The failure rate model

In accordance with the standard 005 ITT 10300, the following failure rate is
indicated, estimated for tantalum capacitors in stationary operation and normal
conditions, at an environmental temperature of +40°C, 50% load and a
confidence level of 60%:
(3.2)
where ΠR is a correction factor for the series resistance, given in Table 3.6. In
these conditions, the useful lifetime of tantalum capacitors exceeds 25 years.

3.3.4
Reliability comparison: aluminium electrolytic capacitors
versus tantalum capacitors

It is difficult to make a comparison between these two capacitor families, since
different evaluation criteria are used. Ackmann [3.24] compared two similar
types (aluminium electrolytic capacitor 10 µF/35 V at 70°C, and tantalum capaci-
tor 8 µF/35 V at 85°C), and showed that for the aluminium electrolytic capacitor
the loss factor varies as presented in Fig. 3.22. For the tantalum capacitor with
solid electrolyte, on the contrary, the residual currents represent the ageing
criterion; these are influenced by temperature (for a temperature variation between
20°C and +85°C, they grow by a factor of 10). Concerning the median lifetime -
for the same conditions - the tantalum capacitors have much greater values than
the electrolytic capacitors. Moreover, contrary to the situation encountered for
electrolytic capacitors, operation at reduced starting voltage produces an
important reliability growth (Fig. 3.23). Reducing the starting voltage to 75% of
its nominal value doubles the lifetime; at a voltage of 50% of the nominal value,
the life duration increases by an order of magnitude. Concerning the
environmental temperature, Ackmann showed that, starting from +85°C, for each
temperature decrease of 10...12°C, the lifetime increases by a factor of ten. One
must also mention that the median lifetime of the aluminium electrolytic
capacitors is - in a first approximation - inversely proportional to the area of the
capacitor.
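The derating rules of thumb quoted above (lifetime doubles at 75% of nominal voltage, gains an order of magnitude at 50%, and gains a factor of ten for each 10...12°C below +85°C) can be combined into a rough lifetime multiplier. The functional forms below are an illustrative assumption fitted through the two quoted voltage points, not Ackmann's actual model:

```python
import math

def lifetime_multiplier(voltage_ratio, temp_c, ref_temp_c=85.0, step_c=11.0):
    """Rough lifetime gain for a solid tantalum capacitor, relative to
    operation at nominal voltage and +85 degC.

    voltage_ratio: operating voltage / nominal voltage (0.75 and 0.5
    reproduce the cases quoted in the text; intermediate values are
    interpolated on a log scale, which is an assumption).
    """
    # Voltage rule: x2 at 0.75 Un, x10 at 0.5 Un -> life ~ ratio**(-b),
    # with b fitted through the two quoted points.
    b = math.log(10.0 / 2.0) / math.log(0.75 / 0.5)   # ~3.97
    v_gain = max(1.0, 2.0 * 0.75 ** b * voltage_ratio ** (-b))
    # Temperature rule: x10 per 10...12 degC below +85 degC (11 degC used).
    t_gain = 10.0 ** ((ref_temp_c - temp_c) / step_c)
    return v_gain * t_gain

print(lifetime_multiplier(0.75, 85.0))  # doubled lifetime -> 2.0
print(lifetime_multiplier(0.5, 74.0))   # 10 (voltage) x 10 (temperature)
```

Such a multiplier is only a planning aid; the actual derating curves of a given manufacturer take precedence.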

Fig. 3.22 Comparison between electrolytic and tantalum capacitors at different nominal
voltages (f = 100 Hz): relative capacity (%) and loss factor tan δ (%) versus operating
time t (h, 10 to 10,000). A - aluminium electrolytic capacitor 10 µF/35 V/70°C; B -
tantalum capacitor 8 µF/35 V/85°C

Table 3.6 Correction factor ΠR for various values of the series resistance Rs

Rs (Ω/V)           ΠR
Rs ≥ 3             1
2 ≤ Rs < 3         1.5
1 ≤ Rs < 2         3
0.8 ≤ Rs < 1       4
0.6 ≤ Rs < 0.8     6
0.4 ≤ Rs < 0.6     9
0.2 ≤ Rs < 0.4     12
0.1 ≤ Rs < 0.2     15
3 Reliability of passive electronic parts 123

For aluminium electrolytic capacitors, it is known that the critical parameter is
the increase of impedance produced by the growth of the series resistance, as a
consequence of electrolyte loss. For tantalum capacitors, on the contrary, the
capacity decrease is smaller and the loss factor varies only very slightly, so that
such ageing criteria are not applicable.

Fig. 3.23 Increase of the median lifetime (h) of tantalum capacitors with the reduction of
the operating voltage (V), at +85°C. Criterion: A - IR > 0.04 µA/(µF·V); B - IR >
0.02 µA/(µF·V)

Table 3.7 Aluminium electrolytic capacitors versus tantalum capacitors

Parameter          Aluminium         Tantalum (wet)      Tantalum (dry)
Product C·U        large             very large          large
Ratio costs/(C·U)  small             fair                fair
Derating (°C/%)    60°C / 50%        85°C / 30%          85°C / 30%
Stability          fair              very good           very good
Lifetime           good              very good           very good
Remarks            high temperature  important costs     internal resistance must
                   sensitivity       (professional       be > 3 Ω/V; if not, the
                                     applications)       residual current is too high

3.3.5
Another reliability comparison: aluminium electrolytic
capacitors (miniature type) versus tantalum capacitors

The initial values of the parameters of miniature aluminium electrolytic
capacitors are comparable with those of tantalum capacitors. Yet even the newest
miniature types of aluminium electrolytic capacitors do not reach the main
advantages of tantalum capacitors (small temperature dependence and good
behaviour at high frequencies). Indeed, at high frequencies a good behaviour of
the impedance can be obtained (depending on volume and electrolyte); however,
for f > 10 kHz the series resistance requirements will be exceeded, unlike for
tantalum capacitors. For aluminium capacitors, the diminution of capacity with
growing frequency produces high values of the resonance frequency. At low
temperatures, a still more pronounced increase of the impedance of aluminium
electrolytic capacitors was observed: in the worst cases, 5 to 10 times, at -40°C.
Moreover, at 10 kHz and -40°C, the lowest impedance values of these capacitors
are already 3 to 30 times greater (for 1 µF and 47 µF, respectively) than those of
tantalum capacitors, and at 100 kHz this ratio cannot even be calculated. One
must remember that at low temperatures and high frequencies the aluminium
electrolytic capacitors behave essentially as resistors.
Today, for miniature aluminium electrolytic capacitors with connections on one
side only, the behaviour over time is improved. Depending on volume and
electrolyte, the expected lifetime varies between 2000 and 7000 hours, at +85°C
and nominal voltage, taking as reference the criteria given in DIN 46910/124. At
5% failures, a lifetime representing 85% of the foreseen values can be expected.
The typical causes of failure are the increase of tan δ, the modification of the
impedance and (rarely) of the capacity, or a greater residual current.
Because of the typical ageing behaviour of aluminium fluid (wet) electrolytic
capacitors, the parameter degradation with operating time is rapid in comparison
with that of tantalum capacitors. Particularly at low temperatures and/or high
frequencies, no resemblance to the behaviour of the tantalum capacitor may be
observed. The utilisation of miniature aluminium electrolytic capacitors seems
justified, due to dimensions and price, if a long lifetime is not necessary, if there
are no high temperatures in the environment, and if the behaviour at high
frequencies or at low temperatures can be neglected.

3.3.6
Polyester film / foil capacitors

3.3.6.1
Introduction

The polyester film/foil capacitors consist of a non-inductive wound section of
aluminium foil with a polyethylene terephthalate (PETP) film; the section is
protected by a hard, water-repellent, self-extinguishing lacquer. The leads are of
solder-coated copper wire, crimped and cropped. It is a flat, upright-mounting
type, lacquered, with 100 V, 250 V, 400 V and 630 V rated d.c. voltages, at ±10%
and ±20% tolerance on rated capacities (IEC 68), used in a wide range of
(consumer and industrial) applications, especially where high currents and/or steep
pulses occur.

3.3.6.2
Life testing

These tests are carried out under extreme conditions of load and temperature. The
components are loaded for a maximum of 7000 hours at 150% of the rated d.c.
voltage and at the maximum allowable temperature of +85°C. The failure rate λ is
calculated from the number of failures and the available number of component
hours. In accordance with IEC Publication 271, catastrophic failures are short-
circuits and interruptions. The degradation failures - after a test duration of 1000
hours - are as follows:
• ΔC greater than 2 × the required value;
• tan δ greater than 2 × the required value;
• Rinsul less than 0.1 × the required value.
In analogy with these two types of failures, two failure rates λ are calculated:
λc: failure rate where only catastrophic failures are taken into account.
λc+d: failure rate where both catastrophic and degradation failures are taken into
account.
λ is quoted with a confidence level of 60% and - in accordance with MIL-
HDBK-217 - in failures per million hours (× 10⁻⁶ h⁻¹). To ensure this confidence
level, the number of failures actually observed is artificially raised by adding a
quantity C60 (based on the Poisson distribution). λ is then calculated from:
λ60% = (number of failures + C60) / (number of component hours).    (3.3)
The calculated failure rate is an average for all values of the relevant testing
period, and is valid only for the conditions under which the tests were carried out.
These failure rates are further analysed. Table 3.8 gives a survey of some test
results. The tests were all performed at an overload of 50% and a temperature of
+85°C; the maximum test duration was 7000 hours; 7714 capacitors were tested
for a total of 19,242,000 component hours.

Table 3.8 Tested quantities and failures in life testing at +85°C, 1.5 UN, max. 7000 h

Voltage rating (V)           100     250     400     630
Quantity tested              1750    2484    1760    1720
Component hours (× 10³)      4250    6582    4420    3990
Catastrophic failures        1       6       0       2
Degradation failures         1       0       0       0
λc (60%) (× 10⁻⁶/h)          0.48    1.12    0.21    0.78
λc+d (60%) (× 10⁻⁶/h)        0.73    1.12    0.21    0.78
λc (60%) total               0.55 × 10⁻⁶/h
λc+d (60%) total             0.6 × 10⁻⁶/h
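Equation (3.3) can be sketched numerically. The helper below obtains the C60 adjustment from the one-sided 60% Poisson upper bound, which is a common way to implement it; the exact C60 table used in the book is not given, so this is an assumption. Taking the 100 V rating as an assumed example (one catastrophic failure in 4.25 × 10⁶ component hours), the routine reproduces a λc of about 0.48 × 10⁻⁶/h:

```python
from math import exp, factorial

def poisson_upper_bound(failures, confidence=0.60):
    """Smallest mean m such that P(N <= failures) <= 1 - confidence
    for N ~ Poisson(m); m plays the role of (failures + C60)."""
    def cdf(m):
        return exp(-m) * sum(m ** k / factorial(k) for k in range(failures + 1))

    lo, hi = 0.0, 10.0 * (failures + 1)
    for _ in range(100):                 # bisection on the decreasing CDF
        mid = (lo + hi) / 2.0
        if cdf(mid) > 1.0 - confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def failure_rate_60(failures, component_hours):
    # Eq. (3.3): lambda_60% = (failures + C60) / component hours
    return poisson_upper_bound(failures) / component_hours

print(round(failure_rate_60(1, 4.25e6) * 1e6, 2))  # -> 0.48 (per 1e6 h)
```

With zero failures, the bound gives C60 = -ln(0.4) ≈ 0.92, which is why a failure-free test still yields a non-zero λ estimate.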

3.3.6.3
λ as a function of temperature and load

The λ values given in Table 3.8 are valid only for the specific testing conditions:
50% overload at +85°C. The acceleration factors can be derived from MIL-
HDBK-217, where λ is given as a function of temperature and load (type MIL-
27287). By using the same acceleration factors for the capacitors with PETP film,
we can calculate λ under derated conditions. The estimated λ for some derated
conditions are given in Table 3.9.

Table 3.9 Estimated λ under derated conditions

Conditions                   λc           λc+d
Temp. (°C)    Load (%)       (10⁻⁶/h)     (10⁻⁶/h)
85            100            0.0087       0.0095
85            50             0.0005       0.00055
50            100            0.003        0.0033
50            50             0.00018      0.0002
25            100            0.0017       0.0019
25            50             0.00011      0.00012
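Table 3.9 can be used directly as a lookup for the derated failure rate; the snippet below merely transcribes it and derives an acceleration factor between two operating points (the factor itself is computed here, not stated in the book):

```python
# Estimated catastrophic failure rate lambda_c (1e-6/h) from Table 3.9,
# keyed by (temperature degC, load %).
LAMBDA_C = {
    (85, 100): 0.0087, (85, 50): 0.0005,
    (50, 100): 0.003,  (50, 50): 0.00018,
    (25, 100): 0.0017, (25, 50): 0.00011,
}

def acceleration_factor(stress, use):
    """Ratio of the failure rate at 'stress' to the rate at 'use'."""
    return LAMBDA_C[stress] / LAMBDA_C[use]

# Full load at +85 degC versus half load at +25 degC:
print(round(acceleration_factor((85, 100), (25, 50)), 1))  # -> 79.1
```

A factor of roughly 80 between the harshest and mildest table entries illustrates why derating is the cheapest reliability improvement available to the circuit designer.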

For climatic tests at high relative humidity (RH) and for temperature changes,
Table 3.10 gives a survey of all the quantities of capacitors tested under various
conditions. The average reject rate - with a confidence level of 60% - was less
than 0.02% (1 failure out of 12,428). The only failure occurred in the accelerated
damp heat test (not preceded by the rapid change of temperature test) on a 100 V
capacitor.

Table 3.10 Tested quantities and catastrophic failures in climatic tests

Voltage rating (V)                                   100     250     400     630
Damp heat test without load        Q (quantity)      460     855     505     458
40°C, RH 90-95%, 21 days           F (failures)      0       0       0       0
Rapid thermal variation test       Q                 442     847     503     459
3 h +85°C / 3 h -40°C; 1 cycle     F                 0       0       0       0
Accelerated damp heat test         Q                 442     847     503     458
55°C, RH 95-100%, 2 days           F                 0       0       0       0
Accelerated damp heat test         Q                 1328    1768    1292    1262
55°C, RH 95-100%, 2 days           F                 1       0       0       0

3.3.6.4
Reliability conclusions

• There is no difference in λ between the four voltage ratings.
• The λc (60% confidence) is 0.55 × 10⁻⁶/h; the λc+d (60% confidence) is
0.6 × 10⁻⁶/h; both at +85°C and 50% overload.
• In the climatic tests the average reject rate was 0.02% (60% confidence).

Stability in climatic tests; damp heating without load

The cumulative frequency distributions of ΔC and tan δ are shown in Figs. 3.24
and 3.25. In addition, the frequency distributions of Rinsul per rated capacitance
value are given. Fig. 3.24 gives the ΔC distribution per voltage rating in the test.
In all cases the drift is positive (about 2%), and greater for the capacitors with the
highest voltage ratings. Fig. 3.25 gives the cumulative frequency distributions of
tan δ as measured before (curve b) and after (curve a) the test. All voltage ratings
show the same shape: a slight increase of the tan δ level after the damp heat test.
Rinsul decreases as the capacitance values increase, but all the tested capacitors
easily met the requirements. Table 3.11 gives a survey of the tested quantities and
the percentages outside requirements found during the tests. In this test, at a
confidence level of 60%, the total percentage outside requirements was under
0.4%.

Accelerated damp heating test

This test was performed on the capacitors that had undergone the rapid thermal
variation test. The capacitance drift after this test was positive, varying from 2.5%
(100 V) to 3.1% (630 V).

Table 3.11 Percentages outside requirements after the damp heating test without load: 40°C, RH
90-95%, 21 days

Voltage rating (V)            100     250     400     630
Quantity tested               460     855     505     458
Percentage     ΔC             0       0.1     0       0.7
outside        tan δ          0       0.1     0       0
requirements   Rinsul         0       0.1     0       0.2
Fig. 3.24 Distribution of ΔC (%) versus cumulative frequency (%) during the damp heating
test without load at 40°C, RH 90-95%, for 21 days. 100 V: x̄ = 2.1%, n = 460; 250 V:
x̄ = 2.6%, n = 855; 400 V: x̄ = 2.8%, n = 505; 630 V: x̄ = 2.8%, n = 458
The tan δ drift level was slightly positive versus the level measured after the
rapid temperature change test. Table 3.13 gives a survey of the quantities tested
and the percentages outside the requirements. In this test, with a confidence level
of 60%, the total percentage outside requirements is under 2.2%.

Breakdown voltage
The breakdown voltage shows a normal distribution for all four ratings. All the
tested capacitors met the requirement that the breakdown voltage should be at
least equal to twice the rated operating voltage. Table 3.14 gives a survey of the
average breakdown voltage and the average field strength at breakdown.

Insulation resistance
Rinsul is found to depend on the operating voltage; no values outside the
requirement were found (requirement: Rinsul ≥ 10⁵ MΩ).

Fig. 3.25 Distribution of tan δ (× 10⁻⁴) versus cumulative frequency (%) for the 400 V rating,
after the damp heat test without load at 40°C, RH 90-95%, for 21 days: (a) before the test,
x̄ = 36 × 10⁻⁴, n = 505; (b) after the test, x̄ = 38 × 10⁻⁴, n = 505

Table 3.13 Percentages outside requirements after the accelerated damp heating test pre-
ceded by the rapid temperature change test: 55°C, RH 95-100%, 2 days

Voltage rating (V)            100     250     400     630
Quantity tested               442     847     503     458
Percentage     ΔC             0.8     0.9     0       0.3
outside        tan δ          1.0     0.1     0       0
requirements   Rinsul         2.4     0.8     0.4     1.9

3.3.7
Wound capacitors

This type of capacitor (with or without case) comprises plastic foil capacitors in
which an aluminium foil serves as the coating and polystyrene as the dielectric.
The terminal connections are joined to the coating (thickness: 10 to 40 µm), so
that the capacitors can be used at high frequencies. The classical paper types with
various impregnations (wax, oil, chlorinated naphthalene, epoxy resin) undergo
strong ageing phenomena. On the contrary, no parameter degradation by ageing
was observed for the plastic foil capacitors. The better behaviour over time of
these capacitors is one of the reasons which has contributed to the disappearance
of paper capacitors from the market. They have remarkable characteristics: small
losses at high frequencies, high capacity constancy, insensitivity to overload and
mechanical over-stresses, a well-defined temperature coefficient, and relative
insensitivity to humidity and temperature. For the unencapsulated types, the
capacity variations are partially reversible; nevertheless, it is possible that after
more severe stresses the capacity values remain modified (Fig. 3.26). A
modification of the capacity by 0.5% can produce operational perturbations.

Table 3.14 Breakdown voltage and field strength at breakdown

Voltage       Dielectric        Average break-        Average breakdown
rating (V)    thickness (µm)    down voltage (V)      field strength (V/µm)
100           5                 2400                  480
250           8                 2800                  350
400           12                3400                  283
630           19                4400                  232
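The last column of Table 3.14 is simply the breakdown voltage divided by the dielectric thickness, and the requirement quoted earlier (breakdown voltage at least twice the rated operating voltage) can be checked against the same rows. A quick verification:

```python
# (rated voltage V, dielectric thickness um, average breakdown voltage V)
# transcribed from Table 3.14
ROWS = [(100, 5, 2400), (250, 8, 2800), (400, 12, 3400), (630, 19, 4400)]

for rating, thickness_um, bd_volts in ROWS:
    field = bd_volts / thickness_um       # V/um, last column of the table
    assert bd_volts >= 2 * rating         # requirement: >= 2 x rated voltage
    print(f"{rating} V rating: {field:.0f} V/um at breakdown")
```

Note how the field strength at breakdown falls with increasing dielectric thickness (480 down to 232 V/µm), a typical thickness effect in polymer dielectrics.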

Fig. 3.26 Capacity variation ΔC/C₀ (%) of the 100 nF polystyrene capacitor with plastic
cover, over 0 to 1000 hours

These modifications become observable after more than 1000 hours, if
humidity tests are performed at +40°C and RH 95%. A temperature increase from
+25°C to +40°C reduces the humidity compensation constant by half. The
environmental temperature and humidity determine the ageing.

Failure types:
• Bad contacts, particularly between foil and terminals; after a long operation
time, they can be recognised by a strong oxidation;
• Bad soldering;
• Dielectric ionisation, as a result of high alternating voltages;
• Mechanical instability (bad adhesion of the aluminium foil, weak winding up of
the foil, etc.).
In the case of styroflex capacitors, the main failure modes are due to the changes
of capacity.

3.3.8
Reliability and screening methods of multilayer
ceramic capacitors [3.37] [3.38]

Multilayer ceramic capacitors are more and more utilised in hybrid circuits for
telecommunications, but they still have reliability problems. This explains why
many users are interested in a non-destructive screening method that allows the
detection of the early failures of these capacitors, particularly those occurring at
low voltage.
In 1983, Standard Telecommunications Laboratories (STL) presented a
screening method using an ionised solvent (methanol), which produces a temporary
increase of the electric conductivity of the zone which fails at low voltage, a zone
of the capacitor structure placed between the two electrodes of the ceramic
capacitor. The silver particles migrating between electrodes (or between the
porous terminals) towards constructive defects, cracks or voids are identified as
being the causes of failures at low voltages. The humidity - considered a
necessary condition for producing a failure - penetrates into the capacitor through
the electrodes and terminals (porous to a certain extent). Strangely enough,
delamination (that is to say, the separation of the electrode from the substrate) is
not a failure cause [3.38].
The method proceeds as follows: for example, for a 100 V capacitor, a voltage
of only 10 V is applied and, after 10 seconds, the crossing current I₁ is measured.
The capacitors are then heated to +85°C and afterwards immersed in methanol
for 1...15 hours. The longer the immersion, the better the methanol penetrates
into the failure zones. Then, for a time ≥ 60 seconds, the capacitor is dried; once
more a voltage of 40 V is applied and, after 10 s, the loading current I₂ is
measured. If the tested component presents an important failure, normally I₂ will
be greater than 10⁻⁸ A and the ratio I₂/I₁ will span several orders of magnitude.
For identical dimensions, the capacitors having the smallest current values will
have the smallest failure rates.
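The pass/fail decision described above can be sketched as follows. The 10⁻⁸ A absolute threshold and the "several orders of magnitude" ratio come from the text; the two-decades cut-off used for the ratio is an assumption made for the sketch:

```python
import math

def screen_capacitor(i1_amps, i2_amps, abs_limit=1e-8, ratio_decades=2.0):
    """Methanol-screening verdict for one multilayer ceramic capacitor.

    i1_amps: current after 10 s at low voltage, before the methanol soak
    i2_amps: current measured after the soak, drying and re-biasing
    Returns True if the part is suspected to fail at low voltage.
    """
    if i1_amps > 0:
        decades = math.log10(i2_amps / i1_amps)
    else:
        decades = float("inf")
    return i2_amps > abs_limit and decades >= ratio_decades

print(screen_capacitor(1e-11, 5e-8))   # large current jump -> True
print(screen_capacitor(1e-11, 5e-11))  # stable part -> False
```

In a production setting the two thresholds would be tuned against life-test results, since, as noted below, the screening is neither perfectly selective nor perfectly complete.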
The semi-automatic testing installations of STL can test 2000 components per
hour. The test results indicate that all the capacitors that failed at the screening
process will fail in the life tests (for structures: +85°C, RH 97%, over 2600 hours,
with an applied voltage of 4.5 V; for encapsulated capacitors: +85°C, RH 85%,
1200 hours, 1.5 V).
According to the statistical data published by STL, only 1% of all tested
capacitors pass the screening tests but fail at the lifetime testing. Many failures are
attributed to short-circuits due to methanol penetration. A number of failures
originate from defects too small to be detected. Strangely enough, there are
components considered inadequate at the screening, but found good at the end of
the lifetime test.
For an effective detection it is recommended to test the capacitors at the
structure level, and not after encapsulation.
The low voltage failure depends on ion migration through or along a
physical defect (crack, void, porous region) that brings into contact two opposed
electrodes. To produce this effect, certain environmental conditions must occur
[3.38].

In the presence of a defect, the silver particles can migrate from a termination to
an electrode, even if the internal electrode is not made of silver. In addition,
palladium, platinum or gold complexes are transported against the electrostatic
field. This explains why the movements can be greater for smaller fields. A high
humidity level is unfavourable. For encapsulated capacitors, a bad-quality solder
joint allows humidity penetration, so that ions can migrate between the
terminations. This explains why an encapsulated capacitor can fail, even if its
structure is faultless.
One of the first reactions to the announcement of the STL screening method
concerned the role of humidity: is it a sine qua non condition for low voltage
failures, or is it an accelerating agent?
According to the STL statistics, of more than 800 structures that passed
through the burn-in (which precedes the screening), none failed in an environment
with reduced humidity. It seems that the research performed at STL has not
identified the fundamental failure mechanism. It is known that a defect, a fault, is
necessary to lead to failure. But, in our case, is the essential element a crack or a
pinhole?
This confusion results from the fact that certain low voltage defects are self-
healing. Even if a microscopic analysis is performed, no physical defect can
be revealed. This leads to the conclusion that the STL research has isolated
only the defects which appear in certain operating conditions and which lead to a
low voltage failure. Further research will allow the physical and electrical
investigation of a greater and wider range of defects. By correlating the
manufacturing process with the defects, it should be possible to establish the
necessary improvements of the technological process with the aim of removing a
certain defect type.

3.4
Zinc oxide (ZnO) varistors [3.39]...[3.45]

For the suppression of contact sparks, RC combinations, diodes, and selenium
overvoltage protections are mounted in parallel with the coils or with the
contacts. Recently, for the same purpose, varistors are more and more utilised.
The zinc oxide varistors are voltage-dependent resistors with symmetrical U-I
characteristics, having properties very similar to anti-series connected Z-diodes,
but with a much greater loading capacity, and therefore serve for the protection
of circuits (with a response time < 25 ns). At the occurrence of high-energy voltage
peaks, the varistor leaves its previous high-resistance state and goes into a
conductive state, until the voltage peak diminishes to a non-dangerous value. The
pulse voltage energy is absorbed by the varistor, and so the voltage-sensitive
components are protected against destruction.
One must notice that the Trans-Zorb diode cannot simply be replaced by varistors,
even though their purpose and function are similar. For some applications, the varistor
can protect the circuit as well as the diodes, but each of the two products has its
own specifications. The specific, concrete application decides which of the two
products will be selected.

The varistors have a negative temperature coefficient. The Trans-Zorb diodes,
on the contrary, have a positive temperature coefficient. It has been observed that
in the case of metal-oxide varistors the limiting voltage varies with the pulse peak
current (Fig. 3.27). The ageing must also be considered (Fig. 3.28).

Fig. 3.27 Comparison between the limitation voltages Uc (V) for different peak pulse
currents Ipp (A): a) 39 V metal-oxide varistor; b) 39 V Trans-Zorb

Fig. 3.28 The mean decrease of the breakdown voltage BV (V) with the peak pulse current
Ipp (A), after pulse tests (measured after 10 pulses, each with a duration of 1 µs): a) 39 V
metal-oxide varistor; b) 39 V Trans-Zorb

Fig. 3.29 Oscilloscope pictures: a) 39 V Trans-Zorb; b) 39 V metal-oxide varistor. Pulse test
conditions: 50 A / 1 µs with a rise time of 4 kV/µs. Vertical scale: 50 V/div.; horizontal
scale: 2 ns/div

After a large number of pulses, the metal-oxide varistors modify their
barrier voltage, and a deterioration of the constant-voltage parameters may arise.
For integrated circuits and sensitive microprocessors it is necessary to have
protective elements with a small nominal voltage. In this case, too, different
behaviours of varistors and Trans-Zorb diodes, respectively, are found (Fig. 3.29).
In many cases it is not recommended - for safety reasons - to mount a zinc
oxide varistor directly on an open contact.

3.4.1
Pulse behaviour of ZnO varistors

The characteristics of ZnO varistors change if they are exposed to a pulse current
load. We will try to explain how these effects depend on the amplitude, the
duration and the number of pulses. In particular, an influence on the
leakage current and on the lifetime of the varistor was observed.
ZnO characteristics. To specify a ZnO varistor exactly, characteristic
values have been defined at the international level [3.45]. In Fig. 3.30 the
voltage-current characteristic of a ZnO varistor and the principal measurement
points are shown. The maximum current refers to a normalised stress of 8/20 µs.
For repeated stresses (or other pulse forms), a maximum current in the form of
a derating curve is assumed. A measurement point situated in the middle of the
measured domain serves to obtain the "clamping voltage" or "terminal voltage", U₁.
These current values depend on the diameter of the varistor and are defined
according to the R1 series of preferred numbers (ISO 3).
The advantage of ZnO varistors is given by the important slope "α". In
practice, this nonlinearity coefficient is not perceived as a differential slope, but as
a measure of a mean slope between two currents that must be defined. As the
principal value of the U-I characteristic, the "varistor voltage", in other words the
voltage measured at a current of 1 mA, has established itself. It can be simply
measured, has a weak temperature dependence, and therefore can be utilised as a
criterion for climatic and environmental tests.

Fig. 3.30 Some typical electrical values of a varistor on the U-I curve (U from 10 to 500 V;
I from 10⁻⁵ to 10⁴ A)

To take into account the dispersion due to manufacturing conditions, a
tolerance of ±10% on the varistor voltage is specified. The varistor voltage is greater
than the d.c. service voltage. A continuous load of 1 mA is not allowed for ZnO
varistors. The "maximum service voltage" has been defined on the basis of lifetime
tests at high temperatures and currents. The corresponding current is named the
"leakage current IL". Due to the strong nonlinearity of the U-I characteristic, the
maximum alternating voltage is smaller than the d.c. service voltage. The
maximum alternating voltage was defined such that, at the most unfavourable
varistor position in the tolerance band, the maximum current does not rise above
1 mA.
For some applications - for example in telecommunications - it is important to
keep the current in the lowest part of the characteristic as small as possible. For
these applications, new characteristic values have been defined: the
"blocking voltage UB" and the "blocking current IB", respectively. For the defined
blocking currents (ranging between 1 µA and 10 µA), the blocking voltage must
not fall below a given threshold.
The following list gives - in order of importance - a survey of the factors that
can influence the ZnO characteristics [3.46]:
• temperature;
• current pulses;
• d.c. service voltage and a.c. service voltage, respectively;
• chemical influences (internal: base and additive materials, etc.; or external:
aggressive moulding materials, oxygen-poor gases, etc.);
• mechanical influences concerning the forming processes in production, and
pressure-dependent loading;
• sensitivity to humidity;
• sensitivity to light.
Degradation to pulse. As shown in [3.47], after the application of a pulse the
lower part of the characteristic is modified. It can not only shift downwards or
upwards, but can also show a polarisation effect. If we compare the characteristic in
the pulse direction and in the opposite direction, the variations after the pulse
application can reach 20%. That is why, in pulse degradation tests, both
directions must be taken into account. As a rule, the negative characteristic (in the
direction opposite to the pulse) is much more affected.
The utilisation domain of ZnO varistors stretches from the protection of elec-
tronic components against electrostatic discharges to the protection of electric
energy supply equipment against lightning and switching surges.
Each current source has its own internal impedances, pulse forms and repetition
rates. Fast current rises induce voltages in the varistor connections, which then
superimpose on the varistor clamping voltage. These effects can be avoided only
by changing the constructive shapes [e.g. surface mounted devices (SMD) as
components without connections, or coaxial forms].
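The induced-voltage effect in the connections is just V = L·di/dt. With a typical specific lead inductance of about 10 nH per cm (an assumed round figure, not from the book), even short leads add a measurable overshoot on top of the clamping voltage at the current rise rates mentioned above:

```python
def lead_overshoot_volts(lead_length_cm, di_dt_amps_per_us, nh_per_cm=10.0):
    """Voltage added by lead inductance during a fast current rise.

    nh_per_cm: assumed specific inductance of a straight wire lead.
    """
    inductance_h = lead_length_cm * nh_per_cm * 1e-9   # nH -> H
    di_dt = di_dt_amps_per_us * 1e6                    # A/us -> A/s
    return inductance_h * di_dt

# 2 cm of total lead length, 50 A/us rise (the Fig. 3.29 test condition):
print(lead_overshoot_volts(2.0, 50.0))  # -> 1.0 (volts)
```

At ESD-like rise rates (amperes per nanosecond) the same 2 cm of lead would contribute hundreds of volts, which is why leadless SMD or coaxial varistor constructions are preferred for fast transients.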
By identifying the defects arising in varistors destroyed as a result of pulse
loads, two main groups can be distinguished [3.48]: in one group the varistor is
mechanically destroyed by the pulse, and the "slice" of which it is formed
explodes into many parts; this type of defect can be observed especially for short
pulses (4/10 µs; 8/20 µs). In the case of longer pulses (> 100 µs), one may observe
the formation of melting channels. As a rule, the varistor remains intact, without
fissures. The cause of failure is, in the first case, an excessive pulse current
and, in the second case, an excessive energy absorption.
Corresponding to these results, "normalised" pulse forms have been
selected. As the short pulse of high current, the exponential shock of 8/20 µs, and as
the high-energy pulse, the rectangular pulse of 2 ms and the exponential
shock of 10/1000 µs, respectively, have been standardised.
Degradation to pulse, without service voltages. Responsible for the
characteristic modifications due to the pulses are the pulse amplitude, the pulse
duration, the number of pulses and the pulse rate.
It is not allowed for the mean pulse load to be greater than the duration load. If
this limit value is exceeded, the varistor lifetime will be shorter due to this
supplementary stress factor. The following correlation can be noted thanks to the
study of the maximum shock current (having a certain form, depending on the
pulse number at which the modification of the varistor voltage Uv should remain
constant):
n(,d.Uv = cons!.) = (III,)" (3.4)
where v = nonlinearity coefficient of the pulse load; I = current amplitude at which
- after a single pulse - the varistor voltage Uv changes by a given percentage (for
example, 10%); In = current amplitude at which - after n pulses - the varistor
voltage changes by the same percentage.
The nonlinearity coefficient v is a measure of the dependence of the pulse
amplitude on the number of pulses, but it is not a measure of the maximum
number of pulses up to varistor destruction.
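Relation (3.4) can be applied numerically to derate the permissible pulse amplitude for repetitive loading. The following sketch uses hypothetical values for the single-pulse amplitude and for the nonlinearity coefficient v, which in practice must be taken from the manufacturer's data:

```python
import math

def pulses_to_limit(i_single: float, i_n: float, v: float) -> float:
    """Eq. (3.4): number of pulses n of amplitude i_n producing the same
    change of the varistor voltage Uv as one pulse of amplitude i_single."""
    return (i_single / i_n) ** v

def derated_amplitude(i_single: float, n: float, v: float) -> float:
    """Inverse form: amplitude permitted for n pulses, In = I * n**(-1/v)."""
    return i_single * n ** (-1.0 / v)

# Hypothetical data: one 500 A pulse shifts Uv by 10 %; v = 30 is assumed.
i_1000 = derated_amplitude(500.0, 1000.0, 30.0)  # amplitude allowed for 1000 pulses
```

With these assumed values, the permissible amplitude for 1000 pulses falls to roughly 80 % of the single-pulse value.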
Pulse polarisation. The above-mentioned differences between the positive and
negative characteristic after a pulse load can probably be explained by ion
migration in the intergranular zone [3.49][3.50]. In particular, interstitial zinc,
which migrates under the influence of the electric field, contributes to this effect.
The degree of polarisation depends on the current density of the pulse (Fig. 3.31).


Fig. 3.31 The varistor polarisation. a) The U-I characteristic. b) Modification of the varistor
voltage (+ in the pulse direction; - in the opposite pulse direction)

Pulse degradation, with service voltage. For the "operation test", the
ZnO varistors - connected to the service voltage - are subjected to pulses. Until
now it has not been possible to establish precisely whether pulse degradation is
superposed on degradation at service voltage. To study the influence of these
two mechanisms, extreme test conditions ("accelerated life test") were selected:

• environmental temperature: 120°C;


• maximum d. c. service voltage; and
• maximum allowable value for shock currents.
In this way it has been clearly demonstrated that, for ZnO varistors, the operating
voltage can be interrupted for a few minutes, and that the shock test can be
performed at normal ambient temperature, without the varistors suffering unduly
from the severity of the test.
Failure criteria for the lifetime estimation. There is no single definition of the
control criterion for the accelerated life test. Several limit conditions are currently
applied:
1) The change of the leakage current, e. g. up to double its initial value.
2) Operation at the maximum continuous load.
3) Using the load at which the varistor becomes unstable (from a thermal point
of view).
The first variant is the simplest, but varistors having small leakage currents are
disadvantaged: their response is, on average, more sensitive than that of varistors
with higher initial values of the leakage current (although, in absolute terms, they
have the smallest values).
If the maximum continuous load is exceeded - the second condition - the
influence of self-heating can no longer be neglected. The varistor is not yet
unstable, but it is outside the safe operating area. If the operating voltage or the
ambient temperature increases beyond the permitted value, the self-heating of the
varistor prevails, and the heat dissipation can no longer establish a stable
equilibrium. As a result, an exponential temperature rise arises, leading to the
destruction of the varistor. On reaching this unstable point, the third possible limit
condition may be used for the lifetime estimation. The strong influence of the
thermal conditions on the varistor constitutes the difficulty of this last method.
Arrhenius equation. If the observed duration is shorter than the lifetime, the
increasing leakage current can be written, to a good approximation, as:
It = A + k·t^(1/2)    (3.5)
where A is a constant depending on the preceding "history" of the leakage current,
and k is the slope of the leakage current plotted against t^(1/2).
As can be seen in Fig. 3.32, varistors operated with pulses have a greater slope,
so that they reach the limit current faster, in accordance with the second criterion.
This is why their lifetime is shorter. If the test is repeated at different ambient
temperatures and at different voltages, one observes that the logarithm of the
lifetime depends linearly on the inverse of the ambient temperature
(Arrhenius equation) [3.51].
The slope of these straight lines is constant for ZnO varistors, and an activation
energy may be associated with it. Newer ceramic systems show stable (or falling)
tendencies; this evaluation method cannot be applied to them.
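The Arrhenius procedure described above can be sketched numerically: from two accelerated-test points the slope (an activation energy) is obtained, and the lifetime at a lower operating temperature follows by extrapolation. The test lifetimes used below are hypothetical, chosen only to illustrate the computation:

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def activation_energy(t_low_c, life_low_h, t_high_c, life_high_h):
    """Slope of ln(lifetime) versus 1/T from two accelerated-test points (eV)."""
    inv_low = 1.0 / (t_low_c + 273.15)
    inv_high = 1.0 / (t_high_c + 273.15)
    return K_B * math.log(life_low_h / life_high_h) / (inv_low - inv_high)

def extrapolate_life(test_c, test_life_h, use_c, ea_ev):
    """Arrhenius extrapolation of the lifetime from test to use temperature."""
    af = math.exp(ea_ev / K_B * (1.0 / (use_c + 273.15) - 1.0 / (test_c + 273.15)))
    return test_life_h * af

# Hypothetical test results: 4000 h at 100 deg C, 1000 h at 120 deg C.
ea = activation_energy(100.0, 4000.0, 120.0, 1000.0)  # about 0.9 eV
life_40 = extrapolate_life(120.0, 1000.0, 40.0, ea)   # extrapolated lifetime at 40 deg C
```

The straight line in ln(lifetime) versus 1/T is fixed by the two test points; extrapolating it to a use temperature of 40 °C then yields a lifetime several hundred times the 120 °C test lifetime.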


Fig. 3.32 Evolution of the leakage current during the operating test time: 1 - in the opposite
direction of the pulse; 2 - in the direction of the pulse; 3 - comparative curve, without pulses

3.4.2
Reliability results

A lifetime test was performed on a batch of 260 varistors of type VP 130A10, for
200 hours at 100°C and an operating voltage of 184 V/60 Hz [3.52]. Four
catastrophic failures and 12 derating failures 6 occurred. This corresponds to a mean
lifetime of 59 000 hours and of 28 000 hours respectively, at a confidence level of
95%.
According to the data of the firm General Electric, the operating life at the same
confidence level is greater than 40 × 10^6 hours.

3.5
Connectors

The connector must provide an electrical and mechanical connection that is easy to
detach [3.53] ... [3.61]. Analytically, the general functions of a connector can be
classified - according to the elementary functions - as follows:
• to provide an electrical contact that is easy to detach;
• to provide a mechanical coupling that is easy to detach;
• to allow the connection of different conductors;
• to insulate the live electrical parts;
• to ensure mechanical stability;
• to provide fixing possibilities.

6 We speak of a derating failure if the voltage change (measured at a current of 1 mA) is
greater than 10%.

The weighting of these functions depends on the given application and on the form
of the conductor. Although some functions may seem "important" and others
"unimportant", irreproachable overall function is nevertheless guaranteed only if
all functions are fulfilled in the correct proportion. That is why the form of the
contact elements, the materials used, and the galvanic coating are particularly
important.
In the last 40 years, the connection technique has been decisively influenced by
three important innovations:
• 1958: printed circuits;
• 1975: optical conductors for the digital transmission of information;
• 1986: electronic connections without soldering.

[Fig. 3.33 bar chart, percentages by category: 1 - Electronics industry; 2 - Telecommunications; 3 - Information technology; 4 - Power/car electronics; 5 - Consumer electronics; 6 - Domestic industry]

Fig. 3.33 Distribution of connectors on the users' market

Connectors are used in each of the following domains: transmission of data
or information by means of electrical or optical signals, electronic regulation
and measurement devices used to control industrial processes,
telecommunications and data technology, office activities, transport, and
domestic and consumer electronics. Fig. 3.33 shows the distribution of connectors
on the users' market.

3.5.1
Specifications profile

• The contact quality depends on the contact (passing) resistance.
• The reliability depends on early failures, on wearout failures, on operational
failures and on environmental influences.
• Operating current: depending on the ambient temperature, it is
determined by the body insulation material, by the elastic contact-spring
material, by the contact force, and by the contact diameter.
• The contact achievement depends - from a mechanical point of view - on the
ratio between spring forces and shear forces (system tolerance).
• The contact technique: a combination of manual soldering, immersion
soldering, solderless pressure connection and the "press-in" technique.
• Particular specifications for optical fibre connectors: stable construction,
normalised components, simple mounting, reduced vaporisation, resistance to
environmental action, protected fibre surface, frequent and reproducible
connection.
The suitability of a noble metal as contact material can be definitively
established only on commercial connectors. Research carried out by the
firm ITT over a ten-year period has shown (Fig. 3.34) that the decrease of
the contact force during the lifetime follows a logarithmic curve.

[Fig. 3.34: contact force (N) versus lifetime (h), at 90°C and 120°C; contact dimensions max. 1.98 mm, min. 1.7 mm, min. 1.4 mm]
Fig. 3.34 Time behaviour of CuNi 9Sn2 connectors
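A logarithmic decrease of this kind may be represented, for instance, by F(t) = F0 − k·ln(1 + t). The parameter values in the sketch below are assumed for illustration only and do not reproduce the ITT measurements:

```python
import math

def contact_force(t_h: float, f0: float = 2.0, k: float = 0.2) -> float:
    """Illustrative logarithmic relaxation model F(t) = F0 - k*ln(1 + t);
    F0 (initial contact force, N) and k (decay constant) are assumed values."""
    return f0 - k * math.log(1.0 + t_h)

# The force falls quickly at first and ever more slowly afterwards:
forces = [contact_force(t) for t in (0.0, 250.0, 1500.0)]
```

The model reproduces the qualitative behaviour of Fig. 3.34: most of the force loss occurs early in life, while the curve flattens at long times.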

3.5.2
Elements of a test plan

From the user's point of view, a test plan for the qualification of a connector must
contain the following groups of characteristics: a) product characteristics; b)
operation characteristics; c) processing characteristics; d) operation behaviour
characteristics.
Each of these groups can be subdivided into well-defined elements; in this
sense, a distinction between estimation characteristics and test characteristics can
be made.
a) Product characteristics
This group gathers the characteristics specified by the manufacturer, on
the basis of the selected materials and the technology applied, to assure correct
behaviour of the product. Among these should be mentioned the materials used,
the processing (optical examination of the package and of the contact parts), the
surfaces (material junctions, contact-material porosity, impurities, contact
smearing), and the layer construction (in the contact and connection areas).
b) Operation characteristics
This group contains the important operation characteristics: mechanical operation
characteristics (dimensions and tolerances, interchangeability, mating possibilities,
the total connecting and withdrawal forces, contact force, static axial load, etc.),
and electrical operation characteristics (derating, contact resistance, operating
current, withstand voltage).
c) Processing characteristics
This group of characteristics is particularly important and contains the charac-
teristics which govern handling (modularity and mechanical strength of the
connection), mounting (the maximum admissible torque when screwing up
and fastening), and connecting capacity (pressing, soldering, clamping).
d) Operation behaviour characteristics
Main characteristics: transport and storage temperature range, operating
temperature range, connecting-disconnecting cycles, inflammability, humidity
stress, influences during operation, mechanical stresses (vibrations, accelerations,
shocks), electrical stresses (electrical endurance stress, overvoltage), and climatic
stresses (cooling, heating, humidity, sand and dust, thermal shocks, temperature
variations, mould growth).

References

3.1 Doyle, E. A. Jr. (1981): How parts fail. IEEE Spectrum, October, pp. 36-43
3.2 MIL-HDBK-175, Microelectronic Device Data Handbook, U. S. Department of Defense,
Washington, D.C.
3.3 Hnatek, E. R. (1975): The Economics of In-House Versus Outside Testing. Electronic
Packaging and Production, August, p. T29
3.4 Johnson, G. M.: Evaluation of Microcircuits Accelerated Test Techniques, RADC-TR-76-
218, Rome Air Development Center, Griffiss Air Force Base, New York, 13441
3.5 Roberts, J. A.; Chabot, C. B. (1980): Application Engineering. In: Arsenault, J. E. and
Roberts, J. A. (eds.) Reliability and Maintainability of Electronic Systems. Computer
Science Press
3.6 Mader, R.; Meyer, K.-D. (1974): Zuverlässigkeit diskreter passiver Bauelemente. In:
Schneider, H. G. (ed.) Zuverlässigkeit elektronischer Bauelemente, Leipzig: VEB
Deutscher Verlag für Grundstoffindustrie, pp. 400-401
3.7 Nagel, O. (1970): Stabilität von Schichtwiderständen. Internationale Elektronische
Rundschau H. 12, pp. 315-318
3.8 Hofbauer, C. M.: Die Feuchtigkeits- und Klimabeständigkeit von Schichtwiderständen.
Radio Mentor, vol. 27, no. 5, pp. 400-401
3.9 Tretter, J. (1974): Zum Driftverhalten von Bauelementen und Geräten. Qualität und
Zuverlässigkeit vol. 19, no. 4, pp. 73-79
3.10 Bajenescu, T. I. (1978): La fiabilité des résistances. La revue polytechnique no. 9, pp. 993-
997; Bajenescu, T. I. (1981): Zuverlässigkeit elektronischer Komponenten. Teil 1: Zu-
verlässigkeitskenngrössen und Ausfallmechanismen. Feinwerktechnik & Messtechnik no.
5, pp. 232-239
3.11 Stanley, K. W. (1971): Reliability and Stability of Carbon Film Resistors. Microelectronics
and Reliability, no. 10, pp. 359-374
3.12 Russel, R. F. (1971): Test on Thick Film Resistors. Microelectronics and Reliability, no. 10,
p.1I5
3.13 MIL-STD-199, Resistor, Selection and Use of, Supplemental Information, U. S. Department
of Defense, Washington, D. C. 7
3.14 MIL-STD-198, Capacitor, Selection and Use of, Supplemental Information. U. S.
Department of Defense, Washington, D. C. 7
3.15 Kormany, T.; Barna, H. (1982): Wege zur Beurteilung der natürlichen Lebensdauer von
Elektrolytkondensatoren. Nachrichtentechnik vol. 12, no. 10, pp. 391-392

3.16 Stade, W.; Hahn, G. (1975): Betrachtungen zu Lebensdaueruntersuchungen von Elektro-
lytkondensatoren. Nachrichtentechnik vol. 17, no. 11, pp. 441-443
3.17 Durieux, J. (1974): Fiabilité et durée de vie des condensateurs électrolytiques à
l'aluminium. CNET, Document de travail DT CPM/ICS 38
3.18 Bajenescu, T. I. (1981): Zuverlässigkeit von Kondensatoren. Feinwerktechnik &
Messtechnik vol. 89, no. 7, pp. 313-320
3.19 Solid Aluminium Capacitors - Reliability and Stability. Philips Technical Information
057/12.6.79
3.20 Lapp, J. (1978): Evaluating Capacitor Reliability. Electrical World, June 15, pp. 42-44
3.21 Bora, 1. S. (1973): Short-Term and Long-Term Performance of electrolytic capacitors.
Microelectronics and Reliability, vol. 18, pp. 237-242
3.22 Webinger, R. (1979): Aluminium-Elektrolytkondensatoren fur den Einsatz in Strom-
versorgungen. Bauteile Report vol. 17, no. 2, pp. 37-41
3.23 lEC-Specification for Aluminium Electrolytic Capacitors High Reliability Type. 40-1
(Secretariat) 36
3.24 Ackmann, W.: Alterungskriterien bei Elektrolytkondensatoren. NTF vol. 24, pp. 115-126
3.25 Ackmann, W. (1973): Reliability and Failure of Capacitors. Proceedings of the 3rd
Symposium on Reliability, Budapest, pp. 3-12
3.26 Hahn, G.; Wilke, N.: Veränderung charakteristischer Parameter von Kondensatoren bei
Forcierungsbelastungen. Nachrichtentechnik vol. 19, no. 2, pp. 53-57
3.27 Geigerhilk, B.; Bretschneider, R. (1982): Das Verhalten von Kondensatoren unter extremen
Bedingungen. Nachrichtentechnik vol. 12, no. 10, pp. 393-396
3.28 Meuleau, C. (1981): Zuverlässigkeitsprüfung und -bestimmung elektronischer
Bauelemente. Elektrisches Nachrichtenwesen vol. 38, no. 3, pp. 308-324
3.29 DIN 40040, 40041, 40046, 40815, 41099, 41122, 41240, 41247, 41255, 41256, 41257,
41311,41314,4133.2,41426,41640,44350,44351,44358,50018
3.30 Rechs, L. (1979): Guide de choix du condensateur au tantale. Toute l'Electronique,
November, pp. 39-46
3.31 Kleindienst, P. (1970): Neue Ergebnisse über die Zuverlässigkeit des Tantal-Kondensators.
Internationale elektronische Rundschau no. 8, pp. 205-208
3.32 Matsumoto, T., Sugita E.: Properties and Reliability of Tantalum Oxide Thin Film
Capacitor. Review of the Electrical Communication Laboratories vol. 23, no. 3, pp. 257-270
3.33 Bjerre, A.; Skaaning, K.: Tantalum Capacitors: An Evaluation of 11 Types of Tantalum
Capacitors. Elektronikcentralen ECR-38, Danish Research Centre for Applied Electronics
3.34 Ackmann, W. (1973): Neuere Ergebnisse zur Zuverlässigkeit des Ta-Kondensators. SEL-
Nachrichten vol. 12, no. 1, pp. 38-41
3.35 Petrick, P. (1971): Das Dauerverhalten von Kondensatoren. Elektronikpraxis vol. 3, no. 2,
p. 7-17; no. 3/4,pp. 9-16
3.36 Bora, 1. S. (1973): Limitations and Extended Applications of Arrhenius Equation in
Reliability Engineering. Microelectronics and Reliability vol. 18, pp. 241-242
3.37 Bajenescu, T. I. (1984): Fiabilite et methode de deverminage des condensateurs ceramiques
multicouches. Electronique, no. 2, p. 25
3.38 Biancomano, V. (1983): Screening method points to causes of low-voltage failure in MLC
capacitors. Electronic Design, 23rd June, 47-48
3.39 GE Application Notes Nr. E201.28, E200.71172, E200.73, E201.28
3.40 Levinson, L. M.; Philipp, H. R. (1977): ZnO Varistors for Transient Protection. IEEE Trans.
on Parts, Hybrids and Packaging, vol. PHP-13, no. 4, pp. 338-343
3.41 Fox, R. W. (1981): Six Ways to Control Transients. Electronic Design, vol. 22 no. 11, pp.
52-57
3.42 Bernasconi, 1. et al. (1975): Investigation of Various Models for Metal Oxide Varistors.
Journal of Electron Materials, vol. 5, no. 5, pp. 473-495
3.43 Hey, 1. c., Kra, W. P., eds. (1978): Transient Voltage Suppression Manual. General
Electric, Auburn
3.44 Reliability of General Electric GE-MOV Varistors, General Electric Report, E95.44

3.45 CECC 42,000, CECC 42,200: Harmonisiertes Gerätebestätigungssystem für Bauelemente
der Elektronik. Fachgrundspezifikationen und Rahmenspezifikationen, Varistoren.
Deutsche Elektrotechnische Kommission, Frankfurt
3.46 Philipp, Levinson (1983): Degradation Phenomena in ZnO, a Review. Advances in
Ceramics, no. 7
3.47 Fujiwara et al. (1982): Evaluation of Surge Degradation of Metal Oxide Surge Arresters.
IEEE Trans. on Power Appl. and Systems PAS-lOl, no. 4, p. 978-985
3.48 Eda (1984): Destruction Mechanism of ZnO Varistors due to High Currents. J. Appl. Phys.
56, p. 810
3.49 Eda, Iga, Matsuoka (1980): Degradation Mechanism of Nonohmic Zinc Oxide Ceramics.
J. Appl. Phys. 51, no. 5
3.50 Einzinger (1978): Nichtlineare elektrische Leitfähigkeit von dotierter Zink-Oxid-Keramik.
Dissertation, Fakultät für Physik der TU München
3.51 Carlson et al. (1986): A Procedure for Estimating Life Time of Gapless Oxid Surge
Arresters for an Application. IEEE Trans. Power Applic. and Systems PAS-O 1
3.52 Reliability of General Electric GE-MOV Varistors, Report E95.44 of the finn "General
Electric"
3.53 Bajenescu, T. I. (1994): Aschenbrödel Steckverbinder. Bulletin SEV/VSE 25, pp. 35-39
3.54 Bajenescu, T. I. (1992): Elektronische Bauelemente und Zuverlässigkeit. Aktuelle Technik
no. 9, pp. 17-20
3.55 Bajenescu, T. I. (1995): Prüfung und Vorbehandlung elektronischer Bauteile und Geräte.
Aktuelle Technik no. 3, pp. 6-8
3.56 Bajenescu, T. I. (1995): Zuverlässigkeitskenngrössen. Aktuelle Technik no. 5, pp. 13-15
3.57 Bajenescu, T. I. (1991): Steckverbinder und Zuverlässigkeit. Aktuelle Technik, vol. 13, no.
6, pp. 17-20
3.58 Bajenescu, T. 1. (1989): A Pragmatic Approach to the Evaluation of Accelerated Test Data.
Proceedings of the Fifth IASTED International Conference on Reliability and Quality
Control, Lugano (Switzerland), June 20-22, 1989
3.59 Bajenescu, T. I. (1989): Realistic Reliability Assessments in the Practice. Proceedings of the
International Conference on Electrical Contacts and Electromechanical Components,
Beijing (P. R. China), May 9-12, pp. 424-428
3.60 Bajenescu, T. 1. (1989): Evaluating Accelerated Test Data. Proceedings of the International
Conference on Electrical Contacts and Electromechanical Components, Beijing (P. R.
China), May 9-12, pp. 429-432
3.61 Bajenescu, T. I. (1992): New Aspects of the Reliability of Lithium Thionyl Chloride Cells.
Microelectronics and Reliability vol. 32, no. 11, pp. 1651-1653
3.62 Losee, F. (1997): RF Systems, Components, and Circuits Handbook. Artech House, Boston
and London
3.63 Ackmann, W. (1976): Zuverlässigkeit elektronischer Bauelemente. Hüthig-Verlag,
Heidelberg
3.64 Bajenescu, T. I. (1979): Elektronik und Zuverlässigkeit. Hallwag-Verlag, Bern
(Switzerland), Stuttgart (West Germany)
3.65 Bajenescu, T. I. (1981): Zuverlässigkeit passiver Komponenten. Technische Rundschau,
Switzerland (Artikelserie von März bis und mit Juni 1981)
3.66 Bajenesco, T. I. (1981): Zuverlässigkeitsproblemlösungen elektronischer Bauelemente.
INFOR-MIS-Informationsseminarien 81-8, Zürich, May 14 and October 20
3.67 Bajenescu, T. I. (1982): Eingangskontrolle hilft Kosten senken. Schweizerische Technische
Zeitschrift (Switzerland) 22(1982), pp. 24-27
3.68 Bajenesco, T. I. (1984): La fiabilité des relais. Electronique (Switzerland), no. 10
3.69 Bajenescu, T. I. (1984): Relais und Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 1,
pp. 17-23
3.70 Bajenescu, T. I. (1984): Zeitstandfestigkeit von Drahtbondverbindungen. Elektronik
Produktion & Prüftechnik (West Germany), October, pp. 746-748
3.71 Bauer, C. L. (1991): Stress and Current-Induced Degradation of Thin-Film Conductors.
Proceedings ESREF '91, pp. 161-170, Bordeaux

3.72 Bavuso, S. J.; Martensen, A. L. (1988): A Fourth Generation Reliability Predictor. Proc.
Ann. Rel. & Maint. Symp., pp. 11-16
3.73 Bazovsky, I. (1961): Reliability Theory and Practice. Prentice-Hall Inc., Englewood Cliffs,
New Jersey
3.74 Bazovsky, I.; Benz, G. (1988): Interval Reliability of Spare Part Stocks. Qual. Reliab.
Engng. Int., no. 4, pp. 235-246
3.75 Dürr, W., Meyer, H. (1981): Wahrscheinlichkeitsrechnung und schliessende Statistik.
Hanser-Verlag, München / Wien
3.76 Ellis, B. N. (1986): Cleaning and Contamination of Electronics Components and
Assemblies. Electrochem. Publ., Ayr (Scotland)
3.77 Moore, E. F., Shannon, C. E. (1956): Reliable Circuits Using Less Reliable Relays. J.
Franklin Institute, pp. 191-208; 281-297
3.78 Münchow, E., Erzberger, W. (1994): Wie zuverlässig ist zuverlässig? MegaLink
3.79 Munikoti, R., Dhar, P. (1988): Low-Voltage Failures in Multilayer Ceramic Capacitors: A
New Accelerated Stress Screen. IEEE Trans. on Components, Hybrids, and Manufacturing
Technology, vol. 11, no. 4, pp. 346-350
3.80 Reinschke, K. (1973): Zuverlässigkeit von Systemen (Band I). VEB Verlag Technik, Berlin
3.81 Reinschke, K., Usakov, I. (1987): Zuverlässigkeitsstrukturen. Verlag Technik, Berlin
3.82 Reiszmann, E. (1972): Messung und Bewertung mechanischer Umweltbeeinflüsse auf
Geräte. Fernmeldetechnik, vol. 12, no. 3, p. 117
4 Reliability of diodes

4.1
Introduction

Diodes and rectifiers are bipolar components with non-linear characteristics,
whose behaviour differs depending on the polarity of the applied
voltage [4.1] ... [4.13]. Silicon is used almost exclusively as the semiconductor
material. The main constructive forms are planar diodes and MESA diodes.
Among the important ratings that must not be exceeded, the reverse voltage,
the forward current and the maximum junction temperature (including data
concerning the thermal resistance at high temperature) may be mentioned.
For rectifier diodes four categories may be mentioned:
• Rectifier diodes for general purposes
• Avalanche rectifier diodes
• Fast rectifier diodes (with small reverse recovery time)
• Avalanche rectifier diodes with controlled reverse current decrease.
Rectifier diodes for general purposes (namely without avalanche breakdown)
are unsuited to high transient reverse voltages. Small irregularities of the
barrier layer may lead to a local breakdown, which may produce local
overheating (a hot spot) leading to chip deterioration. Despite the considerable costs
of reliability assurance, new defects arise all the time, leading to perturbations in
the operation of the equipment. Since the majority of stress tests do not supply
information about long-term behaviour (failure rate or other comparable
reliability indicators cannot be calculated), the only possible way to obtain
reliability data is trial operation under conditions as close as possible to the
foreseen application. With this end in view, the user has only the information that,
in his equipment, under similar loading conditions and for parts from the same
manufacturing period, the component will give results similar to those obtained
in the tests. To complete the picture, one may say that the most trustworthy data
can be obtained only from the operation of the equipment itself.
The completion of laboratory reliability data with failure-rate data obtained by
testing at the user, under variously and arbitrarily established conditions (these failure
rates being in a more or less marked relationship with the operational failure rates),
can only be a temporary solution. Consequently it is necessary to concentrate the
activity of a reliability laboratory on short-duration tests, rather than on obtaining
failure rates from long-duration tests (of up to two years).

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999

4.2
Semiconductor diodes

4.2.1
Structure and properties

The silicon wafer is subjected to a series of diffusion processes with the aim of
performing the doping of the semiconductor barrier layer. Under practical operation

Fig. 4.1 Comparison between failure rates of silicon rectifier diodes from manufacturers A, B
and C, for different stresses: d. c. loading of the barrier layer vs. operation under capacitive load

conditions, high reverse voltages are applied to this doped layer. During operation,
the diode must not exceed a certain well-defined temperature of this barrier
layer.
The possible influences leading to parameter modifications are due to
electrochemical phenomena, caused - for example - by residues of the necessary
cleaning processes on the crystal surface during manufacture. Since the barrier
layer reaches the surface at the edge of the structure, these phenomena appear in
this region. The electrochemical reactions are accelerated by potential differences
and by the increase of temperature [4.8].

4.2.2
Reliability tests and results

Comparative studies on different products have been performed: in one test
sequence, under comparable voltage and current conditions, at d. c. voltage and - in
another test series - at capacitive load [4.8]. Important differences between the
failure levels have been found (Fig. 4.1). At capacitive load a higher failure
percentage than under d. c. stress was obtained.
Fig. 4.2 gives a survey of the failure causes, based on failure analysis, for
silicon rectifier diodes. Various silicon rectifier diodes 1N4005 were exposed
to humidity tests for 74 hours at 85°C and 85% RH; after 2 hours of test, IR was
[Fig. 4.2 data, defects in %: lateral and superior part breaking through 30.7; inadequate internal soldering 22.6; other causes 14; eccentric position 7; interruption of internal soldering 1.2; breaking through 0.5; burned silicon crystal (value illegible)]

Fig. 4.2 The failure causes (in %) of the silicon rectifier diodes

measured (the failure limit for IR is 5 µA). All samples were good, with one exception:
IR increased from 10 nA to 10 µA. This test does not correspond to a real stress, but
allows a useful comparison between technological variants. A supplementary
protection has been foreseen for the good samples: by embedding all con-
necting conductors together, an increased humidity leakage path and a supplementary
mechanical anchoring of the conductors are obtained.
Fig. 4.3 shows the effect of temperature on various diode types, in accordance with
the physical failure model indicated in MIL-HDBK-217. The voltage has
a considerable accelerating effect on the failure rate [4.7], more accentuated at low
temperatures than at high ones (acceleration factor: 7 at 75°C, and 2 at
150°C).

4.2.3
Failure mechanisms

a. Mechanical failure mechanisms

a1. Total failures
• Poor soldering => interruption.
• Insufficient mechanical pressure => intermittent failures / interruption.
• Inadequate expansion coefficient => interruption.
• Scratches on the structure surface => increase of thermal resistance and of
breakdown voltage.
a2. Degradation failures
• Mechanical degradation (contact points or connection wires partially faulty =>
local overheating => hot spot => total failure).

b. Electrical failure mechanisms

b1. Total failures
• Too high voltages (or currents) => interruption (short-circuit).
• High temperature (for a relatively short time) => degradation of electrical
parameters.
• Voltage peaks => breakdown of the pn barrier layer => short-circuit.
• Quick change of polarity: reversal from forward direction to blocking
direction => breakdown of the barrier layer => short-circuit.


Fig. 4.3 Failure rate λ (FIT) versus normalised temperature of the barrier layer, according to
MIL-HDBK-217; 1 - silicon diode; 2 - germanium diode; 3 - Z-diode

b2. Degradation failures
• Ion mobility in the oxide layer => degradation in reverse conduction.
• Stress produced by the forward current => formation of an inversion layer =>
degradation of the barrier properties.
For screened and reliable components used in military applications, the failure rate
is about 0.04 × 10^-6/h, at a confidence level of 60% and at the maximum
dissipation power.
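For orientation, a failure rate of this order can be converted to the usual FIT scale (failures per 10^9 component-hours) and, for a constant failure rate, to a mean time between failures; a minimal sketch:

```python
def to_fit(lam_per_hour: float) -> float:
    """Convert a failure rate given in 1/h to FIT (failures per 1e9 hours)."""
    return lam_per_hour * 1.0e9

def mtbf_hours(lam_per_hour: float) -> float:
    """For a constant failure rate, MTBF = 1 / lambda."""
    return 1.0 / lam_per_hour

lam = 0.04e-6        # 0.04 x 10^-6 failures per hour, as quoted in the text
fit = to_fit(lam)    # 40 FIT
mtbf = mtbf_hours(lam)
```

The quoted rate thus corresponds to 40 FIT, or a mean time between failures of 25 million hours.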

4.2.4
New technologies

To fulfil the increasingly severe specifications concerning these components,
some manufacturers have developed new diode fabrication technologies.
Thus, for example, the company Texas Instruments achieved a metallic contact
between the ends of the conductors and the semiconductor crystal by adding a contact
material at the anode and at the cathode of the crystal. The high reliability is
guaranteed by a special glass passivation technique.

Fig. 4.4 "Superrectifier" technology with "glass" of plastic materials (General Instrument
Corp.). 1 - brazed silicon structure; 2 - sinterglass passivation; 3 - non-inflammable plastic case

General Instrument Corporation has developed a glass-plastic material techno-
logy (Fig. 4.4), a combination of glass passivation and a plastic case. The
advantage of plastic encapsulation is that the mechanical problems raised by
utilisation are solved; even the thermal robustness is greater. The external
mechanical forces (caused by the bending of the conductors and their subsequent
tearing) are taken up by the plastic case. The electrical advantages of the glass-
passivated case are maintained. Since the glass cell does not have to withstand the
external mechanical forces by itself, the glass body (together with the molybdenum
pellet, used because its dilatation coefficient is close to that of the glass) can remain
an entity.
In this way a decrease of the thermal resistance was obtained. The cell glass
does not contain alkaline ions; as a result, the barrier layer is electrically neutral
and has the same dilatation coefficient as the silicon. This explains why no cell
breaks due to thermal causes. The good properties of the glass - as passivation
material - guarantee the time constancy of the doping state of the rectifier cell
and, consequently, the stability of all electrical parameters (reverse current,
reverse voltage, and reverse switching time). By mounting the cell at a
temperature of 600°C, a very small thermal and electrical resistance of the
transition from the structure itself to the connection conductors is obtained.
A life test over 2 × 10^6 component-hours indicated a small failure rate (λ =
1.8 × 10^-6/h, at a 60% confidence level). The testing methods were in accordance with
MIL-STD-202, "Test Methods for Electronic and Electrical Component Parts",
and MIL-STD-750, "Test Methods for Semiconductor Devices".
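For a life test in which no failures occur, the exponential model gives an upper confidence bound on the failure rate as λ_u = −ln(1 − CL)/T. The sketch below applies this textbook formula to the quoted test size of 2 × 10^6 component-hours; the zero-failure assumption is ours, not the source's, which is why the result differs from the rate quoted above:

```python
import math

def lambda_upper(total_hours: float, confidence: float) -> float:
    """One-sided upper confidence bound on the failure rate when a life test
    of total_hours component-hours ends with zero failures (exponential model)."""
    return -math.log(1.0 - confidence) / total_hours

lam_u = lambda_upper(2.0e6, 0.60)   # about 4.6e-7 per hour
fit_u = lam_u * 1.0e9               # the same bound expressed in FIT
```

With failures observed, the bound is obtained instead from the chi-squared distribution with 2r + 2 degrees of freedom, of which the zero-failure formula is the special case.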

4.2.5
Correlation between technology and reliability:
the case of the signal diodes 1N4148 [4.9]

A frequent question in seminars and conferences dedicated to reliability
problems in electronics concerns the effect of a given type of technology
on the reliability of the considered product; in other words, is there a
relation between technology and expected reliability? In the next paragraphs the
particular example of the signal diode 1N4148 will be discussed, examining
in detail the failure risks associated with different technologies. This is the case of
small-current (< 500 mA) pn junction diodes, glass encapsulated and having
axial terminals.
An important user of these diodes had noted, for diodes of "standard" quality,
an abnormally high failure rate above +70°C. This was confirmed by the quality
control department of the user's plant, as well as for failures that appeared in
finished products already delivered to clients. Alerted by the user, the
manufacturers confirmed the failures at high temperature (instability of the
forward voltage VF, and random short-circuit defects) inherent to certain
assembling technologies of signal diodes. One of the manufacturers confirmed, in
addition, open circuits at high temperature, and short-circuits due to free
conducting particles in the cavity.
As a result of the claims received, the manufacturers promised to supply diodes
made with an essentially improved, more reliable technology, avoiding the risks
cited above.
The initial manufacturing principle is the same for all manufacturers.

Fig. 4.5 The "double plug" technology. 1 - glass tube; 2 - structure; 3 - plug

In Fig. 4.5 the "double plug" technology is presented: a chip pressed between two
plugs provided with connections and sealed into a glass tube. One must distinguish
between the "standard" technology, in two variants (pressed contacts and welded
contacts), and the "without cavity" technology.
Standard technology. A planar chip (Fig. 4.6) has the form of a parallelepiped
whose typical dimensions are 400 µm × 500 µm × 200 µm. The two plugs are made
of an FeNi alloy (Dumet) covered with a copper metallisation. The glass tube is
made of silica-lead-potassium glass. Concerning the assembling (Fig. 4.7), the
connection-plug assembly is realised by thermal or electrical welding, and the
chip-plug assembly is made either by pressure (obtained during the sealing at
600°C) or by soldering (during sealing at a temperature greater than 700°C). This
contact type is imposed by MIL-S-19500 for the model 1N4148. The plug sealing
is made by heating, the temperatures being determined by the type of the chip-
plug contacts.


Fig. 4.6 Planar structure in the standard technology. 1 - silver excrescence
assuring the anode contact; 2 - SiO2 passivation assuring the protection of the
pn junction at the surface; 3 - metallisation of the cathode contact.

Fig. 4.7 Standard technology with the two plugs (FeNi alloy). 1 - connection;
2 - structure; 3 - hermetically closed glass body; 4 - plug; 5 - silver outgrowth
assuring the anode contact; 6 - cavity about 200 µm wide; 7 - welding.

Fig. 4.8 Technology "without cavity", with mesa structure. 1 - metallisation of
the anode contact; 2 - metallisation of the cathode contact; 3 - SiO2 passivation
assuring the protection of the junction on the lateral parts of the structure.

Fig. 4.9 Technology "without cavity", with the two silvered tungsten plugs.
1 - structure; 2 - welded contact; 3 - hermetically sealed glass body.

Technology "without cavity". The "mesa" die (Fig. 4.8) has the form of a
parallelepiped with typical dimensions of 400 µm × 500 µm × 100 µm. The two plugs
are made of tungsten covered by silver. The two connections are made of FeNi
(Dumet). The glass tube is made of a very resistant non-alkaline compound.
Concerning the assembling (Fig. 4.9), the connection-plug assembly is made by
welding at 680°C, the plug-glass tube assembly is assured by sealing at 700°C,
and the chip-plug assembly is realised by eutectic welding at 850°C.
The three operations are made in a single pass through an oven. The
specific feature of this assembling technique (compared with the standard
technology) is the absence of the inner cavity at the chip level (Figs. 4.8 and
4.9). In fact, a micro-cavity can exist, but the short-circuit risk is cancelled
by the glassivation¹ of the chip edges.

Fig. 4.10 Intermediate technology between "standard" and "without cavity": this
is a planar structure, but of bigger dimensions. 1 - (passivating) oxide;
2 - glassivation; 3 - cathode contact (metallisation).

Fig. 4.11 Intermediate technology: the glass body is in contact with the
glassivation.

An intermediate technology. This new technology is intermediate between the two
technologies described earlier: the chip (Fig. 4.10) is planar and cubic, but of
bigger dimensions (750 µm), to provide a greater contact area. In addition, a
glassivation covers the surface and the edges, as in the case of the "mesa" chip.
The alloy quality could be improved by an optimisation of the pellet's back
metallisation. The assembling method is the same as for the "double plug" type,
with the difference that the glass sealing is in contact with the glassivation of
the pellet's flanks. The existence of two separate cavities (Fig. 4.11) eliminates
any short-circuit risk through particles. This important feature requires
modifications in the sealing procedure, and depends on the glass type used.

¹ Glassivation: vitreous layer which covers the semiconductor chip, with the
exception of the contact areas ("bonding pads"), intended to completely protect it
against the aggression of contaminants (particles, humidity, etc.).
Passivation: insulating layer (SiO2, Si3N4) deposited on the surface of a
semiconductor pellet to protect the junctions against contaminants and to isolate
the conductive parts from each other.
The two processes can be used together or separately on a single chip; in contrast
to glassivation, passivation can be deposited even on a non-planar area.
The "standard" technology has to deal with two types of potential defects:
intermittent short-circuits, and torn or intermittent contacts, which affect the
"pressed contacts" variant.

4.2.6
Intermittent short-circuits

Conductive particles with typical dimensions of 10...50 µm may separate from the
metallisation of the internal part of the plugs or from the chip (burrs after
dicing). It has been proved that the particles can also originate from a lack of
cleanliness during the assembling operations. The particles can then move in the
cavity of 200 µm width (Fig. 4.12) and produce intermittent short-circuits between
the plugs or between chip and plugs when the diode is subjected to vibrations,
shocks, and accelerations. This type of defect was not identified, because: 1) the
defect has a hidden character, appearing randomly and only if the diode is
operating under vibrations, shocks and accelerations; 2) it is difficult to
correlate the material defect with the diode failure; in addition, the fragility
and small dimensions of these diodes make dismantling difficult, especially if the
diode is encapsulated in Dobekan.
The detection and prevention methods are: (i) internal visual inspection - if the
package is transparent - and (ii) electrical testing while the diode is
simultaneously subjected to vibrations ("PIND test") or shocks ("tap test").
Depending on the specifications and on the manufacturer, these tests may or may
not be foreseen in the manufacturer's quality assurance manual. In the case of the
"tap test", for example, the target is to detect - and to destroy (by burning with
a voltaic arc) - any loose particles. To do this, a machine automatically places
the diodes, one by one and for several seconds, facing a multi-contact measuring
head which exposes each diode, vertically mounted (as this position seems to be
particularly favourable), simultaneously to vibrations (10 cycles, reversing the
mounting sense) and to a reverse voltage of 110 V, higher than the breakdown
voltage (VBR) specified in the data sheets of the models 1N4148 and 1N4448 (whose
breakdown voltage is 100 V).
Using this test, manufacturers successfully eliminated about 5% of the tested
diodes. A variant of this test also exists: measuring VF or IR while the diode is
exposed to microshocks.
Contact tears. The assembling of the 1N4148 and 1N4448 diodes (but not of the
1N4148-1 diode) relies on pressed contacts. In fact, this pressure is assured by
the difference between the dilatation coefficients of the materials (silicon,
copper, and glass) after sealing at a temperature of 600°C. For a certain
manufacturer, the typical dimension of this construction is 0.4 µm (Fig. 4.12).
Therefore, the smallest deviation from the manufacturing procedure can produce
inadequate contacts, which may be identified - particularly at ambient
temperatures greater than +70°C - by an increase of VF beyond tolerances.

To underline the difficulty of detecting such contact tears, it is sufficient to
mention that this type of failure is very hard to reveal if the diode is connected
in parallel with a coil. How can the failure be detected in this case? With the
aid of an evaluation test (mechanical pulling of the connections at about
5.5...6.8 kgf) and observation of VF on an oscilloscope. This experiment - in
accordance with method 2036 of MIL-STD-750 - is performed between 25 and 150°C.
To remove the failed diodes, the following combined tests can be utilised:
• pulling tests of the connections, performed at an ambient temperature of
+25°C ("terminal strength");
• shock and vibration tests on the diode body, at an ambient temperature of
+25°C;
• measurement of VF, performed at an ambient temperature of +150°C.
All these tests are foreseen in the MIL specification and partially in the CECC
specification. As a matter of fact, the absence of the two last-mentioned tests
from the CECC specification explains why an abnormally high failure rate for such
defects was observed by users of diodes manufactured in accordance with the CECC
specifications.
Possible remedies:
• Contact tears: use diodes with welded contacts instead of pressed contacts.
This is the case, for example, for the variant 1N4148-1.
• Intermittent short-circuits: use diodes without cavity, with a glassivated
pellet. This technology is utilised only for the manufacture of "-1" diodes. The
presence of a microcavity is not dangerous if the pellet is well glassivated.
This analysis emphasised the technological disparities that can exist between
different manufacturers for the same basic model. Obviously, at the level of
operational reliability, they lead to very different results and (partially)
explain the price differences.
Remarks:
a) In the case of the "-1" diode, the quality of the welded contacts must be
verified: the MIL specifications concerning quality assurance contain, for
group C, a pull test at 4.5 kgf (instead of 1.8 kgf for the diode models without
"-1").
b) The "-1" diodes are exempt from the selection tests "fist and bist" (see note 4
of the accelerated ageing programme of MIL-S-19500).
c) Paragraph 3.6.8 of MIL-S-19500 prescribes, for the special quality level
(JANS), a welded contact type, except for the Schottky signal and UHF diodes.

4.3
Z diodes

4.3.1
Characteristics

A particular characteristic of silicon diodes is a marked increase of the reverse
current once the critical voltage value (breakdown value) is exceeded. Silicon
diodes that operate in the breakdown zone are called Z diodes. The works of Dr.
C. Zener, published in 1934 [4.10], deal with the physical phenomena that occur
in this operating domain. Until a few years ago, his name was given to the
stabilising diodes intended particularly to operate in the breakdown zone,
although at breakdown voltages higher than about 5.5 V it is not the Zener effect
that is decisive for the parameter variation, but the phenomenon of avalanche
breakdown, explained by McKay in 1954 [4.11]. In agreement with Dr. Zener, who
refused to give his name to a component which has nothing to do with the Zener
effect, the diodes - called until that time Zener diodes [4.12] - are now called
Z diodes (DIN 41855).
If a Z silicon diode operates in reverse, so that the anode is connected to the
negative pole of the voltage source, the reverse current varies slowly until the
breakdown voltage is reached. Then the reverse current increases very quickly. In
the ideal case, there should be no current in the blocking zone until the
breakdown voltage is reached. Because of semiconductor material impurities, there
is nevertheless a small reverse current IR, which depends on temperature; this
must be taken into account in the case of high-temperature operation. (The current
increases 100 times if the blocking layer temperature varies from -55°C to
+100°C.) If the maximum temperature of the blocking layer is exceeded, the
reliability decreases.
In the stabilisation zone this parameter shows sudden variations; its increase
leads to an increase of the dynamic impedance Z. To achieve good voltage
stabilisation, the impedance must be as small as possible. The sharper the minimum
point, the better the regulation properties. Beyond the minimum point, the
impedance Z decreases as the current increases. Z diodes are utilised in the
breakdown region both for stabilisation and voltage limitation, for circuit
protection against over-voltages, and as noise sources for noise generators.
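The quoted hundredfold increase of IR between -55°C and +100°C can be turned into a rough rule of thumb by assuming a purely exponential temperature dependence; this assumption (and the helper name) is mine, not the text's:

```python
import math

def ir_multiplier(t_c: float, t_ref_c: float = -55.0,
                  factor: float = 100.0, span_c: float = 155.0) -> float:
    """Reverse-current multiplier relative to t_ref_c, assuming IR grows by
    `factor` over `span_c` degrees with a constant per-degree ratio."""
    return factor ** ((t_c - t_ref_c) / span_c)

# Implied per-degree growth and doubling interval under this assumption
per_degree = 100.0 ** (1.0 / 155.0)              # about 1.030, i.e. ~3% per °C
doubling = math.log(2.0) / math.log(per_degree)  # about 23 °C per doubling
print(per_degree, doubling, ir_multiplier(100.0))
```

This is only a sanity-check model; real leakage follows the junction physics, and the datasheet curve should always take precedence.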

4.3.2
Reliability investigations and results

Kim and Misra [4.13] undertook investigations on pn junctions. In particular, a
Z diode with a Z voltage of 15 V was tested. First, the variation of the Z voltage
at a certain value (300 µA) of the breakdown current was measured after an
operating time of 1000 hours. Then the narrow-band noise was measured as a
function of the reverse current. The following results were obtained: the band
noise (at 1 kHz) increases strongly with the breakdown current, reaches a maximum
and then decreases; in the decrease zone one or more further maxima may appear.
The magnitude of the second maximum of the noise curve is
Fig. 4.14 Behaviour of different Z diodes (Uz versus time, in 10⁴ hours) during
ageing after storage at +70°C. Beyond 20,000 hours, the 6.3 V Z diode no longer
operates reliably.

Table 4.1 Results of a comparative reliability study on 400 mW Z diodes, alloyed
and diffused, respectively

Manufacturer   Alloyed Z diodes     Diffused Z diodes
               3...5 V, 400 mW      6...33 V, 400 mW
A              0                    0
B              0                    4*)
C              3**)                 36**)
D              0                    3**)
E              2**)                 1**)

*) Modifications of Uz; **) Modifications of IR

Fig. 4.15 Ageing behaviour of the breakdown voltages of Z diodes (relative voltage
change ΔU/U0 plotted against time), measured at -ID = 1 mA and -ID = 20 mA:
A) Tj = 135°C; B) Tj = 90°C

well correlated with the variation of the Z voltage after an operating time of
1000 hours.
Until now, no quantitative substantiation of the correlation between low-frequency
noise and the estimated lifetime of Z diodes has appeared in the technical
literature. As is well known, such tests take much time and do not always lead to
firm conclusions. If a long test time is not available, it is recommended to
undertake short-time investigations under appropriate operating conditions, and to
compare the results obtained in this way with the existing data concerning the
same component type or a related one. Investigations that take into account one or
more parameters simultaneously (Fig. 4.13), together - if possible - with
comparative 1000-hour tests (Figs. 4.14 and 4.15), are conclusive.
The results of a comparative reliability study of alloyed and diffused Z diodes
operating at 400 mW are presented in Table 4.1 (operating time: 1000 hours at full
load; ambient temperature 25°C). Excepting manufacturer C, the failure rate varies
between 10⁻⁶/h and 7 × 10⁻⁶/h, at a confidence level of 60%.
For a series of tests performed on diodes from four Z diode manufacturers - with
samples of at least 100 items - the most significant failure mode was the increase
of losses for two of the manufacturers (Table 4.2).

Table 4.2 Compared reliability of Z diodes (% defects, after 168 hours operation
at Pmax)

Manufacturer   Alloyed, 400 mW   Diffused, 400 mW   Diffused, 1 W
               (3...5 V)         (6...33 V)         (6...33 V)
A              0                 4.3
B              3.3*)             37.5*)             1.8*)
C              0                 3.4*)              7.1*)
D              0                 0                  0

*) Drift of IR

As for the encapsulation, for the same four manufacturers the results presented
in Table 4.2 were obtained, with the following specifications:
a) For the 400 mW diodes, in DO-35 and DO-7 packages, manufacturer C offered
higher voltages for the glass DO-35, but with greater losses. Manufacturers A and
B used the DO-7 package, which is exposed to internal contamination during
assembling, and therefore the life test results are poorer.
b) For the 1 W diodes, manufacturers A and B supply the device in an epoxy
package; these diodes are also exposed to failures in high-humidity environments.
The utilisation of welded contacts leads to a drift of the losses. Manufacturers C
and D use the DO-14 glass package, due to the small dimensions of the die. This
explains why leakage shifts were measured for A and B. Many manufacturers
experienced a higher level of breakage and intermittence with plastic packages as
a result of automatic insertion. This is not the case with the glass package.

Table 4.3 Mean temperature coefficient (in %/°C) of the Z diodes, between +25°C
and +125°C

Manufacturer   Alloyed, 400 mW   Diffused, 400 mW   Diffused, 1 W
               3 V               12 V               8 V
A              0.055             0.050              0.049
B              0.053             0.059              0.048
C              0.052             0.053              0.043
D              0.053             0.050              0.042
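A coefficient from Table 4.3 translates into an absolute voltage shift as ΔVz = Vz · (TC/100) · ΔT; a small sketch (the function name is mine, the values are taken from the table):

```python
def vz_shift(vz: float, tc_percent_per_c: float, delta_t: float) -> float:
    """Absolute Z-voltage shift (V) for a mean temperature coefficient
    expressed in %/°C, over a temperature excursion delta_t (°C)."""
    return vz * (tc_percent_per_c / 100.0) * delta_t

# 12 V diffused diode, manufacturer A: 0.050 %/°C over +25°C...+125°C
print(round(vz_shift(12.0, 0.050, 100.0), 3))   # 0.6 (volts)
```

A 0.6 V drift on a 12 V reference is substantial, which is why the tempco is characterised as a type test.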

The prediction system for reliability (Tables 4.4 and 4.5) plays an important
role in the improvement of product reliability. On the one hand, it is a tool for
estimating the reliability level of the product during the design and development
phases, which allows the selection of components and circuits, of the system
structure, and of the organisation of the logistic support to be optimised. On the
other hand, it provides target values which can be compared with the measurements
performed in operation.

Table 4.4 Reliability comparisons at the component level

Component                  Quantity   Comp×h (10⁶)   Replaced*)   RIT*)
Z diodes                   193        11,342         11.2         0.99
Resistors with thick film  64         3,761          149.3        39.70
Signal diodes              98         5,759          1.6
Rectifier diodes           163        9,579          9.6          1.00
LED                        50         2,930          3.2          10.92
CMOS SSI/MSI               130        76,930         276.1        36.14

*) For total replacements, data originated from repair centres (component
replacements) were used; for RIT, the measurement unit is replacements/10⁹ hours.
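The RIT column follows directly from the replacement count and the cumulated component-hours; a sketch (the function name is mine) reproducing, for instance, the Z-diode row of Table 4.4:

```python
def rit(replacements: float, component_hours: float) -> float:
    """Replacement intensity: replacements per 1e9 cumulated component-hours
    (same 1e-9/h scale as FIT, but counting field replacements rather than
    confirmed failures)."""
    return replacements / component_hours * 1e9

# Z-diode row: 11.2 replacements over 11,342e6 component-hours
print(round(rit(11.2, 11_342e6), 2))   # 0.99
```

Note that replacement data from repair centres overstates the true failure rate, since some replaced components are later found fault-free; this is one reason RIT and predicted FIT in Table 4.5 diverge.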

Table 4.5 Failure rates, predicted and observed

Component           Predicted    Operation   Predicted    Operation   Predicted
                    1986 (FIT)   (RIT)       1989 (FIT)   (RIT)       1992 (FIT)
Z diodes            14.0         4.5         5.0          9.0         3.0
Signal diodes       3.0          3.0         3.0          1.0         2.0
Signal transistors  20.0         16.0        15.0         9.0         10.0
CMOS SSI/MSI        25.0         40.0*)      20.0**)      14.0        14.0

*) High replacement rate due to systematic replacements; in 1989, the mean
RIT = 28.
**) Value adopted taking into account the decreasing tendency of the failure rate.

4.3.3
Failure mechanisms

In general, the failure mechanisms of semiconductor devices can be divided into
three categories:

a) Electrical stress (in-circuit) failures are event-dependent and directly
related either to poor design (leading to electrical overstress) or to careless
handling of the components (leading to static damage); although the cause of such
failures is misuse, they concern the manufacturer as well as the consumer.
b) Intrinsic² failure mechanisms, produced by crystallographic or processing
defects (holes in the thermally grown oxide or in the epitaxially deposited
layer) arising during die manufacturing.
c) Extrinsic failure mechanisms, which result from the assembling operations.
These mechanisms are a result of the device packaging and interconnection process
(the "back-end" of semiconductor manufacturing). As the technologies mature and
the problems arising in the manufacturers' fabrication lines are ironed out,
intrinsic failures are reduced, thereby making extrinsic failures the more
important ones for device reliability.
It is difficult to establish an order of importance for the failure mechanisms.
Some intrinsic failures may occur across all ambient conditions, but with a much
lower frequency. Some mechanisms are dominant in certain operational and
environmental conditions, while others do not appear under normal operating
conditions and are only induced in the laboratory (being very important for the
manufacturer, but not for the user).
During manufacturing, the device passes through a series of processes in which it
is metallised, passivated, encapsulated, or submitted to combinations of these
operations. To establish the operational reliability of the device, accelerated
life tests and selection tests based on such tests are performed. The following
stresses are used: high temperatures, high current densities, corrosive or highly
irradiating media, or combinations of these.
Because of the complicated processes used in the manufacture of semiconductor
devices, two principal elements are necessary to obtain a reliable product:
• the device should be designed with the newest technologies (mask, passivation,
metallisation system, encapsulation system, materials);
• the reliability must be built into the device with careful manufacturing
control, utilising the newest processing and characterisation techniques and the
best-trained operators.
Reliability results. A series of tests has been performed to compare the
performance of various Z diode manufacturers. All samples were representative of
commercial-grade products and had no preconditioning or special tests. The sample
size was a minimum of 100 units per test for each manufacturer. In operating life
tests, for two manufacturers the most significant failure mode was increased
leakage. Electrical test conditions and limits were in accordance with the JEDEC
registration. As mentioned in Table 4.3, two package ratings have been tested:
400 mW and 1 W.

² Mechanisms inherent to the semiconductor die itself are termed intrinsic. Such
mechanisms include crystal defects, dislocations, and processing defects.
Processing defects, for example, may take the form of flaws in the thermally grown
oxide or the chemically deposited epitaxial layer. Intrinsic failure mechanisms
tend to be the result of wafer fabrication.

The results confirm that no generalised acceleration factors can be used for Z
diodes. The variables are so numerous and so dissimilar that a factor must be
determined for each specific type and technology. Setting up a component
reliability data bank was a successful strategic choice, as it permits integration
with the reliability prediction system regarding the correlation between field
data and prediction models, and the evaluation of component field data.

4.3.3.1
Failure mechanisms of Z diodes

The main parameter of the Z diodes is the diode capacitance at the limits of the
operating voltage range. It is also usual to measure the leakage current and the
series resistance. The temperature coefficient of the capacitance is usually
determined as a type test.
IR drift occurs after extended operating life³. It is usually caused by
contamination near the semiconductor junction, or under, in, or on top of the
passivation layers.
Shorts usually result from thermal runaway due to excessive heat in the
semiconductor junction area and are caused by power dissipation defects.
IR drift or shorts after the humid-environment test are caused by a lack of
hermeticity allowing moisture to reach the semiconductor chip.
Zz drifts are usually caused by changes in the chip's ohmic contacts.
The glass package has good hermeticity and a good metallisation system, which
guarantees ohmic contact integrity on all Z diodes.
If the surface conductivity is increased by impurities introduced by ionic
contaminants, a gradual increase of the leakage current is observed when the
devices are in the "off" state [4.14]. For diodes, a key parameter is the energy
gap between the valence and conduction bands of the semiconductor material. It has
been found [4.15] that mechanical stresses in silicon can reduce this energy gap
and, as a consequence, the "on" voltage of the devices. Thermal stress may cause
degradation of the device characteristics by impairing the junctions. Migration of
the dopant impurities can lead to short-circuits and subsequently to burn-out of
the pn junction. Contact migration (another form of aluminium migration; the
physical process governing the movement of the aluminium atoms is, however,
different from that of electromigration) is a particular problem of Schottky
diodes. All diodes and thyristors must be designed within the specifications
required to prevent electromigration of the metallisation.
Electrical overstress (EOS) is the major mechanism affecting these devices. They
are sensitive to static potentials and can be destroyed by a permanent breakdown
across the reverse-biased junction.

³ The use of silicon nitride on planar junctions (as a barrier against ionic
contamination) virtually eliminates this type of drift.

4.3.3.2
Design for reliability

The reliability of Z diodes can be greatly impaired by even slight variations in
processing, materials or techniques (the purity of water or chemicals, the
temperature or the humidity during processing, the impurity content of piece
parts, variations in operator training). The processes and the controls must be
designed to eliminate these variations.
Silicon nitride passivation is an almost impenetrable barrier to the abundant
alkali elements, especially sodium. The presence of these mobile positive ions in
a passivation layer causes high surface fields which can lead to device
degradation (IR drift) or complete failure. The improvement in life test
performance is attributed to the silicon nitride, which prevents the migration of
mobile ionic contaminants under the influence of a reverse bias; contaminants such
as lithium, sodium, and potassium are the major causes of long-term reverse-bias
failures.
The deposited oxide layer provides scratch protection during final fabrication
and increases the thickness of the dielectric passivation on the surface of the
chip. This increase in thickness reduces the electric field strength through the
passivation, eliminating the dielectric breakdown and arc-over under reverse bias
that cause shorts.

4.3.3.3
Some general remarks

Most generalised data relate the failure rate λ to the normalised (ambient or
junction) temperature of the device. It is important to note that temperature is
the only acceleration factor taken into account. Sometimes ambient and dissipation
effects are unrelated. In any case, operating voltage is not used as an
acceleration factor.
In order to facilitate the comparison between various results, acceleration
factors based on the normalised junction temperature and on the ratio of
normalised junction temperatures are given in Table 4.6.

Table 4.6 Catastrophic failures

Diode type     Failure rate   Acceleration factor
                              r = 0.0     r = 1.0
Si, junctions  0.1            1.0         1.0
Si, power      0.1            3.5         35
Si, Z          0.1            1.3         13
Si, Z          0.3            3.0         10
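Temperature-only acceleration of the kind discussed above is commonly modelled with the Arrhenius law; this is a standard choice, not something the table states explicitly, and the activation energy and temperatures below are purely illustrative:

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_af(t_use_c: float, t_stress_c: float, ea_ev: float) -> float:
    """Arrhenius acceleration factor between a use and a stress junction
    temperature (given in degrees Celsius), for activation energy ea_ev (eV)."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_use - 1.0 / t_stress))

# Illustrative: Tj = 55°C in use vs 125°C on test, assumed Ea = 0.7 eV
print(round(arrhenius_af(55.0, 125.0, 0.7)))   # 78
```

The strong sensitivity to the assumed Ea is precisely why, as stated above, an acceleration factor must be determined for each specific type and technology rather than taken from generalised sources.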

From Table 4.7 one may see that the degradation failure rates found in practice
are considerably higher than those given by the generalised sources (which relate
to catastrophic failures: open- or short-circuits).

Table 4.7 Degradation failures

Diode type     Failure rate at normalised junction temperature 1.0 (λ/Mh)
               Minimum      Average      Maximum
Si, Z          0.8          15           105
Si, junctions  0.05         3            14

In Table 4.8, data from FRD cards, restricted (where possible) to catastrophic
failures alone, are shown. The failure rate values are better by about an order of
magnitude for Z diodes, but only marginally better for silicon diodes.
It is difficult to find an explanation for the discrepancies between Tables 4.6
and 4.8. The differences are probably due to several factors, such as the
dissimilarity of the types concerned, the preponderance of military types in the
FRD-card data, etc.
Few statistical data are available in the literature on failure mechanisms or
failure modes for diodes. A distribution of the failure modes is given in
Table 4.9.

Table 4.8 Catastrophic failures, FRD cards

Diode type     Failure rate at normalised junction temperature 1.0 (λ/Mh)
               Minimum      Average      Maximum
Si, Z          0.1          0.8          2.6
Si, junction   0.09         1.1          2.2

Table 4.9 The distribution of the typical failure modes

Diode type               Typical failure modes              Conditions
                         Short-circuit   Open    Drift
Z                        0.3             0.1     0.6        Electrical load
Rectifiers (metal-case)  0.35            0.1     0.45       Electrical load
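The mode fractions of Table 4.9 can be combined with an overall failure rate (for instance the Table 4.8 average for Si Z diodes) to apportion per-mode failure rates, a common step in FMEA-style analyses; the pairing of the two tables here is my illustration, not the authors':

```python
def apportion(total_rate: float, mode_fractions: dict) -> dict:
    """Split an overall failure rate into per-mode rates using a
    failure-mode distribution (the fractions must sum to 1)."""
    assert abs(sum(mode_fractions.values()) - 1.0) < 1e-9
    return {mode: total_rate * f for mode, f in mode_fractions.items()}

# Si Z diode: 0.8 failures/Mh average (Table 4.8), mode split from Table 4.9
rates = apportion(0.8, {"short": 0.3, "open": 0.1, "drift": 0.6})
print({m: round(v, 2) for m, v in rates.items()})
# {'short': 0.24, 'open': 0.08, 'drift': 0.48}
```

Per-mode rates matter because a short-circuit and a drift failure usually have very different consequences at the circuit level.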

4.3.3.4
Catastrophic failures

Catastrophic failures of mechanical origin may have various causes. A typical
catastrophic failure of electrical origin is caused by a voltage transient with a
steep wave front, which can produce breakdown of the pn junction with a
short-circuit as result. Results similar to those produced by a voltage transient
may arise when the diode is switched rapidly from the forward to the reverse
direction. This is due to the effects of reverse recovery phenomena.

In some junction and Z diodes, pressure contact is maintained by a spring of
diminished resilience, where the initial pressure is only barely sufficient, so
that failure may occur in time, especially when the diode is subjected to
variations in ambient temperature or in loading. Incipient failures of this type
can be screened out to some extent by thermal cycling (-55°C to +150°C). The most
usual form of damage is cracking. A loose fragment may cause intermittent or
permanent short-circuits. If the crack runs across the active area, the actual
breakdown voltage of the diode may be reduced by direct flashover across the
exposed pn junction.

4.3.3.5
Degradation failures

Mechanical degradation, in the form of partially failed bonds or broken dice,
leads to corresponding increases in the forward voltage drop, with electrical
degradation as a result. Moreover, such local degradations often lead to local
thermal runaway and total failure.
Misalignment may result in a very small insulation path, where moisture or ion
concentration may lead to high leakage. High and often unstable leakage currents
may occur as a result of the oxide passivation being bridged by effects such as
purple plague.

4.4
Trans-Zorb⁴ diodes [4.15]...[4.21]

4.4.1
Introduction

A transient protection diode is, in principle, a Zener diode for current peaks,
with a short response time. Since these diodes are frequently subjected to strong
overloads (for example, inductive loads or switched capacitances, electrostatic
charges, "flash" discharges), long-term testing at their introduction into
circuits is needed to obtain good reliability. Usually, such lifetime data do not
appear in the manufacturer's data sheet. The user therefore needs to perform
adequate tests and to find the diode type that withstands the overload for the
longest time. Such a test, its evaluation and the results obtained are presented
in [4.3].

4.4.2
Structure and characteristics

To satisfy the specifications for rapid response and strong breakdown currents,
the Trans-Zorb diodes are designed as large-area avalanche diodes. For very high
limiting voltages, two diodes are connected in series within one package. The
mechanical structure, the current distribution over the chip surface, the
uniformity of the silicon material, and the protection of the edges of the crystal
are decisive factors for the lifetime of such a diode. For the diode 1N5907, for
example, an internal short-circuit appeared after approximately 400 pulses in the
last load test (125% IPP), due to a newly formed traversing alloy [4.20]. In the
pulse voltage test, no noticeable heating or thermal fatigue of the Trans-Zorb
diode was found. The data sheet values were met or even exceeded.

⁴ The denomination "Trans-Zorb" (transient Zener absorber) is a trade mark of the
American company General Semiconductor Industries, Inc.

4.5
Impatt (IMPact Avalanche and Transit-Time) diodes

This is a power microwave device (Fig. 4.16) [4.21]...[4.25] whose negative resistance characteristic is produced by a combination of impact avalanche breakdown and
charge-carrier transit-time effects. Avalanche breakdown occurs when the electric
field across the diode is high enough for the charge carriers (holes or electrons) to
create electron-hole pairs. With the diode mounted in an appropriate cavity, the
field patterns and drift distance permit microwave oscillations or amplification.
Impatt diodes are used at higher junction temperatures and higher reverse bias
than other semiconductor devices. This has required the elimination of potential failure mechanisms which might not develop at lower temperatures. Surface contamination can cause excess reverse leakage current. Devices with surface contamination are eliminated during a high-temperature reverse-bias screen conducted
on all Impatt diodes. Process cleaning steps have also been developed to minimise
yield loss.

Fig. 4.16 Impatt diode chip in hermetically sealed package, with copper stud at bottom serving as terminal and heatsink. The other terminal is at the top

Titanium has been a traditional bonding metal in three-layer metallisation systems on silicon Impatt diodes. The metallisations include the Cr-Pt-Au system used for ohmic contacts in microwave Impatt diodes. The use of Pt or Pt-Ti as a barrier layer between the gold and the contact metal is becoming increasingly widespread. It should be noted that the semiconductor material used for these Impatt diodes is GaAs.
GaAs. A major mode of Impatt degradation is the interdiffusion of Au through Pt

forming metallic spikes which extend into the GaAs; the metallic spikes so formed short-circuit the junction either in the bulk or at the metal-GaAs interface. In addition, since more than 90% of the DC input power can be dissipated in the high field region [4.28], the attendant rise in junction temperature can result in a concomitant increase in leakage current. Bonding and metallisation are generally responsible for a high percentage of semiconductor failures. For Impatt diodes, 100% thermal resistance testing and 100% high temperature reverse bias testing effectively screen out the devices with weak die attach, metal contact or bonding. Process controls developed through feedback from 100% testing have minimised these fabrication defects. The result is a highly uniform and reliable product. Small process changes are detrimental to high reliability Impatts with a required MTBF greater than 10⁵ h for operation at 175°C, since the performance of these high efficiency diodes depends critically on the exact doping profile of the epitaxial layer. The degradation indicates time-temperature-dependent changes in the PtGa-PtAs layers.
Diffusion of the contact metal into the semiconductor material is another cause of failure. This failure mode is controlled by the choice of metals used in the contacting system, the control exercised while applying those metals, and the junction temperature. For any given metallisation system, the diffusion of the contact metal into the semiconductor is an electrochemical process. The failure rate due to this diffusion can be described by the Arrhenius equation:
λ = λ0 exp(−φ/kT) (4.1)
where λ = failure rate; λ0 = a constant; φ = activation energy (eV); T = temperature (K); and k = Boltzmann's constant (8.62 × 10⁻⁵ eV/K).
The Arrhenius equation has been widely used and its validity has been demonstrated for many semiconductor failure mechanisms. The value of φ depends on the specific failure mechanism and is about 1.8 eV for metal diffusion into silicon.

λ/λ473 = exp{−(1.8/k)[(1/T) − (1/473)]}


Fig. 4.17 Effect of junction temperature on failure rate for φ = 1.8 eV (normalised failure rate λ/λ473 plotted against 1000/T, in 1/K)



For a known mechanism, the activation energy can be used to project the failure rate at one temperature to a corresponding failure rate at another temperature. The acceleration factor is the ratio of the failure rates at the two temperatures (Fig. 4.17):
λT1/λT2 = exp{−(1.8/k)[(1/T1) − (1/T2)]}. (4.2)
The failure rate due to surface leakage also follows the Arrhenius equation. However, the associated activation energy is 1.0 eV. Thus, if ionic contamination is present, failure will result before metal diffusion occurs.
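As a sketch of how Eqs. (4.1) and (4.2) can be applied numerically (the helper names and example temperatures are ours, not from the text; the activation energies of 1.8 eV and 1.0 eV are the values quoted above):

```python
import math

K_BOLTZMANN = 8.62e-5  # Boltzmann's constant in eV/K

def failure_rate(lambda0, phi_ev, temp_k):
    """Arrhenius failure rate, Eq. (4.1): lambda = lambda0 * exp(-phi / kT)."""
    return lambda0 * math.exp(-phi_ev / (K_BOLTZMANN * temp_k))

def acceleration_factor(phi_ev, t1_k, t2_k):
    """Ratio lambda(T1) / lambda(T2), Eq. (4.2)."""
    return math.exp(-phi_ev / K_BOLTZMANN * (1.0 / t1_k - 1.0 / t2_k))

# Metal diffusion into silicon (1.8 eV): raising Tj from 200 degC (473 K)
# to 250 degC (523 K) accelerates this mechanism by roughly 70x.
print(acceleration_factor(1.8, 523.0, 473.0))
# Surface leakage (1.0 eV) is accelerated far less (roughly 10x), which is
# why ionic contamination dominates at lower temperatures.
print(acceleration_factor(1.0, 523.0, 473.0))
```

The same ratio, normalised to 473 K, is what Fig. 4.17 plots.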

4.5.1
Reliability test results for HP silicon single drift Impatt diodes

All Hewlett Packard (HP) diodes of this type are burned-in for at least 48 hours at
a junction temperature Tj exceeding the maximum rating of 200°C. The following
tests were performed on HP standard production units [4.21], taken from inven-
tory:
Test 1. - Operating lifetest. Units were tested at the maximum recommended Tj. (104 diodes tested at Tj = 200°C for a total of 344 000 device hours. Failures: 2; λ = 0.58 × 10⁻⁵ h⁻¹; MTBF = 172 000 h).
Test 2. - Storage lifetest. Units were tested at the maximum recommended storage temperature. (54 diodes tested at a storage temperature of 150°C for a total of 153 000 device hours. Failures: 0; λ ≤ 0.65 × 10⁻⁵ h⁻¹; MTBF ≥ 153 000 h).
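The failure-rate and MTBF figures quoted above follow directly from the device-hours and failure counts. A minimal sketch of the arithmetic (function names are ours; with zero failures the figures are bounds, obtained here by assuming one failure):

```python
def observed_failure_rate(failures, device_hours):
    # With zero failures, assume one failure to obtain an upper bound,
    # which reproduces the "<=" figures in the HP data above.
    return max(failures, 1) / device_hours

def mtbf(failures, device_hours):
    # MTBF estimate; a lower bound when no failure occurred.
    return device_hours / max(failures, 1)

# Test 1: 344 000 device-hours, 2 failures
print(observed_failure_rate(2, 344_000))   # ~0.58e-5 per hour
print(mtbf(2, 344_000))                    # 172 000 h
# Test 2: 153 000 device-hours, 0 failures -> bounds only
print(observed_failure_rate(0, 153_000))   # <= ~0.65e-5 per hour
print(mtbf(0, 153_000))                    # >= 153 000 h
```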

4.5.2
Reliability test results for HP silicon double drift Impatt diodes

These diodes are all burned-in for at least 48 hours at a junction temperature ex-
ceeding the maximum rating of 250°C. The following tests were performed on HP
standard production units [4.21], taken from inventory:
Test 1. - Accelerated lifetest. Units were tested at a junction operating temperature far exceeding the recommended maximum, in order to accelerate the failure mechanism. [12 diodes tested at Tj = 350°C for a total of 77 000 device hours. Failures: 3 (1 unit < 48 h; 1 unit < 96 h; 1 unit ≈ 6700 h). λ1 = 3.9 × 10⁻⁵ h⁻¹ at Tj = 350°C (extrapolating this result to Tj = 250°C gives λ2 ≤ 0.01 × 10⁻⁵ h⁻¹); MTBF1 ≈ 25 667 h; MTBF2 > 10⁷ h].
Test 2. - Operating lifetest. Units were tested at the maximum recommended junction operating temperature. [29 diodes tested at 250°C junction operating temperature for a total of 249 000 device hours. Failures: 0; λ ≤ 0.4 × 10⁻⁵ h⁻¹; MTBF ≥ 249 000 h].
Test 3. - Operating lifetest. [29 diodes tested at 225°C junction operating temperature for a total of 246 500 device hours. Failures: 0; λ ≤ 0.41 × 10⁻⁵ h⁻¹; MTBF ≥ 246 500 h].
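The extrapolation quoted in Test 1 can be reproduced with the Arrhenius relation of Eq. (4.2), assuming the 1.8 eV activation energy given earlier for metal diffusion (a sketch, not the HP calculation itself):

```python
import math

K = 8.62e-5  # Boltzmann's constant in eV/K

def extrapolate_rate(lambda_test, phi_ev, t_test_k, t_use_k):
    # Scale a failure rate measured at t_test_k down to t_use_k
    # using the Arrhenius relation of Eq. (4.2).
    return lambda_test * math.exp(-phi_ev / K * (1.0 / t_use_k - 1.0 / t_test_k))

lam_350 = 3.9e-5   # measured: 3 failures in 77 000 device-hours at 350 degC
lam_250 = extrapolate_rate(lam_350, 1.8, 350.0 + 273.0, 250.0 + 273.0)
print(lam_250)     # ~0.006e-5 per hour, i.e. MTBF above 1e7 h,
                   # consistent with the lambda2 <= 0.01e-5 quoted above
```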
These diodes are relatively easy to stabilise against bias circuit instabilities. Simple biasing schemes (such as those described in HP AN935 [4.22]) have been found to result in reliable low noise operation under proper RF tuning conditions. These

circuits have been shown to be effective in eliminating tuning-induced burnout and bias-circuit oscillations in GaAs Impatt oscillators. Perhaps the most commonly observed and frustrating failure mechanism in silicon double drift Impatt diodes is the one resulting from improper RF tuning. Tuning-induced burnout can easily be avoided once the circumstances leading to these failures are understood.

4.5.3
Factors affecting the reliability and safe operation

In most cases, it is possible to avoid failures by taking into consideration four predominant failure mechanisms: (i) fabrication defects; (ii) excessive Tj; (iii) bias circuit related burnout; (iv) tuning-induced burnout.
Fabrication defects. Excessive surface leakage current or metallisation
overhang in a defective diode can lead to early failure, even under normally safe
operating conditions. Careful screening with a high Tj burn-in procedure is also
recommended. Where extremely reliable operation in harsh environments is
required, a screening and preconditioning program is recommended.
Excessive Tj. The long-term intrinsic operating lifetime is directly related to the average Tj. For a given Tj, the failure rate is then critically dependent on the particular metallisation scheme used to contact the silicon chip. The HP metallisation system used on double drift Impatt diodes has been shown to result in extremely high reliability under severe conditions. For example, the median MTTF (defined as the time to failure of 50% of a population of devices) at an operating Tj of 250°C has been calculated to be 2 × 10⁶ hours.
Bias circuit related burnout. The frequency band of small-signal negative
resistance in an Impatt diode is limited by transit-time effects to approximately
1.5 octaves at microwave frequencies. When operated as a free-running oscillator
or amplifier under large-signal conditions, however, an Impatt diode develops an
induced negative resistance at lower frequencies (this effect is less serious in silicon than in GaAs Impatt diodes). An improperly designed biasing network that
resonates with the diode can thus result in bias circuit oscillations and excessive
noise. In certain cases, the transient current that results from the discharging of
any bias circuit capacitance shunting the diode can lead to failure. Shunt
capacitance should therefore be kept to an absolute minimum.
Tuning-induced burnout. Tuning-induced burnout can be easily avoided after
understanding the circumstances that result in these failures.
(a) Load resistance and safe operation (Fig. 4.18). Oscillation does not occur for load resistances greater than Ro, the magnitude of the diode's small-signal negative resistance. Output power increases as Rload is reduced below Ro until the maximum obtainable power is achieved for Rload = R2. It has been experimentally determined that the onset of power saturation in silicon double drift Impatt diodes results from large-signal limiting of the RF chip voltage amplitude to a maximum value of approximately 0.35 times the d.c. bias voltage. In general, R2 will be between one-half and one-third of the small-signal negative resistance Ro. For Rload less than R2, the output power decreases sharply due to the saturation of the RF voltage. Failure is likely to occur when Rload is significantly less than R2.
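The quantitative guidance in (a) can be collected into a small sketch (the function names and example values are ours; the Ro/3...Ro/2 window for R2 and the 0.35 × Vdc limit are the figures quoted above):

```python
def optimum_load_window(r0_ohms):
    # Band in which the optimum load resistance R2 normally falls:
    # between one third and one half of the small-signal negative
    # resistance Ro.
    return r0_ohms / 3.0, r0_ohms / 2.0

def max_rf_voltage(v_dc_bias):
    # Large-signal limit of the RF chip voltage amplitude,
    # approximately 0.35 times the d.c. bias voltage.
    return 0.35 * v_dc_bias

low, high = optimum_load_window(6.0)   # hypothetical Ro = 6 ohm
print(low, high)                       # 2.0 3.0
print(max_rf_voltage(90.0))            # ~31.5 V at a hypothetical 90 V bias
```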

Fig. 4.18 The influence of circuit load resistance on output power for either a pulsed or CW Impatt in a circuit which resonates the diode at a single frequency foo. The pulsed or d.c. operating current is kept fixed at Io

One possible mechanism, which might be responsible for diode burnout under this condition, has been described in [4.24]; it is suggested that the low-frequency negative resistance induced by large RF modulation could lead to a transversely non-uniform current density within the diode.
(b) Threshold current and optimum tuning. Tuning-induced failure can, in general, be avoided by paying careful attention to the relationship between power output and bias current for a particular diode. The three curves in Fig. 4.18b illustrate output power versus bias current corresponding to the three values of Rload indicated in Fig. 4.18a. For single frequency operation at foo there is an unambiguous one-to-one relationship between the threshold current where oscillation begins and the value of the load resistance. Once the optimum load resistance has been determined for a particular diode, the corresponding threshold current can be used as an indicator of unsafe circuit loading. Figure 4.18b shows that the threshold current ITH3 for a load resistance of R3 is considerably less than ITH2, which corresponds to the optimum load resistance for the desired operating current Io. The observation of a threshold current less than ITH2 would therefore indicate that an unsafe overload condition would exist if the bias current were increased to Io.
Although a load resistance of R3 would be unsafe for operation at a bias current of Io, it would result in optimum performance at some lower bias current. A rough
but useful rule-of-thumb for double drift silicon Impatt diodes is that for optimum
tuning, the threshold current will be approximately one third of the desired
operating current. The threshold current corresponding to maximum output power
at a particular bias current is also a weak function of the fixed frequency of
oscillation. In general, the optimum threshold current will increase slightly as the
operating frequency is increased within the useful frequency range of a diode.
For diodes of the same type it is important to realise that the optimum threshold
current for operation at a particular output power or operating current may vary as
much as ± 10% from diode to diode, due to differences in the packages or chip
negative resistances.
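The rule of thumb above can be turned into a simple loading check (a sketch under the stated assumptions: optimum threshold near one third of the intended operating current, with roughly ±10% diode-to-diode spread; the function name is ours):

```python
def loading_is_safe(i_threshold, i_operating, spread=0.10):
    # For optimum tuning the oscillation threshold current sits near
    # i_operating / 3; a measured threshold well below that band
    # indicates the load resistance is too small for i_operating.
    expected = i_operating / 3.0
    return i_threshold >= expected * (1.0 - spread)

print(loading_is_safe(0.33, 1.0))   # threshold near I0/3 -> True
print(loading_is_safe(0.20, 1.0))   # well below I0/3 -> False: overload risk
```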

(c) Coaxial and waveguide cavities. The curves in Fig. 4.18a and 4.18b are
useful for achieving safe operation of diodes which remain resonated at a single
frequency approximately independent of the bias current and the RF voltage
amplitude. For this reason, single-transformer coaxial cavities are recommended
for initial device characterisation because they are broadband, well behaved and
relatively easy to understand. Noise, stability or resistive power loss
considerations may, however, ultimately require the use of a higher Q waveguide
cavity. Great care should be taken in this case to ensure singly resonant operation
and avoid tuning-induced failures due to improper loading. Below the waveguide
cut-off frequency, the Impatt diode is decoupled from the external load and a
short-circuit may arise at the plane of the diode. The use of absorptive material in
the bias circuit can be an effective solution to this problem. The large harmonic
voltages that are easily generated in waveguide cavities can also play a part in
tuning-induced failures. It has been found that these failures can be eliminated if a
sliding load for the next higher frequency band replaces the commonly used
sliding short.

References

4.1 Perfectionnement aux diodes Zener. Brevet français no. 1422532, 17.3.1967
4.2 Băjenescu, T. I. (1981): Zuverlässigkeit von Halbleiterdioden und Gleichrichtern. Feinwerktechnik und Messtechnik, vol. 89, no. 8, pp. 388-392
Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. VDE Verlag, Berlin
Băjenesco, T. I. (1986): Fiabilité des diodes BYWn. Electronique, no. 11
Băjenesco, T. I. (1987): Diodes de puissance sous la loupe. MSM no. 16, pp. 20-23
4.3 Stohr, H. J. (1962): Bemerkungen zum Stabilisierungsverhalten von Zenerdioden. Elektronische Rundschau no. 7, pp. 297-301
4.4 Valdman, H.: Diodes régulatrices de tension: diodes Epi-Z ou nouvelles conceptions des diodes Zener. L'Onde électrique vol. 51, no. 4
4.5 Noble, P. G.: Zum Datenblatt von Gleichrichterdioden. Elektroniker, vol. 19, no. 1, pp. EL7-EL13
4.6 Gerlach, A.: Die Zenerdiode. Bulletin SEV, vol. 53, no. 25, pp. 1228-1237
4.7 Bair, B. L. (1967): Semiconductor Reliability Program Design. Proceedings of Symposium on Reliability, pp. 612-624
4.8 Ackmann, W. (1976): Zuverlässigkeit elektronischer Bauelemente. Hüthig Verlag, Heidelberg
4.9 Băjenescu, T. I. (1985): Corrélation technologie-fiabilité: cas des diodes de signal. Electronique, no. 5, p. 35
Băjenesco, T. I. (1981): Problèmes de la fiabilité des composants électroniques actifs actuels. Masson, Paris
4.10 Zener, C. (1934): A Theory of the Electrical Breakdown of Solid Dielectrics. Proceedings of the Royal Society, Series A, vol. 145, no. 855, pp. 523-529
4.11 McKay, K. G. (1954): Avalanche Breakdown in Silicon. Physical Review, vol. 94, no. 4, pp. 877-884
4.12 Sydow, R. (1977): Z-Dioden, integrierte Stabilisierungsschaltungen und Spannungsregler; Grundlagen und Anwendungen. Intermetall, ITT
4.13 Kim, Y. D.; Misra, R. P. (1969): IEEE Transactions on Reliability, pp. 197-204
4.14 Amerasekera, E. A.; Campbell, D. S. (1987): Failure Mechanisms in Semiconductor Devices. John Wiley and Sons, Chichester & New York
4.15 Zalar, S. M. (1981): The Effect of Insulation Coating on Forward Degradation in Bipolar Transistors. 19th Ann. Proc. Int. Rel. Phys. Symp., pp. 257-263
4.16 Lebensdauer und Standfestigkeit von Transienten-Schutzdioden. Elektronik (1980), no. 2, pp. EL36-EL38
4.17 Adair, R. P.: Guidelines for Using Transient Voltage Suppressors. Unitrode Application Note U-79
4.18 Pizzicaroli, J. J. (1977): A Comparison Report of General Semiconductor Industries, Inc.; Trans-Zorb versus Silec Transient Voltage Suppressors, November 9
4.19 General Semiconductor Industries, Inc., Application Notes 1009, 1010
4.20 * * * (1980): Lebensdauer und Standfestigkeit von Transienten-Schutzdioden. Elektroniker, no. 2, pp. EL36-EL38
4.21 Hewlett Packard. Application Notes 959-1; 959-2; AN935
4.22 Kurokawa, K.; Magalhaes, F. M. (1971): An X-Band 10 Watt Multiple-Impatt Oscillator. Proceedings IEEE, vol. 59, pp. 102-107
4.23 van Iperen, B. B. (1974): Efficiency Limitation by Transverse Instability in Si Impatt Diodes. Proceedings IEEE (Lett.), pp. 284-285
4.24 Peck, D. S.; Zierdt, C. H. (1974): The Reliability of Semiconductor Devices in the Bell System. Proceedings of IEEE, vol. 62, no. 2, pp. 185-211
4.25 Brackett, C. A. (1973): The Elimination of Tuning-Induced Burnout and Bias-Circuit Oscillations in Impatt Oscillators. B.S.T.J., vol. 52, pp. 271-307
4.26 Bonding Handbook. Small Precision Tools, 28 Paul Drive, San Rafael, CA 94903
4.27 Dascalu, D. (1968): Space-charge waves and high-frequency negative resistance of SCL diodes. Internat. Journ. Electron., vol. 25, pp. 301-304
4.28 Howes, M. J.; Morgan, D. V. (1981): Reliability and Degradation. John Wiley & Sons, Chichester
4.29 Copeland, J. A. (1967): LSA oscillator-diode theory. J. Appl. Phys., vol. 38, pp. 3096-3102
4.30 Dascalu, D. (1969): Small-signal impedance of space-charge-limited semiconductor diodes. Electron. Lett., vol. 5, pp. 230-231
4.31 Dascalu, D. (1968): Transit-time effects in bulk negative-mobility amplifiers. Electron. Lett., vol. 4, pp. 581-583
4.32 Dascalu, D. (1966): Detection characteristics at very high frequencies of the space-charge-limited solid-state diode. Solid-State Electronics, vol. 9, pp. 1143-1148
4.33 Dascalu, D. (1967): Detection properties of space-charge-limited diodes in the presence of trapping. Solid-State Electronics, vol. 10, pp. 729-733
5 Reliability of silicon power transistors

5.1
Introduction

The explosive development of semiconductor technology imposes greater demands on the quality and reliability of the components. The failure rate over the life of a device or system is an increasingly important issue, and failure analysis helps to improve product quality.
The power transistor is an important element of the interface between the control electronics and the power electronics. The higher currents and voltages have led to new absolute limit values for the dissipated power of these components. Besides the well-known thermal power dissipation corresponding to the operating state, there are some specific limits for bipolar transistors: one for pulsed operation and another for second breakdown.
At first sight, quality seems to be a relative notion for a power transistor, but this relativism can be overcome by defining the electrical properties. In practice, the data sheet is the simplest quality certificate: the smaller the number of bad components, the higher the quality level. Usually, the manufacturer establishes a maximum temperature that the silicon crystal may reach during the operating life. This value arises from the quality and reliability criteria; it is justified by the risk of contamination of the oxide or of the protecting layers of the active junction area, and it takes into account the solder "fatigue" produced by mechanical expansion stresses. Generally, the value specified for this temperature varies from 200°C for metal encapsulated transistors to 150°C for plastic encapsulated ones [5.1][5.2]. Because the crystal surface measures some tens of mm², one must admit that the specified limit value for the junction temperature is exceeded at some points of the crystal, even if the average value is below this limit.
Experimentally, it has been shown that transistors can be destroyed at some operating points even if the limit given by the dissipation hyperbola is not surpassed. This means that the concept of maximum junction temperature is insufficient for the safe use of power transistors. Consequently, the producers perform testing at the power limit, the transistors being driven through the base in order to reach an operating point at the limit specified by second breakdown.
Recent advancements in the manufacturing of power transistors allow extending their application field. Due to the values obtained for currents, voltages and switching speed, electronic systems operating in the power range of some kW have been realised.

T. I. Băjenescu et al., Reliability of Electronic Components
© Springer-Verlag Berlin Heidelberg 1999

The technical problems raised by circuits with power transistors are simple and allow small manufacturing costs, small dimensions and small weight. Moreover, high frequency operation produces fewer disturbances to the power supply than classical devices [5.3].
Generally, these components work in supply circuits having a small source impedance. Therefore, overvoltages with multiple causes can appear [5.4]. For instance, voltage peaks arise in an inductive circuit at turn-off or due to disturbances transmitted on the line. These overvoltages represent a great danger for a semiconductor device in the off state, because in this case the device acts like a dielectric. This phenomenon occurs mainly for components without a "controlled avalanche" characteristic and, particularly, for transistors sensitive to second breakdown. One must realise that the energies involved are considerable and their suppression is difficult because of the low source impedance.
The protection device, which prevents the circuit voltage from reaching a dangerous value, must satisfy the following conditions:
• cause no losses in normal operation,
• be an effective limiter for overvoltages, with a rapid turn-on characteristic,
• absorb the delivered energies without being destroyed.
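For the inductive turn-off case mentioned above, the energy the limiter must absorb is the energy stored in the inductance, E = ½LI² (a standard formula, not given in the text; the function names and example values are ours):

```python
def inductive_energy_joules(l_henry, i_amp):
    # Energy stored in the inductance at the moment of turn-off,
    # E = 0.5 * L * I^2; this is what the suppressor must absorb.
    return 0.5 * l_henry * i_amp ** 2

def suppressor_ok(clamp_v, rated_energy_j, v_max_device, pulse_energy_j):
    # The limiter must clamp below the protected device's maximum
    # voltage and survive the delivered energy.
    return clamp_v < v_max_device and rated_energy_j >= pulse_energy_j

e = inductive_energy_joules(10e-3, 5.0)     # 10 mH interrupted at 5 A
print(e)                                    # 0.125 J
print(suppressor_ok(180.0, 0.5, 200.0, e))  # True for this hypothetical part
```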
Today, it is known that the difficulties encountered with power transistors arise from bad conditions of use. This means that the most important technological problems linked to component reliability have been solved. Research on the operating conditions (detailed definition of the protection specifications, correct choice of components and avoidance of errors in use) remains to be done. Because their essential function is linked to high energy levels, the practical conditions of use for power transistors are a decisive factor in the quality of the systems using them. Experimentally, it has been shown that the lack of this information always leads to failures. Consequently, the user must ask the following:
• Does the transistor have complete and correct specifications? Is it correctly mounted, complying with these specifications?
• If yes, can an inherent reliability be obtained for the specified mode of use?

5.2
Technologies and power limitations

There are two basic variants for power transistors: bipolar transistors (operating
with minority carriers) and unipolar transistors (operating with majority carriers).

5.2.1
Bipolar transistors

Three main technologies, summarised in Table 5.1, each containing some variants, are used for the manufacturing of bipolar transistors. Only one of them (volt/base and collector) seems to be adequate for power transistors.

Table 5.1 The main technologies used to manufacture silicon transistors

| Small signal transistors | Medium power transistors | Power transistors |
|---|---|---|
| Volt / base | Volt / collector | Volt / base and collector |
| Homogeneous base | Homogeneous collector | Homogeneous base and homogeneous collector |
| Diffusion or epitaxy | Epitaxy or diffusion | Epitaxy only |
| | More appropriate for linear integrated circuits | More appropriate for power transistors |
| | | Higher switching time |

The technological variants for power transistors are presented in Table 5.3. Special attention must be given to the mounting operation and especially to the wire bonding. A comparison between the main bonding techniques is presented in Table 5.2.

Table 5.2 Main bonding techniques for silicon transistors

| Technique | Method | Material | Advantages | Disadvantages |
|---|---|---|---|---|
| Thermocompression | High temperature and pressure | Gold wire; ribbon | Very small surfaces | Expensive for large surfaces |
| Needle | High temperature and pressure | Gold wire | More robust than thermocompression | A large contact surface is needed |
| Ultrasonic | Ultrasonic bond | Gold or aluminium wire | Avoids gold-aluminium problems | Expensive for large surfaces |
| Wire solder | Wire inserted in melted solder | Solderable wire | Moderate cost | A large contact surface is needed |
| Clip solder | Position the clip and solder | Bronze clip with phosphorus or nickel | Low cost | A large contact surface is needed |

A power transistor is a current amplifier with parameters depending on the structure and on the layout. The four interdependent fundamental parameters are:
• The breakdown voltage (depending on the resistivity of the less doped side of the silicon junction).
• The current gain (depending on the injection efficiency of the emitter, the lifetime in the base, the base width and the surface recombination).
• The switching speed (depending on the transistor capacitances and resistances).
• The dissipated power (given by the structure dimensions, the case and the second breakdown characteristics).

Table 5.3 Technological variants for power transistors

| Technological variant | Advantages | Disadvantages |
|---|---|---|
| Simple diffusion (hometaxial base) | Robust | Slow; inadequately limited voltage for switching; expensive |
| Mesa with double diffusion | Rapid | High saturation resistance |
| Planar with double diffusion | Rapid; small losses | High saturation resistance |
| Triple diffusion | Rapid; small saturation resistance | Relatively expensive |
| Triple diffusion (with chemical etching) | High voltage | Medium speed |
| Mesa epitaxial with double diffusion | Rapid; small saturation resistance | Relatively expensive; less robust; medium losses |
| Mesa planar with double diffusion | Rapid; small losses; small saturation resistance | Less robust; expensive |
| Base with mesa epitaxy | Medium speed; small saturation resistance | Small voltage; medium losses |
| Collector and base with mesa epitaxy | Robust; medium speed; small saturation resistance; high voltage | Relatively expensive |
| Homogeneous multi-epitaxial collector | Rapid; small saturation resistance | Relatively expensive; medium losses |

5.2.2
Unipolar transistors

Field Effect Transistors (FETs) have some important advantages: linearity, high input impedance, and a negative temperature coefficient of the drain current (preventing second breakdown and protecting against short-circuits when the FET is placed at the output of an amplifier).
The MOS (Metal Oxide Semiconductor) transistor is another unipolar device. As Rossel, P. et al. [5.4] note, various MOS transistors have been realised since 1974. More recently, more complex MOS and bipolar devices, allowing more diversified electrical functions, have been created. These developments offer the opportunity for integrating the complete control circuit and the power device into the same chip. In MOS technology, two structural configurations are regularly used: vertical structures (the current flows vertically across the chip) and horizontal structures (the current always flows at the surface). There are also integrated devices in which the current flow can be both vertical and horizontal.

A vertical device is the VMOS transistor, where V refers to the shape of the etched silicon. A comparison between bipolar and VMOS transistors is presented in Table 5.4.

Table 5.4 Bipolar vs. VMOS transistors

| Criterion | Bipolar transistors | VMOS transistors |
|---|---|---|
| Commanded by | Current | Voltage |
| Bias | Reverse | Blocked without bias |
| Switching speed | 100 ns | 5...30 ns |
| Input impedance (RF) | Small | High |
| Temperature coefficient | Positive | Negative |
| Characteristics of the command circuit | < 2 V, at a few mA | 10 V, at a few nA |

Another vertical MOS device is the VDMOS (Vertical Double diffused MOS) transistor, which allows obtaining higher voltages (above 400 V) than the VMOS ones.
The horizontal devices are basically double diffused lateral MOS transistors (LDMOS) having a highly integrated gate and source geometry.

5.3
Electrical characteristics

Generally, a semiconductor component is characterised by two distinct sets of values: the limit operating values and the electrical characteristics.
The limit values are not directly measurable, being determined by different tests. These values can be verified, but exceeding them means the destruction of the device or a degradation of its reliability. On the contrary, the electrical characteristics can be measured. The manufacturer does not guarantee an exact value of the electrical parameters, but only a maximum value and, sometimes, a certain dispersion.

5.3.1
Recommendations

• Current gain: To counter-balance the drift arising during the ageing of the
transistor, a safety margin of 15-20% must be taken.
• Leakage current: A long time, the leakage current CE was considered the
only representative parameter for the reliability of a power component. To-
day, the leakage currents can be made small and stable, so their importance
decreased. On the contrary, the difference between the reliability require-
176 5 Reliability of silicon power transistors

ments and the circuit demands are peremptory. An example is the surface sta-
bility problem [5.5].
• Breakdown voltage: Usually, for modem transistors a well-defined break-
down voltage is guaranteed and considered an absolute limit. For small en-
ergy components, the operation at values up to 70% of the maximum admis-
sible value is recommended. For power devices, values up to 90% of the
maximum admissible value are allowed.
• Residual current ICBO: Measurements made in operating conditions on power transistors (but also on bipolar small-signal transistors) showed, for up to 1% of the measured transistors, the existence of a residual current producing trouble. The residual current has three components: a space-charge component, an interface component and a leakage-current component.
The first component arises in the space-charge region of the CB junction. Even a depleted surface layer can contribute. The second component is produced by the generation/recombination states existing at the Si-SiO2 interface. This component increases considerably when a depleted layer reaches the surface, and can be divided into a volume component (generation/recombination and diffusion) and a surface component (generation/recombination). The third component has three main causes: i) contamination and humidity, ii) a current flow in the depleted layer and iii) conducting "paths" for electrons in the oxide layer.
In real transistors, these three components are superposed with different intensities. Measurements identifying the voltage and temperature dependence allow an estimate of the dominant component. Note that passivation plays an important role in keeping the residual currents under control, but often an increase of the noise factor was observed.
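The voltage derating guideline given above (operation at up to 70% of the maximum admissible breakdown voltage for small-energy components, 90% for power devices) can be written as a small check. A minimal sketch; the function name and structure are ours, not from the text:

```python
def max_operating_voltage(v_breakdown_max, power_device=False):
    """Recommended operating-voltage ceiling: 70% of the maximum
    admissible breakdown voltage for small-energy components,
    90% for power devices (derating factors from the text)."""
    derating_factor = 0.9 if power_device else 0.7
    return derating_factor * v_breakdown_max

# e.g. for a maximum admissible breakdown voltage of 100 V:
v_small = max_operating_voltage(100.0)                      # small-energy component
v_power = max_operating_voltage(100.0, power_device=True)   # power device
```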

5.3.2
Safety Limits

Why must safety limits be taken? Because it is important that unexpected cases be taken into account, and because power transistors must sometimes operate far beyond their physical limits.
A safety factor for unexpected cases is not sufficient by itself, without performing an analysis, because:

• The possibilities for a semiconductor component to resist a perturbation are relatively small. The energy that can be absorbed directly by the transistor is very small with respect to the energy of an overcharge.
• The different parameters of a power transistor are not independent (a high-voltage transistor means, for instance, a higher switching time and a higher heating, together with a decrease of the current gain).
Moreover, it is recommended to analyse the perturbing causes and to take the necessary protective measures. Usually, the safety limits are a compromise between technical and economic requirements. The purpose of the safety limits is to assure smaller stresses, thus increasing the reliability of the component. To do this, it is necessary:

• to foresee a greater safety margin for the junction temperature,
• to take into account a smaller safety limit for the maximum voltage (without neglecting the danger of overvoltages),
• to take a safety limit for the operational field, especially when high instantaneous power occurs.
The choice of the safety limits is always the result of a compromise, and it is not possible to give figures if the operating conditions are not taken into account. Consequently, it is useless to recommend a safety value for the current if the system designer has already chosen a maximum value of the current (with the exception of some short-time peak values), the value of IC sat (or if he defined the specifications for the saturation voltage) and the corresponding base current, IB sat.

5.3.3
The du/dt phenomenon

When an active component abruptly undergoes a voltage increase with a steep gradient, du/dt, this may have a strong influence on its operation. This phenomenon, well known to thyristor users, is almost unknown to power transistor users. However, in many energy-converting circuits the switch periodically undergoes sudden voltage oscillations at high speed. For power transistors (and also for thyristors), the initial state of the component, shortly before the du/dt variation, has a great influence. Generally, two cases may occur:
• An inactive transistor (corresponding to a single, isolated du/dt variation),
• An active transistor (when a reverse du/dt variation follows).
In both cases, this phenomenon considerably influences the system reliability. Three protection methods can be used [5.7]:
• The introduction of a diode in series with the perturbed transistor in order to stop the reverse current (an efficient method for components working at high voltages; for small voltages it is better to use another method, because the efficiency can be affected).
• The negative bias of the CB junction of the perturbed transistor (an efficient method, but hard to use).
• The protection with a series impedance (a method that can be used in all cases: the current losses are reduced, but many supplementary components are needed).

5.4
Reliability characteristics

Power transistors convert or switch high energies. This may lead to very high stresses that often produce component degradation. Consequently, the component reliability is strongly linked to the operating conditions.
The component damage is due to a too high junction temperature. The result is an abnormal component operation (electron-hole pairs are created by thermal agitation), leading later to thermal turn-on or to a variation of the physical characteristics of the component.
Because these transformations are irreversible, the lifetime of the component is affected, leading eventually to total failure, even if the stress producing the weakness no longer acts. Often, the damage consists in the loss of the blocking power of the CB junction, by short-circuiting. If the transistor is used in power equipment, this initial short-circuit leads to other troubles, such as silicon melting, contact damage etc. It is obvious that in these circumstances the analysis of such a failed transistor is difficult, because the subsequent damage covers the initial defect. The instant failure of a transistor is a rather rare phenomenon. In most cases, undergoing an abnormally high stress, the transistor is damaged progressively, a failure being registered after minutes or even hundreds of hours. The failure is produced by a temperature increase, arising after the damage. A great role in the damage (mostly in the surface degradation) is played by the temperature, the current, the voltage and the thermal fatigue, with a tendency towards increasing leakage currents and decreasing small-signal current gain.

Fig. 5.1 Failure rate vs. virtual junction temperature [5.10]

Peter [5.8] presented a diagram (Fig. 5.1) showing the failure rate dependence on the virtual¹ junction temperature for a blocking test at high temperature and at a voltage approaching the maximum value. When an overcharge current occurs at a given junction temperature, the current flow leads to ageing by electromigration.

¹ It is known that the junction temperature is a basic physical parameter, very difficult to measure. Therefore, the manufacturers give temperature limit values corresponding to the absolute operating limits. If such a limit value is surpassed, even for a short time and even if subsequently these limits are not surpassed anymore, risks of progressive damage or of an irreversible modification of the characteristics occur.

An important part of the testing programme uses the temperature as a stress factor to predict the time behaviour of the transistor. The temperature may be used because there is a correlation between the damage speed and an exponential factor in 1/T (Fig. 5.2, from [5.9]). The points A, B and C are calculated at three different temperatures, for the same failure mechanism. If no new failure mechanism (modifying the slope of the established characteristic) occurs, the characteristic can be extrapolated and the result for another temperature (point D) can be obtained.

Fig. 5.2 Correlation between the damage speed, expressed by the failure rate (λ, in 10⁻⁵/h), and the reciprocal of the temperature, 1/T (in 10⁻³/K)
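The extrapolation from points A, B and C to point D can be sketched numerically by fitting ln λ against 1/T and evaluating the fit at the new temperature. A minimal sketch; the data values below are illustrative assumptions, not read from Fig. 5.2:

```python
import math

# Illustrative measurements (hypothetical, not read from Fig. 5.2):
# (absolute temperature in K, failure rate in 1e-5/h) for points A, B, C
points = [(500.0, 5.0), (455.0, 1.2), (417.0, 0.3)]

K_BOLTZMANN = 8.617e-5  # eV/K

# Arrhenius law: lambda = A * exp(-Ea/(k*T)), so ln(lambda) is linear in 1/T
xs = [1.0 / t for t, _ in points]
ys = [math.log(lam) for _, lam in points]

# Least-squares fit of ln(lambda) = b + m * (1/T)
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - m * mx
Ea = -m * K_BOLTZMANN  # apparent activation energy, eV

def failure_rate(t_kelvin):
    """Extrapolated failure rate (e.g. point D) at a new temperature."""
    return math.exp(b + m / t_kelvin)
```

The same slope m gives the apparent activation energy, so the fit both checks that a single mechanism is acting and supplies the extrapolation.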

In an electronic system, the lifetime of a transistor must be greater than the lifetime of any other component in the system. Manufacturing errors, the operating environment of the component or the incorrect suppression of a system failure can shorten the transistor life down to a few hours. In the early days of semiconductor technology, transistor reliability was evaluated by various reliability tests simulating the operating environment. This procedure is no longer used, accelerated tests being the main tool (see Chap. 2).
For switching transistors, the maximum power appears in the domain of residual voltages. If the transistor is mostly in the on-state, meaning that no periodic switching at relatively high frequency occurs, the calculation can be made for a voltage reduced to 70%. In this case, the reliability increases five times. The resistance to temperature is not the same for all transistors, being strongly influenced by the technology. In this respect, single-diffusion transistors are the most robust and, consequently, for these transistors the whole power may reach the value corresponding to the maximum collector-emitter voltage (VCEO). For very short power pulses, the value allowed for the equalising currents can be surpassed, until the highest allowed temperature is reached. For a faster sequence of pulses, the system does not cool completely and, therefore, the heating is kept constant.
Generally, when factual data on the components are missing and only an absolute, fixed limit given by the manufacturer is known, the failure rate of the system must be minimised by all means. The stresses corresponding to normal operation are known, but those corresponding to accidental operation are not. Therefore, the elements to be considered by the system designer are the choice of the components and of the circuit, and the definition of the protection means and of the safety limits. The circuit designer must take into account both the economic and the technical requirements.

5.5
Thermal fatigue

Thermal fatigue is the slow degradation of the components produced by temperature variations. Generally, the phenomenon is linked to the mechanical stresses generated by the different dilatation coefficients, which influence the quality of the solder joints and of the metal/silicon and passivant/silicon joints, respectively.
Thermal fatigue after thermal cycling is a common phenomenon for power transistors encapsulated in metal cases. If a transistor is alternately heated and cooled, mechanical stresses are produced, because the dilatation coefficients of the silicon and of the metal used for chip mounting are different.
The transistor heat sink plays an important role in heat dissipation and, therefore, it is made from copper, steel or aluminium. The dilatation coefficients of these metals differ from that of silicon (Table 5.5).

Table 5.5 Dilatation coefficients

Material Coefficient
(10⁻⁶/°C)
Silicon 3
Steel 10.5
Copper 17
Aluminium 23

It is obvious why, at the same temperature, different stresses arise at the chip-heat sink interface.
The link between the silicon chip and the case is made by a "soft" solder joint or by a "hard" solder joint. In the first case, the solder consists mainly of lead, which can absorb the stresses between the chip and the case, because the lead yields by plastic deformation. After deformation, recrystallisation restores the metal, acting better at higher solder temperatures and over longer times. However, the formation of microscopic holes cannot be avoided. These holes lead to stress concentration and, as soon as the twisting limit is locally reached, a crack appears at this point, limiting the heat dissipation and modifying the thermal resistance.
In the case of a "hard" solder, used for transistors with power greater than 150 W, the alloying of gold with silicon transmits the stress entirely to the chip, which is more fragile and can break. To protect the chip, molybdenum can be used, with a dilatation coefficient close to that of silicon, which can absorb the stress if the thickness of the molybdenum layer is well chosen. Even if the melting temperature is not reached, the stress after many thermal cycles is so large that the molybdenum-copper alloy is weakened and a crack may appear. The produced heat is dissipated with difficulty and the thermal resistance increases.
An important increase (by 25%) of the thermal resistance between the junction and the heat sink indicates thermal fatigue. Usually, in all practical circuits, the power transistors undergo thermal stresses. In many applications, these stresses are very large and, therefore, they can lead to the physical destruction of the chip or of the intermediate layers. The tests made by the manufacturers led to the following conclusions [5.10][5.11]:
• Short cycles produce reduced ageing phenomena,
• The number of cycles leading to a significant and measurable ageing is inversely proportional to the maximum temperature and to the temperature gap of a cycle. The absolute limit values given by the producer define the limits that may not be surpassed.
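The inverse relation between the temperature gap of a cycle and the number of cycles to measurable ageing is commonly modelled by a Coffin-Manson-type law; a minimal sketch, where the reference point and the exponent are illustrative assumptions, not values from the tests cited above:

```python
def cycles_to_failure(delta_t, n_ref=100_000, dt_ref=50.0, exponent=2.5):
    """Coffin-Manson-type estimate N = N_ref * (dT_ref / dT)**m:
    the number of cycles to measurable ageing falls steeply as the
    temperature gap of a cycle grows. All parameter values here are
    illustrative assumptions."""
    return n_ref * (dt_ref / delta_t) ** exponent

# A larger temperature swing allows far fewer cycles before ageing:
for dt in (50.0, 100.0, 150.0):
    print(dt, round(cycles_to_failure(dt)))
```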
The user may identify the ageing of the solder joints of a component in operation by the abnormal heating of the junction, leading to the component failure. The behaviour at second breakdown is a parameter sensitive to thermal fatigue. As soon as a microcrack is formed, the thermal resistance increases locally and the behaviour at second breakdown becomes worse. If a transistor must operate close to second breakdown, it can be suddenly destroyed, without a previous degradation of the connections by thermal fatigue.
To improve the component manufacturing, the producer may act in two ways [5.12]:
• A constant and careful surveillance of the manufacturing conditions, so that an optimal quality is reached,
• Improvements through failure analysis, through the search for causes and through the development of a new and improved technology.
The experience accumulated so far shows that for small stress variations the solder joints transmit the stresses integrally, without fatigue phenomena (in the elastic domain), while for large stress variations a weakening of the solder joint, increasing with the duration of the stress, is observed.

5.6
Causes of failures

The primary cause of failure is almost always (excepting overvoltage) an abnormal increase of the temperature, often spatially limited ("hot spot") and consecutive to an abnormal operation (second breakdown, avalanche operation, wrong base command). All these phenomena produce [5.10]:
• An abnormal operation of the component (electron-hole pairs formed by thermal agitation) that may lead later to a thermal turn-on²,
• A variation of the physical properties of the component (solders, surface damages, local melting of the silicon/metal eutectic, dilatation, modification of contact properties)³.
Sometimes, abnormal stresses, even limited and occurring only once, lead to a progressive degradation (in normal operating conditions), going up to the failure of the component. This progressive degradation occurs some minutes or even thousands of hours after the initial accident.

5.6.1
Failure mechanisms

The failure mechanisms of discrete components may be produced, roughly, by three categories of defects:
• Mechanical defects or defects induced by the manufacturing.
• Surface defects.
• Chip (volume) defects.
The mechanical defects are sometimes easily detected. For instance:
• Bad solder joints (thermocompression, ultrasonic etc.): wire (or chip) bonding is a critical operation, requiring good controls, well-designed tests and frequent periodic inspections.
• Chip mounting errors (leading to an increase of the thermal resistance and to overheating).
• Use of improper materials for the contact area and connection wire (e.g. gold and aluminium). An example is the Au-Al alloy formed when the gold wires are heated by the operating regime to 200-300°C, close to the aluminium contact area. This phenomenon is called purple plague.
• Imperfect sealing, allowing contaminants and moisture to enter the case (producing surface problems and metal corrosion).
Surface problems are, perhaps, the main cause of failures. Surface contamination is in this case the main factor. Some stresses may lead to the following failure mechanisms:
• Gas escape (coming from the chip or from the inside of the package).
• Captured moisture.
• Package leakage during (or after) the manufacturing.
The surface defects include:

² If no irreversible physical transformations occur, the power transistor can regain its original characteristics.
³ This modification corresponds to irreversible transformations: therefore, the component is damaged or destroyed.

• Contamination (of the glass or of the protecting layer). Leakage currents directly proportional to the operating voltages and ambient temperatures are produced.
• Lack of adhesion of the aluminium to the glass. A non-uniform current distribution in the silicon leads to the phenomenon called hot spot.
Usually, the chip failure is produced by defects of the semiconductor crystal structure, by unwanted impurities or by diffusion-induced defects. Generally, these defects can be discovered even during the final electrical control. Undiscovered defects lead in time to wear-out failures.
As for other semiconductor devices, volume defects (epitaxy or diffusion-induced defects, microcracks) arise also for power transistors. These defects may lead to hot spots in the CB junction. If the transistor is not efficiently protected (in current and voltage), hot spot phenomena may lead to total destruction by the breakdown of the junction, based on a well-known failure mechanism: a current increase, entering in second breakdown, entering in the intrinsic conduction of silicon.

5.6.2
Failure modes

The external indicators signalling the failures are called failure modes. For failure analysis (and also for building the screening methods and tests), basic knowledge of the manufacturing methods and of the correlation between the failure modes and the component design is essential.

Table 5.6 Failure sources (in %) for power transistors encapsulated in TO-3 and TO-220

Failure sources        TO-3  TO-220
Operator deftness        35       -
Metallic impurities      20      25
Internal connections     15      25
Moulding                  -      15
Series fabrication       10      10
Surface effects           -      10
Tightness                 5       -
Materials                 5       -
Tests                     5       5
Unidentified sources      5      10

The main failure modes are given below:

• ICBO is the most sensitive indicator parameter for a surface defect. A continuous increase of this parameter, often accompanied by a decrease of the current gain hFE, is a sure indication of a contaminated surface.
• Short-circuits (especially CE) may announce the presence of hot spot phenomena, due to chip problems or a circuit defect.
• An open circuit may indicate a bad solder joint or a conductor melted by an excessive current.
• Combinations of short-circuit and open circuit may be the result of a melted conductor linked with the upper conducting layer.
• An intermittent open circuit, especially at high temperature, must usually be considered a sign of a bad-quality solder joint.
To establish the failure modes and mechanisms, life test results and operational failures must be investigated. This information is useful for establishing the error and failure sources. Then, the manufacturer uses it for improving the fabrication process. If the failure modes and mechanisms are known, accelerated tests must be performed for each of the considered applications. Thus the failure sources can be established in the laboratory. Based on this information, the producer will work out an improvement programme for the technology. The failure analysis is also an important information source for the elimination of the manufacturing defects or, if any, of the utilisation errors. According to the RCA statistics [5.6][5.12]...[5.16], the failure sources for power transistors encapsulated in TO-3 (metal package) and in TO-220 (plastic package) are given in Table 5.6.

5.6.3
A check-up for the users

Because many failures result from an improper utilisation of the power transistors, it is advisable to check first that suitable mechanical and electrical procedures have been used. Thus, the following checks are recommended:
Mechanical problems
• Cooling elements correctly dimensioned and efficient.
• A smooth, defect-free mounting surface.
• Correct compression coupling.
• No excessive stresses.
• Correct use of the silicone paste.
• No contamination of the isolators or of the case (no leakage).
Electrical problems
• Does the power device work in the domain specified by the manufacturer?
• Are the limit values for current and voltage exceeded?
• Do the electrical tests damage the transistor?
• Is the power correctly measured?
• Are the components used correctly dimensioned to avoid overvoltages?

Other problems

• Purchasing date.
• The quantity purchased and the storage conditions.
• Manufacture date.
• The number of transistors used for building equipment.
• The number of failed components.
• Analogous experience with previous component batches.
• Operating conditions at the moment of failure.

5.6.4
Bipolar transistor peripheries

The package, the chip connections and the chip-package connections form the transistor peripheries. The weak points of these peripheries are [5.17]:
• Material migration on the chip.
• Insufficient package tightness.
• Silicon degradation in the connection area.
• Formation of gold-aluminium alloy.
• Aluminium reliability in the area close to the connection (identified by ultrasound).
• Anodic decomposition of the aluminium (when moisture penetrates).
• Insufficient adherence of the aluminium to silicon.
• Oxide residues in contact windows.
• Cracks and material residues on the conducting paths or along the wires making the connection with the environment.
To reveal these structural weaknesses, accelerated tests (high current, temperature and humidity) and vibration tests are performed.

5.7
The package problem

Since the plastic package was introduced (in 1962), important progress has been made both in the field of plastic materials and in the packaging technology. To assure high transistor reliability, the plastic material must adhere well to all metallic parts, serving as a separating protection buffer during the component lifetime. The dilatation coefficient of the plastic material must be comparable with those of the other constitutive parts. This material needs to permanently take over the heat emitted during operation.
Large programmes for studying the reliability, initiated by all important manufacturers, allowed evaluating the reliability of power transistors and also demonstrating important properties such as: chip surface stability, package suitability, parameter stability during long lifetimes [5.18].
For the TO-3 package, three materials may be used: copper, aluminium and iron-nickel alloy (Fe-Ni). Copper is used only in special cases (high reliability equipment, special programmes), because it is very expensive. For professional items, Fe-Ni is used, but for the majority of common applications aluminium gives satisfactory results. Since the transistors in aluminium packages are encapsulated at low temperature, tightness problems may arise. Adams [5.19] states that the real average failure percentage due to tightness deficiencies is smaller than 2.2% of the total number of complaints.
As concerns the plastic encapsulated transistors, it was known that high-temperature packaging might cause chip-epoxy material interactions. Specific problems linked to the degradation, to the life duration and, especially, to the phenomena occurring during the encapsulation were carefully investigated and it seems that, today, plastic cases are as reliable as hermetic ones (see Chap. 12).

5.8
Accelerated tests

Each time a new manufacturing technology is introduced, a reliability improvement and an increase of the life duration of the respective component are needed. This may be obtained by a reliability analysis, containing reliability tests, failure analysis, data processing and corrective actions. The reliability tests point out the failures. But for transistors the failure mechanisms evolve very slowly, and the long-time observation of the behaviour of these devices risks being useless, because the conclusions of the reliability analysis become available when, maybe, the respective transistor type is no longer in fabrication. Consequently, an accelerated test is needed. The conditions that must be respected are: i) relatively simple and economical tests and, most important, ii) the failure must be produced by the same failure mechanism as in normal operation.
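When the accelerating stress is temperature and the failure mechanism is unchanged, such a test is usually planned with an Arrhenius acceleration factor between the use and test junction temperatures. A minimal sketch; the activation energy value is an illustrative assumption:

```python
import math

K_BOLTZMANN = 8.617e-5  # eV/K

def acceleration_factor(t_use_c, t_test_c, ea_ev=0.7):
    """Arrhenius acceleration factor between use and test junction
    temperatures (given in degrees C). The activation energy ea_ev
    is an assumed, mechanism-dependent value."""
    t_use = t_use_c + 273.15
    t_test = t_test_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN * (1.0 / t_use - 1.0 / t_test))

# e.g. a life test at 150 degC for a device normally run at 55 degC:
af = acceleration_factor(55.0, 150.0)
```

One test hour then counts as roughly af hours of use, which is why a few hundred or thousand test hours can stand in for a field lifetime.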
The transistor reliability is conditioned by the failure mechanisms leading to the degradation of the characteristic parameters. A synthesis of the experience in this respect, made by Tomasek [5.20], led to the following conclusions:
• The common failures of a stable transistor production are drift failures.
• The reliability of the transistors depends mainly on the operating conditions, more so than for resistors, capacitors and electron tubes.
• Thermal cycling produces a rapid ageing of the transistors compared to static stresses at high temperature.
• The transistor parameters and their stability are given by the volume phenomena produced inside the chip, but also by the physical and chemical phenomena occurring at the chip surface.
• Early failures appear in the first 100-300 hours of operation.
• Random and accidental failures arise roughly after 2 × 10⁵ hours.
• The intrinsic reliability of power transistors may be smaller than that of small-signal transistors.
• The failure rate for pulse operation does not depend on the base current in the conduction state.
• Voltage tests are essential for measuring the failure rate. They are more expensive than storage tests, but the supplementary cost is justified by the information obtained.
• The failure rate of the transistors depends on the junction temperature, but also on the way this temperature was obtained.

5.8.1
The Arrhenius model

A model for the relation between the failure rate and the junction temperature of the devices was developed, based on the Arrhenius law. The diagram from Fig. 5.2 allows estimating the probable reliability of the device for different junction temperatures obtained in practical conditions.
Generally, if the transistor has a heat sink, one can write:

Tj = TA + PD (Rjc + RCS + RSE)    (5.1)

where Tj is the junction temperature, PD the dissipated power, Rjc the thermal resistance junction-case (a technological characteristic specific to a transistor and given by the manufacturer), RCS the thermal resistance case-heat sink (referring to a conduction transfer; this resistance is smaller if the contact between the case and the heat sink surface is good, and this contact can be improved by using silicone oil), RSE the thermal resistance heat sink-environment (depending not only on the size, form and structure of the heat sink, but also on its orientation and on the air stream flowing around it), and TA the ambient temperature. Since the resistances are in series, one may write:

RJA = Rjc + RCS + RSE.    (5.2)


Example: If RSE = 8°C/W, PD = 5 W, Rjc = 4.2°C/W, RCS = 0, TA = 25°C, the value Tj = 86°C is obtained.
Generally, the reliability is improved if the transistor operates at a temperature far below the maximum recommended value of Tj.
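The series thermal-resistance model of Eqs. (5.1) and (5.2) can be sketched as a small calculation; the numbers below reproduce the example above:

```python
def junction_temperature(ta, pd, rjc, rcs, rse):
    """Junction temperature from the series thermal-resistance model,
    Eqs. (5.1)-(5.2): Tj = TA + PD * (Rjc + Rcs + Rse)."""
    rja = rjc + rcs + rse  # total thermal resistance junction-ambient, degC/W
    return ta + pd * rja

# The example from the text: RSE = 8 degC/W, PD = 5 W, Rjc = 4.2 degC/W,
# RCS = 0, TA = 25 degC gives Tj = 86 degC.
tj = junction_temperature(ta=25.0, pd=5.0, rjc=4.2, rcs=0.0, rse=8.0)
```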
In the late 80's, the idea that the way used to obtain the junction temperature also matters became more and more credible. This conclusion was proved experimentally, for instance, by Bâzu [5.21][5.22]. Four samples withdrawn from the same batch of bipolar transistors underwent a life test at the same temperature and the same dissipated power (Pmax), but at different combinations (Ui, Ii), where Ui × Ii = Pmax for all samples.


Fig. 5.3 Voltage dependence of the median time (lognormal distribution). Experimental data were obtained from four samples withdrawn from the same batch of bipolar transistors undergoing a life test at the same temperature and the same dissipated power (Pmax), but at different combinations (Ui, Ii), where Ui × Ii = Pmax for all samples

In Fig. 5.3, the variation of tm (the median time, for a lognormal distribution) with the applied voltage is presented, for a single failure mechanism (a field-induced junction). Since the junction temperature is the same for all samples, tm should be constant. The voltage dependence observed in Fig. 5.3 means that the Arrhenius model (described by the dotted line) is no longer sufficient to describe the temperature acceleration, because it seems that the way used to obtain this temperature (by electrical and/or thermal stress) is also important. Consequently, Bâzu and Tazlauanu [5.23] proposed a new model, suitable for many electrical and climatic stresses. The model can be used, for instance, for building accelerated tests with humidity as a stress factor (see Chap. 2).
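The median time tm of a lognormal life distribution, used above, can be estimated from observed failure times as the exponential of the mean log-time. A minimal sketch with hypothetical failure data:

```python
import math

def median_time(failure_times_h):
    """Estimate of the lognormal median time tm: since ln(t) is
    normally distributed, tm = exp(mean of ln(t_i))."""
    logs = [math.log(t) for t in failure_times_h]
    return math.exp(sum(logs) / len(logs))

# Hypothetical failure times (hours) from one life-test sample:
tm = median_time([120.0, 300.0, 450.0, 800.0, 1500.0])
```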
Martinez and Miller [5.24] studied the reliability of power RF transistors operating at temperatures above +150°C, the maximum junction temperature. The following accelerated tests were performed:
• DC tests: 1000 h at 180°C and 240°C, constant stress,
• RF tests: 168 h, full power, step stress,
• Temperature increase up to 220°C, in 20°C steps, 200 h at each step.
The only failure mechanism found was electromigration. Consequently, it seems that the transistors can operate successfully at such high temperatures and that accelerated life tests at +220°C are feasible.

Table 5.7 Testing conditions for temperature cycling testing of cases TO-3 and TO-220

Case    Testing  Dissipated  Temp. range  ΔT    Increase      Decrease       Cooling element
        cond.    power (W)   (°C)         (°C)  time ton (s)  time toff (s)  (°C/W)
TO-3    A        16          40...130     90    50            100            Air
TO-3    B        56          70...120     50    15            25             6.3
TO-220  A        18          55...110     55    180           180            3
TO-220  B        4.75        35...155     120   50            100            Air

5.8.2
Thermal cycling

Thermal cycling proved to be a very good method for accelerated tests in technological improvement evaluation. With this procedure, the quality of solder joints can be tested continuously.
In order to compare the reliability of the same transistor encapsulated in plastic packages (TO-220) and in metal packages (TO-3), the number of cycles up to failure vs. the junction temperature is presented in Figs. 5.4 and 5.5 [5.6][5.15]. The testing conditions are summarised in Table 5.7.


Fig. 5.4 Temperature range vs. number of cycles till failure (for power transistors encapsulated
in package TO-3)


Fig. 5.5 Temperature range vs. number of cycles till failure (for power transistors encapsulated
in package TO-220)


Fig. 5.6 Correlation between failure rate and normalised junction temperature. For transistors
with dissipated power higher than 1W at an environmental temperature of 25°C, the values must
be multiplied by 2
If the temperature is the main stress in operation, the curves given by the standard MIL-S-19500 (Fig. 5.6) may be used to predict the reliability. For instance, if the thermal cycling produces the failure, the maximum number of cycles for a transistor can be calculated.
In Fig. 5.7, the dependence of the failure rate on the junction temperature for different reliability levels is shown for power transistors. One must note that the use of screening tests (such as JAN TX) can diminish the failure rate.

(Curves shown for plastic, hermetic and JAN TX screened devices; failure rate vs. junction temperature in °C)


Fig.5.7 Failure rate vs. junction temperature for various reliability levels of power transistors

5.9
How to improve the reliability

A perfect flatness of the radiator surface in contact with the case is indis-
pensable both for good heat removal and to avoid deformation of the case. To
achieve this:
• The radiator thickness must be greater than 2 mm.
• The fixing holes must not be too large.
• The recommended value of the tightening torque is 0.8 x the maximum
value.
• The radiator holes must be perpendicular to its surface.
• The burr (relief collar) arising when the hole is made must be completely
eliminated.
• Silicone oil must be used to improve the thermal contact.
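The mounting rules above all serve to keep the case-to-sink thermal resistance low. Their effect can be sketched with the standard series thermal model; the resistance values below are illustrative assumptions, not data from this chapter:

```python
def junction_temperature(t_ambient, p_dissipated, rth_jc, rth_cs, rth_sa):
    """Series thermal model: Tj = Ta + P * (Rth_jc + Rth_cs + Rth_sa).

    Rth_cs (case-to-sink, K/W) is exactly the term that flatness,
    burr removal and silicone oil improve. All figures here are
    illustrative, not measured values.
    """
    return t_ambient + p_dissipated * (rth_jc + rth_cs + rth_sa)


# Good mounting (flat contact, silicone oil) vs. dry, uneven contact:
tj_good = junction_temperature(40.0, 30.0, rth_jc=1.5, rth_cs=0.3, rth_sa=2.0)
tj_bad = junction_temperature(40.0, 30.0, rth_jc=1.5, rth_cs=1.2, rth_sa=2.0)
```

In this sketch the poor contact raises the junction temperature by some 27 K at 30 W, which directly degrades the failure rate through curves like Fig. 5.7.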
Since the inductive overvoltages arising when switching off inductive circuits are very
dangerous, a fast diode limiting the voltage applied to the transistor terminals may be used.

This diode is efficient only if the connections are short enough. Moreover, the
diode must not be placed too close to the coil, but close enough to the power tran-
sistor and to the irregular voltage source. Concerning soldering, one must note that at a
soldering iron temperature of about +350°C, the soldering duration may not exceed
10 seconds.

5.10
Some recommendations [5.26] ... [5.63]

Many manufacturers succeed in achieving very reliable equipment. There is no
prescription for doing this, but it comes out [5.11] that the best results are always
obtained if a detailed analysis of the operating conditions and an experimental
study are performed. Often, the duration of this study is long, with a frequent use
of protection circuits.
Taking into account the normal dispersion at the design stage, the definition tests for a
prototype and the calculation of all quantities referring to the reliability must be
made.
The transistor currents and voltages must be studied and analysed carefully,
because they can lead to high stresses and to exceeding the maximum rated
values. Transistor regimes lasting only a few seconds are too short to
constitute causes of thermal fatigue; only the functional
parameters must be taken into account.
To obtain a maximum power at a minimum cost, one must ensure:
• A correct choice of the power transistors (with soft or hard solder joints).
• A small variation of the case temperature (for a given power, it depends on
the dissipation capacity).
Normally, a power transistor operates as a switch. Its capability to dissipate
energy is weak, limited by second breakdown and by ageing. Protection
methods that endanger the transistor by exceeding its parameter values
(avalanche limitation, transistor operating as a limiter) are usually not
recommended. A transistor operating as a switch at low frequencies undergoes fatigue
cycles with the same duration as that of the equipment operation. The temperature
difference between starting and stopping depends only on the average switching
losses and on the switched power.
If the power transistor undergoes long thermal cycling with high amplitude,
it is preferable:
• To choose transistors mounted on molybdenum, because their resistance to
thermal fatigue is always greater.
• To choose mounting elements so that the temperature variation of the case
between the on-state (at full power) and the off-state is minimised.
• To choose the operating conditions of the transistor so that the temperature
difference between the junction and the case is as small as possible, which means
limiting the dissipated power.

• To place the operating point inside the SOA (Safe Operating Area) curve, so
that the point is far from the limit given by second breakdown.
• To foresee safety margins for switching losses, for the maximum voltage, for
the maximum junction temperature (especially for high voltage components)
and for second breakdown.
• To achieve the contact between the component and the heat sink so that the
lack of flatness is smaller than 0.1 mm and the roughness smaller than 15 µm.
The contact resistance cannot be avoided, regardless of the com-
pression force used.
• To use silicone oil, which is a good heat conductor, eliminating the supplemen-
tary corrosion risks.
• To paint the radiator black (approximating a black body), but with a thin layer, to
avoid a supplementary thermal resistance.
In the case of power transistors, the Arrhenius relation plays an important role.
The surface problems and those related to the layers involved are complex, and the
chemical processes limit the life duration. The solubility of the materials increases
with temperature and their stability decreases.
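The Arrhenius relation is commonly used to express the acceleration factor between two junction temperatures. The sketch below assumes an activation energy of 0.7 eV, which is only an example value; the true Ea is specific to the dominant failure mechanism:

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K


def arrhenius_af(t_use_c, t_stress_c, ea_ev):
    """Arrhenius acceleration factor between a use temperature and a
    stress (test) temperature, both given in degrees Celsius.

    AF = exp(Ea/k * (1/T_use - 1/T_stress)), temperatures in kelvin.
    ea_ev is the activation energy of the dominant failure mechanism
    (device specific; 0.7 eV below is an assumed example).
    """
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_stress))


# Life test at Tj = 125 C compared with use at Tj = 55 C:
af = arrhenius_af(55.0, 125.0, ea_ev=0.7)
```

With these assumed figures each test hour represents several tens of field hours, which is what makes thermally accelerated life testing practical.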
A single stress type cannot reveal all the failure types. This means that for a
semiconductor device the screening procedures have only a limited success.
For silicon power transistors in metal packages, a temperature range be-
tween -65°C and +200°C is recommended. Inside this range, the transistor reli-
ability is considered to be satisfactory. Outside this range, the transistor is unsta-
ble, cannot be controlled and, eventually, fails. For this reason, the failure
rate increases with temperature⁴.
The curves showing the time distribution of the failures are not reproducible
(except when a specific dominant failure mechanism exists). Early
failures do not always arise; consequently, screening does not always bring
a reliability improvement.
The SOA test, used close to the intersection area (maximum dissipated power,
second breakdown) of the characteristic parameter plane Ic = f(VCE), is a global
test intended to verify the capability of the transistor to withstand the operating power.
This test is applied during 0.25-1.5 seconds, at a given power (IE, VCE). If the
voltage VCE decreases, this means that the transistor operates defectively and there
are hot spots, with a tendency towards short-circuits. This test allows the detection of
solder joint defects, some volume defects (microcracks, base inhomogeneities) and
some surface defects (adhesion losses of aluminium to silicon). Therefore, the
SOA test is an "all or nothing" test and does not allow obtaining measurable
values.
In spite of all precautions that can be taken (IE must be applied before VCE,
to avoid oscillations, and VCE must be interrupted within 1 s if the fixed limit is
exceeded), the SOA test can lead to the failure of the tested components, especially
of those with defects (e.g. EC junction breakdown).
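The pass/fail logic behind checking an operating point against the SOA can be sketched as follows. The limit values are hypothetical, and a real SOA also includes a separate second-breakdown boundary not modelled here:

```python
def inside_soa(v_ce, i_c, i_max=10.0, v_max=100.0, p_max=150.0):
    """Simplified DC SOA check for an operating point (V_CE, I_C):
    maximum current, maximum voltage and maximum dissipation.

    All three limit values are hypothetical; a real SOA adds a
    second-breakdown boundary that cuts into the power-limit region.
    """
    return i_c <= i_max and v_ce <= v_max and v_ce * i_c <= p_max


ok = inside_soa(30.0, 4.0)   # 120 W, inside all limits
bad = inside_soa(60.0, 4.0)  # 240 W, violates the dissipation limit
```

This mirrors the "all or nothing" character of the test: the point is either inside the safe area or it is not, with no intermediate measurable value.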

4 For the failure criteria, see the standard DIN 41794.



References

5.1 Băjenescu, T. (1982): Zuverlässigkeitsprobleme bei Siliziumleistungstransistoren. Elek-
troniker, vol. 18, pp. 1-8 and pp. 19-26
5.2 Chabanne, J. P. (1974): Le phenomene de second claquage dans les transistors de puissance.
Electronique et Microelectronique Industrielles, no. 192, pp. 123-131
5.3 Peter, J.-M. (1975): Les transistors de puissance Sescosem en Europe. Sescosem Informa-
tions, no. 5, S. 1
5.4 Chabanne, J. P. (1975): Redresseurs: puissance + rapidite + economie + fiabilite. Sescosem
Informations, no. 5, S. 2
5.5 Rossel, P.; Baffleur, M.; Charitat, G.; Tranduc, H. (1994): Discrete and integrated power
MOS structures and technologies. Proceedings of International Semiconductor Conference
CAS'94, Sinaia (Romania), Oct. 11-16, pp. 15-36
5.6 Marmann, A. (1976): Reliability of silicon power transistors. Microelectronics and Reli-
ability, vol. 15, pp. 69-74
5.7 Redoutey, J. (1977): Le phenomene dv/dt: Tenez-en compte dans les transistors de puis-
sance aussi. Electronique et Microelectronique Industrielle, no. 234, pp. 27-33
5.8 Peter, J. M. (1977): Comment ameliorer la fiabilite des semiconducteurs de puissance dans
les equipements. Thomson-CSF, Division semiconducteurs
5.9 Fink, D.G. (1975): Electronics engineer's handbook. McGraw-Hill Book Company, New
York
5.10 Peter, J.-M. (1978): L'amelioration de la fiabilite des equipements utilisant des transistors
de puissance. Actes du colloque international sur la fiabilite et la maintenabilite, Paris,
June 19-23, pp. 395-400
5.11 Băjenescu, T. I. (1984): Sur la fiabilite des thyristors. Electronique, vol. 4, pp. 26-31
5.12 Jarl, R. B. (1973): A test set for nondestructive safe-area measurements under high-volt-
age, high-current conditions. RCA Technical Publication AN-6145, June
5.13 Lukach, V. J.; Gallace, L. J.; Williams, W. D. (1972): Thermal-cycling ratings of power
transistors. RCA Technical Publication AN-4783, November
5.14 Gallace, L. 1. (1973): Quantitative measurement of thermal-cycling capability of silicon
power transistors. RCA Technical Publication AN-6163, June
5.15 Williams, W. D. (1971): Thermal-cycling rating system for silicon power transistors. RCA
Technical Publication AN-4612, March
5.16 Baugher, D. M.; Gallace, L. J. (1973): Methods and test procedures for achieving various
levels of power transistor reliability, RCA Technical Publication ST-6209, September
5.17 Schiller, N. (1974): Ausfallursachen bei bipolaren Transistoren. In: Zuverlässigkeit elektro-
nischer Bauelemente, VEB Deutscher Verlag für Grundstoffindustrie, Leipzig
5.18 Băjenescu, T. I. (1982): Zuverlässigkeit von Silizium-Leistungstransistoren. Feinwerk-
technik & Messtechnik, H. 2, pp. 88-92
5.19 Adams, G.E. (1973): Package hermeticity. Proceedings of the 11th annual reliability phys-
ics symposium, Las Vegas, Nevada, April 3-5, pp. 95-97
5.20 Tomasek, K. (1971): Zur Problematik der zeitraffenden Zuverlässigkeitsprüfungen an Si-
Transistoren. Nachrichtentechnik, H. 1, pp. 43-48
5.21 Bazu, M. (1987): Thermally and electrically accelerated failure mechanisms produced by
the functioning of semiconductor devices. Proceedings of Annual Semiconductor Confer-
ence CAS'87, Sinaia (Romania), October 7-10, pp. 53-56
5.22 Bazu, M.; Ilian, V. (1990): Accelerated testing of ICs after a long term storage. Scandina-
vian Reliability Engineers Symp., Nyköping (Sweden), October 13-15, 1990

5.23 Bazu, M.; Tazlauanu, M. (1991): Reliability testing of semiconductor devices in humid
environment. Proceedings of the Annual Reliability and Maintainability Symposium
(ARMS), Orlando, Florida, January 29-31, pp. 307-311
5.24 Martinez, E. C.; Miller, J. (1994): RF power transistor reliability. Proceedings of the An-
nual Reliability and Maintainability Symposium (ARMS), Anaheim, California, January
29-31,pp.83-87
5.25 Deger, E.; Jobe, T. C. (1973): For the real cost of a design factor in reliability. Electronics,
August 30, pp. 83-89
5.26 Grange, J. M.; Dorleans, J. (1970): Failure rate distribution of electronic components.
Microelectronics and Reliability, vol. 9, pp. 510-513
5.27 Kemeny, A. P. (1971): Experiments concerning the life testing of transistors. Microelec-
tronics and Reliability, vol. 10, part I: pp. 75-93; part II: pp. 169-194
5.28 Lang, G. A., Fehnder, B. J., Williams, W. D. (1970): Thermal fatigue in silicon power
transistors. IEEE Trans. on Electronic Devices, ED-17, pp. 787-793
5.29 Redoutey, J. (1977): Les parametres importants des transistors de puissance. Sescosem
Informations No.5, April, pp.3-15
5.30 Gallace, L. J., Vara, J. S. (1973): Evaluating the reliability of plastic-packaged power
transistors in consumer applications. IEEE Trans. on Broadcast and TV, BTR-19, No.3,
pp.194-204
5.31 Preuss, H. (1969): Der Einfluss der Parameterdrift auf die Ausfallrate von Schalttransis-
toren. Fernmeldetechnik, vol. 9, pp. 263-267
5.32 Happ, W. J.; Vara, J. S.; Gaylord, J. (1970): Handling and mounting of RCA moulded-
plastic transistors and thyristors, RCA Technical Publication, AN-4124, February
5.33 Ward, A. L. (1977): Studies of second breakdown in silicon diodes. IEEE Trans. on Parts,
Hybrids and Packaging, PHP-13, No.4, December, pp. 361-365
5.34 La Combe, D. J.; Naster, R. J.; Carroll, J. F. (1977): A study on the reliability of microwave
transistors. IEEE Trans. on Parts, Hybrids and Packaging, PHP-13, No. 4, December, pp.
242-245
5.35 Schultz, H.-G. (1977): Einige Bemerkungen zum Rauschverhalten des Feldeffekt-
transistoren. Nachrichtentechnik Elektronik, vol. 27, H. 6, pp. 242-245
5.36 Harper, C. A. (1978): Handbook of components for electronics. New York: McGraw-Hill
Book Company
5.37 Cavalier, C. (1974): Contribution à la modelisation des transistors bipolaires de puissance:
aspects thermiques. These, Universite de Toulouse
5.38 Davis, S. (1979): Switching-supply frequency to rise: power FETs challenge bipolars. Elec-
tron Device News, January 10, pp. 44-50
5.39 Ginsbach, K.H.; Silber, D. (1977): Fortschritte und Entwicklungstendenzen auf dem Gebiet
Silizium-Leistungshalbleiter. Elektronik, H. 11, pp. EL1-EL5
5.40 Stamberger, A. (1977): Tendenzen in der Leistungselektronik. Elektroniker, H. 11, p. EL34
5.41 Grafham, D. H.; Hey, J. C. (1977): SCR-manual. Fifth edition. General Electric, Syracuse,
New York
5.42 Băjenesco, T. I. (1981): Problemes de la fiabilite des composants electroniques actifs ac-
tuels. Masson, Paris
5.43 Băjenescu, T. I. (1981): Zuverlässigkeitsproblemlösungen elektronischer Bauelemente. IN-
FORMIS-Informationsseminarien 81-8, Zürich, May 14 and October 20
5.44 Băjenescu, T. I. (1981): Ausfallraten und Zuverlässigkeit aktiver elektronischer Bauele-
mente. Lehrgang an der Techn. Akademie Esslingen, February 17-18
5.45 Antognetti, P. (1986): Power integrated circuit

5.46 Hower, P. L. (1980): A model for turn-off in bipolar transistors. Tech. Digest IEEE IEDM,
p. 289
5.47 Sun, S. C. (1982): Physics and technology of power MOSFETs. Stanford electronics labs.
TR no. IDEZ696-1
5.48 Bertotti, F. et al. (1981): Video stage IC implementated with a new rugged isolation tech-
nology. IEEE Trans. Consumer Electronics, vol. CE-27, no. 3, August
5.49 Sakurai, T. et al. (1983): A dielectrically isolated complementary bipolar technique for aid
compatible LSIs. IEEE Trans. Electron Devices, ED-30, p. 1278
5.50 Zarlingo, S.P.; Scott, R.1. (1981): Lead frame materials for packaging semiconductors. First
Ann. Int. Packaging Soc. Conf.
5.51 Dascalu, D. et al. (1988): Contactul metal-semiconductor. Ed. Academiei, Bucharest (Ro-
mania)
5.52 Kubat, M. (1984): Power semiconductors. Springer, Berlin Heidelberg New York
5.53 Regnault, J. (1976): Les defaillances des transistors de puissance dans les equipments.
Thomson-CSF, Semiconductor Division
5.54 Baugher, D. M. (1973): Cut down on power-transistor failures inverters driving resistive or
capacitive loads. RCA Technical Publication no. ST-3624
5.55 Sagin, M. (1977): Power semiconductors. Wireless World, May, pp. 71-76
5.56 Lilen, H. (1976): Les nouvelles generations de composants de puissance dependront des
technologies de bombardement neutronique et/ou electronique. Electronique et microelec-
tronique industrielle, no. 225, October, pp. 22-25
5.57 Gallace, L. J.; Lukach, V. J.(1974): Real-time controls of silicon power-transistor reliabil-
ity. RCA Technical Publication AN-6249, February
5.58 Turner, C. R. (1973): Interpretation of voltage ratings for transistors. RCA Technical Publi-
cation AN-6215, September
5.59 Tomasek, K. F. (1970): Surveying the results of transistor reliability tests. Tesla Electronics,
vol. 1, pp. 17-21
5.60 Walker, R. C.; Nicholls, D. B. (1977): Discrete semiconductor reliability transistor/diode
data. ITT Research Institute
5.61 Bodin, B. (1976): Reliability aspects of silicon power transistors. Motorola Application Note
5.62 Thomas, R. E. (1964): When is a life test truly accelerated? Electronic Design, January 6,
pp.64-70
5.63 Baudier, J.; Fraire, C. (1977): Mesure sur les transistors de commutation de forte puissance.
Sescosem Informations, no. 5, April, pp. 26-30
5.64 Bulucea, C. D. (1970): Investigation of deep depletion regime of MOS structures using
ramp-response method. Electron. Lett., vol. 6, pp. 479-481
5.65 Grove, A. S. (1967): Physics and technology of semiconductor devices. John Wiley, New
York
5.66 Grove, A. S.; Deal, B. E.; Snow, E. H.; Sah, C. T. (1965): Investigation of thermally oxi-
dized silicon surfaces using MOS structures. Solid-State Electron., vol. 8, pp. 145-165
5.67 Das, M. B. (1969): Physical limitations of MOS structures. Solid-State Electron., vol. 12,
pp.305-312
5.68 Hofstein, S. R. (1967): Stabilization of MOS devices. Solid-State Electron., vol. 10, pp.
657-665
5.69 Deal, B. E.; Snow, E. H. (1966): Barrier energies in metal-silicon dioxide-silicon struc-
tures. J. Phys. Chem. Solids, vol. 27, pp. 1873-1879
5.70 Bulucea, C. D.; Antognetti, P. (1970): On the MOS structure in the avalanche regime. Alta
Frequenza, vol. 39,pp. 734-737

5.71 Sah, C. T. ; Pao, H. C. (1966): The effects of fixed bulk charge on the characteristics of
metal-oxide-semiconductor transistors. IEEE Trans. Electron Dev., vol. 13, pp. 393-397
6 Reliability of thyristors

6.1
Introduction

The Silicon Controlled Rectifier (SCR), invented in 1958 in the laboratories of
General Electric, is the most important member of the thyristor family of semicon-
ductor components, which includes the triac, the bi-directional diode switch, the silicon
controlled switch (SCS), the silicon unilateral and bilateral switches (SUS, SBS)
and light-activated devices like the LASCR. More recent members of the thyristor
family are the complementary SCR, the programmable unijunction transistor (PUT)
and the asymmetrical trigger [6.1][6.6]. As a silicon semiconductor device, the
SCR is compact, static, capable of being passivated and hermetically sealed, silent
in operation and free from the effects of vibration and shock. A properly designed
and fabricated SCR has no inherent failure mechanism. When properly chosen and
protected, it should have a virtually unlimited operating life, even in a harsh
atmosphere. Consequently, countless billions of operations can be expected, even
in explosive and corrosive environments. All components - including power semi-
conductors - have the potential of failing or degrading in ways that could impair
the proper operation of such systems. Well-known circuit techniques (including
fusing and self-checking) are available to protect against the effects of such phe-
nomena. For any system where safety is in question, fault analysis is recom-
mended.
The name thyristor denotes any semiconductor switch whose bistable action
depends on pnpn regenerative feedback. Thyristors can be two-, three- or four-
terminal devices, and both unidirectional and bi-directional devices¹ are available.
The SCR is by far the best known of all thyristor devices; because it is a unidirectional
device (current flows from anode to cathode only) and has three terminals (anode,
cathode and control gate), the SCR is classified as a reverse blocking triode thyristor.
A simple pnpn structure - like the conventional SCR - can best be visualised as
consisting of two transistors, a pnp and an npn, interconnected [6.2] to form a
regenerative feedback pair (Fig. 6.1). Obviously, the collector of the npn-transistor
(along with any n-gate drive) provides the base drive for the pnp-transistor:

I_B(pnp) = I_C(npn) + I_G(n)                                    (6.1)

1 Bidirectional thyristors are classified as pnpn devices that can conduct current in either
direction; commercially available bidirectional triode thyristors are the triac (for triode AC switch)
and the silicon bilateral switch (SBS).

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999

[Figure: pnpn structure split into an interconnected pnp/npn transistor pair, with anode, cathode, n-gate and p-gate terminals]

Fig. 6.1 Two-transistor analogue of pnpn structures

Similarly, the collector of the pnp-transistor, along with any p-gate current I_G(p),
supplies the base drive for the npn-transistor:

I_B(npn) = I_C(pnp) + I_G(p)                                    (6.2)

Thus, a regenerative situation exists when the positive feedback loop gain exceeds
unity.
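In the two-transistor model, the loop gain exceeding unity corresponds to the sum of the two common-base current gains reaching 1; since the alphas rise with current, gate drive pushes the structure over this threshold. A minimal numerical sketch, with purely illustrative gain values:

```python
def latches(alpha_pnp, alpha_npn):
    """Two-transistor model of the pnpn structure: the regenerative
    loop gain exceeds unity, and the device latches on, when the sum
    of the common-base current gains alpha1 + alpha2 reaches 1.
    """
    return (alpha_pnp + alpha_npn) >= 1.0


# At low current the alphas are small and the SCR blocks; gate
# current raises the operating current, hence the alphas, until
# the structure latches (values below are illustrative):
blocking = latches(0.30, 0.40)   # sum 0.70 < 1: stays off
latched = latches(0.55, 0.50)    # sum 1.05 >= 1: turns on
```

Once latched, the feedback sustains itself and the gate loses control, which is why the SCR can close a circuit but not re-open it.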
The thyristor is a small power semiconductor switch with a short response time,
able to close an electric circuit but not to re-open it. To turn it off, it must be brought,
for a short time, to zero direct voltage, a situation which is reproduced at each
half-period in alternating current circuits. The thyristor is used for the control
of alternating currents (regulated motors, regulated heating, lighting installations,
etc.).
The complexity of equipment - on the one hand - and the development of new
components - on the other - have forced industry to invest considerable effort in
finding means for controlling and predicting reliability. In many cases, the efforts
were accelerated by the desire of military authorities to evaluate (and improve
where necessary) the reliability of new devices which offered the promise of
improvements in size, weight, performance and reliability in aerospace and
weapons systems. One may note that after only two years, in 1960, the newly
invented thyristor C35 of General Electric fulfilled all requirements of the American
army and was successfully qualified according to the first SCR military specification.

6.2
Design and reliability

The design of a new component has to assure that its performances, during the
entire lifetime, do not exceed the specified tolerances. This concerns particularly
the mechanical and thermal design of the components. In the case of thermal de-
sign, the stability of the thermal characteristics is important because the junction tem-
perature represents the major limitation in applications. A deterioration of the ther-
mal path can lead to thermal runaway and to component destruction. To assure
the compatibility of the thermal expansion coefficients and to reduce thermal fatigue, it is
necessary to select the interfacing materials adequately. Normally, thermal
fatigue is associated with stresses that affect the quality of the die-pellet bond, of the
metal-silicon connections or of the passivation-silicon interface. Thermal fatigue
can appear as a consequence of thermal cycles: if a thyristor is successively
heated and cooled, stresses are produced in it, since the expansion coefficients of the
silicon and of the metal on which the structure is fixed are very different.

[Figure: cross-section of a passivated and glassivated structure]

Fig. 6.2 Passivation and glassivation (National Semiconductor document). Passivation is a
process permitting the protection against humidity and surface contaminants with a doped
vitreous silicon oxide film: 1 diffusion; 2 substrate; 3 glassivation; 4 conductive line; 5 metal; 6
passivation

The reliability of mechanical parts requires: (i) utilisation of rigid assemblies
with reduced mass and small moments of inertia; (ii) elimination of the mechanical
resonances in the normal vibration and shock domains. The design of the junction
surface protection (sealing or passivation) is critical, too. Since the defects due to
component degradation are, mainly, manifestations of the changes which take place
on the junction surface, the component reliability strongly depends on the integrity
of the protection surface. From this point of view, glassivation (Fig. 6.2) - as
well as, for example, the deposition of a protective silicon nitride layer on the
whole structure surface, except the bonding pads - represents a milestone on
the long and difficult way of thyristor reliability improvement.

6.2.1
Failure mechanisms

Failure mechanisms are chemical and physical processes leading eventually to the
device failure. The kinds of mechanisms that have been observed in semiconductor
devices are shown in Table 6.1, together with the kinds of stresses to which each
mechanism is likely to respond. If some of these failure mechanisms arise, to any
significant degree, in a given device type obtained from a given process, it would not
be reasonable to expect to achieve a high reliability device. The dominant
mechanisms to which the device type may be susceptible will vary according to the
peculiarities of the design and fabrication process of that device. The failure
mechanisms of discrete semiconductors may be produced by three categories of
defects:

• mechanical and fabrication defects;
• surface defects;
• structure defects.

Table 6.1 Failure mechanisms and associated stresses

Stress →               Mechanical    Thermal    Electrical      Miscellaneous
Failure mechanism ↓    1  2  3  4    5  6  7    8  9  10  11    12  13  14  15

Structural flaws
weak parts x x x x x x x x x
weak connect. x x x x x x x x x
loose particles x x x x
thermal fatigue x x x

Encapsulation flaws x x x x x x x
Internal contaminants x x
entrapped foreign gases x x
outgassing x x
entrapped ionisable contaminants x
base minority carrier trapping x
ionic conduction x x x x x
corrosion x x x x
Material electrical flaws
junction imperfection x x x
Metal diffusion x x
Susceptibility to radiation x
1 - static force; 2 - shock; 3 - vibration; 4 - pressure (fluid); 5 - static; 6 - shock; 7 - cycling; 8 - voltage;
9 - current; 10 - continuous power; 11 - cycled power; 12 - corrosion; 13 - abrasion; 14 - humidity; 15 -
radiation

Mechanical defects are sometimes very easily detectable and usually very easy to
analyse. Among others, one may cite:
• inadequate soldering (thermocompression, ultrasonic bonding, etc.); soldering is a
critical operation, requiring careful controls, well-organised tests and frequent
periodical inspections;
• defects of structure attachment (which lead to an increase of the thermal
resistance and to overheating);
• utilisation, in the contact zone and for the connection wires, of different metals
(such as gold and aluminium) incompatible with the operating conditions of
the device. An example in this respect is the formation of a gold-aluminium
compound: if the gold wires bonded on the aluminium contact zones are
heated (thermally or electrically) to a temperature of +200°C ... +300°C, this
leads to the phenomenon named purple plague;

• the imperfect tightness permits the access of contaminants and of humidity,
which leads to surface problems (corrosion of the metallisation).
The surface defects are, probably, the predominant cause of the weak reliability of
thyristors. They can be produced by imperfections of the thyristor surface, by external
contaminants collected in the encapsulation or penetrating through an encapsu-
lation defect, or by a combination of these possibilities. Some stresses to which the
thyristor is exposed can lead to the following failure mechanisms:
• gas emissions (from the internal structure or from the encapsulation), parti-
cularly at high temperatures;
• trapped humidity;
• package leakage during (or after) manufacturing.
The surface defects comprise:
• contaminants (of the glass and of the protection layer, through ionic residues of the
chemical products used in fabrication, or produced by external agents),
which produce high leakage currents (increasing with the applied voltages and
temperatures);
• lack of aluminium adhesion to the silicon (hot spots due to an
inadequate distribution of the electrical currents in the silicon).
The bulk defects are defects of the crystalline structure of the semiconductor,
undesirable impurities and diffusion defects. Generally, they can be detected by the
final electrical test of the thyristors. The undetected defects will contribute slowly,
in time, to the arising of wearout failures. It is considered that the structural defects
result from weak parts, from manufacturing discrepancies or from an
inadequate mechanical design. Various tests performed during the fabrication
process are effective means to identify the structural defects and to eliminate the
inadequate thyristors.
Among the possible failure mechanisms, metal diffusion is the least significant.
Diffusion occurs over a long period of time, when two metals are in intimate
contact at very high temperatures; at normal temperatures the rate at which it
progresses is too slow to have tangible effects during the useful life. For example,
many SCRs are gold diffused at a temperature exceeding +800°C for time periods
reaching two hours, in order to obtain the desired speed characteristics. The
accomplishment of the equivalent gold diffusion at +150°C would require
approximately 3 x 10⁸ h (34 000 years).
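The order of magnitude of this extrapolation can be checked with a simple Arrhenius scaling of the diffusion time, assuming the diffusion coefficient varies as exp(-Ea/kT). The activation energy of about 1.1 eV used below is an assumption chosen to reproduce the quoted figure, not a value from this chapter:

```python
import math

K_EV = 8.617e-5  # Boltzmann constant, eV/K


def equivalent_diffusion_time(t_hours, temp_hot_c, temp_cold_c, ea_ev):
    """Time at the lower temperature giving the same amount of
    diffusion as t_hours at the higher temperature, assuming the
    diffusion coefficient scales as exp(-Ea/kT).

    ea_ev is an assumed activation energy, not a measured value.
    """
    t_hot = temp_hot_c + 273.15
    t_cold = temp_cold_c + 273.15
    return t_hours * math.exp(ea_ev / K_EV * (1.0 / t_cold - 1.0 / t_hot))


# With Ea around 1.1 eV, 2 h at +800 C maps to on the order of
# 10**8 h at +150 C, matching the magnitude quoted in the text:
t_cold = equivalent_diffusion_time(2.0, 800.0, 150.0, ea_ev=1.1)
```

The enormous ratio between the two times is why metal diffusion can be ignored as a field failure mechanism at normal junction temperatures.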
Structural flaws are generally considered to be the result of weak parts,
discrepancies in fabrication, or inadequate mechanical design. Various in-process
tests performed on the device - such as forward voltage drop at high current density
levels and thermal resistance measurement - provide effective means for the
monitoring and control of such flaws. These tests also provide means for the
removal of the occasional discrepant device. The failure modes generally
associated with the mechanical flaw category are excessive on-voltage drop, failure
to turn on when properly triggered, and open circuit between the anode and cathode
terminals. Because the corresponding types of failure mechanisms are relatively
rare, the incidence of these modes of failure is low.

Encapsulation flaws are deficiencies in the hermetic seal or passivation that will
allow undesirable atmospheric impurities - such as oxygen and moisture - to react
in such a way as to permanently alter the characteristics of the silicon/metal interface.
A change in surface conductivity is evidenced by a gradual increase of the forward
and reverse blocking current. Because the thyristor is a current-actuated device, it
will lose its capacity to block the rated voltage if the blocking current degrades beyond
some critical point. This type of mechanism may eventually result in catastrophic
failure. The rate of degradation depends mostly on the size of the flaw and on the
level of the applied stress, particularly temperature.
Failure modes² associated with the category of mechanical defects of a thyristor are
the excessive conduction voltage drop (which can be avoided if the thyristor is
correctly triggered) and the open circuit between anode and cathode. As these defects
are rare, their incidence is reduced [6.1][6.3] ... [6.10].
The reliability of thyristors depends on three main factors:

• design;
• manufacturing;
• application.
The five major stresses a thyristor can encounter in its life are:

• current;
• voltage;
• temperature;
• mechanical stresses;
• moisture.
From the reliability point of view the thyristors used in systems can be the weakest
point - for two main reasons:
a) Although the dangers represented by current, voltage, and temperature are
widely recognised, the importance of high mechanical stresses and of moisture for
thyristors is often underestimated.
b) Thyristors are most exposed to the external environment; their internal
impedance must be the lowest possible. Any form of overload (voltage or low
impedance) is immediately converted into heavy current flow that - in some cases
- can have catastrophic consequences.

6.2.2
Plastic and hermetic package problems

Today a large percentage of thyristors is produced in plastic packages. It is im-
portant to verify whether these devices can withstand the most severe temperature and
humidity conditions encountered during the life of the device.

2 Failure mode: the effect by which a failure is observed. In failure analysis (and in the
adjustment of screening tests and of test methods), the knowledge of the fabrication methods
and the correlation between the failure mode and the device design are essential.
6 Reliability of thyristors 203

The following accelerated laboratory tests are normally used for this purpose:
Pressure cooker: 121/100 (+121°C, 100% relative humidity RH) at 2.08 atm.
(This test can be carried out with or without bias.)
85/85 (+85°C, 85% RH). This test can also be carried out with or without bias; it
is named TH (temperature humidity) and THB (temperature humidity bias),
respectively. The present trend is toward testing with bias, even though it is more
costly and causes more complex interpretation problems.
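Results of such humidity tests are often translated to field conditions with a humidity acceleration model such as Peck's. The exponent n and activation energy Ea below are commonly quoted values for corrosion mechanisms, but they are assumptions here, not parameters taken from this chapter:

```python
import math

K_EV = 8.617e-5  # Boltzmann constant, eV/K


def peck_af(rh_use, t_use_c, rh_stress, t_stress_c, n=3.0, ea_ev=0.8):
    """Peck humidity acceleration model:
    AF = (RH_stress/RH_use)**n * exp(Ea/k * (1/T_use - 1/T_stress)).

    n and ea_ev depend on the corrosion mechanism; 3.0 and 0.8 eV
    are assumed example values. Temperatures in degrees Celsius,
    relative humidities in percent.
    """
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    humidity_term = (rh_stress / rh_use) ** n
    temperature_term = math.exp(
        ea_ev / K_EV * (1.0 / t_use - 1.0 / t_stress))
    return humidity_term * temperature_term


# 85/85 test compared with a 40 C / 60 % RH field environment:
af = peck_af(rh_use=60.0, t_use_c=40.0, rh_stress=85.0, t_stress_c=85.0)
```

With these assumed constants a few hundred hours of 85/85 represent several years in a humid field environment, which is what makes the test practical for plastic packages.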
In the case of hermetically sealed thyristors, a sequence of fine and gross leak
tests can eliminate the occasional discrepant device. The use of radiflo and bubble
testing has been found very effective for the selection and elimination of inadequate
components.
The inclusion of a source of ionisable material inside a hermetically sealed
package - or under a passivation layer - can lead to failure. The failure
mechanisms are similar to those resulting from encapsulation flaws if the inclusion
is gross. If the inclusion is small - as compared with the junction area - the amount
of electrical charge involved is limited. Thus the increase in blocking current is
not sufficient to degrade the blocking capacity of the device. This mechanism acts
even if a permanent change in the surface characteristics of the silicon does not
occur. The apparent surface conductivity of the silicon can be altered by build-up
and movement of the electrical charges carried by the inclusions. This condition is
often reversible, with recovery accomplished through the removal of electrical bias
and the employment of an elevated temperature. This category of failure
mechanism arises only if the forward blocking current can increase to the point
where forward blocking capability is impaired. The probability of occurrence is
extremely low, except in the possible case of small junction area, highly
sensitive devices. But this mechanism is often counteracted by a negative gate
bias or a gate-resistor biasing circuit.
Removal of devices containing undesirable internal contaminants can effectively
be accomplished by means of a blocking voltage burn-in screen. The ionisation of
the contaminants under these conditions takes place rapidly, permitting a relatively
short-term burn-in to be effective. Detection of discrepant devices is accomplished
by both tight end-point limits and methods to detect turn-on during the screening.
Basically this category of failure mechanism involves imperfections in junction
formation. Discrepancies of this nature are not generally experienced with SCRs
because of their relatively thick base widths and because the blocking junctions are
formed by the diffusion process, which allows consistent control of both depth and
uniformity of junction. Initial electrical classification would effectively remove any
such discrepant device.

3 Since the new plastic devices are firmly encapsulated and have no internal cavity, conventional
methods of leak testing obviously are no longer applicable; it has been necessary to develop new
methods. One of these methods is the pressure cooker test, which has been found very effective
in detecting devices with defective passivation.

6.2.3
Humidity problem

When environmental humidity reaches the die, after a certain time, it can cause the
corrosion of aluminium. Corrosion - a very complex phenomenon - may be
galvanic or electrolytic.
Galvanic corrosion requires two metals and an electrolyte. The corrosion
processes are complicated by the fact that the metals are usually protected by oxide
films, which are themselves attacked by impurities, such as the Cl⁻ ion, which
starts the reaction.
On the other hand, electrolytic corrosion occurs when there is a cell consisting
of two metallisations (even of the same type of metal, here - aluminium), but with
externally applied bias. The presence of impurities sparks off the reactions [6.5]:
Al + 3Cl⁻ → Al³⁺ + 3Cl⁻ + 3e⁻ (anodic reaction). (6.3)
The ionised aluminium is transported to the cathode, where we have:
Al³⁺ + 3e⁻ → Al (cathodic reaction). (6.4)
But the aluminium is not able to deposit in these conditions, and in the presence
of humidity the following reaction occurs:
Al³⁺ + 3H₂O → Al(OH)₃ + 3H⁺. (6.5)
Corrosion appears as an interruption (open circuit) in the aluminium or in the
bonds, preceded at times by the degradation of the electrical characteristics of the
device (e. g. increased leakage current). The corrosion is therefore accelerated by
the impurities carried by the H₂O when it crosses the resin and laps against the
metal surfaces of the frame and by the voltage applied to the device (electrolytic
corrosion). The phenomenon is delayed by passivating the die and by increasing the
thickness of the aluminium metallisations.
Humidity tests are therefore used to evaluate:
• plastic-frame adhesion and possible package cracks;
• permeability of plastic to water and corrosive atmospheric pollutants;
• plastic, die attach, and frame-plating purity (ionic contamination);
• passivation quality (condensation occurs mainly in passivation cracks);
• design characteristics (i. e. aluminium thickness, quality, and morphology; inter-
nal slug and frame geometric design; passivation type; phosphorous content).

6.2.4
Evaluating the reliability

Thyristors are current-controlled devices, acting as high-impedance paths to the
flow of current, irrespective of the voltage dropped across them, until turned on, i.e.
assuming a low resistance upon application of a suitable gate current. Hence, the
surface conductivity of the silicon is important to the operation; at the surface, the
conductivity is increased by means of impurities introduced by ionic contaminants;
a gradual increase in the leakage current is observed when the devices are in the off
state. It has been found that mechanical stresses in silicon can reduce the energy

gap, and as a consequence it is possible to reduce the on voltage of the devices.
Thermal stress can cause degradation of device characteristics by affecting the
junctions. All thyristors must be designed within the specifications required to
prevent electromigration of the metallisation.
Effects of nuclear radiation on SCRs can include permanent damage to the
crystal lattice, thereby reducing minority carrier lifetimes. Increased gate current
to trigger and - to a lesser degree - increased holding current, on voltage and
forward breakdown voltage have been observed.
EOS is once again the major failure mechanism affecting these devices, which are
sensitive to static potentials and can be destroyed by permanent breakdown across
a reverse-biased junction.
The language and the techniques relating to the reliability treatment have
continued to develop as the technology advanced and became increasingly
complex. The need to define reliability as a product characteristic expanded as the
newer technologies moved from laboratory to space, to industry, to home. The steel
mill calculates the cost of down time in thousands of dollars per minute; the utility
is sensitive to the low tolerance level of its customer to interruptions in service; the
manufacturer of consumer equipment relies on a low incidence of in-warranty
failures to maintain profitability and reputation. Discussion in this chapter is limited
to the effects of component part reliability; in addition, the assumption is made that
the parts are properly applied, and that they are not subject to stresses that exceed
rated capability.
Evaluating the reliability of a thyristor involves the study of numerous factors
[6.5]. The subject is vast; in the following, only a few aspects are mentioned. If we note:
tB - the time at which infant mortality can be considered exhausted,
tM - the time at which 50% of devices fail, and
tw - the start of wearout,
we can emphasize that the failure rate λ and the times tB, tM and tw depend on the
type and intensity of stress both in the laboratory and in the actual application. The
main activities in the thyristor reliability field are:
(i) Study and detennination of the stress that the thyristors encounter in typical
applications.
(ii) Study and definition of laboratory tests* to be used to check the reliability.
Numerous standardised tests are required to research the effect of each stress on the
device. These can be divided into standard tests (used for checks) and accelerated
tests (which aim to give results in a short time through acceleration of the test
stresses).

* The laboratory tests consider one stress (or a few simultaneous stresses), as opposed to the
large variety of stresses encountered by the device during operation in the field. The laboratory
tests (for example pressure cooker or 85/85 tests) have a dual aim: i) If we know the
acceleration factors between one test and another, and between the conditions in the laboratory
and those in the field, it is possible to evaluate a certain useful device life - if certain laboratory
tests are passed. ii) Laboratory tests are also used to compare different constructive solutions or
products from different suppliers, even if the acceleration factors of each test are not known
exactly.
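The acceleration factors mentioned in this footnote are usually estimated with empirical models. As an illustration (the model and the parameter values below are typical literature assumptions, not taken from this chapter), Peck's temperature-humidity model can be sketched as follows:

```python
import math

# Peck's empirical temperature-humidity model (an illustrative, widely used
# model -- NOT taken from this chapter; n and Ea below are typical
# literature values, assumed here for the sketch):
#   AF = (RH_test/RH_use)^n * exp[(Ea/k) * (1/T_use - 1/T_test)]
K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def peck_acceleration(rh_test, rh_use, t_test_c, t_use_c, n=2.7, ea_ev=0.79):
    """Acceleration factor of a humidity test versus field conditions."""
    t_test = t_test_c + 273.15
    t_use = t_use_c + 273.15
    humidity_term = (rh_test / rh_use) ** n
    temp_term = math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_test))
    return humidity_term * temp_term

# 85/85 test compared with a mild 40 degC / 50% RH field environment
af = peck_acceleration(rh_test=85.0, rh_use=50.0, t_test_c=85.0, t_use_c=40.0)
```

With these assumed parameters, one hour of 85/85 testing corresponds to on the order of a hundred field hours - exactly the kind of factor that allows a useful life to be inferred from a short laboratory test.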

(iii) Study of physical laws governing the various failure mechanisms, and
determination for each failure mechanism of its dependence on a particular stress.
(iv) Development of more and more sophisticated analysis techniques to find the
causes of failure of devices that fail during testing or while in use.
(v) Study and determination of screening techniques and preconditioning to
remove infant mortality before the product is used.
(vi) Theoretical studies of general laws governing reliability (reliability models).
(vii) Study of the best systems for collecting and interpreting the data obtained
from laboratory tests and from the field (data banks and statistical analysis for a
correct interpretation of the results).
(viii) Transfer to production people of the reliability knowledge acquired during
the design of devices and processes, designing at the same time suitable reliability
checks during the production process.
(ix) Transfer of acquired reliability knowledge to the designer of the thyristor
application in order to forecast and to optimise the reliability.
Since semiconductor technology is continuously evolving, obviously the
problem of studying the reliability of these devices is also more and more complex.

6.2.5
Thyristor failure rates

An individual component part, such as a thyristor, does not lend itself to reliability
measurement in the same manner as does a system. For this reason, the statistical
approach to estimating device reliability is to extrapolate the performance observed
by a sample quantity of devices to the probable performance of an infinite quantity
of similar devices operated under the same conditions for a given period of time.
The statistical measurement is based on unit hours of operation, using a sampling
procedure whose derivation takes into account the resolution with which the sample
represents the population from which it was withdrawn and the general pattern of
time behaviour of the devices.
Some practical observations:
(i) It would be extremely difficult to perform an accurate test demonstration to
verify even a failure rate of 1.0%/1000h, since the test equipment and
instrumentation must have a greater MTBF in order not to adversely affect the test
results. The problem becomes more complicated as the failure rate being tested
decreases: not only does test equipment complexity increase, but its MTBF must be
increased at the same time!
(ii) The terminology failure rate is perhaps a poor choice of words. To the
reliability engineer it relates the performance of a limited number of observations to
the probable performance of an infinite population. To those not familiar with the
statistics used, it unfortunately conveys the impression of an actual percent
defective.
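Observation (i) can be quantified: at 90% confidence with zero failures, the device-hours required scale inversely with the failure rate to be demonstrated. The sketch below uses the standard chi-square relation; the quantile is a table value:

```python
# Device-hours needed to demonstrate a failure rate at 90% confidence with
# zero failures: device-hours = chi2(0.90; 2) / (2 * lambda).
# (Standard chi-square method, shown here as an illustration.)
CHI2_90_2DF = 4.605  # chi-square 90th percentile, 2 degrees of freedom

def device_hours_needed(lambda_per_hour):
    """Zero-failure test time (in device-hours) for a given failure rate."""
    return CHI2_90_2DF / (2.0 * lambda_per_hour)

# 1.0 %/1000h corresponds to 1e-5 per hour
hours_1pct = device_hours_needed(1e-5)
# 0.01 %/1000h corresponds to 1e-7 per hour: one hundred times more test time
hours_001pct = device_hours_needed(1e-7)
```

About 230,000 device-hours are needed for 1.0%/1000h; every factor of ten reduction in the failure rate to be demonstrated multiplies the required test time by ten, which is why low failure rates are so hard to verify directly.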
Graphical presentations (Fig. 6.3 ... 6.5) have been found very useful to electronic
device users as a guide for reliability predictions.

Example: A sample of 950 C35 devices was subjected to full-load, intermittent
operation of 1000h duration in formal lot acceptance testing to MIL-S-19500/108.
Only one device was observed to be a failure to the specification end-point limits.
The calculation of failure rate based on these results indicates the failure rate to be
no more than 0.41% per 1000h at 90% confidence level.
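The 0.41%/1000h figure can be checked with the standard chi-square formula for a time-terminated test. The quantile used below is a table constant; this is a sketch of the calculation, not the original MIL computation:

```python
# Upper 90%-confidence failure-rate estimate from a time-terminated life
# test with r failures: lambda_UCL = chi2(conf; 2r+2) / (2 * device-hours).
CHI2_90_4DF = 7.779  # 90th percentile, 4 degrees of freedom (r = 1), table value

def lambda_ucl(devices, hours, chi2_quantile):
    """Upper confidence limit on failure rate, in failures per device-hour."""
    return chi2_quantile / (2.0 * devices * hours)

lam = lambda_ucl(devices=950, hours=1000, chi2_quantile=CHI2_90_4DF)
lam_percent_per_1000h = lam * 1000.0 * 100.0  # same number expressed in %/1000h
```

This reproduces the quoted bound: 7.779/(2 × 950 × 1000) ≈ 4.1 × 10⁻⁶ per hour, i.e. about 0.41%/1000h.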

6.3
Derating

The most probable thyristor failure mechanism is the degradation of blocking
capability, as a result of either an encapsulation flaw - or damage - or internal
contaminants. The process can be either chemical or electrochemical, and therefore variable
in rate according to the degree of temperature and/or electrical stress applied. Thus
it is possible by means of derating (using the device at stress levels smaller than the
maximum ratings of the device) to retard the process by which the failure of the
occasional defective device results. This slowdown of the degradation process
results in lower failure rate and increased MTBF.
Example. A sample of 778 devices is tested under maximum rated conditions for
1000h, with one failure observed. The calculated λ is 0.5%/1000h and the MTBF is
200,000h. If the failed device had remained within limits at the 1000h point
because of lower applied stresses, the calculated λ would become 0.3×10⁻⁵h⁻¹ and
the MTBF would increase to 333,000h.
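A short sketch of this arithmetic, using the same chi-square method at 90% confidence as in the previous example (the zero-failure MTBF comes out near the rounded 333,000h figure quoted above):

```python
# Reproducing the derating example at 90% confidence with the chi-square
# method (table quantiles; degrees of freedom = 2*failures + 2).
CHI2_90 = {2: 4.605, 4: 7.779}

def lam_and_mtbf(devices, hours, failures):
    """Failure rate (per device-hour) and MTBF for a time-terminated test."""
    lam = CHI2_90[2 * failures + 2] / (2.0 * devices * hours)
    return lam, 1.0 / lam

lam1, mtbf1 = lam_and_mtbf(778, 1000, 1)  # one failure observed
lam0, mtbf0 = lam_and_mtbf(778, 1000, 0)  # failure avoided by derating
```

With one failure, λ ≈ 5 × 10⁻⁶ h⁻¹ (0.5%/1000h, MTBF ≈ 200,000h); with the failure avoided, λ ≈ 0.3 × 10⁻⁵ h⁻¹ and the MTBF rises by roughly two thirds.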
The relationship of applied stress to General Electric SCR device failure rate is
shown graphically in Figures 6.3 ... 6.5. The model that describes the relationship of
these stresses to λ is the Arrhenius model:
log λ = A + B/Tj, (6.6)
where λ = failure rate expressed in %/1000h; Tj = junction temperature (in kelvin);
A and B = constants.
The Arrhenius model has been successfully applied by General Electric to
extensive life test data involving thousands of devices and millions of test hours.
The data was obtained from product design evaluations, military lot acceptance
testing, and several large scale reliability contracts.
A thorough examination of the data on all General Electric SCRs revealed that
these three graphical presentations could describe the results of derating failure rate
for the entire family of SCRs with reasonable accuracy. The use of these graphical
presentations is quite straightforward. Suppose, for example, that one intends to put
a C35D thyristor under some stress conditions (200 volts peak and a junction
temperature of +75°C) in a circuit. This circuit will become inoperative when the
electrical characteristics of the SCR change to values outside of the specification
limits. This is a definition of failure and this means that the solid lines on the
graphical presentations must be used. Since the rated junction temperature of the
C35D thyristor is +125°C, Fig. 6.4 must be used. Projecting a horizontal line from
the intersection of the +75°C junction temperature ordinate and the applicable per
cent of rated voltage curve (50% in this example), we obtain an estimated λ of
0.08% per 1000h at 90% confidence level. If - due to a change in the design of the

circuit - only devices which failed catastrophically (opens or shorts) would cause
the circuit to become inoperable, the dashed curves could be used. This would
result in an estimated λ of 0.008% per 1000h at 90% confidence.
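The graph reading above can be mimicked numerically. The constants below are hypothetical, chosen only so that λ(125°C) = 1.0%/1000h and λ(75°C) comes out near the 0.08%/1000h read from Fig. 6.4; they are not the General Electric fit:

```python
import math

# Hypothetical constants for the Arrhenius-type derating model.
# Chosen so that lambda(125 degC) = 1.0 %/1000h and lambda(75 degC) is
# close to 0.08 %/1000h -- NOT the General Electric fit.
B = -6997.0                 # kelvin (hypothetical slope)
A = -B / (125.0 + 273.15)   # forces lambda(125 degC) = 1.0

def failure_rate(tj_celsius):
    """lambda in %/1000h from ln(lambda) = A + B/Tj (natural log assumed)."""
    return math.exp(A + B / (tj_celsius + 273.15))
```

Under these assumptions, derating the junction temperature from +125°C to +75°C reduces the estimated failure rate by more than a factor of ten, which is the whole point of derating.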

[Figure 6.3: estimated λ (%/1000h, at 90% confidence level) versus junction temperature (175 to 50°C), with curves for 100, 75, 50 and 25 percent of rated reverse voltage and/or forward blocking voltage.]

Fig. 6.3 Estimated λ of a standard SCR depending on junction temperature, reverse and/or
forward voltage, and failure definition, for a maximum rated junction temperature of +100°C

[Figure 6.4: estimated λ (%/1000h, at 90% confidence level) versus junction temperature, for 25-100% of rated reverse and/or forward blocking voltage.]

Fig. 6.4 Estimated λ of a standard SCR depending on junction temperature, reverse and/or
forward voltage, and failure definition, for a maximum rated junction temperature of +125°C

[Figure 6.5: estimated λ (%/1000h, at 90% confidence level) versus junction temperature, for 25-100% of rated reverse and/or forward blocking voltage.]

Fig. 6.5 Estimated λ of a standard SCR depending on junction temperature, reverse and/or
forward voltage, and failure definition, for a maximum rated junction temperature of +150°C

6.4
Reliability screens by General Electric

As more effective procedures are developed, the reliability screens [6.1] are
updated. In the following, an example of one of these reliability screen
specifications is given:
100% preconditioning tests
1. High temperature bake at 150°C for 168h minimum.
2. Temperature cycle, MIL-STD-202C, method 107B, test condition F, excepting
that 10 cycles instead of 5 are performed.
3. Thermal resistance (junction to case) = 2°C/W.
4. Blocking burn-in: TA = +122°C ± 1.5°C, PRV = VBO = 400V, time = 100h
minimum.
5. Forward and reverse leakage: TA = +25°C, PRV = VBO = 400V,
IR = Is = 2mA maximum.
6. Gate trigger voltage: TA = +25°C, VGT = 3V maximum.
7. Gate trigger current: TA = +25°C, IGT = 40mA maximum.
8. Forward voltage drop: TA = +25°C, IF(peak) = 50A, VF = 2V maximum.
9. Forward and reverse voltage: TA = +125°C, PRV = VBO = 400V,
IR = Is = 5mA maximum.
10. Gate trigger voltage: TA = +125°C, VGT = 1.5V maximum, 0.25V minimum.
11. Gate trigger current: TA = +125°C, IGT = 30mA maximum, 0.5mA minimum.

12. Monitored vibration test: X or Z orientation for 30s minimum. Frequency =
60cps, double amplitude displacement ±0.1 inch minimum. Monitor 100V reverse
voltage on oscilloscope. Reject for any discontinuity, flutter, drift or shift in trace.
13. Radiflo leak test to 1×10⁻¹⁰ cc/s leak rate.
14. Bubble test: immerse in 90°C ± 5°C deionised water for 60s minimum.
Reject any unit which produces more than one bubble from the same point.

6.5
New technology in preparation:
the static induction thyristor SITH

The I-V characteristics of an on-state SITH are similar to those of a pin-diode [6.11];
the two devices are structurally the same in the current flow direction, so that the
analysis of a diode is usually adapted for the on-state SITH. Besides this, it has been
observed that the I-V behaviour of an on-state SITH is analogous to that of an SCR.
However, the blocking or turn-off performances of the SITH and SCR are quite
different as physical phenomena. Thus, an SCR is turned off by reversing the anode
voltage polarity. A SITH features an energy barrier mechanism. There are two opposite
electric fields in the channel: one originates from the anode bias, and the other - the
vertical component - from the gate bias. The two opposing fields reach equilibrium
at a certain point on the channel axis, leading the potential at this point to a mini-
mum. This minimum potential - if lower than zero - acts as a barrier to electrons.
The increasingly negative gate bias enhances the barrier and blocks off the
electrons. Under this condition, the injection of holes from the anode into the
channel is impossible. One cannot expect that a significant current will flow.
Therefore, by applying a sufficient negative gate bias, thereby raising the barrier
height, a SITH can be turned off. Fig. 6.6 shows the simplified structural model for SITH
simulations.
Considering the lateral diffusion, the junction surfaces AB and A'B' are treated
as parts of an ellipse with a major axis of xj and a minor axis of γxj (γ is a
coefficient determined experimentally). The symbols and the corresponding
dimensions are given in Table 6.2.
The original wafer is lightly doped and thin. In the blocking state, i.e. VG < 0 and
VA > 0, the depletion layer at the gate-channel junction expands greatly toward the
channel region when the negative gate bias is increased (even a little). On the other
hand, because the electrons are blocked off by the barrier, no holes can be injected
from the p+ layer of the anode region even though it is forward-biased, so as to
keep the space-charge neutrality. From this point of view, the depletion
approximation is reasonably introduced to treat the blocking state, for which the
simulation will be done.

[Figure 6.6: simplified cross-section of the SITH structure, with labelled dimensions (c, d, c', Wn, D, D').]

Fig. 6.6 Simplified structural simulation model of SITH

Table 6.2 Description of the device parameters for simulations

Parameter                                   Scale
Thickness of heavily-doped layer            120µm
Width of heavily-doped cathode              5µm
Width of gate space                         2µm
Gate-gate space                             20µm
Depth of gate-channel junction              8µm
Doping concentration of gate region         10²⁰cm⁻³
Doping concentration of anode region        10¹⁹cm⁻³
Doping concentration of cathode region      10²⁰cm⁻³
Doping concentration of original wafer      10¹²cm⁻³

The region of interest for the analysis is divided with equispaced grids. Poisson's
equation is discretised in the manner of a five-diagonal band matrix. A minimum
potential appears at the channel axis (Fig. 6.7).
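As a simplified illustration of such a discretisation (a 1-D analogue of the five-band 2-D scheme; the grid size and boundary values below are arbitrary), Poisson's equation reduces to a tridiagonal system solvable by the Thomas algorithm:

```python
def solve_poisson_1d(rho_over_eps, h, phi_left, phi_right):
    """Solve phi'' = -rho/eps on a uniform 1-D grid with Dirichlet ends,
    i.e. -phi[i-1] + 2*phi[i] - phi[i+1] = h**2 * rho_over_eps[i],
    using the Thomas (tridiagonal) algorithm -- a 1-D analogue of the
    five-diagonal 2-D discretisation mentioned in the text."""
    n = len(rho_over_eps)
    a = [-1.0] * n          # sub-diagonal
    b = [2.0] * n           # main diagonal
    c = [-1.0] * n          # super-diagonal
    d = [h * h * r for r in rho_over_eps]
    d[0] += phi_left        # fold boundary values into the right-hand side
    d[-1] += phi_right
    # forward elimination
    for i in range(1, n):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    # back substitution
    phi = [0.0] * n
    phi[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = (d[i] - c[i] * phi[i + 1]) / b[i]
    return phi

# Charge-free channel: the potential must vary linearly between the two ends.
phi = solve_poisson_1d([0.0] * 9, h=0.1, phi_left=0.0, phi_right=1.0)
```

With a non-zero space-charge profile on the right-hand side, the same solver produces the kind of potential minimum along the channel axis that Fig. 6.7 illustrates.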
Multiplying the potential by electron charge, the resulting energy distribution
along the channel axis has an extreme point (as shown in Fig. 6.8) which acts as a
barrier to the electrons. Obviously no electrons can be injected from cathode region
as long as the barrier is high enough [6.12][6.13]. As for the holes injected from the
p+ layer of anode region (if possible), they cannot find a way to take part in the
conducting. So it must be concluded that essentially there is no injected holes in the
channel. Therefore there is no significant current flowing in the blocking state
device.

[Figure 6.7: potential φ (V) along the channel axis x (µm), dipping to a minimum of about -0.8V inside the channel.]

Fig. 6.7 Potential distribution in SITH along channel axis

[Figure 6.8: electron energy (eV) along the channel axis (µm). Figure 6.9: barrier height (eV) versus gate voltage (V).]

Fig. 6.8 Electron energy distribution along channel axis
Fig. 6.9 Barrier height versus gate bias

However, the barrier vanishes when the gate bias changes in its polarity as
shown in Fig. 6.8. Then the device transfers to on-state which allows larger
currents. In contrast, it is possible to turn off the device by changing the gate bias to
a sufficiently negative value. An important feature of the device is that the barrier
height can be scaled artificially either by the gate bias or by the anode bias. In Fig.
6.9 the manner in which the barrier height varies with the gate bias is shown. At

given anode bias, according to Fermi-Dirac statistics, the current changes
depending on the barrier height. However, at given gate bias, only if a high anode
bias is applied to overcome the barrier is a large current allowed. The latter process
involves a so-called negative-resistance effect in the SITH. The outline of the
potential assumes the shape of a saddle. The saddle point, at which the electron
energy acts as a barrier to electrons, is observed in the channel near the cathode.
The crucial roles of the barrier in controlling the device properties, such as I-V
behaviour and turn-off performance, are discussed in [6.11].

References

6.1 Grafham, D. R.; Golden, F. B. (eds) (1979): SCR Manual, sixth edition. General Electric,
Auburn, N.Y.
6.2 Motto, J. W. (1977): Introduction to Solid State Power Electronics. Westinghouse Electric
Corp., Pennsylvania
6.3 Locher, R. E.: Thermal Mounting Considerations for Plastic Power Semiconductor
Packages. Application Note 200.55, General Electric, Auburn, N.Y.
6.4 Antognetti, P. (ed.) (1986): Power Integrated Circuits. McGraw-Hill, New York
6.5 Borri, F. R.; d'Espinosa, G. (1986): Power Integrated Circuit Reliability. In: Antognetti, P.
(ed.) Power Integrated Circuits. McGraw-Hill, New York
6.6 Băjenesco, T. I. (1981): Problèmes de la fiabilité des composants électroniques actifs
actuels. Masson, Paris
Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. EDV Verlag, Berlin
Băjenescu, T. I. (1996): Fiabilitatea componentelor electronice. Editura Tehnică, Bucharest
6.7 Cristoloveanu, S.; Li, S. S. (1995): Electrical Characterization of Silicon-on-Insulator
Materials and Devices. Kluwer Academic Publishers
6.8 AEG-Telefunken (1985): Gate Turn-Off Thyristors. Technical data
6.9 Lawson, R. W. (1974): The Accelerated Testing of Plastic Encapsulated Semiconductor
Components. Reliability Physics
6.10 Ajiki, T. et al. (1979): A New Cyclic Biased THB Test for Power Dissipating IC's.
Reliability Physics
6.11 Li, S. Y. et al. (1995): Theoretical Analysis of Static Induction Thyristor. Proceedings of
the Fourth International Conference on Solid-State and Integrated-Circuit Technology.
Beijing (China), October 24-28
6.12 Bulucea, C.; Rusu, A. (1987): A First-Order Theory of the Static Induction Transistor.
Solid-State Electron., vol. 30, pp. 1227-1242
6.13 Akira, Y. (1987): Investigation of Numerical Algorithms in Semiconductor Device
Simulation. Solid-State Electron., vol. 30, pp. 813-820
6.14 Băjenescu, T. I. (1984): Sur la fiabilité des thyristors. Électronique, vol. 4, pp. 26-31
6.15 Grafham, D. H.; Hey, J. C. (1977): SCR Manual. Fifth edition. General Electric, Syracuse,
New York
6.16 Bodea, M. (1989): Diode și tiristoare de putere (Power diodes and thyristors). Ed. Tehnică,
Bucharest (Romania)
6.17 Ackmann, W. (1976): Zuverlässigkeit elektronischer Bauelemente. Hüthig-Verlag,
Heidelberg
6.18 Anderson, R. T. (1976): Reliability Design Handbook. IIT Research Institute, Chicago
6.19 Bell Communications Research (1985): Reliability Prediction Procedure for Electronic
Equipment (TR-TSY-000332). Bell, Morristown, NJ
6.20 Dombrowski, E. (1970): Einführung in die Zuverlässigkeit elektronischer Geräte und
Systeme. AEG-Telefunken, Berlin
6.21 Doyle, E. A. Jr. (1981): How parts fail. IEEE Spectrum, October, pp. 36-43
6.22 Kao, J. H. K. (1960): A summary of some new techniques on failure analysis. Proc. Annual
Symp. Reliability, pp. 190-201
6.23 Kapur, K. L.; Lamberson, L. R. (1977): Reliability in Engineering Design. J. Wiley and
Sons, New York
6.24 Siemens, SN 29 500 (1986): Failure Rates of Components. Zürich, Siemens-Albis
6.25 Villemeur, A. (1988): Sûreté de fonctionnement des systèmes industriels. Eyrolles, Paris
7 Reliability of monolithic integrated circuits

7.1
Introduction

Even from the beginning, the semiconductor industry was characterised by a high
innovation rate. A spectacular moment was the appearance of the integrated circuits
on the market, allowing large price cuts and performance growth. The first
integrated circuit (reported by Jack Kilby and Robert Noyce) was not a sudden
discovery, being prepared by previous devices. Invented in 1958, the solid-state
circuit was developed in 1959, when the planar technique arose. This was the
milestone for the subsequent development of the monolithic integrated circuits,
containing bipolar and unipolar (mostly MOS) transistors, based on a silicon
substrate. The global market for semiconductor devices increased by 15% per year
in the last twenty years, reaching $140 billion in 1997.
The huge progress obtained in the integrated circuit field led to smaller
electronic equipment and reduced costs, but also to improvements in power
capability, reliability and maintainability. The predictions are that the strong
worldwide increase of the computer and communication market will lead to an
even higher growth rate of the semiconductor industry in the next decade, 20% per
year, with a level of $300 billion immediately after the year 2000 [7.1].
The complexity of ICs increased every year. In fact, Gordon Moore, in the 70's,
talked about a doubling of IC complexity every 18 months, with a corresponding
decrease in cost per function¹. This became the so-called Moore's Law, proved to
be true for more than the subsequent twenty years. Many factors contributed to
keeping this model on course: the improvement of design tools and manufacturing
technologies, but also the permanent growth of the reliability level. The intrinsic
reliability of a transistor in an IC improved by two orders of magnitude (the
failure rate decreased from 10⁻⁶h⁻¹ in 1970 to 10⁻⁸h⁻¹ in 1997). But also, in the same
period of time, the number of transistors per device increased by 9 orders of
magnitude! Therefore, the IC reliability increased even faster than the prediction
given by Moore's Law. The model for reliability growth was called "Less's Law",
taking into account the known philosophy from the architectural design: "Less is

¹ The cost per function is made up of two terms: aNᵇ (increasing with complexity [7.2] and
representing the chip cost) and c/N (representing assembly and testing costs), where N is the
number of functions and a, b, c are constants.

T. I. Băjenescu et al., Reliability of Electronic Components
© Springer-Verlag Berlin Heidelberg 1999

More" [7.1]. Actually, Less's Law means a tremendous increase of the requirements
for the IC's failure rate: from 1000 failures in 10⁹ device-hours (or 1000
FITs), now only a single-digit number of FITs is required. It is worthwhile to note
the change in the predictions made by the Semiconductor Industry Association
(SIA) in the 1994, 1995 and 1997 editions of the National Technology Roadmap
for Semiconductors [7.1][7.3]. From Table 7.1, one may see that the forecast was
surpassed by reality: the performances foreseen in 1994 and 1995 for 1998
were attained earlier, in 1997.

Table 7.1 Predictions for Si CMOS technology development: 1994, 1995 and 1997 editions of
the National Technology Roadmap for Semiconductors

         Minimum feature (µm)       Lithography
Year     1994   1995   1997         1994        1995        1997
1995     0.35   0.35   -            Optical     Optical     -
1997     -      -      0.25         -           -           248nm
1998     0.25   0.25   -            248nm       248nm       -
1999     -      -      0.18         -           -           248/193nm
2001     0.18   0.18   -            248/193nm   248/193nm   -
2003     -      -      0.13         -           -           -
2004     0.13   0.13   -            -           -           193nm
2006     -      -      0.10         193nm       193nm       -
2007     0.10   0.10   -            -           -           -
2009     -      -      0.07         -           -           -
2010     0.07   0.07   -            -           -           -
A practical example of how a technology improvement influences the IC features
is given in Fig. 7.1. The addition of copper to the aluminium-copper alloys used for
metallisation makes it possible to avoid electromigration (a well-known failure
mechanism) and to increase the current density. This tendency was observed from
1982 till 1995. Beyond that, a further increase of the copper content produces a
growth of the resistivity, limiting the allowed current density (see the Al-Cu line in
Fig. 7.1). The solution seems to be the use of copper as a supplementary
metallisation layer, over the Al-Cu layer (the dotted line Cu/Al-Cu in Fig. 7.1).

[Figure 7.1: allowed current density (MA/cm², from 0.1 to 0.7) versus year (1980-2005), with curves for Al, Al-Cu and Cu/Al-Cu metallisations.]

Fig. 7.1 Evolution of the metallisation technology and corresponding allowed current densities

The complex structure of ICs has a three-dimensional architecture that must be
reproduced in every circuit. To do this, many layers are used. In the manufacturing
process, this sequence of layers is built according to the previously elaborated
design.
First, the designer analyses the functional block diagram describing how the
circuit must operate. Then the processing steps needed for manufacturing the circuit
are selected and the size and location of each circuit element are estimated.
Even at this stage, the testing, manufacturing and reliability engineers must join
the development team. This is in accordance with the concurrent engineering approach.
The reliability engineer estimates the reliability of the future product, by means of
appropriate prediction methods. Computer simulations must be performed and, if
necessary, design features may be changed.
Once the design is completed (this process can take several months, even years),
the masks are made. A primary pattern, called the reticle, is checked for errors and
corrected. Then, by a step-and-repeat process, the image of the device is reproduced
side by side, hundreds of times. The original plate, called the master, is copied
by direct contact printing to obtain the submasters. Each submaster is used to
produce many replicas, named working plates (photographic emulsion plates or
chromium plates), that will be used for subsequent manufacturing.

[Figure: schematic cross-sections a-f, showing the oxide layer, diffusion window, diffusion area (p+), metal layer and the underlying N and P regions.]
Fig. 7.2 Main sequences of the planar process: a starting material; b deposition of an epitaxial n layer; c passivation (with an oxide layer); d photolithography; e diffusion of a p+ layer; f metallisation

The fabrication consists of a series of sequences (see Fig. 7.2). The starting material
is a wafer of monocrystalline semiconductor (silicon, but also gallium arsenide),
with a thickness of 400 µm and a diameter of 3-5 inches. First, an oxidation is
performed by heating the wafer at high temperature (1000-1200°C) in an oxygen
atmosphere. In this way, a uniform oxide layer, with a thickness of 0.1-1.5 µm, is
formed over the whole wafer. The local removal of the oxide is made by photolithography:
a photographic process, with the aid of a photoresist layer, allowing one to

obtain very small windows (a few µm²) by etching the undesired areas with appropriate
chemicals. In these windows, doping impurities are diffused or implanted.
Diffusion and ion implantation are accurate procedures for modifying the
electrical properties of the silicon layers, the essential element of integrated
circuit technology. One of the most important subsequent operations is the metallisation,
meaning the interconnection, by a deposited metallic layer (most often aluminium),
of all diffused elements. Once the wafer is processed, the back-end fabrication
begins. The wafer is tested, scribed and broken (the chips are separated). Then, each
chip is soldered on a header, an operation called die bonding. Next, each metallic
area (called a pad) is connected by means of gold or aluminium wires (with a
diameter of 25-35 µm) to the terminals, in an operation named wire bonding.
Finally, a package made of metal, ceramic or plastic material covers the whole
assembly.
Basically, two types of integrated circuits have been developed so far: bipolar and
MOS, according to the basic cell: bipolar or MOS transistors, respectively. In the
beginning, the MOS ICs were n-channel or p-channel MOS ICs, but soon
complementary MOS ICs (CMOS ICs), including both types, were developed. The
main characteristic of CMOS circuits is the small supply voltage. As the
portable-electronics market grows, low-power and low-voltage technologies, such
as CMOS, have become the most widely used. Also, the technological
improvements leading to the removal of sodium contamination in the Si-SiO2 system
encourage the use of CMOS ICs, because a high reliability level becomes achievable.
Recently [7.4], a reduced standard digital CMOS power supply voltage of 3.3 V
was obtained, cutting the power consumption by 70%. These ULP (Ultra Low
Power) CMOS ICs were investigated in depth [7.4] and proved to have a large
potential.
The latest challenge in the IC family is the microsystem. Arising in the early 90's,
the microsystem represents a step of integration superior to common ICs:
the "intelligent" element (the signal processing part) is integrated with microsensors
and microactuators in a single component, basically still an IC [7.5]. In
fact, the microsystem is a "smart" sensor, also able to actuate. This device drives
the development of new microtechnologies. As many disciplines are
involved, hybrid terms such as mechatronics, chemtronics and bioinformatics were
used [7.6], but the term microtechnology (technology of microfabrication) seems to
be the most adequate. In Europe, the term microtechnology includes both
microelectronics (the "classical" devices) and microsystem technology (MST) [7.76].
Other related terms are MEMS (Micro-Electro-Mechanical Systems), BIO-MEMS
(BIOlogical MEMS) and MEOMS (Micro-Electro-Opto-Mechanical Systems).
Silicon is still the basic material and the CMOS technology can be used for the
manufacturing.
Recently, a new term, nanotechnology, was proposed, because the structures
now have characteristic features of a few nanometres. Accordingly, the tools used
for manufacturing with these new technologies (micro- and nano-) are called
micromachines and nanomachines, respectively.

7.2
Reliability evaluation

7.2.1
Some reliability problems

From a theoretical viewpoint, a higher reliability is expected for integrated circuits
than for discrete components. In practice, these expectations were surpassed, because
some basic conditions were fulfilled:
• high-level and automated industrial fabrication,
• constant quality of materials,
• screening tests for finished products.
Before stating which type of integrated circuit is adequate for a specific use, one
must carefully analyse the design, the most important parameters, the dimensions,
the costs and the limits imposed by the reliability. Such studies must take into ac-
count:
• a comparison of the total costs (development, manufacturing and testing costs),
• a comparison of the important parameters (resistor tolerances, temperature
coefficient, speed, voltage level) with limitations specific to each type,
• an evaluation of the dissipated power of the circuit and of the thermal resis-
tance of the encapsulated circuit, because an acceptable temperature of the
substrate must be assured.

7.2.2
Evaluation of integrated circuit reliability

Generally, three main problems arise in the evaluation of integrated circuit reliability.
1. For modern devices, the failure rates decrease below a certain limit and the conventional
methods become less usable. To overcome these difficulties, two solutions
may be discussed:
a) To perform reliability tests on a very high number of integrated circuits in
normal operational conditions, lasting a couple of years. Obviously, this
solution is unacceptable. As an example, to demonstrate a failure
rate of 10⁻⁹ h⁻¹ (also called 1 FIT), 1000 devices must be tested for 114 years,
with only one device found defective.
b) To perform reliability tests on some integrated circuits in higher than normal
stress conditions, the so-called accelerated tests. This method may be applied only if,
in the accelerated tests, the failure mechanisms are the same as for normal
operational conditions; and this fact must be indubitably demonstrated.
Accelerated tests are used in order to obtain information about the reliability of
the product quickly and with minimum expense. The stresses used are
higher than in normal operational conditions; the results are extrapolated and the
failure rate for normal conditions is obtained. Usually, the accelerated tests contain
combinations of stresses such as temperature, bias, pressure, vibrations, etc. [7.32].
If the temperature is the only variable of the accelerated tests, the Arrhenius model
may be used. To obtain reliable results, relatively short testing times must be used2.
So, using various levels of the same stress factor, one may follow the real
behaviour3. The analysis of the physico-chemical processes leading to failure allows
obtaining the correlation between the speed of these phenomena and the stress and, as
a result, the real dependence of time to failure on stress levels.
2. The rapid development of the manufacturing technology for integrated circuits,
driven by the aim of improving control and reducing costs, makes reliability
evaluation difficult. Usually, any modification in the technology or in the materials
used is followed by the appearance of a new failure mechanism. Consequently, any
manufacturing modification must include a new reliability evaluation.
3. The last problem is linked to the increasing complexity and costs of integrated
circuits vs. discrete devices. Although the cost of a given electronic function decreases
substantially by integration, the basic costs are always higher for an integrated
circuit than for a discrete component fulfilling the same function.
The definition of the failure criteria is, unavoidably, very difficult, because the
complexity of ICs is increasingly higher. Even for a simple device, like a transistor,
it is hard to define the failure limits. For an integrated circuit, the basic parameters
are more complex and harder to measure, and the degradation of these parameters
differs from one application to another.
For evaluating the various stresses that may be used in accelerated reliability tests,
the following aspects must be taken into account [7.7][7.8]:
• The stress must be encountered in the operational environment. In principle,
one must note that the failure rate of integrated circuits is influenced by the
thermal, electrical and mechanical conditions of the operational environment.
But for common industrial use, mechanical shock and vibrations have little
influence on integrated circuits encapsulated in epoxy packages, which
assure the necessary mechanical stability and a good protection. For instance,
the acceleration measured at a sudden stop of a running car reaches 40 g, for
airplane take-off and landing it is up to 5 g, and for missiles up to 50 g. Compare
these values with the acceleration level used for periodic tests: 30,000 g.
Consequently, mechanical factors will be used only rarely for accelerated tests.
On the contrary, the temperature is the most used stress for this kind of test.
The experimentally observed correlation between failure rate and temperature
is based on the fact that the speed of the chemical reactions arising in the device
is thermally increased.
• The failure mechanisms must always be those arising in the operational
environment.

2 Even if the purpose is to minimise the testing time, too strong a stress level must not be used,
because new failure mechanisms may be induced.
3 If time is the accelerated variable, this means that one hour of testing at a high stress level
produces the same effect on the component reliability as n hours of normal operation.

• All samples of integrated circuits used in accelerated tests must behave in the
same way when the stress is modified: the same circuits should be the first to fail at
any stress level.

7.2.3
Accelerated thermal test

The use of accelerated tests starts from the assumption that the possible failure
mechanisms are well known. The systematic use of temperature as an
accelerating stress was introduced by Peck [7.9], at the beginning of the 60's.
Initially received with hesitation and scepticism, the technique became useful for
component producers and users. The experience in the utilisation of electronic
components shows that the life duration is not infinite and the operational
conditions are important. For initially "good" integrated circuits, failures before the
end of the normal life duration were found, such as:
• catastrophic failures, breaking the normal operation,
• drift failures, producing defective operation by an important time variation of
the electrical characteristics.
One must understand that the appearance of a failure is not a proof that the life
duration is shorter. The drift failure is hard to define, depending on the drift
threshold stated as the failure criterion. In practice, the accelerated thermal tests are
not sufficient for estimating the reliability of a product. Step stress tests must also
be performed [7.10]. In this case, the initial hypotheses are that the stress has no
memory and that wearout does not arise. Samples withdrawn from the batch
undergo these tests at increasingly higher stress levels (temperature, bias,
mechanical stress, etc.). A careful analysis of the results allows an accurate
estimation of the product reliability [7.33]. For plastic encapsulated integrated
circuits the leak test makes no sense, because the package has no internal cavity; on
the contrary, humidity penetration tests are recommended. Concerning the equivalence
between the operating hours at the standard temperature of 55°C and the operating hours at a
higher or lower temperature, the acceleration factors are presented in Table 7.3
[7.11], for various activation energies. (In Table 7.2, the acceleration factors for
operation at 125°C vs. the normal ambient temperature of 25°C, for various
activation energies, are given.)
It is important to note that without knowing the value of the activation energy,
no correct analysis of the data obtained from laboratory tests is possible. It is worth
mentioning that the international standardisation bodies do not always take into
account the importance of the activation energy for data processing. For instance,
the standard MIL-HDBK-217D chose for the bipolar technology an overall
activation energy of 0.4 eV, and MIL-STD-883C, method 1005.2, uses for all
devices the value of 0.7 eV. The accelerated thermal stress has a big disadvantage:
there is a high probability that new failure mechanisms occur at certain stress levels.
This disadvantage disappears for comparative evaluations of subsequent batches of
integrated circuits. In any case, thermal acceleration is not a panacea for saving
time or money in estimating the life duration of integrated circuits.

Table 7.2 Acceleration factors at an operating temperature of 125°C vs. 25°C

Ea (eV)   0.3   0.4   0.5   0.6   0.7   0.8    0.9    1.0
F         19    50    133   354   932   2500   6561   17579

Table 7.3 Acceleration factors for various activation energies and testing temperatures vs. a
testing temperature of 55°C

Testing       Activation energy (eV)
temp. (°C)    0.3     0.4     0.5     0.6      0.7      0.8      0.9      1.0

0             0.12    0.058   0.03    0.01     0.007    0.003    0.0016   0.0008
25            0.34    0.24    0.17    0.12     0.08     0.06     0.041    0.0285
35            0.50    0.40    0.32    0.25     0.2      0.16     0.127    0.1007
55            1.0     1.0     1.0     1.0      1.0      1        1        1
70            1.59    1.85    2.17    2.53     2.9      3        4        5
85            2.43    3.27    4.4     5.91     7.9      11       14       19
100           3.59    5.5     8.43    12.9     19.7     30       46       71
125           6.46    11.99   22.4    41.7     77.3     145      269      501
150           10.8    23.86   52.6    117.1    257.5    573      1267     2804
175           17.1    44.0    113.8   293.0    750.0    1948     5021     12942
200           25.8    75.9    225.4   666.0    1954.0   5818     17197    50824
250           52.1    193.6   728.0   2718.0   10062.0  37928    141712   528415
300           93.1    419.8   1914.0  8676.0   38935.0  178254   808013   3663626
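The entries of Tables 7.2 and 7.3 follow directly from the Arrhenius model; a short Python sketch (assuming the usual Boltzmann constant of 8.617 × 10⁻⁵ eV/K) reproduces them to within rounding:

```python
import math

K_BOLTZMANN = 8.617e-5  # eV/K

def acceleration_factor(ea_ev: float, t_use_c: float, t_test_c: float) -> float:
    """Arrhenius acceleration factor between two temperatures given in Celsius."""
    t_use = t_use_c + 273.15
    t_test = t_test_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN * (1.0 / t_use - 1.0 / t_test))

# Reproduce a few tabulated entries for Ea = 0.7 eV:
print(round(acceleration_factor(0.7, 25, 125)))  # close to 932 (Table 7.2)
print(round(acceleration_factor(0.7, 55, 125)))  # close to 77.3 (Table 7.3)
```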

7.2.4
Humidity environment

Compared with high temperature testing, the experience acquired in high humidity
testing is much smaller. However, the failure distribution of integrated circuits due to
high humidity seems to be log-normal [7.32], and the average lifetime seems to
depend directly on the vapour pressure of the humid environment. In the
early days, to perform tests in such an environment, a temperature of 25°C and a
relative humidity of 75% were recommended. A test performed in these conditions,
with bias and a duration of 20 days, simulated an ageing of 20 years [7.12].

Lately, a more efficient test is used, the so-called "85/85 test" (85°C and 85%
relative humidity).
In the 70's, the plastic package had a high permeability, which could lead to
catastrophic failures by corrosion of the aluminium metallisation. In any case, it
was difficult to predict the behaviour of plastic encapsulated circuits, because
hermeticity was not guaranteed. For special utilisation conditions, metallic or
ceramic packages were recommended. The latest developments in plastic packages
(see Chap. 12) led to the achievement of high-reliability plastic packages, usable in
the most hostile environments.
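A widely used empirical form for combined temperature-humidity acceleration (not given in the text) is Peck's model; the sketch below uses typical literature values for the exponent n and the activation energy, not figures from the authors:

```python
import math

K_BOLTZMANN = 8.617e-5  # eV/K

def peck_acceleration(rh_use, rh_test, t_use_c, t_test_c, n=3.0, ea_ev=0.79):
    """Peck's model: AF = (RH_test/RH_use)^n * exp(Ea/k * (1/T_use - 1/T_test)).
    n and ea_ev are assumed typical values for aluminium corrosion."""
    humidity_term = (rh_test / rh_use) ** n
    temp_term = math.exp(ea_ev / K_BOLTZMANN *
                         (1.0 / (t_use_c + 273.15) - 1.0 / (t_test_c + 273.15)))
    return humidity_term * temp_term

# Acceleration of the 85 C / 85 % RH test vs. a 25 C / 60 % RH office environment:
af = peck_acceleration(rh_use=60, rh_test=85, t_use_c=25, t_test_c=85)
print(f"{af:.0f}x")
```

With these assumed parameters the 85/85 test compresses several decades of office-environment exposure into a few weeks, which is why it displaced the older 25°C/75% RH test.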

7.2.5
Dynamic life testing

The failure of an IC in operational life is an unpleasant event, not only because the
owner of the equipment must replace it, but also because this failure may induce
serious damage to the equipment, loss of important information or even of human
life. Therefore, it is desirable to replace an IC before failure. For economic
reasons, this replacement must take place shortly before the anticipated failure.
This implies that the lifetime of the IC be accurately estimated, which is possible
only if laboratory tests simulating as closely as possible the real operational
life are performed. In this respect, in the laboratory, not only static, but also
dynamic testing must be done. The purpose is to quantify the performance
degradation during IC operation. An example of such testing is given by Son and
Soma [7.13]. First, the IC parameters to be monitored during dynamic life
testing are chosen, by two criteria: i) to be measurable at existing pin-outs, and ii)
to progressively predict IC degradation. Then, the typical failure and degradation
mechanisms must be studied. In fact, there are two major types of degradation
mechanisms: electrical ones (such as latchup, ESD, hot-carrier effect, dielectric
breakdown, electromigration, etc.) and environmental ones (produced by thermal
and mechanical stress, humidity, etc.). By means of appropriately chosen electrical
parameters (such as static/transient current level change, noise level in current,
cut-off frequency, input offset voltage of a CMOS differential amplifier, etc.), these
mechanisms are monitored during dynamic life testing.
Eventually, aging models for the various failure mechanisms must be elaborated. In
[7.13] a model for the hot-carrier effect is given. Starting from a widely accepted
empirical relationship between parameter deviation and the elapsed stress time for
the hot-carrier degradation mechanism, given in [7.14], an aging curve due to the
hot-carrier effect under static or periodically repeated AC stress was obtained
[7.13], defined by the equations:

ΔV/V0 = k · t^a (7.1)

k = C · (Ids/w) · exp(-Φi/(q · p · Ech)) (7.2)

where C is a constant, a and k are coefficients of the aging curve (depending on the
technology and on the IC structure), Ids is the drain current, w is the channel width of the
MOS transistor, Φi the minimum energy required to cause impact ionisation, q
the electron charge (1.6 × 10⁻¹⁹ C), p the hot-electron mean free path, Ech the
channel electric field, ΔV/V0 the circuit aging and t the elapsed stress time. In a
log(ΔV/V0) vs. log t plot, a straight line with slope a and y-intercept log k is
obtained (see Fig. 7.3).
From the case study given in [7.13], one may understand the procedure for
replacing an IC before a failure by the hot-carrier effect occurs. For a 31-stage inverter
chain, designed according to MOSIS 0.8 µm HP technology rules, the device
operation was simulated, the ageing being modelled by randomly changing device
parameters. Based on this model, the probability of survival until the next
inspection may be quantified at each inspection of dynamic life testing. Then, the
optimal moment for replacement may be calculated with respect to the maintenance
cost (the recovery cost of an unanticipated failure and the wasted cost of replacing
an IC too early).

[Figure: straight line in log-log coordinates with slope a and y-intercept log k.]
Fig. 7.3 A log(ΔV/V0) vs. log t plot for the hot-carrier degradation mechanism
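The slope/intercept reading of Fig. 7.3 can be illustrated numerically; the coefficients below are hypothetical, chosen only to show that a straight-line fit in log-log coordinates recovers a and k:

```python
import numpy as np

# Hypothetical coefficients for the aging curve delta = k * t**a (Eq. 7.1);
# values chosen for illustration only, not taken from [7.13].
a_true, k_true = 0.35, 2.0e-4

t = np.logspace(0, 4, 50)          # elapsed stress time, hours
delta = k_true * t ** a_true       # circuit aging, delta V / V0

# A straight-line fit in log-log coordinates recovers slope a and intercept log k:
slope, intercept = np.polyfit(np.log10(t), np.log10(delta), 1)
print(slope, 10 ** intercept)      # close to a_true and k_true
```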

7.3
Failure analysis

7.3.1
Failure mechanisms

Any user of an integrated circuit wants to eliminate from the beginning the devices
that will fail. To do that, the typical failure mechanisms must be known. In the
following, some examples of typical failure mechanisms for integrated circuits will
be given.
In general, the failure mechanisms of integrated circuits are divided into three
categories, referring to: the wafer (chip), the die-package connections and the package,
respectively.
In the 60's, the die and wire bonds were the main critical problem, leading to
20-30% defects at these operations. The numerous technical and technological
advances obtained lately have substantially improved the situation, without solving all
the problems. The increase of the integration degree brings about many problems
linked to the chip and package reliability, respectively. With increasing integrated circuit
complexity, the failure potential increases too, because external factors (static
electricity, overload, transient regimes) play an important role during the
component life. The influence of these factors may be reduced by an appropriate
design or by well-adapted manufacturing methods.
The failure mechanisms may also be divided as follows [7.16]:
• mechanisms depending on the temperature,
• mechanisms depending indirectly on the temperature,
• mechanisms not depending on the temperature.
The influence of the failure mechanisms depending on the temperature increases as
the temperature increases, a phenomenon easily explained by the chemical nature of
these failure mechanisms. With the aid of the Arrhenius relation, not only the
dependence of the failure rate on the temperature, but also the component failure
rate itself can be obtained. If the calculated junction temperature differs too much
from the real one, the predicted average value may differ substantially from the real
value. As an example, if the component operates not at a junction temperature
of 80°C, as usual, but at 100°C, the failure rate increases 5.8 times.
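Assuming an Arrhenius dependence, the 5.8× figure corresponds to an activation energy of roughly 1 eV; this value is inferred here for illustration, not stated in the text:

```python
import math

K_BOLTZMANN = 8.617e-5  # eV/K

# Ratio of failure rates at two junction temperatures, assuming an Arrhenius
# dependence with an activation energy of about 1 eV (an assumed value that
# reproduces the 5.8x figure quoted in the text):
ea_ev = 1.0
t1, t2 = 80 + 273.15, 100 + 273.15
ratio = math.exp(ea_ev / K_BOLTZMANN * (1 / t1 - 1 / t2))
print(round(ratio, 1))  # -> 5.8
```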
The temperature may also have an indirect influence, because failures often
arise after many temperature cycles. The higher the number of temperature
changes, the higher the failure probability (aluminium fatigue is the
main failure mechanism in this respect).
Failure mechanisms such as the dielectric breakdown or the degradation of the
transistor current gain (produced by the avalanche breakdown of the base-emitter
junction) do not depend on the temperature. These mechanisms may be evidenced
with the aid of tests accelerated by factors other than temperature.

7.3.1.1
Gate oxide breakdown

This is a typical mechanism for MOS ICs: shorts through the thermal oxide,
between the metallisation and the silicon, may arise. The thin gate oxide (a few
hundred angstroms) may be affected by this phenomenon, especially if defects
or impurities are present in the oxide layer. As a screening test, voltages higher than
the rated value are applied and the devices with a too-thin (or defective) oxide layer
are removed. Also, gate protection circuits (reverse-biased pn
junctions with controlled breakdown characteristics) may be used to absorb large
pulse energies.
This mechanism is time-dependent, being known as TDDB (Time Dependent
Dielectric Breakdown), a major problem for MOS ICs. The SAG model (Shatzkes
/ Av-Ron / Gdula) [7.41] tried to explain TDDB for this gate oxide. It takes into
account only a single type of defect, attributed to very small weak spots arising at the
metal-oxide interface, where the barrier height (2.3 eV) for electron injection into
the oxide is lower than the barrier height (3.2 eV) of the defect-free area. Further, by
taking into account the interaction (synergy) between the applied stresses (temperature,
electric field, etc.), other models were developed [7.42]. By using a proportional
hazards approach, with Weibull or lognormal distributions, an accurate model was
obtained [7.43].

7.3.1.2
Surface charges

One knows that for a MOS transistor the conductivity type and the resistivity of the
semiconductor surface are modified by the presence of a charged zone situated in the
close neighbourhood or separated by a thin dielectric layer. Such a charging
phenomenon produces, on an unprotected silicon surface, an absorption of
mobile ions, moved by the action of an electric field. This phenomenon
leads to the displacement of the pn junction at the surface and may produce a
charging area4. These areas may extend and come into contact with grounded
regions, producing a short circuit. The symptoms are: smaller breakdown
voltages and higher leakage currents. A phosphosilicate glass (PSG) layer deposited
as a passivation onto the thermal oxide is often used as a getter for sodium ions
(they are fixed and do not migrate through the oxide).

7.3.1.3
Hot carrier effects

The electrons or holes from the channel of a MOS transistor can gain high energy,
becoming able to produce current multiplication by impact ionisation, i.e. by
creating additional electron-hole pairs. Then, if they continue to gain energy, an
injection into the gate dielectric, by surpassing the energy barrier, can occur.
Consequently, the carriers become trapped in the oxide. The number of trapped
carriers depends on the density of available traps in the silicon dioxide.
One may distinguish three types of hot carriers: channel ones (the carriers
traversing the channel and undergoing a low number of lattice collisions under the
influence of the strong lateral electric field), substrate ones (thermally generated in
the substrate and drifted by an electric field towards the interface) and avalanche
ones (created in the avalanche plasma and undergoing, due to the strong lateral electric
field, a high number of impact ionisations). The substrate current produced by
impact ionisation can induce bipolar latch-up in CMOS structures, and the hot
carriers injected into the gate oxide form interface states and trapped oxide charge. In
time, this charge causes instabilities and parameter drift. These serious reliability
problems increase as the device geometries shrink. A correction
method is to limit the source-drain voltage to values below the threshold for the
generation of hot carriers.

7.3.1.4
Metal diffusion

At the junction of two metals, a slow interdiffusion of the neighbouring atoms
occurs. The mechanical and electrical properties of this area (which gradually
becomes a mixture of the two metals) may be modified. Because in microelectronics
the temperature plays an essential role, and due to the fact that the majority of
failures are based on physico-chemical reactions, the Arrhenius model is currently
employed.

4 The higher the semiconductor resistivity, the smaller the charge value.

This model describes the relationship between the time to failure, the
temperature and the activation energy:
t = A · exp(Ea/kT) (7.3)

where: t = time to failure, A = a constant, Ea = the activation energy, k = Boltzmann's
constant, T = temperature (K).
In most cases when a diffusion of metal atoms or metal ions occurs, a linear
relationship may be obtained:

ln t = a + b/T (7.4)

where a and b are constants. For A = 1 and Ea = 1.1 eV, the resulting straight line
(on a double logarithmic scale) is plotted in Fig. 7.4. The time to failure in normal
operational conditions may be calculated by extrapolation (a parallel to the line of
1.1 eV), if the time to failure at a higher temperature is known.
The Arrhenius model includes the influence of the temperature and of the
activation energy corresponding to a failure mechanism, making it possible to predict
the reliability of a component in normal operational conditions, based on the results
of tests performed at higher semiconductor junction temperatures. A specific
activation energy corresponds to each failure mechanism.
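The extrapolation described above can be sketched as follows; the 1000 h, 150°C and 50°C figures are hypothetical, chosen only to illustrate Eq. (7.3):

```python
import math

K_BOLTZMANN = 8.617e-5  # eV/K

def extrapolate_ttf(ttf_test_h, t_test_c, t_use_c, ea_ev):
    """Extrapolate a time to failure measured at a stress temperature down to
    the use temperature, following Eq. (7.3): t = A * exp(Ea/kT)."""
    t_test = t_test_c + 273.15
    t_use = t_use_c + 273.15
    return ttf_test_h * math.exp(ea_ev / K_BOLTZMANN * (1 / t_use - 1 / t_test))

# Hypothetical numbers: 1000 h to failure at 150 C, Ea = 1.1 eV, use at 50 C.
print(f"{extrapolate_ttf(1000, 150, 50, 1.1):.3g} h")
```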

[Figure: straight lines of failure rate λ (%/1000 h, from 0.0001 to 100) vs. temperature (50-250°C), both on logarithmic scales.]
Fig. 7.4 Plot of the Arrhenius model for A = 1 and Ea = 1.1 eV

7.3.1.5
Electromigration

It is known that in any conductor transporting an electric current, only a few metallic
atoms are activated. During the activation period, the atoms undergo the action of
two contrary forces: an electrostatic force and an impact force (due to electron
collisions). The action of these forces manifests itself in the movement of the
activated atoms along the conductor. Thus, for a current density of 10⁶ A/cm²
in an aluminium conductor on a silicon chip (Tj = 150°C), after 3-4 days a migration
of the aluminium occurs, creating hillocks and voids and increasing the current
density in the rest of the conductor. Consequently, the migration is amplified and
the voids in the aluminium grow into a conductor interruption.
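Although not cited in the text, the joint current-density and temperature dependence of electromigration lifetime is usually summarised by Black's empirical equation, MTTF = A · j⁻ⁿ · exp(Ea/kT); the sketch below uses typical literature values (n ≈ 2, Ea ≈ 0.6 eV, both assumptions):

```python
import math

K_BOLTZMANN = 8.617e-5  # eV/K

def black_mttf_ratio(j1, j2, t1_c, t2_c, n=2.0, ea_ev=0.6):
    """Ratio MTTF(j1, T1) / MTTF(j2, T2) from Black's equation
    MTTF = A * j**-n * exp(Ea/kT); n and ea_ev are assumed typical values."""
    current_term = (j2 / j1) ** n
    temp_term = math.exp(ea_ev / K_BOLTZMANN *
                         (1 / (t1_c + 273.15) - 1 / (t2_c + 273.15)))
    return current_term * temp_term

# Halving the current density and dropping Tj from 150 C to 100 C:
print(round(black_mttf_ratio(j1=5e5, j2=1e6, t1_c=100, t2_c=150), 1))
```

With these assumed parameters the lifetime gain is a few tens of times, which is why derating both current density and junction temperature is the usual design countermeasure.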
One can distinguish two types of electromigration: solid-state and
electrolytic, respectively [7.34]. Solid-state electromigration starts at a local
temperature above 150°C and current densities above 10⁴ A/cm². It is a well-studied
phenomenon and efficient measures to avoid it have been proposed. Some
examples will be presented. The addition of copper [7.35] or titanium [7.36] allows
higher current densities before electromigration arises. Other methods proposed
are: to encapsulate conductors with dielectrics [7.37], to cover the aluminium
conductor with a chemical-vapour deposited silicon dioxide layer, or to grow an
anodic layer onto the aluminium conductor [7.39]. Wada et al. [7.40] suggested
some surface treatments, proved to be very effective. An oxygen plasma treatment
(in a barrel reactor), preceded or followed by an annealing at 450°C, for 30
minutes, in forming gas, led to significant improvements of the mean time to failure.
Details are given in Table 7.4.

Table 7.4 Results of oxygen plasma treatment

Treatment duration   Before / after annealing   Relative lifetime (arbitrary units)

No treatment         -                          1
20 min.              After                      1.2
40 min.              After                      3
80 min.              After                      4
20 min.              Before                     1.1
40 min.              Before                     2.4
80 min.              Before                     3.2

Obviously, from Table 7.4 one may conclude that the oxygen plasma treatment must
be performed after annealing. Another method is a water dip treatment (after
aluminium metallisation, the wafer is dipped in H2O for 4 minutes, before or after
resist strip, and annealed at 450°C, for 3 minutes, in forming gas). The result was an
improvement of the mean time to failure by more than 3 times.

7.3.1.6
Fatigue

In a semiconductor device, internal mechanical forces act in the contact areas,
where it is difficult to match the thermal expansion coefficients of, for instance,
copper, Kovar and steel. To improve the situation, intermediate layers of molybdenum
or tungsten are used. After repeated temperature cycles, the structure of these
materials is modified, the cohesion force between the grains decreases and cracks
may occur.

7.3.1.7
Aluminium-gold system

For aluminium metallisation connected with gold wire, five main failure
mechanisms are known [7.15]:
Purple plague is produced by intermetallic compounds formed at high temperatures
between the aluminium metallisation and the gold wires. As a consequence of this
phenomenon, an important degradation of the semiconductor reliability occurs,
because the gold/aluminium bond becomes brittle and any mechanical stress
(even a weak one) may lead to an open contact.
Electrolytic corrosion is a permanent menace for aluminium metallisations,
especially for plastic encapsulated chips operating in a humid environment.
Electromigration (mentioned previously) occurs at high current densities
(> 10⁵ A/cm²) and high temperatures. As a consequence, in the initially uniform
aluminium pad, thinner regions arise, leading to device destruction.
Aluminium/silicon interaction (at the ohmic contacts) may lead to the total failure
of the device (by short circuit), especially at high current densities.
Protection layers of evaporated aluminium are often formed by metallic layers that
are too thin, leading to contact resistances that are too high and producing regions
with higher current densities.

7.3.1.8
Brittle fracture

The die-case connection may be affected by brittle fracture of the die. Initiated
by cracks formed during previous wafer manufacturing processes (crystal
growth, wafer scribing and slicing, die separation), this failure mechanism is
produced by thermal expansion mismatch of the different materials used for
assembly. After die bonding, the cooling process induces excessive mechanical
stress in the die. If the crack size exceeds the critical size for the induced stress, as
calculated with the aid of appropriate models [7.44], pre-existing cracks can cause
brittle overstress failures. Voids in the die attach can further exacerbate the failure,
not only by increasing the thermal resistance, but also by acting as a stres
concentration site [7.45]. It is interesting to note that because the wire bond is still
connected, the device may pass a functional test without signaling a possible
failure.

7.3.1.9
Electrostatic Discharge (ESD)

This failure mechanism appears in all types of ICs, generally during testing,
assembling or handling. The phenomenon is produced by voltages higher than
1000 V. Protection circuits or other measures [7.40] can be used to avoid ESD.

7.3.2 Early failures

Early failures are very annoying for component users [7.17]. For instance, if an
equipment has only 500 integrated circuits and the failure proportion in the first
30 days is 0.1 %, then, on the average, one failure occurs for every two equipments,
i.e. about 50% of the equipments fail in the first month of operation.
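The arithmetic behind this example can be checked with a simple binomial model: with n components, each failing with probability p in the first month, the expected number of failures per equipment is n·p (here 0.5, i.e. one failure for every two equipments on average), while the exact probability that a given equipment sees at least one failure is 1 - (1 - p)^n, about 39 %:

```python
n, p = 500, 0.001  # 500 ICs per equipment, 0.1 % early-failure proportion

# Expected failures per equipment: 0.5, i.e. one failure per two equipments
expected_failures = n * p

# Exact probability that a given equipment sees at least one failure (~0.39)
prob_at_least_one = 1 - (1 - p) ** n
```

The book's "50 %" reading corresponds to the expected-failures count; the exact per-equipment failure probability is somewhat lower because a single equipment can absorb more than one failure.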
For the integrated circuits mounted with the beam lead technique, mechanical
defects explain almost all the early failures (except for complex MOS circuits, where
oxide defects are the main failure cause). Data from various sources indicate
completely different time periods of the component lifetime for SSI/MSI
circuits vs. MOS LSI (especially for dynamic RAM), as one can see from Fig. 7.5.
For both categories, the average failure activation energy is around 0.4 eV⁵.

[Figure: temperature (°C) vs. lifetime (h) plot, with domains a, b and c]
Fig. 7.5 Comparison of data referring to early failures and long-term failures: a) typical domain of
long-term failure mechanisms for commercial plastic encapsulated ICs; b) domain of early failures
for bipolar commercial SSI/MSI; c) domain of early failures of commercial MOS LSI [7.21]

In fact, in this case, the term "early failures" covers manufacturing defects that
become failures in a physical or electrical environment (scratches of the wafer,
open or almost open connections, voids, passivation defects, etc.). These early
failures differ from one another in nature. The early failure period proved to be
important for solving many practical problems. In this period, one can estimate the
failure rate of an equipment or define the conditions for a burn-in needed for
reaching a prescribed quality level for the equipment.
In Fig. 7.6, the replacement rates for MSI and SSI circuits are compared during
the infant mortality period of commercial plastic encapsulated TTL. The larger chip
dimensions and the increased complexity of MSI circuits lead to a higher
replacement rate than that of SSI circuits. The results of failure analysis are synthesised

⁵ Goarin [7.18] has shown that the observed activation energies are below the 0.4 eV and 0.7 eV
values, estimated previously for bipolar and MOS circuits, respectively.

in Table 7.5. From this table, one may understand that the early failures are
important and must be taken into account in reliability evaluation.

[Figure: replacement rate vs. operation time (h), with separate curves for MSI and SSI]

Fig. 7.6 Replacement rate of commercial TTL ICs in plastic package (in FIT, during the infant
mortality period) [7.21]

7.3.3
Modeling IC reliability

At first, simulators covering only one or two subsystems or failure mechanisms
arose, such as RELIANT [7.20], restricted to predicting the electromigration of the
interconnects, and BERT EM [7.21]. Both use SPICE for the prediction of
electromigration by deriving the currents. Other electromigration simulators were
CREST [7.22], combining switch-level and Monte-Carlo simulation, adequate for
the simulation of VLSI circuits, and SPIDER [7.23].
Other models were built for hot-carrier effects: CAS [7.24] and RELY [7.25],
both also based on SPICE. An important improvement was RELIC, built for three failure
mechanisms: electromigration, hot-carrier effects and time-dependent dielectric
breakdown [7.26].
A high-level reliability simulator for electromigration failures, named GRACE
[7.27], assures a higher simulation speed for very large ICs. Compared with the
previously developed simulators, GRACE has some advantages [7.27]:
• an orders-of-magnitude speedup allows the simulation of VLSI circuits with many
input vectors;
• the generalised Eyring model [7.28] allows simulating the ageing and
eventually the failure of physical elements due to electrical stress;
• the simulator learns how to simulate more accurately as the design progresses.

Table 7.5 Incidence of main failure mechanisms (in %) arising in the infant mortality period

Failure mechanism       Commercial circuits      Memories   Western Electric ICs
                        TTL        CMOS                     TTL           Memories
                                                            (beam lead)   (wire bond)
Electrical overcharge    4          60            17         35            9
Oxide defects            2           1            51          -           53
Surface defects         18           -            24          -            -
Connections             37           5             7         29           27
Metallisation           30          34             -          4            2
Various                  9           -             1         22            9

[Flow diagram: process defect distributions and mask layout feed the failure
distributions and defect probabilities; these enter the calculation of failure
probabilities, which in turn feeds the system failure simulation]
Fig. 7.7 Monte-Carlo reliability simulation procedure for ICs

If the typical failure mechanisms are known, by taking into account the
degradation and failure phenomena, models for the operational life of the devices
can be elaborated. Such models, in contrast with the regular CAD tools, which address
only wearout phenomena, also predict the failures linked to the early-failure zone.

A Monte-Carlo reliability simulation for ICs, incorporating the effects of process
flaws, test structure data, mask layout and operating environmental conditions, was
proposed by Moosa and Poole [7.19]. The device was divided into subsystems
(metallisation, gate oxide, wire bonds and packaging), affected by various failure
mechanisms. Further on, these subsystems were divided into elementary objects (e.g.
for metallisation: metal runs, vias, contacts), which may have various failure modes
/ mechanisms. The reliability measures of the objects are obtained by accelerated
life testing on specially designed test structures. Then the data are extrapolated to the
subsystem and device level. The simulation procedure is detailed in Fig. 7.7.
This simulator was used for a two-layer metal interconnect subsystem, the
typical failure mechanism being electromigration. The effect of various grain size
distributions and defect (void) size distributions was checked and the results
(given as cumulative failures vs. system failure times) agree well with previously
reported results.
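The procedure of Fig. 7.7 can be sketched as a toy Monte-Carlo simulation in which each elementary object draws a failure time from its own distribution and the system fails at the earliest object failure (series model). The object names and Weibull parameters below are purely illustrative assumptions:

```python
import random

def simulate_system(objects, trials=10000, seed=1):
    """objects: list of (name, weibull_shape, weibull_scale_hours) tuples.

    Each trial draws one failure time per object; the system fails at the
    earliest object failure (series reliability model). Returns the mean
    system failure time over all trials.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.weibullvariate(scale, shape)
                     for _, shape, scale in objects)
    return total / trials

# Illustrative interconnect objects (shape < 1 models infant mortality):
objects = [("metal run", 0.8, 2e5), ("via", 1.2, 5e5), ("wire bond", 1.5, 8e5)]
mean_ttf = simulate_system(objects)
```

In the real procedure, the distribution parameters would come from accelerated life tests on the test structures, and the simulation would output a full cumulative failure curve rather than a single mean.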

7.4
Screening and burn-in

7.4.1
The necessity of screening

The high complexity of electronic systems and the economic consequences of a
poor-quality product increase the role of component reliability. Obviously, the
solution of this problem is not an easy one. If the MTBF of an electronic system
must be as high as possible, an efficient input control of the quality of the
components is needed, because the weak components must be removed from
the very beginning of system manufacturing and not later (at the control of the equipped
card, for instance). An empirical rule states that the wasted costs increase at each
subsequent control level (input control, equipped card control, subsystem control).
The maintenance costs for a component failure are 1000 times higher than the cost
of the input control for the same component. If an electronic system has 10 000 ICs,
each repair costs 625 SFr and each replacement 25 SFr, the data from Table 7.6
are obtained, for various percentages of failed circuits.

Table 7.6 Corresponding costs for various percentages of failed ICs

Failure percentage   Number of    Repair cost   Repair cost expressed in
(%)                  failures     (SFr)         % of the equipment cost
0.1                      10         6 250         2.5
1                       100        62 500        25
2                       200       125 000        50
5                       500       312 500       125
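The arithmetic of Table 7.6 is consistent with a repair cost of 625 SFr per failed IC and an equipment cost of 250 000 SFr (10 failures × 625 SFr = 6 250 SFr = 2.5 % of 250 000 SFr). A quick reconstruction under these assumptions:

```python
REPAIR_COST_SFR = 625         # implied by 10 failures -> 6250 SFr
EQUIPMENT_COST_SFR = 250_000  # implied by 6250 SFr = 2.5 % of equipment cost

def repair_cost(n_failures):
    """Return (total repair cost in SFr, cost as % of the equipment cost)."""
    cost = n_failures * REPAIR_COST_SFR
    return cost, 100.0 * cost / EQUIPMENT_COST_SFR

table = {n: repair_cost(n) for n in (10, 100, 200, 500)}
```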

One may notice that by using efficient intermediate and final control, for failure
percentages higher than 0.1 % all the repair costs from the last column can be

spared. By definition, AQL (the acceptable quality level) is the prescribed limit
percentage of failed devices at which the batch is still acceptable by the buyer.
For high-reliability systems, the user does not accept a failed device. Therefore,
in these cases a 100% input control was introduced. Such a control costs less than
the subsequent replacement on the equipped card. A problem to solve is to have a
method for identifying the ICs which will fail subsequently. Usually, thermal tests
are used. In fact, a screening sequence contains mechano-climatic and electric tests.
As an example, the stipulations of MIL-STD-883 for aerospace and defense
applications are presented in Table 7.7.

Table 7.7 Screening tests for aerospace and defense applications (MIL-STD-883)

Screening test                    IC category S      IC category B
Bond pull (nondestructive)        Yes                No
Internal visual                   Yes                Yes
High-temperature storage          24 h / +150°C      24 h / +150°C
Thermal cycle (20 x)              -65°C / +150°C     -65°C / +150°C
Constant acceleration             3000 g / 60 s      3000 g / 60 s
Particle detection                Yes                No
Reduced electrical test           Yes                No
Reverse bias burn-in              72 h / +150°C      No
Reduced electrical test           Yes                Yes
Burn-in                           240 h / +125°C     160 h / +125°C
Electrical test                   Yes                Yes
Seal (fine/gross leak)            Yes                Yes
Radiography                       Yes                No
External visual inspection        Yes                Yes

The costs are similar for bipolar and MOS circuits, but since MOS ICs have a higher
density, for systems of equal complexity the screening tests are cheaper for MOS
ICs. One must know that the sensitive parameters of MOS circuits (such as
threshold voltage or residual current) may reveal future failures after only a few
hours. The degradation of these parameters is a sure signal of some types of
early failures. For other types of early failures, an appropriate burn-in may be used.
There is no method to warranty the reliability of ICs. However, screening and
high-stress tests are useful means allowing the researcher to obtain sufficient
confidence in reliability evaluation. In Table 7.8, the efficiency of some screening
tests is presented, together with some emphasised failure mechanisms. Generally,

the minimum cost is for SSI ("small scale integration") and the maximum cost is
for LSI ("large scale integration").

Table 7.8 A comparison between various reliability tests: efficiency, failure percentages, cost
(MIL-STD-883, class B)

Reliability test     Area of potential failure         Efficiency   Average     Range    Cost (SFr/module)
                                                       degree       failure              Min      Max
                                                                    percentage
Stabilisation bake   * Electrical instability          Good /       3           0.1-20   0.03     0.25
                     * Substrate surface               Very good
                     * Metallisation
                     * Silicon processing
                     * Connections (wires)
Thermal cycles       * Package                         Good         2.5         0.1-18   0.13     0.25
                     * Seal
                     * Header (surface)
                     * Connections (wires)
                     * Thermal coefficient mismatch
Burn-in              * Silicon processing              Excellent    3           0.1-20   0.65     12.5
                     * Header (surface)
                     * Connections (wires)
                     * Electrical instability
                     * Metallisation
                     * Corrosion

7.4.2
Efficiency and necessity of burn-in

Burn-in is a step of a screening sequence, based on thermoelectric activation,
whose purpose is to remove the early failures [7.29]. If the money for a
complete screening sequence is not available, a burn-in test may be used instead,
but with lower efficiency. One must distinguish between burn-in as a test and burn-in
as a treatment. A test has pre-established questions and the answers are awaited.
As a result, the duration and the cost are small. From a test, only "good-bad" results are

obtained. As a treatment, the burn-in must select the early failures. Only the
"remainder" of the "bath-tub" curve will be delivered to the customer. In the opinion
of many specialists [7.8][7.30], burn-in is the most efficient treatment for
detecting and removing early failures, both for bipolar and for MOS circuits.
Birolini says that burn-in removes about 80% of the chip-related failures and about
30% of the package-related failures [7.40].
Generally, four types of stress are used:
• High temperature and bias: a cheap method, but less efficient;
• HTRB (high temperature reverse bias: high temperature, supply voltage, all
inputs reversely biased): a medium-cost and medium-efficiency method;
• High temperature, bias, dynamic inputs, maximum load for all inputs: an
efficient, but expensive method;
• HTOT, a method combining the optimum bias with temperatures between
200°C and 300°C: a method inadequate for plastic packages.
In accordance with the standard MIL-STD-883C, the test is performed at 125°C, for
160 hours. For special metallisations and for ceramic packages, 16 hours at 300°C are
used. To obtain the same results, 1 000 000 hours at 125°C would be needed. It
follows that the efficiency of this test depends on temperature and time. Control
activities well organised by IC producers led to the conclusion that, on the average,
5% of all integrated circuits fail at burn-in [7.8]. This percentage varies
between 0 and 20%. An efficient treatment may eliminate up to 90% of the future
failed devices in high-reliability systems [7.31]. One may say that burn-in is an expensive
method. In this respect, the repair cost for the system must be considered, when the
equipped boards may have hidden defects. It is obvious that burn-in increases the
delivery cost, but the replacement at the user may be much more expensive.
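The equivalence quoted above (16 hours at 300°C corresponding to about 10⁶ hours at 125°C) follows from the Arrhenius model only for a rather high activation energy; the 1.25 eV used below is an assumption chosen to reproduce the quoted figure, not a value given in this chapter:

```python
import math

K = 8.617e-5  # Boltzmann constant, eV/K

def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between use and stress temperatures."""
    t_use, t_stress = t_use_c + 273.0, t_stress_c + 273.0
    return math.exp(ea_ev / K * (1.0 / t_use - 1.0 / t_stress))

af = acceleration_factor(1.25, 125.0, 300.0)  # roughly 7e4
equivalent_hours = 16.0 * af                  # ~1e6 h at 125 C for 16 h at 300 C
```

For the lower activation energies quoted earlier for early failures (around 0.4 eV), the same 16-hour stress would be far less effective, which is why the efficiency of burn-in depends so strongly on the dominant failure mechanism.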

7.4.3
Failures at screening and burn-in

Generally, the failures arising at screening and burn-in are directly linked to wafer
impurification and metallisation corrosion. This kind of defect may result from an
insufficient control, a nonqualified manufacturing process, an improper design or an
insufficient knowledge of the material behaviour and, eventually, may lead to a
short-circuit or an open circuit. Many failure mechanisms became "classical" ones,
such as purple plague or aluminium migration. Other failure mechanisms are due
to faults of circuit designers or to insufficient control/testing (especially for
microprocessors or memories).
In Table 7.9 a synthesis of the typical failures evidenced by screening tests is
presented. Also, a comparison between the failures of transistors and those of
integrated circuits is given in Table 7.10.
The data from both tables (obtained in 1975 [7.31]) are not only of historical
interest, because some of the devices produced in that period are still operational
somewhere in the world.
The analysis of the failed integrated circuits allows one to obtain the failure rate
distribution. This distribution depends on the technology used and on the circuit
complexity.

Table 7.9 Failures arising from a screening sequence

IC family      Electrical failures   Electrical failures after burn-in (in %, at the
               before burn-in (%)    indicated measuring temperature)
                                     +25°C¹   +125°C   -55°C   +25°C²   All failures
TTL Standard        22                1.3      3.9      1.9     0.7       7.8
54H and L            4.54             0.99     0.34     1.07    0.25      2.65
Linear IC           40                5.7      4.1      9.5     5.2      24.5
DTL                 34.6              1.36     3.51     3.51    3.51      4.87
CMOS                 -                0.4      0.03     1.3     0.1       1.83

¹ Followed by a destructive physical analysis
² Followed by a nondestructive physical analysis

Table 7.10 Failure rates for transistors and ICs

Failure types       Data published by TI (%) [7.50]         Data published by RAC (%) [7.54]
                    Transistors  SSI   MSI   LSI   MOS/LSI   TTL   CMOS
Metallisation        6           10    18    26     7         50    25
Diffusion           10            8    12    25    13          2     9
Foreign particles    -            5    11    13     1          6     7
Various              6            5    12    13    21          -     -
Oxide               31           18    20    13    33          4    16
Bonds               38           14     7     4     5         13    15
Package              9            5     3     2     5         25    28
Incorrect use        -           35    17     4    15          -     -

Table 7.11 Distribution of failure causes (in %) for various utilisation fields

Failure causes        Transmission   Switching
Component failures    25             64
External failures     58             20
Good circuits         17             15

One may also establish the failure cause distribution. In a comparative study
[7.46], completely different distributions were obtained for transmission equipment
used in various environmental conditions (regular microclimate, reduced external
stress, etc.), as one can see from Table 7.11. These differences may be explained by

the fact that the transmission equipment is more often exposed to the overcharge
danger than the switching elements of a telephone exchange.
The electrical failure statistics for the components may be used at the equipped
card level and, then, at the equipment level, and optimum configurations for the
circuit layout may be obtained. For SSI circuits, these statistical data are easy to
obtain, but for more complex circuits it is difficult to obtain reliable statistics.
In a report elaborated by RADC (the Rome Air Development Center) in 1971
[7.47], the failed components represent 5% of the total quantity delivered by the
microelectronic industry. Other sources from the early 70's [7.48][7.49] have
shown a failure level of 1-2% for the integrated circuits used in equipped cards.
These results are consistent with the failure rate, at that time, for electronic
components: 10⁻⁵ h⁻¹. Afterwards, the spectacular improvements made in the
microelectronic industry allowed failure rates of 10⁻⁷ .. 10⁻⁸ h⁻¹ to be obtained.
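The system-level impact of these component failure rates can be illustrated with the series reliability model, in which the system failure rate is the sum of the component failure rates, so that MTBF = 1 / (n·λ) for n identical components:

```python
def system_mtbf_hours(n_components, lambda_per_hour):
    """Series reliability model: system MTBF = 1 / (n * lambda)."""
    return 1.0 / (n_components * lambda_per_hour)

# 1000 ICs at the early-70s rate of 1e-5 per hour vs. a later 1e-8 per hour:
mtbf_old = system_mtbf_hours(1000, 1e-5)  # about 100 h between failures
mtbf_new = system_mtbf_hours(1000, 1e-8)  # about 100 000 h between failures
```

The three-orders-of-magnitude improvement in component failure rate translates directly into the same improvement in system MTBF, which is what made large IC counts per equipment practical.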

[Bar chart, failure percentages from 0 to 40 %, for the categories: solders,
metallisation, photolithography, tightness, surface, various]

Fig. 7.8 Failure distribution for bipolar monolithic ICs

[Bar chart, failure percentages from 0 to 40 %, for the categories: oxide,
electrical overcharge, electrical failures, wires, mounting, photolithography,
metallisation, various]

Fig. 7.9 Failure distribution for MOS ICs

Also in the early 70's, RADC spent more than 1 million dollars on the
systematic study of the reliability of integrated circuits, to get reliable data. These

studies, referring mostly to bipolar circuits, led to the failure distribution from Fig.
7.8. From similar studies, performed for MOS circuits, Peattie [7.50] obtained the
results presented in Fig. 7.9.
One may note that the predominant failures for the MOS technique (such as
imperfections of the oxide layer, electrical overcharge, drift of the electrical
parameters, etc.) are completely different from the failures arising in bipolar
circuits (metallisation or diffusion defects). About 50% of all failed MOS circuits
have shown electrostatic damage, overcharges and/or utilisation problems. Gallace
and Pujol [7.51] stated the distribution of failure mechanisms presented in Fig. 7.10.
Some comments are needed. If the gate oxide is shorted, a residual current arises
at the input, but also a decrease of the noise sensitivity for the functioning
parameters and for the output parameters was observed. Regardless of the
complexity of the integrated circuit, the basic failures take place
inside the small cells formed almost exclusively by MOS transistors and MOS
capacitors. The most frequently encountered failure mode is the open circuit
(inside the MOS component or in the connection network leading to the
component). Even if the component works at delivery, the failure may be
produced by a high current density or by a thermal / mechanical shock. Most
frequently, damage may be induced by the ultrasonic cleaning, a method used for
removing the etching residues.

[Bar chart, failure percentages from 0 to 40 %, for the categories: mechanical
stress, oxide, electrical overcharge, scratches, humidity (for plastic package)]

Fig. 7.10 Failure distribution for COS/MOS ICs

Among the failures of MOS circuits, the next cause is the short-circuit,
produced by various types of defects, such as:
• impurification of two semiconductor areas connected to different electrical
potentials,
• metal deposition defects (photoresist defects, mask defects, etc.),
• insufficient cleaning,
• metallic particles at the surface of the wafer,
• over-alloying of the surface metal with silicon,
• oxide breakdown (short-circuit between the surface metallisation and the substrate).

Finally, degradation effects may be produced by the migration of ions (Na⁺, for
instance) in silicon or by surface charges, which may produce surface inversion.
Electrostatic discharges are also a major cause of failure, and this type of
failure arises not only in MOS, but also in bipolar circuits.

7.5
Comparison between the IC families TTL Standard and
TTL-LS

In the TTL Standard technology, the circuit complexity is limited only by the
thermal characteristics of the package. In this respect, a comparison between
TTL-Standard and TTL-LS ICs is presented in Table 7.12.

Table 7.12 A comparison between two bipolar IC families: LS vs. TTL Standard

Parameter                     LS     TTL Standard
Dissipated power (mW)         50     250
Thermal resistance (°C/W)     160    150
Temperature increase (°C)     8      40
Junction temperature (°C)     63     95
Reliability factor¹           5      22.5

¹ Reliability factor = λ(working junction temperature) / λ(junction temperature of 25°C)
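The junction temperatures in Table 7.12 follow from the steady-state thermal model Tj = Ta + P·Rth; the values are consistent with an assumed ambient temperature of 55°C (63 - 8 for the LS family), a figure implied by the table rather than stated in the text. A quick check under that assumption:

```python
def junction_temp_c(ambient_c, power_w, rth_c_per_w):
    """Steady-state thermal model: T_j = T_a + P * R_th."""
    return ambient_c + power_w * rth_c_per_w

AMBIENT_C = 55.0  # assumed ambient, implied by the table (63 - 8)

tj_ls = junction_temp_c(AMBIENT_C, 0.050, 160.0)   # 63 C, as in the table
tj_std = junction_temp_c(AMBIENT_C, 0.250, 150.0)  # 92.5 C; the table rounds
                                                   # the 37.5 C rise to 40 C
```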

Because the maintenance costs are increasingly higher, reliability
improvement became an important goal. For the future, the LS technology will
allow more complex functions for a given dissipated power. The small
number of connections, due to the high integration degree, leads to a significant
increase of the reliability level, because the connections are often detrimental to
circuit reliability. The small input currents of the LS family allow an
almost ideal interface between MOS-compatible TTL and other systems. To
reduce the parasitic capacitances, the LS and standard-TTL families are manufactured
with an epitaxial technology. The small input currents lead to small transistor
dimensions. Eventually, a decrease of 60..75% of the chip surface was obtained for
a LS circuit compared with a TTL-Standard one.

7.6
Application Specific Integrated Circuits (ASIC)

The Application Specific Integrated Circuits (ASIC) allow a high level of
integration, especially for digital logic circuits. Up to 100 000 "gates" may be
integrated in a monolithic IC. The key element of ASIC is the flexibility of its
technology, allowing a high variety of devices to be obtained, at customer demand,
only by changing the metallisation layout. But this diversity of types, usually not found
in a company catalogue, has a detrimental effect on the reliability of these devices:
expensive reliability tests are seldom performed, because the required quantities are
small. Consequently, other methods to evaluate the reliability of ASICs must be
used. These methods refer to design and testing.
Design margins must be appropriately chosen with a view to precluding
operational failures produced by a wide range of causes: process variability, hostile
environment (high temperatures, radiation, humidity), etc. Taking into account that
ASIC designers use Computer Aided Design (CAD), specific computer methods,
such as Worst Case Analysis (WCA), may be employed.
The design process of a digital ASIC has several steps [7.52]: i) partitioning of the
system function, ii) CAD at primitive gate level⁶, based on the ASIC supplier's design
library, iii) computer simulation of various operating conditions with various
Design Rule Checks (DRC).
The basic timing parameter is the maximum operating speed (the maximum
clock frequency for a correct operation of the ASIC). As this parameter depends on
the temperature, the design may be optimised by determining the actual operating
temperature and calculating the resulting margin required for operation over the
entire temperature range (for military applications: -55°C ... +125°C). Design
margins of 10-15% are currently used. The effect of the environment and of ageing
phenomena is also checked by computer simulation.
The testing must solve the problem of fault coverage (the percentage of possible
logic elements tested by the test vectors). The goal is to obtain 100% fault coverage, a
result hard to achieve for complex ASICs. A mathematical model allows developing
digital ASIC fault coverage guidelines for complex ICs [7.53]. The model is based
on an established probabilistic relationship between the fabrication yield of the IC,
the fault coverage and the defect level of the finished device, combined with an
estimated probability of using untested logic elements in operation:

DL = 1 - Y^(1-FC)                                                  (7.5)

where DL (Defect Level) is the probability that any given ASIC has defective
untested elements, Y is the yield and FC the fault coverage. The authors believe
that by using the concept of design for testability and standard techniques for
testability implementation, a fault coverage in excess of 99.9% may be reached.
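Eq. (7.5) can be evaluated directly. For example, with a fabrication yield of 0.9 (an illustrative value, not one from the chapter), 99 % fault coverage still ships about 0.1 % of parts with defective untested elements, while 99.9 % coverage reduces this by an order of magnitude:

```python
def defect_level(yield_, fault_coverage):
    """DL = 1 - Y**(1 - FC)  (eq. 7.5): probability that a shipped ASIC
    contains defective elements not exercised by the test vectors."""
    return 1.0 - yield_ ** (1.0 - fault_coverage)

dl_99 = defect_level(0.9, 0.99)    # ~0.1 % escapes at 99 % coverage
dl_999 = defect_level(0.9, 0.999)  # ~0.01 % escapes at 99.9 % coverage
```

Note that DL vanishes only at FC = 1; this is why the guidelines push fault coverage beyond 99.9 % for high-reliability systems.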

References

7.1 Spicer England, J.; England, R. W. (1998): The reliability challenge: new materials in the
new millennium; Moore's Law drives a discontinuity. International Reliability Physics
Symp., Reno, Nevada, March 31-April 2, pp. 1-8
7.2 Noyce, R.N. (1977): Large-scale integration: what is yet to come? Science, vol. 195, March
18, pp. 1102-1106

⁶ The gates of an ASIC may be: AND, OR, NAND, NOR, EXOR, D flip-flop (DFF), etc.

7.3 Drăgănescu, M. (1997): From solid state to quantum and molecular electronics, the
deepening of information processing. Proceedings of the International Semiconductor
Conference CAS'97, Oct. 7-11, Sinaia (Romania), pp. 5-21
7.4 Schrom, G.; Selberherr, S. (1996): Ultra-low-power CMOS technologies. International
Semiconductor Conference, Oct. 9-12, Sinaia (Romania), pp. 237-246
7.5 Dascălu, D. (1998): Microelectronics - an expensive field for the present period. In:
Curentul Economic (the Economic Stream), vol. 1, September 9, p. 28
7.6 Fluitman, J.H. (1994): Micro systems technology: the new challenge. International
Semiconductor Conference, Oct. 11-16, Sinaia (Romania), pp. 37-46
7.7 Peck, D.S.; Zierdt Jr., C.H. (1974): The reliability of semiconductor devices in the Bell
System. Proceedings of the IEEE, vol. 62, no. 2, pp. 185-211
7.8 Colbourne, E.D. (1974): Reliability ofMOS LSI circuits. Proceedings of the IEEE, vol. 62,
No.2, pp. 244-258
7.9 Peck D.S. (1971): The analysis of data from accelerated stress tests. Proc. Int'l Reliability
Physics Symp., March, pp. 69-78
7.10 Băjenescu, T.I. (1982): Look for cost / reliability optimisation of ICs by incoming
inspection. Proc. of EUROCON'82, pp. 893-895
Băjenescu, T.I. (1983): Pourquoi les tests de déverminage des composants. Electronique,
no. 4, pp. 8-11
7.11 Adams, J.; Workman, W. (1964): Semiconductor network reliability assessment.
Proceedings of the IEEE, vol. 52, no. 12, pp. 1624-1635
7.12 Preston, P. F., (1972): An industrial atmosphere corrosion test. Trans. Ind. Metal finish
(Printed Circuit Suppl.), vol. 50, pp. 125-129
7.13 Son, K.I.; Soma, M. (1997): Dynamic life-estimation of CMOS ICs in real operating
environment: precise electrical method and MLE. IEEE Trans. on Reliability, vol. 46, no. 1,
March, pp. 31-37
7.14 Hu, C.; Tam, S.C.; Hsu, F.C. (1985): Hot-carrier induced MOSFET degradation: model,
monitor and improvement. IEEE Trans. on Electron Devices, vol. 32, Feb., pp. 375-385
7.15 Gallace, L. J. (1975): Reliability of TP A-metallized hermetic chips in plastic packages - the
gold chip system. Note ST-6367, February, RCA, Sommerville, USA
7.16 Băjenesco, T.I. (1975): Quelques aspects de la fiabilité des microcircuits avec enrobage
plastique. Bulletin SEV, vol. 66, no. 16, pp. 880-884
7.17 Peck, D.S. (1978): New concerns about integrated circuit reliability. Proc. Int'l Reliability
Physics Symp., April, pp. 1-6
7.18 Goarin, R. (1978): La banque et Ie recueil de donnees de fiabilite du CNET. Actes du
Colloque International sur la Fiabilite et la Maintenabilite, Paris, pp. 340-348
7.19 Moosa, S.M.; Poole, K.F. (1995): Simulating IC reliability with emphasis on process-flaw
related early failures. IEEE Trans. on Reliability,vol. 44, no. 4, Dec., pp. 556-561
7.20 Frost, D.F.; Poole, K.F. (1989): RELIANT: a reliability analysis tool for VLSI intercon-
nects. IEEE J. Solid State Circuits, vol. 24, April, pp. 458-462
7.21 Liew, BJ.; Fang, B.; Cheng, N.W.; Hu., C. (1990): Reliability simulator for interconnect
and intermetallic contact electromigration. Proc. Int'I Reliability Physics Symp., March, pp.
111-118
7.22 Najm, F.; Burch, R.; Yang, P.; Hajj, I. (1990): Probabilistic simulation for reliability
analysis of CMOS VLSI circuits. IEEE Trans. Computer-Aided Design, vol. 9, April, pp.
439-450
7.23 Hall, J.E.; Hocevar, D.E.; Yang, P.; McGraw, MJ. (1987): SPIDER - a CAD system for
modeling VLSI metallisation patterns. IEEE Trans. Computer-Aided Design, vol. 6,
November, pp. 1023-1030

7.24 Lee; Kuo; Sek; Ko; Hu (1988): Circuit aging simulator (CAS). IEDM Tech. Digest,
December, pp. 76-78
7.25 Sheu, B.J.; Hsu, W.-J.; Lee, B.W. (1989): An integrated circuit reliability simulator.
IEEE J. Solid State Circuits, vol. 24, April, pp. 473-477
7.26 Hohol, T.S.; Glasser, L.A. (1986): RELIC - a reliability simulator for IC. Proc. Int'l Conf.
Computer-Aided Design, November, pp. 517-520
7.27 Kubiak, K.; Kent Fuchs, W. (1992): Rapid integrated-circuit reliability-simulation and its
application to testing. IEEE Trans. on Reliability, vol. 41, no. 3, Sept., pp. 458-465
7.28 McPherson, J.W. (1986): Stress-dependent activation energy. Proc. Int'l Reliability Physics
Symp., April, pp. 1-18
7.29 Schaefer, E. (1980): Burn-in, was ist das? Qualität und Zuverlässigkeit, no. 10, pp. 296-304
Jensen, F.; Petersen, N.E. (1982): Burn-in: an engineering approach to the design and
analysis of burn-in procedures. J. Wiley and Sons, New York
7.30 Loranger Jr., J.A. (1973): Testing IC: higher reliability can cost less. Microelectronics,
no. 4, pp. 48-50
7.31 Loranger Jr., J.A. (1975): The case for component burn-in: the gain is well worth the price.
Electronics, January 23, pp. 73-78
7.32 Bâzu, M.; Tazlăuanu, M. (1991): Reliability testing of semiconductor devices in humid
environment. Proceedings of the Annual Reliability and Maintainability Symp., January 29-
31, Orlando, Florida (USA), pp. 237-240
7.33 Bâzu, M.; Bacivarof, I. (1991): A method of reliability evaluation of accelerated aged
electronic components. Proceedings of the Conference on Probabilistic Safety Assessment
and Management (PSAM), February 1991, Beverly Hills, California (USA), pp. 357-361
7.34 Krumbein, K. (1995): Tutorial: Electrolytic models for metallic electromigration failure
mechanisms. IEEE Trans. on Reliability, vol. 44, no. 4, December, pp. 539-549
7.35 Ghate, P.B. (1983): Electromigration induced failures in VLSI interconnects. Solid State
Technology, vol. 3, pp. 103-120
7.36 Fischer, F.; Neppl, F. (1984): Sputtered Ti-doped Al-Si for enhanced interconnect
reliability. Proc. Int'l Reliability Physics Symp., pp. 190-193
7.37 Black, J.R. (1969): Electromigration - a brief survey and some recent results. IEEE Trans.
on Electron Devices, vol. ED-16, pp. 338-347
7.38 Wada, T. (1987): The influence of passivation and package on electromigration. Solid-State
Electronics, vol. 30, no. 5, pp. 493-496
7.39 Learn, A. J. (1973): Effect of structure and processing on electromigration-induced failures
in anodized aluminium. J. Applied Physics, vol. 12, pp. 518-522
7.40 Birolini, A. (1994): Reliability oftechnical systems, Springer Verlag, 1994
7.41 Shatzkes, M.; Av-Ron, M.; Gdula, R.A. (1980): Defect-related breakdown and conduction.
IBM J. Research & Development, vol. 24, pp. 469-479
7.42 McPherson, J.W.; Baglee, D.A. (1985): Acceleration factors for thin gate oxide stressing.
Proc. 23rd Int'l Reliability Physics Symp., pp. 1-5
7.43 Elsayed, E.A.; Chan, C.K. (1990): Estimation of thin oxide reliability using proportional
hazard models. IEEE Trans. on Reliability, vol. 39, August, pp. 329-335
7.44 Dasgupta, A.; Hu, 1. M. (1992): Failure mechanical models for brittle fracture. IEEE Trans.
Reliability vol. 41, no. 3, June, pp.328-335
7.45 Chiang, S.S.; Shukla, R.K. (1984): Failure mechanism of die cracking due to imperfect die
attachement. Proc. Electronic Components Conf., pp. 195-202
7.46 Boulaire, J.Y.; Boulet, J.P. (1977): Les composants en exploitation. L'echo des recherches,
July, pp. 16-23
244 7 Reliability of monolithic integrated circuits

7.47 Dummer, G. (1971): How reliable is microelectronics? New Scientist and Science Journal,
July 8th, pp. 75-77
7.48 Arciszewski, H. (1975): Analyse de fiabilite des dispositifs a enrobage plastique. L'onde
eiectrique, vol. 50, no. 3, pp. 230-240
7.49 Benbadis, H. (1972): Duree et efficacite du vieillissement accelere comme methode de
selection. Actes du congres national de fiabilite, Perros-Guirec, Sept. 20-22, pp. 91-99
7.50 Peattie, C.G. (1974): Elements of semiconductor reliability. Proceedings of the IEEE, vol.
62,no.2,pp.149-168
7.51 Gallace, T.; Pujol, A. (1976): Failure mechanism in COS/MOS integrated circuits.
Electronics Engineering, December, pp. 65-69
7.52 Wiling, W.E.; Helland, A.R. (1994): Implementing proper ASIC design margins: a must for
reliable operation. ARMS 94, pp. 504-511
7.53 Wiling, W.E.; Helland, A.R. (1998): Established ASIC fault-coverage guidelines for high-
reliability systems. ARMS 98, Anaheim, California, January 19-22, pp. 378-382
7.54 Signetics Integrated Circuits, Sunyvale, California, 1976
7.55 Biijenesco, T.I. (1978): Microcircuits. Reliabilty, incoming inspection, screening and
optimal efficiency. Int. Conf. on Reliability and Maintainability, Paris, June 19-23
7.56 Biijenesco, T. I. (1981): Problemes de la fiabilite des composants electroniques actifs
actuels. Masson, Paris
7.57 Biijenescu, T. I. (1982): Eingangskontrolle hilft Kosten senken. Schweizerische Technische
Zeitschrift (Switzerland), vol. 22, pp. 24-27
7.58 Biijenescu, T. I. (1982): Look Out for CostlReliability OptiH633andmization of ICs by
Incoming Inspection. Proceedings ofEUROCON '82 (Holland), pp. 893-895
7.59 Biijenescu, T. I. (1983): Dem Fehlerteufel auf dem Spur. Elektronikpraxis (West Germany),
no. 2,pp. 36--43
7.60 Biijenescu, T. I. (1984): Zeitstandfestigkeit von Drahtbondverbindungen. Elektronik
Produktion & Priiftechnik (West Germany), October, pp. 746-748
7.61 B1ijenescu, T. I. (1989): A Pragmatic Approach to the Evaluation of Accelerated Test Data.
Proceedings of the Fifth lASTED International Conference on Reliability and Quality
Control, Lugano (Switzerland), June 20-22
7.62 Biijenescu, T. I. (1989): Evaluating Accelerated Test Data. Proceedings of the International
Conference on Electrical Contacts and Electromechanical Components, Beijing (P. R.
China), May 9-12, p. 429--432
7.63 Biijenescu, T. I.: (1989): Realistic Reliability Assements in the Practice. Proceedings of the
International Conference on Electrical Contacts and Electromechanical Components,
Beijing (P. R. China), May 9-12, pp. 424--428
7.64 Biijenescu, T. I. (1991): A Pragmatic Approach to Reliability Growth. Proceedings of 8th
Symposium on Reliability in Electronics RELECTRONIC '91, August 26-30, Budapest
(Hungary), p. 1023-1028
7.65 Biijenescu, T. I. (1991): The Challenge of the Coming Years. Proceedings of the First
Internat. Fibre Optics Conf., Leningrad, March 25-29
7.66 B1ijenescu, T. I. (1991): The Challenge of the Future. Proc. ofInt. Conf. on Computer and
Communications ICCC '91, Beijing (P. R. China), October 30 to November 1
7.67 Biijenescu, T. I. (1996): Fiabilitatea componentelor electronice. Editura Tehnidt, Bucharest
(Romania)
7.68 Biijenescu, T. I. (1997): A personal view of some reliability merits of plastic encapsulated
microcircuits versus hermetically sealed ICs used in high-reliability systems. In:
Proceedings of the 8th European Symposium on Reliability of Electron Devices, Failure
Physics and Analysis (ESREF '97), Bordeaux (France), October 7-10,1997
7 Reliability of monolithic integrated circuits 245

7.69 Bajenescu, T. 1. (1998): A particular view of some reliability merits, strengths and
limitations of plastic-encapsulated microcircuits versus hermetical sealed microcircuits
utilised in high-reliability systems. Proceedings ofOPTIM '98, Brasov (Romania), 14-15
May,pp.783-784
7.70 Hewlett, F. W.; Pedersen, R. A. (1976): The reliability of integrated logic circuits for the
Bell System. Int. Reliability Pysics Symp., Las Vegas, April, pp.5-1O
7.71 Kemeny, A. P. (1974): Life tests of SSI integrated circuits. Microelectronics and
Reliability, vol. 13, no. 2, pp. 119-142
7.72 Bazu, M. et al. (1983): Step-stress tests for semiconductor components. Proceedings of
Ann. Semicond. Conf. CAS 1983, October 6-8, pp. 119-122
7.73 Bazu, M.; Ilian, V. (1990): Accelerated testing of integrated circuits after storage.
Scandinavian Reliability Engineers Symp., Nykoping, Sweden, October
7.74 Bazu, M. (1990): A model for the electric field dependence of semiconductor device
reliability. 18th Conf. on Microelectronics (MIEL). Ljubljana, Slovenia, May
7.75 Bazu, M. (1995): A combined fuzzy logic & physics-of-failure approach to reliability
prediction. IEEE Trans. Reliab., vol. 44, no. 2 (June), pp. 237-242
7.76 Dascalu, D. (1998): From micro- to nano-technologies. Proceedings of the International
Semiconductor Conference, October 6-10, Sinaia (Romania), pp. 3-12
7.77 Dietrich, D. L.; Mazzuchi, T. A. (1996): An alternative method of analyzing multi-stress,
multi-level life and accelerated-life tests. Proceedings of the Annual Reliability and
Maintainability Symp., January 22-25, Las Vegas, Nevada (USA), pp. 90-96
7.78 Caruso, H. (1996): An overview of environmental reliability testing. Proceedings of the
Annual Reliability and Maintainability Symp., January 22-25, Las Vegas, Nevada (USA),
pp.102-107
7.79 Smith, W. M. (1996): Worst-case circuit analysis: an overview. Proceedings of the Annual
Reliability and Maintainability Symp., January 22-25, Las Vegas, Nevada (USA), pp. 326-
331
7.80 Tang, S. M. (1996): New burn-in methodology based on IC attributes, family IC bum-in
data, and failure mechanism analysis. Proceedings of the Annual Reliability and
Maintainability Symp., January 22-25, Las Vegas, Nevada (USA), pp. 185-190
7.81 Knowles, I.; Malhorta, A.; Stadterman, T. J.; Munamarty, R. (1995): Framework for a dual-
use standard for reliability programs. Proceedings of the Annual Reliability and
Maintainability Symp., January 16-19, Washington DC (USA), pp. 102-105
7.82 Pecht, M. G.; Nash, F. R.; Lory, J. H. (1995); Understanding nand solving the real
reliability assurance problems. Proceedings of the Annual Reliability and Maintainability
Symp., January 16-19, Washington DC (USA), pp. 159-161
7.83 Peshes, L.; Bluvband, Z. M. (1996): Accelerated life testing for products without sequence
effect. Proceedings of the Annual Reliability and Maintainability Symp., January 22-25,
Las Vegas, Nevada (USA), pp. 341-347
7.84 Mok, Y. L.; Xie, M. (1996): Planning & optimizing environmental stress screening.
Proceedings of the Annual Reliability and Maintainability Symp., January 22-25, Las
Vegas, Nevada (USA), pp. 191-195
7.85 Johnston, G. (1996): Computational methods for reliability-data analysis. Proceedings of
the Annual Reliability and Maintainability Symp., January 22-25, Las Vegas, Nevada
(USA), pp. 287-290
7.86 Yates III, W. D.; Beaman, D. M. (1995): Design simulation tool to improve product
reliability. Proceedings of the Annual Reliability and Maintainability Symp., January 16-
19, Washington DC (USA), pp. 193-199
7.87 Mukherjee, D.; Mahadevan, S. (1995): Reliability-based structural design. Proceedings of
the Annual Reliability and Maintainability Symp., January 16-19, Washington DC (USA),
pp.207-212
246 7 Reliability of monolithic integrated circuits

7.88 Cole, E. I.; Tangyunyong, P.; Barton, D. L. (1998): Backside localization of open and
shorted IC interconnections. IEEE International Reliability Pysics Symp. Proceedings,
Reno, Nevada (USA), March 31-ApriI2, pp. 129-136
7.89 Huh, Y. et at. (1998): A study of ESD-induced latent damage in CMOS integrated circuits.
IEEE International Reliability Pysics Symp. Proceedings, Reno, Nevada (USA), March 31-
April 2, pp. 279-283
7.90 van der Pool, J. A.; Ooms, E. R.; van't Hof, T.; Kuper, F. G. (1998): Impact of screening of
latent defects at electrical tesst on the yield-reliability relation and applicaiton to bum-in
elimination. IEEE International Reliability Pysics Symp. Proceedings, Reno, Nevada
(USA), March 31-ApriI2, pp. 363-369
8 Reliability of hybrid integrated circuits

8.1
Introduction

The word hybrid means that this technique lies between complete integration (monolithic integrated circuits) and a combination of discrete elements. In this way conductors, resistors and - to a certain degree - small capacitors and inductors are produced integrated on a substrate, while the remaining passive elements (such as large-value capacitors and, if necessary, inductors) are added to the integrated circuit as discrete parts [8.1].

[Diagram: Microelectronics, branching into Microcomponents and Integrated Circuits]
Fig. 8.1 The place of hybrid circuits in the general framework of microelectronics

Several circuit elements are placed on the same insulating substrate. In the thick-film technique this is done with the aid of the stencil (screen-printing) process: the paste is pressed onto a ceramic substrate and then fired. In the thin-film technique, the layers are obtained by evaporation or sputtering.
Hybrid integrated circuits can be much more reliable than the corresponding circuits built from discrete components, owing to the smaller number of soldered joints, the more stable substrate, the greater resistance to mechanical stresses, and the replacement of several packages by a single one. In Fig. 8.1 the interdependence and the place of hybrid integrated circuits in the general framework of microelectronics are shown.
It is often difficult for design engineers to decide between thick- and thin-film technologies in the design and fabrication of electronic systems. (In the case of thick film, the deposited pattern of conductors, resistors, capacitors and inductors is applied to the substrate by screen-printing and firing special conductive, resistive or dielectric pastes. Thin-film layers, on the other hand, are deposited by vacuum evaporation, cathode or ion-impact sputtering, chemical or electroless metal deposition, vapour plating and direct writing.) In order to maximise the benefits, the design and project engineer must be aware of these various technologies¹. To enable designers, production and project engineers to capitalise on the advantages of both techniques, this chapter presents the general engineering aspects, particularly those linked to system design and to production rules that differ from the common practice for discrete-component assemblies².
Unlike the monolithic integrated circuits (whose substrate is a semiconductor material), the hybrid integrated circuits are made on a non-conductive substrate and contain only passive integrated components. The active elements (semiconductors, integrated circuits) are added by soldering or by bonding with epoxy. Since the substrate is an insulator, all the drawbacks concerning isolation and residual currents disappear. In addition, by choosing a substrate with high thermal conductivity, the power dissipation - already better than that of monolithic circuits - may be improved further. The passive components have very good characteristics, and their absolute values can be adjusted with the highest precision. Finally, access to GHz frequencies has opened the microwave domain to hybrid circuits.
The pursuit of high performance has oriented producers toward insulating substrates such as sapphire and spinel - see the exceptional characteristics of the SOS (Silicon On Sapphire) family. The replacement also works in the opposite direction, since thick-film pastes with good switching characteristics are now available, replacing for instance the diac (bilateral trigger diode). It follows that these two technologies are not rivals, but complementary.
As already said, we distinguish two groups within this family: thick-film and thin-film hybrids. This classification refers to the thickness of the deposited layer (0.02...1 µm for thin film, and 10...50 µm for thick film), but above all to the technology. For layer deposition, two different methods are used:
• deposition in vacuum, for thin films (better properties, more complex equipment);
• classical stencil procedures for thick films.
As one can see, the principal difference between thin and thick film is not the thickness of the conductors, but the technology.
In comparison with printed circuits, the hybrid circuits have the following advantages:
• better high-frequency characteristics;
• smaller dimensions;
• better reliability of the wire connections (smaller number of connections);
• economy (for large production series);
• easily interchangeable, tested modules;
• very good reproducibility.

¹ The thick-film systems offer some advantages: simple processing; fast and inexpensive tooling; economy (wider-tolerance active devices can be used); higher reliability; and multilevel circuit capabilities.
² The initial enthusiasm and optimism concerning immediate and wide-ranging applications of thin- and thick-film hybrid circuits have largely failed to be realised. However, today's forecasts suggest that the present world-wide production capability will be unable to cope with the demand over the next few years.
Compared with monolithic ICs, the hybrids have the following advantages:
• great design freedom (various resistors and capacitors, bipolar and unipolar semiconductors, analogue and digital functions, all in a single circuit);
• short research and development time;
• lower development and start-up costs;
• shorter times to obtain the models (prototypes);
• higher currents, voltages and powers;
• resistance to higher shocks, vibrations and accelerations;
• higher working frequencies;
• greater flexibility in the choice of active components (mixed technologies);
• economical possibilities to replace the circuits, even after volume production has begun;
• the design of the circuits can be easily modified;
• small and medium series are profitable;
• the passive components, particularly the resistors, can be produced with high precision and over a large range of values.
But there are also some disadvantages: on the one hand - in comparison with printed-circuit technology - the costs are higher for small quantities, and some problems will doubtless arise; on the other hand - in comparison with monolithic ICs - only a lower packaging density can be obtained, and the costs are higher for large numbers of items.

Table 8.1 Some data on layers

Technology              Layer thickness           Connection thickness   Conductive path precision
Thick-film technology   Conductor: 15 µm          min. 100 µm            ±5 µm
                        Resistor: 10...15 µm
                        Capacitor: 60 µm
Thin-film technology    0.01...1 µm               10...100 µm            ±2 µm

The plastic materials used for encapsulation must fulfil the following conditions [8.2]:
• good dielectric characteristics;
• small dielectric constant (for high-frequency circuits);
• good compatibility with thick-film resistors;
• low water absorption;
• high stability at high temperature;
• a working temperature below +125°C (for the components).
In Table 8.1 some data concerning the thickness of the deposited layers and connections, and the precision of the conductive lines are shown [8.3].

8.2
Thin-film hybrid circuits

These circuits are made on a ceramic substrate. A NiCr layer is deposited by evaporation over the whole surface of the substrate, then covered with a photoresist and exposed to light through a mask. After exposure, the photoresist is removed from the areas where the conductive lines will be placed, and copper or gold is electrodeposited on these photoresist-free areas. Afterwards, the rest of the photoresist is removed and a new photoresist layer is deposited and again exposed through a mask. The photoresist now remains over the areas where the resistors are to be placed and is removed elsewhere. The remaining photoresist and the already deposited copper or gold layer protect the underlying NiCr layer. The NiCr layer left unprotected by photoresist is then etched away, and the photoresist residues are washed off.
The resistors are thus formed by a subtractive process. The thin-film forming process is the same, independently of the circuit type; the sole difference lies in the mask used for photoresist exposure. If required, partitioning into elementary circuits (repeated modules) is done by scribing (chemical etching after masking, or laser or ultrasonic scribing). The semiconductor chips and the capacitors are then added to the circuit and interconnected. Afterwards, the circuit is encapsulated.
During the manufacturing process, optical and electrical inspections are performed. The final inspection is made after encapsulation and includes climatic, mechanical and hermeticity tests.
The advantage of gold conductive paths - in the case of thin films - is the possibility of connecting discrete components (for example, non-encapsulated chips) by means of gold conductors, which assures safe functioning.
Mounting and soldering of discrete components in hybrid circuits is highly automated and computer-supervised.

8.2.1
Reliability characteristics of resistors

• The temperature coefficient of the resistors is linear between -65°C and +160°C.
• The temperature coefficients of two resistors (on the same substrate), in the range 200 Ω...1 MΩ, differ by less than ±15×10⁻⁶ K⁻¹.
• The stability of a resistor under voltage is determined by the working temperature of the thin layer, which depends on the dissipated power and on the ambient temperature (Fig. 8.2).
• Under damp heat (6 cycles for unencapsulated resistors, severity degree 4, in accordance with CCTU 01-01 A), the mean drift of the resistors is of the order of 0.03% (Fig. 8.3).
• The noise figures are comparable with those of wire-wound resistors.
• In general, the resistors are pre-aged and stabilised at high temperature during manufacturing. The mean storage drift after 10 000 hours remains at the 0.2% level at +100°C (Fig. 8.4), so that, practically, the circuits have a stable, linear behaviour.
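The matching figure quoted above translates directly into ratio stability, which is what matters in divider and bridge applications. A minimal illustrative sketch (the 100 K swing and the divider example are invented for illustration; only the ±15×10⁻⁶ K⁻¹ mismatch comes from the text):

```python
# Two thin-film resistors on the same substrate: their temperature
# coefficients differ by at most 15e-6 per kelvin (value from the text).
TC_MISMATCH = 15e-6  # K^-1

def ratio_drift(delta_t_kelvin: float) -> float:
    """Worst-case relative drift of the ratio R1/R2 over a temperature swing."""
    return TC_MISMATCH * delta_t_kelvin

# Example: a voltage divider taken from 20 degC up to 120 degC.
drift = ratio_drift(100.0)
print(f"ratio drift <= {drift * 100:.2f} %")  # 0.15 %
```

So even over a 100 K excursion, a divider made from two matched thin-film resistors drifts by at most 0.15%, far less than either resistor's absolute drift would suggest.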

Fig. 8.2 Drift of nitride tantalum resistors under load (ΔR/R in %, versus time in hours, at an ambient temperature of 70°C): smaller than 0.1% after 10³ working hours
Fig. 8.3 Stability of nitride tantalum resistors as a function of the number of damp-heat cycles (4 to 224 cycles)

Fig. 8.4 Results of high-temperature storage of nitride tantalum resistors at various temperatures (ΔR/R in %, up to about 0.4% at 200°C; curves from 20°C to 200°C, versus time in hours)

8.2.2
Reliability of throughout-contacts

The factors that can influence the reliability of throughout-contacts (plated through-connections) are the temperature, the temperature changes and the current load. During a reliability study, 26 000 throughout-contacts were tested for more than 1000 hours at 125°C, loaded at 700 mA. Since no failure was observed, it follows that:

λS < 1/(2.6×10⁷ h) = 3.85×10⁻⁸/h (8.1)

and:

MTTFS > 2.6×10⁷ h (8.2)

Therefore, at a test current IT = 700 mA, for a maximum load current IM = 35 mA, the estimated value of the mean time to the first failure [8.4], with a confidence level of 90%, is:

MTTF(90%) = 0.43 · MTTFS · (IT/IM)² > 0.43 × 2.6×10⁷ × (700/35)² h (8.3)

MTTF(90%) ≈ 4.5×10⁹ h (8.4)

and the failure rate is:

λ(90%) = 1/MTTF(90%) < 1/(4.5×10⁹ h) = 2.2×10⁻¹⁰/h. (8.5)

Since these estimates do not take the high-temperature storage and thermal cycling tests into account, it can be said that the reliability of throughout-contacts is substantially greater than that of the other passive components.
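The estimate in Eqs. (8.1)-(8.5) can be reproduced numerically. A minimal sketch, taking the 0.43 factor as the 90%-confidence coefficient and the quadratic current derating used in the text:

```python
# Zero-failure test: 26,000 through-contacts, 1000 h at 125 degC, 700 mA.
n_parts, hours = 26_000, 1_000
device_hours = n_parts * hours            # 2.6e7 component-hours, no failures

lambda_s = 1 / device_hours               # upper bound on failure rate (8.1)
mttf_s = device_hours                     # corresponding MTTF bound (8.2)

# Derate from test current I_T to maximum load current I_M (quadratic model),
# with 0.43 as the 90%-confidence coefficient quoted in the text (8.3).
i_t, i_m = 0.700, 0.035                   # amperes
mttf_90 = 0.43 * mttf_s * (i_t / i_m) ** 2
lambda_90 = 1 / mttf_90                   # (8.5)

print(f"lambda_S  < {lambda_s:.2e} /h")   # about 3.85e-08 /h
print(f"MTTF(90%) > {mttf_90:.2e} h")     # about 4.5e+09 h
print(f"lambda(90%) < {lambda_90:.2e} /h")  # about 2.2e-10 /h
```

The (IT/IM)² factor is what converts the harsh test condition into an estimate at the much smaller service current; it accounts for the 2-decade gap between the measured bound and the final 2.2×10⁻¹⁰/h figure.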

8.3
Thick-film hybrids

Thick-film hybrids [8.5]...[8.11] are produced on ceramic substrates by a printing process. To do this, pastes having the desired characteristics and a stencil process are used. Both for the conductive lines and for the resistors, pastes containing glass and noble metals are used. First, the conductive lines are pressed onto the substrate; after drying, they are fired. Then, in the same manner, the resistor bodies are deposited and fired. Under the denomination "resistors", the manufacturers offer pastes with different sheet-resistance values, indicated in Ω/□ (ohms per square).
At present, experience [8.12] indicates what dimensions the resistor bodies must have for the desired characteristics and resistance values.
After all the resistors are deposited on the substrate, the ensemble is fired and the various layers acquire their final characteristics. A computer is used to calculate the form and dimensions of the resistors. Since this method yields too wide a distribution of resistance values, the resistors are laser-trimmed, so that finally they have a tolerance of ±0.5%.
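The sizing calculation mentioned above can be sketched in a few lines, assuming the usual sheet-resistance model R = Rs·(L/W) and the mean power density of 5 W/cm² quoted later in this chapter; the paste value and target figures in the example are invented for illustration:

```python
import math

def size_resistor(r_target, r_sheet, power_w, p_density_w_cm2=5.0):
    """Return (length_cm, width_cm) for a printed resistor.

    R = r_sheet * (L / W), so the number of squares is R / r_sheet;
    the area is chosen so the paste stays below the allowed power density.
    """
    n_squares = r_target / r_sheet        # aspect ratio L/W
    area = power_w / p_density_w_cm2      # minimum area in cm^2
    width = math.sqrt(area / n_squares)
    length = n_squares * width
    return length, width

# Example: 10 kOhm from a 1 kOhm/square paste, dissipating 0.1 W.
l_cm, w_cm = size_resistor(10_000, 1_000, 0.1)
print(f"L = {l_cm * 1e4:.0f} um, W = {w_cm * 1e4:.0f} um")
```

The wide as-fired spread mentioned in the text is then removed by laser trimming, which only increases the resistance; in practice the printed value is therefore targeted somewhat below nominal.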
Today, special components in miniature form are available - carefully encapsulated, measured and selected - whose terminals can be reflow-soldered. Not only transistors and integrated circuits are available, but also tantalum or ceramic capacitors and high-frequency inductors, all of them insulated and having the desired form. Although all these component types are more expensive than the types with wire terminals, the financial effort is justified when their use is correlated with the preferred mounting technique for hybrid integrated circuits - the reflow method [8.2]. In the reflow method, the substrate is first selectively tinned and provided with the flux agent. Afterwards, the insulated and already tinned components are positioned. The partitioned substrate is heated for a short time above the tinning temperature, until the solder becomes fluid (reflow). In this manner a very great number of reliable solder joints are made in the shortest time, and the fluid solder surface produces a supplementary self-centering. By thinning certain substrate portions, it is also possible to cover the desired soldering points with a tinning paste, which helps the components to catch on the substrate before the actual soldering. Then, the terminals are soldered by the same reflow method or by the normal soldering method, and the circuit is ready.
An interesting characteristic of the thick-film technique is that it allows crossing conductor lines to be obtained.
Pastes
Depending on their composition and destination, three paste types can be distinguished.
The pastes for conductive paths contain a noble-metal powder; the most recommended combination is Pd-Ag.
The resistor pastes are characterised by: the range of resistance values, the temperature coefficient of the resistor, the dissipated power per cm² (the mean value is 5 W/cm²), electrical noise, temperature drift, loading drift, stability³, sensitivity to microclimatic conditions, the length/width ratio, and the print profile. The surface (sheet) resistance varies between 3 Ω/□ and 10 MΩ/□ and depends on the paste composition and on the thickness of the dried layer. The precision without trimming varies between 15 and 30%; with trimming, a precision of 1% or better can be obtained. Due to the semiconductor character of the thick-film resistor, the noise spectrum has a 1/f form (f = frequency); the noise is expressed in dB (Fig. 8.5) and is proportional to the applied voltage. The noise voltage (in µV/V) corresponds to each frequency decade. Elements with a given specific surface resistance (Rs) have values between 2 µV/V (about +5 dB) and 5 µV/V (about +15 dB). In general, the noise of a resistor layer depends on:
• the specific surface resistance (pastes with high Rs are noisier);
• the composition (complex pastes have a higher noise than simple ones);
• geometry and compensation.
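The noise figures in µV/V and in dB quoted above are related by the standard noise-index definition NI(dB) = 20·log₁₀(NI in µV/V). This formula is not stated explicitly in the text, and it reproduces the quoted dB values only approximately; a quick check:

```python
import math

def noise_index_db(microvolts_per_volt: float) -> float:
    # Standard definition: NI(dB) = 20*log10(noise in uV per volt applied),
    # per frequency decade.
    return 20 * math.log10(microvolts_per_volt)

for ni in (2, 5):
    print(f"{ni} uV/V -> {noise_index_db(ni):+.1f} dB")
# 2 uV/V gives +6.0 dB and 5 uV/V gives +14.0 dB, close to the
# +5 dB and +15 dB figures quoted in the text.
```

The small discrepancy suggests the book's dB values are rounded; the µV/V figures are the ones to use in calculations.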
The dielectric pastes are used for crossing lines and protective coatings. Titanium dielectrics allow very high dielectric constants to be obtained, so that capacitors up to 20 000 pF/cm² and breakdown voltages of 50...100 V are feasible.
The glass pastes have a lower firing temperature and can also be used as resistors.

8.3.1
Failure types

Depending on the technology used, a thick-film circuit is a comprehensive ensemble of materials and components of various types and origins. That is why the quality and the reliability of these circuits depend on the different materials, components and manufacturing methods.
In Table 8.2 the most frequent types and causes of failure for thick-film hybrid circuits [8.10] are presented. One may notice that numerous and different types of failure depend directly on the manufacturing method and on the materials used.

³ For Birox 1400 the mean tolerance is 0.24% (for the series 17 of Du Pont, even 0.1%, and - in general - the performance is maintained under 1%).

Fig. 8.5 Noise characteristics of Birox 1400 pastes before and after laser trimming, as a function of the resistor surface (0.65 to 13 mm²; roughly -25 to +5 dB for 2...3 mm²). For the Birox 1400, 17B and 17G pastes of Du Pont, better noise figures may be obtained

8.3.2
Reliability of resistors and capacitors

A few reliability data concerning the thick-film hybrids are available. In accordance with the Sprague report [8.3], the following failure rates have been ascertained:
• resistors: after 1000 working hours at nominal load and +70°C, a failure rate λ = 1.2×10⁻⁶/h was obtained, with a maximum drift of 0.5...0.7% of the nominal value;
• capacitors: after 1000 working hours at +85°C and twice the working voltage, a failure rate λ = 3.4×10⁻⁶/h was obtained, for a capacitance drift smaller than ±20% and an insulation resistance greater than 10³ MΩ at the end of the cycle, compared with 10⁴ MΩ at the beginning of the test.
These failure rates indicate the order of magnitude of the reliability level obtained in the large-series manufacturing of hybrid integrated circuits.
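Failure rates of this kind combine additively in the usual constant-failure-rate series model. A small illustrative sketch - the part counts are invented for the example, only the two λ values come from the text:

```python
# Constant-failure-rate series model: lambda_total = sum(n_i * lambda_i).
LAMBDA_R = 1.2e-6  # per hour, thick-film resistor (from the text)
LAMBDA_C = 3.4e-6  # per hour, thick-film capacitor (from the text)

# Hypothetical hybrid with 12 printed resistors and 4 chip capacitors.
n_r, n_c = 12, 4
lam_total = n_r * LAMBDA_R + n_c * LAMBDA_C
mttf = 1 / lam_total

print(f"lambda_total = {lam_total:.2e} /h")  # 2.80e-05 /h
print(f"MTTF = {mttf:.0f} h")
```

Such a part-count estimate is pessimistic for a real hybrid, since - as the previous sections showed - the substrate, conductors and through-contacts contribute far less than the printed resistors and the chip capacitors.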

8.3.3
Reliability of "beam-leads"

Table 8.2 Usual causes and modes of failure of thick-film hybrids

Substrate
• Physical manifestation: fissures
• Origin and cause: ceramic manufacturing; transport and handling; fabrication process; stencil process; firing of the stencilled layers; handling of the finished module

Conductors of stencilled layers
• Physical manifestation: detachment of a conductor; permanent or intermittent interruption; short-circuit produced by other components
• Origin and cause: bad adherence; fissure of a conductor; wrong implantation design; migration of silver ions

Resistors
• Physical manifestation: instability; hot spots; incidental drift; short-circuit at high temperature
• Origin and cause: excessive trimming of the resistor; attack of hydrogen on low-resistivity inks; electrochemical reactions; amine emissions; inadequate visual inspection

Passive components
• Physical manifestation: short-circuits of capacitors; microfissures; diminished insulation resistance of chip capacitors
• Origin and cause: technological defects; defective components

Active components (on boards and in the case)
• Physical manifestation: tearing of the connection (wire bonded on the stencilled layer); intermittent interruption of microconnections; fragility of the connections (thermocompression bonding); intermittent interruption of the circuit; bad electrical contact; permanent or intermittent short-circuit
• Origin and cause: use of a defective capillary; differing thermal dilatation of the materials; purple plague (AuAl2); surface charges; bad soldering; bad positioning; excess of soldering material

Connections
• Physical manifestation: open circuit; breakdown between two conductors
• Origin and cause: bad soldering; insufficient quality

Output wires
• Physical manifestation: interruption of the circuit; appearance of a parasitic resistance
• Origin and cause: bad adherence of the stencilled zone

Cases
• Physical manifestation: hermeticity defects (metallic cases); hermeticity defects (ceramic cases)
• Origin and cause: bad closing; porosities or gas occlusions in the closing materials; fissures

In 1962, Bell Telephone (USA) developed the interconnection and mounting technology for semiconductor components named beam-leads. This technique has numerous advantages, but the most important is doubtless the higher reliability. In the case of beam-leads, the chip has strip connections extending beyond its edges.
With the aid of a special machine, it is possible to make all the connections in a single operation.
According to published data, the standard failure rate of beam-leads is of the order of λ ≈ 10⁻⁸/h. After screening tests, these circuits reach a failure rate of λ = 5×10⁻¹⁰/h - a remarkable result. Queyssac (Motorola) explains this by reasons linked to the manufacturing technology:
• complete passivation of the active chip (silicon nitride);
• gold-to-gold bonding (no purple plague);
• no (or small) mechanical stress at mounting; practically all fissures or scratches are excluded, which leads to better long-term reliability;
• chemical separation of the chips; no microfissures;
• no internal soldering of terminals (in this way, about 30% of the normal failure causes of conventional circuits are eliminated).

Table 8.3 Some encapsulation techniques

Technique                   Unencapsulated chip   Beam-lead             Flip-chip           Spider bonding

RELIABILITY
Hermeticity                 No                    Yes                   No                  No
Surface protection          Fair                  Excellent             Fair                Fair
Soldering reliability       Poor                  Excellent             Fair                Fair
Possibility of
soldering control           Yes                   Yes                   No                  Yes
Manufacturing               Standard              Standard, until the   Standard, until     Standard,
                                                  emitter diffusion     metallisation       excepting the
                                                                                            soldering
Thermal characteristics     Excellent             Excellent             Fair                Excellent

COSTS
Structure cost              Small                 High                  High                Fair
Reparation facilities       Yes                   Yes                   Yes                 No
Facilities for building
a multistructure in a
single case                 Very small            Fair/good             Excellent           Poor
System-level cost           Very high             Fair/small            Small               High

The beam-lead circuits have particularly good mechanical characteristics and successfully withstand the following tests:
• Acceleration: 135 000g;
• Corrosion: 1000 working hours (steam atmosphere) at 350°C;
• Thermal cycles: 30 cycles (-65°C to +200°C);
• Shocks: 1500g during 0.5 ms (three axes);
• HTRB: 100 hours at +300°C.
In Table 8.3 the features of the various encapsulation methods, in accordance with Motorola [8.1], are shown.

8.4
Thick-film versus thin-film hybrids

An advantage of the thick-film hybrids is the possibility of obtaining, with the aid of various pastes, very different resistance values (in practice, from 100 Ω to 10 MΩ) in the same circuit. By adjustment, resistors with tolerances of 0.5% may be obtained; however, the thick-film resistors are not as stable (2%) as the thin-film resistors. The latter are metallic-film resistors, with well-known remarkable properties. If the specifications concerning the stability and the distribution are not very demanding, the thick-film technique should rather be used. Thin-film resistors can be laser-adjusted to within ±0.1%; their stability is 0.3%, and their temperature coefficient, (40 ± 20)×10⁻⁶ K⁻¹, is better than that of thick-film resistors (250×10⁻⁶ K⁻¹). But the resistance range is smaller (200 Ω to 1 MΩ).
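The practical impact of the temperature coefficients quoted above can be made concrete with a simple drift estimate, ΔR/R = TCR · ΔT. The following sketch uses the nominal TCR values from the text; the 50 K temperature rise is an assumed example, not a figure from the book.

```python
# Temperature-induced resistance drift, Delta_R/R = TCR * Delta_T.
# TCR values are the nominal figures quoted in the text; the 50 K rise is assumed.

def drift_percent(tcr_per_k: float, delta_t_k: float) -> float:
    """Relative resistance change, in percent, for a given TCR and temperature rise."""
    return tcr_per_k * delta_t_k * 100.0

TCR_THIN = 40e-6    # thin film: (40 +/- 20) x 10^-6 K^-1 (nominal)
TCR_THICK = 250e-6  # thick film: 250 x 10^-6 K^-1

delta_t = 50.0  # assumed temperature rise in kelvin

print(drift_percent(TCR_THIN, delta_t))   # ~0.2 % for thin film
print(drift_percent(TCR_THICK, delta_t))  # ~1.25 % for thick film
```

The roughly sixfold difference in drift explains why thin film is preferred when tight stability is specified.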
An advantage of the thin-film circuits is the resolution of the conductive lines and of the resistors, which is finer than that of the thick-film circuits. This leads, in principle, to a smaller volume. The dimensions are determined not only by the line resolution, but also by the size of the discrete components.
Another advantage is that thick-film circuits permit crossing lines. On the other hand, the crossing lines can be avoided by a proper mask selection (the crossings are placed under the discrete components). Moreover, in the case of thick films, manufacturing the crossings often requires two different printing stages, increasing the circuit costs.
Even if for both circuit types the starting point is a ceramic substrate with the same thickness, their compositions are nevertheless different. The purity of the ceramic for the thick-film circuits is 96%, and that for thin-film circuits is 99.6% (which is why the latter are somewhat more expensive). This is because the ceramic surface for the thick-film circuit must be rougher, to assure a good adhesion of the paste during the stencil process. On the contrary, the substrate of a thin-film circuit must be flat and smooth, to obtain reproducible metallic layers.
The thin-film circuits have better noise and high-frequency characteristics than the thick-film circuits. The other relative characteristics, such as the stability of the resistors and of their temperature coefficients, are better too.
Another difference is the size of the ceramic substrate that can be processed at once. In the thin-film technique, more circuits can be set on the same substrate. If unencapsulated structures must be used, the thin-film technique has the advantage that its conductive lines are coated with a gold layer, which makes possible their firm and sure connection with the gold terminals. For the introduction of the unencapsulated structures into the thick-film circuits, the contact points must first be made with the aid of a paste containing gold, and this paste is relatively expensive. Experience indicates that about 50% of all circuits are made in thick-film technique and the rest in thin-film technique.
Joly [8.13] gives an example (a telecommunications circuit for military applications) of a circuit realised in both technologies. In accordance with the performed mechanical and screening tests (2000 working hours at +125°C), the hybrid circuits still remained within the value range obtained at the initial measurements: no failures (for both technologies). Based on these results, the technical and economical consequences of the two technologies were studied. The comparison is valid for hermetic cases and unmounted chips, but different substrates.

Comments
• For the thin-film circuits, the integration density is greater (on the same substrate surface, 10 thin-film circuits can be integrated, versus 4 circuits for thick-film technology).
• The necessary number of photo patterns is 6 for thick-film, and 2 for thin-film
circuits.
• For the thick-film circuit, thermocompression remains a very difficult manufacturing method.
• The cathodic spraying technique (for the adjustment of resistors) is an expensive, time-consuming and difficult-to-automate method. The laser technique allows obtaining a good stability of the components (for both technologies), but the time consumption is 2-3 times greater for the thick-film circuits.
• The infrastructure is 2-3 times more expensive for the thick-film circuits.
• The noble metal content of the thick-film circuits is 4 times greater than that of
thin-film circuits.
• The drifts of temperature coefficients and of the resistor stability are roughly the
same.

[Figure: relative costs versus circuit complexity, with one curve for thick-film and one for thin-film circuits; the curves cross at an intersection point that depends on the production quantity]
Fig. 8.6 Evaluation of the relative costs for the thick- and thin-film integrated circuits

In Fig. 8.6 [8.14] the costs of the two technologies are shown; it follows that thick film is more adequate for simple integrated circuits, while for complex circuits it is more advantageous to use the other technique. The intersection point depends, to a small extent, on the production volume and shifts towards thick-film circuits as the number of manufactured ICs grows.
If several thousand items are manufactured monthly, the production costs are a little smaller. For a small number of items, the thin-film technique leads to greater production costs. The two technologies are not rivals, but complement each other.

8.5
Reliability of hybrid ICs

Although almost all electronic components are available in the form of chips usable in hybrid ICs, only capacitor chips and semiconductor chips are generally used. The general specifications are:
• small substrate surfaces, since the costs grow with the surface area; resistors with great ohmic resistance require a greater substrate surface, and the precision capacitors are very expensive and difficult to maintain;
• minimisation of the number of hybrid elements whose mounting requires intensive labour, increasing the costs.
Besides the utilisation of expensive components, reliable circuits, and basic tests, other approaches (such as tolerance analysis, drift analysis, testability and MTBF forecast) have been included during the research work. For circuits with high dissipated power, or for circuits that must have high temperature stability, a thermal analysis is often undertaken with a triple aim:
1) discover the hot spots;
2) detect the temperature growth of critical components because of the microclimate (evaluation of the influence of self-heating on the drift);
3) determine the MTBF with the aid of MIL-HDBK-217. An important utilisation of the MTBF is the comparison of alternative manufacturing possibilities, with the aim of selecting the one leading to higher MTBF values. Another measure in this sense is derating.
During manufacturing, the principal measures are: input control of all materials and components; careful supervision of all manufacturing phases (visual control of equipped and soldered substrates), to identify scratches on the semiconductor chips and areas of bad soldering on the capacitor chips; documentation of fabrication; and maintenance of the defined conditions for the microclimate (with clean rooms, for example).
A statistical evaluation of the parameters measured at testing often allows some conclusions about possible problems, especially if the measurements are made during a lifetime test.
To avoid early failures during the normal life, the finished products are usually exposed, before delivery, to extreme conditions with the aim of detecting all the hidden failures. For each type of failure the components are exposed to specific screens. A proper selection of tests can eliminate the components having weak points. The failures produced before the end of the normal period of life are due to the methods and materials used, and have a random character. If the testing of materials is made with the greatest care, and if the fabrication process is 100% mastered and carefully supervised, the final test should identify only those components with defects not detectable during fabrication. The final test will find out and eliminate these components.
In the ideal case, the methods and materials used determine the lifetime. Increasing the lifetime is possible only if better methods and/or materials are utilised.
Platz [8.16][8.17] has indicated that an IBM circuit has an MTBF of 10⁸ hours, the volume of tests being 3 × 10¹⁰ circuit-hours. In general, these tests are performed twice:
a) for normal working conditions (to calculate the predicted failure rate);
b) for higher stress (to emphasise the failure mechanisms).
By comparison with classical circuits on small boards, the principal advantage of hybrids is the smaller number of connections. For example [8.16][8.17], a resistor integrated in a hybrid circuit is far more reliable than a discrete resistor soldered on a board. In accordance with IBM data, the MTBF value is greater than 10⁶ years! The reliability level of a hybrid circuit depends on the size of the series: the greater the series, the better the reliability. In accordance with MIL-HDBK-217, the predicted failure rate is:

λp = λb (πT · πF · πQ · πE) failures / 10⁶ hours. (8.5)

The following coefficients must be known:
πT - temperature,
πF - function,
πQ - quality,
πE - environment,
and the terms of the following relation:

λb = λS + λSC + Σ λR NR + Σ λCA NCA + λs πs (failures / 10⁶ hours) (8.6)

represent the contribution of the different parts, as follows:
λS + λSC + Σ λR NR - contribution of the substrate;
Σ λCA NCA - contribution of the components included in the hybrid circuit;
λs πs - contribution of the package.
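Relations (8.5) and (8.6) can be sketched numerically as follows. All the λ values and π factors below are invented placeholders chosen only to illustrate the structure of the model; they are not handbook data.

```python
# Sketch of the hybrid failure-rate model of relations (8.5) and (8.6).
# Every numerical value below is an invented placeholder, not MIL-HDBK-217 data.

def lambda_b(lam_s, lam_sc, resistors, components, lam_pkg, pi_s):
    """Base failure rate, relation (8.6): substrate + attached components + package.
    resistors and components are lists of (lambda, count) pairs."""
    substrate = lam_s + lam_sc + sum(l * n for l, n in resistors)
    attached = sum(l * n for l, n in components)
    package = lam_pkg * pi_s
    return substrate + attached + package

def lambda_p(lam_b, pi_t, pi_f, pi_q, pi_e):
    """Predicted failure rate, relation (8.5), in failures / 10^6 h."""
    return lam_b * pi_t * pi_f * pi_q * pi_e

lb = lambda_b(0.01, 0.005, [(0.002, 8)], [(0.05, 3)], 0.02, 1.5)
print(lambda_p(lb, pi_t=2.0, pi_f=1.0, pi_q=1.0, pi_e=2.0))  # ~0.84 failures / 10^6 h
```

The additive structure of (8.6) makes it easy to see which part (substrate, attached components, or package) dominates the base rate before the π factors are applied.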
In Fig. 8.7 [8.18] a comparison is shown between λo, the observed failure rate, and λp, the predicted failure rate [relation (8.5)], for a hybrid circuit, based on the data obtained from a user [long observation period; without burn-in data; confidence level 75%; exponential failure distribution].
The measured failure rates of a simple hybrid module, formed by two PNP transistors 2N2007 and some resistors, during the operation life [8.19] are:
λ1 = 0.2 × 10⁻⁹ h⁻¹ for resistors, and λ2 = 12 × 10⁻⁹ h⁻¹ for transistors.
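With a series reliability model, these measured per-device rates can be combined into a module failure rate by simple summation. The resistor count below (4) is an assumption for illustration; the text does not state how many resistors the module contains.

```python
# Series reliability model for the simple hybrid module quoted above:
# two 2N2007 transistors plus some resistors. The resistor count (4) is assumed;
# the per-device rates are those measured in [8.19].

LAMBDA_RESISTOR = 0.2e-9    # failures per hour
LAMBDA_TRANSISTOR = 12e-9   # failures per hour

n_resistors, n_transistors = 4, 2  # assumed module content

lambda_module = n_resistors * LAMBDA_RESISTOR + n_transistors * LAMBDA_TRANSISTOR
mtbf_hours = 1.0 / lambda_module

print(lambda_module)  # ~2.5e-8 failures/h: dominated by the transistors
print(mtbf_hours)     # ~4e7 hours
```

Even with this crude model, the transistors clearly dominate the module failure rate, since their per-device rate is 60 times that of the resistors.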

[Figure: log-log plot of the observed failure rate λo versus the predicted failure rate λp, with the line λo = λp as reference; data points for users A ... L, including multichip circuits]
Fig. 8.7 The experience of users (A ... L) versus predicted failure rates

In Fig. 8.8 the primary causes of failures of small-power hybrid circuits are shown. The majority of failures are either breakdowns or soldering failures (especially for thermocompression).

Active components 31.3%
Soldering of connections 23.2%
Contamination 21.4%
Substrate 8%
Unidentified causes 8%
Encapsulation 1.8%

Fig. 8.8 Primary causes of failures of small power hybrid circuits

8.6
Causes of failures

Himmel and Pratt [8.20] arrive at the conclusion that 60% of failures are failures of active components, 23% failures of the connections, 9% failures of integrated
Soldering 33.3%
Connections 32.4%
Active components adhesion 10.8%
Active components 10%
Contamination 6.36%
Other 7.2%

Fig. 8.9 The primary causes of the failures (power hybrid circuits)

LINEAR CIRCUITS
                                     Thin-film   Thick-film
Metallisation of interconnections    11.2%       11.5%
Resistive films                      11.1%       11.75%
Encapsulation                        11.1%       17.64%
Structure                            44.2%       11.77%
Foreign material                     -           17.6%
Miscellaneous                        -           17.98%

DIGITAL CIRCUITS
Metallisation of interconnections    13.72%      -
Resistive films                      5.88%       -
Encapsulation                        17.64%      -
Structure                            25.48%      3.3%
Wires soldering                      21.58%      1.1%
Foreign material                     13.72%      28.8%
Substrate                            -           66.71%
Miscellaneous                        1.98%       -
Fig. 8.10 Statistical reliability data for hybrid circuits

[Figure: temperature rise T - Tu (°C) versus power dissipation density (W/inch²), with and without cooling radiator, for: 1 - enamelled layer; 2 - aluminium oxide; 3 - beryllium oxide]

Fig. 8.11 Without a cooling radiator, the enamelled layer works at a lower temperature than that of an equivalent aluminium oxide chip. As a consequence, for the aluminium oxide a cooling radiator gives a better power dissipation. 1 - enamelled layer; 2 - aluminium oxide; 3 - beryllium oxide

resistors, 5% failures of passive discrete components and 3% of failures are due to other causes. If the dependence of the hybrid circuits on the dissipated power is taken into account, the primary failures for small-power and power circuits are represented as in Figs. 8.9 and 8.10.
As one can see, in these figures the failure mechanisms or failure causes are not shown; for example, the category 'failures of active components' can also include the case of a crystal crack, although this can be produced either during the chip manufacturing or by a bad cooling of the chip.

Table 8.4 The efficiency of screening tests (MIL-STD 883, method 5004, class B)

Test Failures (%)


Visual internal examination 25
Temperature stabilisation -
Thermal cycles -
Thermal shock -
Constant acceleration 2
Hermeticity 30
Intermediate electrical tests 15
Burn-in 3
Final electrical tests 24
Visual external examination 1

Table 8.5 Typical failure rates of components for hybrids (FIT), versus the working temperature (°C). [It is recommended to be used only for cost evaluation and circuit classification, since the data are strongly process-dependent]

                                          Temperature (°C)
Component                                 25      50      75      100     125
Thick-film resistor                       5       10      15      20      25
Capacitor-chip                            10      15      25      60      250
Wire-contact (thermocompression):
  Au-Al                                   0.05    0.2     10      60
  Al-Au                                   0.1     0.1     0.1     0.1     0.5
  Al-Al                                   0.1     0.1     0.1     0.1     0.1
  Au-Au                                   0.04    0.04    0.04    0.04    0.04
Crossovers                                0.05    0.05    0.06    0.08    0.1
Transistor-chip (small power)             3       9       27      70
Power-transistor-chip                     50      100     300     900     1700
Diode-chip                                3       9       27      70
Integrated circuits:
  Four-gates (or equivalent)              20      36      180     820     2400
  Dual flip-flop (or operational amplif.) 40      72      360     1640    4800
  SSI                                     125     225     1125    5120    15000
  MSI                                     250     459     2250    10200   30000
  LSI                                     500     900     4500    20400   60000
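For temperatures between the tabulated points of Table 8.5, a crude estimate can be obtained by linear interpolation. This interpolation is an assumption of this sketch, not something prescribed by the table; as the table's own note says, the data should only be used for rough cost evaluation. Only two rows are transcribed here as a demonstration.

```python
# Linear interpolation of the FIT values of Table 8.5 between the tabulated
# working temperatures. Only two component rows are transcribed, as a demo.

TEMPS = [25, 50, 75, 100, 125]  # working temperature, deg C

FIT_TABLE = {
    "thick-film resistor": [5, 10, 15, 20, 25],
    "capacitor-chip": [10, 15, 25, 60, 250],
}

def fit_at(component: str, temp_c: float) -> float:
    """FIT (failures per 10^9 h) at temp_c, linearly interpolated; clamped at the ends."""
    fits = FIT_TABLE[component]
    if temp_c <= TEMPS[0]:
        return float(fits[0])
    if temp_c >= TEMPS[-1]:
        return float(fits[-1])
    for (t0, f0), (t1, f1) in zip(zip(TEMPS, fits), zip(TEMPS[1:], fits[1:])):
        if t0 <= temp_c <= t1:
            return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)

print(fit_at("thick-film resistor", 60))  # 12.0
print(fit_at("capacitor-chip", 110))      # 136.0
```

Note how steep the capacitor-chip row becomes above 100°C; a linear estimate between such widely spaced points is optimistic at best.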

If the hybrids are classified only depending on the layer thickness, one may find the situation published by RADC [8.18] and shown in Fig. 8.11.
Concerning the efficiency of the screening methods stipulated by MIL-STD 883 (method 5004, class B), Caldwell and Tichnell [8.26] published the data presented in Table 8.4.
In Table 8.5 a survey of typical failure rates of the components utilised in the manufacture of hybrids is given.

8.7
Influence of radiation

The integrated circuits used today in military projects must withstand radiation. A number of users expect good stability and normal working even if the circuits have been exposed to radiation for a long time. From this point of view, the thick-film resistors have a very good behaviour. The performed tests indicate that these integrated circuits resist even in extreme conditions, and work within the allowed power limits. The typical modifications of the resistance are minimal.
These advantages of thick-film hybrids are possible only if the methods and the materials are according to the specifications. That is why careful research on materials, and routine controls, essential during the manufacturing process, are needed. So, for example, only pastes prepared by exactly observing the tolerances must be used. The main parameters of a fabrication batch must be completely specified, without neglecting the quality control with the aid of long-duration current tests.

8.8
Prospect outlook of the hybrid technology

The enamelled metallic layers are important achievements of the last years (Fig. 8.12). Their advantages are good heat dissipation and the possibility of manufacturing substrates having the desired forms and a good mechanical resistance.
Another new development is the polymeric paste for thick film, with an expected cost reduction. This paste contains suspensions of conductive carbon particles in an organic medium. Plastic materials are used as substrate.
By using non-encapsulated semiconductors (especially integrated circuits deposited onto the substrate with the chip-and-wire technology), more complex integrated circuits can be produced with the aid of an automated method named Tape Automated Bonding (TAB), thus enriching the range of products.
In Figs. 8.13-8.20 the main manufacturing phases of a thick-film circuit for the transmitting band filter LOV-21, produced by Ascom Ltd., Berne, are shown.

Fig. 8.12 A good example of thick-film circuit: a band filter (Ascom Ltd., Berne)


Fig. 8.13 Conductive lines printed on ceramic substrate; drying at +150°C; baking of the conductive lines at +850°C

Fig. 8.14 Printing of the first resistor paste; drying at +150°C

Fig. 8.15 Printing of the second resistor paste; drying at +150°C; pastes baking at +850°C

Fig. 8.16 Printing the protection layer (glazing); drying at +150°C; baking the glazing at +500°C

Fig. 8.17 Printing the solder (which remains wet for component mounting); mounting of capacitors; reflow-soldering

Fig. 8.18 Measuring of all capacitors; calculation of nominal values of resistors (97% of nominal
value); ageing of substrate (70 hours at +150°C)

Fig. 8.19 Fine adjustment of resistors at nominal value



Fig. 8.20 Mounting of the active components; mounting of connections

Storage at high temperature (96 h / +125°C) -> Pre-tin-plate (wave), 4 s / 240°C (beginning with 90°C) -> Electrical test at 70°C -> Pins bending of the ICs

Fig. 8.21 Pre-treatment of integrated circuits for thick-film hybrids [8.21][8.22]

Some advantages of using hybrids [8.25] compared with discrete circuits are the
following:
• Electrical properties: (i) higher-frequency performance; (ii) higher density; (iii) predictability of design; (iv) long-term stability and reliability; (v) low temperature coefficient of resistance; (vi) small absolute and relative tolerances; (vii) ability to trim components for both passive and functional response; (viii) high thermal conductivity of substrates.

lower warranty costs; (vi) easy serviceability and replaceability in the field; (vii) relatively simple processing and assembly techniques; (viii) low development cost.

Table 8.6 Properties of thick-film substrates [8.25]

Characteristics              Unit      Conditions   99.5% Alumina   99.5% Beryllia
Therm. coeff. of expansion   1/°C      25-300°C     6.6 × 10⁻⁶      7.5 × 10⁻⁶
Thermal conductivity         W/cm·K    25°C         0.367           2.5
                                       300°C        0.187           1.21
Dielectric constant                    1 MHz        9.9             6.9
Dielectric strength          V/mil                  220             230
Dissipation factor                     1 MHz        0.0001          0.0003
Bulk resistivity             Ω·cm      25°C         10¹⁴            10¹⁴
Camber                       mil/in.                4               3
Surface finish               µin.                   10              20
Tensile strength             psi                    28000           23000

Table 8.7 Properties of thin-film substrates [8.25]

Characteristics             Unit     Cond.    Alumina        Corning        Quartz          Sapphire
Therm. coeff. of expansion  1/°C     25°C     6.7 × 10⁻⁶     7.6 × 10⁻⁶     0.49 × 10⁻⁶
Thermal conductivity        W/cm·K   25°C     0.367          0.017          0.014           0.417
                                     300°C    0.187          0.008          0.008
Dielectric constant                           10.1 at 1 GHz  5.84 at 1 GHz  3.826 at 1 MHz  9.39 at 1 GHz
Diel. strength              V/mil             770            410            190
Dissipation factor                   1 MHz    0.0001         0.000015       0.0001
                                     8.6 GHz  0.0002         0.0036         0.00012         5 × 10⁻⁵
Bulk resistivity            Ω·cm     25°C     3.16 × 10¹⁰    10¹⁴           3.16 × 10^…     10¹⁴
Camber                      mil/in.           3              1              1               1
Surface finish              µin.              1.0            1              1               1
Tensile strength            psi                                             7000            58000

8.9
Die attach and bonding techniques [8.31] ... [8.35]

8.9.1
Introduction

Package parasitics introduce fundamental limitations on the bandwidth of circuits using packaged semiconductor devices. The difficulties of designing around these parasitics have stimulated the development of microwave integrated circuits.

The accompanying advantages of better reproducibility, lower cost, and smaller size are often more important than the parasitic considerations. In fact, these advantages are now being recognised by designers of lower-frequency systems and, as a result, circuit boards with packaged diodes are being replaced by hybrid integrated circuits using semiconductor chips, beam leads, or other forms of diodes designed for these circuits.
From the reliability viewpoint, it is often desirable that the chemical interaction is dominant and that the bonding strength, which is the measured value of the adhesion, ranges from 10 to 100 N·mm⁻². The bonding strength depends on the basic adhesion, but also on extraneous factors, such as the stresses in the layers, and on the measuring techniques. The adhesion decreases with life; the bonding strength can decrease [4.29] to about half its initial value after storage at 150°C for 5000 hours; this can be caused by: (i) diffusion of the adhesion layer into the adjacent metallic layer, the diffusion being enhanced by the stress in multilayers [4.30]; (ii) recovery of atomistic defects; (iii) a chemical reaction.
Wafer bonding started as a specific way to fabricate inexpensive thick (>1 µm) film silicon-on-insulator (SOI) materials of high quality [4.31]. In the meantime, ultra-thin SOI layers can be produced by wafer bonding and proper thinning techniques. In addition, silicon wafer bonding has been shown to be a versatile technique for fabricating sensors and actuators. Especially in this area it is desirable to perform bonding at a temperature as low as possible. Wafer bonding may also be used to produce combinations of materials, which may differ in terms of structure, crystallinity or lattice constant.

8.9.2
Hybrid package styles

Chips. The need for specialised equipment for die attach (connecting the base of
the chip to the circuit) and wire bonding (connecting the chip top contact to the
circuit) limits the use of the chips. The number of assembly operations is less for
other hybrid package styles, so assembly costs are usually higher for chips. High
volume production can be an exception because automatic equipment for die attach
and bonding becomes economically feasible.
Die attach. Chips may be mounted using eutectic solders ranging from AuSi (370°C) to AuSn (280°C), as well as conductive epoxies. Eutectic die attach may be performed using either substrate heating or localised heating techniques. To ensure observable eutectic flow and/or filleting, generally a 0.005" border around the chip is suitable. The localised heating technique involves the use of an accurately controlled stream of hot inert gas directed at the chip and the immediate area. It offers advantages in rework and lower substrate assembly temperatures.
GaAs FET chips. The FET chip can be die attached manually, using a pair of tweezers, or automatically, using a collet. In either case, provide a flow of nitrogen over the workstage area. Start with a workstage temperature of 280°C and raise it as required. The chip should not be exposed, however, to a temperature higher than 320°C for more than 30 seconds. An 80/20 gold/tin preform, 25 µm thick, with the same area as the chip, is recommended. A standard round preform with the same volume may also be used. When using tweezers, make sure that the chip is positioned so as to facilitate subsequent wire bonding.
GaAs material is more brittle than silicon and should be handled with care.
When using a collet, it is important to have a flat die attach surface. By using a
minimum of downward force, the chance of breaking the chip is reduced (Fig.
8.22).
[Figure: chip pressed down with a controlled force, in a controlled atmosphere, onto a solder preform (or conductive epoxy) over the film metallisation of the substrate]

Fig. 8.22 Chip mounting

Bipolar chips. The bipolar chip is die attached with gold-silicon eutectic under nitrogen ambient. The eutectic temperature is 370°C. Start with a workstage temperature of 380°C and raise the temperature until eutectic flow takes place. The chip should be lightly scrubbed using a tweezer.
Diode chips. Table 8.8 shows the preform type and die attach conditions for different types of diode chips. The die attach operation should be performed in a reducing atmosphere, such as forming gas, or in an inert atmosphere, such as nitrogen. When a single station is used, the operator holds the chip down for a few seconds, until the preform melts and a fillet appears around the edge of the chip, or until eutectic flow is observed. For higher-volume operations, a belt furnace is used. Weights are placed on the chips to assure good adhesion when the preform melts. Temperature, weight, and time are adjusted experimentally to accommodate different chip sizes, circuit configurations, and heating equipment.

Table 8.8 Die attach - diode chips

Diode type                          Preform     Temperature (°C)
SRD; PIN Schottky; other Schottky   Gold/tin    310

Lead bond. The criteria for choosing a specific technique are generally the size of the contact area on the chip, the sensitivity to temperature, and the available equipment. To avoid damage to the circuit, use the minimum values that provide an adequate bond. Wire, ribbon, or mesh is used. When the bonding pad is small, the wire diameter is usually 18 to 25 µm, in order to keep the wire inside the bonding pad. Typical starting temperatures are 225°C for the work stage and 150°C for the bonding tool. The bonding tool may be a wedge or a capillary. Pressure is applied to deform the wire or ribbon by about 50%. A force of approximately 0.024 gf per square micrometre (15 gf per square mil) is needed.
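The pressure figure above can be turned into a tool-force estimate for a given deformed-wire footprint. The conversion below follows directly from 1 mil = 25.4 µm; the 25 µm × 75 µm bond footprint is an assumed example, not a value from the text.

```python
# Bonding-force estimate from the pressure figure quoted above:
# 15 gf per square mil, which is about 0.024 gf per square micrometre.
# The 25 um x 75 um deformed-wire footprint is an assumed example.

MIL_IN_UM = 25.4                           # 1 mil = 25.4 um
GF_PER_MIL2 = 15.0
GF_PER_UM2 = GF_PER_MIL2 / MIL_IN_UM**2    # pressure in gf per square micrometre

def bond_force_gf(width_um: float, length_um: float) -> float:
    """Required tool force (gram-force) for a given deformed-wire footprint."""
    return GF_PER_UM2 * width_um * length_um

print(round(GF_PER_UM2, 4))              # 0.0233, i.e. ~0.024 gf/um^2 as in the text
print(round(bond_force_gf(25, 75), 1))   # 43.6 gf for the assumed footprint
```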

Beam lead. The beam lead device is a silicon chip with co-planar plated gold tabs that extend parallel to the top surface of the chip, approximately 10 mils beyond the edge. If size is the major concern, beam lead diodes, not chips, are the correct choice. Handling must be done with care, since the pressure of tweezers may distort the leads. However, the diodes will stick to the tip of a tweezer point or to the rough edge of a broken Q-tip. A vacuum pickup may be used, but the needle must be small enough to prevent passage of the diode. Schottky barrier beam lead diodes are easily damaged by static electricity, just as packaged diodes are. Contact to the circuit should never be made with the free side of the diode, because this would allow static electricity from the operator's hand to flow through the diode. Instead, the side of the diode to be attached should be contacted first. If there is any chance that the two circuit attachment points are at different potentials, they should be brought together with a grounding lead before contacting the diode.

[Figure: beam lead device placed face down on the substrate metallisation, the tabs attached by a bonding wedge or pulse-heated probe, or by step or parallel-gap welding]

Fig. 8.23 Beam lead attachment requires thermocompression bonding or parallel gap welding to the substrate metallisation

Thermocompression bonding is a satisfactory joining technique. The device is placed face down, with the tabs resting flat on the pad area, and bonded using either a heated wedge (and/or substrate) or the parallel-gap technique (Fig. 8.23). The heated wedge may be continuously heated, as in most standard equipment, or it may be pulse-resistance heated, where a high-current, short-duration pulse is used to raise the wedge to the required temperature. In the welding operation, current is passed through the substrate metallisation and the device lead. Most of the heat is generated at the interface between the two items, which is exactly where it is needed.
The major advantage of the pulse heating techniques is that a cold substrate may be used, generating only localised heating in the vicinity of the bond itself. The electrodes (or wedge) can be placed on the device lead while the bond area is cold, and can maintain a constant force through the heating and cooling cycle. When continuous heating is used, the bonding tool is heated to 280°C and the work stage to 225°C. The pressure is 0.024 gf per µm² (15 gf per mil²). If a soft substrate is used, there is some danger of breaking a lead by pressing it into the substrate. This can be avoided by using a cold stage and heating the tool to 380°C.
Ministrip. If the ultimate in size reduction is not needed, the ministrip design may be preferable. A chip is soldered to a molybdenum tab and covered with a protective coating. Either one or two leads can be provided. The ministrip may be soldered to the circuit on a hot plate, in a belt furnace, or with a gap welder, or epoxy may be used. Thermocompression bonding is recommended for attaching the leads. This package style is particularly well suited to shunt diodes, but series applications are possible by soldering the ministrip to the conductor on the substrate and bonding the lead across a gap in the conductor.
The microstrip post was developed for PIN switches and phase shifter circuits. The accurate location of the chip centre makes this model useful for phase shift circuits at frequencies as high as 20 GHz. The pedestal may be attached to the substrate with conductive epoxy or low-temperature solder. The temperature must be kept below 280°C (the soldering temperature used to attach the chip to the pedestal). The wires may then be thermocompression bonded to the substrate metallisation pattern.

8.10
Failure mechanisms

Solder interconnects. Some of the interconnects are replaced by chemically bonded material interfaces on the substrate, the so-called film components, such as resistors, conductors, and capacitors. This reduces the module's susceptibility to wiring errors and to damage due to environments (shock, acceleration, and vibration). Localised heating and hot spots within resistive elements are reduced, due to the direct bond between the films and the usually good thermally conductive substrate. This results in very reliable resistive films⁴.

Table 8.9 Comparative λ for various bonding techniques (in %/1000 h) [8.25]

Interconnection                 One lead   14-lead device   150-lead device
Thermocompression wire bonds    0.00013    0.0018           0.02
Ultrasonic wire bonds           0.00007    0.001            0.014
Face bond                       0.00001    0.00014          0.0015
Beam lead                       0.00001    0.00014          0.0015

The major failure mechanisms arise in the add-on components (chip resistors,
chip capacitors, transistors, diodes, ICs and wire bonds) - Table 8.9.
Although a single wire bond is very reliable, there may be more than 200 wire
bonds on a complex hybrid, and they may have a major contribution to the failure
rate.
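The cumulative effect of many individually reliable bonds can be sketched with a simple per-lead summation, using the thermocompression figure from Table 8.9 and the 200-bond count mentioned above:

```python
# Contribution of wire bonds to a complex hybrid's failure rate, using the
# single-lead thermocompression figure of Table 8.9 (0.00013 %/1000 h per bond).

LAMBDA_TC_BOND = 0.00013   # %/1000 h, per single thermocompression wire bond
n_bonds = 200              # a complex hybrid, as mentioned in the text

total = n_bonds * LAMBDA_TC_BOND
print(total)  # ~0.026 %/1000 h for the wire bonds alone
```

Compared with the single-lead figure, the 200-bond total is of the same order as the 14-lead and 150-lead device values in Table 8.9, illustrating why wire bonds can dominate the failure rate of a complex hybrid.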

⁴ However, the films will drift with time (typically 0.25% for thick film and 0.1% for thin film). Such drifts should be allowed for in any worst-case analysis.

References

8.1 Meusel, J. (1979): Hybridschaltungen in Dickschichttechnik. Funkschau nr. 23, p. 1337


8.2 Winiger, F. (1973): Hybridschaltungen in Dickfilmtechnik. Technische Mitteilungen PTT (CH), nr. 2, pp. 68-73
8.3 Lilen, H. (1974): Circuits hybrides à couches minces et à couches épaisses. Editions Radio, Paris
8.4 Deakin, C. G. (1969): A Simple Guide to the General Assessment of MTBF. Microelectronics and Reliability vol. 8, pp. 189-203
8.5 Griessing, J. (1989): Dependence of Properties of Deposited Films on Angular Distribu-
tion of Incident Vapor Beam. Proc. of European Hybrid Microelectronics Conference,
Ghent, pp. 229-240
8.6 Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. VDE-Verlag, Berlin, West Germany
Băjenescu, T. I. (1996): Fiabilitatea componentelor electronice. Editura Tehnica, Bucharest, Romania
8.7 * * * (1978): Critères de qualité des matériaux pour couches épaisses. Toute l'électronique, Juin, pp. 41-45
8.8 Harper, C. A. (1974): Handbook of Thick Film Hybrid Microelectronics. McGraw-Hill,
New York
8.9 Lambert, F. (1973): Les circuits hybrides à couches épaisses. EMI no. 168, pp. 23-29
8.10 Miller, L. F. (1972): Thick Film Technology and Chip Joining. Gordon and Breach, New
York.
8.11 Topfer, M. L. (1971): Thick Film Microelectronics. Van Nostrand, Princeton
8.12 Elcoma Bulletin (1980), p. 3
8.13 Joly, J. (1976): Réalisation de circuits hybrides en technologie CM et CE - étude comparative. Actes du colloque international sur les techniques de fabrication et d'encapsulage des circuits hybrides. Paris, April 7-8, pp. 25-33
8.14 Roggia, D. A. (1978): Hybrid circuits telecommunications. Telettra Review nr. 29, pp.
23-27
8.15 Pay, C. (1974): Zuverlässigkeit von Mikroschaltungen in Dick- und Dünnfilm-Hybridtechnik. Elektronikpraxis no. 5, pp. 91-95
8.16 Platz, E. F. (1968): Reliability of hybrid microelectronics. Proc. of inf. circuit packaging
symposium, San Francisco
8.17 Platz, E. F. (1969): Solid logic technology computer circuits - billion hour reliability data.
Microelectronics and Reliability vol. 8, pp. 55-59
8.18 Hybrid microcircuits reliability data. Pergamon Press, 1976
8.19 Mouret, M. (1976): Bilan et perspectives d'utilisation des circuits hybrides aux PTT.
Actes du colloque international sur les techniques de fabrication et d'encapsulage des cir-
cuits hybrides, Paris, April 7-8, lV.5
8.20 Himmel, R. P.; Pratt, 1. H. (1977): How to improve microcircuit reliability. Circuits
manufacturing (June), pp. 22-32
8.21 Arnbrus, A. (1982): Vorbehandlung von lCs fur Dickschicht-Hybridschaltungen in einer
kompakten Elektronik-Bauweise. SAQ-Fachtagung "Elektronik", Ziirich-Oerlikon, March
26,p.79
8.22 Stein, E.; Kulli, C. (1982): Burn-in von Diinnschicht-Hybridschaltungen. Bulletin SEV I
VSE no. 23, pp. 1224-1229
8.23 Kohl, W. H. (1997): Handbook of Materials and Techniques for Vacuum Devices.
Springer Verlag, Berlin
276 8 Reliability of hybrid integrated circuits

8.24 Proceedings of the Custom Integrated Circuits Conference, Santa Clara, California
(USA), May 11-14, 1998
8.25 Jones, R. D. (1982): Hybrid Circuit Design and Manufacture. M. Dekker, Inc., New York
and Basel
8.26 Caldwell G. L.; Tichnell, G. S. (1977): Guidelines for the custom microelectronics hybrid
use. Quality (February), pp. 16-19; (March) pp. 22-26
8.27 Schauer, P. et al. (1995): Low frequency noise and reliability prediction of thin film
resistors. Proc. of ninth Symposium on Quality and Reliability in Electronics RELEC-
TRONIC '95, October 16-19, Budapest, Hungary, pp. 401--402
8.28 Loupis, M. I.; Avaritsiotis, J. N. (1995): Simulated tests of large samples indicate a loga-
rithmic extreme value distribution in electromigration induced failures of thin-film inter-
connects. Proc. of ninth Symposium on Quality and Reliability in Electronics RELEC-
TRONIC '95, October 16-18, Budapest, Hungary, pp. 353-358
8.29 David, L. et al. (1995): Reliability of multilayer metal-nGaAs interfaces. Proc. of ninth
Symposium on Quality and Reliability in Electronics RELECTRONIC '95, October 16-
18, Budapest, Hungary, pp. 379-384
8.30 Xun, W. et al. (1995): Newly developed passivation of GaAs surfaces and devices. Proc.
of the fourth Internat. Conf. on Solid-State and Integrated-Circuit Technology, Beijing
(China), October 24-28, pp. 501-505
8.31 Hewlett Packard Application Note 974
8.32 Howes, M. J.; Morgan, D. V. (1981): Reliability and Degradation. John Wiley & Sons,
Chichester
8.33 Kadereit, H. G.(1977): Adhesion measurements of metallizations of hybrid microcircuits.
Proc. Eur. Hybrid Microelectronic Conf. (ISHM), Bad Homburg, Germany, Session IX
8.34 Hieber, H. et al. (1977): Ageing tests on gold layers and bonded contacts. Proc. Eur.
Hybrid Microelectronic Conf. (ISHM), Bad Homburg, Germany, Session IX
8.35 Gosele, U. M.; Reiche, M. (1995): Wafer bonding: an overview. Proc. The fourth intern at.
conf. on Solid-State and Integrated-Circuit Technology, Beijing (China), October 24-28,
pp.243-247
9 Reliability of memories and microprocessors

9.1 Introduction

Silicon technology was (and still is) the dominant technology of the semiconductor industry; silicon devices hold more than 95% of the over $140 billion semiconductor business at the present time. Greater integration, higher speed, smarter functions, better reliability, and lower power and cost per silicon chip are the permanent goals in meeting the increasing requirements of information technology. The industry's progress has closely followed two laws. The first is Moore's law, the 1975 observation by Gordon Moore that the complexity of ICs had been growing exponentially, by a factor of two every year. He attributed this to a combination of dimension reduction, die size increase, and an element which he called "circuit and device cleverness" - improved design and circuit techniques which allowed more function per unit area at a given lithography. With a slowing down of the rate of progress to a factor of two every 1.5 years, Moore's law continues to hold well today. The second law is the law of π, a somewhat tongue-in-cheek statement that memory chips, in a given generation, sell for about π dollars when they reach their peak shipping volume, and eventually reach a selling price of π/2 dollars. The law has not really held, though in constant dollars it is not too bad, but the point is that the cost of a chip has only gradually increased from generation to generation, held down by the ability of the industry to yield larger and larger chips while shrinking their devices, on increasingly larger wafers. Device miniaturisation was the main trend (Fig. 9.1), and silicon device technology has followed the scaling-down principles and Moore's law for the last three decades. Over the past 40 years, the semiconductor business has continued to grow at a high rate. Today there are two key technologies which play the role of drivers. At a first stage, bipolar technology contributed to the rapid growth of the semiconductor business. In a second stage, MOS technology greatly improved the performance of logic arrays, memory devices and microprocessors. Both bipolar and MOS technologies are based on silicon and on the pn junction. A newer semiconductor technology, based on GaAs and originally used in microwave and radio-frequency applications because of its low susceptibility to noise, has emerged as a contender for use in advanced devices. GaAs is now regarded as a highly reliable, radiation-resistant, ideal medium for use in ultra-fast switching circuits, wide-bandwidth instrumentation and high-speed computers. Continuous improvements are being made to the manufacturing process, ironing out the problems. Fabrication

techniques are the main area of concern, since mechanical stresses and impurities
introduced at this stage have a considerable influence on device performance.
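The doubling rule discussed above can be put in numbers. In the sketch below, the 1971 starting point and transistor count (the Intel 4004) and the 1.5-year doubling period are illustrative assumptions, not figures from the text:

```python
def moore_projection(n0, year0, year, doubling_years=1.5):
    """Project device count under Moore's law:
    n0 * 2 ** ((year - year0) / doubling_years)."""
    return n0 * 2 ** ((year - year0) / doubling_years)

# Illustrative only: starting from 2300 transistors in 1971 (the Intel 4004)
# and doubling every 1.5 years gives, for 1989:
print(moore_projection(2300, 1971, 1989))  # 9420800.0, i.e. about 9.4 million
```

A 1.5-year doubling period over 18 years gives twelve doublings, or a factor of 4096 - the order of magnitude actually observed between early microprocessors and the chips of the late 1980s.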
The gigabit generation will very likely require a new breakthrough if the trends are to be continued. A few major areas of technology innovation have been the key to meeting the requirements, such as the lithography shrink capability (1.4x each generation), the levels of metallisation and the fundamental limits of device scaling needed to meet performance goals (1.25x at chip level), the high dielectric-constant materials used to meet cell capacitance in cells of reduced area, etc. As device and process technology moves toward the 0.25...0.18 µm design-rule regime, by the year 2000 semiconductor manufacturers will introduce development and production phases at a scale of 1Gb. Projections of 4, 16 and even 64Gb DRAMs are not uncommon, despite the requirement that an extrapolated 16Gb DRAM needs not only ≤0.1 µm lithography, but also the ability to fabricate devices and features at corresponding dimensions. It is obvious that the industry is approaching some limits in its ability to manufacture devices; but limits can be circumvented (optical lithography limits were considered to be around 1 µm in the late 70s, but predictions are 0.1...0.2 µm at present). The National Technology Roadmap for Semiconductors [9.1] confidently predicts continuing exponential progress, with a generation every three years, culminating in the year 2010 with 64Gb DRAMs manufactured on 1400 mm² chips with 0.07 µm lithography, and microprocessors having 800 million transistors on 620 mm² chips. But some important limits [9.2] concern not only the lithography, but also the speed of light, tunneling, device fields, soft errors, power, cell size, fabrication control, etc. The semiconductor industry will continue to progress, since all these limits are more practical than fundamental. However, overcoming the challenges will become increasingly difficult, and the industry will continue to struggle against perhaps the most important limit of all to its growth: costs.

[Fig. 9.1 here: minimum device dimension (µm) versus year, 1970 to 2010, falling from about 10 µm in the realm of classical mechanics, through the realm of quantum mechanics below 0.1 µm, towards molecular dimensions.]

Fig. 9.1 Decrease of device dimensions in the years 1970 to 2010 [9.3]
[Fig. 9.2 here: diagram showing how the decrease of device dimensions and the increase of complexity lead from electronics and optics, through microelectronics/microoptics and nanoelectronics/nanooptics, to molecular electronics/photonics.]

Fig. 9.2 Development of molecular electronics/photonics from conventional electronics and optics [9.3]

In the future, optics, too, will play an essential role. Fig. 9.2 shows that - parallel to the electronic ones - optical devices and components have also become smaller over the years, and could lead to the use of molecules or atoms.
To overcome the technology difficulties and manufacturing costs, new materials and processes as well as cost reduction methods will be introduced, such as high dielectric-constant materials (BST or PZT), ferroelectrics, and new processes like silicon-on-insulator (SOI).
Furthermore, fast new ultra large scale integration (ULSI) testing methods and new yield-enhancing redundancy techniques - resulting in cost reduction - will be increasingly needed to achieve high reliability for ULSI with 10⁹...10¹⁰ devices on a single chip. Sophisticated microprocessors using 0.15 µm MOSFETs could possibly appear at the beginning of the 21st century. Simultaneous achievement of high performance, high packaging density, and high reliability will become increasingly difficult. Therefore, there is an urgent need to reduce the fabrication process costs by developing new approaches such as single-wafer processing [9.4] and tool clustering, and increased automation of process and factory control.
For scaled MOSFETs, hot-carrier effects are still important even for supply voltages below 3 V. ULSIs have always been developed with reliability in mind; for each generation, device/memory structures, fabrication processes and materials have so far been determined by the need to overcome reliability problems: soft-error phenomena in ULSI memories, dielectric breakdown in the insulators, electro- and stress migration in the interconnections, etc. Although this tendency will continue, a new strategy for ULSI technology must be introduced to realise giga-scale and nanometer LSIs.
The trend of the device parameters for each DRAM generation is shown in Fig. 9.3. It should be noted that the downscaling of the capacitor size and of the capacitor dielectric thickness is levelling off due to physical limits, in spite of the still monotonic decrease in cell size. This trend demands complicated, three-dimensional cell structures at least until 256Mb DRAMs, resulting in increased bit-cost (Fig. 9.4).
[Fig. 9.3 here: cell/capacitor size (µm²) and capacitor dielectric thickness (nm) versus DRAM generation, from 16K to 256M bit.]

Fig. 9.3 Trend of DRAM device parameters [9.5]

[Fig. 9.4 here: number of process steps per DRAM generation, broken down into wiring, capacitor, MOSFET, isolation and well steps.]

Fig. 9.4 Increase of process steps due to device complexity [9.5]

A drastic decrease in the number of process steps was found to occur in the case of ferroelectric DRAMs, and it is necessary and urgent to enhance the quality of ferroelectric films up to the desired production level. It should be noted that the dielectric constant decreases with decreasing film thickness, and this physical mechanism is not yet clear. Elucidating this mechanism will also lead to ideal ferroelectric non-volatile DRAMs, making good use of the polarisation of PZT. Recently, flash non-volatile memories have made good progress, with higher speeds than DRAMs (Fig. 9.5), aiming at applications in personal digital assistants (PDAs). In the same way as for DRAMs, a key factor for flash memories is a high-quality oxide/insulator technology permitting, in particular, the 10⁵...10⁶ write/erase cycles required, a condition close to the intrinsic oxide breakdown. Therefore, new robust oxides, such as N₂O oxynitrided oxides, are needed.
An important element for future PDA and multimedia applications is that the flash memory cell can be scaled down more easily than the DRAM cell.
There are three approaches to reducing hot-carrier degradation in scaled MOS devices: (i) hot-carrier-resistant device structures such as the double-diffused drain
(DDD), lightly doped drain (LDD), and gate-drain overlapped device (GOLD) - drain/gate engineering; (ii) reduction of the power supply voltage (1.5...3 V); and (iii) making good use of alternating-current effects, including the duty ratio. The GOLD structure provides a higher hot-carrier breakdown voltage than the LDD structure and, moreover, a higher channel current without severe down-scaling of the gate length.

[Fig. 9.5 here: record density (bit/inch²) versus year, 1980 to 2000, rising from about 10⁵ to 10¹⁰ bit/inch², comparing optical disc, DRAM (up to the 256M generation) and flash memory.]

Fig. 9.5 Record density trend in DRAM and other media [9.5]

The testing-in reliability (TIR) approach became more and more extensive as semiconductor technology developed, device dimensions decreased, circuits became more complex, chips grew in size, and the need to realise products of ever-higher quality increased. The concept of wafer-level reliability (WLR), explored and developed in the 80s, proved to be no panacea. However, this concept led to the development of additional tools which helped to monitor the process changes that might affect reliability, and explored the impact of specific failure mechanisms on the product. These tools had to be used with care because, in extrapolating from stress to use conditions, measurement results could reflect manifestations of a failure mechanism not representative of product-use conditions. Test data from tens of test structures must be extrapolated to circuits with millions of transistors; and test data extending out to no more than the 1% tail of the failure distribution must be extrapolated to orders of magnitude further out in the tail for the projected levels of product reliability [9.6]. Realising the limitations of WLR for predicting product life, the industry has continued to use product life testing, but it now faces a dilemma: product life testing involves ever more aggressive market-entry demands, the measurement of ever-lower failure rates, the need to use ever-larger sample sizes, and fundamental measurement uncertainties.
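The stress-to-use extrapolation mentioned above is commonly based on an Arrhenius acceleration factor. The sketch below is illustrative; the 0.7 eV activation energy and the temperatures are assumed values, not figures from the text:

```python
import math

K_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_af(t_use_c, t_stress_c, ea_ev=0.7):
    """Acceleration factor AF = exp((Ea/k) * (1/T_use - 1/T_stress)),
    temperatures given in degrees Celsius and converted to kelvin."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / K_EV) * (1.0 / t_use - 1.0 / t_stress))

# 1000 h of stress at 125 C stands in for roughly AF * 1000 h at a
# 55 C use condition (AF is about 78 with these assumed values):
print(arrhenius_af(55, 125, 0.7))
```

The strong sensitivity of AF to the assumed activation energy is one reason why such extrapolations must be handled with the care the text calls for.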
A good example of the "paradigm shift" is the movement from simple failure analysis, by sampling the output of a manufacturing line, to the building-in reliability (BIR) approach. This approach was introduced at the International Reliability Physics Symposium (IRPS) in 1990, where Crook [9.9] stated that there would be neither enough test time nor enough test parts to establish with confidence the low failure rates projected for the end of the century. BIR is an ongoing, comprehensive, and integrated approach to building reliability into the product; it

involves a continuing search for those factors that affect reliability and for ways to
improve control over them, making the product more resistant to process and use
variations, and to failure mechanisms, defects, and the degradation of material and
devices. To pursue this technique, greater importance will be attached to a deeper
physical understanding of the significant relationships between the input variables
and product reliability.
Such a comprehensive approach requires that all organisations involved in the
design, development, and manufacture of the product take responsibility for pro-
duct reliability. Reliability is therefore an integrated effort that also includes the
suppliers of the materials and equipment used in the manufacture of the product.
Table 9.1 shows the chronology of the X86 microprocessor implementation since the µP 8080 in 1974. With the increased performance and complexity shown, there has been a reduction in the time between successive generations, from over 50 to 31 months, as a means to increase market share and company profits.

Table 9.1 X86 microprocessor chronology [9.7]

Processor   Clock speed (MHz)   Number of transistors   MIPS   Date of introduction   Delta (months)
8080        2                   6 000                   0.64   April 1974             -
8086        5                   29 000                  0.33   June 1978              50
80286       8                   134 000                 1.2    February 1982          44
80386       16                  275 000                 6      October 1985           44
486DX       25                  1 200 000               20     April 1989             42
486DX2      50                  1 200 000               40     March 1992             35
Pentium     66                  3 100 000               112    May 1993               14
P6          133                 5 500 000               250    December 1995          31
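The Delta column of Table 9.1 can be reproduced by differencing the introduction dates. Since the table gives only month and year, each date is taken as the first of the month; a minimal sketch:

```python
from datetime import date

# Introduction dates from Table 9.1 (month/year only, so day 1 is used)
intro = {
    "8080": date(1974, 4, 1),
    "8086": date(1978, 6, 1),
    "80286": date(1982, 2, 1),
    "80386": date(1985, 10, 1),
}

def months_between(d0, d1):
    """Whole months elapsed between two dates (day of month ignored)."""
    return (d1.year - d0.year) * 12 + (d1.month - d0.month)

print(months_between(intro["8080"], intro["8086"]))    # 50
print(months_between(intro["80286"], intro["80386"]))  # 44
```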

Microprocessors can be considered to have the same failure mechanisms as standard devices manufactured using MOS or bipolar technologies. However, the
size and complexity of the chips introduce additional failure mechanisms. For
instance, larger chip areas mean that the devices are more prone to defects inherent
in the semiconductor materials, and this increases the probability of defects such as
pinholes in the oxides and microcracks in the metallisation. Catastrophic failures
may also occur, due to such problems as oxide rupture, interruption of aluminium
lines and wire bond failures. Additionally, there are failures related to the
interrelationship between software and hardware. Burn-in coupled with a good
functional test programme is generally accepted as the most effective screen.
Devices with surface-related defects should be detected using a static burn-in, whereas a dynamic burn-in should weed out defects in MOS memory cells due to weak oxides. The best screen is considered to be a succession of several burn-in programmes, which may prove to be uneconomical. Current failure rates of microprocessors [9.8] have been found to be extremely low (less than 1 FIT).
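One FIT corresponds to one failure per 10⁹ device-hours, so even a large population accumulates very few expected failures at such rates; a small conversion sketch (the population size and operating time below are illustrative):

```python
FIT = 1e-9  # one FIT = one failure per 10**9 device-hours

def expected_failures(rate_fit, devices, hours):
    """Expected number of failures for a constant failure rate given in FIT."""
    return rate_fit * FIT * devices * hours

# At 1 FIT, one million devices running for a year (8760 h) are expected
# to produce only about 8.76 failures:
print(expected_failures(1, 1_000_000, 8760))
```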

9.2 Process-related reliability aspects

Semiconductor processes must be completely revamped to accommodate the need for submicron features. Novel processes - such as dry etching - will need to be
implemented, to etch anisotropic submicron features onto complex structures such
as refractory metals, silicides and multilevel aluminium alloys. An automatic VLSI/ULSI testing technique is now used, whereby VLSI/ULSI monolithic silicon circuits are repaired by laser action, thereby eliminating the need for redundant functions on the chip [9.27].
In integrated circuit technology, CMOS will become the dominant process, supplanting NMOS and displacing bipolar technology¹. The semiconductor industry has now firmly adopted automation, not only to improve reliability and integrity, but also to reduce human judgement and fatigue. One outstanding development in automation is the surface mounting of "leadless" components on "holeless" printed circuit boards (PCBs).
Electrostatic discharge (ESD) was once a problem only for insulated-gate field
effect transistors, but now all semiconductor devices are vulnerable.
VLSI/ULSI silicon wafers. As silicon monolithic integrated circuits become more dense and silicon wafers become larger in diameter, the requirements on the silicon - mechanical, chemical and crystallographic - become more stringent. The tolerances for flatness and parallelism must be minimal. Cleanliness is essential, including the absence of particles on the backside of the wafer, which can affect front-side flatness when a wafer is on a lithographic exposure system.
The fabrication of VLSI/ULSI devices becomes a real challenge as the circuit density and complexity are continuously increasing. The difficulty is twofold: suitable process windows have to be found, in order to optimise the device yield and performance for a given design, and the process defect density should be kept at a low value, thereby minimising the associated yield loss and reliability hazards.
Although in-line process monitoring allows the detection of most yield detractors and - to a lesser extent - of potential reliability problems, the detrimental impact of manufacturing defects is still observed later in the device life: at the wafer final test (yield loss), during burn-in (infant mortality), in service conditions or during accelerated stress tests (long-term reliability breakdown).
Silicon wafer vendors are now being urged to preoxidise wafers, since wafers are cleanest after polishing and final cleaning. The oxidation of the silicon keeps the slices clean because the hydrophilic oxide surfaces do not attract foreign particles electrostatically. Perhaps the greatest attention should be paid at the microscopic level. It is worth noting that United States-produced silicon wafers degrade faster than non-United States silicon wafers². Minority carrier

1 Previously, bipolar technology reliability could not - in any way - be challenged. This shift towards MOS happened because the sodium contamination in the Si-SiO₂ system was controlled and removed.

2 The conclusions are based on minority carrier lifetime evaluations of 100 mm diameter Czochralski wafers commonly available for industrial production.

lifetime values are related to silicon defects. A study involved sampling silicon wafers throughout the world and comparing them with silicon wafers available in most European countries. Simple MOS capacitance and minority carrier lifetime mapping were used for the evaluation.
Computer simulation has shown that modelling with existing silicon material and wafer processing techniques does not yield the degree of confidence necessary for submicron technology. This limitation is due to the defects in silicon, and improved silicon wafers are needed to sustain submicron technology³.
Fifteen to thirty samples from three different wafer lots were normally considered for each type of device. The sample preparation procedure mainly consists of:
• chip step-by-step removal through wet chemical or plasma etching;
• chip cross-sectioning delineation techniques;
• traditional chemical delineation techniques.
The examination techniques are optical and scanning electron microscopy and - when needed - electron microprobe, Auger electron spectroscopy, and secondary ion mass spectrometry. Conventional life tests are followed by failure analysis. Burn-in failures, incoming inspection rejects, as well as field returns are also analysed.
Process-related defects
• Manufacturing defects are often related to former process steps; e.g. the passivation integrity strongly depends on the underlying metal quality; the aluminium step-coverage is related to the PSG softening, to the contact opening profile, and to the mask alignment accuracy.
• As expected, newly implemented fabrication techniques may prove to be troublesome, but - surprisingly enough - conventional and presumably well-mastered operations still remain critical.
Passivation layer
Passivation defects are seldom yield problems, but they are one of the major causes of aluminium corrosion in plastic-encapsulated devices exposed to a humid atmosphere.
The intrinsic quality of the cover layer is obviously of prime importance. Phosphorus-doped oxides proved to be poor protection against moisture: if the phosphorus content is large enough, they are generally crack-free, but the corrosion process is nevertheless observed to occur, owing to the chemical transformation of phosphorus pentoxide into phosphoric acid in the presence of humidity. On the other hand, plasma nitrides demonstrated a much better efficiency when defect-free; the basic problem is the cracking resistance, which is critically related to their density and internal stress. So-called compressive nitrides are in fact observed to be locally tensile (i.e. brittle) at topographical steps.

3 Previously, process techniques such as annealing, diffusion and back-surface processing always reduced and kept defects at a tolerable level.

But the passivation degradation is more frequently induced by extrinsic strains. Chip dicing, plastic moulding and handling damage appear to be the main assembly-related causes of integrity loss. The interaction of the passivation layer with the metal underneath often leads to crack formation: aluminium hillock growth generates mechanical stresses beyond the layer strength. Two operations can be suspected: the metal annealing and/or the final cover layer deposition.
Aluminium alloys. Apart from corrosion, the major metallisation failure mode is electromigration. In this respect, the line cross-section and the metal microstructure are the fundamental physical parameters to account for, but to different extents. All the theoretical models underline the pre-eminence of the current density over the metal self-diffusion.
Phosphosilicate glass (PSG) and low-temperature oxide (LTO). PSG and LTO are currently used as intermediate dielectric layers between polysilicon and aluminium in MOS devices. Although the key process parameters have been known for a long time, this process operation is still considered among the most critical. The reason for this is the difficulty of finding a good compromise between conflicting requirements.
PSG/LTO reworks should be strictly prohibited, as they are observed to result in voids (mouse-holes) at the edges and consequently in threshold voltage drifts. The deglaze operation which may follow the source/drain diffusion is also a potential cause of mouse-hole formation.
Oxides. Oxide breakdown can result from a wide range of causes. Pinholes are a universal yield detractor, and their reliability impact is easily evidenced by conventional device stress testing and/or by specific test-chip stressing.
Local gate oxide thinning near the recessed oxide bird's beak is also known as a major problem.
Gate oxide breakdowns may also originate from local parasitic field oxidations. When the nitride mask used for the selective recessed oxide growth has a large pinhole density, the thick oxide can form in undesired areas such as the future gate location. This is easily observable on the finished device after a complete chip delayering: the holes in the silicon crystal indicate an anomalous growth of the field oxide. The major impact of the parasitic oxidation is a yield loss, but associated reliability failures are brought out in conventional dynamic life tests as well as in test-transistor stressing.
The thin interlevel oxide growth between poly 1 and poly 2 is intrinsically a sensitive operation. The big difference with other thermal oxides is that it is grown from a polycrystalline material. As the oxidation rate varies as a function of the crystal orientation, the dielectric thickness uniformity is rather poor.
Photolithography. Photolithography defects are a major source of product degradation, as VLSI manufacturing may involve up to fifteen masking operations. Mask misalignments should normally be caught by the post-development controls, since the resist patterns may be redone when needed. However, control escapes or marginal etching specifications can lead to alignment misfits. As a result, the PSG between aluminium and polysilicon is totally eliminated on one side of the hole during the contact etching step. The remaining thermal oxide has a poor integrity, since it was in direct contact with the buffered HF reagent. The associated yield loss is about 20%, and life test failures are caused by aluminium/polysilicon leakages or shorts produced by the thermal oxide breakdown.

Dry etching to replace liquid chemical etching. In this decade MOS devices will
have more than 10 million components on one single chip. Design rules will
demand submicron features, and new and improved technologies will be necessary
for generating and delineating the required patterns. Traditional lithographic
processes will be pushed to their limits. Wet chemical etching will have to be
complemented or replaced by dry etching.
All three types of plasma-based dry etching meet the requirements of submicron features. The choice depends on the applications of the process and on the interactions with the materials being processed. Plasma etching (PE) occurs predominantly by a chemical reaction, with little directionality, between a reactive gas in the plasma and the substrate. Reactive ion etching (RIE) adds a sputtering action with higher directionality to the chemical reaction. Reactive ion beam milling (RIBM) also combines physical and chemical action, with even higher directionality.
RIE and RIBM will become the major dry processing techniques for the rest of the decade because of their characteristics and their etching compatibility with tantalum over polysilicon for feature sizes in the one-micron range. The best technique for high-quality etching seems to be RIE. The process consists of a chemical reaction enhanced by ions bombarding the silicon wafer. The ions remove non-volatile etching inhibitors from the wafer surface and in that way permit etching to continue anisotropically.
Automation. Automatic wafer processing, inspection and final test continue to be the watchword and key to the semiconductor future. There is little doubt that automation of wafer fabrication will improve the reliability of the devices, but the automatic processing and testing will have to be economical for its total implementation. In wafer process automation, critical processing steps such as wafer handling, lithography and dry etching are all being done by machines. Test and inspection will cover not only highly automated systems but also their use in repairing processed wafers by laser adjustment, and automated radiation testing of ICs at the wafer stage.
Automatic surface mounting of components directly on printed circuit boards is now gathering momentum.
Yield and reliability. It is usual to consider that "the higher the yield, the better the reliability". Of course, it all depends on the nature of the main yield detractors. Design-related yield losses are not easily correlated with reliability failures. However, when manufacturing defects are involved, it can make sense to look for such correlations. In a restricted number of cases, no correlation at all exists between yield and reliability. The most typical example regards the final passivation layer quality: large defect densities do not impact the chip performance, but they definitely promote humid-test failures. The basic reason why no correlation can be made between time-zero and long-term failures is that the passivation layer does not play any active role in the chip functionality.
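Where such a yield-reliability correlation is sought, defect-related yield is often described by a simple Poisson defect-density model. This model is not taken from the text, and the chip areas and defect density below are illustrative assumptions:

```python
import math

def poisson_yield(area_cm2, defect_density):
    """Poisson yield model: Y = exp(-A * D), with D in defects/cm^2."""
    return math.exp(-area_cm2 * defect_density)

# A 1 cm^2 chip at 0.5 defects/cm^2 yields about 61%; halving the chip
# area at the same defect density lifts the yield to about 78%:
print(poisson_yield(1.0, 0.5), poisson_yield(0.5, 0.5))
```

The model makes the text's caveat concrete: it links yield only to defects that are electrically active at time zero, so passivation defects, which do not affect chip functionality, fall outside it.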
Some DRAM test results. Dynamic life-testing of 300 1M×1 DRAM devices yielded only 6 (six) electromigration failures after 1000 hours. Contact mask misalignment: 5 (five) failed parts out of 300 after 1000 hours of life testing.
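A rough point estimate of the failure rate implied by such a life test divides the observed failures by the accumulated device-hours (a simplification that ignores the slight reduction in device-hours from the failed parts):

```python
def failure_rate_fit(failures, devices, hours):
    """Point estimate lambda = r / (N * t), converted to FIT (1 FIT = 1e-9 / h)."""
    return failures / (devices * hours) * 1e9

# 6 electromigration failures among 300 parts after 1000 h:
print(failure_rate_fit(6, 300, 1000))  # ≈ 20 000 FIT, i.e. 2e-5 failures/h
```

The large value reflects the accelerated stress conditions of the life test; an acceleration factor would be applied before quoting a use-condition figure.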
Surface Mounting Technology (SMT). Solid-state equipment manufacturers continually seek increased packaging density, packing more functions into an enclosure of given size while still maintaining the same functional capability. Advances in integrated circuit design and fabrication will result in little practical benefit unless

accompanied by equally significant improvements in assembly-packaging density. For the past 30 years integrated circuit (IC) densities have been improving by several orders of magnitude, while printed circuit board (PCB) densities have only improved by one order of magnitude. That is to say, IC chip package miniaturisation and PCB technology have not kept pace with silicon wafer fabrication technology.
SMT components on PCBs are expected to have the greatest impact on electronic
packaging. SMT emerged and evolved from thin- and thick-film hybrid circuits;
it is a new electronic assembly technology, whereby dual-in-line (DIP) packages
are replaced with surface-mounted packages or leadless chip carriers (LCC)⁵.
The solid state assembly approach reduces manufacturing cost by approximately
50%, reduces the printed circuit board area by as much as 58% for the same
density, and improves reliability. It is estimated that the component density will
approach that of thin- and thick-film hybrids.
SMT is catching on; its popularity can be attributed to the short leads or pads,
which enhance electrical performance. Almost every type of new IC component can
be obtained from the semiconductor manufacturer in LCC form.
There are many advantages of SMT, and some of them are not obvious:
1) LCCs offer a reduction in area, volume and weight⁶ (all of them desirable in
aircraft and satellite applications). The LCC's lower mass makes it more durable
and better able to withstand shock and vibration; LCCs are superior to DIPs and
flatpacks in this regard.
2) The short leads or pads have lower resistance, inductance and capacitance
resulting in improved circuit performance, higher frequency response and less
noise.
3) The LCCs, or packaged IC chips, can be electrically pre-screened, AC tested,
temperature tested and burned-in before assembly. Because of their durability,
LCCs can be reworked or replaced.
4) The vapour-phase reflow soldering used in this technology offers greater control
of the high temperatures required to melt the solder, and the assembled PCB need
not pass over heated coils or through a molten solder wave, as is required for DIP
assembly.
5) LCC automatic assembly is greatly facilitated by automatic placement
equipment which accurately places the LCC components on the mounting pads.

4 Look for automatic surface mounting of integrated circuits and other components on printed
circuit boards to greatly improve reliability, cut costs, and increase packaging density.
5 The main advantage of LCC over DIP is that no PCB holes are required for the assembly;
rather, the LCCs are soldered directly to the PCB solder pads. SMT eliminates the need to drill
holes in the PCB and saves valuable surface area, so LCCs are much smaller than DIPs.
6 a) An LCC has a reduction in area of 3 to 1 over a DIP; b) a reduction in volume of 8 to 1 over
a DIP; and c) a reduction in weight of 20 to 1 over a DIP.
7 The only problem with LCC was the well publicised solder-joint cracking problem caused by
the thermal expansion difference between the PCB and the LCC; this problem can be solved by
the proper selection of the PCB material.

All this contributes to cost reduction, consistent product quality and high
reliability⁷.
Gallium arsenide technology. GaAs technologists have made steady progress in
the development of GaAs materials, processes and packaging technologies. Their
commitment to move GaAs from the laboratory to the production line has made
GaAs a powerful and proven technology in microelectronics. The new GaAs
technology has recently excelled and made advances in integrated circuits.
GaAs ultra-high-speed ICs provide the ultimate in speed for supercomputers and
other high-speed signal processing applications which require clock rates in the
gigahertz region and above. Clock rates 2 to 5 times greater than those available
with the fastest silicon technology provide advantages in processing speed,
increased throughput capability and reduced system complexity⁸. The relative
fragility of GaAs, and possible thermal instability at high temperatures, together
with the lack of a native oxide, impose constraints on the allowable processing
techniques.
Future trends. For many years to come, silicon wafers will remain the
prime candidate for many electronic devices. However, for certain applications -
such as ultra-high-speed computer elements and communications - other basic
materials are being considered. For example, a GaAs universal shift register and a
binary counter already operate five times faster than the silicon integrated circuits
available today. The market for GaAs digital integrated circuits is now poised for
explosive growth; the standard-product GaAs market reached $5 billion in 1998.
Advances in material technology and fabrication techniques - such as ion
implantation, ion milling, direct-step-on-wafer photolithography and dry plasma
etching - have made the commercialisation of GaAs possible.
Surface-mount packages will dominate IC packaging. Their popularity is
attributed to ease of assembly and short leads or pads which enhance both speed
and performance. This package miniaturisation is much overdue. The encapsulation
of large-area dies in thin surface mount plastic packages results in much higher
compressive and shear stresses at internal interfaces than experienced previously in
conventional DIPs.

9.3
Possible memory classifications

Table 9.2 shows various types of semiconductor memories. In general, classi-
fications can be made as follows, depending on the:
• Form of signal: a) analogue; b) digital.
• Access type: a) stochastic; b) sequential; c) semi-sequential.
• Cell type: a) static; b) dynamic.

8 As an added benefit, the extended useful operating temperature range and radiation hardness of
GaAs ICs open new applications that are not possible with silicon technology. Consequently,
high-speed GaAs 16K SRAMs have an access time of 2 ns and integrate more than 10 FETs on
a 7.2 × 6.2 mm chip.

• Base material: a) paper; b) magnetic material; c) semiconductor; d) passive
component.
• Application type: a) read/write memory; b) permanent memory.
• Technology: a) bipolar (standard, TTL, Schottky TTL, ECL, I²L, etc.);
b) MOS (PMOS, NMOS, CMOS, HMOS, VMOS, CCD, etc.); c) SOS; d)
SOI.
• Memory cycle: a) erasing; b) non-erasing.
• Interchangeability of support material: a) stable; b) changeable.

Table 9.2 Some semiconductor memory types

Classification   Memory type                          Erasing        Programming    Content on power
                                                                                    interruption
Read/write       RAM (random access memory)           Electrically   Electrically   Volatile
memories         SAM (serial access memory)

Write            ROM (read-only memory)               Not possible   Masking by     Non-volatile
memories                                                             manufacturer
                 PROM (programmable ROM)                             Electrically

Read-mostly      EPROM (erasable PROM)                UV light       Electrically   Non-volatile
memories         REPROM (reprogrammable ROM)
                 EEROM (electrically erasable ROM)    Electrically
                 EAROM (electrically alterable ROM)

Fig. 9.6 presents another possible classification of semiconductor memories.

[Figure: tree diagram classifying semiconductor memories by functional mode,
addressing and realised function, with branches such as register, RAM, content
addressable memory (CAM) and product family]

Fig. 9.6 Another possible classification of semiconductor memories. (PLA: programmable logic
array)

9.4
Silicon On Insulator (SOI) technologies

For more than 30 years, SOI technologies have primarily been dedicated to radia-
tion-hard applications. At present, a very serious opportunity is offered to SOI by
the aggressive development of ultradense, deep-submicron CMOS circuits operat-
ing at low voltage. The subsisting obstacle is the credibility of SOI when competing
with bulk Si, which is still extremely efficient. In SOI, the upper silicon layer, i.e.
the active device region, is fully isolated from the inactive substrate by a buried⁹
oxide. This configuration results in outstanding merits and substantial theoretical
advantages [9.10] over bulk silicon: improved speed and current driveability,
higher integration density, attenuated short-channel effects, lower power consump-
tion, and elimination of substrate-related parasitic effects. Another strong argument
in favour of SOI technology would be the inexpensive control of the interface deg-
radation induced by hot-carrier injection. This is indeed a key challenge for any
technology, including bulk silicon.
The most successful and mature SOI material so far is SIMOX, formed by deep
oxygen implantation into silicon and subsequent high-temperature annealing.
Commercially available wafers are synthesised with a 1.8 × 10¹⁸ O⁺/cm² dose and
160...200 keV energy, followed by annealing at 1320°C. The thicknesses of the
silicon overlay and buried oxide are about 200 nm and 380 nm, respectively. The Si
film is a wafer-scale monocrystal with low residual doping (< 5 × 10¹⁵ cm⁻³), high
carrier mobility, and excellent in-depth homogeneity. Subsisting defects are
dislocations and stacking faults (10²...10⁶ cm⁻²).
The reliability of CMOS circuits depends on the capability of individual
transistors to withstand ageing effects. In short-channel transistors, carriers gain
enough energy to become hot, but the hot-carrier immunity does not appear to be a
critical limit for the operation of fully-depleted or partially-depleted SOl circuits.
Nevertheless, the hot-carrier-induced ageing of SIMOX transistors is a challenging
problem because not only the front gate oxide, but also the buried oxide may be
damaged. The gate-induced drain leakage (GIDL) current - measured at large VD
by scanning the gate bias from depletion to strong accumulation - is very sensitive
to ageing. Increasing the drain voltage or reducing the channel length leads to more
pronounced ageing, whereas the extension of the defective region depends
essentially on gate bias. Other sensitive monitors are: charge-pumping¹⁰ current,
low-frequency noise, photoluminescence, etc.

9 The buried oxide differs from a thermal oxide: it is Si-rich, which implies a high density of
electron traps and E' centres (acting as hole traps). The trap density is definitely larger than that
of the thermal oxide interface at the gate, but small enough not to adversely affect the circuit
performance. The buried oxide being more subject to degradation than the gate oxide, its defects
may jeopardise, via coupling effects, the performance of CMOS circuits.
10 Charge pumping reveals a more intensive build-up of interface traps due to the simultaneous
presence of electrons and holes [9.11].

9.4.1
Silicon on sapphire (50S) technology

For years, in many research laboratories - such as General Electric, Hewlett-
Packard, Hughes, RCA, Rockwell - important work has been done to replace the
bulk silicon substrate with another material with better properties: silicon on sap-
phire. Its most important advantages are:
• absence of field-inversion problems and of parasitic circuit elements;
• better reliability;
• smaller dissipated power, especially smaller static power;
• simple failure causes;
• complementary SOS ICs are very resistant to perturbations, and the working
voltage can vary within wide limits.

9.5
Failure frequency of small geometry memories

Soft errors induced by alpha particles can be a reliability concern for microelec-
tronics, especially DRAMs packaged in ceramic. For example, in n-channel MOS
memories, the charge carriers are electrons, and the capacitors are potential wells in
the p-type silicon. Alpha particles emitted from trace levels of uranium (U-238) and
thorium (Th-232) in the packaging materials can penetrate the surface of the semicon-
ductor die. As the alpha particle passes through the semiconductor device, electrons
are dislodged from the crystal lattice along the track of the alpha particle [9.12]. If
the total number of generated electrons collected by an empty storage well exceeds
the number of electrons that differentiates a 1 from a 0, the collected electron
charge can flip a 1 to a 0 (Fig. 9.7), generating a soft error in the memory device.

[Figure: two storage wells - a full well read as "1" and, after an alpha-particle
strike, a discharged well read as "0"]

Fig. 9.7 Illustration of a soft error

Randomly occurring as single-bit errors in semiconductor memories, soft
errors are (i) reversible and not associated with any permanent damage to the
device; and (ii) completely removed on the next write cycle - the affected bits are
no more susceptible to failure than any other bit in the device.
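The flip condition described above can be sketched numerically. The following is an illustrative model, not from the text: a cell upsets when the charge collected from the particle track exceeds the critical charge Qcrit = C·V. The pair-creation energy for silicon is a standard constant; the cell capacitance, signal voltage and collection efficiency used below are assumptions.

```python
# Energy needed to create one electron-hole pair in silicon (~3.6 eV),
# and the electron charge in coulombs.
E_PAIR_EV = 3.6
Q_ELECTRON = 1.602e-19

def collected_charge(deposited_energy_mev: float, collection_eff: float) -> float:
    """Charge (in coulombs) collected by an empty storage well."""
    pairs = deposited_energy_mev * 1e6 / E_PAIR_EV
    return pairs * Q_ELECTRON * collection_eff

def cell_flips(c_storage_f: float, v_signal: float,
               deposited_energy_mev: float, collection_eff: float = 0.5) -> bool:
    """True when the collected charge exceeds Qcrit = C * V."""
    q_crit = c_storage_f * v_signal
    return collected_charge(deposited_energy_mev, collection_eff) > q_crit

# Illustration: a 30 fF cell with a 2.5 V signal has Qcrit = 75 fC
# (~470 000 electrons); a 4 MeV alpha generates ~1.1e6 pairs.
flips = cell_flips(30e-15, 2.5, 4.0)
```

The sketch shows why scaling hurts: shrinking C or V lowers Qcrit, so a smaller fraction of an alpha track is enough to flip the cell.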
The trend toward higher packaging density on the chip, the scaling-down of the
geometry and the trend toward lower power supply voltages further increase the
susceptibility of DRAMs to soft errors. Several process changes and novel memory-
cell structures have been implemented to minimise carrier collection and soft errors
in memories [9.13]. While the primary source of alpha particles in plastic

encapsulated microcircuits is the filler material, and the package lid is the primary
source in hermetic packages, alpha-particle contamination from the mineral acids
used in wafer processing must not be ignored. The susceptibility of DRAMs to soft
errors is typically measured by accelerated tests or real-time soft-error rate tests.
Knowledge of the factors which lead to soft errors can be used to improve
reliability in DRAM by using a physics-of-failure approach to monitor variables in
the manufacturing. A trend toward decreasing alpha particle emission rates in fused
silica fillers used in the encapsulants for plastic encapsulated microcircuits was
observed [9.14][9.15][9.17].
Since 1985 there has been a shift in the semiconductor reliability community from
reliability prediction based on accelerated-test measurements of the final product
towards designed-in reliability. The use of accelerated testing and real-time testing
[9.16] to monitor soft-error rates during manufacturing of a 256K DRAM has been
reported. A direct correlation was found between a batch of phosphoric acid
containing high levels¹¹ of Po-210 and an increase in soft-error rate. The alpha-
particle emission rate of the hot phosphoric acid batch was 30 to 80 times higher
than that of the other materials.

9.6
Causes of hardware failures

These causes have varied over the last 20 ... 30 years [9.18]; generic causes for
failures have been associated with the following:
• part or (active and passive) device failures;
• interconnect failures;
• electrical and mechanical system design;
• excessive environment stresses (mechanical, moisture, chemicals, temperature);
• user handling;
• can not duplicate (CND) or retest OK;
• miscellaneous.
Table 9.3 presents a Pareto ranking of device failure data in which some 22% of
failures fall into the CND and not verifiable category [9.19]. Often these failures
are considered as apparent or virtual failures, and only the remainder of failures are
perceived as actual or hard failures. In some cases, a failure is acknowledged but
the failure cause cannot be attributed to a specific failure site, failure mode, or
failure mechanism.
One reason for the increasing trend of non-attributable failures may be due to the
higher level of complexity, so that failure analysis methods or techniques are
incapable of isolating the associated defects. Traditional techniques of failure
analysis need to be further developed or radically changed to address new cate-
gories of failures. Fig. 9.8 [9.20] indicates that the average quality of JAN, MIL-

11 The Po-210 level in the hot phosphoric acid was 50...100 pCi/litre, which was 10 to 20 times
higher than in the other lots tested. A quality-control procedure has been established to monitor the
alpha-particle emission rate of incoming phosphoric acid batches [9.16].

STD-883C-qualified, and military ICs qualified with source-control drawings has
improved from 200 defective chips per million (in 1987) to 40 defects per million
(in 1991). This study covers electrical defect, mean defect density, and
hermeticity defect data for digital MOS and linear and digital bipolar technologies.

Table 9.3 Pareto ranking of failure causes in 3400 failed VLSI devices⁺ (fd) [9.19]

Failure causes                              % of fd   Failure causes                   % of fd
Electrical overstress & ESD                 10.9      Chip damage/cracks/scratches     2.4*
Unresolved                                  15.9      Misprogrammed                    2.0
Gold ball bond (bb) fail at bb              9.0*      Oxide instability                1.9
Not verified                                6.0       Design of chip                   1.7
Gold ball bond at stitch bond               4.6*      Diffusion defect                 1.5
Shear stress-chip surface                   3.5*      Final test escape                1.4
Corrosion-chip metallisation/assembly       3.2*      Contact failure                  1.2
Dielectric fail, poly-metal, metal-metal    3.0       Bond failure, nongold            1.2*
Oxide defect                                2.9       Protective coating defect        0.9
Visible contamination                       2.7       Assembly-other                   0.9*
Metal short/open                            2.6*      Polysilicon/silicide             0.8
Latch-up                                    2.4       External contamination           0.7*
Microprocessed-wafer fabr.-related          2.4       Others                           5.3

⁺ VLSI-class devices were from multiple sources like manufacturing fallout, qualifications, reliability
monitors, and customer returns. ESD: electrostatic discharge; * = possible packaging/assembly-
related failures.
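As a sketch of how such a ranking is processed, the snippet below sorts a few of the table's top entries and accumulates their shares; summing the two "apparent failure" buckets (Unresolved and Not verified) reproduces the roughly 22% CND figure quoted in the text.

```python
# Top-ranked failure causes from Table 9.3 (values in % of failed devices).
causes = [
    ("Unresolved", 15.9),
    ("Electrical overstress & ESD", 10.9),
    ("Gold ball bond fail at ball bond", 9.0),
    ("Not verified", 6.0),
    ("Gold ball bond at stitch bond", 4.6),
]
causes.sort(key=lambda kv: kv[1], reverse=True)

# Build the cumulative Pareto column.
cumulative = []
total = 0.0
for name, pct in causes:
    total += pct
    cumulative.append((name, pct, round(total, 1)))

# 'Unresolved' + 'Not verified' = 21.9%, i.e. the ~22% CND category.
cnd = 15.9 + 6.0
```

A Pareto view like this is what justifies concentrating failure-analysis effort on the few dominant causes first.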

[Figure: quarterly plot, 1986-1992, of defect levels in parts per million,
declining overall from about 200 ppm to about 40 ppm]

The transmission gate is an important special case in logic applications; it is
used bidirectionally, as the source and the drain are switched alternately into the
high state. As a consequence, degradation occurs at both source and drain.
Moreover, the degradation is essentially symmetrical [9.22], leading to
approximately equal degradation of linear- and saturation-mode parameters.

Table 9.4 Historical perspective of the dominant causes of failures in devices [9.18]

Detail                                         Year     Major causes of failures
Failure analysis for failure rate predictions  1983     Metallisation (52.8%); oxide/dielectric (16.7%)
Westinghouse memos [9.21]                      1984/87  El. overstress (40%); unknown causes (25%)
Delco data [9.18]                              1988     Wire bonds (41%); others (25%)
Failure analysis on CMOS [9.18]                1990     Unknown (47%); package defects (22%)
Texas Instruments study [9.19]                 1991     Cannot verify (22%); el. overstress/ESD (20%)

The case of the bidirectionally stressed p-MOSFET is more complicated; p-
channel degradation shows very pronounced negative-charge trapping, which
reduces the electric field. The degradation is more severe than after the first
normal-mode stress because it is performed on a device with a reduced channel
length.

9.6.1
Read only memories (ROMs)

The floating-gate avalanche-injection metal oxide semiconductor (FAMOS) tech-
nology (n-channel and p-channel) is used to construct erasable programmable
ROMs (EPROMs). The dominant failure mechanisms described by Woods and
Rosenberg [9.25] are listed in Table 9.5.
From production-line tests, the principal failure mechanisms - in order of
importance - are (i) charge loss; (ii) oxide hopping conduction; and (iii) hot-
electron injection. Failures in the field may not be caused by these three
mechanisms, but the principal reliability problem with EPROMs is data retention.
The integrity of the oxide is therefore very important for reliable operation. Charge
loss does occur as a result of extrinsic effects; e.g. positive ions diffusing into a
memory array can alter the programming of an EPROM. Contamination results in
charge loss as a consequence of oxide conduction¹². Mielke [9.26] has demonstrated
that interpoly oxide defects cause charge loss. Hot-electron injection is the technique
by which cell programming is achieved in an EPROM; a small percentage of these
electrons may be trapped within the thin oxide gate, and repeated programming and
erase cycles may result in a build-up of charge until device failure occurs. Hermann
and Schenk [9.32] conjecture that charge loss is due to leakage of electrons and
propose a multiphonon-assisted tunnelling mechanism whereby electrons stored on
the floating gate tunnel to oxide traps and are then emitted into the nitride. The coupling of

12 The resultant failure must not be confused with intrinsic charge loss associated with the
detrapping of electrons on the floating gate [9.8].

the trap level to oxide phonons results in virtual energy levels in the oxide which
allow for more effective transition paths. As a consequence of the electron-phonon
coupling, the emission occurs close to the oxide conduction-band edge at
temperatures between 250 and 350°C, producing a strong temperature dependence
of the mechanism.

Table 9.5 EPROM failure mechanisms [9.25]

Mode                      Lifetime region       Thermal activation   Primary detection method
                          affected              energy (eV)
Slow trapping             wear-out              1.0                  high-temperature bias (HTB)
Surface charge            wear-out              0.5-1.0              HTB
Contamination             infant/wear-out       1.0-1.4              HTB
Polarisation              wear-out              1.0                  HTB
Electromigration          wear-out              1.0                  high-temperature operating life
Microcracks               random                -                    temperature cycling
Contacts                  wear-out/infant       -                    high-temperature operating life
Silicon defects           infant/random         0.3                  high-temperature operating life
Oxide breakdown/leakage   infant/random         0.3                  high-temperature operating life
Hot electron injection    wear-out              -                    high-temperature operating life
Fabrication defects       infant                -                    burn-in
Charge loss               infant/random/        1.4                  high-temperature storage
                          wear-out
Oxide hopping conduction  infant/random         0.6                  high-temperature storage/burn-in

Typical screening tests used to eliminate defective EPROMs are (i) burn-in; (ii)
high-temperature reverse bias (HTRB); (iii) high-temperature storage; and (iv) low-
temperature dynamic life test.
The main failure mechanisms affecting electrically erasable/programmable read-
only memories (E²PROMs) are intrinsic charge trapping and defect-related charge loss¹³.
Failure rates for EPROMs and E²PROMs are almost the same up to 10 000 cycles
at 250°C, with an activation energy of 0.6 eV. For this type of failure it is relatively
simple to devise screens on a production basis (similar to those used on EPROMs),
since the mechanism is temperature-activated.
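Because the mechanism is temperature-activated, screen effectiveness is normally estimated with the Arrhenius model. A minimal sketch, using the 0.6 eV activation energy quoted above; the 55°C use temperature is an illustrative assumption, not from the text.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration factor of a stress temperature over a use temperature:
    AF = exp[(Ea/k) * (1/T_use - 1/T_stress)], temperatures in kelvin."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t_use - 1.0 / t_stress))

# With Ea = 0.6 eV, a 250 C screen runs a few thousand times faster
# than 55 C field operation.
af = arrhenius_af(0.6, 55.0, 250.0)
```

This is why a short high-temperature cycling screen can expose a charge-loss population that would take years to show up in the field.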
Electrically alterable ROMs (EAROMs) are manufactured using MNOS
technology (a technology similar to NMOS, but with a modified gate insulating
layer). The device performance is affected by the degradation of the SiO₂ during
erase/write cycling. Changes in surface states and, consequently, alterations of VT

13 Its major cause is oxide breakdown; two types of breakdown have been identified: (a) tunnel
oxide breakdown (accounting for some 87.5%), without temperature acceleration, and (b) oxide
breakdown in the row-select circuitry (some 10%).

and deterioration of charge mobility were noticed; hence the possibility of charge
loss by direct tunnelling from traps within the oxide to the silicon is increased. The
activation energy of this mechanism varies with the number of erase/write cycles,
decreasing from 0.65 eV at 10 000 cycles to 0.5 eV, and then to 0.25 eV at 10 cycles.
The retention time is logarithmically dependent on temperature [9.8]. Read cycling
is temperature dependent, but in no way does it influence retention time. The
mechanisms of charge loss are similar to those observed in EPROMs and
E²PROMs; therefore the screening processes used for EPROMs are found to be
effective. ESD can be disastrous for the three ROM types discussed above, since their
thin gate oxides are highly susceptible to breakdown as a result of static
potentials. Therefore, when handling in the field, all precautions must be taken.

9.6.2
Small geometry devices

As geometries get smaller, more devices can be built on the same area of silicon,
thereby reducing the cost of each individual circuit. The major cause of VMOS
failures was found to be ionic contamination (accounting for over 75% of the failed
devices). Proper process controls and screens result in a marked improvement in
device reliability, but ESD protection in VMOS devices cannot be easily accom-
plished using conventional electrostatic protection circuitry.
The major cause of HMOS device failure (an infant mortality condition) seems
to be ionic contamination through defective passivation layers; it can be
screened using either a high-temperature life test or a storage bake. Accelerated
tests show that thinner gate oxides do not automatically result in more failures,
because the provided screening removes devices with hazardous latent defects. Hot
electrons are a problem due to the high electric fields in such scaled devices; the high
E-fields cause impact ionisation, and the generation of hole-electron pairs within the
conduction channel degrades performance. Accelerated tests at low tempera-
tures (-10°C to -70°C) can detect defective devices. The acceptable soft error rate
(SER) level due to alpha radiation, as specified by Intel, is 0.1%/1000 h, or 1000 FIT.
The smaller the device, the higher the defect density. Device complexity has been
found to increase the defect density non-linearly.
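The two SER figures quoted above are the same limit in different units; the one-line conversion below (a sketch, not from the text) makes the equivalence explicit.

```python
def percent_per_1000h_to_fit(pct: float) -> float:
    """Convert a failure rate given in %/1000 h to FIT.

    1 FIT = 1 failure per 1e9 device-hours, so 0.1%/1000 h
    = 1e-6 failures per device-hour = 1000 FIT.
    """
    failures_per_hour = (pct / 100.0) / 1000.0
    return failures_per_hour * 1e9

fit = percent_per_1000h_to_fit(0.1)   # the Intel SER limit -> 1000 FIT
```

Keeping both units straight matters in practice, since vendor specifications mix them freely.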

9.7
Characterisation testing

This is a key to successful screening and inspection testing and plays a dominant
role in the development of design margins and test specifications, as it may reveal
the sensitivities of the RAM. Characterisation is a parametric, experimental
analysis of the electrical properties of a given integrated circuit; its purpose is to
investigate the influence of different operating conditions (temperature, supply
voltage, logic levels, frequency, etc.) on the IC's behaviour and to deliver a cost-
effective test programme for incoming inspection¹⁴. Normally a characterisation is

14 For quality cost optimisation at incoming inspection level, see chapter 8.4.2 in [9.24].

performed at 2...5 different temperatures and with a large number of patterns.
Furthermore, the characterisation of RAM devices can be used as a tool for selecting
the right vendor(s) and to distinguish the characteristics of the devices from
different sources. Electrical characterisation testing can be defined as a thorough
and exhaustive testing of a given device type, usually carried out by a computer
controlled process on an automatic test equipment (ATE), and resulting in all
practical combinations of (i) device supply voltages; (ii) logic input/output
voltages; (iii) temperatures; (iv) timing conditions; (v) parametric variations; (vi)
various test patterns; (vii) operating frequency response; (viii) modes of operation;
and (ix) power consumption.
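The size of the resulting test matrix grows multiplicatively with each condition varied, which is what makes characterisation so much more expensive than incoming inspection. A sketch with purely illustrative counts (the numbers below are assumptions, not from the text):

```python
# Illustrative counts for a characterisation matrix; each key is one of
# the condition classes (i)-(ix) listed above.
conditions = {
    "supply voltages": 4,   # e.g. four-corner VDD/VBB
    "temperatures": 3,      # hot, cold, ambient
    "timing set-ups": 5,
    "test patterns": 6,
    "operating modes": 2,
}

# Every combination must be exercised once, so the run count is a product.
combinations = 1
for n in conditions.values():
    combinations *= n       # 4 * 3 * 5 * 6 * 2 = 720 runs per device
```

Even these modest counts give 720 runs per device, which is why the characterisation sample is kept small and the ATE programme heavily optimised.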
The goal is to discover how the device responds under these conditions, and
within which limits it remains functional. The tests include stringent functional
stressing by means of patterns and truth tables, as well as timing and parametric
variations under temperatures extremes. The worst case test patterns with supply
and timing variations are applied to the device to promote as many of its failure
modes as practical, in order to determine its performance and sensitivities under the
most severe operating conditions.
A RAM characterisation must be optimised in terms of test time and economics,
and operate within the constraints of the available ATE. Due to economic
imperatives, the test is usually carried out on a small sample. The sample should -
if possible - be large enough to contain a variety of process weaknesses and cover
several fabrication date codes, to allow for a maximum of process parameter
variations. Although a characterisation is optimised in terms of economics, it is still
a costly and time-consuming task relative to conventional volume incoming
inspection testing, and a characterisation often requires a more flexible test system
than incoming inspection testing does. Furthermore, a test system used for
characterisation should incorporate massive data storage and fast data-reduction
routines. A comparison of characterisation versus incoming inspection testing is
given in Table 9.6.

Table 9.6 Incoming inspection testing versus characterisation [9.23]

Incoming inspection                              Characterisation
- Large volume of parts tested;                  - Sample of parts tested;
- Simple testing;                                - Exhaustive & complex test sequence;
- Short test time: throughput is the key;        - Long test duration;
- Low or no engineering content;                 - High engineering content;
- Requires automatic handlers;                   - Requires ancillary data storage equipment;
- No data required, strictly go/no-go testing;   - Vast amounts of data collected;
- Low cost;                                      - High cost;
- Often dedicated test system;                   - Sophisticated flexible test system;
- Generally no data are provided;                - Can provide process stability analysis and
                                                   indicate parametric distributions;
- Performed continuously;                        - Normally done one time.

Obviously, characterisation is not only a question of data collection; it also
calls for much more exhaustive testing than usual incoming inspection testing.
Some typical characteristics of the two types of testing are given in Table 9.7.

Table 9.7 Some typical characteristics of the two types of testing [9.23]

Incoming inspection                        Characterisation
• N, N^(3/2) patterns;                     • N² patterns;
• Worst-case temperature;                  • Three temperatures (hot, cold, ambient);
• 1, 2 or 4 corner voltage supply;         • Four-corner supply;
• Fixed sets of timing data;               • Variation of timing data in many areas;
• Screening out false second sources;      • Defining non-specified characteristics.

During the pre-characterisation, an effort should be made to discover any pattern
sensitivities [9.24] for different patterns (e.g. butterfly, checkerboard, diagonal,
galloping one, march, masest, surround, etc.). Prior to the characterisation, all units
should be subjected to a go/no-go test, including testing at the four corner extremes
of VDD and VBB. Furthermore, a variety of test patterns (Table 9.7) should be
applied. During the go/no-go test and the following characterisation, the inductance
and capacitance of the handler contactor, interface, load board, etc. must be considered
and - where possible - compensated for. The same applies to driver impedance.
Improper sense-amplifier margins can occasionally be detected by low storage tests.
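Two of the pattern classes named above can be sketched as follows (illustrative implementations for an abstract cell array, not the book's definitions); the read-count function shows why galloping patterns scale as N² and are therefore reserved for characterisation rather than incoming inspection.

```python
def checkerboard(rows: int, cols: int):
    """Physical checkerboard: adjacent cells hold complementary data,
    stressing cell-to-cell leakage. O(N) writes for N cells."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]

def galpat_read_count(n_cells: int) -> int:
    """Reads performed by a simplified galloping-ones pattern: for each
    test cell, every other cell is read back against it, plus one read
    of the test cell itself -> n*(n-1) + n = n**2 reads."""
    return n_cells * (n_cells - 1) + n_cells

board = checkerboard(4, 4)          # [[0,1,0,1], [1,0,1,0], ...]
reads = galpat_read_count(64)       # 4096 reads for a tiny 64-cell array
```

For a 256K device the galloping pattern already needs ~6.9e10 reads, so in production it is replaced by the cheaper N and N^(3/2) patterns of Table 9.7.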

9.7.1
Timing and its influence on characterisation and test

To characterise and test a dynamic RAM for sensitivities due to timing [9.23],
several timing set-ups must be included. The address latching must be considered
carefully. Row Address Strobe (RAS) initiates the cycle by going from high to
low. Prior to that, it must remain in the high condition for a sufficient time for
internal nodes to be pre-charged to a known initial state; the parameter is tRP.
Once RAS goes low, it must remain low long enough (tRAS) for the selection of
the accessed cells, the sense operation and the restoration of the destroyed data
(read-out is destructive). Similar requirements must be met when Column Address
Strobe (CAS) goes low and the column addresses are latched. Cycle time
influences power consumption. The high-impedance state of the output buffer must
also be checked; there is no reason to search for some test sequence or data
pattern which is the worst case for access time.
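A characterisation set-up can be checked against such timing minimums before it is loaded into the ATE. A hedged sketch with hypothetical datasheet values for tRP and tRAS (the nanosecond figures below are assumptions, not from the text):

```python
# Hypothetical datasheet minimums, in nanoseconds.
DATASHEET_MIN_NS = {"t_RP": 100, "t_RAS": 150}

def timing_setup_ok(setup_ns: dict) -> bool:
    """True when every programmed timing meets its datasheet minimum:
    t_RP (RAS precharge) and t_RAS (RAS active low time)."""
    return all(setup_ns.get(name, 0) >= minimum
               for name, minimum in DATASHEET_MIN_NS.items())

ok = timing_setup_ok({"t_RP": 120, "t_RAS": 200})    # meets both minimums
bad = timing_setup_ok({"t_RP": 80, "t_RAS": 200})    # t_RP too short
```

In a real characterisation these values would then be swept below the minimums on purpose, to map where the device actually stops working.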

9.7.2
Test and characterisation of refresh

Refresh tests may be roughly divided into two parts: block refresh and distributed
refresh. The normal way to do block refresh testing is to write some data (such as
checkerboard pattern) in the entire memory. Then the memory is tested and if no
9 Reliability of memories and microprocessors 299

failure, all clocks are stopped and paused for the specified time, 2ms. After 2ms the
memory is tested again for failure. Such a test insures that in addition to data being
retained for the refresh interval, the peripheral circuits are also fimctioning after the
pause.
Refresh time is not so critical at low or room temperatures, but becomes
significant at elevated temperatures; it can vary as much as 30 times or more over
the temperature range 70°C to 25°C, depending on the internal construction of the
memory. One of the drawbacks of testing refresh time by this method is the thermal
change within the chip when pausing between read and write [9.23]. Because of
this, the refresh time reading is not constant and it is difficult to decide what the
actual refresh time of the memory is. A way to solve this problem is to apply a
distributed refresh.
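The block refresh procedure described above can be sketched in Python (the toy memory model and the retention figure are illustrative assumptions, not a description of any real device):

```python
class RAMModel:
    """Toy DRAM model whose cells lose their contents once the pause
    since the last write/refresh exceeds the retention time."""
    def __init__(self, size, retention_ms):
        self.data = [0] * size
        self.retention_ms = retention_ms
        self.last_refresh_ms = 0.0

    def write_checkerboard(self, now_ms):
        self.data = [i % 2 for i in range(len(self.data))]
        self.last_refresh_ms = now_ms

    def read(self, now_ms):
        # Cells decay if the pause exceeded the retention time.
        if now_ms - self.last_refresh_ms > self.retention_ms:
            return [0] * len(self.data)
        return list(self.data)

def block_refresh_test(ram, pause_ms=2.0):
    """Write a checkerboard, stop all clocks for the specified pause
    (2 ms in the text), then re-read and compare."""
    ram.write_checkerboard(now_ms=0.0)
    expected = ram.read(now_ms=0.0)
    observed = ram.read(now_ms=pause_ms)  # after the pause
    return observed == expected
```

A device whose retention is shorter than the pause fails the comparison, which is exactly what the block refresh test is meant to catch.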

9.7.2.1
Screening tests and test strategies

The newest memories on the market, produced in small series with insufficiently
stable process parameters, can exhibit early failures; these must be eliminated - by
well-trained personnel - before the devices are mounted on a PCB. The
screening tests must activate failure mechanisms, and must not cause damage or
alteration of the tested memories. For memories in hermetic packages, and for high
reliability (or safety) applications, the following screening tests should be applied:
• Burn-in - static or dynamic - (125°C for 160h) precipitates some 80% of
the chip-related and 30% of the package-related early failures; memories should
be operated with the same electrical signals as in the field. Should surface, oxide
and metallisation problems be dominant, a static burn-in is better. A dynamic
burn-in activates practically all failure mechanisms. The choice will be made on
the basis of practical results.
• Constant acceleration (for memories in hermetic packages) to check the me-
chanical stability of die-attach, bonding, and package. The memories are placed
in a centrifuge and exposed to an acceleration of 30 000g for one minute (generally
z-axis only).
• ESD test (1kV ... 3kV) during handling, assembling and testing of memories or
ICs, using the human body model (HBM) and the charged device model (CDM).
• Glassivation (silicon dioxide and/or silicon nitride) test of the entire die surface.
Ideally, for memories in plastic packages, it should be free from cracks and
pinholes. To check this, the chip is immersed (for 5 minutes) in a 50°C warm
mixture of nitric and phosphoric acid, and then inspected with an optical
microscope (MIL-STD-883, method 2021).
• High-temperature storage (150°C for 200h) to stabilise the thermodynamic
equilibrium and to activate failure mechanisms related to surface problems (e. g.
charge induced failures, contamination, contacts, oxidation). Should solderability
be a problem, an N₂-protective atmosphere can be used.
• Hot carriers are a consequence of the high electric fields (10⁴...10⁵ V/cm) in
transistor channels. Effects: increase of switching times, possible data retention problems,
increase of noise. The test is performed under dynamic conditions, at 7...9V and
at -20°C to -70°C.
• Humidity or damp heat test, 85/85 and pressure cooker - to investigate the in-
fluence of moisture (e. g. corrosion) on the chip surface [9.24].
• Latch-up tests simulate voltage overstresses on signal and power supply lines as
well as power-on I power-off sequences [9.24].
• Seal test (to check the seal integrity of the cavity): it begins with the fine leak
test (1h at 0.5mm Hg / storage (4h at 5atm) in a helium atmosphere / waiting
0.5h in open air / measurement with the help of a specially-calibrated mass
spectrometer) and continues with the gross leak test (1h at 5mm Hg / 2h at 5atm
in FC-72 / 2 minutes waiting in open air / immersion in an FC-40 bath at 125°C /
observation of a continuous stream of small bubbles from the same place
within 30s to confirm a defect).
• Soft errors. At the chip level, an electron beam tester allows the measurement of
signals within the chip circuitry. (If logic circuits with different signal levels
are unshielded and arranged close to the border of a cell array, stray coupling
may destroy the information of cells located close to the circuit - a chip
design problem.)
• Solderability of tinned pins, performed according to MIL-STD-883 or IEC 68-2
after the applicable conditioning, and using the solder bath or the meniscograph
method.
• Thermal cycles - to test the memory's ability to support rapid temperature
changes (at least 10 thermal cycles from -65°C to +150°C) air to air in a two-
chamber oven using a lift. Dwell time at the temperature extremes should be
≥10 minutes (after the thermal equilibrium of the memories has been reached
within ±5°C), with a transition time of less than 1 minute. Should solderability be a
problem, an N₂-protective atmosphere can be used.
• Time-dependent dielectric breakdown (particularly sensitive for memories ≥4Mbit)
as a result of electric fields of up to 10MV/cm (Fowler-Nordheim effect, hot
carrier injection into the isolation layer).
The following listing gives a standard procedure for the screening tests (batch of 50
pieces):
1. Extreme temperatures, measuring the parameters at 25°C after each step:
• beginning with 70°C, in 10°C steps until failure; 1h for each step, with
vitality test.
2. Electrical behaviour at various temperatures:
• -20°C, -40°C, -60°C, 1h for each step, and measuring at 25°C after each
step;
• -20°C until +120°C in 10°C steps, 1h per step with vitality test;
• 100°C, 100h with vitality test, continuous monitoring of various
parameters.
3. Thermal cycles (1000 cycles -20°C/+85°C):
• 20...40°C with vitality test;
• 60...80°C with vitality test;
• 60...80°C with power on only during the heating phase.
4. Humidity tests (intermediary and final measurements, in dry state):
• 85/85, 250h with vitality test;
• 30/100 (hot water), 500h without supply voltage;
• 95/95, cooling to -20°C in 3h, heating to 95°C (95/95) in 2h, 100 cycles
without supply voltage.
5. Vibrations (without supply voltage):
• 8 sine explorations (0...3000Hz) to determine the resonance frequencies;
• random 20...500Hz; 1, 3, 6, and eventually 10 and 15g, three axes, 1h.
6. ESD test.

In order to investigate the technological limits and failure mechanisms of high-capacity
memories, it is necessary to submit a given type to stresses that can be more
severe than those encountered in field operation. Such tests - depending on the
intended application - are often destructive, and a failure analysis after each stress is
important to evaluate failure mechanisms and to detect degradation.

9.7.3
Test-programmes and -categories

A test programme for a RAM memory consists of three items: DC parametric test,
AC parametric test and functional test (although they often are applied
simultaneously). A memory test program comprises various tests such as:
continuity check, leakage tests, a variety of functional tests, dynamic or timing tests
and parametric tests. Functional tests are by far the most important tests for RAMs.
The DC tests usually have access only to the periphery of a memory chip, whereas
the functional tests have logical access to all the embedded functions of the chip,
resulting in a much better test coverage.

9.7.3.1
Test categories

Continuity checks. Before any AC, DC or functional testing is initiated, a continuity
check is carried out. No power is applied to the device under test (DUT) except the
bias involved in the continuity check; therefore an incorrectly inserted device will not
be damaged.
DC parametric test. This includes DC electrical characteristics under worst case
conditions, according to the manufacturer's specifications. For this purpose a
precision measurement unit (PMU) is used to force a current and measure a voltage
(V_OH, V_OL, etc.) or to force a voltage and measure a current (I_IH, I_IL, etc.). Before
each step, the memory's inputs and outputs are brought to the logical state
necessary for the measurement; the electrical test should be performed at 70°C or at
the highest specified operating temperature. The primary purpose is to detect gross
defects in parameters such as excessive leakage current, breakdown voltage, power
supply currents, min/max output states, output sink/source currents, etc. Due to the
high complexity of a semiconductor memory's internal structure, this parametric test
provides only a gross check on the device under test under static conditions.
Furthermore, only some DC parameters can be tested while the DUT is exercised in
a functional mode.
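The pass/fail logic of such a DC parametric step can be sketched as follows; the parameter names follow the text, but the limit values are invented for illustration and would in practice come from the manufacturer's specification:

```python
# Hypothetical datasheet limits: (lower bound, upper bound); None means
# "no limit on this side". Values are illustrative only.
DC_LIMITS = {
    "VOL": (None, 0.4),     # forced IOL, measured output-low voltage <= 0.4 V
    "VOH": (2.4, None),     # forced IOH, measured output-high voltage >= 2.4 V
    "IIL": (-10e-6, 10e-6), # input leakage current within +/-10 uA
}

def dc_param_pass(name, measured):
    """Compare one PMU measurement against its datasheet window."""
    lo, hi = DC_LIMITS[name]
    if lo is not None and measured < lo:
        return False
    if hi is not None and measured > hi:
        return False
    return True
```

The real tester repeats this comparison per pin and per worst-case condition; the sketch only shows the limit check itself.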

Functional test. The DUT is exposed to a truly operating environment and a
variety of worst case working conditions. It is a dynamic test mode that uses fast
changing input stimuli to check the DUT's internal logic, i. e. check the storage and
retrieval of standard patterns at rated cycle times. No single test pattern can test and
exercise a RAM memory thoroughly enough to detect all failure modes, because
the exact failure mechanism depends on the individual chip architecture, layout of
address decoders, cell geometry and sense amplifiers, etc. These characteristics,
too, differ from manufacturer to manufacturer, so no standardised functional test
strategy can be applied. A device that passes functional test has a high probability
of being logically correct and is unlikely to have any physical failures that could be
detected by DC-testing.
AC-test. A dynamic test mode similar to functional testing. Its distinguishing
characteristics are measurements of timing data like access times, set-up and hold
time, strobe widths, cycle times, refresh intervals, etc. The truth table is often
identical to the patterns used during functional testing.

9.7.3.2
RAM failure modes

Over the years a number of failure modes have been reported on semiconductor
RAM memories. Traditionally, solutions to these problems were found through
an evolutionary trial-and-error approach. Very often the failures were reported by
end-users, either as a result of effective electrical characterisation, well-planned
incoming inspection, or simply as the systems underwent field failure repair. Some
of the classic failure modes can be described as follows:
• Breakdown: Failure of a clamp or Zener diode; any other semiconductor or
junction breakdown.
• Decoder malfunction: Inability to address a substantial part of the array due to an
open decoder line internal to the device, or a defective decoder.
• Excessive write-recovery: Read access time lengthening, when the read cycle
immediately follows a write cycle. When using the same data line for both
reading and writing, the increased time is caused by a sense amplifier that is
saturated during the write and is unable to recover in time to detect the differen-
tial voltage of the cell being read. Recovery time may even be pattern sensitive.
• Input and output leakage: Excessive leakage currents above specified limits.
• Multiple writing: Data are written into other cell(s) than the one addressed, due
to capacitive coupling between cells or other defects like leaky input or short
circuit.
• Open and short circuits: Bonding failures or insufficient/excessive metallisation
in one of the last semiconductor manufacturing steps.
• Pattern sensitivity: The device response varies with the test pattern, reflecting
differences in address and/or data sequences; it may also reflect timing and
voltage specifications being too close to actual failure regions.
• Refresh sensitivity: Dynamic RAM fails to retain data reliably during the
specified minimum interval between refresh cycles. Failure is due to excessive
voltage or current leakage from the storage element or a fault in the rewrite
circuits.
• Sense amplifier recovery: Tendency of the output (sense) amplifier to favour one
logic state after reading a long string of a similar logic state. Alternate 1's and
0's may be read correctly, while a single bit of a given logic state in a long
string of opposite logic states may come out incorrect. It is caused by improper
charge accumulation in the sense amplifier.
• Slow access time: Charge storage on the output driver circuits or long lines
causes excessive time to sink or source current, thereby increasing access time.
Each of the listed failure modes can effectively be screened for, even though it may
require several screening approaches, if all failure modes have to be dealt with. To
effectively test for the signal detection capabilities of the sense amplifier and its
pre-charge requirements, a worst-case pattern - a string of identical data including a
single bit of inverted data - could be run in a fast read mode.
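A minimal sketch of two such patterns - the checkerboard background mentioned earlier and the worst-case string of identical data with a single inverted bit - might look like this (array sizes and bit positions are arbitrary illustrations):

```python
def sense_amp_worst_case(n, invert_at, background=1):
    """n identical bits with a single inverted bit: a worst case for
    sense-amplifier recovery when read back in a fast read mode."""
    pattern = [background] * n
    pattern[invert_at] = 1 - background
    return pattern

def checkerboard(rows, cols):
    """Classic checkerboard background used in functional testing:
    each cell is the opposite of its four neighbours."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]
```

Because pattern sensitivity depends on the physical cell layout, real test programmes map these logical patterns onto the chip topology before use.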

9.7.3.3
Radiation environment in space; hardening approaches

What would happen to standard electronics if they were launched into space? From
500 to 75 OOOkm above the surface of the earth, the space can be a very hostile
environment to most electronics needed for satellite functions such as navigation,
communication, and data processing. The high-density RAMs, the microprocessors,
and other vital electronics would operate for only a few months up to a year or two
in many satellite systems before succumbing to the effects of radiation trapped in
the earth's magnetic field (bad data, spurious output signals, latch-up, or burn-out,
all caused by the bombardment of galactic cosmic particles, from hydrogen to
uranium, that permeate the space above the earth) [9.30]. Electronics designed and
built to operate effectively in a radiation environment (rad-hard) have been in
production for over 30 years. What is new is the need for rad-hard parts¹⁵ in quantities
of tens and hundreds of thousands for commercial satellite systems, at costs close to
those of their unhardened commercial equivalents. UTMC Microelectronic Systems
introduced one alternative - the self-contained process module - at Colorado Springs,
in 1997.
Radiation has two primary effects on electronics in space. The first is the total
dose (accumulation of radiation over time, which results in permanent degradation
of device performance, including shifts in turn-on voltages, increases in operating
and stand-by currents, and changes in signal propagation delays); trapped electrons
and protons are the bulk contribution to the total dose damage (Fig. 9.9). The other

15 Rad-hard ICs require special processing and - for the most part - have been manufactured on
dedicated wafer fabrication lines. Because of the high cost of maintaining such facilities
and the relatively small market for high-level rad-hard products, the cost of these components
could easily exceed the cost of their commercial equivalents by a factor of 10 to 100. Responding
to this dilemma, some rad-hard suppliers have come up with cost-saving innovations (such as
running rad-hard products on the same fabrication line as commercial products, or shielding
commercial components from radiation by placing them in special packages with enough mass
to reduce the radiation inside to tolerable levels).

effect is the displacement damage caused by the proton portion of the space
radiation (it degrades solar cells and bipolar devices, but has essentially no effect on
digital electronics). Because of the sensitivity of most commercial electronics to
ionising radiation, almost all satellite systems require some means of mitigating the
system degradation due to space radiation.

[Figure: four panels of a polysilicon-gate (PG) MOS structure on an n substrate,
showing the gate oxide (i) unirradiated, (ii) right after a radiation burst,
(iii) with the holes left after electron transport, and (iv) with the trapped final charge.]
Fig. 9.9 Generation of electron-hole pairs in the gate and field oxides (PG = polysilicon gate)

The easiest way to minimise the trapped-hole density is to thin down the oxide; a
clean gate oxide less than 12.5nm thick can usually survive up to 100krad(Si) with
no process changes. It is also possible to entirely eliminate the field oxide through a
fully depleted technology such as SOI. When rad-hard products are run on commercial
lines using modified steps, it may be possible to reduce the hole traps, but in the
absence of a dedicated rad-hard process the hardness will rarely go much beyond
100krad(Si). Rad-hard components fabricated on dedicated lines (an expensive
solution!) are frequently hard to 1Mrad(Si), and they can easily survive most
natural space radiation environments.
It is usual to consider that the higher the yield the better the reliability. Of
course, it all depends on the nature of the main yield detractors. Design-related
yield losses are not easily correlated with reliability failures. However, when
manufacturing defects are involved, it can make sense to look for such correlations.
In a restricted number of cases, no correlation at all exists between yield
and reliability. The most typical example regards the final passivation layer quality:
large defect densities do not impact the chip performance, but they definitely
promote humidity test failures. The basic reason why no correlation can be made
between time-zero and long-term failures is that the passivation layer does not play
any active role in the chip functionality. For example, dynamic life testing of 300
1M×1 DRAM devices yielded only 6 electromigration failures after 1000h, and
contact mask misalignment gave 5 failed parts out of 300 after 1000h of life test.

A SER (soft error rate) predictive design tool is presented in [9.17], which has
resulted in a unique modelling tool called the Soft-Error Monte Carlo Model, or
SEMM. SEMM has been used in designing chips with performance/cost and soft-
fail reliability trade-offs for bipolar, CMOS, and bi-CMOS technologies. SEMM
can be extended to model SERs in chips used in aerospace environment, which
involves bombardment by protons, neutrons, and heavy ions. Also, as the critical
charges and device dimensions reach very low values, SER effects of secondary
spallation products - such as deuterons, tritons, and low-energy protons - and of
low-energy neutron recoils must be taken into account.

9.8
Design trends in microprocessor domain [9.29]

1) The device feature size will continuously decrease (0.2...0.1µm) and the delay
time of each gate will be reduced below 0.2ns. This allows integrating more than 80
million transistors in a single chip, accommodating more function units and
greatly improving the microprocessor speed. The parallel processing technique is
very efficient at exposing instruction-level parallelism, leading to a significant
reduction of communication, I/O interface, time and power consumption and
system design complexity. Consequences: (i) integrating the communication link
into the microprocessor chip; (ii) designing a synchronisation circuit in a
microprocessor chip to support multiprocessing; (iii) setting up a supporting
mechanism for the microprocessor chip to ensure the data consistency [9.28].
2) Five to six metal routing layers will be possible, cutting down the routing area
and reducing connection wire resistance and capacitance. Ion implantation
technology will improve the microprocessor speed and will reduce the parasitic
capacitance.
3) CMOS (high integration level, low power supply, low power consumption,
low I/O swing) and GaAs (high speed, radiation hardness, temperature insensitivity,
low power consumption, tolerance of harsh environments) technologies may be
preferable options for microprocessor design.
4) New types of semiconductor material and optical interconnection will be
developed, permitting clock rates of 5 to 10GHz to be reached. Therefore new
wide-bandgap materials for high-speed device design must be developed.
5) The microprocessor design will continue to follow the line of the reduced
instruction set computer (RISC) architecture.
These approaches can tremendously cut down the cost, making parallel
processing systems more competitive, and the embedded microcontroller
market will grow substantially.
The microprocessor design is directed toward the green chip in the following
aspects: (a) low voltage power supply; (b) several operation modes for energy
saving, including variable operation frequencies and clock throttling; (c) static logic
design which may work at frequencies as low as zero Hertz; (d) power management
capability fully independent of the application environment to avoid possible
conflicts; (e) dedicated hardware to monitor the power supply status of peripheral
devices.

The microprocessor will continue to be the most "mobile" (rapidly improving)
key component in the computer system design. The new era of the Information
Technology (IT) and the construction of the IT highway are waiting for a new type
of higher-performance, more powerful microprocessor to be designed in the
computer world.

9.9
Failure mechanisms of microprocessors

A microprocessor may be considered as a system consisting of a series of separate
units, each one contributing to the overall reliability of the device [9.29]. It is
possible to test the functions of the device by means of a suitable programme, but such
techniques cannot guarantee that every functional unit in the device is exercised.
Functional defects result in soft failures, which can be eliminated by adjustment.
Catastrophic failures (oxide rupture, interruption of aluminium lines, wire bond
failures, failures related to the interrelationship between software and hardware)
may also occur. Microprocessors have the same failure mechanisms as standard
devices manufactured using MOS or bipolar technologies. However, the size and
complexity of the chips introduce additional failure mechanisms. Larger chip areas
mean that the devices are more prone to defects inherent in the semiconductor
materials, the probability of defects such as pinholes in the oxides and microcracks
in the metallisation¹⁶ being increased.
The reliability of a microprocessor is greatly influenced by the packaging¹⁷
technology employed in its manufacture. As is known, two types of packaging
technologies are used:
• plastics (plastic encapsulated devices PEDs);
• ceramics (ceramic dual-in-line packages CERDIPs).
A good understanding of the interactions taking place at the various interfaces in a
packaged microprocessor will benefit the quest for higher microprocessor
reliability. In packaging, it seems that the main weaknesses are introduced at the
physical interfaces between the different sections. Devices with inadequate
protection or poor-quality manufacture are highly susceptible to such interactions.
Generally, it seems that plastic packages have begun to replace ceramics in most
applications, as the reliability of PEDs has reached a stage where they can be
compared with ceramics, and they are cost effective as well [9.8].
The common practice of some microprocessor manufacturers is to perform
functional tests that verify the correct performance of the functions specified in the
machine's instruction set. Separate tests are done to verify data transfer and storage,

16 As passivated Al(Cu) lines become narrower, the metal exhibits increasingly elastic behaviour
with higher stress levels, a combination of stress characteristics which favours void formation.
Stress relaxation in Al(Cu) films and lines has been measured by bending beam and x-ray
diffraction methods [9.31].
17 The term packaging is used to cover the forms of encapsulation available. However, the die
attachment system used in the package and the lead frame system - parts of the so-called
interconnects - are often involved when discussing the problems of packaging.
data manipulation (arithmetic and so on), register decoding, and instruction
decoding and control. These functional tests are then evaluated in terms of their
fault-detection effectiveness. For example, the test engineer may pass a single 1 bit
through the microprocessor to verify that the bus lines and register cells have no
single fault causing the 1 to be changed permanently to (or "stuck at") 0; similarly,
the passage of a 0 may be used to find a stuck-at-1 fault. A combination of 0's and
1's may be used to verify that no pairs of bus lines or register cells have stuck-at
faults.
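The walking-1/walking-0 idea described above can be sketched as follows; `bus_transfer` is a hypothetical stand-in for the real data path under test, not part of any actual test system:

```python
def walking_ones(width):
    """Vectors with a single 1 walking across a bus of the given width;
    each vector exposes a stuck-at-0 fault on one line, and its
    complement exposes a stuck-at-1 fault on the same line."""
    return [1 << i for i in range(width)]

def stuck_at_check(bus_transfer, width):
    """bus_transfer models the data path (what arrives at the far end
    when a word is sent in). Returns the bit positions found stuck
    at 0 and stuck at 1."""
    mask = (1 << width) - 1
    stuck0, stuck1 = [], []
    for i, v in enumerate(walking_ones(width)):
        if not (bus_transfer(v) >> i) & 1:
            stuck0.append(i)          # the walked 1 arrived as a 0
        if (bus_transfer(~v & mask) >> i) & 1:
            stuck1.append(i)          # the walked 0 arrived as a 1
    return stuck0, stuck1
```

On a healthy path both lists come back empty; a line forced to 0 or 1 appears in the corresponding list.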
The test engineer operates the microprocessor beyond the normal specified
limits for voltage, temperature and clock rate to test for operating margins and to
identify latent failures. The functional tests are used to verify device performance
and quality at the time of the test. The stresses caused by testing the device at
excessive conditions generate data for Shmoo plots that aid in determining safety
factors for operating margins. These can be helpful in predicting reliability over the
device lifetime. In some cases, errors may occur when an index register is
incremented or decremented through its entire range due to operation at bias
voltages near extreme limits. The failures in such cases are caused by slow shifts of
behaviour over time, instead of a single, sudden event. Some manufacturers use
full stuck-at fault models for generating test patterns; this model - in essence -
allows one to assume that a device is 100% tested if each of its internal nodes is
switched from 0 to 1, then back to 0, as observed at the output pins. For the test to
be useful, one must know which nodes have switched and when this happened.
Device makers often use stuck-at fault models to write programmes that simulate
defects in the processor circuits. The simulations indicate how the real processor
would behave if such defects occurred in actual use. A document is compiled that
correlates the behaviour observed in the simulation with the various faults. The
document - called a fault dictionary - is used to diagnose the cause of failures that
occur in the field. The document may also be used during functional tests to seek
out faults that have been witnessed through simulation¹⁸.
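Conceptually, a fault dictionary is a mapping from simulated fault signatures to fault descriptions, consulted when a field failure shows the same signature. A toy sketch (all signatures and entries are invented purely for illustration):

```python
# Hypothetical fault dictionary: (stimulus, observed response) pairs
# recorded during fault simulation, mapped to the fault that caused them.
FAULT_DICTIONARY = {
    (0x00, 0xFF): "data bus stuck-at-1",
    (0xFF, 0x00): "data bus stuck-at-0",
    (0xAA, 0xAB): "bit 0 coupling fault",
}

def diagnose(signature):
    """Look up an observed (stimulus, response) signature from a field
    failure; fall back to a 'not found' answer for unknown behaviour."""
    return FAULT_DICTIONARY.get(signature, "no matching simulated fault")
```

Real fault dictionaries index thousands of simulated faults against full response sequences rather than single words, but the lookup principle is the same.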
The testing of VLSI/ULSI is a major factor in the cost of producing such digital
devices as memory chips and microprocessors. Because exhaustive testing is not
feasible, various approaches are used to develop cost-effective tests of only
relatively few states while still ensuring the correct performance of the device.
However, even the testing of fewer states raises the major problem in VLSI/ULSI
testing today: automatically generating, storing, and manipulating the vast amount
of data needed for testing¹⁹.
Burn-in has greatly reduced the failure rate in the field, since it identifies and
allows the elimination of early failures, or infant mortalities, during the first 1000h of
operation. Functional tests may be performed during or after burn-in to check
storage and readout, address patterns, timing voltages, and signal margins. An

18 Such applications of models can be expensive, however. Programmes for finding the real faults
entail at least a thousand lines of code for relatively simple functions such as data transfer
or manipulation, whereas 10 000 lines of code may be needed to detect faults in more complex
instruction-decoding and -control circuits.
19 A 64K RAM, for example, may require from 10⁵ to 10⁹ test vectors or patterns (each pattern is a
set of input signals for testing a given state or function); with a typical memory-cycle time of
475ns, the corresponding test periods would range from 49ms to 53.7 minutes. Even test times of
a few minutes are deemed very long, considering the production volume.

approach used by some memory manufacturers is to plug a number of RAMs into
one board in the test chamber; signals from a pattern generator are distributed to a
bank of amplifiers that drive the RAMs. Comparator circuits compare the RAM
outputs with the expected outputs; every bit in the test pattern is compared at strobe
time with the expected data. When an error occurs, the comparator sets a flag bit in
the memory location corresponding to the failed RAM.
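The comparator-and-flag-bit scheme can be sketched as follows (slot numbering, word width and the flag store are illustrative, not a description of a specific tester):

```python
def burnin_compare(ram_outputs, expected, flags):
    """Compare each RAM's output word with the expected pattern at
    strobe time; on any mismatch, set the flag bit for that RAM's slot.
    flags is the tester's per-slot failure record."""
    for slot, out in enumerate(ram_outputs):
        if out != expected:
            flags[slot] = 1
    return flags
```

After the burn-in run, the flag record tells the operator which board positions hold failed devices without re-testing every part individually.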
The automotive environment has often proved to be at least as severe as that of
harsh military situations. Automotive components must work over a temperature
range from -40°C to +130°C, under repeated temperature cycling, at RH up to 100
percent, in road salt and/or corrosive chemicals, under instantaneous accelerations
(shocks) of at least 20g, and under a variety of electromagnetic interference and
power-supply transient hazards. Moreover, the engine electronic control systems
are likely to be maintained by service personnel with relatively little experience in
electronics. Even when an automobile is not operating, it may experience extreme
environments, especially temperature, humidity, salt atmosphere, chemicals, and
dust. So, while electrical and mechanical stresses occur in a privately owned vehicle
about 400h per year on average, other stresses are present the full 8760h per
year. A necessary element in the analysis of field returns is a test to
identify intermittent failures; these are usually associated with bonding, welding,
conductor failures, and so on, and require such tests as temperature cycling,
vibration, or some combination of these to identify the failure.

Annex: Some limits of gigabit memories


Speed of light: A fundamental limit is the speed of light; nothing can go faster. Although as devices
get smaller they also get closer together, chip size is continually growing, and the problem arises for long
signal and clock lines. In vacuum the inverse of the speed of light is about 33.5ps/cm; thus, the time to
traverse two sides of a 20mm x 20mm chip will be about 268ps for a relative dielectric constant ε of 4.
Inductive effects can increase the delay substantially. As chip cycles move towards a few ns, the skew and
delay on long lines will become an important and limiting design factor. More important, however, is the
fact that interconnections are lossy transmission lines and become an increasing problem. Resistance can
be lowered by going to copper or even silver, but the improvement is only around 2x over aluminium.
Design approaches - such as segmented chips and multiple processors - can help get around the
problem.
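The delay figure quoted above can be reproduced with a two-line calculation, using the 33.5ps/cm value from the text and scaling by the square root of the dielectric constant:

```python
import math

def wire_delay_ps(path_cm, eps_r):
    """On-chip signal propagation delay: the inverse speed of light
    (~33.5 ps/cm in vacuum) scaled by sqrt(relative permittivity)."""
    return 33.5 * math.sqrt(eps_r) * path_cm

# Two sides of a 20 mm x 20 mm chip = 4 cm of path; with eps_r = 4 the
# delay is 33.5 * 2 * 4 = 268 ps, the figure quoted in the text.
```

The sketch ignores the inductive and lossy-line effects mentioned above, which only make the real delay worse.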
Lithography: Printing of features by optical lithography is limited in principle to features not much
smaller than the wavelength of the light; the pursuit of smaller dimensions is limited by the availability
of appropriate refractive indexes, and this has become increasingly difficult as wavelengths get shorter.
Phase shift masking [9.33] and off-axis illumination [9.34] can potentially extend the range to
0.2...0.1µm, but at the cost of complicating the mask design. Depth of focus was considered a major
problem, but has been overcome through the use of multilevel resists with a planarised uniform top resist
layer, which serves as a mask for the patterning of the lower layer, which may have considerable
thickness variation. Optical lithography has a history of continuous improvement to meet the
requirements; however, somewhere around 0.2...0.1µm seems to be the limit of the present approaches.
X-ray lithography uses a much shorter wavelength than optical lithography, but is limited to shadow
exposure and has not proven commercially viable at present. E-beam lithography has the shortest
wavelength and the best resolution, but is essentially a serial exposure system, and has been too slow and
expensive for the exposure of large chips.

Tunneling: Tunneling through the gate insulator is a potential fundamental limit. In theory the
tunneling limit is around 3 nm. In practice, oxide quality and defect density have resulted in thicker
oxides, but improved processing should allow insulators close to the tunneling limit.
Device fields: Another fundamental limit is the maximum allowable field in the depletion regions and
the gate insulator. If the fields go too high, hot electron effects, punch-through, or breakdown result. As
dimensions are reduced, the supply voltage cannot be made arbitrarily low. Even with a well designed
device with good characteristics, sub-threshold conduction will limit how low the threshold voltage can
be reduced. Thermal energy allows some fraction of the carriers in the silicon to surmount the barrier
which the gate electrode creates as the device is turned off. In good devices, current decreases by an order
of magnitude for every 80...90 mV reduction in gate voltage in the sub-threshold region. In practice,
operation substantially below 1 V results in performance deterioration which is unacceptable in many
applications, due to loss of overdrive to turn the device on [9.54]. This limit of the supply voltage does
have the advantage of keeping the energy in a logic transition well above the minimum necessary to
overcome thermal fluctuations; i.e. the minimum voltage must be > PkT/q, where P is 2 to 4 [9.36].
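Both figures above follow directly from Boltzmann statistics and can be checked numerically. A minimal sketch (the ideality factor values are illustrative assumptions, not taken from the text):

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q = 1.602176634e-19  # electron charge, C

def subthreshold_swing_mv(temp_k=300.0, ideality=1.0):
    """Gate-voltage change (mV) needed for one decade of subthreshold
    current: S = n * ln(10) * kT/q."""
    return ideality * math.log(10) * K_B * temp_k / Q * 1e3

def min_supply_mv(temp_k=300.0, p=4.0):
    """Thermal-fluctuation floor on the supply voltage, P*kT/q, in mV."""
    return p * K_B * temp_k / Q * 1e3

# An ideal device at room temperature gives ~60 mV/decade; an assumed
# ideality factor of ~1.4 lands in the 80...90 mV range quoted above.
print(subthreshold_swing_mv(300.0, 1.0))  # ~59.5 mV/decade
print(subthreshold_swing_mv(300.0, 1.4))  # ~83.3 mV/decade
print(min_supply_mv(300.0, 4.0))          # ~103 mV for P = 4
```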
Another challenge with device fields is controlling where the field lines terminate. Device threshold
voltage can vary due to short channel effects, which arise when the device threshold voltage depends
upon the source-drain spacing and drain voltage. For very small devices, the field distribution must be
considered as a three-dimensional problem, since charge in the channel and both the source and drain can
terminate field lines from the gate. A double-sided gate structure, with gates both above and
below the channel region, can give the best control of channel fields and short channel effects, but is
very difficult to fabricate at present. It can, however, result in the shortest channel device.
Soft errors: As device dimensions and supply voltage are reduced, the amount of charge involved in
a switching or retentive operation is correspondingly reduced. Soft errors can occur when minority
carriers cross a pn junction into a node in sufficient quantity to upset the state of the node. DRAM is
most sensitive because it involves storing small amounts of charge in the memory cell. Memory can
handle such errors through error correction codes, and logic can use parity to detect and retry. One
source of minority carriers is ionising radiation such as α-particles or cosmic rays.
DRAM cell size: One reason for the continuing progress in DRAM has been that the normalised cell
size, as expressed in minimum lithographic squares, has decreased by about a factor of 1.4 each
generation. The minimum cell size for the folded bit line cell configuration, used in all present DRAMs,
is eight squares (determined by the intersection of two word lines and a bit line). This is about the
number of squares for the 256 Mb DRAM cell; to stay on the projection, the 1 Gb cell should require 5...6
squares. The open bit line cell requires 4 squares (the intersection of one word line and one bit line), but
would have severe noise and sense amplifier pitch matching problems. If the cell area does not shrink
according to projection, the chip size will be affected, impacting the economic viability of the next
DRAM generation.
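The "lithographic squares" metric converts directly into cell area: area = squares × F², with F the minimum feature size. A small sketch with an assumed, purely illustrative feature size:

```python
def cell_area_um2(feature_um, squares):
    """DRAM cell area in um^2: (number of lithographic squares) * F^2."""
    return squares * feature_um ** 2

# Folded bit-line cell (8 squares) vs open bit-line cell (4 squares),
# assuming a 0.18 um feature size for illustration:
folded = cell_area_um2(0.18, 8)    # ~0.259 um^2
open_bl = cell_area_um2(0.18, 4)   # ~0.130 um^2

# Cell array area of a 1 Gbit chip at the projected 6 squares per cell,
# ignoring sense amplifiers and periphery:
array_mm2 = 2 ** 30 * cell_area_um2(0.18, 6) / 1e6  # ~209 mm^2
print(folded, open_bl, array_mm2)
```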
Fabrication control: In practice, control of the device characteristics and yield is a major concern.
Each process has an associated variation, and as devices shrink, the variation must also shrink
correspondingly. Collectively, immeasurable effort has been spent by the industry learning to control and
refine processes. Controlling a 0.1 µm gate electrode to ±20% requires controlling each edge to a few
tens of atomic distances. While this may not be a fundamental limit, it does represent a formidable
challenge. An eventual limit comes in the statistics of the random distribution of impurities in depletion
regions [9.37]. As the active volume shrinks, the number of impurity atoms N in the depletion region
shrinks, while the standard deviation goes as N^1/2, and the probability that somewhere on the chip a
device will not have sufficient atoms to support the depletion region goes up.
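If the dopant count is Poisson distributed, the relative fluctuation grows as 1/√N as the device shrinks, and even a tiny per-device tail probability becomes near-certain somewhere on a large chip. A sketch with illustrative numbers:

```python
import math

def relative_sigma(mean_dopants):
    """Relative 1-sigma fluctuation of a Poisson dopant count:
    sigma/N = N^(1/2)/N = 1/sqrt(N)."""
    return 1.0 / math.sqrt(mean_dopants)

def chip_outlier_prob(per_device_prob, n_devices):
    """Probability that at least one of n independent devices falls
    outside its tolerated dopant range."""
    return 1.0 - (1.0 - per_device_prob) ** n_devices

# Shrinking the depletion region from ~10^4 to ~10^2 dopants raises
# the relative fluctuation from 1% to 10%:
print(relative_sigma(1e4), relative_sigma(1e2))  # 0.01 0.1

# A 1-in-10^9 per-device tail is almost certain to show up somewhere
# on a 10^9-device chip (probability ~ 1 - 1/e):
print(chip_outlier_prob(1e-9, 10 ** 9))  # ~0.63
```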
Despite these potential limitations, devices with perfectly good characteristics and channel lengths
well below 0.1 µm have been made [9.38]. Specific technology improvements (such as SOI or a low-ε
interconnection dielectric) could further improve performance without pushing dimensions. It appears
that fundamental limits would not come into play in any serious manner until somewhere after the 16 Gb
generation, if then. Practical limits in lithography and control of fabrication processes will dominate.
The industry has a record of overcoming such challenges, but it is becoming increasingly difficult
to do so, and at increasingly higher costs. The cost of a state-of-the-art DRAM fabrication facility is in
the vicinity of a billion US$, and is doubling each generation.
And processors would still get better. Microprocessor throughput has been increasing at about 2X
every 18 months. Half of this has been due to improvement in device performance, but the other half has
resulted from other sources, such as improved design, circuits, layout, architecture, compilers, and the
like, and this progress would continue. Further, there would be strong potential for improvement through
optimising design for a specific application. At present, microprocessors are designed for general usage
and then personalised through software for the particular application; processor design is very time
consuming and expensive, and the high design costs must be amortised over a large sales base. A goal
would be automated design tools which could take a high level description and produce a processor at
the "push of a button" which was reasonably optimised for the specific application, with improved
performance, and at a design cost which would be economically attractive for the smaller sales volume.
The current drive for low power electronics is a good example of optimising designs for a specific
end without pushing technology limits [9.36]. Since power in a CMOS circuit is given by the very
familiar formula P = CV²f, power supply reduction is the major first step. However, power is a system
characteristic, and optimisation results from considering a wide spectrum of disciplines, including the
system level and all levels below: processing technology, device design, circuits, chip design, CAD
tools, system architecture and organisation, system operation, logic partitioning and synthesis, algorithms
and software. A key aspect has been to incorporate low power considerations into the system design
from the beginning, and major progress in power reduction has been obtained in memory, logic, and
communications [9.39], [9.40].
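The quadratic leverage of the supply voltage in P = CV²f is why supply reduction is the first step. A minimal sketch (capacitance, frequency, and the activity factor are illustrative assumptions):

```python
def dynamic_power_w(c_farads, v_volts, f_hz, activity=1.0):
    """Dynamic CMOS switching power: P = a * C * V^2 * f,
    where a is the switching activity factor."""
    return activity * c_farads * v_volts ** 2 * f_hz

# Halving the supply quarters the dynamic power at fixed C and f:
p_5v0 = dynamic_power_w(100e-12, 5.0, 100e6)  # 0.25 W
p_2v5 = dynamic_power_w(100e-12, 2.5, 100e6)  # 0.0625 W
print(p_5v0, p_2v5, p_5v0 / p_2v5)            # ratio is 4.0
```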
Reliability problems of Gigabit CMOS circuits. There is the widely shared opinion that the minimum
structure size of mainstream CMOS devices, currently at about 0.5 µm (corresponding to the
16M DRAM CMOS generation), will be scaled down to about 0.07 µm for the 64 Gigabit DRAM level.
Simple CMOS circuits - such as ring oscillators - have already been realised with a channel length of
0.07 µm. This means that today's CMOS technology will prevail for at least a further decade, or, very
likely, for two or even more decades.
DRAM processing will continue to play the role of the technology driver up to a memory cell density
of 64 Gbit per chip. The pace of the past, which brought us laboratory versions of a new DRAM
generation at intervals of three years, will be maintained. Not so much technological problems, but rather
the financial risk and the need for huge investments in more advanced fabrication capabilities, may slow
down the speed of introducing higher integration densities on the market. Nevertheless, 64 Gbit circuits
with a minimum structure size of 0.07 µm will become a reality and will enter the market place in the
first quarter of the next century.
Conclusion. Finding the optimum trade-off between reliability and performance will be a challenge
for all reliability scientists.

References

9.1 Ning, T. H. (1995): Second symposium on nano device technology, Hsinchu, Taiwan,
May 25-26
9.2 Terman, L. M. (1995): Limits - some are more fundamental than others. Proceedings of
the fourth international Conference on Solid-State and Integrated-Circuit Technology,
Beijing (P. R. China), October 24-28, pp. 7-12
9.3 Pilkuhn, M. H. (1995): Molecular electronics: new prospects for IT. Proceedings of the
fourth international Conference on Solid-State and Integrated-Circuit Technology, Beijing
(P. R. China), October 24-28, pp. 13-20
9.4 Doering, R. R. (1992): Trends in single-wafer processing. Symposium on VLSI Tech-
nologies. Digest of Technical Papers, pp. 2-5
9.5 Takeda, E. (1995): Reliability challenges for giga-scale integration. Proceedings of
RELECTRONIC '95, Budapest (Hungary), October 16-18, pp. 1-16
9.6 Kleppmann, W. G. (1989): WLR Final Report, pp. 125-135
9.7 Feibus, M.; Slater, M. (1993): Pentium Power. PC Magazine, vol. 12, no. 8, pp. 108-120
9.8 Amerasekera, E. A.; Campbell, D. S. (1987): Failure Mechanisms in Semiconductor
Devices. J. Wiley & Sons, Chichester
9.9 Crook, D. L. (1990): Proceedings of IRPS, pp. 2-11
9.10 Cristoloveanu, S.; Li, S. S. (1995): Electrical Characterization of Silicon-On-Insulator
Materials and Devices. Kluwer Acad. Publ., Boston
9.11 Guichard, E. et al. (1994): IEDM'94 Techn. Digest, p. 315
9.12 Lantz, L. II (1996): Tutorial: Soft errors induced by alpha particles. IEEE Trans. Reliabil-
ity, vol. 45, no. 2, pp. 174-179
9.13 Messenger, G. C.; Ash, M. S. (1986): The Effects of Radiation on Electronic Systems.
Van Nostrand Reinhold
9.14 Rauhut, H. W. (1991): Low alpha epoxy moulding compounds. SPE ANTEC Techn.
Papers, vol. 37, pp. 1260-1264
9.15 Pecht, M. G. et al. (1995): Plastic-Encapsulated Microelectronics. John Wiley & Sons,
New York
9.16 Hasnain, Z.; Ditali, A. (1992): Building-in reliability: soft errors - a case study. Proc. Int.
Reliab. Physics Symp., pp. 276-280
9.17 Srinivasan, G. R. (1996): Modeling the cosmic-ray-induced soft-error rate in integrated
circuits: An overview. IBM J. Res. Develop. vol. 40, no. 1, pp. 77-89
9.18 Pecht, M.; Ramappan, V. (1992): Are components still the major problem? IEEE Trans.
Comp., Hybrids, and Manuf. Technol. vol. 15, no. 6, pp. 1160-1164
9.19 Ghate, P. B. (1991): Industries Perspective on Reliability of VLSI Devices. Texas Instru-
ments
9.20 Semiconductor Industry Association, SIA (1992): SIA Report: Military IC quality rising
across the board. Military & Aerospace Electronics, p. 50
9.21 Westinghouse Electric Corp. (1989): Failure analysis memos.
9.22 Weber, W. et al. (1991): Dynamic degradation in MOSFET's. IEEE Trans. on El. Dev.,
vol. 38, no. 8, pp. 1859-1867
9.23 Jensen, E.; Schneider, B. (1979): Characterization of RAMs.
9.24 Birolini, A. (1997): Quality and Reliability of Technical Systems. Springer, Berlin
9.25 Woods, M. H.; Rosenberg, S. (1980): EPROM Reliability. Electronics, pp. 133-141
9.26 Mielke, N. R. (1983): New EPROM data-loss mechanisms. 21st Ann. Proc. Int. ReI. Phys.
Symp., pp. 106-113
9.27 Băjenescu, T. I. (1978): Sur la fiabilité des mémoires bipolaires PROM. Bull. SEV/VSE
(Switzerland), no. 6, pp. 268-273
Băjenescu, T. I. (1982): Zuverlässigkeit und Systemzuverlässigkeit. Aktuelle Technik, no.
7/8, pp. 9-13
Băjenescu, T. I. (1982/1983): Zuverlässigkeit monolithisch integrierter Schaltungen. EPP-
Artikelserie, September 1982 / May 1983
Băjenescu, T. I. (1983): Fertigung bestimmt Qualität. Elektronikpraxis, November, pp.
178-184
Bajenesco, T. I. (1983): Fiabilité, modes de défaillance et complexité des composants
électroniques. Electronique, no. 12, pp. 23-25
Bajenesco, T. I. (1983): Pourquoi les tests de déverminage des composants? Electronique,
no. 4, pp. EL8-EL11
Bajenescu, T. I. (1983): Fehleruntersuchung elektronischer Bauelemente. Aktuelle Technik,
no. 10, pp. 22-28
9.28 Bajenescu, T. I. (1983): Mikroprozessoren und Zuverlässigkeit. Aktuelle Technik, no. 10,
pp. 18-21
Bajenesco, T. I. (1984): Mémoires RAM: quelle fiabilité? La Revue Polytechnique, no. 6,
p. 701
Bajenescu, T. I. (1984): Zuverlässig? Kriterien für Mikroprozessoren. Hard and Soft,
April, pp. 24-27
9.29 Bajenescu, T. I. (1997): Status and trends of microprocessor design. Proceedings of the
1997 Int. Semicond. Conference, 20th Edition, October 7-11, Sinaia, Romania
Bajenescu, T. I. (1998): Information technology and trends in microprocessor design.
Proc. of the Internat. Conference on Optimization of Electrical and Electronic Equip-
ments, Brasov, Romania, May 14-15, 1998, pp. 807-809
9.30 Benedetto, J. M. (1998): Economy-class ion-defying ICs in orbit. IEEE Spectrum, March,
pp. 36-41
Messenger, G.; Ash, M. S. (1992): The effects of radiation on electronic systems. Van
Nostrand Reinhold, New York
9.31 Ho, P. S. et al. (1995): Thermal stress and relaxation behaviour of Al(Cu) submicron
interconnects. Proc. of the fourth Internat. Conf. on Solid-State and Integrated-Circuit
Technology, Beijing (China), October 24-28, pp. 408-412
9.32 Hermann, M.; Schenk, A. (1995): Field and high-temperature dependence of the long
term charge loss in erasable programmable read only memories: Measurements and
modeling. J. Appl. Phys., vol. 77, no. 9, pp. 4522-4540
9.33 Levenson, M. et al. (1982): IEEE Trans. on Electron Devices, December, pp. 1828-1836
9.34 Kamon, K. et al. (1992): Symposium on VLSI Technology Digest, pp. 108-109
9.35 Davari, B. et al. (1995): Proceedings of the IEEE, April, pp. 595-606
9.36 Meindl, J. D. (1995): Proceedings of the IEEE, April, pp. 619-635
9.37 Keyes, R. W. (1991): Contemporary Physics, vol. 32, no. 6, pp. 403-419
9.38 Sai-Halasz, G. G. et al. (1988): IEEE Electron Device Letters EDL-9, pp. 633-636
9.39 Chandrakasan, A. P.; Brodersen, R. W. (1995): Proceedings of the IEEE, no. 4, April, pp.
498-523
9.40 Itoh, K. et al. (1995): Proceedings of the IEEE, April, pp. 524-543
10 Reliability of optoelectronic components

10.1
Introduction

Visible light-emitting diodes, LEDs (red, green, yellow, and blue), have become indispen-
sable as visual indicators. Combinations of LEDs - in a hybrid or monolithic form
- are among the competitors for the lucrative visible alphanumeric display market.
Reliability of such LEDs is now almost taken for granted; however, the main em-
phasis of this chapter will be the understanding of degradation processes in LEDs
and optocouplers. In Fig. 10.1 a classification of optoelectronic semiconductor
components is given.

[Figure: classification tree of optoelectronic semiconductor components, branching into detectors (photoresistor, photodiode, phototransistor, photodarlington, photothyristor) and optocoupler combinations of emitters with these detectors.]

Fig. 10.1 Classification of optoelectronic semiconductor components [10.1][10.2]

T. I. Băjenescu et al., Reliability of Electronic Components
© Springer-Verlag Berlin Heidelberg 1999

IR semiconductor lasers and high radiance LEDs (HRLEDs) are still not readily
available and their current prices reflect the fact that today there is neither a large
market, nor real production capability. The potential market for such devices could
consist of various applications of semiconductor lasers in optical video and audio
disc reading and writing, and in range finders and IR illuminators for surveillance
purposes. In addition, the original fibre optic (FO) communication concept is
enlarging to include not only the land based civil transmission network [10.3] but
also undersea transmission and interconnections within telecommunication switching
centres. The HRLED and laser combination of characteristics includes: low
electrical input power, capability for efficient optical coupling into a particular FO,
correct transient response for the particular application, and operation over the
required system temperature range. These characteristics should not deteriorate during
operation (neither catastrophically, nor gradually, the device remaining within the
specification limits for a time sufficient to make system operation economically
viable).
For visible LEDs, GaAs_xP_(1-x) and GaP are the forerunners; commercially, GaAsP
- because of its band structure, which permits light emission via direct
recombination between holes and electrons - is the favoured material for red-
emitting LEDs (Fig. 10.2).

[Figure: cross-section of a red LED, showing the aluminium contact, the Zn-doped diffused P layer, the Si3N4 layer, and the AuSn backside contact.]

Fig. 10.2 A typical red LED cross-section

In the IR, GaAs and Ga_(1-x)Al_xAs have dominated until now. In order to grow alloys
such as InGaAsP in the form of high quality epitaxial layers on a convenient
binary substrate material, it is usually necessary to ensure lattice matching; this
prevents the formation of mismatch dislocations and other defects which could affect
radiative efficiency and reliability. The most common method of growth for
epitaxial layers is liquid phase epitaxy (LPE), although hydride (or chloride) vapour
phase epitaxy (VPE), metal organic chemical vapour deposition (MO-CVD) and
molecular beam epitaxy (MBE) have been tried with varying success for some of
the materials. Radiative emission in the materials systems of interest is achieved by
electron or hole injection across a heavily forward-biased pn junction, or by injection
of both. The diode chip geometry depends on the type of LED or laser required.
The simple cleaved or sawn-sided dice with large area alloyed contacts (Fig. 10.3)
may be modified in different ways to maximise external radiative efficiency [10.4],
depending upon whether or not the light suffers from self-absorption in the
particular semiconductor material in use. The light output increases with the
injected current, in a fairly linear fashion.
The main causes of degradation - a consequence of forward biasing - are the
inherent crystal defects; non-radiative recombination centres are formed at these
defect sites, thereby impairing the quantum efficiency of the devices.
Recombination enhanced defect reactions utilise the energy liberated during the
recombination process to induce defect dissociation, defect creation, and defect
migration. The main source of extrinsic failure mechanisms is the packaging;
encapsulants for optoelectronic devices must be transparent, and such materials
have high thermal expansion coefficients (60 ppm/°C) compared with IC encapsulants doped
with alumina and silica (20 ppm/°C). Hence, the thermal mismatch between the
rest of the package (4...15 ppm/°C) and the encapsulant is a severe problem¹. Gold
bonding wires are preferred to the poorer quality aluminium wires. Failures may
occur as a result of breakage of the wire, kink formation caused by thermal
contraction of polymer encapsulants, and bond lift-offs. The bonding stress in
ceramic substrates must be controlled to prevent the occurrence of cracks;
temperature cycling is a major reliability hazard, and soft silicone is recommended
for use with ceramic substrates because of its stable thermal behaviour² [10.13].

[Figure: basic LED chip structure, showing the n contact wire, n metallisation, the n and p regions, p metallisation, solder, and the header.]

Fig. 10.3 Basic large-area-contact LED structure [10.3]

¹ The outcome of the mechanical stresses generated can be either delamination or separation of
the encapsulating epoxy from the substrate, or high bending stresses may cause bulk epoxy
cracking.
² Defects in silicon-based optoelectronic devices are less affected by temperature variations than
in gallium-based devices; investigations into GaAlAs devices [10.14] have shown that it is
possible to obtain different defect types for devices of various structures. A dependence of
failure rates upon initial concentrations of defects and impurities makes these processes sensitive
to subtle differences in crystal growth techniques. Therefore, it is not prudent to assume that
failure rates will be universal across a particular technology.

The mechanisms behind degradation and failure are not fully understood; until a
better understanding of the causes and processes of failure is obtained,
reliability predictions cannot be made with a good degree of confidence.

10.2
LED reliability

Soon after the production of the first LEDs in the early 1960s, the reliability of
LEDs improved reasonably quickly; however, this was because of a general im-
provement in materials and fabrication techniques, with little understanding of the
basic factors determining LED reliability. The most obvious manifestation of LED
degradation is the gradual decrease in power output when the device is operated at
a constant current, i.e. the spontaneous efficiency decreases with time. Early life
test data were difficult to interpret because of the erratic manner in which device pa-
rameters deteriorated, and the considerable variability in the results from device to
device. A time to end of life³ has become the most favoured parameter for measur-
ing device reliability.
LED failure is a gradual process; the power output decreases with time, although
not necessarily in a well behaved manner. Although the failure of light emitting
devices has become considerably less erratic over the past years, there is a variation in
reliability between LEDs within a specific batch. A common approach is to
consider the lifetest data as a statistical distribution, and to use its characteristic
parameters to describe the device population as a whole.
In the case of the failure of semiconductor components, it appears that there are
no sound physical reasons for the validity of the lognormal distribution⁴ to
characterise LEDs, although it has been suggested that it could occur as a fundamental
consequence of diffusion processes with an Arrhenius type temperature dependence.
A common approach in characterising failure distributions is to find the mean time
to failure by assuming a failure distribution, and extrapolating from the first few
failures. The lognormal and Weibull distributions can be considerably different in
the tails of the distribution, so that large differences in predicted values of mean life
could result. An additional parameter - incorporating both mean and standard
deviation - is often used for components in a telecommunications system and is
known as the 2% reliability life; this is the time at which 2% of the population will
fail. (The requirement for high reliability LEDs comes mainly from the
telecommunications industry, where these devices should carry digital information
between 2 Mbit/s and several Gbit/s.) Therefore pulsed operation is a realistic
method of testing, as mechanisms associated only with pulsed operation can be
envisaged.
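For a lognormal failure distribution, the 2% reliability life follows directly from the median and shape parameter. A minimal sketch, with purely hypothetical batch parameters:

```python
import math
from statistics import NormalDist

def percent_life(median_h, sigma, fraction=0.02):
    """Time at which `fraction` of a lognormal failure population has
    failed: t = median * exp(sigma * z), z the standard normal
    quantile of `fraction`."""
    z = NormalDist().inv_cdf(fraction)
    return median_h * math.exp(sigma * z)

# Hypothetical batch: median life 10^6 h, lognormal shape sigma = 1.
# The 2% life sits far below the median, which is why it, and not the
# mean, is the parameter of interest for telecommunications use:
t50 = percent_life(1e6, 1.0, 0.50)  # the median itself, 1e6 h
t02 = percent_life(1e6, 1.0, 0.02)  # ~1.3e5 h
print(t50, t02)
```

A wider spread (larger sigma) pushes the 2% life down even when the median is unchanged, illustrating why distributions differing mainly in the tails predict very different reliability lives.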

3 However, this time clearly depends on what criteria are used to determine device failure. The end
of life of an LED is the time at which the power output has fallen to either 50%, or 1/e, of its
original value.
4 One method of understanding the implications of the lognormal distribution is to make comparisons
with the normal distribution. Whereas the normal distribution results from the additive effects of
random variables, the lognormal distribution should result when the random variables interact
multiplicatively.

The basic philosophy of accelerated lifetesting is to operate devices at different
degrees of overstress, in order to establish a relationship between device life and the
amount of overstress. Estimations of device life under normal conditions are then
obtained by extrapolation. Life testing of LEDs is carried out with current and
temperature being varied independently.
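For the temperature axis, extrapolation is commonly done with an Arrhenius acceleration factor. A sketch under assumed, illustrative conditions (the activation energy and temperatures are not taken from the text):

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K

def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between a stress temperature and
    the use temperature: AF = exp(Ea/k * (1/T_use - 1/T_stress)),
    with temperatures converted to kelvin."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_use_k - 1.0 / t_stress_k))

# Illustrative LED life test: assumed activation energy 0.7 eV,
# stress at 125 C, use at 40 C. 1000 h of stress then extrapolates
# to a few hundred thousand hours of use.
af = arrhenius_af(0.7, 40.0, 125.0)
print(af, 1000.0 * af)
```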
GaAs and GaAlAs devices can be of a low current density and low output
power configuration. High current density produces degradation phenomena, and the
significant amounts of heat generated have to be removed efficiently. Thus,
stable bonding is of prime importance. High temperature life tests have shown that
- under certain circumstances - the possibility of excessive formation of
intermetallics is removed. With less than 100 nm of gold on both surfaces, it has
been shown that stable indium based bonds can be produced [10.11][10.17].
For GaP, GaAsP and low current density GaAs LEDs, in the early 1970s various
workers predicted constant current room temperature lifetimes ranging from 10^5 h
up to 10^6 h or more⁵. This range is adequate for most - if not all - applications.
The early work in this area was very much concerned with the detrimental effects of
surface leakage, since this can result in a significant proportion of the current being
diverted from the radiative process. In addition, contaminants were also supposed to
adversely affect device performance. Stresses induced by contact alloying were
also thought to be important; this led to the use of alloyed contact areas of small
dimensions compared to the pn junction area itself. Life testing at elevated
temperature became necessary in order to begin to estimate failure times at room
temperature. Degradation of electroluminescence of GaP has been significantly
reduced by chemical passivation of surfaces.
In the case of GaP (Zn:O) LEDs, the dominant degradation mechanism has not
been completely clarified. Whatever the mechanism, lives longer than 10^6 h for operation
around room temperature are now claimed for some devices, but there is also a
current dependence of degradation.
GaP red LEDs are no longer commercially preferred - although they are
efficient LEDs - since most of their emission is in the infrared region. In addition,
their emission saturates at relatively low currents, due to the restricted Zn-O pair
population possible to achieve in practice in the diode material. In red GaAs_0.6P_0.4
LEDs, the zinc diffusion source type, arsenic overpressure, and n-type dopant and
concentration govern the initial quantum efficiency and degradation rate. It has been
suggested that native defects are generated during operation and that these are
responsible for degradation. The use of plastic encapsulation obviously limits the
temperature range of operation, and commercial specifications often give limits for
both storage and operation of -55°C to +100°C. Early problems associated with
encapsulation included bond wire failure due to differential thermal contraction, but
modern plastic encapsulants do not fail in this way.
The technology developed for GaP red diodes has been applied to the green
(GaP:N) variety; the lack of any apparent reliability problem has resulted in a
minimum of published literature in this area. It is known that green LED degra-
dation is associated with the growth of dislocation dipole structures, but the

5 The lifetime (as usual, 50% degradation defines the end of lifetime) of commercial
optoelectronic components using LEDs - made by one and the same manufacturer - may differ
considerably from batch to batch.

cause of the dipole growth is uncertain. Some investigations of green GaP LED degradation
arrived at the conclusion that the migration of interstitial zinc is responsible for the
slow degradation of the green emission [10.12].
A recommended screening criterion during LED chip fabrication is presented
in the following. From each batch of manufactured LED chips, a certain
number are mounted in TO-18 cases and tested for a relatively short time. When
the degradation is below a certain value - depending on the application - the chip
batch is approved. When the degradation is larger, the chip batch is rejected (and
will not be used in production).
Degradation processes in light emitters are - so far - not fully understood. How-
ever, for low visible and near infrared radiance LEDs, careful technology has
created fully acceptable products.

10.3
Optocouplers [10.22]...[10.29]

10.3.1
Introduction

A crucial problem is that of the current transfer ratio, CTR, changing with time. The
resulting change in the optocoupler's gain with time, ΔCTR = CTR_final − CTR_initial,
is referred to as CTR degradation⁶. This degradation must be accounted for, if a long,
functional lifetime of a system is to be guaranteed [10.5][10.6][10.7].

10.3.2
Optocouplers ageing problem

The main cause of CTR degradation is the reduction in efficiency of the LED
within the optocoupler. Its quantum efficiency - defined as the total photons per
electron of input current - decreases with time at a constant current. The LED cur-
rent consists primarily of two components: a diffusion current component⁷ and a
space-charge recombination current:

I_F(V_F) = A·e^(qV_F/kT) + B·e^(qV_F/2kT)    (10.1)

(the first term is the diffusion current, the second the space-charge recombination current)

where A and B are independent of V_F, q is the electron charge, k is Boltzmann's con-
stant, T is the temperature in kelvin, and V_F is the forward voltage across the
LED [10.8][10.9][10.10][10.17].
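Eq. (10.1) can be evaluated numerically to see how the two terms trade off; the prefactors A and B below are purely illustrative assumptions:

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q = 1.602176634e-19  # electron charge, C

def led_current_components(vf, a, b, temp_k=300.0):
    """The two components of Eq. (10.1): diffusion current
    A*exp(qV/kT) and space-charge recombination current B*exp(qV/2kT)."""
    vt = K_B * temp_k / Q  # thermal voltage kT/q, ~25.9 mV at 300 K
    return a * math.exp(vf / vt), b * math.exp(vf / (2.0 * vt))

# With illustrative prefactors A = 1e-18 A and B = 1e-12 A, the
# recombination term dominates at low forward voltage and the
# (radiative) diffusion term at high forward voltage; growth of B over
# time shifts the balance toward the non-radiative term.
d_lo, r_lo = led_current_components(0.6, 1e-18, 1e-12)
d_hi, r_hi = led_current_components(0.9, 1e-18, 1e-12)
print(d_lo < r_lo, d_hi > r_hi)  # True True
```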

6 Numerous studies have demonstrated that the predominant factor for degradation is the reduction of
the total photon flux emitted from the LED, which, in turn, reduces the device's CTR.
7 The diffusion current component is the important radiative current; the non-radiative current
is the space-charge recombination current.

Over time - at fixed V_F - the total current increases through an increase in the
value of B. From another point of view, with fixed total current, if the space-charge
recombination current increases - due to an increase in the value of B - then the
diffusion current, the radiative component, will decrease. The reduction in light
output through an increase in the proportion of recombination current at a specific
I_F is due to both the junction current density J and the junction temperature T_J. In any
particular optocoupler, the emitter current density will be a function not only of the
current required to produce the desired output, but also of the junction
geometry and of the resistivity of both the p and n regions of the diode. The junc-
tion temperature is a function of the coupler packaging, power dissipation and
ambient temperature. As with current density, high T_J will promote a more rapid
increase in the proportion of recombination current⁸ [10.2][10.9].

[Figure: block diagram of the optocoupler model — emitter quantum efficiency η, transmission K of the optical interface, photodetector responsiveness R, gain β of the output amplifier, and output current I_O.]

Fig. 10.4 System model for an optocoupler [10.1][10.7][10.9]

A useful model (Fig. 10.4) can be constructed to describe the basic opto-
coupler parameters which are able to influence the CTR. Any coupler can be mo-
delled in this fashion within its linear region. The same Fig. 10.4 shows the system
block diagram, which yields the relationship of input current I_F to output current I_O:

I_O = η(I_F, t) · K · R · β(I_P, t) · I_F    (10.2)

where K represents the total transmission factor of the optical path, generally con-
sidered a constant, as is R, the responsiveness of the photodetector, defined in terms
of electrons of photocurrent per photon; η is the quantum efficiency of the emitter,
defined as the photons emitted per electron of input current, and depends upon the
level of input current I_F and upon time. Finally, β is the gain of the output amplifier
and is dependent upon I_P, the photocurrent, and time. Temperature variations would,
of course, cause changes in η and β as well.
From equation (10.2), a normalised change in CTR, at constant I_F, can be ex-
pressed as in (10.3). The first term, (I), Δη/η, represents the major contribution to
ΔCTR, due to the relative emitter efficiency change; generally, over time, Δη is
negative. This change is strongly related to the input level I_F. The second term, (II),
represents a second order effect of a shift, positive or negative, in the operating
8 For this reason, it is important not to operate a coupler at a current in excess of the manufactu-
rer's maximum ratings.

point of the output amplifier as the emitter efficiency changes. The third term, (III),
is a generally negligible effect which represents a positive or negative change in the
output transistor gain over time. The parameters K and R are constants.

ΔCTR/CTR = (Δη/η) + (Δη/η)·(∂ln β/∂ln I_P) + (Δβ/β)    (10.3)
              (I)        (II)                   (III)

(all terms evaluated at constant I_F)
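Equation (10.3) is straightforward to apply once the three terms are known; the numbers below are purely illustrative (a hypothetical 10% efficiency loss and an assumed amplifier gain slope):

```python
def delta_ctr_rel(d_eta_rel, dln_beta_dln_ip, d_beta_rel=0.0):
    """Normalised CTR change per Eq. (10.3), at constant I_F:
    (I)   relative emitter efficiency change,
    (II)  second-order shift of the amplifier operating point,
    (III) direct change of the output transistor gain."""
    term_i = d_eta_rel
    term_ii = d_eta_rel * dln_beta_dln_ip
    term_iii = d_beta_rel
    return term_i + term_ii + term_iii

# A 10% drop in emitter efficiency with an amplifier gain slope
# d(ln beta)/d(ln I_P) = 0.5 and negligible direct beta change costs
# the coupler about 15% of its CTR:
print(delta_ctr_rel(-0.10, 0.5))  # ~ -0.15
```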

10.3.3
CTR degradation and its cause

It is an established fact that the total photon flux emitted by an optoelectronic device diminishes slightly over the operating lifetime of the device⁹. Barring catastrophic failures or overstressing of the optoelectronic device, this change of photon emission is almost imperceptible for many tens of thousands of hours in visual applications, but can be measured with a sensitive photodetector. At lower stress currents, the change of light output versus time is reduced. CTR degradation is important because an excessive amount of degradation or a badly designed system can cause a reduction in performance and eventual system failure unless an allowance is made for it [10.7].
Potential causes of CTR degradation are a reduction in efficiency (η) of the emitter, a decrease in the transmission of the optical path (K), a reduction in responsivity (R) of the photodetector, or a change in gain (β) of the output amplifier. It is generally accepted that the overwhelming influence on ΔCTR is the time dependent reduction in the radiated output of the LED. The recorded ΔCTR can be appreciably influenced by the choice of measurement conditions. Also, since the gain of the output amplifier (β) is related to its input current, CTR degradation may be magnified by the change in β due to a decrease in photocurrent (Ip) caused by a reduction in η.
There are a number of factors which influence the amount of degradation associated with the diode. In general, however, degradation is a result of electrical and thermal stressing of the pn junction. Combinations of IFS (stress current in the LED) and tamb (ambient temperature) will produce a spectrum of ΔCTR values

⁹ This change is often referred to as a degradation of light output, although in some instances the light output of a LED has actually increased over time. An optically coupled isolator is an optoelectronic emitter/detector pair. Any degradation of light output of the emitter will cause a change in the apparent gain of the entire device. The change in gain of the isolator can be expressed as a change in CTR over time and is commonly called CTR degradation. This term is now widely used to describe the phenomenon, and the study of factors influencing it has grown considerably in recent years. Semiconductor manufacturers, for their part, are at pains to point out that the term "degradation" in the above text does not imply that their product is either poorly designed or of inferior quality, but rather that the process of "degradation" is an inherent characteristic of junction electroluminescence.

Fig. 10.5 Effect of varying the stress to monitor ratio (M) on CTR. Stress conditions: IFS = 60mA (device max. rating) at 25°C; test duration: 4000h; M = 1; 5; 10; 50; 100 (ΔCTR(%) plotted against M, with curves for 1kh, 3kh and 4kh of stress)

throughout the stress duration¹⁰. It is emphasised here that the overall degradation cannot be totally accounted for by the monitor ratio M = IFS/IFM; the stress level (IFS) contributes to the total picture, too, making it impossible to completely isolate the effect of varying M alone. The plots of Fig. 10.5 are intended to give general trends in behaviour, to enable the designer to appreciate the approximate effect of varying the monitor current. The M values chosen ranged from 1 to 100.
Some interesting conclusions may be drawn from the curves in Fig. 10.5. Note how the degradation measured at high M values (typically with IFM = 2mA) is "relatively" independent of time on test.
Assume that the degradation mechanism establishes a resistive path in parallel with the active pn junction. Any current flowing in this resistive shunt will not generate light. At low IFM values, this alternative path may have appreciable impact on the total device performance, as it offers a low resistance path to substantial amounts of current. As the current increases, however, the low resistance forward biased pn junction draws the major proportion of the total current and the impact of the secondary path is considerably reduced. Using this model, we can understand how a reduced light output is seen at low IFM currents, when a sizeable percentage of the LED drive is deflected in this way.
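The shunt-path picture above can be made quantitative with a toy model. The sketch below is a hypothetical illustration, not a characterised device: the diode parameters and the 1 kΩ shunt are assumed values, chosen only to reproduce the qualitative trend that a large share of a small monitor current IFM is lost in the shunt, while at the stress current almost all of the drive crosses the junction.

```python
# Hypothetical parallel-shunt model: a degradation-induced resistive path in
# parallel with the pn junction. All component values are assumed.
import math

def junction_current(i_total, r_shunt, i_s=1e-12, n=2.0, vt=0.0259):
    """Split i_total between an ideal diode and a parallel shunt resistor.
    Solves I_d(V) + V/r_shunt = i_total for the junction voltage V by bisection,
    then returns the current actually crossing the junction."""
    lo, hi = 0.0, 2.0
    for _ in range(100):
        v = 0.5 * (lo + hi)
        i = i_s * (math.exp(v / (n * vt)) - 1.0) + v / r_shunt
        if i < i_total:
            lo = v
        else:
            hi = v
    v = 0.5 * (lo + hi)
    return i_total - v / r_shunt

for i_fm in (2e-3, 60e-3):            # low monitor current vs. stress current
    frac = junction_current(i_fm, r_shunt=1e3) / i_fm
    print(f"IF = {i_fm * 1e3:.0f} mA: {frac:.1%} of the drive reaches the junction")
```

With these assumed values, roughly half of a 2 mA monitor current is deflected into the shunt, while at 60 mA the junction takes nearly all of the drive, matching the trend described in the text.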

10.3.4
Reliability of optocouplers

Reliability is something that must be "built in", not "tested in". Through proper
design and process control this can be accomplished, thereby reducing the task of
screening programmes which attempt to eliminate the lower tail of the distribution.

¹⁰ GaAs can display considerable lot-to-lot variations; the individual diode chips themselves reflect only a small fraction of a single wafer of GaAs, and each wafer may have a range of various physical/electrical characteristics across its surface. The choice of measuring conditions used to monitor the amount of degradation incurred during a particular stress test also has a considerable impact on the results.

One of the major inspection points in the wafer processing area is the light output test of each light emitting diode; the major inspection points in the assembly area are the die attach and the wire bond. For the forty year life of telecommunications products, the optimal reliability screen would consist generally of 20 temperature cycles (-65°C to +150°C), followed by variables data read and record, followed by a 16 hour burn-in (at IF = 100mA, VCE = 20V, IC = 15mA, Tambient = +25°C), followed by a variables data read and record. Screening limits are based on both

Fig. 10.6 IRED output versus time slope prediction curves (lower decile slope of IRED, % per decade of time, plotted against the bias current ratio IFS/IFM), assuming a virtual initial time of 50 hours

parametric shift and value. In our experience, temperature cycling is a more effective screen than stabilisation bake.
Our experience indicates two major problems that must be addressed in the design of optoelectronic devices utilising IREDs¹¹ and phototransistors: the temperature coefficient of expansion of unfilled clear plastics is much greater than that of the other components, and their glass transition temperature is low, requiring a reduced temperature range of operation and stronger mechanical construction to maintain reasonable device integrity; and some clear plastics build up mechanical stress on the encapsulated parts during curing. This stress has been linked to rapid, inconsistent degradation of IRED light output. Although a filled plastic would stop these phenomena, the filler also spoils the light transmission properties of the plastic.
The "preconditioning" is usually understood to be a stress test (or a combination of stress tests) applied to devices (i.e. high temperature storage, operating life, storage life, blocking life, humidity life, HTRB, temperature cycles, mechanical sequence - which includes solderability -, etc.), after which a screening criterion is applied to separate good units from bad ones. This criterion may be any combination of the absolute value and parameter shift levels agreed to by the involved parties.
Since the optocoupler is a hybrid circuit, it is normal that the MTBF is lower than for TTL. It is extremely difficult to find an epoxy (between LED and detector)

¹¹ Work on performance degradation has been done to improve GaAs performance and to match that performance with GaAlAs, a newer, more difficult material (Fig. 10.6) [10.8][10.9].

which is transparent and which at the same time perfectly matches the bonding wires. Most catastrophic failures are due to thermal stress between epoxy and bonding wires.
The decrease in quantum efficiency of LEDs is the main reason for CTR degradation of optocouplers. Other - less important - causes of CTR degradation are a decrease in the transmission of the transparent epoxy, a change in sensitivity of the photodetector and a change in gain of the output amplifier. It is now known that the rate of CTR degradation is influenced by the materials and processing parameters used to manufacture the LED, and by the junction temperature of the LED in addition to the current density through the LED. Several tests have been performed to find a law of degradation. Some laboratories derived the following formula:
teff = K · C · (JF)^(-n) · e^(E/kTJ)   (10.4)
where:
teff = x percent of the optocouplers have a CTR of less than m times the initial CTR after teff hours of operating time;
C = constant, depends on technology;
JF = current density in the diode (A/cm²);
n = exponent, depends on technology;
E = activation energy of the degradation mechanism (eV);
k = Boltzmann's constant (8.62 × 10⁻⁵ eV/K);
TJ = junction temperature of the diode (K);
K = correction factor; depends on the current at which the CTR is measured (CTR degradation increases when this current decreases).
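As a hedged numerical sketch of the degradation law (10.4): the constants below (C, n, E and the correction factor K) are technology dependent, and the figures used here are assumptions chosen for illustration only, not values derived from any test.

```python
# Sketch of the degradation law (10.4); all parameter values are assumed.
import math

K_BOLTZMANN = 8.62e-5    # Boltzmann's constant, eV/K

def t_eff(j_f, t_j, c=0.25, n=1.5, e_act=0.6, k_corr=1.0):
    """Effective lifetime in hours (time after which x percent of couplers
    show a CTR below m times the initial value) for current density j_f
    (A/cm2) and junction temperature t_j (K)."""
    return k_corr * c * j_f ** (-n) * math.exp(e_act / (K_BOLTZMANN * t_j))

# Lifetime falls with rising current density and with rising junction
# temperature (on the order of 1e5 hours with these assumed values):
print(t_eff(100.0, 350.0), t_eff(200.0, 350.0), t_eff(100.0, 400.0))
```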
Another well known problem is that of intermittently open circuit devices (identi-
fied as thermal opens). In its simplest form, the thermal intermittent results from a
combination of an initially weak bond, acted upon by forces originating from the
thermal mismatch of the constituents of the encapsulating medium. That is why
many quality checks were introduced by manufacturers during the fabrication
process, as well as multiple screenings at elevated temperatures (i. e. 100°C for
thermal continuity, on a 100% basis) on the finished product. The data generated to date indicate an outgoing quality better than 0.15% for intermittents (if all production is temperature cycled during manufacture with the aim of removing weak mechanical bonds).
The solderability (normally the lead frame is an Alloy 42, comprising 42% Ni and 58% Fe) is checked several times daily during the production process, and - for special customers - these tests are routinely performed.
lead frame affects only that part of the frame which is enclosed by the encapsulant.

10.3.5
Some basic rules for circuit designers

a) Decrease the real operating time for the optocoupler.


b) Decrease the operating diode current and the ambient temperature.
c) Avoid peak transient currents.
d) Reliability can be increased by a suitable burn-in; avoid damaging the devices by remaining below the absolute maximum ratings.

e) Design the circuit for a CTR below the minimum specified CTR.
f) Allow a ±30% drift of the coupling factor during operation.
Optocouplers are relatively reliable products when one is aware of CTR degradation while designing a circuit. A well designed circuit should allow for CTR degradation, as well as consider the worst case effects of temperature, component tolerance, and power supply variations. On the whole, the mechanisms behind degradation and failure of optoelectronic devices are not yet fully understood.
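Rules (e) and (f) amount to a worst-case stack-up when selecting a part. The sketch below shows one way to size the minimum initial CTR; the 10% temperature and 5% supply margins are purely assumed figures, not manufacturer data.

```python
# Worst-case sizing sketch for design rules (e) and (f); margins are assumed.

def required_min_ctr(ctr_needed, drift=0.30, temp_margin=0.10, supply_margin=0.05):
    """Smallest initial (begin-of-life) CTR that still guarantees
    `ctr_needed` at end of life under the stacked worst-case margins:
    a `drift` coupling-factor loss, plus temperature and supply tolerances."""
    worst_case_factor = (1 - drift) * (1 - temp_margin) * (1 - supply_margin)
    return ctr_needed / worst_case_factor

# Circuit needs an effective CTR of 0.2 (20 %) to saturate the output stage:
print(f"select a part with minimum specified CTR >= {required_min_ctr(0.2):.2f}")
```

Stacking the margins multiplicatively is the pessimistic choice; a designer may instead combine statistically independent tolerances less conservatively.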

10.4
Liquid crystal displays

Liquid crystal displays (LCDs) differ from other types of displays in that they scatter - rather than generate - light. Two basic types are available: reflective (which require front illumination) and transmissive (which require rear illumination). A third type - the transflective - combines the properties of the other two and operates either by reflection of front-surface light or by illumination from the rear. All of these types of LCDs use a cell filled with liquid crystal material¹².

Fig. 10.7 Optical response curve of a liquid crystal cell (optical response in percent versus root mean square voltage). Vth = threshold voltage (threshold at which response is 10% of maximum); Vsat = saturation voltage (voltage at which response is 90% of maximum)

¹² A liquid crystal material is an organic compound (containing carbon, hydrogen, oxygen, and nitrogen) that has the optical properties of solids and the fluidity of liquids. In the liquid crystal state - exhibited over a specific temperature range - the compound has a milky, yellow appearance. At the high end of the temperature range, the milky appearance gives way to a clear liquid; at the low end of the range, the compound turns to a crystalline solid. The molecules of a liquid crystal compound are in the form of long, cigar-shaped rods. Because of the special grouping of the atoms that form these molecules, the rods act as dipoles in the presence of an electrical field. This field-effect characteristic enables the molecules to be aligned in the direction of the electrical field, and provides the basis for the operation of an LCD.
10 Reliability of optoelectronic components 325

The optical response of a liquid crystal cell is shown in Fig. 10.7. When a voltage greater than Vsat is applied between a segment contact and the backplane contact, molecules in the liquid crystal material twist to align themselves with the electric field in regions of segment and backplane overlap, turning the segment on. The optical response is the same whether the segment voltage is positive or negative with respect to the backplane.
DC operation causes electrochemical reactions which reduce the life of an LCD; it is therefore customary to drive the display with AC waveforms having minimised DC components. Frequently, these are square waves in the range of 25Hz to 1kHz. The LCD responds to the rms value of the applied voltage.

10.4.1
Quality and reliability of LCDs

LCDs are rugged devices and will provide many years of service when operated
within their rated limits. The limiting factor in LCD life is the decomposition of the
organic liquid crystal material itself, either through exposure to moisture, prolonged
exposure to ultraviolet light or to chemical contaminants present within the cell.
The designs of some LCD manufacturers eliminate these failure modes:
• by providing a hermetic cell incorporating glass to glass and metal to glass
seals;
• by using a liquid crystal that is relatively insensitive to UV light and by incorporating a UV screen in the front polariser;
• by specifying and maintaining a high degree of chemical purity during the
synthesis of the liquid crystal, and during subsequent display manufacturing
steps.
A high temperature humid environment will cause gradual loss of contrast over a period of time, due to degradation of the polarisers. If displays are to be operated or stored at temperatures >50°C and humidity higher than 60% RH for extended periods of time, the user should contact the LCD manufacturer for more specific information.
The price of LCDs bears little relation to the number of digits or complexity of
the information displayed, but is more related to glass area. It is to the customer's advantage not only to reduce glass area in his design, but - where possible - to utilise standard display external glass sizes, thereby reducing custom display development costs.
Today's reliability level (MTBF) of enhanced LCDs ranges from 50 000h up to values of 100 000h or more (Fig. 10.8).
It should be remembered that one of the first LCD applications was the electronic watch, marked by two essential characteristics: (i) The normally imposed LCD lifetime - without maintenance intervention (except the battery replacement) - is approximately 50 000h (>8.5 years), an unusual value, demanded only of high performance industrial products. (ii) Expensive watches are considered as jewels, for which the aesthetic aspect has a primordial role. That is why very

small optical defects (i.e. small air bubbles, with no functional influence) are considered as valid reasons for rejection, in other words as failure signs.

Fig. 10.8 LCD failure rate λ as a function of the time t (log-log plot, λ from 10⁻⁷ to 10⁻⁴ 1/h, t from 250h to 100 000h); typical lifetime: 50 000h, λ ≤ 10⁻⁶/h for Us = 5V, Tamb = 25°C

From a reliability point of view, the principal question is to know how the technical properties (especially the optical properties) of LCDs change depending on the ambient conditions and over the lifetime. The specialised literature gives only very few answers to this question, but recently new, more stable liquid crystal materials have been synthesised, and the quality and the reliability of LCDs have been improved.
Generally, we distinguish two types of failure modes: sudden failures and long term degradation failures. The first are normally associated with the blackout of the LCD (short-circuits, opens, mechanical failures concerning the tightness, etc.); the second induce an increased power consumption, loss of alignment, reduction of the isotropic transition temperature, change of the response speed, and aesthetic defects (loss of contrast, bubbles, etc.) [10.15].
To estimate the lifetime of LCDs, the following methods are utilised:
• lifetime test (+50°C at 85% RH);
• storage test at +25°C, +50°C, and -20°C, without controlling the humidity;
• thermal shock;
• high temperature test (+50°C), without controlling the humidity.
One of the arbitrary failure criteria utilised is a 100% increase of the AC current absorbed. The results of such tests - performed beginning with the year 1972 - reached the conclusion that the expected LCD lifetime is greater than 50 000h (≈10 years), with a failure rate of ≈10⁻⁷/h (at 3V / +25°C).
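As a rough cross-check of the figures above, one can assume a constant failure rate (exponential model) during the useful-life region; the sketch below is a back-of-envelope calculation, not a qualification result.

```python
# Back-of-envelope survival check under a constant-failure-rate assumption.
import math

def survival(lmbda_per_h, hours):
    """R(t) = exp(-lambda * t) for the exponential (constant failure rate) model."""
    return math.exp(-lmbda_per_h * hours)

# With lambda = 1e-7 per hour, the probability of surviving a 50 000 h
# service life is about 99.5 %:
print(f"R(50 000 h) = {survival(1e-7, 50_000):.4f}")
```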

References

10.1 Băjenescu, T. I. (1993): Degradation and reliability problems of optocouplers. Proc. of Annual Semiconductor Conference CAS '93, Sinaia (Romania);
Băjenescu, T. I. (1995): CTR degradation and ageing problem of optocouplers. Proc. of the fourth international conference on solid-state and integrated-circuit technology, Beijing (China), October 24-28, 1995, pp. 173-175;
Băjenescu, T. I. (1996): Fiabilitatea componentelor electronice (Reliability of Electronic Components). Publishing House Editura Tehnică, Bucharest
10.2 Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. VDE Verlag, Berlin
10.3 Newman, D. H.; Ritchie, S. (1981): Reliability and degradation of lasers and LEDs. In: Howes, M. J.; Morgan, D. V. (eds.): Reliability and Degradation. J. Wiley & Sons, Chichester
10.4 Bergh, A. A.; Dean, P. J. (1976): Light-emitting diodes. Clarendon Press
10.5 CNET: Specifications STC 968-352111 et 2, edn. 2b, Fascicules I et II
10.6 Băjenesco, T. I. (1982): Le CNET et les tests de fiabilité des photocoupleurs. L'indicateur industriel no. 4, pp. 23-27
10.7 Sahm, W. H. (1976): General Electric optoelectronics manual. General Electric, Syracuse, New York (USA)
10.8 Băjenescu, T. I. (1984): Zuverlässigkeit von LED- und FK-Anzeigen. Elektronik-Applikation H. 8/9, pp. 26-31
10.9 Gage, Stan I. (1979): HP optoelectronics applications manual supplement. Hewlett-Packard; Optoelectronics applications manual (1977), Hewlett-Packard
10.10 Howes, M. J.; Morgan, D. V. (1981): Reliability and degradation. Wiley & Sons, Chichester
10.11 Plumb, R. G. et al. (1979): Thermal impedance aging characteristics of CW stripe lasers. Solid State and Electron Devices, vol. 3, pp. 206-209
10.12 Kaneko, K. (1976): Degradation of GaP green LEDs. Japan. J. Appl. Phys., vol. 15, pp. 1287-1296
10.13 Amerasekera, E. A.; Campbell, D. S. (1987): Failure mechanisms in semiconductor devices. J. Wiley & Sons, Chichester
10.14 Zippel, C. L. et al. (1982): Competing processes in long-term accelerated ageing of double heterostructure GaAlAs light emitting diodes. J. Appl. Phys., vol. 53, pp. 1781-1786
10.15 Donati, M.; Wullschleger, J. (1979): Lebensdauerprüfungen an BBC-Flüssigkristallanzeigen. Brown Boveri Mitteilungen, vol. 66, no. 1, pp. 54-55
10.16 IEEE Trans. on El. Devices (1982): Special Issue on Optoelectronic Devices. ED-29, pp. 1355-1490
10.17 Ueda, O. (1996): Reliability and degradation of III-V optical devices. Artech House, Inc., Norwood, MA
10.18 Kanatani, Y.; Ayukawa, M. (1995): LCD technology and its application. Proc. of the fourth internat. conf. on solid-state and integrated-circuit technology, Beijing (China), October 24-28, pp. 712-714
10.19 Zhu, Q. et al. (1995): Color array in TFA technology. Proc. of the fourth internat. conf. on solid-state and integrated-circuit technology, Beijing (China), October 24-28, pp. 727-729
10.20 Du, J. F. et al. (1995): Hydrogenated amorphous silicon PIN photodiode for optically addressed spatial light modulators. Proc. of the fourth internat. conf. on solid-state and integrated-circuit technology, Beijing (China), October 24-28, pp. 733-735
10.21 Addington, J. et al. (1995): Hybrid integrated optoelectronics package for FO receivers and transmitters. Proc. of the fourth internat. conf. on solid-state and integrated-circuit technology, Beijing (China), October 24-28, pp. 157-159
10.22 Băjenesco, T. I. (1975): Sur la fiabilité des photocoupleurs. Conference at l'Ecole Polytechnique Fédérale de Lausanne (EPFL), November
10.23 Băjenesco, T. I. (1982): Le C.N.E.T. et les tests de fiabilité des photocoupleurs. L'Indicateur Industriel (Switzerland) no. 9 (1982), pp. 15-19
10.24 Băjenescu, T. I. (1984): Optokoppler und deren Zuverlässigkeitsprobleme. Aktuelle Technik (Switzerland), no. 3, pp. 17-21
10.25 Băjenescu, T. I. (1994): Ageing Problem of Optocouplers. Proc. of Mediterranean Electrotech. Conf. MELECON '94, Antalya (Turkey), April 12-14
10.26 Băjenescu, T. I. (1995): Particular Aspects of CTR Degradation of Optocouplers. Proceedings of RELECTRONIC '95, Budapest (Hungary)
10.27 Bâzu, M. et al. (1997): MOVES - a method for monitoring and verifying the reliability screening. Proc. of the 20th Int. Semicond. Conf. CAS '97, October 7-11, Sinaia, pp. 345-348
10.28 Băjenescu, T. I.; Bâzu, M. (1999): Semiconductor devices reliability: an overview. Proc. of the European Conference on Safety and Reliability, Munich, Garching, Germany, 13-17 September, Paper 31
10.29 Ueda, Osamu (1996): Reliability and Degradation of III-V Optical Devices. Artech House, Boston and London
11 Noise and reliability

11.1
Introduction

Much work has been carried out in the past to study the various types of (low-
frequency excess) noise sources as they commonly occur in silicon planar transis-
tors used in monolithic integrated circuits. Some examples of such noise sources
are presented in the following.

• Shot noise:
in metal-semiconductor diodes, pn junctions, and transistors at low injection;
in the leakage currents of FETs;
in light emission of luminescent diodes and lasers.
• Noise due to recombination and generation in the junction space-charge region, high-level injection effects (including noise in photodiodes, avalanche diodes, and diode particle detectors).
• Thermal noise and induced gate noise in FETs.
• Generation-recombination noise in FETs and transistors at low temperatures.
• Noise due to recombination centres in the space-charge region(s) of FETs, and
noise in space-charge-limited solid-state diodes.
• 1/f - or flicker - noise in solid-state devices in terms of the fluctuating occupancy of traps in the surface oxide.
• Contact or low frequency noise.
• Popcorn noise (also called burst noise) in junction diodes and transistors, and
kinetics of traps in surface oxide.
• Microplasma noise.
• Random noise.
• Flicker noise in junction diodes, transistors, Gunn diodes and FETs.
• High-injection noise.
• Excess low-frequency noise.
• Bistable noise in operational amplifiers.
• Pink noise.
The theory of the low-frequency noise of the bipolar junction transistor arose many years ago and has remained essentially unchanged since its conception.
Unlike the other noise sources, popcorn noise is due to a manufacturing defect and can be eliminated by improving the manufacturing process (e.g. X-ray

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999
330 Noise and reliability

examination of transistor wafers showed that the total number of defects increases
with the incident implantation energy). The noise consists typically of random
pulses of variable length and equal height, but sometimes the random pulses
seemed to be superimposed upon each other (Fig. 11.1).

Fig. 11.1 Typical burst noise observed at the collector of a transistor [11.16]

This noise is caused by a defect in the semiconductor junction, usually a metallic impurity. The width of the noise bursts varies from microseconds to seconds. The repetition rate - which is not periodic - varies from several hundred pulses per second to less than one pulse per minute. For any particular sample of a device, however, the amplitude is fixed, since it depends on the characteristics of the junction defect. Typically, the amplitude is from 2 to 100 times the thermal noise. The power density of popcorn noise has a 1/f² characteristic; since the noise is a current-related phenomenon, popcorn noise voltage is greatest in a high impedance circuit, for example the input circuit of an operational amplifier. The source of the burst noise is not so clear at present, but it seems to be associated with shallow, heavily doped emitter junctions. It is believed that the appearance and disappearance of pulses are associated with a single trap in the space-charge region.
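The random telegraph character of popcorn noise is easy to visualise with a toy simulation. The sketch below is purely illustrative (the switching probability, amplitude and seed are assumed); it is not a physical trap model, only a generator of the two-level waveform such a trap produces.

```python
# Illustrative two-level random telegraph (burst noise) waveform; a single
# trap switches the current between a base level and base + fixed amplitude.
# All parameters are assumed values.
import random

def telegraph(n_steps, p_switch=0.01, amplitude=1.0, seed=1):
    """Two-state burst-noise sample path; a per-step switch probability
    p_switch gives geometrically distributed dwell times in each state."""
    rng = random.Random(seed)
    state, path = 0, []
    for _ in range(n_steps):
        if rng.random() < p_switch:
            state = 1 - state          # trap captures / emits a carrier
        path.append(state * amplitude)
    return path

path = telegraph(100_000)
print("fraction of time in the burst state:", sum(path) / len(path))
```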
An ancient and permanent desire of electronics engineers has been to find a practical method for predicting the life expectancy of a transistor by correlating the low-frequency noise and the reliability¹. For many causes of failure, the method described in 11.4.1 makes it possible to obtain the functional reliability by a low load, short period flicker noise measurement, and permits the elimination of unreliable specimens.

11.2
Excess noise and reliability

Extensive studies on silicon bipolar transistors [11.1]...[11.4] have shown that noise phenomena can be classified in two categories: normal and excess noise. The first includes the thermal and shot noises, the second the flicker (or 1/f), the microplasma, the generation-recombination and the burst noises. It is an old assumption (partly verified [11.5]...[11.7]) that excess noise could give some information about

¹ γ radiation is shown to increase the low-frequency noise level in linear bipolar devices, while it tends to cause latch-up of CMOS ICs; X-rays are found to affect MOS devices to a greater extent than bipolar ICs as a result of the development of positive charges in the oxide layer, causing a threshold voltage shift. GaAs devices - because they are majority carrier devices - are relatively radiation hard when compared to silicon devices [11.37].

the reliability of electronic devices. An example of the useful information obtained from intermittence studies is the fact that the superposition theorem is invalid in some cases when dealing with multilevel burst noise. It has been found [11.11] that sometimes the presence of one level of burst noise in a device excludes the presence of another.

11.3
Popcorn noise

Popcorn noise - also called burst noise - was first discovered in semiconductor diodes and has recently reappeared in integrated circuits [11.8]...[11.11]. If burst noise is amplified and fed into a loudspeaker, it sounds like corn popping; hence the name popcorn noise. It is a curious and undesirable noise phenomenon that can plague the normal operation of pn junction devices. Popcorn noise is characterised by collector current fluctuations, generally having the aspect of a random telegraph wave, but sometimes different levels of current pulses can be observed. It may appear or disappear spontaneously or under particular stress conditions; it does not occur on all devices manufactured from the same wafer, nor does it occur on all wafers in a given production lot².
Popcorn noise was first discovered in early 709 type operational amplifiers. Essentially it is an abrupt, step-like change in offset voltage (or current) lasting for several milliseconds and having an amplitude from less than one microvolt to several hundred microvolts. Occurrence of the pops is quite random - an amplifier can exhibit several pops per second during one observation period and then remain popless for several minutes. Worst case conditions are usually at low temperatures with high values of source resistance Rs. Some amplifier designs and the products of some manufacturers are notoriously bad in this respect.
Some theories were developed about the popcorn mechanism. In [11.2] and [11.4] the authors arrived at the conclusion that the burst phenomenon is located near the surface of the emitter-base junction. In 1969, Leonard and Jaskolski [11.23] postulated that the random appearance and disappearance of microplasmas in the reverse-biased collector-base junctions of transistors would produce step-like changes in the collector current. However, Knott [11.24] claimed in 1970 that burst noise was the result of a mechanism arising in the emitter-base junction, and not in the collector-base junction. In 1971, Oren [11.22] reported that it would be premature, without further study, to rule out either of the aforementioned models. A closer look indicates that different mechanisms are indeed at play (e.g. modulation of leakage current flowing through defects located in the emitter-base space-charge region; surface problems; metal precipitates; dislocations) and a unique answer is not yet available. Roedel and Viswanathan [11.12] observed that in the Op. Amp. 741 there was a very strong correlation between the intensity of the burst noise and the density of dislocations on the emitter-base junction. Martin and Blasquez [11.14]

² We have checked the percentage of burst noise incidence in relation to the position of the units on the wafer (central versus peripheral), and the results show a larger incidence rate for the peripheral devices.

arrived at the conclusion that noise is a good means of characterisation for surface parameters (when surface effects are predominant in the degradation process), but burst noise is not as good an indicator as flicker noise. In [11.25] it has been found that low frequency excess noise comprises two components: 1/f noise and burst noise.
Although there are various theories on the popcorn mechanism, it is known that devices with surface contamination of the semiconductor chip will be particularly bad poppers. Advertising claims notwithstanding, the authors have never seen any manufacturer's op amp that was completely free of popcorn noise. Some peak detector circuits have been developed to screen devices for low amplitude pops, but 100% assurance is impossible because an infinite test time would be required. Some studies have shown that spot noise measurement at 10Hz and 100Hz, discarding units that are much higher than typical, is an effective screen for potentially high popcorn units. Screening can be performed, but it should be noted that the confidence level of the screen could be as low as 60%.
Burst noise has been observed in planar silicon and germanium diodes and transistors. It is believed that a current pulse is caused by a single trapping centre in the space-charge region. The proportion of transistors affected by popcorn noise varies between 25% and 70% [11.12], depending on the type. The physical origin of burst noise has been described to be the current fluctuations generated in the vicinity of macroscopic crystalline defects or dislocations in the emitter-base junction surface region [11.13], but a controversy regarding the mechanism and origin of popcorn noise still exists [11.14]. Several experiments show that burst noise is an intermittent large-scale recombination; its rate of occurrence depends on mechanical stresses. Moving dislocations acting as large-scale recombination centres explain the burst noise characteristics. From one experiment, the cause of dislocation motion seems to be the momentum transfer from the emitter current [11.15][11.16]. Measurements showed that the percentage of transistors having popcorn noise is dependent on the implantation energy. X-ray examination of these transistor wafers showed that the total number of defects increases with the incident implantation energy. From these experimental results [11.17] one can conclude that the defects induced by ion implantation cause popcorn noise.
The estimation and prediction of the reliability of an electronic device is
becoming more dependent on the variations in the characteristics of the device due
to stress. The stresses which magnify the degradation of components are
temperature, humidity, pressure, vibration, shock and electrical bias. It is widely
believed that burst noise tends to decrease as the temperature is raised. Observing a
distinct popcorn noise over a large portion of transistors of the same sample points
out a poor quality of the semiconductor crystal or oxide layer and consequently a
defective fabrication process³). Obviously, a time variation of the excess noise
amplitude indicates evolutive defects.

³) In [11.11] it was determined that - in order to reduce burst noise - one or more of the following
steps had to be accomplished: a) remove or neutralise the recombination-generation centres; b)
remove the metal atoms from the crystal, or at least prevent them from precipitating at the
junction; c) reduce or eliminate the surface junction dislocations. The first step was abandoned
because of the impossibility of removing all bulk and surface trapping centres.

11.4
Flicker noise

All solid-state devices show a noise component with a 1/f^n spectrum, where n ≈ 1.
This type of noise is known as flicker noise or 1/f noise. It has been demonstrated
that this 1/f noise spectrum holds down to extremely low frequencies; Firle and
Winston [11.14] have measured 1/f noise at 6·10⁻⁵ Hz. Experiments made by Plumb
and Chenette [11.21] indicated that flicker noise in transistors can be represented
by a current generator i_f in parallel with the emitter junction. Theoretically, a
partially correlated current generator i_n in parallel with the collector junction may
be used, but careful experiments have shown that its effect is so small that it can be
neglected.
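Since the spectrum is S(f) = K/f for n = 1, the power contained in any frequency decade is the same (the integral of K/f from f₁ to 10·f₁ equals K·ln 10 for every f₁), which is consistent with the spectrum extending down to such extremely low frequencies. A quick numerical check, with an arbitrary constant K:

```python
import math

def flicker_power(K, f1, f2):
    """Power of a 1/f spectrum S(f) = K/f integrated from f1 to f2."""
    return K * math.log(f2 / f1)

K = 1e-12  # arbitrary spectral constant (illustrative)
p_low = flicker_power(K, 6e-5, 6e-4)   # one decade at very low frequency
p_mid = flicker_power(K, 10.0, 100.0)  # one decade at typical measurement frequencies
assert math.isclose(p_low, p_mid)      # equal power per decade
```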
In normal operating conditions, the excess noise consists essentially (over the
whole low-frequency range) of flicker and burst noise; these may be represented by
two equivalent current generators connected between the input terminals of the
transistor (Fig. 11.2).

Fig. 11.2 Equivalent current generators

11.4.1
Measuring noise

Noise measurements are usually done at the output of a circuit or amplifier, for two
reasons: (i) the output noise is larger and therefore easier to read on the meter; (ii) it
avoids the possibility of the noise meter upsetting the shielding, grounding or bal-
ancing of the input circuit of the device being measured.
In order to make the excess noise predominant, we have utilised the
HTRB step stress test (one week storage; starting temperature 150°C; 25°C/step)
followed by 24h stabilisation at normal ambient temperature, with shorted
junctions⁴. This makes it possible to select high reliability transistors by a prior
noise measurement; the selection principles are: (a) acceptance only of transistors
with a low flicker noise level; (b) rejection of entire lots having a significant

4 The testing of a sample is stopped and a failure analysis made when 50% of the transistors
show a DC current gain higher than 50% of the initial gain. The transistor under test must be
biased across a large external base resistor and the measurement made at 30Hz. For a valid
comparison, the emitter voltage must be kept at the same value and the noise must be measured
with a constant base current [11.15].

proportion of elements with burst noise; (c) rejection of lots having a high
average value of the flicker noise spectral density (Fig. 11.3 and Table 11.1).
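The lot-level decision implied by principles (b) and (c) can be sketched as below; the 2·10⁻²¹ A²/Hz flicker-density limit is the one quoted in Fig. 11.3, while the admissible burst-noise proportion is an illustrative assumption:

```python
def accept_lot(burst_fraction, flicker_densities,
               burst_limit=0.2, flicker_limit=2e-21):
    """Lot acceptance per selection principles (b) and (c): reject the
    lot if the proportion of burst-noise units is high, or if the
    average flicker noise spectral density (A^2/Hz) exceeds the limit.
    burst_limit is an illustrative assumption; flicker_limit follows
    the 2e-21 A^2/Hz criterion of Fig. 11.3."""
    avg = sum(flicker_densities) / len(flicker_densities)
    return burst_fraction <= burst_limit and avg <= flicker_limit
```

On these limits, manufacturers X and Y of Table 11.1 (60% and 40% burst-noise proportion) would be rejected and Z (10%) retained.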

11.4.2
Low noise, long life

This is the conclusion of our reliability tests: by measuring the excess noise it is
possible to make reasonable predictions about the life expectancy of the devices by
means of a non-destructive test. A large increase in excess noise occurs just prior to
failure; units with low initial values of noise current have a longer life under artifi-
cial ageing.
Some findings on perfect crystal device technology (PCT) for reducing flicker
noise in bipolar transistors [11.25]: (i) The flicker noise can be drastically reduced
by eliminating various crystal defects such as dislocations and precipitates, and by
achieving a low Si/SiO2 state density with the use of the P/As mixed doped oxide
diffusion technique. It is worth mentioning the disappearance of burst noise when
employing PCT. (ii) The degree of dislocation generation during the diffusion process
depends on the grown-in dislocation density; the smaller, the better. (iii) Diffusion-
induced dislocation density depends on the crystal orientation; (111) turned out to
be the best as far as dislocations are concerned.

11.5
Noise figure

Noise figure NF is the logarithm of the ratio of input signal-to-noise and output
signal-to-noise:
NF = 10 log [(S/N)in / (S/N)out]                  (11.1)
where S and N are power or (voltage)² levels.
This is measured by determining the S/N at the input with no amplifier present,
and then dividing by the measured S/N at the output with the signal source present. The
values of Rgen and any Xgen as well as frequency must be known to properly express
NF in meaningful terms.
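Equation (11.1) translates directly into code; a minimal sketch:

```python
import math

def noise_figure_db(snr_in, snr_out):
    """NF = 10*log10((S/N)_in / (S/N)_out), Eq. (11.1); the S/N values
    are power ratios (or squared voltage ratios)."""
    return 10 * math.log10(snr_in / snr_out)
```

For an ideal noiseless amplifier the input and output S/N are equal and NF = 0 dB; halving the S/N through the amplifier costs about 3 dB.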
We desire a high signal-to-noise ratio S/N; it also happens that any noisy
channel or amplifier can be completely specified for noise in terms of two noise
generators e_n and i_n, as shown in Fig. 11.4. The main points in selecting low noise
amplifiers are:
(i) Don't pad the signal source; live with the existing Rgen.
(ii) Select on the basis of low values of e_n and especially i_n if Rgen is over about a
thousand ohms.
(iii) Don't select on the basis of NF. NF specifications are all right so long as
you know precisely how to use them and so long as they are valid over the
frequency band for the Rgen (or Zgen) with which you must work.
(iv) The higher frequencies are often the most important, unless there is low
frequency boost or high frequency attenuation in the system [11.26].
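Rule (ii) reflects the standard e_n-i_n model: with uncorrelated generators, the total input-referred noise voltage density is the root sum of squares of e_n, the i_n·Rgen drop, and the thermal noise 4kT·Rgen of the source resistance, so the i_n term dominates at large Rgen. A sketch with illustrative op-amp values (the formula is the usual textbook one, not written out in this section):

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def input_noise_density(en, inoise, rgen, temp=300.0):
    """Total input-referred noise voltage density (V/sqrt(Hz)) for the
    e_n/i_n amplifier model with source resistance rgen (ohms),
    assuming the three contributions are uncorrelated."""
    return math.sqrt(en**2 + (inoise * rgen)**2
                     + 4 * K_BOLTZMANN * temp * rgen)

# illustrative values: en = 10 nV/sqrt(Hz), in = 1 pA/sqrt(Hz)
low_rgen = input_noise_density(10e-9, 1e-12, 100.0)   # en dominates
high_rgen = input_noise_density(10e-9, 1e-12, 1e6)    # in*Rgen dominates
```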

Table 11.1 Measurement results

Measured parameter                               Manufacturer
                                                 X        Y        Z
Burst noise proportion (%)                       60       40       10
1/f noise (for IB = 1 µA at f = 10 Hz),
in 10⁻²⁴ A²/Hz                                   42       6        1.5

[Fig. 11.3 is a flowchart: DC characteristics measurement → flicker noise current
spectral density measurement → HTRB, 168 h at 150, 175 or 200°C → DC
characteristics measurement → progressive degradation? If yes: failure analysis,
and rejection of the lot if items have a flicker noise density > 2·10⁻²¹ A²/Hz.
If no: next HTRB step, with ΔT = +25°C/step.]

Fig. 11.3 Sequence of the proposed lot acceptance reliability test programme

[Fig. 11.4: the signal source e_sig drives the input of a noiseless amplifier through
two equivalent noise generators, e_n in series and i_n in parallel with the input.]
Fig. 11.4 Noise characterisation of an operational amplifier [11.26]

Avoid applications requiring a high gain (> 60dB), because the amplified
noise (≈ 2µV) can reach the audio domain. For high reliability systems, all the
components having burst noise should be rejected; also all the batches with a
significant proportion of components having 1/f noise or burst noise should be
rejected. Only the components with a reduced noise level should be accepted.
Avoid using excessively large resistances in your circuits. Minimise the external
noise sources.
Noise spectroscopy [11.38] ... [11.41] gives information on the parameters of
traps located in the pn junction depletion layer. The noise reliability indicator in the
forward direction is defined as the ratio between the maximum value of the noise
spectral density (measured on a load resistance) and its thermal noise spectral
density. As a noise reliability indicator for reverse bias operation, the ratio of the
breakdown voltage of the ideal junction to the reverse voltage of soft breakdown
was introduced [11.41].
Burst noise is used as the third reliability indicator.
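The two bias-dependent indicators can be expressed directly as ratios. In the sketch below, the thermal noise spectral density of the load resistance is taken in its standard Johnson-noise form 4kTR (V²/Hz); that particular formulation is an assumption here, since [11.41] is not quoted in such detail:

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def forward_indicator(s_max, r_load, temp=300.0):
    """Ratio of the maximum measured noise spectral density on the
    load resistance (V^2/Hz) to its thermal noise density 4kTR."""
    return s_max / (4 * K_BOLTZMANN * temp * r_load)

def reverse_indicator(v_breakdown_ideal, v_soft_breakdown):
    """Ratio of the ideal-junction breakdown voltage to the reverse
    voltage at which soft breakdown sets in."""
    return v_breakdown_ideal / v_soft_breakdown
```

A forward indicator close to 1 means the device adds essentially no excess noise above the thermal floor of the load; large values flag excess noise sources in the junction.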

11.6
Improvements in signal quality of digital networks

Substantial improvements in signal quality [11.47], both at component and system
level, can be achieved by appropriately balancing the reactive design of digital net-
works. Cancellation of noise created by components, layout, and technologies (such
as vias, remote grounds and interposer contacts) was demonstrated in networks
from 50 to 200MHz, by using the needed cancellation criteria, CAE tools and veri-
fication of design. In [11.47] it is shown that - with the exception of device loading
- reactive mismatching is the dominant source of signal degradation in many digital
networks being designed today. Principles for reactive compensation and
criteria for localisation are developed and explained in the context of high-speed
digital operation. It is shown that, unlike the case of resistive matching, reactive
compensation carries no signal penalty other than a possible modification of propa-
gation delay. Guidelines are given for reactive noise cancellation in digital systems
operating with rise-times ranging from several ns to 50ps.

References

11.1 Bajenescu, T. I. (1985): Excess noise and reliability. Proceedings of RELECTRONIC '85,
Budapest (Hungary), pp. 260-266
11.2 Jaeger, R. C.; Brodersen, A. J. (1970): Low frequency noise sources in bipolar junction
transistors. IEEE Trans. on Electron Devices, ED-17, no. 2, p. 128
11.3 Martin, J. C. et al. (1966): Le bruit en créneaux des transistors plans au silicium.
Electronics Letters, June, vol. 2, no. 6, pp. 228-230
(1971): Le bruit en créneaux des transistors bipolaires. Colloques Internationaux du
C.N.R.S. no. 204, pp. 59-75
(1972): Corrélation entre la fiabilité des transistors bipolaires au silicium et leur bruit de
fond en excès. Actes du Colloque Internat. sur les Composants Electroniques de Haute
Fiabilité, Toulouse, pp. 105-119
(1972): L'effet des dislocations cristallines sur le bruit en créneaux des transistors bipo-
laires au silicium. Solid-State Electronics, vol. 15, pp. 739-744
11.4 Brodersen, A. J. et al. (1971): Low-frequency noise sources in integrated circuit transis-
tors. Actes du Colloque International du C.N.R.S., Paper II-4
11.5 Curtis, J. G. (1962): Current noise indicates resistor quality. International Electronics,
May 1962
11.6 Ziel, van der, A.; Tong, H. (1966): Low-frequency noise predicts when a transistor will
fail. Electronics, vol. 23, Nov. 28, pp. 95-97
11.7 Hoffmann, K. et al. (1976): Ein neues Verfahren der Zuverlässigkeitsanalyse für Hal-
bleiter-Bauteile. Frequenz vol. 30, no. 1, pp. 19-22

11.8 Ott, H. W. (1976): Noise reduction in electronic systems. Wiley Interscience, New York,
1976
11.9 Noise in physical systems (1978). Proceedings of the Fifth Internat. Conf. on Noise, Bad
Nauheim, March 13-16, Springer Verlag, Berlin, 1978
11.10 Prakash, C. (1977): Analysis of non-catastrophic failures in electronic devices due to
random noise. Microelectronics and Reliability vol. 16, pp. 587-588
11.11 Knott, K. F. (1978): Characteristics of burst noise intermittency. Solid-State Electronics
vol. 21, pp. 1039-1043
11.12 Roedel, R; Viswanathan, C. R (1975): Reduction of popcorn noise in integrated circuits.
IEEE Trans. Electron Devices ED-22, Oct., pp. 962-964
11.13 Martin, J. C.; Blasquez, G. (1974): Reliability prediction of silicon bipolar transistors by
means of noise measurements. Proceedings of 12th International Reliability Physics
Symp.
11.14 Bajenesco, T. I. (1981): Problèmes de la fiabilité des composants électroniques actifs
actuels. Masson, Paris, pp. 163-169.
(1996): Fiabilitatea componentelor electronice. Editura Tehnică, Bucharest (Romania),
pp. 312-324
11.15 Firle, J. E.; Winston, H. (1955): Bull. Ann. Phys. Society, tome 30, no. 2
11.16 Blasquez, G. (1973): Contribution à l'étude des bruits de fond des transistors à jonctions
et notamment des bruits en 1/f et en créneaux. Thèse de doctorat no. 532, Univ. P. Sabatier,
Toulouse
11.15 Luque, A. et al. (1970): Proposed dislocation theory of burst noise in planar transistors.
Electronics Letters, vol. 6, no. 6, 19th March, pp. 176-178
11.16 Koji, T. (1974): Noise Characteristics in the Low Frequency Range of Ion-Implanted-
Base-Transistor (NPN type). Trans. Inst. Electron. & Com. Eng. Jap. C, vol. 57, no. 1, pp.
29-30
11.17 Jaeger, R. C. et al. (1968): Record of the 1968 Region III IEEE Convention, pp. 58-191
11.18 Giralt, G. et al. (1965): Sur un phénomène de bruit dans les transistors, caractérisé par des
créneaux de courant d'amplitude constante. C. R. Acad. Sc. Paris, tome 261, groupe 5, pp.
5350-5353
11.19 Caminade, J. (1977): Analyse du bruit de fond des transistors bipolaires par un modèle
distribué. Thèse de doctorat, Université P. Sabatier, Toulouse, France
11.20 Le Gac, G. (1977): Contribution à l'étude du bruit de fond des transistors bipolaires:
influence de la défocalisation. Thèse de doctorat, Université P. Sabatier, Toulouse, France
11.21 Plumb, J. L.; Chenette, E. R. (1963): Flicker noise in transistors. IEEE Trans. Electron
Devices, vol. ED-I0, pp. 304-308
11.22 Oren, R. (1971): Discussion of Various Views on Popcorn Noise. IEEE Trans. on Elec-
tron Devices, vol. ED-18, pp. 1194-1195
11.23 Leonard, P. L.; Jaskowlski, L. V. (1969): An investigation into the origin and nature of
popcorn noise. Proc. IEEE (Lett.), vol. 57, pp. 1786-1788
11.24 Knott, K. F. (1970): Burst noise and microplasma noise in silicon planar transistors. Proc.
IEEE (Lett.), pp. 1368-1369
11.25 Yamamoto, S. et al. (1971): On perfect crystal device technology for reducing flicker
noise in bipolar transistors. Colloques internat. du CNRS no. 204, pp. 87-89
11.26 Sherwin, J. (1974): Noise specs confusing? National Semiconductor AN-104
11.27 Grivet, P.; Blaquiere, A. (1958): Le bruit de fond. Masson, Paris
11.28 Ziel, A. van der (1970): Noise: sources, characterization, measurement. Prentice Hall,
Englewood Cliffs

11.29 Motchenbacher, C. D.; Fitchen, F. C. (1973): Low-noise electronic design. John Wiley &
Sons, New York
11.30 Cook, K. B. (1970): Ph. D. Thesis, University of Florida
11.31 Soderquist, D. (1975): Minimization of noise in operational amplifier applications. AN-15
of Precision Monolithics Inc., Santa Clara, California
11.32 Bilger, H. R. et al. (1974): Excess noise measurements in ion-implanted silicon resistors.
Solid-State Electronics vol. 17, pp. 599-605
11.33 Bajenesco, T. I. (1977): Bruit de fond et fiabilite des transistors et circuits integres. La
Revue Polytechnique no. 1367, pp. 1243-1251
11.34 Wolf, D., editor (1978): Noise in physical systems. Proc. of Fifth Internat. Conf. on
Noise, Bad Nauheim, March 13-16, Springer Verlag, Berlin
11.35 Boxleitner, W. (1989): Electrostatic Discharge and Electronic Equipment. IEEE Press,
New York
11.36 Frey, O. (1991): Transiente Störphänomene. Bull. SEV/VSE, vol. 82, no. 1, pp. 43-48
11.37 Amerasekera, E. A.; Campbell, D. S. (1987): Failure mechanisms in semiconductor
devices. J. Wiley and Sons, Chichester
11.38 Kirtley, J. R. et al. (1987). Proc. of the Internat. Conf. on Noise in Physical Systems and
1/f Fluctuations, Montreal
11.39 Schultz, M.; Pappas, A. (1991): Telegraph noise of individual defects in the MOS inter-
face. Proc. of the Internat. Conf. on Noise in Physical Systems and 1/f Fluctuations,
Kyoto, Japan
11.40 Jones, B. K. (1995): The sources of excess noise. Proc. of the NODITO workshop, Brno,
CZ, July 18-20
11.41 Sikula, J. et al. (1995): Low frequency noise spectroscopy and reliability prediction of
semiconductor devices. Proc. of RELECTRONIC '95, Budapest (Hungary), October 16-
18, pp. 407-412
11.42 Ciofi, C. et al. (1995): Dependence of the electromigration noise on the deposition tem-
perature of metal. Proc. of RELECTRONIC '95, Budapest (Hungary), October 16-18, pp.
359-364
11.43 Schauer, P. et al. (1995): Low frequency noise and reliability prediction of thin film
resistors. Proceedings of RELECTRONIC '95, Budapest (Hungary), October 16-18, pp.
401-402
11.44 Koktavy, B. et al. (1995): Noise and reliability prediction of MIM capacitors. Proc. of
RELECTRONIC '95, Budapest (Hungary), October 16-18, pp. 403-406
11.45 Yiqi, Z.; Qing, S. (1995): Reliability evaluation for integrated operational amplifiers by
means of l/f noise measurement. Proc. of the Fourth Internat. Conf. on Solid-State and
Integrated-Circuit Technology, Beijing (China), October 24-28, pp. 428-430
11.46 Guoqing, X. et al. (1995): Improvement and synthesis techniques for low-noise current
steering logic (CSL). Proc. of the Fourth Internat. Conf. on Solid-State and Integrated-
Circuit Technology, Beijing (China), October 24-28, pp. 634-636
11.47 Merkelo, H. (1993): Advanced methods for noise cancellation in system packaging. 1993
High Speed Digital Symposium, University of Illinois, Urbana
12 Plastic package and reliability

12.1
Historical development

In the beginning, only metallic packages were used for transistor encapsulation.
This type of package seemed to be very reliable, both for military and for civilian
applications. In 1962, General Electric used plastic packages for transistors for the
first time. Thus, the costs were significantly reduced, by as much as 90% in some
cases [12.1]. At first, plastic encapsulated transistors were developed for mass con-
sumption, without taking into account reliability or the environment. The low cost
of these new transistors therefore rapidly attracted the attention of industry and the
army. Consequently, starting from 1964, their market increased appreciably. Almost
immediately, their reliability weaknesses were revealed, especially under combined
conditions of high temperature and moisture, when the failure rate increases
dramatically compared with that of metal encapsulated transistors. This
explains why, with rare exceptions, at the time, the plastic package was not ac-
cepted by the army.
In the 60's, the manufacturers of semiconductor devices published results [12.2]
trying to prove that plastic encapsulated transistors fulfil the technical requirements
of the American military standards (referring to metal packages) and, therefore,
that they can successfully replace metal encapsulated transistors. The military and
industrial users asserted the opposite [12.3], especially for the combined test of high
temperature and humidity. In 1968, Flood [12.4], from Motorola, performed reli-
ability tests with durations of thousands of hours, varying the temperature and hu-
midity conditions, and arrived at the idea that the vapour pressure is the most ap-
propriate stress for evaluating the effect of moisture on plastic encapsulated
transistors. The result was that humidity has a significant effect on the failure rate.
Baird and Peattie [12.2], from Texas Instruments, asserted that they obtained
satisfactory results for the tests stipulated by method 106B of MIL-STD-202C
and that the failure rate doubles its value if the components undergo a relative hu-
midity of 70%, at 55°C, for 5000 hours. But this seemed to be a deficiency of
method 106B, and methods more appropriate for plastic encapsulated transistors
were needed. In the same work, by comparing the same transistor encapsulated in
metal and in plastic, respectively, the conclusion was that the metal package offered
the better reliability. In 1968, Anixter [12.3], from Fairchild, believed that there
were some unsolved problems with plastic encapsulation and recommended not using


this type of package for military applications. Also, in 1968, Diaz [12.5], from
Burroughs, arrived at the same conclusion.
As a result of these contradictory reports, the US Army Electronics Command de-
cided to organise a complete programme of reliability tests on plastic encapsu-
lated devices. In a research report from 1971, Fick [12.6] summarised the main
results of these reliability tests, performed in Panama. Instead of combined
temperature and humidity cycles (as method 106 indicated), Fick used constant high
temperature and high humidity tests. He assumed that in this way the accelerated
failure rate was correlated with the operational conditions. As the most detrimental
conditions had to be tested, experiments in the Tropics, where high temperature and
humidity are naturally combined, were also performed. The conclusions of this study
may be summarised as follows:
• The transistors intended for commercial purposes were the weakest. Their cur-
rent gain increased from 100...200 to 1000...2000, without a plausible explana-
tion being furnished for this phenomenon.
• A study of the materials used for various plastic packages could offer
valuable information about the reliability of plastic encapsulated devices.
• Another aspect worth studying is the effect of mechanical shocks
and vibrations on plastic encapsulated devices.
• It is necessary to specify the test requirements for plastic encapsulated
transistors and the failure criteria (such as: ICBO max (VCB = 16V) = 50nA and hFE
(VCE = 1V, IC = 2mA) = 60...300).
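The numerical failure criteria in the last bullet translate into a simple pass/fail check (reading the specification in the natural way: leakage of at most 50nA, gain within the 60...300 window):

```python
def meets_criteria(icbo_na, hfe):
    """Pass/fail per the quoted criteria: ICBO (at VCB = 16 V) of no
    more than 50 nA, and hFE (at VCE = 1 V, IC = 2 mA) between 60
    and 300."""
    return icbo_na <= 50.0 and 60.0 <= hfe <= 300.0
```

On this reading, the commercial transistors whose gain drifted up to 1000...2000 would be rejected.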
After 1980, a significant improvement in the performance of semiconductor
devices was obtained. In a study from 1996, performed by the Reliability Analysis
Centre (RAC), field failure rates from one-year warranty data were analysed [12.7].
It seemed that both for hermetic and for nonhermetic devices a more than 10-fold
decrease in the failure rate was found between 1978 and 1990. In another
study, reported in 1993, a 50-fold decrease in the failure rate of PEMs (Plastic
Encapsulated Microcircuits) over the period 1979 to 1992 was found [12.7]. These
results are confirmed by many other industry studies. The reason is very simple:
covering 97% of worldwide market sales, plastic encapsulated semiconductor
devices were the most studied devices. Also, the absence of the severe controls of
Military Standards allows a continuous process improvement, leading to the
mentioned results. Eventually, a major cultural change arose in the
procurement policies for military systems. Known as the Acquisition Reform, this
new approach encourages the use of plastic encapsulated devices in DoD
(Department of Defense of the USA) equipment, and - as a consequence - in
the military systems of all countries. The steps needed for implementing this new
system will be detailed in 12.8.

12.2
Package problems

From a reliability viewpoint, one of the most important parts of an electronic com-
ponent is the package. Experience has indicated that the majority of failures arise
because the encapsulation could not fulfil its role of protecting the die. Integrated
circuits encapsulated in plastic and in metal packages, respectively, behave
differently, depending on the environmental stress. Thus, a plastic package is more
resistant to vibrations and mechanical shocks because the wires are held by the
plastic mass. On the other hand, plastic encapsulated integrated circuits are not
hermetically tight and may exhibit intermittence of the solder joints at temperature
changes. The thermal intermittence becomes manifest for all types of integrated
circuits, but especially for LSI memories. Generally, this is an effect depending on
the complexity of the circuit, and one can reduce it by an order of magnitude if the
manufacturing process is well monitored. One may note that plastic encapsulation
is a relatively simple technology with good properties regarding mechanical shocks
and vibrations.
For the plastic encapsulation of semiconductor devices, only thermosetting resins
are used (e.g. for series production, a combination of phenol and epoxy resins, or
silicone resins). The moulding material contains a basic resin, a drying agent, a
catalyst, an inert filler, a flame-retardant agent, and a material facilitating
the release of the package from the mould after the moulding operation.
The British standards D3000, D4000 and 11219A stipulate three levels of reli-
ability for the plastic encapsulation of semiconductor devices, the first two having
cumulative failure rates of 2% and 10%, respectively, for an operational life of 40
years. Generally, surface contamination may lead to various failure modes, such
as: the diminution of the current gain of a transistor, the increase of the leakage
current, the corrosion of the aluminium metallisation, etc., accelerated by the ionic
impurities from the moulding material, especially in a humid environment. A spe-
cific failure mode for the plastic package is the mismatch between the dilatation
coefficients of the plastic material and of the other constituent parts (frame, gold
wires, and die), which may lead to open or intermittent contacts.
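If one assumes a constant failure rate (an assumption of convenience, not something the standards themselves state), a cumulated failure fraction F over an operational life t corresponds to λ = -ln(1 - F)/t; for the two levels above this works out to roughly 58 and 300 FIT:

```python
import math

def failure_rate_fit(cum_fraction, years):
    """Constant failure rate (in FIT, failures per 1e9 device-hours)
    equivalent to a cumulated failure fraction over the given period."""
    hours = years * 8760.0
    return -math.log(1.0 - cum_fraction) / hours * 1e9

lam_2 = failure_rate_fit(0.02, 40)   # 2% cumulated failures over 40 years
lam_10 = failure_rate_fit(0.10, 40)  # 10% cumulated failures over 40 years
```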
About 90% of the electronic components used today are plastic encapsulated. A
hermetically encapsulated semiconductor die costs, on average, twice as much as
its plastic equivalent [12.8].
The majority of plastic encapsulated semiconductor devices have some inherent
failure mechanisms, such as ionic contamination and mechanical stress, which may
bring about open circuits. Moreover, ionic contamination may distort the elec-
trical parameters of a device (examples are the increase of the leakage current of a
reverse-biased pn junction or the change of the threshold voltage of a MOS tran-
sistor).
The external sources of ionic contamination are salt mist, industrial atmosphere
and corrosive solder flux. The corrosion may be chemical, galvanic, or
- with an external bias - electrolytic. The time until the appearance of
a short circuit depends on temperature, relative humidity, presence of ionic conta-
minants, type and purity of the plastic, mechanical design of the package, and the
geometry of the aluminium interconnections. From this simple enumeration, it is
obvious that predicting the reliability of a given plastic encapsulated semiconductor
device is not an easy task.
To outline the extreme importance of this problem, one must mention that at the
beginning of the microelectronic revolution, the US Department of Defense, in
co-operation with NASA and the Federal Aviation Administration, created an ad-hoc
committee for plastic encapsulated semiconductor devices, with two working
groups: one for measuring methods and procedures, and another for research and
development on plastic materials.

12.2.1
Package functions

The package must ensure the following functions [12.9]:

a) Die protection (against the environment). The package must be built in such a
way as to protect the incorporated electron device. The tests normally used for
verifying this function are: hermeticity and humidity tests, mechanical shocks and
constant acceleration, temperature cycling and thermal shocks.
b) Consistency with the needs of the system. Both the basic properties of the
materials used for manufacturing and the design of the system may greatly in-
fluence the circuit performances. Of specific interest are factors such
as: heat transfer capability, resistance to radiation, and electrical properties.
c) Mechanical arrangement. Problems concerning dimensions, weight, shape, num-
ber of wires, etc. are involved.
d) Interface between the die and the electrical system (the outside world).
e) Favourable costs.
The plastic materials called Epoxy B meet the requirements for semicon-
ductor devices well and are used on a large scale for linear and digital integrated cir-
cuits, small and medium power transistors, memories, microprocessors, etc.
Semiconductor devices in plastic packages do not raise utilisation problems if they
do not undergo extended temperature cycling or if long life at high temperatures
is not an essential requirement. Among the main advantages of the plastic package,
one may mention the high resistance to mechanical stress and to aggressive liquids
and gases, the good surface isolation of the incorporated die, the good precision of
the mechanical dimensions, and the reduced costs.
There were some problems linked to free ions, especially at high temperatures,
which have been solved in recent years. Plastic materials with a very small number
of free ions and with dilatation coefficients close to those of the metal or silicon
have been obtained.

12.3
Some reliabilistic aspects of the plastic encapsulation

Normally, one may consider that there are three main aspects of the plastic encap-
sulation of semiconductor devices.
1. The stability of the electrical characteristics of the die. One of the most im-
portant degradation factors is the ionic contamination due to the moulding mate-
rial, which may lead to the formation of an inversion layer at the surface of
the die. This layer produces the degradation of the electrical characteristics of
the device. The test currently used for the identification of this degradation
mode is ageing under high temperature reverse bias.
2. The resistance of the internal connections. For devices in plastic packages,
it is much more important than for hermetic packages to have very good me-
chanical connections, because [12.10, 12.11]:
• at the moulding operation, the connection wires undergo a stress produced by
the injection of the moulding material;
• the dilatation coefficients of the various materials are different, producing a
mechanical stress which cannot be neglected at extreme temperatures;
• the connection wires are embedded in plastic material over their whole length.
3. The resistance of the plastic package in a hostile environment. This is the most
important factor determining the reliability of plastic encapsulated
devices, because the degradation due to a lack of hermeticity begins with the
penetration of moisture into the package, reaching the die, especially along the
contact area between the moulded material and the metallic frame.
The main parameters characterising the resistance to humidity of a package are:
• the relative hermeticity,
• the dilatation coefficient of the moulding material,
• the quantity of hydrolysable contaminants in the moulding material,
• the die's resistance to corrosion.
Experience showed that the most significant (but also the most controversial)
accelerated test for the evaluation of the resistance to humidity is ageing in
operation at high temperature (+85°C) and in a humid environment
(relative humidity 85%, deionised water). The bias must lead to a minimum
dissipation on the die, but with a maximum voltage gradient between the
neighbouring aluminium conductors. The penetration of the moisture depends on
the partial vapour pressure. However, one may emphasise that for this kind of test
the ions arise essentially from the plastic package itself, while in an operational
environment they are brought from the outside, by the moisture [12.10].
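The text gives no quantitative model for relating the 85°C/85% RH test to field conditions, but a common choice in the literature is Peck's empirical temperature-humidity model; the exponent n ≈ 3 and activation energy Ea ≈ 0.9 eV below are typical published values, used here purely as assumptions:

```python
import math

K_EV = 8.617333e-5  # Boltzmann constant, eV/K

def peck_acceleration(rh_test, t_test_c, rh_use, t_use_c, n=3.0, ea=0.9):
    """Peck's empirical humidity acceleration model (an assumption
    here, not taken from the text):
    AF = (RH_test/RH_use)^n * exp((Ea/k) * (1/T_use - 1/T_test)),
    with temperatures in kelvin and typical values n ~ 3, Ea ~ 0.9 eV."""
    t_test = t_test_c + 273.15
    t_use = t_use_c + 273.15
    return (rh_test / rh_use) ** n * math.exp(
        ea / K_EV * (1.0 / t_use - 1.0 / t_test))

# 85 C / 85% RH test versus an assumed field condition of 30 C / 40% RH
af = peck_acceleration(85.0, 85.0, 40.0, 30.0)
```

With these illustrative parameters the 85/85 test compresses field exposure by a factor on the order of a thousand, which is why it remains attractive despite the reservation above about the origin of the ions.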

12.4
Reliability tests

The first distinction that must be made is between discrete components and
integrated circuits. While plastic encapsulated discrete devices are used mainly
for mass consumption, plastic encapsulated integrated circuits constantly enlarge
their field of use. This explains the user expectations for high reliability
performances, practically equivalent to those of hermetically (metal or ceramic)
encapsulated integrated circuits. But one must not forget that there are specific
failure modes, created or accelerated by plastic encapsulation. To eliminate some
of these specific failure modes, the manufacturers introduced the following
improvements [12.11]:
• die passivation (that is, the deposition over the whole surface of a protective
  glass layer, in which the contact areas for bonding the connection wires
  between die and metallic frame are then etched);
• covering of the wires after bonding with a high purity protective resin;
• impregnation of the package, after moulding, with resins liable to fill the
  holes or the microcracks which could exist at the interface between frame and
  moulding material.


Fig. 12.1 Results of destructive tests performed with thermal shocks (MIL-STD-883, method 1011,
level C, -65°C...+125°C) for various package types [12.12]: 1 - epoxy without die protection; 2 -
silicone with detrimental package protection; 3 - epoxy with die protection; 4 - silicone with
normal die protection; 5 - ceramic package; 6 - phenol package with die protection; 7 - flat pack

For integrated circuits, the wires must withstand a pull force of 10 gf in the
bonding machine control, while for the metallic packages the force level is 1-2 gf.

Generally, the mechanical stress (shock test, constant vibrations or accelerations)
withstood by a plastic encapsulated integrated circuit is more severe than that
tolerated by its equivalent in a hermetic package. The use of thermal shocks is not
recommended as a 100% screening, because they can create potential defects in good
items. On the other hand [12.12, 12.13], they are valid tests for evaluating the
connection wires (see Fig. 12.1).
An insufficient hermeticity facilitates the penetration of moisture into the
package in two ways: either through the moulding material, or (especially) along
the contact area between the moulding material and the metallic frame. The
observed defect is an open circuit, produced by the corrosion (galvanic or
electrolytic) of the aluminium metallisation. The tests performed by the US Army
Electronics Command [12.14] confirmed these mechanisms and the fact that - for the
time being - no plastic material is clearly superior to the others. These results
[12.13] are, however, contradictory, because another author stated [12.15] that
epoxy encapsulated integrated circuits are an order of magnitude more reliable
than phenolic or silicone ones.

12.4.1
Passive tests

For a rough (grosso modo) study of the thermal cycling conditions for electronic
equipment, the company National Semiconductor [12.16] employed two automatic
chambers for evaluating the various types of plastic materials used for
semiconductor encapsulation. The tested devices were transported from a cold
room (0°C) to a warm one (100°C) and back every 10 minutes, in a passive test
(without electrical biasing); the junction temperature is thus the same as the
ambient one. Fig. 12.2 summarises the results of these tests.


Fig. 12.2 Results of temperature cycling tests for various types of plastic encapsulation [12.15];
to be noted the good behaviour of encapsulant no. 6 (epoxy A, without die protection) and,
especially, the remarkable behaviour of encapsulant no. 5 (epoxy B, without die protection)

No screening test was performed. One must note the remarkable behaviour observed
for epoxies A and B, no failure being registered after 200 cycles. This proves
that the failure rate was smaller than one failure per 10^6 device-cycles, which
corresponds approximately to one failure for 500 devices functioning 5.5 years in
an equipment switched on once per day, seven days per week.
The failure analysis showed that the main failure mode after the first hundreds of
cycles was the breaking of the connection wires due to material fatigue, because
the connections were repeatedly stretched by the dilatations and contractions of
the surrounding encapsulant. This allows deducing that the dilatation coefficients
of epoxies A and B are close to those of the gold wires, up to about +115°C.
Beyond this temperature (called the glass transition temperature), the increase of
the mentioned coefficients is important, explaining the material fatigue. For the
other encapsulating materials, the failures were due to the high values of the
dilatation coefficients of the moulding layer and/or to the combination of the
drift or dilatation characteristics of the silicone resins. As a conclusion, the
results of the passive temperature cycling show that, for epoxies A and B, a small
percentage (<0.03%) of intermittent failures depend on temperature ("hot
intermittent failures").
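The equivalence claimed above can be checked with a few lines of arithmetic (a
rough sketch; the 500-device population and the once-per-day switching regime are
the ones quoted in the text):

```python
# Consistency check: fewer than 1 failure per 1e6 device-cycles should
# correspond to roughly one failure among 500 devices switched on once
# per day, seven days per week, for 5.5 years.
devices = 500
cycles_per_day = 1            # equipment switched on once per day
years = 5.5
device_cycles = devices * cycles_per_day * 365 * years
failures = device_cycles * (1 / 1e6)   # at one failure per 1e6 device-cycles
print(f"{device_cycles:.3g} device-cycles -> {failures:.2f} expected failures")
```

The accumulated 10^6 device-cycles indeed map to about one expected failure at the
quoted failure-rate limit.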
Valuable information about the results of passive tests was furnished by McCoog
[12.7]. In a series of tests performed in 1986 by Rockwell International, at
extended temperature cycling (-40/+80°C, 883 cycles), a higher failure rate (6.1%,
i.e. 2 failures observed per million hours) was found for ceramic devices than for
plastic ones (1.6%, i.e. 1 failure per million hours). In 1987, in a study
performed by Motorola, similar results were reported for plastic and ceramic
encapsulated devices undergoing temperature cycling (-65/+150°C, 1000 cycles):
0.083%/1000 cycles for plastic and 0.099%/1000 cycles for ceramic, respectively.
In 1989, Motorola repeated the experiment, in the same conditions, and reported
higher failure rate values, but again comparable values for plastic (0.44%/1000
cycles) and ceramic (0.38%/1000 cycles). In the 90s, similar conclusions were
reached: no reliability advantage between plastic encapsulated microcircuits
(PEM) and ceramic encapsulated ICs. Condra et al. [12.17] reported such a result
for temperature cycling (-55/+85°C, 1000 cycles): more than 20-year useful life
for both plastic and ceramic packages. Weil et al. [12.18] also used temperature
cycling (-65/+150°C) and obtained 1-2 device failures after 2...20 million
cycles, for both plastic and ceramic packages.

12.4.2
Active tests

Another series of tests, the active ones, are performed under bias and with a
load - the power cycle test - increasing the junction temperature to at least
+100°C. The devices are powered for 2 minutes and 30 seconds and then
disconnected for another period of 2 minutes and 30 seconds. The thermo-
mechanical stresses generated by this test approximate well those appearing in
the real functioning of the devices in an equipment which is switched on and off.
Experience [12.19] showed that the observed failure rate is smaller than 0.17
failures per 10^6 device-cycles, which is equivalent to about one failure in a
population of devices functioning in an equipment switched on and off once per
day, seven days per week.
Using epoxy B as a moulding material, the thermal intermittence was virtually
eliminated under normal functioning conditions, the failure percentage being
0.0086%.

12.4.3
Life tests

As one already knows, the performances and the reliability of transistors and
integrated circuits (bipolar and MOS) may be degraded by surface problems,
thermally activated and associated with unwanted contaminants (mobile ions, polar
molecules, etc.). From this viewpoint, the organic encapsulants (such as epoxy
resins) are well known as having surface problems. As a result of numerous life
tests, the conclusion was drawn [12.5...12.22] that epoxy B is a "clean" system
assuring a high reliability in normal operation conditions.
The epoxy B encapsulants were not allowed for military applications, because
they let moisture penetrate to a certain extent, an unacceptable fault for
military requirements. It is true that the huge majority of industrial
applications do not have such demanding requirements as the military ones. One
knows that, in the beginning, in the period 1965-1970, the silicone resin
packages demonstrated excellent properties and performances in accelerated
humidity tests, but they did not pass the military examination concerning
functioning in a saline atmosphere. This type of package offers, however, a
better reliability, especially in the extreme conditions of a humid environment.
In the 80s, the improvement of the epoxy resins and the excellent mechanical
reliability demonstrated in tests performed by National Semiconductor led to a
re-evaluation of the epoxy encapsulated integrated circuits, because epoxy B
proved to be much more resistant to humidity than the other epoxies (especially
type A). Even the tests performed in a saline atmosphere (according to
MIL-STD-883, method 1009, condition B: 48 hours, 271 ICs, bipolar and MOS,
digital and linear) did not produce supplementary failures, the epoxy B package
having a thermal resistance about 10% smaller than that of the silicone package,
which allows the junctions to operate at lower temperatures and thus to attain a
better reliability.
Table 12.1 [12.21] summarises the comparative features of the most used
moulding compounds.

Table 12.1 The properties of some moulding compounds

Compound   Initial     Failure rate at   Resist. to   Resist. to   Package    Thermal
           integrity   extended temp.    moisture     saline       robust.    resistance
                       cycles                         atmosph.
Epoxy A    Excellent   Weak              Weak         Excellent    Excellent  Good
Phenolic   Good        Medium            Excellent    Excellent    Excellent  Good
Silicone   Good        Medium            Excellent    Weak         Good       Medium
Epoxy B    Excellent   Excellent         Excellent    Excellent    Excellent  Good

A typical result, obtained after an overstress in a humid environment, is given
in Fig. 12.3, for silicone resin encapsulated transistors [12.23]. From this
result, the conclusion was drawn that temperature alone is not useful for
evaluating the reliability of epoxy encapsulated components, but may be used for
power transistors encapsulated in a silicone resin. For these two types of
plastic packages (epoxy and silicone) the humidity stress is considered as being
significant and valid, because the environment normally contains an amount of
moisture. Thus, the use of relative humidity as a stress factor was proposed
[12.23]. In Fig. 12.4, a typical result for the average lifetime of an
integrated circuit in a 14-pin DIL package is shown.
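The idea of relative humidity as a stress factor is commonly formalised with a
Peck-type model, in which the median life scales as a power of r.h. times an
Arrhenius term; with an exponent of 2 this reproduces the [r.h.]^2 dependence of
Fig. 12.4. The sketch below is illustrative only: the exponent n = 2 and the
activation energy of 0.8 eV are typical literature values, not parameters given
in the text.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def humidity_acceleration(rh_use, t_use_c, rh_test, t_test_c, n=2.0, ea_ev=0.8):
    """Peck-type acceleration factor of a humidity test vs. use conditions.

    Assumed model: median life ~ rh**(-n) * exp(Ea / kT), hence
    AF = life(use) / life(test).
    """
    t_use_k = t_use_c + 273.15
    t_test_k = t_test_c + 273.15
    humidity_term = (rh_test / rh_use) ** n
    thermal_term = math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_use_k - 1.0 / t_test_k))
    return humidity_term * thermal_term

# 85 degC / 85% r.h. test compared with a mild 25 degC / 40% r.h. environment
af = humidity_acceleration(rh_use=40, t_use_c=25, rh_test=85, t_test_c=85)
print(f"one test hour ~ {af:.0f} hours in the field")
```

With these assumed parameters, each 85/85 test hour stands for several hundred
field hours, which is why a 1000-2000 hour test can say something about years of
service.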
A test performed in the Panama Canal area [12.24], in a saline atmosphere, on
5688 transistors and 1316 bipolar integrated circuits, for 7 years, demonstrated
that:
• DIP ceramic packages are superior to plastic ones.
• Flat pack packages are less reliable than the majority of plastic packages.
• An epoxy novolac encapsulated device may have the same reliability as a device
  encapsulated in a ceramic DIP package.
• The weakest reliability results were obtained for silicone resin encapsulated
  devices.


Fig. 12.3 Lognormal distribution of failures for transistors encapsulated in silicone resin. Test
stress: ambient temperature TA = 100°C, relative humidity r.h. = 97% [12.49]

Fig. 12.4 Average lifetime of a plastic encapsulated integrated circuit (DIL, 14 pins) vs. [r.h.]^2

Life tests performed in the early 90s also demonstrated the same failure rate
value for both plastic and ceramic encapsulated ICs. Schultz et al. [12.25]
reported failure rate values of 1 FIT (10^-9 h^-1) at +55°C for both plastic and
hermetic ICs. Another study, also cited in [12.7], demonstrates a higher increase
of reliability for plastic encapsulated ICs than for ceramic encapsulated ones
(see Table 12.2).

Table 12.2 A comparison between the 1979-1992 decrease of failure rates (in FITs) for plastic
and ceramic packages, respectively

IC type   Year   Plastic package   Ceramic package
Linear    1979   300               10
          1992   0.2               2
Logic     1979   10                6
          1992   0.2               3

As one can see, in 1992, the plastic packages were more reliable than ceramic ones.

12.4.4
Reliability of intermittent functioning plastic encapsulated ICs [12.26]

In operation, the dissipated power increases the temperature of each component,
decreasing the local relative humidity (r.h.), which is a critical parameter for
a plastic package. For instance, at an over-temperature of 10°C, the life
duration is increased about 10 times over that of a component functioning in the
same conditions, but without an over-temperature.
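The mechanism can be made concrete: the die's over-temperature does not change
the absolute water content of the surrounding air, but it raises the local
saturation pressure, so the relative humidity seen at the die surface drops. The
sketch below uses the Magnus approximation for the saturation vapour pressure of
water, a standard formula that is not part of the text:

```python
import math

def saturation_pressure_hpa(t_c):
    """Magnus approximation for the saturation vapour pressure of water (hPa)."""
    return 6.112 * math.exp(17.62 * t_c / (243.12 + t_c))

def rh_at_die(rh_ambient_pct, t_ambient_c, overtemp_c):
    """Local relative humidity at a die running overtemp_c above ambient:
    the partial water pressure is unchanged, the saturation pressure is higher."""
    p_water = rh_ambient_pct / 100.0 * saturation_pressure_hpa(t_ambient_c)
    return 100.0 * p_water / saturation_pressure_hpa(t_ambient_c + overtemp_c)

# a 10 degC over-temperature in an 85% r.h., 25 degC ambient
print(f"{rh_at_die(85.0, 25.0, 10.0):.1f}% r.h. at the die surface")
```

A 10°C over-temperature cuts the local r.h. well below the ambient 85%, which is
consistent with the large lifetime gain quoted above.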

If, on the contrary, the power dissipated by the component is negligible (as is
the case for CMOS devices) and the over-temperature of 10°C cannot be obtained,
or if the component operates intermittently (instead of continuously), a
detailed analysis must be made before any extrapolation. Strohle [12.27] studied
the most detrimental testing cases for simulating intermittent functioning and
also investigated the extrapolation of the typical results of the 85°C/85% r.h.
test to the intermittent functioning case.
At the intermittent functioning of plastic encapsulated components, the typical
failures are produced by humidity and by mechanical tensions arising at rapid
changes of temperature (due to the different temperature coefficients of the die
and of the package). The typical failure modes are die detachment, die scratches
and drift of electrical characteristics.
Humidity produces the following failure mechanisms:
• corrosion of aluminium pads,
• bit defects (for static and dynamic memories),
• drift of the threshold voltage, brought about by mobile ions.
All these failure mechanisms become manifest through high surface leakage
currents. To establish the size of these currents, a special chip with long pads
covered by a glass passivation (with 4% phosphorus by weight) was subjected to
various temperature and relative humidity levels. The leakage current was
measured on an element with two 4 µm wide pads, with a voltage of 5 V
(corresponding to an electric field of 1.25x10^4 V/cm) applied between them. The
results of the measurements are shown in Table 12.3.

Table 12.3 Surface leakage current produced by humidity on a Si/Al test structure

Relative humidity (%)   Leakage current (A)
                        25°C         60°C         80°C
20                      4x10^-15     2x10^-14     5x10^-14
40                      6x10^-14     1x10^-13     9x10^-13
60                      5x10^-13     4x10^-12     1x10^-11
80                      1.5x10^-11   5x10^-11     2x10^-10
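Taking the 60% r.h. row of Table 12.3 at face value (5x10^-13 A at 25°C rising
to 1x10^-11 A at 80°C), the temperature dependence of the leakage current allows
a rough Arrhenius estimate of the activation energy. This calculation is an
illustration added here, not one performed in the text:

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_ea(i1, t1_c, i2, t2_c):
    """Activation energy (eV) from two current readings, assuming
    a thermally activated process: I ~ exp(-Ea / kT)."""
    t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
    return BOLTZMANN_EV * math.log(i2 / i1) / (1.0 / t1_k - 1.0 / t2_k)

# 60% r.h. row of Table 12.3: 5e-13 A at 25 degC, 1e-11 A at 80 degC
ea = arrhenius_ea(5e-13, 25, 1e-11, 80)
print(f"Ea ~ {ea:.2f} eV")
```

The resulting value of roughly half an electron-volt is only an order-of-magnitude
estimate, since the dominant conduction mechanism also shifts with humidity.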

Table 12.4 The effect of humidity on the time to pad interruption (i.e. 50% corrosion);
the pad has width = 4 µm and thickness = 1 µm

Relative humidity (%)   Time (hours)
                        25°C      60°C     80°C
20                      260,000   52,000   13,400
40                      1,300     3,500    1,100
60                      70        13       6

Experiments have shown that relative humidity and temperature in particular
exercise strong influences on the reliability of the components. Table 12.4
contains the results of an experiment on the corrosion to 50% of an aluminium
pad (width = 4 µm, thickness = 1 µm). One may notice that the difference between
25°C/20% r.h. and 80°C/80% r.h. is of about six orders of magnitude.
The parameters influencing the behaviour of plastic packages at intermittent
functioning may be sorted as follows [12.27][12.28]:
a) Parameters specific to the semiconductor:
- technology,
- glass passivation.
b) Parameters specific to the package:
- water penetration,
- water retention,
- contamination,
- pH value,
- conductivity,
- transmission time of the glass,
- fixation capacity (die, lead-frame, filler).
c) Parameters specific to functioning:
- supply voltage (intensity of the electric field),
- environmental conditions (temperature, relative humidity),
- "over-temperature" of the die or of the component.
Because of the large number of parameters, the lifetime of a plastic encapsulated
component is hard to calculate taking each parameter into account in a global
model. Therefore, Strohle [12.27] studied only the parameters that do not change
at intermittent functioning, by using an accelerated static lifetime test (e.g.
the 85°C/85% r.h. test) and by using the resulting lifetimes (between 1000 and
2000 hours) as a basic factor. So, the intermittent functioning may be taken into
account through an acceleration factor (F), with the formula:
F = lifetime for operational conditions / lifetime for testing conditions. (12.1)
Strohle intended to study the influence of the various parameters on the failures
produced by humidity and to use the results in a model for simulating the
operational conditions. But this goal proved to be difficult to reach and only
models for the "worst case" were obtained [12.29].
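Formula (12.1) can then be used to translate a static 85°C/85% r.h. result into
an estimated field life. In the sketch below, the 1000 and 2000 hour test
lifetimes are the ones quoted above, while the acceleration factor value is
purely hypothetical:

```python
HOURS_PER_YEAR = 8760

def field_lifetime_years(test_lifetime_h, acceleration_factor):
    """Formula (12.1): F = life(operational) / life(test), hence
    life(operational) = F * life(test)."""
    return acceleration_factor * test_lifetime_h / HOURS_PER_YEAR

F = 100  # hypothetical worst-case acceleration factor, for illustration only
for test_life_h in (1000, 2000):
    years = field_lifetime_years(test_life_h, F)
    print(f"{test_life_h} h at 85/85, F={F} -> about {years:.1f} field years")
```

The point of the worst-case models mentioned above is precisely to bound F from
below, so that such an extrapolation stays conservative.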


Fig. 12.5 Dependence of the acceleration factor on the duty cycle, having as parameter the die
over-temperature [12.61]; test conditions: 85°C/85% r.h. (192 hours cycle)

The various acceleration factors, depending on the die temperature, for the
85°C/85% r.h. test are shown in Fig. 12.5. The main result is that, for a plastic
encapsulated component, even if the over-temperature is of only 5°C, small duty
cycles are a harder stress than continuous functioning. The shape of the curves
(for all temperatures between 0 and 60°C) is almost independent of the duty cycle
value. For smaller cycle durations, the equilibrium state is reached only after
more cycles (see Table 12.5).

Table 12.5 Relationship between the duty cycle and the equilibrium state (test conditions:
over-temperature of 20°C, duty cycle 0.15, 85°C and 85% r.h.)

Cycle duration (hours)   Equilibrium state (after how many cycles)   Total time (hours)
4                        225                                         900
168                      9                                           1730

Experimentally, the lifetime of plastic encapsulated LSI ICs, without
over-temperature and for the 85°C/85% r.h. test, was found to be about 1500
hours, with less than 5% failures. Of course, in normal operational conditions
(20°C/85% r.h.), for the worst case, lifetimes of 15 years were found for
equipment functioning 8 or 16 hours each day [12.27]. For 8085 microprocessors
(manufactured with an NMOS technology), from a batch of 30 items, in an
85°C/85% r.h. test, 3 failures (produced by humidity) were observed. To be
noted that in a normal life test this weakness of the components is hard to
detect, because the dissipated power (400 mW) produces an important
over-temperature of the die (about 30°C). So, in an intermittent functioning
the microprocessors would fail later, without a manifest reason and without any
early failures.
For very short functioning times (e.g. less than one hour per day), it is
difficult to predict the lifetime, because the leakage currents increase
considerably after connecting the supply voltage [12.30]. The responsibility
belongs to the ions (Cl-, Na+, etc.), which detach from the plastic package and
attack the aluminium oxide layer. On the other hand, it is clear that small
values of the duty cycle represent a higher risk than the rhythm 8 hours on /
16 hours off (plastic encapsulated LSI IC, with a dissipated power of
10...50 mW).
It is important to store the LSI ICs in a dry place, in order to avoid the
saturation of the plastic material with water; for an intermittent regime with
small duty cycle values, this saturation remains active for more than one year.

12.5
Reliability predictions

The operational life results accumulated by CNET (France) on the reliability of
plastic encapsulated bipolar ICs with a reduced integration degree have shown an
important difference between the graphic representations of failure rate vs.
number of gates according to MIL-HDBK-217 and to the CNET data, respectively
[12.7]. As one knows, the American military organisations rejected the plastic
package on the basis of the experiments made in conditions of excessive humidity
in Panama. This explains the ratio of about 25 between the results from
MIL-HDBK-217 and those from CNET. The pessimistic predictions of MIL-HDBK-217
about plastic encapsulation are not justified by the failure analysis of
integrated circuits, because the established nature of the defects is directly
linked to mechanisms which rarely refer to the plastic encapsulation.
The 1992 version of MIL-HDBK-217, called F [12.32], offers a reliability
improvement by a factor of 3.5, but the experimental results [12.7] demonstrate
a 50 times decrease of the failure rate. It seems that MIL-HDBK-217F
underestimates the reliability of PEMs, used mainly as commercial devices. For
instance, after 12 264 hours of functioning, MIL-HDBK-217F predicts 46 failures
for commercial devices and 23 failures for military devices. Experimentally, 19
failures were observed for both types.

12.6
Failure analysis

First, the attention of the manufacturers was concentrated on the wire bonds and
on the die-frame connections, which produced the highest failure percentage.
Then the evaluation of the package material was made, as well as a detailed
study of the various building methods for moulded packages.

Table 12.6 A history of failure rate improvements (in FITs) for plastic encapsulated ICs

Test                                     1970      1971      1974     1979   1992
Temperature cycling: 200 cycles,
-40/+150°C                               4x10^8    7x10^6    7x10^5   -      -
Thermal shocks: 200 cycles,
-65/+150°C                               6x10^8    4x10^7    10^6     -      -
Pressure test: 203 kN/m^2 (30 psia)
at 121°C / 96 hours                      7x10^8    -         3x10^6   -      -
Continuity test: 100°C                   5x10^6    -         2x10^4   -      -
Life test: 125°C                         -         -         -        30     0.2
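For reading the FIT figures in Tables 12.2 and 12.6, the unit can be converted
as follows (1 FIT = 1 failure per 10^9 device-hours); a small helper sketch:

```python
def fit_to_pct_per_1000h(fit):
    """Approximate percentage of devices failing in 1000 h, for small rates
    (where 1 - exp(-lambda*t) ~ lambda*t)."""
    return fit * 1e-9 * 1000 * 100

def fit_to_mtbf_hours(fit):
    """Mean time between failures (h), assuming a constant failure rate."""
    return 1e9 / fit

print(fit_to_pct_per_1000h(30))   # the 1979 life-test level in Table 12.6
print(fit_to_mtbf_hours(0.2))     # the 1992 level: an extremely long MTBF
```

The 30 FIT level of 1979 corresponds to only 0.003% of devices failing per 1000
hours, which shows how small the rates in these tables already are.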

As the solder joints and the connection wires are completely embedded in plastic
material, the moulded components are extremely resistant to vibrations and
mechanical shocks, even if a fracture (or a discontinuity) arises in a
connection wire. The two discontinuous elements remain held together as long as
the moulding environment continues to exercise the same compression force on
the two parts. Thermo-mechanical stress has, however, the tendency to weaken
the contact and, eventually, the electrical connection is broken. But, as soon
as the temperature changes, the contact is restored: this is an intermittent
contact. If the ambient temperature does not restore the electrical contact, an
open circuit arises. Such failures are generated by the thermo-mechanical
stress produced by the changes of the package dimensions. This kind of failure
arises mainly at the user, during the infant mortality period, and is hard to
detect.
Failure analysis allows important knowledge to be accumulated on the physical
and chemical mechanisms of the failures, subsequently urging the manufacturers
to progressively improve the reliability of plastic packages (see Table 12.6
[12.7][12.33]).

12.7
Technological improvements

An important step in the process of manufacturing better plastic packages was
made by the British Telecommunications (BT) Labs. The main evaluation tool was
obtained by the invention, in 1968, at BT Labs, of the technique called HAST
(Highly Accelerated Stress Test), in fact a non-saturating autoclave test. The
work done at BT Labs was synthesised by Sinnadurai [12.43] in 1996. Starting
from 1968, a first series of experiments was performed on bipolar transistors
and on specially designed moisture sensors (assembled onto ceramic substrates)
[12.44]. These devices were covered with 15 various plastic coatings. The
experiments, made on 500 test vehicles, lasted for 4000 hours, meaning
2 x 10^6 device hours, and showed that 4 silicone plastic encapsulants attain a
life duration equivalent to 25 years in tropical climates, while the remaining
11 plastic encapsulants continued to be hazardous to device reliability.
Meanwhile, various laboratories developed improved plastic coatings. To
passivate and mechanically protect the die, a silica glass was used with great
care, because the phosphorus concentration must not exceed 2% by weight;
otherwise a catastrophic increase of the aluminium corrosion appears [12.45].
RCA proposed a technological improvement [12.4, 12.47], replacing the aluminium
layer with a multilayer (titanium/platinum/gold) passivated with silicon
nitride. In this system, the silicon nitride assures the junction hermeticity,
the titanium layer improves the adherence of the dielectric, platinum is a
barrier layer against diffusion, and gold constitutes the conducting layer. The
electromigration speed is 10 times smaller for a gold layer than for an
aluminium layer.
In the 80s, another series of tests performed by BT Labs used improved coating
materials. In fact, two materials (a silicone resin mechanically protected by a
filled silicone, and a silicone resin protected by a filled phenolic)
demonstrated high reliability properties, for a time period equivalent to 25
years in tropical climates. This work was the basis for using plastic
encapsulants in high reliability applications. These materials were subsequently
tested and the obtained results confirmed the high reliability properties.
The high reliability plastic encapsulant allowed low cost, high performance
plastic chip carriers to be obtained, with the trade name EPIC, manufactured by
common printed wiring board techniques [12.51]. The reliability of these
components was assessed in tests made also on the same die encapsulated in
various commercially available packages. The results obtained from a damp heat
test are presented in Table 12.7.

It seems that one type of SOIC and two types of SLCC demonstrated higher
performances than the hermetic CerDIP package. These results also indicate that
humidity tests must be used for hermetic packages too.
In another series of tests, the reliability of hermetic packages was compared
with that of plastic ones, under temperate climate conditions (steady and
cyclic damp heat). The results showed that hermetic packages had no lifetime
advantage over plastic ones [12.17].
In the 80s, the idea of using silicone gels as plastic encapsulants for high
reliability ICs seemed an appropriate one. The IEEE Computer Society formed a
"Gel Task Force", with representatives from 24 companies, to pursue this
opportunity, starting from the earlier initiative of BT Labs. This team
evaluated polymer gel coatings for ICs. A number of 1440 IC chips, with
specific test patterns and protected by various glassivations, were tested with
5 gel types and one silicone RTV coating [12.52]. The tests were thermal shock,
salt spray and HAST. The results are summarised in Table 12.8.

Table 12.7 Results of a reliability test program: high humidity testing in a non-saturating
autoclave (108°C, 90% RH). SOIC = Small outline IC package, SLCC = Silicone junction coated
IC, CerDIP = Ceramic dual-in-line package (hermetic)

Package   IC type   Batch   Metallisation      Cumulative failures (%)
type                size                       168 h    336 h    672 h
SOIC      741       40      Al + Oxide         70       82       98
SOIC      741       40      Ti/Pt/Au + Si3N4   31       60       87
SOIC      741       40      Al + Oxide         1        1        5
SLCC      741       40      Al + Oxide         1        2        14
SLCC      741       40      Ti/Pt/Au + Si3N4   22       23       23
SLCC      741       20      Al + Oxide         93       94       94
SLCC      348       20      Al + Oxide         3        11       27
EPIC      348       20      Al + Oxide         28       69       84
CerDIP    741       20      Al + Oxide         93       93       94
EPIC      348       15      Al + Oxide         1        2        8

The identified failure mechanism was the breaking of the wires. It seems that a
thick coating caused the wires to break. Consequently, one must apply only thin
layers (about 25 µm thick). In fact, it is recommended to use three layers of
thinly applied gel (the "triple track" from Table 12.8).
In 1994, BT Labs evaluated the reliability of the better gels, as reported by
the IEEE Gel Task Force, applied to PIN and laser diodes and to GaAs ICs. As
accelerated test, they used damp heat (85°C, 85% RH). For PIN diodes, the gel
seemed to assure high reliability properties. Under the same test conditions,
laser diodes had an inconsistent behaviour, with a degradation of the
performances. On the contrary, gel coated GaAs ICs showed a remarkable
stability up to 6000 hours.

Table 12.8 Results of reliability tests performed by the IEEE Gel Task Force

Characterisation moments    Cumulative failures (%)
                            Thick Silicone-Coated        Thin Silicone Gel
                            Daisy chain   Triple track   Daisy chain   Triple track
Initial characterisation    71            36             0             0
After thermal shock         98            39             6             0
After salt spray            100           39             6             0
After HAST                  100           39             7             0
Final characterisation      100           39             7             0

12.7.1
Reliability testing of PCBs equipped with PEMs

So far, only tests performed at the component level were taken into account. But
a necessary step towards the use of Plastic Encapsulated Microcircuits (PEMs) in
military systems is to test printed circuit boards (PCBs) containing PEMs. In
July 1993, the US Air Force Electronics System Centre contracted DSD
Laboratories Inc. (Sudbury, Massachusetts, USA) to conduct an experiment on the
possibility of replacing the existing electronic hardware of DoD systems with
commercial hardware built using best commercial practice. This was called the
"Commercial Components Initiative - Ground Benign Systems". An important chapter
of this study was dedicated to PEMs. The experiments were made on PCBs. A system
operating in a temperature controlled ground shelter (but with a cold storage
requirement, at -40°C) was chosen. Military ICs, hermetically encapsulated, were
replaced with equivalent commercial PEMs. The results, reported by Kross and
Sicuranza [12.53], were impressive. By replacing the military components with
commercial ones, a cost saving of 88% was obtained. The experimental hardware
ran for 2 years (1994-1996, over 10^7 device hours) without a failure. Each PCB
(8 x 10 inch^2) contained 75...100 ICs, mostly digital technology. This
experiment proved that the operational reliability of the commercial devices is
high enough for use in military systems.

12.7.2
Chip-Scale packaging

The term chip-scale package (CSP) entered the industry's lexicon in 1994
[12.57]; rigorously defined, the perimeter of such a package is no more than 1.2
times the perimeter of the die it contains. These packages combine the best
features of bare die assembly and traditional semiconductor packaging, and
reduce overall system size, something devoutly to be desired in portable
electronic products. CSP is still taking its first steps into the marketplace;
unresolved issues include reliability, thermal performance, design, materials,
assembly, test, shipping, handling and the CSP-system interaction. This reflects
the newness of the technology and the fact that few CSPs are as yet in
production or use. J-STD-012, the first US standard, was put into effect in
1996, and has settled on 0.5-, 0.75-, and 1.0-mm pitches (pitch: separation
between adjacent conductors). Materials for chip-scale packages must meet at
least three criteria: reliability, cost-effectiveness, and manufacturability.
The Jet Propulsion Laboratory (Pasadena, California) and the Institutet for
Verkstadsteknisk Forskning (Molndal, Sweden) are currently evaluating the
reliability of CSPs from several international suppliers; data so far show
improving levels of reliability.

12.8
Can we use plastic encapsulated microcircuits (PEM) in
high reliability applications?

The improvement obtained in the reliability of commercial PEM led to the idea to
use such components in military systems. This approach, called the Acquisition
Reform, was possible only after careful tests performed on components and PCBs
(see 12.7). But the replacing of military parts with commercial ones implies also a
series of operations, having as a main tool the physics of failure approach, such as
[12.54]:
• investigating the utilisation environment and identifying the operating stresses,
• identifying the failure modes and, subsequently, the failure mechanisms and the acceleration models,
• using accelerated tests to precipitate relevant failures,
• analysing the failures, separating the populations affected by each failure mechanism, and evaluating the reliability level.
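Acceleration models for thermally activated mechanisms are typically of the Arrhenius type. The sketch below shows the corresponding acceleration-factor calculation; the 0.7 eV activation energy and the temperature pair are illustrative assumptions, not values from the text:

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(t_use_c: float, t_stress_c: float, ea_ev: float) -> float:
    """Arrhenius acceleration factor between use and stress temperatures (deg C)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_use_k - 1.0 / t_stress_k))

# Illustrative: a 0.7 eV mechanism stressed at 125 deg C against use at 25 deg C
af = arrhenius_af(25.0, 125.0, 0.7)
print(f"acceleration factor ~ {af:.0f}")
```

With a factor of this magnitude, one test hour at the stress temperature precipitates roughly the same damage as several hundred hours of use, which is what makes accelerated testing practical.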
Only after performing such a cycle can PEMs be used in military systems. But the work is not finished. ELDEC Corporation developed a system for implementing PEMs in high reliability applications [12.54]. The system contains three modules, operating at the level of design, procurement and assembly, respectively.
The design process control contains operations already in use (part selection and derating, functional and structural design, thermal design), but also new ones, based on the concurrent engineering approach. Even at the design phase, manufacturability, quality, reliability and also cost are taken into account. The most interesting point is design for reliability: prediction methods must be used, because the reliability engineer has to evaluate the reliability of the future device. Two main approaches exist: probabilistic and deterministic reliability prediction.
Probabilistic reliability prediction uses models such as MIL-HDBK-217, based on field data from equipment failures; the acceleration factor is the same for all components, and the failure rate is taken as constant for every component (an exponential model).
On the contrary, deterministic predictions are based on studying the physics of the failure phenomena: the failure rate models are lognormal or Weibull, and for each failure mechanism a dedicated acceleration factor and model is used. Of course, deterministic predictions are more expensive, but also more accurate.
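The practical difference between the two model families can be seen numerically: the exponential model's hazard (failure) rate is flat in time, while a Weibull hazard with shape β > 1 rises (wearout) and with β < 1 falls (infant mortality). A sketch with illustrative parameter values:

```python
def exp_hazard(lam: float, t: float) -> float:
    """Exponential model: constant hazard rate, independent of operating time t."""
    return lam

def weibull_hazard(beta: float, eta: float, t: float) -> float:
    """Weibull hazard h(t) = (beta/eta) * (t/eta)**(beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

# Illustrative parameters: constant 1e-6 failures/h vs. a wearout model (beta = 2)
for t in (1e3, 1e4, 1e5):
    print(f"t = {t:.0e} h: exp {exp_hazard(1e-6, t):.2e}, "
          f"weibull {weibull_hazard(2.0, 1e5, t):.2e}")
```

For β = 1 the Weibull hazard reduces to the constant 1/η, which is why the exponential model is the special case implicitly assumed by handbook methods such as MIL-HDBK-217.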
The procurement process begins with vendor qualification. Then part qualification is performed: the selection of parts from a Qualified Parts List (QPL) or, in a more general form, from a Qualified Manufacturer List (QML). Under this system, the vendor gets approval for an entire production line instead of each part type. Using Taguchi arrays in fractional factorial experiments requires grouping the parts into part families; a part family contains parts having the same device technology, similar die size and package, and the same package technology and manufacturer. This method was used by ELDEC Corporation and by the CALCE Centre of the University of Maryland for the reliability evaluation of various part families. Finally, part screening is performed after part qualification. For military systems there are qualification requirements for 100% screening, with the purpose of identifying wearout failure mechanisms and evaluating the useful life capability of a part. A PEM screening programme contains up to 50 temperature cycles followed by electrical characterisation. Screening thus seems to be a necessity for high reliability applications.
Finally, the assembly process of PEMs may itself induce many failure mechanisms. Consequently, this process must be carefully controlled, including all materials and operations involved in circuit card assembly. The same qualification procedure described for parts can be applied to the qualification of materials and processes. Environmental stress screening (ESS) at the end of assembly can reveal failures of this type. In conclusion, a reliability assurance programme must be drawn up, addressing the following areas: system definition,
operating stress conditions, procured component reliability, design reliability,
process reliability, life cycle management, reliability prediction, and reliability
demonstration. Moreover, ELDEC implemented a computer-controlled part
reliability tracking system, with data available from many sources, such as: supplier
test data, screening data from incoming components, process development and
control data, design verification testing data, product qualification testing data, and
field failure data.
The replacement of military components with commercial ones produced a strong reaction from some DoD circles. They claimed that the specifications for commercial parts are too narrow for DoD requirements, and that even if the reliability of PEMs has improved, there are no quantitative results about this improvement. Also, PEM manufacturers tend to serve their large-volume customers first, and DoD does not need large quantities of any one PEM type.
But the potential advantages of using PEMs proved to be impressive [12.53]: the cost savings, the higher packaging density, and the increased number of available types.
In a detailed analysis, Demko [12.55] takes into account the possible dangers linked to the use of Commercial-off-the-Shelf (COTS) equipment replacing military equipment. In 1996, he said that in the next decade the reliability challenge would be to manage COTS so as to take advantage of the promise and to avoid the dangers.
References

12.1 Hamill, A. T. (1968): Westinghouse Goldilox integrated circuits offer military meeting in
plastic packages. DOD/NASA Industry Meeting on Plastic Encapsulated Semiconductor
Devices, Washington D.C., May 15
12.2 Baird, S. S.; Peattie, C. G. (1968): Present reliability status of plastic encapsulated semi-
conductors and evaluation of their potential for use in military systems. DOD/NASA In-
dustry Meeting on Plastic Encapsulated Semiconductor Devices, Washington D.C., May
15
12.3 Anixter, B. (1968): Plastic semiconductors? DOD/NASA Industry Meeting on Plastic
Encapsulated Semiconductor Devices, Washington D.C., May 15
12.4 Flood, J. L. (1968): Reliability of plastic integrated circuits. DOD/NASA Industry Meeting
on Plastic Encapsulated Semiconductor Devices, Washington D.C., May 15
12.5 Diaz, R. P. (1968): Plastic-epoxy semiconductors. In: DOD/NASA Industry Meeting on
Plastic Encapsulated Semiconductor Devices, Washington D.C., May 15, 1968
12.6 Fick, S. R. (1971): Test of plastic encapsulated semiconductors. Research Report, Texas
University, May
12.7 McCoog, J. R. (1997): Commercial component integration plan for military equipment
programs: reliability predictions and part procurement. Proceedings of the Annual Reli-
ability and Maintainability Symp., Jan. 13-16, Philadelphia, Pennsylvania (USA), pp.
100-110
12.8 Taylor, C. H. (1976): Just how reliable are plastic encapsulated semiconductors for mili-
tary applications and how can the maximum reliability be obtained. Microelectronics and
Reliability, 15, pp. 131-134
12.9 Fehr H. G. (1970): Microcircuit packaging and assembly - state of the art. Solid State
Technology. August, pp. 41-47
12.10 André, G. et al. (1972): Fiabilité des circuits intégrés à encapsulation plastique. Actes du
colloque international "Les composants de haute fiabilité", Toulouse, 6-10 Mars, pp. 143-
159
12.11 André, G.; Regnault, J. (1972): Problèmes de fiabilité liés à l'encapsulation plastique des
circuits intégrés. L'Onde électrique, vol. 2, no. 3, pp. 121-125
12.12 Brauer, J. B. et al. (1970): Can plastic encapsulated microcircuits provide reliability econ-
omy? Proceedings of the International Reliability Physics Symp., Las Vegas, pp. 61-72
12.13 Brauer, J. B. (1972): Military microcircuit packaging. The Electronic Engineer, July, pp.
30-31
12.14 *** (1973): Tests show epoxy IC packages to have reliability edge. Electronics, April 26,
p. 25
12.15 Hnatek, E. (1970): Plastic ICs entice military. Electron Device News, November 15, pp.
43-47
12.16 *** (1972): Epoxy B. National Semiconductor Corp.
12.17 Condra, L. et al. (1992): Comparison of plastic and hermetic microcircuits under tem-
perature cycling and temperature humidity bias. IEEE Trans. on Components, Hybrids and
Manufacturing Technology, vol. 15, no. 5, Oct. 1992, pp. 640-650
12.18 Weill, L. (1993): Reliability evaluation of plastic encapsulated parts. IEEE Trans. on
Reliability, vol. 42, no. 4, December, pp. 563-540
12.19 Hnatek, E. R. (1973): Epoxy package increases IC reliability at no extra cost. Electronic
Engineering, February, pp. 66-68
12.20 Feldt, E. J.; Hnatek, E. R. (1972): High reliability consumer ICs. Proceedings of the IEEE
Reliability Physics Symp., pp. 78-81
12.21 Reich, B.; Hakim, E. B. (1972): Environmental factors governing field reliability of plastic
transistors and integrated circuits. Proceedings of the IEEE Reliability Physics Symp., pp.
82-87
12.22 Goarin, R. (1978): La banque et le recueil des données de fiabilité du CNET. Actes du
Colloque International sur la Fiabilité et la Maintenabilité, Paris, July, pp. 340-348
12.23 Lawson, R. W.; Harrison, J. C. (1974): First Int. Conf. on Plastics in Telecommunications,
pp. 1-30
12.24 Hakim, E. B. (1978): US army Panama field test of plastic encapsulated devices. Microe-
lectronics and Reliability, vol. 17, no. 3, pp. 387-392
12.25 Schultz W.; Gottesfeld S. (1994): Reliability considerations for using plastic-encapsulated
microcircuits in military applications. Advanced Microelectronics Qualification / Reli-
ability Workshop, August, pp. 1-12
12.26 Bajenesco, T. (1984): Microcircuits enrobés de plastiques: fiabilité en fonctionnement
intermittent. La Revue Polytechnique, no. 1446, pp. 17-18
Bajenesco, T. (1979): Problèmes de la fiabilité des microcircuits bipolaires et MOS. Cycle
de conférences à l'École Polytechnique Fédérale de Lausanne, avril-mai
Bajenesco, T. (1976): Quelques aspects de la fiabilité des microcircuits avec enrobage
plastique. Bulletin ASE/UCS, vol. 66, pp. 880-884
Bajenesco, T. I. (1976): Microcircuits: fiabilité et contraintes. La Revue Polytechnique,
no. 11, pp. 1051-1095
Bajenescu, T. I. (1997): A personal view of some reliability merits of plastic-encapsulated
microcircuits versus hermetically sealed ICs used in high-reliability systems. Proceedings
of the 8th European Symposium on Reliability of Electron Devices, Failure Physics and
Analysis (ESREF '97), Bordeaux (France), October 7-10
12.27 Strohle, D. (1983): Zuverlässigkeit von plastikverkapselten LSIs bei intermittierendem
Betrieb. NTG-Fachberichte, Band 82, pp. 91-96
12.28 Bajenesco, T. (1981): Problèmes de la fiabilité des composants électroniques actifs ac-
tuels. Masson
12.29 Strohle, D. (1981): Feuchtprobleme bei LSIs. NTG-Fachberichte, Band 77, p. 107
12.30 Sim, S. P.; Lawson, R. W. (1979): The influence of plastic encapsulants and passivation
layers on the corrosion on thin aluminium films subjected to humidity stress. Proc. of the
17th Annual Reliability Symp., pp. 103-107
12.31 Gardner, D. S. (1985): Layered and homogenous films of aluminium / silicon with tita-
nium and tungsten for multilevel interconnects. IEEE Trans. on Electron Devices, vol.
ED-32, no. 2, pp. 174-183
12.32 MIL-HDBK-217F, Notice 1, Reliability prediction of electronic equipment, 10 July 1992,
US Department of Defense
12.33 Khajezadeh, M.; Rose, A. S. (1977): Détermination de la fiabilité des puces hermétiques
de circuits intégrés en boîtiers plastique. L'Onde électrique, vol. 57, no. 3, pp. 206-212
12.34 Flood, J. L. (1972): Reliability aspects of plastic encapsulated integrated circuits. Proceed-
ings of the IEEE International Reliability Physics Symp., pp. 95-99
12.35 Fischer, F. (1970): Moisture resistance of plastic package for semiconductor devices.
Proceedings of the International Reliability Physics Symp., pp. 94-100
12.36 Peck, D. S.; Zierdt, C. H. Jr. (1973): Temperature-humidity acceleration of metal-elec-
trolysis failure in semiconductor devices. Proceedings of the International Reliability
Physics Symp., pp. 146-152
12.37 Kolesar, S.C. (1974): Principles of corrosion. Proceedings of the International Reliability
Physics Symp., pp. 155-159
12.38 Arciszewski, H. (1970): Analyse de fiabilité des dispositifs à enrobage plastique. L'Onde
électrique, vol. 50, no. 3, pp. 230-240
12.39 Gallace, L. J.; Khajezadeh, M.; Rose, A. S. (1978): Accelerated reliability evaluation of
trimetal circuit chips in plastic packages. Proceedings of the International Reliability
Physics Symp., pp. 224-228
12.40 Neighbour, F.; White, B. R. (1977): Factors governing aluminium interconnection corro-
sion in plastic encapsulated microelectronic devices. Microelectronics and Reliability, vol.
16, pp. 161-164
12.41 Paulson, W. M.; Kirk, R. W. (1974): The effect of phosphorus-doped passivation glass on
the corrosion of aluminium. Proceedings of the International Reliability Physics Symp.,
pp. 172-179
12.42 Parker, P.; Webb, C. (1992): A study of failures identified during board level environ-
mental stress testing. IEEE Trans. on Comp., Hybrid and Manuf. Tech., vol. 15, pp. 1086-
1092
12.43 Sinnandurai, N. (1996): Plastic package is highly reliable. IEEE Trans. on Reliability, vol.
45, no. 2, June, pp. 184-193
12.44 Sinnandurai, N. (1981): An evaluation of plastic coatings for high reliability microcircuits.
Microelectronics J., vol. 12, pp. 30-38
12.45 Licari, J. J. (1970): Plastic coatings for electronics. McGraw-Hill Book Company, New
York
12.46 Lepselter, M. P.; Waggener, H. A.; McDonald, R. W.; Davis, R. E. (1965): Beam-lead
devices and integrated circuits. Proceedings of the IEEE, vol. 53, no. 4, pp. 405-409
12.47 Burkitt, A. (1975): Solid-state progress in circuits and devices. Electronics Engineering,
October, pp. 50-52
12.48 Baxter, G. K.; Anslow, J. W. (1977): High temperature thermal characteristics of micro-
electronic packages. IEEE Trans. on Parts, Hybrids and Packaging, vol. PHP-13, no. 4, pp.
385-390
12.49 Curran, L. (1970): Plastic ICs get foot in military door. Electronics, May 11, pp. 127-132
12.50 Reich, B. (1970): Plastic semiconductor devices and integrated circuits for military appli-
cations. Solid State Technology, January, pp. 53-56
12.51 Sinnandurai, N. (1985): EPIC: a cost effective plastic chip carrier for VLSI packaging.
IEEE Trans. Components, Hybrids, Manufacturing Technology, vol. CHMT-8, Sept., pp.
386-390
12.52 Balde J. W. (1991): The effectiveness of silicone gels. IEEE Trans. Components, Hybrids,
Manufacturing Technology. vol. 14, June, pp. 352-365
12.53 Kross, E. J.; Sicuranza, M. A. (1996): Commercial-Components Initiative: Ground
Benign Systems - Plastic Encapsulated Microcircuits. IEEE Trans. on Reliability, vol. 45,
no. 2, June, pp. 180-183
12.54 Condra, L. W. et al. (1994): Using plastic-encapsulated microcircuits in high reliability
applications. Proceedings of the Annual Reliability and Maintainability Symp., Jan. 24-
27, Anaheim, California (USA), pp. 481-493
12.55 Demko, E. (1996): Commercial-Off-The-Shelf (COTS): a challenge to military equipment
reliability. Proceedings of the Annual Reliability and Maintainability Symp., Jan. 22-25,
Las Vegas, Nevada (USA), pp. 7-12
12.56 Fox, W. M. (1978): Semiconductor devices and passive components. The Bell System
Technical Journal, vol. 57, no. 7, pp. 2405-2434
12.57 Thompson, P. (1997): Chip-scale packaging. IEEE Spectrum, August, pp. 36-43
12.58 Băjenescu, T. I. (1997): A personal view of some reliability merits of plastic-encapsulated
microcircuits versus hermetically sealed ICs used in high-reliability systems. Proceedings
of the 8th European Symposium on Reliability of Electron Devices, Failure Physics and
Analysis (ESREF '97), Bordeaux (France), October 7-10
12.59 Băjenescu, T. I. (1997): Zuverlässigkeit komplexer elektronischer Systeme. Sommerkurs,
Haus der Technik (München, Germany), July 14-16
12.60 Băjenescu, T. I. (1998): A particular view of some reliability merits, strengths and limita-
tions of plastic-encapsulated microcircuits versus hermetically sealed microcircuits util-
ised in high-reliability systems. Proceedings of OPTIM '98, Brașov (Romania), May 14-
16, pp. 783-784
12.61 Bazu, M. et al. (1984): Thermal and mechanical stress in rapid estimation of reliability.
Proceedings CAS 1984, pp. 257-260
12.62 Bazu, M. et al. (1985): SRER - a system for rapid estimation of the reliability. Proceed-
ings of 6th Symp. on Reliab. in Electronics RELECTRONIC '85, Budapest (Hungary), pp.
267-271
12.63 Bazu, M. et al. (1987): Failure mechanisms thermally and electrically accelerated. Pro-
ceedings CAS 1987, pp. 53-56
12.64 Bazu, M. et al. (1987): Reliability of semiconductor components in the first hours of
functioning at high temperature. Electrotechnics, Automatics and Electronics, no. 1, pp.
10-15
12.65 Bazu, M. et al. (1989): Behaviour of semiconductor components at temperature cycling.
Revue Roumaine des Sciences Techniques, no. 1, pp. 151-154
12.66 Bazu, M. et al. (1989): Rapid estimation of reliability changes for semiconductor devices.
Proceedings of Ann. Semicond. Conf. CAS 1989, October 7-10, pp. 399-402
12.67 Bazu, M. and llian, V. (1990): Accelerated testing of integrated circuits after storage.
Scandinavian Reliability Engineers Symp., Nyköping, Sweden, October
12.68 Bazu, M. et al. (1990): Rapid estimation of the reliability of a batch of semiconductor
components. Quality, Reliability and Metrology, no. 2-3, pp. 92-94
12.69 Bazu, M.; Tăzlăuanu, M. (1991): Reliability testing of semiconductor devices in a humid
environment. Proc. Ann. Reliab. and Maintain. Symp., Orlando, Florida (USA), pp. 237-
240
12.70 Bazu, M. et al. (1997): MOVES - a method for monitoring and verifying the reliability
screening. Proc. of the 20th Int. Semicond. Conf. CAS '97, October 7-11, Sinaia, pp. 345-
348
12.71 Băjenescu, T. I.; Bazu, M. (1999): Semiconductor devices reliability: an overview. Proc.
of the European Conference on Safety and Reliability, Munich, Garching, Germany, 13-17
September, Paper 31
13 Test and testability of logic ICs

13.1
Introduction

Electronic components are now used in all fields of activity. Technology grows so rapidly that manufacturers and users must give increasing attention to component testing. To do this, computer-assisted equipment is absolutely necessary. Acquiring such modern computerised equipment requires high investment, not only for the machines, but also for personnel training and for drawing up the control and management software.
The testing problem can be formulated as follows: determine the input sequences (or their lengths) such that these sequences, applied to the circuit, allow a decision to be made from the observed outputs: whether the circuit is defective or not. The test result establishes whether the circuit has failed (detection). Localisation bounds the failure to a region inside the component. Detection and localisation together are known as diagnosis.
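The detection idea can be made concrete on a toy combinational example: an input vector detects a fault exactly when the fault-free and faulty circuits give different outputs. A minimal sketch for a single AND gate with its output stuck at 0; the gate and the fault are illustrative, not taken from the text:

```python
def and_gate(a: int, b: int) -> int:
    """Fault-free reference model of the circuit under test."""
    return a & b

def and_gate_output_sa0(a: int, b: int) -> int:
    """The same gate with an explicit defect: output stuck at 0."""
    return 0

# A vector detects the fault when the two outputs differ.
vectors = [(0, 0), (0, 1), (1, 0), (1, 1)]
detecting = [v for v in vectors if and_gate(*v) != and_gate_output_sa0(*v)]
print(detecting)  # only (1, 1) propagates the stuck-at-0 to the output
```

Localisation would then compare the observed behaviour against such fault models for different internal fault sites.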
Testing simple Logic Integrated Circuits (LICs), such as Small Scale Integration (SSI) and Medium Scale Integration (MSI) devices, can be performed without difficulty using truth tables or patterns defined by an algorithm. For Large Scale Integration (LSI) ICs the patterns must contain special perturbation tests, which complicate the test programmes. As complexity grows, the testers must become more powerful (testing more quickly to simulate the speed of use, with more precise timing, more versatile), so that, for example, for microprocessors an input pin can be switched very quickly into an output pin, and vice versa.
The defect types present in the population to be tested have a direct impact on the quality and effectiveness of the test, and also on the quality after test. Defects do not necessarily affect the item's functionality. They are caused by flaws during design, production, or installation. Unlike failures, which always appear in time, randomly distributed, defects are present at t = 0. Table 13.1 attempts a classification of defects based on their effects, with the aim of better estimating the possibilities of the test methods.
A defect in a component is a physical imperfection that can cause inadequate operation (outside specifications; nonconformity); a failure is a defect symptom (sticking at 0 or at 1, short-circuit, etc.) which can be permanent or intermittent.

An error is a wrong operation of a component due to a breakdown [13.1]; the error latency is the time between the appearance of the breakdown and its expression as an error.

Table 13.1 Classification of defects depending on their effects [13.1], [13.2]

Type                    Model or effect                      Example
Explicit defect         Stuck-at-0 (or -1)                   Short-circuit of a connection
                                                             to electrical ground
                        Interruption                         Interruption of a connection
Implicit defect
- Marginal defect       Function affected by the margin      Function affected by: supply
  (soft failure)        of the applied quantity, inside      voltage; temperature; input
                        the specified limits                 immunity margin; output load;
                                                             available propagation time;
                                                             setup time; hold time
- Conditional defect    Function affected by the logic       Static or dynamic crosstalk;
                        state (inputs or memory state)       parasitic coupling between
                                                             memory cells via a common bus
- Combined defect       Marginal and conditional at
                        the same time
Intermittent defect     Reproducibility cannot be checked

13.2
Test and test systems

Except in the case of custom-ordered designs, users know neither the logic diagram nor the electrical schematic of the LIC. Sometimes even the functional description is incomplete. For LSI, VLSI or ULSI, the manufacturers do not wish to deliver the schematic, for industrial property reasons. Finally, the user does not know a priori the most probable defects. Under these conditions, users often generate the test programme manually, by exploiting the function and making the necessary corrections to the programme until a satisfactory test effectiveness is obtained.
13.2.1
Indirect tests

After manufacturing is finished, the logic board is tested. To develop the functional test programme, a model that simulates the LIC is needed. The main obstacle is that the structural simulation models of LSI/VLSI/ULSI are too cumbersome to be used for automatic generation at the logic-board level. In addition, these models can vary from one LIC source to another. Consequently, one may use a functional simulation model that fits the assisted manual generation methods.
Certain internal LIC defects are not detected by these programmes, and this is more probable for the more complex LICs: an additional reason for preferring a very severe test of LSI/VLSI/ULSI before mounting.
A method that eliminates the previous gaps is to precede the functional test by a direct in-situ test. This requires that:
• each complex LIC has good direct testability;
• the LIC to be tested, on the test board, is not influenced by its environment (except the supply), the inputs and outputs being logically switched off from the rest of the board.
Note that dynamic marginal tests can be hampered by parasitic connections (unlike the direct test).
Test systems
The choice of a test system depends on numerous parameters, such as the volume of LICs to be tested daily, the variety of LICs, the variations of possible loads, the variety of applications, the volume used in each application, and the usage (evaluation, qualification of production, application type).
Evaluation and qualification needs are best met by a universal testing system. At the input control of the user, the following cases can be distinguished:
• a small quantity of each type, a great variety of types, and a demand for safety-critical circuits - for example in aeronautics (universal testing system);
• a great quantity of a small variety of types (context simulator: an adapted test system, capable of exclusively "exercising" the LIC as it would operate in service, with marginal conditions);
• a small quantity of a small variety of types (external test service).

13.3
Input control tests of electronic components

A reliable IC must be stable and operational. It is not difficult to determine whether the IC is operational (for the given specifications at a given time), at least for SSI and MSI. On the contrary, to determine its stability, the IC must operate for a time (pre-ageing) and then be tested to see whether parameter degradation has occurred. IC reliability is influenced by two factors:
• the mechanical stability (because of the different expansion coefficients of the constitutive elements of a plastic encapsulated IC);
• the electrochemical stability (strongly influenced by temperature).


The usual failure mechanisms due to mechanical instability are:
• separation between the die and the lead-frame;
• creation of a moisture path, produced by expansion;
• separation between a connection and its support;
• acceleration of a crack by pressure on the package surface.
The thermal shock test simulates well the switching on and off, and also has the advantage of reduced cost. The IC is first immersed in a suitable liquid at a temperature of +100±5, +125±5 or +150±5 °C for at least 5 minutes, and afterwards in another liquid at 0±5, −55±5 or −65±5 °C respectively, for at least 5 minutes. The transfer time must be less than 10 s, and the test comprises 10 complete cycles (MIL-STD-883, method 1011.2). A centrifugal test verifies the mechanical integrity of the IC. Concerning the electrochemical stability, for pre-ageing the IC (stabilisation of electrical characteristics, acceleration of chemical degradation modes) the high-temperature stability test (24...48 h at +150 °C, if the IC data sheet allows it) and the static or dynamic burn-in are used. For this last test, the time/temperature relation determines the acceleration factor (and hence the operational life at 25 °C) and the burn-in effectiveness (the defect percentage at this test). The most important parameters of an effective burn-in are temperature, duration and circuit configuration¹ [13.3].
Some users do not organise a systematic burn-in at component level, but try to substitute it with a burn-in at the level of equipped boards. Such a test is a weak substitute for component burn-in, for three main reasons: (i) most equipped boards cannot be exposed to (or operated at) high temperatures; (ii) at a lower temperature, the test time cannot be extended enough to cover the infant mortality period, and when the equipped board is tested, an IC cannot be tested to its full specifications (at most to 50% of the specification); (iii) because some of the detected failures are infant mortality failures, they pass through the repair and renewal processes, and the main failures may appear later.

13.3.1
Electrical tests

The electrical tests verify the integrity of the component and allow the real parameter limits to be compared with the limits specified in the technical conditions, at an ambient temperature of +25 °C. The cost of modern computerised test equipment varies between 100 000 and 1 000 000 $, depending on the desired test level, on LIC complexity, and on the user's needs. One may note that there is a direct
¹ The more complex the IC, the longer the infant mortality period and, consequently, the longer the burn-in duration must be. For example: SSI - 48 h; MSI - 96 h; LSI - 168 h; VLSI or ULSI - 336 h. Configurations: A: high temperature, reverse bias; B: high temperature, forward bias; C: high temperature, reverse bias, loaded inputs (SSI); D: parallel excitation (LSI); E: oscillator; F: extended temperature. Stress temperature: +125 °C for ceramic packages, +100 °C for plastic packages, +70 °C for circuits having a high level of electric power supply.
correlation between the confidence level of the required tests and the equipment cost. No system can work with an effectiveness of 100%, because (a) the parameter that causes the incorrect operation may not be measured or specified; (b) a LIC can degrade or deteriorate after this test, as a result of incorrect handling or of soldering conditions. Each user must decide whether the optimal solution is to buy the equipment for all these electrical tests for the components used, or to call on the services of an independent test laboratory. It is important to notice that the electrical test only detects and isolates the defective ICs produced by the previous processes; it does not contribute to the reliability of good components.

13.3.2
Some economic considerations

Taking into account the continuous growth of the integration level of LICs, the user must examine at least the following questions:
• why and how do I test?
• must I perform a 100% dynamic test?
• how does the relationship between test cost and LIC cost evolve as LIC complexity grows?
Generally, for resistors, capacitors, small-signal diodes and small-signal transistors it is not economically desirable to perform 100% selection tests, burn-in tests and/or electrical tests. On the contrary, practical experience demonstrates that for LICs, memories, microprocessors, hybrids and power components it is recommended to perform such tests at component level. There is no universal answer for all types of users.
Although LICs are tested several times during the fabrication cycle, a certain (small) percentage of the delivered devices will fail at the user. This small percentage can cause serious difficulties, especially for large systems. We know that the probability of operation of an equipped board is determined by the probability of good operation of its components. For example, a board with 100 ICs, each IC having a 99% probability of good operation, will have a probability of good operation of (99%)^100 = 37%. This means that if only 1% of the ICs used fail, almost two thirds of the equipped boards will be defective in operation. This example illustrates the great importance of the input control and of the control of equipped boards, both controls being performed by the user.
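The board-level figure above can be checked directly, assuming independent IC failures:

```python
p_ic = 0.99   # probability that a single IC operates correctly
n_ics = 100   # ICs on the equipped board

p_board = p_ic ** n_ics
print(f"P(board operates) = {p_board:.1%}")   # about 37%
print(f"defective boards  = {1 - p_board:.1%}")
```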
To be persuaded of the need for a 100% input control at the user, one must analyse: (a) how much the absence of an input control would cost, and (b) what minimum quantity of ICs (bought by the user) justifies a profitable 100% input control. For the first question, suppose that the input control is made with a tester that costs 25 000 $; suppose, too, that the operator costs are 15 000 $/year and that a year contains 7·10⁶ working seconds. Suppose that 5 s are needed to test one IC. Then the average test cost for one IC is: 5 × (25 000 + 15 000)/(7·10⁶) ≈ 0.0285 $. If we assume, for example, an average IC failure rate of 3% at input control (only as a rough example!), we must spend 0.0285 $ × 100 = 2.85 $ to find 3 defective ICs out of 100 ICs, i.e. 2.85 $ / 3 = 0.95 $ to identify each defect. In other words, a user that utilises
normally 100 000 ICs each year spends, due to the absence of the 100% input control, 95 000 $. The second question was: what is the minimum quantity of ICs (bought by the user) that justifies a profitable 100% input control? To obtain the answer, we use the following equation:
Cu·K = Pt + K·M/Nh (13.1)
where Cu is the unit cost of a defective IC (0.95 $), K is the annual quantity of tested ICs, Pt is the tester price (25 000 $), M is the manual labour cost per hour (10 $), and Nh is the number of ICs tested in an hour (say 500 ICs). With these values we reach the conclusion that, beginning with an annual quantity of 27 000 ICs, the 100% input control is justified.
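Rearranging equation (13.1) gives the break-even quantity K = Pt/(Cu − M/Nh); a quick check of the 27 000 figure, using the values quoted in the text:

```python
c_u = 0.95       # unit cost of a defective IC, $
p_t = 25_000.0   # tester price, $
m = 10.0         # manual labour cost, $/h
n_h = 500.0      # ICs tested per hour

k_breakeven = p_t / (c_u - m / n_h)
print(f"break-even annual quantity ~ {k_breakeven:.0f} ICs")  # close to 27 000
```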

13.3.3
What is the cost of the tests absence?

The answer to this question is given by the equation

Cu = A·B·C + A·D·E·F (13.2)

where Cu is the unit cost of a defective IC; A is the total percentage of defective ICs (1.5%); B is the percentage of defective ICs found at the user during equipment fabrication (75%); C is the average cost of a factory repair due to a defective IC (30 $); D is the percentage of ICs with a long-term defect risk (25%); E is the percentage of defective ICs arising during the warranty period; F is the average cost of an equipment repair (during the warranty period, at the user) due to a defective IC (1300 $).
With the indicated values, we obtain Cu = 0.95 $/IC.
For the general case of digital ICs (excluding LSI/VLSI/ULSI), linear ICs, and
discrete components, the corresponding figures are presented in Table 13.2.

Table 13.2 Average indicative figures of the parameters A...F and the unit cost for discrete
components, linear and digital ICs [13.5]

Parameter   ICs                 Discrete components
            linear   digital    transistors   diodes   R; C
A (%)       3        1.5        1             0.5      0.2
B (%)       85       70         50            50       25
C ($)       30       30         25            25       25
D (%)       20       20         10            5        8
E (%)       50       50         30            20       15
F ($)       1300     1300       100           100      100
Cu ($)      3.91     1.96       0.032         0.05     0.003
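Eq. (13.2) can be checked numerically. Note that the worked example in the text gives no value for E; the sketch below (our own, with hypothetical variable names) solves Eq. (13.2) for the E that reproduces the quoted Cu = 0.95$:

```python
def unit_defect_cost(A, B, C, D, E, F):
    """Eq. (13.2): Cu = A*B*C + A*D*E*F (percentages as fractions, costs in $)."""
    return A * B * C + A * D * E * F

# Worked example of the text: A = 1.5%, B = 75%, C = 30$, D = 25%, F = 1300$.
# The text quotes Cu = 0.95$ but gives no value for E; solving Eq. (13.2) for E:
A, B, C, D, F = 0.015, 0.75, 30.0, 0.25, 1300.0
Cu = 0.95
E = (Cu - A * B * C) / (A * D * F)
print(round(E, 3))                                   # ≈ 0.126, i.e. E ≈ 12.6%
print(round(unit_defect_cost(A, B, C, D, E, F), 2))  # 0.95 (consistency check)
```

The implied E ≈ 12.6% is only the value consistent with the quoted 0.95$, not a figure stated in the text.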

13.4
LIC selection and connected problems
The selection of LICs strongly influences the final price of industrial products. That
is why the user wishes to know if the operation of an IC is modified by the
presence of a physical defect. With the advent of LSIs/VLSIs/ULSIs having
non-repetitive structures (microprocessors, custom circuits), the problem became
more complicated, and the verification of the IC function requires more and more
computerised equipment. Irrespective of the test method, the test principle is
always the same: an input sequence is applied at the inputs and, at the outputs, one
observes either a function depending on these values, or all the successive values
obtained at all outputs.
Initially the logic test was studied for combinatory circuits. The patterns for
these LICs are essentially of two types [13.6]:
• probabilistic: an ensemble of random input vectors is applied simultaneously to
the circuit to be tested and to a reference model (either material or simulated);
each different behaviour indicates an error.
• deterministic: the input vectors are determined by examining the circuit. This
second category regroups (with the exception of a manual² method) the
functional methods (which take into account only the circuit operation) and the
structural methods (which examine the structure of the network that realises the
logical function of the IC).
This last approach can be divided into algebraic methods (which manipulate
equations describing the various components of the circuit) and heuristic methods
(which try to propagate the effects of a defect to a primary output by sensitising a
path through the circuit), the path-sensitising algorithms. Between the probabilistic
and deterministic³ approaches we may place random generation: the input vectors
are randomly generated, and a simulation of the circuit makes it possible to know
which defects the sequence can detect.
All these methods - with the exception of the functional methods, which detect
errors - are based on a more or less fine knowledge of the potential defects of the
circuit, but few defect types lend themselves to effective modelling.
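The probabilistic approach can be illustrated with a toy sketch: random input vectors are applied simultaneously to a reference model and to the circuit under test, and any differing behaviour flags an error. The circuit, the fault and the names are invented for illustration, not taken from [13.6]:

```python
import random

def reference(a: int, b: int, c: int) -> int:
    """Reference model of a small combinational network: (a AND b) OR c."""
    return (a & b) | c

def under_test(a: int, b: int, c: int) -> int:
    """Same network with a hypothetical stuck-at-0 fault on input c."""
    return (a & b) | 0          # c is stuck at logic 0

random.seed(1)
for trial in range(100):
    vec = [random.randint(0, 1) for _ in range(3)]
    if reference(*vec) != under_test(*vec):
        print("fault detected by vector", vec, "after", trial + 1, "trials")
        break
```

Any vector with c = 1 and a·b = 0 exposes this particular fault, so a short random sequence detects it with near certainty.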
Concerning the controls during the different phases of the LIC lifetime [13.7], we
distinguish production controls at the end of fabrication, performed by the
manufacturer; input controls, made by the user at the reception of the circuits or just
before mounting them on the boards; and maintenance controls, intended to detect
the defects due to a degradation that appears only after a period of operation.
The defect causes are numerous and various: marking errors, badly adjusted
measuring instruments, wrong programming, etc. Another defect category results from

² Manual is not a method in itself; it is only a manner in which the method is applied.

³ We make the difference between pattern-generation methods and test methods. The difference
between a probabilistic test and a random test does not exist at the level of pattern generation, but
only at the level of output observation: for the random test we compare a reference with a defective
circuit (simulated or real), while for the probabilistic test we make statistical measurements on the
value that the output should have.

the imperfections introduced by the chemical methods used for circuit elaboration.
The major part of these defects is detected at the production control level. Some
of them will bring out degradations only after long storage or operation periods.
For the detection of the anomalies appearing during the life of a component, two
types of tests are used: (a) parametric tests, static or dynamic (applied at the end
of component fabrication); (b) component logic tests - inside a defective system -
at the end of fabrication or in maintenance; they verify whether the logic operation
of the component is ensured in environmental conditions similar to the operational ones.

13.4.1
Operational tests of memories

The testing of memories [13.8]...[13.15] - in other words, the detection and the
localisation of defects - is a problem of increased importance, due to the rapid
evolution of memory technology registered in the last decade. The needs differ
depending on the considered activities; we discern the following tests: (i) tests
during fabrication; (ii) tests at the end of fabrication; (iii) acceptance tests; (iv)
maintenance tests. Generally speaking, the first two tests require detection and
localisation; the other two can be limited to detection.
Although the general LIC test principles and techniques can be applied to
memories, the nature of memories gives their tests the following particularities:
• The test method is, in general, sequential, whereas the test of other circuits is
rather combinatorial; in other words, the order of the sequences depends on the
results obtained during the test [13.16]. This is explained by the test structure.
• The sequence generation is made on-line, while for the other LICs this generation
is accomplished prior to the effective test, with the aid of a general programme.
• The sequence synthesis with the aid of general programmes is not economically
desirable; a dedicated programme for a memory is preferable.
• The dynamic test (at real operation speed) is practically compulsory.
• The selection of test points is more restrictive, since the ideal test points are
the memory points themselves. The absence of direct access to these points, to the
addressing register outputs and to the decoder is one of the principal causes of the
complexity of memory testing.
The aim of the functional testing of a memory is to verify that:
• all the information can be written and preserved;
• all the read information in memory is correct;
• all the memory points operate correctly.
From the functional point of view, the defects of the memories have three
categories of effects:
• one bit data error,
• more bits data error,
• addressing error.

Most frequently, the programmes are oriented to detect the defects and start a
series of writing, reading and comparing operations, but do not allow the loca-
lisation of the defects (with rare exceptions). A universal test method for memories
does not exist, only an ensemble of instruments which tend to complete each other
mutually.
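The write/read/compare principle can be sketched as a simplified march-style test on a simulated memory. The RAM model, the stuck-at fault and the three march elements below are illustrative assumptions, not a specific published algorithm:

```python
class FaultyRAM:
    """Simulated 16-word, 1-bit memory with one hypothetical stuck-at-0 cell."""
    def __init__(self, size=16, stuck_addr=9):
        self.size, self.stuck = size, stuck_addr
        self.cells = [0] * size
    def write(self, addr, bit):
        self.cells[addr] = 0 if addr == self.stuck else bit
    def read(self, addr):
        return self.cells[addr]

def march_test(ram):
    """Return the first failing address, or None if the memory passes."""
    for a in range(ram.size):                # M0: ascending, write 0
        ram.write(a, 0)
    for a in range(ram.size):                # M1: ascending, read 0 then write 1
        if ram.read(a) != 0:
            return a
        ram.write(a, 1)
    for a in reversed(range(ram.size)):      # M2: descending, read 1 then write 0
        if ram.read(a) != 1:
            return a
        ram.write(a, 0)
    return None

print(march_test(FaultyRAM()))   # 9 : the stuck-at-0 cell is both detected and localised
```

Because the failing address is returned, this toy test performs localisation as well as detection, as required for fabrication tests.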

13.4.2
Microprocessor test methods

13.4.2.1
Self-testing

The microprocessor is tested in a re-created "natural" environment (with RAM,
ROM, bus and peripherals), but without using a component tester, strictly
speaking. A diagnosis programme is loaded into a RAM and executed by the
microprocessor. This programme drives the execution of a maximum of
instructions in the worst conditions for the internal registers. If all the
instructions execute normally, the programme branches to a final "good" address;
if the microprocessor is defective, the programme branches to a "bad" address.
Advantages
• the microprocessor is in a natural environment;
• the test programme can be the future programme to be really utilised;
• small cost price.
Disadvantages

• several errors can compensate each other and thus escape detection;
• as the fault cause cannot generally be identified, the product analysis is
difficult;
• to know whether the microprocessor is good, it is necessary to wait for the
termination of the programme; this can lead to wasted test time if the fault
appears at the beginning;
• the previously defined results must be stored.

13.4.2.2
Comparison method

This method requires a tester composed of two groups of drivers
and detectors, two sockets, a memory, and a reference microprocessor. The
reference microprocessor works in the same manner as in the self-testing method,
but the information entering the reference microprocessor is also sent to the
product to be controlled, and afterwards the information leaving the two
microprocessors is compared.
This method has the same advantages as the self-testing method, with the
difference that the first noticeable fault can be detected immediately, at each
instruction cycle. Nevertheless, the test speed is determined by the response time

of the reference microprocessor; sometimes faults that do not really exist appear, if
the two components have different speeds. Once again, the characterisation and the
analysis of the microprocessor are very difficult, the environment is very difficult
to simulate, and the reference microprocessor must be of very high quality, so as
not to be degraded during the tests (but this is not enough; there remains the more
general problem of depending on a microprocessor "reputed good").
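The per-cycle comparison can be sketched with two toy accumulator machines; the instruction set and the injected fault are invented for illustration (a real tester compares pin-level signals, not behavioural models):

```python
# Comparison method: the same instruction stream drives a reference model
# and the device under test; outputs are compared at every instruction
# cycle, so the first fault is caught immediately.

def reference_cpu(acc, op, arg):
    """Toy accumulator machine used as the 'reputed good' reference."""
    if op == "ADD":
        acc += arg
    elif op == "SUB":
        acc -= arg
    return acc

def faulty_cpu(acc, op, arg):
    """Same machine with a hypothetical fault in the subtract path."""
    if op == "ADD":
        acc += arg
    elif op == "SUB":
        acc -= arg - 1          # injected fault: subtracts one too little
    return acc

program = [("ADD", 5), ("ADD", 3), ("SUB", 2), ("ADD", 1)]
ref = dut = 0
for cycle, (op, arg) in enumerate(program, start=1):
    ref = reference_cpu(ref, op, arg)
    dut = faulty_cpu(dut, op, arg)
    if ref != dut:
        print(f"first mismatch at instruction cycle {cycle}: {op}")
        break
```

Here the divergence appears at the first SUB, i.e. at cycle 3, without waiting for the end of the programme.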

13.4.2.3
Real time algorithmic method

A programme is written in the machine language of the microprocessor and loaded
into a buffer; for each instruction sent to the microprocessor, the microprocessor
supplies its response and a waiting signal. The response is compared with a
response calculated by an algorithm. If the response is good, the next instruction is
sent to the microprocessor after a "restart" signal; if not, a fault is signalled.
Advantages
• maximum speed;
• a first fault is immediately detected;
• flexible test.
Disadvantages
• the complex instructions (subprogramme calls, interrupt procedures)
require several consecutive bytes and are difficult to control;
• the execution is too slow for the microprocessors of the latest generation;
• the programmer must be familiar with the product, from the application
point of view, to determine the running order, the organisation, etc.
Note that - in spite of all these shortcomings - the method represents the basic idea
of many commercial testers.

13.4.2.4
Registered patterns method

This method includes two independent stages. In the first one, the microprocessor is
simulated with the aid of a minicomputer, a RAM or a PROM; each simulated
response can be identified and associated with the corresponding instruction. The
whole is controlled and sent to a buffer, at defined periods. The content of the
buffer is then saved on disc or on magnetic tape. In the second stage, the patterns
are loaded into the buffer, and then transferred to the controlled microprocessor.
Advantages
• easy to implement;
• a fairly flexible test;
• possibility of parametric measurements;
• it has evolved into the most sophisticated and universal testers.

Disadvantages
• requires a large buffer for pattern transfer;
• for each change, even a minor one, stage one must be repeated, and this requires
a new simulation;
• the simulation of interrupts is not possible;
• requires an important software support.

13.4.2.5
Random test of microprocessors

The philosophy of this microprocessor test - made by users and often by the
circuit designers and manufacturers - lies in the fact that the proposed test is a
comparison test using a renowned "good" microprocessor as reference micropro-
cessor. The microprocessor to be tested is dynamically compared to verify the
parametric and functional coincidence with the reference circuit. At the beginning,
the two circuits are initialised into a known state by a short programme. Afterwards
they are supplied by a pseudo-random signal generator that - in fact - generates
instructions for the "good" microprocessor. During this sequence of
random tests, the functional integrity of the unit under test is verified against
the reference unit, supposed to be a "good" one. The input and output signals of the
two units must coincide; if not, an error is indicated.
The basic philosophy is to execute random instructions on random data.
The programme generated by the pseudo-random signal generator is correct from
the syntax point of view, but has no semantic meaning. It is written in
the language of the unit to be tested. Taken alone, this approach is weak, since the
execution of each instruction in a microprocessor greatly depends on the previous
instructions. The internal elements (the counters) which demand a sequence of
instructions to reach a state must be tested by the initialisation programme, which
contains: toggling tests to "0" and "1", increments and decrements of the registers
and stacks. Available specimens of microprocessors having known faults have been
tested with this tester, and the faults were detected after one to five seconds with
the aid of this method.
The method does not guarantee any degree of test effectiveness and quality.
In conclusion, the user can test microprocessors if he correctly shares out his
tests between random and sequential signal generation schemes, and this depends
on the specific architecture of the tested microprocessor. Nevertheless, a theoretical
study should be made to better assess the test confidence.
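A minimal sketch of the random comparison test follows; the instruction set, the injected fault and the seed are invented for illustration, and a real tester works at the pin level rather than on a behavioural model:

```python
import random

# After initialising both units to a known state, a pseudo-random
# (syntactically correct, semantically meaningless) instruction stream
# drives the "reputed good" reference and the unit under test; any output
# divergence indicates an error.

OPS = ["INC", "DEC", "SHL", "XOR"]

def execute(regs, op, operand, faulty=False):
    """One instruction on a toy 8-bit single-register machine."""
    if op == "INC":
        regs[0] = (regs[0] + 1) & 0xFF
    elif op == "DEC":
        regs[0] = (regs[0] - 1) & 0xFF
    elif op == "SHL":
        # hypothetical fault: the unit under test drops bit 7 on shifts
        regs[0] = (regs[0] << 1) & (0x7F if faulty else 0xFF)
    elif op == "XOR":
        regs[0] ^= operand
    return regs

random.seed(42)
ref, dut = [0], [0]              # initialisation to a known state
for step in range(1, 1001):
    op, operand = random.choice(OPS), random.randint(0, 255)
    execute(ref, op, operand)
    execute(dut, op, operand, faulty=True)
    if ref != dut:
        print(f"divergence after {step} random instructions ({op})")
        break
```

As in the text, the fault is typically exposed after a short burst of random instructions, even though no single vector was written to target it.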

13.5
Testability of LICs

In connection with all these problems, the notion of testability must be taken into
account. Testability is defined as the ease with which an objective quality can be
guaranteed by testing. This notion concerns a population of LICs of a chosen type
and is linked not only to the manufacture, but also to the maintenance; the objective quality

corresponds to the quality level (proportion of acceptable elements in the
population) to be reached in the final stage: the use of the LIC. The test will also
be used as a sorting operation, tending to eliminate the defective
elements according to a certain test strategy - which can be modified, if necessary,
during the period in which knowledge about the product is acquired (several months).
The test effectiveness is the ratio

E = nl/nd    (13.3)

where nl is the number of elements eliminated during the test (elements which
would be defective in use), and nd is the total number of elements of the population
which would be defective in use.
The objective quality corresponds to the minimal proportion of elements in the
population which would be "good" in use, after the test. The testability can then be
defined as being inversely proportional to the test cost per piece which permits
reaching the objective quality, starting from a given upstream quality. This cost
depends on the test duration and on the investments involved.

13.5.1
Constraints

• Choice of an advantageous LIC (function and price);
• ease of obtaining knowledge about the LIC;
• extent of the use/specification domain of the LIC;
• ease of installing the means for quality assurance (testability factor);
• variation of the quality and design of the LIC (for a given manufacturer, and
from one manufacturer to another);
• cost of the direct test (non-mounted LIC);
• cost of the indirect tests (LIC mounted on a PCB): test, diagnosis and repair of
the subassembly.

13.5.2
Testability of sequential circuits

The test of combinatory circuits is not a difficult problem; the sequential circuits,
on the other hand, bring the major part of the difficulties. Fortunately, it is
possible to preserve the testability of the latter by considering the following
recommendations at the design level:
• use of synchronous structures;
• accessibility of the internal states - in the sense of controllability (setting) and
observability (reading) of each possible internal state with the aid of a small
number of clock pulses⁴; it is recommended to apply a shift-register technique;
• logical partitioning;
• step-by-step operation.

⁴ These recommendations are not always acceptable to the designer, since the circuit would become
too large or too expensive. In particular, the accessibility of the internal states is rarely attainable.

The difficulty in generating tests for sequential circuits [13.17]...[13.19] arises
from the poor controllability and observability of the memory elements; the direct
impact on them is caused by the number of feedback loops and the sequential depth
of the memory elements in the circuit. Some algorithms describe a sequential circuit
as a directed graph; by properly selecting the scan flip-flops so as to eliminate as
many cycles in the graph as possible, the testability of the circuit can be increased
and the time spent on test generation can be shortened. A special algorithm [13.19]
can efficiently mitigate the contradiction between high fault coverage and the
required extra chip area.
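The graph-based idea can be sketched as follows; the greedy heuristic and the toy flip-flop graph are our own illustration, not the specific algorithm of [13.19]:

```python
# Model the memory elements as a directed graph and greedily pick scan
# flip-flops until no feedback cycle remains.

def has_cycle(graph, removed):
    """Detect a directed cycle, ignoring the 'removed' (scanned) nodes."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in graph if n not in removed}
    def visit(n):
        colour[n] = GREY
        for m in graph[n]:
            if m in removed:
                continue
            if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False
    return any(colour[n] == WHITE and visit(n) for n in list(colour))

def select_scan_ffs(graph):
    """Greedily scan the highest-degree flip-flop until the graph is acyclic."""
    scanned = set()
    while has_cycle(graph, scanned):
        candidates = [n for n in graph if n not in scanned]
        scanned.add(max(candidates,
                        key=lambda n: len(graph[n]) +
                                      sum(n in graph[m] for m in graph)))
    return scanned

# Toy FF dependency graph with two feedback loops: A->B->C->A and C->D->C.
ffs = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["C"]}
print(sorted(select_scan_ffs(ffs)))   # ['C'] : scanning C alone breaks both cycles
```

Scanning the single flip-flop shared by both loops removes all feedback, which is exactly the effect sought: fewer scan cells (less extra chip area) for the same gain in testability.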

13.5.3
Independent and neutral test laboratories

The American experience has shown that for small and medium-sized enterprises
the quality problems can be better solved by a small group of specialists from the
enterprise and/or by an external test laboratory, independent and neutral. The
respective co-operative works normally concern the characterisation studies of
ULSI, VLSI and LSI ICs, the comparative reliability studies of a certain product
realised by several manufacturers, and all the dynamic input controls for electronic
components.
The required knowledge concerning the measuring techniques, the quality and
the reliability of the highly integrated active components, and the writing of the
necessary software programmes and of the test specifications, is so complex that
only an important group of engineers and technicians, highly qualified in various
fields and having a long experience, can solve all the problems.
During the last years, the centre of gravity of the input control has moved more
and more towards reliability and screening tests, with the statistical analysis of data.
We can summarise the arguments that plead for independent and neutral test insti-
tutes as follows:
• These institutes are entirely dedicated to the semiconductor tests; that is why
they better understand the respective problems.
• The institutes offer some useful additional services (failure analysis, statistical
processing of data, etc.).
• The costs are minimised.
But there are also some arguments against the utilisation of these institutes, such
as:
• How can the user be sure that the screening, the burn-in and the other tests have
been performed on a 100% basis?
• How can the user determine - on the basis of the noticed failures - the quality
and the reliability of the delivered batch?

13.6
On the testability of electronic and telecommunications
systems, and on international standardisation

Testability is based on adequate recommendations and on structural techniques,
with the aim of supporting not only the test and the diagnosis of electronic and
telecommunications systems at the prototype and production levels, but also of
facilitating the preventive and curative maintenance. All these elements must be very
efficient - from a technical point of view - and must have a moderate price - from
a financial point of view.
The most effective way to succeed is to promote a testability policy from
which a better, quicker, cheaper production can result.
By the early design of equipped PCBs that can be easily tested and
diagnosed, one may understand that:
• the investments in test equipment are small, since lower performances are
required, on condition that the foreseen quality level is maintained;
• the design - better adapted to the test - facilitates the programming and makes it
faster and easier to understand;
• the diagnosis - more obvious - is given quickly, even by personnel with little
qualification;
• reducing the costs and the times contributes to diminishing the development time
and the production cycle time.
The 1960s represented the discrete components era; the functional complexity
grew with the number of components. It was the age during which the tests were
made manually, with measurement instruments.
During the 1970s, the functional testers permitted an effective go/no go test for
the good PCBs. For PCBs with failures, the diagnosis of failures was long (defect
after defect) and expensive, since very specialised operators were necessary.
At the beginning of the 1980s the age of in-situ testing appeared. The PCB test is
a conformity test (the good component at the corresponding place), the diagnosis is
quicker and more advantageous, while the quantities are growing. An easy diagnosis
leads to a cheap production.
Nowadays the tests represent 35...45% of the production costs (testing is also a
non-productive operation, performed on a PCB that already carries added value).
As the enterprises are exposed on the market to the international competition, they must:
• design more quickly (to be present earlier on the market)
• produce more rapidly (to shorten the fabrication time)
• produce cheaper (to be competitive)
• produce the best quality (to reduce the cost of non-quality, to maintain the
commercial position on the market, to enlarge the sales sphere).
The solution of the problem: (i) select a testability policy which allows all these
objectives to be fulfilled; (ii) design products that are easily testable.
For the future, the following tendencies are important:

• growth of the integration degree;
• more accentuated individualisation of the components;
• new technologies for manufacturing the equipped PCBs (with surface-mounted
components, SMC);
• worldwide extension of the market.
The last tendency supposes:
• a maximum of functions in a minimum of volume (thus reducing the possibili-
ties of physical access to the elementary functions);
• an intrinsic intelligence of the PCB at the component level (the concept of
component being directly related to the indivisible nature of its constituents).
The resulting conclusion is the same: to succeed, the companies must have a wise
testability policy.
Testability is based on recommendations which justify why and how to
proceed. It is the designer's duty to decide on the application method, according to
the quantitative and qualitative requirements of the project; the technical and
financial comparisons help him to select the best compromise. During the selection
of the specifications for a future product, a commercial company in fact spends
only 1% to 3% of the project budget; yet, by selecting these orientations, it
implicitly commits the allocation of about 70% of the budget.
As the project advances, the cost of an abandonment grows. That is why the
maximum of orientation and technical preparation must be achieved in this period.
All the design costs are incurred once - for the project; but their consequences are
multiplied at the level of each manufactured product. He who has not
understood the opportunity of this optimisation, and intends to save money at the
design level, takes the risk of losing a 1000 times (or greater) amount in the
global cost of production. (This financial mechanism is called the leverage effect
of testability; it is optimal when the smallest growth of the design cost generates the
greatest reduction of the repetitive production costs.)
To obtain a successful testability, three essential aspects must be taken into
account:
a) Partitioning (the division of a circuit into elementary functions, preserving the
greatest coherence between these functions and their physical constituents). The
elementary functions are more and more integrated, and also gradually less
accessible from the physical point of view. With the aid of partitioning a better
simulation of the function is performed and - in the case of an anomaly - an easier
and faster localisation of the cause of failure is possible. The problem is therefore
to establish the relation from cause to effect. [To one and the same physical failure
(say, a short-circuit) correspond many types of functional defects; the same
functional defect can have different causes; and by partitioning it is relatively
difficult to take all these facts into account.]
b) Control (in the sense of circuit control). Here the problem is to impose, at a
certain precise moment, a well-defined test sequence. The control points are the
input points, or the access points, of the elementary functions.

c) Visibility. The circuits under test, exposed to the stimuli, generate response
sequences which are then directed to the test equipment. The visibility points are
the output points of the elementary functions.
As the possibilities of physical access tend to diminish, it is necessary
to implement a testability bus in order to have access to the elementary functions and to
perform a sequential access to a great number of points, with a minimum of
connection points. The testability bus permits the access to 4096 control and
visibility points, with 25 parallel addressing pins and 5 serial addressing
pins. This permits a standard testability and diagnosis connector:

• of the system component;
• from the development phase to the maintenance phase;
• from the simple preventive control to the most complete test.
Its cost can be optimised only for a large production; the effectiveness of the
method can be really exploited only if a standardisation of the procedure exists. That is why the
IEEE has created a standardisation committee for a standard testability bus. Its
objective is to solve all the testing problems (aims, strategy and tactics), taking into
account the following aspects:
• the marketing (the user's needs);
• the studies (product design);
• the methods (selection of the manufacturing procedures);
• the testing (the means and the fabrication methods);
• the supply (acquisition policy);
• the quality (user's satisfaction).
The success of an enterprise that wants to thrive will depend on the homogeneity of
its procedures, on the coherence of its performances, and on the interdependence of
its leaders.
A perfect production means understanding the whole process perfectly, and this
must happen in a world where perfection doesn't exist. Is it possible to forget
the natural tendency of each industrial process to ineluctably diverge from the
adjusted state in which it was positioned? The tester must not be the device
that distributes tickets for defects, indicating the component(s) which must be
changed; it will become the qualification point of the quality level of a production
which - exploiting the results in real time - will indicate the process phase(s) in
which action should be taken to produce zero defects. The tester will then play the
role of an essential self-correction link of the production line integrated in the
factory of the future. The quality of the testing team will be judged not according to the
managed capital, but according to the realised savings; the omnipresence of the
testability recommendations remains the communication language for the enterprises
concerned with improving their performances.
The usual practice for complex chips was to combine them [13.17]...[13.19] to
form high-performance subassemblies on printed circuit boards (PCBs). Now, the
next step is the integration of functional blocks from different sources - the term
intellectual property expresses the major problem to be surpassed - in a single
integrated circuit (IC). The productivity gap between the expected chips and the

design tools can be closed on the chip (Fig. 13.1) only with a clever combination
of intellectual property.

[Fig. 13.1 plots ASIC complexity (millions of transistors) versus time, from 1995
to 2010: the technological complexity increases by 32% per year (from 8 to about
500), while the design productivity increases by only 20% per year.]

Fig. 13.1 The productivity gap between the expected chips and the design tools can be closed
on the chip only with a clever combination of intellectual property. (Source: Sematech)
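The growth figures of Fig. 13.1 can be checked with a few lines (the rates and the starting point are read off the figure; "productivity" is expressed here in the same units only for comparison):

```python
# Rough reconstruction of the arithmetic behind Fig. 13.1: complexity grows
# at 32%/year, design productivity at 20%/year, from 8 million transistors in 1995.

def compound(start: float, rate: float, years: int) -> float:
    """Compound annual growth: start * (1 + rate)^years."""
    return start * (1 + rate) ** years

complexity_2010 = compound(8, 0.32, 15)      # achievable chip complexity in 2010
productivity_2010 = compound(8, 0.20, 15)    # what the design tools can handle

print(round(complexity_2010))    # 515 : matches the ~500 of the figure
print(round(productivity_2010))  # 123 : the widening gap must be filled with reused IP
```

The factor of roughly four between the two 2010 values is the productivity gap that the reuse of intellectual-property blocks is meant to close.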

References

13.1 Robach, C. (1978): Le test en production et en exploitation (2eme partie). Research report
RR no. 132, National Polytechnic Institute, Grenoble (France), August
13.2 Piel, G. (1978): La testabilite des circuits integres logiques vue par l'utilisateur. L'Onde
electrique, vol. 58, no. 12, pp. 830-835
13.3 Robach, C. (1978): Le test en production. Session de perfectionnement, National
Polytechnic Institute, Grenoble (France), February
13.4 Hnatek, E. R. (1975/76): Costing in-house vs outside testing. Electronic Production, Dec.
1975 / January 1976, pp. 9-11
13.5 Bajenescu, T. I. (1996): Fiabilitatea componentelor electronice (Reliability of Electronic
Components). Editura Tehnica, Bucharest (Romania)
13.6 Caillat, J. (1976): Contribution au test des circuits integres logiques. Ph. D. Thesis, Institut
Polytechnique de Grenoble (France), October 8
13.7 Thevenod-Fosse, P. (1978): Contribution a l'etude du test aleatoire des circuits sequentiels
et des memoires. Ph. D. Thesis, National Polytechnic Institute, Grenoble (France),
February 15
13.8 Girard, E. et al. (1974): Le test fonctionnel des memoires. Revue technique Thomson-
CSF, vol. 6, no. 1, pp. 217-227
13.9 Dumitrescu, D.; Saucier, G. (1975): Test de memoire dynamique a technologie MOS.
Proc. of internat. symp. on fault-tolerant computing, Paris, June
13.10 Marshall, M. (1976): Through the memory cells - further exploitation of ICs in
Testingland. EDN, 20.2.1976, pp. 77-85
13.11 Bollen, H. (1978): Wichtiger denn je - der Test von LSI-Bauelementen. Elektronik, no.
13, pp. 77-80
13.12 Muehldorf, E. I. (1976): Designing LSI logic for testability. Proc. of IEEE Semicond. Test
Symp., pp. 45-49

13.13 Robach, Ch.; Saucier, G. (1972): Le test logique des circuits integres. L'Onde electrique,
vol. 54, no. 12, pp. 842-849
13.14 Davison, C. (1978): The testing of modern memories. L'Onde electrique, vol. 58, no. 5, pp.
39~00
13.15 Fosse, P.; David, R. (1977): Random testing of memories. Informatik-Fachberichte,
Springer-Verlag, vol. 10, pp. 139-153
13.16 Rault, J. C. et al. (1972): La detection et la localisation des defauts dans les circuits
logiques: principes generaux. Revue technique Thomson-CSF, vol. 4, no. 1, pp. 49-88
13.17 Kwang-Ting Cheng et al. (1990): A partial scan method for sequential circuits with
feedback. IEEE Trans. on Computers, vol. 39, no. 4, pp. 544-548
13.18 Gupta, R. et al. (1990): The BALLAST methodology for structured partial scan design.
IEEE Trans. on Computers, vol. 39, no. 4, pp. 538-543
13.19 Bo, Y. et al. (1995): Testability design for sequential circuits with multiple feedback. Proc.
of the fourth internat. conf. on solid-state and integrated-circuit technology, Beijing
(China), Oct. 24-28, pp. 208-210
13.20 Bajenescu, T. I. (1993): Wann kommt der nachste Uberschlag? Schweizer Maschinenmarkt
(Switzerland), no. 40, pp. 74-81
13.21 Bajenescu, T. I. (1998): On the spare parts problem. Proceedings of OPTIM '98,
Brasov (Romania), May 14-15, pp. 797-800
13.22 Bajenescu, T. I. (1998): The Monte Carlo method and the solution of some reliability
problems. Proceedings of the Symp. on Quality and Reliability in Information and
Communications Technologies RELINCOM '98, Budapest (Hungary), September 7-9
13.23 Bazovsky, I. (1961): Reliability Theory and Practice. Prentice-Hall Inc., Englewood
Cliffs, New Jersey
13.24 Bazovsky, I.; Benz, G. (1988): Interval reliability of spare part stocks. Qual. Reliab.
Engng. Int., no. 4, pp. 235-246
13.25 Bazu, M. et al. (1983): Reliability data bank for semiconductor components. Proceedings
of Ann. Semicond. Conf. CAS 1983, October 6-8, pp. 35-38
13.26 Bazu, M. et al. (1983): Accelerated tests for evaluation of semiconductor component
reliability. Electrotechnics, Automatics and Electronics, no. 1, pp. 19-25
13.27 Bazu, M.; Ilian, V. (1990): Accelerated testing of integrated circuits after storage.
Scandinavian Reliability Engineers Symp., Nykoping (Sweden), October
13.28 Birolini, A. et al. (1989): Test and screening strategies for large memories. Proc. 1st
European Test Conf. 1989, pp. 276-283
14 Failure analysis

14.1
Introduction [14.1] ... [14.25]

This delicate and, at the same time, interesting subject will be presented from the
viewpoint of a simple user of components. The emphasis will be not on solid-state
physics, but on the adequate design of electronic systems and on their manufacture
on an industrial scale. The scanning electron microscope, for instance, will be
mentioned only incidentally, while the usual analysis methods (such as electrical
measurement, optical microscopy and chemical procedures) will be used throughout.
For failure analysis, the "bad" components have a higher value than the
"good" ones. If a component was incorrectly used, the system designer must review
his design or choose another component. If, on the contrary, a weakness of the
component is found, a detailed analysis is required from the manufacturer, leading to
corrective actions for the elimination of the failure causes.
The failure analysis starts from the failure mode (the symptom by which one
may observe a failure, such as shorts, open circuits, excessively high leakage
currents, changes in resistor values, degradation of the response time or of the
frequency-dependent parameters) and leads to the identification of the failure cause
and failure mechanisms (e.g. breaks or cracks of the die, intermetallic compounds,
oxide defects, pinholes, contamination, metal migration, short-circuiting of the
oxide or dielectric layer, "bad" solders, overloads due to incorrect use, open
circuits, misalignments, chemical reactions at the level of the metal/semiconductor
contact area, metal corrosion inside the package, etc.).
The failure analysis must discover the root causes of the failure. Only if the failure
causes are known and the failure mechanism is elucidated can the necessary
measures be taken. One must understand that failure analysis is long and costly. For
component malfunctions arising during tests, the following main causes may be
identified: deficiencies in component manufacturing, incorrect use, incorrect
mounting, overloading, and testing errors.
By failure analysis, an early knowledge of the problems linked to the
components is achieved and, in the meantime, the efficiency of the measures for
reliability improvement is verified. The obtained results are used for establishing
the failure sources and for their avoidance, both by the component manufacturers
and by the users. A user must clearly state whether the failure was produced by his
own deficiencies or by component manufacturing errors. Insufficiently defined
claims lead to incomplete answers (if any response is produced at all) [14.26].
The constitutive elements of a failure analysis are:

• knowledge of the failure mechanisms,
• discovery of the causes producing the defects,
• establishment of the consequences,
• statistical evaluation of the decisions,
• development of improved methods of analysis.
A correct solution of the problem is possible only through the co-operation of the
component manufacturers involved. Intensive discussions on testing methods and
on fabrication optimisation are necessary, followed by common studies on the limit
parameters at the prototype phase. These studies must be performed on the basis
of previous experience and in a spirit of co-operation and sincerity, which shortens
the time until the problems are solved.
Also, to succeed, the principles of concurrent engineering (already presented in
Chap. 2) must be used: specialists from all the development phases of a product
(designers, manufacturers, quality and reliability engineers, etc.) must co-operate.
Only in this way can a reduction of the failure rate be obtained.
The laboratory for failure analysis must cover the following problems [14.27]:
• rapid analysis of the quality failures on the manufacturing stream,
• evaluation of the suppliers from the viewpoint of the technology and of the
procedures used,
• consultancy on semiconductor problems for the specialists of the company,
• prevention of deviations from the technological procedures.
An important means for evaluating the components is characterisation, which
allows the qualities, the behaviour and the weaknesses of the components produced
by various manufacturers to be known.
As is well known, the electrical parameters of the components depend on
temperature and bias. The knowledge of this dependency is very important for
device development. If the components are correctly used, a high reliability of the
systems may be obtained. The testing methods must correspond to the dominant
failure mechanisms characterising the technology [14.28], so that the needed
results are obtained in the shortest time.
If many components from the same family are used, the probability of having
failures from this family increases: if a family with a 1% failure fraction is used and
the total number of such components is 100, the probability of at least one failure
may exceed 50%. Generally, the ratio of the costs is: reception tests for components :
tests of the system : guarantee = 1 : 10 : 100.
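The arithmetic behind this remark is simple; a minimal sketch in Python (the 1% failure fraction and the 100-component count come from the text, while the assumption of independent failures is ours):

```python
# Probability of at least one failed component among n components,
# each failing independently with probability p.
def p_at_least_one_failure(p: float, n: int) -> float:
    return 1.0 - (1.0 - p) ** n

# Values from the text: a family with a 1% failure fraction.
p = 0.01
for n in (10, 50, 100):
    print(n, round(p_at_least_one_failure(p, n), 3))
```

For n = 100 the result is about 0.63, i.e. well over one half, which illustrates why multiplying parts from a single marginal family quickly becomes a system-level risk.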
Today's degree of integration of circuits containing semiconductor components
has imposed new issues for manufacturing and testing [14.27]:
• a drastic increase in the number of components integrated on a chip makes it
possible to achieve a high number of functions in one device, each of them as
difficult to test as a single-function part,
• in the case of bad quality, the high cost of corrections harms the efficiency of
the automated and rationalised methods of manufacturing,
• the testing and repairing points become more and more costly; the possibility
of their re-use is often out of the question, due to the need to modify the
programme or the type of integrated circuit.

The discrete components, but especially the integrated circuits, contain a high
number of connections. The connections make an important contribution to
failure, due to their complexity and to the mechanical, climatic and electric stress
that they have to undergo.
The analysis shows that the contacts also account for a high percentage of the
failure causes. A comparison between high-frequency transistors and bipolar
integrated circuits leads to the conclusion that for high-frequency transistors the
failure is produced by high currents and temperatures at the contacts, while for
integrated circuits the main failure cause is the high number of internal
connections. On the well-known "bath-tub" curve, a high contribution of the
contacts is found in the early failure period, but also in the wear-out region. In the
latter case, this may be explained by material fatigue, interdiffusion of neighbouring
metallic layers, corrosion, electromigration or thermomigration.
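The bath-tub behaviour referred to above is commonly modelled as a superposition of Weibull hazard rates: a shape parameter below 1 gives the decreasing early-failure branch, a shape of 1 the constant useful-life floor, and a shape above 1 the rising wear-out branch. A minimal sketch (the shape and scale parameters below are purely illustrative, not taken from the text):

```python
# Weibull hazard rate: h(t) = (beta/eta) * (t/eta)**(beta - 1)
#   beta < 1 -> decreasing hazard (early failures)
#   beta = 1 -> constant hazard (useful life)
#   beta > 1 -> increasing hazard (wear-out: fatigue, corrosion, migration)
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    return (beta / eta) * (t / eta) ** (beta - 1.0)

def bathtub_hazard(t: float) -> float:
    # Illustrative parameters only (hours).
    return (weibull_hazard(t, 0.5, 1e3)     # early-failure branch
            + weibull_hazard(t, 1.0, 1e5)   # constant random-failure floor
            + weibull_hazard(t, 5.0, 5e4))  # wear-out branch

for t in (10.0, 1e4, 1e5):
    print(t, bathtub_hazard(t))
```

Evaluating the total hazard at a short, a medium and a long operating time reproduces the high-low-high shape of the curve.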
The LSI integrated circuits (memories, microprocessors) raise important questions
regarding defect examination and failure analysis. For microprocessors, the
localisation of the defect on the chip is not always possible, and sophisticated
techniques must be used, such as nematic liquid crystals, stereoradiography with
X-rays, neutron radiography, Nomarski microscopy, electron- and ion-beam
instruments, mass spectrometry for residual gas analysis, etc. These powerful (and
expensive) instruments of analysis demand highly qualified personnel.

14.2
The purpose of failure analysis

The main purpose of a failure analysis is to identify the failure mechanisms
(including the establishment, based on a statistical analysis, of the typical failure
cases), but also to improve, by corrective actions on the manufacturing stream, the
quality and the reliability of the components. Obviously, it is difficult to analyse all
the failed components, and convenient selection criteria for the components to be
analysed must be established.
First, the failures with an abnormally high frequency must be studied. It is most
important that these failures be well selected, so that they do not represent
isolated cases.
Second, the distribution of the failure modes for each type of component must be
known, in order to improve the reliability predictions. To do this, a
systematic analysis of the failed components over a given time period must be made,
eliminating the "abnormal" components.

14.2.1
Where are the failures discovered?

The failures may be suppressed only if their causes are known. Consequently, the
analysis of the failures is an important source of information for discovering the
"weak points" and taking the necessary measures for eliminating them. Depending
on the place (on the manufacturing stream of an electronic system) where they
were discovered for the first time, the component failures may be divided as follows
[14.22]:
a) At the input control
The experience shows that at this place a characteristic failure rate is established for
each component. The testing laboratory must be informed about any deviation from
the normal behaviour, possibly by requiring a failure analysis. If this analysis
reveals a serious failure, the whole batch is returned to the supplier.
b) In the testing laboratory
The failures of the components are generally isolated, the causes being identified
on the basis of previous experience. If a threshold of 1.5-2% failures is exceeded,
an alarm is raised. The testing laboratory must give these failures priority in
analysis, because manufacturing is interrupted. The possible failure causes are
defects in component manufacturing, incorrect manipulation and incorrect use.
c) In the development phase
During the development of the product, the components undergo high stresses
(circuit and mounting changes, inversion of the bias, too small supply voltages).
Such defects will be automatically eliminated from the list for failure analysis.
Taking into account the remaining failures, one may identify in due time the weak
points of a component, and the necessary measures for their elimination can be taken.
d) At the user of the system
Here the differences in failure rate are assessed. In the case of an obvious deviation,
a careful analysis of the failures must be made. But an analysis of field failures is
difficult to make, because sometimes the related information is lost.
In most cases, as a consequence of the failure analysis and with the aid of the
manufacturer, it is possible to find a convenient solution for both sides. For
subsequent analyses and decisions, it has proved useful to classify the failures by
type of component.
It is recommended to perform tests simultaneously on components originating
from various manufacturers, in order to compare the behaviour of the components
directly. Usually, a sample of 15-30 items undergoes electrical, mechanical and/or
thermal stress, the input electrical characteristics of each component being
measured at given time periods. Based on these results, taking into account other
parameters and the comparison between various producers, the necessary decision
is taken. A trend towards reducing the number of component types must be
promoted in any company.

14.2.2
Types of failures

The failures encountered at the user are relatively few [14.1] and arise from:
• utilisation problems (circuit overload, voltage, power),
• mechanical problems (assembling, and those referring to the aluminium),
• crystallographic structure problems (crystallographic defects, parasitic
diffusions),
• oxide problems (contamination, corrosion).
The last two categories of problems are actually hard to analyse. Regarding the
utilisation problems, only the user can bring together the necessary information.

[Fig. 14.1 is a flowchart: failed components, with failure report → external visual
inspection → electrical, thermal and mechanical stress → measuring of electrical
functions and parameters → failure? → X-ray inspection → package opening
(plastic package) → internal visual inspection → failure mechanism identified?
If not, advanced investigations (SEM, TEM, etc.); if yes, evaluation and report
on the failure analysis.]

Fig. 14.1 Scheme for performing a failure analysis



14.3
Methods of analysis

Since the reliability of integrated circuits has lately improved significantly, new
methods of analysis have been developed, both for the assessment of the "good"
circuits from the new technologies and for establishing the weak points of the
manufacturing. The results are used for establishing the sources of failures, both by
the manufacturer and by the user.
First, full analyses are performed to establish whether the limits foreseen in the
data sheet are exceeded. With the aid of this electrical analysis, the failure rate is
calculated by correlating the identified and the original failures. In Table 14.1, the
possible steps for a failure analysis are presented, and Figure 14.1 gives an image of
the way the analysis methods are used.
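For a constant (useful-life) failure rate, the calculation mentioned above reduces, in its simplest form, to the number of relevant failures divided by the cumulated device-hours, usually quoted in FIT. A sketch (the sample figures are hypothetical, chosen only for illustration):

```python
# Point estimate of a constant failure rate from a life test:
#   lambda = failures / (devices * hours)
# quoted in FIT, where 1 FIT = 1 failure per 1e9 device-hours.
def failure_rate_fit(failures: int, devices: int, hours: float) -> float:
    device_hours = devices * hours
    return failures / device_hours * 1e9

# Hypothetical example: 3 relevant failures among 500 devices after 2000 h.
print(failure_rate_fit(3, 500, 2000.0), "FIT")
```

In practice the point estimate is usually accompanied by a confidence bound (e.g. a chi-squared upper limit), especially when the number of failures is small.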

Table 14.1 Working plan for a failure analysis for semiconductor components

Operation | Identified failures
Electrical measuring (static, dynamic, current and voltage characteristics) | Parameter drift
X-ray radiography | Foreign material; broken wires; die-attach soldering defects
External cleaning of the package, then re-measuring | Leakage due to impurities on the package surface
Storage at 200°C, then re-measuring (to be repeated if the leakage has diminished) | Leakage caused by an inversion layer at the die surface (Na+ ions in the oxide)
Opening the package and microscopic examination | Defects of metallisation, soldering, wires, mask or oxide
Transverse sectioning | Bad solders of the connection wires (or of the die)
Cleaning of the die surface, then re-measuring | Leakage due to surface contamination
Cleaning of the oxide, then re-measuring | Oxide contamination
Sectioning of the structure and junction revealing | Deep diffusion problems; traces of junction breakdown, in depth

14.3.1
Electrical analysis

The curve tracer is an appropriate non-destructive tool (often underestimated) for
testing the quality of the technology of transistors and integrated circuits and for
revealing possible modifications of the design or differences between various
suppliers. The Tektronix models are still the best choice for power devices (model
576) or for ICs, especially linear devices (model 577). Hewlett-Packard produces a
digital curve tracer, the 4145A parametric analyser, with a better leakage-current
capability than the model 577. With this tool, one may evidence abnormalities
in the inputs of ICs, quality deviations of the components, the critical areas of
the electrical parameters, etc.
For SSI integrated circuits, it is possible to determine the areas with defects on
the die by combining the different characteristics, based on previous experience,
possibly with the aid of thermal tests and, if necessary, mechanical ones.
By using other electrical tools, I-V and C-V curves at various temperatures may
be obtained, and from these curves relevant information about the device is
acquired. Experiments made by Papaioannou [14.41] on an ultra-fast diode led to an
activation energy of 0.66 eV (for the main failure mechanism), extracted from a
plot of the reverse saturation current vs. temperature. From C-V curves, the carrier
profile may be obtained.
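The activation-energy extraction mentioned above is an Arrhenius-type fit of ln(I) against 1/kT. A sketch, assuming I ∝ exp(-Ea/kT); the current values below are synthetic, generated for Ea = 0.66 eV (the value reported in the text), not measured data:

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def activation_energy(temps_k, currents):
    # Least-squares slope of ln(I) versus 1/(kT); Ea is minus the slope.
    x = [1.0 / (K_B * t) for t in temps_k]
    y = [math.log(i) for i in currents]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return -slope

# Synthetic data generated with Ea = 0.66 eV and an arbitrary prefactor.
temps = [300.0, 325.0, 350.0, 375.0]
ea_true, i0 = 0.66, 1e-3
currents = [i0 * math.exp(-ea_true / (K_B * t)) for t in temps]
print(round(activation_energy(temps, currents), 3))  # -> 0.66
```

With real measurements the points scatter around the fitted line, and the quality of the fit indicates whether a single thermally activated mechanism dominates.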
Another technique currently used is Deep Level Transient Spectroscopy (DLTS),
which allows the deep levels in the forbidden band to be characterised. From DLTS
spectra, data about the traps can be obtained. For a case study (irradiated ultra-fast
diodes), performed by Papaioannou for the Phare/TTQM project R09602-02-02-
L001/28.10.1997, DLTS data are presented in Table 14.2 [14.41].

Table 14.2 Trap characterisation from DLTS spectra

Trap | Ea (eV) | ΔC/C | NT (10^15 cm^-3) | Nearest candidate
T1 | 0.79 | 0.086 | 2.6 | Au23a
T2 | 0.53 | 0.1 | 3.0 | Au26b or EL7
T3 | 0.37 | 0.028 | 0.8 | E22a or E-centre

It seems that for T1 the single candidate is an Au acceptor level, acting as an
electron trap (called Au23a). For T2, the nearest candidates are an Au acceptor level
(called Au26b) or a trap induced by irradiation (called EL7), both having activation
energies of 0.54 eV. Because no peak appears in the spectra of Au-doped diodes, it
seems that T2 must correspond to EL7. For T3, the nearest candidates are an
irradiation-induced trap (E22a), with an activation energy of 0.39 eV, and an E-centre
produced by irradiation in a phosphorus-vacancy pair, with an activation energy of
0.44 eV. As one can see, E22a is the most probable candidate. In conclusion,
DLTS allows one to understand the trapping mechanism in semiconductors, a
possible failure mechanism.
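The trap concentrations NT in Table 14.2 follow from the relative DLTS capacitance step through the standard small-signal relation NT ≈ 2·ND·(ΔC/C). A sketch, assuming a shallow doping ND ≈ 1.5×10^16 cm^-3 (a value inferred here from the table entries, not stated in the text):

```python
# Trap concentration from the relative DLTS capacitance step:
#   N_T ~= 2 * N_D * (dC/C),   valid when dC/C << 1.
N_D = 1.5e16  # shallow doping, cm^-3 (assumed, inferred from Table 14.2)

def trap_concentration(dc_over_c: float, n_d: float = N_D) -> float:
    return 2.0 * n_d * dc_over_c

# dC/C values of traps T1..T3 from Table 14.2.
for name, dcc in (("T1", 0.086), ("T2", 0.1), ("T3", 0.028)):
    print(name, trap_concentration(dcc) / 1e15, "x1e15 cm^-3")
```

The computed values (2.6, 3.0 and 0.8, in units of 10^15 cm^-3) reproduce the NT column of the table, which supports the assumed doping level.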

14.3.2
X-ray analysis

X-rays are used for locating foreign material (in any package), broken wires
and die-attach soldering defects prior to decapsulation. The most frequently used
tool is the Faxitron 43804, produced by Hewlett-Packard.

14.3.3
Hermeticity testing methods

The hermeticity test is used to establish whether moisture-related failures were
caused by water vapour sealed into the package or entering it subsequently. A
complete hermeticity test contains two parts: gross and fine. Fine leak tests are
made by forcing helium through the leakage paths of the package. The gross leak
test detects large holes in the package by bubble testing (a low-boiling-point liquid
produces the bubbles).

14.3.4
Conditioning tests

Various tests may be used to evidence the failures. Mechanical shock, vibration or
high-temperature storage allows an intermittent failure to be re-created. High-
temperature storage (16 h at 150°C) mobilises the electrical charge produced by
ionic contamination at the oxide/semiconductor interface, often curing the device.

14.3.5
Chemical means

For the opening of plastic packages, universal solutions cannot be recommended,
because epoxy, silicone and polyurethane resins behave differently. The most used
solutions are:
• fuming nitric acid,
• fuming sulphuric acid,
• concentrated sulphuric acid,
• "Uresolve Plus" (Dynaloy Inc.),
• J-100 and J-300 (Indust. Ri-Chem Lab Inc.).
Most often, nitric acid, chromosulphuric acid and the solutions J-100 and J-300
(with an efficiency increasing with temperature) are used.
Many packages may be dissolved by only one or two of these solvents (some
examples: the packages of the integrated circuits of the company AMI are dissolved
only by fuming sulphuric acid; the phenolic resin is dissolved in a mixture of
50% sulphuric acid and 50% nitric acid; other packages are dissolved in fuming
nitric acid, at high temperatures).
Another method to open the packages is calcination for 15 hours at 430°C. The
integrated circuits in ceramic packages may be opened by mechanically removing
the upper part.
Plastic packages (DIL and TO-220) may be opened quickly and without
damage by using the Jet Etch, a tool allowing the die to be examined. The element
to be opened is held by a metallic part linked to the sensor, which is controlled by
the evaluation electronics. The sensor circuit has two functions: i) to protect the
dielectric, avoiding deterioration by corrosion, and ii) to start the clock signal
when the sulphuric acid reaches the conducting areas (contact points, metallic pads,
etc.). Depending on the time elapsed until the package dissolves, one may establish
the dissolution depth and the shape of the hole. Then the metallic holder is
positioned with the upper part of the package downward and pressed against the
corrosion head (with sulphuric acid). The pressure force is controlled in order to
release the solder and contact points with the aid of hot sulphuric acid (290°C).
After opening, the package is processed with acetone, deionised water, ultrasound
and dry air [14.40].
One must know that one cause of semiconductor device degradation, both in
storage and in operation, is chlorine contamination of the package atmosphere.
This may be produced during die manufacturing or during mounting. In many cases,
the moulding press is the cause. It has been shown [14.35] that contamination of
the package atmosphere with chlorine may arise when the semiconductor devices
are treated with certain chlorinated solvents.

14.3.6
Mechanical means

The main purposes for the use of mechanical means are:
1. Opening of hermetic packages with a low-speed diamond saw.
2. Grinding and polishing of the sample for the metallographic analysis: rough
and fine grinding, fine polishing.

14.3.7
Microscope analysis

After enough information has been extracted by electrical analyses, the packages
are opened and the analysis may continue with the aid of the metallographic
microscope. Usually, ×50 or ×100 pictures are taken. Only if the position of the
defect in the functioning scheme of the integrated circuit has been clearly identified
may one draw conclusions on the external failure causes.
Hot spots (areas where a dielectric breakdown produced a short between the
gate and the source of a power FET, for instance) may be detected with liquid
crystals (nematic or cholesteric).

14.3.8
Plasma etcher

Decapsulation of plastic packages and removal of nitride passivation are easily
done with a simple plasma etcher. Such tools are available from several companies
(such as March Instruments, SPI Supplies, etc.).

14.3.9
Electron microscope

The first commercial Scanning Electron Microscope (SEM) was built in the mid-
1960s [14.31] and was immediately used for semiconductor technology. In the field
of quality control and failure analysis, SEM is an appropriate tool (Fig. 14.3...14.6).
The interest in developing non-destructive methods has imposed the use of SEM in
industry. Compared to optical microscopy, at a magnification of 1000×, the
resolution is 100 times better and the depth of field larger by an order of magnitude.
In Table 14.3 [14.29], some of the possibilities of SEM are presented. As one can
see, apart from surface analyses, investigations of the failure causes and failure
mechanisms of integrated circuits (e.g. masking defects, microcracks at the contact
windows or the forming of alloys at the contacts) can also be made. As examples of
the most usual failure mechanisms discovered with the aid of SEM, the following
can be mentioned: formation of intermetallic phases at the contact between
different materials, corrosion of the metallic leads, migration of material due to the
high current densities on the leading paths. For instance, if integrated circuits for
semiconductor memories must be reliably measured, one must know their
organisation and topology. A manufacturer rarely offers this kind of information
and, therefore, SEM may be used.

14.3.10
Special means

If the failure mechanism is not discovered with the aforementioned means, some
other tools must be used. Examples of such tools, allowing more to be learned about
the failure cause, are [14.42]:
• methods using secondary, back-scattered and Auger electrons (Auger electron
spectroscopy, scanning Auger microprobe, transmission electron microscope,
transmission electron energy-loss microscopy, low-energy electron diffraction),
• methods using electron-induced photon emission (electron probe
microanalysis with X-rays, appearance potential spectroscopy),
• methods using photo- and Auger electron emission (electron spectroscopy for
chemical analysis, X-ray induced Auger electron spectroscopy, ultraviolet
photoelectron spectrometry),
• methods using sampling by laser-induced emission (atomic absorption
spectroscopy, optical emission spectroscopy),
• methods using fluorescence and reflection (X-ray fluorescence spectrometric
analysis, light microscopy, IR, UV and Raman scattering, laser optical
spectrometry),
• methods using ion scattering (ion scattering spectrometry, Rutherford back-
scattering spectrometry, neutron activation analysis).
It is important to note that these special means (and others) must be used only if the
failure mechanism has not yet been identified. In other words, the special means
must be required by the logic of the failure analysis. A tendency to embellish failure
reports by adding "beautiful" results (impressive pictures or diagrams obtained with
sophisticated tools) is encountered all over the world. Such a technique may be
used for making a scientific paper more convincing, but it is also the most powerful
argument for increasing the price asked for a failure report. The customer must
be aware that the most expensive reports are sometimes not the best ones.

Fig. 14.2 Detail of a memory chip
Fig. 14.3 Detail of a metallisation
Fig. 14.4 Detail from Fig. 14.3, at a higher magnification
Fig. 14.5 Contact of a connection wire

Table 14.3 Examples of the usage of a Scanning Electron Microscope (SEM)

Field | Usage
Semiconductors: failure analysis | Formation of alloys at the solder joints (e.g. the Au/Al/Si phases); mechanical degradation of the solder joints; degradation of the aluminium metallisation; short-circuiting between the leading paths; microcracks in the contact windows; geometrical defects at the diffusions
Thin/thick layer technique | Leading paths with resistive paste; crossover analysis of the laser layers, surface and substrate (breaks, impurification)
Displays with liquid crystals and optical fibres | The surface of the glass and of the electrode layers after chemical treatments
Solder joints, contacts | Solder joints and connections; metallisations on Si and GaAs; surface impurification; resistive paste
Tape automated bonding | Construction and shape of the bumps; solder joints and layers on the die
Solder joints | Quality of the solder joints
Plastic materials | Chemical structure; modifications produced by temperature; mechanical modifications

14.4
Failure causes

Among the operational failure causes, or those produced by non-observance of the
guaranteed parameters, there are causes arising both at the manufacturer and at the
user, such as:
• design deficiencies,
• process errors,
• process variations,
• mounting and manipulation errors (user/manufacturer),
• wrong usage.
The failures may also arise from insufficiencies of the testing technique, such as:
• weak feedback from the user to the manufacturer,
• old-fashioned testing technique,
• non-flexible testing system,
• renunciation of expensive testing methods,
• non-adjustment of the testing system to the user's requirements.
To avoid wrong conclusions, the failed components must be analysed separately
by type of test, because it is important to know [14.28] whether the failure in a
humidity test is due to the penetration of moisture inside the package or to other
failure mechanisms.

14.5
Some examples

To accustom the reader to some current problems of failure analysis, in the
following, many cases of failed components will be analysed, based on richly
illustrated examples. The observation of these cases will be useful especially for
young engineers and technicians. Experience shows that, of the total failures,
the utilisation ones have a high weight. If these defects are excluded, the
distribution given in Fig. 14.7 has been found for the rest of the defects.

[Fig. 14.7 is a bar chart; categories, in increasing order of frequency:
crystallographic defects, die defects, oxide defects, other defects, connection
defects, metallisation defects; horizontal axis from 0 to 45 (%).]

Fig. 14.7 Distribution of failures for a semiconductor device

As one can see, the crystallographic defects are relatively rare. Generally,
the components with such defects are removed at the input control, at the
mounting stage or, possibly, at the first testing of the circuit.

14.10.1
Electrical overload

Case 1:
Fig. 14.8 TTL integrated circuit 944. Overload of an extender input

Case 2:
Fig. 14.9 DTL integrated circuit 9936, good at the input control, but failed at the
control of equipped cards (pin 13 interrupted). On opening the case, the path was
found to be melted and the input diode shorted

Case 3:
Fig. 14.10 Integrated circuit 936. Electrical overload: pads of the output transistors
are melted

Case 4:
Fig. 14.11 DTL integrated circuit 9946, defective at the electrical control of
equipped cards (inputs 1 and 2 overloaded)

Case 5:
Fig. 14.11 Optocoupler: the failure mode is an open circuit of the phototransistor;
the emitter solders are interrupted. Because the optocouplers had passed a 100%
electrical control, it seems that no mechanical defects occurred. To reach the
aluminium pad (leading to the emitter windows), the glass passivation layer was
removed and the failure mechanism was discovered: the metallisation surrounding
the emitter area was burned by an overload current produced by a scratch of the
pad during the manufacturing process. Only a small portion of the pad remained
good, allowing the electrical control to be passed. When the optocoupler was used,
the pad burned and the failure occurred

14.10.2
Mechanical defects

Case 6:
Fig. 14.12 Aluminium and oxide removal during ultrasonic bonding

Case 7:
Fig. 14.13 Local damage of the protection layer during ultrasonic bonding

Case 8:
Fig. 14.14 TTL IC 7410: two inputs were found defective at the electrical
functioning control of equipped cards. The silicon was broken under the contact
zone (a rare defect, produced by incorrect manipulation during the manufacturing
process)

Case 9:
Fig. 14.15 Local removal of aluminium at testing, below a thermocompression
area

Case 10:
Fig. 14.16 Break of an aluminium wire (ultrasonic bond)

Case 11:
Fig. 14.17 Crack in a crystal

Case 12:
Fig.14.18 Break of a die

14.10.4
Badly centred solder joints

Case 13:
Fig. 14.19 TTL IC 7400 (×170): output 8 is defective at the electrical control of
equipped cards. One may notice the short circuit between the contact wires
soldered at pins 8 and 7, respectively

Case 14:
Fig. 14.20 Failures of diodes after a test with temperature cycling [14.34].
Causes: wrongly centred dies and wrong alignment at diode mounting

14.10.5
Contact windows insufficiently opened

Case 15:
TTL IC 7475 (flip-flop with complementary outputs). Normal operation was
observed only for temperatures between 25 and 40°C. At temperatures higher than
40°C, the output level is unstable. The phenomenon is produced by contact
windows insufficiently opened at the open-collector output transistors.
(Fig. 14.21...14.23 Metallised dies. Fig. 14.24 Dies with metallisation removed.)

Fig. 14.21 Fig. 14.22

Fig. 14.23 Fig. 14.24



14.10.6
Electrostatic discharges

Case 16:
Bipolar LSI IC type HA1-4602-2: electrostatic discharges. There are no
differences between the handling precautions for bipolar and MOS ICs,
because both categories are sensitive to electrostatic discharges. SEM pictures
show the areas affected by the electrostatic discharge (Fig. 14.25...14.27)

Fig. 14.25

Fig. 14.26 Fig. 14.27



Case 17:
Partial view of the metallisation layer of a ROM die, longitudinal section
(Fig. 14.28... 14.31)

Fig. 14.28 Enlargement X2000 Fig. 14.29 Enlargement X5000

Fig. 14.30 Enlargement X11000 Fig. 14.31 Enlargement X21000



Case 18: Case 19:


Fig. 14.32 Notches formed during Fig. 14.33 Excellent metallisation of a
metallisation corrosion collector contact window of a TTL IC
(X5000)

Case 20: Case 21:


Fig. 14.34 Excellent covering of the Fig. 14.35 Wrong thinning of a
metallisation over an oxide step metallisation pad over an oxide step
(X9000) (X10000)

14.10.8
Catalogue sheets with incomplete data

Case 22:
Hybrid circuit voltage regulator with a power transistor at the output. Melted connection at
the emitter of the power transistor. This failure mechanism may be avoided if the
manufacturer specifies in the catalogue sheet that a capacitor with good high frequency
characteristics must be mounted at the regulator input (Fig. 14.36...14.38)

Fig. 14.36 Fig. 14.37

Fig. 14.38 An error occurred: the output
voltage is higher than the input voltage.
To avoid the failure, a blocking diode
must be mounted between the input and
output (a detail not mentioned by the
manufacturer)

14.10.9
Bad quality solder joints

Case 23:
Small signal transistors with wire bonding defects

Fig. 14.39 Bad soldering of a connection wire

Fig. 14.40 Edge solder joint

Fig. 14.41 Short circuit of the base wire with the crystal

14.10.10
Open circuits

Case 24:
Fig. 14.42 Electrical opens of a metallic
pad (RAM chip), produced by
electromigration

14.10.11
Popcorn noise (Burst noise)

Case 25:
Fig. 14.43 Typical example of popcorn
noise at an operational amplifier

14.10.12
Holes in silicon

Case 26:
Fig. 14.44 Silicon dissolution in
aluminium (X11000)

Case 27:
Fig. 14.45 Dissolution of silicon in
aluminium. Note the change of
orientation in the horizontal (100) plane
(X1700)

Case 28:
Fig. 14.46 Hole in a gate oxide, leading
to a short circuit between metallisation and
substrate (X5000)

14.10.13
Oxide defects

Case 29:
Fig. 14.47 Hole in a gate oxide,
leading to a short circuit between
metallisation and substrate (X5000)

Case 30:
Fig. 14.48 Crystallisation of a point
defect in a thermally grown SiO2 layer
(X4400)

Case 31:
Fig. 14.49 Surface separation of an
aluminium metallisation covering an
oxide step (X16000)

14.10.14
Advantages of the potential contrast method

Case 32:
Fig. 14.50 Image of a biased transistor, evidenced by
the potential contrast method (X1000)

Case 33:
Fig. 14.51 Discontinuity of a metallisation pad, evidenced
by the potential contrast method (X500)

14.10.15
Package opening

Case 34:
Metal or ceramic packages may be opened by polishing, cutting, soldering or by carefully
striking at a certain point, so as not to damage the die. The pictures show the opened metal
packages of two hybrid circuits with multiple dies. The solder joints are the weak points of
the system (Fig. 14.52-14.53)

Fig. 14.52

Fig. 14.53

Case 35:
Fig. 14.54 Opening plastic packages is
difficult. If input short circuits or opens
have been found in previous
investigations, X-ray radiography may
establish, before opening the package,
whether the defect is at the connection
between the pin and the die [14.26]

References

14.1 Vissiere, M. (1972): L'analyse des defaillances: les moyens, les methodes d'analyse,
principaux mecanismes de defaillances chez l'utilisateur. Actes du Congres National de
Fiabilite, Sept. 20-22, Perros-Guirec, France, pp. 147-153
14.2 Schwartz, S. (1976): Postmortems prevent future failures. Electronics, Jan. 23, pp. 92-106
14.3 Mann, J. E. (1978): Failure analysis of passive devices. Proceedings of the 16th Annual
Reliability Physics Symp., pp. 89-92
14.4 Wunsch, D. C. (1978): The application of electrical overstress models to gate protective
networks. Proceedings of the 16th Annual Reliability Physics Symp., pp. 47-55
14.5 *** (1973): Parts, material and process experience summary. NASA SP-6507, vol. 2,
Washington D. C.
14.6 Smith, J. S. (1978): Electrical overstress failure analysis in microcircuits. Proceedings
of the 16th Annual Reliability Physics Symp., pp. 41-46
14.7 *** MIL-STD-883, Method 5003
14.8 Parker, S. L., Lawson, L. E. (1976): Comparison of destruct physical analysis results on
electronic components. Proceedings of the 14th Annual Reliability Physics Symp., Jan.
20-22, Las Vegas, pp. 456-460
14.9 Takaide, A., Manabe, N. (1977): RA system using process failure analysis for ICs.
Proceedings of the 15th Annual Reliability Physics Symp., pp. 1-6
14.10 Tretter, J. (1976): Fehleruntersuchung, Fehlerklassifikation und Fehlerphysik bei
Bauelementen der Nachrichtentechnik. Fernmeldepraxis, Bd. 46, H. 6, pp. 197-216
14.11 Bonnaud, R., Guezou, P. (1978): Essais des composants mecaniques et electriques.
L'echo des Recherches, Jan., pp. 26-31
14.12 Behera, S. K., Speer, D.P. (1972): A procedure for the evaluation and failure analysis of
MOS memory circuits using the SEM in potential contrast mode. Proceedings of the
10th Annual Reliability Physics Symp., pp. 5-11
14.13 Kranzer, D. (1978): Correlation of crystal defects and bipolar device behaviour. Revue
de Physique Appliquee, vol. 13, Dec., pp. 803-807
14.14 Ebel, G. H., Engelke, H. A. (1973): Failure analysis of oxide defects. Proceedings of the
11th Annual Reliability Physics Symp., pp. 108-116
14.15 Piwczik, B., Siu, W. (1974): Specialized SEM voltage contrast techniques for LSI
failure analysis. Proceedings of the 11th Annual Reliability Physics Symp., pp. 49-53
14.16 Zick, G. L., Sheffer, T. T. (1977): Remote failure analysis of microbased
instrumentation. Computer, Sept., pp.30-35
14.17 Alter, M. J., McDonald, B. A. (1971): The SEM as a defect analysis tool for
semiconductor memories. Proceedings of the 10th Annual Reliability Physics Symp., pp.
149-159
14.18 Patterson, J. M. (1978): Developing an approach to semiconductor failure analysis and
curve tracer interpretation. Proceedings of the 16th Annual Reliability Physics Symp.,
pp.93-100
14.19 Burns, D. J. (1978): Microcircuit analysis techniques using field effect liquid crystals.
Proceedings of the 16th Annual Reliability Physics Symp., pp. 101-105
14.20 *** (1972): Qualitätsprüfung und Fehleranalyse an Bauelementen. Sonderheft der
Firma Wandel und Goltermann, Reutlingen
14.21 Boulaire, J.-Y., Boulet, J.-P. (1977): Les composants en exploitation. Analyse des
composants defectueux. L'echo des recherches, July, pp. 16-23
14.22 Hagenbusch, E. (1973): Auftrag, Aufgaben, Arbeitsweise Qualitätsprüflabors für
Bauelemente. Qualitätsprüfung und Fehleranalyse an Bauelementen. Sonderheft der
Firma Wandel und Goltermann, Reutlingen

14.23 Boulaire, J.-Y., Boulet, J.-P. (1978): Analyse des composants defectueux en
exploitation: methodes et resultats. Actes du Colloque International sur la Fiabilite et la
Maintenabilite, Paris, June 19-23, pp. 401-407
14.24 Belbeoch, J.-Y., Boulet, J.-P. (1978): SADE - systeme d'analyse des defaillances en
exploitation. L'echo des recherches, July, pp. 12-19
14.25 Doyle, R. Jr. (1979): Military microcircuits: failure analysis at RADC. Military
Electronics / Countermeasures, vol. 5, no. 2, pp. 75-79
14.26 Becker, P. (1982): Ausfallanalyse als wesentlicher Bestandteil der Qualitäts- und
Zuverlässigkeitssicherung. Qualität und Zuverlässigkeit, H. 8, Sonderdruck
14.27 Sebald, N. (1982): Qualitätssicherung integrierter Schaltkreise. IEE Productronic, vol.
27, no. 4, pp. 20-22
14.28 Angerer, R. et al. (1982): Beispiel aus der Tätigkeit der Komponenten-Evaluation. Neue
Technik, H. 11/12, pp. 42-47
14.29 Schäfer, W.; Niederauer, K. (1982): Rasterelektronenmikroskopie - ein Verfahren zur
Untersuchung fester Oberflächen. Messen+Prüfen/Automatik, H. 11, pp. 744-749
14.30 Hersener, J. (1982): Rasterelektronenmikroskopie und Halbleiterbauelemente-
Entwicklung. Messen+Prüfen/Automatik, H. 11, pp. 750-753
14.31 Oatley, C. W. (1982): The early history of SEM. J. Appl. Phys., pp. R1-R13
14.32 Băjenescu, T. I. (1984): Fehleranalyse an Halbleiterbauelementen. Elektronik Produktion
& Prüftechnik (West Germany), May, pp. 245-250
14.33 Weygang, A. H. (1979): Fehleranalyse an integrierten Halbleiterschaltungen.
Elektronik, H. 12, pp. 55-61
14.34 Doyle, E. A. Jr. (1981): How parts fail. IEEE Spectrum, no. 10, pp. 36-43
14.35 Nenyei, Zs.; Kalmar, G. (1982): Einfluss verschiedener chlorierter Lösemittel auf die
Zuverlässigkeit von Halbleiterbauelementen. Metalloberfläche, H. 8, pp. 372-379
14.36 Schaffer, E. (1979): Zuverlässigkeit, Verfügbarkeit und Sicherheit in der Elektronik.
Vogel-Verlag
14.37 Hosoya, N. (1981): "Pressurecooker" using steam pressure raises semiconductor
reliability. JEE, March, pp. 78-81
14.38 Dawes, C. J. (1976): An evaluation of techniques for bonding beam-lead devices to
gold thick films. Solid State Technology, March
14.39 Burgess, D. (1980): Physics of failure. In: Grant Ireson, W.; Coombs Jr., C. W. (eds.)
Handbook of reliability engineering and management. McGraw-Hill Book Comp.,
New York
14.40 Jaques, M. (1979): The chemistry of failure analysis. Proceedings of the 17th Annual
Reliability Physics Symp., pp. 197-208
14.41 Papaioannou, G. (1998): Report on Schottky diode assessment. Phare/TTQM project
RO 9602-02, IMT-Bucharest (Romania)
14.42 Werner, H. W.; Garten, R. P. H. (1984): A comparative study of methods for thin-film
surface analysis. Rep. Prog. Phys., vol. 47, pp. 221-344
14.43 Leroux, C.; Blachier, D.; Briere, O.; Reimbold, G. (1997): Light emission microscopy
for thin oxide reliability analysis. Microelectronic Engineering, vol. 36, p. 297
14.44 Nafria, M.; Sune, J.; Aymerich, X. (1993): Exploratory observations of post-breakdown
conduction in polycrystalline-silicon and metal-gate thin-oxide metal-oxide-
semiconductor capacitors. J. Appl. Phys., vol. 74, pp. 205-209
14.45 Wu, E.Y.; Lo, S.-H.; Abadeer, W.W.; Acovic, A.; Buchanan, D.; Furukawa, T.;
Brochu, D.; Dufresne, R. (1997): Determination of ultra-thin oxide voltages and
thickness and the impact on reliability projection. Proceedings of the IEEE International
Reliability Physics Symp., pp. 184-191
14.46 Kim, Q.; Stark, B.; Kayali, S. (1998): A novel, high resolution, non-contact channel
temperature measurement technique. Proceedings of the IEEE International Reliability
Physics Symp., pp. 108-112
14.47 De Wolf, I.; Howard, D.J.; Rasras, M.; Lauwers, A.; Maex, K.; Groeseneken, G.; Maes,
H.E. (1998): A reliability study of titanium silicide lines using micro-Raman
spectroscopy and emission microscopy. Proceedings of the IEEE International
Reliability Physics Symp., pp. 124-128
14.48 Cole Jr., E.I.; Soden, J.M.; Rife, J.L.; Baron, D.L.; Henderson, C.L. (1994): Novel
failure analysis techniques using photon probing in a scanning optical microscope.
Proceedings of the IEEE International Reliability Physics Symp., pp. 388-398
14.49 Chiang, C.L.; Hurley, D.T. (1998): Dynamics of backside wafer level microprobing.
Proceedings of the IEEE International Reliability Physics Symp., pp. 137-149
14.50 Picart, B.; Deboy, G. (1992): Failure analysis on VLSI circuits using emission
microscopy for backside observation. Proceedings of ESREF, pp. 515-520
14.51 Ishii, T.; Miyamoto, K.; Naitoh, K.; Azamawari, K. (1994): Functional failure analysis
technology from the backside of a VLSI chip. Proceedings of ISTFA, pp. 41-47
15 Appendix

15.1
Software package RAMTOOL++ [15.1]

Designed and manufactured by the Swiss technology corporation Oerlikon-Contraves AG,
RAMTOOL++ is intended as an improved tool for Reliability, Availability and
Maintainability (RAM) engineering. Proven in daily operation since 1985, the tool has
been successfully employed on the European and North American markets in advanced
programs. RAMTOOL++ was developed by a multinational team spanning several
engineering disciplines (mechanics, electronics and mathematics) and supports Concurrent
Engineering for RAM.
Based on MIL-HDBKs and / or MIL-STDs and today's leading industry standards such
as IEC or Nortel TR-332, the main features of RAMTOOL++ are user friendliness (to
ensure operation by engineers), flexibility and traceability, fast response, smooth
integration and maximum automation of processes.
RAMTOOL++ contains the core module (R3 Trecker) and some add-on modules
(RM Analyst, Mechanicus, Logistics, RM FFT-module and PPoF-module). Details are
given below.

15.1.1
Core and basic module R3 Trecker

The R3 Trecker is a utility for reliability prediction, ensuring failure rate prediction
and assessment of the Mean Time Between Failures (MTBF). R3 stands for the
three implemented reliability prediction methods:
• Parts count technique and component stress analysis method, both according
to the reference handbooks (MIL-HDBK-217F2 and Nortel TR-332, Issue 6).
• Integrated calculation scheme for the non-operating failure rate, adapted from
the last version of the RADC publication TR-85, as attached to the BETA version of
MIL-HDBK-217E.
This basic module of RAMTOOL++ may perform analyses for any equipment
during all life phases, covering a large range of requirements, such as:
• Parts count method for the operating / active state.
• Parts count method for the operating steady state, including limited stress analysis.
• Parts in-circuit stress analysis method for the operating / active state.

T. I. Băjenescu et al., Reliability of Electronic Components


© Springer-Verlag Berlin Heidelberg 1999

• Parts in-circuit stress analysis method for the non-operating state.
• Parts count or in-circuit stress analysis method with linear k-factor weighting for
the non-operating state.
• Field or test data entry, and / or usage of a field data library for the operating
state. Field data can be loaded regardless of the local load / usage rate (hours,
km, miles or rounds). The program will calculate the resulting duty cycle
related to the system and convert the usage rate into the final conversion factor
for processing of the failure rate.
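In its simplest form, the parts count technique named above reduces to summing the base failure rates of all parts, each weighted by its quantity and a quality factor; the MTBF is then the reciprocal of the total failure rate. A minimal sketch (part types and factor values are illustrative assumptions, not taken from MIL-HDBK-217):

```python
# Parts count sketch: lambda_total = sum(n_i * lambda_b_i * pi_Q_i),
# MTBF = 1 / lambda_total. All numeric values below are illustrative only.

parts = [
    # (quantity, base failure rate per 1e6 h, quality factor)
    (120, 0.002, 1.0),   # e.g. film resistors
    (40,  0.010, 1.5),   # e.g. ceramic capacitors
    (10,  0.050, 2.0),   # e.g. digital ICs
]

lambda_total = sum(n * lam_b * pi_q for n, lam_b, pi_q in parts)  # per 1e6 h
mtbf_hours = 1e6 / lambda_total

print(f"lambda_total = {lambda_total:.2f} failures per 1e6 h")
print(f"MTBF = {mtbf_hours:.0f} h")
```

The stress analysis variant follows the same summation but replaces the fixed base rates with stress- and temperature-dependent models per part.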

15.1.2
RM analyst

The RM (Reliability and Maintainability) Analyst is a RAM utility for predicting
the probabilistic behaviour of a specific design in a defined scenario. Calculations of
reliability parameters such as R over time, inherent availability, MTTR on
functional / mission level or mission MTBF are done for single equipment rolled up
to complex systems. This task is done by gathering and processing reliability and
maintainability parameters inside a functionally grouped reliability block diagram
(RBD).
Additional statistical utilities are included in this module, such as SPC (Statistical
Process Control) tools for processing field data, calculation of the form parameters
of the Weibull distribution with various confidence levels out of test or field
data, reliability growth programs using the Duane and AMSAA models, and tools for
estimating potential risks. Finally, this module offers features to optimise
maintenance and provides the host interface to install one of the most advanced
Markov programs for reliability calculation on complex systems with distributed
Markov parameters, provided by the Reliability Engineering department of the
Swiss Federal Institute of Technology (ETH) Zurich.
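The RBD roll-up described above reduces, for independent blocks, to multiplying reliabilities in series and complementing the product of unreliabilities in parallel; inherent availability follows from MTBF and MTTR. A sketch with illustrative failure rates (not RAMTOOL++ code):

```python
import math

def r_series(rs):
    """Reliability of blocks in series: all must survive."""
    out = 1.0
    for r in rs:
        out *= r
    return out

def r_parallel(rs):
    """Reliability of redundant (parallel) blocks: at least one survives."""
    q = 1.0
    for r in rs:
        q *= (1.0 - r)
    return 1.0 - q

t = 1000.0                      # mission time, h (illustrative)
r_a = math.exp(-1e-5 * t)       # unit A, lambda = 1e-5 /h
r_b = math.exp(-2e-5 * t)       # unit B, lambda = 2e-5 /h
# System: A in series with a duplicated (parallel) pair of B units
r_sys = r_series([r_a, r_parallel([r_b, r_b])])

# Inherent availability of one repairable unit: A = MTBF / (MTBF + MTTR)
a_inh = 50_000 / (50_000 + 4)   # MTBF = 50 000 h, MTTR = 4 h (illustrative)
print(f"R_sys(1000 h) = {r_sys:.6f}, A_inh = {a_inh:.6f}")
```

Note that redundancy lifts the B pair close to the reliability of A, so the series product is dominated by the non-redundant unit.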

15.1.3
Mechanicus (Maintainability analysis)

Mechanicus is a fully integrated, user-friendly maintainability analysis
tool, working automatically with the core module R3 Trecker. Calculations of the
mean, geometric and maximum maintenance times for a pre-selected confidence
level under various distributions (lognormal, exponential, Weibull) may be
performed.
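Under the lognormal repair-time distribution mentioned above, the geometric mean, the arithmetic mean and a "maximum" maintenance time at a given confidence level follow directly from the parameters mu and sigma of ln(t). A sketch with illustrative parameter values:

```python
import math
from statistics import NormalDist

mu, sigma = math.log(0.5), 0.6   # illustrative lognormal parameters of repair time (h)

median_time = math.exp(mu)                  # geometric mean repair time
mean_time = math.exp(mu + sigma**2 / 2)     # arithmetic mean (MTTR)
# "Maximum" maintenance time at 95% confidence: the 95th percentile
z95 = NormalDist().inv_cdf(0.95)
m_max_95 = math.exp(mu + z95 * sigma)

print(f"median = {median_time:.3f} h, mean = {mean_time:.3f} h, "
      f"M_max(95%) = {m_max_95:.3f} h")
```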

15.1.4
Logistics

The RM Logistics module enables concurrent provisioning for the right logistics (just in
time) at optimised costs. It contains the provisioning sub-module and the RCM
Optimator. The provisioning sub-module uses the concurrent engineering idea that
reliability data are already available. The user can set up ground rules defining
what is repairable and what is not. The RCM Optimator (reliability cost
modelling optimisation module) uses RAM data to calculate the related costs of
ownership, up to a life cycle cost, at any time during the project.

15.1.5
RM FFT-module

The RM FFT-module is used for FMEA / FMECA / testability analysis, combining
many tasks in one tool:
• Failure mode and effect analysis (FMEA)
• Failure mode, effect and criticality analysis (FMECA)
• Testability analysis, to assess the achieved failure detection / failure isolation
ratio for a defined total hardware amount (failure rate).
It contains failure mode (FM) libraries covering the well-known FMs of electronic
and electro-mechanical components, expandable to classical mechanical
and pneumatic components. The analysis mode can be switched between hardware
orientated (typically element level) and functional level (typically black box).

15.1.6
PPoF-module

The PPoF-module for process / production oriented FMEA provides the capability to
perform a process-construction-production FMEA, analysing the potential risks
involved in a given task (process). This offers information for corrective actions.
In conclusion, RAMTOOL++ offers a truly integrated family of RAM tools and
processes. The flexibility of the package allows RAMTOOL++ to be used on any
scale, tailored to one's own resources or requirements.

15.2
Failure rates for components used in telecommunications

The French National Centre for Telecommunication Studies (CNET = Centre National
d'Etudes de Telecommunications) recommended [15.2] the following failure rates
for a simplified calculation of reliability predictions, for stationary conditions
characterised by the following environmental factors: vibrations:
2...60 Hz; acceleration: 0.5 g; noise: 40...70 dB; dust: moderate; pressure: 10^5 Pa;
relative humidity: 20...90%; mechanical shock: 11 ms / 10...15 g.
In all cases, the following model was considered valid for the failure rate:

λ = πQ · λa · 10^-9 h^-1    (15.1)

where πQ is the qualifying factor, taking values between 1 (for the most qualified
product) and 15 (for a product without qualification).
The values of λa are presented in the tables below. Note that L is the loading
factor.
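The model is evaluated by picking the base rate from the tables below and a qualifying factor between 1 and 15. As an illustration (not a prescribed combination), a carbon resistor at the upper end of its tabulated range in a product without qualification:

```python
# Failure rate model of Eq. (15.1): lambda = pi_Q * lambda_a * 1e-9 per hour
pi_q = 15        # product without qualification
lambda_a = 5.2   # carbon resistor, upper end of the tabulated range

lam = pi_q * lambda_a * 1e-9   # failures per hour
print(f"lambda = {lam:.2e} /h = {lam * 1e9:.0f} FIT")
```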

RESISTORS
Resistor families L TA (°C) λa
Carbon resistors 0.3 40...70 1.6...5.2
Metal film high stability small power resistors 0.1 40...70 1.2...1.7
Metal film isolated small power resistors 0.5 40...70 6.0...8.0
Metal film high power resistors 0.5 40...70 930...1020
Wirewound high power resistors 0.5 40...70 54...72
Wirewound precision small power resistors 0.1 40...70 42...82
Wirewound precision high power resistors 0.8 40...70 210...325

POTENTIOMETERS
Potentiometer families L TA (°C) λa
Film potentiometers with adjustable resistors 0.3 40...70 100...150
Film precision potentiometers 0.1 40...70 1580...1880
Wirewound small power potentiometers 0.1 40...70 410...600
Wirewound high power potentiometers 0.5 40...70 725...1050
Wirewound precision potentiometers 0.1 40...70 685...930
Wirewound fine tune potentiometers 0.5 40...70 54...78

CAPACITORS
Capacitor families L TA (°C) λa
Polyester / foil capacitors (70°C) 0.1 40...70 1.8...13.4
Polyester / foil capacitors (85°C) 0.1 40...70 1.4...2.0
Polyester / foil capacitors (125°C) 0.1 40...70 1.2...1.4
Glass capacitors 0.3 40...70 24...88
Ceramic capacitors with nondefined temperature coefficient (85°C) 0.5 40...70 24...26
Ceramic capacitors with nondefined temperature coefficient (125°C) 0.5 40...70 22
Ceramic capacitors with defined temperature coefficient 0.1 40...70 4...2
Tantalum capacitors with liquid electrolyte 0.5 40...70 150...200
Tantalum capacitors with solid electrolyte 0.5 40...70 11...17
Aluminium capacitors with liquid electrolyte 0.5 40...70 60...150
Aluminium capacitors with solid electrolyte 0.5 40...70 94...350
Mica humid capacitors 0.1 40...70 2...6
Mica button capacitors 0.1 40...70 8.8...9.6

OPTOCOUPLERS (operation induces a temperature rise of 10°C)
TA (°C) λa
40...70 600...1170

LIGHT EMITTING DIODES (LED)
Device λ (10^-9 h^-1)
LED 80

DIODES / THYRISTORS (L = 0.3)
Diode / thyristor families TA (°C) λa
Silicon small signal diodes 40...70 16...27
Germanium small signal diodes 40...70 23...90
Z diodes 40...70 44...58
RF silicon pulse diodes 40...70 3250...4050
RF germanium pulse diodes 40 10,000
Varactor and tunnel effect diodes 40...70 925...1320
Thyristors 40...70 30...55

TRANSISTORS
Transistor families L TA (°C) λa
NPN silicon transistors 0.03 40...70 37...50
PNP silicon transistors 0.03 40...70 55...80
NPN germanium transistors 0.3 40...70 160...475
PNP germanium transistors 0.3 40...70 60...175
Field effect silicon transistors 0.1 40...70 75...105
Unijunction silicon transistors 0.1 40...70 65...110

VARIOUS COMPONENTS
Types λ (10^-9 h^-1)
Thermistors 12
Quartz devices 200
Solder joints (wave / manual) 0.2...1.0
Components with ferrites 1000
Connectors (25 pins) 312
Equipped cards (double sided / multilayer) with N holes (0.01...1)·N

INTEGRATED CIRCUITS
IC families TA (°C) λa
Digital ICs with less than 400 transistors (100 gates):
• TTL + DTL 40...70 50...70
• ECL + MOS 40...70 60...200
Digital ICs with more than 400 transistors (100 gates):
• TTL + DTL 40...70 1430...3135
• ECL + MOS 40...70 3570...21400
ROM memories (1K):
• TTL + DTL 40...70 470...1120
• ECL + MOS 40...70 1290...8130
Content Addressable Memories - CAM (1K):
• RAM 40 815
Shift registers (with more than 2x8 binary elements):
• TTL + DTL 70 1960
• ECL + MOS 40...70 2230...14070
Linear ICs (20 transistors) 40...70 105...310

15.3
Failure types for electronic components [15.2]

fo = probability of open circuit
fk = probability of short circuit

Components fo fk Drift
Resistors 0.99 0.01
Film potentiometers 0.7 0.1 0.2
Wirewound potentiometers, small power 0.9 0.1
Ceramic capacitors with nondefined temperature coefficient (type I) 0.4 0.4
Ceramic capacitors with nondefined temperature coefficient (85°C, type II) 0.1 0.9
Ceramic capacitors with nondefined temperature coefficient (125...150°C, type II) 0.5 0.5
Tantalum capacitors with solid or liquid electrolyte 0.2 0.8
Aluminium capacitors with liquid electrolyte, U < 63V 0.1 0.9
Aluminium capacitors with liquid electrolyte, 63V < U < 350V 0.1 0.1 0.8
Aluminium capacitors with liquid electrolyte, U > 350V 0.5 0.5
Aluminium capacitors with solid electrolyte 0.3 0.7
Mica humid capacitors 0.1 0.7 0.2
Mica button capacitors 0.2 0.8
Paper or plastic capacitors 0.2 0.8
Glass capacitors 0.8 0.2
Coils and transformers 0.8 0.2
Silicon signal and rectifier diodes 0.2 0.8
Z diodes 0.3 0.6 0.1
Thyristors 0.2 0.2 0.6
Field effect and unijunction transistors 0.3 0.3 0.4
Optocouplers 0.1 0.1 0.8
Bipolar integrated circuits 0.4 0.6

15.4
Detailed failure modes for some components

Components Failure probability (%)
Silicon npn and pnp transistors
Breakdown 25
Open circuit: EB / BC / EBC 15 / 5 / 10
Short circuit: EB / BC / EBC 5 / 10 / 20
Small current gain 5
High leakage currents 5
Bipolar integrated circuits
Open input / output 10 / 20
Short circuit input / output 20 / 20
Degraded input / output 5 / 5
Too high / zero supply current 10 / 5
Logic function not respected 5
MOS integrated circuits
Short circuit input / output / supply 10 / 10 / 5
Internal defect 75
Linear integrated circuits
Short circuits 30
Open circuits 10
Blocking 60

15.5
Storage reliability data [15.3]

It has been shown that storage has an important influence on the reliability of a
product. In the following, American and European data on the reliability of
components stored under given environmental conditions are presented; the data
were collected by the "Reliability" group of AFCIQ.
• Environment: fixed ground.
• Failure rate: in FIT, confidence level 60%.

Components λ (FIT)
Bipolar analog SSI/MSI ICs (<100 gates) 958
Tantalum capacitors 229
LEDs 152
Transistors (all types) 87
Bipolar digital SSI/MSI ICs (<100 gates) 68
Diodes (all types) 32
Inductive components (all types) 23
Potentiometers 18
Resistors (all types) 3.2
Capacitors (all types) 0.44
Solder joints 0.32
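Since one FIT is one failure per 10^9 component-hours, the tabulated storage rates convert directly into an equivalent MTBF. For the tantalum capacitors above (229 FIT), for example:

```python
# 1 FIT = 1 failure per 1e9 device-hours
fit = 229                      # tantalum capacitors in storage (table above)
lam = fit * 1e-9               # failures per hour
mtbf_hours = 1.0 / lam
mtbf_years = mtbf_hours / 8760
print(f"lambda = {lam:.3e} /h, MTBF = {mtbf_years:.0f} years")
```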

15.6
Typical costs for the screening of plastic encapsulated ICs
(in Swiss francs) [15.4]

Operation MSI LSI VLSI
Visual control 0.03 0.05 0.05
Storage at high temperature (24 h / 125°C) 0.07 0.07 0.10
Temperature cycles (10 cycles / -40°C...+125°C) 0.06 0.15 0.25
Burn-in (72 h / 125°C / dynamic) 0.55 1.40 2.20
Electrical characterisation 0.15 1.00 2.00
Total 0.86 2.67 4.60

15.7
Failure criteria. Some examples

Components Parameter Symbol Limits
Transistors Residual current collector-base ICBO 2 x SSL
 Residual current emitter-base IEBO 2 x SSL
 Current gain hFE 0.8 x SIL; 1.2 x SSL
LEDs Direct voltage VF 1.2 x SSL
 Reverse current IR 2 x SSL
 Light intensity Iv 0.7 x SIL
Logic ICs Supply current IDD 2 x SSL
 High output current IOH 0.8 x SIL
 Minimum high output voltage VOH 0.8 x SIL
 Maximum low output voltage VOL 1.2 x SSL
Linear ICs Supply current ICC 0.8 x SSL; 1.2 x SSL
 Voltage gain GV SSL - 3dB; SSL + 3dB
 Output voltage VO 0.8 x SIL; 1.2 x SSL

SSL = Specified Superior Limit; SIL = Specified Inferior Limit
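Criteria of this kind are straightforward to automate: each measured parameter is compared against the tabulated multiple of its specified limit. A sketch for two transistor rows of the table; the SSL and SIL values used below are illustrative assumptions, not taken from any data sheet:

```python
# Failure criteria check: a part fails if a parameter drifts beyond the
# tabulated multiple of its specified limit (SSL = specified superior limit,
# SIL = specified inferior limit). Limit values here are illustrative.

def fails(measured, ssl=None, sil=None, k_up=None, k_down=None):
    """True if measured exceeds k_up * SSL or drops below k_down * SIL."""
    if ssl is not None and k_up is not None and measured > k_up * ssl:
        return True
    if sil is not None and k_down is not None and measured < k_down * sil:
        return True
    return False

# Transistor I_CBO: failure if > 2 x SSL (SSL assumed 10 nA here)
icbo_failed = fails(25e-9, ssl=10e-9, k_up=2)          # 25 nA -> failure
# Transistor h_FE: failure if < 0.8 x SIL or > 1.2 x SSL (SIL 100, SSL 300 assumed)
hfe_failed = fails(150, ssl=300, sil=100, k_up=1.2, k_down=0.8)  # 150 -> pass
print(icbo_failed, hfe_failed)
```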

15.8
Results of 1000 h HTB life tests for 8 bit CMOS
microprocessors encapsulated in ceramics, type NSC 800
[15.5]

Batch number Temperature (°C) Number of items Number of failures Failure mode
8103 125 77 0 -
8111 125 154 1 Parameter defect
8124 125 84 1 Sudden failure
8211 125 55 0 -
8216 125 55 0 -
Total 425 2 see Note

Notes: 1. HTB = High Temperature Bias (continuous operation at supply
voltage and 125°C, or at an environmental temperature at which Tj
reaches the maximum admissible value).
2. 0.47% failures / 1000 h at 125°C; 0.09% failures at 55°C; Ea = 0.7 eV;
confidence level 60%.
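The derating in Note 2 from 125°C to 55°C rests on the Arrhenius model with Ea = 0.7 eV. A sketch of the plain Arrhenius acceleration factor between two junction temperatures (the 60% confidence treatment applied in the source is not reproduced):

```python
import math

K_B = 8.617e-5          # Boltzmann constant, eV/K

def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between use and stress temperatures."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / K_B * (1.0 / t_use - 1.0 / t_stress))

af = acceleration_factor(0.7, 55.0, 125.0)
print(f"AF(55 degC -> 125 degC, Ea = 0.7 eV) = {af:.0f}")
```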

15.9
Results of 1000 h HTB life tests for linear circuits
encapsulated in plastic [15.5]

IC type Batch no. Temp. (°C) Number of items Number of failures Failure mode
LM 1458 8123 125 45 0 -
LM 308 8131 125 45 1 parameter defect
LM 311 8134 125 45 0 -
LM 308 8134 125 45 0 -
LM 311 8136 125 45 1 parameter defect
LM 319 8137 125 45 1 parameter defect
LM 311 8137 125 45 0 -
LM 339 8138 125 45 1 parameter defect
LM 339 8139 125 45 0 -
Total 405 4

Note: 0.99% / 1000 h; 80.3 FIT
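The 0.99% / 1000 h figure in the note is the simple point estimate from the table totals: failures divided by accumulated device-hours. A sketch of that calculation (point estimate only, without the confidence-level treatment used elsewhere in this appendix):

```python
# Point estimate of the failure rate from the life-test totals above:
# 405 devices, 1000 h each, 4 failures.
failures = 4
device_hours = 405 * 1000

lam = failures / device_hours        # failures per hour
pct_per_1000h = lam * 1000 * 100     # percent failed per 1000 h
print(f"lambda = {pct_per_1000h:.2f}% / 1000 h at 125 degC")
```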

15.10
Average values of the failure rates for some IC families

IC families Number of items Mean failure rate (%) Number of batches Number of batches with pre-treatment
TTL 288595 1.0 258 70
CMOS 257385 1.6 268 83
PMOS 384326 1.5 77 98
µP 116123 2.2 131 2
Peripheries 92318 4.0 227 6
RAM 92503 2.0 100 12
EPROM 105924 3.2 144 98
Total: 1337174 1.8 1205 369

15.11
Activation energy values for various technologies

Device type or test condition Ea (eV) [15.6] Ea (eV) [15.7]
Silicon diodes:
• four layers 1.41 -
• varactors 2.31-2.38 -
Silicon bipolar transistors:
• Surface inversion failures 1.0-1.02 1.02
• Au-Al bond failures 1.02-1.04 1.02-1.04
• Metal penetration 1.65 1.77
• Aluminium electromigration - 0.4-0.8
• Surface charge accumulation; mobile ions - 1.0-1.05
Silicon unipolar transistors:
• Threshold shift at MOS - 1.0-1.6
• Surface charge accumulation; mobile ions - 1.0-1.35
• Charge injection, slow trapping at the Si-SiO2 interface - 1.0-1.3
Integrated circuits:
• Oxide defects (pinholes, etc.) 0.3 0.3
• Silicon defects 0.3 0.3
• Mask defects 0.5 -
• Ball bond lifts 0.35-1.0 -
• Electromigration 1.0-1.1 0.5-1.2
• Contamination (ion migration) 1.0-1.4 1.4
• Charge injection 1.3 1.0-1.3
• Electrolytic corrosion 0.3-0.7 0.3-0.6
MIL-STD-883B:
• Burn-in test (method 1005.2) 0.44 0.44
• High temp. storage (method 1008.1) 1.0 1.0
• Steady-state life (method 1015.2) 1.0 1.0

15.12 Failures at burn-in [15.8]

IC technologies Static burn-in Dynamic burn-in
 96 h 160 h 96 h 160 h
Bipolar SSI/MSI - 0.7 0.4 0.7
Bipolar LSI - 0.8 0.7 1.2
MOS SSI/MSI 0.7 0.9 - 1.0
MOS LSI 0.5 1.2 1.1 1.6
Linear - 1.9 1.6 2.0

References

15.1 xxx (1997): RAMTOOL++. Oerlikon-Contraves AG, Zürich


15.2 xxx (1979): Recueil des donnees de fiabilite. Centre National d'Etudes de
Telecommunications, France
15.3 xxx (1978): Recueil des donnees en stockage des composants electroniques. Association
Française pour le Contrôle Industriel de Qualité, May
15.4 Baummer, H. (1983): Knapp 2% fehlerhafte ICs. Schweizerische Technische Zeitung, no.
23, November, p. 42
15.5 xxx (1982): Reliability Scanner, vol. 4
15.6 Epstein, D. (1982): Application and use of acceleration factors in microelectronics testing.
Solid State Technology, November, pp.116-122
15.7 Hnatek, E. (1987): Integrated circuit quality and reliability. Marcel Dekker, Inc., New
York
15.8 Rickers, H. C. (1978): Microcircuit screening effectiveness. TSR - 1, Reliability Analysis
Center / Griffith Air Force Base
General bibliography

*** (1998): Proceedings of the Custom Integrated Circuits Conference, Santa Clara,
California (USA), Mai 11-14, 1998
Abdel, Ghaly, A A (1986): Ph. D. thesis. City University of London
Abel, D. (1990): Petri-Netze flir Ingenieure. Springer-Verlag, Berlin
AbramoWitz, M.; Stegun, 1. E., eds. (1965): Handbook of Mathematical Functions. Dover,
New York
Ackermann, W.-G. (1955): Einftihrung in die Wahrscheinlichkeitsrechnung. S. Hirzel
Verlag, Leipzig
Ackmann, W. (1961): Alterungskriterien bei Elektrolytkondensatoren. NTF 24. vol. I, pp.
115-126
Ackmann, W. (1973): Reliability and Failure of Capacitors. Proceedings of the 3rd
Symposium on Reliability, Budapest, pp. 3-12
Ackmann, W. (1976): Zuverlassigkeit elektronischer Bauelemente. Hiithig-Verlag,
Heidelberg
Ackmann, W.: Neuere Ergebnisse zur Zuverlassigkeit des Ta-Kondensators. SEL-
Nachrichten vol. 12, no. 1, pp. 38-41
Adams, E. N. (1984): Optimizing Preventive Service of Software Products. IBM Journal of
Research and Development, vol. 28, no. 1
AFCIQ (1981): Guide d'evaluation de fiabilite en mecanique. Paris
Aitchison, J.; Dunsmore, 1. R. (1975): Statistical Prediction Analysis, Cambridge University
Press, Cambridge
Akaike H. (1982): Prediction and Entropy. MRC Technical Summary Report, Mathematics
Research Center, University of Wisconsin-Madison
Amerasekera, A; Campbell, D. S. (1987): Failure Mechanisms in Semiconductor Devices. 1.
Wiley and Sons, Chichester
Amerasekera, A; Verwey, J. (1992): ESD in Integrated Circuits. Quality and Reliability
Engineering International, vol. 8, pp. 259-272
Amman, P. E.; Knight, J. C. (1987): Data Diversity: An Approach to Software Fault
Tolerance. Digest FTCS-17, 17th Internat. Symposium on Fault-Tolerant Computing,
pp. 122-126
Anderson, R. 1 (1985): A V-8B Design for Maintainability. Proc. Ann. ReI. & Maint.
Symp., pp. 28-33
Anderson, R. T. (1976): Reliability Design Handbook. lIT Research Institute, Chicago
Andre, G.; Regnault, 1 (1972): Problemes de la fiabilite lies a I'encapsulation plastique.
L'Onde electrique, vol. 2, fasc. 3, mars, pp. 121-125
Ankenbrandt, F. 1. (1960): Electronic Maintainability. vol. 3, London
Arrow, K. 1; Karlin, S.; Scarf, H. (1962): Studies in Applied Probability and Management
Science. Stanford University Press
Arsenault, 1 E.; Roberts, 1 A. (1980):Reliability and Maintenability of Electronic Systems.
Computer Science Press, Rockville, Maryland
Ascher, H.; Feingold, H. (1984): Repairable Systems Reliability. Dekker, New York
426 General bibliography

Avery, 1. R. (1985): Electrostatic Discharge: Mechanisms, Protection Techniques and


Effects on Integrated Circuit Reliability. Quality and Reliability Engineering
International, vol. I, pp. 119-124
Bacivarof, I. C. (1994): Common European Programs in Quality and Dependability. Quality Engineering, vol. 7, no. 1, pp. 237-238
Bacivarof, I. C.; Balme, L. J. (1996): Quality Efforts in Europe. Quality Engineering, vol. 8, no. 4, pp. 525-531
Baeckelandt, M. (1992): Manuel d'ingénierie systèmes. Masson, Paris
Bailey, M. M. (1977): Effects of Burn-in and Temperature Cycling on the Corrosion
Resistance of Plastic ICs. Internat. Reliab. Phys. Symp., pp. 120-124
Bajenesco, T. I. (1974): Source de perturbations pseudo-aléatoires. EMI (France), no. 191, pp. 67-70
Bajenesco, T. I. (1975): La fiabilité des microcircuits bipolaires et MOS. Conference at the École Polytechnique Fédérale de Lausanne (EPFL), November 6
Bajenesco, T. I. (1975): Pratiques actuelles en fiabilité. Conference at the École Polytechnique Fédérale de Lausanne (EPFL), October 30
Bajenesco, T. I. (1975): Sur la fiabilité des photocoupleurs. Conference at the École Polytechnique Fédérale de Lausanne (EPFL), November 17
Bajenesco, T. I. (1977): Caractéristiques des systèmes adaptatifs. La Revue Polytechnique (Switzerland), no. 1359, pp. 231-235
Bajenesco, T. I. (1977): Fiabilité et redondance. Conference at the École Polytechnique Fédérale de Lausanne (EPFL), March 2
Bajenesco, T. I. (1977): Initiation à la fiabilité en électronique moderne. Masson, Paris
Bajenesco, T. I. (1977): Maintenabilité, fiabilité et disponibilité des systèmes électroniques. Bulletin SEV/VSE (Switzerland), vol. 68, no. 18, pp. 964-968
Bajenesco, T. I. (1977): Systèmes adaptatifs. La Revue Polytechnique (Switzerland), no. 1358, pp. 117-123
Bajenesco, T. I. (1978): Microcircuits. Reliability, incoming inspection, screening and
optimal efficiency. International Conference on Reliability and Maintainability, Paris,
June 19-23
Bajenesco, T. I. (1981): Problèmes de la fiabilité des composants électroniques actifs actuels. Masson, Paris
Bajenesco, T. I. (1982): Le C.N.E.T. et les tests de fiabilité des photocoupleurs. L'Indicateur Industriel (Switzerland), no. 9, pp. 15-19
Bajenesco, T. I. (1983): Anzeigeelemente und deren Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 6, pp. 18-24
Bajenesco, T. I. (1983): Fiabilité, modes de défaillances et complexité des composants électroniques. Electronique (Switzerland), no. 12, pp. 23-25
Bajenesco, T. I. (1983): Politique d'entretien des équipements électroniques et l'approche personnalisée de la maintenance. Electronique (Switzerland), no. 5, pp. 43-44
Bajenesco, T. I. (1983): Progrès de la technologie des circuits intégrés dus à la microscopie électronique à haute tension. Electronique (Switzerland), no. 11, pp. 39-40
Bajenesco, T. I. (1983): Quelques aspects économiques du "burn-in". La Revue Polytechnique (Switzerland), no. 6, pp. 667-669
Bajenesco, T. I. (1984): Fiabilité et méthode de déverminage des condensateurs céramique multicouches. Electronique (Switzerland), no. 2, pp. 26-27
Bajenesco, T. I. (1984): La fiabilité des relais. Electronique (Switzerland), no. 10
Bajenesco, T. I. (1984): Mémoires RAM: quelle fiabilité? La Revue Polytechnique (Switzerland), no. 6, p. 701
Bajenesco, T. I. (1984): Microcircuits enrobés de plastique en fonctionnement intermittent. La Revue Polytechnique (Switzerland), no. 1, pp. 17-19
Bajenesco, T. I. (1984): Sur la fiabilité des thyristors. Electronique (Switzerland), no. 4, pp. 26-31
Bajenesco, T. I. (1985): Corrélation technologie-fiabilité: cas des diodes de signal. Electronique (Switzerland), no. 5, pp. 35-37
Bajenesco, T. I. (1987): Diodes de puissance BYW77 sous la loupe. Marché Suisse des Machines (Switzerland), no. 16, pp. 20-23
Bajenesco, T. I. (1988): Diodes de redressement en boîtier verre sous la loupe. Marché Suisse des Machines (Switzerland), no. 16, pp. 32-35
Bajenesco, T. I. (1989): La testabilité: pourquoi et comment? La Revue Polytechnique (Switzerland), no. 1514, pp. 883-885
Bajenesco, T. I. (1991): Une testabilité accrue. Marché Suisse des Machines (Switzerland), no. 4, pp. 32-39
Bajenesco, T. I.: Microcircuits: fiabilité et contraintes. La Revue Polytechnique (Switzerland), no. 1355, pp. 1051-1095
Bajenesco, T. I.: Sur la fiabilité des mémoires bipolaires PROM. Bulletin SEV/VSE (Switzerland), vol. 69, no. 6, pp. 268-273
Bajenescu, T. I. (1978): La fiabilité des résistances. La Revue Polytechnique (Switzerland), no. 9, pp. 993-997
Bajenescu, T. I. (1979): Elektronik und Zuverlässigkeit. Hallwag-Verlag, Bern (Switzerland), Stuttgart (West Germany)
Bajenescu, T. I. (1981): Zuverlässigkeit passiver Komponenten. Technische Rundschau (Switzerland), a series of papers from March to June
Bajenescu, T. I. (1981): Zuverlässigkeit von Kondensatoren. Feinwerktechnik & Messtechnik (West Germany), vol. 89, no. 7, pp. 313-320
Bajenescu, T. I. (1982): Eingangskontrolle hilft Kosten senken. Schweizerische Technische Zeitschrift (Switzerland), no. 22, pp. 24-27
Bajenescu, T. I. (1982): Look Out for Cost/Reliability Optimization of ICs by Incoming Inspection. Proceedings of EUROCON '82 (Holland), pp. 893-895
Bajenescu, T. I. (1982): Wann steigt die nächste Komponente aus? Neue Technik (Switzerland), no. 7/8, pp. 45-46
Bajenescu, T. I. (1982): Zuverlässigkeit und Systemzuverlässigkeit. Aktuelle Technik (Switzerland), no. 7/8, pp. 9-13
Bajenescu, T. I. (1982): Zuverlässigkeitsprobleme bei Siliziumleistungstransistoren. Elektroniker (Switzerland), no. 18, pp. EL1-EL8; no. 19, pp. EL19-EL26
Bajenescu, T. I. (1982/1983): Zuverlässigkeit monolithisch integrierter Schaltungen. EPP (West Germany), a series of papers from September to May
Bajenescu, T. I. (1983): Dem Fehlerteufel auf der Spur. Elektronikpraxis (West Germany), no. 2, pp. 36-43
Bajenescu, T. I. (1983): Fehleruntersuchung elektronischer Bauelemente. Aktuelle Technik (Switzerland), no. 10, pp. 22-28
Bajenescu, T. I. (1983): Fertigung bestimmt Qualität. Elektronikpraxis (West Germany), November, pp. 78-84
Bajenescu, T. I. (1983): Fertigungs- und Zuverlässigkeitsprobleme bei hybriden Mikroschaltungen. Elektroniker (Switzerland), vol. 12, pp. 71-82
Bajenescu, T. I. (1983): Mikroprozessoren und Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 10, pp. 18-21
Bajenescu, T. I. (1983): Pourquoi le test de déverminage des composants? Electronique (Switzerland), no. 4, pp. EL8-EL11
Bajenescu, T. I. (1984): Ausfallanalyse elektronischer Komponenten. Elektrotechnik und Maschinenbau (Austria), no. 10, pp. 455-463
Bajenescu, T. I. (1984): Einige Probleme der zuverlässigen Bondung in der Mikroelektronik. Aktuelle Technik (Switzerland), no. 6, pp. 33-37
Bajenescu, T. I. (1984): Fehleranalyse an Halbleiterbauelementen. Elektronik Produktion & Prüftechnik (West Germany), May, pp. 245-250
Bajenescu, T. I. (1984): Mikroprozessoren testen. Elektronikpraxis (West Germany), no. 7, pp. 34-40
Bajenescu, T. I. (1984): Optokoppler und deren Zuverlässigkeitsprobleme. Aktuelle Technik (Switzerland), no. 3, pp. 17-21
Bajenescu, T. I. (1984): Relais und Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 1, pp. 17-23
Bajenescu, T. I. (1984): Zeitstandfestigkeit von Drahtbondverbindungen. Elektronik Produktion & Prüftechnik (West Germany), October, pp. 746-748
Bajenescu, T. I. (1984): Zuverlässig? Kriterien für Mikroprozessoren. Hard und Soft (West Germany), April, pp. 24-27
Bajenescu, T. I. (1984): Zuverlässigkeitsprobleme bei den Halbleiterspeichern und Mikroprozessoren. Elektroniker (Switzerland), no. 9, pp. 25-34; no. 10, pp. 49-57
Bajenescu, T. I. (1985): Excess Noise and Reliability. Proc. of the Sixth Symp. on Reliability in Electronics, Budapest (Hungary), October 1985
Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. VDE-Verlag
Bajenescu, T. I. (1986): Quelques pages d'histoire de l'assurance qualité. Electronique (Switzerland), no. 5, pp. 38-39; no. 7/8, pp. 28-29
Bajenescu, T. I. (1989): A Pragmatic Approach to the Evaluation of Accelerated Test Data. Proceedings of the Fifth IASTED International Conference on Reliability and Quality Control, Lugano (Switzerland), June 20-22, 1989
Bajenescu, T. I. (1989): Evaluating Accelerated Test Data. Proceedings of the International Conference on Electrical Contacts and Electromechanical Components, Beijing (P. R. China), May 9-12, pp. 429-432
Bajenescu, T. I. (1989): Über die Zuverlässigkeit der Lithium-Thionylchlorid-Batterien. Proceedings of the Electronics Conference, Vienna (Austria), September 26-27
Bajenescu, T. I. (1990): Einige Gedanken über die Zuverlässigkeit der Lithium-Thionylchlorid-Batterien. e & i (Austria), no. 10, pp. 498-501
Bajenescu, T. I. (1991): A Pragmatic Approach to Reliability Growth. Proceedings of 8th
Symposium on Reliability in Electronics RELECTRONIC '91, August 26-30, Budapest
(Hungary), pp. 1023-1028
Bajenescu, T. I. (1991): A Pragmatic Approach to the Evaluation of Accelerated Test Data. Proc. of RELECTRONIC '91, Budapest (Hungary), August 26-30
Bajenescu, T. I. (1991): Reliable Tests to Reduce the Life Cycle of Fast Digital Signal Processors. Proc. of RELECTRONIC '91, Budapest (Hungary), August 26-31
Bajenescu, T. I. (1991): Sind Lithium-Batterien zuverlässig? Elektroniker (Germany), no. 10, pp. 62-64
Bajenescu, T. I. (1991): Steckverbinder und Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 6, pp. 17-20
Bajenescu, T. I. (1991): The Challenge of the Coming Years. Proceedings of the First Internat. Fibre Optics Conf., Leningrad, March 25-29
Bajenescu, T. I. (1991): The Challenge of the Future. Proc. of Int. Conf. on Computer and Communications ICCC '91, Beijing (P. R. China), October 30 to November 1
Bajenescu, T. I. (1992): Einige Aspekte der Zuverlässigkeitssicherung in der Elektronik-Industrie. Aktuelle Technik (Switzerland), no. 8, pp. 5-7
Bajenescu, T. I. (1992): Elektronische Bauelemente und Zuverlässigkeit. Aktuelle Technik (Switzerland), no. 9, pp. 17-20
Bajenescu, T. I. (1992): New Aspects of the Reliability of Lithium Thionyl Chloride Cells. Microelectronics and Reliability (U. K.), vol. 32, no. 11, pp. 1651-1653
Bajenescu, T. I. (1992): Quality Assurance and the "Total Quality" Concept. Optimum Q (Romania), no. 2, April, pp. 10-14
Bajenescu, T. I. (1992): Some issues concerning the actual status of VLSI process-related reliability components technology. Proc. of the 15th Annual Semiconductor Conf. CAS '92, October 6-11, Sinaia (Romania), pp. 509-518 (invited paper)
Bajenescu, T. I. (1993): Reliability and Degradation. Proc. of 16th Annual Semicond. Conf. CAS, Oct. 12-17, Sinaia (Romania)
Băjenescu, T. I. (1993): Wann kommt der nächste Überschlag? Schweizer Maschinenmarkt (Switzerland), no. 40, pp. 74-81
Bajenescu, T. I. (1994): Ageing Problem of Optocouplers. Proc. of Mediterranean Electrotech. Conf. MELECON '94, Antalya (Turkey), April 12-14
Bajenescu, T. I. (1994): Aschenbrödel Steckverbinder. Bulletin SEV/VSE (Switzerland), no. 25, pp. 35-39
Bajenescu, T. I. (1994): Reliability and Degradation of Semiconductors. Internat. Conf. Romania and Romanians in Contemporary Science, Sinaia (Romania), May 24-27
Bajenescu, T. I. (1995): CTR Degradation and Ageing Problems of Optocouplers. Proc. of ICSICT '95, October 24-28, Beijing (P. R. China)
Bajenescu, T. I. (1995): Inspection Techniques Ensuring Reliable Lithium Thionyl Chloride Cells. Proc. of POWER QUALITY '95, November 7-9, Bremen (Germany)
Bajenescu, T. I. (1995): Particular Aspects of CTR Degradation of Optocouplers. Proceedings of RELECTRONIC '95, Budapest (Hungary)
Bajenescu, T. I. (1995): Prüfung und Vorbehandlung elektronischer Bauteile und Geräte. Aktuelle Technik (Switzerland), no. 3, pp. 6-8
Bajenescu, T. I. (1995): Zuverlässigkeitskenngrössen. Aktuelle Technik (Switzerland), no. 5, pp. 13-15
Băjenescu, T. I. (1996): Fiabilitatea componentelor electronice. Editura Tehnică Publishing House, Bucharest (Romania)
Băjenescu, T. I. (1996): Zuverlässigkeit komplexer elektronischer Systeme. Sommerkurs, Haus der Technik (Essen, Germany), July 15-17
Băjenescu, T. I. (1997): A personal view of some reliability merits of plastic-encapsulated microcircuits versus hermetically sealed ICs used in high-reliability systems. Proceedings of the 8th European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF '97), Bordeaux (France), October 7-10
Băjenescu, T. I. (1997): Complex Electronic Systems' Reliability. Editura de Vest Publishing House, Timisoara
Băjenescu, T. I. (1997): Status and trends of microprocessor design. Proceedings of the 20th Internat. Semiconductor Conf., October 7-11, Sinaia (Romania), pp. 625-627
Băjenescu, T. I. (1997): Zuverlässigkeit komplexer elektronischer Systeme. Sommerkurs, Haus der Technik (München, Germany), July 14-16
Băjenescu, T. I. (1998): A particular view of some reliability merits, strengths and limitations of plastic-encapsulated microcircuits versus hermetically sealed microcircuits utilised in high-reliability systems. Proceedings of OPTIM '98, Brasov (Romania), May 14-15, pp. 783-784
Băjenescu, T. I. (1998): Information technology and trends in microprocessor design. In: Proceedings of OPTIM '98, Brasov (Romania), May 14-15, pp. 807-809
Băjenescu, T. I. (1998): On the spare parts problem. Proceedings of OPTIM '98, Brasov (Romania), May 14-15, pp. 797-800
Băjenescu, T. I. (1998): The Monte Carlo method and the solution of some reliability problems. Proceedings of the Symp. on Quality and Reliab. in Information and Communications Technologies RELINCOM '98, Budapest (Hungary), September 7-9
Băjenescu, T. I. (1998): Zuverlässigkeit elektronischer Schaltungen. Leybold-Vakuum Firmenkurs, Köln (Germany), June 25-26
Băjenescu, T. I. (1989): Realistic Reliability Assessments in the Practice. Proceedings of the International Conference on Electrical Contacts and Electromechanical Components, Beijing (P. R. China), May 9-12, pp. 424-428
Băjenescu, T. I.: Some Aspects of the Reliability of Lithium Thionyl Chloride Cells. Proceedings of RELECTRONIC '91, Budapest (Hungary), August 26-30
Balaban, H. S. (1960): Some Effects of Redundancy on System Reliability. Proc. 6th
National Symposium on Reliability and Quality Control in Electronics, pp. 388-402
Balaban, H. S. (1978): Reliability Growth Models. J. of Environmental Science, vol. 12, pp. 101-107
Baldini, G. L. et al. (1993): Electromigration in Thin-Films for Microelectronics.
Microelectronics and Reliability, 33(11/12), pp. 1779-1805
Baldini, G. L.; Scorzoni, A. (1990): A New Wafer-Level Resistometric Technique for
Electromigration. Proceedings ESREF '90, Bari, pp. 245-252
Balme, L. J.; Bacivarof, I. C. (1996): European Program in Quality of Complex Integrated
Systems (EPIQCS). Quality Engineering, vol. 8, no. 4, pp. 675-692
Bardou, L. (1995): Maintenance et soutien logistique des systèmes informatiques. Masson,
Paris
Bar-Lev, A.; Margalit, S. (1970): Changes of mobility along a depletion type MOS transistor
channel. Solid-St. Electron., vol. 13, pp. 1541-1546
Barlow, R. E. (1963): Maintenance and Replacement Policies. In: Marvin Zelen (ed.)
Statistical Theory of Reliability, Madison, pp. 75-89
Barlow, R. E. (1968): Some Recent Developments in Reliability Theory. European Meeting
in Statistics, Econometrics and Management Science, Amsterdam
Barlow, R. E., Proschan, F. (1975): Statistical Theory of Reliability and Life Testing. Holt,
Rinehart & Winston, New York
Barlow, R. E., Proschan, F., Hunter, L. C. (1965): Mathematical Theory of Reliability. John
Wiley, New York
Barlow, R. E.; Proschan, F. (1962): Planned Replacement. In: Arrow, Karlin, Scarf (eds.)
Studies in Applied Probability and Management Science, Stanford, pp. 63-87
Barlow, R. E.; Proschan, F. (1964): Comparison of Replacement Policies and Renewal
Theory Implications. Annals of Mathematical Statistics, pp. 577-589
Barlow, R.; Hunter, L. (1960): Optimum Preventive Maintenance Policies. Operations Research, vol. 8, no. 1, pp. 90-100
Bauer, C. L. (1991): Stress and Current-Induced Degradation of Thin-Film Conductors.
Proceedings ESREF '91, pp. 161-170, Bordeaux
Bavuso, S. J.; Martensen, A. L. (1988): A Fourth Generation Reliability Predictor. Proc. Ann. Rel. & Maint. Symp., pp. 11-16
Bazovsky, I. (1961): Reliability Theory and Practice. Prentice-Hall Inc., Englewood Cliffs, New Jersey
Bazovsky, I.; Benz, G. (1988): Interval Reliability of Spare Part Stocks. Qual. Reliab. Engng. Int., no. 4, pp. 235-246
Bâzu, M. (1982): Mathematical model for the reliability of semiconductor devices. Electrotechnics, Automatics and Electronics, no. 4, pp. 151-155
Bâzu, M. (1982): Reliability prediction for a Weibull population of semiconductor components: the non-continuous inspection case. Int. Conference on Reliability, Varna (Bulgaria), May
Bâzu, M. (1982): Temperature dependence of the reliability of semiconductor components. National Conference of Electronics, Telecommunications and Computers (CNETAC), November, pp. 1.81-1.85
Bâzu, M. et al. (1982): The use of accelerated tests for the estimation of the reliability of semiconductor components. Conf. CAS 1982, October 10-12
Bâzu, M. et al. (1983): Accelerated tests for evaluation of semiconductor component reliability. Electrotechnics, Automatics and Electronics, no. 1, pp. 19-25
Bâzu, M. et al. (1983): Reliability data bank for semiconductor components. Proceedings of Ann. Semicond. Conf. CAS 1983, October 6-8, pp. 35-38
Bâzu, M. et al. (1983): Step-stress tests for semiconductor components. Proceedings of Ann. Semicond. Conf. CAS 1983, October 6-8, pp. 119-122
Bâzu, M. et al. (1984): Accelerated Ageing of Semiconductor Components. Proceedings CAS 1984, pp. 251-254
Bâzu, M. et al. (1984): Thermal and mechanical stress in rapid estimation of reliability. Proceedings CAS 1984, pp. 257-260
Bâzu, M. (1985): Failure rate curves for semiconductor components. ICSITE Symp., Snagov (Romania), September
Bâzu, M. (1985): The reliability of semiconductor components - truth and legend. Electronics XX, vol. 3, pp. 26-30
Bâzu, M. et al. (1985): A System for Rapid Estimation of Semiconductor Component Reliability. Proceedings CAS 1985, pp. 307-310
Bâzu, M. et al. (1985): SRER - a system for rapid estimation of the reliability. Proc. of 6th Symp. on Reliab. in Electronics RELECTRONIC, Budapest (Hungary), pp. 267-271
Bâzu, M. et al. (1986): High temperature / early period reliability of semiconductor components. Proceedings CAS 1986, pp. 243-246
Bâzu, M.; Bulucea, C. (1986): How to write a CAS paper. Proceedings CAS 1986, pp. 456-459
Bâzu, M. et al. (1987): Accelerated tests for semiconductor components. Conf. of Young Researchers, Politechnic Institute, Bucharest (Romania), October
Bâzu, M. et al. (1987): Failure mechanisms thermally and electrically accelerated. Proceedings CAS 1987, pp. 53-56
Bâzu, M. et al. (1987): Reliability of electronic components - accelerated tests. RNR Symp., Constanta (Romania), May
Bâzu, M. et al. (1987): Reliability of semiconductor components in the first hours of functioning at high temperature. Electrotechnics, Automatics and Electronics, no. 1, pp. 10-15
Bâzu, M. et al. (1988): Operating-condition dependent failure distribution and burn-in. 7th Symp. on Reliab. in Electronics RELECTRONIC '88, Budapest (Hungary)
Bâzu, M. (1988): The rapid estimation of the reliability of semiconductor components - the art to compress the time. Electronic Components, vol. 1, no. 2, pp. 16-17
Bâzu, M. (1989): Design for the reliability of semiconductor components. ICSITE Symp., Snagov (Romania), September
Bâzu, M.; Bacivarof, I. (1989): On the Validity of the Arrhenius Model in the Accelerated Testing of Semiconductor Devices Reliability. In: Aven, T. (ed.) Reliability Achievement, Elsevier Science Publishers Ltd., pp. 151-157
Bazu, M. et al. (1989): Behaviour of semiconductor components at temperature cycling. Revue Roumaine des Sciences Techniques, no. 1, pp. 151-154
Bazu, M. et al. (1989): Rapid estimation of reliability changes for semiconductor devices. Proceedings of Annual Semicond. Conf. CAS 1989, October 7-10, pp. 399-402
Bazu, M. et al. (1990): Estimarea rapidă a fiabilităţii lotului de componente semiconductoare. Calitate, Fiabilitate, Metrologie, no. 2-3, pp. 92-94
Bazu, M. et al. (1990): Evaluarea fiabilităţii la funcţionarea în condiţii de căldură umedă. Calitate, Fiabilitate, Metrologie, no. 2-3, pp. 90-91
Bazu, M.; Ilian, V. (1990): Accelerated testing of integrated circuits after storage. Scandinavian Reliability Engineers Symp., Nyköping, Sweden, October
Bazu, M.; Ilian, V. (1990): Influence of the storage on the reliability of screened semiconductor components. Reliability and Quality Assurance Symp., Bucharest, Romania, November
Bazu, M. (1990): A model for the electric field dependence of semiconductor device
reliability. 18th Conf. on Microelectronics (MIEL). Ljubljana, Slovenia, May
Bazu, M.; Bacivarof, I. (1991): A method of reliability evaluation of accelerated aged electronic components. In: Proc. Conf. Probabilistic Safety Assessment and Management (PSAM), Beverly Hills, California (USA), pp. 357-361
Bazu, M.; Tăzlăuanu, M. (1991): Reliability testing of semiconductor devices in a humid environment. Proc. Ann. Reliab. and Maintain. Symp., Orlando, Florida (USA), pp. 237-240
Bazu, M.; Tăzlăuanu, M. (1991): Reliability testing of semiconductor devices in a humid environment. Journal of the Institute for Environmental Sciences, vol. 36, no. 2, pp. 37-41
Bazu, M. (1991): Relationship between the electrical parameters and device reliability.
Reliability and Quality Assurance Symp., Bucharest, Romania, November
Bazu, M. (1992): Accelerated life test when the activation energy is a random variable. Proc.
of the International Semicond. Conf. CAS '92, October 5-10, Sinaia (Romania), pp.
244-248
Bazu, M. (1992): Determinism and probabilism in the failure of semiconductor devices. Reliability and Quality Assurance Symp., November
Bazu, M. (1992): Synergetic effects in reliability. Optimum Q, vol. 2, no. 2, April, pp. 32-35
Bazu, M. (1992): The use of failure physics in reliability evaluation. Reliability and Quality Assurance Symp., November
Bazu, M. (1993): A synergetic reliability-prediction procedure. Proc. of Int. Semicond.
Conf. CAS '93, October 8-11, Sinaia, pp. 259-262
Bazu, M. (1994): A synergetic approach on the reliability assurance for semiconductor components. Ph. D. thesis, University Politehnica Bucharest
Bazu, M. (1994): Ein System für Zuverlässigkeitsaussage und Zeitraffungsprüfung. MESSCOMP '94, Wiesbaden (Germany), Sept. 13-15
Bazu, M. (1994): SYRP failure risk coefficients assessed with fuzzy logic. Proc. of the 17th
Int. Semicond. Conf. CAS '94, October 11-16, Sinaia, pp. 705-708
Bazu, M.; Stoica, A. (1994): Expert systems for reliability evaluation. Reliability and Quality
Assurance Symp., November
Bazu, M. (1995): A combined fuzzy logic & physics-of-failure approach to reliability
prediction. IEEE Trans. Reliab., vol. 44, no. 2 (June), pp. 237-242
Bazu, M. (1995): The new wave in the reliability of electronic devices. Reliability and
Quality Assurance Symp., November
Bazu, M. (1996): Are, really, needed components for military use? Reliability and Quality
Assurance Symp., November
Bâzu, M. (1996): Fuzzy-logic based reliability prediction for the building-in reliability approach. In: Negoita, M., Zimmermann, H.-J., Dascalu, D. (eds.) Real world applications of intelligent technologies, Romanian Academy, July, pp. 124-128, Bucharest
Bazu, M. (1997): The quality of quality researches. Reliability and Quality Assurance Symp., November
Bâzu, M. et al. (1997): MOVES - a method for monitoring and verifying the reliability screening. Proc. of the 20th Int. Semicond. Conf. CAS '97, October 7-11, Sinaia, pp. 345-348
Bazu, M. (1998): The reliability of semiconductor devices: an overview. Proc. of the 6th Internat. Conf. on Optimization of Electrical and Electronic Equipments OPTIM '98, Brasov (Romania), May 14-15, pp. 785-788
Bazu, M. (1999): Reliability assessment based on fuzzy logic. International Conf. on Computational Intelligence for Modelling, Control and Automation, CIMCA '99, Vienna, Austria, February 17-19
Bednarz, S.; Marriott, D. (1988): Efficient Analysis for FMEA. Proc. Ann. Rel. & Maint. Symp., pp. 416-421
Beichelt, F. (1970): Zuverlässigkeit und Erneuerung. VEB Verlag Technik, Berlin
Beichelt, F. (1993): Zuverlässigkeits- und Instandhaltungstheorie. Teubner Verlag, Stuttgart
Beichelt, F.; Franken, P. (1983): Zuverlässigkeit und Instandhaltung - Mathematische Methoden. VEB Verlag Technik, Berlin
Bell Communications Research (1985): Reliability Prediction Procedure for Electronic Equipment (TR-TSY-000332). Bell, Morristown NJ
Bellut, S. (1990): La compétitivité par la maîtrise des coûts, conception à coût objectif et analyse de la valeur. AFNOR gestion, Paris
Bennets, R. G. (1996): Built-In Self Test Backgrounder. LogicVision
Benson, K. E. et al. (1990): Reaching the Limits in Silicon Processing. AT&T Technical Journal, November/December
Benz, G. E.; Bazovsky, I. (1990): Adapting Mechanical Models to Fit Electronics. In: Proc. Annual Reliab. Maintainability Symp. 1990, IEEE Reliability Society, Los Angeles, January 23-25, New York, pp. 153-156
Berman, A. (1981): Time-Zero Dielectric Reliability Test by a Ramp Method. International
Reliability Physics Symposium, pp. 204-208
Bernasconi, J. et al.: Investigation of Various Models for Metal Oxide Varistors. Journal of Electronic Materials, vol. 5, no. 5, pp. 473-495
Bernet, R. (1988): CARP - a Program System to Calculate the Predicted Reliability. 6th Internat. Conference on Reliability and Maintainability, Strasbourg, pp. 306-310
Bertsche, B.; Lechner, G. (1990): Zuverlässigkeit im Maschinenbau. Springer-Verlag, Berlin
Biancomano, V. (1983): Screening method points to causes of low-voltage failure in MLC capacitors. Electronic Design, June 23, pp. 47-48
Bickley, J. (1981): Zuverlässigkeit von Halbleiterbauelementen. Elektronik (West Germany), no. 14, pp. 51-58
Billinton, R.; Allan, R. N. (1983): Reliability Evaluation of Engineering Systems. Pitman,
Boston
Birolini, A. (1974): Semi-Markoff- und verwandte Prozesse: Erzeugung und einige Anwendungen auf Probleme der Zuverlässigkeitstheorie und der Übertragungstheorie. Ph. D. Thesis, ETH Zürich (Switzerland)
Birolini, A. (1974): Spare Parts Reservation on Components Subjected to Wear-out and/or Fatigue according to a Weibull Distribution. Nuclear Eng. and Design, vol. 27, pp. 293-298
Birolini, A. (1985): On the Use of Stochastic Processes in Modeling Reliability Problems. Lecture Notes in Economics and Mathematical Systems, vol. 252, Springer-Verlag, Berlin
Birolini, A. (1990): Zuverlässigkeitssicherung von Automatisierungssystemen und -prozessen. e & i, no. 5(107), pp. 258-271
Birolini, A. (1994): Quality and Reliability of Technical Systems. Springer-Verlag, Berlin
Birolini, A. (1994): Statistical Methods for Reliability Tests. Tutorial at EuPac '94, Essen
Birolini, A. (1996): Reliability Analysis Techniques for Electronic Equipment and Systems. Proceedings of EuPac '96, Essen, January 31 - February 2, 1996
Birolini, A. (1996): Reliability Engineering: Cooperation between University and Industry at the ETH Zurich. Quality Engineering, vol. 8, no. 4, pp. 659-674
Birolini, A. (1997): Quality and Reliability of Technical Systems. Springer-Verlag, Berlin
Birolini, A. et al. (1989): Test and Screening Strategies for Large Memories. Proc. 1st European Test Conf., pp. 276-283
Bitter, P. et al. (1971): Technische Zuverlässigkeit. Hrsg. von der Messerschmitt-Bölkow-Blohm GmbH, München (West Germany)
Bjerre, A.; Skaaning, K.: Tantalum Capacitors: An Evaluation of 11 Types of Tantalum Capacitors. Elektronikcentralen ECR-38, Danish Research Centre for Applied Electronics
Black, J. R. (1974): Physics of Electromigration. International Reliability Physics Symposium, pp. 142-149
Blakemore, J. S. (1962): Semiconductor statistics. Pergamon Press
Blanchard, B. S. (1991): Logistics Engineering and Management. Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey
Blanchard, B. S.; Lowery, E. E. (1969): Maintainability Principles and Practices. McGraw-
Hill, New York
Boitel, D.; Hazard, C. (1987): Guide de la maintenance. Nathan technique, Paris
Boole, G. (1958): The Laws of Thought. Dover, New York (published for the first time in
1854)
Bora, J. S.: Limitations and Extended Applications of Arrhenius Equation in Reliability
Engineering. Microelectronics and Reliability vol. 18, pp. 241-242
Bora, J. S.: Short-Term and Long-Term Performance of electrolytic capacitors.
Microelectronics and Reliability, vol. 18, pp. 237-242
Boucly, F. (1991): Le management de la maintenance assistee par ordinateur. AFNOR
gestion, Paris
Bowles, J. B.; Klein, L. A. (1990): Comparison of Commercial Reliability-Prediction Programs. Proc. Ann. Rel. & Maint. Symp., pp. 450-455
Brambilla, P. et al. (1981): CMOS reliability: a useful case history to revise extrapolation effectiveness, length and slope of the learning curve. Microelectronics and Reliability, vol. 21, no. 2, pp. 191-201
Braun, H.; Paine, J. M. (1977): A Comparative Study of Models for Reliability Growth.
Technical Report No. 126, series 2, Depart. of Statistics, Princeton University
Brender, D. M. (1968): The prediction and measurement of system availability: A Bayesian
treatment. IEEE Trans. Reliability, vol. R-17, pp. 127-147
Brinkmann, R. (1993): Modellierung des Zuverlässigkeitswachstums komplexer, reparierbarer Systeme. Dissertationsarbeit an der ETHZ
British Telecom (1987): Handbook of Reliability Data (HRD4). British Telecom, Birmingham
Brocklehurst, S. (1987): On the Effectiveness of Adaptive Software Reliability Modelling.
CSR Technical Report, City University, London
Buckley, F. J.; Poston, R. (1984): Software Quality Assurance. IEEE Trans. Soft. Eng., vol. 10, no. 1, pp. 36-41
Bulucea, C. D. (1970): Investigation of deep depletion regime of MOS structures using
ramp-response method. Electron. Lett., vol. 6, pp. 479-481
Bulucea, C. D.; Antognetti, P. (1970): On the MOS structure in the avalanche regime. Alta Frequenza, vol. 39, pp. 734-737
Sah, C. T. ; Pao, H. C. (1966): The effects of fixed bulk charge on the characteristics of
metal-oxide-semiconductor transistors. IEEE Trans. Electron Dev., vol. 13, pp. 393-
397
Bunday, B. D. et al. (1990): Likelihood and Bayesian Estimation Methods for Poisson Process Models in Software Reliability. Internat. J. of Quality and Reliability Management, vol. 7, no. 5, pp. 9-18
Calabro, S. R. (1962): Reliability Principles and Practice. McGraw-Hill, New York
Campbell, D. S. et al. (1991): Reliability Behaviour of Electronic Components as a Function
of Time. Proceedings ESREF'91, pp. 41-48, Bordeaux
Carada, E. (1975): L'affidabilità per l'elettronica. La Goliardica, Roma
Carlson et al. (1986): A Procedure for Estimating Life Time of Gapless Oxide Surge Arresters for an Application. IEEE Trans. Power Applic. and Systems, PAS-01
Carroll, J. M. (1962): Reliability (mathematics of reliability - life testing - designing reliable circuits - component reliability - system design - physics of failure). Electronics, November, pp. 53-76
Carter, A. D. S. (1986): Mechanical Reliability. Macmillan, London, 2nd Edition
Catuneanu, V. M.; Mihalache, A. N. (1989): Reliability Fundamentals. Elsevier, Amsterdam
Catuneanu, V. M.; Popentiu, Fl. (1987): Optimum Spare Allocation Policy for Preventive Maintenance. Microel. & Reliability, vol. 27, no. 1, pp. 45-48
CECC 42000, CECC 42200: Harmonisiertes Gerätebestätigungssystem für Bauelemente der
Elektronik. Fachgrundspezifikationen und Rahmenspezifikationen, Varistoren.
Deutsche Elektrotechnische Kommission, Frankfurt
Chan, P. Y. et al. (1985): Parametric Spline Approach to Adaptive Reliability Modelling.
CSR Technical Report, City University, London
Chandramouli, R., Pateras, S. (1996): Testing Systems on a Chip. IEEE Spectrum,
November, pp. 42-47
Chapouille, P., de Pazzis, R. (1968): Fiabilité des systèmes. Masson, Paris
Chen, J. C. et al. (1985): A Quantitative Physical Model for Time-Dependent Breakdown in
SiO2. International Reliability Physics Symposium, pp. 22-31
Chiang, C.J.; Hurley, D.T. (1998): Dynamics of backside wafer level microprobing.
Proceedings of the IEEE International Reliability Physics Symp., pp. 137-149
Chiu, T. J.; Sah, C. T. (1968): Correlation of experiments with a two-section-model theory
of the saturation drain conductance of MOS transistors. Solid-St. Electron., vol. 11, pp.
1149-1157
436 General bibliography

Christou, A. (1992): Reliability of Gallium Arsenide MMICs. J. Wiley and Sons, Chichester
Chwastek, E. J., Shaw, R. N. (1987): A Rapid Technique for Assessing the Moisture Ingress
Susceptibility of Plastic-Encapsulated Integrated Circuits. Quality and Reliability
Engineering International vol. 3, pp. 185-193
Ciappa, M. (1990): Ausfallmechanismen Integrierter Schaltungen. Bericht F1I31.1.1990.
ETH Zürich
Ciappa, M. (1994): Package Reliability in Microelectronics: an Overview. Proc. of
WELDEC, Lausanne, October 4 - 7
Cluley, J. C. (1974): Electronic Equipment Reliability. J. Wiley & Sons, New York
Cole Jr., E.I.; Soden, J.M.; Rife, J.L.; Baron, D.L.; Henderson, C.L. (1994): Novel failure
analysis techniques using photon probing in a scanning optical microscope.
Proceedings of the IEEE International Reliability Physics Symp., pp. 388-398
Conwell, E. M. (1967): High field transport in semiconductors. Academic Press, New York
Cooke, R. M. (1987): A Theory of Weights for Combining Expert Opinion. Dept. of
Mathematics, Delft University of Technology
Cooke, R. M. (1991): Experts in Uncertainty: Expert Opinion and Subjective Probability in
Science. Oxford University Press
Cooke, R. M. et al. (1988): Calibration and Information in Expert Resolution: A Classical
Approach. Automatica vol. 24, pp. 87-94
Cooke, R. M. et al. (1988): Expert Opinion in Safety Studies: Case Report 4 - DSM Case.
Depart. of Mathematics, Delft University of Technology
Coppola, A. (1984): Reliability Engineering of Electronic Equipment - a Historical
Perspective. IEEE Transactions on Reliability vol. 33, pp. 29-35
Costes, A. et al. (1978): Reliability and Availability Models for Maintained Systems
Featuring Hardware Failures and Design Faults. IEEE Trans. Comp., vol. 27, no. 6, pp.
548-560
Coulbourne, E. D. et al. (1974): Reliability of MOS LSI Circuits. Proc. IEEE, vol. 62, no. 2,
pp. 244-259
Cox, D. R. (1962): Renewal Theory. Methuen, London
Cox, D. R. (1965): Erneuerungstheorie. R. Oldenbourg Verlag, München
Cox, D. R., Lewis, P. A. W. (1978): The Statistical Analysis of Series of Events. Chapman
and Hall, London
Cox, D. R., Smith, W. L. (1953): On the Superposition of Renewal Processes. In: Biometrics,
vol. 40, pp. 1-11
Crook, D. L. (1991): Evolution of VLSI Reliability Engineering. Quality and Reliability
Engineering International, vol. 7, pp. 221-233
Crosby, P. B. (1971): Qualität kostet weniger. Verlag A. Holz
Crow, L. H. (1977): Confidence Interval Procedures for Reliability Growth Analysis.
Technical Report No. 197, US Army Material System Analysis Activity, Aberdeen, Md.
Csenki, A. (1991): Some Renewal-Theoretic Investigations in the Theory of Sojourn Times
in Finite Semi-Markov Processes. J. Appl. Prob., vol. 28, pp. 822-832
Csenki, A. (1992): Sojourn Times in Markov Processes for Power Transmission
Dependability Assessment with MATLAB. Microelec. and Reliability vol. 32, pp. 945-
960
Csenki, A. (1993): Occupation Frequencies for Irreducible Finite Semi-Markov Processes
with Reliability Applications. Computers & Ops. Res., vol. 20, pp. 249-259
Csenki, A. (1995): An Integral Equation Approach to the Interval Reliability of Systems
Modelled by Finite Semi-Markov Processes. Reliability Engineering and System Safety,
vol. 47, pp. 37-45
Curtin, K. M. (1959): A Monte Carlo Approach to Evaluate Multimoded System Reliability.
Operations Research, vol. 7, Nov.-Dec., pp. 721-727
Cushing, M. J. et al. (1993): Comparison of Electronics-Reliability Assessment Approaches.
IEEE Trans. on Reliab., vol. 42, pp. 542-546
D'Heurle, F. M. (1971): Electromigration and Failure in Electronics: An Introduction.
Proc. IEEE, vol. 59, pp. 1409-1418
Dale, C. J. (1982): Software Reliability Evaluation Methods. British Aerospace Dynamics
Group, ST-26750
Darling, D. A. (1957): The Kolmogorov-Smirnov, Cramér-von Mises tests. Ann. Math.
Statist., vol. 28, pp. 823-838
Darveaux, R., Banerji, K. (1992): Constitutive relations for tin-based solder joints. IEEE
Trans. Comp., Hybrids, and Manuf. Technol., vol. 15, no. 6, pp. 1013-1024
Das, M. B. (1969): Physical limitations of MOS structures. Solid-State Electron., vol. 12,
pp.305-312
Das, M. B. (1969): High-frequency network properties of MOS transistors including the
substrate resistivity effect. IEEE Trans. Electron Dev., vol. 16, pp. 1049-1084
Dascalu, D. (1966): Detection characteristics at very high frequencies of the space-charge-
limited solid-state diode. Solid-State Electronics, vol. 9, pp. 1143-1148
Dascalu, D. (1967): Detection properties of space-charge-limited diodes in the presence of
trapping. Solid-State Electronics, vol. 10, pp. 729-733
Dascalu, D. (1968): Transit-time effects in bulk negative-mobility amplifiers. Electron.
Lett., vol. 4, pp. 581-583
Dascalu, D. (1969): Small-signal impedance of space-charge-limited semiconductor diodes.
Electron. Lett., vol. 5, pp. 230-231
Dascalu, D. (1972): Unipolar injection in semiconductor electron devices. Ed. Academiei,
Bucharest (Romania)
Dascalu, D. et al. (1988): The metal-semiconductor contact. Ed. Academiei, Bucharest
(Romania)
Dascalu, D. (1998): From micro- to nano-technologies. Proceedings of the International
Semiconductor Conference, October 6-10, Sinaia (Romania), pp. 3-12
Dawid, A. P. (1982): The Well Calibrated Bayesian. J. of the American Statistical Assoc.,
vol. 77, pp. 605-613
Dawid, A. P. (1989): Probability Forecasting. In: Kotz, S. et al. (Eds.), Encyclopedia of
Statistical Science, vol. 6, J. Wiley and Sons, New York
Dawid, A. P. (1984): Statistical Theory: The Prequential Approach. J. of the Royal Statistical
Society, A, no. 147, pp. 278-292
Deal, B. E.; Snow, E. H. (1966): Barrier energies in metal-silicon dioxide-silicon structures.
J. Phys. Chem. Solids, vol. 27, pp. 1873-1879
De Chiaro, L. F., Vaidya, S., Chemelli, R. G. (1986): Input ESD Protection Networks for
Fineline NMOS-Effects of Stressing Waveform and Circuit Layout. International
Reliability Physics Symposium, pp. 206-214
De Mari, A. (1968): An accurate numerical steady-state one-dimensional solution of the pn
junction. Solid-St. Electron., vol. 11, pp. 33-39
De Wolf, I.; Howard, D.J.; Rasras, M.; Lauwers, A.; Maex, K.; Groeseneken, G.; Maes, H.E.
(1998): A reliability study of titanium silicide lines using micro-Raman spectroscopy
and emission microscopy. Proceedings of the IEEE International Reliability Physics
Symp., pp. 124-128
Denson, W. K., Brusius, P. (1989): VHSIC/VHSIC-Like Reliability Prediction Modelling.
Report from IIT Research Institute, Rome Air Development Center, New York
DGQ 33 (1977): Zuverlässigkeit: Einführung in die Planung und Analyse.
DGQ-NTG Schrift 12-51 (1986): Software-Qualitätssicherung.
Dhillon, B. S., Rayapati, S. N. (1988): Common-Cause Failures in Repairable Systems.
Proc. Ann. Rel. & Maint. Symp., pp. 283-289
Diatcu, E. (1979): Consideratii privind fiabilitatea sistemelor de transmitere si prelucrare a
datelor. In: Cercetari in tehnologie electronica si fiabilitate, Ed. didactica si pedagogica,
Bucuresti, pp. 209-220
Diatcu, E. (1979): Proiectarea probabilistica a circuitelor electronice. In: Cercetari in
tehnologie electronica si fiabilitate, Ed. didactica si pedagogica, Bucuresti, pp. 68-76
Dietrich, D. L.; Mazzuchi, T. A (1996): An alternative method of analyzing multi-stress,
multi-level life and accelerated-life tests. Proceedings of the Annual Reliability and
Maintainability Symp., January 22-25, Las Vegas, Nevada (USA), pp. 90-96
DIN 25419 (1979): Störfallablaufanalyse. T1, 1977; T2
DIN 25424 (1981): Fehlerbaumanalyse. Methode und Bildzeichen.
DIN 25448 (1980): Ausfalleffektanalyse.
DIN 31000 (1979): Allgemeine Leitsätze für das sicherheitsgerechte Gestalten technischer
Erzeugnisse.
DIN 40040, 40041, 40046, 40815, 41099, 41122, 41240, 41247, 41255, 41256, 41257,
41311, 41314, 41332, 41426, 41640, 44350, 44351, 44358, 50018
DIN 40039 (1988): Zuverlässigkeitsangaben für Bauelemente der Elektronik. Teil IE
DIN 55350: Begriffe der Qualitätssicherung, Teil 11; Begriffe der Statistik, Teil 21
Doetsch, G. (1981): Anleitungen zum praktischen Gebrauch der Laplace-Transformation
und der Z-Transformation. 4. Auflage, R. Oldenbourg Verlag, München
Dombrowski, E. (1970): Einführung in die Zuverlässigkeit elektronischer Geräte und
Systeme. AEG-Telefunken, Berlin
Doyle, E. A. Jr. (1981): How parts fail. IEEE Spectrum, October, pp. 36-43
Drexler, A. J. (1974): Inaugural-Dissertation zur Erlangung des Grades eines Doktors der
Wirtschafts- und Gesellschaftswissenschaften der Rechts- und Staatswissenschaftlichen
Fakultät der Rheinischen Friedrich-Wilhelms-Universität Bonn.
Dryden, M. H. (1976): Design for Reliability. Microelectronics & Reliability, Vol. 15, pp.
399-436
Duane, J. T. (1964): Learning Curve Approach to Reliability Monitoring. IEEE Trans. on
Aerospace, 2(1964), pp. 563-566
Dugan, M. P. (1987): Characterization of a 3 µm CMOS/SOS Process. Quality and
Reliability Engineering International, vol. 3, no.2, pp. 99-106
Dummer, G. W. A., Griffin, N. (1966): Electronics Reliability - Calculation and Design.
Pergamon Press, Oxford
Dummer, G. W. A., Winton, R. C. (1968): An Elementary Guide to Reliability. Pergamon
Press, Oxford
Dummer, G. W., Griffin, N. (1960): Electronic Equipment Reliability. Pitman & Sons, Ltd.
London
Dunn, R., Ullman, R. (1982): Quality Assurance for Computer Software. McGraw-Hill,
New York
Durieux, J.: Fiabilité et durée de vie des condensateurs électrolytiques à l'aluminium. CNET,
Document de travail DT CPM/ICS 38
Dürr, W., Meyer, H. (1981): Wahrscheinlichkeitsrechnung und schliessende Statistik.
Hanser-Verlag, München / Wien
Eda, Iga, Matsuoko (1980): Degradation Mechanism of Nonohmic Zinc Oxide Ceramics. J.
Appl. Phys., vol. 51, no. 5
Eda (1984): Destruction Mechanism of ZnO Varistors due to High Currents. J. Appl. Phys.,
vol. 56, p. 810
Einzinger: Nichtlineare elektrische Leitfähigkeit von dotierten Zink-Oxid-Keramik.
Dissertation, Fakultät für Physik der TU München
Ellis, B. N. (1986): Cleaning and Contamination of Electronics Components and
Assemblies. Electrochem. Publ., Ayr (Scotland)
Engelmaier, W., Attarwala, A. I. (1989): Surface-mount attachment reliability of clip-
leaded ceramic chip carriers on FR-4 circuit boards. IEEE Trans. Comp., Hybrids, and
Manuf. Technol., vol. 12, no. 2, pp. 284-296
Epstein, D. (1982): Application and use of acceleration factors in microelectronics testing.
Solid State Technology, November, pp. 116-122
ESA (1988) PSS-01-60 Issue 2, November
Etzrodt, A. (Herausgeber): Zuverlässigkeit in Einzeldarstellungen. Oldenbourg Verlag,
München / Wien
European Safety and Reliability Research and Development Association (1990): Expert
Judgement in Risk and Reliability Analysis: Experience and Perspective. ESSRDA
Report no. 2
Fagan, J. (1987): Achieving Reliability in the Real World. Proceedings of the Annual
Reliability and Maintainability Symposium, pp. 152-158
Fauchier, E. et al. (1996): Impact of the VHDL Description on the Testability of Integrated
Systems. Quality Engineering vol. 8, no. 4, pp. 623-633
Faul, R., Bartosz, R. (1984): Ausfallmechanismen bei integrierten Halbleiterbauelementen.
Elektronik, no. 10, pp. 73-79
Feller, W. (1969): An introduction to probability theory and its applications. John Wiley,
New York
Fischer, K. (1984): Zuverlässigkeits- und Instandhaltungstheorie. Transpress VEB, Berlin
Fischer, K. D. et al. (1996): PRML Detection Boosts Hard-Disk Drive Capacity. IEEE
Spectrum, Nov., pp. 70-76
Fisz, M. (1980): Wahrscheinlichkeitsrechnung und mathematische Statistik. VEB Deutscher
Verlag der Wissenschaften, Berlin
Footner, P. K. et al. (1987): Purple Plague: Eliminated or Just Forgotten? Quality and
Reliability Engineering International, vol. 3, pp. 177-184
Forman, E.H., Singpurwalla, N. D. (1977): An Empirical Stopping Rule for Debugging and
Testing Computer Software. J. of the American Statistical Assoc., vol. 72, pp. 750-757
Fougerousse, S., Germain, J. (1991): Pratique de la maintenance industrielle par le coût
global. AFNOR gestion
Fox, R. W.: Six Ways to Control Transients. Electronic Design, vol. 22, no. 11, pp. 52-57
Freiberger, W. (Ed.) (1972): Statistical Computer Performance Evaluation. Academic Press,
New York, pp. 465-484
French, S. (1985): Group Consensus Probability Distributions: a Critical Survey. In:
Bernardo, J. M. et al. (Eds.) Bayesian Statistics, North Holland, vol. 2, pp. 183-201
Freudenthal, A. M., Gumbel, E. J. (1953): On the statistical interpretation of fatigue tests.
Proc. Roy. Soc., London, vol. 216, pp. 309-332
Frey, H. (1973): Computerorientierte Methodik der Systemzuverlässigkeits- und
Sicherheitsanalyse. Dissertation Nr. 5244, ETH Zürich
Frey, H. H. (1974): Safety Evaluation of Mass Transit Systems by Reliability Analysis. IEEE
Trans. on Reliability vol. R-23, no. 3, pp. 161-169
Frisch, H.-D. (1976): Übersicht klimatischer Prüfverfahren für elektrotechnische Geräte.
Qualität und Zuverlässigkeit, vol. 21, no. 1, pp. 7-10; no. 5, pp. 104-108
Frohman-Bentchkowsky, D.; Grove, A. S. (1969): Conductance of MOS transistors in
saturation. IEEE Trans. Electron. Dev., vol. 16, pp. 108-116
Fujiwara et al. (1982): Evaluation of Surge Degradation of Metal Oxide Surge Arresters.
IEEE Trans. on Power Appl. and Systems, PAS-101, no. 4, pp. 978-985
Gaede, K. W. (1977): Zuverlässigkeit, Mathematische Modelle. Hanser-Verlag, München /
Wien
Garg, R. C., Sawhney, S. (1972): Reliability prediction of a two-unit standby redundant
system with standby failure. Microelectronics and Reliability, vol. 11, pp. 263-267
Gavrilov, M. A. (1960): Structural Redundancy and Reliability of Relay Circuits. Proc. 1st
Int. Congr. of Int. Federation on Automatic Control (Moscow), Butterworth, London &
R. Oldenbourg, München, pp. 838-844
GE Application Notes Nr. E201.28, E200.71/72, E200.73
Gehman, B. L. (1980): Bonding Wire Microelectronic Interconnections. IEEE Trans. on
Comp., vol. CHMT-3 no. 3, p. 375, September
Geigerhilk, B., Bretschneider, R.: Das Verhalten von Kondensatoren unter extremen
Bedingungen. Nachrichtentechnik vol. 12, no. 10, pp. 393-396
Gelfand, A. E., Smith, A. F. M. (1990): Sampling Based Approaches to Calculating
Marginal Densities. J. of the American Statistical Assoc. vol. 85, pp. 398-409
Gelpi, R., Sua, M. (1989): Näherungsformeln für die Berechnung der Zuverlässigkeit und
der Verfügbarkeit komp. reparierbarer Strukturen. Diplomarbeit ETHZ
Genest, C., Wagner, C. (1984): Further Evidence Against Independent Preservation in
Expert Judgement Synthesis
Gerling, W. (1976): Zuverlässigkeitssicherung bei Halbleiter-Bauelementen. Seminar 7674,
Electronica, München
Gerling, W. (1990): Modern Reliability Assurance of Integrated Circuits. Proc. 1st European
Symposium on Reliability of El. Devices, Failure Physics and Analysis (ESREF), pp.
1-12
Ghate, P. B. (1982): Electromigration-Induced Failures in VLSI Interconnects. International
Reliability Physics Symposium, pp. 292-299
Gilbert, E. N. (1960): Capacity of a burst-noise channel. Bell Syst. Tech. J., vol. 39, pp.
1253-1265
Glass, R. (1979): Software Reliability Guidebook. Prentice-Hall, Inc., Englewood Cliffs,
New Jersey
Gnedenko, B. V. et al. (1969): Mathematical Methods of Reliability Theory. Academic
Press, New York
Gnedenko, B. W., Beljajev, J. K., Solowjew, A. D. (1968): Mathematische Methoden der
Zuverlässigkeitstheorie; in deutscher Sprache bearbeitet und hrsg. von Dr. Peter
Franken, Berlin
Goodhew, P. J. (1972): Specimen preparation in material science. North-Holland,
Amsterdam
Görke, W. (1969): Zuverlässigkeitsprobleme elektrischer Schaltungen. Bibliographisches
Institut AG, Mannheim, 1969
Görke, W. (1973): Fehlerdiagnose digitaler Schaltungen. Teubner Verlag, Stuttgart
Gottlob, M. P. (1986): Das SMT-Handbuch. Texas Instruments, Freising
Gottschalk, A. (1990): Fehleranalyse elektronischer Bauelemente. SES Electronics,
Nördlingen
Greco, E. (1965): On a New Calculation Method for the Overall System Reliability. Proc. of
the IEEE, vol. 53, no. 9, p. 1227
Green, A. E., Bourne, A. J. (1977): Reliability Technology. Wiley, London
Grosh, D. L. (1989): A Primer of Reliability Theory. J. Wiley, New York, Chichester
Gross, A. J., Kamins, M. (1967): Reliability Assessment in the Presence of Reliability
Growth. The RAND Corp., Memorandum RM-5346-PR, September
Grove, A. S. (1967): Physics and technology of semiconductor devices. John Wiley, New
York
Grove, A. S.; Deal, B. E.; Snow, E. H.; Sah, C. T. (1965): Investigation of thermally
oxidized silicon surfaces using MOS structures. Solid-State Electron., vol. 8, pp. 145-
165
Gumbel, E. J. (1958): Statistics of Extremes. Columbia University Press, New York
Guyot, C. (1969): Initiation à la maintenabilité. Dunod, Paris
Hahn, G., Wilke, N.: Veränderung charakteristischer Parameter von Kondensatoren bei
Forcierungsbelastungen. Nachrichtentechnik vol. 19, no. 2, pp. 53-57
Hakim, E. B. (1989): Microelectronic Reliability, vol. 1, Artech House, Norwood
Hall, F. et al. (1983): Hardware/Software FMECA. Proc. Ann. Rel. & Maint. Symp., pp.
320-327
Hallberg, Ö., Peck, D. S. (1991): Recent Humidity Accelerations, a Base for Testing
Standards. Quality and Reliability Engineering International, vol. 7, pp. 169-180
Hamelin, B. (1974): Entretien et maintenance. Eyrolles, Paris
Hanisch, H.-M. (1992): Petri-Netze in der Verfahrenstechnik. R. Oldenbourg Verlag,
München
Hansen, C. K., Thyregod, P. (1992): Component Lifetime Models Based on Weibull
Mixtures and Competing Risks. Quality and Reliability Engineering International, vol.
8, no. 4, pp. 325-333
Harris Digital Data Book (1990), Harris Corp., Boston, Massachusetts
Harris, A. P. (1986): Reliability and Maintainability Data for Computer, Telephone and
Electronic Parts and Equipment. Harris-Associates, Ottawa
Härtler, G. (1983): Statistische Methoden für die Zuverlässigkeitsanalyse. VEB Verlag
Technik, Berlin
Hauser, J. R.; Littlejohn, M. A. (1968): Approximations for accumulation and inversion
space-charge layers in semiconductors. Solid-St. Electron., vol. 11, pp. 667-674
Heiman, F. P. (1966): Thin-film silicon-on-sapphire deep-depletion MOS transistors. IEEE
Trans. Electron. Dev., vol. 13, pp. 855-863
Hermann, M., Schenk, A. (1995): Field and high-temperature dependence of the long term
charge loss in EPROMs: Measurement and modeling. J. Appl. Phys. vol. 77, no. 9, pp.
4522-4540
Hey, J. C., Kram, W. P. (eds.) (1978): Transient Voltage Suppression Manual. General Electric,
Auburn
Heyns, M. (1989): Degradation and Wear-Out of Thin Dielectric Layers. Summer Course on
Reliability and Yield in MOS VLSI Technologies, IMEC
Hnatek, E.R. (1975): The Economics of In-House Versus Outside Testing. Electronic
Packaging and Production, August, p. T29
Hnatek, E. R. (1978): Microprocessor device reliability. Microelectronics and Reliability,
vol. 17, pp. 379-385
Hnatek, E. R. (1987): Integrated circuit quality and reliability. Marcel Dekker, Inc., New
York and Basel
Hofbauer, C. M.: Die Feuchtigkeits- und Klimabeständigkeit von Schichtwiderständen.
Radio Mentor, vol. 27, no. 5, pp. 400-401
Höfle-Isphording, U. (1978): Zuverlässigkeitsrechnung. Springer-Verlag, Berlin
Höfle-Isphording, U. (1985): Ein mathematisches Modell zur Software-Zuverlässigkeit eines
Systems in der Testphase. Siemens Forsch.- und Entw. Bericht, vol. 14, no. 2, pp. 76-84
Hofmann, D. (1983): Handbuch der Messtechnik und Qualitätssicherung. Vieweg,
Braunschweig
Hofmann, W. (1968): Zuverlässigkeit von Mess-, Steuer-, Regel- und Sicherheitssystemen.
Verlag K. Thiemig KG, München
Hofstein, S. R. (1966): An analysis of deep depletion thin-film MOS transistors. IEEE
Trans. Electron. Dev., vol. 13, pp. 846-853
Hofstein, S. R. (1967): Stabilization of MOS devices. Solid-State Electron., vol. 10, pp.
657-665
Honnold, V. R., Schoch, C. B. (1963): The Effects of High Energy Radiation on Failure
Mechanisms in Semiconductors Devices. In Physics of Failure in Electronics, vol. 1,
Spartan Books
Hosford, J. E. (1960): Measure of dependability. Oper. Res. Vol. 8, pp. 53-64
Howes, M. J. and Morgan, D. V. (Eds.) (1981): Reliability and Degradation. J. Wiley, New
York
Hsu, S. T. (1970): Surface state related 1/f noise in MOS transistors. Solid-St. Electron., vol.
13, pp. 1451-1457
Hu, J. M. (1992): GaAs substrate mechanical reliability. In: Christou, A. (ed.) Reliability of
GaAs MMICs, J. Wiley & Sons, Chichester
Hubka, V. (1987): Principles of Engineering Design. Heurista, Zürich
Hummitzsch, P. (1965): Zuverlassigkeit von Systemen. F. Vieweg & Sohn, Braunschweig
Hunt, M., Rowson, J. A. (1996): Blocking in a System on a Chip. IEEE Spectrum,
Nov., pp. 35-41
IEC 1025 (1990): Fault Tree Analysis (FTA)
IEC 812 (1985): Procedure for FMEA
IEC, 56 (CO)138 (1988): Analysis Techniques for System Reliability. Draft
IEC-Specification for Aluminium Electrolytic Capacitors High Reliability Type. 40-1
(Secretariat) 36
IEE (1981): Electronic Reliability Data - A Guide to Selected Components. Inst. of
Electrical Eng., London
IEEE Special Issue on Maintainability (1981). Trans. on Reliab. vol. 30, no.3
IEEE Special Issue on Network Reliability, Distributed Computing Networks, and Computer
Systems Reliability: IEEE Trans. Rel. 35(1986)3; 38(1989)1; 39(1990)4
IEEE Special Issue on Reliability. IEEE Spectrum, Oct. 1981. USAF R&M 2000 Initiative.
IEEE Transactions on Reliability, 36(1987)3
Intel (1992): Components Quality and Reliability. Intel Corporation
Irland, E. A. (1988): Assuring Quality and Reliability of Complex Electronic Systems:
Hardware and Software. Proceedings IEEE, vol. 76, pp. 5-18
Ishii, T.; Miyamoto, K.; Naitoh, K.; Azamawari, K. (1994): Functional failure analysis
technology from backside of VLSI chip. Proceedings of ISTFA, pp. 41-47
Isphording, U. (1988): Zur Theorie der optimalen Wartung technischer Anlagen. In:
Mitteilung aus dem Zentral-Laboratorium für Nachrichtentechnik der Siemens AG,
München, A. E. D., vol. 20, no. 11, pp. 637-646
Jack, N. (1991): Repair Replacement Modelling Over Finite Time Horizons. J. Op. Res.
Soc., vol. 42, pp. 759-766
Jacob, P. (1995): A Line Monitoring Concept to Short Learn Loops in Semiconductor-
Microchip Manufacturing Lines. Proceedings ISTFA 95, Santa Clara
Jacob, P. et al. (1995): IGBT Power Semiconductor Reliability Analysis for Traction
Application. Proceedings IPFA, Singapore
Jahn, R. (1973): Methoden der Zuverlässigkeitsarbeit - ein wichtiger Faktor der
Effektivitätserhöhung und Intensivierung im Industriebereich Elektrotechnik /
Elektronik. Qualität und Zuverlässigkeit, no. 12, p. 305
Jankovic, G., Black, J. (1996): Engineering a WEB Site. IEEE Spectrum, Nov., pp. 62-69
Jarl, R. B. (1976): Radiation Effects on Power Transistors. L'Onde electrique, vol. 56, no. 3,
pp. 119-125
Jelinski, Z., Moranda, P. B. (1972): Software Reliability Research. In: Freiberger, W. (ed.)
Statistical Computer Performance Evaluation. Academic Press, New York, pp. 465-
484
Jensen, F., Petersen, N. E. (1982): Burn-In. Wiley, New York
Jeuland, F. et al. (1991): An Extension of the Rapid Wafer-Level Wijet Method and its
Comparison with Conventional Electromigration Testing. Proceedings ESREF '91, pp.
187-192, Bordeaux
Jiang, S., Kececioglu, D. (1992): Graphical Representation of Two Mixed Weibull
Distributions. IEEE Trans. on Reliability, vol. 41, no. 2, pp. 241-247
Joe, H., Reid, N. (1995): Estimating the Number of Faults in a System. J. of the American
Statistical Assoc., vol. 80, pp. 222-226
Johnson, A. M., Malek, M. (1988): Survey of Software Tools for Evaluating Reliability,
Availability, and Serviceability. ACM Comp. Surveys, vol. 20
Johnson, G. M.: Evaluation of Microcircuits Accelerated Test Techniques. RADC-TR-76-
218, Rome Air Development Centre, Griffiss Air Force Base, New York 13441
Johnson, L. G. (1964): The Statistical Treatment of Fatigue Experiments. Elsevier,
Amsterdam
Jones, R. D. (1982): Hybrid Circuit Design and Manufacture. Marcel Dekker, New York and
Basel
Jones, R. E., Smith, L. D. (1987): A New Wafer-Level Isothermal Joule-Heated
Electromigration Test for Rapid Testing of Integrated Circuit Interconnect. Journal of
Applied Physics, vol. 61, pp. 4670-4678
Jordan, W. E. (1972): Failure Modes, Effects and Criticality Analysis. Nat. Symposium, pp.
30-37
Jowett, C. E. (1976): Electrostatics in the Electronics Environment. The Macmillan Press
Ltd., London and Basingstoke
Jubisch, H. (1976): Möglichkeiten und Grenzen der Anwendung von Umgebungsprüfverfahren
zur Ermittlung der Zuverlässigkeit der Elektrotechnik / Elektronik.
Elektrie, vol. 30, no. 10, pp. 511-512
Kao, J. H. K. (1956): A new life-quality measure for electron tubes. IRE Trans. Rel. and
Qual. Control, April, pp. 1-11
Kao, J. H. K. (1960): A summary of some new techniques on failure analysis. Proc. Annual
Symp. Reliability, pp. 190-201
Kapur, K. C., Lamberson, L. R. (1977): Reliability in Engineering Design. J. Wiley and
Sons, New York
Kapur, P. K., Kapur, K. R. (1983): Interval Reliability of a Two-Unit Stand-By Redundant
System. Microelec. and Reliability vol. 23, pp. 167-168
Karjalainen, J. et al. (1996): Practical Process Improvement for Embedded Real-Time
Software. Quality Engineering vol. 8, no. 4, pp. 565-573
Kas, G. (1983): Qualität und Zuverlässigkeit elektronischer Bauelemente und Systeme. R.
Oldenbourg Verlag, München / Wien
Kaufmann, A., Grouchko, D., Cruon, R. (1975): Modèles mathématiques pour l'étude de la
fiabilité des systèmes. Masson, Paris
Keiller, P. A. et al. (1982): On the Quality of Software Reliability Predictions. Proc. NATO
ASI on Electronic Systems Effectiveness and Life Cycle Costing. Norwich (UK),
Springer-Verlag, Berlin
Keiller, P. A. et al. (1983): Comparison of Software Reliability Predictions. Digest FTCS 13
(13th Internat. Symposium on Fault-Tolerant Computing), pp. 128-134
Kim, Q.; Stark, B.; Kayali, S. (1998): A novel, high resolution, non-contact channel
temperature measurement technique. Proceedings of the IEEE International Reliability
Physics Symp., pp. 108-112
Kivenson, G. (1971): Durability and Reliability in Engineering Design. Hayden Book Co.,
Inc., New York
Kleindienst, P. (1970): Neue Ergebnisse über die Zuverlässigkeit des Tantal-Kondensators.
Internationale elektronische Rundschau no. 8, pp. 205-208
Klinger, D. J. et al. (1990): AT&T Reliability Manual. Van Nostrand Reinhold, New York
Klinger, D. J. (1991): Humidity Acceleration Factor for Plastic Packaged Electronic
Devices. Quality and Reliability Engineering International, vol. 7, pp. 365-370
Klion, J. (1986): Specifying Maintainability - a New Approach. Proc. Ann. Rel. & Maint.
Symp., pp. 338-343
Kobetz, H. (1976): Softwarezuverlässigkeit. Carl Hanser, München
Köchel, P. (1983): Zuverlässigkeit technischer Systeme. VEB Fachbuchverlag, Leipzig
Kodama, M., Deguchi, H. (1974): Reliability considerations for a 2-unit redundant system
with Erlang-failure and general repair distributions. IEEE Trans. Rel., vol. R-23, pp.
75-81
Kohlas, J. (1982): Stochastic Methods of Operations Research. Cambridge Univ. Press,
Cambridge
Kolmogorov, A. (1933): Sulla determinazione empirica di una legge di distribuzione.
Giorn. Istit. Att., vol. 4, pp. 84-91
Kordonsky, Kh. B., Gertsbakh, 1. B. (1995): System State Monitoring and Lifetime.
Reliability Engineering and Systems Safety vol. 47, pp. 1-14
Kormany, T., Barna, H.: Wege zur Beurteilung der natürlichen Lebensdauer von
Elektrolytkondensatoren. Nachrichtentechnik vol. 12, no. 10, pp. 391-392
Kortlandt, D.: Reliability and Risk Analysis in the Process Industry. Proceedings of
"Reliability '83", vol. 2, pp. 6/2/1-6/2/8
Kristiansen, J. (1983): Swedish Hardware / Software Reliability. Proc. Ann. Rel. & Maint.
Symp., pp. 297-302
Kuhlmann, A. (1981): Einführung in die Sicherheitswissenschaft. Vieweg, TÜV Rheinland
Kuhn, P. (1975): Simulation von Umgebungseinflüssen für Lagerung, Transport und
Gebrauch eines Produktes. 1. Jahrestagung der SAQ über "Mittel und Wege zur
Erreichung einer optimalen Produktqualität", Bern, 25/26 April
Lall, P. (1996): Tutorial: Temperature as an Input to microelectronics - reliability models.
IEEE Trans. Reliab., vol. 45, no. 1, pp. 3-9
Lall, P. et al. (1996): Influence of temperature on microelectronics and system reliability, a
physics of failure approach. CRC Press, Boca Raton, FL
Langberg, N., Singpurwalla, N. D. (1981): A Unification of Some Software Reliability
Models Via the Bayesian Approach. Technical Report TM-66571, The George
Washington University, Washington D. C.
Lapp, J. (1978): Evaluating Capacitor Reliability. Electrical World, 15 June, pp. 42-44
Laprie, J. C. (1984): Dependability Evaluation of Software Systems in Operation. IEEE
Trans. on Software Engineering, no. 10
Lawless, J. F. (1982): Statistical Models and Methods for Lifetime Data. J. Wiley and Sons,
New York
Lecture Notes in Economics and Mathematical Systems, vol. 252, Springer-Verlag, Berlin
Lee, P. A., Anderson, T. (1990): Fault Tolerance, Principles and Practice. Springer-Verlag,
2nd Edition
Leistiko, O.; Grove, A. S.; Sah, C. T. (1965): Electron and hole mobility in inversion layers
on thermally oxidized silicon surfaces. IEEE Trans. Electron Dev., vol. 12, pp. 248-
255
Lelievre, A. (1987): Analyse statistique de la fiabilité des composants utilisés dans les
télécommunications. L'Echo des Recherches, vol. 128, no. 2, pp. 53-62
Leonard, C. T. (1990): Mechanical Engineering Issues and Electronic Equipment Reliability:
Incurred Costs without Compensating Benefits. ASME 2nd Intersociety Conference on
Thermal Phenomena in Electronic Systems, May 1990
Leroux, C.; Blachier, D.; Brière, O.; Reimbold, G. (1997): Light emission microscopy for
thin oxide reliability analysis. Microelectronic Engineering, vol. 36, p. 297
Levinson, L. M., Philipp, H. R. (1977): ZnO Varistors for Transient Protection. IEEE Trans.
on Parts, Hybrids and Packaging, vol. PHP-13, no. 4, pp. 338-343
Lichtenstein, S. et al. (1982): Calibration of Probabilities: The State of the Art Until 1980.
In: Kahneman, D. et al., Judgement under Uncertainty: Heuristics and Biases.
Cambridge University Press
Liew, B. K. et al. (1990): Reliability Simulator for Interconnect and Intermetallic Contact
Electromigration. Internat. Reliab. Phys. Symp., pp. 111-118
Ligeois, A. (1990): La fiabilité en exploitation. Tec & Doc, Lavoisier, Paris
Lindley, D. V. (1985): Making Decisions. Second Edition, J. Wiley, New York
Lipson, C., Sheth, N. J. (1973): Statistical Design and Analysis of Engineering Experiments.
McGraw-Hill, Kogakusha, Tokyo
Littlewood, B. (1996): Evaluation of Software Reliability - Achievements and Limitations.
Internat. Symposium on Reliability Engineering 2000, ETH Zurich, 17th October
Littlewood, B., Sofer, A. (1985): A Bayesian Modification to the Jelinski-Moranda Software
Reliability Growth Model. CSR Technical Report
Littlewood, B., Verrall, J. L. (1973): A Bayesian Reliability Growth Model for Computer
Software. Applied Statistics no. 22, pp. 332-346
Lloyd, D. K., Lipow, M. (1962): Reliability: Management, Methods and Mathematics.
Prentice-Hall, Inc., Englewood Cliffs, New Jersey
Locks, M. O. (1973): Reliability, Maintainability, and Availability Assessment. Hayden
Book Co., Rochelle Park, New Jersey
Locks, M. O. (1985): Recent Developments in Computing of System Reliability. IEEE
Transactions on Reliability vol. 34, no. 5, pp. 425-436
Losee, F. (1997): RF Systems, Components, and Circuits Handbook. Artech House Books,
Boston and London
Lundmark, K. (1993): ESD Sensitivity; Experimental Verification of Three Test Methods.
Proceedings ESREF '93, pp. 379-384, Bordeaux
Lycoudes, N. (1978): The reliability of plastic microcircuits in moist environment. Solid
State Technology, October, pp. 53-62
Lyonnet, P. (1991): La maintenance, mathématiques et méthodes. Tec & Doc, Lavoisier,
Paris
446 General bibliography

Mader, R., Meyer, K.-D. (1974): Zuverlässigkeit diskreter passiver Bauelemente. In:
Schneider, H. G. (ed.) Zuverlässigkeit elektronischer Bauelemente, Leipzig: VEB
Deutscher Verlag für Grundstoffindustrie, pp. 400-401
Mann, N. R. et al. (1974): Methods for Statistical Analysis of Reliability and Life Data. John
Wiley & Sons, New York
Manzione, L. T. (1990): Plastic Packaging of Microelectronic Devices. Van Nostrand
Reinhold, New York
Martin, F. (1976): Umweltsimulation und Umwelteinflüsse. Seminar 7674, Electronica
1976, München
Martz, H. F., Waller, R. A. (1982): Bayesian Reliability Analysis. J. Wiley, New York
Marzouki, M., Osseiran, A. (1996): The IEEE Boundary Scan Standard: A Test Paradigm to
Ensure Hardware System Quality. Quality Engineering vol. 8, no. 4, pp. 635-645
Masing, W. (1964): Zuverlässigkeit als wirtschaftliches Problem. In: Technische
Zuverlässigkeit in Einzeldarstellungen, no. 1, pp. 33-49
Masing, W. (1974): Qualitätslehre. DGQ 19, Beuth Verlag, Berlin und Köln
Matsumoto, T., Sugita, E.: Properties and Reliability of Tantalum Oxide Thin Film
Capacitor. Review of the Electrical Communication Laboratories vol. 23, no. 3, pp.
257-270
Mattana, G. (1988): Qualità, Affidabilità, Certificazione. Angeli, Milano, 2nd Ed.
Mazharsolook, E. et al. (1996): Practical Application on Statistical Process Control with the
Use of Decision Trees. Quality Engineering vol. 8, no. 4, pp. 575-579
Mazzili, C. et al. (1968): RADC Reliability Notebook. Computer Applications Inc., New
York
McCarthy, L. H. (1991): Software Predicts Reliability. Design News, March, pp. 164-165
McCool, J. I. (1970): Inference on Weibull percentiles and shape parameter from maximum
likelihood estimates. IEEE Trans. Reliab., vol. R-19, February, pp. 2-9
McPherson, J. W. (1990): VLSI Reliability. Proc. ESREF'90, Bari, pp. 191-210
Mecke, J. (1966): Ein Grenzwertsatz aus der Zuverlässigkeitstheorie. Elektronische
Informationsverarbeitung und Kybernetik, vol. 2, pp. 83-94
Meehan, A. et al. (1994): Accuracy of Worst-Case Hot-Carrier Reliability Lifetimes
Predicted by the Berkeley Model. Proc. ESREF'94, Glasgow
Merkelo, H. (1993): Advanced methods for noise cancellation in system packaging. 1993
High Speed Digital Symposium, University of Illinois, Urbana
Merz, H. (1980): Sicherung der Materialqualität. Verlag Technische Rundschau, Bern
Messerschmitt-Bölkow-Blohm (Publ.) (1986): Technische Zuverlässigkeit. Springer Verlag,
3rd Edition, Berlin
Meuleau, C.: Zuverlässigkeitsprüfung und -bestimmung elektronischer Bauelemente.
Elektrisches Nachrichtenwesen vol. 38, no. 3, pp. 308-324
Meyna, A. (1982): Einführung in die Sicherheitstheorie. Hanser Verlag, München/Wien
Migdalski, J. (1976): Methoden zur Berechnung der Zuverlässigkeit von Systemen mit
komplizierten Strukturen. Nachrichtentechnik Elektronik no. 3, pp. 92-94
Mihoc, Gh. et al. (1976): Bazele matematice ale teoriei fiabilităţii. Dacia, Cluj-Napoca
Mihoc, Gh., Ciucu, G. (1967): Introducere în teoria aşteptării. Ed. tehnică, Bucureşti
MIL-HDBK-175, Microelectronic Device Data Handbook. U.S. Department of Defense,
Washington, D.C.
MIL-HDBK-217F (1993): Reliability Prediction of Electronic Equipment
MIL-HDBK-338 (1984): Electronic Reliability Design Handbook, vol. I & II
Milk, S. (1969): Aus der Arbeit einer Zuverlässigkeitsprüfstelle. Siemens-Bauteile-
Informationen vol. 7, no. 1, pp. 8-11
Miller, D. R. (1986): Exponential Order Statistic Models of Software Reliability Growth.
IEEE Trans. on Software Engineering, no. 12, pp. 12-24
MIL-STD-1521 (1985): Technical Reviews and Audits for Systems, Equipment and
Computer Programs, Edition B
MIL-STD-1629A (1980): Procedures for Performing a Failure Mode, Effects and Criticality
Analysis
MIL-STD-198: Capacitor, Selection and Use of, Supplemental Information. U. S.
Department of Defense, Washington, D. C.
MIL-STD-199: Resistor, Selection and Use of, Supplemental Information. U. S.
Department of Defense, Washington, D. C.
MIL-STD-883D (1991): Military Standard, Test Methods and Procedures for
Microelectronics. Department of Defense, Washington
Mine, H., Nakagawa, T. (1977): Interval Reliability and Optimum Preventive Maintenance
Policy. IEEE Trans. on Reliability vol. 26, pp. 131-133
Mitsubishi (1986): Mitsubishi Semiconductor Reliability Handbook
Moll, J. L. (1964): Physics of semiconductors. McGraw-Hill, New York
Monchy, F. (1991): La fonction maintenance. Masson, Paris
Moore, E. F., Shannon, C. E. (1956): Reliable Circuits Using Less Reliable Relays. Journal
of the Franklin Institute, pp. 191-208; 281-297
Münchow, E., Erzberger, W. (1994): Wie zuverlässig ist zuverlässig? MegaLink
Munikoti, R., Dhar, P. (1988): Low-Voltage Failures in Multilayer Ceramic Capacitors: A
New Accelerated Stress Screen. IEEE Trans. on Components, Hybrids, and
Manufacturing Technology, vol. 11, no. 4, pp. 346-350
Munson, J. B. (1981): Software Maintainability - a Practical Concern for Life-Cycle Costs.
Computer no. 11, pp. 103-109
Musa, J. D. (1975): A Theory of Software Reliability and its Application. IEEE Trans. on
Software Engineering, no. 1, pp. 312-327
Muth, E. J. (1968): A method for predicting system downtime. IEEE Trans. Reliability, vol.
R-17, pp. 97-102
Myers, R. H. et al. (1964): Reliability Engineering for Electronic Systems. J. Wiley, New
York
Nafria, M.; Suñé, J.; Aymerich, X. (1993): Exploratory observations of post-breakdown
conduction in polycrystalline-silicon and metal-gate thin-oxide metal-oxide-
semiconductor capacitors. J. Appl. Phys., vol. 74, pp. 205-209
Nagel, O. (1970): Stabilität von Schichtwiderständen. Internationale Elektronische
Rundschau H. 12, pp. 315-318
Nagel, P. M., Skrivan, J. A. (1981): Software Reliability: Repetitive Run Experimentation
and Modelling. BCS-40399 (Dec.), Boeing Computer Services Company, Seattle,
Washington
Nakagawa, T., Osaki, S. (1974): Stochastic behaviour of a two-dissimilar unit standby
redundant system with repair maintenance. Microelectr. and Reliability, vol. 13, pp.
143-148
Naresky, J. J.: RADC Reliability Notebook. Astia Document no. AD-148868
Naylor, J. C., Smith, A. F. M. (1982): Applications of a Method for the Efficient Computation
of Posterior Distributions. Applied Statistics, no. 31, pp. 214-225
Nelson, J. J. et al. (1989): Reliability Models for Mechanical Equipment. Proc. Ann. Rel. &
Maint., pp. 146-153
Nelson, W. (1982): Applied Life Data Analysis. J. Wiley and Sons, New York
Nelson, W. (1990): Accelerated Testing. J. Wiley and Sons, New York
Neumann, J. von (1956): Probabilistic Logics and the Synthesis of Reliable Organisms from
Unreliable Components. Annals of Math. Studies, Princeton University Press, no. 34,
pp.43-98
Newby, M. (1991): Reliability Modelling and Estimation. In: Sander, P., Badoux, R. (eds.)
Bayesian Methods in Reliability, Kluwer Academic Publishers, Dordrecht
Nicollian, E. H.; Goetzberger, A. (1967): The Si-SiO2 interface-electrical properties as
determined by the metal-insulator-silicon conductance technique. Bell System Technical
Journal, vol. 46, pp. 1055-1063
Noyce, R. N.; Bohn, R. E.; Chua, H. T. (1969): Schottky diodes make IC scene. Electronics,
July 21, pp. 74-77
O'Connor, P. D. T. (1991): Practical Reliability Engineering. J. Wiley and Sons, Chichester
Olbrich, T. et al. (1996): Built-In Self-Test in Intelligent Microsystems as a Contributor to
System Quality and Performance. Quality Engineering, vol. 8, no. 4, pp. 60-613
Olson, C. (1989): Reliability of Plastic-Encapsulated Logic Circuits. Quality and Reliability
Engineering International, vol. 5, pp. 53-72
Osaki, S. (1985): Stochastic System Reliability Modeling. World Scientific, Singapore, pp.
11-18, pp. 35-39, pp. 388-402
Osaki, S. (1992): Applied Stochastic System Modeling. Springer-Verlag, Berlin
Pasco, R. W., Schwarz, J. A. (1983): The Application of Dynamic Technique to the Study
of Electromigration Kinetics. Internat. Reliability Physics Symposium, pp. 10-23
Paté-Cornell, M. E., Fischbeck, P. S. (1995): Probabilistic Interpretation of Command and
Control Signals. Reliability Engineering and System Safety no. 47, pp. 27-36
Pau, L. F. (1981): Failure Diagnosis and Performance Monitoring. Marcel Dekker, Inc., New
York
Pecht, M. G., Palmer, M., Naft, J. (1987): Thermal Reliability Management in PCB Design.
Proc. Ann. Rel. & Maint. Symposium, pp. 312-315
Pecht, M., Ramappan, V. (1992): Are components still the major problem: A review of
electronic system and device field failure returns. IEEE Trans. Comp., Hybrids, and
Manuf. Technol., vol. 15, no. 6, pp. 1160-1164
Peck, D. S., Trapp, O. D. (1987): Accelerated Testing Handbook. Technology Associates,
Portola Valley (CA)
Petrick, P.: Das Dauerverhalten von Kondensatoren. Elektronikpraxis vol. 3, no. 2, pp. 7-17;
no. 3/4, pp. 9-16
Pfannschmidt, G. (1992): Ultrasonic Microscope Investigations of Die Attach Quality and
Correlations with Thermal Resistance. Quality and Reliability Engineering
International, vol. 8, pp. 243-246
Philipp, H. R., Levinson, L. M. (1983): Degradation Phenomena in ZnO, a Review. Advances in
Ceramics, no. 7
Picart, B.; Deboy, G. (1992): Failure analysis on VLSI circuits using emission microscopy
for backside observation. Proceedings of ESREF, pp. 515-520
Pierret, R. F.; Sah, C. T. (1968): An MOS-oriented investigation of effective mobility
theory. Solid-State Electron., vol. 11, pp. 279-285
Pieruschka, E. (1963): Principles of Reliability. Prentice-Hall, Englewood Cliffs
Platz, G. (1983): Methoden der Software-Entwicklung. Hanser-Verlag, München
Pollard, A., Rivoire, C. (1971): Fiabilité et statistique prévisionnelles. La méthode de
Weibull. Eyrolles, Paris
Pollino, B. (1989): Microelectronic Reliability, vol. 2, Artech House, Norwood


Preuss, H. (1976): Zuverlässigkeit elektronischer Einrichtungen. VEB Technik Verlag,
Berlin
Pynn, C. (1986): Strategies for Electronics Test. McGraw-Hill, New York
RAC (1990): Fault Tree Analysis (FTA)
RAC (1992): Worst Case Circuit Analysis (WCCA)
RAC (1993): Fault Modes, Effects, and Criticality Analysis (FMECA)
RAC (1995): Reliability Toolkit: Commercial Practice Edition. Reliability Analysis Center,
Rome, NY
RADC, SOAR-7 (1990): A Guide for Implementing Total Quality Management
RADC, NPRD-3 (1985): Nonelectronic Parts Reliability Data
Radford, D. (1996): Spread-Spectrum Data Leap through AC Power Wiring. IEEE
Spectrum, nov., pp. 48-53
Raiffa, H. (1968): Decision Analysis. Addison Wesley, Reading (Mass.)
Rathbone, R., Maier, R. (1996): Automotive Electronics. Internat. Symposium on Reliability
Engineering 2000, ETH Zurich, 17th October
Reddi, V. G. K. (1968): Majority carrier surface mobilities in thermally oxidized silicon.
IEEE Trans. Electron Dev., vol. 15, pp. 151-156
Redmill, F. J. (1988): Dependability of Critical Computer Systems. vol. 1 & 2, Elsevier,
London
Rehcs, L. (1979): Guide de choix du condensateur au tantale. Toute l'Electronique,
November, pp. 39-46
Reinschke, K. (1973): Zuverlässigkeit von Systemen (Band I). VEB Verlag Technik, Berlin
Reinschke, K., Usakov, I. (1987): Zuverlässigkeitsstrukturen. Verlag Technik, Berlin
Reiszmann, E. (1972): Messung und Bewertung mechanischer Umwelteinflüsse auf
Geräte. Fernmeldetechnik, vol. 12, no. 3, p. 117
Reliability of General Electric GE-MOV Varistors, Report E95.44 of the firm "General
Electric"
Revesz, A. G.; Zaininger, K. H. (1968): The Si-SiO2 solid-state interface system. RCA Rev.,
vol. 29, pp. 22-46
Reynolds, F. H. (1974): Thermally accelerated aging of semiconductor devices. Proc. of the
IEEE, February, pp. 185-193
Ricketts, L. W. (1971): Radiation Effects on Microelectronic Components and Circuits.
Proceedings of the 1971 Internat. Symp., Chicago, 11-13 Oct., pp. 5-7-1 ... 5-7-17
Roberts, J. A., Chabot, C. B. (1980): Application Engineering. In: Arsenault, J. E. and
Roberts, J. A. (eds.) Reliability and Maintainability of Electronic Systems. Computer
Science Press, Rockville, Maryland
Root, B. J. (1985): Wafer Level Electromigration Test for Production Monitoring. Intern.
Reliab. Physics Symposium, pp. 100-107
Rubino, G., Sericola, B. (1989): Sojourn Times in Finite Markov Processes. J. Appl. Prob.
vol. 26, pp. 744-756
Rubino, G., Sericola, B. (1991): Successive Operational Periods as Measures of
Dependability. In Avizienis, A., Laprie, J.-C.: "Dependable Computing and Fault-
Tolerant Systems". Springer-Verlag, Berlin, pp. 239-254
Rubino, G., Sericola, B. (1993): Sojourn Times in Semi-Markov Reward Processes. Reliab.
Engng. System Safety, vol. 41, pp. 1-4
Rubinstein, R. Y. (1981): Simulation and the Monte Carlo method. John Wiley, New York
Russel, R. F. (1971): Test on Thick Film Resistors. Microelectronics and Reliability no. 10,
p. 115
Ryerson, J. (1978): Reliability testing and screening: a general review paper.
Microelectronics and Reliability, no. 3, pp. 112-121
Saari, A. E. et al. (1982): Stress Screening of Electronic Hardware. RADC-TR-82-087.
Griffiss AFB, N. Y., Rome Air Development Center
Sabnis, A. G. (1990): VLSI Reliability, Academic Press, Inc. San Diego
Sah, C. T.; Wu, S. Y.; Hielscher, F. H. (1966): The effects of fixed bulk charge on the
thermal noise in metal-oxide-semiconductor transistors. IEEE Trans. Electron. Dev.,
vol. 13, pp. 410-419
Sahner, R. A. et al. (1996): Performance and Reliability Analysis of Computer Systems.
Kluwer Academic Publishers
Salenieks, N. et al. (1996): Machine/Process Parameter Monitoring Using Sample Function
Analysis. Quality Engineering vol. 8, no. 4, pp. 553-563
Samaras, T. T. (1971): Fundamentals of Configuration Management. John Wiley and Sons,
New York
Sander, P., Badoux, R. (Ed.) (1991): Bayesian Methods in Reliability. Kluwer Academic
Publ., Dordrecht,
Sandler, G. H. (1963): System Reliability Engineering. Prentice Hall
Savchuk, V. P. (1995): Estimation of Structures Reliability for Non-precise Limit State
Models and Vague Data. Reliability Engineering and System Safety, vol. 47, pp. 47-58
Schaefer, E. (1979): Zuverlässigkeit, Verfügbarkeit und Sicherheit in der Elektronik.
Vogel-Verlag, Würzburg
Schaeffer, R. L. (1971): Optimum Age Replacement Policies With an Increasing Cost
Factor. Technometrics no. 13, pp. 139-144
Scheiber, S. F. (1985): Burn-in: new perspectives. Test & Measurement World, January, pp.
38-50
Schlegel, E. S. (1967): A bibliography of metal-insulator semiconductor studies. IEEE
Trans. Electron Dev., vol. 14, pp. 728-741
Schlegel, E. S.; Schnable, G. L. (1969): The application of test structures for the study of
surface effects in LSI circuitry. IEEE Trans. on Electron Devices, April, pp. 386-393
Schnable, G. L.; Keen Jr, R. S. (1969): Failure mechanisms in Large-Scale Integrated
circuits. IEEE Trans. on Electron Devices, April, pp. 322-332
Schneeweiss, W. (1985): Grundbegriffe der Graphentheorie für praktische Anwendungen.
Hüthig-Verlag, Heidelberg
Schneeweiss, W. (1989): Boolean Functions with Engineering Applications and Computer
Programs. Springer-Verlag, Berlin
Schneeweiss, W. (1992): Zuverlässigkeitstechnik - von den Komponenten zum System.
Datakontext Verlag, Köln
Schneeweiss, W. (1993): Calculating MTBF for modularized fault trees. Proc. Ann. Reliab.
& Maintain. Symposium IEEE, pp. 206-213
Schneeweiss, W. (1994): Zuverlässigkeitstechnik in der Lehre von Automatisierungstechnik
und Informationstechnik. Automatisierungstechnik vol. 42, no. 9, pp. 379-384, R.
Oldenbourg Verlag
Schneeweiss, W. (1996): Limited usefulness of BDDs for mean failure frequency
calculation. J. Automatic Control Production Syst.
Schrüfer, E. (1984): Zuverlässigkeit von Mess- und Automatisierungseinrichtungen.
Hanser-Verlag, München/Wien
Schwob, M., Peyrache, G. (1969): Traité de fiabilité. Masson, Paris
Serfling, R. J. (1980): Approximation Theorems of Mathematical Statistics. John Wiley,
New York
Serra, A., Barlow, R. E. (1986): Theory of Reliability. Course XCIV at the E. Fermi School,
Amsterdam, North-Holland
Sethy, A. (1981): Die praktische Arbeit zur Qualitätssicherung elektronischer Bauelemente
und Einrichtungen. E und M, vol. 98, no. 10, pp. 399-406
Shaw, L. et al. (1973): Time Dependent Stress-Strength Models for Non-Electrical and
Electrical Systems. Proc. Annu. Symp. Reliability, pp. 186-197
Shiomi, H. (1968): Application of cumulative degradation model to acceleration life test.
IEEE Trans. on Reliability, vol. 17, no. 1, March, pp. 27-33
Shockley, W. (1949): The theory of pn junctions in semiconductors and pn junction
transistors. Bell Syst. Techn. Journal, vol. 28, pp. 435-467
Shockley, W. (1952): A unipolar field-effect transistor. Proc. IRE, vol. 40, pp. 1365-1371
Shockley, W.; Prim, R. C. (1953): Space-charge limited emission in semiconductors. Phys.
Rev., vol. 90, pp. 753-762
Shockley, W. (1954): Negative resistance arising from transit time in semiconductor diodes.
Bell Syst. Techn. Journal, vol. 33, pp. 799-809
Shockley, W. (1957): High-frequency negative resistance device. U.S. Patent 2794917, June
4
Shooman, M. (1973): Operational Testing and Software Reliability During Program
Development. Record 1973 Symp. on Computer Software Reliability, New York, 1973,
April 30 - May 2, pp. 51-57
Sichart, K. V., Vollertsen, R.-P. (1991): Bimodal Lifetime Distributions of Dielectrics for
Integrated Circuits. Quality and Reliability Engineering International, vol. 7, pp. 299-
306
Siemens, SN 29 500 (1986): Failure Rates of Components. Zürich, Siemens-Albis
Siewiorek, D.P., Swarz, R. S. (1982): The Theory and Practice of Reliable System Design.
Digital Press, Bedford, MA
Singh, C., Billinton, R. (1977): System Reliability Modelling and Evaluation. Hutchinson,
London
Sinnadurai, N. (1980): Accelerated ageing of IMPATT diodes. Microelectronics and
Reliability, vol. 21, no. 2, pp. 209-219
Sinnadurai, N. (1991): Environmental Testing and Component Reliability Observations in
Telecommunications Equipment Operated in Indian Climatic Conditions. Proceedings
ESREF'91, pp. 55-63, Bordeaux
Smith, A. F. M., Skene, I. E. H., Naylor, J. C. (1987): Progress with Numerical and Graphical
Methods for Practical Bayesian Statistics. Statistician, no. 36, pp. 75-82
Smith, D. J., Babb, A. H. (1973): Maintainability Engineering. Pitman Publishing, Bath
Smith, W. L. (1958): Renewal theory and its ramifications. J. Roy. Stat. Soc. Ser. B, vol. 20,
pp. 243-302
Solid Aluminium Capacitors - Reliability and Stability. Philips Technical Information
057/12.6.79
Solovyev, A. D. (1970): Standby with rapid renewal. Eng. Cybernetics, vol. 8, pp. 49-62
Soom, E. (1970): Einführung in die mathematische Statistik und in die
Wahrscheinlichkeitsrechnung. Hallwag, Bern
Srinivasan, G. R. (1996): Modeling the cosmic-ray-induced soft-error rate in integrated
circuits: An overview. IBM J. Res. Develop., vol. 40, no. 1, pp. 77-89
Srinivasan, S. K., Gopalan, M. N. (1973): Probabilistic analysis of a two-unit system with a
warm standby and a single repair facility. Oper. Res. vol. 21, pp. 748-754; IEEE Trans.
Rel., vol. R-22, pp. 250-254
Srinivasan, V. S. (1966): The Effect of Standby Redundancy in System's Failures with
Repair Maintenance; Operations Research, vol. 14, no. 6, pp. 1024-1036
Stade, W., Hahn, G.: Betrachtungen zu Lebensdaueruntersuchungen von
Elektrolytkondensatoren. Nachrichtentechnik vol. 17, no. 11, pp. 441-443
Stanciu, G.; Miu, C.; Bazu, M. (1992): Laser scanning system, computer assisted. 15th
Annual Semiconductor Conference, Sinaia (Romania), pp. 493-498
Stankovic, J. A. (1988): A Serious Problem for Next-Generation Systems. Computer vol. 21,
no. 10, pp. 10-19
Stanley, K. W. (1971): Reliability and Stability of Carbon Film Resistors. Microelectronics
and Reliability, no. 10, pp. 359-374
Staudinger, W. (1979): Zuverlässigkeit eines modernen öffentlichen Datennetzes am
Beispiel des elektronischen Datenvermittlungssystems EDS. NTG-Tagung "Technische
Zuverlässigkeit", Nürnberg, pp. 125-136
Stevenson, J. L., Nachlas, J. A. (1990): Microelectronics Reliability Predictions Derived
From Component Defect Densities. Proc. Ann. Reliab. and Maintainab. Symp., pp.
366-370
Stoltze, P. L. (1990): Infant Mortality Modelling for VLSI Devices Using Graphical
Estimator. Quality and Reliab. Engineering Internat., vol. 6, pp. 345-356
Störmer, H. (1962): Über die Zuverlässigkeit von Anlagen mit Reservebauelementen bei
beliebiger Verteilung der Bauelementelebensdauern. AEÜ vol. 16, no. 9, pp. 465-472
Störmer, H. (1983): Mathematische Theorie der Zuverlässigkeit. R. Oldenbourg Verlag,
München
Sweetland, A. (1966): Some Statistical Methods for Maintenance Analysis. The RAND
Corp., Memorandum RM-4443-PR, April
Taguchi, G. (1987): System of Experimental Design - Engineering Methods to Optimize
Quality and Minimize Costs. vol. 1 & 2, Unipub, White Plains NY
Taylor, R. G. et al. (1985): A Failure Analysis Methodology for Revealing ESD Damage to
Integrated Circuits. Quality and Reliability Engineering International, vol. 1, no. 3, pp.
165-171
Tomasek, K. (1972): Classification of reliability tests. Microelectronics and Reliability, no.
4, pp. 361-375
Towner, J. M. (1990): Are Electromigration Failures Lognormally Distributed? Internat.
Reliab. Phys. Symp., pp. 100-105
Trachtenberg, M. (1990): A General Theory of Software-Reliability Modeling. IEEE Trans.
Rel. vol. 39, no. 1, pp. 92-96
Tretter, J.: Zum Driftverhalten von Bauelementen und Geräten. Qualität und Zuverlässigkeit
vol. 19, no. 4, pp. 73-79
Trignan, J. (1991): Probabilité, statistique et leurs applications. Bréal, Paris
Trivedi, K. S. (1982): Probability & Statistics with Reliability, Queuing, and Computer
Science Applications. Prentice-Hall, Inc., Englewood Cliffs, NJ 07632
Tummala, R. R., Rymaszewski, E. J. (1989): Microelectronics Packaging Handbook, Van
Nostrand Reinhold, New York
Ueda, Osamu (1996): Reliability and Degradation of III-V Optical Devices. Artech House,
Boston and London
Umiker, B., Bisang, P. (1987): Wie lassen sich grosse Industriekatastrophen verhüten? io
Management Zeitschrift vol. 56, no. 1, pp. 15-22
Unger, B. A. (1981): Electrostatic Discharge Failures of Semiconductor Devices. Internat.
Reliab. Physics Symp., pp. 204-208
Van der Ziel, A.; Hsu, S. T. (1966): High-frequency admittance of space-charge-limited
solid-state diodes. Proc. IEEE (Letters), vol. 54, pp. 1194-1195
Van der Ziel, A. (1967): Normalized characteristics of nun devices. Solid-St. Electron. vol.
10, pp. 267-272
Van der Ziel, A. (1968): Solid-state physical electronics. Prentice-Hall, New Jersey
Vanhecke, B. et al. (1991): Electromigration at Gold-Aluminium Interfaces and in Thin
Aluminium Tracks. Proceedings ESREF '91, pp. 193-199, Bordeaux
Vaucher, C. L. et al. (1996): The ppm Myth in Board Assembly. Quality Engineering, vol. 8,
no. 4, pp. 615-621
VDI 2221 (1987): Systematic Approach to the Design of Technical Systems and Products
VDI 4008: Handbuch Zuverlässigkeitstechnik
VDI 4009 Bl. 8 (1985): Zuverlässigkeitswachstum bei Systemen
Vetter, H. (1979): Zuverlässigkeit trotz steigender Komplexität der Anforderungen. NTG-
Tagung "Technische Zuverlässigkeit", Nürnberg, pp. 47-86
Viertl, R. (1988): Statistical Methods in Accelerated Life Testing. Vandenhoeck &
Ruprecht, Göttingen
Villemeur, A. (1988): Sûreté de fonctionnement des systèmes industriels. Eyrolles, Paris
Vliet, H. (1993): Software Engineering. Principles and Practice. J. Wiley & Sons, New York
Wada, Y. et al. (1981): Electrical testing for process evaluation. Microelectronics and
Reliability, vol. 21, no. 2, pp. 159-163
Wagner, G. R., Mischke, C. R. (1973): Cycles-to-Failure and Stress-to-Failure Weibull
Distributions in Steel Wire Fatigue. Proc. Annu. Symp. Reliability, pp. 445-451
Wallace, W. E. (1981): Progress in Electronic Systems Reliability. Proceedings of Annual
Reliability and Maintainability Symposium, pp. 272-274
Wallmark, J. T.; Johnson, H. (1966): Field effect transistors - Physics technology and
applications. Prentice-Hall, New Jersey
Warner, R. M. (1965): Integrated circuits, design principles and fabrication. McGraw-Hill,
New York
Wasserman, G. S., Reddy, I. S. (1992): Practical Alternatives for Estimating the Failure
Probabilities of Censored Life Data. Quality and Reliability Engineering International,
vol. 8, pp. 61-67
Weber, G. G. (1974): State of Reliability in Europe. IEEE Trans. on Reliab. R-23
Weber, W. et al. (1991): Dynamic degradation in MOSFET's - part II: Application in the
circuit environment. IEEE Trans. El. Devices, vol. 38, no. 8, pp. 1859-1867
Webinger, R.: Aluminium-Elektrolytkondensatoren für den Einsatz in Stromversorgungen.
Bauteile Report vol. 17, no. 2, pp. 37-41
Weibull, W. (1951): A Statistical Distribution Function of Wide Applicability. Journal
Appl. Mech., vol. 18, pp. 293-297
Weick, W. W. (1980): Acceleration factors for IC leakage current in a steam environment.
IEEE Trans. on Reliability, vol. 29, no. 2, June, pp. 109-115
Westinghouse Summary Chart of 1984 to 1987 for Failure Analysing Memos (1988):
Westinghouse Electric Corporation. In: Pecht, M. et al. (1990): Temperature
Dependence of Microelectronic Device Fails. QRE International, vol. 6, no. 4, pp. 275-
284
Whitehead, A. P., Prince, M. D. H. (1991): Reliability Performance of Electronic
Components. Proc. of Reliability '91 (London), pp. 284-296
Whorf, B. L. (1984): Sprache-Denken-Wirklichkeit. Rowohlt-Verlag, Hamburg
Wiesen, J. M. (1960): Mathematics of Reliability. Proc. 6th Nat. Symp. on Reliability and
Quality Control in Electr., pp. 110-120
Wilcox, R. H., Mann, W. C. (Ed.) (1962): Redundancy Techniques for Computing Systems.
Spartan Books
Wiper, M. P. (1990): Calibration and Use of Expert Probability Judgements. Ph. D. Thesis,
School of Computer Studies, University of Leeds
Wong, K. L. (1982): A New Direction for Electronic Reliability Engineering in the 80's.
Proceedings of Eurocon '82, Copenhagen, pp. 3-10
Wong, K. L. (1990): Reliability Prediction Models for Military Avionics. Technical Report
Project no. AF89-158, April
Woods, M. H. (1985): VLSI Reliability. NATO Seminar, Helsingør
Woods, M. H. (1986): MOS VLSI Reliability and Yield Trends. Internat. Reliab. Physics
Symp., pp. 1715-1729
Wright, G. T. (1964): Theory of space-charge-limited surface-channel dielectric triode.
Solid-St. Electron., vol. 7, pp. 167-173
Wu, E.Y.; Lo, S.-H.; Abadeer, W.W.; Acovic, A; Buchanan, D.; Furukawa, T.; Brochu, D.;
Dufresne, R. (1997): Determination of ultrathin oxide voltages and thickness and the
impact on reliability projection. Proceedings of the IEEE International Reliability
Physics Symp., pp. 184-191
Wurnik, F. M. (1981): Quality Assurance System and Reliability Testing of LSI Circuits.
Microelectronics & Reliability vol. 23, no. 4, pp. 709-715
Zaininger, K. H.; Wang, C. C. (1970): MOS and vertical junction device characteristics of
epitaxial silicon on low aluminium-rich spinel. Solid-State Electronics, vol. 13, pp.
943-947
Zehnder, C. A. (1986): Informatik-Projektentwicklung. Verlag der Fachvereine, Zürich
Zerbst, M. (Ed.) (1986): Mess- und Prüftechnik. Springer-Verlag, Berlin
Zio, E. (1995): Biasing the Transition Probabilities in Direct Monte-Carlo. Reliability
Engineering and System Safety vol. 47, pp. 59-63
Glossary of microelectronics and reliability terms

Abrasive trimming: Trimming a film resistor to its nominal value by notching the resistor
with a finely adjusted stream of an abrasive material (for example aluminium oxide)
directed against the resistor surface.
Accelerated lifetest: Test conditions used to bring about - in a short time - the deteriorating
effect obtained under normal service conditions.
Accelerated test: A test in which the applied-stress level is chosen to exceed that stated in
the reference conditions, in order to shorten the time required to observe the stress
responses of the item, or magnify the response in a given time. To be valid, an accelerated
test shall not alter the basic modes and/or mechanisms of failure, or their relative
prevalence.
Acceleration factor: The major failure mechanisms of a component stem from electrical
ageing and both electrical and mechanical wear. The electrical ageing is a chemical
process generally following the chemical reaction equation of Arrhenius:
F = A exp(-Ea/kT), where F = failure rate; A = a constant; Ea = activation energy (eV); k =
Boltzmann's constant (8.6 x 10^-5 eV/K); T = absolute temperature (K). Since electrical
ageing is accelerated at increased temperatures, we can define a time acceleration factor
AF = exp{(Ea/k)[(1/T1) - (1/T2)]}, where T1 = reference temperature (K) and T2 =
acceleration temperature (K).
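As a quick numerical sketch of the acceleration factor defined above (the activation energy and the two temperatures below are arbitrary illustration values, not taken from this glossary):

```python
import math

K_BOLTZMANN_EV = 8.6e-5  # Boltzmann's constant in eV/K, as quoted in the glossary entry


def acceleration_factor(ea_ev, t_ref_k, t_acc_k):
    """Arrhenius time acceleration factor AF = exp{(Ea/k)[(1/T1) - (1/T2)]},
    where t_ref_k is the reference temperature T1 and t_acc_k the elevated
    (acceleration) temperature T2, both in kelvin, and ea_ev is the activation
    energy Ea of the failure mechanism in eV."""
    return math.exp((ea_ev / K_BOLTZMANN_EV) * (1.0 / t_ref_k - 1.0 / t_acc_k))


# Hypothetical example: Ea = 0.7 eV, reference 55 degC (328 K), test at 125 degC (398 K)
af = acceleration_factor(0.7, 328.0, 398.0)
```

With these example values AF comes out near 80, i.e. one hour of testing at the elevated temperature would correspond to roughly 80 hours at the reference temperature, which is the rationale behind accelerated lifetests.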
Acceptance test: 1) A test to demonstrate the degree of compliance of a device with
purchaser's requirements. 2) A conformance test to demonstrate the quality of the units of
a consignment, without implication of contractual relations between buyer and seller.
Active components: Electronic components (transistors, thyristors, etc.) which can operate
on an applied electrical signal so as to change its basic character; i. e. amplification,
switching, rectification, etc.
Active element: An element of a circuit in which an electrical input signal is converted into
an output signal by the non-linear voltage/current relationships of a semiconductor
device.
Active maintenance time: The time during which maintenance actions are performed on an
item either manually or automatically.
Active substrate: A substrate in which active and passive circuit elements may be formed to
provide discrete or integrated devices.
Add-on component: Discrete or integrated pre-packaged or chip components that are
attached to a film circuit to complete the circuit functions.
Adhesion: The property of one material to remain attached to another; a measure of the
bonding strength of the interface between, for example, film deposit and the surface
which receives the deposit; the surface receiving the deposit may be another film or
substrate.
Alloy: A solid-state solution of two or more metals.
Alumina: Al2O3; alumina substrates are made of formulations that are primarily alumina.
Ambient temperature (Ta): Temperature of atmosphere in intimate contact with the
electrical parts or device.
Angstrom: A unit of measurement used in thin-film circuits equal to 10^-10 m.
Annealing: Heating of a film resistor followed by slow cooling to relieve stresses and
stabilise the resistor material.
Array: A group of elements (or circuits) arranged in rows and columns on one substrate.
Arrhenius: See acceleration factor.
Availability: 1) The capability of an item - under the combined aspects of its reliability and
maintenance - to perform its required function at a stated instant in time. 2) The
probability that an item will perform its required function under given conditions at a
stated instant of time.
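For the steady state, availability is commonly computed as MTTF/(MTTF + MTTR); this formula is a standard simplification, not stated in the entry itself, and the figures below are hypothetical:

```python
def steady_state_availability(mttf_h, mttr_h):
    # Long-run fraction of time an item can perform its required
    # function, given mean time to failure and mean time to repair.
    return mttf_h / (mttf_h + mttr_h)

# Example: MTTF = 10 000 h, MTTR = 4 h
a = steady_state_availability(10_000, 4)  # just under 1
```

The result, about 0.9996, reflects the combined aspects of reliability (MTTF) and maintenance (MTTR) that the definition mentions.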
Back bonding: Bonding active chips to the substrate using the back of the chip, leaving the
face, with its circuitry face up. The opposite of back bonding is face bonding.
Ball bond: A bond formed when a ball-shaped end of an interconnecting wire is deformed by
thermocompression against a metallised pad.
Beam lead: 1) A metal beam deposited directly onto the surface of the die as part of the
wafer processing cycle in the fabrication of an integrated circuit. Upon separation of the
individual die, the cantilevered beam is left protruding from the edge of the chip and can
be bonded directly to interconnecting pads on the circuit substrate without the need for
individual wire interconnections. 2) A long structural member not supported everywhere
along its length and subject to the forces of flexure, one end of which is permanently
attached to a chip device and the other end intended to be bonded to another material,
providing an electrical interconnection or mechanical support or both.
Beam lead device: An active (or passive) chip component possessing beam leads as its
primary interconnection and mechanical attachment means to a substrate.
Beryllia: BeO - a substrate material used where extremely high thermal conductivity is
desired.
Blisters: Raised parts of a conductor or resistor formed by the outgassing of the binder or
vehicle during the firing cycle.
Bond: 1) Electrical interconnection made with a low-resistance material between a chassis,
metal shield cans, or cable shielding braid, in order to eliminate undesirable interaction
and interference resulting from high-impedance paths between them. 2) An
interconnection which performs a permanent electrical and/or mechanical function.
Bond deformation: The change in the form of the lead produced by the bonding tool,
causing plastic flow, in making the bond.
Bond, wire: The method by which very fine wires are attached to semiconductor components
for interconnection of those components with each other or with package leads.
Bonding: 1) Soldering or welding together various elements, shields, or housings of a device
to prevent potential differences and possible interference. 2) A method used to produce
good electrical contact between metallic parts of any device. 3) The attachment of wire to
circuit.
Bonding area: The area defined by the extent of a metallisation land or the top surface of
the terminal, to which a lead is or is to be bonded.
Bonding, ball: A bonding technique that uses a capillary tube to feed the bonding wire. The
end of the wire is heated and melts, thus forming a large ball. The capillary and ball are
then positioned on the contact area and the capillary is lowered. This forms a large bond.
The capillary is then removed and a flame is applied, severing the wire and forming a new
ball.
Bonding pad: A metallised area at the end of a thin metallic strip to which a connection is to
be made.
Bonding, stitch: A bonding technique where wire is fed through a capillary tube. A bent
section of the wire is bonded to the contact area by the capillary. The capillary is removed
and a cutter severs the wire, forming a new bend for the next bonding operation.
Bonding, thermal compression: Diffusion bonding where two carefully prepared surfaces
are brought into intimate contact under carefully controlled conditions of temperature,
time, and clamping pressure. Plastic deformation is induced by the combined effects of
pressure and temperature, which in turn results in atom movement causing the
development of a crystal lattice bridging the gap between the facing surfaces and results
in bonding. Generally, the process is performed under a protective atmosphere of inert
gas to keep the surfaces to be bonded clean while they are being heated.
Bonding, wedge: 1) A type of thermocompression bonding used in integrated-circuit
manufacturing where a wedge-shaped tool is used to press a small section of the lead wire
onto the bonding pad. 2) A bond formed when a heated wedge is brought down on a wire
prepositioned on a heated contact. The wedge's heat and pressure, in combination with
heat applied to the mounting contact form the bond.
Bonding, wire: 1) A lead-covered tie used for connecting two cable sheaths until a splice is
closed and covered permanently. 2) Fine gold or aluminium wire for making electrical
connections between various bonding pads on the semiconductor device substrate and
device terminals or substrate lands.
Bond lift-off: The failure mode whereby the bonded lead separates from the surface to
which it was bonded.
Brazing: Similar to soldering. The joining of metals with a non-ferrous filler metal at
temperatures above 425°C. Also called hard soldering.
Breakdown: Failure of a clamp or Zener diode.
Breakdown voltage: The voltage threshold beyond which there is a marked (almost infinite
rate) increase in electrical current conduction.
Burn-in: 1) The operation of items prior to their ultimate application intended to stabilise
their characteristics and to identify early failures. 2) The process of electrically stressing a
device (usually at an elevated temperature environment) for an adequate period of time to
cause failure of marginal devices. 3) (For nonrepairable items): Type of screening test
while an item is in operation. 4) (For repairable items): Operation of an item in a
prescribed environment with successive corrective maintenance at every failure during
the early failure period.
Burn-in - statically or dynamically - (125 °C for 160 h) provokes some 80% of the chip-
related and 30% of the package-related early failures; memories should be operated with
the same electrical signals as in the field. Should surface, oxide and metallisation
problems be dominant, a static burn-in is better. A dynamic burn-in activates practically
all failure mechanisms. The choice will be made on the basis of practical results.
Burn-out: Destruction of the junctions of a transistor due to extremely large currents caused
by latch-up.
Camber: A term that describes the amount of overall warpage present in a substrate.
Capability: Ability of an item to meet a service demand of stated quantitative characteristics
under given conditions.
Capillary: A hollow bonding tool used to guide the bonding wire and to apply pressure to
the wire during the bonding cycle.
Capillary tool: A tool used in bonding where the wire is fed to the bonding surface of the
tool through a bore located along the long axis of the tool.
Centrifuge: Testing the integrity of bonds in a circuit by spinning the circuit at a high rate of
speed, thereby imparting a high g loading on the interconnecting wire bonds and bonded
elements.
Cermet: A solid homogeneous material usually consisting of a finely divided admixture of a
metal and ceramic in intimate contact.
Characterisation: A parametric, experimental analysis of the electrical properties of a given
IC; it investigates the influence of different operating conditions (supply voltage,
frequency, temperature, logic levels, etc.) on the IC's behaviour and delivers a cost-
effective test programme for incoming inspection. Characterisation testing is a key to
successful screening and incoming inspection testing.
Chip: 1) A single substrate on which all the active and passive circuit elements have been
fabricated using one or all of the semiconductor techniques of diffusion, passivation,
masking, photoresist, and epitaxial growth. A chip is not ready for use until packaged and
provided with external connectors. 2) A tiny piece of semiconductor material scribed or
etched from a semiconductor slice on which one or more electronic components are
formed. The percentage of usable chips obtained from a wafer is the yield.
Chip-scale package (CSP): package - introduced in 1994 - having a perimeter no more
than 1.2 times the perimeter of the die it contains. CSP combines the best features of bare
die assembly and traditional semiconductor packaging; it reduces overall system size,
something devoutly to be desired in portable electronic products. Unresolved issues
include reliability, thermal performance, design, materials, assembly test, shipping,
handling, and the CSP-system interaction. The length of the list reflects the newness of
the technology and the fact that few CSPs are as yet in production or use.
Clinch: A method of mechanically securing components prior to soldering, by bending that
portion of the component lead that extends beyond the lip of the mounting hole, against a
pad area.
Coefficient of thermal expansion: The ratio of the change in length to the change in
temperature.
Cold solder connection: A soldered connection where the surfaces being bonded moved
relative to one another while the solder was solidifying, causing an uneven solidification
structure which may contain microcracks. Such cold joints are usually dull and grainy in
appearance.
Component: 1) A piece of equipment, a line, a section of line, or a group of items that is
viewed as an entity for purposes of reporting, analysing, and predicting outages. 2) An
essential functional part of a subsystem or equipment; it may be any self-contained
element with a specific function, or it may consist of a combination of parts, assemblies,
accessories, and attachments.
Component hazard (reliability data): The instantaneous failure rate of a component or its
conditional probability of failure versus time.
Compound (chemical): A substance consisting of two or more elements chemically united
in definite proportions by weight.
Conductive epoxy: An epoxy material (polymer resin) that has been made conductive by the
addition of a metal powder (usually gold or silver).
Conductivity: The ability of a material to conduct electricity. (The reciprocal of resistivity).
Confidence level: The probability (expressed as a percentage) that a given assertion is true
or that it lies within certain limits calculated from the data.
Confidence limits: Extremes of a confidence interval within which there is a designated
chance that the true value is included.
Confidence test: A test primarily performed to provide a high degree of certainty that the
unit under test is operating acceptably.
Contact resistance: The apparent resistance between the terminating electrode and the body
of the device (the case of resistors or capacitors, for example).
Controllability: The possibility of modifying internal signals from the circuit's inputs.
Corrective maintenance: 1) The maintenance carried out after a failure has occurred and
intended to restore an item to a state in which it can perform its required function. 2)
Maintenance carried out after recognition of a fault, intended to put an item back into a
state in which it can again perform its required function.
Critical charge: The amount of charge required to change the value stored in a memory cell.
Crosstalk: Signals from one line leaking into another nearby conductor because of
capacitive or inductive coupling or both.
Curie point: Above a critical temperature, ferromagnetic materials lose their permanent
spontaneous magnetisation and ferroelectric materials lose their spontaneous polarisation.
This critical temperature is the Curie point; there ferroelectric ceramic capacitors reach a
peak in capacitance.
Custom circuits: Circuits designed to satisfy a single application requirement.
Debug: To examine or test a procedure, routine, or equipment for the purpose of detecting
and correcting errors.
Debugging: The operation of an equipment or complex item prior to use to detect and
replace parts that are defective or expected to fail, and to correct errors in fabrication or
assembly.
Decoder malfunction: Inability to address a substantial part of the array due to an open
decoder line internal to the device, or a defective decoder.
Defect: 1) Any non-conformance of an item to specified requirements and that adversely
affects - or potentially affects - the quality of a device. 2) Nonfulfilment of an intended
usage requirement or reasonable expectation, essentially present at t = 0.
Degradation: 1) Change for the worse in the characteristics of an electric element because of
heat, high voltage, etc. 2) A gradual deterioration in performance as a function of time.
Dependability: Collective term used to describe the availability performance and its
influencing factors.
Depletion-mode transistor: An MOS transistor with a physically implanted channel that
conducts current at zero gate voltage.
Derating: 1) The intentional reduction of stress-to-strength ratio in the application of an
item, usually for the purpose of reducing the occurrence of stress-related failures. 2) Non-
utilisation of the full load capability of an item with the intent to reduce the failure rate.
Design review: A formal, documented, comprehensive, and systematic examination of a
design to evaluate the capability of the design to meet the requirements, to identify
problems, and propose solutions.
Dewetting: The condition in a soldered area in which liquid solder has not adhered
intimately and has pulled back from the conductor area.
Die (sometimes called chip): 1) A tiny piece of semiconductor material, broken from a
semiconductor slice, on which one or more active electronic components are formed.
(Plural: dice). 2) A portion of a wafer bearing an individual circuit or device cut or broken
from a wafer containing an array of such circuits or devices.
Dielectric breakdown: The breakdown of the insulation resistance in a medium under high
voltage.
Dielectric loss: The power dissipated by a dielectric as the friction of its molecules opposes
the molecular motion produced by an alternating electric field.
Diffusion: The phenomenon of movement of matter at the atomic level from regions of high
concentration to regions of low concentration.
DIN (Deutsche Industrie Normenausschuss): The abbreviation for the association in
Germany that determines the standards for electrical and other equipment in that country.
Similar to the American USAS.
Diode (semiconductor): 1) A semiconductor device having two terminals and exhibiting a
non-linear voltage-current characteristic. 2) A semiconductor device that has the
asymmetrical voltage-current characteristic exemplified by a single pn junction.
Direct chip attach: A method of forming the electrical connection from a die to a substrate
(supporting material) without the use of a package; it can be done either with wire bonds
or with flip-chip attach.
Discrete components: Individual components such as resistors, capacitors, and transistors.
Dissipation factor: Tangent of the dielectric loss angle. The ratio of the resistive component
of a capacitor (Rs) to the capacitive reactance (Xc) of the capacitor.
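The ratio Rs/Xc in the definition is directly computable once the frequency is fixed, since Xc = 1/(2*pi*f*C). A short sketch with hypothetical component values (the ESR and capacitance below are example figures, not from the text):

```python
import math

def dissipation_factor(r_series_ohm, capacitance_f, freq_hz):
    # DF = Rs / Xc, where Xc = 1 / (2*pi*f*C)
    xc = 1.0 / (2.0 * math.pi * freq_hz * capacitance_f)
    return r_series_ohm / xc

# Example: 100 nF capacitor with 0.5 ohm series resistance at 10 kHz
df = dissipation_factor(0.5, 100e-9, 10e3)  # dimensionless tan delta
```

Note that DF rises with frequency for a fixed series resistance, which is why dissipation factor is always quoted at a stated test frequency.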
Doping: The addition of an impurity to a semiconductor to alter its conductivity.
Downtime: The period of time during which an item is not in a condition to perform its
intended function.
Dual-in-line pack (DIP): A package having two rows of leads extending at right angles
from the base and having standard spacings between leads and between rows of leads.
Dual in-line (DIL) package: A type of housing for integrated circuits. The standard form is
a moulded plastic container about 3/4 inch long and 1/3 inch wide, with two rows of pins
spaced 0.1 inch between centres.
Dynamic testing: Testing a hybrid circuit where reactions to ac (especially high frequency)
are evaluated.
Early failures: Often due to randomly distributed weaknesses in material or in the item's
production process (assembling, soldering, etc.), the early failures should be distinguished
from systematic failures (which are deterministic and are caused by an error or a mistake,
and whose elimination requires a change in the design, production process, operational
procedure, documentation, or other). The length of the early failure period varies between
a few days and a few thousand hours.
Effectiveness: The capability of the system or device to perform its function.
Engineering, reliability: The science of including those factors in the basic design that will
ensure the required degree of reliability.
Enhancement-mode transistor: An MOS transistor that creates a channel for minority
carriers by applying a gate voltage to drive out the majority carriers.
Environment: 1) The universe within which the system must operate. All the elements over
which the designer has no control and that affect the system or its inputs and outputs. 2)
The physical conditions which a component may be exposed to during storage or
operation. Environment usually covers climatic, mechanical, and electrical conditions.
Environmental stress screening (ESS): Test (or set of tests) intended to remove defective
items, or those likely to exhibit early failures.
Environmental test: A test (or series of tests) used to determine the sum of external
influences affecting the structural, mechanical, and functional integrity of any given
package or assembly.
Equipment: A general term including material, fittings, devices, appliances, fixtures,
apparatus, machines, etc. used as a part of - or in connection with - an electrical
installation.
Exponential failure distribution: This is the failure distribution of a group of parts that
have a constant failure rate. After one fails, the probability is the same that the remaining
parts will survive the same length of time. The exponential curve results because of the
diminishing quantity remaining in the given group of parts.
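The "same probability of surviving the same length of time" property of the exponential distribution (memorylessness) follows from R(t) = exp(-lambda*t) with constant lambda. A minimal sketch, using a hypothetical failure rate:

```python
import math

def survival(failure_rate_per_h, t_h):
    # R(t) = exp(-lambda * t) for a constant failure rate lambda
    return math.exp(-failure_rate_per_h * t_h)

lam = 1e-5  # failures per hour (illustrative value)

# Probability a new part survives the next 1000 h ...
r_next = survival(lam, 1000)
# ... equals the probability a part already 4000 h old survives 1000 h more
r_cond = survival(lam, 5000) / survival(lam, 4000)
```

Both quantities evaluate to exp(-0.01), confirming that prior operating time does not change the survival probability when the failure rate is constant.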
Extrinsic failures: In essence, all non-intrinsic failures.
Extrinsic failure mechanisms: Mechanisms resulting from the device packaging and
interconnection process (the "back-end") of semiconductor manufacturing. As technologies
mature and problems in the manufacturers' fabrication lines are ironed out,
intrinsic failures are reduced, thereby making extrinsic failures all the more important to
device reliability.
Failure: 1) The termination of the ability of an item to perform a required function. 2) A part
that no longer meets its performance criteria. Failures include devices that have
drastically failed as well as components that function but are out of specification.
Failure analysis: 1) The logical, systematic examination of an item or its diagram(s) to
identify and analyse the probability, causes, and consequences of potential and real
failures. 2) The analysis of a circuit to locate the reason for the failure of the circuit to
perform to the specified level.
Failure, catastrophic: Failure that is both sudden and complete.
Failure criteria: Limiting conditions, relating to the admissibility of the deviation from the
characteristic value due to changes after the beginning of stress.
Failure, complete: Failure resulting from deviations in characteristic(s) beyond specified
limits such as to cause complete lack of the required function. The limits referred to in
this category are specified for this purpose.
Failure, critical: Failure which is likely to cause injury to persons or significant damage to
material.
Failure, degradation: Failure which is both gradual and partial. Note: In time, such a failure
may develop into a complete failure.
Failure, dependent: A failure which is caused by the failure of an associated item,
distinguished from independent failure.
Failure distribution: The distribution of failures plotted as a function of time. This is
usually plotted for a particular group of parts operating in a particular environment.
Failure, gradual: Failures that could be anticipated by prior examination or monitoring.
Failure in time (FIT): One failure in 10^9 device operating hours.
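Since 1 FIT is one failure in 10^9 device-hours, fleet-level failure expectations follow by simple multiplication. The numbers below are hypothetical examples:

```python
def expected_failures(fit, n_devices, hours):
    # 1 FIT = 1e-9 failures per device-hour; expected failures in a
    # fleet of n_devices, each operated for the given hours, assuming
    # a constant failure rate.
    return fit * 1e-9 * n_devices * hours

# Example: 100 FIT device, 10 000 units in the field for one year (8760 h)
ef = expected_failures(100, 10_000, 8760)
```

Here about 8.76 failures would be expected over the year, which is why component failure rates of only a few FIT matter in large systems.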
Failure, independent: A failure which occurs without being related to the failure of
associated items, distinguished from dependent failure.
Failure, inherent weakness: Failure attributable to weakness inherent in the item when
subjected to stresses within the stated capabilities of the item.
Failure, intermittent: Failure of an item for a limited period of time, following which the
item recovers its ability to perform its required function without being subjected to any
external corrective action. Note: Such a failure is often recurrent.
Failure, major: Failure - other than a critical failure - which is likely to reduce the ability
of a more complex item to perform its required function.
Failure mechanism: The basic chemical, physical or other process that results in failure (a
catastrophic, degradation, or intermittent failure).
Failure, minor: Failure - other than a critical failure - which does not reduce the ability of a
more complex item to perform its required function.
Failure, misuse: Failure attributable to the application of stresses beyond the stated
capabilities of the item.
Failure mode: The effect (the symptom) - the local effect - by which a failure is observed
(for example a catastrophic, degradation, or intermittent failure, usually in the form of
opens, shorts, functional faults, or parameters out of specification - for electronic
components - and brittle rupture, creep, cracking, etc. - for mechanical components).
Failure, nonrelevant: Failure to be excluded in interpreting test results or in calculating the
value of a reliability characteristic. Note: The criteria for the exclusion should be stated.
Failure, partial: Failure resulting from deviation in characteristic(s) beyond specified limits,
but not such as to cause complete lack of the required function. Note: The limits referred
to in this category are special limits specified for this purpose.
Failure, primary: Failure of an item, not caused either directly or indirectly by the failure of
another item.
Failure, random: Any failure whose cause and/or mechanism make its time of occurrence
unpredictable, but which is predictable only in a probabilistic or statistical sense.
Failure rate: The rate at which devices from a given population can be expected (or were
found) to fail as a function of time.
Failure rate (λ): 1) The number of failures of an item per unit measure of life (cycles, time,
etc.); during the useful life period, the failure rate λ is considered constant. 2) Limit, for
δt → 0, of the probability that an item will fail in the time interval (t, t + δt], given that the
item was new at t = 0 and did not fail in the interval (0, t], divided by δt.
Failure rate, constant: After infant failures have been removed from a group of parts,
failures that occur in a completely random fashion will result in a constant failure rate. If
the events are random, one failure does not influence the probability of future failures.
Failure rate, observed (for a stated period in the life of an item): The ratio of the total
number of failures in a sample to the cumulative observed time on that sample. The
observed failure rate is to be associated with particular and stated time intervals (or
summation of intervals) in the life of the items, and with stated conditions.
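The observed failure rate defined above is the ratio of total failures to cumulative observed time. A minimal sketch, with hypothetical test data:

```python
def observed_failure_rate(n_failures, cumulative_device_hours):
    # Point estimate: total failures in the sample divided by the
    # cumulative observed time on that sample.
    return n_failures / cumulative_device_hours

# Example: 3 failures observed over 500 devices run for 2000 h each
lam_hat = observed_failure_rate(3, 500 * 2000)  # failures per hour
fit = lam_hat * 1e9                             # same rate expressed in FIT
```

This gives 3e-6 failures per hour, or 3000 FIT; as the definition stresses, the estimate is only meaningful together with the stated time interval and test conditions.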
Failure, relevant: Failure to be included in interpreting test results or in calculating the value
of a reliability characteristic. Note: The criteria for the inclusion should be stated.
Failure, secondary: Failure of an item caused either directly or indirectly by the failure of
another item.
Failure, sudden: Failure that could not be anticipated by prior examination or monitoring.
Failure, wearout: A failure that occurs as a result of deterioration processes or mechanical
wear and whose probability of occurrence increases with time.
Fatigue: The weakening of a material under repeated stress.
Fault: A physical condition that causes a device, a component, or an element to fail to
perform in a required manner, for example, a short-circuit, a broken wire, an intermittent
connection.
Fault tree analysis (FTA): 1) Analysis to determine which fault modes of the elements of
an item and/or which external events may result in a stated fault mode of the item,
presented in the form of a fault tree. 2) FTA is a systems engineering technique which
provides an organised, illustrative approach to the identification of high-risk areas.
Field-reliability test: A reliability compliance or determination test made in the field, where
the operating and environmental conditions are recorded and the degree of conformity
found.
Field-effect transistor: A transistor in which current carriers (holes or electrons) are
injected at one terminal (the source) and pass to another (the drain) through a channel of
semiconductor material whose resistivity depends mainly on the extent to which it is
penetrated by a depletion region.
Field-effect varistor: A passive, non-linear, two-terminal semiconductor device that
maintains a constant current over a wide range of voltage.
Filler: A substance - usually dry and powdery or granular - used to thicken fluids or
polymers.
Film: Single or multiple layers or coatings of thin- or thick-material used to form various
elements or interconnections and crossovers.
Fissuring: The cracking of dielectric or conductors.
Flat pack: A flat, rectangular integrated-circuit or hybrid-circuit package with coplanar
leads.
Flip-chip: An unencapsulated semiconductor device in which bead-type leads terminate on one
face to permit flip (facedown) mounting of the device by contact of the leads to the
required circuit interconnectors.
Flip-chip bonding: Method of interconnecting ICs in a circuit by bonding bumps, located
on the face of the IC chip, to the circuit's conducting paths.
Frit: Glass composition ground up into a powder form and used in thick-film compositions
as the portion of the composition that melts upon firing to give adhesion to the substrate
and hold the composition together.
Fusing: Melting and cooling two or more powder materials together so that they bond
together in a homogeneous mass.
Glassivation: 1) A method of transistor passivation by a pyrolytic glass-deposition
technique, whereby silicon semiconductor devices, complete with metal contact systems, are
fully encapsulated in glass. 2) The deposition of glass on a chip to give protection to
underlying device junctions.
The glassivation (silicon dioxide and/or silicon nitride) test should be made over the entire
die surface. For memories in plastic packages the glassivation should ideally be free from
cracks and pinholes. To check this, the chip is immersed (for 5 minutes) in a 50 °C warm
mixture of nitric and phosphoric acid, and then inspected with an optical microscope
(MIL-STD-883, method 2021).
GPS (Global Positioning System): A system using a set of 24 orbiting satellites, each with
an atomic clock on board, to transmit position and time measurements to receivers world-
wide.
Gross leak: A leak in a sealed package greater than 10^-5 cm³/s at one atmosphere of
differential air pressure.
Hazard rate Z(t): At a particular time, the rate of change of the number of items that have
failed, divided by the number of items surviving.
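The verbal definition of the hazard rate translates into a simple discrete estimate over a finite interval: the fraction of items surviving at the start of the interval that fail during it, per unit time. The figures below are hypothetical, and this is only the discrete approximation of Z(t):

```python
def empirical_hazard(survivors_at_start, failures_in_interval, dt_h):
    # Discrete estimate of the hazard rate over one interval:
    # failures during the interval, divided by the survivors at its
    # start and by the interval length.
    return failures_in_interval / (survivors_at_start * dt_h)

# Example: 1000 items still operating at time t, 5 fail in the next 100 h
z = empirical_hazard(1000, 5, 100.0)  # per hour
```

Plotting such estimates over successive intervals yields the familiar bathtub curve: a decreasing hazard during early failures, a roughly constant one in useful life, and an increasing one at wearout.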
Header: The base of a hybrid circuit package that holds the leads.
Heat sink: The supporting member to which electronic components or their substrate or
their package bottom are attached. This is usually a heat-conductive metal with the ability
to rapidly transmit heat from the generating source (component).
Hermeticity: The ability of a package to prevent exchange of its internal gas with the
external atmosphere. The figure of merit is the gaseous leak rate of the package measured
in atm·cm³/s.
Histogram: A graphical representation of a frequency distribution by a series of rectangles
which have for one dimension a distance proportional to a definite range of frequencies,
and for the other dimension a distance proportional to the number of frequencies
appearing within the range.
Hot carriers are a consequence of the high electric fields (10^4...10^5 V/cm) in transistor
channels. Effects: increase of switching times, possible data-retention problems, increase of
noise. This test is performed under dynamic conditions, at 7...9 V and at -20 °C to -70 °C.
Hot spot: A small area on a circuit that is unable to dissipate the generated heat and
therefore operates at an elevated temperature above the surrounding area.
Hybrid circuit: A microcircuit consisting of elements which are a combination of the film
circuit type and the semiconductor circuit type, or a combination of one or both of these
types, and may include discrete add-on components.
Imbedded layer: A conductor layer having been deposited between insulating layers.
IMPATT diode: Device whose negative-resistance characteristic is produced by a combination of impact
avalanche breakdown and charge-carrier transit-time effects. Avalanche breakdown
occurs when the electric field across the diode is high enough for the charge carriers
(holes or electrons) to create electron-hole pairs. With the diode mounted in an
appropriate cavity, the field patterns and drift distance permit microwave oscillations or
amplification.
Infant mortality failures: A characteristic pattern of failure - sometimes experienced with
new equipments which may contain marginal components - wherein the number of
failures per unit of time decreases rapidly as the number of operating hours increases. A
burn-in period may be utilised to age (or mature) an equipment to reduce the number of
marginal components.
Inherent defects: The underlying cause of intrinsic failures, in the useful life period.
Integrated circuit: A microcircuit (monolithic) consisting of interconnected elements
inseparably associated and formed in situ on or within a single substrate (usually silicon)
to perform an electronic circuit function.
Intermetallic bond: The ohmic contact made when two metal conductors are welded or
fused together.
Intraconnections: Those connections of conductors made within a circuit on the same
substrate.
Intrinsic failure mechanisms: Mechanisms inherent to the semiconductor die itself,
including crystal defects, dislocations, and processing defects.
Intrinsic reliability: The reliability a system can achieve based on the types of devices and
manufacturing processes used.
Ion migration: The movement of free ions within a material or across the boundary between
two materials under the influence of an applied electric field.
Item: 1) An all-inclusive term to denote any level of hardware assembly: that is system,
segment of a system, subsystem, equipment, component, part, etc. 2) Any level of
hardware assembly - system, equipment, component, part, and so on.
Item, non-repaired: An item that is not repaired after a failure.
Item, repaired: An item that is repaired after a failure.
Junction temperature: The temperature of the region of transition between the p- and n-
type semiconductor material in a transistor or diode element.
Kirkendall voids: The formation of voids by diffusion across the interface between two
different materials, in the material having the greater diffusion rate into the other.
Lands: Widened conductor areas on the major substrate used as attachment points for wire
bonds or the bonding of chip devices.
Laser bonding: Effecting a metal-to-metal bond of two conductors by welding the two
materials using a laser beam for a heat source.
Latch-up: A condition of a CMOS IC in which parasitic bipolar transistors are switched on,
drawing large currents that may destroy the device.
Latch-up tests simulate voltage overstresses on signal and power supply lines as well as
power-on / power-off sequences.
Glossary of microelectronics and reliability terms 465

Latent defect: A defect which escapes the normal quality control procedures; component
stressing is required in order for it to be detected by inspection at the propagated-failure level.
Lead frames: 1) The metallic portion of the device package that connects hybrid circuit
elements to the outside world. 2) A sheet metal framework to which a chip is attached,
wire-bonded, and then molded with plastic.
Leakage current: An undesirable small stray current which flows through or across an
insulator between two or more electrodes, or across a reverse-biased junction.
Leakage, input and output: Excessive leakage currents above specified limits.
Life cycle costs (LCC): Sum of the costs for acquisition, operation, maintenance, and
disposal or recycling of an item.
Life test: Test of a component or circuit under load over the rated life of the device.
Lifetime: Time span between initial operation and failure of a nonrepairable item.
Linear energy transfer: The energy per unit length transferred from an ionising particle to a
solid as the particle passes through it.
Human factors: A body of scientific facts about human characteristics. The term covers
biomedical and psycho-social considerations in the areas of human engineering,
personnel selection, training, life support, job performance aid, and human performance
evaluation.
Maintainability: 1) A characteristic of design and installation expressed as the probability
that an item will be retained in or restored to a specified condition within a given period
of time, when the maintenance is performed in accordance with prescribed procedures
and resources. 2) Probability that preventive maintenance or repair of an item will be
performed within a stated time interval for given procedures and resources.
Maintenance: The combination of all technical and corresponding administrative activities
intended to retain an item, or restore it, to a specified state. Maintenance is thus
subdivided into preventive (carried out at predetermined intervals and according to
prescribed procedures, to reduce the probability of failures or the degradation of the
functionality of the item), and corrective (carried out after fault recognition and intended
to bring the item into a state in which it can again perform the required function).
Man-function: The function allocated to the human component of a system.
Mask: The photographic negative that serves as the master for making patterns.
Mathematical expectation (expected value) of a probability distribution:
E(x) = ∫_{-∞}^{+∞} x f(x) dx for a continuous distribution, and E(x) = Σ_{i=1}^{∞} x_i f(x_i) for a
discrete distribution.
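As a numerical illustration of the two forms of the expectation (all values below are invented for the example, not taken from the text):

```python
# Discrete case: E(x) = sum over i of x_i * f(x_i).
# Example: a fair six-sided die (invented illustration).
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
e_discrete = sum(x * p for x, p in zip(values, probs))
print(e_discrete)  # 3.5

# Continuous case: E(x) = integral of x * f(x) dx, approximated here by a
# midpoint Riemann sum for the uniform density f(x) = 1 on [0, 1].
n = 100_000
dx = 1.0 / n
e_continuous = sum((i + 0.5) * dx * 1.0 * dx for i in range(n))
print(round(e_continuous, 4))  # 0.5
```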
Mean maintenance time: The total preventive and corrective maintenance time divided by
the number of preventive and corrective maintenance actions, during a specified period of
time.
Mean time between failures (MTBF): For a particular interval, the total operating life of a
population of an item divided by the total number of failures within the population,
during the measurement involved.
Mean time between maintenance (MTBM): The mean of the distribution of the time
intervals between maintenance actions (preventive, corrective or both).
Mean time to repair (MTTR): The total corrective maintenance time divided by the total
number of corrective maintenance actions during a given period of time.
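The MTBF and MTTR definitions above reduce to two simple ratios; the sketch below applies them to invented log data (hours and failure counts are examples, not data from the text):

```python
# Invented example data for a small population of repairable items.
operating_hours = [1200.0, 950.0, 1100.0]    # per-unit operating time (h)
failures = 4                                  # total failures observed
repair_hours = [2.5, 1.0, 3.5, 1.0]           # corrective maintenance times (h)

# MTBF: total operating life of the population / total number of failures.
mtbf = sum(operating_hours) / failures

# MTTR: total corrective maintenance time / number of corrective actions.
mttr = sum(repair_hours) / len(repair_hours)

print(mtbf)  # 812.5
print(mttr)  # 2.0
```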
Metallisation: A film pattern (single or multilayer) of conductive material deposited on a
substrate to interconnect electronic components, or the metal film on the bonding area of
a substrate which becomes a part of the bond and performs both an electrical and a
mechanical function.
Microcrack: A thin crack in a substrate or chip device that can only be seen under
magnification and which can contribute to latent failure phenomena.
Micro-via: Miniature holes (up to 6 µm in diameter) for connections between different layers
of a multilayer printed-circuit board.
Migration: An undesirable phenomenon whereby metal ions, notably silver, are transmitted
through another metal, or across an insulated surface, in the presence of moisture and an
electrical potential.
Mil: A unit equal to 0.001 inch or 0.0254 mm.
Mission profile: Specific task which must be fulfilled by an item during a stated time under
given conditions.
Mother board: A circuit board used to interconnect smaller circuit boards called "daughter
boards".
Multichip module: An electronic package that contains more than one die.
Multilayer substrates: Substrates that have buried conductors so that complex circuitry can
be handled.
Noise: Random small variations in voltage or current in a circuit due to the quantum nature
of electronic current flow, thermal considerations, etc.
Nonconformity: Nonfulfilment of a specified requirement.
Observability: The possibility to check internal signals at the outputs.
Observed reliability of non-repaired items (for a stated period of time): The ratio of the
number of items which performed their functions satisfactorily at the end of the period to
the total number of items in the sample at the beginning of the period. The criteria for
what constitutes satisfactory function shall be stated.
Operating conditions: The loading or demand cyclic operation, or both, of an item between
zero and 100% of its rated capability(ies).
Operating time: The period of time during which an item performs its intended function.
Operational reliability (software): The reliability of a system or software subsystem in its
actual use environment. Operational reliability may differ considerably from reliability in
the specified or test environment.
Optoelectronics: Technology dealing with the coupling of functional electronic blocks by
light beams.
Overlay: One material applied over another material.
Package: The container for an electronic component with terminals to provide electrical
access to the inside of the container.
Pad: 1) A device inserted into a circuit to introduce transmission loss or to match
impedances. 2) A metal electrode that is connected to the output of a diathermy machine
and placed on the body over the region being treated.
Passivation (corrosion): 1) The process(es) (physical or chemical) by means of which a
metal becomes passive. 2) A process in which a dielectric material is diffused over the
entire wafer to provide mechanical and environmental protection for the circuits. Also
called glassivation.
Pattern sensitivity: The device response varies with the test pattern, reflecting differences in
address and/or data sequences; may also reflect timing and voltage specifications being
too close to actual failure regions.
Photoresist: A photosensitive plastic coating material which, when exposed to UV light,
becomes hardened and is resistant to etching solutions. Typical use is as a mask in
photochemical etching of thin films.
Pitch (plastic packages): Separation between adjacent conductors.
Planar technique: The formation of p-type and/or n-type regions in a semiconductor crystal
by diffusing impurity atoms into the crystal through holes in an oxide mask, which is on
the surface. The latter is left to protect the junctions so formed against surface
contamination.
Pinhole: A small hole occurring as an imperfection which penetrates entirely through a film
element, such as a metallisation film or dielectric film.
Power dissipation: The dispersion of the heat generated from a film circuit when a current
flows through it. (Pd = dissipated power)
Preconditioning: A stress test (or combination of stress tests) applied to devices (e.g. high-
temperature storage, operating life, storage life, blocking life, humidity life, HTRB,
temperature cycles, mechanical sequence (which includes solderability), etc.), after which
screening criteria are applied to separate good units from bad ones. These criteria may be
any combination of absolute values and parameter-shift levels agreed to by the parties
involved.
Preform: To aid in soldering or adhesion, small circles or squares of the solder or epoxy are
punched out of thin sheets. These preforms are placed on the spot to be soldered or
bonded, prior to the placing of the object to be attached.
Probability density function: The first derivative of the probability distribution function;
integrating it over an interval gives the probability of obtaining a value in that interval.
Product assurance: All planned and systematic activities necessary to reach specified
targets for the reliability, maintainability, availability, and safety of an item, as well as to
provide adequate confidence that the item will meet given requirements for quality.
Product liability: Responsibility on a manufacturer (or others) to compensate for losses
related to injury to persons, material damage, or other unacceptable consequences caused
by a product.
Pull test: A test for bond strength of a lead, interconnecting wire, or a conductor.
Purple plague: One of several gold-aluminium compounds (very brittle, potentially leading
to time-based failure of the bonds), formed when bonding gold to aluminium and
activated by re-exposure to moisture and high temperature (> 340°C).
Quality: 1) A measure of the degree to which a device conforms to applicable specification
and workmanship standards. 2) Totality of features and characteristics of an item (product
or service) that bear on its ability to satisfy stated or implied needs.
Quality assurance: All planned and systematic activities necessary to provide adequate
confidence that an item (product or service) will satisfy given requirements for quality.
Quality, average outgoing: The ultimate average quality of products shipped to the
customer that results from composite sampling and screening techniques.
Quality defect: A defect which may be found by employing normal quality control
inspection equipment and procedures, without stressing the component.
Quality test: Test to verify whether an item conforms to specified requirements.
Rad(Si): A unit of energy absorbed by silicon from radiation, equivalent to 0.01 J/kg.
Randomness: The occurrence of an event in accordance with the laws of chance.
Redundancy: In an item, the existence of more than one means of performing its function.
Redundancy, active: 1) That redundancy wherein all means for performing a given function
are operating simultaneously. 2) That redundancy wherein all redundant items are
operating simultaneously rather than being switched on when needed.
Redundancy, standby: That redundancy wherein the alternative means of performing the
function is inoperative until needed and is switched on upon failure of the primary means
of performing the function.
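The two redundancy definitions can be illustrated with a common textbook model (not part of the definitions themselves, and assuming independent, identical items): for active redundancy, the system survives if at least one of n parallel items survives, giving system reliability 1 - (1 - R)^n.

```python
# Reliability of n identical, independent items in active (parallel)
# redundancy: the system works if at least one item works.
# This is a standard model, stated here as an assumption for illustration.
def active_redundancy(r: float, n: int) -> float:
    return 1.0 - (1.0 - r) ** n

print(round(active_redundancy(0.9, 1), 6))  # 0.9  (no redundancy)
print(round(active_redundancy(0.9, 2), 6))  # 0.99 (duplication)
```

Standby redundancy with a perfect switch yields a still higher figure, since the alternative item accumulates no operating stress until the primary fails.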
Reflow soldering: A method of soldering in which the solder is applied prior to the actual
joining. To solder, the parts are joined and heated, causing the solder to remelt, or reflow.
Refresh sensitivity: A dynamic RAM fails to retain data reliably during the specified
minimum interval between refresh cycles. Failures are due to excessive voltage or current
leakage from the storage element or to a fault in the rewrite circuits.
Relay: An electromechanical device in which contacts are opened and/or closed by
variations in the conditions of one electric circuit and thereby affect the operation of other
devices in the same or other electric circuits.
Reliability: Collective name for those measures of quality that reflect the effect of time in
storage or use of a product, as distinct from those measures that show the state of the
product at the time of delivery.
In the general sense, reliability is defined as the ability of an item to perform a required
function under stated conditions for a stated period of time.
Reliability assurance: The management and technical integration of the reliability activities
essential in maintaining reliability achievements, including design, production and
product assurance.
Reliability data: Data related to the frequency of failure of an item, equipment, or system.
These data may be expressed in terms of failure rate, MTBF, or probability of success.
Reliability engineering (design for reliability): The establishment, during design, of an
inherently high reliability in a product.
Reliability growth: A condition characterised by a progressive improvement of the
reliability of an item with time, through successful correction of design or production
weaknesses.
Reliability growth testing: The improvement process during which hardware reliability
increases to an acceptable level.
Reliability, inherent: The potential reliability of an item present in its design.
Reliability, intrinsic: 1) The probability that a device will perform its specified function,
determined by statistical analysis of the failure rates and other characteristics of the parts
and components the device comprises. 2) The reliability a system can achieve based on
the types of devices and manufacturing processes used.
Reliability test: Tests and analyses carried out in addition to other type tests and designed to
evaluate the level of reliability in a product, as well as the dependability, or stability,
of this level relative to time and use under various environmental conditions.
Replaceability: A measure of the degree to which an item will be replaced within a given
time under specified conditions.
Required function: Function (or combination of functions) of an item which is considered
necessary to provide a given service.
Resist: A protective coating that will keep another material from attaching itself or coating
something, as in solder resist, plating resist, or photoresist.
Resistance: A property of conductors which - depending on their dimensions, material and
temperature - determines the current produced by a given difference of potential; that
property of substance which impedes current and results in the dissipation of power in the
form of heat.
Risk: The probability of making the wrong decision based on pessimistic data or analysis.
SA (Selective Availability): Encryption of P-code signal from GPS satellites, usually by
dithering the frequency, to deny unauthorised users access to precise positioning.
Safety: 1) The conservation of human life and its effectiveness, and the prevention of
damage to items, consistent with mission requirements. 2) Ability of an item to cause
neither injury to persons, nor significant material damage or other unacceptable
consequences.
Sandwich: A packaging method in which components are placed between boards or layers.
Saw street: A space between devices on a semiconductor wafer along which a wafer saw
may carve out individual devices.
SCR (silicon controlled rectifier): (Formal name: reverse-blocking triode thyristor.) A
thyristor that can be triggered into conduction in only one direction. Terminals are called
anode, cathode and gate.
Screening: The process of performing 100% inspection on product lots and removing
defective units.
Screening test: A test (or combination of tests) intended to remove unsatisfactory items or
those likely to exhibit early failure.
Sense amplifier recovery: Tendency of the output (sense) amplifier to favour one logic state
after reading a long string of similar logic states. Alternating 1's and 0's may be read
correctly, while a single bit of a given logic state in a long string of opposite logic states
may be read incorrectly. This is caused by improper charge accumulation in the sense
amplifier.
Shmoo plot: The representation in an x-y diagram of the operating region of an IC as a
function of two parameters. The plots of pass/fail data were dubbed shmoo plots because
of the similarity in shape (fancied or real) to the mythical animal created by Al Capp in
the comic strip "Li'l Abner". There are two types of shmoo plots: the two-axis and the
three-axis shmoo plots. The two-axis shmoo plot generates a printout of the pass/fail
performance of a device as a function of two variables. One variable (1) is held at a fixed
value while searching for the failure limit of the other variable (2); the fixed value of
variable (1) is then changed to a new value and the procedure repeated. If during
this search any evaluated parameter appears to fail or to be marginal to a specific limit, then
the last passing value of the parameter being tested is recorded and the reason for
this condition should be further evaluated. Strictly speaking, the two-axis shmoo plot is
implemented through a sequence of go/no-go functional tests, where each test has slightly
changed test conditions compared to the previous or the subsequent test. Each test
must have a predefined set of input and output stimuli, and at each condition a test pattern
of several million cycles may be run. For each test there is a corresponding pass/fail
indication in the shmoo plot. Each input and output parameter, as well as input and output
timing, should be varied separately until a failure occurs. The three-axis shmoo plot
allows the value of the measured variable to be printed out, thus creating a third axis. The
composite shmoo plot shows the distribution over two variables on a group of devices [26].
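The nested sweep behind a two-axis shmoo plot (hold one variable, sweep the other, record pass/fail) can be sketched as follows. The passes() function is a hypothetical stand-in for a real go/no-go functional test, with an invented pass criterion chosen only to produce a plausible plot shape:

```python
# ASCII two-axis shmoo plot: '+' = pass, '.' = fail.
# passes() is a hypothetical stand-in for a real functional test: here a
# device is assumed to pass when the supply voltage is high enough for the
# requested cycle time (invented criterion, for illustration only).
def passes(vdd: float, cycle_ns: float) -> bool:
    return cycle_ns >= 60.0 - 8.0 * vdd   # faster cycles need more voltage

voltages = [4.0, 4.5, 5.0, 5.5, 6.0]       # variable (1), held per row
cycles = [10.0, 15.0, 20.0, 25.0, 30.0]    # variable (2), swept per column

for vdd in voltages:
    row = "".join("+" if passes(vdd, c) else "." for c in cycles)
    print(f"{vdd:4.1f} V  {row}")
```

Each printed row corresponds to one fixed value of variable (1); the pass region grows toward the upper-right of the diagram, the typical shmoo shape.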
Slice: A single wafer cut from a silicon ingot and forming a thin substrate on which have
been fabricated all the active and passive elements for multiple integrated circuits. A
completed slice usually contains hundreds or thousands of individual circuits.
Slow access time: Charge storage on the output driver circuits or long lines causes excessive
time to sink or source current, thereby increasing access time.
Soft solder: A low-melting solder, generally a lead-tin alloy, with a melting point below
425°C.
Space-charge field: An electric field created by the response of a conductor or
semiconductor to charges outside the surface of the material.
Step-stress test: A test consisting of several stress levels applied sequentially for periods
(steps) of equal duration to a sample; during each period, a stated stress level is applied,
which is increased from one step to the next.
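A step-stress schedule per this definition (equal-duration steps at increasing stress) can be sketched as below. All values are invented, and sample_limits is a hypothetical stand-in for each device's actual stress tolerance:

```python
# Step-stress test: apply increasing stress levels for equal-duration steps
# and record the accumulated hours at which each sample fails.
# sample_limits (the stress each device can survive) is hypothetical.
def step_stress(sample_limits, levels, step_hours):
    """For each sample, hours accumulated until failure (None = survived)."""
    times = []
    for limit in sample_limits:
        t = None
        for step, level in enumerate(levels):
            if level > limit:              # stress now exceeds capability
                t = (step + 1) * step_hours
                break
        times.append(t)
    return times

levels = [125, 150, 175, 200]              # stress levels (e.g. °C), rising
print(step_stress([160, 130, 210], levels, step_hours=48))  # [144, 96, None]
```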
Storage life: The length of time an item can be stored under specified conditions and still
meet specified requirements.
Stress: Any influence, or part of the influences, to which an item is exposed at a certain
instant.
Stress-accelerated corrosion: Corrosion that is accelerated by stress.
Stress, component: The stresses on component parts during testing or use that affect the
failure rate and hence, the reliability of the parts; (voltage, power, temperature, and
thermal environmental stress are included).
Substrate: 1) The supporting material on or in which the parts of an integrated circuit are
attached or made. The substrate may be passive (thin film, hybrid) or active (monolithic
compatible). 2) A material on the surface of which an adhesive substance is spread for
bonding or coating; any material which provides a supporting surface for other materials,
especially materials used to support printed-circuit patterns.
Survivability: The measure of the degree to which an item will withstand hostile man-made
environments and not suffer abortive impairment of its ability to accomplish its
designated mission.
System: Aggregate of components, assemblies, and subsystems, as well as skills and
techniques, capable of performing and/or supporting autonomously an operational role.
System effectiveness: A measure of the degree to which an item can be expected to fulfil a
set of specified mission requirements, which may be expressed as a function of
availability, dependability, and capability.
Systems engineering: Application of the mathematical and physical sciences to develop
systems that utilise resources economically for the benefit of society.
Testability: A measure of the degree of failure detection and isolation, the correctness of
the results, and the test duration. It is improved by increasing observability and
controllability.
Test to failure: The practice of inducing increased electrical and mechanical stresses in
order to determine the maximum capability of a device so that conservative use in
subsequent applications will thereby increase its life through the derating determined by
these tests.
Thermal noise: Noise that is generated by the random thermal motion of charged particles
in an electronic device.
Thermal resistance: A measure of the ability of an interface to evacuate heat (e.g. Rth(j-a)
= thermal resistance, junction to ambient).
Time, mission: That element of uptime during which the item is performing its designated
mission.
Time, up (uptime): The element of active time during which an item is either alert, reacting,
or performing a mission.
Time, down (downtime): That element of time during which the item is not in condition to
perform its intended function.
TO package: Abbreviation for transistor outline, established as an industry standard by
JEDEC of the EIA.
Transfer molding: An automated type of compression molding in which a preform of
plastic is melted and forced into a hot mold cavity.
Underencapsulant: The plastic material that is dispensed between a flip-chip and package
in liquid form and then thermally cured to provide mechanical and environmental
protection to the active surface of a die.
Uptime ratio: The quotient of uptime divided by uptime plus downtime.
Useful life: Total operating time of an item, ending for a nonrepairable item when the failure
probability becomes too high or the item's functionality is obsolete, and for a repairable
item when the intensity of failures becomes unacceptable or when a fault occurs and the
item is considered to be no longer repairable.
Varistor: A two-electrode semiconductor device with a voltage-dependent non-linear re-
sistance that drops markedly as the applied voltage is increased.
Vibration: An oscillation wherein the quantity is a parameter that defines the motion of a
mechanical system.
WAAS (Wide-Area Augmentation Service): A service that uses geostationary satellites
and a network of ground stations to compute GPS integrity and differential correction
information and to transmit that data to mobile receivers.
Wafer: A thin semiconductor slice (of silicon, germanium or GaAs) with parallel faces on
which matrices of microcircuits or individual semiconductors can be formed. After
processing, the wafer is separated into dice or chips containing individual circuits.
Wearout: The process of attrition that results in an increase of hazard rate with increasing
age (cycles, time, miles, events, and so on as applicable for the item). Wearout and/or
fatigue could perhaps be explained with the theory of the limiting distribution (extreme-
value theory): a mechanical piece would fail if any single ("weak") spot fails.
Wearout failure: A failure caused by a mechanism that is related to the physics of a device,
its design, and process parameters. Wearout failures should be distinguished from random
failures, which are associated with the variability of workmanship quality.
Whisker: A very small, hairlike metallic growth (a micron size single crystal with a tensile
strength of the order of one million psi) on a metallic circuit component.
Wire bond: Includes all the constituent components of a wire electrical connection such as
between the terminal and the semiconductor.
Wire bonding: The method used to attach very fine wire to semiconductor components to
interconnect these components with each other or with package leads.
Yield: The ratio of usable components at the end of a manufacturing process to the number
of components initially submitted for processing.
Zener breakdown: A breakdown that is caused by the field emission of charge carriers in
the depletion layer.
Zener diode: A class of silicon diodes that exhibit in the avalanche-breakdown region a
large change in reverse current over a very narrow range of reverse voltage.
Zener effect: A reverse-current breakdown due to the presence of a high electric field at the
junction of a semiconductor or insulator.

Sources

1 Băjenescu, T. I. (1985): Zuverlässigkeit elektronischer Komponenten. Glossar. VDE-Verlag, Berlin
2 Băjenescu, T. I. (1996): Handbuch der Telematik. Akronyme und Abkürzungen. Fachpresse Goldach
Verlag (Switzerland)
3 Begriffserläuterungen und Formelzeichen im Bereich der statistischen Qualitätskontrolle. Beuth-
Vertrieb, Köln, 1980
4 Benedetto, J. M. (1998): Economy-class ion-defying ICs in orbit. IEEE Spectrum, no. 3, March, pp.
36-41
5 Birolini, A. (1978): Zusammenhang zwischen Qualitätssicherung und Zuverlässigkeit. Informa-
tionstagung SEV & GESO, Fribourg
6 Birolini, A. (1997): Quality and Reliability of Technical Systems. Springer, Berlin
7 Brewer, R. (1972): Reliability terms and definitions based on the conceptual relationship between
reliability and quality. Microelectronics and Reliability, vol. 11, pp. 435-461
8 Caillat, J. (1976): Contribution au test des CI logiques. Thèse, Université de Grenoble
9 Calabro, S. R. (1962): Reliability principles and practices, Appendix I, McGraw-Hill, New York
10 CEI Publication 134 (1961)
11 CEI Publication 147-0 (1966)
12 CEI-56 (Bureau central) 62
13 CEI-56 (Secr.) 84
14 DIN 40040
15 DIN 40041
16 DIN 40042
17 DIN 40043
18 EOQC Glossary
19 Glossary of terms (1982). International Society for Hybrid Microelectronics
20 Graf, R. F. (1977): Modern Dictionary of Electronics. Howard W. Sams & Co., Inc., Indianapolis,
Indiana, USA
21 Greene, A. E.; Bourne, A. J. (1972): Reliability technology. Wiley Interscience, London, pp. 622-
627
22 Harper, C. A. (editor in chief) (1991): Electronic packaging and interconnection handbook. McGraw-
Hill Inc., New York
23 IEC Publication 271
24 IEEE Standard 352 (1975)
25 ISO (1977-07-01): Norme internationale 3534 - Statistique - Vocabulaire et symboles
26 Jay, F. (editor in chief) (1984): IEEE Standard Dictionary of Electrical and Electronics Terms. The
Institute of Electrical and Electronics Engineers, Inc., New York, NY
27 Jensen, E.; Schneider, B. (1979): Characterization of random access memories. ECR-93
28 Jones, R. D. (1982): Hybrid Circuit Design and Manufacture. M. Dekker, New York and Basel
29 Lyon-Caen, R.; Crozet, J. M. (1977): Microprocesseurs et microordinateurs. Masson, Paris
30 Metzger, G.; Vabre, J. P. (1974): Les mémoires électriques. Masson, Paris
31 MIL-STD-721B
32 Naresky, J. J. (1958): RADC reliability notebook; glossary. McGraw-Hill, New York
33 Neufang, O. (1983): Lexikon der Elektronik. Braunschweig, Wiesbaden
34 Reiche, H. (1972): Reliability definitions. Microelectronics and Reliability, vol. 11, pp. 425-427
35 Ryerson, C. M. (1957): Glossary and dictionary of terms and definitions relating specifically to re-
liability. Third national symposium for reliability and quality control, Washington D. C., 15th
January, Papers, pp. 59-84
36 Thompson, P. (1997): Chip-scale packaging. IEEE Spectrum, August, pp. 36-43
37 Tummala, R. R.; Rymaszewski, E. J. (1989): Microelectronics packaging handbook. Van Nostrand
Reinhold, New York
38 UIT - Répertoire des définitions des termes essentiels utilisés dans le domaine des télécommuni-
cations; partie 1; 2ème impression, Genève, 1961
List of abbreviations

Acronyms of some international organisations

ACM Association for Computing Machinery


AFCIQ Association Française pour le Contrôle Industriel de la Qualité
AFNOR Association Française pour la NORmalisation
AGREE Advisory Group on Reliability of Electronic Equipment
ANIE Associazione Nazionale della Industria Elettrotecnica
ANSI American National Standards Institute
AOSM Arab Organisation for Standardisation Metrology
AQAP Allied Quality Assurance Publications
ARINC Aeronautical Radio INCorporated
ASE Agence Spatiale Européenne
ASE Association Suisse des Electriciens
ASQC American Society for Quality Control
BSI British Standards Institution
BWB Bundesamt für Wehrtechnik und Beschaffung
CAMESA Canadian Military Electronics Standards Agency
CECC Cenelec Electronic Components Committee
CEF Comité Electrotechnique Français
CEI Commission Electrotechnique Internationale
CEMEC Committee of European Manufacturers of Electronic
Components
CENEL Comité Européen de Coordination des Normes
Electrotechniques
CENELEC European Committee for Electrotechnical
Standardisation
CNES Centre National d'Etudes Spatiales
CNET Centre National d'Etudes des Telecommunications
COPEP Commission Permanente de l'Electronique du Plan
CSA Canadian Standards Association
DEK Denmark Electr. Kom.
DFVLR Deutsche Forschungs- und Versuchsanstalt für Luft- und
Raumfahrt e. V.
DGQ Deutsche Gesellschaft für Qualität
DGQA Director General for Quality Assurance
DIN Deutsches Institut für Normung
DKE Deutsche Elektrotechnische Kommission
DOD Department of Defense
ECCOG ESA SCC/CECC Coordination Group
ECQAC Electronic Components Quality Assurance Committee


EIA Electronics Industries Association
EOQC European Organisation of Quality Control
EPFL Ecole Polytechnique Fédérale de Lausanne (Switzerland)
ESA European Space Agency
ESRO European Space Research Organisation (from 1975: ESA)
ESTEC European Space Research and Technology Centre
ETH Eidgenössische Technische Hochschule
EXACT international EXchange of Authenticated electronic
ComponenTs Perf.
GIDEP Government-Industry Data Exchange Program
GPO Government Printing Office
GRD Gruppe für Rüstungsdienste (Switzerland)
IAQ International Association for Quality (see ASQC)
IEC International Electrotechnical Commission
IECQ IEC Quality assessment system for electronic components
IEEE Institute of Electrical and Electronics Engineers
IES Institute of Environmental Sciences
lSI International Statistical Institute
ISO International Organisation for Standardisation
ISUP Institut de Statistique de l'Université de Paris
JUSE Union of Japanese Scientists and Engineers
MEL Military Electronics Laboratory (Sweden)
MIL Military
NASA National Aeronautics and Space Administration
NIVR Netherlands Agency for Aerospace
NTIS National Technical Inf. Services
RAE Royal Aircraft Establishment
RADC/RAC Rome Air Development Centre / Reliability Analysis Centre
RETMA Radio Electronics and Television Manufacturers Association
SAQ Schweizerische Arbeitsgemeinschaft für Qualitätsförderung
SCCG Space Components Co-ordination Group
SEV Schweizerischer Elektrotechnischer Verein
SNV Schweizerische Normen-Vereinigung
SPARICERCA Ministero della Ricerca Scientifica
UTE Union Technique de l'Electricité

Some useful abbreviations

β Current gain of a transistor; see hFE


AAS Atomic Absorption Spectrometry
ABSS Atomic Beam Surface Scattering
AEAPS Auger Electron Appearance Potential Spectroscopy
AES Auger Electron Spectroscopy
AFM Atomic Force Microscopy
AQL Acceptable Quality Level
Al Aluminium
APCVD Atmospheric-Pressure Chemical Vapour Deposition
APS Appearance Potential Spectroscopy
ATE Automatic Test Equipment


ATR-IR Attenuated Total Reflection-Infrared Spectroscopy
BIR Building-In Reliability
BL Luminous intensity of an area light source, usually expressed
in candela/unit area
Br Radiant intensity of an area source; radiance, usually
expressed in watts/unit area
BLE Bombardment-induced Light Emission
BS British Standard
CAM Content Addressable Memory
CCD Charge-Coupled Devices
CERDIP CERamic Dual-In-Line Package
CFR Constant Failure Rate
Cf Cost of a field defect
Ci Cost to inspect one unit
CMOS Complementary Metal Oxide Semiconductor
COPRQ Cost Of Poor Reliability/Quality
C.T. Color Temperature
CTL Complementary Transistor Logic
CTR Current Transfer Ratio (of an optocoupler)
CVD Chemical Vapour Deposition
DAM Direct Access Memory
DAPS Disappearance Potential Spectroscopy
d.c. direct current
DFR Decreasing Failure Rate
DIP Dual In-line Package
di/dt Critical rate-of-rise of current rating of a thyristor
DMOS Diffused Metal-Oxide Semiconductor
DRAM Dynamic Random Access Memory
dv/dt Critical rate-of-rise of voltage parameter of a thyristor
E Illumination. Luminous flux density incident on a receiver,
usually in Lumens per unit of surface
Ea activation energy (eV)
EAPROM Electrically Alterable PROM
EAROM Electrically Alterable Read-Only Memory
EBIC Electron Beam Induced Current
ECL Emitter-Coupled Logic
EELS Electron Energy-Loss Spectrometry
EEPROM Electrically Erasable PROM
EEROM Electrically Erasable ROM
ELL Ellipsometry
EMI ElectroMagnetic Interference
EMP ElectroMagnetic Pulse
EMP(x) Electron Probe Microanalysis
EOS Electrical OverStress
EPROM Erasable Programmable ROM
ESCA Electron Spectroscopy for Chemical Analysis
ESD ElectroStatic Discharge
ETM ElectroThermoMigration

eV electron volts
F Illumination. Total luminous flux incident on a receiver,
normally in lumens
FAB-MS Fast Atom Bombardment-Mass Spectroscopy
FAMOS Floating-gate Avalanche-injection Metal-Oxide Semiconductor
FEM Field Electron Microscopy
FET Field-Effect Transistor
FIM Field Ion Microscopy
FIT Failure In unit Time (1 failure per 10⁹ device hours)
FLOTOX FLOating-gate Tunnel OXide
Ga Gallium
GaAs Gallium Arsenide
GDMS Glow Discharge Mass Spectrometry
GDOES Glow Discharge Optical Emission Spectrometry
GIDL Gate-Induced Drain Leakage
H Irradiance. Radiant flux density incident on a receiver, usually
in watts per unit area
HAST Highly Accelerated Stress Test
HDBH High Day Busy Hour
HE Effective irradiance. The irradiance perceived by a given
receiver, usually in effective watts per unit area
HEIS High-Energy Ion Scattering
HEMT High Electron Mobility Transistors (MODFET)
hFE Current gain of a transistor biased common emitter. The ratio
of collector current to base current at specified bias conditions
HMOS High-performance, n-channel silicon gate MOS
HTOT High-Temperature Operating Tests
HTRB High Temperature Reverse Bias operating life test
IB Transistor base current
IBSCA Ion Bombardment Surface Chemical Analysis
IC Integrated Circuit
IC Transistor collector current
ID Dark current. The leakage current of an unilluminated
photodetector
IE Transistor emitter current
IF Forward bias current, usually of an IRED.
Subscripts denote measurement or stress bias conditions, if
required
IETS Inelastic Electron Tunneling Spectroscopy
IFR Increasing Failure Rate
IL Light current. The current through an illuminated
photodetector, at specified bias conditions
ILEED Inelastic Low-Energy Electron Diffraction
ILS Ionisation-Loss Spectrometry
IMMA Ion Microprobe Mass Analysis
IR Infrared
IRAS Infrared Absorption Spectrometry
IRED Infrared emitting diode
ISS Ion Scattering Spectrometry

I-V current-voltage (characteristic)


J current density (A/m²)
JAN Joint Army-Navy (specification)
JEDEC Joint Electron Device Engineering Council
k Boltzmann's constant
k kilo
K Kelvin
L Luminance of an area source of light, usually in lumens per
unit area
LAMMA Laser Microprobe Mass Analysis
LASCR Light Activated Silicon Control Rectifier
LED Light Emitting Diode
LEED Low-Energy Electron Diffraction
LEIS Low-Energy Ion Scattering
LMP Laser Microprobe Analysis
LOES Laser Optical Emission Spectrometry
LSI Large Scale Integration
LTFRD Lot Tolerance Fractional Reliability Deviation
LTPD Lot Tolerance Percent Defective
λ Predicted failure rate of an electronic component subjected to
specified stress and confidence limit
λ Wavelength of radiation
m Magnification of a lens
MC MultiChip
MECL Motorola Emitter-Coupled Logic
MESFET Metal-Semiconductor FET
MIL MILitary electronics
mil One-thousandth of an inch
MIL-HDBK MILitary HanDBooK
MIL-STD MILitary STandarD
MISFET Metal Insulator Semiconductor Field-Effect Transistor
MNOS Metal-Nitride-Oxide-Semiconductor
MOS Metal-Oxide Semiconductor
MOSFET Metal-Oxide-Semiconductor Field-Effect Transistor
MOSM Metal-Oxide SemiMetal
MOST Metal-Oxide Semiconductor Transistor
MRL Mean Reliability Level
MSCP Mean Spherical Candle Power
MSI Medium Scale Integration
MTBF Mean Time Between Failures
MTTF Mean Time To Failure
MTTFF Mean Time To First Failure
MTTR Mean Time To Repair
η Conversion efficiency of an electrically powered source. The
ratio of radiant power output to electrical power input
NAA Neutron Activation Analysis
NEP Noise Equivalent Power
NEBS Network Equipment-Building System
nm nanometer (10⁻⁹ m)

NMOS N-channel (type) MOS


NPR Noise Power Ratio
NRA Nuclear Reaction Analysis
OEM Original Equipment Manufacture
OES Optical Emission Spectrometry
OS Operating System
P Power
P material doped or polarised to give positive (hole) charge
carriers
P+ material doped to give excess positive charge carriers (low
resistivity)
PAM Photo-Acoustic Microscopy
PD Power dissipated as heat
PDA Personal Digital Assistant
PED Plastic Encapsulated Device
PIND Particle Impact Noise Detector
PMOS p-channel (type) MOS
PPS Repetition rate in Pulses Per Second
PRM Pulse Rate Modulation (coding an analogue signal on a train
of pulses by varying the time between pulses)
PROM Programmable ROM
ps picosecond (10⁻¹² s)
PSG Phospho Silicate Glass
PUT Programmable Unijunction Transistor
Q charge
QA Quality Assurance
QC Quality Control
QCRIT critical charge in memory devices
QMP Quality Measurement Plan
QRA Quality and Reliability Assurance
R Resistance
RAM Random Access Memory
RBS Rutherford Back-Scattering Spectrometry
REDR Recombination Enhanced Defect Reactions
REPROM REProgrammable ROM
RETMA Radio Electronics and Television Manufacturers Association
RH Relative Humidity
RHEED Reflected High-Energy Electron Diffraction
R&M Reliability and Maintainability
ROCOF Rate of Occurrence Of Failure
ROM Read-Only Memory
RPP Reliability Prediction Procedure
RSER residual SER
ρ resistivity
SAM Serial Access Memory
SCR Silicon Controlled Rectifier
SEM Scanning Electron Microscope
SER Soft Error Rates

Si Silicon
SILOX SILicon Oxide
SIMS Secondary-Ion Mass Spectrometry
Si₃N₄ silicon nitride
SiO₂ silicon dioxide
Sn tin
SNMS Sputtered Neutral Mass Spectrometry
SOA Safe Operating Area
SOI Silicon On Insulator
SOS Silicon On Sapphire
SQPA Software Quality Program Analysis
SRAM static RAM
SRQAC Software Reliability and Quality Acceptance Criteria
SSI Small Scale Integration
SSMS Spark Source Mass Spectrometry
STEM Scanning Transmission Electron Microscopy
SXAPS Soft X-ray Appearance Potential Spectroscopy
SWC Solderless Wire Wrap Connecting
T temperature (°C or K)
t Time
TA Ambient temperature
TC Case temperature
TEELS Transmission Electron Energy-Loss Spectrometry
TEM Transmission Electron Microscope
TEM-ED Transmission Electron Microscope - Electron Diffraction
THB Temperature Humidity Bias
THDBH Ten High Day Busy Hour
TIR Testing-In Reliability
TJ Junction temperature
TO Transistor Outline
TRXRFA Total Reflection X-ray Fluorescence Analysis
TTL Transistor-Transistor Logic
TTL-LS Transistor-Transistor Logic - Low power Schottky
TTS Transistor-Transistor logic Schottky barrier
UCL Upper Confidence Level
UJT UniJunction Transistor
UL Underwriters Laboratories
ULSI Ultra Large Scale Integration
UPS Ultraviolet Photoelectron Spectrometry
UV UltraViolet
V Voltage / Volts
VLSI Very Large Scale Integration
VMOS V-groove MOS / Vertical MOS
VPE Vapor Phase Epitaxy
VT threshold voltage
W Radiant emittance
X-ray energetic high-frequency electromagnetic radiation
XAES X-ray-induced Auger Electron Spectrometry
XPD X-ray Photoelectron Diffraction

XPS X-ray Photoelectron Spectrometry


XRFA X-ray Fluorescence Spectrometric Analysis
XRM X-Ray Microanalyser
XRPM X-Ray Projection Microscope
ZD Zero Defect
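Several of the abbreviations above (FIT, λ, CFR, MTTF, MTBF) are linked by simple arithmetic: one FIT is 10⁻⁹ failures per device-hour, and under the CFR (constant failure rate) model the mean time to failure is MTTF = 1/λ with survival probability R(t) = e^(−λt). A minimal Python sketch of these conversions (the function names are ours, for illustration only):

```python
import math

def fit_to_lambda(fit):
    """Convert a failure rate in FIT (failures per 10^9 device hours)
    to failures per hour."""
    return fit * 1e-9

def mttf_hours(fit):
    """Mean time to failure (hours), assuming a constant failure rate (CFR)."""
    return 1.0 / fit_to_lambda(fit)

def reliability(fit, hours):
    """Survival probability R(t) = exp(-lambda * t) under the CFR model."""
    return math.exp(-fit_to_lambda(fit) * hours)

# Example: a component rated at 100 FIT
print(mttf_hours(100))          # ≈ 1e7 hours MTTF
print(reliability(100, 87600))  # ≈ 0.991 survival over 10 years
```

For a repairable item with negligible repair time, the same 1/λ figure is often quoted as MTBF, which is why the two terms are frequently (if loosely) interchanged in data sheets.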

References

1 Băjenescu, T. I. (1996): Fiabilitatea componentelor electronice. Ed. Tehnică,
Bucharest (Romania)
2 Birolini, A. (1997): Quality and Reliability of Technical Systems. Springer,
Berlin
3 DGQ 11-04 (1987): Begriffe im Bereich der Qualitätssicherung, 4. Auflage
4 DIN 31051 (1985): Instandhaltung; Begriffe
5 EOQC (1989): Glossary of Terms Used in the Management of Quality, 6th Edition
6 IEC 50(191): International Electrotechnical Vocabulary (IEV) - Chapter 191:
Dependability and Quality of Service, 1990
7 IEEE, ANSI/IEEE Std. 100-1988 (1990): IEEE Standard Glossary of Software
Engineering Terminology
8 ISO 8402 (1986): Quality Vocabulary
9 Werner, H. W.; Garten, R. P. H. (1984): A comparative study of methods for
thin-film and surface analysis. Reports on Progress in Physics, vol. 47, no. 3,
pp.221-344
POLYGLOT DICTIONARY OF RELIABILITY TERMS

English | French | German

accelerated test essai accéléré Zeitraffungsprüfung
acceptable quality level niveau de qualité acceptable annehmbare Qualitätsgrenze
acceptable reliability level ARL niveau de fiabilité acceptable annehmbare Zuverlässigkeitsgrenze
acceptance sampling plan plan de contrôle par attributs Stichprobenplan
acceptance test essai d'acceptation Annahmeprüfung
active maintenance time temps de maintenance active tatsächliche Wartungsdauer
active redundancy redondance active aktive Redundanz (heisse Reserve)
active repair time temps de réparation (active) Instandsetzungsdauer
adaptability adaptabilité Anpassungsfähigkeit
ageing prévieillissement Voraltern, Vorbehandlung
availability disponibilité Verfügbarkeit
average life durée moyenne de vie durchschnittliche Lebensdauer
binomial distribution distribution binomiale Binomialverteilung
blackout défaillance totale (complète) Gesamtausfall
breakdown voltage tension de claquage Durchbruchspannung
482 Polyglot dictionary of reliability terms

capability capacité Leistungsfähigkeit
component failure défaillance composant Bauelementeausfall
confidence level niveau de confiance Vertrauensbereich, Konfidenzbereich
corrective maintenance maintenance corrective Instandsetzung
debugging déverminage Frühausfallbeseitigung
defect défaut Fehler
degradation dégradation Leistungsminderung
degradation failure défaillance progressive Driftausfall
density function densité de probabilité Verteilungsdichte
derating dévaluation Unterlastung
derating factor coefficient de réduction de charge Unterlastungsgrad
derating technique technique de dévaluation Unterlastungstechnik
design development test test de développement Entwicklungsversuch
design inherent reliability fiabilité de conception (inhérente) Entwurfszuverlässigkeit
design qualification test test de qualification (conception, projet) Konstruktionszulassungsprüfung
design review examen critique du projet Entwurfsüberprüfung
deviation écart Abweichung
distribution of cumulative failure frequency distribution de la fréquence cumulée de défaillance Ausfallsummenverteilung
downtime durée d'indisponibilité Ausfalldauer
drift dérive (drift) Drift
drift failure défaillance par dérive Driftausfall
duty cycle facteur d'utilisation Tastverhältnis
early failure défaillance précoce Frühausfall
environment environnement Umgebungsklima
environmental condition condition d'environnement Umgebungsbedingung
environmental stress contrainte d'environnement Umgebungsbedingung
equipment appareil Gerät
estimation of parameters estimation des paramètres Parameterschätzung
expected value espérance mathématique Erwartungswert

failure défaillance Ausfall
failure analysis analyse des défaillances Ausfallanalyse
failure characteristics caractéristiques de défaillance Ausfallkennzeichen
failure criterion critère de défaillance Ausfallkriterium
failure density densité des défaillances Ausfallhäufigkeitsdichte
failure density distribution distrib. de la densité des défaillances Ausfalldichte
failure frequency fréquence des défaillances Ausfallhäufigkeit
failure frequency distribution distr. de la fréquence des défaillances Ausfallhäufigkeitsverteilung
failure probability densité de probabilité de la défaillance Ausfallwahrscheinlichkeitsdichte
failure quota taux de défaillance observé Ausfallquote
failure rate taux de défaillance Ausfallrate
field performance comportement en exploitation Einsatzverhalten
full operating time temps de fonctionnement intégral Vollbetriebszeit
functional redundancy redondance fonctionnelle funktionsbeteiligte Redundanz
functional stress contrainte fonctionnelle funktionsbedingte Beanspruchung
hermeticity étanchéité Dichtheit
incremental probability of failure probabilité de défaillance pour une période donnée inkrementale Ausfallwahrscheinlichkeit
infant mortality mortalité infantile Frühausfallphase
inherent reliability fiabilité inhérente Entwurfszuverlässigkeit
initials nombre initial (de dispositifs) Anfangsbestand
instant of failure instant d'apparition de la défaillance Ausfallzeitpunkt
intermeshing réseau maillé Vermaschung
interval reliability fiabilité par intervalle Intervall-Zuverlässigkeit
item dispositif (unité) Betrachtungseinheit
life (longevity) durée de vie Lebensdauer
life test essai de durée de vie Lebensdauerversuch
lightly loaded redundancy redondance faiblement chargée leicht belastete Redundanz
maintainability maintenabilité Wartbarkeit
maintenance maintenance Wartung

maintenance support index indice de maintenance Wartungsindex
malfunction dérangement Störung
malfunction time durée de dérangement Störungsdauer
maximum limited stress contrainte tolérée Grenzbeanspruchung
mean (medium) life durée de vie moyenne mittlere Lebensdauer
mean cycles to failure nombre moyen de cycles jusqu'à la défaillance Zyklenzeit
mean maintenance time temps moyen de maintenance mittlere Wartungszeit
mean time between failures (MTBF) moyenne des temps de bon fonctionnement (MTBF) mittlerer Ausfallabstand
mean time to failure MTTF durée moyenne avant défaillance mittlere ausfallfreie Zeit
mean time to repair moyenne des temps des tâches de réparation mittlere Reparaturzeit
nominal reliability fiabilité nominale Nennzuverlässigkeit
normal (random) failure period période à taux de défaillance constant Periode konstanter Ausfallrate
open circuit circuit ouvert Unterbrechung
operating (working) data données d'exploitation Betriebsdaten
operating characteristics caractéristique d'acceptation Annahmekennlinie
operating path chaîne fonctionnelle Operationspfad
operating time temps de fonctionnement Betriebszeit
operational cycle cycle de fonctionnement Arbeitszyklus
operational reliability fiabilité opérationnelle Betriebszuverlässigkeit
overall reliability fiabilité totale Gesamtzuverlässigkeit
overstress surcontrainte Überbeanspruchung
part, component composant (Bau)-Element, Komponente
partial failure défaillance partielle Teilausfall
partial operating time temps de fonctionnement partiel Teilbetriebszeit
partial redundancy redondance partielle Teilredundanz
passive (standby) redundancy redondance passive (séquentielle ou de commutation) passive Redundanz (kalte Reserve)
percentile pourcentage prozentualer Anteil

preventive maintenance maintenance préventive vorbeugende Wartung
probability of acceptance probabilité d'acceptation Annahmewahrscheinlichkeit
probability of error probabilité d'erreur Irrtumwahrscheinlichkeit
probability of survival probabilité de survie Überlebenswahrscheinlichkeit
quality qualité Qualität
quality assurance assurance qualité Qualitätssicherung
quality control contrôle (de) qualité Qualitätskontrolle
random failure défaillance aléatoire Zufallausfall
redundancy redondance Redundanz
reliability fiabilité Zuverlässigkeit
reliability allocation répartition de la fiabilité Zuverlässigkeitsaufteilung
reliability assurance assurance fiabilité Zuverlässigkeitssicherung
reliability characteristics caractéristiques de fiabilité Zuverlässigkeitskenngrösse
reliability data données de fiabilité Zuverlässigkeitsangaben
reliability demonstration démonstration fiabilité Zuverlässigkeitsnachweis
reliability estimation estimation de fiabilité Zuverlässigkeitsschätzung
reliability function fonction de fiabilité Zuverlässigkeitsfunktion
reliability prediction prévision de fiabilité Zuverlässigkeitsaussage
restorability aptitude à la remise en état Instandsetzbarkeit
reversible change variation réversible umkehrbare Änderung
safety sécurité Sicherheit
sample échantillon Stichprobe
screen (to) sélectionner auswählen
secondary failure défaillance seconde (dépendante) Folgeausfall
shock choc Stoss
short court-circuit Kurzschluss
solderability soudabilité Lötbarkeit
stabilization bake stabilisation au four (à haute température) Hochtemperaturstabilisierung
static likelihood probabilité d'énonciation Aussagewahrscheinlichkeit
statistical quantities paramètres statistiques statistische Grössen

step-stress contrainte échelonnée stufenweise Beanspruchung
stress contrainte Beanspruchung
stress cycle cycle de contrainte Beanspruchungszyklus
sudden failure défaillance soudaine Sprungausfall
survival function fonction de survie Bestandsfunktion
survivals dispositifs survivants Bestand
system effectiveness efficacité d'un système Systemwirksamkeit
systematic failures défaillances systématiques systematische Ausfälle
temporary failure frequency pourcentage instantané de défaillance temporäre Ausfallhäufigkeit
terms of probability termes relatifs aux probabilités Wahrscheinlichkeitsbegriffe
test data données d'essais Testdaten
test reliability fiabilité en essai Prüfzuverlässigkeit
testability testabilité Prüfbarkeit
thermal cycling essai cyclique thermique Temperaturzyklen
thermal fatigue fatigue thermique thermische Ermüdung
time data données relatives au temps Zeitangaben
unreliability non-fiabilité Unzuverlässigkeit
uptime durée de disponibilité Klarzeit
useful life durée de vie utile Brauchbarkeitsdauer
useful life vie utile (durée de vie utile) Brauchbarkeitsdauer
utility utilité Brauchbarkeit
vibration vibration Schwingung, Schütteln
wearout usure Verschleiss
wearout failure défaillance d'usure Verschleissausfall
wearout period période d'usure Verschleissphase
worst case le cas le plus défavorable Schlimmstfall
worst case analysis analyse du cas le plus défavorable Schlimmstfallanalyse

French | English | German

adaptabilité adaptability Anpassungsfähigkeit
analyse des défaillances failure analysis Ausfallanalyse
analyse du cas le plus défavorable worst case analysis Schlimmstfallanalyse
appareil equipment Gerät
aptitude à la remise en état restorability Instandsetzbarkeit
assurance fiabilité reliability assurance Zuverlässigkeitssicherung
assurance qualité quality assurance Qualitätssicherung
capacité capability Leistungsfähigkeit
caractéristique d'acceptation operating characteristics Annahmekennlinie
caractéristiques de défaillance failure characteristics Ausfallkennzeichen
caractéristiques de fiabilité reliability characteristics Zuverlässigkeitskenngrösse
chaîne fonctionnelle operating path Operationspfad
choc shock Stoss
circuit ouvert open circuit Unterbrechung

coefficient de réduction de charge derating factor Unterlastungsgrad
comportement en exploitation field performance Einsatzverhalten
composant part, component (Bau)-Element, Komponente
condition d'environnement environmental condition Umgebungsbedingung
contrainte stress Beanspruchung
contrainte d'environnement environmental stress Umgebungsbedingung
contrainte échelonnée step-stress stufenweise Beanspruchung
contrainte fonctionnelle functional stress funktionsbedingte Beanspruchung
contrainte tolérée maximum limited stress Grenzbeanspruchung
contrôle (de) qualité quality control Qualitätskontrolle
court-circuit short Kurzschluss
critère de défaillance failure criterion Ausfallkriterium
cycle de contrainte stress cycle Beanspruchungszyklus
cycle de fonctionnement operational cycle Arbeitszyklus
défaillance failure Ausfall
défaillance aléatoire random failure Zufallausfall
défaillance composant component failure Bauelementeausfall
défaillance d'usure wearout failure Verschleissausfall
défaillance par dérive drift failure Driftausfall
défaillance partielle partial failure Teilausfall
défaillance précoce early failure Frühausfall
défaillance progressive degradation failure Driftausfall
défaillance seconde (dépendante) secondary failure Folgeausfall
défaillance soudaine sudden failure Sprungausfall
défaillance totale (complète) blackout Gesamtausfall
défaillances systématiques systematic failures systematische Ausfälle
défaut defect Fehler
dégradation degradation Leistungsminderung
démonstration fiabilité reliability demonstration Zuverlässigkeitsnachweis
densité de probabilité density function Verteilungsdichte

densité de probabilité de la défaillance failure probability Ausfallwahrscheinlichkeitsdichte
densité des défaillances failure density Ausfallhäufigkeitsdichte
dérangement malfunction Störung
dérive (drift) drift Drift
dévaluation derating Unterlastung
déverminage debugging Frühausfallbeseitigung
disponibilité availability Verfügbarkeit
dispositif (unité) item Betrachtungseinheit
dispositifs survivants survivals Bestand
distr. de la fréquence des défaillances failure frequency distribution Ausfallhäufigkeitsverteilung
distrib. de la densité des défaillances failure density distribution Ausfalldichte
distribution binomiale binomial distribution Binomialverteilung
distribution de la fréquence cumulée de défaillance distribution of cumulative failure frequency Ausfallsummenverteilung
données d'essais test data Testdaten
données d'exploitation operating (working) data Betriebsdaten
données de fiabilité reliability data Zuverlässigkeitsangaben
données relatives au temps time data Zeitangaben
durée d'indisponibilité downtime Ausfalldauer
durée de dérangement malfunction time Störungsdauer
durée de disponibilité uptime Klarzeit
durée de vie life (longevity) Lebensdauer
durée de vie moyenne mean (medium) life mittlere Lebensdauer
durée de vie utile useful life Brauchbarkeitsdauer
durée moyenne avant défaillance mean time to failure MTTF mittlere ausfallfreie Zeit
durée moyenne de vie average life durchschnittliche Lebensdauer
écart deviation Abweichung
échantillon sample Stichprobe
efficacité d'un système system effectiveness Systemwirksamkeit
environnement environment Umgebungsklima

espérance mathématique expected value Erwartungswert
essai accéléré accelerated test Zeitraffungsprüfung
essai cyclique thermique thermal cycling Temperaturzyklen
essai d'acceptation acceptance test Annahmeprüfung
essai de durée de vie life test Lebensdauerversuch
estimation de fiabilité reliability estimation Zuverlässigkeitsschätzung
estimation des paramètres estimation of parameters Parameterschätzung
étanchéité hermeticity Dichtheit
examen critique du projet design review Entwurfsüberprüfung
facteur d'utilisation duty cycle Tastverhältnis
fatigue thermique thermal fatigue thermische Ermüdung
fiabilité reliability Zuverlässigkeit
fiabilité de conception (inhérente) design inherent reliability Entwurfszuverlässigkeit
fiabilité en essai test reliability Prüfzuverlässigkeit
fiabilité inhérente inherent reliability Entwurfszuverlässigkeit
fiabilité nominale nominal reliability Nennzuverlässigkeit
fiabilité opérationnelle operational reliability Betriebszuverlässigkeit
fiabilité par intervalle interval reliability Intervall-Zuverlässigkeit
fiabilité totale overall reliability Gesamtzuverlässigkeit
fonction de survie survival function Bestandsfunktion
fonction de fiabilité reliability function Zuverlässigkeitsfunktion
four de stabilisation à haute temp. stabilisation bake Hochtemperaturstabilisierung
fréquence des défaillances failure frequency Ausfallhäufigkeit
indice de maintenance maintenance support index Wartungsindex
instant d'apparition de la défaillance instant of failure Ausfallzeitpunkt
le cas le plus défavorable worst case Schlimmstfall
maintenabilité maintainability Wartbarkeit
maintenance préventive preventive maintenance vorbeugende Wartung
maintenance corrective corrective maintenance Instandsetzung
maintenance maintenance Wartung

mortalité infantile infant mortality Frühausfallphase
moyenne des temps de bon fonctionnement (MTBF) mean time between failures (MTBF) mittlerer Ausfallabstand
moyenne des temps des tâches de réparation mean time to repair mittlere Reparaturzeit
niveau de confiance confidence level Vertrauensbereich, Konfidenzbereich
niveau de fiabilité acceptable acceptable reliability level ARL annehmbare Zuverlässigkeitsgrenze
niveau de qualité acceptable NQA acceptable quality level annehmbare Qualitätsgrenze
nombre initial (de dispositifs) initials Anfangsbestand
nombre moyen de cycles jusqu'à la défaillance mean cycles to failure Zyklenzeit
non-fiabilité unreliability Unzuverlässigkeit
paramètres statistiques statistical quantities statistische Grössen
période à taux de défaillance constant normal (random) failure period Periode konstanter Ausfallrate
période d'usure wearout period Verschleissphase
plan de contrôle par attributs acceptance sampling plan Stichprobenplan
pourcentage percentile prozentualer Anteil
pourcentage instantané de défaillance temporary failure frequency temporäre Ausfallhäufigkeit
prévieillissement ageing Voraltern, Vorbehandlung
prévision de fiabilité reliability prediction Zuverlässigkeitsaussage
probabilité d'acceptation probability of acceptance Annahmewahrscheinlichkeit
probabilité d'énonciation static likelihood Aussagewahrscheinlichkeit
probabilité d'erreur probability of error Irrtumwahrscheinlichkeit
probabilité de défaillance pour une période donnée incremental probability of failure inkrementale Ausfallwahrscheinlichkeit
probabilité de survie probability of survival Überlebenswahrscheinlichkeit
qualité quality Qualität
redondance redundancy Redundanz
redondance active active redundancy aktive Redundanz (heisse Reserve)
redondance faiblement chargée lightly loaded redundancy leicht belastete Redundanz
redondance fonctionnelle functional redundancy funktionsbeteiligte Redundanz

redondance partielle partial redundancy Teilredundanz
redondance passive (séquentielle ou de commutation) passive (standby) redundancy passive Redundanz (kalte Reserve)
répartition de la fiabilité reliability allocation Zuverlässigkeitsaufteilung
réseau maillé intermeshing Vermaschung
sécurité safety Sicherheit
sélectionner screen (to) auswählen
soudabilité solderability Lötbarkeit
stabilisation au four (à haute température) stabilisation bake Hochtemperaturstabilisierung
surcontrainte overstress Überbeanspruchung
taux de défaillance failure rate Ausfallrate
taux de défaillance observé failure quota Ausfallquote
technique de dévaluation derating technique Unterlastungstechnik
temps de fonctionnement operating time Betriebszeit
temps de fonctionnement intégral full operating time Vollbetriebszeit
temps de fonctionnement partiel partial operating time Teilbetriebszeit
temps de maintenance active active maintenance time tatsächliche Wartungsdauer
temps de réparation (active) active repair time Instandsetzungsdauer
temps moyen de maintenance mean maintenance time mittlere Wartungszeit
tension de claquage breakdown voltage Durchbruchspannung
termes relatifs aux probabilités terms of probability Wahrscheinlichkeitsbegriffe
test de développement design development test Entwicklungsversuch
test de qualification (conception, projet) design qualification test Konstruktionszulassungsprüfung
testabilité testability Prüfbarkeit
usure wearout Verschleiss
utilité utility Brauchbarkeit
variation réversible reversible change umkehrbare Änderung
vibration vibration Schwingung, Schütteln
vie utile (durée de vie utile) useful life Brauchbarkeitsdauer

German | English | French

Abweichung deviation écart
aktive Redundanz (heisse Reserve) active redundancy redondance active
Anfangsbestand initials nombre initial (de dispositifs)
Annahmekennlinie operating characteristics caractéristique d'acceptation
Annahmeprüfung acceptance test essai d'acceptation
Annahmewahrscheinlichkeit probability of acceptance probabilité d'acceptation
annehmbare Zuverlässigkeitsgrenze acceptable reliability level ARL niveau de fiabilité acceptable
annehmbare Qualitätsgrenze acceptable quality level niveau de qualité acceptable NQA
Anpassungsfähigkeit adaptability adaptabilité
Arbeitszyklus operational cycle cycle de fonctionnement

Ausfall failure défaillance
Ausfallanalyse failure analysis analyse des défaillances
Ausfalldauer downtime durée d'indisponibilité
Ausfalldichte failure density distribution distrib. de la densité des défaillances
Ausfallhäufigkeit failure frequency fréquence des défaillances
Ausfallhäufigkeitsdichte failure density densité des défaillances
Ausfallhäufigkeitsverteilung failure frequency distribution distr. de la fréquence des défaillances
Ausfallkennzeichen failure characteristics caractéristiques de défaillance
Ausfallkriterium failure criterion critère de défaillance
Ausfallquote failure quota taux de défaillance observé
Ausfallrate failure rate taux de défaillance
Ausfallsummenverteilung distribution of cumulative failure frequency distribution de la fréquence cumulée de défaillance
Ausfallwahrscheinlichkeitsdichte failure probability densité de probabilité de la défaillance
Ausfallzeitpunkt instant of failure instant d'apparition de la défaillance
Aussagewahrscheinlichkeit static likelihood probabilité d'énonciation
auswählen screen (to) sélectionner
(Bau)-Element, Komponente part, component composant
Bauelementeausfall component failure défaillance composant
Beanspruchung stress contrainte
Beanspruchungszyklus stress cycle cycle de contrainte
Bestand survivals dispositifs survivants
Bestandsfunktion survival function fonction de survie
Betrachtungseinheit item dispositif (unité)
Betriebsdaten operating (working) data données d'exploitation
Betriebszeit operating time temps de fonctionnement
Betriebszuverlässigkeit operational reliability fiabilité opérationnelle
Binomialverteilung binomial distribution distribution binomiale
Brauchbarkeit utility utilité
Brauchbarkeitsdauer useful life durée de vie utile

Brauchbarkeitsdauer useful life vie utile (durée de vie utile)
Dichtheit hermeticity étanchéité
Drift drift dérive (drift)
Driftausfall degradation failure défaillance progressive
Driftausfall drift failure défaillance par dérive
Durchbruchspannung breakdown voltage tension de claquage
durchschnittliche Lebensdauer average life durée moyenne de vie
Einsatzverhalten field performance comportement en exploitation
Entwicklungsversuch design development test test de développement
Entwurfsüberprüfung design review examen critique du projet
Entwurfszuverlässigkeit design inherent reliability fiabilité de conception (inhérente)
Entwurfszuverlässigkeit inherent reliability fiabilité inhérente
Erwartungswert expected value espérance mathématique
Fehler defect défaut
Folgeausfall secondary failure défaillance seconde (dépendante)
Frühausfall early failure défaillance précoce
Frühausfallbeseitigung debugging déverminage
Frühausfallphase infant mortality mortalité infantile
funktionsbedingte Beanspruchung functional stress contrainte fonctionnelle
funktionsbeteiligte Redundanz functional redundancy redondance fonctionnelle
Gerät equipment appareil
Gesamtausfall blackout défaillance totale (complète)
Gesamtzuverlässigkeit overall reliability fiabilité totale
Grenzbeanspruchung maximum limited stress contrainte tolérée
Hochtemperaturstabilisierung stabilisation bake stabilisation au four (à haute température)
inkrementale Ausfallwahrscheinlichkeit incremental probability of failure probabilité de défaillance pour une période donnée
Instandsetzbarkeit restorability aptitude à la remise en état
Instandsetzung corrective maintenance maintenance corrective
Instandsetzungsdauer active repair time temps de réparation (active)

Intervall-Zuverlässigkeit interval reliability fiabilite par intervalle


Irrtumwahrscheinlichkeit probability of error probabilite d'erreur
Klarzeit uptime duree de disponibilite
Komponente, Bauelement part, component composant
Konstruktionszulassungsprüfung design qualification test test de qualification (conception, projet)
Kurzschluss short court-circuit
Lebensdauer life (longevity) duree de vie
Lebensdauerversuch life test essai de duree de vie
leicht belastete Redundanz lightly loaded redundancy redondance faiblement chargee
Leistungsfahigkeit capability capacite
Leistungsminderung degradation degradation
Lötbarkeit solderability soudabilite
mittlere ausfallfreie Zeit mean time to failure (MTTF) duree moyenne avant defaillance
mittlere Lebensdauer mean (medium) life duree de vie moyenne
mittlere Reparaturzeit mean time to repair moyenne des temps des taches de reparation
mittlere Wartungszeit mean maintenance time temps moyen de maintenance
mittlerer Ausfallabstand mean time between failures (MTBF) moyenne des temps de bon fonctionnement (MTBF)
Nennzuverlassigkeit nominal reliability fiabilite nominale
Operationspfad operating path chaine fonctionnelle
Parameterschätzung estimation of parameters estimation des parametres
passive Redundanz (kalte Reserve) passive (standby) redundancy redondance passive (sequentielle ou de commutation)
Periode konstanter Ausfallrate normal (random) failure period periode a taux de defaillance constant
prozentualer Anteil percentile pourcentage
Prüfbarkeit testability testabilite
Prüfzuverlässigkeit test reliability fiabilite en essai
Qualität quality qualite
Qualitätskontrolle quality control controle (de) qualite
Qualitätssicherung quality assurance assurance qualite

Redundanz redundancy redondance


Schlimmstfall worst case le cas le plus defavorable
Schlimmstfallanalyse worst case analysis analyse du cas le plus defavorable
Schwingung, Schütteln vibration vibration
Sicherheit safety securite
Sprungausfall sudden failure defaillance soudaine
statistische Grössen statistical quantities parametres statistiques
Stichprobe sample echantillon
Stichprobenplan acceptance sampling plan plan de controle par attributs
Störung malfunction derangement
Störungsdauer malfunction time duree de derangement
Stoss shock choc
stufenweise Beanspruchung step-stress contrainte echelonnee
systematische Ausfalle systematic failures defaillances systematiques
Systemwirksamkeit system effectiveness efficacite d'un systeme
Tastverhaltnis duty cycle facteur d'utilisation
tatsächliche Wartungsdauer active maintenance time temps de maintenance active
Teilausfall partial failure defaillance partielle
Teilbetriebszeit partial operating time temps de fonctionnement partiel
Teilredundanz partial redundancy redondance partielle
Temperaturzyklen thermal cycling essai cyclique thermique
temporäre Ausfallhäufigkeit temporary failure frequency pourcentage instantane de defaillance
Testdaten test data donnees d'essais
thermische Ermüdung thermal fatigue fatigue thermique
Überbeanspruchung overstress surcontrainte
Überlebenswahrscheinlichkeit probability of survival probabilite de survie
Umgebungsbedingung environmental condition condition d'environnement
Umgebungsbedingung environmental stress contrainte d'environnement
Umgebungsklima environment environnement
umkehrbare Änderung reversible change variation reversible

Unterbrechung open circuit circuit ouvert


Unterlastung derating devaluation
Unterlastungsgrad derating factor coefficient de reduction de charge
Unterlastungstechnik derating technique technique de devaluation
Unzuverlässigkeit unreliability non-fiabilite
Verfügbarkeit availability disponibilite
Vermaschung intermeshing reseau maille
Verschleiss wearout usure
Verschleissausfall wearout failure defaillance d'usure
Verschleissphase wearout period periode d'usure
Verteilungsdichte density function densite de probabilite
Vertrauensbereich, Konfidenzbereich confidence level niveau de confiance
Vollbetriebszeit full operating time temps de fonctionnement integral
Voraltern, Vorbehandlung ageing previeillissement
vorbeugende Wartung preventive maintenance maintenance preventive
Wahrscheinlichkeitsbegriffe terms of probability termes relatifs aux probabilites
Wartbarkeit maintainability maintenabilite
Wartung maintenance maintenance
Wartungsindex maintenance support index indice de maintenance
Zeitangaben time data donnees relatives au temps
Zeitraffungsprüfung accelerated test essai accelere
Zufallausfall random failure defaillance aleatoire
Zuverlässigkeit reliability fiabilite
Zuverlässigkeitsangaben reliability data donnees de fiabilite
Zuverlässigkeitsaufteilung reliability allocation repartition de la fiabilite
Zuverlässigkeitsaussage reliability prediction prevision de fiabilite
Zuverlässigkeitsfunktion reliability function fonction de fiabilite
Zuverlässigkeitskenngrösse reliability characteristics caracteristiques de fiabilite
Zuverlässigkeitsnachweis reliability demonstration demonstration de fiabilite
Zuverlässigkeitsschätzung reliability estimation estimation de fiabilite

Zuverlässigkeitssicherung reliability assurance assurance fiabilite


Zyklenzeit mean cycles to failure nombre moyen de cycles jusqu'a la defaillance
Index

accelerated testing 41, 42, 74, 91, 180, 184, 186, 188, 189, 205, 219, 220, 257, 293, 317, 343, 347, 356, 357, 415, 424
accelerated thermal stress 221
acceleration factors 221, 351, 352, 358, 366
acceleration stress 76
accidental failures 16
acquisition reform 41, 42, 48, 340, 357
activation energy 76, 221, 222, 226, 227, 229, 244, 423
active tests 346
AES (Auger Electron Spectroscopy) 284
ageing models 223
ageing of substrate 269
ageing problem 318, 327
aggressive liquids 342
alpha particles 292
alphanumeric display 313
aluminium conductor 227
aluminium electrolytic capacitors 105, 125
analysis 176, 186, 191
ANOVA method 52
AQL (Acceptable Quality Level) 233
Arrhenius model 64, 71, 76, 139, 165, 166, 192, 207, 219, 226, 227, 316
assembly process 358
ATE (Automatic Test Equipment) 299
Au-Al bond failures 423
Auger electron spectroscopy 390
automatic surface mounting 287
automatic wafer processing 287
automotive environment 310
availability 83, 86
average lifetime 349
average quality 294
average value 334

baking process 247
beam-leads 255
binomial probability function 7
bipolar chips 273
bipolar IC 215, 218, 221, 226, 229, 230, 233, 234, 238, 240, 241
bipolar technologies 282, 294, 295, 308
bipolar transistors 171, 172, 173, 188, 195
bistable noise in operational amplifiers 329
bit defects 350
block refresh 300
bonding strength 272
bonding techniques 272, 276
breakdown voltage 110, 117, 119, 131, 135, 173, 176
breakthrough 419
brittle fracture 229
bubble test 210
building-in reliability 41, 42, 46, 282
bulk resistivity 271, 272
buried oxide 291
burn-in 42, 49, 52, 54, 55-58, 62-65, 82, 89, 203, 209, 230, 233, 234, 236, 237, 244, 261, 283, 285, 301, 310, 322, 324, 366, 367, 375, 420, 423, 424

capacitor-chip 264
capacitors 416
CAS (Column Address Strobe) 300
catastrophic failures 161-163, 221, 222, 283, 308, 320, 323
cathodic spraying technique 259
cause-effect diagrams (Ishikawa) 52
causes of failure 205
CCD (Charge Coupled Devices) 290
cell type 289
ceramic substrate 247, 250, 252, 258, 267, 315
CERDIP (ceramic dual-in-line packages) 309, 355
challenge-response 64
channel degradation 295
characterisation test 23, 298
charge induced failures 301
charge injection 423
charge loss 296, 297, 315
charge pumping 292
charging phenomenon 225
chemical means 388
cleanliness 283
climatic tests 128
clock rates 289
CMOS 218, 283, 290, 291, 295, 307, 312, 313
502 Index

combined temperature and humidity cycles 340
commercial parts 358
component logic tests 369
component stress analysis method 413
components with ferrites 417
computer simulation 284
concurrent engineering 41, 42, 47, 382
conditional probabilities 6
conditioning tests 388
conformity test 27
connectors 140, 417
constant acceleration 234, 264, 301
contamination 51, 296, 341, 343, 351, 381
continuity test 303, 354
continuous distributions 7
continuous process improvement 52
control level 233, 369, 377
control strategy 34
conventional electronics 279
cooling radiator 263
correction factor 123
corrective action 43, 51, 68
corrosion 222, 228, 236, 243, 257, 341, 343, 345, 350, 351, 354, 360, 361
cost of the tests 46, 54, 57, 58, 81, 83, 368
COTS (Commercial-off-the-Shelf) 359
coupling factor 324
crossovers 264
cross-sectioning 284
crystal crack 263
crystal lattice 292
CSP (Chip-Scale Package) 357
CTR degradation 318, 320, 321, 323, 324, 327
cultural features 42
cumulative distribution 7, 8
current density 216, 227, 239
current gain 174, 176-178, 184, 225
curve tracer 386
C-V curves 387

damage-endurance 64
damp heat test 128-130, 251, 302, 356
data processing 30, 221
data retention 296, 302
DC parametric test 303
defect 363, 364, 366-369, 376, 377
defect density 284, 298, 311
defect types 363, 369
defect-free 285
degradation failures 148, 149, 162, 163, 327
degradation mechanisms 223
degradation of light output 320
degradation rate 317
dependability 42, 86, 87
derating 26, 35, 92, 93, 100, 102, 121, 136, 140, 207
design deficiencies 392
design errors 100
design for reliability 42, 49, 89, 161
design rules 99, 286
design-related yield losses 287, 307
detection 363, 369, 370
devaluation 260
device 3, 9, 12, 26
device dimensions 279, 281, 307, 311
device geometries 226
diac (bilateral trigger diode) 248
diagnosis 363, 371, 374, 376, 378
die attach 272
die passivation 344
die protection 342
dielectric breakdown 113, 223, 225, 231
dielectric constant 271
dielectric paste 248, 254
dielectric strength 271
diffusion current 319
digital networks 336
dilatation coefficients 180, 228, 341, 343
diode chips 264, 273
diodes 417
DIP ceramic packages 348
discrete devices 220
discrete distributions 7
dissipation factor 271, 272
dissipation power 171, 174, 187, 188, 190, 192, 193, 248, 349, 352
distributed refresh 300, 301
distribution function 9, 20, 22
DLTS (Deep Level Transient Spectroscopy) 387
DoE (Design of Experiments) 52
DRAM 278, 280, 281, 287, 293, 307, 312, 313
drift 51, 53, 59-61, 63, 64, 71, 81, 90
drift failures 96, 102, 109, 111, 221
drift of the threshold voltage 350
dry etching 283, 286, 287
dry plasma etching 289
dual-in-line (DIP) packages 288
Duane model 414
dust 415
duty cycle 351, 352, 414
dynamic life testing 223, 287
dynamic RAM 229

early failures 9, 53-55, 57, 60, 64, 91, 99, 102, 103, 105, 141, 164, 187, 229-231, 233, 234, 243
early life test 316
EAROM (electrically alterable ROM) 290
ECL 289
economic considerations 367
electric field 350, 351
electrical characteristics 343, 350
electrical measuring 381
electrical overstress 294
electrical stress 159, 231
electrical tests 366
electrochemical stability 365, 366
electromigration 216, 223, 227, 231, 232, 243, 244, 297
electron charge 211
electron collisions 227
electron microprobe 284
electron probe microanalysis 390
electronic systems 3, 21, 24, 30, 38, 39, 41, 42, 233
encapsulation 262, 263
energy barrier 226
environment 2, 18, 19, 41
environmental conditions 55, 65, 66, 74
environmental reliability testing 42, 65, 96, 102, 106, 107, 109, 111, 112, 115, 123, 124, 132, 138, 139, 141
epoxy resins 248, 273, 275, 315, 323, 341, 347
EPROM (erasable PROM) 290
equipped cards 417
equipped card control 233
erase cycles 281, 296
error 364, 366, 369, 370, 373
ESD (Electrostatic Discharge) 223, 229, 283, 294, 295, 298, 301, 303
evaporation 247, 248, 250
excess low-frequency noise 329
excess noise 330, 336, 338
exponential failure distribution 261, 358
extrinsic failure mechanisms 315
Eyring model 231

fabrication cycle 367
fabrication defects 167
face bond 276
failure 1, 2, 4, 7-26, 29-38, 41, 363-365, 367, 375, 377
failure analysis 171, 181, 183, 184, 186, 224, 282, 284, 294, 303, 310, 346, 353, 381-386, 389-393
failure cause 381, 383, 390, 392
failure criteria 220, 221, 421
failure mechanisms 50, 52-55, 60, 63, 64, 68-71, 74, 81, 82, 84, 88, 148, 158, 160, 182, 183, 186, 194, 197, 203, 205, 207, 219, 220-225, 228, 230-232, 234, 236, 239, 244, 261, 263, 276, 281, 294, 304, 341, 350, 356-358, 381-383, 387, 390, 393
failure mode 162, 183, 184, 341, 344, 346, 350, 357, 381, 415, 419, 421, 422
failure probability 419
failure rate 2, 8-13, 18-24, 29-31, 35, 36, 67, 91, 93, 102-106, 109, 111, 112, 115, 123, 126, 127, 145-150, 154, 157, 158, 161, 162, 165-167, 171, 179, 180, 187, 189-192, 206, 215, 219, 220, 224, 230, 236, 238, 252, 255, 256, 261, 276, 297, 327, 339-341, 346, 349, 352-354, 358, 415
failure rate prediction 295, 413
failure risk 84
failure types 16, 132, 254, 418
FAMOS technology 296
fatigue 100, 228
ferroelectrics 279
FET (Field Effect Transistors) 174
fibre optic 314
field data 414
field-effect characteristic 325
final control 233
final electrical tests 264
fine adjustment of resistors 270
fine leak test 59, 302
flash memories 281
flat pack packages 348
flaws 363
flicker noise 329, 330, 333-335, 337
FMEA (Failure Mode and Effect Analysis) 415
FMEA/FMECA method 30
FMECA (Failure Mode, Effect and Criticality Analysis) 415
FTA (Fault Tree Analysis) 32
functional test 229, 283, 303, 304
functional testers 376
functioning 68
fuzzy logic 60, 61, 75, 87, 90

GaAs FET chip 273
GaAs LED 317
generation of hole-electron pairs 298
generation-recombination noise 329
getter 226
GIDL (gate-induced drain leakage) 291
gigabit memories 310
glass passivation 350, 351

glass pastes 254
glassivation 152, 301
glazing 268
gold conductive paths 250
good/bad-test 23
gross leak test 57, 59, 302
guaranteed lifetime 108

hardware failures 293
HAST (Highly Accelerated Stress Test) 49, 354
hazard rate 12, 13, 82, 84
heat sink 180, 181, 187, 192
hermetic packages 343, 355
hermeticity 256, 257, 264
hermeticity testing 388
high radiance LEDs (HRLEDs) 314
high stress tests 234
high temperature humid environment 326
high temperature storage 297, 301, 323
high temperature test 327
high-injection noise 329
high-temperature stability test 366
hillocks 227
HMOS 290, 298
hometaxial base 174
hostile environment 343
hot carrier 226
hot electron injection 296
hotspot 145, 148, 182, 183, 184, 260, 275
hot-carrier 223, 279, 281, 291
HTRB (High Temperature Reverse Bias) 257, 297, 323, 333, 335
humidity 323, 326, 327, 339, 340-343, 347, 348, 350-355, 360
humidity test 65, 66, 67, 76, 77, 78, 91, 221, 222, 223, 242, 303
hybrid circuit 323
hybrid integrated circuits 247, 248, 255, 272

I2L 289
IMPATT (IMPact Avalanche and Transit-Time) diodes 164
impedance 107, 109, 116
incoming inspection 299, 300
indirect tests 364
infant mortality 54, 84, 284, 298
infant mortality period 230, 231
input control 233, 367, 368, 369, 375, 384, 393
input control tests 365
input impedance 175
instantaneous failure rate 13
insulation resistance 131
integrated circuits 215, 218-222, 224, 229, 236, 238, 240, 244-246, 418
intellectual property 378, 379
interconnect subsystem 232
interface traps 292
intermetallic compounds 381
intermittent contacts 341
intermittent functioning 349, 350, 351, 352
intermittently open circuit 324
internal connections 343
internal visual 234
international standardisation 221
intrinsic charge trapping 297
ion implantation 289
ion scattering spectrometry 390
ionic contamination 298, 341
IR semiconductor lasers 314
ISO 9000 48, 86
isolation resistance 255

junction breakdown 304
junction current density 319
junction temperature 145, 161, 162, 165, 166, 171, 177-179, 187-192, 225, 319, 323, 324

laser diodes 356
laser technique 259
latch-up test 223, 302
lattice constant 272
law of addition 6
law of degradation 323
law of multiplication 6
LCC (leadless chip carriers) 288
LCD (Liquid Crystal Display) 324
lead bond 274
lead-frame 365
leakage current 176, 350, 352, 381
leakage tests 303
life testing 72, 260, 347, 349, 354, 421, 422
lifetime 83, 174, 178, 179, 186, 317, 318, 320, 326, 327, 351, 352, 355
Light Emitting Diodes (LED) 313, 417
linear IC 418
liquid chemical etching 286
load resistance 168
localisation 363, 370, 377, 380
lognormal distribution 8, 76, 84, 225, 316, 358
long term failures 230
long-term reliability 52
loss factor 115, 123
low frequency noise 329
low power electronics 312

low-temperature dynamic life test 297
LPE (Liquid Phase Epitaxy) 314
LSI (Large Scale Integration) 234
LSI memories 341
LTO (Low pressure Temperature Oxide) 285

maintainability 5, 38, 39, 41, 86, 413, 414
maintenance 326
maintenance controls 369
maintenance costs 233
majority carriers 172
manipulation errors 393
manufacturing defect 229, 230
manufacturing errors 100, 101
Markov parameters 414
mask misalignments 286
masking defects 390
material fatigue 346
maximum ratings 319, 324
MBE (Molecular Beam Epitaxy) 314
measuring techniques 375
mechanical defects 182
mechanical means 389
mechanical shock 220, 339-342, 353, 415
mechanical stability 365
mechanical stress 68, 70, 79, 221, 223, 228, 229, 256
memory cycle 290
mesa epitaxial 174
mesa planar 174
mesa with double diffusion 174
metal corrosion 381
metal diffusion 226
metal film resistors 93, 100
metal migration 381
metal package 184, 339, 341
metal penetration 423
metallic frame 343, 344, 345
metallisation 216, 217, 218, 222, 225, 228, 232, 236, 238, 240, 241, 243, 262, 263, 274, 355
microbreaks 390
microcracks 297
microelectronics 226, 244
microfissures 256
microplasma noise 329
microprocessor design 308, 314
microprocessor test methods 371
microprocessors 277, 278, 279, 283, 305, 308, 310, 312, 363, 367, 368, 371, 373, 421
microscope analysis 389
microstrip post 275
microtechnology 218
microwaves 248
military components 356, 358
military standards 339
ministrip 275
minority carrier lifetime 284
minority carriers 172
MLE (Maximum Likelihood Estimation) 73
MO-CVD (Metal Organic Chemical Vapour Deposition) 314
modelling 284, 307
moisture 339, 343, 345, 347, 348, 354
molecular electronics/photonics 279
monitoring 60
monolithic integrated circuits 215
Monte Carlo reliability simulation 232
Monte Carlo techniques 32
Moore's law 277
MOS (Metal Oxide Semiconductor) 174
MOS ICs 218, 225, 233, 239
MOS memories 292
MOS transistor 223, 225, 226
MOSFET 279
moulding material 341, 343, 344, 345, 347
moulding operation 341, 343
mounting of capacitors 269
MTBF (Mean Time Between Failures) 9, 23, 413, 414
MTTF (Mean Time To Failure) 9, 227, 228, 252
MTTR (Mean Time To Repair) 414
multilayer ceramic capacitors 132
multiphonon-assisted tunneling mechanism 296
multiple writing 305

negative exponential distribution 10, 11
nematic liquid crystals 383
neutral test laboratories 375
neutron activation analysis 390
neutron radiography 383
Newton-Raphson method 73
nitride passivation 389
NMOS 283, 290, 297
noise 415
noise behaviour 103
noise characteristics 254
noise due to recombination and generation 329
noise figure 251, 254, 334
noise spectroscopy 335, 338
noise voltage 254
Nomarski microscopy 383
noncontinuous inspection 72
non-destructive tool 386

non-operating failure rate 413
normal distribution 8
normal operational conditions 219, 226, 227

open circuit 184, 419
operating conditions 12, 92, 96
operating life 323
operating temperature range 289
operating time 12
operational derating 22
operational failures 28
operational life 223, 232, 341, 352
operational reliability 51, 85
operational tests 370
optical microscopy 381, 389
optocoupler 318, 319, 323, 324, 417
output noise 333
overcharges 381
oxide breakdown 225, 281, 286, 297
oxide defects 229, 381
oxide hopping conduction 296, 297
oxide instability 294
oxygen plasma treatment 227, 228

package leakage 183
package parasitic 272
packaging density 279, 287, 288, 293
parameter drift 226
parametric tests 369
Pareto diagrams 52, 294
part qualification 358
partitioning 377
parts count technique 413
parts counts method 18
parts stress analysis 18, 19
passivation 152
passivation defects 230
passivation layer 285
passive components 248, 249, 252
passive tests 345
path sensitive algorithms 369
patterns 363, 369, 372
PCB (Printed Circuit Board) 53, 287, 356
PED (Plastic Encapsulated Devices) 309
PEM (Plastic Encapsulated Microcircuits) 49, 340
pH value 351
phenolic resin 347
photolithography defects 217, 286
photoresist 250
physics of failure 41, 42, 77
pin bending 270
pin-diode 210
pinholes 381
pink noise 329
planar with double diffusion 174
plasma etcher 389
plastic encapsulated circuits 222
plastic encapsulated ICs 420
plastic encapsulated transistors 339, 340
plastic materials 250
plastic package 184, 186, 222, 243
plastic packages 388
Poisson probability 7
Poisson's equation 211
polyester film / foil capacitors 126
popcorn noise (also called burst noise) 329, 331
potentiometers 416
power cycle test 346
power transistor 171, 173, 177, 182, 191, 192, 194
power-transistor-chip 264
pre-ageing 22, 25
preconditioning 323
prediction methods 42, 82, 83
pressure 219, 222, 415
pressure cooker 202, 205, 302
pressure test 354
preventive action 43
printed circuits 249
probability 1, 3-14, 20, 23, 30-33, 38-40
probability distributions 7
process errors 392
process monitoring 284
process reliability 41, 42, 50
process variations 392
process weaknesses 299
process-related defects 285
procurement process 358
production controls 369
programmable logic array 291
PROM (programmable ROM) 290
protection circuits 229
pseudo-random signals 373
PSG (phosphosilicate glass) 225, 285
pulse heating techniques 275
purple plague 163, 228, 256

QML (Qualified Manufacturer List) 358
QPL (Qualified Parts List) 358
quality 1, 3, 4, 16, 18, 21, 27, 34-36, 326, 363, 371, 373-378
quality and reliability assurance 42
quality systems 42, 86
quantum efficiency 315, 317, 318, 320, 323
quartz devices 417

rad-hard parts 305
radiation 265
radiation environment 305
radiation field 70, 71
radiation hardness 289
RAM (Random Access Memory) 290, 298, 299, 300, 303-305, 310, 314
Raman scattering 390
random instructions 373
random noise 329
random pulses 330
random test 369, 373
RAS (Row Address Strobe) 300
RBD (reliability block diagram) 414
real time algorithmic method 372
rectifier diodes 145, 158
reflow method 253
reflow-soldering 269
refresh tests 300
relative humidity 222, 339, 341, 343, 348-350, 415
relative-frequency 6
reliability 1-5, 7, 9-26, 28, 30, 31, 34-44, 46-50, 52-57, 60-65, 67-69, 72, 75-79, 81-91, 146, 171, 172, 175-180, 185-195, 219, 235, 277, 279, 281-284, 286-288, 291-294, 296, 298, 301, 305, 307-309, 313-318, 322, 326, 327, 329, 339, 340, 342-344, 347, 348, 350, 352, 353, 355-358, 360, 362
reliability building 42, 49
reliability data 109
reliability evaluation 42, 63, 79, 220, 231, 234, 244
reliability fingerprint 75
reliability level 357
reliability models 205
reliability prediction 52, 352, 358
reliability study 252
reliability tests
replacement rate 230
REPROM (Reprogrammable PROM) 290
residual current 107, 112-114, 126, 176, 233, 239
residual curve 111
resistor pastes 253
resistor stability 96
resistors 92-94, 96-103, 125, 134, 416
responsiveness 321
reverse leakage 209
RIBM (Reactive Ion Beam Milling) 286
RIE (Reactive Ion Etching) 286
ROM (Read-Only Memory) 290, 418
Rutherford back-scattering spectrometry 390

safe operation 167, 168, 169
safety limits 176, 177, 180
saline atmosphere 347, 348
salt spray 355, 356
SAM (Serial Access Memory) 290
scanning electron microscope 381
Schottky TTL 289
screening 21, 22, 25, 41, 42, 46, 49, 52-54, 56, 59, 60-63, 78, 79, 87, 89, 91, 103, 105, 132, 133, 142, 183, 189, 192, 193, 202, 203, 205, 219, 225, 233, 234, 236, 237, 245, 256, 258, 264, 297, 301, 322, 346, 358, 362, 375, 420, 424
screening criteria 318, 323
screening test 346
screen-printing 248
seal test 59, 302
second breakdown 171, 172, 174, 181, 182, 183, 192, 193, 194
secondary electrons 390
self-testing 371
SEM (Scanning Electron Microscopy) 284
SEMM (Soft-Error Monte Carlo Model) 307
shift register 418
shocks 257
short-circuits 150, 153, 154, 160, 162, 163, 184, 193, 381, 419
shot noise 329
signal diodes 150
silicon defects 297
silicon nitride 256
silicone gels 355, 361
silicone package 347
SIMS (Secondary Ion Mass Spectrometry) 284
simulators 231
single-bit errors 293
SITH (Static Induction Thyristor) 210
SLCC (silicone junction coated IC) 355
slow access time 305
slow trapping 296
small geometry devices 298
SMT (Surface Mounting Technology) 287
SOA (Safe Operating Area) 192
soft error 292, 298, 302, 307, 311, 313
soft-error phenomena 279
software package 413
SOI (Silicon On Insulator) 279
SOIC (Small Outline IC package) 355
solder interconnects 275
solder joint 181, 182, 184, 341, 353, 392, 417, 420
solderability 302, 323, 324
soldering failures 262

soldering points 247, 253
SOS (Silicon On Sapphire) 24, 290, 292
SPC (Statistical Process Control) 52, 414
SPICE 231
sputtering 247, 248
SSI (Small Scale Integration) 234
SSI integrated circuits 387
stabilisation 333
stabilisation bake 323
stability of the resistor 251
standard deviation 42, 76, 77, 86, 88, 91
stencil process 247, 252, 258
step-stress test 75
stereoradiography with X-rays 383
storage 67, 251, 252, 272, 318, 323, 327, 352, 356, 362, 386
storage reliability data 420
stress analysis 413, 414
stress level 74, 75-79, 220
stress to monitor ratio 321
stress-strength 64, 85
styroflex capacitors 132
submicron technology 284
sudden failures 102
supply voltage 279, 281, 298, 303, 311, 351, 352
surface charge 225, 296
surface defects 182
surface inversion failures 423
surface isolation 342
surface resistance 253, 254
surface trapping centres 332
surface-related defects 283
switching speed 172, 174
synergy 48, 50, 51, 67, 68, 71, 77, 84, 225
synergy of environmental factors 67

TAB (Tape Automatic Bonding) 265, 392
Taguchi methods 49, 90
tantalum capacitor 110-113, 116, 123, 124, 126
TDDB (Time Dependent Dielectric Breakdown) 225, 302
technological synergies 50
telecommunications systems 376
TEM (Transmission Electron Microscope) 390
temperature 217, 219, 220-222, 224-228, 234-237, 241, 242
temperature coefficient 94, 95-97, 99, 131, 175, 251, 253, 258, 271
temperature cycling 59, 60, 69, 209, 323, 342, 345, 346, 354, 360, 362, 420
temperature stabilisation 264
tensile strength 271, 272
test chips 52
test effectiveness 364, 373, 374
test equipment 376
test strategies 301
test structure 232
test systems 364
testability 26-28, 36, 373, 374, 380
testability analysis 415
testing laboratory 384
testing technique 283
thermal characteristics 257
thermal coefficient of expansion 271
thermal coefficients 315
thermal conductivity 271
thermal cycling 180, 189, 192, 234, 257, 264, 302, 303, 345
thermal expansion mismatch 229, 315, 324
thermal fatigue 178, 180, 181, 191, 192
thermal intermittence 341, 347
thermal noise 329
thermal oxide 225, 226, 286, 291
thermal resistance 219, 229
thermal shock 59, 264, 327, 345, 354-356, 366
thermal stressing 321
thermistors 417
thermocompression 173, 259, 262, 264, 274, 275
thick-film resistor 253, 264
thick-film technique 247, 253, 258
thin-film technique 247, 258, 259
threshold shift 423
threshold voltage 233
through-contacts 252
thyristors 417
time to end of life 316
timing set-ups 300
TIR (Testing-In Reliability) 281
tolerance limits 22
tolerance-requirement 64
total failure 148
Total Quality Management (TQM) 41, 42, 44
transistor-chip 264
transistors 417
transmission gate 295
transport 67
transverse sectioning 386
Trans-Zorb diodes 163
trap characterisation 387
triple diffusion 174
tunneling 311
typical costs 420
typical defects 100

ULSI (Ultra Large Scale Integration) 279


ultrasonic wire bonds 276
ultrasound bond 173
unipolar transistors 172
useful life 346, 358

vapour pressure 339, 343


VDMOS (Vertical Double-diffused MOS) 175
vendor qualification 358
vibration test 210, 219, 220, 303, 340, 341,
345,353,415
visibility points 377, 378
visible LEDs 314
visual control 420
visual external examination 264
visual internal examination 264
VLSI/ULSI silicon wafers 283
VMOS (Vertical MOS) 175, 290, 298
voids 227, 230, 232
VPE (Vapour Phase Epitaxy) 314

wafer bonding 272, 278


wafer scribing 229
wearout 358
Weibull distribution 8, 9, 72, 73, 91, 225, 316, 358, 414
welding 248, 274, 275
wire bonding 273
wire connections 249
WLR (Wafer Level Reliability) 52, 281
worst case temperature 300
wound capacitors 131
wrong usage 393

X-ray examination 330, 332


X-ray fluorescence spectrometric analysis 390
X-Ray radiography 386
yield 277, 279, 284-287, 307, 312

Z diodes 154-163

1/f-noise 329, 332, 333, 335


100% trial 345
