Sonia Vohnout, Justin Judkins, James Hofmeister, and Ronald Carlsten

Ridgetop Group, Inc.
6595 N. Oracle Road, Suite 153B
Tucson, AZ 85704
Tel: 520-742-3300

Abstract: We present an indirect and non-invasive prognostic and health management
strategy for fault-tolerant electric power systems. Switch-mode power supplies (SMPS)
have become ubiquitous in electronic modules and systems, delivering a regulated DC
voltage over a power bus or to a specific module. Often, the power supply has the highest
failure rate within the electronic system, making it the most important factor in reliability
and asset readiness. This paper shows how we use prognostic sensor modules for DC
power supplies to monitor degradation signatures and extend the operating life of a
critical power system by dynamically reconfiguring the loads. Sensor modules are placed
on the power distribution system, external to the power supply, and polled at regular
intervals. The operating life of a critical system can be extended by dynamically
distributing the load based on the degradation signatures being monitored. This modular
approach is simple and can be implemented on existing systems with minimal redesign. It
offers the advantage of increased reliability at low cost, is vendor independent and scalable,
and is applicable to many non-prognostics enabled power supplies.

Keywords: Electronic prognostics; Health monitor; Optical coupler; Switch-mode power supply

Introduction: A leading cause of failure in electronic systems is a failure of the power
components, such as power supplies and actuator drives. The consequences of losing
the power electronics can range from mission failure to making critical assets unavailable
when they are needed. Doctrine advises a frequent maintenance schedule to ensure
reliable systems, but given the logistics cost of carrying spare parts, this is not always
practical. The solution is to augment schedule-based maintenance with condition-based
maintenance, as supported by electronic prognostics. Anticipating electronic failures
before they occur can reduce scheduled maintenance and associated costs, improve
mission success and reliability, enhance system readiness, increase safety, and reduce
overall lifecycle cost in advanced electrical systems. The emerging field of electronic
prognostics and health management (ePHM) is becoming a key enabler of cost-effective,
reliable, available, and robust electronic systems with a long service life. Electronic
prognostics allows the detection of impending solder joint failures in package
interconnections, monitoring of degradation signatures in ceramic and electrolytic
capacitors in power supplies, detection of broken bar faults in squirrel cage induction
motors, and health monitoring in network servers ([1]-[4]).
Switch-mode power supplies (SMPS) have become ubiquitous in electronic modules and
systems, delivering a regulated DC voltage over a power bus or to a specific module. As
with all electronic subsystems, the SMPS is prone to wear out and eventually fail. Often,
these power supplies have a higher failure rate than the downstream components, making
them the weak link in a system. Common fault modes present in most SMPS
include degradation of the output capacitor, failure of the power metal-oxide-semiconductor
field-effect transistor (MOSFET) switch and diodes, failure of the control integrated
circuit (IC), degradation of the opto-isolator, and a variety of electro-mechanical issues
including printed circuit board (PCB) delamination and failure of interconnections [5].
SMPS manufacturers have been reluctant to add prognostics to their supplies because of
the trade-off between fault coverage and reliability: adding more sensors increases the
detection resolution but also makes a circuit inherently less reliable. A sensor failure
within the circuit may impact the electrical performance in an unpredictable way. Hence,
adding sensors within the power supply lowers its reliability. In some cases the benefits
of this capability do not justify the cost and reduction in reliability. A sensor attached
to the system should therefore be as non-invasive as possible, cost-effective,
and reliable.
In previous research we introduced and described a prognostics health monitoring system
where the crossover frequency is monitored through a voltage regulation feedback loop,
and a fault-to-failure progression model is used to predict the health and remaining useful
life (RUL) of an optical isolator in an SMPS. The approach is indirect, non-invasive,
cost-effective, and applicable to non-PHM enabled power supplies, offering increased
reliability. We also introduced a new non-invasive prognostic sensor for the optical
isolator in an SMPS [5]. We will now show how we use prognostic sensor modules for the
opto-isolator and the output capacitor in DC power supplies to monitor degradation
signatures and extend the operating life of a critical power system by dynamically
reconfiguring the loads. The PHM approach presented for fault-tolerant electric power
systems is both indirect and non-invasive; the prognostic modules can be combined on a
single integrated circuit that monitors only the external terminals of a general purpose
DC-DC converter. This approach is simple to implement, low cost, applicable to many
non-PHM enabled power supplies, and offers prognostic support for an on-board vehicle
health management system. Degradation of the power supply can be accurately tracked
and easily processed to provide advance warning of impending failure.
Switch-Mode Power Supply Topology: Our approach relies on our understanding of the
behavior of the SMPS as a system with feedback. The DC power supply chosen in this
paper has a closed regulation feedback loop (see Figure 1). As noted in a previous
research paper [5], changes in the performance of this feedback loop have minimal effect
on the regulating function of the SMPS, but do change how the output voltage
responds to dynamic changes at the load or the input.

Figure 1. State Diagram of Switch-Mode Power Supply

The phase margin, the difference between the phase of the loop gain in Figure 1 and 180°, plays
an important role in the stability of an SMPS. A large phase margin tends to cause
greater damping and less oscillation. A low phase margin allows the circuit to ring
for an extended time in response to small input perturbations. This can result in a marginally
stable system. Phase margins of 0° or less result in uncontrolled oscillations and instability,
which is undesirable in a closed control system. In our research, we use an impulse input
to cause the output voltage to ring. The properties of the ringing (such as magnitude,
frequency, and damping) are then related to the loop gain and phase margin. We can
exploit this behavior by detecting and analyzing the transient behavior to infer the level of
wear-out of certain components within the loop, such as diodes, switches, transformers,
and in particular the optical coupler in the isolation stage of the circuit and then decide if
a SMPS is on a failure trajectory while it is still in service and define how soon it will
need replacement or repair. The transient signals are therefore available for diagnostic
purposes if one has an understanding of the correlation between the loop gain and phase margin.
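The qualitative link between phase margin and ringing can be illustrated with the textbook second-order approximation. The sketch below is not the loop model of this paper; it assumes (purely for illustration) that the closed loop behaves like a standard second-order system with damping ratio zeta, for which the phase margin and the number of ring cycles have closed forms. The function names are hypothetical.

```python
import math

def phase_margin_deg(zeta):
    """Phase margin (degrees) of the unity-feedback loop
    G(s) = wn^2 / (s * (s + 2*zeta*wn)), whose closed loop is the
    standard second-order system with damping ratio zeta."""
    # Normalized crossover frequency wc / wn, from |G(j*wc)| = 1.
    x = math.sqrt(math.sqrt(1.0 + 4.0 * zeta ** 4) - 2.0 * zeta ** 2)
    return math.degrees(math.atan2(2.0 * zeta, x))

def ring_cycles_to_settle(zeta, fraction=0.05):
    """Approximate number of ring cycles before the decay envelope
    falls to `fraction` of its initial value (requires 0 < zeta < 1)."""
    # Envelope decay exponent accumulated over one damped period.
    decay_per_cycle = 2.0 * math.pi * zeta / math.sqrt(1.0 - zeta ** 2)
    return math.log(1.0 / fraction) / decay_per_cycle

for zeta in (0.1, 0.3, 0.5, 0.7):
    print(f"zeta={zeta:.1f}  phase margin={phase_margin_deg(zeta):5.1f} deg  "
          f"cycles to 5%={ring_cycles_to_settle(zeta):4.1f}")
```

Under this assumption, a smaller phase margin corresponds to a smaller damping ratio and therefore to more ring cycles before the transient settles, which is the behavior the sensor exploits.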
Fault-to-Failure Progression Model: A component that commonly fails in an SMPS topology
is the optical coupler, or opto-isolator, which acts as a signal amplifier between the error
signal generation and the pulse-width modulation stages of the power supply. We selected
an optical isolator for this study, based on a GaAs photodiode and a phototransistor. The
current gain for this device typically ranges from 1.0 to 3.0 A/A. This gain, referred to
as the current transfer ratio (CTR), is a multiplier in the loop gain that effectively causes a
vertical shift in the Bode plot of the feedback loop gain and, consequently, also affects the
crossover frequency. In an earlier paper [5], we showed how degradation in the opto-isolator
is expected to result in a decrease in ring frequency.


Figure 2. Bode plot of the loop gain (magnitude in dB versus frequency in Hz, for CTR = 3.0
and CTR = 0.7) showing how the crossover frequency shifts due to a decrease in CTR

The opto-isolator is one of a few high failure-in-time (FIT) rate items in the SMPS. The
degradation progression for this component is a relatively slow decrease in CTR over
time. So long as the loop gain remains above unity, the diminishing CTR does not affect
steady state operation of the circuit. Only when the CTR falls below a critical threshold
value will the circuit cease to properly regulate the output voltage, allowing it to drift
higher. However, well before this failure point, the health of the optical coupler can be
measured by observing the crossover frequency. This frequency will decrease
as CTR is reduced (see Figure 2 and Figure 3).

Figure 3. SPICE simulation output showing voltage transient response (versus time in ms) to a
load current impulse. Damped ringing behavior occurs for two values of opto-isolator CTR, 3.0 and 0.7.
Figure 3 shows the results of a Simulation Program with Integrated Circuit Emphasis
(SPICE) simulation of the output voltage for two values of CTR, 3.0 and 0.7. In the
simulation, the load current changes from 5 amps to 10 amps for a duration of 10 µs and
returns to 5 amps. In the case of CTR = 3.0, an oscillation of 20 kHz occurs in the vicinity
of the crossover frequency shown in Figure 2 and is quickly damped. The circuit recovers
and is again regulating at 5 volts output. Notice that when the loop gain curve shifts up or down,
the crossover frequency also shifts right or left (Figure 2). The rate of shift is given by
the slope of the Bode plot, about one decade per 40 dB of level shift.
Figure 3 shows the impulse response to the load transient for a scenario in which the
optical isolation stage has accumulated some amount of damage. The CTR is reduced
from 3.0 to 0.7, still adequate for steady state voltage regulation. Notice, however, that
the oscillation frequency is reduced. This is a direct result of moving the crossover
frequency in the loop gain curve. An oscillating frequency change of roughly 2 to 1
corresponds to the square root of the relative change in CTR.
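The arithmetic behind the "roughly 2 to 1" figure can be checked directly. The sketch below assumes only what the text states: CTR scales the loop gain, and the loop gain falls at about 40 dB per decade near crossover. The function names are illustrative, not part of any published tool.

```python
import math

SLOPE_DB_PER_DECADE = 40.0  # loop-gain slope near crossover, as stated in the text

def crossover_ratio(ctr_new, ctr_ref, slope_db_per_decade=SLOPE_DB_PER_DECADE):
    """Ratio f_new / f_ref of crossover frequencies after the loop gain
    is scaled by ctr_new / ctr_ref."""
    shift_db = 20.0 * math.log10(ctr_new / ctr_ref)  # vertical shift of the Bode plot
    decades = shift_db / slope_db_per_decade         # resulting horizontal shift
    return 10.0 ** decades

def relative_ctr(f_meas, f_ref, slope_db_per_decade=SLOPE_DB_PER_DECADE):
    """Inverse relation: infer CTR / CTR_ref from a measured ring frequency."""
    return (f_meas / f_ref) ** (slope_db_per_decade / 20.0)

ratio = crossover_ratio(0.7, 3.0)
print(f"f(CTR=0.7) / f(CTR=3.0) = {ratio:.3f}")  # 0.483, i.e. roughly 2 to 1
```

With a 40 dB per decade slope the exponent is exactly one half, so the frequency ratio equals the square root of the CTR ratio, sqrt(0.7 / 3.0), or about 0.48.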
This observation suggests the use of resonance measurements as a prognostic indicator in
a regulated SMPS. The relative value of CTR may be tracked as a function of time using
the calculation for the oscillation frequency, and signal averaging or least squares
regression can be used to determine if measurements show a trend toward failure. It has
been demonstrated [8] that the PN junction photodiodes typically used for the optical
emitter exhibit a gradual degradation with time. Such a model can be produced through a
test program using highly accelerated life testing.
The Levenberg-Marquardt (L-M) algorithm has proved to be an effective and popular
method for solving nonlinear least squares problems. It is used in many data fitting
applications. We combine the L-M method for fitting the impulse response data and the
degradation rate model presented in [8] to project the wear out time of the opto-isolator
and the eventual loss of the power supply regulation (see Figure 4). Other factors such as
temperature and load conditions change randomly while the device is in the field
environment, and these parameters may also be factored into the degradation model to
improve the accuracy and confidence of the final RUL calculation.
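As an illustration of the fitting step, the following is a minimal pure-Python Levenberg-Marquardt sketch. It is not the authors' implementation: the damped-sinusoid model form, the synthetic data, the noise level, and the starting guess are all assumptions made here, chosen only so that the example is self-contained (the 26.1 kHz value reuses the ring frequency reported later in the experimental results).

```python
import math
import random

def model(p, t):
    """Damped sinusoid; t in ms, decay in 1/ms, freq in kHz, amp in volts."""
    amp, decay, freq, phase = p
    return amp * math.exp(-decay * t) * math.sin(2.0 * math.pi * freq * t + phase)

def jacobian_row(p, t):
    """Partial derivatives of model() with respect to each parameter."""
    amp, decay, freq, phase = p
    env = math.exp(-decay * t)
    theta = 2.0 * math.pi * freq * t + phase
    s, c = math.sin(theta), math.cos(theta)
    return [env * s, -t * amp * env * s,
            2.0 * math.pi * t * amp * env * c, amp * env * c]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def cost(p, ts, ys):
    """Sum of squared residuals; a wildly wrong trial counts as infinitely bad."""
    try:
        return sum((y - model(p, t)) ** 2 for t, y in zip(ts, ys))
    except OverflowError:
        return float("inf")

def levenberg_marquardt(p0, ts, ys, iters=200, lam=1e-3):
    """Minimize the sum of squared residuals starting from p0."""
    p, best, n = list(p0), cost(p0, ts, ys), len(p0)
    for _ in range(iters):
        jtj = [[0.0] * n for _ in range(n)]
        jtr = [0.0] * n
        for t, y in zip(ts, ys):
            row, r = jacobian_row(p, t), y - model(p, t)
            for i in range(n):
                jtr[i] += row[i] * r
                for j in range(n):
                    jtj[i][j] += row[i] * row[j]
        # Marquardt damping: scale the diagonal of J^T J by (1 + lam).
        damped = [[jtj[i][j] * (1.0 + lam) if i == j else jtj[i][j]
                   for j in range(n)] for i in range(n)]
        trial = [pi + si for pi, si in zip(p, solve(damped, jtr))]
        c = cost(trial, ts, ys)
        if c < best:   # accept the step and move toward Gauss-Newton
            p, best, lam = trial, c, max(lam * 0.1, 1e-9)
        else:          # reject the step and move toward gradient descent
            lam *= 10.0
    return p, best

# Synthetic transient (assumed values): 0.1 V, 15 per ms decay, 26.1 kHz, 1 us sampling.
rng = random.Random(0)
TRUE_PARAMS = [0.1, 15.0, 26.1, 0.3]
ts = [i * 0.001 for i in range(300)]
ys = [model(TRUE_PARAMS, t) + rng.gauss(0.0, 0.002) for t in ts]
fit, sse = levenberg_marquardt([0.08, 10.0, 25.0, 0.0], ts, ys)
print(f"ring frequency = {fit[2]:.2f} kHz, decay = {fit[1]:.2f} per ms, SSE = {sse:.2e}")
```

A production implementation would more likely call an established routine, for example SciPy's `least_squares` with `method="lm"`, rather than hand-rolled linear algebra; the version above only makes the accept/reject damping logic visible.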
Intermittent faults are another issue which this prognostic addresses. The SMPS is not
considered to have actually failed until the coupling value has decreased below the level
at which the power supply can regulate current. Prior to this point, the power supply may
continue to provide well-regulated voltage output but may fail in certain stress conditions.
Once the supply has been returned from the field and tested with a voltmeter in a
laboratory environment, it is likely to pass Re-test Okay (RTOK) or No Trouble Found
(NTF) unless the conditions for test are similar to the field environment. Recreating those
conditions is not always practical or possible.

Figure 4. Non-invasive ring frequency detector to monitor degradation and assess RUL
More interesting than the instantaneous measurement is the change or trend in frequency.
This scheme may be used to periodically poll the power supply and obtain a running
history of the ring frequency measurement as it shifts over time. By relating the ratio of this
frequency to a baseline frequency to the CTR shift, we obtain a progression of CTR
versus time.
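The conversion from a ring-frequency history to a CTR progression, and from there to a remaining-life estimate, can be sketched as follows. The square-law relation comes from the discussion of Figure 3; everything else is an assumption made for illustration: the readings are hypothetical, the failure threshold of 0.2 (relative to the baseline CTR) is hypothetical, and a straight-line trend stands in for the physics-based degradation model, which is not reproduced in this paper.

```python
def relative_ctr(f_ring, f_baseline):
    """CTR relative to its baseline value, from the square-root relation
    between ring frequency and CTR."""
    return (f_ring / f_baseline) ** 2

def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def hours_to_threshold(hours, ring_khz, f_baseline_khz, ctr_threshold_rel):
    """Extrapolate a linear CTR trend to the threshold. Returns the hours
    remaining after the last measurement, or None if CTR is not falling."""
    ctr = [relative_ctr(f, f_baseline_khz) for f in ring_khz]
    slope, intercept = linear_fit(hours, ctr)
    if slope >= 0.0:
        return None
    t_fail = (ctr_threshold_rel - intercept) / slope
    return max(0.0, t_fail - hours[-1])

# Hypothetical polling history (hours in service, measured ring frequency in kHz).
hours = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
ring_khz = [26.1, 25.6, 25.2, 24.6, 24.1]
remaining = hours_to_threshold(hours, ring_khz, f_baseline_khz=26.1, ctr_threshold_rel=0.2)
print(f"estimated remaining useful life: {remaining:.0f} hours")
```

In a fielded system the linear fit would be replaced by the degradation-rate model and the stress conditions discussed above, and the spread of the residuals would feed the confidence estimate.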
Data representing CTR shift can thus be obtained by a non-invasive approach. Assuming
that failure of the opto-isolator is the dominant factor in the failure of the regulation loop,
the health of this component and its remaining life can be determined from a CTR physics
model. The confidence of this prediction is based on a number of factors including
measurement noise, noise on the bus, and ability of the model to incorporate all
significant environmental factors in predicting the component's failure trend.
Nevertheless, the proposed topology provides a significant improvement in the ability to
assess the health of the regulation loop beyond a life model that only considers statistical
Output Capacitor: The primary purpose of the output filter capacitor in an SMPS is to
suppress high frequency noise generated by switching in the DC-DC converter. As a
consequence, the output filter capacitor is subject to continuous current oscillation. The
magnitude of the resultant voltage ripple depends on the Equivalent Series Resistance
(ESR), ambient temperature, output current, and the input voltage of the converter. Stress
can also be applied to the capacitor when a load is removed from the power supply.
Output capacitors fail as a result of high stress electrical bias and/or mechanical failures
such as cracked internal parts, in which the ESR of the capacitor increases. For high
capacitance tantalum or ceramic capacitors, the initial value of ESR is small (usually
< 50 mΩ). A good indication of a capacitor's failure is an ESR in excess of 1 Ω. A common
mode of capacitor failure is an increase in polarization loss in the tantalum or ceramic
dielectric, which is modeled as an increase in the equivalent series resistance. It is a result
of aging due to stress and heat. An undamaged component is expected to have a value
less than 20 mΩ, and this represents the minimum lifetime value. There is a monotonic
increase in ESR with aging [10]. The best way to trend a capacitor's ESR for
precursor-to-failure detection is to monitor the capacitor's ripple voltage. The ripple voltage at the
output load tracks the value of ESR; as the ESR increases, the ripple voltage
increases correspondingly. The primary objective of testing the capacitor is to determine
the rate of degradation by analyzing the ripple voltage and to correlate this rate of
degradation with the change in ripple voltage over time. An increase in ripple voltage is
an indication of an increase in ESR, which is a precursor to failure (see Figure 5).
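A rough way to turn a ripple measurement into an ESR estimate and a health figure is sketched below. It assumes the ripple is dominated by the ESR term, so that the peak-to-peak ripple is approximately the peak-to-peak ripple current times the ESR; the logarithmic health scale and the function names are hypothetical choices made here. The two limits are the values quoted in the text, read as 20 milliohms and 1 ohm.

```python
import math

HEALTHY_ESR_OHM = 0.020  # "undamaged" bound from the text, read as 20 milliohms
FAILED_ESR_OHM = 1.0     # failure indication from the text, read as 1 ohm

def esr_from_ripple(ripple_pp_v, ripple_current_pp_a):
    """ESR estimate assuming the ripple is dominated by the ESR term:
    V_ripple_pp ~= I_ripple_pp * ESR."""
    return ripple_pp_v / ripple_current_pp_a

def capacitor_health(esr_ohm):
    """Map ESR onto 1.0 (healthy) down to 0.0 (failed) on a log scale.
    The log scale is an illustrative choice, not taken from the paper."""
    if esr_ohm <= HEALTHY_ESR_OHM:
        return 1.0
    if esr_ohm >= FAILED_ESR_OHM:
        return 0.0
    return math.log(FAILED_ESR_OHM / esr_ohm) / math.log(FAILED_ESR_OHM / HEALTHY_ESR_OHM)

# Hypothetical reading: 200 mV peak-to-peak ripple with 2 A peak-to-peak ripple current.
esr = esr_from_ripple(0.200, 2.0)
print(f"ESR = {esr * 1000:.0f} milliohm, health = {capacitor_health(esr):.2f}")
```

Because the ripple current also depends on load and input voltage, a practical sensor would compare readings taken under similar operating conditions before trending them.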









Figure 5: Response to ESR (ripple voltage, peak to peak, with the ESR detection range marked)

In the case of the output capacitor sensor, we have developed a generic method that takes
the ripple voltage as input and, through a fault-to-failure progression model, assesses the
capacitor health and a corresponding confidence level.
Experimental Results: An actual power supply has been tested to evaluate the transient
response health monitor strategy. The power supply used is the C&D Technologies
CPCI325. This unit is designed to produce a regulated output of 5VDC for a current up to
10A. The output stage is isolated using an optical coupler. We tested our models for both
the opto-isolator and an output capacitor and we present the results in this section.

Figure 6. Voltage response with L-M theoretical fit (raw data and L-M fit of the impulse response versus time in ms)

Figure 6 shows the voltage response at the output terminals due to a current impulse of 5
amps and 15 microseconds duration. The L-M fit of the theoretical impulse response to
the raw data shows a ring frequency of 26.1 kHz, corresponding to a voltage feedback
isolator with a measured CTR value of 264%. This example shows an essentially undegraded
component in the feedback loop of the power supply. The reader will notice that
instrumentation noise and capacitor ripple on the voltage signal do not impact the fit
significantly; the residual from the least squares calculation is only 4 x 10^-4 volts. This is due
to the fitting optimization by the L-M routine.

Table 1. Results of measured ring frequencies for aged opto-isolators.
Column headings: Stressed at; CTR at 3 mA; ring frequency.
Stress durations as listed: 86, 21, 40, 118, 0, 0, 0, and 1728 hours.
Ring frequencies as listed: 21.9, 7.7, 26.3, 26.7, 26.1, 27.8, and 5.37 kHz.

By comparing the ring frequency for a power supply with an opto-isolator in an aged
condition to the baseline value measured at the start of life, one can evaluate the level of
accumulated damage. We then use the life model for the opto-isolator, along with the
prevailing stress conditions, to project the CTR to the threshold of the operating limit and
extrapolate the time remaining to end of life.
In the case of the output filter capacitor, two different capacitors were tested at two
different voltage levels, 200 mV peak-to-peak and 400 mV peak-to-peak. The ripple
voltage was then measured at different time intervals and plotted over time to yield two
curves as shown in Figure 7. This graph shows data taken for approximately 50 days.
The change in ripple voltage was measured for each capacitor, resulting in two different
curves as seen in Figure 8. This graph suggests that at larger input voltages, the change in ripple voltage
across the capacitor increases, causing the capacitor to reach the end of its life faster.

Figure 7: Lifetime Test Results for Capacitor Ripple Voltage Degradation (Vo_ripple in mV versus time in minutes).

Figure 8: Difference in capacitor ripple voltage (del Vo_ripple in mV versus time in minutes)

Capacitor failure is a result of high stress electrical bias and/or mechanical failure and is
marked by cracked internal parts, due to which the ESR of the capacitor increases.
The best way to trend a capacitor's ESR for precursor-to-failure detection is to monitor
the capacitor's ripple voltage, as shown in Figure 9. We tested three cases: a series
resistance of 0.01 Ω, 0.1 Ω, and 1 Ω. The ripple voltage was measured for each case
and plotted over time. As we can see in Figure 9, the ripple voltage at
the output load tracks the value of ESR: as the ESR increases, the
ripple voltage increases correspondingly.
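The trend across the three test cases can be reproduced with the standard ripple estimate for a buck-type output filter, in which the peak-to-peak ripple is bounded by an ESR term plus a capacitive term. This is a sketch under stated assumptions: the ripple current, switching frequency, and capacitance below are hypothetical values, not those of the tested supply, and the simple sum is an upper bound because the two terms do not peak at the same instant.

```python
def ripple_bound_pp(delta_i_a, esr_ohm, f_sw_hz, c_farad):
    """Upper bound on peak-to-peak output ripple (volts) for a buck-type
    output filter: ESR term plus capacitive term."""
    esr_term = delta_i_a * esr_ohm
    cap_term = delta_i_a / (8.0 * f_sw_hz * c_farad)
    return esr_term + cap_term

# Hypothetical operating point: 1 A ripple current, 200 kHz switching, 470 uF output capacitor.
DELTA_I_A = 1.0
F_SW_HZ = 200e3
C_OUT_F = 470e-6

for esr in (0.01, 0.1, 1.0):  # the three series-resistance cases from the text, in ohms
    ripple_mv = ripple_bound_pp(DELTA_I_A, esr, F_SW_HZ, C_OUT_F) * 1000.0
    print(f"ESR = {esr:5.2f} ohm -> ripple bound = {ripple_mv:7.1f} mV peak-to-peak")
```

Once the ESR exceeds a few milliohms at this operating point, the ESR term dominates, which is why the ripple voltage is a usable proxy for ESR growth.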

Figure 9: Transient waveform of output voltage showing ripple for three values of ESR
Power Supply Health Monitor in a Health Management System: The state diagram
analysis of the regulated power supply, combined with the transient response behavior to
load current variations presented, suggests a prognostic strategy that is both
indirect and non-invasive. The voltage transient waveform captured at the output provides
a means for accurately detecting and quantifying the health of the components within the
regulator feedback loop and other critical components, such as the output capacitor,
without introducing additional connections. This transient can be invoked simply by
providing an impulse stimulus to the input voltage or the output current; either is
available from the external ports. Signal processing of the captured waveform is used to
identify the central frequency, which is then related to the magnitude of the loop gain.
One can then monitor this parameter to observe either sudden changes or trends and apply
a model-based reasoner to derive an RUL estimate by using a model such as the
degradation model presented in [8].
Figure 10 is a block diagram showing one possible implementation of a bus level
prognostic in a vehicle power management system; we call this the Integrated Power
System Health Manager (IPSHM). A regulated switch-mode power supply is connected
to a power distribution system (or power bus) that feeds power to a number of
subsystems. The power bus is controlled by the vehicle power manager. Attached to the
power bus is a System on Chip (SOC) board which holds various power supply
prognostic units or sensor modules, which provide prognostic information to the
Integrated Vehicle Health Manager (IVHM) through a Digital Signal Processor (DSP). At
selected times, the IVHM polls the DSP while the power manager maintains a constant
power loading from the electronic systems supplied by the bus. This prevents power
transients, caused by energizing or de-energizing subsystems, from corrupting the waveform
measurements made by the prognostic units. The power supply, prognostic
units in the SOC, and subsystems can come from one or more vendors. The important
point is that this strategy for prognostics is not only non-invasive, but also
general, independent of vendor, scalable, and cost-effective, and it does not require a specific
design for each implementation.
For example, when requested, the CTR sensor in the opto-isolator block in Figure 10 triggers a load
impulse and captures the resulting transient waveform of the voltage at
the power bus. This signal is then digitized and filtered to extract both the oscillation
frequency and the damping coefficient. These two parameters are used to assess the
health of the optical isolator in the SMPS by using the fault-to-failure progression model
previously outlined and illustrated in Figure 4. In the case of the output capacitor sensor,
we have developed a generic power supply capacitor sensor that takes the ripple voltage
as input and, through a fault-to-failure progression model, assesses the capacitor health
along with a confidence level. Similarly, other sensor modules on the SOC can be used
to monitor the health of components without making any invasive modifications to the
power supply.
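The extraction of oscillation frequency and damping from the digitized transient can be sketched with a lightweight time-domain method suited to a small DSP: zero crossings give the frequency, and the logarithmic ratio of successive peaks gives the decay rate. This is an illustrative alternative to the L-M fit, not the authors' sensor firmware; it assumes the steady-state output level has already been subtracted and the signal has been filtered, and the waveform below is synthetic.

```python
import math

def rising_zero_crossings(t, v):
    """Times at which v crosses zero going upward (linear interpolation)."""
    out = []
    for i in range(1, len(v)):
        if v[i - 1] < 0.0 <= v[i]:
            frac = -v[i - 1] / (v[i] - v[i - 1])
            out.append(t[i - 1] + frac * (t[i] - t[i - 1]))
    return out

def positive_peaks(t, v):
    """(time, value) pairs of the positive local maxima."""
    return [(t[i], v[i]) for i in range(1, len(v) - 1)
            if v[i] > 0.0 and v[i - 1] < v[i] >= v[i + 1]]

def ring_frequency_and_decay(t, v):
    """Return (frequency in Hz, decay rate in 1/s) of a damped ring."""
    crossings = rising_zero_crossings(t, v)
    peaks = positive_peaks(t, v)
    if len(crossings) < 2 or len(peaks) < 2:
        raise ValueError("not enough ringing cycles captured")
    # Average period over all captured cycles.
    freq_hz = (len(crossings) - 1) / (crossings[-1] - crossings[0])
    # Logarithmic decrement between the first and last positive peaks.
    (t0, p0), (t1, p1) = peaks[0], peaks[-1]
    decay_per_s = math.log(p0 / p1) / (t1 - t0)
    return freq_hz, decay_per_s

# Synthetic transient (assumed values): 20 kHz ring, decay 20000 per second, 2 MHz sampling.
DT_S = 0.5e-6
t_s = [i * DT_S for i in range(800)]
v_ac = [0.1 * math.exp(-20000.0 * t) * math.sin(2.0 * math.pi * 20000.0 * t) for t in t_s]
freq_hz, decay_per_s = ring_frequency_and_decay(t_s, v_ac)
print(f"ring frequency = {freq_hz / 1000:.2f} kHz, decay rate = {decay_per_s:.0f} per second")
```

This method is cheap but sensitive to noise near the zero crossings, which is one reason a least-squares fit of the whole waveform is preferable when processing resources allow.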

Figure 10. Integrated Power System Health Manager (IPSHM)

Summary: We have presented a novel approach for monitoring the in-situ health of fault-tolerant
electric power systems during operation. We illustrate our technique by
monitoring the health of an optical coupler, defining an opto-isolator sensor, and
discussing a fault-to-failure progression model. The technique presented is based on an
analysis of the transient response of the closed control loop. This analysis shows us that
the location of the crossover frequency will determine how the voltage at the output
terminals will ring when the system is perturbed. We briefly introduced an output filter
capacitor sensor to demonstrate the feasibility of a power health manager in a System on
Chip that communicates with an integrated vehicle health manager (IVHM).


One useful application of this sensor is to monitor the health of components within the
feedback loop, such as the optical isolator, as well as other critical components for which
we can measure ripple voltage and other digital traces. As these components degrade
through wear out or stress, their status can be observed. Monitoring of trends may provide
an estimation of their remaining useful life.
Our approach is simple to implement and non-invasive. A ring frequency detector is a bus
level prognostic with no internal connections to the power supply. It offers the advantage
of increased reliability at low cost and is applicable to many non-prognostics enabled
power supplies without any retrofit or redesign, so the power supply vendor is not
involved in the enabling of prognostics for its power supply. By monitoring crossover
frequency through a feedback loop and using a fault-to-failure progression model, we
offer a complete prognostics health monitoring system that can be used to predict the
health and remaining useful life of an optical coupler and other critical components such
as the output filter capacitor in a switch-mode power supply.
Acknowledgement: The work presented in this paper was funded by Small Business
Innovation Research contract awards from National Aeronautics and Space
Administration, Ames Research Center, Crew Exploration Vehicle program: Contract No.
NNA06AA22C and Joint Strike Program Contract No. N68335-05-C-0126.
References:
[1] E. Keenan, R. G. Wright, R. Mulligan, and L. V. Kirkland, "Terahertz and laser
imaging for printed circuit board failure detection," in Proc. AUTOTESTCON 2004,
Sept. 2004, pp. 563-569.
[2] A. Lahyani, P. Venet, G. Grellet, and P.-J. Viverge, "Failure prediction of
electrolytic capacitors during operation of a switch mode power supply," IEEE Trans.
Power Electronics, vol. 13, issue 6, pp. 1199-1207, Nov. 1998.
[3] M. Eltabach, A. Charara, and I. Zein, "A comparison of external and internal
methods of signal spectral analysis for broken rotor bars detection in induction
motors," IEEE Trans. Industrial Electronics, vol. 51, issue 1, pp. 107-121, Feb.
[4] K. C. Gross, A. Urmanov, L. G. Votta, S. McMaster, and A. Porter, "Towards
dependability in everyday software using software telemetry," Proc. Engineering of
Autonomic and Autonomous Systems, pp. 9-18, March 2006.
[5] D. Goodman, et al., "Practical Application of PHM/Prognostics to COTS Power
Converters," Aerospace, 2005 IEEE Conference, 5-12 March 2005, Page(s): 3573
[6] J. Judkins, J. Hofmeister, and S. Vohnout, "A Prognostic Sensor for Voltage
Regulated Switch-Mode Power Supplies," IEEEAC Paper #1502, Submission,
Updated December 6, 2006.
[7] R. Erickson, Fundamentals of Power Electronics. Norwell, MA: Kluwer Academic
Publishers, 1999.
[8] R. E. Ziemer, W. H. Tranter, and D. R. Fannin, Signals and Systems: Continuous and
Discrete. New York: Macmillan Publishing Co., 1983.
[9] J. Keller, "Design driven LED degradation model for opto isolators," in Proc. 42nd
Electronic Components and Technology Conference, 1992, pp. 394-398.
[10] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical
Recipes in C. New York: Cambridge University Press, 1988.
Sonia Vohnout is Principal Systems Engineer at Ridgetop Group, Inc. She received her
Bachelor of Science degree in Computer Science from the University of Costa Rica, and
Bachelor of Science and Master of Science degrees in Systems Engineering from the
University of Arizona. In the past she has worked for Modular Mining, IBM and AT&T
Bell Laboratories. She previously owned and operated a manufacturing facility in
Mexico. She is an expert modeler and has extensive expertise in software and computer
systems. Ms. Vohnout has over 20 years of experience in systems engineering, quality
systems management, and business development and management.
Justin Judkins is Director of Research and oversees the research and implementations of
electronic prognostics. His research interests involve applying sensor array technology to
various reasoning engines to provide optimum performance for electronic modules and
systems. He previously held senior-level engineering positions at Bell Labs and Lucent
involving high-reliability telecom transmission. He received a Ph.D. in Electrical
Engineering from the University of Arizona.
James Hofmeister is a Senior Principal Engineer. He has been a software architect,
designer and developer for IBM, a software architect, electronic design engineer,
principal investigator on research topics and co-inventor of electronic prognostics at
Ridgetop Group. He is a former director, representing IBM, of the Southern Arizona
Center for Software Excellence, a co-author on five IBM patents, and a co-author on three
pending Ridgetop patents, two of which have been published by the U.S. patent office.
He retired from IBM after a 30-year career and joined Ridgetop Group in 2003. He has a
BSEE from the University of Hawaii, Manoa Campus, and a MS in Electrical and
Computer Engineering from the University of Arizona.
Ronald Carlsten is a Principal Engineer at Ridgetop responsible for design of prognostic
circuits for switch mode power supplies. He has a BSEE degree from University of New
Mexico, Albuquerque and has taken graduate courses at the University of Arizona. He
has 30 years of experience designing SMPS for C & D Technologies Inc. and IBM
Corporation. He holds 3 patents and has written numerous technical papers.