You are on page 1of 5

Automated FSM Error Correction for Single Event

Upsets

Nand Kumar and Darren Zacher


Mentor Graphics Corporation
nand_kumar{darren_zacher}@mentor.com

Abstract technique of adding Hamming error


checking bits to any encoding style of
FSMs to automatically recover from
This paper presents a technique for
single event upsets (SEUs) within the
automatic error correction of FSMs for
same clock cycle. We shall report on the
single event upsets (SEU). We present a
efficacy of this method to correct single
general technique of adding Hamming
bit errors and an extension to detect two
error checking bits to automatically
bit errors. Area and delay results of
recover from single bit errors within the
using this technique on selected designs
same clock cycle. Experimental results
are also reported.
demonstrate the efficacy of the method.
Section 2 introduces the FSM model and
the effect of single event upsets. Section
1 Introduction 3 describes the use of Hamming error-
checking bits to automatically correct
FPGAs are increasingly being used for single bit errors. The experimental
mission-critical applications in technique is presented in Section 4.
hazardous operating environments Results on selected FSM examples are
(space, military, medical etc.). In these presented in Section 5. Section 6
applications the circuit must be fault- discusses the current status and future
tolerant, especially finite state machines work.
(FSMs), where failures are very hard to
detect [1]. Most single-fault-tolerant 2 FSM Model and Single Event
FSMs are implemented using triple Upsets
module redundancy (TMR) or by using a
single-error-correcting (SEC) code
during state encoding [6, 7]. Typical A finite state machine is defined to be a
instances of SEC employ a minimal 6-tuple [10]:
encoding (binary, Gray, etc.) scheme to
M  ( I , O, S ,  ,  , s0)
force a Hamming [2] distance of three.
Commercially available FPGA synthesis where:
tools [8, 9] only implement error I is a finite nonempty set of inputs; O is
recovery to a specified recovery state. a finite nonempty set of outputs; S is a
They do not implement error correction. finite nonempty set of states;
In this paper, we investigate a general  : I  S  S is the state transition
function;  : I  S  O is the output

Kumar 1 118/MAPLD 2004


function; and s0 is the initial state. Figure been easier to detect the illegal transition
1 shows the hardware model of an FSM. and recover from the illegal state.
An FSM is typically modeled as a set of
state registers to store the state vector, 3 Error Correction
combinational state decode for the next-
state and output logic. We use a Hamming error-correcting
code [1, 2, 3, 10]. To the n encoding bits,
Outputs k parity checking bits are added to form
Inputs an (n + k)-bit code. The location of each
Combinational of the bits in the new code is assigned a
Logic decimal value starting at 1 for the most
significant bit and n + k to the least
significant bit. k parity checks are
performed on selected bits of each
encoding. The result of each parity
State check is recorded as 1 if an error has
Present State
Vector been detected or 0 if no error has been
Next
state
detected.

Clock Reset The number of parity bits k must satisfy


the inequality 2k  n  k  1 . The parity
Figure 1: FSM Model checks are performed such that their
value when an error occurs is equal to
Consider an FSM with five states S0, S1, the decimal value assigned to the
S2, S3, and S4 encoded using a binary or location of the erroneous bit, and is
sequential encoding. The encodings are equal to zero if no error occurs. This
displayed in Figure 2: number is called the position number.
S0 000 The parity checking bits are placed in
S1 001 positions 1, 2, 4, …, 2k-1 such that they
S2 010 are independent of each other and
S3 011 encoded only in terms of the encoding
S4 100 bits. Consider the example in Figure 2,
Figure 2: Example FSM Encoding the original encoding uses three bits
If the FSM state register experienced a (n=3), then k = 3 satisfies the above
single event upset, say in state S0, it inequality, thus three parity bits must be
could place the FSM in states S1, S2, or added to the encoding bits to generate
S4 depending on which bit of the state- the error correcting code.
register experienced the upset. It is hard
to determine if the current state is a The position numbers are shown in
result of a valid state transition or the Table 1. From the table we observe that
result of a single event upset. It is hard to an error in position 1, 3, or 5 should
recover from such upsets because the result in a 1 in the least significant bit
upset actually placed the FSM in a legal (c0) of the position number. Hence, the
state. If the FSM had transitioned from code must be designed to have digits in
S4 to states 101 or 110, it would have position 1, 3, and 5 to have even parity.
If a parity check of these bits shows an

Kumar 2 118/MAPLD 2004


odd parity, the corresponding position 000001. The three parity checks yield
number bit (c0) is set to 1, otherwise to 0. the following position number bits:
Also, from the table we observe that an
error in position 2, 3, or 6 should result 4,5,6 parity check: 0 0 1; parity check is
in the center bit (c1) being set to 1 and an odd; c2 = 1.
error in position 4, 5, 6 should result in 2,3,6 parity check: 0 0 1; parity check is
the most significant bit (c2) being set to odd; c1 = 1.
1. 1,3,5 parity check: 0 0 0; parity check is
Table 1: Position Numbers even; c0 = 0.
Error position Position number The position number c2c1c0 is 110, which
c2 c1 c0 means that the location of the error is in
0 (no error) 0 0 0 position 6. To correct the error, the bit in
1 0 0 1 position 6 is inverted, and the correct
2 0 1 0 encoding 000000 is obtained.
3 0 1 1
4 1 0 0 The error correction outlined in this
5 1 0 1 section has been implemented to handle
6 1 1 0 any encoding style and is automatically
inserted for FSMs.
The parity bits are chosen as follows:
p1 is selected to establish even parity in 4 Experimental Technique
bits 1, 3, 5.
P2 is selected to establish even parity in
bits 2, 3, 6. We collected seven sample FSM designs
P3 is selected to establish even parity in (in VHDL) with varying complexity
bits 4, 5, 6. with regards to number of states, inputs,
and outputs. We added error correction
Table 2: Hamming code for FSM
to the FSMs and wrote out a VHDL
Position 1 2 3 4 5 6 description with the correction logic
p1 p2 e2 p3 e1 e0 (after logic synthesis).
0 0 0 0 0 0
0 1 0 1 0 1 We generated a test-bench with a
1 0 0 1 1 0 thousand random test vectors. We
1 1 0 0 1 1 simulated the original FSM description
1 1 1 0 0 0 and forced one of the state-bits in the
synthesized VHDL to a constant value
Table 2 shows the Hamming code for and simulated it and compared the
the example FSM from Figure 2. outputs of the original and synthesized
designs. The outputs were identical
The error location and correction is because the single bit error was
performed by computing the parity corrected. We conducted the same
checks and determining the position experiment by forcing one of the parity
number. For example, if we assume that bits, we observed the same behavior,
bit zero of the encoding (e0) of state S0 is because the single bit error was
inverted (SEU), the new code is: corrected.

Kumar 3 118/MAPLD 2004


We then forced two state bits (two parity [4] J.F. Meyer, “Fault-tolerant sequential
bits) and re-simulated the synthesized machines,” IEEE Trans. On Computers,
design and now we observed differences Vol. C-20, No. 10, October 1971, pp.
between the original and synthesized 1167 – 1177.
designs. The two bit errors actually
produced functional errors whereas the [5] R Leveugle, “Optimized state
single bit errors were corrected. assignment of single fault tolerant FSMs
based on SEC codes,” Proc. 30th DAC,
The other experiment compared the area 1993, pp. 14 – 18.
and delay results of this technique with
the more common usage of TMRs. [6] C. Bolchini, R. Montandon, F.
Salice, and D. Sciuto, “A State Encoding
5 Results For Self-Checking Finite State
Machines,” Proceedings of the ASP-
Results on a set of example FSMs is DAC '95/CHDL '95/VLSI '95., IFIP
presented in this section. International Conference on Hardware
Description Languages; IFIP
Use Darren’s tables and graphs here. International Conference on Very Large
Scale Integration., Asian and South
6 Discussion Pacific, 1995, pp. 711 – 716.

[7] R. Rochet, R. Leveugle, and G.


We discuss the current status and future Saucier, “Analysis and Comparison of
extensions. Fault Tolerant FSM Architecture Based
on SEC Codes,” Proc. IEEE
References International Workshop on Defect and
Fault Tolerance in VLSI Systems, 1993,
[1] S. Leveugle, L. Martinez, “Design pp. 9 – 16.
methodology of FSMs with intrinsic
fault tolerance and recovery [8] Precision™ Synthesis Reference
capabilities,” IEEE Proc. EuroASIC’92, Manual, Mentor Graphics Corp. 2004.
1992, pp. 201 – 206.
[9] Synplify Pro™ Reference Manual,
[2] R.W. Hamming, “Error Detecting Synplicity Inc. 2003.
and Error Correcting Codes,” The Bell
System Technical Journal, Vol. 29, April [10] Z. Kohavi, Switching and Finite
1950, pp. 147 – 160. Automata Theory, McGraw-Hill Book
Company.
[3] D.B. Armstrong, “A general method
of applying error correction to [11] M. Berg, “A Simplified Approach
synchronous digital systems,” The Bell to Fault Tolerant State Machine Design
System Technical Journal, Vol. 40, No. for Single Event Upsets,” Mentor
2, March 1961, pp. 577 – 593. Graphics Users’ Group User2User
Conference, 2004.

Kumar 4 118/MAPLD 2004


Kumar 5 118/MAPLD 2004

You might also like