You are on page 1of 4

Automated FSM Error Correction for Single Event

Upsets

Nand Kumar and Darren Zacher


Mentor Graphics Corporation
nand_kumar{darren_zacher}@mentor.com

checking bits to any encoding style of


Abstract FSMs to automatically recover from
single event upsets (SEUs) within the
This paper presents a technique for same clock cycle. We shall report on the
automatic error correction of FSMs for efficacy of this method to correct single
single event upsets (SEU). We present a bit errors and an extension to detect two
general technique of adding Hamming bit errors. Area and delay results of
error checking bits to automatically using this technique on selected designs
recover from single bit errors within the are also reported.
same clock cycle. Experimental results
demonstrate the efficacy of the method. Section 2 introduces the FSM model and
the effect of single event upsets. Section
3 describes the use of Hamming error-
1 Introduction checking bits to automatically correct
single bit errors. The experimental
technique is presented in Section 4.
FPGAs are increasingly being used for
Results on selected FSM examples are
mission-critical applications in
presented in Section 5. Section 6
hazardous operating environments
discusses the current status and future
(space, military, medical etc.). In these
work.
applications the circuit must be fault-
tolerant, especially finite state machines
(FSMs), where failures are very hard to
2 FSM Model and Single Event
detect [1]. Most single-fault-tolerant Upsets
FSMs are implemented using triple
module redundancy (TMR) or by using a A finite state machine is defined to be a
single-error-correcting (SEC) code 6-tuple [10]:
during state encoding [6, 7]. Typical
instances of SEC employ a minimal M  ( I , O , S ,  ,  , s0)
encoding (binary, Gray, etc.) scheme to where:
force a Hamming [2] distance of three. I is a finite nonempty set of inputs; O is
Commercially available FPGA synthesis a finite nonempty set of outputs; S is a
tools [8, 9] only implement error finite nonempty set of states;
recovery to a specified recovery state.  : I  S  S is the state transition
They do not implement error correction. function;  : I  S  O is the output
In this paper, we investigate a general function; and s0 is the initial state. Figure
technique of adding Hamming error

Kumar 1 118/MAPLD 2004


1 shows the hardware model of an FSM. 3 Error Correction
An FSM is typically modeled as a set of
state registers to store the state vector, We use a Hamming error-correcting
combinational state decode for the next- code [1, 2, 3, 10]. To the n encoding bits,
state and output logic. k parity checking bits are added to form
an (n + k)-bit code. The location of each
Outputs of the bits in the new code is assigned a
Inputs decimal value starting at 1 for the most
Combinational significant bit and n + k to the least
Logic significant bit. k parity checks are
performed on selected bits of each
encoding. The result of each parity
check is recorded as 1 if an error has
been detected or 0 if no error has been
detected.
State
Present State
Vector Next
state
The number of parity bits k must satisfy
the inequality 2k  n  k  1 . The parity
Clock Reset checks are performed such that their
value when an error occurs is equal to
Figure 1: FSM Model the decimal value assigned to the
location of the erroneous bit, and is
Consider an FSM with five states S0, S1, equal to zero if no error occurs. This
S2, S3, and S4 encoded using a binary or number is called the position number.
sequential encoding. The encodings are The parity checking bits are placed in
displayed in Figure 2: positions 1, 2, 4, …, 2k-1 such that they
S0 000 are independent of each other and
S1 001 encoded only in terms of the encoding
S2 010 bits. Consider the example in Figure 2,
S3 011 the original encoding uses three bits
S4 100 (n=3), then k = 3 satisfies the above
Figure 2: Example FSM Encoding inequality, thus three parity bits must be
If the FSM state register experienced a added to the encoding bits to generate
single event upset, say in state S0, it the error correcting code.
could place the FSM in states S1, S2, or
S4 depending on which bit of the state- The position numbers are shown in
register experienced the upset. It is hard Table 1. From the table we observe that
to determine if the current state is a an error in position 1, 3, or 5 should
result of a valid state transition or the result in a 1 in the least significant bit
result of a single event upset. It is hard to (c0) of the position number. Hence, the
recover from such upsets because the code must be designed to have digits in
upset actually placed the FSM in a legal position 1, 3, and 5 to have even parity.
state. If the FSM had transitioned from If a parity check of these bits shows an
S4 to states 101 or 110, it would have odd parity, the corresponding position
been easier to detect the illegal transition number bit (c0) is set to 1, otherwise to
and recover from the illegal state. 0. Also, from the table we observe that

Kumar 2 118/MAPLD 2004


an error in position 2, 3, or 6 should 4,5,6 parity check: 0 0 1; parity check is
result in the center bit (c1) being set to 1 odd; c2 = 1.
and an error in position 4, 5, 6 should 2,3,6 parity check: 0 0 1; parity check is
result in the most significant bit (c2) odd; c1 = 1.
being set to 1. 1,3,5 parity check: 0 0 0; parity check is
Table 1: Position Numbers even; c0 = 0.
Error position Position number The position number c2c1c0 is 110,
c2 c1 c0 which means that the location of the
0 (no error) 0 0 0 error is in position 6. To correct the
1 0 0 1 error, the bit in position 6 is inverted,
2 0 1 0 and the correct encoding 000000 is
3 0 1 1 obtained.
4 1 0 0
5 1 0 1 The error correction outlined in this
6 1 1 0 section has been implemented to handle
any encoding style and is automatically
The parity bits are chosen as follows: inserted for FSMs.
p1 is selected to establish even parity in
bits 1, 3, 5. 4 Experimental Technique
P2 is selected to establish even parity in
bits 2, 3, 6.
We collected seven sample FSM designs
P3 is selected to establish even parity in
(in VHDL) with varying complexity
bits 4, 5, 6.
with regards to number of states, inputs,
Table 2: Hamming code for FSM and outputs. We added error correction
Position 1 2 3 4 5 6 to the FSMs and wrote out a VHDL
p1 p2 e2 p3 e1 e0 description with the correction logic
0 0 0 0 0 0 (after logic synthesis).
0 1 0 1 0 1
1 0 0 1 1 0 We generated a test-bench with a
1 1 0 0 1 1 thousand random test vectors. We
1 1 1 0 0 0 simulated the original FSM description
and forced one of the state-bits in the
Table 2 shows the Hamming code for synthesized VHDL to a constant value
the example FSM from Figure 2. and simulated it and compared the
outputs of the original and synthesized
The error location and correction is designs. The outputs were identical
performed by computing the parity because the single bit error was
checks and determining the position corrected. We conducted the same
number. For example, if we assume that experiment by forcing one of the parity
bit zero of the encoding (e0) of state S0 is bits, we observed the same behavior,
inverted (SEU), the new code is: because the single bit error was
000001. The three parity checks yield corrected.
the following position number bits:
We then forced two state bits (two parity
bits) and re-simulated the synthesized

Kumar 3 118/MAPLD 2004


design and now we observed differences Vol. C-20, No. 10, October 1971, pp.
between the original and synthesized 1167 – 1177.
designs. The two bit errors actually
produced functional errors whereas the [5] R Leveugle, “Optimized state
single bit errors were corrected. assignment of single fault tolerant FSMs
based on SEC codes,” Proc. 30th DAC,
The other experiment compared the area 1993, pp. 14 – 18.
and delay results of this technique with
the more common usage of TMRs. [6] C. Bolchini, R. Montandon, F.
Salice, and D. Sciuto, “A State Encoding
5 Results For Self-Checking Finite State
Machines,” Proceedings of the ASP-
Results on a set of example FSMs is DAC '95/CHDL '95/VLSI '95., IFIP
presented in this section. International Conference on Hardware
Description Languages; IFIP
Use Darren’s tables and graphs here. International Conference on Very Large
Scale Integration., Asian and South
6 Discussion Pacific, 1995, pp. 711 – 716.

[7] R. Rochet, R. Leveugle, and G.


We discuss the current status and future
Saucier, “Analysis and Comparison of
extensions.
Fault Tolerant FSM Architecture Based
on SEC Codes,” Proc. IEEE
References International Workshop on Defect and
Fault Tolerance in VLSI Systems, 1993,
[1] S. Leveugle, L. Martinez, “Design pp. 9 – 16.
methodology of FSMs with intrinsic
fault tolerance and recovery [8] Precision™ Synthesis Reference
capabilities,” IEEE Proc. EuroASIC’92, Manual, Mentor Graphics Corp. 2004.
1992, pp. 201 – 206.
[9] Synplify Pro™ Reference Manual,
[2] R.W. Hamming, “Error Detecting Synplicity Inc. 2003.
and Error Correcting Codes,” The Bell
System Technical Journal, Vol. 29, April [10] Z. Kohavi, Switching and Finite
1950, pp. 147 – 160. Automata Theory, McGraw-Hill Book
Company.
[3] D.B. Armstrong, “A general method
of applying error correction to [11] M. Berg, “A Simplified Approach
synchronous digital systems,” The Bell to Fault Tolerant State Machine Design
System Technical Journal, Vol. 40, No. for Single Event Upsets,” Mentor
2, March 1961, pp. 577 – 593. Graphics Users’ Group User2User
Conference, 2004.
[4] J.F. Meyer, “Fault-tolerant sequential
machines,” IEEE Trans. On Computers,

Kumar 4 118/MAPLD 2004

You might also like