932 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 3, MARCH 2016

Low-Cost Multiple Bit Upset Correction in SRAM-Based FPGA Configuration Frames
Mojtaba Ebrahimi, Parthasarathy Murali B. Rao, Razi Seyyedi, and Mehdi B. Tahoori, Senior Member, IEEE

Abstract— Radiation-induced multiple bit upsets (MBUs) are a major reliability concern in nanoscale technology nodes. Occurrence of such errors in the configuration frames of a field-programmable gate array (FPGA) device permanently affects the functionality of the mapped design. Periodic configuration scrubbing combined with a low-cost error correction scheme is an efficient approach to avoid such a permanent effect. Existing techniques employ error correction codes with considerably high overhead to mitigate MBUs in configuration frames. In this paper, we present a low-cost error-detection code to detect MBUs in configuration frames as well as a generic scrubbing scheme to reconstruct the erroneous configuration frame based on the concept of erasure codes. The proposed scheme does not require any modification to the FPGA architecture. Implementation of the proposed scheme on a Xilinx Virtex-6 FPGA device shows that the proposed scheme can detect 100% of MBUs in the configuration frames with only 3.3% resource occupation, while the recovery time is comparable with the previous schemes.

Index Terms— FPGA, multiple bit upsets, reliability, soft errors.

Manuscript received February 5, 2015; accepted April 10, 2015. Date of publication May 8, 2015; date of current version February 23, 2016. This work was supported by the German Research Foundation within the National Focal Program through the Dependable Embedded Systems.
The authors are with the Chair of Dependable and Nano Computing, Karlsruhe Institute of Technology, Karlsruhe 76131, Germany (e-mail: mojtaba.ebrahimi@kit.edu; Parthasarathy.rao@kit.edu; seyyedi@ira.uka.de; mehdi.tahoori@kit.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2015.2425653
1063-8210 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

SRAM-BASED field-programmable gate arrays (FPGAs) are widely used in a variety of application domains due to short time-to-market, flexibility, high density, and cost-efficiency. However, increasing transistor count per chip (i.e., Moore's law) coupled with the reduced operating voltage in recent years has resulted in an exponential growth in the soft error rate (SER) of digital circuits [1], [2]. Considering the proliferation of FPGAs in a variety of safety- and mission-critical applications, it is crucial to mitigate their susceptibility to these kinds of errors [3].

In order to meet the ever-increasing performance and power demands, FPGAs are typically fabricated using the most advanced technology nodes. Recently, FPGAs based on a 14-nm technology with denser integration schemes, such as 3-D die stacking, have been announced [4], [5]. In such small device geometries, a single radiation-induced particle strike is likely to affect several adjacent cells in a memory array, leading to a multiple bit upset (MBU) [6], [7]. Considering the fact that the MBU rate at nanoscale nodes is comparable with the single event upset (SEU) rate [8], [9], an appropriate scheme is required to detect and correct multiple errors in memory arrays. More specifically, SRAM-based FPGAs are more prone to soft errors, as a particle strike in a configuration frame¹ has a permanent impact on the functionality of the mapped design [10]. Since the configuration frames constitute the majority of SRAMs in an FPGA device (e.g., >80% for Xilinx Virtex-6 VLX240T), mitigation of MBUs in configuration frames is of decisive importance.

¹ A configuration frame is the smallest addressable chunk of a configuration memory.

Several schemes have been presented to address the increasing soft error concern in the FPGA configuration frames. The main objective of these schemes is to reduce error latency, and hence, to avoid error accumulation within configuration frames. Costly modular redundancy [11]–[13] is a conventional technique to tolerate soft errors in both configuration frames and functional logic. However, accumulated errors in both data and configuration bits dramatically limit the mean time to failure of such schemes [14]. Another technique is to optimize the configuration frame circuitry for soft errors as detailed in [15] and [16]. However, such hardening techniques are not implemented in the existing FPGA devices because of their excessive area overheads. Therefore, a low-cost solution is required to correct erroneous configuration frames during operation. The combination of configuration scrubbing and error correction codes (ECCs) is an effective solution to detect and correct radiation-induced transient errors in configuration bits. In this regard, SEU-tolerant scrubbing schemes [17]–[20] are well studied in the literature and are already included in Xilinx and Altera design flows [21], [22].

There are also a few schemes to specifically address MBUs in FPGA devices. The scheme proposed in [23] employs the 2-D Hamming code in each configuration frame to correct MBUs. In [24], another scheme based on interleaved single error correction Hamming code has been presented that could correct up to four adjacent error bits. In addition, Xilinx offers a soft error mitigation controller (IP block) based on cyclic redundancy code (CRC) and ECC which can correct up to two adjacent erroneous cells in each configuration frame [25]. Recent experiments reveal that the number of bits affected by an MBU incident grows as technology scales [8]. Therefore, a more complicated ECC with a very high area overhead is required to efficiently address the increase in the MBU incidence rate.

Based on the fact that error detection can be done at much lower cost than error correction [26], we propose the MBU detection technique to detect the erroneous configuration frame during configuration scrubbing. Once an error is detected, by taking advantage of erasure codes and using the data stored in a redundant frame, the correct contents of the affected configuration frame are reconstructed. For detecting MBUs in the configuration frames of the FPGA, we propose a low-cost technique, namely, interleaved n-dimensional (InD) parity. The interleaving distance of this technique is optimized based on the actual MBU patterns and their respective probabilities obtained from a detailed technology-dependent analysis. Furthermore, by carefully dividing the frames into several clusters, the proposed technique can detect and correct an MBU affecting several adjacent configuration frames.
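The detect-then-reconstruct idea outlined above can be sketched with the simplest erasure code: one XOR (parity) redundant frame per group of frames, which is the parity-based code selected later in Section IV-A. The sketch below is only an illustration in software, not the hardware implementation of this paper; the `Frame` representation, the function names, and the toy cluster contents are assumptions introduced here.

```python
from functools import reduce
from operator import xor
from typing import List

# Assumption: a frame is modeled as a list of 32-bit words.
Frame = List[int]


def xor_frames(frames: List[Frame]) -> Frame:
    """Word-wise XOR of equally sized frames."""
    return [reduce(xor, words) for words in zip(*frames)]


def make_redundant_frame(cluster: List[Frame]) -> Frame:
    """Encoding: the redundant frame stores the parity of all data frames."""
    return xor_frames(cluster)


def reconstruct(cluster: List[Frame], redundant: Frame, erased: int) -> Frame:
    """Decoding: rebuild frame `erased` from the remaining frames plus the redundant frame."""
    survivors = [f for k, f in enumerate(cluster) if k != erased]
    return xor_frames(survivors + [redundant])


# Toy cluster: 3 frames of 4 words each (a Virtex-6 frame has 81 words).
cluster = [
    [0x11111111, 0x00000000, 0xDEADBEEF, 0x00000001],
    [0x22222222, 0xFFFFFFFF, 0x00000000, 0x00000002],
    [0x33333333, 0x12345678, 0xCAFEF00D, 0x00000003],
]
redundant = make_redundant_frame(cluster)
golden = list(cluster[1])

cluster[1][2] ^= 0b1111  # simulate a 4-bit MBU inside frame 1
# Once detection flags frame 1, it is treated as erased and rebuilt as a whole.
recovered = reconstruct(cluster, redundant, erased=1)
```

Because the whole frame is rebuilt, the number and position of the flipped bits inside the flagged frame do not matter; only the identity of the erroneous frame has to be known.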
For the case study on a Xilinx FPGA device, the proposed InD parity technique is able to detect 100% of the soft errors, and all detected errors are successfully recovered by means of erasure codes. Our proposed scheme occupies only 3.3% of logic blocks and 0.9% of block RAMs (BRAMs)² out of the available resources in the employed FPGA device. The results also reveal that our proposed scheme provides the highest recovery coverage³ with considerably less area overhead compared with the existing solutions at the expense of a negligible increase in the mean time to repair (MTTR)⁴ (3.75%).

² BRAMs are prewired ASIC-like RAMs available to logic fabric through read/write ports.
³ Recovery coverage is the percentage of the errors that can be successfully detected and recovered to the original state.
⁴ Mean time to repair is the average time required to detect and correct an error.

A preliminary version of this paper is published in [27]. In this paper, we generalize the proposed solution to achieve 100% coverage of MBU patterns (MBUP). This technique imposes a minimal area overhead of 0.3% with respect to the interleaved 2-D (I2D) parity technique [27], while the recovery time is exactly the same.

The remainder of this paper is organized as follows. The preliminaries are explained in Section II. The proposed MBU detection technique and erasure-code-based error recovery scheme are described in Sections III and IV, respectively. The implementation details and the experimental results for a Virtex-6 FPGA device are provided in Section V. Finally, the conclusion is given in Section VI.

II. BACKGROUND

A. FPGA Configuration Frames

The configuration memory of FPGAs is organized into configuration frames that are the smallest addressable units and constitute the majority of SRAM cells in FPGAs. The number and the size of the configuration frames vary from one device to another. For example, in the Xilinx Virtex-6 VLX240T device, which is employed as a case study in this paper, there are 28 464 configuration frames, each comprising 81 32-bit words (total of 72 049 kbit), whereas there are only 461 36-kbit BRAMs. Therefore, for this particular device, 81.28% of the total SRAM cells belong to the configuration frames.

For each mapped design, the FPGA device utilizes only a subset of all the available configuration frames, which are known as sensitive frames. Any errors occurring in these sensitive frames of the device might lead to a system malfunction.

B. MBU Patterns

In order to fairly quantify the MBU correction capability of the proposed scheme, we need to have detailed information about the possible MBU patterns and their respective occurrence probabilities. In this regard, a 3-D-TCAD-based neutron particle strike simulation is conducted by employing a commercial soft error assessment tool [28]. The SPICE netlist and the memory layout as well as the radiation environment information are provided as inputs to the tool to compute the distribution of generated current pulses for each cell according to a nuclear database. Afterward, the SEU and MBU rates are extracted by injecting the obtained current pulses in the SPICE netlist. Using this commercial tool, we have acquired the occurrence probabilities of neutron-induced MBU patterns in the terrestrial environment on an SRAM memory designed for a 45-nm technology. In this experiment, the neutron energy distribution is described according to the JEDEC JESD89A standard [29]. Furthermore, the secondary particle reactions that occur when neutrons interact with the atoms in the CMOS structure are modeled according to a nuclear database (CEA, France).

Fig. 1. MBUs in a 45-nm SRAM array. (a) MBU size distribution. (b) Some of the MBU patterns with high occurrence probabilities.

Fig. 1 shows the MBU size distribution as well as MBU patterns with high occurrence probabilities. As shown, almost half of the SRAM soft errors in the employed 45-nm technology are MBU incidents. In this experiment, the largest MBU size observed affects 24 bits. For smaller technology nodes, this ratio and the size of the largest MBU pattern grow further [8]. This clearly demonstrates the importance of protecting configuration frames against MBU incidents.

C. Erasure Codes

Our proposed error correction scheme is based on the concept of erasure codes [30], [31]. An optimal erasure code
is a data-recovery technique, which transforms m blocks into m + n blocks such that the original m blocks can be reconstructed from any arbitrary set of m blocks among the m + n coded blocks (shown in Fig. 2). This data-recovery technique is widely implemented in reliable storage devices (hard disks and CDs) [32], error correction in cache arrays [33], [34], multimedia multicasting, and signal transfer protocols [35].

Fig. 2. Encoding and decoding of erasure codes.

A variety of erasure codes with different recovery coverage, area overhead, and encoding/decoding complexity are proposed in [30] and [31]. The area overhead of an erasure code is expressed as the ratio of redundant blocks to the data blocks (i.e., n/m). The recovery coverage of an erasure code is defined as the maximum number of erasures that could be tolerated. For the optimal erasure codes, the recovery coverage is equal to the number of redundant blocks (i.e., n).

Erasure codes are not intended to detect or correct errors, but rather to retrieve the original blocks when a subset of blocks is not available (i.e., erased). However, once an error is detected in some blocks, by assuming that those blocks are not available, the original blocks could be recovered by means of an erasure code.

III. LOW-COST MBU DETECTION

The main idea of this paper is to exploit erasure codes to recover the contents of the erroneous configuration frame. Nevertheless, since erasure codes cannot detect errors, an effective detection technique is required as well. Therefore, each configuration frame has to be equipped with a low-cost error detection code. A scrubber unit periodically investigates the configuration frames for possible errors. Once an error is detected, by assuming that the erroneous frame is erased, its contents are recovered using an erasure code.

Considering the fact that the entire configuration frame could be recovered using an erasure code, the identification of the exact location of erroneous bits is not of interest; rather, a low-cost error detection technique with a very high detection coverage is required. In this regard, we present a very efficient error detection coding technique called InD parity for MBU detection in the configuration frames. The error detection capability of this technique in the configuration frame for two particular cases where N = 2 and N = 3 is investigated in detail.

A. InD Parity

In memory arrays of typical microprocessors such as cache units, the parity bits employed for error detection in each memory entry (i.e., word) are computed during each memory access. Hence, from the performance perspective, it is crucial that each memory entry has its own error detection coding. Otherwise, during each memory access, all other entries that are involved in the computation of the corresponding common parity bit(s) must be accessed as well. However, the errors in FPGAs are detected during the periodic scrubbing where the error checking unit can access the entire contents of a configuration frame. Therefore, a parity bit could be computed based on multiple entries.

In the proposed error detection technique, we exploit the fact that the sizes of large MBU patterns are typically much smaller than the size of a configuration frame. Since an MBU incident affects several bits in a localized manner, the bits that are located far enough apart cannot be simultaneously affected by one MBU incident. Therefore, having separate parities for such bits in configuration frames neither increases the error detection capability nor improves the performance; it only imposes unnecessary area overhead.

In order to increase the cost-efficiency of error detection for the configuration frames, we introduce the idea of InD parity. The main idea is to use the same parity bit for the bits that are separated by a constant distance to minimize the area overhead (i.e., interleaved parity). In addition, the parity bits are distributed over several dimensions to increase the detection coverage with respect to probable MBU patterns, as there are some MBU patterns that cannot be detected using one or two dimensions, no matter how many parity bits are employed. The number of parity bits in each dimension theoretically has to be at most equal to the largest MBU spread on that dimension. However, in practice, large MBU patterns are typically detected using the parity bits on the other dimensions. Consequently, the number of parity bits required by this technique is always smaller than this theoretical limit.

It should be mentioned that the proposed InD parity technique is conceptually different from the existing error detection techniques used for an interleaved memory such as cache units [36], [37]. In the latter case, each word has its own error detection code and, in addition, the memory words are physically interleaved to reduce the probability of having more than one erroneous bit in each word for an MBU incident. However, this does not reduce the area overhead of the error detection technique. In contrast, the proposed InD parity coding is a virtual interleaving technique, which does not require any changes to the configuration frame structure. In addition, in the case of a large configuration frame, the number of redundant bits could be drastically reduced in comparison with the complete interleaved parity coding.

In order to clearly show the detection capability of the proposed technique, the organization of parity bits as well as the error detection capability of the proposed I2D and interleaved 3-D (I3D) parity schemes are detailed first, and then the generalization of this technique to higher dimensions is discussed.

B. I2D Parity

In the complete (traditional) 2-D parity, a parity bit is associated with each row (column), which is constructed by XORing all the bits in that particular row (column). In the I2D parity technique, each horizontal (vertical) parity bit is the
XOR of the bits in multiple rows (columns). More specifically, all rows (columns) that are separated by a constant distance of h (v) form an interleaving group, and all the bits within that interleaving group are covered by only one horizontal (vertical) parity bit. For an f × g memory array, the complete 2-D parity has f + g parity bits, while I2D parity requires only h + v parity bits. In addition, the number of XOR operations in both schemes is of the same order. These are formulated in Table I.

TABLE I
PARITY COMPUTATION FOR COMPLETE 2-D AND I2D

Fig. 3 shows an example of I2D coding with the horizontal and the vertical interleaving distance of 3 and 4, respectively. In this example, the first, second, and third horizontal parity bits are the parity of all bits belonging to rows {1, 4, 7}, {2, 5, 8}, and {3, 6}, respectively.

Fig. 3. Example of I2D with vertical and horizontal distance of 4 and 3, respectively.

Both the horizontal and the vertical interleaving distance directly affect the detection coverage of the I2D parity technique. The only case in which I2D cannot detect an MBU incident is when the MBU affects an even number of bits involved in the computation of a certain parity bit while it is not detected by other parity bits. However, most of such cases are detectable by the complete 2-D parity technique. Fig. 4 shows three examples of such MBU patterns that cannot be detected by I2D with the horizontal and vertical distance of 3 and 4, respectively.

Fig. 4. Examples of MBU patterns for comparison of detection capability of 2-D and I2D (vertical and horizontal distance of 4 and 3, respectively). (a) I2D: vertical 4 and horizontal 3. (b) Not detected by I2D; detectable by 2D. (c) Detected by both. (d) Not detected by both.

Since the size of MBU patterns (in the worst case, 24 bits are affected in our experiments) is considerably smaller than the size of a configuration frame (e.g., each configuration frame of a Virtex-6 device comprises 81 × 32 bits), if the horizontal and the vertical interleaving distances are large enough, I2D can provide exactly the same MBU detection coverage as a complete 2-D parity technique. This is because the erroneous bits would be covered by at least one interleaving group. Therefore, the horizontal and the vertical interleaving distance have to be determined according to the possible MBU patterns.

The impact of the horizontal and the vertical interleaving distance on the error detection coverage is described in Table II. The results presented in this table are extracted by analyzing each 45-nm MBU pattern (Section II-B) and investigating the detection capability of I2D with a given interleaving distance for that pattern. As shown in Table II, the error detection coverage grows by increasing the interleaving distances and saturates for the vertical and horizontal distance of 5. This detection coverage is exactly the same as that of the complete 2-D parity for these MBU patterns; however, our I2D technique needs only 10 parity bits for error detection, whereas the complete 2-D parity requires 113 parity bits for each Xilinx Virtex-6 configuration frame.

TABLE II
ERROR DETECTION COVERAGE VERSUS INTERLEAVING DISTANCE FOR I2D PARITY

C. I3D Parity

In general, there might be some MBU patterns which affect an even number of cells in each row and column. This kind of pattern cannot be detected by the I2D parity technique since a parity bit can only detect an odd number of erroneous bits. Examples of MBU patterns, in this technology, with significant occurrence probabilities that cannot be detected by either I2D or the traditional 2-D parity technique are shown in Fig. 5. In order to detect such MBUs, a more powerful MBU detection technique is required. In this regard, we propose the I3D parity technique that has an additional set of parity bits for diagonals.

Fig. 5. Three major MBU patterns that cannot be detected by either I2D or traditional 2-D parity and their occurrence probability.
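The I2D rule of Section III-B and its blind spot for patterns with an even number of flipped cells per interleaving group can be illustrated with a short sketch. This is a software illustration only, not the scrubber hardware; the function names, the 0-based indexing, and the test frame contents are assumptions introduced here. Rows whose indices are congruent modulo h share one horizontal parity bit, and columns congruent modulo v share one vertical parity bit.

```python
from typing import List, Sequence, Tuple

Bits = List[List[int]]  # frame as rows of 0/1 bits (assumed representation)


def i2d_parity(frame: Bits, h: int, v: int) -> Tuple[List[int], List[int]]:
    """Return (h horizontal, v vertical) interleaved parity bits of a frame."""
    horiz = [0] * h
    vert = [0] * v
    for r, row in enumerate(frame):
        for c, bit in enumerate(row):
            horiz[r % h] ^= bit  # rows r, r+h, r+2h, ... share one parity bit
            vert[c % v] ^= bit   # columns c, c+v, c+2v, ... share one parity bit
    return horiz, vert


def flip(frame: Bits, cells: Sequence[Tuple[int, int]]) -> Bits:
    """Return a copy of the frame with the given (row, col) cells inverted."""
    out = [row[:] for row in frame]
    for r, c in cells:
        out[r][c] ^= 1
    return out


def detected(golden: Bits, observed: Bits, h: int, v: int) -> bool:
    """An upset is detected if any stored parity bit differs from the recomputed one."""
    return i2d_parity(golden, h, v) != i2d_parity(observed, h, v)


ROWS, COLS = 81, 32  # Virtex-6 frame: 81 words of 32 bits
frame = [[(r * 7 + c * 3) % 2 for c in range(COLS)] for r in range(ROWS)]

H, V = 3, 4  # horizontal and vertical interleaving distances of the Fig. 3 example
single = [(10, 5)]
square = [(10, 5), (10, 6), (11, 5), (11, 6)]  # even count in every row and column
aliased = [(10, 5), (13, 9)]  # both cells fall into the same row and column groups

single_hit = detected(frame, flip(frame, single), H, V)
square_hit = detected(frame, flip(frame, square), H, V)
aliased_hit = detected(frame, flip(frame, aliased), H, V)
```

With these inputs only the single upset is flagged. The 2 × 2 square also escapes complete 2-D parity, which motivates the extra dimension, while the `aliased` pair is missed only because of interleaving and would be caught by complete 2-D parity.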
Computation of the horizontal and the vertical parity bits in I3D parity follows the same rules explained for I2D parity. Similar to the interleaving technique employed in I2D parity to reduce the number of horizontal and vertical parity bits, in I3D parity, several interleaving groups are formed for the diagonals as well. Then for each group, only one parity bit is computed. In order to uniformly distribute d diagonal parity bits for a configuration frame of size g × f, the interleaving group for the bit position (i, j) could be computed by [i + f · (j − 1)] mod d. An example of the I3D parity technique with the vertical, horizontal, and diagonal interleaving distance of 4, 3, and 5, respectively, is shown in Fig. 6.

Fig. 6. Example of I3D with vertical, horizontal, and diagonal distance of 4, 3, and 5, respectively.

The error detection coverage of I3D parity for different horizontal, vertical, and diagonal interleaving distances is reported in Table III. As can be seen, the maximum error detection coverage of this coding technique is 100% and could be achieved with at least 10 parity bits (h = 5, v = 2, d = 3). Although the number of parity bits is much less than the complete 2-D parity (i.e., 113), this detection technique is able to detect all MBUs.

TABLE III
INTERLEAVING DISTANCE VERSUS ERROR DETECTION COVERAGE FOR I3D PARITY

D. Comparison of I2D and I3D

The number of parity bits required by I2D to reach its maximum detection coverage is equal to that of I3D (both require 10 parity bits). However, the error detection coverage of I2D is less than that of I3D. The maximum error detection coverages for various numbers of parity bits employed in the I2D and I3D parity techniques are shown in Fig. 7. This is obtained by comparing the detection coverage of all possible cases that result in the desired number of parity bits. For instance, for four parity bits in I2D parity, we need to compare the detection coverage for five cases, which are (v = 4, h = 0), (v = 3, h = 1), (v = 2, h = 2), (v = 1, h = 3), and (v = 0, h = 4). As can be seen in this figure, the I3D parity always provides higher detection coverage than I2D parity when the number of parity bits is more than two. Although the I3D parity technique provides more detection coverage with the same number of bits, it occupies more logical resources on the FPGA. This is because more parity bits are required to be stored for the third dimension (diagonal parity) and the complexity of the controller unit increases for the generation of such additional parity bits.

Fig. 7. Maximum detection coverage obtained by I2D and I3D parity for different number of parity bits.

E. Extension to n Dimensions

In the smaller technology nodes, the sizes of MBU patterns grow; hence, a larger interleaving distance or even more parity dimensions might be required. However, it should be noted that this growth is not linear with the maximum size of MBU patterns. For example, in the MBU patterns of the investigated 45-nm technology, there are MBU patterns with up to 24 affected bits (maximum affected rows and columns of 11 and 8, respectively). However, such large MBU patterns can be detected by at least one of the horizontal, vertical, and diagonal parity bits, which serves the purpose of this paper. As mentioned earlier, our intention is not to locate and correct each erroneous bit in a configuration frame, but rather to only detect the existence of the error in a configuration frame and mark it as erased, i.e., the granularity of error detection is a configuration frame, not an individual bit. Thus, even for smaller technology nodes, the interleaving distance will not increase drastically.

While a larger interleaving distance might increase the detection coverage at the expense of some additional parity bits, an additional dimension for the virtual interleaving could reduce the number of required parity bits. For example, the I2D scheme reaches detection coverage of 99.301% with ten parity bits, while the I3D scheme only requires seven parity bits to surpass this detection coverage (e.g., h = 2, v = 2, and d = 3 provides 99.659%). In such a situation, it is important to consider the tradeoff between the area saved by reducing the number of parity bits and the increase in area overhead due to the complexity of the controller. This is because in some cases, the additional dimension could significantly increase the complexity of the controller.
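To make the effect of an additional dimension concrete, the diagonal grouping rule [i + f · (j − 1)] mod d given for I3D parity can be sketched as follows. The sketch works on the set of flipped cells only, which is sufficient because parity is linear. It assumes that i is the 1-based bit position within a line of f bits and j is the 1-based line index, which the text does not state explicitly; the function names and the example pattern are likewise hypothetical.

```python
from typing import Iterable, List, Tuple


def diagonal_group(i: int, j: int, f: int, d: int) -> int:
    """Diagonal interleaving group of bit (i, j): [i + f * (j - 1)] mod d."""
    return (i + f * (j - 1)) % d


def i3d_syndrome(
    flips: Iterable[Tuple[int, int]], h: int, v: int, d: int, f: int
) -> Tuple[List[int], List[int], List[int]]:
    """Parity changes caused by a set of flipped cells (1-based (i, j) positions)."""
    horiz, vert, diag = [0] * h, [0] * v, [0] * d
    for i, j in flips:
        horiz[(j - 1) % h] ^= 1  # line (row) interleaving group
        vert[(i - 1) % v] ^= 1   # column interleaving group
        diag[diagonal_group(i, j, f, d)] ^= 1
    return horiz, vert, diag


F = 32  # bits per line of a Virtex-6 frame (81 lines of 32 bits)

# A 2 x 2 square flips an even number of cells in every row and column,
# so the horizontal and vertical parity bits stay silent.
square = [(6, 11), (7, 11), (6, 12), (7, 12)]
horiz, vert, diag = i3d_syndrome(square, h=3, v=4, d=5, f=F)
```

For this pattern the horizontal and vertical syndromes are all zero, whereas the four cells fall into four different diagonal groups, so the diagonal parity flags the frame as erroneous.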
The increase in the number of parity dimensions can also increase the detection coverage. In some cases, there are some patterns that cannot be theoretically detected using existing parity dimensions, no matter what the interleaving distances are. Several examples of such patterns are already shown in Fig. 5 for the I2D parity technique. In such cases, an additional dimension is inevitable; however, it is very important how the additional parity dimension is organized to increase the detection coverage without incurring a significant area overhead in the controller. In this paper, we employed very simple horizontal, vertical, and diagonal parity bits; however, the parity bits in additional dimensions could have a more complicated structure to detect large MBU patterns.

The flow for finding the appropriate number of dimensions and the associated parity bits takes the set of MBU patterns and a constraint on the maximum number of parity bits (#P) as inputs and provides the dimensions and the parity organization within each dimension as the output. The following steps are done in this flow.
1) Initialization: A set of undetected patterns (UP) is initialized with the set of all MBU patterns (MBUP), i.e., UP ← MBUP.
2) Adding a New Dimension: The objective of each new dimension is to maximize detection of patterns in UP; hence, it has to maximize the number of bits from each pattern in UP appearing in different parity groups. This is because when an even number of bits of an MBU pattern appear in the same parity group, it cannot be detected by that parity. In this regard, iteration is performed over all possible parity grouping options to find the parity organization which provides the highest detection capability.
3) Distributing Parity Bits: Once a new dimension is added, the distribution of parity budget over all the existing dimensions is revisited to maximize the coverage. To achieve this, all combinations of distributing parity bits among different dimensions are exhaustively investigated.
4) Updating UP: The UP is set to be equal to the set of MBU patterns that could not be detected using the combination which provides the highest detection coverage.
5) Termination Condition: If the number of parity bits is equal to #P or there is a solution to detect all patterns (UP = Ø), terminate the process; otherwise go to step 2 to add a new dimension.

The above process maximizes error detection coverage for a given parity budget. In order to minimize the number of parity bits, i.e., finding the minimum budget for a given coverage, this process could be repeated for various numbers of parity bits to find the minimum amount which provides the desired detection coverage.

IV. RECOVERY BASED ON ERASURE CODES

For error detection, we propose to implement a scrubber unit that periodically checks the parity bits of the configuration frames for possible errors. Upon detection of an error, by assuming that the erroneous frame is erased, its contents are recovered using an erasure code.

The proposed scheme does not require any modification to the architecture of the existing FPGAs. It can be mapped as a soft module alongside the user design into the FPGA. This way, the scrubber can work in parallel with the user application.

A. Effective Erasure Code for Error Recovery in Configuration Frames

There are plenty of erasure codes with different characteristics in the literature. An effective erasure code should be selected in the context of the error recovery for the configuration frames. Since the soft error occurrence rate is relatively small and scrubbing is continuously performed to detect and recover possible errors,⁵ it is unlikely to have multiple erroneous configuration frames in each scrubbing iteration. In addition, the encoding time of the erasure code is not a major issue in this context as it is done only once in advance during the design mapping to the FPGA device. In contrast, a high decoding time prolongs the error recovery process. Therefore, an erasure code with one redundant block and short decoding time satisfies our requirements.

⁵ The overall SER of all configuration frames of the Xilinx Virtex-6 VLX240T device is estimated with [38] to be 5.11 errors per 10^5 h while the scrubbing iteration for the same device is ∼18.7 ms.

The parity-based erasure code [30] is an optimal erasure code with one redundant block (n = 1; an m + 1 erasure code); hence, it can recover from all cases with one erased block. In this technique, the redundant block stores the parity of data in all other blocks [Fig. 8(a)]. In case of an erasure, the contents of the erased block could be simply computed by the parity of the other m blocks [Fig. 8(b)].

Fig. 8. Encoding and decoding of parity-based erasure codes. (a) Data blocks + parity. (b) Recovery of erased block.

The factor m determines the tradeoff between the area overhead and the encoding/decoding time. If all configuration frames in a large device are protected with only one erasure block, the error recovery time would be considerably high as
all frames have to be read to compute the contents of the erased configuration frame. In order to reduce the error recovery time, FPGA frames could be divided into several clusters, each of which has its own redundant block (see Fig. 9). In summary, a smaller m shortens the encoding/decoding time at the expense of higher area overhead.

Fig. 9. Configuration frame protection using the proposed erasure code-based approach.

It is worth mentioning that, during the recovery process, the entire affected configuration frame is reconstructed, which leads to MBU correction of that particular configuration frame. While erasure codes can completely reconstruct an erroneous configuration frame which has multiple errors, they fail when MBUs span across several configuration frames. In order to avoid such cases, frames for different clusters are appointed

parity bits of different frames from one cluster very close together because an MBU might affect parity bits of several frames in that cluster. In such a scenario, affected bits cannot be recovered as the number of erroneous frames is more than the available erasure blocks (only one erasure block for each cluster). A similar issue also exists for storing error-detection data and the erasure block of the same cluster, i.e., if an MBU affects both the erasure block of a cluster as well as the parity bits of one frame in the same cluster, the recovery is not possible.

Considering these important issues, it is essential to interleave the redundant data in a way that for each cluster at most either error detection parity bits of one frame or the redundant erasure block could be affected. Hence, the size of the clusters has to be larger than one, to facilitate an interleaving distance of at least one among data from each cluster.

C. Error Detection and Recovery Unit

The error detection and recovery (EDR) unit periodically performs configuration scrubbing to detect possible erroneous frames. This unit reads a configuration frame word by word and performs the InD error detection for that frame. Upon detection of an error, this unit restores the contents of the erroneous frame according to the contents of the remaining frames in the same cluster. It is worth mentioning that the EDR unit not only scrubs the configuration frames, but also scrubs the redundant erasure blocks stored in BRAMs for detecting possible errors. Otherwise, an accumulated error in a redundant block prevents reconstruction of the correct data when a configuration frame from the same cluster becomes erroneous in the future.

The scrubbing rate is determined by the desired level of reliability as well as the radiation-induced SER. For example, SER at an altitude of 40 000 feet is ∼500× higher than that of the terrestrial environment [3]; hence, a higher scrubbing rate
in a way that the frames from one cluster are not physically is required for avionic systems to provide the same level of
adjacent in the FPGA device. Therefore, in case an MBU reliability.
affects adjacent frames, it affects only one frame from each Soft errors in the configuration frames or the functional
cluster which could be recovered based on the redundant block logic of the EDR unit might affect its functionality. Therefore,
in the same cluster. it is essential to protect this unit against this kind of errors.
The proposed technique is also applicable to FPGAs with In order to address this, we protect this unit using a Triple
partial reconfiguration capability. The partial reconfiguration Modular Redundancy (TMR) implementation. Since the size
is the ability to dynamically modify a subset of configuration of the EDR unit is relatively small (it occupies <1% of all the
frames by downloading their contents, while the remaining resources in a Xilinx Virtex-6 FPGA device), the area overhead
logic continues to operate without interruption. In such imposed by triplication (2%) is not significant. An important
designs, in each reconfiguration phase, in addition to the point to note is that since the EDR periodically scrubs the
configuration frames, the corresponding erasure frames have entire configuration frames of the device, it can also self-heal
to be updated as well. the soft errors occurring in the configuration frames pertaining
to its own circuitry.
Once an error is detected in a configuration frame, the
B. Storing Redundant Data following steps are required to be performed by the EDR unit
The proposed scheme generates error detection codes for recovering the correct contents of that frame.
(InD parity bits) for each configuration frame and also a 1) The EDR unit creates a temporary block and initializes
redundant erasure block for each cluster. These additional data all its bits with zero.
have to be stored in BRAMs of the FPGA device or in the 2) The EDR unit determines the erroneous configuration
spare bits of the configuration frames. In case the parity bits frame cluster and reads all the configuration frames
are stored in BRAMs, one important concern is not to store of the erroneous cluster one by one by excluding the
EBRAHIMI et al.: LOW-COST MBU CORRECTION IN SRAM-BASED FPGA CONFIGURATION FRAMES 939

erroneous configuration frame. The recovery unit


computes the XOR of all bits with their corresponding
bits in the temporary block and replaces the new value
in the temporary block. Once all frames in the cluster
are processed, the temporary block contains the parity
of all error-free frames in the same cluster.
3) The temporary block is written into the erroneous frame
in the FPGA device.
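The recovery steps above, together with the per-cluster redundant block, can be modeled in a few lines of software. The following is only an illustrative Python sketch, not the Verilog EDR unit described in this paper. The round-robin mapping in `cluster_of`, all helper names, and the choice to XOR the redundant erasure block into the temporary block along with the error-free frames are assumptions made for the sketch.

```python
import random

WORDS_PER_FRAME = 81  # 81 32-bit words per frame, as in the Virtex-6 case study


def cluster_of(frame_index, num_clusters):
    # Assumed mapping: round-robin, so frames with consecutive indices
    # (taken here as physically adjacent) fall into different clusters.
    return frame_index % num_clusters


def xor_into(acc, frame):
    # XOR every word of `frame` into the accumulator block, in place.
    for i, word in enumerate(frame):
        acc[i] ^= word


def build_erasure_blocks(frames, num_clusters):
    # One redundant block per cluster: the bitwise parity (XOR) of its frames.
    blocks = [[0] * WORDS_PER_FRAME for _ in range(num_clusters)]
    for idx, frame in enumerate(frames):
        xor_into(blocks[cluster_of(idx, num_clusters)], frame)
    return blocks


def recover_frame(frames, erasure_blocks, bad_index, num_clusters):
    cluster = cluster_of(bad_index, num_clusters)
    temp = [0] * WORDS_PER_FRAME                  # step 1: zeroed temporary block
    for idx, frame in enumerate(frames):          # step 2: XOR error-free frames
        if idx != bad_index and cluster_of(idx, num_clusters) == cluster:
            xor_into(temp, frame)
    xor_into(temp, erasure_blocks[cluster])       # assumed: redundant block is XORed too
    frames[bad_index] = temp                      # step 3: write back
    return temp


# Small demonstration with hypothetical sizes (12 frames, 3 clusters).
rng = random.Random(0)
demo_frames = [[rng.getrandbits(32) for _ in range(WORDS_PER_FRAME)]
               for _ in range(12)]
demo_blocks = build_erasure_blocks(demo_frames, 3)
golden = list(demo_frames[7])
demo_frames[7][3] ^= 0b1011          # inject a multi-bit upset in one word
demo_frames[7][4] ^= 0x80000001      # and in a neighbouring word
recover_frame(demo_frames, demo_blocks, 7, 3)
assert demo_frames[7] == golden
```

Because the whole frame is rebuilt from the other members of its cluster, the number of flipped bits inside the erroneous frame does not matter in this model; only the number of erroneous frames per cluster does.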
Since the likelihood of having two erroneous frames from the same cluster in one scrubbing iteration is negligible,⁶ such cases could be safely ignored. If this probability were considerable, multiple erroneous configuration frames could be easily detected during the recovery phase. In this phase, once the first error is detected, the reconstruction of the erroneous frame is performed by XORing the contents of all other frames in the cluster. By investigating the parity bits of each frame during the reconstruction of the first erroneous frame, the presence of an error in other frames is detected. For correcting more than one erroneous frame in the same cluster, the contents of erasure frames can be either loaded from an external memory or corrected by an erasure code with a higher number of erasure frames.

The objective of this scheme is to address the permanent effect of radiation-induced soft errors only in the configuration frames. However, errors might still affect the functionality of the mapped design from the error occurrence time until the recovery operation is performed. Therefore, a rollback (e.g., checkpointing) [39], [40] or rollforward (e.g., TMR) [40], [41] error correction technique is also required to mask errors propagated to the mapped design during this time interval.

⁶ The SER of an SRAM cell is 6.93 × 10⁻¹³ per h in 45-nm technology [38]. For a configuration size of 81 × 32, cluster size of 50, and scrubbing rate of 18.7 ms, the probability of having one error in one cluster is 2.33 × 10⁻¹¹, while the probability of having more than one error in the same cluster in one scrubbing period is 2.66 × 10⁻²².

V. CASE STUDY: IMPLEMENTATION ON A XILINX FPGA DEVICE

As a proof of concept, we implemented the proposed technique on a Xilinx Virtex-6 VLX240T device. In this section, we explain the implementation flow validated by a fault injection experiment. Then, we discuss the recovery time and area overhead tradeoff in detail and quantitatively compare the proposed scheme with the existing solutions.

A. Implementation Flow

Xilinx FPGAs offer several interfaces to read and modify the configuration frames [42]. Internal configuration access port (ICAP) is one of the fastest interfaces which operates at the frequency of 100 MHz. In this paper, this interface has been used as it is within the FPGA fabric and could be controlled by a custom hardware block.

We wrote a reconfigurable implementation of the EDR unit in Verilog which accepts the overall number of frames and the number of clusters as input. This implementation is completely independent of the user design and needs to be merged with the latter before mapping into the FPGA device.

Fig. 10. Implementation flow of the proposed scheme to a Xilinx FPGA device.

The implementation flow of our proposed MBU mitigation scheme is described in Fig. 10. In the first step, the EDR unit and the user design are given to Xilinx ISE to build the initial bitstream. Once the initial bitstream is ready, the InD parity codes for the configuration frames are obtained by extracting the configuration frame data from the bitstream. At the same time, considering contents of all frames in each cluster, the redundant erasure block for that cluster is computed. Afterward, the bitstream is updated by mapping the redundant erasure blocks into the idle BRAM units and parity bits into the spare bits of the configuration frame. Then, the updated bitstream is programmed into the FPGA.

B. Validation With Fault Injection

In order to validate the EDR capabilities of the proposed technique, we performed a fault injection experiment on the configuration bits of the Xilinx Virtex-6 VLX240T device. The fault injection is done by reading a randomly selected configuration frame using the ICAP interface. Afterward, a random MBUP is applied to a random location of the frame and eventually the modified contents are written back to the configuration frame using the same interface. We have injected 1000 errors on the configuration frames and initiated one iteration of scrubbing. At the end of the scrubbing period, we read the contents of the erroneous frame and compared its contents with the golden state before fault injection. The experimental result shows that the proposed technique is able to detect and correct all injected errors.

C. Area and Power Overhead

In the employed Xilinx Virtex-6 VLX240T device, there are 28 464 configuration frames, each comprising 81 32-bit words. In addition, each configuration frame has 16 spare bits for the Xilinx SEU correction mechanism [43]. As mentioned in Section III, I3D parity requires 10 bits to reach its maximum detection coverage (i.e., 100%) and these bits could be stored in these 16 spare bits of the corresponding frame without any other storage mechanism such as BRAMs. Please note that the complete 3-D parity scheme requires 2 × (81 + 32) = 226 bits for each configuration frame which has to be stored in a BRAM unit.

While the I3D parity bits can be stored in the spare bits of the same configuration frame, the redundant erasure blocks have to be stored in FPGA BRAMs. This is because a redundant erasure block occupies the same size of a configuration
frame (81 × 32 + 16), and the number of erasure blocks is linear in the number of employed clusters. Therefore, our scheme requires a separate storage medium such as BRAMs for storing erasure blocks. There are 416 BRAM units with a size of 36 kbit available in the employed Virtex-6 VLX240T device. In our proposed scheme, some of these BRAM units can be exploited for storing the erasure blocks. Fig. 11 shows the number of BRAM units required for storing erasure blocks for different numbers of clusters. According to the figure, it is evident that even for 50 clusters, which can significantly reduce the recovery time, the number of BRAMs required to store the redundant erasure blocks is only four. This is <1% of the total BRAMs in the employed device. The entire EDR unit with built-in TMR and I3D detection technique occupies 1219 (3.3%) Configurable Logic Blocks in the employed Xilinx Virtex-6 VLX240T device.

Fig. 11. Error detection and correction time versus number of required BRAMs for different cluster sizes.

We also obtained the power consumption overhead due to the usage of the EDR unit alongside the user application. It is worth mentioning that the EDR unit scrubs all configuration frames (whether used or not) for errors because an error in an unused configuration frame may cause a short circuit and damage the device. As a result, the power consumption of this unit is independent of the mapped design size. However, the relative power overhead of the EDR unit shrinks as the size of the mapped design grows, because the dynamic power consumption of the design increases. Considering almost constant power consumption for the EDR unit, the maximum overhead could be obtained with respect to the case that no design is loaded to the FPGA device. The experiment carried out using the Xilinx XPWR tool shows that the power of the employed FPGA device without any design loaded is 2.87 W, while mapping only the EDR unit to this FPGA increases the power consumption to 3.02 W, which shows a power overhead of 5.2%. This small power overhead for the case where no design is loaded is due to high static power consumption of the FPGA device. Furthermore, an experiment has been carried out for a subset of ITC’99 benchmark circuits mapped to this FPGA device, together with the EDR unit. The results show that the power consumption of the FPGA device increases on average only by 4.9% for applying the proposed erasure-code-based protection scheme.

D. Detection and Recovery Time

The EDR unit periodically scrubs the entire configuration frames of the FPGA device through the ICAP interface. The FPGA device employed in this paper has 28 464 configuration frames which have to be scrubbed by the EDR unit. Our experiments reveal that in case there is just one cluster employed for the entire device, the error recovery time from an erroneous frame is ∼18.7 ms. However, as shown in Fig. 11, by increasing the number of clusters, the recovery time drastically reduces at the expense of a linear increase in the BRAM area.

The error detection time in the proposed scheme is a function of the scrubbing time (the read-back time of entire configuration frames). The worst case scenario occurs when an MBU occurs in a frame right after it is scrubbed. In this case, the error remains in that frame for one complete scrubbing period (18.7 ms). However, the average MBU detection time is half of this period. Both the worst case and the average error detection time are strongly dependent on the number of clusters; however, since the number of clusters is negligible compared with the number of configuration frames, the slight increase in these factors is not distinguishable in Fig. 11.

E. Comparison With the Previous Work

In order to clearly demonstrate the benefits and drawbacks of the proposed scheme, it is extensively compared with state-of-the-art solutions. The error recovery coverage of each scheme is obtained with respect to the MBU patterns obtained for a 45-nm technology (Fig. 1) considering the error correction capability of that particular scheme.

The detectability of each MBUP can be deterministically obtained for the 2-D Hamming code [23] and Xilinx SEU correction [21]. However, Xilinx CRC + ECC [25] can probabilistically detect specific patterns; hence, we performed a simulation-based MBU injection experiment to extract error detection probabilities for these schemes.

Xilinx has implemented three different schemes for soft error detection and correction in the configuration frames. The first scheme is based on a Hamming code of distance 3 and is able to correct SEUs [21]. The second one utilizes a combination of CRC and ECC and is able to correct two adjacent cells in one word as well [25]. These two schemes are able to only correct 51.72% and 61.10% of the soft errors, respectively. The third scheme exploits CRC for error detection; however, it cannot identify the erroneous configuration frames. In this approach, after error detection, the contents of all configuration frames are reloaded from an external nonvolatile storage. Although this technique can successfully detect and correct all MBUs, it has a long recovery time as it reloads the contents of all frames. In addition, the external storage imposes some additional costs to the system.

The MBU mitigation scheme presented in [23] exploits a 2-D Hamming code to correct MBUs. In the employed FPGA device which has a frame size of 81 × 32, horizontal and vertical Hamming codes need 81 × 6 and 32 × 7 bits, respectively, which means a total of 710 Hamming bits for each frame. In this scheme, for protecting all configuration frames more than 19.27 Mbit memory is required which could be translated into 519 BRAM units. Although this
scheme provides a very high error coverage, the employed FPGA device has only 416 BRAMs and there are not enough BRAM units to store all of the required Hamming bits. Furthermore, this scheme cannot correct some of the MBUs in the Hamming data, which significantly reduces its correction coverage compared with our proposed scheme.

TABLE IV
COMPARISON OF DIFFERENT CONFIGURATION FRAME SOFT ERROR MITIGATION SCHEMES

Our proposed scheme with I2D parity detection and 50 clusters needs only four BRAM units and is able to correct 99.30% of the soft errors. By employing the proposed I3D parity technique, the error correction probability increases to 100%. As explained earlier, for both detection techniques, the parity bits could be stored in the redundant bits of the corresponding frame, and hence, do not impose any additional resource overhead. The proposed scheme only occupies 1% of the available BRAMs for storing erasure frames and is a good candidate to be used alongside very large designs. Besides the area overhead for storing the error detection and correction data, all schemes have a common requirement for performing the scrubbing. Since it is a common requirement among all schemes, it is not reported in Table IV.

As reported in Table IV, the error detection time of all the schemes is almost the same. Indeed, all the schemes read the entire configuration frames using the ICAP interface and perform the error checking in parallel with reading frames; hence, the type of coding scheme does not affect the timing. The slight increase in the error detection time of our proposed scheme is due to the time required for scrubbing 50 additional erasure blocks employed for clusters. The error recovery time of the previous schemes is negligible as the error correction data are already loaded to the scrubber unit. Thus, the only added time would be due to the writing of corrected contents to the erroneous frame. In contrast, our proposed scheme requires some additional time to read all frames in the affected cluster and compute the erased frame contents. However, the sum of the error detection and the error recovery time determines the MTTR, which is the average time required by the system to return to its normal operation. The MTTR of previous schemes is 9.343 ms while our scheme has an MTTR of 9.694 ms. The 3.75% overhead in the MTTR is reasonable considering the high error recovery coverage and the low area overhead of the proposed scheme.

Although the Xilinx CRC + Reload scheme can provide 100% recovery coverage, it requires an additional external nonvolatile memory for storing contents of configuration memory. Our proposed technique eliminates the need for such an external memory. On the other hand, it significantly reduces MTTR compared with that scheme.

An important point is that although these schemes remove the permanent effect of soft errors from configuration frames, the errors could affect the mapped design functionality until the end of the recovery operation. Since all these schemes have the same behavior from this perspective, this is not included in the comparison.

VI. CONCLUSION

Radiation-induced MBUs are a serious reliability concern in nanoscale technology nodes. Aggressive transistor downscaling and emerging dense integration schemes make FPGAs prone to MBUs. The configuration frames are the most vulnerable resources on the FPGA fabric to soft errors as they constitute the majority of the FPGA memory bits and once affected by soft errors, permanently change the functionality of the mapped design.

In this paper, we presented a cost-efficient scheme based on erasure codes for MBU detection and correction in the configuration frames of SRAM-based FPGAs. This scheme is implemented as a generic soft core alongside the user design and does not require any changes to the existing FPGA architecture. Compared with the previous solutions, our scheme provides the highest level of MBU protection at very low costs with a negligible recovery time. The implementation results reveal that the proposed scheme occupies only 1% of memory and 3% of logic resource on Xilinx Virtex-6 device. Furthermore, the error correction latency is also very small (0.35 ms for 50 clusters). These results confirm that the proposed scheme is a practical solution for MBU mitigation in FPGA configuration frames.

REFERENCES

[1] A. Dixit and A. Wood, “The impact of new technology on soft error rates,” in Proc. IEEE Int. Rel. Phys. Symp. (IRPS), Apr. 2011, pp. 5B.4.1–5B.4.7.
[2] H. Kaul, M. Anders, S. Hsu, A. Agarwal, R. Krishnamurthy, and S. Borkar, “Near-threshold voltage (NTV) design: Opportunities and challenges,” in Proc. 49th Annu. Design Autom. Conf. (DAC), 2012, pp. 1153–1158.
[3] C. Hu and S. Zain, “NSEU mitigation in avionics applications,” Xilinx Corporation, San Jose, CA, USA, Appl. Note XAPP1073, 2011, pp. 1–12.
[4] P. Dorsey, “Xilinx stacked silicon interconnect technology delivers breakthrough FPGA capacity, bandwidth, and power efficiency,” Xilinx Corporation, San Jose, CA, USA, White Paper Virtex-7 FPGAs, 2010, pp. 1–10.
[5] Altera Corporation, “Meeting the performance and power imperative of the zettabyte era with generation 10,” Altera Corporation, San Jose, CA, USA, White Paper WP-01200-1.0, 2013.
[6] N. Seifert et al., “Soft error susceptibilities of 22 nm tri-gate devices,” IEEE Trans. Nucl. Sci., vol. 59, no. 6, pp. 2666–2673, Dec. 2012.
[7] M. Ebrahimi, H. Asadi, and M. B. Tahoori, “A layout-based approach for multiple event transient analysis,” in Proc. 50th Annu. Design Autom. Conf. (DAC), 2013, pp. 1–6.
[8] E. Ibe, H. Taniguchi, Y. Yahagi, K. Shimbo, and T. Toba, “Impact of scaling on neutron-induced soft error in SRAMs from a 250 nm to a 22 nm design rule,” IEEE Trans. Electron Devices, vol. 57, no. 7, pp. 1527–1538, Jul. 2010.
[9] D. Radaelli, H. Puchner, S. Wong, and S. Daniel, “Investigation of multi-bit upsets in a 150 nm technology SRAM device,” IEEE Trans. Nucl. Sci., vol. 52, no. 6, pp. 2433–2437, Dec. 2005.
[10] F. Siegle, T. Vladimirova, J. Ilstad, and O. Emam, “Mitigation of radiation effects in SRAM-based FPGAs for space applications,” ACM Comput. Surv., vol. 47, no. 2, 2015, Art. ID 37.
[11] C. Carmichael, “Triple module redundancy design techniques for Virtex FPGAs,” Xilinx Corporation, San Jose, CA, USA, Appl. Note XAPP197, 2001.
[12] Y. Ichinomiya, S. Tanoue, M. Amagasaki, M. Iida, M. Kuga, and T. Sueyoshi, “Improving the robustness of a softcore processor against SEUs by using TMR and partial reconfiguration,” in Proc. 18th IEEE Annu. Int. Symp. Field-Program. Custom Comput. Mach. (FCCM), May 2010, pp. 47–54.
[13] B. Pratt, M. Caffrey, J. F. Carroll, P. Graham, K. Morgan, and M. Wirthlin, “Fine-grain SEU mitigation for FPGAs using partial TMR,” IEEE Trans. Nucl. Sci., vol. 55, no. 4, pp. 2274–2280, Aug. 2008.
[14] M. Ebrahimi, S. G. Miremadi, H. Asadi, and M. Fazeli, “Low-cost scan-chain-based technique to recover multiple errors in TMR systems,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 8, pp. 1454–1468, Aug. 2013.
[15] S. Srinivasan, A. Gayasen, N. Vijaykrishnan, M. Kandemir, Y. Xie, and M. J. Irwin, “Improving soft-error tolerance of FPGA configuration bits,” in Proc. IEEE/ACM Int. Conf. Comput. Aided Design (ICCAD), Nov. 2004, pp. 107–110.
[16] B. S. Gill, C. Papachristou, and F. G. Wolff, “A new asymmetric SRAM cell to reduce soft errors and leakage power in FPGA,” in Proc. Design, Autom., Test Eur. Conf. Exhibit. (DATE), Apr. 2007, pp. 1–6.
[17] M. Berg et al., “Effectiveness of internal versus external SEU scrubbing mitigation strategies in a Xilinx FPGA: Design, test, and analysis,” IEEE Trans. Nucl. Sci., vol. 55, no. 4, pp. 2259–2266, Aug. 2008.
[18] A. Sari, M. Psarakis, and D. Gizopoulos, “Combining checkpointing and scrubbing in FPGA-based real-time systems,” in Proc. IEEE 31st VLSI Test Symp. (VTS), Apr./May 2013, pp. 1–6.
[19] A. Sari and M. Psarakis, “Scrubbing-based SEU mitigation approach for systems-on-programmable-chips,” in Proc. Int. Conf. Field-Program. Technol. (FPT), Dec. 2011, pp. 1–8.
[20] G. A. Vera, S. Ardalan, X. Yao, and K. Avery, “Fast local scrubbing for field-programmable gate array’s configuration memory,” J. Aerosp. Inf. Syst., vol. 10, no. 3, pp. 144–153, 2013.
[21] L. Jones, “Single event upset (SEU) detection and correction using Virtex-4 devices,” Xilinx Corporation, San Jose, CA, USA, Appl. Note XAPP714, 2007.
[22] Altera Corporation, “Enhancing robust SEU mitigation with 28-nm FPGAs,” Altera Corporation, San Jose, CA, USA, White Paper WP-01135-1.0, 2010.
[23] S. P. Park, D. Lee, and K. Roy, “Soft-error-resilient FPGAs using built-in 2-D Hamming product code,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, pp. 248–256, Feb. 2012.
[24] M. Lanuzza, P. Zicari, F. Frustaci, S. Perri, and P. Corsonello, “A self-hosting configuration management system to mitigate the impact of radiation-induced multi-bit upsets in SRAM-based FPGAs,” in Proc. IEEE Int. Symp. Ind. Electron. (ISIE), Jul. 2010, pp. 1989–1994.
[25] Xilinx Corporation, “LogiCORE IP soft error mitigation controller v3.4,” Xilinx Corporation, San Jose, CA, USA, Product Guide PG036, 2012.
[26] R. H. Morelos-Zaragoza, The Art of Error Correcting Coding. New York, NY, USA: Wiley, 2006.
[27] P. M. B. Rao, M. Ebrahimi, R. Seyyedi, and M. B. Tahoori, “Protecting SRAM-based FPGAs against multiple bit upsets using erasure codes,” in Proc. 51st ACM/EDAC/IEEE Design Autom. Conf. (DAC), Jun. 2014, pp. 1–6.
[28] E. Costenaro, D. Alexandrescu, K. Belhaddad, and M. Nicolaidis, “A practical approach to single event transient analysis for highly complex design,” J. Electron. Test., vol. 29, no. 3, pp. 301–315, 2013.
[29] JEDEC89C Standard, document JEDEC89C. [Online]. Available: http://www.jedec.org/standards-documents, accessed Apr. 2015.
[30] J. S. Plank, “Erasure codes for storage applications,” in Proc. 4th Usenix Conf. File Storage Technol., 2005, pp. 1–74.
[31] L. Rizzo, “Effective erasure codes for reliable computer communication protocols,” ACM SIGCOMM Comput. Commun. Rev., vol. 27, no. 2, pp. 24–36, 1997.
[32] J. S. Plank and M. G. Thomason, “A practical analysis of low-density parity-check erasure codes for wide-area storage applications,” in Proc. IEEE Int. Conf. Dependable Syst. Netw., Jun./Jul. 2004, pp. 115–124.
[33] A. BanaiyanMofrad, M. Ebrahimi, F. Oboril, M. B. Tahoori, and N. Dutt, “Protecting caches against multiple bit upsets using embedded erasure coding,” in Proc. Eur. Test Symp. (ETS), 2014.
[34] J. Kim, N. Hardavellas, K. Mai, B. Falsafi, and J. C. Hoe, “Multi-bit error tolerant caches using two-dimensional error coding,” in Proc. 40th Annu. IEEE/ACM Int. Symp. Microarchitecture, Dec. 2007, pp. 197–209.
[35] J. M. Park, E. K. P. Chong, and H. J. Siegel, “Efficient multicast stream authentication using erasure codes,” ACM Trans. Inf. Syst. Secur., vol. 6, no. 2, pp. 258–285, 2003.
[36] S. Baeg, S. Wen, and R. Wong, “SRAM interleaving distance selection with a soft error failure model,” IEEE Trans. Nucl. Sci., vol. 56, no. 4, pp. 2111–2118, Aug. 2009.
[37] M. Ebrahimi, A. Evans, M. B. Tahoori, R. Seyyedi, E. Costenaro, and D. Alexandrescu, “Comprehensive analysis of alpha and neutron particle-induced soft errors in an embedded processor at nanoscales,” in Proc. Design, Autom., Test Eur. Conf. Exhibit. (DATE), Mar. 2014, pp. 1–6.
[38] D. Alexandrescu, “A comprehensive soft error analysis methodology for SoCs/ASICs memory instances,” in Proc. 17th Int. On-Line Test. Symp. (IOLTS), Jul. 2011, pp. 175–176.
[39] K. Rupnow, W. Fu, and K. Compton, “Block, drop or roll(back): Alternative preemption methods for RH multi-tasking,” in Proc. 17th IEEE Symp. Field Program. Custom Comput. Mach. (FCCM), Apr. 2009, pp. 63–70.
[40] D. K. Pradhan and N. H. Vaidya, “Roll-forward and rollback recovery: Performance-reliability trade-off,” in 24th Int. Symp. Fault-Tolerant Comput. (FTCS), Dig. Papers, Jun. 1994, pp. 186–195.
[41] M. Ebrahimi, S. G. Miremadi, and H. Asadi, “ScTMR: A scan chain-based error recovery technique for TMR systems in safety-critical applications,” in Proc. Design, Autom., Test Eur. Conf. Exhibit. (DATE), Mar. 2011, pp. 1–4.
[42] Xilinx Corporation, “Virtex-6 FPGA configuration user guide,” Xilinx Corporation, San Jose, CA, USA, User Guide UG360 (V3.6), 2013.
[43] K. Chapman, “SEU strategies for Virtex-5 devices,” Xilinx Corporation, San Jose, CA, USA, Appl. Note XAPP864 (v2.0), 2010.

Mojtaba Ebrahimi received the B.Sc. degree in computer engineering from Shahed University, Tehran, Iran, in 2008, and the M.Sc. degree in computer engineering from Sharif University, Tehran, in 2010. He is currently pursuing the Ph.D. degree with the Chair of Dependable and Nano Computing, Karlsruhe Institute of Technology, Karlsruhe, Germany.
He was a Research Assistant with the Dependable System Laboratory, Sharif University, from 2010 to 2011. His current research interests include soft error rate estimation of microprocessors and selective protection techniques.

Parthasarathy Murali B. Rao received the bachelor’s degree in electrical engineering from Bharathiar University, Coimbatore, India, and the M.Sc. degree in electrical engineering from Linköping University, Linköping, Sweden.
He was a Research Assistant with the Chair of Dependable Nano Computing, Karlsruhe Institute of Technology, Karlsruhe, Germany, from 2012 to 2014. His current research interests include reconfigurable systems, reliability issues in field-programmable gate array, and multicore systems.

Razi Seyyedi received the B.Sc. degree in computer engineering from Shahed University, Tehran, Iran, in 2011, and the M.Sc. degree in computer engineering from Bonn University, Bonn, Germany, in 2014. He performed the master’s thesis with the Chair of Dependable and Nano Computing, Karlsruhe Institute of Technology, Karlsruhe, Germany, under the supervision of Prof. Tahoori. His current research interests include computer architecture and reliability.

Mehdi B. Tahoori (S’02–M’04–SM’08) received the B.S. degree in computer engineering from the Sharif University of Technology, Tehran, Iran, in 2000, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, USA, in 2002 and 2003, respectively.
He was an Assistant Professor with the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA, in 2003, where he became an Associate Professor in 2009.
He was a Research Scientist with the Fujitsu
Laboratories of America, Sunnyvale, CA, USA, from 2002 to 2003, where he
was involved in advanced computer-aided research, including reliability issues
in deep-submicrometer mixed-signal VLSI designs. He is currently a Full
Professor and the Chair of Dependable Nano Computing with the Department
of Computer Science, Institute of Computer Science and Engineering,
Karlsruhe Institute of Technology, Karlsruhe, Germany. He has authored
over 140 publications in major journals and conference proceedings
on a wide range of topics, from dependable computing and emerging
nanotechnologies to system biology. He holds five pending and granted
U.S. and international patents. His current research interests include
nanocomputing, reliable computing, VLSI testing, reconfigurable computing,
emerging nanotechnologies, and systems biology.
Prof. Tahoori was a Program Committee Member, and a Workshop, Panel,
and Special Session Organizer of various conferences and symposia in VLSI
testing, reliability, and emerging nanotechnologies, such as the Conference
on Information, Telecommunication and Computing, the International
Conference on Computer-Aided Design, Design, Automation and Test in
Europe Conference, European Test Symposium, the International Conference
on Intelligent Computing, Communication and Devices, the Asia and South
Pacific Design Automation Conference, Great Lakes Symposium on VLSI,
and VLSI Design Conference. He was a recipient of the National Science
Foundation Early Faculty Development Award. He is an Associate Editor of
the ACM Journal of Emerging Technologies for Computing. He is also the
Chair of the ACM SIGDA Technical Committee on Test and Reliability.
