Microelectronics Reliability
journal homepage: www.elsevier.com/locate/microrel
ARTICLE INFO

Keywords:
Heavy ions
Irradiation
Hardened
Single event upset

ABSTRACT

This paper addresses soft error mitigation for Static Random-Access Memory (SRAM)-based Field Programmable Gate Array (FPGA) systems in radiation environments, with the aim of reducing malfunction and system failure rates in space missions. Seven representative circuits are designed using the logic resources of an SRAM-based FPGA. The accelerator tests show that the clock distribution of Triple Modular Redundancy (TMR) circuits is vital to achieving a high Single Event Upset (SEU) tolerance. The separated DTMR_NEW circuits are proposed to overcome the weakness of conventional TMR circuits, and a 25× improvement in SEU tolerance is verified for them. The statistical estimations help engineers assess and enhance the SEU tolerance of their designs at earlier stages, reducing development time and cost.
* Corresponding author
E-mail address: caichang@fudan.edu.cn (C. Cai).
https://doi.org/10.1016/j.microrel.2021.114340
Received 21 May 2021; Received in revised form 14 July 2021; Accepted 15 August 2021
0026-2714/© 2021 Elsevier Ltd. All rights reserved.
Please cite this article as: Chang Cai, Microelectronics Reliability, https://doi.org/10.1016/j.microrel.2021.114340
C. Cai et al. Microelectronics Reliability xxx (xxxx) xxx
multiple error mitigated chains. Thus, we reexamined the TMR efficiency and quantified the ratios of SEU improvement of the proposed TMR circuits through a series of ground-based heavy ion experiments. We believe that the layout and routing methods for the TMR chains are useful to FPGA users designing high-reliability systems, and help promote space applications of advanced commercial FPGAs.

The organization of this paper is as follows. Section 2 describes the philosophy and methodology of the designed hardening circuits. Section 3 details the testing methods and evaluation platforms. Section 4 presents the experimental results and related discussion of the SEU tolerance of the different circuits. Section 5 concludes the paper.

2. Circuits design

Three essential factors affect the SEU tolerance of D-type flip-flop (DFF) chains: the inherent critical charge of the DFF cell, the method of cell distribution, and the utilization of peripheral buffers. Thus, the special circuits implemented in the FPGA must account for resource configuration, cell distribution and constraints, and buffer utilization.

The DFF chains are designed in both row-oriented and column-oriented arrangements. The triplicated DFF cells can be instantiated either in a single SLICE or in different SLICEs, depending on the constraint file. The PBIE may originate from the Global Clock Buffers (BUFG), Regional Clock Buffers (RCLK) and Local Clock Buffers (LCLK) in SLICEs, as shown in Fig. 1. Thus, we customize the clock buffers in the DFF chains. BUFG is the most commonly used clock resource; it connects to every clock point and drives the global clock networks in the FPGA [9]. Each RCLK drives a total of 50 SLICEs in two columns or 100 SLICEs in four columns (the 25 SLICEs of one column are connected by a switch matrix), and the local clock buffer inside a SLICE drives all the DFFs therein [10].

The DFF shift register chains are implemented in the 28 nm XC7K325T FPGA, and each DFF chain contains 4799 stages (4799 bits in total per chain). Seven different DFF chains, comprising two standard DFF chains and five hardened DFF chains, are detailed in Table 1, and a diagram of the structure of the multiple DFF chains is shown in Fig. 2.

Table 1
The basic information of the implemented DFF chains in FPGA.

Name            Hardening strategy   Orientation   Specialties
Standard_SYN    No                   Row           Synchronous reset
Standard_ASYN   No                   Row           Asynchronous reset
LTMR_ROW        Local TMR            Row           No reset
GTMR_ROW        Global TMR           Row           No reset
DTMR_ROW        Distributed TMR      Row           No reset
DTMR_COL        Distributed TMR      Column        No reset
DTMR_NEW        Distributed TMR      Column        No reset

Each DFF chain requires a data input port, a clock port, and a data output port. Two standard DFF chains are configured for comparison: one with a synchronous reset port (Standard_SYN) and one with an asynchronous reset port (Standard_ASYN). Five TMR-hardened DFF chains are implemented in the FPGA to investigate effective SEU mitigation strategies and improve the reliability of FPGA-based systems. For these hardened DFF chains, the data input and data output ports are identical. The locally placed row-oriented TMR (LTMR_ROW) chains have one global clock port, and only for the LTMR_ROW chains are the triplicated DFFs of each stage instantiated in one SLICE. Both the row-oriented Distributed TMR (DTMR_ROW) chains and the column-oriented Distributed TMR (DTMR_COL) chains employ one independent global clock port, as shown in Fig. 3. The new separated column-oriented Distributed TMR (DTMR_NEW) chains are implemented to minimize the influence of the RCLK, and their scheme is shown in Fig. 4. In contrast, the globally placed row-oriented TMR (GTMR_ROW) chains employ three independent global clock ports. For the DTMR_COL, GTMR_ROW, and DTMR_NEW chains, each SLICE holds only one DFF cell, so the hardened cells are unlikely to be affected by charge sharing in radiation environments.

Although full TMR circuits are used, the PBIE may still occur because of the buffer control structure (shown in Fig. 3), which invalidates the triplicated DFF chains. Thus, the DTMR_NEW scheme (Fig. 4) is proposed to minimize the radiation sensitivity of both the BUFG and the RCLK. The separated RCLKs are all connected to one external clock signal, and each drives a different DFF.

3. Experimental setup

The XC7K325T FPGA is selected as the Device Under Test (DUT). The backside silicon substrate of the DUT is thinned down to ~60 μm, and each thinned DUT is fixed on a test board that has an individual power input and a number of high-performance and high-range I/O pins. The substrate is thinned because flip-chip devices are irradiated from the backside. Backside thinning with good uniformity in residual Si thickness is achieved owing to the limited area of the implemented DFF module. The substrate thickness is measured during thinning. In addition, the test board includes a controller FPGA, which performs data writing, reading, and comparison. A serial communication module receives and executes commands from the PC.

The DUTs are irradiated in a vacuum chamber at the China Institute of Atomic Energy, and in air at the Heavy Ion Research Facility in Lanzhou, at the Institute of Modern Physics, Chinese Academy of Sciences. The experimental ions are listed in Table 2. The two low-LET ions are suitable for extracting SEU results because of the low SEU and interrupt rates they induce, which improves the efficiency of the irradiation test. All irradiation tests are performed at normal incidence, and
Fig. 2. A diagram for the standard, LTMR hardened, GTMR hardened and DTMR hardened DFF chains.
Fig. 3. The sensitive DTMR_COL scheme. Two of the triple hardened cells may be affected at the same time.
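The failure mode named in the Fig. 3 caption can be illustrated with a minimal Python sketch of a 2-of-3 majority voter. It is a behavioural toy model, not the authors' netlist: the function names, the clock-domain labels (`RCLK_A`, `RCLK_B`, `RCLK_C`) and the assumption that a struck clock buffer inverts every replica it drives are all hypothetical simplifications introduced here.

```python
from typing import Sequence


def majority(a: int, b: int, c: int) -> int:
    """Return the 2-of-3 majority of three single-bit values."""
    return (a & b) | (a & c) | (b & c)


def voted_output_after_clock_strike(
    golden: int, replica_domains: Sequence[str], struck_domain: str
) -> int:
    """Voted value of one TMR stage after a strike on one clock buffer.

    Simplifying assumption (not from the paper): every replica clocked from
    the struck buffer captures the inverted bit; the others keep `golden`.
    """
    replicas = [golden ^ 1 if d == struck_domain else golden for d in replica_domains]
    return majority(*replicas)


# Hypothetical clock-domain assignment of the three replicas of one stage.
shared = ("RCLK_A", "RCLK_A", "RCLK_B")     # two replicas share one buffer
separated = ("RCLK_A", "RCLK_B", "RCLK_C")  # one buffer per replica

print(voted_output_after_clock_strike(1, shared, "RCLK_A"))     # 0: the vote is wrong
print(voted_output_after_clock_strike(1, separated, "RCLK_A"))  # 1: the upset is masked
```

Under this model a single buffer strike defeats the voter only when two replicas share the struck buffer, which is the motivation for giving each replica its own clock path.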
Fig. 5. (a) PBIE for square wave data pattern (two 25-stage DFF cells); (b) PBIE for full “1” data pattern (two 25-stage DFF cells); (c) PBIE for square wave data
pattern (four 25-stage DFF cells).
Fig. 6. (a) The comparisons of upset numbers among different chains under 12C ions irradiation; (b) the comparisons of upset numbers among different chains under 19F ions irradiation.
Fig. 7. Heavy ion induced SEU cross sections of different chains (for this figure, the error events rather than the upset numbers are calculated in the SEU cross-section
results, indicating that one distinguished PBIE is counted as only one error event).
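The distinction drawn in the Fig. 7 caption between upset numbers and error events can be sketched as follows. The cross section is taken in its usual form, the number of counted events divided by the ion fluence. The log layout (one integer per logged event, giving the number of bits it upset), the fluence value and the function name are assumptions made for illustration only and are not data from the experiment.

```python
def cross_section(n_events: int, fluence_per_cm2: float, n_bits: int = 1) -> float:
    """Event cross section in cm^2 (per bit when n_bits > 1)."""
    if fluence_per_cm2 <= 0 or n_bits <= 0:
        raise ValueError("fluence and bit count must be positive")
    return n_events / (fluence_per_cm2 * n_bits)


# Hypothetical log: each entry is the number of bits upset by one logged event.
# Five single-bit upsets plus one 25-bit burst attributed to a single PBIE.
event_log = [1, 1, 1, 1, 1, 25]

upset_number = sum(event_log)  # every flipped bit counted
event_number = len(event_log)  # a burst counted as one error event

fluence = 1.0e7  # ions/cm^2, illustrative value only

print(upset_number, event_number)
print(cross_section(event_number, fluence))
```

Counting events rather than bits keeps one multi-bit burst from dominating the cross section of a chain.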
observed. A 4799-bit error of the Standard_ASYN chain appears once for the full “1” data pattern under 19F ion irradiation, caused by an abnormal signal on the asynchronous reset port. For the Standard_ASYN chain with the square wave data pattern, 25 of the 30 SEUs are captured in DFF cells that are all driven by the same RRET, indicating that the PBIE (“1” to “0”) is due to the sensitive asynchronous structure.

4.2. Hardened chains

As shown in Fig. 6(a) and (b), the LTMR_ROW chains are robust to SEUs and reduce the SEU rates drastically for both the full “0” and full “1” data patterns. However, for the square wave data pattern, the LTMR_ROW chains present almost the same SEU sensitivity as the unhardened chain. This is because charge sharing affects the triplicated DFF cells placed in one SLICE, and transients in the LCLK undermine the triplicated redundancy and eventually cause an SEU. The slight improvement in the SEU tolerance of the LTMR_ROW chains under 19F ion irradiation results from the reduction of intrinsic cell upsets.

The GTMR_ROW chains are also robust in mitigating intrinsic cell SEUs for both the full “0” and full “1” data patterns. However, they show serious SEU sensitivity for the square wave data pattern. The reason for this increased sensitivity is illustrated in Fig. 8. Take stage2 as an example, with its related nodes marked. Ideally, even if an SEU occurs in DFF1_tmr1 of stage1, the three voters inside stage2 will deliver a correct value. If no clock skew ($t_{clk\_skew}$) exists among the triplicated clocks during the data transition, the correct values will be output on nodes q2, q1 and q0 after a clock-to-Q time ($t_{clk\_q}$). In practice, however, some clock skew exists because of the routing or the peripheral hardware system. Assume that the rising edge of TMR_clk2 precedes those of TMR_clk1 and TMR_clk0. After the first rising edge of TMR_clk2, the value of node DFF1_Q2 changes before the first rising edges of TMR_clk1 and TMR_clk0 arrive. If an SEU occurs in DFF1_tmr1 of stage1, the triplicated voters inside stage2 will output erroneous data on nodes n2, n1, and n0 after a total time comprising the routing delay ($t_{routing}$) and the voting delay ($t_{voter}$) of the voters. If the erroneous data satisfy the setup time ($t_{setup}$), the data of stage2 will consequently be erroneous. This relationship is expressed in formula (1).

$$t_{clk\_skew} > t_{clk\_q} + t_{routing} + t_{voter} + t_{setup} \qquad (1)$$

Based on the above analysis, if clock skew is unavoidable, the SEU-sensitive area is approximately twice that of the unhardened chain, which is consistent with the experimental results.

In addition, both the DTMR_ROW and DTMR_COL chains are robust to SEUs for the full “0” and full “1” data patterns. The DTMR_COL chains are much more tolerant to SEUs than the DTMR_ROW chains, because the SEU-sensitive LCLK in a SLICE is further minimized. Nevertheless, 50 errors are observed in the square wave test of the DTMR_COL chains in the Round 1 test. Reexamining the SEU patterns in the log files shows that the 50 SEUs are entirely caused by two PBIEs, in which the 50 DFF stages are driven by the same RCLK. Because two of the triplicated DFF cells of each stage are restricted to the same RCLK, once that RCLK is struck by an ion the triplicated redundancy ceases to be effective and a PBIE appears.

For the DTMR_NEW chains, the triplicated BUFGs are all connected to one external clock signal, and each drives a different RCLK. The comparisons of SEU results for the DTMR_NEW chains and the other chains are also shown in Fig. 6(a) and (b). Without considering the PBIEs from the BUFG or RCLK, the DTMR_NEW chains exhibit an 80× improvement over the unhardened chain and a 25× improvement over the DTMR_ROW chains at an LET of 7.4 MeV⋅cm²/mg. As shown in Fig. 7, under high-LET 181Ta ion irradiation, the SEU cross sections of both the unhardened chain and the DTMR_NEW chains increase by more than one order of magnitude. However, at an LET of 84.9 MeV⋅cm²/mg, the DTMR_NEW chains still show an ~20× improvement over the unhardened chain, which verifies their strong radiation tolerance under high-LET irradiation as well. These results show that the orientation method is not critical to the SEU sensitivity of DFF chains, whereas the triplicated clock buffer is significant for their radiation tolerance.
Fig. 8. Four stages of the GTMR_ROW chains with the triplicated external clock TMR_clk2, TMR_clk1 and TMR_clk0.
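The skew condition of formula (1) for the GTMR_ROW chains, namely that the clock skew exceeds the sum of the clock-to-Q time, routing delay, voter delay and setup time, can be checked numerically with a short sketch. The function name and all delay values below are hypothetical and chosen only to exercise the inequality; they are not measurements from the device.

```python
def skew_propagates_error(
    t_clk_skew: float,
    t_clk_q: float,
    t_routing: float,
    t_voter: float,
    t_setup: float,
) -> bool:
    """True when the skew is large enough for the voted error to be latched.

    Implements t_clk_skew > t_clk_q + t_routing + t_voter + t_setup,
    with all times in the same unit (nanoseconds here).
    """
    return t_clk_skew > t_clk_q + t_routing + t_voter + t_setup


# Illustrative delays in ns (assumed values, not from the paper).
delays = dict(t_clk_q=0.3, t_routing=0.8, t_voter=0.2, t_setup=0.1)

print(skew_propagates_error(2.0, **delays))  # True: the late clocks capture the wrong vote
print(skew_propagates_error(0.5, **delays))  # False: the error arrives too late to be latched
```

When the function returns `False`, the erroneous voter output reaches the next stage only after the later clock edges have passed, so the redundancy still holds for that cycle.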
5. Conclusion

In this paper, seven specially designed circuits are implemented on a 28 nm SRAM-based FPGA to investigate the failure rates of nanoscale FPGA systems and promote their further space applications. The heavy ion experimental results under different data patterns reveal that the PBIE mainly originates from the clock buffer, which is critical to the performance of the TMR hardening strategy. Hence, specially placed clock buffers for the hardened chains are vital to achieving the desired SEU tolerance. The irradiation results verify that the proposed DTMR_NEW chains are suitable for use in radiation environments owing to their strong SEU tolerance. The detailed comparison results are useful for researchers implementing effective FPGA-based soft-error-tolerant systems.

CRediT authorship contribution statement

Methodology, Jian Yu; software, Shuai Gao and Bingxu Ning; validation, Chang Cai; writing - original draft preparation, Jian Yu; writing - review and editing, Chang Cai and Mingjie Shen; supervision, Tianqi Liu; project administration, Liewei Xu, Jian Yu; funding acquisition, Jian Yu; conceptualization, Jian Yu.

Declaration of competing interest

The authors declare no conflict of interest.

Acknowledgements

This work is jointly supported by the fund of the innovative center in China Institute of Atomic Energy (Grant No. KFZC2020010501), the fund of Municipal Commission of Economy and Information (GYQJ-2020-1-01), State Key Laboratory of ASIC & System (Grant No. 2020KF009), and HIRFL (Grant No. JIZR20GY002).

References

[1] C. Norton, An evaluation of the Xilinx Virtex-4 FPGA for on-board processing in an advanced imaging system, in: 2009 IEEE Aerospace Conference, 2009, pp. 1–9. Mar.
[2] M. Cannon, et al., Strategies for removing common mode failures from TMR designs deployed on SRAM FPGAs, IEEE Trans. Nucl. Sci. 66 (1) (2019) 207–215. Jan.
[3] T. Lange, et al., On-board processing using reconfigurable hardware on the solar orbiter PHI instrument, in: 2017 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), 2017, pp. 186–191. Jul.
[4] C. Cai, X. Fan, J. Liu, D. Li, T. Liu, L. Ke, P. Zhao, Z. He, Heavy-ion induced single event upsets in advanced 65 nm radiation hardened FPGAs, Electronics 8 (2019) 323.
[5] J. Tonfat, F.L. Kastensmidt, L. Artola, G. Hubert, N.H. Medina, N. Added, V.A.P. Aguiar, F. Aguirre, E.L.A. Macchione, M.A.G. Silveira, Analyzing the influence of the angles of incidence and rotation on MBU events induced by low LET heavy ions in a 28-nm SRAM-based FPGA, IEEE Trans. Nucl. Sci. 64 (8) (2017) 2161–2168.
[6] J.S. Bi, K. Xi, B. Li, H. Wang, L.L. Ji, J. Li, M. Liu, Heavy ion induced upset errors in 90-nm 64 Mb NOR-type floating-gate flash memory, Chin. Phys. B 27 (9) (2018), 098501.
[7] M. Cannon, et al., Improving the effectiveness of TMR designs on FPGAs with SEU-aware incremental placement, in: 2018 IEEE 26th Annual International Symposium on Field-programmable Custom Computing Machines (FCCM), 2018, pp. 141–148. Apr.
[8] M. Berg, et al., Effectiveness of internal versus external SEU scrubbing mitigation strategies in a xilinx FPGA: design, test, and analysis, IEEE Trans. Nucl. Sci. 55 (4) (2008) 2259–2266. Aug.
[9] C. Cai, T. Liu, P. Zhao, X. Fan, H. Huang, D. Li, L. Ke, Z. He, L. Xu, G. Chen, Jie Liu, Multiple layout-hardening comparation of SEU mitigated Flip-flops in 22 nm UTBB FD-SOI technology, IEEE Trans. Nucl. Sci. 67 (1) (2020) 374–381. Jan.
[10] K.M. Sielewicz, G.A. Rinella, M. Bonora, P. Giubilato, M. Lupi, M.J. Rossewij, J. Schambach, T. Vanat, Experimental methods and results for the evaluation of triple modular redundancy SEU mitigation techniques with the Xilinx Kintex-7 FPGA, in: 2017 IEEE Radiation Effects Data Workshop (REDW), 2017, pp. 148–155.
[11] 7 Series FPGAs Clocking Resources User Guide, Xilinx, User Guide UG472, Sep, https://www.xilinx.com, 2016.
[12] 7 Series FPGAs Configurable Logic Block User Guide, Xilinx, User Guide UG474, Sep, https://www.xilinx.com, 2016.
[13] G.R. Allen, L. Edmonds, C.W. Tseng, G. Swift, C. Carmichael, Single-event upset (SEU) results of embedded error detect and correct enabled block random access memory (Block RAM) within the xilinx XQR5VFX130, IEEE Trans. Nucl. Sci. 57 (6) (2010) 3426–3431.
[14] G.M. Swift, G.R. Allen, C.W. Tseng, C. Carmichael, G. Miller, J.S. George, Static upset characteristics of the 90nm Virtex-4QV FPGAs, in: 2008 IEEE Radiation Effects Data Workshop, 2008, pp. 98–105.
[15] A. Evans, D. Alexandrescu, V. Ferlet-Cavrois, M. Nicolaidis, New techniques for SET sensitivity and propagation measurement in flash-based FPGAs, IEEE Trans. Nucl. Sci. 61 (6) (2014) 3171–3177.
[16] A. Manuzzato, S. Gerardin, A. Paccagnella, L. Sterpone, M. Violante, On the static cross section of SRAM-Based FPGAs, in: 2008 IEEE Radiation Effects Data Workshop, 2008, pp. 94–97.
[17] A. Ullah, P. Reviriego, A. Sánchez-Macián, J.A. Maestro, Multiple cell upset injection in BRAMs for xilinx FPGAs, IEEE Trans. Device Mater. Reliab. 18 (4) (2018) 636–638.
[18] L.N. Cojocariu, V.M. Placinta, L. Dumitru, Monitoring system for testing the radiation hardness of a KINTEX-7 FPGA, in: Presented at the AIP Conference Proceedings, 2016, https://doi.org/10.1063/1.4944199.
[19] T. Li, H. Liu, H. Yang, Design and characterization of SEU hardened circuits for SRAM-based FPGA, IEEE Trans. Very Large Scale Integr. Syst. 27 (6) (2019) 1276–1283.
[20] L. Sterpone, L. Boragno, A probe-based SEU detection method for SRAM-based FPGAs, Microelectron. Reliab. 76–77 (2017) 154–158.
[21] G. Furano, A. Tavoularis, L. Santos, V. Ferlet-Cavrois, C. Boatella, R.G. Alia, P.F. Martinez, M. Kastriotou, V. Wyrwoll, S. Danzeca, M. Tali, D. Gacnik, I. Kramberger, L. Juul, K. Maragos, G. Lentaris, FPGA SEE test with ultra-high energy heavy ions, in: 2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), 2018, p. 18393191.