
IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY, VOL. 14, NO. 1, MARCH 2014

Concurrent Error Detection of Binary and Nonbinary OLS Parallel Decoders
Kazuteru Namba, Member, IEEE, and Fabrizio Lombardi, Fellow, IEEE

Abstract: This paper presents a concurrent error detection
(CED) scheme for orthogonal Latin square (OLS) parallel decoders. Different from a CED scheme found in the technical
literature that protects only the syndrome generator, the proposed
CED scheme protects the whole OLS decoder for single stuck-at
faults. This paper presents the detailed design and analysis of
the proposed CED scheme and shows that it is strongly fault
secured for single stuck-at faults. Extensive simulation results
are also provided; different figures of merit such as area, power
dissipation, gate depth, and coverage are assessed. It is shown that
the proposed decoder designs for (n, k) t-bit error correcting OLS
codes (k = 16 to 256; t = 2 to 5) have reasonable overhead;
for example, the average area overhead of the proposed CED is
35.5 (23.6) % compared with an OLS decoder with no CED (i.e.,
the previously reported CED scheme). However, the most significant advantage of the proposed scheme is that it achieves 100%
fault coverage for the whole CED circuit, thus providing a very
efficient and fully fault-tolerant implementation. The proposed
CED is applicable to both binary and nonbinary OLS codes; the
CED for a nonbinary OLS decoder achieves comparable or better
results than a binary OLS decoder. Moreover, simulation shows
that the proposed CED scheme is better than double modular
redundancy.
Index Terms: Error correcting code (ECC), concurrent error
detection (CED), strongly fault secure (SFS), orthogonal Latin
square (OLS) codes, parallel decoder.

I. INTRODUCTION

RECENTLY, memory systems have experienced significant
technology developments; so-called emerging designs
and technologies provide higher density and lower costs than
a traditional dynamic random access memory (DRAM). For
example, the zero-capacitor RAM (Z-RAM) is a new class of
voltage RAM [1]; the Z-RAM is expected to achieve a higher
density while operating at a supply voltage lower than for a
DRAM. A non-volatile based memory (NVM) system (such
as a NAND CMOS-based flash memory) is economic, but this
technology is encountering major challenges due to scaling.
The phase change memory (PCM) is one of the most promising
technologies for a NVM system [2]; PCM is already providing
a higher density and lower costs than DRAM. In addition to
PCM, the resistive RAM (ReRAM, RRAM) [3], the conductive
bridging RAM (CBRAM) [4] and the spin-transfer torque RAM
(STT-RAM) [5] have been proposed as explorative technologies
for implementing a NVM system.

Manuscript received August 31, 2013; revised December 27, 2013; accepted
January 6, 2014. Date of publication January 14, 2014; date of current version
March 4, 2014.
K. Namba is with the Graduate School of Advanced Integration Science,
Chiba University, Chiba 263-8522, Japan (e-mail: namba@ieee.org).
F. Lombardi is with the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115 USA (e-mail: lombardi@ece.neu.edu).
Digital Object Identifier 10.1109/TDMR.2014.2300101

However, these technologies are still facing significant challenges for attaining an acceptable level of reliable operation.
For example, multilevel memory systems using PCM [6] have
a high storage density; however, this is achieved by reducing
the margin between adjacent resistive levels of a cell, thus
likely degrading data integrity in the presence of noise and
drift [7]. So, error tolerance for emerging memory systems is
of increasing importance. Error correcting codes (also referred
to as error control codes, ECCs) are frequently used for improving
the reliability of a memory system [8]. Orthogonal Latin Square
(OLS) codes have drawn considerable attention in the last few
years [9]-[11]; OLS codes provide multiple-bit error correction
[12] and utilize high-speed parallel decoding by using one-step
majority-logic decoding (OSMLGD) [8].
Errors are controlled in a memory system by using encoders
and decoders for ECCs; however, ECCs do not protect from
errors in the encoders and decoders. Control of these errors
is usually accomplished by concurrent error detection (CED).
One of the most-used CED schemes utilizes double modular
redundancy (DMR), in which circuits are simply duplicated
and their outputs compared. DMR is applicable to any circuit;
however, it incurs a large hardware overhead (more than
100%) due to the additional checking hardware for the two
copies. An extensive literature exists on CED schemes for
specific and commonly used circuits (such as ECC encoders
and decoders, [13] for Reed-Solomon codes and [14] for
EG-LDPC codes); however these approaches cannot always be
generalized.
Reviriego, et al. have presented in [15] a CED scheme for
OLS codes in OLS decoders. This scheme protects the entire
OLS encoders, but only part of the decoders, i.e. the syndrome
generator. The remaining circuits commonly found as parts
of the OLS decoders are not protected. Initially, this paper
provides experimental evidence that as it protects only the
syndrome generator, the CED scheme of [15] does not provide
complete coverage for the entire OLS decoders. Specifically,
this CED scheme [15] achieves 100% coverage for only 43.4%
(on average) of the circuit area of the OLS decoders (as corresponding to the syndrome generator); hence, a CED scheme for
the entire OLS decoder is required.
This paper then presents a CED scheme that protects the
entire circuit of the OLS parallel decoders at 100% coverage
of single stuck-at faults. A detailed analysis of the design and
its properties is pursued to complement the initial findings of
[19]; it is proved that the proposed CED scheme is strongly
fault secure for single stuck-at faults. Extensive simulation

1530-4388 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

NAMBA AND LOMBARDI: CED OF BINARY AND NONBINARY OLS PARALLEL DECODERS


results are provided; different figures of merit (such as area,
power dissipation, gate depth and coverage) are assessed for
its implementation. It is shown that the proposed decoders
for (n, k) t-bit error correcting OLS codes (k = 16 to 256;
t = 2 to 5) have an acceptable overhead (better than double
modular redundancy, for example). The average
area overhead of the proposed CED is 35.5 (23.6) % compared
to an OLS decoder with no CED (the previously reported
CED scheme of [15]). It is also shown that the proposed CED
scheme can be used for non-binary OLS codes as applicable to
PCM [2].
II. PRELIMINARIES
The OLS code [12] is reviewed. The OLS code uses orthogonal Latin squares, which are $m \times m$ square arrays of digits
$0, 1, \ldots, m - 1$. The following matrix H is an H matrix (i.e. a
parity check matrix) of a t-bit error correcting OLS code:

\[
H = \left[ \begin{array}{c|c}
M_1 & \\
M_2 & \\
M_3 & I_{2tm} \\
\vdots & \\
M_{2t} &
\end{array} \right]
\]

where $M_i$ ($1 \le i \le 2t$) is an $m \times m^2$ matrix generated from
the $m \times m$ orthogonal Latin squares. The generation of the
matrices $M_i$ has been presented in [12].
OLS codes can be decoded using OSMLGD [8] as follows.
Let $S_j$ be a vector that contains all i-th elements in a syndrome
such that $h_{i,j}$ in the H matrix is 1. Suppose that a t-bit error
occurs on a received word u. If the j-th bit in u is erroneous,
then the values of at least (t + 1) bits in $S_j$ are 1s. If not, the
values of at least t bits are 0s. Based on these conditions, errors
can be corrected by simply flipping all received bits such that
at least (t + 1) bits in $S_j$ are 1s, i.e. by adding the majority of
all values in $S_j$ and a value 0 to all received bits.
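The decoding rule above can be illustrated with a minimal Python sketch. It is not the construction of [12]: as an assumption, m is taken to be prime and the Latin squares are built as L_a(r, c) = (a r + c) mod m, which are mutually orthogonal for a = 1, ..., m - 1. The function names (`ols_groups`, `encode`, `osmlgd_decode`, `corrects_all`) are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

def ols_groups(m, t):
    """Map each of the m*m data bits to one check bit per group M_1..M_2t.

    Assumption (not from the paper): m is prime, so the squares
    L_a(r, c) = (a*r + c) mod m, a = 1..m-1, are mutually orthogonal.
    """
    assert 2 * t <= m + 1, "not enough orthogonal squares for this t"
    cells = [(r, c) for r in range(m) for c in range(m)]
    groups = [[r for r, _ in cells], [c for _, c in cells]]
    for a in range(1, 2 * t - 1):
        groups.append([(a * r + c) % m for r, c in cells])
    return groups

def encode(d, groups, m):
    """Check bits c_i = d * M_i^T over GF(2), one list of m bits per group."""
    checks = []
    for g in groups:
        ci = [0] * m
        for j, bit in enumerate(d):
            ci[g[j]] ^= bit
        checks.append(ci)
    return checks

def osmlgd_decode(u, c, groups, m, t):
    """Flip data bit j when more than t of its 2t syndrome bits are 1."""
    recomputed = encode(u, groups, m)
    s = [[a ^ b for a, b in zip(ri, ci)] for ri, ci in zip(recomputed, c)]
    v = []
    for j, bit in enumerate(u):
        votes = sum(s[i][g[j]] for i, g in enumerate(groups))
        v.append(bit ^ (1 if votes > t else 0))
    return v

def corrects_all(d, m, t):
    """Try every error pattern of weight <= t on data and check bits."""
    groups = ols_groups(m, t)
    k = m * m
    word = list(d) + [b for ci in encode(d, groups, m) for b in ci]
    for w in range(t + 1):
        for pos in combinations(range(len(word)), w):
            rx = word[:]
            for p in pos:
                rx[p] ^= 1
            c = [rx[k + m * i: k + m * (i + 1)] for i in range(2 * t)]
            if osmlgd_decode(rx[:k], c, groups, m, t) != list(d):
                return False
    return True
```

With m = 3 and t = 2 this sketch yields a (21, 9) code, and `corrects_all` exhaustively checks that every pattern of up to two bit errors is corrected by the majority vote.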
Next, the existing CED proposed in [15] is reviewed; the
CED is for the syndrome generator in an OLS decoder. Then, it
is shown that the CED scheme of [15] achieves CED with 100%
coverage for only a relatively small part of the OLS decoders.
Note that this paper uses the following terminology for
error and fault (as also found in the technical literature):
Error represents an error occurring outside the ECC
decoder and its checker, i.e. in this case occurring in
memory. Errors must be controlled by the ECC, and not
by the CED.
Fault represents a fault occurring in the decoder or its
checker. Faults must be detected by the CED.
The CED of [15] is totally self-checking (TSC [16]) for
single stuck-at faults at gate-level (faults at the input(s) and
output(s) of gates). The CED circuit outputs rsyn1 and rsyn2
as fault detection signals. So, if no fault occurs, $r_{syn1} = r_{syn2}$;
if a fault occurs, $r_{syn1} \neq r_{syn2}$, thus achieving detection. The
outputs rsyn1 and rsyn2 are the parity of the check bits c and the
syndrome s; they are expressed as $r_{syn1} = \oplus c$ and $r_{syn2} = \oplus s$
where $\oplus x$ is the parity of a vector x, i.e. $\oplus x = 0$ (1) if the
Hamming weight of x is even (odd). The interested reader
should refer to [15] for more detail.

Fig. 1. Area ratio of syndrome generator in OLS decoder versus different values of t and k.

This paper does not discuss any scenario following fault
detection; these scenarios have been extensively analyzed in
the technical literature [16]. For example, a fail-safe system
can be designed following detection; this system stops in a
harmless mode such that operation can be continued, albeit at
a lower level of functionality. However, this feature affects other
reliability metrics such as the mean time to failure (MTTF);
the value of the MTTF of a fail-safe system is smaller than
for a no fail-safe system due to the area overhead of the CED.
The fail-safe property however is highly desirable in many
applications. Additional system-level features (such as repair
or recomputation) can also be implemented following fault
detection to further improve the tolerance to faults and continue
operation; also in these cases, the area overhead for the CED
and other units must be accounted.
The CED of [15] does not provide 100% fault coverage for
the entire OLS decoder; Fig. 1 shows the area ratio of the
syndrome generator to the entire OLS decoder circuit versus
different values of k (word length) and t. The area ratio is
43.4% on average; on the assumption that faults are normally
distributed in occurrence over the entire OLS decoder, then the
CED for the syndrome generator of [15] achieves 100% fault
coverage for only 43.4% of the OLS decoder (i.e. 100% coverage
for the syndrome generator). This implies that the goal of 100%
coverage for the entire OLS decoder requires a new scheme for
CED of the remaining circuits not protected by [15].

III. PROPOSED CED DESIGN


This section shows the design of the proposed CED scheme.
Let $c_i$ be a sub-vector of check bits c corresponding to $M_i$. $c_i$ is
generated as $c_i = dM_i^T$ where d denotes the information bits;
so, $c = (c_1\ c_2\ \cdots\ c_{2t})$ and the length of $c_i$ is m. Let $s_i$ be a subvector of the syndrome s corresponding to $M_i$. $s_i$ is found as
$s_i = uM_i^T + c_i$ where u is the information bits in a received
word, $s = (s_1\ s_2\ \cdots\ s_{2t})$ and the length of $s_i$ is m.
Fig. 2 shows the construction of the OLS decoder by the
proposed CED scheme; the decoder includes the roth1 generator (its construction is shown in Fig. 3). As shown in a
later section, the proposed CED is strongly fault secure (SFS
[17]) for single stuck-at faults at the inputs and outputs of the
gates. It requires two pairs of signals for fault detection: (rsyn1,
rsyn2) and (roth1, roth2). If no fault occurs, $r_{syn1} = r_{syn2}$ and
$r_{oth1} = r_{oth2}$. If a single stuck-at fault occurs, either or both of
$r_{syn1} \neq r_{syn2}$ and $r_{oth1} \neq r_{oth2}$ hold, thus detecting the fault.
The two pairs can be combined into a single pair (r1, r2) using a
traditional two-rail checker; r1 = rsyn1 roth2 + rsyn2 roth1 and
r2 = rsyn1 roth1 + rsyn2 roth2.
The signal roth1 is generated in the roth1 generator; this
consists of a MUX and two circuits, denoted as MAJ and EQ.
The MAJ circuit is a 2t-input majority voter (and is also used
in the error calculator of the OLS decoder). MAJ outputs a 0
if the number of 1s is equal to that of 0s. The EQ circuit
outputs a 1 if and only if the number of 1s is equal to that of 0s.
Otherwise, EQ outputs a 0. The signal roth1 is a function of
$c_i$ and $s_i$, and it can be expressed as follows:

\[
r_{oth1} =
\begin{cases}
\mathrm{maj}(\oplus c_1, \oplus c_2, \ldots, \oplus c_{2t}, 0) & \sum_{i=1}^{2t} \oplus s_i \neq t \\
(\oplus c_1) + (\oplus s_1) & \sum_{i=1}^{2t} \oplus s_i = t
\end{cases}
\]

where maj(. . .) is the majority of the input values; the following
condition is then applicable:

\[
\mathrm{maj}(\oplus c_1, \oplus c_2, \ldots, \oplus c_{2t}, 0) =
\begin{cases}
1 & \sum_{i=1}^{2t} \oplus c_i > t \\
0 & \sum_{i=1}^{2t} \oplus c_i \le t
\end{cases}.
\]

The signal roth2 is the parity of the decoded word v, i.e. it is
true that $r_{oth2} = \oplus v$.

Fig. 2. Decoder design using proposed CED.

Fig. 3. roth1 generator.

The proposed CED scheme utilizes the CED circuitry for
the syndrome generator of [15], i.e. the scheme for generating
rsyn1 and rsyn2 is the CED of [15]. The operations $\oplus c_i$ and
$\oplus s_i$ can be shared with the roth1 generator; rsyn1 and rsyn2 are
given by $r_{syn1} = \bigoplus_{i=1}^{2t} \oplus c_i$ and $r_{syn2} = \bigoplus_{i=1}^{2t} \oplus s_i$. If a fault
occurs on the syndrome generator, detection is accomplished by
$r_{syn1} \neq r_{syn2}$.
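As a behavioral illustration of the detection signals described above (not a gate-level model of the circuits in Figs. 2 and 3), the following Python sketch computes roth1 through the MAJ, EQ and MUX functions and the pair (rsyn1, rsyn2) as parities. The function names are hypothetical; `c` and `s` are assumed to be lists of the 2t sub-vectors $c_i$ and $s_i$.

```python
def parity(bits):
    """Parity of a bit vector: 0 for even Hamming weight, 1 for odd."""
    return sum(bits) % 2

def r_oth1(c, s, t):
    """Behavioral model of the r_oth1 generator (MAJ, EQ and MUX)."""
    pc = [parity(ci) for ci in c]      # parities of the 2t check sub-vectors
    ps = [parity(si) for si in s]      # parities of the 2t syndrome sub-vectors
    maj = 1 if sum(pc) > t else 0      # MAJ: a tie (exactly t ones) gives 0
    eq = sum(ps) == t                  # EQ: as many 1s as 0s among 2t inputs
    return (pc[0] ^ ps[0]) if eq else maj   # MUX selects between the two

def r_syn(c, s):
    """Return (r_syn1, r_syn2): parity of all check bits and of the syndrome."""
    r_syn1 = parity([parity(ci) for ci in c])
    r_syn2 = parity([parity(si) for si in s])
    return r_syn1, r_syn2
```

For example, with t = 2, four check sub-vectors of odd parity and an all-zero syndrome, EQ is 0 and the MUX forwards the MAJ output, which is 1.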
Initially, consider the case when no fault occurs. The decoder
outputs the correct information bits, while the checker outputs
the fault detection signals, such that rsyn1 = rsyn2 and roth1 =
roth2 even if a correctable error (t-bit error) occurs as explained
below. The following Lemma is therefore applicable.
Lemma 1: For any k-dimensional vector x, $\oplus x = \oplus (xM_i^T)$.
This Lemma is valid, because there is exactly one 1 in every
column of $M_i$. The check bits $c_i$ are generated as $c_i = dM_i^T$
from the information bits d, therefore $\oplus c_i = \oplus d$.
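Lemma 1 can be checked numerically on a small instance. The sketch below is only an illustration under an assumption: the four groups for m = 3 are built from rows, columns and the two squares (r + c) mod 3 and (2r + c) mod 3, which need not be the matrices of [12]. The function name `check_lemma1` is hypothetical.

```python
from itertools import product

def check_lemma1(m=3):
    """Verify parity(x * M_i^T) == parity(x) for all x and four groups."""
    cells = [(r, c) for r in range(m) for c in range(m)]
    # Each list maps a data bit to the single check bit it feeds in M_i,
    # which is exactly the "one 1 per column" property used by the Lemma.
    groups = [
        [r for r, _ in cells],
        [c for _, c in cells],
        [(r + c) % m for r, c in cells],
        [(2 * r + c) % m for r, c in cells],
    ]
    for x in product((0, 1), repeat=m * m):
        for g in groups:
            y = [0] * m
            for j, bit in enumerate(x):
                y[g[j]] ^= bit          # y = x * M_i^T over GF(2)
            if sum(y) % 2 != sum(x) % 2:
                return False
    return True
```

The check runs over all 512 vectors x of length 9; because every data bit feeds exactly one check bit per group, the two parities always agree.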
The OLS decoder of [15] is used with no modification in the
proposed design, although some internal signals are utilized by
the checker. Thus, the decoder outputs the correct information
bits even when a correctable error occurs. Next, it is shown that
the fault detection signals of this design satisfy the conditions
rsyn1 = rsyn2 and roth1 = roth2.
If no error occurs, for the received information bits u
and the check bits $c_i$, $\oplus u = \oplus c_i$. As $\oplus c_i$ is constant
regardless of i, then it is valid to establish that

\[
\sum_{i=1}^{2t} \oplus c_i = 2t(\oplus u).
\]

Thus, MAJ outputs $\oplus u$. As there is no error, the syndrome
is all-zero and the inputs of EQ are all-zero. Thus, the
output value of MAJ is selected as roth1. The decoded
word v is the same as u, because there is no error. So,
$r_{oth2} = \oplus u$ and $r_{oth1} = r_{oth2}$.
Assume that a correctable error e occurs on the information bits u and $u' = u + e$ is received; no error occurs
on the check bits (where $x'$ represents x when the error
occurs). As there is no error on the check bits, then the
output value of MAJ does not change from $\oplus u$. Hence

\[
s'_i = u'M_i^T + c_i = (u + e)M_i^T + c_i = s_i + eM_i^T = eM_i^T.
\]

So, it follows that:

\[
\oplus s'_i = \oplus (eM_i^T) = \oplus e.
\]

Thus, $\oplus s'_i$ is constant regardless of i and so
$\sum_{i=1}^{2t} \oplus s'_i = 0$ or $2t$; the output value of EQ is 0. As a result,
$r'_{oth1} = \oplus u$. As the error is correctable, then the decoded
word $v'$ is equal to the correct word u. Thus, $r'_{oth2} = \oplus u$
and therefore $r'_{oth1} = r'_{oth2}$.
Assume that an error e occurs on the information bits and
the value x changes to $x'$, i.e. $u' = u + e$. In addition,
the errors $e_i$ occur on the check bits $c_i$, therefore
$c'_i = c_i + e_i$. Assume that
$w(e) + \sum_i w(e_i) \le t$ and $\sum_i w(e_i) < t$ (where w(x) is the
Hamming weight of vector x). So, the errors e, $e_i$ are
correctable. As $\sum_i w(e_i) < t$, then it is true that

\[
\left| \sum_{i=1}^{2t} \oplus c'_i - \sum_{i=1}^{2t} \oplus c_i \right| < t.
\]

As shown previously, $\sum_{i=1}^{2t} \oplus c_i = 2t(\oplus u)$. If $\oplus u = 0$,
then $\sum_{i=1}^{2t} \oplus c'_i < t$ and MAJ outputs a 0. If $\oplus u = 1$,
$\sum_{i=1}^{2t} \oplus c'_i > t$ and MAJ outputs a 1. Therefore, MAJ
outputs $\oplus u$ and it is established that

\[
s'_i = u'M_i^T + c'_i = (u + e)M_i^T + c_i + e_i = eM_i^T + e_i.
\]

As $\sum_i w(e_i) < t$, then the following is valid:

\[
\left| \sum_{i=1}^{2t} \oplus s'_i - \sum_{i=1}^{2t} \oplus (eM_i^T) \right| < t.
\]

Moreover, as $\sum_{i=1}^{2t} \oplus (eM_i^T) = 0$ or $2t$ (as shown in the previous case), then either
$\sum_{i=1}^{2t} \oplus s'_i < t$ or $\sum_{i=1}^{2t} \oplus s'_i > t$ is true. In either
case, $\sum_{i=1}^{2t} \oplus s'_i \neq t$; EQ outputs a 0 and $r'_{oth1} = \oplus u$.
As the error is correctable, it follows that $r'_{oth2} = \oplus u$ and
therefore, $r'_{oth1} = r'_{oth2}$.

Fig. 4. Error pattern calculator for the (21, 9) OLS code.

Assume that the errors $e_i$ occur in the check bits, such
that $\sum_i w(e_i) = t$ and the value x changes to $x'$, so $c'_i = c_i + e_i$;
no error occurs on the information bits. If the
number of non-zero $e_i$ is less than t, $r'_{oth1} = r'_{oth2}$ (just
as in the previously treated case). If the number of non-zero
$e_i$ is equal to t, then $\sum_{i=1}^{2t} \oplus s'_i = t$. So, $r'_{oth1} =
(\oplus c'_1) + (\oplus s'_1)$. As there is no error on the information
bits, it follows that $s'_i = e_i$. So, it is established that

\[
\begin{aligned}
r'_{oth1} &= (\oplus c'_1) + (\oplus s'_1) \\
          &= (\oplus (c_1 + e_1)) + (\oplus e_1) \\
          &= \oplus c_1 \\
          &= \oplus u.
\end{aligned}
\]

The error is correctable, so $r'_{oth2} = \oplus u$; $r'_{oth1} = r'_{oth2}$.


In conclusion, roth1 = roth2 for any correctable error, hence
proving the CED properties of the proposed scheme.
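The case analysis above can be cross-checked by brute force on a small code. The following Python sketch is only a behavioral illustration under stated assumptions: it uses a hypothetical (21, 9) instance with m = 3 and t = 2 built from rows, columns and the squares (r + c) mod 3 and (2r + c) mod 3, and it models the signals functionally rather than at gate level. All names are illustrative.

```python
from itertools import combinations

M, T = 3, 2   # assumed (21, 9) double-error-correcting instance
CELLS = [(r, c) for r in range(M) for c in range(M)]
GROUPS = [
    [r for r, _ in CELLS],
    [c for _, c in CELLS],
    [(r + c) % M for r, c in CELLS],
    [(2 * r + c) % M for r, c in CELLS],
]

def parity(bits):
    return sum(bits) % 2

def checks(d):
    """Check sub-vectors c_i = d * M_i^T over GF(2)."""
    out = []
    for g in GROUPS:
        ci = [0] * M
        for j, bit in enumerate(d):
            ci[g[j]] ^= bit
        out.append(ci)
    return out

def signals(u, c):
    """Return (r_syn1, r_syn2, r_oth1, r_oth2, v) for a received word."""
    s = [[a ^ b for a, b in zip(ri, ci)] for ri, ci in zip(checks(u), c)]
    v = [bit ^ int(sum(s[i][g[j]] for i, g in enumerate(GROUPS)) > T)
         for j, bit in enumerate(u)]
    pc = [parity(ci) for ci in c]
    ps = [parity(si) for si in s]
    maj = int(sum(pc) > T)                         # MAJ, tie gives 0
    r_oth1 = (pc[0] ^ ps[0]) if sum(ps) == T else maj   # EQ drives the MUX
    return parity(pc), parity(ps), r_oth1, parity(v), v

def fault_free_signals_agree(d):
    """Check all error patterns of weight <= T on data and check bits."""
    k = len(d)
    word = list(d) + [b for ci in checks(d) for b in ci]
    for w in range(T + 1):
        for pos in combinations(range(len(word)), w):
            rx = word[:]
            for p in pos:
                rx[p] ^= 1
            c = [rx[k + M * i: k + M * (i + 1)] for i in range(2 * T)]
            r_syn1, r_syn2, r_oth1, r_oth2, v = signals(rx[:k], c)
            if v != list(d) or r_syn1 != r_syn2 or r_oth1 != r_oth2:
                return False
    return True
```

For this instance, `fault_free_signals_agree` confirms that with no fault in the decoder both signal pairs stay equal and the data are decoded correctly for every error pattern of weight at most two.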

IV. FAULT DETECTION OF PROPOSED CED


This section shows that the proposed CED is SFS for single
stuck-at faults. Initially, an important feature of the error pattern
calculator is presented; next it is shown that the proposed CED
is fault secure and finally, its SFS property is outlined.

The error pattern calculator consists of majority voters (Fig. 4
shows the error pattern calculator for the (21, 9) OLS code);
the voters are completely separated in the output bit. Voters
could share a few gates to reduce the area overhead; however,
the H matrix of the OLS code satisfies the RC constraint and
this does not allow the use of shared gates. Hence, every single
stuck-at fault on a voter affects no more than two outputs of the
calculator.
Next, it is shown that the proposed CED is fault secure for
single stuck-at faults. Even if a fault occurs on the checker, the
fault does not affect the decoded word v and thus, the decoder
outputs v correctly, regardless of whether the fault is detected or
not. Assume that a fault occurs on the syndrome generator. As
the syndrome generator is TSC, then either the fault is detected,
or the generator outputs the correct syndrome. If the generator
outputs the correct syndrome, the decoder outputs the correct v,
because there is no fault other than in the syndrome generator.
If a fault occurs in the error pattern calculator, then at most a
bit in e flips. If there is no bit flipping in e, then the decoder
outputs the correct v. If a bit flips in e, then the corresponding
bit in v flips too; hence, roth2 flips too. As the fault does not
affect roth1 , then the fault is detected. If a fault occurs on an
XOR gate calculating u + v, it flips exactly one bit in v and
then roth2 flips too. Also in this case, roth1 does not change and
so, the fault is detected.
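The detection argument for a single flipped bit in v rests on a simple property of parity, which the following minimal Python sketch illustrates: flipping any one bit of v always changes $\oplus v$, so roth2 changes while roth1 (computed from the check bits and the syndrome) does not. The function name is hypothetical.

```python
def parity(bits):
    return sum(bits) % 2

def single_flip_always_changes_parity(v):
    """True if flipping any one bit of v changes its parity (r_oth2)."""
    ref = parity(v)
    for j in range(len(v)):
        faulty = v[:j] + [v[j] ^ 1] + v[j + 1:]
        if parity(faulty) == ref:
            return False
    return True
```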
Consider next the SFS property. In circuits without the TSC
property, there may exist faults that are not detected for any
input. Let P be a system and let P′ be a sub-system in P.
Assume that a fault f occurs in P, but f is not detected. The
fault f can cause erroneous system outputs when another fault
f′ occurs in P.
There are three cases in which f is sensitized by another
fault f′:
The fault f is not sensitized to the output of P before f′
occurs, so the fault f is not sensitized to the output of P′
by supplying a vector x to the input of P′. However, x
never appears at the input prior to the occurrence of f′. The
fault f′ makes x appear at the input.
The fault f′ occurs inside P′.
The fault f is sensitized to the output of P′ but not to the
output of P before f′ occurs. The fault f′ occurring outside
of P′ causes f to be sensitized from the output of P′ to the
output of P.

However, the SFS property does not allow these cases to
exist; therefore, next it is shown that the proposed CED is SFS
for single stuck-at faults.
The syndrome generator and its checker are TSC. In
addition, its inputs and the error detection signals rsyn1
and rsyn2 are controllable and observable. So, there is no
fault that is not detected by any input.
Assume a fault f occurs on the error pattern calculator. If
another fault f′ changes the input value, then it is detected
by the checker for the syndrome generator. The calculator
consists of MAJs that are fully separated. Thus, if f′ in the
calculator sensitizes f, then they must occur on the same
MAJ. This is equivalent to the case in which a single fault
occurs on the calculator. The output of the error pattern
calculator is always sensitized to roth2.
Every fault in the XOR gates (that calculate u + v) flips
exactly one bit in v, and in turn, it flips roth2.
Every fault in the XOR tree for roth2 flips roth2 and so, it
is detected.
Assume a fault f occurs in the roth1 generator. If another
fault f′ changes the input value, then it flips either rsyn1 or
rsyn2. This is detected. If f′ in the generator sensitizes f,
then this flips roth1, but it does not flip roth2. Therefore,
it is detected because the output of the roth1 generator is
directly connected to roth1.

Fig. 5. Area overhead for CED (normalized by decoder without CED).

From the above observations, it is possible to conclude that
the proposed CED is SFS for any single stuck-at fault.

V. EVALUATION
This section evaluates the proposed OLS decoder with CED.
In this evaluation, the proposed decoders for (n, k) t-bit error correcting OLS codes (k = 16 to 256; t = 2 to 5) are designed by using the Synopsys Design Compiler. The Design
Compiler could invalidate the SFS property of the proposed
scheme; so each circuit (namely the syndrome generator, the
error pattern calculator, the roth1 generator and the XORs) is
individually compiled as a so-called separate group; then, all
circuits are assembled together but with no grouping [19]. The
OLS decoders with the CED scheme for the syndrome generator of [15], the OLS decoders with DMR and the OLS decoders
without CED are also designed for comparison purposes with
other schemes. DMR requires a comparator for the outputs of
duplicated modules; the comparator uses a two-rail checker.
Different figures of merit are evaluated.
Fig. 5 shows the normalized area overheads for both CED
schemes, i.e. the ratio of the checker area to the area of the
OLS decoder without CED: (Ssyn_chk + Soth_chk)/(Ssyn + Soth)
for the proposed CED and Ssyn_chk/(Ssyn + Soth) for the
CED of [15]. Note that Ssyn_chk and Soth_chk denote the areas
of the checkers for the syndrome generator and for the remaining
circuits, while Ssyn and Soth denote the areas of the syndrome
generator and of the remaining circuits. In addition, this figure shows the
area overhead for DMR, i.e. 1 + Scomp/(Ssyn + Soth)
where Scomp denotes the area of the comparator. The
average values of (Ssyn_chk + Soth_chk)/(Ssyn + Soth)
and Ssyn_chk/(Ssyn + Soth) are 35.5% and 23.6%, respectively.
These are reasonable ratios because the average value of
Ssyn_chk/Ssyn is 55.7%; the area overhead however is
smaller for larger values of k. The average value of 1 +
Scomp/(Ssyn + Soth) is 119.4%; hence, the area of the
proposed CED scheme is significantly smaller than DMR.

Fig. 6. Power consumption overhead for CED (normalized by decoder without CED).

Fig. 6 shows the normalized overhead in power consumption (utilizing the same comparative conditions as used
previously for the area overhead).
Fig. 7 shows the gate depth of the OLS decoders with
and without CED (normalized by an inverter). The use of
the proposed CED increases the gate depth by 18 to 34.
The proposed CED achieves a reduction of 6 to 29 in
gate depth compared with DMR. Fig. 8 shows the gate depth of the
proposed scheme; the increase is mostly caused by the
roth1 generator and the XOR-tree for calculating roth2.
Fig. 9 shows the relation between the normalized area
overhead and fault coverage. The use of the CED for the
syndrome generator increases the area overhead and the
fault coverage by 23.6% and 43.4% on average. The use
of the proposed CED increases them to 35.5% and 100%.
As shown in Fig. 9, the slope for the plot of the proposed
CED for all (t, k) values except (2, 16) is lower than the
CED for the syndrome generator, hence the proposed CED
provides an efficient design. DMR (not shown in Fig. 9)
also achieves 100% fault coverage; however it incurs a
significantly larger hardware overhead than the proposed
scheme (as shown previously in Fig. 5).

Fig. 7. Gate depth of OLS decoder with CED (normalized by an inverter).

Fig. 8. Gate depth of OLS decoder with CED (normalized by an inverter).

Fig. 9. Area overhead versus fault coverage.

VI. PIPELINED CED SCHEME
As mentioned in the previous section and in [19], the use of
the proposed CED increases the gate depth; this decreases the
clock frequency of the decoder. This issue can be resolved by
pipelining the CED scheme; Fig. 10 shows the pipelined CED
version of the proposed scheme. As mentioned in the previous
section the roth1 generator and the XOR-tree for calculating
roth2 increase the delay time; these circuits operate in different
stages. In this case, the clock frequency is nearly the same as for
the case with no CED. The pipelined CED scheme outputs the
decoded words at the same clock frequency as the words are
received at the inputs (i.e. in a fashion similar to the original
parallel decoder); the check results are then provided at the
outputs of the next clock cycle. So, a latency still exists. Two
possible options are available in this case.
1) Do not use the decoded words till the output of the check
results is available.
2) Use the decoded words before the output of the check
results.

Fig. 10. Pipelined CED scheme.

Fig. 11. Majority voter for non-binary OLS decoder.

In the former case, the scheme still retains the SFS property;
in the latter case, albeit incurring a small latency, the design
of the entire system must allow this additional delay in its
operation.

VII. CED FOR NON-BINARY OLS CODES

Consider next the case of a CED for non-binary OLS codes.
These codes are applicable to multilevel storage (such as phase
change memories [2]) in which a higher base is utilized in a cell
to increase capacity. The interested reader should refer to [18]
for a detailed discussion of non-binary OLS codes for PCM. It
is then possible to use the H matrix of an (n bits, k bits) binary
t-bit error correcting OLS code as an H matrix for an (n symbols, k
symbols) t-symbol error correcting code. The H matrix consists
of only additive and multiplicative identities, 0 and 1.
For example, the matrix H of a (21, 9) binary double-bit
error correcting OLS code in (2) can be used as the H matrix for a
(21, 9) double-symbol error correcting OLS code over GF($2^2$).
Consider the information data d = (123 000 000); its codeword
appears as u = (123 000 000 | 000 123 123 123); $uH^T = 0$ is
true over GF($2^2$).
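The codeword in this example can be reproduced with a short Python sketch. Two assumptions are made for illustration only: symbols of GF($2^b$) are binary-coded so that addition is a bitwise XOR, and the four check groups are rows, columns and the squares (r + c) mod 3 and (2r + c) mod 3, which may differ from the matrix referred to as (2). The function name `encode_symbols` is hypothetical.

```python
def encode_symbols(d, m=3):
    """Check symbols of a non-binary OLS codeword with a 0/1 H matrix.

    Since H only contains 0 and 1, each check symbol is the GF(2^b) sum
    (bitwise XOR of the binary-coded symbols) of the data symbols in its group.
    """
    cells = [(r, c) for r in range(m) for c in range(m)]
    groups = [
        [r for r, _ in cells],
        [c for _, c in cells],
        [(r + c) % m for r, c in cells],
        [(2 * r + c) % m for r, c in cells],
    ]
    checks = []
    for g in groups:
        ci = [0] * m
        for j, sym in enumerate(d):
            ci[g[j]] ^= sym
        checks.append(ci)
    return checks

# Information symbols d = (123 000 000) over GF(2^2)
check_symbols = encode_symbols([1, 2, 3, 0, 0, 0, 0, 0, 0])
```

Under these assumptions the check symbols come out as (000 123 123 123), matching the codeword given in the text: the first group sums 1 + 2 + 3 = 0 over GF($2^2$), and each remaining group sees the three non-zero symbols in separate checks.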
We can decode the non-binary OLS code using OSMLGD,
which requires a non-binary majority circuit. Fig. 11 illustrates
an example of a majority circuit for a double-symbol error
correcting OLS code over GF($2^3$); it generates the majority
for every bit in the binary-coded digits. This circuit does not
always work as a majority circuit; however, it has sufficient
functionality for use in the non-binary OLS decoder. For example, if $S_j$ = (3, 3, 4, 5) = (011, 011, 100, 101), then it outputs
1 = (001) although the majority is 3. However, such an $S_j$ does
not appear when a t-symbol error occurs, because at least t + 1
symbols in $S_j$ are equal to the error magnitude $e_j$, and thus,
every bit in the binary-coded $e_j$ is selected as the majority of
each digit.
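The per-bit behavior described above can be sketched in a few lines of Python. This is a functional illustration only, not the circuit of Fig. 11; the function name and the argument order are hypothetical.

```python
def bitwise_majority(symbols, t, b):
    """Per-bit majority of 2t binary-coded symbols (ties resolve to 0)."""
    out = 0
    for bit in range(b):
        ones = sum((sym >> bit) & 1 for sym in symbols)
        if ones > t:                 # more than t of the 2t inputs have this bit set
            out |= 1 << bit
    return out
```

With t = 2 and b = 3, the input (3, 3, 4, 5) gives 1 as stated in the text, whereas an input such as (3, 3, 3, 5), in which at least t + 1 symbols agree, gives the agreed symbol 3.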
Fig. 12 illustrates the proposed CEDs for non-binary OLS
codes; this figure shows an example over GF($2^3$). It utilizes
only one checker; so, $\oplus c_i$, $\oplus s_i$ and $r_{oth2}$ are included in the
XOR gates prior to the checker. Let $\oplus x[j]$ be $\oplus x$ for the j-th bit in
the binary-coded digits; the following equations are true:

\[
\oplus c_i = \bigoplus_j \oplus c_i[j], \qquad
\oplus s_i = \bigoplus_j \oplus s_i[j], \qquad
r_{oth2} = \bigoplus_j r_{oth2}[j].
\]

Fig. 12. Non-binary OLS decoder with CED.

The other detection signals rsyn1, rsyn2 and roth1 are generated
from $\oplus c_i$ and $\oplus s_i$ in the same manner as for the binary case.
This CED is shown to be SFS for stuck-at faults as follows.
Initially, consider the case in which no fault or error occurs.
If so, $\oplus u[j] = \oplus c_i[j]$ is true for any j and it is correct to
establish that

\[
\sum_{i=1}^{2t} \bigoplus_j \oplus c_i[j] = \sum_{i=1}^{2t} \bigoplus_j \oplus u[j].
\]

So, the output value of MAJ in the roth1 generator appears
as $\bigoplus_j \oplus u[j]$, i.e. it is equal to roth2. Since the syndrome
is all-zero, then the output value of MAJ is selected as roth1
and therefore roth1 = roth2. This is the same as for the binary
case, except that $\oplus u$ and $\oplus c_i$ are taken per bit position as $\oplus u[j]$ and
$\oplus c_i[j]$ and combined over j as $\bigoplus_j \oplus u[j]$ and $\bigoplus_j \oplus c_i[j]$. It is then possible to show that
roth1 = roth2 for any correctable errors in a similar manner by
replacing $\oplus x$ by $\oplus x[j]$ and $\bigoplus_j \oplus x[j]$ (x = u, e, $c_i$, $s_i$ and
$e_i$) in the discussion presented previously for the binary case.
Consider the fault secure property. If a fault occurs on the
syndrome generator, the error calculator (MAJs) or an XOR
gate (that calculates u + v) for a j-th bit in the binary-coded
digits, then the fault affects $\oplus c_i[j]$, $\oplus s_i[j]$ and $r_{oth2}[j]$, but not
$\oplus c_i[j']$, $\oplus s_i[j']$ and $r_{oth2}[j']$ for any $j'$ ($\neq j$). So whenever $\oplus c_i[j]$,
$\oplus s_i[j]$ and $r_{oth2}[j]$ flip, the output of the corresponding XOR
gates (for $\oplus c_i$, $\oplus s_i$ and $r_{oth2}$ as inputs) flips too; the checker detects
the fault (if needed) just like in the binary case. Even if a fault
occurs on the XOR gates or the checker, the decoder outputs the
correct decoded word regardless of whether the fault is detected
or not. In conclusion, the CED is fault secure also in the nonbinary case.
The difference between the CEDs for binary and non-binary
OLS decoders occurs in the XOR gates (for generating the
output for the inputs $\oplus c_i$, $\oplus s_i$ and $r_{oth2}$). If a fault occurs on the
XOR gates for $\oplus c_i$, $\oplus s_i$ and $r_{oth2}$, the values of rsyn1, rsyn2 and
roth2 flip, thus detecting the fault. Therefore, the proposed CED
is SFS for faults in the XOR gates. If a fault flips an input value
of one of the XOR gates, the output value always flips; thus,
the SFS property is still applicable for a single stuck-at fault
outside the XOR gates, which is again similar to the binary case.
Figs. 13 and 14 show the area and power consumption
overheads of the CED for binary and non-binary OLS decoders
over GF($2^b$) (k = 16 to 256; t = 2 to 5; b = 1, 2, 3, 4, 8). The
plots have solid lines connecting the cases with the same (k, t);
for any (k, t), the CED for a non-binary OLS decoder achieves
comparable or better results than a binary OLS decoder. Fig. 15
shows the gate depth of the binary and non-binary OLS decoders with CED; for large b, the gate depth is increased, i.e.
for b = 8 it is 12 (15) longer than for the binary case on average
(in the worst case).

Fig. 13. Area overhead for CED, non-binary OLS decoder.

Fig. 14. Power consumption overhead for CED, non-binary OLS decoder.

Fig. 15. Gate depth of non-binary OLS decoder with CED.

VIII. CONCLUSION

This paper has presented a concurrent error detection (CED)
scheme for OLS parallel decoders. Different from a CED
scheme found in the technical literature [15] that protects only
the syndrome generator, the proposed CED scheme protects the
whole OLS decoder for any single stuck-at fault. This paper
has presented the detailed design and analysis of the proposed
CED scheme and has shown that it is SFS for any single stuck-at fault. Extensive simulation results have also been provided;
different figures of merit such as area, power dissipation, gate
depth and coverage have been assessed. It has been shown
that the use of the CED for the syndrome generator of [15]
increases the area overhead and the fault coverage by 23.6%
and 43.4% on average. The use of the proposed CED increases
them to 35.5% and 100% respectively. Therefore, the proposed
decoders for (n, k) t-bit error correcting OLS codes (k =
16 256; t = 2 5) have modest overhead while providing
100% fault coverage of the whole circuit, thus making it fully
fault tolerant. Comparison with DMR (double modular redundancy) has shown that the proposed CED scheme is superior
in terms of all considered figures of merit. The extension of
the proposed scheme to CED for non-binary OLS codes has
also been presented as applicable to emerging memories such
as PCM. It has been shown that the CED for a non-binary OLS
decoder achieves comparable or better results than a binary
OLS decoder. This paper has dealt only with CED; future work
includes the design and development of an error correction
scheme for OLS parallel decoders.
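The SFS result summarized above rests on the observation from the XOR-gate analysis that a flipped XOR input always flips the XOR output. The following Python sketch illustrates only that gate-level property; it is not the authors' circuit or simulation flow, and all function names (`flip_always_propagates`, `xor_tree`, `stuck_at_changes_output`) are hypothetical names chosen for this illustration. An AND gate is included purely as a contrast case in which a flip can be masked.

```python
from itertools import product


def xor_gate(a: int, b: int) -> int:
    """Two-input XOR on single bits."""
    return a ^ b


def and_gate(a: int, b: int) -> int:
    """Two-input AND, included only as a contrast case."""
    return a & b


def flip_always_propagates(gate) -> bool:
    """Return True if flipping either input changes the output for every input pair."""
    for a, b in product((0, 1), repeat=2):
        good = gate(a, b)
        if gate(a ^ 1, b) == good or gate(a, b ^ 1) == good:
            return False  # some single input flip is masked by this gate
    return True


def xor_tree(bits) -> int:
    """XOR of all bits, standing in for one syndrome-style output (an assumption)."""
    out = 0
    for bit in bits:
        out ^= bit
    return out


def stuck_at_changes_output(bits, index: int, stuck_value: int) -> bool:
    """Force bits[index] to stuck_value and report whether the XOR-tree output changes."""
    faulty = list(bits)
    faulty[index] = stuck_value
    return xor_tree(faulty) != xor_tree(bits)


if __name__ == "__main__":
    print("XOR propagates every single input flip:", flip_always_propagates(xor_gate))
    print("AND propagates every single input flip:", flip_always_propagates(and_gate))
    # A stuck-at fault on an XOR-tree input is visible at the output
    # exactly when it changes the value carried by that line.
    consistent = all(
        stuck_at_changes_output(bits, i, v) == (bits[i] != v)
        for bits in product((0, 1), repeat=4)
        for i in range(4)
        for v in (0, 1)
    )
    print("Stuck-at fault visible iff it alters the line:", consistent)
```

The exhaustive check over all 4-bit inputs mirrors the argument in the text: in an XOR network, a single stuck-at fault either leaves the affected line unchanged (no error) or flips it, and a flipped line always reaches the output, so it cannot be silently masked.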
REFERENCES
[1] N. Savage, "Z-RAM takes on DRAM," IEEE Spectrum, vol. 47, no. 7,
p. 18, Jul. 2010.
[2] N. Papandreou, A. Pantazi, A. Sebastian, M. Breitwisch, C. Lam,
H. Pozidis, and E. Eleftheriou, "Multilevel phase-change memory," in
Proc. IEEE Int. Conf. Electron. Circuits Syst., 2010, pp. 1017–1020.
[3] R. Waser, R. Dittmann, G. Staikov, and K. Szot, "Redox-based
resistive switching memories – Nanoionic mechanisms, prospects, and
challenges," Adv. Mater., vol. 21, no. 25/26, pp. 2632–2633, Jul. 2009.

[4] M. Kund, G. Beitel, C.-U. Pinnow, T. Röhr, J. Schumann, R. Symanczyk,
K.-D. Ufert, and G. Müller, "Conductive bridging RAM (CBRAM): An
emerging non-volatile memory technology scalable to sub 20 nm," in
IEDM Tech. Dig., 2005, pp. 754–757.
[5] M. Hosomi, H. Yamagishi, T. Yamamoto, K. Bessho, Y. Higo, K. Yamane,
H. Yamada, M. Shoji, H. Hachino, C. Fukumoto, H. Nagao, and H. Kano,
"A novel nonvolatile memory with spin torque transfer magnetization
switching: Spin-RAM," in IEDM Tech. Dig., 2005, pp. 459–462.
[6] D. Ielmini, A. L. Lacaita, and D. Mantegazza, "Recovery and drift dynamics of resistance and threshold voltages in phase-change memories,"
IEEE Trans. Electron Devices, vol. 54, no. 2, pp. 308–315, Feb. 2007.
[7] I. V. Karpov, M. Mitra, D. Kau, G. Spadini, Y. A. Kryukov, and
V. G. Karpov, "Fundamental drift of parameters in chalcogenide phase
change memory," J. Appl. Phys., vol. 102, no. 12, pp. 124503-1–124503-6,
Dec. 2007.
[8] S. Lin and D. J. Costello, Error Control Coding, 2nd ed. Englewood
Cliffs, NJ, USA: Prentice-Hall, 2004.
[9] Z. Chishti, A. R. Alameldeen, C. Wilkerson, W. Wu, and S.-L. Lu, "Improving cache lifetime reliability at ultra-low voltages," in Proc. Annu.
IEEE/ACM Int. Symp. Microarch., 2009, pp. 89–99.
[10] C. Wilkerson, H. Gao, A. R. Alameldeen, Z. Chishti, M. Khellah, and
S.-L. Lu, "Trading off cache capacity for reliability to enable low voltage
operation," in Proc. Annu. Int. Symp. Comput. Archit., 2008, pp. 203–214.
[11] R. Datta and N. A. Touba, "Designing a fast and adaptive error correction
scheme for increasing the lifetime of phase change memories," in Proc.
IEEE VLSI Test Symp., 2011, pp. 134–139.
[12] H. Y. Hsiao, D. C. Bossen, and R. T. Chien, "Orthogonal Latin square
codes," IBM J. Res. Dev., vol. 14, no. 4, pp. 390–394, Jul. 1970.
[13] G. C. Cardarilli, S. Pontarelli, M. Re, and A. Salsano, "Concurrent error
detection in Reed-Solomon encoders and decoders," IEEE Trans. Very
Large Scale Integr. (VLSI) Syst., vol. 15, no. 7, pp. 842–846, Jul. 2007.
[14] H. Naeimi and A. DeHon, "Fault secure encoder and decoder for
nanomemory applications," IEEE Trans. Very Large Scale Integr. (VLSI)
Syst., vol. 17, no. 4, pp. 473–486, Apr. 2009.
[15] P. Reviriego, S. Pontarelli, and J. A. Maestro, "Concurrent error detection for orthogonal Latin squares encoders and syndrome computation,"
IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 12,
pp. 2334–2338, Dec. 2013.
[16] J. F. Wakerly, Error Detecting Codes, Self-Checking Circuits and
Applications. Amsterdam, The Netherlands: North Holland, 1978.
[17] J. E. Smith and G. Metze, "Strongly fault secure logic networks," IEEE
Trans. Comput., vol. C-27, no. 6, pp. 491–499, Jun. 1978.
[18] K. Namba and F. Lombardi, "Non-binary Orthogonal Latin Square Codes
for a Multilevel Phase Change Memory (PCM)," Dept. ECE, Northeastern
Univ., Internal Rep., Jul. 2013.
[19] K. Namba and F. Lombardi, "A novel scheme for concurrent error detection of OLS parallel decoders," in Proc. IEEE Int. Symp. DFT VLSI
Nanotechnol. Syst., New York, NY, USA, Oct. 2013, pp. 52–57.

Kazuteru Namba (M'04) received the B.E., M.E.,
and Ph.D. degrees from Tokyo Institute of Technology, Yokohama, Japan, in 1997, 1999, and 2002,
respectively.
In 2002, he joined Chiba University, Chiba, Japan,
where he is currently an Assistant Professor with the
Graduate School of Advanced Integration Science.
His current research interests include dependable
computing.
Dr. Namba is a member of the Institute of Electronics, Information and Communication Engineers
and the Information Processing Society of Japan.

Fabrizio Lombardi (M'81–SM'02–F'09) received
the B.Sc. (Hons.) degree in electronic engineering
from the University of Essex, Colchester, U.K., in
1977; the Master's degree in microwaves and modern
optics and the Diploma degree in microwave engineering from University College London, London,
U.K., both in 1978; and the Ph.D. degree from the
University of London, London, in 1982.
He is currently the holder of the International
Test Conference Endowed Chair Professorship at
Northeastern University, Boston, MA, USA. He has
extensively published in his areas of research interest, which are bioinspired and
nanomanufacturing/computing, very large-scale integration design, testing, and
fault/defect tolerance of digital systems, and he has also coauthored or edited
seven books.
Dr. Lombardi currently serves as an Elected Member of the Board of
Governors of the IEEE Computer Society and on the Administrative/Executive
Boards of the IEEE Nanotechnology Council and the Computing-in-the-Core
non-partisan advocacy coalition for K-12 Computer Science education. In
2007–2010, he was the Editor-in-Chief of the IEEE TRANSACTIONS ON
COMPUTERS. He is also an Associate Editor of the IEEE TRANSACTIONS ON
NANOTECHNOLOGY and the inaugural Editor-in-Chief of the IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING.
