
This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the ICC 2007 proceedings.

Quasi-Cyclic Low-Density Parity-Check Codes in the McEliece Cryptosystem
Marco Baldi, Franco Chiaraluce
Dipartimento di Elettronica, Intelligenza Artificiale e Telecomunicazioni
Università Politecnica delle Marche, Ancona, Italy
Email: {m.baldi, f.chiaraluce}@univpm.it

Roberto Garello, Francesco Mininni
Dipartimento di Elettronica
Politecnico di Torino, Torino, Italy
Email: {garello, francesco.mininni}@polito.it

Abstract—In this paper, a new variant of the McEliece cryptosystem, based on Quasi-Cyclic Low-Density Parity-Check (QC-LDPC) codes, is studied. In principle, such codes can replace the Goppa codes originally used by McEliece; their adoption, however, is subject to cryptanalytic evaluation to ensure sufficient system robustness. The authors conclude that some families of QC-LDPC codes, based on circulant permutation matrices, are inapplicable in this context, due to security issues, whilst other codes, based on the “difference families” approach, can ensure a good level of security against intrusions, even if very large lengths are needed.

I. INTRODUCTION

For many years, error correcting codes have held an important place in cryptography. In particular, as early as 1978, McEliece proposed a public-key cryptosystem based on algebraic coding theory [1] that has proved to be very secure. The rationale of the McEliece algorithm, which adopts a generator matrix as the private key and a linear transformation of it as the public key, lies in the difficulty of decoding a large linear code with no visible structure, which is in fact known to be an NP-complete problem [2]. The original McEliece cryptosystem is still unbroken, as no algorithm able to realize a total break in an acceptable time has been presented up to now. A vast body of literature exists on local deduction attacks, i.e., attacks aimed at finding the plaintext of intercepted ciphertexts without knowing the secret key. Despite the advances in the field, however, the work factors required for this kind of violation remain very high, and quite intractable in practice. Moreover, the system is two or three orders of magnitude faster than RSA, the latter being, probably, the most popular public key algorithm currently used. A variant of the McEliece cryptosystem, due to Niederreiter [3], is even faster.

On the other hand, the McEliece system also shows some drawbacks that can justify the limited interest most cryptographers have devoted to it so far; among them, the large length of the key and the low transmission rate.

The current scenario of error correcting codes is dominated by low-density parity-check (LDPC) codes; thus, it seems interesting to investigate the possible application of this kind of codes in the McEliece framework. The idea of adopting LDPC codes in the public-key cryptosystem was first explored in [4]; however, the main aim of that paper was to demonstrate that the usage of LDPC codes in place of Goppa codes does not allow the key length to be reduced.

In this paper, we slightly modify the system proposed in [4] and study the possible application of Quasi-Cyclic (QC) LDPC codes. QC-LDPC codes should, in principle, overcome both the major drawbacks of the original McEliece cryptosystem, but their adoption must be subject to cryptanalytic evaluation. QC-LDPC codes are easily encodable, and have been included in some recent telecommunication standards [5], [6]. We suggest an algebraic technique to design a very large number of equivalent codes with fixed length and rate, which is the prerequisite for their application in cryptosystems. The design method is described in Section II, where QC-LDPC codes are introduced in general terms and an overview of some design techniques is reported. In Section III, the McEliece system using LDPC codes is reviewed, and the role of its matrix components is discussed. In Section IV an outline of the cryptanalysis of the new system is given, by considering some attacks specifically targeted at LDPC codes.

II. QUASI-CYCLIC LDPC CODES

Quasi-Cyclic codes have been studied for many years [7], but they did not meet with great success in the past because of their inherent decoding complexity in classic implementations. Nowadays, however, the encoding facilities of QC codes can be combined with new efficient LDPC decoding techniques, thus yielding QC-LDPC codes.

The dimension $k$ and the length $n$ of a QC code are both multiples of a positive integer $p$, i.e. $k = p \cdot k_0$ and $n = p \cdot n_0$; the information vector $\mathbf{u} = [u_0, u_1, \ldots, u_{k-1}]$ and the codeword vector $\mathbf{c} = [c_0, c_1, \ldots, c_{n-1}]$ can be divided into $p$ sub-vectors of size $k_0$ and $n_0$, respectively, so that $\mathbf{u} = [\mathbf{u}_0, \mathbf{u}_1, \ldots, \mathbf{u}_{p-1}]$ and $\mathbf{c} = [\mathbf{c}_0, \mathbf{c}_1, \ldots, \mathbf{c}_{p-1}]$.

In a QC code every cyclic shift of $n_0$ positions of a codeword yields another codeword; since every shift of $n_0$ positions is induced by a cyclic shift of $k_0$ positions of the corresponding information word, it can be easily shown that Quasi-Cyclic codes are characterized by the following form of the generator matrix $\mathbf{G}$, where each block $\mathbf{G}_i$ has size $k_0 \times n_0$:

\[
\mathbf{G} =
\begin{bmatrix}
\mathbf{G}_0 & \mathbf{G}_1 & \cdots & \mathbf{G}_{p-1} \\
\mathbf{G}_{p-1} & \mathbf{G}_0 & \cdots & \mathbf{G}_{p-2} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{G}_1 & \mathbf{G}_2 & \cdots & \mathbf{G}_0
\end{bmatrix}
\tag{1}
\]
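As a minimal sketch of the structure in Eq. (1), the following Python fragment assembles a block-circulant generator matrix from $p$ blocks and checks the quasi-cyclic property on a toy example. The function names (`qc_generator`, `encode`, `cyclic_shift`), the toy sizes and the randomly drawn blocks are illustrative assumptions and do not come from the paper.

```python
import random


def qc_generator(blocks):
    """Assemble the (p*k0) x (p*n0) matrix of Eq. (1) from p blocks of size k0 x n0."""
    p = len(blocks)
    k0 = len(blocks[0])
    rows = []
    for i in range(p):              # block row i
        for r in range(k0):         # row inside the block
            row = []
            for j in range(p):      # block column j holds G_{(j - i) mod p}
                row.extend(blocks[(j - i) % p][r])
            rows.append(row)
    return rows


def encode(u, G):
    """Compute c = u * G over GF(2) by XOR-ing the rows of G selected by u."""
    c = [0] * len(G[0])
    for bit, row in zip(u, G):
        if bit:
            c = [a ^ b for a, b in zip(c, row)]
    return c


def cyclic_shift(v, s):
    """Cyclically shift the list v to the right by s positions."""
    s %= len(v)
    return v[-s:] + v[:-s] if s else list(v)


# Toy parameters (hypothetical): p blocks of size k0 x n0 with random binary entries.
p, k0, n0 = 5, 2, 3
rng = random.Random(0)
blocks = [[[rng.randint(0, 1) for _ in range(n0)] for _ in range(k0)]
          for _ in range(p)]
G = qc_generator(blocks)
u = [rng.randint(0, 1) for _ in range(p * k0)]
c = encode(u, G)

# Shifting the information word by k0 positions shifts the codeword by n0 positions.
assert encode(cyclic_shift(u, k0), G) == cyclic_shift(c, n0)
```

The check holds for any choice of blocks, because block column $j$ of row $i$ depends only on $(j - i) \bmod p$; the sketch stores the full matrix only for readability, whereas a practical encoder would keep just the first block row.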

1-4244-0353-7/07/$25.00 ©2007 IEEE



This leads to an efficient encoder implementation, consisting of a barrel shift register of size $p$, followed by a combinatorial network and an adder.

It can be easily proved that the parity-check matrix $\mathbf{H}$ is also characterized by the same “circulant of blocks” form, in which each block $\mathbf{H}_i$ has size $(n_0 - k_0) \times n_0 = r_0 \times n_0$. A row and column rearrangement can be applied that yields the alternative “block of circulants” form, here shown for the parity-check matrix $\mathbf{H}$:

\[
\mathbf{H} =
\begin{bmatrix}
\mathbf{H}^c_{0,0} & \mathbf{H}^c_{0,1} & \cdots & \mathbf{H}^c_{0,n_0-1} \\
\mathbf{H}^c_{1,0} & \mathbf{H}^c_{1,1} & \cdots & \mathbf{H}^c_{1,n_0-1} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{H}^c_{r_0-1,0} & \mathbf{H}^c_{r_0-1,1} & \cdots & \mathbf{H}^c_{r_0-1,n_0-1}
\end{bmatrix}
\tag{2}
\]

In this expression, each block $\mathbf{H}^c_{i,j}$ is a $p \times p$ circulant matrix and, therefore, can be associated with a polynomial $a_{i,j}(x) \in GF_2[x] \bmod (x^p + 1)$, with maximum degree $p - 1$ and coefficients taken from the first row of $\mathbf{H}^c_{i,j}$:

\[
a_{i,j}(x) = a^{i,j}_0 + a^{i,j}_1 x + a^{i,j}_2 x^2 + a^{i,j}_3 x^3 + \cdots + a^{i,j}_{p-1} x^{p-1}
\tag{3}
\]

A. QC-LDPC codes based on Circulant Permutation Matrices

A widespread family of QC-LDPC codes has parity-check matrices in which each block $\mathbf{H}^c_{i,j} = \mathbf{P}_{i,j}$ is a circulant permutation matrix or the null matrix of size $p$; circulant permutation matrices can be represented through the value of their first row shift $p_{i,j}$. Many authors have proposed code construction techniques based on this approach (see [8] and [9] for example) and LDPC codes based on permutation matrices have even been included in the IEEE 802.16e standard [5]. Another reason why these codes have met with success is their ease of implementation [10]. The parity-check matrix of these codes can be represented through a “model” matrix $\mathbf{H}_m$, of size $r_0 \times n_0$, containing the shift values $p_{i,j}$ ($p_{i,j} = 0$ represents the identity matrix, while $p_{i,j} = -1$ represents the null matrix). The code rate is $R = k_0/n_0$ and it can be varied arbitrarily through a suitable choice of $r_0$ and $n_0$. On the other hand, the local girth length for these codes cannot exceed 12, and the imposition of a lower bound on the local girth length translates into a lower bound on the code length [9].

The rows of a permutation matrix sum to the all-one vector, so these parity-check matrices cannot have full rank. Precisely, every parity-check matrix contains at least $r_0 - 1$ rows that are linearly dependent on the others, and the maximum rank is $r_0(p - 1) + 1$. A common solution to ensure full rank consists in imposing the lower triangular (or quasi-lower triangular) form of the matrices, similarly to what is done in the IEEE 802.16e standard [5].

B. QC-LDPC codes based on Difference Families

When designing a QC-LDPC code, the parity-check matrix $\mathbf{H}$ should have some properties that optimize the behavior of the belief propagation-based decoder. First of all, the sparse character of $\mathbf{H}$ limits the maximum number of non-zero coefficients of each polynomial $a_{i,j}(x)$. Then, short length cycles must be avoided in the Tanner graph associated with the code. The latter requirement can be ensured, through algebraic arguments, when $\mathbf{H}$ is a single row of circulants (i.e. $r_0 = 1$), and the corresponding code rate is $R = (n_0 - 1)/n_0$:

\[
\mathbf{H} = \left[ \mathbf{H}^c_0 \;\; \mathbf{H}^c_1 \;\; \cdots \;\; \mathbf{H}^c_{n_0-1} \right]
\tag{4}
\]

Let $(G, +)$ be a finite group of order $v$, $D$ a subset of $G$, and $\mu$ and $\lambda$ positive integers, with $v > \mu \geq 2$. A $(v, \mu, \lambda)$-difference family (DF) is a collection $[D_1, \ldots, D_t]$ of $\mu$-subsets of $G$, called “base blocks”, such that every non-zero element of $G$ appears exactly $\lambda$ times as a difference of two elements from a base block. Difference families can be used to construct QC-LDPC codes [11]. In particular, if we consider the difference family $[D_1, \cdots, D_{n_0}]$, a code based on it has the following form for the polynomials of the circulant matrices $\mathbf{H}^c_0, \cdots, \mathbf{H}^c_{n_0-1}$:

\[
a_i(x) = \sum_{j=1}^{\mu} x^{d_{ij}}, \quad i \in [1; n_0]
\tag{5}
\]

where $d_{ij}$ is the $j$-th element of $D_i$, whose cardinality is $\mu$. With this choice, the designed matrix $\mathbf{H}$ is regular (it has column weight $d_v = \mu$ and row weight $d_c = n_0 \cdot d_v$) and all the elements in the difference family are used. By adopting difference families with $\lambda = 1$ in construction (5), the resulting code has a Tanner graph free of length-4 cycles [11].

Some theorems ensure the existence of difference families with $\lambda = 1$, but they apply only for particular values of the group order, thus putting heavy constraints on the code length. We have recently proposed to overcome such constraints by relaxing part of the hypotheses of these theorems and then refining the outputs through simple computer searches, according to a “Pseudo Difference Families” approach [12]; other authors have proposed an alternative technique, based on “Extended Difference Families”, that ensures great flexibility in the code length [13]. Finally, a multi-set with the properties of a difference family can be constructed by a (constrained) random choice of its elements; we call this a “Random Difference Family” or RDF [14]. This approach is adopted in the present paper.

A first requirement when using an error correcting code in a cryptosystem concerns the possibility of choosing it at random among a very large class of equivalent codes. This way, an opponent, even if aware of the code parameters (i.e. length and rate), neither knows (obviously) the private key nor is able to obtain it through a brute force attack.

When considering LDPC codes, their equivalence needs to be verified under message passing decoding, whose performance does not depend only on the weight spectrum. Generally speaking, two codes exhibit almost identical performance when they have equal (or very similar): i) code length and rate, ii) parity-check matrix density, iii) node degree distributions and iv) girth length distribution in the Tanner graph associated with the code. All these parameters can be kept constant in QC-LDPC codes based on RDFs; moreover, the cardinality of a


set of equivalent codes with fixed parameters can be evaluated, through probabilistic arguments [15], and is approximately:

\[
\binom{p}{d_v}^{n_0} \prod_{l=0}^{n_0-1} \prod_{j=1}^{d_v-1}
\frac{p - j\left[ 2 - p \bmod 2 + \frac{(j-1)^2}{2} + l\, d_v (d_v - 1) \right]}{p - j}
\tag{6}
\]
Denoting this quantity by $C\{RDF(n_0, d_v, p)\}$, very high values can be achieved, even for (relatively) short codes. As an example, by assuming $p = 210$, $n_0 = 4$ (which implies $n = 840$ and $R = 3/4$), and $d_v = 5$, we have $C\{RDF(4, 5, 210)\} \approx 2^{113}$. From the cryptanalysis point of view, however, we should observe that an attack could be made on each $\mathbf{H}^c_i$ block, which is equivalent to saying that the maximum number of trials for an eavesdropper results from setting $n_0 = 1$ in expression (6). Hence, to preserve high cardinalities, longer codes should be considered. As an example, by setting $p = 2000$ and $d_v = 13$, we have $C\{RDF(1, 13, 2000)\} \approx 2^{108}$, high enough to discourage a brute force attack on a single submatrix.

III. THE PROPOSED CRYPTOSYSTEM

A. System description

Bob, in order to receive encrypted messages, randomly chooses an $RDF(n_0, d_v, p)$ and generates the corresponding parity-check matrix $\mathbf{H}$. This way, Bob is provided with a secret QC-LDPC code that, like all the others with the same parameters, is able to correct $t$ errors, with high probability, under belief propagation decoding (the choice of $t$ will be discussed next).

Bob also chooses an $r \times r$ non-singular dense circulant “transformation” matrix, $\mathbf{T}$, and obtains the new matrix $\mathbf{H}' = \mathbf{T} \cdot \mathbf{H}$ that, obviously, has the same null space as $\mathbf{H}$. He then derives a generator matrix, $\mathbf{G}$, corresponding to $\mathbf{H}'$, in row reduced echelon form, and makes it available in the public directory: $\mathbf{G}$ is Bob’s public key and it is completely described by its $(k + 1)$-th column, that is a $k$-bit vector. On the contrary, $\mathbf{H}$ and $\mathbf{T}$ form the private (or secret) key, that is owned by Bob only.

The system also requires a $k \times k$ non-singular “scrambling” matrix $\mathbf{S}$, that is suitably chosen and publicly available (it can even be embedded in the algorithm implementation). Matrix $\mathbf{S}$ also has the “block of circulants” form, and its role is to cause propagation of residual errors at the eavesdropper’s receiver, leaving the opponent in the most uncertain condition (equivalent to guessing the plaintext at random). For this purpose, it must in turn be sufficiently dense.

When Alice wants to send an encrypted message to Bob, she fetches $\mathbf{G}$ from the public directory and calculates $\mathbf{G}' = \mathbf{S}^{-1} \cdot \mathbf{G}$. Then, she divides her message into $k$-bit blocks and encrypts each block $\mathbf{u}$ as follows:

\[
\mathbf{x} = \mathbf{u} \cdot \mathbf{G}' + \mathbf{e}
\tag{7}
\]

where $\mathbf{x}$ is the encrypted version of $\mathbf{u}$ and $\mathbf{e}$ is a random vector of $t$ intentional errors.

At the receiver side, Bob uses his private key for decoding. In the ideal case of a channel that does not introduce additional errors, by a suitable choice of $t$, all errors can be corrected with a high probability (in the extremely rare case of an LDPC decoder failure, message resend is requested). Belief propagation decoding, however, works only on sparse Tanner graphs, free of short length cycles; therefore, in order to exploit the actual correction capability, the knowledge of the sparse parity-check matrix $\mathbf{H}$ is essential.

By applying the decoding algorithm on the ciphertext $\mathbf{x}$, Bob can derive $\mathbf{u} \cdot \mathbf{G}' = \mathbf{u} \cdot \mathbf{S}^{-1} \cdot \mathbf{G}$. On the other hand, as $\mathbf{G}$ is in row reduced echelon form, the first $k$ coordinates of this product directly reveal $\mathbf{u} \cdot \mathbf{S}^{-1}$, and right multiplication by $\mathbf{S}$ allows the plaintext $\mathbf{u}$ to be extracted as desired.

The eavesdropper Eve, who wishes to intercept the message from Alice to Bob and is not able to derive matrix $\mathbf{H}$ from the knowledge of $\mathbf{G}$, is, as expected, in a much less favorable position. Even if $\mathbf{H}'$ could be derived from $\mathbf{G}$, it is made unsuitable for LDPC decoding through the action of matrix $\mathbf{T}$. This aspect will be examined in more detail in the next subsection. Moreover, we observe that $\mathbf{H}'$ is dense and this implies, for Eve, large decoding complexity.

B. Choice of t

LDPC codes are very powerful but, in general, it is not easy to explicitly determine the decoding radius for an LDPC code with belief propagation decoding. So, one cannot find “a priori” the exact value of the parameter $t$. However, once the code parameters have been fixed, a suitable value for $t$ can be estimated through simulation. An example is shown in Fig. 1 for a code with $n = 8000$, $k = 6000$ ($p = 2000$) and $d_v = 13$. Simulation has been executed over a channel that adds $t$ errors (randomly distributed) in each codeword of a sequence long enough to ensure a satisfactory confidence level for the results, determining the residual Bit Error Rate (BER) and Frame Error Rate (FER) values when decoding by either matrix $\mathbf{H}$ or matrix $\mathbf{H}' = \mathbf{T} \cdot \mathbf{H}$. The logarithmic Sum-Product Algorithm has been adopted, and likelihoods have been initialized coherently with the channel model.

Figure 1. Performance attainable by $\mathbf{H}$ and by $\mathbf{H}'$ for a code with $n = 8000$, $k = 6000$ and $d_v = 13$.

Decoding by $\mathbf{H}'$ represents the opponent’s point of view: in this case, a sparse $\mathbf{T}$ (same column weight as $\mathbf{H}$) has been


adopted; by employing denser $\mathbf{T}$’s (as may be required for security reasons) the performance of $\mathbf{H}'$ would be even worse. From the figure, we see that, decoding by $\mathbf{H}$, the number of residual bit and frame errors rapidly vanishes on the left of the knee at $t \approx 120$. Because of the waterfall behavior of the curves, we can reasonably state that, for $t = 40$, both the BER and the FER assume negligible values (impossible to simulate), i.e., decoding is practically error free. So, it seems realistic to assume $t = 40$ as an estimate of the correction capability of the considered LDPC code. This does not ensure that all possible combinations of $t = 40$ (or even fewer) errors are corrected when decoding by $\mathbf{H}$, but gives sufficient guarantee that residual errors are extremely rare. From the figure we see that, instead, 40 intentional errors cannot be efficiently corrected by using $\mathbf{H}'$: an eavesdropper achieves the same BER performance as the uncoded transmission (under the same assumption of channel average error probability $t/n$). In other words, the eavesdropper's decoder is useless. It should be noted, however, that, from a cryptographic point of view, a BER on the order of $5 \cdot 10^{-3}$, like that achievable for $t = 40$ when using matrix $\mathbf{H}'$, is unacceptable. So, it is necessary to reinforce bad decoding by means of a scrambling action able to “propagate” the residual errors after decoding; this is the role of matrix $\mathbf{S}$, and it is discussed in the next subsection.

C. Choice of matrix S

The residual errors after decoding are “propagated” by the subsequent product by $\mathbf{S}$, so that, at the output of the eavesdropper’s decoder, not only is the FER equal to 1 (which means that all decoded sequences contain at least one erroneous bit) but also the BER is practically equal to 1/2 (that is the most uncertain condition for an opponent). Similarly to $\mathbf{T}$, $\mathbf{S}$ also has to be rather dense in order to best perform this error propagating action.

If we consider the information part of the vector decoded by an opponent, $\tilde{\mathbf{u}}$, it can be expressed as $\tilde{\mathbf{u}} = \mathbf{u} \cdot \mathbf{S}^{-1} + \tilde{\mathbf{e}}$, where $\tilde{\mathbf{e}}$ is the corresponding part of the residual error vector. After the descrambling process, the decoded message is $\hat{\mathbf{u}} = \tilde{\mathbf{u}} \cdot \mathbf{S} = \mathbf{u} + \tilde{\mathbf{e}} \cdot \mathbf{S}$; therefore the scrambling matrix $\mathbf{S}$ operates directly on the residual errors, causing their propagation. Indeed, the extent of the propagation effect can be predicted through simple combinatorial arguments, under the hypothesis that $\mathbf{S}$ is randomly generated [14]. At the same time, this prediction makes it possible to design the features of matrix $\mathbf{S}$ (its density, in particular, for a given size) that are compatible with the achievement of a satisfactory security level.

Following the procedure described in [14], we have calculated, as an example, that, for $n = 8000$, $k = 6000$ and $t = 40$, an $\mathbf{S}$ matrix with density of about 20% is sufficient to ensure maximum error “propagation” action (that consists in having a BER nearly equal to 1/2, the most uncertain condition for an opponent). On the other hand, it is expected that the scrambling action also has an impact on the authorized user (Bob). His (rare) residual errors are propagated as well, thus causing a slight degradation in the error correction performance that, however, is compensated by the possibility of making the system secure.

IV. SYSTEM CRYPTANALYSIS

Many potential attacks can be considered for the cryptanalysis of the McEliece system based on LDPC codes. All the attacks conceived for the original McEliece cryptosystem still apply; among these, brute force attacks, information set decoding attacks, message resend attacks and attacks based on minimum weight codeword searches. However, we have verified that the classic countermeasures to these attacks still apply to the new version of the cryptosystem and, with a suitable choice of the system parameters, they all remain infeasible (for example, $n = 8000$, $k = 6000$ and $t = 40$ are suitable choices for this purpose). For the sake of brevity, we do not report details on the work factors of such attacks; a more extensive analysis will be included in [16].

In this paper, we limit ourselves to discussing some attacks targeted at LDPC codes, and QC-LDPC codes in particular, with the aim of showing that circulant permutation matrices are inapplicable, while codes based on RDFs can be used, though requiring rather large lengths.

A. Density Reduction Attack [4]

Let $\mathbf{h}_i$ be the $i$-th row of matrix $\mathbf{H}$ and $\mathbf{h}'_j$ the $j$-th row of matrix $\mathbf{H}' = \mathbf{T} \cdot \mathbf{H}$, and let $(GF_2^n, +, \times)$ be the vector space of all the possible binary $n$-tuples with the operations of addition (i.e. the logical “XOR”) and multiplication (i.e. the logical “AND”). Let us define “orthogonality” in the following sense: two binary vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal, i.e. $\mathbf{u} \perp \mathbf{v}$, iff $\mathbf{u} \times \mathbf{v} = \mathbf{0}$. From the cryptosystem description, it follows that $\mathbf{h}'_j = \mathbf{h}_{i_1} + \mathbf{h}_{i_2} + \ldots + \mathbf{h}_{i_z}$, where $z$ represents the Hamming weight of each row (column) of $\mathbf{T}$.

We can suppose that many $\mathbf{h}_i$ are mutually orthogonal, due to the sparsity of matrix $\mathbf{H}$. Let $\mathbf{h}'_{j^a} = \mathbf{h}_{i^a_1} + \mathbf{h}_{i^a_2} + \ldots + \mathbf{h}_{i^a_z}$ and $\mathbf{h}'_{j^b} = \mathbf{h}_{i^b_1} + \mathbf{h}_{i^b_2} + \ldots + \mathbf{h}_{i^b_z}$ be two distinct rows of $\mathbf{H}'$, with $\mathbf{h}_{i^a_1} = \mathbf{h}_{i^b_1} = \mathbf{h}_{i_1}$ (this happens when $\mathbf{T}$ has two non-zero entries in the same column $i_1$, at rows $j^a$ and $j^b$). In this case, it may happen that $\mathbf{h}'_{j^a} \times \mathbf{h}'_{j^b} = \mathbf{h}_{i_1}$ (this occurs, for example, when $\mathbf{h}_{i^a_1} \perp \mathbf{h}_{i^b_2}, \ldots, \mathbf{h}_{i^a_1} \perp \mathbf{h}_{i^b_z}, \ldots, \mathbf{h}_{i^a_z} \perp \mathbf{h}_{i^b_1}, \ldots, \mathbf{h}_{i^a_z} \perp \mathbf{h}_{i^b_z}$). Therefore, a row of $\mathbf{H}$ could be derived as the product of two rows of $\mathbf{H}'$. At this point, if the code is quasi-cyclic with the considered form, its whole parity-check matrix can be derived, because the other rows of $\mathbf{H}$ are simply block-wise circularly shifted versions of the one obtained through the attack.

Even when the analysis of all possible pairs of rows of $\mathbf{H}'$ does not reveal a row of $\mathbf{H}$, it may produce a new matrix, $\mathbf{H}''$, sparser than $\mathbf{H}'$, able to allow efficient LDPC decoding. Alternatively, the attack can be iterated on $\mathbf{H}''$ and it can succeed after more than one iteration; in general, the attack requires $\rho - 1$ iterations when not less than $\rho$ rows of $\mathbf{H}'$ have in common a single row of $\mathbf{H}$. This attack procedure can even be applied to a single circulant block of $\mathbf{H}'$, say $\mathbf{H}'_i$, to derive its corresponding block $\mathbf{H}_i$ of $\mathbf{H}$, from which $\mathbf{T} = \mathbf{H}'_i \cdot \mathbf{H}_i^{-1}$ can be obtained.

We have verified elsewhere [15] that the attack can be avoided through a proper selection of matrix $\mathbf{T}$, but this


approach forces constraints on the code parameters. In this work, following [4], we propose instead to rely only on the density of matrix $\mathbf{T}$.

Let us suppose that the attack is carried out on the single block $\mathbf{H}'_i$ (it can be easily generalized to the whole $\mathbf{H}'$). The first iteration of the attack, for the considered case of circulant $\mathbf{H}'_i$, is equivalent to calculating the periodic autocorrelation of the first row of $\mathbf{H}'_i$. When $\mathbf{H}'_i$ is sparse (i.e. $\mathbf{T}$ is sparse) the autocorrelation is everywhere null (or very small), except for a limited number of peaks that reveal the pairs of rows of $\mathbf{H}'_i$ able to give information on the structure of $\mathbf{H}_i$. On the contrary, when $\mathbf{H}'_i$ is dense (suppose with one half of its symbols equal to 1), the autocorrelation is always high, and no information is available to the opponent. In this case, Eve is in the same condition as guessing at random. The relevant point is that having a dense matrix $\mathbf{H}'$, in the proposed system, does not affect the public key length: matrix $\mathbf{G}$ remains completely described by its $(k + 1)$-th column.

B. Attack to codes based on Circulant Permutation Matrices

Let the private key $\mathbf{H}$ be formed by $r_0 \times n_0$ circulant permutation or null matrices $\mathbf{P}_{i,j}$ of size $p$ and have lower triangular form, to ensure full rank. For the sake of clarity, we consider a simple example with $r_0 = 3$ and $n_0 = 6$:

\[
\mathbf{H} =
\begin{bmatrix}
\mathbf{P}_{1,1} & \mathbf{P}_{1,2} & \mathbf{P}_{1,3} & \mathbf{P}_{1,4} & \mathbf{0} & \mathbf{0} \\
\mathbf{P}_{2,1} & \mathbf{P}_{2,2} & \mathbf{P}_{2,3} & \mathbf{P}_{2,4} & \mathbf{P}_{2,5} & \mathbf{0} \\
\mathbf{P}_{3,1} & \mathbf{P}_{3,2} & \mathbf{P}_{3,3} & \mathbf{P}_{3,4} & \mathbf{P}_{3,5} & \mathbf{P}_{3,6}
\end{bmatrix}
\tag{8}
\]

We assume that all $\mathbf{P}_{i,j}$’s are non null. The public key is obtained as $\mathbf{H}' = \mathbf{T} \cdot \mathbf{H}$, where $\mathbf{T}$ consists of $r_0 \times r_0$ square circulant dense blocks $\mathbf{T}_{i,j}$, each of size $p$. Although the exact knowledge of the private key would ensure correct decoding, for the eavesdropper it suffices to find a pair of matrices $(\mathbf{T}_d, \mathbf{H}_d)$, with the same dimensions as $(\mathbf{T}, \mathbf{H})$, such that $\mathbf{H}' = \mathbf{T}_d \cdot \mathbf{H}_d$, and $\mathbf{H}_d$ is sparse enough to allow efficient belief propagation decoding. This can be accomplished considering that, given an invertible square matrix $\mathbf{Z}$ of size $r = p \cdot r_0$, the following relationship holds:

\[
\mathbf{H}' = \mathbf{T} \cdot \mathbf{H} = \mathbf{T} \cdot \mathbf{Z} \cdot \mathbf{Z}^{-1} \cdot \mathbf{H} = \mathbf{T}_d \cdot \mathbf{H}_d
\tag{9}
\]

where $\mathbf{T}_d = \mathbf{T} \cdot \mathbf{Z}$ and $\mathbf{H}_d = \mathbf{Z}^{-1} \cdot \mathbf{H}$. If we separate $\mathbf{H}$ into its left ($r \times k$) and right ($r \times r$) parts, $\mathbf{H} = [\mathbf{H}_l | \mathbf{H}_r]$, a particular choice of $\mathbf{Z}$ coincides with the lower triangular part of $\mathbf{H}$, i.e., for the considered example:

\[
\mathbf{Z} = \mathbf{H}_r =
\begin{bmatrix}
\mathbf{P}_{1,4} & \mathbf{0} & \mathbf{0} \\
\mathbf{P}_{2,4} & \mathbf{P}_{2,5} & \mathbf{0} \\
\mathbf{P}_{3,4} & \mathbf{P}_{3,5} & \mathbf{P}_{3,6}
\end{bmatrix}
\tag{10}
\]

With this choice of $\mathbf{Z}$, $\mathbf{H}_d$ assumes a particular form, namely:

\[
\mathbf{H}_d = [\mathbf{H}_{dl} | \mathbf{H}_{dr}] = \mathbf{H}_r^{-1} \cdot [\mathbf{H}_l | \mathbf{H}_r] = \left[ \mathbf{H}_r^{-1} \cdot \mathbf{H}_l \,|\, \mathbf{I} \right]
\tag{11}
\]

where the right part of $\mathbf{H}_d$ is an identity matrix of size $r = p \cdot r_0$. From this, it follows that:

\[
\mathbf{H}' = \mathbf{T}_d \cdot \mathbf{H}_d = [\mathbf{T}_d \cdot \mathbf{H}_{dl} \,|\, \mathbf{T}_d]
\tag{12}
\]

Eq. (12) is the starting point for the eavesdropper: looking at $\mathbf{H}'$, she immediately knows $\mathbf{T}_d$ and, hence, $\mathbf{H}_d$, both corresponding to the choice of $\mathbf{Z}$ expressed by Eq. (10). Moreover, it can be proved that $\mathbf{H}_d$, so found, can be sparse enough to allow efficient belief propagation decoding. Because of the definition of $\mathbf{H}_d$, its sparsity depends on that of $\mathbf{Z}^{-1}$. Actually, for the considered example, we have:

\[
\mathbf{Z}^{-1} = \mathbf{H}_r^{-1} =
\begin{bmatrix}
\mathbf{Z}_{1,1} & \mathbf{0} & \mathbf{0} \\
\mathbf{Z}_{2,1} & \mathbf{Z}_{2,2} & \mathbf{0} \\
\mathbf{Z}_{3,1} & \mathbf{Z}_{3,2} & \mathbf{Z}_{3,3}
\end{bmatrix}
\tag{13}
\]

where $\mathbf{Z}_{1,1} = \mathbf{P}^T_{1,4}$, $\mathbf{Z}_{2,2} = \mathbf{P}^T_{2,5}$, $\mathbf{Z}_{3,3} = \mathbf{P}^T_{3,6}$, $\mathbf{Z}_{2,1} = \mathbf{P}^T_{2,5} \mathbf{P}_{2,4} \mathbf{P}^T_{1,4}$, $\mathbf{Z}_{3,2} = \mathbf{P}^T_{3,6} \mathbf{P}_{3,5} \mathbf{P}^T_{2,5}$ and $\mathbf{Z}_{3,1} = \mathbf{P}^T_{3,6} \mathbf{P}_{3,4} \mathbf{P}^T_{1,4} + \mathbf{P}^T_{3,6} \mathbf{P}_{3,5} \mathbf{P}^T_{2,5} \mathbf{P}_{2,4} \mathbf{P}^T_{1,4}$, and the orthogonality of permutation matrices has been used.

It follows that the main diagonal of $\mathbf{Z}^{-1}$ contains permutation matrices; the diagonal below it contains products of permutation matrices, i.e. permutation matrices again; the following one contains sums of permutation matrices. The same analysis holds for an arbitrary $r_0$: the blocks $\mathbf{Z}_{i+j,1+j}$, $i \in [2; r_0]$, $j \in [0; r_0 - i]$ have column (row) weight $2^{i-2}$. It follows that, when $r_0$ is small, $\mathbf{Z}^{-1}$ is sparse and, consequently, $\mathbf{H}_d$ is sparse as well; so it could be used by Eve for efficient decoding.

Furthermore, the attack can continue and aim at obtaining another matrix, $\mathbf{H}_b$, that has the same density as $\mathbf{H}$ and, therefore, can produce a total break of the cryptosystem. This new matrix corresponds to another choice of $\mathbf{Z}$, namely $\mathbf{Z} = \mathbf{Z}^*$, that, for the considered example, has the following form:

\[
\mathbf{Z}^* =
\begin{bmatrix}
\mathbf{P}_{1,4} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{P}_{2,5} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{P}_{3,6}
\end{bmatrix}
\tag{14}
\]

For this choice of $\mathbf{Z}$, $\mathbf{H}_b$ has the same density as $\mathbf{H}$ and each of its rows is a permuted version of the corresponding row of $\mathbf{H}$. $\mathbf{H}_b$ is in row reduced echelon form.

An attack that aims at finding $\mathbf{H}_b$ can be conceived by analyzing the structure of $\mathbf{H}_d$ in the form (11), with $\mathbf{H}_r^{-1}$ expressed by Eq. (13). It can be noticed that the first row of $\mathbf{H}_d$ equals that of $\mathbf{H}$ multiplied by $\mathbf{P}^T_{1,4}$, so it corresponds to the first row of $\mathbf{H}_b$. The second row of $\mathbf{H}_d$ equals that of $\mathbf{H}$ multiplied by $\mathbf{P}^T_{2,5}$ plus a shifted version of the first row of $\mathbf{H}_b$. The third row of $\mathbf{H}_d$ equals that of $\mathbf{H}$ multiplied by $\mathbf{P}^T_{3,6}$ plus two shifted versions of the first row of $\mathbf{H}_b$ and a shifted version of the second row of $\mathbf{H}_b$. Therefore, the attack can exploit a recursive procedure.

When sparse matrices are added, it is highly probable that their symbols “1” do not overlap. For this reason, in a sum of rows of $\mathbf{H}_b$, the contributions of shifted versions of known rows can be isolated through correlation operations and, therefore, eliminated. This makes it possible to deduce each row of $\mathbf{H}_b$ from the previous ones and, this way, derive the entire $\mathbf{H}_b$. Obviously, the hypothesis of non-overlapping elements is most likely verified when the blocks of $\mathbf{H}$ are sparse and $r_0$ is not too high. For example, we have found that matrices with $r_0 = 6$


and $p = 40$ (i.e. density of the blocks 0.025) are highly exposed to a total break. This is not a trivial case; for example, the codes included in the IEEE 802.16e standard [5] have $p \in [24; 96]$ and $r_0 = 6$ when the code rate is 3/4. The risk is even increased by the fact that most useful codes have $\mathbf{H}$ matrices that contain a high number of null blocks, in order to allow efficient belief propagation decoding. Therefore, the density of $\mathbf{Z}^{-1}$ and, hence, of $\mathbf{H}_d$ is reduced and the attack is simpler.

An attack of this kind is addressed to LDPC codes based on circulant permutation matrices, whilst it is not applicable to LDPC codes based on difference families (described in Section II-B); for this reason, the latter appear more secure.

C. Attack to the Dual Code

At the present stage of cryptanalysis, the most dangerous attack for every instance of the McEliece cryptosystem based on LDPC codes arises from the fact that an opponent knows that the dual of the secret code contains very low weight codewords and can directly search for them, thus recovering $\mathbf{H}$.

The dual of the secret code can be generated by $\mathbf{H}$; therefore it has at least $A_{d_c} \geq r$ codewords with weight $d_c$. Each of them completely describes $\mathbf{H}$ and, if known, allows the opponent to break the system by recovering the private key. From a cryptographic point of view, $A_{d_c}$ should be known in order to precisely evaluate the work factor of the attack, but this is not, in general, a simple task. However, we can consider that $d_c \ll n$ and that sparse vectors most likely sum into denser vectors. Therefore, we will consider $A_{d_c} = r$ in the following.

One of the most efficient probabilistic algorithms for finding low weight codewords is due to Stern [17], and has been recently applied to LDPC codes by Hirotomo et al. [18]. The algorithm works on the parity-check matrix of a code and has two parameters, $q$ and $l$, that represent the number of matrix columns and rows considered at each iteration, respectively. Optimal values for $q$ and $l$ can be derived considering their influence on the total number of binary operations needed for

$A_w = 2000$, the minimum work factor, that corresponds to $(q, l) = (3, 38)$, is $2^{35.7}$. It is evident that such a system would be exposed to a total break.

This attack is particularly insidious since the work factor of Stern’s algorithm mainly depends on the relative weight searched; in order to increase it, we should adopt denser parity-check matrices. On the other hand, however, such matrices must be sparse enough to ensure the absence of length-4 cycles and allow efficient belief propagation decoding. This means that it is possible to obtain high work factors only by employing relatively large codes. For example, if we choose $n = 72000$, $r = k_d = 24000$, $n_0 = 3$ ($R = 2/3$) and $d_v = 41$ ($d_c = 123$), we have $WF = 2^{80.9}$ (minimum for $q = 3$, $l = 53$), that ensures a satisfactory system robustness.

REFERENCES

[1] R. J. McEliece, “A public-key cryptosystem based on algebraic coding theory,” DSN Progress Report, pp. 114–116, 1978.
[2] E. Berlekamp, R. McEliece, and H. van Tilborg, “On the inherent intractability of certain coding problems,” IEEE Trans. Inform. Theory, vol. 24, pp. 384–386, May 1978.
[3] H. Niederreiter, “Knapsack-type cryptosystems and algebraic coding theory,” Probl. Contr. and Inform. Theory, vol. 15, pp. 159–166, 1986.
[4] C. Monico, J. Rosenthal, and A. Shokrollahi, “Using low density parity check codes in the McEliece cryptosystem,” in Proc. IEEE ISIT 2000, Sorrento, Italy, Jun. 2000, p. 215.
[5] 802.16e 2005, IEEE Standard for Local and Metropolitan Area Networks - Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems - Amendment for Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands, IEEE Std., Dec. 2005.
[6] IEEE P802.11 Wireless LANs WWiSE Proposal: High throughput extension to the 802.11 Standard, IEEE Std. IEEE 11-04-0886-00-000n, Aug. 2004.
[7] R. Townsend and E. J. Weldon, “Self-orthogonal quasi-cyclic codes,” IEEE Trans. Inform. Theory, vol. 13, pp. 183–195, Apr. 1967.
[8] R. Tanner, D. Sridhara, and T. Fuja, “A class of group-structured LDPC codes,” in Proc. Int. Symp. Commun. Theory and Appl., ISCTA 01, Ambleside, UK, Jul. 2001.
[9] M. P. C. Fossorier, “Quasi-cyclic low-density parity-check codes from circulant permutation matrices,” IEEE Trans. Inform. Theory, vol. 50, pp. 1788–1793, Aug. 2004.
[10] D. Hocevar, “LDPC code construction with flexible hardware implemen-
finding a codeword of given weight. If we suppose that the tation,” in Proc. IEEE ICC 2003, vol. 4, Anchorage, Alaska, May 2003,
algorithm is performed on the dual of the secret code, with pp. 2708–2712.
length n and dimension kd (i.e. redundancy rd = n − kd ), the [11] S. Johnson and S. Weller, “A family of irregular LDPC codes with low
encoding complexity,” IEEE Commun. Lett., vol. 7, pp. 79–81, Feb. 2003.
probability of finding, in one iteration, a codeword with weight [12] M. Baldi and F. Chiaraluce, “New quasi cyclic low density parity check
w, supposed unique, is: codes based on difference families,” in Proc. Int. Symp. Commun. Theory
w n−w  w−qn−kd /2−w+q   and Appl., ISCTA 05, Ambleside, UK, Jul. 2005, pp. 244–249.
n−kd −w+2q [13] T. Xia and B. Xia, “Quasi-cyclic codes from extended difference fam-
q kd /2−q q kd /2−q
Pw =  n  · n−k /2 · n−k 
l ilies,” in Proc. IEEE Wireless Commun. and Networking Conf., vol. 2,
d d
kd /2 kd /2 l New Orleans, USA, Mar. 2005, pp. 1036–1040.
[14] M. Baldi, F. Chiaraluce, and R. Garello, “On the usage of quasi-cyclic
If the code contains Aw codewords with weight w, the low-density parity-check codes in the McEliece cryptosystem,” in Proc.
First Int. Conf. on Commun. and Electron. (ICCE’06), Hanoi, Vietnam,
probability to find one of them becomes Pw,Aw ≤ Aw Pw , Oct. 2006, pp. 305–310.
and the average number of iterations needed by a successful [15] M. Baldi, “Quasi-cyclic low-density parity-check codes and their appli-
−1
search is m = Pw,A w
. On the other hand, each iteration of the cation to cryptography,” Ph.D. dissertation, Università Politecnica delle
Marche, Ancona, Italy, Nov. 2006.
algorithm requires [16] M. Baldi and F. Chiaraluce, “Cryptanalysis of a new instance of McEliece
 2 cryptosystem based on QC-LDPC codes,” in preparation.
rd3 2 kd /2 2qrd kdq/2 [17] J. Stern, “A method for finding codewords of small weight,” in G. Cohen
N= + kd rd + 2ql + and J. Wolfmann, Coding Theory and Applications, Springer-Verlag, Ed.,
2 q 2l no. 388 in Lecture Notes in Computer Science, 1989, pp. 106–113.
binary operations; so, the total work factor can be estimated as [18] M. Hirotomo, M. Mohri, and M. Morii, “A probabilistic computation
method for the weight distribution of low-density parity-check codes,” in
W F = mN . If we consider the following choice of the system Proc. IEEE ISIT 2005, Adelaide, Australia, Sep. 2005, pp. 2166–2170.
parameters: n = 8000, kd = 2000, w = dc = n0 · dv = 52,

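As a numerical companion to the work factor estimate of Section III-C, the following Python sketch evaluates the per-iteration success probability, the per-iteration operation count and the resulting work factor of Stern's algorithm on the dual code, all in base-2 logarithms. It is only an illustration under stated assumptions: the function names are hypothetical, $k_d$ is assumed even so that $k_d/2$ is an integer, and the bound $P_{w,A_w} \leq A_w P_w$ is taken with equality, so the returned value is a lower estimate of the work factor.

```python
from fractions import Fraction
from math import comb, log2


def stern_log2_success_probability(n, k_d, w, q, l):
    """log2 of the probability that one iteration finds a given codeword
    of weight w in a code of length n and dimension k_d (k_d assumed even).
    Raises ValueError if the parameters make the probability zero."""
    half = k_d // 2
    # Numerators and denominators of the three binomial ratios, kept as
    # exact integers so that very large code lengths do not overflow.
    num = (comb(w, q) * comb(n - w, half - q)
           * comb(w - q, q) * comb(n - half - w + q, half - q)
           * comb(n - k_d - w + 2 * q, l))
    den = comb(n, half) * comb(n - half, half) * comb(n - k_d, l)
    return log2(num) - log2(den)


def stern_log2_ops_per_iteration(n, k_d, q, l):
    """log2 of the number of binary operations in one iteration."""
    r_d = n - k_d  # redundancy of the code under attack
    c = comb(k_d // 2, q)
    ops = (Fraction(r_d ** 3, 2) + k_d * r_d ** 2 + 2 * q * l * c
           + Fraction(2 * q * r_d * c * c, 2 ** l))
    return log2(ops.numerator) - log2(ops.denominator)


def stern_log2_work_factor(n, k_d, w, a_w, q, l):
    """log2 of the work factor, assuming the search succeeds with
    probability a_w times the single-codeword probability (assumption)."""
    log2_m = -(log2(a_w) + stern_log2_success_probability(n, k_d, w, q, l))
    return log2_m + stern_log2_ops_per_iteration(n, k_d, q, l)


if __name__ == "__main__":
    # Second parameter set quoted in the text, with A_w = r = 24000.
    print(stern_log2_work_factor(72000, 24000, 123, 24000, 3, 53))
```

The paper reports a work factor of $2^{80.9}$ for the parameters in the usage example; this sketch is not guaranteed to reproduce that figure exactly, since rounding and implementation details of the original evaluation are not given. The optimal pair (q, l) could be found by calling `stern_log2_work_factor` over a small grid of values and keeping the minimum.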