Error-Correcting Codes and Cryptography 2017
Cryptography
1/45
CONTENTS
I Error-correcting codes; the basics
II Quasi-cyclic codes; codes generated by circulants
III Cyclic codes
IV The McEliece cryptosystem
V Burst-correcting array codes
2/45
I Error-correcting codes; the basics
[Figure: the communication channel. The Sender Encodes message m into codeword c; Noise is added on the Channel (⊕); the Decoder turns the received word r back into m for the Receiver.]
4/45
A code C is such a (well-chosen) subset of {0, 1}^n.
So codes here will be binary codes. The generalization to other field sizes is easy.
The weight of a word is the number of non-zero coordinates.
c0 = 0 0 0 0 0
c1 = 0 0 1 1 1
c2 = 1 1 0 0 1
c3 = 1 1 1 1 0
5/45
Suppose that each two codewords differ in at least d coordinates (have distance at least d) and put t = ⌊(d − 1)/2⌋.
c0 = 0 0 0 0 0
c1 = 0 0 1 1 1
c2 = 1 1 0 0 1
c3 = 1 1 1 1 0

(here d = 3, t = 1)
Then the code C is said to be t-error-correcting, because if you transmit (or store) a codeword and not more than t errors have occurred upon reception (or read-out) due to noise or damage, then the received word will still be closer to the original codeword than to any other.
For instance, if you receive
r = 0 1 0 0 1
you know that c2 is the most likely transmitted codeword.
6/45
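As an aside, nearest-codeword decoding of this toy code is easy to sketch in a few lines of Python (the helper names are ours, not from the slides):

```python
# Nearest-codeword decoding for the toy code C = {c0, c1, c2, c3} above.

def hamming_distance(a, b):
    """Number of coordinates in which the words a and b differ."""
    return sum(x != y for x, y in zip(a, b))

C = [
    (0, 0, 0, 0, 0),  # c0
    (0, 0, 1, 1, 1),  # c1
    (1, 1, 0, 0, 1),  # c2
    (1, 1, 1, 1, 0),  # c3
]

def decode(r):
    """Return the codeword closest to the received word r."""
    return min(C, key=lambda c: hamming_distance(c, r))

r = (0, 1, 0, 0, 1)
print(decode(r))  # → (1, 1, 0, 0, 1), i.e. c2
```

For a code this small an exhaustive search over all codewords is fine; for real codes this is exactly what one cannot afford, which is the point of the structured constructions that follow.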
From now on codes will be linear, meaning that C is a linear subspace of
{0, 1}^n. We use the notation [n, k, d] codes, where k denotes the dimension
of the code C and d the so-called minimum distance of C : the minimum
of all distances between codewords.
The quantity r = n − k is called the redundancy of the code. This is the
number of additional coordinates (apart from the actual information being
transmitted) that make error-correction possible.
It follows from the linear structure of C that an appropriate choice of k
codewords forms a basis for the code.
A basis of the code C = {00000, 00111, 11001, 11110} is given by the rows of

  0 0 1 1 1
  1 1 0 0 1
7/45
A basis of the linear (!) [7, 4, 3] code introduced before is given by c1, c2, c4, c8:

  c0 = 0 0 0 0 0 0 0      c8  = 1 0 0 0 1 1 0
  c1 = 0 0 0 1 1 1 1      c9  = 1 0 0 1 0 0 1
  c2 = 0 0 1 0 0 1 1      c10 = 1 0 1 0 1 0 1
  c3 = 0 0 1 1 1 0 0      c11 = 1 0 1 1 0 1 0
  c4 = 0 1 0 0 1 0 1      c12 = 1 1 0 0 0 1 1
  c5 = 0 1 0 1 0 1 0      c13 = 1 1 0 1 1 0 0
  c6 = 0 1 1 0 1 1 0      c14 = 1 1 1 0 0 0 0
  c7 = 0 1 1 1 0 0 1      c15 = 1 1 1 1 1 1 1
8/45
A matrix G whose rows form a basis of an [n, k, d] code C is called a generator matrix G of C. Its size is k × n.
The basis c1, c2, c4, c8 of the code on the previous page results in the generator matrix:

  G =
    0 0 0 1 1 1 1
    0 0 1 0 0 1 1
    0 1 0 0 1 0 1
    1 0 0 0 1 1 0
9/45
If k is large compared to n, it is often advantageous to describe C as the null-space of an (n − k) × n matrix H, called a parity check matrix:

  C = {x ∈ {0, 1}^n | Hx^T = 0^T}.

Typically, you transmit a codeword c and you receive r, which can be written as r = c ⊕ e, where e is called the error vector and is caused by the noise.
The decoder cannot do better than look for the closest codeword to r, i.e. look for an e of lowest weight such that r − e ∈ C.
Note that s^T := Hr^T = Hc^T ⊕ He^T = He^T. This value is called the syndrome of the received word. It depends only on the error vector.
10/45
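The syndrome computation can be sketched directly (a minimal illustration with values of our choosing; `syndrome` is our own helper name):

```python
# Sketch: the syndrome s^T = H r^T over GF(2) depends only on the error.

def syndrome(H, r):
    """Compute H r^T over GF(2) as a tuple of bits."""
    return tuple(sum(h * x for h, x in zip(row, r)) % 2 for row in H)

H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

c = (1, 0, 0, 0, 0, 1, 1)    # a codeword: its syndrome is (0, 0, 0)
e = (0, 0, 0, 0, 1, 0, 0)    # an error in coordinate 5
r = tuple((a + b) % 2 for a, b in zip(c, e))

assert syndrome(H, c) == (0, 0, 0)
assert syndrome(H, r) == syndrome(H, e)   # the syndrome only sees the error
```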
Example: The matrix

  H =
    0 0 0 1 1 1 1
    0 1 1 0 0 1 1
    1 0 1 0 1 0 1

is the parity check matrix of a linear code C = {x ∈ {0, 1}^n | Hx^T = 0^T} of length 7 and dimension 4.
Moreover, this code can correct a single error (d = 3, t = 1). We give a decoding algorithm.
Let r be a received word.
• Compute its syndrome s, i.e. compute s^T = Hr^T.
• If s^T = 0^T then r ∈ C, so (most likely) no error occurred.
• If s^T ≠ 0^T then s^T equals a column of H, say the i-th; most likely a single error occurred at coordinate i, so flip the i-th coordinate of r.
11/45
Example continued: Suppose you receive

  r = 1 0 0 0 1 1 1

Its syndrome with

  H =
    0 0 0 1 1 1 1
    0 1 1 0 0 1 1
    1 0 1 0 1 0 1

is (1, 0, 1)^T, which is the 5-th column of H. Note that

  e = 0 0 0 0 1 0 0

gives the same syndrome, so H(r^T − e^T) = 0^T.
So, the most likely transmitted codeword is r − e, i.e.

  c = 1 0 0 0 0 1 1
12/45
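The whole single-error decoder fits in a few lines, exploiting that the i-th column of this particular H is the binary representation of i (a sketch; names are our own):

```python
# Sketch of the single-error decoder above: the syndrome, read as a binary
# number, is the (1-based) index of the coordinate in error.

H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def decode(r):
    """Correct at most one error in the received 7-bit word r."""
    s = [sum(h * x for h, x in zip(row, r)) % 2 for row in H]
    pos = 4 * s[0] + 2 * s[1] + s[2]   # column i of H is i in binary
    c = list(r)
    if pos != 0:                       # nonzero syndrome: flip bit pos
        c[pos - 1] ^= 1
    return tuple(c)

r = (1, 0, 0, 0, 1, 1, 1)
print(decode(r))  # → (1, 0, 0, 0, 0, 1, 1), as on the slide
```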
II Quasi-cyclic codes; Codes generated by circulants
Consider the 15 × 15 circulant

  U =
    1 0 0 0 1 0 1 1 1 0 0 0 0 0 0
    0 1 0 0 0 1 0 1 1 1 0 0 0 0 0
    0 0 1 0 0 0 1 0 1 1 1 0 0 0 0
    0 0 0 1 0 0 0 1 0 1 1 1 0 0 0
    0 0 0 0 1 0 0 0 1 0 1 1 1 0 0
    0 0 0 0 0 1 0 0 0 1 0 1 1 1 0
    0 0 0 0 0 0 1 0 0 0 1 0 1 1 1
    1 0 0 0 0 0 0 1 0 0 0 1 0 1 1
    1 1 0 0 0 0 0 0 1 0 0 0 1 0 1
    1 1 1 0 0 0 0 0 0 1 0 0 0 1 0
    0 1 1 1 0 0 0 0 0 0 1 0 0 0 1
    1 0 1 1 1 0 0 0 0 0 0 1 0 0 0
    0 1 0 1 1 1 0 0 0 0 0 0 1 0 0
    0 0 1 0 1 1 1 0 0 0 0 0 0 1 0
    0 0 0 1 0 1 1 1 0 0 0 0 0 0 1
13/45
Note that in U row i (counting from 0) is ui, the i-th cyclic shift of the top row u0:

  1 0 0 0 1 0 1 1 1 0 0 0 0 0 0   u0
  0 1 0 0 0 1 0 1 1 1 0 0 0 0 0   u1
  ···
  0 0 0 0 0 0 1 0 0 0 1 0 1 1 1   u6
  1 0 0 0 0 0 0 1 0 0 0 1 0 1 1   u7
  ···
14/45
In polynomial notation, row i of U corresponds to x^i·u(x) (mod x^15 − 1), where u(x) = 1 + x^4 + x^6 + x^7 + x^8:

  1 0 0 0 1 0 1 1 1 0 0 0 0 0 0   u(x)
  0 1 0 0 0 1 0 1 1 1 0 0 0 0 0   x·u(x)
  ···
  0 0 0 0 0 0 1 0 0 0 1 0 1 1 1   x^6·u(x)
  1 0 0 0 0 0 0 1 0 0 0 1 0 1 1   x^7·u(x)
  ···
15/45
The reason that U generates a [15, 7] code (2nd proof) is that:
1. u(x) has degree 8, so the first 7 rows of U are clearly linearly independent.
2. u(x) divides x^15 − 1.
Indeed x^15 − 1 = u(x)(1 + x^4 + x^6 + x^7), as one can easily check.
So, the code generated by U has dimension 15 − degree(u(x)) = 7.
16/45
How about the rank of a code generated by a circulant U with top row u0, corresponding to a polynomial u(x) that does not divide x^n − 1?

  U =
    1 0 0 1 1 1 0   u0
    0 1 0 0 1 1 1
    1 0 1 0 0 1 1
    1 1 0 1 0 0 1
    1 1 1 0 1 0 0
    0 1 1 1 0 1 0
    0 0 1 1 1 0 1
17/45
Let U be the circulant whose top row corresponds to u(x), where u(x) does not divide x^n − 1.
Define g(x) = gcd(u(x), x^n − 1) and use the extended version of Euclid's Algorithm to write:

  g(x) = a(x)u(x) + b(x)(x^n − 1).

Then, writing a(x) = Σ_{i=0}^{n−1} a_i x^i,

  g(x) ≡ ( Σ_{i=0}^{n−1} a_i x^i ) u(x) ≡ Σ_{i=0}^{n−1} a_i (x^i u(x))   (mod x^n − 1).

So, g = Σ_{i=0}^{n−1} a_i u_i.
18/45
The vector g (and each of its cyclic shifts) is a linear combination of the rows of U.
Since g(x) = gcd(u(x), x^n − 1) divides u(x), we also know that u0 (and each of its shifts) is a linear combination of cyclic shifts of g.
We conclude that G, the circulant with g as top row, generates the same code as U does.
But now g(x) divides x^n − 1, so the code generated by U has dimension n − degree(g(x)).
19/45
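The dimension formula n − degree(gcd(u(x), x^n − 1)) can be checked mechanically; here is a sketch with GF(2) polynomials stored as Python integer bit masks (bit i = coefficient of x^i; all helper names are ours):

```python
# Sketch: dimension of the code generated by a circulant with top row u,
# via g(x) = gcd(u(x), x^n - 1) over GF(2).

def deg(p):
    """Degree of the GF(2) polynomial p (deg 0 = -1 by convention)."""
    return p.bit_length() - 1

def mod(a, b):
    """Remainder of a divided by b over GF(2)."""
    while a and deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a

def gcd(a, b):
    while b:
        a, b = b, mod(a, b)
    return a

def circulant_rank(u, n):
    g = gcd(u, (1 << n) | 1)    # x^n - 1 = x^n + 1 over GF(2)
    return n - deg(g)

# u(x) = 1 + x^4 + x^6 + x^7 + x^8, top row of the 15 x 15 circulant above:
u = 0b111010001
print(circulant_rank(u, 15))   # → 7
```

For the 7 × 7 circulant with top row 1 0 0 1 1 1 0 (u(x) = 1 + x^3 + x^4 + x^5, which does not divide x^7 − 1), the same routine gives the dimension directly.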
How about a code that is the linear span of two (or more) circulants underneath each other?

  U   (circulant with top row u(x))
  V   (circulant with top row v(x))
  ···

Codewords are linear combinations of rows of U and V.
Some things are still easy here:
1. Compute g(x) = gcd(u1(x), u2(x), · · · , um(x), x^n − 1).
2. The code has dimension n − degree(g(x)).
21/45
How about a code that is the linear span of two (or more) rows of circulants next to each other, so-called quasi-cyclic codes?

  u1,1(x)   u1,2(x)   · · ·   u1,m(x)
  u2,1(x)   u2,2(x)   · · ·   u2,m(x)
    ...       ...      ...      ...
  ul,1(x)   ul,2(x)   · · ·   ul,m(x)
Things are difficult here. Little to nothing can be said about rank, minimum
distance, let alone decoding.
See Ph.D. thesis: Kristine Lally, Application of the theory of Gröbner bases to
the study of quasi-cyclic codes, National University of Ireland, Cork, 6-15-2000,
especially for the case of a single row of circulants.
22/45
III Cyclic codes
The codes generated by a column of circulants are commonly called cyclic
codes.
Stacking circulants U, V, . . . on top of each other gives the same code as the single circulant G with top row g(x) := gcd(u(x), v(x), · · · , x^n − 1).
Only the top n − degree(g(x)) rows of G are needed. The remaining rows are commonly left out.
The real question is how to select a divisor g(x) of xn − 1 such that the code
generated by it has good properties:
1. large minimum distance
2. easy error-correction.
23/45
Consider the irreducible polynomial f(x) = 1 + x + x^4 and let α be a zero of f(x) in some extension field of GF(2) = {0, 1}. So 1 + α + α^4 = 0.
Then α can be assumed to be in GF(2^4), with as elements all binary polynomials in α of degree less than 4:

  GF(2^4) = { Σ_{i=0}^{3} a_i α^i  |  a_i ∈ {0, 1}, 0 ≤ i ≤ 3 }.

For example:

  (1 + α^2)(1 + α^3) = 1 + α^2 + α^3 + α^5
                     = 1 + α^2 + α^3 + α^5 + α(1 + α + α^4)
                     = 1 + α + α^3
24/45
But α has the additional property of being primitive:
α generates GF(2^4) \ {0} (remember that α^4 = 1 + α).

         1 α α^2 α^3             1 α α^2 α^3
  1      1 0  0   0       α^8    1 0  1   0
  α      0 1  0   0       α^9    0 1  0   1
  α^2    0 0  1   0       α^10   1 1  1   0
  α^3    0 0  0   1       α^11   0 1  1   1
  α^4    1 1  0   0       α^12   1 1  1   1
  α^5    0 1  1   0       α^13   1 0  1   1
  α^6    0 0  1   1       α^14   1 0  0   1
  α^7    1 1  0   1       α^15   1 0  0   0

Note that indeed α^15 = 1. Thus α and each of its powers is a zero of x^15 − 1. Hence

  x^15 − 1 = ∏_{i=0}^{14} (x − α^i)
25/45
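The table of powers can be regenerated from the single rule α^4 = 1 + α; a sketch (the bit-mask representation is our own choice: bit i = coefficient of α^i):

```python
# Sketch: powers of a primitive element alpha of GF(2^4) with
# alpha^4 = 1 + alpha, printed as coordinate vectors (1, a, a^2, a^3).

def gf16_powers():
    table = []
    x = 1                       # alpha^0 = 1
    for _ in range(15):
        table.append(x)
        x <<= 1                 # multiply by alpha
        if x & 0b10000:         # reduce with alpha^4 = 1 + alpha
            x ^= 0b10011
    return table

powers = gf16_powers()
for i, p in enumerate(powers):
    coords = [(p >> j) & 1 for j in range(4)]
    print(f"alpha^{i:2d} = {coords}")

# alpha is primitive: all 15 nonzero elements of GF(2^4) occur.
assert len(set(powers)) == 15
```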
In general: when gcd(2, n) = 1 there exists an α in some extension field of GF(2) = {0, 1} such that x^n − 1 can be written as

  x^n − 1 = ∏_{i=0}^{n−1} (x − α^i).
26/45
Now consider the parity check matrix

  H =
    1  α    α^2      α^3      α^4      α^5      · · ·  α^14
    1  α^3  α^{3×2}  α^{3×3}  α^{3×4}  α^{3×5}  · · ·  α^{3×14}

which really stands for

  H =
    1 0 0 0 1 0 0 1 1 0 1 0 1 1 1
    0 1 0 0 1 1 0 1 0 1 1 1 1 0 0
    0 0 1 0 0 1 1 0 1 0 1 1 1 1 0
    0 0 0 1 0 0 1 1 0 1 0 1 1 1 1
    1 0 0 0 1 1 0 0 0 1 1 0 0 0 1
    0 0 0 1 1 0 0 0 1 1 0 0 0 1 1
    0 0 1 0 1 0 0 1 0 1 0 0 1 0 1
    0 1 1 1 1 0 1 1 1 1 0 1 1 1 1

So, we consider the binary [15, 7, ?] code C defined by

  C = { c ∈ {0, 1}^15 | Hc^T = 0^T }
27/45
With H as above, let c(x) correspond to c = (c0, c1, . . . , c14), i.e. c(x) = Σ_{i=0}^{14} c_i x^i. Then

  c ∈ C   ⇔   Hc^T = 0^T   ⇔   c(α) = c(α^3) = 0.
We shall now show that the minimum distance of this code is 5 and that
there exists an easy decoding algorithm to correct up to 2 errors.
Suppose that r(x) (corresponding to vector r) is received, while codeword
c(x) was transmitted. Write r(x) = c(x) + e(x), where e(x) stands for the
error vector e = (e0 , e1 , . . . , e14 ).
As always for decoding, we compute the syndrome:

  s1 = r(α) = c(α) + e(α) = e(α)
  s3 = r(α^3) = c(α^3) + e(α^3) = e(α^3)
28/45
  s1 = r(α) = c(α) + e(α) = e(α)
  s3 = r(α^3) = c(α^3) + e(α^3) = e(α^3)

We can distinguish three possibilities:
• No error: e(x) = 0, s1 = s3 = 0.
• A single error at coordinate i: e(x) = x^i, s1 = α^i, s3 = α^{3i}.
• Two errors, one at coordinate i and the other at coordinate j: e(x) = x^i + x^j, s1 = α^i + α^j, s3 = α^{3i} + α^{3j}.
29/45
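One can verify by brute force that the syndrome pair (s1, s3) is different for every error pattern of weight ≤ 2, which is exactly what 2-error-correction requires (a sketch reusing the GF(2^4) representation from before; names are ours):

```python
# Sketch: check that (s1, s3) = (e(alpha), e(alpha^3)) distinguishes all
# error patterns of weight <= 2 on 15 coordinates.

from itertools import combinations

def gf16_powers():
    table, x = [], 1
    for _ in range(15):
        table.append(x)
        x <<= 1
        if x & 0b10000:
            x ^= 0b10011        # reduce with alpha^4 = 1 + alpha
    return table

P = gf16_powers()               # P[i] = alpha^i as a 4-bit mask

# all error patterns of weight 0, 1 or 2
patterns = [frozenset()]
patterns += [frozenset([i]) for i in range(15)]
patterns += [frozenset(p) for p in combinations(range(15), 2)]

def syndromes(errs):
    s1 = s3 = 0
    for i in errs:
        s1 ^= P[i]              # e(alpha)   = sum of alpha^i
        s3 ^= P[(3 * i) % 15]   # e(alpha^3) = sum of alpha^(3i)
    return (s1, s3)

table = {syndromes(e) for e in patterns}
print(len(patterns), len(table))  # → 121 121: all syndromes distinct
```

Distinctness of these 121 syndromes is equivalent to the code having no nonzero codeword of weight ≤ 4, i.e. to d ≥ 5.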
The technique on the previous sheets can easily be generalized to construct codes that correct more errors and allow efficient decoding methods.
So,

  H =
    1  α    α^2      α^3      α^4      α^5      · · ·  α^{n−1}
    1  α^3  α^{3×2}  α^{3×3}  α^{3×4}  α^{3×5}  · · ·  α^{3×(n−1)}
    1  α^5  α^{5×2}  α^{5×3}  α^{5×4}  α^{5×5}  · · ·  α^{5×(n−1)}

defines a 3-error-correcting code, etc.
The family of BCH codes does this. Also the Reed-Solomon codes that are used on CDs and DVDs are related to this construction.
Peterson's decoding algorithm does the decoding in t × n operations, where n is the length of the code and t the number of errors that can be corrected.
30/45
IV The McEliece cryptosystem
History: Berlekamp, McEliece, and van Tilborg proved in 1978 that the general decoding problem is NP-complete.
31/45
NP: a decision problem whose yes-answers can be verified in polynomial time (but for which no known algorithm produces the answer in polynomial time).
Complete: any other NP problem can be converted to this one (in polynomial time).
Famous other NP-complete problems are: the Boolean satisfiability problem
and the traveling salesman problem.
The relevance of being NP-complete to cryptography is limited, as the story
of the knapsack based cryptosystems teaches us.
Elwyn Berlekamp, Bob McEliece, and Henk van Tilborg, On the inherent intractability of certain coding problems, IEEE Trans. Inf. Theory IT-24, 1978, pp. 384–386.
Michael R. Garey and David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.
32/45
The Coset Weights Problem is about arbitrary (parity check) matrices, not the
well structured parity check matrices that allow easy decoding, like
0 0 0 1 1 1 1
H = 0 1 1 0 0 1 1
1 0 1 0 1 0 1
and
  H =
    1  α    α^2      · · ·  α^{n−1}
    1  α^3  α^{3×2}  · · ·  α^{3×(n−1)}
33/45
Instead think of a large, random-looking binary parity check matrix H together with a given syndrome s: nothing in the structure of H helps in finding a lowest-weight error vector e with He^T = s^T.
[The slide shows a random 10 × 23 binary matrix H next to a syndrome vector s.]
34/45
McEliece based his cryptosystem on this:
• Decoding linear codes is, in general, very hard.
• But linear codes with a nice structure are easy to decode.
He needed a trapdoor to hide the nice structure.
35/45
Set up
• Select a generator matrix G of an [n, k, 2t + 1] linear code C with an
efficient decoding algorithm DecG .
• Select a random k × k invertible matrix S and a random n × n permutation matrix P. Compute Ĝ = SGP.
• Make Ĝ and t public, but keep S, P, and G secret.
Encryption
• Message m ∈ {0, 1}k will be encrypted into r = mĜ + e, where e is a
random vector of weight t.
Decryption
• Compute rP^{−1} = (mĜ + e)P^{−1} = mSGP P^{−1} + eP^{−1} = (mS)G + e′.
• Apply DecG to this vector to find mS (note that e′ = eP^{−1} also has weight t).
• Retrieve m from (mS)S^{−1}.
36/45
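The whole scheme can be played through at toy scale, with the [7, 4, 3] code from Part I standing in for a Goppa code. Everything below — names, parameters, the seed — is our own illustration, far too small to be secure:

```python
# Toy-scale sketch of McEliece with the [7, 4, 3] code (t = 1).
import random

random.seed(2017)  # reproducibility of this sketch only

G = [  # generator matrix (rows c1, c2, c4, c8 from Part I)
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 1, 0],
]
H = [  # its parity check matrix
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def vec_mat(v, A):
    """Row vector times matrix over GF(2)."""
    return [sum(x * a for x, a in zip(v, col)) % 2 for col in zip(*A)]

def mat_mul(A, B):
    return [vec_mat(row, B) for row in A]

def gf2_inverse(A):
    """Invert A over GF(2) by Gauss-Jordan; None if A is singular."""
    k = len(A)
    M = [row[:] + [int(i == j) for j in range(k)] for i, row in enumerate(A)]
    for col in range(k):
        piv = next((r for r in range(col, k) if M[r][col]), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(k):
            if r != col and M[r][col]:
                M[r] = [(x + y) % 2 for x, y in zip(M[r], M[col])]
    return [row[k:] for row in M]

def decode_hamming(r):
    """DecG: the syndrome, read as a number, is the error position."""
    s = [sum(h * x for h, x in zip(row, r)) % 2 for row in H]
    pos = 4 * s[0] + 2 * s[1] + s[2]
    c = list(r)
    if pos:
        c[pos - 1] ^= 1
    return c

# --- key generation: secret invertible S and permutation P ---
while True:
    S = [[random.randint(0, 1) for _ in range(4)] for _ in range(4)]
    S_inv = gf2_inverse(S)
    if S_inv is not None:
        break
perm = random.sample(range(7), 7)                # column permutation = P
SG = mat_mul(S, G)
G_pub = [[row[perm[j]] for j in range(7)] for row in SG]   # public SGP

# --- encryption: r = m G_pub + e, with e of weight t = 1 ---
m = [1, 0, 1, 1]
e = [0] * 7
e[random.randrange(7)] = 1
r = [(x + y) % 2 for x, y in zip(vec_mat(m, G_pub), e)]

# --- decryption ---
u = [0] * 7
for j in range(7):
    u[perm[j]] = r[j]            # u = r P^{-1} = (mS)G + e'
c = decode_hamming(u)            # strip the weight-1 error e'
x = [c[3], c[2], c[1], c[0]]     # read off mS (see the columns of G)
m_rec = vec_mat(x, S_inv)
assert m_rec == m
```

Reading mS off the corrected codeword works here because the first four columns of this G form a reversed identity; a real implementation would use a systematic generator matrix instead.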
Of course, the adversary should not be able to “guess” the code C that was
used (or the S or P ).
There are too few BCH codes and Reed-Solomon codes for given parameters.
That is why McEliece chose the large class of Goppa codes. Their number grows exponentially in the length of the code.
In his original proposal (1978): n = 1024, t = 50, and k ≈ 524.
Since 2008 these parameters are no longer safe.
Daniel J. Bernstein, Tanja Lange, and Christiane Peters, Attacking and Defending the McEliece Cryptosystem, in: Johannes Buchmann and Jintai Ding (eds.), PQCrypto 2008, LNCS 5299, Springer-Verlag, Berlin Heidelberg, 2008, pp. 31–46.
37/45
V Burst-correcting array codes
Definition: An (n1 , n2 )-array code C consists of all n1 × n2 {0, 1}-arrays C
whose row and column sums are all congruent to zero modulo 2.
[Figure: an n1 × n2 array of bits; each of the n1 rows and each of the n2 columns has even parity.]

It follows directly from this definition that an (n1, n2)-array code C is a linear code with length n1 × n2 and dimension (n1 − 1)(n2 − 1).
38/45
Example: n1 = 5, n2 = 8.
0 1 0 1 1 1 0 0
1 1 1 1 0 1 1 0
1 0 1 0 0 0 1 1 is a “codeword”.
0 0 0 1 0 1 1 1
0 0 0 1 1 1 1 0
39/45
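Membership in an array code is just a parity check per row and per column; a sketch using the codeword above (`is_codeword` is our own helper name):

```python
# Sketch: checking membership in the (n1, n2)-array code.

def is_codeword(A):
    """True iff every row sum and column sum of the 0/1 array A is even."""
    rows_ok = all(sum(row) % 2 == 0 for row in A)
    cols_ok = all(sum(col) % 2 == 0 for col in zip(*A))
    return rows_ok and cols_ok

A = [
    [0, 1, 0, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]
print(is_codeword(A))  # → True
```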
Let R be a received word. The horizontal syndromes h1, . . . , hn1 and the vertical syndromes v1, . . . , vn2 of R are defined by the row sums resp. the column sums (modulo 2).
Decoding a single error in this code is extremely simple.
40/45
Example continued:
Look at the received word, with its horizontal syndromes to the right and its vertical syndromes below:

  1 1 0 0 0 1 0 1 | 0
  0 1 0 0 1 0 1 0 | 1
  1 0 1 0 1 1 0 0 | 0
  0 0 1 1 0 1 1 0 | 0
  0 0 1 1 0 1 0 1 | 0
  ---------------
  0 0 1 0 0 0 0 0

It is clear where the error occurred: at the intersection of the odd row (row 2) and the odd column (column 3).
So, decoding a single error is easy (but not very impressive).
The actual minimum distance of this code is 4.
41/45
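The decoding rule just described — flip the bit where the odd row meets the odd column — in sketch form, applied to the received word above (names are ours):

```python
# Sketch: single-error decoding of an array code via row/column parities.

def correct_single_error(R):
    """Flip the entry where the (unique) odd row meets the odd column."""
    h = [sum(row) % 2 for row in R]           # horizontal syndromes
    v = [sum(col) % 2 for col in zip(*R)]     # vertical syndromes
    if any(h):
        i, j = h.index(1), v.index(1)
        R[i][j] ^= 1
    return R

R = [
    [1, 1, 0, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 1, 0, 1],
]
correct_single_error(R)
# the corrected entry is in row 2, column 3 (1-based)
assert all(sum(row) % 2 == 0 for row in R)
assert all(sum(col) % 2 == 0 for col in zip(*R))
```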
For burst-correction the particular read-out of the array is important.
We follow diagonals, one after another.
Example: n1 = 5, n2 = 6, so n = 30.
0 5 10 15 20 25
26 1 6 11 16 21
22 27 2 7 12 17
18 23 28 3 8 13
14 19 24 29 4 9
42/45
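One way to generate the diagonal read-out order (a sketch of our own; it reproduces the 5 × 6 example above — diagonal d starts in column d of row 0 and steps down-right with wrap-around):

```python
# Sketch: position k of the read-out sits on diagonal k // n1, at step
# k % n1 along it, i.e. at row k % n1, column (k // n1 + k % n1) % n2.

def readout_order(n1, n2):
    """Array whose entry (r, c) is that cell's position in the read-out."""
    A = [[0] * n2 for _ in range(n1)]
    for k in range(n1 * n2):
        d, i = divmod(k, n1)        # diagonal number, step along it
        A[i][(d + i) % n2] = k
    return A

for row in readout_order(5, 6):
    print(row)
# first row: [0, 5, 10, 15, 20, 25]
```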
It is not so difficult to see that C cannot correct all bursts of length up to n1 .
Indeed, in our example, the two bursts of length 5 indicated below (and
many more) have the same syndrome.
0 5 10 15 20 25 0 5 10 15 20 25
26 1 6 11 16 21 26 1 6 11 16 21
22 27 2 7 12 17 and 22 27 2 7 12 17
18 23 28 3 8 13 18 23 28 3 8 13
14 19 24 29 4 9 14 19 24 29 4 9
Both have burst-pattern (1, 0, 0, 0, 1); on the original slide the positions of the ones are indicated in color.
43/45
Let us now see when C can correct all bursts of length ≤ n1 − 1.
With a little bit of work one can check that for n2 < 2n1 − 3 there are always
two different weight-two bursts of length ≤ n1 − 1 with the same syndrome.
For instance, two such bursts (depicted in red resp. blue on the original slide) have the same syndrome: horizontal syndromes (1, 0, 0, 1, 0) and vertical syndromes (1, 0, 0, 1, 0, 0).
[Figure: the 5 × 6 read-out array with the two bursts marked, the horizontal syndromes to the right and the vertical syndromes below.]
44/45
Theorem: Let C be the n1 × n2 array code, n2 ≥ n1 , with +1-diagonal read-
out as defined above. Then C can correct all bursts of length ≤ n1 − 1 if and
only if n2 ≥ 2n1 − 3.
Proof by example: n1 = 11, n2 = 19.
Mario Blaum, Paddy Farrell, and Henk van Tilborg, A class of burst error-correcting array codes, IEEE Trans. Information Theory IT-32, 1986, pp. 836–839.
45/45