
**Coding Theory & Cryptography**

John C. Bowman

Lecture Notes

University of Alberta

Edmonton, Canada

April 19, 2006

© 2002–5

John C. Bowman

ALL RIGHTS RESERVED

Reproduction of these lecture notes in any form, in whole or in part, is permitted only for nonprofit, educational use.

Contents

Preface

1 Introduction
  1.A Error Detection and Correction
  1.B Balanced Block Designs
  1.C The ISBN Code

2 Linear Codes
  2.A Encoding and Decoding
  2.B Syndrome Decoding

3 Hamming Codes

4 Golay Codes

5 Cyclic Codes

6 BCH Codes

7 Cryptographic Codes
  7.A Symmetric-Key Cryptography
  7.B Public-Key Cryptography
    7.B.1 RSA Cryptosystem
    7.B.2 Rabin Public-Key Cryptosystem
  7.C Discrete Logarithm Schemes
    7.C.1 Diffie–Hellman Key Exchange
    7.C.2 Okamoto Authentication Scheme
    7.C.3 Digital Signature Standard
    7.C.4 Silver–Pohlig–Hellman Discrete Logarithm Algorithm
  7.D Cryptographic Error-Correcting Codes

A Finite Fields


Preface

These lecture notes are designed for a one-semester course on error-correcting codes and cryptography at the University of Alberta. I would like to thank my colleagues, Professors Hans Brungs, Gerald Cliff, and Ted Lewis, for their written notes and examples, on which these notes are partially based (in addition to the references listed in the bibliography). The figures in this text were drawn with the high-level vector graphics language Asymptote (freely available at http://asymptote.sourceforge.net).


Chapter 1

Introduction

In the modern era, digital information has become a valuable commodity. For example, the news media, governments, corporations, and universities all exchange enormous quantities of digitized information every day. However, the transmission lines that we use for sending and receiving data and the magnetic media (and even semiconductor memory devices) that we use to store data are imperfect.

Since transmission lines and storage devices are not 100% reliable, it has become necessary to develop ways of detecting when an error has occurred and, ideally, correcting it. The theory of error-correcting codes originated with Claude Shannon's famous 1948 paper "A Mathematical Theory of Communication" and has grown to connect to many areas of mathematics, including algebra and combinatorics. The cleverness of the error-correcting schemes that have been developed since 1948 is responsible for the great reliability that we now enjoy in our modern communications networks, computer systems, and even compact disk players.

Suppose you want to send the message "Yes" (denoted by 1) or "No" (denoted by 0) through a noisy communication channel. We assume that there is a uniform probability p < 1 that any particular binary digit (often called a bit) could be altered, independent of whether or not any other bits are transmitted correctly. This kind of transmission line is called a binary symmetric channel. (In a q-ary symmetric channel, the digits can take on any of q different values and the errors in each digit occur independently and manifest themselves as the q − 1 other possible values with equal probability.)

If a single bit is sent, a binary channel will be reliable only a fraction 1 − p of the time. The simplest way of increasing the reliability of such transmissions is to send the message twice. This relies on the fact that if p is small, the probability p² of two errors occurring is very small. The probability of no errors occurring is (1 − p)². The probability of one error occurring is 2p(1 − p), since there are two possible ways this could happen. While reception of the original message is more likely than any other particular result if p < 1/2, we need p < 1 − 1/√2 ≈ 0.29 to be sure that the correct message is received most of the time.



If the message 11 or 00 is received, either 0 or 2 errors have occurred. Thus, we would expect with conditional probability

(1 − p)² / [(1 − p)² + p²]

that the sent message was "Yes" or "No", respectively. If the message 01 or 10 is received, we know for sure that an error has occurred, but we have no way of knowing, or even reliably guessing, what message was sent (it could with equal probability have been the message 00 or 11). Of course, we could simply ask the sender to retransmit the message; however, this would now require a total of 4 bits of information to be sent. If errors tend to occur frequently, it would make more sense to send three, instead of two, copies of the original data in a single message. That is, we should send "111" for "Yes" or "000" for "No". Then, if only one bit-flip occurs, we can always guess, with good reliability, what the original message was. For example, suppose "111" is sent. Then of the eight possible received results, the patterns "111", "011", "101", and "110" would be correctly decoded as "Yes". The probability of the first pattern occurring is (1 − p)³ and the probability for each of the next three possibilities is p(1 − p)². Hence the probability that the message is correctly decoded is

(1 − p)³ + 3p(1 − p)² = (1 − p)²(1 + 2p) = 1 − 3p² + 2p³.

In other words, if p is small, the probability of a decoding error, 3p² − 2p³, is very small. This kind of data encoding is known as a repetition code. For example, suppose that p = 0.001, so that on average one bit in every thousand is garbled. Triple-repetition decoding ensures that only about one bit in every 330 000 is garbled.
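The decoding-error probability of the triple-repetition code can be checked numerically; the following is a minimal sketch (the function name is ours, not from the notes):

```python
# Decoding-error probability of the triple-repetition code over a binary
# symmetric channel with bit-error probability p: majority decoding fails
# exactly when two or three of the three transmitted bits are flipped.

def repetition_error_probability(p: float) -> float:
    """P(two errors) + P(three errors) = 3p^2(1-p) + p^3 = 3p^2 - 2p^3."""
    return 3 * p**2 * (1 - p) + p**3

p = 0.001
p_err = repetition_error_probability(p)
print(round(1 / p_err))  # 333556: roughly one garbled bit in every 330 000
```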

1.A Error Detection and Correction

Despite the inherent simplicity of repetition coding, sending the entire message like this in triplicate is not an efficient means of error correction. Our goal is to find optimal encoding and decoding schemes for reliable error correction of data sent through noisy transmission channels.

The sequences "000" and "111" in the previous example are known as binary codewords. Together they comprise a binary code. More generally, we introduce the following definitions.

Definition: Let q ∈ N. A q-ary codeword is a finite sequence of symbols, where each symbol is chosen from the alphabet (set) F_q = {λ_1, λ_2, . . . , λ_q}. Typically, we will take F_q to be the set Z_q ≐ {0, 1, 2, . . . , q − 1}. (We use the symbol ≐ to emphasize a definition, although the notation := is more common.) The codeword itself can be treated as a vector in the space F_q^n = F_q × F_q × · · · × F_q (n times).


• A binary codeword, corresponding to the case q = 2, is just a finite sequence of 0s and 1s.

• A ternary codeword, corresponding to the case q = 3, is just a finite sequence of 0s, 1s, and 2s.

Definition: A q-ary code is a set of M codewords, where M ∈ N is known as the size of the code.

• The set of all words in the English language is a code over the 26-letter alphabet {A, B, . . . , Z}.

One important aspect of all error-correcting schemes is that the extra information that accomplishes this must itself be transmitted and is hence subject to the same kinds of errors as is the data. So there is no way to guarantee accuracy; one simply attempts to make the probability of accurate decoding as high as possible.

A good code is one in which the codewords have little resemblance to each other. If the codewords are sufficiently different, we will soon see that it is possible not only to detect errors but even to correct them, using nearest-neighbour decoding, where one maps the received vector back to the closest nearby codeword.

• The set of all 10-digit telephone numbers in the United Kingdom is a 10-ary code of length 10. It is possible to use a code of over 82 million 10-digit telephone numbers (enough to meet the needs of the U.K.) such that if just one digit of any phone number is misdialed, the correct connection can still be made. Unfortunately, little thought was given to this, and as a result, frequently misdialed numbers do occur in the U.K. (as well as in North America)!

Definition: We define the Hamming distance d(x, y) between two codewords x and y of F_q^n as the number of places in which they differ.

Remark: Notice that d(x, y) is a metric on F_q^n since it is always non-negative and satisfies

1. d(x, y) = 0 ⇐⇒ x = y,

2. d(x, y) = d(y, x) for all x, y ∈ F_q^n,

3. d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ F_q^n.

The first two properties are immediate consequences of the definition, while the third property, known as the triangle inequality, follows from the simple observation that d(x, y) is the minimum number of digit changes required to change x to y, whereas if we were to change x to y by first changing x to z and then changing z to y, we would require d(x, z) + d(z, y) changes. Thus d(x, y) ≤ d(x, z) + d(z, y). This important inequality is illustrated in Fig. 1.1.


Figure 1.1: Triangle inequality.

Remark: We can use property 2 to rewrite the triangle inequality as

d(x, y) − d(y, z) ≤ d(x, z) ∀ x, y, z ∈ F_q^n.

Definition: The weight w(x) of a q-ary codeword x is the number of nonzero digits in x.

Remark: Let x and y be binary codewords in Z_2^n. Then d(x, y) = w(x − y) = w(x) + w(y) − 2w(xy). Here, x − y and xy are computed mod 2, digit by digit.

Remark: Let x and y be codewords in Z_q^n. Then d(x, y) = w(x − y). Here, x − y is computed mod q, digit by digit.
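These definitions and the binary identity d(x, y) = w(x − y) = w(x) + w(y) − 2w(xy) are easy to spot-check numerically; the following sketch uses our own helper names:

```python
# Hamming distance and weight, plus a check of the binary identity
# d(x, y) = w(x - y) = w(x) + w(y) - 2 w(xy), with the subtraction and
# product taken digit by digit, mod 2.

def d(x, y):
    """Hamming distance: number of places where x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def w(x):
    """Weight: number of nonzero digits of x."""
    return sum(a != 0 for a in x)

x = (0, 1, 1, 0, 1)
y = (1, 0, 1, 1, 0)
diff = tuple((a - b) % 2 for a, b in zip(x, y))   # x - y (mod 2)
prod = tuple(a * b for a, b in zip(x, y))          # xy (mod 2)
assert d(x, y) == w(diff) == w(x) + w(y) - 2 * w(prod)
print(d(x, y))  # 4
```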

Definition: Let C be a code in F_q^n. We define the minimum distance d(C) of the code:

d(C) = min{d(x, y) : x, y ∈ C, x ≠ y}.

Remark: In view of the previous discussion, a good code is one with a relatively large minimum distance.

Definition: An (n, M, d) code is a code of length n, containing M codewords and having minimum distance d.

• For example, here is a (5, 4, 3) code, consisting of four codewords from Z_2^5, which are at least a distance 3 from each other:

C_3 =
  0 0 0 0 0
  0 1 1 0 1
  1 0 1 1 0
  1 1 0 1 1
.

Upon considering each of the C(4, 2) = (4 × 3)/2 = 6 pairs of distinct codewords (rows), we see that the minimum distance of C_3 is indeed 3. With this code, we can either (i) detect up to two errors (since the members of each pair of distinct codewords are more than a distance 2 apart), or (ii) detect and correct a single error (since, if only a single error has occurred, the received vector will still be closer to the transmitted codeword than to any other).
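The minimum-distance claim for C_3 can be verified directly by checking all six pairs:

```python
# Verify that the four codewords of C_3 form a (5, 4, 3) code by taking
# the minimum Hamming distance over all 6 pairs of distinct codewords.
from itertools import combinations

C3 = ["00000", "01101", "10110", "11011"]

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

d_C3 = min(hamming(x, y) for x, y in combinations(C3, 2))
print(d_C3)  # 3
```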

The following theorem (cf. Fig. 1.2) shows how this works in general.

Theorem 1.1 (Error Detection and Correction): In a symmetric channel with error-probability p > 0,

(i) a code C can detect up to t errors in every codeword ⇐⇒ d(C) ≥ t + 1;

(ii) a code C can correct up to t errors in any codeword ⇐⇒ d(C) ≥ 2t + 1.

Figure 1.2: Detection of up to t errors in a transmitted codeword x requires that all other codewords y lie outside a sphere S of radius t centered on x. Correction of up to t errors requires that no sphere of radius t centered about any other codeword y overlaps with S.

Proof:

(i) "⇐" Suppose d(C) ≥ t + 1. Let a codeword x be transmitted such that t or fewer errors are introduced, resulting in a new vector y ∈ F_q^n. Then d(x, y) = w(x − y) ≤ t < t + 1 ≤ d(C), so the received vector cannot be another codeword. Hence t errors can be detected.

"⇒" Suppose C can detect up to t errors. If d(C) < t + 1, then there is some pair of codewords x and y with d(x, y) ≤ t. Since it is possible to send the codeword x and receive another codeword y by the introduction of t errors, we conclude that C cannot detect t errors, contradicting our premise. Hence d(C) ≥ t + 1.

(ii) "⇐" Suppose d(C) ≥ 2t + 1. Let a codeword x be transmitted such that t or fewer errors are introduced, resulting in a new vector y ∈ F_q^n satisfying d(x, y) ≤ t. If x′ is a codeword other than x, then d(x, x′) ≥ 2t + 1 and the triangle inequality d(x, x′) ≤ d(x, y) + d(y, x′) implies that

d(y, x′) ≥ d(x, x′) − d(x, y) ≥ 2t + 1 − t = t + 1 > t ≥ d(y, x).

Hence the received vector y is closer to x than to any other codeword x′, making it possible to identify the original transmitted codeword x correctly.

"⇒" Suppose C can correct up to t errors. If d(C) < 2t + 1, there is some pair of distinct codewords x and x′ with distance d(x, x′) ≤ 2t. If d(x, x′) ≤ t, let y = x′, so that 0 = d(y, x′) < d(y, x) ≤ t. Otherwise, if t < d(x, x′) ≤ 2t, construct a vector y by changing t of the digits of x that are in disagreement with x′ to their corresponding values in x′, so that 0 < d(y, x′) ≤ d(y, x) = t. In either case, it is possible to send the codeword x and receive the vector y due to t or fewer transmission errors. But since d(y, x′) ≤ d(y, x), the received vector y cannot be unambiguously decoded as x using nearest-neighbour decoding. This contradicts our premise. Hence d(C) ≥ 2t + 1.

Corollary 1.1.1: If a code C has minimum distance d, then C can be used either (i) to detect up to d − 1 errors or (ii) to correct up to ⌊(d − 1)/2⌋ errors in any codeword. Here ⌊x⌋ represents the greatest integer less than or equal to x.

A good (n, M, d) code has small n (for rapid message transmission), large M (to maximize the amount of information transmitted), and large d (to be able to correct many errors). A primary goal of coding theory is to find codes that optimize M for fixed values of n and d.

Definition: Let A_q(n, d) be the largest value of M such that there exists a q-ary (n, M, d) code.

• Since we have already constructed a (5, 4, 3) code, we know that A_2(5, 3) ≥ 4. We will soon see that 4 is in fact the maximum possible value of M; i.e. A_2(5, 3) = 4.

To help us tabulate A_q(n, d), let us first consider the following special cases:

Theorem 1.2 (Special Cases): For any values of q and n,

(i) A_q(n, 1) = q^n;

(ii) A_q(n, n) = q.

Proof:

(i) When the minimum distance d = 1, we require only that the codewords be distinct. The largest code with this property is the whole of F_q^n, which has M = q^n codewords.

(ii) When the minimum distance d = n, we require that any two distinct codewords differ in all n positions. In particular, this means that the symbols appearing in the first position must be distinct, so there can be no more than q codewords. A q-ary repetition code of length n is an example of an (n, q, n) code, so the bound A_q(n, n) = q can actually be realized.

Remark: There must be at least two codewords for d(C) even to be defined. This means that A_q(n, d) is not defined if d > n, since d(x, y) = w(x − y) ≤ n for distinct codewords x, y ∈ F_q^n.

Lemma 1.1 (Reduction Lemma): If a q-ary (n, M, d) code exists, with d ≥ 2, there also exists an (n − 1, M, d − 1) code.

Proof: Given an (n, M, d) code, let x and y be codewords such that d(x, y) = d and choose any column where x and y differ. Delete this column from all codewords. Since d ≥ 2, the codewords that result are distinct and form an (n − 1, M, d − 1) code.

Theorem 1.3 (Even Values of d): Suppose d is even. Then a binary (n, M, d) code exists ⇐⇒ a binary (n − 1, M, d − 1) code exists.

Proof:

"⇒" This follows from Lemma 1.1.

"⇐" Suppose C is a binary (n − 1, M, d − 1) code. Let Ĉ be the code of length n obtained by extending each codeword x of C by adding a parity bit w(x) mod 2. This makes the weight w(x̂) of every codeword x̂ of Ĉ even. Then d(x, y) = w(x) + w(y) − 2w(xy) must be even for every pair of codewords x and y in Ĉ, so d(Ĉ) is even. Note that d − 1 = d(C) ≤ d(Ĉ) ≤ d. But d − 1 is odd, so in fact d(Ĉ) = d. Thus Ĉ is an (n, M, d) code.

Corollary 1.3.1 (Maximum Code Size for Even d): If d is even, then A_2(n, d) = A_2(n − 1, d − 1).

This result means that we only need to calculate A_2(n, d) for odd d. In fact, in view of Theorem 1.1, there is little advantage in considering codes with even d if the goal is error correction. In Table 1.1, we present values of A_2(n, d) for n ≤ 16 and for odd values of d ≤ 7.

As an example, we now compute the value A_2(5, 3) entered in Table 1.1, after establishing a useful simplification, beginning with the following definition.

Definition: Two q-ary codes are equivalent if one can be obtained from the other by a combination of

(A) permutation of the columns of the code;


 n   d = 3       d = 5     d = 7
 5   4           2
 6   8           2
 7   16          2         2
 8   20          4         2
 9   40          6         2
10   72          12        2
11   144         24        4
12   256         32        4
13   512         64        8
14   1024        128       16
15   2048        256       32
16   2720–3276   256–340   36–37

Table 1.1: Maximum code size A_2(n, d) for n ≤ 16 and d ≤ 7.

(B) relabelling the symbols appearing in a fixed column.

Remark: Note that the distances between codewords are unchanged by each of these operations. That is, equivalent codes have the same (n, M, d) parameters and can correct the same number of errors. Furthermore, in a q-ary symmetric channel, the error-correction performance of equivalent codes will be identical.

• The binary code

  0 1 0 1 0
  1 1 1 1 1
  0 0 1 0 0
  1 0 0 0 1

is seen to be equivalent to our previous (5, 4, 3) code C_3 by interchanging the first two columns and then relabelling 0 ↔ 1 in the first and fourth columns of the resulting matrix.

Lemma 1.2 (Zero Vector): Any code over an alphabet containing the symbol 0 is equivalent to a code containing the zero vector 0.

Proof: Given a code of length n, choose any codeword x_1 x_2 . . . x_n. For each i such that x_i ≠ 0, apply the relabelling 0 ↔ x_i to the symbols in the ith column.

• Armed with the above lemma and the concept of equivalence, it is now easy to prove that A_2(5, 3) = 4. Let C be a (5, M, 3) code with M ≥ 4. Without loss of generality, we may assume that C contains the zero vector (if necessary, by replacing C with an equivalent code). Then there can be no codewords with just one or two 1s since d = 3. Also, there can be at most one codeword with four or more 1s; otherwise there would be two codewords with at least three 1s in common positions and less than a distance 3 apart. Since M ≥ 4, there must be at least two codewords containing exactly three 1s. By rearranging columns, if necessary, we see that the code contains the codewords

  0 0 0 0 0
  1 1 1 0 0
  0 0 1 1 1

There is no way to add any more codewords containing exactly three 1s and we can also now rule out the possibility of five 1s. This means that there can be at most four codewords, that is, A_2(5, 3) ≤ 4. Since we have previously shown that A_2(5, 3) ≥ 4, we deduce that A_2(5, 3) = 4.

Remark: A fourth codeword, if present in the above code, must have exactly four 1s. The only possible position for the 0 symbol is in the middle position, so the fourth codeword must be 11011. We then see that the resulting code is equivalent to C_3 and hence A_2(5, 3) is unique, up to equivalence.

The above trial-and-error approach becomes impractical for large codes. In some of these cases, an important bound, known as the sphere-packing or Hamming bound, can be used to establish that a code is as large as possible for given values of n and d.

Lemma 1.3 (Counting): A sphere of radius t in F_q^n, with 0 ≤ t ≤ n, contains exactly

Σ_{k=0}^{t} C(n, k) (q − 1)^k

vectors, where C(n, k) denotes the binomial coefficient "n choose k".

Proof: The number of vectors that are a distance k from a fixed vector in F_q^n is C(n, k)(q − 1)^k, because there are C(n, k) choices for the k positions that differ from those of the fixed vector and there are q − 1 values that can be assigned independently to each of these k positions. Summing over the possible values of k, we obtain the desired result.

Theorem 1.4 (Sphere-Packing Bound): A q-ary (n, M, 2t + 1) code satisfies

M Σ_{k=0}^{t} C(n, k) (q − 1)^k ≤ q^n.    (1.1)

Proof: By the triangle inequality, any two spheres of radius t that are centered on distinct codewords will have no vectors in common. The total number of vectors in the M spheres of radius t centered on the M codewords is thus given by the left-hand side of the above inequality; this number can be no more than the total number q^n of vectors in F_q^n.


• For our binary (5, 4, 3) code, Eq. (1.1) gives the bound M(1 + 5) ≤ 2^5 = 32, which implies that A_2(5, 3) ≤ 5. We have already seen that A_2(5, 3) = 4. This emphasizes that just because some set of numbers {n, M, t} satisfies Eq. (1.1), there is no guarantee that such a code actually exists.
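Eq. (1.1) is easy to evaluate in code; the following sketch (the function name is ours) reproduces the bound M ≤ 5 just discussed:

```python
# The sphere-packing (Hamming) bound of Eq. (1.1):
# M <= q^n / sum_{k=0}^{t} C(n, k) (q - 1)^k.
from math import comb

def sphere_packing_bound(q, n, t):
    """Largest integer M allowed by Eq. (1.1) for a q-ary (n, M, 2t+1) code."""
    volume = sum(comb(n, k) * (q - 1) ** k for k in range(t + 1))
    return q ** n // volume

print(sphere_packing_bound(2, 5, 1))  # 5, even though A_2(5, 3) = 4
print(sphere_packing_bound(2, 7, 1))  # 16
```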

Definition: A perfect code is a code for which equality occurs in Eq. (1.1). For such a code, the M spheres of radius t centered on the codewords fill the whole space F_q^n completely, without overlapping.

Remark: The codes that consist of a single codeword (taking t = n and M = 1), codes that contain all vectors of F_q^n (with t = 0 and M = q^n), and the binary repetition code (with t = (n − 1)/2 and M = 2) of odd length n are trivially perfect codes.

Problem 1.1: Prove that

Σ_{t=0}^{n} C(n, t) (q − 1)^t = q^n.

Each term in this sum is the number of vectors of weight t in F_q^n. When we sum over all possible values of t, we obtain q^n, the total number of vectors in F_q^n.

Alternatively, we see directly from the Binomial Theorem that

Σ_{t=0}^{n} C(n, t) (q − 1)^t = (1 + (q − 1))^n = q^n.

Problem 1.2: Show that a q-ary (n, M, d) code must satisfy M ≤ q^(n−d+1). Hint: what can you say about the vectors obtained by deleting the last d − 1 positions of all codewords? It might help to first consider the special cases d = 1 and d = 2.

If you delete the last d − 1 positions of all codewords, the resulting vectors must be distinct, or else the codewords could not be a distance d apart from each other. Since the number of distinct q-ary vectors of length n − d + 1 is q^(n−d+1), the number of codewords M must be less than or equal to this number.
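The deletion argument can be illustrated with the (5, 4, 3) code C_3 from earlier in the section:

```python
# Illustrating the bound M <= q^(n-d+1) of Problem 1.2 on the (5, 4, 3)
# code C_3: deleting the last d - 1 = 2 positions must leave distinct vectors.

C3 = ["00000", "01101", "10110", "11011"]
n, M, d = 5, 4, 3

truncated = [x[: n - (d - 1)] for x in C3]   # keep the first n - d + 1 digits
assert len(set(truncated)) == M              # still distinct
assert M <= 2 ** (n - d + 1)                 # 4 <= 8
print(truncated)  # ['000', '011', '101', '110']
```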

Problem 1.3: (a) Given an (n, M, d) q-ary code C, let N_i, i = 0, 1, . . . , q − 1, be the number of codewords ending with the symbol i. Prove that there exists some i for which N_i ≥ M/q.

This follows from the pigeon-hole principle: construct q boxes, one for each possible final digit. If we try to stuff the M codewords into the q boxes, at least one box must contain at least M/q codewords.

Alternatively, one can establish this result with a proof by contradiction. Suppose that N_i < M/q for each i = 0, 1, . . . , q − 1. One would then obtain the contradiction

M = Σ_{i=0}^{q−1} N_i < Σ_{i=0}^{q−1} M/q = M.


(b) Let C′ be the code obtained by deleting the final symbol from each codeword in C. Show that C′ contains an (n − 1, M/q, d) subcode (that is, C′ contains at least M/q codewords that are still a distance d from each other).

From part (a), we know that C contains at least M/q codewords ending in the same symbol. When we delete this last symbol from these codewords, they remain a distance d apart from each other. That is, they form an (n − 1, M/q, d) subcode of C′.

(c) Conclude that

A_q(n, d) ≤ q A_q(n − 1, d).

Recall that A_q(n, d) is the largest value of M such that there exists a q-ary (n, M, d) code. Let C be an (n, A_q(n, d), d) code. We know from part (b) that we can construct an (n − 1, A_q(n, d)/q, d) subcode from C. Hence

A_q(n − 1, d) ≥ A_q(n, d)/q.

Problem 1.4: Let C be a code with even distance d = 2m. Let t be the maximum number of errors that C can be guaranteed to correct.

(a) Express t in terms of m.

To correct t errors we need d ≥ 2t + 1; that is, t ≤ (d − 1)/2. The maximum value of t is

t = ⌊(d − 1)/2⌋ = ⌊m − 1/2⌋ = m − 1.

(b) Prove that C cannot be a perfect code. That is, there is no integer M such that

M Σ_{k=0}^{t} C(n, k) (q − 1)^k = q^n.

For C to be perfect, each vector in F_q^n would have to be contained in exactly one of the M codeword spheres of radius t. However, we know that there are codewords x and y with d(x, y) = 2m = 2t + 2. Consider the vector v obtained by changing t + 1 of those digits where x disagrees with y to the corresponding digits in y. Then d(v, x) = d(v, y) = t + 1, so v does not lie within the codeword spheres about x or y. If v were within a distance t from another codeword z, the triangle inequality would imply that

d(x, z) ≤ d(x, v) + d(v, z) ≤ t + 1 + t = 2t + 1,

contradicting the fact that the code has minimum distance 2t + 2. Thus v does not lie in any codeword sphere. That is, C is not a perfect code.


1.B Balanced Block Designs

Definition: A balanced block design consists of a collection of b subsets, called blocks, of a set S of v points such that

(i) each point lies in exactly r blocks;

(ii) each block contains exactly k points;

(iii) each pair of points occurs together in exactly λ blocks.

Such a design is called a (b, v, r, k, λ) design.

• Let S = {1, 2, 3, 4, 5, 6, 7} and consider the subsets {1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {5, 6, 1}, {6, 7, 2}, {7, 1, 3} of S. Each number lies in exactly 3 blocks, each block contains 3 numbers, and each pair of numbers occurs together in exactly 1 block. The six lines and circle in Fig. 1.3 represent the blocks. Hence these subsets form a (7, 7, 3, 3, 1) design.

Figure 1.3: Seven-point plane.

Remark: The parameters (b, v, r, k, λ) are not independent. Consider the set of ordered pairs

T = {(x, B) : x is a point, B is a block, x ∈ B}.

Since each of the v points lies in r blocks, there must be a total of vr ordered pairs in T. Alternatively, we know that since there are b blocks and k points in each block, we can form exactly bk such pairs. Thus bk = vr. Similarly, by considering the set

U = {(x, y, B) : x, y are distinct points, B is a block, x, y ∈ B},

we deduce

bk(k − 1) = λv(v − 1),

which, using bk = vr, simplifies to r(k − 1) = λ(v − 1).


Definition: A block design is symmetric if v = b (and hence k = r); that is, the number of points and blocks is identical. For brevity, this is called a (v, k, λ) design.

Definition: The incidence matrix of a block design is a v × b matrix with entries

a_ij = 1 if x_i ∈ B_j,
a_ij = 0 if x_i ∉ B_j,

where x_i, i = 1, . . . , v, are the design points and B_j, j = 1, . . . , b, are the design blocks.

• For our above (7, 3, 1) symmetric design, the incidence matrix A is

  1 0 0 0 1 0 1
  1 1 0 0 0 1 0
  0 1 1 0 0 0 1
  1 0 1 1 0 0 0
  0 1 0 1 1 0 0
  0 0 1 0 1 1 0
  0 0 0 1 0 1 1
.

• We now construct a (7, 16, 3) binary code C consisting of the zero vector 0, the unit vector 1, the 7 rows of A, and the 7 rows of the matrix B obtained from A by the relabelling 0 ↔ 1:

C = {0, 1, a_1, . . . , a_7, b_1, . . . , b_7} =

  0 0 0 0 0 0 0
  1 1 1 1 1 1 1
  1 0 0 0 1 0 1
  1 1 0 0 0 1 0
  0 1 1 0 0 0 1
  1 0 1 1 0 0 0
  0 1 0 1 1 0 0
  0 0 1 0 1 1 0
  0 0 0 1 0 1 1
  0 1 1 1 0 1 0
  0 0 1 1 1 0 1
  1 0 0 1 1 1 0
  0 1 0 0 1 1 1
  1 0 1 0 0 1 1
  1 1 0 1 0 0 1
  1 1 1 0 1 0 0
.


To find the minimum distance of this code, note that each row of A has exactly three 1s (since r = 3) and any two distinct rows of A have exactly one 1 in common (since λ = 1). Hence d(a_i, a_j) = 3 + 3 − 2(1) = 4 for i ≠ j. Likewise, d(b_i, b_j) = 4. Furthermore,

d(0, a_i) = 3,  d(0, b_i) = 4,
d(1, a_i) = 4,  d(1, b_i) = 3,
d(a_i, b_i) = d(0, 1) = 7,

for i = 1, . . . , 7. Finally, a_i and b_j disagree in precisely those places where a_i and a_j agree, so

d(a_i, b_j) = w(a_i − b_j) = w(1 − (a_i − a_j)) = w(1) + w(a_i − a_j) − 2w(a_i − a_j)
            = 7 − w(a_i − a_j) = 7 − d(a_i, a_j) = 7 − 4 = 3, for i ≠ j.

Thus C is a (7, 16, 3) code, which in fact is perfect, since the equality in Eq. (1.1) is satisfied:

16 (C(7, 0) + C(7, 1)) = 16(1 + 7) = 128 = 2^7.

The existence of a perfect binary (7, 16, 3) code establishes A_2(7, 3) = 16, so we have now established another entry of Table 1.1.
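The whole construction above can be reproduced and verified in a few lines; the following sketch builds A from the block list of Fig. 1.3 and checks both d(C) = 3 and the perfect-code equality:

```python
# Build the (7, 16, 3) code from the incidence matrix A of the (7, 3, 1)
# design: the zero word, the all-ones word, the 7 rows of A, and their
# 0 <-> 1 relabellings. Then verify d(C) = 3 and equality in Eq. (1.1).
from itertools import combinations
from math import comb

blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]
A = [[int(i in B) for B in blocks] for i in range(1, 8)]  # a_ij = [x_i in B_j]

C = ([[0] * 7, [1] * 7]                      # the vectors 0 and 1
     + A                                     # rows a_1, ..., a_7
     + [[1 - a for a in row] for row in A])  # complements b_1, ..., b_7

def d(x, y):
    return sum(a != b for a, b in zip(x, y))

assert len(C) == 16
assert min(d(x, y) for x, y in combinations(C, 2)) == 3
assert 16 * (comb(7, 0) + comb(7, 1)) == 2 ** 7   # 16 * 8 = 128: perfect
print("C is a perfect (7, 16, 3) code")
```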

1.C The ISBN Code

Modern books are assigned an International Standard Book Number (ISBN), a 10-digit codeword, by the publisher. For example, Hill [1997] has the ISBN number 0-19-853803-0. The three hyphens separate the codeword into four fields. The first field specifies the language (0 means English), the second field indicates the publisher (19 means Oxford University Press), the third field (853803) is the book number assigned by the publisher, and the final digit (0) is a check digit. If the digits of the ISBN number are denoted x = x_1 . . . x_10, then the check digit x_10 is chosen as

x_10 = Σ_{k=1}^{9} k x_k (mod 11).

If x_10 turns out to be 10, an X is printed in place of the final digit. The tenth digit serves to make the weighted check sum

Σ_{k=1}^{10} k x_k = Σ_{k=1}^{9} k x_k + 10 x_10 = Σ_{k=1}^{9} k x_k + 10 Σ_{k=1}^{9} k x_k = 11 Σ_{k=1}^{9} k x_k = 0 (mod 11).

So, if Σ_{k=1}^{10} k x_k ≠ 0 (mod 11), we know that an error has occurred. In fact, the ISBN number is able to (i) detect a single error or (ii) detect a transposition error that results in two digits (not necessarily adjacent) being interchanged.


If a single error occurs, then some digit x_j is received as x_j + e with e ≠ 0. Then

Σ_{k=1}^{10} k x_k + je = je (mod 11) ≠ 0 (mod 11),

since j and e are nonzero.

Let y be the vector obtained by exchanging the digits x_j and x_k in an ISBN code x, where j ≠ k. Then

Σ_{i=1}^{10} i x_i + (k − j)x_j + (j − k)x_k = (k − j)x_j + (j − k)x_k (mod 11)
= (k − j)(x_j − x_k) (mod 11) ≠ 0 (mod 11)

if x_j ≠ x_k.

In the above arguments we have used the property of the field Z_11 (the integers modulo 11) that the product of two nonzero elements is always nonzero (since ab = 0 and a ≠ 0 ⇒ a⁻¹ab = 0 ⇒ b = 0). Consequently, Z_ab with a, b > 1 cannot be a field because the product ab = 0 (mod ab), even though a ≠ 0 and b ≠ 0. Note also that there can be no inverse a⁻¹ in Z_ab, for otherwise b = a⁻¹ab = a⁻¹·0 = 0 (mod ab). In fact, Z_p is a field ⇐⇒ p is prime. For this reason, the ISBN code is calculated in Z_11 and not in Z_10, where 2·5 = 0 (mod 10).

The ISBN code cannot be used to correct errors unless we know a priori which digit is in error. To do this, we first need to construct a table of inverses modulo 11 using the Euclidean division algorithm. For example, let y be the inverse of 2 modulo 11. Then 2y = 1 (mod 11) implies 2y = 11q + 1, or 1 = −11q + 2y, for some integers y and q. On dividing 11 by 2 as we would to show that gcd(11, 2) = 1, we find 11 = 5·2 + 1, so that 1 = 11 − 5·2, from which we see that q = −1 and y = −5 (mod 11) = 6 (mod 11) are solutions. Similarly, 7⁻¹ = 8 (mod 11) since 11 = 1·7 + 4, 7 = 1·4 + 3, and 4 = 1·3 + 1, so 1 = 4 − 1·3 = 4 − 1·(7 − 1·4) = 2·4 − 1·7 = 2·(11 − 1·7) − 1·7 = 2·11 − 3·7. Thus −3·7 = −2·11 + 1; that is, 7 and −3 = 8 are inverses mod 11. The complete table of inverses modulo 11 is shown in Table 1.2.

x 1 2 3 4 5 6 7 8 9 10

x

−1

1 6 4 3 9 2 8 7 5 10

Table 1.2: Inverses modulo 11.
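The table entries can be generated mechanically by the extended Euclidean algorithm just described; a Python sketch (illustrative):

```python
def inverse_mod(a, p):
    """Extended Euclidean algorithm: return y with a*y = 1 (mod p)."""
    r0, r1 = p, a % p       # remainders
    t0, t1 = 0, 1           # Bezout coefficients of a
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q*r1
        t0, t1 = t1, t0 - q*t1
    assert r0 == 1, "a and p must be coprime"
    return t0 % p

# reproduce Table 1.2
table = {x: inverse_mod(x, 11) for x in range(1, 11)}
print(table[2], table[7])   # 6 8
```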

Suppose that we detect an error and we also know that it is the digit x_j that is
in error (and hence unknown). Then we can use our table of inverses to solve for the
value of x_j, assuming all of the other digits are correct. Since

j x_j + Σ_{k=1, k≠j}^{10} k x_k = 0 (mod 11),

we know that

x_j = −j^{-1} Σ_{k=1, k≠j}^{10} k x_k (mod 11).


For example, if we did not know the fourth digit x of the ISBN 0-19-x53803-0, we
would calculate

x = −4^{-1}(1·0 + 2·1 + 3·9 + 5·5 + 6·3 + 7·8 + 8·0 + 9·3 + 10·0) (mod 11)
  = −3(0 + 2 + 5 + 3 + 7 + 1 + 0 + 5 + 0) (mod 11) = −3(1) (mod 11) = 8,

which is indeed correct.
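The recovery formula is easy to implement; a Python sketch (illustrative, assuming Python 3.8+ for the modular inverse via `pow`):

```python
def recover_digit(digits, j):
    """Solve x_j = -j^{-1} * sum_{k != j} k*x_k (mod 11) for the unknown
    jth digit (1-based); the current entry at position j is ignored."""
    s = sum(k * x for k, x in enumerate(digits, start=1) if k != j)
    j_inv = pow(j, -1, 11)          # inverse of j modulo 11 (Python 3.8+)
    return (-j_inv * s) % 11

# the fourth digit of 0-19-x53803-0 (a 0 placeholder marks the unknown):
print(recover_digit([0, 1, 9, 0, 5, 3, 8, 0, 3, 0], 4))   # 8
```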

Problem 1.5: A smudge has obscured one of the digits of the ISBN code 0-8018-?739-1.
Determine the unknown digit.

The sixth digit is

x_6 = −6^{-1}(1·0 + 2·8 + 3·0 + 4·1 + 5·8 + 7·7 + 8·3 + 9·9 + 10·1) (mod 11)
    = −2(0 + 5 + 0 + 4 + 7 + 5 + 2 + 4 + 10) (mod 11)
    = −2(4) (mod 11) = −8 (mod 11) = 3.

Problem 1.6: A smudge has obscured one of the digits of the ISBN code 0-393-051?0-X.
Determine the unknown digit.

The eighth digit is

x_8 = −8^{-1}(1·0 + 2·3 + 3·9 + 4·3 + 5·0 + 6·5 + 7·1 + 9·0 + 10·10) (mod 11)
    = −7(0 + 6 + 5 + 1 + 0 + 8 + 7 + 0 + 1) (mod 11)
    = −7(6) (mod 11) = −9 (mod 11) = 2.

Chapter 2

Linear Codes

An important class of codes are linear codes in the vector space F_q^n, where F_q is a
field.

Definition: A linear code C is a code for which, whenever u ∈ C and v ∈ C, then
αu + βv ∈ C for all α, β ∈ F_q. That is, C is a linear subspace of F_q^n.

Remark: The zero vector 0 automatically belongs to all linear codes.

Remark: A binary code C is linear ⇐⇒ it contains 0 and the sum of any two
codewords in C is also in C.

Problem 2.1: Show that the (7, 16, 3) code developed in the previous chapter is
linear.

Remark: A linear code C will always be a k-dimensional linear subspace of F_q^n for
some integer k between 1 and n. A k-dimensional code C is simply the set of all
linear combinations of k linearly independent codewords, called basis vectors. We
say that these k basis codewords generate or span the entire code space C.

Definition: We say that a k-dimensional code in F_q^n is an [n, k] code or, if we also
wish to specify the minimum distance d, an [n, k, d] code.

Remark: Note that a q-ary [n, k, d] code is an (n, q^k, d) code. To see this, let the k
basis vectors of an [n, k, d] code be u_j, for j = 1, ..., k. The q^k codewords are
obtained as the linear combinations Σ_{j=1}^{k} a_j u_j; there are q possible values for each
of the k coefficients a_j. Note that

Σ_{j=1}^{k} a_j u_j = Σ_{j=1}^{k} b_j u_j ⇒ Σ_{j=1}^{k} (a_j − b_j) u_j = 0 ⇒ a_j = b_j, j = 1, ..., k,

by the linear independence of the basis vectors, so the q^k generated codewords are
distinct.

Remark: Not every (n, q^k, d) code is a q-ary [n, k, d] code (it might not be linear).

Definition: Define the minimum weight of a code to be w(C) = min{w(x) : x ∈ C, x ≠ 0}.

One of the advantages of linear codes is illustrated by the following lemma.

Lemma 2.1 (Distance of a Linear Code): If C is a linear code in F_q^n, then d(C) = w(C).

Proof: There exist codewords x, y, and z, with x ≠ y and z ≠ 0, such that
d(x, y) = d(C) and w(z) = w(C). Then

d(C) ≤ d(z, 0) = w(z − 0) = w(z) = w(C) ≤ w(x − y) = d(x, y) = d(C),

so w(C) = d(C).
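Lemma 2.1 can be checked by brute force on a small code. A Python sketch (illustrative), using the four codewords of the (5, 4, 3) code C_3 that is revisited in Section 2.A:

```python
from itertools import combinations

# the four codewords of the (5, 4, 3) code C_3
C = ["00000", "01101", "10110", "11011"]

def weight(x):
    return sum(c == "1" for c in x)

def distance(x, y):
    return sum(a != b for a, b in zip(x, y))

d = min(distance(x, y) for x, y in combinations(C, 2))
w = min(weight(x) for x in C if weight(x) > 0)
print(d, w)   # 3 3 -- d(C) = w(C), as the lemma predicts
```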

Remark: Lemma 2.1 implies, for a linear code, that we only have to examine the
weights of the M − 1 nonzero codewords in order to find the minimum distance.
In contrast, for a general nonlinear code, we need to make (M choose 2) = M(M − 1)/2
comparisons (between all possible pairs of distinct codewords) to determine the
minimum distance.

Definition: A k × n matrix with rows that are basis vectors for a linear [n, k] code C
is called a generator matrix of C.

• A q-ary repetition code of length n is an [n, 1, n] code with generator matrix
[ 1 1 ... 1 ].

Problem 2.2: Show that the (7, 16, 3) perfect binary code in Chapter 1 is a [7, 4, 3]
linear code (note that 2^4 = 16) with generator matrix

[ 1   ]   [ 1 1 1 1 1 1 1 ]
[ a_1 ] = [ 1 0 0 0 1 0 1 ]
[ a_2 ]   [ 1 1 0 0 0 1 0 ]
[ a_3 ]   [ 0 1 1 0 0 0 1 ].

Remark: Linear q-ary codes are not defined unless q is a power of a prime (this is
simply the requirement for the existence of the field F_q). However, lower-dimensional
codes can always be obtained from linear q-ary codes by projection onto a lower-
dimensional subspace of F_q^n. For example, the ISBN code is a subset of the 9-
dimensional subspace of F_11^10 consisting of all vectors perpendicular to the vector
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10); this is the space

{ (x_1 x_2 ... x_10) : Σ_{k=1}^{10} k x_k = 0 (mod 11) }.

However, not all vectors in this set (for example X-00-000000-1) are in the ISBN
code. That is, the ISBN code is not a linear code.


For linear codes we must slightly restrict our definition of equivalence so that the
codes remain linear (e.g., in order that the zero vector remains in the code).

Definition: Two linear q-ary codes are equivalent if one can be obtained from the
other by a combination of

(A) permutation of the columns of the code;

(B) multiplication of the symbols appearing in a fixed column by a nonzero scalar.

Definition: A k × n matrix of rank k is in reduced echelon form (or standard form)
if it can be written as

[ 1_k | A ],

where 1_k is the k × k identity matrix and A is a k × (n − k) matrix.

Remark: A generator matrix for a vector space can always be reduced to an equiv-
alent reduced echelon form spanning the same vector space, by permutation of its
rows, multiplication of a row by a nonzero scalar, or addition of one row to an-
other. Note that any combination of these operations with (A) and (B) above will
generate equivalent linear codes.

Problem 2.3: Show that the generator matrix for the (7, 16, 3) perfect code in Chap-
ter 1 can be written in reduced echelon form as

G = [ 1 0 0 0 1 0 1 ]
    [ 0 1 0 0 1 1 1 ]
    [ 0 0 1 0 1 1 0 ]
    [ 0 0 0 1 0 1 1 ].
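The row reduction described in the remark above can be automated. The following Python sketch (illustrative; it assumes the leading k columns can supply pivots, which holds for the examples in this chapter) reproduces the reduced echelon form of Problem 2.3 from the generator matrix of Problem 2.2:

```python
def standard_form_gf2(G):
    """Row reduce a binary generator matrix (list of 0/1 rows) to
    reduced echelon form [1_k | A]."""
    G = [row[:] for row in G]
    k = len(G)
    for i in range(k):
        # find a pivot in column i and swap it into row i
        pivot = next(r for r in range(i, k) if G[r][i] == 1)
        G[i], G[pivot] = G[pivot], G[i]
        # clear column i in all other rows (addition mod 2)
        for r in range(k):
            if r != i and G[r][i] == 1:
                G[r] = [(a + b) % 2 for a, b in zip(G[r], G[i])]
    return G

G = [[1, 1, 1, 1, 1, 1, 1],
     [1, 0, 0, 0, 1, 0, 1],
     [1, 1, 0, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 0, 1]]
print(standard_form_gf2(G))
# rows: 1000101, 0100111, 0010110, 0001011
```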

2.A Encoding and Decoding

An [n, k] linear code C contains q^k codewords, corresponding to q^k distinct messages.
We identify each message with a k-tuple

u = [ u_1 u_2 ... u_k ],

where the components u_i are elements of F_q. We can encode u by multiplying it on
the right with the generator matrix G. This maps u to the linear combination uG of
the codewords. In particular, the message with components u_i = δ_ik gets mapped to
the codeword appearing in the kth row of G.

• Given the message [0, 1, 0, 1] and the above generator matrix for our (7, 16, 3) code,
the encoded codeword

            [ 1 0 0 0 1 0 1 ]
[ 0 1 0 1 ] [ 0 1 0 0 1 1 1 ] = [ 0 1 0 1 1 0 0 ]
            [ 0 0 1 0 1 1 0 ]
            [ 0 0 0 1 0 1 1 ]

is just the sum of the second and fourth rows of G.
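Encoding by uG is a single matrix-vector multiplication mod 2; a Python sketch (illustrative):

```python
def encode(u, G):
    """Encode the message u as the codeword uG (arithmetic mod 2)."""
    n = len(G[0])
    return [sum(u[i] * G[i][j] for i in range(len(u))) % 2 for j in range(n)]

G = [[1, 0, 0, 0, 1, 0, 1],
     [0, 1, 0, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 0, 1, 1]]

print(encode([0, 1, 0, 1], G))   # [0, 1, 0, 1, 1, 0, 0]
```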


Problem 2.4: If a generator matrix for a linear [n, k] code is in standard form, show that
the message vector is just the first k bits of the codeword.

Definition: Let C be a linear code in F_q^n. Let a be any vector in F_q^n. The set
a + C = {a + x : x ∈ C} is called a coset of C.

Lemma 2.2 (Equivalent Cosets): Let C be a linear code in F_q^n and a ∈ F_q^n. If b is
an element of the coset a + C, then

b + C = a + C.

Proof: Since b ∈ a + C, we have b = a + x for some x ∈ C. Consider any vector
b + y ∈ b + C, with y ∈ C. Then

b + y = (a + x) + y = a + (x + y) ∈ a + C,

so b + C ⊂ a + C. Furthermore, a = b + (−x) ∈ b + C, so the same argument implies
a + C ⊂ b + C. Hence b + C = a + C.

The following theorem from group theory states that F_q^n is just the union of q^{n−k}
distinct cosets of a linear [n, k] code C, each containing q^k elements.

Theorem 2.1 (Lagrange's Theorem): Suppose C is an [n, k] code in F_q^n. Then

(i) every vector of F_q^n is in some coset of C;

(ii) every coset contains exactly q^k vectors;

(iii) any two cosets are either equivalent or disjoint.

Proof:

(i) a = a + 0 ∈ a + C for every a ∈ F_q^n.

(ii) Since the mapping φ(x) = a + x is one-to-one, |a + C| = |C| = q^k. Here |C|
denotes the number of elements in C.

(iii) Let a, b ∈ F_q^n. Suppose that the cosets a + C and b + C have a common vector
v = a + x = b + y, with x, y ∈ C. Then b = a + (x − y) ∈ a + C, so by
Lemma 2.2, b + C = a + C.

Definition: The standard array (or Slepian array) of a linear [n, k] code C in F_q^n is a
q^{n−k} × q^k array listing all the cosets of C. The first row consists of the codewords
in C themselves, listed with 0 appearing in the first column. Subsequent rows are
listed one at a time, beginning with a vector of minimal weight that has not already
been listed in previous rows, such that the entry in the (i, j)th position is the sum
of the entries in position (i, 1) and position (1, j). The vectors in the first column
of the array are referred to as coset leaders.


• Let us revisit our linear (5, 4, 3) code

C_3 = { 0 0 0 0 0,  0 1 1 0 1,  1 0 1 1 0,  1 1 0 1 1 }

with generator matrix

G_3 = [ 0 1 1 0 1 ]
      [ 1 0 1 1 0 ].

The standard array for C_3 is an 8 × 4 array of cosets, listed here in three groups of
increasing coset leader weight:

0 0 0 0 0   0 1 1 0 1   1 0 1 1 0   1 1 0 1 1

0 0 0 0 1   0 1 1 0 0   1 0 1 1 1   1 1 0 1 0
0 0 0 1 0   0 1 1 1 1   1 0 1 0 0   1 1 0 0 1
0 0 1 0 0   0 1 0 0 1   1 0 0 1 0   1 1 1 1 1
0 1 0 0 0   0 0 1 0 1   1 1 1 1 0   1 0 0 1 1
1 0 0 0 0   1 1 1 0 1   0 0 1 1 0   0 1 0 1 1

0 0 0 1 1   0 1 1 1 0   1 0 1 0 1   1 1 0 0 0
0 1 0 1 0   0 0 1 1 1   1 1 1 0 0   1 0 0 0 1

Remark: The last two rows of the standard array for C_3 could equally well have
been written as

1 1 0 0 0   1 0 1 0 1   0 1 1 1 0   0 0 0 1 1
1 0 0 0 1   1 1 1 0 0   0 0 1 1 1   0 1 0 1 0
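The construction of the standard array can be sketched in Python (illustrative; where a coset has several minimal-weight vectors, the leader chosen may differ from the listing above, as the remark notes):

```python
from itertools import product

# codewords of C_3, as tuples of bits
codewords = [(0, 0, 0, 0, 0), (0, 1, 1, 0, 1), (1, 0, 1, 1, 0), (1, 1, 0, 1, 1)]

def add(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

def standard_array(C, n=5):
    rows, listed = [], set()
    # scan all vectors in order of increasing weight, so each new
    # coset leader has minimal weight among the unlisted vectors
    for v in sorted(product((0, 1), repeat=n), key=sum):
        if v not in listed:
            row = [add(v, c) for c in C]
            rows.append(row)
            listed.update(row)
    return rows

rows = standard_array(codewords)
print(len(rows))          # 8 cosets, i.e. q^(n-k) = 2^3
print(rows[0] == codewords)   # True: the first row is the code itself
```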

Definition: If the codeword x is sent but the received vector is y, we define the
error vector e := y − x.

Remark: If no more than t errors have occurred, the coset leaders of weight t or
less are precisely the error vectors that can be corrected. Recall that the code C_3,
having minimum distance 3, can only correct one error. For the code C_3, as long as
no more than one error has occurred, the error vector will have weight at most one.
We can then decode the received vector by checking to see under which codeword
it appears in the standard array, remembering that the codewords themselves are
listed in the first row. For example, if y = 10111 is received, we know that the
error vector e = 00001, and the transmitted codeword must have been x = y − e =
10111 − 00001 = 10110.

Remark: If two errors have occurred, one cannot determine the original vector with
certainty, because in each row with coset leader weight 2, there are actually two
vectors of weight 2. For a code with minimum distance 2t + 1, the rows in the
standard array of coset leader weight greater than t can be written in more than one
way, as we have seen above. Thus, if 01110 is received, then either 01110 − 00011 =
01101 or 01110 − 11000 = 10110 could have been transmitted.

Remark: Let C be a binary [n, k] linear code and let α_i denote the number of coset
leaders for C having weight i, where i = 0, ..., n. If p is the error probability for a
single bit, then the probability P_corr(C) that a received vector is correctly decoded
is

P_corr(C) = Σ_{i=0}^{n} α_i p^i (1 − p)^{n−i}.

Remark: If C can correct t errors, then the coset leaders of weight no more than t
are unique and hence the total number of such leaders of weight i is α_i = (n choose i) for
0 ≤ i ≤ t. In particular, if n = t, then

P_corr(C) = Σ_{i=0}^{n} (n choose i) p^i (1 − p)^{n−i} = (p + 1 − p)^n = 1.

Such a code is able to correct all possible errors (no matter how poor the transmis-
sion line is); however, since C only contains a single codeword, it cannot be used
to send any information!

Remark: For i > t, the coefficients α_i can be difficult to calculate. For a perfect
code, however, we know that every vector is within a distance t of some codeword.
Thus, the error vectors that can be corrected by a perfect code are precisely those
vectors of weight no more than t; consequently,

α_i = (n choose i) for 0 ≤ i ≤ t,   and   α_i = 0 for i > t.

• For the code C_3, we see that α_0 = 1, α_1 = 5, α_2 = 2, and α_3 = α_4 = α_5 = 0. Hence

P_corr(C_3) = (1 − p)^5 + 5p(1 − p)^4 + 2p^2(1 − p)^3 = (1 − p)^3(1 + 3p − 2p^2).

For example, if p = 0.01, then P_corr = 0.99921 and P_err := 1 − P_corr = 0.00079,
more than a factor 12 lower than the raw bit error probability p. Of course, this
improvement in reliability comes at a price: we must now send n = 5 bits for every
k = 2 information bits. The ratio k/n is referred to as the rate of the code. It
is interesting to compare the performance of C_3 with a code that sends two bits
of information by using two back-to-back repetition codes, each of length 5, for
which α_0 = 1, α_1 = 5, and α_2 = 10. We find that P_corr for such a code is

[(1 − p)^5 + 5p(1 − p)^4 + 10p^2(1 − p)^3]^2 = [(1 − p)^3(1 + 3p + 6p^2)]^2 = 0.99998,

so that P_err = 0.00002. While this error rate is almost forty times lower than that
for C_3, bear in mind that the repetition scheme requires the transmission of twice
as much data for the same number of information digits (i.e. it has half the rate
of C_3).
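The two probabilities compared above can be reproduced numerically; a Python sketch (illustrative):

```python
p = 0.01

# C_3: alpha_0 = 1, alpha_1 = 5, alpha_2 = 2
P_C3 = (1 - p)**5 + 5*p*(1 - p)**4 + 2*p**2*(1 - p)**3

# two back-to-back length-5 repetition codes: alpha_2 = 10 for each block
P_rep = ((1 - p)**5 + 5*p*(1 - p)**4 + 10*p**2*(1 - p)**3)**2

print(round(P_C3, 5), round(P_rep, 5))   # 0.99921 0.99998
```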

2.B Syndrome Decoding

The standard array for our (5, 4, 3) code had 32 entries; for a general code of length n,
we will have to search through 2^n entries every time we wish to decode a received
vector. For codes of any reasonable length, this is not practical. Fortunately, there is
a more efficient alternative, which we now describe.

Definition: Let C be an [n, k] linear code. The dual code C⊥ of C in F_q^n is the set of
all vectors that are orthogonal to every codeword of C:

C⊥ = {v ∈ F_q^n : v · u = 0, ∀u ∈ C}.

Problem 2.5: Show that the dual code C⊥ of a linear code C is itself linear.

Remark: The dual code C⊥ is just the null space of G:

C⊥ = {v ∈ F_q^n : Gv^t = 0};

that is,

v ∈ C⊥ ⇐⇒ Gv^t = 0

(where the superscript t denotes transposition). This just says that v is orthogonal
to each of the rows of G. From linear algebra, we know that the space spanned by
the k independent rows of G is a k-dimensional subspace and that the null space of G,
which is just C⊥, is an (n − k)-dimensional subspace.

Remark: Since every vector in C is perpendicular to every vector in C⊥, we know
immediately that C ⊂ (C⊥)⊥. In fact, since the dimension of the linear subspace
(C⊥)⊥ is n − (n − k) = k, we deduce that C = (C⊥)⊥.

Definition: Let C be an [n, k] linear code. An (n − k) × n generator matrix H for C⊥
is called a parity-check matrix.

Definition: The redundancy r := n − k of a code represents the number of parity-
check digits in the code.

Remark: A code C is completely specified by its parity-check matrix:

C = (C⊥)⊥ = {u ∈ F_q^n : Hu^t = 0};

that is, C is the space of all vectors that are orthogonal to every vector in C⊥. In
other words, Hu^t = 0 ⇐⇒ u ∈ C.


Theorem 2.2 (Minimum Distance): A linear code has minimum distance d ⇐⇒ d
is the maximum number such that any d − 1 columns of its parity-check matrix are
linearly independent.

Proof: Let C be a linear code and u be a codeword such that w(u) = d(C) = d.
Since

u ∈ C ⇐⇒ Hu^t = 0

and u has d nonzero components, we see that some d columns of H are linearly
dependent. However, any d − 1 columns of H must be linearly independent, or else
there would exist a nonzero codeword in C with weight at most d − 1.

Remark: Equivalently, a linear code has minimum distance d if d is the smallest
number for which some d columns of its parity-check matrix are linearly dependent.

• For a code with minimum distance 3, Theorem 2.2 tells us that any two columns of its
parity-check matrix must be linearly independent, but that some 3 columns are linearly
dependent.

Definition: Given a linear code with parity-check matrix H, the column vector Hu^t
is called the syndrome of u.

Lemma 2.3: Two vectors u and v are in the same coset of a linear code C ⇐⇒
they have the same syndrome.

Proof:

u − v ∈ C ⇐⇒ H(u − v)^t = 0 ⇐⇒ Hu^t = Hv^t.

Remark: We thus see that there is a one-to-one correspondence between cosets
and syndromes. This leads to an alternative decoding scheme known as syndrome
decoding. When a vector u is received, one computes the syndrome Hu^t and
compares it to the syndromes of the coset leaders. If the coset leader having the
same syndrome is of minimal weight within its coset, it is the error vector for
decoding u.

To compute the syndrome for a code, we first need to determine the parity-check
matrix. The following lemma describes an easy way to construct the standard form
of the parity-check matrix from the standard-form generator matrix.

Lemma 2.4: An (n − k) × n parity-check matrix H for an [n, k] code generated by
the matrix G = [ 1_k | A ], where A is a k × (n − k) matrix, is given by

H = [ −A^t | 1_{n−k} ].

Proof: This follows from the fact that the rows of G are orthogonal to every row
of H; in other words,

G H^t = [ 1_k | A ] [ −A      ] = 1_k(−A) + A 1_{n−k} = −A + A = 0,
                    [ 1_{n−k} ]

the k × (n − k) zero matrix.
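Lemma 2.4 translates directly into code. A Python sketch (illustrative) that builds the parity-check matrix of our (5, 4, 3) code from its standard-form generator:

```python
def parity_check_from_standard(G, q=2):
    """Given a standard-form generator G = [1_k | A] over F_q,
    return H = [-A^t | 1_{n-k}] (Lemma 2.4)."""
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]                       # k x (n - k)
    H = []
    for j in range(n - k):
        row = [(-A[i][j]) % q for i in range(k)]     # column j of -A^t
        row += [1 if m == j else 0 for m in range(n - k)]
        H.append(row)
    return H

# standard-form generator of our (5, 4, 3) code
G3 = [[1, 0, 1, 1, 0],
      [0, 1, 1, 0, 1]]
print(parity_check_from_standard(G3))
# [[1, 1, 1, 0, 0], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1]]
```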


• A parity-check matrix H_3 for our (5, 4, 3) code is

H_3 = [ 1 1 1 0 0 ]
      [ 1 0 0 1 0 ]
      [ 0 1 0 0 1 ].

Problem 2.6: Show that the null space of a matrix is invariant to standard row

reduction operations (permutation of rows, multiplication of a row by a non-zero

scalar, and addition of one row to another) and that these operations may be used

to put H into standard form.

Remark: The syndrome He^t of a binary error vector e is just the sum of those
columns of H for which the corresponding entry in e is nonzero.

The following theorem makes it particularly easy to correct errors of unit weight.

It will play a particularly important role for the Hamming codes discussed in the next

chapter.

Theorem 2.3: The syndrome of a vector that has a single error of magnitude m in the
ith position is m times the ith column of H.

Proof: Let e_i be the vector with the value m in the ith position and zero in all
other positions. If the codeword x is sent and the vector y = x + e_i is received, the
syndrome Hy^t = Hx^t + He_i^t = 0 + He_i^t = He_i^t is just m times the ith column
of H.

• For our (5, 4, 3) code, if y = 10111 is received, we compute Hy^t = [0 0 1]^t, which
matches the fifth column of H. Thus, the fifth digit is in error (assuming that only
a single error has occurred), and we decode y to the codeword 10110, just as we
deduced earlier using the standard array.
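Syndrome decoding of a single error, as in the example above, can be sketched in Python (illustrative):

```python
H3 = [[1, 1, 1, 0, 0],
      [1, 0, 0, 1, 0],
      [0, 1, 0, 0, 1]]

def syndrome(H, y):
    return [sum(h * x for h, x in zip(row, y)) % 2 for row in H]

def correct_single_error(H, y):
    """Assuming at most one error: flip the bit whose column of H
    matches the syndrome."""
    s = syndrome(H, y)
    if all(v == 0 for v in s):
        return y                                  # already a codeword
    cols = [[row[i] for row in H] for i in range(len(y))]
    i = cols.index(s)                             # position of the error
    return y[:i] + [1 - y[i]] + y[i+1:]

print(correct_single_error(H3, [1, 0, 1, 1, 1]))   # [1, 0, 1, 1, 0]
```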

Remark: If the syndrome does not match any of the columns of H, we know that
more than one error has occurred. We can still determine which coset the syndrome
belongs to by comparing the computed syndrome with a table of syndromes of all
coset leaders. If the corresponding coset leader has minimal weight within its coset,
we are able to correct the error. To decode errors of weight greater than one, we
will need to construct a syndrome table, but this table, having only q^{n−k} entries,
is smaller than the standard array, which has q^n entries.

Problem 2.7: Using the binary linear code with parity check matrix

H = [ 0 0 1 1 ]
    [ 1 0 1 0 ]
    [ 0 1 1 0 ],

decode the received vector 1011.

The syndrome [0, 0, 1]^t corresponds to the second column of H. So the transmitted
vector was 1011 − 0100 = 1111.


Problem 2.8: Consider the linear (6, M, d) binary code C generated by

G = [ 1 1 0 0 1 1 ]
    [ 0 1 0 1 1 0 ]
    [ 1 0 1 1 1 0 ].

(a) Find a parity check matrix H for C.

First, we put G in standard form:

G = [ 1 0 0 1 0 1 ]
    [ 0 1 0 1 1 0 ]
    [ 0 0 1 0 1 1 ],

from which we see that

H = [ 1 1 0 1 0 0 ]
    [ 0 1 1 0 1 0 ]
    [ 1 0 1 0 0 1 ].

(b) Determine the number of codewords M and minimum distance d of C. Justify
your answers.

Since G has 3 (linearly independent) rows, it spans a three-dimensional code space over
F_2. Thus, M = 2^3 = 8. Since no two columns of H are linearly dependent but the third
column is the sum of the first two, we know by Theorem 2.2 that d = 3.

(c) How many errors can this code correct?

The code can correct only ⌊(d − 1)/2⌋ = 1 error.

(d) Is C a perfect code? Justify your answer.

No, because 8[(6 choose 0) + (6 choose 1)] = 8(1 + 6) = 56 < 64 = 2^6.

(e) Suppose the vector 011011 is received. Can this vector be decoded, assuming
that only one error has occurred? If so, what was the transmitted vector?

The syndrome is [1 1 0]^t, which is the second column of H. So the transmitted vector
was 001011.

(f) Suppose the vector 011010 is received. Can this vector be decoded, assuming
that only one error has occurred? If so, what was the transmitted vector?

No, we cannot decode this vector because the syndrome is [1 1 1]^t, which is not a syndrome
corresponding to an error vector of weight 1. At least 2 errors must have occurred, and we
cannot correct 2 errors with this code.

Problem 2.9:

(a) Let C be a linear code. If C = C⊥, prove that n is even and C must be an
[n, n/2] code.

Let k be the dimension of the linear space C. The dimension of C⊥ is n − k. If C = C⊥,
then k = n − k, so n = 2k.


(b) Prove that exactly 2^{n−1} vectors in F_2^n have even weight.

By specifying that a vector in F_2^n has even weight, we are effectively imposing a parity-
check equation on it; the last bit is constrained to be the sum of the previous n − 1 bits. So
one can construct exactly 2^{n−1} vectors of even weight.

(c) If C⊥ is the binary repetition code of length n, prove that C is a binary code
consisting of all even-weight vectors. Hint: find a generator matrix for C⊥.

The generator matrix for C⊥ is the 1 × n matrix

G⊥ = [ 1 1 1 ... 1 1 1 ].

This must be a parity-check matrix for C, so we know that C consists of all vectors
x_1 x_2 ... x_n for which w(x) = Σ_{i=1}^{n} x_i = 0 (mod 2). That is, C consists of all even-weight
vectors.

Alternatively, we can explicitly find a parity-check matrix for C⊥; namely, the (n − 1) × n
matrix

H⊥ = [ 1 1 0 0 ... 0 0 ]
     [ 1 0 1 0 ... 0 0 ]
     [ 1 0 0 1 ... 0 0 ]
     [ :  :  :  :     :  : ]
     [ 1 0 0 0 ... 1 0 ]
     [ 1 0 0 0 ... 0 1 ].

We see that H⊥ is a generator matrix for C and that C consists only of even-weight vectors.
Furthermore, the vector space C has dimension n − 1, so we know that it contains all 2^{n−1}
even-weight vectors.

Problem 2.10: Let C be the code consisting of all vectors in F_q^n with checksum
0 (mod q). Let C′ be the q-ary repetition code of length n.

(a) Find a generator matrix G and parity-check matrix H for C. What are the sizes of
these matrices?

A generator matrix for C is the (n − 1) × n matrix

G = [ 1 0 0 ... 0 −1 ]
    [ 0 1 0 ... 0 −1 ]
    [ 0 0 1 ... 0 −1 ]
    [ :  :  :      :  : ]
    [ 0 0 0 ... 1 −1 ].

A parity-check matrix for C is the 1 × n matrix

H = [ 1 1 ... 1 ].

(b) Find a generator matrix G′ and parity-check matrix H′ for C′.

A generator matrix for C′ is the 1 × n matrix G′ = H. A parity-check matrix for C′ is the
(n − 1) × n matrix H′ = G.


(c) Which of the following statements are correct? Circle all correct statements.

• C′ ⊂ C⊥ (circled)

• C′ = C⊥ (circled)

• C′ ⊃ C⊥ (circled)

• Neither C′ ⊃ C⊥ nor C′ ⊂ C⊥ holds.

• C′ ∩ C⊥ = ∅

The first three statements are all correct: C⊥ is generated by H = [1 1 ... 1], so C′ = C⊥,
and both inclusions then hold trivially.

(d) Find d(C). Justify your answer.

Any set containing just one column of the parity-check matrix H of C is linearly inde-
pendent, but the first and second columns (say) are not. From Theorem 2.2, we conclude
that d(C) = 2.

(e) Find d(C′). Justify your answer.

Since each codeword of C′ differs from all of the others in all n places, we see that
d(C′) = n.

(f) How many codewords are there in C?

Since G has n − 1 rows, there are q^{n−1} possible codewords.

(g) How many codewords are there in C′?

Since G′ has 1 row, there are q possible codewords.

(h) Suppose q = 2 and n is odd. Use part (g) to prove that

Σ_{k=0}^{(n−1)/2} (n choose k) = 2^{n−1}.

An odd-length binary (n, 2, n) repetition code can correct t = (n − 1)/2 errors. Odd-length
binary repetition codes are (trivially) perfect: they satisfy the Hamming equality

2 Σ_{k=0}^{(n−1)/2} (n choose k) = 2^n,

from which the desired result follows. (Equivalently, this is a consequence of the left–right
symmetry of Pascal's triangle.)

Problem 2.11: Consider the linear (7, M, d) binary code C generated by

G = [ 1 1 0 0 0 1 1 ]
    [ 1 0 1 1 0 0 1 ]
    [ 0 0 1 0 1 1 1 ].

(a) Find a parity check matrix H for C.

First, we row reduce G to standard form:

G = [ 1 0 0 1 1 1 0 ]
    [ 0 1 0 1 1 0 1 ]
    [ 0 0 1 0 1 1 1 ],

from which we see that

H = [ 1 1 0 1 0 0 0 ]
    [ 1 1 1 0 1 0 0 ]
    [ 1 0 1 0 0 1 0 ]
    [ 0 1 1 0 0 0 1 ].

(b) Determine the number of codewords M in C. Justify your answer.

Since G has 3 (linearly independent) rows, it spans a three-dimensional code space over
F_2. Thus, M = 2^3 = 8.

(c) Find the maximum number N such that any set of N columns of H is linearly
independent. Justify your answer.

It is convenient to divide the columns into two groups of clearly linearly independent
vectors: the first three columns (which are distinct and do not sum to zero) and the last
four columns. Each of the first three columns has weight 3, and therefore cannot be written
as a sum of two of the last four columns. Any two of the first three columns differ in more
than one place, and so their sum cannot equal any of the last four columns. Thus, no three
columns of H are linearly dependent. However, the sum of the first three columns is equal
to the fifth column, so N = 3 is the maximum number of linearly independent columns.

(d) Determine the minimum distance d of C.

From part (c) and Theorem 2.2, we know that d = 4.

(e) How many errors can C correct?

The code can correct only ⌊(d − 1)/2⌋ = 1 error.

(f) Is C a perfect code? Justify your answer.

No; from Problem 1.4(b) we know that a code with even distance can never be perfect.
Alternatively, we note that 8[(7 choose 0) + (7 choose 1)] = 8(1 + 7) = 64 < 128 = 2^7.

(g) By examining the inner (dot) products of the rows of G with each other,
determine which of the following statements are correct (circle all correct statements
and explain):

• C ⊂ C⊥ (circled)

• C = C⊥

• C ⊃ C⊥

• Neither C ⊃ C⊥ nor C ⊂ C⊥ holds.

• C ∩ C⊥ = ∅

The only correct statement is C ⊂ C⊥, since the rows of G are orthogonal to each other
(and to themselves). Note that C cannot be self-dual because it has dimension k = 3,
while C⊥ has dimension n − k = 7 − 3 = 4.

(h) Suppose the vector 1100011 is received. Can this vector be decoded, assuming
that no more than one error has occurred? If so, what was the transmitted codeword?

Yes; in fact, this is the first row of G, so it must be a codeword. No errors have
occurred; the transmitted codeword was 1100011. As a check, one can verify that the
syndrome is [0 0 0 0]^t.

(i) Suppose the vector 1010100 is received. Can this vector be decoded, assuming
that no more than one error has occurred? If so, what was the transmitted codeword?

The syndrome is [1 1 0 1]^t, which is the second column of H. So the transmitted vector
was 1110100.

(j) Suppose the vector 1111111 is received. Show that at least 3 errors have
occurred. Can this vector be unambiguously decoded by C? If so, what was the
transmitted codeword? If not, and if only 3 errors have occurred, what are the
possible codewords that could have been transmitted?

Since the syndrome [1 0 1 1]^t is neither a column of H nor the sum of two columns of H,
it does not correspond to an error vector of weight 1 or 2. Thus, at least 3 errors have
occurred. We cannot unambiguously decode this vector because C can only correct 1 error.
In fact, since the rows of G have weight 4, by part (g) and Problem 4.3, we know that
all nonzero codewords in C have weight 4. So any nonzero codeword could have been
transmitted, with 3 errors, to receive 1111111.

Problem 2.12: Consider a single-error-correcting ternary code C with parity-check
matrix

H = [ 2 0 1 1 0 0 ]
    [ 1 2 0 0 1 0 ]
    [ 0 2 2 0 0 1 ].

(a) Find a generator matrix G.

A generator matrix is

G = [ 1 0 0 1 2 0 ]
    [ 0 1 0 0 1 1 ]
    [ 0 0 1 2 0 1 ].

(b) Use G to encode the information messages 100, 010, 001, 200, 201, and 221.

The information word x is encoded as xG. So the information messages can be encoded
all at once, as the rows of a matrix product:

[ 1 0 0 ]                     [ 1 0 0 1 2 0 ]
[ 0 1 0 ]   [ 1 0 0 1 2 0 ]   [ 0 1 0 0 1 1 ]
[ 0 0 1 ]   [ 0 1 0 0 1 1 ] = [ 0 0 1 2 0 1 ]
[ 2 0 0 ]   [ 0 0 1 2 0 1 ]   [ 2 0 0 2 1 0 ]
[ 2 0 1 ]                     [ 2 0 1 1 1 1 ]
[ 2 2 1 ]                     [ 2 2 1 1 0 0 ].

That is, the encoded words are the rows of the resulting matrix, namely 100120, 010011,
001201, 200210, 201111, and 221100.

(c) What is the minimum distance of this code?

Since

2H = [ 1 0 2 2 0 0 ]
     [ 2 1 0 0 2 0 ]
     [ 0 1 1 0 0 2 ],

we see that the columns of H and 2H are distinct, so no two columns of H are linearly
dependent. But the first column of H is the fifth column plus twice the fourth column, so
by Theorem 2.2 we know that d = 3.

(d) Decode the received word 122112, if possible. If you can decode it, determine
the corresponding message vector.

The syndrome is

[ 2 0 1 1 0 0 ] [ 1 ]   [ 2 ]
[ 1 2 0 0 1 0 ] [ 2 ] = [ 0 ]
[ 0 2 2 0 0 1 ] [ 2 ]   [ 1 ].
                [ 1 ]
                [ 1 ]
                [ 2 ]

The syndrome 201 is twice the third column of H, so the corrected word is 122112 −
2(001000) = 122112 − 002000 = 120112. Since G is in standard form, the corresponding
message word, 120, is just the first three digits of the codeword.

(e) Decode the received word 102201, if possible. If you can decode it, determine
the corresponding message vector.

The syndrome is

[ 2 0 1 1 0 0 ] [ 1 ]   [ 0 ]
[ 1 2 0 0 1 0 ] [ 0 ] = [ 1 ]
[ 0 2 2 0 0 1 ] [ 2 ]   [ 2 ].
                [ 2 ]
                [ 0 ]
                [ 1 ]

The syndrome 012 is not a multiple of any column of H, so either an incorrect codeword
was transmitted or more than one error occurred in transmission. But we can only correct
one error with this code, so we have to ask for a retransmission.
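The two decodings above can be automated. A Python sketch (illustrative) of single-error syndrome decoding over F_3:

```python
H = [[2, 0, 1, 1, 0, 0],
     [1, 2, 0, 0, 1, 0],
     [0, 2, 2, 0, 0, 1]]

def syndrome(H, y, q=3):
    return [sum(h * x for h, x in zip(row, y)) % q for row in H]

def decode(H, y, q=3):
    """Single-error correction over F_3: search for a position i and
    magnitude m with m * (column i of H) equal to the syndrome."""
    s = syndrome(H, y, q)
    if all(v == 0 for v in s):
        return y
    for i in range(len(y)):
        col = [row[i] for row in H]
        for m in (1, 2):
            if [(m * c) % q for c in col] == s:
                corrected = y[:]
                corrected[i] = (y[i] - m) % q
                return corrected
    return None   # more than one error: request retransmission

print(decode(H, [1, 2, 2, 1, 1, 2]))   # [1, 2, 0, 1, 1, 2]
print(decode(H, [1, 0, 2, 2, 0, 1]))   # None
```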

Chapter 3

Hamming Codes

One way to construct perfect binary [n, k] codes that can correct single errors is to
ensure that every nonzero vector in F_2^{n−k} appears as a unique column of H. In this
manner, the syndrome of every possible vector in F_2^n can be identified with a column
of H, so that every vector in F_2^n is at most a distance one away from a codeword.
This is called a binary Hamming code, which we now discuss in the general space F_q^n,
where F_q is a field.

Remark: One can form q − 1 distinct scalar multiples from any nonzero vector u
in F_q^r: if λ, γ ∈ F_q, then

λu = γu ⇒ (λ − γ)u = 0 ⇒ λ = γ or u = (λ − γ)^{-1} 0 = 0.

Definition: Given an integer r ≥ 2, let n = (q^r − 1)/(q − 1). The Hamming code
Ham(r, q) is a linear [n, n − r] code in F_q^n for which the columns of the r × n parity-
check matrix H are the n distinct nonzero vectors of F_q^r with first nonzero entry
equal to 1.

Remark: Not only are the columns of H distinct, all nonzero multiples of any two columns are also distinct. That is, any two columns of H are linearly independent. The total number of nonzero column multiples that can thus be formed is n(q − 1) = q^r − 1. Including the zero vector, we see that H yields a total of q^r distinct syndromes, corresponding to the zero error vector and all possible error vectors of unit weight in F_q^n.

• The columns of a parity-check matrix for the binary Hamming code Ham(r, 2)

consist of all possible nonzero binary codewords of length r.
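As a quick illustration (our own sketch, not from the notes; `hamming_columns` is an invented name), the defining condition on the columns can be enumerated directly:

```python
# Enumerate the columns of a parity-check matrix for Ham(r, q): the
# nonzero vectors of F_q^r whose first nonzero entry is 1.
from itertools import product

def hamming_columns(r, q):
    cols = []
    for v in product(range(q), repeat=r):
        first_nonzero = next((c for c in v if c != 0), 0)
        if first_nonzero == 1:
            cols.append(v)
    return cols

print(len(hamming_columns(3, 2)))   # n = (2^3 - 1)/(2 - 1) = 7
print(len(hamming_columns(2, 3)))   # n = (3^2 - 1)/(3 - 1) = 4
```

The counts agree with n = (q^r − 1)/(q − 1) from the definition.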

Remark: The columns of a parity-check matrix H for a Hamming code may be written in any order. However, both the syndromes and codewords will depend on the order of the columns. If H is row reduced to standard form, the codewords will be unchanged. However, other equivalent Ham(r, 2) codes obtained by rearranging columns of H will have rearranged codewords.


Problem 3.1: Given a parity-check matrix H for a binary Hamming code, show that

the standard form for H (obtained by row reduction) is just a rearrangement of

the columns of H.

Remark: The dimension k of Ham(r, q) is given by

    k = n − r = (q^r − 1)/(q − 1) − r.

• A parity-check matrix for the one-dimensional code Ham(2, 2) is

    [ 0 1 1 ]
    [ 1 0 1 ],

which can be row reduced to standard form:

    [ 1 1 0 ]
    [ 1 0 1 ].

The generator matrix is then seen to be [ 1 1 1 ]. That is, Ham(2, 2) is just the binary triple-repetition code.

• A parity-check matrix for the four-dimensional code Ham(3, 2), in standard form, is

    [ 0 1 1 1 1 0 0 ]
    [ 1 0 1 1 0 1 0 ]
    [ 1 1 0 1 0 0 1 ].

Problem 3.2: Show that this code is equivalent to the (7, 16, 3) perfect code in

Chapter 1.

Remark: An equivalent way to construct the binary Hamming code Ham(r, 2) is to consider all n = 2^r − 1 nonempty subsets of a set S containing r elements. Each of these subsets corresponds to a position of a code in F_2^n. A codeword can then be thought of as just a collection of nonempty subsets of S. Any particular element a of S will appear in exactly half of all 2^r subsets (that is, in 2^(r−1) subsets) of S, so that an even number of the 2^r − 1 nonempty subsets will contain a. This gives us a parity-check equation, which says that the sum of all digits corresponding to a subset containing a must be 0 (mod 2). There will be a parity-check equation for each of the r elements of S corresponding to a row of the parity-check matrix H. That is, each column of H corresponds to one of the subsets, with a 1 appearing in the ith position if the subset contains the ith element and 0 if it doesn't.

• A parity check matrix for Ham(3, 2) can be constructed by considering all possible nonempty subsets of {a, b, c}, each of which corresponds to one of the bits of a codeword x = x_1 x_2 ... x_7 in F_2^7:

        x_1  x_2  x_3  x_4  x_5  x_6  x_7
             a    a    a    a
        b         b    b         b
        c    c         c              c

Given any four binary information digits x_1, x_2, x_3, and x_4, there will be a unique codeword satisfying Hx = 0; the parity-check digits x_5, x_6, and x_7 can be determined from the three checksum equations corresponding to each of the elements a, b, and c:

    a :  x_2 + x_3 + x_4 + x_5 = 0 (mod 2),
    b :  x_1 + x_3 + x_4 + x_6 = 0 (mod 2),
    c :  x_1 + x_2 + x_4 + x_7 = 0 (mod 2).

For example, the vector x = 1100110 corresponds to the collection

    {{b, c}, {a, c}, {a}, {b}}.

Since there are an even number of as, bs, and cs in this collection, we know that x is a codeword.

Problem 3.3: Show that two distinct codewords x and y that satisfy the above three

parity check equations must diﬀer in at least 3 places.

Remark: When constructing binary Hamming codes, there is a distinct advantage in

arranging the parity-check matrix so that the columns, treated as binary numbers,

are in ascending order. The syndrome, interpreted in exactly the same way as a

binary number, immediately tells us in which position a single error has occurred.

• We can write a parity-check matrix for a Ham(3, 2) code in the binary ascending form

    H = [ 0 0 0 1 1 1 1 ]
        [ 0 1 1 0 0 1 1 ]
        [ 1 0 1 0 1 0 1 ].

If the vector 1110110 is received, the syndrome is [0, 1, 1]^t, which corresponds to the binary number 3, so we know immediately that a single error must have occurred in the third position, without even looking at H. Thus, the transmitted codeword was 1100110.
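This decoding shortcut takes only a few lines (a sketch of ours, not from the notes): with the columns of H equal to the numbers 1 through 7 in ascending binary order, the syndrome read as a binary number is the error position.

```python
# Decode a binary Hamming code whose parity-check columns are the
# numbers 1..n written in ascending binary order.
r, n = 3, 7
H = [[(j >> (r - 1 - i)) & 1 for j in range(1, n + 1)] for i in range(r)]

def correct(y):
    s = [sum(H[i][j] * y[j] for j in range(n)) % 2 for i in range(r)]
    pos = int("".join(map(str, s)), 2)   # read the syndrome as binary
    if pos:
        y = y.copy()
        y[pos - 1] ^= 1                  # flip the erroneous bit
    return y

print(correct([1, 1, 1, 0, 1, 1, 0]))    # recovers 1100110
```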

Remark: For nonbinary Hamming codes, we need to compare the computed syn-

drome with all nonzero multiples of the columns of the parity-check matrix.


• A parity-check matrix for Ham(2, 3) is

    H = [ 0 1 1 1 ]
        [ 1 0 1 2 ].

If the vector 2020, which has syndrome [2, 1]^t = 2[1, 2]^t, is received and at most a single digit is in error, we see that an error of 2 has occurred in the last position and decode the vector as x = y − e = 2020 − 0002 = 2021.

• A parity-check matrix for Ham(3, 3) is

    H = [ 0 0 0 0 1 1 1 1 1 1 1 1 1 ]
        [ 0 1 1 1 0 0 0 1 1 1 2 2 2 ]
        [ 1 0 1 2 0 1 2 0 1 2 0 1 2 ].

If the vector 2000 0000 00001 is received and at most a single error has occurred, then from the syndrome [1, 2, 1]^t we see that an error of 1 has occurred in the second-last position, so the transmitted vector was 2000 0000 00021.
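The multiple-matching procedure from the preceding remark can be sketched as follows (our own code; `decode_one_error` is an invented name):

```python
# Single-error decoding for a nonbinary Hamming code: compare the
# syndrome with every nonzero scalar multiple of every column of H.

def decode_one_error(H, y, q):
    r, n = len(H), len(H[0])
    s = [sum(H[i][j] * y[j] for j in range(n)) % q for i in range(r)]
    if all(v == 0 for v in s):
        return list(y)
    for j in range(n):
        for e in range(1, q):
            if s == [e * H[i][j] % q for i in range(r)]:
                x = list(y)
                x[j] = (x[j] - e) % q
                return x
    return None

H23 = [[0, 1, 1, 1],
       [1, 0, 1, 2]]
print(decode_one_error(H23, [2, 0, 2, 0], 3))   # 2020 -> 2021

H33 = [[0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 2, 2, 2],
       [1, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]]
y = [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(decode_one_error(H33, y, 3))              # error of 1 in position 12
```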

The following theorem establishes that Hamming codes can always correct single

errors, as we saw in the above examples, and also that they are perfect.

Theorem 3.1 (Hamming Codes are Perfect): Every Ham(r, q) code is perfect and

has distance 3.

Proof: Since any two columns of H are linearly independent, we know from Theorem 2.2 that Ham(r, q) has a distance of at least 3, so it can correct single errors. The distance cannot be any greater than 3 because the nonzero columns

    [0, ..., 0, 0, 1]^t,  [0, ..., 0, 1, 0]^t,  [0, ..., 0, 1, 1]^t

are linearly dependent.

Furthermore, we know that Ham(r, q) has M = q^k = q^(n−r) codewords, so the sphere-packing bound

    q^(n−r) (1 + n(q − 1)) = q^(n−r) (1 + q^r − 1) = q^n

is perfectly achieved.

Corollary 3.1.1 (Hamming Size): For any integer r ≥ 2, we have A_2(2^r − 1, 3) = 2^(2^r − 1 − r).

• Thus A_2(3, 3) = 2, A_2(7, 3) = 16, A_2(15, 3) = 2^11 = 2048, and A_2(31, 3) = 2^26.
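These values follow from the corollary; a short check of ours (not from the notes) of the sphere-packing equality behind them:

```python
# Verify A_2(2^r - 1, 3) = 2^(2^r - 1 - r) via the sphere-packing
# equality M(1 + n) = 2^n for single-error-correcting binary codes.
for r in range(2, 6):
    n = 2**r - 1
    M = 2**(n - r)
    assert M * (1 + n) == 2**n
    print(n, M)    # (3, 2), (7, 16), (15, 2048), (31, 67108864)
```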


Problem 3.4: Determine the number α_i of coset leaders of weight i for Ham(r, 2), for each i = 0, ..., n.

We know that the Hamming code is perfect and has minimum distance 3. The error vectors that can be corrected by a Hamming code are precisely those vectors of weight one or less. These vectors fill F_2^n completely, where n = 2^r − 1. Consequently, the coset weights are distributed according to

    α_i = C(n, i)  for 0 ≤ i ≤ 1,
    α_i = 0        for i > 1.

That is, α_0 = 1, α_1 = n = 2^r − 1, α_2 = α_3 = ... = α_n = 0. Note that the total number of cosets is α_0 + α_1 = 2^r = 2^(n−k) and each of them contains 2^k vectors, where k = n − r.

Problem 3.5: For all r ∈ N, describe how to construct from Ham(r, 2) a code of length n = 2^r with minimum distance d = 4 that contains M = 2^(2^r − 1 − r) codewords. Prove that the minimum distance of your code is 4 and that M is the maximum number of possible codewords for these parameters.

Extend the Hamming code Ham(r, 2), with length n − 1 = 2^r − 1, M = 2^(n−1−r) = 2^(2^r − 1 − r), and distance 3 by adding a parity check to produce a code with n = 2^r but still M codewords. Since the parity check guarantees that the weight of all extended codewords is even, we know that the distance between any two of these codewords x and y is w(x − y) = w(x) + w(y) − 2w(xy), which is even. Hence the minimum distance of the extended code, which is at least 3 and certainly no more than 4, must in fact be 4. We also know that the Hamming code is perfect. The extended Hamming code is not perfect, but we know by Corollary 1.3.1 that the maximum number of codewords for the parameters (n, 4) is the same as the maximum number M of codewords for the parameters (n − 1, 3).
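For r = 3 this construction can be checked directly by brute force (a sketch of ours, not part of the notes):

```python
# Extend Ham(3,2) by an overall parity bit and verify that the minimum
# distance increases from 3 to 4.
from itertools import product

r, n = 3, 7
H = [[(j >> (r - 1 - i)) & 1 for j in range(1, n + 1)] for i in range(r)]

code = [v for v in product((0, 1), repeat=n)
        if all(sum(H[i][j] * v[j] for j in range(n)) % 2 == 0
               for i in range(r))]
ext = [v + (sum(v) % 2,) for v in code]       # append the parity check

dmin = min(sum(a != b for a, b in zip(x, y))
           for x in ext for y in ext if x != y)
print(len(ext), dmin)    # 16 codewords, minimum distance 4
```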

Chapter 4

Golay Codes

We saw in the last chapter that the linear Hamming codes are nontrivial perfect

codes.

Q. Are there any other nontrivial perfect codes?

A. Yes, two other linear perfect codes were found by Golay in 1949. In addition,

several nonlinear perfect codes are known that have the same n, M, and d

parameters as Hamming codes.

The condition for a code to be perfect is that its n, M, and d values satisfy the sphere-packing bound

    M Σ_{k=0}^{t} C(n, k) (q − 1)^k = q^n,    (4.1)

with d = 2t + 1. Golay found three other possible integer triples (n, M, d) that do not correspond to the parameters of a Hamming or trivial perfect code. They are (23, 2^12, 7) and (90, 2^78, 5) for q = 2 and (11, 3^6, 5) for q = 3. It turns out that there do indeed exist linear binary [23, 12, 7] and ternary [11, 6, 5] codes; these are known as Golay codes. But, as we shall soon see, it is impossible for linear or nonlinear (90, 2^78, 5) codes to exist.

Problem 4.1: Show that the (n, M, d) triples (23, 2^12, 7), (90, 2^78, 5) for q = 2, and (11, 3^6, 5) for q = 3 satisfy the sphere-packing bound (1.1).
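A numerical check of these three triples (our own sketch; essentially Problem 4.1 done by machine):

```python
# Check that each (n, M, d) triple satisfies the sphere-packing bound
# M * sum_{k=0}^{t} C(n,k) (q-1)^k = q^n with d = 2t + 1.
from math import comb

def perfect(n, M, d, q):
    t = (d - 1) // 2
    return M * sum(comb(n, k) * (q - 1)**k for k in range(t + 1)) == q**n

print(perfect(23, 2**12, 7, 2))   # True
print(perfect(90, 2**78, 5, 2))   # True
print(perfect(11, 3**6, 5, 3))    # True
```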

Remark: In view of Theorem 1.3, a convenient way of ﬁnding a binary [23, 12, 7]

Golay code is to construct ﬁrst the extended Golay [24, 12, 8] code, which is just

the [23, 12, 7] Golay code augmented with a ﬁnal parity check in the last position

(such that the weight of every codeword is even).


The extended binary Golay [24, 12, 8] code C_24 can be generated by the matrix G_24 defined by

    [ 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 ]
    [ 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 1 0 0 0 1 0 ]
    [ 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 1 0 1 ]
    [ 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1 1 ]
    [ 0 0 0 0 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 1 0 1 1 0 ]
    [ 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 0 1 0 1 1 0 1 ]
    [ 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 1 0 1 1 0 1 1 ]
    [ 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 1 1 0 1 1 1 ]
    [ 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 1 0 1 1 1 0 ]
    [ 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 1 1 0 1 1 1 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 0 1 1 1 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 1 1 0 0 0 1 ].

Remark: We can express G_24 = [I_12 | A], where A is a 12 × 12 symmetric matrix; that is, A^t = A.

Problem 4.2: Show that u·v = 0 for all rows u and v of G_24. Hint: note that the first row of G_24 is orthogonal to itself. Then establish that u·v = 0 when u is the second row and v is any row of G_24. Then use the cyclic symmetry of the rows of the matrix A′ formed by deleting the first column and first row of A.

Remark: The above exercise establishes that the rows of G_24 are orthogonal to each other. Noting that the weight of the rows of G_24 is either 12 or 8, we make use of the following result.

Deﬁnition: A linear code C is self-orthogonal if C ⊂ C^⊥. A linear code C is self-dual if C = C^⊥.

Problem 4.3: Let C be a binary linear code with generator matrix G. If each row of G is orthogonal to itself and all other rows and has weight divisible by 4, prove that C ⊂ C^⊥ and that the weight of every codeword in C is a multiple of 4.

Remark: Since k = 12 and n − k = 12, the linear spaces C_24 and C_24^⊥ have the same dimension. Hence C_24 ⊂ C_24^⊥ implies C_24 = C_24^⊥. This means that the parity check matrix H_24 = [A | I_12] for C_24 is also a generator matrix for C_24!

We are now ready to show that the distance of C_24 is 8 and, consequently, that the binary Golay [23, 12] code generated by the first 23 columns of G_24 must have minimum distance either 7 or 8. But since the third row of this reduced generator matrix is a codeword of weight 7, we can then be sure that the minimum distance is exactly 7.


Theorem 4.1 (Extended Golay [24, 12] code): The [24, 12] code generated by G_24 has minimum distance 8.

Proof: We know that the weight of the code generated by G_24 must be divisible by 4. Since both G_24 and H_24 are generator matrices for the code, any codeword can be expressed either as a linear combination of the rows of G_24 or as a linear combination of the rows of H_24. We now show that a codeword x ∈ C_24 cannot have weight 4. It is not possible for all of the left-most twelve bits of x to be 0 if x is some nontrivial linear combination of the rows of G_24. Likewise, it is not possible for all of the right-most twelve symbols of x to be 0 if x is some nontrivial linear combination of the rows of H_24. If exactly one of the left-most (right-most) twelve bits of x were 1, then x would be identical to a row of G_24 (H_24), none of which has weight 4. The only possible codeword of weight 4 is therefore a sum of two rows of G_24, but it is easily seen (again using the cyclic symmetry of A′) that no two rows of G_24 differ in only four positions. Since the weight of every codeword in C_24 must be a multiple of 4, we now know that C_24 must have a minimum distance of at least 8. In fact, since the second row of G_24 is a codeword of weight 8, we see that the minimum distance of C_24 is exactly 8.
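The minimum distance can also be confirmed by brute force over all 2^12 codewords (our own check, not part of the notes; the rows of A are transcribed from G_24 above):

```python
# Enumerate the extended Golay code C_24 from G_24 = [I_12 | A] and
# verify that every nonzero codeword has weight >= 8 and divisible by 4.
from itertools import product

A = ["011111111111", "111011100010", "110111000101", "101110001011",
     "111100010110", "111000101101", "110001011011", "100010110111",
     "100101101110", "101011011100", "110110111000", "101101110001"]
G = [[1 if i == j else 0 for j in range(12)] + [int(b) for b in A[i]]
     for i in range(12)]

weights = set()
for m in product((0, 1), repeat=12):
    c = [sum(m[i] * G[i][j] for i in range(12)) % 2 for j in range(24)]
    if any(c):
        weights.add(sum(c))
print(min(weights))                      # 8
print(all(w % 4 == 0 for w in weights))  # True
```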

Problem 4.4: Show that the ternary Golay [11, 6] code generated by the first 11 columns of the generator matrix

    G_12 = [ 1 0 0 0 0 0 0 1 1 1 1 1 ]
           [ 0 1 0 0 0 0 1 0 1 2 2 1 ]
           [ 0 0 1 0 0 0 1 1 0 1 2 2 ]
           [ 0 0 0 1 0 0 1 2 1 0 1 2 ]
           [ 0 0 0 0 1 0 1 2 2 1 0 1 ]
           [ 0 0 0 0 0 1 1 1 2 2 1 0 ]

has minimum distance 5.

Theorem 4.2 (Nonexistence of binary (90, 2^78, 5) codes): There exist no binary (90, 2^78, 5) codes.

Proof: Suppose that a binary (90, 2^78, 5) code C exists. By Lemma 1.2, without loss of generality we may assume that 0 ∈ C. Let Y be the set of vectors in F_2^90 of weight 3 that begin with two 1s. Since there are 88 possible positions for the third one, |Y| = 88. From Eq. (4.1), we know that C is perfect, with d(C) = 5. Thus each y ∈ Y is within a distance 2 from a unique codeword x. But then from the triangle inequality,

    2 = d(C) − w(y) ≤ w(x) − w(y) ≤ w(x − y) ≤ 2,

from which we see that w(x) = 5 and d(x, y) = w(x − y) = 2. This means that x must have a 1 in every position that y does.


Let X be the set of all codewords of weight 5 that begin with two 1s. We know that for each y ∈ Y there is a unique x ∈ X such that d(x, y) = 2. That is, there are exactly |Y| = 88 elements in the set {(x, y) : x ∈ X, y ∈ Y, d(x, y) = 2}. But each x ∈ X contains exactly three ones after the first two positions. Thus, for each x ∈ X there are precisely three vectors y ∈ Y such that d(x, y) = 2. That is, 3|X| = 88. This is a contradiction, since |X| must be an integer.

Remark: In 1973, Tietäväinen, based on work by van Lint, proved that any nontrivial perfect code over the field F_q must either have the parameters ((q^r − 1)/(q − 1), q^(n−r), 3) of a Hamming code, the parameters (23, 2^12, 7) of the binary Golay code, or the parameters (11, 3^6, 5) of the ternary Golay code.

Problem 4.5: Consider the extended ternary (q = 3) Golay [12, 6, 6] code C_12 generated by

    G_12 = [ 1 0 0 0 0 0 0 1 1 1 1 1 ]
           [ 0 1 0 0 0 0 1 0 1 2 2 1 ]
           [ 0 0 1 0 0 0 1 1 0 1 2 2 ]
           [ 0 0 0 1 0 0 1 2 1 0 1 2 ]
           [ 0 0 0 0 1 0 1 2 2 1 0 1 ]
           [ 0 0 0 0 0 1 1 1 2 2 1 0 ].

(a) Use the fact that C_12 is self-orthogonal (C_12 ⊂ C_12^⊥) to find a parity-check matrix for C_12.

Since n − k = 12 − 6 = 6 = k, we know that the linear subspaces C_12^⊥ and C_12 have the same dimension, so C_12 = C_12^⊥. Hence, G_12 itself is a parity check matrix for C_12.

(b) Decode the received vector y = 010 000 010 101, assuming that at most two

errors have occurred.

The syndrome of this vector is

    [0, 1, 1, 1, 1, 0]^t = [0, 1, 1, 1, 1, 1]^t + 2 [0, 0, 0, 0, 0, 1]^t.

Thus an error of 1 has occurred in position 7 and an error of 2 has occurred in position 6. That is, the error vector is e = 000 002 100 000, so the transmitted vector was x = y − e = 010 001 210 101.

Chapter 5

Cyclic Codes

Cyclic codes are an important class of linear codes for which the encoding and de-

coding can be eﬃciently implemented using shift registers. In the binary case, shift

registers are built out of two-state storage elements known as ﬂip-ﬂops and arith-

metic devices called binary adders that output the sum of their two binary inputs,

modulo 2.

Many common linear codes, including Hamming and Golay codes, have an equiv-

alent cyclic representation.

Deﬁnition: A linear code C is cyclic if

    a_0 a_1 ... a_{n−1} ∈ C ⇒ a_{n−1} a_0 a_1 ... a_{n−2} ∈ C.

Remark: If x is a codeword of a cyclic code C, then all cyclic shifts of x also belong

to C.

• The binary linear code (000, 101, 011, 110) is cyclic.

• The (7, 16, 3) perfect code in Chapter 1, which we now know is equivalent to

Ham(3, 2), is cyclic.

• The binary linear code (0000, 1001, 0110, 1111) is not cyclic. However, upon inter-

changing the third and fourth positions, we note that it is equivalent to the linear

code (0000, 1010, 0101, 1111), which is cyclic.

It is convenient to identify a codeword a_0 a_1 ... a_{n−1} in a cyclic code C with the polynomial

    c(x) = a_0 + a_1 x + a_2 x^2 + ... + a_{n−1} x^{n−1}.

Then a_{n−1} a_0 a_1 ... a_{n−2} corresponds to the polynomial

    a_{n−1} + a_0 x + a_1 x^2 + ... + a_{n−2} x^{n−1} = x c(x) (mod x^n − 1),

since x^n = 1 (mod x^n − 1). Thus, a linear code C is cyclic iff

    c(x) ∈ C ⇒ x c(x) (mod x^n − 1) ∈ C.

That is, multiplication by x (modulo the polynomial x^n − 1) corresponds to a cyclic shift.
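In coefficient form this identification is just a rotation of the list (a tiny sketch of ours, not from the notes):

```python
# Multiplying c(x) by x modulo x^n - 1 cyclically shifts the
# coefficient vector [a0, a1, ..., a_{n-1}].

def times_x(c):
    return [c[-1]] + c[:-1]

c = [1, 0, 1, 1]          # 1 + x^2 + x^3, i.e. the word 1011
print(times_x(c))          # [1, 1, 0, 1], the cyclic shift 1101
```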

Deﬁnition: The polynomial ring F_q[x] is the set of all polynomials P(x) with coefficients in F_q.

Deﬁnition: The residue class ring R_q^n = F_q[x]/(x^n − 1) is the set of all polynomial remainders obtained by long division of polynomials in F_q[x] by x^n − 1. That is, R_q^n is the set of all polynomials of degree less than n.

Remark: A cyclic code in F_q^n can be thought of as a particular subset of the residue class polynomial ring R_q^n. In fact, the following theorem shows that a cyclic code C is an ideal of R_q^n.

Theorem 5.1 (Cyclic Codes are Ideals): A linear code C in R_q^n is cyclic ⇐⇒

    c(x) ∈ C, r(x) ∈ R_q^n ⇒ r(x)c(x) ∈ C.

Proof: Suppose C is a cyclic code in R_q^n. We know that multiplication of a codeword c(x) in C by x corresponds to a cyclic shift of its coefficients, and since C is linear, we know that c(x) ∈ C ⇒ αc(x) ∈ C for all α ∈ F_q. We thus see by induction that

    c(x) ∈ C ⇒ r(x)c(x) ∈ C for all r(x) ∈ R_q^n,    (5.1)

where the multiplication is performed modulo x^n − 1. Conversely, suppose that C satisfies Eq. (5.1). Taking r(x) = x shows that C is cyclic.

Deﬁnition: The principal ideal

    ⟨g(x)⟩ = {r(x)g(x) : r(x) ∈ R_q^n}

of R_q^n is the cyclic code generated by the polynomial g(x).

Problem 5.1: Verify that ⟨g(x)⟩ is an ideal.

Remark: The next theorem states that every ideal in R_q^n is a principal ideal (i.e. R_q^n is a Principal Ideal Domain).

Deﬁnition: A polynomial is monic if its highest-degree coeﬃcient is 1.

Theorem 5.2 (Generator Polynomial): Let C be a nonzero q-ary cyclic code in R_q^n. Then

(i) there exists a unique monic polynomial g(x) of smallest degree in C;

(ii) C = ⟨g(x)⟩;

(iii) g(x) is a factor of x^n − 1 in F_q[x].

Proof:

(i) If g(x) and h(x) are both monic polynomials in C of smallest degree, then g(x) − h(x) is a polynomial in C of smaller degree. If g(x) − h(x) ≠ 0, a certain scalar multiple of g(x) − h(x) would be a monic polynomial in C of degree smaller than deg g, which is a contradiction. Hence g(x) = h(x).

(ii) Theorem 5.1 shows that ⟨g(x)⟩ ⊂ C, so it only remains to show that C ⊂ ⟨g(x)⟩. Suppose c(x) ∈ C. Using long division, we can express c(x) = q(x)g(x) + r(x), where deg r < deg g. But since c(x) and q(x)g(x) are both in the cyclic code C, we know by the linearity of C that r(x) = c(x) − q(x)g(x) is also in C. Hence r(x) = 0 (otherwise a scalar multiple of r(x) would be a monic polynomial in C of degree smaller than deg g). That is, c(x) ∈ ⟨g(x)⟩.

(iii) By long division, we may express x^n − 1 = q(x)g(x) + r(x), where deg r < deg g. But then r(x) = −q(x)g(x) (mod x^n − 1) implies that r(x) ∈ ⟨g(x)⟩. By the minimality of deg g, we see that r(x) = 0; that is, x^n − 1 is a multiple of g(x).

Deﬁnition: The monic polynomial of least degree in Theorem 5.2 is called the

generator polynomial of C.

Theorem 5.3 (Lowest Generator Polynomial Coefficient): Let g(x) = g_0 + g_1 x + ... + g_r x^r be the generator polynomial of a cyclic code. Then g_0 ≠ 0.

Proof: Suppose g_0 = 0. Then x^{n−1} g(x) = x^{−1} g(x) is a codeword of degree r − 1, contradicting the minimality of deg g.

Theorem 5.4 (Cyclic Generator Matrix): A cyclic code with generator polynomial g(x) = g_0 + g_1 x + ... + g_r x^r has dimension n − r and generator matrix

    G = [ g_0 g_1 g_2 ... g_r  0   0  ...  0  ]
        [  0  g_0 g_1 g_2 ... g_r  0  ...  0  ]
        [  0   0  g_0 g_1 g_2 ... g_r ...  0  ]
        [  :                               :  ]
        [  0   0  ...  0  g_0 g_1 g_2 ... g_r ].

Proof: Let c(x) be a codeword in a cyclic code C with generator g(x). From Theorem 5.2, we know that

    c(x) = q(x)g(x)

for some polynomial q(x). Note that deg q < n − r since deg c < n. That is,

    c(x) = (q_0 + q_1 x + ... + q_{n−r−1} x^{n−r−1}) g(x) = q_0 g(x) + q_1 x g(x) + ... + q_{n−r−1} x^{n−r−1} g(x),

which is a linear combination of the n − r rows g(x), xg(x), x^2 g(x), ..., x^{n−r−1} g(x) of G. The diagonal of nonzero g_0s next to a lower-triangular zero submatrix ensures that the rows of G are linearly independent. Thus, the span of the rows of G is the n − r dimensional code C.
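The matrix of Theorem 5.4 is easy to build from the coefficient list of g(x) (a sketch of ours; the function name is invented):

```python
# Build the (n-r) x n generator matrix whose rows are the coefficient
# vectors of g(x), x g(x), ..., x^{n-r-1} g(x).

def cyclic_generator_matrix(g, n):
    r = len(g) - 1                     # deg g
    return [[0] * i + g + [0] * (n - r - 1 - i) for i in range(n - r)]

# g(x) = 1 + x + x^3 with n = 7:
for row in cyclic_generator_matrix([1, 1, 0, 1], 7):
    print(row)
```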

Remark: Together, Theorems 5.1 and 5.2 say that an [n, k] code is cyclic ⇐⇒ it is generated by a factor of x^n − 1. The following lemma is useful in finding these factors.

Lemma 5.1 (Linear Factors): A polynomial c(x) has a linear factor x − a ⇐⇒

c(a) = 0.

Proof: Exercise.

Deﬁnition: A polynomial is said to be irreducible in F_q[x] if it cannot be factored into polynomials of smaller degree.

Lemma 5.2 (Irreducible 2nd or 3rd Degree Polynomials): A polynomial c(x) in F_q[x] of degree 2 or 3 is irreducible ⇐⇒ c(a) ≠ 0 for all a ∈ F_q.

Proof: c(x) can be factored into polynomials of smaller degree ⇐⇒ it has at least one linear factor (x − a) ⇐⇒ c(a) = 0, by Lemma 5.1.

• Suppose we wish to find all ternary cyclic codes of length n = 4. The generators for such codes must be factors of x^4 − 1 in the ring F_3[x]. Since 1 is a root of the equation x^4 − 1 = 0, we know that (x − 1) is a factor and hence

    x^4 − 1 = (x − 1)(x^3 + x^2 + x + 1).

By Lemma 5.2, the factor x^3 + x^2 + x + 1 is not irreducible because it has a root at a = 2 = −1 in F_3. Using long division, we obtain

    x^4 − 1 = (x − 1)(x + 1)(x^2 + 1).

Since any combination of these three irreducible factors can be used to construct a generator polynomial g(x) for a cyclic code, there are a total of 2^3 = 8 ternary cyclic codes of length 4, as illustrated in Table 5.1. Upon examining the weights of the rows of the possible generator matrices, we see that the generated codes either have minimum distance less than or equal to 2 or else equal to 4. Hence, it is not possible to have a cyclic code of length 4 and minimum distance 3. In particular, Ham(2, 3), for which n = (3^2 − 1)/(3 − 1) = 4, cannot be cyclic. Thus, not all Hamming codes have a cyclic representation.


    g(x)                                    G

    1                                       [ 1 0 0 0 ]
                                            [ 0 1 0 0 ]
                                            [ 0 0 1 0 ]
                                            [ 0 0 0 1 ]

    x − 1                                   [ −1  1  0  0 ]
                                            [  0 −1  1  0 ]
                                            [  0  0 −1  1 ]

    x + 1                                   [ 1 1 0 0 ]
                                            [ 0 1 1 0 ]
                                            [ 0 0 1 1 ]

    x^2 + 1                                 [ 1 0 1 0 ]
                                            [ 0 1 0 1 ]

    (x − 1)(x + 1) = x^2 − 1                [ −1  0  1  0 ]
                                            [  0 −1  0  1 ]

    (x − 1)(x^2 + 1) = x^3 − x^2 + x − 1    [ −1  1 −1  1 ]

    (x + 1)(x^2 + 1) = x^3 + x^2 + x + 1    [ 1 1 1 1 ]

    x^4 − 1 = 0                             [ 0 0 0 0 ]

Table 5.1: Generator polynomial g(x) and corresponding generator matrix G for all possible ternary cyclic codes of length 4.
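The factorization driving Table 5.1 can be verified with a few lines of polynomial arithmetic (a sketch of ours, not from the notes; coefficients are stored lowest degree first):

```python
# Check x^4 - 1 = (x - 1)(x + 1)(x^2 + 1) over F_3.

def polymul(a, b, q):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

p = polymul(polymul([-1, 1], [1, 1], 3), [1, 0, 1], 3)
print(p)    # [2, 0, 0, 0, 1], i.e. -1 + x^4 (mod 3)
```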


An easy way to find the parity check matrix for a cyclic [n, k] code (without requiring that we first put G given by Theorem 5.4 in standard form) is to first construct the check polynomial h(x) of C from its generator polynomial g(x), where h(x) satisfies

    x^n − 1 = g(x)h(x)

in F_q[x]. Since g is monic and has degree n − k, we see that h is monic and has degree k.

Theorem 5.5 (Cyclic Check Polynomial): An element c(x) of R_q^n is a codeword of the cyclic code with check polynomial h ⇐⇒ c(x)h(x) = 0 in R_q^n.

Proof:

"⇒" If c(x) is a codeword, then in R_q^n we have

    c(x) = a(x)g(x) ⇒ c(x)h(x) = a(x)g(x)h(x) = a(x)(x^n − 1) = 0 (mod x^n − 1).

"⇐" We can express any polynomial c(x) in R_q^n as c(x) = q(x)g(x) + r(x), where deg r < deg g = n − k. If c(x)h(x) = 0 then

    r(x)h(x) = c(x)h(x) − q(x)g(x)h(x) = 0 (mod x^n − 1).

But deg r(x)h(x) < n − k + k = n, so r(x)h(x) = 0 in F_q[x], not just in R_q^n. If r(x) ≠ 0, consider its highest degree coefficient a ≠ 0. Then since h is monic, the coefficient of the highest degree term of the product r(x)h(x) is a·1 = a, which must vanish since r(x)h(x) = 0; this contradicts a ≠ 0. Thus r(x) = 0 and so c(x) is a codeword: c(x) = q(x)g(x) ∈ ⟨g(x)⟩.

Theorem 5.6 (Cyclic Parity Check Matrix): A cyclic code with check polynomial h(x) = h_0 + h_1 x + ... + h_k x^k has dimension k and parity check matrix

    H = [ h_k h_{k−1} h_{k−2} ... h_0  0   0  ...  0  ]
        [  0  h_k h_{k−1} h_{k−2} ... h_0  0  ...  0  ]
        [  0   0  h_k h_{k−1} h_{k−2} ... h_0 ...  0  ]
        [  :                                       :  ]
        [  0   0  ...  0  h_k h_{k−1} h_{k−2} ... h_0 ].

Proof: Since the degree of the generator polynomial g is r = n − k, by Theorem 5.4, the dimension of the code must be k. From Theorem 5.5, we know that a codeword c(x) = c_0 + c_1 x + ... + c_{n−1} x^{n−1} must satisfy c(x)h(x) = 0. In particular, the coefficients of x^k, x^{k+1}, ..., x^{n−1} of the product c(x)h(x) must be zero; for ℓ = k, k + 1, ..., n − 1 we then have

    0 = Σ_{i+j=ℓ} c_i h_j.

But then, since each of these equations is one of the n − k rows of the matrix equation

    [ h_k h_{k−1} h_{k−2} ... h_0  0  ...  0 ] [ c_0     ]   [ 0 ]
    [  0  h_k h_{k−1} h_{k−2} ... h_0 ...  0 ] [ c_1     ] = [ 0 ]
    [  :                                   : ] [ c_2     ]   [ : ]
    [  0   0  ...  0  h_k h_{k−1} ...   h_0  ] [ :       ]   [ 0 ]
                                               [ c_{n−1} ]

the codewords are orthogonal to all cyclic shifts of the vector h_k h_{k−1} h_{k−2} ... h_0 00 ... 0. The codewords are thus orthogonal to all linear combinations of the rows of H. This means that C^⊥ contains the span of the rows of H. But h_k = 1, so we see that H has rank n − k and hence generates exactly the linear subspace C^⊥. That is, H is a parity check matrix for the code with check polynomial h(x).
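As a concrete instance (our own sketch, not from the notes), take the binary cyclic code of length 7 with g(x) = 1 + x + x^3; then h(x) = (x^7 − 1)/g(x) = 1 + x + x^2 + x^4, and H is built from the reversed coefficients of h(x):

```python
# Build the parity-check matrix of Theorem 5.6 from the check
# polynomial h(x) and confirm that every row of G is orthogonal
# to every row of H over F_2.

def cyclic_parity_check(h, n):
    k = len(h) - 1                          # deg h
    row = list(reversed(h))                 # h_k h_{k-1} ... h_0
    return [[0] * i + row + [0] * (n - k - 1 - i) for i in range(n - k)]

H = cyclic_parity_check([1, 1, 1, 0, 1], 7)   # h(x) = 1 + x + x^2 + x^4
G = [[1, 1, 0, 1, 0, 0, 0],                   # rows: g(x), x g(x), ...
     [0, 1, 1, 0, 1, 0, 0],
     [0, 0, 1, 1, 0, 1, 0],
     [0, 0, 0, 1, 1, 0, 1]]
print(all(sum(gi * hi for gi, hi in zip(g, h)) % 2 == 0
          for g in G for h in H))             # True: all row pairs orthogonal
```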

Deﬁnition: The reciprocal polynomial h̄(x) of a polynomial

    h(x) = h_0 + h_1 x + ... + h_k x^k

is obtained by reversing the order of its coefficients:

    h̄(x) = x^k h(x^{−1}) = h_0 x^k + h_1 x^{k−1} + ... + h_k = h_k + h_{k−1} x + ... + h_0 x^k.

Remark: Since

    h̄(x)ḡ(x) = h̄(x) x^{n−k} g(x^{−1}) = x^n h(x^{−1})g(x^{−1}) = x^n [(x^{−1})^n − 1] = 1 − x^n,

we see that h̄(x) is a factor of x^n − 1. In view of Theorems 5.1, 5.2, and 5.6, this says that C^⊥ is itself a cyclic code, with (monic) generator h_0^{−1} h̄(x).

We are now able to show that all binary Hamming codes have an equivalent cyclic

representation.

Theorem 5.7 (Cyclic Binary Hamming Codes): The binary Hamming code Ham(r, 2)

is equivalent to a cyclic code.

Proof: Let p(x) be an irreducible polynomial of degree r in F_2[x]. By Theorem A.3, F_2[x]/p(x) is a field of order 2^r, and by Theorem A.4 we know that F_2[x]/p(x) can be expressed as the set of distinct elements {0, α^0, α^1, α^2, ..., α^{2^r − 2}} for some primitive element α. We associate each element a_0 + a_1 x + a_2 x^2 + ... + a_{r−1} x^{r−1} ∈ F_2[x]/p(x) with the column vector

    [a_0, a_1, ..., a_{r−1}]^t.


Let n = 2^r − 1. The r × n matrix

    H = [ 1 α α^2 ... α^{n−1} ]

is seen to be the parity check matrix for C = Ham(r, 2) since its columns are precisely the distinct nonzero vectors of F_2^r. A codeword c(x) = c_0 + c_1 x + ... + c_{n−1} x^{n−1} in this code must then satisfy the vector equation c_0 + c_1 α + c_2 α^2 + ... + c_{n−1} α^{n−1} = 0, so that

    C = {c(x) ∈ R_2^n : c(α) = 0 in F_2[x]/p(x)}.

If c(x) ∈ C and r(x) ∈ R_2^n, we have r(α)c(α) = r(α)·0 = 0 in F_2[x]/p(x), so r(x)c(x) is also an element of C. Theorem 5.1 then implies that C is cyclic.

• The irreducible polynomial x^3 + x + 1 in F_2[x] can be used to generate the field F_8 = F_2[x]/(x^3 + x + 1) with 2^3 = 8 elements. Note that F_8 has x as a primitive element since all polynomials in F_2[x] of degree less than 3 can be expressed as powers of x:

    F_8 = {0, 1, x, x^2, x^3 = x + 1, x^4 = x^2 + x, x^5 = x^2 + x + 1, x^6 = x^2 + 1}.

Note that x^7 = x^3 + x = 1; that is, the primitive element has order 7 = 8 − 1. The primitive element x is a root of the primitive polynomial x^3 + x + 1 in F_8.

A parity check matrix for a cyclic version of the Hamming code Ham(3, 2) is thus

    H = [ 1 0 0 1 0 1 1 ]
        [ 0 1 0 1 1 1 0 ]
        [ 0 0 1 0 1 1 1 ].
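This matrix can be reproduced by repeatedly multiplying by x in F_2[x]/(x^3 + x + 1) (a sketch of ours; the function name is invented):

```python
# Generate the columns 1, x, x^2, ..., x^6 of H as 3-bit vectors
# (a0, a1, a2), reducing with x^3 = x + 1.

def times_x_mod(v):
    a0, a1, a2 = v
    return (a2 % 2, (a0 + a2) % 2, a1 % 2)

cols, v = [], (1, 0, 0)                 # start from x^0 = 1
for _ in range(7):
    cols.append(v)
    v = times_x_mod(v)

H = [[c[i] for c in cols] for i in range(3)]
for row in H:
    print(row)
print(v == (1, 0, 0))                    # True: x has order 7
```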

Q. What is the generator polynomial for Ham(r, 2)?

A. The close parallel between Theorem 5.2 and Theorem A.5 when n = p^r − 1 gives us a clue: on comparing these results, we see from Theorem 5.4 that any minimal polynomial of a primitive element β of F_{p^r} is a generator polynomial for a cyclic code of length n (β is a root of x^n − 1 = 0; see Appendix A). In particular, the following corollary to Theorem 5.7 establishes that Ham(r, 2) can be generated by any primitive polynomial of F_{2^r}, which is just a monic irreducible polynomial in F_2[x] having a primitive element as a root.

Corollary 5.7.1 (Binary Hamming Generator Polynomials): Any primitive polynomial of F_{2^r} is a generator polynomial for a cyclic Hamming code Ham(r, 2).

Proof: Let α be a primitive element of F_{2^r}. Its minimal polynomial p(x) is a primitive polynomial of F_{2^r}. From the proof of Theorem 5.7, we see that Ham(r, 2) consists precisely of those polynomials c(x) for which c(α) = 0, for example, p(x) itself. By Theorem A.5, any such polynomial must be a multiple of p(x). That is, Ham(r, 2) ⊂ ⟨p(x)⟩. Moreover, Theorem 5.1 implies that every multiple of the codeword p(x) belongs to the cyclic code Ham(r, 2). Hence Ham(r, 2) = ⟨p(x)⟩.


• Consider the irreducible polynomial p(x) = x^3 + x + 1 in F_2[x]. Since x is a primitive element of F_2[x]/p(x) and p(x) = 0 (mod x^3 + x + 1), we know that p(x) is a primitive polynomial of F_{2^3} = F_2[x]/p(x) and hence Ham(3, 2) = ⟨p(x)⟩. From Theorem 5.4, we can then immediately write down a generator matrix for a cyclic Ham(3, 2) code:

    [ 1 1 0 1 0 0 0 ]
    [ 0 1 1 0 1 0 0 ]
    [ 0 0 1 1 0 1 0 ]
    [ 0 0 0 1 1 0 1 ].

Chapter 6

BCH Codes

For noisy transmission lines, Hamming codes are of limited use because they cannot

correct more than one error. In this chapter, we discuss a class of important and

widely used cyclic codes that can correct multiple errors, developed by R. C. Bose

and D. K. Ray-Chaudhuri (1960) and independently by A. Hocquenghem (1959),

known as Bose–Chaudhuri–Hocquenghem (BCH) codes.

Deﬁnition: Let α be an element of order n in a finite field F_{q^s}. A BCH code of length n and design distance d is a cyclic code generated by the product of the distinct minimal polynomials in F_q[x] of the elements α, α^2, . . . , α^{d−1}.

Remark: Often we take α to be a primitive element of F_{q^s}, so that n = q^s − 1. The resulting BCH code is known as a primitive BCH code. However, it is possible to construct BCH codes over F_{q^s} of length n, where n is any factor of q^s − 1.

Remark: We will establish that a BCH code of odd design distance d has a minimum

distance of at least d, by showing that such a code can correct (d −1)/2 errors.

To encode the message word a_0 a_1 . . . a_{k−1}, we represent it by the polynomial f(x) = Σ_{i=0}^{k−1} a_i x^i and form its product with the generator polynomial g(x), to obtain the codeword c(x) = f(x)g(x).

• For the primitive element α = x of the field F_{2^4} = F_2[x]/(x^4 + x + 1), we can construct a [15, 7] code that can correct two errors, by finding a generator polynomial g(x) that has roots at α, α^2, α^3, and α^4. Such a generator can be created from the product of the minimal polynomials m_1(x) = x^4 + x + 1 of α and m_3(x) = x^4 + x^3 + x^2 + x + 1 of α^3:

g(x) = m_1(x)m_3(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1) = x^8 + x^7 + x^6 + x^4 + 1.

In fact, g(x) has even more roots than prescribed, namely at α, α^2, α^4, α^8, α^3, α^6, α^12, and α^9. Once we have shown that this code can correct two errors, we will know that its minimum distance is exactly 5 since the codeword g(x) has weight 5.
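The product m_1(x)m_3(x) above can be checked with a few lines of carry-less arithmetic; this sketch stores F_2[x] polynomials as bit masks (bit i is the coefficient of x^i).

```python
# Carry-less (mod-2) polynomial multiplication of the minimal polynomials
# m1(x) = x^4 + x + 1 and m3(x) = x^4 + x^3 + x^2 + x + 1, reproducing the
# generator g(x) = x^8 + x^7 + x^6 + x^4 + 1 computed above.

def polymul_gf2(a, b):
    """Multiply two F_2[x] polynomials stored as bit masks."""
    r = 0
    i = 0
    while a >> i:
        if (a >> i) & 1:
            r ^= b << i
        i += 1
    return r

m1 = 0b10011       # x^4 + x + 1
m3 = 0b11111       # x^4 + x^3 + x^2 + x + 1
g = polymul_gf2(m1, m3)
print(bin(g))      # 0b111010001, i.e. x^8 + x^7 + x^6 + x^4 + 1
```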


Remark: In the binary case q = 2, the generator of a BCH code is just the product

of the distinct minimal polynomials of the odd powers, from 1 to d − 1, of the

primitive element.

Problem 6.1: Show that a polynomial c(x) belongs to a BCH code with design distance d ⇐⇒ c(α) = c(α^2) = . . . = c(α^{d−1}) = 0.

We now describe the decoding procedure for BCH codes. To keep the notation

simple we begin by illustrating the procedure ﬁrst for the binary case, where q = 2.

Suppose that y(x) is received rather than c(x) and that t errors have occurred. Then

the error polynomial e(x) = y(x) − c(x) can be written as e(x) = x^{ℓ_1} + x^{ℓ_2} + . . . + x^{ℓ_t} for some unknown powers ℓ_1, ℓ_2, . . . , ℓ_t. We then compute the syndrome S_1 by substituting α into y(x),

S_1 = y(α) = c(α) + e(α) = e(α) = e_1 + . . . + e_t,

where e_i = α^{ℓ_i} for i = 1, 2, . . . , t. Likewise, we evaluate

S_2 = y(α^2) = c(α^2) + e(α^2) = e(α^2) = e_1^2 + . . . + e_t^2,
S_3 = y(α^3) = c(α^3) + e(α^3) = e(α^3) = e_1^3 + . . . + e_t^3,
. . .
S_{d−1} = y(α^{d−1}) = c(α^{d−1}) + e(α^{d−1}) = e(α^{d−1}) = e_1^{d−1} + . . . + e_t^{d−1}.

The decoding problem now amounts to determining which value of t and which choices of the field elements e_1, e_2, . . . , e_t are consistent with the above equations. Once these are found, from a table of the elements of F_{q^s}, we can determine the corresponding powers ℓ_1, ℓ_2, . . . , ℓ_t such that e_i = α^{ℓ_i}. These powers tell us directly which bits we need to toggle. To find a solution to the above equations, the following definition will be helpful.

Deﬁnition: The error locator polynomial is

σ(x) = (e_1 x − 1)(e_2 x − 1) . . . (e_t x − 1) = b_t x^t + b_{t−1} x^{t−1} + . . . + b_1 x + 1.

Notice that the roots of σ(x) are just the inverses of the e_i, i = 1, 2, . . . , t.

To understand how the above syndrome equations are solved, it will be helpful to first discuss the case where d = 5 and t ≤ 2 errors have occurred. We define e_i = 0 for i > t and write

S_1 = e_1 + e_2,
S_2 = e_1^2 + e_2^2,
S_3 = e_1^3 + e_2^3,
S_4 = e_1^4 + e_2^4.

The error locator polynomial is σ(x) = b_2 x^2 + b_1 x + 1. Since σ(e_i^{−1}) = 0 for i = 1, 2, we know that

0 = e_1^3 σ(e_1^{−1}) = e_1^3 (b_2 e_1^{−2} + b_1 e_1^{−1} + 1) = b_2 e_1 + b_1 e_1^2 + e_1^3

and

0 = e_2^3 σ(e_2^{−1}) = b_2 e_2 + b_1 e_2^2 + e_2^3.

On adding these equations, we obtain

0 = b_2 (e_1 + e_2) + b_1 (e_1^2 + e_2^2) + (e_1^3 + e_2^3),

i.e.

S_1 b_2 + S_2 b_1 = −S_3.

If for each i we had multiplied σ(e_i^{−1}) = 0 by e_i^4 instead of e_i^3 and added the resulting equations, we would have obtained

S_2 b_2 + S_3 b_1 = −S_4.

To find b_1 and b_2, we only need to solve the system of equations

    [ S_1  S_2 ] [ b_2 ]     [ S_3 ]
    [ S_2  S_3 ] [ b_1 ] = − [ S_4 ] .     (6.1)

(Of course, in the binary case, the minus sign can be omitted.) If the coefficient matrix in Eq. (6.1) has rank 0, then S_1 = S_2 = S_3 = S_4 = 0 and hence e_1 + e_2 = 0, so that e_1 = e_2. This would imply that ℓ_1 = ℓ_2, in which case e(x) = 0; that is, no error has occurred. That is, the system of equations will have rank 0 ⇐⇒ no errors have occurred.

Suppose that the coefficient matrix has rank 1. Since q = 2, we know that S_2 = S_1^2. Note that S_1 ≠ 0, for otherwise the first equation would imply that S_3 = 0 and the rank of the coefficient matrix would be 0. Since the determinant S_1 S_3 − S_2^2 = 0, we deduce S_3 = S_1^3. But

e_1^3 + e_2^3 = (e_1 + e_2)^3 ⇒ 0 = 3 e_1 e_2 (e_1 + e_2) = 3 e_1 e_2 S_1.

This implies that e_2 = 0 (only one error has occurred) and that S_1 = e_1 = α^{ℓ_1}. Conversely, if only one error has occurred, then S_3 = S_1^3 ≠ 0 and the coefficient matrix of Eq. (6.1) will have rank 1. Using a power table for F_{2^4}, we simply look up the exponent ℓ_1 such that α^{ℓ_1} = S_1 and then toggle bit ℓ_1 of y(x) to obtain the transmitted codeword c(x).

Finally, if the rank of the coefficient matrix is 2, we can solve for the coefficients b_1 and b_2. If two errors have occurred, the error locator polynomial σ(x) = b_2 x^2 + b_1 x + 1 must have two roots in F_{2^4}, which we can determine by trial and error. The powers of α associated with the inverses of these roots identify the two bit positions in which errors have occurred.

• Let us demonstrate this decoding scheme for the [15, 7] BCH code generated by the polynomial g(x) = x^8 + x^7 + x^6 + x^4 + 1. Given the message word 110 0000, the transmitted codeword is 110 011 100 100 000, i.e. c(x) = (1 + x) g(x) = x^9 + x^6 + x^5 + x^4 + x + 1. Suppose that two errors have occurred, so that the received vector is 110 010 101 100 000, that is, y(x) = x^9 + x^8 + x^6 + x^4 + x + 1. Consulting the power table for F_2[x]/(x^4 + x + 1), we see that the syndromes are

S_1 = y(α) = α^9 + α^8 + α^6 + α^4 + α + 1
    = (α^3 + α) + (α^2 + 1) + (α^3 + α^2) + (α + 1) + α + 1 = α + 1 = α^4,
S_2 = S_1^2 = α^8,
S_3 = y(α^3) = α^27 + α^24 + α^18 + α^12 + α^3 + 1
    = α^12 + α^9 + α^3 + α^12 + α^3 + 1 = α^9 + 1 = α^3 + α + 1 = α^7,
S_4 = S_2^2 = α^16 = α.

Since S_1 ≠ 0 and S_1 S_3 − S_2^2 = S_1 (S_3 − S_1^3) = S_1 (α^7 − α^12) ≠ 0, the system of equations

S_1 b_2 + S_2 b_1 = S_3 ⇒ α^4 b_2 + α^8 b_1 = α^7,
S_2 b_2 + S_3 b_1 = S_4 ⇒ α^8 b_2 + α^7 b_1 = α,

has rank 2. Upon adding α^4 times the first equation to the second equation, we see that

(α^12 + α^7) b_1 = α^11 + α.

Thus, b_1 = α^{−2} α^6 = α^4 and hence b_2 = α^{−4} (α^7 + α^8 α^4) = α^{−4} α^2 = α^13. Thus, the error locator polynomial is

σ(x) = α^13 x^2 + α^4 x + 1 = (e_1 x − 1)(e_2 x − 1).

We determine the roots of this equation by trial and error. That is, we search through the field until we find an i such that α^{2i−2} + α^{i+4} = 1. Incrementing from i = 0, the first such i we find is i = 7, so one root is α^7. The inverse, say e_1, of this root is α^8. Since the product e_1 e_2 = b_2 = α^13, we immediately see that e_2 = α^5. That is, the two errors are in the fifth and eighth positions, so we decode the received vector 110 010 101 100 000 as the codeword 110 011 100 100 000. Upon division of the associated polynomial x^9 + x^6 + x^5 + x^4 + x + 1 by g(x) = x^8 + x^7 + x^6 + x^4 + 1 we obtain x + 1, which corresponds to the original message, 110 0000.
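The two-error decode just carried out by hand can be reproduced mechanically. The sketch below (an illustration only, assuming nothing beyond the power table of F_16 = F_2[x]/(x^4 + x + 1)) computes the syndromes of y(x), solves Eq. (6.1) by Cramer's rule, and locates the roots of σ(x) by exhaustive search.

```python
# Decoding the received word y(x) = x^9 + x^8 + x^6 + x^4 + x + 1 of the
# [15, 7] BCH code over F_16 = F_2[x]/(x^4 + x + 1). Field elements are
# 4-bit masks; exp/log tables implement multiplication by alpha = x.

MOD = 0b10011                        # x^4 + x + 1
exp = [1]
for _ in range(14):
    p = exp[-1] << 1
    if p & 0b10000:
        p ^= MOD
    exp.append(p)
log = {v: i for i, v in enumerate(exp)}

def gmul(a, b):
    return 0 if 0 in (a, b) else exp[(log[a] + log[b]) % 15]

def ginv(a):
    return exp[(15 - log[a]) % 15]

def geval(bits, a):
    """Evaluate a polynomial with F_2 coefficients (low degree first) at a."""
    acc, p = 0, 1
    for c in bits:
        if c:
            acc ^= p
        p = gmul(p, a)
    return acc

y = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0]
S1, S2, S3, S4 = (geval(y, exp[i]) for i in (1, 2, 3, 4))

# Solve Eq. (6.1) by Cramer's rule (characteristic 2: minus signs drop out).
det = gmul(S1, S3) ^ gmul(S2, S2)
b2 = gmul(gmul(S3, S3) ^ gmul(S2, S4), ginv(det))
b1 = gmul(gmul(S1, S4) ^ gmul(S2, S3), ginv(det))

# sigma(z) = b2 z^2 + b1 z + 1; error positions are the logs of inverse roots.
errors = [(15 - i) % 15 for i in range(15)
          if gmul(b2, gmul(exp[i], exp[i])) ^ gmul(b1, exp[i]) ^ 1 == 0]
for pos in errors:
    y[pos] ^= 1                      # toggle the erroneous bits
print(sorted(errors))                # [5, 8]
```

The search recovers the error positions 5 and 8, and toggling those bits yields the transmitted codeword 110 011 100 100 000 as before.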

• In the previous example, if instead the received vector was 110 010 100 100 000, that is, y(x) = x^9 + x^6 + x^4 + x + 1, the syndromes would be

S_1 = y(α) = α^9 + α^6 + α^4 + α + 1
    = (α^3 + α) + (α^3 + α^2) + (α + 1) + α + 1 = α^2 + α = α^5,
S_2 = S_1^2 = α^10,
S_3 = y(α^3) = α^27 + α^18 + α^12 + α^3 + 1 = α^12 + α^3 + α^12 + α^3 + 1 = 1,
S_4 = S_2^2 = α^20 = α^5.


Since S_3 = S_1^3 ≠ 0, we know that only one error has occurred. In fact, S_1 = α^5, so the error must be in the fifth position; once again we see that the transmitted codeword was 110 011 100 100 000.

In general, decoding requires that we solve the nonlinear system of d − 1 syndrome equations

S_i = y(α^i) = e_1^i + . . . + e_t^i,   i = 1, . . . , d − 1,     (6.2)

for the value of t and the errors {e_j : j = 1, . . . , t}. Here t ≤ (d − 1)/2 is the actual number of errors that have occurred, so that each of the values e_j for j = 1, . . . , t are nonzero and distinct.

A straightforward generalization of the t = 2 decoding scheme leads to the t equations:

0 = e_i^{t+1} σ(e_i^{−1}) = b_t e_i + b_{t−1} e_i^2 + . . . + b_1 e_i^t + e_i^{t+1},
0 = e_i^{t+2} σ(e_i^{−1}) = b_t e_i^2 + b_{t−1} e_i^3 + . . . + b_1 e_i^{t+1} + e_i^{t+2},
. . .
0 = e_i^{2t} σ(e_i^{−1}) = b_t e_i^t + b_{t−1} e_i^{t+1} + . . . + b_1 e_i^{2t−1} + e_i^{2t}.

On summing each of these equations over i, we obtain a linear system of equations for the values b_1, b_2, . . . , b_t in terms of the 2t ≤ d − 1 syndromes S_1, S_2, . . . , S_{2t}:

    [ S_1   S_2     . . .  S_t      ] [ b_t     ]     [ S_{t+1} ]
    [ S_2   S_3     . . .  S_{t+1}  ] [ b_{t−1} ]     [ S_{t+2} ]
    [  .     .       .      .       ] [  .      ] = − [  .      ]
    [ S_t   S_{t+1} . . .  S_{2t−1} ] [ b_1     ]     [ S_{2t}  ] .     (6.3)

Problem 6.2: Show that we may rewrite the coefficient matrix in Eq. (6.3) as M = V D V^T, where V is the Vandermonde matrix

        [ 1          1          . . .  1          ]
        [ e_1        e_2        . . .  e_t        ]
    V = [ e_1^2      e_2^2      . . .  e_t^2      ]
        [  .          .          .      .         ]
        [ e_1^{t−1}  e_2^{t−1}  . . .  e_t^{t−1}  ]

and D = diag(e_1, e_2, . . . , e_t) is a diagonal matrix with components d_{ii} = e_i.

Remark: The matrix D is nonsingular ⇐⇒ each of its eigenvalues e_j, j = 1, . . . , t is nonzero. Also, the following theorem establishes that the matrix V is nonsingular ⇐⇒ the values e_j are distinct.

Theorem 6.1 (Vandermonde Determinants): For t ≥ 2 the t × t Vandermonde matrix

        [ 1          1          . . .  1          ]
        [ e_1        e_2        . . .  e_t        ]
    V = [ e_1^2      e_2^2      . . .  e_t^2      ]
        [  .          .          .      .         ]
        [ e_1^{t−1}  e_2^{t−1}  . . .  e_t^{t−1}  ]

has determinant ∏_{1≤j<i≤t} (e_i − e_j).

Proof: When t = 2 we see that det V = e_2 − e_1. For t > 2, suppose that all (t−1) × (t−1) Vandermonde matrices have determinant

∏_{1≤j<i≤t−1} (e_i − e_j) = ∏_{i=1}^{t−1} ∏_{j=1}^{i−1} (e_i − e_j).

By subtracting e_t times row i − 1 from row i, for i = t, t − 1, . . . , 2, we can rewrite the determinant of any t × t Vandermonde matrix as

    | 1                     1                     . . .  1                             1 |
    | e_1 − e_t             e_2 − e_t             . . .  e_{t−1} − e_t                 0 |
    | e_1(e_1 − e_t)        e_2(e_2 − e_t)        . . .  e_{t−1}(e_{t−1} − e_t)        0 |
    |  .                     .                     .      .                            . |
    | e_1^{t−2}(e_1 − e_t)  e_2^{t−2}(e_2 − e_t)  . . .  e_{t−1}^{t−2}(e_{t−1} − e_t)  0 |

    = (−1)^{t−1} | e_1 − e_t             e_2 − e_t             . . .  e_{t−1} − e_t                |
                 | e_1(e_1 − e_t)        e_2(e_2 − e_t)        . . .  e_{t−1}(e_{t−1} − e_t)       |
                 |  .                     .                     .      .                           |
                 | e_1^{t−2}(e_1 − e_t)  e_2^{t−2}(e_2 − e_t)  . . .  e_{t−1}^{t−2}(e_{t−1} − e_t) |

on expanding along the last column. Factoring e_j − e_t out of column j, for j = 1, . . . , t − 1, and absorbing the sign (−1)^{t−1} into these factors, this becomes

    | 1           1           . . .  1              |
    | e_1         e_2         . . .  e_{t−1}        |  (e_t − e_1)(e_t − e_2) . . . (e_t − e_{t−1})
    |  .           .           .      .             |
    | e_1^{t−2}   e_2^{t−2}   . . .  e_{t−1}^{t−2}  |

    = ∏_{i=1}^{t−1} ∏_{j=1}^{i−1} (e_i − e_j) ∏_{j=1}^{t−1} (e_t − e_j) = ∏_{i=1}^{t} ∏_{j=1}^{i−1} (e_i − e_j).
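The theorem can be spot-checked numerically; the sketch below compares an exact cofactor-expansion determinant with the product formula on the sample nodes 2, 3, 5, 7 (chosen arbitrarily for the illustration).

```python
# Numeric check of Theorem 6.1: the determinant of the t x t Vandermonde
# matrix on nodes e_1, ..., e_t equals the product of (e_i - e_j) over i > j.
# Exact integer arithmetic via recursive cofactor expansion.

from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (exact for ints)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

nodes = [2, 3, 5, 7]
t = len(nodes)
V = [[e ** i for e in nodes] for i in range(t)]   # row i holds e_j^i
lhs = det(V)
rhs = 1
for j, i in combinations(range(t), 2):            # pairs with j < i
    rhs *= nodes[i] - nodes[j]
print(lhs, rhs)                                   # both equal 240
```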

Remark: We thus see that the matrix M = V D V^T is nonsingular ⇐⇒ the error values e_j are nonzero and distinct.

Remark: If we attempt to increase the value of t in Eq. (6.2) beyond the actual number of errors that have occurred, either the values e_j will no longer be distinct or at least one of them will be zero. In either case, M will no longer be invertible. This gives us a method for finding the number of errors: t is just the largest number such that

        [ S_1   S_2     . . .  S_t      ]
    M = [ S_2   S_3     . . .  S_{t+1}  ]
        [  .     .       .      .       ]
        [ S_t   S_{t+1} . . .  S_{2t−1} ]

is invertible.

Remark: If it is known a priori that no more than t errors have occurred in a received polynomial y, then it is impossible for a (t+1) × (t+1) or larger syndrome matrix based on y to be invertible.

Remark: Once we have determined the maximum value of t such that the coefficient matrix M is invertible, we simply solve the linear system Eq. (6.3) for the coefficients b_1, b_2, . . . , b_t of the error locator polynomial σ(x). We can determine all t roots of σ(x) simply by searching through all of the field elements (at most one pass is required). The exponents ℓ_1, ℓ_2, . . . , ℓ_t corresponding to the inverses of these roots precisely identify the t positions of the received vector that are in error.

Remark: The above decoding procedure can be easily extended to nonbinary codes. In this case, the error vector becomes e(x) = q_1 x^{ℓ_1} + q_2 x^{ℓ_2} + . . . + q_t x^{ℓ_t}, where the q_i ∈ {0, 1, . . . , q − 1}, the syndromes become S_i = q_1 e_1^i + . . . + q_t e_t^i, and D = diag(q_1 e_1, q_2 e_2, . . . , q_t e_t). We then see that any BCH code of design distance d can correct ⌊(d − 1)/2⌋ errors. We encapsulate this result in the following theorem.

Theorem 6.2 (BCH Bound): The minimum distance of a BCH code of odd design distance d is at least d.

Proof: This follows from Theorem 1.1 and the fact that the BCH code can correct (d − 1)/2 errors.

Remark: Although Theorem 6.2 may be shown to hold also when the design distance d is even, we are normally interested only in the case of odd d.

Remark: It may happen that the minimum distance of a BCH code exceeds its

design distance d, as illustrated by the following example.

• Let α be a primitive element of F_{2^11}. Since 2^11 − 1 = 2047 = 23 · 89, the element β = α^89 has order n = 23. The cyclotomic cosets mod 23 are {0}, {1, 2, 4, 8, 16, 9, 18, 13, 3, 6, 12}, and {5, 10, 20, 17, 11, 22, 21, 19, 15, 7, 14}. Thus the minimal polynomials m_1(x) of β and m_5(x) of β^5 in F_{2^11} each have degree 11.¹

¹Since β^23 = 1, we know from Theorem A.5 that x^23 − 1 = (x − 1)m_1(x)m_5(x); moreover, Theorem A.7 implies that m_5(x) = m_1(x).

We can then construct a [23, 12] binary BCH code of length 23 from the degree-11 generator polynomial m_1(x), which has roots at β, β^2, β^3, and β^4. While the design distance of this code is 5, the actual minimum distance is 7; in fact, this BCH code is equivalent to the triple-error correcting [23, 12, 7] Golay code we encountered in Chapter 4.

Remark: The special case of an [n, k] BCH code for s = 1, where the primitive element α comes from the same field F_q as the coefficients of the generator polynomial, is known as a Reed–Solomon code. Note that the minimal polynomial of any element of F_q has degree s = 1. The generator polynomial of a Reed–Solomon code of design distance d,

g(x) = (x − α)(x − α^2)(x − α^3) . . . (x − α^{d−1}),

thus has degree n − k = d − 1. That is, the minimum distance of the code must be at least n − k + 1. But since there are rk H ≤ n − k independent columns in the parity check matrix, we know from Theorem 2.2 that the minimum distance can be no more than n − k + 1. Thus, a Reed–Solomon code achieves the so-called Singleton upper bound n − k + 1 for the minimum distance of a linear code. Because Reed–Solomon codes are optimal in this sense and easily implementable in hardware (using shift registers), they are widely used for error correction in computer memory chips, magnetic and optical (compact disk) storage systems, high-speed modems, and data transmission channels.

• Since 2 is a primitive element of Z_11, the polynomial

g(x) = (x − 2)(x − 4)(x − 8)(x − 5)(x − 10)(x − 9) = x^6 + 6x^5 + 5x^4 + 7x^3 + 2x^2 + 8x + 2

generates a triple-error correcting [10, 4, 7] Reed–Solomon code over Z_11. One of the codewords is g(x) itself, which has weight 7, consistent with the above claim that the design distance of a Reed–Solomon code is in fact the actual minimum distance.
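The expansion of this generator can be checked by multiplying out the linear factors over Z_11; a sketch (the roots 2, 4, 8, 5, 10, 9 are the powers 2^1, . . . , 2^6 of the primitive element 2):

```python
# Expand g(x) = (x-2)(x-4)(x-8)(x-5)(x-10)(x-9) over Z_11 and check the
# coefficients quoted above.

def polymul(a, b, p):
    """Multiply coefficient lists (low degree first) modulo p."""
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % p
    return r

g = [1]
for root in (2, 4, 8, 5, 10, 9):           # 2^1, ..., 2^6 mod 11
    g = polymul(g, [(-root) % 11, 1], 11)  # multiply by (x - root)
print(g)   # [2, 8, 2, 7, 5, 6, 1], i.e. x^6+6x^5+5x^4+7x^3+2x^2+8x+2
```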

• Compact disk players use a double-error correcting [255, 251, 5] Reed–Solomon code over F_256 (the symbols are eight-bit bytes) with interleaving of the data over the disk to increase robustness to localized data loss.

• High-performance computer memory chips containing so-called ECC SDRAM use

Reed–Solomon codes to correct one error per 64-bit word in addition to detecting

very rare multiple errors (which are programmed to generate a machine halt).

Problem 6.3: Prove that the Hamming code Ham(r, 2) is equivalent to an [n, k]

BCH code with distance d = 3. What is the value of k?

Problem 6.3 establishes the hierarchy of error-correcting codes illustrated in Fig. 6.1.


Figure 6.1: Hierarchy of Error-Correcting Codes (Ham(r, 2) ⊂ BCH codes ⊂ cyclic codes ⊂ linear codes ⊂ general codes).

Problem 6.4: In this question, we construct, from the primitive element α = x in the field F_2[x]/(x^4 + x^3 + 1), a BCH code C of length n = 15 and design distance 7.

(a) Find the degree of the generator polynomial g(x) for C.

We want α, α^2, . . . , α^6 to be roots of g(x), so we want g(x) = m_1(x)m_3(x)m_5(x), which has degree 4 + 4 + 2 = 10. That is, n − k = 10.

(b) What is the dimension k of C?

k = n − 10 = 5.

(c) Is C perfect? No.

This code is not perfect since it has neither the parameters of a Hamming code nor those of a Golay code.

(d) Construct the generator polynomial g(x) for C.

g(x) = m_1(x)m_3(x)m_5(x) = (x^4 + x^3 + 1)(x^4 + x^3 + x^2 + x + 1)(x^2 + x + 1)
     = (x^8 + x^4 + x^2 + x + 1)(x^2 + x + 1) = x^10 + x^9 + x^8 + x^6 + x^5 + x^2 + 1.
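This product can be verified by carry-less (mod-2) multiplication, with each F_2[x] polynomial stored as a bit mask (bit i is the coefficient of x^i); a sketch:

```python
# Verify g(x) = m1(x) m3(x) m5(x) for Problem 6.4(d) over F_2.

def polymul_gf2(a, b):
    """Multiply two F_2[x] polynomials stored as bit masks."""
    r = 0
    i = 0
    while b >> i:
        if (b >> i) & 1:
            r ^= a << i
        i += 1
    return r

m1 = 0b11001          # x^4 + x^3 + 1
m3 = 0b11111          # x^4 + x^3 + x^2 + x + 1
m5 = 0b111            # x^2 + x + 1
g = polymul_gf2(polymul_gf2(m1, m3), m5)
print(bin(g))         # 0b11101100101 = x^10 + x^9 + x^8 + x^6 + x^5 + x^2 + 1
```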

(e) We know that the minimum distance of C is at least the design distance 7. Using the fact that g(x) itself is a codeword of C, prove that the actual minimum distance of C is exactly 7.

The weight of g(x) is 7, so the minimum distance of this linear code cannot exceed 7.

(f) Use the irreducible factors of x^15 − 1 (or any other means) to find the check polynomial h(x) of C.

Since

x^16 − x = x(x + 1)m_1(x)m_3(x)m_5(x)m_7(x),

we know that

x^15 − 1 = (x + 1)m_1(x)m_3(x)m_5(x)m_7(x) = (x + 1)m_7(x)g(x).

Hence h(x) = (x + 1)m_7(x) = (x + 1)(x^4 + x + 1) = x^5 + x^4 + x^2 + 1.

(g) How many errors can C correct?

3

(h) Suppose the vector 011 111 101 110 111 is received. Without computing syndromes, find the transmitted codeword. How many errors have occurred? Hint: Look for an obvious nearby codeword.

Since α^15 = 1 in F_16, we see that

1 + α + α^2 + α^3 + . . . + α^14 = (α − 1)^{−1}(α^15 − 1) = (α − 1)^{−1} · 0 = 0.

That is, the vector consisting of 15 ones is a codeword. Our received vector is only a distance three away. Errors have occurred in positions 0, 7, and 11, but we can correct three errors. The transmitted codeword was thus 111 111 111 111 111.

(i) Determine the original transmitted message polynomial and the corresponding message.

The transmitted codeword polynomial was

1 + x + x^2 + x^3 + . . . + x^14 = (x + 1)^{−1}(x^15 − 1).

From part (f) we know that

(x + 1)^{−1}(x^15 − 1) = m_1(x)m_3(x)m_5(x)m_7(x) = m_7(x)g(x).

Thus, the original message polynomial was m_7(x) = x^4 + x + 1, corresponding to the message 11001.

Problem 6.5: (a) Show that x^2 + 1 is irreducible in Z_3[x].

Since x^2 + 1 evaluates to 1, 2, 2 at the respective values 0, 1, 2 of Z_3, we know from Theorem 5.1 that it cannot have a linear factor. Since the degree of x^2 + 1 is 2, it must be irreducible.

(b) Consider the field F_9 = Z_3[x]/(x^2 + 1). Show that the element x is not a primitive element of this field. What is the order of x?

Since x^2 = −1, we see that x^4 = 1, so the order of x divides 4; since x^2 ≠ 1, the order is in fact 4. Since 4 < 8, we know that x is not a primitive element of this field.

(c) List all cyclotomic cosets (the distinct exponents in the sets {α^i, α^{3i}, α^{3^2 i}, . . .}, where α is a primitive element) of F_9.

{0}, {1, 3}, {2, 6}, {4}, {5, 7}.
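These cosets can be generated mechanically: each exponent i is grouped with 3i, 9i, . . . modulo 8 (here q = 3 and the exponents run modulo 8). A sketch:

```python
# Compute the cyclotomic cosets of exponents modulo n with respect to
# multiplication by q, as in Problem 6.5(c) (q = 3, n = 8).

def cyclotomic_cosets(q, n):
    seen, cosets = set(), []
    for i in range(n):
        if i in seen:
            continue
        coset, j = [], i
        while j not in coset:
            coset.append(j)
            j = (j * q) % n
        seen.update(coset)
        cosets.append(coset)
    return cosets

print(cyclotomic_cosets(3, 8))   # [[0], [1, 3], [2, 6], [4], [5, 7]]
```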

64 CHAPTER 6. BCH CODES

(d) Establish that α = x + 1 is a primitive element of the field Z_3[x]/(x^2 + 1) by completing the following table. Indicate the order and minimal polynomial of each element. Remember that the polynomial coefficients come from Z_3.

    Element   Polynomial   Order   Minimal Polynomial
    α^0       1            1       x + 2
    α^1       x + 1        8       x^2 + x + 2
    α^2       2x           4       x^2 + 1
    α^3       2x + 1       8       x^2 + x + 2
    α^4       2            2       x + 1
    α^5       2x + 2       8       x^2 + 2x + 2
    α^6       x            4       x^2 + 1
    α^7       x + 2        8       x^2 + 2x + 2

Problem 6.6: In the field Z_3[x]/(x^2 + 1), with primitive element x + 1, consider the BCH code C of length n = 8 and design distance 5.

(a) Find the degrees of the minimal polynomials used to construct the generator polynomial g of C and the degree of g itself.

Since we want α, α^2, α^3, and α^4 to be roots of g(x), we know that g must be the product of m_1(x) (degree 2), m_2(x) (degree 2), and m_4(x) (degree 1). Hence g must have degree 2 + 2 + 1 = 5. That is, n − k = 5.

(b) What is the dimension k of C?

k = n − 5 = 3.

(c) How many codewords are there in C?

There are 3^k = 27 codewords in C.

(d) Compute the syndromes S_1, S_2, and S_3 for the received polynomial v(x) = x^3 + x^4 + 2x^5 + x^6.

S_1 = v(α) = α^3 + α^4 + 2α^5 + α^6 = (2x + 1) + 2 + 2(2x + 2) + x = x + 1 = α,
S_2 = v(α^2) = α^6 + α^8 + 2α^10 + α^12 = x + 1 + 2(2x) + 2 = 2x = α^2,
S_3 = v(α^3) = S_1^3 = α^3.

(e) Using the value of the syndrome S_1 and the syndrome equation

    [ S_1  S_2 ] [ b_2 ]     [ S_3 ]
    [ S_2  S_3 ] [ b_1 ] = − [ S_4 ] ,

show that the received polynomial in part (d) contains exactly one error.

Since S_1 ≠ 0 but S_1 S_3 − S_2^2 = α α^3 − (α^2)^2 = 0, we know that exactly one error has occurred.

(f) What was the transmitted polynomial in part (d)?

On solving

S_1 = α = qe,
S_2 = α^2 = qe^2,

we see that e = α and q = 1. So an error of 1 has appeared in the coefficient of x. On subtracting this error, we obtain the transmitted polynomial

2x + x^3 + x^4 + 2x^5 + x^6.

Problem 6.7: Consider the [15, 7, 5] primitive BCH code C over F_16. Let α be a primitive element of F_16.

(a) Find the degree of the generator polynomial g for C.

deg g = n − k = 15 − 7 = 8.

(b) Find the degrees of the minimal polynomials used to construct g.

The generator g is constructed to have roots at α, α^2, α^3, and α^4. From the cyclotomic cosets

{0}, {1, 2, 4, 8}, {3, 6, 12, 9}, {5, 10}, {7, 14, 13, 11},

we see that it must include the minimal polynomials m_1(x) (degree 4) and m_3(x) (degree 4). Since the sum of these degrees is 4 + 4 = 8, we see that g cannot contain any additional roots.

(c) Without computing syndromes, show that 100 100 100 100 100 is a codeword of C.

We know that the geometric series c(x) = 1 + x^3 + x^6 + x^9 + x^12 has the value (1 + x^3)^{−1}[1 + (x^3)^5] for x^3 ≠ 1. Then for i = 1, 2, 3, 4,

c(α^i) = (1 + α^{3i})^{−1}[1 + (α^{3i})^5] = (1 + α^{3i})^{−1}[1 + (α^15)^i] = (1 + α^{3i})^{−1}(1 + 1^i) = (1 + α^{3i})^{−1} · 0 = 0

since α^{3i} ≠ 1. That is, α, α^2, α^3, and α^4 (but incidentally, not α^5) are roots of c(x). By Theorem A.5, c(x) is a multiple of the minimal polynomial of each of these elements, and therefore, of g(x) itself. Thus c(x) is in the code generated by g(x).

Chapter 7

Cryptographic Codes

In contrast to error-correcting codes, which are designed only to increase the reliability

of data communications, cryptographic codes are designed to increase their security.

In cryptography, the sender uses a key to encrypt a message before it is sent through

an insecure channel (such as a telephone line, radio transmission or the internet).

An authorized receiver at the other end then uses a key to decrypt the received data

to a message. Often, data compression algorithms and error-correcting codes are

used in tandem with cryptographic codes to yield communications that are both

eﬃcient, robust to data transmission errors, and secure to eavesdropping and/or

tampering. Typically, data compression is performed ﬁrst; the resulting compressed

data is then encrypted and ﬁnally encoded with an error-correcting scheme before

being transmitted through the channel.

Deﬁnition: Let K be a set of cryptographic keys. A cryptosystem is a set

{c_e, T_d : e ∈ K}

of encrypting and decrypting keys, e and d, and their associated encrypting function c_e and decrypting function T_d, respectively.

Most cryptosystems, or ciphers, fall into one of two broad classes: symmetric-key

cryptosystems, where essentially the same key is used both to encrypt and decrypt

a message (precisely, where d can be easily determined whenever e is known) and

public-key cryptosystems, where the encryption key e is made publicly available, but

the decryption key d is kept secret and is (hopefully) known only to the receiver.

7.A Symmetric-Key Cryptography

One of the simplest cryptographic systems is the shift cipher employed by Julius Caesar. Shift ciphers encode each symbol m ∈ {0, 1, . . . , n − 1} in the message as

c = c_e(m) = m + e (mod n),

where e ∈ N. Decoding is accomplished via the inverse transformation

m = T_d(c) = c + d (mod n),

where d = −e. That is, encoding is accomplished by addition modulo n and decoding is accomplished by subtraction modulo n. Caesar adopted the value e = 3 to encrypt the n = 26 symbols of the Roman alphabet, using 0 to represent the letter A and 25 to represent the letter Z. Some fans of the film "2001: A Space Odyssey" even suggest that the computer name HAL is really a shift cipher for IBM, with e = 25!

A slight generalization of the shift cipher is the affine cipher, defined by

c = c_{a,b}(m) = am + b (mod n),

where a ∈ N is relatively prime to n. This condition guarantees the existence of an inverse transformation,

m = T_{a,b}(c) = a^{−1}(c − b) (mod n).
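Both ciphers are a few lines of modular arithmetic; a minimal sketch on the 26-letter alphabet, using the encoding A = 0, . . . , Z = 25 from the text:

```python
# Shift and affine ciphers on symbols 0..n-1. Decryption by subtraction is
# the same as adding d = -e modulo n.

def shift_encrypt(msg, e, n=26):
    return [(m + e) % n for m in msg]

def shift_decrypt(cip, e, n=26):
    return [(c - e) % n for c in cip]

def affine_encrypt(msg, a, b, n=26):
    return [(a * m + b) % n for m in msg]

def affine_decrypt(cip, a, b, n=26):
    a_inv = pow(a, -1, n)            # exists because gcd(a, n) = 1
    return [(a_inv * (c - b)) % n for c in cip]

ibm = [8, 1, 12]                     # I B M
print(shift_encrypt(ibm, 25))        # [7, 0, 11] -> H A L
```

The last line reproduces the HAL/IBM anecdote above: shifting IBM by e = 25 yields HAL.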

Both shift and affine ciphers are very insecure since they are easily decoded simply by trying all possible values of a and b! They are both special cases of simple substitution ciphers or monoalphabetic substitution ciphers, which permute the alphabet symbols in a prescribed manner. Simple substitution ciphers can be cryptanalyzed (decoded by an unauthorized third party) by frequency analysis, in which the encrypted symbol frequencies are compared to those of the original message language to determine the applied permutation. Block substitution ciphers or polyalphabetic substitution ciphers divide the message into blocks of length r and apply different permutations to the symbols in individual positions of the block. Given enough encrypted text, block substitution ciphers are also easily cryptanalyzed once the block size r is determined, simply by doing a frequency analysis on the letters in each fixed position of all blocks.

Digraph ciphers map pairs of letters in the message text to encrypted pairs of letters. A general example of this is the linear block or Hill cipher, which uses an r × r invertible matrix e to encrypt an entire block m of r message symbols:

c = c_e(m) = em (mod n),
m = T_e(c) = e^{−1}c (mod n).

The existence of e^{−1} requires that det e have an inverse in Z_n, which happens only when gcd(det e, n) = 1.

• Choose r = 2, n = 26 and

    e = [ 2  1 ]
        [ 3  4 ] .

We see that det e = 5 has no common factors with 26. To find the inverse of 5 in Z_26 we use the Euclidean division algorithm: 1 = 5x + 26y; since 26 = 5 · 5 + 1, we have 1 = 26 − 5 · 5, from which we see that x = −5 is a solution. Thus

    e^{−1} = −5 [  4  −1 ]  =  [  6   5 ]
                [ −3   2 ]     [ 15  16 ] .

We can use e to encrypt the word “SECRET”, in other words the message 18 4

2 17 4 19, by breaking the message up into vectors of length two: [18 4], [2 17],

[4 19] and then multiplying the transpose of each vector by e on the left. The

result is [14 18], [21 22], [1 10], or the cipher text “OSVWBK”. Note that the two

letters “E” are not mapped to the same symbol in the ciphertext. For this reason

the Hill cipher is less susceptible to frequency analysis (particularly for large block

sizes; however, the number of entries in the key matrix then becomes unreasonably

large).
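The "SECRET" example above can be reproduced with a short sketch; the key matrix and its inverse are the ones just computed.

```python
# Hill cipher with the 2x2 key matrix e from the example, applied to
# "SECRET" = 18 4 2 17 4 19 two symbols at a time, modulo 26.

E = [[2, 1], [3, 4]]
E_inv = [[6, 5], [15, 16]]           # e^{-1} mod 26, computed above

def hill_apply(key, pair, n=26):
    return [(key[0][0] * pair[0] + key[0][1] * pair[1]) % n,
            (key[1][0] * pair[0] + key[1][1] * pair[1]) % n]

msg = [18, 4, 2, 17, 4, 19]          # S E C R E T
cipher = []
for k in range(0, len(msg), 2):
    cipher += hill_apply(E, msg[k:k+2])
print(cipher)                        # [14, 18, 21, 22, 1, 10] -> OSVWBK

plain = []
for k in range(0, len(cipher), 2):
    plain += hill_apply(E_inv, cipher[k:k+2])
print(plain)                         # [18, 4, 2, 17, 4, 19] -> SECRET
```

The second loop is exactly the verification asked for in Problem 7.1 below.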

Problem 7.1: Verify that the original message "SECRET" is recovered when "OSVWBK" is decoded with the matrix e^{−1}.

A special case of the block cipher is the permutation cipher, in which the order of the characters in every block of text is rearranged in a prescribed manner. Permutation ciphers can be detected by frequency analysis since they preserve the frequency distribution of each symbol. In fact, all linear or affine block methods are subject to cryptanalysis using linear algebra, once r or r + 1 plaintext–ciphertext pairs are known.

A widely used commercial symmetric-key cryptosystem is the Data Encryption Standard (DES), which is a type of Feistel cipher endorsed in 1977 by the United States National Bureau of Standards for the protection of confidential (but unclassified) government data. In 1981, DES was approved also for use in the private sector. Feistel ciphers are block ciphers over F_2^{2t}. One divides a plaintext block message of 2t bits into two halves, L_0 and R_0, and then performs the iteration

L_i = R_{i−1},
R_i = L_{i−1} + f(R_{i−1}, e_i),   i = 1, . . . , r,

where the e_i are the keys and f is a specific nonlinear cipher function. The encryption function is c_e(L_0 | R_0) = R_r | L_r, where | denotes concatenation and e = (e_1, . . . , e_r) denotes a set of r encryption keys. The decryption function, T_e(L_r | R_r) = R_0 | L_0, is implemented by applying the inverse iteration

L_{i−1} = R_i + f(L_i, e_i),
R_{i−1} = L_i,   i = r, . . . , 1.

With Feistel ciphers, the encryption and decryption algorithms are essentially the

same, except that the key sequence is reversed.
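The two iterations above can be sketched directly. Here is a toy Feistel cipher with 8-bit halves and a made-up round function f (DES's actual f is far more elaborate); it illustrates that decryption is the same algorithm with the key sequence reversed.

```python
# Toy Feistel cipher sketch: t = 8 (halves are single bytes), with a
# hypothetical nonlinear round function f.  Addition over F_2 is XOR.

def f(r, key):
    """A made-up nonlinear round function on 8-bit values."""
    return ((r * r) ^ key ^ (r >> 3)) & 0xFF

def feistel_encrypt(L, R, keys):
    for k in keys:             # L_i = R_{i-1}, R_i = L_{i-1} + f(R_{i-1}, e_i)
        L, R = R, L ^ f(R, k)
    return R, L                # output is R_r || L_r

def feistel_decrypt(L, R, keys):
    for k in reversed(keys):   # identical structure, reversed key schedule
        L, R = R, L ^ f(R, k)
    return R, L                # output is R_0 || L_0

keys = [0x3A, 0xC5, 0x7E, 0x12]
c = feistel_encrypt(0xAB, 0xCD, keys)
assert feistel_decrypt(*c, keys) == (0xAB, 0xCD)
```

Note that the same loop serves for both directions; only the key order and the final half-swap differ, as the text describes.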


The DES cipher uses the half width t = 32 and r = 16 rounds of encryption. All 16 keys are generated from a bit sequence of length 64 that is divided into 8 bytes. The eighth bit of each byte is an (odd) parity check, so in fact there are only 56 information bits in the key. Hence the number of keys that need to be searched in a brute-force attack on DES is 2^56. In fact, because of an inherent ones-complement symmetry, only 2^55 keys need to be checked. On a modern supercomputer this is quite feasible; moreover, DES has recently been shown to be susceptible to relatively new cryptanalytic techniques, such as differential cryptanalysis and linear cryptanalysis. Somewhat more secure variants of DES (such as Triple-DES, where DES is applied three times in succession with three different keys) were developed as interim solutions. One common application of DES that persists today is its use in encrypting computer passwords, using the password itself as a key. This is why many computer passwords are still restricted to a length of eight characters (64 bits).

In October 2000, the Rijndael Cryptosystem was adopted by the National Bureau of Standards as the Advanced Encryption Standard (AES) to replace DES. It is based on a combination of byte substitutions, shifts, and key additions, along with a diffusion-enhancing technique based on cyclic coding theory, where the data values are multiplied by the polynomial 3x^3 + x^2 + x + 2 in the polynomial ring F_256[x]/(x^4 − 1).
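The multiplication by 3x^3 + x^2 + x + 2 in F_256[x]/(x^4 − 1) is AES's MixColumns step. A minimal sketch, using AES's byte-level reduction polynomial x^8 + x^4 + x^3 + x + 1 for arithmetic in F_256, is:

```python
# Sketch of the AES diffusion step: multiply a 4-byte column, viewed as a
# polynomial over F_256, by 3x^3 + x^2 + x + 2 modulo x^4 - 1.

def gmul(a, b):
    """Multiply two bytes in F_256 = F_2[y]/(y^8 + y^4 + y^3 + y + 1)."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        hi = a & 0x80
        a = (a << 1) & 0xFF
        if hi:
            a ^= 0x1B        # reduce by the AES polynomial 0x11B
        b >>= 1
    return p

def mix_column(col):
    """Multiply the column polynomial by 3x^3 + x^2 + x + 2 mod x^4 - 1."""
    a, b, c, d = col
    return [gmul(a, 2) ^ gmul(b, 3) ^ c ^ d,
            a ^ gmul(b, 2) ^ gmul(c, 3) ^ d,
            a ^ b ^ gmul(c, 2) ^ gmul(d, 3),
            gmul(a, 3) ^ b ^ c ^ gmul(d, 2)]

# A standard MixColumns test vector (FIPS-197):
assert mix_column([0xDB, 0x13, 0x53, 0x45]) == [0x8E, 0x4D, 0xA1, 0xBC]
```

Because x^4 − 1 wraps x^4 back to 1, the product reduces to the circulant matrix of coefficients {2, 3, 1, 1} applied to the column.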

7.B Public-Key Cryptography

A principal difficulty with symmetric-key cryptosystems is the problem of key distribution and management. Somehow, both parties, who may be quite distant from each other, have to securely exchange their secret keys before they can begin communication.

One technique for avoiding the problem of key exchange makes use of two secure envelopes, or locks, which each party alternately applies and later removes from the sensitive data, as the data makes a total of three transits between sender and receiver. The required three transmissions make this method awkward to use in practice.

In public-key cryptosystems, key exchange is avoided altogether by making copies of the receiver's encrypting key (lock) available to anyone who wants to communicate with him. Both the secure-envelope technique and the public-key technique require that the encrypting key e be designed so that the decrypting key d is extremely difficult to determine from knowledge of e. They also require authentication of the lock itself, to guard against so-called man-in-the-middle attacks.

7.B.1 RSA Cryptosystem

The most well-known public-key cipher is the Rivest–Shamir–Adleman (RSA) Cryptosystem. First, the receiver forms the product n of two distinct large prime numbers p and q chosen at random, but such that p and q cannot be easily determined from n.¹ The receiver then selects a random integer e between 1 and ϕ(n) = (p−1)(q−1) that is relatively prime to ϕ(n) and, using the Euclidean division algorithm, computes d = e^{−1} in Z_{ϕ(n)} (why does e^{−1} exist?). The numbers n and e are made publicly available, but d, p, and q are kept secret.

Anyone who wishes to send a message m, where 0 ≤ m < n, to the receiver encrypts the message using the encoding function

c = c_e(m) = m^e (mod n)

and transmits c. Because the receiver has knowledge of d, the receiver can decrypt c using the decoding function

M = T_e(c) = c^d (mod n).

To show that M = m, we will need the following results.

Theorem 7.1 (Modified Fermat's Little Theorem): If s is prime and a and m are natural numbers, then

m[m^{a(s−1)} − 1] = 0 (mod s).

Proof: If m is a multiple of s, we are done. Otherwise, we know that m^a is not a multiple of s, so Fermat's little theorem² implies that (m^a)^{s−1} = 1 (mod s), from which the result follows.

Corollary 7.1.1 (RSA Inversion): The RSA decoding function T_e is the inverse of the RSA encoding function c_e.

By construction, ed = 1 + kϕ(n) for some integer k, so

M = T_e(c) = c^d = (m^e)^d = m^{ed} = m^{1+kϕ(n)} = m^{1+k(p−1)(q−1)} (mod n).

We first apply Theorem 7.1 with a = k(q−1), s = p and then with a = k(p−1), s = q, to deduce that m[m^{k(q−1)(p−1)} − 1] is a multiple of both of the distinct primes p and q; that is, m[m^{k(q−1)(p−1)} − 1] = 0 (mod pq). Thus

M = m·m^{k(q−1)(p−1)} = m (mod pq) = m (mod n).

• Let us encode the message "SECRET" (18 4 2 17 4 19) using the RSA scheme with a block size of 1. The receiver chooses p = 5 and q = 11, so that n = pq = 55 and ϕ(n) = 40. He then selects e = 17 and finds d = e^{−1} in Z_40, so that 17d = 40k + 1 for some k ∈ N. This amounts to finding gcd(17, 40):

40 = 2·17 + 6,   17 = 2·6 + 5,   6 = 1·5 + 1,

from which we see that

1 = 6 − 5 = 6 − (17 − 2·6) = 3·6 − 17 = 3(40 − 2·17) − 17 = 3·40 − 7·17.

That is, d = −7 (mod 40) = 33. The receiver publishes the numbers n = 55 and e = 17, but keeps the factors p = 5, q = 11, and d = 33 (and ϕ(n)) secret.

¹For example, if p and q are close enough that (p+q)^2 − 4n = (p+q)^2 − 4pq = (p−q)^2 is small, then the sum p+q could be determined by searching for a small value of p−q such that (p−q)^2 + 4n is a perfect square, which must be p+q. Knowledge of p−q and p+q is sufficient to determine both p and q.

²This follows from applying Theorem A.4 to the field Z_s.

The sender then encodes 18 4 2 17 4 19 as

18^17  4^17  2^17  17^17  4^17  19^17 (mod 55) = 28 49 7 52 49 24.

The two Es are encoded in exactly the same way, since the block size is 1: obviously, a larger block size should be used to thwart frequency analysis attacks.³

The receiver would then decode the received message 28 49 7 52 49 24 as

28^33  49^33  7^33  52^33  49^33  24^33 (mod 55) = 18 4 2 17 4 19.
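The whole worked example can be verified in a few lines (pow(e, -1, phi) computes the modular inverse in Python 3.8+):

```python
# Verification of the notes' RSA example: p = 5, q = 11, n = 55, e = 17.

p, q, e = 5, 11, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)                      # modular inverse of e in Z_40
assert d == 33

message = [18, 4, 2, 17, 4, 19]          # "SECRET"
cipher = [pow(m, e, n) for m in message] # m^e (mod n)
assert cipher == [28, 49, 7, 52, 49, 24]

decoded = [pow(c, d, n) for c in cipher] # c^d (mod n)
assert decoded == message
```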

Remark: While the required exponentiations can be performed by repeated squaring and multiplication in Z_n (e.g. x^33 = x^32·x), RSA decryption can be implemented in a more efficient manner. This is important, since to make computing the secret key d (from knowledge of n and e alone) difficult, d must be chosen to be about as large as n. Instead of computing m = c^d directly, we first find a = c^d (mod p) and b = c^d (mod q). This is very easy since Fermat's little theorem says that c^{p−1} = 1 (mod p), so these definitions reduce to

a = c^{d mod (p−1)} (mod p),   b = c^{d mod (q−1)} (mod q).

The Chinese remainder theorem then guarantees that the system of linear congruences

m = a (mod p),   m = b (mod q)

has exactly one solution in {0, 1, . . . , n−1}. One can find this solution by using the Euclidean division algorithm to construct integers x and y such that 1 = xp + yq. Since yq = 1 (mod p) and xp = 1 (mod q), we see that

m = ayq + bxp (mod n)

is the desired solution. Since the numbers x and y are independent of the ciphertext, the factors xp and yq can be precomputed.

³For example, we could encode pairs of letters i and j as 26i + j and choose n ≥ 26^2 = 676, although such a limited block size would still be vulnerable to more time-consuming but feasible digraph frequency attacks.


• To set up an efficient decoding scheme, we precompute x and y such that 1 = px + qy. For p = 5 and q = 11, we see that x = −2 and y = 1 are solutions, so that xp = −10 and yq = 11. Once a and b are determined from the ciphertext, we can quickly compute m = 11a − 10b (mod 55). For example, to compute 28^33 we evaluate

a = 28^{33 mod 4} (mod 5) = 28 (mod 5) = 3,
b = 28^{33 mod 10} (mod 11) = 28^3 (mod 11) = 6^3 (mod 11) = 7,

and then compute m = 11a − 10b = (33 − 70) (mod 55) = 18.
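A sketch of this CRT decoding for the example parameters p = 5, q = 11, d = 33:

```python
# CRT speedup for RSA decryption, with the precomputed values
# xp = -10 and yq = 11 from 1 = px + qy, where x = -2, y = 1.

p, q, d = 5, 11, 33
n = p * q
xp, yq = -10, 11

def decode(c):
    a = pow(c, d % (p - 1), p)       # c^d (mod p), reduced via Fermat
    b = pow(c, d % (q - 1), q)       # c^d (mod q)
    return (a * yq + b * xp) % n     # Chinese remainder theorem

assert decode(28) == 18
# The CRT route agrees with direct exponentiation for every message:
assert all(decode(pow(m, 17, n)) == m for m in range(n))
```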

Remark: Determining d from e and n can be shown to be equivalent to determining the prime factors p and q of n. Since factoring large integers is in general an extremely difficult problem, the belief by many that RSA is a secure cryptographic system rests on this equivalence. However, the possibility that some other technique for decrypting RSA ciphertext exists has not been ruled out. If such a technique exists, presumably it does not involve direct knowledge of d (as that would constitute an efficient algorithm for factorizing integers!).

Problem 7.2: You have intercepted an encrypted transmission that was intended for your math professor. It consists of the three numbers 26, 14, and 19. Each of these numbers was separately encoded, using RSA encoding and a block size of 1, where the letters A to Z are mapped to Z_26 in the usual way. On your professor's web page you notice that he lists his public key as (n = 33, e = 13).

(a) Without breaking into your professor's computer, determine the "secret" prime factors p and q of n.

p = 3, q = 11.

(b) Determine the decryption exponent d, knowing e, p, and q.

First we find ϕ(n) = (p − 1)(q − 1) = 20. Since 20 = 1·13 + 7, 13 = 1·7 + 6, and 7 = 1·6 + 1, we see that 1 = 7 − 6 = 7 − (13 − 7) = 2·7 − 13 = 2(20 − 13) − 13 = 2·20 − 3·13. So d = −3 (mod 20) = 17.

(c) Use d to decrypt the secret message. Interpret your result as a three-letter English word.

Since 11 = 3·3 + 2 and 3 = 1·2 + 1, we see that 1 = 3 − 2 = 3 − (11 − 3·3) = 4·3 − 11. Hence 1 = xp + yq, where x = 4 and y = −1. Hence we can decode the message c as m = 12b − 11a (mod 33), where a = c^17 (mod 3) = c^1 (mod 3) and b = c^17 (mod 11) = c^7 (mod 11).

For c = 26, we find

a = 26^1 (mod 3) = 2^1 (mod 3) = 2,
b = 26^7 (mod 11) = 4^7 (mod 11) = 2^14 (mod 11) = 2^4 (mod 11) = 5.

This yields m = 12(5) − 11(2) = 38 (mod 33) = 5.

Similarly, for c = 14, we find

a = 2^1 (mod 3) = 2,
b = 3^7 (mod 11) = 3·27^2 (mod 11) = 3·5^2 (mod 11) = 3·3 (mod 11) = 9.

This yields m = 12(9) − 11(2) = 20.

Finally, for c = 19, we find

a = 1^17 (mod 3) = 1,
b = 8^7 (mod 11) = 2^21 (mod 11) = 2^1 (mod 11) = 2.

This yields m = 12(2) − 11(1) = 13.

Thus, the sent message was 5, 20, 13, which spells FUN!

Problem 7.3: You have intercepted an encrypted transmission from the Prime Minister's Office to the Leader of the Official Opposition. It consists of the two numbers 27 and 14, which were separately encoded, using RSA encoding and a block size of 1, where the letters A to Z are mapped to Z_26 in the usual way. On the Leader of the Official Opposition's web page you notice that he lists his public key as (n = 35, e = 11).

(a) Determine the "secret" prime factors p and q of n.

p = 5, q = 7.

(b) Determine the decryption exponent d, knowing e, p, and q.

First we find ϕ(n) = (p − 1)(q − 1) = 24. Since 24 = 2·11 + 2 and 11 = 5·2 + 1, we see that 1 = 11 − 5·2 = 11 − 5(24 − 2·11) = 11·11 − 5·24. So d = 11.

(c) Use d to decrypt the secret message. Interpret your result as a two-letter English word.

Since 7 = 1·5 + 2 and 5 = 2·2 + 1, we see that 1 = 5 − 2·2 = 5 − 2(7 − 1·5) = 3·5 − 2·7. Hence 1 = xp + yq, where x = 3 and y = −2. Hence we can decode the message c as m = 15b − 14a (mod 35), where a = c^11 (mod 5) = c^3 (mod 5) and b = c^11 (mod 7) = c^5 (mod 7).

For c = 27, we find

a = 27^3 (mod 5) = 2^3 (mod 5) = 3,
b = 27^5 (mod 7) = (−1)^5 (mod 7) = −1 (mod 7) = 6.

This yields m = 15(6) − 14(3) = 48 (mod 35) = 13.

Similarly, for c = 14, we find

a = 14^3 (mod 5) = (−1)^3 (mod 5) = 4,
b = 14^5 (mod 7) = 0.

This yields m = −14(4) = −56 (mod 35) = 14.

Thus, the sent message was 13, 14, which spells NO!


7.B.2 Rabin Public-Key Cryptosystem

In contrast to the RSA scheme, the Rabin Public-Key Cryptosystem has been proven to be as secure as factorizing large integers is difficult. Again, the receiver forms the product n = pq of two large distinct primes p and q that are kept secret. To make decoding efficient, p and q are normally chosen to be both congruent to 3 (mod 4). This time, the sender encodes the message m ∈ {0, 1, . . . , n − 1} as

c = c_e(m) = m^2 (mod n).

To decode the message, the receiver must be able to compute square roots modulo n. This can be efficiently accomplished in terms of integers x and y satisfying 1 = xp + yq. First, one notes from Lemma 5.1 that the equation 0 = x^2 − c has at most two solutions in Z_p. In fact, these solutions are given by ±a, where a = c^{(p+1)/4} (mod p):

(±a)^2 = c^{(p+1)/2} (mod p) = c·c^{(p−1)/2} (mod p) = c·m^{p−1} (mod p) = c (mod p).

Similarly, the two square roots of c in Z_q are ±b, where b = c^{(q+1)/4} (mod q). Consequently, by the Chinese remainder theorem, the linear congruences

M = ±a (mod p),   M = ±b (mod q)

yield four solutions:

M = ±(ayq ± bxp) (mod n),

one of which is the original message m.
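A sketch with the toy primes p = 7 and q = 11 (both congruent to 3 mod 4), showing that the four roots include the original message:

```python
# Rabin cryptosystem sketch: n = 77, with 1 = xp + yq for x = -3, y = 2.

p, q = 7, 11
n = p * q
x, y = -3, 2

def encode(m):
    return m * m % n

def decode(c):
    a = pow(c, (p + 1) // 4, p)   # square roots of c in Z_p are +-a
    b = pow(c, (q + 1) // 4, q)   # square roots of c in Z_q are +-b
    return {(s * a * y * q + t * b * x * p) % n   # M = +-(ayq +- bxp)
            for s in (1, -1) for t in (1, -1)}

m = 20
roots = decode(encode(m))
assert m in roots                 # the original message is among the four
assert len(roots) == 4
```

The receiver must still pick the intended message out of the four roots, e.g. by requiring a recognizable format or redundancy in m.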

7.C Discrete Logarithm Schemes

The RSA system is based on the trapdoor property that multiplying two large prime

numbers is much easier than factoring a large composite number into two constituent

primes. Another example of such a one-way function is the fact that computing a

high power of an element within a group is much easier than the reverse process of

determining the required exponent.

7.C.1 Diffie–Hellman Key Exchange

Definition: Let G be a finite group and b ∈ G. Suppose y ∈ G can be obtained as the power b^x in G. The number x is called the discrete logarithm of y to the base b in G.

Remark: The Diffie–Hellman assumption conjectures that it is computationally infeasible to compute α^{ab} knowing only α^a and α^b. It is assumed that this would require first determining one of the powers a or b (in which case it is as difficult as computing a discrete logarithm).


Let F_q be a publicly agreed-upon field with q elements, with primitive element α. If Alice and Bob want to agree on a common secret key, they each secretly choose a random integer between 1 and q − 1, which they call a and b, respectively. Alice computes and sends α^a to Bob; likewise, Bob computes and sends α^b to Alice. Their common secret key is then α^{ab}. If the Diffie–Hellman assumption is correct, a third party will be unable to determine α^{ab} from knowledge of the public keys α^a and α^b alone.

In the ElGamal Cryptosystem, Alice sends a message m to Bob as the pair of elements (α^k, mα^{bk}), where k is a randomly chosen integer. Bob can then determine m by dividing mα^{bk} by (α^k)^b.
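Both schemes can be sketched over Z_p with toy parameters (here p = 227 with primitive element 2; these are illustrative choices only — real systems use far larger fields):

```python
# Diffie-Hellman key exchange and ElGamal encryption over Z_227.

p, alpha = 227, 2

# Key exchange: a and b are the secret exponents.
a, b = 79, 101
A, B = pow(alpha, a, p), pow(alpha, b, p)    # sent over the public channel
assert pow(B, a, p) == pow(A, b, p)          # both compute alpha^(ab)

# ElGamal: Alice sends m to Bob as (alpha^k, m * alpha^(bk)).
m, k = 42, 59
c1, c2 = pow(alpha, k, p), m * pow(B, k, p) % p
recovered = c2 * pow(pow(c1, b, p), -1, p) % p   # divide by (alpha^k)^b
assert recovered == m
```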

7.C.2 Okamoto Authentication Scheme

One of the more difficult issues in public key cryptography is that of authentication, that is, verifying the identity of a sender or receiver. For example, Alice should not blindly assume that Bob's posted public key is really his, without independent confirmation via a trusted source. And Bob, when he decrypts a message from Alice, should not naively believe that it was really Alice who used his public encoding key to send the message to him in the first place.

One practical scheme for doing this is the Okamoto authentication scheme, a variation of an earlier scheme by Schnorr. A trusted certifying authority chooses a field Z_p, where the prime p is chosen such that p − 1 has a large prime factor q. The authority also chooses two distinct elements of order q in Z_p, say g_1 and g_2. To request a certificate, Alice chooses two secret random exponents a_1 and a_2 in Z_q and sends the number

s = g_1^{−a_1} g_2^{−a_2} (mod p)

to the trusted authority, who uses a secret algorithm to generate an identifying certificate C(s) for Alice. When Alice communicates with Bob, she identifies herself by sending him both her signature s and certificate C(s). She can then authenticate herself to Bob with the following procedure.

Bob first checks with the trusted authority to confirm that s and C(s) really belong to Alice. But since s and C(s) aren't secret, this doesn't yet establish that Alice was really the one who sent them to Bob. So Bob challenges the person claiming to be Alice to prove that she really owns the signature s, by first sending her a random number r ∈ Z_q.


Alice now tries to prove to Bob that she owns the underlying private keys a_1 and a_2, without actually revealing them to him, by choosing random numbers k_1 and k_2 in Z_q and sending him the three numbers

y_1 = k_1 + a_1 r (mod q),
y_2 = k_2 + a_2 r (mod q),
γ = g_1^{k_1} g_2^{k_2} (mod p).

Bob uses the fact that g_1^q = g_2^q = 1 (mod p) and the value of s to verify in Z_p that

g_1^{y_1} g_2^{y_2} s^r = g_1^{k_1 + a_1 r} g_2^{k_2 + a_2 r} s^r = g_1^{k_1} g_2^{k_2} = γ (mod p).

The agreement of the numbers g_1^{y_1} g_2^{y_2} s^r and γ in Z_p begins to convince Bob that maybe he really is talking to Alice after all.
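A sketch of one round of this protocol with toy parameters p = 23, q = 11, g_1 = 2, g_2 = 3 (all hypothetical choices for illustration; a real deployment would use cryptographically large p and q):

```python
# Okamoto authentication sketch: q = 11 divides p - 1 = 22, and
# g1 = 2, g2 = 3 both have order 11 in Z_23.

p, q = 23, 11
g1, g2 = 2, 3
assert pow(g1, q, p) == 1 and pow(g2, q, p) == 1

a1, a2 = 3, 5                                       # Alice's secret exponents
s = pow(g1, -a1 % q, p) * pow(g2, -a2 % q, p) % p   # her signature

r = 7                                               # Bob's challenge
k1, k2 = 3, 8                                       # Alice's random numbers
y1, y2 = (k1 + a1 * r) % q, (k2 + a2 * r) % q
gamma = pow(g1, k1, p) * pow(g2, k2, p) % p

# Bob's check: g1^y1 g2^y2 s^r = gamma (mod p)
assert pow(g1, y1, p) * pow(g2, y2, p) * pow(s, r, p) % p == gamma
```

The check succeeds because the exponents of g_1 and g_2 only matter mod q, and y_i − a_i r ≡ k_i (mod q).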

But the question is, is it possible that a third party, say Charlie, is impersonating Alice, having somehow devised a clever algorithm to determine numbers y′_1 and y′_2 that fool Bob into thinking the sender is Alice? We now show that impersonation under the Okamoto authentication scheme is as difficult as solving the discrete logarithm problem.

Suppose Charlie has managed to find a way to convince Bob that he is Alice. That is, without knowing Alice's secret key, he has figured out a clever way to generate two exponents y′_1 and y′_2 such that

g_1^{y′_1} g_2^{y′_2} s^{r′} = γ (mod p),

no matter what challenge r′ Bob sends him. Suppose he uses his algorithm again, for a distinct challenge r′′ ≠ r′, to determine exponents y′′_1 and y′′_2:

g_1^{y′′_1} g_2^{y′′_2} s^{r′′} = γ (mod p).

Then

g_1^{y′_1 − y′′_1} = g_2^{y′′_2 − y′_2} s^{r′′ − r′} (mod p),

and so

g_1^{(r′′−r′)^{−1}(y′_1 − y′′_1)} g_2^{(r′′−r′)^{−1}(y′_2 − y′′_2)} = s.

Charlie has thus determined two exponents a′_1 and a′_2 such that

s = g_1^{−a′_1} g_2^{−a′_2} (mod p).

It is not hard to show, when q is sufficiently large, that for virtually all possible challenges r′ and r′′, the resulting pair of exponents (a′_1, a′_2) will be distinct from Alice's exponents (a_1, a_2) (for example, see Stinson [1995]). Without loss of generality, we can relabel these exponents so that a_2 ≠ a′_2.


Together with Alice's original construction

s = g_1^{−a_1} g_2^{−a_2} (mod p),

Charlie's scheme would then constitute a computationally feasible algorithm for computing log_{g_1} g_2 in Z_p: we would know

g_1^{a_1 − a′_1} = g_2^{a′_2 − a_2} (mod p),

so that, on using the fact that a′_2 ≠ a_2,

log_{g_1} g_2 = (a′_2 − a_2)^{−1}(a_1 − a′_1).

Note that since g_1 and g_2 were chosen by the trusted authority, neither Alice nor Charlie had any prior knowledge of the value log_{g_1} g_2.

7.C.3 Digital Signature Standard

Another situation that arises frequently is where Bob hasn't yet gone to the trouble of creating a public key, but Alice would nevertheless like to send him an electronically signed document. Her electronic signature should guarantee not only that she is the sender, but also that the document hasn't been tampered with during transmission. One means for doing this is the Digital Signature Standard (DSS), proposed in 1991 by the U.S. government National Institute of Standards and Technology as a standard for electronically signing documents.

In the DSS, a large prime p is chosen such that p − 1 has a large prime factor q, and an element g of order q is chosen from Z_p. Alice chooses as her private key the random integer a in Z_q. Her public key is A = g^a. Alice first applies to her document x a hash function f, which is essentially a function that maps a long string of characters to a much shorter one, such that it is computationally infeasible to find another document x′ such that f(x′) = f(x) (even though such an x′ will likely exist). This makes it virtually impossible to tamper with the document without altering the hash value. The short string of characters given by the hash is converted to an integer h in Z_q.

Alice now chooses a random number k in Z_q and finds an integer s such that

sk = h + a g^k (mod q),

where g^k ∈ Z_p. She signs her document with the pair (g^k, s). Bob can verify the authenticity and integrity of the document he receives by first calculating its hash h. He then uses Alice's public key A to check that

g^{s^{−1} h} A^{s^{−1} g^k} = g^{s^{−1}(h + a g^k)} = g^k (mod p).

If this is indeed the case, then Bob is convinced that the contents and signature are genuine.
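A sketch with toy parameters p = 23, q = 11, g = 2 (hypothetical values for illustration; the actual standard mandates much larger parameters and a specific hash function):

```python
# DSS-style signature sketch: q = 11 divides p - 1, g = 2 has order 11
# in Z_23, private key a, public key A = g^a.

p, q, g = 23, 11, 2
a = 3
A = pow(g, a, p)

def sign(h, k):
    """Solve sk = h + a*g^k (mod q); return the signature pair (g^k, s)."""
    gk = pow(g, k, p)
    s = (h + a * gk) * pow(k, -1, q) % q
    return gk, s

def verify(h, gk, s):
    """Check g^(s^-1 h) A^(s^-1 g^k) = g^k (mod p)."""
    si = pow(s, -1, q)
    return pow(g, si * h % q, p) * pow(A, si * gk % q, p) % p == gk

h, k = 9, 4          # hash value and Alice's random number (s must be nonzero)
gk, s = sign(h, k)
assert verify(h, gk, s)
```

Verification works because s^{−1}(h + a g^k) ≡ k (mod q), so the left-hand side collapses to g^k.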


7.C.4 Silver–Pohlig–Hellman Discrete Logarithm Algorithm

As it happens, a fast means for computing discrete logarithms in F_q, the Silver–Pohlig–Hellman algorithm, is known if all of the prime factors p of q − 1 are small. For this reason, care must be taken when choosing the size q of the field used in cryptographic schemes that rely on the Diffie–Hellman assumption.

To find x such that b^x = y in F_q, it suffices to find x mod p^{a_p} for each prime factor p of q − 1, where a_p is the number of p factors appearing in the prime factorization of q − 1. The Chinese remainder theorem can then be used to solve the simultaneous congruence problem that determines the value of x.

For each prime factor p, we first compute a table of the pth roots of unity b^{j(q−1)/p} for j = 0, 1, . . . , p − 1, noting that b^{q−1} = 1 in the field F_q.

To find x mod p^a, we attempt to compute each of the coefficients in the p-ary expansion of x mod p^a:

x = x_0 + x_1 p + . . . + x_{a−1} p^{a−1} mod p^a.

For example, to find x_0, we compute the pth root of unity y^{(q−1)/p}. But y^{(q−1)/p} = b^{x(q−1)/p} = b^{x_0(q−1)/p}, so in our table of pth roots, x_0 is just the j value corresponding to the root y^{(q−1)/p}.

To find x_1, we repeat the above procedure, replacing y with y_1 = y b^{−x_0} = b^{x−x_0}, which has the discrete logarithm x_1 p + . . . + x_{a−1} p^{a−1}. Since y_1 is evidently a pth power, we see that y_1^{(q−1)/p} = 1 in F_q and y_1^{(q−1)/p^2} = b^{(x−x_0)(q−1)/p^2} = b^{x_1(q−1)/p} is the pth root of unity corresponding to j = x_1. Continuing in this manner, we can compute each of the x_i values for i = 0, 1, . . . , a − 1. Once we have found x mod p^{a_p} for each prime factor p of q − 1, we can use the Chinese remainder theorem to find x itself.
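The digit-by-digit procedure can be sketched for the prime field F_13, where q − 1 = 12 = 2^2·3 and b = 2 is a primitive element (the CRT step is done by brute-force search, which suffices for toy sizes):

```python
# Silver-Pohlig-Hellman sketch for a prime field F_q (q = qm1 + 1).
# `factors` maps each prime p dividing q - 1 to its exponent a_p.

def sph(b, y, qm1, factors):
    q = qm1 + 1
    residues = []
    for p, ap in factors.items():
        # table of p-th roots of unity b^(j(q-1)/p), j = 0, ..., p-1
        table = {pow(b, j * qm1 // p, q): j for j in range(p)}
        x, yi = 0, y
        for i in range(ap):                    # extract the p-ary digits x_i
            root = pow(yi, qm1 // p**(i + 1), q)
            xi = table[root]
            x += xi * p**i
            yi = yi * pow(b, -xi * p**i % qm1, q) % q   # divide out b^(x_i p^i)
        residues.append((x, p**ap))            # x mod p^(a_p)
    # Chinese remainder theorem (brute force, fine for toy sizes)
    return next(t for t in range(qm1)
                if all(t % m == r for r, m in residues))

x = sph(2, 11, 12, {2: 2, 3: 1})
assert pow(2, x, 13) == 11
```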

7.D Cryptographic Error-Correcting Codes

We conclude with an interesting cryptographic application of error-correcting codes due to McEliece [1978]. The receiver selects a block size k and a private key consisting of an [n, k, 2t+1] linear code C with generator matrix G, a k×k nonsingular scrambler matrix S, and an n×n random permutation matrix P. He then constructs the k×n matrix K = SGP as his public key. A sender encodes each message block m as

c = c_e(m) = mK + z,

where z is a random error vector of length n and weight no more than t. The receiver then computes

cP^{−1} = (mK + z)P^{−1} = (mSGP + z)P^{−1} = mSG + zP^{−1}.

Since the weight of zP^{−1} is no more than t, he can use the code C to decode the vector mSG + zP^{−1} to the codeword mS. After multiplication on the right by S^{−1}, he recovers the original message m.
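A degenerate sketch of this scheme using the trivial [3, 1, 3] repetition code (t = 1, decoding by majority vote) in place of the large Goppa codes used in practice:

```python
# McEliece sketch over F_2 with the [3, 1, 3] repetition code.
import random

G = [[1, 1, 1]]                          # generator matrix, k = 1, n = 3
S = [[1]]                                # 1x1 nonsingular scrambler (S^-1 = S)
P = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]    # random permutation matrix

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % 2
             for col in zip(*B)] for row in A]

K = mat_mul(mat_mul(S, G), P)            # public key K = SGP

def encode(m):
    c = mat_mul([[m]], K)[0]
    c[random.randrange(3)] ^= 1          # add a random weight-1 error z
    return c

def decode(c):
    Pinv = [list(row) for row in zip(*P)]     # P^-1 = P^T for a permutation
    v = mat_mul([c], Pinv)[0]                 # mSG + zP^-1
    return 1 if sum(v) >= 2 else 0            # majority vote decodes C; S^-1 = S

for m in (0, 1):
    assert decode(encode(m)) == m
```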

Appendix A

Finite Fields

Theorem A.1 (Z_n): The ring Z_n is a field ⇐⇒ n is prime.

Proof:

"⇒" Let Z_n be a field. If n = ab, with 1 < a, b < n, then b = a^{−1}ab = a^{−1}n = 0 (mod n), a contradiction. Hence n must be prime.

"⇐" Let n be prime. Since Z_n has a unit and is commutative, we need only verify that each element a ≠ 0 has an inverse. Consider the elements ia, for i = 1, 2, . . . , n − 1. Each of these elements must be nonzero, since neither i nor a is divisible by the prime number n. These n − 1 elements are distinct from each other since, for i, j ∈ {1, 2, . . . , n − 1},

ia = ja ⇒ (i − j)a = 0 (mod n) ⇒ n | (i − j)a ⇒ n | (i − j) ⇒ i = j.

Thus, the n − 1 elements a, 2a, . . . , (n − 1)a must be equal to the n − 1 elements 1, 2, . . . , n − 1 in some order. One of them, say ia, must be equal to 1. That is, a has inverse i.
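Theorem A.1 can be checked by brute force for small n:

```python
# Z_n is a field exactly when every nonzero element has a multiplicative
# inverse; brute-force check of Theorem A.1 for small moduli.

def is_field(n):
    return all(any(a * b % n == 1 for b in range(1, n))
               for a in range(1, n))

assert is_field(7) and is_field(11)          # prime n: every element invertible
assert not is_field(6) and not is_field(9)   # composite n: zero divisors exist
```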

Definition: The order of a finite field F is the number of elements in F.

Theorem A.2 (Subfield Isomorphic to Z_p): Every finite field has order equal to a power of a prime p and contains a subfield isomorphic to Z_p.

Proof: Let 1 (one) denote the (unique) multiplicative identity in F, a field of order n. The element 1 + 1 must be in F, so label this element 2. Similarly 2 + 1 ∈ F, which we label by 3. We continue in this manner until the first time we encounter an element k to which we have already assigned a label ℓ (F is a finite field): the sum of k ones equals the sum of ℓ ones, with k > ℓ. Hence the sum of p = k − ℓ ones must be the additive identity, 0. If p is composite, p = ab, then the product of the elements that we have labelled a and b would be 0, contradicting the fact that F is a field. Thus p must be prime, and the set of numbers that we have labelled {0, 1, 2, . . . , p − 1} is isomorphic to the field Z_p. Now consider all subsets {x_1, . . . , x_r} of linearly independent elements of F, in the sense that

a_1 x_1 + a_2 x_2 + . . . + a_r x_r = 0 ⇒ a_1 = a_2 = . . . = 0, where a_i ∈ Z_p.

There must be at least one such subset having a maximal number of elements. Then, if x is any element of F, the elements {x, x_1, . . . , x_r} cannot be linearly independent, so that x can be written as a linear combination of {x_1, . . . , x_r}. Thus {x_1, . . . , x_r} forms a basis for F, so that the elements of F may be uniquely identified by all possible values of the coefficients a_1, a_2, . . . , a_r. Since there are p choices for each of the r coefficients, there are exactly p^r distinct elements in F.

Corollary A.2.1 (Isomorphism to Z_p): Any field F with prime order p is isomorphic to Z_p.

Proof: Theorem A.2 says that the order p must be a power of a prime, which can only be p itself. It also says that F contains a subfield isomorphic to Z_p. Since the order of Z_p is already p, there are no other elements in F.

Theorem A.3 (Prime Power Fields): There exists a field F of order n ⇐⇒ n is a power of a prime.

Proof:

"⇒" This is implied by Theorem A.2.

"⇐" Let p be prime and g be an irreducible polynomial of degree r in the polynomial ring Z_p[x] (for a proof of the existence of such a polynomial, see van Lint [1991]). Recall that every polynomial can be written as a polynomial multiple of g plus a residue polynomial of degree less than r. The field Z_p[x]/g, which is just the residue class polynomial ring Z_p[x] (mod g), establishes the existence of a field with exactly p^r elements, corresponding to the p possible choices for each of the r coefficients of a polynomial of degree less than r.

• For example, we can construct a field with 8 = 2^3 elements using the polynomial g(x) = x^3 + x + 1 in Z_2[x]. Note that g is irreducible: the fact that g(c) = c^3 + c + 1 ≠ 0 for all c ∈ Z_2 implies that g(x) cannot have a linear factor (x − c). Alternatively, we can establish the irreducibility of g in Z_2[x] directly: if g(x) = (x^2 + Bx + C)(x + D) = x^3 + (B + D)x^2 + (C + BD)x + CD, then

CD = 1 ⇒ C = D = 1,

and hence

C + BD = 1 ⇒ B = 0,

which contradicts B + D = 0.

Since g is irreducible, if a and b are two polynomials in Z_2[x]/g, their product can be zero (mod g) only if one of them is itself zero. Thus, Z_2[x]/g is a field with exactly 2^3 = 8 elements, corresponding to the 2 possible choices for each of the 3 polynomial coefficients.
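This construction can be checked computationally, representing each residue polynomial as a 3-bit mask (e.g. 0b110 = x^2 + x):

```python
# The field Z_2[x]/(x^3 + x + 1): carryless multiplication of 3-bit
# polynomials, reducing by g(x) = x^3 + x + 1 (bit mask 0b1011).

def gf8_mul(a, b):
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0b1000:       # degree reached 3: reduce by g
            a ^= 0b1011
        b >>= 1
    return p

# Every nonzero element has a multiplicative inverse, so this is a field:
for a in range(1, 8):
    assert any(gf8_mul(a, b) == 1 for b in range(1, 8))
```
For instance, x · x^2 = x^3 = x + 1 (mod g), i.e. gf8_mul(0b010, 0b100) = 0b011.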

Definition: The Euler indicator or Euler totient function

ϕ(n) = |{m ∈ N : 1 ≤ m ≤ n, (m, n) = 1}|

is the number of positive integers less than or equal to n that are relatively prime to n (share no common factors with n).

• ϕ(p) = p − 1 for any prime number p.

• ϕ(p^r) = p^r − p^{r−1} for any prime number p and any r ∈ N, since p, 2p, 3p, . . ., (p^{r−1} − 1)p, and p^r all have a factor in common with p^r.

Remark: If we denote the set of integers in Z_n that are not zero divisors by Z*_n, we see that ϕ(n) = |Z*_n|.

• Here are the first 12 values of ϕ:

x      1  2  3  4  5  6  7  8  9 10 11 12
ϕ(x)   1  1  2  2  4  2  6  4  6  4 10  4

Remark: Note that

ϕ(1) + ϕ(2) + ϕ(3) + ϕ(6) = 1 + 1 + 2 + 2 = 6,
ϕ(1) + ϕ(2) + ϕ(3) + ϕ(4) + ϕ(6) + ϕ(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12,

and ϕ(1) + ϕ(p) = 1 + (p − 1) = p for any prime p.

Problem A.1: The Chinese remainder theorem implies that ϕ(mn) = ϕ(m)ϕ(n) whenever (m, n) = 1. Use this result to prove for any n ∈ N that

Σ_{d|n} ϕ(d) = n.
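Both the table above and the divisor-sum identity of Problem A.1 are easy to check numerically:

```python
# Brute-force Euler totient, the table of its first 12 values, and the
# divisor-sum identity sum_{d|n} phi(d) = n of Problem A.1.
from math import gcd

def phi(n):
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

assert [phi(x) for x in range(1, 13)] == [1, 1, 2, 2, 4, 2, 6, 4, 6, 4, 10, 4]

for n in (6, 12, 30):
    assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n
```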

Problem A.2: Consider the set

S_n = { k/n : 1 ≤ k ≤ n }.

(a) How many distinct elements does S_n contain?

(b) If k and n have a common factor, reduce the fraction k/n to m/d, where d divides n and (m, d) = 1, with 1 ≤ m ≤ d. For each d, express the number of possible values of m in terms of the Euler ϕ function.

(c) Obtain an alternative proof of the formula in Problem A.1 from parts (a) and (b).


Definition: The order of a nonzero element α of a finite field is the smallest natural number e such that α^e = 1.

Theorem A.4 (Primitive Element of a Field): The nonzero elements of any finite field can be written as powers of a single element.

Proof: Given a finite field F of order q, let 1 ≤ e ≤ q − 1. Either there exist no elements in F of order e, or there exists at least one element α of order e. In the latter case, α is a root of the polynomial x^e − 1 in F[x]; that is, α^e = 1. Hence (α^n)^e = (α^e)^n = 1 for n = 0, 1, 2, . . . . Since α has order e, we know that each of the roots α^n for n = 1, 2, . . . , e is distinct. Since x^e − 1 can have at most e zeros in F[x], we then immediately know the factorization of the polynomial x^e − 1 in F[x]:

x^e − 1 = (x − 1)(x − α)(x − α^2) . . . (x − α^{e−1}).

Thus, the only possible elements of order e in F are powers α^i for 1 ≤ i < e. However, if i and e share a common factor n > 1, then (α^i)^{e/n} = 1 and the order of α^i would be less than or equal to e/n. So this leaves only the elements α^i where (i, e) = 1 as possible candidates for elements of order e. Note that the e powers of α form a subgroup of the multiplicative group G formed by the nonzero elements of F, so Lagrange's Theorem implies that e must divide the order of G; that is, e | (q − 1).

Consequently, the number of elements of order e, where e divides q − 1, is either 0 or ϕ(e). If the number of elements of order e were 0 for some divisor e of q − 1, then the total number of nonzero elements in F would be less than

Σ_{e|(q−1)} ϕ(e) = q − 1,

which is a contradiction. Hence, there exist elements in F of any order e that divides q − 1, including q − 1 itself. The distinct powers of an element of order q − 1 are just the q − 1 nonzero elements of F.

Deﬁnition: An element of order q −1 in a ﬁnite ﬁeld F

q

is called a primitive element.

Remark: Theorem A.4 states that the elements of a finite field F_q can be listed in terms of a primitive element, say α:

F_q = {0, α^0, α^1, α^2, . . . , α^{q−2}}.

Remark: The fact that all elements in a field F_q can be expressed as powers of a primitive element can be exploited whenever we wish to multiply two elements together. We can compute the product α^i α^j simply by determining which element can be expressed as α raised to the power (i + j) mod (q − 1), in exactly the same manner as one uses a table of logarithms to multiply real numbers.
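As an illustration, here is a Python sketch of log-table multiplication in the field F_16 = F_2[x]/(x^4 + x^3 + 1) used later in Table A.1. The representation of field elements as integer bit masks (bit k holding the coefficient of x^k) and all function names are our own choices, not prescribed by the notes:

```python
MOD, DEG = 0b11001, 4   # modulus x^4 + x^3 + 1 as a bit mask

def gf_mul(a, b):
    # carry-less polynomial multiplication over F_2, then reduction mod MOD
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    for s in range(p.bit_length() - 1, DEG - 1, -1):
        if (p >> s) & 1:
            p ^= MOD << (s - DEG)
    return p

# "antilog" table exp[i] = alpha^i for the primitive element alpha = x
exp = [1]
for _ in range(14):
    exp.append(gf_mul(exp[-1], 0b10))
log = {v: i for i, v in enumerate(exp)}   # discrete logarithms

def mul_via_logs(a, b):
    # multiply two field elements exactly as with a table of logarithms
    if a == 0 or b == 0:
        return 0
    return exp[(log[a] + log[b]) % 15]

# every product via logs agrees with direct polynomial multiplication
assert all(mul_via_logs(a, b) == gf_mul(a, b)
           for a in range(16) for b in range(16))
assert len(set(exp)) == 15   # x really is primitive: its powers exhaust F_16*
```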

Remark: A primitive element of a finite field F_{p^r} need not be unique. In fact, we see from the proof of Theorem A.4 that the number of such elements is ϕ(p^r − 1). Specifically, if α is a primitive element, then the powers α^i, for the ϕ(p^r − 1) values of i that are relatively prime to p^r − 1, are also primitive elements.


Remark: A primitive element α of F_q satisfies the equation α^{q−1} = 1, so that α^q = α, and has the highest possible order (q − 1). Note that (α^i)^{−1} = α^{q−1−i}.

Problem A.3: Show that every nonzero element β of a finite field F_q satisfies β^{q−1} = 1.

Remark: If α is a primitive element of F_q, then α^{−1} = α^{q−2} is also a primitive element of F_q, since (q − 2)(q − 1 − i) ≡ i (mod q − 1).

The fact that a primitive element α satisfies α^q = α leads to the following corollary of Theorem A.4.

Corollary A.4.1 (Cyclic Nature of Fields): Every element β of a finite field of order q is a root of the equation β^q − β = 0.

Remark: In particular, Corollary A.4.1 states that every element β in a finite field F_{p^r} is a root of some polynomial f(x) ∈ F_p[x].

Definition: Given an element β in a field F_{p^r}, the monic polynomial m(x) in F_p[x] of least degree with β as a root is called the minimal polynomial of β.

Theorem A.5 (Minimal Polynomial): Let β ∈ F_{p^r}. If f(x) ∈ F_p[x] has β as a root, then f(x) is divisible by the minimal polynomial of β.

Proof: Let m(x) be the minimal polynomial of β. If f(β) = 0, then expressing

f(x) = q(x)m(x) +r(x) with deg r < deg m, we see that r(β) = 0. By the minimality

of deg m, we see that r(x) is identically zero.

Corollary A.5.1 (Minimal Polynomials Divide x^q − x): The minimal polynomial of an element of a field F_q divides x^q − x.

Corollary A.5.2 (Irreducibility of Minimal Polynomial): Let m(x) be a monic polynomial in F_p[x] that has β as a root. Then m(x) is the minimal polynomial of β ⇐⇒ m(x) is irreducible in F_p[x].

Proof:

“⇒” If m(x) = a(x)b(x), where a and b are of smaller degree, then

a(β)b(β) = 0 implies that a(β) = 0 or b(β) = 0; this would contradict the

minimality of deg m. Thus m(x) is irreducible.

“⇐” Since m(β) = 0, we know from Theorem A.5 that m(x) is divisible

by the minimal polynomial of β. But since m(x) is irreducible and monic,

the minimal polynomial must be m(x) itself.

Definition: A primitive polynomial of a field is the minimal polynomial of a primitive element of the field.

Q. How do we find the minimal polynomial of an element α^i in the field F_{p^r}?

A. The following theorems provide some assistance.


Theorem A.6 (Functions of Powers): If f(x) ∈ F_p[x], then f(x^p) = [f(x)]^p.

Proof: Exercise.
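For p = 2 the theorem can be spot-checked by brute force. In this sketch (our own representation: a polynomial over F_2 is an integer whose bit k is the coefficient of x^k), we verify f(x^2) = [f(x)]^2 for every polynomial of degree less than 8:

```python
def pmul(a, b):
    # multiplication in F_2[x]: carry-less product, no modular reduction
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p

def compose_x_squared(f):
    # f(x^2): the coefficient of x^k moves to x^(2k)
    g, k = 0, 0
    while f:
        if f & 1:
            g |= 1 << (2 * k)
        f >>= 1
        k += 1
    return g

# check the identity f(x^2) = f(x)^2 for every polynomial of degree < 8
assert all(pmul(f, f) == compose_x_squared(f) for f in range(256))
```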

Corollary A.6.1 (Root Powers): If α is a root of a polynomial f(x) ∈ F_p[x], then α^p is also a root of f(x).

Theorem A.7 (Reciprocal Polynomials): In a finite field F_{p^r} the following statements hold:

(a) If α ∈ F_{p^r} is a nonzero root of f(x) ∈ F_p[x], then α^{−1} is a root of the reciprocal polynomial of f(x).

(b) A polynomial is irreducible ⇐⇒ its reciprocal polynomial is irreducible.

(c) A polynomial is a minimal polynomial of a nonzero element α ∈ F_{p^r} ⇒ a scalar multiple of its reciprocal polynomial is a minimal polynomial of α^{−1}.

(d) A polynomial is primitive ⇒ a scalar multiple of its reciprocal polynomial is primitive.

Proof: Exercise.
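Part (b) can be illustrated with the two degree-4 polynomials that appear in the example below: x^4 + x^3 + 1 and its reciprocal x^4 + x + 1. This sketch (the integer bit-mask representation and the brute-force irreducibility test are our own choices) checks that both are irreducible over F_2:

```python
def reciprocal(f, deg):
    # reverse the coefficient string of a degree-deg polynomial over F_2
    return int(format(f, '0%db' % (deg + 1))[::-1], 2)

def prem(a, b):
    # remainder of a divided by b in F_2[x]
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def irreducible(f, deg):
    # brute force: f has no divisor of degree 1 .. deg-1
    return all(prem(f, g) != 0
               for d in range(1, deg)
               for g in range(1 << d, 1 << (d + 1)))

f = 0b11001                          # x^4 + x^3 + 1
assert reciprocal(f, 4) == 0b10011   # x^4 + x + 1
assert irreducible(f, 4) and irreducible(reciprocal(f, 4), 4)
```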

Problem A.4: Show that the pth powers of distinct elements of a field F_{p^r} are distinct.

Problem A.5: Show that the only elements a of F_{p^r} that satisfy a^{p−1} = 1 in F_{p^r} are the p − 1 nonzero elements of F_p.

Suppose we want to find the minimal polynomial m(x) of α^i in F_{p^r}. Identify the set of distinct elements {α^i, α^{ip}, α^{ip^2}, . . .}. The exponents of α in this set, taken modulo p^r − 1, form the cyclotomic coset of i. Suppose there are s distinct elements in this set. By Corollary A.6.1, each of these elements is a distinct root of m(x), and so the monic polynomial

f(x) = ∏_{k=0}^{s−1} (x − α^{ip^k})

is certainly a factor of m(x).

Notice that the pth power of every root α^{ip^k} of f(x) is another such root. The coefficients a_k in the expansion f(x) = ∑_{k=0}^{s} a_k x^k are symmetric functions of these roots, consisting of sums of products of the roots. When we raise these sums to the pth power we obtain symmetric sums of the pth powers of the products of the roots. Since the pth power of a root is another root and the coefficients a_k are invariant under root permutations, we deduce that a_k^p = a_k. It follows that each of the coefficients a_k belongs to the base field F_p. That is, on expanding all of the factors of f(x), all of the αs disappear! Hence f(x) ∈ F_p[x] and f(α^i) = 0, so by Theorem A.5, we know also that m(x) is a factor of f(x). Since f(x) is monic we conclude that m(x) = f(x).
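The construction just described can be carried out mechanically. The following Python sketch (the integer bit-mask representation of F_16 and all names are our own) expands ∏_k (x − α^{ip^k}) over the cyclotomic coset of i in F_2[x]/(x^4 + x^3 + 1) and recovers the minimal polynomials listed in Table A.1:

```python
MOD, DEG = 0b11001, 4   # field F_16 = F_2[x]/(x^4 + x^3 + 1); bit k = coeff of x^k

def gf_mul(a, b):
    # multiply in F_16: carry-less product over F_2, then reduce mod MOD
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    for s in range(p.bit_length() - 1, DEG - 1, -1):
        if (p >> s) & 1:
            p ^= MOD << (s - DEG)
    return p

exp = [1]
for _ in range(14):
    exp.append(gf_mul(exp[-1], 0b10))   # powers of the primitive element x

def coset(i, n=15, p=2):
    # cyclotomic coset of i: {i, ip, ip^2, ...} mod n
    c, j = [], i % n
    while j not in c:
        c.append(j)
        j = j * p % n
    return c

def minimal_poly(i):
    # expand prod_k (y + alpha^(i p^k)); coefficients listed lowest degree first
    m = [1]
    for j in coset(i):
        root, new = exp[j], [0] * (len(m) + 1)
        for k, c in enumerate(m):
            new[k + 1] ^= c              # y * m(y)
            new[k] ^= gf_mul(root, c)    # alpha^j * m(y); minus = plus in char 2
        m = new
    return m

assert minimal_poly(1) == [1, 0, 0, 1, 1]   # x^4 + x^3 + 1
assert minimal_poly(3) == [1, 1, 1, 1, 1]   # x^4 + x^3 + x^2 + x + 1
assert minimal_poly(5) == [1, 1, 1]         # x^2 + x + 1
assert minimal_poly(7) == [1, 1, 0, 0, 1]   # x^4 + x + 1
```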


Remark: Since the degree of the minimal polynomial m(x) of α^i equals the number of elements s in the cyclotomic coset of i, we can sometimes use the previous theorems to help us quickly determine m(x) without actually having to perform the above product. Note that, since p^r ≡ 1 (mod p^r − 1), minimal polynomials in F_{p^r} have degree s ≤ r.

Problem A.6: Show that the elements whose exponents belong to a particular cyclotomic coset share the same minimal polynomial and the same order.

Remark: Every primitive polynomial of F_{p^r} has degree r and each of its roots is a primitive element of F_{p^r}. This follows immediately from the distinctness of the elements α, α^p, α^{p^2}, . . . , α^{p^{r−1}} for a primitive element α of F_{p^r}, noting that α^{p^r} = α.

• We now find the minimal polynomial for each of the 16 elements of the field F_{2^4} = F_2[x]/(x^4 + x^3 + 1). It is straightforward to check that the polynomial x is a primitive element of the field (see Table A.1). Since x is a root of the irreducible polynomial x^4 + x^3 + 1 in F_2[x]/(x^4 + x^3 + 1), we know from Corollary A.5.2 that x^4 + x^3 + 1 is the minimal polynomial of x and hence is a primitive polynomial of F_16. The cyclotomic cosets consist of the exponents i2^k (mod 15) of each element α^i:

{1, 2, 4, 8},
{3, 6, 12, 9},
{5, 10},
{7, 14, 13, 11},
{0}.

The first cyclotomic coset corresponds to the primitive element α = x, for which the minimal polynomial is x^4 + x^3 + 1. This is also the minimal polynomial for the other powers of α in the cyclotomic coset containing 1, namely α^2, α^4, and α^8.

The reciprocal polynomial of x^4 + x^3 + 1 is x^4 + x + 1; this is the minimal polynomial of the inverse elements α^{−i} = α^{15−i} for i = 1, 2, 4, 8, that is, for α^{14}, α^{13}, α^{11}, and α^7. We see that these are just the elements corresponding to the second-last coset.

We can also easily find the minimal polynomial of α^3, α^6, α^{12}, and α^9. Since α^{15} = 1, we observe that α^3 satisfies the equation x^5 − 1 = 0. We can factorize x^5 − 1 = (x − 1)(x^4 + x^3 + x^2 + x + 1), and since α^3 ≠ 1, we know that α^3 must be a root of the remaining factor, x^4 + x^3 + x^2 + x + 1. Furthermore, since the cyclotomic coset corresponding to α^3 contains 4 elements, the minimal polynomial must have degree 4. So x^4 + x^3 + x^2 + x + 1 is in fact the minimal polynomial of α^3, α^6, α^9, and α^{12} (hence we have indirectly proven that x^4 + x^3 + x^2 + x + 1 is irreducible in F_2[x]).

Likewise, since the minimal polynomial of α^5 must be a factor of x^3 − 1 = (x − 1)(x^2 + x + 1) and have degree 2 (the coset {5, 10} contains two elements), we see that the minimal polynomial of α^5 and α^{10} is x^2 + x + 1.


Finally, the minimal polynomial of the multiplicative unit α^0 = 1 is just the first-degree polynomial x + 1. The minimal polynomial of 0 is x.

These results are summarized in Table A.1.

Element   Polynomial            Order   Minimal Polynomial
α^0       1                     1       x + 1
α^1       α                     15      x^4 + x^3 + 1
α^2       α^2                   15      x^4 + x^3 + 1
α^3       α^3                   5       x^4 + x^3 + x^2 + x + 1
α^4       α^3 + 1               15      x^4 + x^3 + 1
α^5       α^3 + α + 1           3       x^2 + x + 1
α^6       α^3 + α^2 + α + 1     5       x^4 + x^3 + x^2 + x + 1
α^7       α^2 + α + 1           15      x^4 + x + 1
α^8       α^3 + α^2 + α         15      x^4 + x^3 + 1
α^9       α^2 + 1               5       x^4 + x^3 + x^2 + x + 1
α^10      α^3 + α               3       x^2 + x + 1
α^11      α^3 + α^2 + 1         15      x^4 + x + 1
α^12      α + 1                 5       x^4 + x^3 + x^2 + x + 1
α^13      α^2 + α               15      x^4 + x + 1
α^14      α^3 + α^2             15      x^4 + x + 1

Table A.1: Nonzero elements of the field F_2[x]/(x^4 + x^3 + 1) expressed in terms of the primitive element α = x.
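The Order column of Table A.1 can also be verified directly: since α has order 15, α^i has order 15/gcd(i, 15). A short Python check (the bit-mask field representation is our own choice, as is the brute-force order computation):

```python
from math import gcd

MOD, DEG = 0b11001, 4   # F_16 = F_2[x]/(x^4 + x^3 + 1); bit k = coeff of x^k

def gf_mul(a, b):
    # carry-less product over F_2 followed by reduction mod MOD
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    for s in range(p.bit_length() - 1, DEG - 1, -1):
        if (p >> s) & 1:
            p ^= MOD << (s - DEG)
    return p

exp = [1]
for _ in range(14):
    exp.append(gf_mul(exp[-1], 0b10))   # powers of alpha = x

def order(a):
    # smallest e >= 1 with a^e = 1, by repeated multiplication
    e, p = 1, a
    while p != 1:
        p = gf_mul(p, a)
        e += 1
    return e

for i in range(1, 15):
    assert order(exp[i]) == 15 // gcd(i, 15)
assert order(exp[3]) == 5 and order(exp[5]) == 3   # as in Table A.1
```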

Remark: The cyclotomic cosets containing powers that are relatively prime to p^r − 1 contain the ϕ(p^r − 1) primitive elements of F_{p^r}; their minimal polynomials are primitive and have degree r. Note that x^4 + x^3 + 1 and x^4 + x + 1 are primitive polynomials of F_2[x]/(x^4 + x^3 + 1) and their roots comprise the ϕ(15) = ϕ(5)ϕ(3) = 4 · 2 = 8 primitive elements of F_{2^4}. Even though the minimal polynomial of the element α^3 also has degree r = 4, it is not a primitive polynomial, since (α^3)^5 = 1.

Remark: There is another interpretation of finite fields, as demonstrated by the following example. Consider the field F_4 = F_2[x]/(x^2 + x + 1), which contains the elements {0, 1, x, x + 1}. Since the primitive element α = x satisfies the equation x^2 + x + 1 = 0, we could, using the quadratic formula, think of α as the complex number

α = (−1 + √(1 − 4))/2 = −1/2 + i√3/2.

The other root of the equation x^2 + x + 1 = 0 is the complex conjugate ᾱ of α. That is, x^2 + x + 1 = (x − α)(x − ᾱ). From this it follows that 1 = αᾱ = |α|^2, and hence α = e^{iθ} = cos θ + i sin θ for some real number θ. In fact, we see that θ = 2π/3. Thus α^3 = e^{3iθ} = e^{2πi} = 1. In this way, we have constructed a number α that is a primitive third root of unity, which is precisely what we mean when we say that α is a primitive element of F_4. The field F_4 may be thought of either as the set {0, 1, x, x + 1} or as the set {0, 1, e^{2πi/3}, e^{−2πi/3}}, as illustrated in Fig. A.1. Similarly, the field F_3 = {0, 1, 2} is isomorphic to {0, 1, −1} and F_5 = {0, 1, 2, 3, 4} is isomorphic to {0, 1, i, −1, −i}.

Figure A.1: A representation of the field F_4 in the complex plane.
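The complex-number interpretation is easy to confirm numerically; this small Python check (using the standard cmath module) verifies that ω = e^{2πi/3} is a root of x^2 + x + 1 and a primitive third root of unity:

```python
import cmath

# omega plays the role of the primitive element alpha = x of F_4
omega = cmath.exp(2j * cmath.pi / 3)

assert abs(omega**2 + omega + 1) < 1e-12   # root of x^2 + x + 1
assert abs(omega**3 - 1) < 1e-12           # primitive third root of unity
assert abs(abs(omega) - 1) < 1e-12         # lies on the unit circle
```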

Bibliography

[Buchmann 2001] J. A. Buchmann, Introduction to Cryptography, Springer, New York, 2001.

[Hill 1997] R. Hill, A First Course in Coding Theory, Oxford University Press, Oxford, 1997.

[Koblitz 1994] N. Koblitz, A Course in Number Theory and Cryptography, Springer, New York, 2nd edition, 1994.

[Lin & Costello 2004] S. Lin & D. J. Costello, Error Control Coding, Pearson Prentice Hall, Upper Saddle River, New Jersey, 2nd edition, 2004.

[Ling & Xing 2004] S. Ling & C. Xing, Coding Theory: A First Course, Cambridge University Press, Cambridge, 2004.

[Mollin 2001] R. A. Mollin, An Introduction to Cryptography, Chapman & Hall/CRC, Boca Raton, Florida, 2001.

[Pless 1989] V. Pless, Introduction to the Theory of Error-Correcting Codes, Wiley, New York, 2nd edition, 1989.

[Rosen 2000] K. H. Rosen, Elementary Number Theory and its Applications, Addison-Wesley, Reading, Massachusetts, 4th edition, 2000.

[Stinson 1995] D. R. Stinson, Cryptography: Theory and Practice, CRC Press, Boca Raton, Florida, 1995.

[van Lint 1991] J. van Lint, Introduction to Coding Theory, Springer, Berlin, 3rd edition, 1991.

[Welsh 2000] D. Welsh, Codes and Cryptography, Oxford University Press, Oxford, 2000.

Index

(n, M, d) code, 8

A_q(n, d), 10
F_q[x], 46

[n, k, d] code, 21

[n, k] code, 21

|C|, 24
≐, 6

q-ary symmetric channel, 5

aﬃne cipher, 67

alphabet, 6

authentication, 75

balanced block design, 16

basis vectors, 21

BCH code, 54

binary, 7

binary adders, 45

binary ascending form, 38

binary code, 6

binary codewords, 6

binary Hamming code, 36

binary symmetric, 5

bit, 5

Block substitution ciphers, 67

blocks, 16

Bose–Chaudhuri–Hocquenghem (BCH) codes, 54

check polynomial, 50

Chinese remainder theorem, 71

ciphers, 66

code, 7

codeword, 6

correct, 9

coset, 24

coset leaders, 24

cryptosystem, 66

cyclic, 45

cyclotomic coset, 84

Data Encryption Standard (DES), 68

design distance, 54

detect, 9

diﬀerential cryptanalysis, 69

Diﬃe–Hellman assumption, 74

Digital Signature Standard, 77

Digraph ciphers, 67

discrete logarithm, 74

ElGamal Cryptosystem, 75

encode, 23

envelopes, 69

equivalent, 11, 23

error locator polynomial, 55

error polynomial, 55

error vector, 25

Euler indicator, 81

Euler totient, 81

extended Golay, 41

Feistel, 68

ﬂip-ﬂops, 45

frequency analysis, 67

generate, 21

generated, 46

generator matrix, 22

generator polynomial, 47

Hamming bound, 13

Hamming code, 36

Hamming distance, 7


hash, 77

Hill cipher, 67

ideal, 46

incidence matrix, 17

information digits, 38

irreducible, 48

linear block, 67

linear code, 21

linear cryptanalysis, 69

man-in-the-middle, 69

metric, 7

minimal polynomial, 52, 83

minimum distance, 8

minimum weight, 22

monic, 46

monoalphabetic substitution ciphers, 67

nearest-neighbour decoding, 7

null space, 27

Okamoto authentication scheme, 75

one-way function, 74

order, 79, 82

parity-check digits, 38

parity-check matrix, 27

perfect code, 14

permutation cipher, 68

permutation matrix, 78

points, 16

polyalphabetic substitution ciphers, 67

polynomial ring, 46

primitive BCH code, 54

primitive element, 51, 82

primitive polynomial, 52, 83

principal ideal, 46

Principal Ideal Domain, 46

public-key cryptosystems, 66

Rabin Public-Key Cryptosystem, 74

rate, 26

reciprocal polynomial, 51

reduced echelon form, 23

redundancy, 27

Reed–Solomon, 61

repetition code, 6

residue class ring, 46

Rijndael Cryptosystem, 69

Rivest–Shamir–Adleman (RSA) Cryptosystem, 69

Schnorr, 75

scrambler matrix, 78

self-dual, 42

self-orthogonal, 42

shift cipher, 66

shift registers, 45

Silver–Pohlig–Hellman algorithm, 78

simple substitution ciphers, 67

singleton, 61

size, 7

Slepian, 24

span, 21

sphere-packing, 13

Sphere-Packing Bound, 13

standard array, 24

standard form, 23, 28

symmetric, 17

symmetric matrix, 42

symmetric-key cryptosystems, 66

syndrome, 28, 55

syndrome decoding, 28

ternary, 7

trapdoor property, 74

triangle inequality, 7

Triple-DES, 69

trivially perfect codes, 14

Vandermonde matrix, 58

weight, 8

weighted check sum, 18

**c 2002–5 John C. Bowman ALL RIGHTS RESERVED
**

Reproduction of these lecture notes in any form, in whole or in part, is permitted only for nonproﬁt, educational use.

Contents

Preface 1 Introduction 1.A Error Detection and Correction . . . . . . . . . . . . . . . . . . . . . 1.B Balanced Block Designs . . . . . . . . . . . . . . . . . . . . . . . . . 1.C The ISBN Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Linear Codes 2.A Encoding and Decoding . . . . . . . . . . . . . . . . . . . . . . . . . 2.B Syndrome Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Hamming Codes 4 Golay Codes 5 Cyclic Codes 6 BCH Codes 7 Cryptographic Codes 7.A Symmetric-Key Cryptography . . . . . . . . . . . . . . . . . 7.B Public-Key Cryptography . . . . . . . . . . . . . . . . . . . 7.B.1 RSA Cryptosystem . . . . . . . . . . . . . . . . . . . 7.B.2 Rabin Public-Key Cryptosystem . . . . . . . . . . . . 7.C Discrete Logarithm Schemes . . . . . . . . . . . . . . . . . . 7.C.1 Diﬃe–Hellman Key Exchange . . . . . . . . . . . . . 7.C.2 Okamoto Authentication Scheme . . . . . . . . . . . 7.C.3 Digital Signature Standard . . . . . . . . . . . . . . . 7.C.4 Silver–Pohlig–Hellman Discrete Logarithm Algorithm 7.D Cryptographic Error-Correcting Codes . . . . . . . . . . . . A Finite Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 5 6 16 18 21 23 27 36 41 45 54 66 66 69 69 74 74 74 75 77 78 78 79

3

I would like to thank my colleagues.net). Professors Hans Brungs.Preface These lecture notes are designed for a one-semester course on error-correcting codes and cryptography at the University of Alberta. on which these notes are partially based (in addition to the references listed in the bibliography).sourceforge. Gerald Cliﬀ. and Ted Lewis. The ﬁgures in this text were drawn with the high-level vector graphics language Asymptote (freely available at http://asymptote. for their written notes and examples. 4 .

However. The theory of error-correcting codes originated with Claude Shannon’s famous 1948 paper “A Mathematical Theory of Communication” and has grown to connect to many areas of mathematics. a binary channel will be reliable only a fraction 1 − p of the time. we need p < 1 − 1/ 2 ≈ 0.) If a single bit is sent. it has become necessary to develop ways of detecting when an error has occurred and. Suppose you want to send the message “Yes” (denoted by 1) or “No” (denoted by 0) through a noisy communication channel. including algebra and combinatorics. The cleverness of the error-correcting schemes that have been developed since 1948 is responsible for the great reliability that we now enjoy in our modern communications networks. For example. computer systems.Chapter 1 Introduction In the modern era. The probability of one error occurring is 2p(1 − p) since there are two possible ways this could happen. (In a q-ary symmetric channel. the digits can take on any of q diﬀerent values and the errors in each digit occur independently and manifest themselves as the q − 1 other possible values with equal probability. 5 . Since transmission line and storage devices are not 100% reliable device. We assume that there is a uniform probability p < 1 that any particular binary digit (often called a bit) could be altered. the probability p2 of two errors occurring is very small. the transmission lines that we use for sending and receiving data and the magnetic media (and even semiconductor memory devices) that we use to store data are imperfect. This relies on the fact that if p is small. The probability of no errors occurring is (1 − p)2 . and universities all exchange enormous quantities of digitized information every day. While reception of the original message is more likely than any other √ particular result if p < 1/2. independent of whether or not any other bits are transmitted correctly. 
The simplest way of increasing the reliability of such transmissions is to send the message twice. the news media. corporations. digital information has become a valuable commodity. governments. This kind of transmission line is called a binary symmetric channel. and even compact disk players. correcting it.29 to be sure that the correct message is received most of the time. ideally.

if only one bit-ﬂip occurs. suppose that p = 0. . . and “110” would be correctly decoded as “Yes”. INTRODUCTION If the message 11 or 00 is received. A q-ary codeword is a ﬁnite sequence of symbols. More generally. where each symbol is chosen from the alphabet (set) Fq = {λ1 . The probability of the ﬁrst pattern occurring is (1 − p)3 and the probability for each of the next three possibilities is p(1 − p)2 . we can always guess. . the probability of a decoding error. what message was sent (it could with equal probability have been the message 00 or 11). If the message 01 or 10 is received we know for sure that an error has occurred. 1. or even reliably guessing. we introduce the following deﬁnitions. it would make more sense to send three. sending the entire message like this in triplicate is not an eﬃcient means of error correction. If errors tend to occur frequently. is very small. with good reliability what the original message was.001. q − 1}. . 2. λ2 . however this would now require a total of 4 bits of information to be sent. . λq }. Of course. Then. respectively. if p is small. . That is. the patterns “111”.6 CHAPTER 1. copies of the original data in a single message. we could simply ask the sender to retransmit the message. (We use the symbol = to emphasize a deﬁnition. Thus. we would expect with conditional probability (1 − p)2 (1 − p)2 + p2 that the sent message was “Yes” or “No”. . Hence the probability that the message is correctly decoded is (1 − p)3 + 3p(1 − p)2 = (1 − p)2 (1 + 2p) = 1 − 3p2 + 2p3 . suppose “111” is sent. n times . . either 0 or 2 errors have occurred. This kind of data encoding is known as a repetition code. we should send “111” for “Yes” or “000” for “No”. “101”. . Fq . Together they comprise a binary code. For example. Typically. Deﬁnition: Let q ∈ N. 1. Then of the eight possible received results. In other words.A Error Detection and Correction Despite the inherent simplicity of repetition coding. take Fq to be the set Zq = {0. 
Our goal is to ﬁnd optimal encoding and decoding schemes for reliable error correction of data sent through noisy transmission channels. For example. we will . The sequences “000” and “111” in the previous example are known as binary codewords. Triple-repetition decoding ensures that only about one bit in every 330 000 is garbled. so that on average one bit in every thousand is garbled. 3p2 −2p3 . . but we have no way of knowing. although the notation := is more common. instead of two. “011”. .) The codeword itself can be treated as a vector in the space Fqn = Fq × Fq × .

y) is a metric on Fqn since it is always non-negative and satisﬁes 1. • The set of all words in the English language is a code over the 26-letter alphabet {A. . while the third property. 2. we would require d(x. Deﬁnition: A q-ary code is a set of M codewords. If the codewords are suﬃciently diﬀerent. Remark: Notice that d(x. is just a ﬁnite sequence of 0s and 1s. . It is possible to use a code of over 82 million 10-digit telephone numbers (enough to meet the needs of the U. the correct connection can still be made. (as well as in North America)! Deﬁnition: We deﬁne the Hamming distance d(x. little thought was given to this. This important inequality is illustrated in Fig 1.K. • A ternary codeword. . A good code is one in which the codewords have little resemblance to each other. Z}. • The set of all 10-digit telephone numbers in the United Kingdom is a 10-ary code of length 10. d(x. So there is no way to guarantee accuracy. y). y ∈ Fqn . follows from the simple observation that d(x. Unfortunately. and as a result. y) for all x. z) + d(z. One important aspect of all error-correcting schemes is that the extra information that accomplishes this must itself be transmitted and is hence subject to the same kinds of errors as is the data. we will soon see that it is possible not only to detect errors but even to correct them. one simply attempts to make the probability of accurate decoding as high as possible. where M ∈ N is known as the size of the code. ERROR DETECTION AND CORRECTION 7 • A binary codeword.) such that if just one digit of any phone number is misdialed. B. y) between two codewords x and y of Fqn as the number of places in which they diﬀer. corresponding to the case q = 2. . z) + d(z. frequently misdialed numbers do occur in the U. 3. y) = 0 ⇐⇒ x = y.1. y) ≤ d(x. d(x. y) is the minimum number of digit changes required to change x to y. z ∈ Fqn .K. y. x) for all x. Thus d(x. y) ≤ d(x.1. and 2s. 
The ﬁrst two properties are immediate consequences of the deﬁnition. is just a ﬁnite sequence of 0s. d(x. . y) changes. z) + d(z. y) = d(y. where one maps the received vector back to the closest nearby codeword. known as the triangle inequality.A. whereas if we were to change x to y by ﬁrst changing x to z and then changing z to y. corresponding to the case q = 3. using nearest-neighbour decoding. 1s.

1: Triangle inequality. digit by digit. x − y is q computed mod q. containing M codewords and having minimum distance d. we can either . M. Remark: In view of the previous discussion. INTRODUCTION x Figure 1. Deﬁnition: An (n. Remark: Let x and y be codewords in Zn . x − y and xy are computed mod 2. we 2 2 see that the minimum distance of C3 is indeed 3. which 2 0 1 . Here. 4. We deﬁne the minimum distance d(C) of the code: d(C) = min{d(x. y. Here. z) ≤ d(x. With this code. 0 1 Upon considering each of the 4 = 4×3 = 6 pairs of distinct codewords (rows). Remark: Let x and y be binary codewords in Zn . y) : x. y) = w(x − y). y Remark: We can use property 2 to rewrite the triangle inequality as d(x.8 z CHAPTER 1. Then d(x. Then d(x. x = y}. • For example. z) ∀x. consisting are at least a distance 3 from each other: 0 0 0 0 0 1 1 0 C3 = 1 0 1 1 1 1 0 1 of four codewords from Z5 . y) − d(y. a good code is one with a relatively large minimum distance. 3) code. d) code is a code of length n. y) = w(x − y) = 2 w(x) + w(y) − 2w(xy). Deﬁnition: Let C be a code in Fqn . Deﬁnition: The weight w(x) of a q-ary codeword x is the number of nonzero digits in x. here is a (5. z ∈ Fqn . digit by digit. y ∈ C.

y) ≤ t.1 (Error Detection and Correction): In a symmetric channel with errorprobability p > 0. The following theorem (cf. Theorem 1. Proof: (i) “⇐” Suppose d(C) ≥ t + 1. If d(C) < t + 1.1. or (ii) detect and correct a single error (since.2) shows how this works in general. resulting in a new vector y ∈ Fqn satisfying . 1.. x S t y S x t t y Figure 1. Then d(x. Let a codeword x be transmitted such that t or fewer errors are introduced. (ii) “⇐” Suppose d(C) ≥ 2t + 1. “⇒” Suppose C can detect up to t errors. Hence d(C) ≥ t + 1. Let a codeword x be transmitted such that t or fewer errors are introduced. the received vector will still be closer to the transmitted codeword than to any other). if only a single error has occurred. ERROR DETECTION AND CORRECTION 9 (i) detect up to two errors (since the members of each pair of distinct codewords are more than a distance 2 apart). then there is some pair of codewords x and y with d(x. (ii) a code C can correct up to t errors in any codeword ⇐⇒ d(C) ≥ 2t + 1. Since it is possible to send the codeword x and receive another codeword y by the introduction of t errors. y) = w(x−y) ≤ t < t+1 ≤ d(C). so the received vector cannot be another codeword.2: Detection of up to t errors in a transmitted codeword x requires that all other codewords y lie outside a sphere S of radius t centered on x. contradicting our premise. Hence t errors can be detected.A. Correction of up to t errors requires that no sphere of radius t centered about any other codeword y overlaps with S. (i) a code C can detect up to t errors in every codeword ⇐⇒ d(C) ≥ t + 1. we conclude that C cannot detect t errors. Fig. resulting in a new vector y ∈ Fqn .

making it possible to identify the original transmitted codeword x correctly. (ii) Aq (n. The largest code with this property is the whole of Fqn . n) = q. x′ ) ≤ 2t. • Since we have already constructed a (5.1. x′ ) ≤ t. i. d) code. Hence d(C) ≥ 2t + 1.2 (Special Cases): For any values of q and n. If x′ is a codeword other than x. x) = t. INTRODUCTION d(x. x′) < d(y. 3) = 4. d) code has small n (for rapid message transmission). 2 Here ⌊x⌋ represents the greatest integer less than or equal to x. A primary goal of coding theory is to ﬁnd codes that optimize M for ﬁxed values of n and d. x′ ) implies that d(y. 1) = q n . This contradicts our premise. A2 (5. x) ≤ t. Proof: (i) When the minimum distance d = 1. large M (to maximize the amount of information transmitted). If d(C) < 2t + 1. then C can be used either (i) to detect up to d − 1 errors or (ii) to correct up to ⌊ d−1 ⌋ errors in any codeword. . construct a vector y by changing t of the digits of x that are in disagreement with x′ to their corresponding values in x′ . if t < d(x. x′) ≥ d(x. x′ ) ≥ 2t + 1 and the triangle inequality d(x. x′ ) ≤ 2t. “⇒” Suppose C can correct up to t errors. A good (n. But since d(y. so that 0 = d(y. there is some pair of distinct codewords x and x′ with distance d(x. We will soon see that 4 is in fact the maximum possible value of M. M. the received vector y cannot not be unambiguously decoded as x using nearest-neighbour decoding. let us ﬁrst consider the following special cases: Theorem 1. x′ ) − d(x. M. Hence the received vector y is closer to x than to any other codeword x′ . Corollary 1. 3) ≥ 4. x). it is possible to send the codeword x and receive the vector y due to t or fewer transmission errors. x′) ≤ d(y. let y = x′ . (i) Aq (n. 3) code. y) ≤ t. so that 0 < d(y. To help us tabulate Aq (n. If d(x. we know that A2 (5.10 CHAPTER 1. y) ≥ 2t + 1 − t = t + 1 > t ≥ d(y. 4. we require only that the codewords be distinct.e. x). 
d) be the largest value of M such that there exists a q-ary (n.1: If a code C has minimum distance d. x′) ≤ d(y. Otherwise. y) + d(y. x′ ) ≤ d(x. which has M = q n codewords. In either case. and large d (to be able to correct many errors. then d(x. d). Deﬁnition: Let Aq (n.

But d − 1 is odd. after establishing a useful simpliﬁcation. d) is not deﬁned if d > n. Note that d − 1 = d(C) ≤ ˆ ˆ ˆ d(C) ≤ d. Lemma 1. with d ≥ 2.1 (Maximum Code Size for Even d): If d is even.1. Let C be the code of length n obtained by extending each codeword x of C by adding a parity ˆ bit w(x) mod 2. q. the codewords that result are distinct and form a (n − 1. Delete this column from all codewords. Proof: Given an (n. since d(x. y) = w(x−y) ≤ n for distinct codewords x. Proof: “⇒” This follows from Lemma 1. y) = w(x) + w(y) − 2w(xy) must be even for every pair ˆ ˆ of codewords x and y in C. d) code. there is little advantage in considering codes with even d if the goal is error correction. M. Theorem 1. y) = d and choose any column where x and y diﬀer. Corollary 1. Thus C is a (n. we now compute the value A2 (5. beginning with the following deﬁnition. M. In particular. d − 1). M.1. In fact. ˆ “⇐” Suppose C is a binary (n − 1. n) code.A. in view of Theorem 1. d) code exists. we present values of A2 (n. ERROR DETECTION AND CORRECTION 11 (ii) When the minimum distance d = n. d) for n ≤ 16 and for odd values of d ≤ 7. d − 1) code. d − 1) code. we require that any two distinct codewords diﬀer in all n positions. there also exists an (n − 1. d − 1) code exists.1. 3) entered in Table 1.1. so d(C) is even. n) = q can actually be realized.1. let x and y be codewords such that d(x. then A2 (n. M. M. d) code. so the bound Aq (n. d) code exists ⇐⇒ a binary (n − 1. M. Since d ≥ 2. d − 1) code. In Table 1. M. d) = A2 (n − 1. . As an example. A q-ary repetition code of length n is an example of an (n. so there can be no more than q codewords. Then d(x.1 (Reduction Lemma): If a q-ary (n. M. This makes the weight w(ˆ) of every codeword x of C x ˆ even.3 (Even Values of d): Suppose d is even. so in fact d(C) = d. d) for odd d. This result means that we only need to calculate A2 (n. 
Deﬁnition: Two q-ary codes are equivalent if one can be obtained from the other by a combination of (A) permutation of the columns of the code.3. this means that the symbols appearing in the ﬁrst position must be distinct. y ∈ Fqn . Then a binary (n. This means that Aq (n. Remark: There must be at least two codewords for d(C) even to be deﬁned.

Remark: The distances between codewords are unchanged by each of these operations. That is, equivalent codes have the same (n, M, d) parameters, and in a q-ary symmetric channel the error-correction performance of equivalent codes will be identical.

• The binary code

0 1 0 1 0
1 1 1 1 1
0 0 1 0 0
1 0 0 0 1

is seen to be equivalent to our previous (5, 4, 3) code C3 by interchanging the first two columns and then relabelling 0 ↔ 1 in the first and fourth columns of the resulting matrix.

Lemma 1.2 (Zero Vector): Any code over an alphabet containing the symbol 0 is equivalent to a code containing the zero vector 0.

Proof: Given a code of length n, choose any codeword x1 x2 ... xn. For each i such that xi ≠ 0, apply the relabelling 0 ↔ xi to the symbols in the ith column.

• Armed with the above lemma and the concept of equivalence, it is now easy to prove that A2(5, 3) = 4. Let C be a (5, M, 3) code with M ≥ 4. Without loss of generality, we may assume that C contains the zero vector (if necessary, by replacing C with an equivalent code). Then there can be no codewords with just one or two 1s, since d = 3. Also, there can be at most one codeword with four or more 1s; otherwise there would be two codewords with at least three 1s in common positions, and hence less than a distance 3 apart. Since M ≥ 4, there must be at least two codewords containing exactly three 1s. By rearranging columns, if necessary, we see that the code contains the codewords

0 0 0 0 0
1 1 1 0 0
0 0 1 1 1

There is no way to add any more codewords containing exactly three 1s, and we can also now rule out the possibility of five 1s. A fourth codeword, if present in the above code, must therefore have exactly four 1s. The only possible position for the 0 symbol is the middle position, so the fourth codeword must be 11011. This means that there can be at most four codewords, so A2(5, 3) ≤ 4. Since we have previously shown that A2(5, 3) ≥ 4, we deduce that A2(5, 3) = 4. Moreover, the resulting code is equivalent to C3, so the (5, 4, 3) code is unique, up to equivalence.

n    d=3        d=5      d=7
5    4          2        -
6    8          2        -
7    16         2        2
8    20         4        2
9    40         6        2
10   72         12       2
11   144        24       4
12   256        32       4
13   512        64       8
14   1024       128      16
15   2048       256      32
16   2720-3276  256-340  36-37

Table 1.1: Maximum code size A2(n, d) for n ≤ 16 and odd d ≤ 7.

The above trial-and-error approach becomes impractical for large codes. In some of these cases, an important bound, known as the sphere-packing or Hamming bound, can be used to establish that a code is as large as possible for given values of n and d.

Lemma 1.3 (Counting): A sphere of radius t in F_q^n, with 0 ≤ t ≤ n, contains exactly

Σ_{k=0}^{t} (n choose k) (q − 1)^k

vectors.

Proof: The number of vectors that are a distance k from a fixed vector in F_q^n is (n choose k)(q − 1)^k, because there are (n choose k) choices for the k positions that differ from those of the fixed vector, and q − 1 values can be assigned independently to each of these k positions. Summing over the possible values of k, we obtain the desired result.

Theorem 1.4 (Sphere-Packing Bound): A q-ary (n, M, 2t + 1) code satisfies

M Σ_{k=0}^{t} (n choose k) (q − 1)^k ≤ q^n.    (1.1)

Proof: By the triangle inequality, any two spheres of radius t that are centered on distinct codewords will have no vectors in common. The total number of vectors in the M spheres of radius t centered on the M codewords is thus given by the left-hand side of the above inequality; this number can be no more than q^n, the total number of vectors in F_q^n.
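Lemma 1.3 and the bound (1.1) lend themselves to a quick numerical check; a Python sketch:

```python
from math import comb

def sphere_size(n, t, q=2):
    """Lemma 1.3: number of vectors within distance t of a fixed vector in F_q^n."""
    return sum(comb(n, k) * (q - 1)**k for k in range(t + 1))

def hamming_bound_holds(n, M, d, q=2):
    """Check inequality (1.1) for a putative (n, M, d) code with d = 2t + 1."""
    t = (d - 1) // 2
    return M * sphere_size(n, t, q) <= q**n

# For (5, 4, 3): 4 * (1 + 5) = 24 <= 32.  The bound alone would even allow M = 5.
```

For the (7, 16, 3) code constructed in Section 1.B, equality holds: 16(1 + 7) = 2^7, the hallmark of a perfect code.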

• For our binary (5, 4, 3) code, Eq. (1.1) gives the bound M(1 + 5) ≤ 2^5 = 32, which implies that A2(5, 3) ≤ 5. We have already seen that A2(5, 3) = 4. This emphasizes that just because some set of numbers {n, M, d, t} satisfies Eq. (1.1), there is no guarantee that such a code actually exists.

Definition: A perfect code is a code for which equality occurs in Eq. (1.1). For such a code, the M spheres of radius t centered on the codewords fill the whole space F_q^n completely, without overlapping.

Remark: The codes that consist of a single codeword (taking t = n and M = 1), the codes that contain all vectors of F_q^n (with t = 0 and M = q^n), and the binary repetition code of odd length n (with t = (n − 1)/2 and M = 2) are trivially perfect codes.

Problem 1.1: Prove that

Σ_{t=0}^{n} (n choose t) (q − 1)^t = q^n.

Each term in this sum is the number of vectors of weight t in F_q^n. When we sum over all possible values of t, we obtain q^n, the total number of vectors in F_q^n. Alternatively, we see directly from the Binomial Theorem that

Σ_{t=0}^{n} (n choose t) (q − 1)^t = (1 + (q − 1))^n = q^n.

Problem 1.2: Show that a q-ary (n, M, d) code must satisfy M ≤ q^(n−d+1). Hint: what can you say about the vectors obtained by deleting the last d − 1 positions of all codewords? It might help to first consider the special cases d = 1 and d = 2.

If you delete the last d − 1 positions of all codewords, the resulting vectors must be distinct, or else the corresponding codewords could not be a distance d apart from each other. Since the number of distinct q-ary vectors of length n − d + 1 is q^(n−d+1), the number of codewords M must be less than or equal to this number.

Problem 1.3: (a) Given an (n, M, d) q-ary code C, let Ni, for i = 0, 1, ..., q − 1, be the number of codewords ending with the symbol i. Prove that there exists some i for which Ni ≥ M/q.

This follows from the pigeon-hole principle: construct q boxes, one for each possible final digit. If we try to stuff the M codewords into the q boxes, at least one box must contain ⌈M/q⌉ codewords. Alternatively, one can establish this result with a proof by contradiction: suppose that Ni < M/q for each i = 0, ..., q − 1. One would then obtain the contradiction M = Σ_{i=0}^{q−1} Ni < Σ_{i=0}^{q−1} M/q = M.

(b) Let C′ be the code obtained by deleting the final symbol from each codeword in C. Show that C′ contains an (n − 1, ⌈M/q⌉, d) subcode (that is, C′ contains at least ⌈M/q⌉ codewords that are still a distance d from each other).

From part (a), we know that C contains at least ⌈M/q⌉ codewords ending in the same symbol. When we delete this last symbol from these codewords, they remain a distance d apart from each other; that is, they form an (n − 1, ⌈M/q⌉, d) subcode.

(c) Conclude that Aq(n, d) ≤ q Aq(n − 1, d).

Recall that Aq(n, d) is the largest value of M such that there exists a q-ary (n, M, d) code. We know from part (b) that we can construct an (n − 1, ⌈Aq(n, d)/q⌉, d) subcode from such a code. That is, Aq(n − 1, d) ≥ Aq(n, d)/q, or Aq(n, d) ≤ q Aq(n − 1, d).

Problem 1.4: Let C be a code with even distance d = 2m, and let t be the maximum number of errors that C can be guaranteed to correct.

(a) Express t in terms of m.

To correct t errors we need d ≥ 2t + 1, that is, t ≤ (d − 1)/2 = m − 1/2. The maximum value of t is therefore m − 1.

(b) Prove that C cannot be a perfect code.

Since d = 2m = 2t + 2, we know that there are codewords x and y with d(x, y) = 2t + 2. Consider the vector v obtained by changing t + 1 of those digits where x disagrees with y to the corresponding digits of y. Then d(v, x) = d(v, y) = t + 1, so v does not lie within the codeword spheres of radius t about x or y. If v were within a distance t of another codeword z, the triangle inequality would imply that d(x, z) ≤ d(x, v) + d(v, z) ≤ t + 1 + t = 2t + 1, contradicting the fact that the code has minimum distance 2t + 2. Thus v does not lie in any codeword sphere. But for a perfect code, each vector in F_q^n would have to be contained in exactly one of the M codeword spheres of radius t. Hence C is not a perfect code; that is, there is no integer M such that

M Σ_{k=0}^{t} (n choose k) (q − 1)^k = q^n.

1.B Balanced Block Designs

Definition: A balanced block design consists of a collection of b subsets, called blocks, of a set S of v points such that (i) each point lies in exactly r blocks; (ii) each block contains exactly k points; (iii) each pair of points occurs together in exactly λ blocks. Such a design is called a (b, v, r, k, λ) design.

• Let S = {1, 2, 3, 4, 5, 6, 7} and consider the subsets {1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {1, 5, 6}, {2, 6, 7}, {1, 3, 7} of S. Each number lies in exactly 3 blocks, each block contains exactly 3 numbers, and each pair of numbers occurs together in exactly 1 block. Hence these subsets form a (7, 7, 3, 3, 1) design. The six lines and the circle in Fig. 1.3 represent the blocks.

Figure 1.3: Seven-point plane.

Remark: The parameters (b, v, r, k, λ) are not independent. Consider the set of ordered pairs T = {(x, B) : x is a point, B is a block, x ∈ B}. Since each of the v points lies in exactly r blocks, there must be a total of vr such pairs in T. Alternatively, since there are b blocks and each block contains k points, we can form exactly bk such pairs. Thus bk = vr. Similarly, by considering the set U = {(x, y, B) : x, y are distinct points, B is a block, x, y ∈ B}, we deduce that bk(k − 1) = λv(v − 1), which, using bk = vr, simplifies to r(k − 1) = λ(v − 1).

Definition: A block design is symmetric if v = b (and hence k = r); that is, the number of points and the number of blocks are identical. For brevity, such a design is called a (v, k, λ) design.

• Our (7, 7, 3, 3, 1) design above is thus a (7, 3, 1) symmetric design.

Definition: The incidence matrix of a block design is the v × b matrix with entries aij = 1 if xi ∈ Bj and aij = 0 otherwise, where xi, i = 1, ..., v, are the design points and Bj, j = 1, ..., b, are the design blocks. For our (7, 3, 1) design, the incidence matrix A is

1 0 0 0 1 0 1
1 1 0 0 0 1 0
0 1 1 0 0 0 1
1 0 1 1 0 0 0
0 1 0 1 1 0 0
0 0 1 0 1 1 0
0 0 0 1 0 1 1

• We now construct a (7, 16, 3) binary code C consisting of the zero vector 0, the unit vector 1, the 7 rows a1, ..., a7 of A, and the 7 rows b1, ..., b7 of the matrix B obtained from A by the relabelling 0 ↔ 1:

0  = 0 0 0 0 0 0 0
1  = 1 1 1 1 1 1 1
a1 = 1 0 0 0 1 0 1
a2 = 1 1 0 0 0 1 0
a3 = 0 1 1 0 0 0 1
a4 = 1 0 1 1 0 0 0
a5 = 0 1 0 1 1 0 0
a6 = 0 0 1 0 1 1 0
a7 = 0 0 0 1 0 1 1
b1 = 0 1 1 1 0 1 0
b2 = 0 0 1 1 1 0 1
b3 = 1 0 0 1 1 1 0
b4 = 0 1 0 0 1 1 1
b5 = 1 0 1 0 0 1 1
b6 = 1 1 0 1 0 0 1
b7 = 1 1 1 0 1 0 0
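The design axioms and the parameter identities bk = vr and r(k − 1) = λ(v − 1) can be verified directly for the seven-point plane; a small Python check (block labels as in the example above):

```python
from itertools import combinations

# The blocks of the seven-point-plane design (Fig. 1.3).
blocks = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7},
          {1, 5, 6}, {2, 6, 7}, {1, 3, 7}]
points = range(1, 8)
b, v, k = len(blocks), len(list(points)), 3

# r: how many blocks contain each point; lam: how many blocks contain each pair.
r = {x: sum(x in B for B in blocks) for x in points}
lam = {pair: sum(set(pair) <= B for B in blocks)
       for pair in combinations(points, 2)}
```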

To find the minimum distance of this code, note that each row of A has exactly three 1s (since r = 3) and any two distinct rows of A have exactly one 1 in common (since λ = 1). Hence d(ai, aj) = 3 + 3 − 2(1) = 4 for i ≠ j, and likewise d(bi, bj) = 4 for i ≠ j. Furthermore, ai and bj disagree in precisely those places where ai and aj agree, so

d(ai, bj) = w(ai − bj) = w(1 − (ai − aj)) = w(1) + w(ai − aj) − 2w(ai − aj) = 7 − w(ai − aj) = 7 − d(ai, aj),

so d(ai, bj) = 7 − 4 = 3 for i ≠ j, while d(ai, bi) = 7. Also, d(0, ai) = 3, d(0, bi) = 4, d(1, ai) = 4, d(1, bi) = 3, and d(0, 1) = 7. Thus C is a (7, 16, 3) code, which in fact is perfect, since the equality in Eq. (1.1) is satisfied:

16 (1 + 7) = 128 = 2^7.

The existence of a perfect binary (7, 16, 3) code establishes A2(7, 3) = 16, so we have now established another entry of Table 1.1.

1.C The ISBN Code

Modern books are assigned an International Standard Book Number (ISBN), a 10-digit codeword, by the publisher. For example, Hill [1997] has the ISBN number 0-19-853803-0. The three hyphens separate the codeword into four fields. The first field specifies the language (0 means English), the second field indicates the publisher (19 means Oxford University Press), the third field (853803) is the book number assigned by the publisher, and the final digit (0) is a check digit. If the digits of the ISBN number are denoted x = x1 ... x10, then the check digit x10 is chosen as

x10 = Σ_{k=1}^{9} k xk (mod 11).

If x10 turns out to be 10, an X is printed in place of the final digit. The tenth digit serves to make the weighted check sum vanish:

Σ_{k=1}^{10} k xk = Σ_{k=1}^{9} k xk + 10 Σ_{k=1}^{9} k xk = 11 Σ_{k=1}^{9} k xk = 0 (mod 11).

So, if Σ_{k=1}^{10} k xk ≠ 0 (mod 11), we know that an error has occurred. In fact, the ISBN number is able to (i) detect a single error or (ii) detect a transposition error that results in two digits (not necessarily adjacent) being interchanged.
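The check-digit rule can be stated in a few lines of Python (a sketch; a real ISBN routine would also need to parse hyphens and a trailing 'X'):

```python
def isbn10_check_digit(first9):
    """x10 = sum_{k=1}^{9} k*x_k (mod 11); a value of 10 is printed as 'X'."""
    total = sum(k * x for k, x in enumerate(first9, start=1)) % 11
    return 'X' if total == 10 else str(total)

def isbn10_valid(digits):
    """Valid iff the weighted check sum sum_{k=1}^{10} k*x_k is 0 (mod 11)."""
    return sum(k * x for k, x in enumerate(digits, start=1)) % 11 == 0

hill = [0, 1, 9, 8, 5, 3, 8, 0, 3]   # 0-19-853803 from the example above
```

For the ISBN 0-19-853803-0 the computed check digit is indeed 0, and altering a single digit or transposing two unequal digits breaks the congruence, as the arguments below show.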

If a single error occurs, then some digit xj is received as xj + e, with e ≠ 0. The received weighted check sum is then Σ_{k=1}^{10} k xk + je = je (mod 11) ≠ 0 (mod 11), since j and e are nonzero.

Similarly, let y be the vector obtained by exchanging the digits xj and xk in an ISBN code x, where j ≠ k. Then

Σ_{i=1}^{10} i yi = Σ_{i=1}^{10} i xi + (k − j)xj + (j − k)xk = (k − j)(xj − xk) (mod 11) ≠ 0 (mod 11)

if xj ≠ xk.

In the above arguments we have used the property of the field Z11 (the integers modulo 11) that the product of two nonzero elements is always nonzero (since ab = 0 and a ≠ 0 ⇒ a^(−1)ab = 0 ⇒ b = 0). Note that Zab with a, b > 1 cannot be a field, because the product ab = 0 (mod ab) even though a ≠ 0 and b ≠ 0. Indeed, there can be no inverse a^(−1) in Zab, for otherwise b = a^(−1)ab = a^(−1)·0 = 0 (mod ab). In fact, Zp is a field ⇐⇒ p is prime. For this reason, the ISBN code is calculated in Z11 and not in Z10, where 2·5 = 0 (mod 10).

The ISBN code cannot be used to correct errors unless we know a priori which digit is in error. Suppose that we detect an error and we also know that it is the digit xj that is in error (and hence unknown). Then, assuming all of the other digits are correct, we can solve for the value of xj: since j xj + Σ_{k≠j} k xk = 0 (mod 11), we know that

xj = −j^(−1) Σ_{k≠j} k xk (mod 11).

To do this, we first need to construct a table of inverses modulo 11, using the Euclidean division algorithm. For example, let y be the inverse of 2 modulo 11. Then 2y = 1 (mod 11) implies 2y = 11q + 1, or 1 = −11q + 2y, for some integers y and q. On dividing 11 by 2 as we would to show that gcd(11, 2) = 1, we find 11 = 5·2 + 1, so that 1 = 11 − 5·2, from which we see that q = −1 and y = −5 (mod 11) = 6 (mod 11) are solutions. Similarly, 7^(−1) = 8 (mod 11): since 11 = 1·7 + 4, 7 = 1·4 + 3, and 4 = 1·3 + 1, we have 1 = 4 − 1·3 = 4 − 1·(7 − 1·4) = 2·4 − 1·7 = 2·(11 − 1·7) − 1·7 = 2·11 − 3·7. Thus −3·7 = −2·11 + 1; that is, 7 and −3 = 8 are inverses mod 11. The complete table of inverses modulo 11 is shown in Table 1.2.

x      1  2  3  4  5  6  7  8  9  10
x^(−1) 1  6  4  3  9  2  8  7  5  10

Table 1.2: Inverses modulo 11.
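Table 1.2 can be reproduced by running the Euclidean division algorithm exactly as in the examples above; a Python sketch:

```python
def inverse_mod(a, p):
    """Inverse of a modulo a prime p, via the extended Euclidean algorithm."""
    r0, r1, s0, s1 = p, a % p, 0, 1   # invariant: r_i = s_i * a (mod p)
    while r1 != 1:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    return s1 % p

table = {x: inverse_mod(x, 11) for x in range(1, 11)}
```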

For example, if we did not know the fourth digit x4 of the ISBN 0-19-_53803-0, we would calculate

x4 = −4^(−1) (1·0 + 2·1 + 3·9 + 5·5 + 6·3 + 7·8 + 8·0 + 9·3 + 10·0) (mod 11)
   = −3 (0 + 2 + 5 + 3 + 7 + 1 + 0 + 5 + 0) (mod 11) = −3(1) (mod 11) = 8,

which is indeed correct.

Problem 1.5: A smudge has obscured one of the digits of the ISBN code 0-8018-_739-1. Determine the unknown digit.

The sixth digit is

x6 = −6^(−1) (1·0 + 2·8 + 3·0 + 4·1 + 5·8 + 7·7 + 8·3 + 9·9 + 10·1) (mod 11)
   = −2 (0 + 5 + 0 + 4 + 7 + 5 + 2 + 4 + 10) (mod 11) = −2(4) (mod 11) = −8 (mod 11) = 3.

Problem 1.6: A smudge has obscured one of the digits of the ISBN code 0-393-051_0-X. Determine the unknown digit.

The eighth digit is

x8 = −8^(−1) (1·0 + 2·3 + 3·9 + 4·3 + 5·0 + 6·5 + 7·1 + 9·0 + 10·10) (mod 11)
   = −7 (0 + 6 + 5 + 1 + 0 + 8 + 7 + 0 + 1) (mod 11) = −7(6) (mod 11) = −9 (mod 11) = 2.
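The recovery formula xj = −j^(−1) Σ_{k≠j} k xk (mod 11) used in these problems is easily automated; a Python sketch (digits are passed as integers, with 10 standing for X and any placeholder value, here 0, in the obscured position, which is simply skipped):

```python
def recover_digit(digits, j):
    """Recover the (1-based) jth ISBN digit, assuming all others are correct.

    Implements x_j = -j^{-1} * sum_{k != j} k*x_k (mod 11).
    pow(j, -1, 11) computes the modular inverse (Python >= 3.8).
    """
    s = sum(k * x for k, x in enumerate(digits, start=1) if k != j)
    return (-pow(j, -1, 11) * s) % 11

# 0-19-_53803-0 with the fourth digit obscured (0 is a dummy placeholder):
example = [0, 1, 9, 0, 5, 3, 8, 0, 3, 0]
```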

Chapter 2

Linear Codes

An important class of codes are linear codes in the vector space F_q^n, where Fq is a field.

Definition: A linear code C is a code for which, whenever u ∈ C and v ∈ C, then αu + βv ∈ C for all α, β ∈ Fq. That is, C is a linear subspace of F_q^n.

Remark: A binary code C is linear ⇐⇒ it contains 0 and the sum of any two codewords in C is also in C.

Problem 2.1: Show that the (7, 16, 3) code developed in the previous chapter is linear.

Remark: The zero vector 0 automatically belongs to all linear codes.

Remark: A linear code C will always be a k-dimensional linear subspace of F_q^n for some integer k between 1 and n. A k-dimensional code C is simply the set of all linear combinations of k linearly independent codewords, called basis vectors. We say that these k basis codewords generate or span the entire code space C.

Definition: We say that a k-dimensional code in F_q^n is an [n, k] code, or, if we also wish to specify the minimum distance d, an [n, k, d] code.

Remark: A q-ary [n, k, d] code is an (n, q^k, d) code. To see this, let the k basis vectors of an [n, k, d] code be uj, for j = 1, ..., k. The q^k codewords are obtained as the linear combinations Σ_{j=1}^{k} aj uj; there are q possible values for each of the k coefficients aj. Note that

Σ_{j=1}^{k} aj uj = Σ_{j=1}^{k} bj uj  ⇒  Σ_{j=1}^{k} (aj − bj) uj = 0  ⇒  aj = bj for j = 1, ..., k,

by the linear independence of the basis vectors, so the q^k generated codewords are distinct.

Remark: Not every (n, q^k, d) code is a q-ary [n, k, d] code (it might not be linear).

Problem 2.2: Show that the (7, 16, 3) perfect binary code in Chapter 1 is a [7, 4, 3] linear code (note that 2^4 = 16) with generator matrix

a1   1 0 0 0 1 0 1
a2 = 1 1 0 0 0 1 0
a3   0 1 1 0 0 0 1
a4   1 0 1 1 0 0 0

Definition: The minimum weight of a code is w(C) = min{w(x) : x ∈ C, x ≠ 0}.

One of the advantages of linear codes is illustrated by the following lemma.

Lemma 2.1 (Distance of a Linear Code): If C is a linear code in F_q^n, then d(C) = w(C).

Proof: There exist codewords x, y, and z such that x ≠ y, z ≠ 0, d(x, y) = d(C), and w(z) = w(C). Then

d(C) ≤ d(z, 0) = w(z − 0) = w(z) = w(C) ≤ w(x − y) = d(x, y) = d(C),

so w(C) = d(C).

Remark: Lemma 2.1 implies that, for a linear code, we only have to examine the weights of the M − 1 nonzero codewords in order to find the minimum distance. In contrast, for a general nonlinear code, we need to make (M choose 2) = M(M − 1)/2 comparisons (between all possible pairs of distinct codewords) to determine the minimum distance.

Definition: A k × n matrix with rows that are basis vectors for a linear [n, k] code C is called a generator matrix of C.

• A q-ary repetition code of length n is an [n, 1, n] code with generator matrix [1 1 ... 1].

Remark: Linear q-ary codes are not defined unless q is a power of a prime (this is simply the requirement for the existence of the field Fq). However, lower-dimensional codes can always be obtained from linear q-ary codes by projection onto a lower-dimensional subspace of F_q^n. For example, the ISBN code is a subset of the 9-dimensional subspace of F11^10 consisting of all vectors perpendicular to the vector (1, 2, 3, 4, 5, 6, 7, 8, 9, 10), that is, of the space

{(x1 x2 ... x10) : Σ_{k=1}^{10} k xk = 0 (mod 11)}.

However, not all vectors in this set (for example X-00-000000-X) are valid ISBN codewords, so the ISBN code is not a linear code.
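Lemma 2.1 can be checked by brute force for small codes: enumerate the span of a generator matrix and compare the minimum nonzero weight with the minimum pairwise distance. A Python sketch, using the two rows 01101 and 10110 of the (5, 4, 3) code C3 as basis vectors:

```python
from itertools import combinations, product

def span(G):
    """All 2^k codewords generated by the rows of a binary generator matrix G."""
    n = len(G[0])
    return {tuple(sum(c * row[i] for c, row in zip(coeffs, G)) % 2
                  for i in range(n))
            for coeffs in product((0, 1), repeat=len(G))}

G3 = [(0, 1, 1, 0, 1), (1, 0, 1, 1, 0)]
C = span(G3)
w_min = min(sum(x) for x in C if any(x))                 # w(C)
d_min = min(sum(a != b for a, b in zip(u, v))            # d(C)
            for u, v in combinations(sorted(C), 2))
```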

For linear codes we must slightly restrict our definition of equivalence, so that the codes remain linear (e.g., in order that the zero vector remains in the code).

Definition: Two linear q-ary codes are equivalent if one can be obtained from the other by a combination of (A) permutation of the columns of the code; (B) multiplication of the symbols appearing in a fixed column by a nonzero scalar.

Definition: A k × n matrix of rank k is in reduced echelon form (or standard form) if it can be written as [1k | A], where 1k is the k × k identity matrix and A is a k × (n − k) matrix.

Remark: A generator matrix for a vector space can always be reduced to an equivalent reduced echelon form spanning the same vector space, by permutation of its rows, multiplication of a row by a nonzero scalar, or addition of one row to another. Any combination of these row operations with the column operations (A) and (B) above will generate equivalent linear codes.

Problem 2.3: Show that the generator matrix for the (7, 16, 3) perfect code in Chapter 1 can be written in reduced echelon form as

G = 1 0 0 0 1 0 1
    0 1 0 0 1 1 1
    0 0 1 0 1 1 0
    0 0 0 1 0 1 1

2.A Encoding and Decoding

An [n, k] linear code C contains q^k codewords, corresponding to q^k distinct messages. We identify each message with a k-tuple u = [u1 u2 ... uk], where the components ui are elements of Fq. We can encode u by multiplying it on the right with the generator matrix G. This maps u to the linear combination uG of the codewords; in particular, the message with components ui = δik gets mapped to the codeword appearing in the kth row of G.

• Given the message [0, 1, 0, 1] and the above generator matrix, the encoded codeword

[0 1 0 1] G = [0 1 0 1 1 0 0]

is just the sum of the second and fourth rows of G.
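Encoding is a single vector-matrix product over F2; the example message [0, 1, 0, 1] can be checked with a short Python sketch:

```python
def encode(u, G):
    """Map the message u to the codeword uG over F_2."""
    n = len(G[0])
    return [sum(ui * row[j] for ui, row in zip(u, G)) % 2 for j in range(n)]

# Standard-form generator matrix of the [7, 4, 3] code.
G = [[1, 0, 0, 0, 1, 0, 1],
     [0, 1, 0, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 0, 1, 1]]
```

Because G is in standard form, the first four bits of every codeword reproduce the message itself (cf. Problem 2.4 below).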

Problem 2.4: If a generator matrix for a linear [n, k] code is in standard form, show that the message vector is just the first k bits of the codeword.

Definition: Let C be a linear code in F_q^n and let a be any vector in F_q^n. The set a + C = {a + x : x ∈ C} is called a coset of C.

Lemma 2.2 (Equivalent Cosets): Let C be a linear code in F_q^n and a ∈ F_q^n. If b is an element of the coset a + C, then b + C = a + C.

Proof: Since b ∈ a + C, we have b = a + x for some x ∈ C. Consider any vector b + y ∈ b + C, with y ∈ C. Then b + y = (a + x) + y = a + (x + y) ∈ a + C, so b + C ⊂ a + C. Furthermore, a = b + (−x) ∈ b + C, so the same argument implies a + C ⊂ b + C. Hence b + C = a + C.

The following theorem from group theory states that F_q^n is just the union of q^(n−k) distinct cosets of a linear [n, k] code C, each containing q^k elements.

Theorem 2.1 (Lagrange's Theorem): Suppose C is an [n, k] code in F_q^n. Then (i) every vector of F_q^n is in some coset of C; (ii) every coset contains exactly q^k vectors; (iii) any two cosets are either equivalent or disjoint.

Proof: (i) a = a + 0 ∈ a + C for every a ∈ F_q^n. (ii) Since the mapping φ(x) = a + x is one-to-one, |a + C| = |C| = q^k; here |C| denotes the number of elements in C. (iii) Let a, b ∈ F_q^n and suppose that the cosets a + C and b + C have a common vector v = a + x = b + y, with x, y ∈ C. Then b = a + (x − y) ∈ a + C, so by Lemma 2.2, b + C = a + C.

Definition: The standard array (or Slepian array) of a linear [n, k] code C in F_q^n is a q^(n−k) × q^k array listing all the cosets of C, such that the entry in the (i, j)th position is the sum of the entries in positions (i, 1) and (1, j). The first row consists of the codewords in C themselves, listed with 0 appearing in the first column. Subsequent rows are listed one at a time, each beginning with a vector of minimal weight that has not already been listed in previous rows. The vectors in the first column of the array are referred to as coset leaders.

• Let us revisit our linear (5, 4, 3) code

C3 = 0 0 0 0 0
     0 1 1 0 1
     1 0 1 1 0
     1 1 0 1 1

with generator matrix

G3 = 0 1 1 0 1
     1 0 1 1 0

The standard array for C3 is an 8 × 4 array of cosets, listed here in three groups of increasing coset leader weight:

0 0 0 0 0   0 1 1 0 1   1 0 1 1 0   1 1 0 1 1

1 0 0 0 0   1 1 1 0 1   0 0 1 1 0   0 1 0 1 1
0 1 0 0 0   0 0 1 0 1   1 1 1 1 0   1 0 0 1 1
0 0 1 0 0   0 1 0 0 1   1 0 0 1 0   1 1 1 1 1
0 0 0 1 0   0 1 1 1 1   1 0 1 0 0   1 1 0 0 1
0 0 0 0 1   0 1 1 0 0   1 0 1 1 1   1 1 0 1 0

1 1 0 0 0   1 0 1 0 1   0 1 1 1 0   0 0 0 1 1
1 0 0 0 1   1 1 1 0 0   0 0 1 1 1   0 1 0 1 0

Definition: If the codeword x is sent, but the received vector is y, we define the error vector e = y − x.

Remark: If no more than t errors have occurred, the coset leaders of weight t or less are precisely the error vectors that can be corrected. Recall that the code C3, having minimum distance 3, can only correct one error. We decode a received vector by checking under which codeword it appears in the standard array. For example, if y = 10111 is received, we know, as long as no more than one error has occurred, that the error vector is e = 00001, and the transmitted codeword must have been x = y − e = 10111 − 00001 = 10110.

Remark: If two errors have occurred, one cannot determine the original vector with certainty, because each row with a coset leader of weight 2 actually contains two vectors of weight 2. For example, if 01110 is received, then either 01110 − 00011 = 01101 or 01110 − 11000 = 10110 could have been transmitted. For the same reason, the last two rows of the standard array for C3 could equally well have been written as

0 0 0 1 1   0 1 1 1 0   1 0 1 0 1   1 1 0 0 0
0 1 0 1 0   0 0 1 1 1   1 1 1 0 0   1 0 0 0 1

Remark: Let C be a binary [n, k] linear code, and let αi denote the number of coset leaders for C having weight i, for i = 0, ..., n. If p is the error probability for a single bit, then the probability Pcorr(C) that a received vector is correctly decoded is

Pcorr(C) = Σ_{i=0}^{n} αi p^i (1 − p)^(n−i).

Remark: If C can correct t errors, then the coset leaders of weight no more than t are unique, and hence the total number of such leaders of weight i is αi = (n choose i) for 0 ≤ i ≤ t. In particular, if n = t, then

Pcorr(C) = Σ_{i=0}^{n} (n choose i) p^i (1 − p)^(n−i) = (p + 1 − p)^n = 1.

Such a code is able to correct all possible errors (no matter how poor the transmission line is); however, since C then contains only a single codeword, it cannot be used to send any information!

Remark: For i > t, the coefficients αi can be difficult to calculate, since, as we have seen above, the rows in the standard array of coset leader weight greater than t can be written in more than one way. For a perfect code, however, we know that every vector is within a distance t of some codeword. Thus, the error vectors that can be corrected by a perfect code are precisely those vectors of weight no more than t, so that αi = (n choose i) for 0 ≤ i ≤ t and αi = 0 for i > t.

• For the code C3, we see that α0 = 1, α1 = 5, α2 = 2, and α3 = α4 = α5 = 0. Hence

Pcorr(C3) = (1 − p)^5 + 5p(1 − p)^4 + 2p^2(1 − p)^3 = (1 − p)^3 (1 + 3p − 2p^2).

For example, if p = 0.01, then Pcorr = 0.99921 and Perr = 1 − Pcorr = 0.00079, more than a factor 12 lower than the raw bit error probability p. This improvement in reliability comes at a price, however: we must now send n = 5 bits for every k = 2 information bits. The ratio k/n is referred to as the rate of the code. It is interesting to compare the performance of C3 with a code that sends two bits of information by using two back-to-back repetition codes, each of length 5, for which α0 = 1, α1 = 5, and α2 = 10. We find that Pcorr for such a code is

[(1 − p)^5 + 5p(1 − p)^4 + 10p^2(1 − p)^3]^2 = [(1 − p)^3 (1 + 3p + 6p^2)]^2 = 0.99998,
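The two performance figures quoted above can be reproduced numerically; a Python sketch:

```python
def p_corr(alphas, n, p):
    """P_corr = sum_i alpha_i p^i (1-p)^(n-i), with alphas = [alpha_0, alpha_1, ...]."""
    return sum(a * p**i * (1 - p)**(n - i) for i, a in enumerate(alphas))

p = 0.01
pc_C3 = p_corr([1, 5, 2], 5, p)          # C3: alpha_0 = 1, alpha_1 = 5, alpha_2 = 2
pc_rep = p_corr([1, 5, 10], 5, p) ** 2   # two back-to-back length-5 repetition codes
```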

so that Perr = 0.00002. While this error rate is almost four times lower than that for C3, bear in mind that the repetition scheme requires the transmission of twice as much data for the same number of information digits (i.e., it has half the rate of C3).

2.B Syndrome Decoding

The standard array for our (5, 4, 3) code had 32 entries; for a general code of length n, we would have to search through q^n entries every time we wished to decode a received vector. For codes of any reasonable length, this is not practical. Fortunately, there is a more efficient alternative, which we now describe.

Definition: Let C be an [n, k] linear code. The dual code C⊥ of C in F_q^n is the set of all vectors that are orthogonal to every codeword of C:

C⊥ = {v ∈ F_q^n : v·u = 0, ∀u ∈ C}.

Problem 2.5: Show that the dual code C⊥ of a linear code C is itself linear.

Definition: The redundancy r = n − k of a code represents the number of parity check digits in the code.

Remark: The dual code C⊥ is just the null space of G:

v ∈ C⊥ ⇐⇒ Gv^t = 0

(where the superscript t denotes transposition). This just says that v is orthogonal to each of the rows of G. From linear algebra, we know that the space spanned by the k independent rows of G is a k-dimensional subspace, while the null space of G, which is just C⊥, is an (n − k)-dimensional subspace.

Remark: Since every vector in C is perpendicular to every vector in C⊥, we know immediately that C ⊂ (C⊥)⊥. In fact, since the dimension of the linear subspace (C⊥)⊥ is n − (n − k) = k, we deduce that C = (C⊥)⊥. In other words, C is the space of all vectors that are orthogonal to every vector in C⊥.

Definition: Let C be an [n, k] linear code. An (n − k) × n generator matrix H for C⊥ is called a parity-check matrix for C.

Remark: A code C is completely specified by its parity-check matrix:

C = (C⊥)⊥ = {u ∈ F_q^n : Hu^t = 0};

that is, Hu^t = 0 ⇐⇒ u ∈ C.

Definition: Given a linear code with parity-check matrix H, the column vector Hu^t is called the syndrome of u.

Lemma 2.3: Two vectors u and v are in the same coset of a linear code C ⇐⇒ they have the same syndrome.

Proof: u − v ∈ C ⇐⇒ H(u − v)^t = 0 ⇐⇒ Hu^t = Hv^t.

Remark: We thus see that there is a one-to-one correspondence between cosets and syndromes. This leads to an alternative decoding scheme known as syndrome decoding. When a vector u is received, one computes the syndrome Hu^t and compares it to the syndromes of the coset leaders. If the coset leader having the same syndrome is of minimal weight within its coset, it is the error vector for decoding u.

To compute the syndrome for a code, we need only first determine the parity-check matrix. The following lemma describes an easy way to construct the standard form of the parity-check matrix from the standard-form generator matrix.

Lemma 2.4: An (n − k) × n parity-check matrix H for an [n, k] code generated by the matrix G = [1k | A], where A is a k × (n − k) matrix, is given by

H = [−A^t | 1_(n−k)].

Proof: This follows from the fact that the rows of G are then orthogonal to every row of H:

GH^t = [1k  A] [−A over 1_(n−k)] = 1k(−A) + A 1_(n−k) = −A + A = 0,

the k × (n − k) zero matrix.

Theorem 2.2 (Minimum Distance): A linear code has minimum distance d ⇐⇒ d is the maximum number such that any d − 1 columns of its parity-check matrix are linearly independent.

Proof: Let C be a linear code and u be a codeword such that w(u) = d(C) = d. Since u ∈ C ⇐⇒ Hu^t = 0 and u has d nonzero components, we see that some d columns of H are linearly dependent. However, any d − 1 columns of H must be linearly independent, or else there would exist a nonzero codeword in C with weight d − 1.

Remark: Equivalently, a linear code has minimum distance d if d is the smallest number for which some d columns of its parity-check matrix are linearly dependent.

• For a code with minimum distance 3, Theorem 2.2 tells us that any two columns of its parity-check matrix must be linearly independent, but that some 3 columns are linearly dependent.
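Lemma 2.4 is straightforward to implement for binary codes (where −A = A). The sketch below builds H from the standard-form generator matrix of the [7, 4, 3] code and verifies GH^t = 0:

```python
def parity_check_from_standard(G):
    """Lemma 2.4 for binary codes: G = [I_k | A]  ->  H = [A^t | I_{n-k}]."""
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]
    return [[A[i][j] for i in range(k)] + [int(m == j) for m in range(n - k)]
            for j in range(n - k)]

G = [[1, 0, 0, 0, 1, 0, 1],
     [0, 1, 0, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 0, 1, 1]]
H = parity_check_from_standard(G)
```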

• A parity-check matrix H3 for our (5, 4, 3) code C3 is

H3 = 1 1 1 0 0
     1 0 0 1 0
     0 1 0 0 1

The following theorem makes it particularly easy to correct errors of unit weight. It will play a particularly important role for the Hamming codes discussed in the next chapter.

Theorem 2.3: The syndrome of a vector that has a single error of m in the ith position is m times the ith column of H.

Proof: Let ei be the vector with the value m in the ith position and zero in all other positions. If the codeword x is sent and the vector y = x + ei is received, the syndrome

Hy^t = Hx^t + H(ei)^t = 0 + H(ei)^t = H(ei)^t

is just m times the ith column of H.

• For our (5, 4, 3) code, if y = 10111 is received, we compute Hy^t = [0 0 1]^t, which matches the fifth column of H3. Thus the fifth digit is in error (assuming that only a single error has occurred), and we decode y to the codeword 10110, just as we deduced earlier using the standard array.

Remark: If the syndrome does not match any of the columns of H, we know that more than one error has occurred. We can still determine which coset the syndrome belongs to by comparing the computed syndrome with a table of syndromes of all coset leaders. If the corresponding coset leader has minimal weight within its coset, we are able to correct the errors. To decode errors of weight greater than one we will need to construct such a syndrome table, but this table, having only q^(n−k) entries, is smaller than the standard array, which has q^n entries.

Remark: The syndrome He^t of a binary error vector e is just the sum of those columns of H for which the corresponding entry in e is nonzero.

Problem 2.6: Show that the null space of a matrix is invariant to standard row reduction operations (permutation of rows, multiplication of a row by a nonzero scalar, and addition of one row to another) and that these operations may be used to put H into standard form.

Problem 2.7: Using the binary linear code with parity-check matrix

H = 1 0 0 1
    0 1 0 1
    0 0 1 1

decode the received vector 1011.

The syndrome [0 1 0]^t corresponds to the second column of H. So, assuming that only a single error has occurred, the second digit is in error, and the transmitted vector was 1011 − 0100 = 1111.
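Theorem 2.3 yields a one-line decoder for single errors: compute the syndrome and look for it among the columns of H. A Python sketch, using a parity-check matrix for C3:

```python
def syndrome(H, y):
    """Hy^t over F_2, returned as a tuple."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)

def correct_single_error(H, y):
    """Match the syndrome against the columns of H and flip the offending bit."""
    s = syndrome(H, y)
    if not any(s):
        return list(y)                 # already a codeword
    i = list(zip(*H)).index(s)         # ValueError here means more than one error
    corrected = list(y)
    corrected[i] ^= 1
    return corrected

# A parity-check matrix for the (5, 4, 3) code C3.
H3 = [[1, 1, 1, 0, 0],
      [1, 0, 0, 1, 0],
      [0, 1, 0, 0, 1]]
```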

Problem 2.8: Consider the linear [6, k, d] binary code C generated by

G = 1 1 0 0 1 1
    0 1 0 1 1 0
    1 0 1 1 1 0

(a) Find a parity-check matrix H for C.

First, we put G in standard form by row reduction:

G = 1 0 0 1 0 1
    0 1 0 1 1 0
    0 0 1 0 1 1

from which we see, using Lemma 2.4, that

H = 1 1 0 1 0 0
    0 1 1 0 1 0
    1 0 1 0 0 1

(b) Determine the number of codewords M and the minimum distance d of C.

Since G has 3 linearly independent rows, it spans a three-dimensional code space over F2; thus M = 2^3 = 8. No two columns of H are linearly dependent (no column is zero and no two columns are equal), but the third column is the sum of the first two, so we know by Theorem 2.2 that d = 3.

(c) How many errors can this code correct?

The code can correct ⌊(d − 1)/2⌋ = 1 error.

(d) Is C a perfect code? Justify your answer.

No, because

8 [(6 choose 0) + (6 choose 1)] = 8(1 + 6) = 56 < 64 = 2^6.

(e) Suppose the vector 011011 is received. Can this vector be decoded, assuming that only one error has occurred? If so, what was the transmitted vector?

The syndrome is [1 1 0]^t, which is the second column of H. So the transmitted vector was 011011 − 010000 = 001011.

(f) Suppose the vector 011010 is received. Can this vector be decoded, assuming that only one error has occurred? If so, what was the transmitted vector?

No, we cannot decode this vector: the syndrome is [1 1 1]^t, which is not a column of H and hence does not correspond to an error vector of weight 1. At least 2 errors must have occurred, and we cannot correct 2 errors with this code.

Problem 2.9: (a) Let C be a linear code. If C = C⊥, prove that n is even and that C must be an [n, n/2] code.

Let k be the dimension of the linear space C. The dimension of C⊥ is n − k. If C = C⊥, then k = n − k, so n = 2k.

(b) Prove that exactly 2^(n−1) vectors in F2^n have even weight.

By specifying that a vector in F2^n has even weight, we are effectively imposing a parity check equation on it: the last bit is constrained to be the sum of the previous n − 1 bits, which may be chosen freely. So one can construct exactly 2^(n−1) vectors of even weight.

(c) If C⊥ is the binary repetition code of length n, prove that C is a binary code consisting of all even weight vectors. Hint: find a generator matrix for C⊥.

The generator matrix for C⊥ is the 1 × n matrix G⊥ = [1 1 1 ... 1]. This must be a parity check matrix for C, so we know that C consists of all vectors x1 x2 ... xn for which w(x) = Σ_{i=1}^n xi = 0 (mod 2). That is, C consists of all even weight vectors. Furthermore, the vector space C has dimension n − 1, so we know that it contains all 2^(n−1) even weight vectors. Alternatively, we can explicitly find a parity check matrix for C⊥, namely the (n − 1) × n matrix
H⊥ =
1 1 0 0 ··· 0 0
1 0 1 0 ··· 0 0
1 0 0 1 ··· 0 0
. . . . ··· . .
1 0 0 0 ··· 0 1
We see that H⊥ is a generator matrix for C and that C consists only of even weight vectors.

Problem 2.10: Let C be the code consisting of all vectors in Fq^n with checksum 0 mod q. Let C′ be the q-ary repetition code of length n.

(a) Find a generator G and parity-check matrix H for C. What are the sizes of these matrices?

A generator for C is the (n − 1) × n matrix
G =
1 0 0 ··· 0 −1
0 1 0 ··· 0 −1
. . . ··· . .
0 0 0 ··· 1 −1
A parity-check matrix for C is the 1 × n matrix H = [1 1 ... 1 1].

(b) Find a generator G′ and parity-check matrix H′ for C′.

A generator for C′ is the 1 × n matrix G′ = H. A parity-check matrix for C′ is the (n − 1) × n matrix H′ = G.
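The count in part (b) can be checked by exhaustive enumeration for small n; a quick sketch:

```python
from itertools import product

def even_weight_count(n):
    """Count vectors in F_2^n of even Hamming weight by brute force."""
    return sum(1 for v in product((0, 1), repeat=n) if sum(v) % 2 == 0)

# The parity-check argument predicts exactly 2**(n-1) even-weight vectors.
for n in range(1, 8):
    assert even_weight_count(n) == 2 ** (n - 1)
print([even_weight_count(n) for n in range(1, 6)])  # -> [1, 2, 4, 8, 16]
```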

(c) Which of the following statements are correct? Circle all correct statements. Justify your answers.
• C′ ⊂ C⊥, • C′ = C⊥, • C′ ⊃ C⊥, • Neither C′ ⊃ C⊥ nor C′ ⊂ C⊥ holds, • C′ ∩ C⊥ = ∅.

The statements C′ ⊂ C⊥, C′ = C⊥, and C′ ⊃ C⊥ are all correct, since in fact C′ = C⊥: the generator matrix G′ of C′ is precisely the parity-check matrix H of C, so C′ is the row space of H, which is C⊥.

(d) Find d(C).

Any set containing just one column of the parity-check matrix H of C is linearly independent, but the first and second columns (say) are not. From Theorem 2.2, we conclude that d(C) = 2.

(e) Find d(C′).

Since each codeword of C′ differs from all of the others in all n places, we see that d(C′) = n.

(f) How many codewords are there in C? Justify your answer.

Since G has n − 1 rows, there are q^(n−1) possible codewords.

(g) How many codewords are there in C′? Justify your answer.

Since G′ has 1 row, there are q possible codewords.

(h) Suppose q = 2 and n is odd. Use part (g) to prove that
Σ_{k=0}^{(n−1)/2} (n choose k) = 2^(n−1).

An odd-length binary (n, 2, n) repetition code can correct t = (n − 1)/2 errors. Odd-length binary repetition codes are (trivially) perfect: they satisfy the Hamming equality
2 Σ_{k=0}^{(n−1)/2} (n choose k) = 2^n,
from which the desired result follows. (Equivalently, this is a consequence of the left-right symmetry of Pascal's triangle.)

Problem 2.11: Consider the linear [7, M, d] binary code C generated by
G =
1 1 0 0 0 1 1
1 0 1 1 0 0 1
0 0 1 0 1 1 1

**(a) Find a parity check matrix H for C.
**

First, we row reduce G to standard form:
G =
1 0 0 1 1 1 0
0 1 0 1 1 0 1
0 0 1 0 1 1 1

from which we see that
H =
1 1 0 1 0 0 0
1 1 1 0 1 0 0
1 0 1 0 0 1 0
0 1 1 0 0 0 1

**(b) Determine the number of codewords M in C. Justify your answer.
**

Since G has 3 (linearly independent) rows, it spans a three-dimensional code space over F2 . Thus, M = 23 = 8.

(c) Find the maximum number N such that any set of N columns of H are linearly independent. Justify your answer.

It is convenient to divide the columns into two groups of clearly linearly independent vectors: the ﬁrst three columns (which are distinct and do not sum up to zero) and the last four columns. Each of the ﬁrst three columns has weight 3, and therefore cannot be written as a sum of two of the last four columns. Any two of the ﬁrst three columns diﬀer in more than one place, and so their sum cannot equal any of the last four columns. Thus, no three columns of H are linearly dependent. However, the sum of the ﬁrst three columns is equal to the ﬁfth column, so N = 3 is the maximum number of linearly independent columns.

**(d) Determine the minimum distance d of C.
**

From part (c) and Theorem 2.2 we know that d = 4.

**(e) How many errors can C correct?
**

The code can correct only ⌊(d − 1)/2⌋ = 1 error.

(f) Is C a perfect code? Justify your answer.

(g) By examining the inner (dot) products of the rows of G with each other, determine which of the following statements are correct (circle all correct statements and explain):
• C ⊂ C⊥,
• C = C⊥,
• C ⊃ C⊥,
• Neither C ⊃ C⊥ nor C ⊂ C⊥ holds,
• C ∩ C⊥ = ∅.


No, from Problem 1.4(b) we know that a code with even distance can never be perfect. Alternatively, we note that 8[(7 choose 0) + (7 choose 1)] = 8(1 + 7) = 64 < 128 = 2^7.



The only correct statement is C ⊂ C⊥, since the rows of G are orthogonal to each other. Note that C cannot be self-dual because it has dimension k = 3, whereas C⊥ has dimension n − k = 7 − 3 = 4.

(h) Suppose the vector 1100011 is received. Can this vector be decoded, assuming that no more than one error has occurred? If so, what was the transmitted codeword?

Yes: in fact this is the first row of G, so it must be a codeword. No errors have occurred; the transmitted codeword was 1100011. As a check, one can verify that the syndrome is [0000]^t.

(i) Suppose the vector 1010100 is received. Can this vector be decoded, assuming that no more than one error has occurred? If so, what was the transmitted codeword?

The syndrome is [1101]t , which is the second column of H. So the transmitted vector was 1110100.

(j) Suppose the vector 1111111 is received. Show that at least 3 errors have occurred. Can this vector be unambiguously decoded by C? If so what was the transmitted codeword? If not, and if only 3 errors have occurred, what are the possible codewords that could have been transmitted?

Since the syndrome [1011]t is neither a column of H nor the sum of two columns of H, it does not correspond to an error vector of weight 1 or 2. Thus, at least 3 errors have occurred. We cannot unambiguously decode this vector because C can only correct 1 error. In fact, since the rows of G have weight 4, by part (g) and Problem 4.3, we know that all nonzero codewords in C have weight 4. So any nonzero codeword could have been transmitted, with 3 errors, to receive 1111111.

Problem 2.12: Consider a single error-correcting ternary code C with parity-check matrix 2 0 1 1 0 0 H = 1 2 0 0 1 0. 0 2 2 0 0 1 (a) Find a generator matrix G.

A generator matrix is
G =
1 0 0 1 2 0
0 1 0 0 1 1
0 0 1 2 0 1
The information word x is encoded as xG.

**(b) Use G to encode the information messages 100, 010, 001, 200, 201, and 221.
**

Stacking the six information messages as the rows of a matrix, the products xG (mod 3) are the rows of
1 0 0 1 2 0
0 1 0 0 1 1
0 0 1 2 0 1
2 0 0 2 1 0
2 0 1 1 1 1
2 2 1 1 0 0



That is, the encoded words are the rows of the resulting matrix, namely 100120, 010011, 001201, 200210, 201111, and 221100.
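A sketch of the encoding step (plain Python, arithmetic mod 3); the rows of G here are taken from the encoded words 100120, 010011, and 001201 listed above:

```python
G = [[1, 0, 0, 1, 2, 0],   # generator matrix of the ternary [6, 3] code
     [0, 1, 0, 0, 1, 1],
     [0, 0, 1, 2, 0, 1]]

def encode(x, G, q=3):
    """Encode the information word x as xG over F_q."""
    n = len(G[0])
    return [sum(xi * row[j] for xi, row in zip(x, G)) % q for j in range(n)]

for msg in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [2, 0, 0], [2, 0, 1], [2, 2, 1]):
    print(msg, "->", encode(msg, G))

print(encode([2, 2, 1], G))  # -> [2, 2, 1, 1, 0, 0]
```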

**(c) What is the minimum distance of this code?
**

Since
2H =
1 0 2 2 0 0
2 1 0 0 2 0
0 1 1 0 0 2

we see that the columns of H and 2H are distinct, so no two columns of H are linearly dependent. But the ﬁrst column of H is the ﬁfth column plus twice the fourth column, so by Theorem 2.2 we know that d = 3.

(d) Decode the received word 122112, if possible. If you can decode it, determine the corresponding message vector.

The syndrome is
H [1 2 2 1 1 2]^t = [2 0 1]^t.

The syndrome [201]^t is twice the third column of H, so the corrected word is 122112 − 2(001000) = 120112. Since G is in standard form, the corresponding message word, 120, is just the first three digits of the codeword.

(e) Decode the received word 102201, if possible. If you can decode it, determine the corresponding message vector.

The syndrome is
H [1 0 2 2 0 1]^t = [0 1 2]^t.

The syndrome 012 is not a multiple of any column of H, so either an incorrect codeword was transmitted or more than one error in transmission has occurred. But you can only correct one error with this code, so you have to ask for a retransmission.
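For a nonbinary code the syndrome must be compared against every nonzero scalar multiple of each column of H. A sketch of this single-error decoder for the code of Problem 2.12 (plain Python, mod-3 arithmetic):

```python
H = [[2, 0, 1, 1, 0, 0],   # parity-check matrix of the ternary [6, 3] code
     [1, 2, 0, 0, 1, 0],
     [0, 2, 2, 0, 0, 1]]

def syndrome(H, y, q=3):
    return [sum(h * b for h, b in zip(row, y)) % q for row in H]

def decode(H, y, q=3):
    """Correct at most one error: a single error of size m in position i
    yields a syndrome equal to m times column i of H (Theorem 2.3)."""
    s = syndrome(H, y)
    if not any(s):
        return y[:]
    for i in range(len(H[0])):
        col = [row[i] for row in H]
        for m in range(1, q):
            if s == [m * c % q for c in col]:
                corrected = y[:]
                corrected[i] = (corrected[i] - m) % q
                return corrected
    return None  # more than one error: request retransmission

print(decode(H, [1, 2, 2, 1, 1, 2]))  # -> [1, 2, 0, 1, 1, 2]
print(decode(H, [1, 0, 2, 2, 0, 1]))  # -> None
```

The two printed cases reproduce parts (d) and (e): the first received word is corrected to 120112, while the second is rejected because its syndrome is not a multiple of any column.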

**Chapter 3 Hamming Codes
**

One way to construct perfect binary [n, k] codes that can correct single errors is to ensure that every nonzero vector in F2^(n−k) appears as a unique column of H. In this manner, the syndrome of every possible vector in F2^n can be identified with a column of H, so that every vector in F2^n is at most a distance one away from a codeword. This is called a binary Hamming code, which we now discuss in the general space Fq^n, where Fq is a field.

Remark: One can form q − 1 distinct scalar multiples from any nonzero vector u in Fq^r: if λ, γ ∈ Fq, then λu = γu ⇒ (λ − γ)u = 0 ⇒ λ = γ or u = (λ − γ)^(−1) 0 = 0.

Definition: Given an integer r ≥ 2, let n = (q^r − 1)/(q − 1). The Hamming code Ham(r, q) is a linear [n, n − r] code in Fq^n for which the columns of the r × n parity-check matrix H are the n distinct nonzero vectors of Fq^r with first nonzero entry equal to 1.

Remark: Not only are the columns of H distinct, all nonzero multiples of any two columns are also distinct. That is, any two columns of H are linearly independent. The total number of nonzero column multiples that can thus be formed is n(q − 1) = q^r − 1. Including the zero vector, we see that H yields a total of q^r distinct syndromes, corresponding to all possible error vectors of unit weight in Fq^n.

• The columns of a parity-check matrix for the binary Hamming code Ham(r, 2) consist of all possible nonzero binary codewords of length r.

Remark: The columns of a parity-check matrix H for a Hamming code may be written in any order. However, both the syndromes and codewords will depend on the order of the columns. If H is row reduced to standard form, the codewords will be unchanged. However, other equivalent Ham(r, 2) codes obtained by rearranging columns of H will have rearranged codewords.
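The definition translates directly into a construction of the parity-check columns: enumerate the vectors of Fq^r whose first nonzero entry equals 1, one representative per one-dimensional subspace. A sketch:

```python
from itertools import product

def hamming_parity_columns(r, q):
    """Columns of a parity-check matrix for Ham(r, q): the vectors in F_q^r
    whose first nonzero entry equals 1."""
    cols = []
    for v in product(range(q), repeat=r):
        nonzero = [a for a in v if a != 0]
        if nonzero and nonzero[0] == 1:
            cols.append(v)
    return cols

cols = hamming_parity_columns(2, 3)
print(cols)        # -> [(0, 1), (1, 0), (1, 1), (1, 2)]
print(len(cols))   # -> 4, i.e. n = (3**2 - 1)/(3 - 1)
```

For r = 2, q = 3 this reproduces the four columns of the Ham(2, 3) parity-check matrix given later in this chapter.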

Problem 3.1: Given a parity-check matrix H for a binary Hamming code, show that the standard form for H (obtained by row reduction) is just a rearrangement of the columns of H.

Remark: The dimension k of Ham(r, q) is given by k = n − r = (q^r − 1)/(q − 1) − r.

• A parity-check matrix for the one-dimensional code Ham(2, 2) is
0 1 1
1 0 1
which can be row reduced to standard form:
1 1 0
1 0 1
The generator matrix is then seen to be [1 1 1]. That is, Ham(2, 2) is just the binary triple-repetition code.

• A parity-check matrix for the code Ham(3, 2), in standard form, is
0 1 1 1 1 0 0
1 0 1 1 0 1 0
1 1 0 1 0 0 1

Problem 3.2: Show that this code is equivalent to the (7, 16, 3) perfect code in Chapter 1.

Remark: An equivalent way to construct the binary Hamming code Ham(r, 2) is to consider all n = 2^r − 1 nonempty subsets of a set S containing r elements. Each of these subsets corresponds to a position of a codeword in F2^n. A codeword can then be thought of as just a collection of nonempty subsets of S. Any particular element a of S will appear in exactly half of all 2^r subsets (that is, in 2^(r−1) subsets) of S, so that an even number of the 2^r − 1 nonempty subsets will contain a. This gives us a parity-check equation, which says that the sum of all digits corresponding to a subset containing a must be 0 (mod 2). There will be a parity-check equation for each of the r elements of S, corresponding to a row of the parity-check matrix H. That is, each column of H corresponds to one of the subsets, with a 1 appearing in the ith position if the subset contains the ith element and 0 if it doesn't.

• A parity check matrix for Ham(3, 2) can be constructed by considering all possible nonempty subsets of {a, b, c}, each of which corresponds to one of the bits of a codeword x = x1 x2 . . . x7 in F2^7.

The positions are associated with the subsets as follows:
x1: {b, c}, x2: {a, c}, x3: {a, b}, x4: {a, b, c}, x5: {a}, x6: {b}, x7: {c}.

Given any four binary information digits x1, x2, x3, and x4, the parity-check digits x5, x6, and x7 can be determined from the three checksum equations corresponding to each of the elements a, b, and c:
a: x2 + x3 + x4 + x5 = 0 (mod 2),
b: x1 + x3 + x4 + x6 = 0 (mod 2),
c: x1 + x2 + x4 + x7 = 0 (mod 2).

For example, the vector x = 1100110 corresponds to the collection {{b, c}, {a, c}, {a}, {b}}. Since there are an even number of as, bs, and cs in this collection, we know that x is a codeword, without even looking at H.

Problem 3.3: Show that two distinct codewords x and y that satisfy the above three parity check equations must differ in at least 3 places.

Remark: When constructing binary Hamming codes, there is a distinct advantage in arranging the parity-check matrix so that the columns, treated as binary numbers, are in ascending order.

• We can write a parity-check matrix for a Ham(3, 2) code in the binary ascending form
H =
0 0 0 1 1 1 1
0 1 1 0 0 1 1
1 0 1 0 1 0 1
If the vector 1110110 is received, the syndrome is [0, 1, 1]^t, which corresponds to the binary number 3, so we know immediately that a single error must have occurred in the third position (assuming that at most a single error has occurred). Thus, the transmitted codeword was 1100110. The syndrome, interpreted in exactly the same way as a binary number, immediately tells us in which position a single error has occurred.

Remark: For nonbinary Hamming codes, we need to compare the computed syndrome with all nonzero multiples of the columns of the parity-check matrix.
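With the columns of H in ascending binary order, the syndrome read as a binary number is the error position itself, so no column search is needed. A sketch for Ham(3, 2):

```python
# Parity-check matrix of Ham(3, 2) whose ith column is the binary
# representation of i (most significant bit in the top row).
r = 3
H = [[(i >> (r - 1 - j)) & 1 for i in range(1, 2 ** r)] for j in range(r)]
assert H == [[0, 0, 0, 1, 1, 1, 1],
             [0, 1, 1, 0, 0, 1, 1],
             [1, 0, 1, 0, 1, 0, 1]]

def decode(y):
    """Single-error decoding: the syndrome, read as a binary number,
    is the (1-based) position of the error; 0 means no error."""
    s = [sum(h * b for h, b in zip(row, y)) % 2 for row in H]
    pos = int("".join(map(str, s)), 2)
    corrected = y[:]
    if pos:
        corrected[pos - 1] ^= 1
    return corrected

print(decode([1, 1, 1, 0, 1, 1, 0]))  # -> [1, 1, 0, 0, 1, 1, 0]
```

This reproduces the worked example: the received word 1110110 has syndrome 011, i.e. position 3, and decodes to 1100110.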

• A parity-check matrix for Ham(2, 3) is
H =
0 1 1 1
1 0 1 2
If the vector 2020 is received and at most a single digit is in error, then from the syndrome [2, 1]^t = 2[1, 2]^t we see that an error of 2 has occurred in the last position, and we decode the vector as x = y − e = 2020 − 0002 = 2021.

• A parity-check matrix for Ham(3, 3) is
H =
0 0 0 0 1 1 1 1 1 1 1 1 1
0 1 1 1 0 0 0 1 1 1 2 2 2
1 0 1 2 0 1 2 0 1 2 0 1 2
If the vector 2000 0000 00001 is received and at most a single digit is in error, then from the syndrome [1, 2, 1]^t, which is the second-last column of H, we see that an error of 1 has occurred in the second-last position, so the transmitted vector was 2000 0000 00021.

The following theorem establishes that Hamming codes can always correct single errors, as we saw in the above examples, and also that they are perfect.

Theorem 3.1 (Hamming Codes are Perfect): Every Ham(r, q) code is perfect and has distance 3.

Proof: Since any two columns of H are linearly independent, we know from Theorem 2.2 that Ham(r, q) has a distance of at least 3, so it can correct single errors. The distance cannot be any greater than 3 because, for example, the nonzero columns
0 0 0
0 1 1
1 0 1
are linearly dependent. Furthermore, Ham(r, q) has M = q^k = q^(n−r) codewords, so the sphere-packing bound
q^(n−r)(1 + n(q − 1)) = q^(n−r)(1 + q^r − 1) = q^n
is perfectly achieved.

Corollary 3.1 (Hamming Size): For any integer r ≥ 2, we have A2(2^r − 1, 3) = 2^(2^r − 1 − r).

• Thus A2(3, 3) = 2, A2(7, 3) = 16, A2(15, 3) = 2^11 = 2048, and A2(31, 3) = 2^26.

Problem 3.4: Determine the number αi of coset leaders of weight i for Ham(r, 2), where n = 2^r − 1.

The error vectors that can be corrected by a Hamming code are precisely those vectors of weight one or less, and these vectors fill F2^n completely. Consequently, the coset weights are distributed according to
αi = (n choose i) for 0 ≤ i ≤ 1, αi = 0 for i > 1.
That is, α0 = 1, α1 = n = 2^r − 1, and α2 = α3 = . . . = αn = 0. Note that the total number of cosets is α0 + α1 = 2^r = 2^(n−k), where k = n − r, and each of them contains 2^k vectors.

Problem 3.5: For all r ∈ N, describe how to construct from Ham(r, 2) a code of length n = 2^r with minimum distance d = 4 that contains M = 2^(2^r − 1 − r) codewords. Prove that the minimum distance of your code is 4 and that M is the maximum number of possible codewords for these parameters.

Extend the Hamming code Ham(r, 2), which has length n − 1 = 2^r − 1, M = 2^(2^r − 1 − r) codewords, and distance 3, by adding a parity check to produce a code with length n = 2^r but still M codewords. The parity check guarantees that the weight of all extended codewords is even, and since
w(x − y) = w(x) + w(y) − 2w(xy),
the distance between any two extended codewords x and y is also even. Hence the minimum distance of the extended code, which is at least 3 and certainly no more than 4, must in fact be 4.

We also know that the Hamming code is perfect. The extended Hamming code is not perfect, but we know by Corollary 1.1 that the maximum number of codewords for the parameters (n, 4) is the same as the maximum number M of codewords for the parameters (n − 1, 3). Since the Hamming code achieves that maximum, M = 2^(n−1−r) = 2^(2^r − 1 − r) is also the maximum for the extended code.

We saw in the last chapter that the linear Hamming codes are nontrivial perfect codes. Q: Are there any other nontrivial perfect codes? A: Yes, as we shall soon see. The condition for a code to be perfect is that its n, M, and d values satisfy the sphere-packing bound
M Σ_{k=0}^{t} (n choose k)(q − 1)^k = q^n,    (4.1)
with d = 2t + 1. Golay found three other possible integer triples (n, M, d) that do not correspond to the parameters of a Hamming or trivial perfect code: (23, 2^12, 7) and (90, 2^78, 5) for q = 2, and (11, 3^6, 5) for q = 3. It turns out that there do indeed exist linear binary [23, 12, 7] and ternary [11, 6, 5] codes; these two other linear perfect codes were found by Golay in 1949 and are known as Golay codes. But, in view of Theorem 1.3, it is impossible for linear or nonlinear (90, 2^78, 5) codes to exist. In addition, several nonlinear perfect codes are known that have the same n, M, and d parameters as Hamming codes.

Problem 4.1: Show that the (n, M, d) triples (23, 2^12, 7) and (90, 2^78, 5) for q = 2, and (11, 3^6, 5) for q = 3, satisfy the sphere-packing bound (4.1).

A convenient way of finding a binary [23, 12, 7] Golay code is to construct first the extended Golay [24, 12, 8] code, which is just the [23, 12, 7] Golay code augmented with a final parity check in the last position (such that the weight of every codeword is even).
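Problem 4.1 is a direct computation; a sketch that checks all three triples:

```python
from math import comb

def sphere_packing_holds(n, M, d, q):
    """Check M * sum_{k<=t} C(n,k)(q-1)**k == q**n with t = (d-1)//2."""
    t = (d - 1) // 2
    return M * sum(comb(n, k) * (q - 1) ** k for k in range(t + 1)) == q ** n

assert sphere_packing_holds(23, 2 ** 12, 7, 2)
assert sphere_packing_holds(90, 2 ** 78, 5, 2)
assert sphere_packing_holds(11, 3 ** 6, 5, 3)
print("all three Golay triples meet the sphere-packing bound with equality")
```

Meeting the bound with equality is necessary but not sufficient for a perfect code to exist, which is exactly why the (90, 2^78, 5) case requires the separate nonexistence argument below.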

The extended binary Golay [24, 12, 8] code C24 can be generated by the matrix G24 = [I12 | A], where A is the 12 × 12 symmetric matrix (A^t = A)
A =
0 1 1 1 1 1 1 1 1 1 1 1
1 1 0 1 1 1 0 0 0 1 0 1
1 0 1 1 1 0 0 0 1 0 1 1
1 1 1 1 0 0 0 1 0 1 1 0
1 1 1 0 0 0 1 0 1 1 0 1
1 1 0 0 0 1 0 1 1 0 1 1
1 0 0 0 1 0 1 1 0 1 1 1
1 0 0 1 0 1 1 0 1 1 1 0
1 0 1 0 1 1 0 1 1 1 0 0
1 1 0 1 1 0 1 1 1 0 0 0
1 0 1 1 0 1 1 1 0 0 0 1
1 1 1 0 1 1 1 0 0 0 1 0
Apart from its first row and column, the rows of A are successive cyclic shifts of the vector 1 0 1 1 1 0 0 0 1 0 1.

Problem 4.2: Show that u·v = 0 for all rows u and v of G24. Hint: note that the first row of G24 is orthogonal to itself. Then establish that u·v = 0 when u is the second row and v is any row of G24. Then use the cyclic symmetry of the rows of the matrix A′ formed by deleting the first column and first row of A.

Remark: The above exercise establishes that the rows of G24 are orthogonal to each other. To establish the minimum distance, we make use of the following result.

Problem 4.3: Let C be a binary linear code with generator matrix G. If each row of G is orthogonal to itself and all other rows and has weight divisible by 4, prove that C ⊂ C⊥ and that the weight of every codeword in C is a multiple of 4.

Definition: A linear code C is self-orthogonal if C ⊂ C⊥. A linear code C is self-dual if C = C⊥.

Remark: Since k = 12 and n − k = 12, the linear spaces C24 and C24⊥ have the same dimension; consequently, C24 ⊂ C24⊥ implies C24 = C24⊥. This means that the parity check matrix H24 = [A | I12] for C24 is also a generator matrix for C24!

We are now ready to show that the distance of C24 is 8 and, consequently, that the binary Golay [23, 12] code generated by the first 23 columns of G24 must have minimum distance either 7 or 8. But since the third row of this reduced generator matrix is a codeword of weight 7, we can then be sure that the minimum distance is exactly 7.

Theorem 4.1 (Extended Golay [24, 12] code): The [24, 12] code generated by G24 has minimum distance 8.

Proof: From Problem 4.3, the weight of every codeword generated by G24 must be divisible by 4. We now show that a codeword x ∈ C24 cannot have weight 4. Since both G24 and H24 are generator matrices for the code, any codeword can be expressed either as a linear combination of the rows of G24 or as a linear combination of the rows of H24. It is not possible for all of the left-most twelve bits of x to be 0 if x is some nontrivial linear combination of the rows of G24; likewise, it is not possible for all of the right-most twelve bits of x to be 0 if x is some nontrivial linear combination of the rows of H24. If exactly one of the left-most (right-most) twelve bits of x were 1, then x would be identical to a row of G24 (H24), none of which has weight 4. The only possible codeword of weight 4 is therefore a sum of two rows of G24, but it is easily seen (again using the cyclic symmetry of A′) that no two rows of G24 differ in only four positions. Since the weight of every codeword in C24 must be a multiple of 4, we now know that C24 must have a minimum distance of at least 8. In fact, since the second row of G24 is a codeword of weight 8, we see that the minimum distance of C24 is exactly 8.

Problem 4.4: Show that the ternary Golay [11, 6] code generated by the first 11 columns of the generator matrix
G12 =
1 0 0 0 0 0 0 1 1 1 1 1
0 1 0 0 0 0 1 0 1 2 2 1
0 0 1 0 0 0 1 1 0 1 2 2
0 0 0 1 0 0 1 2 1 0 1 2
0 0 0 0 1 0 1 2 2 1 0 1
0 0 0 0 0 1 1 1 2 2 1 0
has minimum distance 5.

Theorem 4.2 (Nonexistence of binary (90, 2^78, 5) codes): There exist no binary (90, 2^78, 5) codes.

Proof: Suppose that a binary (90, 2^78, 5) code C exists. By Lemma 1.2, we know that C is perfect, with d(C) = 5. Without loss of generality we may assume that 0 ∈ C. Let Y be the set of vectors in F2^90 of weight 3 that begin with two 1s; since there are 88 possible positions for the third one, |Y| = 88. Since C is perfect, each y ∈ Y is within a distance 2 from a unique codeword x. But then, from the triangle inequality,
2 = d(C) − w(y) ≤ w(x) − w(y) ≤ w(x − y) ≤ 2,
from which we see that w(x) = 5 and d(x, y) = w(x − y) = 2. This means that x must have a 1 in every position that y does.

Let X be the set of all codewords of weight 5 that begin with two 1s. We know that for each y ∈ Y there is a unique x ∈ X such that d(x, y) = 2. That is, there are exactly |Y| = 88 elements in the set {(x, y) : x ∈ X, y ∈ Y, d(x, y) = 2}. But each x ∈ X contains exactly three 1s after the first two positions, so for each x ∈ X there are precisely three vectors y ∈ Y such that d(x, y) = 2. Thus 3|X| = 88. This is a contradiction, since |X| must be an integer.

Problem 4.5: Consider the extended ternary (q = 3) Golay [12, 6] code C12 generated by
G12 =
1 0 0 0 0 0 0 1 1 1 1 1
0 1 0 0 0 0 1 0 1 2 2 1
0 0 1 0 0 0 1 1 0 1 2 2
0 0 0 1 0 0 1 2 1 0 1 2
0 0 0 0 1 0 1 2 2 1 0 1
0 0 0 0 0 1 1 1 2 2 1 0

(a) Use the fact that C12 is self-orthogonal (C12 ⊂ C12⊥) to find a parity-check matrix for C12.

Since n − k = 12 − 6 = 6 = k, we know that the linear subspaces C12 and C12⊥ have the same dimension, so C12 = C12⊥. That is, G12 itself is a parity check matrix for C12.

(b) Decode the received vector y = 010 000 010 101, assuming that at most two errors have occurred.

The syndrome of this vector is
G12 y^t = [0 1 1 1 1 0]^t = 2 × (sixth column of G12) + 1 × (seventh column of G12).
Thus an error of 2 has occurred in position 6 and an error of 1 has occurred in position 7. That is, the error vector is e = 000 002 100 000, so the transmitted vector was x = y − e = 010 001 210 101.

Remark: In 1973, Tietäväinen, based on work by Van Lint, proved that any nontrivial perfect code over the field Fq must either have the parameters ((q^r − 1)/(q − 1), q^(n−r), 3) of a Hamming code, the parameters (23, 2^12, 7) of the binary Golay code, or the parameters (11, 3^6, 5) of the ternary Golay code.
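Because C12 is self-dual, G12 doubles as its parity-check matrix, and the decoding in Problem 4.5(b) can be verified mechanically; a sketch (plain Python, mod-3 arithmetic):

```python
# G12 = [I6 | B]: generator (and, by self-duality, parity-check) matrix
# of the extended ternary Golay code.
B = [[0, 1, 1, 1, 1, 1],
     [1, 0, 1, 2, 2, 1],
     [1, 1, 0, 1, 2, 2],
     [1, 2, 1, 0, 1, 2],
     [1, 2, 2, 1, 0, 1],
     [1, 1, 2, 2, 1, 0]]
G12 = [[int(i == j) for j in range(6)] + B[i] for i in range(6)]

def syndrome(y):
    return [sum(g * b for g, b in zip(row, y)) % 3 for row in G12]

y = [0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]     # received 010 000 010 101
e = [0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0]     # error vector 000 002 100 000
x = [(yi - ei) % 3 for yi, ei in zip(y, e)]   # decoded 010 001 210 101

print(syndrome(y))  # -> [0, 1, 1, 1, 1, 0]
print(syndrome(x))  # -> [0, 0, 0, 0, 0, 0]  (x is a codeword)
```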

Cyclic codes are an important class of linear codes for which the encoding and decoding can be efficiently implemented using shift registers. In the binary case, shift registers are built out of two-state storage elements known as flip-flops and arithmetic devices called binary adders that output the sum of their two binary inputs, modulo 2. Many common linear codes, including Hamming and Golay codes, have an equivalent cyclic representation.

Definition: A linear code C is cyclic if
a0 a1 . . . an−1 ∈ C ⇒ an−1 a0 a1 . . . an−2 ∈ C.

Remark: If x is a codeword of a cyclic code C, then all cyclic shifts of x also belong to C.

• The binary linear code (000, 101, 011, 110) is cyclic.

• The binary linear code (0000, 1001, 0110, 1111) is not cyclic. However, upon interchanging the third and fourth positions, we note that it is equivalent to the linear code (0000, 1010, 0101, 1111), which is cyclic.

• The (7, 16, 3) perfect code in Chapter 1, which we now know is equivalent to Ham(3, 2), has an equivalent cyclic representation.

It is convenient to identify a codeword a0 a1 . . . an−1 in a cyclic code C with the polynomial
c(x) = a0 + a1 x + a2 x^2 + . . . + an−1 x^(n−1).
Then an−1 a0 a1 . . . an−2 corresponds to the polynomial
an−1 + a0 x + a1 x^2 + . . . + an−2 x^(n−1) = x c(x) (mod x^n − 1),
since x^n = 1 (mod x^n − 1). That is, multiplication by x (modulo x^n − 1) corresponds to a cyclic shift.
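The identification of a cyclic shift with multiplication by x modulo x^n − 1 is easy to check on coefficient lists; a sketch (position i holds the coefficient of x^i):

```python
def mul_by_x_mod(c):
    """Multiply c(x) = c[0] + c[1] x + ... + c[n-1] x^(n-1) by x
    modulo x^n - 1: the coefficient of x^n wraps around to x^0."""
    return [c[-1]] + c[:-1]

c = [1, 0, 1, 1, 0]           # c(x) = 1 + x^2 + x^3, i.e. the word 10110
print(mul_by_x_mod(c))        # -> [0, 1, 0, 1, 1], the cyclic shift 01011
```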

Definition: The polynomial ring Fq[x] is the set of all polynomials P(x) with coefficients in Fq.

Definition: The residue class ring Rq^n = Fq[x]/(x^n − 1) is the set of all polynomial remainders obtained by long division of polynomials in Fq[x] by x^n − 1. That is, Rq^n is the set of all polynomials of degree less than n.

Remark: A cyclic code in Fq^n can be thought of as a particular subset of the residue class polynomial ring Rq^n. In fact, the following theorem shows that a cyclic code C is an ideal of Rq^n:
c(x) ∈ C, r(x) ∈ Rq^n ⇒ r(x)c(x) ∈ C,    (5.1)
where the multiplication is performed modulo x^n − 1.

Theorem 5.1 (Cyclic Codes are Ideals): A linear code C in Rq^n is cyclic ⇐⇒ C satisfies Eq. (5.1).

Proof: Suppose C is a cyclic code in Rq^n. We know that multiplication of a codeword c(x) in C by x corresponds to a cyclic shift of its coefficients, and since C is linear, we know that c(x) ∈ C ⇒ αc(x) ∈ C for all α ∈ Fq. We thus see by induction that
c(x) ∈ C ⇒ r(x)c(x) ∈ C for all r(x) ∈ Rq^n.
Conversely, suppose that C satisfies Eq. (5.1). Taking r(x) = x shows that C is cyclic. That is, a linear code C is cyclic iff c(x) ∈ C ⇒ xc(x) (mod x^n − 1) ∈ C.

Definition: A polynomial is monic if its highest-degree coefficient is 1.

Definition: The principal ideal
⟨g(x)⟩ = {r(x)g(x) : r(x) ∈ Rq^n}
of Rq^n is the cyclic code generated by the polynomial g(x).

Problem 5.1: Verify that ⟨g(x)⟩ is an ideal.

Remark: The next theorem states that every ideal in Rq^n is a principal ideal (i.e. Rq^n is a Principal Ideal Domain).

Theorem 5.2 (Generator Polynomial): Let C be a nonzero q-ary cyclic code in Rq^n. Then (i) there exists a unique monic polynomial g(x) of smallest degree in C;

(ii) C = ⟨g(x)⟩; (iii) g(x) is a factor of x^n − 1 in Fq[x].

Proof: (i) If g(x) and h(x) are both monic polynomials in C of smallest degree, then g(x) − h(x) is a polynomial in C of smaller degree. If g(x) − h(x) ≠ 0, a certain scalar multiple of g(x) − h(x) would be a monic polynomial in C of degree smaller than deg g, which is a contradiction. Hence g(x) = h(x).
(ii) Theorem 5.1 shows that ⟨g(x)⟩ ⊂ C, so it only remains to show that C ⊂ ⟨g(x)⟩. Suppose c(x) ∈ C. By long division, we can express c(x) = q(x)g(x) + r(x), where deg r < deg g. But since c(x) and q(x)g(x) are both in the cyclic code C, we know by the linearity of C that r(x) = c(x) − q(x)g(x) is also in C. By the minimality of deg g, we see that r(x) = 0 (otherwise a scalar multiple of r(x) would be a monic polynomial in C of degree smaller than deg g). That is, c(x) ∈ ⟨g(x)⟩.
(iii) Using long division, we may express x^n − 1 = q(x)g(x) + r(x), where deg r < deg g. But then r(x) = −q(x)g(x) (mod x^n − 1) implies that r(x) ∈ ⟨g(x)⟩. As in (ii), the minimality of deg g forces r(x) = 0. That is, x^n − 1 is a multiple of g(x).

Definition: The monic polynomial of least degree in Theorem 5.2 is called the generator polynomial of C.

Theorem 5.3 (Lowest Generator Polynomial Coefficient): Let g(x) = g0 + g1 x + . . . + gr x^r be the generator polynomial of a cyclic code. Then g0 ≠ 0.

Proof: Suppose g0 = 0. Then x^(n−1) g(x) = x^(−1) g(x) is a codeword of degree r − 1, contradicting the minimality of deg g.

Theorem 5.4 (Cyclic Generator Matrix): A cyclic code with generator polynomial g(x) = g0 + g1 x + . . . + gr x^r has dimension n − r and generator matrix
G =
g0 g1 g2 · · · gr 0 0 · · · 0
0 g0 g1 g2 · · · gr 0 · · · 0
. . .
0 · · · 0 g0 g1 g2 · · · gr

Proof: Let c(x) be a codeword in a cyclic code C with generator g(x). From Theorem 5.2, we know that
c(x) = q(x)g(x) = (q0 + q1 x + . . . + q(n−r−1) x^(n−r−1)) g(x) = q0 g(x) + q1 xg(x) + . . . + q(n−r−1) x^(n−r−1) g(x)
for some polynomial q(x); note that deg q < n − r since deg c < n. Thus c(x) is a linear combination of the n − r rows g(x), xg(x), x^2 g(x), . . ., x^(n−r−1) g(x) of G. The diagonal of nonzero g0s next to a lower-triangular zero submatrix ensures that the rows of G are linearly independent. Hence, the span of the rows of G is the (n − r)-dimensional code C.

Remark: Together, Theorems 5.1 and 5.2 say that an [n, k] code is cyclic ⇐⇒ it is generated by a factor of x^n − 1.

Lemma 5.1 (Linear Factors): A polynomial c(x) has a linear factor x − a ⇐⇒ c(a) = 0.

Proof: Exercise.

Definition: A polynomial is said to be irreducible in Fq[x] if it cannot be factored into polynomials of smaller degree.

Lemma 5.2 (Irreducible 2nd or 3rd Degree Polynomials): A polynomial c(x) in Fq[x] of degree 2 or 3 is irreducible ⇐⇒ c(a) ≠ 0 for all a ∈ Fq.

Proof: c(x) can be factored into polynomials of smaller degree ⇐⇒ it has at least one linear factor (x − a) ⇐⇒ c(a) = 0.

• Suppose we wish to find all ternary cyclic codes of length n = 4. The generators for such codes must be factors of x^4 − 1 in the ring F3[x]. Since 1 is a root of the equation x^4 − 1 = 0, we know that (x − 1) is a factor and hence
x^4 − 1 = (x − 1)(x^3 + x^2 + x + 1).
By Lemma 5.1, the factor x^3 + x^2 + x + 1 is not irreducible because it has a linear root at a = 2 = −1 in F3. Using long division, we obtain
x^4 − 1 = (x − 1)(x + 1)(x^2 + 1),
and by Lemma 5.2 the factor x^2 + 1 is irreducible. Since any combination of these three irreducible factors can be used to construct a generator polynomial g(x) for a cyclic code, there are a total of 2^3 = 8 ternary cyclic codes of length 4, as illustrated in Table 5.1. Upon examining the weights of the rows of the possible generator matrices, we see that the generated codes either have minimum distance less than or equal to 2 or else equal to 4. Thus, it is not possible to have a cyclic code of length 4 and minimum distance 3. In particular, Ham(2, 3), for which n = (3^2 − 1)/(3 − 1) = 4, cannot be cyclic; that is, not all Hamming codes have a cyclic representation.
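A sketch connecting Theorem 5.4 to the ternary example: build the generator matrix of ⟨g(x)⟩ from the coefficient list of g, and verify one factorization of x^4 − 1 over F3 (coefficient lists indexed by power of x, arithmetic mod 3):

```python
def cyclic_generator_matrix(g, n):
    """Rows are g(x), x g(x), ..., x^(n-r-1) g(x) as length-n coefficient
    vectors (Theorem 5.4), where r = deg g."""
    r = len(g) - 1
    return [[0] * i + g + [0] * (n - r - 1 - i) for i in range(n - r)]

def poly_mul(a, b, q=3):
    """Multiply two coefficient lists over F_q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

# (x^2 - 1)(x^2 + 1) = x^4 - 1 over F_3: note -1 = 2 (mod 3).
assert poly_mul([2, 0, 1], [1, 0, 1]) == [2, 0, 0, 0, 1]   # x^4 + 2 = x^4 - 1

print(cyclic_generator_matrix([1, 0, 1], 4))  # -> [[1, 0, 1, 0], [0, 1, 0, 1]]
```

The printed matrix is the generator matrix of the cyclic code with g(x) = x^2 + 1, one of the eight length-4 ternary cyclic codes.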

Table 5.1: Generator polynomial g(x) and corresponding generator matrix G for all possible ternary cyclic codes of length 4.

g(x) = 1:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1

g(x) = x − 1:
−1 1 0 0
0 −1 1 0
0 0 −1 1

g(x) = x + 1:
1 1 0 0
0 1 1 0
0 0 1 1

g(x) = x^2 + 1:
1 0 1 0
0 1 0 1

g(x) = (x − 1)(x + 1) = x^2 − 1:
−1 0 1 0
0 −1 0 1

g(x) = (x − 1)(x^2 + 1) = x^3 − x^2 + x − 1:
[−1 1 −1 1]

g(x) = (x + 1)(x^2 + 1) = x^3 + x^2 + x + 1:
[1 1 1 1]

g(x) = x^4 − 1 = 0:
[0 0 0 0]

An easy way to find the parity check matrix for a cyclic [n, k] code (without requiring that we first put G given by Theorem 5.4 in standard form) is to first construct the check polynomial h(x) of C from its generator polynomial g(x), where h(x) satisfies x^n − 1 = g(x)h(x) in F_q[x]. Since g is monic and has degree n − k, we see that h is monic and has degree k.

Theorem 5.5 (Cyclic Check Polynomial): An element c(x) of R_q^n is a codeword of the cyclic code with check polynomial h ⇐⇒ c(x)h(x) = 0 in R_q^n.

Proof:

"⇒" If c(x) is a codeword, then in R_q^n we have c(x) = a(x)g(x) ⇒ c(x)h(x) = a(x)g(x)h(x) = a(x)(x^n − 1) = a(x)·0 (mod x^n − 1) = 0.

"⇐" We can express any polynomial c(x) in R_q^n as c(x) = q(x)g(x) + r(x), where deg r < deg g = n − k. If c(x)h(x) = 0, then r(x)h(x) = c(x)h(x) − q(x)g(x)h(x) = 0 (mod x^n − 1). If r(x) ≠ 0, consider its highest degree coefficient a ≠ 0. Then, since h is monic, the coefficient of the highest degree term of the product r(x)h(x) is a = a·1 ≠ 0, so r(x)h(x) ≠ 0 in F_q[x]. But deg r(x)h(x) < n − k + k = n, so r(x)h(x) ≠ 0 in R_q^n as well, which is a contradiction. Thus r(x) = 0 and so c(x) is a codeword: c(x) = q(x)g(x) ∈ ⟨g(x)⟩.

Theorem 5.6 (Cyclic Parity Check Matrix): A cyclic code with check polynomial h(x) = h_0 + h_1 x + ... + h_k x^k has dimension k and parity check matrix

H = [ h_k h_{k−1} h_{k−2} ... h_0   0    0  ...  0  ]
    [  0   h_k  h_{k−1} h_{k−2} ... h_0  0  ...  0  ]
    [  0    0    h_k  h_{k−1} h_{k−2} ... h_0 .. 0  ]
    [  .    .    .     .     .      .    .   .   .  ]
    [  0    0   ...    0   h_k h_{k−1} h_{k−2} ... h_0 ].

Proof: Since the degree of the generator polynomial g is r = n − k, by Theorem 5.4 the dimension of the code must be k. From Theorem 5.5, we know that a codeword c(x) = c_0 + c_1 x + ... + c_{n−1} x^{n−1} must satisfy c(x)h(x) = 0 in R_q^n. In particular, the coefficients
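The relation x^n − 1 = g(x)h(x) means the check polynomial can be obtained by ordinary polynomial division over F_q. As a hedged illustration (using the binary generator g(x) = 1 + x + x^3, which generates the cyclic Ham(3, 2) code discussed later in this chapter), the following sketch divides x^7 − 1 by g(x) over F_2:

```python
# Divide num(x) by den(x) over F_2; coefficient lists are lowest degree first,
# and den is assumed monic. Returns (quotient, remainder).
def gf2_div(num, den):
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quot) - 1, -1, -1):
        if num[shift + len(den) - 1]:      # leading coefficient to cancel
            quot[shift] = 1
            for i, d in enumerate(den):
                num[shift + i] ^= d        # subtract = XOR over F_2
    return quot, num

n = 7
g = [1, 1, 0, 1]                 # g(x) = 1 + x + x^3
xn1 = [1] + [0] * (n - 1) + [1]  # x^7 - 1 = x^7 + 1 over F_2
h, rem = gf2_div(xn1, g)
print(h)         # [1, 1, 1, 0, 1]: h(x) = 1 + x + x^2 + x^4, monic of degree k = 4
print(any(rem))  # False: g(x) divides x^7 - 1 exactly
```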

. k + 1. . . . . 0. . α2 −2 } for some primitive element α. we see that h(x) is a factor of xn − 1. 0 hk hk−1 hk−2 . and by Theorem A. ar−1 But then. since each of these equations is one of the n − k h0 0 0 hk hk−1 hk−2 .. H is a parity check matrix for the code with check polynomial h(x). α0 . . .6. and 5. h0 0 .. n − 1 we then have 0= ci hj .3 F2 [x]/p(x) is a ﬁeld of order 2r . . . . ..1. + h0 xk . 2) is equivalent to a cyclic code.. . rows of the matrix equation 0 c0 0 0 c1 0 0 c2 = 0 . . . . . This means that C ⊥ contains the span of the rows of H. .. . . .. . . . .51 xk . . By Theorem A. . . . . Proof: Let p(x) be an irreducible polynomial of degree r in F2 [x]. with (monic) generator h−1 h(x).. . . . . this says that C ⊥ is itself a cyclic code. . . . . . 5. . The codewords are thus orthogonal to all linear combinations of the rows of H. Deﬁnition: The reciprocal polynomial h(x) of a polynomial h(x) = h0 + h1 x + .. 0 cn−1 h0 . . xn−1 of the product c(x)h(x) must be zero. . h0 00 .. . . . . 0 0 . . + hk = hk + hk−1 x + . In view of Theorems 5. We associate each element a0 + a1 x + a2 x2 + . .. 0 0 hk hk−1 hk−2 .7 (Cyclic Binary Hamming Codes): The binary Hamming code Ham(r. . That is. . . Remark: Since h(x)g(x) = h(x)xn−k g(x−1 ) = xn h(x−1 )g(x−1 ) = xn [(x−1 )n − 1] = 1 − xn . h0 . + ar−1 xr−1 ∈ F2 [x]/p(x) with the column vector a0 a1 . .2. . . α1 . . so we see that H has rank n − k and hence generates exactly the linear subspace C ⊥ . . . α2. 0 hk hk−1 hk−2 . . h(x) = xk h(x−1 ) = h0 xk + h1 xk−1 + . + hk xk is obtained by reversing the order of its coeﬃcients: . xk+1 . . i+j=ℓ the codewords are orthogonal to all cyclic shifts of the vector hk hk−1 hk−2 . . Theorem 5.4 we know that F2 [x]/p(x) can be r expressed as the set of distinct elements {0. 0 We are now able to show that all binary Hamming codes have an equivalent cyclic representation. But hk = 1. for ℓ = k. .

A codeword c(x) = c0 + c1 x + . the primitive element has order 7 = 8 − 1.52 Let n = 2r − 1. which is just a monic irreducible polynomial in F2 [x] having a primitive element as a root. x6 = x2 + 1}. That is. Theorem 5. 1.7. From the proof of Theorem 5.1 (Binary Hamming Generator Polynomials): Any primitive polynomial of F2r is a generator polynomial for a cyclic Hamming code Ham(r. we have r(α)c(α) = r(α)0 = 0 in F2 [x]/p(x). 2) = p(x) .5. A parity check matrix for a cyclic version of the Hamming code Ham(3. so r(x)c(x) is also an element of C. for example. 0 0 1 0 1 1 1 Q. the following corollary to Theorem 5. Its minimal polynomial p(x) is a primitive polynomial of F2r . . In particular. Proof: Let α be a primitive element of F2r . Ham(r. . x5 = x2 + x + 1. 2).4 that any minimal polynomial of a primitive element β of Fpr is a generator polynomial for a cyclic code in Fpr (β is a root of xn − 1 = 0.7 establishes that Ham(r. By Theorem A. . 2) can be generated by any primitive polynomial of F2r . Hence Ham(r.5 when n = pr −1 gives us a clue: on comparing these results. The primitive element x is a root of the primitive polynomial x3 + x + 1 in F8 . so that n C = {c(x) ∈ R2 : c(α) = 0 in F2 [x]/p(x)}. What is the generator polynomial for Ham(r.1 then implies that C is cyclic. we see that Ham(r. Corollary 5. x. that is. 2) consists precisely of those polynomials c(x) for which c(α) = 0. . 2) is thus 1 0 0 1 0 1 1 H = 0 1 0 1 1 1 0.7.2 and Theorem A. + cn−1 xn−1 in this code must then satisfy the vector equation c0 + c1 α1 + c1 α2 + . Note that x7 = x3 + x = 1. 2) ⊂ p(x) . . Moreover. Note that F8 has x as a primitive element since all polynomials in F2 [x] of degree less than 3 can be expressed as powers of x: F8 = {0.1 implies that every multiple of the codeword p(x) belongs to the cyclic code Ham(r. The r × n matrix H = [ 1 α α2 CHAPTER 5. αn−1 ] is seen to be the parity check matrix for C = Ham(r. 2)? A. . The close parallel between Theorem 5. p(x) itself. 
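The power table for F_8 quoted above (x^3 = x + 1, x^4 = x^2 + x, x^5 = x^2 + x + 1, x^6 = x^2 + 1, x^7 = 1) can be generated mechanically: multiply by x and reduce by p(x) whenever the degree reaches r. A small sketch (not from the notes), packing the coefficients of each field element into the bits of an integer:

```python
# Power table of the primitive element x in F_{2^r} = F_2[x]/(p(x)).
# Field elements are r-bit integers: bit i holds the coefficient of x^i.
def power_table(p_int, r):
    table, a = [], 1
    for _ in range(2 ** r - 1):
        table.append(a)
        a <<= 1          # multiply by x
        if a >> r:       # degree reached r: reduce by p(x)
            a ^= p_int
    return table

table = power_table(0b1011, 3)   # p(x) = x^3 + x + 1 for F_8
print([format(a, '03b') for a in table])
# ['001', '010', '100', '011', '110', '111', '101']:
# x^0..x^6, e.g. x^3 = 011 = x + 1 and x^6 = 101 = x^2 + 1, as in the text.
```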
we see from Theorem 5. + cn−1 αn−1 = 0. x3 = x + 1. • The irreducible polynomial x3 + x + 1 in F2 [x] can be used to generate the ﬁeld F8 = F2 [x]/(x3 + x + 1) with 23 = 8 elements. CYCLIC CODES . n If c(x) ∈ C and r(x) ∈ R2 . . 2). Theorem 5. x4 = x2 + x. see Appendix A). any such polynomial must be a multiple of p(x). x2 . 2) since its columns are precisely the distinct nonzero vectors of F2r .

2) code:

G = [ 1 1 0 1 0 0 0 ]
    [ 0 1 1 0 1 0 0 ]
    [ 0 0 1 1 0 1 0 ]
    [ 0 0 0 1 1 0 1 ].
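A generator matrix like the one above is just the coefficient vector of g(x) written k times with successive right shifts. The following sketch (illustrative code, not from the notes) builds it for any cyclic [n, k] code:

```python
# Generator matrix of a cyclic [n, k] code: rows are the coefficient
# vectors of g(x), x g(x), ..., x^{k-1} g(x), i.e. successive right shifts.
def cyclic_generator_matrix(g, n):
    k = n - (len(g) - 1)      # k = n - deg g
    return [[0] * i + g + [0] * (n - len(g) - i) for i in range(k)]

G = cyclic_generator_matrix([1, 1, 0, 1], 7)   # g(x) = 1 + x + x^3
for row in G:
    print(row)
# Reproduces the 4x7 generator matrix of the cyclic Ham(3, 2) code above.
```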

. In fact. to obtain i=0 the codeword c(x) = f (x)g(x). we will know that its minimum distance is exactly 5 since the codeword g(x) has weight 5. . Remark: Often we take α to be a primitive element of Fqs . . a3 . ak−1 . 54 . and α9 . developed by R. and α4 . In this chapter. we discuss a class of important and widely used cyclic codes that can correct multiple errors. . α2 . so that n = q s − 1. where n is any factor of q s − 1. Hocquenghem (1959). known as Bose–Chaudhuri–Hocquenghem (BCH) codes. by ﬁnding a generator polynomial g(x) that has roots at α. . α2 . α6 . α4 . Remark: We will establish that a BCH code of odd design distance d has a minimum distance of at least d. α12 . namely at α. we can construct a [15. by showing that such a code can correct (d − 1)/2 errors. it is possible to construct BCH codes over Fqs of length n. K. C. A BCH code of length n and design distance d is a cyclic code generated by the product of the distinct minimal polynomials in Fq [x] of the elements α. α3 . . α8 . To encode the message word a0 a1 . Once we have shown that this code can correct two errors. Bose and D. αd−1 . α2 . 7] code that can correct two errors. Hamming codes are of limited use because they cannot correct more than one error. • For the primitive element α = x of the ﬁeld F24 = F2 [x]/(x4 +x+1). The resulting BCH code is known as a primitive BCH code.Chapter 6 BCH Codes For noisy transmission lines. Ray-Chaudhuri (1960) and independently by A. However. Such a generator can be created from the product of the minimal polynomials m1 (x) = x4 + x + 1 of α and m3 (x) = x4 + x3 + x2 + x + 1 of α3 : g(x) = m1 (x)m3 (x) = (x4 + x + 1)(x4 + x3 + x2 + x + 1) = x8 + x7 + x6 + x4 + 1. g(x) has even more roots than prescribed. Deﬁnition: Let α be an element of order n in a ﬁnite ﬁeld Fqs . we represent it by the polynomial f (x) = k−1 ai xi and form its product with the generator polynomial g(x).
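The product g(x) = m_1(x)m_3(x) = x^8 + x^7 + x^6 + x^4 + 1 computed above can be verified with carry-free (GF(2)) polynomial multiplication. A minimal sketch, representing a binary polynomial as an integer whose bit i is the coefficient of x^i:

```python
# Carry-free product of binary polynomials packed into integers.
def gf2_mul(a, b):
    out = 0
    while b:
        if b & 1:
            out ^= a   # add (XOR) the current shift of a
        a <<= 1
        b >>= 1
    return out

m1 = 0b10011   # x^4 + x + 1, minimal polynomial of alpha
m3 = 0b11111   # x^4 + x^3 + x^2 + x + 1, minimal polynomial of alpha^3
g = gf2_mul(m1, m3)
print(bin(g))  # 0b111010001: g(x) = x^8 + x^7 + x^6 + x^4 + 1, as claimed
```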

2 S2 = y(α2) = c(α2 ) + e(α2 ) = e(α2 ) = e1 + . it will be helpful to ﬁrst discuss the case where d = 5 and t ≤ 2 errors have occurred. S1 = y(α) = c(α) + e(α) = e(α) = e1 + . . 1 2 S3 = e3 + e3 . . . . powers ℓ1 . . t. We now describe the decoding procedure for BCH codes. t . = c(αd−1 ) = 0. . . from 1 to d − 1. To understand how the above syndrome equations are solved. t . Deﬁnition: The error locator polynomial is . We deﬁne ei = 0 for i > t and write S1 = e1 + e2 . Notice that the roots of σ(x) are just the inverses of ei . . + ed−1 . . + b1 x + 1. the following deﬁnition will be helpful. Suppose that y(x) is received rather than c(x) and that t errors have occurred. . + e3 . . from a table of the elements of Fqs . ℓt . .. . . To keep the notation simple we begin by illustrating the procedure ﬁrst for the binary case. t 1 The decoding problem now amounts to determining which value of t and which choices of the ﬁeld elements e1 . . . i = 1.1: Show that a polynomial c(x) belongs to a BCH code with design distance d ⇐⇒ c(α) = c(α2 ) = . . Once these are found. Sd−1 = y(αd−1) = c(αd−1 ) + e(αd−1 ) = e(αd−1 ) = ed−1 + . et are consistent with the above equations. . Then the error polynomial e(x) = y(x) − c(x) can be written as e(x) = xℓ1 + xℓ2 + .55 Remark: In the binary case q = 2. e2 . we can determine the corresponding . To ﬁnd a solution to the above equations. ℓ2 . . where q = 2. the generator of a BCH code is just the product of the distinct minimal polynomials of the odd powers. . σ(x) = (e1 x − 1)(e2 x − 1) . 1 2 . . . + e2 . . ℓt such that ei = αℓi . . We then compute the syndrome S1 by substituting α into y(x). . 3 S3 = y(α3) = c(α3 ) + e(α3 ) = e(α3 ) = e1 + . we evaluate . of the primitive element. 2. . . Problem 6. .. These powers tell us directly which bits we need to toggle. . . (et x − 1) = bt xt + bt−1 xt−1 + . . . Likewise. . . t. . . . where ei = αℓi for i = 1. . . + xℓt for some unknown powers ℓ1 . S2 = e2 + e2 . + et . ℓ2 . 2.

Since the determinant S1 S3 − S2 = 0. Finally. in which case e(x) = 0. . The powers of α associated with the inverses of these roots identify the two bit positions in which errors have occurred. so that e1 = e2 .1) will have rank 1. the error locator polynomial σ(x) = b2 x2 + b1 x+ 1 must have two roots in F24 . for otherwise the ﬁrst equation would imply that S3 = 0 and the 2 rank of the coeﬃcient matrix would be 0. we can solve for the coeﬃcients b1 and b2 . (6. 2 2 2 2 On adding these equations. the system of equations will have rank 0 ⇐⇒ no errors have occurred. 1 2 This implies that e2 = 0 (only one error has occurred) and that S1 = e1 = αℓ1 . if the rank of the coeﬃcient matrix is 2. then S3 = S1 = 0 and the coeﬃcient matrix of Eq. 1 2 CHAPTER 6. 2 i we know that 0 = e3 σ(e−1 ) = e3 (b2 e−2 + b1 e−1 + 1) = b2 e1 + b1 e2 + e3 1 1 1 1 1 1 1 and 0 = e3 σ(e−1 ) = b2 e2 + b1 e2 + e3 . we would have obtained S2 b2 + S3 b1 = −S4 .) If the coeﬃcient matrix in Eq.1) S1 b2 + S2 b1 = −S3 . 2 1 2 i. we obtain 2 0 = b2 (e1 + e2 ) + b1 (e1 + e2 ) + (e3 + e3 ). BCH CODES The error locator polynomial is σ(x) = b2 x2 +b1 x+1. If two errors have occurred. the minus sign can be omitted. if only one error has occurred. (Of course.56 S4 = e4 + e4 . Since q = 2. that is. Since σ(e−1 ) = 0 for i = 1. then S1 = S2 = S3 = S4 = 0 and hence e1 + e2 = 0. we simply look up the exponent ℓ1 such that αℓ1 = S1 and then toggle bit ℓ1 of y(x) to obtain the transmitted codeword c(x). which we can determine by trial and error. we only need to solve the system of equations S1 S2 S2 S3 b2 b1 =− S3 S4 (6. (6. Note that S1 = 0. That is.1) has rank 0. 2 Suppose that the coeﬃcient matrix has rank 1.e. in the binary case. Using a power table for F24 . we know that S2 = S1 . If for each i we had multiplied σ(e−1 ) = 0 by e4 instead of e3 and added the resulting i i i equations. no error has occurred. But e3 + e3 = (e1 + e2 )3 ⇒ 0 = 3e1 e2 (e1 + e2 ) = 3e1 e2 S1 . This would imply that ℓ1 = ℓ2 . 
To ﬁnd b1 and b2 . we 3 deduce S3 = S1 . 3 Conversely.

that is. 110 0000. The inverse. y(x) = x9 + x8 + x6 + x4 + x + 1. . if instead the received vector was 110 010 100 100 000. Consulting the power table for F2 [x]/(x4 + x + 1). • In the previous example. say e1 . Upon adding α4 times the ﬁrst equation to the second equation. 2 S2 = S1 = α 8 . we see that (α12 + α7 )b1 = α11 + α. Upon division of the associated polynomial x9 + x6 + x5 + x4 + x + 1 by g(x) = x8 + x7 + x6 + x4 + 1 we obtain x + 1. we see that the syndromes are S1 = y(α) = α9 + α8 + α6 + α4 + α + 1 = (α3 + α) + (α2 + 1) + (α3 + α2 ) + (α + 1) + α + 1 = α + 1 = α4 . 7] BCH code generated by the polynomial g(x) = x8 + x7 + x6 + x4 + 1. S3 = y(α3) = α27 + α24 + α18 + α12 + α3 + 1 = α12 + α9 + α3 + α12 + α3 + 1 = α9 + 1 = α3 + α + 1 = α7 . 2 S4 = S2 = α20 = α5 . σ(x) = α13 x2 + α4 x + 1 = (e1 x − 1)(e2 x − 1) S1 b2 + S2 b1 = S3 ⇒ α4 b2 + α8 b1 = α7 . the two errors are in the ﬁfth and eighth positions. Since the product e1 e2 = b2 = α13 . we immediately see that e2 = α5 . y(x) = x9 + x6 + x4 + x + 1. Thus. i. so we decode the received vector 110 010 101 100 000 as the codeword 110 011 100 100 000. the transmitted codeword is 110 011 100 100 000. so one root is α7 . the ﬁrst such i we ﬁnd is i = 7. Thus. Suppose that two errors have occurred. that is. Given the message word 110 0000. the syndromes would be S1 = y(α) = α9 + α6 + α4 + α + 1 = (α3 + α) + (α3 + α2 ) + (α + 1) + α + 1 = α2 + α = α5 . 2 S2 = S1 = α10 . which corresponds to the original message. 2 3 Since S1 = 0 and S1 S3 −S2 = S1 (S3 −S1 ) = S1 (α7 −α12 ) = 0. b1 = α−2 α6 = α4 and hence b2 = α−4 (α7 + α8 α4 ) = α−4 α2 = α13 . has rank 2.57 • Let us demonstrate this decoding scheme for the [15. we search through the ﬁeld until we ﬁnd an i such that α2i−2 + αi+4 = 1. Incrementing from i = 0. That is. the error polynomial is We determine the roots of this equation by trial and error. c(x) = (1 + x) g(x) = x9 + x6 + x5 + x4 + x + 1. of this root is α8 . That is. 2 S4 = S2 = α16 = α. 
S3 = y(α3 ) = α27 + α18 + α12 + α3 + 1 = α12 + α3 + α12 + α3 + 1 = 1.e. the system of equations S2 b2 + S3 b1 = S4 ⇒ α8 b2 + α7 b1 = α. so that the received vector is 110 010 101 100 000.
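Syndrome computations like those in the worked examples above reduce to table lookups in F_16 = F_2[x]/(x^4 + x + 1). The following sketch (illustrative, not from the notes) rebuilds the power table and checks the syndromes of both received vectors:

```python
# Power table of alpha = x in F_16 = F_2[x]/(x^4 + x + 1); elements are
# 4-bit integers with bit i = coefficient of x^i.
def f16_powers():
    table, a = [], 1
    for _ in range(15):
        table.append(a)
        a <<= 1
        if a >> 4:
            a ^= 0b10011          # reduce by x^4 + x + 1
    return table

POW = f16_powers()

def syndrome(y_exps, i):
    """S_i = y(alpha^i) for y(x) given as the set of exponents with coefficient 1."""
    s = 0
    for e in y_exps:
        s ^= POW[(e * i) % 15]    # alpha^15 = 1
    return s

# First received vector 110 010 101 100 000: y(x) = 1 + x + x^4 + x^6 + x^8 + x^9
y1 = [0, 1, 4, 6, 8, 9]
print(syndrome(y1, 1) == POW[4])  # True: S1 = alpha^4
print(syndrome(y1, 3) == POW[7])  # True: S3 = alpha^7

# Second received vector 110 010 100 100 000: y(x) = 1 + x + x^4 + x^6 + x^9
y2 = [0, 1, 4, 6, 9]
print(syndrome(y2, 1) == POW[5])  # True: S1 = alpha^5
print(syndrome(y2, 3) == POW[0])  # True: S3 = 1 = S1^3
```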

In general. . . (6. . b2 . the following theorem establishes that the matrix V is nonsingular ⇐⇒ the values ej are distinct. . . .. .. et−1 t and D = diag(e1 . .. S1 = α5 . bt in terms of the 2t ≤ d − 1 syndromes S1 . .3) as M = V DV t . + b1 et+1 + et+2 . t are nonzero and distinct. . . . . so that each of the values ej for j = 1. . decoding requires that we solve the nonlinear system of d−1 syndrome equations i Si = y(αi) = ei + . t are nonzero. one again we see that the transmitted codeword was 110 011 100 100 000. t−1 e1 1 e2 e2 2 . . . . .. St+1 bt−1 St+2 . e2 . . + b1 et + et+1 . i = 1.. . . . . . St bt St+1 S2 S3 . . . Remark: The matrix D is nonsingular ⇐⇒ each of its eigenvalues ej . . In fact. + b1 e2t−1 + e2t . . . . we know that only one error has occurred. . . et−1 2 . . . . .58 CHAPTER 6..2: Show that we may rewrite the coeﬃcient matrix in Eq. (6.3) . i i i i i i . Also. . . . . . et ) is a diagonal matrix with components dii = ei . A straightforward generalization of the t = 2 decoding scheme leads to the t equations: 0 = et+1 σ(e−1 ) = bt ei + bt−1 e2 + . + et . . . . . . t}. . . . . . . . . where V is the Vandermonde matrix 1 e1 2 V = e1 . S2t : S1 S2 . . i i i i i i On summing each of these equations over i. . .2) 1 for the value of t and the errors {ej : j = 1. 1 et e2 t . . d − 1 (6. St St+1 . so the error must be in the ﬁfth position. Here t ≤ (d − 1)/2 is the actual number of errors that have occurred. j = 1. i i i i i 0 = et+2 σ(e−1 ) = bt e2 + bt−1 e3 + .. . . .. S2t−1 b1 S2t Problem 6. S2 . . . 0 = e2t σ(e−1 ) = bt et + bt−1 et+1 + . . . = − . BCH CODES 3 Since S3 = S1 = 0. . we obtain a linear system of equations for the values b1 . .

. . 1 1 . . .. . Remark: We thus see that the matrix M = V DV t is nonsingular ⇐⇒ the error values ej are nonzero and distinct. . . . .. . .. . et−2 1 t−1 i−1 1 e2 2 e2 . For t > 2. . . . 1 e2 − et e2 (e2 − et ) .. . et−1 (et−1 − et ) 0 = (−1)t−1 e1 − et e1 (e1 − et ) . . . suppose that all (t − 1)×(t − 1) Vandermonde matrices have determinant t−1 t−1 i−1 i. . . .. .59 Theorem 6. et−2 (e1 − et ) et−2 (e2 − et ) . . . e2 t−1 (et − e1 )(et − e2 ) . et−1 . et−1 − et . . . t−2 t−2 t−2 e1 (e1 − et ) e2 (e2 − et ) .j=1 i>j (ei − ej ) = i=1 j=1 (ei − ej ). . . 1 . . (et − et−1 ) . for i = t. . 1 et e2 t . . we can rewrite the determinant of any t×t Vandermonde matrix as . .. et−1 (et−1 − et ) . . . . et−1 − et 0 . . . . 1 e1 − et e1 (e1 − et ) . . t − 1.j=1 i>j (ei − ej ). et−1 t has determinant i. et−1 1 1 e2 e2 2 . . .1 (Vandermonde Determinants): For t ≥ 2 the t×t Vandermonde matrix V = t 1 e1 e2 1 . .... . . . et−2 t−1 t−1 t i−1 et−2 2 = i=1 j=1 (ei − ej ) · j=1 (et − ej ) = i=1 j=1 (ei − ej ).. Proof: When t = 2 we see that det V = e2 − e1 ... .. . . = 1 e1 2 e1 . . . By subtracting et times row i − 1 from row i. et−2 (et−1 − et ) 1 2 t−1 . e2 − et e2 (e2 − et ) . et−1 (et−1 − et ) 0 . 2. . . et−1 2 . . .. .
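Theorem 6.1's product formula for the Vandermonde determinant is easy to spot-check numerically over the integers. A small sketch (not part of the notes), comparing a brute-force determinant against the product prod_{i>j} (e_i − e_j):

```python
from itertools import permutations

# Determinant of a small integer matrix via the Leibniz formula.
def det(M):
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(perm[i] > perm[j]
                  for i in range(n) for j in range(i + 1, n))
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += (-1) ** inv * prod
    return total

# Vandermonde matrix with rows (1, e_i, e_i^2, ..., e_i^{t-1}).
e = [2, 3, 5, 7]
V = [[ei ** j for j in range(len(e))] for ei in e]

formula = 1
for i in range(len(e)):
    for j in range(i):
        formula *= e[i] - e[j]

print(det(V), formula)  # both 240: det V = prod_{i>j} (e_i - e_j)
```

In particular det V vanishes exactly when two of the e_i coincide, which is the nonsingularity criterion used in the decoding argument.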

q − 1}. St St+1 . 17. ℓ2 . 19. . BCH CODES Remark: If we attempt to increase the value of t in Eq. + qt xℓt . . The cyclotomic cosets mod 23 are {0}. . and {5.2) beyond the actual number of errors that have occurred. . We then see that any BCH code of design distance d can correct ⌊(d − 1)/2⌋ errors. we simply solve the linear system Eq. .1 and the fact that the BCH code can correct (d − 1)/2 errors. . S2 S3 . qt et ). This gives us a method for ﬁnding the number of errors: t is just the largest number such that S1 S2 . moreover. We encapsulate this result in the following theorem. we are normally interested only in the case of odd d. .2 (BCH Bound): The minimum distance of a BCH code of odd design distance d is at least d. . • Let α be a primitive element of F211 . .60 CHAPTER 6. Since 211 − 1 = 2047 = 23 × 89. (6. the element β = α89 has order n = 23. . either the values ej will no longer be distinct or at least one of them will be zero. Thus the minimal polynomials of m1 (x) of β and m5 (x) of β 5 in F211 each have degree 11. 1. . Theorem A. . . . . M will no longer be invertible. . (6. In this case. . .3) for the coeﬃcients b1 . 13.7 implies that m5 (x) = m1 (x) . St+1 M = . as illustrated by the following example. . 15. Remark: Once we have determined the maximum value of t such that the coeﬃcient matrix M is invertible. 6. . the syndromes become Si = q1 ei + . . . . 3. Theorem 6. q2 e2 . Remark: Although Theorem 6. . 14}. 11. 9. In either case. Remark: It may happen that the minimum distance of a BCH code exceeds its design distance d. . S2t−1 is invertible. then it is impossible for a (t + 1) × (t + 1) or larger syndrome matrix based on y to be invertible. we know from Theorem A. 7. .2 may be shown to hold also when the design distance d is even. 18. + qt ei . 22. We can determine all t roots of σ(x) simply by searching through all of the ﬁeld elements (at most one pass is required). St . . the error vector becomes e(x) = q1 xℓ1 + q2 xℓ2 + . 
and D = 1 t diag(q1 e1 . Remark: If it is a priori known that no more than t errors have occurred in a received polynomial y.1 1 Since β 23 = 1. 4. 8.5 that x23 − 1 = (x − 1)m1 (x)m5 (x). 21. . . 10. 20. 12}. where the qi ∈ {0. bt of the error locator polynomial σ(x). . Proof: This follows from Theorem 1. ℓt corresponding to the inverses of these roots precisely identify the t positions of the received vector that are in error. The exponents ℓ1 . 16. 2. . . {1. Remark: The above decoding procedure can be easily extended to nonbinary codes. b2 . . .

which has roots at β. g(x) = (x − α)(x − α2 )(x − α3 ) . we know from Theorem 2. the polynomial g(x) = (x−2)(x−4)(x−8)(x−5)(x−10)(x−9) = x6 +6x5 +5x4 +7x3 +2x2 +8x+2 generates a triple-error correcting [10. in fact. and β 4 . . Because Reed–Solomon codes are optimal in this sense and easily implementable in hardware (using shift registers). That is. (x − αd−1 ). 7] Reed–Solomon code over Z11 . Remark: The special case of an [n. 6. the minimum distance of the code must at least n − k + 1. consistent with the above claim that the design distance of a Reed-Solomon code is in fact the actual minimum distance.3: Prove that the Hamming code Ham(r. 12. 7] Golay code we encountered in Chapter 4.2 that the minimum distance can be no more than n − k + 1. • High-performance computer memory chips containing so-called ECC SDRAM use Reed–Solomon codes to correct one error per 64-bit word in addition to detecting very rare multiple errors (which are programmed to generate a machine halt). this BCH code is equivalent to the triple-error correcting [23. 12] binary BCH code of length 23 from the degree 11 generator polynomial m1 (x). The generator polynomial of a Reed–Solomon code of design distance d. and data transmission channels. Thus. Problem 6. One of the codewords is g(x) itself. the actual minimum distance is 7. magnetic and optical (compact disk) storage systems. Note that the minimal polynomial of any element of Fq has degree s = 1. . • Compact disk players use a double-error correcting [255. thus has degree n − k = d − 1. they are widely used for error correction in computer memory chips. What is the value of k? Problem 6. • Since 2 is a primitive element of Z11 . where the primitive element α comes from the same ﬁeld Fq as the coeﬃcients of the generator polynomial. β 3 .1.3 establishes the hierarchy of error-correcting codes illustrated in Fig. a Reed–Solomon code achieves the socalled singleton upper bound n − k + 1 for the minimum distance of a linear code. 4. β 2. 
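The Reed–Solomon generator over Z_11 quoted above can be expanded by machine. The following sketch (illustrative, not from the notes) multiplies out (x − 2)(x − 4)(x − 8)(x − 5)(x − 10)(x − 9), whose roots are the powers 2^1, ..., 2^6 of the primitive element 2:

```python
# Multiply polynomials with coefficients mod p (lowest degree first).
def polymul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

p = 11
g = [1]
for root in (2, 4, 8, 5, 10, 9):        # 2^1, ..., 2^6 mod 11
    g = polymul(g, [-root % p, 1], p)   # multiply by (x - root)

print(g)
# [2, 8, 2, 7, 5, 6, 1], i.e. g(x) = x^6 + 6x^5 + 5x^4 + 7x^3 + 2x^2 + 8x + 2,
# matching the generator polynomial in the text.
```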
2) is equivalent to an [n. 5] Reed–Solomon code over F256 (the symbols are eight-bit bytes) with interleaving of the data over the disk to increase robustness to localized data loss. which has weight 7. . While the design distance of this code is 5. k] BCH code for s = 1.61 We can then construct a [23. high-speed modems. But since there are rk H ≤ n − k independent columns in the parity check matrix. is known as a Reed–Solomon code. k] BCH code with distance d = 3. 251.

(c) Is C perfect? Circle the correct answer: Yes / ¦ ¥ No . (a) Find the degree of the generator polynomial g(x) for C. . so the minimum distance of this linear code cannot exceed 7. prove that the actual minimum distance of C is exactly 7. (f) Use the irreducible factors of x15 − 1 (or any other means) to ﬁnd the check polynomial h(x) of C. . (d) Construct the generator polynomial g(x) for C. . (e) We know that the minimum distance of C is at least the design distance 7. We want α. § ¤ g(x) = m1 (x)m3 (x)m5 (x) = (x4 + x3 + 1)(x4 + x3 + x2 + x + 1)(x2 + x + 1) = (x8 + x4 + x2 + x + 1)(x2 + x + 1) = x10 + x9 + x8 + x6 + x5 + x2 + 1. 2) CHAPTER 6.4: In this question. Problem 6. which has degree 4 + 4 + 2 = 10.62 General Codes Linear Codes Cyclic Codes BCH Codes Ham(r. so we want g(x) = m1 (x)m3 (x)m5 (x).. (b) What is the dimension k of C? k = n − 10 = 5 This code is not perfect since it has neither the parameters of a Hamming code nor a Golay code. That is. n − k = 10.α2 . a BCH code C of length n = 15.α6 to be roots of g(x). The weight of g(x) is 7. we construct. Using the fact that g(x) itself is a codeword of C. design distance 7. Since x16 − x = x(x + 1)m1 (x)m3 (x)m5 (x)m7 (x).1: Hierarchy of Error-Correcting Codes. . BCH CODES Figure 6. from the primitive element α = x in the ﬁeld F2 [x]/(x4 + x3 + 1).

{2. 1. + α15 = (α − 1)−1 α15 − 1 = (α − 1)−1 · 0 = 0. α3 i . the original message polynomial was m7 (x) = x4 +x+1. What is the order of x? Since x2 = −1. The order e must divide 8. ﬁnd the transmitted codeword. Errors have occurred in positions 0. Our received vector is only a distance three away. Hence h(x) = (x + 1)m7 (x) = (x + 1)(x4 + x + 1) = x5 + x4 + x2 + 1. That is. so it is in fact 4. we see that x4 = 1. (b) Consider the ﬁeld F9 = Z3 [x]/(x2 + 1).5: (a) Show that x2 + 1 is irreducible in Z3 [x]. Since x2 + 1 evaluates to 1. α3i. Problem 6. . + x15 = (x + 1)−1 x15 − 1 . where α is a primitive element) of F9 . Since 4 < 8. corresponding to the message 11001.1 that it cannot have a linear factor. 7}. (g) How many errors can C correct? 3 (h) Suppose the vector 011 111 101 110 111 is received. we know that x is not a primitive element of this ﬁeld. The transmitted codeword was thus 111 111 111 111 111. Since the degree of x2 + 1 is 2. 7. Without computing syndromes. 2 (c) List all cyclotomic cosets (the distinct exponents in the sets {αi . 3}. . From part (f) we know that (x + 1)−1 x15 − 1 = m1 (x)m3 (x)m5 (x)m7 (x) = m7 (x)g(x). we know from Theorem 5. and 11. {5. 6}. 2 of Z3 . it must be irreducible. . . {1. . {4}. Since α15 = 1 in F16 .63 we know that x15 − 1 = (x + 1)m1 (x)m3 (x)m5 (x)m7 (x) = (x + 1)m7 (x)g(x). Show that the element x is not a primitive element of this ﬁeld. but we can correct three errors. Thus. . 2 at the respective values 0. . How many errors have occurred? Hint: Look for an obvious nearby codeword. {0}. the vector consisting of 15 ones is a codeword.}. (i) Determine the original transmitted message polynomial and the corresponding message. we see that 1 + α + α2 + α3 + . The original message was 1 + x + x2 + x3 + . . 2.

S4 show that the received polynomial in part(d) contains exactly one error. m2 (x) (degree 2). n − k = 5. S2 = v(α2 ) = α6 + α8 + 2α10 + α12 = x + 1 + 2(2x) + 2 = 2x = α2 . 2 Since S1 = 0 but S1 S3 − S2 = αα3 − (α2 )2 = 0. 3 S1 = v(α) = α3 + α4 + 2α5 + α6 = (2x + 1) + 2 + 2(2x + 2) + x = x + 1 = α. with primitive element x + 1. (e) Using the value of the syndrome S1 and the syndrome equation S1 S2 S2 S3 b2 b1 =− S3 . Remember that the polynomial coeﬃcients come from Z3 . Since we want α. . That is. S2 . and S3 for the received polynomial v(x) = x + x4 + 2x5 + x6 . and m4 (degree 1). α2 . (d) Compute the syndromes S1 .6: In the ﬁeld Z3 [x]/(x2 + 1). 3 S3 = v(α3 ) = S1 = α3 . (b) What is the dimension k of C? k = n − 5 = 3. (c) How many codewords are there in C? There are 3k = 27 codewords in C. α3 . consider the BCH code C of length n = 8 and design distance 5. we know that exactly one error has occurred. and α4 to be roots of g(x). (a) Find the degrees of the minimal polynomials used to construct the generator polynomial g of C and the degree of g itself. we know that g must be the product of m1 (x) (degree 2).64 CHAPTER 6. BCH CODES (d) Establish that α = x + 1 is a primitive element of the ﬁeld Z3 [x]/(x2 + 1) by completing the following table. Element α0 α1 α2 α3 α4 α5 α6 α7 Polynomial 1 x+1 2x 2x + 1 2 2x + 2 x x+2 Order 1 8 4 8 2 8 4 8 Minimal Polynomial x+2 x2 + x + 2 x2 + 1 x2 + x + 2 x+1 x2 + 2x + 2 x2 + 1 x2 + 2x + 2 Problem 6. Indicate the order and minimal polynomial of each element. Hence g must have degree 2 + 2 + 1 = 5.
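The power table for α = x + 1 in F_9 = Z_3[x]/(x^2 + 1) completed above can be generated directly. A minimal sketch (not from the notes), representing a + bx as the pair (a, b) and using x^2 = −1:

```python
# Multiplication in F_9 = Z_3[x]/(x^2 + 1); elements are pairs (a0, a1)
# meaning a0 + a1*x, with x^2 replaced by -1 = 2.
def mul(u, v):
    a0, a1 = u
    b0, b1 = v
    # (a0 + a1 x)(b0 + b1 x) = (a0 b0 - a1 b1) + (a0 b1 + a1 b0) x
    return ((a0 * b0 - a1 * b1) % 3, (a0 * b1 + a1 * b0) % 3)

alpha = (1, 1)                  # alpha = x + 1
table, a = [], (1, 0)
for _ in range(8):
    table.append(a)
    a = mul(a, alpha)

print(table)
# [(1,0), (1,1), (0,2), (1,2), (2,0), (2,2), (0,1), (2,1)]:
# 1, x+1, 2x, 2x+1, 2, 2x+2, x, x+2, exactly the table above, so alpha
# has order 8 and is primitive.
```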

65 (f) What was the transmitted polynomial in part (d)? On solving S1 = α = qe. {5. {7. show that 100 100 100 100 100 is a codeword of C. Since the sum of these degrees is 4 + 4 = 8. deg g = n − k = 15 − 7 = 8. 10}. (c) Without computing syndromes. 4. α3 . So an error of 1 has appeared in the coeﬃcient of x. On subtracting this error. S2 = α2 = qe2 . 14. α2 . not α5 ) are roots of c(x). The generator g is constructed to have roots at α. Thus c(x) is in the code generated by g(x). 8}. 5] primitive BCH code C over F16 . From the cyclotomic cosets {0}.7: Consider the [15. we see that g cannot contain any additional roots. and α4 (but incidentally.5. 2. Problem 6. 7. That is. 2. we see that it must include the minimal polynomials of m1 (x) (degree 4) and m3 (x) (degree 4). . α2 . 3. we obtain the transmitted polynomial 2x + x3 + x4 + 2x5 + x6 . α. 12. {1. x3 )−1 We know that the geometric series c(x) = 1 + x3 + x6 + x9 + x12 has the value (1 + 1 + (x3 )5 for x3 = 1. Then for i = 1. 4. 6. c(αi ) = 1 + α3i = 1 + α3i −1 −1 1 + α3i 5 = 1 + α3i −1 −1 1 + α15 i (1 + 1i ) = 1 + α3i 0=0 since α3i = 1. we see that e = α and q = 1. (a) Find the degree of the generator polynomial g for C. Let α be a primitive element of F16 . (b) Find the degrees of the minimal polynomials used to construct g. {3. 11}. 13. c(x) is a multiple of the minimal polynomial of each of these elements. By Theorem A. 9}. α3 . and therefore. and α4 . of g(x) itself.

respectively. Ee. and their associated encrypting function Ee and Dd . but the decryption key d is kept secret and is (hopefully) known only to the receiver. In cryptography. or ciphers. 7. radio transmission or the internet). where essentially the same key is used both to encrypt and decrypt a message (precisely. Deﬁnition: Let K be a set of cryptographic keys. and secure to eavesdropping and/or tampering. fall into one of two broad classes: symmetric-key cryptosystems. the sender uses a key to encrypt a message before it is sent through an insecure channel (such as a telephone line. Most cryptosystems. Dd : e ∈ K} of encrypting and decrypting keys. cryptographic codes are designed to increase their security. n − 1} in the message as c = Ee (m) = m + e (mod n). Often. data compression is performed ﬁrst. . Typically. the resulting compressed data is then encrypted and ﬁnally encoded with an error-correcting scheme before being transmitted through the channel. . . An authorized receiver at the other end then uses a key to decrypt the received data to a message. Shift ciphers encode each symbol m ∈ {0. . 1. data compression algorithms and error-correcting codes are used in tandem with cryptographic codes to yield communications that are both eﬃcient. where d can be easily determined whenever e is known) and public-key cryptosystems. where the encryption key e is made publicly available. which are designed only to increase the reliability of data communications.A Symmetric-Key Cryptography One of the simplest cryptographic system is the shift cipher employed by Julius Caeser. A cryptosystem is a set {e. robust to data transmission errors. 66 .Chapter 7 Cryptographic Codes In contrast to error-correcting codes. d. e and d.

where e ∈ N. Caesar adopted the value e = 3 to encrypt the n = 26 symbols of the Roman alphabet, using 0 to represent the letter A and 25 to represent the letter Z. Decoding is accomplished via the inverse transformation

m = Dd(c) = c + d (mod n),

where d = −e. That is, encoding is accomplished by addition of e modulo n and decoding by subtraction of e modulo n. Some fans of the film "2001: A Space Odyssey" even suggest that the computer name HAL is really a shift cipher for IBM, with e = 25!

A slight generalization of the shift cipher is the affine cipher, defined by

c = Ea,b(m) = am + b (mod n),

where a ∈ N is relatively prime to n. This condition guarantees the existence of an inverse transformation,

m = Da,b(c) = a^(-1)(c − b) (mod n).

Both shift and affine ciphers are very insecure, since they are easily decoded simply by trying all possible values of a and b! They are both special cases of simple substitution ciphers or monoalphabetic substitution ciphers, which permute the alphabet symbols in a prescribed manner. Simple substitution ciphers can be cryptanalyzed (decoded by an unauthorized third party) by frequency analysis, in which the encrypted symbol frequencies are compared to those of the original message language to determine the applied permutation. Digraph ciphers map pairs of letters in the message text to encrypted pairs of letters. Block substitution ciphers or polyalphabetic substitution ciphers divide the message into blocks of length r and apply different permutations to the symbols in individual positions of the block. Given enough encrypted text, block substitution ciphers are also easily cryptanalyzed once the block size r is determined, simply by doing a frequency analysis on the letters in each fixed position of all blocks.

A general example of this is the linear block or Hill cipher, which uses an r × r invertible matrix e to encrypt an entire block m of r message symbols:

c = Ee(m) = em (mod n),    m = De(c) = e^(-1)c (mod n).

The existence of e^(-1) requires that det e have an inverse in Zn, which happens only when gcd(det e, n) = 1.
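These formulas are easy to experiment with. The sketch below (the helper names are ours, not from the notes) implements the shift and affine maps for n = 26, with A–Z mapped to 0–25 as above; Python's pow(a, -1, n) supplies the modular inverse a^(-1).

```python
def shift_encrypt(m, e, n=26):
    """Shift cipher: c = m + e (mod n)."""
    return (m + e) % n

def shift_decrypt(c, e, n=26):
    """Decryption uses d = -e: m = c - e (mod n)."""
    return (c - e) % n

def affine_encrypt(m, a, b, n=26):
    """Affine cipher: c = a*m + b (mod n), where gcd(a, n) = 1."""
    return (a * m + b) % n

def affine_decrypt(c, a, b, n=26):
    """m = a^(-1) (c - b) (mod n); pow(a, -1, n) computes the inverse."""
    return pow(a, -1, n) * (c - b) % n

def encode_word(word, f):
    """Apply a symbol map f to each letter of an A-Z word."""
    return "".join(chr(f(ord(ch) - ord("A")) + ord("A")) for ch in word)
```

For example, encode_word("IBM", lambda m: shift_encrypt(m, 25)) returns "HAL", matching the film anecdote above.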

• Choose r = 2, n = 26, and

e = | 2  1 |
    | 3  4 |.

We see that det e = 5 has no common factors with 26. To find the inverse of 5 in Z26 we use the Euclidean division algorithm to solve 1 = 5x + 26y: since 26 = 5·5 + 1 ⇒ 1 = 26 − 5·5, we see that x = −5 is a solution. Thus

e^(-1) = -5 |  4  -1 |  =  |  6   5 |
            | -3   2 |     | 15  16 |.

We can use e to encrypt the word "SECRET", in other words the message 18 4 2 17 4 19, by breaking the message up into vectors of length two: [18 4], [2 17], [4 19], and then multiplying the transpose of each vector by e on the left. The result is [14 18], [21 22], [1 10], or the cipher text "OSVWBK". Note that the two letters "E" are not mapped to the same symbol in the ciphertext. For this reason the Hill cipher is less susceptible to frequency analysis (particularly for large block sizes; however, the number of entries in the key matrix then becomes unreasonably large). In fact, all linear or affine block methods are subject to cryptanalysis using linear algebra, once r or r + 1 plaintext–ciphertext pairs are known.

Problem 7.1: Verify that the original message "SECRET" is recovered when "OSVWBK" is decoded with the matrix e^(-1).

A special case of the block cipher is the permutation cipher, in which the order of the characters in every block of text is rearranged in a prescribed manner. Permutation ciphers can be detected by frequency analysis since they preserve the frequency distribution of each symbol.

Feistel ciphers are block ciphers over F2^2t. One divides a plaintext block message of 2t bits into two halves, L0 and R0, and then performs the iteration

Li = Ri−1,
Ri = Li−1 + f(Ri−1, ei),    i = 1, . . . , r,

where e = (e1, . . . , er) denotes a set of r encryption keys and f is a specific nonlinear cipher function. The encryption function is Ee(L0 | R0) = Rr | Lr, where | denotes concatenation. The decryption function De(Lr | Rr) = R0 | L0 is implemented by applying the inverse iteration

Li−1 = Ri + f(Li, ei),
Ri−1 = Li,    i = r, . . . , 1.

With Feistel ciphers, the encryption and decryption algorithms are essentially the same, except that the key sequence is reversed.

A widely used commercial symmetric-key cryptosystem is the Data Encryption Standard (DES), which is a type of Feistel cipher endorsed in 1977 by the United States National Bureau of Standards for the protection of confidential (but unclassified) government data. In 1981, DES was approved also for use in the private sector.
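Problem 7.1 can also be checked mechanically. Here is a minimal sketch of the Hill encryption map (the function name is ours); with the matrices e and e^(-1) above it reproduces the "SECRET" ↔ "OSVWBK" correspondence, since decoding is just encryption with e^(-1).

```python
def hill_encrypt(msg, key, n=26):
    """Hill cipher: multiply each r-symbol block, viewed as a column
    vector, by the r x r key matrix modulo n (A-Z mapped to 0-25)."""
    r = len(key)
    nums = [ord(ch) - ord("A") for ch in msg]
    out = []
    for i in range(0, len(nums), r):
        block = nums[i:i + r]
        out += [sum(key[row][col] * block[col] for col in range(r)) % n
                for row in range(r)]
    return "".join(chr(v + ord("A")) for v in out)

e = [[2, 1], [3, 4]]         # det e = 5, invertible mod 26
e_inv = [[6, 5], [15, 16]]   # e^(-1) = -5 [[4, -1], [-3, 2]] mod 26
```

Then hill_encrypt("SECRET", e) gives "OSVWBK", and hill_encrypt("OSVWBK", e_inv) recovers "SECRET", settling Problem 7.1.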

The DES cipher uses the half width t = 32 and r = 16 rounds of encryption. All 16 keys are generated from a bit sequence of length 64 that is divided into 8 bytes. The eighth bit of each byte is an (odd) parity check, so in fact there are only 56 information bits in the key. Hence the number of keys that need to be searched in a brute-force attack on DES is 2^56. In fact, because of an inherent ones-complement symmetry, only 2^55 keys need to be checked. On a modern supercomputer this is quite feasible; moreover, DES has recently been shown to be susceptible to relatively new cryptanalytic techniques, such as differential cryptanalysis and linear cryptanalysis. Somewhat more secure variants of DES (such as Triple-DES, where DES is applied three times in succession with three different keys) were developed as interim solutions. One common application of DES that persists today is its use in encrypting computer passwords, using the password itself as a key. This is why many computer passwords are still restricted to a length of eight characters (64 bits).

In October 2000, the Rijndael Cryptosystem was adopted by the National Bureau of Standards as the Advanced Encryption Standard (AES) to replace DES. It is based on a combination of byte substitutions, shifts, and key additions, along with a diffusion-enhancing technique based on cyclic coding theory, where the data values are multiplied by the polynomial 3x^3 + x^2 + x + 2 in the polynomial ring F256[x]/(x^4 − 1).

7.B Public-Key Cryptography

A principal difficulty with symmetric-key cryptosystems is the problem of key distribution and management. Somehow, both parties, which may be quite distant from each other, have to securely exchange their secret keys before they can begin communication.

One technique for avoiding the problem of key exchange makes use of two secure envelopes, or locks, which each party alternately applies and later removes from the sensitive data. The required three transmissions makes this method awkward to use in practice, as the data makes a total of three transits between sender and receiver. In public-key cryptosystems, key exchange is avoided altogether by making copies of the receiver's encrypting key (lock) available to anyone who wants to communicate with him.

Both the secure envelope technique and the public-key technique require that the encrypting key e be designed so that the decrypting key d is extremely difficult to determine from knowledge of e. They also require authentication of the lock itself, to guard against so-called man-in-the-middle attacks.

7.B.1 RSA Cryptosystem

The most well known public-key cipher is the Rivest–Shamir–Adleman (RSA) Cryptosystem. First, the receiver forms the product n of two distinct large prime numbers p

and q chosen at random, but such that p and q cannot be easily determined from n. (For example, if p and q are close enough that (p + q)^2 − 4n = (p + q)^2 − 4pq = (p − q)^2 is small, then the sum p + q could be determined by searching for a small value of p − q such that (p − q)^2 + 4n is a perfect square, whose square root must be p + q; knowledge of p − q and p + q is sufficient to determine both p and q.) The receiver then selects a random integer e between 1 and ϕ(n) = (p − 1)(q − 1) that is relatively prime to ϕ(n) and, using the Euclidean division algorithm, computes d = e^(-1) in Zϕ(n) (why does e^(-1) exist?). The numbers n and e are made publicly available, but d, p, q are kept secret. Anyone who wishes to send a message m, where 0 ≤ m < n, to the receiver encrypts the message using the encoding function

c = Ee(m) = m^e (mod n)

and transmits c. Because the receiver has knowledge of d, the receiver can decrypt c using the decoding function

M = De(c) = c^d (mod n).

To show that M = m, we will need the following results.

Theorem 7.1 (Modified Fermat's Little Theorem): If s is prime and a and m are natural numbers, then m[m^(a(s−1)) − 1] = 0 (mod s).

Proof: If m is a multiple of s we are done. Otherwise, we know that m^a is not a multiple of s, so Fermat's little theorem (which follows from applying Theorem A.4 to the field Zs) implies that (m^a)^(s−1) = 1 (mod s), from which the result follows.

Corollary 7.1.1 (RSA Inversion): The RSA decoding function De is the inverse of the RSA encoding function Ee.

Proof: By construction ed = 1 + kϕ(n) for some integer k, so

M = De(c) = c^d = (m^e)^d = m^(ed) = m^(1+kϕ(n)) = m^(1+k(p−1)(q−1)) (mod n).

We first apply Theorem 7.1 with a = k(q − 1), s = p, and then with a = k(p − 1), s = q, to deduce that m[m^(k(q−1)(p−1)) − 1] is a multiple of both of the distinct primes p and q; that is, m[m^(k(q−1)(p−1)) − 1] = 0 (mod pq). Thus

M = m·m^(k(q−1)(p−1)) = m (mod pq) = m (mod n).

• Let us encode the message "SECRET" (18 4 2 17 4 19) using the RSA scheme with a block size of 1. The receiver chooses p = 5 and q = 11, so that n = pq = 55 and

ϕ(n) = 40. He then selects e = 17 and finds d = e^(-1) in Z40. This amounts to finding gcd(17, 40):

40 = 2·17 + 6,    17 = 2·6 + 5,    6 = 1·5 + 1,

from which we see that

1 = 6 − 5 = 6 − (17 − 2·6) = 3·6 − 17 = 3·(40 − 2·17) − 17 = 3·40 − 7·17,

so that 17d = 40k + 1 for some k ∈ N, with d = −7 (mod 40) = 33. The receiver publishes the numbers n = 55 and e = 17, but keeps the factors p = 5, q = 11, and d = 33 (and ϕ(n)) secret. The sender then encodes 18 4 2 17 4 19 as

18^17 4^17 2^17 17^17 4^17 19^17 (mod 55) = 28 49 7 52 49 24.

The two Es are encoded in exactly the same way, since the block size is 1: obviously, a larger block size should be used to thwart frequency analysis attacks. (For example, we could encode pairs of letters i and j as 26i + j and choose n ≥ 26^2 = 676, although such a limited block size would still be vulnerable to more time-consuming but feasible digraph frequency attacks.) The receiver would then decode the received message 28 49 7 52 49 24 as

28^33 49^33 7^33 52^33 49^33 24^33 (mod 55) = 18 4 2 17 4 19.

Remark: While the required exponentiations can be performed by repeated squaring and multiplication (e.g. x^33 = x^32 · x), RSA decryption can be implemented in a more efficient manner. Instead of computing m = c^d directly, we first find a = c^d (mod p) and b = c^d (mod q). This is very easy since Fermat's little theorem says that c^(p−1) = 1 (mod p), so these definitions reduce to

a = c^(d mod (p−1)) (mod p),    b = c^(d mod (q−1)) (mod q).

The Chinese remainder theorem then guarantees that the system of linear congruences

m = a (mod p),    m = b (mod q)

has exactly one solution in {0, 1, . . . , n − 1}. One can find this solution by using the Euclidean division algorithm to construct integers x and y such that 1 = xp + yq. Since yq = 1 (mod p) and xp = 1 (mod q), we see that m = ayq + bxp (mod n) is the desired solution. Since the numbers x and y are independent of the ciphertext, the factors xp and yq can be precomputed. This is important, since to make computing the secret key d (from knowledge of n and e alone) difficult, d must be chosen to be about as large as n.
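The CRT shortcut described in the Remark can be sketched as follows (the function name is ours); with the toy parameters p = 5, q = 11, e = 17, d = 33 of the example it reproduces the worked encoding and decoding.

```python
def rsa_crt_decrypt(c, p, q, d):
    """Decrypt c = m^e (mod pq) the fast way: reduce the exponent d
    modulo p - 1 and q - 1 (Fermat's little theorem), then recombine
    a and b with the Chinese remainder theorem via 1 = xp + yq."""
    a = pow(c, d % (p - 1), p)
    b = pow(c, d % (q - 1), q)
    x = pow(p, -1, q)            # xp = 1 (mod q)
    y = pow(q, -1, p)            # yq = 1 (mod p)
    return (a * y * q + b * x * p) % (p * q)

# The worked example: p = 5, q = 11, n = 55, e = 17, d = 33.
p, q, e, d = 5, 11, 17, 33
cipher = [pow(m, e, p * q) for m in [18, 4, 2, 17, 4, 19]]   # "SECRET"
plain = [rsa_crt_decrypt(c, p, q, d) for c in cipher]
```

Here cipher comes out as 28 49 7 52 49 24 and plain recovers 18 4 2 17 4 19, matching the text.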

• To set up an efficient decoding scheme for p = 5 and q = 11, we precompute x and y such that 1 = px + qy. Since 11 = 2·5 + 1, we see that x = −2 and y = 1 are solutions, so that xp = −10 and yq = 11. Once a and b are determined from the ciphertext, we can quickly compute m = 11a − 10b (mod 55). For example, to compute 28^33 we evaluate

a = 28^(33 mod 4) (mod 5) = 28 (mod 5) = 3,
b = 28^(33 mod 10) (mod 11) = 28^3 (mod 11) = 6^3 (mod 11) = 7,

and then compute m = 11a − 10b = (33 − 70) (mod 55) = 18.

Remark: Determining d from e and n can be shown to be equivalent to determining the prime factors p and q of n. Since factoring large integers in general is an extremely difficult problem, the belief by many that RSA is a secure cryptographic system rests on this equivalence. However, it has not been ruled out that some other technique for decrypting RSA ciphertext exists; if such a technique exists, presumably it does not involve direct knowledge of d (as that would constitute an efficient algorithm for factorizing integers!).

Problem 7.2: You have intercepted an encrypted transmission that was intended for your math professor. It consists of the three numbers 26, 14, and 19. Each of these numbers was separately encoded, using RSA encoding and a block size of 1, where the letters A to Z are mapped to Z26 in the usual way. On your professor's web page you notice that he lists his public key as (n = 33, e = 13).

(a) Without breaking into your professor's computer, determine the "secret" prime factors p and q of n.

(b) Determine the decryption exponent d, knowing e, p, and q.

(c) Use d to decrypt the secret message. Interpret your result as a three-letter English word.

First we find p = 3 and q = 11, so that ϕ(n) = (p − 1)(q − 1) = 20. Since 20 = 1·13 + 7, 13 = 1·7 + 6, and 7 = 1·6 + 1, we see that

1 = 7 − 6 = 7 − (13 − 7) = 2·7 − 13 = 2·(20 − 13) − 13 = 2·20 − 3·13.

So d = −3 (mod 20) = 17. Since 11 = 3·3 + 2 and 3 = 1·2 + 1, we see that 1 = 3 − 2 = 3 − (11 − 3·3) = 4·3 − 11. Hence 1 = xp + yq, where x = 4 and y = −1, and we can decode the message c as m = 12b − 11a (mod 33), where a = c^17 (mod 3) = c^1 (mod 3) and b = c^17 (mod 11) = c^7 (mod 11).

For c = 26, we find a = 26^1 (mod 3) = 2^1 (mod 3) = 2 and b = 26^7 (mod 11) = 4^7 (mod 11) = 2^14 (mod 11) = 2^4 (mod 11) = 5. This yields m = 12(5) − 11(2) = 38 (mod 33) = 5.

For c = 14, we find a = 14^1 (mod 3) = 2 and b = 14^7 (mod 11) = 3^7 (mod 11) = 3·27^2 (mod 11) = 3·5^2 (mod 11) = 3·3 (mod 11) = 9. This yields m = 12(9) − 11(2) (mod 33) = 20.

For c = 19, we find a = 19^1 (mod 3) = 1 and b = 19^7 (mod 11) = 8^7 (mod 11) = 2^21 (mod 11) = 2^1 (mod 11) = 2. This yields m = 12(2) − 11(1) = 13.

Thus, the sent message was 5, 20, 13, which spells FUN!

Problem 7.3: You have intercepted an encrypted transmission from the Prime Minister's Office to the Leader of the Official Opposition. It consists of the two numbers 27 and 14, which were separately encoded, using RSA encoding and a block size of 1, where the letters A to Z are mapped to Z26 in the usual way. On the Leader of the Official Opposition's web page you notice that he lists his public key as (n = 35, e = 11).

(a) Determine the "secret" prime factors p and q of n.

(b) Determine the decryption exponent d, knowing e, p, and q.

(c) Use d to decrypt the secret message. Interpret your result as a two-letter English word.

First we find p = 5 and q = 7, so that ϕ(n) = (p − 1)(q − 1) = 24. Since 24 = 2·11 + 2 and 11 = 5·2 + 1, we see that 1 = 11 − 5·2 = 11 − 5·(24 − 2·11) = 11·11 − 5·24. So d = 11. Since 7 = 1·5 + 2 and 5 = 2·2 + 1, we see that 1 = 5 − 2·2 = 5 − 2·(7 − 1·5) = 3·5 − 2·7. Hence 1 = xp + yq, where x = 3 and y = −2, and we can decode the message c as m = 15b − 14a (mod 35), where a = c^11 (mod 5) = c^3 (mod 5) and b = c^11 (mod 7) = c^5 (mod 7).

For c = 27, we find a = 27^3 (mod 5) = 2^3 (mod 5) = 3 and b = 27^5 (mod 7) = (−1)^5 (mod 7) = −1 (mod 7) = 6. This yields m = 15(6) − 14(3) = 48 (mod 35) = 13.

For c = 14, we find a = 14^3 (mod 5) = (−1)^3 (mod 5) = 4 and b = 14^5 (mod 7) = 0. This yields m = −14(4) = −56 (mod 35) = 14.

Thus, the sent message was 13, 14, which spells NO!

7.B.2 Rabin Public-Key Cryptosystem

In contrast to the RSA scheme, the Rabin Public-Key Cryptosystem has been proven to be as secure as factorizing large integers is difficult. Again the receiver forms the product n = pq of two large distinct primes p and q that are kept secret. To make decoding efficient, p and q are normally chosen to be both congruent to 3 (mod 4). The sender encodes the message m ∈ {0, 1, . . . , n − 1} as

c = Ee(m) = m^2 (mod n).

To decode the message, the receiver must be able to compute square roots modulo n. First one notes from Lemma 5.1 that the equation 0 = x^2 − c has at most two solutions in Zp. In fact, these solutions are given by ±a, where a = c^((p+1)/4) (mod p):

(±a)^2 = c^((p+1)/2) (mod p) = c·c^((p−1)/2) (mod p) = c·m^(p−1) (mod p) = c (mod p).

Similarly, the two square roots of c in Zq are ±b, where b = c^((q+1)/4) (mod q). Consequently, by the Chinese remainder theorem, the linear congruences

M = ±a (mod p),    M = ±b (mod q)

yield four solutions,

M = ±(ayq ± bxp) (mod n),

one of which is the original message m. This can be efficiently accomplished in terms of integers x and y satisfying 1 = xp + yq.

7.C Discrete Logarithm Schemes

The RSA system is based on the trapdoor property that multiplying two large prime numbers is much easier than factoring a large composite number into two constituent primes. Another example of such a one-way function is the fact that computing a high power of an element within a group is much easier than the reverse process of determining the required exponent.

7.C.1 Diffie–Hellman Key Exchange

Definition: Let G be a finite group and b ∈ G. Suppose y ∈ G can be obtained as the power b^x in G. The number x is called the discrete logarithm of y to the base b in G.

Remark: The Diffie–Hellman assumption conjectures that it is computationally infeasible to compute α^(ab) knowing only α^a and α^b. It is assumed that this would require first determining one of the powers a or b (in which case it is as difficult as computing a discrete logarithm).
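As a concrete illustration of the one-way property exploited here, the following sketch performs a Diffie–Hellman exchange over Zp; the parameters are toy values assumed for illustration only (a real exchange requires a very large prime).

```python
import secrets

def diffie_hellman_demo(p, alpha):
    """One Diffie-Hellman exchange over Z_p: each party publishes
    alpha^secret, and both arrive at the shared key alpha^(a*b)."""
    a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
    b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent
    A = pow(alpha, a, p)               # sent by Alice
    B = pow(alpha, b, p)               # sent by Bob
    key_alice = pow(B, a, p)           # Alice computes (alpha^b)^a
    key_bob = pow(A, b, p)             # Bob computes (alpha^a)^b
    assert key_alice == key_bob        # both equal alpha^(a*b) mod p
    return key_alice

# Toy modulus only; security demands a very large prime p.
shared = diffie_hellman_demo(10007, 5)
```

An eavesdropper sees only p, alpha, A and B; recovering the shared key from these is the discrete logarithm problem.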

Let Fq be a publicly agreed upon field with q elements, with primitive element α. If Alice and Bob want to agree on a common secret key, they each secretly choose a random integer between 1 and q − 1, which they call a and b, respectively. Alice computes and sends α^a to Bob; likewise, Bob computes and sends α^b to Alice. Their common secret key is then α^(ab). If the Diffie–Hellman assumption is correct, a third party will be unable to determine α^(ab) from knowledge of the public keys α^a and α^b alone.

In the ElGamal Cryptosystem, Alice sends a message m to Bob as the pair of elements (α^k, mα^(bk)), where k is a randomly chosen integer. Bob can then determine m by dividing mα^(bk) by (α^k)^b.

7.C.2 Okamoto Authentication Scheme

One of the more difficult issues in public key cryptography is that of authentication, that is, verifying the identity of a sender or receiver. For example, Alice should not blindly assume that Bob's posted public key is really his, without independent confirmation via a trusted source. And Bob, when he decrypts a message from Alice, should not naively believe that it was really Alice who used his public encoding key to send the message to him in the first place.

A trusted authority (perhaps the government?) might issue Alice an electronic certificate based on more conventional forms of identification (such as a passport, driver's license, or birth certificate), which Alice would like to then use to authenticate herself while communicating with Bob. However, Alice can't simply send Bob her secret identification code: even if she encrypted it with Bob's public key, there would always be the risk that Bob might later use Alice's identification to impersonate her! Obviously, part of Alice's identification must be kept secret.

One practical scheme for doing this is the Okamoto authentication scheme, a variation of an earlier scheme by Schnorr. A trusted certifying authority chooses a field Zp, where the prime p is chosen such that p − 1 has a large prime factor q. The authority also chooses two distinct elements of order q in Zp, say g1 and g2. To request a certificate, Alice chooses two secret random exponents a1 and a2 in Zq and sends the number

s = g1^(−a1) g2^(−a2) (mod p)

to the trusted authority, who uses a secret algorithm to generate an identifying certificate C(s) for Alice. She can then authenticate herself to Bob with the following procedure. When Alice communicates with Bob, she identifies herself by sending him both her signature s and certificate C(s). Bob first checks with the trusted authority to confirm that s and C(s) really belong to Alice. But since s and C(s) aren't secret, this doesn't yet establish that Alice was really the one who sent them to Bob. So Bob challenges the person claiming to be Alice to prove that she really owns the signature s, by first sending her a random number r ∈ Zq.

Alice now tries to prove to Bob that she owns the underlying private keys a1 and a2, without actually revealing them to him, by choosing random numbers k1 and k2 in Zq and sending him the three numbers

γ = g1^(k1) g2^(k2) (mod p),
y1 = k1 + a1·r (mod q),
y2 = k2 + a2·r (mod q).

Bob uses the fact that g1^q = g2^q = 1 (mod p) and the value of s to verify in Zp that

g1^(y1) g2^(y2) s^r = g1^(k1+a1·r) g2^(k2+a2·r) s^r = g1^(k1) g2^(k2) = γ (mod p).

The agreement of the numbers g1^(y1) g2^(y2) s^r and γ in Zp begins to convince Bob that maybe he really is talking to Alice after all. But the question is, is it possible that a third party, say Charlie, is impersonating Alice, having somehow devised a clever algorithm to determine numbers y1 and y2 that fool Bob into thinking the sender is Alice? We now show that impersonation under the Okamoto authentication scheme is as difficult as solving the discrete logarithm problem, which is computationally infeasible when q is sufficiently large.

Suppose Charlie has managed to find a way to convince Bob that he is Alice, without knowing Alice's secret key. That is, no matter what challenge r′ Bob sends him, he has figured out a clever way to generate two exponents y1′ and y2′ such that

g1^(y1′) g2^(y2′) s^(r′) = γ (mod p).

Suppose he uses his algorithm again, for a distinct challenge r′′ ≠ r′, to determine exponents y1′′ and y2′′ such that

g1^(y1′′) g2^(y2′′) s^(r′′) = γ (mod p).

Then

g1^(y1′−y1′′) = g2^(y2′′−y2′) s^(r′′−r′) (mod p),

and so

s = g1^((r′′−r′)^(−1)(y1′−y1′′)) g2^((r′′−r′)^(−1)(y2′−y2′′)) (mod p).

Charlie has thus determined two exponents a1′ and a2′ such that

s = g1^(−a1′) g2^(−a2′) (mod p).

It is not hard to show, for virtually all possible challenges r′ and r′′, that the resulting pair of exponents (a1′, a2′) will be distinct from Alice's exponents (a1, a2) (for example, see Stinson [1995]). Without loss of generality, we can relabel these exponents so that a2′ ≠ a2.

Charlie's scheme would then constitute a computationally feasible algorithm for computing log_g1 g2 in Zp. Together with Alice's original construction

s = g1^(−a1) g2^(−a2) (mod p),

we would know, on using the fact that a2′ ≠ a2, that

g1^(a1−a1′) = g2^(a2′−a2) (mod p),

so that

log_g1 g2 = (a2′ − a2)^(−1)(a1 − a1′).

Note that since g1 and g2 were chosen by the trusted authority, neither Alice nor Charlie had any prior knowledge of the value log_g1 g2.

7.C.3 Digital Signature Standard

Another situation that arises frequently is where Bob hasn't yet gone to the trouble of creating a public key, but Alice would nevertheless like to send him an electronically signed document. Her electronic signature should guarantee not only that she is the sender, but also that the document hasn't been tampered with during transmission. One means for doing this is the Digital Signature Standard (DSS), proposed in 1991 by the U.S. government National Institute of Standards and Technology as a standard for electronically signing documents.

In the DSS, a large prime p is chosen such that p − 1 has a large prime factor q, and an element g of order q is chosen from Zp. Alice chooses as her private key a random integer a in Zq; her public key is A = g^a.

Alice first applies to her document x a hash function f, which is essentially a function that maps a long string of characters to a much shorter one, such that it is computationally infeasible to find another document x′ with f(x′) = f(x) (even though such an x′ will likely exist). This makes it virtually impossible to tamper with the document without altering the hash value. The short string of characters given by the hash is converted to an integer h in Zq.

Alice now chooses a random number k in Zq and finds an integer s such that

s·k = h + a·g^k (mod q).

She signs her document with the pair (g^k, s), where g^k ∈ Zp. Bob can verify the authenticity and integrity of the document he receives by first calculating its hash h. He then uses Alice's public key A to check that

g^(s^(−1)h) A^(s^(−1)g^k) = g^(s^(−1)(h+a·g^k)) = g^k (mod p).

If this is indeed the case, then Bob is convinced that the contents and signature are genuine.
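The signing and verification equations above can be sketched directly; the function names and the toy parameters p = 23, q = 11, g = 4 below are ours, chosen only for illustration (real DSS parameters are enormous).

```python
def dss_sign(h, a, g, p, q, k):
    """Sign hash h in Z_q with private key a: solve s*k = h + a*g^k
    (mod q); the signature is the pair (g^k mod p, s)."""
    gk = pow(g, k, p)
    s = pow(k, -1, q) * (h + a * gk) % q
    return gk, s

def dss_verify(h, sig, A, g, p, q):
    """Bob's check: g^(s^-1 h) * A^(s^-1 g^k) = g^k (mod p)."""
    gk, s = sig
    s_inv = pow(s, -1, q)
    return pow(g, s_inv * h % q, p) * pow(A, s_inv * gk % q, p) % p == gk

# Toy parameters: q = 11 divides p - 1 = 22, and g = 4 has order q in Z_23.
p, q, g = 23, 11, 4
a = 3                    # Alice's private key
A = pow(g, a, p)         # her public key A = g^a
signature = dss_sign(5, a, g, p, q, k=7)
```

Here dss_verify(5, signature, A, g, p, q) succeeds, while any altered hash value fails the check.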

7.C.4 Silver–Pohlig–Hellman Discrete Logarithm Algorithm

As it happens, a fast means for computing discrete logarithms in Fq, the Silver–Pohlig–Hellman algorithm, is known if all of the prime factors p of q − 1 are small. For this reason, care must be taken when choosing the size q of the field used in cryptographic schemes that rely on the Diffie–Hellman assumption.

To find x such that b^x = y in Fq, noting that b^(q−1) = 1 in the field Zq, it suffices to find x mod p^(a_p) for each prime factor p of q − 1, where a_p is the number of p factors appearing in the prime factorization of q − 1. Once we have found x mod p^(a_p) for each prime factor p of q − 1, we can use the Chinese remainder theorem to find x itself.

To find x mod p^a, we attempt to compute each of the coefficients in the p-ary expansion of x mod p^a:

x = x0 + x1 p + . . . + x_(a−1) p^(a−1) mod p^a.

For each prime factor p, we first compute a table of the pth roots of unity b^(j(q−1)/p) for j = 0, 1, . . . , p − 1. To find x0, we compute the pth root of unity y^((q−1)/p). But y^((q−1)/p) = b^(x(q−1)/p) = b^(x0(q−1)/p), so in our table of pth roots, x0 is just the j value corresponding to the root y^((q−1)/p).

To find x1 we repeat the above procedure, replacing y with y1 = y b^(−x0) = b^(x−x0), which has the discrete logarithm x1 p + . . . + x_(a−1) p^(a−1). Since y1 is evidently a pth power, we see that y1^((q−1)/p) = 1 in Zq and that y1^((q−1)/p²) = b^((x−x0)(q−1)/p²) = b^(x1(q−1)/p) is the pth root of unity corresponding to j = x1. Continuing in this manner, we can compute each of the xi values for i = 0, 1, . . . , a − 1. The Chinese remainder theorem can then be used to solve the simultaneous congruence problem that determines the value of x.

7.D Cryptographic Error-Correcting Codes

We conclude with an interesting cryptographic application of error-correcting codes due to McEliece [1978]. The receiver selects a block size k and a private key consisting of an [n, k, 2t + 1] linear code C with generator matrix G, a k × k nonsingular scrambler matrix S, and an n × n random permutation matrix P. He then constructs the k × n matrix K = SGP as his public key. A sender encodes each message block m as

c = Ee(m) = mK + z,

where z is a random error vector of length n and weight no more than t. The receiver then computes

cP^(−1) = (mK + z)P^(−1) = (mSGP + z)P^(−1) = mSG + zP^(−1).

Since the weight of zP^(−1) is no more than t, he can use the code C to decode the vector mSG + zP^(−1) to the codeword mS. After multiplication on the right by S^(−1), he recovers the original message m.
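The Silver–Pohlig–Hellman procedure of Section 7.C.4 can be sketched as follows (the function name is ours); it assumes the factorization of q − 1 is supplied, and is feasible precisely when every prime factor p is small.

```python
def pohlig_hellman(b, y, q, factors):
    """Silver-Pohlig-Hellman: solve b^x = y in Z_q (q prime), given
    the factorization {p: a} of q - 1.  For each prime power p^a the
    p-ary digits of x mod p^a are read off from a table of pth roots
    of unity; the Chinese remainder theorem recombines the results."""
    n = q - 1
    residues, moduli = [], []
    for p, a in factors.items():
        roots = {pow(b, j * n // p, q): j for j in range(p)}  # b^(j(q-1)/p)
        x, yk = 0, y
        for i in range(a):
            # digit x_i satisfies yk^((q-1)/p^(i+1)) = b^(x_i (q-1)/p)
            digit = roots[pow(yk, n // p ** (i + 1), q)]
            x += digit * p ** i
            yk = yk * pow(b, n - digit * p ** i, q) % q  # strip that digit
        residues.append(x)
        moduli.append(p ** a)
    # Chinese remainder theorem: combine x mod p^a into x mod (q - 1)
    x = 0
    for r, m in zip(residues, moduli):
        M = n // m
        x = (x + r * M * pow(M, -1, m)) % n
    return x
```

For instance, 6 is a primitive element of Z41 and 40 = 2^3 · 5, so pohlig_hellman(6, pow(6, 17, 41), 41, {2: 3, 5: 1}) returns 17.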

Appendix A

Finite Fields

Theorem A.1 (Zn): The ring Zn is a field ⇐⇒ n is prime.

Proof: "⇒" Let Zn be a field. If n = ab, with 1 < a, b < n, then b = a^(−1)ab = a^(−1)n = 0 (mod n), a contradiction. Hence n must be prime.

"⇐" Let n be prime. Since Zn has a unit and is commutative, we need only verify that each element a ≠ 0 has an inverse. Consider the elements ia, for i = 1, 2, . . . , n − 1. Each of these elements must be nonzero since neither i nor a is divisible by the prime number n. These n − 1 elements are distinct from each other since, for i, j ∈ {1, 2, . . . , n − 1},

ia = ja ⇒ (i − j)a = 0 (mod n) ⇒ n|(i − j)a ⇒ n|(i − j) ⇒ i = j.

Thus, the n − 1 elements a, 2a, . . . , (n − 1)a must be equal to the n − 1 elements 1, 2, . . . , n − 1 in some order. One of them, say ia, must be equal to 1. That is, a has inverse i.

Definition: The order of a finite field F is the number of elements in F.

Theorem A.2 (Subfield Isomorphic to Zp): Every finite field has the order of a power of a prime p and contains a subfield isomorphic to Zp.

Proof: Let 1 (one) denote the (unique) multiplicative identity in F. The element 1 + 1 must be in F, so label this element 2. Similarly 2 + 1 ∈ F, which we label by 3. We continue in this manner until the first time we encounter an element k to which we have already assigned a label ℓ (F is a finite field): the sum of k ones equals the sum of ℓ ones, with k > ℓ. Hence the sum of p = k − ℓ ones must be the additive identity, 0. If p were composite, p = ab with 1 < a, b < p, then the product of the elements that we have labelled a and b would be 0, contradicting the fact that F is a field. Thus p must be prime, and the set of numbers that we have labelled {0, 1, 2, . . . , p − 1} is isomorphic to the field Zp.

Now consider all subsets {x1, . . . , xr} of linearly independent elements of F, in the sense that

a1 x1 + a2 x2 + . . . + ar xr = 0 ⇒ a1 = a2 = . . . = 0,

where ai ∈ Zp. There must be at least one such subset having a maximal number of elements. Then, if x is any element of F, the elements {x, x1, . . . , xr} cannot be linearly independent, so that x can be written as a linear combination of {x1, . . . , xr}. Thus {x1, . . . , xr} forms a basis for F, so that the elements of F may be uniquely identified by all possible values of the coefficients a1, a2, . . . , ar. Since there are p choices for each of the r coefficients, there are exactly p^r distinct elements in F.

Corollary A.1 (Isomorphism to Zp): Any field F with prime order p is isomorphic to Zp.

Proof: Theorem A.2 says that the order of F must be the power of a prime, which can only be p itself. It also says that F contains Zp. Since the order of Zp is already p, there are no other elements in F.

Theorem A.3 (Prime Power Fields): There exists a field F of order n ⇐⇒ n is a power of a prime.

Proof: "⇒" This is implied by Theorem A.2.

"⇐" Let p be prime and g be an irreducible polynomial of degree r in the polynomial ring Zp[x] (for a proof of the existence of such a polynomial, see van Lint [1991]). Recall that every polynomial can be written as a polynomial multiple of g plus a residue polynomial of degree less than r. The field Zp[x]/g,
establishes the existence of a ﬁeld with exactly pr elements. the elements {x. xr } of linearly independent elements of F .2 says that the prime p must be the power of a prime. . which can only be p itself. . in the sense that a1 x1 + a2 x2 + . . Proof: Theorem A. Then. Now consider all subsets {x1 . see van Lint [1991]). xr }. . .3 (Prime Power Fields): There exists a ﬁeld F of order n ⇐⇒ n is a power of a prime. . where ai ∈ Zp . . . 1. which contradicts B + D = 0.1 (Isomorphism to Zp ): Any ﬁeld F with prime order p is isomorphic to Zp . . ar . xr } cannot be linearly independent. Proof: “⇒” This is implied by Theorem A. . . . Theorem A. . • For example. . we can establish the irreducibility of g in Z2 [x] directly: If g(x) = (x2 + Bx + C)(x + D) = x3 + (B + D)x2 + (C + BD)x + CD. implies that g(x) cannot have a linear factor (x − c). “⇐” Let p be prime and g be an irreducible polynomial of degree r in the polynomial ring Zp [x] (for a proof of the existence of such a polynomial. . It also says that F contains Zp . . .2. . Note that g is irreducible: the fact that g(c) = c3 + c + 1 = 0 for all c ∈ Z2 . Since there are p choices for each of the r coeﬃcients. there are exactly pr distinct elements in F . Corollary A. There must be at least one such subset having a maximal number of elements. a2 .2. = 0. so that x can be written as a linear combination of {x1 . + ar xr = 0 ⇒ a1 = a2 = . x1 . .80 APPENDIX A. . . . . . . xr } forms a basis for F . there are no other elements in F . The ﬁeld Zp [x]/g. p − 1} is isomorphic to the ﬁeld Zp . Alternatively. then CD = 1 ⇒ C = D = 1 and hence C + BD = 1 ⇒ B = 0. . if x is any element of F . . . corresponding to the p possible choices for each of the r coeﬃcients of a polynomial of degree less than r. we can construct a ﬁeld with 8 = 23 elements using the polynomial g(x) = x3 + x + 1 in Z2 [x]. so that the elements of F may be uniquely identiﬁed by all possible values of the coeﬃcients a1 . 
which is just the residue class polynomial ring Zp[x] (mod g), establishes the existence of a field with exactly p^r elements, corresponding to the p possible choices for each of the r coefficients of a polynomial of degree less than r.

• For example, we can construct a field with 8 = 2^3 elements using the polynomial g(x) = x^3 + x + 1 in Z2[x]. Note that g is irreducible: the fact that g(c) = c^3 + c + 1 ≠ 0 for all c ∈ Z2 implies that g(x) cannot have a linear factor (x − c). Alternatively, we can establish the irreducibility of g in Z2[x] directly: if

g(x) = (x^2 + Bx + C)(x + D) = x^3 + (B + D)x^2 + (C + BD)x + CD,

then CD = 1 ⇒ C = D = 1, and hence C + BD = 1 ⇒ B = 0, which contradicts B + D = 0.

That is, if a and b are two polynomials in Z2[x]/g, their product can be zero (mod g) only if one of them is itself zero. Thus, Z2[x]/g is a field with exactly 2^3 = 8 elements, corresponding to the 2 possible choices for each of the 3 polynomial coefficients.

Definition: The Euler indicator or Euler totient function

ϕ(n) = |{m ∈ N : 1 ≤ m ≤ n, (m, n) = 1}|

is the number of positive integers less than or equal to n that are relatively prime to n (share no common factors with n).

• ϕ(p) = p − 1 for any prime number p.

• ϕ(p^r) = p^r − p^(r−1) for any prime number p and any r ∈ N, since the p^(r−1) multiples p, 2p, 3p, . . . , p^(r−1)·p all have a factor in common with p^r.

• Here are the first 12 values of ϕ:

x     1  2  3  4  5  6  7  8  9  10  11  12
ϕ(x)  1  1  2  2  4  2  6  4  6   4  10   4

Remark: Note that ϕ(1) + ϕ(2) + ϕ(3) + ϕ(6) = 1 + 1 + 2 + 2 = 6 and ϕ(1) + ϕ(2) + ϕ(3) + ϕ(4) + ϕ(6) + ϕ(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12. Also, ϕ(1) + ϕ(p) = 1 + (p − 1) = p for any prime p.

Remark: If we denote the set of integers in Zn that are not zero divisors by Zn*, we see that ϕ(n) = |Zn*|.

Problem A.1: The Chinese remainder theorem implies that ϕ(mn) = ϕ(m)ϕ(n) whenever (m, n) = 1. Use this result to prove for any n ∈ N that

Σ_(d|n) ϕ(d) = n.

Problem A.2: Consider the set Sn = {k/n : 1 ≤ k ≤ n}.

(a) How many distinct elements does Sn contain?

(b) If k and n have a common factor, reduce the fraction k/n to m/d, where d divides n and (m, d) = 1, with 1 ≤ m ≤ d. For each d, express the number of possible values of m in terms of the Euler ϕ function.

(c) Obtain an alternative proof of the formula in Problem A.1 from parts (a) and (b).
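A small sketch (the function name is ours) computes ϕ by trial-division factorization and checks the divisor-sum identity Σ_(d|n) ϕ(d) = n of Problem A.1 for n = 12.

```python
def phi(n):
    """Euler totient via trial division: phi(n) = n * prod(1 - 1/p)
    over the distinct primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p      # remove the multiples of p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                          # one leftover prime factor
        result -= result // m
    return result

# The divisor-sum identity of Problem A.1, checked for n = 12:
assert sum(phi(d) for d in range(1, 13) if 12 % d == 0) == 12
```

The first twelve values agree with the table above.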

APPENDIX A. FINITE FIELDS

Definition: The order of a nonzero element α of a finite field is the smallest natural number e such that α^e = 1.

Theorem A.4 (Primitive Element of a Field): The nonzero elements of any finite field can be written as powers of a single element.

Proof: Given a finite field F of order q, let 1 ≤ e ≤ q − 1. Either there exist no elements in F of order e, or there exists at least one element α of order e. In the latter case, since α has order e, α is a root of the polynomial x^e − 1 in F[x]. Hence (α^n)^e = (α^e)^n = 1 for n = 0, 1, 2, . . ., so that every power of α is also a root of x^e − 1. Since x^e − 1 can have at most e zeros in F[x], we know that the roots α^n for n = 1, 2, . . . , e are distinct; we then immediately know the factorization of the polynomial x^e − 1 in F[x]:

    x^e − 1 = (x − 1)(x − α)(x − α^2) . . . (x − α^{e−1}).

Consequently, the only possible elements of order e in F are the powers α^i for 1 ≤ i < e. However, if i and e share a common factor n > 1, then (α^i)^{e/n} = 1 and the order of α^i would be less than or equal to e/n. So this leaves only the elements α^i with (i, e) = 1 as possible candidates for elements of order e. Hence the number of elements of order e is either 0 or φ(e). Note that the e powers of α form a subgroup of the multiplicative group G formed by the nonzero elements of F, so Lagrange's Theorem implies that e must divide the order of G; that is, e | (q − 1). If the number of elements of order e were 0 for some divisor e of q − 1, then the total number of nonzero elements in F would be less than Σ_{e|(q−1)} φ(e) = q − 1, which is a contradiction. Thus, there exist elements in F of any order e that divides q − 1, including q − 1 itself.

Definition: An element of order q − 1 in a finite field F_q is called a primitive element.

Remark: The distinct powers of an element of order q − 1 are just the q − 1 nonzero elements of F. Theorem A.4 therefore states that the elements of a finite field F_q can be listed in terms of a primitive element, say α: F_q = {0, α^0, α^1, α^2, . . . , α^{q−2}}.

Remark: A primitive element of a finite field F_{p^r} need not be unique. If α is a primitive element, then for the φ(p^r − 1) values of i that are relatively prime to p^r − 1, the powers α^i are also primitive elements. In fact, we see from the proof of Theorem A.4 that the number of such elements is exactly φ(p^r − 1).

Remark: The fact that all elements in a field F_q can be expressed as powers of a primitive element can be exploited whenever we wish to multiply two elements together, in exactly the same manner as one uses a table of logarithms to multiply real numbers: we can compute the product α^i α^j simply by determining which element can be expressed as α raised to the power (i + j) mod (q − 1).
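The logarithm-table trick for field multiplication can be sketched in code. This is a minimal illustration of my own (not from the notes), using the field F_16 = F_2[x]/(x^4 + x^3 + 1) that appears later in this appendix; elements are encoded as 4-bit integers whose bits are the polynomial coefficients, and MOD_POLY encodes the modulus x^4 + x^3 + 1.

```python
# Multiplication in F_16 = F_2[x]/(x^4 + x^3 + 1) via the "log table" trick:
# alpha^i * alpha^j = alpha^((i + j) mod 15), where alpha = x is primitive.

MOD_POLY = 0b11001  # x^4 + x^3 + 1, the modulus used in the notes

def build_tables():
    """Return (powers, logs): powers[i] = alpha^i as a 4-bit int; logs inverts it."""
    powers, logs = [], {}
    a = 1  # alpha^0
    for i in range(15):
        powers.append(a)
        logs[a] = i
        a <<= 1               # multiply by alpha = x
        if a & 0b10000:       # a degree-4 term appeared: reduce by the modulus
            a ^= MOD_POLY
    return powers, logs

def multiply(u, v):
    """Product of two elements of F_16, computed by adding exponents mod 15."""
    if u == 0 or v == 0:
        return 0
    powers, logs = build_tables()
    return powers[(logs[u] + logs[v]) % 15]
```

In practice the tables would be built once and reused; rebuilding them inside `multiply` keeps the sketch short.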

Corollary A. If f (x) ∈ Fp [x] has β as a root.1 (Cyclic Nature of Fields): Every element β of a ﬁnite ﬁeld of order q is a root of the equation β q − β = 0. “⇐” Since m(β) = 0. the minimal polynomial must be m(x) itself.1 states that every element β in a ﬁnite ﬁeld Fpr is a root of some polynomial f (x) ∈ Fp [x]. Thus m(x) is irreducible. this would contradict the minimality of deg m. Problem A. Proof: Let m(x) be the minimal polynomial of β. Deﬁnition: A primitive polynomial of a ﬁeld is the minimal polynomial of a primitive element of the ﬁeld. then f (x) is divisible by the minimal polynomial of β. then expressing f (x) = q(x)m(x) + r(x) with deg r < deg m.4. Corollary A. Remark: In particular. Remark: If α is a primitive element of Fq .5. then a(β)b(β) = 0 implies that a(β) = 0 or b(β) = 0. If f (β) = 0. How do we ﬁnd the minimal polynomial of an element αi in the ﬁeld Fpr ? A. Then m(x) is the minimal polynomial of β ⇐⇒ m(x) is irreducible in Fp [x]. Theorem A. we see that r(x) is identically zero. we see that r(β) = 0.1 (Minimal Polynomials Divide xq − x): The minimal polynomial of an element of a ﬁeld Fq divides xq − x. Note that (αi )−1 = αq−1−i .83 Remark: A primitive element α of Fq satisﬁes the equation αq−1 = 1. The following theorems provide some assistance. The fact that a primitive element α satisﬁes αq = α leads to the following corollary of Theorem A. Corollary A.4. Deﬁnition: Given an element β in a ﬁeld Fpr . and has the highest possible order (q − 1). .3: Show that every nonzero element β of a ﬁnite ﬁeld satisﬁes β q−1 = 1 in Fq . the monic polynomial m(x) in Fp [x] of least degree with β as a root is called the minimal polynomial of β. where a and b are of smaller degree. But since m(x) is irreducible and monic. Corollary A.2 (Irreducibility of Minimal Polynomial): Let m(x) be a monic polynomial in Fp [x] that has β as a root.5 that m(x) is divisible by the minimal polynomial of β. Q. so that αq = α.5 (Minimal Polynomial): Let β ∈ Fpr . 
then α−1 = αq−2 is also a primitive element of Fq since (q − 2)(q − 1 − i) = i mod(q − 1). Proof: “⇒” If m(x) = a(x)b(x). By the minimality of deg m.5.4. we know from Theorem A.
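The counting argument behind Theorem A.4 can be checked numerically. The sketch below is my own (not from the notes): it multiplies in F_16 with the same integer encoding and modulus x^4 + x^3 + 1 used elsewhere in this appendix, computes the order of every nonzero element, and confirms that the number of elements of each order e dividing 15 is φ(e).

```python
# Verify, in F_16, that the number of elements of order e equals phi(e)
# for every divisor e of q - 1 = 15 (Theorem A.4's counting argument).

from math import gcd

MOD_POLY = 0b11001  # x^4 + x^3 + 1

def gf16_mul(u, v):
    """Carry-less polynomial product of two 4-bit elements, reduced mod x^4+x^3+1."""
    r = 0
    for k in range(4):
        if (v >> k) & 1:
            r ^= u << k
    for d in range(6, 3, -1):            # reduce any terms of degree 6..4
        if (r >> d) & 1:
            r ^= MOD_POLY << (d - 4)
    return r

def order(a):
    """Multiplicative order of a nonzero element a of F_16."""
    e, x = 1, a
    while x != 1:
        x = gf16_mul(x, a)
        e += 1
    return e

def phi(n):
    """Euler's totient, by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```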

Theorem A.6 (Functions of Powers): If f(x) ∈ F_p[x], then f(x^p) = [f(x)]^p.

Proof: Exercise.

Corollary A.6.1 (Root Powers): If α is a root of a polynomial f(x) ∈ F_p[x], then α^p is also a root of f(x).

Problem A.4: Show that the pth powers of distinct elements of a field F_{p^r} are distinct.

Problem A.5: Show that the only elements a of F_{p^r} that satisfy the property a^{p−1} = 1 in F_{p^r} are the p − 1 nonzero elements of F_p.

Theorem A.7 (Reciprocal Polynomials): In a finite field F_{p^r} the following statements hold:
(a) if α ∈ F_{p^r} is a nonzero root of f(x) ∈ F_p[x], then α^{−1} is a root of the reciprocal polynomial of f(x);
(b) a polynomial is irreducible ⇐⇒ its reciprocal polynomial is irreducible;
(c) a polynomial is a minimal polynomial of a nonzero element α ∈ F_{p^r} ⇒ a scalar multiple of its reciprocal polynomial is a minimal polynomial of α^{−1};
(d) a polynomial is primitive ⇒ a scalar multiple of its reciprocal polynomial is primitive.

Proof: Exercise.

Suppose we want to find the minimal polynomial m(x) of α^i in F_{p^r}. Identify the set of distinct elements {α^i, α^{ip}, α^{ip^2}, . . .}, and suppose there are s distinct elements in this set. The exponents i, ip, ip^2, . . ., taken modulo p^r − 1, form the cyclotomic coset of i. By Corollary A.6.1, each of these elements is a distinct root of m(x), and so the monic polynomial

    f(x) = ∏_{k=0}^{s−1} (x − α^{ip^k})

is certainly a factor of m(x). The coefficients a_k in the expansion f(x) = Σ_{k=0}^{s} a_k x^k are symmetric functions of these roots, consisting of sums of products of the roots. When we raise these sums to the pth power we obtain symmetric sums of the pth powers of the products of the roots. Notice that the pth power of every root α^{ip^k} of f(x) is another such root. Since the pth power of a root is another root and the coefficients a_k are invariant under root permutations, we deduce that a_k^p = a_k; that is, on expanding all of the factors of f(x), all of the αs disappear! It follows that each of the coefficients a_k belongs to the base field F_p. Hence f(x) ∈ F_p[x] and f(α^i) = 0, so by Theorem A.5 we know also that m(x) is a factor of f(x). Since f(x) is monic, we conclude that m(x) = f(x).
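The construction m(x) = ∏(x − α^{ip^k}) can be carried out mechanically. The following sketch is my own (not from the notes): it works in F_16 with p = 2 and modulus x^4 + x^3 + 1, expands the product over a cyclotomic coset with coefficients in F_16, and the coefficients indeed collapse into the base field F_2 (every returned coefficient is 0 or 1).

```python
# Compute the minimal polynomial of alpha^i in F_16 by expanding the
# product of (x - alpha^(i*2^k)) over the cyclotomic coset of i.

MOD_POLY = 0b11001  # x^4 + x^3 + 1

def gf_mul(u, v):
    """Product of two elements of F_16 (4-bit integer encoding)."""
    r = 0
    for k in range(4):
        if (v >> k) & 1:
            r ^= u << k
    for d in range(6, 3, -1):
        if (r >> d) & 1:
            r ^= MOD_POLY << (d - 4)
    return r

def alpha_pow(n):
    """alpha^n, where alpha = x is encoded as the integer 2."""
    a = 1
    for _ in range(n % 15):
        a = gf_mul(a, 2)
    return a

def cyclotomic_coset(i, p=2, n=15):
    """Exponents i, i*p, i*p^2, ... reduced mod n, without repetition."""
    coset, j = [], i % n
    while j not in coset:
        coset.append(j)
        j = (j * p) % n
    return coset

def minimal_poly(i):
    """Coefficients (constant term first) of the minimal polynomial of alpha^i."""
    poly = [1]                                  # start with the polynomial 1
    for j in cyclotomic_coset(i):
        root = alpha_pow(j)
        shifted = [0] + poly                    # x * poly
        scaled = [gf_mul(root, c) for c in poly] + [0]   # root * poly
        poly = [a ^ b for a, b in zip(shifted, scaled)]  # (x + root) * poly
    return poly
```

Over F_2, subtraction and addition coincide, so (x − root) is multiplied in as (x + root), and field addition is bitwise XOR.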

Remark: Since the degree of the minimal polynomial m(x) of α^i equals the number of elements s in the cyclotomic coset of α^i, minimal polynomials in F_{p^r} have degree s ≤ r: noting that α^{p^r} = α, since p^r = 1 mod (p^r − 1), we see that the cyclotomic coset of i contains at most r elements.

Remark: Every primitive polynomial of F_{p^r} has degree r, and each of its roots is a primitive element of F_{p^r}. This follows immediately from the distinctness of the r elements α, α^p, α^{p^2}, . . . , α^{p^{r−1}} for a primitive element α of F_{p^r}.

Problem A.6: Show that the elements generated by powers belonging to a particular cyclotomic coset share the same minimal polynomial and order.

We can sometimes use the previous theorems to help us quickly determine m(x) without having actually to perform the above product, as demonstrated by the following example.

• We now find the minimal polynomial for each of the 16 elements of the field F_{2^4} = F_2[x]/(x^4 + x^3 + 1). It is straightforward to check that the polynomial x is a primitive element of the field (see Table A.1). The cyclotomic cosets consist of the powers i·2^k (mod 15) of each element α^i:

    {1, 2, 4, 8}, {3, 6, 9, 12}, {5, 10}, {7, 11, 13, 14}, {0}.

The first cyclotomic coset corresponds to the primitive element α = x, for which the minimal polynomial is x^4 + x^3 + 1. Since x is a root of the irreducible polynomial x^4 + x^3 + 1 in F_2[x]/(x^4 + x^3 + 1), we know from Corollary A.5.2 that x^4 + x^3 + 1 is the minimal polynomial of x and hence is a primitive polynomial of F_16. This is also the minimal polynomial for the other powers of α in the cyclotomic coset containing 1, namely α^2, α^4, and α^8.

We can also easily find the minimal polynomial of α^3. Since the cyclotomic coset corresponding to α^3 contains 4 elements, the minimal polynomial must have degree 4. Since α^15 = 1, that is, (α^3)^5 = 1, we observe that α^3 satisfies the equation x^5 − 1 = 0. We can factorize x^5 − 1 = (x − 1)(x^4 + x^3 + x^2 + x + 1), and since α^3 ≠ 1, we know that α^3 must be a root of the remaining factor. So x^4 + x^3 + x^2 + x + 1 is in fact the minimal polynomial of α^3 (hence we have indirectly proven that x^4 + x^3 + x^2 + x + 1 is irreducible in F_2[x]). This is also the minimal polynomial of α^6, α^9, and α^12.

Likewise, since (α^5)^3 = 1, the minimal polynomial of α^5 must be a factor of x^3 − 1 = (x − 1)(x^2 + x + 1) with degree 2; we see that the minimal polynomial for the elements α^5 and α^10 is x^2 + x + 1.

The reciprocal polynomial of x^4 + x^3 + 1 is x^4 + x + 1. By Theorem A.7, this is the minimal polynomial of the inverse elements α^{−i} = α^{15−i} for i = 1, 2, 4, 8, that is, for α^{14}, α^{13}, α^{11}, and α^7. We see that these are just the elements corresponding to the second-last coset.
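The cyclotomic cosets in the worked example can be generated directly. This short sketch is my own, using the example's parameters p = 2 and p^r − 1 = 15.

```python
# List the cyclotomic cosets {i * p^k (mod n)} partitioning {0, 1, ..., n-1}.

def cyclotomic_cosets(p=2, n=15):
    """Return the cyclotomic cosets of p modulo n, each sorted ascending."""
    seen, cosets = set(), []
    for i in range(n):
        if i in seen:
            continue
        coset, j = [], i
        while j not in coset:
            coset.append(j)
            j = (j * p) % n
        seen.update(coset)
        cosets.append(sorted(coset))
    return cosets
```

The cosets partition the exponents, so their sizes sum to n; for n = 15 the sizes 1, 4, 4, 2, 4 match the degrees of the minimal polynomials found above.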

Finally, the minimal polynomial of the multiplicative unit α^0 = 1 is just the first-degree polynomial x + 1, and the minimal polynomial of 0 is x. These results are summarized in Table A.1.

    Element   Polynomial            Order   Minimal Polynomial
    α^0       1                       1     x + 1
    α^1       α                      15     x^4 + x^3 + 1
    α^2       α^2                    15     x^4 + x^3 + 1
    α^3       α^3                     5     x^4 + x^3 + x^2 + x + 1
    α^4       α^3 + 1                15     x^4 + x^3 + 1
    α^5       α^3 + α + 1             3     x^2 + x + 1
    α^6       α^3 + α^2 + α + 1       5     x^4 + x^3 + x^2 + x + 1
    α^7       α^2 + α + 1            15     x^4 + x + 1
    α^8       α^3 + α^2 + α          15     x^4 + x^3 + 1
    α^9       α^2 + 1                 5     x^4 + x^3 + x^2 + x + 1
    α^10      α^3 + α                 3     x^2 + x + 1
    α^11      α^3 + α^2 + 1          15     x^4 + x + 1
    α^12      α + 1                   5     x^4 + x^3 + x^2 + x + 1
    α^13      α^2 + α                15     x^4 + x + 1
    α^14      α^3 + α^2              15     x^4 + x + 1

Table A.1: Nonzero elements of the field F_2[x]/(x^4 + x^3 + 1) expressed in terms of the primitive element α = x.

Note that x^4 + x^3 + 1 and x^4 + x + 1 are primitive polynomials of F_2[x]/(x^4 + x^3 + 1), and their roots comprise the φ(15) = φ(5)φ(3) = 4 · 2 = 8 primitive elements of F_{2^4}. Even though the minimal polynomial of the element α^3 also has degree r = 4, it is not a primitive polynomial.

Remark: The cyclotomic cosets containing powers that are relatively prime to p^r − 1 contain the φ(p^r − 1) primitive elements of F_{p^r}; their minimal polynomials are primitive and have degree r.

Remark: There is another interpretation of finite fields. Consider the field F_4 = F_2[x]/(x^2 + x + 1), which contains the elements {0, 1, x, x + 1}. Since the primitive element α = x satisfies the equation x^2 + x + 1 = 0, we could, using the quadratic formula, think of α as the complex number

    α = (−1 + √(1 − 4))/2 = −1/2 + i √3/2.

From this it follows that 1 = α ᾱ = |α|^2 and hence α = e^{iθ} = cos θ + i sin θ for some real number θ; we see that θ = 2π/3. Thus α^3 = e^{3θi} = e^{2πi} = 1. The other root of the equation x^2 + x + 1 = 0 is the complex conjugate ᾱ of α; that is, x^2 + x + 1 = (x − α)(x − ᾱ). In this way, we have constructed a number α that is a primitive third root of unity, which is precisely what we mean when we say that α is a primitive element of F_4. The field F_4 may thus be thought of either as the set {0, 1, x, x + 1} or as the set {0, 1, e^{2πi/3}, e^{−2πi/3}}, as illustrated in Fig. A.1. Similarly, the field F_3 = {0, 1, 2} is isomorphic to {0, 1, −1}, and F_5 = {0, 1, 2, 3, 4} is isomorphic to {0, 1, i, −1, −i}.

Figure A.1: A representation of the field F_4 in the complex plane.


