
E2 205: Error-Control Coding

Chapter 4: Linear Codes

Navin Kashyap

Online Semester Aug–Dec 2021


Indian Institute of Science
Definitions and Notation

Notation: Henceforth, F or F_q will denote a finite field with q
elements. (Alternative notation: GF(q))

Definition: A linear code C over F is a subspace of F^n.

- n is the length or blocklength of the code C.
- The dimension of C is its dimension as a vector space over F;
  denoted by dim(C) or dim_F(C).

Notation:
- An [n, k] linear code over F is a code of blocklength n and
  dimension k.
- An [n, k]_q linear code is a code of blocklength n and
  dimension k over F_q.
  (Contrast this with the (n, M) notation for block codes.)
Number of Codewords

Proposition: An [n, k] linear code C over F_q has q^k codewords.

Proof: Let c_1, c_2, ..., c_k be a basis of the subspace C.

- By Proposition B1, every codeword (i.e., vector) c ∈ C can be
  uniquely expressed as a linear combination
  c = α_1·c_1 + α_2·c_2 + ··· + α_k·c_k, with α_j ∈ F_q for all j.
- Thus, there is a 1-1 correspondence between codewords c ∈ C
  and k-tuples of coefficients (α_1, ..., α_k) ∈ F_q^k.
- Hence, |C| = |F_q^k| = q^k.

Remarks:
- An [n, k] linear code over F_q is an (n, M = q^k) block code.
- The rate of an [n, k] linear code over F_q is

      R = (1/n) log_q(q^k) = k/n.
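The count |C| = q^k can be checked directly on a small code. A minimal sketch in plain Python over F_2 (the helper `span_f2` and the example basis are mine, not from the slides): enumerate all linear combinations of a basis and count the distinct words.

```python
from itertools import product

def span_f2(basis):
    """Enumerate all F_2-linear combinations of the given basis vectors."""
    n = len(basis[0])
    code = set()
    for coeffs in product([0, 1], repeat=len(basis)):
        # word = sum_j alpha_j * g_j, computed coordinate-wise mod 2
        word = tuple(sum(a * g[i] for a, g in zip(coeffs, basis)) % 2
                     for i in range(n))
        code.add(word)
    return code

# Basis of a [4, 2] binary linear code:
C = span_f2([(1, 1, 0, 0), (0, 0, 1, 1)])
print(sorted(C))  # the q^k = 2^2 = 4 codewords
print(len(C))     # 4, so the rate is R = k/n = 2/4
```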
Minimum Distance

Recall that the minimum distance of a block code C is defined as

      d_min(C) = min_{x,y∈C, x≠y} d_H(x, y).

Definition: The Hamming weight of a word (vector)
x = (x_1, ..., x_n) ∈ F^n is the number of non-zero entries in x:

      w_H(x) = #{i : x_i ≠ 0} = d_H(x, 0).

Proposition: For a linear code C,

      d_min(C) = min_{c∈C, c≠0} w_H(c).

Proof: Since C is linear: x, y ∈ C =⇒ x − y ∈ C. Therefore,

      d_min(C) = min_{x,y∈C, x≠y} d_H(x, y)
               = min_{x,y∈C, x≠y} w_H(x − y)
               = min_{c∈C, c≠0} w_H(c).
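The proposition is easy to test numerically. A sketch assuming plain Python over F_2 (the [6, 3] example code and helper name are mine): compute d_min both as a minimum over pairs of distinct codewords and as a minimum nonzero weight, and confirm the two agree.

```python
from itertools import product

def span_f2(basis):
    """All F_2-linear combinations of the basis vectors."""
    n = len(basis[0])
    return {tuple(sum(a * g[i] for a, g in zip(coeffs, basis)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(basis))}

C = span_f2([(1, 0, 0, 1, 1, 0), (0, 1, 0, 1, 0, 1), (0, 0, 1, 0, 1, 1)])

# d_min as a minimum over all pairs of distinct codewords ...
d_pairs = min(sum(a != b for a, b in zip(x, y))
              for x in C for y in C if x != y)
# ... equals the least Hamming weight of a nonzero codeword:
d_weight = min(sum(c) for c in C if any(c))
print(d_pairs, d_weight)  # both give d_min = 3 for this code
```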
[n, k, d] Notation

Notation: An [n, k, d] (or [n, k, d]_q) linear code is an [n, k]
(or [n, k]_q) linear code with minimum distance d.
(Contrast with (n, M, d) notation for block codes.)

Example:
The repetition code of blocklength n: {00...0 (n 0s), 11...1 (n 1s)}

This is a linear code over F_2 with
blocklength n, dimension 1, and minimum distance n.

In other words, this is an [n, 1, n] binary linear code.


Example: The Single Parity-Check Code

The length-n single parity-check code over F_2:

      C = {x_1 x_2 ... x_n : x_1 + x_2 + ··· + x_n ≡ 0 (mod 2)}
        = nullspace(H),  where  H = [1 1 ··· 1]  (n columns).

- By the rank-nullity theorem,

      dim(C) = n − rank(H) = n − 1.

- Since there are no codewords of (odd) weight 1,
  and all binary words of (even) weight 2 are in the code,

      d_min(C) = 2.

Thus, this is an [n, n − 1, 2] binary linear code.
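Both parameters can be verified by brute force for a small n. A sketch in plain Python (the choice n = 5 is mine): list all even-parity words and check the dimension and minimum distance.

```python
from itertools import product

n = 5
# The length-n single parity-check code over F_2: all even-weight words.
C = [x for x in product([0, 1], repeat=n) if sum(x) % 2 == 0]

print(len(C))                            # 2^(n-1) = 16, so dim(C) = n - 1 = 4
print(min(sum(c) for c in C if any(c)))  # d_min(C) = 2
```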


Example: The Length-7 Hamming Code

[Figure: a three-circle Venn diagram whose seven regions are labelled
x_1, ..., x_7, with x_4 in the central region common to all three circles.]

A binary word x_1 x_2 x_3 x_4 x_5 x_6 x_7 is in the Hamming code iff

      x_1 + x_2 + x_4 + x_5 ≡ 0 (mod 2)
      x_1 + x_3 + x_4 + x_6 ≡ 0 (mod 2)
      x_2 + x_3 + x_4 + x_7 ≡ 0 (mod 2)

Re-write these equations in matrix form (over F_2) as

      [ 1 1 0 1 1 0 0 ]   [ x_1 ]   [ 0 ]
      [ 1 0 1 1 0 1 0 ] · [  ⋮  ] = [ 0 ]
      [ 0 1 1 1 0 0 1 ]   [ x_7 ]   [ 0 ]
Example: The Length-7 Hamming Code

Thus, a binary word x_1 x_2 x_3 x_4 x_5 x_6 x_7 is in the Hamming code C iff

      H x^T = 0,   where   H = [ 1 1 0 1 1 0 0 ]
                               [ 1 0 1 1 0 1 0 ]
                               [ 0 1 1 1 0 0 1 ]

In other words, the Hamming code C is equal to nullspace_{F_2}(H).

Consequently,
- dim(C) = n − rank_{F_2}(H) = 7 − 3 = 4.
Example: The Length-7 Hamming Code

A binary word x_1 x_2 x_3 x_4 x_5 x_6 x_7 is in the Hamming code C iff

      x_1·(1,1,0)^T + x_2·(1,0,1)^T + x_3·(0,1,1)^T + x_4·(1,1,1)^T
        + x_5·(1,0,0)^T + x_6·(0,1,0)^T + x_7·(0,0,1)^T = (0,0,0)^T,

i.e., iff the columns of H selected by the non-zero entries of x sum to 0.

- C has no codewords of weight 1, as no column of H is 0.
- C has no codewords of weight 2, as no two columns of H are
  identical.
- C does have codewords of weight 3: e.g., the first three
  columns of H sum to 0 over F_2, so 1110000 is in C.

Hence, d_min(C) = 3.

Thus, the Hamming code is a [7, 4, 3] binary linear code.
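All three parameters of the [7, 4, 3] code can be confirmed by exhausting F_2^7. A sketch in plain Python (variable names are mine): build C as the nullspace of H and check the size, the minimum weight, and the weight-3 codeword 1110000 mentioned above.

```python
from itertools import product

H = [(1, 1, 0, 1, 1, 0, 0),
     (1, 0, 1, 1, 0, 1, 0),
     (0, 1, 1, 1, 0, 0, 1)]

# C = nullspace_{F_2}(H): all words x with H x^T = 0 over F_2
C = [x for x in product([0, 1], repeat=7)
     if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)]

print(len(C))                            # 2^4 = 16 codewords, so k = 4
print(min(sum(c) for c in C if any(c)))  # d_min = 3
print((1, 1, 1, 0, 0, 0, 0) in C)        # True: columns 1-3 of H sum to 0
```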


Minimum Distance of C = nullspace(H)

Theorem: Let C = nullspace_F(H) for some matrix H, with C ≠ {0}.
The minimum distance of C is the smallest integer d > 0 such that
some collection of d columns of H is linearly dependent over F.
(Equivalently, d_min(C) is the largest integer d such that every collection
of d − 1 columns of H is linearly independent over F.)

Proof:
- Some collection of d columns of H is linearly dependent
  =⇒ there exists a codeword of weight ≤ d
  =⇒ d_min ≤ d
- d is the smallest such integer
  =⇒ every collection of d − 1 or fewer columns is linearly indep.
  =⇒ there are no codewords of weight < d
  =⇒ d_min ≥ d

If C = nullspace_F(H), then H is called a parity-check matrix for C.
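The theorem gives a way to read d_min off the columns of H. A sketch over F_2 (function name mine), using the fact that over F_2 a set of columns is linearly dependent iff some nonempty subset of it sums to 0, so the smallest dependent collection is itself a zero-sum collection:

```python
from itertools import combinations

H = [(1, 1, 0, 1, 1, 0, 0),
     (1, 0, 1, 1, 0, 1, 0),
     (0, 1, 1, 1, 0, 0, 1)]
cols = list(zip(*H))  # columns of H as tuples

def dmin_from_H(cols):
    """Smallest d > 0 such that some d columns of H are dependent over F_2.

    Over F_2, a minimal dependent collection of columns must itself sum
    to 0 (a smaller zero-sum subset would contradict minimality), so we
    search for the smallest zero-sum collection."""
    m = len(cols[0])
    for d in range(1, len(cols) + 1):
        for subset in combinations(cols, d):
            if all(sum(col[i] for col in subset) % 2 == 0 for i in range(m)):
                return d
    return None

print(dmin_from_H(cols))  # 3 for the [7, 4] Hamming code's parity-check matrix
```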



Generator Matrices

Definition: A generator matrix for an [n, k] linear code C over F is
a k × n matrix whose rows form a basis of C, i.e.,

      G = [ g_1 ]
          [ g_2 ]
          [  ⋮  ]
          [ g_k ]

where g_1, g_2, ..., g_k constitute a basis for C.

- In particular, rank_F(G) = dim_F(C).
- Since C, in general, will have many different bases, it will have
  many different generator matrices. In fact, we can count the
  number of its generator matrices quite explicitly.
- Generator matrices are used for encoding.
Number of Generator Matrices

Proposition: An [n, k] linear code over F_q has

      ∏_{j=0}^{k−1} (q^k − q^j) = (q^k − 1)(q^k − q)(q^k − q^2) ··· (q^k − q^{k−1})

distinct generator matrices.

Proof: Let C be an [n, k] linear code over F_q.
We can construct a k × n generator matrix for C as follows:
- The first row g_1 can be any non-zero codeword from C:
  there are q^k − 1 choices for g_1.
- The second row g_2 can be any codeword from C other than those in
  span(g_1): there are q^k − q choices for g_2.
- In this manner, having picked rows g_1, ..., g_j, the (j + 1)th row
  g_{j+1} can be any codeword from C, except for those in
  span(g_1, ..., g_j): there are q^k − q^j choices for g_{j+1}.

Combining the no. of choices for g_1, ..., g_k, we get ∏_{j=0}^{k−1} (q^k − q^j).
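The count can be verified exhaustively for a tiny code. A sketch over F_2 (helper and example are mine): for the [4, 2] code with q = 2, k = 2, the formula predicts (q^k − 1)(q^k − q) = 3 · 2 = 6 generator matrices.

```python
from itertools import product

def span_f2(rows):
    """All F_2-linear combinations of the given vectors."""
    n = len(rows[0])
    return {tuple(sum(a * g[i] for a, g in zip(coeffs, rows)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(rows))}

# The [4, 2] binary code {0000, 0011, 1100, 1111}:
C = sorted(span_f2([(1, 1, 0, 0), (0, 0, 1, 1)]))

# Count ordered pairs of rows (g1, g2) from C that form a basis:
# g1 nonzero (3 choices), g2 outside span(g1) = {0, g1} (2 choices).
count = sum(1 for g1 in C for g2 in C
            if any(g1) and g2 not in span_f2([g1]))
print(count)  # 6, matching (q^k - 1)(q^k - q)
```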
Generator Matrices and Encoding

An [n, k] linear code C over F_q has q^k codewords.
So, there is a 1-1 correspondence between F_q^k and C ⊆ F_q^n.

An encoder for C is a 1-1 mapping from message words
u = (u_1, ..., u_k) ∈ F_q^k to codewords c ∈ C.

Generator matrices give rise to encoders:
- Let G be a k × n generator matrix for C. Its rows g_1, ..., g_k
  form a basis of C.
- Recall that every c ∈ C can be uniquely expressed as a linear
  combination u_1 g_1 + ··· + u_k g_k, with u_j ∈ F_q for all j.
- Hence, the mapping

      u = (u_1, ..., u_k) ∈ F_q^k  ↦  uG = u_1 g_1 + ··· + u_k g_k

  is a bijection between F_q^k and C, i.e., an encoder mapping.
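The mapping u ↦ uG can be checked to be a bijection on a small example. A sketch over F_2 (the [6, 3] generator matrix and function name are mine): encoding all 2^k messages must produce 2^k distinct codewords.

```python
from itertools import product

G = [(1, 0, 0, 1, 1, 0),
     (0, 1, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1)]
k = len(G)

def encode(u, G):
    """u -> uG over F_2, i.e., the linear combination sum_j u_j g_j."""
    return tuple(sum(uj * g[i] for uj, g in zip(u, G)) % 2
                 for i in range(len(G[0])))

codewords = {encode(u, G) for u in product([0, 1], repeat=k)}
print(len(codewords))  # 2^3 = 8 distinct codewords: the map is 1-1 onto C
```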


Systematic Generator Matrices

G is called a systematic generator matrix if it is of the form

      G = [ I_k | B ],

where I_k is the k × k identity matrix and B is a k × (n − k) matrix.

In such a case, the encoder u ↦ uG maps a message u ∈ F_q^k to
the codeword

      c = [ u | uB ],

with u occupying the first k symbols and uB the remaining n − k symbols.

- In c, the first k symbols constitute the message u itself;
  they are called information symbols.
- The remaining n − k symbols, uB, are parity-check symbols.

So, retrieving the message encoded within a codeword is easy.
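Systematic encoding and the easy message retrieval can be sketched as follows (plain Python over F_2; the [6, 3] matrix and names are mine): the first k symbols of the codeword are literally the message.

```python
# Systematic G = [ I_3 | B ] for a [6, 3] binary code:
G = [(1, 0, 0, 1, 1, 0),
     (0, 1, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1)]
k = len(G)

def encode(u, G):
    """u -> uG over F_2."""
    return tuple(sum(uj * g[i] for uj, g in zip(u, G)) % 2
                 for i in range(len(G[0])))

u = (1, 0, 1)
c = encode(u, G)
print(c)           # (1, 0, 1, 1, 0, 1): information symbols, then parity checks
print(c[:k] == u)  # True: message recovery is just truncation
```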


Example

Not every code has a generator matrix in systematic form.

For example,

      C = {0000, 0011, 1100, 1111}

is a [4, 2, 2] binary linear code.
- In every codeword, the 1st coordinate is the same as the 2nd.
- So, it is not possible to have a generator matrix of the form

      G = [ 1 0 ∗ ∗ ]
          [ 0 1 ∗ ∗ ]
Row Operations and Column Permutations

In general, a generator matrix for an [n, k] linear code C can always
be found which has the k × k identity matrix as a submatrix.
- Pick any generator matrix G for C.
- Since rank(G) = k, some k columns of G are lin. indep.
- Apply elementary row operations on G to obtain a matrix G̃
  in which these k columns form an identity matrix.
- Since elementary row operations correspond to exchanging
  rows or replacing a row with a linear combination of rows,
  the rows of G̃ still form a basis of C.

If we wish, we may permute the columns of G̃ to bring the identity
matrix to the front.
The resulting matrix, call it G′, no longer generates the same
code C as G (or G̃); instead it generates an equivalent code C′.
Equivalent Codes

Two codes are equivalent if one can be obtained from the other by
a permutation of coordinates.

For example,

      C                                         C′
      0000                                      0000
      0011    ←→ (exchange 2nd and 3rd coords)  0101
      1100                                      1010
      1111                                      1111

The code C′ has a systematic generator matrix

      G′ = [ 1 0 1 0 ]
           [ 0 1 0 1 ]
Basis for C = nullspace(H)

Let C be an [n, k] linear code over F with parity-check matrix H,
i.e., C = nullspace_F(H).

Claim: Vectors g_1, ..., g_k from F^n form a basis of C iff
- g_1, ..., g_k are linearly independent over F; and
- H g_i = 0 for i = 1, ..., k.

Proof: (=⇒) If g_1, ..., g_k form a basis of C, then
- they are, by definition of basis, linearly independent;
- also, they lie in C = nullspace_F(H), so H g_i = 0 for all i.

(⇐=) Assume g_1, ..., g_k are lin. indep., and H g_i = 0 for all i.
- span(g_1, ..., g_k) is a vector space of dimension k.
- H · (α_1 g_1 + ··· + α_k g_k) = α_1·H g_1 + ··· + α_k·H g_k = 0,
  so span(g_1, ..., g_k) ⊆ nullspace(H) = C.
- Since span(g_1, ..., g_k) and C both have dimension k,
  the ⊆ above is in fact an equality: span(g_1, ..., g_k) = C.
Generator Matrix for C = nullspace(H)

The result of the prev. claim, when expressed in matrix notation, is:

Proposition: Let C be an [n, k] linear code over F, with
parity-check matrix H. Then, a k × n matrix G over F
is a generator matrix for C iff
- rank_F(G) = k, and
- H G^T = 0. (Here, 0 denotes an all-zero matrix.)

Corollary: Suppose that H is of the form [ A | I_{n−k} ]. Then,

      G = [ I_k | −A^T ]

is a generator matrix for C = nullspace(H).

Proof:
- rank(G) = k, and
- H G^T = [ A | I_{n−k} ] · [ I_k ]  =  A + (−A) = 0.
                            [ −A  ]
Example: The [7, 4] Hamming Code

The [7, 4] binary Hamming code has parity-check matrix

      H = [ 1 1 0 1 | 1 0 0 ]
          [ 1 0 1 1 | 0 1 0 ]  =  [ A | I_3 ].
          [ 0 1 1 1 | 0 0 1 ]

Hence,

      G = [ 1 0 0 0 | 1 1 0 ]
          [ 0 1 0 0 | 1 0 1 ]
          [ 0 0 1 0 | 0 1 1 ]  =  [ I_4 | −A^T ]
          [ 0 0 0 1 | 1 1 1 ]

is a generator matrix for the code. (Note that −A^T = A^T over F_2.)
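The condition H G^T = 0 from the proposition can be checked directly for this pair. A sketch in plain Python (variable names mine): compute every entry of H G^T over F_2.

```python
H = [(1, 1, 0, 1, 1, 0, 0),
     (1, 0, 1, 1, 0, 1, 0),
     (0, 1, 1, 1, 0, 0, 1)]
G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]

# H G^T over F_2: entry (i, j) is the dot product of row i of H
# with row j of G, reduced mod 2.
HGt = [[sum(h * g for h, g in zip(hrow, grow)) % 2 for grow in G]
       for hrow in H]
print(HGt)  # 3 x 4 all-zero matrix: every row of G lies in nullspace(H)
```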


“Dot Product”

For x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n) belonging to F^n,
define the “dot product” x · y = x_1 y_1 + x_2 y_2 + ··· + x_n y_n, all
operations being over the field F.

Example:
- Over R, (1, 0, 1, 0) · (1, 0, 1, 0) = 2.
- Over F_2, (1, 0, 1, 0) · (1, 0, 1, 0) = 0.
  (Thus, over F_2, a vector can be “orthogonal” to itself!)

It is easy to verify that
- x · y = y · x (commutativity)
- x · (α_1 y_1 + α_2 y_2) = α_1 (x · y_1) + α_2 (x · y_2) for all α_1, α_2 ∈ F
  (linearity)
Dual Codes

Definition: For a linear code C of blocklength n over F,
the dual code is defined as

      C⊥ = {x ∈ F^n : x · c = 0 for all c ∈ C}

- Informally, C⊥ consists of all vectors in F^n that are
  “orthogonal” to all codewords in C.
- It is easy to verify (using the subspace test) that C⊥ is also
  a linear code.
Dual Codes

Lemma: Let c_1, c_2, ..., c_k be a basis of C. Then, for any x ∈ F^n,

      x ∈ C⊥ ⇐⇒ x · c_j = 0 for j = 1, 2, ..., k

Proof: (=⇒) is obvious by definition of C⊥.

(⇐=) Any c ∈ C is expressible as α_1 c_1 + ··· + α_k c_k for some
α_1, ..., α_k ∈ F. So, if x · c_j = 0 for all j, then by the linearity
property of the “dot product”,

      x · c = x · (α_1 c_1 + ··· + α_k c_k) = α_1 (x · c_1) + ··· + α_k (x · c_k) = 0.
Dual Codes

Proposition: Let G be any generator matrix for C. Then,

      C⊥ = {x ∈ F^n : G x^T = 0}.

In other words, C⊥ = nullspace(G).

Proof: Write

      G = [ c_1 ]
          [ c_2 ]
          [  ⋮  ]
          [ c_k ]

where c_1, c_2, ..., c_k constitute a basis for C, and note that

      G x^T = (c_1 · x, c_2 · x, ..., c_k · x)^T.

So G x^T = 0 iff x · c_j = 0 for all j, which, by the previous Lemma,
holds iff x ∈ C⊥.

Corollary: dim(C⊥) = n − dim(C).


Examples

If C is an [n, k] linear code, then C⊥ is an [n, n − k] linear code.

- G = [1 1 ··· 1] generates a repetition code, which is an
  [n, 1, n] linear code.
  Its dual code, nullspace(G), is the single parity-check code,
  which is an [n, n − 1, 2] linear code.
- The [7, 4] binary Hamming code is generated by

      G = [ 1 0 0 0 1 1 0 ]
          [ 0 1 0 0 1 0 1 ]
          [ 0 0 1 0 0 1 1 ]
          [ 0 0 0 1 1 1 1 ]

  Its dual code, nullspace(G), is a [7, 3] binary linear code called
  the simplex code. It has the property that every non-zero
  codeword has Hamming weight equal to 4.
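The simplex-code claim can be verified by computing nullspace(G) exhaustively. A sketch in plain Python over F_2 (variable names mine): the dual has 2^(7−4) = 8 words, and every nonzero one has weight 4.

```python
from itertools import product

G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]

# Dual of the [7, 4] Hamming code: nullspace(G) over F_2
dual = [x for x in product([0, 1], repeat=7)
        if all(sum(g * xi for g, xi in zip(row, x)) % 2 == 0 for row in G)]

print(len(dual))                         # 2^(7-4) = 8: a [7, 3] code
print({sum(c) for c in dual if any(c)})  # {4}: all nonzero words have weight 4
```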
The Dual of a Dual

Proposition: (C⊥)⊥ = C.

Proof: It is easy to see that C ⊆ (C⊥)⊥: indeed, any c ∈ C is
“orthogonal” to all codewords in C⊥, by definition.
Furthermore,

      dim((C⊥)⊥) = n − dim(C⊥) = n − (n − dim(C)) = dim(C).

Since C and (C⊥)⊥ are vector spaces of the same (finite)
dimension, the inclusion C ⊆ (C⊥)⊥ is in fact an equality.

Corollary: Any linear code C has a parity-check matrix, i.e.,
C = nullspace(H) for some matrix H.

Proof: Let H be a generator matrix for C⊥. Then, nullspace(H)
is the code (C⊥)⊥, which is the same as C.
Remarks

- Any generator matrix for C⊥ is a parity-check matrix for C.
- Any generator matrix for C is a parity-check matrix for C⊥.
- The minimum distance of C⊥ cannot be directly determined
  from the minimum distance of C.
  More information is needed: the complete weight distribution
  of C allows one to determine the complete weight distribution
  (and hence, d_min) of C⊥, via the MacWilliams Identities.
Decoding

Recall the Minimum Distance Decoding (MDD) rule:

      Given a received vector y ∈ F^n, decode to a codeword
      c ∈ C that minimizes d_H(y, c). (Ties are broken arbitrarily.)

We now exploit linearity to give some new perspectives on MDD.

Recall that d_H(y, c) = w_H(y − c).

Let y − c =: error vector e  ⇐⇒  c = y − e

So, the MDD rule is equivalent to the following:

      Given a received vector y ∈ F^n, find an error vector e of
      least Hamming weight such that y − e ∈ C.
      Decode to ĉ = y − e.
The Set of Error Vectors

Define E(y) := {e : y − e ∈ C}.
(This is the set of all error vectors that cause codewords to get
transformed to y.)

Note that

      e ∈ E(y) ⇐⇒ y − e = c′ for some c′ ∈ C
               ⇐⇒ y − e = −c for some c ∈ C  (as C is linear, c = −c′ ∈ C)
               ⇐⇒ e = y + c for some c ∈ C

Hence,

      E(y) = {y + c : c ∈ C} =: y + C.

Thus, E(y) is a coset of C.


Another Perspective on MDD

MDD can now be viewed as the following algorithm: Given a
received vector y ∈ F^n,
1. find the coset of C to which y belongs
2. identify a vector, e, of least weight from that coset
3. set ĉ = y − e

Some advantages offered by this perspective on MDD:
- All cosets of C can be pre-calculated and stored at the
  decoder. This pre-computation has to be done just once, and
  does not have to be repeated each time a new y is received.
- A vector of least weight within each coset, called
  a coset leader, can also be identified in advance and stored.
Cosets — A Quick Review

Definition: A coset of C in F^n is a set of the form
b + C := {b + c : c ∈ C}, for some b ∈ F^n.

Some basic facts:

1. |b + C| = |C| = q^k, i.e., all cosets have the same size.
   Proof: The map c ↦ b + c is a bijection between C and b + C.

2. Each b ∈ F^n lies in some coset of C.
   Proof: Clearly, b ∈ b + C, since 0 ∈ C.

3. a and b are in the same coset of C iff a − b ∈ C.
   Proof: Suppose that b ∈ y + C, so that b = y + c for some c ∈ C.
   Then, a = b + (a − b) = y + (c + a − b).
   Hence, a is also in y + C ⇐⇒ c + (a − b) ∈ C
                             ⇐⇒ a − b ∈ C.
Cosets — A Quick Review

Theorem: The distinct cosets of a linear code C ⊆ F^n form a
partition of F^n.

Proof: Define a relation ∼ on F^n as follows: for a, b ∈ F^n,

      a ∼ b ⇐⇒ a − b ∈ C

- It is easy to verify that ∼ is an equivalence relation.
- Hence, the equivalence classes of ∼ form a partition of F^n.
- However, by Basic Fact 3, the equivalence classes of ∼ are
  precisely the cosets of C.

Corollary: An [n, k] linear code over F_q has q^{n−k} cosets.

Proof: Each coset has q^k words (by Basic Fact 1).
Put together, the cosets partition F_q^n.
Hence, the number of cosets is q^n / q^k = q^{n−k}.
Back to MDD

Let C be an [n, k] linear code over F_q.

Pre-computation (one-time):
- List out all q^{n−k} cosets of C.
- Identify a word of least weight (coset leader) from each coset.

Given a received vector y ∈ F_q^n,
1. find the coset of C to which y belongs
2. retrieve the coset leader, e, identified for that coset
3. decode to ĉ = y − e

Example: Let C be the [6, 3] binary linear code generated by

      G = [ 1 0 0 1 1 0 ]
          [ 0 1 0 1 0 1 ]   (left 3 × 3 block: I_3)
          [ 0 0 1 0 1 1 ]

C has 2^{n−k} = 8 cosets. It can be verified that d_min(C) = 3.
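Both facts about this [6, 3] code can be verified by brute force. A sketch in plain Python (helper name mine): group all of F_2^6 into cosets y + C and count the distinct ones, then compute d_min as the least nonzero weight.

```python
from itertools import product

def span_f2(rows):
    """All F_2-linear combinations of the given vectors."""
    n = len(rows[0])
    return {tuple(sum(a * g[i] for a, g in zip(coeffs, rows)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(rows))}

C = span_f2([(1, 0, 0, 1, 1, 0), (0, 1, 0, 1, 0, 1), (0, 0, 1, 0, 1, 1)])

# Distinct cosets of C in F_2^6 (cosets either coincide or are disjoint):
cosets = {frozenset(tuple((yi + ci) % 2 for yi, ci in zip(y, c)) for c in C)
          for y in product([0, 1], repeat=6)}
print(len(cosets))                       # q^(n-k) = 2^3 = 8
print(min(sum(c) for c in C if any(c)))  # d_min(C) = 3
```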


The Standard Array

A standard array for a linear code is a listing of the code and all its
cosets in the form of an array.

               C = 000000 100110 010101 001011 111000 011110 101101 110011
      100000 + C = 100000 000110 110101 101011 011000 111110 001101 010011
      010000 + C = 010000 110110 000101 011011 101000 001110 111101 100011
      001000 + C = 001000 101110 011101 000011 110000 010110 100101 111011
      000100 + C = 000100 100010 010001 001111 111100 011010 101001 110111
      000010 + C = 000010 100100 010111 001001 111010 011100 101111 110001
      000001 + C = 000001 100111 010100 001010 111001 011111 101100 110010
      100001 + C = 100001 000111 110100 101010 011001 111111 001100 010010

(The first entry in each row is the chosen coset leader.)

Note: There may be multiple choices for a word of least weight
within a coset: for example, in the last coset, we could have
alternatively chosen 001100 or 010010 as coset leader.
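The whole pre-computation and decoding procedure can be sketched in a few lines (plain Python over F_2; helper names and the tie-breaking choice are mine): build a leader table indexed by received word, then decode by subtracting the leader.

```python
from itertools import product

def span_f2(rows):
    """All F_2-linear combinations of the given vectors."""
    n = len(rows[0])
    return {tuple(sum(a * g[i] for a, g in zip(coeffs, rows)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(rows))}

C = span_f2([(1, 0, 0, 1, 1, 0), (0, 1, 0, 1, 0, 1), (0, 0, 1, 0, 1, 1)])

# Pre-computation: for each word of F_2^6, record a least-weight word of
# its coset as the coset leader (ties broken arbitrarily by min()).
leader = {}
for y in product([0, 1], repeat=6):
    coset = [tuple((yi + ci) % 2 for yi, ci in zip(y, c)) for c in C]
    leader[y] = min(coset, key=sum)

def decode(y):
    """Standard-array MDD: subtract the coset leader of y's coset."""
    e = leader[y]
    return tuple((yi - ei) % 2 for yi, ei in zip(y, e))

# A weight-1 error on codeword 100110 is corrected:
print(decode((1, 1, 0, 1, 1, 0)))  # (1, 0, 0, 1, 1, 0)
```

Decoding a codeword itself returns it unchanged, since the all-zero word is the unique leader of the coset C.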
Unique Word of Least Weight

When is there a unique word of least weight within a coset?


Note that any word of Hamming weight ≤ ⌊(dmin − 1)/2⌋ is always the
unique word of least weight within its coset.
[If two words a, b of weight ≤ ⌊(dmin − 1)/2⌋ were in the same coset,
then their difference a − b would be a non-zero word of weight < dmin in C, which cannot happen.]

Example: In the previous example of the [6, 3, 3] linear code,


I All words of weight ≤ ⌊(dmin − 1)/2⌋ = 1 are the unique words of
least weight in their respective cosets.
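This claim is easy to verify exhaustively for the [6, 3, 3] example code. A sketch (the generator matrix is the one used in this chapter's running example):

```python
from itertools import product

G = [(1, 0, 0, 1, 1, 0),
     (0, 1, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1)]
code = {tuple(sum(m[i] * G[i][j] for i in range(3)) % 2 for j in range(6))
        for m in product((0, 1), repeat=3)}

def coset_of(v):
    return [tuple((a + b) % 2 for a, b in zip(v, c)) for c in code]

# dmin = 3 here, so every word of weight <= floor((3-1)/2) = 1 should
# be the unique word of least weight in its coset
unique_min = True
for v in product((0, 1), repeat=6):
    if sum(v) > 1:
        continue
    weights = sorted(sum(u) for u in coset_of(v))
    # v itself must attain the minimum, and strictly so
    if weights[0] != sum(v) or weights[1] == weights[0]:
        unique_min = False

print(unique_min)
```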
Correctable Error Patterns

The coset leaders are precisely the error patterns that get corrected
by the standard array implementation of decoding.

Example: Suppose c = 100110 is transmitted.


I Suppose y = 110110 is received. So, e = y − c = 010000 is
the error vector (but this is not a priori known to the decoder).

I y is in the coset 010000 + C, for which 010000, being the


unique word of least weight, is the coset leader.

I So, y gets decoded to ĉ = y − coset leader = 100110 = c.


The error vector gets corrected!
Note that this c is the unique closest codeword to y.
Correctable Error Patterns
Example: Suppose c = 100110 is transmitted.
I Suppose y = 101010 is received. Now, e = y − c = 001100 is
the error vector (again, not a priori known to the decoder).

I y is in the coset 100001 + C, for which 100001 was chosen to


be the coset leader in our standard array.

I So, y gets decoded to ĉ = y − coset leader = 001011 ≠ c.


The error vector does not get corrected this time.

Note that this ĉ is a closest codeword to y, but it is not the


unique such codeword:
I There are three words of weight 2 in the same coset as y,
including the actual error vector e = 001100.
I Subtracting any of these weight-2 words from y would yield a
codeword at distance 2 from y.
Some Observations

I Under standard array decoding, an error vector e that is added


in transmission gets corrected iff
e is one of the coset leaders of the standard array.
Thus, we choose as coset leaders all the error vectors that we
wish to correct.
I In particular, for MDD, we choose the coset leaders to be
words of least weight within their respective cosets.

I Under standard array decoding, the probability of decoding error,
given that c is the transmitted codeword, is

Perr(c) = ∑_{y : y − c is not a coset leader} Pr[y | c]
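For a concrete feel, this sum can be evaluated for the running [6, 3, 3] example. Here a binary symmetric channel BSC(p) is assumed (my assumption, not stated on the slide), so Pr[y | c] depends only on the weight of the error pattern y − c:

```python
# coset-leader weights for the standard array of the [6, 3, 3] code:
# one weight-0 leader, six weight-1 leaders, one weight-2 leader
LEADER_WEIGHTS = (0, 1, 1, 1, 1, 1, 1, 2)
N = 6

def p_err(p):
    """P(decoding error) under standard-array decoding on a BSC(p).

    An error pattern e is corrected iff it is a coset leader, and on a
    BSC the pattern e occurs with probability p^w(e) * (1-p)^(n-w(e)).
    """
    p_correct = sum(p**w * (1 - p)**(N - w) for w in LEADER_WEIGHTS)
    return 1 - p_correct

for p in (0.001, 0.01, 0.1):
    print(p, p_err(p))
```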
Storage Complexity

I Even the one-time pre-computation and storage of the
standard array, which contains all the q^n words in Fq^n, is
infeasible for n ∼ 100 or more.

I Storage of the entire array is not needed if we can find some
means of identifying the coset to which a given y ∈ Fq^n
belongs.
It would then suffice to store only the q^(n−k) coset leaders.
Syndromes
Let H be an (n − k) × n parity-check matrix for C.
Definition: The syndrome of a vector y ∈ Fq^n is s = Hy^T.

Some facts:
1. s is a (column) vector belonging to Fq^(n−k)
=⇒ there are q^(n−k) possible syndromes.
2. The syndrome of y ∈ Fq^n is 0 iff y ∈ C.

3. Two vectors y1, y2 ∈ Fq^n are in the same coset of C iff they
have the same syndrome.
Proof:
y1, y2 are in the same coset of C ⇐⇒ y1 − y2 ∈ C
⇐⇒ H(y1 − y2)^T = 0
⇐⇒ Hy1^T = Hy2^T
Thus, the syndrome of a word y ∈ Fq^n uniquely determines
the coset to which it belongs.
Syndrome Decoding

Therefore, to implement MDD, it is enough to store a list of coset
leaders along with the syndromes of their respective cosets. In all,
this requires storing only q^(n−k) coset leader–syndrome pairs.

MDD then reduces to syndrome decoding: Given a received y ∈ Fq^n,

1. compute s = Hy^T
2. retrieve the coset leader, e, corresponding to syndrome s
3. decode to ĉ = y − e
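The three steps translate directly into code. A sketch for the binary case, using the parity-check matrix of the example that follows; the coset-leader table is precomputed by scanning error patterns in order of increasing weight:

```python
from itertools import product

H = [(1, 1, 0, 1, 0, 0),
     (1, 0, 1, 0, 1, 0),
     (0, 1, 1, 0, 0, 1)]
n = 6

def syndrome(y):
    # s = H y^T over GF(2)
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)

# coset-leader table: for each syndrome, keep the first (hence a
# least-weight) error pattern that produces it
leader = {}
for e in sorted(product((0, 1), repeat=n), key=sum):
    leader.setdefault(syndrome(e), e)

def decode(y):
    e = leader[syndrome(y)]                    # steps 1 and 2
    return tuple(a ^ b for a, b in zip(y, e))  # step 3: y - e is XOR over GF(2)

print(decode((1, 1, 0, 1, 1, 0)))  # -> (1, 0, 0, 1, 1, 0), as in the slides
```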
Example

Consider, once again, the [6, 3, 3] code C generated by

        [ 1 0 0 1 1 0 ]
    G = [ 0 1 0 1 0 1 ]
        [ 0 0 1 0 1 1 ]

A parity-check matrix for C is

        [ 1 1 0 1 0 0 ]
    H = [ 1 0 1 0 1 0 ]
        [ 0 1 1 0 0 1 ]

(This can be verified by checking that the rows of H form a basis of C⊥.)
Example

        [ 1 1 0 1 0 0 ]
    H = [ 1 0 1 0 1 0 ]
        [ 0 1 1 0 0 1 ]

We then form a table of syndromes and corresponding coset leaders:


Coset leader, e     Syndrome s = He^T
000000              [0 0 0]^T
100000              [1 1 0]^T
010000              [1 0 1]^T
001000              [0 1 1]^T
000100              [1 0 0]^T
000010              [0 1 0]^T
000001              [0 0 1]^T
100001              [1 1 1]^T

To illustrate syndrome decoding,

y = 110110 =⇒ s = Hy^T = [1 0 1]^T
=⇒ e = 010000 =⇒ ĉ = y − e = 100110
t-Error-Correcting Codes

Definition: A code is t-error-correcting (for some t ∈ Z+ ) if all


error patterns e of weight wH (e) ≤ t can be corrected under
minimum distance decoding (or equivalently, under syndrome
decoding).
I Thus, a code is t-error-correcting iff all the distinct error
vectors of weight ≤ t can be chosen to be coset leaders of
distinct cosets (or equivalently, all these error vectors lie in
distinct cosets).

Lemma 1: A linear code with minimum distance d is
t-error-correcting for any t ≤ (d − 1)/2.

Proof: Consider any t ≤ (d − 1)/2.
Distinct words of weight ≤ t cannot lie in the same coset;
if they did, their difference would be a non-zero codeword of weight
≤ 2t < d, which cannot happen. ∎
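The characterization above ("all error vectors of weight ≤ t lie in distinct cosets") can be checked mechanically, since two vectors share a coset iff they share a syndrome. A sketch for the running [6, 3, 3] example:

```python
from itertools import product

H = [(1, 1, 0, 1, 0, 0),
     (1, 0, 1, 0, 1, 0),
     (0, 1, 1, 0, 0, 1)]

def syndrome(e):
    return tuple(sum(h * b for h, b in zip(row, e)) % 2 for row in H)

def is_t_error_correcting(t, n=6):
    # all error patterns of weight <= t must have pairwise distinct
    # syndromes, i.e. lie in pairwise distinct cosets
    patterns = [e for e in product((0, 1), repeat=n) if sum(e) <= t]
    syndromes = [syndrome(e) for e in patterns]
    return len(set(syndromes)) == len(patterns)

print(is_t_error_correcting(1))  # dmin = 3, so t = 1 works
print(is_t_error_correcting(2))  # but t = 2 does not
```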
The Hamming Bound for Linear Codes
Proposition 2 (The Hamming bound for linear codes):
If an [n, k] linear code over Fq is t-error-correcting, then
∑_{i=0}^{t} (n choose i) (q − 1)^i ≤ q^(n−k).

Proof:
I LHS = no. of error vectors of weight ≤ t
I RHS = no. of cosets
Since the code is t-error-correcting, all these error vectors lie in
distinct cosets; hence LHS ≤ RHS.

Definition: A t-error-correcting linear code whose parameters


satisfy the Hamming bound with equality is called a perfect code.
I Such a code, under MDD, can correct all error patterns of
weight ≤ t, but none of weight t + 1 or more.
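Both the bound and the perfectness condition are one-liners to check numerically. A sketch, evaluated on parameters of codes mentioned in this chapter:

```python
from math import comb

def hamming_lhs(n, t, q=2):
    # number of error patterns of weight <= t in F_q^n
    return sum(comb(n, i) * (q - 1)**i for i in range(t + 1))

def satisfies_hamming_bound(n, k, t, q=2):
    return hamming_lhs(n, t, q) <= q**(n - k)

def is_perfect(n, k, t, q=2):
    return hamming_lhs(n, t, q) == q**(n - k)

print(is_perfect(6, 3, 1))   # running example: 7 < 8, not perfect
print(is_perfect(7, 4, 1))   # [7, 4, 3] binary Hamming code: 1 + 7 = 2^3
```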
Examples of Perfect Codes
I A trivial example of a perfect code is F^n (over any field F),
which has parameters [n, n, 1]. This is a perfect
0-error-correcting code: (n choose 0)(q − 1)^0 = 1 = q^(n−n).

I The [n, 1, n] binary repetition code. It is a perfect
(n−1)/2-error-correcting code, for odd values of n:

∑_{i=0}^{(n−1)/2} (n choose i) = (1/2) ∑_{i=0}^{n} (n choose i) = 2^(n−1) = 2^(n−k).

I The [7, 4, 3] binary Hamming code is a perfect
1-error-correcting code:

(7 choose 0) + (7 choose 1) = 8 = 2^(7−4).
We generalize this construction to obtain a family of perfect
single-error-correcting codes.
Single-Error-Correcting Linear Codes
A linear code is single-error-correcting iff all distinct error vectors
of Hamming weight ≤ 1 lie in distinct cosets, i.e., have distinct
syndromes.

Consider a linear code with parity-check matrix

H = [h1 h2 · · · hn],

the hi's being column vectors.

I The error vector of weight 0, i.e., e = [0 0 . . . 0], has syndrome 0.

I An error vector of weight 1, i.e., e = [0 . . . 0 α 0 . . . 0] for some
α ∈ F \ {0} in the ith coordinate, has syndrome He^T = α hi.

Thus, all error vectors of weight ≤ 1 have distinct syndromes iff

I α hi ≠ 0 for all i and all α ∈ F \ {0} ⇐⇒ hi ≠ 0 for all i
I α hi ≠ β hj for distinct i, j, and any α, β ∈ F \ {0}
⇐⇒ hi ≠ γ hj for distinct i, j, and any γ (= α⁻¹β) ∈ F \ {0}


Single-Error-Correcting Codes

Theorem: A linear code with parity-check matrix H can correct all


single-error patterns iff
I the columns of H are all non-zero; and
I no column is a (non-zero) multiple of any other column

In particular, a binary linear code is single-error-correcting iff it has


a parity-check matrix with all columns distinct and non-zero.
I The [7, 4, 3] binary Hamming code illustrates this idea.
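The theorem's two column conditions can be tested directly. A sketch for prime q, where scalar arithmetic is just arithmetic mod q (for q = 2 the second condition reduces to "all columns distinct"):

```python
def corrects_all_single_errors(H, q=2):
    """Check the theorem's two column conditions on H (q prime)."""
    cols = list(zip(*H))
    # condition 1: no zero column
    if any(all(x == 0 for x in col) for col in cols):
        return False
    # condition 2: no column is a non-zero scalar multiple of another
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            for a in range(1, q):
                if all((a * x) % q == y for x, y in zip(cols[i], cols[j])):
                    return False
    return True

H_good = [(1, 1, 0, 1, 0, 0),
          (1, 0, 1, 0, 1, 0),
          (0, 1, 1, 0, 0, 1)]
H_bad = [(1, 1, 0),
         (0, 0, 1)]   # first two columns are equal

print(corrects_all_single_errors(H_good))  # True
print(corrects_all_single_errors(H_bad))   # False
```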
Binary Hamming Codes
Let r ≥ 1 be an integer.
The binary Hamming code Hr is the binary linear code specified by
an r × n parity-check matrix whose columns are all the distinct,
non-zero binary r-tuples.
I Since there are 2^r − 1 distinct, non-zero binary r-tuples, we
have n = 2^r − 1.
I One way of specifying an r × (2^r − 1) parity-check matrix is

H = [h1 h2 · · · h_{2^r − 1}],

where hi is the r-bit binary representation of i.

I rank(H) = r =⇒ dim(Hr) = n − r = 2^r − 1 − r.
I All columns are distinct and non-zero, but some three columns
sum to 0 (e.g., h1 + h2 = h3) =⇒ dmin(Hr) = 3.
I Thus, Hr is a [2^r − 1, 2^r − 1 − r, 3] binary linear code.
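The parity-check matrix whose i-th column is the r-bit binary representation of i is easy to generate. A sketch:

```python
def hamming_parity_check(r):
    # r x (2^r - 1) matrix; column i (1-indexed) is the r-bit binary
    # representation of i, most significant bit in the top row
    n = 2**r - 1
    return [[(i >> (r - 1 - row)) & 1 for i in range(1, n + 1)]
            for row in range(r)]

H3 = hamming_parity_check(3)
for row in H3:
    print(row)
```

For r = 3 this yields a 3 × 7 matrix whose columns run through 001, 010, ..., 111, i.e., all the distinct non-zero binary 3-tuples.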
Hamming Codes are Perfect Codes

Hr is a [2^r − 1, 2^r − 1 − r, 3] binary linear code.

Hr is a perfect 1-error-correcting binary linear code:

I LHS of Hamming bound: (n choose 0) + (n choose 1) = 1 + (2^r − 1) = 2^r
I RHS of Hamming bound: 2^(n−k) = 2^r

q-ary Hamming Codes
Let Fq be a finite field, and r ≥ 1 an integer.

I From each one-dimensional subspace of Fq^r, pick exactly one
non-zero vector.
I Doing this yields N vectors h1, h2, . . . , hN, which we take to
be the columns of an r × N matrix H.
I The q-ary Hamming code Hr,q is equal to nullspaceFq(H).
I rankFq(H) = r, so dimFq(Hr,q) = N − r.
I By construction, distinct columns of H span distinct 1-D
subspaces, so no column is a multiple of another.
Thus, Hr,q is single-error-correcting =⇒ dmin ≥ 3.
I Take any two columns hi and hj (i ≠ j) of H; their sum is non-zero
(else hi and hj would span the same subspace) and must lie in some
1-D subspace, so hi + hj ∈ span(hk) for some k.
Thus, there exist three linearly dependent columns in H =⇒ dmin = 3.
Determining N
N is the number of distinct 1-D subspaces of Fq^r.
I Each 1-D subspace contains q − 1 non-zero vectors.
I Any pair of distinct 1-D subspaces has only 0 in the
intersection.
I Hence,

N · (q − 1) = no. of non-zero vectors in Fq^r = q^r − 1

Thus, N = (q^r − 1)/(q − 1).

In summary, Hr,q is a [(q^r − 1)/(q − 1), (q^r − 1)/(q − 1) − r, 3]
linear code over Fq.

I It is easy to verify that it is a perfect single-error-correcting
linear code over Fq.
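For prime q, one convenient representative of each 1-D subspace is the vector whose first non-zero entry equals 1; counting these confirms the formula for N. A sketch (the choice of representative is mine — any non-zero vector per subspace would do):

```python
from itertools import product

def hamming_columns(r, q):
    # one representative per 1-D subspace of F_q^r (q prime):
    # the non-zero vectors whose first non-zero entry equals 1
    cols = []
    for v in product(range(q), repeat=r):
        first_nonzero = next((x for x in v if x != 0), None)
        if first_nonzero == 1:
            cols.append(v)
    return cols

# N should equal (q^r - 1)/(q - 1) in each case
for q, r in ((2, 3), (3, 2), (3, 3)):
    cols = hamming_columns(r, q)
    print(q, r, len(cols), (q**r - 1) // (q - 1))
```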
Other Perfect Linear Codes

There are only two other perfect linear codes:


I a [23, 12, 7] triple-error-correcting code over F2
I an [11, 6, 5] double-error-correcting code over F3
These are known as Golay codes.

The fact that there are no other perfect codes was proved by
Tietäväinen (1973), building on the work of van Lint (early 1970s).
What Next?

I Constructing t-error-correcting codes for t ≥ 2 requires
the full machinery of finite fields.

I But before getting into that, we squeeze more out of


elementary combinatorial and linear-algebraic arguments to
explore the trade-offs between the code parameters n, k, d.

I This allows us to give meaningful answers to questions such as


what is the “best” possible code for a given set of parameters.
