

Network Coding: A Brief Tutorial


1 author:

Louai Al-Awami
King Fahd University of Petroleum and Minerals

All content following this page was uploaded by Louai Al-Awami on 04 June 2014.



Network Coding: A Brief Tutorial

Louai Al-Awami

Department of Electrical & Computer Engineering

Queen’s University

April 18, 2007

Abstract

Network coding is a new paradigm that promises to change the way networking is done. In network coding, intermediate nodes combine different packets to make better use of bandwidth and increase throughput. In addition, network coding reduces both delay and energy requirements. Network coding has been proposed for many applications including wireless networks, P2P applications, and network security and management. This report presents an overview of network coding and discusses design issues and applications.

Contents

1 Introduction
2 Mathematical Formulation
  2.1 Encoding
  2.2 Decoding
  2.3 Practical Implementation
    2.3.1 Encoding Random Linear Code
    2.3.2 Decoding Random Linear Code
3 Applications of Network Coding
  3.1 P2P Networks
  3.2 Wireless Networks
  3.3 Sensor Networks
  3.4 Network Tomography
  3.5 Network Security
4 Summary

1 Introduction

In today’s networks, coding is done at the end nodes while intermediate nodes route packets without altering their contents. Network coding changes this convention by allowing intermediate nodes to combine incoming packets into one or more outgoing packets. The main goal is to increase the information content of each transfer. This in turn increases throughput, decreases energy consumption, and increases reliability, as we will see shortly.

To get a feel for this, consider the example shown in Figure 1. In this example, four transmissions are needed to exchange two packets of information between node A and node B through the base station S in a wireless environment. With network coding, however, the last transfer can be eliminated: S combines both packets and broadcasts the combination during the third transfer. Since node A and node B each already hold part of the information (their own packet), they can extract the rest from the combined packet.
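The exchange can be sketched in a few lines of Python. This is only an illustration under simple assumptions: both packets have equal length, XOR is the combining operation, and the names `xor_bytes`, `packet_a`, and `packet_b` are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Transfers 1 and 2: A and B each send their packet to the base station S.
packet_a = b"hello from A"
packet_b = b"reply from B"

# Transfer 3: S broadcasts one coded packet instead of two separate ones.
coded = xor_bytes(packet_a, packet_b)

# Each node cancels its own packet to recover the other one.
received_at_a = xor_bytes(coded, packet_a)
received_at_b = xor_bytes(coded, packet_b)
```

Here `received_at_a` equals `packet_b` and `received_at_b` equals `packet_a`, so three transmissions deliver what plain routing delivers in four.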

Figure 1: Routing vs. Network Coding

Network coding has many advantages over traditional routing for a wide range of applications. The following are among the most important ones:

Increased Throughput: Network coding increases network throughput by increasing the information content of transfers and packets. It also does so by decreasing the number of transfers needed to carry data to single or multiple nodes. For example, a node that must deliver two pieces of data (x and y) can cut the time needed by sending the XORed version (x ⊕ y) instead of each datum separately, provided each receiver already holds one of the two. Network coding benefits not only the nodes employing it but also the whole network, since reducing the number of transfers leaves more bandwidth available for other nodes to use. Another benefit which can be seen in the example is reduced delay: node B receives the data during the third transfer with network coding, as opposed to waiting until the fourth transfer with traditional routing.

These benefits are seen especially in broadcasting and multicasting applications. Furthermore, many unicast applications can also gain throughput when using proper network coding schemes.

Flexibility: With network coding, information is spread across all packets. Consequently, all packets become equally important. What is interesting, though, is that the information can be retrieved by collecting a sufficient number of packets, regardless of which packets they are. This flexibility helps in tackling different problems more easily, e.g. acknowledgments and error correction. This will be illustrated when describing an application of data dissemination in wireless sensor networks.

Reliability: Error correction is implemented in two ways: Forward Error Correction (FEC) and Automatic Repeat reQuest (ARQ). FEC provides optimal delay at the cost of a reduced rate, which results from adding error detection and correction overhead to packets. ARQ, on the other hand, achieves optimal rate at the cost of delay. Network coding can optimize both delay and rate, resulting in a better reliability scheme.

This can be seen from different angles. Since the information about each packet is distributed over different nodes, packets can be acquired from other nodes even when the node to which the packet corresponds disappears. This will become clearer when considering network coding for P2P applications. Another consideration comes from the fact that every packet has been used to construct many encoded packets; thus, by overhearing packets from neighbors, different copies of every packet can be retrieved and used to check the integrity of corrupted ones.

2 Mathematical Formulation

In coding theory, the mathematical treatment of coding schemes is carried out using abstract algebra. This section presents how encoding and decoding are represented and performed mathematically. For simplicity, we will present linear coding, as it has been shown that the best benefits of network coding can be achieved with linear codes [4]. At the end of the section we will present an exemplary family of codes, called digital fountain codes, which demonstrates what a real implementation of network coding looks like.

2.1 Encoding

Let $M^1, M^2, \ldots, M^n$ be $n$ messages generated by one or multiple sources. Also, let $g_1, g_2, \ldots, g_n$ be a set of coefficients drawn from the field $\mathbb{F}_{2^s}$. For example, with $s = 2$ the field is $\mathbb{F}_{2^2} = \{00, 01, 10, 11\}$.

An encoded message can then be written as

$$X = \sum_{i=1}^{n} g_i M^i.$$

More specifically, the addition is applied at each symbol position $k$ of every message $M^i$ (a bit position when $s = 1$), i.e. every symbol of $X$, denoted $X_k$, can be written as

$$X_k = \sum_{i=1}^{n} g_i M^i_k.$$

Encoded messages contain both the encoding coefficients $g = (g_1, g_2, \ldots, g_n)$, called the encoding vector, and the encoded information $X$, called the information vector. The receiver uses the encoding vector to decode the information vector and recover the original information.
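As a rough illustration of the encoding sum, the sketch below works in the example field with $s = 2$. It assumes field elements are stored as the integers 0 to 3 and multiplied modulo the polynomial $x^2 + x + 1$; the function names `gf4_mul` and `encode` and the sample values are hypothetical.

```python
def gf4_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^2), reduced modulo x^2 + x + 1."""
    result = 0
    for _ in range(2):          # two bits per field element
        if b & 1:
            result ^= a         # addition in GF(2^s) is XOR
        b >>= 1
        a <<= 1
        if a & 0b100:           # degree reached 2: reduce by x^2 + x + 1
            a ^= 0b111
    return result


def encode(coeffs, messages):
    """Return X with X_k = sum_i g_i * M^i_k, computed symbol by symbol."""
    length = len(messages[0])
    encoded = [0] * length
    for g, message in zip(coeffs, messages):
        for k in range(length):
            encoded[k] ^= gf4_mul(g, message[k])
    return encoded


messages = [[1, 2, 3], [3, 0, 1]]   # M^1 and M^2, three symbols each
g = [2, 3]                          # encoding vector (g_1, g_2)
X = encode(g, messages)             # information vector
```

A packet on the wire would carry both `g` and `X`, matching the encoding vector and information vector described above.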

2.2 Decoding

Decoding can be viewed as solving a system of equations: given the received packets $X^j$, the problem is to find values of $M^1, M^2, \ldots, M^n$ that satisfy every $X^j$. Recall that $X^j = \sum_{i=1}^{n} g_i^j M^i$. To solve this system of equations, we need $m$ packets (equations) where $m \geq n$.


One question that arises here is: how many packets are enough? Clearly, if some of the equations are linearly dependent, we will need $m > n$. This problem is addressed by choosing the right linear code, as we will see next. To clarify how decoding can be done, consider one way of doing it. Each node maintains a decoding matrix that contains the encoded words received so far, as well as the code words sent by the node itself, together with their encoding coefficients. This matrix can be viewed as a system of equations. We can then perform Gaussian elimination on the matrix to solve for the unknowns, i.e. $M^1, M^2, \ldots, M^n$.
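As a small worked example, assume $n = 2$, coefficients in $\mathbb{F}_2$ (so addition is XOR), and two received packets with encoding vectors $(1, 1)$ and $(0, 1)$; these values are chosen only for illustration. Adding the second row of the decoding matrix to the first gives

$$
\left(\begin{array}{cc|c} 1 & 1 & X^1 \\ 0 & 1 & X^2 \end{array}\right)
\;\longrightarrow\;
\left(\begin{array}{cc|c} 1 & 0 & X^1 + X^2 \\ 0 & 1 & X^2 \end{array}\right)
$$

so that $M^1 = X^1 + X^2$ and $M^2 = X^2$. Had the second packet also carried the vector $(1, 1)$, the two rows would be dependent and a third packet would be needed.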

This raises several design issues regarding the size of the decoding matrix and the delay until enough independent vectors are received to decode.

Delay: Decoding introduces some delay because received packets have to wait for other (independent) packets to arrive before they can be decoded. Note, however, that it is not always necessary to wait for all packets: whenever Gaussian elimination produces a row of the form $(e_i, M^i)$, where $e_i$ is the vector with all zeros except in the $i$th location, message $M^i$ is decoded immediately. Thus in the worst case the average delay over packets will be that of non-network-coding schemes. Also note that the reduction in the number of transmissions required results in a further reduction in delay, so the total delay is in fact lowered. A fountain encoding/decoding scheme called LT code was proposed in [15]; it achieves low-delay encoding/decoding without the complication of Gaussian elimination. This scheme was further improved in [19] by introducing Raptor code, which achieves linear encoding/decoding time.

Decoding Matrix Size: Another important consideration is the size of the decoding matrix, which is a function of the field dimension. This is especially important for systems with limited memory and processing power when using random codes. This issue has been addressed by the notion of generations, where packets are grouped into generations and only packets of the same generation are combined in one matrix. The size of the matrix imposes a tradeoff between the performance of the code and the memory requirements: a larger matrix means a larger generation and hence a longer history, which speeds decoding and boosts throughput, but at the cost of more memory.

2.3 Practical Implementation

It might seem difficult to implement network coding. In this part one possible
approach will be discussed. The approach might not be the best, but it is
simple and enough to illustrate the main idea.

One way to implement network coding is by using fountain codes [16]. Fountain codes are a class of erasure codes with the property that an unlimited sequence of encoded symbols can be generated from the source symbols, such that the original symbols can be recovered from a subset of the encoded symbols.

There is a clear harmony between fountain codes and network coding, which makes them an attractive choice for implementing network coding. There are many codes in the fountain family, such as Random Linear Code, LT code, Raptor code, and Online code. Random linear code is the simplest coding scheme in the fountain family. Let’s see how encoding and decoding are done for this code.

2.3.1 Encoding Random Linear Code

Suppose we have a file to send. We first break it down into $K$ packets $(s_1, s_2, \ldots, s_K)$, each of length $l$. Then, we prepare each encoded packet as a linear combination of the original packets. Let $t_n$ be the $n$th generated packet. $t_n$ can be encoded as

$$t_n = \sum_{k=1}^{K} s_k G_n^{(k)}$$

where $G_n^{(k)}$ is a randomly generated bit (0 or 1).

The sum over all packets is a bitwise modulo-2 sum, which can be calculated using a simple bitwise XOR. For each encoded packet $t_n$, a new $K$-bit vector $G_n = \{G_n^{(1)}, G_n^{(2)}, \ldots, G_n^{(K)}\}$ is generated and used to produce $t_n$. Note that for this simple scheme, $t_n$ is just the result of XORing all packets $s_k$ for which the corresponding bit in the vector $G_n$ is 1.

It is useful to think of the concatenation of all generated encoding vectors as a generator matrix $G$; see Figure 2.
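The encoding step can be sketched as follows. This is a minimal illustration, assuming packets are equal-length byte strings and that Python's `random` module stands in for the sender's bit generator; the names `combine` and `encode_packet` are hypothetical.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def combine(vector, source):
    """XOR together the source packets s_k whose bit G_n^(k) is 1."""
    coded = bytes(len(source[0]))       # all-zero packet of length l
    for bit, packet in zip(vector, source):
        if bit:
            coded = xor_bytes(coded, packet)
    return coded


def encode_packet(source, rng):
    """Draw a fresh random K-bit vector G_n and return (G_n, t_n)."""
    vector = [rng.randint(0, 1) for _ in source]
    return vector, combine(vector, source)


source = [b"AAAA", b"BBBB", b"CCCC"]    # K = 3 packets of length l = 4
rng = random.Random(2007)               # seeded only to make the run repeatable
vector, coded = encode_packet(source, rng)
```

Calling `encode_packet` repeatedly yields as many coded packets as needed, each with its own row of the generator matrix.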

2.3.2 Decoding Random Linear Code

In order for the receiver to be able to recover the data, it must have the encoding matrix $G$. There are two strategies for that: either the receiver and sender maintain synchronized random number generators, or the sender appends the encoding vector to the packet header. The former reduces the overhead, but maintaining synchronization is costly, while the latter is simple but implies some overhead. Adding the encoding vector to the header becomes more attractive when noting that its relative overhead can be reduced by increasing the packet size.

Figure 2: Encoding/Decoding Random Linear Fountain Codes.

Figure 2 shows that the receiver can build a generator matrix using the generator vectors of the received packets. This generator matrix can in turn be used to recover the original file using Gaussian elimination. In other words, decoding reduces to solving a system of linear equations. Hence the decoded packets can be found by

$$s_k = \sum_{n=1}^{N} t_n \left(G^{-1}\right)_n^{(k)}$$

where $G^{-1}$ is the inverse of the generator matrix built from the received encoding vectors.
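One way to carry out this step is Gauss-Jordan elimination over GF(2), sketched below. For brevity the sketch assumes each packet payload is a Python integer (so XOR is the `^` operator) and that the encoding vector arrives with each packet; `combine` and `decode` are hypothetical names.

```python
def combine(vector, source):
    """XOR together the source payloads selected by a binary encoding vector."""
    out = 0
    for bit, payload in zip(vector, source):
        if bit:
            out ^= payload
    return out


def decode(vectors, packets, k):
    """Recover k source payloads, or return None if the rank is below k."""
    rows = [(list(v), p) for v, p in zip(vectors, packets)]
    for col in range(k):
        # Find a row at or below position `col` with a 1 in this column.
        pivot = next((r for r in range(col, len(rows)) if rows[r][0][col]), None)
        if pivot is None:
            return None                     # not enough independent packets
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_vec, pivot_payload = rows[col]
        # XOR the pivot row into every other row that has a 1 in this column.
        for r in range(len(rows)):
            if r != col and rows[r][0][col]:
                vec, payload = rows[r]
                rows[r] = ([a ^ b for a, b in zip(vec, pivot_vec)],
                           payload ^ pivot_payload)
    return [rows[i][1] for i in range(k)]


source = [0x11, 0x22, 0x33]
vectors = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
packets = [combine(v, source) for v in vectors]
recovered = decode(vectors, packets, k=3)
```

With these three independent vectors `recovered` equals `source`; if two of the vectors were identical, `decode` would return `None` and the receiver would wait for another packet.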

Recall that $K$ is the number of packets constituting the original file. Now let $N$ be the number of received packets. How large should $N$ be in order for the decoder to be able to recover the original file? Clearly, for $N < K$ the system of equations is not solvable. For $N = K$ the system is solvable provided that all $N$ packets are linearly independent. It can be shown analytically that for $K > 10$, the probability that the first $N = K$ received packets are all linearly independent is only about 0.289. This leaves a probability of 0.711 that the decoder fails to decode the original file.
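The 0.289 figure can be checked numerically. The sketch below assumes every encoding bit is drawn independently and uniformly, in which case the $i$th received vector must fall outside the span of the previous $i - 1$ vectors, giving the product $\prod_{j=1}^{K} (1 - 2^{-j})$; the function name `prob_full_rank` is hypothetical.

```python
def prob_full_rank(k: int) -> float:
    """Probability that k random k-bit vectors over GF(2) are linearly independent."""
    p = 1.0
    for j in range(1, k + 1):
        p *= 1.0 - 2.0 ** (-j)
    return p


p10 = prob_full_rank(10)     # roughly 0.289
p100 = prob_full_rank(100)   # barely changes as K grows
```

The product converges quickly, which is why the value is essentially the same for every $K$ above about 10.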

The third possibility is $N = K + E$, where $E$ is some small number representing excess packets. In this case the decoder receives a few more packets, hoping that they will yield $K$ independent packets. Surprisingly, a very small value of $E$ can greatly increase the probability of recovering the original packets. In addition, this value of $E$ is almost the same even for large $K$.

Figure 3: Performance of Random Linear Code with Excess Packets.

Figure 3 shows that for $E = 7$, the probability of not getting enough linearly independent packets (or equivalently, of an unsolvable system of equations) is almost zero. This is equivalent to saying that receiving seven more packets almost guarantees enough independent packets (equations) and hence successful decoding. Moreover, the dotted curve shows the upper bound on this probability, which surprisingly does not depend on $K$. For a failure probability $\delta$, the corresponding value of $N$ is

$$N \approx K + \log_2\left(\frac{1}{\delta}\right)$$
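As a quick arithmetic check of this relation, take an illustrative target of $\delta = 0.01$ and $K = 1000$ (both values are assumptions chosen for the example):

$$
E = \log_2\left(\frac{1}{0.01}\right) = \log_2 100 \approx 6.64
\quad\Longrightarrow\quad
N \approx 1000 + 7 = 1007 .
$$

Reading the relation the other way, $E = 7$ excess packets correspond to $\delta = 2^{-7} = 1/128 \approx 0.0078$, which agrees with the near-zero failure probability read from Figure 3.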

Unfortunately, this performance has some computational cost. Since on average the encoding vector contains $\frac{K}{2}$ 1’s, the expected cost of encoding each packet involves $\frac{K}{2}$ XOR operations. The decoding involves inverting the matrix ($G \rightarrow G^{-1}$), which costs $K^3$, in addition to an average of $\frac{K^2}{2}$ for applying the inverse matrix to the received packets.


Fortunately, there exist other fountain codes with better performance. For example, LT code has an encoding/decoding complexity of $O(K \log_e K)$, and Raptor code achieves linear encoding/decoding time, $O(K)$.

3 Applications of Network Coding

Because of its potential efficiency, network coding has been proposed for
many applications in computer networks. Next, we will see some of these

applications and show how network coding can improve the way networking
is done.

3.1 P2P Networks

In Peer-to-Peer (P2P) networks, a server distributes a file by breaking it down into small blocks. Clients, in turn, download those small blocks from the server and distribute them among their neighbors. Network coding has been used in a P2P prototype called Avalanche [1][9], developed by Microsoft, where the blocks sent by servers and clients are random combinations of the original blocks.

In Avalanche, the server divides the original file into smaller blocks $B_1, B_2, \ldots, B_n$. Instead of receiving the actual blocks, a client receives linear combinations of the original blocks. For instance, if the client requests two blocks $E_1$ and $E_2$, it will be sent

$$E_i = c_{i1} B_1 + c_{i2} B_2 + \ldots + c_{in} B_n \quad \text{for } i = 1, 2$$

where the $c_{ij}$ are chosen randomly from the base field $F$. Along with each block, a coefficient vector $u$ is sent so that the client is able to decode the original blocks.

A similar operation takes place in every group of clients constituting a neighborhood, where every client sends a linear combination of the blocks it has instead of raw blocks. So, if a client is to send out the blocks $E_1$ and $E_2$ it has just received, the output block $E_3$ will be formed using

$$E_3 = c_{31} E_1 + c_{32} E_2$$

Note that in terms of the original blocks, $E_3$ can be written as

$$E_3 = \sum_{i=1}^{n} (c_{31} c_{1i} + c_{32} c_{2i}) B_i$$

Since all the coefficients are sent with the data blocks, recovering the original blocks reduces to solving a linear system of equations [20].
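The re-encoding identity can be checked with a short sketch. The source does not specify the base field $F$, so the code below assumes integer arithmetic modulo a small prime `P` purely as a stand-in; the block contents, the coefficients, and the name `combine` are all hypothetical.

```python
P = 257  # hypothetical prime modulus standing in for the base field F


def combine(coeffs, blocks):
    """Return sum_j coeffs[j] * blocks[j], computed symbol by symbol modulo P."""
    length = len(blocks[0])
    return [sum(c * b[k] for c, b in zip(coeffs, blocks)) % P
            for k in range(length)]


# Original blocks B_1..B_3, each a short list of symbols in [0, P).
B = [[10, 20], [30, 40], [50, 60]]

# Coefficient vectors used by the server for E_1 and E_2.
c1 = [3, 5, 7]
c2 = [2, 4, 6]
E1 = combine(c1, B)
E2 = combine(c2, B)

# A client re-encodes the two coded blocks it holds with fresh coefficients.
c31, c32 = 11, 13
E3 = combine([c31, c32], [E1, E2])

# Coefficient vector of E_3 expressed in terms of the original blocks.
c3 = [(c31 * a + c32 * b) % P for a, b in zip(c1, c2)]
```

Here `E3` equals `combine(c3, B)`, so a client can forward `c3` with `E3` without ever decoding the original blocks.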

This paradigm of P2P networks has several advantages over traditional P2P systems. First, it reduces download time. This is mainly because existing P2P systems depend on knowing the topology of the distributed file, i.e. which pieces are available where, or who has which piece. In Avalanche, this is overcome by distributing coded blocks: clients only have to collect enough coded blocks to be able to reconstruct the original file. In [9], it has been shown that Avalanche reduces the download time by 20% to 30% compared to traditional P2P systems. Second, since the file is disseminated all over the network, the system is robust against situations where the server disappears before the download completes, and against high churn rates, where clients join just to download and leave immediately after finishing. Third, Avalanche does not suffer much when cooperation-forcing mechanisms are implemented [8].

3.2 Wireless Networks

Bidirectional Traffic in Wireless Network: Network coding can yield large improvements in wireless networks by exploiting the broadcast nature of the wireless medium. As shown in Figure 1, the throughput can be increased for bidirectional transfers when using a shared base station. Let’s extend the picture to a bigger scenario with many intermediate routers. After a few transient steps, every middle router will be transferring two packets per transfer to its two adjacent routers, almost doubling the throughput and reducing the bandwidth consumption. Furthermore, overhearing the encoded packet from a neighbor can serve as an implicit acknowledgment, which saves the bandwidth otherwise allocated to such control messages.

This scheme can be useful for the many applications that require information exchange between two nodes, such as telephony, video conferencing, and instant messaging. In [13], a distributed implementation is discussed that handles cases where transmissions in a lossy wireless channel with random delay are not synchronized.

Residential Wireless Mesh Networks: In [12], a practical implementation of network coding has been demonstrated for wireless mesh networks. In the same paper, a coding scheme called COPE has also been shown to double the throughput of the IEEE 802.11 wireless mesh network standard.

In this scheme, nodes broadcast linear combinations of the packets they have, and annotate each transmission with information about the packets they hold. Since nodes within the vicinity of each other receive similar packets, they can decode new packets. In the big picture, intermediate nodes encode packets corresponding to multiple unicast flows.

Many-to-Many Broadcasts: Network-wide broadcasts are used for many purposes in wireless networks, e.g. topology discovery and routing. When used for broadcasting in wireless ad hoc networks, network coding has been shown to reduce the energy-per-bit, which is the amount of energy spent on transmitting one bit of information.

In [14], the authors propose an algorithm for broadcasting in wireless ad hoc networks. The algorithm relies on the fact that intermediate nodes can combine the packets corresponding to two different broadcast flows and hence help spread the information using half the number of transmissions required by traditional methods.

3.3 Sensor Networks

Untuned Radios in Sensor Networks: Wireless sensor networks promise to change people’s lives by enabling applications in a wide range of areas. Nevertheless, implementing many such applications requires minimizing the cost, size, and energy consumption of sensor chips. This is better understood when realizing that wireless sensor networks are formed using a very large number of sensors.

In [17], the authors propose replacing the expensive antennas used to build sensor networks with low-cost ones, resulting in a large cost saving. However, low-cost chips might not guarantee that all neighboring nodes can tune to each other, and consequently some nodes might not be able to communicate. This problem can be overcome using network coding: nodes do not need to communicate with all other nodes but only with a few, since the data stored by each node represents the data of many nodes. The only condition for this to work is having a very dense network, to increase the probability of finding the required data.

Data Gathering in Sensor Networks: In some sensor networks the goal is to disseminate data across the network such that when the sink node (a node responsible for collecting data) wants some data, it can contact any node in the network and get the required data. One problem with such networks is the increased memory requirement imposed by requiring each node to hold a large amount of data.

In [5], an interesting solution is proposed using network coding. Each node is required to have memory space for only one data element plus the coefficients used to encode data. Nodes keep overhearing their neighbors and encode the overheard data, together with their own data, by multiplying them with random coefficients, so the data stored in each node represents a linear combination of the data of all other nodes. To collect data from the network, the sink node only needs to contact any $n$ nodes in the network to be able to decode $n$ pieces of data.

3.4 Network Tomography

Among the goals of network monitoring solutions is providing performance metrics about the network, in addition to discovering changes in topology if some nodes fail. Network monitoring is implemented by placing probes around the network that frequently check for nodes by sending, for example, ping or SNMP requests. Such an approach to monitoring adds overhead load on the network resulting from the excess monitoring traffic.

Some proposals suggest using network coding to reduce this traffic [18][7][6]. Given that the probe knows the coefficients used by each node to encode, it can analyze the received messages to check the state of the nodes. In this way, explicit monitoring traffic is eliminated and the status of nodes can be inferred from the received packets.

3.5 Network Security

Network coding has been shown to improve security against several types of attacks, including eavesdropping and data modification, while jamming attacks remain a challenge. Since data does not exist on one node but rather is distributed, overhearing data packets will not reveal the information unless all the required blocks are obtained, which makes eavesdropping harder [2][3].

Since data is distributed all over the network, it is easy to discover if any piece was changed. This protects against data modification attacks [11]. It can also help in correcting errors in the data. On the other hand, network coding suffers considerably from jamming attacks, where an attacker injects a corrupted block which progressively propagates into the whole network. A solution for this problem has been suggested in [10], where nodes check received blocks on-the-fly and inform each other if a corrupted block is found.

4 Summary

Network coding is a very promising paradigm with many advantages, including increased throughput, reliability, and flexibility, as well as decreased delay, bandwidth, and power consumption. Network coding has been proposed for many applications including wireless networks, P2P applications, and network security and management. Finding practical implementations of network coding is currently an active research topic and is likely to remain so for some time.

References

[1] Avalanche, File Swarming with Network Coding, http://research.microsoft.com/ pablo/avalanche.aspx.

[2] K. Bhattad and K. R. Narayanan. Weakly Secure Network Coding. In NetCod 2005, Apr. 2005.

[3] Ning Cai and R. W. Yeung. Secure Network Coding. In Proceedings of the 2002 IEEE International Symposium on Information Theory, 2002.

[4] S.-Y. R. Li, R. W. Yeung, and N. Cai. Linear Network Coding. IEEE Transactions on Information Theory, 49:371–381, 2003.

[5] Alexandros G. Dimakis, Vinod Prabhakaran, and Kannan Ramchandran. Ubiquitous Access to Distributed Data in Large-Scale Sensor Networks Through Decentralized Erasure Codes. In IPSN ’05: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, page 15, Piscataway, NJ, USA, 2005. IEEE Press.

[6] C. Fragouli and A. Markopoulou. A Network Coding Approach to Overlay Network Monitoring. In Proceedings of the 43rd Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 2005.

[7] C. Fragouli and A. Markopoulou. Network Coding Techniques for Network Monitoring: A Brief Introduction. In Proceedings of the IEEE International Zurich Seminar (IZS) on Communications, pages 82–83, ETH, Zurich, Feb. 2006.

[8] Christina Fragouli, Jean-Yves Le Boudec, and Jorg Widmer. Network Coding: An Instant Primer. SIGCOMM Comput. Commun. Rev., 36(1):63–68, 2006.

[9] C. Gkantsidis and P. Rodriguez. Network Coding for Large Scale Distribution. In IEEE Infocom, Miami, FL, March 2005.

[10] Pablo Rodriguez and Christos Gkantsidis. Cooperative Security for Network Coding File Distribution. In IEEE Infocom, 2006.

[11] Tracey Ho, Ben Leong, Ralf Koetter, Muriel Médard, Michelle Effros, and David Karger. Byzantine Modification Detection in Multicast Networks Using Randomized Network Coding. In Proceedings of the 2004 IEEE International Symposium on Information Theory (ISIT), June 2004.

[12] S. Katti, D. Katabi, Wenjun Hu, and Rahul Hariharan. The Importance of Being Opportunistic: Practical Network Coding for Wireless Environments. In Proceedings of the 43rd Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 2005.

[13] Yunnan Wu, Philip A. Chou, and S. Y. Kung. Information Exchange in Wireless Networks with Network Coding and Physical-Layer Broadcast. Technical Report MSR-TR-2004-78, Microsoft Research, August 2004.

[14] J. Widmer, C. Fragouli, and J. Y. Le Boudec. Energy Efficient Broadcasting in Wireless Ad hoc Networks. In NetCod 2005.

[15] Michael Luby. LT Codes. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, pages 271–282, 2002.

[16] D. J. MacKay. Fountain Codes. IEE Proceedings - Communications, volume 152, pages 1062–1068, Dec. 2005.

[17] Dragan Petrović, Kannan Ramchandran, and Jan Rabaey. Overcoming Untuned Radios in Wireless Networks with Network Coding. IEEE/ACM Trans. Netw., 14(SI):2649–2657, 2006.

[18] T. Ho, B. Leong, Y. Chang, Y. Wen, and R. Koetter. Network Monitoring in Multicast Networks Using Network Coding. In Proceedings of ISIT 2005.

[19] A. Shokrollahi. Raptor Codes. In Proceedings of the International Symposium on Information Theory (ISIT 2004), 2004.

[20] R. W. Yeung. Avalanche: A Network Coding Analysis. Preprint, 2005.
