
Data Link Layer (chapter 3)

Data link layer


The data link layer uses the services of the physical layer to send and receive bits over
communication channels. It has a number of functions, including:
1. Providing a well-defined service interface to the network layer
2. Dealing with transmission errors
3. Regulating the flow of data so that slow receivers are not swamped by fast senders

The data link layer adds a header and a trailer to the packets received from the network layer.

Services
• Unacknowledged connectionless service
  • The receiver does not acknowledge receipt of frames
  • Frames are sent with no connection set-up or error recovery
  • Good when the error rate is very low
• Acknowledged connectionless service
  • The receiver acknowledges receipt of each frame
  • Frames are retransmitted if needed
  • Can be expensive, but for unreliable channels such as wireless it's worth it
• Acknowledged connection-oriented service
  • A connection is established before data is sent
  • Makes sure each frame is received in the right order, exactly once
  • Rare, only used for very unreliable connections

Framing methods
These methods take care of turning bits (from the physical layer) into frames.

In appendix A you'll find images illustrating these methods.

Byte count: frames start with a count giving the frame's length. Very error prone: a
single error in the count desynchronizes the receiver, so all following frames are
misinterpreted as well.

Byte stuffing: Use a flag byte to indicate the start and end of a frame. However, this method is
also suboptimal as the flag might also become corrupted. A solution to this problem is using
an escape byte, which escapes the flag byte. This might still cause problems though as the
escape sequence might be in the data too. This can be solved by escaping the escape
sequence itself (which can of course be very inefficient).
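A minimal sketch of byte stuffing with a flag and escape byte (the FLAG and ESC values are arbitrary choices for illustration; real protocols such as PPP use a slightly different escaping rule):

```python
FLAG = 0x7E  # frame delimiter (illustrative value)
ESC = 0x7D   # escape byte (illustrative value)

def byte_stuff(payload: bytes) -> bytes:
    """Escape FLAG and ESC bytes in the payload, then wrap it in FLAG bytes."""
    out = bytearray([FLAG])
    for b in payload:
        if b in (FLAG, ESC):
            out.append(ESC)  # escape the special byte
        out.append(b)
    out.append(FLAG)
    return bytes(out)

def byte_unstuff(frame: bytes) -> bytes:
    """Strip the delimiters and undo the escaping."""
    body = frame[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] == ESC:
            i += 1  # skip the escape byte, keep the next byte literally
        out.append(body[i])
        i += 1
    return bytes(out)
```

Stuffing and destuffing are inverses, so a payload that happens to contain the flag or escape byte still comes through unchanged.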

Bit stuffing: A disadvantage of byte stuffing is that it's tied to use of 8-bit bytes. Bit stuffing
(developed for the HDLC protocol) inserts a special bit pattern at the start and end of a frame.
The flag pattern is 01111110. To prevent this pattern from occurring in the data, the sender's
data link layer inserts a 0 bit after every run of 5 consecutive 1s. The receiver's data link
layer removes these stuffed bits again (destuffing).
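A minimal sketch of bit stuffing and destuffing, on '0'/'1' strings for readability:

```python
def bit_stuff(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == '1' else 0
        if run == 5:
            out.append('0')  # stuffed bit
            run = 0
    return ''.join(out)

def bit_destuff(bits: str) -> str:
    """Remove the 0 that follows every run of five 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        out.append(bits[i])
        run = run + 1 if bits[i] == '1' else 0
        if run == 5:
            i += 1  # drop the stuffed 0
            run = 0
        i += 1
    return ''.join(out)
```

Note that even the flag pattern itself survives a round trip: the five 1s in 01111110 get a 0 stuffed after them, which destuffing removes again.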

For safety, a combination of these methods is often used.

Flow control
To prevent data being sent to the receiver too quickly, there are two flow control methods:
• Feedback-based flow control: the receiver sends information to the sender giving it
permission to send more data or tell it how the receiver is doing
• Rate-based flow control: the protocol has a built-in mechanism to limit transfer rate

Error detection and error correction


There is a distinction between error detection codes and error correction codes (FEC, Forward
Error Correction). The former is used in highly reliable channels, because you just resend a
faulty block. The latter is used when there are many errors, by adding redundancy to the block.

General idea
Error correcting codes add redundancy to the information that is sent. A frame consists of m
data (message) bits and r redundant bits.
• In a block code, the r check bits are computed solely as a function of the m data bits
with which they are associated
• In a systematic code, the sent message contains both the m and r bits
• In a linear code, the r check bits are computed as a linear function of the m data bits. A
popular function for this is the XOR

An n-bit (n = m + r) unit containing the message and redundancy bits is referred to as a
codeword. The code rate is the fraction of the codeword that carries information that is not
redundant: m/n.

The general strategy for error correction and detection is that we have a number of "valid"
patterns
• Any data section is valid
• Not every codeword is valid
• The number of valid codewords is a very small subset of all possible bitstrings of length
n, so it is very unlikely that a small number of errors changes one legal pattern into
another legal pattern
• Price to pay: lots of extra check bits
• Up to a certain error rate we can detect and correct the errors, above that rate it's not
usable.

Error correcting codes


Basic idea: if illegal pattern, find the legal pattern closest to it. That might be the original data
(before errors corrupted it).

Given two bitstrings, XOR gives you the number of bits that are different. This is the Hamming
distance. If two codewords are Hamming distance d apart, it will take d one-bit errors to
convert one into the other.
• To detect (but not correct) up to d errors per length n, you need a coding scheme where
codewords are at least d + 1 apart in Hamming distance. Then d errors can't change into
another legal code, so we know there's been an error.
• To correct d errors, need codewords 2d + 1 apart. Then even with d errors, bitstring will
be d away from original and d + 1 away from nearest legal code. Still closest to original.
Original can be reconstructed.
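The XOR-and-count idea can be written down directly (a trivial sketch):

```python
def hamming_distance(a: int, b: int) -> int:
    """XOR marks the differing bit positions; count the 1s."""
    return bin(a ^ b).count('1')

# Two bitstrings differing in exactly two positions:
hamming_distance(0b10011010, 0b10111000)  # → 2
```

With distance 2 between these two words, a code containing both could not even reliably detect 2-bit errors; the d + 1 and 2d + 1 bounds above say how far apart codewords must be.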

Hamming Codes
We place parity bits at the positions 2^i, for i = 0, 1, 2, …; all other positions hold data
bits. For example, a code with 7 data bits and 4 check bits is an (11,7) Hamming code: the
parity bits occupy positions 1, 2, 4 and 8 of the 11-bit codeword. Hamming codes use the
theoretically minimal number of check bits, choosing m and r such that (m + r + 1) ≤ 2^r.

Each check bit checks a number of data bits and each check bit checks a different collection of
data bits. To see which check bits check a certain data bit in position k, rewrite k as a sum of
powers of 2, e.g. 11 = 1 + 2 + 8. Thus bit 11 is checked by bits 1, 2 and 8. This combination of
check bits ONLY checks bit 11.

Calculating Hamming Code


Each parity bit calculates the parity for some of the bits in the code words, depending on the
position of the parity bit, in the following sequence:
• Position 1: check 1 bit, skip 1 bit, etc.: 1, 3, 5, 7, …
• Position 2: check 2 bits, skip 2 bits, etc.: 2, 3, 6, 7, 10, 11, …
• Etc.

In the case of even parity:


If the total number of ones in the checked positions is odd, the parity bit will become 1, else it
will be 0.

In the case of odd parity:


If the total number of ones in the checked positions is odd, the parity bit will be 0, else it will
become 1.

Checking and fixing works the same way: check for every parity bit whether it is still correct.
If, for example, parity bits 1 and 4 are wrong, then data bit 5 (= 1 + 4) has flipped.
Example
The data is 10011010. Marking with parity bits gives p1  p2  d3  p4  d5  d6  d7  p8  d9  d10  d11  d12 , for
convenience we'll notate it as _ _ 1 _ 0 0 1 _ 1 0 1 0

• Parity bit p1 is given by ? _ 1 _ 0 0 1 _ 1 0 1 0, checking positions 1, 3, 5, 7, 9 and 11.
There are 4 ones, even, so the parity bit becomes 0
• Parity bit p2 is given by 0 ? 1 _ 0 0 1 _ 1 0 1 0, checking positions 2, 3, 6, 7, 10 and 11.
There are 3 ones, odd, so the parity bit becomes 1
• Parity bit p4 is given by 0 1 1 ? 0 0 1 _ 1 0 1 0, checking positions 4, 5, 6, 7 and 12.
There is 1 one, odd, so the parity bit becomes 1
• Parity bit p8 is given by 0 1 1 1 0 0 1 ? 1 0 1 0, checking positions 8, 9, 10, 11 and 12.
There are 2 ones, even, so the parity bit becomes 0

This gives the codeword 011100101010
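The procedure above can be reproduced with a short sketch. Even parity is assumed, as in the example; data bits fill the non-power-of-two positions:

```python
def hamming_encode(data: str) -> str:
    """Even-parity Hamming code: parity bits sit at power-of-two
    positions, data bits fill the remaining positions (1-indexed)."""
    # Grow the codeword until every data bit has a non-power-of-two slot.
    total, placed = 0, 0
    while placed < len(data):
        total += 1
        if total & (total - 1):          # not a power of two -> data slot
            placed += 1
    code = [0] * (total + 1)             # index 0 unused
    bits = iter(data)
    for pos in range(1, total + 1):
        if pos & (pos - 1):
            code[pos] = int(next(bits))
    p = 1
    while p <= total:
        # Parity bit p covers every position whose binary expansion contains p.
        code[p] = sum(code[k] for k in range(1, total + 1) if k & p) % 2
        p <<= 1
    return ''.join(map(str, code[1:]))

hamming_encode('10011010')  # → '011100101010', as in the worked example
```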

Convolutional codes
Very complex; used in WiFi. Overview: a convolutional code operates on a stream of bits,
keeping internal state:
• Output stream is a function of all preceding input bits
• Bits are decoded with the Viterbi algorithm

Other codes
• Reed-Solomon codes
• Low-Density Parity Check codes

Error detecting codes


Parity
If we have a low error-rate channel, sending many correction bits is a waste of bandwidth. It's
quicker to just check and resend if needed.

Adding a parity bit to the end of the bitstring, so that the number of 1 bits is even, detects
any single-bit flip. But for burst errors (multiple flipped bits after each other), the chance of
detection is only 1/2 (unacceptable). To improve upon this we can divide the bitstring into a
k × n matrix with a parity bit per row, allowing us to detect up to k errors, as long as there is
at most 1 error per row.

We can further improve on this by using interleaving: computing the parity bits over the
data in a different order than the order in which the data bits are transmitted. We compute the
parity bits over the columns but transmit the data row by row. Some longer burst errors still go
undetected though (a burst error only means that the first and last bits in a range are wrong;
the bits in between might be correct).
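A minimal sketch of the interleaving idea (assumed layout: data sent row by row, with the column parities appended at the end). A burst of length ≤ n hits each column at most once, so every flipped bit lands in a different parity check:

```python
def parity(bits):
    # Even parity: 1 if the number of 1s is odd, else 0.
    return sum(bits) % 2

def encode_interleaved(data, k, n):
    """Arrange k*n data bits into k rows of n columns, compute one
    parity bit per column, and transmit the data row by row followed
    by the column parities."""
    rows = [data[i * n:(i + 1) * n] for i in range(k)]
    col_parity = [parity([row[j] for row in rows]) for j in range(n)]
    return [b for row in rows for b in row] + col_parity

# 8 data bits as a 2x4 matrix; the last 4 transmitted bits are column parities.
encode_interleaved([1, 0, 1, 1, 0, 1, 1, 0], 2, 4)
```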

Checksums
The word "checksum" is often used to mean a group of check bits associated with a message,
regardless of how they are calculated. A checksum treats the data as n-bit words and adds n
check bits that are the modulo-2^n sum of the words.

An example of a checksum is the Internet checksum, a sum over the message divided into
16-bit words. It is very efficient, but it provides no protection against the deletion or
addition of zero-valued data, or against swapping parts of the message.

A better choice is Fletcher's checksum, as it includes a positional component, adding the
product of the data and its position to the running sum.
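A minimal sketch of Fletcher's checksum (the common 16-bit, modulo-255 variant; not taken from this summary itself). Because sum2 grows by sum1 at every step, each data byte is effectively weighted by its position, so reordered bytes change the result:

```python
def fletcher16(data: bytes) -> int:
    """Fletcher's checksum: sum2 is a running sum of sums, which
    weights each byte by its position in the message."""
    sum1 = sum2 = 0
    for b in data:
        sum1 = (sum1 + b) % 255
        sum2 = (sum2 + sum1) % 255  # positional component
    return (sum2 << 8) | sum1

# A plain sum cannot tell these two messages apart; Fletcher's can:
fletcher16(b'\x01\x02') != fletcher16(b'\x02\x01')  # → True
```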

Cyclic Redundancy Checks (CRC)


This is the error-detecting technique in widespread use. It is also known as a polynomial code,
based upon treating bit strings as representations of polynomials with coefficients of 0 and 1
only. Thus the bitstring 110001 represents the polynomial x^5 + x^4 + 1.

Modulo 2 arithmetic
In modulo 2 arithmetic, addition = subtraction = XOR, so:
•0+0=0
•0+1=1
•1+0=1
•1+1=0

Multiplication:
•0⋅0=0
•0⋅1=0
•1⋅0=0
•1⋅1=1

Long division is as normal, except the subtraction is modulo 2.

Examples
110010 ⋅ 1000 = (x^5 + x^4 + x) ⋅ x^3 = x^8 + x^7 + x^4 = 110010000

Algorithm
If this method is employed, the sender and receiver must agree upon a generator polynomial
G(x) in advance. Here, M(x) is the polynomial corresponding to the m data bits. The idea is to
append a CRC in such a way that the checksummed frame is divisible by G(x); if the receiver
finds a remainder, there has been a transmission error.
1. Let r be the degree of G(x). Append r zero bits to the low-order end of the frame so it
now contains m + r bits and corresponds to the polynomial x^r M(x)
2. Divide the bit string corresponding to G(x) into the bit string corresponding to
x^r M(x), using modulo-2 division
3. Subtract the remainder (which is always r or fewer bits) from the bit string corresponding
to x^r M(x) using modulo-2 subtraction. The result is the checksummed frame to be
transmitted. Call its polynomial T(x)

Example
Consider a message 1101011111 (x^9 + x^8 + x^6 + x^4 + x^3 + x^2 + x + 1)
Consider a generator polynomial 10011 (x^4 + x + 1)

This is used to generate a 4-bit CRC C(x) to be appended to M(x)


1. The degree r of G(x) is 4, so we multiply M(x) by x^4: 11010111110000 (append 4 zeros)
2. Divide the result by G(x); the remainder is C(x). We divide 10011 into 11010111110000
using modulo-2 long division:

3. We subtract (XOR) the remainder (0010) from x^4 M(x), giving the frame 11010111110010.
Using this we can detect all burst errors of length ≤ r
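The long division can be checked with a short sketch of modulo-2 division:

```python
def crc_remainder(message: str, generator: str) -> str:
    """Modulo-2 long division: append r zeros, XOR the generator
    under each leading 1, return the final r-bit remainder."""
    r = len(generator) - 1
    bits = [int(b) for b in message] + [0] * r   # x^r * M(x)
    gen = [int(b) for b in generator]
    for i in range(len(message)):
        if bits[i]:                              # leading 1: subtract (XOR) G(x) here
            for j, g in enumerate(gen):
                bits[i + j] ^= g
    return ''.join(map(str, bits[-r:]))

# The worked example: remainder 0010, so the frame is the message plus 0010.
crc_remainder('1101011111', '10011')  # → '0010'
```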
Data Link Protocols
Here, we assume that machine A wants to send a long stream of data to machine B using a
reliable connection-oriented service. A is assumed to have an infinite supply of data ready to
send and never has to wait for data to be produced. We also assume that machines do not
crash. Moreover, we assume that there are appropriate physical layer library functions available,
such as to_physical_layer and from_physical_layer. Checksums are computed and appended by the
transmitting hardware.

Initially, a receiver waits for an event to happen. Event types differ per protocol. As soon as the
receiving data link layer acquires an undamaged frame, it checks the control information in the
header, and if everything is all right, it passes the packet portion to the network layer. A frame
header is under no circumstances ever given to a network layer!

Very generally, a frame header consists of a few control fields: kind (only header or also data),
seq (used to keep frames apart), and ack (acknowledgements). There is also an info field, which
contains the packet itself. It is important to realize that a frame is just a packet with a
header attached to it. Protocols need to account for complete packet loss too, and thus
need some sort of timing mechanism.

Utopian Simplex
The simplex protocol is very simple: data can be transmitted in one direction only, both the
transmitting and receiving network layers are always ready, processing time can be ignored,
infinite buffer space is always available, and the channel between the data link layers never
damages or loses frames. The sender just pumps packets to the physical layer forever; the
receiver just waits for frames to arrive.

The sender is continuously doing the following:


• Fetch a packet from the network layer
• Construct an outbound frame using the variable s
• Send the frame. Only the info field is used by this protocol

The receiver is also simple:


• It waits for something to happen (an undamaged frame arrives)
• The frame arrives and it removes the frame headers and moves it to the network layer

Error-Free Channel
Now we will improve upon the simplex protocol by preventing the sender from flooding the
receiver with frames faster than the latter is able to process them. A way to do this is to have
the receiver send a dummy frame back to the sender after passing a packet to the network layer.
The sender may only send the next frame after receiving this dummy frame. This is called
stop-and-wait.

Though this is still simplex, frames do have to travel in both directions.


Noisy Channel
The next step in improving upon the Simplex protocol is by handling errors. Frames may be
either damaged or lost completely. However, we assume that if a frame is damaged in transit,
the receiver hardware will detect this when it computes the checksum. We could just add a
timer and wait for acknowledgement (which only comes when the frame is valid, resend if
invalid), but this has a fatal flaw: what if the data was received correctly, but the
acknowledgement isn't? Then duplicate frames are sent.

Thus the receiver has to be able to distinguish a frame from its retransmission. A 1-bit
sequence number is enough, because we only need to tell a frame apart from its predecessor.
These kinds of protocols are called ARQ (Automatic Repeat reQuest) or PAR (Positive
Acknowledgement with Retransmission). Some pseudocode:

Sender
if event is timeout:
    resend the current frame
else:
    if the ack equals next_frame_to_send:
        fetch the next packet, set up the next frame, send it
    else:
        resend the current frame

Receiver
let frame_expected = m
if the arriving frame is m:
    pass the packet to the network layer
    frame_expected = m + 1
    ack m, wait for m + 1
else:
    # didn't get m, but a retransmission of m - 1
    ack m - 1, wait for m
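The pseudocode above can be exercised with a toy simulation (a sketch, not the textbook implementation; loss probabilities and the lock-step timing are simplifying assumptions). Both frames and acks may be lost; the 1-bit sequence number keeps duplicates away from the network layer:

```python
import random

def simulate_stop_and_wait(packets, loss=0.3, seed=1):
    """Toy PAR / stop-and-wait: the sender resends on timeout; the
    receiver delivers a frame only when its 1-bit sequence number
    matches the expected one, so retransmissions cause no duplicates."""
    random.seed(seed)
    delivered = []
    seq = 0          # sender's current sequence bit
    expected = 0     # receiver's expected sequence bit
    i = 0
    while i < len(packets):
        # Sender transmits (seq, payload); the frame may be lost.
        if random.random() < loss:
            continue                     # frame lost -> timeout, retransmit
        if seq == expected:              # new frame: deliver and flip the bit
            delivered.append(packets[i])
            expected ^= 1
        # Receiver acks the last good frame; the ack may also be lost.
        if random.random() < loss:
            continue                     # ack lost -> timeout, retransmit
        seq ^= 1                         # ack received: move to next packet
        i += 1
    return delivered
```

Even when an ack is lost and a frame arrives twice, the duplicate is rejected (its sequence bit no longer matches) and only re-acknowledged.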

Sliding Window Protocols


In the previously mentioned protocols, we only had one-way data transfer, while we usually
want two-way (full-duplex) transfer.

One way of achieving that is by running two instances of one of those protocols, each on a
separate link for simplex data traffic (a "forward" and a "reverse" channel). This is wasteful,
though; it is better to use the same link for both directions. This can be done by looking at
the kind field in the header of an incoming frame, to distinguish acknowledgements from actual
data. Upon receiving data, the receiver waits a bit and sends the acknowledgement together with
its next outgoing data frame (piggybacking), making better use of the bandwidth.

This has some complications though. For how long to wait? Which frame will it piggyback?
Using a sliding window protocol (in which frames have sequence numbers from 0 to a
maximum of 2^n − 1), the sender maintains a set of sequence numbers corresponding to
frames it is permitted to send. These frames are said to fall within the sending window
(containing frames sent-but-no-ack and not-yet-sent). When a new packet from the network
layer comes in, it is given the highest sequence number and then the upper edge of the
sending windows is increased by 1. When an ack comes in, the lower bound is increased by 1.

Similarly, the receiver also maintains a receiving window corresponding to the set of frames it is
permitted to accept

One-bit sliding window


The easiest case is the 1-bit sliding window. This uses stop-and-wait since the sender transmits
a frame and waits for acknowledgements. The acknowledgement field contains the number of
the last frame received without errors. If this number agrees with the sequence number of the
frame the sender is trying to send, the sender knows it is done with the frame stored in buffer
and can fetch the next packet from its network layer. If the sequence number disagrees, it must
continue trying to send the same frame. Whenever a frame is received, a frame is also sent
back.

A problem occurs when both A and B send the first frame simultaneously:
• Imagine A's timeout is too short. A repeatedly times out and sends multiple copies to B ,
all with seq=0, ack=1.
• When first one of these gets to B , it is accepted. Set expected=1. B sends its frame,
seq=0, ack=0.
• All subsequent copies of A's frame rejected since seq=0 not equal to expected. All
these also have ack=1.
• B repeatedly sends its frame, seq=0, ack=0. But A is not getting it because it is timing
out too soon.
• Eventually, A gets one of these frames. A has its ack now (and B 's frame). A sends next
frame and acks B 's frame.
• Conclusion: Could get wasted time, but provided a frame can eventually make it
through, no infinite loop, and no duplicate packets to Network layer. Process will
complete.
Go-Back-N
As there might be a big delay in sending frames and the retrieval of the acknowledgement, it
could be helpful to send multiple frames at the same time before actually waiting for
acknowledgement (pipelining). This might even make it possible to continuously send frames
without any blocking.

We need to know how many frames w we can send. To maximize throughput, w = 2BD + 1,
where B is the bandwidth in frames per second, D is the one-way propagation delay, and BD
is called the bandwidth-delay product (measured in frames). This means w frames can be sent
before the first acknowledgement arrives.
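The rule above can be sketched numerically (illustrative units: bandwidth in bits per second, converted to frames using an assumed frame size):

```python
def optimal_window(bandwidth_bps: float, one_way_delay_s: float,
                   frame_bits: int) -> int:
    """w = 2*BD + 1, with the bandwidth-delay product BD in frames."""
    bd_frames = bandwidth_bps * one_way_delay_s / frame_bits
    return int(2 * bd_frames) + 1

# A 1 Mbps link, 25 ms one-way delay, 1000-bit frames: BD = 25 frames.
optimal_window(1_000_000, 0.025, 1000)  # → 51
```

With a window of 51 frames the sender can keep transmitting for a full round trip before it must stop and wait for the first ack.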

When errors occur, we need to handle them gracefully, there could be 5 wrong frames
received, but the frames received after those 5 might still be correct. But we still need to pass
the packets in the right order to the network layer. A possibility is to resend all frames starting
from the frame where an error occurred (meaning that all frames without ack should stay in the
sender's buffer). The problem with this is that with a bad connection, many frames will have to
be resent.

Selective repeat
Another option for pipelining is selective repeat. When it is used, a bad frame that is received is
discarded, but any good frames received after it are accepted and buffered. When the sender
times out, only the oldest unacknowledged frame is retransmitted. This approach can require
large amounts of data link layer memory if the window is large. When an acknowledgement for
frame n comes in, frames n − 1,  n − 2,  … are automatically acknowledged (cumulative
acknowledgement)
This is often combined with having the receiver send a negative acknowledgment (NAK) when
it detects an error (e.g. wrong checksum or out of sequence frame). This stimulates
retransmission before the sender’s timer runs out.

If we set aside n bits for frame numbers, sequence numbers wrap around after reaching
2^n − 1. This can cause problems. Consider the following (where n = 3, so sequence numbers
0 … 7 and a window size of 7):
• The windows of both A and B are 0 … 6
• A transmits frames 0 … 6
• B receives them properly, acks them and sends the packets to the network layer. It also
moves its window to 7,  0,  … ,  5 (because the sequence wraps around)
• Acks are lost
• A retransmits original 0
• B treats this 0 as part of a new batch, but it hasn't seen 7 yet, so it (again) acks up to 6
• A receives ack=6 so it figures the entire old batch got through
• A advances the window and transmits 7,  0,  … ,  5
• 7 gets accepted by B which then passes the new 7 to the network layer. It also sends
the buffered 0 (which is the old version, wrong!)

A solution to this is limiting the maximum window size to (MAX_SEQ + 1)/2 = 2^(n−1), i.e. 4 in this example.
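The bound can be stated as a one-line sketch: with the window limited to half the sequence-number space, the receiver's window after a wrap can never overlap the sender's old window:

```python
def max_sr_window(seq_bits: int) -> int:
    """Largest safe selective-repeat window: the old and new windows
    must never overlap after wrap-around, so the window may cover at
    most half of the 2**seq_bits sequence numbers."""
    return 2 ** (seq_bits - 1)

# With n = 3 (sequence numbers 0..7), the window may hold at most 4 frames.
max_sr_window(3)  # → 4
```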

Appendix A
Byte count

Byte stuffing
Bit stuffing

Credits
1. https://www.computing.dcu.ie/~Humphrys/Notes/Networks/data.error.html
2. https://www.studocu.com/en/document/technische-universiteit-delft/computer-networks/summaries/computer-networks-summary/30238/view
