
Chapter 4

Network Layer

Computer
Networking: A Top
Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012

Network Layer 4-1


Connection setup
• an important function in some network architectures: ATM, frame relay, X.25
• before datagrams flow, the two end hosts and intervening routers establish a virtual connection
  • routers get involved
• network vs. transport layer connection service:
  • network: between two hosts (may also involve intervening routers in the case of VCs)
  • transport: between two processes


Connection, connection-less service
• a datagram network provides network-layer connectionless service
• a virtual-circuit network provides network-layer connection service
• analogous to TCP/UDP connection-oriented / connectionless transport-layer services, but:
  • service: host-to-host
  • no choice: network provides one or the other
  • implementation: in network core


Virtual circuits
“source-to-dest path behaves much like telephone circuit”
  • performance-wise
  • network actions along source-to-dest path

• call setup, teardown for each call before data can flow
• each packet carries VC identifier (not destination host address)
• every router on source-dest path maintains “state” for each passing connection
• link, router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service)

VC implementation
a VC consists of:
1. path from source to destination
2. VC numbers, one number for each link along path
3. entries in forwarding tables in routers along path
• a packet belonging to a VC carries a VC number (rather than the destination address)
• the VC number can be changed on each link
  • the new VC number comes from the forwarding table


VC forwarding table
[Figure: a VC crossing three links with VC numbers 12, 22, 32; the northwest router's interfaces are numbered 1, 2, 3]

forwarding table in northwest router:

Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

VC routers maintain connection state information!
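
The lookup a VC router performs for each arriving packet can be sketched as a dictionary keyed by incoming interface and incoming VC number. This is only an illustration of the table above: the names `VC_TABLE` and `forward_vc` are hypothetical, and a real router would also install and remove entries during call setup and teardown.

```python
# Forwarding table of the northwest router, copied from the slide:
# (incoming interface, incoming VC #) -> (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_interface, in_vc):
    """Return (outgoing interface, new VC number) for an arriving packet."""
    key = (in_interface, in_vc)
    if key not in VC_TABLE:
        # No connection state: the VC was never set up (or was torn down).
        raise LookupError(f"no VC state for interface {in_interface}, VC {in_vc}")
    return VC_TABLE[key]


if __name__ == "__main__":
    out_interface, out_vc = forward_vc(1, 12)
    print(f"forward on interface {out_interface}, rewrite VC number to {out_vc}")
```

Note that the packet's VC number is rewritten at every hop, which is why the table stores an outgoing VC number as well as an outgoing interface.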


Virtual circuits: signaling protocols
• used to set up, maintain, and tear down VCs
• used in ATM, frame relay, X.25
• not used in today’s Internet


Datagram networks
• no call setup at network layer
• routers: no state about end-to-end connections
  • no network-level concept of “connection”
• packets forwarded using destination host address

[Figure: protocol stacks (application, transport, network, data link, physical) at two end hosts; 1. send datagrams, 2. receive datagrams]


Datagram forwarding table
4 billion IP addresses, so rather than list individual destination addresses, list ranges of addresses (aggregate table entries)

[Figure: the routing algorithm fills in the local forwarding table; the router uses the IP destination address in the arriving packet’s header to choose output link 1, 2, or 3]

local forwarding table:
dest address       output link
address-range 1    3
address-range 2    2
address-range 3    2
address-range 4    1


Datagram forwarding table
Destination Address Range                                                          Link Interface
11001000 00010111 00010000 00000000 through 11001000 00010111 00010111 11111111    0
11001000 00010111 00011000 00000000 through 11001000 00010111 00011000 11111111    1
11001000 00010111 00011001 00000000 through 11001000 00010111 00011111 11111111    2
otherwise                                                                          3

Q: but what happens if ranges don’t divide up so nicely?

Longest prefix matching
when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

Destination Address Range              Link interface
11001000 00010111 00010*** ********    0
11001000 00010111 00011000 ********    1
11001000 00010111 00011*** ********    2
otherwise                              3

examples:
DA: 11001000 00010111 00010110 10100001    which interface?
DA: 11001000 00010111 00011000 10101010    which interface?
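
The two example lookups can be worked through with a small sketch of longest prefix matching. It is a linear scan over the three prefixes from the table, written for clarity; the names `PREFIX_TABLE` and `lookup` are hypothetical, and real routers use much faster structures than a scan.

```python
# Prefixes from the table above with the wildcard bits dropped,
# each mapped to its link interface.
PREFIX_TABLE = [
    ("110010000001011100010", 0),     # 11001000 00010111 00010*** ********
    ("110010000001011100011000", 1),  # 11001000 00010111 00011000 ********
    ("110010000001011100011", 2),     # 11001000 00010111 00011*** ********
]
DEFAULT_INTERFACE = 3  # the "otherwise" row


def lookup(dest_bits):
    """Return the link interface for a 32-bit destination address string."""
    dest = dest_bits.replace(" ", "")
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in PREFIX_TABLE:
        # Keep the matching entry with the longest prefix.
        if dest.startswith(prefix) and len(prefix) > best_len:
            best_len, best_interface = len(prefix), interface
    return best_interface


if __name__ == "__main__":
    print(lookup("11001000 00010111 00010110 10100001"))
    print(lookup("11001000 00010111 00011000 10101010"))
```

The first address matches only the first entry, so it goes to interface 0. The second matches both the 24-bit and the 21-bit prefix beginning 00011, and the longer one wins, giving interface 1.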
Datagram or VC network: why?
Internet (datagram)
• data exchange among computers
  • “elastic” service, no strict timing req.
• many link types
  • different characteristics
  • uniform service difficult
• “smart” end systems (computers)
  • can adapt, perform control, error recovery
  • simple inside network, complexity at “edge”

ATM (VC)
• evolved from telephony
• human conversation:
  • strict timing, reliability requirements
  • need for guaranteed service
• “dumb” end systems
  • telephones
  • complexity inside network


The Internet network layer
host, router network layer functions:

transport layer: TCP, UDP

network layer:
• routing protocols
  • path selection
  • RIP, OSPF, BGP
• IP protocol
  • addressing conventions
  • datagram format
  • packet handling conventions
• ICMP protocol
  • error reporting
  • router “signaling”
• forwarding table

link layer

physical layer


IP datagram format
[Figure: IP datagram layout, 32 bits wide]
• ver: IP protocol version number
• head. len: header length (bytes)
• type of service: “type” of data
• length: total datagram length (bytes)
• 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
• time to live: max number of remaining hops (decremented at each router)
• upper layer: upper layer protocol to deliver payload to
• header checksum
• 32 bit source IP address
• 32 bit destination IP address
• options (if any): e.g. timestamp, record route taken, specify list of routers to visit
• data (variable length, typically a TCP or UDP segment)

how much overhead?
• 20 bytes of TCP
• 20 bytes of IP
• = 40 bytes + app layer overhead


IP fragmentation, reassembly
• network links have MTU (max. transfer size): largest possible link-level frame
  • different link types, different MTUs
• large IP datagram divided (“fragmented”) within net
  • one datagram becomes several datagrams
  • “reassembled” only at final destination
  • IP header bits used to identify, order related fragments

[Figure: fragmentation (in: one large datagram, out: 3 smaller datagrams), followed by reassembly]

IP fragmentation, reassembly
example:
• 4000 byte datagram
• MTU = 1500 bytes

one large datagram:
length = 4000    ID = x    fragflag = 0    offset = 0

becomes several smaller datagrams:
length = 1500    ID = x    fragflag = 1    offset = 0
length = 1500    ID = x    fragflag = 1    offset = 185
length = 1040    ID = x    fragflag = 0    offset = 370

(1480 bytes in data field; offset = 1480/8 = 185)
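
The numbers in this example can be reproduced with a short sketch. It assumes a 20-byte IP header with no options (the header size quoted on the datagram-format slide); the function name `fragment` and its return shape are hypothetical.

```python
def fragment(total_length, mtu, header_len=20):
    """Split one datagram into fragments.

    Returns a list of (length, fragflag, offset) tuples, where offset is
    expressed in 8-byte units as in the IP header. Assumes a fixed
    header_len-byte header on the original datagram and on every fragment.
    """
    payload = total_length - header_len
    # Data per fragment must be a multiple of 8 bytes (except the last one).
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset_bytes = 0
    while payload > 0:
        data = min(max_data, payload)
        payload -= data
        more_fragments = 1 if payload > 0 else 0
        fragments.append((data + header_len, more_fragments, offset_bytes // 8))
        offset_bytes += data
    return fragments


if __name__ == "__main__":
    for length, fragflag, offset in fragment(4000, 1500):
        print(f"length={length} fragflag={fragflag} offset={offset}")
```

The 4000-byte datagram carries 3980 bytes of data, which is why the last fragment is 1020 + 20 = 1040 bytes rather than 1000.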



IPv6: motivation
 initial motivation: 32-bit address space soon to be
completely allocated.
 additional motivation:
 header format helps speed processing/forwarding
 header changes to facilitate QoS

IPv6 datagram format:


 fixed-length 40 byte header
 no fragmentation allowed



IPv6 datagram format
priority: identify priority among datagrams in flow
flow label: identify datagrams in same “flow” (concept of “flow” not well defined)
next header: identify upper layer protocol for data

[Figure: IPv6 datagram layout, 32 bits wide: ver, pri, flow label; payload len, next hdr, hop limit; source address (128 bits); destination address (128 bits); data]

Other changes from IPv4
 checksum: removed entirely to reduce processing
time at each hop
 options: allowed, but outside of header, indicated
by “Next Header” field
 ICMPv6: new version of ICMP
 additional message types, e.g. “Packet Too Big”
 multicast group management functions



Interplay between routing, forwarding
• routing algorithm determines end-end path through network
• forwarding table determines local forwarding at this router

[Figure: the routing algorithm fills in the local forwarding table; the router uses the IP destination address in the arriving packet’s header to choose output link 1, 2, or 3]

local forwarding table:
dest address       output link
address-range 1    3
address-range 2    2
address-range 3    2
address-range 4    1


Types of Routing
• Static Routing
  • Routing information must be added manually in each router
• Dynamic Routing
  • Routers dynamically exchange routing information and periodically update it
  • RIP, BGP, OSPF


Chapter 10
Error Detection
and
Correction

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 10.
Note

Data can be corrupted during transmission. Some applications require that errors be detected and corrected.

Note

In a single-bit error, only 1 bit in the data unit has changed.

Figure 10.1 Single-bit error

Note

A burst error means that 2 or more bits in the data unit have changed.

Figure 10.2 Burst error of length 8

Note

To detect or correct errors, we need to send extra (redundant) bits with data.

Figure 10.3 The structure of encoder and decoder

Note

In this book, we concentrate on block codes; we leave convolution codes to advanced texts.

Note

In modulo-N arithmetic, we use only the integers in the range 0 to N − 1, inclusive.

Figure 10.4 XORing of two single bits or two words

10-2 BLOCK CODING

In block coding, we divide our message into blocks, each of k bits, called datawords. We add r redundant bits to each block to make the length n = k + r. The resulting n-bit blocks are called codewords.

Topics discussed in this section:
Error Detection
Error Correction
Hamming Distance
Minimum Hamming Distance

Figure 10.5 Datawords and codewords in block coding

Example 10.1

The 4B/5B block coding discussed in Chapter 4 is a good example of this type of coding. In this coding scheme, k = 4 and n = 5. As we saw, we have 2^k = 16 datawords and 2^n = 32 codewords. We saw that 16 out of 32 codewords are used for message transfer and the rest are either used for other purposes or unused.

Figure 10.6 Process of error detection in block coding

Example 10.2

Let us assume that k = 2 and n = 3. Table 10.1 shows the list of datawords and codewords. Later, we will see how to derive a codeword from a dataword.

Assume the sender encodes the dataword 01 as 011 and sends it to the receiver. Consider the following cases:

1. The receiver receives 011. It is a valid codeword. The receiver extracts the dataword 01 from it.

2. The codeword is corrupted during transmission, and 111 is received. This is not a valid codeword and is discarded.

3. The codeword is corrupted during transmission, and 000 is received. This is a valid codeword. The receiver incorrectly extracts the dataword 00. Two corrupted bits have made the error undetectable.

Table 10.1 A code for error detection (Example 10.2)

Note

An error-detecting code can detect only the types of errors for which it is designed; other types of errors may remain undetected.

Figure 10.7 Structure of encoder and decoder in error correction

Example 10.3

Let us add more redundant bits to Example 10.2 to see if the receiver can correct an error without knowing what was actually sent. We add 3 redundant bits to the 2-bit dataword to make 5-bit codewords. Table 10.2 shows the datawords and codewords. Assume the dataword is 01. The sender creates the codeword 01011. The codeword is corrupted during transmission, and 01001 is received. First, the receiver finds that the received codeword is not in the table. This means an error has occurred. The receiver, assuming that there is only 1 bit corrupted, uses the following strategy to guess the correct dataword.

1. Comparing the received codeword with the first codeword in the table (01001 versus 00000), the receiver decides that the first codeword is not the one that was sent because there are two different bits.

2. By the same reasoning, the original codeword cannot be the third or fourth one in the table.

3. The original codeword must be the second one in the table because this is the only one that differs from the received codeword by 1 bit. The receiver replaces 01001 with 01011 and consults the table to find the dataword 01.

Table 10.2 A code for error correction (Example 10.3)

Note

The Hamming distance between two words is the number of differences between corresponding bits.

Example 10.4

Let us find the Hamming distance between two pairs of words.

1. The Hamming distance d(000, 011) is 2 because 000 ⊕ 011 is 011 (two 1s).

2. The Hamming distance d(10101, 11110) is 3 because 10101 ⊕ 11110 is 01011 (three 1s).
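
Both distances in Example 10.4 can be checked with a minimal sketch that counts the positions where two equal-length bit strings differ. The function name `hamming_distance` is a hypothetical choice, and representing words as strings of '0' and '1' is an assumption made for readability.

```python
def hamming_distance(word_a, word_b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(word_a) != len(word_b):
        raise ValueError("words must have the same length")
    # Comparing bit by bit is equivalent to counting the 1s in the XOR.
    return sum(bit_a != bit_b for bit_a, bit_b in zip(word_a, word_b))


if __name__ == "__main__":
    print(hamming_distance("000", "011"))
    print(hamming_distance("10101", "11110"))
```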

Note

The minimum Hamming distance is the smallest Hamming distance between all possible pairs in a set of words.

Example 10.5

Find the minimum Hamming distance of the coding scheme in Table 10.1.

Solution
We first find all Hamming distances.

The dmin in this case is 2.
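
The pairwise search in Example 10.5 can be sketched as follows. Table 10.1 itself is not reproduced here, so the codeword list is an assumption pieced together from the surrounding examples: 000 and 011 appear in Example 10.2, 101 is named as the third codeword in Example 10.7, and Example 10.10 states that XORing the second and third codewords gives the fourth (011 ⊕ 101 = 110).

```python
from itertools import combinations

# Codewords assumed for Table 10.1, reconstructed from Examples 10.2, 10.7 and 10.10.
TABLE_10_1 = ["000", "011", "101", "110"]


def hamming_distance(word_a, word_b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(word_a, word_b))


def minimum_distance(codewords):
    """Smallest Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


if __name__ == "__main__":
    for a, b in combinations(TABLE_10_1, 2):
        print(f"d({a}, {b}) = {hamming_distance(a, b)}")
    print("dmin =", minimum_distance(TABLE_10_1))
```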

Example 10.6

Find the minimum Hamming distance of the coding scheme in Table 10.2.

Solution
We first find all the Hamming distances.

The dmin in this case is 3.

Note

To guarantee the detection of up to s errors in all cases, the minimum Hamming distance in a block code must be dmin = s + 1.

Example 10.7

The minimum Hamming distance for our first code scheme (Table 10.1) is 2. This code guarantees detection of only a single error. For example, if the third codeword (101) is sent and one error occurs, the received codeword does not match any valid codeword. If two errors occur, however, the received codeword may match a valid codeword and the errors are not detected.

Example 10.8

Our second block code scheme (Table 10.2) has dmin = 3. This code can detect up to two errors. Again, we see that when any of the valid codewords is sent, two errors create a codeword which is not in the table of valid codewords. The receiver cannot be fooled.

However, some combinations of three errors change a valid codeword to another valid codeword. The receiver accepts the received codeword and the errors are undetected.

Note

To guarantee correction of up to t errors in all cases, the minimum Hamming distance in a block code must be dmin = 2t + 1.

Example 10.9

A code scheme has a Hamming distance dmin = 4. What is the error detection and correction capability of this scheme?

Solution
This code guarantees the detection of up to three errors (s = 3), but it can correct up to one error. In other words, if this code is used for error correction, part of its capability is wasted. Error correction codes need to have an odd minimum distance (3, 5, 7, ...).

10-3 LINEAR BLOCK CODES

Almost all block codes used today belong to a subset called linear block codes. A linear block code is a code in which the exclusive OR (addition modulo-2) of two valid codewords creates another valid codeword.

Topics discussed in this section:
Minimum Distance for Linear Block Codes
Some Linear Block Codes

Note

In a linear block code, the exclusive OR (XOR) of any two valid codewords creates another valid codeword.

Example 10.10

Let us see if the two codes we defined in Table 10.1 and Table 10.2 belong to the class of linear block codes.

1. The scheme in Table 10.1 is a linear block code because the result of XORing any codeword with any other codeword is a valid codeword. For example, the XORing of the second and third codewords creates the fourth one.

2. The scheme in Table 10.2 is also a linear block code. We can create all four codewords by XORing two other codewords.

Example 10.11

In our first code (Table 10.1), the numbers of 1s in the nonzero codewords are 2, 2, and 2. So the minimum Hamming distance is dmin = 2. In our second code (Table 10.2), the numbers of 1s in the nonzero codewords are 3, 3, and 4. So in this code we have dmin = 3.

Note

A simple parity-check code is a single-bit error-detecting code in which n = k + 1 with dmin = 2.

Table 10.3 Simple parity-check code C(5, 4)

Figure 10.10 Encoder and decoder for simple parity-check code

Example 10.12

Let us look at some transmission scenarios. Assume the sender sends the dataword 1011. The codeword created from this dataword is 10111, which is sent to the receiver. We examine five cases:

1. No error occurs; the received codeword is 10111. The syndrome is 0. The dataword 1011 is created.
2. One single-bit error changes a1. The received codeword is 10011. The syndrome is 1. No dataword is created.
3. One single-bit error changes r0. The received codeword is 10110. The syndrome is 1. No dataword is created.
4. An error changes r0 and a second error changes a3. The received codeword is 00110. The syndrome is 0. The dataword 0011 is created at the receiver. Note that here the dataword is wrongly created due to the syndrome value.
5. Three bits (a3, a2, and a1) are changed by errors. The received codeword is 01011. The syndrome is 1. The dataword is not created. This shows that the simple parity check, guaranteed to detect one single error, can also find any odd number of errors.
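
The five cases of Example 10.12 can be replayed with a small sketch of an even-parity encoder and checker. The function names are hypothetical; the parity bit r0 is taken to be the modulo-2 sum of the dataword bits, which is consistent with the codeword 10111 given for dataword 1011.

```python
def parity_encode(dataword):
    """Append r0 so that the codeword has an even number of 1s."""
    r0 = dataword.count("1") % 2
    return dataword + str(r0)


def syndrome(codeword):
    """0 if the received codeword has even parity, 1 otherwise."""
    return codeword.count("1") % 2


def parity_decode(codeword):
    """Return the dataword if the syndrome is 0, otherwise None."""
    return codeword[:-1] if syndrome(codeword) == 0 else None


if __name__ == "__main__":
    print("sent:", parity_encode("1011"))
    for received in ["10111", "10011", "10110", "00110", "01011"]:
        print(received, "syndrome", syndrome(received), "->", parity_decode(received))
```

Case 4 is the one to watch: two errors leave the parity even, so the checker accepts 00110 and hands the wrong dataword 0011 to the upper layer.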

Note

A simple parity-check code can detect an odd number of errors.

Note

All Hamming codes discussed in this book have dmin = 3. The relationship between m and n in these codes is n = 2^m − 1.

Figure 10.11 Two-dimensional parity-check code

Table 10.4 Hamming code C(7, 4)

Figure 10.12 The structure of the encoder and decoder for a Hamming code

Table 10.5 Logical decision made by the correction logic analyzer

Example 10.13

Let us trace the path of three datawords from the sender to the destination:
1. The dataword 0100 becomes the codeword 0100011. The codeword 0100011 is received. The syndrome is 000, the final dataword is 0100.
2. The dataword 0111 becomes the codeword 0111001. A corrupted codeword is received, and its syndrome is 011. After flipping b2 (changing the 0 to 1), the final dataword is 0111.
3. The dataword 1101 becomes the codeword 1101000. Two bits are corrupted in transmission, and the syndrome of the received codeword is 101. After flipping b0, we get 0000, the wrong dataword. This shows that our code cannot correct two errors.

Example 10.14

We need a dataword of at least 7 bits. Calculate values of k and n that satisfy this requirement.

Solution
We need to make k = n − m greater than or equal to 7, or 2^m − 1 − m ≥ 7.
1. If we set m = 3, the result is n = 2^3 − 1 = 7 and k = 7 − 3, or 4, which is not acceptable.
2. If we set m = 4, then n = 2^4 − 1 = 15 and k = 15 − 4 = 11, which satisfies the condition. So the code is C(15, 11).
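
The search in Example 10.14 amounts to trying successive values of m until k = 2^m − 1 − m is large enough. A minimal sketch, with a hypothetical function name:

```python
def smallest_hamming_code(min_dataword_bits):
    """Return (m, n, k) for the smallest Hamming code with k >= min_dataword_bits.

    Uses the relationships from the text: n = 2^m - 1 and k = n - m.
    """
    m = 2
    while (2 ** m - 1) - m < min_dataword_bits:
        m += 1
    n = 2 ** m - 1
    return m, n, n - m


if __name__ == "__main__":
    m, n, k = smallest_hamming_code(7)
    print(f"m = {m}, code is C({n}, {k})")
```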

10-4 CYCLIC CODES

Cyclic codes are special linear block codes with one extra property. In a cyclic code, if a codeword is cyclically shifted (rotated), the result is another codeword.

Topics discussed in this section:
Cyclic Redundancy Check
Hardware Implementation
Polynomials
Cyclic Code Analysis
Advantages of Cyclic Codes
Other Cyclic Codes

Table 10.6 A CRC code with C(7, 4)

Figure 10.14 CRC encoder and decoder

Figure 10.15 Division in CRC encoder

Figure 10.16 Division in the CRC decoder for two cases

Figure 10.21 A polynomial to represent a binary word

Figure 10.22 CRC division using polynomials

Note

The divisor in a cyclic code is normally called the generator polynomial or simply the generator.

Note

In a cyclic code,
If s(x) ≠ 0, one or more bits is corrupted.
If s(x) = 0, either
a. No bit is corrupted, or
b. Some bits are corrupted, but the decoder failed to detect them.

Note

A generator that contains a factor of x + 1 can detect all odd-numbered errors.

Note

A good polynomial generator needs to have the following characteristics:
1. It should have at least two terms.
2. The coefficient of the term x^0 should be 1.
3. It should not divide x^t + 1, for t between 2 and n − 1.
4. It should have the factor x + 1.

Table 10.7 Standard polynomials
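
The CRC encoder and decoder in the figures listed above both rest on modulo-2 (XOR) long division. The sketch below is an illustration under stated assumptions: the figures and tables are not reproduced in this text, so the generator 1011 (x^3 + x + 1) and the dataword 1001 are illustrative choices for a C(7, 4) code, and the function names are hypothetical.

```python
def mod2_remainder(bits, generator):
    """Remainder of modulo-2 long division of a bit string by the generator."""
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    for i in range(len(work) - len(gen) + 1):
        if work[i] == 1:
            # Subtract (XOR) the generator aligned at position i.
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-(len(gen) - 1):])


def crc_encode(dataword, generator):
    """Append the remainder of (dataword followed by n - k zeros) / generator."""
    padded = dataword + "0" * (len(generator) - 1)
    return dataword + mod2_remainder(padded, generator)


def crc_syndrome(codeword, generator):
    """All zeros means no error was detected."""
    return mod2_remainder(codeword, generator)


if __name__ == "__main__":
    GENERATOR = "1011"  # assumed generator, x^3 + x + 1
    codeword = crc_encode("1001", GENERATOR)
    print("codeword:", codeword)
    print("syndrome, no error:", crc_syndrome(codeword, GENERATOR))
    print("syndrome, one bit flipped:", crc_syndrome("1000110", GENERATOR))
```

A zero syndrome only means no error was detected; as the note on s(x) says, some corrupted codewords also divide evenly and slip through.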

10-5 CHECKSUM

The last error detection method we discuss here is called the checksum. The checksum is used in the Internet by several protocols although not at the data link layer. However, we briefly discuss it here to complete our discussion on error checking.

Topics discussed in this section:
Idea
One’s Complement
Internet Checksum

Example 10.18

Suppose our data is a list of five 4-bit numbers that we want to send to a destination. In addition to sending these numbers, we send the sum of the numbers. For example, if the set of numbers is (7, 11, 12, 0, 6), we send (7, 11, 12, 0, 6, 36), where 36 is the sum of the original numbers. The receiver adds the five numbers and compares the result with the sum. If the two are the same, the receiver assumes no error, accepts the five numbers, and discards the sum. Otherwise, there is an error somewhere and the data are not accepted.

Example 10.19

We can make the job of the receiver easier if we send the negative (complement) of the sum, called the checksum. In this case, we send (7, 11, 12, 0, 6, −36). The receiver can add all the numbers received (including the checksum). If the result is 0, it assumes no error; otherwise, there is an error.

Example 10.20

How can we represent the number 21 in one’s complement arithmetic using only four bits?

Solution
The number 21 in binary is 10101 (it needs five bits). We can wrap the leftmost bit and add it to the four rightmost bits. We have (0101 + 1) = 0110 or 6.

Example 10.21

How can we represent the number −6 in one’s complement arithmetic using only four bits?

Solution
In one’s complement arithmetic, the negative or complement of a number is found by inverting all bits. Positive 6 is 0110; negative 6 is 1001. If we consider only unsigned numbers, this is 9. In other words, the complement of 6 is 9. Another way to find the complement of a number in one’s complement arithmetic is to subtract the number from 2^n − 1 (16 − 1 in this case).

Example 10.22

Let us redo Example 10.19 using one’s complement arithmetic. Figure 10.24 shows the process at the sender and at the receiver. The sender initializes the checksum to 0 and adds all data items and the checksum (the checksum is considered as one data item and is shown in color). The result is 36. However, 36 cannot be expressed in 4 bits. The extra two bits are wrapped and added with the sum to create the wrapped sum value 6. In the figure, we have shown the details in binary. The sum is then complemented, resulting in the checksum value 9 (15 − 6 = 9). The sender now sends six data items to the receiver including the checksum 9.

The receiver follows the same procedure as the sender. It adds all data items (including the checksum); the result is 45. The sum is wrapped and becomes 15. The wrapped sum is complemented and becomes 0. Since the value of the checksum is 0, this means that the data is not corrupted. The receiver drops the checksum and keeps the other data items. If the checksum is not zero, the entire packet is dropped.
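
The sender and receiver arithmetic of Example 10.22 can be traced with a short sketch of 4-bit one’s complement addition. The function names are hypothetical; the data items (7, 11, 12, 0, 6) come from Example 10.18.

```python
def wrap(total, bits=4):
    """Fold any carry beyond `bits` bits back into the sum (one's complement wrap)."""
    mask = (1 << bits) - 1
    while total > mask:
        total = (total & mask) + (total >> bits)
    return total


def checksum(items, bits=4):
    """Complement of the wrapped sum of all items."""
    mask = (1 << bits) - 1
    return mask - wrap(sum(items), bits)


if __name__ == "__main__":
    data = [7, 11, 12, 0, 6]
    sent_checksum = checksum(data)              # sender: checksum field starts at 0
    print("wrapped sum:", wrap(sum(data)))
    print("checksum:", sent_checksum)
    print("receiver result:", checksum(data + [sent_checksum]))
```

At the receiver the same routine runs over all six items, and a result of 0 means the data is accepted.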

Figure 10.24 Example 10.22

Note

Sender site:
1. The message is divided into 16-bit words.
2. The value of the checksum word is set to 0.
3. All words including the checksum are added using one’s complement addition.
4. The sum is complemented and becomes the checksum.
5. The checksum is sent with the data.

Note

Receiver site:
1. The message (including checksum) is divided into 16-bit words.
2. All words are added using one’s complement addition.
3. The sum is complemented and becomes the new checksum.
4. If the value of checksum is 0, the message is accepted; otherwise, it is rejected.

Example 10.23

Let us calculate the checksum for a text of 8 characters (“Forouzan”). The text needs to be divided into 2-byte (16-bit) words. We use ASCII (see Appendix A) to change each byte to a 2-digit hexadecimal number. For example, F is represented as 0x46 and o is represented as 0x6F. Figure 10.25 shows how the checksum is calculated at the sender and receiver sites. In part a of the figure, the value of partial sum for the first column is 0x36. We keep the rightmost digit (6) and insert the leftmost digit (3) as the carry in the second column. The process is repeated for each column. Note that if there is any corruption, the checksum recalculated by the receiver is not all 0s. We leave this as an exercise.

Figure 10.25 Example 10.23
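
The sender and receiver steps for 16-bit words can be sketched as below, applied to the text “Forouzan”. The function name `internet_checksum` is hypothetical, and padding an odd-length message with a zero byte is an assumption that does not arise in this 8-character example.

```python
def internet_checksum(data):
    """16-bit one's complement checksum over a bytes object."""
    if len(data) % 2:
        data += b"\x00"  # assumed padding for odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # one 16-bit word per two bytes
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return ~total & 0xFFFF                        # complement the sum


if __name__ == "__main__":
    text = "Forouzan".encode("ascii")
    value = internet_checksum(text)
    print(f"sender checksum: 0x{value:04X}")
    received = text + value.to_bytes(2, "big")
    print(f"receiver result: 0x{internet_checksum(received):04X}")
```

Flipping any single bit of `received` before the second call yields a nonzero result, which is the exercise the example leaves open.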

Chapter 3
Transport Layer

Computer
Networking: A Top
Down Approach
7th edition
Jim Kurose, Keith Ross
Pearson/Addison Wesley
April 2016
Chapter 3: Transport Layer
our goals:
• understand principles behind transport layer services:
  • multiplexing, demultiplexing
  • reliable data transfer
  • flow control
  • congestion control
• learn about Internet transport layer protocols:
  • UDP: connectionless transport
  • TCP: connection-oriented reliable transport
  • TCP congestion control

Transport Layer 3-99


Transport services and protocols
• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
  • send side: breaks app messages into segments, passes to network layer
  • rcv side: reassembles segments into messages, passes to app layer
• more than one transport protocol available to apps
  • Internet: TCP and UDP

[Figure: logical end-to-end transport between the protocol stacks (application, transport, network, data link, physical) of two end systems]

Transport vs. network layer
• network layer: logical communication between hosts
• transport layer: logical communication between processes
  • relies on, enhances, network layer services

household analogy:
12 kids in Ann’s house sending letters to 12 kids in Bill’s house:
• hosts = houses
• processes = kids
• app messages = letters in envelopes
• transport protocol = Ann and Bill who demux to in-house siblings
• network-layer protocol = postal service


Internet transport-layer protocols
• reliable, in-order delivery (TCP)
  • congestion control
  • flow control
  • connection setup
• unreliable, unordered delivery: UDP
  • no-frills extension of “best-effort” IP
• services not available:
  • delay guarantees
  • bandwidth guarantees

[Figure: protocol stacks of two end systems (application, transport, network, data link, physical) connected through routers (network, data link, physical)]


Multiplexing/demultiplexing
multiplexing at sender: handle data from multiple sockets, add transport header (later used for demultiplexing)

demultiplexing at receiver: use header info to deliver received segments to correct socket

[Figure: processes P1, P2, P3, P4 attached to sockets at the application layer of three hosts, each above transport, network, link, physical layers]


How demultiplexing works
• host receives IP datagrams
  • each datagram has source IP address, destination IP address
  • each datagram carries one transport-layer segment
  • each segment has source, destination port number
• host uses IP addresses & port numbers to direct segment to appropriate socket

[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #; other header fields; application data (payload)]


Connectionless demultiplexing
• recall: created socket has host-local port #:
  DatagramSocket mySocket1 = new DatagramSocket(12534);
• recall: when creating datagram to send into UDP socket, must specify (UDP socket is two tuple)
  • destination IP address
  • destination port #
• when host receives UDP segment:
  • checks destination port # in segment
  • directs UDP segment to socket with that port #

IP datagrams with same dest. port #, but different source IP addresses and/or source port numbers will be directed to same socket at dest


Connection-oriented demux
• TCP socket identified by 4-tuple:
  • source IP address
  • source port number
  • dest IP address
  • dest port number
• demux: receiver uses all four values to direct segment to appropriate socket
• server host may support many simultaneous TCP sockets:
  • each socket identified by its own 4-tuple
• web servers have different sockets for each connecting client
  • non-persistent HTTP will have different socket for each request
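
The difference between the two kinds of demultiplexing comes down to which header fields form the lookup key. The sketch below models this with two dictionaries; it is a conceptual illustration, not how an operating system is implemented, and the socket names, IP addresses (taken from documentation ranges), and port numbers are all hypothetical.

```python
# UDP: a socket is identified by (dest IP, dest port) only.
udp_sockets = {
    ("192.0.2.10", 12534): "udp_socket_A",
}

# TCP: a socket is identified by the full 4-tuple.
tcp_sockets = {
    ("198.51.100.7", 9157, "192.0.2.10", 80): "tcp_socket_client1",
    ("203.0.113.5", 9157, "192.0.2.10", 80): "tcp_socket_client2",
}


def demux_udp(src_ip, src_port, dst_ip, dst_port):
    """Source fields are ignored when choosing the UDP socket."""
    return udp_sockets.get((dst_ip, dst_port))


def demux_tcp(src_ip, src_port, dst_ip, dst_port):
    """All four values are needed to choose the TCP socket."""
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))


if __name__ == "__main__":
    # Two different senders, same destination port: same UDP socket.
    print(demux_udp("198.51.100.7", 9157, "192.0.2.10", 12534))
    print(demux_udp("203.0.113.5", 5775, "192.0.2.10", 12534))
    # Two different clients, same destination port 80: different TCP sockets.
    print(demux_tcp("198.51.100.7", 9157, "192.0.2.10", 80))
    print(demux_tcp("203.0.113.5", 9157, "192.0.2.10", 80))
```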



Figure 22.5 Socket address

UDP: User Datagram Protocol [RFC 768]
• “no frills,” “bare bones” Internet transport protocol
• “best effort” service, UDP segments may be:
  • lost
  • delivered out-of-order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others

UDP use:
• streaming multimedia apps (loss tolerant, rate sensitive)
• DNS
• SNMP

reliable transfer over UDP:
• add reliability at application layer
• application-specific error recovery!

Figure 22.10 UDP segment format
UDP: segment header
[Figure: UDP segment format, 32 bits wide: source port #, dest port #; total length, checksum; application data (payload)]
• total length: length, in bytes, of UDP segment, including header

why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small header size
• no congestion control: UDP can blast away as fast as desired


UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment

sender:
• treat segment contents, including header fields, as sequence of 16-bit integers
• checksum: addition (one’s complement sum) of segment contents
• sender puts checksum value into UDP checksum field

receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless? More later ….

Internet checksum: example
example: add two 16-bit integers

               1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
               1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

wraparound   1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1

sum            1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum       0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result

* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/

rdt3.0 in action

(a) no loss
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1
4. receiver: rcv pkt1, send ack1
5. sender: rcv ack1, send pkt0
6. receiver: rcv pkt0, send ack0

(b) packet loss
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1 (pkt1 is lost)
4. sender: timeout, resend pkt1
5. receiver: rcv pkt1, send ack1
6. sender: rcv ack1, send pkt0
7. receiver: rcv pkt0, send ack0

rdt3.0 in action

(c) ACK loss
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1
4. receiver: rcv pkt1, send ack1 (ack1 is lost)
5. sender: timeout, resend pkt1
6. receiver: rcv pkt1 (detect duplicate), send ack1
7. sender: rcv ack1, send pkt0
8. receiver: rcv pkt0, send ack0

(d) premature timeout/ delayed ACK
1. sender: send pkt0
2. receiver: rcv pkt0, send ack0
3. sender: rcv ack0, send pkt1
4. receiver: rcv pkt1, send ack1 (ack1 is delayed)
5. sender: timeout, resend pkt1
6. sender: rcv ack1, send pkt0
7. receiver: rcv pkt1 (detect duplicate), send ack1
8. receiver: rcv pkt0, send ack0
9. sender: rcv ack1, send pkt0
10. receiver: rcv pkt0 (detect duplicate), send ack0

• A problem with the previous scenarios is that rdt3.0 is a stop-and-wait protocol.
• Rather than operating in a stop-and-wait manner, the sender can be allowed to send multiple packets without waiting for acknowledgments.
• If the sender is allowed to transmit three packets before having to wait for acknowledgments, the utilization of the sender is essentially tripled. Since the many in-transit sender-to-receiver packets can be visualized as filling a pipeline, this technique is known as pipelining.
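
The tripling claim can be checked numerically. The sketch below uses the usual utilization estimate U = N · (L/R) / (RTT + L/R), where L/R is the time to transmit one packet. The specific values (8000-bit packets, a 1 Gbps link, 30 ms RTT) are illustrative assumptions, not figures from these slides, and the function name is hypothetical.

```python
def sender_utilization(n_packets, packet_bits, rate_bps, rtt_s):
    """Fraction of time the sender is busy transmitting.

    n_packets is the number of packets sent back-to-back before the sender
    must stop and wait for an acknowledgment (1 means stop-and-wait).
    """
    transmit_time = packet_bits / rate_bps
    return min(1.0, n_packets * transmit_time / (rtt_s + transmit_time))


if __name__ == "__main__":
    # Assumed example values: 8000-bit packets, 1 Gbps link, 30 ms RTT.
    stop_and_wait = sender_utilization(1, 8000, 1e9, 0.030)
    pipelined = sender_utilization(3, 8000, 1e9, 0.030)
    print(f"stop-and-wait: {stop_and_wait:.6f}")
    print(f"3-packet pipeline: {pipelined:.6f}")
    print(f"ratio: {pipelined / stop_and_wait:.2f}")
```

With these values the sender is idle well over 99% of the time under stop-and-wait, which is the motivation for the larger windows of Go-Back-N and selective repeat.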



Pipelined protocols: overview
Go-back-N:
• sender can have up to N unacked packets in pipeline
• receiver only sends cumulative ack
  • doesn’t ack packet if there’s a gap
• sender has timer for oldest unacked packet
  • when timer expires, retransmit all unacked packets

Selective Repeat:
• sender can have up to N unack’ed packets in pipeline
• rcvr sends individual ack for each packet
• sender maintains timer for each unacked packet
  • when timer expires, retransmit only that unacked packet


Go-Back-N: sender
 k-bit seq # in pkt header
 “window” of up to N consecutive unack’ed pkts allowed
 ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
• may receive duplicate ACKs (see receiver)
 timer for oldest in-flight pkt
 timeout(n): retransmit packet n and all higher seq # pkts in window

GBN: receiver extended FSM
 ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
• may generate duplicate ACKs
• need only remember expectedseqnum
 out-of-order pkt:
• discard (don’t buffer): no receiver buffering!
• re-ACK pkt with highest in-order seq #

GBN in action
sender window (N=4)
sender: send pkt0
sender: send pkt1
sender: send pkt2 (X loss)
sender: send pkt3, then (wait)
receiver: receive pkt0, send ack0
receiver: receive pkt1, send ack1
receiver: receive pkt3, discard, (re)send ack1
sender: rcv ack0, send pkt4
sender: rcv ack1, send pkt5
sender: ignore duplicate ACK
receiver: receive pkt4, discard, (re)send ack1
receiver: receive pkt5, discard, (re)send ack1
sender: pkt 2 timeout
sender: send pkt2
sender: send pkt3
sender: send pkt4
sender: send pkt5
receiver: rcv pkt2, deliver, send ack2
receiver: rcv pkt3, deliver, send ack3
receiver: rcv pkt4, deliver, send ack4
receiver: rcv pkt5, deliver, send ack5
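
The trace above can be replayed with a small model of the Go-Back-N sender and receiver rules. This is a minimal sketch under simplifying assumptions: sequence numbers are unbounded integers (no k-bit wraparound), there is no real timer or network, and the class and method names (`GBNSender`, `GBNReceiver`, `on_timeout`, ...) are hypothetical.

```python
class GBNSender:
    def __init__(self, window):
        self.N = window
        self.base = 0        # oldest unACKed seq #
        self.nextseqnum = 0  # next seq # to use

    def can_send(self):
        return self.nextseqnum < self.base + self.N

    def send(self):
        seq = self.nextseqnum
        self.nextseqnum += 1
        return seq

    def on_ack(self, n):
        # Cumulative ACK; a duplicate ACK (n < base) changes nothing.
        if n >= self.base:
            self.base = n + 1

    def on_timeout(self):
        # Retransmit every sent-but-unACKed packet in the window.
        return list(range(self.base, self.nextseqnum))


class GBNReceiver:
    def __init__(self):
        self.expectedseqnum = 0
        self.delivered = []

    def receive(self, seq):
        if seq == self.expectedseqnum:
            self.delivered.append(seq)
            self.expectedseqnum += 1
        # Out-of-order packets are discarded. Either way, ACK the highest
        # in-order seq # so far (-1 would mean nothing in order yet).
        return self.expectedseqnum - 1


sender, receiver = GBNSender(window=4), GBNReceiver()

first_acks = []
while sender.can_send():
    seq = sender.send()
    if seq != 2:  # pkt2 is lost in transit
        first_acks.append(receiver.receive(seq))

later_acks = []
for ack in first_acks:  # ack0, ack1, duplicate ack1
    sender.on_ack(ack)
    while sender.can_send():
        later_acks.append(receiver.receive(sender.send()))

resent = sender.on_timeout()  # pkt 2 timeout
final_acks = [receiver.receive(seq) for seq in resent]
for ack in later_acks + final_acks:
    sender.on_ack(ack)

print("resent:", resent)
print("delivered:", receiver.delivered)
```
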


Selective repeat
 receiver individually acknowledges all correctly
received pkts
 buffers pkts, as needed, for eventual in-order delivery
to upper layer
 sender only resends pkts for which ACK not
received
 sender timer for each unACKed pkt
 sender window
 N consecutive seq #’s
 limits seq #s of sent, unACKed pkts

Transport Layer 3-120


Selective repeat: sender, receiver windows

Transport Layer 3-121


Selective repeat
sender
data from above:
 if next available seq # in window, send pkt
timeout(n):
 resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
 mark pkt n as received
 if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]
 send ACK(n)
 out-of-order: buffer
 in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
 ACK(n)
otherwise:
 ignore
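
The receiver rules above can be sketched as follows. This is an illustrative model only: sequence numbers are treated as unbounded integers (no wraparound), and `SRReceiver` and its attribute names are hypothetical.

```python
class SRReceiver:
    def __init__(self, window):
        self.N = window
        self.rcvbase = 0
        self.buffer = {}     # out-of-order packets, keyed by seq #
        self.delivered = []  # data handed to the upper layer, in order

    def receive(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.N - 1:
            self.buffer[n] = data
            # Deliver any run of in-order packets and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n <= self.rcvbase - 1:
            return n  # already delivered: re-ACK so the sender can advance
        return None   # otherwise: ignore


# Arrival order when pkt2 is lost once and retransmitted after a timeout.
receiver = SRReceiver(window=4)
acks = [receiver.receive(n, f"pkt{n}") for n in (0, 1, 3, 4, 5, 2)]
print("ACKs sent:", acks)
print("delivered:", receiver.delivered)
```

Note that pkt3, pkt4 and pkt5 are ACKed as soon as they arrive but reach the upper layer only after pkt2 fills the gap.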


Selective repeat in action
sender window (N=4)
sender: send pkt0
sender: send pkt1
sender: send pkt2 (X loss)
sender: send pkt3, then (wait)
receiver: receive pkt0, send ack0
receiver: receive pkt1, send ack1
receiver: receive pkt3, buffer, send ack3
sender: rcv ack0, send pkt4
sender: rcv ack1, send pkt5
sender: record ack3 arrived
receiver: receive pkt4, buffer, send ack4
receiver: receive pkt5, buffer, send ack5
sender: pkt 2 timeout
sender: send pkt2
sender: record ack4 arrived
sender: record ack5 arrived
receiver: rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2

Q: what happens when ack2 arrives?


Selective repeat: dilemma
example:
 seq #’s: 0, 1, 2, 3
 window size=3
 receiver sees no difference in two scenarios!
 duplicate data accepted as new in (b)
Q: what relationship between seq # size and window size to avoid problem in (b)?

[Figure: sender window (after receipt) and receiver window (after receipt) over seq #s 0 1 2 3 0 1 2.
(a) no problem: pkt0, pkt1, pkt2 are received and ACKed; pkt3 is lost (X); a new pkt0 arrives and the receiver will accept packet with seq number 0.
(b) oops!: pkt0, pkt1, pkt2 are received but all three ACKs are lost (X); sender times out and retransmits pkt0; the receiver will accept packet with seq number 0.
receiver can’t see sender side. receiver behavior identical in both cases! something’s (very) wrong!]

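
One way to explore the question above is to list which sequence numbers are ambiguous for a given sequence-number space and window size. The sketch below models only the worst case from scenario (b): the receiver has accepted a full window and moved on, while every ACK was lost so the sender may still retransmit the old window. `ambiguous_seq_numbers` is a hypothetical helper. The usual textbook answer, which this check is consistent with, is that the window size must be at most half the size of the sequence-number space.

```python
def ambiguous_seq_numbers(seq_space, window):
    """Seq #s that could be either an old retransmission or new data."""
    # Sender may still retransmit the first window (all ACKs lost)...
    old = {n % seq_space for n in range(0, window)}
    # ...while the receiver already expects the next window.
    new = {n % seq_space for n in range(window, 2 * window)}
    return sorted(old & new)


print(ambiguous_seq_numbers(seq_space=4, window=3))  # the slide's example
print(ambiguous_seq_numbers(seq_space=4, window=2))  # window = half the space
```
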
Figure 22.12 Sending and receiving buffers

Figure 22.13 TCP segments

Reading Task:
 Transmission modes:
 https://www.geeksforgeeks.org/transmission-modes-computer-networks/
 Point-to-point and multipoint connection:
 https://www.geeksforgeeks.org/differences-between-point-to-point-and-multi-point-communication/


TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581

 point-to-point (connection):
• one sender, one receiver (multicasting is not possible)
 reliable, in-order byte stream:
• no “message boundaries”
 pipelined:
• TCP congestion and flow control set window size
 full duplex data:
• bi-directional data flow in same connection
• MSS: maximum segment size
 connection-oriented:
• handshaking (exchange of control msgs) inits sender, receiver state before data exchange
 flow controlled:
• sender will not overwhelm receiver

TCP segment structure
segment layout (each row 32 bits wide):
 source port #, dest port #
 sequence number
 acknowledgement number
• sequence and acknowledgement numbers: counting by bytes of data (not segments!)
 head len, not used, flag bits U A P R S F, receive window
• receive window: # bytes rcvr willing to accept
 checksum, Urg data pointer
• checksum: Internet checksum (as in UDP)
 options (variable length)
 application data (variable length)
flag bits:
 URG: urgent data (generally not used)
 ACK: ACK # valid
 PSH: push data now (generally not used)
 RST, SYN, FIN: connection estab (setup, teardown commands)
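
To make the layout concrete, here is a small sketch that unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The bit positions of the flags follow the standard header layout; the sample segment (ports 50000 and 23, an empty checksum, window 4096) is made up for illustration, and `parse_tcp_header` is a hypothetical helper, not part of any library.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / flags" field.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08,
             "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment):
    """Split a raw TCP segment into header fields and application data."""
    (src, dst, seq, ack, len_flags,
     rwnd, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (len_flags >> 12) * 4  # head len is in 32-bit words
    return {
        "source_port": src, "dest_port": dst,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": [name for name, bit in FLAG_BITS.items() if len_flags & bit],
        "rwnd": rwnd, "checksum": checksum, "urg_ptr": urg_ptr,
        "data": segment[header_len:],
    }


# Made-up sample: 5-word header, ACK and PSH set, one byte of data.
sample = struct.pack("!HHIIHHHH", 50000, 23, 42, 79,
                     (5 << 12) | 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_header(sample)
print(fields)
```

The checksum in the sample is left at zero, so it is not a valid segment on the wire; the sketch only shows where each field sits.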


TCP seq. numbers, ACKs
sequence numbers:
 byte stream “number” of first byte in segment’s data
acknowledgements:
 seq # of next byte expected from other side
 cumulative ACK
Q: how receiver handles out-of-order segments
 A: TCP spec doesn’t say, - up to implementor
(Read from book: Chapter 3, page 235…)

[Figure: outgoing segment from sender and incoming segment to sender, each showing source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer; the incoming segment's acknowledgement number is marked A.
sender sequence number space, window size N: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable]


TCP seq. numbers, ACKs
simple telnet scenario
Host A: user types ‘C’
Host A -> Host B: Seq=42, ACK=79, data = ‘C’
Host B: host ACKs receipt of ‘C’, echoes back ‘C’
Host B -> Host A: Seq=79, ACK=43, data = ‘C’
Host A: host ACKs receipt of echoed ‘C’
Host A -> Host B: Seq=43, ACK=80
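
The numbers in the telnet scenario follow directly from the rule that an ACK carries the seq # of the next byte expected. A minimal sketch, where `ack_for` is a hypothetical helper and the starting values 42 and 79 are the ones from the scenario:

```python
def ack_for(seq, data):
    """ACK number = seq # of the next byte expected from the other side."""
    return seq + len(data)


# Host A -> Host B: user types 'C'
a_seq, a_ack = 42, 79

# Host B -> Host A: ACKs receipt of 'C', echoes 'C' back
b_seq, b_ack = a_ack, ack_for(a_seq, b"C")

# Host A -> Host B: ACKs receipt of the echoed 'C' (no data)
a_seq2, a_ack2 = b_ack, ack_for(b_seq, b"C")

print(f"B sends Seq={b_seq}, ACK={b_ack}")
print(f"A sends Seq={a_seq2}, ACK={a_ack2}")
```
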


Chapter 3 outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 principles of congestion control
3.7 TCP congestion control


TCP reliable data transfer
 TCP creates rdt service on top of IP’s unreliable service
• pipelined segments
• cumulative acks
• single retransmission timer
 retransmissions triggered by:
• timeout events
• duplicate acks
let’s initially consider simplified TCP sender:
 ignore duplicate acks
 ignore flow control, congestion control


TCP sender events:
data rcvd from app:
 create segment with seq #
 seq # is byte-stream number of first data byte in segment
 start timer if not already running
• think of timer as for oldest unacked segment
timeout:
 retransmit segment that caused timeout
 restart timer
ack rcvd:
 if ack acknowledges previously unacked segments
• update what is known to be ACKed
• start timer if there are still unacked segments
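
The three sender events can be sketched as methods on a small class. This is a simplified model in the spirit of the slide: duplicate ACKs, flow control and congestion control are ignored, the timer is just a flag rather than a real clock, ACKs are assumed to fall on segment boundaries, and all names (`SimpleTCPSender`, `send_base`, ...) are hypothetical.

```python
class SimpleTCPSender:
    def __init__(self, initial_seq):
        self.send_base = initial_seq  # seq # of oldest unACKed byte
        self.next_seq = initial_seq   # seq # for the next new byte
        self.unacked = {}             # seq # -> data of in-flight segments
        self.timer_running = False

    def data_from_app(self, data):
        """Create a segment; its seq # is that of its first data byte."""
        seq = self.next_seq
        self.unacked[seq] = data
        self.next_seq += len(data)
        if not self.timer_running:
            self.timer_running = True  # timer covers oldest unACKed segment
        return seq, data

    def timeout(self):
        """Retransmit the oldest unACKed segment and restart the timer."""
        if not self.unacked:
            self.timer_running = False
            return None
        seq = min(self.unacked)
        self.timer_running = True
        return seq, self.unacked[seq]

    def ack_received(self, ack):
        if ack > self.send_base:  # acknowledges previously unACKed data
            self.send_base = ack
            for seq in [s for s in self.unacked if s < ack]:
                del self.unacked[seq]
            # Keep the timer only if something is still in flight.
            self.timer_running = bool(self.unacked)


# Two segments sent, first ACK lost, cumulative ACK=120 arrives in time.
sender = SimpleTCPSender(initial_seq=92)
sender.data_from_app(b"x" * 8)    # Seq=92, 8 bytes
sender.data_from_app(b"y" * 20)   # Seq=100, 20 bytes
sender.ack_received(120)
print(sender.send_base, sender.timer_running)

# Lost ACK: the only segment times out and is retransmitted.
lossy = SimpleTCPSender(initial_seq=92)
lossy.data_from_app(b"x" * 8)
print(lossy.timeout())
```
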


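TCP: retransmission scenarios
lost ACK scenario
Host A -> Host B: Seq=92, 8 bytes of data
Host B -> Host A: ACK=100 (X, lost)
Host A: timeout
Host A -> Host B: Seq=92, 8 bytes of data
Host B -> Host A: ACK=100

premature timeout
Host A -> Host B: Seq=92, 8 bytes of data
Host A -> Host B: Seq=100, 20 bytes of data
Host B -> Host A: ACK=100
Host B -> Host A: ACK=120
Host A: timeout (before the ACKs arrive)
Host A -> Host B: Seq=92, 8 bytes of data
Host B -> Host A: ACK=120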


TCP: retransmission scenarios
cumulative ACK
Host A -> Host B: Seq=92, 8 bytes of data
Host A -> Host B: Seq=100, 20 bytes of data
Host B -> Host A: ACK=100 (X, lost)
Host B -> Host A: ACK=120 (arrives before the timeout)
Host A -> Host B: Seq=120, 15 bytes of data

TCP ACK generation [RFC 1122, RFC 2581]

event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq. # of next expected byte

event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap


TCP fast retransmit
 time-out period often relatively long:
• long delay before resending lost packet
 detect lost segments via duplicate ACKs.
• sender often sends many segments back-to-back
• if segment is lost, there will likely be many duplicate ACKs.

TCP fast retransmit
if sender receives 3 duplicate ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
 likely that unacked segment lost, so don’t wait for timeout


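TCP fast retransmit
Host A -> Host B: Seq=92, 8 bytes of data
Host A -> Host B: Seq=100, 20 bytes of data (X, lost)
Host B -> Host A: ACK=100
Host B -> Host A: ACK=100
Host B -> Host A: ACK=100
Host B -> Host A: ACK=100
Host A -> Host B: Seq=100, 20 bytes of data (resent before the timeout expires)

fast retransmit after sender receipt of triple duplicate ACK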
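
The duplicate-ACK rule can be sketched as a small counter. This is an illustration only: it tracks just the most recent ACK number, it treats the first ACK for a given number as the original and the next three as duplicates, and `FastRetransmitDetector` is a hypothetical name.

```python
class FastRetransmitDetector:
    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the seq # to resend right away, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == 3:  # triple duplicate ACK
                return ack           # resend segment starting at this byte
        else:
            self.last_ack = ack
            self.dup_count = 0
        return None


detector = FastRetransmitDetector()
decisions = [detector.on_ack(100) for _ in range(4)]
print(decisions)
```

In the scenario above, the fourth ACK=100 is the third duplicate, so the segment with Seq=100 is resent without waiting for the timer.
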


TCP flow control
application may remove data from TCP socket buffers … slower than TCP receiver is delivering (sender is sending)

flow control
receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

[Figure: receiver protocol stack. Segments from sender pass through IP code and TCP code into the TCP socket receiver buffers in the OS; the application process reads from those buffers.]


TCP flow control
 receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
• RcvBuffer size set via socket options (typical default is 4096 bytes)
• many operating systems autoadjust RcvBuffer
 sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
 guarantees receive buffer will not overflow

[Figure: receiver-side buffering. TCP segment payloads enter RcvBuffer, which holds buffered data (on its way to application process) plus rwnd bytes of free buffer space.]

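
The bookkeeping behind rwnd can be sketched as below. This is a toy model, not a socket API: `ReceiveBuffer` and `sendable_bytes` are hypothetical names, the 4096-byte default is the figure quoted on the slide, and real stacks track byte positions rather than a single counter.

```python
class ReceiveBuffer:
    def __init__(self, rcv_buffer=4096):
        self.rcv_buffer = rcv_buffer  # total buffer size (RcvBuffer)
        self.buffered = 0             # bytes waiting for the application

    @property
    def rwnd(self):
        """Free buffer space advertised to the sender."""
        return self.rcv_buffer - self.buffered

    def segment_arrives(self, nbytes):
        if nbytes > self.rwnd:
            raise OverflowError("sender ignored the advertised rwnd")
        self.buffered += nbytes

    def app_reads(self, nbytes):
        taken = min(nbytes, self.buffered)
        self.buffered -= taken
        return taken


def sendable_bytes(rwnd, in_flight):
    """Sender side: keep unACKed (in-flight) data within rwnd."""
    return max(0, rwnd - in_flight)


buf = ReceiveBuffer()
buf.segment_arrives(3000)
print("rwnd after 3000 bytes arrive:", buf.rwnd)
print("sender may send:", sendable_bytes(buf.rwnd, in_flight=500))
buf.app_reads(2000)
print("rwnd after app reads 2000:", buf.rwnd)
```
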
Principles of congestion control
congestion:
 informally: “too many sources sending too much
data too fast for network to handle”
 different from flow control!
 manifestations:
 lost packets (buffer overflow at routers)
 long delays (queueing in router buffers)
 a top-10 problem!

Transport Layer 3-143


Causes/costs of congestion: scenario 1
 two senders, two receivers
 one router, infinite buffers
 output link capacity: R
 no retransmission
original data: λin; throughput: λout

[Figure: Host A and Host B send through one router with unlimited shared output link buffers. Plots: λout vs λin (both up to R/2) and delay vs λin (up to R/2).]

 maximum per-connection throughput: R/2
 large delays as arrival rate, λin, approaches capacity

Causes/costs of congestion: scenario 2
 one router, finite buffers
 sender retransmission of timed-out packet
• application-layer input = application-layer output: λin = λout
• transport-layer input includes retransmissions: λ'in ≥ λin

[Figure: Host A and Host B share finite output link buffers. λin: original data; λ'in: original data, plus retransmitted data; λout: delivered data.]

Causes/costs of congestion: scenario 2
idealization: perfect knowledge
 sender sends only when router buffers available

[Figure: plot of λout vs λin, both up to R/2. Diagram: the sender keeps a copy of each packet and sends only when there is free buffer space in the finite shared output link buffers. λin: original data; λ'in: original data, plus retransmitted data.]

Causes/costs of congestion: scenario 2
Idealization: known loss
 packets can be lost, dropped at router due to full buffers
 sender only resends if packet known to be lost
 when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?)

[Figure: plot of λout vs λin, both up to R/2. Diagram: a packet that arrives when there is no buffer space is dropped; the sender resends its copy once there is free buffer space. λin: original data; λ'in: original data, plus retransmitted data.]

Causes/costs of congestion: scenario 2
Realistic: duplicates
 packets can be lost, dropped at router due to full buffers
 sender times out prematurely, sending two copies, both of which are delivered
 when sending at R/2, some packets are retransmissions, including duplicates that are delivered!

[Figure: plot of λout vs λin, both up to R/2. Diagram: after a timeout the sender's copy is sent into free buffer space while the original is still on its way, so both copies reach Host B.]

“costs” of congestion:
 more work (retrans) for given “goodput”
 unneeded retransmissions: link carries multiple copies of pkt
• decreasing goodput


 End-to-end congestion control.
 Network-assisted congestion control.

Transport Layer 3-151
