You are on page 1of 43

Lecture 6.

Internet Transport Layer:

introduction to the
Transport Control Protocol
(TCP)
RFC 793
(estensioni RFC 1122,1323,2018,2581,working group tsvwg)

G.Bianchi, G.Neglia, V.Mancuso


A MUCH more complex transport
for three main reasons

ÎConnection oriented
Öimplements mechanisms to setup and tear down a
full duplex connection between end points
ÎReliable
Öimplements mechanisms to guarantee error free
and ordered delivery of information
ÎFlow & Congestion controlled
Öimplements mechanisms to control traffic

G.Bianchi, G.Neglia, V.Mancuso


TCP services
Îconnection oriented ÎTCP functions
ÖTCP connections Æ application addressing (ports)
Æ error recovery (acks and
Îreliable transfer retransmission)
service Æ reordering (sequence numbers)
Öall bytes sent are received Æ flow control
Æ congestion control

Appl.
Appl.
TCP
TCP
IP IP IP
IP
IP

G.Bianchi, G.Neglia, V.Mancuso


Byte stream service
ÎTCP exchange data between applications as
a stream of bytes
ÎIt does not introduce any data delimiter (an
application duty)
Ö source application may enter 10 bytes followed by 1 and 40 (grouped
with some semantics)
Ö data is buffered at source, and transmitted
Ö at receiver, may be read in the sequence 25 bytes, 22 bytes and 4
bytes...

Application view

TCP view

G.Bianchi, G.Neglia, V.Mancuso


TCP segments
ÎApplication data broken into segments for transmission
Îsegmentation totally up to TCP, according to what TCP
considers being the best strategy
Îeach segment placed into an IP packet
Îvery different from UDP!!

Header TCP TCP data Header TCP TCP data

Header IP IP data Header IP IP data

G.Bianchi, G.Neglia, V.Mancuso


TCP segment format
20 bytes header (minimum)
0 3 7 15 31
Source port Destination port
32 bit Sequence number
32 bit acknowledgement number
Header 6 bit U A P R S F
length
R
Reserved G
C S S Y I
K H T N N
Window size
checksum Urgent pointer

Options (if any)


padding

Data (if any)

G.Bianchi, G.Neglia, V.Mancuso


Source port Destination port
32 bit Sequence number
32 bit acknowledgement number
Header 6 bit U A P R S F
length
R
Reserved G
C S S Y I
K H T N N
Window size
checksum Urgent pointer

ÎSource & destination port + source and


destination IP addresses
Öunivocally determine TCP connection

Îchecksum as in UDP
Ösame calculation including same pseudoheader

Îno explicit segment length specification

G.Bianchi, G.Neglia, V.Mancuso


Source port Destination port
32 bit Sequence number
32 bit acknowledgement number
Header 6 bit U A P R S F
length
R
Reserved G
C S S Y I
K H T N N
Window size
checksum Urgent pointer

Options (if any)


00000000

ÎHeader length: 4 bits


Öspecifies the header size (n*4byte words) for options
Ömaximum header size: 60 (15*4)
Öoption field size must be multiple of 32bits: zero padding
when not.
ÎReserved: 000000 (still today!)
G.Bianchi, G.Neglia, V.Mancuso
Reliable data transfer: issues
packet packet PROBLEMS:
1) Packet received with errors
INTERNET 2) Packet not received at all

Same problem considered at DATA LINK LAYER


(although it is less likely that a whole packet is lost at data link)

Îmechanisms to guarantee correct reception:


ÖForward Error Correction (FEC) coding schemes
ÆPowerful to correct bits affected by error, less effective in
case of packet loss
ÆMostly used at link layer
ÖRetransmission – issues:
ÆACK
ÆNACK
ÆTIMEOUT

G.Bianchi, G.Neglia, V.Mancuso


Retransmission scenarios
referred to as ARQ schemes (Automatic Repeat reQuest)
COMPONENTS: a) error checking at receiver; b) feedback to sender; c) retx

SRC DST SRC DST


DATA DATA
Error Error
Check: Check:
ACK OK NACK corrupted
Automatic
retransmit DATA
Basic ACK idea
Basic NACK idea

SRC DST SRC DST SRC DATA


DST
DATA DATA
Retx Error
Timeout ACK Check:
(RTO) corrupted
DATA DATA DATA

Basic ACK/Timeout idea


G.Bianchi, G.Neglia, V.Mancuso
Why sequence numbers?
(on data)
Sender side: Receiver side:
DATA
DATA
RTO
ACK
NETWORK
(ACK lost)
rtx DATA
DATA
New data?
Old data?

Need to univocally “label” all packets circulating


in the network between two end points.
1 bit (0-1) enough for Stop-and-wait

G.Bianchi, G.Neglia, V.Mancuso


Why sequence numbers?
(on ack)
Sender side: Receiver side:

DATA 1

ACK Duplicated
ACK
DATA 2
Queueing
Delay
ACK

DATA 3
Data 2 lost !!

With pathologically critical network (as the Internet!)


also need to univocally “label” all acks circulating
in the network between two end points.
1 bit (0-1) enough for Stop-and-wait
G.Bianchi, G.Neglia, V.Mancuso
Source port Destination port
32 bit Sequence number
32 bit acknowledgement number
Header 6 bit U A P R S F
length
R
Reserved G
C S S Y I
K H T N N
Window size
checksum Urgent pointer

ÎSequence number:
ÖSequence number of the first byte in the segment.
ÖWhen reaches 232-1, next wraps back to 0
ÎAcknowledgement number:
Övalid only when ACK flag on
ÖContains the next byte sequence number that the host expects to
receive (= last successfully received byte of data + 1)
Ögrants successful reception for all bytes up to ack# - 1 (cumulative)
ÎWhen seq/ack reach 232-1, next wrap back to 0

G.Bianchi, G.Neglia, V.Mancuso


TCP data transfer management
ÎFull duplex connection
Ödata flows in both directions, independently
ÖTo the application program these appear as two unrelated
data streams
ÖImpossible to build multicast connection
Îeach end point maintains a sequence
number
ÖIndependent sequence numbers at both ends
ÖMeasured in bytes
Îacks often carried on top of reverse
flow data segments (piggybacking)
ÖBut ack packets alone are possible

G.Bianchi, G.Neglia, V.Mancuso


Byte-oriented
Example: 1 kbyte message – 1024 bytes

0 1 … … 100 … … 535 … … … 1023

Example: segment size = 536 bytes Æ 2 segments: 0-535; 536-1023

sender receiver
seq=0
ÎNo explicit segment size
indication
36
Ack=5 Ö Seq = first byte number
seq=536
Ö Returning Ack = last byte
number + 1
0 24 Ö Segment size = Ack-seq#
Ack=1
time time

G.Bianchi, G.Neglia, V.Mancuso


Pipelining
Example: 1024 bytes msg; seg_size = 536 bytes Æ 2 segments: 0-535; 536-1023

0 1 … … 100 … … 535 … … … 1023

sender receiver
seq=0
seq=536 Î Other name
Ö sliding window mechanisms
36
Ack=5 Î Basic approaches for error recovery
0 24 Ö Go-Back-N and Selective Repeat
Ack=1

time time
Why pipelining? Dramatic improvement in efficiency!

G.Bianchi, G.Neglia, V.Mancuso


Cumulative ack
Example: 1024 bytes msg; seg_size = 536 bytes Æ 2 segments: 0-535; 536-1023

0 1 … … 100 … … 535 … … … 1023

sender receiver
seq=0
ÎCumulative ack
seq=536
Ö ACK = all previous bytes
correctly received!
0 24 Ö E.g. ACK=1024: all bytes 0-1023
Ack=1 received
ÎOther names of pipelining:
Ö Go-Back-N ARQ mechanisms
time time Ö Sliding window mechanisms
Why pipelining? Dramatic improvement in efficiency!

G.Bianchi, G.Neglia, V.Mancuso


Multiple acks; Piggybacking
Bytes 100
-1 99, seq=1
00,

Immediate ack,
, A ck = 200
no payload EMPTY , a c k = 200 Data in reverse
s e q= 45 0 direction,carries
4 5 0 - 525,
Bytes previous ack

Bytes 200
-2 49, seq=2
00, ack=5
26

Next segment,
CLIENT SERVER
piggybacked ack

G.Bianchi, G.Neglia, V.Mancuso


16
15
14
13 TCP data transfer
12
11 bidirectional example
10
9
8
7 118
6 117
5 Segment size = 6 116
4 115
3 114
2 Segment size = 4 113
1 112

Time 0: Seq=1, NO ack Seq=112, NO ack

Time 1: Seq=7, ack=116 Seq=116, ack=7

Time 2: Seq=13, ack=119 Seq=119, ack=13

Time 3:
Seq=119, ack=17

G.Bianchi, G.Neglia, V.Mancuso


Performance issues
with/without pipelining

G.Bianchi, G.Neglia, V.Mancuso


Link delay computation
ÎTransmission delay:
ÎC [bit/s] = link rate
Router
ÎB [bit] = packet size
Îtransmission delay = B/C [sec]
ÎExample:
sender receiver Î512 bytes packet
Î64 kbps link
Tx delay
Prop Îtransmission delay = 512*8/64000 = 64ms
B/C
delay
ÎPropagation delay – constant depending on
Tx delay ÎLink length
B/C
ÎElectromagnetig waves propagation speed in
considered media
Î200 km/s for copper links
Î300 km/s in air
time time
ÎHypothesis: other delays neglected
Îqueueing
Îprocessing
G.Bianchi, G.Neglia, V.Mancuso
Stop-and-wait performance
Router 1 Router 2

28.8 Kbps 1024 Kbps 28.8 Kbps


Start tx 1 ms 30 ms 1 ms
prop1
tx1

prop2

tx2
prop3
tx3

Completion time =
Tx1 + Prop1 + Tx2 + Prop2 + Tx3 + Prop3 +
Ack_Tx1 + Prop1 + Ack_Tx2 + Prop2 + Ack_Tx3 +
Prop3 + [same computation for other segments]=
……………
= RTT + Tx1 + Tx2 + Tx3 + Ack_Tx1 +
Ack_Tx2 + Ack_Tx3 +...
time time
G.Bianchi, G.Neglia, V.Mancuso
Stop-and-wait performance
Numerical example

Router 1 Router 2

28.8 Kbps 1024 Kbps 28.8 Kbps


1 ms 30 ms 1 ms

Î Message: Î Segment 1: Î Acks:


Ö 1024 bytes; Ö Tx1 = 576*8/28,8 = Ö Tx1 = Tx3 =
Ö 2 segments: 160ms 40*8/28,8 = 11,1ms
536+488 bytes Ö Tx3 = Tx1 Ö Tx2 = 40*8/1024 =
Ö Overhead: 20 bytes Ö Tx2 = 576*8/1024 = 0,3 ms
TCP + 20 bytes IP 4,5 ms RESULT:
Ö ACK = 40 bytes Î Segment 2:
(header only) Ö Tx1 = 528*8/28,8 = D = 667 (tx total) + 2*RTT =
146,7ms = 795 ms
Ö Tx3 = Tx1
Ö Tx2 = 528*8/1024 = THR = 1024*8/795 =
4,1 ms = 10,3 kbps
G.Bianchi, G.Neglia, V.Mancuso
Stop-and-wait performance
Numerical example

Router 1 Router 2
With
ISDN? 128 Kbps 1024 Kbps 128 Kbps
1 ms 30 ms 1 ms
Î Segment 1: Î Acks:
Ö Tx1 = Tx3 = Ö Tx1 = Tx3 =
576*8/128 = 36ms 40*8/128 = 2,5 ms on Gbps fiber optics?
Ö Tx2 = 576*8/1024 = Ö Tx2 = 40*8/1024 =
4,5 ms 0,3 ms D = negligible + 2*RTT =
Î Segment 2: = 128 ms
RESULT:
Ö Tx1 = Tx3 =
528*8/128 = 33 ms D = 157,2 (tx total) + 2*RTT = THR = 1024*8/128 =
Ö Tx2 = 528*8/1024 = = 64 kbps
= 279,9 ms
4,1 ms
THR = 1024*8/279,9 =
= 29,3 kbps
G.Bianchi, G.Neglia, V.Mancuso
Pipelining performance
Router 1 Router 2

256 Kbps 1024 Kbps 256 Kbps


Start tx 1 ms 30 ms 1 ms
prop1
tx1

prop2

tx2
prop3
tx3

Completion time (neglecting processing & queueing) =


Tx1 + Prop1 + Tx2 + Prop2 + Tx3 + Prop3 + Tx_bottleneck
full tx time Ack_Tx1 + Prop1 + Ack_Tx2 + Prop2 +
Ack_Tx3 + Prop3 (that’s it!)
……………

time time time time


G.Bianchi, G.Neglia, V.Mancuso
Pipelining performance
numerical example
On 28,8 kbps links On 128 kbps ISDN links

D = 347 (tx segm1+ack) + RTT + D = 81,8 (tx segm1+ack) + RTT +


+ 160 (segm2 bottleneck) = + 33 (segm2 bottleneck) =
= 571 ms = 178,8 ms

THR = 1024*8/571 = THR = 1024*8/178,8 =


= 14,3 kbps = 45,8 kbps

on Gbps fiber optics?

D = negligible + RTT = 64 ms

THR = 1024*8/64 = 128 kbps

G.Bianchi, G.Neglia, V.Mancuso


Simplified performance model

C bits/sec

Approximate analysis, much simpler than multi-hop


Typically, C = bottleneck link rate

MSS = segment size


MSIZE = message size
Ignore overhead
Ignore ACK transmission time
No loss of segments
W = number of outstanding segments
W=1: stop-and-wait
W>1: sliding window
This is a highly dynamic parameter in TCP!!
For now, consider W fixed
G.Bianchi, G.Neglia, V.Mancuso
W=1 case (stop-and-wait)
sender receiver

One way delay

MSS/C
RTT

MSS
throughput =
RTT + MSS / C

REMARK: throughput always lower than


Available link rate!

time time

G.Bianchi, G.Neglia, V.Mancuso


client
Start TCP server
connection Latency in TCP
RTT retrieval model
Latency: time elapsing between TCP connection
request Request, and last bit received at client
object

RTT
MSIZE ⎡ MSIZE ⎤
latency = 2 RTT + +⎢ − 1⎥ RTT
C ⎢ MSS ⎥

Number of segments
In which message
is split

time time

G.Bianchi, G.Neglia, V.Mancuso


W=1 case (stop-and-wait)
MSS = 1500 bytes

throughput
10000
Throughput (Kbps)

C=28,8 kbps
C=128 kbps
1000 C=640 kbps
C=10 Mbps

100

10
1 10 100 1000
RTT (ms)

Under-utilization with: 1) high capacity links, 2) large RTT links

G.Bianchi, G.Neglia, V.Mancuso


W=1 case (stop-and-wait)
MSS = 1500 bytes

C=28,8 kbps
100% C=128 kbps
90% C=640 kbps
80% C=10 Mbps
70%
Utilization

60%
50%
40%
30%
20%
10%
0%
1 10 100 1000
RTT (ms)

Under-utilization with: 1) high capacity links, 2) large RTT links

G.Bianchi, G.Neglia, V.Mancuso


Pipelining (W>1) analysis
two cases
W=4
RTT
+1tx
?
W=10

time time

WINDOW SIZE that allows


CONTINUOUS TRANSMISSION
UNDER-SIZED WINDOW:
THROUGHPUT INEFFICIENCY

G.Bianchi, G.Neglia, V.Mancuso


Continuous transmission
Condition in which link rate is fully utilized

MSS MSS
W⋅ > RTT +
C C

Time to transmit Time to receive


W segments Ack of first segment
We may elaborate:
W ⋅ MSS > RTT ⋅ C + MSS ≈ RTT ⋅ C
This means that full link utilization is possible when window size (in bits) is
Greater than the bandwidth (C bit/s) delay (RTT s) product!

G.Bianchi, G.Neglia, V.Mancuso


Bandwidth-delay product
D
C

ÎNetwork: like a pipe


ÎC [bit/s] x D [s] 64Kbps
Ö number of bits “flying” in the
network
Ö number of bits injected in the
network by the tx, before that the
A 15360 (64000x0.240) bits
first bit is received at distance D s
“worm” in the air!!

bandwidth-delay product = bytes # that saturates network pipe

G.Bianchi, G.Neglia, V.Mancuso


Bandwidth-delay product
(usually considered)
D
C RTT=2*D

D
After D/4 s
ÎC [bit/s] x RTT [s]
Ö number of bits “flying” in the
network (partly received)
Ö number of bits injected in the
network by the tx, before that the
ack of the first bit is received

G.Bianchi, G.Neglia, V.Mancuso


Bandwidth-delay product
(usually considered)
D
C

D
After D s
ÎC [bit/s] x RTT [s]
Ö number of bits “flying” in the
network (partly received)
Ö number of bits injected in the
network by the tx, before that the
ack of the first bit is received

G.Bianchi, G.Neglia, V.Mancuso


Bandwidth-delay product
(usually considered)
D
C
C*D/4
bits at the receiver

D Ack
After 5/4*D s
ÎC [bit/s] x RTT [s]
Ö number of bits “flying” in the
network (partly received)
Ö number of bits injected in the
network by the tx, before that the
ack of the first bit is received

G.Bianchi, G.Neglia, V.Mancuso


Bandwidth-delay product
(usually considered)
D
C
C*D
bits at the receiver

D Ack
After 2D s
ÎC [bit/s] x RTT [s]
Ö number of bits “flying” in the
network (partly received)
Ö number of bits injected in the
network by the tx, before that the
ack of the first bit is received

G.Bianchi, G.Neglia, V.Mancuso


Bandwidth-delay product
(usually considered)
D1
C
C*D2
bits at the receiver

D2 Ack
After RTT=D1+D2 s
ÎC [bit/s] x RTT [s]
Ö number of bits “flying” in the
network (partly received)
Ö number of bits injected in the
network by the tx, before that the
ack of the first bit is received

G.Bianchi, G.Neglia, V.Mancuso


Long Fat Networks
LFNs (el-ef-an(t)s): large bandwidth-delay product

NETWORK RTT (ms) rate (kbps) BxD (bytes)


Ethernet 3 10.000 3.750
T1, transUS 60 1.544 11.580
T1 satellite 480 1.544 92.640
T3 transUS 60 45.000 337.500
Gigabit transUS 60 1.000.000 7.500.000

The 65535 (16 bit field in TCP header) maximum window size W
may be a limiting factor!

G.Bianchi, G.Neglia, V.Mancuso


Pipelining (W>1) analysis
⎛ W ⋅ MSS ⎞
thr = min⎜ C , ⎟
2 RTT ⎝ RTT + MSS / C ⎠
Delay analysis (for TCP object retrieval) –
Non continuous transmission
MSIZE
latency = 2 RTT + +
C
⎡ MSIZE ⎤⎛ (W − 1) MSS ⎞
+⎢ − 1⎥ ⎜ RTT − ⎟
⎢ W ⋅ MSS ⎥⎝ C ⎠
RTT
(+1tx) Continuous transmission case:
W MSIZE
latency = 2 RTT +
C
G.Bianchi, G.Neglia, V.Mancuso
Throughput for pipelining
MSS = 1500 bytes

1 Mbps link speed


1200
T h ro u g h p u t (K b p s )

W=1
1000
W=2
800 W=4
W=16
600
400
200
0
0 100 200 300 400 500 600
RTT (ms)

G.Bianchi, G.Neglia, V.Mancuso


Maximum achievable throughput
(assuming infinite speed line…)

1000 W = 65535 bytes


Throughput (Mbps)

100

10

1
0 100 200 300 400 500
RTT (ms)

G.Bianchi, G.Neglia, V.Mancuso

You might also like