
Agenda

1 Transport-layer Services
2 Multiplexing And Demultiplexing
3 Connectionless Transport: UDP
4 Principles Of Reliable Data Transfer
5 Connection-oriented Transport: TCP

Transport services and protocols
 Provide logical communication between app processes running on different hosts
 transport protocol actions in end systems:
• sender: breaks application messages into segments, passes to network layer
• receiver: reassembles segments into messages, passes to application layer
 Two transport protocols available to Internet apps:
• UDP (User Datagram Protocol) and TCP (Transmission Control Protocol)
[Figure: logical end-to-end transport between two hosts, each with application, transport, network, data link and physical layers]
Transport Layer 3-3
Transport Layer Actions

Sender:
 is passed an application-layer message
 determines segment header field values
 creates segment
 passes segment to IP
[Figure: sender and receiver protocol stacks (application, transport, network (IP), link, physical); the transport layer prepends a header Th to the app. msg]
Transport Layer Actions

Receiver:
 receives segment from IP
 checks header values
 extracts application-layer message
 demultiplexes message up to application via socket
[Figure: sender and receiver protocol stacks; the segment (header Th plus app. msg) arrives at the receiver's transport layer]
Internet transport-layer protocols
 TCP: Transmission Control Protocol
• reliable, in-order delivery
• congestion control
• flow control
• connection setup
 UDP: User Datagram Protocol
• unreliable, unordered delivery
• no-frills extension of “best-effort” IP
 services not available:
• delay guarantees
• bandwidth guarantees
[Figure: logical end-to-end transport between two hosts across a path of routers that implement only the network, data link and physical layers]
Internet transport-layer protocols
Establishment
• The client initiates the connection by sending a segment that carries its initial sequence number.
• The server replies with its own initial sequence number and an ACK of the client’s segment; the acknowledgement number is one more than the client’s sequence number.
• After receiving the ACK of its segment, the client sends an acknowledgement of the server’s response.
Release
• Either the server or the client can send a TCP segment with the FIN flag set to 1.
• When the receiving end responds by ACKnowledging the FIN, that direction of TCP communication is closed. The connection is released once both directions have been closed this way.

Internet transport-layer protocols
Bandwidth Management
• TCP uses the concept of window size to manage bandwidth. The window size advertised in each segment tells the sender at the remote end how many bytes of data the receiver at this end can currently accept.
• TCP begins with a slow start phase: it starts with a window of 1 segment and increases the window exponentially after each successful round of communication. For example, the sender first sends 1 segment. When the acknowledgement of this segment is received, the window size is doubled to 2 and 2 segments are sent next. When these are acknowledged, the sender sets the window size to 4, then 8, and so on.
• If an acknowledgement is missed, i.e. data was lost in the network and the retransmission timer expires, the slow start threshold is set to half the current window and the slow start phase starts again from the smallest window. TCP has no NACK; loss is inferred from timeouts or duplicate ACKs.
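The window growth described above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: the window is counted in whole segments, one list entry stands for one round trip, the threshold is ignored, and the class and method names (`SlowStartSketch`, `windows`) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: window sizes in segments, one entry per round trip.
class SlowStartSketch {
    // lossRound is the 0-based round in which a timeout is assumed to occur,
    // or -1 if no loss happens at all.
    static List<Integer> windows(int rounds, int lossRound) {
        List<Integer> result = new ArrayList<>();
        int window = 1;                  // slow start begins with one segment
        for (int round = 0; round < rounds; round++) {
            result.add(window);
            if (round == lossRound) {
                window = 1;              // timeout: restart slow start
            } else {
                window *= 2;             // everything ACKed: double the window
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(windows(5, -1)); // no loss
        System.out.println(windows(6, 3));  // timeout assumed in round 3
    }
}
```

Without loss the window doubles every round; after the assumed timeout it falls back to one segment and grows again.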
Internet transport-layer protocols
Multiplexing
• The technique of combining two or more data streams over one shared channel is called multiplexing.
• When a TCP client initiates a connection with a server, it always addresses a well-known port number that identifies the application process. The client itself uses a randomly chosen port number from the private (ephemeral) port range.
• Using TCP multiplexing, a client host can communicate with a number of different application processes at the same time.
 For example, a client may request a web page while also exchanging other types of data (HTTP, SMTP, FTP etc.), each addressed to a different port.
 This enables the client system to carry multiple connections over a single network link.
How demultiplexing works

 host receives IP datagrams
• each datagram has source IP address, destination IP address
• each datagram carries one transport-layer segment
• each segment has source, destination port number
 host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (32 bits wide):
source port # | dest port #
other header fields
application data (payload)
Connectionless demultiplexing

 recall: created socket has host-local port #:
DatagramSocket mySocket1 = new DatagramSocket(12534);
 recall: when creating datagram to send into UDP socket, must specify
• destination IP address
• destination port #
 when host receives UDP segment:
• checks destination port # in segment
• directs UDP segment to socket with that port #
 IP/UDP datagrams with same dest. port #, but different source IP addresses and/or source port numbers will be directed to same socket at dest
Connectionless demux: example
[Figure: three hosts. Process P3 on the left host, server process P1 in the middle, process P4 on the right host.]
P3: DatagramSocket mySocket2 = new DatagramSocket(9157);
P1: DatagramSocket serverSocket = new DatagramSocket(6428);
P4: DatagramSocket mySocket1 = new DatagramSocket(5775);

Segments exchanged between P3 and the server:
• source port: 9157, dest port: 6428
• source port: 6428, dest port: 9157
Segments exchanged between P4 and the server:
• source port: ?, dest port: ?
• source port: ?, dest port: ?
Connection-oriented demux
 TCP socket identified by 4-tuple:
• source IP address
• source port number
• dest IP address
• dest port number
 demux: receiver uses all four values to direct segment to appropriate socket
 server host may support many simultaneous TCP sockets:
• each socket identified by its own 4-tuple
 web servers have different sockets for each connecting client
• non-persistent HTTP will have different socket for each request
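The difference between the two demultiplexing rules can be sketched as a lookup key. This is a rough illustration, not how any real stack is implemented: the class `DemuxSketch`, the string-encoded keys and the socket labels are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: which header values select the receiving socket.
class DemuxSketch {
    // UDP: only the destination side is used, so the source values are ignored.
    static String udpKey(String srcIp, int srcPort, String dstIp, int dstPort) {
        return dstIp + ":" + dstPort;
    }

    // TCP: all four values of the 4-tuple take part in the lookup.
    static String tcpKey(String srcIp, int srcPort, String dstIp, int dstPort) {
        return srcIp + ":" + srcPort + ">" + dstIp + ":" + dstPort;
    }

    public static void main(String[] args) {
        Map<String, String> tcpSockets = new HashMap<>();
        tcpSockets.put(tcpKey("A", 9157, "B", 80), "socket 1");
        tcpSockets.put(tcpKey("C", 5775, "B", 80), "socket 2");
        tcpSockets.put(tcpKey("C", 9157, "B", 80), "socket 3");
        // Three segments to B, port 80 end up at three different TCP sockets.
        System.out.println(tcpSockets.size());
        // With the UDP rule the same three segments share a single key.
        System.out.println(udpKey("A", 9157, "B", 80).equals(udpKey("C", 5775, "B", 80)));
    }
}
```

The addresses A, B, C and the ports 80, 9157, 5775 are taken from the slides' examples.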

Connection-oriented demux: example

[Figure: server at IP address B runs processes P4, P5, P6; host at IP address A runs P3; host at IP address C runs P2 and P3.]
• server to host A: source IP,port: B,80; dest IP,port: A,9157
• host A to server: source IP,port: A,9157; dest IP,port: B,80
• host C to server: source IP,port: C,5775; dest IP,port: B,80
• host C to server: source IP,port: C,9157; dest IP,port: B,80
three segments, all destined to IP address: B, dest port: 80 are demultiplexed to different sockets
Connection-oriented demux: example
threaded server
[Figure: server at IP address B runs a single process P4; host at IP address A runs P3; host at IP address C runs P2 and P3.]
• server to host A: source IP,port: B,80; dest IP,port: A,9157
• host A to server: source IP,port: A,9157; dest IP,port: B,80
• host C to server: source IP,port: C,5775; dest IP,port: B,80
• host C to server: source IP,port: C,9157; dest IP,port: B,80
UDP: User Datagram Protocol [RFC 768]
• “no frills,” “bare bones” Internet transport protocol
• “best effort” service, UDP segments may be:
  • lost
  • delivered out-of-order to app
• connectionless:
  • no handshaking between UDP sender, receiver
  • each UDP segment handled independently of others
 UDP use:
  • streaming multimedia apps (loss tolerant, rate sensitive)
  • DNS
  • SNMP
 reliable transfer over UDP:
  • add reliability at application layer
  • application-specific error recovery!
UDP Data Transfer

UDP: segment header
UDP segment format (32 bits wide):
source port # | dest port #
length | checksum
application data (payload)
• length: in bytes of UDP segment, including header

why is there a UDP?
 no connection establishment (which can add delay)
 simple: no connection state at sender, receiver
 small header size
 no congestion control: UDP can blast away as fast as desired
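To make the header layout concrete, the following sketch packs the four 16-bit header fields in front of a payload. It is an illustration only: `UdpHeaderSketch` and its methods are hypothetical names, and the checksum is simply passed in by the caller rather than computed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the UDP segment layout: four 16-bit fields, then data.
class UdpHeaderSketch {
    static final int HEADER_BYTES = 8; // 4 fields x 2 bytes

    static byte[] buildSegment(int srcPort, int dstPort, int checksum, byte[] payload) {
        int length = HEADER_BYTES + payload.length;   // length includes the header
        ByteBuffer buf = ByteBuffer.allocate(length); // big-endian by default
        buf.putShort((short) srcPort);
        buf.putShort((short) dstPort);
        buf.putShort((short) length);
        buf.putShort((short) checksum);               // supplied by caller in this sketch
        buf.put(payload);
        return buf.array();
    }

    // Reads an unsigned 16-bit field at the given byte offset.
    static int field(byte[] segment, int offset) {
        return ByteBuffer.wrap(segment).getShort(offset) & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] seg = buildSegment(9157, 6428, 0, "hi".getBytes(StandardCharsets.US_ASCII));
        System.out.println("dest port = " + field(seg, 2));
        System.out.println("length    = " + field(seg, 4));
    }
}
```

The small, fixed header is visible here: a two-byte payload yields a ten-byte segment.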

UDP: Transport Layer Actions

[Figure: SNMP client and SNMP server, each with application, transport (UDP), network (IP), link and physical layers]
UDP: Transport Layer Actions

UDP sender actions (SNMP client):
 is passed an application-layer message
 determines UDP segment header field values
 creates UDP segment
 passes segment to IP
[Figure: SNMP client and SNMP server stacks; the client's transport (UDP) layer prepends a UDP header (UDPh) to the SNMP msg]
UDP: Transport Layer Actions

UDP receiver actions (SNMP server):
 receives segment from IP
 checks UDP checksum header value
 extracts application-layer message
 demultiplexes message up to application via socket
[Figure: SNMP client and SNMP server stacks; the segment (UDPh plus SNMP msg) arrives at the server's transport (UDP) layer]
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment

sender:
• treat segment contents, including header fields, as sequence of 16-bit integers
• checksum: addition (one’s complement sum) of segment contents
• sender puts checksum value into UDP checksum field
receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  • NO - error detected
  • YES - no error detected. But maybe errors nonetheless? More later ….
UDP checksum
Goal: detect errors (e.g., flipped bits) in transmitted segment

             1st number   2nd number   sum
Transmitted:      5            6        11
Received:         4            6        11

The receiver-computed checksum (4 + 6 = 10) does not equal the sender-computed checksum as received (11), so an error is detected.
Internet checksum
Goal: detect errors (e.g., flipped bits) in transmitted segment
sender:
 treat contents of UDP segment (including UDP header fields and IP addresses) as sequence of 16-bit integers
 checksum: addition (one’s complement sum) of segment content
 checksum value put into UDP checksum field
receiver:
 compute checksum of received segment
 check if computed checksum equals checksum field value:
• not equal - error detected
• equal - no error detected. But maybe errors nonetheless? More later ….
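A minimal sketch of the one's complement checksum over 16-bit words follows. It assumes the segment has already been split into 16-bit integers, and it stores the complement of the sum in the checksum field, as the Internet checksum does; the names `ChecksumSketch`, `checksum` and `verify` are made up for this example.

```java
// Hypothetical sketch of the Internet checksum over 16-bit words.
class ChecksumSketch {
    // One's complement sum: add 16-bit words and wrap any carry back around.
    static int onesComplementSum(int[] words) {
        int sum = 0;
        for (int w : words) {
            sum += w & 0xFFFF;
            sum = (sum & 0xFFFF) + (sum >>> 16); // end-around carry
        }
        return sum;
    }

    // Value placed in the checksum field: complement of the sum.
    static int checksum(int[] words) {
        return ~onesComplementSum(words) & 0xFFFF;
    }

    // Receiver: recompute over the received words and compare with the field.
    static boolean verify(int[] receivedWords, int checksumField) {
        return checksum(receivedWords) == checksumField;
    }

    public static void main(String[] args) {
        int[] sent = {0xE666, 0xD555};
        int field = checksum(sent);
        System.out.println(Integer.toHexString(field));
        System.out.println(verify(new int[] {0xE667, 0xD555}, field)); // one flipped bit
        System.out.println(verify(new int[] {0xE667, 0xD554}, field)); // two compensating flips
    }
}
```

The last call shows why "no error detected" is not a guarantee: two bit flips that cancel out in the sum leave the checksum unchanged.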
Principles of reliable data transfer

[Figure: reliable service abstraction. The sending process passes data into a reliable channel at the transport layer, which delivers it to the receiving process.]
Principles of reliable data transfer

[Figure, left: reliable service abstraction. Sending and receiving processes exchange data over a reliable channel.]
[Figure, right: reliable service implementation. The sender-side and receiver-side of the reliable data transfer protocol sit in the transport layer and communicate over an unreliable channel provided by the network layer.]
Principles of reliable data transfer

Sender, receiver do not know the “state” of each other, e.g., was a message received?
 unless communicated via a message
[Figure: reliable service implementation. The sender-side and receiver-side of the reliable data transfer protocol communicate over an unreliable channel.]
Reliable data transfer protocol (rdt): interfaces
• rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
• udt_send(): called by rdt to transfer packet over unreliable channel to receiver
• rdt_rcv(): called when packet arrives on receiver side of channel
• deliver_data(): called by rdt to deliver data to upper layer
[Figure: the sending process calls rdt_send() with data; the sender-side implementation of the rdt protocol calls udt_send() to pass a packet (header plus data) into the unreliable channel; rdt_rcv() is invoked at the receiver-side implementation, which calls deliver_data() to hand data to the receiving process. Communication over the unreliable channel is bi-directional.]
rdt2.0: channel with bit errors
 underlying channel may flip bits in packet
• checksum to detect bit errors
 the question: how to recover from errors:
• acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
• negative acknowledgements (NAKs): receiver explicitly tells sender that pkt
had errors
• sender retransmits pkt on receipt of NAK
 new mechanisms:
• error detection
• feedback: control msgs (ACK,NAK) from receiver to sender

stop and wait: sender sends one packet, then waits for receiver response
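The stop-and-wait behaviour with ACK/NAK feedback can be simulated in a few lines. This is a sketch under stated assumptions, not the protocol's finite state machine: the channel corrupts exactly the transmissions listed by the caller, ACK/NAK feedback itself always arrives intact, messages are non-empty, and all names (`Rdt20Sketch`, `send`, `receiverAccepts`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation of rdt2.0: checksum, ACK/NAK, retransmit on NAK.
class Rdt20Sketch {
    static int checksum(String data) {
        int sum = 0;
        for (char c : data.toCharArray()) sum = (sum + c) & 0xFFFF;
        return sum;
    }

    // Receiver: true means ACK (packet OK), false means NAK (bit errors).
    static boolean receiverAccepts(String data, int checksum) {
        return checksum(data) == checksum;
    }

    // Sends messages one at a time; transmissions whose overall index is in
    // corrupted get their first character altered by the simulated channel.
    // Returns the total number of transmissions needed.
    static int send(List<String> messages, Set<Integer> corrupted, List<String> delivered) {
        int transmissions = 0;
        for (String msg : messages) {
            boolean acked = false;
            while (!acked) {                       // stop and wait for feedback
                int sum = checksum(msg);
                String onWire = corrupted.contains(transmissions)
                        ? "#" + msg.substring(1) : msg;
                transmissions++;
                acked = receiverAccepts(onWire, sum);
                if (acked) delivered.add(onWire);  // deliver only intact data
            }
        }
        return transmissions;
    }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        Set<Integer> corrupted = new HashSet<>(Arrays.asList(0, 2));
        int n = send(Arrays.asList("hello", "world"), corrupted, delivered);
        System.out.println(n + " transmissions, delivered " + delivered);
    }
}
```

Each corrupted transmission costs exactly one retransmission, because the sender never has more than one packet outstanding.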
TCP: Overview RFCs: 793,1122,1323, 2018, 2581

• point-to-point:
  • one sender, one receiver
• reliable, in-order byte stream:
  • no “message boundaries”
• pipelined:
  • TCP congestion and flow control set window size
 full duplex data:
  • bi-directional data flow in same connection
  • MSS: maximum segment size
 connection-oriented:
  • handshaking (exchange of control msgs) inits sender, receiver state before data exchange
 flow controlled:
  • sender will not overwhelm receiver
TCP segment structure
TCP segment format (32 bits wide):
source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | receive window
checksum | Urg data pointer
options (variable length)
application data (variable length)

• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
TCP seq. numbers, ACKs
sequence numbers:
• byte stream “number” of first byte in segment’s data
acknowledgements:
• seq # of next byte expected from other side
• cumulative ACK
Q: how receiver handles out-of-order segments
• A: TCP spec doesn’t say, up to implementor
[Figure: outgoing segment from sender and incoming segment to sender, each showing source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer.]
[Figure: sender sequence number space with window size N: sent ACKed; sent, not-yet ACKed (“in-flight”); usable but not yet sent; not usable.]
TCP sequence numbers, ACKs
simple telnet scenario (Host A, Host B):
1. User types ‘C’ at Host A. A to B: Seq=42, ACK=79, data = ‘C’
2. Host B ACKs receipt of ‘C’, echoes back ‘C’. B to A: Seq=79, ACK=43, data = ‘C’
3. Host A ACKs receipt of echoed ‘C’. A to B: Seq=43, ACK=80
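The numbers in the telnet scenario follow one rule: the ACK number is the sequence number of the received segment plus the number of data bytes it carried. A tiny sketch, assuming in-order arrival and ignoring 32-bit wraparound (the helper name `ackFor` is made up):

```java
// Hypothetical helper: ACK number for an in-order segment.
class SeqAckSketch {
    // seq is the byte-stream number of the first data byte, length the data size.
    static int ackFor(int seq, int length) {
        return seq + length; // next byte expected from the other side
    }

    public static void main(String[] args) {
        // Host A sends 'C' (1 byte) with Seq=42, so Host B replies with ACK=43.
        System.out.println(ackFor(42, 1));
        // Host B echoes 'C' (1 byte) with Seq=79, so Host A replies with ACK=80.
        System.out.println(ackFor(79, 1));
    }
}
```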


TCP Sender event
event: data received from application
 create segment with seq #
 seq # is byte-stream number of first data byte in segment
 start timer if not already running
• think of timer as for oldest unACKed segment
• expiration interval: TimeOutInterval
event: timeout
 retransmit segment that caused timeout
 restart timer
event: ACK received
 if ACK acknowledges previously unACKed segments
• update what is known to be ACKed
• start timer if there are still unACKed segments
TCP Receiver: ACK generation [RFC 5681]
Event at receiver → TCP receiver action
• arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed → delayed ACK: wait up to 500ms for next segment; if no next segment, send ACK
• arrival of in-order segment with expected seq #; one other segment has ACK pending → immediately send single cumulative ACK, ACKing both in-order segments
• arrival of out-of-order segment with higher-than-expected seq #; gap detected → immediately send duplicate ACK, indicating seq # of next expected byte
• arrival of segment that partially or completely fills gap → immediately send ACK, provided that segment starts at lower end of gap
TCP: retransmission scenarios
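lost ACK scenario (Host A, Host B):
• A: SendBase=92; A to B: Seq=92, 8 bytes of data
• B to A: ACK=100 (lost, X)
• timeout at A; A to B: Seq=92, 8 bytes of data (retransmission)
• B to A: ACK=100; A: SendBase=100

premature timeout (Host A, Host B):
• A: SendBase=92; A to B: Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
• B to A: ACK=100, then ACK=120
• timeout at A before the ACKs arrive; A to B: Seq=92, 8 bytes of data (retransmission)
• the ACKs arrive at A: SendBase=100, then SendBase=120
• B sends cumulative ACK for 120: ACK=120; A: SendBase=120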
TCP: retransmission scenarios
cumulative ACK covers for earlier lost ACK (Host A, Host B):
• A to B: Seq=92, 8 bytes of data
• A to B: Seq=100, 20 bytes of data
• B to A: ACK=100 (lost, X)
• B to A: ACK=120
• A to B: Seq=120, 15 bytes of data
TCP fast retransmit
if sender receives 3 additional ACKs for same data (“triple duplicate ACKs”), resend unACKed segment with smallest seq #
 likely that unACKed segment lost, so don’t wait for timeout

Example (Host A, Host B):
• A to B: Seq=92, 8 bytes of data
• A to B: Seq=100, 20 bytes of data (lost, X)
• B to A: ACK=100, four times (the original ACK plus three duplicates)
• A to B: Seq=100, 20 bytes of data (retransmitted before the timeout expires)

Receipt of three duplicate ACKs indicates 3 segments received after a missing segment, so a lost segment is likely. So retransmit!
TCP fast retransmit
 time-out period often relatively long:
• long delay before resending lost packet
 detect lost segments via duplicate ACKs.
• sender often sends many segments back-to-back
• if segment is lost, there will likely be many duplicate ACKs.
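Duplicate-ACK detection at the sender amounts to a small counter. The sketch below assumes the usual threshold of three duplicates and leaves out timers and the actual retransmission; the class `FastRetransmitSketch` and its `onAck` method are hypothetical names.

```java
// Hypothetical sketch: count duplicate ACKs and signal a fast retransmit.
class FastRetransmitSketch {
    private int lastAck = -1;
    private int dupCount = 0;

    // Returns the seq # to retransmit, or -1 if nothing should be resent yet.
    int onAck(int ackNum) {
        if (ackNum == lastAck) {
            dupCount++;
            if (dupCount == 3) {
                return ackNum;   // third duplicate: resend segment starting here
            }
        } else {
            lastAck = ackNum;    // new data acknowledged: reset the counter
            dupCount = 0;
        }
        return -1;
    }

    public static void main(String[] args) {
        FastRetransmitSketch sender = new FastRetransmitSketch();
        int[] acks = {100, 100, 100, 100};
        for (int ack : acks) {
            System.out.println("ACK=" + ack + " -> retransmit " + sender.onAck(ack));
        }
    }
}
```

Only the fourth ACK=100 (the third duplicate) triggers the retransmission of the segment with Seq=100.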

TCP flow control
Q: What happens if network layer delivers data faster than application layer removes data from socket buffers?
flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure: receiver protocol stack. Segments from sender pass up through IP code and TCP code into the TCP socket receiver buffers in the OS; the application process may remove data from the TCP socket buffers slower than the TCP receiver is delivering (sender is sending).]
Connection Management
before exchanging data, sender/receiver “handshake”:
• agree to establish connection (each knowing the other willing to establish connection)
• agree on connection parameters

[Figure: client and server applications connected through the network; each side holds the same connection state and variables.]
connection state: ESTAB
connection variables:
• seq # client-to-server, server-to-client
• rcvBuffer size at server, client

client: Socket clientSocket = new Socket("hostname","port number");
server: Socket connectionSocket = welcomeSocket.accept();
Agreeing to establish a connection

2-way handshake:
• human analogy: “Let’s talk” (ESTAB), answered by “OK” (ESTAB)
• protocol: client chooses x and sends req_conn(x); server enters ESTAB and replies acc_conn(x); client enters ESTAB

Q: will 2-way handshake always work in network?
• variable delays
• retransmitted messages (e.g. req_conn(x)) due to message loss
• message reordering
• can’t “see” other side
Agreeing to establish a connection
2-way handshake failure scenarios:

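Scenario 1: half open connection (no client!)
• client chooses x, sends req_conn(x); server enters ESTAB, replies acc_conn(x); client enters ESTAB
• client retransmits req_conn(x)
• connection x completes; client terminates, server forgets x
• the retransmitted req_conn(x) arrives; server enters ESTAB again, but there is no client

Scenario 2: duplicate data accepted
• client chooses x, sends req_conn(x); server enters ESTAB, replies acc_conn(x); client enters ESTAB
• client sends data(x+1); server accepts data(x+1)
• client retransmits req_conn(x) and data(x+1)
• connection x completes; client terminates, server forgets x
• the retransmitted req_conn(x) arrives; server enters ESTAB again
• the retransmitted data(x+1) arrives; server accepts data(x+1) a second time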
TCP 3-way handshake

client state: LISTEN, SYNSENT, ESTAB
server state: LISTEN, SYN RCVD, ESTAB

1. client: choose init seq num, x; send TCP SYN msg: SYNbit=1, Seq=x (client enters SYNSENT)
2. server: choose init seq num, y; send TCP SYNACK msg, acking SYN: SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1 (server enters SYN RCVD)
3. client: received SYNACK(x) indicates server is live; send ACK for SYNACK: ACKbit=1, ACKnum=y+1; this segment may contain client-to-server data (client enters ESTAB)
4. server: received ACK(y) indicates client is live (server enters ESTAB)
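The field values of the three handshake segments can be written down as a small table in code. This is only a sketch of the numbers on the slide: the class `HandshakeSketch`, the row layout and the use of -1 for values the slide does not show are assumptions of this example.

```java
// Hypothetical sketch: flag and number fields of the three handshake segments.
class HandshakeSketch {
    // Each row is {SYNbit, ACKbit, Seq, ACKnum}; -1 means "not shown on the slide".
    static int[][] handshake(int x, int y) {
        return new int[][] {
            {1, 0, x, -1},       // client to server: SYN
            {1, 1, y, x + 1},    // server to client: SYNACK, acking the SYN
            {0, 1, -1, y + 1},   // client to server: ACK for the SYNACK
        };
    }

    public static void main(String[] args) {
        for (int[] seg : handshake(42, 79)) {
            System.out.println("SYNbit=" + seg[0] + " ACKbit=" + seg[1]
                    + " Seq=" + seg[2] + " ACKnum=" + seg[3]);
        }
    }
}
```

With x=42 and y=79 the server acknowledges 43 and the client acknowledges 80.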

TCP: closing a connection
 client, server each close their side of connection
• send TCP segment with FIN bit = 1
 respond to received FIN with ACK
• on receiving FIN, ACK can be combined with own FIN
 simultaneous FIN exchanges can be handled

TCP: closing a connection
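client state: ESTAB, FIN_WAIT_1, FIN_WAIT_2, TIMED_WAIT, CLOSED
server state: ESTAB, CLOSE_WAIT, LAST_ACK, CLOSED

1. client: clientSocket.close(); sends FINbit=1, seq=x; enters FIN_WAIT_1 (can no longer send but can receive data)
2. server: sends ACKbit=1; ACKnum=x+1; enters CLOSE_WAIT (can still send data)
3. client: enters FIN_WAIT_2 (wait for server close)
4. server: sends FINbit=1, seq=y; enters LAST_ACK (can no longer send data)
5. client: sends ACKbit=1; ACKnum=y+1; enters TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
6. server: on receiving the ACK, enters CLOSED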
Principles of congestion control
congestion:
• informally: “too many sources sending too much data too fast for
network to handle”
• different from flow control!
• manifestations:
• lost packets (buffer overflow at routers)
• long delays (queueing in router buffers)
• a top-10 problem!

Principles of congestion control
Error Control and Flow Control
• TCP uses port numbers to determine which application process a data segment must be handed to. Along with that, it uses sequence numbers to synchronize itself with the remote host. All data segments are sent and received with sequence numbers.
• The sender learns which data the receiver has received, up to the last in-order byte, when it gets an ACK. The receiver knows about the last segment sent by the sender by referring to the sequence number of the most recently received segment.
• If the sequence number of a received segment does not match the sequence number the receiver was expecting, the receiver sends a duplicate ACK for the byte it still expects (TCP has no NACK). Whether the out-of-order segment is buffered or discarded is left to the implementation.
• If two segments arrive with the same sequence number, the TCP timestamp value (when the timestamps option is in use) is compared to make a decision.
Principles of congestion control
Congestion Control
When a large amount of data is fed to a system which is not capable of handling it, congestion occurs. TCP controls congestion by means of a window mechanism: the sender maintains a congestion window that limits how much unacknowledged data it may have in flight. TCP may use three algorithms for congestion control:
 Additive Increase, Multiplicative Decrease
 Slow Start
 Timeout Reaction
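Additive increase, multiplicative decrease can be sketched as a single update rule. The example assumes the window is counted in whole segments, grows by one segment per loss-free round trip and is halved when a loss is detected; `AimdSketch` and `next` are hypothetical names, and slow start and timeouts are left out.

```java
// Hypothetical sketch of the AIMD window update, in segments per round trip.
class AimdSketch {
    static int next(int window, boolean lossDetected) {
        if (lossDetected) {
            return Math.max(1, window / 2); // multiplicative decrease, never below 1
        }
        return window + 1;                  // additive increase
    }

    public static void main(String[] args) {
        int window = 8;
        boolean[] losses = {false, false, true, false, false};
        for (boolean loss : losses) {
            window = next(window, loss);
            System.out.println(window);
        }
    }
}
```

Repeating this rule produces the familiar sawtooth: slow linear growth interrupted by sharp cuts whenever loss is detected.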
Timer Management

TCP uses different types of timers to control and manage various tasks:
Keep-alive timer:
 This timer is used to check whether an idle connection is still valid.
 When the keep-alive time expires, the host sends a probe to check if the connection still exists.
Retransmission timer:
 This timer tracks data that has been sent but not yet acknowledged.
 If the acknowledgement of sent data is not received within the retransmission time, the data segment is sent again.
Persist timer:
 A TCP session can be paused by either host by advertising a window size of 0.
 To resume the session, that host needs to advertise a larger window size.
 If this segment never reaches the other end, both ends may wait for each other indefinitely.
 When the persist timer expires, the waiting host sends a probe so that the other end reports its current window size.
 The persist timer helps avoid deadlocks in communication.
Timed-Wait:
 After releasing a connection, the host that closed first waits for a timed-wait period before terminating the connection completely.
 This makes sure that the other end has received the acknowledgement of its connection termination request.
 The timed-wait period can be a maximum of 240 seconds (4 minutes).

Crash Recovery
TCP is a very reliable protocol. It assigns a sequence number to each byte sent in a segment. It provides a feedback mechanism: when a host receives a segment, it is bound to ACK that segment with the next sequence number expected (if it is not the last segment).
When a TCP server crashes in the middle of communication and restarts its process, its connection state is lost. Segments that still arrive for the old connections are answered with a reset (RST), so the clients must open a new connection and resend any data that was never acknowledged.