
Computer Networks (CS F303)

BITS Pilani Virendra Singh Shekhawat


Department of Computer Science and Information Systems
Pilani Campus
Today’s Agenda

• Course Overview
• Course Administration
• What is network?
• What is Internet?
• Network Structure
– Edge, Access Network (Physical Media), Network Core
• Circuit Switching and Packet Switching

Computer Networks CS F303 BITS Pilani, Pilani Campus
Course Objective

• To get familiar with the principles and working of state-of-the-art networking
– Routing, Transport protocols, addressing, naming etc.
– Design of network and services
• Learn how communication networks are put together
–Mechanisms, Algorithms, Technology components
• To understand network internals in a hands-on way
– Writing simple network applications, understanding and analyzing
working principles of protocols
Course Overview

• Internet Architecture and Computer Network Primitives


• Network Applications (Application Layer)
• End to End Data Transfer (Transport Layer)
• Data Routing and Forwarding (Network Layer)
• Access Networks & LANs (Link Layer)
• Communication Channels (Physical Layer)
• Wireless and Mobile Networks

Course Administration
• Instruction delivery
– Lecture classes
• 12:00 – 12:50 pm [Tue, Th] and 5:00 – 5:50 PM [Fri]
– Lab classes
• Start from the first week of Feb (details will be posted on MS Teams)
• Course page Information
– Lectures and course material will be available at MS Teams
– For assessments NALANDA will be used (https://nalanda-aws.bits-pilani.ac.in)
• Evaluation Plan
– Mid Semester Test @25%
– Quiz (Two) @20% [10% each]
– Lab Assessment @20%
– Comprehensive exam @35%
Text Book

What is a Network?

• A shared infrastructure that allows distributed users to communicate with each other
– People, devices, …
– By means of voice, video, text, …
– e.g., telephone network, cable TV network, satellite network, military network

• Basic building blocks are
– Nodes (hosts and forwarding nodes) and links


What is the Internet?

• The Internet is a network of networks
– Interconnected networks → Internet


How is the Internet different from other networks?
• It enables communication between diverse applications on diverse devices over diverse infrastructure
Network Structure

• Network edge: applications and hosts
• Network core: interconnected routers
• Access networks: the network that physically connects a host to its first (edge) router
• Physical media: wired, wireless communication links
Physical Media-Guided

• Twisted pair
– Two insulated copper wires
– Transmission rates supported are 100 Mbps, 1 Gbps, 10 Gbps
• Coaxial Cable
– Two concentric copper conductors
– Multiple channels on cable
• Fiber Optic Cable
– Glass fiber carrying light pulses, each pulse a bit
– High speed operation (10 Gbps to 100 Gbps)
– Low error rate
Physical Media-Unguided

• Radio link types:
– Terrestrial microwave: e.g., up to 45 Mbps channels
– LAN (e.g., WiFi): 11 Mbps, 54 Mbps
– Wide-area (e.g., cellular): 3G ~ a few Mbps, 4G ~ 100 Mbps
– Satellite: Kbps to 45 Mbps channel (or multiple smaller channels)
Access Networks Example

Internet structure: network of networks

Question: given millions of access ISPs, how to connect them together?


[Figure: many separate access nets with no links between them]
Internet structure: network of networks
Option: connect each access ISP to every other access ISP?
Connecting each access ISP to every other access ISP directly doesn’t scale: O(N²) connections.
[Figure: access nets joined pairwise by direct links]
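A quick sketch of why the full mesh does not scale, assuming every pair of access ISPs needs its own bidirectional link. The function name is illustrative and not from the slides.

```python
def full_mesh_links(n: int) -> int:
    """Links needed to connect n access ISPs pairwise: n(n-1)/2, i.e. O(N^2)."""
    return n * (n - 1) // 2


# The link count grows quadratically with the number of access ISPs.
for n in (10, 1_000, 1_000_000):
    print(f"{n:>9} access ISPs -> {full_mesh_links(n):,} direct links")
```

With a single transit ISP, by contrast, each access ISP needs only one link, so the count grows linearly.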
Internet structure: network of networks
Option: connect each access ISP to a global transit ISP? Customer
and provider ISPs have economic agreement.
[Figure: every access net connected to a single global ISP]
Internet structure: network of networks

A single global ISP does not scale, so there are multiple global ISPs.
[Figure: access nets attached to several global ISPs (ISP A, ISP B, ISP C)]
Internet structure: network of networks
Multiple global ISPs must be interconnected
[Figure: ISP A, ISP B and ISP C interconnected through Internet exchange points (IXPs) and a peering link, with access nets attached to them]
Internet structure: network of networks

… and regional networks may arise to connect access nets to ISPs
[Figure: a regional net connecting several access nets to ISP A, ISP B and ISP C, which are joined by IXPs]
Internet structure: network of networks
… and content provider networks (e.g., Google, Microsoft, Akamai ) may run
their own network, to bring services, content close to end users
[Figure: a content provider network attached to the ISPs and IXPs, alongside the regional net and the access nets]
Internet structure: network of networks

[Figure: hierarchy with Tier 1 ISPs and Google at the top, IXPs and regional ISPs in the middle, and access ISPs at the bottom]

• At the center: a small number of well-connected large networks
– “Tier-1” commercial ISPs (e.g., Level 3, Sprint, AT&T, NTT) with national and international coverage
– Content provider networks (e.g., Google): private networks that connect their data centers to the Internet, often bypassing tier-1 and regional ISPs


Tier-1 ISP: e.g., Sprint

[Figure: a Sprint point-of-presence (POP) with links to/from the backbone, peering links, and links to/from customers]


The Network Core

• Mesh of interconnected routers
• How is data transferred through the network?
– Circuit switching: dedicated circuit per call, e.g., telephone network
– Packet switching: data sent through the network in discrete “chunks”, e.g., the Internet
Network Core: Circuit Switching

End-to-end resources reserved for “call”
• Dedicated resources: no sharing
• Circuit-like (guaranteed) performance
• Call setup required
• Link bandwidth is divided into “pieces”
– Frequency division
– Time division
Circuit Switching: FDM and TDM
[Figure: FDM and TDM examples for 4 users, each plotted as frequency versus time]
Circuit Switch: Numerical example

• How long does it take to send a file of 640,000 bits from host A to host B
over a circuit-switched network?
– All links are 1.536 Mbps
– Each link uses TDM with 24 slots/sec
– 500 msec to establish end-to-end circuit
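
One way to work this example, assuming each circuit is given exactly one of the 24 TDM slots and that propagation and queueing delays are ignored (both are assumptions, not stated on the slide):

```python
def circuit_transfer_time(file_bits: float, link_bps: float,
                          tdm_slots: int, setup_s: float) -> float:
    """Setup time plus time to push the file through one TDM slot's share of the link."""
    circuit_bps = link_bps / tdm_slots      # rate available to a single circuit
    return setup_s + file_bits / circuit_bps


total_s = circuit_transfer_time(
    file_bits=640_000, link_bps=1_536_000, tdm_slots=24, setup_s=0.5
)
print(f"circuit rate: {1_536_000 / 24 / 1000:.0f} kbps, total time: {total_s} s")
```

Under these assumptions the result does not depend on how many links the circuit crosses, because the reserved rate is the same on every link.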

Network Core: Packet Switching

Host sending function:
• Takes application message
• Breaks it into smaller chunks, known as packets, of length L bits
• Transmits packets into the access network at transmission rate R (aka link bandwidth)
• Store and forward

Transmission delay = L (bits) / R (bits/sec)

[Figure: host sending two packets, L bits each, over a link of transmission rate R]
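A small sketch of the L/R formula and of what store-and-forward implies over several hops. It assumes equal-rate links and ignores propagation, processing and queueing delay; the numeric values are illustrative only.

```python
def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Time to push one packet of L bits onto a link of rate R: L / R seconds."""
    return packet_bits / rate_bps


def store_and_forward_delay(packet_bits: float, rate_bps: float, num_links: int) -> float:
    """One packet over num_links equal-rate links: each hop must receive the
    whole packet before forwarding it, so the L/R cost is paid once per link."""
    return num_links * transmission_delay(packet_bits, rate_bps)


L_bits, R_bps = 7_500_000, 1_500_000          # illustrative values
print(transmission_delay(L_bits, R_bps))      # one link
print(store_and_forward_delay(L_bits, R_bps, num_links=2))
```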


End-to-End Delay

[Figure: a packet travelling between hosts through routers, showing the four components of delay: nodal processing, queueing, transmission, and propagation]
Caravan Analogy [.1]
[Figure: ten-car caravan approaching two toll booths that are 100 km apart]
• Cars “propagate” at 100 km/hr
• Toll booth takes 12 sec to service a car (car transmission time)
• Car is analogous to bit; caravan is analogous to packet
• Question:
– How long until caravan is lined up before 2nd toll booth?
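
The question can be worked as transmission time plus propagation time. A minimal sketch, assuming the clock starts when the first car reaches the first booth and that cars drive at constant speed:

```python
def caravan_time_minutes(cars: int, service_s: float,
                         distance_km: float, speed_kmh: float) -> float:
    """Minutes until the whole caravan is lined up at the next toll booth."""
    transmission_min = cars * service_s / 60        # booth "transmits" every car
    propagation_min = distance_km / speed_kmh * 60  # last car drives to next booth
    return transmission_min + propagation_min


print(caravan_time_minutes(cars=10, service_s=12, distance_km=100, speed_kmh=100))
```

The two terms correspond to the transmission delay and the propagation delay of a packet on a link.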

Caravan analogy [..2]
[Figure: the same ten-car caravan and two toll booths, 100 km apart]

• Cars now “propagate” at 1000 km/hr
• Toll booth now takes 1 min to service a car
Queuing Delay
• R = link bandwidth (bps)
• L = packet length (bits)
• a = average packet arrival rate (packets/sec)

traffic intensity = La/R

• La/R ~ 0: average queueing delay small
• La/R → 1: delays become large
• La/R > 1: more “work” arriving than can be serviced, average delay infinite!
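A short sketch of the traffic-intensity calculation. The packet size, arrival rates and link rate below are made-up values chosen only to show the three regimes.

```python
def traffic_intensity(packet_bits: float, arrivals_per_sec: float, rate_bps: float) -> float:
    """La/R: bits arriving per second divided by bits the link can send per second."""
    return packet_bits * arrivals_per_sec / rate_bps


# Hypothetical 12,000-bit packets on a 1 Mbps link at three arrival rates.
for a in (5, 80, 100):
    print(f"a = {a:>3} pkt/s -> La/R = {traffic_intensity(12_000, a, 1_000_000):.2f}")
```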
“Real” Internet delays and routes

• What do “real” Internet delay & loss look like?
• Traceroute program: provides delay measurement from source to each router along the end-to-end Internet path towards the destination. For all i:
– Sends three packets that will reach router i on path towards destination
– Router i will return packets to sender
– Sender times interval between transmission and reply
– Read RFC 1393 for more detail
• http://traceroute.org

[Figure: source sending 3 probes to each successive router along the path]
Packet switching versus circuit switching

Packet switching allows more users to use network!


Example:
• 1 Mb/s link
• Each user:
– 100 kb/s when “active”
– Active 10% of time

• Circuit switching:
– How many users are supported?
• Packet switching:
– With 35 users, the probability that more than 10 are active at the same time is about 0.0004

Exercise: How did we get the value 0.0004?

[Figure: N users sharing a 1 Mbps link]
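One way to approach the exercise is a binomial model, assuming the 35 users are active independently of each other, each with probability 0.1. The sketch below also answers the circuit-switching question by dividing the link rate by the per-user rate.

```python
from math import comb


def prob_more_than(k: int, n: int, p: float) -> float:
    """P(X > k) for X ~ Binomial(n, p): more than k of n users active at once."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))


circuit_users = 1_000_000 // 100_000        # each circuit reserves 100 kb/s
overload = prob_more_than(10, 35, 0.1)      # packet switching with 35 users

print(f"circuit switching supports {circuit_users} users")
print(f"P(more than 10 of 35 active) = {overload:.6f}")
```

The probability comes out at roughly 0.0004, which is why packet switching can admit many more users than the reserved-circuit limit with little risk of overload.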


Performance Measure Parameters of Networks
• Delay
• Packet loss
• Throughput
– Number of bits transferred per unit time
• Instantaneous throughput
– e.g., P2P file sharing applications display instantaneous throughput during downloads
• Average throughput
Exercise

Layered (Modular) Network Model (OSI)

Each layer performs specific operations

The implementation of a layer can change as long as its interfaces stay intact
Layering of Airline Functionality

Departure airport     Intermediate air-traffic control centers   Arrival airport       Layer
ticket (purchase)                                                 ticket (complain)     ticket
baggage (check)                                                   baggage (claim)       baggage
gates (load)                                                      gates (unload)        gate
runway (takeoff)                                                  runway (land)         takeoff/landing
airplane routing      airplane routing                            airplane routing      airplane routing

Layers: each layer implements a service
– Via its own internal-layer actions
– Relying on services provided by the layer below
Internet Hourglass Architecture

• Need to interconnect many existing networks
• Hide underlying technology from applications
• Decisions:
– Network provides minimal functionality
– “Narrow waist”
– Best-effort service
– Tradeoff: no assumptions, no guarantees

[Figure: hourglass. Applications (email, WWW, phone, ...) over SMTP, HTTP, RTP, ... over TCP, UDP, ... over IP (the narrow waist) over Ethernet, PPP, ... over CSMA, async, SONET, ... over technology (copper, fiber, radio, ...)]
Layer Encapsulation

[Figure: encapsulation along the path source → switch → router → destination]

Source (each layer adds its header on the way down):
– application: message M
– transport: segment Ht M
– network: datagram Hn Ht M
– link: frame Hl Hn Ht M
– physical
Switch: link and physical layers only (handles Hl Hn Ht M)
Router: network, link and physical layers (handles Hn Ht M)
Destination (each layer strips its header on the way up):
– link: Hl Hn Ht M
– network: Hn Ht M
– transport: Ht M
– application: M
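The header nesting can be illustrated with a toy sketch. The byte strings standing in for Ht, Hn and Hl are placeholders, not real protocol headers.

```python
# Placeholder headers in the order they are added at the source:
# transport (Ht), then network (Hn), then link (Hl).
HEADERS = [b"Ht|", b"Hn|", b"Hl|"]


def encapsulate(message: bytes) -> bytes:
    """Wrap an application message the way the source's stack does on the way down."""
    data = message
    for header in HEADERS:
        data = header + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Strip headers in reverse order, as the destination's stack does on the way up."""
    data = frame
    for header in reversed(HEADERS):
        if not data.startswith(header):
            raise ValueError(f"expected header {header!r}")
        data = data[len(header):]
    return data


frame = encapsulate(b"M")
print(frame, decapsulate(frame))
```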
Summary

• Network and its components


• Internet Structure
• Internet Core
– Packet Switching
– Circuit Switching
• Network Delays
• Network performance measure parameters
• Layered architecture of the Internet

Thank You!

What is a Network Application?

• Programs that run on different end systems and communicate over a network
– e.g., Web: Web server software communicates with browser software
• Network core devices do not run user application code
• Application on end systems allows for rapid application development

[Figure: end systems running the full stack (application, transport, network, data link, physical)]
Application architectures

• Client-server
• Peer-to-Peer (P2P)
• Hybrid of client-server and P2P

Client-Server Architecture

Server:
– “always-on” host
– Permanent IP address
– For scaling, a data center is used to create a large, powerful virtual server

Clients:
– Communicate with server
– May be intermittently connected
– May have dynamic IP addresses
– Clients do not communicate directly
with each other
Pure P2P Architecture

• No “always-on” server
• Arbitrary end systems directly communicate
• Peers are intermittently connected and change IP addresses
– Examples: Freenet and BitTorrent (file sharing apps)

Highly scalable but difficult to manage!
Hybrid of client-server and P2P

Skype
– Internet telephony application
– Finding address of remote party: centralized server(s)
– Client-client connection is direct (not through server)

Instant messaging
– Chatting between two users is P2P
– Presence detection/location centralized:
• User registers its IP address with central server when it comes online
• User contacts central server to find IP addresses of buddies

How Do Network Applications Communicate?
• Process sends/receives messages to/from its socket
– Socket is the interface between the application layer and the transport layer within the host
• Within the same host, two processes communicate using inter-process communication
• Processes in different hosts communicate by exchanging messages

[Figure: two hosts or servers connected through the Internet; in each, the process is controlled by the app developer, and the socket sits on top of TCP with buffers and variables, controlled by the OS]
How to identify a process running on a machine?
• To receive messages, a process must have an identifier
• IP address of the host on which the process runs is not sufficient for identifying the process. Why?
• Process identifier = IP address + port number
– e.g., HTTP server: 80, Mail server (SMTP): 25
– List of well-known port numbers is available at http://www.iana.org

[Figure: two hosts connected through the Internet, running processes P1 to P4, each process with its own socket on top of TCP with buffers and variables]
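A minimal loopback sketch of the "IP address + port number" idea, using Python's standard `socket` module. The echo behaviour and the function names are illustrative, and the OS is asked to pick a free port so the example does not depend on any well-known port being available.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and send back whatever the client sent."""
    conn, _peer = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def demo() -> tuple[int, bytes]:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))     # port 0: let the OS choose a free port
    server.listen(1)
    host, port = server.getsockname()  # (IP address, port) identifies this process

    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()
    # The client reaches the server process by naming both the IP and the port.
    with socket.create_connection((host, port)) as client:
        client.sendall(b"hello")
        reply = client.recv(1024)
    worker.join()
    server.close()
    return port, reply


if __name__ == "__main__":
    print(demo())
```

Several such servers could run on the same host at once; they share the IP address and are told apart only by their port numbers.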
What transport service does an app need?

• Data loss
– Some apps (e.g., audio, video) can tolerate some loss
– Other apps (e.g., file transfer, telnet) require 100% reliable data transfer
• Bandwidth
– Some apps (e.g., multimedia) require minimum amount of bandwidth to be
“effective”
– Other apps (“elastic apps”) make use of whatever bandwidth they get
– e.g., e-mail, file transfer
• Timing
– Some apps (e.g., Internet telephony, interactive games) require low delay to be
“effective”
Web and HTTP [1994]
Web page consists of objects
• Object can be HTML file, JPEG image, Java applet,
audio file,…
• Web page consists of base HTML-file which
includes several referenced objects
• Each object is addressable by a URL
• Example URLs:
https://www.bits-pilani.ac.in/pilani/computerscience/ProgrammesOffered
https://www.bits-pilani.ac.in/pilani/computerscience/Faculty

HTTP Overview [.1]

• Types of messages exchanged
– e.g., request, response
• Message syntax:
– What fields are in messages and how fields are delineated
• Message semantics:
– Meaning of information in fields
• Rules for when and how processes send and respond to messages

[Figure: a PC running Firefox browser and an iPhone running Safari browser exchanging HTTP messages with a server running Apache Web server]
HTTP Overview [..2]

Uses TCP:
• Client initiates TCP connection (creates socket) to server at port 80
• Server accepts TCP connection from client
• HTTP messages exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed

[Figure: timeline showing one RTT to initiate the TCP connection, one RTT to request the file, and the time to transmit the file until the file is received]
HTTP Request Message

Request line (GET, POST, HEAD commands):
GET /index.html HTTP/1.1\r\n
Header lines:
Host: www-net.cs.umass.edu\r\n
User-Agent: Firefox/3.6.10\r\n
Accept: text/html,application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
Carriage return, line feed at start of line indicates end of header lines:
\r\n
Entity body:
data data data data data ...

(\r = carriage return character, \n = line-feed character)
Response Message
Status line (protocol, status code, status phrase):
HTTP/1.1 200 OK\r\n
Header lines:
Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n
Server: Apache/2.0.52 (CentOS)\r\n
Last-Modified: Tue, 30 Oct 2007 17:00:02 GMT\r\n
ETag: "17dc6-a5c-bf716880"\r\n
Accept-Ranges: bytes\r\n
Content-Length: 2652\r\n
Keep-Alive: timeout=10, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html; charset=ISO-8859-1\r\n
\r\n
Data, e.g., requested HTML file:
data data data data data ...
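A response in this format can be split back into its parts. The sketch below is deliberately simplified: it assumes the whole response is already in memory and ignores repeated headers and chunked bodies. The function name is illustrative.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, status code, phrase, headers, body)."""
    head, _sep, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)  # status line
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")   # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


sample = (b"HTTP/1.1 200 OK\r\n"
          b"Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n"
          b"Content-Length: 4\r\n"
          b"\r\n"
          b"data")
print(parse_response(sample))
```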
HTTP Response status Codes
200 OK
– request succeeded, requested object later in this msg
301 Moved Permanently
– requested object moved, new location specified later in this msg (Location:)
400 Bad Request
– request msg not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported
– the HTTP version used in the request is not supported by the server.

How Is a Web Page Transferred?

• Let’s assume a web page consists of a base HTML file and 5 JPEG images.
– https://www.bits-pilani.ac.in/Pilani/SustainableEnvironment

HTTP Connections

Non-persistent HTTP
• At most one object is sent over a TCP connection
• HTTP/1.0 uses non-persistent HTTP

Persistent HTTP
• Multiple objects can be sent over single TCP
connection between client and server.
• Persistent with Pipeline vs. Persistent without
Pipeline
• HTTP/1.1 uses persistent connections by default
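
A rough back-of-the-envelope model of the three modes for a page with a base HTML file and 5 images (6 objects). It assumes a single serial connection at a time, one RTT for TCP setup, one RTT per request, and a fixed per-object transmit time; real browsers open parallel connections, so this is only a sketch.

```python
def non_persistent(n_objects: int, rtt: float, tx: float) -> float:
    """New TCP connection per object: (setup RTT + request RTT + transmit) each."""
    return n_objects * (2 * rtt + tx)


def persistent_no_pipeline(n_objects: int, rtt: float, tx: float) -> float:
    """One setup RTT, then one request RTT plus transmit time per object."""
    return rtt + n_objects * (rtt + tx)


def persistent_pipelined(n_objects: int, rtt: float, tx: float) -> float:
    """One setup RTT, one RTT for the base file, then all remaining requests
    are sent back to back and cost roughly one more RTT in total."""
    total = rtt + (rtt + tx)
    if n_objects > 1:
        total += rtt + (n_objects - 1) * tx
    return total


rtt, tx = 0.1, 0.01   # illustrative values in seconds
for model in (non_persistent, persistent_no_pipeline, persistent_pipelined):
    print(f"{model.__name__:<24} {model(6, rtt, tx):.2f} s")
```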
HTTP Method Types

HTTP/1.0:
• GET
• POST
• HEAD
– asks server to leave requested object out of response

HTTP/1.1:
• GET, POST, HEAD
• PUT
– uploads file in entity body to path specified in URL field
• DELETE
– deletes file specified in the URL field
State in HTTP using “Cookies”
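[Figure: cookie exchange between a client and an Amazon server]
First visit:
– Client’s cookie file initially holds: ebay 8734
– Client sends usual HTTP request msg
– Amazon server creates ID 1678 for user and creates an entry in its backend database
– Server sends usual HTTP response with set-cookie: 1678
– Client’s cookie file now holds: ebay 8734, amazon 1678
– Client sends usual HTTP request msg with cookie: 1678; server takes cookie-specific action (accessing the backend database) and returns usual HTTP response msg
One week later:
– Client sends usual HTTP request msg with cookie: 1678; server again takes cookie-specific action (accessing the backend database) and returns usual HTTP response msg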
Web Caches (aka Proxy Server)

[Figure: clients connected to a proxy server, which sits between them and the origin servers]
Conditional GET
• Goal: don’t send object if cache has up-to-date cached version
• Cache: specify date of cached copy in HTTP request
If-modified-since: <date>
• Server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified

[Figure: two client/server exchanges, each starting with an HTTP request msg carrying If-modified-since: <date>]
– Object not modified since <date>: HTTP response is HTTP/1.0 304 Not Modified
– Object modified after <date>: HTTP response is HTTP/1.0 200 OK followed by <data>
Proxy Server Example [.1]

Assumptions:
• Avg object size: 1000 Kbits
• Avg request rate from browsers to origin servers: 15 req/sec
• Avg data rate to browsers: 15 Mbps (1000 Kbits × 15 req/sec)
• RTT from institutional router to any origin server: 2 sec
• Access link rate: 15 Mbps

[Figure: institutional network (100 Mbps LAN) connected by a 15 Mbps access link to the public Internet and the origin servers]
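A sketch of the utilization arithmetic behind this example, taking the offered load as object size times request rate and reading 1000 Kbits as 1,000,000 bits. The `hit_rate` parameter is a hypothetical addition for exploring what a cache would change; it is not one of the slide's assumptions.

```python
def link_utilizations(object_bits: float, requests_per_sec: float,
                      access_bps: float, lan_bps: float,
                      hit_rate: float = 0.0) -> tuple[float, float]:
    """Return (LAN utilization, access-link utilization).

    hit_rate is the fraction of requests served by a local cache, which
    therefore never cross the access link (0.0 means no cache).
    """
    demand_bps = object_bits * requests_per_sec
    lan_util = demand_bps / lan_bps
    access_util = demand_bps * (1.0 - hit_rate) / access_bps
    return lan_util, access_util


lan, access = link_utilizations(1_000_000, 15, 15_000_000, 100_000_000)
print(f"LAN utilization: {lan:.2f}, access link utilization: {access:.2f}")
```

With no cache the access link is fully loaded while the LAN is lightly used, so the access link is where queueing delay builds up.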
HTTP/2 [Published as RFC 7540 in 2015]
• Limitations of HTTP/1.1
– Only one request can be outstanding on a TCP connection at a time
– This forces browsers to use multiple TCP connections to process multiple requests simultaneously
– HTTP/1.x uses text-based commands, which are slower to process
• Motivation
– To improve Internet user experience and effectiveness
– Web pages comprise resource-intensive multimedia content
– To make it more secure and reliable, with improved performance
• Compatibility with existing applications
– HTTP/2 modifies how the data is formatted (framed) and transported between the client and server, and hides all the complexity from applications within the new framing layer
– It extends its predecessor rather than replacing it
HTTP/2 Feature: Stream Multiplexing

• What is a stream?
– Bidirectional sequence of frames exchanged between the server and client over an HTTP/2 connection
• HTTP/1 is capable of transmitting only one stream at a time
– Receiving a large amount of media content via individual streams sent one by one is inefficient and resource-consuming
Binary Framing Layer

• HTTP/2 allows transmission of parallel multiplexed requests and responses
– HTTP/2 breaks down the HTTP protocol communication into an exchange of binary-encoded frames, which are then mapped to messages that belong to a particular stream, all of which are multiplexed within a single TCP connection.
– This is the foundation that enables all other features and performance optimizations provided by the HTTP/2 protocol.
HTTP/2.0 Connection
• Stream: A bidirectional flow of bytes
within an established connection, which
may carry one or more messages.
• Message: A complete sequence of
frames that map to a logical request or
response message.
• Frame: The smallest unit of
communication in HTTP/2, each
containing a frame header, which at a
minimum identifies the stream to which
the frame belongs.
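
To make the frame header concrete, here is a sketch that packs and unpacks the 9-byte header defined in RFC 7540: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a reserved bit followed by a 31-bit stream identifier. The helper names are illustrative.

```python
import struct


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the fixed 9-byte HTTP/2 frame header."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    length_bytes = struct.pack(">I", length)[1:]          # keep the low 3 bytes
    rest = struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return length_bytes + rest


def unpack_frame_header(header: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream id) from a 9-byte header."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF  # drop reserved bit


# A HEADERS frame (type 0x1) with END_HEADERS (flag 0x4) on stream 3.
header = pack_frame_header(length=16, frame_type=0x1, flags=0x4, stream_id=3)
print(header.hex(), unpack_frame_header(header))
```

Because every frame names its stream, frames from different streams can be interleaved on one TCP connection and reassembled at the receiver.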

Domain Name System (DNS)

• The domain name system maps the name people use to locate a website to
the IP address that a computer uses to locate a website.

• Why do we need the mapping between host name and IP address?

• Application-layer protocol: hosts, name servers communicate to resolve names (address/name translation)
DNS Structure – Distributed Hierarchical
Database
[Figure: hierarchy of DNS servers]
Root DNS servers
– com DNS servers: yahoo.com DNS servers, amazon.com DNS servers
– org DNS servers: pbs.org DNS servers
– edu DNS servers: poly.edu DNS servers, umass.edu DNS servers

Client wants IP for www.amazon.com; 1st approximation:
• Client queries root server to find com DNS server
• Client queries .com DNS server to get amazon.com DNS server
• Client queries amazon.com DNS server to get IP address for www.amazon.com

List of all top-level domain servers is available at: https://www.icann.org/resources/pages/tlds-2012-02-25-en
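The three-step lookup can be mimicked with nested dictionaries standing in for the root, TLD and authoritative servers. Everything here is a toy: the server labels are invented and the address is from the 192.0.2.0/24 documentation range, not Amazon's real address.

```python
# Toy stand-ins for the three levels of the hierarchy (all values hypothetical).
ROOT = {"com": "com-tld-server"}
TLD = {"com-tld-server": {"amazon.com": "amazon-auth-server"}}
AUTHORITATIVE = {"amazon-auth-server": {"www.amazon.com": "192.0.2.10"}}


def resolve(hostname: str) -> str:
    """Walk root -> TLD -> authoritative, as in the first approximation above."""
    labels = hostname.split(".")
    tld = labels[-1]                     # e.g. "com"
    domain = ".".join(labels[-2:])       # e.g. "amazon.com"

    tld_server = ROOT[tld]                       # 1. ask root for the TLD server
    auth_server = TLD[tld_server][domain]        # 2. ask TLD for the authoritative server
    return AUTHORITATIVE[auth_server][hostname]  # 3. ask authoritative for the address


print(resolve("www.amazon.com"))
```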
Root Name Servers

• Root name servers:
– 13 root name “servers” worldwide, mostly located in North America
– Each “server” is actually a network of replicated servers

a. Verisign, Los Angeles, CA (5 other sites)
b. USC-ISI, Marina del Rey, CA
c. Cogent, Herndon, VA (5 other sites)
d. U Maryland, College Park, MD
e. NASA, Mt View, CA
f. Internet Software C., Palo Alto, CA (and 48 other sites)
g. US DoD, Columbus, OH (5 other sites)
h. ARL, Aberdeen, MD
i. Netnod, Stockholm (37 other sites)
j. Verisign, Dulles, VA (69 other sites)
k. RIPE, London (17 other sites)
l. ICANN, Los Angeles, CA (41 other sites)
m. WIDE, Tokyo (5 other sites)
DNS Services

• Hostname to IP address translation
– Host name to IP address mapping
• Host aliasing
– Canonical name to alias name(s) mapping
• Mail server aliasing
– Host name to mail server mapping
• Load distribution
– Replicated Web servers: many IP addresses correspond to one name
DNS Query Processing - Recursive

[Figure: recursive query for gaia.cs.umass.edu. Requesting host cis.poly.edu asks local DNS server dns.poly.edu (1); the query is passed to the root DNS server (2), the TLD DNS server (3), and the authoritative DNS server dns.cs.umass.edu (4); the reply returns along the same chain (5, 6, 7) and reaches the requesting host (8)]
DNS Query Processing - Iterative

• TLD server may know only of an intermediate DNS server for the hostname, which in turn knows the authoritative DNS server for the hostname.
• DNS responses are usually cached to improve the delay performance and to reduce the number of DNS messages
– e.g., local DNS server caches the TLD server information

[Figure: iterative query for gaia.cs.umass.edu. Requesting host cis.poly.edu asks local DNS server dns.poly.edu (1); the local DNS server queries the root DNS server (2, 3), the TLD DNS server (4, 5), and the authoritative DNS server dns.cs.umass.edu (6, 7) in turn, then replies to the requesting host (8)]
DNS Records
DNS: distributed database for storing resource records (RR)
RR format: (name, value, type, ttl)

type=A
• name is hostname
• value is IP address

type=NS
• name is domain (e.g., foo.com)
• value is hostname of authoritative name server for this domain

type=CNAME
• name is alias name for some “canonical” (the real) name
• www.ibm.com is really servereast.backup2.ibm.com
• value is canonical name

type=MX
• value is name of mailserver associated with name (host name, i.e., mailserver alias)
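A toy resource-record table in the (name, value, type, ttl) format, with a lookup that follows CNAME records until it reaches an A record. The IP address, TTLs, and the `dns.foo.com` and `mail.foo.com` names are invented for illustration.

```python
# (name, value, type, ttl) tuples; addresses and TTLs are hypothetical.
RECORDS = [
    ("www.ibm.com", "servereast.backup2.ibm.com", "CNAME", 3600),
    ("servereast.backup2.ibm.com", "192.0.2.7", "A", 3600),
    ("foo.com", "dns.foo.com", "NS", 86400),
    ("foo.com", "mail.foo.com", "MX", 3600),
]


def lookup(name: str, rtype: str) -> list[str]:
    """Return the values of all records matching name and type."""
    return [value for (n, value, t, _ttl) in RECORDS if n == name and t == rtype]


def resolve_address(name: str, max_hops: int = 8) -> list[str]:
    """Follow CNAME aliases to the canonical name, then return its A records."""
    for _ in range(max_hops):
        addresses = lookup(name, "A")
        if addresses:
            return addresses
        aliases = lookup(name, "CNAME")
        if not aliases:
            return []
        name = aliases[0]       # continue with the canonical name
    return []


print(resolve_address("www.ibm.com"))
print(lookup("foo.com", "MX"))
```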
DNS Messages
• Query and reply messages, both with same message format
• Explore DNS protocol in Lab Session #2
• Message header:
– identification: 16-bit # for query; reply to query uses same #
– flags: query or reply, recursion desired, recursion available, reply is authoritative

Message layout (each header field is 2 bytes):
identification            | flags
# questions               | # answer RRs
# authority RRs           | # additional RRs
questions (variable # of questions)
answers (variable # of RRs)
authority (variable # of RRs)
additional info (variable # of RRs)
• A newly created domain name should be first registered at a registrar
– Internet Corporation for Assigned Names and Numbers (ICANN) accredits the registrars
– Accredited registrar list is available at www.internic.net
– Registrar is a commercial entity that verifies the uniqueness of the domain name.

FTP: File Transfer Protocol
[Figure: user at host works through the FTP user interface and FTP client (local file system); files are transferred between the FTP client and the FTP server (remote file system)]

• Transfer file to/from remote host
• Client/server model
– Client: side that initiates transfer (either to/from remote)
– Server: remote host
• ftp: RFC 959
• ftp server: port 21
FTP: Connections
• Control connection (TCP, server port 21)
– Authorization, directory listing etc.
• When server receives file transfer command,
– Server opens 2nd TCP data connection (for file, server port 20) to client
• After transferring one file, server closes data connection

[Figure: FTP client and FTP server joined by a TCP control connection (server port 21) and a TCP data connection (server port 20)]
FTP Commands and Responses

Sample commands:
• Sent as ASCII text over control channel
• USER username
• PASS password
• LIST: return list of files in current directory
• RETR filename: retrieves (gets) file
• STOR filename: stores (puts) file onto remote host

Sample return codes:
• Status code and phrase (as in HTTP)
• 331 Username OK, password required
• 125 data connection already open; transfer starting
• 425 Can’t open data connection
• 452 Error writing file
eMail
Three major components:
• User agents
– e.g., Outlook, Thunderbird
• Mail servers
– Contain incoming messages for user
• Simple mail transfer protocol: SMTP

[Figure: user agents attached to mail servers, each with user mailboxes and an outgoing message queue; mail servers exchange messages using SMTP]
SMTP [RFC 5321, Original RFC 821]

• Uses TCP to reliably transfer email message from client to server, port 25
• Direct transfer: sender’s mail server to receiver’s mail server
• Three phases of transfer
– Handshaking (greeting) → Transfer of messages → Connection closure
• Command/response interaction (like HTTP, FTP)
– Commands: ASCII text
– Response: status code and phrase
• Messages must be in 7-bit ASCII
– Painful for multimedia data

Mail Transfer Process

S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <alice@crepes.fr>
S: 250 alice@crepes.fr... Sender ok
C: RCPT TO: <bob@hamburger.edu>
S: 250 bob@hamburger.edu ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection
Mail Access Protocols

• Mail access protocol: retrieval from server
– POP3 [Port: 110]: Post Office Protocol [RFC 1939]: authorization, download and keep, download and delete
• User can create folders and move the messages into them locally
• Stateless across sessions
– IMAP: Internet Mail Access Protocol [RFC 1730]: more features, including manipulation of stored msgs on server
• Allows the user to create remote folders and maintains user state information across IMAP sessions
• Permits a user agent to obtain components of messages, which is good for low-bandwidth connections
POP3 Protocol
Authorization phase
• Client commands:
– user: declare username
– pass: password
• Server responses
– +OK
– -ERR
Transaction phase, client:
• list: list message numbers
• retr: retrieve message by number
• dele: delete
• quit

S: +OK POP3 server ready
C: user alex
S: +OK
C: pass hungry
S: +OK user successfully logged on
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: <message 1 contents>
S: .
C: dele 1
C: retr 2
S: <message 2 contents>
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off

Web based E-Mail

• Hotmail introduced Web-based access in the 1990s

Today’s Agenda

• Peer to Peer Applications and Protocols
– P2P File Distribution, BitTorrent Protocol
• Database Implementation Protocol in P2P Networks
– Distributed Hash Tables (DHTs)
– Chord Protocol

Peer to Peer (P2P) Architecture

• No always-on server
• Arbitrary end systems directly communicate
• Peers are intermittently connected
• Examples
– File distribution (BitTorrent)
– Streaming (KanKan)
– VoIP (Skype)

File Distribution: P2P vs CS
How much time is required to distribute a file (size F) from one server to N peers?
– peer upload/download capacity is a limited resource
– $u_s$: server upload capacity
– $u_i$: peer i upload capacity
– $d_i$: peer i download capacity
[Figure: a server holding the file of size F, with upload capacity us, connected through a network (with abundant bandwidth) to peers 1..N, each with upload capacity ui and download capacity di]

File Distribution Time – Client Server
• Server transmission: must sequentially send (upload) N file copies
– Time to send N copies: $N F / u_s$
• Client: each client must download a file copy
– $d_{\min}$ = minimum client download rate
– Slowest client download time: $F / d_{\min}$
• Time to distribute F to N clients using the client-server approach:
$$D_{c\text{-}s} \ge \max\left\{ \frac{N F}{u_s},\ \frac{F}{d_{\min}} \right\}$$

File Distribution Time - Peer to Peer
• Server transmission: must upload at least one copy
– Time to send one copy: $F / u_s$
• Client: each client must download a file copy
– Slowest client download time: $F / d_{\min}$
• Clients: in aggregate must download NF bits
– Max upload rate (limiting max download rate) is $u_s + \sum_{i=1}^{N} u_i$
• Time to distribute F to N clients using the P2P approach:
$$D_{P2P} \ge \max\left\{ \frac{F}{u_s},\ \frac{F}{d_{\min}},\ \frac{N F}{u_s + \sum_{i=1}^{N} u_i} \right\}$$

Exercise
• Distributing a file of size F = 15 Gbits to N = 10 peers
• Server upload rate is us = 30 Mbps
• Each peer download rate is di = 2 Mbps
• Each peer upload rate is ui = 300 Kbps
• Question
– Calculate the minimum distribution time for both CS and P2P

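The two lower bounds from the preceding slides can be evaluated directly for this exercise. The sketch below is one possible way to code them in C; the function names are hypothetical, and it assumes decimal units (1 Gbit = 10^9 bits, 1 Mbps = 10^6 bits/s, 1 Kbps = 10^3 bits/s).

```c
#include <stddef.h>

static double max2(double a, double b) { return a > b ? a : b; }

/* Lower bound on client-server distribution time:
 * max{N*F/u_s, F/d_min}. Units: bits and bits per second. */
double dist_time_cs(double file_bits, int n_peers, double u_server, double d_min)
{
    return max2(n_peers * file_bits / u_server, file_bits / d_min);
}

/* Lower bound on P2P distribution time:
 * max{F/u_s, F/d_min, N*F/(u_s + sum of peer upload rates)}. */
double dist_time_p2p(double file_bits, int n_peers, double u_server,
                     double d_min, const double *u_peer)
{
    double total_upload = u_server;
    for (int i = 0; i < n_peers; i++)
        total_upload += u_peer[i];

    double bound = max2(file_bits / u_server, file_bits / d_min);
    return max2(bound, n_peers * file_bits / total_upload);
}
```

With F = 15e9, N = 10, us = 30e6, dmin = 2e6 and every ui = 300e3, both functions return 7500 seconds. The slowest download F/dmin dominates in both cases: the other terms are 5000 s (N F/us) for client-server and about 4545 s (N F/(us + Σui)) for P2P.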
CS vs P2P: Example
client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
[Figure: minimum distribution time (y-axis, 0 to 3.5) versus number of peers N (x-axis, 0 to 35), with one curve for Client-Server and one for P2P]

P2P File Distribution: BitTorrent

• File divided into chunks of 64 KB to 1 MB (typically 256 KB)
• Peers in torrent send/receive file chunks
– At any given time, each peer will have a subset of chunks from the file
– A peer asks its neighbors for the list of chunks they have and gets a list from each
– A peer needs to decide:
• Which chunks should it request first from its neighbors?
• To which of its neighbors should it send requested chunks?

The lookup problem
[Figure: peers N1 to N6 connected through the Internet. A publisher holds Key = “data item”, Value = video lecture; a client issues Lookup(“data item”) and needs to find the peer that hosts it]
Decentralized network with several peers (servers/clients)
How to find the specific peer that hosts the desired data within this network?

P2P Protocols

Napster

Gnutella
Kazaa (Skype is based on Kazaa)
Distributed Hash Table (DHT)

• Each peer holds a small subset of the total (key, value) pairs
• Any peer can query the distributed database with a particular key
– The distributed DB locates the peers that have the corresponding (key, value) pairs and returns the pairs to the querying peer
• Each peer only knows about a small number of other peers
• Any peer can insert new (key, value) pairs into the DB
• Robust to peers coming and going (churn)

DHT Implementation [.1]

• Randomly scatter the (key, value) pairs across all the peers
• Each peer maintains a list of the IP addresses of all peers
• The querying peer sends its query to all other peers
• The peers containing (key, value) pairs that match the key respond with the matching pairs
• This approach is not scalable. Why?

Database Implementation [..2] Circular DHT
• Hash function assigns each “node” and “key” an m-bit identifier using a base hash function such as SHA-1
– Node_ID = hash(IP, Port)
– Key_ID = hash(original key)
• ID space: 0 to $2^m - 1$
– Here m = 6, so the range is 64 identifiers (0 to 63)
• Assign each (key, value) pair to the peer that has the closest ID
[Figure: identifier circle with nodes N2, N10, N20, N40, N50, N60, N63 and keys k7, k11, k16, k25, k39, k46, k58 placed around it]

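The assignment rule can be sketched in a few lines of C. This sketch assumes that "closest" means the closest successor, i.e. the first node at or after the key when walking clockwise around the circle, which is the convention Chord uses; the function name `chord_successor` is hypothetical.

```c
#include <stddef.h>

/* Return the ID of the node responsible for key_id on the identifier
 * circle: the first node whose ID is >= key_id, wrapping around to the
 * smallest node ID if no such node exists.
 * node_ids must be sorted in ascending order and non-empty. */
unsigned chord_successor(const unsigned *node_ids, size_t n_nodes, unsigned key_id)
{
    for (size_t i = 0; i < n_nodes; i++) {
        if (node_ids[i] >= key_id)
            return node_ids[i];
    }
    return node_ids[0];  /* wrapped past the largest ID */
}
```

With the node IDs from the figure (2, 10, 20, 40, 50, 60, 63), key 7 is assigned to N10, key 25 to N40, and key 58 to N60.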
Chord Protocol: Lookup Operation Example
• Predecessor: pointer to the previous node on the id circle
• Successor: pointer to the succeeding node on the id circle
• Ask node n to find the successor of id
– If id is between n and its successor, return the successor
– Else forward the query to n’s successor, and so on
=> #messages linear in #nodes

Scalable node localization
• Each node n contains a routing table with up to m entries (m: number of bits of the identifier) => finger table
• The i-th entry in the table at node n contains the first node s that succeeds n by at least $2^{i-1}$ on the identifier circle
– $s = \text{successor}\left((n + 2^{i-1}) \bmod 2^m\right)$, for $1 \le i \le m$
– s is called the i-th finger of node n

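The finger-table definition can be made concrete with a short C sketch. It assumes a sorted array of the live node IDs is available for illustration (a real Chord node would not have this global view), and the names `successor` and `build_finger_table` are hypothetical.

```c
#include <stddef.h>

/* First node whose ID is >= id, wrapping to the smallest ID.
 * node_ids must be sorted ascending and non-empty. */
static unsigned successor(const unsigned *node_ids, size_t n_nodes, unsigned id)
{
    for (size_t i = 0; i < n_nodes; i++) {
        if (node_ids[i] >= id)
            return node_ids[i];
    }
    return node_ids[0];
}

/* Fill fingers[0..m-1] for node n, where fingers[i-1] is the i-th finger:
 * successor((n + 2^(i-1)) mod 2^m). Requires 1 <= m < 32. */
void build_finger_table(const unsigned *node_ids, size_t n_nodes,
                        unsigned n, unsigned m, unsigned *fingers)
{
    unsigned ring_size = 1u << m;
    for (unsigned i = 1; i <= m; i++) {
        unsigned start = (n + (1u << (i - 1))) % ring_size;
        fingers[i - 1] = successor(node_ids, n_nodes, start);
    }
}
```

For m = 6 and nodes 2, 10, 20, 40, 50, 60, 63, node 10 gets the fingers 20, 20, 20, 20, 40, 50 (targets 11, 12, 14, 18, 26, 42). The later fingers jump roughly half the ring, which is what lets a lookup halve the remaining distance at each hop.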
The Chord algorithm – Scalable node localization
• Search the finger table for the node that most immediately precedes the key
• Invoke find_successor from that node
Number of messages: O(log N)!

Failure Recovery (Peer Churn)

• Key step in failure recovery is maintaining correct successor pointers
• To achieve this, each node maintains a successor-list of its r nearest successors on the ring
• If node n notices that its successor has failed, it replaces it with the first live entry in the list
• The stabilize procedure corrects finger table entries and successor-list entries that point to the failed node
• The stabilization protocol should be invoked at a rate that matches how frequently nodes leave and join

Next…

• Creating network Applications


– Socket Programming
• TCP vs. UDP Sockets
• Transport Layer
– Transport Layer Services
• Multiplexing/Demultiplexing
– Connectionless and Connection Oriented
» TCP and UDP
• Reliable data transfer (Protocol design)
• Flow control
• Congestion control
Socket Programming [.1]
• What is a socket?
– To the kernel, a socket is an endpoint of communication.
– To an application, a socket is a file descriptor that lets the application read/write from/to the network.
• Remember: All Unix I/O devices, including networks, are modeled as files.
• Clients and servers communicate with each other by reading from and writing to socket descriptors.
[Figure: two hosts connected through the Internet, each with a protocol stack (application, transport, network, link, physical). The socket sits between the application process, controlled by the app developer, and the layers below, controlled by the OS]

Socket Programming [..2]

Two socket types for two different transport services:


– UDP: unreliable datagram
– TCP: reliable, byte stream-oriented

Application Example:
1. Client reads a line of characters (data) from its keyboard and sends the
data to the server.
2. The server receives the data and converts characters to uppercase.
3. The server sends the modified data to the client.
4. The client receives the modified data and displays the line on its screen.

Socket Programming with UDP

UDP: no “connection” between client & server


• No handshaking before sending data
• Sender explicitly attaches destination IP address and port # to each packet
• Receiver extracts sender IP address and port# from received packet

Note: Transmitted data may be lost or received out-of-order

Socket Programming with TCP

Client contacts server by:
• Creating TCP socket, specifying IP address, port number of server process
• Server must have created socket (door) that welcomes client’s contact
• Client TCP establishes connection to server TCP

• When contacted by client, server TCP creates new socket for server process to communicate with that particular client
– Allows server to talk with multiple clients

Application viewpoint:
TCP provides reliable, in-order byte-stream transfer (“pipe”) between client and server.

Socket Structure [.1]

struct sockaddr
{
    unsigned short int sa_family; // address family, AF_xxx
    char sa_data[14];             // 14 bytes of protocol address
};

• sa_family – set to AF_INET for IPv4 sockets, both stream and datagram
• sa_data – contains destination address and port number for the socket

Socket Structure [..2]
• Parallel structure to sockaddr
struct sockaddr_in
{
    short int sin_family;        // Address family (e.g., AF_INET)
    unsigned short int sin_port; // Port number (e.g., htons(2240))
    struct in_addr sin_addr;     // Internet address
    unsigned char sin_zero[8];   // Padding, makes the struct the same size as sockaddr
};
struct in_addr
{
    unsigned long s_addr;        // 32-bit IPv4 address
};
• sin_zero pads the structure to the length of struct sockaddr and should be set to all zeros with memset()
• Important – a pointer to struct sockaddr_in can be cast to a pointer to struct sockaddr and vice versa
• sin_family corresponds to sa_family and should be set to AF_INET
• sin_port and sin_addr must be in network byte order (NBO)

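The rules above (zero the structure, set the family, store port and address in network byte order) can be collected in one helper. This is a minimal sketch: `make_ipv4_addr` is a hypothetical name, and it uses the standard `inet_pton()` call in place of the `inet_addr()` routine shown elsewhere in these slides, because `inet_pton()` reports invalid input unambiguously.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *addr for the IPv4 address ip_str (dotted decimal) and port.
 * Returns 0 on success, -1 if ip_str is not a valid IPv4 address. */
int make_ipv4_addr(struct sockaddr_in *addr, const char *ip_str,
                   unsigned short port)
{
    memset(addr, 0, sizeof(*addr));   /* also zeroes sin_zero */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);     /* port in network byte order */
    if (inet_pton(AF_INET, ip_str, &addr->sin_addr) != 1)
        return -1;                    /* not a valid dotted-decimal address */
    return 0;                         /* sin_addr now holds the address in NBO */
}
```

The filled structure is then passed to calls such as bind() or connect() with a cast to `struct sockaddr *`.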
NBO & HBO Conversion Functions

• Two types that can be converted
– short (two bytes)
– long (four bytes)
• Primary conversion functions
– htons() // host to network short
– htonl() // host to network long
– ntohs() // network to host short
– ntohl() // network to host long
• Very Important: Even if your machine is big-endian, convert your bytes to NBO before putting them on the network, so that the code stays portable

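A small C sketch shows what the conversion functions guarantee: after htons(), the most significant byte comes first in memory regardless of the host's own byte order. The helper names `put_port` and `get_port` are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copy a 16-bit port into buf in network byte order (most significant
 * byte first), as it would appear on the wire. */
void put_port(uint8_t buf[2], uint16_t host_port)
{
    uint16_t net_port = htons(host_port);
    memcpy(buf, &net_port, sizeof(net_port));
}

/* Read a 16-bit port from wire bytes back into host byte order. */
uint16_t get_port(const uint8_t buf[2])
{
    uint16_t net_port;
    memcpy(&net_port, buf, sizeof(net_port));
    return ntohs(net_port);
}
```

For port 5888 (0x1700), put_port() writes the bytes 0x17, 0x00 on both little-endian and big-endian hosts, and get_port() recovers 5888 from them.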
Primary Socket System Calls

• socket() - create a new socket and return its descriptor


• bind() - associate a socket with a port and address
• listen() - establish queue for connection requests
• accept() - accept a connection request
• connect() - initiate a connection to a remote host
• recv() - receive data from a socket descriptor
• send() - send data to a socket descriptor
• close() - “one-way” close of a socket descriptor

Socket System Calls: Connectionless (e.g., UDP)

Socket System Calls: Connection-Oriented (e.g., TCP)

Socket System Calls [.1]

• SOCKET: int socket(int domain, int type, int protocol);
– domain := AF_INET (IPv4 protocol)
– type := (SOCK_DGRAM or SOCK_STREAM)
– protocol := 0 (selects the default for the type, i.e., IPPROTO_UDP or IPPROTO_TCP)
– returned: socket descriptor (sockfd), -1 on error
• BIND: int bind(int sockfd, struct sockaddr *my_addr, int addrlen);
– sockfd: socket descriptor (returned from socket())
– my_addr: socket address, struct sockaddr_in is used
– addrlen := sizeof(struct sockaddr)

Socket System Calls [..2]

• LISTEN: int listen(int sockfd, int backlog);
– backlog: how many pending connections we want to queue
• ACCEPT: int accept(int sockfd, void *addr, int *addrlen);
– addr: the socket address of the caller will be written here
– returned: a new socket descriptor (for the socket dedicated to this connection)
• CONNECT: int connect(int sockfd, struct sockaddr *serv_addr, int addrlen); // used by TCP client
– parameters are same as for bind()

Socket System Calls […3]
• SEND: int send(int sockfd, const void *msg, int len, int flags);
– msg: message you want to send
– len: length of the message
– flags := 0
– returned: the number of bytes actually sent

• RECEIVE: int recv(int sockfd, void *buf, int len, unsigned int flags);
– buf: buffer to receive the message
– len: length of the buffer (“don’t give me more!”)
– flags := 0
– returned: the number of bytes received
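Because send() returns the number of bytes actually sent, which can be less than len on a stream socket, callers usually loop until everything has gone out. The following is a minimal sketch of such a loop; `send_all` is a hypothetical helper name, and the sketch does not retry on interrupted calls (EINTR).

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send() until all len bytes of buf have been handed to the
 * kernel. Returns 0 on success, -1 if send() fails or makes no progress. */
int send_all(int sockfd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sockfd, p + sent, len - sent, 0);
        if (n <= 0)
            return -1;
        sent += (size_t)n;   /* advance past the bytes already sent */
    }
    return 0;
}
```

The same pattern applies on the receiving side: recv() may return fewer bytes than requested, so a reader that needs a fixed amount of data has to loop as well.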

Socket System Calls [….4]
• SEND (DGRAM-style): int sendto(int sockfd, const void *msg, int len, int flags, const struct sockaddr *to, int tolen);
– msg: message you want to send
– len: length of the message
– flags := 0
– to: socket address of the remote process
– tolen := sizeof(struct sockaddr)
– returned: the number of bytes actually sent

• RECEIVE (DGRAM-style): int recvfrom(int sockfd, void *buf, int len, unsigned int flags, struct sockaddr *from, int *fromlen);
– buf: buffer to receive the message
– len: length of the buffer (“don’t give me more!”)
– flags := 0
– from: socket address of the process that sent the data
– fromlen: pointer to an int initialized to sizeof(struct sockaddr)
– returned: the number of bytes received

• CLOSE: close(sockfd);

Byte ordering routines

#include <sys/types.h>
#include <netinet/in.h>

u_long htonl(u_long hostlong); /* host-to-network, long integer */

u_short htons(u_short hostshort); /* host-to-network, short integer */

u_long ntohl(u_long netlong); /* network-to-host, long integer */

u_short ntohs(u_short netshort); /* network-to-host, short integer */

Address conversion routines


#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

unsigned long inet_addr(char *ptr);
– accepts a character string holding an IP address in dotted-decimal notation and returns the equivalent 32-bit integer in network byte order
char *inet_ntoa(struct in_addr inaddr);
– accepts an IP address expressed as a 32-bit quantity in network byte order and returns a string expressed in dotted-decimal notation

Simple TCP Server
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERVER_PORT 5888

int main()
{
    int sockfd, connfd, n;
    socklen_t clilen;
    char buf[256];
    const char *reply = "Server Got Message";
    struct sockaddr_in servaddr, cliaddr;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) { printf("Server socket error\n"); exit(1); }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);   // accept on any local interface

    if (bind(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
    { printf("Server Bind Error\n"); exit(1); }

    listen(sockfd, 5);                              // queue up to 5 pending connections

    for (;;) {
        clilen = sizeof(cliaddr);
        connfd = accept(sockfd, (struct sockaddr *)&cliaddr, &clilen);
        if (connfd < 0) { printf("Server Accept error\n"); exit(1); }

        printf("Client IP: %s\n", inet_ntoa(cliaddr.sin_addr));
        printf("Client Port: %hu\n", ntohs(cliaddr.sin_port));

        n = read(connfd, buf, sizeof(buf) - 1);     // leave room for the terminator
        if (n < 0) n = 0;
        buf[n] = '\0';
        printf("Server read: \"%s\" [%d chars]\n", buf, n);

        write(connfd, reply, strlen(reply));
        close(connfd);                              // done with this client
    }
}

Simple TCP Client
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERVER_PORT 5888
int main()
{
    int sockfd, n;
    char buf[256];
    struct sockaddr_in servaddr;

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) { printf("Client socket error\n"); exit(1); }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = inet_addr("172.24.2.4");

    if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
    { printf("Client connect error\n"); exit(1); }

    printf("Enter Message\n");
    if (fgets(buf, sizeof(buf), stdin) == NULL) exit(1);
    write(sockfd, buf, strlen(buf));

    n = read(sockfd, buf, sizeof(buf) - 1);         // leave room for the terminator
    if (n < 0) n = 0;
    buf[n] = '\0';
    printf("Client Received: %s\n", buf);
    close(sockfd);
}

Simple UDP Server
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#define SERVER_PORT 9988
int main()
{
    int sockfd, n;
    socklen_t clilen;
    char buf[256];
    struct sockaddr_in servaddr, cliaddr;

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sockfd < 0) { printf("Server socket error\n"); exit(1); }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
    { printf("Server Bind Error\n"); exit(1); }

    for (;;)
    {
        clilen = sizeof(cliaddr);
        n = recvfrom(sockfd, buf, sizeof(buf) - 1, 0, (struct sockaddr *)&cliaddr, &clilen);
        if (n < 0) continue;             // skip failed receives
        buf[n] = '\0';                   // recvfrom() does not terminate the data
        printf("Server Received: %s\n", buf);

        // Reply to the address that recvfrom() filled in
        sendto(sockfd, "Server Got Message", 18, 0, (struct sockaddr *)&cliaddr, clilen);
    }
}

Simple UDP Client
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SERVER_PORT 9988
#define SERVER_IPADDR "172.24.2.4"
int main()
{
    int sockfd, n;
    socklen_t len;
    char buf[256];
    struct sockaddr_in cliaddr, servaddr;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERVER_PORT);
    servaddr.sin_addr.s_addr = inet_addr(SERVER_IPADDR);

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);
    if (sockfd < 0) { printf("Client socket error\n"); exit(1); }

    memset(&cliaddr, 0, sizeof(cliaddr));
    cliaddr.sin_family = AF_INET;
    cliaddr.sin_port = htons(0);                    // let the OS pick a free port
    cliaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(sockfd, (struct sockaddr *)&cliaddr, sizeof(cliaddr));

    printf("Enter Message\n");
    if (fgets(buf, sizeof(buf), stdin) == NULL) exit(1);

    len = sizeof(servaddr);
    sendto(sockfd, buf, strlen(buf), 0, (struct sockaddr *)&servaddr, len);

    n = recvfrom(sockfd, buf, sizeof(buf) - 1, 0, NULL, NULL);
    if (n < 0) n = 0;
    buf[n] = '\0';
    printf("Client Received: %s\n", buf);
    close(sockfd);
}

Topics

• Transport Layer
– Transport Layer Services
• Multiplexing/Demultiplexing
– Connectionless and Connection Oriented
» TCP and UDP
• Reliable data transfer (Protocol design)
• Flow control
• Congestion control

Transport Layer Services and Protocols

• Provides logical communication between app processes
– App processes send msgs to each other using this logical communication
• Extends host-to-host delivery to process-to-process delivery

TP Layer vs. Network Layer
• Network layer: logical communication between hosts
• TP layer: logical communication between processes
• TP layer services are constrained by the service model of the underlying network-layer protocol
• But certain services can be offered by the TP layer even when the network layer doesn’t offer them
– e.g., Reliable data transfer

Transport Layer Services

• Reliable in-order delivery (TCP)


– Congestion control
– Flow control
– Connection setup

• Unreliable, unordered delivery (UDP)


– Extension of “best-effort” IP

Process-to-Process Delivery Service
• Multiplexing at sending time: handle data from multiple sockets, add transport header
• Demux at receiving time: use header info to deliver received segments to correct socket
[Figure: three hosts, each with an application/transport/network/link/physical stack; processes P1 to P4 attach to the transport layer through sockets]

Demultiplexing at Receiver
• Host receives IP datagrams
– Each datagram has source IP address, destination IP address
– Each datagram carries one transport-layer segment
– Each segment has source, destination port number
• Host uses IP addresses & port numbers to direct segment to appropriate socket
[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (payload)]

Connectionless (UDP) Demultiplexing

• When host receives UDP segment:
– Checks destination port # in segment and directs segment to the socket with that port #
• Recall: when creating datagram to send into UDP socket, must specify
– Destination IP address
– Destination port #

Example: Connectionless Demultiplexing
• Server process P1: DatagramSocket serverSocket = new DatagramSocket(6428);
• Client process P3: DatagramSocket mySocket2 = new DatagramSocket(9157);
• Client process P4: DatagramSocket mySocket1 = new DatagramSocket(5775);
[Figure: segments between P3 and the server carry (source port: 9157, dest port: 6428) toward the server and (source port: 6428, dest port: 9157) in reply; the port numbers on the segments between P4 and the server are shown as “?” for the reader to fill in]

Connection Oriented Demultiplexing

• TCP socket is identified by a 4-tuple:
– Source IP address, Source port #, Destination IP address, Destination port #
• Demultiplexing: receiver uses all four values to direct segment to appropriate socket
• Server host may support many simultaneous TCP sockets:
– Web servers have different sockets for each connecting client
– e.g., non-persistent HTTP will have different socket for each request

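The 4-tuple rule can be illustrated with a small lookup table in C. This is only a sketch of the matching logic, not how an operating system stores its connections; the structure `tcp_conn`, the field `sock_id`, and the function `tcp_demux` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One established TCP connection as seen by the receiving host. */
struct tcp_conn {
    uint32_t src_ip;    /* remote (sender) IP address */
    uint16_t src_port;  /* remote (sender) port */
    uint32_t dst_ip;    /* local IP address */
    uint16_t dst_port;  /* local port */
    int sock_id;        /* hypothetical handle of the socket to deliver to */
};

/* Return the socket for a segment with the given 4-tuple, or -1 if no
 * connection matches. All four values must match. */
int tcp_demux(const struct tcp_conn *table, size_t n,
              uint32_t src_ip, uint16_t src_port,
              uint32_t dst_ip, uint16_t dst_port)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].src_ip == src_ip && table[i].src_port == src_port &&
            table[i].dst_ip == dst_ip && table[i].dst_port == dst_port)
            return table[i].sock_id;
    }
    return -1;
}
```

Two segments that both target port 80 on the same server still reach different sockets as long as their source IP address or source port differs. A UDP receiver, by contrast, would compare only the destination port.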
Example: Connection Oriented Demux
[Figure: a host with IP address A, a server with IP address B running processes P4, P5, P6, and a host with IP address C, each with an application/transport/network/link/physical stack]
Segments shown:
• source IP,port: A,9157; dest IP,port: B,80
• source IP,port: C,5775; dest IP,port: B,80
• source IP,port: C,9157; dest IP,port: B,80
• source IP,port: B,80; dest IP,port: A,9157
Three segments, all destined to IP address B, dest port 80, are demultiplexed to different sockets

Example: threaded server
[Figure: same hosts (IP addresses A, B, C) and same segments as in the connection-oriented demux example, but the server at IP address B runs a single threaded server process P4 instead of separate processes]

User Datagram Protocol [RFC 768]

• Best effort service
– UDP segments may be lost or delivered out of order to the app
• Connectionless
– No handshaking between sender and receiver
• Each UDP segment handled independently of others

UDP Segment Header
[Figure: UDP segment format, 32 bits wide: source port #, dest port #, length, checksum, application data (payload). The length field gives the length, in bytes, of the UDP segment, including header]
Why is there a UDP?
• No connection establishment (which can add delay)
• Simple: no connection state at sender, receiver
• Small header size
• No congestion control: UDP can blast away as fast as desired

UDP Checksum

• At the sender, treat segment contents (including header fields) as a sequence of 16-bit integers
– Sum all such 16-bit words in the segment, wrapping any carry out of the top bit back into the sum (one’s complement addition)
– One’s complement of the sum is put in checksum field
• At the receiver, all 16-bit words are added (including checksum) to detect errors in the segment; with no errors the result is all 1s

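The checksum steps above can be sketched in C as follows. The sketch works on an array of 16-bit words that are already in a consistent byte order, and it leaves out details of real UDP such as the pseudo-header and odd-length padding; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One's complement sum of n 16-bit words, with end-around carry. */
uint16_t ones_complement_sum(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += words[i];
        sum = (sum & 0xFFFFu) + (sum >> 16);  /* wrap the carry around */
    }
    return (uint16_t)sum;
}

/* Checksum the sender places in the header: complement of the sum. */
uint16_t internet_checksum(const uint16_t *words, size_t n)
{
    return (uint16_t)~ones_complement_sum(words, n);
}
```

As a worked case, the words 0x6660, 0x5555 and 0x8F0C sum to 0x4AC2 after the carry is wrapped, so the checksum is 0xB53D. A receiver that adds all four words (the three data words plus the checksum) gets 0xFFFF, i.e. all 1s, when nothing was corrupted.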
Principles of Reliable Data Transfer

• Important in application, transport, link layers
• Top-10 list of important networking topics!
• Characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable Data Transfer: getting started
• rdt_send(): called from above (e.g., by app.). Passed data to be delivered to receiver’s upper layer
• udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
• rdt_rcv(): called when packet arrives on rcv-side of channel
• deliver_data(): called by rdt to deliver data to upper layer
[Figure: send side and receive side of the rdt protocol, connected by the unreliable channel]

Reliable Data Transfer: getting started

We will:
• Incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
• Consider only unidirectional data transfer
– But control info will flow in both directions!
• Use finite state machines (FSM) to specify sender, receiver
– State: when in this “state”, the next state is uniquely determined by the next event
– Each transition is labeled with the event causing the state transition and the actions taken on the state transition
[Figure: FSM notation, an arrow from state 1 to state 2 labeled with event over actions]

rdt1.0: reliable transfer over a reliable channel

• Underlying channel perfectly reliable
– No bit errors, no loss of packets
• Separate FSMs for sender, receiver:
– Sender sends data into underlying channel
– Receiver reads data from underlying channel
• Sender FSM (single state “Wait for call from above”):
– Event: rdt_send(data)
– Actions: packet = make_pkt(data); udt_send(packet)
• Receiver FSM (single state “Wait for call from below”):
– Event: rdt_rcv(packet)
– Actions: extract(packet, data); deliver_data(data)

rdt2.0: channel with bit errors
• Underlying channel may flip bits in packet
– Don’t worry: the checksum is there to detect bit errors
• The question: how to recover from errors?
– Acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
– Negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
– Sender retransmits pkt on receipt of NAK
• New mechanisms in rdt2.0 (beyond rdt1.0):
– Error detection
– Receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM Specification
Sender:
• State “Wait for call from above”
– rdt_send(data): sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to “Wait for ACK or NAK”
• State “Wait for ACK or NAK”
– rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt); stay in this state
– rdt_rcv(rcvpkt) && isACK(rcvpkt): no action (Λ); go to “Wait for call from above”
Receiver:
• State “Wait for call from below”
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt,data); deliver_data(data); udt_send(ACK)

rdt2.0: Operation with no Errors
[Figure: the rdt2.0 sender and receiver FSMs with the error-free path highlighted: sender sends sndpkt, receiver finds it not corrupt, delivers the data and sends ACK, and the sender returns to “Wait for call from above”]

rdt2.0: Error Scenario
[Figure: the rdt2.0 sender and receiver FSMs with the error path highlighted: receiver finds the packet corrupt and sends NAK, and the sender retransmits sndpkt while staying in “Wait for ACK or NAK”]

rdt2.0 Has a fatal flaw!

• What happens if ACK/NAK corrupted?
– Sender doesn’t know what happened at receiver!
– Simple fix: just retransmit, but this can create duplicates
• How to handle duplicates?
– Sender adds sequence number to each pkt
– Receiver discards (doesn’t deliver up) duplicate pkt

rdt2.1: Sender, handles garbled ACK/NAKs
• State “Wait for call 0 from above”
– rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to “Wait for ACK or NAK 0”
• State “Wait for ACK or NAK 0”
– rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt); stay in this state
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): no action (Λ); go to “Wait for call 1 from above”
• State “Wait for call 1 from above”
– rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to “Wait for ACK or NAK 1”
• State “Wait for ACK or NAK 1”
– rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt); stay in this state
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): no action (Λ); go to “Wait for call 0 from above”

rdt2.1: Receiver, handles garbled ACK/NAKs
• State “Wait for 0 from below”
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to “Wait for 1 from below”
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
• State “Wait for 1 from below”
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to “Wait for 0 from below”
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)

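The rdt2.1 receiver has only two states, so it can be sketched in C with a single expected-sequence field. The sketch abstracts the packet to two inputs (whether the checksum failed, and the sequence bit) and counts deliveries instead of calling extract() and deliver_data(); all type and function names are hypothetical.

```c
#include <stdbool.h>

enum rdt_reply { RDT_ACK, RDT_NAK };

struct rdt21_receiver {
    int expected_seq;   /* 0 or 1: sequence number the receiver waits for */
    int delivered;      /* how many packets were passed to the upper layer */
};

/* Handle one arriving packet. corrupt is the checksum verdict and seq the
 * packet's sequence bit. Returns the reply to send to the sender. */
enum rdt_reply rdt21_on_packet(struct rdt21_receiver *r, bool corrupt, int seq)
{
    if (corrupt)
        return RDT_NAK;                 /* ask for a retransmission */
    if (seq == r->expected_seq) {
        r->delivered++;                 /* stands in for extract + deliver_data */
        r->expected_seq = 1 - r->expected_seq;
    }
    /* A duplicate (wrong seq) is ACKed again but not delivered. */
    return RDT_ACK;
}
```

If the sender retransmits packet 0 because its ACK was garbled, the second copy arrives while the receiver expects 1, so it is acknowledged again without being delivered twice.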
rdt2.1: Discussion

Sender:
• Seq # added to pkt
• Two seq. #’s (0,1) will suffice. Why?
• Must check if received ACK/NAK corrupted
• Twice as many states
– State must “remember” whether “current” pkt has 0 or 1 seq. #

Receiver:
• Must check if received packet is duplicate
– State indicates whether 0 or 1 is expected pkt seq #
– For a duplicate (out-of-order) packet, it re-sends the ACK but does not deliver the data again

• Note: Receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: NAK Free Protocol

• Same functionality as rdt2.1, using ACKs only
• Instead of NAK, receiver sends ACK for last pkt received OK
– Receiver must explicitly include seq # of pkt being ACKed
• A duplicate ACK at the sender results in the same action as a NAK: retransmit the current pkt

rdt3.0: Channels with errors and loss

• New assumption: Underlying channel can also lose packets (data or ACKs)
– Checksum, seq. #, ACKs, retransmissions will be of help, but not enough
• Approach: Sender waits “reasonable” amount of time for ACK
– Retransmits if no ACK received in this time
– If pkt (or ACK) just delayed (not lost):
• Retransmission will be duplicate, but use of seq. #’s already handles this
• Receiver must specify seq # of pkt being ACKed
– Requires countdown timer

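The sender behaviour described above (send, start a timer, retransmit on timeout, move on only when the matching ACK arrives) can be sketched as a small event-driven state machine in C. This is an illustrative model under simplifying assumptions: the timer is a boolean flag, transmissions are counted rather than sent, and all names are hypothetical.

```c
#include <stdbool.h>

enum rdt3_event { EV_SEND, EV_ACK, EV_TIMEOUT };

struct rdt3_sender {
    int seq;            /* sequence bit of the current packet: 0 or 1 */
    bool waiting;       /* true while an ACK for seq is outstanding */
    bool timer_running; /* stands in for the countdown timer */
    int transmissions;  /* counts calls that would reach udt_send() */
};

/* Advance the sender by one event. ack_seq and ack_corrupt are only
 * examined for EV_ACK. Returns true if a packet was (re)transmitted. */
bool rdt3_sender_step(struct rdt3_sender *s, enum rdt3_event ev,
                      int ack_seq, bool ack_corrupt)
{
    switch (ev) {
    case EV_SEND:
        if (s->waiting)
            return false;           /* stop-and-wait: one packet in flight */
        s->waiting = true;
        s->timer_running = true;
        s->transmissions++;
        return true;
    case EV_TIMEOUT:
        if (!s->waiting)
            return false;
        s->timer_running = true;    /* restart timer and retransmit */
        s->transmissions++;
        return true;
    case EV_ACK:
        if (s->waiting && !ack_corrupt && ack_seq == s->seq) {
            s->waiting = false;
            s->timer_running = false;
            s->seq = 1 - s->seq;    /* next packet uses the other bit */
        }
        return false;               /* corrupt or wrong-seq ACK: ignore */
    }
    return false;
}
```

Note that a corrupted ACK or an ACK for the wrong sequence number triggers nothing by itself; recovery is left to the timeout, which is what lets the same mechanism cover both lost packets and lost ACKs.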
rdt 3.0 Sender and Receiver FSM

Topics

• Transport Layer
• Reliable data transfer (Protocol design)
– Stop and Wait vs. Pipelining (Sliding Window)
– Go Back N and Selective Repeat Protocols
• Flow control
• Congestion control

rdt3.0 in action

rdt3.0 (Lost ACK and Premature Timeout)

rdt3.0: Performance
Timeline (sender and receiver):
• Sender: first packet bit transmitted, t = 0
• Sender: last packet bit transmitted, t = L / R
• Receiver: first packet bit arrives
• Receiver: last packet bit arrives, send ACK
• Sender: ACK arrives, send next packet, t = RTT + L / R

Example: 1 Gbps link, 15 ms end-to-end prop. delay (RTT = 30 ms), 1 KB packet (L / R = 8 microseconds = 0.008 ms):

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} \approx 0.00027$$

Pipelining: Increased Utilization
Timeline (sender and receiver), three packets in flight:
• Sender: first packet bit transmitted, t = 0
• Sender: last bit transmitted, t = L / R
• Receiver: first packet bit arrives
• Receiver: last packet bit arrives, send ACK
• Receiver: last bit of 2nd packet arrives, send ACK
• Receiver: last bit of 3rd packet arrives, send ACK
• Sender: ACK arrives, send next packet, t = RTT + L / R

Increase utilization by a factor of 3!

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} \approx 0.0008$$

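
The two utilization figures (stop-and-wait and three-packet pipelining) can be checked with a few lines of Python. This is a minimal sketch; the function name `sender_utilization` is hypothetical, and the 1 KB packet is taken as 8000 bits, which is what the 0.008 ms transmission time implies.

```python
def sender_utilization(pkt_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy sending: window * (L/R) / (RTT + L/R)."""
    t_tx = pkt_bits / rate_bps                    # L / R in seconds
    return min(1.0, window * t_tx / (rtt_s + t_tx))


L_BITS = 8000      # assumption: 1 KB packet counted as 8000 bits
RATE = 1e9         # 1 Gbps link
RTT = 0.030        # 2 x 15 ms propagation delay

u_stop_and_wait = sender_utilization(L_BITS, RATE, RTT)
u_pipelined = sender_utilization(L_BITS, RATE, RTT, window=3)
print(f"stop-and-wait: {u_stop_and_wait:.5f}, window of 3: {u_pipelined:.5f}")
```

The `min(1.0, ...)` cap reflects that once the window covers the whole RTT the link is simply kept full.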
Pipelined Protocols

• Pipelining: Sender allows multiple, “in-flight”, yet-to-be-acked pkts


– Range of sequence numbers must be increased
– Buffering at sender and/or receiver

Pipelining Protocols Requirements

• The range of sequence numbers must be increased


– Multiple in-transit packets

• Packet Buffering is required at both sides. Why?

• Two basic approaches


– Go-Back-N (GBN)
– Selective Repeat (SR)

GBN Protocol
Trace with sender window N = 4 (sequence numbers 0 to 8); pkt2 is lost.

Sender:
• send pkt0, send pkt1, send pkt2 (lost), send pkt3, then wait
• rcv ack0, send pkt4
• rcv ack1, send pkt5
• ignore duplicate ACK
• pkt 2 timeout: send pkt2, send pkt3, send pkt4, send pkt5

Receiver:
• receive pkt0, send ack0
• receive pkt1, send ack1
• receive pkt3, discard, (re)send ack1
• receive pkt4, discard, (re)send ack1
• receive pkt5, discard, (re)send ack1
• rcv pkt2, deliver, send ack2
• rcv pkt3, deliver, send ack3
• rcv pkt4, deliver, send ack4
• rcv pkt5, deliver, send ack5

GBN Sender

• K-bit seq # in pkt header (modulo 2^K arithmetic)

• A “window” of up to N consecutive unack’ed pkts allowed

GBN Sender FSM
Initial action (Λ):
  base=0
  nextseqnum=0

Event: rdt_send(data)
  if (nextseqnum < base+N) { /* window not full: we are allowed to send */
    sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
    udt_send(sndpkt[nextseqnum])
    if (base == nextseqnum) /* there are no packets in flight */
      start_timer
    nextseqnum++
  }
  else
    refuse_data(data)

Event: timeout
  start_timer
  udt_send(sndpkt[base])
  udt_send(sndpkt[base+1])
  …
  udt_send(sndpkt[nextseqnum-1])

Event: rdt_rcv(rcvpkt) && corrupt(rcvpkt)
  Λ (no action)

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
  base = getacknum(rcvpkt)+1 /* advance the left edge of the window */
  if (base == nextseqnum)
    stop_timer
  else
    start_timer

GBN Receiver FSM
• Always send ACK for correctly-received pkt with highest in-order seq #
– Need only to remember “expectedseqnum”
• If out-of-order pkt arrived
– Discard it
– Re-ACK pkt with the highest in-order seq #

Initial action (Λ):
  expectedseqnum=0
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)

Event: default
  udt_send(sndpkt)

Event: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
  extract(rcvpkt,data)
  deliver_data(data)
  sndpkt = make_pkt(expectedseqnum,ACK,chksum)
  udt_send(sndpkt)
  expectedseqnum++

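
The GBN sender and receiver behaviour can be sketched as a small simulation that replays the "pkt2 lost" trace. This is a simplified model, not the course's reference code: sequence numbers are unbounded (no modulo 2^K wrap-around), the channel never corrupts packets, the timer is reduced to a boolean flag, a timeout is assumed to fire whenever the window is stuck, and duplicate ACKs are explicitly ignored. The names `GbnSender`, `GbnReceiver` and `run_demo` are hypothetical.

```python
class GbnSender:
    def __init__(self, window):
        self.N = window
        self.base = 0                 # oldest unACKed seq #
        self.nextseqnum = 0           # next seq # to use
        self.sndpkt = {}              # seq # -> packet kept for retransmission
        self.timer_running = False

    def rdt_send(self, data):
        """Return the packet to transmit, or None if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return None               # refuse_data(data)
        pkt = (self.nextseqnum, data)
        self.sndpkt[self.nextseqnum] = pkt
        if self.base == self.nextseqnum:
            self.timer_running = True  # first packet in flight starts the timer
        self.nextseqnum += 1
        return pkt

    def on_ack(self, acknum):
        if acknum < self.base:
            return                    # duplicate ACK: ignore
        self.base = acknum + 1        # cumulative ACK slides the window
        self.timer_running = self.base != self.nextseqnum

    def on_timeout(self):
        """Go back N: resend every packet that is still unACKed."""
        self.timer_running = True
        return [self.sndpkt[s] for s in range(self.base, self.nextseqnum)]


class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 0
        self.delivered = []

    def receive(self, seq, data):
        """Return the cumulative ACK number (None if nothing is in order yet)."""
        if seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # Out-of-order packets are discarded; the highest in-order seq # is re-ACKed.
        return self.expectedseqnum - 1 if self.expectedseqnum > 0 else None


def run_demo(messages, window=4, lose_once=(2,)):
    sender, receiver = GbnSender(window), GbnReceiver()
    to_lose = set(lose_once)          # seq #s dropped on their first transmission
    pending = list(messages)
    log = []                          # (seq sent, ACK received or None)
    while sender.base < len(messages):
        batch = []
        while pending and (pkt := sender.rdt_send(pending[0])) is not None:
            pending.pop(0)
            batch.append(pkt)
        if not batch:                 # window stuck: model a timeout
            batch = sender.on_timeout()
        for seq, data in batch:
            if seq in to_lose:
                to_lose.discard(seq)
                ack = None            # packet lost in the channel
            else:
                ack = receiver.receive(seq, data)
            log.append((seq, ack))
            if ack is not None:
                sender.on_ack(ack)
    return receiver.delivered, log


delivered, log = run_demo("abcdef")
print(delivered)
print([seq for seq, _ in log])
```

In this run packets 3, 4 and 5 are each transmitted twice even though only pkt2 was lost, which is the cost of the receiver keeping no buffer.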
Selective Repeat Protocol
Trace with sender window N = 4 (sequence numbers 0 to 8); pkt2 is lost.

Sender:
• send pkt0, send pkt1, send pkt2 (lost), send pkt3, then wait
• rcv ack0, send pkt4
• rcv ack1, send pkt5
• record ack3 arrived
• pkt 2 timeout: send pkt2
• record ack4 arrived
• record ack5 arrived

Receiver:
• receive pkt0, send ack0
• receive pkt1, send ack1
• receive pkt3, buffer, send ack3
• receive pkt4, buffer, send ack4
• receive pkt5, buffer, send ack5
• rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2

Q: what happens when ack2 arrives?

Selective Repeat (SR) Protocol

• Receiver individually acknowledges all correctly received pkts


– Buffers pkts, as needed, for eventual in-order delivery to upper layer

• Sender only resends pkts for which ACK not received


– Sender keeps timer for each unACKed pkt

• Retransmit only that unacked packet for which timer expires

SR Protocol: Windows

Events at Sender
– Data from above
– Timeout
– ACK(n) in [sendbase,sendbase+N-1]

Events at Receiver
– Pkt n in [rcvbase, rcvbase+N-1]
– Pkt n in [rcvbase-N,rcvbase-1]
– Otherwise

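
The three receiver events can be sketched as follows. The actions attached to each case (ACK and buffer, re-ACK, ignore) follow the standard textbook description of Selective Repeat rather than anything spelled out on this slide, and the sketch assumes unbounded sequence numbers (no wrap-around). The class name `SrReceiver` is hypothetical.

```python
class SrReceiver:
    def __init__(self, window):
        self.N = window
        self.rcvbase = 0      # lowest seq # not yet delivered
        self.buffer = {}      # out-of-order packets waiting for the gap to fill
        self.delivered = []

    def receive(self, n, data):
        """Return the ACK number to send, or None to stay silent."""
        if self.rcvbase <= n < self.rcvbase + self.N:
            # Pkt n in [rcvbase, rcvbase+N-1]: ACK it individually and buffer it.
            self.buffer.setdefault(n, data)
            # Deliver every in-order packet now available and slide the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.N <= n < self.rcvbase:
            # Pkt n in [rcvbase-N, rcvbase-1]: already delivered, so the earlier
            # ACK was probably lost; ACK again so the sender can move on.
            return n
        return None           # Otherwise: ignore
```

Feeding it packets in the order 0, 1, 3, 4, 5, 2 with N = 4 leaves 3, 4 and 5 buffered until pkt2 arrives, at which point all four are delivered together.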
Selective Repeat Dilemma
[Figure: sequence number space 0 1 2 3 0 1 2. Sender transmits pkt0, pkt1, pkt2 and the receiver advances its window after each one; the three ACKs are lost (X); sender times out and retransmits pkt0; receiver will accept packet with seq number 0.]

Relation between Window Size and Sequence Number
• Sequence number range for K bits
– 0 to 2^K - 1
• What should be the window size N for
– Selective Repeat
– Go Back N
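
One way to explore the question is a brute-force check of the worst case, in which the sender's whole first window is delivered but every ACK is lost, so the sender retransmits old packets while the receiver already expects new ones. The sketch below is an illustration under that single scenario; the function names are hypothetical.

```python
def sr_window_ok(n, k):
    """SR: old sender window and new receiver window must not share seq #s."""
    space = 2 ** k
    old = {s % space for s in range(0, n)}       # may be retransmitted
    new = {s % space for s in range(n, 2 * n)}   # receiver window after delivering n pkts
    return not (old & new)


def gbn_window_ok(n, k):
    """GBN: the single expected seq # must not match any retransmitted packet."""
    space = 2 ** k
    old = {s % space for s in range(0, n)}
    return (n % space) not in old


def max_window(ok, k):
    return max(n for n in range(1, 2 ** k + 1) if ok(n, k))


for k in (2, 3):
    print(k, max_window(sr_window_ok, k), max_window(gbn_window_ok, k))
```

Under this check the largest safe window comes out as 2^(K-1) for Selective Repeat and 2^K - 1 for Go-Back-N, which matches the usual textbook bounds.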

Topics

• Transport Layer
• TCP Protocol
– Connection Establishment
– TCP Segment Structure
– Reliable data transfer
– Flow control
– Congestion control

TCP [RFCs: 793, 1122, 1323, 2018, 2581]

• Point to Point protocol


– One sender and one receiver
• Reliable in-order byte stream
– No message boundaries
• Pipelined
– Window size is set by congestion and flow control
• Full duplex data
– Bi-directional data flow in same connection
• Connection oriented
– Handshaking (exchange of control msgs)
• Flow controlled
– Sender does not overwhelm receiver
TCP Segment Structure
Header fields (each row is 32 bits wide):
• source port #, dest port #
• sequence number (counting by bytes of data, not segments!)
• acknowledgement number
• head len, not used, flag bits U A P R S F, receive window (# bytes rcvr willing to accept)
• checksum (Internet checksum, as in UDP), Urg data pointer
• options (variable length)
• application data (variable length)

Flag bits:
– URG: urgent data (generally not used)
– ACK: ACK # valid
– PSH: push data now (generally not used)
– RST, SYN, FIN: connection estab (setup, teardown commands)

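
The fixed 20-byte part of this layout can be unpacked with Python's standard `struct` module. This is a minimal sketch: the function name `parse_tcp_header` and the dictionary keys are hypothetical, options are skipped rather than decoded, and the checksum is not verified.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    # "!" = network byte order; H = 16 bits, I = 32 bits, B = 8 bits.
    (src, dst, seq, ack, off_res, flags,
     rwnd, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (off_res >> 4) * 4          # head len is given in 32-bit words
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 .. bit 5
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [name for bit, name in enumerate(names) if flags & (1 << bit)],
        "rwnd": rwnd,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "payload": segment[header_len:],     # application data after any options
    }


# Hand-built SYNACK-style header with a 2-byte payload, for illustration only.
raw = struct.pack("!HHIIBBHHH", 12345, 80, 1000, 2001, 5 << 4, 0x12, 4096, 0, 0) + b"hi"
print(parse_tcp_header(raw))
```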
TCP: Wireshark Capture

TCP Sequence Numbers and ACKs
• TCP views data as stream of bytes
• Sequence number reflects stream of transmitted bytes, not segments
• Sequence number of a segment: byte stream number of the first byte in the segment

[Figure: outgoing segment from sender and incoming segment to sender, each showing source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer. Sender sequence number space with window size N: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable.]

Connection Management

• Before exchanging data, sender/receiver do “handshake”


– Agree on connection parameters

[Figure: client and server each hold the same connection information between application and network]
– connection state: ESTAB
– connection variables: seq # client-to-server and server-to-client; rcvBuffer size at server, client

Client: Socket clientSocket = newSocket("hostname","port number");
Server: Socket connectionSocket = welcomeSocket.accept();

2-way Handshake

• Will 2-way handshake always work in network?

[Figure: two exchanges. “Let’s talk” answered by “OK”, after which both sides are ESTAB. Client chooses x and sends req_conn(x), server replies acc_conn(x), after which both sides are ESTAB.]

TCP 3-way Handshake

Initial states: client LISTEN, server LISTEN

1. Client (LISTEN → SYNSENT): choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x)
2. Server (LISTEN → SYN RCVD): choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1)
3. Client (SYNSENT → ESTAB): received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data
4. Server (SYN RCVD → ESTAB): received ACK(y) indicates client is live

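
The numbering in the three segments can be written out as a tiny sketch. The dictionaries below are a hypothetical stand-in for real segments and only carry the fields named on the slide; the modulo reflects the 32-bit width of TCP sequence numbers.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32 bits wide


def three_way_handshake(x, y):
    """Return the SYN, SYNACK and ACK for initial seq nums x (client), y (server)."""
    syn = {"SYN": 1, "seq": x}
    synack = {"SYN": 1, "ACK": 1, "seq": y, "acknum": (syn["seq"] + 1) % SEQ_SPACE}
    ack = {"ACK": 1, "acknum": (synack["seq"] + 1) % SEQ_SPACE}
    return [syn, synack, ack]


for segment in three_way_handshake(x=100, y=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how the SYN "consumes" one sequence number even though it carries no data.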
Lost ACK Scenario

Premature Timeout

Cumulative ACK

Is TCP GBN or SR…?

1. Are out-of-order segments individually ACKed?
2. Are ACKs cumulative?
3. How many timers are maintained by the sender?
4. Does the TCP receiver accept out-of-order segments?

TCP Connection Close

Initial states: client ESTAB, server ESTAB

1. Client (ESTAB → FIN_WAIT_1): clientSocket.close(); send FINbit=1, seq=x; can no longer send but can receive data
2. Server (ESTAB → CLOSE_WAIT): send ACKbit=1; ACKnum=x+1; can still send data
3. Client (FIN_WAIT_1 → FIN_WAIT_2): wait for server close
4. Server (CLOSE_WAIT → LAST_ACK): send FINbit=1, seq=y; can no longer send data
5. Client (FIN_WAIT_2 → TIMED_WAIT): send ACKbit=1; ACKnum=y+1; timed wait for 2*max segment lifetime, then CLOSED
6. Server (LAST_ACK → CLOSED)

TCP Connection States-Client and Server

TCP Flow Control

• Receiver advertises free buffer space by including rwnd value in TCP header
– RcvBuffer size set via socket options (typical default is 4096 bytes)
– many operating systems auto adjust RcvBuffer

• Sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value

[Figure: receiver-side buffering. TCP segment payloads enter RcvBuffer, which holds buffered data plus rwnd free buffer space; buffered data is passed on to the application process.]

TCP Timeout

• How to set TCP Timeout value?


– Must be longer than RTT
– Too short vs. too long

• How to estimate RTT?


– RTT: measured time from segment transmission until ACK receipt

RTT Estimation
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
– Influence of past sample decreases exponentially fast
– Typical value of α = 0.125

[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr. RTT (milliseconds, 100 to 350) vs. time (seconds); curves: SampleRTT and Estimated RTT.]

Timeout Interval

• Timeout Interval
– Estimated RTT + “Safety margin”
– Large variation in Estimated RTT → large safety margin

DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
(estimated RTT + “safety margin”)

Example: Timeout Interval

• Consider three RTT samples (in ms): 150, 200 and 210 in that order. Assume
initial estimated RTT= 200 ms, initial DevRTT = 50 ms, β = 0.25 and α = 0.125
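
A short script can work through this example. One ordering choice has to be made and is an assumption here: DevRTT is updated first, using the EstimatedRTT value from before the current sample (the order RFC 6298 prescribes for its equivalent variables). Updating EstimatedRTT first gives slightly different numbers. The function name `update_timeout` is hypothetical.

```python
def update_timeout(est_rtt, dev_rtt, sample, alpha=0.125, beta=0.25):
    """Apply one SampleRTT; return (EstimatedRTT, DevRTT, TimeoutInterval) in ms."""
    # Assumption: DevRTT uses the previous EstimatedRTT, then EstimatedRTT is updated.
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample - est_rtt)
    est_rtt = (1 - alpha) * est_rtt + alpha * sample
    return est_rtt, dev_rtt, est_rtt + 4 * dev_rtt


est, dev = 200.0, 50.0          # initial EstimatedRTT and DevRTT from the example
timeouts = []
for sample in (150, 200, 210):
    est, dev, timeout = update_timeout(est, dev, sample)
    timeouts.append(timeout)
    print(f"sample={sample}  EstimatedRTT={est:.2f}  DevRTT={dev:.2f}  Timeout={timeout:.2f}")
```

With this ordering the timeout interval after the three samples is about 393.75 ms, 350.78 ms and 329.12 ms: it shrinks as the samples settle near the estimate and DevRTT decays.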

Next…

• Transport Layer
• TCP Protocol
– Connection Establishment
– TCP Segment Structure
– Reliable data transfer
– Flow control
– Timeout Estimation
– Congestion control

Network Congestion…?

10 Mbps
1.5 Mbps
100 Mbps

• Why is it a problem?
– Different sources compete for resources inside
network
– Sources are unaware of current state of resource
– Sources are unaware of each other
– In many situations will result in < 1.5 Mbps of
throughput (congestion collapse)
Congestion Control

• What is congestion
– Too many sources sending too much data too fast for network to handle

• Congestion results in
– Packet losses
– Packet delays
– Throughput reduction

Causes/Cost of Congestion [.1]

• Two senders and two receivers


• One router with infinite buffers
• Output link capacity R
• No retransmission

[Figure: two plots against λin, each with the value R/2 marked on the λin axis: throughput λout, which levels off at R/2, and delay.]

Causes/Cost of Congestion [..2]

• One router, finite buffers
• Sender retransmission of timed-out packet
– Application-layer input = Application-layer output: λin = λout
– Transport-layer input includes retransmissions: λ'in ≥ λin

Causes/Cost of Congestion […3]

• Packets can be dropped at router due to full buffers
• Sender only resends if packet known to be lost (Tricky Assumption…)

• Cost of congestion
– Retransmission for dropped packets

Causes/Cost of Congestion [….4]

• Packets can be lost, dropped at router due to full buffers


– Sender times out prematurely, sending two copies, both of which are
delivered

Causes/Cost of Congestion […..5]

When a packet is dropped, any upstream transmission capacity used for that packet was wasted!

Approaches Towards Congestion Control

• Network Assisted Congestion control


– Routers provide feedback to end systems

• End-to-end Congestion control


– No explicit feedback from network

TCP Congestion Control

• Approach
– Sending rate is a function of perceived congestion

• Three important questions arise


– How does sender perceive the congestion on the path?
– What algorithm should be used to change its sending rate?
– How does sender limit the sending rate?

TCP Congestion Control

• Sender limits transmission
– LastByteSent – LastByteAcked <= min(cwnd, rwnd)

[Figure: sender sequence number space. cwnd spans from the last byte ACKed to the last byte sent and covers the sent, not-yet ACKed (“in-flight”) data.]

• What is TCP Sending Rate/Throughput?

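
The sending limit, and a rough answer to the rate question, can be sketched as below. The approximation "about cwnd bytes per RTT" is the usual textbook estimate and ignores loss and rwnd; both function names are hypothetical.

```python
def can_send(last_byte_sent, last_byte_acked, nbytes, cwnd, rwnd):
    """True if nbytes more may be sent without exceeding min(cwnd, rwnd)."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= min(cwnd, rwnd)


def approx_rate_bps(cwnd_bytes, rtt_s):
    """Rough sending rate: about one congestion window of data per RTT."""
    return cwnd_bytes * 8 / rtt_s


print(can_send(5000, 3000, 1000, cwnd=4000, rwnd=8000))   # 2000 in flight + 1000 <= 4000
print(can_send(5000, 3000, 2500, cwnd=4000, rwnd=8000))   # 2000 in flight + 2500 > 4000
print(approx_rate_bps(10_000, 0.1))                       # bits per second
```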
Switching from Slow Start to CA

FSM Description of TCP Congestion Control

Example: TCP Congestion Control
a) Identify the intervals of time when TCP slow start is operating.
b) Identify the intervals of time when TCP congestion avoidance is operating.
c) What is the ssthresh value between transmission rounds 7 and 10?
d) What is the congestion window value at transmission round 11?
e) How many segments have been sent up to and including transmission round 11?
f) Identify the intervals of time when TCP fast retransmit and fast recovery are used.

TCP Sawtooth Behavior

[Figure: congestion window vs. time. Initial slow start, followed by the fast retransmit and recovery sawtooth; timeouts may still occur, followed by slow start to pace packets.]

TCP: AIMD Behavior

• Additively increase window size … until loss occurs (then cut window in half)

[Figure: cwnd (TCP sender congestion window size) vs. time. AIMD sawtooth behavior: probing for bandwidth.]

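
The sawtooth can be reproduced with a few lines of code. This is a deliberately simplified sketch: cwnd is counted in whole segments, one step is one RTT, slow start and timeouts are left out, and the rounds in which a loss is detected are supplied by hand. The function name `aimd` is hypothetical.

```python
def aimd(rounds, loss_rounds, cwnd=1):
    """Return cwnd (in segments) at the start of each round."""
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_rounds:
            cwnd = max(1, cwnd // 2)   # multiplicative decrease: cut window in half
        else:
            cwnd += 1                  # additive increase: one segment per RTT
    return history


print(aimd(rounds=8, loss_rounds={3, 6}, cwnd=4))
```

Plotting the returned list against the round number gives the familiar sawtooth shape.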
TCP Throughput

[Figure: congestion window size over time, with the level W/2 marked.]