The application layer


Olof Hagsand, NADA/KTH olofh@nada.kth.se January 20, 2005

The TCP/IP application layer contains protocols that enable applications to communicate.
The TCP/IP application layer roughly maps to three OSI layers:
  Session: session establishment, dialog control, synchronization
  Presentation: syntax and semantics of data: higher-level data structures
  Application: application-specific information and protocols
From its UNIX implementation roots, the definition of the application layer is: everything that is implemented in user space (not in the UNIX kernel)!

Clients, servers, peers

Computers connected to the Internet are end-systems or hosts (they host the application programs running on them).
Hosts are traditionally divided into clients and servers, although the difference is nowadays unclear.
From a program's point of view, it is easier:
  Client program: requests a service.
  Server program: provides a service.
  Peer: both a client and a server program.


The Socket Interface


The socket interface is used for programming applications with a network component.
Sometimes called BSD sockets: it was first implemented in C in BSD.
Variants exist for most programming languages. Winsock is almost the same, but not quite!
Other programming interfaces include:
  Streams
  Remote Procedure Calls (RPC)
The sockets API is a de facto standard for network programming.
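As an illustration of the client/server roles and the sockets API, here is a minimal sketch in Python (one of the language variants of the BSD interface). The function names `run_echo_server` and `echo_once` are made up for this example, and the server handles a single client on the loopback address only.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Server program: accept one client and echo back whatever it sends."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # empty read: the client closed its sending side
                break
            conn.sendall(data)


def echo_once(message: bytes) -> bytes:
    """Client program: connect to a local echo server and return its reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=run_echo_server, args=(listener,))
    server.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)   # tell the server the request is complete
        reply = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            reply += chunk

    server.join()
    listener.close()
    return reply
```

The same calls (socket, bind, listen, accept, connect, send, recv) exist under nearly identical names in the C API.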

Protocol message formats


When you transfer information from one host to another, the hosts need to understand each other's data (presentation layer).
Protocol messages are designed in different ways. Some issues:
  Performance: compact data for faster transmission, easy to parse by a computer.
  Readability: easy to read by humans (debugging, surveillance, editing).
  Common character sets: different languages, coding.
  Alignment and byte ordering: different CPU characteristics.

Approach 1: Binary fixed fields

Most common in the underlying layers of the TCP/IP stack.
Examples: DNS, RIP, OSPFv2, BGP, RTP
Predefines exactly what information is to be where in the message.
The semantics are hard-coded into the application.
And it is binary.

Binary fixed fields (cont)

Requires common alignment (i.e., on 16-, 32- or 64-bit boundaries).
Requires byte-swapping, depending on how the CPU loads its registers from memory. Two variants:
  Little endian (e.g., Intel): LSB (Least Significant Byte) first
  Big endian (e.g., Motorola): MSB (Most Significant Byte) first
Network byte order is big endian.
You need to byte-swap on i386 PCs.
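The two byte orders can be made visible with Python's standard `struct` module. This is only a small sketch; the value 0x12345678 is an arbitrary example.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)      # big endian: most significant byte first
little = struct.pack("<I", value)   # little endian: least significant byte first
network = struct.pack("!I", value)  # "!" means network byte order, i.e. big endian


def from_network_u32(data: bytes) -> int:
    """Decode a 32-bit unsigned integer received in network byte order."""
    return struct.unpack("!I", data)[0]
```

Using an explicit byte order in the format string does the byte-swapping when it is needed and nothing otherwise, so the same code works on both kinds of CPU.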

Example: DNS
The DNS header, taken from RFC 1035.
                                    1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      ID                       |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    QDCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ANCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    NSCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ARCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
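The fixed layout of the DNS header maps directly onto six 16-bit integers in network byte order. The sketch below packs and unpacks such a header and splits the second word into its flag fields; the names `DnsHeader`, `pack_header`, `unpack_header` and `flag_bits` are chosen for this example only.

```python
import struct
from typing import Dict, NamedTuple


class DnsHeader(NamedTuple):
    id: int
    flags: int      # QR, Opcode, AA, TC, RD, RA, Z, RCODE packed into 16 bits
    qdcount: int
    ancount: int
    nscount: int
    arcount: int


HEADER = struct.Struct("!6H")   # six unsigned 16-bit fields, network byte order


def pack_header(header: DnsHeader) -> bytes:
    return HEADER.pack(*header)


def unpack_header(data: bytes) -> DnsHeader:
    return DnsHeader(*HEADER.unpack(data[:HEADER.size]))


def flag_bits(flags: int) -> Dict[str, int]:
    """Split the 16-bit flags word; bit 0 in the figure is the most significant bit."""
    return {
        "qr": (flags >> 15) & 0x1,
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 0x1,
        "tc": (flags >> 9) & 0x1,
        "rd": (flags >> 8) & 0x1,
        "ra": (flags >> 7) & 0x1,
        "z": (flags >> 4) & 0x7,
        "rcode": flags & 0xF,
    }
```

Note how all the semantics (which bit means what) live in the program, not in the message: this is the "hard-coded" property of binary fixed fields.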

"When you feel the urge to design a [...] complex binary application protocol, it is generally wise to lie down until the feeling passes."
Eric Raymond: The Art of UNIX Programming

Pros & Cons

Pros:
  Compact: efficient computer processing
  Fixed syntax and simple semantics
Cons:
  Not extendable
  Not human readable
  Byte order and alignment problems

Approach 2: Tree based

Data is structured hierarchically, in a recursive structure.
Both binary and textual variants exist.
A more or less formal specification defines the data types, e.g., an XML DTD.
Examples: TLV, ASN.1, XML.

TLV - (Type, Length, Value)

Binary format usually used as an extensible part of a protocol.
  Type: contains a predefined code indicating what kind of data the value field contains.
  Length: contains the size (in bytes) of the value field.
  Value: contains the payload.
Examples: IS-IS and OSPFv3, DHCP, and IP options.
TLVs can be recursive (the value field contains new TLVs).
But there is no notion of specification: it must be added externally.

Example: DHCP
A vendor extension field taken from RFC 2132.

3.5. Router Option

   The router option specifies a list of IP addresses for routers on the
   client's subnet.  Routers SHOULD be listed in order of preference.

   The code for the router option is 3.  The minimum length for the
   router option is 4 octets, and the length MUST always be a multiple
   of 4.

    Code   Len         Address 1               Address 2
   +-----+-----+-----+-----+-----+-----+-----+-----+--
   |  3  |  n  |  a1 |  a2 |  a3 |  a4 |  a1 |  a2 |  ...
   +-----+-----+-----+-----+-----+-----+-----+-----+--
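A TLV parser is a short loop: read the type, read the length, slice out the value, repeat. The sketch below assumes one-byte type and one-byte length fields as in the router option above; it does not handle the length-less DHCP pad and end options. The function names are hypothetical.

```python
import ipaddress
from typing import Iterator, List, Tuple


def iter_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, value) pairs from a buffer of 1-byte-type, 1-byte-length TLVs."""
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated TLV header")
        code, length = data[pos], data[pos + 1]
        value = data[pos + 2 : pos + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        yield code, value
        pos += 2 + length


def parse_router_option(value: bytes) -> List[ipaddress.IPv4Address]:
    """Interpret the value of option code 3 as a list of IPv4 addresses."""
    if len(value) < 4 or len(value) % 4 != 0:
        raise ValueError("router option length must be a non-zero multiple of 4")
    return [ipaddress.IPv4Address(value[i : i + 4]) for i in range(0, len(value), 4)]
```

The loop can skip unknown types without understanding them, which is what makes TLVs extensible. The meaning of each type code still has to come from an external specification.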


XML

Plain-text markup language: simple syntax, easy to parse.
Definition declared externally by XML Schema or DTD.
Well suited for complex data formats with recursive and nested structures.
Cons: mainly its textual nature, since parsing can be inefficient.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "InternalNote.dtd">
<note>
  <to>Eva</to>
  <from>Phil</from>
  <heading>Reminder</heading>
  <body>Remember to go to the store!</body>
</note>

ASN.1 - Abstract Syntax Notation One

A general way to define data types.
ASN.1 is as powerful as a typed programming language.
In ASN.1 the type information is inherent in the data: no external specification is necessary.
Used frequently in ISO protocols, but also to a certain extent in TCP/IP protocols.
Some examples: SNMP, UMTS, LDAP, NFSv4 and many security protocols.

A tiny part of an SNMP definition:

PDU ::= SEQUENCE {
    request-id          Integer32,
    error-status        INTEGER {
                            noError(0),
                            tooBig(1),
                            noSuchName(2),
                            badValue(3),
                            readOnly(4),
                            ...
                            inconsistentName(18)
                        },
    error-index         INTEGER (0..max-bindings),
    variable-bindings   VarBindList
}
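To see how the type travels with the data, consider how an ASN.1 INTEGER such as error-status is put on the wire with the Basic Encoding Rules (BER): a tag byte (0x02 for INTEGER), a length byte, then the value in two's complement, most significant byte first. The sketch below handles only the short length form, and the function names are made up for this example.

```python
TAG_INTEGER = 0x02   # BER universal tag for INTEGER


def ber_encode_integer(value: int) -> bytes:
    """Encode an int as BER tag + length + minimal two's-complement content."""
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1   # leaves room for the sign bit
    if length > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    content = value.to_bytes(length, "big", signed=True)
    return bytes([TAG_INTEGER, length]) + content


def ber_decode_integer(data: bytes) -> int:
    """Decode a BER INTEGER produced by ber_encode_integer."""
    if data[0] != TAG_INTEGER:
        raise ValueError("not an INTEGER")
    length = data[1]
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    return int.from_bytes(data[2 : 2 + length], "big", signed=True)
```

For instance, inconsistentName(18) becomes the three bytes 02 01 12. This is itself a TLV, which is why ASN.1 with BER fits the tree-based approach.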


Approach 3: RFC 822 formats

Classical Internet format described by BNF (Backus-Naur Form), derived from context-free grammars.
Several RFCs describe the actual syntax: RFC 822, RFC 2068, and RFC 2234, where it is now called ABNF (Augmented BNF).
RFC 822 is syntax-heavy: keywords are introduced for parsing, and specific parsers are required.

RFC 822 based text protocols (cont)

For example:

  name = elements crlf      a rule
  crlf = %d13.10            characters to end a line
  "literal"                 a string, case insensitive
  element1 / element2       an alternative
  (element1 element2)       a strict sequence
  DIGIT = %x30-39           a range of characters
  <a>*<b>element            element repetition
  [foo bar]                 optional elements
  . . . and more . . .

RFC 822 based text protocols (cont)

Another example: in RFC 2068, the HTTP URL is defined as:

  http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
  host     = <A legal Internet host domain name or IP address
              (in dotted-decimal form), as defined by
              Section 2.1 of RFC 1123>
  port     = *DIGIT
  abs_path = "/" rel_path
  rel_path = [ path ] [ ";" params ] [ "?" query ]
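An ABNF rule translates fairly directly into a parser. The sketch below turns the http_URL rule into a regular expression. It simplifies the host rule to "any characters except / : ? #" and assumes port 80 and path "/" when they are omitted; the names `HttpUrl` and `parse_http_url` are hypothetical.

```python
import re
from typing import NamedTuple


class HttpUrl(NamedTuple):
    host: str
    port: int
    abs_path: str


# http_URL = "http:" "//" host [ ":" port ] [ abs_path ]
HTTP_URL = re.compile(
    r"^http://"
    r"(?P<host>[^/:?#]+)"          # host (simplified, see the text above)
    r"(?::(?P<port>[0-9]*))?"      # [ ":" port ], where port = *DIGIT
    r"(?P<abs_path>/.*)?$"         # [ abs_path ], where abs_path = "/" rel_path
)


def parse_http_url(url: str) -> HttpUrl:
    match = HTTP_URL.match(url)
    if match is None:
        raise ValueError("not an http URL: " + url)
    port_text = match.group("port")
    port = int(port_text) if port_text else 80     # assumed default
    return HttpUrl(match.group("host"), port, match.group("abs_path") or "/")
```

Each optional bracket in the grammar becomes an optional group in the expression, and the repetition `*DIGIT` becomes `[0-9]*`.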


Pros & Cons

Pros:
  Easy to extend and flexible.
  Human readable (easy to debug).
Cons:
  Not compact.
  Syntax-heavy: may require complex parsers.

Specific applications/protocols

telnet, http, tftp, ftp, smtp, snmp, rtp, sip
Others: Instant Messaging, Peer-to-peer, Distributed gaming.

TELNET - Terminal Network

(TCP port 23, text)
Virtual terminal: the local terminal appears to be a terminal on a remote system.
It is a nice tool to test other text-based protocols (HTTP, SMTP, FTP, etc.).
Good example of an interactive application.
Tinygrams lead to silly window syndrome:
  Nagle's algorithm
  Delayed ACK, etc.
Control: simple options (control bytes have the first bit set).
TELNET is security challenged: use TELNET with Kerberos, or SSH!

HTTP

(TCP port 80, RFC 2616, ABNF data)
The Hypertext Transfer Protocol is the main protocol used to download resources from the world wide web.
Simplest form: a requestor establishes a TCP connection to the web server on port 80, sends a string describing what resource it wants, and receives the resource in reply.
The most modern version today is HTTP/1.1.

HTTP - example

-> GET /stuff/blah.html HTTP/1.1
-> Host: zipf.pilsnet.sunet.se
-> User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031214 Firebird/0.7
-> Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
-> Accept-Language: en-us,en;q=0.5
-> Accept-Encoding: gzip,deflate
-> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
-> Keep-Alive: 300
-> Connection: keep-alive
<- HTTP/1.1 200 OK
<- Date: Tue, 27 Jan 2004 20:18:28 GMT
<- Server: Apache/1.3.27 (Unix) (Gentoo/Linux) PHP/4.3.4
<- Last-Modified: Tue, 27 Jan 2004 19:53:47 GMT
<- ETag: "bb4047-2c-4016c1cb"
<- Accept-Ranges: bytes
<- Content-Length: 44
<- Keep-Alive: timeout=15, max=100
<- Connection: Keep-Alive
<- Content-Type: text/html
<-
<- <html>
<- <b>
<- Hello there
<- </b>
<- </html>

Some HTTP commands

  GET http_URL: Download an HTTP resource.
  POST http_URL: Upload data to an HTTP resource.
  PUT http_URL: Write an HTTP resource.
  DELETE http_URL: Delete an HTTP resource.
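Because HTTP is a text protocol, a request can be built and a response taken apart with ordinary string handling. The sketch below does both without touching the network; the function names are hypothetical, and real clients must also handle details such as chunked transfer encoding and repeated headers.

```python
from typing import Dict, Tuple


def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request; lines end in CRLF."""
    lines = [
        "GET " + path + " HTTP/1.1",
        "Host: " + host,
        "Connection: close",
        "",          # empty line: end of the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes) -> Tuple[int, str, Dict[str, str], bytes]:
    """Split a response into (status code, reason phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case insensitive
    return int(code), reason, headers, body
```

Sending the bytes from `build_get_request` over a TCP connection to port 80 (for example by typing them into telnet) is all it takes to fetch a page.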


Some HTTP status codes

Some examples:
  200 OK
  404 Not Found
  301 Moved Permanently
  500 Internal Server Error

HTTP/1.1 persistent connections

In HTTP/1.0, every HTTP request generated a new TCP connection.
But most HTML documents contain sub-parts, which meant one TCP connection for each sub-request.
TCP congestion control, however, is made for longer connections, which can adapt to congestion in the network.
When HTTP traffic grew as the web exploded, these small flows were said to kill the Internet!
HTTP/1.1 supports persistent connections: keep the TCP connection open during the complete session and send all requests on the same TCP connection.
Now these longer TCP connections can perform congestion control properly.

TFTP - Trivial File Transfer Protocol

(Text-based, UDP port 69, RFC 1350)
Very simple protocol to transfer files.
Character coding: netascii (like telnet) or binary.
Stop-and-go protocol: send a datagram, wait for the ack.
Small implementations: typically in boot PROMs for small devices and diskless clients.
Five message types:
  RRQ - Read ReQuest
  WRQ - Write ReQuest
  DATA
  ACK
  ERROR
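The TFTP message types and the send-then-wait-for-ack pattern are small enough to sketch in full. The opcode numbers and the 512-byte block size below come from RFC 1350; the function names are made up for this example, and timeouts, retransmission and the UDP sockets themselves are left out.

```python
import struct
from typing import List, Tuple

OP_RRQ, OP_WRQ, OP_DATA, OP_ACK, OP_ERROR = 1, 2, 3, 4, 5
BLOCK_SIZE = 512   # a DATA packet with fewer bytes ends the transfer


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """RRQ: opcode, file name, zero byte, mode, zero byte."""
    return (struct.pack("!H", OP_RRQ) + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def build_ack(block: int) -> bytes:
    return struct.pack("!HH", OP_ACK, block)


def parse_data(packet: bytes) -> Tuple[int, bytes]:
    opcode, block = struct.unpack("!HH", packet[:4])
    if opcode != OP_DATA:
        raise ValueError("not a DATA packet")
    return block, packet[4:]


def receive_file(data_packets: List[bytes]) -> Tuple[bytes, List[bytes]]:
    """Consume DATA packets in order; return the file and the ACKs to send."""
    received = bytearray()
    acks: List[bytes] = []
    expected = 1
    for packet in data_packets:
        block, payload = parse_data(packet)
        if block != expected:
            continue                    # unexpected block: ignored in this sketch
        received += payload
        acks.append(build_ack(block))   # one ACK per block: stop and wait
        expected += 1
        if len(payload) < BLOCK_SIZE:
            break                       # short block marks the end of the file
    return bytes(received), acks
```

In a real client each ACK would be sent before the next DATA packet arrives, since the sender does not transmit block n+1 until block n has been acknowledged.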


FTP - File Transfer Protocol

(Text-based, TCP ports 20 and 21, RFC 959)
FTP is a more elaborate file transfer protocol.
FTP conducts its sessions in clear text.
FTP uses two TCP connections:
  The control connection: exchanges commands and their replies. The TCP session is initiated by the client to the server on port 21.
  The data connection: transfers data in a specified mode and type. The data transferred may be a part of a file, an entire file, or a number of files.

FTP modes
FTP can run in two modes: active mode and passive mode. This refers to whether or not the FTP server starts the data connection.
  active: The server starts the TCP session for the data connection, connecting to a port and IP address specified by the client. (May not work if the client is behind NAT.)
  passive: The server does not start a TCP session. Instead, the client creates a TCP session to the server, to a port and IP address specified by the server.
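The port and IP address in both modes are exchanged as text on the control connection. In RFC 959 (not shown on the slide) the client announces its address with a PORT command in active mode, and the server answers a PASV command with a 227 reply in passive mode. Both carry the address as six decimal numbers h1,h2,h3,h4,p1,p2, where the port is p1*256 + p2. A sketch with hypothetical function names:

```python
import re
from typing import Tuple


def format_port_command(ip: str, port: int) -> str:
    """Active mode: tell the server where to connect (client IP and port)."""
    h = ip.split(".")
    return "PORT {},{},{},{},{},{}".format(*h, port >> 8, port & 0xFF)


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Passive mode: extract the server IP and port from a 227 reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None or not reply.startswith("227"):
        raise ValueError("not a passive-mode reply: " + reply)
    numbers = [int(n) for n in match.groups()]
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return ip, port
```

The address travels inside the application payload, which is why a NAT box that rewrites only the IP header breaks active mode.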

Some FTP commands

Examples of FTP control commands (sent on the control channel):
  CWD <arg>              Change working directory.
  RMD <arg>              Remove directory.
  PWD                    Print working directory.
  TYPE [I|A|E|L <arg>]   Set the data transfer type.
  RETR <arg>             Download a file.
  STOR <arg>             Upload a file.
  LIST                   Download the current working directory's content list.


Some FTP status codes

As in HTTP, FTP has a variety of status codes:
  1xx Positive Preliminary reply
      The requested action is being initiated; expect another reply before proceeding with a new command.
  2xx Positive Completion reply
      The requested action has been successfully completed. A new request may be initiated.
  3xx Positive Intermediate reply
      The command has been accepted, but the requested action is waiting for further information before being completed.
  4xx Transient Negative Completion reply
      The command was not accepted and the requested action did not take place, but the error condition is temporary and the action may be requested again.
  5xx Permanent Negative Completion reply
      The command was not accepted and the requested action did not take place.

SMTP - Simple Mail Transfer Protocol

(Text-based, TCP port 25, RFC 2821)
SMTP is the protocol used to transfer email from hosts to mail servers and between mail servers.
Terminology:
  User Agent (UA): end-hosts.
  Mail Transfer Agent (MTA): mail servers.

SMTP (cont)
Electronic mail is different from the previous protocols in its delayed delivery in several steps:
  Spooling from the sending host to the first MTA.
  Relaying by intermediate MTAs.
  Downloading of email by the receiving host using other protocols: POPv3 (Post Office Protocol) or IMAPv4 (Internet Message Access Protocol).

Addressing: <mailbox>@<domain name>
This results in a DNS MX request for <domain name>, giving the name of the MTA to transfer the message to.


SMTP syntax
Like HTTP and FTP, SMTP has special commands and status codes.
  HELO <hostname>: Identifies the sending host to the server.
  MAIL FROM <email address>: Sender email address.
  RCPT TO <email address>: Recipient email address.
  DATA: Tells the email server that data follows.
  QUIT: Immediately close the connection.
The status codes are similar to those of HTTP and FTP.

MIME - Multipurpose Internet Mail Extensions

Classical email messages must be written in US-ASCII (7-bit). What does this imply?
MIME aims at redefining the format of messages to allow for:
  textual message bodies in character sets other than US-ASCII,
  an extensible set of different formats for non-textual message bodies,
  multi-part message bodies, and
  textual header information in character sets other than US-ASCII.

So how does it work?

Related header fields:
  Content-Type: what kind of data the content carries. Some examples: text/plain, text/html, audio, video, application/pdf, extension-token, and multipart.
  Content-Transfer-Encoding: how the data is encoded. Some examples: 7bit, 8bit, binary, quoted-printable, base64, ...
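Python's standard `email` package can be used to see these header fields in action. The sketch below builds a message with a non-ASCII subject, a text body, and a binary attachment; the addresses, the file name and the placeholder attachment bytes are invented for the example.

```python
from email.message import EmailMessage


def build_message() -> EmailMessage:
    """Build a two-part MIME message: a text body plus a binary attachment."""
    msg = EmailMessage()
    msg["From"] = "phil@example.org"          # hypothetical addresses
    msg["To"] = "eva@example.org"
    msg["Subject"] = "Påminnelse"             # header text outside US-ASCII
    msg.set_content("Kom ihåg att gå till affären!")   # becomes a text/plain part
    msg.add_attachment(
        b"%PDF-1.4 placeholder bytes",        # stand-in for real file content
        maintype="application",
        subtype="pdf",
        filename="note.pdf",
    )
    return msg
```

Adding the attachment turns the message into a multipart/mixed container whose parts each carry their own Content-Type and Content-Transfer-Encoding; printing `build_message().as_string()` shows the resulting headers and the encoded attachment.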


SNMP - Simple Network Management Protocol

(UDP port 161, ASN.1)
It is complex to build internetworks, and we need to manage them:
  Monitoring
  Debugging
  Control of routers and other network devices
SNMP - Internet management:
  No special control messages: use TCP/IP itself.
  Management is on the TCP/IP application level.
  The same protocol is used for all managed devices.
  If IP does not work correctly, ...

Real-time multimedia
Time-sensitive, interactive applications (e.g., telephony): use RTP (Real-time Transport Protocol).
Limited time-sensitivity: streaming protocols. Use RTSP (Real Time Streaming Protocol).
Non-time-sensitive: transfer the data using file transfer.

Signaling
So, RTP can be used to transfer time-sensitive data streams. But what about signaling, i.e., how to set up sessions?
  SIP - Session Initiation Protocol
  H.323

SIP

(TCP or UDP port 5060, ABNF)
Terminology is similar to SMTP, but SIP is a synchronous protocol (no delays).
SIP uses URIs (Uniform Resource Identifiers) as addresses:
  <sip:6534@kth.se>
  <sip:bob@biloxi.com>
SIP uses transactions, usually three-way (like TCP connections). Example:
  INVITE
  200 OK
  ACK

Example SIP

From RFC 3261:

      softphone        proxy            proxy         SIP Phone
         |                |                |                |
         |    INVITE F1   |                |                |
         |--------------->|    INVITE F2   |                |
         |  100 Trying F3 |--------------->|    INVITE F4   |
         |<---------------|  100 Trying F5 |--------------->|
         |                |<-------------- | 180 Ringing F6 |
         |                | 180 Ringing F7 |<---------------|
         | 180 Ringing F8 |<---------------|     200 OK F9  |
         |<---------------|    200 OK F10  |<---------------|
         |    200 OK F11  |<---------------|                |
         |<---------------|                |                |
         |                       ACK F12                    |
         |------------------------------------------------->|
         |                   Media Session                  |
         |<================================================>|
         |                       BYE F13                    |
         |<-------------------------------------------------|
         |                     200 OK F14                   |
         |------------------------------------------------->|

SIP message example

INVITE sip:000730631661@kth.se SIP/2.0
Via: SIP/2.0/UDP 192.36.125.167:5060;branch=z9hG4bK0e4415ea
From: "6534" <sip:6534@kth.se>;tag=000e38a3b7e8001d597d1d53-1bfa7620
To: <sip:000730631661@kth.se>
Call-ID: 000e38a3-b7e8001e-34c94c48-72c83866@192.36.125.167
Date: Mon, 03 Jan 2005 14:16:06 GMT
CSeq: 101 INVITE
User-Agent: CSCO/6
Contact: <sip:6534@192.36.125.167:5060>
Expires: 180
Content-Type: application/sdp
Content-Length: 251
Accept: application/sdp


IM - Instant Messaging
On-line messaging and presence information using a central server and many connected clients.
Some systems: AOL IM/ICQ, MSN Messenger, Yahoo Messenger
An IM system typically has the following features:
  Buddy list
  Chat, images, sounds, file-sharing
  Real-time talk and video
Most protocols are proprietary. But SIP has messaging extensions (SIMPLE).
A special feature is to serve many small messages in a short time, and to manage presence information.

Peer-to-peer file-sharing applications

Example of content distribution (file-sharing) using peer-to-peer techniques.
Build overlays: virtual networks on top of the physical network. Overlay links are TCP/UDP connections.
Usually, the actual data transfer is direct between hosts (peer-to-peer), often using HTTP.
Some have a central registry, i.e., an index of where files are (Napster).
Others (e.g., KaZaa) have a distributed registry: some nodes with good network connections, no NAT, and large resources turn into supernodes. All clients connect to a supernode.

Peer-to-peer file-sharing applications (cont)

Some are completely decentralized (GnuTella); some encrypt data (FreeNet).
BitTorrent, for example, works closely with HTTP: it splits up an HTTP transfer in slices, distributing the download from one originator to many clients working in unison.
Many rely on distributed hash lookup functions to make fast queries and lookups of data.
Some of the routing problems are similar to real (physical) routing, but on a higher level.
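The slide does not say which hash lookup scheme a given system uses. As one common illustration, the sketch below uses consistent hashing: node names and keys are hashed onto the same ring, and a key belongs to the first node at or after its position. The class name `HashRing`, the node names and the choice of SHA-1 are assumptions made for this example.

```python
import bisect
import hashlib
from typing import List


def ring_position(name: str) -> int:
    """Map a node name or a key to a point on the hash ring."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class HashRing:
    def __init__(self, nodes: List[str]) -> None:
        self._points = sorted((ring_position(n), n) for n in nodes)
        self._positions = [p for p, _ in self._points]

    def lookup(self, key: str) -> str:
        """Return the node responsible for key: its successor on the ring."""
        index = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[index % len(self._points)][1]   # wrap around the ring
```

Every peer that knows the node list computes the same owner for a key without asking a central registry, and when a node leaves, only the keys that node owned move to another node.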


Detour: NAT traversal

Nowadays, most hosts are behind NAT (Network Address Translation) boxes.
NATs translate global IP addresses to local ones, and extend the address space using TCP/UDP ports.
One peer behind NAT: it is possible to initiate a connection from behind the NAT.
Both peers behind NATs: difficult to communicate directly.
Solution: for UDP, exploit some regularities of NATs (reuse of the same ports, etc.). Or use a non-NAT peer as a protocol bouncer.

Skype

(Encrypted, TCP/UDP)
Skype is a VoIP tool using peer-to-peer techniques for name lookup.
Skype is a completely closed system: no open interfaces, and it is not even known which RFCs are implemented.
No interoperation is possible. You could say this violates the Internet spirit.
Uses high compression: iLBC coding (10x compression of audio data).
Name lookup uses the same infrastructure as KaZaa: nodes and supernodes.
NAT traversal techniques using UDP, TCP, or bounced connections via supernodes.
End-to-end RSA encryption.

Distributed games

Some of the best-known distributed games are interactive and real-time: Doom, Quake, Counter-Strike, Half-life, etc.
Some issues are:
  Low latency: low pingers win fights. Usually small UDP packets.
  Textures and geometric information are preloaded: only deltas are distributed.
  Movement of 3D graphics may use dead reckoning: no need to send updates on all geometric movements; use motion equations instead.
  All communication goes via a central server, which synchronizes and resolves events (who wins a fight).
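Dead reckoning, as mentioned for distributed games, can be sketched in a few lines: every host extrapolates an object's position from its last known position and velocity, and the owner only sends a new update when the extrapolation has drifted too far from the truth. The constant-velocity motion model, the 2D coordinates and all names below are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple

Vec = Tuple[float, float]


@dataclass
class State:
    position: Vec      # position at the time of the last update
    velocity: Vec      # units per second
    timestamp: float   # seconds


def predict(state: State, now: float) -> Vec:
    """Extrapolate with the motion equation p = p0 + v * dt."""
    dt = now - state.timestamp
    return (state.position[0] + state.velocity[0] * dt,
            state.position[1] + state.velocity[1] * dt)


def needs_update(last_sent: State, true_position: Vec, now: float,
                 threshold: float) -> bool:
    """True when remote hosts' prediction error would exceed the threshold."""
    px, py = predict(last_sent, now)
    error = ((true_position[0] - px) ** 2 + (true_position[1] - py) ** 2) ** 0.5
    return error > threshold
```

As long as a player keeps moving in a straight line, no packets are needed at all; an update is sent only when the player turns or stops.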
