
HTTP

Hypertext Transfer Protocol

26-Dec-19
HTTP messages
 HTTP is the language that web clients and web servers
use to talk to each other
 HTTP is largely “under the hood,” but a basic understanding
can be helpful
 Each message, whether a request or a response, has
three parts:
1. The request line (in a request) or the status line (in a response)
2. A header section
3. The body of the message
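The three-part layout can be made concrete with a small sketch. The class and method names here (HttpMessageParts, split) are hypothetical and not part of any library; the sketch assumes the message is already held in a string and that lines end with CRLF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits raw HTTP message text into its three parts. */
class HttpMessageParts {
    final String startLine;      // request line or status line
    final List<String> headers;  // "Name: Value" lines
    final String body;           // everything after the blank line (may be empty)

    HttpMessageParts(String startLine, List<String> headers, String body) {
        this.startLine = startLine;
        this.headers = headers;
        this.body = body;
    }

    static HttpMessageParts split(String raw) {
        // A blank line (CRLF CRLF) separates the header section from the body.
        int blank = raw.indexOf("\r\n\r\n");
        String head = blank < 0 ? raw : raw.substring(0, blank);
        String body = blank < 0 ? "" : raw.substring(blank + 4);

        List<String> lines = new ArrayList<>(Arrays.asList(head.split("\r\n")));
        String startLine = lines.remove(0);  // first line of the message
        return new HttpMessageParts(startLine, lines, body);
    }

    public static void main(String[] args) {
        HttpMessageParts m = split(
            "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello");
        System.out.println(m.startLine);
        System.out.println(m.headers);
        System.out.println(m.body);
    }
}
```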

What the client does, part I
 The client sends a message to the server at a particular
port (80 is the default)
 The first part of the message is the request line,
containing:
 A method (HTTP command) such as GET or POST
 A document address, and
 An HTTP version number
 Example:
 GET /index.html HTTP/1.0
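As a rough illustration, the request line can be broken into its three fields by splitting on spaces. RequestLine is a hypothetical class written for this example; a real server would validate the input more carefully.

```java
/** Hypothetical helper: breaks a request line into method, address, and version. */
class RequestLine {
    final String method;
    final String address;
    final String version;

    RequestLine(String method, String address, String version) {
        this.method = method;
        this.address = address;
        this.version = version;
    }

    static RequestLine parse(String line) {
        // The three fields are separated by single spaces.
        String[] parts = line.trim().split(" ");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not a request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine r = parse("GET /index.html HTTP/1.0");
        System.out.println(r.method + " | " + r.address + " | " + r.version);
    }
}
```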

Other methods
 Other methods besides GET and POST are:
 HEAD: Like GET, but asks that only the headers be returned
 PUT: Request to store the entity-body at the URI
 DELETE: Request removal of data at the URI
 LINK: Request header information be associated with a
document on the server (dropped from later HTTP/1.1 specifications)
 UNLINK: Request to undo a LINK request (likewise dropped)
 OPTIONS: Request information about communications
options on the server
 TRACE: Request that the entity-body be returned as received
(used for debugging)
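In Java, the method is chosen with HttpURLConnection.setRequestMethod. The sketch below only checks which method names the class accepts; it creates a connection object but sends nothing. MethodDemo and the example.com address are placeholders chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URI;

class MethodDemo {
    /** Returns true if HttpURLConnection accepts the given method name. */
    static boolean accepted(String method) {
        try {
            // openConnection() only creates the object; no request is sent yet.
            HttpURLConnection c = (HttpURLConnection)
                    URI.create("http://example.com/").toURL().openConnection();
            c.setRequestMethod(method);
            return true;
        } catch (ProtocolException e) {
            return false;  // this class does not support the method
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] methods = {"HEAD", "PUT", "DELETE", "OPTIONS", "LINK", "UNLINK"};
        for (String m : methods) {
            System.out.println(m + " accepted: " + accepted(m));
        }
    }
}
```

On a standard JDK, HttpURLConnection rejects LINK and UNLINK with a ProtocolException.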

What the client does, part II
 The second part of a request is optional header
information, such as:
 What the client software is
 What formats it can accept
 All information is in the form Name: Value
 Example:
User-Agent: Mozilla/2.02Gold (WinNT; I)
Accept: image/gif, image/jpeg, */*
 A blank line ends the header
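A minimal sketch of how the request line and the Name: Value headers fit together as text, ending with the blank line. HeaderBlock is a hypothetical helper; the header values are the ones from the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderBlock {
    /** Joins a request line and headers into the text sent before any body. */
    static String build(String requestLine, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder(requestLine).append("\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            // Every header has the form Name: Value
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        return sb.append("\r\n").toString();  // the blank line ends the header
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("User-Agent", "Mozilla/2.02Gold (WinNT; I)");
        headers.put("Accept", "image/gif, image/jpeg, */*");
        System.out.print(build("GET /index.html HTTP/1.0", headers));
    }
}
```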

Client request headers
 Accept: type/subtype, type/subtype, ...
 Specifies media types that the client prefers to accept
 Accept-Language: en, fr, de
 Preferred languages (for example: English, French, German)
 User-Agent: string
 The browser or other client program sending the request
 From: dave@acm.org
 Email address of user of client program
 Cookie: name=value
 Information about a cookie for that URL
 Multiple cookies are separated by semicolons (name1=value1; name2=value2)
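From Java, request headers like these can be set with URLConnection.setRequestProperty before the connection is opened. In this sketch ClientHeaders, the "HeaderDemo/0.1" client name, and the example.com address are all made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

class ClientHeaders {
    /** Creates a connection object with some request headers set (nothing is sent yet). */
    static URLConnection prepare(String address) {
        try {
            URLConnection c = URI.create(address).toURL().openConnection();
            c.setRequestProperty("Accept", "text/html, image/gif, */*");
            c.setRequestProperty("Accept-Language", "en, fr, de");
            c.setRequestProperty("User-Agent", "HeaderDemo/0.1");  // hypothetical client name
            c.setRequestProperty("From", "dave@acm.org");
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection c = prepare("http://example.com/");
        System.out.println("Accept: " + c.getRequestProperty("Accept"));
        System.out.println("Accept-Language: " + c.getRequestProperty("Accept-Language"));
    }
}
```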

What the client does, part III
 The third part of a request (after the blank line) is the
entity-body, which contains optional data
 The entity-body part is used mostly by POST requests
 The entity-body part is normally empty for a GET request
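To show where the entity-body sits, here is a sketch of the text of a POST request carrying form data. PostRequestText, the /search path, and the form fields are assumptions made for this example; the Content-Length header gives the body size in bytes.

```java
import java.nio.charset.StandardCharsets;

class PostRequestText {
    /** Builds the text of a POST request whose entity-body is the given form data. */
    static String build(String path, String form) {
        int length = form.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.0\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"   // blank line: end of headers
             + form;    // entity-body
    }

    public static void main(String[] args) {
        System.out.println(build("/search", "q=http&lang=en"));
    }
}
```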

What the server does, part I
 The server response is also in three parts
 The first part is the status line, which tells:
 The HTTP version
 A status code
 A short description of what the status code means
 Example: HTTP/1.1 404 Not Found
 Status codes are in groups:
100-199 Informational
200-299 The request was successful
300-399 The request was redirected
400-499 The request failed
500-599 A server error occurred
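The grouping by hundreds makes status codes easy to classify in code. A minimal sketch follows; StatusLine is a hypothetical class and the group labels are shortened versions of the ones listed above.

```java
class StatusLine {
    /** Extracts the numeric status code from a status line. */
    static int code(String statusLine) {
        // Format: HTTP-version, space, status code, space, description
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    /** Maps a status code to its group, using the hundreds digit. */
    static String group(int code) {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Successful";
            case 3: return "Redirected";
            case 4: return "Request failed";
            case 5: return "Server error";
            default: return "Unknown";
        }
    }

    public static void main(String[] args) {
        int c = code("HTTP/1.1 404 Not Found");
        System.out.println(c + " -> " + group(c));
    }
}
```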

Common status codes
 200 OK
 Everything worked, here’s the data
 301 Moved Permanently
 URI was moved, but here’s the new address for your records
 302 Moved Temporarily
 URI temporarily out of service, keep the old one but use this one for now
 400 Bad Request
 There is a syntax error in your request
 403 Forbidden
 You can’t do this, and we won’t tell you why
 404 Not Found
 No such document
 408 Request Time-out, 504 Gateway Time-out
 Request took too long to fulfill for some reason

What the server does, part II
 The second part of the response is header information,
ended by a blank line
 Example:
 Content-Length: 2532
Connection: Close
Server: GWS/2.0
Date: Sun, 01 Dec 2002 21:24:50 GMT
Content-Type: text/html
Cache-control: private
Set-Cookie: PREF=ID=05302a93093ec661:TM=1038777890:LM=1038777890:S=yNWNjraftUz299RH; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
(The Set-Cookie header above is all on one line.)
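Header lines like these are easy to turn into a lookup table. The sketch below assumes the lines have already been read into an array; HeaderMap is a hypothetical helper. It keeps only the last value when a header repeats, which is a simplification.

```java
import java.util.Map;
import java.util.TreeMap;

class HeaderMap {
    /** Parses "Name: Value" lines into a map with case-insensitive names. */
    static Map<String, String> parse(String[] lines) {
        // Header names are case-insensitive, so compare keys without regard to case.
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : lines) {
            int colon = line.indexOf(':');  // the first colon ends the name
            if (colon > 0) {
                map.put(line.substring(0, colon).trim(),
                        line.substring(colon + 1).trim());
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> h = parse(new String[] {
            "Content-Length: 2532",
            "Date: Sun, 01 Dec 2002 21:24:50 GMT",
            "Content-Type: text/html"
        });
        System.out.println(h.get("content-type"));
        System.out.println(h.get("Date"));
    }
}
```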

Viewing the response
 There is a header viewer at http://www.delorie.com/web/headers.html
(with nasty jittery advertisements)
 Example 2.3 (GetResponses) in the Gittleman book does the same thing
 Here’s an example (from GetResponses):
 % java GetResponses http://www.cis.upenn.edu/~matuszek/cit597-2003/index.html
Status line:
HTTP/1.1 200 OK
Response headers:
Date: Wed, 10 Sep 2003 00:26:53 GMT
Server: Apache/1.3.26 (Unix) PHP/4.2.2 mod_perl/1.27 mod_ssl/2.8.10 OpenSSL/0.9.6e
Last-Modified: Tue, 09 Sep 2003 19:24:50 GMT
ETag: "1c1ad5-1654-3f5e2902"
Accept-Ranges: bytes
Content-Length: 5716
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
The GetResponses program, I
 Here’s just the skeleton of the program that provided the output on the last
slide:
import java.net.*;
import java.io.*;

public class GetResponses {
    public static void main(String[] args) {
        try {
            // ...interesting code goes here...
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The GetResponses program, II
 Here’s the interesting part of the code:

URL url = new URL(args[0]);
URLConnection c = url.openConnection();

// Header field 0 is the status line
System.out.println("Status line: ");
System.out.println('\t' + c.getHeaderField(0));

// Fields 1, 2, ... are the response headers; null marks the end
System.out.println("Response headers:");
String value = "";
int n = 1;
while (true) {
    value = c.getHeaderField(n);
    if (value == null) break;
    System.out.println('\t' + c.getHeaderFieldKey(n++) + ": " + value);
}
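An alternative to counting header fields by index is URLConnection.getHeaderFields(), which returns all of them as a map from name to list of values. The sketch below formats such a map; HeaderPrinter is a hypothetical helper, and it assumes (as the standard HTTP implementation does) that the status line is stored under a null key.

```java
import java.util.List;
import java.util.Map;

class HeaderPrinter {
    /** Formats a header map of the shape returned by URLConnection.getHeaderFields(). */
    static String format(Map<String, List<String>> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String>> e : fields.entrySet()) {
            for (String value : e.getValue()) {
                // Assumption: the status line has no name, so its key is null.
                String line = (e.getKey() == null) ? value : e.getKey() + ": " + value;
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

With a live connection c, the call would be HeaderPrinter.format(c.getHeaderFields()).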

Server response headers
 Server: NCSA/1.3
 Name and version of the server
 Content-Type: type/subtype
 Should be of a type and subtype specified by the client’s
Accept header
 Set-Cookie: name=value; options
 Asks the client to store a cookie with the given name and
value
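The standard library can parse a Set-Cookie header with java.net.HttpCookie.parse. A short sketch follows; SetCookieDemo and the cookie shown (id=abc123) are invented for this example.

```java
import java.net.HttpCookie;
import java.util.List;

class SetCookieDemo {
    /** Parses a Set-Cookie header and returns the first cookie in it. */
    static HttpCookie first(String header) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        return cookies.get(0);
    }

    public static void main(String[] args) {
        HttpCookie c = first("Set-Cookie: id=abc123; Path=/");
        System.out.println("name:  " + c.getName());
        System.out.println("value: " + c.getValue());
        System.out.println("path:  " + c.getPath());
    }
}
```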

What the server does, part III
 The third part of a server response is the entity body
 This is often an HTML page
 But it can also be a jpeg, a gif, plain text, etc.--anything the
browser (or other client) is prepared to accept

The <meta http-equiv> tag
 The <meta http-equiv=string content=string> tag may occur in the
<head> of an HTML document
 http-equiv and content typically have the same kinds of values as
in the HTTP header
 This tag asks the client to pretend that the information actually
occurred in the header
 The information is not really in the header
 This tag is available because you have little direct control over what is in
the header (unless you write your own server)
 As usual, not all browsers handle this information the same way
 Example:
<meta http-equiv="Set-Cookie"
content="value=n; expires=date; path=url">

Summary
 HTTP is a fairly straightforward protocol with a lot of possible
kinds of predefined header information
 More kinds can be added, so long as client and server agree
 A request from the client consists of three parts:
1. A request line
2. A block of header information, ending with a blank line
3. The (optional) entity body, containing data
 A response from the server has the same three parts, with a status line in place of the request line
 HTTP headers are “under the hood” information, not normally
displayed to the user
 As with most of the things covered in CIT597,
 We have covered only the fundamentals
 Much more detail can be found on the Web

The End

