
HTTP is the protocol used to transfer data over the web. It is part of the Internet
protocol suite and defines commands and services used for transmitting webpage
data. HTTP uses a server-client model. ... If the URL is valid and the connection is
granted, the server will send your browser the webpage and related files.

Uniform Resource Identifier (URI): A URI identifies a resource of any kind. It is
just a name that follows a defined syntax. URLs and URNs are subsets of URIs.
URL (Uniform Resource Locator): A URL is a special kind of URI that specifies
the location of a resource available on the Internet. It does not have to be an HTTP URL
(http://); a URL can also be (ftp://) or (smb://).
URN (Uniform Resource Name): A URN is intended to label a resource based on its
actual identity, rather than where it can be found. Both URNs (names) and URLs
(locators) are URIs, and a particular URI may be both a name and a locator at the same
time.
Concretely, the key difference between the two is: a URL always refers to something that
is network-addressable, whereas a URN is a logical identifier which might have no
endpoint.
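
The distinction above can be illustrated with Python's standard urllib.parse module. This is only a rough sketch: the describe function is a hypothetical helper, and classifying by scheme and network location is a simplification (as noted above, one URI can be both a name and a locator).

```python
from urllib.parse import urlsplit

def describe(uri: str) -> str:
    """Roughly classify a URI as a URN or a URL (a simplification)."""
    parts = urlsplit(uri)
    if parts.scheme == "urn":
        return "URN"          # a logical name, no network location
    if parts.scheme and parts.netloc:
        return "URL"          # scheme plus host: something addressable
    return "URI"              # neither case above, but still a URI

print(describe("http://www.example.org/index.html"))
print(describe("ftp://ftp.example.org/pub/file.txt"))
print(describe("urn:isbn:0451450523"))
```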

Difference between HTTP 1.0 – 1.1


• Multiple Host Name Support: In HTTP/1.0, there was no way to specify the host name
of the server to which the client needed to connect. As a result, the Web server at a
particular IP address could only support one domain name. HTTP/1.1 allows one Web
server to handle requests for dozens or even hundreds of different virtual hosts.
The Host request header specifies the domain name of the server (for virtual
hosting), and (optionally) the TCP port number on which the server is listening. If
no port is given, the default port for the service requested (e.g., "80" for an HTTP
URL) is implied. A Host header field must be sent in all HTTP/1.1 request
messages. A 400 (Bad Request) status code will be sent in response to any HTTP/1.1
request message that lacks a Host header field or contains more than one.

Example:

GET / HTTP/1.1
Host: www.blahblahblahblah.com
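
The Host rule described above could be checked roughly as follows. This is a minimal sketch, not a real server: host_check is a hypothetical helper that only inspects the request line and counts Host fields.

```python
def host_check(raw_request: str) -> int:
    """Return 400 if an HTTP/1.1 request has no Host field or several, else 200."""
    head = raw_request.split("\r\n\r\n", 1)[0]        # drop any message body
    lines = head.split("\r\n")
    version = lines[0].rsplit(" ", 1)[-1]             # last token of the request line
    hosts = [l for l in lines[1:] if l.lower().startswith("host:")]
    if version == "HTTP/1.1" and len(hosts) != 1:
        return 400
    return 200

print(host_check("GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"))
print(host_check("GET / HTTP/1.1\r\n\r\n"))
```

Header names are compared case-insensitively, since HTTP field names are not case-sensitive.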

• Persistent Connections: Short-lived connections have two major hitches: the
time taken to establish a new connection is significant, and performance of the
underlying TCP connection gets better only when this connection has been in use
for some time (warm connection). To ease these problems, the concept of
a persistent connection was designed, even prior to HTTP/1.1. Alternatively,
this may be called a keep-alive connection.
A persistent connection is one which remains open for a period of time and can
be reused for several requests, saving the need for a new TCP handshake and
utilizing TCP's performance-enhancing capabilities. This connection will not stay
open forever: idle connections are closed after some time (a server may use
the Keep-Alive header to specify a minimum time the connection should be kept
open).
HTTP/1.0 connections are not persistent by default. Setting the Connection header to
keep-alive makes them persistent.
In HTTP/1.1, persistence is the default, and the header is no longer needed.
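
The persistence defaults can be summarised in a small decision function. This is a sketch under the assumption that HTTP/1.0 peers opt in with the conventional keep-alive token and that HTTP/1.1 peers opt out with close; is_persistent is a hypothetical name.

```python
from typing import Optional

def is_persistent(version: str, connection: Optional[str] = None) -> bool:
    """Decide whether the connection stays open after the current exchange."""
    tokens = {t.strip().lower() for t in (connection or "").split(",")}
    if version == "HTTP/1.1":
        return "close" not in tokens      # persistent unless told otherwise
    return "keep-alive" in tokens         # HTTP/1.0: persistent only on request

print(is_persistent("HTTP/1.1"))
print(is_persistent("HTTP/1.0"))
print(is_persistent("HTTP/1.0", "keep-alive"))
```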

• HTTP pipelining: By default, HTTP requests are issued sequentially. The next
request is only issued once the response to the current request has been received.
As they are affected by network latencies and bandwidth limitations, this can
result in significant delay before the next request is seen by the server. Pipelining
is the process of sending successive requests over the same persistent connection
without waiting for the answer. This avoids the latency of waiting for each response.
Theoretically, performance could also be improved if two HTTP requests were to
be packed into the same TCP message. The typical MSS (Maximum Segment Size)
is big enough to contain several simple requests, although the size of
HTTP requests continues to grow.
Not all types of HTTP requests can be pipelined:
only idempotent methods, that is GET, HEAD, PUT and DELETE, can be replayed safely: should a
failure happen, the pipeline content can simply be repeated.
Today, every HTTP/1.1-compliant proxy and server should support pipelining,
though many have limitations in practice: a significant reason no modern browser
activates this feature by default. HTTP/1.0 had no pipelining; there, only connection
reuse could be requested, using the Connection header.
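
As a rough illustration of pipelining, the sketch below packs several requests into one byte string that could be written to a single persistent connection before any response is read. The function name build_pipeline and the host www.example.org are assumptions made for the example; no network I/O is performed.

```python
# Methods the notes list as safe to replay after a failure.
REPLAYABLE = {"GET", "HEAD", "PUT", "DELETE"}

def build_pipeline(requests, host):
    """Concatenate bodiless requests into one payload for a single send()."""
    for method, _ in requests:
        if method not in REPLAYABLE:
            raise ValueError(f"{method} should not be pipelined")
    return b"".join(
        f"{method} {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for method, path in requests
    )

payload = build_pipeline([("GET", "/a"), ("HEAD", "/b")], "www.example.org")
print(payload.decode("ascii"))
```

The server is expected to answer in the same order the requests were sent, so the client can match responses to requests by position.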
• Partial Resource Selection: In HTTP/1.1, a client can ask for only part of a resource
rather than the entire document, which reduces the load on the server and saves
transfer bandwidth (using the Range header).
• Better Caching and Proxying Support: HTTP 1.0 had support for caching via the
If-Modified-Since header.
HTTP 1.1 expands the caching support considerably by using something called an 'entity tag'.
If two resources are the same, they will have the same entity tag. HTTP 1.1 also
adds the If-Unmodified-Since, If-Match and If-None-Match conditional headers. There
are also further additions relating to caching, like the Cache-Control header. The
caching mechanism within HTTP 1.0 is simple in design. The client requests a
resource, and the resource is returned by the origin server along with the
following HTTP headers:-
Expires - The Expires header contains the date/time after which the response is
considered stale. Invalid dates, like the value 0, represent a date in the past and mean
that the resource is already expired. If there is a Cache-Control header with
the max-age or s-maxage directive in the response, the Expires header is ignored.
Last-Modified - When the resource was last modified. The client can then
instruct the origin server to send the resource only if it has a newer copy by
using the If-Modified-Since header. Otherwise the origin will send an HTTP 304 -
Not Modified response. This is also known as a conditional request. However,
the client can also request that the resource not be served from cache by using
the Pragma: no-cache header.

HTTP 1.1 extends the caching functionality from HTTP 1.0. In HTTP 1.1 a cached
entry is considered fresh until its expiry time is reached, at which point it is deemed stale.
A stale cache entry must be revalidated with the origin server before it is reused.
However, both client and origin are able to override this requirement.

Due to the If-Modified-Since header using absolute timestamps (with a one-second
resolution) there was a potential for caching issues due to inconsistencies with time
synchronization. To solve this issue a new cache validator string was introduced called
an ETag (entity tag). Entity tags are provided with the cached response within the ETag
header.

In addition to If-Modified-Since, HTTP 1.1 introduces the following conditional request
headers:

• If-None-Match: <entity tag> - Instructs the server to send the resource only if its current
entity tag does NOT match the one supplied. If it does match, a 304 (Not Modified)
is returned.
• If-Match: <entity tag> - Instructs the server to perform the request only if the resource's
current entity tag DOES match the one supplied.
• If-Unmodified-Since: <date> - Instructs the server to send the resource if the resource
has NOT been modified since the date provided.
• If-Range: (entity_tag | date) - Used along with the Range header. This follows the same
principle as above but works at a byte-range level rather than a resource level.
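
How a server might evaluate two of these headers for a GET can be sketched as below. This is a simplified illustration, not a complete implementation: conditional_get_status is a hypothetical helper, entity tags are compared as plain strings, and If-None-Match is given precedence over If-Modified-Since when both are present.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_get_status(headers, etag, last_modified):
    """Return 304 or 200 for a GET, given the resource's current validators."""
    if "If-None-Match" in headers:
        tags = [t.strip() for t in headers["If-None-Match"].split(",")]
        return 304 if etag in tags or "*" in tags else 200
    if "If-Modified-Since" in headers:
        since = parsedate_to_datetime(headers["If-Modified-Since"])
        return 304 if last_modified <= since else 200
    return 200  # unconditional request: send the full resource

modified = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
print(conditional_get_status({"If-None-Match": '"abc"'}, '"abc"', modified))
print(conditional_get_status({"If-None-Match": '"old"'}, '"abc"', modified))
```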

Two further headers that were added in HTTP 1.1 are Cache-Control and Vary.

• Cache-Control: The Cache-Control general-header field is used to specify directives
for caching mechanisms in both requests and responses. Caching directives are
unidirectional, meaning that a given directive in a request does not imply that the same
directive is to be given in the response.

• 100 Continue status

• OPTIONS method

HTTP Methods
Difference between GET and POST:-

With GET, data is sent in the query string; with POST, it is sent in the request body.

GET

GET is used to request data from a specified resource. Caching also comes into play, as
a proxy server or even the client itself might satisfy the request before it gets to
the server. It’s important to remember that the meaning of a GET request may change if
certain headers, such as If-Modified-Since or If-Match are used—these and other
similar headers tell the server to only send the resource if certain conditions are met. A
request of this sort is sometimes called a conditional GET. Similarly, the Range header
may be used by the client to request that the server send it only part of a resource; this
is usually used for large files. When this header is included, the request may be called a
partial GET.

POST

The HTTP POST method sends data to the server. The type of the body of the request is
indicated by the Content-Type header.

The difference between PUT and POST is that PUT is idempotent and POST is not: repeating a
PUT replaces the resource with the same data, while repeating a POST may add to it. For
example, if a POST uploads a file with values 1-10 and a second POST uploads values 11-20,
the result may be 1-20, whereas with PUT the second upload simply overwrites the data,
leaving 11-20.

PUT
The HTTP PUT request method creates a new resource or replaces a representation of the
target resource with the request payload.
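
The contrast between PUT and POST can be sketched with an in-memory store. The put and post helpers and the /file path are hypothetical; how a real server handles POST is up to the application, so appending is only one possible behaviour.

```python
def put(store: dict, uri: str, body: str) -> None:
    """Create or replace the resource at uri; repeating the call changes nothing."""
    store[uri] = body

def post(items: list, body: str) -> int:
    """Append a new entry; every call adds another one."""
    items.append(body)
    return len(items) - 1   # index of the entry just created

store, items = {}, []
put(store, "/file", "11-20")
put(store, "/file", "11-20")   # same request again: still one resource
post(items, "1-10")
post(items, "1-10")            # same request again: two entries now
print(store, items)
```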

OPTIONS
The HTTP OPTIONS method is used to describe the communication options for the target
resource. The client can specify a URL for the OPTIONS method, or an asterisk (*) to refer to
the entire server.

HEAD

This is identical to the GET method but tells the server to not send the actual body of the
message. Thus, the response will contain all of the headers that would have
accompanied a reply to the equivalent GET message, including entity headers describing
the entity that the server would have sent had the method been GET. This method is
often used by the client to check the existence, status or size of a file before
deciding whether or not it wants the server to send the whole thing.

HEAD requests are processed in the same way as GET requests, except that only the
headers are returned, not the actual resource. It is used to get the meta data (like
content length).

DELETE

Requests that the specified resource be deleted. This has the same issues as PUT and is
not often used for similar reasons. If this method is used to delete a resource that
has already been deleted, the server typically returns 404 (Not Found); if the deletion
succeeds, it returns 200 (OK).

Response Codes

100 Continue status:

HTTP/1.1 adds a new return code, 100 Continue. This prevents a client from
sending a large request when it is not even sure whether the server can process the
request or is authorized to process it. In this case the client sends only the
headers, and the server tells the client 100 Continue, go ahead with the body.

To have a server check the request's headers, a client must send Expect: 100-continue
as a header in its initial request and receive a 100 Continue status code
in response before sending the body.

200(OK) :- The 200 (OK) status code indicates that the request has succeeded. The
payload sent in a 200 response depends on the request method. For the methods
defined by this specification, the intended meaning of the payload can be summarized
as:
GET a representation of the target resource;
HEAD the same representation as GET, but without the representation data;
POST a representation of the status of, or results obtained from, the action;
PUT, DELETE a representation of the status of the action;
OPTIONS a representation of the communications options;
TRACE a representation of the request message as received by the end server

201(Created):- The HTTP 201 Created success status response code indicates that
the request has succeeded and has led to the creation of a resource. The new resource is
effectively created before this response is sent back and the new resource is returned in
the body of the message, its location being either the URL of the request, or the content
of the Location header.
The common use case of this status code is as the result of a POST request; a PUT that
creates a new resource returns it as well.

204 No Content :- The HTTP 204 No Content success status response code indicates
that the request has succeeded, but that the client doesn't need to go away from its
current page. A 204 response is cacheable by default. An ETag header is included in such
a response.
The common use case is to return 204 as a result of a PUT request, updating a resource,
without changing the current content of the page displayed to the user. If the resource is
created, 201 Created is returned instead. If the page should be changed to the newly
updated page, a 200 should be used instead.
206 Partial Content :- The HTTP 206 Partial Content success status response code
indicates that the request has succeeded and that the body contains the requested ranges
of data, as described in the Range header of the request.
If there is only one range, the Content-Type of the whole response is set to the type of the
document, and a Content-Range is provided.
If several ranges are sent back, the Content-Type is set to multipart/byteranges and each
fragment covers one range, with Content-Range and Content-Type describing it.
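
A single-range request could be served roughly as follows. This is a sketch that handles only one byte range per request; apply_range is a hypothetical helper, and the 416 answer for an unsatisfiable range is standard HTTP behaviour that these notes do not cover.

```python
def apply_range(data: bytes, range_header: str):
    """Serve one 'bytes=first-last' range; returns (status, Content-Range, body)."""
    unit, _, spec = range_header.partition("=")
    first, _, last = spec.partition("-")
    size = len(data)
    if unit.strip() != "bytes" or "," in spec:
        return 200, None, data                 # unsupported form: send everything
    if first == "":                            # suffix form: the last N bytes
        start, end = max(size - int(last), 0), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end or start >= size:
        return 416, f"bytes */{size}", b""     # range cannot be satisfied
    return 206, f"bytes {start}-{end}/{size}", data[start:end + 1]

print(apply_range(b"0123456789", "bytes=0-4"))
print(apply_range(b"0123456789", "bytes=-3"))
```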
301 (Moved Permanently):- The HyperText Transfer Protocol (HTTP) 301
Moved Permanently redirect status response code indicates that the resource
requested has been definitively moved to the URL given by the Location header. A
browser redirects to this page and search engines update their links to the resource.

Note: For historical reasons, a user agent MAY change the request method from POST to
GET for the subsequent request. If this behaviour is undesired, the 308 (Permanent
Redirect) status code can be used instead. A 301 response is cacheable by default.

302 (Found):- The HyperText Transfer Protocol (HTTP) 302 Found redirect status
response code indicates that the resource requested has been temporarily moved to the
URL given by the Location header. A browser redirects to this page but search engines
don't update their links to the resource.
Note: For historical reasons, a user agent MAY change the request method from POST to
GET for the subsequent request. If this behavior is undesired, the 307 (Temporary
Redirect) status code can be used instead.

303 (See Other):- The 303 (See Other) status code indicates that the server is redirecting
the user agent to a different resource, as indicated in the Location header field, which is
intended to provide an indirect response to the original request.

304 Not Modified:- This is used for caching purposes. It tells the client that the
response has not been modified, so the client can continue to use the same cached
version of the response.
307 (Temporary Redirect):- The HTTP 307 Temporary Redirect status
response code indicates that the resource requested has been temporarily moved to the
URL given by the Location header.
The method and the body of the original request are reused to perform the redirected
request. In the cases where you want the method used to be changed to GET, use 303 See
Other instead.

Note: This status code is similar to 302 (Found), except that it does not allow changing
the request method from POST to GET.

308 (Permanent Redirect):- status code indicates that the target resource has
been assigned a new permanent URI and any future references to this resource ought to
use one of the enclosed URIs.

The server SHOULD generate a Location header field in the response containing a
preferred URI reference for the new permanent URI. The user agent MAY use the
Location field value for automatic redirection. A 308 response is cacheable by default.

Note: This status code is similar to 301 (Moved Permanently) except that it does not
allow changing the request method from POST to GET.
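
The method-handling rules of the redirect codes can be collected in one function. This is a sketch: next_request_method is a hypothetical name, and for 301 and 302 it models the permitted POST-to-GET change as if the user agent always applied it, which is an assumption about client behaviour rather than a requirement.

```python
def next_request_method(status: int, method: str) -> str:
    """Method a user agent would use when following a redirect."""
    if status == 303:
        return "HEAD" if method == "HEAD" else "GET"   # 303 switches to GET
    if status in (301, 302) and method == "POST":
        return "GET"            # allowed for historical reasons (assumed here)
    if status in (301, 302, 307, 308):
        return method           # 307 and 308 never change the method
    raise ValueError(f"{status} is not a redirect handled by this sketch")

for code in (301, 302, 303, 307, 308):
    print(code, next_request_method(code, "POST"))
```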

400 (Bad Request):- status code indicates that the server cannot or will not process
the request due to something that is perceived to be a client error (e.g., malformed
request syntax, invalid request message framing, or deceptive request routing).

401 (Unauthorized):- status code indicates that the request has not been applied
because it lacks valid authentication credentials for the target resource. The server
generating a 401 response MUST send a WWW-Authenticate header field containing at
least one challenge applicable to the target resource.

If the request included authentication credentials, then the 401 response indicates that
authorization has been refused for those credentials. The user agent MAY repeat the
request with a new or replaced Authorization header field. If the 401
response contains the same challenge as the prior response, and the user agent has
already attempted authentication at least once, then the user agent SHOULD present the
enclosed representation to the user, since it usually contains relevant diagnostic
information.

403 (Forbidden):- status code indicates that the server understood the request but
refuses to authorize it. A server that wishes to make public why the request has been
forbidden can describe that reason in the response payload.
If authentication credentials were provided in the request, the server considers them
insufficient to grant access. The client SHOULD NOT automatically repeat the request
with the same credentials. The client MAY repeat the request with new or different
credentials. However, a request might be forbidden for reasons unrelated to the
credentials.
An origin server that wishes to "hide" the current existence of a forbidden target
resource MAY instead respond with a status code of 404 (Not Found).

There's a problem with 401 Unauthorized, the HTTP status code for authentication errors.
And that’s just it: it’s for authentication, not authorization. Receiving a 401 response is the
server telling you, “you aren’t authenticated–either not authenticated at all or authenticated
incorrectly–but please reauthenticate and try again.” To help you out, it will always include
a WWW-Authenticate header that describes how to authenticate.
This is a response generally returned by your web server, not your web application.

It’s also something very temporary; the server is asking you to try again.

So, for authorization I use the 403 Forbidden response. It’s permanent, it’s tied to my
application logic, and it’s a more concrete response than a 401.
Receiving a 403 response is the server telling you, “I’m sorry. I know who you are–I believe
who you say you are–but you just don’t have permission to access this resource. Maybe if
you ask the system administrator nicely, you’ll get permission. But please don’t bother me
again until your predicament changes.”

In summary, a 401 Unauthorized response should be used for missing or bad authentication,
and a 403 Forbidden response should be used afterwards, when the user is authenticated
but isn’t authorized to perform the requested operation on the given resource.
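
The 401 versus 403 decision can be written down as a two-step check. This is a minimal sketch: access_status is a hypothetical helper, and the Basic challenge with realm "example" is a placeholder chosen for illustration.

```python
def access_status(credentials_valid: bool, has_permission: bool):
    """Pick the status for a protected resource: authenticate first, then authorize."""
    if not credentials_valid:
        # Missing or bad authentication: challenge the client to try again.
        return 401, {"WWW-Authenticate": 'Basic realm="example"'}
    if not has_permission:
        # Identity is known, but this user may not perform the operation.
        return 403, {}
    return 200, {}

print(access_status(False, False))
print(access_status(True, False))
print(access_status(True, True))
```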

404 (Not Found):- The HTTP 404 Not Found client error response code indicates
that the server can't find the requested resource. Links which lead to a 404 page are
often called broken or dead links, and can be subject to link rot.
A 404 status code does not indicate whether the resource is temporarily or permanently
missing. But if a resource is permanently removed, a 410 (Gone) should be used instead
of a 404 status.

405 (Method Not Allowed):- status code indicates that the method received in the
request-line is known by the origin server but not supported by the target resource.
The origin server MUST generate an Allow header field in a 405 response containing a
list of the target resource's currently supported methods.

406 (Not Acceptable):- The HyperText Transfer Protocol (HTTP) 406 Not
Acceptable client error response code indicates that the server cannot produce a
response matching the list of acceptable values defined in the request's proactive content
negotiation headers, and that the server is unwilling to supply a default representation.

409 (Conflict):- The HTTP 409 Conflict response status code indicates a request
conflict with current state of the server.
Conflicts are most likely to occur in response to a PUT request. For example, you may get
a 409 response when uploading a file which is older than the one already on the server
resulting in a version control conflict.

410 (Gone):- The HyperText Transfer Protocol (HTTP) 410 Gone client error
response code indicates that access to the target resource is no longer available at the
origin server and that this condition is likely to be permanent.
If you don't know whether this condition is temporary or permanent, a 404 status code
should be used instead.

500 (Internal Server Error):- status code indicates that the server encountered
an unexpected condition that prevented it from fulfilling the request.

501 (Not Implemented):- status code indicates that the server does not support
the functionality required to fulfill the request. This is the appropriate response when
the server does not recognize the request method and is not capable of supporting it for
any resource. A 501 response is cacheable by default.

502 (Bad Gateway):- status code indicates that the server, while acting as a
gateway or proxy, received an invalid response from an inbound server it accessed
while attempting to fulfill the request.

503 (Service Unavailable):- status code indicates that the server is currently
unable to handle the request due to a temporary overload or scheduled maintenance,
which will likely be alleviated after some delay. The server MAY send a Retry-After
header field to suggest an appropriate amount of time for the client to wait before
retrying the request.

Note: The existence of the 503 status code does not imply that a server has to use it
when becoming overloaded. Some servers might simply refuse the connection.

504 (Gateway Timeout):- status code indicates that the server, while acting as a
gateway or proxy, did not receive a timely response from an upstream server it needed
to access in order to complete the request.

The 501 (Not Implemented) status code indicates that the server does not support the
functionality required to fulfill the request. ... The 405 (Method Not Allowed) status code
indicates that the method received in the request-line is known by the origin server but not
supported by the target resource.
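
The difference between 501 and 405 comes down to two membership checks. The sketch below assumes a hypothetical server that recognises the methods in KNOWN_METHODS; method_status and the per-resource allowed set are illustrative names.

```python
# Methods this hypothetical server recognises at all.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "TRACE"}

def method_status(method: str, allowed: set):
    """Return (status, headers) for a method against one resource."""
    if method not in KNOWN_METHODS:
        return 501, {}                                   # unknown to the server
    if method not in allowed:
        # Known method, but not supported here: list what is supported.
        return 405, {"Allow": ", ".join(sorted(allowed))}
    return 200, {}

print(method_status("BREW", {"GET", "HEAD"}))
print(method_status("DELETE", {"GET", "HEAD"}))
```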

HTTP Request Message Format

The client initiates an HTTP session by opening a TCP connection to the HTTP server
with which it wishes to communicate. It then sends request messages to the server, each
of which specifies a particular type of action that the user of the HTTP client would like
the server to take. Requests can be generated either by specific user action (such as
clicking a hyperlink in a Web browser) or indirectly as a result of a prior action (such as
a reference to an inline image in an HTML document leading to a request for that
image.)
HTTP requests use a message format that is based on the generic message format
described in the preceding topic, but specific to the needs of requests. The structure of
this format is as follows (see Figure 317):
<request-line>
<general-headers>
<request-headers>
<entity-headers>
<empty-line>
[<message-body>]
[<message-trailers>]

Request Line
The generic start line that begins all HTTP messages is called a request line in request
messages. It has a three-fold purpose: to indicate the command or action that the client
wants performed; to specify a resource upon which the action should be taken; and to
indicate to the server what version of HTTP the client is using. The formal syntax for the
request line is:
<METHOD> <request-uri> <HTTP-VERSION>
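
Splitting a request line into its three parts might look like this. The sketch assumes the three fields are separated by single spaces; parse_request_line is a hypothetical helper.

```python
def parse_request_line(line: str):
    """Split '<METHOD> <request-uri> <HTTP-VERSION>' into its three fields."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        raise ValueError(f"malformed request line: {line!r}")
    method, request_uri, version = parts
    return method, request_uri, version

print(parse_request_line("GET /pub/WWW/ HTTP/1.1\r\n"))
```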

Headers
HTTP headers are the name/value pairs carried in the header section of
request and response messages. The header name and the value are separated by a
single colon. HTTP headers are an integral part of HTTP requests and responses.

General Headers: General headers refer mainly to the message itself, as opposed to its
contents. There are a few header fields which have general applicability for both
request and response messages, but which do not apply to the entity being transferred.
These header fields apply only to the message being transmitted.
Cache-Control
| Connection
| Date
| Pragma
| Trailer
| Transfer-Encoding
| Upgrade
| Via
| Warning

Request Headers: The request-header fields allow the client to pass additional
information about the request, and about the client itself, to the server. These fields act
as request modifiers, with semantics equivalent to the parameters on a programming
language method invocation.

Accept ; Section 14.1
| Accept-Charset ; Section 14.2
| Accept-Encoding ; Section 14.3
| Accept-Language ; Section 14.4
| Authorization ; Section 14.8
| Expect ; Section 14.20
| From ; Section 14.22
| Host ; Section 14.23
| If-Match ; Section 14.24
| If-Modified-Since ; Section 14.25
| If-None-Match ; Section 14.26
| If-Range ; Section 14.27
| If-Unmodified-Since ; Section 14.28
| Max-Forwards ; Section 14.31
| Proxy-Authorization ; Section 14.34
| Range ; Section 14.35
| Referer ; Section 14.36
| TE ; Section 14.39
| User-Agent ; Section 14.43
| Cookie

Entity Headers: These are headers that describe the entity contained in the body of
the request, if any.
Request headers are obviously used only in request messages, but both general headers
and entity headers can appear in either a request or a response message.

entity-header = Allow ; Section 14.7
| Content-Encoding ; Section 14.11
| Content-Language ; Section 14.12
| Content-Length ; Section 14.13
| Content-Location ; Section 14.14
| Content-MD5 ; Section 14.15
| Content-Range ; Section 14.16
| Content-Type ; Section 14.17
| Expires ; Section 14.21
| Last-Modified ; Section 14.29

Response Headers:
Accept-Ranges ; Section 14.5
| Age ; Section 14.6
| ETag ; Section 14.19
| Location ; Section 14.30
| Proxy-Authenticate ; Section 14.33
| Retry-After ; Section 14.37
| Server ; Section 14.38
| Vary ; Section 14.44
| WWW-Authenticate ; Section 14.47
| Set-Cookie

HTTP Response Message Format


Up and down; east and west; black and white; yin and yang. Well, you get the idea. Each
request message sent by an HTTP client to a server prompts the server to send back a
response message. Actually, in certain cases the server may in fact send two responses, a
preliminary response followed by the real one. Usually though, one request yields one
response, which indicates the results of the server's processing of the request, and often
also carries an entity (file or resource) in the message body.
Like requests, responses use their own specific message format that is based on the
HTTP generic message format. The format, shown in Figure 318, is:
<status-line>
<general-headers>
<response-headers>
<entity-headers>
<empty-line>
[<message-body>]
[<message-trailers>]
Status Line
The status line—not “response line”, note—is the start line used for response messages.
It has two functions: to tell the client what version of the protocol the server is using,
and to communicate a summary of the results of processing the client's request. The
formal syntax for the status line is:
<HTTP-VERSION> <status-code> <reason-phrase>
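
A status line can be taken apart in the same way as a request line. In this sketch, parse_status_line is a hypothetical helper; the reason phrase is everything after the second space, so it may itself contain spaces or be empty.

```python
def parse_status_line(line: str):
    """Split '<HTTP-VERSION> <status-code> <reason-phrase>' into its fields."""
    parts = line.rstrip("\r\n").split(" ", 2)    # at most three pieces
    if len(parts) < 2:
        raise ValueError(f"malformed status line: {line!r}")
    reason = parts[2] if len(parts) == 3 else ""
    return parts[0], int(parts[1]), reason

print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```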
Headers:-
Host:- The Host request header specifies the domain name of the server (for virtual
hosting), and (optionally) the TCP port number on which the server is listening.
If no port is given, the default port for the service requested (e.g., "80" for an HTTP URL)
is implied.

For example, a GET request to the origin server for
<http://www.example.org/pub/WWW/> would begin with:

GET /pub/WWW/ HTTP/1.1
Host: www.example.org

A server MUST respond with a 400 (Bad Request) status code to any HTTP/1.1 request
message that lacks a Host header field and to any request message that contains more
than one Host header field or a Host header field with an invalid field-value.

Connection:- The Connection general header controls whether or not the network
connection stays open after the current transaction finishes. If the value sent is
keep-alive, the connection is persistent and not closed, allowing subsequent requests to
the same server to be made.

Connection: keep-alive

Connection: close

Server:- The "Server" header field contains information about the software used by
the origin server to handle the request, which is often used by clients to help identify
the scope of reported interoperability problems.
Via:- The Via general header is added by proxies, both forward and reverse proxies, and
can appear in the request headers and the response headers. It is used for tracking
message forwards, avoiding request loops, and identifying the protocol capabilities of
senders along the request/response chain.

Via: [ <protocol-name> "/" ] <protocol-version> <host> [ ":" <port> ]

or

Via: [ <protocol-name> "/" ] <protocol-version> <pseudonym>

Vary:- The Vary HTTP response header determines how to match future request
headers to decide whether a cached response can be used rather than requesting a fresh
one from the origin server. It is used by the server to indicate which headers it used
when selecting a representation of a resource in a content negotiation algorithm.

The Vary header should be set on a 304 Not Modified response exactly like it would
have been set on an equivalent 200 OK response.

Accept:- The Accept request HTTP header advertises which content types, expressed
as MIME types, the client is able to understand. Using content negotiation, the server
then selects one of the proposals, uses it and informs the client of its choice with
the Content-Type response header.

Accept: <MIME_type>/<MIME_subtype>

Accept: <MIME_type>/*

Accept: */*

// Multiple types, weighted with the quality value syntax:

Accept: text/html, application/xhtml+xml, application/xml;q=0.9, image/webp, */*;q=0.8
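
The quality value syntax can be read with a few lines of Python. This is a sketch of the weighting only: parse_accept is a hypothetical helper, it ignores parameters other than q, and it does not rank specific types above wildcards of equal weight.

```python
def parse_accept(header: str):
    """Return (media-range, q) pairs ordered from most to least preferred."""
    entries = []
    for item in header.split(","):
        media, *params = [p.strip() for p in item.split(";")]
        q = 1.0                                   # the default weight
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        entries.append((media, q))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(entries, key=lambda e: e[1], reverse=True)

print(parse_accept("text/html, application/xml;q=0.9, */*;q=0.8"))
```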

Accept-Encoding:- The Accept-Encoding request HTTP header advertises which
content encoding, usually a compression algorithm, the client is able to understand.
Using content negotiation, the server selects one of the proposals, uses it and informs the
client of its choice with the Content-Encoding response header.
Even if both the client and the server support the same compression algorithms, the
server may choose not to compress the body of a response, if the identity value is also
acceptable. Two common cases lead to this: the data to be sent is already compressed, so a
second compression would not make it smaller, or the server is overloaded and cannot
afford the computational overhead of compression.

Accept-Encoding: gzip

Accept-Encoding: compress

Accept-Encoding: deflate

Accept-Encoding: br

Accept-Encoding: identity

Accept-Encoding: *

// Multiple algorithms, weighted with the quality value syntax:

Accept-Encoding: deflate, gzip;q=1.0, *;q=0.5

Accept-Language:- The Accept-Language request HTTP header advertises which
languages the client is able to understand, and which locale variant is preferred. (By
languages, we mean natural languages, such as English, and not programming
languages.) Using content negotiation, the server then selects one of the proposals, uses
it and informs the client of its choice with the Content-Language response header.

Each language-range MAY be given an associated quality value which represents an
estimate of the user's preference for the languages specified by that range. The quality
value defaults to "q=1". For example,

Accept-Language: <language>

Accept-Language: *

// Multiple types, weighted with the quality value syntax:

Accept-Language: fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5

Content-Length:- The Content-Length entity header indicates the size of the entity-
body, in bytes, sent to the recipient.
Content-Length: <length>

Content-Type:- The Content-Type entity header is used to indicate the media type of
the resource.
In responses, a Content-Type header tells the client what the content type of the returned
content actually is.

Content-Type: text/html; charset=utf-8

Content-Type: multipart/form-data; boundary=something

Date:- The Date general HTTP header contains the date and time at which the message
was originated.

Date: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT

Expires:- The Expires header contains the date/time after which the response is considered
stale.

Expires: <http-date>
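Both Date and Expires carry the same HTTP date format. A small sketch using Python's standard email.utils helpers is shown below; the function names http_date and is_stale are invented for these notes, and both functions assume timezone-aware datetimes.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment):
    """Format an aware datetime the way the Date and Expires headers expect."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def is_stale(expires_value, now):
    """True when 'now' is later than the date carried by an Expires header."""
    return now > parsedate_to_datetime(expires_value)


sample = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
date_header = "Date: " + http_date(sample)
```

With this sample value, date_header follows the <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT layout shown above.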

Cookie: - The Cookie HTTP request header contains stored HTTP cookies previously
sent by the server with the Set-Cookie header.
The Cookie header is optional and may be omitted if, for example, the browser's privacy
settings block cookies.

Cookie: <cookie-list>

Cookie: name=value

Cookie: name=value; name2=value2; name3=value3
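On the server side, the value of a Cookie header can be turned back into name/value pairs. The sketch below uses SimpleCookie from Python's standard http.cookies module; the wrapper name parse_cookie_header is hypothetical.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value):
    """Return the cookies in a Cookie header value as a plain dict."""
    jar = SimpleCookie()
    jar.load(value)  # parses "name=value; name2=value2; ..."
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("name=value; name2=value2; name3=value3")
```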


Cache-Control:- The Cache-Control general-header field is used to specify directives
for caching mechanisms in both requests and responses. Caching directives are
unidirectional: a directive given in a request does not imply that the same
directive will be given in the response.

Cache request directives


Standard Cache-Control directives that can be used by the client in an HTTP request.

Cache-Control: max-age=<seconds>

Cache-Control: max-stale[=<seconds>]

Cache-Control: min-fresh=<seconds>

Cache-Control: no-cache

Cache-Control: no-store

Cache-Control: no-transform

Cache-Control: only-if-cached

Cache response directives


Standard Cache-Control directives that can be used by the server in an HTTP response.

Cache-Control: must-revalidate

Cache-Control: no-cache

Cache-Control: no-store

Cache-Control: no-transform

Cache-Control: public

Cache-Control: private

Cache-Control: proxy-revalidate

Cache-Control: max-age=<seconds>

Cache-Control: s-maxage=<seconds>
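Several directives can be combined in one header, separated by commas. A minimal parsing sketch is given below; parse_cache_control is a made-up helper, and it does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value):
    """Map each Cache-Control directive to its argument (True if it has none)."""
    directives = {}
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if not name:
            continue
        directives[name] = arg.strip().strip('"') if sep else True
    return directives


policy = parse_cache_control("public, max-age=31536000")
```

Arguments are kept as strings here, so a caller that needs the number of seconds would convert policy["max-age"] with int().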

User-Agent:- The User-Agent request header contains a characteristic string that
allows the network protocol peers to identify the application type, operating system,
software vendor or software version of the requesting software user agent.
User-Agent: <product> / <product-version> <comment>

Location:- The Location response header indicates the URL to redirect a page to. It
only provides a meaning when served with a 3xx (redirection) or 201 (created) status
response.

Caching:- The performance of web sites and applications can be significantly improved
by reusing previously fetched resources. Web caches reduce latency and network traffic
and thus lessen the time needed to display a representation of a resource. By making use
of HTTP caching, Web sites become more responsive.
Caching is a technique that stores a copy of a given resource and serves it back when
requested. When a web cache has a requested resource in its store, it intercepts the
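request and returns its copy instead of re-downloading from the originating server. This
achieves several goals: it eases the load on the server, which no longer has to serve every
client itself, and it improves performance, because the cache is closer to the client.
There are several kinds of caches, which can be grouped into two main categories: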
private or shared caches. A shared cache is a cache that stores responses for reuse by
more than one user. A private cache is dedicated to a single user.

Private browser caches


A private cache is dedicated to a single user. You might have seen "caching" in your
browser's settings already. A browser cache holds all documents downloaded
via HTTP by the user. This cache is used to make visited documents available for
back/forward navigation, saving, viewing-as-source, etc. without requiring an additional
trip to the server. It likewise improves offline browsing of cached content.

Shared proxy caches


A shared cache is a cache that stores responses to be reused by more than one user. For
example, an ISP or your company might have set up a web proxy as part of its local
network infrastructure to serve many users so that popular resources are reused a
number of times, reducing network traffic and latency.

The Cache-Control header


The Cache-Control HTTP/1.1 general-header field is used to specify directives for caching
mechanisms in both requests and responses. Use this header to define your caching
policies with the variety of directives it provides.

No caching

The cache should not store anything about the client request or server response. A
request is sent to the server and a full response is downloaded each and every time.

Cache-Control: no-store

Cache but revalidate

A cache will send the request to the origin server for validation before releasing a cached
copy.

Cache-Control: no-cache

Private and public caches

The "public" directive indicates that the response may be cached by any cache. This can
be useful when pages that use HTTP authentication, or responses with status codes that
aren't normally cacheable, should nevertheless be cached.

On the other hand, "private" indicates that the response is intended for a single user only
and must not be stored by a shared cache. A private browser cache may store the
response in this case.

Cache-Control: private
Cache-Control: public

Expiration

The most important directive here is "max-age=<seconds>" which is the maximum amount
of time a resource will be considered fresh. Contrary to Expires, this directive is relative to
the time of the request.

Cache-Control: max-age=31536000
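The freshness check behind max-age can be sketched as a comparison between the age of the stored copy and the allowed lifetime. This is a simplification that ignores the Age header and network delay; is_fresh and its parameters are hypothetical names, and times are plain seconds.

```python
def is_fresh(max_age, stored_at, now):
    """True while the stored response is younger than max-age seconds."""
    age = now - stored_at
    return age < max_age


# A response stored at t=1000 with max-age=60 can be reused until t=1060.
still_fresh = is_fresh(60, stored_at=1000, now=1059)
expired = not is_fresh(60, stored_at=1000, now=1060)
```

Under this rule a response with max-age=0 is never fresh, which is why it has to be revalidated before every reuse.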

Validation:- When using the "must-revalidate" directive, the cache must verify the
status of a stale resource with the origin server before using it; an expired resource
must not be used without this check.

Cache-Control: must-revalidate

The Pragma header


Pragma is an HTTP/1.0 header. It is not specified for HTTP responses and is therefore not
a reliable replacement for the general HTTP/1.1 Cache-Control header, although in a
request "Pragma: no-cache" behaves the same as Cache-Control: no-cache if the
Cache-Control header field is omitted. Use Pragma only for backwards compatibility with
HTTP/1.0 clients.

ETags
The ETag HTTP response header is an identifier for a specific version of a resource. It
allows caches to be more efficient, and saves bandwidth, as a web server does not need
to send a full response if the content has not changed. On the other hand, if the content
has changed, ETags help prevent simultaneous updates of a resource from
overwriting each other ("mid-air collisions").
If the resource at a given URL changes, a new ETag value must be generated. ETags are
therefore similar to fingerprints and might also be used for tracking purposes by some
servers. A comparison of them allows the determination of whether two representations
of a resource are the same. They might also be set to persist indefinitely by a tracking
server.

Caching of unchanged resources


Another typical use of the ETag header is to cache resources that are unchanged. If a user
visits a given URL again (that has an ETag set), and it is stale (too old to be considered
usable), the client will send the value of its ETag along in an If-None-Match header field:

If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"

The server compares the client's ETag (sent with If-None-Match) with the ETag for its
current version of the resource, and if both values match (that is, the resource has not
changed), the server sends back a 304 Not Modified status, without a body, which tells the
client that the cached version of the response is still good to use (fresh).
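The comparison described above might look like the following on the server. This is only a sketch: make_etag and respond are invented names, and hashing the body with SHA-1 is just one possible way to derive an ETag, not something HTTP requires.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the response body (one possible scheme)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def _opaque(tag):
    """Drop the weak-validator prefix so that W/"x" compares equal to "x"."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(body, if_none_match=None):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, {"ETag": etag}, b""  # unchanged: no body is sent
    return 200, {"ETag": etag}, body


page = b"<html>hello</html>"
first = respond(page)                   # full response carrying an ETag
second = respond(page, first[1]["ETag"])  # revalidation with the stored ETag
```

The second call stands for the client's revalidation request; because the body has not changed, it yields 304 with an empty body.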

HTTP Cookies:-

An HTTP cookie (web cookie, browser cookie) is a small piece of data that a server sends
to the user's web browser. The browser may store it and send it back with the next
request to the same server. Typically, it's used to tell if two requests came from the same
browser — keeping a user logged-in, for example. It remembers stateful information for
the stateless HTTP protocol.
Cookies are mainly used for three purposes:

Session management - Logins, shopping carts, game scores, or anything else the server
should remember
Personalization - User preferences, themes, and other settings
Tracking - Recording and analyzing user behaviour

Creating cookies
When receiving an HTTP request, a server can send a Set-Cookie header with the response.
The cookie is usually stored by the browser, and then the cookie is sent with requests
made to the same server inside a Cookie HTTP header. An expiration date or duration can
be specified, after which the cookie is no longer sent. Additionally, restrictions to a
specific domain and path can be set, limiting where the cookie is sent.

The Set-Cookie and Cookie headers


The Set-Cookie HTTP response header sends cookies from the server to the user agent. A
simple cookie is set like this:

Set-Cookie: <cookie-name>=<cookie-value>

This header from the server tells the client to store a cookie.

HTTP/2.0 200 OK
Content-type: text/html
Set-Cookie: yummy_cookie=choco
Set-Cookie: tasty_cookie=strawberry

[page content]

Now, with every new request to the server, the browser will send back all previously
stored cookies to the server using the Cookie header.

GET /sample_page.html HTTP/2.0
Host: www.example.org
Cookie: yummy_cookie=choco; tasty_cookie=strawberry

Session cookies
The cookie created above is a session cookie: it is deleted when the client shuts down,
because it didn't specify an Expires or Max-Age directive. However, web browsers may
use session restoring, which makes most session cookies permanent, as if the browser
was never closed.

Permanent cookies
Instead of expiring when the client closes, permanent cookies expire at a specific date
(Expires) or after a specific length of time (Max-Age).

Set-Cookie: id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT;

Tracking and privacy

Third-party cookies
Cookies have a domain associated with them. If this domain is the same as the domain of
the page you are on, the cookie is said to be a first-party cookie. If the domain is
different, it is said to be a third-party cookie. While first-party cookies are sent only to the
server setting them, a web page may contain images or other components stored on
servers in other domains (like ad banners). Cookies that are sent through these third-
party components are called third-party cookies and are mainly used for advertising and
tracking across the web. See for example the types of cookies used by Google. Most
browsers allow third-party cookies by default, but there are add-ons available to block
them (for example, Privacy Badger by the EFF).
If you are not disclosing third-party cookies, consumer trust might get harmed if cookie
use is discovered. A clear disclosure (such as in a privacy policy) tends to eliminate any
negative effects of a cookie discovery. Some countries also have legislation about cookies.
See for example Wikimedia Foundation's cookie statement.

The X-Forwarded-For (XFF) header is a de-facto standard header for identifying
the originating IP address of a client connecting to a web server through an HTTP proxy
or a load balancer. When traffic is intercepted between clients and servers, server access
logs contain the IP address of the proxy or load balancer only. To see the original IP
address of the client, the X-Forwarded-For request header is used.

X-Forwarded-For: <client>, <proxy1>, <proxy2>

Directives
<client>
The client IP address
<proxy1>, <proxy2>
If a request goes through multiple proxies, the IP address of each successive proxy is
listed. This means the right-most IP address is the IP address of the most recent proxy
and the left-most IP address is the IP address of the originating client.
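Reading the header on the server can be sketched as below. The helper name parse_x_forwarded_for is hypothetical and the addresses are documentation examples. Keep in mind that the header is supplied by the request and can be forged, so the left-most entry is only trustworthy when every proxy in the chain is trusted.

```python
def parse_x_forwarded_for(value):
    """Split an X-Forwarded-For value into (client, [proxy1, proxy2, ...])."""
    hops = [hop.strip() for hop in value.split(",") if hop.strip()]
    if not hops:
        return None, []
    return hops[0], hops[1:]  # left-most is the client, the rest are proxies


client, proxies = parse_x_forwarded_for("203.0.113.195, 70.41.3.18, 150.172.238.178")
```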

Server Push in 2.0:- HTTP/2 allows a server to pre-emptively send responses
(along with corresponding "promised" requests) to a client in association with a
previous client-initiated request. This can be useful when the server knows the client
will need to have those responses available in order to fully process the response to the
original request. A client can request that server push be disabled, though this is
negotiated for each hop independently. The SETTINGS_ENABLE_PUSH setting can be
set to 0 to indicate that server push is disabled.

What Is Server Push, Exactly?


Accessing websites has always followed a request and response pattern. The user sends
a request to a remote server, and with some delay, the server responds with the
requested content.
The initial request to a web server is commonly for an HTML document. In this scenario,
the server replies with the requested HTML resource. The HTML is then parsed by the
browser, where references to other assets are discovered, such as style sheets, scripts
and images. Upon their discovery, the browser makes separate requests for those
assets, which are then responded to in kind.

Figure: Typical web server communication

The problem with this mechanism is that it forces the user to wait for the browser to
discover and retrieve critical assets until after an HTML document has been
downloaded. This delays rendering and increases load times.
With server push, we have a solution to this problem. Server push lets the server
preemptively “push” website assets to the client without the user having explicitly
asked for them. When used with care, we can send what we know the user is going to
need for the page they’re requesting.
Let’s say you have a website where all pages rely on styles defined in an external style
sheet named styles.css. When the user requests index.html from the server, we can
push styles.css to the user just after we begin sending the response for index.html.

Figure: Web server communication with HTTP/2 server push

Rather than waiting for the server to send index.html and then waiting for the browser
to request and receive styles.css, the user only has to wait for the server to respond
with both index.html and styles.css on the initial request. This means that the browser
can begin rendering the page faster than if it had to wait.
As you can imagine, this can decrease the rendering time of a page. It also solves some
other problems, particularly in front-end development workflows.

How To Use Server Push


Using server push usually involves using the Link HTTP header, which takes on this
format:

Link: </css/styles.css>; rel=preload; as=style

• Difference between "no-store" and "max-age=0" / "no-cache":-
  With no-store, the response is not cached at all. With max-age=0, the response is
  stored but is stale immediately, so the proxy/client should revalidate it with the
  origin (a conditional request such as If-Modified-Since) before using it; the cached
  copy is reused only if the server answers 304.
  If the no-cache directive does not specify a field-name, then a cache MUST
  NOT use the response to satisfy a subsequent request without successful
  revalidation with the origin server.
  If the no-cache directive does specify one or more field-names, then a cache
  MAY use the response to satisfy a subsequent request, subject to any other
  restrictions on caching. However, the specified field-name(s) MUST NOT be
  sent in the response to a subsequent request without successful revalidation
  with the origin server.
• Difference between max-age=0 and must-revalidate:-
  With max-age=0 alone, if the origin is not responding, the user agent may fall back
  to the stale content in its cache. With must-revalidate, the cache must not use the
  stale copy without successful revalidation; if the origin cannot be reached, it
  returns an error (504 Gateway Timeout) instead.
• If Cache-Control and Expires are both set, which is preferred?
  Cache-Control (its max-age directive overrides Expires).
• What is the use of private in Cache-Control?
  It is used when the server does not want the response cached by intermediate devices.
  Only the end user's browser may cache it.
• Set-Cookie response header: sets the cookie's Domain and Path. With the Secure
  attribute, the cookie is sent only over HTTPS; it is not sent with plain HTTP requests.

HSTS:- The HTTP Strict-Transport-Security response header (often abbreviated
as HSTS) lets a web site tell browsers that it should only be accessed using HTTPS,
instead of using HTTP.

Directives
max-age=<expire-time>
The time, in seconds, that the browser should remember that a site is only to be
accessed using HTTPS.

preload Optional

Description:- If a website accepts a connection through HTTP and redirects to
HTTPS, visitors may initially communicate with the non-encrypted version of the site
before being redirected, if, for example, the visitor types http://www.foo.com/ or even
just foo.com. This creates an opportunity for a man-in-the-middle attack. The redirect
could be exploited to direct visitors to a malicious site instead of the secure version of
the original site.

The HTTP Strict Transport Security header informs the browser that it should never load
a site using HTTP and should automatically convert all attempts to access the site using
HTTP to HTTPS requests instead.

Note: The Strict-Transport-Security header is ignored by the browser when your site is
accessed using HTTP; this is because an attacker may intercept HTTP connections and
inject the header or remove it. When your site is accessed over HTTPS with no certificate
errors, the browser knows your site is HTTPS capable and will honor the Strict-Transport-
Security header.

An example scenario
You log into a free WiFi access point at an airport and start surfing the web, visiting your
online banking service to check your balance and pay a couple of bills. Unfortunately, the
access point you're using is actually a hacker's laptop, and they're intercepting your
original HTTP request and redirecting you to a clone of your bank's site instead of the
real thing. Now your private data is exposed to the hacker.
Strict Transport Security resolves this problem; as long as you've accessed your bank's
web site once using HTTPS, and the bank's web site uses Strict Transport Security, your
browser will know to automatically use only HTTPS, which prevents hackers from
performing this sort of man-in-the-middle attack.

How the browser handles it


The first time your site is accessed using HTTPS and it returns the Strict-Transport-
Security header, the browser records this information, so that future attempts to load the
site using HTTP will automatically use HTTPS instead.
When the expiration time specified by the Strict-Transport-Security header elapses, the
next attempt to load the site via HTTP will proceed as normal instead of automatically
using HTTPS.

Whenever the Strict-Transport-Security header is delivered to the browser, it will
update the expiration time for that site, so sites can refresh this information and prevent
the timeout from expiring. Should it be necessary to disable Strict Transport Security,
setting the max-age to 0 (over an https connection) will immediately expire the
Strict-Transport-Security header, allowing access via http.
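The browser-side bookkeeping described in this section can be modelled with a small table of expiry times. This is a toy sketch, not how any particular browser implements it: the class HstsStore and its methods are invented for these notes, times are plain seconds, and directives other than max-age are ignored.

```python
class HstsStore:
    """Toy model of the per-host HSTS expiry times a browser remembers."""

    def __init__(self):
        self._expiry = {}

    def note_header(self, host, header_value, now):
        """Record a Strict-Transport-Security header received over HTTPS."""
        max_age = None
        for part in header_value.split(";"):
            name, _, arg = part.strip().partition("=")
            if name.strip().lower() == "max-age":
                max_age = int(arg.strip().strip('"'))
        if max_age is None:
            return  # no max-age directive: nothing to record
        if max_age == 0:
            self._expiry.pop(host, None)  # max-age=0 switches HSTS off at once
        else:
            self._expiry[host] = now + max_age  # every header refreshes the expiry

    def must_use_https(self, host, now):
        """True while the host's HSTS entry has not expired."""
        expiry = self._expiry.get(host)
        return expiry is not None and now < expiry


store = HstsStore()
store.note_header("bank.example", "max-age=31536000; includeSubDomains", now=0)
```

After this call, store.must_use_https("bank.example", now) stays true until the max-age has elapsed or a later header with max-age=0 is recorded.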

What do the Secure and HttpOnly flags mean?

The Secure flag ensures that a cookie is set and transmitted only over a secure channel
(i.e. https). If plain http is also available, the Secure flag prevents the cookie from being
transmitted over it. A missing Secure flag therefore becomes an issue whenever the site can
be used over, or can fall back to, http.
HttpOnly ensures that scripting languages (e.g. JavaScript) cannot read the cookie value
(such as through document.cookie). The only way to get it is through the http request and
response headers. A missing HttpOnly flag combined with an XSS vulnerability is therefore a
recipe for a stolen session token.
It's best to set both HttpOnly and Secure on your session token. For other cookies, it
depends on how sensitive they are and what they are used for.
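A session cookie carrying both flags could be produced like this with Python's standard http.cookies module. The function name build_session_cookie, the cookie name "session" and the token value are placeholders chosen for the example.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token):
    """Return a Set-Cookie header line with the Secure and HttpOnly flags set."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["path"] = "/"
    cookie["session"]["secure"] = True    # only sent over https
    cookie["session"]["httponly"] = True  # hidden from document.cookie
    return cookie.output()  # the header name "Set-Cookie:" is included


header_line = build_session_cookie("abc123")
```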

• Diff between PUT and POST?
• Diff between GET and POST?
• What is the use case of these methods?
• Diff between HTTP 1.0 & 1.1
• HTTP request & response flow
• Diff between 401 & 403
• Diff between 405, 501 & 503
• Diff between 301, 302, 307 & 308
• Cookie parameters (secure)
• What is HSTS and how it works
• Types of cookies
• How caching works in 1.0 and 1.1.

Please read this document carefully as anything could be asked out of it.
