
Web Technologies

Lecture 1
The Internet and HTTP
The Internet
• From internetted, meaning interconnected
• Global system of interconnected computer
networks
• 2014 – more than 38% of the world is
connected
• Relies on the TCP/IP protocol
A bit of history
• 1960’s ARPANET is created
– Early packet switching protocol
• 1981 - ARPANET is expanded by the National Science
Foundation (NSF) - NSFNET
• 1982 - TCP/IP is standardized
• Late 80’s & early 90’s ISPs begin to emerge
• 1990 - ARPANET is decommissioned
• 1990 - 1.5 Mbit/s link across the Atlantic between Cornell
University and CERN
• 1990 - 1st web browser called WorldWideWeb
• 1995 - the Internet is fully commercialized
• 2007 - over 97% of the world's telecommunicated information is carried over the Internet
Connected World
The Internet of Things
• The Internet is a complex beast
– Mobile devices
– Smart sensors
– Social networks
– Etc.
The Internet today
Towards a networked society
Examples
Fundamentals
• Client-server model
– E.g., web browsers
• P2P
– E.g., file sharing software
• TCP/IP for communication
• Many application level protocols
– HTTP for web traffic
– SSH cryptographic protocol
– POP3 for email
– Etc.
Client-server architecture
• Many clients talk to one or more servers
(possibly via a proxy)
World Wide Web
• A collection of documents and other resources
• URI – Uniform Resource Identifier
– 2 subclasses:
• URL – Uniform Resource Locator
Format: http://<host>[:<port>]/[<resource>][?<query>]
Example: http://www.test.ro:8080/get/animals?type=mammal
• URN – Uniform Resource Name
Format: urn:<namespace-id>:<namespace-specific-string>
Example: urn:animals:mammal:ferret:nose
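As a rough illustration of the URL format, the sketch below splits a URL of that shape into its parts using Python's standard urllib.parse module. The host, port and query values are only the illustrative ones from the slide, not a real service.

```python
from urllib.parse import urlsplit, parse_qs

# Illustrative URL following http://<host>[:<port>]/[<resource>][?<query>]
url = "http://www.test.ro:8080/get/animals?type=mammal"

parts = urlsplit(url)
query = parse_qs(parts.query)  # query string -> dict mapping names to lists of values

print(parts.scheme)    # protocol part before "://"
print(parts.hostname)  # <host>
print(parts.port)      # <port>, converted to an int
print(parts.path)      # <resource>
print(query)           # parsed <query>
```

Each query parameter maps to a list because the same name may appear more than once in a query string.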
HTTP
• Hypertext Transfer Protocol
– 1999: HTTP/1.1
• Reuse the same connection for multiple resource requests
• Less latency
– May 2015: HTTP/2 (proposed)
• Header field compression
• Concurrent exchanges on the same connection
– HTTP/1.1 format:
• Request
[method] [resource] [version]<CRLF>
[header]: [value]<CRLF>
<CRLF>
• Response
[version] [status] [message]<CRLF>
[header]: [value]<CRLF>
<CRLF>
[body]
HTTP sessions
• Sequence of network request-responses
– Establish TCP connection to a port (usually 80)
– Server listens for requests from clients
• GET, HEAD, POST, PUT, DELETE, TRACE, OPTIONS, CONNECT,
PATCH
– Server sends back status and message plus the body
which represents the requested resource
• Status codes:
– Informational (1xx)
– Successful (2xx): 200
– Redirection (3xx): 301, 302
– Client error (4xx): 404
– Server error (5xx): 500
Example
• Request
GET /index.html HTTP/1.1<CRLF>
Host: www.example.com<CRLF>
<CRLF>
• Response
HTTP/1.1 200 OK<CRLF>
Date: Mon, 23 May 2005 22:38:34 GMT<CRLF>
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)<CRLF>
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT<CRLF>
Etag: "3f80f-1b6-3e1cb03b"<CRLF>
Accept-Ranges: bytes<CRLF>
Content-Length: 438<CRLF>
Connection: close<CRLF>
Content-Type: text/html; charset=UTF-8<CRLF>
<CRLF>
<Content ...>
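To make the message layout concrete, here is a minimal sketch that splits a raw HTTP/1.1 response into status line, headers and body. The response text and the parse_response helper are made up for illustration; a real client would rely on an HTTP library rather than hand-written parsing.

```python
# Raw HTTP/1.1 response with CRLF line endings (shortened, hypothetical example).
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Length: 13\r\n"
    "Connection: close\r\n"
    "\r\n"
    "<p>Hello</p>\n"
)

def parse_response(message):
    """Split a raw response into (version, status, reason, headers, body)."""
    head, _, body = message.partition("\r\n\r\n")  # the empty line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(status), reason, headers, body

version, status, reason, headers, body = parse_response(raw)
print(version, status, reason)
print(headers["content-type"])
```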
HTTP session state
• HTTP is stateless
– Does not retain information or status about
requests
– Can be worked around with cookies, server-side
sessions, hidden form variables
Synchronous vs. asynchronous
• Synchronous = blocking
– A complete page must be loaded with every client
request
– AJAX (2005) changed the game by making
asynchronous communication possible
• Asynchronous
– Perform background tasks without blocking the
interface
• Example: load page content gradually (Google maps),
autocompletion (Google search)
What to expect from this lecture
• Basic information about Internet technologies
– Stateful vs. stateless
– Synchronous vs. asynchronous
– Client side technologies
• HTML, CSS
• XML, DOM, AJAX, jQuery, JSON
– Server side technologies
• Web servers, templating engines, PHP, databases
– Advanced topics on
• Web services
• RSS
• Cloud fundamentals
• Message queues
Grading
• Written exam (60%)
(50%)

– Multiple choice
• Lab assignments (40%)
(50%)

• Documentation
– Personal page: http://elearning.e-uvt.ro
Google Classroom
Web Technologies

Lecture 2
HTML and CSS
HTML
• Hyper Text Markup Language
– Describes web documents
– Made up of nested HTML markup tags
• <html></html>
– Tags are the building blocks of websites
• Embedded images and objects
• Interactive web forms
• Organized content
– Each tag describes different document content
• <head></head>
• <body></body>
– Web browsers interpret the tags and display the result on
screen
Brief History
• 1980 – physicist Berners-Lee (CERN) proposed
ENQUIRE, a system for sharing documents
• 1989 – Berners-Lee proposed an Internet based
hypertext system
• 1990 – Berners-Lee wrote a browser and server
for it
• 1991 – 1st publicly available description of HTML
– 18 tags inspired by SGMLguid
– 11 are still used in HTML 4
Brief history
• 1995 – HTML 2.0 specification is created
• 1996 – World Wide Web Consortium (W3C)
takes over the specifications
• 1999 – HTML 4.01 specification is published
• 2000 – HTML becomes an international
standard (ISO/IEC 15445:2000)
• 2004 – work on HTML 5 begins
• 2014 – HTML 5 standard is finalized
Markup language
• System for annotating a document
• Idea derives from the marking up of documents by editors
• Some languages have predefined presentation semantics (HTML), others do
not (XML)

• Classification
– Presentational markup
• Used by traditional word processing editors
• Markup hidden from users, editors, authors
– Procedural markup
• Embedded in text
• Provides instructions on how to process the text
• LaTeX, PostScript
– Descriptive (semantic) markup
• Label part of the text rather than providing specific instructions, i.e., decouple the
structure from the rendering of the document
• HTML’s tag <cite>
HTML 4
• SGML application
• Three flavors
– Strict
• Deprecated elements are forbidden
– Transitional
• Deprecated elements are allowed
– Frameset
• For frame related elements only

• Major changes
– Adopted many browser-specific tags and attributes
– Began the transition from visual markup tags to Cascading Style Sheets (CSS)
• Old HTML:
– <font color="red">Text in red</font>
• New CSS:
– <p style="color:red">Text in red</p>
HTML structure
• Tags/Elements and attributes
– <tagName1 attributeName1="value1" attributeName2="value 2">content</tagName1>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>This is a title</title>
</head>
<body>
<!-- Comment: display text in blue -->
<p style="color:blue">Hello world!</p>
</body>
</html>
HTML metadata
• Allows inserting extra information in the document
• Placed inside the <head></head> tag

– Character set encoding


• <meta http-equiv="content-type" content="text/html; charset=UTF-8">
– Keywords and description
• <meta name="keywords" content="HTML, CSS, XML, XHTML, JavaScript">
• <meta name="description" content="Free Web tutorials on HTML and CSS">
– Refresh page
• <meta http-equiv="refresh" content="30">
– Author info
– Etc.
Transitional vs. strict
• Transitional is a flavor which enables documents written in older HTML versions
to be migrated to HTML 4

• Looser content model


– Inline elements and direct text allowed in <body>, <blockquote>, <form>, <noscript> and
<noframes> tags
• Presentation elements allowed
– Underline (<u>), strikeout (<s>), <center>, <font>, <basefont>
• Presentation attributes allowed
– Background, align, bgcolor, vspace, hspace, clear, compact, type, width
• Other elements
– Applet, menu, dir, isindex

• In addition, in the Frameset flavor <frameset> replaces <body> and contains
<frame> tags
• In the Strict flavor the use of CSS is required for cosmetics
HTML 5
• 28 October 2014 – specification is released
• Defines a single markup language which combines a series
of features introduced by various specifications and
browsers
• Encourages interoperable implementations
– What runs on Firefox should run unchanged on IE or Chrome
• “Living standard”: can be improved but older features
cannot be removed

• 2011 – 34 out of 100 most popular websites used HTML 5


• 2014 – 153 of Fortune 500 US companies used HTML 5 on
their websites
HTML 5
• New syntactic features
– <video>, <audio>, <canvas>, <svg>
• Removed deprecated tags
– <font>, <center>
• New page structure tags
– <main>, <section>, <article>, <header>, <footer>, <aside>,
<nav>, <figure>
• Modified existing tags
– <a>, <cite>, <menu>
• DOM is part of HTML 5
• Not based on SGML anymore
• Includes Web Forms 2.0
New APIs
• APIs that can work with Javascript
More differences
• Simpler doctype declaration
– <!DOCTYPE html>
• Simpler character encoding
– <meta charset="UTF-8">
– Default encoding is UTF-8
Cascading Style Sheets
• First released in 1996
• Style sheet language for describing the
presentation of a document written in a
markup language
– Not restricted to HTML
• Separates the document content from the
presentation
– CSS: all heading 1 elements should be bold
– HTML: this text should be a level 1 heading
CSS syntax
• Set of rules
– Declaration block
• Specifies the style sheet
• Property + value
– Example: color:red
– Selectors
• Declare which part of the HTML the style sheet applies to
– Elements, classes, pseudo-classes, ids
– Example: p.myclass { color:red }
Colors all p elements with the class myclass red
<p class="myclass">Red text</p>
<p>Not red text</p>
<p class="myclass">Another red text</p>
Inserting CSS
• Inline
– <p style="color:red;font-size:14pt;font-weight:bold">

• Internal stylesheet (as any HTML element)


<style>
p {
color:red;
font-size:14pt;
font-weight:bold;
}
</style>

• External stylesheet (in a standalone .css file)


<head>
<link rel="stylesheet" type="text/css" href="mystyle.css">
</head>
Multiple stylesheets
• If an element has more than one stylesheet
– Example: one internal and one in an external file
• The more specific style will be applied
– I.e., the closest to the element: external → internal → inline

Example:
• External:
h1 {
color: navy;
margin-left: 20px;
}
• Internal:
h1 {
color: orange;
}
The style will be
color: orange;
margin-left: 20px;
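The merge in this example can be modelled, very roughly, as dictionaries that override one another property by property. This is a simplified sketch only: the cascade helper is hypothetical and it ignores selector specificity, rule order within a sheet and !important.

```python
# Declarations for h1 from each source, listed from lowest to highest priority.
external = {"color": "navy", "margin-left": "20px"}
internal = {"color": "orange"}

def cascade(*sources):
    """Merge declarations; later sources override earlier ones per property."""
    result = {}
    for declarations in sources:
        result.update(declarations)
    return result

style = cascade(external, internal)
print(style)
```

Only the color property is overridden; margin-left survives because the internal rule does not mention it.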
CSS 3
• Work on the standard started in 1998
• Split in modules
• Most modules are W3C Recommendations and many
of the new properties are already implemented in
modern browsers
• Old CSS still present

• Still many debates


– Example: “The CSS Working Group has reached an impasse
on the issue of percentage margins (and padding) in flex
and grid layout, and needs your input to help decide it.”
(May 5, 2015 – www.css3.info)
New CSS modules
• New modules
– Selectors
– Box Model
– Backgrounds and Borders
– Image Values and Replaced Content
– Text Effects
– 2D/3D Transformations
– Animations
– Multiple Column Layout
– User Interface
CSS 3 primer
• Shadow effects

CSS 3:
h1 {
text-shadow: 2px 2px red;
}
HTML:
<h1>Text-shadow effect!</h1>
What’s next?
• Web forms
• XML & XHTML
• JavaScript
– Stateful vs. stateless
• Dynamic HTML manipulation
• AJAX
– Synchronous vs. asynchronous
• jQuery
Web Technologies

Lecture 3
Web forms
HTML5 forms
• A component of a webpage that has
form controls
– Text fields
– Buttons
– Checkboxes
– Range controls
– Color pickers

• User interacts with the form providing data which
is sent to the server for further processing
– E.g.: returning results of a search or calculation
Web Forms 2.0
• If a pattern is popular enough, migrate it from a
scripted solution to a more declarative form
• Example
– The :hover pseudo-class in CSS
• CSS has limitations
• HTML5 introduces many new form
enhancements
• Features were part of the WHATWG specification
called Web Forms 2.0 now part of HTML5
The <form> tag
<form
method="get"
enctype="application/x-www-form-urlencoded"
action="https://www.random.org/integers/">
</form>
• Placed inside the <body> tag
• Required attributes
– method
• Get (for querying data) or post (for sending data, e.g., a file)
– action
• URL of the service handling the submitted data
• Optional enctype
• application/x-www-form-urlencoded (default)
– All characters encoded before being sent (e.g., spaces are converted to + characters and special characters to
%XX hexadecimal codes)
» & → %26, " → %22
• multipart/form-data
– No characters are encoded
– Used for file uploads
• text/plain
– Spaces are encoded, special characters not
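A small sketch of the default application/x-www-form-urlencoded encoding, using Python's standard urllib.parse module. The field names and values are invented for illustration.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; names and values are made up.
fields = {"q": 'cats & "dogs"', "page": "2"}

encoded = urlencode(fields)   # spaces become "+", reserved characters become %XX
decoded = parse_qs(encoded)   # what the receiving side recovers

print(encoded)
print(decoded)
```

The ampersand inside the value has to be escaped because an unescaped & separates one name=value pair from the next.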
Form content
• Contains controls
– Many are represented by <input> elements
• Controls are labeled with the <label> tag
• Each part of a form is considered a paragraph and
is separated from the rest by using <p> elements

<form …>
<p><label>Number of requested integers: <input></label></p>
</form>
Input types
• Specified via the type attribute
hidden (HTML 4)      datetime-local
text (HTML 4)        number
search               range
tel                  color
url                  checkbox (HTML 4)
email                radio (HTML 4)
password (HTML 4)    file
datetime             submit (HTML 4)
date                 image
month                reset (HTML 4)
week                 button (HTML 4)
time
Browser compatibility
• Not all browsers support all input types
• Check compatibility at
https://html.spec.whatwg.org/multipage/forms.html

• The browser will only retain the type value you set if
it supports that input type
• Otherwise, it will ignore the value you set and leave
the type property as "text"
– In this case, JavaScript handling of the value is required
Input restrictions
Other form controls
• <textarea> tag
– Specifies a multi-row text field
<textarea>At w3schools.com you will learn how to make a website. We
offer free tutorials in all web development technologies.
</textarea>
• <option>
– Sets up a drop-down list of options inside a <select>
<select>
<option value="volvo" selected>Volvo</option>
<option value="saab">Saab</option>
<option value="opel">Opel</option>
<option value="audi">Audi</option>
</select>
New HTML5 attributes
• Placeholder
– When no input is entered display default text
– Grayed
– Disappears when clicked on the input
• Autofocus
– Focus on a particular form field
• Required
– Marks a field as required
– Browsers without support need JavaScript to check that the field is filled in
• Autocomplete
– Lets a form turn off auto-filling of a field
– Good in cases where you do not want other users to see some
of your input such as bank accounts, etc.
Listing predefined options
• <datalist> tag
– Crossbreed between <input> and <select>

<input type="range" name="a" list="a-values">


<datalist id="a-values">
<option value="10" label="Low">
<option value="90" label="High">
</datalist>
Going into the future
• Browser vendors competing on the prettiness
and usability of their HTML5 form controls

– Should web developers style them instead?


– Already some controls such as calendars and
sliders cannot be styled using CSS
Data requests
• Get
– Requests data from a specified resource
– The query string is sent in the URL of the request
https://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=new
– Used by the REST (REpresentational State Transfer) architecture
• Post
– Submits data to be processed to a specified resource
– The query string is sent in the HTTP body of the request
POST /test/demo_form.asp HTTP/1.1
Host: w3schools.com

name1=value1&name2=value2
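The difference between the two methods can be sketched with Python's standard urllib.request module by building (but not sending) both kinds of request. The endpoint below is a placeholder, not a real service.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and fields, used only to build (not send) the requests.
base_url = "http://www.example.com/test/demo_form"
data = urlencode({"name1": "value1", "name2": "value2"})

# GET: the query string travels in the URL.
get_request = Request(base_url + "?" + data)

# POST: the same string travels in the request body.
post_request = Request(base_url, data=data.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

Nothing is transmitted here; passing either object to urllib.request.urlopen would actually perform the request.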
Get
• GET requests can be cached
• GET requests remain in the browser history
• GET requests can be bookmarked
• GET requests should never be used when dealing
with sensitive data
• GET requests have length restrictions
– Depends on implementation
• GET requests should be used only to retrieve data
Post
• POST requests are never cached
• POST requests do not remain in the browser
history
• POST requests cannot be bookmarked
• POST requests have no restrictions on data
length
Get vs. post
What’s next?
• XML & XHTML
• JavaScript
– Stateful vs. stateless
• Dynamic HTML manipulation
• AJAX
– Synchronous vs. asynchronous
• jQuery
• Server side programming
Web Technologies

Lecture 4
XML and XHTML
XML
• Extensible Markup Language
• Set of rules for encoding a document in a format
readable
– By humans, and
– Machines
• W3C XML 1.0 Specification
• Goals
– Simplicity
– Generality
– Usability
XML
• Focuses on documents
• Can represent arbitrary data
– Those used by web services
• Many document formats
– RSS
– Atom
– SOAP (communication and web service protocol)
– XHTML (similar to HTML)
– Office Open XML
– XMPP (communication protocol)
A little history
• 1986 – Standard Generalized Markup
Language (SGML)
• 1998 – SGML is reworked into XML
• 2000 – XHTML 1.0 is released
• 2001 – XHTML 1.1 is released
• 2008 – XML 1.0 (Fifth Edition) is released
• 2014 – HTML 5.0 is published as a non-SGML
language
Markup and content
• Markup text starts with
– < and ends with >
• Example: <div> </div>
• Forms tags
– Start tags
» <div>
– End tags
» </div>
– Empty-element tags
» <br/>
– & and ends with ;
• Example: &amp; &acirc;
• Everything else is content
Elements and attributes
• Elements are logical document components
which
– Start with a start tag and end with an end tag, or
– Consist of only an empty-element tag
• Attributes are name-value pairs within an
element
– Allowed in start tags and empty-element tags, not in end tags
– <img src="my-image.jpg"/>
XML documents
• Declaration
• Elements
• Attributes
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
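As a rough sketch of how a program reads such a document, the note can be parsed with Python's standard xml.etree.ElementTree module and its elements walked one by one.

```python
import xml.etree.ElementTree as ET

# The note document, passed as bytes so the declared encoding is honoured.
document = b"""<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(document)

print(root.tag)  # name of the root element
for child in root:
    print(child.tag, "->", child.text)
```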
XML characteristics
• All opened elements need to be closed
• Case sensitive
– <Div> is different from <div>
• No shorthand features
– Whereas in HTML we can write
<option selected>
– In XHTML we must write
<option selected="selected">
• On errors the parsing of a document stops
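These rules can be tried out with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the is_well_formed helper is a hypothetical name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it stops with an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<note><to>Tove</to></note>"))  # every element is closed
print(is_well_formed("<note><to>Tove</note>"))       # <to> is never closed
print(is_well_formed("<Div>text</div>"))             # start and end tag differ in case
print(is_well_formed("<option selected/>"))          # attribute without a value
```

Only the first document is accepted; in the other three cases the parser raises an error instead of trying to recover, which is the strict behaviour described above.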
XML namespaces
• Element names are defined by the developer
– This leads to confusion
– Same <table> element can have different
meanings
• Solve conflicts using prefixes
<my:table>

• Prefixes require the definition of a namespace


<my:table xmlns:my="http://address/to/my/namespace">
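A minimal sketch of how a parser tells two table elements apart once they are bound to namespaces. The first namespace URI is the placeholder from the slide, and the wrapping root element and element contents are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two <table> elements with different meanings, each bound to its own namespace.
document = """
<root xmlns:my="http://address/to/my/namespace"
      xmlns:html="http://www.w3.org/1999/xhtml">
  <my:table>oak dining table</my:table>
  <html:table>rows and cells</html:table>
</root>
"""

root = ET.fromstring(document)
tags = [child.tag for child in root]
print(tags)  # ElementTree expands each prefix to {namespace-URI}local-name
```

The prefix itself carries no meaning; what distinguishes the elements is the URI the prefix is bound to.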
CDATA
• Character data section
– Specifies a section of content that is marked to be
interpreted purely as textual data, and not
markup
<![CDATA[ content ]]>
– Avoids having encoded text which the parsers are
not supposed to process
– Example:
• <![CDATA[<sender>John Smith</sender>]]>
• Instead of &lt;sender&gt;John Smith&lt;/sender&gt;
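The equivalence of the two forms can be checked with a parser. In this sketch, which uses Python's standard xml.etree.ElementTree module, the wrapping msg element is a made-up name.

```python
import xml.etree.ElementTree as ET

with_cdata = ET.fromstring("<msg><![CDATA[<sender>John Smith</sender>]]></msg>")
with_entities = ET.fromstring("<msg>&lt;sender&gt;John Smith&lt;/sender&gt;</msg>")

print(with_cdata.text)                        # the markup-looking text, kept as plain characters
print(with_cdata.text == with_entities.text)  # both spellings yield the same content
print(len(with_cdata))                        # number of child elements
```

In both cases the sender tags are delivered as character data of msg, not as a child element.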
XHTML
• XHTML 1.0 is "a reformulation of the three HTML
4 document types as applications of XML 1.0"
• XHTML was developed to make HTML more
extensible and increase interoperability with
other data formats
– Compatibility with common XML tools, servers, and
proxies
– Extensibility by adding new features such as SVG and
MathML all written in XML
• Through namespaces
XHTML and HTML
• JavaScript and CSS are handled differently in
XHTML
• Complex relationship
Validating XML
• DTD – Document Type Definition
– Defines a standard for exchanging data
– Example
• Clients ensure that the data they receive from the web
server is valid XHTML
– Defined using <!DOCTYPE>
• Example
– <!DOCTYPE html>
• XML Schema
– XML alternative to DTD
DTD vs XML schemas
• XML Schemas are written in XML
– You don't have to learn a new language
– You can use your XML editor to edit your Schema files
– You can use your XML parser to parse your Schema files
– You can manipulate your Schemas with the XML DOM
– You can transform your Schemas with XSLT
• XML Schemas are extensible to additions
• XML Schemas support data types
– It is easier to describe document content
– It is easier to define restrictions on data
– It is easier to validate the correctness of data
– It is easier to convert data between different data types
• XML Schemas support namespaces
An example

DTD for the XML

Schema for the XML


Transforming XML
• XSL – Extensible Stylesheet Language
– Transform and render XML documents
• XSLT
– Language for transforming XML documents
• XPath
– Non-XML language used to address XML elements
– Used from inside XSLT
• Example
<xsl:for-each select="./note">
What’s next?
• JavaScript
– Stateful vs. stateless
• Dynamic HTML manipulation
• AJAX
– Synchronous vs. asynchronous
• jQuery
• Server side programming
• Web services
Web Technologies

Lecture 5
Java Servlets
What is a Servlet?
• A Servlet is a Java program that extends the capabilities of servers.
• Inherently multi-threaded.
• Each request launches a new thread.
• Input from client is automatically parsed into a Request variable.
• A servlet can be thought of as a server-side applet
• Applet: a java program that runs within the web browser
• Servlet: a java program that runs within the web server
• Servlets are loaded and executed by a web server in the same
manner that applets are loaded and executed by a web browser
Server-Side
• Architecture

• Applications
• Dynamically generate HTML pages
• Access to database and/or back-end servers
• etc.
Server-Side Application: CGIs
• Common Gateway Interface (CGI)
• Basically calls an external program
• Use standard input and output for data exchange
• Programming language independent
• Weakness
• A CGI program may not be easily portable to other platforms
• Substantial overhead is incurred in starting the CGI process
Servlet Architecture

• The client makes a request via HTTP


• The web server receives the request and forwards it to the servlet
• If the servlet has not yet been loaded, the web server loads it into the JVM
and executes it
• The servlet receives the HTTP request and performs some type of
process
• The servlet returns a response to the web server
• The web server forwards the response to the client
Why servlets?
• Servlets were designed to replace CGI scripts
• Platform-independent and extensible
• CGI scripts are typically written in Perl or C, and are very much tied to a particular server
platform
• Servlet is written in Java, which can easily integrate with existing legacy systems through RMI,
CORBA, and/or JNI
• Persistent and fast
• Servlets are loaded only once by the web server and can maintain services between requests
(particularly important for maintaining database connections)
• CGI scripts are transient – a CGI script is removed from memory after it is complete
• For each browser request, the web server must spawn a new operating system process
• Secure
• The only way to invoke a servlet from the outside world is through a web server, which can
be protected behind a firewall
Servlet Processing
1. Read any data sent by the client
• Capture data submitted by an HTML form
2. Look up any HTTP information
• Determine the browser version, host name of client, etc.
3. Generate the results
• Connect to databases, connect to legacy applications, etc.
4. Format the results
• Generate HTML on the fly
5. Set the appropriate HTTP headers
• Tell the browser the type of document being returned or set any cookies
6. Send the document back to the client
Servlet Life Cycle
• Create (Servlet Instantiation):
• Loading the servlet class and creating a new instance
• Initialize (Servlet Initialization):
• Initialize the servlet using the init() method
• Service (Servlet processing):
• Handling 0 or more client requests using the service() method
• Destroy (Servlet Death):
• Destroying the servlet using the destroy() method
• When an HTTP request calls for a servlet
• Not loaded: Load, Create, Init, Service
• Already loaded: Service
Writing Servlets
• Install a web server capable of launching and managing servlet
programs.
• Install the javax.servlet package to enable programmers to write
servlets.
• Ensure CLASSPATH is changed to correctly reference the javax.servlet
package.
• Define a servlet by subclassing the HttpServlet class and adding any
necessary code to the doGet() and/or doPost() and if necessary the
init() functions.
Handlers

• Each HTTP Request type has a separate handler function.


• GET -> doGet(HttpServletRequest, HttpServletResponse)
• POST -> doPost(HttpServletRequest, HttpServletResponse)
• PUT -> doPut (HttpServletRequest, HttpServletResponse)
• DELETE -> doDelete (HttpServletRequest, HttpServletResponse)
• TRACE -> doTrace (HttpServletRequest, HttpServletResponse)
• OPTIONS -> doOptions (HttpServletRequest, HttpServletResponse)
Servlet Template
doGet and doPost
• The handler methods each take two parameters:
• HttpServletRequest: encapsulates all information regarding the browser
request.
• Form data, client host name, HTTP request headers.
• HttpServletResponse: encapsulates all information regarding the servlet
response.
• HTTP Return status, outgoing cookies, HTML response.
If you want the same servlet to handle both GET and POST, you can have doGet call
doPost or vice versa.
Single Threaded
• By default, uses shared threads
• Single instance of servlet shared by all requests
• One thread created for each request
• Class & instance variables are thread-unsafe; auto variables are thread-safe
• In some applications, you have to use single thread model, which
• Results in new servlet for each request
• Allows use of instance variables w/o synchronization
Env. Access in HttpServletRequest
• getContentLength() • getServletPath()
• getContentType() • getPathInfo()
• getProtocol() • getPathTranslated()
• getServerName() • getQueryString()
• getServerPort() • getRemoteUser()
• getRemoteAddr() • getAuthType()
• getRemoteHost() • getHeader("HdrStr")
• getMethod()
Parameter Access in HttpServletRequest

• getScheme • getHeader
• getInputStream • getIntHeader, getDateHeader
• getParameter • getSession
• getParameterValues • getRequestedSessionId
• getParameterNames, getReader • isRequestedSessionIdValid
• getCharacterEncoding • isRequestedSessionIdFromCookie
• getContentType • isRequestedSessionIdFromURL
• getCookies • getHeaderNames
• getRequestURI
HttpServletResponse Methods
• getOutputStream • sendError
• getWriter • sendRedirect
• getCharacterEncoding • setHeader
• setContentLength • setIntHeader, setDateHeader
• setContentType • setStatus
• addCookie • encodeURL,
• containsHeader • encodeRedirectURL
Session Tracking
• Many applications need to maintain state across a series of requests
from the same user (or originating from the same browser), e.g.,
• When clients at an on-line store add an item to their shopping cart, how does
the server know what’s already in the cart?
• When clients decide to proceed to checkout, how can the server determine
which previously created shopping cart is theirs?
• HTTP is a stateless protocol
• Each time a client talks to a web server, it opens a new connection
• Server does not automatically maintain the “conversational state” of a user
Session Tracking Mechanisms
• Three mechanisms of session tracking
• Cookies
• URL rewriting
• Hidden form fields
Cookies
• Cookie is a small amount of information sent by a servlet to a web
browser
• Saved by the browser, and later sent back to the server in subsequent
requests
• A cookie has a name, a single value, and optional attributes (name/value
pair)
• A cookie’s value can uniquely identify a client
• Server uses cookie’s value to extract information about the session
from some location on the server
Cookies and Servlets
• The HttpServletRequest class includes the “getCookies()” function.
• This returns an array of cookies, or null if there aren’t any.
• Cookies can then be accessed using three methods.
• String getName()
• String getValue()
• int getVersion()
Cookies and Servlets cont.
• Cookies can be created using
• HttpServletResponse.addCookie()
• Constructor
• new Cookie(String name, String value);
• Expiration can be set using
• setMaxAge(int seconds)
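The same round trip (the server sets a cookie with a lifetime, the browser sends it back) can be sketched outside the servlet API with Python's standard http.cookies module. This is an analogue for illustration, not the Java Cookie class; the cookie name and value are made up.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header (name and value are made up).
outgoing = SimpleCookie()
outgoing["sessionId"] = "abc123"
outgoing["sessionId"]["max-age"] = 3600   # lifetime in seconds, like setMaxAge
outgoing["sessionId"]["path"] = "/"
header = outgoing.output(header="Set-Cookie:")
print(header)

# Later request: the browser sends the pair back in a Cookie header.
incoming = SimpleCookie()
incoming.load("sessionId=abc123")
print(incoming["sessionId"].value)
```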
Cookie Servlet
Cookie Servlet cont.
Cookies as Session Tracking Mechanism
• Advantage
• Very easy to implement
• Highly customizable
• Persist across browser shut-downs
• Disadvantage
• Users may turn off cookies for privacy or security reasons
• Not quite universal browser support
Hidden Form Fields
• Hidden form fields are not displayed in the browser, but are sent
back to the server when the form is submitted
<INPUT TYPE="HIDDEN" NAME="session" VALUE="...">
• Fields can have identification (session id) or just something to
remember
• Servlet reads the fields using
request.getParameter()
Hidden Form Fields as Session Tracking
• Advantages
• Universally supported
• Allow anonymous users
• Disadvantages
• Only works for a sequence of dynamically generated forms
• Breaks down with static documents, emailed documents, bookmarked
documents
• Cannot support browser shutdown
Steps for Session Tracking
• Programmers have to do the following steps in order to use the
aforementioned tracking mechanisms:
• Generating and maintaining a session id for each session
• Passing session id to client via either cookie or URL
• Extracting session id either from cookie or URL
• Creating and maintaining a hashtable in which session id and session
information are stored
• Coming up with a scheme in which session information can be added or
removed
• These mechanisms can pass “session id”, but
• do not provide high-level programming APIs
• do not provide a framework for managing sessions
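The manual steps listed above could look roughly like the following sketch. All names (sessions, create_session, get_session, end_session) are hypothetical, and a real application would also need expiry and thread safety.

```python
import secrets

# In-memory table: session id -> session information (a plain dict per session).
sessions = {}

def create_session():
    """Generate a hard-to-guess session id and register an empty session."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = {}
    return session_id

def get_session(session_id):
    """Look up a session by the id extracted from a cookie or URL; None if unknown."""
    return sessions.get(session_id)

def end_session(session_id):
    """Remove the session and everything stored in it."""
    sessions.pop(session_id, None)

sid = create_session()                 # this id would be passed to the client
get_session(sid)["cart"] = ["book"]    # add session information
print(get_session(sid))
end_session(sid)
print(get_session(sid))                # the session is gone after removal
```

This bookkeeping is what a servlet container takes over when the higher-level session API is used.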
Session Tracking for Servlets
• Provides higher-level API for session tracking
• Built on top of cookie or URL rewriting
• Servlet container maintains
• Internal hashtable of session ids
• Session information in the form of HttpSession
• Provides a simple API for adding and removing session information
(attributes) to HttpSession
• Could automatically switch to URL rewriting if cookies are unsupported or
explicitly disabled
HttpSession
• To get a user’s existing or new session object:
• HttpSession session = request.getSession(true)
• flag = true to create a new session if none exists
• HttpSession is a java interface containing methods to
• View and manipulate information about a session, such as the session identifier, creation
time, and last accessed time
• Bind objects to sessions, allowing user information to persist across multiple user connections
• To store and retrieve attributes
• session.setAttribute("cartItem", cart)
• session.getAttribute("cartItem")
• All session data are kept on the server
• Only session ID sent to client
Sample HTTP Session
Sample HTTP Session cont.
Session Timeout
• Used when an end-user can leave the browser without actively
closing a session
• Sessions usually time out after 30 minutes of inactivity
• Product specific
• A different timeout may be set
• getMaxInactiveInterval()
• setMaxInactiveInterval()
Issues with “Stale” Session Objects
• The number of “stale” session objects waiting to be timed out
could be large and affect system performance, for example,
• 1000 concurrent users with an average session time of 2 minutes, thus 15000 sessions
during a timeout period of 30 minutes
• 4K bytes of data per session
• 15000 sessions * 4K = 60M bytes of session data – just for one application
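The estimate can be reproduced step by step. The variable names below are chosen for this sketch, and the reading of the first figure as 1000 users active at any moment is an assumption about what the slide intends.

```python
concurrent_users = 1000   # users active at any moment (assumed reading of the slide)
session_minutes = 2       # average time a user stays active
timeout_minutes = 30      # idle time before a session is discarded
kb_per_session = 4        # session data per user, in KB

# Each 2-minute slot brings a new batch of users whose sessions linger until timeout.
batches = timeout_minutes // session_minutes
live_sessions = concurrent_users * batches
total_kb = live_sessions * kb_per_session

print(live_sessions)      # sessions held in memory
print(total_kb / 1000)    # approximate size in MB, taking 1 MB = 1000 KB
```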
Session Invalidation
• Can be used by servlet programmer to end a session proactively by
calling invalidate()
• When a user at the browser clicks on “logout” button
• When business logic ends a session
• Caution: a session object could be shared by multiple servlets/JSP
pages and invalidating it could destroy data that other servlets/JSP
pages are using
HttpSession Methods
• Object getAttribute(String) – Value for the given name
• Enumeration getAttributeNames() - All the names of all attributes in
the session
• long getCreationTime() - Time at which this session was created
• String getId() - Identifier assigned to this session
• long getLastAccessedTime() - Last time the client sent a request
carrying the identifier assigned to the session
• int getMaxInactiveInterval() - Max time (in seconds) between
requests that the session will be kept
HttpSession Methods cont.
• ServletContext getServletContext() - ServletContext for session
• void invalidate() - Invalidates the session
• boolean isNew() - true if it has been created by the server (client has
not yet acknowledged joining the session)
• void setAttribute(String, Object) - Sets the value for the given name
• void removeAttribute(String) - Removes the value for the given name
• void setMaxInactiveInterval(int) - Sets the maximum interval
between requests
Web Technologies

Lecture 6
State preservation
Motivation
• How to keep user data while navigating on a
website?
– Authenticate only once
– Store wish list or shopping cart items while
browsing an online shop
– Remember user preferences when displaying a
page
Stateless vs. stateful
• State – a set of conditions at a moment of time
– Computers are inherently stateful in operation
• Describe whether or not a computer is designed
to note and remember one or more preceding
events in a sequence of interactions
• Stateful means that a computer keeps track of
the state of interaction
• Stateless means that no record of previous
interactions is kept and that each interaction
request is handled solely based on information
that comes with it
Sessions
• A semi-permanent interactive information
interchange
• Set up or established at a certain point in time
• Basic requirement to perform connection-
oriented communication
• Enables stateful communication
Stateless protocol
• Protocol that treats each request as an
independent transaction
• Communication consists of independent
request-response pairs
• It does not require the server to retain session
information
• Examples
– IP
– HTTP
Stateful protocol
• Requires keeping the internal state on the
server
• Examples
– FTP
• During a session the user provides authentication
details and sets various variables
• All details are stored on the server as part of the user
state
Pros and cons
Advantages of stateless communication
– Simplifies the server design
– No need to dynamically allocate storage
– If client dies in mid-connection no need to clean
up the state
However
– Requires additional information in every request
– The information needs to be processed on the
server
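The trade-off can be sketched with two versions of the same operation. This is a minimal illustration, not a real protocol: `statefulAdd`, `statelessAdd`, and the request shape are hypothetical names chosen for the example.

```javascript
// Stateful: the server remembers each client's running total between calls.
const totalsByClient = new Map();
function statefulAdd(clientId, amount) {
  const total = (totalsByClient.get(clientId) || 0) + amount;
  totalsByClient.set(clientId, total); // state must be stored (and cleaned up later)
  return total;
}

// Stateless: every request carries everything the server needs.
function statelessAdd(request) {
  return request.previousTotal + request.amount; // nothing is kept on the server
}

statefulAdd("alice", 5);
const statefulResult = statefulAdd("alice", 7);
const statelessResult = statelessAdd({ previousTotal: 5, amount: 7 });
```

Both calls produce the same total, but only the stateful version leaves an entry behind in `totalsByClient` if the client disappears.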
Stateful HTTP
• Keep information between different requests
• Useful in many cases
– Stores user information when navigating a website
• Authentication credentials
• Shopping cart items
• Search preferences
• HTTP is stateless → need artificial constructs
– Hidden form variables
– HTTP Cookies
– Web Storage (HTML 5)
– Server side session variables
– URL rewriting using URI-encoded parameters
Client side web sessions
• State information is kept on the client
• Approaches
– Hidden variables
<input type="hidden" name="userName" value="John Doe">
– Cookies
• Format: cookieName=cookieValue
• Handled using Javascript
document.cookie = "username=John Doe; expires=Wed, 18 Dec 2013 12:00:00 UTC";
Data flow
1. Server sends current state to client
2. Client stores state in a cookie
– In memory
– On disk
3. For each successive request client sends
cookie information to server
4. Server uses cookie data to remember the
state of the web application
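The four steps can be traced with plain string handling. This is a simplified sketch: the helper names are made up for the example, and real browsers and servers handle many more cookie attributes than shown here.

```javascript
// Step 1 (server): build a Set-Cookie header value from the current state.
function buildSetCookie(name, value) {
  return name + "=" + encodeURIComponent(value) + "; Path=/; HttpOnly";
}

// Steps 2-3 (client): store the name=value part and send it back in a Cookie header.
function toCookieHeader(setCookieValue) {
  return setCookieValue.split(";")[0];
}

// Step 4 (server): parse the Cookie header of a later request into an object.
function parseCookieHeader(header) {
  const cookies = {};
  header.split(";").forEach(function (pair) {
    const index = pair.indexOf("=");
    if (index === -1) return; // skip malformed parts
    const key = pair.slice(0, index).trim();
    cookies[key] = decodeURIComponent(pair.slice(index + 1).trim());
  });
  return cookies;
}

const setCookie = buildSetCookie("username", "John Doe");
const cookieHeader = toCookieHeader(setCookie);
const state = parseCookieHeader(cookieHeader);
```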
Client side issues
• Prone to tampering by the user or by locally installed software
• When confidentiality and integrity are required
– Only the server must be able to interpret the data
– Only the server should manipulate data
– Only the server should initiate valid sessions
– Encryption is required
• Cookies should be small to avoid communication overhead
– Data compression may be needed for large session data
• Logout not fully implemented
– Clients can drop cookies but data can be resent by the server
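One common way to detect tampering is to sign the cookie value on the server. The sketch below uses Node's built-in `crypto` module; the secret and the `value.signature` layout are assumptions made for the example. Signing protects integrity only, so confidential data would still need encryption.

```javascript
const crypto = require("crypto");

// The secret never leaves the server; this value is a placeholder.
const SECRET = "replace-with-a-long-random-secret";

function sign(value) {
  const mac = crypto.createHmac("sha256", SECRET).update(value).digest("hex");
  return value + "." + mac;
}

// Returns the original value, or null if the cookie was modified.
function verify(signedValue) {
  const index = signedValue.lastIndexOf(".");
  if (index === -1) return null;
  const value = signedValue.slice(0, index);
  const expected = Buffer.from(sign(value));
  const actual = Buffer.from(signedValue);
  // timingSafeEqual requires equal lengths, so check that first.
  if (expected.length !== actual.length) return null;
  return crypto.timingSafeEqual(expected, actual) ? value : null;
}

const cookie = sign("userId=42");
const tampered = cookie.replace("42", "43"); // simulate a user editing the cookie
```

Calling `verify(cookie)` gives back the value, while `verify(tampered)` returns null because the signature no longer matches.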
Web storage
• Alternative to cookies
• Implemented in HTML 5
• Advantages
– Security
– Can store more data than a cookie (typically at least 5 MB per origin, versus about 4 KB per cookie)
– Information is never transferred to the server
– Local storage is per origin
• All pages from one origin can store and access the same
data
Using web storage
• localStorage object
localStorage.setItem("lastname", "Smith");
var name = localStorage.getItem("lastname");
localStorage.removeItem("lastname");
• sessionStorage object
– Similar methods to localStorage
– It keeps data only for the current session
• If the tab is closed data is lost
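Web storage only holds strings, so structured data such as a shopping cart is usually serialized as JSON. The sketch below assumes a small in-memory stand-in (`createMemoryStorage`, a made-up helper) so it also runs outside a browser; in a page, `window.localStorage` or `window.sessionStorage` would be passed instead.

```javascript
// In-memory stand-in exposing the same method names as the Web Storage API.
function createMemoryStorage() {
  const items = new Map();
  return {
    setItem: function (key, value) { items.set(key, String(value)); },
    getItem: function (key) { return items.has(key) ? items.get(key) : null; },
    removeItem: function (key) { items.delete(key); }
  };
}

// Objects are converted to a JSON string before being stored.
function saveCart(storage, cart) {
  storage.setItem("cart", JSON.stringify(cart));
}

// getItem returns null for a missing key, which is treated as an empty cart.
function loadCart(storage) {
  const raw = storage.getItem("cart");
  return raw === null ? [] : JSON.parse(raw);
}

const storage = createMemoryStorage();
saveCart(storage, [{ id: 1, qty: 2 }]);
const cart = loadCart(storage);
```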
Server side web sessions
• Full control of the session
– Can terminate a session on demand
• Existing frameworks can reduce the amount of
code to handle sessions
– Apache Shiro
• Can handle larger data than a cookie
• Only reference to session ID is sent over HTTP as
a cookie
• Implementation can change independently of the
client
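A server-side session store can be sketched as a map from session ID to data. This is a simplified in-memory version using Node's `crypto` module; the function names and the 30-minute limit are assumptions for the example, and a production system would typically keep the sessions in a database or cache.

```javascript
const crypto = require("crypto");

// Session ID -> { data, lastAccess }. Only the ID is sent to the client as a cookie.
const sessions = new Map();
const MAX_INACTIVE_MS = 30 * 60 * 1000; // 30 minutes, chosen for illustration

function createSession(now) {
  const id = crypto.randomBytes(16).toString("hex"); // hard-to-guess identifier
  sessions.set(id, { data: {}, lastAccess: now });
  return id;
}

// Returns the session data, or null if the ID is unknown or the session timed out.
function getSession(id, now) {
  const session = sessions.get(id);
  if (!session) return null;
  if (now - session.lastAccess > MAX_INACTIVE_MS) {
    sessions.delete(id); // drop the stale session
    return null;
  }
  session.lastAccess = now;
  return session.data;
}

// The server can end a session on demand, e.g. on logout.
function invalidateSession(id) {
  sessions.delete(id);
}

const start = Date.now();
const sessionId = createSession(start);
getSession(sessionId, start).cart = ["book"];
```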
Server side issues
• More points of failures
– If DB is down sessions cannot be created,
updated, or validated
• More overhead in handling sessions
– Requires asynchronous DB write
• Web applications can only verify a session by
communicating with the server
What’s next?
• AJAX
– Synchronous vs. asynchronous
• JQUERY
• Server side programming
• Web services
• Cloud computing
Web Technologies

Lecture 7
Synchronous vs. asynchronous
Motivation
• “If the web is the human body, Javascript tags are
like its nerve endings” - www.krux.com
– Means through which pages sense, respond, execute,
measure, and remember
• Problem
– Tag activity slows down pages
• 0.1s delay can translate into a 1% drop in customer activity
(Amazon)
• 30% variance in page views based on load time (AOL)
• Load times are increasingly important in Google and Bing
search rankings
Javascript is synchronous
• Javascript is the #1 cause of slow web pages
– Blocking behavior
– When it loads nothing else happens
• Browsers render page elements synchronously
– One element cannot load until the one before it
has
• Solution
– Load elements in a non blocking manner
• Asynchronously
Example

http://www.krux.com/blog/krux-engineers/synchronous-versus-asynchronous-tags-whats-the-big-deal/
Why is JS not asynchronous?
• document.write construct
– Inserts something into a web page
• Text, tag, etc.
– It expects to alter the page inline, inserting
content as the page is loaded
• Messes the content if loaded asynchronously
• Why do we still use it?
– We cannot get rid of it until the entire stack can
guarantee that document.write will not be used
Solutions
• Element creation
– document.createElement('div')
• InnerHTML
– element.innerHTML = 'some content'
• Libraries (JQuery library)
• Iframes
– Work only if access to top level page is not needed
• HTML 5 attributes for the script element
– Only work if the script does not use document.write
Load scripts asynchronously
• Do not wait for scripts to render page
• Solution
– New HTML 5 script attributes
• defer: script downloads in parallel and runs after the page has been parsed
• async: script downloads in parallel and runs as soon as it is available
• Example
<script type="text/javascript" src="busy.js" async></script>
Example
• Javascript blocking code
var startNow = new Date();
var pauseFor = 5000; // In milliseconds
while (new Date() - startNow < pauseFor) ;
• HTML code
<html>
<head>
<script type="text/javascript" src="busy.js"></script>
<script type="text/javascript">alert("Time's up"); </script>
</head>
</html>

Outcome
– The alert pops out after the 5,000ms timeout elapses
Example
• Javascript blocking code
var startNow = new Date();
var pauseFor = 5000; // In milliseconds
while (new Date() - startNow < pauseFor) ;
• HTML code
<html>
<head>
<script type="text/javascript" src="busy.js" async></script>
<script type="text/javascript">alert("Time's up"); </script>
</head>
</html>

Outcome
– The alert pops out before the 5,000ms timeout elapses
Callbacks
• JS functions are first class objects
– They are objects (although typeof reports "function")
– They can be stored in variables, passed as arguments,
returned from functions, etc.
• Essence of callbacks
– Pass a function as an argument to another function
and later execute that passed-in function or even
return it to be executed later
• Most widely used functional programming
technique in JavaScript
Callbacks
• When passing a callback as argument we do not call
the function
var data = getData(function (data) {
  alert("We have " + data);
});

• The function will be called somewhere in the body of
the function that receives it
function getData(callback) {
var data = … //read the data
callback(data);
return data;
}
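A complete, runnable variant of the same pattern is sketched below. The hard-coded array stands in for data read from some source and is purely illustrative.

```javascript
// getData receives a function and calls it once the data is available.
function getData(callback) {
  const data = ["tweet 1", "tweet 2"]; // placeholder for data read from a source
  callback(data);                      // the callback is invoked here, not by the caller
  return data;
}

const messages = [];
// We pass the function itself: there are no parentheses after it.
getData(function (data) {
  messages.push("We have " + data.length + " items");
});
```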
Asynchronous data transfer
• Load data without blocking the interface
– Facebook comments
– Twitter tweets
– Youtube movies
• Solution
– Use XMLHttpRequest object
• Supported by: Chrome, IE7+, Firefox, Safari, and Opera
• AJAX (Asynchronous Javascript and XML)
– XHTML + CSS + DOM + XML + XMLHttpRequest
Example
function loadXMLAsync(filePath) {
  var req = new XMLHttpRequest();
  req.open("GET", filePath, true); // true = asynchronous
  // Register the handler before sending the request
  req.onreadystatechange = function () {
    if (req.readyState == 4) { // 4 = request complete
      if (req.status == 200) {
        var xml = req.responseXML;
        processXML(xml);
      } else {
        alert("Error loading XML");
      }
    }
  };
  req.send(null);
}
Notes
– For local access use req.status == 0 instead of the HTTP code 200
– State: 0 – uninitialized, 1 – loading, 2 – loaded, 3 – interactive, 4 – complete
Processing incoming XMLs
• DOM
– createElement
– appendChild

function processXML(xml) {
  var imgs = xml.getElementsByTagName("image");
  var body = document.getElementsByTagName("body");
  for (var i = 0; i < imgs.length; i++) {
    var image = document.createElement("img");
    image.src = imgs[i].firstChild.nodeValue; // set src attribute
    body[0].appendChild(image);
  }
}
Older IE browsers
• < IE7
• Use ActiveXObject instead

var req;
if (window.XMLHttpRequest) {
  req = new XMLHttpRequest();
  req.overrideMimeType("text/xml");
} else if (window.ActiveXObject) { // try to get the most modern implementation
  var list = ["Microsoft.XmlHttp", "MSXML2.XmlHttp", "MSXML2.XmlHttp.3.0",
              "MSXML2.XmlHttp.4.0", "MSXML2.XmlHttp.5.0"];
  var ok = false;
  var i = list.length - 1; // start with the newest version
  while (i >= 0 && ok == false) {
    try {
      req = new ActiveXObject(list[i]);
      ok = true;
    } catch (e) {}
    i--;
  }
}
Async vs. sync API
Rule of thumb

• If API
– requires IO, or
– heavy processing
(>15ms) expose it asynchronously from the start
Building asynchronous APIs
• Design APIs to be asynchronous from the start

var data = getData();
alert("We have " + data);
• Freezes the user interface until data is fetched
getData(function (data) {
  alert("We have " + data);
});
• Design the API asynchronously from the start, whether or not
the app needs it to be async at a later stage
• Use callbacks
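The difference in behavior can be observed by logging the order of events. In this sketch `setTimeout` stands in for slow IO such as a network request; the 10 ms delay and the returned string are arbitrary placeholders.

```javascript
const log = [];

// Asynchronous API: returns immediately and delivers the result through a callback.
function getData(callback) {
  setTimeout(function () {
    callback("42 records"); // placeholder result
  }, 10);
}

log.push("request sent");
getData(function (data) {
  log.push("We have " + data); // runs later, once the timer fires
});
log.push("UI still responsive"); // runs before the callback
```

The third entry is appended only after the current code has finished, which is why the interface does not freeze while the data is being fetched.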
What’s next?
• JQUERY
• Server side programming
• Web services
• Cloud computing
Web Technologies

Lecture 8
JQuery
JQuery
• “A fast and concise JavaScript Library that
simplifies HTML document traversing, event
handling, animating, and Ajax interactions for
rapid web development.” - jQuery.com

• Write less do more


$("p.neat").addClass("anim").show("slow");
Advantages
• Ease of use
– Simpler syntax and fewer lines of code
• Large library set
– Many functions for HTML manipulation,
animation, event handling, AJAX, etc.
• Strong open source community
– Many libraries
– Efficiency and security
• Great documentation and tutorials
Disadvantages
• Functionality may be limited
– Despite the many libraries, depending on the website
complexity raw Javascript may still be required
• JQuery file is always required
– 25-100KB which need to be loaded with your website
• Strain on the client
• Strain on the server (if your website hosts it)
• Performance
– Vanilla Javascript is faster than JQuery
• Code is not always shorter
– $(this).attr("id") vs. this.id
DOM and JQuery
• Identification
– How do I obtain a reference to the node that I
want
• Traversal
– How do I move around the DOM tree
• Node Manipulation
– How do I get or set aspects of a DOM node
• Tree Manipulation
– How do I change the structure of the page
Terminology
• JQuery function
– Global jQuery object or the $ function depending
on the context
• JQuery object
– The object returned by the JQuery function that
often represents a group of elements
• Selected elements
– The DOM elements that you have selected, most
likely by some CSS selector
JQuery function
— The $ function always (even for ID selectors) returns an
array-like object called a JQuery object
Example:
document.getElementById("myid") == $("#myid")[0];
— The JQuery object wraps the originally selected DOM objects
— You can access the actual DOM object by accessing the
elements of the jQuery object
— $ adds extra functionality to DOM elements
— Passing an existing DOM object to $ will give it the jQuery
upgrade
Example:
var elem = document.getElementById("myelem");
elem = $(elem);
Node selectors
• Imported from CSS
• Patterns used to select the element(s) you want to handle
• http://www.w3schools.com/jquery/jquery_ref_selectors.asp
• http://api.jquery.com/category/selectors/

www.webstepbook.com/supplements-2ed/slides/ppt/22-jQuery1.pptx
Examples
Examples
Writing efficient selectors
• Use IDs if possible
– Fast access
– $("#myelement")
• Avoid selecting by class only
– $(".myclass")
– Inefficient in older browsers
• Keep it simple
– No more than 2 or 3 qualifiers unless you have a very complex HTML
– Example: $("p#intro em") instead of $("body #page:first-child article.main
p#intro em");
• JQuery works from last selector to first
– Retrieve best qualified selector first
– Example: $("em", $("p#intro")) instead of $("p#intro em")
Complex examples
• All p tags that have no children, but only if they don't
have a class of ignore
$("p:empty:not([class='ignore'])")
• Any p element with the text "REPLACE_ME" in it
$("p:contains('REPLACE_ME')")
• All div tags with a child that has a class of special
$("div > .special").parent()
• All heading elements (h1, h2, h3, h4, h5, h6)
$("h1,h2,h3,h4,h5,h6")
• Every other visible li
$("li:visible:even")
Handling events, CSS, animations
$(document).ready(function() {
$("button").click(function() {
$("p").css("background-color", "yellow");
});
});

$("button").click(function() {
$("div").animate( {
left: '250px',
opacity: '0.5',
height: '150px',
width: '150px'
});
});
What’s next?
• Server side programming
• Web services
• Cloud computing
Web Technologies

Lecture 8
Server side web
Client Side vs. Server Side Web

• Client-side code executes on the end-user's
computer, usually within a web browser.

• Server-side code executes on the web server,
usually within a web application environment,
which in turn generates HTML to be viewed in a
browser.
Note: material adapted from
– http://people.clarkson.edu/~bjukic/IS400/Lecture%2013.ppt
– www.massey.ac.nz/~nhreyes/.../Lectures/Lecture%206%20-%20CGI.ppt
Client-server interactions
Client or server side?
• Determining factors:
– Performance:
• Responsiveness, speed, reliability
• Ability to handle a large number of simultaneous
users
– Functionality:
• Simplicity of use and maintenance,
• Breadth of user options
• Ability to handle multiple simultaneous
transactions
– Security:
• Desktop security
• Server security
• Database security
• Network security
Examples

• Code that runs on the server that interprets every mouse
move and keystroke is clearly undesirable
– terminal to mainframe paradigm
• On the other hand, one does not want to download an entire
product database to a browser and then run code that
searches for the products.
• Server side forms have direct access to active code and
perform more reliably
– On the other hand they are more prone to slowdowns due to the
server/network congestion
Client side vs. server side strengths
• Client-side coding advantages stem from its location on the user
desktop and/or other end device. They include the following:
– Interactivity (e.g., mouse and keyboard handling)
– Handling of user interface controls: buttons, textboxes, etc.
– Feedback and validation

• Server-side strengths stem from their proximity to the
backend business databases and other applications. They include
the following:
– Direct information access, retrieval, processing and storage
• facilitate e-commerce, reservations, shipment tracking etc.
– Central repository of added web features such as e-mail, chat and
multimedia streaming
– Security and authentication (mostly)
Server Side Technologies
• Server-side technologies
– numerous and diverse
• Popular server side web application technologies:
– Microsoft ASP/.NET
– Java server technologies such as J2EE, JSP, and servlets
– CGI / Perl
– PHP
– ColdFusion
Server Side Technologies
• Server-side technologies also include database
systems
– Oracle, SQL Server (Microsoft), MySQL (open source)
and many others
• DB systems are an indispensable part of server side
operations
– some DB software providers, such as Oracle are
combining web application functionality with their
core database functions
Server Side Technologies
• The “core” server side application development
platforms can retrieve, modify and query the
contents of databases through their own access
mechanisms:
• ADO.NET for Microsoft’s .NET platform enables
access to almost every existing database platform
• PHP enables direct access to many existing DB
platforms
– MySQL, Oracle, SQL Server, MongoDB, etc.
Server-Side Programming

• Lots of programs/applications designed to
run on the machines on which they are
installed

• How can a remote client request access to
these?
Rich Internet Applications
• Web applications that provide the client with
the features and functionality of desktop
applications
• Require transferring the processing from the
server to the client
• Javascript is one enabling technology for a RIA
RIAs in the Internet client-server
model
[Figure: client (browser) and web server]
• Client sends an HTTP request for a resource
• Server sends code but keeps data
• Client does all the processing (e.g., play videos as they come in)
• Data (e.g., multimedia) stay on the server
Some technologies that support
RIA development
• Javascript (and associated ECMA dialects)
• Adobe Flash
– Flash player and IDE
• Java Applets and Java Webstart
• AJAX
– Asynchronous JavaScript and XML
CGI programming

• CGI → Common Gateway Interface
– A protocol for interfacing local applications with a web
server
• Sequence of steps
– Client sends URL request
– Program runs at the server side
– Output is collected and sent back to the client
– Often the output is an HTML page “built” by the server
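The sequence of steps can be sketched as a small CGI-style program. Under CGI the web server passes request details in environment variables such as QUERY_STRING and reads the program's standard output; the function name `handleCgiRequest` and the `name` parameter are made up for this example.

```javascript
// A CGI program writes a header block, a blank line, and then the body.
function handleCgiRequest(env) {
  const params = new URLSearchParams(env.QUERY_STRING || "");
  const name = params.get("name") || "world";
  // Escape user input before embedding it in HTML.
  const safeName = name.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  return "Content-Type: text/html\r\n\r\n" +
    "<html><body><h1>Hello, " + safeName + "!</h1></body></html>";
}

// When launched by a web server, process.env would hold QUERY_STRING.
const output = handleCgiRequest({ QUERY_STRING: "name=Ada" });
console.log(output);
```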
CGI programming
• CGI programs can be written in any language
supported by the server.

• This includes compiled programming
languages, such as C and C++; interpreted
languages, such as Perl, Python, and Ruby; and
languages, such as Java, that lie somewhere in
between.
What’s next?
• REST and SOAP Web services
• Cloud computing
Web Technologies

Lecture 10
Web services
Web services
• From W3C
– A software system designed to support interoperable
machine-to-machine interaction over a network
– It has an interface described in a machine-processable format
• Web Service Description Language (XML format)
– Other systems interact with the Web service in a manner
prescribed by its description using SOAP (Simple Object
Access Protocol) messages, typically conveyed using HTTP
with an XML serialization in conjunction with other Web-
related standards

Note:
https://docs.google.com/presentation/d/1xmDMmiFtSQG7Bu65g63r3HuEuyKw-wXDfzbrPDiZKd8/edit#slide=id.i0
Why web services?
• Expose your application data and services
• Bring together disparate data sources
• Standardized protocols
• Low cost of communication
• Loose coupling
Big web services

• Use XML messages that follow the SOAP
standard
• There is often a machine-readable
description of the operations offered by the
service written in the Web Services
Description Language (WSDL).
– WSDL is not a requirement, but can be used to
generate clients using various languages
including Java and .NET.
SOAP
• Simple Object Access Protocol
• Remote procedure calls
• The XML based protocol consists of three parts
– An envelope, which defines
what is in the message and
how to process it
– A set of encoding rules for
expressing instances of
application-defined datatypes
– A convention for representing
procedure calls and responses
SOAP example
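As an illustration of the envelope structure, the sketch below assembles a SOAP 1.1 request as a string. The operation `GetStockPrice`, its parameter, and the service namespace are hypothetical, and parameter values are assumed to be XML-safe.

```javascript
// Builds a SOAP 1.1 request envelope: Envelope > Body > operation element.
function buildSoapRequest(operation, params, serviceNamespace) {
  const body = Object.keys(params).map(function (key) {
    return "<m:" + key + ">" + params[key] + "</m:" + key + ">";
  }).join("");
  return '<?xml version="1.0"?>' +
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">' +
    "<soap:Body>" +
    "<m:" + operation + ' xmlns:m="' + serviceNamespace + '">' + body +
    "</m:" + operation + ">" +
    "</soap:Body>" +
    "</soap:Envelope>";
}

const envelope = buildSoapRequest("GetStockPrice", { StockName: "IBM" },
  "http://www.example.com/stock");
console.log(envelope);
```

Such a message is typically sent in the body of an HTTP POST, which is where the extra bandwidth compared with a plain resource request comes from.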
Pros and cons
Advantages
– strict type checking
– formal contract can be enforced
– development tool support

Disadvantages

– rigid schema
– complexity
– higher bandwidth due to envelope requirements
– custom protocol, black box messages
REST

• Representational State Transfer is a software
architectural style for distributed hypermedia
systems, such as the World Wide Web
• Introduced by Roy Fielding in his PhD
dissertation
– University of California, Irvine

Note: source wikipedia REST


Design principles
• Stateless Client/Server Protocol:
– Each message contains all the information needed by a receiver to understand and/or process it
– Keep things simple and avoid needless complexity
• A set of uniquely addressable resources enabled by a universal syntax for
resource identification
– Everything is a Resource in a RESTful system
• A set of well-defined operations that can be applied to all resources
– In the context of HTTP, the primary methods are POST, GET, PUT, and DELETE
– Similar (but not identical) to the database world's notion of CRUD (Create, Read, Update, Delete)
• The use of hypermedia both for application information and state transitions
– Resources are typically stored in a structured data format that supports hypermedia links, such as
HTML or XML
Note: While the Web is ONE embodiment of these principles, it is not the ONLY one
– You can create RESTful systems using other protocols, methods, resources, etc. but those new
entities must conform with these design principles
More on REST
Quotes from Roy T. Fielding:

• “… resources are just consistent mappings from an
identifier [such as a URL path] to some set of views on
server-side state”

• “If one view doesn’t suit your needs, then feel free to
create a different resource that provides a better view …”

• “These views need not have anything to do with how the
information is stored on the server … [They just need] to be
understandable (and actionable) by the recipient.”
And again REST
• The largest known implementation of a system
conforming to the REST architectural style is the
World Wide Web

• REST is more of an old philosophy than a new
technology
– SOAP looked to jump-start the next phase of internet
development with a host of new specifications
– REST espoused that existing principles and protocols
of the Web are enough to create robust Web services.
REST and URLs
• Every resource is URL-addressable
– Example:
/mycollection
/mycollection/{id}
/loans
/mycollection/movies
/loans/overdue

Note: courses.ischool.berkeley.edu/i290-rmm/s12/.../Lecture3%20REST.pdf
Changing state
• To change system state, simply change a
resource
– Example:
Within the /mycollection “bucket” you can:
– Create an item
– Update an item with new data
– Delete an item
REST operations
• GET – retrieve a copy of a resource
– Example:
GET /mycollection/{id}
GET /mycollection
• DELETE – remove a resource
– Example: DELETE /mycollection/{id}
• POST – create a resource
– Example: POST /mycollection
• PUT – create or update a resource
– Example: PUT /mycollection/{id}
– Amazon's S3 service uses PUT to create objects in buckets

POST vs. PUT
– HTTP/1.1 spec says that
• POST's URI identifies the resource that will process the enclosed data (e.g., a script or a collection)
• PUT's URI identifies the resource that is being created or replaced
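The four operations can be sketched against an in-memory collection. This is a toy dispatcher rather than a real server: `handle` is a made-up function that returns an HTTP-like status and body, and IDs are assumed to be numeric.

```javascript
// In-memory "/mycollection" resource.
const collection = new Map();
let nextId = 1;

function handle(method, path, body) {
  const match = /^\/mycollection(?:\/(\d+))?$/.exec(path);
  if (!match) return { status: 404, body: null };
  const id = match[1] ? Number(match[1]) : null;

  if (method === "GET" && id === null) return { status: 200, body: Array.from(collection.values()) };
  if (method === "POST" && id === null) { // create; the server picks the ID
    const item = Object.assign({ id: nextId++ }, body);
    collection.set(item.id, item);
    return { status: 201, body: item };
  }
  if (id === null) return { status: 405, body: null };
  if (method === "PUT") { // create or replace at a URI chosen by the client
    const existed = collection.has(id);
    collection.set(id, Object.assign({}, body, { id: id }));
    return { status: existed ? 200 : 201, body: collection.get(id) };
  }
  if (!collection.has(id)) return { status: 404, body: null };
  if (method === "GET") return { status: 200, body: collection.get(id) };
  if (method === "DELETE") { collection.delete(id); return { status: 200, body: null }; }
  return { status: 405, body: null };
}

const created = handle("POST", "/mycollection", { title: "Alien" });
const updated = handle("PUT", "/mycollection/1", { title: "Aliens" });
const removed = handle("DELETE", "/mycollection/1");
const missing = handle("GET", "/mycollection/1");
```

Note how POST targets the collection while PUT targets the item's own URI, which mirrors the distinction drawn above.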
Searches in REST
• There are no methods for resource discovery such as LIST
or FIND
– Instead, collections and search results are treated as
another type of resource with their own unique URLs

• Not REST defined, but:

– Search a term
GET /mycollection/?q=term
– Search for a keyword
GET /mycollection/?kw=2001%20A%20Space%20Odyssey
– Search for partial terms
GET /mycollection/?pt=2001
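Search terms must be percent-encoded before they are placed in the query string. A minimal sketch, assuming a helper named `buildSearchUrl` (hypothetical) and the parameter names used above:

```javascript
// The parameter names q, kw and pt are conventions of this API, not part of REST.
function buildSearchUrl(collectionPath, paramName, term) {
  return collectionPath + "/?" + paramName + "=" + encodeURIComponent(term);
}

const keywordUrl = buildSearchUrl("/mycollection", "kw", "2001 A Space Odyssey");
console.log(keywordUrl);
```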
Error codes
HTTP status codes returned in the response header:
– 200 OK
• The resource was read, updated, or deleted
– 201 Created
• The resource was created
– 400 Bad Request
• The data sent in the request was bad
– 403 Forbidden
• The Principal named in the request was not authorized to perform this action
– 404 Not Found
• The resource does not exist
– 409 Conflict
• A duplicate resource could not be created
– 500 Internal Server Error
• A service error occurred
Error responses
• Response in body when a 4xx or 5xx status is returned:
<error>
<code>{Mandatory code}</code>
<message>{Optional message}</message>
<resource-id>
{Resource ID, if available}
</resource-id>
<request-uri>
{URI of request}
</request-uri>
</error>
Transferring REST data
• Various formats can be used to transmit representations
of resources
– XML and JSON are generally the most common.

• JSON (JavaScript Object Notation) is a format specified
by Douglas Crockford that uses a subset of the
JavaScript syntax to represent data:
{
"name": "foo",
"description": "bar"
}
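In JavaScript the conversion in both directions is built in. A short sketch using the object above:

```javascript
const resource = { name: "foo", description: "bar" };

// Serialize for the response body, then parse it back as a client would.
const wireFormat = JSON.stringify(resource);
const received = JSON.parse(wireFormat);
console.log(wireFormat);
```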
REST example
http://www.example.com/api/stock/ibm [GET]

<stock>
<name>IBM</name>
<price>139.84</price>
</stock>

http://www.example.com/api/stock/ibm [PUT]

<stock>
<price>142.98</price>
</stock>
Pros and cons
Advantages

• multiple data formats
• simplicity - known operations
• lighter on bandwidth
• REST application security rules can be set up using HTTP
standards

Disadvantages

• no common standard format
• GET requests are limited in size
What’s next?
• Creating SOAP and REST web services
• Node.js
• Cloud computing
Web Technologies

Lecture 11
Implementing RESTful services
REST Services
• REpresentational State Transfer
• Stateless
• Standard design architecture for web APIs
– Allows methods to be exposed publicly over the Internet so that they
can be accessed and manipulated outside the program itself
• Web services
– Get data from an application without having to visit the
application itself (e.g., go to a particular website)
– Achieved through RESTful URIs
• Example: GET /mycollection/{id}
Building a RESTful service
• Use a single URL for the requests
– Avoids having multiple URLs
• increases maintainability
– Achieved through .htaccess rewrite rules
• Handle cross domain requests to the unique URL
– Receive HTTP requests
– Extract endpoint from URI
– Detect HTTP method (GET, POST, PUT, DELETE)
– Assemble additional data provided in the header or URI
– Pass information to proper method for processing
– Send back HTTP response
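The "extract endpoint" and "detect method" steps can be sketched as a small parser. The function name `parseRequest` and the endpoint/verb/args split are assumptions for this example: a second path segment is treated as a verb only when it is not numeric.

```javascript
// Splits a request path such as "files/process/23" into endpoint, verb and args.
function parseRequest(requestPath, httpMethod) {
  const args = requestPath.replace(/\/+$/, "").split("/"); // drop trailing slashes
  const endpoint = args.shift();
  let verb = "";
  if (args.length > 0 && !/^\d+$/.test(args[0])) {
    verb = args.shift();
  }
  return { method: httpMethod.toUpperCase(), endpoint: endpoint, verb: verb, args: args };
}

const parsed = parseRequest("files/process/23/", "get");
console.log(parsed);
```

The resulting object carries everything a dispatcher needs to pick the method that should process the request.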
Cross Origin Resource Sharing
• CORS W3C specification
• Enables cross domain communication
– The browser's same-origin policy prohibits this for Javascript by default
• See AJAX

• Client and server side
– Requires coordination between the two
• Built on top of XMLHttpRequest
• Server adds some headers allowing client to access its data

NOTE
– Not a substitute for sound security practices!
– It only allows cross domain access
CORS browser support

Image source: http://caniuse.com/#search=cors


CORS on the client side
1. Create CORS request 2. Send the CORS request

A valid CORS request always contains an Origin header


POST /receiver HTTP/1.1
Origin: http://www.issuer.com
The header is added by the browser, and cannot be controlled by the user
CORS on the server side
• Requires setting some response headers
– Access-Control-Allow-Origin: mandatory
– Access-Control-Allow-Methods: list of supported methods. Mandatory in preflight responses
– Access-Control-Allow-Credentials: for cookies. Works in conjunction with the withCredentials
property on the XMLHttpRequest 2 object
– Access-Control-Expose-Headers: allows access to various response headers
– Access-Control-Max-Age: allows preflight response to be cached
• Request headers sent by the browser in the preflight request
– Access-Control-Request-Method: the method of the actual request. Always present
– Access-Control-Request-Headers: list of request headers

header("Access-Control-Allow-Origin: *");
header("Access-Control-Allow-Methods: *");
header("Content-Type: application/json");

PUT and DELETE require a preflight request
– Extra communication with the server
– Asks permission to make the actual request
– Can be cached to avoid making it for every request

Source: http://www.html5rocks.com/en/tutorials/cors/
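Instead of allowing every origin with `*`, a server can echo back only the origins it trusts. The sketch below is one possible way to do this; the allow-list, the method list, and the function name `corsHeaders` are illustrative, and header names are assumed to arrive in lower case.

```javascript
const ALLOWED_ORIGINS = ["http://www.issuer.com"];
const ALLOWED_METHODS = "GET, POST, PUT, DELETE";

// Returns the CORS response headers for a request, or {} if the origin is not trusted.
function corsHeaders(requestMethod, requestHeaders) {
  const origin = requestHeaders["origin"];
  if (!origin || ALLOWED_ORIGINS.indexOf(origin) === -1) {
    return {}; // no CORS headers: the browser blocks the cross-origin read
  }
  const headers = { "Access-Control-Allow-Origin": origin, "Vary": "Origin" };
  // A preflight is an OPTIONS request that carries Access-Control-Request-Method.
  if (requestMethod === "OPTIONS" && requestHeaders["access-control-request-method"]) {
    headers["Access-Control-Allow-Methods"] = ALLOWED_METHODS;
    headers["Access-Control-Max-Age"] = "600"; // browser may cache the preflight for 10 minutes
  }
  return headers;
}

const preflight = corsHeaders("OPTIONS", {
  "origin": "http://www.issuer.com",
  "access-control-request-method": "PUT"
});
const rejected = corsHeaders("GET", { "origin": "http://evil.example" });
```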
The .htaccess file
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule api/v1/(.*)$ api/v1/api.php?request=$1 [QSA,NC,L]
</IfModule>

1. Check the existence of mod_rewrite
2. Activate rewrite engine
3. For any request that does not point to an existing file or directory forward it to api.php
by using the parameter request

QSA: the original query string is appended to the rewritten URI
NC: the pattern is matched case-insensitively
L: stop after processing this rule
API class
• PHP implementation
• JSON communication format
• Abstract class defining the basic wrapper for the
various endpoints the API will be using
• Concrete class implementing endpoint
functionality
• Enable secure & cross domain access
API abstract class
abstract class API {
    protected $method = '';       // PUT, GET, POST, DELETE
    protected $endpoint = '';     // e.g.: /files
    protected $verb = '';         // e.g.: /files/process
    protected $args = Array();    // e.g.: /files/process/23
    protected $file = Null;       // input of the PUT request. Can be JSON text
    protected $request = Array(); // cleaned request variables

    public function __construct($request) {
        header("Access-Control-Allow-Origin: *");
        header("Access-Control-Allow-Methods: *");
        header("Content-Type: application/json");
        // extract method, endpoint, verb, args
        $this->args = explode('/', rtrim($request, '/'));
        $this->endpoint = array_shift($this->args);
        if (array_key_exists(0, $this->args) && !is_numeric($this->args[0])) {
            $this->verb = array_shift($this->args);
        }
        $this->method = $_SERVER['REQUEST_METHOD'];
        // search for PUT and DELETE in HTTP_X_HTTP_METHOD
        if ($this->method == 'POST' && array_key_exists('HTTP_X_HTTP_METHOD', $_SERVER)) {
            if ($_SERVER['HTTP_X_HTTP_METHOD'] == 'DELETE') {
                $this->method = 'DELETE';
            } else if ($_SERVER['HTTP_X_HTTP_METHOD'] == 'PUT') {
                $this->method = 'PUT';
            } else {
                throw new Exception("Unexpected Header");
            }
        }
        switch ($this->method) {
            case 'DELETE':
            case 'POST': $this->request = $this->_cleanInputs($_POST); break;
            case 'GET':  $this->request = $this->_cleanInputs($_GET); break;
            case 'PUT':  $this->request = $this->_cleanInputs($_GET);
                         $this->file = file_get_contents("php://input"); break; // read raw request body
            default:     $this->_response('Invalid Method', 405); break;
        }
    }
    […] // see next slide
}
API abstract class
• Private methods
– _response
• Sets the response header and body
– _cleanInputs
• Stores the request variables in a key-value list
– _requestStatus
• Returns a message associated with a particular code
• Used by _response

• Public method
– processAPI
• Determine if the concrete class implements the endpoint method called by the client

public function processAPI() {
    if (method_exists($this, $this->endpoint)) {
        // call the endpoint method
        return $this->_response($this->{$this->endpoint}($this->args));
    }
    return $this->_response("No Endpoint: $this->endpoint", 404);
}
API concrete class
• CORS opens up a huge security vulnerability
– Must ensure that only certain clients with a unique key can access the API

require_once 'API.class.php';
class MyAPI extends API {
    protected $token;

    public function __construct($request, $origin) {
        parent::__construct($request);
        // APIKey and User are application-specific helper classes (assumed here, not shown)
        $APIKey = new APIKey();
        $User = new User();
        if (!array_key_exists('apiKey', $this->request)) {
            throw new Exception('No API Key provided');
        } else if (!$APIKey->verifyKey($this->request['apiKey'], $origin)) {
            throw new Exception('Invalid API Key');
        } else if (!array_key_exists('token', $this->request)) {
            throw new Exception('No User Token provided');
        }
        $this->token = $this->request['token'];
    }

    // Example of an Endpoint. Called by processAPI (see previous slide)
    protected function returnToken() {
        if ($this->method == 'GET') {
            return "Your token is " . $this->token;
        } else {
            return "Only accepts GET requests";
        }
    }
}
Using the API
// Requests from the same server don't have a HTTP_ORIGIN header
if (!array_key_exists('HTTP_ORIGIN', $_SERVER)) {
$_SERVER['HTTP_ORIGIN'] = $_SERVER['SERVER_NAME'];
}
try {
$API = new MyAPI($_REQUEST['request'], $_SERVER['HTTP_ORIGIN']);
echo $API->processAPI();
} catch (Exception $e) {
echo json_encode(Array('error' => $e->getMessage()));
}

Note: http://coreymaynard.com/blog/creating-a-restful-api-with-php/
What’s next?
• Node.js
• Cloud computing
Web Technologies

Lecture 11
Microservices
Microservices
• How is being a microservice different than
simply being RESTful?
• What are the advantages of a microservice
backend architecture over a monolithic
architecture?
• Next time: what additional infrastructure is
required to realize these advantages?
Old Backend
Backend Development
• In the beginning, you wrote whatever you
wanted using whatever language you wanted
and whatever framework you wanted
• Then... PHP and ASP
– Languages “designed” for writing backends
– Encouraged spaghetti code
– A lot of the web was built on this
• New languages come onto the scene
– Ruby, Python, JSP
Monolithic backend
Microservice backend
RESTful APIs
• Recall guidelines for RESTful APIs from previous
lectures: Handling HTTP Requests
• Support scaling
– Use HTTP actions to support intermediaries (caches)
• Support change
– Leave anything that might change out of the URI
– Ensure any URI changes are backwards compatible
• Support reuse
– Design URIs around resources that are expressive
abstractions that support a range of client interactions
– Resources are nouns; use HTTP actions to signal verbs
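A minimal sketch of "resources are nouns, HTTP actions are verbs". The describeAction helper, the /animals/42 URI and the wording of each action are hypothetical and only illustrate the guideline; the point is that the URI stays the same while the method changes.

```php
<?php
// Hypothetical helper: maps an HTTP method on a resource URI to the action it signals.
function describeAction(string $method, string $uri): string
{
    // The URI names the resource (a noun); the method carries the verb.
    $verbs = [
        'GET'    => 'read',
        'PUT'    => 'replace',
        'DELETE' => 'remove',
        'POST'   => 'create a new entry under',
    ];
    $method = strtoupper($method);
    if (!array_key_exists($method, $verbs)) {
        return '405 Method Not Allowed';
    }
    return $verbs[$method] . ' ' . $uri;
}

echo describeAction('GET', '/animals/42'), "\n";
echo describeAction('DELETE', '/animals/42'), "\n";
```

Keeping verbs such as "get" or "delete" out of the URI also supports change: the same URI keeps working when new methods are supported later.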
Microservices vs. Monoliths
• Advantages of microservices over monoliths include
– Support for scaling
• Scale individual services independently rather than replicating the whole application
– Support for change
• Support hot deployment of updates
– Support for reuse
• Use same web service in multiple apps
• Swap out internally developed web service for externally
developed web service
– Support for separate team development
• Pick boundaries that match team responsibilities
– Support for failure
Support for scaling
Support for scaling II
Support for scaling III
• This is called the monolithic app
• If we need 100 servers, each server will have to
run EACH module
• What if we need more of some modules than
others?
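A small worked example of the question above. The module names and the number of copies each one needs are made-up figures, used only to show the arithmetic.

```php
<?php
// Hypothetical demand: how many running copies each module needs at peak.
$copiesNeeded = ['search' => 100, 'checkout' => 10, 'profile' => 5];

// Monolith: every server runs every module, so the busiest module sets the server count.
$monolithServers = max($copiesNeeded);
$monolithCopies  = $monolithServers * count($copiesNeeded);

// Microservices: each module is deployed only as many times as it needs.
$microserviceCopies = array_sum($copiesNeeded);

echo "Monolith: $monolithServers servers, $monolithCopies module copies\n";
echo "Microservices: $microserviceCopies module copies\n";
```

Under these assumptions the monolith runs 300 module copies on 100 servers, while the microservice deployment runs 115 copies in total.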
Microservice
Conway’s Law
• The structure of an organization mirrors the structure
of its product
• Building a car
– Tires team
– Drivetrain team
– Seating team
– Paint team
– ….
• Could pick a product structure and design a team
around it
• Or could pick a desired team structure and design a
product around it
Organization in a monolith
Organization around business
capabilities in microservices
Size of a microservice
• Components are independently replaceable
• Components are independently updatable
• This means that they can also be
independently developed, tested, etc.
• Components can be built as:
– Library (e.g. module)
– Service (e.g. web service)
Goals of microservices
• Add them independently
• Upgrade them independently
• Reuse them independently
• Develop them independently
• Have ZERO coupling between microservices,
aside from their shared interface
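A minimal sketch of coupling through a shared interface only. RecommendationService, both implementations and renderHomePage are hypothetical names, and the returned titles are stub data; a real external implementation would call a remote HTTP API.

```php
<?php
// Hypothetical shared interface: the only thing callers know about the service.
interface RecommendationService
{
    /** @return string[] titles recommended for the given user */
    public function recommend(int $userId): array;
}

// Internally developed implementation (stub data for illustration).
class InHouseRecommendations implements RecommendationService
{
    public function recommend(int $userId): array
    {
        return ['Title A', 'Title B'];
    }
}

// Externally developed replacement (stub data for illustration).
class ExternalRecommendations implements RecommendationService
{
    public function recommend(int $userId): array
    {
        return ['Title C'];
    }
}

// The caller depends on the interface only, so either implementation can be swapped in.
function renderHomePage(RecommendationService $service, int $userId): string
{
    return 'Recommended: ' . implode(', ', $service->recommend($userId));
}

echo renderHomePage(new InHouseRecommendations(), 7), "\n";
echo renderHomePage(new ExternalRecommendations(), 7), "\n";
```

Because renderHomePage never names a concrete class, either service can be upgraded or replaced without touching the caller.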
Design for Failure
• Each of the many microservices might fail
– Services might have bugs
– Services might be slow to respond
– Entire servers might go down
• If I have 60,000 hard disks, 3 fail a day
– The more microservices there are, the higher the
likelihood that at least one is currently failing
• Key: design every service assuming that at some point,
everything it depends on might disappear - must fail
“gracefully”
• Netflix simulates this constantly with “Chaos Monkey”
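One way to fail gracefully is to wrap every call to a dependency and fall back to a default when it fails. The sketch below assumes a hypothetical callWithFallback helper and simulates the failing service with a closure that throws.

```php
<?php
// Hypothetical wrapper: call a dependency, but return a default if it fails.
function callWithFallback(callable $dependency, $fallback)
{
    try {
        return $dependency();
    } catch (Exception $e) {
        // The dependency is down or misbehaving: degrade instead of propagating the failure.
        return $fallback;
    }
}

// Simulated dependency that is currently failing.
$recommendations = function () {
    throw new RuntimeException('recommendation service unavailable');
};

// The page still renders, just without personalised content.
$items = callWithFallback($recommendations, []);
echo count($items) === 0 ? "Showing default page\n" : "Showing recommendations\n";
```

A real client would probably also set a timeout on the call, so that a slow service is treated the same way as a failed one.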
Support for failure
• Goal: Support graceful degradation when
services fail
• Design for idempotency
– Should be able to retry requests without
introducing bad data
• Design for data locality
– Transactions across microservices are hard to
manage
• Design for eventual consistency
Design for idempotency
• Want to design APIs so that executing an
action multiple times leads to the same
resulting state
• Prefer state changes on existing entity rather
than creating new entities
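A minimal sketch contrasting an idempotent operation with a non-idempotent one. The in-memory arrays stand in for a service's datastore, and setQuantity and appendItem are hypothetical names.

```php
<?php
// Hypothetical in-memory cart, standing in for a service's datastore.
$cart = [];

// Idempotent: sets the state of an existing entity, so a retry changes nothing.
function setQuantity(array &$cart, string $item, int $quantity): void
{
    $cart[$item] = $quantity;
}

// Not idempotent: every call creates a new entry, so a retry duplicates data.
function appendItem(array &$log, string $item): void
{
    $log[] = $item;
}

setQuantity($cart, 'book', 2);
setQuantity($cart, 'book', 2); // retried request: same resulting state

$log = [];
appendItem($log, 'book');
appendItem($log, 'book'); // retried request: duplicate entry

echo count($cart), ' cart entry, ', count($log), " log entries\n";
```

After the retries, the cart still holds one entry while the log holds two, which is why the state-change style is safer to retry.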
Design for data locality
• If datastore server fails or is slow, do not want
entire site to go down
• Decentralized implementation decisions
• Allow each service to manage data in the way
that makes the most sense for that service
• Rule: Services exchange data ONLY through
their exposed APIs – NO shared databases
Consistency
• Updates are sent via HTTP requests
• No guarantee that those updates occur
immediately
• Instead, guarantee that they occur eventually
• Can force some ordering, but that’s expensive
Maintaining Consistency
• Core problem: different services may respond
to requests at different times.
– What if a request results in change to resource in
one service, but other service has not yet
processed corresponding request?
– May end up with different states in different
resources
– Logic needs to be written to correctly handle such
situations
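One possible shape for such logic, as a sketch only: every update carries a version number, and a service ignores any update older than the state it already holds. The version field, the applyUpdate helper and the order statuses are assumptions of this example, not part of the lecture.

```php
<?php
// Hypothetical replica of a resource held by one service.
// Each update carries a version number (an assumption of this sketch).
function applyUpdate(array $state, array $update): array
{
    // Ignore updates that are not newer than what this service already holds.
    if ($update['version'] <= $state['version']) {
        return $state;
    }
    return $update;
}

$orders   = ['version' => 0, 'status' => 'new'];
$shipping = ['version' => 0, 'status' => 'new'];

$paid    = ['version' => 1, 'status' => 'paid'];
$shipped = ['version' => 2, 'status' => 'shipped'];

// The orders service sees the updates in order.
$orders = applyUpdate(applyUpdate($orders, $paid), $shipped);

// The shipping service sees them out of order; the late, older update is dropped.
$shipping = applyUpdate($shipping, $shipped);
$shipping = applyUpdate($shipping, $paid);

var_dump($orders === $shipping);
```

The two services can disagree while requests are still in flight, but they converge to the same state once both have processed every update.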
