
Distributed Document-Based Systems

Chapter 9

The World Wide Web

Overall organization of the Web.

Document Model (1)


<HTML>                                      <!-- Start of HTML document -->
<BODY>                                      <!-- Start of the main body -->
<H1>Hello World</H1>                        <!-- Basic text to be displayed -->
<P>                                         <!-- Start of a new paragraph -->
<SCRIPT type = "text/javascript">           <!-- Identify scripting language -->
document.writeln ("<H1>Hello World</H1>");  // Write a line of text
</SCRIPT>                                   <!-- End of scripting section -->
</P>                                        <!-- End of paragraph section -->
</BODY>                                     <!-- End of main body -->
</HTML>                                     <!-- End of HTML section -->

A simple Web page embedding a script written in JavaScript.

Document Model (2)


<!ELEMENT article (title, author+, journal)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (name, affiliation?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT affiliation (#PCDATA)>
<!ELEMENT journal (jname, volume, number?, month?, pages, year)>
<!ELEMENT jname (#PCDATA)>
<!ELEMENT volume (#PCDATA)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT year (#PCDATA)>

An XML definition for referring to a journal article.

Document Model (3)


<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "article.dtd">
<article>
  <title>Prudent Engineering Practice for Cryptographic Protocols</title>
  <author><name>M. Abadi</name></author>
  <author><name>R. Needham</name></author>
  <journal>
    <jname>IEEE Transactions on Software Engineering</jname>
    <volume>22</volume>
    <number>12</number>
    <month>January</month>
    <pages>6-15</pages>
    <year>1996</year>
  </journal>
</article>

An XML document using the XML definitions from the previous slide.
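The relation between the definition and the document can be sketched in Python. This is only an illustration: the standard library parser does not validate against a DTD, so `check_article` is a hypothetical hand-written check of the top-level content model (title, author+, journal), and the XML declaration and DOCTYPE line are left out of the sample string for simplicity.

```python
import xml.etree.ElementTree as ET

# Sample document following the article definition (DOCTYPE omitted on purpose).
ARTICLE_XML = """
<article>
  <title>Prudent Engineering Practice for Cryptographic Protocols</title>
  <author><name>M. Abadi</name></author>
  <author><name>R. Needham</name></author>
  <journal>
    <jname>IEEE Transactions on Software Engineering</jname>
    <volume>22</volume>
    <number>12</number>
    <month>January</month>
    <pages>6-15</pages>
    <year>1996</year>
  </journal>
</article>
"""


def check_article(root):
    """Hypothetical check: children must be title, one or more authors, journal."""
    tags = [child.tag for child in root]
    return (
        root.tag == "article"
        and len(tags) >= 3
        and tags[0] == "title"
        and tags[-1] == "journal"
        and all(tag == "author" for tag in tags[1:-1])
    )


root = ET.fromstring(ARTICLE_XML.strip())
authors = [author.findtext("name") for author in root.findall("author")]

if __name__ == "__main__":
    print("structure ok:", check_article(root))
    print("authors:", authors)
```

A validating parser would enforce the whole definition, including the optional elements; the check above covers only the first rule.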

Document Types
Type         Subtype       Description
Text         Plain         Unformatted text
             HTML          Text including HTML markup commands
             XML           Text including XML markup commands
Image        GIF           Still image in GIF format
             JPEG          Still image in JPEG format
Audio        Basic         Audio, 8-bit PCM sampled at 8000 Hz
             Tone          A specific audible tone
Video        MPEG          Movie in MPEG format
             Pointer       Representation of a pointer device for presentations
Application  Octet-stream  An uninterpreted byte sequence
             Postscript    A printable document in Postscript
             PDF           A printable document in PDF
Multipart    Mixed         Independent parts in the specified order
             Parallel      Parts must be viewed simultaneously

Six top-level MIME types and some common subtypes.
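As a small illustration of the type/subtype split, Python's standard `mimetypes` module maps file names to MIME types. The helper name `mime_parts` and the sample file names are made up for this sketch, and the result depends on the local type table.

```python
import mimetypes


def mime_parts(filename):
    """Return (top-level type, subtype) guessed from the file name, or None."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return None
    top_level, subtype = mime.split("/", 1)
    return top_level, subtype


if __name__ == "__main__":
    for name in ("index.html", "logo.gif", "paper.pdf"):
        print(name, "->", mime_parts(name))
```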

Architectural Overview (1)

The principle of using server-side CGI programs.
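A CGI program receives the request through environment variables and writes a header block plus a document to its standard output. The sketch below imitates that contract in a plain function so it can run without a Web server; `run_cgi` and the `name` query parameter are assumptions made for the example.

```python
from html import escape
from urllib.parse import parse_qs


def run_cgi(environ):
    """Build the output a CGI program would print for the given environment."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input before embedding it in the generated document.
    body = f"<HTML><BODY><P>Hello, {escape(name)}!</P></BODY></HTML>"
    # A CGI response is a header block, an empty line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(run_cgi({"QUERY_STRING": "name=Alice"}))
```

In a real deployment the server would set `QUERY_STRING` and forward the program's output to the client.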

Architectural Overview (2)


<HTML>
<BODY>
<P>The current content of <PRE>/data/file.txt</PRE> is:</P>
<P>
<SERVER type = "text/javascript">
  clientFile = new File("/data/file.txt");
  if (clientFile.open("r")) {
    while (!clientFile.eof())
      document.writeln(clientFile.readln());
    clientFile.close();
  }
</SERVER>
</P>
<P>Thank you for visiting this site.</P>
</BODY>
</HTML>

An HTML document containing a JavaScript script to be executed by the server.

Architectural Overview (3)

Architectural details of a client and server in the Web.

HTTP Connections

(a) Using nonpersistent connections. (b) Using persistent connections.
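The difference can be observed with a throwaway local server that counts incoming TCP connections. This is a sketch under assumptions: `CountingHandler`, `fetch_nonpersistent`, `fetch_persistent`, and `demo` are names invented here, and the server runs on the loopback interface only.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def setup(self):
        super().setup()
        self.server.connections += 1  # setup() runs once per TCP connection

    def do_GET(self):
        body = f"document {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_nonpersistent(host, port, paths):
    """Open a fresh connection for every document."""
    bodies = []
    for path in paths:
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", path, headers={"Connection": "close"})
        bodies.append(conn.getresponse().read())
        conn.close()
    return bodies


def fetch_persistent(host, port, paths):
    """Reuse one connection for all documents."""
    conn = http.client.HTTPConnection(host, port)
    bodies = []
    for path in paths:
        conn.request("GET", path)
        bodies.append(conn.getresponse().read())
    conn.close()
    return bodies


def demo(paths=("/a", "/b", "/c")):
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    server.connections = 0
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        fetch_nonpersistent(host, port, paths)
        nonpersistent = server.connections
        fetch_persistent(host, port, paths)
        persistent = server.connections - nonpersistent
    finally:
        server.shutdown()
        server.server_close()
    return nonpersistent, persistent


if __name__ == "__main__":
    print("connections (nonpersistent, persistent):", demo())
```

With three documents, the nonpersistent client should set up three connections and the persistent client one.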

HTTP Methods
Operation  Description
Head       Request to return the header of a document
Get        Request to return a document to the client
Put        Request to store a document
Post       Provide data that is to be added to a document (collection)
Delete     Request to delete a document

Operations supported by HTTP.
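The operations can be read as actions on a document store. The following is a minimal in-memory sketch, not a server: the `DocumentStore` class and the status codes it returns are simplifying assumptions chosen for illustration.

```python
class DocumentStore:
    """Toy document store that interprets the five HTTP operations."""

    METHODS = {"HEAD", "GET", "PUT", "POST", "DELETE"}

    def __init__(self):
        self.docs = {}  # path -> document text

    def handle(self, method, path, data=""):
        method = method.upper()
        if method not in self.METHODS:
            return 405, ""
        if method == "PUT":  # store (or replace) a document
            created = path not in self.docs
            self.docs[path] = data
            return (201 if created else 204), ""
        if method == "POST":  # add data to a document (collection)
            self.docs[path] = self.docs.get(path, "") + data
            return 200, ""
        if path not in self.docs:
            return 404, ""
        if method == "GET":  # return the document
            return 200, self.docs[path]
        if method == "HEAD":  # header only, so no body
            return 200, ""
        del self.docs[path]  # DELETE
        return 204, ""


if __name__ == "__main__":
    store = DocumentStore()
    print(store.handle("PUT", "/notes", "first line\n"))
    print(store.handle("POST", "/notes", "second line\n"))
    print(store.handle("GET", "/notes"))
```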

HTTP Messages (1)

HTTP request message.

HTTP Messages (2)

HTTP response message.
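Both message types share the same layout: a first line, a block of headers, an empty line, and an optional body. A sketch of building a request and parsing a response as text follows; `build_request` and `parse_response` are hypothetical helpers, and the host name is taken from the URL examples in this chapter.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble a request: request line, headers, empty line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body.encode())}")
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(raw):
    """Split a response into version, status code, phrase, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


if __name__ == "__main__":
    print(build_request("GET", "/globe", "www.cs.vu.nl"))
    print(parse_response("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<HTML></HTML>"))
```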

HTTP Messages (3)


Header               Source  Contents
Accept               Client  The type of documents the client can handle
Accept-Charset       Client  The character sets that are acceptable to the client
Accept-Encoding      Client  The document encodings the client can handle
Accept-Language      Client  The natural language the client can handle
Authorization        Client  A list of the client's credentials
WWW-Authenticate     Server  Security challenge the client should respond to
Date                 Both    Date and time the message was sent
ETag                 Server  The tags associated with the returned document
Expires              Server  How long the response remains valid
From                 Client  The client's e-mail address
Host                 Client  The TCP address of the document's server
If-Match             Client  The tags the document should have
If-None-Match        Client  The tags the document should not have
If-Modified-Since    Client  Tells the server to return a document only if it has been modified since the specified time
If-Unmodified-Since  Client  Tells the server to return a document only if it has not been modified since the specified time
Last-Modified        Server  The time the returned document was last modified
Location             Server  A document reference to which the client should redirect its request
Referer              Client  Refers to client's most recently requested document
Upgrade              Both    The application protocol the sender wants to switch to
Warning              Both    Information about the status of the data in the message

Some HTTP message headers.
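How a server might act on If-Modified-Since can be sketched as follows. The function `conditional_get` and its return shape are assumptions for illustration; only the date helpers from Python's standard `email.utils` module are real library calls.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified):
    """Return (status, headers): 304 if the client's copy is still current."""
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            threshold = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            threshold = None  # unparsable date: fall through to a full reply
        if threshold is not None and last_modified <= threshold:
            return 304, {}
    return 200, {"Last-Modified": format_datetime(last_modified, usegmt=True)}


if __name__ == "__main__":
    modified = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(conditional_get({"If-Modified-Since": "Tue, 02 Jan 2024 00:00:00 GMT"}, modified))
    print(conditional_get({}, modified))
```

The sketch assumes `last_modified` is a timezone-aware UTC datetime and that the header carries a GMT date.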

Clients (1)

Using a plug-in in a Web browser.

Clients (2)

Using a Web proxy when the browser does not speak FTP.

Servers

General organization of the Apache Web server.

Server Clusters (1)

The principle of using a cluster of workstations to implement a Web service.

Server Clusters (2)

(a) The principle of TCP handoff.

Server Clusters (3)

(b) A scalable content-aware cluster of Web servers.
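In a content-aware cluster, the server that handles a request is chosen by looking at the requested document, so that requests for the same document reach the same machine. One possible policy, assumed here purely for illustration, is to hash the request path; `pick_server` and the server names are hypothetical.

```python
import hashlib


def pick_server(path, servers):
    """Map a request path to one server; the same path always maps the same way."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    cluster = ["server-1", "server-2", "server-3"]
    for request_path in ("/index.html", "/images/logo.gif", "/index.html"):
        print(request_path, "->", pick_server(request_path, cluster))
```

Real distributors may also weigh server load; this sketch only shows the content-based part of the decision.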

Uniform Resource Locators (1)

Often-used structures for URLs. a) Using only a DNS name. b) Combining a DNS name with a port number. c) Combining an IP address with a port number.

Uniform Resource Locators (2)


Name    Used for      Example
http    HTTP          http://www.cs.vu.nl:80/globe
ftp     FTP           ftp://ftp.cs.vu.nl/pup/minx/README
file    Local file    file:/edu/book/work/chp/11/11
data    Inline data   data:text/plain;charset=iso-8859-7,%e1%e2%e3
telnet  Remote login  telnet://flits.cs.vu.nl
tel     Telephone     tel:+31201234567
modem   Modem         modem:+31201234567;type=v32

Examples of URLs.
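The parts of a URL can be pulled apart with `urlsplit` from Python's standard library. The wrapper `describe` is a name introduced for this sketch; the URLs come from the examples above.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return the scheme, host, port, and path found in a URL."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


if __name__ == "__main__":
    for example in ("http://www.cs.vu.nl:80/globe", "telnet://flits.cs.vu.nl", "tel:+31201234567"):
        print(describe(example))
```

Schemes without a network location, such as tel, yield no host or port.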

Uniform Resource Names

The general structure of a URN.

Web Proxy Caching

The principle of cooperative caching.
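Cooperative caching can be sketched as a lookup order: the proxy's own cache first, then neighboring proxies, and only then the origin server. The `ProxyCache` class below is a hypothetical model in which plain dictionaries stand in for caches and for the origin server.

```python
class ProxyCache:
    """Toy proxy that asks its neighbors before going to the origin server."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin  # dict standing in for the origin Web server
        self.store = {}       # this proxy's local cache
        self.neighbors = []   # other ProxyCache instances

    def get(self, url):
        if url in self.store:  # 1. local hit
            return self.store[url], f"hit:{self.name}"
        for peer in self.neighbors:  # 2. ask neighboring proxies
            if url in peer.store:
                self.store[url] = peer.store[url]
                return self.store[url], f"neighbor:{peer.name}"
        self.store[url] = self.origin[url]  # 3. fall back to the origin
        return self.store[url], "origin"


if __name__ == "__main__":
    origin = {"/index.html": "<HTML>...</HTML>"}
    a, b = ProxyCache("A", origin), ProxyCache("B", origin)
    a.neighbors, b.neighbors = [b], [a]
    print(a.get("/index.html")[1])
    print(b.get("/index.html")[1])
    print(b.get("/index.html")[1])
```

Expiry and consistency checks are left out; the sketch shows only where a document is looked up.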

Server Replication

The principal working of the Akamai CDN.

Security (1)

The position of TLS in the Internet protocol stack.

Security (2)

TLS with mutual authentication.
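With mutual authentication, the server also demands a certificate from the client. In Python's standard `ssl` module this corresponds to a server-side context whose verify mode is `CERT_REQUIRED`. The function name and the file parameters below are placeholders; no certificate files are loaded unless paths are supplied.

```python
import ssl


def make_server_context(certfile=None, keyfile=None, client_ca=None):
    """Server-side TLS context that requires a certificate from the client."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.verify_mode = ssl.CERT_REQUIRED  # client must authenticate too
    if certfile:
        # The server's own certificate and private key (hypothetical paths).
        context.load_cert_chain(certfile, keyfile)
    if client_ca:
        # CA certificates used to validate client certificates.
        context.load_verify_locations(cafile=client_ca)
    return context


if __name__ == "__main__":
    ctx = make_server_context()
    print("client certificate required:", ctx.verify_mode == ssl.CERT_REQUIRED)
```

A working server would pass real certificate paths and wrap its listening socket with this context.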

Lotus Notes

The general organization of a Lotus Notes system.

Document Model
Note type    Category        Description
Document     Data            A user-oriented document such as a Web page
Form         Design          Structure for creating, editing, and viewing a document
Field        Design          Defines a field shared between a form and subforms
View         Design          Structure for displaying a collection of documents
ACL          Administration  Contains an access control list for the database
ReplFormula  Administration  Describes the replication of the database

Examples of different types of notes.

Processes (1)

The general organization of a Domino server.

Processes (2)

Request handling in a cluster of Domino servers.

Naming

A Notes URL for accessing a database.

Identifiers
Identifier     Scope     Description
Universal ID   World     Globally unique identifier assigned to each note
Originator ID  World     Identifier for a note, but includes history information
Database ID    Server    Time-dependent identifier for a database
Note ID        Database  Identifier of a note relative to a database instance
Replica ID     World     Timestamp used to identify the same copies of a database

Some major identifiers in Notes.

Replication
Scheme     Description
Pull-push  A replicator task pulls updates in from a target server, and pushes its own updates to that target as well
Pull-pull  A replicator task pulls in updates from a target server, and responds to update fetch requests from that target
Push-only  A replicator task only pushes its own updates to a target server, but does not pull in any updates from the target
Pull-only  A replicator only pulls in updates from a target server, but does not push any of its own updates to that target

Replication schemes in Notes.
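The four schemes differ only in the direction in which updates flow. The sketch below models that direction with a toy `Replica` class; comparing per-note sequence numbers is an assumption made for the example and is not taken from the Notes documentation.

```python
class Replica:
    """Toy replica: note id -> (sequence number, content)."""

    def __init__(self, notes=None):
        self.notes = dict(notes or {})

    def pull_from(self, target):
        # Copy every note the target has in a newer version (assumed rule).
        for note_id, (seq, content) in target.notes.items():
            if note_id not in self.notes or self.notes[note_id][0] < seq:
                self.notes[note_id] = (seq, content)

    def push_to(self, target):
        target.pull_from(self)


def replicate(local, target, scheme):
    """Run one replication round from the point of view of `local`."""
    if scheme in ("pull-push", "pull-pull", "pull-only"):
        local.pull_from(target)
    if scheme in ("pull-push", "push-only"):
        local.push_to(target)
    if scheme == "pull-pull":
        # Here the target fetches the updates itself; local only responds.
        target.pull_from(local)


if __name__ == "__main__":
    a = Replica({"n1": (1, "x")})
    b = Replica({"n2": (1, "y")})
    replicate(a, b, "pull-only")
    print(sorted(a.notes), sorted(b.notes))
```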

Conflict Resolution

Safely merging two documents with conflicting OIDs.
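One way to picture a safe merge is field by field: if the two copies changed different fields, both changes can be kept, and only a field changed differently in both copies is a real conflict. The sketch below assumes a common earlier version (`base`) is available and represents documents as dictionaries; this is a simplification for illustration, not the Notes algorithm itself.

```python
def merge_documents(base, copy_a, copy_b):
    """Merge two edited copies of `base`; return (merged, conflicting fields)."""
    merged, conflicts = dict(base), []
    for field in sorted(set(base) | set(copy_a) | set(copy_b)):
        a_changed = copy_a.get(field) != base.get(field)
        b_changed = copy_b.get(field) != base.get(field)
        if a_changed and b_changed and copy_a.get(field) != copy_b.get(field):
            conflicts.append(field)  # both copies changed it differently
        elif a_changed:
            merged[field] = copy_a.get(field)
        elif b_changed:
            merged[field] = copy_b.get(field)
    return merged, conflicts


if __name__ == "__main__":
    base = {"title": "Plan", "body": "v1"}
    print(merge_documents(base, {"title": "Plan A", "body": "v1"}, {"title": "Plan", "body": "v2"}))
    print(merge_documents(base, {"title": "Plan A", "body": "v1"}, {"title": "Plan B", "body": "v1"}))
```

Conflicting fields keep their base value in the merged result and are reported separately, leaving the decision to the user.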

Authentication: Validating Certificates

Public-key validation in Notes.

Access Control
Part          Description
Servers       ACLs specifying access rights for servers and ports
Workstations  Lists specifying execution rights for scripts and such
Databases     ACLs specifying permissions for different types of users
Files         ACLs used for controlling access by Web clients
Design notes  ACLs to control the presentation and such of documents
Documents     ACLs to control read and write access to documents

Parts in Notes subject to access control.

Comparison of Web & Lotus Notes


Issue               WWW                        Notes
Basic model         Marked-up text             List of text items (note)
Extensions          Multimedia, scripts        Multimedia, scripts
Storage model       File oriented              Database oriented
Network comm.       HTTP                       RPC, E-mail
Interprocess comm.  Operating sys. dependent   Notes Object Services (NOS)
Client process      Browser, Editor            Browser, Design editor
Client extensions   Plug-ins                   In basic client system
Server process      Comparable to file server  Comparable to database server
Server extensions   Servlets, CGI programs     Server tasks
Server clusters     Transparent                Nontransparent
Naming              URNs, URLs                 URLs, identifiers
Synchronization     Mainly local               Mainly local
Caching             Advanced                   Not documented
Replication         Mirroring, CDNs            Lazy
Fault tolerance     Reliable comm. & clusters  Clusters
Recovery            No explicit support        Single server
Authentication      Mainly TLS                 Certificate validation
Access control      Server dependent           Extensive ACLs
