Distributed Document-Based Systems
The World Wide Web
Overall organization of the Web.
Document Model (1)
Document Model (2)
<!ELEMENT article (title, author+, journal)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (name, affiliation?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT affiliation (#PCDATA)>
<!ELEMENT journal (jname, volume, number?, month?, pages, year)>
<!ELEMENT jname (#PCDATA)>
<!ELEMENT volume (#PCDATA)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT year (#PCDATA)>
An XML definition for referring to a journal article.
Document Model (3)
<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "article.dtd">
<article>
  <title>Prudent Engineering Practice for Cryptographic Protocols</title>
  <author><name>M. Abadi</name></author>
  <author><name>R. Needham</name></author>
  <journal>
    <jname>IEEE Transactions on Software Engineering</jname>
    <volume>22</volume>
    <number>12</number>
    <month>January</month>
    <pages>6 – 15</pages>
    <year>1996</year>
  </journal>
</article>
An XML document using the XML definitions from the previous slide.
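A minimal sketch of how a client might read such an article document, using Python's standard `xml.etree.ElementTree`. The function name `summarize` is hypothetical, and the DOCTYPE line is left out because ElementTree does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# The article from the slide, without the DOCTYPE line so that no external
# article.dtd file is needed (ElementTree does not validate against a DTD).
ARTICLE_XML = """<?xml version="1.0"?>
<article>
  <title>Prudent Engineering Practice for Cryptographic Protocols</title>
  <author><name>M. Abadi</name></author>
  <author><name>R. Needham</name></author>
  <journal>
    <jname>IEEE Transactions on Software Engineering</jname>
    <volume>22</volume>
    <number>12</number>
    <month>January</month>
    <pages>6 – 15</pages>
    <year>1996</year>
  </journal>
</article>"""


def summarize(xml_text):
    """Return (title, list of author names, journal name, year)."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title").strip()
    # author+ in the DTD means one or more author elements.
    authors = [author.findtext("name") for author in root.findall("author")]
    journal = root.find("journal")
    return title, authors, journal.findtext("jname"), int(journal.findtext("year"))


if __name__ == "__main__":
    print(summarize(ARTICLE_XML))
```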
Type         Subtype       Description
Text         Plain         Unformatted text
Text         HTML          Text including HTML markup commands
Text         XML           Text including XML markup commands
Image        GIF           Still image in GIF format
Image        JPEG          Still image in JPEG format
Audio        Basic         Audio, 8-bit PCM sampled at 8000 Hz
Audio        Tone          A specific audible tone
Video        MPEG          Movie in MPEG format
Video        Pointer       Representation of a pointer device for presentations
Application  Octet-stream  An uninterrupted byte sequence
Application  Postscript    A printable document in Postscript
Application  PDF           A printable document in PDF
Multipart    Mixed         Independent parts in the specified order
Multipart    Parallel      Parts must be viewed simultaneously
Six top-level MIME types and some common subtypes.
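A MIME type is written as type/subtype, optionally followed by parameters. The following sketch splits such a string into its parts; `parse_mime` is a hypothetical helper and handles only the simple `name=value` parameter form.

```python
def parse_mime(content_type):
    """Split 'type/subtype; name=value' into (type, subtype, params)."""
    main, *param_parts = content_type.split(";")
    # Type and subtype names are case-insensitive, so normalize them.
    top, _, sub = main.strip().lower().partition("/")
    params = {}
    for part in param_parts:
        key, _, value = part.strip().partition("=")
        if key:
            params[key.lower()] = value.strip('"')
    return top, sub, params


if __name__ == "__main__":
    print(parse_mime("text/html; charset=UTF-8"))
    print(parse_mime("Application/Octet-stream"))
```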
Architectural Overview (1)
The principle of using server-side CGI programs.
Architectural Overview (2)
Architectural Overview (3)
Architectural details of a client and server in the Web.
(a) Using nonpersistent connections. (b) Using persistent connections.
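The difference between the two connection styles can be illustrated with a deliberately simplified cost model. The assumptions, which are not from the slides, are: one round trip to set up a TCP connection, one round trip per request and response, no pipelining, and connection teardown ignored. The function name `round_trips` is hypothetical.

```python
def round_trips(num_documents, persistent):
    """Estimate round trips needed to fetch num_documents from one server.

    Simplified model: 1 round trip per TCP connection setup and
    1 round trip per request/response pair.
    """
    # A persistent connection is set up once and reused for every document;
    # a nonpersistent one is set up again for each document.
    setups = 1 if persistent else num_documents
    return setups + num_documents


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, round_trips(n, persistent=False), round_trips(n, persistent=True))
```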
Operation  Description
Head       Request to return the header of a document
Get        Request to return a document to the client
Put        Request to store a document
Post       Provide data that is to be added to a document (collection)
Delete     Request to delete a document
Operations supported by HTTP.
HTTP Messages (1)
HTTP request message
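A request message consists of a request line (operation, document reference, protocol version), a series of headers, an empty line, and an optional body. A minimal sketch that assembles such a message is shown below; `build_request` is a hypothetical helper, and HTTP/1.1 is assumed as the version.

```python
def build_request(operation, path, host, headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request message as bytes."""
    lines = [f"{operation} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The receiver needs the body length to find the end of the message.
        lines.append(f"Content-Length: {len(body)}")
    # An empty line separates the headers from the (optional) body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(build_request("GET", "/globe", "www.cs.vu.nl").decode("ascii"))
```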
HTTP Messages (2)
HTTP response message.
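A response message starts with a status line (protocol version, status code, reason phrase), followed by headers, an empty line, and the body. The sketch below takes such a message apart; `parse_response` is a hypothetical helper that assumes the whole message is already in memory and that the status line contains a reason phrase.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    # The first empty line separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


if __name__ == "__main__":
    raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
    print(parse_response(raw))
```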
HTTP Messages (3)
Header               Source  Contents
Accept               Client  The type of documents the client can handle
Accept-Charset       Client  The character sets that are acceptable for the client
Accept-Encoding      Client  The document encodings the client can handle
Accept-Language      Client  The natural language the client can handle
Authorization        Client  A list of the client's credentials
WWW-Authenticate     Server  Security challenge the client should respond to
Date                 Both    Date and time the message was sent
ETag                 Server  The tags associated with the returned document
Expires              Server  The time for which the response remains valid
From                 Client  The client's e-mail address
Host                 Client  The TCP address of the document's server
If-Match             Client  The tags the document should have
If-None-Match        Client  The tags the document should not have
If-Modified-Since    Client  Tells the server to return a document only if it has been modified since the specified time
If-Unmodified-Since  Client  Tells the server to return a document only if it has not been modified since the specified time
Last-Modified        Server  The time the returned document was last modified
Location             Server  A document reference to which the client should redirect its request
Referer              Client  Refers to the client's most recently requested document
Upgrade              Both    The application protocol the sender wants to switch to
Warning              Both    Information about the status of the data in the message
Some HTTP message headers.
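As an illustration of how the If-Modified-Since header could be used, the sketch below shows a server-side decision between sending the document (status 200) and telling the client its copy is still current (status 304). The function name `conditional_get_status` is hypothetical, and only the timestamp comparison is modeled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None):
    """Return 200 if the document must be sent, 304 if the client's copy is current.

    last_modified is a timezone-aware datetime; if_modified_since is the raw
    header value, for example 'Sat, 01 Jan 2000 12:00:00 GMT'.
    """
    if if_modified_since is None:
        return 200  # unconditional request: always send the document
    client_time = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) > client_time:
        return 200
    return 304


if __name__ == "__main__":
    modified = datetime(2000, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(conditional_get_status(modified, "Sat, 01 Jan 2000 11:00:00 GMT"))
    print(conditional_get_status(modified, "Sat, 01 Jan 2000 12:00:00 GMT"))
```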
Using a plug-in in a Web browser.
Using a Web proxy when the browser does not speak FTP.
General organization of the Apache Web server.
Server Clusters (1)
The principle of using a cluster of workstations to implement a Web service.
Server Clusters (2)
(a) The principle of TCP handoff.
Server Clusters (3)
(b) A scalable content-aware cluster of Web servers.
Uniform Resource Locators (1)
Often-used structures for URLs. (a) Using only a DNS name. (b) Combining a DNS name with a port number. (c) Combining an IP address with a port number.
Uniform Resource Locators (2)
Name    Used for      Example
http    HTTP          http://www.cs.vu.nl:80/globe
ftp     FTP           ftp://ftp.cs.vu.nl/pup/minx/README
file    Local file    file:/edu/book/work/chp/11/11
data    Inline data   data:text/plain;charset=iso-8859-7,%e1%e2%e3
telnet  Remote login  telnet://flits.cs.vu.nl
tel     Telephone     tel:+31201234567
modem   Modem         modem:+31201234567;type=v32
Examples of URLs.
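The parts of a URL (scheme, host name, optional port, and path) can be taken apart with Python's standard `urllib.parse.urlsplit`. The wrapper `describe` below is a hypothetical helper added only for illustration.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return (scheme, host, port, path) for a URL; port is None if absent."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port, parts.path


if __name__ == "__main__":
    for url in (
        "http://www.cs.vu.nl:80/globe",
        "ftp://ftp.cs.vu.nl/pup/minx/README",
        "telnet://flits.cs.vu.nl",
    ):
        print(describe(url))
```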
Uniform Resource Names
The general structure of a URN
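A URN has the form `urn:<namespace identifier>:<namespace-specific string>`. The sketch below splits a URN into these parts; `parse_urn` and the example URNs are illustrative assumptions rather than material from the slides, and no namespace-specific validation is attempted.

```python
def parse_urn(urn):
    """Split 'urn:NID:NSS' into (namespace identifier, namespace-specific string)."""
    scheme, _, rest = urn.partition(":")
    nid, sep, nss = rest.partition(":")
    if scheme.lower() != "urn" or not sep or not nid or not nss:
        raise ValueError(f"not a URN: {urn!r}")
    # The 'urn' prefix and the namespace identifier are case-insensitive;
    # the namespace-specific string is returned unchanged.
    return nid.lower(), nss


if __name__ == "__main__":
    print(parse_urn("urn:isbn:0-13-349945-6"))
    print(parse_urn("URN:ietf:rfc:2648"))
```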
Web Proxy Caching
The principle of cooperative caching
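Cooperative caching can be sketched as follows: on a local miss, a proxy first asks its neighboring proxies and contacts the origin server only if none of them holds the document. The `Proxy` class and its attribute names are hypothetical, and the in-memory dictionaries stand in for real caches and network requests.

```python
class Proxy:
    """A Web proxy with a local cache and a list of cooperating neighbors."""

    def __init__(self, name, fetch_from_origin):
        self.name = name
        self.cache = {}
        self.neighbors = []
        self.fetch_from_origin = fetch_from_origin

    def get(self, url):
        """Return (document, where it was found)."""
        if url in self.cache:
            return self.cache[url], "local hit"
        # Local miss: check the neighboring proxies before going to the server.
        for peer in self.neighbors:
            if url in peer.cache:
                self.cache[url] = peer.cache[url]
                return self.cache[url], f"hit at {peer.name}"
        # No neighbor has the document: forward the request to the origin server.
        self.cache[url] = self.fetch_from_origin(url)
        return self.cache[url], "fetched from origin"


if __name__ == "__main__":
    def origin(url):
        return f"<doc {url}>"

    a, b = Proxy("A", origin), Proxy("B", origin)
    a.neighbors, b.neighbors = [b], [a]
    print(a.get("/x"))
    print(b.get("/x"))
    print(b.get("/x"))
```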
The principal working of the Akamai CDN.
The position of TLS in the Internet protocol stack.
TLS with mutual authentication.
The general organization of a Lotus Notes system.
Note type    Category        Description
Document     Data            A user-oriented document such as a Web page
Form         Design          Structure for creating, editing, and viewing a document
Field        Design          Defines a field shared between a form and subforms
View         Design          Structure for displaying a collection of documents
ACL          Administration  Contains an access control list for the database
ReplFormula  Administration  Describes the replication of the database
Examples of different types of notes.
The general organization of a Domino server.
Request handling in a cluster of Domino servers.
A Notes URL for accessing a database.
Identifier     Scope     Description
Universal ID   World     Globally unique identifier assigned to each note
Originator ID  World     Identifier for a note, but includes history information
Database ID    Server    Time-dependent identifier for a database
Note ID        Database  Identifier of a note relative to a database instance
Replica ID     World     Timestamp used to identify the same copies of a database
Some major identifiers in Notes.
Scheme     Description
Pull-push  A replicator task pulls updates in from a target server, and pushes its own updates to that target as well
Pull-pull  A replicator task pulls in updates from a target server, and responds to update fetch requests from that target
Push-only  A replicator task only pushes its own updates to a target server, but does not pull in any updates from the target
Pull-only  A replicator only pulls in updates from a target server, but does not push any of its own updates to that target
Replication schemes in Notes.
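The effect of the four schemes on two replicas can be sketched with plain sets of update identifiers. This is an illustrative model only: `replicate` is a hypothetical function, conflicts are ignored, and in the pull-pull case the transfer to the target is modeled by its net effect even though it is the target that initiates the fetch.

```python
def replicate(scheme, local, target):
    """Run one replication round; return the (local, target) update sets afterwards."""
    schemes = ("pull-push", "pull-pull", "push-only", "pull-only")
    if scheme not in schemes:
        raise ValueError(f"unknown scheme: {scheme!r}")
    # Updates flow from the target to the local server in every scheme but push-only.
    local_receives = scheme != "push-only"
    # Updates flow from the local server to the target in every scheme but pull-only.
    target_receives = scheme != "pull-only"
    new_local = local | target if local_receives else set(local)
    new_target = target | local if target_receives else set(target)
    return new_local, new_target


if __name__ == "__main__":
    for scheme in ("pull-push", "pull-pull", "push-only", "pull-only"):
        print(scheme, replicate(scheme, {"u1"}, {"u2"}))
```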
Safely merging two documents with conflicting OIDs.
Authentication: Validating Certificates
Public-key validation in Notes
Part          Description
Servers       ACLs specifying access rights for servers and ports
Workstations  Lists specifying execution rights for scripts and such
Databases     ACLs specifying permissions for different types of users
Files         ACLs used for controlling access by Web clients
Design notes  ACLs to control the presentation and such of documents
Documents     ACLs to control read and write access to documents
Parts in Notes subject to access control.
Comparison of Web & Lotus Notes
Issue               WWW                        Notes
Basic model         Marked-up text             List of text items (note)
Extensions          Multimedia, scripts        Multimedia, scripts
Storage model       File oriented              Database oriented
Network comm.       HTTP                       RPC, E-mail
Interprocess comm.  Operating sys. dependent   Notes Object Services (NOS)
Client process      Browser, Editor            Browser, Design editor
Client extensions   Plug-ins                   In basic client system
Server process      Comparable to file server  Comparable to database server
Server extensions   Servlets, CGI programs     Server tasks
Server clusters     Transparent                Nontransparent
Naming              URNs, URLs                 URLs, identifiers
Synchronization     Mainly local               Mainly local
Caching             Advanced                   Not documented
Replication         Mirroring, CDNs            Lazy
Fault tolerance     Reliable comm. & clusters  Clusters
Recovery            No explicit support        Single server
Authentication      Mainly TLS                 Certificate validation
Access control      Server dependent           Extensive ACLs