Part One
Course Instructor
Mr. Maged Saleh Ahmed Al-Haj
EMAIL : magedalhaj4@gmail.com
PHONE : 770091917
1444 AH / 2022 CE
Motivation
Web-based Knowledge & Data Management
A huge amount of Web data
How to organize and retrieve it, how to discover interesting
patterns, and how to make recommendations from it
Web Search Engine
Uber Taxi and Didi Chuxing
Amazon, Alibaba, Tencent, JD.com
Web Blog Analysis
Spam Email Detection
Online Electronic Medical Data Analysis
Electronic Health Care and eHealth
Social Network Analysis
Amazon Business Model
Examples of Web Search Engines
Examples of Web
Introduction to Client-Server Systems,
WWW and Web Technology
Week 1
Outline
The Internet
The Web
What makes the Web work?
HTTP
URL
HTML
CGI
Example of a Web page
Summary
The Internet
[Figure: a message travels across the Internet from a sender ("From") to the host at IP address 123.21.12.131 ("To")]
Historical View: Internet
1969 - Telnet
1970 - 4 computers
Stanford, UCLA, UC Santa Barbara, U Utah
1971 - FTP
1983 - 562 computers on the internet
1993 - 1.2 million computers on the internet
1999 - ssh, sftp, ……
2010 - Amazon, Alibaba, ……
2020 - Smart-based devices, …….
The Web
World-Wide Web (Web, WWW)
networked information system that provides a simple
way of browsing different types (text, pictures, video,
audio, etc.) of information on the Internet using
hyperlinks.
Web pages
electronic documents that typically contain several
types of information accessible via the World Wide Web
Web sites
a collection of related Web pages of a certain individual,
group, or organization.
The Web uses a client/server model
Client-Server Model
Browser - software to interact with internet data at the client
Server - machine that services internet requests
1. Browser -> Server: Request File
2. Server -> Browser: Send File
3. Browser: Display File
What is a Web Server?
Web server
computer running application software that listens and
responds to a client computer’s request made through a
web browser
machine that hosts web pages and other web
documents
provides web documents and other online services
using HTTP
What is a Web Browser?
Web browser
application software that is used to locate and issue a
request for the page on the web server that hosts the
document
It also interprets the page sent back by the web server
and displays it on the monitor of the client computer
computer program that lets you view and explore
information on the World Wide Web
Web Browsers
Note: Not all URLs will have the directory and filename
HyperText Markup Language (HTML)
Hypertext
presents and relates information as hyperlinked
documents that point to other documents or resources.
HTML
A standard markup language that defines a hypertext
document.
A simple, powerful, platform-independent document
language.
Specifies what displays should look like
Browser interprets HTML
Same HTML file often looks different across browsers
HTML files are the source files of Web pages
HTML File Structure
<HTML>
<HEAD>
<TITLE>Page Title</TITLE>
</HEAD>
<BODY>
Stuff
</BODY>
</HTML>
What About Graphics?
An HTML file can refer to an image file
<h2>Teaching</h2>
<p><a href="http://ai.uwaterloo.ca/3421.html">
COSC 3421 Fall 2002</a></p>
<p><a href="http://ai.uwaterloo.ca/3221.html">
COSC 3221 Winter 2003</a></p>
Simple Formatting
<H1><FONT COLOR="#b80000">
Heading level 1</FONT></H1>
<H2><FONT COLOR="#ff0000">
Heading level 2 </FONT> </H2>
<P>Paragraph with <B>bold</B> and
<I>italic</I> text.</P>
<HR>
Creating HTML Files
Web Server: hosts web pages
Web Client (Browser): retrieves web pages over the Internet using the HTTP protocol
Web Authoring System: creates and publishes web pages (inputs: scanner, video capture, sound card)
Web page: document written in HTML, JSP, or ASP.
Internet Client-Server Systems
Internet Banking
Wechat Business Model
Amazon Business Model
Static and Dynamic Web Pages
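Common Gateway Interface (CGI)
1. Web Browser -> Web Server: HTTP Request
2. Web Server -> CGI Scripts/Applications -> Database: Get Data
3. Database -> CGI Scripts/Applications -> Web Server: Return data
4. Web Server -> Web Browser: HTTP Document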
How Web Page Works
Parts of a Web page as shown in the browser: URL, navigational tools, graphics, hyperlinks
Cookies
A piece of information generated by the web-server
and stored in the client side ready for future access.
Cookies can make CGI scripts more interactive.
Cookies are text files stored on Web client.
A CGI script creates the cookie and has the Web server send
it to the client’s browser to store on hard disk.
Later, when client revisits Web site and uses a CGI
script that requests this cookie, client’s browser
sends information stored in the cookie.
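The cookie round trip described above can be sketched in Python with the standard `http.cookies` module. This is only an illustration, not the mechanism of any particular server: the cookie names `session_id` and `theme` and the two helper functions are assumptions introduced here.

```python
from http.cookies import SimpleCookie


def make_set_cookie_header(name: str, value: str) -> str:
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    # OutputString() renders "name=value" (plus attributes, if any were set).
    return cookie[name].OutputString()


def read_cookie_header(header_value: str) -> dict:
    """Server side, on a later request: parse the Cookie header the browser sent back."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


# Hypothetical values: the server sets one cookie, the browser later returns two.
set_cookie = make_set_cookie_header("session_id", "XYZ")
returned = read_cookie_header("session_id=XYZ; theme=dark")
print(set_cookie)
print(returned)
```

A CGI script would print the first string after `Set-Cookie: ` in its response headers and read the second form from the `HTTP_COOKIE` environment variable.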
Cookies
How do cookies work?
1. Request: Client -> Origin Server A
2. Response: Origin Server A -> Client, with header Set-Cookie: XYZ
3. Later request: Client -> Origin Server A, with header Cookie: XYZ
Internet Client-Server Systems
1. Request: Client (PC-1) -> Web Server: GET foo.html
2. Response: Web Server -> Client (PC-1): foo.html
Web Servers
Most web servers are very simple. They
just return files to the PC that requests them
Request file
Retrieve file
Send file
Dynamic Web Pages
Request service
Do Computation
Generate HTML
page with results
of computation
Return dynamically
generated HTML file
CGI and Web Forms
How to write the HTML that sends data to
the server?
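The HTTP server passes form data to the CGI program through:
Environment variables (e.g. QUERY_STRING, CONTENT_LENGTH)
stdin
Request Method: GET
A GET request consists of the request method, the resource name, and the query string.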
CGI URLs
There is a mapping between URLs and CGI
programs provided by a web server. The
exact mapping is not standardized (web
server admin can set it up)
Typically:
requests that start with /CGI-BIN/ , /cgi-bin/
or /cgi/, etc. refer to CGI programs (not to
static documents).
CGI Programs
When the user hits the “submit” button
the data is sent to the web server
The CGI program that handles it on the
web server is specified in the HTML
Form tag
firstName=lee
middleName=harvey
lastName=oswald
CGI Programs
Strengths:
A simple method to send data to the server.
Dynamically generates HTML pages.
Weaknesses
All the processing happens on the server.
Takes time to launch the CGI process on the
server.
Uses a separate process, rather than a thread, for each request.
Web Forms
Overview of Web forms
HTML form components
GET & POST methods
Server-side processing with forms
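CGI-based Web Application
1. Web Browser -> Web Server: HTTP Request
2. Web Server -> CGI Scripts/Applications -> Database: Get Data
3. Database -> CGI Scripts/Applications -> Web Server: Return data
4. Web Server -> Web Browser: HTTP Document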
Form Interaction with CGI
Web Browser Web CGI
Server Program
User requests form
Network Server
Forms
Forms work in a different and slightly more
complex way than standard HTML pages.
Forms consist of a number of separate data entry
components such as menus and text areas.
The user can select different options from the menus
and enter text in the text entry fields.
A single form can contain many text entry fields
and/or many menus.
To differentiate the menus and text areas from each
other each one is given a unique name, selected by
the Web form designer.
HTML Forms
When the user clicks the submit button, the set of variable
names & corresponding values is sent to the Web server
in an HTTP request.
The Web server can extract the set of variables & values
from the HTTP request, and can do something with them...
Example for HTML Form
Note that this form contains two variables: name & id
<html>
<head>
<title>Query Form</title>
</head>
<body>
<h2>Query Form</h2>
<form method="GET" action="doquery.php">
<p>Your name: <input name="name" type="text" size=30></p>
<p>Your ID: <input name="id" type="text" size=15></p>
<p><input type="submit" value="Submit your query"></p>
<p><input type="reset" value="Clear your query"></p>
</form>
</body>
</html>
Example for HTML Form
Two options:
GET: pass the data in an HTTP GET request
POST: pass the data in an HTTP POST request
http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234
Note how the variable names & values are appended to the URL.
Any spaces in a value are converted to +.
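As a rough sketch of the GET encoding shown above, Python's standard `urllib.parse` module can build and take apart such a URL. The helper names `build_get_url` and `parse_get_url` are made up for this illustration; a browser does the encoding itself when the form is submitted.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_get_url(action_url: str, fields: dict) -> str:
    """Append form fields to the action URL the way a GET form submission does."""
    # urlencode() joins name=value pairs with '&' and turns spaces into '+'.
    return action_url + "?" + urlencode(fields)


def parse_get_url(url: str) -> dict:
    """Recover the variable names and values from the query string."""
    query = urlsplit(url).query
    # parse_qs() returns a list per name; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(query).items()}


url = build_get_url(
    "http://www.yorku.ca/jhuang/doquery.cgi",
    {"name": "joe bloggs", "id": "1234"},
)
fields = parse_get_url(url)
print(url)
print(fields)
```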
Form Actions
The action attribute on the form tag specifies what the
Web server should do with the form data.
Common options:
email the data to someone [the mailto action]
pass the data to a script or program
For example, the CGI script could use the name & id to
look up student info in a database.
Form Actions
<form method="GET" action="mailto:jhuang@yorku.ca">
Until you can actually use scripts on the server, use the
mailto action. It operates in the same way as the mailto that
you have used in the HTML document.
POST:
REQUEST_METHOD is “POST”
CONTENT_LENGTH is the size of the query
(in bytes)
query can be read from STDIN
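A minimal sketch of how a CGI program might read POST data, assuming the server has set REQUEST_METHOD and CONTENT_LENGTH and supplies the query on standard input. The function `read_post_fields` is hypothetical, and an in-memory stream stands in for the real stdin so the example can run on its own.

```python
import io
from urllib.parse import parse_qs


def read_post_fields(environ: dict, stdin) -> dict:
    """Read exactly CONTENT_LENGTH bytes of form data from stdin and decode it."""
    if environ.get("REQUEST_METHOD") != "POST":
        raise ValueError("not a POST request")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length).decode("utf-8")
    # The body uses the same name=value&name=value encoding as a GET query string.
    return {name: values[0] for name, values in parse_qs(body).items()}


# Simulated request; a real CGI program would pass os.environ and sys.stdin.buffer.
body = b"name=joe+bloggs&id=1234"
env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
post_fields = read_post_fields(env, io.BytesIO(body))
print(post_fields)
```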
HTTP Form Processing
1. User fills in the form & clicks submit
2. Browser sends GET over the internet:
http://www.yorku.ca/jhuang/doquery.cgi?name=joe+bloggs&id=1234
3. Server runs the script doquery.cgi, passing the form data to it
4. Server sends the script results to the Browser
5. Browser displays the script results*
Text field
Password field
Radio buttons
Drop-down list
Check boxes
Text area
Buttons
Form Processing & Results
The easiest way to deal with form data is to simply email it to an
email address using a mailto form action:
The old way is to use a so-called CGI Script, usually with a URL
something like:
WWW server side: CGI program, API, SSI (script embedded in HTML), Java servlet, other program (application)
WWW HTTP client side: Java applet
Some History
SGML
HTML
HTML - Disadvantages
Limited tag set
Can’t separate the presentation from content
Can’t define structure of contents
XHTML
XHTML Basics
Very few real changes from HTML
But more strict
<body>
<p>Welcome to XHTML!</p>
</body>
</html>
Images
The value of the src attribute
XML Introduction
The Extensible Markup Language (XML) is a document
processing standard proposed by the World Wide Web
Consortium (W3C), which is related to Standard
Generalised Markup Language (SGML).
Possible to search, sort, manipulate and render XML
using Extensible Stylesheet Language (XSL).
Highly portable
Files end in the .xml extension.
XML & W3C
• XML has been in development since the 1960s through its parent called
SGML (Standard Generalized Markup Language), which is also the parent of
HTML
- www.w3.org/xml
- www.xml.com/axml/axml.html (annotated version)
XML-related Technologies
DTD (Document Type Definition) and XML Schemas are
used to define legal XML tags and their attributes for
particular purposes
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jo@theworldaccordingtojo.com</email>
</address>
</passport_details>
XML Examples
XML Source File
http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xml
http://www.xml.com/programming/
<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT phone_home (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jhuang@yorku.ca</email>
</address>
</passport_details>
Shared DTD
XML Document specifies the DTD
<?xml version='1.0'?>
<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>jo@theworldaccordingtojo.com</email>
</address>
</passport_details>
XML Examples
XML Source File
http://www.yorku.ca/jhuang/xml/04.adhoc.topics.xml
https://www.w3schools.com/xml/xml_parser.asp
B2C
• Business-to-Consumer involves sending XML directly to the client
• Data sent directly to the client needs a style (XSL) applied
• Applying style is best accomplished on the server side
Document Structure
• Three distinct parts
- Prolog <?xml version="1.0" encoding="UTF-8"?>
- Root Element
- Miscellaneous Section
[Figure: an XML document as a tree with one root element whose child elements may contain further child elements]
XML Elements
- have the same overall structure
- can contain sub-elements
PCDATA
(Parsed Character Data)
ELEMENT
NAME
Element vs. Attribute based XML
1. Element-based:
<student>
<id> 9906789 </id>
<name>Adam</name>
<email>adam@unl.ac.uk</email>
</student>
2. Mixed (id as an attribute):
<student id="9906789">
<name>Adam</name>
<email>adam@unl.ac.uk</email>
</student>
3. Attribute-based:
<student id="9906789" name="Adam" email="adam@yorku.ca"> </student>
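To make the difference concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. It reads the student id from either an element-based or an attribute-based record. The function name `student_id` and the shortened sample strings are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the element-based and attribute-based variants.
element_style = "<student><id> 9906789 </id><name>Adam</name></student>"
attribute_style = '<student id="9906789"><name>Adam</name></student>'


def student_id(xml_text: str) -> str:
    """Return the id whether it is stored as an attribute or as a child element."""
    root = ET.fromstring(xml_text)
    if "id" in root.attrib:
        # Attribute-based: the value lives on the start tag.
        return root.attrib["id"]
    # Element-based: the value is the text of the <id> child element.
    return root.findtext("id").strip()


ids = [student_id(element_style), student_id(attribute_style)]
print(ids)
```

Attribute values come back exactly as written, while element text keeps its surrounding whitespace, which is why the element branch strips it.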
<?xml version = "1.0"?>
<!-- article.xml -->
<books>
<author>
<title> Introduction to Computer Graphics </title>
<date>1995</date>
<fname>James</fname>
<lname>Foley</lname>
</author>
<author>
<title> Principles of Database Systems </title>
<date month="February">2000</date>
<fname>Greg</fname>
<lname>Riccardi</lname>
</author>
</books>
<!-- This is a list of students -->
• Prolog: the <?xml ...?> declaration on the first line.
• The document structures data with the 'books' element as the root node (root element).
• Root node contains elements (e.g. author).
• Each element further contains child nodes that describe data.
• month="February" is an attribute of the date element.
• <books>, <author>, <title> etc. are customised tags describing data.
• Miscellaneous section: the comment after the root element.
XML Syntax
• XML elements must be enclosed within start and end tags
<title> Introduction to Computer Graphics </title>
If there is no data inside the element, the tag can end with '/>'
<title/> which is the same as <title></title>
• The author grammar indicates that it is made up of four elements, defined as below:
<!ELEMENT date (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT fname (#PCDATA)>
<!ELEMENT lname (#PCDATA)>
• Each element may have attributes that contain information about its content
e.g. <date month="February">2000</date>
• An element's attribute list can be defined using the ATTLIST tag (CDATA is non-parsed character data):
syntax: <!ATTLIST element_name attribute_name type default_value>
inLineDtdExample.xml:
<books>
<author>
<date>1995</date>
<title> Introduction to Computer Graphics </title>
<fname>James</fname>
<lname>Foley</lname>
</author>
…..
</books>
DTDs - Disadvantages
• Notoriously hard to read
• Difficult to create (written in non-XML syntax; uses EBNF - Extended Backus-Naur
Form - grammar)
• No support for namespaces etc.
• Limited data types (PCDATA, CDATA)
Also study: ANY, EMPTY, Mixed Content
Program that processes XML documents
•First, create an XML document that contains the content (character data)
marked up with XML tags.
•Second, build a Document Type Definition (DTD). The DTD specifies rules
such as ordering of elements, default values, and so on.
•Third, use an XML parser that checks the XML document against the DTD and
then splits the document up into markup regions and character-data regions.
•After processing with the XML parser, the data is in a structured format
and can be processed by any XML application.
XML Parsers (or Processors)
• one of the most important layers of an XML-aware application (e.g. Firefox, IE 5+)
• input - raw XML document
• parses to ensure that the document is well-formed and/or valid (if a DTD exists),
reports errors and allows programmatic access to the document contents
• output - a data structure (the XML document is transformed)
<books>
<author>
<date>1995</date>
<title> Web IR </title>
<fname>Jimmy</fname>
<lname>Huang</lname>
</author>
</books>
Resulting tree: books -> author -> (1995, Web IR, Jimmy, Huang)
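The parser behaviour listed above (check well-formedness, report errors, hand back a tree) can be sketched with Python's standard `xml.etree.ElementTree`. Note that this parser only checks well-formedness and does not validate against a DTD. The helper `parse_or_report` and the deliberately broken `bad` document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parse_or_report(xml_text: str):
    """Return (root, None) for a well-formed document, or (None, error message)."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as err:
        return None, str(err)


good = (
    "<books><author><date>1995</date><title> Web IR </title>"
    "<fname>Jimmy</fname><lname>Huang</lname></author></books>"
)
# Not well-formed: <author> is never closed.
bad = "<books><author><date>1995</date></books>"

root, error = parse_or_report(good)
# Walk the resulting tree: books -> author -> leaf values.
leaf_values = [child.text.strip() for child in root.find("author")]
_, bad_error = parse_or_report(bad)

print(root.tag, leaf_values)
print("error reported for bad document:", bad_error is not None)
```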
Parsing XML Documents
•Parsers can support the Document Object Model (DOM) and Simple API
for XML (SAX) for accessing a document's content programmatically using
languages such as Java, C, C++, Python etc.
•A SAX-based parser processes the document and generates events (i.e.
notifications to the application) when tags, comments etc. are
encountered. These events return data from the XML document.
(used to read XML documents only;
SAX is attractive for handling large documents because it is not required
to load the entire document)
DOM (Document Object Model)
•A DOM-based parser exposes a programmatic library called the DOM
API that allows data in an XML document to be accessed and modified by
manipulating the nodes in a DOM tree. DOM API is available in many
languages e.g. JavaScript.
• Data can be accessed quickly as all the document’s data is in memory.
•The DOM interfaces for creating and manipulating XML documents are
platform and language independent. DOM parsers exist for Java, C, C++,
Python and Perl.
•JDOM provides a higher-level API than the W3C DOM for working with
XML documents in Java. See www.jdom.org
- provides full tree representation of the XML document
- allows random access to any node
- provides a variety of output formats
- less memory intensive than DOM API
• In order to use DOM API, programming experience is required.
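As one possible illustration of DOM-style access, the sketch below uses Python's standard `xml.dom.minidom`, which loads the whole document into memory and lets the program read and modify nodes. The sample document and the new title text are assumptions chosen for the example.

```python
from xml.dom.minidom import parseString

# The whole document is parsed into an in-memory DOM tree.
doc = parseString(
    "<books><author><title>Web IR</title><date>1995</date></author></books>"
)

# Random access: jump straight to the first <title> node and read its text node.
title = doc.getElementsByTagName("title")[0]
original = title.firstChild.data

# Modify the tree in place: change the text and append a new child element.
title.firstChild.data = "Web Information Retrieval"
lname = doc.createElement("lname")
lname.appendChild(doc.createTextNode("Huang"))
doc.getElementsByTagName("author")[0].appendChild(lname)

# Serialize the modified tree back to XML text.
result = doc.documentElement.toxml()
print(original)
print(result)
```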
SAX (Simple API for XML)
• Developed by the members of the XML-DEV mailing list
• Released in May 1998
•SAX and DOM are totally different APIs for accessing information in
XML documents.
•SAX based parsers invoke methods when markup (e.g. a start tag,
end tag etc.) is encountered. With this event based model, no tree
structure is created to store data. Instead, data is passed to the
application from the XML document as it is found.
=> greater performance and less memory overhead than with DOM
•Many DOM parsers use a SAX parser to retrieve data for building the
DOM tree.
•SAX parsers are typically used for reading documents that will not be
modified.
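The event-based model can be sketched with Python's standard `xml.sax` module. The handler class `TitleCollector` is a made-up example: the parser calls its methods as start tags, character data and end tags are encountered, and no tree is built.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Records every start/end event and collects the text of <title> elements."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Called by the parser when a start tag is encountered.
        self.events.append(("start", name))
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so buffer it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Called by the parser when an end tag is encountered.
        self.events.append(("end", name))
        if name == "title":
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


handler = TitleCollector()
xml.sax.parseString(
    b"<books><author><title> Web IR </title></author></books>", handler
)
print(handler.titles)
print(handler.events)
```

Only the data the handler chooses to keep stays in memory, which is the reason SAX suits large, read-only documents.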
Parsing (msxml) and rendering
XML with IE
• XML document contains data, NOT formatting information.
•When XML document is loaded into IE5+, the document is
parsed by msxml.
•If the document is well-formed, the parser makes the
document’s data available to the application (i.e. IE5).
•The application can format and render the data and also
perform other processing.
•IE5 renders data by applying a stylesheet that formats and
colours the markup identically to the original document.
•Notice the - sign. It indicates that child elements are visible.
When clicked, it becomes + hiding the children.
•This behaviour is similar to viewing disk directory structure
using a program such as Windows Explorer.
Using XML:
How does browser read XML ?
XML parser: A tool for reading XML documents.
To manipulate an XML document, you need an XML
parser. The parser loads the document into your
computer's memory. Once the document is loaded,
its data can be manipulated using the DOM. The
DOM treats the XML document as a tree.
Once you have installed Internet Explorer 5.0, the
Microsoft XML parser is available.
http://www.w3schools.com/xml/xml_parser.asp
https://developer.mozilla.org/en-US/docs/Archive/Mozilla/XML_in_Mozilla (XML in Mozilla)
Using XML: Presenting Data
<lastname>Smith</lastname>
<b>Smith</b> Smith
Extensible Stylesheet Language (XSL)
• XML is just data - no presentation information
• To present the data on the screen or paper or any media - apply appropriate style
• Style sheets contain rules that instruct the processor how to present elements
• Two style languages: CSS (Cascading Style Sheets) and XSL
• XSL is more powerful than CSS and an excellent solution to control the presentation of
data
- resource intensive: memory and processing power
- complex to write
• transforms and translates XML data from one format into another,
e.g. when the same document needs to be displayed in HTML, PDF and PostScript form
CSS and XSL
CSS - Cascading Style Sheets
can predefine HTML display (font etc.)
these are shared and reused
<xsl:template match="EmployeeRecord/Name">
<Bold>
<xsl:apply-templates select="FirstName"/>
</Bold>
</xsl:template>
The template is applied only to the 'FirstName' element of the 'Name'
element contained in 'EmployeeRecord'.
Options for Displaying XML
XSL XSL HTML
Transformation Transformation Document Web Browser
spec
Boeing
<student_list>
<student>
<id> 9906789 </id>
<name>Adam</name>
<email>adam@unl.ac.uk</email>
<bsc level="final">yes</bsc>
</student>
<student>
<id> 9806791 </id>
<name>Adrian</name>
<email>adrian@unl.ac.uk</email>
<bsc>no</bsc>
</student>
</student_list>
• Only data
• Data is self-describing
• custom tags describe content (define your own tags)
• easy to locate data (e.g. all BSC students)
The Framework of WWW
HTML
Web Designer External Applications
Authoring Non-HTTP objects
& Publisher
Tools/Editors
• JAVA Servlet
• CGI (Perl)
• ASP & ASP.NET
• Java Server Pages
• Java Applet
• JavaScript
Web Programmer
Web
Browser
Internet
Global Reach
Broad Range Web
Server
Client
End User Web Master
Why Build Pages Dynamically?
The Web page is based on data submitted by the user
E.g., results page from search engines and order-
confirmation pages at on-line stores
The Web page is derived from data that changes
frequently
E.g., a weather report or news headlines page
The Web page uses information from databases or
other server-side sources
E.g., an e-commerce site could use a servlet to build a
Web page that lists the current price and availability of
each item that is for sale
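The e-commerce case can be sketched as a function that builds the HTML at request time from current data. Python stands in here for the servlet mentioned above. The function `render_price_page`, the item list and the page layout are all assumptions made for illustration, and a real site would read the items from a database.

```python
from html import escape


def render_price_page(items) -> str:
    """Generate an HTML page listing price and availability for each item."""
    rows = "\n".join(
        # escape() keeps characters such as '&' from breaking the generated markup.
        f"<tr><td>{escape(name)}</td><td>{price:.2f}</td>"
        f"<td>{'yes' if in_stock else 'no'}</td></tr>"
        for name, price, in_stock in items
    )
    return (
        "<html><head><title>Price List</title></head><body>\n"
        "<table>\n<tr><th>Item</th><th>Price</th><th>In stock</th></tr>\n"
        f"{rows}\n</table>\n</body></html>"
    )


# Hypothetical data standing in for a database query result.
page = render_price_page([("Pens & pencils", 2.5, True), ("Notebook", 4.0, False)])
print(page)
```

Each request can produce a different page because the table rows are computed from whatever data is current when the function runs.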