
WEB TECHNOLOGY

Introduction to Web Technologies

ASP:-
Microsoft's Active Server Pages (ASP) technology provides a
framework for building dynamic HTML pages which enable Internet and
Intranet applications to be interactive.

Databases:-
Simply put, a database is a computerized record keeping
system. More completely, it is a system involving data, the hardware that
physically stores that data, the software that utilizes the hardware's file
system in order to 1) store the data and 2) provide a standardized method
for retrieving or changing the data, and finally, the users who turn the data
into information.

Design:-
Web site design is more than just some text and pretty
graphics.

DHTML:-
"Dynamic HTML" is typically used to describe the combination
of HTML, style sheets and scripts that allows documents to be animated.
Dynamic HTML allows a Web page to change after it's loaded into the
browser --there doesn't have to be any communication with the Web server
for an update. You can think of it as 'animated' HTML. For example, a piece
of text can change from one size or color to another, or a graphic can move
from one location to another, in response to some kind of user action, such
as clicking a button.

Graphics:-
Resources, demos and tutorials on the basics of graphics
design and construction, including integrating images into your Web pages.

HTML:-
Hypertext Markup Language is the fundamental building block of
the Web. We present several articles, tutorials, and references on HTML.

Multimedia:-
Multimedia makes your Web sites come alive. These
tutorials can help make that happen.

Perl:-
An interpreted language often used in CGI programs and for handling text files.

PHP:-
An open-source server-parsed embedded scripting language.

Usability:-
Make certain that your Web sites are "user-friendly".

Web Programming & Scripting:-
Static HTML pages can be supplemented with server-side or client-side
programming, or 'scripting' as it is often known. Currently the main
languages for this are Perl, Java, and JavaScript, sometimes used in
combination with CGI.

XML:-
Extensible Markup Language (XML) is a human-readable, machine-
understandable, general syntax for describing hierarchical data, applicable
to a wide range of applications (databases, e-commerce, Java, Web
development, searching, etc.).

Internet Basics: The Internet

What Is the Internet and why should I become a part of it?

The Internet is revolutionizing and enhancing the way we as humans
communicate, both locally and around the globe. Simply put, the Internet is a
network of linked computers allowing participants to share information on
those computers. You should want to be a part of it because the Internet
literally puts a world of information and a potential worldwide audience at
your fingertips.

Internet History: The Internet's roots can be traced to the 1950s with the
launch of Sputnik, the ensuing space race, the Cold War and the development
of ARPAnet (the network of the Department of Defense's Advanced Research
Projects Agency), but it really took off in the 1980s when the National
Science Foundation built on ARPAnet technology to link its five regional
supercomputer centers. From there
evolved a high-speed backbone of Internet access for many other types of
networks, universities, institutions, bulletin board systems and commercial
online services. The end of the decade saw the emergence of the World
Wide Web, which heralded a platform-independent means of communication
enhanced with a pleasant and relatively easy-to-use graphical interface.

Internet Activity: The information superhighway is literally buzzing with
activity as Internet pipelines pump out all manner of files, movies, sounds,
programs, video, e-mail, live chat, you name it. Yet amid all this activity there
are always two key players in every transaction: a server and a client.

Servers are computers with a 24-hour Internet connection that provide
access to their files and programs. These can be but are not limited to
educational institutions, commercial companies, organizations, government or
military organizations, Internet access providers and various other computer
networks of all sizes.

Clients are software programs (and the people on remote computers using
the software!) used to access files on a server (typically, a Web browsing
program such as Netscape Navigator or an e-mail program such as Eudora).

Servers are typically located and organized by IP address and domain.

An IP address (IP stands for Internet Protocol) is a specific set of numbers
referring to a server's exact location on a network. Most domains have their
own IP address; for instance, 192.41.20.33 is the IP address of my server at
webcurrent.com. You can type those numbers in to get there, but the domain
is easier to remember. An IP address also leaves your fingerprints wherever
you "surf" on the net. Each modem connection typically is assigned a
specific IP address at Internet providers (this number typically changes
dynamically as users log in), so you never really surf the net anonymously.
You can be traced to a point.

A domain is part of the server's official name on the network, an alias for
the less descriptive IP numbers. Domains are organized by type of
organization (a three-letter suffix) and by country (a two-letter suffix
which defaults to the U.S. if no suffix is specified). You can tell a lot about a
server by looking at its domain name.

Here are some typical organizational suffixes: com=commercial,
edu=educational, gov=government, int=international, mil=military,
net=network, org=organization.

Here are some country codes: au=Australia, at=Austria, be=Belgium,
br=Brazil, dk=Denmark, jp=Japan, nz=New Zealand, ru=Russian Federation,
uk=United Kingdom, ch=Switzerland.
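
The suffix lists above can be turned into a small lookup. The following is only a sketch: the function name describe_domain is made up for illustration, and the two tables contain just the suffixes listed in the text, so most real domains would come back as "unknown".

```python
# Suffix tables copied from the lists in the text; illustrative, not complete.
ORGANIZATION_SUFFIXES = {
    "com": "commercial", "edu": "educational", "gov": "government",
    "int": "international", "mil": "military", "net": "network",
    "org": "organization",
}
COUNTRY_CODES = {
    "au": "Australia", "at": "Austria", "be": "Belgium", "br": "Brazil",
    "dk": "Denmark", "jp": "Japan", "nz": "New Zealand",
    "ru": "Russian Federation", "uk": "United Kingdom", "ch": "Switzerland",
}


def describe_domain(domain):
    """Describe a domain by its last label (hypothetical helper)."""
    # Take the text after the final dot, ignoring case and a trailing dot.
    suffix = domain.lower().rstrip(".").rsplit(".", 1)[-1]
    if suffix in ORGANIZATION_SUFFIXES:
        return ORGANIZATION_SUFFIXES[suffix]
    if suffix in COUNTRY_CODES:
        return COUNTRY_CODES[suffix]
    return "unknown"
```

For example, describe_domain("webcurrent.com") looks up "com" in the organizational table, while a name ending in ".uk" falls through to the country table.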

Internet Equipment: Computer System Requirements

To access the Internet, you need the following minimum configuration (as of
spring '97). You can sometimes make do with less but you'll notice
shortcomings.

A computer with at least 16 megabytes (32 or more is even better) of RAM.
Your computer will need to run several large programs at once, so it needs all
the brain power (RAM) it can muster. RAM is the best performance-booster
you can add to your computer.
A color monitor that displays at least 16-bit or thousands of colors. Better:
24-bit or millions of colors. This will be a factor of the VRAM in your
system. In most cases you'll need at least 2-4 megs of VRAM for millions of
colors on a larger monitor. Also, the higher your monitor resolution, the more
web "real estate" will fit on your screen. I have never regretted the day I
splurged on a 17-inch monitor at 1024 x 768 pixels in millions of colors. I
wonder why I waited so long! Now I have much less eye strain and can easily
"surf" two full web pages side by side.
A modem with a speed of at least 28.8 kbps (although 56k technology is now
standard). You're even luckier if your area has affordable access to ISDN
lines or cable modems. Another option is DirecPC, which offers
incredibly high download speeds (albeit at a price) via satellite, although
uploads are limited to your home modem speed. Access just keeps getting
better and better.
A hard disk with plenty of storage space (think of your hard disk as a big
filing cabinet) for the software you will use and all those files you'll be
downloading. If you're getting a new machine, get at least 6 gigabytes for
your hard disk; I am now finding space to be cramped on an 8-gig drive. I
guess such is the case: the more you have, the more you need (or think you
need). The latest browser downloads are about 12 megabytes.
Sound capabilities, stereo speakers. Since the World Wide Web is also a
multimedia medium (that's a mouthful!), you'll also enjoy having sound on
your system. Mine are built in to the monitor, which is great if you don't
have a lot of desk space.
Also nice is a separate telephone line so your phone won't be busy every time
you're online. I use mine for both Internet access and my computer fax.
Another necessity these days is a CD-ROM drive (standard now on most PCs)
because there is so much software available on CD-ROM and installations
from high-capacity CDs are much easier and faster. You're really in for a
treat if your computer's CD-ROM is actually a DVD drive. I was amazed and
left hungry for the Apple Macintosh G3 PowerBook I spied at a computer
store last Christmas with Chevy Chase's "Christmas Vacation" playing on its
huge (for a laptop) 14-inch screen. In my opinion, these drives are still too
expensive although my next computer will definitely have one!
This may be obvious, but a printer is handy. I like color inkjets. Mine is an
Epson Stylus Photo 1200, which I love for its photorealistic and oversize
capabilities. But there's no beating the speed of a laser printer. Nice for
printing out e-mail and Web pages.
For the curious, I run my business with a Macintosh OS/Power Computing
PowerTowerPro with 400-MHz G3 PowerLogix daughter card processor (this
machine has been upgradable for almost three years and was a great buy), 4-
gig and 2-gig hard disks with 8 megs of VRAM, 368 megs of RAM, a 24x CD-
ROM and a built-in Iomega Zip drive that I installed myself. I also run
Windows 98 via SoftWindows on my spare hard disk. My mobile office
consists of a PowerBook G3 laptop running along at 300 MHz.
Before you buy a computer, be sure to check around. Prices vary
greatly. You may decide to pay a little more for a system from your local
dealer if you feel more comfortable with their support. Many systems that
were totally out of reach only months ago are more affordable than ever.
Most multimedia computer packages now include massive hard drives (10+
gigabytes), tons of RAM (64 megs or more) and a super-quick (24x or faster)
CD-ROM drive or DVD drive. Processors of 400 MHz and faster are fast
becoming the norm. (Every few months, it seems, I go through this
paragraph and increase all the numbers!) Faster processing speeds will
improve performance, especially when you are trying to listen to RealAudio
streams or play video files. While you're at it, you may be interested in my
favorite add-on, the little Iomega Zip drive that holds 100-megabyte
disks. It's affordable and great for backups. Also, look for computers that
will accept processor upgrades or dual processors. But don't be discouraged
if you don't have a lightning-fast model. My old 33MHz Performa could
cruise the Web quite nicely by and large (especially since it had lots of
RAM). But if you're buying a new model, get as fast a processor and as big a
hard disk as you can possibly afford. Then put twice as much RAM in it as
you think you need. You won't be sorry.

Internet Access: The best news of all with the explosive growth of the
Internet and its accompanying revolution in human communications is this:
Almost anyone anywhere can access the Internet and thus any Internet file in
the world via a simple local telephone connection with no long-distance fees.

CLIENT/SERVER MODEL OF COMPUTING

This section describes client/server computing.

To truly understand how much of the Internet operates, including the Web,
it is important to understand the concept of client/server computing. The
client/server model is a form of distributed computing where one program
(the client) communicates with another program (the server) for the purpose
of exchanging information.

The client's responsibility is usually to:

1. Handle the user interface.
2. Translate the user's request into the desired protocol.
3. Send the request to the server.
4. Wait for the server's response.
5. Translate the response into "human-readable" results.
6. Present the results to the user.

The server's functions include:

1. Listen for a client's query.
2. Process that query.
3. Return the results back to the client.

A typical client/server interaction goes like this:

1. The user runs client software to create a query.
2. The client connects to the server.
3. The client sends the query to the server.
4. The server analyzes the query.
5. The server computes the results of the query.
6. The server sends the results to the client.
7. The client presents the results to the user.
8. Repeat as necessary.
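
The interaction steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both programs run in one process, socket.socketpair() stands in for a real network connection, and "processing" the query just means converting it to upper case. The names serve_one_query and run_query are made up for this example.

```python
import socket
import threading


def serve_one_query(conn):
    """Server side: listen for one query, process it, return the result."""
    with conn:
        query = conn.recv(1024).decode("utf-8")   # listen for the client's query
        result = query.upper()                    # stand-in for real processing
        conn.sendall(result.encode("utf-8"))      # return the results


def run_query(query):
    """Client side: send one query and return the server's reply as text."""
    # A connected pair of sockets plays the role of the network connection.
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one_query, args=(server_end,))
    server.start()
    with client_end:
        client_end.sendall(query.encode("utf-8"))  # send the request
        reply = client_end.recv(1024)              # wait for the response
    server.join()
    return reply.decode("utf-8")                   # translate for the user
```

Calling run_query("hello") sends the text to the server thread and hands back whatever the server computed. A real client and server would run on different machines and agree on a protocol such as HTTP or FTP.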

A typical client/server interaction

This client/server interaction is a lot like going to a French restaurant. At
the restaurant, you (the user) are presented with a menu of choices by the
waiter (the client). After making your selections, the waiter takes note of
your choices, translates them into French, and presents them to the French
chef (the server) in the kitchen. After the chef prepares your meal, the
waiter returns with your dinner (the results). Hopefully, the waiter returns
with the items you selected, but not always; sometimes things get "lost in
the translation."

Flexible user interface development is the most obvious advantage of
client/server computing. It is possible to create an interface that is
independent of the server hosting the data. Therefore, the user interface
of a client/server application can be written on a Macintosh and the server
can be written on a mainframe. Clients could also be written for DOS- or
UNIX-based computers. This allows information to be stored in a central
server and disseminated to different types of remote computers. Since the
user interface is the responsibility of the client, the server has more
computing resources to spend on analyzing queries and disseminating
information. This is another major advantage of client/server computing; it
tends to use the strengths of divergent computing platforms to create more
powerful applications. Although its computing and storage capabilities are
dwarfed by those of the mainframe, there is no reason why a Macintosh
could not be used as a server for less demanding applications.

In short, client/server computing provides a mechanism for disparate computers to
cooperate on a single computing task.

Middleware

Purpose and Origin


Middleware is connectivity software that consists of a set of
enabling services that allow multiple processes running on one
or more machines to interact across a network. Middleware is
essential to migrating mainframe applications to client/server
applications and to providing for communication across
heterogeneous platforms. This technology has evolved during
the 1990s to provide for interoperability in support of the
move to client/server architectures. The most widely
publicized middleware initiatives are the Open Software
Foundation's Distributed Computing Environment (DCE),
Object Management Group's Common Object Request Broker
Architecture (CORBA), and Microsoft's COM/DCOM.

Technical Detail
As outlined in Figure 1, middleware services are sets of
distributed software that exist between the application and
the operating system and network services on a system node
in the network.

Figure 1: Use of Middleware

Middleware services provide a more functional set of
Application Programming Interfaces (API) than the operating
system and network services to allow an application to

• locate transparently across the network, providing


Hypertext
The Web is based on the concept of hypertext. A hypertext document
allows readers to access information in the document in almost any order,
unlike a traditional linear text which is intended to be read from beginning
to end. For example, suppose you are reading a research paper on
computational fluid dynamics. A traditional paper might have an introduction,
a definition of terms, some equations, some algorithms that implement the
equations, and a discussion of the performance of the algorithms on one or
more supercomputers. If you are an expert in this area, you might want to
skip the definitions and equations and go right to the algorithms. As you are
reading that section you might decide you shouldn't have skipped the
equations after all, and want to go back and read them. Hypertext is
intended to make these random movements through the document easier for
the reader. The idea is to give the reader more of a role in deciding how to
use the document and to make the document flexible enough to be used in
several ways.

Hypertext documents are made possible by computer technology. Documents
are displayed on a computer console, and readers use a mouse and/or
keyboard to navigate through the document. Random movements between
sections are implemented by links which are activated when the reader
clicks a mouse button. The program that displays a hypertext document and
uses mouse clicks to move around in the document is sometimes called a
browser or navigator.

A more interesting example of using links to navigate through text is a case
where a person who is new to an area comes across an equation and needs to
learn more about it; the browser could connect them to a completely
separate document that has background information the reader could study
before resuming the main text. The really interesting possibilities for
hypertext will take full advantage of the underlying computer system. A
text that introduces equations might have links to a symbolic math package
such as Mathematica, which could help students derive the equations from
initial conditions or assumptions. Readers could ask the system to plot the
equation for a given range of inputs, and then plot another equation for
comparison. Complex figures could be scaled and rotated by computer aided
design programs instead of lying flat on a page. Data sets could be fetched
over the network from the authors, and readers could perform their own
statistical analyses. In general, readers will be able to interact with the
document rather than simply consume information.

Hypermedia generalizes the notion of hypertext to include sound and
interactive video in documents. For example, a document on Mozart could
include portions of his music which readers could listen to simply by clicking
the mouse on a link in the main document. Several publishers have begun to
produce hypermedia versions of their products. For example, there is a
version of Grolier's Encyclopedia available on CD-ROM that has sound and
movie clips included with the text. If you read about US presidents, you will
see pictures of the presidents. Some of these pictures have links that will
play a movie (complete with sound) showing part of that president's
inaugural address.

Mosaic is a hypermedia browser for X window systems. It displays
information in a window on a workstation console, and uses keystrokes and
mouse clicks to navigate through the document. Documents can be read as
normal text files by using a scroll bar and mouse buttons to move back and
forth in the text. Clicking the mouse on a link will either move you to another
location within the current document or call up a new document. If
necessary, Mosaic will use the Internet to fetch a document when you select
it; for example, if you are viewing a document on an art exhibit that has a
link that refers to a particular piece, when you click the mouse over the link
Mosaic will invoke FTP to copy the image and then invoke xv to display it in
another window. It is the ability to call external software (such as FTP and
xv) that makes Mosaic a hypermedia software package: if you select a movie
or piece of music, Mosaic will invoke software to play the movie or sound.

File Transfer Protocol


FTP or File Transfer Protocol is used to transfer data from one computer to
another over the Internet, or through a network.

Specifically, FTP is a commonly used protocol for exchanging files over any
network that supports the TCP/IP protocol (such as the Internet or an
intranet). There are two computers involved in an FTP transfer: a server and
a client. The FTP server, running FTP server software, listens on the
network for connection requests from other computers. The client
computer, running FTP client software, initiates a connection to the server.
Once connected, the client can do a number of file manipulation operations
such as uploading files to the server, downloading files from the server,
renaming or deleting files on the server and so on. Any software company or individual
programmer is able to create FTP server or client software because the
protocol is an open standard. Virtually every computer platform supports the
FTP protocol. This allows any computer connected to a TCP/IP based
network to manipulate files on another computer on that network regardless
of which operating systems are involved (if the computers permit FTP
access). There are many existing FTP client and server programs. FTP
servers can be set up on many kinds of machines, including game servers,
voice servers, Internet hosts, and other physical servers.

Overview

FTP runs exclusively over TCP. FTP servers by default listen on port 21 for
incoming connections from FTP clients. A connection to this port from the
FTP client forms the control stream on which commands are passed to the
FTP server from the FTP client and on occasion from the FTP server to the
FTP client. For the actual file transfer to take place, a different connection
is required which is called the data stream. Depending on the transfer mode,
the process of setting up the data stream is different.

In active mode, the FTP client opens a random port (> 1023), sends the FTP
server the random port number on which it is listening over the control
stream and waits for a connection from the FTP server. When the FTP
server initiates the data connection to the FTP client it binds the source
port to port 20 on the FTP server.

In order to use active mode, the client sends a PORT command, with the IP
and port as argument. The format for the IP and port is "h1,h2,h3,h4,p1,p2".
Each field is a decimal representation of 8 bits of the host IP, followed by
the chosen data port. For example, a client with an IP of 192.168.0.1,
listening on port 1025 for the data connection will send the command “PORT
192,168,0,1,4,1”. The port fields should be interpreted as 4*256 + 1 = 1025.
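
The arithmetic behind the PORT argument can be checked with a short sketch. The two helper names below are made up for illustration and are not part of any FTP library; the code only converts between an (IP, port) pair and the "h1,h2,h3,h4,p1,p2" text described above.

```python
def format_port_argument(ip, port):
    """Build the 'h1,h2,h3,h4,p1,p2' argument for a PORT command."""
    high, low = divmod(port, 256)          # split the 16-bit port into two bytes
    return ",".join(ip.split(".") + [str(high), str(low)])


def parse_port_argument(argument):
    """Recover (ip, port) from an 'h1,h2,h3,h4,p1,p2' argument."""
    fields = [int(field) for field in argument.split(",")]
    ip = ".".join(str(field) for field in fields[:4])
    port = fields[4] * 256 + fields[5]     # high byte * 256 + low byte
    return ip, port
```

With the example from the text, format_port_argument("192.168.0.1", 1025) gives "192,168,0,1,4,1", and parsing that string returns the original address and port 1025.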

In passive mode, the FTP server opens a random port (> 1023), sends the
FTP client the server's IP address to connect to and the port on which it is
listening (a 16 bit value broken into a high and low byte) over the control
stream and waits for a connection from the FTP client. In this case the FTP
client binds the source port of the connection to a random port greater than
1023.

To use passive mode, the client sends the PASV command to which the
server would reply with something similar to "227 Entering Passive Mode
(127,0,0,1,78,52)". The syntax of the IP address and port is the same as
for the argument to the PORT command.
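
A client has to pull the address and port out of the 227 reply before it can open the data connection. The sketch below shows one way to do that with a regular expression; the function name parse_pasv_reply is an assumption for this example, and real servers may word the text before the parentheses differently.

```python
import re


def parse_pasv_reply(reply):
    """Extract (ip, port) from a '227 Entering Passive Mode (...)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a recognizable 227 reply: %r" % reply)
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The last two fields are the high and low bytes of the port number.
    return "%d.%d.%d.%d" % (h1, h2, h3, h4), p1 * 256 + p2
```

For the reply quoted above, the port works out to 78*256 + 52 = 20020 on address 127.0.0.1.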

In extended passive mode, the FTP server operates exactly as in
passive mode; however, it transmits only the port number (not broken into
high and low bytes) and the client is to assume that it connects to the same
IP address that was originally connected to. Extended passive mode was
added by RFC 2428 in September 1998.
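
For comparison with the 227 reply, here is a sketch of reading the port from an extended passive mode reply. It assumes the reply has the form given in RFC 2428, for example "229 Entering Extended Passive Mode (|||6446|)", and that the server uses "|" as the delimiter (the RFC permits other delimiter characters, which this sketch does not handle). The function name is made up for illustration.

```python
import re


def parse_epsv_reply(reply):
    """Extract the data port from a '229 ... (|||port|)' reply."""
    match = re.search(r"\(\|\|\|(\d+)\|\)", reply)
    if match is None:
        raise ValueError("not a recognizable 229 reply: %r" % reply)
    # Only the port is sent; the client reuses the control connection's address.
    return int(match.group(1))
```

Because no address is returned, the client connects to the same IP address it used for the control stream.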

While data is being transferred via the data stream, the control stream sits
idle. This can cause problems with large data transfers through firewalls
which time out sessions after lengthy periods of idleness. While the file may
well be successfully transferred, the control session can be disconnected by
the firewall, causing an error to be generated.

The FTP protocol supports resuming of interrupted downloads using the
REST command. The client passes the number of bytes it has already
received as argument to the REST command and restarts the transfer. In
some command-line clients, for example, there is an often-ignored but
valuable command, "reget" (meaning "get again") that will cause an
interrupted "get" command to be continued, hopefully to completion, after a
communications interruption.

Resuming uploads is not as easy. Although the FTP protocol supports the
APPE command to append data to a file on the server, the client does not
know the exact position at which a transfer got interrupted. It has to obtain
the size of the file some other way, for example over a directory listing or
using the SIZE command.

In ASCII mode (see below), resuming transfers can be troublesome if client
and server use different end-of-line characters.
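
Resuming a download as described above might look like the following sketch, which uses the rest argument of retrbinary from Python's standard ftplib module. It assumes ftp is an already connected and logged-in ftplib.FTP object and that the partial local file is an exact prefix of the remote file; the function name resume_download is made up for this example.

```python
import os


def resume_download(ftp, remote_name, local_path):
    """Continue an interrupted binary download; return the resume offset."""
    # The number of bytes already received is the size of the partial file.
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with open(local_path, "ab") as local_file:   # append to what we already have
        # When rest is given, ftplib sends "REST <offset>" before the RETR.
        ftp.retrbinary("RETR " + remote_name, local_file.write,
                       rest=offset or None)
    return offset
```

retrbinary transfers in binary (image) mode, which avoids the end-of-line problem mentioned for ASCII mode.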

The objectives of FTP, as outlined by its RFC, are:

• To promote sharing of files (computer programs and/or data).
• To encourage indirect or implicit use of remote computers.
• To shield a user from variations in file storage systems among
different hosts.
• To transfer data reliably and efficiently.

TELNET
TELNET (TELetype NETwork) is a network protocol used on the Internet or
local area network (LAN) connections. It was developed in 1969 beginning
with RFC 15 and standardized as IETF STD 8, one of the first Internet
standards. It has limitations that are considered to be security risks.

The term telnet also refers to software which implements the client part of
the protocol. TELNET clients have been available on most UNIX systems for
many years and are available for virtually all platforms. Most network
equipment and OSs with a TCP/IP stack support some kind of TELNET
server for their remote configuration (including ones based on
Windows NT). Recently, SSH has begun to dominate remote access for
Unix-based machines.

"To telnet" is also used as a verb meaning to establish or use a TELNET or
other interactive TCP connection, as in, "To change your password, telnet to
the server and run the passwd command".

Most often, a user will be telnetting to a UNIX-like server system or a simple
network device such as a switch. For example, a user might "telnet in from
home to check his mail at school". In doing so, he would be using a telnet
client to connect from his computer to one of his servers. Once the
connection is established, he would then log in with his account information
and execute operating system commands remotely on that computer, such as
ls or cd.

TELNET is a client-server protocol, based on a reliable connection-oriented
transport. Typically this is TCP port 23, although TELNET predates TCP/IP
and was originally run on NCP.

The protocol has many extensions, some of which have been adopted as
Internet Standards. IETF standards STD 27 through STD 32 define various
extensions, most of which are extremely common. Other extensions are on
the IETF standards track as proposed standards.

Usenet
Usenet is one of the oldest computer network communications systems still
in widespread use. It was established in 1980, following experiments from
the previous year, over a decade before the World Wide Web was
introduced and the general public got access to the Internet. It was
originally conceived as a "poor man's ARPANET," employing UUCP to offer
mail and file transfers, as well as announcements through the newly
developed news software. This system, developed at University of North
Carolina at Chapel Hill and Duke University, was called USENET to
emphasize its creators' hope that the USENIX organization would take an
active role in its operation (Daniel et al, 1980).

The articles that users post to Usenet are organized into topical categories
called newsgroups, which are themselves logically organized into hierarchies
of subjects. For instance, sci.math and sci.physics are within the sci
hierarchy, for science. When a user subscribes to a newsgroup, the news
client software keeps track of which articles that user has read.

In most newsgroups, the majority of the articles are responses to some
other article. The set of articles which can be traced to one single non-reply
article is called a thread. Most modern newsreaders display the articles
arranged into threads and subthreads, making it easy to follow a single
discussion in a high-volume newsgroup.
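
The definition of a thread can be sketched as follows. This is a simplification for illustration: each article is assumed to record only the identifier of the single article it replies to (None for a non-reply), whereas real articles carry this information in a References header. The function names and the sample identifiers are made up.

```python
def find_thread_root(article_id, parent_of):
    """Follow reply links upward until a non-reply article is reached."""
    while parent_of[article_id] is not None:
        article_id = parent_of[article_id]
    return article_id


def group_into_threads(parent_of):
    """Map each root article to the list of articles in its thread."""
    threads = {}
    for article_id in parent_of:
        root = find_thread_root(article_id, parent_of)
        threads.setdefault(root, []).append(article_id)
    return threads


# Hypothetical newsgroup: "b" replies to "a", "c" replies to "b", and so on.
replies = {"a": None, "b": "a", "c": "b", "d": None, "e": "d"}
threads = group_into_threads(replies)
```

Here the five articles fall into two threads, one rooted at "a" and one rooted at "d". A newsreader would go one step further and keep the reply structure within each thread so it can show subthreads.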

When a user posts an article, it is initially only available on that user's news
server. Each news server, however, talks to one or more other servers (its
"newsfeeds") and exchanges articles with them. In this fashion, the article
is copied from server to server and (if all goes well) eventually reaches every
server in the network. The later peer-to-peer networks operate on a similar
principle; but for Usenet it is normally the sender, rather than the receiver,
who initiates transfers. Some have noted that this seems a monstrously
inefficient protocol in the era of abundant high-speed network access.
Usenet was designed for a time when networks were much slower, and not
always available. Many sites on the original Usenet network would connect
only once or twice a day to batch-transfer messages in and out.
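
The server-to-server copying described above behaves like a flood through a graph of newsfeeds. The sketch below simulates that idea only; it is not the actual news transfer protocol. The function name propagate and the server names are made up, and each server simply offers the article to every peer that does not yet have it.

```python
from collections import deque


def propagate(feeds, origin):
    """Return the set of servers an article reaches, starting at origin.

    feeds maps each server name to the list of servers it sends articles to.
    """
    have_article = {origin}
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        for peer in feeds.get(server, []):
            if peer not in have_article:      # skip servers that already have it
                have_article.add(peer)
                queue.append(peer)
    return have_article


# Hypothetical network: "d" has no feed from anyone, so it never gets the article.
feeds = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
reached = propagate(feeds, "a")
```

In this example the article posted on "a" reaches "b" and "c" but not "d", which matches the "if all goes well" caveat: a server only receives what its newsfeeds pass along.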

In the early days, many articles ended with a notice disclosing whether the
author had any financial motive, or an axe to grind, in posting about a
product or issue. That was back when the community was the pioneering
computer society.

Usenet has significant cultural importance in the networked world, having
given rise to, or popularized, many widely recognized concepts and terms
such as "FAQ" and "spam."

Today, almost all Usenet traffic is carried over the Internet. The current
format and transmission of Usenet articles is very similar to that of
Internet email messages. However, Usenet articles are posted for general
consumption; any Usenet user has access to all newsgroups, unlike email,
which requires a list of known recipients.

Today, Usenet has diminished in importance with respect to mailing lists,
web forums and weblogs. The difference, though, is that Usenet requires no
personal registration with the group concerned, that information need not
be stored on a remote server, that archives are always available, and that
reading the messages requires no mail or web client, but a news client
(included in many modern e-mail clients).

Gopher
Gopher is a distributed document search and retrieval network protocol
designed for the Internet. Its goal was similar to that of the World Wide
Web, and it has been almost completely displaced by the Web.

The Gopher protocol offers some features not natively supported by the
Web and imposes a much stronger hierarchy on information stored in it. Its
text menu interface is well-suited to computing environments that rely
heavily on remote computer terminals, common in universities at the time of
its creation. Some consider it to be the superior protocol for storing and
searching large repositories of information.

Origins

The original Gopher system was released in late spring of 1991 by Mark
McCahill, Farhad Anklesaria, Paul Lindner, Dan Torrey, and Bob Alberti of
the University of Minnesota. Its central goals were:

• A file-like hierarchical arrangement that would be familiar to users
• A simple syntax
• A system that can be created quickly and inexpensively
• Extending the file system metaphor to include things like searches

The source of the name "Gopher" is claimed to be three-fold:

1. Users instruct it to "go for" information
2. It does so through a web of menu items analogous to gopher holes
3. The sports team of the University of Minnesota is the Golden Gophers

Gopher combines document hierarchies with collections of services, including
WAIS, the Archie and Veronica search engines, and gateways to other
information systems such as FTP and Usenet.

The general interest in Campus-Wide Information Systems (CWISs) [1] in
higher education at the time, and the ease with which a Gopher server could
be set up to create an instant CWIS with links to other sites' online
directories and resources were the factors contributing to Gopher's rapid
adoption. By 1992, the standard method of locating someone's e-mail
address was to find their organization's CSO nameserver entry in Gopher,
and query the nameserver [2].

The exponential scaling of utility in social networked systems (Reed's law)
seen in Gopher, and then the web, is a common feature of networked
hypermedia systems with distributed authoring. In 1993–1994, Web pages
commonly contained large numbers of links to Gopher-delivered resources, as
the Web continued Gopher's "embrace and extend" tradition of providing
gateways to other services.

Proxy server
In computer networks, a proxy server is a server (a computer system or an
application program) which services the requests of its clients by making
requests to other servers. A client connects to the proxy server, requesting
a file, connection, web page, or other resource available from a different
server. A proxy server provides the resource by connecting to the specified
server, with some exceptions: a proxy server may alter the client's request
or the server's response, and it may service the request without contacting
the specified server. (A server that passes all requests and replies
unmodified is usually called a gateway rather than a proxy server.)

A proxy server can be placed in the user's local computer, or at specific key
points between the user and the destination servers or the Internet.

Proxy servers implement one or more of these functions:

Caching proxy server


A proxy server may service the request without contacting the specified
server, by retrieving content saved from a previous request made by the same
client or even by other clients. This is called caching. Caching proxies keep
local copies of frequently requested resources, allowing large organizations
and Internet Service Providers to significantly reduce their upstream
bandwidth usage and cost, while significantly increasing performance.

There are well-defined rules for caching. Some poorly implemented caching
proxies have had downsides, e.g. an inability to use user authentication.
Some problems are described in RFC 3143 (Known HTTP Proxy/Caching Problems).
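The caching behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not a real proxy: the class name `CachingFetcher`, the injected `fetch_from_origin` callable, and the fixed `max_age_seconds` freshness lifetime are all assumptions made for the example, and the real HTTP caching rules (validation, `Cache-Control`, authentication) are deliberately ignored.

```python
import time
from typing import Callable, Dict, Tuple


class CachingFetcher:
    """Serve repeated requests from a local copy instead of the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 max_age_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin   # hypothetical upstream call
        self._max_age = max_age_seconds   # assumed fixed freshness lifetime
        self._clock = clock               # injectable so the sketch is testable
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.origin_requests = 0          # how often the origin was contacted

    def get(self, url: str) -> bytes:
        now = self._clock()
        cached = self._store.get(url)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]              # cache hit: origin is not contacted
        body = self._fetch(url)           # cache miss or stale copy
        self.origin_requests += 1
        self._store[url] = (now, body)
        return body
```

Because the cache is keyed only by URL, a copy saved for one client is also served to other clients, which is where the bandwidth saving comes from.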

Web proxy

Proxies that focus on WWW traffic are called web proxies. Many web proxies
attempt to block offensive web content. Other web proxies reformat web
pages for a specific purpose or audience (e.g., cell phones and PDAs, or
persons with disabilities). Network operators can also deploy proxies to
intercept computer viruses and other hostile content served from remote web
pages (Microsoft Internet Security and Acceleration Server is one example).

Many organizations (including families, schools, corporations, and
countries) use proxy servers to enforce acceptable network use policies (see
content-control software) or to provide security, anti-malware and/or
caching services. A traditional web proxy is not transparent to the client
application, which must be configured to use the proxy (manually or with a
configuration script). In some cases, where alternative means of connection
to the Internet are available (e.g., a SOCKS server or NAT connection), the
user may be able to avoid policy control by simply resetting the client
configuration and bypassing the proxy. Furthermore, administration of
browser configuration can be a burden for network administrators.

Access control: Some proxy servers implement a logon requirement. In large
organizations, authorized users must log on to gain access to the WWW. The
organization can thus track usage to individuals.
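To make the point about client-side configuration concrete, here is a small sketch of a client being told explicitly to use a proxy, using Python's standard `urllib.request` module. The proxy address is a made-up placeholder and the helper name `build_proxy_opener` is an assumption of this example; nothing is sent over the network until the opener is actually used.

```python
import urllib.request


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url,
                                           "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical proxy address; no request is made by building the opener.
opener = build_proxy_opener("http://proxy.example.invalid:3128")
```

Removing this configuration on the client is exactly the "resetting the client configuration" step that lets a user bypass a traditional proxy.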


Anonymizing proxy server
A proxy server that removes identifying information from the client's
requests for the purpose of anonymity is called an anonymizing proxy server
or anonymizer.
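As a rough sketch of what "removing identifying information" can mean at the HTTP level, the function below drops a handful of request headers before the request would be forwarded. The set of headers is an illustrative assumption for this example, not a list taken from the text or from any standard, and a real anonymizer would have to do considerably more.

```python
from typing import Dict

# Illustrative choice of headers that can identify a client (an assumption
# of this sketch, not an authoritative list).
IDENTIFYING_HEADERS = {"cookie", "referer", "user-agent", "from",
                       "x-forwarded-for"}


def anonymize_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the request headers without the identifying ones."""
    return {name: value for name, value in headers.items()
            if name.lower() not in IDENTIFYING_HEADERS}
```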

Intercepting proxy server

An intercepting proxy (often incorrectly called "transparent proxy")
combines a proxy server with a gateway. Connections made by client browsers
through the gateway are redirected through the proxy without client-side
configuration (and often without the user's knowledge).

Intercepting proxies are commonly used in businesses to prevent avoidance of
acceptable use policy, and to ease administrative burden, since no client
browser configuration is required.

It is often possible to detect the use of an intercepting proxy server by
comparing the client's external IP address with the address seen by an
external web server, or by examining the HTTP headers on the server side.
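Both detection ideas can be sketched as small server-side checks. The header names used here (`Via`, `X-Forwarded-For`, `Forwarded`) are ones that proxies commonly add, but a proxy is free to omit them, so an empty result proves nothing; the function and parameter names are assumptions of this example.

```python
import ipaddress
from typing import Dict, List

# Headers that proxies commonly add to forwarded requests.
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded")


def proxy_headers_present(request_headers: Dict[str, str]) -> List[str]:
    """List the proxy-related headers found in a received request."""
    lowered = {name.lower() for name in request_headers}
    return [name for name in PROXY_HEADERS if name in lowered]


def addresses_differ(expected_external_ip: str, ip_seen_by_server: str) -> bool:
    """True if the server sees a different address than the client expects."""
    return (ipaddress.ip_address(expected_external_ip)
            != ipaddress.ip_address(ip_seen_by_server))
```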

Transparent and non-transparent proxy server
The term "transparent proxy" is most often used incorrectly to mean
"intercepting proxy" (because the client does not need to configure a proxy
and cannot directly detect that its requests are being proxied).
However, RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1) offers different
definitions:
"A 'transparent proxy' is a proxy that does not modify the request or
response beyond what is required for proxy authentication and
identification."
"A 'non-transparent proxy' is a proxy that modifies the request or response
in order to provide some added service to the user agent, such as group
annotation services, media type transformation, protocol reduction, or
anonymity filtering."

Forced proxy
The term "forced proxy" is ambiguous. Examples have it meaning both
"intercepting proxy" (because it filters all traffic on the only available
gateway to the Internet) and its exact opposite, "non-intercepting proxy"
(because the user is forced to configure a proxy in order to access the
Internet).

Open proxy server

An open proxy is a proxy server which will accept client connections from
any IP address and make connections to any Internet resource. Abuse of open
proxies is currently implicated in a significant portion of e-mail spam
delivery. Spammers frequently install open proxies on unwitting end users'
operating systems by means of computer viruses designed for this purpose.
Internet Relay Chat (IRC) abusers also frequently use open proxies to cloak
their identities.

Because proxies might be used for abuse, system administrators have
developed a number of ways to refuse service to open proxies. IRC networks
such as the Blitzed network automatically test client systems for known
types of open proxy.

Reverse proxy
A reverse proxy is a proxy server that is installed in the neighborhood of
one or more web servers. All traffic coming from the Internet and with a
destination of one of the web servers goes through the proxy server. There
are several reasons for installing reverse proxy servers:

• Security: the proxy server is an additional layer of defense and therefore
protects the web servers further up the chain.
• Encryption / SSL acceleration: when secure web sites are created, the SSL
encryption is often not done by the web server itself, but by a reverse
proxy that is equipped with SSL acceleration hardware. See Secure Sockets
Layer.
• Load balancing: the reverse proxy can distribute the load to several web
servers, each web server serving its own application area. In such a case,
the reverse proxy may need to rewrite the URLs in each web page (translation
from externally known URLs to the internal locations).
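The URL translation mentioned under load balancing can be illustrated with a simple prefix table. The path prefixes and internal addresses below are hypothetical values chosen for the sketch, and a real reverse proxy would also have to rewrite links inside response bodies and headers.

```python
from typing import Dict, Optional

# Hypothetical mapping: public path prefix -> internal server for that area.
BACKENDS: Dict[str, str] = {
    "/shop/": "http://10.0.0.11:8080/",
    "/blog/": "http://10.0.0.12:8080/",
}


def to_internal(public_path: str) -> Optional[str]:
    """Translate an externally known path to its internal location."""
    for prefix, backend in BACKENDS.items():
        if public_path.startswith(prefix):
            return backend + public_path[len(prefix):]
    return None  # no backend serves this application area


def to_public(internal_url: str) -> Optional[str]:
    """Translate an internal URL back to the externally known path."""
    for prefix, backend in BACKENDS.items():
        if internal_url.startswith(backend):
            return prefix + internal_url[len(backend):]
    return None
```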

Hypertext Transfer Protocol

Hypertext Transfer Protocol (HTTP) is a method used to transfer or convey
information on the World Wide Web. Its original purpose was to provide a
way to publish and retrieve HTML pages.

HTTP is a request/response protocol between clients and servers. The
originating client, such as a web browser, spider, or other end-user tool, is
referred to as the user agent. The destination server, which stores or
creates resources such as HTML files and images, is called the origin server.
In between the user agent and origin server may be several intermediaries,
such as proxies, gateways, and tunnels.

An HTTP client initiates a request by establishing a Transmission Control
Protocol (TCP) connection to a particular port on a remote host (port 80 by
default; see List of TCP and UDP port numbers). An HTTP server listening on
that port waits for the client to send a request message.

Request methods

HTTP defines eight methods (sometimes referred to as "verbs") indicating
the desired action to be performed on the identified resource.

HEAD
Asks for the response identical to the one that would correspond to a
GET request, but without the response body. This is useful for
retrieving meta-information written in response headers, without
having to transport the entire content.
GET
Requests a representation of the specified resource. By far the most
common method used on the Web today. Should not be used for
operations that cause side-effects (using it for actions in web
applications is a common misuse). See 'safe methods' below.
POST
Submits data to be processed (e.g. from an HTML form) to the
identified resource. The data is included in the body of the request.
This may result in the creation of a new resource, the update of
existing resources, or both.
PUT
Uploads a representation of the specified resource.
DELETE
Deletes the specified resource.
TRACE
Echoes back the received request, so that a client can see what
intermediate servers are adding or changing in the request.
OPTIONS
Returns the HTTP methods that the server supports. This can be used
to check the functionality of a web server.
CONNECT
Reserved for use with a proxy that can switch to acting as an SSL tunnel.

HTTP servers are supposed to implement at least the GET and HEAD
methods and, whenever possible, also the OPTIONS method.
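A toy dispatcher can show how GET, HEAD and OPTIONS relate to each other on the server side. This is only a sketch under stated assumptions: the `RESOURCES` table, the `handle` function and its return shape are invented for the example, and it models just the status code, a couple of headers and the body.

```python
from typing import Dict, Tuple

# Methods this toy server chooses to implement (an assumption of the sketch).
IMPLEMENTED = ("GET", "HEAD", "OPTIONS")
# Hypothetical stored resource.
RESOURCES: Dict[str, bytes] = {"/index.html": b"<html>hello</html>"}

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


def handle(method: str, path: str) -> Response:
    if method not in IMPLEMENTED:
        return 501, {}, b""                    # 501 Not Implemented
    if method == "OPTIONS":
        return 200, {"Allow": ", ".join(IMPLEMENTED)}, b""
    body = RESOURCES.get(path)
    if body is None:
        return 404, {}, b""                    # 404 Not Found
    headers = {"Content-Length": str(len(body))}
    # HEAD returns the same headers as GET, but never a body.
    return 200, headers, (body if method == "GET" else b"")
```

Calling `handle("HEAD", "/index.html")` yields the same status and headers as the corresponding GET with an empty body, which is what makes HEAD useful for retrieving meta-information cheaply.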

Request message

The request message consists of the following:

• Request line, such as GET /images/logo.gif HTTP/1.1, which requests
the file logo.gif from the /images directory
• Headers, such as Accept-Language: en
• An empty line
• An optional message body

The request line and headers must all end with CRLF (i.e. a carriage return
followed by a line feed). The empty line must consist of only CRLF and no
other whitespace.

In the HTTP/1.1 protocol, all headers except Host are optional.
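The layout of a request message can be made explicit by assembling one by hand. The helper below is a minimal sketch (the name `build_request` and the host `www.example.com` are assumptions of the example); it emits the request line, the mandatory Host header, any extra headers, the empty line, and an optional body, each header line terminated by CRLF.

```python
from typing import Dict, Optional

CRLF = "\r\n"  # carriage return followed by line feed


def build_request(method: str, path: str, host: str,
                  headers: Optional[Dict[str, str]] = None,
                  body: bytes = b"") -> bytes:
    """Assemble an HTTP/1.1 request message as raw bytes."""
    lines = [f"{method} {path} HTTP/1.1",   # request line
             f"Host: {host}"]               # the one required header
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # Each line ends with CRLF; one extra CRLF forms the empty line.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body


request = build_request("GET", "/images/logo.gif", "www.example.com",
                        {"Accept-Language": "en"})
```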

Simple Mail Transfer Protocol

Simple Mail Transfer Protocol (SMTP) is the de facto standard for e-mail
transmissions across the Internet. Formally SMTP is defined in RFC 821
(STD 10) as amended by RFC 1123 (STD 3) chapter 5. The protocol used
today is also known as ESMTP and defined in RFC 2821.

Description

SMTP is a relatively simple, text-based protocol, where one or more
recipients of a message are specified (and in most cases verified to exist)
and then the message text is transferred. It is quite easy to test an SMTP
server using the telnet program.

SMTP uses TCP port 25. To determine the SMTP server for a given domain
name, the MX (Mail eXchange) DNS record is typically used, falling back to a
simple A record in the case of no MX (not all Mail Transfer Agents (MTAs)
support fallback). Some current mail transfer agents will also use SRV
records, a more general form of MX, though these are not widely adopted.

SMTP is a "push" protocol that does not allow one to "pull" messages from a
remote server on demand. To do this, a mail client must use POP3 or IMAP.
Another SMTP server can trigger a delivery in SMTP using ETRN.
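The MX-then-A lookup order can be sketched as a pure function over DNS answers that have already been fetched. The function name and the `(preference, host)` tuple layout are assumptions of this example; the rule that lower MX preference values are tried first is standard DNS behaviour rather than something stated in the text above.

```python
from typing import List, Tuple


def delivery_hosts(domain: str, mx_records: List[Tuple[int, str]]) -> List[str]:
    """Return the hosts to try for mail delivery, in order.

    mx_records holds (preference, host) pairs from a prior DNS lookup;
    lower preference values are tried first.
    """
    if not mx_records:
        return [domain]  # no MX: fall back to the domain's own A record
    return [host for _, host in sorted(mx_records)]
```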

Post Office Protocol


POP3 has made the earlier versions of the protocol, POP (informally called
POP1) and POP2, obsolete. In contemporary usage, the less precise term POP
almost always means POP3 in the context of e-mail protocols.

The design of POP3 and its procedures supports end-users with intermittent
connections (such as dial-up connections), allowing these users to retrieve e-
mail when connected and then to view and manipulate the retrieved messages
without needing to stay connected. Although most clients have an option to
leave mail on server, e-mail clients using POP3 generally connect, retrieve all
messages, store them on the user's PC as new messages, delete them from
the server, and then disconnect. In contrast, the newer, more capable
Internet Message Access Protocol (IMAP) supports both connected and
disconnected modes of operation. E-mail clients using IMAP generally leave
messages on the server until the user explicitly deletes them. This and other
facets of IMAP operation allow multiple clients to access the same mailbox.
Most e-mail clients support either POP3 or IMAP to retrieve messages;
however, fewer Internet Service Providers (ISPs) support IMAP. The
fundamental difference between POP3 and IMAP4 is that POP3 offers
access to a mail drop; the mail exists on the server until it is collected by
the client. Even if the client leaves some or all messages on the server, the
client's message store is considered authoritative. In contrast, IMAP4
offers access to the mail store; the client may store local copies of the
messages, but these are considered to be a temporary cache; the server's
store is authoritative.

Clients with a leave mail on server option generally use the POP3 UIDL
(Unique IDentification Listing) command. Most POP3 commands identify
specific messages by their ordinal number on the mail server. This creates a
problem for a client intending to leave messages on the server, since these
message numbers may change from one connection to the server to another.
For example, if a mailbox contained five messages at the last connection
and a different client then deletes message #3, the next connecting client
will find the last two messages' numbers decremented by one. UIDL provides a
mechanism to avoid these numbering issues. The server assigns a string of
characters as a permanent and unique ID for the message. When a POP3-
compatible e-mail client connects to the server, it can use the UIDL
command to get the current mapping from these message IDs to the ordinal
message numbers. The client can then use this mapping to determine which
messages it has yet to download, which saves time when downloading. IMAP
has a similar mechanism, using a 32-bit UID (Unique IDentifier) that is
required to be strictly ascending. The advantage of the numeric UID is with
large mailboxes; a client can request just the UIDs greater than its
previously stored "highest UID". In POP, the client must fetch the entire
UIDL map.
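The way a client uses the UIDL mapping can be sketched as follows. The input is the list of "message-number unique-id" lines that a UIDL response carries; the function names and the sample IDs are invented for the example, and no server connection is made.

```python
from typing import Dict, Iterable, List, Set


def parse_uidl(lines: Iterable[str]) -> Dict[str, int]:
    """Map each unique ID to its current ordinal message number."""
    mapping: Dict[str, int] = {}
    for line in lines:
        number, uid = line.split(maxsplit=1)
        mapping[uid.strip()] = int(number)
    return mapping


def messages_to_download(uidl_lines: Iterable[str],
                         seen_uids: Set[str]) -> List[int]:
    """Message numbers whose unique IDs the client has not stored yet."""
    mapping = parse_uidl(uidl_lines)
    return sorted(num for uid, num in mapping.items() if uid not in seen_uids)
```

If message #3 is deleted by another client and a new message arrives, the ordinal numbers shift, but the client still recognises the four messages it already has by their IDs and fetches only the new one.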

Whether using POP3 or IMAP to retrieve messages, e-mail clients typically
use the SMTP_Submit profile of the SMTP protocol to send messages. E-mail
clients are commonly categorized as either POP or IMAP clients, but in
both cases the clients also use SMTP. There are extensions to POP3 that
allow some clients to transmit outbound mail via POP3; these are known as
"XTND XMIT" extensions. The Qualcomm qpopper and CommuniGate Pro
servers and Eudora clients are examples of systems that optionally utilize
the XTND XMIT methods of authenticated client-to-server e-mail
transmission.

MIME serves as the standard for attachments and non-ASCII text in e-mail.
Although neither POP3 nor SMTP requires MIME-formatted e-mail, essentially
all Internet e-mail comes MIME-formatted, so POP clients must also
understand and use MIME. IMAP, by design, assumes MIME-formatted e-mail.

Like many other older Internet protocols, POP3 originally supported only an
unencrypted login mechanism. Although plain text transmission of passwords
in POP3 still commonly occurs, POP3 currently supports several
authentication methods to provide varying levels of protection against
illegitimate access to a user's e-mail. One such method, APOP, uses the MD5
hash function in an attempt to avoid replay attacks and disclosure of a
shared secret. Clients implementing APOP include Mozilla, Thunderbird,
Opera, Eudora, KMail and Novell Evolution. POP3 clients can also support
SASL authentication methods via the AUTH extension.

POP3 works over a TCP/IP connection using TCP on network port 110. E-mail
clients can encrypt POP3 traffic using TLS or SSL. A TLS or SSL connection
is negotiated using the STLS command. Some clients and servers, like Google
Gmail, instead use the deprecated alternate-port method, which uses TCP
port 995.
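The APOP method mentioned above can be sketched with Python's `hashlib`. As specified in RFC 1939 (a detail not spelled out in the text above), the client hashes the timestamp from the server's greeting, angle brackets included, followed by the shared secret, and sends the result as hexadecimal digits. The function name and the sample values are assumptions for illustration.

```python
import hashlib


def apop_digest(greeting_timestamp: str, shared_secret: str) -> str:
    """MD5 of the greeting timestamp followed by the secret, as hex digits."""
    data = (greeting_timestamp + shared_secret).encode("ascii")
    return hashlib.md5(data).hexdigest()


# Hypothetical greeting timestamp and secret, for illustration only.
digest = apop_digest("<1234.567890@pop.example.com>", "not-a-real-secret")
```

Because the server issues a different timestamp on each connection, a digest captured from one session cannot simply be replayed in the next, and the secret itself never crosses the wire.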

Common Object Request Broker Architecture

The Common Object Request Broker Architecture (CORBA) is a standard
defined by the Object Management Group (OMG) that enables software
components written in multiple computer languages and running on multiple
computers to work together.

General overview

CORBA “wraps” program code into a bundle containing information about the
capabilities of the code and how to invoke it. The wrapped objects can then
be invoked from other programs or CORBA objects across a network.

CORBA uses an interface definition language (IDL) to specify the interfaces
that objects will present to the world. CORBA then specifies a “mapping”
from IDL to a specific implementation language like C++ or Java. Standard
mappings exist for Ada, C, C++, Lisp, Smalltalk, Java, COBOL, PL/I and
Python. There are also non-standard mappings for Perl, Visual Basic, Ruby,
and Tcl implemented by ORBs written for those languages.

This diagram illustrates how the generated code is used within the CORBA
infrastructure:

[Diagram: generated code within the CORBA infrastructure]

This picture does not reflect all typically used possibilities. Normally the
server side has the Portable Object Adapter (POA) that redirects calls
either to the local servants or (to balance the load) to the other servers.
Also, both server and client parts often have interceptors that are
described below.
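The division of labour between an interface, a client-side stub and a server-side servant can be imitated in plain Python. This is only an analogy under loud assumptions: it uses no CORBA library, everything runs in one process, and the names `Calculator`, `CalculatorServant`, `CalculatorStub` and `ToyBroker` are invented for the sketch. In real CORBA the stub and skeleton code would be generated from IDL and the call would be marshalled across the network by an ORB.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class Calculator(ABC):
    """Plays the role of an IDL interface: operations only, no code."""

    @abstractmethod
    def add(self, a: int, b: int) -> int: ...


class CalculatorServant(Calculator):
    """Server-side implementation of the interface."""

    def add(self, a: int, b: int) -> int:
        return a + b


class ToyBroker:
    """Stand-in for an ORB: finds a servant by name and invokes it."""

    def __init__(self) -> None:
        self._servants: Dict[str, Any] = {}

    def register(self, name: str, servant: Any) -> None:
        self._servants[name] = servant

    def invoke(self, name: str, operation: str, args: Tuple[Any, ...]) -> Any:
        return getattr(self._servants[name], operation)(*args)


class CalculatorStub(Calculator):
    """Client-side proxy: same interface, forwards calls to the broker."""

    def __init__(self, broker: ToyBroker, name: str) -> None:
        self._broker = broker
        self._name = name

    def add(self, a: int, b: int) -> int:
        return self._broker.invoke(self._name, "add", (a, b))
```

The client only ever sees the `Calculator` interface; whether the object behind the stub is local or remote is hidden by the broker.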

In addition to providing users with a language- and platform-neutral remote
procedure call specification, CORBA defines commonly needed services such
as transactions and security.

Apart from remote objects, CORBA and RMI-IIOP define the concept of
Objects by Value (OBV). The code inside the methods of these objects is
executed locally by default. If an OBV has been received from the remote
side, the needed code must either be known a priori to both sides or be
dynamically downloaded from the sender. To make this possible, the record
defining the OBV contains the Code Base, a space-separated list of URLs from
which this code should be downloaded. An OBV can also have remote methods.

OBVs may have fields that are transferred when the OBV is transferred.
These fields can be OBVs themselves, forming lists, trees or arbitrary
graphs. OBVs have a class hierarchy, including multiple inheritance and
abstract classes.

The CORBA Component Model (CCM) is an addition to the family of CORBA
definitions. It was introduced with CORBA 3, and it describes a standard
application framework for CORBA components. It can be described as a
language-independent extension of the Enterprise JavaBeans (EJB) model. It
provides an abstraction of entities that can provide and accept services
through well-defined named interfaces called ports.

The CCM has a component container, where software components can be
deployed. The container offers a set of services that the components can
use. These services include (but are not limited to) notification,
authentication, persistence and transaction management. These are the
most-used services any distributed system requires, and, by moving the
implementation of these services from the software components to the
component container, the complexity of the components is dramatically
reduced.