
A Proxy Server Experiment: An Indication of the Changing Nature of the Web

MAJ Richard Howard
Directorate of Information Management
United States Military Academy
West Point, NY 10996
Voice: (914) 938-7449
Fax: (914) 938-5956
howard@exmail.usma.edu

MAJ Bernard J. Jansen
Department of EE & CS
United States Military Academy
West Point, NY 10996
Voice: (914) 938-3233
Fax: (914) 938-5956
jansen@exmail.usma.edu

0-8186-9014-3/98 $10.00 © 1998 IEEE

Abstract

With the growing reliance on connectivity to the World-Wide Web (Web), many organizations have been experiencing trouble servicing their users with adequate access and response time. Increased bandwidth or more connections to the Web can relieve the access problem, but this approach may not decrease the access time. Additionally, increased bandwidth comes at greatly increased cost. Therefore, many organizations have turned to the use of proxy servers. A proxy server is a Web server that caches Internet resources for re-use by a set of client machines. The performance increases of proxy servers have been widely reported; however, we could not locate any recent test of proxy server performance. Given the exponential growth of the Web in just the last year, we wondered if this would have an effect on the performance of proxy servers. Therefore, we conducted a 14-day proxy server experiment. The results of our experiment showed that the proxy server actually decreased performance, i.e., it increased access time. We review this experiment, analyze why the proxy server failed to decrease access time, and draw conclusions on the changing nature of the Web and its impact on proxy servers.

Introduction

With the growing reliance on connectivity to the World-Wide Web (Web), many organizations have been experiencing trouble servicing their users with adequate access and response time as network and server loads have increased dramatically. Increased bandwidth (i.e., more or “bigger” connections to the Web) can relieve the access problem, but increasing bandwidth may not decrease the response time for users. Additionally, increased bandwidth comes at greatly increased cost due to typical monthly charges. Therefore, many organizations, including businesses, schools, universities, government, and military organizations, have turned to the use of proxy servers.

A proxy server is a Web server that caches Internet resources for re-use by a set of client machines. Caching proxies have been introduced to improve system performance on the assumption that a page will be fetched many times before it is destroyed or modified. The performance increases of proxy servers have been widely reported. In fact, the praise of proxy servers has been almost universal. However, we could not locate any recently published, scholarly articles on tests of proxy server performance. Therefore, we conducted an experiment to measure whether the use of a proxy server would decrease access time for users on the university WAN.

We present a brief description of and uses for a proxy server. We then review recent proxy server literature, both from the trade press and from scholarly journals. We then discuss the methodology of the experiment and the nature of the network on which the experiment was conducted. We follow this discussion with the results of the experiment and conclude with
thoughts on why our results differed from those previously reported and the implications for the future of proxy servers on the Web.

Proxy Servers

Proxy servers have two main purposes: they can improve performance and filter requests. By filtering requests, we mean that an organization might use a proxy server to prevent its employees from accessing a specific set of Web sites. For this experiment, we are more concerned with the first purpose of a proxy server, improving performance. Proxy servers can dramatically improve performance for users of an organization. This is because the proxy server saves the results of all requests for a certain amount of time.

For example, consider the situation where both User A and User B access the Web through the same proxy server. First, User A requests a certain Web page, which we will refer to as Web Page 1. The proxy server will forward the request, based on the Uniform Resource Locator (URL), to the Web server where Web Page 1 resides. Depending on the network’s Web connection, the number of graphics in the Web page, etc., this can be a time-consuming operation. Later, User B requests the same Web page. Instead of forwarding the request to the Web server where Web Page 1 resides, the proxy server simply returns the copy of Web Page 1 that it already fetched for User A. Since the proxy server is usually on the same network as the user, this is a much faster operation. If this series of actions is repeated over several to hundreds of users, the performance increase via reduced access time can be a real benefit to the users on a network. The major online services such as CompuServe and America Online, for example, employ an array of proxy servers to service thousands of users [7]. If User B had requested a Web page that had not been previously requested, the proxy server would forward the request to the real Web server designated in the URL.

The storing of the results of server requests by proxy servers is referred to as caching. As stated, it was this aspect of the proxy server that we were primarily interested in studying. Web pages are modified, deleted, and renamed continuously, so the proxy server must have a means of checking whether the page that it has in cache is the most current version. Briefly, a Web caching proxy server “cruises” the Web and examines pages that are currently cached on the server. If a page has been modified, the proxy server stores the new version on a local drive. Some proxy servers can also use certain guidelines to follow links on that page to pull down related pages. Most proxy servers are extremely efficient. They can examine and store thousands of Web pages, and when any local user on the LAN asks for a specific stored page, the page flies out of a local drive or cache without Internet transmission delays.

To ensure that the proxy server can do its job, the network must be set up so that users needing access to the Web must use the proxy server as their Internet gateway. One can accomplish this access control through proper router setup, which places all users “behind” the firewall.

Literature Review

Organizations that use Web proxy servers report that the proxy server’s caching technology has greatly reduced network costs. Many organizations bought proxy servers in order to decrease access time. However, they received an unexpected benefit when caching reduced traffic on their Internet connections. Industry analysts report that proxy servers often reduced traffic enough to eliminate the need to add bandwidth servers [3]. The demand for proxy servers among businesses, organizations, government, and academia has been strong. Microsoft, Netscape, and Novell all offer proxy-server software as part of their Internet server suites. The demand is understandable given the increases in performance reported by users. Reported performance enhancements from proxy servers for end users are typically about 20 to 25%, i.e., up to a one-quarter decrease in access time [1][2]. Organizations also reported a high volume of proxy server cache access, as high as 40%, and extremely active caches with thousands of Web documents [3].

There has been very little discussion of limitations or negative aspects of proxy servers. In general, proxy servers cannot provide the sophisticated event statistics, reports, alarms, and audit tracking of standalone firewalls. However, a high-end proxy server goes for about $1,000, whereas firewalls cost from $5,000 to $50,000 [2]. Also, industry experts caution that proxy servers cannot take the place of a second, high-speed Web access line if the first is overloaded. Other than these points, there have been very few reported drawbacks of proxy servers.

With all the positive reports, it is no wonder that proxy servers are still an active area of research. Most of the research focuses on methods to increase the performance aspect of proxy servers as opposed to the filtering aspect. Jeffery, Das, and Bernal [4] investigated the design and implications of an extended proxy server that shares cache resources not only within itself but also with near neighbors. They reported that a substantial reduction in network workload can be obtained from this proxy sharing. This shared cache also led to a corresponding increase in performance.
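The idea of a proxy that consults its neighbors before going out to the Web can be pictured with a toy model. The Python sketch below is only an illustration under stated assumptions and is not the design evaluated in [4]: the class `SharingProxy`, its `peek` and `get` methods, the `fake_origin` callback, and the `example.com` URL are hypothetical names introduced here, and the peers are arranged as a flat set with no parent or child roles.

```python
from typing import Callable, Dict, List, Optional, Tuple


class SharingProxy:
    """Toy caching proxy that can also consult neighbouring proxies (hypothetical)."""

    def __init__(self, name: str, fetch_origin: Callable[[str], str]) -> None:
        self.name = name
        self.fetch_origin = fetch_origin          # assumed callback that contacts the Web server
        self.cache: Dict[str, str] = {}           # URL -> cached document body
        self.neighbours: List["SharingProxy"] = []
        self.stats = {"local": 0, "neighbour": 0, "origin": 0}

    def peek(self, url: str) -> Optional[str]:
        """Return a cached copy, or None, without triggering any fetch."""
        return self.cache.get(url)

    def get(self, url: str) -> str:
        # 1. Local cache hit: the request never leaves this proxy.
        if url in self.cache:
            self.stats["local"] += 1
            return self.cache[url]
        # 2. Ask each neighbour for a copy it already holds.
        for peer in self.neighbours:
            body = peer.peek(url)
            if body is not None:
                self.stats["neighbour"] += 1
                self.cache[url] = body
                return body
        # 3. Miss everywhere: forward the request to the origin Web server.
        self.stats["origin"] += 1
        body = self.fetch_origin(url)
        self.cache[url] = body
        return body


def demo() -> Tuple[Dict[str, int], Dict[str, int], int]:
    origin_calls: List[str] = []

    def fake_origin(url: str) -> str:
        # Stand-in for a real HTTP fetch; records how often the origin is hit.
        origin_calls.append(url)
        return f"<html>{url}</html>"

    a = SharingProxy("proxy-a", fake_origin)
    b = SharingProxy("proxy-b", fake_origin)
    a.neighbours.append(b)
    b.neighbours.append(a)

    url = "http://example.com/page1"   # placeholder URL
    a.get(url)                          # first request: fetched from the origin
    a.get(url)                          # repeat request: served from a's own cache
    b.get(url)                          # served from neighbour a, not the origin
    return a.stats, b.stats, len(origin_calls)
```

In this toy run the origin server is contacted once for three requests. The sketch deliberately omits freshness checks and expiry, so it says nothing about how a real proxy decides that a cached page is still current.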
The best performance came from a simple implementation model that is non-hierarchical; proxies access each other using the natural topology of the Web. Instead of cache sharing, Law and Nandy [6] investigated a distributed proxy server architecture that can increase service availability and provide system scalability coupled with load-balancing capability. The system employs a TCP-based switching mechanism, which has a finer session granularity and more dynamic control over resource allocation. Finally, Nam and Lee [5] researched the ability of proxy servers to cache video.

Methodology

However, we could not locate any recent test of proxy server performance. Given the exponential growth of the Web in just the last year or two, we wondered if this growth would have an effect on the performance of proxy servers. Therefore, we conducted a 14-day proxy server experiment. We first installed a proxy server. We then reviewed the sites that our network users commonly visit. We selected one site, www.microsoft.com, that was typical of the Web sites that our users commonly visited. We then blocked direct access to this site for two weeks at the firewall. This action forced all users who desired to visit www.microsoft.com to go through the newly installed proxy server. Without this block, users could bypass the proxy server and our data set would not be as dense. With any experiment of this type, one should get a sense of the size, traffic load, and nature of the network users.

The Network

The experiment was conducted at the United States Military Academy (USMA), which is a four-year, undergraduate institution. USMA graduates about 1000 students, called cadets, per year. Almost all of the graduating cadets immediately serve in the US Army. All cadets, all faculty, and the majority of the staff have computers and Internet access.

The USMA network has a Fiber Distributed Data Interface (FDDI) backbone at 100 Mbps. About half of our users are connected to the backbone via dedicated 10 Mbps lines. Other users are connected to the backbone via shared 10 Mbps lines. Our network is connected to the Web via two (2) connections, one DREN and one NIPRNET. DREN is the Defense Research Engineering Network and NIPRNET is the uNclassified (but sensitive) Internet Protocol Routing NETwork. DREN is primarily for our education needs. NIPRNET is the military Internet. Both run at about T1 speeds, 1.5 Mbps. This network serves the approximately four thousand cadets, the faculty and staff of the institution, plus the various staff agencies of the military post. Figure 1 illustrates the layout and size of the USMA network. Average utilization is 15-20% on the FDDI backbone, about 96% on the DREN Gateway, and about 80% on the NIPRNET Gateway.

[Figure 1: USMA Network. Diagram labels: United States Military Academy, West Point, N.Y.; FDDI Rings.]

The cadets use the network and Web extensively for courses, both in the classroom and for research projects. The faculty also uses the network and Web for teaching preparation and research. Some typical sites that are commonly visited are:

• www1.nytimes.com
• images.nytimes.com
• weather7.pvy.mci.net
• www3.nytimes.com
• www7.nytimes.com
• www2.nytimes.com
• www4.nytimes.com
• www6.nytimes.com
• weatherl7.pcy.mci.net
• wsfl.usatoday.com
• www.foxnews.com
• www.microsoft.com

Results and Explanation

We were expecting to see a substantial performance gain by caching common documents that many users on the USMA network visit repeatedly. Based on the architecture of our network, as explained above, the proxy server should have retrieved a large number of cached documents and delivered these at close to USMA network speeds (100 Mbps around the backbone and generally 10 Mbps to the user).
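The size of the gain one might expect from the link speeds alone can be estimated with simple arithmetic. The Python sketch below is a rough model under stated assumptions, not a measurement: it counts only the time to push the bits of one document through a link, ignores latency, congestion, and proxy processing, and uses an assumed document size (`DOC_BYTES`). The function names are hypothetical; the 10 Mbps and 1.5 Mbps figures are the user-line and gateway speeds given in the network description.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Time to serialize one document onto a link, ignoring latency and overhead."""
    return (size_bytes * 8) / (link_mbps * 1_000_000)


def expected_seconds(hit_ratio: float, size_bytes: int,
                     lan_mbps: float = 10.0, wan_mbps: float = 1.5) -> float:
    """Average time per document when a fraction `hit_ratio` is served from cache."""
    hit = transfer_seconds(size_bytes, lan_mbps)    # cached copy over the local network
    miss = transfer_seconds(size_bytes, wan_mbps)   # fresh copy over the T1-speed gateway
    return hit_ratio * hit + (1.0 - hit_ratio) * miss


DOC_BYTES = 50_000  # assumed document size for illustration only

lan = transfer_seconds(DOC_BYTES, 10.0)
wan = transfer_seconds(DOC_BYTES, 1.5)
print(f"LAN: {lan:.3f} s, WAN: {wan:.3f} s, ratio: {wan / lan:.2f}x")
```

Under this model a cached document is delivered roughly 10 / 1.5, or about 6.7, times faster than one pulled through a gateway, and the average benefit scales directly with the fraction of requests that are cache hits.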
Unfortunately, this did not happen. As an example, from 0010 hrs to 1945 hrs on 17 December 1997, the proxy server accepted 290,000 server requests. It only served 12,000 documents from its cache. This is a 4% hit ratio. We were expecting about a 20% hit ratio.

The factor that we failed to consider is that many Web sites, including www.microsoft.com, are using dynamic Web pages and dynamic HTML to create their documents. Dynamic Web pages refer to Web content that changes each time it is viewed. For example, the same URL could result in a different page depending on any number of parameters, such as geographic location of the reader, time of day, previous pages viewed by the reader, or profile of the reader. There are already many languages and technologies for producing dynamic HTML, including CGI scripts, Server-Side Includes, cookies, Java, JavaScript, and ActiveX. It appears that the number of dynamic Web pages will increase, especially with the advent of Dynamic HTML. Dynamic HTML is a set of new HTML extensions that will enable a Web page to react to user input without sending requests to the Web server. Microsoft and Netscape have submitted competing Dynamic HTML proposals to W3C, which must now hammer out the final specification [7].

So, with dynamic Web pages, every time you go to the site, you may get a different page. It may be mostly the same information as the last time you visited the site, but the page is created on the fly. The proxy server cannot cache this kind of Web page in the normal way. Instead, the proxy server acts as a middleman. It sends a request to the distant Web server, copies the documents to the proxy server hard drive, and then delivers the documents to the user. This had two effects:

1. There were very few documents stored in cache: only the static HTML documents.
2. For the dynamic Web pages, the proxy server was actually slowing down access with the copying and updating of the documents.

Conclusion

For this experiment, we set out to see if the growth of the Web had changed the role and performance of proxy servers. We conducted a 14-day experiment and channeled users through a proxy server if they wanted to access www.microsoft.com, a major Web site. Based on previous trade reports, we expected about a 20% hit ratio. Instead, our hit ratio was one fifth of this expectation. We traced the cause to the increased use of dynamic Web pages. While the growth of the Web was not the major factor, it appears that as the Web has grown, it has evolved and is still evolving from a static to a dynamic information repository. Under this environment, the role of the proxy server will decrease as it is less able to deliver performance enhancements. Obviously, product redesign is in order.

References:

[1] DiDio, L. Proxy servers gain user appeal. April 21, 1997. Computerworld, v31 n16 p16(1).

[2] Machlis, S. Planning blunts Web traffic spikes. Nov 17, 1997. Computerworld, v31 n46 p6(1).

[3] Wallace, B. Web-caching servers cut network costs. Jan 26, 1998. Computerworld, v32 n4 p47(2).

[4] Jeffery, C.L.; Das, S.R.; Bernal, G.S. Proxy-sharing proxy servers. 7-10 May 1996. Proceedings of COM’96. First Annual Conference on Emerging Technologies and Applications in Communications.

[5] Ki Dong Nam; Hyeun Tae Lee. Design of a virtual server for service interworking over heterogeneous networks. 20-22 Aug. 1997. 1997 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM. 10 Years Networking the Pacific.

[6] Law, K.L.E.; Nandy, B.; Chapman, A. A scalable and distributed WWW proxy system. 3-6 June 1997. Proceedings of IEEE International Conference on Multimedia Computing and Systems.

[7] http://www.pcwebopedia.com/dynamic_HTML.htm

[8] http://journals.ecs.soton.ac.uk/yr2/Fig1.htm