IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 22, NO. 5, MAY 2011

Measuring Client-Perceived Pageview Response Time of Internet Services

Jianbin Wei, Member, IEEE Computer Society, and Cheng-Zhong Xu, Senior Member, IEEE

Abstract—As e-commerce services grow exponentially, businesses need quantitative estimates of client-perceived response times to continuously improve the quality of their services. Current server-side nonintrusive measurement techniques are limited to nonsecured HTTP traffic. In this paper, we present the design and evaluation of a monitor, namely sMonitor, which is able to measure client-perceived response times for both HTTP and HTTPS traffic. At the heart of sMonitor is a novel size-based analysis method that parses live packets to delimit different webpages and to infer their response times. The method is based on the observation that most HTTP(S)-compatible browsers send significantly larger requests for container objects than for embedded objects. sMonitor is designed to operate accurately in the presence of complicated browser behaviors, such as parallel downloading of multiple webpages and HTTP pipelining, as well as packet losses and delays. It requires only passive collection of network traffic in and out of the monitored secured services. We conduct comprehensive experiments across a wide range of operating conditions using live secured Internet services, on the PlanetLab, and on controlled networks. The experimental results demonstrate that sMonitor is able to keep the estimation error within 6.7 percent of the actual response time measured at the client side.

Index Terms—Client-perceived service quality, monitoring and measurement, pageview response time, secured Internet services.

1 INTRODUCTION

With recent technology developments, secured Internet services are the foundation for various e-commerce services, such as online banking, shopping, auctions, and payments. For these e-commerce services, the Internet is a highly competitive environment. Clients have many choices when they are seeking quality services. Whenever an e-commerce service cannot meet their quality requirements, they can turn elsewhere much more easily than in traditional business environments. This makes it especially critical for e-commerce service providers to offer quantifiable service quality to attract new clients while retaining old ones.

Client-perceived response time is a key measure of client-experienced service quality. In deciding whether to continue using an e-commerce service, 46 percent of clients want a quick checkout process and 40 percent want fast pageview response times [26]. Early experiments at Amazon even showed a one percent sales loss for an additional 100 ms delay [13]. Therefore, obtaining quantitative measures of the response times in a timely and cost-effective manner is fundamental to managing e-commerce services across a diverse and rapidly changing client population. By obtaining the response times, businesses might employ techniques to match different clients' requirements for service qualities [19], [33], [35]. For example, in [33], [35], [36], the authors proposed mechanisms to dynamically allocate resources to different clients to support their requirements on response times. By obtaining the response times promptly, businesses can also quickly identify problems, get them fixed, reduce downtime costs, and thus maintain high service availability. For example, if the response time is large while network latency is small, the service providers would expect issues in the back-end servers. Webserver logs can then be used to determine the exact objects that are requested for troubleshooting.

Measuring client-perceived response time at the server side is challenging. In general, the client-perceived response time of a webpage is the retrieval time of the whole page, including the container object (e.g., an HTML file) and all of its embedded objects, such as images. Hence, a key to the measurement of pageview response time is to delimit different pages when they are being downloaded simultaneously by different clients via HTTP requests and responses. The HTTP protocol itself, however, does not provide any method for such page delimitation. Moreover, page delimitation needs to be conducted in the presence of complicated browser behaviors, such as parallel downloading of multiple webpages and HTTP pipelining, as well as packet losses and delays.

J. Wei is with Yahoo! Inc, Sunnyvale, CA 94089. E-mail: jianbinw@yahoo-inc.com.
C.-Z. Xu is with the Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI 48202. E-mail: czxu@wayne.edu.
Manuscript received 31 May 2009; revised 1 Feb. 2010; accepted 8 May 2010; published online 16 June 2010. Recommended for acceptance by D. Turgut.
Digital Object Identifier no. 10.1109/TPDS.2010.131.

More importantly, for secured Internet services, such response-time measurements must be conducted without decrypting HTTPS messages. This is particularly important for web hosting providers that are responsible for maintaining the infrastructure of various Internet services for businesses but are prohibited from parsing the network traffic in and out of these services. Even the network administrators of these services are allowed to parse decrypted contents only in some special cases, to prevent potential leaks of clients' personal information, such as passwords and credit card numbers. Furthermore, for secured Internet services where the encryption is conducted by webservers, it is impossible to analyze decrypted HTTPS messages without modifying the structures of these
webservers. Such modification would limit the scalability of the measurement and render the system hard to deploy.

There are many response-time measurement techniques with different characteristics. They include active sampling from particular network locations [12], [16], instrumentation of webpages [25] or webservers [1], offline HTTP traffic analysis [5], [15], and online HTTP traffic analysis [5], [21]. In particular, EtE [5] and ksniffer [21] are able to measure pageview response time at the server side in a nonintrusive manner. However, none of them is able to measure the time for secured Internet services nonintrusively. They fall short, in one aspect or the other, in capability, accuracy, intrusiveness, and real-time availability of the measurements. More importantly, they all need to decrypt HTTPS messages.

In this paper, we present sMonitor for measuring client-perceived response times for both secured and unsecured Internet services nonintrusively. At the heart of sMonitor is a novel packet-size-based analysis method that parses live packets to delimit different webpages retrieved by different clients at the same time and infers their response times. The analysis method is based on the observation that most HTTP(S)-compatible browsers send significantly larger requests for container objects than for embedded objects. Note that the analysis method relies on protocol features of HTTP and HTTPS. Therefore, although the use of web services evolves very quickly, this evolution does not affect the validity of the method as long as the underlying protocols retain these features, which is the case for HTTP/1.1bis [8]. On the contrary, sMonitor is actually useful in monitoring such evolution and can help service providers adapt to it.

Compared with existing techniques, sMonitor offers a number of unique benefits.

1. sMonitor works for secured Internet services without requiring any modifications, such as embedding code in webpages or decoupling encryption components from webservers. This feature simplifies the deployment of sMonitor in real environments.
2. sMonitor accurately measures client-perceived response times by capturing different browser behaviors, such as parallel downloading of multiple webpages and pipelining of HTTP requests. Moreover, it is agile to changes of client access patterns.
3. sMonitor measures response times for all real clients who access the monitored service. It provides a much more complete view of the service performance than active sampling, which uses customized browsers at a few locations that generally have better network connectivity than the real world.

We conducted comprehensive experiments across a wide range of operating conditions using live secured Internet services, on the PlanetLab, and on controlled networks. The experimental results suggested that sMonitor is accurate in terms of both response time and page delimitation. The measurement errors were kept within 6.7 percent (or 210 ms) in comparison with the response times measured at the client side. Although network traffic was encrypted, no more than 18 percent of webpages were falsely delimited. In contrast, ksniffer reported a measurement error of 3 percent to 5 percent [21], but it is limited to unsecured HTTP traffic.

The rest of this paper is organized as follows: In Section 2, we present the architecture of sMonitor. Section 3 discusses the design of sMonitor and the size-based analysis method for page delimitation. Section 4 gives the evaluation methodology. Section 5 shows the experimental results. In Section 6, we survey existing techniques and products and discuss their merits and limitations. Section 7 concludes this paper with remarks on the limitations of sMonitor and possible future work.

2 OVERVIEW OF sMONITOR ARCHITECTURE

In general, the client-perceived pageview response time of an Internet service corresponds to the duration from the time when the client clicks/inputs the address of a web page to the time when all objects of the webpage are retrieved. We design sMonitor to measure this time in a nonintrusive manner so that it can be deployed without any modification to existing Internet services. As shown in Fig. 1, sMonitor runs as a stand-alone application that keeps capturing packets in and out of the monitored service. It analyzes the traffic using a size-based analysis method to infer client-perceived pageview response times. sMonitor works for both secured HTTPS and unsecured HTTP traffic. Because of its unique strengths in the treatment of HTTPS traffic, we elaborate the system architecture details in the context of secured Internet services.

Fig. 1. The architecture of sMonitor.

sMonitor consists of three modules. A packet sniffer passively collects live network packets using pcap (libpcap on Unix-like systems or WinPcap on Windows). A packet analyzer then parses the packets to extract HTTP transaction information, such as HTTP request sizes, and passes it to a performance analyzer. To obtain such information, the analyzer only needs TCP/IP and SSL/TLS record headers, which are not encrypted. The parsed packets are discarded to reduce storage requirements. Based on the HTTP transaction information, the performance analyzer uses the packet-size-based analysis method to delimit pages and measure their response times.

In sMonitor, the control flow is as far as possible data driven: The packet-processing loop is driven by the presence of packets. It is composed of three parts. The first part is to process packets for connection establishments and releases. Upon capturing a SYN packet from a client, sMonitor creates a new connection object. If the packet is from a new client, sMonitor also creates a new client object. If the packet is for connection release, e.g., a FIN or RST packet, the corresponding connection will be deleted. The second
part processes server responses. sMonitor classifies a server packet with nonempty application data as a server response. A response packet is identified as the end of the response if no further response packets are captured within a timeout period or if a new request arrives over the same connection after it. The third part of packet processing is for client requests. It parses TCP/IP and SSL/TLS record headers to obtain HTTP request sizes. After that, combined with the identified response ends, it uses the packet-size-based analysis method to delimit pages and infer their response times.

Page retrieval is a complicated process. A browser may retrieve multiple pages at the same time using pipelined HTTP requests over multiple persistent TCP connections. Fig. 2 shows the HTTP protocol for accessing a webpage containing multiple embedded objects. It starts with a request for the page container object, followed by requests for the embedded objects. Essential to the inference of pageview response times is the identification of the requests for containers in the traffic flow. This is nontrivial, because a client can have many different ways (persistent versus nonpersistent connections, pipelining, parallel downloading, etc.) to access the webpages of an Internet service. The request and response packets with respect to different webpages are mixed in the captured traffic, which makes the issue of page delimitation challenging. At the heart of sMonitor is a packet-size-based analysis approach to determine the retrieval beginning and end of a webpage in a traffic flow without decrypting HTTPS messages.

Fig. 2. Retrievals of multiple pages and embedded objects using pipelined requests over multiple connections.

The overhead of sMonitor is small because only TCP/IP and SSL/TLS record headers need to be captured and analyzed. In general, TCP/IP headers are 40 bytes and SSL/TLS record headers are only 5 bytes. In this work, we designed sMonitor as a user-space application for good portability. To further improve sMonitor's performance, it can be implemented in the kernel space. For example, in [21], the authors reported that their ksniffer approach had a capacity near gigabit traffic rates without packet losses. We believe that sMonitor should have the same or even higher capacity than ksniffer, because ksniffer needs to parse packets up to the HTTP layer while sMonitor does not. Because the focus of this paper is on sMonitor's accuracy, we will not present a detailed discussion and evaluation of sMonitor's capacity. It is worth noting that the accuracy of sMonitor is not affected by its capacity when there is no loss in capturing packets.

3 PACKET-SIZE-BASED ANALYSIS FOR PAGE DELIMITATION

A key challenge in inferring client-perceived pageview response times from captured packets is page delimitation. In this section, we present the packet-size-based analysis method to identify the requests for container objects in page delimitation. Let r0k denote the HTTP request for the container object of webpage k. Then r0k and r0(k+1) help delimit the retrieval beginning and end of page k, respectively.

3.1 The Size-Based Analysis Method

Our design of the size-based analysis method is based on an observation of HTTP(S)-compatible browser behaviors, which we first reported in [34]: The request for the container of a page is normally significantly larger than the follow-up requests for its embedded objects, and the follow-up requests often have similar sizes. An HTTP request message begins with a request line, followed by a set of optional headers and an optional message body. The large size difference between r0k and the request rik for embedded object i is mainly because of the following reasons:

1. In HTTP/1.0 and HTTP/1.1, the Accept header is designed to specify certain media types that are acceptable for the response. To be protocol-compatible, most browsers use the default standard Accept header in the first request r0k for a page to indicate as many acceptable types as possible. On the other hand, rik uses the Accept header to indicate that the request is specifically limited to a small set of desired types, and this header is smaller than the default standard one. For example, through our experiments, in IE, r0k has a default Accept header that can be 161 bytes larger than that in rik when certain applications, such as Microsoft Office and Adobe Flash, add their identifications to the default header. We also examined the source code of Firefox and Mozilla and found that the size difference could be as large as 93 bytes.
2. For static web objects, the request lines in r0k and rik tend to have limited sizes for easy administration of a website. For dynamically generated objects, their URLs often follow the same pattern and the sizes are very close to each other.
3. In general, for HTTP headers, the only difference between requests r0k and rik is the Accept header. In the case where cookies are used, the first request from a client for the website may be smaller than others since it does not contain a Cookie header while others do. The first request, however, is normally for the container of a webpage. Therefore, the size difference is a good method to differentiate requests for container objects from those for embedded objects.
4. Across a wide variety of measurement studies, the overwhelming majority of web requests do not have
message bodies. They use the GET method to retrieve webpages and invoke scripts [9], [22].

Based on this observation of HTTP(S)-compatible browser behaviors, we design the size-based analysis method. Let x_n denote the size of the nth HTTP request, which includes the HTTP method header and all other HTTP header fields. The second derivative of the HTTP request size function f(x) can be approximated as

f''(x_n) ≈ (f'(x_n + h/2) − f'(x_n − h/2))/h = (f(x_{n+1}) − 2f(x_n) + f(x_{n−1}))/h^2 + O(h^2),

where h is the step size. Since the function is discrete, letting h be 1, we assert that

f''(x_n) ≈ x_{n+1} − 2x_n + x_{n−1}.

If f''(x_n) < 0, then f(x) has a concave shape and a relative maximum at x_n. Let t denote a configurable threshold on the size difference between r0k and rik. We set the page delimitation policy as "request n is identified as r0k if f''(x_n) is less than −t."

The selection of t should maximize the number of correctly identified requests for container objects while minimizing the number of false positives that would otherwise impair the accuracy of sMonitor. Notice that in IE and Firefox, r0k has an Accept header that is normally 161 bytes and 93 bytes, respectively, larger than the one in rik. Considering the possible difference between other headers in r0k and rik, we set t to 60 bytes by default in sMonitor. This setting is then used in all of our test cases and we find that it is a good choice. In general, for a service, t should be set based on the shares of the different browsers clients use to access the service, since these browsers have different behaviors. For example, if most clients use IE, then t can be smaller than 60 because IE has a large Accept header in r0k. Moreover, there is no single optimal t for all services because different services have different clients.

sMonitor keeps track of all clients and their activities using a hash table, which is indexed by a hash function over the IP address of each client. Fig. 3 shows the structure of the table. Each client object in the hash table maintains a list of active connections. It also contains an array of HTTP transactions that need to be processed by the performance analyzer. The data structures of the client, active connection, and HTTP transaction objects are presented in Fig. 4.

Fig. 3. Structure of the packet analyzer.
Fig. 4. Data structures used in the packet analyzer.

There are several issues we need to address in the design of sMonitor. First, we must consider the difference between the time of sending packets from clients or servers and the time of receiving them by servers or clients. When a client retrieves a page, if there exist no active connections, the client must first establish a connection. As shown in Fig. 2, the request r0k is generally sent right after the last step of the three-way handshake. Therefore, there is at least a 1.5 round-trip time (RTT) gap before the server receives the first packet of the request r0k. In the case where the request r0k is transmitted over an existing connection, the gap is 0.5 RTT. Similarly, there is also a 0.5 RTT gap between the time the server sends the last packet of a response and the time the client receives it. To take these time gaps into account, sMonitor records the RTT for every client using an exponentially weighted moving average of the difference between a packet and its acknowledgment, in the same manner as TCP [24].

Second, in sMonitor, the end of a page is identified as the last response packet received before the arrival of another request for a container object. The request can be transmitted over the same or a different connection. This identification method may cause measurement lags or delayed page delimitations because there exists a time gap between the end of a webpage and the arrival of requests for another one. Furthermore, this time gap may be as long as a few minutes. sMonitor addresses this issue using a configurable timeout mechanism: If no packet is transmitted between the client and the server for the timeout period, sMonitor infers that the page has ended. In sMonitor, we set the default timeout period to 5 s because it is often regarded as a threshold of acceptable pageview response time [3]. To implement the timeout mechanism, sMonitor records the latest response time in the "last packet time" field. sMonitor then scans this field periodically to check whether any packet has been transmitted between the client and the server within the timeout period.

Third, sMonitor also needs to deal with the effects of HTTP pipelining, in which multiple requests can be sent to the server without waiting for the completion of prior requests. sMonitor identifies a client message as pipelined HTTP requests for multiple embedded objects if the message is the same as or larger than 1,500 bytes (the maximum transmission unit of Ethernet). Such an identification method is based on two observed facts. First, past studies showed that more than 90 percent of requests were less than 1 KB and most large requests were for services, such as web mail, that have large message bodies [30]. Second, as pointed out in [21], in most cases, only requests for embedded objects are pipelined.

Meanwhile, sMonitor needs to be resilient to parallel downloading. During parallel downloading, multiple pages are retrieved at the same time and their requests are intertwined. This might cause false identifications of page ends. As an example, assuming that a client retrieves pages k and k+1 using parallel downloading, in sMonitor, request
r0(k+1) is identified as the retrieval beginning of page k+1. All requests following r0(k+1) would then be identified as requests for embedded objects in page k+1. Due to parallel downloading, however, requests for the embedded objects in page k may be sent after r0(k+1). The measured response time of page k then becomes smaller than that perceived by the client, while that of page k+1 becomes larger. As we shall see in Section 5.1, sMonitor is still able to measure the average response time perceived by the client accurately.

In summary, sMonitor records RTTs to address the delay between the time clients send requests and the time servers receive these requests. It uses the configurable timeout mechanism to measure response times promptly. Requests that are the same as or larger than 1,500 bytes are determined to be pipelined requests for embedded objects. Finally, we demonstrate that the accuracy of sMonitor is acceptable even with a significant percentage of parallel downloading.

3.2 HTTPS Request Size Inference

In general, secured Internet services use either SSL or TLS protocols to encrypt HTTP messages transmitted between clients and servers. Since these protocols are similar, our discussion focuses on SSL protocols and also applies to TLS protocols unless specified explicitly.

In the SSL record protocol, an HTTP message is first fragmented into blocks of 2^14 bytes or less, which are then optionally compressed. A message authentication code (MAC) over a compressed fragment is then calculated using the HMAC-MD5 or HMAC-SHA-1 algorithm [14] and appended to the fragment. After that, the compressed fragment and the MAC are encrypted using a symmetric encryption algorithm. The final step is to attach a 5-byte plaintext record header to the beginning of the encrypted fragment. The SSL record fragment is then passed to lower protocols, such as TCP, for transmission. In this way, with the support of SSL protocols, the HTTP message is transmitted with guarantees of its secrecy, integrity, and authenticity.

As is known, it is extremely hard to hide information such as the size or the timing of a message [7]. In [31], the authors developed an approach to estimate the object size in an encrypted HTTPS response message. Because it assumed a minimal response message size of 4 KB, their approach cannot be applied to analyze HTTPS request messages of a few hundred bytes. In the following, we present our experiments for inferring the size of an HTTP request from its corresponding encrypted HTTPS one. In the experiments, we sent HTTP requests over SSL/TLS protocols to an HTTPS webserver using Microsoft's WinInet library. The sizes of these HTTP requests were known before being encrypted. We then captured the corresponding HTTPS messages and measured their sizes. There are three issues worth noting in the design of the experiments.

1. To send HTTP requests with arbitrary sizes, we set their Accept headers to meaningless characters in our experiments. HTTP/1.1 specifies that a webserver should send a 406 (Not Acceptable) response if the acceptable content type set in a request's Accept header cannot be satisfied directly. We found, however, that most webservers, including Microsoft Internet Information Services and Apache webservers, just ignore the unrecognized Accept header and send back the default response. Thus, the meaningless Accept headers we used did not bias our experimental results.
2. In order to evaluate different protocols, encryption algorithms, and MAC algorithms, we modified the settings of a Windows system according to [17]. This is important because otherwise only one set of algorithms would be used. For example, in IE 6.0, TLS protocols are disabled, while in IE 7.0 RC1 they are enabled by default; in addition, RC4 is always used by IE as the first option for the encryption algorithm and MD5 is used as the MAC algorithm.
3. To ensure that we were able to capture the corresponding HTTPS message of an HTTP request, we disabled the local cache. To achieve that, we passed the flags INTERNET_FLAG_NO_CACHE_WRITE and INTERNET_FLAG_RELOAD to the function HttpOpenRequest().

We conducted experiments with HTTP requests ranging from 100 through 2,000 bytes. The sizes of the HTTPS messages were obtained by checking the record headers of the first SSL/TLS record fragments with the content type application data. To demonstrate clearly how we determine HTTP request sizes from HTTPS messages, in Fig. 5, we only depict the results where the sizes of the HTTP requests range from 300 through 350 bytes. In the figure, the first half of each legend, RC4 or 3DES, denotes the cipher used, and the second half, e.g., MD5 or SHA, denotes the MAC algorithm. We determine the cipher suite used between the client and the server by checking the handshake messages for session establishment. These messages are not encrypted.

Fig. 5. Determining the size relationship between HTTP and HTTPS messages.

From Fig. 5, we observe that the size difference between an HTTP request and its corresponding HTTPS request is always 16 bytes for RC4_MD5 and 20 bytes for RC4_SHA. This is because of two reasons. First, RC4 is a stream cipher; thus, the ciphertext has the same size as the plaintext. Second, MD5 calculates a 16-byte MAC and SHA calculates a 20-byte MAC.

In the case of 3DES_SHA, we can determine the size of an HTTP message to within 8 bytes from the size of the corresponding HTTPS message. This is because 3DES is a block cipher with a block size of 8 bytes. For example, for a 360-byte HTTPS message, since there exists one byte that records the padding size, only another zero to seven padding bytes are needed to make the total an integral multiple of 8 in 3DES. Considering the 20-byte MAC calculated by SHA, the HTTP message therefore ranges from 332 through 339 bytes. Such a determination can also be observed from Fig. 5.
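Taken together, Sections 3.1 and 3.2 suggest a simple decoding pipeline: read the plaintext 5-byte record header to get the encrypted payload length, undo the per-cipher-suite MAC and padding overhead to recover the plaintext request size (exactly for stream ciphers, to within a block for block ciphers), and then flag container requests with the discrete second derivative. The following sketch is illustrative only and is not the authors' sMonitor code; the function names, the cipher-suite table, and the default threshold t = 60 are assumptions drawn from the numbers reported above.

```python
import struct

# MAC length and block size for the cipher suites discussed in Section 3.2.
MAC_LEN = {"RC4_MD5": 16, "RC4_SHA": 20, "3DES_SHA": 20}
BLOCK = {"RC4_MD5": 1, "RC4_SHA": 1, "3DES_SHA": 8}  # 1 = stream cipher

def record_payload_len(record: bytes) -> int:
    """Read the length field of a 5-byte SSL/TLS record header
    (content type, version major, version minor, 16-bit length)."""
    _ctype, _vmaj, _vmin, length = struct.unpack("!BBBH", record[:5])
    return length

def infer_http_size(payload_len: int, suite: str) -> range:
    """Possible plaintext sizes for an encrypted record payload.
    Exact for stream ciphers; for block ciphers, a range of block-size
    width (one padding-length byte plus 0..block-1 padding bytes)."""
    if BLOCK[suite] == 1:                       # stream cipher: exact
        size = payload_len - MAC_LEN[suite]
        return range(size, size + 1)
    hi = payload_len - MAC_LEN[suite] - 1       # zero-padding case
    return range(hi - (BLOCK[suite] - 1), hi + 1)

def container_indices(sizes, t=60):
    """Indices n with f''(x_n) = x_{n+1} - 2*x_n + x_{n-1} < -t, i.e.,
    requests significantly larger than their neighbors (Section 3.1)."""
    return [n for n in range(1, len(sizes) - 1)
            if sizes[n + 1] - 2 * sizes[n] + sizes[n - 1] < -t]

# The 3DES_SHA example from the text: a 360-byte encrypted payload
# maps back to an HTTP request of 332 through 339 bytes.
assert infer_http_size(360, "3DES_SHA") == range(332, 340)
```

In a real deployment, the 5-byte headers would be read out of reassembled TCP payloads captured by the sniffer; the resulting per-connection size sequence is all the performance analyzer needs for page delimitation.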
Furthermore, in SSL protocols, the padding added to an HTTP message prior to the encryption operation is the minimum amount required so that the total size of the data to be encrypted is a multiple of the cipher's block size. In contrast, TLS protocols define a random padding mechanism so that the padding can be any amount that results in a total that is a multiple of the cipher's block size, up to a maximum of 255 bytes. This mechanism aims to frustrate attacks based on a size analysis of exchanged messages. From Fig. 5, we can see that random padding is not implemented in IE 6.0 and 7.0 RC1 for TLS protocols. We can draw the same conclusion for Firefox.

The compression operation can change the size of an HTTP request. Such a change might affect the accuracy of the size-based analysis method presented in Section 3.1. From Fig. 5, however, we observe that the compression operation is not performed. This is because no default compression algorithm is specified in the SSL/TLS protocols.

In summary, we can determine the size of an HTTP request from the corresponding HTTPS request exactly in the case of stream ciphers, such as RC4, or to within a block-size difference in the case of block ciphers, such as 8 bytes for DES and 3DES.

4 EVALUATION METHODOLOGY

We conducted experiments under a wide range of operating conditions using live secured Internet services, on the PlanetLab, and on controlled networks to evaluate the accuracy of sMonitor. We evaluated the accuracy in three aspects. The first is the ability to accurately infer pageview response times perceived by clients. The second is the ability to correctly delimit different webpages. The third is the measurement lag from the time when the last object of a page is delivered to the time the page is delimited by sMonitor. The measurement lag is mainly determined by when the request for the container object of the next page arrives and by the timeout mechanism used in sMonitor. For example, when the response of the last object in the page is retrieved, sMonitor cannot decide whether the page is completely downloaded until it detects the arrival of requests for another page or the timeout period has passed. Measurement lag exists in other response-time monitors, such as ksniffer [21] and EtE [5]. For example, ksniffer used a timeout mechanism or the arrival of a request for a container object to identify the end of a page retrieval. There exist time gaps between the ends of page retrievals and their identifications.

respectively. In total, there were 1,041 container objects and 959 embedded objects.

Furthermore, we made the following enhancements to SURGE to mimic the behaviors of real-world browsers:

1. To support secured Internet services, we updated SURGE with OpenSSL 0.9.8 (www.openssl.org).
2. We enhanced SURGE to mimic the behaviors of IE and Firefox so that requests for container objects may have large Accept headers.
3. We updated SURGE to support parallel downloading by establishing another two persistent TCP connections for page retrievals.
4. We added support for HTTP pipelining to SURGE by following the default behaviors of Firefox, since it is the most widely used browser that has implemented HTTP pipelining.
5. We followed HTTP/1.1 to set SURGE to use two parallel persistent TCP connections to retrieve webpages and their embedded objects.
6. In the enhanced SURGE, each user equivalent (UE) binds to a unique IP address using IP aliasing on the client machines. This makes each client machine appear to the server as a collection of unique clients.
7. We updated SURGE so that each UE would send requests for a page's embedded objects only after the container object is received.
8. The enhanced SURGE also records the retrieval time of every webpage as the client-perceived response time.

In experiments with live secured Internet services, we used IE and Firefox to retrieve 44 webpages from several different Internet services, including online banking sites. We recorded their retrieval times manually. Because it was infeasible for us to deploy sMonitor near those servers, we placed sMonitor on the client side to measure response times. We believe that such a placement of sMonitor has little effect on our accuracy evaluation since the key for the measurement is the identification of the retrieval beginnings and ends of individual pages [5]. In the experiment, we also consider network transfer times from the client to servers to simulate measuring from the server side and use sMonitor to handle the variance of RTT during the experiment.

We also conducted experiments on the PlanetLab to evaluate sMonitor's accuracy in a real-world environment. In these experiments, clients resided on nine geographically diverse nodes: Cambridge, Massachusetts, San Diego,
California, and Cambridge, United Kingdom. The webser-
In the experiments, sMonitor captured network traffic in
and out of monitored services, analyzed packets, and then ver was setup in Detroit, Michigan. It was a Dell PowerEdge
inferred pageview response times. We used an Apache 2450 configured with dual-processor (1 GHz Pentium III)
webserver with the support of OpenSSL to provide secured and 512 MB main memory. We connected the server to the
Internet services. In addition, the Apache webserver also Internet via a 100 Mbps network card. During these
supports HTTP/1.1. experiments, the RTTs between the server and the clients
We used SURGE [2] to generate emulated web objects, in were around 45 ms (Cambridge), 70 ms (San Diego), and
which there were 2,000 unique objects. Similar to [21], we 130 ms (United Kingdom). One issue in the experiments on
also made minor changes to SURGE to reflect more recent the PlanetLab was that we could only simulate nine clients
work on web traffic characterizations [9], [30]. That is, the using nine nodes. It is because we couldn’t use IP aliasing
maximum number of embedded objects in a given page was on these nodes without root privileges. To make the
set to 100 instead of 150; the percentages of base, embedded, experiment environment more realistic, we ran SURGE on
and loner objects were changed from 30 percent, 38 percent, other machines to simulate another 100 clients to access the
and 32 percent to 42 percent, 48 percent, and 10 percent, service at the same time.
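The size-inference rule summarized at the end of Section 3 — exact recovery of the HTTP request size under stream ciphers such as RC4, and recovery to within one block under block ciphers such as DES and 3DES — can be sketched as follows. This is an illustrative simplification, not sMonitor's actual code; the cipher-suite names and MAC lengths are assumptions for the example.

```python
# MAC sizes and cipher block sizes for a few illustrative SSL/TLS suites.
MAC_LEN = {"RC4-MD5": 16, "RC4-SHA": 20, "DES-CBC-SHA": 20, "3DES-CBC-SHA": 20}
BLOCK = {"DES-CBC-SHA": 8, "3DES-CBC-SHA": 8}  # stream ciphers have no block size

def request_size_bounds(record_payload_len, cipher_suite):
    """Return (lower, upper) bounds on the plaintext HTTP request size
    inferred from one encrypted record's payload length."""
    mac = MAC_LEN[cipher_suite]
    block = BLOCK.get(cipher_suite)
    if block is None:
        # Stream cipher (e.g., RC4): payload = plaintext + MAC, so exact.
        size = record_payload_len - mac
        return size, size
    # Block cipher (CBC): payload = plaintext + MAC + padding + pad-length
    # byte, rounded up to a block multiple, so the plaintext size is known
    # only to within one block (8 bytes for DES and 3DES).
    upper = record_payload_len - mac - 1
    return upper - (block - 1), upper
```

For a 420-byte RC4-SHA record the request size is exactly 400 bytes; for a 424-byte 3DES record it lies in an 8-byte window, which is still far smaller than the size gap between requests for container and embedded objects that the analysis relies on.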
TABLE 1
The Summary of Experiment Results

The page split and page merge are two types of false page delimitations.

To further evaluate sMonitor's accuracy in different operating conditions, we implemented a network simulator similar to [29] and Dummynet [27] to simulate wide-area network conditions. In these experiments, two machines were used as clients and one as the network simulator. They had the same hardware configurations as the server and were connected by a 100 Mbps Ethernet. We changed the network routing in the server and client machines so that the packets between them were sent to the simulator. Upon receiving a packet, the simulator routed the packet to an "ethertap" device. A small user-space program read the packet from the "ethertap" device, delayed or dropped it according to the settings, and wrote it back to the device. The packet was then routed to the Ethernet. The simulator was shown to be effective in simulating wide-area network delays and packet losses. For example, with the RTT set to 180 ms, ping reported round trips of around 182 ms.

For the experiments in the controlled environments, we set the RTT between clients and the server to 40, 80, or 180 ms. These values represent the transmission latency within the continental US, the latency between the east and west coasts of the US, and that between the US and Europe, respectively [28]. Similar to [21], we set the packet-loss rate to 2 percent. The number of UEs was set to 100. Each experiment lasted 20 min. Because the results showed no clear trend of change as the time window size increased or decreased, we assume that the results are robust in this regard.

5 EXPERIMENTAL RESULTS

5.1 Measurement Accuracy on Average

We conducted comprehensive evaluations under different network and traffic conditions and compared sMonitor's measurements with those obtained by the enhanced SURGE running on the client machines. We changed the settings in SURGE to mimic different browser behaviors. We varied the percentage of requests for container objects that have large Accept headers. Although the combined market share of IE and Firefox is around 93 percent [18], we set the lower bound to 80 percent to investigate the accuracy of sMonitor in environments where a nonnegligible portion of clients uses browsers with different implementations of the Accept header; in Safari, for example, all requests have the same Accept headers. We also changed the percentage of pages downloaded in parallel. We varied this percentage between zero and 15 percent, and in this section we investigate its effect on sMonitor's accuracy. In addition, we varied the percentage of pipelined HTTP requests to simulate environments where some users of Firefox change the HTTP pipelining option from its default of off to on; we also examine its effect on the accuracy of sMonitor in this section.

Table 1 summarizes the experimental results. Experiments B and F were conducted on the PlanetLab and the others on the controlled networks. Because the results on the controlled networks with different RTTs were similar, Table 1 presents only the experiments with an RTT of 180 ms, for simplicity. In all test cases, we observed measurement errors no larger than 6.7 percent, and the absolute measurement errors were always smaller than 210 ms. When large Accept headers were not used in all requests for container objects, the sMonitor-measured response times were always larger than the client-perceived ones. This is because sMonitor cannot perfectly delimit the beginning and end of every page retrieval without the size difference between requests for container and embedded objects. In some cases, several pages might be falsely identified as one page, resulting in a larger response-time estimate. On the other hand, the sMonitor measurements can become smaller than the client-perceived ones when the effects of parallel downloading and HTTP pipelining become dominant, as we discuss in Section 5.3.

Note that in Table 1, we evaluate the accuracy of sMonitor via the measurement errors averaged over all retrieved pages in the 20-min experiments. We also investigated the transient behavior of sMonitor. Fig. 6 shows the average response times measured by sMonitor in experiment A at different time scales and compares them with those measured from the client sides. For brevity, we present only the results for the period from the 200th to the 400th second.
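The interval averages plotted in Fig. 6 can be computed by bucketing pages according to their completion times. The following is a minimal sketch with hypothetical data, not the experiment's actual traces; the function and parameter names are illustrative.

```python
from collections import defaultdict

def interval_averages(pages, interval_s):
    """pages: (completion_time, response_time) pairs in seconds.
    Returns {interval_index: average response time} for each interval
    that contains at least one completed page."""
    buckets = defaultdict(list)
    for completed_at, response_time in pages:
        # Assign the page to the interval in which its retrieval completed.
        buckets[int(completed_at // interval_s)].append(response_time)
    return {i: sum(v) / len(v) for i, v in buckets.items()}
```

Averaging the same trace over 5-s windows instead of 1-s windows smooths out the effect of the measurement lag, which is consistent with the 5-s curves in Fig. 6 tracking the client-side results more closely.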
Fig. 6. Comparisons of average response times for experiment A. (a) Average for every 5 s. (b) Average for every 1 s.

The results show that most measurement errors are less than 2 s for the 1-s interval results. This further indicates that sMonitor is accurate in measuring client-perceived response times. Comparing the 1-s and 5-s interval results, we can observe that the 5-s interval results have smaller errors (0.473 s) than the 1-s interval results (1.136 s). This is because the larger averaging interval decreases the effect of the measurement lag on the accuracy of sMonitor. Such a result also demonstrates that it is necessary to include the measurement lag in assessing the accuracy of sMonitor.

5.2 Measurement Accuracy of Individual Pages

We further investigated the accuracy of sMonitor by comparing its measured response times of individual pages against those measured from the client sides. Fig. 7 presents the scatter plots, which we used to determine the relationship between the two measurements. Fig. 7a shows the comparison results for the experiments conducted using live secured Internet services. In Fig. 7b, we present the measurement comparisons of experiment B for requests sent by SURGE running on the PlanetLab nodes. Fig. 7c shows the comparison results of experiment A. From these figures, we can observe that there is a strong linear relationship between the two measurements. Such a linear relationship indicates that the sMonitor-measured response times are very close to those measured from the client sides. Indeed, the stronger the linear relationship between the measurements, the more accurate sMonitor is. This is also confirmed by Fig. 7d, which presents the error distributions of individual pages for the experiments conducted on the PlanetLab and the controlled testbed. From Fig. 7d, we can observe that the measurement errors for most individual pages are close to zero.

Furthermore, the correlation coefficients of the results from the PlanetLab and the controlled networks are 0.96 and 0.98, respectively. Notice that the closer the correlation coefficient is to 1 (a perfect monitor, where every page is measured accurately), the more accurate sMonitor is. These results demonstrate that sMonitor can accurately measure client-perceived response times in different environments.

Notice that the measured response time of one client is not affected by those of other clients. This is because sMonitor measures the response time of each client separately by delimiting its own page requests. Moreover, from Fig. 2 we can observe that changes of network characteristics during one page retrieval do not affect the accuracy of sMonitor significantly as long as the requests for container objects are identified correctly. For example, assume that the packet-loss rate changes during the retrieval of a page and causes the request for embedded object i to be retransmitted several times. sMonitor treats these retransmissions as requests for different embedded objects rather than for the same object. The accuracy of the measured client-perceived response time is, however, not affected.

5.3 Measurement Accuracy in Various Browser Settings

To further investigate the accuracy of sMonitor in different operating environments, we conducted three

Fig. 7. Response time comparisons for individual pages. (a) Live secured Internet services. (b) On the PlanetLab. (c) On the controlled
environments. (d) Error distributions.
sets of experiments with different settings of the usage of large Accept headers, parallel downloading, and HTTP pipelining. Each set consists of 21 experiments. Fig. 8 plots the results.

Fig. 8. Accuracy of sMonitor in different operating environments.

Fig. 8 shows that, with the increasing usage of large Accept headers, sMonitor's measurement error decreases. For example, when no large Accept headers are used in requests for container objects, the average response time measured by sMonitor is 4.269 s while the actual client-perceived response time is 3.185 s, a measurement error of 34 percent. In this case, sMonitor relies solely on the timeout mechanism to delimit webpages: it identifies all requests issued within the timeout period as requests for one single page, although they may be for different pages. Thus, sMonitor measures larger response times than those measured from the client sides. The usage of large Accept headers helps sMonitor to delimit pages even when the requests are issued within the timeout period, and thus reduces the measurement errors. Notice that in this set of experiments, we set the usage percentages of both parallel downloading and HTTP pipelining to zero to prevent them from affecting our accuracy evaluation of sMonitor.

Fig. 8 also shows the accuracy of sMonitor in environments with different percentages of parallel downloaded pages. In this set of experiments, the large Accept header ratio was set to 100 percent and no HTTP requests were pipelined. From this figure we can gain two insights. First, the sMonitor-measured response times are smaller than those perceived by clients. In our experiments, when several pages are downloaded in parallel, two TCP connections are established for each page, mimicking the behavior of launching a new IE instance. From the viewpoint of a client, parallel downloading does not reduce the response times of individual pages. For sMonitor, however, the total downloading time of these pages becomes smaller because their requests overlap.

Second, with the increasing usage of parallel downloading, more requests overlap and sMonitor measures smaller response times. For example, when only 10 percent of the webpages are downloaded in parallel, the average response time measured by sMonitor is 3.085 s and the measurement error is -1.4 percent. When all webpages are downloaded in parallel, while the response times measured from the client sides are similar, the average response time measured by sMonitor is reduced to 2.794 s, which is equivalent to a measurement error of about -10 percent.

HTTP pipelining is another factor that we need to investigate in evaluating the accuracy of sMonitor. In these experiments, we assumed that all requests for container objects had large Accept headers and that no pages were downloaded in parallel. Fig. 8 also plots the results. They show that the more requests are pipelined, the larger the measurement error. For clients, pipelining of HTTP requests can reduce the perceived response times, since several requests for embedded objects are sent to the server within one packet without waiting for each response. On the other hand, the size differences between pipelined requests for embedded objects and requests for container objects become smaller. This may cause sMonitor to delimit multiple pages as a single one. As a result, with the increasing usage of pipelined requests, sMonitor measures larger response times than those perceived by clients.

5.4 Accuracy of Page Delimitation

sMonitor's accuracy on response times is mainly determined by its accuracy on page delimitation. There are two types of false page delimitations. The first type is page split, in which one page is identified as multiple pages. The second type is page merge, in which multiple pages are identified as one.

In Table 1, we also present the results of these two types of false page delimitations in different environments. For example, in experiment A, the total percentage of falsely delimited pages was as low as 3 percent. With these falsely delimited pages, the measurement error was only 0.3 percent. In all experiments, split pages were less than 2.3 percent and merged pages were less than 17 percent. These results further demonstrate the high accuracy of sMonitor in measuring client-perceived response times.

To investigate the causes of false page delimitations in sMonitor, we conducted experiments with different settings. Fig. 9 shows the experimental results. In Fig. 9a, we present the effect of the usage of large Accept headers on page delimitation. The results show that the usage of large Accept headers has a negligible effect on split pages. This is because the sizes of requests for embedded objects are not affected; these requests have similar sizes, and sMonitor does not identify them as the beginnings of pages.

With the increasing usage of large Accept headers, Fig. 9a also shows that the number of merged pages drops significantly, from 65.3 percent to 2.5 percent. This demonstrates that large Accept headers are effective in helping sMonitor to delimit pages.

We also conducted experiments to determine how sMonitor's accuracy on page delimitation is affected by parallel downloading. Fig. 9b plots the experimental results. From this figure we can observe that, with the increasing usage of parallel downloading, more pages are split or merged. This is expected since, if two pages are downloaded in parallel, requests for their container and embedded objects interleave with each other and lead to false page delimitations. Moreover, since parallel downloading does not change request sizes, its effect on split and merged pages is small. For example, when the percentage of parallel downloaded pages varies from 20 percent to 75 percent, less than 4.5 percent of the pages are split and no more than 6.5 percent are merged.

Given the fact that HTTP pipelining is becoming popular, we conducted more experiments to determine how it affects the accuracy of sMonitor. Fig. 9c plots the experimental results. It is clear that HTTP pipelining has
little effect on the number of split pages. On the other hand, with the increasing usage of HTTP pipelining, the number of merged pages increases. This is because pipelined requests for embedded objects make the size differences between requests for container and embedded objects smaller. sMonitor then becomes more likely to merge multiple webpages into one rather than to split pages.

In addition to the overall results, we also investigated the detailed behavior of sMonitor in delimiting pages. Fig. 10 depicts the scatter plots for experiment E. Because split pages do not exist in client-retrieved pages, they are plotted in Fig. 10 as points with an x value of zero. Similarly, merged pages are plotted as points with a y value of zero.

Fig. 10 shows that there are more delimited pages above the line, which consists of accurately delimited pages, than below it. This implies that sMonitor measures larger response times for the pages above the line than SURGE measures from the client sides. We have similar results in all other experiments. This is because there are more merged pages than split ones, as can be observed from Fig. 9. Because a merged page is actually a mix of several pages, it has a larger response time.

Fig. 9. Effect of browser behaviors on sMonitor's accuracy on page delimitation. (a) Accept header. (b) Parallel downloading. (c) HTTP pipelining.

Fig. 10. Page delimitations in experiment E.

5.5 Measurement Lag of sMonitor

Recall that the measurement lag refers to the time difference between the time the retrieval of a page ends and the time the same page is delimited by sMonitor. To measure the measurement lag of sMonitor, we synchronized all client machines and servers with one NTP server. In addition, we calculate the lags of only the correctly delimited pages, to prevent the falsely delimited ones from affecting our evaluations.

Table 1 also shows the mean measurement lags in the last column. They vary between 3.5 and 4 s. More importantly, they are at least 0.5 s longer than the client-perceived response times on average. This suggests that requests for a page could be processed within one measurement lag. Therefore, it is important to take the measurement lag into account in order to effectively manage client-perceived response times [19], [33]. That is, to control response times, a scheduler needs to obtain the current system status promptly in order to determine which client should be processed next. Due to the measurement lag, however, a scheduler has to decide the processing order of client requests based on lagged system status, which might be very different from the current status.

We also examined the detailed behavior of the measurement lag. Fig. 11 plots the cumulative probability distributions of the measurement lags for experiments C and G. From Fig. 11, we can see that all pages were delimited within 5 s after their responses ended. This is because of the timeout setting used in sMonitor. In addition, 14 percent of the measurement lags are 5 s, which suggests that these pages were delimited via the timeout mechanism. Furthermore, about 75 percent of the measurement lags are larger than 3 s and 50 percent are larger than 4 s. These results further show the importance of the measurement lag.

Fig. 11. Cumulative distribution function of the measurement lags in experiments C and G.

To investigate the measurement lag in different environments, we conducted three sets of experiments with different settings of large Accept headers, parallel downloading, and HTTP pipelining. Fig. 12 shows the experimental results. With the increasing usage of large Accept headers, we can clearly see that the measurement lags become smaller. This is because the usage of large Accept headers helps sMonitor to identify the beginnings of pages and to delimit pages more promptly.
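The lag statistics above can be derived from per-page timestamps collected on the NTP-synchronized machines. The following is a minimal sketch with hypothetical data, not the experiment's actual traces; the function names are illustrative.

```python
def measurement_lags(pages):
    """pages: (response_end, delimit_time) pairs for correctly delimited
    pages, in seconds on a common clock. Returns the sorted lags."""
    return sorted(delimit - end for end, delimit in pages)

def cdf_at(lags, t):
    """Fraction of pages delimited within t seconds of their response end,
    i.e., one point of the cumulative distribution plotted in Fig. 11."""
    return sum(1 for lag in lags if lag <= t) / len(lags)

# Hypothetical example: three pages, delimited 3.2 s, 4.0 s, and 5.0 s
# after their responses ended.
lags = measurement_lags([(0.0, 3.2), (1.0, 5.0), (2.0, 7.0)])
```

Evaluating `cdf_at` at the timeout value shows what fraction of pages had to wait for the timeout mechanism rather than being delimited by the arrival of the next container-object request.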
Fig. 12. Effect of browser behaviors on the measurement lag.

Parallel downloading also helps to reduce the measurement lag significantly. For example, the measurement lag decreases from 3.899 s without parallel downloading to 2.507 s when as many pages as possible are downloaded in parallel. Such a reduction is due to the decrease of the interarrival times between requests for pages. When two pages are downloaded sequentially, the measurement lag is the time difference between the response end of the first page and the arrival of the request for the container object of the second page.

Meanwhile, HTTP pipelining has little effect on the measurement lag. For example, when HTTP pipelining was not used, the average measurement lag was 3.899 s. It drops to 3.755 s when as many HTTP requests as possible were pipelined. This small decrease is expected because pipelining of HTTP requests provides no help in delimiting pages.

To determine the effect of the measurement lag on page delimitation, we compare the numbers of delimited pages in each 1-s interval measured by sMonitor and SURGE for experiment C. Fig. 13 shows the differences between the results of the sMonitor and SURGE measurements at different timescales. From Fig. 13a, we observe that the average difference in a 5-s interval is about 1.8 pages, or a 26 percent delimitation error. This is in contrast to 2.2 pages, or a 42 percent delimitation error, at the 1-s timescale. Although all pages are delimited within 5 s of their response ends, as shown in Fig. 11, using 5 s as the averaging interval does not eliminate the measurement lag. This can be observed from Fig. 13a. The reason is that the end of a page may be identified in one averaging interval while the page actually ends in the previous one. How to further reduce the measurement lag is part of our future work.

6 RELATED WORK

Existing response-time monitors can be classified according to the type of monitoring (active versus passive) and the location of the monitor (client side versus server side). Active client-side monitors such as the Keynote service [12] and WebProphet [16] are deployed at selected client sites and issue service requests as scheduled or by emulation. WebProphet [16] develops models for object download times and their dependences by simulating the page load process. Active monitors have difficulty in emulating real user behaviors using customized browsers. Meanwhile, they often measure the response times of only a few selected webpages from a few selected locations. Such limitations make the results unrepresentative.

Passive client-side monitors can be based on instrumentation code embedded inside webpages downloaded from servers, or on browser instrumentation. The OpenView Transaction Analyzer (OVTA) from HP [10] is an example of webpage instrumentation. In OVTA, JavaScript code is downloaded with webpages to record access times and to report statistics back to the server. One limitation of this approach is that non-HTML webpages, such as PDF files, are difficult to instrument. In addition, since the JavaScript code is executed after the instrumented HTML file is retrieved, it typically is unable to measure the connection establishment time and the retrieval time of the HTML file itself. To overcome these limitations, IBM has proposed Client Perceived Response Time (CPRT) [25]. Besides instrumenting webpages, in CPRT the hyperlinks pointing to these webpages are also instrumented using JavaScript code so that the retrieval time of the HTML file can be measured. CPRT, however, is unable to measure the response times of webpages that are accessed directly. Such webpages include those accessed by a client through typing the address in a browser, through favorite collections, or through search engines.

Browser-instrumentation approaches place a monitor between a browser and the network socket layer, or within the browser as a plug-in, to access response times of Internet services from the client's perspective. Representatives include Page Detailer from IBM [11]. This type of approach has the same drawbacks as active client-side monitors. Therefore, it is generally used for testing and debugging services.

Client-side monitors have the advantage that they can provide the most accurate measurement of client-perceived response times. This includes the latency incurred in domain name system inquiries and the time spent retrieving objects from locations other than the monitored servers. Their disadvantages are related to issues in emulating representative user behaviors. In addition, they are unable to provide a breakdown between network and server delays, and they lack drill-down troubleshooting capabilities without additional webserver instrumentation. These limitations apply to both active and passive client-side monitors.

Passive server-side monitors normally measure response times using an instrumented server or by analyzing logs or

Fig. 13. Behaviors of the measurement lags inferred from comparisons of page delimitations. (a) Average for every 5 s. (b) Average for every 1 s.
traffic in and out of monitored services. Server-instrumentation approaches track the request arrivals and response departures at the application level or at the kernel level. Application-level server instrumentation, such as [1], provides an easy way to obtain information regarding Internet service transactions. It does not, however, consider the network delays incurred during TCP connection establishment or the kernel waiting times of client requests. In [20], the authors showed that application-level approaches could underestimate response times by more than an order of magnitude. Kernel-level approaches, such as [20], overcome these limitations. They, however, measure service performance at a per-connection level. The measured results differ from what is perceived by end users due to the wide usage of parallel and persistent connections.

Traffic-analysis approaches, including [5], [21], decode packets up to the HTTP layer to identify the beginning and the end of each HTTP transaction. Since passive server-side monitors have a detailed view of the monitored system, they provide the most accurate performance characteristics of servers. Moreover, they observe actual service traffic and, therefore, provide a measurement of the experiences of all real clients. They can be deployed easily and run without interfering with a server's operation. Due to the unavailability of HTTP headers, however, none of these approaches can measure response times for secured Internet services.

Log-analysis approaches, such as [15], estimate the response time of a page as the serving time of the container object and the last embedded object. This approach, however, suffers the same shortcoming as application-level server instrumentation since it does not take network delays into account. Moreover, with dynamically generated webpages, it is difficult to determine which embedded object belongs to which container object without the help of the referrer field.

Server-side monitoring methods, including sMonitor and ksniffer, share the limitation that they are unable to measure latencies incurred before clients send packets to servers. They are also limited in measuring the latencies of requests handled by intermediate components between clients and servers, such as browser caches and proxies.

Other work on network traffic analysis includes the passive packet monitor used on the AT&T network [6] and the inference of HTTP characteristics from TCP/IP packet headers [30]. These works discussed many challenges, such as TCP connection reconstruction, that we also faced in the design and implementation of sMonitor.

7 CONCLUSIONS

We have designed, implemented, and evaluated sMonitor, a monitor that can determine client-perceived pageview response times for secured Internet services without decrypting HTTPS messages. Since sMonitor passively collects network traffic in and out of the monitored services, it requires no changes to any part of the services or clients and can be deployed easily. It measures the response times nonintrusively, using the novel size-based analysis method on HTTP requests to characterize client accesses and delimit different pages from live network traffic.

We have implemented sMonitor as a stand-alone application in the user space. We have conducted comprehensive evaluations of its accuracy using live Internet services, on the PlanetLab, and on the controlled networks. Our results demonstrate the unique ability of sMonitor to infer client-perceived pageview response times accurately. More importantly, the measurement is obtained in the presence of complicated browser behaviors, such as parallel downloading and HTTP pipelining, as well as packet losses and delays.

We note that sMonitor is required to be deployed in front of a website so as to capture all the traffic in and out of the website. In many applications, however, the objects of a webpage may be located or generated at geographically distributed websites. How to measure the client-perceived pageview response time for such webpages deserves further study. Recent studies, such as Link Gradients [4] and WISE [32], made a step forward by developing ways to estimate the end-to-end request/response transaction time of distributed applications under changing network conditions. On the client side, there may be a firewall, NAT-enabled router, or proxy that hides clients from the server by changing the request sources to the proxy. In this case, sMonitor would measure the response time to the proxy instead. sMonitor treats the requests from different clients as parallel downloading, and the results remain instructive for performance diagnosis and QoS provisioning.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their constructive comments and suggestions. This research was supported in part by US National Science Foundation (NSF) grants DMS-0624849, CNS-0702488, CRI-0708232, CNS-0914330, and CCF-1016966. The original idea of the size-based analysis method appeared in [34].

REFERENCES

[1] J. Almeida, M. Dabu, A. Manikutty, and P. Cao, "Providing Differentiated Levels of Service in Web Content Hosting," Proc. ACM SIGMETRICS Workshop Internet Server Performance, pp. 91-102, June 1998.
[2] P. Barford and M. Crovella, "Generating Representative Web Workloads for Network and Server Performance Evaluation," Proc. ACM SIGMETRICS, pp. 151-160, June 1998.
[3] N. Bhatti, A. Bouch, and A. Kuchinsky, "Integrating User-Perceived Quality into Web Server Design," Proc. Ninth Int'l World Wide Web (WWW) Conf. Computer Networks, pp. 1-16, 2000.
[4] S. Chen, K. Joshi, M. Hiltunen, W. Sanders, and R. Schlichting, "Link Gradients: Predicting the Impact of Network Latency on Multi-Tier Applications," Proc. IEEE INFOCOM, 2009.
[5] L. Cherkasova, Y. Fu, W. Tang, and A. Vahdat, "Measuring and Characterizing End-to-End Internet Service Performance," ACM Trans. Internet Technology, vol. 3, no. 4, pp. 347-391, 2003.
[6] A. Feldmann, "BLT: Bi-Layer Tracing of HTTP and TCP/IP," Proc. Ninth Int'l World Wide Web (WWW) Conf. Computer Networks, pp. 321-335, 2000.
[7] N. Ferguson and B. Schneier, Practical Cryptography. John Wiley & Sons, 2003.
[8] R.T. Fielding, J. Gettys, J.C. Mogul, H.F. Nielsen, L. Masinter, P.J. Leach, and T. Berners-Lee, Hypertext Transfer Protocol - HTTP/1.1. Network Working Group, Request for Comments 2616bis, June 1999.
[9] F. Hernandez-Campos, K. Jeffay, and F.D. Smith, "Tracking the Evolution of Web Traffic: 1995-2003," Proc. 11th IEEE Int'l Symp. Modeling, Analysis, and Simulation of Computer and Telecomm. Systems (MASCOTS), pp. 16-25, 2003.
[10] HP, “Openview Transaction Analyzer,” http://openview.hp.com/, 2010.
[11] IBM, “Page Detailer,” http://www.alphaworks.ibm.com/tech/pagedetailer, 2010.
[12] Keynote Systems, Inc., www.keynote.com, 2010.
[13] R. Kohavi, R. Henne, and D. Sommerfield, “Practical Guide to Controlled Experiments on the Web: Listen to Your Customers Not the HiPPO,” Proc. ACM SIGKDD, 2007.
[14] H. Krawczyk, M. Bellare, and R. Canetti, HMAC: Keyed-Hashing for Message Authentication. Network Working Group, Request for Comments 2104, Feb. 1997.
[15] B. Krishnamurthy and C.E. Wills, “Improving Web Performance by Client Characterization Driven Server Adaptation,” Proc. 11th Int’l Conf. World Wide Web, 2002.
[16] Z. Li, M. Zhang, Z. Zhu, Y. Chen, A. Greenberg, and Y.-M. Wang, “WebProphet: Automating Performance Prediction for Web Services,” Proc. Seventh USENIX Symp. Networked Systems Design and Implementation (NSDI), 2010.
[17] Microsoft Corporation, “How to Restrict the Use of Certain Cryptographic Algorithms and Protocols in Schannel.dll,” http://support.microsoft.com/?kbid=245030, Dec. 2004.
[18] NetApplications.com, “Browser Version Market Share,” http://marketshare.hitslink.com/report.aspx?qprid=6, Dec. 2006.
[19] D. Olshefski and J. Nieh, “Understanding the Management of Client Perceived Response Time,” Proc. ACM SIGMETRICS, pp. 240-251, 2006.
[20] D. Olshefski, J. Nieh, and D. Agrawal, “Using Certes to Infer Client Response Time at the Web Server,” ACM Trans. Computer Systems, vol. 22, no. 1, pp. 49-93, 2004.
[21] D.P. Olshefski, J. Nieh, and E. Nahum, “ksniffer: Determining the Remote Client Perceived Response Time from Live Packet Streams,” Proc. Sixth USENIX Symp. Operating Systems Design and Implementation (OSDI), pp. 333-346, 2004.
[22] V.N. Padmanabhan and L. Qiu, “The Content and Access Dynamics of a Busy Web Site: Findings and Implications,” Proc. ACM SIGCOMM, pp. 111-123, 2000.
[23] D. Patterson, “A Simple Way to Estimate the Cost of Downtime,” Proc. 16th USENIX Large Installation System Administration Conf. (LISA), pp. 185-188, 2002.
[24] V. Paxson and M. Allman, Computing TCP’s Retransmission Timer. Network Working Group, Request for Comments 2988, Nov. 2000.
[25] R. Rajamony and M. Elnozahy, “Measuring Client-Perceived Response Time on the WWW,” Proc. Third USENIX Symp. Internet Technologies and Systems (USITS), 2001.
[26] Jupiter Research, “Retail Web Site Performance: Consumer Reaction to a Poor Online Shopping Experience,” technical report, JupiterKagan, Inc., 2006.
[27] L. Rizzo, “Dummynet: A Simple Approach to the Evaluation of Network Protocols,” ACM SIGCOMM Computer Comm. Rev., vol. 27, no. 1, pp. 31-41, 1997.
[28] S. Shakkottai, R. Srikant, N. Brownlee, A. Broido, and K. Claffy, “The RTT Distribution of TCP Flows in the Internet and Its Impact on TCP-Based Flow Control,” technical report, The Cooperative Assoc. for Internet Data Analysis (CAIDA), 2004.
[29] J. Slottow, A. Shahriari, M. Stein, X. Chen, C. Thomas, and P.B. Ender, “Instrumenting and Tuning Dataview—A Networked Application for Navigating through Large Scientific Datasets,” Software: Practice and Experience, vol. 32, no. 2, pp. 165-190, Feb. 2002.
[30] F.D. Smith, F. Hernandez-Campos, K. Jeffay, and D. Ott, “What TCP/IP Protocol Headers Can Tell Us About the Web,” Proc. ACM SIGMETRICS, pp. 245-256, 2001.
[31] Q. Sun, D.R. Simon, Y.-M. Wang, W. Russell, V.N. Padmanabhan, and L. Qiu, “Statistical Identification of Encrypted Web Browsing Traffic,” Proc. IEEE Symp. Security and Privacy, pp. 19-30, May 2002.
[32] M. Tariq, K. Bhandankar, V. Valancius, A. Zeitoun, N. Feamster, and M. Ammar, “Answering ‘What-If’ Deployment and Configuration Questions with WISE: Techniques and Deployment Experience,” Proc. ACM SIGCOMM, 2008.
[33] J. Wei and C.-Z. Xu, “eQoS: Provisioning of Client-Perceived End-to-End QoS Guarantees in Web Servers,” IEEE Trans. Computers, vol. 55, no. 12, pp. 1543-1556, Dec. 2006.
[34] J. Wei and C.-Z. Xu, “sMonitor: A Non-Intrusive Client-Perceived End-to-End Performance Monitor of Secured Internet Services,” Proc. USENIX Ann. Technical Conf., June 2006.
[35] J. Wei, X. Zhou, and C.-Z. Xu, “Robust Processing Rate Allocation for Proportional Slowdown Differentiation on Internet Servers,” IEEE Trans. Computers, vol. 54, no. 8, pp. 964-977, Aug. 2005.
[36] C.-Z. Xu, J. Wei, and F. Liu, “Model Predictive Feedback Control for QoS Assurance in Web Servers,” Computer, vol. 41, no. 3, pp. 66-72, Mar. 2008.

Jianbin Wei received the BS degree in computer science from the Huazhong University of Science and Technology, China, in 1997. He received the MS and PhD degrees in computer engineering from Wayne State University in 2003 and 2006, respectively. His research interests are in distributed and Internet computing systems. He is currently with Yahoo, working on platforms of cloud computing. He is a member of the IEEE Computer Society.

Cheng-Zhong Xu received the BS and MS degrees from Nanjing University in 1986 and 1989, respectively, and the PhD degree in computer science from the University of Hong Kong in 1993. He is currently a professor in the Department of Electrical and Computer Engineering at Wayne State University, the Director of the Cloud and Internet Computing Laboratory, and the Director of Sun Microsystems’ Center of Excellence in Open Source Computing and Applications. His research interest is mainly in scalable distributed and parallel systems and wireless embedded computing devices, with an emphasis on resource and system management for performance, availability, reliability, energy efficiency, and security. He has published more than 160 articles in peer-reviewed journals and conferences in these areas, including more than 20 papers in IEEE and ACM transactions. He is the author of the book Scalable and Secure Internet Services and Architecture (Chapman & Hall/CRC Press, 2005) and a coauthor of the book Load Balancing in Parallel Computers: Theory and Practice (Kluwer Academic/Springer Verlag, 1997). He serves on the editorial boards of IEEE Transactions on Parallel and Distributed Systems, Journal of Parallel and Distributed Computing, Journal of Parallel, Emergent, and Distributed Systems, Journal of Computers and Applications, Journal of High Performance Computing and Networking, and ZTE Communications. He has served dozens of international conferences and workshops in the capacity of program chair, general chair, and plenary speaker. He was a recipient of the Faculty Research Award, the President’s Award for Excellence in Teaching, and the Career Development Chair Award of Wayne State University, and the “Outstanding Overseas Scholar” award of the National Science Foundation of China. He is a senior member of the IEEE. For more information, please visit http://www.ece.eng.wayne.edu/~czxu.