Professional Documents
Culture Documents
Abstract—We address the problem of highly transient pop- ployed P2P systems have reported median session times varying
ulations in unstructured and loosely structured peer-to-peer from one hour to one minute [13], [8], [6].
(P2P) systems. We propose a number of illustrative query-related The implications of such degrees of Churn on the system’s
strategies and organizational protocols that, by taking into con-
sideration the expected session times of peers (their lifespans),
performance are directly related to the degree of peers’ invest-
yield systems with performance characteristics more resilient to ment in their friends. At the very least, the amount of mainte-
the natural instability of their environments. We first demonstrate nance-related messages processed by any node would be a func-
the benefits of lifespan-based organizational protocols in terms of tion of the degree of stability of the node’s neighboring set. Be-
end-application performance and in the context of dynamic and yond this, and in the context of content distribution P2P sys-
heterogeneous Internet environments. We do this using a number tems, the degree of replication, the effectiveness of caches, and
of currently adopted and proposed query-related strategies, in-
cluding methods for query distribution, caching, and replication.
the spread and satisfaction level of queries will all be affected
We then show, through trace-driven simulation and wide-area by how dynamic the peers’ population is.
experimentation, the performance advantages of lifespan-based, Our work addresses the problem of highly transient popula-
query-related strategies when layered over currently employed tions in unstructured and loosely-structured P2P systems (col-
and lifespan-based organizational protocols. While merely illus- lectively, less-structured P2P systems). Through active probing
trative, the evaluated strategies and protocols clearly demonstrate of over half a million peers in a widely deployed P2P system,
the advantages of considering peers’ session time in designing
widely-deployed P2P systems. we determined that the session time of peers can be well mod-
eled by a Pareto distribution. In this context, the implication is
Index Terms—Lifespan, session time, resilience, peer-to-peer that the expected remaining session time of a peer is directly
(P2P).
proportional to the session’s current length, i.e., the peer’s age.
This observation forms the basis for a new set of protocols for
I. INTRODUCTION peer organization and query-related strategies that, by taking
into consideration the expected session times of peers, yield sys-
tems with performance characteristics that are more resilient to
P EER-TO-PEER (P2P) computing can be defined as the
sharing of computer resources and services by direct
exchange between the participating nodes. Since Napster’s [2]
the natural instability of their environments.
The lifespan-based approach for organizational protocols was
introduction in the late 1990s, the area has received increasing first proposed in our position paper [8], where we show its effec-
attention from the research community and the general public. tiveness in terms of increased system stability (e.g., over 42%
Peers in P2P systems typically define an overlay network reduction on the ratio of connection breakdowns and their asso-
topology by keeping a number of connections to other peers, ciated costs) through a simulation study. In this paper, we first
their “friends,” and implementing a maintenance protocol that demonstrate the advantages of our approach in terms of appli-
continuously repairs the overlay as new members join and cation performance in a dynamic Internet testbed of 150 widely
others leave the system. distributed PlanetLab nodes. We do this using a set of illustra-
Due in part to the autonomous nature of peers, their mutual tive organizational protocols combined with a number of cur-
dependency, and their astoundingly large populations, the tran- rently adopted and proposed query-related strategies, including
siency of peers (a.k.a. Churn) and its implications on the overall methods for query distribution, caching and replication. Our re-
system’s performance have recently attracted the attention of the sults show that even simple lifespan-based overlays can signifi-
research community [3]–[12]. A well-accepted metric of Churn cantly boost query performance (with improvements of at least
is node session time—the time from the node’s joining to its sub- 57% on aggregated query hits and 50% reduction in query reso-
sequent leaving from the system.1 Measurement studies of de- lution time) and increase system scalability by achieving query
performance comparable to those of currently deployed proto-
cols with only a third of their load.
Manuscript received May 11, 2006; revised January 15, 2007; approved by
IEEE/ACM TRANSACTIONS ON NETWORKING Editor B. Levine. An earlier ver- We then apply similar ideas to query-related strategies.
sion of this paper appeared in Proceedings of the IEEE International Conference Through trace-driven simulation and wide-area experi-
on Peer-to-Peer Computing (P2P), Konstanz, Germany, August 2005. mentation, we demonstrate the performance advantages of
The authors are with the Department of Electrical Engineering and Com-
puter Science, Northwestern University, Evanston, IL 60208 USA (e-mail: lifespan-based query-related strategies when layered over
fabianb@cs.northwestern.edu; yqiao@cs.northwestern.edu). currently employed organizational protocols as well as when
Digital Object Identifier 10.1109/TNET.2007.903986 used in combination with their lifespan-based variations. Our
1We employ lifespan and session time interchangeably. Another metric of results show that lifespan-based strategies can generate over
transiency sometimes used, lifetime, refers instead to the time between the node 2–5 times more query hits than alternative strategies. While
first entering the system and its final departure from it [6] merely illustrative, the evaluated protocols and strategies clearly
1063-6692/$25.00 © 2008 IEEE
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
618 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 16, NO. 3, JUNE 2008
demonstrate the advantages of considering peers’ session time of messages in the network. A number of improvements to the
in designing widely deployed P2P systems. basic strategy have been suggested [5], [20], [21]. Adamic et
The remainder of this paper is structured as follows. Section II al. [20] propose using random walk in power-law topologies
provides a brief background on relevant P2P systems, protocols, with walks biased toward high-degree nodes. While this can
and strategies. Sections III and IV present results from our study significantly improve query performance, it could also result in
on peers’ lifespans and discuss lifespan-based organizational overloaded nodes. Lv et al. [21] and Chawathe et al. [5] suggest
protocols and query-related strategies. Section V describes our taking node capacity into consideration through biased random
evaluation methodology and presents results from both trace- walks, directed toward high-capacity peers.
based simulations and wide-area experiments. We discuss some To further boost query performance, different strategies for
related work in Section VI and conclude in Section VII. index caching have been proposed, including Path Caching with
eXpiration (PCX) and Neighbor Caching with incremental Up-
II. BACKGROUND date (NCU). With PCX, each node in the system maintains an
index cache, with each entry being a (key, value) pair [22]. The
The structure of a peer-to-peer system is defined by its organi- value in the pair is normally a pointer to the node holding a
zational protocol. In unstructured (UDP) and loosely structured replica of the object associated with the corresponding key [23],
or hierarchical P2P (HDP) systems,2 peers define the P2P net- [24]. Upon receiving a query message, the node not only checks
work overlay through connections with other (mostly randomly) its own shared objects, but also scans the cache for entries with
chosen peers. While organizational protocols for unstructured matching keys. Upon a successful match, the node replies with
systems, such as Gnutella v0.4 [15], consider all peers as equals, the associated pair(s). PCX can be used in both DHT-based and
protocols for loosely structured systems, like Gnutella v0.6 and less structured systems to improve query results. With NCU [5],
Kazaa [16]–[18], commonly define a two-level hierarchy dis- [18], [20], each node maintains caches of metadata for all of its
tinguishing between common “leaf” peers and well provisioned neighbors. As with PCX, a node sends a query hit either on its
super-peers. own behalf or on behalf of its neighbors, effectively increasing
Connected peers interact with each other by exchanging var- the reach of a query [25], [26].
ious types of messages, most of which are broadcast or back- Replication is a common approach to improve performance
propagated. Broadcast messages are sent on to all other peers to when distributed systems need to scale in the numbers of
which the sender has connections. Back-propagated messages users, objects in the system, and geographical area. The most
(e.g., replies) are forwarded on the reverse of the path taken by commonly used file replication strategy in P2P systems simply
an associated message (e.g., query). In addition to queries and makes replicas of objects on the requesting peer, upon a
replies, which are discussed in detail in the following subsec- successfully query/reply. Beyond this, a number of proactive
tion, other types of messages include object transfer and group replication strategies that aim at improving query hits while re-
membership messages such as ping, pong and bye. Pings are ducing response time have been proposed. Cohen and Shenker
used to discover hosts on the network. Pings are answered with [19] modeled various explicit replication strategies; they found
pong messages, containing information (such as contact infor- square-root replication, which can be efficiently achieved using
mation and resources shared) about the responding peer and path replication, to be optimal.
about others that the peer knows about. Information on neigh-
boring peers can be provided either by creating pongs on their III. TRANSIENT POPULATIONS AND P2P SYSTEMS
behalf or by forwarding the pings and back-propagating their There has been a number of studies of peers’ participation
replies. Byes are optional messages used to report the closing of and transiency in P2P systems [8], [13], [25], [27]–[30]. We
connections. performed an independent study [8] of peers’ lifespans in the
widely deployed Gnutella network (v0.6 [18], i.e., with super-
A. Query, Caching, and Replication peers), collecting over one million peer session times for over
A key component of resource-sharing P2P systems is their half-million peers. The next subsections describe our measure-
search or query mechanism. In highly structured (DHT-based) ment experiment and the characterization of the distribution of
systems, the search for an object based on its object identifier is peers’ session times.
a relatively easy task, thanks to the strictly controlled placement
of objects. In less structured P2P systems, however, the location A. Collecting Observed Peers’ Lifespans
of an object is independent of the system’s topology, leaving To actively measure the lifespan of peers in Gnutella, we
peers with only “near blind” query strategies [19]. We review modified an open source Gnutella client3 to keep track of every
some of these strategies for less structured systems next, in- peer found and periodically check its availability. Our moni-
cluding currently used and other proposed techniques for query toring peer maintains a hash table of peers it has seen so far.
distribution, caching, and replication. Each entry in the hash table includes fields for: 1) peer’s IP:port
The earliest and simplest query strategy is flooding, where pair; 2) node type (leaf- or ultra-peer); 3) time of birth (TOB);
a query is propagated to all neighbors within a certain radius. 4) time when found (TWF); and 5) time of death (TOD).
Addressing flooding’s inherent scalability problems, Lv et al. On each iteration, the monitoring peer updates the existing
[14] propose k-random-walks, where a set of parallel query entries and inserts newly found peers. To avoid potential mea-
messages (walkers) are independently forwarded to randomly surement errors, each of our probes tries to establish an appli-
chosen neighbors at each hop, significantly reducing the number cation-level connection checking for specific Gnutella packet
2We use the classification proposed by Lv et al. [14] 3Mutella. [Online]. Available: http://mutella.sourceforge.net
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
BUSTAMANTE AND QIAO: DESIGNING LESS-STRUCTURED P2P SYSTEMS FOR THE EXPECTED HIGH CHURN 619
TABLE I
STRATEGY USED FOR UPDATING THE PEER TABLE IN EACH ITERATION (T:
CURRENT TIME)
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
620 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 16, NO. 3, JUNE 2008
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
BUSTAMANTE AND QIAO: DESIGNING LESS-STRUCTURED P2P SYSTEMS FOR THE EXPECTED HIGH CHURN 621
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
622 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 16, NO. 3, JUNE 2008
TABLE IV
LIST OF ORGANIZATIONAL PROTOCOLS AND QUERY-RELATED STRATEGIES AND THEIR ACRONYMS
has interacted most recently. Peers in this list potentially counterpart as LRCX. We use three different replication
serve as witnesses of ’s age. strategies including its most basic form (denoted as SRep for
• Phase 2: Witness Sampling and Trimming: To reduce the “Simple Replication”) and the random and lifespan variations
chances of collusion, first trims proposed witnesses with of Regional Replication (RRRep and LRRep, respectively).
suspiciously large interaction windows (outliers), before For both LRRep and RRRep, we set an upper bound on the
sampling a subset of the remaining peers to construct the number of replicas a peer can hold to be 10. Table IV provides a
final witness list. quick summary of of the different protocols and strategies used
• Phase 3: Collecting Testimonies and Determining Age: In throughout the rest of the paper and their associated acronyms.
the final phase, verifies the interaction times reported
by directly contacting all peers in the final witness list and B. Metrics
determines ’s age as a function (e.g., minimum or me- The effectiveness of the proposed approach is evaluated in
dian) of the collected testimonies, i.e., the verified interac- terms of the performance improvements to query-related tasks,
tion windows. as captured by three metrics: Query Resolution Time, Query
The protocol has a number of properties that improves its re- Hits, and Z Query Satisfaction. Query Resolution Time is the
silience to cheating. The age of a peer is never directly requested time between query submission and the arrival of the first reply.
from the peer itself, but determined through the collected testi- Query Hits is simply the number of query hits associated with a
monies of randomly chosen witnesses. In addition, the trimming given query. We also analyze the aggregated query hit number
of outliers helps in reducing the probability of small cabals. Al- for all queries issued during each experiment. Z Query Satisfac-
though the value determined by our protocol may not exactly tion [26] is the percentage of queries achieving satisfaction,
match the “real” age of the peer in question, it is sufficient for i.e., obtaining at least query hits.
our purposes as our protocols rely less on the real age of a peer
than in its relative seniority among other candidate peers. C. Simulation and Wide-Area Experimental Setup
For the simulation study, we employ an in-house event-based
V. EVALUATION
simulator for P2P systems with support for all membership man-
We evaluate the advantage of the proposed lifespan-based ap- agement-related functionalities as well as the evaluated query
proach to both query-related strategies and organizational pro- distribution, caching, and replication mechanisms. We ran simu-
tocols through simulations and wide-area experiments in Plan- lations using 4 of the 20 traces collected,7 with a total simulation
etLab. We compare this approach against currently deployed time of 511 000 s, capturing the lifespan of 150 033 peers. At
and proposed strategies and organizational protocols. The goal any time during a simulation run, there are around 3000–4000
of this evaluation is to determine the efficacy of the proposed ap- active peers in the system.
proach at improving system stability and, ultimately, enhancing For our wide-area evaluation, we implemented lifespan-based
the performance of applications. protocols and strategies as extensions to an open source Gnutella
client [34], thus inheriting all of the expected functionality of a
A. Query-Related Strategies and Organizational Protocols
typical P2P file-sharing system. As mentioned earlier, we use
For our evaluation, we implemented two random-based random- and lifespan-based k-random-walk instead of the orig-
(UDP and HDP) and two lifespan-based organizational pro- inal flooding in Gnutella as query strategies. We deployed and
tocols (LUDP and LHDP) as well as a number of strategies ran our system using 150 stable PlanetLab nodes, distributed
for query routing, caching, and replication. For query routing, throughout the world. At any time during an experiment, the
we use the original k-random-walk query strategy [21] and number of active peers in the whole system ranges between 200
a lifespan-based variation (denote as RQuery and LQuery, and 300, evenly mapped to the set of PlanetLab hosts. Peers’
respectively). For caching we implemented PCX, neighbor lifespans are sampled from our collected traces. Each experi-
caching with incremental update (NCU), and regional caching ment lasts for 511 000 seconds, or about six days. To ensure that
with expiration (RCX). For PCX and RCX, we set the max-
7Simulations using the remainder traces yield similar results. On the other
imum number of object identifiers to 300 and the maximum
hand, due to memory and CPU limitations of the physical machines, for some
number of node identifiers per object to 10. We denote our of our simulation scenarios, it was prohibitively expensive to apply more than
basic random regional caching as RRCX and its lifespan-based four traces simultaneously.
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
BUSTAMANTE AND QIAO: DESIGNING LESS-STRUCTURED P2P SYSTEMS FOR THE EXPECTED HIGH CHURN 623
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
624 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 16, NO. 3, JUNE 2008
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
BUSTAMANTE AND QIAO: DESIGNING LESS-STRUCTURED P2P SYSTEMS FOR THE EXPECTED HIGH CHURN 625
TABLE VI
RELATIVE PERFORMANCE COMPARISON FOR QUERY-RELATED STRATEGIES;
LIFESPAN-BASED QUERY-RELATED STRATEGIES OVER ALTERNATIVE ONES
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
626 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 16, NO. 3, JUNE 2008
spending few minutes per day searching for content to down- form a preference-list strategy, which occurs because of opti-
load and a few others remain logged in for weeks exhibiting a mizing for a metric other than Churn, in scenarios with fairly
more server-like behavior (as summarized by Rhea et al. [6]). skewed distributions of session times. Godfrey et al. also ex-
A number of efforts have studied Churn and its implications plore how different designs or parameter choices in distributed
to P2P systems. A few of these studies have opted for passive systems can “accidentally” lead to particular replacement strate-
monitoring techniques. For example, an early study by Sen and gies. Our work experimentally derives the distribution of session
Wang [28] analyze P2P traffic collected passively at multiple times and proposes and evaluates a number of organizational
border routers across a large ISP network. The authors report protocols and query-related strategies that leverage this to yield
high-level system dynamics with ontime, the duration an IP ad- systems with performance characteristics more resilient to their
dress is active, showing a heavy-tailed distribution. More re- environments’ expected high Churn.
cently, in their study on workload characterization of P2P file Other related efforts have focused on the performance and
sharing, Gummadi et al. [30] report session lengths based on maintenance cost of DHT-based systems in the face of Churn
passive monitoring of a router at the University of Washington. [6], [41]–[43]. Although originally targeted at non-DHT pro-
It has been pointed out [11] that passive monitoring studies may tocols, our lifespan-based approach could be straightforwardly
tend to underestimate peers’ lifespans as some peers may not be combined with some of the techniques proposed in the litera-
continuously generating traffic through the instrumented host. ture to yield better structurally Churn-resilient DHT systems.
Through active probing of 17 125 peers during 60 h, Saroiu We plan to explore this as part of our future work.
et al. [13] found a median peer session time 60 min. Chu This research is partially motivated by the seminal work of
et al. [27] present results from a considerably longer experi- Harchol-Balter and Downey [44] on process lifetime distribu-
ment on a smaller set of peers (5000 IP:port pairs). Their results tion and its implications on load-balancing techniques. The au-
show a highly transient population and significant time-of-day thors measured the distribution of Unix processes and propose a
effects. Both Saroiu et al. [13] and Chu et al. [27] measured ses- UBNE distribution that fits it well. Based on their finding, Har-
sion times by actively probing previously collected TCP/IP ad- chol-Balter and Downey present a new policy for preemptive
dresses of peers following an approach that can only determine process migration in clusters of workstations.
if a node is or not accepting TCP connections in the requested
port without distinguishing what application is connected to it. VII. CONCLUSION
In their study on availability in the Overnet DHT [29], Bhagwan
et al. rightly point out the potential effects of aliasing on mod- This paper addresses the problem of highly transient pop-
eling host availability.9 IP address aliasing can result in great ulations in unstructured and loosely structured P2P systems.
overestimation of the number of hosts in the systems and the Using a number of illustrative organizational protocols and
underestimation of their availability. A more recent work by query-related strategies, we present trace-driven simulation and
Nurmi et al.[12] discuss the suitability of different statistical wide-area experiment results that illustrate the performance
distribution for describing machine availability in three different advantages of considering peers’ estimated session time as a
data sets. Their results indicate that Weibull model more accu- key system attribute in the design of Churn-resilient P2P sys-
rately represent machine availability in some of these settings. tems. The benefits of lifespan-based ideas are not constrained
In contrast, our work characterizes the lifespan distribution of to control-related traffic, but extend to applications, resulting
individual sessions, during which a peer’s IP:port tuple will not in improved query satisfaction and resolution time, as well as
change, and builds on this characterization to propose new or- significantly higher system scalability. An area of future work
ganization protocols and query-related strategies. The different is to understand the implications of partial adoption of the
algorithms we have presented throughout this paper rely on the presented approach.
predictability of peer’s remaining time based on its current up-
time. In a recent study [11] Stutzbach and Rejaie present a thor- ACKNOWLEDGMENT
ough analysis of Churn in Gnutella, Kad and BitTorrent and, The authors would like to thank NUIT and our Computing
while questioning the use of Pareto distribution to describe the Support Group for their aid and understanding while obtaining
session lengths, they show that current uptime is still a good in- the trace data used for our experiments. The authors would also
dicator of remaining uptime in both Gnutella and Kad. like to thank J. Casler, S. Birrer, D. Choffnes, and K. Livingston
Leonard et al. [9] investigate the resilience of random graphs for their helpful comments on early drafts of this paper.
to lifespan-based node failures and derives the expected delay
(and associated probability) before a peer is forcefully isolated
from the graph. Godfrey et al.[10] present a comparative study REFERENCES
of different strategies aimed at reducing Churn rate by intelli- [1] Y. Qiao and F. E. Bustamante, “Elders know best—Handling Churn
in less structured P2P systems,” in Proc. IEEE Int. Conf. Peer-to-Peer
gently selecting a subset of the available nodes. The compared Computing (P2P), Konstanz, Germany, Aug.–Sep. 2005.
strategies differ in the amount of node information they rely on [2] Napster. 2003 [Online]. Available: http://www.napster.com
for selection and in whether they replace a failed node with a [3] M. Bawa, H. Deshpande, and H. Garcia-Molina, “Transience of peers
new one. The authors show, through trace-based evaluation, that and streaming media,” in Proc. HotNets-I, 2002, pp. 107–112.
[4] P. Linga, I. Gupta, and K. Birman, “A Churn-resistant peer-to-peer web
replacement strategies can yield significant reduction in Churn caching system,” in Proc. ACM Workshop on Survivable and Self-Re-
over fixed strategies, and that random replacement can outper- negerative Syst., Oct. 2003, pp. 1–10.
[5] Y. Chawathe, S. Ratnasamy, L. Breslau, and S. Shenker, “Making
9Aliasing effects could be due, for example, to the use of DHCP and NATs, Gnutella-like P2P systems scalable,” in Proc. Conf. Appl., Technol.,
as well as the sharing of a host by multiple users. Arch., Protocols Comput. Commun., Aug. 2003, pp. 407–418.
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.
BUSTAMANTE AND QIAO: DESIGNING LESS-STRUCTURED P2P SYSTEMS FOR THE EXPECTED HIGH CHURN 627
[6] S. Rhea, D. Geels, T. Roscoe, and J. Kubiatowicz, “Handling Churn in [32] D. Dutta, A. Goel, R. Govindan, and H. Zhang, “The design of a dis-
a DHT,” in Proc. USENIX Annu. Tech. Conf., Jun. 2004, pp. 127–140. tributed rating scheme for peer-to-peer systems,” in Proc. P2PEcon
[7] J. Li, J. Stribling, R. Morris, T. Gil, and F. Kaashoek, “Routing trade- Workshop, 2003.
offs in peer-to-peer DHT systems with Churn,” in Proc. 3rd Int. Work- [33] S. Buchegger and J.-Y. L. Boudec, “A robust reputation system for p2p
shop on Peer-to-Peer Systems (IPTPS’04), 2004, pp. 87–99. and mobile ad hoc networks,” in Proc. P2PEcon Workshop, 2004.
[8] F. E. Bustamante and Y. Qiao, “Friendships that last: Peer lifespan [34] Mutella. 2003 [Online]. Available: http://mutella.sourceforge.net
and its role in P2P protocols,” in Proc. 8th Int. Workshop Web Content [35] V. Almeida, A. Bestavros, M. Crovella, and A. de Oliveira, “Charac-
Caching Distrib. (WCW), Sep.-Oct. 2003. terizing reference locality in the WWW,” in Proc. IEEE PDIS, 1996.
[9] D. Leonard, V. Rai, and D. Loguinov, “On lifetime-based node failure [36] K. Sripanidkulchai, “The popularity of Gnutella queries and its impli-
and stochastic resilience of decentralized peer-to-peer networks,” in cations on scalability,” in O’Reilly’s OpenP2P, 2001.
Proc. SIGMETRICS, Banff, AB, Canada, Jun. 2005, pp. 26–37. [37] M. F. Kaashoek and D. Karger, “Koorde: A simple degree-optimal dis-
[10] P. B. Godfrey, S. Shenker, and I. Stoica, “Minimizing Churn in dis- tributed hash table,” in Proc. 2st Int. Workshop Peer-to-Peer Systems
tributed systems,” in Proc. ACM SIGCOMM, Sep. 2006, pp. 147–158. (IPTPS’03), Berkeley, CA, Feb. 2003.
[11] D. Stutzbach and R. Rejaie, “Understanding Churn in peer-to-peer net- [38] J. Aspens, Z. Diamadi, and G. Shah, “Fault tolerant routing in
works,” in Proc. ACM Internet Measurement Conf. (IMC), Oct. 2006, peer-to-peer systems,” in Proc. ACM Symp. Principles Distrib.
pp. 189–202. Comput. (PODC), Monterey, CA, Jul. 2002.
[12] D. Nurmi, J. Brevik, and R. Wolski, “Modeling machine availability [39] K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S. Shenker, and
in enterprise and wide-area distributed computing environments,” in I. Stoica, “The impact of DHT routing geometry on resilience and prox-
Proc. Euro-Par, Aug. 2005, pp. 432–441. imity,” in Proc. ACM SIGCOMM, Karlsruhe, Germany, Aug. 2003.
[13] S. Saroiu, P. K. Gummadi, and S. D. Gribble, “A measurement study [40] L. Massoulié, A.-M. Kermarrec, and A. J. Ganesh, “Network awareness
of peer-to-peer file sharing systems,” in Proc. MMCN, Jan. 2002. and failure resilience in self-organizing overlay networks,” in Proc.
[14] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, “Search and replication IEEE Symp. Reliable Distributed Systems (SRDS), Florence, Italy, Oct.
in unstructured peer-to-peer networks,” in Proc. ICS, Jun. 2002, pp. 2003.
84–95. [41] D. Liben-Nowell, H. Balakrishnan, and D. Karger, “Analysis of the
[15] “The Gnutella protocol specification v0.4,” Gnutella RFC, 2000 [On- evolution of peer-to-peer systems,” in Proc. ACM Symp. Principles of
line]. Available: http://rfc-gnutella.sourceforge.net Distributed Computing (PODC 2002), Monterey, CA, 2002.
[16] A. Singla and C. Rohrs, “Ultrapeers: Another step towards Gnutella [42] R. Mahajan, M. Castro, and A. Rowstron, “Controlling the cost of relia-
scalability,” Lime Wire LLC, 2001. bility in peer-to-peer overlays,” in Proc. 2st Int. Workshop Peer-to-Peer
[17] KaZaA. 2001 [Online]. Available: http://www.kazaa.com Systems (IPTPS’03), 2003.
[18] T. Klingberg and R. Manfredi, Gnutella 0.6, Gnutella RFC, 2002 [On- [43] J. Li, J. Stribling, R. Morris, M. F. Kaashoek, and T. M. Gil, “A per-
line]. Available: http://rfc-gnutella.sourceforge.net formance versus cost framework for evaluating DHT design tradeoffs
[19] E. Cohen and S. Shenker, “Replication strategies in unstructured under Churn,” in Proc. IEEE INFOCOM, 2005.
peer-to-peer networks,” in Proc. ACM SIGCOMM, Aug. 2002, pp. [44] M. Harchol-Balter and A. B. Downey, “Exploting process lifetime dis-
177–190. tribution for dynamic load balancing,” ACM Trans. Comput. Syst., vol.
[20] L. A. Adamic, R. M. Lukose, A. R. Puniyani, and B. A. Huberman, 15, no. 3, 1997.
“Search in power-law networks,” Phys. Rev., vol. 64, no. 4, p. 046135,
2001.
[21] Q. Lv, S. Ratnasamy, and S. Shenker, “Can heterogeneity make
gnutella scalable?,” in Proc. 1st Int. Workshop Peer-to-Peer Systems
(IPTPS’02), Mar. 2002, pp. 94–103.
[22] M. Roussopoulos and M. Baker, “Cup: Controlled update propagation Fabián E. Bustamante received the M.S. and Ph.D.
in peer-to-peer networks,” in Proc. USENIX Annu. Tech. Conf., Jun. degrees in computer science from the Georgia Insti-
2003, pp. 167–180. tute of Technology, Atlanta.
[23] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, “A He is an Assistant Professor of computer science
scalable content-addressable network,” in Proc. ACM SIGCOMM, with the Department of Electrical Engineering
Aug. 2001, pp. 161–172. and Computer Science, Northwestern University,
[24] I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, Evanston, IL. His research interests include dis-
“Chord: A scalable peer-to-peer lookup service for internet applica- tributed systems and networks and operating systems
tions,” in Proc. ACM SIGCOMM, Aug. 2001, pp. 149–160. support for massively distributed systems, including
[25] E. P. Markatos, “Tracing a large-scale peer to peer system: An hour in Internet and vehicular network services.
the life of gnutella,” in Proc. CCGrid, May 2002, pp. 21–24. Dr. Bustamante is a member of the IEEE Computer
[26] B. Yang and H. Garcia-Molina, “Efficient search in peer-to-peer net- Society, the ACM, and USENIX.
works,” in Proc. ICDCS, Jul. 2002, pp. 5–14.
[27] J. Chu, K. Labonte, and B. N. Levine, “Availability and locality mea-
surements of peer-to-peer file systems,” in Proc. ITCom, 2002.
[28] S. Sen and J. Wang, “Analyzing peer-to-peer traffic across large net- Yi Qiao received the B.S. degree in electrical
works,” in Proc. ACM SIGCOMM-IM Workshop, 2002. engineering and the M.S. degree from Tsinghua
[29] R. Bhagwan, S. Savage, and G. M. Voelker, “Understanding avail- University, China, and the M.S. degree in computer
ability,” in Proc. 2st Int. Workshop Peer-to-Peer Systems (IPTPS’03), science from Northwestern University, Evanston,
2003. IL. He is currently working toward the Ph.D. degree
[30] K. P. Gummadi, R. J. Dunn, S. Saroiu, S. D. Gribble, H. M. Levy, and at the Department of Electrical Engineering and
J. Zahorjan, “Measurement, modeling, and analysis of a peer-to-peer Computer Science, Northwestern University.
file-sharing workload,” in Proc. 19th ACM Symp. Oper. Syst. Principles His research interests are in distributed systems
(SOSP), 2003. and networking with an emphasis on measurement,
[31] E. Damiani, S. D. C. di Vimercati, S. Paraboschi, P. Samarati, and F. modeling, and design of scalable distributed systems.
Violante, “A reputation-based approach for choosing reliable resources Mr. Qiao is a member of the IEEE Computer So-
in peer-to-peer networks,” Proc. 9th ACM CCCS, 2002. ciety and ACM.
Authorized licensed use limited to: University of Pittsburgh. Downloaded on December 18,2020 at 02:13:33 UTC from IEEE Xplore. Restrictions apply.