csf661-10 Peer To Peer PDF
Chapter 10
Peer-to-Peer Systems
10.1 Introduction
Paradigm Shift of Computing System Models
1980~: Terminal-Mainframe (Super-computing); RS-232; VT100/DOS
1990~: Client-Server (Micro-computing/Personal Computer); Dialup/10M Ethernet; Windows 3.1/95
2000~: Peer-to-Peer (Macro-computing); ADSL/100M+ Ethernet; Linux/Windows XP
Client-server Model
Clients and servers each with distinct roles
[Figure: clients U11, U12, U13, U21, U22, U31, U32 send requests to server S, which provides the service]
Peer-to-peer Model
Peer-to-Peer (P2P) is a way of structuring distributed
applications such that the individual nodes have symmetric
roles. Rather than being divided into clients and servers each
with quite distinct roles, in P2P applications a node may act as
both a client and a server.
Excerpt from the Charter of Peer-to-Peer Research Group,
IETF/IRTF, June 24, 2003
http://www.irtf.org/charters/p2prg.html
[Figure: overlay network of content routers (CR, or peer nodes) CR1, CR2, CR3, CRa, CRb serving users U13, U21, U22, U31, U32]

Name: lb1.www.ms.akadns.net
Addresses: 207.46.20.60, 207.46.18.30, 207.46.19.30, 207.46.19.60, 207.46.20.30
Aliases: www.microsoft.com, toggle.www.ms.akadns.net, g.www.ms.akadns.net

Name: www.yahoo.akadns.net
Addresses: 66.94.230.33, 66.94.230.34, 66.94.230.35, 66.94.230.39, 66.94.230.40,
Aliases: www.yahoo.com

Name: e96.g.akamaiedge.net
Address: 202.177.217.122
Aliases: www.gio.gov.tw, www.gio.gov.tw.edgekey.net

Name: a1289.g.akamai.net
Addresses: 203.133.9.9, 203.133.9.11
Aliases: www.whitehouse.gov, www.whitehouse.gov.edgesuite.net
Generations
Early services
DNS, Netnews/Usenet
Xerox Grapevine name/mail service
Lamport's part-time parliament algorithm
Bayou replicated storage system
classless inter-domain IP routing algorithm
[Figure: directory model, in which clients (C) contact a directory and a server; steps numbered 1-3]
3/14/2000: Released by NullSoft/AOL, almost immediately withdrawn, and became open source
Message flooding: serverless, decentralized search by message broadcast, direct file transfer using HTTP
Limited-scope query
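
The flooding search with a limited scope described above can be sketched as follows. This is a minimal illustration, not the protocol's actual message format: the names `flood_query`, `GRAPH`, and `FILES`, and the use of a hop counter (`ttl`) to limit the query scope, are assumptions made for the example.

```python
from collections import deque

def flood_query(graph, files, start, wanted, ttl):
    """Flood a query from `start`; each forwarding hop uses up one unit of TTL."""
    hits = []
    seen = {start}                      # peers that already received this query
    frontier = deque([(start, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)           # this peer can serve the file directly
        if remaining == 0:
            continue                    # scope limit reached: do not forward
        for neighbor in graph.get(peer, ()):
            if neighbor not in seen:    # suppress duplicate deliveries
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits

# Hypothetical overlay: A-B, A-C, B-D, D-E; peers D and E hold the file.
GRAPH = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
FILES = {"D": {"song.mp3"}, "E": {"song.mp3"}}

near = flood_query(GRAPH, FILES, "A", "song.mp3", ttl=2)
far = flood_query(GRAPH, FILES, "A", "song.mp3", ttl=3)
```

With a scope of two hops the query reaches D but not E; raising the scope to three hops also finds E, at the cost of more messages. The actual file transfer would then happen directly between the requester and one of the peers found.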
Load balancing
Network dynamics (addition/deletion of objects/nodes)
Fault tolerance
Target identification
Security and anonymity
Distributed Computation
Only a small portion of the CPU cycles of most computers is
utilized. Most computers are idle for the greatest portion of
the day, and many of the ones in use spend the majority of
their time waiting for input or a response.
A number of projects have attempted to use these idle CPU
cycles. The best known is the SETI@home project, but
other projects, including code-breaking efforts, have also
used idle CPU cycles on distributed machines.
SETI@home project: a scientific experiment that uses
Internet-connected computers to analyze radio telescope
data in the Search for Extraterrestrial Intelligence (SETI)
http://setiathome.ssl.berkeley.edu/
[Figure: peers and Napster servers, each server holding an index]
1. File location request
2. List of peers offering the file
3. File request
4. File delivered
5. Index update
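
The five steps above can be sketched as a small in-memory model. This is an illustration only: the class and method names (`NapsterIndex`, `Peer`, `register`, `locate`, `download`) are hypothetical, and the network transfer is replaced by a direct dictionary lookup.

```python
class NapsterIndex:
    """Centralized index: maps a file name to the peers that offer it."""

    def __init__(self):
        self._peers_by_file = {}

    def register(self, peer_name, filename):
        # Step 5: index update
        self._peers_by_file.setdefault(filename, set()).add(peer_name)

    def locate(self, filename):
        # Steps 1 and 2: file location request, list of peers offering the file
        return sorted(self._peers_by_file.get(filename, ()))


class Peer:
    def __init__(self, name, index):
        self.name = name
        self.index = index
        self.files = {}

    def share(self, filename, data):
        self.files[filename] = data
        self.index.register(self.name, filename)

    def download(self, filename, peers):
        sources = self.index.locate(filename)        # steps 1 and 2
        if not sources:
            return None
        data = peers[sources[0]].files[filename]     # steps 3 and 4: peer-to-peer transfer
        self.share(filename, data)                   # step 5: requester now offers the file
        return data


index = NapsterIndex()
peers = {name: Peer(name, index) for name in ("alice", "bob")}
peers["alice"].share("song.mp3", b"audio-bytes")
received = peers["bob"].download("song.mp3", peers)
```

Only the lookup goes through the server; the file itself moves between peers, and after the download both peers appear in the index as sources.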
[Figure: routing overlay showing nodes (A, B, and D labelled) and objects, with B's routing knowledge and C's routing knowledge marked]
Overlay Networks
Overlay: a network built on top of another (IP) network
links on one layer are network segments of lower layers
tunneling, application adaptor, transparent router
Routing Inefficiencies
Principal categories of the routing inefficiencies
Poor routing metrics
Inter-domain routers choose paths by minimizing the number of
independent autonomous systems (ASs) traversed
Single-path routing
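
A small worked example of the "poor routing metrics" point: picking the path with the fewest ASs does not necessarily pick the fastest path. The AS names and latency figures below are invented for illustration only.

```python
# Two hypothetical paths between the same pair of hosts.
paths = [
    {"name": "short AS path", "as_path": ["AS1", "AS2", "AS4"], "latency_ms": 180},
    {"name": "long AS path", "as_path": ["AS1", "AS3", "AS5", "AS4"], "latency_ms": 40},
]

# Metric that minimizes the number of ASs traversed.
by_as_count = min(paths, key=lambda p: len(p["as_path"]))

# Metric an overlay could use instead, based on measured delay.
by_latency = min(paths, key=lambda p: p["latency_ms"])
```

Here the AS-count metric selects the 180 ms path, while a latency-aware overlay would relay through the longer AS path and arrive in 40 ms.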
Pastry
Each Pastry node keeps track of its L immediate
neighbors in the nodeId space (called the leaf set),
and notifies applications of new node arrivals, node
failures and node recoveries within the leaf set.
Pastry takes into account locality (proximity) in the
underlying Internet; it seeks to minimize the distance
messages travel, according to a scalar proximity
metric like the ping delay. Pastry is completely
decentralized, scalable, and self-organizing; it
automatically adapts to the arrival, departure and
failure of nodes.
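
The leaf set can be illustrated with a short sketch. It assumes a circular 16-bit nodeId space and splits the L neighbors evenly between the two sides of the node; the function names and the sample nodeIds are hypothetical, and the proximity metric is not modelled here.

```python
ID_SPACE = 2 ** 16  # assumption: 16-bit nodeIds keep the example readable

def leaf_set(node_id, all_ids, L):
    """Return the L/2 closest nodeIds below and the L/2 closest above node_id."""
    others = [i for i in all_ids if i != node_id]
    half = L // 2
    above = sorted(others, key=lambda i: (i - node_id) % ID_SPACE)[:half]
    below = sorted(others, key=lambda i: (node_id - i) % ID_SPACE)[:half]
    return below + above

def circular_distance(a, b):
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)

def closest_in_leaf_set(key, node_id, leaves):
    """Pick the node (this one or a leaf) numerically closest to the key."""
    return min([node_id] + leaves, key=lambda n: circular_distance(n, key))

NODE_IDS = [0x65A1, 0xD13D, 0xD421, 0xD462, 0xD467, 0xD46A, 0xD471]
leaves = leaf_set(0xD467, NODE_IDS, L=4)
target = closest_in_leaf_set(0xD46B, 0xD467, leaves)
```

For node D467 with L = 4, the leaf set is D462 and D421 below and D46A and D471 above; a message for key D46B is handed to D46A, the numerically closest node. Because the space is circular, the neighbors of the lowest nodeId wrap around to the highest one.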
Pastry Applications
put(GUID, data)
The data is stored in replicas at all nodes responsible for the object
identified by GUID.
remove(GUID)
Deletes all references to GUID and the associated data.
value = get(GUID)
The data associated with GUID is retrieved from one of the nodes
responsible for it.
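
A toy, single-process model of these three operations may help fix the semantics. It is a sketch under stated assumptions: the class name `ToyDHT` is hypothetical, "responsible" is taken to mean the `replicas` nodes whose IDs are numerically closest to the GUID, and node IDs are plain integers with no wrap-around.

```python
class ToyDHT:
    def __init__(self, node_ids, replicas=2):
        self.stores = {n: {} for n in node_ids}   # per-node local storage
        self.replicas = replicas

    def _responsible(self, guid):
        # Assumption: responsibility = numerically closest node IDs.
        return sorted(self.stores, key=lambda n: abs(n - guid))[: self.replicas]

    def put(self, guid, data):
        # Store a replica at every node responsible for the GUID.
        for node in self._responsible(guid):
            self.stores[node][guid] = data

    def get(self, guid):
        # Retrieve the data from one of the responsible nodes.
        for node in self._responsible(guid):
            if guid in self.stores[node]:
                return self.stores[node][guid]
        return None

    def remove(self, guid):
        # Delete all references to the GUID and the associated data.
        for store in self.stores.values():
            store.pop(guid, None)


dht = ToyDHT([10, 20, 30, 40, 50], replicas=2)
dht.put(27, b"object-data")
value = dht.get(27)
```

With these node IDs, GUID 27 is stored at nodes 30 and 20, so `get(27)` still succeeds if either replica is read. A real system would route each call through the overlay rather than consult a global table.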
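[Figure: Pastry routing in the GUID space; nodes shown: 65A1FC, D13DA3, D46A1C, D467C4, D471F1]
[Figure: Pastry routing example; nodes shown: 65A1FC, D13DA3, D4213F, D462BA, D467C4, D46A1C, D471F1]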
Tapestry
Tapestry is another peer-to-peer model similar to Pastry.
It hides a distributed hash table from applications behind a
distributed object location and routing (DOLR) interface.
Allowing multiple entries for an object in the routing structure
makes replicated copies of objects more accessible and gives
more direct access to a nearby copy of a data resource,
as shown in Figure 10.10.
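
The idea of several location entries per object, with requests going to a nearby copy, can be sketched as below. This is a deliberately simplified model: it keeps all location mappings in one table rather than along routing paths, and the names `LocationDirectory`, `publish`, `unpublish`, `send`, the host names, and the latency table are all hypothetical.

```python
class LocationDirectory:
    def __init__(self, distance):
        self.distance = distance   # hypothetical proximity metric: (sender, host) -> number
        self.mappings = {}         # GUID -> set of hosts holding a replica

    def publish(self, guid, host):
        self.mappings.setdefault(guid, set()).add(host)

    def unpublish(self, guid, host):
        self.mappings.get(guid, set()).discard(host)

    def send(self, guid, sender):
        """Return the replica host nearest to the sender, or None if unpublished."""
        hosts = self.mappings.get(guid)
        if not hosts:
            return None
        return min(hosts, key=lambda h: (self.distance(sender, h), h))


# Invented latencies (ms) from two senders to two replica hosts.
LATENCY = {("s1", "hostA"): 5, ("s1", "hostB"): 90,
           ("s2", "hostA"): 70, ("s2", "hostB"): 8}

directory = LocationDirectory(lambda sender, host: LATENCY[(sender, host)])
directory.publish(0x4378, "hostA")
directory.publish(0x4378, "hostB")
nearest_for_s1 = directory.send(0x4378, "s1")
nearest_for_s2 = directory.send(0x4378, "s2")
```

Each sender is directed to a different replica of the same GUID, which is the benefit the multiple entries are meant to provide.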
[Figure 10.10: Tapestry routing for object 4378 ("Phil's Books", two replicas shown). Nodes shown: 4228, 4361, 4378, 43FE, 4664, 4A6D, 4B4F, 57EC, AA93, E791. Labels: publish path; location mapping for 4378; routes actually taken by send(4378)]
Evaluation
105 active clients in Cambridge and more than 36,000 in Redmond
Each client contributes 100MB of disk storage
Hit ratios: centralized - 29% (Redmond) and 38% (Cambridge)
Similar simulation results achieved by the overlay
OceanStore
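[Figure: OceanStore object storage. A certificate gives the VGUID of the current version; each version (version i shown) has a root block, indirection blocks, and data blocks d1-d5, with d2 and d3 appearing in more than one version]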
Name    Meaning        Description
BGUID   block GUID
VGUID   version GUID
AGUID   active GUID
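
One way to see how the three identifier types relate is a small content-addressed store. The sketch below rests on assumptions not stated in the table: that a BGUID is a secure hash of a block's contents, that a VGUID is the hash of a version's root block, and that an AGUID is a stable name pointing at the current VGUID. SHA-256, the class `BlockStore`, and the flat root block (no indirection blocks) are choices made for the example only.

```python
import hashlib

def guid(data: bytes) -> str:
    # Assumption: GUIDs are secure hashes of content (SHA-256 chosen here).
    return hashlib.sha256(data).hexdigest()

class BlockStore:
    def __init__(self):
        self.blocks = {}     # BGUID -> block contents
        self.current = {}    # AGUID -> VGUID of the current version

    def put_block(self, data: bytes) -> str:
        bguid = guid(data)
        self.blocks[bguid] = data      # identical blocks are stored once
        return bguid

    def put_version(self, aguid: str, data_blocks) -> str:
        bguids = [self.put_block(b) for b in data_blocks]
        root = "\n".join(bguids).encode()   # root block lists the data blocks
        vguid = self.put_block(root)
        self.current[aguid] = vguid
        return vguid

    def read_version(self, vguid: str):
        root = self.blocks[vguid].decode()
        return [self.blocks[b] for b in root.split("\n")] if root else []


store = BlockStore()
v1 = store.put_version("my-file", [b"d1", b"d2", b"d3"])
v2 = store.put_version("my-file", [b"d1-edited", b"d2", b"d3"])
```

Updating one block produces a new version with a new VGUID, while the unchanged blocks are shared between versions; the old version remains readable through its own VGUID.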
Ivy
The Ivy file system emulates a Sun NFS server. Ivy maintains
a store of file updates in logs and reconstructs a particular state
from the logs. It does not support file locking, so that it can
tolerate the failure or withdrawal of nodes or connections. For
protection against malicious attacks, file changes are associated
with the nodes that made them, and a node's changes can be omitted
when performing a reconstruction. Conflicts in file contents caused
by network partitions are resolved algorithmically. The Ivy
architecture is shown in Figure 10.14.
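
Reconstructing state from per-node logs, with the option of leaving one node out, can be sketched as follows. This is a simplified illustration: the record layout, the single global sequence number used for ordering, and the names `LogRecord` and `reconstruct` are assumptions made for the example, not Ivy's actual log format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LogRecord:
    seq: int                   # hypothetical global ordering of updates
    node: str                  # node that made the change
    path: str
    content: Optional[bytes]   # None marks a deletion

def reconstruct(logs, exclude=()):
    """Replay all log records in order, skipping the logs of excluded nodes."""
    records = [r for node, log in logs.items() if node not in exclude for r in log]
    state = {}
    for r in sorted(records, key=lambda r: r.seq):
        if r.content is None:
            state.pop(r.path, None)
        else:
            state[r.path] = r.content
    return state


logs = {
    "alice": [LogRecord(1, "alice", "/a", b"v1"), LogRecord(3, "alice", "/b", b"b1")],
    "mallory": [LogRecord(2, "mallory", "/a", b"tampered")],
}
full_view = reconstruct(logs)
trusted_view = reconstruct(logs, exclude={"mallory"})
```

Because each change stays attributed to the node that logged it, dropping a suspect node's log is enough to recover the state as it would have been without that node's updates.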
[Figure 10.14: Ivy architecture. An Ivy node runs an application, an Ivy server, a DHash server, and a modified NFS client module in the kernel; it communicates with DHash servers on other nodes]