
Models & Algorithms

for Internet Computing


Lecture 10: PEER-TO-PEER NETWORK MODELS
Agenda
• Introduction to Peer-to-Peer Networks
• Distributed Hash Tables
• Chord
• CAN



An Introduction to Peer-to-Peer Networks

(Figure: the amount of IP traffic from peer-to-peer networks – traffic profile of the Deutsche Telekom access network, 2nd quarter 2003.)
Problems of Today's Internet

• Scalability
• Flexibility / extensibility
• Security / reliability
• and more…

• Does the Peer-to-Peer principle offer adequate solutions?
Problems of Today's Internet: Scalability (1)

• Scalability
• The capability of a system to keep functioning and working efficiently while growing by orders of magnitude

• Enormous growth of the Internet
• Seems to reveal scalability limitations of the client-server paradigm
•Increasing number of participants
• Wireless hot spots, interconnection with mobile
and cellular networks
• Ubiquitous computing
• Sensor networks
• Internet of Things
Problems of Today's Internet: Scalability (2)

• General problems of client-server architectures
• Limited resources of and around the server
• Increasing traffic load
• Highly concentrated on the server – under-load at other parts of the network
• Asymmetrical traffic flows
• Problems of resource scalability (memory, CPU)
• Server poses the problem of a "single point of failure"
• Unused resources at many clients: CPU, memory, information
Problems of Today's Internet: Scalability (3)

• Client-server systems do not scale infinitely!
• Limited resources
• New applications impose increasing resource requirements
• Bandwidth, memory (e.g., file exchange)
• CPU power (e.g., Seti@Home)
• The requirements cannot be fulfilled by centralized architectures anymore.
• Examples (year 2002): Kazaa (10^7 GByte), Seti@Home (20 GigaFlops)
Driving Forces Behind Peer-to-Peer Networks (1)
•Development of end system capabilities
• 1992:
• Average hard disk size: ~0.3 GByte
• Average processing power (clock frequency) of personal computers: ~100 MHz
• 2002:
• Average hard disk size: 100 GByte
• 2004:
• Average processing power (clock frequency) of personal computers: ~3 GHz
→ Personal computers now have capabilities comparable to servers of the 1990s.
Driving Forces Behind Peer-to-Peer Networks (2)
•Development of communication networks
•Early 1990s: private users start to connect to the Internet via 56 kbps modems

•1997/1998:
• first broadband connections for residential users become available
• cable modem with up to 10 Mbps
•1999:
• Introduction of DSL and ADSL connections
• Data rates of up to 8.5 Mbps via common telephone connections become available.
• The deregulation of the telephone market shows its first effects: significantly reduced tariffs due to increased competition on the last mile.
→ Bandwidth is plentiful and cheap!
Problems of Today's Internet: Flexibility

• Missing flexibility and extensibility of the network
• Limitations for the introduction of new services
• No specialized services available

• ISPs shy away from changes
• High cost, low earnings
• Limited control, possible instability

• Many developments fail (more or less)
• Examples:
• Group communication, IP multicast
• Quality of Service in the Internet
• Support for mobility
• Active Networking
Problems of Today's Internet: Security/Reliability (1)

• Availability will be increasingly important for future Internet-based systems
• Very high cost in case of failing systems, servers, services

• Increasing number of intentional attacks
• Distributed denial-of-service attacks (DDoS)
• Central servers are easy targets

• 100% fault tolerance is impossible
• Single-vendor platforms are more susceptible to attacks
• e.g., MS Windows bugs, MS Windows viruses
Problems of Today's Internet: Security/Reliability (2)

• Resistance to censorship
• Great demand from the users' point of view
• "Small demand" from the "agencies'" point of view
• Central server systems can be easily shut down or rendered harmless
Departure from the End-to-End Principle

• Further problems due to the evolution of the Internet
• Network Address Translation (NAT) / dynamic addresses
• Shortage of Internet addresses / IPv6 no longer a big issue
• Firewall-based protection of intranets
• Various forms of "middleboxes" (proxies, …)
• No general end-to-end connections anymore
→ Departure from the original Peer-to-Peer characteristics
→ Systems "behind" NAT have no chance to offer public services: they cannot be reached from outside.
Result of the Developments: De-Centralization

•Conclusion
• Centralized systems
• … do not scale infinitely
• … entail single points of failures (reliability problems)
• … are easy to attack
• … but are easy to realize
• and will thus continue to be used (up to the scalability limit)

• The way out
• De-Centralization
• Re-emergence of the Peer-to-Peer principle
• "Back to the roots" (but on a different scale than before – a challenge!)
Definition of Peer-to-Peer Systems (1)
•What are Peer-to-Peer Systems?

•A first definition
P2P is a class of applications where each node is at the same time a
client and a server. Because they operate in an environment of unstable
connectivity and unpredictable IP addresses, P2P nodes must operate
outside the DNS system and have significant or total autonomy from
central servers.
Definition of Peer-to-Peer Systems (2)

•Characteristics of Peer-to-Peer systems


• Direct interaction of end systems (peers)
• Joint usage of resources in end systems
• No central control or usage of central services
• Equal and autonomous participants
• Self-organization of the system
Information Management is a Basic Challenge

• "I have item D. Where shall I place D?"
• "I want item D. Where can I find D?"

(Figure: a distributed system of peers – e.g., 12.5.7.31, 89.11.20.15, 95.7.6.10, 86.8.10.18, 7.31.10.25, peer-to-peer.info, berkeley.edu, planet-lab.org – among which data item "D" has to be placed and found.)
Types of Peer-to-Peer Systems

• Client-Server systems
• Classical role-based systems
• No direct interaction between clients
• Examples: WWW, DNS

• Hybrid P2P systems
• Joint usage of distributed resources
• Usage of servers for coordination
• Direct interaction between peers for data exchange
• Examples: Napster, ICQ, AIM (AOL), Seti@Home, Skype

• Pure P2P systems
• Completely de-centralized management and usage of the resources
• Examples: Gnutella, Freenet, DHT-based systems
Architectures of Peer-to-Peer Networks

Client-Server:
1. The server is the central entity and only provider of service and content. → The network is managed by the server.
2. The server is the higher-performance system.
3. The clients are the lower-performance systems.
Example: WWW

Peer-to-Peer (common characteristics):
1. Resources are shared between the peers.
2. Resources can be accessed directly from other peers.
3. A peer is provider (server) and requestor (client): the "servent" concept.

Unstructured P2P – Centralized P2P (1st generation):
1. All features of Peer-to-Peer included
2. A central entity is necessary to provide the service
3. The central entity is some kind of index/group database
Example: Napster

Unstructured P2P – Pure P2P (1st generation):
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. No central entities
Examples: Gnutella 0.4, Freenet

Unstructured P2P – Hybrid P2P (2nd generation):
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. Dynamic central entities
Examples: Gnutella 0.6, JXTA

Structured P2P – DHT-based (2nd generation):
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. No central entities
4. Connections in the overlay are "fixed"
Examples: Chord, CAN


Peer-to-Peer Networks vs. Overlay Networks
Some applications are neither pure P2P, nor do they employ a (non-trivial) overlay.
Typically, P2P applications employ overlay networks.

(Figure: Venn diagram relating "Peer-to-Peer" and "Overlay" systems, placing examples such as Napster, Seti@home, Freenet, Gnutella, DHTs, SMTP, Mobile Ad-Hoc Networks, distributed Web caches and the ARPANET.)
Examples: P2P vs. Overlay Networks

• P2P, but no overlay:
• Mobile Ad-Hoc Networks
• Each node is client and router
• No infrastructure
• Sensor networks

• Overlay, but not P2P:
• Client/Server: distributed Web caches
• Without using distributed resources: Virtual Private Networks (VPN)

• Both P2P and Overlay:
• ARPANET
• Gnutella
• P2P systems based on Distributed Hash Tables (DHTs)
Review – a bit of history

•Peer-to-Peer networks
• De-centralized self-organizing systems with de-centralized usage of
resources

•Reasons for current usage of P2P techniques


• Today's Internet has many weaknesses
• Limited scalability, flexibility, extensibility, reliability/security
• Other deployment-specific "illnesses" of the Internet: NAT, middleboxes, dynamic addresses, …

•Additional reasons
• Success of “practical” P2P approaches (publicity through copyright
discussions)
• Emerging of innovative types of services (ICQ, file sharing, Skype, etc.)
• Interesting research areas:
• Quality in peer-to-peer networks
• De-centralized self-organization
• Location-based routing → content-based routing
Distributed Hash Tables
•Essential challenge in (most) peer-to-peer systems:
• Location of a data item among the distributed systems:
• Where shall the item be stored?
• How does a requester find the location of an item?
• Allow peer nodes to join and leave the system anytime.
• Scalability: keep the complexity for communication and
storage scalable.
• Robustness and resilience in case of faults and frequent
changes

• Distributed Hash Tables serve two purposes:
• to distribute data evenly over a set of peers
• to locate the data in search processes.
• They use a hash function for both purposes.
Peer-to-Peer Systems Based on DHTs

(Recap of the taxonomy above: DHT-based systems form the Structured P2P branch of the 2nd generation – all Peer-to-Peer features included, no central entities, any terminal entity can be removed without loss of functionality, and the connections in the overlay are "fixed". Examples: Chord, CAN.)


Distributed Indexing (1)

• Communication overhead vs. node state

(Figure: communication overhead plotted against per-node state. Flooding sits at O(1) node state but O(N) communication overhead – bottlenecks: communication overhead, false negatives. A central server sits at O(N) node state but O(1) communication overhead – bottlenecks: memory, CPU. Is there a scalable solution between the two extremes, around O(log N)?)
Distributed Indexing (2)

• Communication overhead vs. node state

(Figure: the same trade-off chart with Distributed Hash Tables placed between the two extremes, at O(log N) node state and O(log N) communication overhead – scalability O(log N), no false negatives, resistance against changes, failures and attacks, and support for short-time users. Bottlenecks of flooding: communication overhead, false negatives; bottlenecks of the central server: memory, CPU, network, availability.)


Distributed Indexing (3)
•Approach of distributed indexing schemes
• Data and nodes are mapped into the same address space!
• Intermediate nodes maintain routing information to
target nodes
• Efficient forwarding to destination node (based on content
– not on location)
• Definitive statement about the existence of content is
possible.
•Problems
• Maintenance of routing information required
• Fuzzy queries not easily supported (e.g., wildcard
searches)
Comparison of Lookup Concepts

System                    Per-Node State   Communication Overhead   Fuzzy Queries   No False Negatives   Robustness
Central Server            O(N)             O(1)                     yes             yes                  no
Flooding Search           O(1)             O(N²)                    yes             no                   yes
Distributed Hash Tables   O(log N)         O(log N)                 no              yes                  yes
From Classic Hash Tables to Distributed Hash Tables

• Classic hash table
• Searching is easy and efficient. However, adding a new peer node changes the hash function!
(Figure: data items, e.g. 23 and 1, are mapped by f onto peer positions 0–6, e.g. f(23) = 1, f(1) = 4.)

• Distributed Hash Table
• Each peer is responsible for a subset of the data range; that subset is computed by the hash function.
• If we search for data, the query is submitted to the same hash function, which yields the responsible peers.
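
To make the contrast concrete, here is a small Python sketch (not from the slides; the keys and node names are invented) that counts how many keys move when one peer is added under classic mod-N placement versus under consistent hashing on a ring:

```python
import hashlib

def h(x, m=16):
    """Hash a string into the identifier range [0, 2^m)."""
    return int(hashlib.sha1(x.encode()).hexdigest(), 16) % (2 ** m)

keys = [f"item-{i}" for i in range(1000)]

# Classic hash table: the placement function depends on the number of peers,
# so adding one peer remaps most keys.
def mod_n_placement(n_peers):
    return {k: h(k) % n_peers for k in keys}

before, after = mod_n_placement(6), mod_n_placement(7)
print("mod-N: keys moved =", sum(before[k] != after[k] for k in keys))

# Consistent hashing: a key is stored on the clockwise next node ID, so a new
# node only takes over the keys of one range.
def ring_placement(node_names):
    ring = sorted(h(n) for n in node_names)
    def owner(key_id):
        return next((nid for nid in ring if nid >= key_id), ring[0])
    return {k: owner(h(k)) for k in keys}

nodes = [f"node-{i}" for i in range(6)]
before, after = ring_placement(nodes), ring_placement(nodes + ["node-6"])
print("consistent hashing: keys moved =", sum(before[k] != after[k] for k in keys))
```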
Insertion into a Distributed Hash Table
• Peers are hashed to a specific area
• Documents are also hashed to a specific area
• Each peer is responsible for its own area

• When a new node is added to the network, the neighbors share their range with it.

• When a node leaves the network, its neighbors take over its share.
Distributed Management of Data
• Sequence of operations

1. Mapping of nodes and data into the same address space
• Peers and content are addressed using flat identifiers (IDs)
• Common address space for data and nodes
• Nodes are responsible for data in certain parts of the
address space
• Association of data to nodes can change since nodes may
disappear
2.Storing / looking up data in the DHT
• Search for data = routing to the responsible node
• Responsible node not necessarily known in advance
• Deterministic statement about the availability of data possible
Addressing in Distributed Hash Tables
• Step 1: Mapping of content/nodes into the same address space
• Usually 0, …, 2^m − 1, with 2^m >> number of objects to be stored
• Mapping of data and nodes into an address space (with a hash function)
• e.g., Hash(String) mod 2^m: H("my data") → 2313
• Association of parts of the address space with peer nodes

(Figure: the address space 0 … 2^m − 1, often visualized as a circle, is partitioned into ranges (3485–610, 611–709, 710–1007, 1008–1621, 1622–2010, 2011–2206, 2207–2905, 2906–3484) that are associated with peer nodes; e.g., H(Node X) = 2906, H(Node Y) = 3485, and data item "D" with H("D") = 3107.)
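
As a hedged illustration of this mapping, the following Python sketch reuses the node IDs from the ring figure and assumes m = 12 (so the address space is 0 … 4095) together with the clockwise-successor rule described on the next slide; the hash function and parameters are illustrative, not prescribed by the slides.

```python
import hashlib

M = 12                                              # assumption: 2^12 = 4096 IDs
NODE_IDS = sorted([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])

def dht_hash(s: str) -> int:
    """Map a string (node address or data key) into 0 .. 2^M - 1."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** M)

def responsible_node(key_id: int) -> int:
    """Clockwise-successor rule: the first node ID >= key_id is responsible;
    IDs beyond the largest node wrap around to the smallest node."""
    return next((nid for nid in NODE_IDS if nid >= key_id), NODE_IDS[0])

print(responsible_node(3107))    # -> 3485, as in the figure
print(responsible_node(3600))    # -> 611 (wrap-around)
print(dht_hash("my data"))       # some identifier in 0 .. 4095
```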
Association of Address Space with Nodes

• Each node is responsible for a part of the value range
• Sometimes with redundancy
• Continuous adaptation
• Real (underlay) and logical (overlay) topology are uncorrelated
• Example: node 3485 is responsible for the data items in the range 2907 to 3485 (in the case of a Chord DHT)

(Figure: the logical view of the Distributed Hash Table – a ring of nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 – and its mapping to the real Internet topology.)
Step 2: Routing to a Data Item (1)

• Step 2: Storing / looking up data (content-based routing)
• Goal: a small and scalable effort
• O(1) with a centralized hash table
• But: management of a centralized hash table is very costly.
• Minimum overhead with distributed hash tables
• O(log N) DHT hops to locate an object
• O(log N) keys and routing entries per node (N = number of nodes)
Step 2: Routing to a Data Item (2)
• Routing to a key/value pair
• Start the lookup at an arbitrary node of the DHT
• Route the request (possibly indirectly) to the peer node responsible for the key range.

(Figure: the query Key = H("my data") = 3107 enters the ring at an arbitrary initial node and is routed to node 3485, which manages the keys 2907–3485 and holds the entry (3107, (ip, port)); the value is a pointer to the location of the data.)

Step 2: Routing to a Data Item (3)
• Getting the content
• The key/value pair is delivered to the requester.
• The requester analyzes the key/value tuple (and downloads the data from the actual location in case of indirect storage).

(Figure: node 3485 sends (3107, (ip, port)) back to the requester; in case of indirect storage, the requester then issues Get_Data(ip, port) to fetch the data from its actual location.)
Association of Data with IDs – Direct Storage

• How is content stored in the nodes?
• Example: H("my data") = 3107 is mapped into the DHT address space.
• Direct storage
• The content itself is stored in the node responsible for H("my data")
→ Okay if the amount of data is small (< 1 kB). Inflexible for large contents.

(Figure: the data item D with H_SHA-1("D") = 3107 is stored directly on the responsible node 3485; the publishing peer is 134.2.11.68.)
Association of Data with IDs – Indirect Storage

• Indirect storage
• Nodes in a DHT store tuples (key, value)
• Key = Hash("my data") → 2313
• The value is then the storage address of the content: (IP, Port) = (134.2.11.140, 4711)
• More flexible, but requires one more step to reach the content.

(Figure: item D with H_SHA-1("D") = 3107 stays on peer 134.2.11.68; the responsible node in the ring only stores the pointer to that location.)
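
A minimal sketch of the two storage variants, with the responsible node's store abstracted to a plain dictionary (addresses loosely taken from the slides, everything else invented):

```python
store = {}   # the responsible node's local store, keyed by identifiers

def publish_direct(key_id: int, content: bytes) -> None:
    """Direct storage: the responsible node keeps the content itself."""
    store[key_id] = content

def publish_indirect(key_id: int, ip: str, port: int) -> None:
    """Indirect storage: the DHT only keeps a pointer (IP, port) to the
    peer that actually holds the content."""
    store[key_id] = (ip, port)

def lookup(key_id: int):
    return store.get(key_id)

publish_indirect(3107, "134.2.11.68", 4711)
print(lookup(3107))   # -> ('134.2.11.68', 4711); the requester then fetches
                      #    the content from that address in a second step
```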
Node Arrival
1. Calculation of node ID
2. New node contacts DHT via arbitrary node.
3. A particular hash range is assigned to the node.
4. The key/value pairs of this hash range are stored on the new node (usually with redundancy).
5. The node is integrated into the routing environment.

(Figure: the new node with ID 3485, joining from host 134.2.11.68, is inserted into the ring between nodes 2906 and 611.)
Node Failure / Node Departure
• Failure of a node
• Redundant storage of the key/value pairs protects against a failing node.
• Use of redundant/alternative routing paths
• A key/value pair usually remains retrievable as long as at least one copy remains.

• Departure of a node
• New partitioning of the hash range among the neighbor nodes
• Copy the key/value pairs to the neighbor nodes
• Remove the departing node from the routing environment.
DHT Interfaces
•Generic interface of Distributed Hash Tables
• Provisioning of information
• publish(key,value)
• Requesting of information (search for content)
• lookup(key)
• Reply
• value
• DHT approaches are then interchangeable (they implement the same interface).

(Figure: a distributed application calls publish(key, value) and lookup(key) / value on the Distributed Hash Table layer (CAN, Chord, Pastry, …), which runs across Node 1 … Node N.)
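
One possible way to express this generic interface in Python (the class and method names are hypothetical; only the publish/lookup/value operations come from the slide):

```python
from abc import ABC, abstractmethod
from typing import Optional

class DistributedHashTable(ABC):
    """Generic DHT interface; concrete overlays such as Chord, CAN or Pastry
    would implement it, so the distributed application stays unchanged."""

    @abstractmethod
    def publish(self, key: int, value: bytes) -> None:
        """Store the (key, value) pair on the node responsible for key."""

    @abstractmethod
    def lookup(self, key: int) -> Optional[bytes]:
        """Route to the node responsible for key and return the stored
        value, or None if nothing was published under this key."""
```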


Conclusions
• Data and nodes are mapped into the same address
space.
• Use of routing information for efficient search for
content.
• Keys are evenly distributed across nodes of a DHT
• No bottlenecks
• A continuous increase in the number of stored keys is
possible
• Failures of nodes can be tolerated
• Survival of attacks is possible
• A self-organizing system
• Simple and efficient realization
• Supports a wide spectrum of applications:
• Flat (hash) key without a semantic meaning
• Value depends on the application
Chord
• Overview
• Developed at UC Berkeley and MIT, published at ACM
SIGCOMM in 2001
• An early and successful algorithm
• Simple and elegant
• Easy to understand and implement
• Many improvements and optimizations exist
• Main functions
• Routing
• A flat logical address space: 160-bit identifiers instead of IP
addresses
• Efficient routing in large systems: log(N) hops for a network of N
nodes
• Self-organization
• Can handle frequent node arrival, departure and failure (“churn”)
Chord Interface and Identifiers
•User interface
• put (key,value) inserts data into Chord
• value = get (key) retrieves data from Chord
• Identifiers
• Derived from the hash function
• e.g., SHA-1, 160-bit output → 0 ≤ identifier < 2^160
• Key associated with each data item
• e.g., key = SHA-1(value)
• ID associated with each host
• e.g., id = SHA-1(IP address, port)
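
A small sketch of how these identifiers could be computed; the exact byte encoding of "IP address, port" is an assumption, the slides only state that SHA-1 is applied:

```python
import hashlib

M = 160   # SHA-1 output length: identifiers lie in [0, 2^160)

def sha1_id(data: bytes) -> int:
    """SHA-1 digest interpreted as an integer, already smaller than 2^160."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def node_id(ip: str, port: int) -> int:
    """ID associated with a host, e.g. SHA-1 over 'ip:port'."""
    return sha1_id(f"{ip}:{port}".encode())

def data_key(value: bytes) -> int:
    """Key associated with a data item, e.g. SHA-1 over the value itself."""
    return sha1_id(value)

print(hex(node_id("134.2.11.68", 4711)))
print(hex(data_key(b"my data")))
```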
Chord Topology (1)
• The Chord ring
• Keys and IDs are placed on a ring, i.e., all arithmetic happens modulo 2^160
• (key, value) pairs are managed by the clockwise next node (successor)

(Figure: an example ring with identifiers 0–7, nodes 0, 1 and 3, and keys 1, 2 and 6: successor(1) = 1, successor(2) = 3, successor(6) = 0.)
Chord Topology (2)
• Topology determined by links between nodes
• Link: knowledge about another node
• Stored in a routing table on each node
• Simplest topology: a circular linked list
• Principle of consistent (distributed) hashing
• Initial idea: balance load among nodes by using a hash function to map nodes/data into the linear address space.
• Each node has a link to the next node (clockwise).
Chord Routing (1)
• Primitive routing in distributed hashing
• Forward the query for key k to the next node until successor(k) is found
• Return the result to the source of the query
• Advantages
• Simple
• Little node state needed
• Disadvantages
• Poor lookup efficiency: N/2 hops on average for N nodes ( = O(N) )
• Per-node state just O(1)
• Poor scalability
• A node failure breaks the circle.

(Figure: a query for key 6, issued at node 0, is forwarded node by node around the ring until successor(6) = 0 is reached.)
Chord Routing (2)
• Advanced routing in distributed hashing
• Store links to the z next neighbors
• Forward queries for k to the farthest known predecessor of k
• For z = N: a fully meshed routing system
• Lookup efficiency: O(1)
• Per-node state: O(N)
• Still poor scalability
•Scalable routing
• A mix of short- and long-distance links is required:
• Accurate routing in the node’s vicinity
• Fast routing progress over large distances
• Bounded number of links per node
• Chord's routing table: the finger table
• Stores log(N) links per node
• Covers exponentially increasing distances:
• Node n: entry i (the i-th "finger") points to successor(n + 2^i)
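
The finger rule can be sketched on a tiny ring; the example below assumes m = 3 and the nodes {0, 1, 3}, which reproduces the finger tables shown in Example 1 on the next slide:

```python
M = 3                        # 3-bit identifiers: ring 0 .. 7 (assumed example)
NODES = sorted([0, 1, 3])

def successor(ident: int) -> int:
    """Clockwise next node for a given identifier (with wrap-around)."""
    return next((n for n in NODES if n >= ident), NODES[0])

def finger_table(n: int):
    """Entry i (i = 0 .. m-1) points to successor((n + 2^i) mod 2^m)."""
    return [successor((n + 2 ** i) % 2 ** M) for i in range(M)]

for n in NODES:
    print(n, finger_table(n))
# 0 [1, 3, 0]    (targets 1, 2, 4)
# 1 [3, 3, 0]    (targets 2, 3, 5)
# 3 [0, 0, 0]    (targets 4, 5, 7)
```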
Chord Routing (3)

• Chord routing: Example 1 (ring with identifiers 0–7; nodes 0, 1 and 3; keys 1, 2 and 6)

Finger table at node 0 (stores key 6):
i   n+2^i   succ.
0   1       1
1   2       3
2   4       0

Finger table at node 1 (stores key 1):
i   n+2^i   succ.
0   2       3
1   3       3
2   5       0

Finger table at node 3 (stores key 2):
i   n+2^i   succ.
0   4       0
1   5       0
2   7       0
Chord Routing (4)
•Chord routing: Example 2
• Each node n forwards the query for key k clockwise
• to the farthest finger preceding k
• until n = predecessor(k) and successor(n) = successor(k)
• returns successor(n) to the source of the query
(Figure: a 64-identifier Chord ring. A lookup for key 44 follows fingers that cover exponentially increasing distances 2^i and terminates at node 45 = successor(44); the figure lists, per hop, the finger targets and the chosen links.)
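
The forwarding rule above can be sketched as a centrally simulated lookup (not a distributed implementation); the node IDs below are assumptions loosely based on the figure:

```python
M = 6                                    # 6-bit identifiers: ring 0 .. 63
IDS = [4, 7, 13, 16, 23, 26, 33, 37, 39, 42, 45, 49, 52, 54, 56, 60]

def between(x, a, b, incl_right=False):
    """Ring interval test: x in (a, b), or in (a, b] if incl_right is set."""
    if a < b:
        return a < x < b or (incl_right and x == b)
    if a > b:                            # the interval wraps around 0
        return x > a or x < b or (incl_right and x == b)
    return x != a or (incl_right and x == b)      # a == b: the full circle

class Node:
    def __init__(self, ident):
        self.id, self.successor, self.fingers = ident, None, []

    def closest_preceding_finger(self, key):
        for f in reversed(self.fingers):          # farthest finger first
            if between(f.id, self.id, key):
                return f
        return self.successor

    def find_successor(self, key):
        node = self
        # Forward clockwise until key falls between node and its successor.
        while not between(key, node.id, node.successor.id, incl_right=True):
            node = node.closest_preceding_finger(key)
        return node.successor

def build_ring(ids):
    nodes, ring = {i: Node(i) for i in ids}, sorted(ids)
    succ = lambda k: next((i for i in ring if i >= k % 2 ** M), ring[0])
    for n in nodes.values():
        n.successor = nodes[succ(n.id + 1)]
        n.fingers = [nodes[succ(n.id + 2 ** i)] for i in range(M)]
    return nodes

nodes = build_ring(IDS)
print(nodes[4].find_successor(44).id)    # -> 45 = successor(44)
```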
Chord Self Organization (1)
•Handle a changing network environment
• Arrival of new nodes
• Departure of participating nodes
• Failure of nodes

•Maintain consistent system state for routing


• Keep routing information up to date
• The correctness of the routing algorithm depends on the
correct successor information.
• Routing efficiency depends on correct finger tables.
• Fault tolerance required for all operations.
Chord Self Organization (2)
• Chord soft-state approach
• Nodes delete (key, value) pairs after a timeout of 30 s to some minutes.
• Applications need to refresh the (key, value) pairs they wish to store periodically.
• Worst case: data is unavailable for one refresh interval after a node failure.
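
A rough sketch of the soft-state idea (the concrete TTL is an assumption; the slides only give "30 s to some minutes"):

```python
import time

TTL = 60.0        # assumed timeout in seconds

store = {}        # key -> (value, expiry timestamp)

def publish(key, value, now=None):
    """(Re-)publishing resets the expiry time; an application calls this
    periodically, e.g. every TTL / 2 seconds, for data it wants to keep."""
    now = time.time() if now is None else now
    store[key] = (value, now + TTL)

def lookup(key, now=None):
    now = time.time() if now is None else now
    entry = store.get(key)
    if entry is None or entry[1] < now:
        store.pop(key, None)              # expired: behaves as if deleted
        return None
    return entry[0]
```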
Chord Self Organization (3)
• Finger failures during routing
• The query cannot be forwarded to the failed finger entry
• Forward to the previous finger instead (do not overshoot the destination node)
• Trigger a repair mechanism: replace the finger by its successor

• Active finger maintenance
• Periodically check the liveness of fingers
• Replace them with correct nodes in case of failures
• Trade-off: maintenance traffic vs. correctness and timeliness

(Figure: the 64-identifier ring again; a failed finger is replaced by its successor.)
Chord Self Organization (4)
• Successor failure during routing
• The last step of routing can return a failed node to the source of the query
→ all queries for that successor fail
• Therefore: store n successors in a successor list
• If successor[0] fails, use successor[1], etc.
• Routing fails only if n consecutive nodes fail simultaneously.

• Active maintenance of the successor list
• Periodic checks, similar to finger table maintenance
• Crucial for correct routing
Chord: Node Join (1)
• The new node picks its ID
• It contacts the existing node responsible for its range
• It constructs its finger table via standard routing/lookup
• It retrieves the (key, value) pairs from its successor

(Figure: node 6 joins the example ring with nodes 0, 1 and 3. Its finger table becomes: i = 0, target 7, succ. 0; i = 1, target 0, succ. 0; i = 2, target 2, succ. 3. The finger tables of nodes 0, 1 and 3 are as in Example 1.)
Chord: Node Join (2)
• Examples for choosing new node IDs
• Random ID (assuming an equal distribution)
• Hash of IP address and port
• Placement of new nodes based on the load of the existing nodes, geographic location, etc.

• Retrieval of existing node IDs (entry points)
• Controlled flooding
• DNS aliases (e.g., the joining node resolves a well-known name such as entrypoint.chord.org)
• Published through the Web
• etc.
Chord: Node Join (3)
• Construction of the finger table
• Iterate over the finger table rows
• For each row: query the entry point for the successor of the row's target
• Use standard Chord routing on the entry point
• Construction of the successor list
• Add the immediate successor from the finger table
• Request the successor list from this successor

(Figure: the joining node 6 asks the entry point succ(7) = ?, succ(0) = ?, succ(2) = ?, obtaining the finger table entries 0, 0 and 3; its successor list becomes {0, 1}, while the successor list {1, 3} shown belongs to node 0.)
Chord: Node Join (4)
• Update of finger pointers: example
• Node 82 joins: finger entries that pointed to node 86 may now have to point to the new node 82
• Candidates for updates: the nodes (counter-clockwise) whose 2^i-th finger entry has to point to 82
• For each i: route to s − 2^i (with s = 82, e.g. 82 − 2^3 = 74) and check the predecessor t of that key
• If t's 2^i-finger points to a node beyond 82: change t's 2^i-finger to 82, set t to the predecessor of t and repeat
• Else continue with i + 1
• O(log² N) for looking up and updating the finger entries.

(Figure: example for i = 3 – starting around key 82 − 2^3 = 74, the nodes whose 2^3-finger pointed to 86 are updated so that it points to the new node 82.)
Conclusions for Chord
•Complexity
• Messages per lookup: O(log N)
• Memory per node: O(log N)
• Messages per management action (join/leave/fail): O(log²
N)
•Advantages
• Theoretical models and proofs exist about the complexity
• Simple and flexible
• Disadvantages
• No notion of node proximity and no proximity-based routing optimizations
• Chord rings may become disjoint (partitioned) in realistic settings
• By now, many improvements have been published
• e.g., provisions for proximity, bi-directional links, load balancing, etc.
CAN: Content-Addressable Network
•An early and successful algorithm
•Simple and elegant
• Intuitive to understand and implement
• Many improvements and optimizations exist
• Published by Sylvia Ratnasamy et al. in 2001
•Main responsibilities
• CAN is a distributed system that maps keys to values.
• CAN uses distributed hashing.
• Keys are hashed into a D-dimensional space
• Interface:
• insert(key, value)
• retrieve(key)
CAN Overview (1)
• Basic idea

(Figure: a set of peers, each holding some (K, V) pairs; one peer performs insert(K1, V1), another performs retrieve(K1), and the system routes both operations to the peer that stores (K1, V1).)
CAN Overview (2)
• Solution
• A virtual Cartesian coordinate space
• The entire space is partitioned amongst all the nodes: every node "owns" a zone in the overall space.
• Abstraction
• Data can be stored at "points" in the space
• Routing is possible from one "point" to another
• A "point" is handled by the node that owns the enclosing zone.


CAN Overview (3)
• D-dimensional value space
• A hash value corresponds to a point in the D-dimensional space.
• H("movie.avi") → 4711 → (0.7, 0.2)
• The DHT stores (key, value) pairs
• Complexity
• Search effort: O((D/4) · N^(1/D)) hops
• Memory requirement: O(D) = O(1) for a fixed D

(Figure: example with D = 2 – the unit square [0, 1]² is partitioned into zones owned by nodes 1–9; H("movie.avi") → (0.7, 0.2) falls into one of the zones.)
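
A possible sketch of the key-to-point mapping; the per-dimension salting is an assumption, the slides only state that the hash value corresponds to a point in the D-dimensional space:

```python
import hashlib

D = 2   # the slides use two dimensions for illustration

def can_point(key: str, d: int = D):
    """Hash a key to a point in the unit cube [0, 1)^d, using one salted
    SHA-1 value per coordinate."""
    coords = []
    for i in range(d):
        digest = hashlib.sha1(f"{i}:{key}".encode()).digest()
        coords.append(int.from_bytes(digest, "big") / 2 ** 160)
    return tuple(coords)

print(can_point("movie.avi"))   # some point in [0, 1)^2, analogous to (0.7, 0.2)
```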
CAN Overview (4)

• An overlay node manages one partition (rectangle) of the value space.
• Example: node 4 manages all values with x ∈ [0.5, 1], y ∈ [0, 0.25]
• Adjacent partitions are called "neighbors":
• Nodes 6, 2 and 4 are neighbors of node 5
• "Wrap around" at the DHT borders: node 3 is also a neighbor of node 5
• Expected number of neighbors: O(2D)
→ independent of the size of the CAN network!

(Figure: the same two-dimensional zone layout with nodes 1–9.)
CAN Setup
• State of the system at time t
• In this 2-dimensional space a key is mapped to a point (x, y).

(Figure: the coordinate space partitioned into zones, each owned by a peer; resources are located at the points their keys hash to.)


CAN Routing (1)

• A D-dimensional space with n zones
• Two zones are neighbors if they overlap in D−1 dimensions
• The definition of neighbors "wraps around" the edges
• Routing algorithm:
• Begin at any node
• Follow the path towards a nearer direct neighbor until you find the node responsible for the key's region Q(x, y)

(Figure: a query for a key hashed to the point (x, y) is routed zone by zone towards the peer owning the region Q(x, y).)
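
A hedged sketch of this greedy forwarding for D = 2; the zone names and the "closest zone centre" heuristic are illustrative assumptions, and wrap-around at the borders is omitted for brevity:

```python
import math

class Zone:
    """An axis-aligned rectangle (x1, y1, x2, y2) owned by one peer."""
    def __init__(self, name, x1, y1, x2, y2):
        self.name, self.box = name, (x1, y1, x2, y2)
        self.neighbors = []          # adjacent zones, filled in later
        self.store = {}              # (key, value) pairs held by this peer

    def contains(self, p):
        x1, y1, x2, y2 = self.box
        return x1 <= p[0] < x2 and y1 <= p[1] < y2

    def centre(self):
        x1, y1, x2, y2 = self.box
        return ((x1 + x2) / 2, (y1 + y2) / 2)

def route(start, point, hops=0):
    """Greedy routing: forward to the neighbor whose zone centre is closest
    to the target point, until the zone containing the point is reached."""
    if start.contains(point):
        return start, hops
    nxt = min(start.neighbors, key=lambda z: math.dist(z.centre(), point))
    return route(nxt, point, hops + 1)

# Example: the unit square split into four quarters (fully meshed
# neighborhood for brevity); a query entering at zone A for (0.7, 0.2)
# ends in zone B after one hop.
quads = [Zone("A", 0, 0, .5, .5), Zone("B", .5, 0, 1, .5),
         Zone("C", 0, .5, .5, 1), Zone("D", .5, .5, 1, 1)]
for z in quads:
    z.neighbors = [o for o in quads if o is not z]
owner, hops = route(quads[0], (0.7, 0.2))
print(owner.name, hops)          # -> B 1
```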
CAN Routing (2)
• Each node manages a rectangle with ratio 1:1, 1:2 or 2:1 (if D = 2)

• Example
• Dimension = 2, x = 0…8, y = 0…8
• Node n1 is the first node and thus manages the entire space
• Node n2 joins the CAN network: the space is split between n1 and n2
• Join of node n3
• Join of node n4
• Join of node n5

(Figure: the resulting zone layout with nodes n1–n5.)
CAN Routing (3)
• The data location is associated with coordinates derived from the key.
• A (key, value) pair is stored at the node responsible for the respective section.
• A query for a key is always forwarded via neighbors:
• Entry point at some known node, e.g., n1
• Lookup for key k4

(Figure: keys k1–k4 placed in the zones of nodes n1–n5; the lookup for k4 enters at n1 and is forwarded via neighbors to the zone that contains k4.)
CAN Routing (4)
• Path selection in CAN
• Routing along the shortest path in the D-dimensional space
• Details: the distance to the target decreases continuously
• Effort: O((D/4) · N^(1/D)) hops
CAN: A Simple Example (1)

(Figure: successive splits of the coordinate space as nodes join.)
CAN: A Simple Example (2)
Node I: insert(K, V)
(1) a = hx(K), b = hy(K)
(2) route (K, V) to the point (a, b)
(3) the node owning (a, b) stores (K, V)

Node J: retrieve(K)
(1) a = hx(K), b = hy(K)
(2) route "retrieve(K)" to the point (a, b)
(3) the owner of (a, b) returns (K, V)

(Figure: node I inserts (K, V) and node J retrieves K; both operations are routed to the point (x = a, y = b).)
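
The insert/retrieve sequence could be sketched as follows; hx and hy are assumed to be two independent salted hash functions, and the owner() helper stands in for the CAN routing step sketched earlier:

```python
import hashlib

def _h(salt: str, key: str) -> float:
    """Scale a salted SHA-1 value into [0, 1); the salting is an assumption."""
    v = int.from_bytes(hashlib.sha1(f"{salt}:{key}".encode()).digest(), "big")
    return v / 2 ** 160

def hx(key: str) -> float: return _h("x", key)
def hy(key: str) -> float: return _h("y", key)

# Four example zones covering the unit square, each with its own local store.
zones = {"A": (0, 0, .5, .5), "B": (.5, 0, 1, .5),
         "C": (0, .5, .5, 1), "D": (.5, .5, 1, 1)}
stores = {name: {} for name in zones}

def owner(point):
    """Stand-in for CAN routing: find the zone that contains the point."""
    for name, (x1, y1, x2, y2) in zones.items():
        if x1 <= point[0] < x2 and y1 <= point[1] < y2:
            return name

def insert(key, value):
    a, b = hx(key), hy(key)                  # (1) a = hx(K), b = hy(K)
    stores[owner((a, b))][key] = value       # (2)+(3) route to (a, b), store

def retrieve(key):
    a, b = hx(key), hy(key)
    return stores[owner((a, b))].get(key)    # route "retrieve(K)" to (a, b)

insert("movie.avi", b"...")                  # node I's insert
print(retrieve("movie.avi"))                 # node J's retrieve -> b'...'
```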
Partitioning of CAN Ranges (1)

(Figure: example of how the CAN coordinate space is successively partitioned as nodes join.)
Partitioning of CAN Ranges (2)
• Partitioning is performed according to some rules
• Strict sequencing of the value range partitioning
• According to the order of the dimensions
• For example: x, y, z, x, y, z, … if D = 3

• Partitioning tree
• Reflects the "history" of the partitioning process
• Important for the fusion of ranges in the case of exit or failure of nodes

(Figure: a binary partitioning tree alternating between the X and Y dimensions, with leaves A–F corresponding to the current zones.)
Structure of a CAN – Example (1)
• Insertion of nodes A, …, D

(Figure: three snapshots of the coordinate space and the corresponding partitioning tree – first A (1) and B (0); then A (1), B (00) and C (01); then A (1), B (00), D (010) and C (011).)
Structure of a CAN – Example (2)
• Insertion of nodes E, F, G

(Figure: further snapshots – E splits C's zone into C (0110) and E (0111); F splits A's zone into A (10) and F (11); finally G splits A again into A (100) and G (101). The partitioning tree grows accordingly.)
Other Improvements for CAN
•Routing metrics
• measure the delay between neighbors
• choose the neighbors with the shortest delay
• Overlapping regions
• k nodes jointly manage one zone
• More redundancy
• Faster routing paths because of a smaller number of zones
• Equal (uniform) partitioning of regions
• Target zone tests during the join procedure: are there "large" neighbors in the proximity that are better qualified for partitioning?
Conclusions for CAN
• CAN is a peer-to-peer system based on a DHT.
• It operates with D dimensions. The number of dimensions determines the efficiency.
• Access to a key in O((D/4) · N^(1/D)) hops
• Efficient algorithms for joining and leaving nodes exist.
• Problem: N has to be known beforehand!
Thank you for your attention!