
Overlay Networks

(with a focus on Content Distribution Networks)

Janardhan R. Iyengar
CISC 856 – TCP/IP and Upper Layer Protocols
04/23/2002
What is an Overlay?

What is the topology of this network?

WHICH network??
Figure borrowed from www.isi.edu/xbone
Overlay Networks: Overview
• Networks built using an existing network as substrate

• Also known as Virtual Networks

• Most popular overlay: the Internet, which evolved as an overlay on the POTS (Plain Old Telephone Service) network

• Overlays could consist of routing software installed at selected sites, connected by encapsulation tunnels or direct links
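
The encapsulation tunnels mentioned above can be illustrated with a minimal sketch. The `Packet` class, the node names, and the addresses are hypothetical and chosen only for illustration; real tunnels (for example IP-in-IP) work on binary headers rather than Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # application data, or another Packet when tunneled


def encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Wrap an overlay packet in a substrate packet addressed to the tunnel endpoint."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Recover the overlay packet at the far end of the tunnel."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunneled packet")
    return outer.payload


# Overlay addressing (hypothetical names) is carried untouched inside the
# substrate packet; substrate routers only look at the outer src/dst.
overlay_pkt = Packet(src="overlay-node-A", dst="overlay-node-B", payload="data")
outer = encapsulate(overlay_pkt, tunnel_src="192.0.2.1", tunnel_dst="198.51.100.7")
recovered = decapsulate(outer)
```

The substrate network forwards `outer` like any other packet, which is why the overlay needs software only at the selected sites.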
Overlay Networks: Examples

• MBone, 6Bone, ABone

• RON, VNS

• P2P (Napster, FreeNet, Gnutella)

• Content Networks
- Cooperating Caches
- Server Farms
- Content Distribution Networks (CDNs)
Example Overlays: (1) MBone
• Semi-permanent testbed to carry IP multicast traffic

• Routing of IP multicast traffic is not commonly integrated and deployed in production routers on the Internet

• Hence, layered on the Internet to support routing of IP multicast packets using tunneling

[Figure: MBone nodes attached to Internet routers and linked to one another through the Internet]
Example Overlays: (2) 6Bone
• 6bone is an IPv6 testbed on the Internet

• Intended to eventually subsume the underlying IPv4 network

• IPv4 tunnels used to overlay the 6bone

• ABone is the Active Networks Backbone, for experimentation in Active Networking. It also uses tunneling
Other known Overlays
• Resilient Overlay Network (RON): Provides fault tolerance and faster recovery as compared to conventional routing techniques

• Virtual Network Service (VNS): Infrastructure for provisioning QoS within Virtual Private Networks

• Peer to Peer Networks: Infrastructure for distribution and sharing of files (e.g., Napster, Gnutella, Freenet)

• Content Networks:
- Server Farms, Caching Proxies, Content Distribution Networks (CDNs)
- Today, we will try to focus on CDNs
- What are the motivations for Content Networks?
Motivations for Content Networks

• More hops between client and Web server => more congestion!

• Same data flowing repeatedly over links between clients and Web server

[Figure: clients C1, C2, C3 and C4 connected to Web server S through IP routers]
Motivations for Content Networks (contd.)

• Origin server is bottleneck as number of users grows

• Flash Crowds (for instance, Sept. 11)

• The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)
Example content networks: Server Farms
• Simple solution to the content distribution problem:
deploy a large group of servers

[Figure: requests from grad.umd.edu and ren.cis.udel.edu arrive at an L4-L7 switch, which forwards them to one of three copies of www.cnn.com (Copy 1, Copy 2, Copy 3)]

• Arbitrate client requests to servers using an “intelligent” L4-L7 switch

• Widely used today
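
As a rough illustration of how a switch might arbitrate requests across the server copies, here is a minimal round-robin sketch. Round-robin is only one possible policy and is an assumption here; the slides do not say which policy an L4-L7 switch uses, and the `Switch` class is hypothetical.

```python
import itertools


class Switch:
    """Toy request arbiter: hands each incoming request to the next server copy."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def dispatch(self, client: str) -> str:
        # A real L4-L7 switch could also inspect ports, URLs or cookies;
        # this sketch ignores the client and simply rotates through servers.
        return next(self._cycle)


switch = Switch(["www.cnn.com (Copy 1)", "www.cnn.com (Copy 2)", "www.cnn.com (Copy 3)"])
assignments = [
    switch.dispatch(client)
    for client in ["grad.umd.edu", "ren.cis.udel.edu", "grad.umd.edu", "ren.cis.udel.edu"]
]
```

Note that every copy sits behind the same switch, so the network path between client and server farm is unchanged.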


Example content networks: Caching Proxies
[Figure: within the ISP, interceptors divert TCP port 80 traffic from clients ren.cis.udel.edu and merlot.cis.udel.edu to a caching proxy; other traffic goes to the Internet, where www.cnn.com resides]

• Motivated mainly by ISP business interests: reduction in bandwidth consumption of ISP from the Internet

• Reduced network traffic

• Reduced user perceived latency
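
The traffic reduction can be seen in a minimal sketch of a caching proxy. The `CachingProxy` class and the `fetch_from_origin` callback are hypothetical; real proxies also handle expiry and validation, which this sketch deliberately leaves out.

```python
class CachingProxy:
    """Toy caching proxy: fetches a URL from the origin once, then serves copies."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._cache = {}
        self.origin_fetches = 0  # requests that actually crossed the ISP's uplink

    def get(self, url: str) -> str:
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
            self.origin_fetches += 1
        return self._cache[url]


def fake_origin(url: str) -> str:
    # Stand-in for the real Web server; returns a placeholder body.
    return f"content of {url}"


proxy = CachingProxy(fake_origin)
for _client in ["ren.cis.udel.edu", "merlot.cis.udel.edu", "ren.cis.udel.edu"]:
    body = proxy.get("www.cnn.com/index.html")
```

Three client requests produce a single origin fetch. The same property means the origin server sees fewer hits than were actually served, and that clients keep receiving the cached copy until it is refreshed.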


Consider: On September 11, 2001
[Figure: Web server www.cnn.com holds new content ("WTC News!"); user merlot.cis.udel.edu and two groups of 1,000,000 other hosts send requests; the ISP caching proxy returns old content. Legend: congestion / bottleneck; caching proxy]
Problems with discussed approaches:
Server farms and Caching proxies

• Server farms do nothing about problems due to network congestion, or to improve latency issues due to the network

• Caching proxies serve only their clients, not all users on the Internet

• Content providers (say, Web servers) cannot rely on existence and correct implementation of caching proxies

• Accounting issues with caching proxies. For instance, www.cnn.com needs to know the number of hits to the webpage for advertisements displayed on the webpage
Again, On September 11, 2001
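[Figure: Web server www.cnn.com holds new content ("WTC News!"); a distribution infrastructure carries it to surrogates in WA, CA, MI, IL, MA, FL, NY and DE; user merlot.cis.udel.edu and two groups of 1,000,000 other users send requests and receive the new content. Legend: distribution infrastructure; surrogate]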
Web replication - CDNs

• Overlay network to distribute content from origin servers to users

• Avoids large amounts of same data repeatedly traversing potentially congested links on the Internet

• Reduces Web server load

• Reduces user perceived latency

• Tries to route around congested networks


CDN vs. Caching Proxies

• Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users

• Caches are reactive; CDNs are proactive

• Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to the content providers (web servers) and clients

• CDNs give control over the content to the content providers; caching proxies do not
CDN Architecture
[Figure: the origin server connects to the CDN, which contains the request routing infrastructure, the distribution and accounting infrastructure, and surrogates; clients are served by the surrogates]
CDN Components
• Content Delivery Infrastructure: Delivering content to clients from surrogates

• Request Routing Infrastructure: Steering or directing content request from a client to a suitable surrogate

• Distribution Infrastructure: Moving or replicating content from content source (origin server, content provider) to surrogates

• Accounting Infrastructure: Logging and reporting of distribution and delivery activities
Server Interaction with CDN
1. Origin server pushes new content to CDN
   OR
   CDN pulls content from origin server

2. Origin server requests logs and other accounting info from CDN
   OR
   CDN provides logs and other accounting info to origin server

[Figure: origin server www.cnn.com linked to the CDN's distribution infrastructure (interaction 1) and accounting infrastructure (interaction 2)]
Client Interaction with CDN
1. Hi! I need www.cnn.com/sept11
2. Go to surrogate delaware.cnn.akamai.com
3. Hi! I need content /sept11

Q: How did the CDN choose the Delaware surrogate over the California surrogate?

[Figure: the client exchanges messages 1 and 2 with the CDN's request routing infrastructure and message 3 with surrogate (DE) delaware.cnn.akamai.com; surrogate (CA) california.cnn.akamai.com is also shown]
Request Routing Techniques

• Request routing techniques use a set of metrics to direct users to “best” surrogate

• Proprietary, but underlying techniques known:
- DNS based request routing
- Content Modification (URL rewriting)
- Anycast based (how common is anycast?)
- URL based request routing
- Transport layer request routing
- Combination of multiple mechanisms
DNS based Request-Routing

• Common due to the ubiquity of DNS as a directory service

• Specialized DNS server inserted in DNS resolution process

• DNS server is capable of returning a different set of A, NS or CNAME records based on policies/metrics
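
A minimal sketch of the policy idea follows: the same name resolves to different A records depending on which DNS server is asking. The `POLICY` table, the `resolve` function, and the surrogate addresses (taken from documentation address ranges) are assumptions for illustration, not any CDN's actual mechanism.

```python
import ipaddress

# Hypothetical policy: requesting DNS servers in a given prefix are answered
# with a particular surrogate address; everyone else gets the default.
POLICY = [
    (ipaddress.ip_network("128.4.0.0/16"), "192.0.2.10"),
]
DEFAULT_SURROGATE = "198.51.100.20"


def resolve(name: str, requesting_dns_ip: str):
    """Return a (name, record type, address) answer chosen by the requester's address."""
    requester = ipaddress.ip_address(requesting_dns_ip)
    for network, surrogate in POLICY:
        if requester in network:
            return (name, "A", surrogate)
    return (name, "A", DEFAULT_SURROGATE)


answer_near = resolve("www.cnn.com", "128.4.4.12")
answer_far = resolve("www.cnn.com", "203.0.113.5")
```

The decision is keyed on the address of the requesting DNS server, not on the address of the end client, because that is all the specialized DNS server sees.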
DNS based Request-Routing
Q: How does the Akamai DNS know which surrogate is closest?

[Figure: client merlot.cis.udel.edu (128.4.30.15) sends a DNS query for www.cnn.com to its local DNS server louie.udel.edu (128.4.4.12), which queries the Akamai DNS; the DNS response is "A 145.155.10.15", and the client then opens a session with that surrogate. The Akamai CDN contains surrogates california.cnn.akamai.com and delaware.cnn.akamai.com; the surrogate addresses shown are 58.15.100.152 and 145.155.10.15]
DNS based Request-Routing
[Figure: for www.cnn.com, each surrogate in the Akamai CDN takes measurements to the client DNS and reports measurement results to the Akamai DNS; client merlot.cis.udel.edu (128.4.30.15) sends a DNS query through its local DNS server louie.udel.edu (128.4.4.12), receives the DNS response, and opens a session with the chosen surrogate]
DNS based Request Routing: Caching
[Figure: DNS caching in request routing for www.cnn.com]
- Client 76.43.35.53 uses client DNS 76.43.32.4
- Surrogate 145.155.10.15, measured to requesting DNS 76.43.32.4: available bandwidth = 10 kbps, RTT = 10 ms
- Surrogate 58.15.100.152, measured to requesting DNS 76.43.32.4: available bandwidth = 5 kbps, RTT = 100 ms
- Akamai DNS entry: requesting DNS 76.43.32.4, surrogate 145.155.10.15
- Response returned to client DNS: www.cnn.com A 145.155.10.15, TTL = 10s
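
The effect of the TTL on the client DNS can be sketched as follows. The `ClientDnsCache` class is hypothetical, and time is passed in explicitly to keep the example deterministic; the record and TTL are the ones shown on the slide.

```python
class ClientDnsCache:
    """Toy resolver cache: an answer is reused until its TTL runs out."""

    def __init__(self):
        self._entries = {}  # name -> (address, expiry time in seconds)

    def put(self, name: str, address: str, ttl: float, now: float) -> None:
        self._entries[name] = (address, now + ttl)

    def get(self, name: str, now: float):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            # Expired: the next lookup must go back to the CDN's DNS server,
            # which may pick a different surrogate.
            del self._entries[name]
            return None
        return address


cache = ClientDnsCache()
cache.put("www.cnn.com", "145.155.10.15", ttl=10, now=0)
within_ttl = cache.get("www.cnn.com", now=5)
after_ttl = cache.get("www.cnn.com", now=10)
```

A short TTL bounds how long a routing decision is reused by every client behind that DNS server, at the cost of more queries to the CDN's DNS.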
DNS based Request Routing Techniques:
Discussion
• Originator Problem: Client may be far removed from client DNS

• Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
Q: Which DNS server resolves pel.cis.udel.edu?
Q: Which DNS server performs the last recursion of the DNS request?

• Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate; this is an issue in load balancing requests and in predicting load on surrogates
Server Selection Metrics
• Network Proximity (Surrogate to Client):
- Network hops (traceroute)
- Internet mapping services (NetGeo, IDMaps)
- …

• Surrogate Load:
- Number of active TCP connections
- HTTP request arrival rate
- Other OS metrics
- …

• Bandwidth Availability

• …
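
One way to combine such metrics is a weighted score, sketched below. The `SurrogateStats` fields, the weights, and the connection counts are hypothetical; the RTT and bandwidth figures reuse the values from the DNS caching slide. Actual CDN selection functions are proprietary.

```python
from dataclasses import dataclass


@dataclass
class SurrogateStats:
    address: str
    rtt_ms: float            # network proximity to the requesting DNS
    bandwidth_kbps: float    # available bandwidth
    active_connections: int  # surrogate load


def score(s: SurrogateStats, w_rtt=1.0, w_load=0.5, w_bw=2.0) -> float:
    """Lower is better: penalize delay and load, reward available bandwidth."""
    return w_rtt * s.rtt_ms + w_load * s.active_connections - w_bw * s.bandwidth_kbps


def select(surrogates):
    """Pick the surrogate with the lowest score."""
    return min(surrogates, key=score)


candidates = [
    SurrogateStats("145.155.10.15", rtt_ms=10, bandwidth_kbps=10, active_connections=20),
    SurrogateStats("58.15.100.152", rtt_ms=100, bandwidth_kbps=5, active_connections=5),
]
best = select(candidates)
```

With these weights the first surrogate wins despite its higher load, because its RTT and bandwidth advantages dominate; different weights would express a different policy.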
Value of a CDN

• Scale: Aggregate infrastructure size

• Reach: Diversity of content locations (diverse placement of surrogates)

• Request routing efficiency, delivery techniques


Content Distribution Internetworking: CDI
• Interconnection of content networks: collaboration between caching proxies and CDNs, as well as between individual CDNs

• Greater reach, larger scale, higher capacity, increased fault tolerance

• A new area, lots of challenges

• Basic architecture involves gateways between various content networks
CDI: Architecture

[Figure: CDN1, CDN2, CDN3 and CN4 (for instance, the cache network of some ISPx) interconnected through content peering gateways]
Traditional vs. Overlay Content Networks
Traditional networks:
• Information processed at layers 1 through 3 of the OSI stack
• Units of transported data are frames and packets

Content networks:
• Overlay "Content Layer" to enable richer services on top of layer 7 protocols (HTTP, RTSP)
• Information processed at layers 4 through 7 of the OSI stack
• Units of transported data in content networks are images, movies, songs
In Summary
• Overlays are a concept that can be used to:
- deploy new services on the Internet (Mbone, 6bone, Abone, Peer-to-Peer, Content Networks)
- get around problems in the underlying technology (Resilient Overlay Networks)

• Further reading - Overlays:
- www.savetz.com/mbone/
- www.6bone.net/
- nms.lcs.mit.edu/projects/ron/
- www-2.cs.cmu.edu/~hzhang/VNS/

• Further reading - CDNs:
- www.ietf.org/internet-drafts/draft-ietf-cdi-model-01.txt
- www.ietf.org/internet-drafts/draft-ietf-cdi-known-request-routing-00.txt
- Bunch of papers … send me mail if you are interested

• Questions? Answers? Thoughts?


Full-Site vs. Partial-Site Content Delivery
• Full-Site delivery is what we have seen so far: the entire webpage is delivered from the CDN

• Partial-Site delivery delivers only embedded objects (say, only images on the webpage) from the CDN

• Embedded object redirection can be done using DNS based request routing or URL rewriting

Q: How many TCP connections are needed to do a P-HTTP transfer of a webpage with embedded objects using the above 2 techniques?
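
URL rewriting for partial-site delivery can be sketched as a transformation of the page served by the origin. The CDN hostname `cdn.example.net`, the URL layout, and the `rewrite_embedded` function are assumptions for illustration; a regular expression is used for brevity, whereas a production rewriter would parse the HTML properly.

```python
import re

CDN_HOST = "cdn.example.net"  # hypothetical CDN hostname

IMG_SRC = re.compile(r'(<img\b[^>]*\bsrc=")([^"]+)(")', re.IGNORECASE)


def rewrite_embedded(html: str, origin_host: str) -> str:
    """Point relative <img> sources at the CDN; leave everything else alone."""

    def repl(match):
        path = match.group(2)
        if path.startswith(("http://", "https://")):
            return match.group(0)  # already absolute, do not touch
        new_url = f'http://{CDN_HOST}/{origin_host}/{path.lstrip("/")}'
        return f"{match.group(1)}{new_url}{match.group(3)}"

    return IMG_SRC.sub(repl, html)


page = (
    '<html><a href="index.html">home</a>'
    '<img src="image1.gif"><img src="/image2.gif"></html>'
)
rewritten = rewrite_embedded(page, "www.cnn.com")
```

The page itself still comes from the origin server, while the rewritten image URLs send the client to the CDN for the embedded objects.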
CDN with Full-Site Delivery

[Figure: index.html embeds image1.gif and image2.gif; the client sends GET index.html and GET image1.gif, image2.gif to the CDN surrogate server, which obtains index.html, image1.gif and image2.gif from the origin server]
CDN with Partial-Site Delivery

[Figure: index.html embeds image1.gif and image2.gif; the client sends GET index.html to the origin server and GET image1.gif, image2.gif to the CDN surrogate server, which obtains image1.gif and image2.gif from the origin server]
CDN Types (Skeletal)
[Figure: classification tree of CDNs]
- CDNs: Hosting CDN, Relaying CDN
- Content delivery: Partial Site Content Delivery, Full Site Content Delivery
- Request Routing Techniques: DNS based, URL Rewriting
