
CS 525

Advanced Distributed
Systems
Spring 2010
Yeah! That’s what
I’d like to know.

Indranil Gupta (Indy)


Lecture 3
Cloud Computing Continued
January 26, 2010
All Slides © IG (Acknowledgments: Md Yusuf Sarwar and Md Ahsan Arefin)
Clouds are Water Vapor

• Larry Ellison’s Rant on Cloud Computing

“A Cloudy History of Time” © IG 2010
[Timeline figure, 1940 → 2010:]
• The first datacenters!
• Timesharing Companies & Data Processing Industry
• Clusters
• Grids
• PCs (not distributed!)
• Peer-to-peer systems
• Clouds and datacenters
“A Cloudy History of Time” © IG 2010
• First large datacenters: ENIAC, ORDVAC, ILLIAC
– Many used vacuum tubes and mechanical relays
• Berkeley NOW Project
• Supercomputers
• Server Farms (e.g., Oceano)
• P2P Systems (90s-00s)
– Many millions of users
– Many GB per day
• Data Processing Industry
– 1968: $70 M. 1978: $3.15 Billion.
• Timesharing Industry (1975)
– Market Share: Honeywell 34%, IBM 15%, Xerox 10%, CDC 10%, DEC 10%, UNIVAC 10%
– Honeywell 6000 & 635, IBM 370/168, Xerox 940 & Sigma 9, DEC PDP-10, UNIVAC 1108
• Grids (1980s-2000s)
– GriPhyN (1970s-80s)
– Open Science Grid and Lambda Rail (2000s)
– Globus & other standards (1990s-2000s)
• Clouds
So What’s new in Today’s Clouds?
Besides massive scale, three major features:
I. On-demand access: Pay-as-you-go, no upfront
commitment.
– Anyone can access it (e.g., Washington Post – Hillary Clinton
example)
II. Data-intensive Nature: What was MBs has now
become TBs.
– Daily logs, forensics, Web data, etc.
– Do you know the size of the Wikipedia dump?
III. New Cloud Programming Paradigms:
MapReduce/Hadoop, Pig Latin, DryadLinq, Swift, and
many others.
– High in accessibility and ease of programmability
A combination of one or more of these features gives rise to novel and unsolved distributed computing problems in cloud computing.
OK, so that’s what a cloud
looks like today. Now,
suppose I want to start my
own company, Devils Inc.
Should I buy a cloud and own
it, or should I outsource to a
public cloud?

Single site Cloud: to Outsource or
Own? [OpenCirrus paper]
• Medium-sized organization: wishes to run a
service for M months
– Service requires 128 servers (1024 cores) and 524 TB
– Same as UIUC CCT cloud site
• Outsource (e.g., via AWS): monthly cost
– S3 costs: $0.12 per GB-month. EC2 costs: $0.10 per CPU-hour
– Storage ~ $62 K
– Total ~ $136 K
• Own: monthly cost
– Storage ~ $349 K / M
– Total ~ $1555 K / M + $7.5 K (includes 1 sysadmin per 100 nodes)
• Using a 0.45 : 0.4 : 0.15 split for hardware : power : network
Single site Cloud: to Outsource or
Own?
• Breakeven analysis: owning is preferable if:
– $349 K / M < $62 K (storage)
– $ 1555 K / M + 7.5 K < $136 K (overall)
• Breakeven points:
– M > 5.55 months (storage)
• Not surprising: Cloud providers benefit monetarily most from storage
– M > 12 months (overall)
• Assume hardware lasts for 3 years (typical lifetime)
• Systems are typically underutilized
• With system utilization of x%, owning is still preferable if:
– x > 33.3%
– Even with CPU utilization of 20%, storage utilization > 47% makes owning preferable
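A quick sketch of the breakeven arithmetic above in Python (the dollar figures are the slide's estimates; the helper function and the exact outputs are illustrative, not taken from the OpenCirrus paper):

```python
# Breakeven sketch for the outsource-vs-own comparison above (all costs in $K).
def breakeven_months(own_upfront_k, own_monthly_k, outsource_monthly_k):
    """Smallest M such that own_upfront/M + own_monthly < outsource_monthly."""
    return own_upfront_k / (outsource_monthly_k - own_monthly_k)

# Storage only: $349 K owned (one-time) vs. $62 K/month outsourced (S3).
print("storage breakeven: %.1f months" % breakeven_months(349, 0, 62))      # ~5.6 (slide: ~5.55)

# Overall: $1555 K owned (one-time) + $7.5 K/month sysadmin vs. $136 K/month (AWS).
print("overall breakeven: %.1f months" % breakeven_months(1555, 7.5, 136))  # ~12.1 (slide: ~12)
```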
I want to do research in this
area. I am sure there are no
grand challenges in cloud
computing!

10 Challenges [Above the Clouds]
(Index: Performance, Data-related, Scalability, Logistical)

• Availability of Service: Use Multiple Cloud Providers; Use Elasticity; Prevent DDoS
• Data Lock-In: Enable Surge Computing; Standardize APIs
• Data Confidentiality and Auditability: Deploy Encryption, VLANs, Firewalls; Geographical Data Storage
• Data Transfer Bottlenecks: Data Backup/Archival; Higher BW
Switches; New Cloud Topologies; FedExing Disks
• Performance Unpredictability: QoS; Improved VM Support; Flash
Memory; Schedule VMs
• Scalable Storage: Invent Scalable Store
• Bugs in Large Distributed Systems: Invent Debuggers; Real-time
debugging; predictable pre-run-time debugging
• Scaling Quickly: Invent Good Auto-Scalers; Snapshots for
Conservation
• Reputation Fate Sharing
• Software Licensing: Pay-for-use licenses; Bulk use sales
A more Bottom-Up View of Open
Research Directions
Myriad interesting problems that acknowledge the characteristics that make today’s cloud computing unique: massive scale + on-demand + data-intensive + new programmability + infrastructure- and application-specific details.

• Monitoring: of systems & applications; single site and multi-site
• Storage: massive scale; global storage; for specific apps or classes
• Failures: what is their effect, what is their frequency, how do we achieve fault-tolerance?
• Scheduling: Moving tasks to data, dealing with federation
• Communication bottleneck: within applications, within a site
• Locality: within clouds, or across them
• Cloud Topologies: non-hierarchical, other hierarchical
• Security: of data, of users, of applications, confidentiality, integrity
• Availability of Data
• Seamless Scalability: of applications, of clouds, of data, of everything
• Inter-cloud/multi-cloud computations
• Second Generation of Other Programming Models? Beyond MapReduce!
• Pricing Models
• Explore the limits of today’s cloud computing
Alright. But, I bet that if I
have a ton of data to
process, it is very difficult to
write a program for it!

New Parallel Programming Paradigms:
MapReduce
• Highly-Parallel Data-Processing
• Originally designed by Google (OSDI 2004 paper)
• Open-source version called Hadoop, by Yahoo!
– Hadoop written in Java. Your implementation could be
in Java, or any executable
• Google (MapReduce)
– Indexing: a chain of 24 MapReduce jobs
– ~200K jobs processing 50PB/month (in 2006)
• Yahoo! (Hadoop + Pig)
– WebMap: a chain of 100 MapReduce jobs
– 280 TB of data, 2500 nodes, 73 hours
• Annual Hadoop Summit: 2008 had 300 attendees, 2009
had 700 attendees
What is MapReduce?
• Terms are borrowed from Functional Languages
(e.g., Lisp)
Sum of squares:
• (map square ‘(1 2 3 4))
– Output: (1 4 9 16)
[processes each record sequentially and independently]

• (reduce + ‘(1 4 9 16))


– (+ 16 (+ 9 (+ 4 1) ) )
– Output: 30
[processes set of all records in a batch]
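For readers more comfortable with Python than Lisp, the same sum-of-squares computation looks like this (illustrative only; Python's functools.reduce plays the role of Lisp's reduce):

```python
from functools import reduce

squares = list(map(lambda x: x * x, [1, 2, 3, 4]))  # [1, 4, 9, 16]: each record handled independently
total = reduce(lambda a, b: a + b, squares)          # 30: all records combined in a batch
print(squares, total)
```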
Map

• Process individual key/value pair to generate intermediate key/value pairs.

Input <filename, file text>:
  Welcome Everyone
  Hello Everyone
Intermediate pairs output by Map:
  Welcome 1
  Everyone 1
  Hello 1
  Everyone 1
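As a concrete illustration of the Map step above, here is a word-count mapper written as a Hadoop Streaming-style Python script (a sketch that assumes Streaming feeds input lines on stdin; the slides do not prescribe this particular implementation):

```python
#!/usr/bin/env python
# mapper.py -- word-count Map (Hadoop Streaming-style sketch)
import sys

for line in sys.stdin:
    for word in line.strip().split():
        # Emit one intermediate <key, value> pair per word occurrence.
        print("%s\t%d" % (word, 1))
```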

Reduce

• Processes and merges all intermediate values associated with each given key assigned to it
Intermediate pairs (input to Reduce):
  Welcome 1
  Everyone 1
  Hello 1
  Everyone 1
Reduce output:
  Everyone 2
  Hello 1
  Welcome 1
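And the matching Reduce, again as a Streaming-style Python sketch; it assumes (as Hadoop guarantees) that the intermediate pairs arrive grouped and sorted by key on stdin:

```python
#!/usr/bin/env python
# reducer.py -- word-count Reduce (Hadoop Streaming-style sketch)
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    if not line.strip():
        continue
    word, count = line.rstrip("\n").split("\t")
    if word != current_word:
        if current_word is not None:
            print("%s\t%d" % (current_word, current_count))  # flush previous key
        current_word, current_count = word, 0
    current_count += int(count)
if current_word is not None:
    print("%s\t%d" % (current_word, current_count))
```

The pair can be tested locally with: cat input.txt | python mapper.py | sort | python reducer.py (sort stands in for the framework's shuffle).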

Some Applications
• Distributed Grep:
– Map - Emits a line if it matches the supplied pattern
– Reduce - Copies the intermediate data to output
• Count of URL access frequency
– Map – Processes web access logs and outputs <URL, 1>
– Reduce – Emits <URL, total count>
• Reverse Web-Link Graph
– Map – Processes web pages and outputs <target, source>
– Reduce – Emits <target, list(source)>
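Sketched as plain Python generator functions (hypothetical signatures chosen for illustration; real Hadoop/MapReduce jobs would wrap these in the framework's Mapper/Reducer interfaces), the three applications above look like:

```python
import re

# Distributed Grep: Map emits a line if it matches; Reduce is the identity.
def grep_map(_filename, line, pattern=r"error"):
    if re.search(pattern, line):
        yield (line, "")

def grep_reduce(line, _values):
    yield line

# URL access frequency: Map emits <URL, 1> per log record; Reduce sums the counts.
def url_count_map(_logname, log_record):
    url = log_record.split()[0]          # assume the URL is the record's first field
    yield (url, 1)

def url_count_reduce(url, counts):
    yield (url, sum(counts))

# Reverse web-link graph: Map emits <target, source> per link; Reduce collects sources.
def reverse_link_map(source_url, page_html):
    for target in re.findall(r'href="([^"]+)"', page_html):
        yield (target, source_url)

def reverse_link_reduce(target, sources):
    yield (target, list(sources))
```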

Programming MapReduce
• Externally: For user
1. Write a Map program (short), write a Reduce program (short)
2. Submit job; wait for result
3. Need to know nothing about parallel/distributed programming!
• Internally: For the cloud (and for us distributed systems
researchers)
1. Parallelize Map
2. Transfer data from Map to Reduce
3. Parallelize Reduce
4. Implement Storage for Map input, Map output, Reduce input,
and Reduce output

Inside MapReduce
• For the cloud (and for us distributed systems researchers)
1. Parallelize Map: easy! Each Map task is independent of the others!
2. Transfer data from Map to Reduce:
• All Map output records with the same key are assigned to the same Reduce task
• Use a partitioning function (more soon; sketched at the end of this slide)
3. Parallelize Reduce: easy! Each Reduce task is independent of the others!
4. Implement Storage for Map input, Map output, Reduce input, and
Reduce output
• Map input: from distributed file system
• Map output: to local disk (at Map node); uses local file system
• Reduce input: from (multiple) remote disks; uses local file systems
• Reduce output: to distributed file system
local file system = Linux FS, etc.
distributed file system = GFS (Google File System), HDFS (Hadoop
Distributed File System)
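The partitioning function mentioned in step 2 is typically just a hash of the key modulo the number of Reduce tasks; a minimal Python sketch of the idea (Hadoop's default HashPartitioner does the equivalent in Java):

```python
import zlib

def partition(key, num_reduce_tasks):
    """Assign an intermediate key to a Reduce task: hash(key) mod R.

    Every record with the same key hashes to the same value, so all of them
    end up at the same Reduce task. zlib.crc32 is just a stable example hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks

print(partition("Everyone", 4))   # the same Reduce task every time for "Everyone"
```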

Internal Workings of MapReduce

Fault Tolerance
• Worker Failure
– Master keeps 3 states for each worker task
• (idle, in-progress, completed)
– Master sends periodic pings to each worker to keep track of
it (central failure detector)
• If a worker fails while a task is in-progress, mark the task as idle (re-execute it; see the sketch below)
• If a Map worker fails after its tasks have completed, mark those tasks as idle as well: their output lives on the failed worker’s local disk
• Reduce task does not start until all Map tasks done, and all its
(Reduce’s) data has been fetched
• Master Failure
– Checkpoint
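A toy sketch of the master-side bookkeeping for worker failures described above (names and data layout are hypothetical; the real master also tracks intermediate data locations, task assignments, etc.):

```python
from enum import Enum

class TaskState(Enum):
    IDLE = 0
    IN_PROGRESS = 1
    COMPLETED = 2

def handle_worker_failure(tasks, failed_worker):
    """Reschedule tasks affected by a worker whose periodic pings stopped arriving."""
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        if task["state"] == TaskState.IN_PROGRESS:
            task["state"] = TaskState.IDLE   # re-execute on another worker
        elif task["kind"] == "map" and task["state"] == TaskState.COMPLETED:
            # Completed Map output sits on the failed worker's local disk, so redo it.
            task["state"] = TaskState.IDLE
```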

Locality and Backup tasks
• Locality
– Since the cloud has a hierarchical topology
– GFS stores 3 replicas of each 64 MB chunk
• Maybe on different racks
– Attempt to schedule a map task on a machine that contains a replica of the corresponding input data: why? (a sketch of this heuristic follows below)
• Stragglers (slow nodes)
– Due to Bad Disk, Network Bandwidth, CPU, or
Memory.
– Perform backup (replicated) execution of straggler
task: task done when first replica complete
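An illustrative sketch of the locality heuristic from the first bullet above (hypothetical data structures; the real scheduler also considers rack-level locality and backup execution for stragglers):

```python
def pick_map_task(idle_map_tasks, worker, chunk_replicas):
    """Prefer a Map task whose input chunk already has a replica on this worker.

    chunk_replicas maps task id -> set of hostnames holding that task's input chunk.
    Reading input from local disk avoids crossing the oversubscribed network.
    """
    for task in idle_map_tasks:
        if worker in chunk_replicas.get(task, set()):
            return task
    return idle_map_tasks[0] if idle_map_tasks else None   # fall back to any idle task
```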

Testbed: 1800 servers, each with 4 GB RAM, dual 2 GHz Xeons, dual 160 GB IDE disks, gigabit Ethernet per machine (~100 Gbps aggregate bandwidth)

Grep
• Workload: 10^10 100-byte records; extract records matching a rare pattern (92K matching records)
• Locality optimization helps:
– 1800 machines read 1 TB at a peak of ~31 GB/s
– Without it, rack switches would limit the read rate to 10 GB/s
• Startup overhead is significant for short jobs
Discussion Points
• Hadoop always either outputs complete results, or
none
– Partial results?
– Can you characterize partial results of a partial
MapReduce run?
• Storage: Is the local write-remote read model
good for Map output/Reduce input?
– What happens on node failure?
– Can you treat intermediate data separately, as a first-
class citizen?
• Entire Reduce phase needs to wait for all Map
tasks to finish: in other words, a barrier
– Why? What is the advantage? What is the
disadvantage?
– Can you get around this?
Hmm, CCT and
OpenCirrus are new. But
are there no older
testbeds?

Administrative
Announcements
Student-led paper presentations (see instructions on website)
• Start from February 11th
• Groups of up to 2 students present each class, responsible for
a set of 3 “Main Papers” on a topic
– 45 minute presentations (total) followed by discussion
– Set up an appointment with me to show slides by 5 pm the day prior to the presentation
– Select your topic by Jan 31st
• List of papers is up on the website
• Each of the other students (non-presenters) is expected to read the papers before class and turn in a one-to-two-page review of any two of the main set of papers (summary, comments, criticisms, and possible future directions)
– Email review and bring in hardcopy before class

Announcements (contd.)
Projects
• Groups of 2 (need not be same as
presentation groups)
• We’ll start detailed discussions “soon” (a few
classes into the student-led presentations)

