• Cloud Computing OS
Solving the Impedance Mismatch
• Computers not getting faster, and we are drowning in data
• How to resolve the dilemma?
• “Computer” == 10,000s of computers, storage, network
• Programming Abstractions
• Libraries (libc), system calls, …
• Multiplexing of resources
• Scheduling, virtual memory, file allocation/protection, …
Datacenter/Cloud Operating System
• Data sharing
• Google File System, key/value stores
• Programming Abstractions
• Google MapReduce, PIG, Hive, Spark
• Multiplexing of resources
• Apache projects: Mesos, YARN (MRv2), ZooKeeper, BookKeeper, …
Google Cloud Infrastructure
• Google File System (GFS), 2003
• Distributed File System for the entire cluster
• Single namespace
• At Yahoo!:
– Index building for Yahoo! Search
– Spam detection for Yahoo! Mail
• At Facebook:
– Data mining
– Ad optimization
– Spam detection
MapReduce Pros
• Distribution is completely transparent
• Not a single line of distributed programming (ease, correctness)
• Automatic fault-tolerance
• Determinism enables running failed tasks somewhere else again
• Saved intermediate data enables just re-running failed reducers
• Automatic scaling
• As operations are side-effect free, they can be distributed to any number of machines dynamically
• Automatic load-balancing
• Move tasks and speculatively execute duplicate copies of slow tasks (stragglers)
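As a sketch of why these properties hold, here is a toy single-process MapReduce in Python; the word-count job is illustrative, not from the slides:

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    """Toy single-process MapReduce: map, shuffle by key, reduce.
    Because map_fn and reduce_fn are side-effect free and deterministic,
    a real runtime can re-run any failed task on another machine and
    split the key space across any number of reducers."""
    # Map phase: each record yields (key, value) pairs.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)  # shuffle: group values by key
    # Reduce phase: one independent call per key (trivially parallel).
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Classic word count over two tiny documents.
docs = ["to be or not to be", "to do is to be"]
counts = map_reduce(
    docs,
    map_fn=lambda doc: [(word, 1) for word in doc.split()],
    reduce_fn=lambda _key, values: sum(values),
)
# counts maps each word to its total number of occurrences.
```

Note that the user writes only `map_fn` and `reduce_fn`: no distributed programming, exactly as the first bullet claims.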
MapReduce Cons
• Restricted programming model
• Not always natural to express problems in this model
• Low-level coding necessary
• Little support for iterative jobs (lots of disk access)
• High-latency (batch processing)
Group on url
Count clicks
Order by clicks
Take top 5
Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt
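The four dataflow steps above (the top-URLs example from the linked talk) can be sketched in plain Python; the click log below is invented for illustration:

```python
from collections import Counter

# Hypothetical click log: (user, url) pairs standing in for the talk's data.
clicks = [
    ("alice", "a.com"), ("bob", "a.com"), ("carol", "b.com"),
    ("alice", "b.com"), ("bob", "a.com"), ("dave", "c.com"),
]

# Group on url -> Count clicks
counts = Counter(url for _user, url in clicks)

# Order by clicks -> Take top 5
top5 = counts.most_common(5)
```

In Pig Latin each of these steps is one operator (GROUP, COUNT, ORDER, LIMIT), which the Pig compiler then translates into a chain of MapReduce jobs.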
In MapReduce
[Figure: the same pipeline expressed as a sequence of MapReduce jobs; diagram not recoverable]
Hive
• Developed at Facebook
• Used for many Facebook jobs
Spark Motivation
Complex jobs, interactive queries and online processing all need one thing that MR lacks: efficient primitives for data sharing
Problem: in MR, the only way to share data across jobs is using stable storage (e.g. a file system) — slow!
[Figure: queries (Query 1, Query 2, Query 3) and jobs (Job 1, Job 2), each a chain of stages (Stage 1 → Stage 2 → Stage 3), exchanging data only through stable storage]
Examples: iterative jobs, interactive mining, stream processing
[Figure: in MapReduce, each iteration reads its input from HDFS and writes results back (HDFS read → iter. 1 → HDFS write → HDFS read → iter. 2 → …), and each interactive query (query 1, query 2, query 3) rescans the input; with one-time processing of the input into distributed memory, later iterations and queries reuse the in-memory data]
• http://spark.incubator.apache.org/
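A toy contrast between MR-style sharing through stable storage and Spark-style sharing through memory (the file layout and the +1 step per iteration are invented for illustration):

```python
import json
import os
import tempfile

# MapReduce-style iteration: every step round-trips through stable storage.
def iterate_via_disk(data, steps):
    path = os.path.join(tempfile.mkdtemp(), "state.json")
    with open(path, "w") as f:
        json.dump(data, f)                  # initial HDFS-style write
    for _ in range(steps):
        with open(path) as f:
            state = json.load(f)            # HDFS read
        state = [x + 1 for x in state]      # iter. k
        with open(path, "w") as f:
            json.dump(state, f)             # HDFS write
    with open(path) as f:
        return json.load(f)

# Spark-style iteration: the working set stays in (distributed) memory.
def iterate_in_memory(data, steps):
    state = data                            # think of this as a cached RDD
    for _ in range(steps):
        state = [x + 1 for x in state]      # iter. k, no I/O between steps
    return state

# Same result, but the in-memory version pays no per-iteration I/O cost.
assert iterate_via_disk(list(range(10)), 3) == iterate_in_memory(list(range(10)), 3)
```

In real Spark the in-memory dataset is an RDD spread across the cluster, and lineage information lets lost partitions be recomputed instead of read back from disk.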
Datacenter Scheduling Problem
• Rapid innovation in datacenter computing frameworks (e.g. Pregel, Pig, CIEL, Dryad, Percolator)
• No single framework optimal for all applications
• Want to run multiple frameworks in a single datacenter
• …to maximize utilization
• …to share data between frameworks
Solution: Apache Mesos
• Mesos is a common resource sharing layer over which diverse frameworks can run
[Figure: Hadoop and Pregel each running on a dedicated set of nodes vs. Hadoop, Pregel, and other frameworks sharing all nodes through the Mesos layer]
http://mesos.apache.org/
• Resource offers: a simple, scalable, application-controlled scheduling mechanism
Element 1: Fine-Grained Sharing
[Figure: coarse-grained sharing (HPC) statically partitions the cluster, one block per framework (Framework 1, Fw. 2, Fw. 3); fine-grained sharing (Mesos) interleaves tasks of all three frameworks across the same nodes]
Element 2: Resource Offers
[Figure, shown as an animation: framework schedulers (MPI scheduler, Hadoop scheduler) register with the Mesos master; the master's allocation module picks a framework to offer resources to and sends it a resource offer]
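A minimal sketch of the resource-offer mechanism, with invented class names and CPU counts (real Mesos offers multiple resource types per node and uses pluggable allocation policies such as dominant-resource fairness):

```python
class FrameworkScheduler:
    """A framework's scheduler (e.g. Hadoop's or MPI's), as seen by Mesos."""

    def __init__(self, name, cpus_needed):
        self.name = name
        self.cpus_needed = cpus_needed
        self.cpus_held = 0

    def resource_offered(self, cpus):
        # Application-controlled scheduling: the framework, not the
        # master, decides which part of the offer to accept.
        take = min(cpus, self.cpus_needed - self.cpus_held)
        self.cpus_held += take
        return take  # accepted share; the remainder is declined


class MesosMaster:
    """The master's allocation module decides who gets offered resources."""

    def __init__(self, total_cpus):
        self.free_cpus = total_cpus
        self.frameworks = []

    def register(self, framework):
        self.frameworks.append(framework)

    def allocate(self):
        # Offer all currently free resources to each framework in turn.
        for framework in self.frameworks:
            accepted = framework.resource_offered(self.free_cpus)
            self.free_cpus -= accepted


master = MesosMaster(total_cpus=10)
hadoop = FrameworkScheduler("Hadoop", cpus_needed=6)
mpi = FrameworkScheduler("MPI", cpus_needed=6)
master.register(hadoop)
master.register(mpi)
master.allocate()  # Hadoop takes 6 CPUs, MPI takes the remaining 4
```

The key design point from the slides survives even in this sketch: the master stays simple and scalable because it only makes offers, while each framework keeps full control over which resources it accepts and what it runs on them.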