Professional Documents
Culture Documents
Notifications
Map Reduce
MapReduce is a programming model and an associated
implementation for processing and generating big data sets
with a parallel, distributed algorithm on a cluster.
Key terms and Concepts. Continue
http://www.youtube.com/watch?v=vADJy7ZtopY
Key terms and Concepts
Hadoop
Hadoop is a software framework for distributed processing of
large data sets on compute clusters of commodity hardware
https://www.youtube.com/watch?v=aReuLtY0YMI
Key terms and Concepts. Continue
Hadoop & MapReduce
Hadoop and MapReduce is currently the BEST attempt to avoid the CAP theorem.
Hadoop tries to provide eventual Consistency (if you wait long enough the answer will be consistent),
Availability, and Partitioning.
Hadoop does this through TWO big differences with other data stores:
1.Data is repeated many times across many different servers
2.Queries are performed using Java code that breaks a larger job down into many smaller tasks that can
each be run on a separate node.
Key terms and Concepts. Continue