Using Hadoop
Hadoop
Hadoop is an open-source Apache framework for storing and
processing data on a cluster of commodity hardware.
Core components of Hadoop
HDFS (storage)
MapReduce (processing)
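To make the processing half of that split concrete, here is a minimal sketch of the MapReduce idea as a word count in plain Python. It runs in a single process and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

On a real cluster the map and reduce steps would run as separate tasks on different machines, reading their input from HDFS; this sketch only shows the shape of the computation.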
Slave services
DataNode
TaskTracker
HDFS architecture
[Figure: the client sends a request with the input file to the NameNode, and the NameNode sends a response with metadata.]
NameNode metadata:
a.txt    1, 2, 5
b.txt    2, 4, 6
c.txt    3, 2, 7
d.txt    1, 6, 8
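The metadata table above can be sketched as a simple lookup structure. This is a toy model, not the NameNode's actual implementation. It assumes that each number identifies a DataNode holding part of (or a copy of) the file, which the slide does not state explicitly. The names METADATA, lookup, and files_on are hypothetical.

```python
# Hypothetical in-memory copy of the metadata table from the slide.
# Assumption: each number is the id of a DataNode associated with the file.
METADATA = {
    "a.txt": [1, 2, 5],
    "b.txt": [2, 4, 6],
    "c.txt": [3, 2, 7],
    "d.txt": [1, 6, 8],
}


def lookup(filename):
    """Return the ids recorded for a file, like a metadata response to a client."""
    if filename not in METADATA:
        raise FileNotFoundError(filename)
    return list(METADATA[filename])  # copy, so callers cannot mutate the table


def files_on(node_id):
    """Return the files whose metadata entry mentions the given id."""
    return sorted(name for name, ids in METADATA.items() if node_id in ids)


if __name__ == "__main__":
    print(lookup("a.txt"))
    print(files_on(2))
```

Under this assumption, a client first asks for the entry of a file and then contacts the listed nodes directly, so the lookup table itself never has to carry the file contents.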