
HADOOP INSTALLATION GUIDE

1. Preparation
2. Installing Hadoop
3. Configuring the Hadoop cluster
4. Formatting HDFS
5. Starting the system
6. Verifying the installation

1. Preparation
1.1. Create the hadoop user
Run the following commands on all servers (master and slaves):
useradd hadoop
passwd hadoop
Then add the following line to /etc/sudoers so the hadoop user can run every command with root privileges:
hadoop ALL = NOPASSWD: ALL
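A sketch of the sudoers step (run as root; the non-interactive append is an assumption, the guide does not show how the line is added; visudo -c verifies the file syntax afterwards):
echo 'hadoop ALL = NOPASSWD: ALL' >> /etc/sudoers
visudo -c    # check-only mode: reports a parse error if the entry is malformed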
1.2. Install Java 1.7 (installed by the System Administration (QTHT) department)
If it is not installed yet, see these guides:
http://timarcher.com/node/59
http://www.roseindia.net/linux/tutorial/installingjdk5onlinux.shtml
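To confirm the JDK is present and find the path to use later as JAVA_HOME (the jvm directory shown is an assumption matching the JAVA_HOME used in section 3):
java -version     # should report a 1.7.x version
ls /usr/lib/jvm/  # locate the JDK directory to use as JAVA_HOME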
1.3. Configure SSH
Check whether SSH is already installed on the machine.

If SSH is not yet installed, install the OpenSSH package following the guide at:
http://www.topology.org/linux/openssh.html
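A minimal sketch of the check and install, assuming an RPM-based distribution (RHEL/CentOS):
rpm -qa | grep openssh                        # lists any installed OpenSSH packages
yum install openssh-server openssh-clients    # install if missing (as root)
service sshd start                            # make sure the ssh daemon is running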
2. Installing Hadoop
The entire installation is stored under the directory /u01/hadoop. The hadoop installation directory structure is as follows:

The whole installation tree is placed in a directory named hadoop_installation, which sits in the hadoop user's home directory /u01/hadoop. The structure of /u01/hadoop/hadoop_installation is:
o On the master:
/u01/hadoop/hadoop_installation
/u01/hadoop/hadoop_installation/installation
/u01/hadoop/hadoop_installation/data
/u01/hadoop/data/hadoop_installation/name_dir
/u01/hadoop/data/hadoop_installation/mapred
o On the slaves:
/u01/hadoop/hadoop_installation
/u01/hadoop/hadoop_installation/installation
/u02/hadoop/hadoop_installation/data
/u03/hadoop/hadoop_installation/data
/u04/hadoop/hadoop_installation/data
/u01/hadoop/data/hadoop_installation/name_dir
/u02/hadoop/data/hadoop_installation/mapred
/u03/hadoop/data/hadoop_installation/mapred
/u04/hadoop/data/hadoop_installation/mapred
The installation directory contains the Hadoop installation itself.
The data directory stores data while Hadoop runs.
The name_dir directory is the storage directory for HDFS. If the machine acts as a DataNode, this directory stores the DataNode's data blocks; if it acts as the NameNode or Secondary NameNode, it stores the metadata.
The mapred directory stores data while MapReduce runs, for example the intermediate results of map tasks.
Download and copy the file hadoop-1.2.1.tar.gz into the home directory of the hadoop user.
Extract hadoop-1.2.1.tar.gz into the directory hadoop_installation/installation, as sketched below.
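A sketch of the extraction step, assuming the tarball is already in /u01/hadoop:
cd /u01/hadoop
mkdir -p hadoop_installation/installation
tar -xzf hadoop-1.2.1.tar.gz -C hadoop_installation/installation
# yields /u01/hadoop/hadoop_installation/installation/hadoop-1.2.1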

Export the HADOOP_HOME and PATH environment variables (or edit the file ~/.bashrc directly, as in section 3):

The HADOOP_HOME variable keeps track of the path to the Hadoop installation directory and helps Hadoop determine its CLASSPATH, while adding $HADOOP_HOME/bin to PATH lets us run the commands and control scripts in $HADOOP_HOME/bin, such as the hadoop command or the start-all.sh script, without typing their absolute paths. From here on, $HADOOP_HOME refers to the path of the Hadoop installation directory.
(CLASSPATH is a parameter, settable on the command line or as an environment variable, that the JVM uses to locate class definitions and packages.)
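A minimal sketch of the exports, assuming the tarball unpacked into the installation directory above:
export HADOOP_HOME=/u01/hadoop/hadoop_installation/installation/hadoop-1.2.1
export PATH=$PATH:$HADOOP_HOME/bin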
Verify the Hadoop installation:

Hadoop is installed successfully if it can display the version currently in use, as checked below.
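The original screenshot is missing; the usual check is the version command:
hadoop version    # should print "Hadoop 1.2.1" plus build information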
3. Configuring the Hadoop cluster
Hadoop configuration
Edit the contents of the following files.
Add the following to the file ~/.bashrc:
# Set Hadoop-related environment variables
export HADOOP_PREFIX=/u01/hadoop/hadoop_installation/installation/hadoop-1.2.1/
export HADOOP_PID_DIR=/u01/hadoop/hadoop_installation/installation/hadoop-1.2.1/pid/

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.65.x86_64

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_PREFIX/bin
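After editing, reload the configuration so it takes effect in the current shell:
source ~/.bashrc
echo $HADOOP_PREFIX   # sanity check: should print the installation path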
File $HADOOP_HOME/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.name.dir</name>
<value>/u01/hadoop/data/hadoop_installation/name_dir</value>
<description>Determines where on the local filesystem the DFS name node
should store the name table. If this is a comma-delimited list
of directories then the name table is replicated in all of the
directories, for redundancy.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/u01/hadoop/data/hadoop_installation/data</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>3</value>
</property>
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:9070</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>0.0.0.0:9090</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:9010</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:9075</value>
</property>
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:9475</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:9020</value>
</property>
<property>
<name>dfs.https.address</name>
<value>0.0.0.0:9470</value>
</property>
</configuration>
File $HADOOP_HOME/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- In: conf/mapred-site.xml -->
<property>
<name>mapred.job.tracker</name>
<value>master:9311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapred.local.dir</name>
<value>/u01/hadoop-0.20.203.0/tempdir</value>
</property>
<property>
<name>mapred.map.child.java.opts</name>
<value>-Xmx512M -Djava.io.tmpdir=/u01/hadoop-0.20.203.0/tempdir</value>
<description>Larger heap-size for child jvms of maps.
</description>
</property>
<property>
<name>mapred.reduce.child.java.opts</name>
<value>-Xmx512M -Djava.io.tmpdir=/u01/hadoop-0.20.203.0/tempdir</value>
<description>Larger heap-size for child jvms of reduces.
</description>
</property>
<property>
<name>mapred.job.tracker.http.address</name>
<value>0.0.0.0:9030</value>
<description>The address and port of the job tracker web UI.
</description>
</property>
<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:9060</value>
<description>The address and port of the task tracker web UI.
</description>
</property>
</configuration>
File $HADOOP_HOME/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- In: conf/core-site.xml -->
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
<property>
<name>fs.inmemory.size.mb</name>
<value>100</value>
</property>
<property>
<name>io.sort.factor</name>
<value>50</value>
</property>
<property>
<name>io.sort.mb</name>
<value>100</value>
</property>
</configuration>

Edit the file /etc/hosts on all servers, adding the following lines:
10.30.136.4 slave4
10.30.136.5 master
10.30.136.6 slave6
10.30.136.7 slave7
10.30.136.8 slave8
10.30.136.9 slave9
10.30.136.10 slave10
10.30.136.11 slave11

Edit the file $HADOOP_PREFIX/conf/slaves on all servers, adding the following lines:
slave4
slave6
slave7
slave8
slave9
slave10
slave11
Configure passwordless SSH for the hadoop user
Generate the public/private key pair:
ssh-keygen -t rsa -f ~/.ssh/id_rsa

Note: when running ssh-keygen, you may be asked for a passphrase. In that case, enter an empty passphrase.
Then append the public key to the file ~/.ssh/authorized_keys, as sketched below.
Make sure the hadoop user (the owner) has read/write permission on the ~/.ssh directory and the ~/.ssh/authorized_keys file.
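A sketch of the append and the conventional permissions (700 on the directory, 600 on the key file):
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys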
Note: since the hadoop user exists on all slaves, we only need to generate the rsa key once and then synchronize the hadoop user's ~/.ssh directory to all the slaves.
Synchronize the passwordless ssh configuration to the slave machines using scp, as sketched below.
(Alternatively: ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@slave4)
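A sketch of the scp synchronization, reusing the slave hostnames declared in /etc/hosts above:
for host in slave4 slave6 slave7 slave8 slave9 slave10 slave11; do
  scp -r ~/.ssh hadoop@$host:~/
done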


Check the passwordless login configuration: from the master, log in to each slave.

If the configuration succeeded, you can log in immediately without being asked for a password.
(Or, equivalently, run the following commands:
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh master (type yes, then press enter; if no password is requested you are fine. This step is not critical; what matters is that the master can reach the slaves without a password.)
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@slave
ssh hadoop@slave)
List of configuration files
Hadoop provides a set of configuration files that let us set the parameters of the Hadoop cluster. They can be found in the directory $HADOOP_HOME/conf:

hadoop-env.sh (Bash script): stores the environment variables used to run the daemons on the Hadoop cluster.
core-site.xml (XML configuration file): configures the parameters of Hadoop core.
hdfs-site.xml (XML configuration file): configures the daemons that run HDFS: namenode, datanode, secondary namenode.
mapred-site.xml (XML configuration file): configures the daemons that run MapReduce: jobtracker, tasktracker.
masters (plain text): contains the list of IP addresses (or hostnames, if DNS is configured) of the machines that run the secondary namenode.
slaves (plain text): contains the list of IP addresses (or hostnames, if DNS is configured) of the machines that run the datanode and tasktracker daemons.
hadoop-metrics.properties (Java properties): configures the metrics, i.e. how hadoop reports information about the activity of the cluster.
log4j.properties (Java properties): configures logging for the daemons: namenode, datanode, jobtracker, tasktracker.

Note: these configuration files can also be placed in an arbitrary directory outside $HADOOP_HOME/conf. In that case, when running the control scripts to start the daemons, add the --config option to point to the directory holding the configuration files. For example:
% start-dfs.sh --config <path to the directory containing the config files>
Details of the main configuration files
hadoop-env.sh

Contains the environment variables used to run the Hadoop daemons. The daemons are the Namenode/Datanode, the Jobtracker/TaskTracker, and the Secondary Namenode. Some important parameters:

JAVA_HOME (no default): environment variable holding the home directory of Java. This is a very important variable.
HADOOP_LOG_DIR (default: $HADOOP_HOME/log): the directory where the log files produced by the running daemons are stored.
HADOOP_HEAPSIZE (default: 1000 MB): the maximum amount of memory allocated to run each daemon.
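For illustration, the corresponding lines in $HADOOP_HOME/conf/hadoop-env.sh might look like this; the values follow this guide's layout and are assumptions, adjust them to your system:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.65.x86_64
export HADOOP_LOG_DIR=$HADOOP_HOME/log   # where the daemons write their logs
export HADOOP_HEAPSIZE=1000              # maximum heap per daemon, in MB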

core-site.xml
Configures the parameters for the daemons running Hadoop:

fs.default.name (default: file:///): the default file system name. This parameter lets us use relative paths, which are combined with the default file system name to determine absolute paths. When using HDFS, this should be set to hdfs://<hostname or ip of the namenode>.
hadoop.tmp.dir (default: /tmp): by default, this is the storage directory for all data on the hadoop cluster.

hdfs-site.xml
Configures the daemons running HDFS. Some important parameters:

dfs.replication (default: 3): sets the default replication level for a file created on HDFS. As we know, the replication level of a file is the number of copies of each of its blocks on HDFS. If the replication level of a file F is n, each block of the file is stored as n copies on different datanodes of the cluster.
dfs.name.dir (default: ${hadoop.tmp.dir}/dfs/name): the list of directories on the local file system of the namenode daemon where data is stored. This is where the metadata of the distributed file system HDFS is kept.
dfs.data.dir (default: ${hadoop.tmp.dir}/dfs/data): the list of directories on the local file system of the datanode daemons where data is stored. This is where the blocks of the files on HDFS are actually kept.
fs.checkpoint.dir (default: ${hadoop.tmp.dir}/dfs/namesecondary): the list of directories on the local file system where the secondary namenode daemon stores its data.

mapred-site.xml
Configures the daemons running Map/Reduce. Important parameters:

mapred.job.tracker (default: localhost:8021): the hostname (or ip) and port of the Jobtracker daemon. As we know, on a Hadoop cluster there is exactly one JobTracker daemon, running on some node. The default port of this daemon is 8021.
mapred.local.dir (default: ${hadoop.tmp.dir}/mapred): where the MapReduce processes, JobTracker and TaskTracker, store their data on the local file system.
Distribute the installation and configuration to every node in the cluster
Use the scp command to copy the entire /u01/hadoop/hadoop_installation directory to the corresponding directories on the slaves, as sketched below.
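A sketch of the distribution, again using the hostnames from /etc/hosts (the target directories must already exist on the slaves):
for host in slave4 slave6 slave7 slave8 slave9 slave10 slave11; do
  scp -r /u01/hadoop/hadoop_installation hadoop@$host:/u01/hadoop/
done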

4. Formatting HDFS
Format HDFS
Note: the following command must be run on the NameNode.
Format the namenode as shown below.
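The original screenshot is missing; in Hadoop 1.x the command is:
hadoop namenode -format
# WARNING: this erases any existing HDFS metadata in dfs.name.dir;
# answer Y if asked to confirm.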

5. Starting the system
Starting Hadoop
Note: the following commands must be run on the namenode.
Before starting, make sure the firewall is turned off on all nodes.

Start HDFS (this starts the NameNode, the SecondaryNameNode, and the DataNodes).

Start MapReduce.

(You can also use the start-all.sh and stop-all.sh commands.)
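A sketch of the start-up sequence (the iptables command assumes a RHEL/CentOS-style init system):
service iptables stop    # as root, on every node: turn the firewall off
start-dfs.sh             # on the namenode: starts NameNode, SecondaryNameNode and the DataNodes
start-mapred.sh          # on the namenode: starts the JobTracker and the TaskTrackers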


6. Verifying the installation
Check that Hadoop is running
We can verify that Hadoop is running by checking that the daemons on the cluster nodes are running correctly:
Check that the NameNode and the JobTracker are running on the namenode.
Check that a DataNode is running on each datanode.
Check that a TaskTracker is running on each datanode.
Check the status of the whole of HDFS; this shows the list of DataNodes and their status.
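The screenshots are missing; a typical check uses jps for the daemons and dfsadmin for the overall HDFS status:
jps                        # on the namenode: expect NameNode, SecondaryNameNode, JobTracker
jps                        # on each slave: expect DataNode, TaskTracker
hadoop dfsadmin -report    # lists every DataNode and its status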


Alternatively, access the namenode through its http ports:
http://<namenode>:50070 : the web interface of HDFS
http://<jobtracker>:50030 : the interface of the Map/Reduce service
(Note: the hdfs-site.xml and mapred-site.xml files above override these default ports to 9070 and 9030 respectively; use whichever ports your configuration sets.)

Appendix I: Hadoop configuration parameter tables

Configuration settings are one of the main ingredients of running a job on a Hadoop system. For convenience, Hadoop ships with default configuration files holding the initial settings, each file covering a specific group of settings. For each deployment and each problem to be solved on the system, users must adjust these settings appropriately. The configuration mechanism works as follows: Hadoop first reads the default settings (from files named <name>-default.xml), then reads the user's settings (from files named <name>-site.xml); wherever a setting differs, the user's value overrides the default.

Below is the list of configuration files of a hadoop system. By default, these files are located in <Hadoop installation directory>/conf. A configuration entry has the following shape: <name> </name> <value> </value> <description> </description>
General configuration of the hadoop system
File core-site.xml
The core-site.xml configuration file (defaults in core-default.xml) contains the general settings of the hadoop system (reference: http://hadoop.apache.org/common/docs/current/core-default.html).

hadoop.tmp.dir (default: /tmp/hadoop-${user.name}): the temporary directories on the nodes of the cluster.
fs.default.name (default: file:///): the name of the default file system, a URI with scheme and authority fields, where the authority consists of host and port. The default is the local file system; for HDFS it is hdfs://.
fs.checkpoint.size (default: 67108864): the size of the editlog, in bytes, that triggers a new checkpoint.
local.cache.size (default: 10737418240): the maximum cache size you want to keep (default 10GB).
HDFS file system configuration parameters
File hdfs-site.xml
The hdfs-site.xml configuration file is used to configure the HDFS file system. See also http://hadoop.apache.org/common/docs/current/hdfs-default.html

dfs.namenode.logging.level (default: info): the logging level for the namenode. The value dir logs changes to the namespace, block logs information about replicas and block create/delete operations, and all logs everything.
dfs.secondary.http.address (default: 0.0.0.0:50090): the http address of the Secondary Namenode server. If the port is 0, the server runs on an arbitrary port.
dfs.datanode.address (default: 0.0.0.0:50010): the address on which the datanode server listens for connections. If the port is 0, the server runs on an arbitrary port.
dfs.datanode.http.address (default: 0.0.0.0:50075): the http address of the datanode server. If the port is 0, the server runs on an arbitrary port.
dfs.datanode.handler.count (default: 3): the number of server threads for the datanode.
dfs.http.address (default: 0.0.0.0:50070): the address and port of the web interface of the dfs namenode, which listens for connections. If the port is 0, the server runs on an arbitrary port.
dfs.name.dir (default: ${hadoop.tmp.dir}/dfs/name): the directory on the local file system where the DFS Namenode stores the fsimage file. If multiple directories are given, the fsimage file is replicated in all of them.
dfs.name.edits.dir (default: ${dfs.name.dir}): the directory on the local file system where the DFS Namenode stores the transaction file (the edits file). If multiple directories are given, the file is replicated in all of them.
dfs.permissions (default: TRUE): enables permission checking on HDFS.
dfs.data.dir (default: ${hadoop.tmp.dir}/dfs/data): the directory on the local file system where a DFS Datanode stores its block files. If multiple directories are given, blocks are stored across all of them. Directories that do not exist are ignored.
dfs.replication (default: 3): the default number of replicas of a block.
dfs.replication.max (default: 512): the maximum number of replicas of a block.
dfs.replication.min (default: 1): the minimum number of replicas of a block.
dfs.block.size (default: 67108864): the default size of a block (64MB).
dfs.heartbeat.interval (default: 3): the interval, in seconds, at which a datanode sends heartbeats to the Namenode.
dfs.namenode.handler.count (default: 10): the number of server threads on the Namenode.
dfs.replication.interval (default: 3): the period, in seconds, at which the namenode recomputes the number of replicas for the datanodes.

File masters
This file defines the host that acts as the Secondary Namenode. Each line in the file is the IP address or the hostname of such a host.
File slaves
This file defines the hosts that act as DataNodes and TaskTrackers. Each line in the file is the IP address or the hostname of such a host.
Configuration parameters for the Hadoop MapReduce framework
File mapred-site.xml
The mapred-site.xml configuration file is used to configure the MapReduce framework. See also http://hadoop.apache.org/common/docs/current/mapred-default.html
mapred.job.tracker (default: local): the host and port that the MapReduce job tracker runs at. If local, jobs are run in-process as a single map and reduce task.
mapred.job.tracker.http.address (default: 0.0.0.0:50030): the address and port of the jobtracker http server, which listens for connections. If the port is 0, the server starts on an arbitrary port.
mapred.local.dir (default: ${hadoop.tmp.dir}/mapred/local): the local directory where MapReduce stores intermediate data files. It may be a comma-separated list of directories on different devices, to spread disk I/O. The directories must exist.
mapred.system.dir (default: ${hadoop.tmp.dir}/mapred/system): the shared directory where MapReduce stores its control files.
mapred.temp.dir (default: ${hadoop.tmp.dir}/mapred/temp): a shared directory for temporary files.
mapred.map.tasks (default: 2): the number of map tasks per job. Has no effect when mapred.job.tracker is local.
mapred.reduce.tasks (default: 1): the number of reduce tasks per job. Has no effect when mapred.job.tracker is local.
mapred.child.java.opts (default: -Xmx200m): the Java options for the child processes of the TaskTracker; the value sets the heap size for a task.
mapred.job.reuse.jvm.num.tasks (default: 1): the number of tasks to run per jvm. If the value is -1, there is no limit on the number of tasks.
mapred.task.tracker.http.address (default: 0.0.0.0:50060): the address and port of the tasktracker http server. If the port is 0, the server starts on an arbitrary port.
