
1. Unable to locate completed jobs in the History Server (HS)? Why do they point to old month directories?

All completed jobs are maintained in the HS cache. If the cache is full, the HS cannot load new jobs/applications into it.

Solution:

- Stop the History Server
- Delete or move old job directories
- Restart the HS to load new jobs into the cache
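A minimal sketch of that sequence on a MapR cluster; the node name, archive path, and done-dir location below are assumptions, so check mapreduce.jobhistory.done-dir for your actual path:

# Stop the History Server on its node (node name is a placeholder)
maprcli node services -name historyserver -action stop -nodes <hs-node>

# Move an old month directory out of the done dir (path is illustrative)
hadoop fs -mv /var/mapr/cluster/yarn/rm/staging/history/done/2015/01 /history-archive/2015/01

# Start the HS again so it rescans and caches the current jobs
maprcli node services -name historyserver -action start -nodes <hs-node>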

2. Too many open files in the hive log.

Count how many files the mapr user has open, and check the configured limit:

sudo lsof | grep mapr | wc -l
grep -i mapr /etc/security/limits.conf

If the open-file count has hit the limit, restarting HiveServer2 releases the leaked descriptors:


sudo /etc/init.d/hiveserver2 status
sudo /etc/init.d/hiveserver2 stop
sudo /etc/init.d/hiveserver2 start
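If the problem keeps recurring, the restart only masks a limit that is too low. A hedged example of raising the open-file limit for the mapr user (the value 65536 is illustrative; PAM applies limits.conf on new login sessions, so the service must be restarted from a fresh session afterwards):

# Raise the nofile limit (soft and hard) for the mapr user
echo "mapr - nofile 65536" | sudo tee -a /etc/security/limits.conf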

3. Why do jobs fail in the COMMIT stage with a "COMMIT_SUCCESS file exists" exception?

This appears to be an issue with speculative execution, where duplicate attempts of the same task each try to create the COMMIT_SUCCESS file.

Solution:
Rerun the job with the below properties:
mapreduce.map.speculative=false
mapreduce.reduce.speculative=false
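For example, passed as -D options on the rerun (the jar, driver class, and paths are placeholders, and this assumes the driver parses generic options via ToolRunner):

hadoop jar myjob.jar com.example.MyDriver \
    -Dmapreduce.map.speculative=false \
    -Dmapreduce.reduce.speculative=false \
    /input /output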

4. How to troubleshoot the "GC overhead limit exceeded" issue at the Reducer phase?

Tasks may fail if they don't have enough memory to store their input data:

2015-11-04 13:56:49,465 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded

The general fix is to increase the number of Reducers. At times, however, even after increasing the number of Reducers, the data may be skewed toward only a few of them.

For example:

If there are 10000 map output keys in total and 5 Reducers process this data, and the keys are distributed as unevenly as below, the 1st Reducer takes almost the complete load and fails with out-of-memory.

REDUCE_INPUT_RECORDS for Reducer R1 = 9000
REDUCE_INPUT_RECORDS for Reducer R2 = 100
REDUCE_INPUT_RECORDS for Reducer R3 = 600
REDUCE_INPUT_RECORDS for Reducer R4 = 200
REDUCE_INPUT_RECORDS for Reducer R5 = 100

Solution:

Try increasing the Reducer memory and Java opts properties for that particular job only, as shown below. This launches each Reducer container with 6GB of memory, and the tasks should succeed without memory issues.

-Dmapreduce.reduce.memory.mb=6144
-Dmapreduce.reduce.java.opts=-Xmx4915m
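If the job is launched from Hive, the same properties can be set for just that session (a sketch; values as above):

set mapreduce.reduce.memory.mb=6144;
set mapreduce.reduce.java.opts=-Xmx4915m;

Note the heap (-Xmx4915m) is kept at roughly 80% of the 6144MB container so that non-heap overhead does not trigger container kills.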

5. Time difference in job execution due to disk latency?

At times, tasks take longer to complete because of disk latency on the nodes they run on.
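A quick way to check a suspect node (assuming the sysstat package is installed):

# Extended per-device stats, 5-second intervals, 3 samples;
# consistently high await values point to a slow disk
iostat -dx 5 3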

6. Hive CLI is not showing the hive prompt and just hangs?

Possible checks:

Check the below file for any startup errors:

/tmp/<username>/hive.log

Run hive in debug mode to see the errors:

hive -hiveconf hive.root.logger=DEBUG,console

Check for any defunct (zombie) processes that have been around for a long time. A zombie cannot be killed directly; kill its parent process instead, as in the sketch after the command below.

ps -aef | grep -i defunct
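A sketch for locating zombies together with their parent PIDs (killing the parent lets init reap the zombie; the PID below is a placeholder):

# List zombie processes (stat column starts with Z) with their parent PIDs
ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'

# Kill the parent to reap the zombie
sudo kill -9 <parent-pid>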

7. No LoginModules Exception?

Exception in thread "main" java.io.IOException: failure to login: No LoginModules configured for hadoop_simple
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:724)
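This usually means the JVM could not find a JAAS login entry named hadoop_simple; on MapR that entry normally lives in /opt/mapr/conf/mapr.login.conf. A hedged check, using the standard JAAS system property for the last step:

# Verify the JAAS config exists and contains the entry
ls -l /opt/mapr/conf/mapr.login.conf
grep hadoop_simple /opt/mapr/conf/mapr.login.conf

# If a standalone JVM bypasses the Hadoop scripts, point it at the config explicitly
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf"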

8. How to enable the verbose property to know which jars are picked up from where?

Add the below property to mapred-site.xml:

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx512M -verbose:class</value>
</property>

On the Spark side, set the below property in the CLI:

export SPARK_SUBMIT_OPTS=-verbose:class

And add the below properties to /opt/mapr/spark/spark-1.6.1/conf/spark-defaults.conf:

spark.driver.userClassPathFirst=true
spark.executor.userClassPathFirst=true
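A usage sketch (the application jar and class are placeholders). -verbose:class makes the JVM log every class it loads, so the driver output or container logs can be grepped for the jar a class came from:

export SPARK_SUBMIT_OPTS=-verbose:class
/opt/mapr/spark/spark-1.6.1/bin/spark-submit --class com.example.MyApp myapp.jar

# Class-load lines look like: [Loaded com.example.Foo from file:/path/some.jar]
grep "Loaded com.example" driver.log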

9. RM process growing beyond its Xmx value?

A bug in the ResourceManager, fixed by MapR.

10. RM not utilizing resources even when resources are available?

A bug in the ResourceManager, fixed by MapR.

