Welcome to Scribd!

Skip carousel

ADM203 L13 Troubleshooting

Uploaded by

Sriraksha Srinivasan

0% found this document useful (0 votes)

44 views19 pages

Hadoop Mapr Troubleshooting

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Hadoop Mapr Troubleshooting

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

44 views19 pages

ADM203 L13 Troubleshooting

Uploaded by

Sriraksha Srinivasan

Hadoop Mapr Troubleshooting

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 19

Search inside document

ADM 203 – Cluster Maintenance

L13-1
Learning Goals

13.1 Troubleshoot Different Problem Types

13.2 Use Support Utilities

L13-2
How Do You Know There's a Problem?

• Alarm raised

• Performance issues

• Jobs failing

L13-3
Locate Troubleshooting Resources

• Log files

• MapR documentation
– maprdocs.mapr.com

• MapR Community: Answers

– community.mapr.com/community/answers

• Apache forums and websites

• MapR Support (with Enterprise Edition)

L13-4
Default Log File Locations

Log Type Default Location

MapR-specific logs /opt/mapr/logs

ZooKeeper logs /opt/mapr/zookeeper/zookeeper-<version>/logs

Ecosystem component logs /opt/mapr/<component>

YARN job logs /opt/mapr/hadoop/hadoop-2*/logs

YARN job logs (log /mapr/<cluster name>/tmp/logs

aggregation enabled)

L13-5
Set Log File Retention

YARN
To change, set:
yarn.nodemanager.log.retain-seconds
in:
yarn-site.xml

/opt/mapr/logs
To retain indefinitely set:
log.retention.exceptions
in:
warden.conf

L13-6
Problem: Services Not Running
• Typically corresponds to a raised alarm

• Warden will attempt start or restart 3 times

• Each service has its own logs in /opt/mapr/logs

– Don't forget to check Warden logs

L13-7
Strategy for Troubleshooting Service Problems
1. Open two terminal windows

2. Window 1:
$ service mapr-warden stop

3. Window 2:
$ tail –f /opt/mapr/logs/warden.log

4. Window 1:
$ service mapr-warden start

L13-8
Problem: Cluster Startup Issues
1. ZooKeepers up?
$ service mapr-zookeeper qstatus

2. Review warden.log and cldb.log

3. Get maprcli node cldbmaster to succeed

4. Troubleshoot individual services

L13-9
Problem: Data Not Available
• Is affected volume node-specific?
– /var/mapr/local/<node name>
– Is node offline?

• Failed disks and nodes

• Permission problems

L13-10
Problem: Jobs Not Running
• Start with the YARN job logs
– Command line
– ResourceManager page (<IP address>:19888)

• Permissions problems

• Resource problems

L13-11
Problem: Degraded Performance
• Check /opt/mapr/logs/mfs.log-3
– Failures communicating with the CLDB nodes or other servers?

• Check node resources

– CPU, memory, swap space, disk space, network

• Isolate management services

L13-12
Learning Goals

13.1 Troubleshoot Different Problem Types

13.2 Use Support Utilities

L13-13
Core Files
• Located in /opt/cores
– Name includes service involved

• No immediate action required

– Service will automatically restart

• Forward to MapR Support when possible

L13-14
Caution:
fsck
Using fsck –r
• Find/fix file system inconsistencies can result in loss
of data. Contact
– /opt/mapr/server/fsck
support first!

• Repair storage pools after disk failure

• Verification or repair
– Use -r for repair

L13-15
gfsck
• Find/fix volume inconsistencies
– /opt/mapr/bin/gfsck

• Scans and repairs a cluster, volume, or snapshot

• Typical process
– Take storage pool offline
– Run fsck on storage pool
– Bring storage pool back online
– Run gfsck on affected cluster/volumes/snapshots

L13-16
mapr-support-dump.sh
• Collects information/logs from a single node
$ /opt/mapr/support/tools/mapr-support-dump.sh

• Output at /opt/mapr/support/dump
– Creates a .tar file

L13-17
mapr-support-collect.sh
• Collects information/logs from entire cluster
$ /opt/mapr/support/tools/mapr-support-collect.sh

• Output at /opt/mapr/support/collect
– Uses SSH and SCP
– Creates a .tar file

L13-18
Congratulations!

L13-19

System and Network Administration Syslog Shares Sudo ++
Document42 pages
System and Network Administration Syslog Shares Sudo ++
Arsalan Jahani
No ratings yet
Linux Logging
Document39 pages
Linux Logging
eran
No ratings yet
Inter-Task Communication
Document21 pages
Inter-Task Communication
Ramesh Sun
No ratings yet
Unix / Linux - System Logging - Tutorialspoint
Document9 pages
Unix / Linux - System Logging - Tutorialspoint
krkama
No ratings yet
Unix / Linux - System Logging - Tutorialspoint
Document9 pages
Unix / Linux - System Logging - Tutorialspoint
krkama
No ratings yet
Memmanagement
Document20 pages
Memmanagement
api-3702016
No ratings yet
Log Management Tenshi
Document15 pages
Log Management Tenshi
Hien Nguyen
No ratings yet
Reduced Instruction Set Computers Pipelining: (RISC)
Document25 pages
Reduced Instruction Set Computers Pipelining: (RISC)
Sudip Bhujel
No ratings yet
Architecture 11g
Document3 pages
Architecture 11g
kamalprabha
No ratings yet
05 Solaris
Document14 pages
05 Solaris
Yc Rdml
No ratings yet
Brendan Gregg: Container Performance Analysis
Document75 pages
Brendan Gregg: Container Performance Analysis
csy365
No ratings yet
Laborator Linux
Document8 pages
Laborator Linux
Popa Alex
No ratings yet
Zones Best Pract
Document6 pages
Zones Best Pract
Justin Liu
No ratings yet
Bigdata and Hadoop - Unit III
Document24 pages
Bigdata and Hadoop - Unit III
garima khasdeo
No ratings yet
Unit 1 OS
Document61 pages
Unit 1 OS
SAI ANDALAM (RA2111003011353)
No ratings yet
Gfs Google File System 13331
Document28 pages
Gfs Google File System 13331
Rakesh Akhileswaran
No ratings yet
Apache ZooKeeper - Mesosphere
Document27 pages
Apache ZooKeeper - Mesosphere
satmania
No ratings yet
IBM Z Systems Processor Optimization SHARE Aug 2016
Document27 pages
IBM Z Systems Processor Optimization SHARE Aug 2016
Anonymous G1iPoNOK
No ratings yet
Troubleshooting Cluster Administration
Document5 pages
Troubleshooting Cluster Administration
danial vista
No ratings yet
Solaris - Crash Dump Analysis PDF
Document14 pages
Solaris - Crash Dump Analysis PDF
mitrasamrat20026085
No ratings yet
Parallel Architecture: Sathish Vadhiyar
Document26 pages
Parallel Architecture: Sathish Vadhiyar
dhruvbhagtani
No ratings yet
YARN - MapReduce
Document34 pages
YARN - MapReduce
Yashu Sagar
No ratings yet
Linux Server Troubleshooting
Document8 pages
Linux Server Troubleshooting
yoru010196
No ratings yet
Netapp Setup
Document47 pages
Netapp Setup
Surro Gates
100% (1)
Power 9
Document25 pages
Power 9
Khalll
No ratings yet
Tutorial On DNN 4 of 9 DNN Accelerator Architectures PDF
Document73 pages
Tutorial On DNN 4 of 9 DNN Accelerator Architectures PDF
Ashin Antony
No ratings yet
Unit V
Document80 pages
Unit V
Backup Suraj
No ratings yet
Veritas Dynamic Multipathing (DMP)
Document9 pages
Veritas Dynamic Multipathing (DMP)
vamshik1982
No ratings yet
Computer Systems: Hardware (Book No. 1 Chapter 2)
Document111 pages
Computer Systems: Hardware (Book No. 1 Chapter 2)
Matthew Anthony Saliendra Salvacion
No ratings yet
Lustre Admin Monitor
Document25 pages
Lustre Admin Monitor
Chuin Fujio
No ratings yet
Operating 12system Ori
Document102 pages
Operating 12system Ori
api-3716254
No ratings yet
Secondary Storage Introduction
Document82 pages
Secondary Storage Introduction
harisiddhanthi
No ratings yet
l23 Multithread
Document34 pages
l23 Multithread
Santosh Gannavarapu
No ratings yet
Unix Os
Document23 pages
Unix Os
Parvathi Goud
100% (1)
Tc0419 Data Base Trace Tools
Document8 pages
Tc0419 Data Base Trace Tools
Humberto Ochoa Mendez
No ratings yet
Admin Database
Document35 pages
Admin Database
javier
No ratings yet
Components of Computer
Document69 pages
Components of Computer
Jamil Nawabia
No ratings yet
Embedded System
Document20 pages
Embedded System
pra0408
No ratings yet
Operating Systems: Steven Hand Michaelmas / Lent Term 2008/09
Document11 pages
Operating Systems: Steven Hand Michaelmas / Lent Term 2008/09
Sanjeev Kumar
No ratings yet
Reduced Instruction Set Computers Pipelining: (RISC)
Document25 pages
Reduced Instruction Set Computers Pipelining: (RISC)
Shiva Kumar Pottabathula
No ratings yet
Computer Memory
Document34 pages
Computer Memory
Alice Guarin
No ratings yet
ARM Processor
Document46 pages
ARM Processor
yixexi7070
No ratings yet
RHCSA Study Guide 1554826285
Document10 pages
RHCSA Study Guide 1554826285
Prasanth Unnikrishnan
No ratings yet
Linux 操作系统: Acegene IT Co. Ltd. 1
Document23 pages
Linux 操作系统: Acegene IT Co. Ltd. 1
Suxiaoxiao
No ratings yet
Real Time Analytics With Apache Kafka and Spark: Rahul Jain
Document54 pages
Real Time Analytics With Apache Kafka and Spark: Rahul Jain
Sudhanshoo Saxena
No ratings yet
Hadoop Distributed File System Basics
Document30 pages
Hadoop Distributed File System Basics
ashuvasuma
No ratings yet
Ksar
Document13 pages
Ksar
scolomaay
No ratings yet
Data Dependence Distances: Write To Register File Data Available Normal/Earliest Stage
Document18 pages
Data Dependence Distances: Write To Register File Data Available Normal/Earliest Stage
Abdul Wahab Mulk
No ratings yet
Lecture 2: Unix Structure: Asoc. Prof. Guntis Barzdins Asist. Girts Folkmanis
Document19 pages
Lecture 2: Unix Structure: Asoc. Prof. Guntis Barzdins Asist. Girts Folkmanis
rosshush
No ratings yet
Operating System Support: William Stallings Computer Organization and Architecture 7 Edition
Document51 pages
Operating System Support: William Stallings Computer Organization and Architecture 7 Edition
Andi Didik Wira Putra
No ratings yet
So-It-Chapter-3-Computer Architecture PDF
Document35 pages
So-It-Chapter-3-Computer Architecture PDF
Sumit Soni
No ratings yet
Kernel Memory Allocation: Unix Internals, Uresh Vahalia
Document16 pages
Kernel Memory Allocation: Unix Internals, Uresh Vahalia
Jaimin Patel
No ratings yet
15 Most Useful Linux Commands For File System Maintenance
Document6 pages
15 Most Useful Linux Commands For File System Maintenance
nagaraj
No ratings yet
Label Tom Jerry 103 Kusnendar
Document23 pages
Label Tom Jerry 103 Kusnendar
Khoiruddin
No ratings yet
Chapter 7
Document70 pages
Chapter 7
BSIT3_IT116
No ratings yet
Computer Science I Essentials
From Everand
Computer Science I Essentials
Randall Raus
Rating: 5 out of 5 stars
5/5 (7)
PostgreSQL Replication - Second Edition
From Everand
PostgreSQL Replication - Second Edition
Hans-Jürgen Schönig
No ratings yet
DRBD-Cookbook: How to create your own cluster solution, without SAN or NAS!
From Everand
DRBD-Cookbook: How to create your own cluster solution, without SAN or NAS!
Joerg Christian Seubert
No ratings yet
Gluster Filesystem - Practical Method
From Everand
Gluster Filesystem - Practical Method
Fabian Mestre
No ratings yet
Configuration of a Simple Samba File Server, Quota and Schedule Backup
From Everand
Configuration of a Simple Samba File Server, Quota and Schedule Backup
Dr. Hidaia Mahmood Alassouli
No ratings yet
Mcca Study Guide 7.2017 Uvawomo
Document30 pages
Mcca Study Guide 7.2017 Uvawomo
Sarah Benmoussa
No ratings yet
ADM200 L2 InstallMapR
Document43 pages
ADM200 L2 InstallMapR
Sriraksha Srinivasan
No ratings yet
ADM200 L2 InstallMapR
Document43 pages
ADM200 L2 InstallMapR
Sriraksha Srinivasan
No ratings yet
Tutorial MapR Administration
Document236 pages
Tutorial MapR Administration
Sriraksha Srinivasan
No ratings yet
Good Research
Document11 pages
Good Research
M R Mukit
No ratings yet
Brochure - UpGrad & BITS Pilani - PG Program in Big Data Engineering
Document16 pages
Brochure - UpGrad & BITS Pilani - PG Program in Big Data Engineering
Suganthi Aravind
No ratings yet
FAISALKHAN
Document17 pages
FAISALKHAN
Arvind G
No ratings yet
HBase Coprocessors - Paval Ambrozie
Document13 pages
HBase Coprocessors - Paval Ambrozie
Pavăl Ambrozie
No ratings yet
Assessing Global Market Opportunities: Information
Document5 pages
Assessing Global Market Opportunities: Information
Sai Ram
No ratings yet
The Art & Science of Marketing
Document2 pages
The Art & Science of Marketing
ALEXANDRU GABRIEL IFTIMOVICI
No ratings yet
Dileep - DBA&Data Replication Qlik (Attunity)
Document6 pages
Dileep - DBA&Data Replication Qlik (Attunity)
Samruddhi Thengdi
No ratings yet
Database Management System
Document9 pages
Database Management System
Rajesh kuamr
No ratings yet
DWH Guide Oracle E85643 03
Document699 pages
DWH Guide Oracle E85643 03
Cong Nguyen
No ratings yet
NetSDK Programming Manual
Document48 pages
NetSDK Programming Manual
rudy019
No ratings yet
Shivam Krishna
Document1 page
Shivam Krishna
physicist sharma
No ratings yet
Vsphere Esxi Vcenter Server 651 Monitoring Performance Guide PDF
Document206 pages
Vsphere Esxi Vcenter Server 651 Monitoring Performance Guide PDF
goran974
No ratings yet
Oracle Data Guard
Document386 pages
Oracle Data Guard
Prasanna Narayanan
No ratings yet
ABB Experience Implementing CIM
Document24 pages
ABB Experience Implementing CIM
Igor Sangulin
No ratings yet
Chapter 5 Big Data
Document36 pages
Chapter 5 Big Data
Jiawei Tan
No ratings yet
File Server
Document10 pages
File Server
Artem Parriñas
No ratings yet
Base Your Block
Document17 pages
Base Your Block
Miguel Angel Manrique
No ratings yet
B28624 PDF
Document772 pages
B28624 PDF
Jose Luis
100% (2)
CH 10 UNIX
Document48 pages
CH 10 UNIX
Anusha Bikkineni
No ratings yet
2
Document7 pages
2
Dropdead OH
100% (1)
Postgres The First Experience
Document173 pages
Postgres The First Experience
brmonteiro
No ratings yet
Business Management IA SL
Document29 pages
Business Management IA SL
siutakmak
No ratings yet
Snowflake Notes
Document67 pages
Snowflake Notes
Jay Jinka
100% (4)
Unesco Als Ls4 Sg07
Document22 pages
Unesco Als Ls4 Sg07
als midsayap1
100% (2)
AUTOSAR EXP NVDataHandling 2
Document52 pages
AUTOSAR EXP NVDataHandling 2
afra
No ratings yet
ISO 30401 Lead Auditor Course......
Document435 pages
ISO 30401 Lead Auditor Course......
Somto Nwachukwu
No ratings yet
Master Data Management
Document41 pages
Master Data Management
klaudia danko
100% (3)
Data Processing and Data Analysis - 104910 1
Document8 pages
Data Processing and Data Analysis - 104910 1
Gurya
No ratings yet
BSBINM601 - Assessment Tasks PDF
Document62 pages
BSBINM601 - Assessment Tasks PDF
Anaya Ranta
22% (9)
Synology MiB
Document19 pages
Synology MiB
Naz Lun
No ratings yet