
HADOOP ADMINISTRATOR TRAINING

--- Hadoop L1 L2 Curriculum ---

In-depth Design Architecture & Process/Data flow


- Hadoop, MR, Zookeeper
- YARN
- Hive (MR, Tez, Spark)
- Spark (Spark SQL, Streaming)
- Kafka (Producer, Consumer, MirrorMaker, Schema Registry)
- HBase & Oozie

Important Config Parameters & Performance Tuning


- Hadoop, MR, Zookeeper
- YARN
- Hive
- Spark
- Kafka
- HBase & Oozie
- Hadoop configurations (MTU, transparent huge pages, sizing (JVM, space distribution), etc.)
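
As a small illustration of the transparent huge pages item above, the following sketch reads the Linux kernel's THP status and reports which mode is active. It assumes the mainline sysfs path `/sys/kernel/mm/transparent_hugepage/enabled`, where the active mode is the token shown in square brackets; the function names are hypothetical and not part of any Hadoop or Cloudera tool.

```python
import re
from pathlib import Path

# Assumption: mainline Linux sysfs path; some distributions expose a different location.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active THP mode, i.e. the bracketed token in
    a status string such as 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"unrecognised THP status: {text!r}")
    return match.group(1)


def thp_is_disabled(text: str) -> bool:
    """True when the active mode is 'never'."""
    return parse_thp_mode(text) == "never"


if __name__ == "__main__":
    if THP_PATH.exists():
        status = THP_PATH.read_text()
        print(f"THP mode: {parse_thp_mode(status)} (disabled: {thp_is_disabled(status)})")
    else:
        print(f"{THP_PATH} not found; THP status unavailable on this host")
```

Running a check like this on each worker node is one way to confirm the setting before tuning JVM sizing.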

Troubleshooting jobs and services


- Best practice approach to doing RCA
- Identifying whether an issue lies at the platform end or the application end
- Dealing with slow-running jobs (at the YARN or processing level)
- Diagnostic tools (heap dump, tcpdump, netstat, netcat, htop, ps, etc.)

Important & Handy Service Command lines


- Hadoop, MR, Zookeeper
- YARN
- Hive (Beeline)
- Spark
- Kafka
- HBase & Oozie

Important Service Alerts


- HDFS metrics
- YARN Metrics
- Hive Metrics
- Spark Metrics
- Impala Metrics
Note: review existing alerts and adjust accordingly.

Job and Service Monitoring
- From CLI & WebUI
- Utilizing Cloudera Manager, Dynatrace and Unravel Monitoring tools
- Monitoring and responding on Mattermost channels (Hadoop MM channels, SRC)
- ServiceNow

Typical Day-to-Day Activities & Miscellaneous


- Starting and stopping services based on dependencies
- Dealing with local space utilization issues
- HDFS user quotas, Rebalancing
- Copying Data, Restoring Snapshot, Creating Hive Tables
- Hive MSCK, HDFS (FSCK & cluster Report)
- Decommissioning a node from the cluster and adding a node to the cluster
- Effective Shift Handover Report
- Scheduling Monthly Agile Patching
- Updating documents and confluence pages
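
The first item in the list above, starting and stopping services based on dependencies, can be sketched as a topological sort. The dependency map below is an illustrative assumption (for example, HDFS waiting on ZooKeeper, as in an HA setup); the real graph depends on how a given cluster is deployed, and a management tool such as Cloudera Manager normally handles this ordering itself.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumption: an illustrative dependency map, service -> services it needs first.
# Adjust to match the actual cluster.
DEPENDS_ON = {
    "ZooKeeper": set(),
    "HDFS": {"ZooKeeper"},
    "YARN": {"HDFS"},
    "Kafka": {"ZooKeeper"},
    "HBase": {"HDFS", "ZooKeeper"},
    "Hive": {"HDFS", "YARN"},
    "Spark": {"YARN"},
    "Oozie": {"YARN", "Hive"},
}


def start_order(depends_on: dict) -> list:
    """Services ordered so that each one comes after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on: dict) -> list:
    """Reverse of the start order: dependents are stopped before their dependencies."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("Start:", " -> ".join(start_order(DEPENDS_ON)))
    print("Stop: ", " -> ".join(stop_order(DEPENDS_ON)))
```

Services with no dependency between them (Kafka and YARN in this map) may appear in either relative order; only the dependency constraints are guaranteed.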

--- Hadoop L2 L3 Curriculum ---

Security
- Kerberos
- Sentry (or Ranger)
- SSL / TLS
- Data Encryption

Patching and Upgrade


- CDP training
- Applying Hotfix and Patching
- Fixing Vulnerability findings
- BC Event activities
- FFA runbooks

Reports
- Performance Report*
- Setting up Alerts and Notifications on Cloudera Manager, Dynatrace and Unravel
- INC, REQ, CR, Vulnerability, Troux

Miscellaneous
- Lead Troubleshooting session with vendors
- Train Wipro team members on products, processes
- Manage/Plan Wipro team schedule & coverage to adjust for workload spikes
