programs for processing voluminous data.
• All execution happens in parallel on large clusters of commodity hardware.
• The MapReduce framework and the Hadoop Distributed File System are typically run on the same set of nodes as the compute nodes.

MapReduce Framework
• This configuration enables the framework to schedule tasks effectively.
• Tasks are scheduled and monitored, and failed tasks are re-executed by the framework.
• A MapReduce job divides the input data set into separate chunks, which are then processed in parallel by the map tasks.

MapReduce Framework
• The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per cluster node.
• The master is in charge of scheduling the slaves' component tasks, monitoring them, and re-running any failed task.
• The slaves carry out the master's instructions.

How does MapReduce Work
• A MapReduce program executes in three stages:
– Map
– Shuffle
– Reduce
• Map Stage:
– The map or mapper's job is to process the input data.
– The input data is in the form of a file or directory stored in HDFS.
– The input file is passed to the mapper function line by line.
– The mapper function breaks the large file into small chunks.

How does MapReduce Work
• Reduce Stage
– This is a combination of the shuffle stage and the reduce stage.
– It processes the data coming from the mapper.
– It produces a new set of output, which is stored in HDFS.

Developing a MapReduce Application
• Configuration API: Represents a collection of configuration properties and their values.
– Combining resources: Used in Hadoop to separate out the default properties for the system.
– Variable expansion: Properties are expanded using the values found in the configuration.

Developing a MapReduce Application
• Setting up the development environment
– Managing configuration: It is common to switch between running the application locally and running it on a cluster. This variation is accommodated by Hadoop configuration files containing the connection settings.
– GenericOptionsParser, Tool and ToolRunner: GenericOptionsParser is a class that interprets common Hadoop command-line options. Programmers don't usually use GenericOptionsParser directly, as it's more convenient to implement the Tool interface and run the application with ToolRunner, which uses GenericOptionsParser internally (a minimal driver sketch appears at the end of this section).

Developing a MapReduce Application
• Writing a Unit Test with MRUnit: MRUnit is a testing library that makes it easy to pass known inputs to a mapper or a reducer and check that the outputs are as expected.
– Mapper
– Reducer
– MRUnit is used in conjunction with a standard test execution framework such as JUnit, so the programmer can run the tests for a MapReduce job in a normal development environment.
– If the expected values are not emitted by the mapper, MRUnit will fail the test.

Developing a MapReduce Application
• Running locally on test data
– The local job runner uses a single JVM to run a job.
– Running a job in the local job runner: Using the Tool interface, it's easy to write a driver to run a MapReduce job.
– Testing the driver: Uses the local job runner to run a job against known input data and checks that the output is as expected.

MRv1
• MRv1 is part of Apache Hadoop 1.x.
• MRv1 uses the JobTracker to create and assign tasks to data nodes, which can become a resource bottleneck when the cluster scales out far enough.
• In general, the JobTracker has to manage both resources and applications.
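As a concrete illustration of the Tool and ToolRunner pattern described above, the following is a minimal driver sketch. The class names (WordCountDriver, TokenizerMapper, IntSumReducer) and the word-count logic are illustrative assumptions rather than material from the original slides.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Driver implementing Tool so that ToolRunner (and, internally,
    // GenericOptionsParser) handles common Hadoop command-line options.
    public class WordCountDriver extends Configured implements Tool {

        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Split each input line into words and emit (word, 1).
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                // Sum the counts for each word and emit (word, total).
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        @Override
        public int run(String[] args) throws Exception {
            Job job = Job.getInstance(getConf(), "word count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            // ToolRunner parses generic options (-conf, -D, -files, ...) before calling run().
            System.exit(ToolRunner.run(new WordCountDriver(), args));
        }
    }

The same driver runs unchanged in the local job runner or on a cluster, since the connection settings come from the configuration files or the generic options rather than from the code.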
MRv2
• MRv2 (YARN) has one resource manager per cluster, which keeps track of:
– how many slots are available
– what happens if a node fails
– what the capacity is
• Each data node runs a node manager.
• For each job, one slave acts as the application master, monitoring resources, tasks, etc.

MRv2
• The client submits the job to the resource manager.
• The node manager sends node status to the resource manager.
• The application master makes resource requests.
• The container sends MapReduce status.

Developing a MapReduce Application
• Running on a cluster:
– Packaging a job: In a distributed setting, the job is packaged into a job JAR file.
– Launching a Job
– The MapReduce Web UI
– Retrieving the Results
– Debugging a Job
– Hadoop Logs

Developing a MapReduce Application
• Tuning a job
– Profiling a Task
• MapReduce Workflows
– Decomposing a problem into MapReduce jobs
– Job control
– Apache Oozie

Developing a MapReduce Application
• Running on a cluster
• Tuning a job
• MapReduce Workflows

Anatomy of a MapReduce Job Run
• submit(): the method called to run a MapReduce job.
• waitForCompletion(): submits the job if it hasn't been submitted already, then waits for it to finish (a client-side sketch of these calls follows at the end of this section).

Anatomy of a MapReduce Job
• Client: submits the MapReduce job.
• YARN resource manager: coordinates the allocation of compute resources.
• YARN node managers: launch and monitor the compute containers on machines in the cluster.
• MapReduce application master: coordinates the tasks running the MapReduce job.

Anatomy of a MapReduce Job
• The application master and the MapReduce tasks run in containers that are scheduled by the resource manager and managed by the node managers.
• HDFS: used for sharing job files between the other entities.

Figure: Anatomy of a MapReduce Job Run

Anatomy of a MapReduce Application
• The submit() method on Job creates an internal JobSubmitter instance and calls submitJobInternal() (step 1).
• It asks the resource manager for a new application ID, which is used for the MapReduce job ID.
• It checks the output specification.
• It computes the input splits for the job. If the splits cannot be computed, the job is not submitted and an error is thrown to the MapReduce program.

Anatomy of a MapReduce Application
• It copies the resources needed to run the job, including the job JAR file, the configuration file, and the computed input splits, to the shared filesystem in a directory named after the job ID (step 3).
• It submits the job by calling submitApplication() on the resource manager (step 4).

Anatomy of a MapReduce Application
• The resource manager receives the call to its submitApplication() method and hands off the request to the YARN scheduler.
• The YARN scheduler allocates a container.
• The resource manager then launches the application master's process there, under the node manager's management (steps 5a and 5b).

Anatomy of a MapReduce Application
• The MRAppMaster initializes the job by creating a number of bookkeeping objects to keep track of the job's progress.
• The MRAppMaster receives progress and completion reports from the tasks.
• The MRAppMaster retrieves the input splits computed in the client from the shared filesystem.
• The MRAppMaster creates a map task object for each split, as well as a number of reduce task objects.
• Tasks are given IDs at this point.

Anatomy of a MapReduce Application
• If the job doesn't qualify as an uber task, the MRAppMaster requests containers for all the map and reduce tasks in the job from the resource manager (step 8).
• Once a task has been assigned resources for its container, the MRAppMaster starts the container by contacting the node manager (steps 9a and 9b).
• The task is executed by a Java application whose main class is YarnChild.
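The client-side calls listed under "Anatomy of a MapReduce Job Run" above can be sketched as follows. The class name SubmitVsWait, the job name, and the use of the default (identity) mapper and reducer are illustrative assumptions; only the submit()/waitForCompletion() behaviour is taken from the discussion above.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class SubmitVsWait {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "anatomy-demo"); // job name is illustrative
            job.setJarByClass(SubmitVsWait.class);
            // Default (identity) mapper and reducer are used; only paths are configured here.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            // waitForCompletion() submits the job if it hasn't been submitted already
            // (internally via a JobSubmitter calling submitJobInternal()), then polls
            // until the job finishes; 'true' prints progress to the console.
            boolean ok = job.waitForCompletion(true);

            // Alternatively, job.submit() hands the job off and returns immediately,
            // leaving the client to check progress and completion later.
            System.exit(ok ? 0 : 1);
        }
    }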
Anatomy of a MapReduce Application
• YarnChild localizes the resources that the task needs, including the job configuration, the JAR file, and any files from the distributed cache (step 10).
• Finally, it runs the map or reduce task (step 11).
• YarnChild runs in a dedicated JVM, so that any bugs in the user-defined map and reduce functions (or even in YarnChild itself) don't affect the node manager by causing it to crash or hang.

Anatomy of a MapReduce Application
• Job submission: steps 1 to 4
• Job initialization: steps 5a, 5b, 6, 7
• Task assignment: step 8
• Task execution: steps 9a, 9b, 10, 11

Figure: Relationship of the streaming executable to the node manager and the task container

Task Execution: Streaming
• Streaming runs special map and reduce tasks for the purpose of launching the user-supplied executable and communicating with it.
• The Java process passes input key-value pairs to the external process, which processes them through the user-defined map or reduce function and passes the output key-value pairs back to the Java process.

Figure: How status updates are propagated through the MapReduce system

Progress and Status Updates
• Progress is not always measurable; it tells Hadoop that a task is doing something.
• MapReduce jobs are long-running batch jobs.
• When a task is running, it keeps track of its progress.

Progress and Status Updates
• The following operations constitute progress (see the mapper sketch at the end of this section):
– Reading an input record (in a mapper or reducer)
– Writing an output record (in a mapper or reducer)
– Setting the status description (via the Reporter's or TaskAttemptContext's setStatus() method)
– Incrementing a counter (using the Reporter's incrCounter() or the Counter's increment() method)
– Calling the Reporter's or TaskAttemptContext's progress() method

Failures
• Hadoop has the ability to handle failures and allow the job to complete successfully.
• Common failures are the following:
– Task failure
– Application master failure
– Node manager failure
– Resource manager failure

Task Failure
• The most common failure is the task throwing a runtime exception.
• If this happens, the JVM reports back to its parent application master before it exits, and the error ends up in the user logs.
• The application master marks the task attempt as failed and frees up the container so its resources can be used by another task.

Task Failure
• For a streaming task: if the streaming process exits with a nonzero exit code, it is marked as failed.
• Sudden exit because of a JVM bug: in this case the node manager reports the exited process to the application master, which marks the attempt as failed.

Task Failure
• If a streaming process hangs, the node manager will kill it, provided that either yarn.nodemanager.container-executor.class is set to org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor, or the default container executor is being used and the setsid command is available on the system.
• Otherwise, orphaned streaming processes will accumulate on the system, which will impact utilization over time.
• The application master reschedules the execution of a failed task, trying to avoid a node manager where the task previously failed; a task is attempted at most four times.

Application Master Failure
• If an MRAppMaster fails twice, it will not be tried again and the job will fail.
• The application master sends periodic heartbeats to the resource manager. If the application master fails, the resource manager detects the failure and starts a new instance of the application master.

Node Manager Failure
• A node manager fails by crashing or running very slowly, and it stops sending heartbeats to the resource manager. If tasks fail on a node manager multiple times, it is blacklisted.
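A minimal mapper sketch showing the progress-reporting operations listed above. The counter enum (Quality.BAD_RECORDS), the empty-record check, and the status message are illustrative assumptions, not part of the original material.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ProgressReportingMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {

        enum Quality { BAD_RECORDS }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Reading the input record and writing an output record already count as progress.
            if (value.getLength() == 0) {
                // Incrementing a counter also signals progress (and gathers job statistics).
                context.getCounter(Quality.BAD_RECORDS).increment(1);
                return;
            }
            // Setting the status description is another progress signal.
            context.setStatus("processing offset " + key.get());
            context.write(value, new LongWritable(1));
            // An explicit progress() call tells Hadoop the task is still alive during long work.
            context.progress();
        }
    }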
Resource Manager Failure
• This is a serious failure: without the resource manager, neither jobs nor task containers can be launched.
• To achieve high availability, a pair of resource managers is run in an active-standby configuration. Information about the running applications is stored in a highly available state store.

Resource Manager Failure
• When a new resource manager starts, it reads the application information from the state store.
• Clients and node managers must be configured to handle resource manager failover, since there are two possible resource managers to communicate with.
• They try each resource manager in round-robin fashion until they find the active one.

Figure: A client reading from HDFS
Figure: A client writing data to HDFS

Shuffle and Sort
• MapReduce guarantees that the input to every reducer is sorted by key.
• The process by which the system performs the sort and transfers the map outputs to the reducers as input is known as the shuffle.
• The shuffle is an area of the codebase where refinements and improvements are continually being made.

Shuffle and Sort
• The map tasks may finish at different times, so the reduce tasks start copying their outputs as soon as each completes; this is known as the copy phase.
• The reduce task has a small number of copier threads so that it can fetch map outputs in parallel.

What is MapReduce
• MapReduce is a computational model and an implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.
• A MapReduce program consists of the following procedures:
– Map procedure: performs a filtering and sorting operation.
– Reduce procedure: performs a summary operation.

Functions of MapReduce
• MapReduce serves two essential functions:
• It filters and distributes work to the various nodes within the cluster, a function sometimes referred to as the mapper.
• It collects, organizes, and reduces the results from each node into a collective answer, referred to as the reducer.

MapReduce Features
• Counters: a useful channel for gathering statistics about the job, for example for quality control.
• Hadoop maintains some built-in counters for every job that report various metrics for the job.
• Types of counters: task counters, job counters, user-defined Java counters.
• Sorting: the ability to sort data is at the heart of MapReduce.
• Types of sorts: partial sort, total sort, secondary sort.
• For any particular key, values are not sorted.

MapReduce Features
• Joins: MapReduce can perform joins between large datasets, but writing the code to do joins from scratch is fairly involved. Examples: map-side joins, reduce-side joins.
• The basic idea is that the mapper tags each record with its source and uses the join key as the map output key, so that records with the same key are brought together in the reducer.
• Side data distribution: side data can be defined as extra read-only data needed by a job to process the main dataset.
• The challenge is to make the side data available to all the map or reduce tasks in a convenient and efficient fashion.

MapReduce Features
• Using the job configuration: arbitrary key-value pairs can be set in the job configuration using the various setter methods on Configuration (a sketch follows at the end of this section).
• Distributed cache: rather than serializing side data in the job configuration, it is preferable to distribute datasets using Hadoop's distributed cache mechanism.
• Distributed cache API: most applications don't need to use the distributed cache API, as they can use the cache via GenericOptionsParser.
• MapReduce library classes: Hadoop comes with a library of mappers and reducers for commonly used functions.
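A small sketch of side data distribution via the job configuration, as described above: the driver sets an arbitrary key-value pair with a setter on Configuration, and the mapper reads it back in setup(). The property name myjob.min.length and the length-filtering logic are illustrative assumptions.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class SideDataExample {

        public static class FilterMapper
                extends Mapper<LongWritable, Text, Text, NullWritable> {
            private int minLength;

            @Override
            protected void setup(Context context) {
                // Task side: read the side data back out of the job configuration.
                minLength = context.getConfiguration().getInt("myjob.min.length", 0);
            }

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Keep only lines at least as long as the configured threshold.
                if (value.getLength() >= minLength) {
                    context.write(value, NullWritable.get());
                }
            }
        }

        public static void main(String[] args) {
            // Driver side: set an arbitrary key-value pair with a setter on Configuration.
            Configuration conf = new Configuration();
            conf.setInt("myjob.min.length", 10);
            // ... pass conf to Job.getInstance(conf, ...) when building the job ...
        }
    }

This approach suits only small amounts of side data; as the slides note, larger datasets are better handled with the distributed cache.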
Input Format
• The input format determines how the input file is split up and read by Hadoop.
• It uses the InputFormat interface; TextInputFormat is the default.
• Each input file is broken into splits, and each map task processes a single split. Each split is further divided into records of key/value pairs, which are processed by the map task one record at a time.
• The record reader creates the key/value pairs from the input splits and writes them to the context, which is shared with the Mapper class.

Input Format
Types of Input File Format
• FileInputFormat: the base class for all file-based input formats. It specifies the input directory where the data files are located. It reads all the files and divides them into one or more input splits.
• TextInputFormat: each line in the text file is a record. Key: byte offset of the line. Value: content of the line.
• KeyValueTextInputFormat: everything before the separator is the key, and everything after it is the value.
• SequenceFileInputFormat: to read sequence files. Keys and values are user defined.

Types of Input File Format
• SequenceFileAsTextInputFormat: similar to SequenceFileInputFormat, but converts the sequence file keys and values to Text objects.
• SequenceFileAsBinaryInputFormat: to read sequence files, extracting the keys and values as opaque binary objects.
• NLineInputFormat: similar to TextInputFormat, but each split is guaranteed to have exactly N lines.
• DBInputFormat: to read data from a relational database. The key is a LongWritable and the values are DBWritable objects.

Output Format
• The OutputFormat checks the output specification for the execution of the MapReduce job, for example that the output directory doesn't already exist.
• It determines the RecordWriter implementation used to write the output to the output files. The output files are stored in a file system.
• The OutputFormat decides the way the output key-value pairs are written to the output files by the RecordWriter (a sketch of selecting input and output formats follows at the end of this section).

Command Line Interface
• There are two properties that need to be set in the pseudo-distributed configuration.
• The first is fs.default.name, set to hdfs://localhost/, which is used to set a default filesystem for Hadoop.
• Filesystems are specified by a URI, and here the hdfs scheme is used to configure Hadoop to use HDFS.

Command Line Interface
• HDFS uses this property to determine the host and port of the HDFS namenode. Here the host is localhost and the port is the default HDFS port, 8020. HDFS clients use this property to find the namenode and connect to it.

Command Line Interface
• The second property, dfs.replication, is set to 1 so that HDFS doesn't replicate filesystem blocks by the default factor of three.
• When running with a single datanode, HDFS can't replicate blocks to three datanodes, so this setting removes the warnings about under-replicated blocks.

Data Flow
Figure: A client reading from HDFS
Figure: A client writing data to HDFS
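A short sketch of selecting input and output formats on a job, using the standard Hadoop classes named in the Input Format and Output Format slides above. The class name FormatSelectionExample, the paths, and the choice of KeyValueTextInputFormat are illustrative assumptions; TextInputFormat would be used by default if no input format were set.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class FormatSelectionExample {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "format-demo");
            job.setJarByClass(FormatSelectionExample.class);
            // Input format: key = text before the separator (tab by default), value = rest of line.
            job.setInputFormatClass(KeyValueTextInputFormat.class);
            // Output format: writes key<TAB>value lines via its RecordWriter.
            job.setOutputFormatClass(TextOutputFormat.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path("input"));     // assumed input directory
            FileOutputFormat.setOutputPath(job, new Path("output"));  // must not already exist
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }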