Professional Documents
Culture Documents
• MapReduce (Processing)
• YARN (A framework for jobs scheduling and the management of cluster resources)
• HBase is a column-oriented non-relational database management system that runs on Top of Hadoop Distributed
File System (HDFS).
• Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents.
• Ambari web interface includes a built-in set of Views that are pre- deployed for you to use with your cluster.
• Hadoop is omnipresent.
• A maturing technology.
• Fault tolerance by detecting faults and applying quick and automatic recovery
• Processing logic close to the data rather than the data close to the processing logic
• Edits Log
• TaskTracker
▪ Runs Map and Reduce tasks.
▪ Reports statuses to JobTracker.
19. Describe the architecture and list the components of the Apache Spark unified stack.