Review answers
1. True or False: The number of Vs of big data is exactly four.
2. Data that can be stored and processed in a fixed format is called:
   A. Structured
   B. Semi-structured
   C. Unstructured
   D. Machine generated
3. True or False: Agriculture is one of the industry sectors that use big data and analytics to improve and transform their industries.
4. Hadoop is good for:
   A. Processing transactions (random access)
   B. Processing massive amounts of data through parallelism
   C. Processing lots of small files
   D. Intensive calculations with little data
   E. Low latency data access
5. True or False: One of Hadoop's main characteristics is that applications are written in low-level language code.

Unit 2. Introduction to Hortonworks Data Platform (HDP)
Review answers
1. Which of these components of HDP provides data access capabilities?
   A. MapReduce
   B. Falcon
   C. Ranger
   D. Ambari
2. Identify the component that is a messaging system used for real-time data pipelines.
   A. NiFi
   B. Sqoop
   C. Kafka
   D. None of the above
3. True or False: Big Match is added value from IBM.
4. True or False: IBM BigIntegrate provides data quality features of Information Server.
5. IBM BigQuality provides a scalable engine to:
   A. Manage
   B. Design
   C. Connect
   D. Cleanse

Unit 3. Introduction to Apache Ambari
Review answers
1. True or False: Apache Ambari is backed by RESTful APIs for developers to easily integrate with their own applications.
2. Which functions does AMS provide?
   A. Monitors the health and status of the Hadoop cluster.
   B. Starts, stops, and reconfigures Hadoop services across the cluster.
   C. Collects, aggregates, and serves Hadoop and system metrics.
   D. Handles the configuration of Hadoop services for the cluster.
3. Which page from the Apache Ambari UI enables you to check the versions of the software that is installed on your cluster?
   A. Cluster Admin > Stack and Versions
   B. admin > Service Accounts
   C. Services
   D. Hosts
4. True or False: Creating users through the Apache Ambari Web UI also creates the user on HDFS.
5. True or False: You can use cURL commands to issue commands to Apache Ambari.

Unit 4. Apache Hadoop and HDFS
Review questions
1. True or False: Hadoop systems are designed for using a single server.
2. What is the default number of replicas in a Hadoop system?
   A. 1
   B. 2
   C. 3
   D. 4
3. True or False: One of the Hadoop goals is fault tolerance by detecting faults and applying quick and automatic recovery.
4. True or False: At least two NameNodes are required for a stand-alone Hadoop cluster.
5. The default Hadoop block size is:
   A. 16 MB
   B. 32 MB
   C. 64 MB
   D. 128 MB

Unit 5. MapReduce and YARN
Review answers
1. Which of the following phases in a MapReduce job is optional?
   A. Map
   B. Shuffle
   C. Reduce
   D. Combiner
2. True or False: Interactive, online, and streaming applications are not allowed to run on Hadoop v2.
3. The JobTracker in MRv1 is replaced by which components in YARN? (Select all that apply.)
   A. ResourceManager
   B. NodeManager
   C. ApplicationMaster
   D. TaskTracker
4. True or False: The major change from Hadoop v1 to Hadoop v2 is the separation of cluster and resource management from the execution and data processing environment.
5. True or False: It is possible to run unmodified MapReduce v1 jobs by using the same MapReduce API and CLI in Hadoop v2.

Unit 6. Introduction to Apache Spark
Review answers
1. True or False: Ease of use is one of the benefits of using Apache Spark.
2. Which language is supported by Apache Spark?
   A. C++
   B. C#
   C. Java
   D. Node.js
3. True or False: Scala is the primary abstraction of Apache Spark.
4. In RDD actions, which function returns all the elements of the data set as an array at the driver program?
   A. Collect
   B. Take
   C. Count
   D. Reduce
5. True or False: Referencing a data set is one of the methods to create an RDD.

Unit 7. Storing and querying data
Review answers
1. What is the data representation format of an RC or ORC file?
   A. Row-based encoding
   B. Record-based encoding
   C. Column-based storage
   D. NoSQL data store
2. True or False: A NoSQL database is designed for developers who do not want to use SQL.
3. HBase is an example of which of the following NoSQL data store types?
   A. Key-value store
   B. Graph store
   C. Column store
   D. Document store
4. Which database provides an SQL for Hadoop interface?
   A. HBase
   B. Apache Hive
   C. Cloudant
   D. MongoDB
5. True or False: R is a real programming language, and Python is an interactive environment for doing statistics.
Event Server and clone requirements
This document discusses the functions of the Event Server and the requirements for cloning a server in the Central Management Console.