Unit 1. Introduction to big data
Review answers
1. True or False: The number of Vs of big data is exactly four.
2. Data that can be stored and processed in a fixed format is called:
A. Structured
B. Semi-structured
C. Unstructured
D. Machine generated
3. True or False: Agriculture is one of the industry sectors that are using big
data and analytics to help to improve and transform their industries.
4. Hadoop is good for:
A. Processing transactions (random access)
B. Massive amounts of data through parallelism
C. Processing lots of small files
D. Intensive calculations with little data
E. Low latency data access
5. True or False: One of Hadoop's main characteristics is that applications are
written in low-level language code.
Unit 2. Introduction to Hortonworks Data Platform (HDP)
Review answers
1. Which of these components of HDP provides data access capabilities?
A. MapReduce
B. Falcon
C. Ranger
D. Ambari
2. Identify the component that is a messaging system used for real-time data
pipelines.
A. Nifi
B. Sqoop
C. Kafka
D. None of the above
3. True or False: Big Match is added value from IBM.
4. True or False: IBM BigIntegrate provides the data quality features of
Information Server.
5. IBM BigQuality provides a scalable engine to:
A. Manage
B. Design
C. Connect
D. Cleanse
Unit 3. Introduction to Apache Ambari
Review answers
1. True or False: Apache Ambari is backed by RESTful APIs for developers to
easily integrate with their own applications.
2. Which functions does AMS provide?
A. Monitors the health and status of the Hadoop cluster.
B. Starts, stops, and reconfigures Hadoop services across the cluster.
C. Collects, aggregates, and serves Hadoop and system metrics.
D. Handles the configuration of Hadoop services for the cluster.
3. Which page from the Apache Ambari UI enables you to check the versions
of the software that is installed on your cluster?
A. Cluster Admin > Stack and Versions
B. admin > Service Accounts
C. Services
D. Hosts
4. True or False: Creating users through the Apache Ambari Web UI also
creates the user on the HDFS.
5. True or False: You can use the cURL commands to issue commands to
Apache Ambari.
Unit 4. Apache Hadoop and HDFS
Review answers
1. True or False: Hadoop systems are designed for using a single server.
2. What is the default number of replicas in a Hadoop system?
A. 1
B. 2
C. 3
D. 4
3. True or False: One of the Hadoop goals is fault tolerance by detecting faults
and applying quick and automatic recovery.
4. True or False: At least two NameNodes are required for a stand-alone
Hadoop cluster.
5. The default Hadoop block size (in MB) is:
A. 16
B. 32
C. 64
D. 128
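The replication and block-size questions above can be tied together with a little arithmetic. The following is a minimal sketch, assuming the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; both are configurable per cluster, and the function name hdfs_footprint is made up for this illustration.

```python
import math

# Assumed defaults for Hadoop 2.x and later; a cluster may override both.
BLOCK_SIZE_MB = 128   # default value of dfs.blocksize, expressed in MB
REPLICATION = 3       # default value of dfs.replication


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (number of blocks, raw storage in MB across all replicas).

    Hypothetical helper for illustration only; it is not part of any Hadoop API.
    """
    # A file is split into fixed-size blocks; the last block may be partly filled.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Each block is stored `replication` times, and a partly filled last block
    # only occupies its real size, so raw usage is file size times replication.
    raw_mb = file_size_mb * replication
    return blocks, raw_mb


blocks, raw_mb = hdfs_footprint(1000)
print(blocks, raw_mb)
```

Under these assumptions, a 1000 MB file is split into 8 blocks and occupies 3000 MB of raw disk space across the cluster.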
Unit 5. MapReduce and YARN
Review answers
1. Which of the following phases in a MapReduce job is optional?
A. Map
B. Shuffle
C. Reduce
D. Combiner
2. True or False: Interactive, online, and streaming applications are not
allowed to run on Hadoop v2.
3. The JobTracker in MRv1 is replaced by which components in YARN? (Select
all that apply.)
A. ResourceManager
B. NodeManager
C. ApplicationMaster
D. TaskTracker
4. True or False: The major change from Hadoop v1 to Hadoop v2 is the
separation of cluster and resource management from the execution and data
processing environment.
5. True or False: It is possible to run unmodified MapReduce v1 jobs by using
the same MapReduce API and CLI in Hadoop v2.
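The phases named in question 1 can be illustrated with a small in-memory word count. This is only a sketch in plain Python, not Hadoop code: the function names (map_phase, combine, shuffle, reduce_phase, word_count) are hypothetical, and each input string stands in for one input split handled by one mapper. It shows that skipping the combiner leaves the result unchanged and only increases the number of pairs passed to the shuffle.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input split.
    return [(word, 1) for word in line.split()]


def combine(pairs):
    # Combiner: pre-aggregate the output of a single mapper locally.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def shuffle(all_pairs):
    # Shuffle: group all values that share the same key.
    grouped = defaultdict(list)
    for key, value in all_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    # Reduce: sum the grouped values for each key.
    return {key: sum(values) for key, values in grouped.items()}


def word_count(splits, use_combiner):
    shuffled_pairs = []
    for line in splits:
        pairs = map_phase(line)
        if use_combiner:
            pairs = combine(pairs)
        shuffled_pairs.extend(pairs)
    # Return the final counts and how many pairs had to be shuffled.
    return reduce_phase(shuffle(shuffled_pairs)), len(shuffled_pairs)


splits = ["big data big cluster", "data data node"]
with_combiner, moved_with = word_count(splits, use_combiner=True)
without_combiner, moved_without = word_count(splits, use_combiner=False)
print(with_combiner == without_combiner, moved_with, moved_without)
```

Both runs produce the same counts; with the combiner, 5 pairs reach the shuffle instead of 7.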
Unit 6. Introduction to Apache Spark
Review answers
1. True or False: Ease of use is one of the benefits of using Apache Spark.
2. Which language is supported by Apache Spark?
A. C++
B. C#
C. Java
D. Node.js
3. True or False: Scala is the primary abstraction of Apache Spark.
4. In RDD actions, which function returns all the elements of the data set as an
array at the driver program?
A. Collect
B. Take
C. Count
D. Reduce
5. True or False: Referencing a data set is one of the methods to create RDD.
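To make the four actions listed in question 4 concrete, here is a toy stand-in written in plain Python. It is not the Spark API: ToyRDD is a hypothetical class that keeps its partitions as ordinary lists, and it only mirrors the general behavior of the actions with the same names.

```python
from functools import reduce as fold


class ToyRDD:
    """Hypothetical in-memory stand-in for an RDD; not the Spark API."""

    def __init__(self, partitions):
        # One inner list per partition.
        self.partitions = partitions

    def collect(self):
        # Bring every element from every partition back as one list.
        return [x for part in self.partitions for x in part]

    def take(self, n):
        # Return only the first n elements.
        return self.collect()[:n]

    def count(self):
        # Return the number of elements, not the elements themselves.
        return sum(len(part) for part in self.partitions)

    def reduce(self, func):
        # Fold all elements into a single value with a two-argument function.
        return fold(func, self.collect())


rdd = ToyRDD([[1, 2], [3, 4, 5]])
print(rdd.collect(), rdd.take(2), rdd.count(), rdd.reduce(lambda a, b: a + b))
```

In this sketch only collect returns the complete data set; take returns a prefix, count returns a number, and reduce returns a single aggregated value.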
Unit 7. Storing and querying data
Review answers
1. What is the data representation format of an RC or ORC file?
A. Row-based encoding
B. Record-based encoding
C. Column-based storage
D. NoSQL data store
2. True or False: A NoSQL database is designed for those developers that do
not want to use SQL.
3. HBase is an example of which of the following NoSQL data store types?
A. Key-value store
B. Graph store
C. Column store
D. Document store
4. Which database provides an SQL for Hadoop interface?
A. HBase
B. Apache Hive
C. Cloudant
D. MongoDB
5. True or False: R is a real programming language, and Python is an
interactive environment for doing statistics.