A. Efficient compression
C. bigsql.alltables.io.doAs
D. bigsql.impersonation.create.table.grant.public
You need to enable impersonation. Which
properties in the bigsql-conf.xml file need
to be marked true?
$BIGSQL_HOME/conf
DB2COMPOPT
bigsql.alltables.io.doAs
sql.impersonation.create.table.grant.public
DB2_ATS_ENABLE
C. STORED AS parquetfile
B. 777
Which directory permissions
need to be set to allow all
users to create their own
schema?
A. 666
B. 777
C. 700
D. 755
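To make the octal options above concrete, here is a minimal Python sketch that prints the symbolic permissions each mode would give a directory. It relies only on the standard library `stat` module; the helper name `describe` is made up for illustration.

```python
import stat

# Octal modes taken from the answer options above.
modes = [0o666, 0o777, 0o700, 0o755]

def describe(mode: int) -> str:
    """Return the ls-style permission string for a directory with this mode."""
    return stat.filemode(stat.S_IFDIR | mode)

for mode in modes:
    print(oct(mode), describe(mode))
```

Of the four options, only 777 gives every user both write and traverse access, which is what lets any user create entries under the directory.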
A. umask
You need to determine the
permission setting for a new
schema directory. Which tool
would you use?
A. umask
B. GRANT
C. HDFS
D. Kerberos
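As a rough illustration of how a umask decides the permissions of a newly created directory, the sketch below applies standard POSIX semantics: the requested mode is masked by the complement of the umask. The function name `effective_dir_mode` is hypothetical, and the sketch does not model any Hadoop-specific configuration.

```python
def effective_dir_mode(umask: int, requested: int = 0o777) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested & ~umask

# A umask of 022 removes write access for group and others.
print(oct(effective_dir_mode(0o022)))
# A umask of 077 leaves access for the owner only.
print(oct(effective_dir_mode(0o077)))
```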
A. ./jsqsh mybigdata
B. GRANT
C. REVOKE
Which two commands would
you use to give or remove
certain privileges to/from a
user?
A. INSERT
B. GRANT
C. REVOKE
D. LOAD
E. SELECT
A. Schemas
What are Big SQL database
tables organized into?
A. Schemas
B. Directories
C. Files
D. Hives
A. Wrapper
When connecting to an
external database in a
federation, you need to use the
correct database driver and
protocol. What is this
federation component called in
Big SQL?
A. Wrapper
B. Data source
C. User mapping
D. Nickname
B. DSM
Which tool would you use to
create a connection to your Big
SQL database?
A. Jupyter
https://quizlet.com/in/558154874/big-data-engineer-ibm-exploree-flash-cards/ 3/30
21/06/2022 16:49 Big data engineer ibm exploree Cartes | Quizlet
B. DSM
C. Ambari
D. Scheduler
B. /apps/hive/warehouse/
D. CREATE FUNCTION
A. graph operations
C. batch processing
D. machine learning
Apache Spark provides a
single, unifying platform for
which three of the following
types of operations?
A. graph operations
B. record locking
C. batch processing
D. machine learning
E. ACID transactions
F. transaction processing
C. org.apache.hadoop.mapred
C. NiFi
D. Druid
D. ResourceManager
A. ResourceManager
A. Scala
B. Java
D. Python
Which three programming
languages are directly
supported by Apache Spark?
A. Scala
B. Java
C. C++
D. Python
E. .NET
F. C#
B. ResourceManager
E. ApplicationMaster
Under the YARN/MRv2
framework, the JobTracker
functions are split into which
two daemons?
A. JobMaster
B. ResourceManager
C. ScheduleManager
D. TaskManager
E. ApplicationMaster
B. disk latency
Which component of a
Hadoop system is the primary
cause of poor performance?
A. CPU
B. disk latency
C. network
D. RAM
A. Hive
B. Spark SQL
A. RDD
Which Spark Core function
provides the main element of
Spark API?
A. RDD
B. MLlib
C. YARN
D. Mesos
A. Proxying services.
B. API and perimeter security.
What two security functions
does Apache Knox provide?
A. Proxying services.
B. API and perimeter security.
C. Management of Kerberos in
the cluster.
D. Database field access
auditing.
simplified?
commands in a
A. Place the
file.
C. Collector
What is the final agent in a
Flume chain named?
A. Stream
B. Agent
C. Collector
D. Source
B. MongoDB
What is an example of a
NoSQL datastore of the
"Document Store" type?
A. HBase
B. MongoDB
C. REDIS
D. Cassandra
D. Pig
D. JBOD
Which hardware feature on a
Hadoop DataNode is
recommended for cost-efficient
performance?
A. SSD
B. RAID
C. LVM
D. JBOD
D. REDIS
What is an example of a Key-
value type of NoSQL
datastore?
A. MongoDB
B. Sesame
C. Neo4j
D. REDIS
C. SequenceFiles
Which data encoding format
supports exact storage of all
data in binary representations
such as VARBINARY
columns?
A. Parquet
B. RCFile
C. SequenceFiles
D. Flat
A. Authorization Provider
A. Sqoop
Which Hadoop ecosystem tool
can import data into a Hadoop
cluster from a DB2, MySQL, or
other databases?
A. Sqoop
B. HBase
C. Accumulo
D. Oozie
C. HBase
Which NoSQL datastore type
began as an implementation of
Google's BigTable that can
store any type of data and
scale to many petabytes?
A. MemcacheD
B. CouchDB
C. HBase
D. Riak
B. YARN
E. MapReduce
F. HDFS
Hadoop 2 consists of which
three open-source sub-projects
maintained by the Apache
Software Foundation?
A. Big SQL
B. YARN
C. Hive
D. Cloudbreak
E. MapReduce
F. HDFS
B. Scalability
E. Resource utilization
C. Projects
What is the architecture of
Watson Studio centered on?
A. Data Assets
B. Collaborators
C. Projects
D. Analytic Assets
B. Markdown
B. Spark Instance
D. Project
Before you create a Jupyter
notebook in Watson Studio,
which two items are
necessary?
A. File
B. Spark Instance
C. Scala
D. Project
E. URL
A. %lsmagic
What command is used to list
the "magic" commands in
Jupyter?
A. %lsmagic
B. %list-all-magic
C. %dirmagic
D. %list-magic
D. Notebooks can be
connected to big data engines
such as Spark.
C. Combiner
D. ResourceManager
applications on disk.
Which statement about Apache
Spark is true?
A. It supports HDFS, MS-SQL,
and Oracle.
B. It is much faster than
MapReduce for complex
applications on disk.
C. It runs on Hadoop clusters
with RAM drives configured on
each DataNode.
D. It features APIs for C++ and
.NET.
B. Administration
D. Data Protection
E. Audit
Which three are a part of the
Five Pillars of Security?
A. Resiliency
B. Administration
C. Speed
D. Data Protection
E. Audit
B. Spark
A. NodeChildrenChanged
B. NodeDeleted
Which two are valid watches
for ZNodes in ZooKeeper?
A. NodeChildrenChanged
B. NodeDeleted
C. NodeRefreshed
D. NodeExpired
A. Authorization
B. Auditing
What are two security features
Apache Ranger provides?
A. Authorization
B. Auditing
C. Authentication
D. Availability
A. Apache Mesos
C. Hadoop YARN
Apache Spark can run on
which two of the following
cluster managers?
A. Apache Mesos
B. oneSIS
C. Hadoop YARN
D. Linux Cluster Manager
E. Nomad
B. HBase
D. Parallel Processing
A. Fluid query
D. Object Storage
Where does the unstructured
data of a project reside in
Watson Studio?
A. Database
B. Wrapper
C. Tables
D. Object Storage
A. Acquisition
B. Analytics
C. Exploration
D. Manipulation
A. Parallel Processing
Which description
characterizes a function
provided by Apache Ambari?
worker nodes.
B. Big Match
D. Big SQL
F. Big Replicate
What are three IBM value-add
components to the
Hortonworks Data Platform
(HDP)?
A. Big YARN
B. Big Match
C. Big Index
D. Big SQL
E. Big Data
F. Big Replicate
A. An application evaluating
sensor data in real-time.
B. A web application that
supports 10,000 users.
C. A system that stores many
records in a database.
D. One time export and import
of a database.
D. 1
How many Big SQL
management nodes do you
need at minimum?
A. 3
B. 4
C. 2
D. 1
A. MongoDB
What is an example of a
NoSQL datastore of the
"Document Store" type?
A. MongoDB
B. REDIS
C. Cassandra
D. HBase
B. YARN
B. ApplicationMaster
C. Markdown
A. Efficient compression
What is an advantage of the
ORC file format?
A. Efficient compression
B. Data interchange outside
Hadoop
C. Big SQL can exploit
advanced features
Transformations
Which Spark RDD operation
creates a directed acyclic
graph through lazy
evaluation?
Postgres RDBMS
Which component of the
Apache Ambari architecture
stores the cluster
configurations?
Spark Streaming
Which component of the Spark
unified stack provides
processing of data arriving at
the system in real time?
Avro
Which data encoding format
is a compact, binary format that
supports interoperability with
multiple programming
languages and versioning?
Scala
What is the native
programming language for
Spark?
NameNode
Which component of the HDFS
architecture manages the file
system namespace and
metadata?
Email address
Medical record number
Which two are examples of
personally identifiable
information (PII)? (select two)
Lambda functions
What is the name of the Scala
programming feature that
provides functions with no
names?
REST APIs
Which feature allows
application developers to easily
use the Ambari interface to
integrate Hadoop provisioning,
management, and monitoring
capabilities into their own
applications?
Actions
Which Spark RDD operation
returns values after performing
the evaluations?
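The split between lazily recorded transformations and value-returning actions can be sketched with a toy class. This is plain Python and deliberately NOT the Spark API: `ToyRDD` and its methods are hypothetical stand-ins that only mimic the idea that `map` and `filter` extend a plan, while `collect` (the action) finally runs it.

```python
class ToyRDD:
    """Toy stand-in for an RDD (an assumption for illustration, not Spark)."""

    def __init__(self, data, plan=()):
        self._data = list(data)
        self._plan = tuple(plan)  # recorded transformations, not yet executed

    def map(self, fn):
        # Transformation: return a new ToyRDD with a longer plan.
        return ToyRDD(self._data, self._plan + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only recorded.
        return ToyRDD(self._data, self._plan + (("filter", pred),))

    def collect(self):
        # Action: walk the recorded plan and produce concrete values.
        items = iter(self._data)
        for kind, fn in self._plan:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


seen = []  # records when the mapped function really runs

def double(x):
    seen.append(x)
    return x * 2

pipeline = ToyRDD(range(5)).map(double).filter(lambda x: x > 2)
ran_before_action = list(seen)  # still empty: nothing has executed yet
result = pipeline.collect()     # the action triggers the evaluation
print(ran_before_action, result)
```

In real Spark the recorded plan is a directed acyclic graph that the scheduler can optimize before running; the tuple used here is only the simplest structure that shows the deferral.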
JobTracker
Under the MapReduce v1
architecture, which element of
MapReduce controls job
execution on multiple slaves?
Log files
Cookies
Browser cache
What are three examples of
"Data Exhaust"? (select three)
ls /
What ZK CLI command is used
to list all the ZNodes at the top
level of the zookeeper
CSV
Avro
Which two of the following are
row-based data encoding
formats? (select two)
DataNode
Which component of the HDFS
architecture manages storage
attached to the nodes?
SciPy
What Python package has
support for linear
algebra, optimization,
mathematical integration, and
statistics?
import
What Python statement is used
to add a library to the current
code cell?
Data modeling
Machine learning
Which areas of expertise are
attributed to a data scientist?
(select two)
Substantive expertise
Math and statistics knowledge
Hacking skills
Which three main areas make
up data science according to
Drew Conway? (select three)
String
Which data type can cause
significant performance
degradation and should be
avoided?
LOAD
Which command is used to
populate a Big SQL table?
Parquet
Which file format has the
highest performance?
Apache Hive
Which type of foundation does
Big SQL build on?
SMALLINT
Which data type is boolean
defined as in a Big SQL
database?
The data is not human readable.
Which statement describes a
sequence file?
CREATE NICKNAME
Which command would you
run to make a remote table
accessible using an alias?
Apache Ambari
Which tool should you use to
enable Kerberos security?
Delimited
Which file format contains
human-readable data where
the column values are
separated by a comma?
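A delimited file of the kind this card describes can be read with Python's standard `csv` module. The sketch below is only illustrative: the sample text, the column names, and the helper `parse_delimited` are all made up for the example.

```python
import csv
import io

# Hypothetical sample data; the first line is the header row.
text = "id,name,city\n1,Ada,London\n2,Grace,New York\n"

def parse_delimited(raw: str, delimiter: str = ","):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(raw), delimiter=delimiter))

rows = parse_delimited(text)
print(rows[1]["city"])
```

Because every value is stored as readable text, all fields come back as strings; any numeric typing has to be applied by the reader.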
User-Defined
Which type of function
promotes code re-use and
reduces query complexity?
EXTERNAL
You need to create a table that
is not managed by the Big SQL
database manager. Which
keyword would you use to
create the table?
Impersonation
Which feature allows the bigsql
user to securely access data in
Hadoop on behalf of another
user?
Apache Ranger
You need to monitor and
manage data security across a
Hadoop platform. Which tool
would you use?
Python
R
You can import preinstalled
libraries if you are using which
languages? (select two)
http://localhost:8888/
What is the default web
location for a local Jupyter
instance?
JobTracker
Under the MapReduce v1
architecture, which element of
the system manages the map
and reduce functions?
HDFS links the disks on multiple nodes into one
large file system
Which statement is true about
the Hadoop Distributed File
System (HDFS)?
Scala
Python
Which two Spark libraries
provide a native shell? (select
two)
Data munging
Ambari
Which Hortonworks Data
Platform (HDP) component
provides a common web user
interface for applications
running on a Hadoop cluster?
NameNode
Which component of the HDFS
CREATE WRAPPER
Kerberos
3 replicas, 2 on the same rack, 1 on a different rack
Under the HDFS storage
model, what is the default
method of replication?
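The replication answer on the card above can be pictured with a toy placement function. This is a simplified sketch, not HDFS's actual placement code: the function `place_replicas`, the rack and node names, and the rule of always taking the first eligible remote rack are assumptions made for illustration.

```python
from typing import Dict, List

def place_replicas(topology: Dict[str, List[str]], local_rack: str) -> List[str]:
    """Pick three nodes: one on the writer's rack, two on one other rack."""
    remote_racks = [
        rack for rack, nodes in topology.items()
        if rack != local_rack and len(nodes) >= 2
    ]
    if not topology.get(local_rack) or not remote_racks:
        raise ValueError("need a local node and a remote rack with two nodes")
    remote = remote_racks[0]  # toy choice: first eligible remote rack
    return [topology[local_rack][0]] + topology[remote][:2]

# Hypothetical two-rack cluster.
topology = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
placement = place_replicas(topology, "rack1")
print(placement)
```

Keeping two copies on one rack limits cross-rack write traffic, while the copy on the other rack survives the loss of an entire rack.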
Quick data exploration tasks that can be
reproduced
For what are interactive
notebooks used by data
scientists?
Facilitates SQL-based queries
Which is the primary
advantage of using column-
based data formats over
record-based formats?
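One way to see why a column-based layout suits SQL-style queries is to store the same records both ways and compute an aggregate over a single column. The sketch below uses plain Python structures; the records and the helper `to_columns` are hypothetical and do not represent any real file format.

```python
# Hypothetical records in a row-oriented (record-based) layout.
rows = [
    {"id": 1, "city": "Paris", "amount": 10.0},
    {"id": 2, "city": "Rome", "amount": 25.5},
    {"id": 3, "city": "Paris", "amount": 4.5},
]

def to_columns(records):
    """Pivot a list of records into one list per column."""
    return {key: [r[key] for r in records] for key in records[0]}

columns = to_columns(rows)

# SELECT SUM(amount): the row layout walks every whole record...
total_from_rows = sum(r["amount"] for r in rows)
# ...while the column layout reads only the one column the query needs.
total_from_columns = sum(columns["amount"])
print(total_from_rows, total_from_columns)
```

Queries that select or aggregate a few columns therefore touch far less data in a columnar file, and values of one type stored together also tend to compress well.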