
REPUBLIC OF TUNISIA
MINISTRY OF HIGHER EDUCATION AND SCIENTIFIC RESEARCH
UNIVERSITY OF SOUSSE
Ecole Nationale d’Ingénieurs de Sousse

EXAM SHEET

Sheet no.:                    Total number of sheets:
Invigilators' signatures:

Last name: ............................................................
First name: ....................................................... Secret identifier:
National ID (CIN) no.: ......:......:......:......:......:......:......:...... :
Exam: Big Data
Do not write here
Specialization: IA2, GT2    Session: main    Academic year: 20-21    Group: .............................
---------------------------------------------------------------------------------------------------------------------------------------------------------

Which Hadoop functionalities does Ambari provide?


A. Monitor
B. Manage
C. Provision
D. Integrate
E. None of the above
F. All of the above

True or False? Creating users through the Ambari UI will also create the user on HDFS.
A. True
B. False

True or False? You can use curl commands to issue commands to Ambari.
A. True
B. False

Apache Spark can run on which two of the following cluster managers? Select the TWO answers that apply
A. Nomad
B. Linux Cluster Manager
C. oneSIS
D. Apache Mesos
E. Hadoop YARN

What is the final agent in a Flume chain named?


A. Stream
B. Collector
C. Agent
D. Source

What are two services provided by ZooKeeper?


A. Providing distributed synchronization
B. Loading bulk data into a Hadoop cluster
C. Authenticating and auditing user access
D. Maintaining configuration information


Which statement about Apache Spark is true?


A. It features APIs for C++ and .NET.
B. It supports HDFS, MS-SQL, and Oracle
C. It runs on Hadoop clusters with RAM drives configured on each DataNode
D. It is much faster than MapReduce for complex applications on disk

Which three are a part of the Five Pillars of Security?


A. Audit
B. Administration
C. Speed
D. Resiliency
E. Data Protection

Hadoop 2 consists of which three open-source sub-projects maintained by the Apache Software Foundation?
Select the THREE answers that apply
A. Cloudbreak
B. HDFS
C. MapReduce
D. Hive
E. Big SQL
F. YARN

If a Hadoop node goes down, which Ambari component will notify the Administrator?
A. Ambari Alert Framework
B. Ambari Metrics System
C. Ambari Wizard
D. REST API

Which component of the Apache Ambari architecture integrates with an organization's LDAP or Active
Directory service?
A. REST API
B. Authorization Provider
C. Postgres RDBMS
D. Ambari Alert Framework

What is an example of a Key-value type of NoSQL datastore?
A. Sesame
B. MongoDB
C. Neo4j
D. Redis

Apache Spark provides a single, unifying platform for which three of the following types of operations? Select
the THREE answers that apply.
A. graph operations
B. batch processing
C. record locking
D. machine learning
E. ACID transactions
F. transaction processing

Which statement describes an example of an application using streaming data?


A. One-time export and import of a database.
B. An application evaluating sensor data in real-time.
C. A web application that supports 10,000 users.
D. A system that stores many records in a database.

What are two ways the command-line parameters for a Sqoop invocation can be simplified?
A. Use the --import command line argument.
B. Run Sqoop using the vi editor.
C. Place the commands in a file.
D. Include the --options-file command line argument.

What does the split-by parameter tell Sqoop?


A. The number of rows to commit per transaction.
B. The table name to export from the database.
C. The number of rows to send to each mapper.
D. The column to use as the primary key.

How can a Sqoop invocation be constrained to only run one mapper?


A. Use the --single parameter.
B. Use the --limit mapper=1 parameter.
C. Use the -m 1 parameter.
D. Use the -mapper 1 parameter.

Which is the Java class prefix for the MapReduce v1 APIs?


A. org.apache.mapreduce
B. org.apache.mr
C. org.apache.hadoop.mr
D. org.apache.hadoop.mapred

Which statement is true about the Combiner phase of the MapReduce architecture?
A. It aggregates all input data before it goes through the Map phase.
B. It reduces the amount of data that is sent to the Reducer task nodes.
C. It determines the size and distribution of data split in the Map phase.
D. It is performed after the Reducer phase to produce the final output.

What command is used to list the "magic" commands in Jupyter?
A. %dirmagic
B. %list-all-magic
C. %lsmagic
D. %list-magic

What is a markdown cell used for in a data science notebook?


A. Configuring data connections.
B. Documenting the computational process
C. Holding the output of a computation.
D. Writing code to transform data.

Why might a data scientist need a particular kind of GPU (graphics processing unit)?
A. To collect video for use in streaming data applications.
B. To perform certain data transformations quickly.
C. To display a simple bar chart of data on the screen.
D. To input commands to a data science notebook

What does the user interface for Jupyter look like to a user?
A. Common desktop app.
B. App in web browser.
C. Database interface.
D. Linux SSH session.

Using the Java SQL Shell, which command will connect to a database called mybigdata?
A. ./jsqsh mybigdata
B. ./java mybigdata
C. ./jsqsh go mybigdata
D. ./java tables

You need to enable impersonation. Which two properties in the bigsql-conf.xml file need to be marked true?
Select the TWO answers that apply
A. DB2COMPOPT
B. bigsql.alltables.io.doAs
C. DB2_ATS_ENABLE
D. bigsql.impersonation.create.table.grant.public
E. $BIGSQL_HOME/conf

Which directory permissions need to be set to allow all users to create their own schema?
A. 666
B. 700
C. 777
D. 755

You are creating a new table and need to store it in the Parquet format. Which partial SQL statement would
create the table in Parquet format?
A. STORED AS parquet
B. CREATE AS parquetfile
C. STORED AS parquetfile
D. CREATE AS parquet

Which definition best describes RCAC?
A. It grants or revokes certain directory privileges.
B. It limits access by using views and stored procedures.
C. It grants or revokes certain user privileges.
D. It limits the rows or columns returned based on certain criteria.

Which two commands would you use to give or remove certain privileges to/from a user?
A. SELECT
B. INSERT
C. REVOKE
D. LOAD
E. GRANT

When connecting to an external database in a federation, you need to use the correct database driver and
protocol. What is this federation component called in Big SQL?
A. Wrapper
B. Data source
C. Nickname
D. User mapping

What is an advantage of the ORC file format?


A. Big SQL can exploit advanced features
B. Supported by multiple I/O engines
C. Data interchange outside Hadoop
D. Efficient compression

How many Big SQL management nodes do you need at minimum?


A. 2
B. 1
C. 3
D. 4

Which statement best describes a Big SQL database table?


A. A container for any record format.
B. A data type of a column describing its value.
C. A directory with zero or more data files.
D. The defined format and rules around a delimited file.

What are Big SQL database tables organized into?


A. Directories
B. Hives
C. Schemas
D. Files

You need to determine the permission setting for a new schema directory. Which tool would you use?
A. GRANT
B. umask
C. HDFS
D. Kerberos

Where does the unstructured data of a project reside in Watson Studio?
A. Wrapper
B. Tables
C. Database
D. Object Storage

What is the architecture of Watson Studio centered on?


A. Collaborators
B. Projects
C. Data Assets
D. Analytic Assets

Which type of cell can be used to document and comment on a process in a Jupyter notebook?
A. Kernel
B. Markdown
C. Output
D. Code

Before you create a Jupyter notebook in Watson Studio, which two items are necessary?
A. Project
B. URL
C. Spark Instance
D. File
E. Scala

Which component of the HDFS architecture regulates client access to files?


A. SlaveNode
B. WorkerNode
C. DataNode
D. NameNode

Select the storage with the biggest strength in working with unstructured, infrequently modified, and remotely
accessed data.
A. DB2 Storage
B. File Storage
C. Object Storage
D. Block Storage

Which machine learning approach detects patterns and relationships between data without using labeled data?
A. Supervised Learning
B. Reinforcement Learning
C. Semi-supervised Learning
D. Unsupervised Learning

Which tool would you use to create a connection to your Db2 Big SQL database?
A. Db2 Big SQL console
B. Scheduler
C. Jupyter
D. Ambari

What is a "magic" command used for in Jupyter?
A. Running common statistical analyses.
B. Parsing and loading data into a notebook.
C. Extending the core language with shortcuts.
D. Autoconfiguring data connections using a registry.

You need to add a collaborator to your project. What do you need?


A. The email address of the collaborator
B. Your project ID
C. A list of your saved bookmarks
D. The list of deployments

What are two common issues in distributed systems?


A. Distributed systems are harder to scale up.
B. Reduced performance when compared to a single server.
C. Finding a particular node within the cluster.
D. Partial failure of the nodes during execution.

Which feature makes Apache Spark much easier to use than MapReduce?
A. Suitable for transaction processing.
B. APIs for Scala, Python, C++, and .NET.
C. Applications run in-memory.
D. Libraries that support SQL queries.

What are three IBM value-add components to the Hortonworks Data Platform (HDP)?
A. Big SQL
B. Big Data
C. Big YARN
D. Big Replicate
E. Big Index
F. Big Match

What two security functions does Apache Knox provide?


A. Database field access auditing.
B. Proxying services.
C. Management of Kerberos in the cluster.
D. API and perimeter security.

Which two are valid watches for ZNodes in ZooKeeper?


A. Node Deleted
B. Node Refreshed
C. Node ChildrenChanged
D. Node Expired

Which Big SQL feature allows users to join a Hadoop data set to data in external databases?
A. Grant/Revoke privileges
B. Impersonation
C. Integration
D. Fluid query

End.
