Question 1
Complete
Question 2
Complete
model.fit(mydata)
Question 3
Complete
https://elearning.novaims.unl.pt/mod/quiz/review.php?attempt=89191&cmid=55154 1/13
1/6/23, 7:53 PM ABD22 1st Exam - 6 January: Attempt review
Question 4
Complete
What is a DataFrame?
Question 5
Complete
Question 6
Complete
a. Structured Streaming is for structured streaming data processing and Spark Streaming is for unstructured streaming data processing
b. Spark Streaming is the new ASF library for streaming data and Structured Streaming is the old one
c. Structured Streaming is a stream processing engine and Spark Streaming is an extension of the core Spark API for streaming data processing
d. Structured Streaming relies on micro-batches and RDDs while Spark Streaming relies on DataFrames and Datasets
Question 7
Complete
Question 8
Complete
Question 9
Complete
Question 10
Complete
a. 98520
b. 53940
c. 12434
d. 10365
Question 11
Complete
Consider the friends GraphFrame that we viewed in class, which can be created with the following code:
from graphframes import *
from graphframes.examples import Graphs
g = Graphs(spark).friends()
a. 6
b. 4
c. 7
d. 5
Question 12
Complete
a. Execution will take some time because it needs to be sent to the worker nodes
Question 13
Complete
In a Databricks notebook, to access the cluster driver node console, what magic command is used?
a. %fs
b. %drive
c. dbutils.fs.mount()
d. %sh
Question 14
Complete
a. 933455678
b. 123478543
c. 212135217
d. 523135234
Question 15
Complete
Question 16
Complete
Question 17
Complete
Question 18
Complete
a. spark.range(10)
b. sc.textFile("mydata")
c. spark.dataFrame("mydata")
d. sc.parallelize("mydata")
Question 19
Complete
1. Load the file 'poems.txt' from the technical resources folder of the Moodle ABD class page into the Databricks file system and create an RDD with it.
2. How many words are in the file "poems.txt"?
a. 245
b. 98
c. 232
d. 124
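The counting step in this question can be sketched in plain Python. This is a minimal sketch, not the course's reference solution: it assumes that words are whitespace-separated tokens, and the sample lines are made up for illustration (they are not the contents of poems.txt). On an RDD created with sc.textFile(), the same logic would be a flatMap followed by count(), as noted in the comment.

```python
def count_words(lines):
    """Count whitespace-separated tokens across an iterable of text lines."""
    # RDD equivalent (assuming `rdd` was created with sc.textFile):
    #   rdd.flatMap(lambda line: line.split()).count()
    return sum(len(line.split()) for line in lines)


# Hypothetical sample lines, not the real file contents.
sample_lines = ["the rose is red", "", "the violet is blue"]
total_words = count_words(sample_lines)
print(total_words)
```

A different tokenization rule (for example, stripping punctuation first) could give a different total, so the rule used should match the one taught in class.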
Question 20
Complete
a. 3
b. 9
c. 0
d. 2
Question 21
Complete
a. 142
b. 84
c. 234
d. 185
Question 22
Complete
2. What is the sum of the "price" where color = 'E' and cut = 'Premium'?
a. 9381456
b. 1287909
c. 2288145
d. 8270443
Question 23
Complete
a. Wide transformations are very efficient because they don’t move data from the node
b. Narrow transformations are very efficient because they don’t move data from the node
c. Both wide and narrow transformations move data from the node
d. Neither narrow nor wide transformations move data from the node
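The difference between narrow and wide transformations can be illustrated with a small plain-Python simulation. This is a simplified sketch under stated assumptions, not Spark's actual implementation: partitions are modelled as Python lists, the helper names narrow_map and wide_reduce_by_key are hypothetical, and the CRC32-based partitioner is an illustrative choice. Real Spark also combines values on each node before shuffling, which this sketch leaves out.

```python
import functools
import zlib
from collections import defaultdict


def narrow_map(partitions, fn):
    """Narrow: each output partition is computed from one input partition only."""
    return [[fn(record) for record in part] for part in partitions]


def wide_reduce_by_key(partitions, fn, num_partitions):
    """Wide: records sharing a key must be brought together (a shuffle)."""
    shuffled = [defaultdict(list) for _ in range(num_partitions)]
    moved = 0
    for source_index, part in enumerate(partitions):
        for key, value in part:
            # Deterministic, illustrative partitioner.
            target = zlib.crc32(key.encode()) % num_partitions
            if target != source_index:
                moved += 1  # this record leaves its original partition
            shuffled[target][key].append(value)
    reduced = [
        {key: functools.reduce(fn, values) for key, values in bucket.items()}
        for bucket in shuffled
    ]
    return reduced, moved


# Hypothetical data: two partitions of (key, value) pairs.
partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]

scaled = narrow_map(partitions, lambda kv: (kv[0], kv[1] * 10))
reduced, moved = wide_reduce_by_key(partitions, lambda x, y: x + y, 2)

# Merge the per-partition results only to inspect the totals.
totals = {key: value for bucket in reduced for key, value in bucket.items()}
print(scaled, totals, moved)
```

In the narrow case no record leaves its partition. In the wide case the two "a" records start in different partitions, so at least one of them has to move before they can be summed.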
Question 24
Complete
1. Load the file 'poems.txt' from the technical resources folder of the Moodle ABD class page into the Databricks file system and create an RDD with it.
2. How many distinct words are in the file "poems.txt"?
a. 152
b. 53
c. 131
d. 174
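Counting distinct words follows the same pattern, with a deduplication step before the count. The sketch below is in plain Python and assumes case-sensitive, whitespace-separated tokens; the sample lines are invented and are not the contents of poems.txt. On an RDD the equivalent chain would be flatMap, distinct, then count, as noted in the comment.

```python
def count_distinct_words(lines):
    """Count unique whitespace-separated tokens (case-sensitive)."""
    # RDD equivalent (assuming `rdd` was created with sc.textFile):
    #   rdd.flatMap(lambda line: line.split()).distinct().count()
    return len({word for line in lines for word in line.split()})


# Hypothetical sample lines, not the real file contents.
sample_lines = ["the rose is red", "the violet is blue", "The End"]
distinct_words = count_distinct_words(sample_lines)
print(distinct_words)
```

Because the comparison is case-sensitive, "the" and "The" count as two different words here; lowercasing each token first would merge them.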
Question 25
Complete
a. 3.21
b. 2.17
c. 5.46
d. 1.53
Question 26
Complete
Question 27
Complete
Note: ignore the decimal places when checking the answer
a. 7627
b. 2121
c. 1051
d. 1134
Question 28
Complete
What kind of table will be created with the Spark statement below?
df.write.saveAsTable("table_name")
a. A Managed table
b. An Unmanaged table
d. A semi-managed table
Question 29
Complete
What is the output object type that results from applying a map() function to an RDD that was created from a text file with the sc.textFile() method?
a. String
b. Tuple
c. List
d. Dictionary
Question 30
Complete
1. Load the file 'd2buy.csv' from the technical resources folder of the Moodle ABD class page into the Databricks file system and create a DF with it.
2. Join the diamonds DF ('/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv') with the 'd2buy' DF, using the join condition diamonds._c0 == d2buy.d_id.
3. Calculate the sum of the prices of the diamonds that have the value 'Y' in the field 'd2buy' of the 'd2buy' DF and 'color' = 'E'.
a. 2401
b. 2089
c. 2101
d. 2826
Question 31
Complete
1. Load the file 'd2buy.csv' from the technical resources folder of the Moodle ABD class page into the Databricks file system and create a DF with it.
2. Join the diamonds DF ('/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv') with the 'd2buy' DF, using the join condition diamonds._c0 == d2buy.d_id.
3. Calculate the sum of the prices of the diamonds that have the value 'Y' in the field 'd2buy' of the 'd2buy' DF and the word 'Good' in the column 'cut'.
a. 64304
b. 93489
c. 32604
d. 11534
Question 32
Complete
Consider the friends GraphFrame that we viewed in class, which can be created with the following code:
from graphframes import *
from graphframes.examples import Graphs
g = Graphs(spark).friends()
a. c
b. b
c. e
d. f
Question 33
Complete
model.transform(mydata)
Question 34
Complete
Question 35
Complete
a. 50
b. 500
c. 100
d. 200
Question 36
Complete
a. It’s a function defined without a name and with only one parameter
b. It’s a function defined without a name and with only one expression
c. It’s a function that can be reused with many parameters
d. It’s a function that can be reused with many expressions
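The options above appear to describe Python's lambda expressions; this is an assumption, since the question stem is not shown. The short sketch below illustrates the language rule: a lambda has no name of its own, may take any number of parameters, and has a body consisting of a single expression. The variable names are chosen only for illustration.

```python
# One parameter, one expression.
square = lambda x: x * x

# Several parameters are allowed; the body is still a single expression.
add = lambda a, b: a + b

# A common use: passing a short anonymous function as an argument.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))

print(square(4), add(2, 3), by_length)
```

In PySpark code the same construct is typically passed straight to methods such as map() or filter() rather than being bound to a variable.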
Question 37
Complete
lr = LogisticRegression(maxIter=10)
Question 38
Complete
What is the data type that results from the following Spark instruction: spark.range(10)?
a. A list
b. A tuple
c. An RDD
d. A DataFrame
Question 39
Complete
d. reduceByKey() is an action
Question 40
Complete
1. Load the file 'poems.txt' from the technical resources folder of the Moodle ABD class page into the Databricks file system and create an RDD with it.
2. How many lines are in the file "poems.txt" with the word "the"?
a. 3
b. 8
c. 15
d. 11
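Counting the lines that contain a given word is a filter followed by a count. The plain-Python sketch below assumes a case-sensitive, whole-word match on whitespace-separated tokens; the question does not say which matching rule is intended, and the sample lines are invented rather than taken from poems.txt. A substring test is shown alongside because it can give a different result.

```python
def count_lines_with_word(lines, word):
    """Count lines in which `word` appears as a whitespace-separated token."""
    # RDD equivalent under the same whole-word assumption:
    #   rdd.filter(lambda line: word in line.split()).count()
    return sum(1 for line in lines if word in line.split())


# Hypothetical sample lines, not the real file contents.
sample_lines = ["the rose is red", "another day", "over the hill", "then we left"]

whole_word_count = count_lines_with_word(sample_lines, "the")

# Substring matching also counts "another" and "then".
substring_count = sum(1 for line in sample_lines if "the" in line)

print(whole_word_count, substring_count)
```

The two counts differ on this sample, so the choice between token and substring matching should be made deliberately before comparing a result with the listed options.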