
Big Data Computing - Unit 4 - Week-3 (https://onlinecourses-archive.nptel.ac.in/noc19_...)



Unit 4 - Week-3

Assignment-3
The due date for submitting this assignment has passed. Due on 2019-03-20, 23:59 IST.
As per our records you have not submitted this assignment.

1) In Spark, a ______________________ is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost. 1 point

Spark Streaming

Resilient Distributed Dataset (RDD)

FlatMap

Driver

No, the answer is incorrect.
Score: 0
Accepted Answers:
Resilient Distributed Dataset (RDD)

2) Given the following definition of the join transformation in Apache Spark: 1 point

def join[W](other: RDD[(K, W)]): RDD[(K, (V, W))]

The join operation is used for joining two datasets. When it is called on datasets of type (K, V) and (K, W), it returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key.

Output the result of joinrdd when the following code is run:

val rdd1 = sc.parallelize(Seq(("m",55),("m",56),("e",57),("e",58),("s",59),("s",54)))
val rdd2 = sc.parallelize(Seq(("m",60),("m",65),("s",61),("s",62),("h",63),("h",64)))
val joinrdd = rdd1.join(rdd2)
joinrdd.collect

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (h,(63,64)), (s,(54,61)), (s,(54,62)))

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (e,(57,58)), (s,(54,61)), (s,(54,62)))

© 2014 NPTEL - Privacy & Terms - Honor Code - FAQs -



1 of 4 Friday 21 June 2019 10:16 AM


Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (s,(54,61)), (s,(54,62)))

No, the answer is incorrect.
Score: 0
Accepted Answers:
Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (s,(54,61)), (s,(54,62)))
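The result can be checked without a cluster. Below is a minimal plain-Scala sketch of the pair-wise join semantics on ordinary Seqs (no SparkContext; `joinPairs` is an illustrative helper, not Spark's API):

```scala
object JoinSketch {
  // Emulate RDD[(K, V)].join(RDD[(K, W)]) on plain Seqs: keep only keys
  // present in BOTH datasets, pairing every value combination per key.
  def joinPairs[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    for {
      (kl, v) <- left
      (kr, w) <- right
      if kl == kr
    } yield (kl, (v, w))

  val rdd1 = Seq(("m", 55), ("m", 56), ("e", 57), ("e", 58), ("s", 59), ("s", 54))
  val rdd2 = Seq(("m", 60), ("m", 65), ("s", 61), ("s", 62), ("h", 63), ("h", 64))

  // "e" appears only in rdd1 and "h" only in rdd2, so neither survives the join.
  val joined: Seq[(String, (Int, Int))] = joinPairs(rdd1, rdd2)
}
```

Because "e" and "h" each occur in only one dataset, they produce no output pairs, which is why the option containing (e,(57,58)) or (h,(63,64)) cannot be the result.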

3) Consider the following statements in the context of Spark: 1 point

Statement 1: Spark also gives you control over how you can partition your Resilient Distributed Datasets (RDDs).

Statement 2: Spark allows you to choose whether you want to persist the Resilient Distributed Dataset (RDD) onto disk or not.

Only statement 1 is true

Only statement 2 is true

Both statements are true

Both statements are false

No, the answer is incorrect.
Score: 0
Accepted Answers:
Both statements are true

4) ______________ leverages Spark Core's fast scheduling capability to perform streaming analytics. 1 point

MLlib

Spark Streaming

GraphX

RDDs

No, the answer is incorrect.
Score: 0
Accepted Answers:
Spark Streaming

5) ____________________ is a distributed graph processing framework on top of Spark. 1 point

MLlib

Spark Streaming

GraphX

All of the mentioned

No, the answer is incorrect.
Score: 0
Accepted Answers:
GraphX

6) Consider the following statements: 1 point

Statement 1: Scale out means grow your cluster capacity by replacing it with more powerful machines.

Statement 2: Scale up means incrementally grow your cluster capacity by adding more COTS (Components Off the Shelf) machines.

Only statement 1 is true

Only statement 2 is true

Both statements are true


Both statements are false

No, the answer is incorrect.
Score: 0
Accepted Answers:
Both statements are false

7) Which of the following is not a NoSQL database? 1 point

HBase

SQL Server

Cassandra

None of the mentioned

No, the answer is incorrect.
Score: 0
Accepted Answers:
SQL Server

8) Which of the following are the simplest NoSQL databases? 1 point

Key-value

Wide-column

Document

All of the mentioned

No, the answer is incorrect.
Score: 0
Accepted Answers:
Key-value

9) Point out the incorrect statement in the context of Cassandra: 1 point

It was originally designed at Facebook

It is a centralized key-value store

It is designed to handle large amounts of data across many commodity servers, providing
high availability with no single point of failure.

It uses a ring-based DHT (Distributed Hash Table) but without finger tables or routing

No, the answer is incorrect.
Score: 0
Accepted Answers:
It is a centralized key-value store
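For intuition on the ring-based DHT mentioned in the last option, here is a hypothetical plain-Scala sketch of token-ring placement. The node names, tokens, and the 0-99 ring are invented for illustration; this is not Cassandra's actual code or hash function:

```scala
import scala.collection.immutable.TreeMap

object RingSketch {
  // Three nodes with fixed tokens on a 0-99 ring (names and tokens are made up).
  val ring: TreeMap[Int, String] = TreeMap(20 -> "nodeA", 55 -> "nodeB", 90 -> "nodeC")

  // A key belongs to the first node whose token is >= the key's token,
  // wrapping around to the lowest token when no such node exists.
  def coordinatorForToken(token: Int): String = {
    val it = ring.iteratorFrom(token)
    if (it.hasNext) it.next()._2 else ring.head._2
  }

  // Toy hash mapping a key onto the ring; real systems use e.g. Murmur3.
  def coordinatorFor(key: String): String =
    coordinatorForToken(math.abs(key.hashCode) % 100)
}
```

The wrap-around in `coordinatorForToken` is what makes the token space a ring rather than a line: every possible token resolves to some node without finger tables or multi-hop routing, which matches the "ring-based DHT but without finger tables or routing" description above.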
