Big Data Computing
Unit 4 - Week-3
Assignment-3
The due date for submitting this assignment has passed.
As per our records you have not submitted this assignment. Due on 2019-03-20, 23:59 IST.

1) In Spark, a ______________ is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost. 1 point
2) Given the following definition of the join transformation in Apache Spark: 1 point

def join[W](other: RDD[(K, W)]): RDD[(K, (V, W))]

The join operation joins two datasets. When called on datasets of type (K, V) and (K, W), it returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key.
Output the result of joinrdd when the following code is run.

val rdd1 = sc.parallelize(Seq(("m",55),("m",56),("e",57),("e",58),("s",59),("s",54)))
val rdd2 = sc.parallelize(Seq(("m",60),("m",65),("s",61),("s",62),("h",63),("h",64)))
val joinrdd = rdd1.join(rdd2)
joinrdd.collect

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (h,(63,64)), (s,(54,61)), (s,(54,62)))
Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (e,(57,58)), (s,(54,61)), (s,(54,62)))

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (s,(54,61)), (s,(54,62)))
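The join semantics above can be checked in plain Scala with no Spark at all: `join` on pair RDDs is an inner join, so keys that appear in only one dataset produce no output pairs. The sketch below simulates this on ordinary Seqs; `JoinSketch` and its `join` helper are hypothetical names for illustration, not part of the Spark API.

```scala
// Minimal sketch of RDD.join's inner-join semantics on plain Seqs.
object JoinSketch {
  def join[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    for {
      (k, v)  <- left
      (k2, w) <- right
      if k == k2 // keep only pairs whose keys match in both datasets
    } yield (k, (v, w))

  def main(args: Array[String]): Unit = {
    val rdd1 = Seq(("m", 55), ("m", 56), ("e", 57), ("e", 58), ("s", 59), ("s", 54))
    val rdd2 = Seq(("m", 60), ("m", 65), ("s", 61), ("s", 62), ("h", 63), ("h", 64))
    // "e" occurs only in rdd1 and "h" only in rdd2, so neither key
    // appears in the inner-join output; "m" and "s" each yield the
    // cross product of their values (4 pairs each, 8 pairs total).
    println(join(rdd1, rdd2))
  }
}
```

Note that each matching key contributes all value combinations, which is why two "m" values on each side yield four (m, (_, _)) pairs.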
3) Consider the following statements:

Statement 1: Spark also gives you control over how you can partition your Resilient Distributed Datasets (RDDs).
Statement 2: Spark allows you to choose whether you want to persist a Resilient Distributed Dataset (RDD) onto disk or not.
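Both statements correspond to concrete RDD methods. The fragment below is in the style of the quiz's own REPL snippets: it assumes an active SparkContext `sc` and Spark on the classpath, so it is a sketch rather than a standalone program.

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.storage.StorageLevel

val pairs = sc.parallelize(Seq(("m", 55), ("s", 59), ("e", 57)))

// Statement 1: you control the partitioning, e.g. hash-partition into 4 partitions.
val partitioned = pairs.partitionBy(new HashPartitioner(4))

// Statement 2: you choose whether and where to persist, e.g. disk only.
partitioned.persist(StorageLevel.DISK_ONLY)
```

`partitionBy` and `persist(StorageLevel.DISK_ONLY)` are standard pair-RDD operations; the sample data here is illustrative.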
4) ______________ leverages Spark Core's fast scheduling capability to perform streaming analytics. 1 point
MLlib
Spark Streaming
GraphX
RDDs
5) Consider the following statements:

Statement 1: Scale out means grow your cluster capacity by replacing it with more powerful machines.
Statement 2: Scale up means incrementally grow your cluster capacity by adding more COTS (Components Off the Shelf) machines.
HBase
SQL Server
Cassandra
Key-value
Wide-column
Document
It is designed to handle large amounts of data across many commodity servers, providing
high availability with no single point of failure.
It uses a ring-based DHT (Distributed Hash Table) but without finger tables or routing