
Thapar Institute of Engineering and Technology, Patiala

Department of Electronics and Communication Engineering


BE ENC (VII Semester) MST (September 2022) 11E735: Big Data Analytics
Time: 02 Hours; MM: 35
Faculty: Dr. Debayani Ghosh, Dr. Arnab Pattanayak

Note: Answer all questions

Q1. Suppose we have two tables: the first contains an employee's personal information, primary-keyed on SSN, and the second contains the employee's income, again keyed on SSN. The data is as follows:

Input Data:

Table 1: (SSN, (Personal Information))

111222: (Stephen; Sacramento, CA)
333444: (Edward; San Diego, CA)
555666: (John; San Diego, CA)

Table 2: (SSN, (year, income))

111222: (2016, $70000), (2015, $65000), (2014, $6000), ...
333444: (2016, $72000), (2015, $70000), (2014, $6000), ...
555666: (2016, $80000), (2015, $85000), (2014, $7500), ...

(a) Write a first stage Map-Reduce programming model to get the following output from the input
data: (3.5 marks)

(SSN, (City, Income in 2016))

(b) Write a second-stage Map-Reduce programming model to compute the average income in each city
in 2016. The second stage takes as input the output generated by the first stage. (3.5 marks)
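One possible answer can be sketched in plain Python, with dictionaries standing in for the framework's shuffle. This is a sketch, not the official solution; the 2016 income is assumed to be the entry tagged with year 2016 in each Table 2 record.

```python
from collections import defaultdict

table1 = {111222: ("Stephen", "Sacramento, CA"),
          333444: ("Edward", "San Diego, CA"),
          555666: ("John", "San Diego, CA")}
table2 = {111222: [(2016, 70000), (2015, 65000)],
          333444: [(2016, 72000), (2015, 70000)],
          555666: [(2016, 80000), (2015, 85000)]}

# Stage 1 map: emit records keyed on SSN, tagged with their source.
stage1 = defaultdict(list)
for ssn, (name, city) in table1.items():
    stage1[ssn].append(("city", city))
for ssn, incomes in table2.items():
    for year, income in incomes:
        if year == 2016:
            stage1[ssn].append(("income", income))

# Stage 1 reduce: join the tagged values per SSN -> (SSN, (City, Income)).
joined = {}
for ssn, vals in stage1.items():
    d = dict(vals)
    joined[ssn] = (d["city"], d["income"])

# Stage 2 map: re-key on city; Stage 2 reduce: average income per city.
by_city = defaultdict(list)
for ssn, (city, income) in joined.items():
    by_city[city].append(income)
averages = {city: sum(v) / len(v) for city, v in by_city.items()}
```

On this data the stage-2 output is Sacramento at $70000 and San Diego at the mean of $72000 and $80000.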

Q2. (a) Consider the list [2,4,5,6,7,8,9]. Use the necessary RDD transformations to print the
numbers that are divisible by 2 or divisible by 3. (3.5 marks)
(b) Consider two RDDs created from the lists ['orange', 'mango', 'apple', 'grapes', 'orange'] and
['green', 'red', 'yellow']. Perform the necessary transformations and actions to generate the following:
(3.5 marks)

['orange', 'mango', 'green', 'red', 'yellow', 'apple', 'grapes']
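A sketch of the required logic in plain Python; the comments show the corresponding PySpark RDD calls, assuming a SparkContext `sc` is available.

```python
# (a) sc.parallelize([2, 4, 5, 6, 7, 8, 9]) \
#       .filter(lambda x: x % 2 == 0 or x % 3 == 0).collect()
nums = [2, 4, 5, 6, 7, 8, 9]
divisible = [x for x in nums if x % 2 == 0 or x % 3 == 0]

# (b) rdd1.union(rdd2).distinct().collect()
# union() keeps the duplicate 'orange', so distinct() is needed;
# note that the order of distinct()'s output is not guaranteed.
fruits = ['orange', 'mango', 'apple', 'grapes', 'orange']
colours = ['green', 'red', 'yellow']
merged = list(dict.fromkeys(fruits + colours))  # union + de-duplicate
```

The expected answer in (b) should therefore be compared as a set of elements rather than by position.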

Q3. Consider the following data:


id    balance  ratio
21    2143     261
441   29       15
331   2060
581   1596     51
331   569      195
351   231      1?8
(a) Write an algorithm to find the outliers in the above dataframe. (3.5 marks)
(b) Find the mean of each column after removing the rows corresponding to the outliers. (3.5 marks)
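One standard approach is Tukey's IQR rule. A minimal sketch in plain Python follows; the `rows` data here is a hypothetical stand-in (the table in the question is not fully legible), and the choice to drop a row when any column is outlying is an assumption.

```python
from statistics import quantiles, mean

def iqr_bounds(values):
    # Tukey's rule: a value is an outlier if it falls outside
    # [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def drop_outlier_rows(rows):
    # Drop a row if ANY of its columns holds an outlying value.
    cols = list(zip(*rows))
    bounds = [iqr_bounds(c) for c in cols]
    return [r for r in rows
            if all(lo <= v <= hi for v, (lo, hi) in zip(r, bounds))]

# Hypothetical two-column data with one extreme balance value.
rows = [(21, 2143), (44, 29), (33, 2060), (58, 1596), (33, 569), (35, 90000)]
kept = drop_outlier_rows(rows)
col_means = [mean(c) for c in zip(*kept)]
```

Here the row (35, 90000) exceeds the upper IQR bound of the second column and is removed before the per-column means are taken.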

Q4. Consider a Spark Structured Streaming program receiving streaming data for a word count. We want a
window-based aggregation, where window length = 05 seconds, and we want the output result every 03
seconds in complete mode. [Time is in hh:mm:ss format.]

If the input data stream is as follows —

Input Stream

Event-time    Received at program    Data


07:00:01 07:00:01 Deer Owl Dog
07:00:02 07:00:02 Dog
07:00:02 07:00:08 Dog Owl
07:00:04 07:00:05 Deer Owl Owl
07:00:05 07:00:05 Dog Owl
07:00:05 07:00:08 Owl Cheetah
07:00:07 07:00:08 Deer Cheetah

(a) What will be all the output tables of the word count at 07:00:03, 07:00:06 and 07:00:09? (7 marks)

Consider the following points for the output —

(1) streaming starts at 07:00:00.


(2) Index the count by both Grouping Key (word) and the Window.
(3) Consider the data only within the window interval; that is, the interval 07:00:03 to 07:00:08
covers data arriving after 07:00:03 and before 07:00:08.
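The aggregation above can be simulated in plain Python. This is a sketch of the semantics, not Spark itself: windows start at multiples of the 3-second slide from 07:00:00, each event is assigned by event-time to every window containing it, and a complete-mode trigger aggregates all data received by the trigger time. Times are seconds after 07:00:00.

```python
from collections import Counter

WINDOW, SLIDE = 5, 3  # seconds

# (event_time, received_at, words), offsets from 07:00:00.
stream = [
    (1, 1, "Deer Owl Dog"),
    (2, 2, "Dog"),
    (2, 8, "Dog Owl"),
    (4, 5, "Deer Owl Owl"),
    (5, 5, "Dog Owl"),
    (5, 8, "Owl Cheetah"),
    (7, 8, "Deer Cheetah"),
]

def windows_for(t):
    # All window start times (multiples of SLIDE, >= 0) whose
    # interval [start, start + WINDOW) contains event-time t.
    return [s for s in range(0, t + 1, SLIDE) if s <= t < s + WINDOW]

def result_table(trigger):
    # Complete-mode output at a trigger: count every event whose data
    # has been *received* by the trigger, keyed by (window start, word).
    counts = Counter()
    for event_t, received_t, words in stream:
        if received_t <= trigger:
            for s in windows_for(event_t):
                for w in words.split():
                    counts[(s, w)] += 1
    return counts
```

For example, `result_table(3)` sees only the first two rows (the rest arrive later), so its only populated window is the one starting at 07:00:00; by `result_table(9)` all three windows carry counts.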

Q5 (a). Consider the following three documents:

Doc ID  Document
Doc1    three birds, two birds
Doc2    red birds, blue birds
Doc3    two red birds

Using Map-Reduce, we want to build an inverted index showing how many times each word occurs in each
document. For example, the desired output for a word will be as below:

two (Doc1,1), (Doc3,1)

What are the inputs of the map function and the output (key, value) pairs for each of those inputs? (3 marks)
After shuffle, what does the reducer function do to generate the desired output? (1 mark)
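A minimal sketch of the map, shuffle and reduce steps in plain Python, assuming punctuation is dropped during tokenisation:

```python
from collections import defaultdict

docs = {
    "Doc1": "three birds two birds",  # commas stripped for simplicity
    "Doc2": "red birds blue birds",
    "Doc3": "two red birds",
}

def map_fn(doc_id, text):
    # Input: one (doc_id, text) pair.
    # Output: a (word, (doc_id, 1)) pair per word occurrence.
    for word in text.split():
        yield word, (doc_id, 1)

# Shuffle: group the mapper output by word.
grouped = defaultdict(list)
for doc_id, text in docs.items():
    for word, pair in map_fn(doc_id, text):
        grouped[word].append(pair)

def reduce_fn(word, pairs):
    # Sum the counts per document -> (word, [(doc_id, n), ...]).
    per_doc = defaultdict(int)
    for doc_id, n in pairs:
        per_doc[doc_id] += n
    return word, sorted(per_doc.items())

index = dict(reduce_fn(w, ps) for w, ps in grouped.items())
```

The reducer thus receives all (doc_id, 1) pairs for one word and sums them per document, reproducing entries such as the one shown for "two".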
(b) Write the output of the following program (here sc is spark.sparkContext) — (3 marks)
rdd = sc.parallelize( • ).flatMap(lambda x: [x, x*x])
rdd.cartesian(rdd).reduceByKey(lambda x, y: x + y).collect()
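The input list is elided in the question, but the pipeline can be traced in plain Python with an assumed input of [1, 2]. The ordering of collect()'s output is not guaranteed in Spark; it is sorted here for determinism.

```python
# Assumed input (the actual list is elided in the question).
data = [1, 2]

# flatMap(lambda x: [x, x*x]) flattens per-element lists.
flat = [y for x in data for y in (x, x * x)]  # [1, 1, 2, 4]

# cartesian(rdd) pairs every element with every element (16 pairs here).
pairs = [(a, b) for a in flat for b in flat]

# reduceByKey(lambda x, y: x + y) sums the second components per key.
sums = {}
for k, v in pairs:
    sums[k] = sums.get(k, 0) + v

result = sorted(sums.items())
```

With this input, each key is paired with the full flattened list (sum 8), and the key 1 occurs twice, doubling its total.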
