Action and Transformations (Wide and Narrow)

The document covers Spark RDD actions (collect, countByValue, take) and transformations (map, flatMap, filter, union, intersection, subtract, distinct, groupBy). It gives examples of each, such as using filter to find even or odd numbers, map to multiply elements by 2, and flatMap to change the number of elements. groupBy is shown applying a lambda function to group elements by key.

Uploaded by velamatiskiran


Collect action: avoid using collect() on large datasets.

It reads all the data from the file and returns it to the driver, so it is very expensive: the entire dataset is written to memory.

In the example, Name is the dataframe and sc is the SparkContext.

We apply the collect() function to the RDD.

RDDname.countByValue(): counts how many times each name or word is repeated in the list.

RDD.take(5): reads only the first 5 results. The number can be changed.

Transformations

When there is no data dependency between partitions, no shuffle is required: the data of RDD1 on node 1 moves directly to RDD2 on node 1. This is a narrow transformation.

Example: To keep only the odd numbers, we simply apply the filter transformation.
Map:

Use map to apply a function to every element of an RDD. With map, the output has the same number of elements and partitions as the input.

Example: To multiply all elements by 2, we create a lambda function and map it over the RDD.

Call .collect() to show the output.

Num is the RDD.

.map is a transformation, not an action.

The lambda takes each element a and raises it to the power 2 (for example, lambda a: pow(a, 2)).

Adding a word to every word in a list

We use the lambda function lambda a: "mr. " + a and map it over the RDD.

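FlatMap: the number of output elements need not equal the number of input elements, because each input element can produce zero or more output elements. The number of partitions stays the same.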

Filter
Filter returns a new dataset containing only the elements of the source that satisfy a given criterion. For example, we can use it to select the odd values, the even values, or the multiples of a number.

Finding even numbers in a list with filter

Filtering the words that start with the letter "B"

Union
Union combines two datasets into one.

It adds the two datasets together and keeps duplicates. Union does not sort the result.

Sample
Wide Transformation

GroupBy:

We apply groupBy to a single dataset, passing it a lambda function.

lambda x: x[0] takes the first letter of each element and uses it as the grouping key, so all names starting with the letter B end up in one group, and so on.

To check the results, we iterate over the keys and values with a for loop: for (k, v) in the collected result of the new RDD dataset, and print them.

Intersection: To find the records common to two RDDs, we can use intersection.

Order doesn't matter: we can mention the RDDs in either position and it will return the same common records, like an inner join.

Subtract

It works like rdd1 minus rdd2: it returns the elements of rdd1 that are not present in rdd2.

Distinct
