Action and Transformations (Wide and Narrow)
An action reads all of the data from the file, so it is very expensive: all of that data is
written to memory.
Transformations
In a narrow transformation there is no data dependency across partitions, so shuffling is not
required: the data in RDD1 on node1 moves directly to RDD2 on node1.
Example: If we want to keep only the odd numbers, we just apply the filter transformation.
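The idea of a narrow transformation can be sketched in plain Python, without a Spark cluster. This is only a model, assuming two made-up partitions held as lists: each output partition is computed from exactly one input partition, so nothing crosses between them. In PySpark the equivalent call would be rdd.filter(lambda x: x % 2 == 1).

```python
# Plain-Python model of a narrow transformation (not real Spark code).
# Each inner list stands for one partition of a hypothetical RDD1.
rdd1_partitions = [[1, 2, 3, 4], [5, 6, 7, 8]]


def is_odd(x):
    """Filter criterion: keep odd numbers only."""
    return x % 2 == 1


# Each partition is filtered on its own; no element moves to another partition.
rdd2_partitions = [[x for x in part if is_odd(x)] for part in rdd1_partitions]

print(rdd2_partitions)  # [[1, 3], [5, 7]]
```

The number of partitions stays the same before and after, which is what makes the operation cheap compared with a shuffle.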
Map:
Map applies a function to every element of an RDD. The output has the same number of elements
and partitions as the input.
Example: If we want to multiply all elements by 2, we create a lambda function and map it over
the RDD.
Num is the RDD.
.map is a transformation, not an action.
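A minimal sketch of the multiply-by-2 example, written in plain Python so that it runs without Spark. The name num follows the notes; the sample values are made up. With a real RDD the call would be num.map(lambda x: x * 2), and the result would only be computed once an action such as collect() is called.

```python
# Plain-Python stand-in for the RDD called num (sample values are assumed).
num = [1, 2, 3, 4, 5]

# The lambda function that will be mapped over every element.
double = lambda x: x * 2

# Python's built-in map mirrors the idea of RDD.map: one output per input.
doubled = list(map(double, num))

print(doubled)  # [2, 4, 6, 8, 10]
```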
Filter
Filter returns a new dataset that contains only the elements of the source that satisfy a filter
criterion. For example, we can use it to select the odd values, the even values, or the multiples
of a number.
Example: filter the words that start with the letter "B".
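The "starts with B" filter could look roughly like this. It is a plain-Python sketch with an invented word list; on a real RDD of words the equivalent call would be words.filter(lambda w: w.startswith("B")).

```python
# Plain-Python stand-in for an RDD of words (the word list is assumed).
words = ["Banana", "Apple", "Blueberry", "Cherry", "Bread"]

# Filter criterion: keep a word only if it starts with the letter "B".
starts_with_b = lambda w: w.startswith("B")

# Python's built-in filter mirrors RDD.filter: elements that fail are dropped.
b_words = list(filter(starts_with_b, words))

print(b_words)  # ['Banana', 'Blueberry', 'Bread']
```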
Union
Union combines the data from both RDDs into one RDD.
Sample
Wide Transformation
GroupBy:
We apply groupBy to one dataset and pass it a lambda function.
lambda x: x[0] takes the first letter of each element and groups by that letter, so all names
starting with the letter B end up in one group, and so on for the other letters.
The result is a set of key and value pairs. To check it, we can use a for loop:
for (k, v) in the new RDD dataset, print each pair.
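The grouping can be modelled in plain Python as follows. This is a sketch rather than Spark code: the names and the two partitions are invented. It also shows why groupBy is a wide transformation, since names with the same first letter can sit in different partitions and have to be brought together. In PySpark the call would be rdd.groupBy(lambda x: x[0]), where each value is an iterable that is usually wrapped in list(v) before printing.

```python
from collections import defaultdict

# Two hypothetical partitions of an RDD of names (sample data is assumed).
name_partitions = [["Bob", "Alice"], ["Bella", "Carl", "Anna"]]

# Key function from the notes: group by the first letter of each name.
first_letter = lambda x: x[0]

# Model of the shuffle: elements from every partition are collected by key.
groups = defaultdict(list)
for partition in name_partitions:
    for name in partition:
        groups[first_letter(name)].append(name)

# Check the result by looping over the (key, value) pairs.
for k, v in sorted(groups.items()):
    print(k, v)
# A ['Alice', 'Anna']
# B ['Bob', 'Bella']
# C ['Carl']
```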
Intersection: To find the records common to two RDDs we can use intersection.
Order doesn't matter: whichever RDD we mention first, it gives the common records, like an inner
join.
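A small plain-Python sketch of the same idea, using sets in place of RDDs (the sample values are assumed). In PySpark the call would be rdd1.intersection(rdd2), and its output contains no duplicate elements.

```python
# Plain-Python stand-ins for two RDDs (sample values are assumed).
rdd1 = [1, 2, 3, 4, 5]
rdd2 = [4, 5, 6, 7]

# Set intersection keeps only the records present in both datasets.
common_1_2 = sorted(set(rdd1) & set(rdd2))
common_2_1 = sorted(set(rdd2) & set(rdd1))

print(common_1_2)  # [4, 5]
print(common_2_1)  # [4, 5]  (same result with the order swapped)
```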
Subtract
Distinct