1. Transformation Operations: These operations create a new RDD from an existing one.
Examples include map(), filter(), flatMap(), and, on key-value pair RDDs, groupByKey(),
reduceByKey(), and sortByKey(). Transformations are lazy: they do not execute immediately.
Instead, Spark records each one in the RDD's lineage and runs the chain only when an action is called.
2. Action Operations: These operations trigger the execution of the pending transformations and
either return results to the driver program or write data to external storage. Examples include
reduce(), collect(), count(), take(), saveAsTextFile(), and foreach() (which runs a function on
each element for its side effects).
3. Numeric Operations: These are operations specific to RDDs of numeric values. They include
statistical functions such as mean(), sum(), max(), and min(), which are themselves actions. You can
also perform custom mathematical operations using map() or reduce().
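The distinction between lazy transformations and eager actions can be sketched without a cluster. The block below is a toy pure-Python model, not Spark's implementation: the class name `ToyRDD`, its `_lineage` field, and the `square` helper are all hypothetical and exist only for illustration. The method names mirror the RDD operations listed above, so the same chain should read almost identically in PySpark, assuming an RDD created with `sc.parallelize(...)`.

```python
from functools import reduce as _reduce


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, data, lineage=()):
        self._data = list(data)
        self._lineage = tuple(lineage)  # recorded steps, not yet executed

    # Transformations: return a new ToyRDD with a longer lineage; no work is done.
    def map(self, f):
        return ToyRDD(self._data, self._lineage + (("map", f),))

    def filter(self, pred):
        return ToyRDD(self._data, self._lineage + (("filter", pred),))

    # Internal helper: replay the recorded lineage over the source data.
    def _compute(self):
        items = iter(self._data)
        for kind, fn in self._lineage:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    # Actions: trigger the computation and return a plain Python value.
    def collect(self):
        return self._compute()

    def count(self):
        return len(self._compute())

    def reduce(self, f):
        return _reduce(f, self._compute())

    # Numeric actions.
    def sum(self):
        return sum(self._compute())

    def mean(self):
        values = self._compute()
        return sum(values) / len(values)


calls = []  # records every time the mapped function actually runs


def square(x):
    calls.append(x)
    return x * x


rdd = ToyRDD([1, 2, 3, 4, 5])
evens_squared = rdd.map(square).filter(lambda v: v % 2 == 0)

ran_before_action = len(calls)    # still 0: only the lineage was built
result = evens_squared.collect()  # action: the lineage is replayed now
total = evens_squared.sum()       # numeric action: replays the lineage again
```

In this toy model every action replays the whole lineage, which is roughly how Spark treats an RDD that has not been persisted. Calling `cache()` or `persist()` on a real RDD is the usual way to avoid repeating that work.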
By understanding these components and processes, you can effectively develop, deploy, and
manage Spark applications for various use cases.