Map() :- Returns a new distributed dataset formed by passing each element of the source through a function.
Map can be considered a 1:1 relationship in Spark: each input element produces exactly one corresponding output element.
Both the input and the output of this function are RDDs.
FlatMap() :- Similar to map, but each input item can be mapped to 0 or more output items (so func should return a Seq rather than a single item; in Python, any iterable such as a list).
Both the input and the output of this function are RDDs as well.
It flattens the multiple lists returned by the function into a single list of elements.
Python Code :-
OUTPUT OF MAP:-
OUTPUT OF FLATMAP:-
Difference between MAP and FLATMAP :-
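map() returns an RDD with exactly one output element per input element (that element may itself be a list), whereas flatMap() returns an RDD containing the elements of all the iterators returned by the function, flattened by one level.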
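To make the difference concrete, here is a minimal sketch in plain Python that imitates the two behaviours on an ordinary list, so it runs without a Spark installation. The helpers `local_map` and `local_flat_map`, the function `split_words`, and the sample `lines` are made up for illustration; they are not part of Spark or of the original code for this post.

```python
from itertools import chain


def local_map(func, items):
    """Imitate map(): exactly one output element per input element."""
    return [func(item) for item in items]


def local_flat_map(func, items):
    """Imitate flatMap(): func returns an iterable for each input element,
    and the results are flattened by one level."""
    return list(chain.from_iterable(func(item) for item in items))


def split_words(line):
    """Split a line into words; an empty line yields an empty list."""
    return line.split()


# Hypothetical sample data standing in for the lines of a text file.
lines = ["hello world", "", "map versus flatMap"]

mapped = local_map(split_words, lines)
flattened = local_flat_map(split_words, lines)

print(mapped)     # [['hello', 'world'], [], ['map', 'versus', 'flatMap']]
print(flattened)  # ['hello', 'world', 'map', 'versus', 'flatMap']
```

The mapped result keeps three elements, one per input line, while the flattened result has five words and the empty line contributes nothing. In PySpark, assuming an existing SparkContext named `sc`, the corresponding calls would be `sc.parallelize(lines).map(split_words).collect()` and `sc.parallelize(lines).flatMap(split_words).collect()`.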
You can find the data file and related code in my GitHub repository :-
https://github.com/sangam92/Spark_tutorials