RDDs support several kinds of join. We will look at each of them one by one.
Normal Join :- It outputs only the records whose key is present in both RDDs (an inner join), pairing each such key with its values from both sides.
Example :- Suppose the RDD 'GOALS' holds a list of football players with the number of goals each has scored, and the RDD 'MATCH' holds the number of matches each has played.
RDD GOALS
Player Name Goals Scored
Messi 71
Ronaldo 77
Pele 59
Zidane 42
Drogba 27
RDD MATCH
Player Name Matches Played
Messi 163
Ronaldo 171
Pele 142
Zidane 91
Roonie 183
The join is performed on the key: when a key is present in both RDDs, the output holds that key together with its values from both. Drogba (present only in GOALS) and Roonie (present only in MATCH) are therefore left out.
Output :-
Messi 71 163
Ronaldo 77 171
Pele 59 142
Zidane 42 91
Python Code Snippet :-