Data is Future: Map Reduce Part

Wednesday, March 21, 2018

Map Reduce Part - 1

History :- Map Reduce was first implemented by Google. Google's original implementation was written in C++; the widely used open-source implementation, Apache Hadoop, is written in Java.

Phases of Map Reduce :- Map Reduce is divided into two main stages, Map and Reduce, and each stage contains several sub-stages.

1.) Record Reader
2.) Mapper
3.) In Memory Sorting
4.) Merge
5.) Shuffle
6.) Reducer

Let us start our journey with the first part, the Map stage.
Record Reader :- Suppose we have a file, file.txt, containing a few lines of text.

The Record Reader turns each line into a key-value pair.

Key = 0                              Value = How are you ?
Key = 14                             Value = I am good.

Here the key is the byte offset at which the line starts, counted from 0. The first line starts at offset 0. To find the offset of the second line, we count the bytes in the first line: "How" - 3, space - 1, "are" - 3, space - 1, "you" - 3, space - 1, "?" - 1, plus 1 for the newline (\n) at the end of the line.

Adding these up gives 3+1+3+1+3+1+1+1 = 14.
So the first line occupies offsets 0 to 13, and the next line starts at the 15th character, which is byte offset 14.
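The byte offset calculation can be sketched in a few lines of Python. This is only an illustration of the idea, not Hadoop's actual record reader; the function name read_records and the sample bytes are assumptions made for this example. Offsets are counted from zero, so a first line of 14 bytes (including its newline) puts the second line at offset 14.

```python
def read_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per line of the input.

    Hypothetical helper for illustration; it only mimics the idea of a
    record reader that keys each line by its starting byte offset.
    """
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is the offset of the line's first byte within the file.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        # Advance by the full line length, including the newline byte.
        offset += len(raw_line)


# Sample content standing in for file.txt (an assumption for this sketch).
sample = b"How are you ?\nI am good.\n"
records = list(read_records(sample))
for key, value in records:
    print(key, value)
```

Each pair produced this way is what the Mapper receives as its input key and value.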

The resulting key-value pairs are then fed into the Mapper.

Mapper :- Once the input has been processed by the Record Reader, it is fed into the Mapper. The logic of the mapper depends on the program, and it can be written in whichever programming language is convenient.

To understand this, let us take the example of the word count problem.
Suppose we have the sentence "how are you I am good".

The output of the mapping process will be something like this.

how     1
are     1
you     1
i       1
am      1
good    1

So the mapper emits each word as a key with the value 1.
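A word count mapper along these lines might look like the following Python sketch. The function name map_words is hypothetical, and lowercasing each word is an assumption made so that "I" comes out as "i", matching the output shown above.

```python
def map_words(key, value):
    """Emit a (word, 1) pair for every word in one input line.

    key is the byte offset supplied by the record reader; word count
    does not need it, so it is ignored here.
    """
    for word in value.lower().split():
        yield word, 1


pairs = list(map_words(0, "how are you I am good"))
for word, count in pairs:
    print(word, count)
```

The mapper does no counting of its own; it only tags every word with a 1.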

Sorting :- Once the mapper has produced its output, that output is sorted in memory by key.
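As a rough picture of this step, the word count pairs from the example can be sorted by their key in Python. This is a simplified sketch that assumes the whole mapper output fits in one list; the variable names are made up for the illustration.

```python
# Mapper output from the word count example, in emission order.
mapper_output = [("how", 1), ("are", 1), ("you", 1),
                 ("i", 1), ("am", 1), ("good", 1)]

# Sort the pairs by key (the word) so that equal keys end up next to each other.
sorted_output = sorted(mapper_output, key=lambda pair: pair[0])

for word, count in sorted_output:
    print(word, count)
```

After sorting, the words appear in the order am, are, good, how, i, you.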

We will discuss the other components in upcoming tutorials.
