Tuesday, June 26, 2018

HDFS - Read Architecture

In the last blog post, we went through the HDFS Write architecture. The Read architecture is quite simple and easy to understand.
Suppose that we have a file "file.txt" that needs to be read from HDFS. The steps below take place while reading the data from HDFS.

  • The client sends a read request to the Name Node, which holds the metadata for the file "file.txt".
  • The Name Node replies to the client with the IP addresses of the Data Nodes that hold the blocks of "file.txt".
  • The client connects to one of those Data Nodes and starts retrieving the data.
  • Once the client has received the required file, the connection is closed.
  • If the data comes from multiple blocks, the client combines these blocks in order to form the file.
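The steps above can be sketched as a small in-memory model. This is only a toy illustration and not the real Hadoop client API: the classes `NameNode` and `DataNode`, the function `read_file`, the block ids and the IP addresses are all hypothetical names made up for this example.

```python
# Toy in-memory model of the HDFS read path. Not the real Hadoop API:
# every class, function, block id and address here is hypothetical.

class NameNode:
    """Holds metadata only: which blocks form a file and where replicas live."""

    def __init__(self):
        # filename -> ordered list of (block_id, [data node addresses])
        self.block_map = {}

    def add_file(self, filename, blocks):
        self.block_map[filename] = blocks

    def get_block_locations(self, filename):
        return self.block_map[filename]


class DataNode:
    """Stores the actual block contents."""

    def __init__(self, address):
        self.address = address
        self.blocks = {}  # block_id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(filename, name_node, data_nodes):
    """Ask the Name Node for block locations, then fetch and join the blocks."""
    content = b""
    for block_id, addresses in name_node.get_block_locations(filename):
        # For simplicity this sketch takes the first listed replica.
        node = data_nodes[addresses[0]]
        content += node.read_block(block_id)
    return content


# A hypothetical two-node cluster holding "file.txt" as two blocks.
dn1, dn2 = DataNode("10.0.0.1"), DataNode("10.0.0.2")
dn1.blocks = {"blk_1": b"hello "}
dn2.blocks = {"blk_1": b"hello ", "blk_2": b"world"}

nn = NameNode()
nn.add_file("file.txt", [("blk_1", ["10.0.0.1", "10.0.0.2"]),
                         ("blk_2", ["10.0.0.2"])])

data_nodes = {dn.address: dn for dn in (dn1, dn2)}
result = read_file("file.txt", nn, data_nodes)
print(result)
```

Note that the Name Node in this sketch never touches the file contents. It only answers the question "where are the blocks?", and the data itself flows between the client and the Data Nodes.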
When serving a client's read request, HDFS selects the replica that is closest to the client. This reduces read latency and bandwidth consumption.
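To make "closest" a bit more concrete, here is a minimal sketch that assumes each node is identified by a path of the form /datacenter/rack/node and measures distance as the number of hops from both nodes up to their nearest common ancestor. This is a simplified model in the spirit of a tree-shaped network topology, not Hadoop's actual code, and the paths and function names are hypothetical.

```python
# Simplified tree-distance model for choosing a replica.
# Paths such as "/d1/r1/n1" (data center / rack / node) are assumptions
# made for this example.

def distance(path_a, path_b):
    """Hops from each node up to their closest common ancestor, summed."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def closest_replica(client_path, replica_paths):
    """Return the replica location with the smallest distance to the client."""
    return min(replica_paths, key=lambda p: distance(client_path, p))


client = "/d1/r1/n1"
replicas = ["/d2/r3/n4", "/d1/r2/n3", "/d1/r1/n2"]
chosen = closest_replica(client, replicas)
print(chosen)
```

In this model a replica on the same node has distance 0, one on the same rack has distance 2, one on another rack in the same data center has distance 4, and one in another data center has distance 6, so the client here would read from the replica on its own rack.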
