Data is Future: HDFS Read Architecture

Tuesday, June 26, 2018

HDFS Read Architecture

In the last blog post, we went through the HDFS write architecture. The read architecture is quite simple and easy to understand.
Suppose we have a file, "file.txt", that needs to be read from HDFS. The following steps take place while reading the data from HDFS.

The client sends a read request to the NameNode, which holds the metadata for the file "file.txt".
The NameNode replies to the client with the IP addresses of the DataNodes that hold the blocks of "file.txt".
The client connects to one of those DataNodes and starts retrieving the data.
Once the client has received the required file, the connection is closed.
If the data comes from multiple blocks, the client combines these blocks to form the file.
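
The steps above can be sketched with a small in-memory model. This is only a toy simulation, not the Hadoop client API: the class names, the method names (get_block_locations, read_block), the block ids, and the IP addresses are all made up for illustration.

```python
# Toy, in-memory model of the HDFS read path (NOT the real Hadoop API).
from dataclasses import dataclass, field


@dataclass
class DataNode:
    ip: str
    blocks: dict = field(default_factory=dict)  # block id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class NameNode:
    # file name -> ordered list of (block id, [IPs of DataNodes with a replica])
    metadata: dict = field(default_factory=dict)

    def get_block_locations(self, path):
        return self.metadata[path]


def read_file(path, name_node, data_nodes):
    """Ask the NameNode where the blocks live, then fetch them from DataNodes."""
    parts = []
    for block_id, replica_ips in name_node.get_block_locations(path):
        # Every replica holds the same bytes; this sketch just takes the first.
        node = data_nodes[replica_ips[0]]
        parts.append(node.read_block(block_id))
    # The client concatenates the blocks in order to rebuild the file.
    return b"".join(parts)


# Hypothetical cluster: three DataNodes, one file split into two blocks.
dn1 = DataNode("10.0.0.1", {"blk_1": b"hello ", "blk_2": b"world"})
dn2 = DataNode("10.0.0.2", {"blk_1": b"hello "})
dn3 = DataNode("10.0.0.3", {"blk_2": b"world"})
nn = NameNode({
    "file.txt": [
        ("blk_1", ["10.0.0.2", "10.0.0.1"]),
        ("blk_2", ["10.0.0.3", "10.0.0.1"]),
    ]
})
nodes = {dn.ip: dn for dn in (dn1, dn2, dn3)}

content = read_file("file.txt", nn, nodes)
print(content)
```

Note that the NameNode in this sketch only hands out locations. The file bytes themselves travel between the client and the DataNodes.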

While serving a client's read request, HDFS selects the replica that is closest to the client. This reduces read latency and network bandwidth consumption.
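
A rough sketch of what "closest" can mean is shown below. It assumes each node is described by a (rack, host) pair and uses a simple distance of 0 for the same host, 2 for the same rack, and 4 for a different rack. The function names and the node names are hypothetical, and the real selection logic in HDFS is more involved.

```python
def distance(a, b):
    """Distance between two nodes given as (rack, host) tuples.

    Assumed convention for this sketch: 0 = same host, 2 = same rack,
    4 = different rack.
    """
    if a == b:
        return 0
    if a[0] == b[0]:
        return 2
    return 4


def closest_replica(client, replicas):
    """Return the replica location with the smallest distance to the client."""
    return min(replicas, key=lambda replica: distance(client, replica))


# Hypothetical topology: the client runs on host-a in rack1.
client = ("rack1", "host-a")
replicas = [("rack2", "host-x"), ("rack1", "host-b"), ("rack3", "host-y")]
chosen = closest_replica(client, replicas)
print(chosen)
```

Here the replica on the client's own rack is picked, so the read does not have to cross racks.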
