Tuesday, February 27, 2018

Apache Spark Core

In Spark, the architecture is mainly divided between the Driver node and the Executor nodes. The Driver is where the main program runs: it takes the main program, distributes the data sets across the worker nodes, and tells the worker nodes which operations they are supposed to perform on them.
In layman's terms, the driver is the manager and the executors are the developers: the driver distributes the resources and the tasks to be performed by each developer.
The driver program accesses Spark through a SparkContext object, which represents the connection to a computing cluster.
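As a rough sketch, a standalone application creates its own SparkContext from a SparkConf before doing any work (the app name "DriverDemo" and the local[*] master below are just placeholder values):

    import org.apache.spark.{SparkConf, SparkContext}

    object DriverDemo {
      def main(args: Array[String]): Unit = {
        // The driver builds a SparkConf and uses it to create the SparkContext
        val conf = new SparkConf().setAppName("DriverDemo").setMaster("local[*]")
        val sc = new SparkContext(conf)

        // The driver defines the data set and the operations; the executors run them
        val numbers = sc.parallelize(1 to 100)
        val sum = numbers.map(_ * 2).reduce(_ + _)
        println(s"Sum of doubled numbers: $sum")

        sc.stop()
      }
    }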


In the Spark shell, a SparkContext is already created for you and is available through the sc variable.
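For example, inside spark-shell you can use sc directly without any setup (the sample data here is only an illustration):

    // sc is pre-created by spark-shell, so no SparkConf/SparkContext setup is needed
    val words = sc.parallelize(Seq("spark", "driver", "executor"))
    words.count()                        // action: returns 3, computed on the executors
    words.map(_.toUpperCase).collect()   // results are brought back to the driver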

If we run our program on a local machine, everything runs in a single JVM on that machine. But when we run the same program on a cluster, different parts of the program run on different nodes: the driver on one node and the executors on the worker nodes.
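Switching between the two modes is usually just a matter of the master URL; the program itself does not change. A minimal sketch, where "spark://master-host:7077" is only a placeholder for a real standalone master URL:

    import org.apache.spark.{SparkConf, SparkContext}

    // Local mode: driver and executors run as threads inside one JVM on this machine
    val localConf = new SparkConf().setAppName("ModeDemo").setMaster("local[*]")

    // Cluster mode: the driver runs on one node and executors run on the worker nodes
    val clusterConf = new SparkConf().setAppName("ModeDemo").setMaster("spark://master-host:7077")

    // Pick whichever configuration matches your environment
    val sc = new SparkContext(localConf)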
