Tuesday, August 6, 2019

Spark - Basic Statistics


We have already gone through the tutorial on Measures of Central Tendency. Now we will implement them in PySpark. To do so, we need to import the Statistics class from pyspark.mllib.stat.


Once the Spark job is submitted, we get the output below as the result.

The code for this post is available in my GitHub repository: https://github.com/sangam92/Spark_tutorials
