Saturday, December 29, 2018

Python - Pass Statement

pass is a null statement. The difference between a comment and the pass statement is that a comment is completely ignored by the interpreter, while pass is not ignored.

When pass is executed, nothing happens. It is used as a placeholder when we anticipate further development of a function or class in later phases, because Python does not allow the body to be left empty.

Example :-


class example:
    pass


def example(args):
    pass
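To see why a comment cannot stand in for pass, the short sketch below compiles two versions of the same empty function. The helper name compiles and the function name todo are made up for this illustration.

```python
def compiles(source):
    """Return True if the given source text is valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# A body that holds only a comment is treated as an empty body.
with_comment = "def todo():\n    # implement later\n"
# A body that holds pass is a real (do-nothing) statement.
with_pass = "def todo():\n    pass\n"

print(compiles(with_comment))  # False
print(compiles(with_pass))     # True
```

The comment is dropped before the function body is parsed, so the first version has no body at all, while pass gives the parser one statement to work with.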



Friday, December 28, 2018

Python - Comments & Docstring

Comments play a key role in any program. They tell us what the code is meant to do and are mainly used to make the code more readable.

During development, comments are also useful for temporarily deactivating parts of the code that are not needed in the current debugging session.

Single Line Comment :

# This is an example of a single-line comment.
# We can write a comment like this.
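As a small illustration of deactivating code with a comment, the sketch below disables a debug print without deleting it. The function total_price and its arguments are hypothetical names chosen for this example.

```python
def total_price(prices, tax_rate):
    """Return the sum of prices with tax applied."""
    subtotal = sum(prices)
    # print("debug subtotal:", subtotal)  # deactivated: not needed right now
    return subtotal * (1 + tax_rate)


print(total_price([10, 20], 0.5))
```

Removing the leading # brings the debug line back whenever it is needed again.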


Multi Line Comment :

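Python has no dedicated multi-line comment syntax. A triple-quoted string that is not assigned to anything is commonly used instead, and it can be written in two different ways :-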

By using """ quotes 

""" This is a multi
     line comment using double quotes.
"""

By using ''' quotes

''' This is a multi line
comment using single quote
''' 
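One way to check how Python treats these blocks is to parse a small snippet with the standard ast module. This is only a sketch; the snippet text in source is made up for the illustration.

```python
import ast

# A triple-quoted block followed by an assignment with a real comment.
source = '''
""" This is a multi
     line comment using double quotes.
"""
x = 1  # a real comment
'''

tree = ast.parse(source)
kinds = [type(node).__name__ for node in tree.body]
print(kinds)  # ['Expr', 'Assign']
```

The triple-quoted block shows up in the parse tree as an expression statement, while the # comment leaves no trace at all. So the block is a string that is evaluated and discarded, not a true comment.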

Docstring 

Docstring is short for documentation string. It is a string literal that occurs as the first statement in a module, function, class, or method definition. The docstring should describe what the function or class does. By convention, triple quotes are used when writing docstrings. For example:


def test(num):
    """Test the divisibility of a number."""
    # ... (function body omitted) ...
    return xyz  # xyz stands for the result computed in the omitted body
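Unlike a comment, a docstring is kept at runtime and can be read back through the __doc__ attribute. The sketch below uses a hypothetical function, is_divisible, with a complete body so that it can actually be run.

```python
def is_divisible(num, divisor):
    """Return True if num is divisible by divisor."""
    return num % divisor == 0


# The docstring is stored on the function object itself.
print(is_divisible.__doc__)
print(is_divisible(10, 5))
```

The built-in help(is_divisible) displays the same text in the interactive interpreter.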

Sunday, December 23, 2018

Spark - DataFrames 2



In the previous post, we went through the basics of DataFrames and created a DataFrame from the sample test.csv file. In this post, we go further and see how to perform different operations on DataFrames.

We will walk through the complete program and look at the different operations that can be performed on DataFrames.

We start the program by reading the CSV file and displaying the count of rows in the DataFrame.
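The original listing for this step is not available, so here is a minimal sketch of what it could look like, assuming PySpark and a CSV file with a header row. The function name count_rows and the application name are illustrative choices, not taken from the original program.

```python
def count_rows(csv_path):
    """Read a CSV file into a Spark DataFrame and return (df, row_count)."""
    # Imported inside the function so this file loads even without PySpark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("DataFramesDemo").getOrCreate()
    # header=True uses the first row as column names (an assumption about
    # the file); inferSchema=True asks Spark to guess the column types.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    return df, df.count()
```

Calling df, n = count_rows("test.csv") and printing n would display the row count when the script is run through spark-submit.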


After submitting the Spark job, we get the output below.






Displaying the number of columns and their names.


The describe operation is used to calculate summary statistics for the column(s) of a DataFrame. If we do not specify any column names, it calculates summary statistics for all numerical and string columns present in the DataFrame.
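A possible sketch of these two steps, assuming PySpark and a DataFrame df that has already been loaded. The helper name columns_and_summary is made up for this example.

```python
def columns_and_summary(df):
    """Return (column_count, column_names, summary_df) for a Spark DataFrame."""
    column_names = df.columns   # plain Python list of column names
    summary = df.describe()     # no arguments: summarize all eligible columns
    return len(column_names), column_names, summary
```

With a real DataFrame, the returned summary is itself a DataFrame, so summary.show() prints rows such as count, mean, stddev, min and max.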


Selecting specific columns in a DataFrame.

Specific columns of a DataFrame can be selected by calling select on the DataFrame and passing the required column names.
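As a rough sketch, assuming PySpark and an already loaded DataFrame df, column selection could look like this. The helper name select_columns is hypothetical, and so are any column names passed to it.

```python
def select_columns(df, *column_names):
    """Return a new DataFrame holding only the requested columns."""
    # select does not modify df; it returns a new DataFrame.
    return df.select(*column_names)
```

For instance, select_columns(df, "name", "age").show() would print just those two columns, provided the file actually has columns with those names.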


Displaying the statistics of a specific column.
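The same describe operation accepts column names, which narrows the summary to those columns. A minimal sketch, assuming PySpark and a loaded DataFrame df; the helper name describe_column is made up for this example.

```python
def describe_column(df, column_name):
    """Return summary statistics for a single column of a Spark DataFrame."""
    # Passing a column name restricts the summary to that column.
    return df.describe(column_name)
```

Calling describe_column(df, "age").show() would print the statistics for a column named age, which is a hypothetical name used here only for illustration.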
   



Hadoop - What is a Job in Hadoop ?

In computer science, a job simply means a piece of a program, and the same rule applies to the Hadoop ecosystem as wel...