Data is Future: HIVE An Introduction

Tuesday, March 27, 2018

HIVE An Introduction

Hive was developed at Facebook, starting in August 2007, and was made open source in 2008. The main idea behind HIVE was to provide a SQL-like interface for Hadoop.

The problem was that Hadoop MapReduce needs a lot of code even for simple programs and lacks the expressiveness of SQL.

HIVE has a SQL-like dialect called HQL (Hive Query Language). This makes the solution much easier, as anyone who knows SQL can readily work with it.
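
As a rough illustration, counting page views per user would need a mapper, a reducer and a driver class in hand-written MapReduce, but fits in a single HQL statement. The table page_views and its columns are hypothetical names chosen for this sketch.

```sql
-- Hypothetical table: page_views(user_id STRING, url STRING)
-- Count how many pages each user viewed; Hive turns this
-- statement into one or more jobs on the Hadoop cluster.
SELECT user_id, COUNT(*) AS views
FROM page_views
GROUP BY user_id;
```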

Hive is best suited for data warehouse applications, where a large data set is maintained and mined for insights, reports, etc.

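However, HIVE is not a full database, as it lacks some basic database properties:

1.) Record-level updates are not possible in HIVE (newer releases allow them only on specially configured transactional tables).
2.) HIVE does not provide transactions in the traditional sense.
3.) Even queries on small data sets have high latency, because of the startup overhead of the underlying Hadoop jobs.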
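
Because of the first limitation, a common workaround is to rewrite a whole table (or partition) instead of changing rows in place. The sketch below assumes a hypothetical table employees(id INT, name STRING, salary DOUBLE); it shows one way this is often done, not the only one.

```sql
-- Hypothetical table: employees(id INT, name STRING, salary DOUBLE)
-- "Update" the salary of employee 42 by rewriting every row of the table.
INSERT OVERWRITE TABLE employees
SELECT
  id,
  name,
  CASE WHEN id = 42 THEN salary * 1.10 ELSE salary END AS salary
FROM employees;
```

Since every row is read and written again, this is only practical for batch-style changes, which fits the data warehouse use case described earlier.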

HIVE consists mainly of three parts:-

1.) Multiple JAR files, each providing different functionality. They are normally found in the $HIVE_HOME/lib directory.
2.) Executable scripts in the $HIVE_HOME/bin directory. The CLI (Command Line Interface) is invoked with the help of these scripts.
3.) A Thrift service that gives access to HIVE through ODBC / JDBC drivers. It is normally used by reporting solutions like Tableau and QlikView.

HIVE Architecture:-

Apart from this, HIVE also has a metastore, which by default is a built-in Derby database. It is used to store table schemas and other metadata.

The Derby database is normally used only for learning purposes: we cannot run two instances of the HIVE CLI at the same time, because embedded Derby is single-process storage.

Starting with HIVE:-

Just type hive at the shell prompt and a Hive session will open with the primary prompt hive>. When a statement continues onto the next line, a secondary prompt > is shown until the statement is ended with a semicolon.

Hive Prompt:-

Secondary Prompt:-

A Simple Query in HIVE :-
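
A minimal sketch of such a session, using a hypothetical table named test_table with made-up columns: create the table, query it, and drop it again. Each statement is typed at the hive> prompt and ends with a semicolon.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE test_table (id INT, name STRING);

-- The table was just created, so this returns no rows.
SELECT * FROM test_table;

-- Clean up.
DROP TABLE test_table;
```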

In the above example, we created a table and tried to look at its data, but the table does not hold any data yet. Finally, we dropped the table.

Note that whenever a query succeeds, HIVE prints OK, followed by the query result.
