Managed tables and external tables are among the main concepts in Hive and need to be understood carefully. By default, any table we create is a managed table.
So, let us understand how managed tables and external tables behave internally.
Managed Tables:- In a managed table, the data is stored under the Hive warehouse directory (/user/hive/warehouse/ by default). When we load data from HDFS into the table, the files are moved from their original directory into this location. The location can be the default one or the user can provide it. The required directory and subdirectories are also created.
If we drop the table, the data is deleted from the warehouse directory and the metadata is removed from the metastore.
Hive controls the complete life cycle of the table.
External Tables:- In an external table, the data is not moved from its HDFS location to the warehouse directory. It means Hive does not own the data.
Only the metadata for this table is recorded in the metastore. If the table is dropped, the data is not deleted; only the metadata is deleted.
When creating the table, we need to provide the EXTERNAL keyword.
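To check whether an existing table is managed or external, one option is to look at its metadata. The statement below is a minimal sketch; the table name my_table is hypothetical and stands for any table that already exists in the current database.

```sql
-- my_table is a hypothetical name; replace it with an existing table.
-- The output includes a "Table Type" row, which reads MANAGED_TABLE
-- for a managed table and EXTERNAL_TABLE for an external table,
-- and a "Location" row showing where the data files live.
DESCRIBE FORMATTED my_table;
```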
When to use External Tables:- There are some scenarios where we need to use an external table. For learning purposes, a managed table is fine. But when we are working on a production cluster, it is recommended to go for external tables.
1.) When we are using some other tool to process the same piece of data.
2.) When we have multiple views or tables on the same data set.
3.) To query an external dataset present in an external system like Amazon S3.
Syntax for creating an external table.
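The original example is not available here, so the following is only a sketch of what such a statement can look like. The table name employee_external, its columns, the comma delimiter and the path /data/external/employee are all assumptions made for illustration.

```sql
-- Hypothetical external table; names, columns and path are assumptions.
CREATE EXTERNAL TABLE employee_external (
  id     INT,
  name   STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- LOCATION points at a directory that already holds (or will hold) the files.
LOCATION '/data/external/employee';

-- Dropping the table removes only the metastore entry;
-- the files under /data/external/employee stay in place.
DROP TABLE employee_external;
```

For a dataset on Amazon S3, the LOCATION would point at the bucket path instead of an HDFS directory, assuming the cluster is configured for S3 access.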
Syntax for creating a normal (managed) table.
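Again, the original example is not available here, so this is only a sketch under stated assumptions. The table name employee_managed, its columns and the input path /data/input/employee.csv are hypothetical. Without the EXTERNAL keyword and without a LOCATION clause, Hive places the table under its warehouse directory.

```sql
-- Hypothetical managed table; names, columns and paths are assumptions.
CREATE TABLE employee_managed (
  id     INT,
  name   STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- LOAD DATA INPATH (without LOCAL) moves the HDFS file
-- from its original directory into the table's directory.
LOAD DATA INPATH '/data/input/employee.csv' INTO TABLE employee_managed;

-- Dropping a managed table removes both the metadata and the data files.
DROP TABLE employee_managed;
```

After the LOAD statement the file is no longer present at /data/input/employee.csv, which is the "data is moved" behaviour described for managed tables.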