– Designed for batch processing.

– Near-real-time query capabilities were added to Hive through the Tez execution engine.

– Uses the HiveQL query language, which is similar to SQL.

– Allows data stored in HDFS to be accessed from within Hadoop or from databases and data warehouses.

Compare Hive & RDBMS

Hive

Focused on analytics.

Supports sequential inserts and appends.

Low cost storage using local disks

Many Nodes

Fast data access with data skipping and sorting

Parallel processing through MapReduce.

RDBMS

Focused on real-time queries and analytics.

Random INSERT and UPDATE supported

Expensive storage using SAN technology

Few Nodes

Fast data access through indexing

Parallel queries
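The "data skipping and sorting" entry in the Hive list above can be pictured with a minimal Python sketch. It assumes values are sorted and stored in blocks that each record their minimum and maximum; `Block`, `build_blocks` and `scan_equal` are hypothetical names for illustration, not Hive APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """A chunk of sorted column values with its min/max statistics."""
    values: List[int]
    lo: int
    hi: int


def build_blocks(values: List[int], block_size: int) -> List[Block]:
    """Sort the values and split them into fixed-size blocks."""
    ordered = sorted(values)
    blocks = []
    for start in range(0, len(ordered), block_size):
        chunk = ordered[start:start + block_size]
        blocks.append(Block(chunk, chunk[0], chunk[-1]))
    return blocks


def scan_equal(blocks: List[Block], target: int):
    """Return the matching values and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for block in blocks:
        if target < block.lo or target > block.hi:
            continue  # statistics show the block cannot match, so skip it
        blocks_read += 1
        matches.extend(v for v in block.values if v == target)
    return matches, blocks_read


blocks = build_blocks(list(range(100)), block_size=10)
matches, blocks_read = scan_equal(blocks, 42)
print(matches, blocks_read, len(blocks))
```

Unlike an index, nothing here points at an individual row. The scan simply avoids reading blocks whose statistics rule them out, which is why sorting the data first matters.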

$ hive
hive> CREATE TABLE sample (id INT);

hive> DESCRIBE sample;

How are Hive SQL statements processed?

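Clients connect to a Hive server instance.

The client submits a query for execution.

Hive parses the query and builds an execution plan.

The query plan is converted into MapReduce jobs.

The MapReduce jobs run on the Hadoop cluster.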
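The conversion to MapReduce can be pictured with a small Python sketch. It shows how a hypothetical aggregate such as SELECT gender, AVG(salary) ... GROUP BY gender might break down into map, shuffle and reduce phases. The rows and function names are invented for illustration, and Hive's real planner does far more than this.

```python
from collections import defaultdict

# Hypothetical sample rows: (gender, age, salary, code)
rows = [
    ("F", 30, 5000, 1),
    ("M", 41, 4000, 2),
    ("F", 25, 7000, 3),
    ("M", 38, 6000, 4),
]


def map_phase(row):
    """Emit (key, value) pairs: the group key and the value to aggregate."""
    gender, _age, salary, _code = row
    yield gender, salary


def shuffle(pairs):
    """Group all values by key, as happens between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Aggregate the values collected for one key (here: the average)."""
    return key, sum(values) / len(values)


pairs = [pair for row in rows for pair in map_phase(row)]
result = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(result)
```

On a real cluster the map calls run in parallel on the nodes that hold the data, and each reducer receives only the keys assigned to it.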

Table samples

CREATE TABLE customer (custID INT, fName STRING, lName STRING, birthday TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

CREATE EXTERNAL TABLE SALARIES (
   gender STRING, age INT, salary INT, code INT)
   ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
   LOCATION '/home/custsalaries/';
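ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' describes how each line of the files under LOCATION is split into columns when the table is read. Here is a rough Python sketch of that idea for the SALARIES columns. `parse_line` is a hypothetical helper. Turning missing or unparseable fields into None is meant to mirror Hive returning NULL for such fields, an assumption worth checking against your Hive version.

```python
# Column names with Python stand-ins for the Hive column types of SALARIES.
SCHEMA = [("gender", str), ("age", int), ("salary", int), ("code", int)]


def parse_line(line, delimiter=","):
    """Split one text line and cast each field to its column type."""
    fields = line.rstrip("\n").split(delimiter)
    record = {}
    for index, (name, cast) in enumerate(SCHEMA):
        # A line with too few fields leaves the trailing columns empty.
        raw = fields[index] if index < len(fields) else None
        try:
            record[name] = cast(raw) if raw is not None else None
        except ValueError:
            record[name] = None  # the value does not fit the column type
    return record


good = parse_line("F,30,5000,1\n")
bad = parse_line("M,unknown,4000")
print(good)
print(bad)
```

The schema is applied while reading, so a malformed line does not stop the data from being stored. It only shows up as empty values at query time.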

LOAD DATA INPATH '/home/custsalaries.csv' OVERWRITE INTO TABLE customer;
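LOAD DATA INPATH without the LOCAL keyword moves the source file into the table's storage directory, and OVERWRITE first removes whatever the table already held. The Python sketch below imitates that behavior on the local file system. The directory layout and the `load_data` helper are hypothetical stand-ins for illustration, not Hive code.

```python
import shutil
import tempfile
from pathlib import Path


def load_data(src: Path, table_dir: Path, overwrite: bool) -> None:
    """Move src into table_dir, clearing existing files first if overwrite."""
    table_dir.mkdir(parents=True, exist_ok=True)
    if overwrite:
        for old in table_dir.iterdir():
            if old.is_file():
                old.unlink()
    shutil.move(str(src), str(table_dir / src.name))


root = Path(tempfile.mkdtemp())
table_dir = root / "warehouse" / "customer"  # hypothetical table directory
table_dir.mkdir(parents=True)
(table_dir / "old_data.csv").write_text("1,Old,Row,2000-01-01 00:00:00\n")

src = root / "custsalaries.csv"
src.write_text("2,New,Row,2001-01-01 00:00:00\n")

load_data(src, table_dir, overwrite=True)
remaining = sorted(p.name for p in table_dir.iterdir())
print(remaining, src.exists())
```

After the load the file no longer exists at its original path. This often surprises people who expect a copy.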