CSV file schema

txtno, ymd, custno, amount, category, product, city, state, spentby

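Step 1: Create a table in Hive and map it to HBase.

CREATE TABLE transactions (
  txtno int,
  ymd string,
  custno int,
  amount float,
  category string,
  product string,
  city string,
  state string,
  spentby string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" =
  ":key,<cf>:ymd,<cf>:custno,<cf>:amount,<cf>:category,<cf>:product,<cf>:city,<cf>:state,<cf>:spentby")
TBLPROPERTIES ("hbase.table.name" = "transactions_hbase");

Here <cf> is a placeholder for the HBase column family, which these steps do not
name. The :key entry maps the first Hive column (txtno) to the HBase row key.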
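The column mapping passed to the storage handler lists :key first and then one family:qualifier entry per remaining Hive column, in the same order as the table definition. As a rough sketch, the string can be derived from the CSV schema line; the column family name cf used here is a hypothetical choice, not something the steps above specify.

```shell
# Field names from the CSV schema; the first one (txtno) becomes the row key.
fields="txtno,ymd,custno,amount,category,product,city,state,spentby"

# Hypothetical column family name (an assumption for illustration only).
cf="cf"

# Drop the first field, then prefix every remaining field with "<family>:".
rest="${fields#*,}"
mapping=":key,$(echo "$rest" | sed "s/,/,${cf}:/g; s/^/${cf}:/")"

# The printed value is what would go into "hbase.columns.mapping".
echo "$mapping"
```

The mapping must not contain spaces, which is one reason to generate it rather than type it by hand.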

Step 2: Verify from the HBase and Hive shells that the table was created on
both sides.

[training@localhost ~]$ hbase shell

hbase(main):001:0> list
hbase(main):002:0> describe 'transactions_hbase'
hbase(main):003:0> exit

[training@localhost ~]$ hive

hive> show tables;
hive> describe extended transactions;
hive> quit;

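Step 3: Load the transactions data from HDFS into the HBase table created in Step 1.

ImportTsv also needs -Dimporttsv.columns, which lists HBASE_ROW_KEY followed by
one family:qualifier entry per remaining field. Replace <cf> below with the
column family used in the Hive column mapping.

[training@localhost ~]$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  '-Dimporttsv.separator=,' \
  '-Dimporttsv.columns=HBASE_ROW_KEY,<cf>:ymd,<cf>:custno,<cf>:amount,<cf>:category,<cf>:product,<cf>:city,<cf>:state,<cf>:spentby' \
  transactions_hbase /user/training/transactions_data.txt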
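
Because the ImportTsv column list has to line up field by field with the CSV schema, it can help to assemble the command from the schema line and review it before running it. The following is a sketch only: the column family cf is a hypothetical name, and the command is printed rather than executed.

```shell
# Field order must match the CSV file exactly.
fields="txtno,ymd,custno,amount,category,product,city,state,spentby"

# Hypothetical column family; it has to match the family in the Hive mapping.
cf="cf"

# First field becomes HBASE_ROW_KEY, the rest become "<family>:<field>".
rest="${fields#*,}"
columns="HBASE_ROW_KEY,$(echo "$rest" | sed "s/,/,${cf}:/g; s/^/${cf}:/")"

# Build the command as a string and print it for review (dry run).
cmd="hbase org.apache.hadoop.hbase.mapreduce.ImportTsv"
cmd="$cmd -Dimporttsv.separator=, -Dimporttsv.columns=$columns"
cmd="$cmd transactions_hbase /user/training/transactions_data.txt"
echo "$cmd"
```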

Step 4: Check that the data loaded correctly and is visible from both Hive and HBase.

[training@localhost ~]$ hbase shell

hbase(main):001:0> scan 'transactions_hbase'
hbase(main):002:0> exit

[training@localhost ~]$ hive

hive> select * from transactions limit 10;
hive> quit;
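
One way to sanity-check the load is to compare the number of rows in HBase with the number of distinct row keys in the input, since lines that share a txtno end up in the same HBase row. The sketch below runs against a small hypothetical sample file with made-up values; it does not touch the real data.

```shell
# Hypothetical sample standing in for transactions_data.txt (made-up values).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
1,01-01-2011,1001,10.00,CatA,ProdA,CityA,StateA,credit
2,01-02-2011,1002,20.00,CatB,ProdB,CityB,StateB,cash
1,01-03-2011,1003,30.00,CatC,ProdC,CityC,StateC,credit
EOF

# Total input lines versus distinct values of the first field (the row key).
lines=$(wc -l < "$sample" | tr -d ' ')
keys=$(cut -d, -f1 "$sample" | sort -u | wc -l | tr -d ' ')

echo "input lines: $lines, distinct row keys: $keys"
rm -f "$sample"
```

For the real file, the same pipeline can read from `hdfs dfs -cat /user/training/transactions_data.txt`, and the result can be compared with `count 'transactions_hbase'` in the HBase shell.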

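Step 5: Use the Hive table in Impala.

If Impala does not yet see the table created through Hive, reload its metadata
first:

invalidate metadata;
select * from transactions limit 5;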

Row Level Inserts into HBase using Impala

insert into transactions (txtno, ymd, custno, amount, category, product, city, state, spentby)
values (0, "06-26-2011", 4007024, 40.33, "Exercise & Fitness",
  "Cardio Machine Accessories", "Clarksville", "Tennessee", "credit");

