
Lessons Learnt

Challenge / Options / Lesson Learnt

Challenge: Joining (Large Tables)
Options: Hadoop Pig (12H); Hadoop Hive w/o Bucketing (12H); Hadoop Hive w/ Bucketing (1H)
Lesson Learnt: When joining 2 or more large tables, use Bucketing.
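To illustrate why bucketing helps when joining large tables, here is a minimal Python sketch (not Hive code). It assumes both tables are bucketed on the join key with the same bucket count, so each bucket of one table only needs to be matched against the same-numbered bucket of the other. The function names, the sample rows, and the use of a simple modulo on integer keys are illustrative assumptions; Hive applies its own hash function and runs the per-bucket joins as distributed tasks.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count; both tables must use the same value


def bucketize(rows, key_index, num_buckets=NUM_BUCKETS):
    """Split rows into buckets by join key (integer keys assumed)."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[row[key_index] % num_buckets].append(row)
    return buckets


def bucketed_join(left_rows, right_rows, num_buckets=NUM_BUCKETS):
    """Inner-join two tables on column 0, one bucket pair at a time."""
    left_buckets = bucketize(left_rows, 0, num_buckets)
    right_buckets = bucketize(right_rows, 0, num_buckets)
    joined = []
    for left_bucket, right_bucket in zip(left_buckets, right_buckets):
        # Only rows in the same-numbered bucket can share a key,
        # so each pair is an independent, much smaller join.
        index = defaultdict(list)
        for row in right_bucket:
            index[row[0]].append(row)
        for row in left_bucket:
            for match in index.get(row[0], []):
                joined.append(row + match[1:])
    return joined


# Illustrative sample data (not from the source).
orders = [(1, "order-a"), (2, "order-b"), (5, "order-c"), (9, "order-d")]
customers = [(1, "alice"), (5, "bob"), (7, "carol")]
result = sorted(bucketed_join(orders, customers))
print(result)
```

Because the bucket pairs do not depend on each other, they can be processed in parallel and each one fits in far less memory than the full tables, which is consistent with the gap between the bucketed and non-bucketed options reported above.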

Challenge: Joining (Large with Small Table)
Options: Hadoop Pig (Left Join) (1H); Hadoop Hive (Left Join) (1.5H); Hadoop Pig (Replicated Left Join) (5M)
Lesson Learnt: When joining a large table with a small table, use Replicated Join on Pig.
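The idea behind a replicated join can be sketched in plain Python (this is not Pig Latin). The small table is held entirely in memory as a lookup, and the large table is streamed past it, so the large table never has to be shuffled or sorted. The function name, the row layout (join key in column 0), and the sample data are assumptions made for illustration only.

```python
def replicated_left_join(large_rows, small_rows):
    """Left-join a streamed large table to a small in-memory table on column 0."""
    small_rows = list(small_rows)
    # Number of non-key columns contributed by the small table.
    pad_width = len(small_rows[0]) - 1 if small_rows else 0
    # The small table is loaded fully into memory once (the "replicated" side).
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[0], []).append(row[1:])
    # The large table is streamed row by row against the in-memory lookup.
    for row in large_rows:
        matches = lookup.get(row[0])
        if matches:
            for match in matches:
                yield row + match
        else:
            # Left join: keep unmatched large-table rows, padded with None.
            yield row + (None,) * pad_width


# Illustrative sample data (not from the source).
transactions = [(101, 250.0), (102, 75.5), (103, 10.0), (101, 40.0)]
branches = [(101, "North"), (102, "South")]
joined = list(replicated_left_join(transactions, branches))
print(joined)
```

The approach only works while the small table fits in memory on every worker, which is why it suits the large-with-small case and not the large-with-large case.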

Challenge: Aggregation
Options: Hadoop Pig (12H); Hadoop Hive w/ Bucketing (1H)
Lesson Learnt: Hive, compared to Pig, is far superior for Aggregation.

Challenge: Access Blockage
Options: Queue Definition and Resource Allocation to each Queue
Lesson Learnt: Access Optimization is supported through Queues.

Challenge: File Decompression Time
Options: One-by-One (20M); 10-in-Parallel (6M)
Lesson Learnt: Parallelism drastically improved File Decompression.
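The one-by-one versus 10-in-parallel comparison can be sketched as follows. This is a minimal example under stated assumptions: the source does not name the compression format or the tooling, so gzip files, a thread pool, and the helper names `decompress_one` and `decompress_all` are all hypothetical choices made for illustration.

```python
import gzip
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_PARALLEL = 10  # mirrors the "10-in-Parallel" option; tune for the host


def decompress_one(gz_path):
    """Decompress a single .gz file next to the original and return the new path."""
    out_path = gz_path.with_suffix("")  # "data.txt.gz" -> "data.txt"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def decompress_all(gz_paths, workers=MAX_PARALLEL):
    """Decompress many files concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(decompress_one, gz_paths))


# Small self-contained demo using temporary files (hypothetical file names).
demo_dir = Path(tempfile.mkdtemp())
demo_inputs = []
for i in range(20):
    path = demo_dir / f"part-{i:02d}.txt.gz"
    with gzip.open(path, "wb") as f:
        f.write(f"record {i}\n".encode() * 1000)
    demo_inputs.append(path)

demo_outputs = decompress_all(demo_inputs)
print(len(demo_outputs), "files decompressed")
```

Threads are used here for simplicity. How much parallelism pays off depends on disk throughput and CPU count, so the worker count is best treated as something to measure, and a process pool may be worth trying if decompression turns out to be CPU-bound.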

Challenge: Data Movement in UDA
Options: Query Grid (10X); TDCH (X)
Lesson Learnt: TDCH: Regular Large Volume; Query Grid: Irregular Low Volume.

Challenge: TDCH Data Movement
Options: PI Defined Table (50M); No-PI Table (5M)
Lesson Learnt: Depends on requirement. Preferably No-PI Table.

Challenge: BI Tool Integration
Options: Direct BI Tool to Hadoop Platform Integration; Integration through SDM
Lesson Learnt: Depends on requirement. Tool Limitation (multiple ODBCs).

Challenge: SDM Architecture
Options: Physical Tables; Logical Views
Lesson Learnt: OLAP Aggregations: Physical Tables; Ad-hoc/Tactical: Views.

Challenge: Security on Hadoop
Options: PAM, Kerberos; ACL, Ranger
Lesson Learnt: PAM & ACL: Easy to implement.
