
Building Your Big Data Analytics Strategy: Block by Block

@impetuscalling

Recorded version available at http://www.impetus.com/webinar_registration?event=archived&eid=53

Outline

Building a Big Data Strategy
  Big Data & 3Vs model
  Big Data Analytics Lifecycle

Strategy Selection
Technology Selection
  Hadoop Ecosystem
  Alternative

Putting it Together
Case Studies and Applications
Q&A



Impetus Proprietary

Building a Big Data Strategy

Gather Requirements
  What needs to be done?
  Objectives
  Requirements

Choose Candidate Strategy Options
  Patterns & Best Practices
  Candidate Strategy Selection
  Tools & Technology Selection

Choose Tools and Technology
  Implementation
  Operational Readiness
  Implementation


Big Data & 3Vs Model

What is Big Data?
  Define by size or volume or by breakdown

3Vs model
  Variety of Data
  Volume of Data
  Velocity of Data


Big Data Analytics Life Cycle

Creation

Ingestion

Analysis

Visualization


Big Data Analytics Life Cycle: Concerns

Storage, Elasticity, Monitoring, Compression

Ingestion
Integrations, Tools & Technologies, Standardization

Visualization
Tools & Technologies, Testing, Pre-Built Solutions, Channels, In-Memory Support, Standardization

Creation

Analysis


Big Data Analytics Life Cycle & 3Vs

Simple and potent tool to analyze strategy requirements

Answers the simple questions of how much, what type, and at what rate
  Applicable to each phase
  Use a matrix to select a suitable strategy
  Dictates the potential choice of solutions, tools & technologies
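
The matrix idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the presenters' tooling: the function names (`empty_matrix`, `unanswered`, `render`) and the sample answers are hypothetical. The phases and the three questions come from the slides, and each question maps to one of the 3Vs.

```python
# Lifecycle phases and the three questions, as named on the slides.
PHASES = ["Creation", "Ingestion", "Analysis", "Visualization"]
QUESTIONS = {"How much?": "Volume", "What type?": "Variety", "What rate?": "Velocity"}


def empty_matrix():
    """Return a phase-by-question matrix with every cell unanswered (None)."""
    return {phase: {question: None for question in QUESTIONS} for phase in PHASES}


def unanswered(matrix):
    """List the (phase, question) cells that still need an answer."""
    return [(p, q) for p in PHASES for q in QUESTIONS if matrix[p][q] is None]


def render(matrix):
    """Format the matrix as a plain-text table, one row per phase."""
    header = "Phase".ljust(15) + "".join(q.ljust(24) for q in QUESTIONS)
    rows = [
        phase.ljust(15)
        + "".join((matrix[phase][q] or "?").ljust(24) for q in QUESTIONS)
        for phase in PHASES
    ]
    return "\n".join([header] + rows)


matrix = empty_matrix()
# Hypothetical answers, for illustration only.
matrix["Creation"]["How much?"] = "about 50 GB per day"
matrix["Ingestion"]["What type?"] = "semi-structured text"
matrix["Ingestion"]["What rate?"] = "continuous stream"

print(render(matrix))
print("Still open:", len(unanswered(matrix)), "cells")
```

Filling in every cell before comparing tools keeps the later technology choice tied to stated volume, variety, and velocity needs rather than to a preferred product.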


Big Data Analytics Life Cycle & 3Vs

[Matrix of How Much?, What Type?, What Rate? against Creation, Ingestion, Analysis, Visualization; concerns shown: Storage, Elasticity, Monitoring, Compression]



Strategy Selection


Big Data Analytics Strategy

Creation

Ingestion

Analysis

Visualization


Technology Selection


The Hadoop Ecosystem


Alternate/Emerging Options

Making Stuff Faster
  Pervasive DataRush, HStreaming
  Cloud MapReduce
  HPCC, DataStax Brisk, Platform Computing
  MARS, GPMR
  Major MPPs, in-database MR: Oracle, Aster, etc.
  Hadapt

NoSQL
  Cassandra, MongoDB, HBase, Riak, Redis


Alternate/Emerging Options

Graph Type DBs
  Neo4j
  HyperGraphDB
  InfiniteGraph
  Pregel
  Trinity

Faster SQL DBs
  VoltDB, Clustrix

Hardware + Software Solutions
  Exadata, ParStream
  Virtualized options of hardware + software solutions, such as Xeround



Putting it Together


Indirect Analytics over Hadoop


Direct Analytics over Hadoop


Analytics over Hadoop with MPP DW


Case Studies


Social Media Analytics




Problem Statement
  Analytics on huge data sets populated from live streaming data
  Simplifying services, cost reduction, proactive analysis of customer feedback

Challenges
  Live data streaming from social media websites

Clustering
  Learn typical comments, demands, questions
  Value: helps identify response / behavior anomalies

Classification
  Learn to identify known patterns automatically
  Value: useful in filtering, pre-emptive addressing, gaining customer confidence
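
As a rough illustration of the classification idea on this slide (learning known patterns from labeled comments, then labeling new ones automatically), here is a minimal nearest-centroid sketch in plain Python. It is not the presenters' implementation, which the deck says was built on Mahout; the labels, sample comments, and function names are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a comment and split it into words, dropping basic punctuation."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def centroid(texts):
    """Sum the word counts of all example comments for one label."""
    total = Counter()
    for text in texts:
        total.update(tokenize(text))
    return total


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, centroids):
    """Return the label whose centroid is most similar to the comment."""
    vector = Counter(tokenize(text))
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Hypothetical labeled comments, for illustration only.
labeled = {
    "complaint": ["my bill is wrong again", "service is down and nobody answers"],
    "question": ["how do I change my plan", "where can I find my bill"],
}
centroids = {label: centroid(texts) for label, texts in labeled.items()}

print(classify("why is my service down", centroids))
print(classify("how can I change my bill", centroids))
```

At the data volumes the deck describes, the same idea would run as a distributed job rather than in a single process; the sketch only shows the shape of the computation.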
 


Social Media Analytics (cont.)

Approach
  Prepare a matrix to capture How Much?, What Type?, What Rate? against each phase
  Use a big data solution strategy covering all concerns of the big data analytics lifecycle

Solution
  Architected a flexible and scalable solution with near-real-time streaming of social media data on daily/hourly scheduled jobs
  Built a solution based on Hadoop, HBase, Hive and Mahout


Solution Overview
Creation, Ingestion, Analysis, Visualization


Summing Up

Strategy selection
  Creating a matrix helps build a suitable strategy
  Enables creation of a platform or a solution to manage the 3Vs of data

Solutions, tools & technologies
  Hadoop-based Big Data Analytics is a scalable and cost-effective option


About Us
  Strategic partners for software product engineering and R&D
  Thought leaders in cutting-edge technologies
  Mature processes and practices that are methodical, yet flexible
  Diverse domain expertise

Our services in Big Data and Analytics
  Expert consulting
  Proof-of-concept & implementation
  Support services


Questions

Please send in your questions using the chat panel


Thank you
For more information, write to us at inquiry@impetus.com

@impetuscalling
