
Acquiring Big Data

David Segleau, Director, Product Management

Big Data In Action


Make Better Decisions Using Big Data

[Diagram: cycle of four stages: Acquire, Organize, Analyze, Decide.]

Acquire all available data

Traditional Data Sources

New Data Sources

Big Data Use Cases


Healthcare
Today's challenge: Expensive office visits
New data: Remote patient monitoring
What's possible: Preventive care, reduced hospitalization

Manufacturing
Today's challenge: In-person support
New data: Product sensors
What's possible: Automated diagnosis, support

Location-Based Services
Today's challenge: Based on home zip code
New data: Real-time location data
What's possible: Geo-advertising, traffic, local search

Utilities
Today's challenge: Complex distribution grid
New data: Detailed consumption statistics
What's possible: Increased availability, reduced cost, tiered metering plans

Retail
Today's challenge: One-size-fits-all marketing
New data: Social media
What's possible: Sentiment analysis, segmentation

Two Sets of Characteristics


Batch-Oriented
Process data to use
Bulk storage
Write once, read all

Real-Time
Deliver a service
Fast access to specific record
Read, write, delete, update

Best Choices
Hadoop Distributed File System (HDFS)
File system
Parallel scanning
No inherent structure
High volume writes

Oracle NoSQL Database
Database
Indexed storage
Simple data structure
High volume random reads and writes
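
The contrast between the two storage styles can be illustrated with a minimal in-memory sketch. This is not HDFS or Oracle NoSQL Database code; the class names AppendOnlyLog and IndexedStore are hypothetical and exist only to show "write once, scan everything" next to "indexed access to one record".

```python
from typing import Dict, Iterator, List, Optional, Tuple

Record = Tuple[str, str]  # (key, payload)


class AppendOnlyLog:
    """Batch-style storage (hypothetical): write once, then scan all records."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, key: str, payload: str) -> None:
        # Records are only ever added; nothing is updated in place.
        self._records.append((key, payload))

    def scan(self) -> Iterator[Record]:
        # There is no index, so every query walks the whole log.
        return iter(self._records)


class IndexedStore:
    """Real-time-style storage (hypothetical): direct access by key."""

    def __init__(self) -> None:
        self._index: Dict[str, str] = {}

    def put(self, key: str, payload: str) -> None:
        # Insert a new record or overwrite the existing one.
        self._index[key] = payload

    def get(self, key: str) -> Optional[str]:
        return self._index.get(key)

    def delete(self, key: str) -> bool:
        # Returns True only if the key was present.
        return self._index.pop(key, None) is not None
```

In this sketch, answering "what did user u1 do?" from the log means filtering a full scan, while the indexed store answers with a single lookup.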

Hadoop Architecture
Distributed file system
Map/Reduce programming paradigm
Highly scalable data processing

[Diagram: Management/Monitoring alongside MapReduce and the Hadoop Distributed File System (HDFS).]
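
The Map/Reduce paradigm can be sketched in a few lines of single-process Python. This is an illustration of the programming model only, not Hadoop's API: the function names map_phase, shuffle, reduce_phase, and word_count are assumptions, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling word_count on a list of text lines returns a dictionary of word frequencies; the same three-step shape applies to other aggregations by changing what the map step emits.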

HDFS Overview

Distributes Data on Cluster


Multiple Copies

Add Nodes to Scale
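
The three ideas on this slide (split data across the cluster, keep multiple copies, add nodes to scale) can be sketched as a toy placement function. This is a simplified round-robin illustration, not HDFS's actual placement policy; the function place_blocks and all of its parameters are hypothetical.

```python
from typing import Dict, List


def place_blocks(
    file_size: int, block_size: int, nodes: List[str], replication: int
) -> Dict[int, List[str]]:
    """Assign each block of a file to `replication` distinct nodes.

    Toy round-robin placement for illustration only.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    if not 1 <= replication <= len(nodes):
        raise ValueError("replication must be between 1 and the node count")

    num_blocks = -(-file_size // block_size)  # ceiling division
    placement: Dict[int, List[str]] = {}
    for block in range(num_blocks):
        # Consecutive nodes starting at a rotating offset give distinct copies.
        placement[block] = [
            nodes[(block + copy) % len(nodes)] for copy in range(replication)
        ]
    return placement
```

With more nodes in the list, the same file's blocks spread over more machines, which is the sense in which adding nodes scales the system in this sketch.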

HDFS Overview
Strengths
Large files
Write once
Optimized to stream

Weaknesses
Low-latency access
Lots of small files
Multiple writers
File updates

HDFS Use Cases


Click stream storage and analysis
Number of web sessions lasting more than X minutes
Most/least frequently browsed pages
Group session times by hour of day and source location

Sentiment analysis
How many comments contain the word(s) or phrase(s)

Relationship discovery
What items appear to be related in time or proximity
How many times do X and Y happen in proximity
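
Two of the click stream questions above can be sketched as whole-dataset scans. The Session record and both function names are hypothetical, and the sketch assumes session start and end times have already been extracted from raw logs; on a cluster, the same logic would typically run as a batch job over files in HDFS.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, NamedTuple, Tuple


class Session(NamedTuple):
    """One web session (hypothetical record layout)."""
    start: datetime
    end: datetime
    source: str  # source location, for example a country code


def count_long_sessions(sessions: Iterable[Session], min_minutes: float) -> int:
    """Count sessions that lasted strictly longer than min_minutes."""
    threshold = timedelta(minutes=min_minutes)
    return sum(1 for s in sessions if s.end - s.start > threshold)


def minutes_by_hour_and_source(
    sessions: Iterable[Session],
) -> Dict[Tuple[int, str], float]:
    """Total session minutes grouped by (start hour of day, source)."""
    totals: Dict[Tuple[int, str], float] = defaultdict(float)
    for s in sessions:
        minutes = (s.end - s.start).total_seconds() / 60
        totals[(s.start.hour, s.source)] += minutes
    return dict(totals)
```

Both functions read every session once and never update a record, which matches the write-once, scan-all pattern that suits HDFS.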

Oracle NoSQL Database


Distributed key-value database
Simple programming model
Scalable throughput
Commercial software and support
Easy management

[Diagram: two applications, each with a NoSQL Driver, issue read, update, and delete operations against groups of nodes labeled West, Central, and East.]

Oracle NoSQL Database


Enterprise Topology
Replicated Application Servers
Driver linked into each Application
Data Nodes kept current
Storage Nodes across Data Centers
Automatic Storage Node failure handling: graceful degradation, automatic recovery
No single point of failure

Oracle NoSQL Database Key Features


Simple Data Model
Simple data model: key-value pair (major + minor key paradigm)
Simple operations: read/insert/update/delete, read-modify-write (RMW) support
Scope of transaction: records within a major key, single API call
Unordered scan of all data (non-transactional)

[Diagram: major and minor keys are strings; the value is a byte array. The major key shown is userid; the other labels shown are subscriptions, expiration date, address, phone #, and email id.]
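
The major + minor key paradigm can be sketched as a two-level dictionary. This is an in-memory illustration, not the Oracle NoSQL Database API; the class MajorMinorStore and its method names are hypothetical. It mirrors the slide in that keys are strings, values are byte arrays, and all records under one major key can be fetched in one call.

```python
from typing import Dict, Optional


class MajorMinorStore:
    """Toy key-value store addressed by (major key, minor key)."""

    def __init__(self) -> None:
        # major key -> {minor key -> value bytes}
        self._data: Dict[str, Dict[str, bytes]] = {}

    def put(self, major: str, minor: str, value: bytes) -> None:
        """Insert or update one record."""
        self._data.setdefault(major, {})[minor] = value

    def get(self, major: str, minor: str) -> Optional[bytes]:
        """Read one record, or None if it does not exist."""
        return self._data.get(major, {}).get(minor)

    def multi_get(self, major: str) -> Dict[str, bytes]:
        """Read every record that shares the given major key."""
        return dict(self._data.get(major, {}))

    def delete(self, major: str, minor: str) -> bool:
        """Delete one record; returns True if it existed."""
        records = self._data.get(major)
        if records is None or minor not in records:
            return False
        del records[minor]
        if not records:
            del self._data[major]  # drop empty major-key groups
        return True
```

Grouping records under a major key is what makes "records within a major key" a natural transaction scope: everything for one userid lives together.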

Oracle NoSQL Database Key Features


ACID Transactions
Specified on per-operation basis, application sets defaults
Configurable Durability Policy: Sync Policy + Replica Ack Policy
Configurable Consistency Policy
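
A durability policy built from a sync policy plus a replica acknowledgement policy, with an application default that one operation can override, can be sketched as follows. The enum members, the Durability class, and both functions are illustrative assumptions rather than the product's API. The simple-majority arithmetic also assumes that the master counts toward the majority.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SyncPolicy(Enum):
    NO_SYNC = "no_sync"              # do not force the write to disk
    WRITE_NO_SYNC = "write_no_sync"  # write to the OS, but do not fsync
    SYNC = "sync"                    # write and fsync before returning


class ReplicaAckPolicy(Enum):
    NONE = "none"
    SIMPLE_MAJORITY = "simple_majority"
    ALL = "all"


@dataclass(frozen=True)
class Durability:
    sync: SyncPolicy
    replica_ack: ReplicaAckPolicy


def effective_durability(
    default: Durability, override: Optional[Durability] = None
) -> Durability:
    """Per-operation setting wins; otherwise use the application default."""
    return override if override is not None else default


def replica_acks_required(policy: ReplicaAckPolicy, group_size: int) -> int:
    """Replica acknowledgements needed in a group of group_size nodes.

    group_size includes the master. Assumption: the master counts toward
    a simple majority, so only group_size // 2 replicas must acknowledge.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if policy is ReplicaAckPolicy.NONE:
        return 0
    if policy is ReplicaAckPolicy.ALL:
        return group_size - 1
    return group_size // 2
```

An application might default to a relaxed setting for high-volume writes and pass a stricter Durability for the few operations that must survive a node failure.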

Personalization on Login
Profile Lookup

Update profile

Generate and display

Record actions

Oracle NoSQL Database Use Cases


Data capture
Sensor data capture (e.g. IA, SmartGrid, Earth Sci., BioMedical Sci.)
Statistics and network capture (QoS, network management)
Web applications (click-through capture)
Backup services for mobile devices

Data services
NoSQL data sharing (Earth Sci., BioMedical)
Scalable authentication
Real-time communication (MMS, SMS, routing)
Social networks, personalization

Oracle NoSQL Database Differentiation


Seamless integration with Oracle stack
Commercial grade
Scalable
Simple programming model
Easy management

Oracle Integrated Solution Stack for Big Data

[Diagram: components arranged across the Acquire, Organize, Analyze, and Decide stages: HDFS, Oracle NoSQL Database, Enterprise Applications, Hadoop (MapReduce), Oracle Big Data Connectors, Oracle Data Integrator, Data Warehouse, In-Database Analytics, Analytic Applications.]

Usage Model
[Diagram: Big Data Appliance, Exadata, and Exalytics shown across the Acquire, Organize, Analyze, and Decide stages.]

Acquiring Big Data


Right place for your data
HDFS, NoSQL, Relational

Uncover value with analysis

Next Session
Start (PT) | Start (ET) | Session
10:00 am | 1:00 pm | Keynote: Big Data: Are You Ready?
10:30 am | 1:30 pm | Big Data Panel Discussion
11:00 am | 2:00 pm | Acquiring Big Data
11:30 am | 2:30 pm | Organizing Big Data
12:00 pm | 3:00 pm | Analyzing Big Data
12:30 pm | 3:30 pm | Conquering Big Data
1:00 pm | 4:00 pm | Closing: A Gartner Perspective on Big Data