
Acquiring Big Data

David Segleau, Director, Product Management

Big Data In Action


Make Better Decisions Using Big Data


Big Data In Action


Acquire all available data


Traditional Data Sources

New Data Sources

Big Data Use Cases

Healthcare
Today's Challenge: Expensive office visits
New Data: Remote patient monitoring
What's Possible: Preventive care, reduced hospitalization

Manufacturing
Today's Challenge: In-person support
New Data: Product sensors
What's Possible: Automated diagnosis, support

Location-Based Services
Today's Challenge: Based on home zip code
New Data: Real-time location data
What's Possible: Geo-advertising, traffic, local search

Utilities
Today's Challenge: Complex distribution grid
New Data: Detailed consumption statistics
What's Possible: Increased availability, reduced cost, tiered metering plans

Retail
Today's Challenge: One-size-fits-all marketing
New Data: Social media
What's Possible: Sentiment analysis, segmentation

Two Sets of Characteristics

Batch-Oriented
Process data to use
Bulk storage
Write once, read all

Real-Time
Deliver a service
Fast access to specific record
Read, write, delete, update

Best Choices
Hadoop Distributed File System (HDFS)
File system
Parallel scanning
No inherent structure
High volume writes

Oracle NoSQL Database
Database
Indexed storage
Simple data structure
High volume random reads and writes
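The difference between parallel scanning and indexed storage can be sketched in a few lines of Python. This is a toy in-memory illustration, not HDFS or Oracle NoSQL Database code; the record layout and function names are made up for the example.

```python
# Toy contrast between batch scanning and indexed lookup.
# All names and data here are hypothetical.

records = [
    {"user": "u1", "page": "/home", "seconds": 30},
    {"user": "u2", "page": "/cart", "seconds": 95},
    {"user": "u1", "page": "/cart", "seconds": 12},
]


def batch_total_seconds(all_records):
    """Batch style: read every record once and compute an aggregate."""
    return sum(r["seconds"] for r in all_records)


def build_index(all_records):
    """Real-time style: index records by key so one user is found directly."""
    index = {}
    for r in all_records:
        index.setdefault(r["user"], []).append(r)
    return index


index = build_index(records)
total = batch_total_seconds(records)
u1_pages = [r["page"] for r in index["u1"]]
```

The scan touches all records to answer one question about the whole data set, while the index answers a question about a single key without reading anything else.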

Hadoop Architecture

Distributed file system: Hadoop Distributed File System (HDFS)
Map/Reduce programming paradigm
Highly scalable data processing
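As a rough illustration of the Map/Reduce paradigm, the sketch below counts words in a handful of comments using separate map, shuffle, and reduce steps. It runs in a single Python process and does not use Hadoop; the function names and sample comments are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(comment):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in comment.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(comments):
    pairs = [pair for c in comments for pair in map_phase(c)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["Great product", "great price", "slow delivery"])
```

Because each map call sees one record and each reduce call sees one key, both phases can in principle be spread across many machines, which is where the scalability comes from.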


HDFS Overview

Distributes Data on Cluster

Multiple Copies

Add Nodes to Scale
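The idea of distributing a file across a cluster with multiple copies can be sketched as follows. The round-robin placement and the `place_blocks` helper are simplifying assumptions for illustration and are not HDFS's actual placement policy.

```python
def place_blocks(file_size, block_size, nodes, replicas):
    """Assign each block of a file to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = -(-file_size // block_size)  # ceiling division
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            nodes[(block + i) % len(nodes)] for i in range(replicas)
        ]
    return placement


layout = place_blocks(file_size=300, block_size=128,
                      nodes=["n1", "n2", "n3", "n4"], replicas=3)
```

Adding a node to the `nodes` list spreads the same blocks over more machines, which is the sense in which the cluster scales by adding nodes.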

HDFS Overview
Strengths
Large files
Write once
Optimized to stream

Weaknesses
Low-latency access
Lots of small files
Multiple writers
File updates

HDFS Use Cases

Click stream storage and analysis
Number of web sessions lasting more than X minutes
Most/Least frequently browsed pages
Group session times by hour of day and source location
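Two of the click stream questions above can be sketched as simple full scans over session records. The record layout and sample values are hypothetical, and a real job would read the sessions from files rather than an in-memory list.

```python
from collections import Counter

# Hypothetical session records: (session_id, start_hour, duration_minutes).
sessions = [
    ("s1", 9, 12.5),
    ("s2", 9, 3.0),
    ("s3", 14, 45.0),
    ("s4", 14, 8.0),
    ("s5", 22, 1.5),
]


def count_long_sessions(records, min_minutes):
    """Number of sessions lasting more than `min_minutes`."""
    return sum(1 for _, _, minutes in records if minutes > min_minutes)


def minutes_by_hour(records):
    """Total session time grouped by the hour of day the session started."""
    totals = Counter()
    for _, hour, minutes in records:
        totals[hour] += minutes
    return dict(totals)


long_sessions = count_long_sessions(sessions, min_minutes=5)
hourly = minutes_by_hour(sessions)
```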

Sentiment analysis
How many comments contain the word(s) or phrase(s)

Relationship discovery
What items appear to be related in time or proximity
How many times do X and Y happen in proximity
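Counting how often X and Y happen close together can be sketched as a pairwise comparison of timestamps. The event names, the log format, and the 60-second window are assumptions chosen for the example.

```python
def count_cooccurrences(events, x, y, window):
    """Count pairs where an `x` event and a `y` event fall within `window` seconds."""
    x_times = [t for t, name in events if name == x]
    y_times = [t for t, name in events if name == y]
    return sum(1 for tx in x_times for ty in y_times if abs(tx - ty) <= window)


# Hypothetical (timestamp_seconds, event_name) log.
events = [
    (0, "view_tent"),
    (20, "view_stove"),
    (500, "view_tent"),
    (510, "view_stove"),
    (900, "view_stove"),
]

pairs = count_cooccurrences(events, "view_tent", "view_stove", window=60)
```

This brute-force version compares every X with every Y, which is only reasonable for small inputs; at scale the same question would be split across many workers.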

Oracle NoSQL Database

Diagram: two applications, each with a NoSQL Driver, issuing delete/read and update/read operations

Distributed key-value database
Simple programming model

Scalable throughput
Commercial software and support



Easy management

Oracle NoSQL Database

Enterprise Topology
Replicated Application Servers
Driver linked into each Application
Data Nodes kept current
Storage Nodes across Data Centers
Automatic Storage Node failure handling
Graceful degradation
Automatic recovery
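One way to picture automatic failure handling is a client that tries the next replica when a storage node is down. The classes and the `read_with_failover` helper below are hypothetical and do not reflect how the Oracle NoSQL Database driver is implemented.

```python
class NodeDown(Exception):
    """Raised by a hypothetical storage node that cannot serve requests."""


class StorageNode:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = data
        self.up = up

    def get(self, key):
        if not self.up:
            raise NodeDown(self.name)
        return self.data.get(key)


def read_with_failover(nodes, key):
    """Try each replica in turn; fail only if every replica is down."""
    for node in nodes:
        try:
            return node.get(key), node.name
        except NodeDown:
            continue
    raise NodeDown("all replicas unavailable")


replicas = [
    StorageNode("sn1", {"user42": "gold"}, up=False),
    StorageNode("sn2", {"user42": "gold"}),
]
value, served_by = read_with_failover(replicas, "user42")
```

The read still succeeds with one node down, which is the behavior "graceful degradation" describes; only the loss of every replica surfaces as an error.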

No Single point of failure

Oracle NoSQL Database Key Features

Simple Data Model
Simple data model: key-value pair (major + minor key paradigm)
Simple operations: read/insert/update/delete, RMW support
Scope of transaction: records within a major key, single API call
Unordered scan of all data (non-transactional)

Diagram: example record with labels Major key, Minor key, Value, Strings, Byte Array; example entries: subscriptions, expiration date, address, phone #, email id
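The major + minor key paradigm can be sketched with a small in-memory store in which all records sharing a major key are kept together. The `MajorMinorStore` class and its method names are assumptions for illustration and are not the Oracle NoSQL Database API.

```python
class MajorMinorStore:
    """In-memory toy store keyed by (major key, minor key)."""

    def __init__(self):
        self._data = {}

    def put(self, major, minor, value):
        self._data.setdefault(major, {})[minor] = value

    def get(self, major, minor):
        return self._data.get(major, {}).get(minor)

    def multi_get(self, major):
        """Return every minor-key record stored under one major key."""
        return dict(self._data.get(major, {}))

    def delete(self, major, minor):
        return self._data.get(major, {}).pop(minor, None) is not None


store = MajorMinorStore()
store.put("user42", "email id", b"pat@example.com")
store.put("user42", "phone #", b"555-0100")
store.put("user7", "address", b"1 Main St")

profile = store.multi_get("user42")
```

Grouping records under one major key is what makes it natural to scope a transaction to that key: everything a single call needs sits in one place.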

Oracle NoSQL Database Key Features

ACID Transactions
Specified on per-operation basis, application sets defaults
Configurable Durability Policy: Sync Policy + Replica Ack Policy

Configurable Consistency Policy
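The effect of a replica acknowledgment policy can be sketched as a rule for how many copies must hold a write before it is reported as done. The policy labels "none", "majority", and "all" and both helper functions are hypothetical names for this sketch, not the product's configuration values.

```python
def copies_required(ack_policy, group_size):
    """Copies (master included) that must hold a write before it is acknowledged."""
    if ack_policy == "none":
        return 1  # the master alone
    if ack_policy == "majority":
        return group_size // 2 + 1
    if ack_policy == "all":
        return group_size
    raise ValueError(f"unknown ack policy: {ack_policy}")


def write_is_acknowledged(ack_policy, group_size, copies_written):
    """True once enough copies exist to satisfy the chosen policy."""
    return copies_written >= copies_required(ack_policy, group_size)


needed = {p: copies_required(p, 3) for p in ("none", "majority", "all")}
```

Stricter policies wait for more copies and so trade write latency for durability, which is why setting the policy per operation is useful.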

Personalization on Login
Profile Lookup

Update profile

Generate and display

Record actions
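The four login steps above can be sketched as one request handler over a key-value profile store. Plain dictionaries stand in for the database, and the field names and greeting text are made up for the example.

```python
import time

# Hypothetical profile store and action log; a dict and a list stand in
# for the database.
profiles = {"user42": {"name": "Pat", "interests": ["camping"], "logins": 0}}
action_log = []


def handle_login(user_id, now=None):
    """Look up the profile, update it, build a greeting, and record the action."""
    now = time.time() if now is None else now
    profile = profiles[user_id]                 # profile lookup
    profile["logins"] += 1                      # update profile
    profile["last_login"] = now
    page = (                                    # generate and display
        f"Welcome back, {profile['name']}! "
        f"Recommended: {profile['interests'][0]}"
    )
    action_log.append((user_id, "login", now))  # record actions
    return page


page = handle_login("user42", now=1000.0)
```

Every step reads or writes a single user's record, which is the fast access to a specific record that the real-time workload calls for.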

Oracle NoSQL Database Use Cases

Data capture
Sensor data capture (e.g. IA, SmartGrid, Earth Science, Biomedical Science)
Statistics and network capture (QoS, network management)
Web applications (click-through capture)
Backup services for mobile devices

Data services
NoSQL data sharing (Earth Science, Biomedical)
Scalable authentication
Real-time communication (MMS, SMS, routing)
Social networks, personalization

Oracle NoSQL Database Differentiation

Seamless integration with Oracle stack
Commercial grade
Scalable
Simple programming model
Easy management

Oracle Integrated Solution Stack for Big Data

HDFS
Oracle NoSQL Database


In-Database Analytics

Oracle Big Data Connectors
Oracle Data Integrator

Data Warehouse

Analytic Applications

Enterprise Applications





Usage Model
Big Data Appliance
Exadata
Exalytics





Acquiring Big Data

Right place for your data
HDFS
NoSQL
Relational

Uncover value with analysis

Next Session

Start (PT)   Start (ET)   Session
10:00 am     1:00 pm      Keynote: Big Data: Are You Ready?
10:30 am     1:30 pm      Big Data Panel Discussion
11:00 am     2:00 pm      Acquiring Big Data
11:30 am     2:30 pm      Organizing Big Data
12:00 pm     3:00 pm      Analyzing Big Data
12:30 pm     3:30 pm      Conquering Big Data
1:00 pm      4:00 pm      Closing: A Gartner Perspective on Big Data