Big Data Overview
August 18, 2014 © 2013 IBM Corporation
Agenda
▪ What is Big Data
▪ Use Cases
▪ The IBM Big Data Platform
Intrinsic Property of Data … it grows
▪ 90% of the world’s data was created in the last two years
▪ 80% of the world’s data today is unstructured
▪ 20% of available data can be processed by traditional systems
▪ 1 in 2 business leaders don’t have access to data they need
▪ 83% of CIOs cited BI and analytics as part of their visionary plan
▪ 5.4X more likely that top performers use business analytics
Source: GigaOM, Software Group, IBM Institute for Business Value
A growing Interconnected and Instrumented World
▪ 30 billion RFID tags today (1.3B in 2005)
▪ 4.6 billion camera phones world wide
▪ 100s of millions of GPS enabled devices sold annually
▪ 2+ billion people on the Web by end 2011
▪ 76 million smart meters in 2009… 200M by 2014
▪ 500+ Million users posting 55 Million tweets every day
▪ 1+ Billion active users spending 700 Million minutes per month
▪ 1.2 Trillion searches
Characteristics of Big Data
▪ V4 = Volume Velocity Variety Veracity
– Cost efficiently processing the growing Volume
• 50x growth, 2010 to 2020 (35 ZB)
– Responding to the increasing Velocity
• 30 Billion RFID sensors and counting
– Collectively analyzing the broadening Variety
• 80% of the world’s data is unstructured
– Establishing the Veracity of big data sources
• 1 in 3 business leaders don’t trust the information they use to make decisions
Commoditization of Hardware Enabling New Analytics
▪ Low cost compute platform
– 1 petabyte Hadoop cluster for approx $1 million
– Hadoop architecture
• Optimized for high data volumes
• Clusters of affordable machines running a Distributed File System (HDFS) and MapReduce processing
• Hardware failure is expected and managed
▪ Hardware Appliance
– Up and Running with new cluster in hours
▪ Cloud
– Up and Running with new cluster in minutes
– Pay for what you use
Source: Forbes, “The Big Cost of Big Data”
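The MapReduce processing model named on this slide can be illustrated with a word count. The sketch below runs in a single Python process and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions. On a real cluster the framework would run the map and reduce steps on many machines and perform the shuffle between them.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split):
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(splits):
    """Run map, shuffle, and reduce over a list of text splits (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(s) for s in splits)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Each string stands in for a block of a file stored in HDFS.
    print(word_count(["big data grows", "data grows fast"]))
```

Because each split is mapped independently, a failed map task can simply be rerun on another machine, which is one way the "hardware failure is expected and managed" point plays out.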
The 5 Key Big Data Use Cases
▪ Big Data Exploration
– Find, visualize, understand all big data to improve decision making
▪ Enhanced 360° View of the Customer
– Extend existing customer views by incorporating additional internal and external data sources
▪ Security/Intelligence Extension
– Lower risk, detect fraud and monitor cyber security in real-time
▪ Operations Analysis
– Analyze a variety of machine data for improved business results
▪ Data Warehouse Augmentation
– Integrate big data and data warehouse capabilities to increase operational efficiency
More Ways - Wide Ranging Analytics & Techniques
▪ Spatial Analysis
▪ Temporal Analysis
▪ Statistics
▪ Text Analysis
▪ Image Analysis
▪ Video Analysis
▪ Audio Analysis
▪ Machine Learning
Big Data and Complexity in Health Care
▪ Medical information is doubling every 5 years, much of which is unstructured
▪ 81% of physicians report spending 5 hours or less per month reading medical journals
▪ “Medicine has become too complex (and only) about 20 percent of the knowledge clinicians use today is evidence-based”
– Steven Shapiro, Chief Medical and Scientific Officer, UPMC
▪ “…to keep up with the state of the art, a doctor would have to devote 160 hours a week to perusing papers…”
– The Economist, Feb 14th 2013
Source: International Journal of Circumpolar Health, DoctorDirectory.com, Institute for Medicine
Big Data Platform and Application Frameworks
▪ Platform layers
– Solutions
– Analytics and Decision Management
– IBM Big Data Platform
• Visualization & Discovery
• Applications & Development
• Systems Management
• Accelerators
• Hadoop System
• Stream Computing
• Data Warehouse
• Contextual Discovery
• Information Integration & Governance
– Big Data Infrastructure
– Cloud | Mobile | Security
▪ Capabilities
– Gather, extract and explore data using best of breed visualization
– Speed time to value with analytic and application accelerators
– Analyze streaming data and large data bursts for real-time insights
– Cost-effectively analyze Petabytes of structured and unstructured information
– Index and federated discovery for contextual collaborative insights
– Govern data quality and manage information lifecycle
– Deliver deep insight with advanced in-database analytics and operational analytics
An example of the big data platform in practice
▪ Ingestion and Real-time Analytic Zone
– Streams
▪ Landing and Analytics Sandbox Zone
– Hadoop
– Hive/HBase Col Stores
– MapReduce
– Documents in variety of formats
▪ Warehousing Zone
– Enterprise Warehouse
– Data Marts
▪ Analytics and Reporting Zone
– BI & Reporting
– Predictive Analytics
– Visualization & Discovery
▪ Metadata and Governance Zone
– ETL, MDM, Data Governance
▪ Connectors
A Big Data Platform Manifesto
▪ Understand and Navigate Federated Big Data Sources
– Federated Discovery and Navigation
▪ Manage and Store Huge Volume of any Data
– Hadoop File System, MapReduce
▪ Structure and Control Data
– Data Warehousing
▪ Manage Streaming Data
– Stream Computing
▪ Analyze Unstructured Data
– Text Analytics Engine
▪ Integrate and Govern all Data Sources
– Integration, Data Quality, Security, ILM, MDM
Use Cases for a Big Data Platform
▪ Financial services
– Problem:
• Manage several petabytes of data, growing at 40-100% per year, under
increasing pressure to prevent fraud and comply with regulations.
– How big data analytics can help:
• Fraud detection
• Risk management
• 360° View of the Customer
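To make the 40-100% annual growth figure concrete, the sketch below applies plain compound growth. The 3 PB starting size and 5-year horizon are hypothetical inputs chosen for illustration, and the function names are assumptions, not part of the original slide.

```python
import math


def projected_size_pb(initial_pb, annual_growth, years):
    """Compound growth: size after `years` at a fractional annual growth rate."""
    return initial_pb * (1 + annual_growth) ** years


def doubling_time_years(annual_growth):
    """Years needed for the data volume to double at a constant growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # Hypothetical starting point: 3 PB today, projected 5 years ahead.
    for rate in (0.40, 1.00):
        size = projected_size_pb(3, rate, 5)
        print(f"{rate:.0%} growth: {size:.1f} PB after 5 years, "
              f"doubling every {doubling_time_years(rate):.2f} years")
```

Under these assumptions the same 3 PB becomes roughly 16 PB at 40% growth and 96 PB at 100% growth, so the low and high ends of the quoted range lead to very different capacity plans.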
Use Cases for a Big Data Platform
▪ Telecommunication services
– Problem:
• Legacy systems used to gain insights from internally generated data face
high storage costs, long data loading times, and long administration
processes.
– How big data analytics can help:
• CDR processing
• Churn prediction
• Geomapping / marketing
• Network monitoring
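As an illustration of CDR (call detail record) processing, the sketch below rolls records up per subscriber. The record layout (`caller`, `duration_s`, `dropped`) is a hypothetical simplification; real CDR formats vary by operator and carry many more fields. Per-subscriber aggregates like these are the kind of input a churn model or network monitor might consume.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical, simplified call detail record."""
    caller: str
    duration_s: int
    dropped: bool


def summarize(records):
    """Aggregate call count, total seconds, and dropped calls per caller."""
    totals = defaultdict(lambda: {"calls": 0, "seconds": 0, "dropped": 0})
    for record in records:
        entry = totals[record.caller]
        entry["calls"] += 1
        entry["seconds"] += record.duration_s
        entry["dropped"] += int(record.dropped)
    return dict(totals)


if __name__ == "__main__":
    sample = [
        CallRecord("A", 60, False),
        CallRecord("A", 30, True),
        CallRecord("B", 120, False),
    ]
    for caller, stats in summarize(sample).items():
        print(caller, stats)
```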
Use Cases for a Big Data Platform
▪ Transportation services
– Problem:
• Traffic congestion has been increasing worldwide as a result of increased
urbanization and population growth, reducing the efficiency of transportation
infrastructure and increasing travel time and fuel consumption.
– How big data analytics can help:
• Real-time analysis of weather and traffic congestion data streams to identify
traffic patterns, reducing transportation costs.
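One simple form of real-time stream analysis is a sliding-window average per road segment. The sketch below is a minimal single-process illustration; the class name, window size, and 20 km/h threshold are assumptions, and weather data is left out for brevity. A stream computing platform would apply the same idea across many segments in parallel.

```python
from collections import defaultdict, deque


class CongestionMonitor:
    """Flag a road segment when its recent average speed falls below a threshold."""

    def __init__(self, window=3, threshold_kmh=20.0):
        self.threshold_kmh = threshold_kmh
        # One bounded window of recent speed readings per segment.
        self._readings = defaultdict(lambda: deque(maxlen=window))

    def observe(self, segment, speed_kmh):
        """Record one reading; return True if the segment now looks congested."""
        readings = self._readings[segment]
        readings.append(speed_kmh)
        if len(readings) < readings.maxlen:
            return False  # not enough data yet to judge
        return sum(readings) / len(readings) < self.threshold_kmh


if __name__ == "__main__":
    monitor = CongestionMonitor(window=3, threshold_kmh=20.0)
    for speed in (50, 10, 10, 10):
        print(speed, monitor.observe("segment-1", speed))
```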
Use Cases for a Big Data Platform
▪ Healthcare and Life Sciences
– Problem:
• Vast quantities of real-time information are starting to come from wireless
monitoring devices that postoperative patients and those with chronic diseases
are wearing at home and in their daily lives.
– How big data analytics can help:
• Epidemic early warning
• Intensive Care Unit and remote monitoring
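As a rough illustration of an early-warning signal, the sketch below flags a day whose case count exceeds the historical mean by more than k sample standard deviations. This is a generic statistical rule chosen for illustration, not the method used by any particular product; the function name, the value of `k`, and the sample counts are all assumptions.

```python
from statistics import mean, stdev


def is_outbreak_signal(history, today, k=3.0):
    """Return True when today's count exceeds mean(history) + k * stdev(history)."""
    if len(history) < 2:
        return False  # stdev needs at least two data points
    return today > mean(history) + k * stdev(history)


if __name__ == "__main__":
    # Hypothetical daily case counts reported by clinics in one region.
    baseline = [10, 12, 11, 9, 13]
    print(is_outbreak_signal(baseline, 14))
    print(is_outbreak_signal(baseline, 20))
```

The same comparison against a rolling baseline could be applied to readings from remote monitoring devices, with the threshold tuned to trade off missed events against false alarms.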
Questions?