
IBM SPSS Modeler

Contents
NOTE: This deck has been designed to provide the elements needed to construct a presentation
tailored to a prospect's or customer's requirements.
It is not intended to be used in full.

What is Predictive Analytics?


IBM SPSS Modeler Overview
Better Decisions with SPSS Modeler Gold
SPSS Modeler Capabilities
Modeling Algorithms
Premium Features: Text Analytics, Entity Analytics, Social Network Analysis
IBM SPSS Modeler & Open Source
Managing & Deploying with IBM SPSS Collaboration & Deployment Services
Analytics at Scale: IBM SPSS Modeler Server & Related Technologies
Integration with Other IBM Products

What is Predictive Analytics?


Market Context

WHAT IS PREDICTIVE
ANALYTICS?

2013 IBM Corporation

What is Predictive Analytics?

"Predictive Analytics helps connect data to effective action by drawing reliable
conclusions about current conditions and future events."
Gareth Herschel, Research Director, Gartner Group

Key themes driving IBM's analytic strategy

Engaging

Analytics for Everyone

Addressing the
Analytical Needs of
the Business User

Breakthrough
Analytics for All Data

Exploiting Value
From the Relevant
New Mix of Data

Pervasive

Analytics for All Problems

Making Decisions
At Point of Impact

Simplify consumption: analytics must impact the business moment in a meaningful way
Create New Value From Actionable Insight

New Mix of Data

/ Broader Application / New Buyers and Decision Types

Organizations using predictive analytics see results

Enjoyed a 75% higher click-through rate and a 73% higher sales lift than companies
that did not use predictive analytics [1]

The ROI of business analytics solutions that incorporate predictive analytics is about 250% [2]

1. Source: Aberdeen, Divide & Conquer: Using Predictive Analytics to Segment, Target & Optimize Marketing (pg. 1), February 2012.
2. Source: IDC, The Business Value of Predictive Analytics, June 2011.

Transforming raw data to insight-driven actions key to creating value

A solid majority of organizations are now realizing a return on
their big data investments within a year
Integrating digital capabilities into business processes is
transforming organizations
The value driver for big data has shifted from volume to
velocity

63%

of organizations realize
a positive return on their
analytic investments
within one year

46%

are re-inventing business


processes by integrating
digital capabilities

IBM SPSS Predictive Analytics Foundation

Predictive
Maintenance and
Quality (PMQ)

Predictive Customer
Intelligence (PCI)

Counter Fraud
Management
(CFM)

Custom Applications

Differentiated Analytic Solutions

Data
Collection

Statistics

Modeler and
ADM

Analytic
Server

Collaboration and Deployment Services

Decision
Optimization

Watson
Analytics

Predictive analytics for all


ANALYTIC LEADER

SOLUTION BUYER

(Cross-enterprise/Chief Data Officer)

(Primarily Senior Line of Business)

IBM SPSS Predictive Analytics Enterprise


IBM SPSS Modeler Gold (On Prem/On Cloud)
IBM SPSS Analytic Server

IBM Predictive Maintenance and


Quality
IBM Predictive Customer Intelligence
IBM Counter Fraud Management

SELF-SUFFICIENT BUILDER

SELF-SERVICE USER

(Primarily Developer/Data Scientist)

(Primarily Line of Business Individual)

IBM Predictive Modeling on Bluemix


IBM Embedded Predictive Modeling
IBM SPSS Modeler/Statistics
Predictive Extensions/Community Assets
Developer

IBM Watson Analytics


IBM SPSS Modeler/Statistics
Predictive Extensions/Community
Assets User

IBM SPSS MODELER


OVERVIEW

IBM SPSS Modeler supports multiple users and multiple uses

Puts predictive
power into the
hands of a
business analyst

Provides the
sophistication needed
by an expert

Includes a range of
advanced data
manipulation and
analytical algorithms

Flexible deployment
options

Adoption options to meet business need


beyond cloud or not
Ideal for

IBM Provides

Business Process
as a Service

Customer/User is looking for a


hosted business process and
wants IBM to manage the
complete business process

Best-in-class BPaaS offering hosted by GBS on


IBM infrastructure (i.e., IBM Predictive Customer
Intelligence on Cloud)

Software
as a Service

Customer/User is looking to
move to cloud and wants IBM to
manage infrastructure and
applications it uses

Best-in-class SaaS offering available on IBM


market-leading infrastructure (i.e., IBM SPSS
Modeler Gold on Cloud)

IT developer looking to quickly


add enterprise features/services
into cloud applications

Rapid app development through composable and


integrated platform (IBM Predictive Modeling on
BlueMix, now in beta)

Platform
as a Service

Infrastructure
as a Service

On-premises

Customer/User is looking to
move to cloud and wants IBM to
manage infrastructure

Customer/User is looking to host


enterprise s/w on-premise

Enterprise-class, optimized Infrastructure for


IBM SPSS server-based products

Hybrid Offering

SPSS Modeler Editions


SPSS Modeler Gold
Provides organizations with the ability to build and deploy predictive models
directly into the business process to help people or systems make the right
decisions each time. Combines deployment with the workbench features of IBM
SPSS Modeler Premium.

SPSS Modeler Premium


Provides all of the capabilities of SPSS Modeler Professional combined with
advanced algorithms and capabilities for text analytics, entity analytics,
social network analysis to address broader use cases and more varied data
types.

SPSS Modeler Professional


Provides a range of advanced algorithms, data manipulation and automated
modeling and preparation techniques to build predictive models and uncover
hidden patterns in structured data. Can be extended to provide enterprise-scale
performance through in-database algorithms, SQL pushback and more to
support increasingly sophisticated Big Data needs.

SPSS Modeler Personal


Desktop-only data mining workbench for structured data built for the individual.

On-Premises or On-Cloud

BETTER DECISIONS WITH


SPSS MODELER GOLD

IBM SPSS Modeler Gold automates transactional decisions and


actions using predictive analytics, rules, scoring and optimization
Streaming Data
Textual Data

Rules

Applications Data

Predictive
Analytics

Time Series
Geotemporal
and Geospatial
Relational
Social Networks

Simulation and
Optimization
Scoring

Frontline Staff and Systems


Benefit From
Recommendations, Automated
Decisions Or Insight In the
Business Context

Decision management is an innovative business discipline that


combines a variety of techniques to optimize actions and outcomes
Optimized Decisions
Provide Frontline Employees and Systems
With Recommended Actions

Business
Rules

Empower Real-time and Adaptive Decisions


to Accommodate Changing Conditions

Optimize Actions With Resource Constraints,


Aligning Execution With Strategy
Optimization

Predictive
Analytics

This Capability Is Packaged With IBM SPSS Modeler Gold

Example: customer service relies on smart decision systems to
route the high volume of inbound calls and insurance claims

Frontline Representative Sees


Only Refer At the Point of Impact
Structured, Unstructured, Social Media
and Business Intelligence Data
Rules
Predictive Analytics
Simulation and Optimization
Scoring

Local rules drive governance, input and a critical link to strategy

The Types of Questions Asked


via the End-user Interface Drive
and Are Driven By the
Rules That Help Govern Answers,
Decisions and Recommendations

Predictive modeling leverages the likely state to enable answers,


decisions and recommendations

Responses Create the Structured


and Unstructured Data That Are
Used to Create a Predicted Score
of Fraud Or Churn, Failure,
Purchase, etc. Based On the
Application

Analytical decision management drives the recommendation to


frontline people and systems

Analytical Decision Management


Adjudicates (Moderates)
Predictive Models, Local Rules,
Scoring and Optimization
Taking Into Account Likely States,
Policies and Optimized Resource
Allocation to Provide a Scored
Answer Or Decision

The feedback loops enable an ongoing link from the execution of


decisions to strategy

Fast-start solution development with templates

Customer interactions: decide which promotions to offer


customers when they contact your organization and then
deliver recommendations to a call center, Website or store
location in real time

Campaign optimization: maximize the return on your


campaign investment by allocating the right customers to
each campaign, subject to budget, channel capacity and
contact policy constraints

Claims management: identify the best action or routing


for incoming claims in accordance with business goals

Predictive maintenance: predict failures and avoid costly


downtime by taking action before problems occur

Demand optimization: maximize profits and avoid costly


shortfalls or overages by allocating the right amount of
product, material or bandwidth to each channel or outlet

Under the hood: Modeler


Use Modeler to
predict churn,
satisfaction,
lifetime value, and
other attributes
Models provide
insight into what a
given customer is
likely to do.

Under the hood: Analytical Decision Management


Use Analytical
Decision
Management to
translate insight to
action.
For example, if a
customer has a
predicted lifetime
value above a
given threshold,
offer them a
premium phone.
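The rule on this slide can be sketched in a few lines. This is only an illustration of turning a model score into an action; the threshold value, the `Customer` class, and the "standard offer" fallback are hypothetical and are not part of Analytical Decision Management.

```python
from dataclasses import dataclass

# Hypothetical threshold, chosen only for illustration.
LIFETIME_VALUE_THRESHOLD = 5000.0


@dataclass
class Customer:
    customer_id: str
    predicted_lifetime_value: float  # score produced upstream by a predictive model


def choose_offer(customer: Customer, threshold: float = LIFETIME_VALUE_THRESHOLD) -> str:
    """Translate a model score into an action using a simple business rule."""
    if customer.predicted_lifetime_value > threshold:
        return "premium phone"
    return "standard offer"  # hypothetical fallback action


offers = {
    c.customer_id: choose_offer(c)
    for c in [Customer("A", 7200.0), Customer("B", 1800.0)]
}
print(offers)
```

Keeping the threshold as a parameter mirrors the slide's "given threshold": the rule stays the same while the business adjusts the cut-off.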

Under the hood: Collaboration and Deployment Services


Manage analytic
assets in a
centralized
repository
Configure batch
and real time
scoring
Define Analytic
Data View sources

Simplifying deployment: IBM SPSS Modeler Gold on Cloud

Modeler
Server

Analytical
Decision
Management

Statistics
Server

Real Time
Scoring

Collaboration & Deployment Services

Client
Software

Database & Web Application.


DB2 and WebSphere

Optional

Hosted environment
with software,
security and
infrastructure
managed by IBM

Expand scope and breadth with cloud-based statistical
analysis and real-time scoring

Subscription pricing
with flexible terms

Jump start services


that accelerate time
to value

SPSS MODELER CAPABILITIES

IBM SPSS Modeler

Easy-to-use, interactive interface without the


need for programming
Automated modeling and data preparation
capabilities
Access ALL data, structured and
unstructured, from disparate sources
Natural Language Processing (NLP) to
extract concepts and sentiments in text
Entity Analytics ensures the quality of the
data and results in more accurate models
Leverage existing investment in Cognos,
Netezza, InfoSphere and System Z

Supports Industry and Vendor Agnostic Data Mining Methodology


CRISP-DM
Business understanding
Data understanding
Data preparation
Modeling
Evaluation
Deployment

IBM SPSS Modeler User Interface

IBM SPSS Modeler User Interface Nodes Palette

IBM SPSS Modeler User Interface Stream Canvas

IBM SPSS Modeler User Interface Managers Pane & Project


Pane

Full breadth of native predictive techniques

Technique | Algorithms | Usage
Classification (Or Prediction) | Auto Classifier, Decision Trees, Logistic, Support Vector Machines, Time Series | Predict Group Membership (e.g., Will This Employee Leave?) Or a Number (e.g., How Many Widgets Will I Sell?)
Segmentation | Auto Cluster, K-Means, Anomaly Detection | Classify Data Points Into Groups That Are Internally Homogeneous and Externally Heterogeneous, Identify Cases That Are Unusual
Association | Apriori, CARMA, Sequence | Find Events That Occur Together Or In a Sequence (Market Basket)
Geospatial | Space-Time Boxes | Ability to Improve Model Accuracy (for Any Model Type) By Including Inputs Derived From Geospatial Data Sources
Automated | Auto Classifier, Auto Numeric, Time Series, Clustering | Automatically Find the Right Algorithms Based On Data and Outcome to Create An Ensemble Model
Simulation | Monte Carlo | Run Different Scenarios to Identify Which Is Best From Historical Data Or Generated Data
Specialized | Text Analytics, Entity Analytics, Social Network Analysis | Improve Overall Model Accuracy
In-database | Netezza, DB2, Oracle, Microsoft | Provide User Friendly Interface On Top of Vendor Algorithms
Open Source | R | Utilize Open Source Algorithms From R Within Modeler UI. Enhance By Easily Building Custom Dialogs
CLEF | Unlimited | Leverage the Extension Framework to Build Custom Nodes

MODELING ALGORITHMS

Classification and Prediction

Help to predict a result:


Will a customer buy or leave
Does transaction fit a known pattern of fraud
Expected inventory levels
Forecast number of widget purchases

Techniques included
Decision Trees
Bayesian Networks
Neural Networks
Decision List
Statistical Models
Time Series
Self Learning Response Models
Support Vector Models
Nearest Neighbor Models

Segmentation

Help to group records into clusters or identify unusual cases:


Identify new patterns of fraud
Identify groups of interest in your customers
Identify data segments that are unusual

Techniques included
Kohonen
K-Means
TwoStep
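As a rough illustration of how a clustering technique groups records, the sketch below runs textbook k-means (Lloyd's algorithm) on a handful of made-up customer records. It is not the implementation behind Modeler's K-Means node; the data, the field meanings, and the choice to start from the first k points are assumptions made for the example.

```python
import math


def k_means(points, k, iterations=20):
    """Plain k-means on a list of (x, y) tuples; returns (centroids, labels)."""
    centroids = list(points[:k])  # deterministic start: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Hypothetical customers described by (monthly spend, support calls).
customers = [(20, 1), (22, 2), (21, 1), (80, 9), (82, 8), (79, 10)]
centroids, labels = k_means(customers, k=2)
print(labels)
print(centroids)
```

Records that end up with the same label form one segment; a record far from every centroid would be a candidate "unusual case".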

Association

Help determine relationships and rules to determine an outcome given a


set of conditions:
Find associations quickly in larger data sets
Customers who bought product X also bought Y and Z (market basket)
Products related to target product promotion
Techniques included
Apriori
CARMA
Sequence Model
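The market basket idea ("customers who bought X also bought Y and Z") rests on two measures, support and confidence. The sketch below computes both by brute force over a few invented baskets. It shows the measures only, assuming a 0.6 support cut-off, and does not reproduce the candidate pruning that Apriori or CARMA perform.

```python
from itertools import combinations

# Hypothetical market baskets.
baskets = [
    {"X", "Y", "Z"},
    {"X", "Y"},
    {"X", "Z"},
    {"Y", "Z"},
    {"X", "Y", "Z"},
]


def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


items = sorted(set().union(*baskets))
# Keep item pairs that appear together in at least 60% of baskets.
frequent_pairs = [
    pair for pair in combinations(items, 2)
    if support(pair, baskets) >= 0.6
]
print(frequent_pairs)
print(confidence({"X"}, {"Y"}, baskets))
```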

Automated Modeling

Choose from three automated modeling nodes, depending on the needs of


your analysis
Build a number of different modeling methods in a single modeling run, then rank
them to compare the models' performance.
Techniques included
Auto Classifier
Auto Numeric
Auto Cluster
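The "build several models in one run, then rank them" pattern can be sketched as follows. The candidate "models" here are hand-written rules standing in for fitted algorithms, and the holdout records are invented. The automated modeling nodes train real algorithms, so this shows only the ranking step.

```python
# Hypothetical holdout data: (tenure_months, support_calls, churned)
holdout = [
    (2, 5, True), (3, 4, True), (30, 0, False), (24, 1, False),
    (5, 1, True), (40, 3, False), (1, 0, True), (36, 6, False),
]

# Stand-ins for fitted models; each maps a record to a churn prediction.
candidates = {
    "always_stay": lambda tenure, calls: False,
    "short_tenure": lambda tenure, calls: tenure < 12,
    "many_calls": lambda tenure, calls: calls >= 3,
}


def accuracy(model, rows):
    """Share of rows where the model's prediction matches the known outcome."""
    return sum(model(t, c) == y for t, c, y in rows) / len(rows)


# Score every candidate on the same holdout set, best first.
ranking = sorted(
    ((name, accuracy(model, holdout)) for name, model in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```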

Monte Carlo Simulation

Generate simulated data


Fit distributions from existing data
Evaluate the simulation
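The three steps above (fit, generate, evaluate) can be sketched with the Python standard library. The demand figures, the stock level, and the choice of a normal distribution are all assumptions made for the example; Modeler's simulation nodes can fit other distributions.

```python
import random
import statistics

# Hypothetical historical daily demand.
historical_demand = [96, 104, 99, 110, 91, 102, 98, 105, 100, 95]

# 1. Fit a distribution to existing data (a normal distribution is assumed here).
mu = statistics.mean(historical_demand)
sigma = statistics.stdev(historical_demand)

# 2. Generate simulated data from the fitted distribution.
rng = random.Random(42)  # fixed seed so the run is repeatable
simulated = [rng.gauss(mu, sigma) for _ in range(10_000)]

# 3. Evaluate the simulation: how often would demand exceed the stock level?
stock_level = 110  # hypothetical
shortfall_rate = sum(d > stock_level for d in simulated) / len(simulated)
print(f"fitted mean={mu:.1f}, sd={sigma:.2f}, P(demand > {stock_level}) ~ {shortfall_rate:.3f}")
```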

Geospatial analytics

Support for geospatial data sources


ESRI Shape files and data services

Geospatial data preparation capabilities


Map visualization

Geospatial data preparation


Map visualization layers

Algorithms built to scale; expand analytical possibilities


Big data/scalability

Apriori/Association
TCM (Temporal Causal Modeling)
STP (Spatio Temporal Prediction)
TwoStep Cluster

Entity Analytics, Text Analytics & Social Network Analysis

PREMIUM FEATURES EXTEND


CAPABILITIES

Entity Analytics automatically detects when multiple entities are the


same despite having been described differently.
Is this one person or two?

Bill Smith
123 Main Street
(800) 555-1212
SSN: 444-33-2222
DOB: 8/7/84
Applicant: Today

William R Smith
123 S Main Avenue
(100) 111-1234
DL: 90909091
DOB: 7/8/84
Arrested: Feb 2013

Entity Analytics Available in Modeler Premium


What is Entity Analytics?
An entity could be an individual, vehicle, vessel, etc.
Entity Analytics enables an organization to resolve like entities, even when they
do not share key values, e.g. ID number (also called Identity Resolution or Entity
Resolution)
The data can come from multiple sources or just one source
The matching technique enables even the weakest connections to be
discovered
The result is more accurate analytics, based on correctly resolved entities.
How Does Entity Analytics work?
Underlying technical breakthrough known as context accumulation.
Can get more accurate and faster as data sets grow
Out of the box it is ready for processing people, organizations, and vehicles
Users can easily add new entities and new features, without having to train the
system or add elaborate rules.
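To make the idea of resolving like entities concrete, here is a deliberately simple sketch. It folds observations into entities one at a time and merges any that share a phone number after formatting differences are removed. Matching on one feature is an assumption made for brevity. This is not IBM's context accumulation technique, which weighs many features, and the sample names and numbers are borrowed from the deck's credit application example.

```python
import re


def normalize_phone(phone: str) -> str:
    """Keep digits only so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", phone)


def resolve(observations):
    """Incrementally fold observations into entities that share a phone number."""
    entities = []  # each entity: {"names": set, "phones": set}
    for obs in observations:
        phones = {normalize_phone(p) for p in obs["phones"]}
        merged = {"names": {obs["name"]}, "phones": set(phones)}
        remaining = []
        for entity in entities:
            if entity["phones"] & phones:
                # The new observation links to this entity: absorb its context.
                merged["names"] |= entity["names"]
                merged["phones"] |= entity["phones"]
            else:
                remaining.append(entity)
        entities = remaining + [merged]
    return entities


observations = [
    {"name": "BL Johns", "phones": ["2127331234"]},
    {"name": "Elizabeth Lisa Johns", "phones": ["914-698-2234"]},
    {"name": "Liz Johns", "phones": ["212-7331234", "914-6982234"]},
]
entities = resolve(observations)
print(entities)
```

The first two records share nothing and start as separate entities. The third record carries both phone numbers, so its arrival joins all three into one entity, which is the sense in which each new observation can change what is known about earlier ones.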

Entity Analytics uses Context Accumulation to Find Deeper


Insights
Context: Better understanding something by taking into account the things around it

Product is Awful #TheCo


1 minutes ago

Product is Awful #TheCo


1 minutes ago

Call Center
Complaint

Not Actionable

Influential
@Twitter

Terminated
Employee

Substantially more Actionable

Context Accumulation: The incremental process of integrating new observations with previous
observations

IBM SPSS Entity Analytics Delivers General Purpose Context


Accumulation
Data Finds Data

Relevance Finds User

Entity Analytics

Information In Context
Observation
Space

Consumption

Entity Analytics Analyze the Data in the Repository

Entity Analytics Application for Credit

Field | Entity 102 | Entity 343 | Entity 642 | Resolved Entity
Name | Beth L. Johns-Parker; BL Johns | Liz Johns | Elizabeth Lisa Johns | Elizabeth Lisa Johns; Liz Johns; Beth L Johns-Parker; BL Johns
Addr1 | 123 Main Street; 777 Park Road | 33 Red Dr | 33 Reed Dr | 123 Main Street; 777 Park Road; 33 Red Dr; 33 Reed Dr
City | New York | Mamaroneck | White Plains | New York, White Plains, Mamaroneck
State | NY | NY | NY | NY
Postal | - | 10354 | 10354 | 11732, 10354
Phone | 2127331234 | 212-7331234; 914-6982234 | 914-698-2234 | 212-733-1234; 914-698-2234
DOB | 6/21/1954 | - | 6/21/1954 | 6/21/1954
Defaults | - | - | - | Yes
Income | $8,000 | $9,000 | $31,000 | $48,000
Credit Debt | $5,359 | $6,000 | $1,362 | $12,722
Other Debt | $2,009 | $3,000 | $4,001 | $9,009
Debt to Income | 92.1 | 100 | 17.3 | 113.5
Prev Default? | True | True | False | True
Pending Loan | False | False | True | True
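The Debt to Income values on the three source records (Entities 102, 343 and 642) are consistent with total debt divided by income, expressed as a percentage. That formula is inferred from the numbers rather than stated in the deck, so treat the sketch below as an assumption about how the field is derived.

```python
def debt_to_income(credit_debt: float, other_debt: float, income: float) -> float:
    """Total debt as a percentage of income, rounded to one decimal place."""
    return round((credit_debt + other_debt) / income * 100, 1)


# Source records as shown in the example: (income, credit debt, other debt).
records = {
    "Entity 102": (8_000, 5_359, 2_009),
    "Entity 343": (9_000, 6_000, 3_000),
    "Entity 642": (31_000, 1_362, 4_001),
}
ratios = {
    name: debt_to_income(credit, other, income)
    for name, (income, credit, other) in records.items()
}
print(ratios)
```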

Entity Analytics Enhancements Support for Relationships


Relationships
Support n degree relationships (Unleashed only)
Find related entities in addition to exact matches

Persist Searches
Optionally add new
streaming records
to the repository

Anonymization
Anonymize features in the repository
to address privacy concerns

Entity Analytics Map Data Fields to the Repository Features

Text Analytics within IBM SPSS Modeler


Text Analytics Extracts Concepts and Patterns from Text

Text Analytics Identifies the Context/Sentiment of the Text

Comparison of Text Analytics with SPSS Text Analytics for Surveys

Feature | Text Analytics | STAfS
Integrate analysis with a predictive model | Yes | No (needs to be exported .sav or .xls)
Text format | Short/long responses, documents and folders of documents | Shorter, survey type responses
Amount of text | Small to large (can use Modeler Server) | Smaller sets generally up to 10,000 records
File types that can be used as data sources | Modeler data sources in addition to .doc, .ppt, .txt, .pdf | .sav, .mdd, .xls and ODBC DBs
Sentiment analysis | Yes | Yes
Text Link Analysis | Yes | No
Work with RSS feeds | Yes | No
Server version available | Yes | No
Can be deployed within SPSS Collaboration & Deployment Services | Yes | No

Social Network Analysis Applications

Churn Prediction

Group characteristics can influence churn rates


Focus on individuals in groups with an increased risk of churn
Identify individuals that are at risk of churning due to the flow of info from those that
already churned.

Leveraging Group Leaders

Group leaders are highly influential over other group members


Prevent a group leader from churning to decrease the churn rate for other group
members
Acquire a group leader from a competitor to increase the churn rate of that group.

Marketing

Use Group leaders to initiate new goods or service offerings.

Social Network Analysis

Processes CDR (Call Detail Record) data from Telecommunications companies


Identifies groups, leaders and probabilities that others will churn based on
influence
Enhances existing churn predictions of Modeler
Expressed as two new nodes in the Sources Palette
Group Analysis: identify the groups in the data and who are the leaders of
them
Diffusion Analysis: use existing churn information to determine who else that
churner is likely to influence to leave

IBM SPSS MODELER & OPEN


SOURCE

Embrace and extend to open source: IBM and R (and Python)

Native first-class citizen


Robust data preparation,
build/scoring and output
Scale score in-database and
in-Hadoop with no data
movement

Custom Dialog Builder

Predictive Extensions Demonstration

Catalogue of extensions made
available through community.
Use what's there, add your
own

Take advantage of tutorials,


usage examples, expertise,
events, other downloads and
more

Extend the Use of R

Build and score R models through Modeler GUI


Scale R execution by leveraging database
vendor provided R engines

Use R processes and generate output

Extend the Use of R

Custom Dialog Builder for R


Provides the ability to create new Modeler Algorithm nodes and dialogs that run
R processes
Makes R usable for non-programmers

Predictive Extensions Put R to Work

Custom nodes provide functionality


extensions
Growing catalogue of extensions that
address some of the most frequently
requested features
Available to all AnalyticsZone members

Managing and Deploying Analytics

IBM SPSS COLLABORATION &


DEPLOYMENT SERVICES

Foundation for managing and deploying analytics.

C&DS: Collaborate to manage/deliver more effective analytical results


Share
Business environments are more complex
More people need to be involved
Need to centralize and share analytical assets
Protect
Protection from loss of intellectual property
Automatically tracking and auditing changes
Corporate governance and regulatory
requirements
Publish
Easy for analysts to publish results
Easy for business users to access results
formerly reserved for analysts

C&DS: Automate to bring control and discipline to analytic processes


Construct
Complex jobs, different scenarios, multiple tools
Analysts know what needs to be done
Structured, documented processes
Operationalize
Initiate at the right time
Monitor results
Integration with other processes

Govern
Disciplined approach instills confidence
Managed changes
Audit ready

C&DS: Deploy analytical results within key business processes for


better decisions
Deliver scores
At time of decision
Optimized for real time
Multiple sources of input add context: historical and real time
Integrate
Non-intrusive integration with existing systems
Open architecture based on industry
standards
Programmatic interfaces into operational
systems
Enterprise ready
Necessary for integration with
mission critical processes
Ensures scalability, reliability,
security

Technical Components of C&DS


IBM SPSS Collaboration and Deployment Services
includes the following components:
IBM SPSS Collaboration and Deployment Services
Repository. A centralized location for storing analytical
assets, such as models and data.
IBM SPSS Collaboration and Deployment Services
Deployment Manager. A client application that allows
users to manage the repository and to create, execute,
and automate analytical tasks, such as scoring and
updating the models stored in the repository.
IBM SPSS Collaboration and Deployment Services
Deployment Portal. A browser-based thin-client interface
for navigating the IBM SPSS Collaboration and
Deployment Services Repository, running analyses, and
viewing output.
Browser-based Deployment Manager. A thin-client
interface for performing setup and system
management tasks by administrators.
BIRT Report Designer for IBM SPSS. An application
for creating ad-hoc reports against relational and file-based data sources.
IBM SPSS Collaboration and Deployment Services
Enterprise View Driver. A component designed to
provide IBM SPSS Statistics and third-party
applications access to IBM SPSS Collaboration and
Deployment Services Enterprise View objects stored in
the repository.

IBM SPSS Modeler Server & Related Enterprise Technologies

ANALYTICS AT SCALE

What Makes up Big Data?


Volume

Velocity

Data at Rest

Data in Motion

Terabytes to
Exabytes of existing
data to process

Streaming data,
milliseconds to
seconds to respond

Variety

Data in
Many Forms
Structured,
unstructured, text,
multimedia

Veracity

Data in Doubt
Uncertainty due to
data inconsistency
& incompleteness,
ambiguities, latency,
deception, model
approximations

Simplifying actionable insight: scalable analytics for high-volume data
Who? Modeler User
What? Data Mining Process
SPSS Algorithms + Text Analytics
Database Algorithms
R Algorithms
Where? Anywhere
How? Visual and No Programming

SPSS Modeler

SPSS Modeler
Server SPSS
Analytic Server

Reduced Network Traffic


Commands and
Results

Database/Hadoop
Data Preparation
and Model
Building/
Scoring
Pushback

Server
Resources
Used for
Analysis

Simplified, scalable deployment: develop once, deploy anywhere
via IBM SPSS Collaboration and Deployment Services
Real time deployment of results
Deliver recommendations to place and time
of decision making
Foundation for decision management
applications

Data At
Rest

Efficient model operationalization


Rapid deployment by business analysts
No coding or reprogramming of models to
integrate with operational systems

Flexible deployment options


Batch scoring
Real-time request/response scoring
High-volume, low-latency streaming data
scoring
Scoring and optimization via decision
management and packaged solutions

Data In
Motion

Predictive analytics for big data: data volume requires scalable analytics
SPSS Modeler Desktops

SQL pushback
In-database mining
IBM InfoSphere
Warehouse
IBM PureData System
for Analytics (Netezza)
Oracle
SQL Server

In-database adapters
IBM PureData System
for Analytics (Netezza)
IBM DB2 for z/OS
Teradata
Parallel processing
IBM InfoSphere BigInsights
Other Hadoop distributions

SPSS Modeler Server Reduced Network Traffic


SPSS Analytic Server Consisting Only of
Commands and Results

Data
Database Resources
Used for SQL
Pushback, In-DB
Processing and
Map/Reduce
Processing

Server Resources
Used for Analysis

Predictive analytics for big data: IBM SPSS predictive analytics and
IBM PureData System for Analytics
Visual, easy-to-use interface
Faster time to solution and understanding
Accessible analytics for a business user,
sophistication and power for the data scientist

Scalable and optimized for Netezza


Limited/no data movement: analysis executed within the DB
(SQL pushback, UDFs, in-database mining, in-database scoring)
No programming: SQL is automatically generated
Analytics run 10x-100x faster

Analytics flexibility and deployment


Executed on a purpose-built appliance
SPSS algorithms and Netezza Analytics available
Fits with SPSS portfolio for deployment
o IBM SPSS Modeler Gold
o IBM SPSS Predictive Analytics Enterprise

Fast time to value

< 4 Seconds

100M Customers
1 Model
10 Predictors

< 10 seconds

100M customers
20 models
20 predictors

Predictive analytics for big data: IBM SPSS predictive analytics and
IBM PureData System for Analytics

Many use cases require instant answers


Identifying security threats as they occur
Spotting fraudulent transactions as
they are submitted
Predicting mechanical failures while a
machine is running

Key to solving these challenges is high-volume, low-latency scoring


Should not require specialized
programming
Must allow for updates without interrupting
the scoring process
Must integrate seamlessly into operational
systems

Predictive analytics for big data: data velocity requires streaming analytics
Real-time Decisions
Environment
Monitoring

ICU
Monitoring

IBM SPSS Modeler for model development


Fast time to solution
Accessible to non-technical users

Cyber
Security

IBM InfoSphere Streams for high-bandwidth,


low-latency operations
Algorithms scored in the context of stream
processing

IBM SPSS Collaboration and Deployment


Services for lifecycle management
Automated model evaluation, model refresh
and model re-deployment
Scoring models are updated without
suspending data processing

Powerful
Analytics

Algo
Trading

Millions of
Events per
Second

Government/
Law Enforcement

Telco Churn
Prediction
Smart
Grid

Microsecond
Latency

Traditional/Non-traditional
Data Sources

Predictive analytics for big data: new data sources require distributed analytics

IBM SPSS Analytic Server:

Enables Organizations to:

Delivers fast time to solution for


predictive analytics of big data
Visual, easy-to-use interface
abstracts analysts and line-of-business users from complexities of
big data systems

Empower analysts to create and


deploy predictive analytics over big
data without coding
Better solve problems such as:

Targeting and recommendations


Customer service
Fraud, waste and abuse
Predictive maintenance

Pure SQL vs. Scoring Adapter (UDFs) for Model Scoring

Pure SQL | Scoring Adapter (UDFs)
Difficult to support some model scoring algorithms | Easily supports a large class of scoring algorithms
Requires a SQL mapping to be constructed for each model type | Reuses existing scoring component to score each model type
Resulting SQL will run on many database systems | Needs to be adapted for each database system requiring support
No database extensions required | Requires database extensions to be installed
Performance/reliability harder to predict | Performance/reliability easier to predict
Harder to generate SQL to score ensemble models | Easier to score ensemble models
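To show why the pure SQL route "requires a SQL mapping to be constructed for each model type", the sketch below hand-builds the scoring expression for one model type, a logistic regression. The table name, column names, and coefficients are hypothetical, and this is not the SQL that Modeler generates. A decision tree or an ensemble would need a different mapping.

```python
def logistic_scoring_sql(table: str, intercept: float, coefficients: dict) -> str:
    """Build a SELECT that scores a logistic regression model in plain SQL."""
    # One "weight * column" term per predictor.
    terms = [f"{weight!r} * {column}" for column, weight in coefficients.items()]
    linear = " + ".join([repr(intercept)] + terms)
    # Logistic function applied to the linear predictor.
    return (
        f"SELECT customer_id, 1.0 / (1.0 + EXP(-({linear}))) AS churn_score "
        f"FROM {table}"
    )


# Hypothetical model: table, columns and weights are made up for illustration.
sql = logistic_scoring_sql(
    table="customers",
    intercept=-1.5,
    coefficients={"support_calls": 0.8, "tenure_months": -0.05},
)
print(sql)
```

Because the result is ordinary SQL, it needs no database extensions, which matches the trade-off listed in the comparison.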


INTEGRATION WITH OTHER


IBM PRODUCTS

IBM SPSS Advanced Analytics


Complementary Capabilities, New Possibilities

Predictive
Maintenance and
Quality (PMQ)

Predictive Customer
Intelligence (PCI)

Counter Fraud
Management
(CFM)

Custom Applications

Differentiated Analytic Solutions

Data
Collection

Statistics

Modeler and
ADM

Analytic
Server

Collaboration and Deployment Services

Decision
Optimization

Watson
Analytics

SPSS Modeler Synergies with IBM

Cognos BI & Cognos TM1: Directly import and export data.
Netezza: Enable more sophisticated analytics on a purpose-built analytics
appliance by leveraging Modeler's UI and in-database algorithm support.
InfoSphere: Embed a predictive model within a stream to utilize data in
motion.
Maximo: Use predictive capabilities for condition-based monitoring in
conjunction with asset management.
System Z: Models can be deployed at the database level, allowing scoring to
take place when a transaction is being processed (in-transaction scoring).
WebSphere: iLog jRules can be combined with predictive models and form
the basis for Analytical Decision Management.
Decision Optimization: Predictive outcomes can be leveraged in enterprise-grade
optimization solutions.
IBM SPSS Modeler is the predictive engine that fuels analytical applications
such as IBM Predictive Maintenance and Quality, IBM Predictive Customer
Intelligence and IBM Counter Fraud Management.

Adding Value to Cognos BI with Predictive Analytics

1) Leveraging BI, identify problem or situation needing attention
2) Use SPSS predictive analytics & feed results back into the BI layer
3) Results widely distributed via BI for consumption by Business Users

Common Business Model

Cognos BI

SPSS Modeler and Cognos BI


Import Data

Cognos Package

Consume Analytics

Report / Dashboard

Data Preparation

Create Predictive Insight

SPSS Modeler

SPSS Modeler

Create Report /
Dashboard

Export Data

Author Report

Cognos Package

TM1 Integration

TM1 Source and Export nodes


Score and/or manipulate data and export directly to TM1

IBM SPSS Modeler with IBM SPSS Analytic Server

Visual, easy to use interface shields analysts & line


of business users from complexities of big data
systems
IBM SPSS Modeler

Data mining and text analytics workbench to build predictive


models without programming or coding

How Modeler Leverages Big Data via IBM SPSS Analytic


Server

SQL / UDF

IBM SPSS Modeler


Stream File
Big Data
Request

Modeler Client

Relational Database

IBM SPSS
Analytic Server

Modeler Server
Hadoop Job

IBM SPSS Analytic Catalyst

Analytic Catalyst
Tablet Client

Analytic Catalyst
Browser Client

Analytics

IBM InfoSphere BigInsights


& Other Hadoop Distributions

SPSS Modeler and Netezza

Modeler supports integration with IBM Netezza, providing the ability to run data
mining algorithms directly in the IBM Netezza environment from the Modeler
user interface.
The following algorithms from Netezza Analytics are supported within Modeler:

Bayes Net
Decision Trees
Divisive Clustering
Generalized Linear
K-Means
KNN
Linear Regression
Naive Bayes
PCA
Regression Tree
Time Series
TwoStep Cluster

Real-Time Analytics on Streaming Data


Real Time Decisions
Streaming Enhancement - Support for Forecasting (Time Series)

Powerful Analytics
Millions of Events per Second
Microsecond Latency
Traditional / Non-traditional Data Sources

Environment Monitoring
ICU Monitoring
Algo Trading
Cyber Security
Government / Law Enforcement
Telco Churn Prediction
Smart Grid

IBM Watson Analytics & IBM SPSS: Powerful combination
Predictive Exploration for the Business

Built for the novice analyst/business user


Individual discovery in the cloud
Freemium model
Personal data

Apply Statistical Thinking

Built for an analytical professional


Sophisticated statistical analysis backed by reporting capabilities
Supports the entire analytical cycle
Research data

Embed Predictive Analytics into Enterprise Functions

Built for the professional analyst and data scientist


Sophisticated predictive capabilities for all data
Optimizes business processes in real-time
Enterprise data

A smooth path from analytical discovery (IBM Watson Analytics) to enterprise-scale, embedded analytics

Extend
Basic predictive capabilities for the business. Go beyond spreadsheets and quickly explore data in the context a business can understand.

Collaborate
Meaningful analytics that a novice begins and an expert builds upon. Bring business-led insights to decision makers and advanced analysts.

Transition
From discovery to rich storytelling capabilities, embed predictive findings into decision management models for optimal business efficiency.

Introduction to Decision Optimization

Mathematical models that find a balance between conflicting goals (maximizing revenues, creating efficiency) and resource constraints

Decision Optimization
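
As a rough illustration of the definition above, the sketch below searches for a production plan that maximizes profit while staying within resource limits. All product names, profits, and capacities are made-up assumptions, and the exhaustive search is only a stand-in for a real solver such as CPLEX, which is not used here.

```python
from itertools import product

# Hypothetical data: profit per unit and resource use per unit for two products.
PROFIT = {"A": 30, "B": 50}
USAGE = {
    "machine_hours": {"A": 1, "B": 2},
    "labor_hours": {"A": 3, "B": 2},
}
CAPACITY = {"machine_hours": 40, "labor_hours": 60}


def is_feasible(plan):
    """Return True if the plan respects every resource constraint."""
    return all(
        sum(USAGE[resource][item] * qty for item, qty in plan.items()) <= cap
        for resource, cap in CAPACITY.items()
    )


def best_plan(max_units=40):
    """Enumerate all integer plans and keep the most profitable feasible one."""
    best, best_profit = None, float("-inf")
    for qty_a, qty_b in product(range(max_units + 1), repeat=2):
        plan = {"A": qty_a, "B": qty_b}
        if not is_feasible(plan):
            continue
        profit = sum(PROFIT[item] * qty for item, qty in plan.items())
        if profit > best_profit:
            best, best_profit = plan, profit
    return best, best_profit


if __name__ == "__main__":
    plan, profit = best_plan()
    print(plan, profit)
```

With these illustrative numbers the search settles on 10 units of A and 15 units of B for a profit of 1050. Enumeration only works for toy problems; realistic problems with many variables need a dedicated optimization engine.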

Decision Optimization Applications in SPSS Modeler Gold


Why not just stick with Optimization capabilities in SPSS Modeler?
3 Optimization applications (CPLEX)
Campaign Optimization
Demand Optimization
Predictive Maintenance
2 Prioritization applications (non-CPLEX)
Customer Interaction
Claims Management

Why Decision Optimization beyond Modeler Gold?


Widens the range of use cases that can be addressed
Adds the sophistication that an increasingly complex business landscape needs
Optimization problems get larger and more complex
Big problems need sophisticated optimization capabilities
Open source is not a serious alternative (e.g., 1 s vs. 20 min for solving the same problem)

What is Decision Optimization?

Optimization helps businesses:
create the best possible plans
explore alternatives and understand trade-offs
respond to changes in business operations

How to best allocate aircraft and crews?
What to build, where and when?
Inventory cost vs. customer satisfaction
Risk vs. potential reward
Cost vs. carbon emissions

From Predictive to Prescriptive: The Next Step

Predictive: What COULD happen?
Prescriptive: What SHOULD happen?

Consumer Goods
Predictive: Based on the history of product sales, what will be my sales forecast for the next month?
Prescriptive: With the forecasted data, where should I place my inventory? How much safety stock do I need?

Retail
Predictive: Based on the history of customer purchasing data, will you be able to predict what items will interest them?
Prescriptive: What items should I place in which stores? What campaigns would be best to promote those items? How do I minimize discounts?

Manufacturing
Predictive: Based on the history of machine data, will you be able to predict when your machine might break down?
Prescriptive: Using the predicted machine breakdown data to conduct preventive maintenance, do you know the best times to schedule these? (Demand, inventory, etc.)
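
The Consumer Goods example can be sketched in a few lines: a predictive step that forecasts next month's sales, followed by a prescriptive step that turns the forecast into a stocking decision. This is a minimal sketch under stated assumptions: the forecast is a naive historical mean, the safety stock uses the textbook formula z * sigma * sqrt(lead time), and all function names and sales figures are hypothetical rather than taken from SPSS Modeler.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def forecast_next_month(history):
    """Predictive step: naive forecast as the mean of past monthly sales."""
    return mean(history)


def safety_stock(history, service_level=0.95, lead_time_months=1.0):
    """Prescriptive step: textbook safety stock, z * sigma * sqrt(lead time)."""
    z = NormalDist().inv_cdf(service_level)  # z-score for the target service level
    return z * stdev(history) * sqrt(lead_time_months)


def reorder_point(history, service_level=0.95, lead_time_months=1.0):
    """Expected demand over the lead time plus the safety stock buffer."""
    expected_demand = forecast_next_month(history) * lead_time_months
    return expected_demand + safety_stock(history, service_level, lead_time_months)


if __name__ == "__main__":
    monthly_sales = [100, 120, 110, 130, 90, 110]  # made-up sales history
    print("forecast:", forecast_next_month(monthly_sales))
    print("safety stock:", round(safety_stock(monthly_sales), 2))
    print("reorder point:", round(reorder_point(monthly_sales), 2))
```

The forecast answers "what could happen"; the safety stock and reorder point answer "what should I do about it". A production system would replace the mean with a proper time series model and the formula with an optimization model that accounts for locations and costs.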

Statistical Analysis & Modeling Feed Predictive Analytics

Statistics Approach

Statistical approach involves
forming a theory about a possible relationship
converting it to a hypothesis
testing that hypothesis using statistical methods
It is a manual, user-driven, top-down approach to data analysis

Modeling Approach

Data mining involves
the interrogation of the data
determined by the method and goal, rather than by the user
It is a data-driven, self-organizing, bottom-up approach to data analysis that works on very large data sets

Note that Both Approaches Drive Predictive Analytics


Leo Breiman, "Statistical Modeling: The Two Cultures,"
Statistical Science, 2001, Vol. 16 (3), pp. 199-231.
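
The two approaches contrasted above can be illustrated side by side. In this hedged sketch, the top-down path tests one user-stated hypothesis with a permutation test, while the bottom-up path lets the data pick the field and threshold that best separate a target. The field names (`age`, `spend`, `churned`) and the data are hypothetical, and neither function represents an actual SPSS Statistics or SPSS Modeler algorithm.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Top-down: test a user-stated hypothesis that two group means differ."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def best_split(rows, target):
    """Bottom-up: scan every field and threshold for the cleanest split."""
    best = None  # (misclassified rows, field, threshold)
    for field in rows[0]:
        if field == target:
            continue
        for threshold in sorted({row[field] for row in rows}):
            left = [row[target] for row in rows if row[field] <= threshold]
            right = [row[target] for row in rows if row[field] > threshold]
            if not left or not right:
                continue
            # Each side predicts its majority class; count the minority rows.
            errors = min(sum(left), len(left) - sum(left)) + min(
                sum(right), len(right) - sum(right)
            )
            if best is None or errors < best[0]:
                best = (errors, field, threshold)
    return best


if __name__ == "__main__":
    rows = [
        {"age": 25, "spend": 10, "churned": 1},
        {"age": 40, "spend": 15, "churned": 1},
        {"age": 31, "spend": 20, "churned": 1},
        {"age": 28, "spend": 35, "churned": 0},
        {"age": 45, "spend": 50, "churned": 0},
        {"age": 33, "spend": 60, "churned": 0},
    ]
    churners = [r["spend"] for r in rows if r["churned"] == 1]
    stayers = [r["spend"] for r in rows if r["churned"] == 0]
    print("p-value for 'spend differs by churn':", permutation_test(churners, stayers))
    print("best data-driven split:", best_split(rows, "churned"))
```

In the first function the analyst decides in advance which relationship to test; in the second, the search itself surfaces `spend` as the most informative field. Both results can feed a predictive model.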

Integration of Modeler and Statistics

IBM SPSS Statistics Palette within Modeler

Allows Statistics models, transformations, output and syntax within the Modeler GUI
Statistics dialog boxes for consistency
Uses Statistics in the background to run analysis from the Modeler interface
One-way integration, i.e., Statistics can be used within Modeler, but not vice versa
Requires a Statistics license for the procedures
