
Oracle Machine Learning

Overview and Roadmap


Pelin Ozbozkurt, PhD

Data Science Lead


Oracle, EMEA
12 February 2020
Safe Harbor

The following is intended to outline our general product direction. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. The development,
release, timing, and pricing of any features or functionality described for Oracle’s products may change
and remains at the sole discretion of Oracle Corporation.

Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed
discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and
Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q
under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website
at http://www.oracle.com/investor. All information in this presentation is current as of September 2019
and Oracle undertakes no duty to update any statement in light of new information or future events.

2 Copyright © 2020, Oracle and/or its affiliates


Oracle Machine Learning Key Attributes

Automated: Get better results faster with less effort, even for non-expert users
Scalable: Handle big data volumes using parallel, distributed algorithms, with no data movement
Production-ready: Deploy and update data science solutions faster with an integrated ML platform

Increase productivity, Achieve enterprise goals, Innovate More



Oracle Machine Learning


OML4SQL: Oracle Advanced Analytics SQL API
OML4R: Oracle R Enterprise R API
OML4Py*: Python API
OML Notebooks: with Apache Zeppelin on Autonomous Database
Oracle Data Miner: Oracle SQL Developer extension
OML4Spark: Oracle R Advanced Analytics for Hadoop
OML Microservices*: Supporting Oracle Applications (Image, Text, Scoring, Deployment, Model Management)

* Coming soon


Customer Success
Using Oracle Machine Learning to achieve business goals

Increase Cash Loans by 15% Within 18 Months of Deployment

Combat Healthcare Fraud

Dramatically reduce online fraud, significantly improve conversions

Reveal data insights on target customers and focus marketing dollars

https://www.oracle.com/database/technologies/datawarehouse-bigdata/oml-customers.html

Cross-Platform Machine Learning
Multiple user interfaces and APIs
Deployed in cloud and on-premises
From database to entire data management ecosystem

Select User Interface, e.g.: OML Notebooks, SQL Developer, Popular R IDEs, Popular Python IDEs, REST, OCI Data Science, OAC
API Options: OML4SQL, OML4R, OML4Python, OML4Spark
Cloud or On-premises: Oracle Database, Oracle Cloud SQL, Oracle Big Data SQL, Big Data Service (HDFS)
Reach broader Data Sources (Data Lake): Oracle Object Storage, Amazon S3, Azure Blob Storage, NoSQL Databases, Kafka Streams
Oracle Machine Learning Algorithms and Analytics

CLASSIFICATION
Naïve Bayes
Logistic Regression (GLM)
Decision Tree
Random Forest
Neural Network
Support Vector Machine (SVM)
Explicit Semantic Analysis

CLUSTERING
Hierarchical K-Means
Hierarchical O-Cluster
Expectation Maximization (EM)

ANOMALY DETECTION
One-Class SVM

TIME SERIES
Forecasting - Exponential Smoothing
Includes popular models, e.g. Holt-Winters with trends, seasonality, irregularity, missing data

REGRESSION
Linear Model
Generalized Linear Model (GLM)
Support Vector Machine (SVM)
Stepwise Linear Regression
Neural Network
LASSO

ATTRIBUTE IMPORTANCE
Minimum Description Length
Principal Component Analysis (PCA)
Unsupervised Pair-wise KL Divergence
CUR decomposition for row & AI

ASSOCIATION RULES
Apriori / market basket

PREDICTIVE QUERIES
Predict, cluster, detect, features

SQL ANALYTICS
SQL Windows
SQL Patterns
SQL Aggregates

FEATURE EXTRACTION
Principal Component Analysis (PCA)
Non-negative Matrix Factorization
Singular Value Decomposition (SVD)
Explicit Semantic Analysis (ESA)

TEXT MINING SUPPORT
Algorithms support text columns
Tokenization and theme extraction
Explicit Semantic Analysis (ESA) for document similarity

STATISTICAL FUNCTIONS
Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.

R AND PYTHON PACKAGES
Third-party R and Python Packages through Embedded Execution
Spark MLlib algorithm integration
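
As a rough illustration of the exponential smoothing family listed under TIME SERIES, the sketch below implements textbook simple exponential smoothing in plain Python. It is not the in-database implementation. The function names and the parameter `alpha` are chosen here for illustration, and the trend and seasonality terms that Holt-Winters adds are left out.

```python
from typing import List, Sequence


def simple_exp_smoothing(series: Sequence[float], alpha: float) -> List[float]:
    """Return the smoothed level after each observation.

    Textbook recurrence: level_t = alpha * x_t + (1 - alpha) * level_{t-1},
    initialised with the first observation. Illustrative only.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    levels = [float(series[0])]
    for x in series[1:]:
        # Blend the new observation with the previous smoothed level.
        levels.append(alpha * x + (1.0 - alpha) * levels[-1])
    return levels


def forecast_next(series: Sequence[float], alpha: float) -> float:
    """One-step-ahead forecast: the last smoothed level."""
    levels = simple_exp_smoothing(series, alpha)
    if not levels:
        raise ValueError("series must not be empty")
    return levels[-1]
```

A larger `alpha` tracks recent observations more closely; a smaller one smooths more heavily.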



Oracle Machine Learning Notebooks
Autonomous Database as a Data Science Platform
Collaborative UI
Based on Apache Zeppelin
Supports data scientists, data analysts,
application developers, DBAs
Easy sharing of notebooks and templates
Permissions, versioning, and execution scheduling
Included with Autonomous Database
Automatically provisioned, managed, backed up
In-database SQL algorithms and analytics functions
Explore and prepare, build and evaluate models,
score data, deploy solutions
Soon to be augmented with Python and R



Oracle Machine Learning for SQL
Empower SQL users with immediate access to ML in
Oracle Database and Oracle Autonomous Database

In-database parallel, distributed algorithms
ML models as first class database objects
Export / import models across databases
Batch and real-time scoring
Explanatory predictive details
Leverage ML across Oracle stack

SQL Interfaces: OML Notebooks, SQL*Plus, SQL Developer
Databases: Oracle Database with OAA option, Oracle Autonomous Database
Oracle Machine Learning for R and Python
Components of Oracle Database: R today, Python coming soon
Both coming soon to Oracle Autonomous Database

Empower data scientists
Oracle Database as HPC environment
In-database parallel and distributed machine learning algorithms
Manage scripts and objects in Oracle Database
Integrate results into applications and dashboards
Leverage open source 3rd party packages
Key functional areas:
Transparency Layer
ML Algorithms
Embedded Execution
OML4Py automated machine learning

Diagram labels: Client (OML4Py, OML4R); SQL Interfaces (SQL*Plus, SQL Developer); Database Server Machine
AutoML – new with OML4Py
Increase data scientist productivity – reduce overall compute time

Pipeline: Data Table -> Auto Model Selection -> Auto Feature Selection -> AutoTune -> ML Model

Auto Model Selection (much faster than exhaustive search)
– Identify in-database algorithm that achieves highest model quality
– Find best model faster than with exhaustive search

Auto Feature Selection (>50% reduction in features)
– Reduce # of features by identifying most predictive
– Improve performance and accuracy

Auto Tune Hyperparameters (significant score improvement)
– Significantly improve model accuracy
– Avoid manual or exhaustive search techniques

Enables non-expert users to leverage Machine Learning
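
The deck does not describe how Auto Feature Selection ranks features. Purely to illustrate the idea of cutting the feature count before model building, the sketch below ranks columns by absolute Pearson correlation with the target and keeps the top fraction. This is a generic filter method, not the OML4Py algorithm; `select_features`, `keep_fraction`, and the sample data are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns: Dict[str, Sequence[float]],
                    target: Sequence[float],
                    keep_fraction: float = 0.5) -> List[str]:
    """Keep the top `keep_fraction` of features by |correlation| with target."""
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], target)),
                    reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Hypothetical toy data: "income" and "age" track the target, the rest do not.
data = {
    "income": [1.0, 2.0, 3.0, 4.0, 5.0],
    "age": [2.0, 1.0, 4.0, 3.0, 5.0],
    "noise": [3.0, 1.0, 2.0, 1.0, 3.0],
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.0]
selected = select_features(data, target)
```

With these inputs, half of the four columns are dropped. A correlation filter only sees linear, one-feature-at-a-time relationships, which is one reason automated tools typically use richer ranking methods.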


Oracle Machine Learning Roadmap
Expanding Oracle’s investment in machine learning
Roadmap: OML Microservices
Currently available to internal Oracle Applications teams
Model Management Services
Building and deploying OML models
Model Monitoring of accuracy and
prediction/predictor drift
Model repository
Store, version, compare ML models
Cognitive Services
Feature Extraction, Image and Text
User-defined scripts deployment
Python and R user-defined functions invoked via REST API
REST APIs for application integration



Roadmap: Sample of Microservices APIs

Model Management (Model Repository REST)
GET /models
GET /{model name}
GET /{model name}/{version}
POST /{model name}
POST /{model name}/{version}
DELETE /{model name}/{version}

Model Deployment (Model Deploy REST)
GET /models
GET /{uri}
GET /{uri}/api
POST /{uri}
POST /{uri}/score
DELETE /{uri}

Cognitive Image (Cognitive Image REST)
POST /imageClassification
POST /nsfw
POST /objectDetection
POST /faceDetection
POST /imageSimilarity
POST /faceSimilarity

Cognitive Text (Cognitive Text REST)
POST /ner
POST /topics
POST /keywords
POST /sentiment
POST /summary
POST /similarity
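
To make the endpoint list concrete, the sketch below builds (without sending) HTTP requests for two of the listed paths using Python's standard library. The base URL `https://oml.example.com/api`, the JSON payload shape, and the helper names are assumptions for illustration. The deck does not specify hosts, authentication, or request bodies, and it presents these services as roadmap items.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URL; the slides do not give a host or path prefix.
BASE_URL = "https://oml.example.com/api"


def model_version_request(model_name: str, version: str) -> Request:
    """Build GET /{model name}/{version} against the model repository."""
    path = f"/{quote(model_name, safe='')}/{quote(version, safe='')}"
    return Request(BASE_URL + path, method="GET")


def score_request(uri: str, record: dict) -> Request:
    """Build POST /{uri}/score; the JSON body layout is an assumption."""
    body = json.dumps(record).encode("utf-8")
    return Request(
        f"{BASE_URL}/{quote(uri, safe='')}/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example requests with made-up model names and fields.
get_req = model_version_request("churn model", "1.0")
post_req = score_request("churn", {"tenure": 12, "plan": "basic"})
```

Given a real endpoint and credentials, either request could be sent with `urllib.request.urlopen`.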



Roadmap: Algorithms for Database 20c
Frequently requested algorithms

eXtreme Gradient Boosting Trees (XGBoost)


Classification, regression, ranking, survival analysis
Highly popular and powerful algorithm
MSET-SPRT
Anomaly detection for sensors, IoT data sources
“Multivariate State Estimation Technique”
A non-linear, non-parametric anomaly detection ML technique
Based on Oracle Labs algorithm
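
MSET-SPRT pairs a state-estimation model with a sequential probability ratio test (SPRT) applied to the residuals between estimated and observed signals. The deck gives no algorithmic detail, so the sketch below shows only Wald's textbook SPRT for a mean shift in Gaussian residuals, not the Oracle Labs algorithm. The shift size `shift`, the noise level `sigma`, the error rates, and any residual values passed in are hypothetical inputs.

```python
import math
from typing import Iterable, Optional, Tuple


def sprt_mean_shift(residuals: Iterable[float], shift: float, sigma: float,
                    alpha: float = 0.01, beta: float = 0.01
                    ) -> Tuple[str, Optional[int]]:
    """Wald's SPRT: H0 residual mean = 0 versus H1 residual mean = shift.

    Returns ("anomaly" | "normal" | "undecided", index of the deciding sample).
    alpha is the false-alarm rate, beta the missed-alarm rate.
    """
    upper = math.log((1.0 - beta) / alpha)   # at or above: accept H1
    lower = math.log(beta / (1.0 - alpha))   # at or below: accept H0
    llr = 0.0
    for i, r in enumerate(residuals):
        # Log-likelihood ratio increment for two Gaussians with equal variance.
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            return "anomaly", i
        if llr <= lower:
            return "normal", i
    return "undecided", None
```

The test stops at the first sample where the accumulated evidence crosses either threshold. A continuous monitor would typically reset the statistic after each decision, which this sketch omits.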



Roadmap: Expand Autonomous Database with Python and R
Autonomous Database as a Data Science Platform
OML Notebooks add support for Python and R
Python and R scripts managed in-database
Invoke from OML Notebooks, and REST or SQL APIs
Deploy into SQL and Web applications easily
Scalable Python and R execution
Transparency layer-enabled database functionality
In-database machine learning algorithms
AutoML functionality via OML4Py
OML4Py integrated with OCI Data Science

Diagram labels: Data Scientists, SQL Clients, REST Applications, SQL



Roadmap: OML AutoML User Interface
“Code-free” user interface supporting automated end-to-end machine learning
Automate production and deployment of ML models
Enhance Data Scientist productivity, user-experience
Enable non-expert users to leverage ML
Unify model deployment and monitoring
Support model management
Features
Minimal user input: data, target
Model leaderboard
Model deployment in applications via REST endpoint
Model monitoring: accuracy, prediction/predictor drift
Cognitive features for processing image and text
Sample screen mock-up



Roadmap: OML4R and OML4Py
Expand support for open source languages and ecosystems

Expose additional OML4SQL algorithms to Python and R


Support for recent R and Python releases
Enable Oracle Database standard integrated
installation, patching, upgrade/downgrade



Roadmap: OML4Spark
New cloud-based architecture with powerful Spark analytics

Support advanced machine learning activities on Big Data


Model management and cognitive image and text processing
Model deployment and monitoring on Big Data (including Database models)
Cloud-oriented packaging (containers, REST APIs)
Enable OML4Py and OML4R for uniform experience across platforms
Algorithms
Neural Network gradient descent enhancements avoid over-fitting
New native Support Vector Machine with linear and non-linear kernels
New native k-Means and k-Mode clustering algorithms



Roadmap: Enabling OML on GPUs

Enable GPUs for in-database algorithms


Replace MKL with cuBLAS
Leverage GPUs for user-defined R and Python functions
Include 3rd party packages leveraging GPUs, e.g., TensorFlow, Keras
Support state-of-the-art ML processing, e.g., deep learning
Augment OML Microservices for GPU processing – key for images



Why Oracle for Machine Learning?
Oracle integrates ML across the Oracle Stack and the Enterprise
Eliminates costly data movement and latency
Fast and scalable data exploration, data preparation, and ML algorithms
Over 30 algorithms supporting: regression, classification, time series, clustering,
feature extraction, anomaly detection
R and Python integration supports data scientists
Empowers data scientists and analysts, developers, and DBAs/IT with ML
Ease of ML model and R/Python script deployment
Automation of key ML process steps
Oracle Database and Oracle Autonomous Database
That’s where most enterprise data lives – bring the algorithms to the data!
Thank You
