Learning Models in Practice
Nick Pentreath
Principal Engineer
@MLnick
CODAIT
codait.org
Containers
Wrap Up
Perception
Reality
… and tools …
• Serving framework
What is a “model”?
• ML model itself
• Prediction transformation
Challenges
• Versioning & lineage
• Monitoring & feedback
• Interpretability
• Multi-model serving
• Ensemble models
Challenges
• Need to manage and bridge many different:
  • Languages – Python, R, Notebooks, Scala / Java / C
  • Frameworks – too many to count!
  • Dependencies
  • Versions
• Performance characteristics can be highly variable across these dimensions
• Friction between teams
  • Data scientists & researchers – latest & greatest
  • Production – stability, control, minimize changes, performance
  • Business – metrics, business impact, product must always work!
• Proliferation of formats
  • Open source, open standard: PMML, PFA, ONNX
  • Open source, non-standard: MLeap, Spark, TensorFlow, Keras, PyTorch, Caffe, …
  • Proprietary formats: lock-in, not portable
• Lack of standardization leads to custom solutions
• Where standards exist, limitations lead to custom extensions, eliminating the benefits
• Swagger API documentation
• Kubeflow integration
• Clipper
• UC Berkeley RISElab project
PMML
• Data Mining Group (DMG)
• Model interchange format in XML with operators
• Watch SPARK-11237
• Shortcomings
  • Cannot represent arbitrary programs / analytic applications
MLeap
• Created by Combust.ML, a startup focused on ML model serving
• Model interchange format in JSON / Protobuf
• Components implemented in Scala code
• Initially focused on Spark ML; offers almost complete support for Spark ML components
• Recently added some scikit-learn; working on TensorFlow
• Shortcomings
  • “Open” format, but not a “standard”
  • No concept of well-defined operators / functions
  • Effectively forces a tight coupling between versions of model producer / consumer
  • Must implement custom components in Scala
  • Impossible to statically verify serialized model
TensorFlow Serving
• Works well for serving TF models
• Supports GPU since it is executing TF graphs
• Prediction (generic)
• Shortcomings
  • “Open” format, but not a “standard”
Portable Format for Analytics
• DMG previously created PMML (Predictive Model Markup Language), arguably the only viable open standard currently
• PMML has many limitations
• PFA was created specifically to address these shortcomings
• Avro schemas for data types
• Encodes functions (actions) that are applied to inputs to create outputs, with a set of built-in functions and language constructs (e.g. control flow, conditionals)
• Essentially a mini functional math language + schema specification
• Type and function system means PFA can be fully & statically verified on load and run by any compliant execution engine
• => portability across languages, frameworks, runtimes and versions
DBG / May 10, 2018 / © 2018 IBM Corporation
A Simple Example
• Example – multi-class logistic regression
• Specify input and output types using Avro schemas
• Specify the action to perform (typically on input)
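The slide's multi-class logistic regression document is not reproduced here, but the overall shape of a PFA document can be sketched as a Python dict (PFA documents are JSON). The `"+"` built-in and the constant `10.0` are illustrative assumptions, not taken from the talk.

```python
import json

# Minimal PFA document sketch (illustrative, not the logistic-regression
# example from the slide). "input"/"output" hold Avro schemas; "action"
# is a list of expressions applied to each input record, which is bound
# to the symbol "input".
pfa_doc = {
    "input": "double",              # Avro schema of each input record
    "output": "double",             # Avro schema of the result
    "action": [
        {"+": ["input", 10.0]}      # call the built-in "+" function
    ],
}

# A PFA document is plain JSON, so it round-trips losslessly.
print(json.dumps(pfa_doc))
```

A compliant PFA engine loading this document would add 10.0 to each input record; the same JSON runs unchanged on any engine, which is the portability point made above.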
Managing State
• Data storage specified by cells
• A cell is a named value acting as a global variable
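As a concrete (assumed) illustration of a cell: the sketch below declares a global `scale` cell and reads it in the action with the `{"cell": ...}` form. The cell name and its value are invented for the example.

```python
# Sketch of PFA state: "cells" declares named global values with a type
# and an initial value; an action reads a cell with the {"cell": ...}
# form. The "scale" cell and its value 2.5 are assumptions.
stateful_doc = {
    "input": "double",
    "output": "double",
    "cells": {
        "scale": {"type": "double", "init": 2.5}   # named global value
    },
    "action": [
        {"*": ["input", {"cell": "scale"}]}        # read the cell
    ],
}
```

In a real scoring document the cell would typically hold the model parameters (e.g. the coefficient arrays of the logistic regression above) rather than a single scalar.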
Other Features
• Special forms
• Control structure – conditionals & loops
• Casts
• Null checks
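A sketch combining two of the special forms listed above, `ifnotnull` (null check, binding the non-null value to a local symbol) and `if` (conditional). The union input schema and the constants are illustrative assumptions.

```python
# PFA special-forms sketch: input may be null (Avro union), and the
# action returns |x| for non-null inputs and 0.0 otherwise.
forms_doc = {
    "input": ["null", "double"],        # Avro union: value may be missing
    "output": "double",
    "action": [
        {
            "ifnotnull": {"x": "input"},    # bind x if input is non-null
            "then": {
                "if": {">": ["x", 0.0]},    # conditional special form
                "then": "x",
                "else": {"*": ["x", -1.0]}  # negate to get |x|
            },
            "else": 0.0                     # default when input is null
        }
    ],
}
```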
Aardpfark
• PFA export for Spark ML pipelines
• aardpfark-core – Scala DSL for creating PFA documents
• Coverage
  • Almost all predictors (ML models)
  • Pipeline support
Aardpfark - Challenges
• Spark ML Model has no schema knowledge
  • E.g. Binarizer can operate on numeric or vector columns
• Need to use Avro union types for standalone PFA components and handle all cases in the action logic
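The union-type workaround can be sketched as follows (shapes assumed, not taken from Aardpfark's actual output): the input is an Avro union of a scalar and an array of doubles, and the `cast` special form dispatches on the runtime type so each case is handled in the action logic.

```python
# Sketch of handling "numeric or vector" inputs via an Avro union plus
# the PFA "cast" special form. The per-case expressions are illustrative.
union_doc = {
    "input": ["double", {"type": "array", "items": "double"}],
    "output": "double",
    "action": [
        {
            "cast": "input",
            "cases": [
                # scalar case: pass the value through
                {"as": "double", "named": "x", "do": "x"},
                # vector case: reduce with the a.sum library function
                {"as": {"type": "array", "items": "double"},
                 "named": "xs", "do": {"a.sum": ["xs"]}},
            ],
        }
    ],
}
```

Every branch of the union must be covered, which is exactly the "handle all cases" burden described above.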
Summary
• PFA provides an open standard for serialization and deployment of analytic workflows
• True portability across languages, frameworks, runtimes and versions
  • Execution environment is independent of the producer (R, scikit-learn, Spark ML, Weka, etc.)
• Solves a significant pain point for the deployment of ML pipelines and benefits the wider ecosystem
  • e.g. many currently use PMML for exporting models from R, scikit-learn, XGBoost, LightGBM, etc.
• However, there are risks
  • PFA is still young and needs to gain adoption
  • Performance in production, at scale, is relatively untested
  • Tests indicate PFA reference engines need some work on robustness and performance
  • What about deep learning / comparison to ONNX?
  • Limitations of PFA
  • A standard can move slowly in terms of new features, fixes and enhancements
Future
• Open-source release of Aardpfark
  • Initially focused on Spark ML pipelines
  • Later add support for scikit-learn pipelines, XGBoost, LightGBM, etc.
  • (Support for many R models exists already in the Hadrian project)
• Further performance testing in progress vs Spark & MLeap; also test vs PMML
• More automated translation (Scala -> PFA, ASTs, etc.)
• PFA for deep learning?
  • Comparing to ONNX and other emerging standards
  • Better suited for the more general pre-processing steps of DL pipelines
  • Requires all the various DL-specific operators
  • Requires tensor schema and better tensor support built into the PFA spec
  • Should have GPU support
Monitoring
• Needs to combine traditional software system monitoring ...
  • Latency, throughput, resource usage, etc.
• ... with model performance metrics
  • Traditional ML evaluation measures (accuracy, prediction error)
• All are important
  • If your model is offline, uplift (or cost savings) is zero!
  • Model performance degrades over time
  • Model performance vs business needs changes over time
Feedback
• An ML system needs to automatically learn from & adapt to feedback from the world around it
• Implicit feedback loops – predictions influence behavior in longer-term or indirect ways
• Reinforcement learning
• MAX
• codait.org
• twitter.com/MLnick
• FfDL
• github.com/MLnick
• https://datascience.ibm.com/
• PMML
• jpmml-sparkml
• MLeap
• Seldon Core
• Clipper