Customer Story
Q&A
Our mission at AWS
Debugging machine learning training is painful
Many connections
+
Computationally intensive
=
Extraordinarily difficult to inspect, debug, and profile the ‘black box’
Challenges with Machine Learning Training
Debugging machine learning training is painful
Manually print debug data
+
Manually analyze the debug data
+
Use open source tools for charting
=
Valuable data scientist/ML practitioner time wasted
There’s got to be a better way!
And there is…
Introducing Amazon SageMaker Debugger
Training data analysis, debugging, & alert generation
• Relevant data capture: Data is automatically captured for analysis
• Automatic data analysis: Debug data with no code changes
• Automatic error detection: Errors are automatically detected and alerts are sent
• Faster training: Analyze and debug across distributed clusters
• Amazon SageMaker Studio integration: Analyze & debug from Amazon SageMaker Studio
How does Amazon SageMaker Debugger Work
[Diagram: Amazon SageMaker (Training in progress, Analysis in progress); Customer’s S3 Bucket; Amazon CloudWatch Event; Action: Analyze using Debugger SDK in Amazon SageMaker Notebook; Action: Visualize Tensors using charts; Amazon SageMaker Studio Visualization]
• No code change is necessary to emit debug data with built-in algorithms and custom training scripts
• Analysis occurs in real time as data is emitted, making real-time alerts possible
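The real-time flow described above can be pictured as a rule that is evaluated against each step's data as soon as it is emitted. The following is a minimal sketch in plain Python, not the Debugger implementation: `emit_steps`, `loss_increased`, and `watch` are hypothetical names, and the loss values are made up for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Step = Tuple[int, float]  # (step number, loss value); illustrative shape only


def emit_steps(losses: Iterable[float]) -> Iterator[Step]:
    """Stand-in for a training job emitting debug data one step at a time."""
    for step, loss in enumerate(losses):
        yield step, loss


def loss_increased(previous: float, current: float) -> bool:
    """Hypothetical rule: fire when the loss goes up between two steps."""
    return current > previous


def watch(steps: Iterable[Step], rule: Callable[[float, float], bool]) -> List[str]:
    """Evaluate the rule as each step arrives and collect alert messages."""
    alerts: List[str] = []
    previous: Optional[float] = None
    for step, loss in steps:
        if previous is not None and rule(previous, loss):
            alerts.append(f"step {step}: loss rose from {previous} to {loss}")
        previous = loss
    return alerts


# Made-up loss values; the rise at step 2 triggers one alert.
alerts = watch(emit_steps([0.9, 0.7, 0.8, 0.6]), loss_increased)
print(alerts)
```

Because the rule runs inside the loop that consumes the steps, an alert can be raised before training finishes rather than after the fact.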
© 2019, Amazon Web Services, Inc. or its affiliates. All rights
reserved.
Amazon SageMaker Studio – Real-time Built-in Rules
Amazon SageMaker Studio – Real-time Alerts
Amazon SageMaker Studio – Compare Loss Curves
Custom Analysis
• Fetch specific tensors as NumPy arrays
Vanishing gradients
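A custom check for vanishing gradients could look roughly like the sketch below. It assumes the gradient values for each step have already been fetched into plain Python lists; the helper name `vanishing_steps`, the threshold, and the sample values are all hypothetical and only illustrate the idea.

```python
from typing import Dict, List


def vanishing_steps(
    gradients_by_step: Dict[int, List[float]],
    threshold: float = 1e-7,  # assumed cutoff, not a Debugger default
) -> List[int]:
    """Return the steps whose mean absolute gradient falls below the threshold."""
    flagged: List[int] = []
    for step in sorted(gradients_by_step):
        values = gradients_by_step[step]
        mean_magnitude = sum(abs(v) for v in values) / len(values)
        if mean_magnitude < threshold:
            flagged.append(step)
    return flagged


# Made-up gradient values that shrink toward zero over three steps.
gradients = {0: [0.2, -0.1], 1: [1e-4, -2e-4], 2: [1e-9, -3e-9]}
flagged = vanishing_steps(gradients)
print(flagged)
```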
The Change Healthcare Intelligent Healthcare Network™
Accelerating Healthcare Transformation
Change Healthcare AI innovations are accelerating healthcare transformation by tackling cost and quality
• More accurate decision
• Carried out by machines or optimal person
• Reducing rework loops
FIRST
Model Development through Production
[Diagram labels: Amazon SageMaker Debugger notebooks; Amazon Redshift; Athena]
Deployment Architecture
[Diagram labels: Amazon SageMaker end points; Customer onboarding; Customer systems; Internal consumers; Lambda; Elastic Load Balancer; API Gateway (two instances); API ‘Score’; (Registration/API Rules); AWS WAF Rules; VPCE; Public; Kinesis Stream]
Early stops
role=sagemaker.get_execution_role(),
...
sagemaker_session=sess,
rules = [
    Rule.sagemaker(
        rule_configs.exploding_tensor(),
        rule_parameters={ "tensor_regex":
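To make the early-stop idea concrete, here is a minimal plain-Python sketch of the kind of check an exploding-tensor rule performs. It assumes the rule's job is to flag non-finite values (NaN or infinity) in a tensor; it does not call the SageMaker SDK, and `has_non_finite`, `first_exploding_step`, and the sample values are hypothetical.

```python
import math
from typing import Iterable, List, Optional


def has_non_finite(values: Iterable[float]) -> bool:
    """True if any value is NaN or infinite (assumed meaning of 'exploding')."""
    return any(math.isnan(v) or math.isinf(v) for v in values)


def first_exploding_step(tensor_by_step: List[List[float]]) -> Optional[int]:
    """Return the first step with a non-finite value, or None if all are finite."""
    for step, values in enumerate(tensor_by_step):
        if has_non_finite(values):
            return step
    return None


# Made-up tensor values; large but finite numbers do not trigger the check.
history = [[0.5, 1.2], [3.0e10, 4.1], [float("inf"), 2.0], [float("nan"), 1.0]]
stop_at = first_exploding_step(history)
print(stop_at)
```

A training loop could stop at the returned step instead of spending compute on a run that has already diverged.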
Tensor
Analysis
> from smdebug.trials import create_trial
> tr = create_trial('/tmp/smdebug')
> print(tr.tensors())
['model/convolution_25/kernel:0',
'model/convolution_23/kernel:0', 'model/convolution_28/bias:0',
'model/convolution_30/bias:0', 'model/convolution_24/kernel:0',
'model/convolution_28/kernel:0',
'model/convolution_30/kernel:0', 'model/convolution_26/kernel:0',
'model/convolution_33/bias:0', ...]
> print(tr.tensor('model/convolution_25/kernel:0').steps())
> print(tr.tensor('model/convolution_25/kernel:0').value(2))
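The tensor names printed above follow a regular pattern, so a regular expression can narrow the list to a subset, for example only the kernel tensors. The sketch below is plain Python and independent of smdebug; the names are copied from the listing above, and `select_tensors` is a hypothetical helper.

```python
import re
from typing import List

# A few names copied from the listing above.
names = [
    "model/convolution_25/kernel:0",
    "model/convolution_23/kernel:0",
    "model/convolution_28/bias:0",
    "model/convolution_30/bias:0",
]


def select_tensors(tensor_names: List[str], pattern: str) -> List[str]:
    """Keep the names that match the regular expression anywhere in the string."""
    regex = re.compile(pattern)
    return [name for name in tensor_names if regex.search(name)]


kernels = select_tensors(names, r"/kernel:\d+$")
print(kernels)
```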
Autopilot