04/21/2021
ML in our team – building & scaling re-usable tools
Current state of ML usage & potential directions
• “AI” overstates current capability – “Functional ML” is more appropriate
• Performs isolated tasks → Narrow-AI at best
• Performs tasks more efficiently than legacy methods → $$ (capex savings, quality improvement)
• Little to no actual awareness of surroundings
• Next steps
• ML replicates current result → Native-ML design
• Active policy coordination – Narrow-AI agents informing each other
• Build some contextual awareness (efficient Federated or Few-shot learning)
08/27/2021 © 2021 Western Digital Corporation or its affiliates. All rights reserved. | WESTERN DIGITAL CONFIDENTIAL 2
Smarter tools & smarter device?
Business problems rolled up to categories defined by
- Data size we need to handle
- Compute & latency requirements
- Efficient s/w stack to support the application

Common Theme – Low Latency solution allowing quick response

Sentinel; Computer Vision framework (visual data from low-cost cameras)
• Data size huge
• Variables/features in Millions
• Often no structure or labels
• Requires massive compute & storage platforms (Cloud Edge)
• Very high clock-speed GPUs
• Massively parallelized s/w stack

Praetor & Scout; Mfg Tool mgmt (low-level streaming data from tool logs)
• Data size medium – MB
• Variables/features in 1000s
• Nominal storage/compute requirements
• Parallelized code needed for low latency

Sparse-ML; Embedded inference (HDD as computing device: millions of low cost processors)
• Data size small – kB
• Variables/features in 100s
• Very limited storage & compute (Device Edge)
• RAM used – 700-800 kB
• Compute on ARM SOC
HiveMind/Sentinel – unified neural network platform
Diagram labels:
• ML: Scout, Praetor, TCR
• Tako, Gripper, OSA, Candela/IRIS, Auto Label, Polish, Wash, Plating, Packing
Talent Acquisition and Team setup
1. Hiring of a healthy mix of senior and junior software engineers and data scientists
2. Establishment of a Scaled Agile Framework (SAFe) for multi-project and multi-client management
3. Establishment of Agile cadence and best practices
4. Setting up process maturity roadmaps
5. Collating development phases into milestones with deploy-ready outputs
6. Risk identification and management
7. Portfolio management for different projects
8. Project/team reporting to higher

Computer Vision
• OpenCV
• Mask-RCNN
• Tesseract
• TensorFlow
• Keras

Messaging Layer
• Kafka streams
• RabbitMQ/ActiveMQ
• Java
• Scala
• Spark

Data Layer
• Distributed database
• Cassandra
• MongoDB
• Redis
• ElasticSearch

Web service layer
• NodeJS
• FastAPI
• ExpressJS
• OpenAPI
Tools and solutions being developed
Various Deployment Stages
• Sentinel: Computer Vision framework
o MO, SDSM, PRB, BPI, FJ Dev team
o Sarawak (staging)
o Use case for Basalt (collab with R&D team/Mipsology)
• Streaming Data: ML on “time series” data from mfg tools
o Scout @ MO (PN & SZ)
o Praetor @ PRB (Paris-C Clean Room lines)

Development phase
• Generative ML models (VAE; GAN) for large-scale data augmentation & change detection
• Native Neural Network to model and predict HDD failure modes
Sentinel: computer vision framework
Diagram labels: Metadata manager, Rule Engine
Sentinel - for label checks
Auto Vision station
• Current flow: AOI (Automatic Optical Inspection)
• New flow: Auto Verify, 1s latency
Sputter Gripper tolerance control
Gripper tip alignment sample result
• Vertical tolerance = 0 to +0.3 mm
• Horizontal tolerance = ±0.1 mm
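The two tolerance windows above can be turned into a simple pass/fail check. This is a minimal sketch, not the deployed Sentinel logic: the `TipOffset` record, the function names, and the choice of inclusive bounds are all assumptions made for illustration; only the numeric limits come from the slide.

```python
from dataclasses import dataclass

# Tolerance windows taken from the slide, in millimetres.
VERTICAL_RANGE_MM = (0.0, 0.3)      # 0 to +0.3 mm
HORIZONTAL_RANGE_MM = (-0.1, 0.1)   # +/- 0.1 mm


@dataclass
class TipOffset:
    """Hypothetical measurement record: gripper tip offsets in mm."""
    vertical_mm: float
    horizontal_mm: float


def within(value, bounds):
    """True if value lies inside the (low, high) window, bounds included (assumed)."""
    low, high = bounds
    return low <= value <= high


def check_alignment(offset):
    """Return per-axis flags and an overall pass flag for one measurement."""
    v_ok = within(offset.vertical_mm, VERTICAL_RANGE_MM)
    h_ok = within(offset.horizontal_mm, HORIZONTAL_RANGE_MM)
    return {"vertical_ok": v_ok, "horizontal_ok": h_ok, "pass": v_ok and h_ok}
```

For example, `check_alignment(TipOffset(0.15, 0.05))` passes on both axes, while a tip sitting 0.05 mm below nominal fails the vertical check because the vertical window is one-sided.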
Detecting pattern changes in high-frequency tool data
Why? Profiles may shift, causing material change, while static SPC limits are never violated
• Scout (MO sputter): motif detection/segmentation/drift to monitor & reduce variation in sputter process
• Praetor (Drive Assembly in HDD Clean Room): LSTM/CNN auto-encoders to detect low-lying anomalies
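The motivation can be illustrated with a toy example: a profile whose shape shifts in time while every sample stays inside fixed control limits. The sketch below uses made-up numbers, hypothetical limits, and a plain Euclidean distance to a reference profile as a stand-in for the shape comparison; it is not the Scout or Praetor algorithm.

```python
import math


def within_static_limits(profile, lcl, ucl):
    """Static SPC-style check: every sample must lie between the fixed limits."""
    return all(lcl <= x <= ucl for x in profile)


def euclidean_distance(a, b):
    """Point-by-point distance between two equal-length profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative data only (not from the source).
reference = [0, 2, 4, 6, 8, 8, 8, 8, 6, 4, 2, 0]
normal_run = [0, 2.1, 3.9, 6, 8.1, 8, 7.9, 8, 6, 4.1, 2, 0]
shifted_run = [0, 0, 0, 2, 4, 6, 8, 8, 8, 6, 3, 0]   # ramp starts late

LCL, UCL = -1.0, 9.0          # hypothetical static limits
SHAPE_THRESHOLD = 1.0         # hypothetical alarm level on the distance

for name, run in (("normal", normal_run), ("shifted", shifted_run)):
    in_limits = within_static_limits(run, LCL, UCL)
    shape_alarm = euclidean_distance(reference, run) > SHAPE_THRESHOLD
    print(name, "within limits:", in_limits, "shape alarm:", shape_alarm)
```

Both runs pass the static limit check, but only the shifted run raises the shape alarm, which is the gap the pattern-based tools are meant to close.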
Sputter sequence and data
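Chamber   Layer     Layer group
P24       NCVD      Carbon Overcoat Layer
P23       Youtec    Carbon Overcoat Layer
P22       NCT       Cap Etching Layer
P21       Heater
P20       Cap
P19       ECL-5     Mag Layer
P18       Mag-5     Mag Layer
P17       ECL-4     Mag Layer
P16       Mag-4     Mag Layer
P15       ECL-3     Mag Layer
P14       Mag-3     Mag Layer
P13       ECL-2     Mag Layer
P12       Mag-2     Mag Layer
P11       ECL-1     Mag Layer
P10       Mag-1     Mag Layer
P9        GIIL
P8        Heater
P7        ILRu2     InterLayer
P6        ILRu1     InterLayer
P5        Seed-2    Seed Layer
P4        Seed-1    Seed Layer
P3        SUL 2     Soft Under Layer
P2        SUL Ru    Soft Under Layer
P1        SUL 1     Soft Under Layer
Other diagram label: Blank
• 24 chambers and a total of 410 tool-level parameters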
Scout – Motif Detection algorithm
Matrix Profile:
• Establish a reference motif (Golden Profile); calculate distance of the Monitoring Profile relative to it: $d_1, d_2, \ldots, d_i, \ldots, d_n$
• Lightweight compute; 1 tunable hyper-parameter
• A near-universal time series similarity and anomaly detection approach
• Map to Matrix Profile space → Quantitative Pattern Difference
• Currently deployed in production across all sputter lines in media
• PN MO: significant improvement in line-to-line variations
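The idea of measuring a monitoring profile against a golden reference motif can be sketched as a distance profile: slide a window over the monitored series and compute the z-normalized Euclidean distance to the motif at each position. This is a simplified, brute-force illustration in the spirit of the matrix profile, not the production Scout code. Treating the window length (here, the motif length) as the single tunable hyper-parameter is an assumption, and all function names are hypothetical.

```python
import math


def z_normalize(window):
    """Scale a window to zero mean and unit variance (flat windows map to zeros)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def distance_profile(reference_motif, series):
    """Distances d_1..d_n between the motif and every window of the series."""
    m = len(reference_motif)
    if m < 2 or len(series) < m:
        raise ValueError("series must be at least as long as the motif")
    ref = z_normalize(reference_motif)
    profile = []
    for start in range(len(series) - m + 1):
        win = z_normalize(series[start:start + m])
        profile.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, win))))
    return profile


def pattern_difference(reference_motif, series):
    """Distance of the best-matching window: small means the motif is present."""
    return min(distance_profile(reference_motif, series))


# Illustrative data only (not from the source).
golden = [0, 1, 2, 3, 2, 1, 0]
matching = [5, 5, 5, 7, 9, 11, 9, 7, 5, 5]   # scaled, offset copy of the motif
drifted = [5, 5, 5, 7, 9, 9, 9, 7, 5, 5]     # peak flattened into a plateau
```

Because of the z-normalization, `pattern_difference(golden, matching)` is essentially zero even though the monitored signal has a different offset and amplitude, while the flattened peak in `drifted` yields a clearly larger value. An alarm threshold on that value would have to be chosen per process.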
Layout of Assembly area
Figure: assembly-area layout with positions numbered 1 to 17; optical/touch sensor output profile
Praetor : Architecture & threat detection
Diagram label: COMPARE
• Bottleneck layer compresses input sequence to the time-invariant features which describe a non-anomalous input
• Comparison done with a custom loss function
• Loss function → Anomaly score
• Current FD system gets most things right, like most legacy Ops control knobs
• However, some things it gets right late – subtle pattern changes that cause issues downstream
• Praetor as a 2nd layer is the plan
• Highlights threats, rather than faults
• Currently in the UAT phase by Mfg Eng teams
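The "loss function → anomaly score" step can be sketched independently of the network itself. In the sketch below the score is the mean squared error between a sequence and its reconstruction. That choice is an assumption, since the slide only says the loss is custom. The `stand_in_reconstruct` function is a placeholder for a trained LSTM/CNN auto-encoder, and the threshold rule (a margin times the worst score on normal data) is likewise hypothetical.

```python
def reconstruction_error(sequence, reconstruct):
    """Anomaly score: mean squared error between input and its reconstruction."""
    rebuilt = reconstruct(sequence)
    if len(rebuilt) != len(sequence):
        raise ValueError("reconstruction must match the input length")
    return sum((x - y) ** 2 for x, y in zip(sequence, rebuilt)) / len(sequence)


def fit_threshold(normal_sequences, reconstruct, margin=3.0):
    """Hypothetical rule: margin times the worst score seen on normal data."""
    scores = [reconstruction_error(s, reconstruct) for s in normal_sequences]
    return margin * max(scores)


def is_threat(sequence, reconstruct, threshold):
    """Flag a sequence whose anomaly score exceeds the threshold."""
    return reconstruction_error(sequence, reconstruct) > threshold


# Placeholder for a trained auto-encoder: it always returns the typical
# non-anomalous pattern, which is what a well-trained bottleneck tends toward.
TYPICAL = [0.0, 1.0, 2.0, 1.0, 0.0]


def stand_in_reconstruct(sequence):
    return list(TYPICAL)


# Illustrative data only (not from the source).
normal_runs = [[0.0, 1.1, 2.0, 0.9, 0.0], [0.1, 1.0, 1.9, 1.0, 0.0]]
subtle_change = [0.0, 1.0, 1.5, 1.4, 0.2]   # same value range, different shape

threshold = fit_threshold(normal_runs, stand_in_reconstruct)
print(is_threat(subtle_change, stand_in_reconstruct, threshold))
```

The subtle change stays inside the value range of the normal runs, so a range-based fault check would not react, yet its reconstruction error is well above the threshold. That matches the slide's distinction between threats and faults.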
Sparse-ML : embedding ML code in FW
Custom port of open-source ML inference framework to convert Neural Networks
Diagram labels: Input features, Conversion Utility, TPI, BPI
With the framework now demonstrated, it can be extended to harder problems and more complex models
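How far "more complex models" can go on the device is bounded by memory. The back-of-the-envelope sketch below estimates the RAM needed by a small fully connected network. The layer sizes, the two-buffer activation scheme, and the use of 700 kB as a budget (borrowed from the "RAM used – 700-800 kB" figure earlier in the deck) are all assumptions for illustration, not details of the Sparse-ML port.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def footprint_bytes(layer_sizes, bytes_per_value=4):
    """Parameters plus two activation buffers sized for the widest layer (assumed)."""
    params = dense_param_count(layer_sizes)
    activations = 2 * max(layer_sizes)
    return (params + activations) * bytes_per_value


def fits_budget(layer_sizes, budget_bytes, bytes_per_value=4):
    """True if the estimated footprint stays within the RAM budget."""
    return footprint_bytes(layer_sizes, bytes_per_value) <= budget_bytes


# Hypothetical model: 200 input features, two hidden layers, one output.
layers = [200, 128, 64, 1]
budget = 700 * 1024   # assumed budget in bytes

print(dense_param_count(layers), footprint_bytes(layers), fits_budget(layers, budget))
```

With 32-bit values this hypothetical network needs roughly 135 kB, so it fits with room to spare. Widening the hidden layers grows the parameter count roughly quadratically, which is where smaller value widths or sparsity would start to matter.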