2019 Dac Tutorial Nvidia Part

Machine Learning and Deep Learning Applications in Design Automation and Practical Issues
Haoxing (Mark) Ren, DAC 19 Tutorial
OUTLINE
• Classical machine learning applications
• Deep learning applications
  • Supervised learning: CNN, FCN, GCN
  • Unsupervised learning and reinforcement learning
• Practical issues
  • Feature selection
  • Model selection
  • Data imbalance
• Conclusions
CLASSICAL ML MODELS
[Figure panels: Linear Regression, Support Vector Machine, Decision Tree, Neural Network]
ENSEMBLE MODELS
[Figure panels: Decision Tree, Random Forest, XGBoost]
https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/
ML FOR DRC PREDICTION
Window based
GR-based DRV prediction
ISPD’17: SVM-based DRV prediction
Routability Optimization In Sub14nm Technologies, ISPD’17

ML TIMER CORRELATION
Hierarchical ML Models: LSQR, ANN, SVM, RF
A deep learning methodology to proliferate golden signoff timing, DATE’14

ML LEGALIZATION CLASSIFICATION
Classify partition legalization max displacement for any arbitrary region
Circuit row density histogram Circuit row overlap histogram
How Deep Learning Can Drive Physical Synthesis Towards More Predictable Legalization, ISPD’19
ML DESIGN SPACE EXPLORATION
Intelligently explore a large design space to find the optimal target
Use a Random Forest model to predict HLS QoR
On Learning-Based Methods for Design-Space Exploration with High-Level Synthesis, DAC’13

THE RISE OF DEEP LEARNING
Source : aiindex.org
CLASSICAL MACHINE LEARNING VS DEEP LEARNING
Deep learning: no need for feature engineering
[Figure: performance vs. amount of data, with one curve for deep learning and one for machine learning]
https://arxiv.org/ftp/arxiv/papers/1803/1803.01164.pdf
LEARNING PARADIGMS
IMAGE CONVOLUTION
Convolution aggregates information from neighboring pixels
[Figure panels: Fully Connected vs. Convolution; learned weights produce a feature map]
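A minimal sketch of how convolution aggregates information from neighboring pixels. The averaging kernel here is an illustrative assumption; in a CNN these weights are learned.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D convolution in the CNN sense (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum over the kh x kw neighborhood anchored at (r, c).
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
# Illustrative 3 x 3 averaging weights (an assumption, not learned values).
mean_kernel = [[1 / 9] * 3 for _ in range(3)]
feature_map = conv2d(image, mean_kernel)  # 2 x 2 feature map
```

The same small set of weights is reused at every position, unlike a fully connected layer, which has a separate weight per input pixel.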
CONVOLUTIONAL NEURAL NETWORK
LeNet (Yann LeCun 1998)
CNN VARIATIONS
ShuffleNet V2
https://arxiv.org/pdf/1810.00736.pdf
POWER PREDICTION WITH MACHINE LEARNING
SystemC / RTL: fast (1k-10k cycles/s), low accuracy
Gate-level netlist: slow (10-100 cycles/s), high accuracy
Predict gate-level power with RTL or SystemC register traces
PRIMAL: Power Inference using Machine Learning (DAC’19)

POWER PREDICTION WITH CNN
Fast & Accurate ML: 1K – 50K inferences/s
Reg-to-Pixel Mapping
C0: non-switching
C1: 0→1
C2: 1→0
[Figure: registers A, B, C, D, E mapped to pixels, with 0/1 values in channels C0, C1, C2]
ShuffleNet V2
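A hedged sketch of the register-to-pixel idea with the three channels listed above (C0 non-switching, C1 0→1, C2 1→0). The row-major placement of registers and the function name `encode_cycle` are assumptions for illustration; the actual mapping order is a design choice.

```python
import math


def encode_cycle(prev_values, curr_values):
    """Map one cycle of register values to a 3-channel square image.

    Channel 0: non-switching, channel 1: 0->1, channel 2: 1->0.
    Registers fill the image in row-major order (an assumed ordering);
    unused pixels stay zero in every channel.
    """
    n = len(curr_values)
    side = math.isqrt(n - 1) + 1 if n > 0 else 0  # smallest square with >= n pixels
    image = [[[0] * side for _ in range(side)] for _ in range(3)]
    for idx, (prev, curr) in enumerate(zip(prev_values, curr_values)):
        if prev == curr:
            channel = 0
        elif prev == 0:
            channel = 1
        else:
            channel = 2
        image[channel][idx // side][idx % side] = 1
    return image


# Five registers (A..E) over two consecutive cycles; values are made up.
image = encode_cycle([0, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Each register sets exactly one of the three channels at its pixel, so the image is a one-hot encoding of switching activity for that cycle.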
POWER PREDICTION WITH CNN
DL has better accuracy than ML for large designs
Registers & I/Os (per design): 160, 384, 405, 438, 381, 372, 5651, 23531
PRIMAL: Power Inference using Machine Learning, DAC’19
FULLY CONVOLUTIONAL NETWORK (FCN)
No Fully Connected Layers
Upsampling/Transposed Convolution
(Blue input, green output)
Fully Convolutional Networks for Semantic Segmentation, CVPR’15
DRC HOTSPOT PREDICTION WITH FCN
Output: DRC hotspot map
Input tensor constructed by stacking 2D features:
(1) pin density, (2) macro, (3) long-range RUDY, (4) RUDY pins
ROUTENET: Routability Prediction for Mixed-Size Designs Using Convolutional Neural Network, ICCAD’18
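A small sketch of building an input tensor by stacking 2D feature maps. Only two of the listed features are shown; the grid size, pin coordinates, and macro mask are made-up values, and the helper names are hypothetical.

```python
def pin_density_map(pins, width, height):
    """Count pins per grid cell; pins are (x, y) grid coordinates."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y in pins:
        grid[y][x] += 1.0
    return grid


def stack_features(feature_maps):
    """Stack C maps of shape H x W into an H x W x C tensor (nested lists)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    for fmap in feature_maps:
        if len(fmap) != height or any(len(row) != width for row in fmap):
            raise ValueError("all feature maps must share the same H x W shape")
    return [
        [[fmap[r][c] for fmap in feature_maps] for c in range(width)]
        for r in range(height)
    ]


pins = [(0, 0), (0, 0), (1, 1)]
pin_density = pin_density_map(pins, width=2, height=2)
macro_mask = [[0.0, 1.0], [0.0, 1.0]]  # 1.0 where a macro covers the cell (made up)
tensor = stack_features([pin_density, macro_mask])  # shape 2 x 2 x 2
```

Every pixel of the tensor then carries one value per feature, which is the form a fully convolutional network consumes.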
ROUTENET MODEL
Pixel-wise loss function
DRC HOTSPOT DETECTION EVALUATION
[Figure panels: Window-based ML, Ground Truth, RouteNet]
GRAPH CONVOLUTIONAL NETWORK (GCN)
GCN aggregates information from neighboring nodes
$F^{(l+1)} = \sigma(A F^{(l)} W^{(l)})$
Aggregation (mean, sum)
Encoding ($R^m \to R^n$, ReLU)
Semi-Supervised Classification with Graph Convolutional Networks, ICLR’17
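A minimal sketch of one GCN layer following the rule F(l+1) = σ(A F(l) W(l)), with σ = ReLU. Using a row-normalized adjacency with self-loops (mean aggregation) is an assumption made here; the graph, features, and weights are made-up values.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def mean_adjacency(edges, num_nodes):
    """Row-normalized adjacency with self-loops, so A @ F averages each neighborhood."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    for i in range(num_nodes):
        a[i][i] = 1.0
        degree = sum(a[i])
        a[i] = [v / degree for v in a[i]]
    return a


def gcn_layer(a, features, weights):
    """Aggregation (A @ F) followed by encoding (@ W, then ReLU)."""
    return relu(matmul(matmul(a, features), weights))


edges = [(0, 1), (0, 2)]
a = mean_adjacency(edges, 3)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 nodes, 2 features each
weights = [[1.0, -1.0], [0.5, 0.5]]              # encoding R^2 -> R^2
out = gcn_layer(a, features, weights)
```

Stacking such layers lets each node see information from nodes two, three, or more hops away.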

GCN EXAMPLE
[Figure: two-layer GCN on a 9-node graph, computing the embedding of node 1]
1st Layer
Aggregation: node 1 from nodes {1, 2, 3, 4}, node 2 from {1, 2, 5, 6}, node 3 from {1, 3, 7, 8}, node 4 from {1, 4, 8, 9}; each [4 x 4] → [1 x 4]
Encoding: nodes 1, 2, 3, 4; each [1 x 4] → [1 x 32]
2nd Layer
Aggregation: node 1 from nodes {1, 2, 3, 4}; four [1 x 32] → [1 x 32]
Encoding: node 1; [1 x 32] → [1 x 64]
GCN BASED TESTABILITY PREDICTION
Node features: Logic Level, SCOAP_C0, SCOAP_C1, SCOAP_OB
Layer 1: weighted sum & ReLU ($R^4 \to R^{32}$)
Layer 2: weighted sum & ReLU ($R^{32} \to R^{64}$)
Layer 3: weighted sum & ReLU ($R^{64} \to R^{128}$)
Fully Connected Layers (64, 64, 128, 2)
High Performance Graph Convolutional Networks with Applications in Testability Analysis, DAC’19
TESTABILITY PREDICTION ACCURACY
[Figure: testing accuracy (%) vs. epochs for different numbers of GCN layers K; precision, recall, F1 score, and accuracy vs. epochs]
• Test point insertion reduced by 11% over TetraMax.
• Graph-based models are well suited for EDA problems.
UNSUPERVISED LEARNING
• Supervised learning learns to predict y from x, typically with maximum likelihood
• Unsupervised learning: model density, do maximum likelihood on the data instead of the targets
• Generative models
Unsupervised Learning Tutorial, NIPS’18
GENERATIVE MODELS
[Figure panels: Autoregressive, Autoencoder, GAN]
Image credit: PixelRNN (CVPR’16), https://skymind.ai/wiki/generative-adversarial-network-gan

GAN BASICS
Hung-Yi Lee, Generative Adversary Networks, 2018

OPC-GAN
Design target
Mask with OPC
ILT
Wafer
GAN-OPC : mask optimization with lithography-guided generative adversarial nets, DAC’18

LithoGAN
CGAN
[Figure labels: Mask, Photo Resist]
LithoGAN: End-to-End Lithography Modeling with Generative Adversarial Networks, DAC’19
REINFORCEMENT LEARNING BACKGROUND
• Reward 𝑹(𝒕): score you earned at the current step
• State 𝐒: current screen
• Action 𝒂: move your board left / right
• Action value function $\hat{Q}(S, a)$: your predicted future total rewards
• Policy 𝝅(𝒔): how to choose your action
REINFORCEMENT LEARNING CATEGORIES
Value-based: learn the Q function $\hat{Q}(S, a)$ (DQN)
• Simple action policy
• Discrete action space
• Sample efficient
Policy-based: learn the action policy 𝝅(𝒔) (Policy Gradient)
• Stochastic action
• Continuous space
• Sample inefficient
Actor + Critic (A2C, A3C)
LEARNING Q FUNCTION
Q Learning with Temporal Difference
Deep neural network (DQN)
Hung-Yi Lee, Deep Reinforcement Learning, 2018
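A sketch of Q learning with a temporal-difference update, using a lookup table in place of the deep network that DQN would use. The state names, step size `alpha`, and discount `gamma` are illustrative assumptions.

```python
def td_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


q = {}  # tabular Q function; missing entries default to 0.0
actions = ["left", "right"]
td_update(q, "s0", "right", 1.0, "s1", actions)
td_update(q, "s1", "right", 2.0, "s2", actions)
# The second visit to s0 now bootstraps from the learned value of s1.
value = td_update(q, "s0", "right", 1.0, "s1", actions)
```

DQN replaces the table with a neural network and fits it to the same target, which is what makes large state spaces tractable.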
POLICY GRADIENT
Hung-Yi Lee, Deep Reinforcement Learning, 2018

DQN FOR COMBINATORIAL OPTIMIZATION
Replacing heuristics
Minimum Vertex Cover (MVC)
Select a vertex to insert into the cover, one at a time
RL Formulation
• Reward: -1, cost of vertex cover
• State: current selected nodes, use GCN to learn graph state
• Action: which node to select
• Q function: use DQN to learn which node has the highest value
• Policy: 𝜀-greedy
Learning Combinatorial Optimization Algorithms over Graphs, NIPS’17
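A hedged sketch of the vertex-by-vertex construction above with an 𝜀-greedy policy. The Q function here is a hand-written stand-in (the number of uncovered edges a node would cover), not a trained DQN; the function names and the example graph are assumptions.

```python
import random


def uncovered_edges(edges, cover):
    return [(u, v) for u, v in edges if u not in cover and v not in cover]


def degree_q(node, edges, cover):
    """Stand-in Q value: number of still-uncovered edges the node would cover."""
    return sum(1 for u, v in uncovered_edges(edges, cover) if node in (u, v))


def build_cover(edges, q_fn=degree_q, epsilon=0.0, rng=None):
    """Add one vertex at a time until every edge is covered."""
    rng = rng or random.Random(0)
    nodes = sorted({n for e in edges for n in e})
    cover, total_reward = set(), 0
    while uncovered_edges(edges, cover):
        candidates = [n for n in nodes if n not in cover]
        if rng.random() < epsilon:
            choice = rng.choice(candidates)  # explore: random node
        else:
            choice = max(candidates, key=lambda n: q_fn(n, edges, cover))  # exploit
        cover.add(choice)
        total_reward -= 1  # reward of -1 per selected vertex
    return cover, total_reward


graph = [(0, 1), (0, 2), (0, 3), (3, 4)]
cover, reward = build_cover(graph)                   # pure exploitation
noisy_cover, _ = build_cover(graph, epsilon=1.0)     # pure exploration
```

Swapping `degree_q` for a learned value function is the point of the DQN approach: the greedy loop stays the same while the heuristic is replaced.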
DQN WITH DEEP NODE REPRESENTATIONS
Learning Combinatorial Optimization Algorithms over Graphs, NIPS’17

POLICY GRADIENT FOR LOGIC OPTIMIZATION
Stochastic policy to select optimization transforms
Majority Inverter Graph (MIG)
MAJ(x, y, z) = (𝑥 ∧ 𝑦) ∨ (𝑥 ∧ 𝑧) ∨ (𝑦 ∧ 𝑧)
RL Formulation
• Reward: logic depth reduction
• State: current graph, use GCN to learn graph state
• Action: which move to select
• Policy: use a move-dependent fully connected layer to compute the probability of each move
Deep Learning for Logic Optimization, IWLS’17
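A minimal sketch of a stochastic policy: per-move scores are turned into probabilities with a softmax and one move is sampled. The move names and scores are placeholders; in the formulation above the scores would come from the network.

```python
import math
import random


def softmax(scores):
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_move(moves, scores, rng):
    """Sample one move according to the softmax probabilities of its score."""
    probs = softmax(scores)
    return rng.choices(moves, weights=probs, k=1)[0], probs


moves = ["move_a", "move_b", "move_c"]  # placeholder transform names
scores = [2.0, 1.0, 0.0]                # placeholder network outputs
move, probs = sample_move(moves, scores, random.Random(0))
```

Sampling, rather than always taking the top-scoring move, keeps some exploration in the policy during training.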
RL VS OTHER ALGORITHMS
• 𝜀-greedy policy: Simulated Annealing
• Known environment: Dynamic Programming
• Simple policy: Heuristics
• Gradient free: Evolutionary Algorithm
POTENTIAL OF UNSUPERVISED LEARNING
DATASET ANALYSIS
Analyze dataset with Pandas Profiler
FEATURE SELECTION
• Filter method
  • Evaluate the relationship between features and target to compute the importance of each feature
  • F-test, mutual information, variance threshold
• Wrapper method
  • Add features one at a time
  • Eliminate features one at a time
• Embedded method
  • Lasso regression: zero weight for unimportant features
  • Tree-based method: important features at the root of the tree
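A sketch of the filter method: drop near-constant features with a variance threshold, then rank the rest by their relationship to the target. Absolute Pearson correlation is used here as a simple stand-in for the F-test or mutual information; feature names and values are made up.

```python
import math


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / norm if norm else 0.0


def filter_select(features, target, min_variance=1e-8, top_k=2):
    """features: dict of name -> column. Returns the top_k feature names."""
    # Step 1: variance threshold removes (near-)constant columns.
    kept = {n: col for n, col in features.items() if variance(col) > min_variance}
    # Step 2: rank remaining columns by |correlation| with the target.
    ranked = sorted(kept, key=lambda n: abs(pearson(kept[n], target)), reverse=True)
    return ranked[:top_k]


features = {
    "pin_density": [1.0, 2.0, 3.0, 4.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
    "noise": [2.0, -1.0, 2.0, -1.0],
    "neg_signal": [4.0, 3.0, 2.0, 0.5],
}
target = [1.1, 1.9, 3.2, 3.9]
selected = filter_select(features, target)
```

Filter methods score each feature independently of any model, which makes them cheap but blind to feature interactions; wrapper methods trade compute for that.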
WRAPPER METHOD EXAMPLE
Routability Optimization In Sub14nm Technologies, ISPD’17
1D FEATURE ENCODING
[Figure: sub-block-level layout of an SoC]
➢ on-chip measurement point location
➢ sense point neighborhood-level graph
➢ global and local feature vectors
Robust Power Estimation and Simultaneous Switching Noise Prediction Methods Using Machine Learning, GTC’19
2D FEATURE ENCODING
Map an array of registers to a 2D image
Partition-based encoding; node-embedding-based encoding

DATASET CREATION
Cover a wide range of design frequencies

Cover different types of standard cell sizes
Prevent duplication in training data due to replicated partitions/chiplets
Select more outliers in the design chosen
Training/Testing split
Using Machine Learning for VLSI Testability and Reliability, GTC’19
MODELING
Always try classical machine learning first to establish a strong baseline
  Use Linear Regression, SVM and XGBoost
Model building: think about the underlying physics
  A DL model performs better if it adheres to the physics, e.g. 2D CNNs are associated with spatial patterns
  Use priors in model construction: graphs, cost functions, etc.
Hyperparameter tuning:
  Start with a small dataset and make sure you can overfit it with the model
  Gradually increase model complexity if you cannot overfit the training dataset
  Cross validation
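A minimal sketch of the index split behind k-fold cross validation. Contiguous, unshuffled folds are a simplification chosen here; in practice the samples are usually shuffled or grouped first.

```python
def k_fold_indices(num_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [num_samples // k + (1 if i < num_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, num_samples))
        yield train, val
        start += size


folds = list(k_fold_indices(10, 3))
for train, val in folds:
    print(f"train={train} val={val}")
```

Each sample is used for validation exactly once, so the averaged validation score is less sensitive to one lucky or unlucky split.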
DATA IMBALANCE ISSUE
It is very common to have many more non-DTs (negative class) than DTs (positive class), with an imbalance ratio of more than 100X

Classifier 1: ok precision, low recall
           Predict: 0   Predict: 1
Fact: 0    133576       290
Fact: 1    3681         432
Recall: 10.5%, Precision: 59.8%

Classifier 2: high recall, low precision
           Predict: 0   Predict: 1
Fact: 0    100919       32927
Fact: 1    114          4069
Recall: 97.3%, Precision: 11.0%
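The recall and precision figures above follow directly from the confusion matrices. A short check, which also computes accuracy to show why accuracy alone is misleading on imbalanced data:

```python
def metrics(tn, fp, fn, tp):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy


p1, r1, a1 = metrics(tn=133576, fp=290, fn=3681, tp=432)
p2, r2, a2 = metrics(tn=100919, fp=32927, fn=114, tp=4069)
print(f"Classifier 1: precision {p1:.1%}, recall {r1:.1%}, accuracy {a1:.1%}")
print(f"Classifier 2: precision {p2:.1%}, recall {r2:.1%}, accuracy {a2:.1%}")
```

Classifier 1 has the higher accuracy of the two even though it misses most of the positive class, because the negatives dominate the count.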
WEIGHTED LABEL
Apply weights to compensate for the bias: {2, 3, 4, 5, ..., 10, 20, 30, 40, 50}
Routability Optimization In Sub14nm Technologies, ISPD’17

MULTI-STAGE CLASSIFICATION
The networks in the initial stages only filter out negative data points with high confidence
  High recall, low precision
Positive predictions are sent to the network in the next stage
Network 1 → (+) → Network 2 → (+) → Network 3 → (+); negative (-) predictions are filtered out at each stage
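A sketch of the multi-stage idea: a sample is reported positive only if every stage passes it, and early stages use low thresholds so they reject only confident negatives. The scoring functions and thresholds are hypothetical stand-ins for trained networks.

```python
def cascade_predict(sample, stages):
    """stages: list of (score_fn, threshold). Return 1 only if every stage passes."""
    for score_fn, threshold in stages:
        if score_fn(sample) < threshold:
            return 0  # filtered out as negative; later stages never run
    return 1


# Hypothetical stand-ins for Network 1, 2, 3 on a 2-feature sample.
stages = [
    (lambda s: s[0], 0.1),               # stage 1: low threshold, high recall
    (lambda s: (s[0] + s[1]) / 2, 0.4),  # stage 2
    (lambda s: s[1], 0.7),               # stage 3: strictest
]
samples = [(0.05, 0.9), (0.5, 0.5), (0.9, 0.8)]
predictions = [cascade_predict(s, stages) for s in samples]
```

Because each stage removes many negatives, later stages train and run on a far less imbalanced set than the original data.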
CONCLUSIONS
• Deep learning and machine learning can improve quality and productivity of design automation in many ways.
• We should focus on innovative methods to apply advanced DL models to hard EDA problems.
• There are still a lot of challenges in applying DL: open datasets, transferability, interpretability.
