Dev. Environment Setup & Deep Learning: A Concise Intro.

HonghuTech 翁啟閎
@NTUST
2022/12/15

Docker Introduction
• Docker Command Lines
• Official Docker Images for Deep Learning (TensorFlow, PyTorch, etc)
• Build docker images by yourself

KubeRun Introduction
• User Interface for Admin. & for User
• Customize Your Own Image
• Initialize a Jupyter Dev. Environment
• Submit GPU Jobs

We aim to let you know how to quickly deploy environments for development.

Deep Learning
• Some Basic Concepts
• Convolutional Neural Network
• Build a ResNet from Scratch
• Distributed Training Principle and a Demo

We aim to give you a good starting point for AI model construction / training. Theoretical details or recent developments are not covered here.
THE ECOSYSTEM OF NVIDIA…
Software stack, from top to bottom:
Frameworks
Libraries (NCCL, cuDNN, cuBLAS)
CUDA
NVIDIA GPUs
Docker vs. Virtual Machine

• Developed by Docker, Inc. (2013 until now)
• Aims at fast deployment of apps
• Virtual machine: resources are pre-allocated

Virtual Machine vs. Docker

https://www.hitechnectar.com/blogs/hypervisor-vs-docker/
Docker image: a pile of layers

https://ragin.medium.com/docker-what-it-is-how-images-are-structured-docker-vs-vm-and-some-tips-part-1-d9686303590f?source=post_internal_links---------1----------------------------
Docker and Deep Learning?

• Easy to deploy a variety of DL environments (official TensorFlow and PyTorch images are available)
• Can deploy the same environment everywhere
• Version control (CUDA, cuDNN, NCCL, DL frameworks)
• Ensures that you and your teammates work with the same version of the DL environment
NVIDIA Container Toolkit

• NVIDIA container runtime: exposes the GPU drivers installed on the host side to the running container.

https://github.com/NVIDIA/nvidia-docker
Common usage
docker images                                     # list the Docker images available locally
docker search [possible_image_name]               # search for an image
docker pull [image_name]                          # download an image
nvidia-docker run [options] [image_name]          # start an image (it becomes a container)
docker stop [container_name_or_id]                # stop a container
docker start [container_name_or_id]               # start a stopped container
docker rm [container_name_or_id]                  # remove a container
docker rmi [image_name]                           # delete an image
docker ps                                         # list the containers that are currently running
docker inspect [container_name]                   # inspect a container
docker logs [container_name]                      # view a container's logs
docker exec -it [container_name_or_id] /bin/bash  # open a terminal inside the container
docker build -f [dockerfile_path] .               # build a new image from a Dockerfile
Fetch an image from Docker Hub via:
docker pull maintainer/repository:tag

Share your App on Docker Hub

https://hub.docker.com/
Run containers via docker / nvidia-docker run

Google TensorFlow:
docker run -it --rm --gpus '"device=0,1"' -p 8888:8888 --name tf -v $HOME:/workspace --shm-size 32gb nvcr.io/nvidia/tensorflow:22.10.1-tf2-py3 bash

NVIDIA Digits:
docker run -d --gpus=all -p 5000:5000 --name digits -v $HOME:/workspace nvidia/digits

-it: interactive TTY
--rm: remove the container when it exits
-v: mount a volume
--name: name the container
-d: run as a daemon (in the background)
-p: publish a port

docker exec [command]: execute a command inside a running container

Execute the BASH of a running container:
docker exec -it tf /bin/bash


Build your own image: nvidia-docker build

# Create a folder
mkdir build_tf

# Create the `Dockerfile` (the instruction file) inside that folder
echo '# Add `seaborn, polars` to the Google TensorFlow image
FROM tensorflow/tensorflow:2.10.1-gpu-jupyter
MAINTAINER Chi-Hung Weng <chihung@honghutech.com>
RUN apt update && apt install -y vim
RUN pip3 install seaborn polars
' > build_tf/Dockerfile

# Enter the folder and build the image based on the created `Dockerfile`
cd build_tf && docker build -t="me/my_tf" .

# Run the created image
docker run -it --rm --gpus '"device=0,1"' -p 8888:8888 --name my_tf -v $HOME:/workspace --shm-size 32gb me/my_tf bash

Deep-learning images provided by NVIDIA GPU Cloud (NGC)

https://ngc.nvidia.com/

• Maintained by NVIDIA; updated approximately monthly.
• You must agree to the NVIDIA GPU CLOUD TERMS OF USE…

Deep-learning images provided by DockerHub

https://hub.docker.com/r/tensorflow/tensorflow/

• Maintained by the TensorFlow Community
Kubernetes (K8s)

• Released by Google; now hosted by the Cloud Native Computing Foundation.

• Can deploy, scale, and manage containers across cluster nodes

• If you want to get started, visit here.


K8s for Deep Learning? Why?

• Scalable.

• GPU jobs can be scheduled!

• We have released KubeRun, a web UI for job scheduling.


KubeRun
Theory of ML
A Typical Deep Learning procedure

1. Choose a model that corresponds to your needs
   ResNet? Transformer? Etc.? Check paperswithcode.com
2. Define the Loss Function
   Linear regression: Least Square Loss
   Binary classification: Cross Entropy Loss
   Multi-class classification: Cross Entropy Loss
3. Loss Optimization
   Use something such as Stochastic Gradient Descent for loss minimization
4. Evaluate Model Performance
   Accuracy, Precision, Recall, F1, AP, AUC, …
5. If you observe Overfitting
   Consider, say, adding an L1 or L2 penalty
All rights reserved; infringement will be prosecuted.
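The five steps of the typical procedure can be sketched end to end on a toy problem. The following is a minimal illustration, not the workflow used in the talk: the one-feature logistic model, the made-up data, and the hyperparameters `lr`, `l2` and `epochs` are all assumptions chosen for brevity.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y, eps=1e-12):
    """Binary cross-entropy loss of one sample (step 2)."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def train(samples, lr=0.1, l2=0.01, epochs=200, seed=0):
    """Fit p = sigmoid(w*x + b) with SGD (step 3) and an L2 penalty on w (step 5)."""
    rng = random.Random(seed)
    data = list(samples)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(w * x + b)
            # The gradient of the cross-entropy w.r.t. the logit is (p - y);
            # the penalty 0.5 * l2 * w**2 adds l2 * w to the gradient of w.
            w -= lr * ((p - y) * x + l2 * w)
            b -= lr * (p - y)
    return w, b

def accuracy(samples, w, b):
    """Fraction of samples classified correctly at a 0.5 threshold (step 4)."""
    hits = sum((sigmoid(w * x + b) >= 0.5) == bool(y) for x, y in samples)
    return hits / len(samples)

# Made-up toy data: negative x belongs to class 0, positive x to class 1.
toy = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0),
       (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1)]

w, b = train(toy)  # step 1 (model choice) is fixed here: a logistic model
mean_loss = sum(bce(sigmoid(w * x + b), y) for x, y in toy) / len(toy)
print("w, b:", w, b)
print("mean cross-entropy:", mean_loss)
print("accuracy:", accuracy(toy, w, b))
```

In a real project the model, loss and optimizer would come from a framework such as TensorFlow or PyTorch; the structure of the loop stays the same.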
Computer Vision

The Convolutional Layer

Width, Height (W, H)
Padding (P): number of zeros padded on each side
Stride (S): sliding step
Filter size (F): size of the filter
Depth (D): number of filters in this Conv layer

Output width: $W' = \frac{W + 2P - F}{S} + 1$

W=5, P=1, S=1, F=3: (5 + 2*1 - 3)/1 + 1 = 5, so W'=5
W=5, P=1, S=2, F=3: (5 + 2*1 - 3)/2 + 1 = 3, so W'=3

CS231n, Andrej Karpathy & Fei-Fei Li
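The output-size rule W' = (W + 2P - F)/S + 1 is easy to check in code. This is a small sketch; the function name `conv_output_size` is made up for illustration, and raising an error when the filter does not tile the input evenly is an assumption (frameworks typically round down instead).

```python
def conv_output_size(w, f, p=0, s=1):
    """Output size W' = (W + 2P - F) / S + 1 along one spatial dimension."""
    span = w + 2 * p - f
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    if span % s != 0:
        # Assumption: be strict here; real frameworks usually floor the result.
        raise ValueError("filter does not tile the padded input evenly")
    return span // s + 1

# The two cases from the slide: W=5, P=1, F=3 with stride 1 and stride 2.
print(conv_output_size(5, 3, p=1, s=1))  # 5
print(conv_output_size(5, 3, p=1, s=2))  # 3
```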
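The Convolutional Layer

CS231n, Andrej Karpathy & Fei-Fei Li: typical-looking filters on the first CONV layer of a trained AlexNet

• Sweeping the same set of 3 x 3 filters over the whole image means we assume that every 3 x 3 patch of the image shares common properties.
• Benefit: it reduces the amount of data (weights) to learn, which makes the network easier to train.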
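To get a feeling for how much sharing one set of filters saves, the weight counts of a convolution layer and of a fully connected layer can be compared. The sketch below is only an illustration: the 224 x 224 RGB input and the 64 filters/units are assumed example values, and both helper names are made up.

```python
def conv_params(f, in_channels, filters):
    """Weights plus biases of an F x F convolution layer with shared filters."""
    return (f * f * in_channels + 1) * filters

def dense_params(height, width, in_channels, units):
    """Weights plus biases of a fully connected layer on the flattened image."""
    return (height * width * in_channels + 1) * units

# Assumed example: a 224 x 224 RGB image, 64 filters versus 64 dense units.
print(conv_params(3, 3, 64))          # 1792
print(dense_params(224, 224, 3, 64))  # 9633856
```

The convolution layer's parameter count does not depend on the image size at all, which is the practical consequence of reusing the same filters everywhere.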
The Max Pooling Layer

Width (W)
Height (H)
Padding (P)
Stride (SW, SH)
Filter size (FW, FH)

$W' = \frac{W + 2P - F_W}{S_W} + 1$
$H' = \frac{H + 2P - F_H}{S_H} + 1$

Example: W=4, H=4, P=0, SW=SH=2, FW=FH=2 gives W'=H'=2

CS231n, Andrej Karpathy & Fei-Fei Li
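A 2 x 2 max pooling with stride 2 and no padding can be written in a few lines of plain Python. This is a sketch for a single-channel input; the function name `max_pool_2d` and the 4 x 4 input values are made up for illustration.

```python
def max_pool_2d(image, pool=2, stride=2):
    """Max pooling without padding on a 2-D list of numbers."""
    h, w = len(image), len(image[0])
    out_h = (h - pool) // stride + 1
    out_w = (w - pool) // stride + 1
    return [
        [
            max(
                image[i * stride + di][j * stride + dj]
                for di in range(pool)
                for dj in range(pool)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Made-up 4 x 4 input (W=4, H=4); the output is 2 x 2.
image = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
print(max_pool_2d(image))  # [[6, 8], [3, 4]]
```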
Test: Convolutional Layer I/O

Width (W), Height (H), Padding (P), Stride (SW, SH), Filter size (FW, FH), Depth (D)

$W' = \frac{W + 2P - F_W}{S_W} + 1$
$H' = \frac{H + 2P - F_H}{S_H} + 1$

Width=5
Height=5
Filter size=3X3
Stride=1X1
Padding=0
Depth=96

Input: #samples X Height X Width X Channel = 10X5X5X3
Output: #samples X Height X Width X Depth = 10X3X3X96

# Imports assumed: tf.keras as shipped in the TensorFlow images
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

# Normally distributed random data as input: 10 samples, each 5 x 5 x 3
randData = np.random.normal(0, 1, (10, 5, 5, 3))

model = Sequential()
model.add(Conv2D(filters=96, kernel_size=(3, 3),
                 strides=(1, 1),
                 padding='valid',
                 input_shape=(5, 5, 3)))
print(model.predict(randData).shape)  # check the shape of the output data

Test: Max Pooling Layer I/O

Width (W), Height (H), Padding (P), Stride (SW, SH), Pool size (FW, FH)

$W' = \frac{W + 2P - F_W}{S_W} + 1$
$H' = \frac{H + 2P - F_H}{S_H} + 1$

Width=4
Height=4
Pool size=2X2
Stride=2X2
Padding=0
Depth=3

Input: #samples X Height X Width X Depth = 10X4X4X3
Output: #samples X Height X Width X Depth = 10X2X2X3

# Imports assumed: tf.keras as shipped in the TensorFlow images
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import MaxPooling2D

# Normally distributed random data as input: 10 samples, each 4 x 4 x 3
randData = np.random.normal(0, 1, (10, 4, 4, 3))

model = Sequential()
model.add(MaxPooling2D(pool_size=(2, 2),
                       strides=(2, 2),
                       padding='valid',
                       input_shape=(4, 4, 3)))
print(model.predict(randData).shape)  # check the shape of the output data

Concept: Receptive Field

How large a field of view can a filter perceive?

Image
Feature map 1: after a 3 X 3 Conv (stride=1)
Feature map 2: after a 2 X 2 Conv (stride=1)

How much of the original image can this filter see?
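The receptive field of a stack of layers can be computed layer by layer: each layer widens the field by (filter size - 1) times the spacing, in input pixels, between neighbouring units of the previous feature map. The sketch below applies this standard recurrence to the two layers on the slide; the function name `receptive_field` is made up for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one unit after a stack of layers.

    `layers` is a list of (filter_size, stride) pairs, ordered from the input.
    """
    rf, jump = 1, 1  # jump: distance in input pixels between adjacent units
    for f, s in layers:
        rf += (f - 1) * jump
        jump *= s
    return rf

# Feature map 1: 3 x 3 conv, stride 1.
print(receptive_field([(3, 1)]))          # 3
# Feature map 2: a further 2 x 2 conv, stride 1.
print(receptive_field([(3, 1), (2, 1)]))  # 4
```

Under these assumptions, one unit of feature map 2 sees a 4 x 4 patch of the original image.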
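WHAT HAVE THE CONVOLUTIONAL LAYERS LEARNT?

Appearance of the first-layer filters
Second-layer filter information (projected back to image space)
Fourth-layer filter information (projected back to image space)
Zeiler and Fergus 2013

Deeper layers represent increasingly complex patterns / the machine learns to keep the features that are useful for classification.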
AlexNet

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton (2012)

from a slide made by Bickson
The Deeper, the Better?

https://medium.com/@Lidinwise/the-revolution-of-depth-facf174924f5

VGGNet
Karen Simonyan, Andrew Zisserman (2014)
http://www.robots.ox.ac.uk/~vgg/

size: 224
  3x3 Conv, 64
  3x3 Conv, 64
  max pool
size: 112
  3x3 Conv, 128
  3x3 Conv, 128
  max pool
size: 56
  3x3 Conv, 256
  3x3 Conv, 256
  3x3 Conv, 256
  max pool
size: 28
  3x3 Conv, 512
  3x3 Conv, 512
  3x3 Conv, 512
  max pool
size: 14
  3x3 Conv, 512
  3x3 Conv, 512
  3x3 Conv, 512
  max pool
size: 7
  Dense, 4096
  Dense, 4096
  Dense, 1000
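The "size" column of the VGGNet listing can be reproduced by walking through the layers. The sketch below assumes that every 3x3 convolution uses stride 1 with padding 1 (so it keeps the spatial size) and that every max pool is 2x2 with stride 2; the names `VGG_CONFIG` and `trace_shapes` are made up for illustration.

```python
# Layer list as on the slide: an integer is the number of filters of a
# 3x3 convolution, "M" is a max pool.
VGG_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]

def trace_shapes(size=224, channels=3, config=VGG_CONFIG):
    """Return the (size, channels) of the feature map after every layer."""
    shapes = []
    for layer in config:
        if layer == "M":
            size = (size - 2) // 2 + 1          # 2x2 pool, stride 2, no padding
        else:
            size = (size + 2 * 1 - 3) // 1 + 1  # 3x3 conv, stride 1, padding 1
            channels = layer
        shapes.append((size, channels))
    return shapes

shapes = trace_shapes()
for layer, (size, channels) in zip(VGG_CONFIG, shapes):
    print(layer, "->", size, "x", size, "x", channels)

# The first Dense layer sees the flattened final feature map.
final_size, final_channels = shapes[-1]
print("flattened features:", final_size * final_size * final_channels)  # 25088
```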
ResNet
Kaiming He et al (2015)

Before ResNet, we have an issue:

ResNet v1: Deep Residual Learning for Image Recognition (https://arxiv.org/abs/1512.03385)

Residual block

Two stacked Conv 3x3 layers compute F(x), while a shortcut connection carries the input x around them; the block outputs F(x) + x.

• Allows incremental learning
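The idea of a residual block, output = F(x) + x, can be shown without any framework. The sketch below is a deliberate simplification: it swaps the two 3x3 convolutions for two small fully connected layers acting on a vector, and all names and numbers are made up for illustration.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def linear(weights, v):
    """Multiply a square weight matrix (a list of rows) with a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """Return relu(F(x) + x), with F(x) = w2 . relu(w1 . x)."""
    fx = linear(w2, relu(linear(w1, x)))
    return relu([f + a for f, a in zip(fx, x)])

dim = 4
x = [1.0, 2.0, 0.5, 3.0]

# With all-zero weights F(x) = 0, so the block simply passes x through.
zeros = [[0.0] * dim for _ in range(dim)]
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 0.5, 3.0]

# With small random weights the output is x plus a small correction F(x).
rng = random.Random(0)
small = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(dim)]
out = residual_block(x, small, small)
print(out)
```

Because the shortcut already delivers x, the layers only have to learn the increment F(x), which is one way to read "incremental learning".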
DEMO
DISTRIBUTED TRAINING
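Single-GPU model training

The model lives on one GPU and receives the samples $X^{(1)}, X^{(2)}, X^{(3)}, X^{(4)}$.

1. Forward pass: obtain the loss of every sample.
   $L = l^{(1)} + l^{(2)} + l^{(3)} + l^{(4)}$
2. Backpropagate the total loss so that the weights w receive their gradient.
   $g_w = \frac{\partial L}{\partial w} = \frac{\partial l^{(1)}}{\partial w} + \frac{\partial l^{(2)}}{\partial w} + \frac{\partial l^{(3)}}{\partial w} + \frac{\partial l^{(4)}}{\partial w}$
3. Update w using its gradient.
   $w := w - \eta g_w$
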

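Multi-GPU model training

CPU1 holds the batch $X^{(1)}, X^{(2)}, X^{(3)}, X^{(4)}$ and splits it: GPU1 receives $X^{(1)}, X^{(2)}$ and GPU2 receives $X^{(3)}, X^{(4)}$. Each GPU holds a copy of the model.

1. Each GPU runs the forward pass and obtains the losses of its own samples.
   GPU1: $l^{(1)}, l^{(2)}$
   GPU2: $l^{(3)}, l^{(4)}$
2. Each GPU runs backpropagation and obtains its own gradient.
   GPU1: $\frac{\partial l^{(1)}}{\partial w} + \frac{\partial l^{(2)}}{\partial w}$
   GPU2: $\frac{\partial l^{(3)}}{\partial w} + \frac{\partial l^{(4)}}{\partial w}$
3. With allreduce, every GPU obtains the complete gradient of w.
   $g_w = \frac{\partial l^{(1)}}{\partial w} + \frac{\partial l^{(2)}}{\partial w} + \frac{\partial l^{(3)}}{\partial w} + \frac{\partial l^{(4)}}{\partial w}$
4. Update w using its gradient.
   $w := w - \eta g_w$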
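The claim that data-parallel training yields the same gradient as single-GPU training can be checked with a tiny simulation. The sketch below is only an illustration: the "GPUs" are plain Python lists, `allreduce_sum` merely imitates what a real allreduce (for example via NCCL or Horovod) delivers, and the scalar model, the squared-error loss, the data and the learning rate are all made up.

```python
def grad_sum(w, samples):
    """Gradient of sum_i (w * x_i - y_i)**2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in samples)

def allreduce_sum(values):
    """Simulated allreduce: every worker receives the sum of all values."""
    total = sum(values)
    return [total for _ in values]

# Made-up batch of four (x, y) samples and a made-up learning rate.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
eta = 0.01
w = 0.0

# Single GPU: one gradient over the whole batch.
g_single = grad_sum(w, batch)

# Two GPUs: each gets half of the batch and computes a local gradient ...
shards = [batch[:2], batch[2:]]
local_grads = [grad_sum(w, shard) for shard in shards]
# ... then allreduce gives every GPU the same, complete gradient.
reduced = allreduce_sum(local_grads)
print(g_single, reduced)

# Every replica applies the identical update, so the copies of w stay in sync.
replicas = [w - eta * g for g in reduced]
print(replicas)
```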
Quick DEMO

• https://github.com/horovod/horovod
• https://github.com/horovod/horovod/blob/master/examples/tensorflow2/tensorflow2_keras_mnist.py
