
Datawhale

Zero-Based Introduction to CV: A Practical Tutorial
────────
# Version | V1.0 #

By 阿水, 安晟, 王程伟, 张强, 伊雪, 王茂霖

Github Datawhalechina

http://www.datawhale.club
Version    Date         Revision notes

V1.0       2020.5.13

Content                    Contributor

Datawhale CV - Baseline
Datawhale CV - Task01
Datawhale CV - Task02
Datawhale CV - Task03
Datawhale CV - Task04
Datawhale CV - Task05

Member    Introduction    Home page

Top https://www.zhihu.com/people/finlayliu

https://blog.csdn.net/u011583927

https://blog.csdn.net/weixin_40647819

Github https://github.com/QiangZiBro

Tianchi Github: https://github.com/mlw67

https://github.com/datawhalechina/team-learning/
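1 Datawhale Zero-Based Introduction to CV - Baseline Walkthrough
1.1 Baseline approach
1.2 Runtime environment and installation example
1.3 Baseline in detail
2 Datawhale Zero-Based Introduction to CV - Task1 Understanding the Competition
2.1 Learning objectives
2.2 Competition data
2.3 Data labels
2.4 Evaluation metric
2.5 Reading the data
2.6 Solution approaches
2.7 Chapter summary
3 Datawhale Zero-Based Introduction to CV - Task2 Data Reading and Data Augmentation
3.1 Learning objectives
3.2 Image reading
3.2.1 Pillow
3.2.2 OpenCV
3.3 Data augmentation methods
3.3.1 Introduction to data augmentation
3.3.2 Common data augmentation methods
3.3.3 Common data augmentation libraries
3.4 Reading data with Pytorch
3.5 Chapter summary
4 Datawhale Zero-Based Introduction to CV - Task3 Character Recognition Model
4.1 Learning objectives
4.2 Introduction to CNNs
4.3 Development of CNNs
4.4 Building a CNN model with Pytorch
4.5 Chapter summary
5 Datawhale Zero-Based Introduction to CV - Task4 Model Training and Validation
5.1 Learning objectives
5.2 Building a validation set
5.3 Model training and validation
5.4 Saving and loading models
5.5 Hyperparameter tuning workflow
5.6 Chapter summary
6 Datawhale Zero-Based Introduction to CV - Task5 Model Ensembling
6.1 Learning objectives
6.2 Ensemble learning methods
6.3 Ensemble learning in deep learning
6.3.1 Dropout
6.3.2 TTA
6.3.3 Snapshot
6.4 Post-processing of results
6.5 Chapter summary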
1 Datawhale Zero-Based Introduction to CV - Baseline Walkthrough

Competition: Zero-Based Introduction to CV - Street View Character Code Recognition

https://tianchi.aliyun.com/competition/entrance/531795/information

1.1 Baseline approach
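The baseline is built around a CNN and proceeds in four steps:

1. Read the data with Pytorch Dataset and DataLoader.

2. Build the CNN model in Pytorch.

3. Train and validate the model.

4. Predict on the test set.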

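1.2 Runtime environment and installation example
Environment: Python2/3, Pytorch1.x, 4G, GPU. The installation example below uses python3.7 + torch1.3.1 (gpu).

1. Create a virtual environment with Anaconda:

$ conda create -n py37_torch131 python=3.7

2. Activate the environment and install pytorch1.3.1:

$ source activate py37_torch131
$ conda install pytorch=1.3.1 torchvision cudatoolkit=10.0

3. Install the other dependencies:

$ pip install jupyter tqdm opencv-python matplotlib pandas

4. Start the notebook and run the baseline:

$ jupyter-notebook

5. The code below reads the competition data from ../input/.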

import os, sys, glob, shutil, json
os.environ["CUDA_VISIBLE_DEVICES"] = '0'
import cv2

from PIL import Image
import numpy as np

from tqdm import tqdm, tqdm_notebook

import torch
torch.manual_seed(0)
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.benchmark = True

import torchvision.models as models
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torch.utils.data.dataset import Dataset

1.3 Baseline in detail
Step 1: Define the Dataset that reads the images

class SVHNDataset(Dataset):
    def __init__(self, img_path, img_label, transform=None):
        self.img_path = img_path
        self.img_label = img_label
        self.transform = transform

    def __getitem__(self, index):
        img = Image.open(self.img_path[index]).convert('RGB')

        if self.transform is not None:
            img = self.transform(img)

        # Pad every label to a fixed length of 5 characters; class 10 marks "no character"
        # (plain int replaces np.int, which newer NumPy releases no longer provide)
        lbl = np.array(self.img_label[index], dtype=int)
        lbl = list(lbl) + (5 - len(lbl)) * [10]
        return img, torch.from_numpy(np.array(lbl[:5]))

    def __len__(self):
        return len(self.img_path)

Step 2: Define the Datasets for the training data and the validation data

train_path = glob.glob('../input/train/*.png')
train_path.sort()
train_json = json.load(open('../input/train.json'))
train_label = [train_json[x]['label'] for x in train_json]
print(len(train_path), len(train_label))

train_loader = torch.utils.data.DataLoader(
    SVHNDataset(train_path, train_label,
                transforms.Compose([
                    transforms.Resize((64, 128)),
                    transforms.RandomCrop((60, 120)),
                    transforms.ColorJitter(0.3, 0.3, 0.2),
                    transforms.RandomRotation(5),
                    transforms.ToTensor(),
                    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                ])),
    batch_size=40,
    shuffle=True,
    num_workers=10,
)

val_path = glob.glob('../input/val/*.png')
val_path.sort()
val_json = json.load(open('../input/val.json'))
val_label = [val_json[x]['label'] for x in val_json]
print(len(val_path), len(val_label))

val_loader = torch.utils.data.DataLoader(
    SVHNDataset(val_path, val_label,
                transforms.Compose([
                    transforms.Resize((60, 120)),
                    # transforms.ColorJitter(0.3, 0.3, 0.2),
                    # transforms.RandomRotation(5),
                    transforms.ToTensor(),
                    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                ])),
    batch_size=40,
    shuffle=False,
    num_workers=10,
)

Step 3: Define the character classification model, using resnet18 as the feature extraction module

class SVHN_Model1(nn.Module):
    def __init__(self):
        super(SVHN_Model1, self).__init__()

        model_conv = models.resnet18(pretrained=True)
        model_conv.avgpool = nn.AdaptiveAvgPool2d(1)
        # Drop the final fully connected layer and keep the 512-dimensional features
        model_conv = nn.Sequential(*list(model_conv.children())[:-1])
        self.cnn = model_conv

        # One 11-way classifier per character position
        self.fc1 = nn.Linear(512, 11)
        self.fc2 = nn.Linear(512, 11)
        self.fc3 = nn.Linear(512, 11)
        self.fc4 = nn.Linear(512, 11)
        self.fc5 = nn.Linear(512, 11)

    def forward(self, img):
        feat = self.cnn(img)
        # print(feat.shape)
        feat = feat.view(feat.shape[0], -1)
        c1 = self.fc1(feat)
        c2 = self.fc2(feat)
        c3 = self.fc3(feat)
        c4 = self.fc4(feat)
        c5 = self.fc5(feat)
        return c1, c2, c3, c4, c5

Step 4: Define the training, validation and prediction modules

def train(train_loader, model, criterion, optimizer):
    # Switch the model to training mode
    model.train()
    train_loss = []

    for i, (input, target) in enumerate(train_loader):
        if use_cuda:
            input = input.cuda()
            target = target.cuda()

        c0, c1, c2, c3, c4 = model(input)
        # Sum the cross-entropy losses of the 5 character positions
        loss = criterion(c0, target[:, 0]) + \
               criterion(c1, target[:, 1]) + \
               criterion(c2, target[:, 2]) + \
               criterion(c3, target[:, 3]) + \
               criterion(c4, target[:, 4])

        # loss /= 6
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if i % 100 == 0:
            print(loss.item())

        train_loss.append(loss.item())
    return np.mean(train_loss)

def validate(val_loader, model, criterion):
    # Switch the model to evaluation mode
    model.eval()
    val_loss = []

    # Do not record gradient information
    with torch.no_grad():
        for i, (input, target) in enumerate(val_loader):
            if use_cuda:
                input = input.cuda()
                target = target.cuda()

            c0, c1, c2, c3, c4 = model(input)
            loss = criterion(c0, target[:, 0]) + \
                   criterion(c1, target[:, 1]) + \
                   criterion(c2, target[:, 2]) + \
                   criterion(c3, target[:, 3]) + \
                   criterion(c4, target[:, 4])
            # loss /= 6
            val_loss.append(loss.item())
    return np.mean(val_loss)

def predict(test_loader, model, tta=10):
    model.eval()
    test_pred_tta = None

    # Number of TTA rounds
    for _ in range(tta):
        test_pred = []

        with torch.no_grad():
            for i, (input, target) in enumerate(test_loader):
                if use_cuda:
                    input = input.cuda()

                c0, c1, c2, c3, c4 = model(input)
                # .cpu() is required before .numpy() when the model runs on the GPU
                output = np.concatenate([
                    c0.data.cpu().numpy(),
                    c1.data.cpu().numpy(),
                    c2.data.cpu().numpy(),
                    c3.data.cpu().numpy(),
                    c4.data.cpu().numpy()], axis=1)
                test_pred.append(output)

        test_pred = np.vstack(test_pred)
        if test_pred_tta is None:
            test_pred_tta = test_pred
        else:
            test_pred_tta += test_pred

    return test_pred_tta

Step 5: Train and validate the model iteratively

model = SVHN_Model1()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), 0.001)
best_loss = 1000.0

use_cuda = False
if use_cuda:
    model = model.cuda()

for epoch in range(2):
    # train() takes four arguments; it does not accept the epoch index
    train_loss = train(train_loader, model, criterion, optimizer)
    val_loss = validate(val_loader, model, criterion)

    val_label = [''.join(map(str, x)) for x in val_loader.dataset.img_label]
    val_predict_label = predict(val_loader, model, 1)
    # Each block of 11 columns holds the scores of one character position
    val_predict_label = np.vstack([
        val_predict_label[:, :11].argmax(1),
        val_predict_label[:, 11:22].argmax(1),
        val_predict_label[:, 22:33].argmax(1),
        val_predict_label[:, 33:44].argmax(1),
        val_predict_label[:, 44:55].argmax(1),
    ]).T
    val_label_pred = []
    for x in val_predict_label:
        # Drop the padding class 10 and join the remaining digits
        val_label_pred.append(''.join(map(str, x[x!=10])))

    val_char_acc = np.mean(np.array(val_label_pred) == np.array(val_label))

    print('Epoch: {0}, Train loss: {1} \t Val loss: {2}'.format(epoch, train_loss, val_loss))
    print(val_char_acc)
    # Save the weights with the lowest validation loss
    if val_loss < best_loss:
        best_loss = val_loss
        torch.save(model.state_dict(), './model.pt')

Output after training for 2 Epochs:

Epoch: 0, Train loss: 3.1    Val loss: 3.4
0.3439
Epoch: 1, Train loss: 2.1    Val loss: 2.9
0.4346

Step 6: Predict on the test set samples and generate the submission file

test_path = glob.glob('../input/test_a/*.png')
test_path.sort()
# The test set has no labels; use a placeholder label for every image
test_label = [[1]] * len(test_path)
print(len(test_path), len(test_label))

test_loader = torch.utils.data.DataLoader(
    SVHNDataset(test_path, test_label,
                transforms.Compose([
                    transforms.Resize((64, 128)),
                    transforms.RandomCrop((60, 120)),
                    # transforms.ColorJitter(0.3, 0.3, 0.2),
                    # transforms.RandomRotation(5),
                    transforms.ToTensor(),
                    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                ])),
    batch_size=40,
    shuffle=False,
    num_workers=10,
)

test_predict_label = predict(test_loader, model, 1)

test_label = [''.join(map(str, x)) for x in test_loader.dataset.img_label]
test_predict_label = np.vstack([
    test_predict_label[:, :11].argmax(1),
    test_predict_label[:, 11:22].argmax(1),
    test_predict_label[:, 22:33].argmax(1),
    test_predict_label[:, 33:44].argmax(1),
    test_predict_label[:, 44:55].argmax(1),
]).T

test_label_pred = []
for x in test_predict_label:
    test_label_pred.append(''.join(map(str, x[x!=10])))

import pandas as pd
df_submit = pd.read_csv('../input/test_A_sample_submit.csv')
df_submit['file_code'] = test_label_pred
df_submit.to_csv('renset18.csv', index=None)

After 2 Epochs of training, the score is 0.33.

About Datawhale: Datawhale AI

Datawhale Computer Vision
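2 Datawhale Zero-Based Introduction to CV - Task1 Understanding the Competition

Competition: Zero-Based Introduction to CV - Street View Character Code Recognition

https://tianchi.aliyun.com/competition/entrance/531795/information

Competition name    CV

Competition goal

Competition task

SVHN paper

2.1 Learning objectives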
1.

2.

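2.2 Competition data
SVHN
: SVHN
Top

Training set: 30,000 images. Validation set: 10,000 images.

Test set A: 40,000 images. Test set B: 40,000 images.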

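2.3 Data labels

Field     Description
top       Y coordinate of the top-left corner of the character box
height    Height of the character box
left      X coordinate of the top-left corner of the character box
width     Width of the character box
label     Character code

The annotations are stored in JSON format.

Original image    JSON annotation of the image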

2.4 Evaluation metric

Score = (number of images whose character code is recognized correctly) / (number of test images)
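The metric can be illustrated with a few lines of Python. This is a minimal sketch, assuming that an image counts as correct only when its whole predicted code equals the true code, which is how the baseline compares predicted and true strings on the validation set. The function name `sequence_accuracy` and the sample codes are made up for illustration.

```python
def sequence_accuracy(predictions, targets):
    """Fraction of images whose full predicted code equals the true code."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical predictions for four images; the second one has a wrong last digit.
preds = ["42", "241", "7358", "19"]
truth = ["42", "247", "7358", "19"]
print(sequence_accuracy(preds, truth))  # 3 of the 4 codes match exactly
```

A single wrong digit makes the whole image count as wrong, so this score is stricter than per-character accuracy.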

2.5 Reading the data
The annotations are stored in JSON and can be read and displayed as follows:

import json
import cv2
import numpy as np
import matplotlib.pyplot as plt

train_json = json.load(open('../input/train.json'))

# Convert one annotation into a 5 x N integer array (N = number of characters)
def parse_json(d):
    arr = np.array([
        d['top'], d['height'], d['left'], d['width'], d['label']
    ])
    arr = arr.astype(int)
    return arr

img = cv2.imread('../input/train/000000.png')
# OpenCV loads images as BGR; convert to RGB so that matplotlib shows the true colors
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
arr = parse_json(train_json['000000.png'])

plt.figure(figsize=(10, 10))
plt.subplot(1, arr.shape[1]+1, 1)
plt.imshow(img)
plt.xticks([]); plt.yticks([])

for idx in range(arr.shape[1]):
    plt.subplot(1, arr.shape[1]+1, idx+2)
    # Rows run from top to top+height, columns from left to left+width
    plt.imshow(img[arr[0, idx]:arr[0, idx]+arr[1, idx], arr[2, idx]:arr[2, idx]+arr[3, idx]])
    plt.title(arr[4, idx])
    plt.xticks([]); plt.yticks([])

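2.6 Solution approaches

The number of characters differs from image to image, for example 2, 3 or 4:

Character attributes            Image
characters: 42, count: 2
characters: 241, count: 3
characters: 7358, count: 4

Beginner approach: fixed-length character recognition

Most images contain 2-4 characters, so every label can be padded to a fixed length of 6 characters.

With a length of 6, the code 23 is padded to 23XXXX and the code 231 is padded to 231XXX.

The task then becomes 6 character classifications, each over 11 classes.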
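The fixed-length idea can be sketched in a few lines. This is an illustrative sketch, not the tutorial's own code: it assumes the filler X is encoded as class 10 (the value the baseline Dataset uses for padding), and the helper names `pad_label` and `decode_label` are hypothetical.

```python
PAD_CLASS = 10  # assumed index of the filler class "X"; digits 0-9 keep classes 0-9
MAX_LEN = 6     # fixed label length


def pad_label(digits, max_len=MAX_LEN, pad=PAD_CLASS):
    """Truncate or pad a list of digit classes to exactly max_len entries."""
    digits = list(digits)[:max_len]
    return digits + [pad] * (max_len - len(digits))


def decode_label(classes, pad=PAD_CLASS):
    """Turn predicted classes back into a string, dropping the filler class."""
    return "".join(str(c) for c in classes if c != pad)


print(pad_label([2, 3]))                     # 23  -> 23XXXX
print(pad_label([2, 3, 1]))                  # 231 -> 231XXX
print(decode_label([2, 3, 1, 10, 10, 10]))   # back to the string "231"
```

Padding turns a variable-length problem into several independent 11-way classifications, which is what allows one classification head per character position.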

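Specialized character recognition approach: variable-length character recognition

CRNN
Specialized classification approach: detect first, then recognize

SSD

YOLO

2.7 Chapter summary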
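3 Datawhale Zero-Based Introduction to CV - Task2 Data Reading and Data Augmentation

Competition: Zero-Based Introduction to CV - Street View Character Code Recognition

https://tianchi.aliyun.com/competition/entrance/531795/information

Pytorch

3.1 Learning objectives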
1. Python Pytorch

2. Pytorch

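3.2 Image reading
Two libraries are commonly used to read images in Python: Pillow and OpenCV.

3.2.1 Pillow

Pillow is a fork of the Python Imaging Library (PIL) and can be used inside an ipython notebook.

Effect: read an image
Code:
from PIL import Image
# Read an image with Pillow
im = Image.open('cat.jpg')

Effect: apply a blur filter
Code:
from PIL import Image, ImageFilter
im = Image.open('cat.jpg')
# Apply the blur filter and save the result
im2 = im.filter(ImageFilter.BLUR)
im2.save('blur.jpg', 'jpeg')

Effect: shrink the image to half its size
Code:
from PIL import Image
# Open a jpg image file
im = Image.open('cat.jpg')
w, h = im.size
im.thumbnail((w//2, h//2))
im.save('thumbnail.jpg', 'jpeg')

Pillow documentation: https://pillow.readthedocs.io/en/stable/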

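3.2.2 OpenCV

OpenCV is a computer vision library that was originally open-sourced by Intel.

OpenCV offers more functionality than Pillow.

Effect: read an image
Code:
import cv2
# Read an image with OpenCV
img = cv2.imread('cat.jpg')
# OpenCV uses BGR channel order by default, so convert to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

Effect: convert to grayscale
Code:
import cv2
img = cv2.imread('cat.jpg')
# Convert to a grayscale image
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Effect: Canny edge detection
Code:
import cv2
img = cv2.imread('cat.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Canny edge detection
edges = cv2.Canny(img, 30, 70)
cv2.imwrite('canny.jpg', edges)

OpenCV covers a wide range of image-processing operations. For more, see:

OpenCV website: https://opencv.org/

OpenCV Github: https://github.com/opencv/opencv

OpenCV extra modules: https://github.com/opencv/opencv_contrib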
3.3 Data augmentation methods
Pillow OpenCV

Data Augmentation

3.3.1 Introduction to data augmentation

Why is data augmentation useful?

A B

What data augmentation methods are there?

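3.3.2 Common data augmentation methods
The following transforms are available in torchvision:

1. transforms.CenterCrop: crop the image at its center
2. transforms.ColorJitter: randomly change brightness, contrast and saturation
3. transforms.FiveCrop: crop the four corners and the center
4. transforms.Grayscale: convert the image to grayscale
5. transforms.Pad: pad the image borders
6. transforms.RandomAffine: apply a random affine transformation
7. transforms.RandomCrop: crop the image at a random location
8. transforms.RandomHorizontalFlip: flip the image horizontally at random
9. transforms.RandomRotation: rotate the image by a random angle
10. transforms.RandomVerticalFlip: flip the image vertically at random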
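To make the behaviour of two of these transforms concrete, here is a small pure-Python illustration on a toy "image" stored as a nested list. It is a sketch of the idea behind random cropping and random horizontal flipping, not torchvision's implementation; the function names and the 4 x 6 toy image are assumptions made for this example.

```python
import random


def random_crop(img, out_h, out_w, rng=random):
    """Cut an out_h x out_w window at a random position of a 2D list image."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - out_h)    # randint includes both end points
    left = rng.randint(0, w - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]


def random_horizontal_flip(img, p=0.5, rng=random):
    """Mirror every row with probability p, otherwise return the image unchanged."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return img


rng = random.Random(0)                                      # fixed seed for repeatability
img = [[r * 10 + c for c in range(6)] for r in range(4)]    # 4 x 6 toy image
out = random_horizontal_flip(random_crop(img, 3, 4, rng), p=0.5, rng=rng)
print(len(out), len(out[0]))                                # the crop is 3 rows by 4 columns
```

Not every transform suits every task: flipping changes what a digit looks like, and the baseline in this tutorial only uses resizing, random cropping, color jitter and a small random rotation.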

3.3.3 Common data augmentation libraries

torchvision

https://github.com/pytorch/vision

The official data augmentation library for pytorch; it integrates directly with torch.

imgaug
https://github.com/aleju/imgaug

imgaug

albumentations

https://albumentations.readthedocs.io

3.4 Reading data with Pytorch
In Pytorch, data is read through Dataset and DataLoader.

import os, sys, glob, shutil, json
import cv2

from PIL import Image
import numpy as np

import torch
from torch.utils.data.dataset import Dataset
import torchvision.transforms as transforms

class SVHNDataset(Dataset):
    def __init__(self, img_path, img_label, transform=None):
        self.img_path = img_path
        self.img_label = img_label
        self.transform = transform

    def __getitem__(self, index):
        img = Image.open(self.img_path[index]).convert('RGB')

        if self.transform is not None:
            img = self.transform(img)

        # In the original SVHN, class 10 is the digit 0
        # (plain int replaces np.int, which newer NumPy releases no longer provide)
        lbl = np.array(self.img_label[index], dtype=int)
        lbl = list(lbl) + (5 - len(lbl)) * [10]

        return img, torch.from_numpy(np.array(lbl[:5]))

    def __len__(self):
        return len(self.img_path)

train_path = glob.glob('../input/train/*.png')
train_path.sort()
train_json = json.load(open('../input/train.json'))
train_label = [train_json[x]['label'] for x in train_json]

data = SVHNDataset(train_path, train_label,
                   transforms.Compose([
                       # Resize to a fixed size
                       transforms.Resize((64, 128)),

                       # Random color transformation
                       transforms.ColorJitter(0.2, 0.2, 0.2),

                       # Add a random rotation
                       transforms.RandomRotation(5),

                       # Convert the image to a pytorch tensor
                       # transforms.ToTensor(),

                       # Normalize the image pixels
                       # transforms.Normalize([0.485,0.456,0.406],[0.229,0.224,0.225])
                   ]))

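Dataset and DataLoader play different roles:

1. Dataset: wraps the dataset and gives indexed access to individual samples.

2. DataLoader: wraps a Dataset and iterates over it in batches.

With DataLoader added, the data reading code becomes: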
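The division of labour between the two classes can be mimicked in plain Python. The sketch below is not Pytorch's implementation; `ToyDataset` and `iterate_batches` are hypothetical names that only mirror the roles of Dataset (indexed access) and DataLoader (batched, optionally shuffled iteration).

```python
import random


class ToyDataset:
    """Indexable collection: item i is the pair (sample, label)."""

    def __init__(self, samples, labels):
        self.samples = samples
        self.labels = labels

    def __getitem__(self, index):
        return self.samples[index], self.labels[index]

    def __len__(self):
        return len(self.samples)


def iterate_batches(dataset, batch_size, shuffle=False, seed=0):
    """Yield (samples, labels) batches, reading the dataset only through indexing."""
    order = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        items = [dataset[i] for i in order[start:start + batch_size]]
        xs, ys = zip(*items)
        yield list(xs), list(ys)


toy = ToyDataset(list(range(5)), list("abcde"))
batches = list(iterate_batches(toy, batch_size=2))
print(batches)  # three batches; the last one holds the single leftover sample
```

Because the batching loop only needs `__getitem__` and `__len__`, any class that provides them can be batched the same way; this is why a custom Dataset only has to define those two methods.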

# The imports, the SVHNDataset class, train_path and train_label
# are identical to the previous listing.

train_loader = torch.utils.data.DataLoader(
    SVHNDataset(train_path, train_label,
                transforms.Compose([
                    transforms.Resize((64, 128)),
                    transforms.ColorJitter(0.3, 0.3, 0.2),
                    transforms.RandomRotation(5),
                    transforms.ToTensor(),
                    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                ])),
    batch_size=10,   # number of samples per batch
    shuffle=False,   # whether to shuffle the order
    num_workers=10,  # number of worker processes used for loading
)

# Fetch a single batch
for data in train_loader:
    break

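With the DataLoader wrapping the Dataset, data has the following format:

torch.Size([10, 3, 64, 128]), torch.Size([10, 5])

The first tensor is laid out as batchsize * channel * height * width; the second holds the padded character labels (5 per image with the Dataset above).

3.5 Chapter summary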
Pytorch
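4 Datawhale Zero-Based Introduction to CV - Task3 Character Recognition Model

Competition: Zero-Based Introduction to CV - Street View Character Code Recognition

https://tianchi.aliyun.com/competition/entrance/531795/information

Convolutional Neural Network, CNN

4.1 Learning objectives
1. CNN

2. Pytorch CNN

4.2 Introduction to CNNs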
CNN CNN

CNN

CNN

A CNN is built from convolution layers, pooling layers, non-linear activation functions and fully connected layers.

LeNet is a classic example.

Its convolution kernels are 5×5 with stride=1.
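Kernel size and stride determine how large each feature map is. The sketch below applies the standard output-size formula, floor((size + 2*padding - kernel) / stride) + 1. The 32 x 32 input size used for the LeNet-style check is an assumption (it is not stated here), and the second check traces the layer settings of the simple model defined in section 4.4 for a 64 x 128 input.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one dimension for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# A 5x5 kernel with stride 1 on an assumed 32x32 input
print(conv_out(32, kernel=5, stride=1))


def trace_simple_cnn(h, w):
    """Follow a 3x3/stride-2 conv, a 2x2 max pool, and the same pair once more."""
    for kernel, stride in [(3, 2), (2, 2), (3, 2), (2, 2)]:
        h, w = conv_out(h, kernel, stride), conv_out(w, kernel, stride)
    return h, w


# For a 64 x 128 input this gives the spatial size behind the 32*3*7 features of section 4.4
print(trace_simple_cnn(64, 128))
```

Running the trace before writing the fully connected layers is a quick way to get the input dimension of `nn.Linear` right.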

CNN

CNN

CNN End to End CNN

4.3 Development of CNNs

AlexNet VGG InceptionV3 ResNet

LeNet-5(1998)
AlexNet(2012)

VGG-16(2014)

Inception-v1 (2014)
ResNet-50 (2015)

4.4 Building a CNN model with Pytorch
Building a CNN in Pytorch only requires defining the model parameters and the forward pass.

The model below is a simple CNN whose features feed 6 fully connected layers in parallel, one per character position.

import torch
torch.manual_seed(0)
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.benchmark = True

import torchvision.models as models
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torch.utils.data.dataset import Dataset

# Define the model
class SVHN_Model1(nn.Module):
    def __init__(self):
        super(SVHN_Model1, self).__init__()
        # CNN feature extraction module
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2)),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2)),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # One 11-way classifier per character position
        # (32*3*7 is the flattened feature size for a 64 x 128 input)
        self.fc1 = nn.Linear(32*3*7, 11)
        self.fc2 = nn.Linear(32*3*7, 11)
        self.fc3 = nn.Linear(32*3*7, 11)
        self.fc4 = nn.Linear(32*3*7, 11)
        self.fc5 = nn.Linear(32*3*7, 11)
        self.fc6 = nn.Linear(32*3*7, 11)

    def forward(self, img):
        feat = self.cnn(img)
        feat = feat.view(feat.shape[0], -1)
        c1 = self.fc1(feat)
        c2 = self.fc2(feat)
        c3 = self.fc3(feat)
        c4 = self.fc4(feat)
        c5 = self.fc5(feat)
        c6 = self.fc6(feat)
        return c1, c2, c3, c4, c5, c6

model = SVHN_Model1()

# Loss function
criterion = nn.CrossEntropyLoss()

# Optimizer
optimizer = torch.optim.Adam(model.parameters(), 0.005)

loss_plot, c0_plot = [], []

# Iterate for 10 Epochs
# NOTE: the 6 outputs need labels padded to length 6; the SVHNDataset shown
# in Task2 pads to 5, so its padding length has to be raised to 6 for this loop.
for epoch in range(10):
    for data in train_loader:
        c0, c1, c2, c3, c4, c5 = model(data[0])
        loss = criterion(c0, data[1][:, 0]) + \
               criterion(c1, data[1][:, 1]) + \
               criterion(c2, data[1][:, 2]) + \
               criterion(c3, data[1][:, 3]) + \
               criterion(c4, data[1][:, 4]) + \
               criterion(c5, data[1][:, 5])
        loss /= 6
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Track the loss and the accuracy of the first character
        loss_plot.append(loss.item())
        c0_plot.append((c0.argmax(1) == data[1][:, 0]).sum().item()*1.0 / c0.shape[0])

    print(epoch)

A model pretrained on ImageNet can also be used:

class SVHN_Model2(nn.Module):
    def __init__(self):
        super(SVHN_Model2, self).__init__()

        model_conv = models.resnet18(pretrained=True)
        model_conv.avgpool = nn.AdaptiveAvgPool2d(1)
        model_conv = nn.Sequential(*list(model_conv.children())[:-1])
        self.cnn = model_conv

        self.fc1 = nn.Linear(512, 11)
        self.fc2 = nn.Linear(512, 11)
        self.fc3 = nn.Linear(512, 11)
        self.fc4 = nn.Linear(512, 11)
        self.fc5 = nn.Linear(512, 11)

    def forward(self, img):
        feat = self.cnn(img)
        # print(feat.shape)
        feat = feat.view(feat.shape[0], -1)
        c1 = self.fc1(feat)
        c2 = self.fc2(feat)
        c3 = self.fc3(feat)
        c4 = self.fc4(feat)
        c5 = self.fc5(feat)
        return c1, c2, c3, c4, c5

4.5 Chapter summary
CNN CNN Pytorch CNN
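5 Datawhale Zero-Based Introduction to CV - Task4 Model Training and Validation

Competition: Zero-Based Introduction to CV - Street View Character Code Recognition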

https://tianchi.aliyun.com/competition/entrance/531795/information

CNN

1.

2.

3.

Pytorch

5.1 Learning objectives
1.
2. Pytorch

5.2 Building a validation set

Overfitting Underfitting
CNN

Model Complexity

“ ”

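1. Training set (Train Set)

2. Validation set (Validation Set)

3. Test set (Test Set)

A validation set can be built in the following ways:

1. Hold-Out: split the training set once into a new training set and a validation set.

2. Cross Validation (CV): split the training set into K parts, train on K-1 parts and validate on the remaining 1 part, and repeat K times so that each of the K parts serves as the validation set once.

3. BootStrap: draw samples with replacement to form the new training set and validation set.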
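The K-fold scheme can be sketched with the standard library alone. This is a minimal illustration, assuming samples are identified by their integer index; the helper name `k_fold_indices` is hypothetical and not part of the tutorial's code.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k disjoint parts of near-equal size
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))   # 8 training and 2 validation samples per fold
```

Hold-Out corresponds to using only one of these splits. Running all K splits costs K training runs but yields K models, which becomes useful again for model ensembling.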

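In this competition a validation set has already been split off, so participants can train directly on the training set and measure accuracy on the validation set (you can of course also merge the training and validation sets and create your own validation split).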

-
-

“ ” - -

5.3 Model training and validation
Pytorch CNN CNN

1.

2.

# train_dataset and val_dataset are SVHNDataset instances built as in Task2
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=10,
    shuffle=True,
    num_workers=10,
)

val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=10,
    shuffle=False,
    num_workers=10,
)

model = SVHN_Model1()
# reduction='sum' replaces the deprecated size_average=False
criterion = nn.CrossEntropyLoss(reduction='sum')
optimizer = torch.optim.Adam(model.parameters(), 0.001)
best_loss = 1000.0
for epoch in range(20):
    print('Epoch: ', epoch)

    train(train_loader, model, criterion, optimizer, epoch)
    val_loss = validate(val_loader, model, criterion)

    # Save the weights with the lowest validation loss
    if val_loss < best_loss:
        best_loss = val_loss
        torch.save(model.state_dict(), './model.pt')
The training code for each Epoch is:

def train(train_loader, model, criterion, optimizer, epoch):
    # Switch the model to training mode
    model.train()

    for i, (input, target) in enumerate(train_loader):
        # target must hold 6 labels per image to match the 6 outputs
        c0, c1, c2, c3, c4, c5 = model(input)
        loss = criterion(c0, target[:, 0]) + \
               criterion(c1, target[:, 1]) + \
               criterion(c2, target[:, 2]) + \
               criterion(c3, target[:, 3]) + \
               criterion(c4, target[:, 4]) + \
               criterion(c5, target[:, 5])
        loss /= 6
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The validation code for each Epoch is:

def validate(val_loader, model, criterion):
    # Switch the model to evaluation mode
    model.eval()
    val_loss = []

    # Do not record gradient information
    with torch.no_grad():
        for i, (input, target) in enumerate(val_loader):
            c0, c1, c2, c3, c4, c5 = model(input)
            loss = criterion(c0, target[:, 0]) + \
                   criterion(c1, target[:, 1]) + \
                   criterion(c2, target[:, 2]) + \
                   criterion(c3, target[:, 3]) + \
                   criterion(c4, target[:, 4]) + \
                   criterion(c5, target[:, 5])
            loss /= 6
            val_loss.append(loss.item())
    return np.mean(val_loss)

5.4 Saving and loading models
In Pytorch the model parameters are saved with:

torch.save(model_object.state_dict(), 'model.pt')

and loaded back with:

model.load_state_dict(torch.load('model.pt'))

5.5 Hyperparameter tuning workflow

GPU

1. http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html

2. http://karpathy.github.io/2019/04/25/recipe/

1. CNN

2. CNN

3.

5.6 Chapter summary
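6 Datawhale Zero-Based Introduction to CV - Task5 Model Ensembling

Competition: Zero-Based Introduction to CV - Street View Character Code Recognition

https://tianchi.aliyun.com/competition/entrance/531795/information

6.1 Learning objectives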
1.

2.

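6.2 Ensemble learning methods
Common ensemble learning methods include Stacking, Bagging and Boosting.

With 10-fold cross validation, 10 CNN models are trained.
These 10 CNN models can be combined in two ways:

1. Average the predicted probabilities, then decode them into characters.

2. Vote on the predicted characters to obtain the final characters.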
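Two common ways to combine the outputs of several CNNs are averaging their class scores and majority voting on their predicted classes. The sketch below shows both for a single character position; the three "models" and their scores are made-up numbers, and the helper names are hypothetical.

```python
from collections import Counter


def average_scores(per_model_scores):
    """Average the per-class scores of several models for one character position."""
    n_models = len(per_model_scores)
    n_classes = len(per_model_scores[0])
    return [sum(m[c] for m in per_model_scores) / n_models for c in range(n_classes)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def majority_vote(labels):
    """Most frequent label among the models' individual predictions."""
    return Counter(labels).most_common(1)[0][0]


# Made-up scores of three models over three classes
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
print(argmax(average_scores(scores)))            # class chosen by score averaging
print(majority_vote([argmax(s) for s in scores]))  # class chosen by voting
```

Averaging keeps the confidence information of every model, while voting only looks at each model's top choice, so the two rules can disagree when one model is very confident.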

6.3 Ensemble learning in deep learning

6.3.1 Dropout

Dropout

Dropout CNN
Dropout

# Define the model
class SVHN_Model1(nn.Module):
    def __init__(self):
        super(SVHN_Model1, self).__init__()
        # CNN feature extraction module, with Dropout after each activation
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2)),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2)),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.MaxPool2d(2),
        )
        # One 11-way classifier per character position
        self.fc1 = nn.Linear(32*3*7, 11)
        self.fc2 = nn.Linear(32*3*7, 11)
        self.fc3 = nn.Linear(32*3*7, 11)
        self.fc4 = nn.Linear(32*3*7, 11)
        self.fc5 = nn.Linear(32*3*7, 11)
        self.fc6 = nn.Linear(32*3*7, 11)

    def forward(self, img):
        feat = self.cnn(img)
        feat = feat.view(feat.shape[0], -1)
        c1 = self.fc1(feat)
        c2 = self.fc2(feat)
        c3 = self.fc3(feat)
        c4 = self.fc4(feat)
        c5 = self.fc5(feat)
        c6 = self.fc6(feat)
        return c1, c2, c3, c4, c5, c6

6.3.2 TTA

Test Time Augmentation (TTA) applies data augmentation at prediction time: each sample is predicted several times and the results are accumulated.

def predict(test_loader, model, tta=10):
    model.eval()
    test_pred_tta = None
    # Number of TTA rounds
    for _ in range(tta):
        test_pred = []

        with torch.no_grad():
            for i, (input, target) in enumerate(test_loader):
                c0, c1, c2, c3, c4, c5 = model(input)
                # .cpu() keeps this working when the model runs on the GPU
                output = np.concatenate([c0.data.cpu().numpy(), c1.data.cpu().numpy(),
                                         c2.data.cpu().numpy(), c3.data.cpu().numpy(),
                                         c4.data.cpu().numpy(), c5.data.cpu().numpy()], axis=1)
                test_pred.append(output)

        test_pred = np.vstack(test_pred)
        # Accumulate the scores of every round
        if test_pred_tta is None:
            test_pred_tta = test_pred
        else:
            test_pred_tta += test_pred

    return test_pred_tta

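6.3.3 Snapshot

With 10 trained CNN models, the predictions of the models can be averaged.
Ensembling is also possible when only one CNN is trained.

Snapshot Ensembles train the network with a cyclical learning rate.

A checkpoint is saved in each cycle, and the saved checkpoints are then ensembled.

The cyclical learning rate lets the CNN settle into several different local optima during a single training run.

Each Snapshot is one of the saved models.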
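A cyclical learning rate can be written down in a few lines. The sketch below assumes cosine annealing with restarts as the cyclic schedule and takes a snapshot on the last step of each cycle; the function names, the step counts and the base learning rate are illustrative assumptions, not values from the tutorial.

```python
import math


def cyclic_cosine_lr(step, total_steps, n_cycles, base_lr):
    """Learning rate at a 0-based step: restarts at base_lr, then decays along a cosine."""
    cycle_len = math.ceil(total_steps / n_cycles)
    pos = step % cycle_len                      # position inside the current cycle
    return base_lr / 2 * (math.cos(math.pi * pos / cycle_len) + 1)


def snapshot_steps(total_steps, n_cycles):
    """Last step of every cycle, where the learning rate is lowest and a checkpoint is saved."""
    cycle_len = math.ceil(total_steps / n_cycles)
    return [min(c * cycle_len, total_steps) - 1 for c in range(1, n_cycles + 1)]


# Assumed setup: 100 training steps split into 5 cycles, base learning rate 0.1
for step in (0, 10, 19, 20):
    print(step, round(cyclic_cosine_lr(step, 100, 5, 0.1), 5))
print(snapshot_steps(100, 5))
```

In a training loop the returned value would be written into the optimizer's learning rate before each step, and `model.state_dict()` would be saved at the snapshot steps; the saved models can then be averaged like independently trained ones.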
6.4 Post-processing of results

1.
2.

6.5 Chapter summary

1.

2. Dropout TTA
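Datawhale
Datawhale AI
Datawhale "for the learner"
AI "collective data intelligence, collective creation"

https://tianchi.aliyun.com/

L4
D UPS

10 The first unicorn company in autonomous freight trucking 2015

www.tusimple.com

China Machine Press Huazhang Company: 25 years dedicated to high-end IT book publishing! 1995

IT 25 30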
