Transactions of the Chinese Society of Agricultural Engineering (http://www.tcsae.org), No. 23, Dec. 2020
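Image recognition of peanut pod grades using a convolutional neural network based on transfer learning
Zhang Ruiqing 1, Li Zhangwei 1, Hao Jianjun 1, Sun Lei 1, Li Hao 1, Han Peng 2
(1. College of Mechanical and Electrical Engineering, Hebei Agricultural University, Baoding 071001; 2. Hebei Agricultural Technology Extension Station, Shijiazhuang 050000)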
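Abstract: To address the low efficiency of manual grading and the imprecision of mechanical grading of peanut pods, this study proposes a convolutional neural network method based on transfer learning for image recognition of peanut pod grades. Flipping, rotation, translation, contrast transformation, and brightness transformation were used to expand and preprocess the acquired images of five grades of peanut pods, yielding a peanut pod grade image dataset. The performance of peanut pod image grading was compared across three base models: GoogLeNet, ResNet18, and AlexNet. The recognition model was then improved by transferring the AlexNet convolutional layers. Local response normalization was replaced with batch normalization, and the activation function was placed at different positions before and after the batch normalization layer, giving four different recognition training models. Transfer learning comparison experiments and learning rate optimization experiments were carried out on the four improved AlexNet models, and the effects of a non-saturating activation function and an improved non-saturating activation function on model performance were studied. The results show that AlexNet required the least training time while meeting the test accuracy. In transfer learning with the improved AlexNet-based models, the learning rate is a hyperparameter that must be optimized; an appropriate learning rate speeds up training and improves recognition ability. Introducing batch normalization and reducing the number of network parameters shortened training time by 220 s and improved model performance. The resulting peanut pod grade recognition model (Peanut_AlexNet model, PA model) reached a classification accuracy of 95.43% across the five grades. The model recognizes peanut pod grades with high accuracy and can serve as a reference for the precise grading of other agricultural products.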
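Keywords: image recognition; model; convolutional neural network; transfer learning; batch normalization; peanut pod; grade classification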
doi:10.11975/j.issn.1002-6819.2020.23.020
CLC number: S565.2; TP391.4 Document code: A Article ID: 1002-6819(2020)-23-0171-10
Fig.1 Structure of the Peanut_AlexNet (PA) model for peanut pod grade recognition
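In summary, the learning rate hyperparameter has an important effect on the performance of the peanut pod grade recognition model. If a model has still not converged when the maximum number of training iterations is reached, the learning rate is set improperly and needs to be changed. If the learning rate is too large, training oscillates or even fails; if it is too small, training is slow; with an appropriate learning rate, the transfer learning model attains higher accuracy and lower loss. Training the network with a fixed learning rate is therefore not advisable, because finding a suitable value requires extensive trial and error or an optimization method, which is time-consuming.

Table 2 and Fig. 3 show that with a fixed learning rate of 0.001 the models performed well, but the validation curves did not reach stable convergence. The learning rate was therefore updated by piecewise constant decay: the initial learning rate was set to 0.001 and multiplied by a decay factor of 0.1 every 4 epochs. The four models PA-I, PA-II, PA-III, and PA-IV were each trained in this way, and the training and validation results are shown in Table 3. No overfitting was observed during training. Table 3 shows that, compared with the four models trained at a fixed learning rate of 0.001 in Table 2, accuracy decreased and loss increased. However, Fig. 4 and Fig. 3c, d show that the four models trained with the updated learning rate had all reached stable convergence by the maximum number of iterations and no longer oscillated like the earlier curves, indicating that every model was sufficiently trained and that the hyperparameter settings were appropriate. The validation accuracies of PA-I, PA-II, PA-III, and PA-IV were 93.81%, 95.10%, 97.24%, and 61.33%, and the validation losses were 0.2069, 0.1568, 0.0968, and 0.9811, respectively. PA-III attained the highest accuracy and the lowest loss, indicating the best performance.

The validation curves of PA-I, PA-II, PA-III, and PA-IV are shown in Fig. 4. To reach the same validation accuracy of 60%, PA-IV, in which all layers are learned, needed roughly 8.3 to 15 times as many iterations as the transfer learning models, indicating that the three transfer learning models PA-I, PA-II, and PA-III train faster and more easily. PA-I contains LRN layers, whereas PA-II and PA-III contain BN layers. After 100 training iterations, PA-II and PA-III reached accuracies about 7%~10% higher than PA-I and lower loss values, indicating that BN helps training and lets the model attain higher accuracy and lower loss.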
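
The piecewise constant decay described above (initial learning rate 0.001, multiplied by 0.1 every 4 epochs) can be sketched as a small function. This is a minimal illustration, not the authors' training code: the function name, the 0-based epoch index, and the 12-epoch horizon are assumptions, since the source does not state the total number of epochs or the exact epoch at which each drop takes effect.

```python
import math


def piecewise_constant_lr(epoch, initial_lr=0.001, drop_factor=0.1, drop_period=4):
    """Return the learning rate for a 0-based epoch index.

    The rate stays constant within each block of `drop_period` epochs and is
    multiplied by `drop_factor` at the start of every new block.
    """
    return initial_lr * drop_factor ** (epoch // drop_period)


# Illustrative horizon of 12 epochs (an assumption, not from the source).
schedule = [piecewise_constant_lr(epoch) for epoch in range(12)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")

# Sanity check: the rate drops by a factor of 10 between consecutive blocks.
assert math.isclose(schedule[4], schedule[3] * 0.1)
```

Under these assumptions, epochs 0 to 3 run at 0.001, epochs 4 to 7 at 0.0001, and epochs 8 to 11 at 0.00001, which is consistent with the smoother late-training validation curves reported for the updated learning rate.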
2.2.5 Effect of the number of parameters in the fully connected layers
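Comparing the training and validation results of the seven models PA-I, PA-II, PA-III, PA-IV, LReLU-PA-I, LReLU-PA-II, and LReLU-PA-III on the peanut pod image dataset of this study shows that, with learning rate updating, LReLU-PA-III and PA-III gave better validation results than the other models. Because LReLU-PA-III needs extra time to search for a suitable coefficient, PA-III trains more efficiently than LReLU-PA-III. However, PA-III still has a very large number of parameters and a long training time, so its parameters were reduced to shorten training and lower the memory occupied by the model.

PA-III was modified as follows: all fully connected layers were removed and two new fully connected layers were designed (FC1 and FC2, with 512 and 5 output units, respectively). The resulting model is Peanut AlexNet (Peanut_AlexNet, PA) (Fig. 1). Compared with PA-III, the number of parameters in the PA model fell by 87.60%. PA-III and PA were trained and validated on the peanut pod image training and validation sets, and the training results are shown in Table 5. Table 5 shows that, after the parameter reduction, the model accuracy …

Fig.6 Validation accuracy and loss curves of the PA-III and PA models

2.3 Model testing

The final trained PA model was used to grade the peanut pod test set, and precision, recall, and F1 score were computed from the confusion matrix to evaluate the model (Table 6). Table 6 shows that the average classification accuracy of the PA model across peanut pod grades was 95.43%, which is high. Ranked by F1 score from high to low, the classes are third-grade pods, second-grade pods, first-grade pods, fifth-grade damaged pods, and fourth-grade abnormal pods. For fourth-grade abnormal pods in the test set, the recognition accuracy was 87.14%, and many were misidentified as fifth-grade damaged pods. These misjudgments arose mainly because some pods have a narrow waist with dark regions, patches on the surface, or dents at the head or tail, and because of low image resolution.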
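
The 87.60% parameter reduction reported in Section 2.2.5 can be checked with a rough count. The sketch below rests on assumptions that the source does not state: that PA-III keeps the standard AlexNet layout (five convolutional layers with grouped convolutions in layers 2, 4, and 5, a 6 x 6 x 256 feature map entering the classifier, and fully connected layers of 4096, 4096, and 5 units), and that the few batch normalization parameters can be ignored. Only the FC1 = 512 and FC2 = 5 sizes come from the text; the helper names are hypothetical.

```python
def conv_params(out_channels, kernel_size, in_channels_per_group):
    """Weights plus biases of one convolutional layer."""
    weights = out_channels * kernel_size * kernel_size * in_channels_per_group
    return weights + out_channels


def fc_params(n_inputs, n_outputs):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs


# Assumed standard AlexNet convolution stack (not detailed in the source).
conv_total = (
    conv_params(96, 11, 3)       # conv1
    + conv_params(256, 5, 48)    # conv2, 2 groups
    + conv_params(384, 3, 256)   # conv3
    + conv_params(384, 3, 192)   # conv4, 2 groups
    + conv_params(256, 3, 192)   # conv5, 2 groups
)
flattened = 6 * 6 * 256  # assumed size of the feature vector fed to the classifier

# PA-III: assumed AlexNet classifier with a 5-class output layer.
pa3_total = conv_total + (
    fc_params(flattened, 4096) + fc_params(4096, 4096) + fc_params(4096, 5)
)
# PA: the two new layers described in the text (FC1 = 512, FC2 = 5).
pa_total = conv_total + fc_params(flattened, 512) + fc_params(512, 5)

reduction = 100 * (1 - pa_total / pa3_total)
print(f"PA-III: {pa3_total:,} parameters")
print(f"PA:     {pa_total:,} parameters")
print(f"Reduction: {reduction:.2f}%")
```

Under these assumptions the count gives about 56.9 million parameters for PA-III and about 7.1 million for PA, a reduction of roughly 87.60%, which agrees with the figure in the text. Nearly all of the saving comes from shrinking the first fully connected layer.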
Table 6 Confusion matrix and classification performance of PA model

                              Prediction of peanut pod grades                                   Classification performance
Sample sets                   First-grade  Second-grade  Third-grade  Fourth-grade  Fifth-grade   Precision/%  Recall/%  F1 score/%
                              pod          pod           pod          abnormal pod  damaged pod
First-grade pod               131          0             0            1             8             99.24        93.57     96.32
Second-grade pod              0            136           0            1             3             97.84        97.14     97.49
Third-grade pod               0            0             139          0             1             100          99.29     99.64
Fourth-grade abnormal pod     1            3             0            122           14            98.39        87.14     92.42
Fifth-grade damaged pod       0            0             0            0             140           84.34        100       94.50
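
The precision, recall, and F1 values in Table 6 follow from the confusion matrix by the usual definitions. The sketch below recomputes them in plain Python, taking rows as the true grade and columns as the predicted grade, as in the table. The function names are hypothetical and this is not the authors' evaluation code.

```python
# Confusion matrix from Table 6: rows = true grade, columns = predicted grade.
confusion = [
    [131, 0, 0, 1, 8],     # first-grade pod
    [0, 136, 0, 1, 3],     # second-grade pod
    [0, 0, 139, 0, 1],     # third-grade pod
    [1, 3, 0, 122, 14],    # fourth-grade abnormal pod
    [0, 0, 0, 0, 140],     # fifth-grade damaged pod
]


def per_class_metrics(matrix):
    """Return a list of (precision, recall, F1) tuples in percent, one per class."""
    n_classes = len(matrix)
    results = []
    for k in range(n_classes):
        true_positive = matrix[k][k]
        predicted_as_k = sum(matrix[i][k] for i in range(n_classes))  # column sum
        actually_k = sum(matrix[k])                                   # row sum
        precision = 100 * true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = 100 * true_positive / actually_k if actually_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        results.append((precision, recall, f1))
    return results


def overall_accuracy(matrix):
    """Return the share of correctly classified samples in percent."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100 * correct / total


metrics = per_class_metrics(confusion)
for grade, (precision, recall, f1) in enumerate(metrics, start=1):
    print(f"grade {grade}: precision {precision:.2f}, recall {recall:.2f}, F1 {f1:.2f}")
print(f"overall accuracy: {overall_accuracy(confusion):.2f}")
```

Recomputed this way, the overall accuracy (668 of 700 test images, about 95.43%) and all precision and recall values match Table 6, as do the F1 scores of the first four grades. For the fifth grade, the matrix as printed yields an F1 score of about 91.50%, which differs from the 94.50% listed in the table.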
Abstract: To address the low efficiency of manual grading and the inaccuracy of mechanical grading of peanut pods, a convolutional neural network method for peanut pod grade image recognition based on transfer learning was proposed. Using flip, rotation, translation, contrast transformation, and brightness transformation, the acquired images of five grades of peanut pods (first-grade pod, second-grade pod, third-grade pod, fourth-grade abnormal pod, and fifth-grade damaged pod) were expanded and preprocessed to establish a peanut pod grade image dataset. Of the data, 60% was randomly selected as the training set, 20% as the validation set, and the remaining 20% as the test set. The performance of peanut pod image classification based on GoogLeNet, ResNet18, and AlexNet was compared and analyzed. The peanut pod grade recognition model was improved by transferring the AlexNet convolution layers. Local response normalization was replaced by batch normalization, and the activation function was placed at different positions before and after the batch normalization layer, so that four recognition-training models were designed: PA-I, PA-II, PA-III, and PA-IV. Transfer learning comparison experiments and learning rate optimization experiments were carried out on these four improved AlexNet models. The effects of the unsaturated activation function (ReLU) and the improved unsaturated activation function (LReLU) on model performance were studied. The experimental results showed that AlexNet required the least training time while satisfying the test accuracy, and that the learning rate was a very important hyperparameter that needed to be optimized in transfer learning based on the improved AlexNet model. If the learning rate was too high, training oscillated severely or could not proceed normally; if it was too low, training was slow. An appropriate learning rate sped up training and improved the recognition ability of the model. When the learning rate was updated automatically, the model with batch normalization performed better than the model with local response normalization, reaching higher accuracy and lower loss. When the coefficient of LReLU was 0.0001, the model using LReLU performed equivalently to the model using ReLU, so LReLU had no substantial impact on the training results. Adding batch normalization and reducing the number of parameters shortened training time by 220 s and improved the model's performance. The classification accuracy of the proposed peanut pod grade recognition model for the first-grade pod, second-grade pod, third-grade pod, fourth-grade abnormal pod, and fifth-grade damaged pod was 93.57%, 97.14%, 99.29%, 87.14%, and 100%, respectively; the average classification accuracy reached 95.43%, and the F1 scores were 96.32%, 97.49%, 99.64%, 92.42%, and 94.50%, respectively. The proposed model had high classification accuracy for peanut pod grades and could provide a reference for the precise classification of other agricultural products.
Keywords: image recognition; models; convolutional neural network; transfer learning; batch normalization; peanut pod; rank
classification
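
The random 60%/20%/20% split into training, validation, and test sets mentioned in the abstract can be sketched as follows. This is an illustration under assumptions: the function name and the seed are hypothetical, and the dataset size of 3500 images is only inferred from the 700 test samples in Table 6 being 20% of the data; the source does not state the total.

```python
import random


def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle sample indices and split them into train, validation, and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains, about 20%
    return train, val, test


# 3500 is an inferred dataset size (assumption), not a figure given in the source.
train, val, test = split_indices(3500)
print(len(train), len(val), len(test))
```

With this assumed size, the split gives 2100 training, 700 validation, and 700 test images, and every image falls into exactly one subset.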