
Journal of Chinese Agricultural Mechanization

Journal of Chinese Agricultural Mechanization, 2024, Vol. 45, Issue (2): 267-274. DOI: 10.13733/j.jcam.issn.2095-5553.2024.02.038

• Agricultural Intelligence Research •

Algorithm of wheat disease image identification based on Vision Transformer

Bai Yupeng1, Feng Yikun2, Li Guohou1, Zhao Mingfu1, Zhou Haoyu3, Hou Zhisong1,4

  • Online: 2024-02-15 Published: 2024-03-20
  • Funding:
    National Natural Science Foundation of China (11871196); Henan Province Science and Technology Research Project (232102111125)

Abstract: Wheat powdery mildew, head blight (scab), and rust are three major diseases that reduce wheat yield. To improve the recognition accuracy of wheat disease images, a wheat disease image recognition algorithm based on Vision Transformer is proposed. First, images of the three diseases (powdery mildew, head blight, and rust) were collected in the field and preprocessed to build a wheat disease image recognition dataset. Then, the recognition algorithm was built on an improved Vision Transformer, and the effects of different transfer learning methods and of data augmentation on recognition performance were analyzed. The experiments showed that full-parameter transfer learning and data augmentation significantly improved the convergence speed and recognition accuracy of the Vision Transformer model. Finally, the Vision Transformer, AlexNet, and VGG16 algorithms were compared on the same dataset under the same time conditions. The Vision Transformer model achieved an average recognition accuracy of 96.81% on the three wheat diseases, which was 6.68% and 4.94% higher than that of the AlexNet and VGG16 models, respectively.

Key words: wheat disease, Vision Transformer, transfer learning, image recognition, data augmentation
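
The abstract contrasts full-parameter transfer learning with other transfer learning methods but does not name the alternatives. As a rough illustration only, the sketch below assumes one common contrast: updating every pretrained parameter ("full") versus freezing the pretrained backbone and training only the classification head ("head_only"). The parameter names, the mode names, and the helper functions are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical parameter names, loosely shaped like a Vision Transformer.
# They are illustrative only and do not come from the paper.
PARAM_NAMES: List[str] = [
    "patch_embed.weight",
    "encoder.block0.attn.weight",
    "encoder.block0.mlp.weight",
    "encoder.block1.attn.weight",
    "encoder.block1.mlp.weight",
    "head.weight",
    "head.bias",
]


def trainable_flags(names: List[str], mode: str) -> Dict[str, bool]:
    """Map each parameter name to whether it is updated during fine-tuning.

    "full":      start from pretrained weights and update every parameter.
    "head_only": keep the pretrained backbone frozen and update only the
                 classification head (an assumed alternative, for contrast).
    """
    if mode == "full":
        return {name: True for name in names}
    if mode == "head_only":
        return {name: name.startswith("head.") for name in names}
    raise ValueError(f"unknown transfer mode: {mode!r}")


def count_trainable(flags: Dict[str, bool]) -> int:
    """Count the parameters that would receive gradient updates."""
    return sum(flags.values())


if __name__ == "__main__":
    for mode in ("full", "head_only"):
        flags = trainable_flags(PARAM_NAMES, mode)
        print(mode, count_trainable(flags), "of", len(flags), "trainable")
```

In a framework such as PyTorch, flags like these would typically be applied by setting `requires_grad` on each parameter; the sketch leaves that step out so that it runs with the standard library alone.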
