Journal of Chinese Agricultural Mechanization

Journal of Chinese Agricultural Mechanization, 2025, Vol. 46, Issue (3): 238-245. DOI: 10.13733/j.jcam.issn.2095-5553.2025.03.035

• Research on Agricultural Intelligence •

Multi-target detection of cherry tomatoes under complex backgrounds based on Sim-YOLO

Zhang Junning1, Yan Ying1, Hu Huan1, Bi Zeyang2, Xing Yu1   

  (1. School of Mechanical and Electrical Engineering, Beijing Information Science and Technology University, Beijing, 100192, China; 2. School of Information and Electrical Engineering, China Agricultural University, Beijing, 100091, China)
  • Online: 2025-03-15 Published: 2025-03-13

基于Sim—YOLO的复杂背景下樱桃番茄多目标检测

张俊宁1,闫英1,胡欢1,毕泽洋2,邢宇1   

  (1. 北京信息科技大学机电工程学院,北京市,100192;2. 中国农业大学信息与电气工程学院,北京市,100091)
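  • Supported by:
    Key Research Cultivation Project of Beijing Universities (2021YJPY201); Open Project Fund of the State Key Laboratory of Agricultural Equipment Technology (NKL-2023-004); National Key Research and Development Program of China, Special Project for Intergovernmental International Science and Technology Innovation Cooperation (SQ2023YFE0101940)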

Abstract:

To enable picking robots to accurately recognize and localize cherry tomatoes in complex scenes, a detection method for single and clustered cherry tomato fruits under complex backgrounds is proposed. A Sim-YOLO model, which combines an attention mechanism with an object detection algorithm, is developed to address multi-target recognition in tomato picking. First, the dataset is expanded using a generative adversarial network (GAN) together with traditional augmentation methods such as mosaic, 90° rotation, and hue adjustment, to improve model generalization. Next, the expanded tomato images are processed in the Lab color space with K-means clustering combined with Canny edge detection to make an initial separation of the complex background from the target fruits. The SimAM attention mechanism is then integrated into the YOLOv5 backbone network to strengthen feature extraction and improve the localization accuracy of fruit stems against similarly colored backgrounds. Experimental results show that the average detection precision and recall of the Sim-YOLO model for greenhouse cherry tomatoes at different maturity stages are 84.1% and 98.0%, respectively. For multi-target detection of fruits and stems, the Sim-YOLO model achieves an average detection accuracy of 48.9%, higher than that of the Faster R-CNN, YOLOv2, and YOLOv3 models. Finally, the Sim-YOLO model is deployed to an edge computing device through model conversion, which optimizes inference and reduces the computational load on the embedded edge system, reaching a detection speed of 25 FPS.
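The abstract does not spell out how SimAM weights features, so the following is only a minimal sketch of the commonly published SimAM formulation, applied to a single 2-D feature channel in plain Python. The function name `simam`, the regularizer value `lam=1e-4`, and the toy 3x3 input are assumptions for illustration; the authors' model applies the mechanism to YOLOv5 backbone tensors, and their code may differ.

```python
import math


def simam(channel, lam=1e-4):
    """Apply SimAM-style parameter-free attention to one 2-D feature channel.

    channel: list of rows of floats (hypothetical toy input).
    lam: small regularizer added to the variance (assumed value).
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" neurons in the channel
    if n < 1:
        raise ValueError("channel needs at least two values")

    mean = sum(values) / len(values)
    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n

    out = []
    for row, dev_row in zip(channel, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy: activations far from the mean score higher.
            inv_energy = d / (4.0 * (variance + lam)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid gate
            out_row.append(v * weight)
        out.append(out_row)
    return out


# Toy channel: one distinctive activation on a uniform background.
feature = [
    [0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1],
]
weighted = simam(feature)
```

In this toy case the distinctive center activation receives a larger gate value than the uniform background, which is the behavior that would help a thin stem stand out from similarly colored foliage. No learnable parameters are added, which is consistent with the abstract's emphasis on lightweight edge deployment.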

Key words: cherry tomatoes, complex background, GAN, multi-target recognition, embedded edge deployment

摘要:

为满足复杂场景下樱桃番茄采摘机器人精准识别和定位的需求,提出一种复杂背景下单果及成簇樱桃番茄果实的检测方法,采用注意力机制结合目标检测算法(Sim—YOLO)来解决番茄采摘多目标识别的难点问题。首先,结合GAN生成对抗网络和传统的图像增强如mosaic、旋转90°和Hue方法扩充数据集,提高模型的泛化能力。其次,对扩充后的番茄图像在Lab颜色空间下利用K—means聚类算法并结合Canny算子对其进行边缘检测,以初步区分复杂背景与检测目标。并在YOLOv5的骨干网络中加入SimAM注意力机制,增加算法的特征提取能力,提升果梗在相似颜色背景中的定位精度。试验结果表明,Sim—YOLO模型对不同成熟期的温室樱桃番茄的平均检测精确率及召回率分别为84.1%、98.0%,对果实及果梗多目标的平均检测精度为48.9%。果实及果梗多目标的检测精度均高于Faster R—CNN模型、YOLOv2模型和YOLOv3模型。最后,将Sim—YOLO模型通过模型转换部署到边缘计算设备,优化模型推理过程,减轻嵌入式端边缘计算压力,达到25 FPS的检测速率。

关键词: 樱桃番茄, 复杂背景, GAN网络, 多目标识别, 嵌入式端部署
