Journal of Chinese Agricultural Mechanization

Journal of Chinese Agricultural Mechanization ›› 2025, Vol. 46 ›› Issue (1): 85-90. DOI: 10.13733/j.jcam.issn.2095-5553.2025.01.013

• Facility Agriculture and Plant Protection Machinery Engineering •

Research on the recognition model of meat duck eating and drinking behavior based on MOT and instance segmentation

Ma Yiheng1, 2, Duan Enze3, 4, Zhao Shida3, 4, Bai Zongchun3, 4, Ban Zhaojun1, 2

  1. School of Biological and Chemical Engineering, Zhejiang University of Science and Technology, Hangzhou, 310023, China;  2. Zhejiang Key Laboratory of Agricultural Product Chemistry and Bioprocessing Technology, Hangzhou, 310023, China;  3. Institute of Agricultural Facilities and Equipment, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China;  4. Key Laboratory of Facility Agricultural Engineering in the Middle and Lower Reaches of the Yangtze River, Ministry of Agriculture and Rural Affairs, Nanjing, 210014, China
  • Online: 2025-01-15 Published: 2025-01-24
  • Funding:
    Jiangsu Modern Agriculture Major Core Technology Innovation Project (CX(22)1008); Jiangsu Modern Agricultural Machinery Equipment and Technology Demonstration and Promotion Project (NJ2021-25); Northern Jiangsu Science and Technology Special Project of Jiangsu Province (XZ-SZ202119)

Abstract: Tracking and identifying the drinking and feeding behavior of caged meat ducks is important for detecting pathological conditions in the ducks and for the intelligent management of duck houses. Because conventional instance segmentation models cannot associate detections across the temporal order of video frames, a recognition model for meat duck drinking and feeding behavior is proposed that first performs multi-object tracking (MOT) and then performs instance segmentation. An overhead-camera test platform for caged meat ducks was built to collect image data containing multiple ducks. The Track Anything Model (TAM) was used to identify and track individual ducks, after which the Segment Anything Model (SAM) was used to rapidly annotate the drinking and feeding behaviors of three of the ducks. Building on the multi-object tracking results, Mask R-CNN was used to recognize the ducks' drinking and feeding behaviors, and the duration of each behavior in the video was inferred from the video frame rate, yielding a combined recognition and timing model for meat duck drinking and feeding behavior. The test results show that, for recognizing the drinking and feeding behaviors of the target ducks, the Mask R-CNN model achieved a bounding-box average precision of 91.6% and a mask average precision of 93.3%. The accuracy of the computed durations was 95.4% for drinking and 90.1% for feeding. The model can therefore recognize and time meat duck drinking and feeding behavior with high precision.
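
The frame-rate-based timing step mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that each tracked duck already carries a per-frame behavior label (for example, from a tracker followed by Mask R-CNN), and the function name `behavior_durations`, the label strings, and the 25 fps frame rate are all hypothetical.

```python
from collections import defaultdict


def behavior_durations(frame_labels, fps):
    """Estimate per-duck behavior durations in seconds.

    frame_labels: iterable of (track_id, behavior) pairs, one pair for each
        frame in which a tracked duck was labeled with that behavior
        (hypothetical input format).
    fps: video frame rate in frames per second.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Count labeled frames per duck and per behavior.
    counts = defaultdict(lambda: defaultdict(int))
    for track_id, behavior in frame_labels:
        counts[track_id][behavior] += 1

    # Each labeled frame contributes 1 / fps seconds.
    return {
        track_id: {behavior: n / fps for behavior, n in per_duck.items()}
        for track_id, per_duck in counts.items()
    }


# Hypothetical labels: duck 1 drinks for 50 frames and feeds for 25 frames,
# duck 2 feeds for 100 frames, in a video assumed to run at 25 fps.
labels = [(1, "drinking")] * 50 + [(1, "feeding")] * 25 + [(2, "feeding")] * 100
durations = behavior_durations(labels, fps=25)
```

Counting frames per track ID is what ties the duration to an individual duck, which is why tracking has to precede the per-frame behavior recognition in this kind of pipeline.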

Key words: caged meat duck, multiple object tracking, Mask R-CNN, drinking and feeding, behavior recognition

CLC number: