Abstract
To detect mature Zanthoxylum accurately in its natural environment, a Zanthoxylum detection network based on the YOLOv5 object detection model is proposed. Detection accuracy is limited by the irregular shape of Zanthoxylum clusters growing on trees and by occlusion where branches and leaves overlap the fruit. To improve the model's generalization ability, the dataset was augmented using several methods. To make feature extraction more direction-aware and to let the convolution kernel adapt to the actual shape of each Zanthoxylum cluster, a coordinate attention module and a deformable convolution module were integrated into the YOLOv5 network. Ablation experiments compared the effects of the attention mechanism and deformable convolution on YOLOv5 performance, and the improved model was also compared with the Faster R-CNN, SSD, and CenterNet algorithms. A vision detection platform for a Zanthoxylum-harvesting robot was built, and the visual detection system was tested on it. The experimental results showed that, compared with the original YOLOv5 network, the improved model increased the average detection accuracy for Zanthoxylum in its natural environment by 4.6% in mAP@0.5 and 6.9% in mAP@0.5:0.95, a significant advantage over the other network models. On the test set of occluded Zanthoxylum, the improved model increased mAP@0.5 and mAP@0.5:0.95 by 5.4% and 4.7%, respectively, over the original model. Tested on a mobile picking platform, the improved model accurately identified mature Zanthoxylum in its natural environment at a detection speed of about 89.3 frames per second. This research provides technical support for the visual detection systems of intelligent Zanthoxylum-harvesting robots.
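
The two reported metrics, mAP@0.5 and mAP@0.5:0.95, differ only in how strictly a predicted box must overlap the ground-truth box to count as a correct detection. The sketch below is illustrative and not taken from the authors' code: it assumes the common COCO-style convention, in which mAP@0.5:0.95 averages over ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05. The function names and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, truth_box):
    """Return the IoU thresholds at which this prediction counts as a match."""
    score = iou(pred_box, truth_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical boxes for one fruit cluster, in pixels.
pred = (0, 0, 100, 100)
truth = (0, 0, 100, 72)
passed = thresholds_passed(pred, truth)
```

Under these assumptions, the prediction above overlaps the ground truth with an IoU of 0.72, so it counts as correct at the single threshold used by mAP@0.5 but at only five of the ten thresholds that feed mAP@0.5:0.95. This is why the second metric rewards tighter localization.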