Abstract:
To address the blurred features of small targets in aerial image detection, an improved YOLO_v5x target detection method was proposed. A space-to-depth (SPD) module was added to the backbone and neck networks of YOLO_v5x to reduce the loss of fine-grained information, and a small-target prediction head was added to the detection output to improve the efficiency with which the algorithm learns low-resolution features. In addition, the coordinate attention (CA) mechanism was introduced to encode horizontal and vertical position information into the channel attention, enhancing the ability of the network to extract features along different dimensions. To improve target positioning accuracy, the Alpha intersection over union (α-IoU) loss function was introduced on the basis of the complete intersection over union (CIoU) loss function, giving more accurate bounding box regression and therefore more accurate target positioning in the image. The improved YOLO_v5x algorithm was trained and evaluated in comparative experiments on the VisDrone dataset. The results show that, compared with the original YOLO_v5x, the average detection accuracy of the improved algorithm increased by 7.8%, and its average detection accuracy for small targets reached 23.9%, so it can effectively identify small targets in unmanned aerial vehicle (UAV) aerial photos. Compared with other target detection algorithms such as RetinaNet and YOLOX-S, the improved algorithm achieved the highest average precision for small target detection, reaching an advanced level among current mainstream small target detection algorithms.
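
As an illustration of the loss described above, the following is a minimal sketch of an α-IoU loss built on CIoU, written in plain Python for a single pair of boxes. It is not the implementation used in this work: the function name `alpha_ciou_loss`, the `(x1, y1, x2, y2)` box format, the default `alpha=3.0`, and the `eps` stabiliser are all assumptions made for the example. The sketch assumes the commonly used form in which each CIoU term is raised to the power α, so that `alpha=1.0` recovers the ordinary CIoU loss.

```python
import math


def alpha_ciou_loss(box, gt, alpha=3.0, eps=1e-9):
    """Sketch of an alpha-CIoU loss for one predicted box and one ground-truth box.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    alpha=3.0 is an illustrative default, not a value taken from the abstract.
    """
    # Intersection and union areas.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    w1, h1 = box[2] - box[0], box[3] - box[1]
    w2, h2 = gt[2] - gt[0], gt[3] - gt[1]
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centres.
    dx = (box[0] + box[2]) - (gt[0] + gt[2])
    dy = (box[1] + box[3]) - (gt[1] + gt[3])
    rho2 = (dx ** 2 + dy ** 2) / 4.0

    # Aspect-ratio consistency term v and its trade-off weight beta (as in CIoU).
    v = (4.0 / math.pi ** 2) * (
        math.atan(w2 / (h2 + eps)) - math.atan(w1 / (h1 + eps))
    ) ** 2
    beta = v / (1.0 - iou + v + eps)

    # Raise each CIoU term to the power alpha.
    return 1.0 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print(alpha_ciou_loss(pred, target))             # alpha-CIoU with alpha = 3
    print(alpha_ciou_loss(pred, target, alpha=1.0))  # reduces to plain CIoU
```

In this sketch the loss is close to zero for identical boxes and exceeds 1 for disjoint boxes, because the centre-distance penalty remains positive even when the IoU is zero.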