Abstract:
To address the challenges of complex backgrounds, weak small-target features, and the difficulty of balancing detection accuracy with inference efficiency in glass defect detection, a dynamic detection model based on an improved YOLOv8 is proposed. First, the model structure is improved in three respects: the detection head, lightweight design, and the loss function. A Dynamic Head integrated with the deformable convolution DCNv3 is introduced into the detection head to strengthen multi-scale feature representation. The ADown downsampling module is adopted to make the model lightweight and reduce computational cost. The MPDIoU loss function is introduced to improve bounding-box localization accuracy. Second, a training set is constructed by combining public data with a self-collected dataset, and data augmentation methods such as random cropping, grayscale transformation, and brightness perturbation are applied to improve the model's adaptability to complex working conditions. Finally, model performance is validated in terms of detection accuracy and inference efficiency through ablation experiments and comparative experiments against multiple models. The results show that the improved modules work together effectively: the Dynamic Head with DCNv3 significantly enhances multi-scale feature representation, MPDIoU improves localization accuracy, and ADown reduces computational overhead while keeping performance stable. The final model achieves an mAP50 of 91.7% and an mAP50-95 of 51.8% at an inference speed of 39.8 frames per second (F/s), striking a good balance between detection accuracy and operational efficiency. Further analysis indicates that the method offers significant advantages in small-defect detection and adapts to complex industrial scenes, providing an effective technical path that balances accuracy and real-time performance for online industrial glass defect detection.
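To illustrate how an MPDIoU-style loss ties localization to box corners, the following is a minimal sketch that assumes the commonly published definition: IoU minus the squared distances between the top-left corners and between the bottom-right corners of the predicted and ground-truth boxes, each normalized by the squared diagonal of the input image. The abstract does not state the formula, so the function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` term are illustrative assumptions rather than the authors' implementation.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Sketch of an MPDIoU-style loss for a single pair of boxes.

    pred, target: boxes as (x1, y1, x2, y2) in pixels (assumed format).
    img_w, img_h: input image size used to normalize corner distances.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU; eps guards against division by zero.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    box = (10.0, 10.0, 50.0, 50.0)
    print(mpdiou_loss(box, box, 640, 640))                       # identical boxes
    print(mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100))  # disjoint boxes
```

Unlike plain IoU, the corner-distance terms remain informative when the boxes do not overlap, which is one reason such a loss may help with small defects, where a small shift easily drives IoU to zero.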