Abstract:
Endoscopic image segmentation is a routine clinical diagnostic tool, and its accuracy directly affects physicians’ diagnosis and treatment decisions for lesion areas. To address the limitations of existing methods in challenging scenarios such as poor image quality and blurred lesion boundaries, a polyp segmentation network that integrates multi-scale feature perception and fuzzy boundary modeling is proposed. First, the image is decomposed by the discrete wavelet transform into sub-bands of different scales and frequencies to extract global structural and local detail features, and an adaptive attention mechanism dynamically adjusts the weight of each sub-band feature, achieving multi-scale feature perception. Second, a variational multi-sampling module maps features into a latent space for probability distribution modeling; multiple reparameterized samplings generate diverse latent representations, which smooths blurred regions and improves boundary segmentation accuracy. Experiments are conducted on five public datasets (CVC-300, CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, ETIS-LaribPolypDB) and the non-public USTCAI dataset to validate the proposed method. The results show that it outperforms existing methods in both Dice coefficient and mIoU. In particular, it achieves a Dice coefficient of 57.54% on ETIS-LaribPolypDB, surpassing the state-of-the-art method by 7.1%, and a Dice coefficient of 91.88% on CVC-ClinicDB, demonstrating strong segmentation performance and generalization in complex scenarios. By combining multi-scale feature perception with fuzzy boundary modeling, the proposed method addresses key challenges in endoscopic image segmentation and provides more accurate and reliable technical support for clinical diagnosis.
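For readers unfamiliar with the two ingredients named above, the following is a minimal illustrative sketch, not the network's implementation. It shows the standard reparameterization z = mu + sigma * eps drawn several times, and the usual Dice and IoU definitions on binary masks. The function names, the use of flat Python lists, and the averaging of the samples are assumptions made for illustration; the abstract does not state how the samples are aggregated, and mIoU would be the mean of the per-mask IoU over a test set.

```python
import math
import random


def reparameterized_samples(mu, log_var, num_samples, rng):
    """Draw num_samples latent vectors z = mu + sigma * eps, with eps ~ N(0, 1)."""
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    samples = []
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in mu]
        samples.append([m + s * e for m, s, e in zip(mu, sigma, eps)])
    return samples


def mean_of_samples(samples):
    """Average the sampled latent vectors element-wise (assumed aggregation)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two flat binary masks given as lists of 0/1."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    union = total - inter
    # Two empty masks are treated as a perfect match by convention.
    dice = 2 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


if __name__ == "__main__":
    rng = random.Random(0)
    zs = reparameterized_samples([0.0, 1.0], [-2.0, -2.0], num_samples=8, rng=rng)
    print(mean_of_samples(zs))
    print(dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0]))
```

Averaging several samples is one simple way to reduce the variance of a single stochastic draw, which is the intuition behind smoothing uncertain boundary regions; the actual module may combine the samples differently.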