Colonoscopy polyp image segmentation method with boundary self-knowledge distillation
Journal of Image and Graphics, 2025, Vol. 30, No. 2, pp. 589-600
Print publication date: 2025-02-16
DOI: 10.11834/jig.240175
Meng Xiangfu, Zhang Zhichao, Yu Chunlin, Zhang Xiaoyan. 2025. Colonoscopy polyp image segmentation method with boundary self-knowledge distillation. Journal of Image and Graphics, 30(02):0589-0600
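Objective
Colonoscopy is essential for the early detection of colonic polyps, but it depends on the operator's expertise and subjective judgment, which limits its reliability. Existing polyp image segmentation methods typically add extra layers and explicitly extend the network structure, which reduces model efficiency. In addition, because the boundary between a polyp and the surrounding mucosa is indistinct, existing models segment polyp boundaries poorly.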
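Method
An end-to-end self-knowledge distillation framework is proposed specifically for colonic polyp image segmentation. The framework integrates a boundary segmentation network and a polyp segmentation network into a unified knowledge distillation framework so that the two networks improve each other. A model dedicated to boundary segmentation serves as the teacher network and the polyp segmentation model serves as the student network; the two share a feature extraction module to make knowledge transfer more effective. A reverse feature fusion structure is designed that aggregates deep encoder features through upsampling and matrix multiplication and uses reversed shallow features as auxiliary information to obtain a global map of the segmentation mask.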
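The idea of using reversed features as auxiliary information can be illustrated with a small sketch. It assumes that "reverse" means weighting shallow features by one minus the sigmoid of an upsampled coarse prediction, as in PraNet-style reverse attention; that reading is an assumption, not something the abstract confirms. The function names (`upsample_nearest`, `reverse_weighting`) are hypothetical, and the element-wise product stands in for the fusion step, so this is not the authors' module.

```python
import math
from typing import List

Grid = List[List[float]]  # a single-channel 2D feature map


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling: repeat each value `factor` times per axis."""
    out: Grid = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def reverse_weighting(shallow: Grid, coarse_logits: Grid, factor: int) -> Grid:
    """Weight shallow features by 1 - sigmoid(upsampled coarse logits).

    Assumption: regions the coarse map already marks as foreground are
    suppressed, so the shallow features emphasise what is still uncertain.
    """
    up = upsample_nearest(coarse_logits, factor)
    return [
        [s * (1.0 - sigmoid(u)) for s, u in zip(s_row, u_row)]
        for s_row, u_row in zip(shallow, up)
    ]


if __name__ == "__main__":
    shallow = [[2.0, 4.0], [6.0, 8.0]]
    coarse = [[0.0]]  # one coarse logit covering the whole 2x2 patch
    print(reverse_weighting(shallow, coarse, factor=2))
```

With a coarse logit of 0 the weight is exactly 0.5, so every shallow value is halved; a strongly positive logit drives the weight toward 0.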
Result
Experiments were conducted on four datasets: CVC-ClinicDB (colonoscopy videos challenge-clinic database), CVC-ColonDB (colonoscopy videos challenge-colon database), Kvasir, and HAM10000 (human against machine with 10000 training images). The model was compared with 11 state-of-the-art methods, including PraNet (parallel reverse attention network) and Polyp2Former (boundary guided network based on transformer for polyp segmentation). The proposed model performed best, improving the Dice similarity coefficient (DSC) and mean intersection over union (mIoU) by 0.45% and 0.68%, respectively, over the best existing model.
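For readers unfamiliar with the two headline metrics, the following minimal sketch computes the Dice similarity coefficient and intersection over union for binary masks and averages them over a set of images. It uses the standard definitions; the abstract does not describe the paper's evaluation protocol (thresholding, image resizing, averaging order), and the helper names and the `eps` smoothing term are assumptions.

```python
from typing import Callable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask: rows of 0/1 values


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for two binary masks of equal shape."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    return (2 * tp + eps) / (2 * tp + fp + fn + eps)


def iou(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    return (tp + eps) / (tp + fp + fn + eps)


def mean_score(metric: Callable[[Mask, Mask], float],
               preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average a per-image metric over a dataset (e.g. mDSC or mIoU)."""
    scores = [metric(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(dice(pred, target), iou(pred, target))
```

In the example, one true positive and one false positive give a Dice score of about 2/3 and an IoU of about 1/2, which shows why Dice is never lower than IoU on the same prediction.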
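Conclusion
The proposed model handles polyps of various sizes and shapes, extracts boundaries accurately, and has the potential to generalize to other medical image segmentation tasks. The code is available at https://github.com/xiaoxiaotuo/BA-KD.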
Objective
Colorectal cancer remains a formidable global health challenge, underscoring the pressing need for early detection strategies to improve treatment outcomes. Among these strategies, colonoscopy stands out as a primary diagnostic tool, relying on the visual acumen of medical professionals to identify potentially cancerous abnormalities, such as polyps, within the colon and rectum. However, the effectiveness of colonoscopy is heavily contingent upon the skill and experience of the operator, leading to variability and limitations in detection rates across different practitioners and settings. In response to these challenges, the integration of artificial intelligence and computer vision techniques has garnered increasing attention as a means to augment the accuracy and efficiency of colorectal cancer screening. Various algorithms have been developed to automatically segment colorectal images, with the overarching goal of precisely delineating polyps from the surrounding tissue. Despite advancements in this domain, many existing models confront inherent inefficiencies and limited effectiveness stemming from their intricate architectures and dependence on manual feature engineering.
Method
This study proposes an end-to-end boundary self-knowledge distillation (BA-KD) framework for precise polyp segmentation. Unlike conventional methods, BA-KD integrates a boundary segmentation network and a polyp segmentation network into a single framework, so that knowledge is transferred between the two tasks and both boundary and polyp information contribute to segmentation accuracy. The framework comprises two interconnected branches: a boundary segmentation network that serves as the teacher branch and a polyp segmentation network that serves as the student branch. To address the difficulty of delineating polyp boundaries, a boundary detection operator automatically generates boundary masks, which are then used when training both branches. This approach improves the segmentation performance of the student branch and also enriches the knowledge of the teacher branch, so the two branches learn from and refine each other. A distinguishing feature of BA-KD is that the student and teacher branches share an image feature extractor, which supports knowledge transfer between them. Two structures, reverse multilevel feature fusion (RMLF) and reverse feature fusion (RFM), are proposed to fuse feature information at different hierarchical levels. RMLF integrates high-level features to generate a global feature map, whereas RFM combines reversed shallow features with the high-level features aggregated by RMLF to produce the final segmentation mask.
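The automatic generation of boundary masks can be sketched as follows. The abstract does not name the boundary detection operator, so this example assumes a Sobel-style gradient applied to the ground-truth polyp mask, with a pixel marked as boundary wherever the gradient is nonzero. The function name `boundary_mask` and the replicate-edge padding are illustrative choices, not details taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # binary mask: rows of 0/1 values

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def boundary_mask(mask: Mask) -> Mask:
    """Mark pixels whose Sobel gradient on the binary mask is nonzero."""
    h, w = len(mask), len(mask[0])

    def pixel(r: int, c: int) -> int:
        # Replicate-edge padding: clamp coordinates to the image bounds.
        r = min(max(r, 0), h - 1)
        c = min(max(c, 0), w - 1)
        return mask[r][c]

    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    v = pixel(r + dr, c + dc)
                    gx += SOBEL_X[dr + 1][dc + 1] * v
                    gy += SOBEL_Y[dr + 1][dc + 1] * v
            out[r][c] = 1 if (gx != 0 or gy != 0) else 0
    return out


if __name__ == "__main__":
    polyp = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in boundary_mask(polyp):
        print(row)
```

Because the gradient is nonzero on both sides of an edge, this sketch yields a band roughly two pixels wide that straddles the polyp contour; interior and far-background pixels stay at 0.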
Result
BA-KD is validated experimentally against 11 state-of-the-art methods on four datasets: CVC-ClinicDB, CVC-ColonDB, Kvasir, and HAM10000. The comparative models include U-Net, DoubleU-Net, UNet++, TransFuse, PraNet, DuAT, RaBit, GroupSeg, and G-CASCADE, which serve as benchmarks in polyp segmentation and general medical image segmentation. On CVC-ClinicDB, BA-KD achieves an mSpe of 0.997 and an mDSC of 0.9555 and outperforms all competitors in mDSC and mIoU, with improvements of 0.45% and 0.68%, respectively, over RaBit. On CVC-ColonDB, BA-KD outperforms all the other methods on every evaluation metric, improving mIoU by 2.20% and mDSC by 1.51% over TransFuse, the best of the compared methods. On Kvasir, BA-KD achieves an mIoU of 0.889 and an mDSC of 0.937, surpassing the best-performing competitor, RaBit, by approximately 1.08% and 1.14%, respectively. The generalization ability of BA-KD to other medical segmentation tasks is further evaluated on HAM10000, which contains dermoscopic images from different populations. BA-KD achieves the best results on all metrics on HAM10000, with an mDSC of 0.9562 and an mIoU of 0.9223, surpassing the best-performing baseline, DoubleU-Net, by 1.45% and 2.25%, respectively.
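The mSpe figure reported above is presumably the mean specificity over test images; the abstract does not expand the abbreviation, so that reading is an assumption. Under that assumption, a minimal sketch of the computation is shown below, with hypothetical helper names and an `eps` smoothing term added to avoid division by zero.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # binary mask: rows of 0/1 values


def specificity(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Specificity: TN / (TN + FP), computed over background pixels only."""
    tn = fp = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if not t:  # only ground-truth background pixels count
                if p:
                    fp += 1
                else:
                    tn += 1
    return (tn + eps) / (tn + fp + eps)


def mean_specificity(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average per-image specificity over a dataset (assumed meaning of mSpe)."""
    scores = [specificity(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(specificity(pred, target))
```

Specificity ignores foreground pixels entirely, so on images where polyps occupy a small area it tends to sit close to 1; this is one reason it is usually reported alongside overlap metrics such as mDSC and mIoU rather than on its own.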
Conclusion
The experimental results show that BA-KD outperforms existing state-of-the-art segmentation methods, with improvements in both the Dice similarity coefficient (DSC) and the mean intersection over union (mIoU).