Application of U-Net channel transformation network in gland image segmentation
2024, Vol. 29, No. 3, Pages 713-724
Print publication date: 2024-03-16
DOI: 10.11834/jig.230233
Cao Weijie, Duan Xianhua, Xu Zhenwei, Sheng Shuai. 2024. Application of U-Net channel transformation network in gland image segmentation. Journal of Image and Graphics, 29(3): 713-724
Objective
Gland medical image segmentation is the process of separating glandular regions from the surrounding tissue in medical images, and it demands very high segmentation accuracy. Owing to the diverse morphology of glands and the large number of small targets, traditional models are prone to imprecise segmentation or mis-segmentation on gland medical images. To address this, the U-Net channel transformation network (UCTransNet) segmentation model is improved according to the characteristics of gland medical images, achieving higher-precision segmentation of gland images.
Method
First, a combination of the ASPP_SE (atrous spatial pyramid pooling with squeeze-and-excitation) module and the ConvBatchNorm module is added to the front end of the UCTransNet encoder, strengthening the encoder's ability to extract small-target feature information while preventing overfitting during training. Second, a simplified dense connection is embedded between the encoder and the skip connections to enhance the fusion of feature information between adjacent encoder modules. Finally, a refiner is added to the channel cross fusion with Transformer (CCT) module; it projects the self-attention maps to a higher dimension, improving the self-attention mechanism and enhancing the fusion of global feature information across encoder modules. Used together, the simplified dense connection and the refined CCT achieve better results. A minimal sketch of the ASPP_SE front end is given below.
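As an illustration, here is a minimal PyTorch sketch of an ASPP_SE-style front-end block: multi-rate atrous branches plus image-level pooling, fused by a 1 × 1 projection and then reweighted by squeeze-and-excitation channel attention. The dilation rates, channel sizes, and SE reduction ratio are assumptions for illustration, not the paper's exact configuration.

```python
# Minimal sketch of an ASPP_SE block (hyperparameters are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP_SE(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(6, 12, 18), se_reduction=16):
        super().__init__()
        # 1x1 branch plus three atrous (dilated) 3x3 branches.
        self.branches = nn.ModuleList([nn.Conv2d(in_ch, out_ch, 1)])
        for r in rates:
            self.branches.append(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r))
        # Global (image-level) pooling branch.
        self.pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1)
        # Squeeze-and-excitation: channel-wise gating of the fused features.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch // se_reduction, 1), nn.ReLU(),
            nn.Conv2d(out_ch // se_reduction, out_ch, 1), nn.Sigmoid())

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [b(x) for b in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode='bilinear', align_corners=False))
        y = self.project(torch.cat(feats, dim=1))
        return y * self.se(y)  # reweight channels by importance

x = torch.randn(1, 3, 224, 224)
y = ASPP_SE(3, 64)(x)  # -> (1, 64, 224, 224)
```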
Result
The improved algorithm was evaluated on the public gland datasets MoNuSeg (multi-organ nuclei segmentation challenge) and GlaS (gland segmentation). With the Dice coefficient and the IoU (intersection over union) coefficient as the main metrics, it achieved 80.55% and 67.32% on MoNuSeg and 92.23% and 86.39% on GlaS, improvements of 0.88% and 1.06% and of 1.53% and 2.43%, respectively, over the original UCTransNet.
Conclusion
The proposed improved algorithm outperforms other existing segmentation algorithms on gland medical image segmentation and meets the requirements of clinical gland image segmentation.
Objective
Adenocarcinoma is a malignant tumor originating from the glandular epithelium and poses an immense threat to human health. With the rapid development of computer vision technology, medical imaging has become an important means of preoperative expert diagnosis. In diagnosing adenocarcinoma, doctors judge the severity of the cancer and grade it by analyzing the size, shape, and other external features of the glandular structure. Accordingly, high-precision segmentation of gland images has become an urgent requirement in clinical medicine. Gland medical image segmentation refers to the process of separating the glandular region from the surrounding tissue in medical images and requires high segmentation accuracy. Owing to the diverse shapes of glands and the presence of numerous small targets, traditional models can suffer from problems such as imprecise segmentation and mis-segmentation. To address this issue, this study proposes an improved gland medical image segmentation algorithm based on UCTransNet. UCTransNet bridges the semantic gap between encoder modules of different resolutions and between the encoder and decoder, thereby achieving high-precision image segmentation.
Method
First, a combination of the ASPP_SE and ConvBatchNorm modules is added to the front end of the encoder. The ASPP_SE module combines the atrous spatial pyramid pooling (ASPP) module with a channel attention mechanism. The ASPP module consists of three atrous convolutions with different dilation rates, a 1 × 1 convolution, and an ASPP pooling branch. Atrous convolution injects holes into standard convolution to expand the receptive field and obtain dense data features while keeping the output feature map the same size. The ASPP module uses multi-scale atrous convolution to obtain a large receptive field and fuses the resulting features with the global features from ASPP pooling, yielding denser semantic information than the original features. The channel attention mechanism enables the model to focus on important channel regions of the image, dynamically select information, and assign large weights to channels containing important information. In the CCT (channel cross fusion with Transformer), modules in which important information carries higher weight fuse better. The ConvBatchNorm module enhances the encoder's ability to extract small-target features while preventing overfitting during model training.

Second, a simplified dense connection is embedded between the encoder and the skip connections. The CCT performs global fusion of the encoder features from a channel perspective; although its global attention ability is strong, its local attention ability is weak, and the ambiguity between adjacent encoder modules remains unresolved. To solve this problem, the dense connection strengthens local information fusion: the upper encoder module is passed through convolution and pooling to obtain the lower encoder module, the lower module is upsampled so that its resolution matches that of the upper module, and the two modules are concatenated along the channel dimension, with the resolution unchanged after concatenation. The upper encoder module thus receives supplementary feature information from the lower module, which enhances semantic fusion between adjacent modules, narrows their semantic gap, and improves feature fusion between adjacent encoder modules (see the dense-connection sketch below).

Finally, a refiner is added to the CCT. It projects the self-attention maps to a higher dimension and applies head convolution to enhance the spatial context and local patterns of the attention maps, effectively combining the advantages of self-attention and convolution to further improve the self-attention mechanism; a linear projection then restores the attention maps to their initial resolution, enhancing the global fusion of encoder feature information (see the refiner sketch below). Used together, the simplified dense connection and the CCT refiner improve the performance of the model.
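The adjacent-stage fusion step can be illustrated with a minimal PyTorch sketch. Tensor shapes and channel counts are assumptions, and bilinear upsampling stands in for whatever interpolation the model actually uses.

```python
# Minimal sketch of the simplified dense connection between adjacent
# encoder stages, assuming (N, C, H, W) feature maps.
import torch
import torch.nn.functional as F

def dense_fuse(upper, lower):
    """Concatenate an encoder stage with its upsampled successor.

    upper: (N, C1, H, W)      feature map of encoder stage i
    lower: (N, C2, H/2, W/2)  feature map of encoder stage i+1
    returns (N, C1+C2, H, W)  resolution of the upper stage is preserved
    """
    lower_up = F.interpolate(lower, size=upper.shape[-2:],
                             mode='bilinear', align_corners=False)
    return torch.cat([upper, lower_up], dim=1)

# Example: stage-2 features gain stage-3 context before the skip path / CCT.
e2 = torch.randn(1, 128, 56, 56)
e3 = torch.randn(1, 256, 28, 28)
fused = dense_fuse(e2, e3)  # -> (1, 384, 56, 56)
```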
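The refiner's expand-convolve-reduce pattern can be sketched as follows. The head counts and kernel size are illustrative assumptions, not the paper's exact settings.

```python
# Minimal sketch of an attention-map refiner: project the softmax attention
# maps to more heads, add local context with a small convolution, then
# project back to the original number of heads.
import torch
import torch.nn as nn

class AttentionRefiner(nn.Module):
    def __init__(self, heads=4, expanded_heads=16, kernel_size=3):
        super().__init__()
        # Linear projection to a higher head dimension (1x1 conv over the
        # stack of (tokens x tokens) attention maps, heads as channels).
        self.expand = nn.Conv2d(heads, expanded_heads, 1)
        # Head convolution: mixes each attention value with its neighbors,
        # adding local spatial patterns to the global attention.
        self.local = nn.Conv2d(expanded_heads, expanded_heads, kernel_size,
                               padding=kernel_size // 2)
        # Linear projection back to the initial head dimension.
        self.reduce = nn.Conv2d(expanded_heads, heads, 1)

    def forward(self, attn):
        # attn: (batch, heads, tokens, tokens) softmax attention maps
        return self.reduce(self.local(self.expand(attn)))

attn = torch.softmax(torch.randn(2, 4, 196, 196), dim=-1)
refined = AttentionRefiner()(attn)  # same shape: (2, 4, 196, 196)
```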
Result
The improved algorithm was tested on the publicly available gland datasets MoNuSeg and GlaS. The Dice and intersection over union (IoU) coefficients were the main evaluation metrics. The Dice coefficient is a similarity measure between two samples, while the IoU coefficient measures the accuracy of a result's positional information; both are commonly used in medical image segmentation, and a minimal computation sketch is given below. The results on MoNuSeg were 80.55% and 67.32%, and those on GlaS were 92.23% and 86.39%, improvements of 0.88% and 1.06% and of 1.53% and 2.43%, respectively, compared with those of the original UCTransNet. The improved model was also compared with existing popular segmentation networks and was found to generally outperform them.
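For reference, the two reported metrics can be computed on binary masks as in the following minimal NumPy sketch; thresholding of predictions and averaging over the test set are omitted.

```python
# Dice and IoU on binary segmentation masks.
import numpy as np

def dice_iou(pred, target, eps=1e-7):
    """pred, target: 0/1 or boolean arrays of the same shape."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, target).sum() + eps)
    return dice, iou

# Example: a perfect prediction yields Dice = IoU = 1.0 (up to eps).
m = np.ones((4, 4), dtype=np.uint8)
print(dice_iou(m, m))
```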
Conclusion
The proposed improved model is superior to existing segmentation algorithms for medical gland segmentation and can meet the requirements of clinical gland image segmentation. The CCT module of the original model was further optimized to fuse global and local feature information, thereby achieving better results.
Keywords: medical image segmentation; U-Net from a channel-wise perspective with Transformer (UCTransNet); dense connection; self-attention mechanism; refinement module