Multi-scale fusion-enhanced ultrasound elastic images segmentation for mediastinal lymph node

Zhou Qi; Yang Hang; Tian Chuangeng; Tang Lu; Hui Yu

doi:10.11834/jig.230324

Image Analysis and Recognition

2024, Vol. 29, No. 3, pp. 670-685
Print publication date: 2024-03-16
Zhou Qi, Yang Hang, Tian Chuangeng, Tang Lu, Hui Yu. 2024. Multi-scale fusion-enhanced ultrasound elastic images segmentation for mediastinal lymph node. Journal of Image and Graphics, 29(03): 0670-0685. DOI: 10.11834/jig.230324.

Abstract

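Objective

Endobronchial ultrasound elastography carries rich channel-wise semantic information. Accurate segmentation of mediastinal lymph nodes is important for determining whether lung cancer has metastasized, and it also plays an important role in cancer staging and treatment. At present, ultrasound elastic image segmentation has received little study, and the relationships among image channel features have not been fully exploited. This paper therefore proposes an attention-based multi-scale fusion enhanced ultrasound elastic images segmentation network for mediastinal lymph node (AMFE-UNet), a U-Net variant that combines an attention mechanism with multi-scale fusion enhancement.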

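Method

First, because the images provide both position and channel information about mediastinal lymph nodes, a dense convolutional network (DenseNet) is used as the model encoder. Second, a multi-scale fusion-enhanced decoder is designed by combining an attention mechanism with dilated convolutions, so that the boundaries and textures of the nodules are modeled at multiple scales and ranges. Finally, skip connections are built with a selective kernel network so that the intermediate features of the encoder are fully fused with the output features of the decoder. Depending on whether the decoder features are fused by element-wise addition or by channel concatenation, AMFE-UNet is divided into two variants, A and B.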
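As background for the dilated-convolution branches of the decoder, the standard relation between dilation rate and effective kernel size is worked below. The kernel size k = 3 and the rates d = 1, 2, 3 are illustrative assumptions; the abstract does not state which rates AMFE-UNet uses.

```latex
% Effective kernel size of a k x k convolution with dilation rate d
k_{\mathrm{eff}} = k + (k - 1)(d - 1)
% Illustrative values for k = 3 (assumed rates, not taken from the paper)
d = 1:\; k_{\mathrm{eff}} = 3, \qquad d = 2:\; k_{\mathrm{eff}} = 5, \qquad d = 3:\; k_{\mathrm{eff}} = 7
```

Running several such branches in parallel lets one decoder stage see the same feature map at several ranges without adding parameters per branch, which is the usual motivation for multi-scale dilated designs.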

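Result

Comparative experiments and validation are carried out on an ultrasound elastic image dataset. AMFE-UNet reaches an average Dice coefficient of 86.593%, an improvement of 1.986% over U-Net. Compared with the other models, AMFE-UNet A is best on Dice, precision, and specificity, and AMFE-UNet B is best on intersection over union, sensitivity, and Hausdorff distance. Ablation experiments and visual analysis show that the proposed improvements yield clear gains.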

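Conclusion

This paper builds the encoder of the segmentation model from a dense convolutional network and uses a channel attention mechanism to optimize feature recovery and connection in the model. The approach achieves good mediastinal lymph node segmentation in ultrasound elastic images and has high value for clinical application. Code: https://github.com/Philo-github/AMFE-UNet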

Extended Abstract

Objective

Ultrasound elastography enables non-invasive diagnosis of lesion tissues by analyzing differences in hardness among body tissues, and it is gradually being used to diagnose many diseases. In bronchial ultrasound elastography, accurately segmenting mediastinal lymph nodes from images is important for determining whether lung cancer has metastasized and for the subsequent staging and treatment of cancer. Manual segmentation by radiologists is time-consuming, and research on automated segmentation specifically for ultrasound elastic images is limited, so deep learning-based assisted segmentation methods have attracted considerable attention. Although ultrasound elastic images offer some guidance for segmenting regions of interest, texture information in these regions is obscured, which makes segmentation challenging. Existing research has focused primarily on the encoder structure of the model, in particular on incorporating different pre-trained models to accommodate the three-channel format of ultrasound elastic images. The intermediate features produced by the encoder and decoder structures have received little study, which results in less precise segmentation. This study therefore proposes a network for mediastinal lymph node segmentation, called the attention-based multi-scale fusion enhanced ultrasound elastic images segmentation network for mediastinal lymph node (AMFE-UNet).

Method

First, a pre-trained dense convolutional network (DenseNet) with dense connections is introduced into the U-Net architecture to extract channel and position information from ultrasound elastic images. Second, to model the boundaries and textures of the nodules at different scales and ranges, the decoder module is enhanced with efficient channel attention (ECA) and dilated convolutions. Each decoder module contains three dilated convolution branches and one pooling branch. Combining the branch outputs in different ways yields four decoder structures:

1) Decoder-A: the results from all branches are added and passed through an ECA module.
2) Decoder-B: the results from all branches are concatenated along the channel dimension and passed through an ECA module.
3) Decoder-C: each branch has its own ECA module, and the branch results are concatenated along the channel dimension.
4) Decoder-D: the results from all branches are densely connected and passed through an ECA module.

Lastly, a selective kernel network (SK-Net) is used to enhance the fusion of the features obtained from the encoder and decoder, ensuring a more comprehensive integration.

In the experiments, the proposed models are implemented using Python 3.7 and PyTorch 1.12. The image processing workstation is equipped with an Intel i9-13900K CPU and two NVIDIA RTX 4090 GPUs, each with 24 GB of memory. The initial parameters of the model are obtained with the default initialization method in PyTorch. The Adam optimizer is used to update the network parameters. The learning rate is initially set to 0.0001 with a weight decay coefficient of 0.1, and it is decayed every 90 iterations. The Dice coefficient is used as the loss function, and the model is trained for 190 epochs.
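The difference between Decoder-A (add, then ECA) and Decoder-B (concatenate, then ECA) can be illustrated with a small dependency-free Python sketch. It is not the authors' PyTorch implementation: feature maps are plain lists of channels, the function names are hypothetical, and the fixed `weights` stand in for the 1D convolution kernel that ECA learns during training.

```python
import math


def eca(feature, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Channel attention in the style of ECA on a list of channels.

    Each channel is a flat list of floats. `weights` is a placeholder for the
    learned 1D convolution kernel (assumed fixed values for illustration).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    desc = [sum(ch) / len(ch) for ch in feature]
    pad = len(weights) // 2
    padded = [0.0] * pad + desc + [0.0] * pad
    gated = []
    for c, ch in enumerate(feature):
        # 1D convolution across neighbouring channels, then a sigmoid gate.
        z = sum(w * padded[c + j] for j, w in enumerate(weights))
        gate = 1.0 / (1.0 + math.exp(-z))
        gated.append([gate * v for v in ch])
    return gated


def fuse_add(branches):
    """Element-wise sum of branch outputs; the channel count is unchanged."""
    return [[sum(vals) for vals in zip(*chans)] for chans in zip(*branches)]


def fuse_concat(branches):
    """Channel-wise concatenation; channel counts of the branches add up."""
    return [ch for feature in branches for ch in feature]


def decoder_a(branches):
    # Decoder-A style: add the branch results, then apply channel attention.
    return eca(fuse_add(branches))


def decoder_b(branches):
    # Decoder-B style: concatenate the branch results, then apply attention.
    return eca(fuse_concat(branches))


# Four hypothetical branch outputs (three dilated convolutions, one pooling),
# each with 2 channels of 4 spatial values.
branches = [[[float(b + 1)] * 4, [0.5 * (b + 1)] * 4] for b in range(4)]
out_a = decoder_a(branches)
out_b = decoder_b(branches)
print(len(out_a), len(out_b))  # channel counts: 2 and 8
```

Addition keeps the channel count of a single branch, while concatenation multiplies it by the number of branches, so a concatenating decoder needs a wider subsequent layer to consume its output.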

Result

The experiment is performed on a collected dataset of bronchial ultrasound elastic images with six-fold cross-validation. The evaluation metrics include the Dice coefficient, sensitivity, specificity, precision, intersection over union (IoU), the 95th-percentile Hausdorff distance (HD95), the number of parameters, and GFLOPs. The first five metrics range between 0 and 1, and a higher value indicates better segmentation performance. HD95 has no fixed range, and a lower value indicates better segmentation performance. The ablation experiments show that both the skip connection structure and the decoder structure proposed for the model bring improvements. The model using SK-Net as skip connections is only slightly less sensitive than Dense-UNet, while its remaining five metrics are better than those of Dense-UNet. The four models using the multi-scale fusion-enhanced decoder outperform Dense-UNet by 0.4% to 0.9% in Dice coefficient and by up to 2% in precision. Two final models were designed according to the ablation experiments: AMFE-UNet A and AMFE-UNet B. AMFE-UNet is compared with a variety of models, including U-Net, Att-UNet, Seg-Net, DeepLabV3+, Trans-UNet, U-Net++, BPAT-UNet, CTO, and ACE-Net. The Dice coefficient of AMFE-UNet is 86.59% on average, an improvement of 1.983% over U-Net. AMFE-UNet A is optimal in terms of Dice coefficient, precision, and specificity, while AMFE-UNet B is optimal in terms of sensitivity, IoU, and HD95. The class activation maps show that AMFE-UNet achieves better segmentation sensitivity and completeness by focusing on the content of the region at the lower levels of the network and on the boundaries of the region at the higher levels. The other networks focus only on the content of the region and are ineffective at segmenting its boundaries. The training and testing loss curves indicate that AMFE-UNet B converges faster and segments better than AMFE-UNet A.
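For reference, the five overlap metrics listed above follow directly from the pixel-level confusion counts. The sketch below uses the standard definitions on flat binary masks; the function name and the convention of returning 0.0 for an empty denominator are assumptions of this example, and HD95 is omitted because it requires boundary distances rather than counts.

```python
def segmentation_metrics(pred, target):
    """Overlap metrics for two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    tn = sum(1 for p, t in zip(pred, target) if not p and not t)

    def ratio(num, den):
        # Return 0.0 for an empty denominator (a convention assumed here).
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Toy masks: 8 pixels, prediction vs. ground truth.
scores = segmentation_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                              [1, 1, 0, 1, 0, 0, 0, 0])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For these toy masks TP = 2, FP = 1, FN = 1, and TN = 4, so Dice is 2/3 and IoU is 0.5, which also illustrates that Dice is never lower than IoU on the same prediction.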

Conclusion

Extensive experiments demonstrate that AMFE-UNet, which incorporates attention mechanisms, segments ultrasound elastic images effectively, a result that is relevant to future research on multichannel medical images. The code is available at https://github.com/Philo-github/AMFE-UNet

Keywords

ultrasound elastography (UE); mediastinal lymph nodes; instance segmentation; U-Net; channel attention mechanism

