Adaptive modal fusion dual encoder MRI brain tumor segmentation network
2024, Vol. 29, No. 3: 768-781
Print publication date: 2024-03-16
DOI: 10.11834/jig.230275
Zhang Yihan, Bai Zhengyao, You Yilin, Li Zekai. 2024. Adaptive modal fusion dual encoder MRI brain tumor segmentation network. Journal of Image and Graphics, 29(03):0768-0781
Objective
Assessing the degree of tumor malignancy is a challenging task in clinical diagnosis. Brain tumors appear in varying shapes and sizes on magnetic resonance imaging (MRI), and their blurred margins make tumor segmentation difficult. To effectively assist clinicians in tumor assessment and diagnosis and to improve brain tumor segmentation accuracy, this paper proposes D3D-Net (double3DNet), an adaptive modal fusion dual-encoder segmentation network.
Method
The proposed network uses multiple encoders and a dedicated feature fusion strategy. A two-branch encoder fully extracts image features from different modality combinations, and in the encoding stage a dedicated fusion strategy thoroughly integrates the feature information from the upper and lower sub-encoders while removing redundant features. In addition, dilated multi-fiber modules in the encoder and decoder capture multi-scale image features without increasing computational overhead, and attention gates are introduced to preserve detail information.
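To make the fusion step concrete, the following is a minimal PyTorch sketch under stated assumptions, not the paper's exact implementation: the concatenate-then-1x1x1-convolution fusion, the channel sizes, and the modality grouping are all illustrative.

import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Fuses features from the two sub-encoders and prunes redundancy."""
    def __init__(self, channels: int):
        super().__init__()
        # The 1x1x1 convolution halves the concatenated channels, acting as
        # a learned filter that discards redundant cross-encoder features.
        self.fuse = nn.Sequential(
            nn.Conv3d(2 * channels, channels, kernel_size=1),
            nn.InstanceNorm3d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, upper: torch.Tensor, lower: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([upper, lower], dim=1))

# Each sub-encoder sees a different grouping of the four MRI modalities,
# e.g. upper: T1 + T1ce, lower: T2 + FLAIR (an illustrative split).
upper_in = torch.randn(1, 2, 32, 32, 32)  # (batch, modalities, D, H, W)
lower_in = torch.randn(1, 2, 32, 32, 32)
sub_encoder = nn.Conv3d(2, 16, kernel_size=3, padding=1)  # stand-in encoder stage
fused = FusionBlock(16)(sub_encoder(upper_in), sub_encoder(lower_in))
print(fused.shape)  # torch.Size([1, 16, 32, 32, 32])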
Result
D3D-Net was trained and tested on the BraTS2018 (brain tumor segmentation 2018), BraTS2019, and BraTS2020 datasets, and ablation experiments were conducted. On BraTS2018, the model's average Dice values for the enhancing tumor, whole tumor, and tumor core are 3.6%, 1.0%, and 11.5% higher than those of 3D U-Net, respectively, and 2.2%, 0.2%, and 0.1% higher than those of DMF-Net (dilated multi-fiber network). On BraTS2019, the average Dice values for the enhancing tumor, whole tumor, and tumor core are 2.2%, 0.6%, and 7.1% higher than those of 3D U-Net. On BraTS2020, they are 2.5%, 1.9%, and 2.2% higher than those of 3D U-Net.
Conclusion
The proposed dual-encoder fusion network fully fuses multi-modal features and can effectively segment small tumor regions.
Objective
Accurate segmentation of brain tumors is a challenging task in clinical diagnosis, especially when assessing the degree of malignancy. Brain tumors exhibit various shapes and sizes in magnetic resonance imaging (MRI), and accurate segmentation of small tumors plays a crucial role in reliable assessment. However, the significant variability in tumor shape and size and the fuzzy tumor boundaries make segmentation difficult. In this paper, we propose D3D-Net, a multi-modal MRI brain tumor segmentation network based on a dual-encoder fusion architecture, to improve segmentation accuracy. The performance of the proposed network is evaluated on the BraTS2018, BraTS2019, and BraTS2020 datasets.
Method
The proposed network utilizes multiple encoders and a dedicated feature fusion strategy. Dual encoders thoroughly extract image features from different modality combinations, thereby enhancing segmentation accuracy. In the encoding phase, a targeted fusion strategy fully integrates the feature information from the upper and lower sub-encoders, effectively eliminating redundant features. Additionally, the encoding-decoding process employs dilated multi-fiber modules to capture multi-scale image features without incurring additional computational costs, and attention gates are introduced to preserve fine-grained details. We conducted ablation and comparative experiments on the BraTS2018, BraTS2019, and BraTS2020 datasets. BraTS2018 is an open dataset released for the 2018 Brain Tumor Segmentation Challenge. Its training set consists of the magnetic resonance images of 210 high-grade glioma (HGG) and 75 low-grade glioma (LGG) patients, and its validation set contains 66 cases; BraTS2019 adds 49 HGG cases and 1 LGG case on top of BraTS2018. Each case comprises four MRI sequences: T1-weighted, T1-weighted contrast-enhanced, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images. T1-weighted scans emphasize the contrast between tissues on the basis of the relaxation time of hydrogen atoms in the brain; the cerebrospinal fluid appears dark while the white matter appears bright, so this type of scan is often used to detect structural abnormalities, such as tumors, and to assess brain atrophy. T1-weighted contrast-enhanced scans involve injecting a contrast agent into the bloodstream to improve the visualization of certain brain lesions; they are particularly useful for detecting tumors because the contrast agent tends to accumulate in abnormal tissue. T2-weighted scans emphasize the contrast between tissues on the basis of water content; the cerebrospinal fluid appears bright while the white matter appears dark, so this type of scan is often used to detect areas of brain edema or inflammation. FLAIR scans are similar to T2-weighted images but suppress the signal from the cerebrospinal fluid, which makes them particularly useful for detecting abnormalities that are difficult to visualize with other scans, such as small areas of brain edema or lesions in the posterior fossa.
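To illustrate the two building blocks named above, here is a minimal PyTorch sketch of a dilated multi-fiber unit in the spirit of DMF-Net (Chen et al., 2019) and an additive attention gate in the spirit of Attention U-Net (Oktay et al., 2018); the channel counts, group counts, and dilation rates are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class DilatedMultiFiberUnit(nn.Module):
    """Grouped (multi-fiber) 3D convolutions with parallel dilation rates.
    Grouping keeps the computational cost low, while the summed dilated
    branches capture multi-scale context without extra downsampling."""
    def __init__(self, channels: int, groups: int = 4, rates=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(channels, channels, kernel_size=3,
                      padding=r, dilation=r, groups=groups)
            for r in rates
        ])
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(sum(branch(x) for branch in self.branches) + x)

class AttentionGate(nn.Module):
    """Additive attention gate: a decoder gating signal suppresses
    irrelevant skip-connection activations, preserving fine detail."""
    def __init__(self, x_ch: int, g_ch: int, inter_ch: int):
        super().__init__()
        self.theta_x = nn.Conv3d(x_ch, inter_ch, kernel_size=1)
        self.phi_g = nn.Conv3d(g_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv3d(inter_ch, 1, kernel_size=1)

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # Attention coefficients in [0, 1], broadcast over channels.
        alpha = torch.sigmoid(self.psi(torch.relu(self.theta_x(x) + self.phi_g(g))))
        return x * alpha

feat = torch.randn(1, 16, 32, 32, 32)   # skip-connection features
gate = torch.randn(1, 16, 32, 32, 32)   # gating signal (same resolution assumed)
out = AttentionGate(16, 16, 8)(DilatedMultiFiberUnit(16)(feat), gate)
print(out.shape)  # torch.Size([1, 16, 32, 32, 32])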
Result
The proposed D3D-Net outperforms the baseline 3D U-Net and DMF-Net models. On the BraTS2018 dataset, D3D-Net achieves average Dice coefficients of 79.7%, 89.5%, and 83.3% for enhancing tumor, whole tumor, and tumor core segmentation, respectively, demonstrating its effectiveness in accurately segmenting brain tumors of different sizes and shapes. Compared with the 3D U-Net model, D3D-Net improves enhancing tumor, whole tumor, and tumor core segmentation by 3.6%, 1.0%, and 11.5%, respectively; compared with the DMF-Net model, it improves the same tasks by 2.2%, 0.2%, and 0.1%. On the BraTS2019 dataset, D3D-Net also segments brain tumors with high accuracy, achieving average Dice coefficients of 89.6%, 91.4%, and 92.7% for enhancing tumor, whole tumor, and tumor core segmentation, improvements of 2.2%, 0.6%, and 7.1% over the 3D U-Net model. On the BraTS2020 dataset, the average Dice values for enhancing tumor, whole tumor, and tumor core increase by 2.5%, 1.9%, and 2.2%, respectively, compared with 3D U-Net. These results suggest that D3D-Net is an effective and accurate approach for segmenting brain tumors of different sizes and shapes, and its advantage over the 3D U-Net and DMF-Net models indicates that the dual-encoder fusion architecture, which fully integrates multi-modal features, is crucial for accurate segmentation. The consistently high accuracy across the three datasets demonstrates the robustness of the proposed method and its potential to aid in the accurate assessment of brain tumors, ultimately improving clinical diagnosis.
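For reference, the Dice values reported above are computed per nested evaluation region from the predicted and ground-truth label maps. The sketch below assumes the standard BraTS label convention (1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor); the helper names are hypothetical, for illustration only.

import numpy as np

def dice(pred_mask: np.ndarray, true_mask: np.ndarray, eps: float = 1e-6) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for binary volumetric masks."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    return float(2.0 * inter / (pred_mask.sum() + true_mask.sum() + eps))

def brats_dice(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Scores the three nested BraTS regions from integer label maps."""
    regions = {
        "whole tumor": (1, 2, 4),   # all tumor labels
        "tumor core": (1, 4),       # excludes peritumoral edema
        "enhancing tumor": (4,),
    }
    return {name: dice(np.isin(pred, labels), np.isin(truth, labels))
            for name, labels in regions.items()}

# Toy example on random label volumes:
rng = np.random.default_rng(0)
pred = rng.choice([0, 1, 2, 4], size=(64, 64, 64))
truth = rng.choice([0, 1, 2, 4], size=(64, 64, 64))
print(brats_dice(pred, truth))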
Conclusion
The proposed dual-encoder fusion network, D3D-Net, demonstrates promising performance in accurately segmenting brain tumors from MRI images. The network improves the accuracy of brain tumor segmentation, aids in the accurate assessment of brain tumors, and can ultimately improve clinical diagnosis. It has the potential to become a valuable tool for radiologists and medical practitioners in neuro-oncology.
Keywords: brain tumor segmentation; multimodal fusion; dual encoder; magnetic resonance imaging (MRI); attention gate
Badrinarayanan V, Kendall A and Cipolla R. 2017. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12): 2481-2495 [DOI: 10.1109/TPAMI.2016.2644615]
Baid U, Talbar S, Rane S, Gupta S, Thakur M H, Moiyadi A, Sable N, Akolkar M and Mahajan A. 2020. A novel approach for fully automatic intra-tumor segmentation with 3D U-Net architecture for gliomas. Frontiers in Computational Neuroscience, 14: #10 [DOI: 10.3389/fncom.2020.00010]
Chandra S, Vakalopoulou M, Fidon L, Battistella E, Estienne T, Sun R, Robert C, Deutsch E and Paragios N. 2018. Context aware 3D CNNs for brain tumor segmentation//Proceedings of the 4th International MICCAI Brainlesion Workshop. Granada, Spain: Springer: 299-310 [DOI: 10.1007/978-3-030-11726-9_27]
Chen C, Liu X P, Ding M, Zheng J F and Li J Y. 2019. 3D dilated multi-fiber network for real-time brain tumor segmentation in MRI//Proceedings of the 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention. Shenzhen, China: Springer: 184-192 [DOI: 10.1007/978-3-030-32248-9_21]
Chen Y P, Kalantidis Y, Li J S, Yan S C and Feng J S. 2018. Multi-fiber networks for video recognition//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 364-380 [DOI: 10.1007/978-3-030-01246-5_22]
Çiçek Ö, Abdulkadir A, Lienkamp S S, Brox T and Ronneberger O. 2016. 3D U-Net: learning dense volumetric segmentation from sparse annotation//Proceedings of the 19th International Conference on Medical Image Computing and Computer-Assisted Intervention. Athens, Greece: Springer: 424-432 [DOI: 10.1007/978-3-319-46723-8_49]
Feng C Y, Elazab A, Yang P, Wang T F, Zhou F, Hu H Y, Xiao X H and Lei B Y. 2019. Deep learning framework for Alzheimer’s disease diagnosis via 3D-CNN and FSBi-LSTM. IEEE Access, 7: 63605-63618 [DOI: 10.1109/access.2019.2913847]
Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin P M and Larochelle H. 2017. Brain tumor segmentation with deep neural networks. Medical Image Analysis, 35: 18-31 [DOI: 10.1016/j.media.2016.05.004]
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141 [DOI: 10.1109/CVPR.2018.00745]
Hu K, Gan Q H, Zhang Y, Deng S H, Xiao F, Huang W, Cao C H and Gao X P. 2019. Brain tumor segmentation using multi-cascaded convolutional neural networks and conditional random field. IEEE Access, 7: 92615-92629 [DOI: 10.1109/access.2019.2927433]
Kamnitsas K, Ledig C, Newcombe V F J, Simpson J P, Kane A D, Menon D K, Rueckert D and Glocker B. 2017. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis, 36: 61-78 [DOI: 10.1016/j.media.2016.10.004]
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Advances in Neural Information Processing Systems 25: 1097-1105
Long J, Shelhamer E and Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3431-3440 [DOI: 10.1109/CVPR.2015.7298965]
Mallick P K, Ryu S H, Satapathy S K, Mishra S, Nguyen G N and Tiwari P. 2019. Brain MRI image classification for cancer detection using deep wavelet autoencoder-based deep neural network. IEEE Access, 7: 46278-46287 [DOI: 10.1109/access.2019.2902252]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: fully convolutional neural networks for volumetric medical image segmentation//Proceedings of the 4th International Conference on 3D Vision. Stanford, USA: IEEE: 565-571 [DOI: 10.1109/3DV.2016.79]
Nuechterlein N and Mehta S. 2018. 3D-ESPNet with pyramidal refinement for volumetric brain tumor image segmentation//Proceedings of the 4th International MICCAI Brainlesion Workshop. Granada, Spain: Springer: 245-253 [DOI: 10.1007/978-3-030-11726-9_22]
Oktay O, Schlemper J, Le Folgoc L, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla N Y, Kainz B, Glocker B and Rueckert D. 2018. Attention U-Net: learning where to look for the pancreas [EB/OL]. [2023-05-17]. https://arxiv.org/pdf/1804.03999.pdf
Qiu L Y, Geng J, Zhang Y L, Zhang C, He D J and Gao L L. 2021. 3D EMSU-Net: a framework for automatic segmentation of brain tumors//Proceedings of the 6th International Conference on Intelligent Computing and Signal Processing. Xi’an, China: IEEE: 1049-1053 [DOI: 10.1109/icsp51882.2021.9408794]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Thillaikkarasi R and Saravanan S. 2019. An enhancement of deep learning algorithm for brain tumor segmentation using kernel based CNN with M-SVM. Journal of Medical Systems, 43(4): #84 [DOI: 10.1007/s10916-019-1223-7]
Wang Q L, Wu B G, Zhu P F, Li P H, Zuo W M and Hu Q H. 2020. ECA-Net: efficient channel attention for deep convolutional neural networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11531-11539 [DOI: 10.1109/CVPR42600.2020.01155]
Woo S, Park J, Lee J Y and Kweon I S. 2018. CBAM: convolutional block attention module//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 3-19 [DOI: 10.1007/978-3-030-01234-2_1]
Xia F, Shao H J and Deng X. 2022. Cross-stage deep-learning-based MRI fused images of human brain tumor segmentation. Journal of Image and Graphics, 27(3): 873-884 [DOI: 10.11834/jig.210330]
Zhang Y H, Lu P, Liu X Y and Zhou S J. 2017. A modified MRF segmentation of brain MR images//Proceedings of the 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics. Shanghai, China: IEEE: 1-5 [DOI: 10.1109/CISP-BMEI.2017.8302185]
Zhou T X, Ruan S and Canu S. 2019. A review: deep learning for medical image segmentation using multi-modality fusion. Array, 3-4: #100004 [DOI: 10.1016/j.array.2019.100004]
Zhou Y J, Huang W J, Dong P, Xia Y and Wang S S. 2021. D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 18(3): 940-950 [DOI: 10.1109/tcbb.2019.2939522]