集成注意力增强和双重相似性引导的多模态脑部图像配准
Multimodal brain image registration with integrated attention augmentation and dual similarity guidance
2021, Vol. 26, No. 9, Pages: 2219-2232
Print publication date: 2021-09-16
Accepted: 2021-02-14
DOI: 10.11834/jig.200657
田梨梨, 程欣宇, 唐堃, 张健, 王丽会. 集成注意力增强和双重相似性引导的多模态脑部图像配准[J]. 中国图象图形学报, 2021,26(9):2219-2232.
Lili Tian, Xinyu Cheng, Kun Tang, Jian Zhang, Lihui Wang. Multimodal brain image registration with integrated attention augmentation and dual similarity guidance[J]. Journal of Image and Graphics, 2021,26(9):2219-2232.
Objective
Medical image registration is a key step in medical image processing and analysis. Because multimodal images differ greatly in intensity, texture, and other properties, it is difficult to design an accurate metric to quantify the similarity of image pairs, so the accuracy of unsupervised multimodal image registration is low. This paper therefore proposes an unsupervised deep-learning registration model with integrated attention augmentation and dual similarity guidance (ensemble attention-based and dual similarity guidance registration network, EADSG-RegNet), in which global intensity similarity and local feature similarity jointly guide parameter optimization, to improve the accuracy of registering T2-weighted magnetic resonance images to a T1-weighted template image.
Method
EADSG-RegNet consists of feature extraction, deformation field estimation, and a resampler. A cascade encoder and decoder are designed to realize multi-scale feature extraction and deformation field estimation for the image pair. An integrated attention augmentation module (IAAM) is introduced into the cascade encoder; through training, it learns the importance of the extracted features and selects those more useful for the registration task, so that the decoder can estimate the deformation field more accurately. To estimate both global and local deformation accurately, the global intensity similarity, normalized mutual information (NMI), and the local feature similarity based on the self-similarity context (SSC) descriptor are jointly used as the loss function for training the network. The effectiveness of the model is verified on a public dataset and an internal dataset, and the Dice score is used to quantitatively analyze the registration results on global gray matter and white matter as well as on local anatomical structures.
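The global similarity term in the loss above can be illustrated with a histogram-based NMI estimate. The following is a minimal NumPy sketch of one common formulation, NMI = (H(X) + H(Y)) / H(X, Y); the estimator actually used in the paper (e.g., a Parzen-window variant) may differ.

```python
import numpy as np

def nmi(x, y, bins=32):
    """Normalized mutual information NMI = (H(X)+H(Y)) / H(X,Y),
    estimated from a joint intensity histogram of two images."""
    hxy, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = hxy / hxy.sum()                 # joint probability table
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)  # marginals
    eps = 1e-12                           # avoid log(0); zero bins contribute ~0
    hx = -np.sum(px * np.log(px + eps))
    hy = -np.sum(py * np.log(py + eps))
    hj = -np.sum(pxy * np.log(pxy + eps))
    return (hx + hy) / hj
```

For identical images the joint histogram is diagonal, so H(X, Y) = H(X) and NMI approaches 2; for independent images it stays close to 1, which is why maximizing NMI drives the moving image toward the fixed one.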
Result
The experimental results show that the proposed method outperforms traditional registration methods and deep-learning registration models in both visualization and quantitative analysis. Compared with the traditional method ANTs (advanced normalization tools) and the deep-learning methods voxelMorph and ADMIR (affine and deformable medical image registration), the Dice score improves by 3.5%, 1.9%, and 1.5% on global gray matter, by 3.4%, 1.6%, and 1.3% on global white matter, and by 5.2%, 3.1%, and 1.9% on local anatomical structures, respectively. Ablation experiments show that the IAAM module and the SSC loss improve the Dice score by 1.2% and 1.5%, respectively.
Conclusion
The proposed unsupervised multimodal medical image registration network with integrated attention augmentation estimates the deformation field accurately by reinforcing useful features, and thus registers small regions of the image accurately. Comparative experiments verify the effectiveness and generalization ability of the model.
Objective
Medical image registration is widely used in clinical diagnosis, treatment, intraoperative navigation, disease prediction, and radiotherapy planning. Traditional non-learning registration algorithms optimize the deformation parameters iteratively for each image pair, which severely limits computation speed and leads to poor robustness. Owing to their powerful feature representation and learning ability, deep convolutional neural networks (DCNNs) have been applied extensively to medical image registration. DCNN-based registration methods can be divided into supervised and unsupervised categories. Supervised methods have intensive data requirements: they rely on anatomical landmarks or ground-truth deformation fields, and their performance depends heavily on the reliability of these labels, which are difficult to obtain in practice. To overcome these drawbacks, researchers have focused on unsupervised registration, which estimates the deformation field of an image pair directly through an appropriate optimization objective and deformation constraints. For multimodal images, however, the large differences in content, grayscale, and texture make it difficult to design an accurate metric to quantify the similarity of image pairs, which lowers registration accuracy. Existing unsupervised methods typically adopt similarity measures such as mean square error, correlation coefficient, and normalized mutual information as optimization targets; most of these are based on global gray scale, so local deformation still cannot be estimated accurately. This paper proposes an ensemble attention-based and dual similarity guidance registration network (EADSG-RegNet) to improve the registration accuracy between T2-weighted magnetic resonance images and the T1-weighted magnetic resonance template image.
Method
EADSG-RegNet is designed to estimate the deformation field between moving and fixed image pairs and consists of three parts: feature extraction, deformation field estimation, and a resampler. A cascade encoder and decoder, modified from the U-Net architecture, realize multi-scale feature extraction and deformation field estimation. To improve the feature extraction capability and hence the registration accuracy, an integrated attention augmentation module (IAAM) is introduced into the cascade encoder; the importance of the extracted features is learned through training so that the decoder can estimate the deformation field accurately. The IAAM generates the weights of the feature channels from the global average features obtained by global average pooling of the input feature map. The global channel features (with $$n$$ channels) are first shuffled twice, giving 3×$$n$$ channels in total. Each shuffled block is reduced in dimension by a 1×1×1 convolution; the concatenated features are then mapped to $$n$$ weighting coefficients, which multiply the original feature maps through a bottleneck to generate the attention features. To estimate both global and local deformation accurately during training, the global gray-scale similarity, normalized mutual information (NMI), and the local feature similarity based on the self-similarity context (SSC) descriptor are jointly used as the loss function to guide the training of the network; a regularization term is added to the loss function to keep the deformation field smooth. An internal dataset and a public dataset are used to verify the performance and generalizability of the model. All T2-weighted magnetic resonance images are preprocessed first and pre-aligned to a given T1 template. The effectiveness of the network is analyzed in terms of visualization results and quantitative results; the Dice score is used to evaluate the registration quantitatively on global gray matter, white matter, and local anatomical structures.
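The channel-attention data flow described above (pooling, shuffling, reduction, weighting) can be sketched as follows. This is a hypothetical NumPy illustration, not the authors' implementation: the trainable 1×1×1 convolutions are emulated with fixed random channel-mixing matrices, so only the structure of the computation is shown, not the learned behavior.

```python
import numpy as np

def iaam(feat, rng=None):
    """Sketch of the integrated attention augmentation module (IAAM)
    for an (n, D, H, W) feature map."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = feat.shape[0]
    # 1) global average pooling: one descriptor per channel
    g = feat.mean(axis=(1, 2, 3))                               # (n,)
    # 2) shuffle the channel descriptors twice; together with the
    #    original this gives 3*n channel descriptors in total
    blocks = [g, rng.permutation(g), rng.permutation(g)]
    # 3) a 1x1x1 convolution acts on the channel axis only, so each
    #    block reduction is an (n, n) matrix product; then concatenate
    reduced = np.concatenate(
        [(rng.standard_normal((n, n)) / n) @ b for b in blocks])  # (3n,)
    # 4) map the 3*n concatenated values to n attention weights in (0, 1)
    logits = (rng.standard_normal((n, 3 * n)) / n) @ reduced
    w = 1.0 / (1.0 + np.exp(-logits))                           # sigmoid
    # 5) re-weight the original feature map channel by channel
    return feat * w[:, None, None, None]
```

Because each weight lies in (0, 1), the module can only attenuate or preserve channels, which is how the less useful features are suppressed before deformation field decoding.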
Result
To assess the performance of the registration model, it was compared with the symmetric image normalization method (SyN) implemented in the advanced normalization tools (ANTs) software package and with the deep-learning registration models voxelMorph and affine and deformable medical image registration (ADMIR), which are state-of-the-art traditional and deep-learning-based registration methods, respectively. The registration results were analyzed quantitatively on the overall structure and on several local anatomical structures. Gray matter and white matter were segmented automatically with the FMRIB Software Library (FSL), and nine small anatomical structures were segmented manually with ITK-SNAP. Compared with ANTs, voxelMorph, and ADMIR, the average Dice score of the proposed model increased by 3.5%, 1.9%, and 1.5% on gray matter, by 3.4%, 1.6%, and 1.3% on white matter, and by 5.2%, 3.1%, and 1.9% on the nine anatomical structures, respectively. In addition, the registration speed was dozens of times faster than that of the traditional ANTs algorithm. To further illustrate the impact of the attention module and the feature-based similarity loss, ablation experiments on the IAAM and the SSC-based loss were conducted; the results demonstrate that they increase the Dice score by 1.2% and 1.5%, respectively. Finally, by analyzing the volume differences of several brain regions between control groups and drug addicts, the registration model was shown to give results consistent with clinical research.
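The Dice score used throughout the quantitative comparison is the standard overlap measure between two binary segmentation masks; a minimal sketch:

```python
import numpy as np

def dice(a, b):
    """Dice = 2|A∩B| / (|A| + |B|) for two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```

A score of 1.0 means the warped segmentation coincides exactly with the fixed-image segmentation; partial overlap gives proportionally lower values (e.g., two masks sharing one of their two voxels each score 0.5).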
Conclusion
The proposed unsupervised multimodal medical image registration network with an integrated attention augmentation module achieves accurate estimation of the deformation field by augmenting useful features, and thereby achieves accurate registration.
多模态配准; 深度学习; 无监督学习; 集成注意力增强; 双重相似性
multimodal registration; deep learning; unsupervised learning; integrated attention augmentation; dual similarity
Avants B B, Tustison N and Song G. 2009. Advanced normalization tools (ANTS). Insight Journal, 2(365): 1-35
Balakrishnan G, Zhao A, Sabuncu M R, Dalca A V and Guttag J. 2018. An unsupervised learning model for deformable medical image registration//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 9252-9260[DOI: 10.1109/CVPR.2018.00964]
Balakrishnan G, Zhao A, Sabuncu M R, Guttag J and Dalca A V. 2019. VoxelMorph: a learning framework for deformable medical image registration. IEEE Transactions on Medical Imaging, 38(8): 1788-1800[DOI: 10.1109/TMI.2019.2897538]
Battistella G, Fornari E, Annoni J M, Chtioui H, Dao K, Fabritius M, Favrat B, Mall J F, Maeder P and Giroud C. 2014. Long-term effects of cannabis on brain structure. Neuropsychopharmacology, 39(9): 2041-2048[DOI: 10.1038/npp.2014.67]
Cao X H, Yang J H, Zhang J, Nie D, Kim M, Wang Q and Shen D G. 2017. Deformable image registration based on similarity-steered CNN regression//Proceedings of the 20th International Conference on Medical Image Computing and Computer Assisted Intervention. Quebec City, Canada: Springer: 300-308[DOI: 10.1007/978-3-319-66182-7_35]
Connolly C G, Bell R P, Foxe J J and Garavan H. 2013. Dissociated grey matter changes with prolonged addiction and extended abstinence in cocaine users. PLoS One, 8(3): #e59645[DOI: 10.1371/journal.pone.0059645]
Dalca A V, Balakrishnan G, Guttag J and Sabuncu M R. 2019a. Unsupervised learning of probabilistic diffeomorphic registration for images and surfaces. Medical Image Analysis, 57: 226-236[DOI: 10.1016/j.media.2019.07.006]
de Vos B D, Berendsen F F, Viergever M A, Sokooti H, Staring M and Išgum I. 2019. A deep learning framework for unsupervised affine and deformable image registration. Medical Image Analysis, 52: 128-143[DOI: 10.1016/j.media.2018.11.010]
de Vos B D, Berendsen F F, Viergever M A, Staring M and Išgum I. 2017. End-to-end unsupervised deformable image registration with a convolutional neural network//Proceedings of the 3rd International Workshop on Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Québec City, Canada: Springer: 204-212[DOI: 10.1007/978-3-319-67558-9_24]
Eppenhof K A J, Lafarge M W, Moeskops P, Veta M and Pluim J P W. 2018. Deformable image registration using convolutional neural networks//Proceedings of SPIE 10574, Medical Imaging 2018: Image Processing. Houston, USA: SPIE: #105740S[DOI: 10.1117/12.2292443]
Eppenhof K A J and Pluim J P W. 2019. Pulmonary CT registration through supervised learning with convolutional neural networks. IEEE Transactions on Medical Imaging, 38(5): 1097-1105[DOI: 10.1109/TMI.2018.2878316]
Fan J F, Cao X H, Yap P T and Shen D G. 2019. BIRNet: brain image registration using dual-supervised fully convolutional networks. Medical Image Analysis, 54: 193-206[DOI: 10.1016/j.media.2019.03.006]
Fu Y B, Lei Y, Wang T H, Curran W J, Liu T and Yang X F. 2020. Deep learning in medical image registration: a review. Physics in Medicine and Biology, 65(20): #20TR01[DOI: 10.1088/1361-6560/ab843e]
He K M, Zhang X Y, Ren S Q and Sun J. 2015. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 1026-1034[DOI: 10.1109/ICCV.2015.123]
Heinrich M P, Jenkinson M, Bhushan M, Matin T, Gleeson F V, Brady S M and Schnabel J A. 2012. MIND: modality independent neighbourhood descriptor for multi-modal deformable registration. Medical Image Analysis, 16(7): 1423-1435[DOI: 10.1016/j.media.2012.05.008]
Jaderberg M, Simonyan K and Zisserman A. 2015. Spatial transformer networks[EB/OL]. [2020-02-08]. https://arxiv.org/pdf/1506.02025.pdf
Kingma D P and Ba J. 2014. Adam: a method for stochastic optimization[EB/OL]. [2020-05-02]. https://arxiv.org/pdf/1412.6980.pdf
Liu H H, Hao Y H, Kaneko Y, Ouyang X, Zhang Y, Xu L, Xue Z M and Liu Z N. 2009. Frontal and cingulate gray matter volume reduction in heroin dependence: optimized voxel-based morphometry. Psychiatry and Clinical Neurosciences, 63(4): 563-568[DOI: 10.1111/j.1440-1819.2009.01989.x]
Miao S, Wang Z J and Liao R. 2016. A CNN regression approach for real-time 2D/3D registration. IEEE Transactions on Medical Imaging, 35(5): 1352-1363[DOI: 10.1109/TMI.2016.2521800]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241[DOI: 10.1007/978-3-319-24574-4_28]
Sentker T, Madesta F and Werner R. 2018. GDL-FIRE4D: deep learning-based fast 4D CT image registration//Proceedings of the 21st International Conference on Medical Image Computing and Computer Assisted Intervention. Granada, Spain: Springer: 765-773[DOI: 10.1007/978-3-030-00928-1_86]
Simonovsky M, Gutiérrez-Becker B, Mateus D, Navab N and Komodakis N. 2016. A deep metric for multimodal registration//Proceedings of the 19th International Conference on Medical Image Computing and Computer-Assisted Intervention. Athens, Greece: Springer: 10-18[DOI: 10.1007/978-3-319-46726-9_2]
Smith S, Bannister P R, Beckmann C, Brady M, Clare S, Flitney D, Hansen P, Jenkinson M, Leibovici D, Ripley B, Woolrich M and Hang Y Y. 2001. FSL: new tools for functional and structural brain image analysis. NeuroImage, 13(6): #249[DOI: 10.1016/S1053-8119(01)91592-7]
Tang K, Li Z, Tian L L, Wang L H and Zhu Y M. 2020. ADMIR-affine and deformable medical image registration for drug-addicted brain images. IEEE Access, 8: 70960-70968[DOI: 10.1109/ACCESS.2020.2986829]
van Essen D C, Ugurbil K, Auerbach E, Barch D, Behrens T E J, Bucholz R, Chang A, Chen L, Corbetta M, Curtiss S W, Della Penna S, Feinberg D, Glasser M F, Harel N, Heath A C, Larson-Prior L, Marcus D, Michalareas G, Moeller S, Oostenveld R, Petersen S E, Prior F, Schlaggar B L, Smith S M, Snyder A Z, Xu J, Yacoub E and Consortium W M H. 2012. The human connectome project: a data acquisition perspective. Neuroimage, 62(4): 2222-2231[DOI: 10.1016/j.neuroimage.2012.02.018]
Wu G R, Kim M, Wang Q, Gao Y Z, Liao S and Shen D G. 2013. Unsupervised deep feature learning for deformable registration of MR brain images//Proceedings of the 16th International Conference on Medical Image Computing and Computer-Assisted Intervention. Nagoya, Japan: Springer: 649-656[DOI: 10.1007/978-3-642-40763-5_80]
Xu R, Chen Y W, Tang S Y, Morikawa S and Kurumi Y. 2008. Parzen-window based normalized mutual information for medical image registration. IEICE Transactions on Information and Systems, E91. D(1): 132-144[DOI: 10.1093/ietisy/e91-d.1.132]
Yan P K, Xu S, Rastinehad A R and Wood B J. 2018. Adversarial image registration with application for MR and TRUS image fusion//Proceedings of the 9th International Workshop on Machine Learning in Medical Imaging. Granada, Spain: Springer: 197-204[DOI: 10.1007/978-3-030-00919-9_23]
Yushkevich P A, Piven J, Hazlett H C, Smith R G, Ho S, Gee J C and Gerig G. 2006. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage, 31(3): 1116-1128[DOI: 10.1016/j.neuroimage.2006.01.015]
Zhao S Y, Dong Y, Chang E and Xu Y. 2019. Recursive cascaded networks for unsupervised medical image registration//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 10599-10609[DOI: 10.1109/ICCV.2019.01070]
Zhao S Y, Lau T, Luo J, Chang E I C and Xu Y. 2020. Unsupervised 3D end-to-end medical image registration with volume tweening network. IEEE Journal of Biomedical and Health Informatics, 24(5): 1394-1404[DOI:10.1109/JBHI.2019.2951024]