融合残差上下文编码和路径增强的视杯视盘分割
Optic disc and cup segmentation with combined residual context encoding and path augmentation
2024年29卷第3期, 页码: 637-654
纸质出版日期: 2024-03-16
DOI: 10.11834/jig.230140
梅华威, 尚虹霖, 苏攀, 刘艳平. 2024. 融合残差上下文编码和路径增强的视杯视盘分割. 中国图象图形学报, 29(03):0637-0654
Mei Huawei, Shang Honglin, Su Pan, Liu Yanping. 2024. Optic disc and cup segmentation with combined residual context encoding and path augmentation. Journal of Image and Graphics, 29(03):0637-0654
目的
从眼底图像中分割视盘和视杯是眼部疾病智能诊断中的一项重要工作,U-Net及其变体模型已广泛应用于视杯视盘分割任务。然而,连续的卷积与池化操作容易造成空间信息损失,导致视盘和视杯分割精度差且效率低。为此,本文提出融合残差上下文编码和路径增强的深度学习网络RCPA-Net,以提升分割结果的准确性与连续性。
方法
采用限制对比度自适应直方图均衡方法处理输入图像,增强对比度并丰富图像信息。特征编码模块以ResNet34(residual neural network)为骨干网络,通过引入残差递归与注意力机制使模型更关注感兴趣区域,采用残差空洞卷积模块捕获更深层次的语义特征信息,使用路径增强模块在浅层特征中获得精确的定位信息来增强整个特征层次。本文还提出了一种新的多标签损失函数用于提高视盘视杯与背景区域的像素比例并生成最终的分割图。
结果
在4个数据集上与多种分割方法进行比较,在ORIGA(online retinal fundus image database for glaucoma analysis)数据集中,本文方法对视盘分割的JC(Jaccard)指数为0.939 1,F-measure为0.968 6,视杯分割的JC和F-measure分别为0.794 8和0.885 5;在Drishti-GS1数据集中,视盘分割的JC和F-measure分别为0.951 3和0.975 0,视杯分割的JC和F-measure分别为0.863 3和0.926 6;在Refuge(retinal fundus glaucoma challenge)数据集中,视盘分割的JC和F-measure分别为0.929 8和0.963 6,视杯分割的JC和F-measure分别为0.828 8和0.906 3;在RIM-ONE(retinal image database for optic nerve evaluation)-R1数据集中,视盘分割的JC和F-measure分别为0.929 0和0.962 8。在4个数据集上结果均优于对比算法,性能显著提升。此外,针对网络中提出的模块分别做了消融实验,验证了RCPA-Net中各个模块的有效性。
结论
实验结果表明,RCPA-Net提升了视盘和视杯分割精度,预测图像更接近真实标签结果,同时跨数据集测试结果证明了RCPA-Net具有良好的泛化能力。
Objective
Ophthalmic image segmentation is an important part of medical image analysis. In particular, optic disc (OD) and optic cup (OC) segmentation is a crucial technology for the intelligent diagnosis of glaucoma, which causes irreversible damage to the eyes and is the second leading cause of blindness worldwide. The primary glaucoma screening method is the evaluation of the OD and OC in fundus images, and the cup-to-disc ratio (CDR) is one of the most representative glaucoma detection features. In general, eyes with a CDR greater than 0.65 are considered glaucomatous. With the continuous development of deep learning, U-Net and its variants, along with methods based on superpixel classification and edge segmentation, have been widely applied to OD and OC segmentation tasks. However, segmentation accuracy remains limited and training efficiency is low because continuous convolution and pooling operations lose spatial information. To improve the accuracy and training efficiency of OD and OC segmentation, we propose the residual context path augmentation U-Net (RCPA-Net), which captures deeper semantic feature information and addresses the problem of unclear edge localization.
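As a rough illustration of the CDR screening feature described above, a toy helper (not part of the paper's pipeline; the vertical-extent definition is an assumption) can estimate the vertical cup-to-disc ratio directly from binary masks:

```python
def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from two binary masks (lists of 0/1 rows).

    Illustrative only: CDR is taken here as the ratio of the vertical
    extents of the segmented cup and disc; values above ~0.65 are
    commonly treated as a glaucoma warning sign.
    """
    def vertical_extent(mask):
        rows = [i for i, row in enumerate(mask) if any(row)]
        return rows[-1] - rows[0] + 1 if rows else 0

    disc = vertical_extent(disc_mask)
    return vertical_extent(cup_mask) / disc if disc else 0.0
```

For example, a cup spanning 7 of the disc's 10 rows gives a CDR of 0.7, above the 0.65 screening threshold.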
Method
RCPA-Net includes three modules: the feature coding module (FCM), the residual atrous convolution (RAC) module, and the path augmentation module (PAM). First, the FCM adopts ResNet34 as the backbone network. By introducing the residual module and an attention mechanism, the model is made to focus on the region of interest, and the efficient channel attention (ECA) module is adopted in place of the squeeze-and-excitation (SE) module. ECA is an efficient channel attention mechanism that avoids dimensionality reduction and captures cross-channel interactions effectively. Second, the RAC module is used to obtain contextual feature information over a wider receptive field. Inspired by Inception-V4 and the context encoder network (CE-Net), we fuse atrous (dilated) convolution into Inception-style blocks and stack the convolution blocks. Traditional convolution is replaced with atrous convolution, so that the receptive field increases while the number of parameters remains the same. Finally, to shorten the information path between low-level and top-level features, the PAM uses accurate low-level localization signals and lateral connections to enhance the entire feature hierarchy. To address the extreme pixel imbalance and generate the final segmentation map, we propose a new multi-label loss function based on the Dice coefficient and focal loss. This function improves the pixel ratio between the OD/OC and background regions. In addition, we augment the training data by flipping images and adjusting their aspect ratio. The input images are then processed with the contrast-limited adaptive histogram equalization method, and each resulting image is fused with its original and averaged to form a new three-channel image; this step enhances image contrast and enriches image information. In the experimental stage, we use Adam optimization instead of stochastic gradient descent to optimize the model.
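The multi-label loss is described above only at a high level. A minimal pure-Python sketch of one plausible Dice-plus-focal combination might look as follows; the `alpha` weighting and the per-channel summation over OD and OC are assumptions, not the paper's exact formulation:

```python
import math

def dice_loss(pred, target, eps=1e-6):
    # pred: predicted foreground probabilities, target: 0/1 labels (flat lists)
    inter = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

def focal_loss(pred, target, gamma=2.0, eps=1e-6):
    # standard binary focal loss, averaged over all pixels
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)          # clip for numerical safety
        pt = p if t == 1 else 1.0 - p            # probability of the true class
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(pred)

def multi_label_loss(preds, targets, alpha=0.5):
    # sum Dice + focal terms over the OD and OC channels; alpha is an
    # assumed weighting between the two terms, not the paper's value
    return sum(alpha * dice_loss(p, t) + (1 - alpha) * focal_loss(p, t)
               for p, t in zip(preds, targets))
```

The Dice term counteracts the foreground/background imbalance directly, while the focal term down-weights easy pixels so the gradient concentrates on hard boundary pixels.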
The batch size is eight, and the weight decay is 0.000 1. During training, the learning rate is adjusted adaptively according to the number of samples selected in each step. When outputting the prediction results, the largest connected region of the OD and OC masks is selected to obtain the final segmentation result.
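The final post-processing step, keeping only the maximum connected region, can be sketched in pure Python as a simple flood fill. The paper does not specify its implementation; 4-connectivity is assumed here:

```python
from collections import deque

def largest_connected_region(mask):
    """Keep only the largest 4-connected foreground region of a binary
    mask (list of lists of 0/1), zeroing out smaller spurious blobs."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] == 1 and not seen[i][j]:
                # breadth-first flood fill to collect one component
                comp, queue = [], deque([(i, j)])
                seen[i][j] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out
```

Because the OD and OC are each a single roughly elliptical region, discarding all but the largest component removes isolated false-positive pixels from the network output.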
Result
Four datasets (ORIGA, Drishti-GS1, Refuge, and RIM-ONE-R1) are employed to validate the performance of the proposed method. The results are compared with various state-of-the-art methods, including U-Net, M-Net, and CE-Net. The ORIGA dataset contains 650 color fundus images of 3 072 × 2 048 pixels, and the ratio of the training set to the test set is 1∶1 in the experiment. The Drishti-GS1 dataset contains 101 images, including 31 normal images and 70 diseased images. These fundus images are divided into two subsets, Groups A and B, which include 50 training samples and 51 testing samples, respectively. The 400 fundus images in the Refuge dataset are also divided into two subsets: Group A includes 320 training samples, while Group B includes 80 testing samples. The Jaccard index and F-measure are used to evaluate the results of OD and OC segmentation. The results indicate that in the ORIGA dataset, the Jaccard index and F-measure of the proposed method in OD/OC segmentation are 0.939 1/0.794 8 and 0.968 6/0.885 5, respectively. In the Drishti-GS1 dataset, the results in OD/OC segmentation are 0.951 3/0.863 3 and 0.975 0/0.926 6, respectively. In the Refuge dataset, the results are 0.929 8/0.828 8 and 0.963 6/0.906 3, respectively. In the RIM-ONE-R1 dataset, the results of OD segmentation are 0.929 0 and 0.962 8. The results of the proposed method on the four datasets are all better than those of its counterparts, and the performance of the network is significantly improved. In addition, we conduct ablation experiments for the primary modules proposed in the network, where we perform comparative experiments with respect to the location of the modules, the parameters in the model, and other factors. The results of the ablation experiments demonstrate the effectiveness of each proposed module in RCPA-Net.
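The two evaluation metrics above can be computed from binary masks as follows. This is the standard definition of the Jaccard index and the F-measure (which, for binary masks with equal precision/recall weighting, coincides with the Dice coefficient), not code from the paper:

```python
def jaccard(pred, gt):
    # Jaccard index (IoU) between two binary masks given as flat 0/1 lists
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    return inter / union if union else 1.0

def f_measure(pred, gt, eps=1e-9):
    # harmonic mean of precision and recall over foreground pixels
    tp = sum(1 for p, g in zip(pred, gt) if p and g)
    fp = sum(1 for p, g in zip(pred, gt) if p and not g)
    fn = sum(1 for p, g in zip(pred, gt) if not p and g)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return 2 * precision * recall / (precision + recall + eps)
```

For instance, a prediction that overlaps the ground truth on one of three foreground pixels yields a Jaccard index of 1/3 and an F-measure of 0.5.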
Conclusion
In this study, we propose RCPA-Net, which combines the advantages of deep segmentation models. Images predicted by RCPA-Net are closer to the ground-truth labels, providing more accurate segmentation of the OD and OC than several state-of-the-art methods. The experiments demonstrate the high effectiveness and generalization ability of RCPA-Net.
视杯视盘分割; 深度学习; 注意力机制; 残差空洞卷积; 路径增强
optic disc and optic cup segmentation; deep learning; attention mechanism; residual atrous convolution; path augmentation
Akram M U, Tariq A, Khalid S, Javed M Y, Abbas S and Yasin U U. 2015. Glaucoma detection using novel optic disc localization, hybrid feature set and classification techniques. Australasian Physical and Engineering Sciences in Medicine, 38(4): 643-655 [DOI: 10.1007/s13246-015-0377-y]
Al-Bander B, Williams B M, Al-Nuaimy W, Al-Taee M A, Pratt H and Zheng Y L. 2018. Dense fully convolutional segmentation of the optic disc and cup in colour fundus for glaucoma diagnosis. Symmetry, 10(4): #87 [DOI: 10.3390/sym10040087]
Aquino A, Gegúndez-Arias M E and Marín D. 2010. Detecting the optic disc boundary in digital fundus images using morphological, edge detection, and feature extraction techniques. IEEE Transactions on Medical Imaging, 29(11): 1860-1869 [DOI: 10.1109/TMI.2010.2053042]
Chen L C, Zhu Y K, Papandreou G, Schroff F and Adam H. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation [EB/OL]. [2023-03-09]. https://arxiv.org/pdf/1802.02611.pdf
Cheng J, Liu J, Xu Y W, Yin F S, Wong D W K, Tan N M, Tao D C, Cheng C Y, Aung T and Wong T Y. 2013. Superpixel classification based optic disc and optic cup segmentation for glaucoma screening. IEEE Transactions on Medical Imaging, 32(6): 1019-1032 [DOI: 10.1109/TMI.2013.2247770]
Feng S L, Zhao H M, Shi F, Cheng X N, Wang M, Ma Y H, Xiang D H, Zhu W F and Chen X J. 2020. CPFNet: context pyramid fusion network for medical image segmentation. IEEE Transactions on Medical Imaging, 39(10): 3008-3018 [DOI: 10.1109/TMI.2020.2983721]
Fu H Z, Cheng J, Xu Y W, Wong D W K, Liu J and Cao X C. 2018a. Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Transactions on Medical Imaging, 37(7): 1597-1605 [DOI: 10.1109/TMI.2018.2791488]
Fu H Z, Cheng J, Xu Y W, Zhang C Q, Wong D W K, Liu J and Cao X C. 2018b. Disc-aware ensemble network for glaucoma screening from fundus image. IEEE Transactions on Medical Imaging, 37(11): 2493-2501 [DOI: 10.1109/TMI.2018.2837012]
Fumero F, Alayon S, Sanchez J L, Sigut J and Gonzalez-Hernandez M. 2011. RIM-ONE: an open retinal image database for optic nerve evaluation//The 24th International Symposium on Computer-Based Medical Systems. Bristol, UK: IEEE: 1-6 [DOI: 10.1109/CBMS.2011.5999143]
Gu Z W, Cheng J, Fu H Z, Zhou K, Hao H Y, Zhao Y T, Zhang T Y, Gao S H and Liu J. 2019. CE-Net: context encoder network for 2D medical image segmentation. IEEE Transactions on Medical Imaging, 38(10): 2281-2292 [DOI: 10.1109/TMI.2019.2903562]
Gu Z W, Liu P, Zhou K, Jiang Y M, Mao H Y, Cheng J and Liu J. 2018. DeepDisc: optic disc segmentation based on atrous convolution and spatial pyramid pooling//Proceedings of the 1st International Workshop on Computational Pathology and Ophthalmic Medical Image Analysis. Granada, Spain: Springer: 253-260 [DOI: 10.1007/978-3-030-00949-6_30]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141 [DOI: 10.1109/CVPR.2018.00745]
Jiang Y M, Duan L X, Cheng J, Gu Z W, Xia H, Fu H Z, Li C S and Liu J. 2020. JointRCNN: a region-based convolutional neural network for optic disc and cup segmentation. IEEE Transactions on Biomedical Engineering, 67(2): 335-343 [DOI: 10.1109/TBME.2019.2913211]
Lin T Y, Goyal P, Girshick R, He K M and Dollár P. 2017. Focal loss for dense object detection//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2999-3007 [DOI: 10.1109/ICCV.2017.324]
Liu H P, Zhao Y H, Hou X D, Guo H Y and Ding M Y. 2021. Optic disc and cup segmentation by combining context and attention. Journal of Image and Graphics, 26(5): 1041-1057
刘洪普, 赵一浩, 侯向丹, 郭鸿湧, 丁梦园. 2021. 融合上下文和注意力的视盘视杯分割. 中国图象图形学报, 26(5): 1041-1057 [DOI: 10.11834/jig.200257]
Liu S, Qi L, Qin H F, Shi J P and Jia J Y. 2018. Path aggregation network for instance segmentation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8759-8768 [DOI: 10.1109/CVPR.2018.00913]
Long J, Shelhamer E and Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3431-3440 [DOI: 10.1109/CVPR.2015.7298965]
Lowell J, Hunter A, Steel D, Basu A, Ryder R, Fletcher E and Kennedy L. 2004. Optic nerve head segmentation. IEEE Transactions on Medical Imaging, 23(2): 256-264 [DOI: 10.1109/TMI.2003.823261]
Maninis K K, Pont-Tuset J, Arbeláez P and Van Gool L. 2016. Deep retinal image understanding//Proceedings of the 19th International Conference on Medical Image Computing and Computer-Assisted Intervention. Athens, Greece: Springer: 140-148 [DOI: 10.1007/978-3-319-46723-8_17]
Milletari F, Navab N and Ahmadi S A. 2016. V-Net: fully convolutional neural networks for volumetric medical image segmentation//Proceedings of the 4th International Conference on 3D Vision. Stanford, USA: IEEE: 565-571 [DOI: 10.1109/3DV.2016.79]
Mou L, Zhao Y T, Fu H Z, Liu Y H, Cheng J, Zheng Y L, Su P, Yang J L, Chen L, Frangi A F, Akiba M and Liu J. 2021. CS2-Net: deep learning segmentation of curvilinear structures in medical imaging. Medical Image Analysis, 67: #101874 [DOI: 10.1016/j.media.2020.101874]
Murugesan B, Sarveswaran K, Shankaranarayana S M, Ram K, Joseph J and Sivaprakasam M. 2019. Psi-Net: shape and boundary aware joint multi-task deep network for medical image segmentation//Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Berlin, Germany: IEEE: 7223-7226 [DOI: 10.1109/EMBC.2019.8857339]
Orlando J I, Fu H Z, Breda J B, van Keer K, Bathula D R, Diaz-Pinto A, Fang R G, Heng P A, Kim J, Lee J, Lee J, Li X X, Liu P, Lu S, Murugesan B, Naranjo V, Phaye S S R, Shankaranarayana S M, Sikka A, Son J, van den Hengel A, Wang S J, Wu J Y, Wu Z F, Xu G H, Xu Y L, Yin P S, Li F, Zhang X L, Xu Y W and Bogunović H. 2020. REFUGE challenge: a unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Medical Image Analysis, 59: #101570 [DOI: 10.1016/j.media.2019.101570]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Sivaswamy J, Krishnadas S, Chakravarty A and Ujjwal J G D. 2015. A comprehensive retinal image dataset for the assessment of glaucoma from the optic nerve head analysis. JSM Biomedical Imaging Data Papers, 2(1): 1004-1012
Surendiran J, Theetchenya S, Benson Mansingh P M, Sekar G, Dhipa M, Yuvaraj N, Arulkarthick V J, Suresh C, Sriram A, Srihari K and Alene A. 2022. Segmentation of optic disc and cup using modified recurrent neural network. BioMed Research International, 2022: #6799184 [DOI: 10.1155/2022/6799184]
Szegedy C, Ioffe S, Vanhoucke V and Alemi A. 2017. Inception-v4, inception-ResNet and the impact of residual connections on learning. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1): 4278-4284 [DOI: 10.1609/AAAI.v31i1.11231]
Tabassum M, Khan T M, Arsalan M, Naqvi S S, Ahmed M, Madni H A and Mirza J. 2020. CDED-Net: joint segmentation of optic disc and optic cup for glaucoma screening. IEEE Access, 8: 102733-102747 [DOI: 10.1109/ACCESS.2020.2998635]
Wang Q L, Wu B G, Zhu P F, Li P H, Zuo W M and Hu Q H. 2020. ECA-Net: efficient channel attention for deep convolutional neural networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11531-11539 [DOI: 10.1109/CVPR42600.2020.01155]
Wang S J, Yu L Q, Yang X, Fu C W and Heng P A. 2019. Patch-based output space adversarial learning for joint optic disc and cup segmentation. IEEE Transactions on Medical Imaging, 38(11): 2485-2495 [DOI: 10.1109/TMI.2019.2899910]
Xu Y W, Liu J, Lin S, Xu D, Cheung C Y, Aung T and Wong T Y. 2012. Efficient optic cup detection from intra-image learning with retinal structure priors//Proceedings of the 15th International Conference on Medical Image Computing and Computer-Assisted Intervention. Nice, France: Springer: 58-65 [DOI: 10.1007/978-3-642-33415-3_8]
Yu S, Xiao D, Frost S and Kanagasingam Y. 2019. Robust optic disc and cup segmentation with deep learning for glaucoma detection. Computerized Medical Imaging and Graphics, 74: 61-71 [DOI: 10.1016/j.compmedimag.2019.02.005]
Yuan X, Zhou L X, Yu S Y, Li M, Wang X and Zheng X J. 2021. A multi-scale convolutional neural network with context for joint segmentation of optic disc and cup. Artificial Intelligence in Medicine, 113: #102035 [DOI: 10.1016/j.artmed.2021.102035]
Zhang S H, Fu H Z, Yan Y G, Zhang Y B, Wu Q Y, Yang M, Tan M K and Xu Y W. 2019. Attention guided network for retinal image segmentation//Proceedings of the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention. Shenzhen, China: Springer: 797-805 [DOI: 10.1007/978-3-030-32239-7_88]
Zhang Z, Yin F S, Liu J, Wong W K, Tan N M, Lee B H, Cheng J and Wong T Y. 2010. ORIGA-light: an online retinal fundus image database for glaucoma analysis and research//Proceedings of 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology. Buenos Aires, Argentina: IEEE: 3065-3068 [DOI: 10.1109/IEMBS.2010.5626137]
Zhao X F, Lin T S and Li B. 2011. Fast automatic localization of optic disc in retinal images. Journal of South China University of Technology (Natural Science Edition), 39(2): 71-75, 80
赵晓芳, 林土胜, 李碧. 2011. 视网膜图像中视盘的快速自动定位方法. 华南理工大学学报(自然科学版), 39(2): 71-75, 80 [DOI: 10.3969/j.issn.1000-565X.2011.02.012]
Zhu Q L, Chen X J, Meng Q Q, Song J H, Luo G H, Wang M, Shi F, Chen Z Y, Xiang D H, Pan L J, Li Z Y and Zhu W F. 2021. GDCSeg-Net: general optic disc and cup segmentation network for multi-device fundus images. Biomedical Optics Express, 12(10): 6529-6544 [DOI: 10.1364/BOE.434841]
Zuiderveld K. 1994. Contrast limited adaptive histogram equalization//Graphics Gems. Boston: Academic Press: 474-485 [DOI: 10.1016/b978-0-12-336156-1.50061-6]