感受野扩增的轻量级病理图像聚焦质量评估网络
Lightweight focus quality assessment network for pathological image with amplified receptive field
2024年第29卷第11期,页码:3447-3461
纸质出版日期: 2024-11-16
DOI: 10.11834/jig.230676
丁维龙, 朱伟, 廖婉茵, 刘津龙, 汪春年, 祝行琴. 2024. 感受野扩增的轻量级病理图像聚焦质量评估网络. 中国图象图形学报, 29(11):3447-3461
Ding Weilong, Zhu Wei, Liao Wanyin, Liu Jinlong, Wang Chunnian, Zhu Xingqin. 2024. Lightweight focus quality assessment network for pathological image with amplified receptive field. Journal of Image and Graphics, 29(11):3447-3461
目的
病理切片扫描仪成像的数字病理图像的聚焦质量不佳,会严重影响肿瘤诊断的准确性。因此,开展对数字病理图像的聚焦质量评估的自动化算法至关重要。现有的聚焦质量评估主要采用深度学习方法,但常规的卷积神经网络(convolutional neural network, CNN)存在全局信息提取能力差和计算量过大问题。为此,提出一种感受野扩增的轻量级病理图像聚焦质量评估网络。
方法
该网络引入大卷积核来扩增网络的感受野,以捕获更多的全局信息。再利用新的双流大核注意力机制,增强对空间和通道上全局信息的提取能力。最后,将该网络优化为参数量递减的大型、中型和小型3个版本,以实现网络的轻量化。
结果
本文提出的大型网络比同类先进方法取得更优的性能。与本文的大型网络相比,优化后的小型网络牺牲了较小的性能,却取得参数量、计算量和CPU推理时间的显著下降。与同类轻量级网络SDCNN(self-defined convolutional neural network)相比,本文的小型网络在SRCC(Spearman’s rank correlation coefficient)、PLCC(Pearson linear correlation coefficient)和KRCC(Kendall rank correlation coefficient)等度量指标上分别提升了0.016 1、0.016 6和0.029 9,而参数量、计算量和CPU推理时间分别减少了39.06%、95.11%和51.91%。
结论
本文提出的方法可有效地提取数字病理图像的全局聚焦信息,且计算资源消耗更低,具有现实可行性。
Objective
Histopathology is the gold standard for tumor diagnosis. With the development of digital pathology slide scanners, digital pathology has introduced revolutionary changes to clinical pathological diagnosis. Pathologists use digital images to examine tissues and make diagnoses based on the characteristics of the observed tissues. These digital images can also be fed into computer-aided diagnostic systems for automated diagnosis, thereby speeding up diagnosis. However, digital pathology images can be blurred locally or globally by focusing errors produced during the scanning process. For pathologists, these blurred areas prevent accurate observation of tissue and cellular structures, leading to misdiagnosis. Therefore, studying focus quality assessment for pathological images is crucial. Existing methods for this task are based on machine learning or deep learning. In machine learning-based methods, features are designed manually with the help of a priori knowledge, such as optical or microscopic imaging, and fed into a classifier to automatically obtain focus predictions. However, these methods do not automatically learn the focus features in pathological images, resulting in low evaluation accuracy. In contrast, deep learning-based methods automatically learn complex features, substantially improving evaluation accuracy. Recent deep learning-based work enhances the processing of global focus information from pathological images by introducing attention mechanisms. However, the receptive field of these attention mechanisms is limited, so the global focus information they capture is inadequate. Moreover, existing networks with better performance require large numbers of parameters and computations, which hinders their application in practice.
In this paper, a focus quality assessment network with an amplified receptive field (ARF-FQANet) is proposed to address the challenges of poor global information extraction and excessive computation.
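The benefit of a large kernel can be quantified with the standard recurrence for the theoretical receptive field of stacked convolutions (cf. Luo et al., 2016). A minimal sketch, where the layer configurations are illustrative assumptions rather than ARF-FQANet's actual architecture:

```python
def receptive_field(layers):
    """Theoretical receptive field of a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple. Uses the
    standard recurrence: r <- r + dilation * (k - 1) * jump, jump <- jump * stride.
    """
    r, jump = 1, 1
    for k, stride, dilation in layers:
        r += dilation * (k - 1) * jump
        jump *= stride
    return r

# Three stacked 3x3 convs (stride 1) cover only a 7x7 region...
print(receptive_field([(3, 1, 1)] * 3))   # -> 7
# ...while a single 13x13 conv sees a 13x13 region at the very first layer.
print(receptive_field([(13, 1, 1)]))      # -> 13
```

This is why a large kernel enlarges the receptive field much faster than stacking small kernels, particularly at the initial layers of a network.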
Method
In ARF-FQANet, a large convolution kernel is used to amplify the receptive field of the network, and a dual-stream large kernel attention (DsLKA) mechanism is then integrated. In DsLKA, large kernel channel attention and large kernel spatial attention are proposed to capture global focus information in the channel and spatial dimensions, respectively. The proposed large kernel channel attention improves on the classical channel attention mechanism: the introduced large kernel retransmit squeeze (LKRS) method redistributes weights in space, thereby avoiding the loss of saliency weights that occurs in classical channel attention. However, local cellular semantic information gradually becomes salient as the input features are downsampled, which may degrade the capability of the network to represent focus information. A local stable downsampling block (LSDSB) is designed to address this problem. By integrating the LSDSB, extraneous information is minimized during upsampling and downsampling, thus ensuring the local stability of the features. A short branch is then introduced to create a residual attention block (RAB) based on the DsLKA and LSDSB modules. In this short branch, noise is extracted using a minimum pooling operation, which effectively suppresses the learning of noisy information during backpropagation and thus improves the capability of the network to represent focus information. In addition, an initial feature enhancement block (IFEB) is introduced at the initial stage of the network to strengthen the capability of the initial layer to represent focus information; the features obtained by the IFEB provide highly comprehensive information for the subsequent network. To obtain a lightweight network, a strategy for decomposing large convolution kernels is introduced, which substantially reduces the number of parameters and the computational requirements; the network parameters are further reduced to achieve additional compression.
The network is then optimized into three versions with successively fewer parameters: large, medium, and small.
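The kernel-decomposition strategy follows the large-kernel-attention line of work (Guo et al., 2023a; Ding et al., 2022): a K×K depthwise convolution is approximated by a (2d−1)×(2d−1) depthwise convolution, a ⌈K/d⌉×⌈K/d⌉ depthwise convolution with dilation d, and a 1×1 pointwise convolution. The kernel sizes, dilation factor, and channel count below are illustrative assumptions, not the paper's configuration:

```python
import math

def depthwise_params(k, channels):
    # one k x k filter per channel (biases omitted)
    return k * k * channels

def decomposed_params(k, d, channels):
    # LKA-style decomposition of a k x k depthwise conv with dilation factor d
    dw_local = (2 * d - 1) ** 2 * channels          # (2d-1) x (2d-1) depthwise
    dw_dilated = math.ceil(k / d) ** 2 * channels   # ceil(k/d) x ceil(k/d), dilation d
    pointwise = channels * channels                 # 1 x 1 conv mixing channels
    return dw_local + dw_dilated + pointwise

c = 64
print(depthwise_params(21, c))       # 28224 parameters for a full 21 x 21 kernel
print(decomposed_params(21, 3, c))   # 8832 parameters after decomposition
```

Even at this toy scale, the decomposition keeps the large effective receptive field while cutting the parameter count by roughly two thirds, which is the basic mechanism behind the lightweight versions.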
Result
Comparative experiments are performed on a publicly available dataset for focus quality assessment of pathology images. The compared networks are categorized as small, medium, and large according to their numbers of parameters. Among the large networks, the proposed large network performs best, with 0.765 8, 0.957 8, 0.956 2, and 0.852 3 for RMSE, SRCC, PLCC, and KRCC, respectively. These results show that the predicted focus scores are highly consistent with the actual focus scores. Among the small and medium networks, the performance of the proposed small and medium networks is slightly degraded, but their parameters and computational complexity are notably reduced. Compared with the self-defined convolutional neural network (SDCNN), the parameters, floating-point operations, and CPU inference time (CPU-Time) of the small network (ARF-FQANet-S) are reduced by 39.06%, 95.11%, and 51.91%, respectively. Although the small network does not outperform FocusLiteNN in speed, it still provides performance comparable to that of larger networks. The receptive fields of several networks are also visualized at different stages. The results indicate that the proposed ARF-FQANet obtains larger receptive fields, especially at the initial layer of the network; thus, more global focus information is obtained at the initial layer, which contributes to the stable performance of the small ARF-FQANet.
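The reported metrics are standard correlation and error measures between predicted and reference focus scores. A self-contained, tie-free sketch in plain NumPy (an illustration only; an actual evaluation would typically use library routines such as those in scipy.stats):

```python
import numpy as np

def _rank(x):
    # simple ranks 1..n; ties are not averaged in this sketch
    ranks = np.empty(len(x))
    ranks[np.argsort(x)] = np.arange(1, len(x) + 1)
    return ranks

def plcc(x, y):
    # Pearson linear correlation coefficient
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def srcc(x, y):
    # Spearman's rank correlation: Pearson correlation of the ranks
    return plcc(_rank(np.asarray(x, float)), _rank(np.asarray(y, float)))

def krcc(x, y):
    # Kendall rank correlation: (concordant - discordant) / total pairs
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    s = sum(np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
            for i in range(n) for j in range(i + 1, n))
    return float(s / (n * (n - 1) / 2))

def rmse(x, y):
    # root-mean-square error between predictions and references
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

# toy check: a perfectly monotone prediction gives SRCC = KRCC = 1
scores_pred, scores_ref = [0.1, 0.4, 0.5, 0.9], [1.0, 2.0, 3.0, 4.0]
print(srcc(scores_pred, scores_ref), krcc(scores_pred, scores_ref))
```

Higher SRCC/PLCC/KRCC (toward 1) and lower RMSE indicate predictions more consistent with the reference focus scores, which is how the networks above are ranked.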
Conclusion
Compared with similar methods, the proposed network efficiently extracts global focus information from pathological images. In this network, a large convolution kernel is used to expand the receptive field, and DsLKA is introduced to enhance the extraction of global information in the spatial and channel dimensions. This strategy ensures that the network maintains competitive performance even after notable parameter reductions. The small network (ARF-FQANet-S) offers remarkable advantages in CPU inference time and is well suited for lightweight deployment on edge devices. Overall, the results provide a technical reference for lightweight model design.
数字病理图像;聚焦质量评估;感受野扩增;注意力机制;轻量级
digital pathological images; focus quality assessment; amplified receptive field; attention mechanism; lightweight
Ameisen D, Deroulers C, Perrier V, Bouhidel F, Battistella M, Legrès L, Janin A, Bertheau P and Yunès J B. 2014. Towards better digital pathology workflows: programming libraries for high-speed sharpness assessment of whole slide images. Diagnostic Pathology, 9(1): 1-7 [DOI: 10.1186/1746-1596-9-S1-S3]
Dastidar T R. 2019. Automated focus distance estimation for digital microscopy using deep convolutional neural networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Long Beach, USA: IEEE: 1049-1056 [DOI: 10.1109/CVPRW.2019.00137]
Ding X H, Zhang X Y, Han J G and Ding G G. 2022. Scaling up your kernels to 31 × 31: revisiting large kernel design in CNNs//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 11953-11965 [DOI: 10.1109/CVPR52688.2022.01166]
Dong Y H, Cordonnier J B and Loukas A. 2021. Attention is not all you need: pure attention loses rank doubly exponentially with depth//Proceedings of the 38th International Conference on Machine Learning. Virtual Event: [s.n.]: 2793-2803 [DOI: 10.48550/arXiv.2103.03404]
Edelstein A D, Tsuchida M A, Amodaj N, Pinkard H, Vale R D and Stuurman N. 2014. Advanced methods of microscope control using μManager software. Journal of Biological Methods, 1(2): #e10 [DOI: 10.14440/jbm.2014.36]
Gao D S, Padfield D, Rittscher J and McKay R. 2010. Automated training data generation for microscopy focus classification//Proceedings of the 13th International Conference on Medical Image Computing and Computer-Assisted Intervention. Beijing, China: Springer: 446-453 [DOI: 10.1007/978-3-642-15745-5_55]
Guo K K, Liao J, Bian Z C, Heng X and Zheng G A. 2015. InstantScope: a low-cost whole slide imaging system with instant focal plane detection. Biomedical Optics Express, 6(9): 3210-3216 [DOI: 10.1364/BOE.6.003210]
Guo M H, Lu C Z, Liu Z N, Cheng M M and Hu S M. 2023a. Visual attention network. Computational Visual Media, 9(4): 733-752 [DOI: 10.1007/s41095-023-0364-2]
Guo M H, Xu T X, Liu J J, Liu Z N, Jiang P T, Mu T J, Zhang S H, Martin R R, Cheng M M and Hu S M. 2022. Attention mechanisms in computer vision: a survey. Computational Visual Media, 8(3): 331-368 [DOI: 10.1007/s41095-022-0271-y]
Guo Y F, Hu M H, Min X K, Wang Y, Dai M, Zhai G T, Zhang X P and Yang X K. 2023b. Blind image quality assessment for pathological microscopic image under screen and immersion scenarios. IEEE Transactions on Medical Imaging, 42(11): 3295-3306 [DOI: 10.1109/TMI.2023.3282387]
Hashimoto N, Bautista P A, Yamaguchi M, Ohyama N and Yagi Y. 2012. Referenceless image quality evaluation for whole slide imaging. Journal of Pathology Informatics, 3(1): #9 [DOI: 10.4103/2153-3539.93891]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Hosseini M S, Zhang Y Y and Plataniotis K N. 2019. Encoding visual sensitivity by MaxPol convolution filters for image sharpness assessment. IEEE Transactions on Image Processing, 28(9): 4510-4525 [DOI: 10.1109/TIP.2019.2906582]
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141 [DOI: 10.1109/CVPR.2018.00745]
Jiang S W, Liao J, Bian Z C, Guo K K, Zhang Y B and Zheng G A. 2018. Transform- and multi-domain deep learning for single-frame rapid autofocusing in whole slide imaging. Biomedical Optics Express, 9(4): 1601-1612 [DOI: 10.1364/BOE.9.001601]
Jiménez A, Bueno G, Cristóbal G, Déniz O, Toomey D and Conway C. 2016. Image quality metrics applied to digital pathology//Proceedings Volume 9896, Optics, Photonics and Digital Technologies for Imaging Applications IV. Brussels, Belgium: SPIE: 170-187 [DOI: 10.1117/12.2230655]
Jin X, Wen K, Lyu G F, Shi J, Chi M X, Wu Z and An H. 2020. Survey on the applications of deep learning to histopathology. Journal of Image and Graphics, 25(10): 1982-1993
金旭, 文可, 吕国锋, 石军, 迟孟贤, 武铮, 安虹. 2020. 深度学习在组织病理学中的应用综述. 中国图象图形学报, 25(10): 1982-1993 [DOI: 10.11834/jig.200460]
Kohlberger T, Liu Y, Moran M, Chen P H C, Brown T, Hipp J D, Mermel C H and Stumpe M C. 2019. Whole-slide image focus quality: automatic assessment and impact on AI cancer detection. Journal of Pathology Informatics, 10(1): #39 [DOI: 10.4103/jpi.jpi_11_19]
Krizhevsky A, Sutskever I and Hinton G E. 2017. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6): 84-90 [DOI: 10.1145/3065386]
Li B, Keikhosravi A, Loeffler A G and Eliceiri K W. 2021. Single image super-resolution for whole slide image using convolutional neural networks and self-supervised color normalization. Medical Image Analysis, 68: #101938 [DOI: 10.1016/j.media.2020.101938]
Li J, Chen J Y, Tang Y C, Wang C, Landman B A and Zhou S K. 2023. Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives. Medical Image Analysis, 85: #102762 [DOI: 10.1016/j.media.2023.102762]
Li Q, Liu X M, Han K G, Guo C, Jiang J J, Ji X Y and Wu X L. 2022. Learning to autofocus in whole slide imaging via physics-guided deep cascade networks. Optics Express, 30(9): 14319-14340 [DOI: 10.1364/OE.416824]
Liang Z Y, Wang Q Y, Liao H W, Zhao M, Lee J, Yang C, Li F Y and Ling D S. 2021. Artificially engineered antiferromagnetic nanoprobes for ultra-sensitive histopathological level magnetic resonance imaging. Nature Communications, 12(1): #3840 [DOI: 10.1038/s41467-021-24055-2]
Luo W J, Li Y J, Urtasun R and Zemel R. 2016. Understanding the effective receptive field in deep convolutional neural networks//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc.: 4905-4913
Moorthy A K and Bovik A C. 2011. Blind image quality assessment: from natural scene statistics to perceptual quality. IEEE Transactions on Image Processing, 20(12): 3350-3364 [DOI: 10.1109/TIP.2011.2147325]
Nair T, Pour A F and Chuang J H. 2020. The effect of blurring on lung cancer subtype classification accuracy of convolutional neural networks//Proceedings of 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Seoul, Korea (South): IEEE: 2987-2989 [DOI: 10.1109/BIBM49941.2020.9313192]
Otsu N. 1979. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1): 62-66 [DOI: 10.1109/TSMC.1979.4310076]
Patel A, Balis U G J, Cheng J, Li Z B, Lujan G, McClintock D S, Pantanowitz L and Parwani A. 2021. Contemporary whole slide imaging devices and their applications within the modern pathology department: a selected hardware review. Journal of Pathology Informatics, 12(1): #50 [DOI: 10.4103/jpi.jpi_66_21]
Sandler M, Howard A, Zhu M L, Zhmoginov A and Chen L C. 2018. MobileNetV2: inverted residuals and linear bottlenecks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4510-4520 [DOI: 10.1109/CVPR.2018.00474]
Song J and Liu M J. 2023. A new methodology in constructing no-reference focus quality assessment metrics. Pattern Recognition, 142: #109688 [DOI: 10.1016/j.patcog.2023.109688]
Sun W, Min X K, Tu D Y, Ma S W and Zhai G T. 2023. Blind quality assessment for in-the-wild images via hierarchical feature fusion and iterative mixed database training. IEEE Journal of Selected Topics in Signal Processing, 17(6): 1178-1192 [DOI: 10.1109/JSTSP.2023.3270621]
Suvorov R, Logacheva E, Mashikhin A, Remizova A, Ashukha A, Silvestrov A, Kong N, Goka H, Park K and Lempitsky V. 2022. Resolution-robust large mask inpainting with Fourier convolutions//Proceedings of 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE: 3172-3182 [DOI: 10.1109/WACV51458.2022.00323]
Wang Z L, Hosseini M S, Miles A, Plataniotis K N and Wang Z. 2020. FocusLiteNN: high efficiency focus quality assessment for digital pathology//Proceedings of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention. Lima, Peru: Springer: 403-413 [DOI: 10.1007/978-3-030-59722-1_39]
Xue Y F, Qian H L, Li X, Wang J, Ren K F and Ji J. 2022. A deep-learning-based workflow to deal with the defocusing problem in high-throughput experiments. Bioactive Materials, 11: 218-229 [DOI: 10.1016/j.bioactmat.2021.09.018]
Yang S J, Berndl M, Ando D M, Barch M, Narayanaswamy A, Christiansen E, Hoyer S, Roat C, Hung J, Rueden C T, Shankar A, Finkbeiner S and Nelson P. 2018. Assessing microscope image focus quality with deep learning. BMC Bioinformatics, 19: #77 [DOI: 10.1186/s12859-018-2087-4]
Zhang C Y, Gu Y, Yang J and Yang G Z. 2021. Diversity-aware label distribution learning for microscopy auto focusing. IEEE Robotics and Automation Letters, 6(2): 1942-1949 [DOI: 10.1109/LRA.2021.3061333]