Multi-scale image deblurring based on prompt learning and gated feedforward networks
2024, Pages: 1-14
Online publication date: 2024-08-15
DOI: 10.11834/jig.240315
Xie Bin, Li Yanxian, Shao Xiang, et al. Multi-scale image deblurring based on prompt learning and gated feedforward networks[J]. Journal of Image and Graphics, 2024: 1-14.
Objective
To address the problems of obvious artifacts, blurred details, and residual noise in conventional deep-learning-based deblurring methods, this paper proposes a new multi-scale image deblurring method based on prompt learning.
Method
First, building on a detailed analysis of conventional deblurring methods, a degradation-specific information encoding module based on prompt learning is introduced; the contextual information carried by the degradation is used to dynamically guide the deep network so that the deblurring task is completed more effectively. Second, a new gated feedforward network is designed, which controls the information flow at each level to build richer and more hierarchical feature representations, further improving the ability to understand and process complex data and better preserving the geometric structure of the resulting image. In addition, the classical total variation regularizer is introduced to suppress residual noise during deblurring and improve the visual quality of the resulting image.
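The abstract does not give the exact form or weight of the total variation term. As a rough illustration only, the following PyTorch sketch shows an anisotropic TV penalty added to an L1 reconstruction loss; the function names, the L1 term, and the weight tv_weight are assumptions, not the paper's actual objective.

import torch
import torch.nn.functional as F


def total_variation(img: torch.Tensor) -> torch.Tensor:
    """Anisotropic total variation of a batch of images shaped (B, C, H, W):
    the mean absolute difference between neighbouring pixels along each axis."""
    dh = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    dw = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    return dh + dw


def deblur_loss(restored: torch.Tensor, sharp: torch.Tensor,
                tv_weight: float = 1e-4) -> torch.Tensor:
    # reconstruction term plus a lightly weighted TV term to suppress residual noise
    return F.l1_loss(restored, sharp) + tv_weight * total_variation(restored)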
Result
Extensive experiments on the GoPro and REDS datasets show that the proposed method achieves better deblurring results than other advanced deep-learning-based methods. In terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), it achieves the best results of 33.04 dB and 0.962 on the GoPro dataset, and 28.70 dB and 0.859 on the REDS dataset; compared with SAM-deblur (segment anything model-deblur), the PSNR is improved by 1.77 dB.
Conclusion
Compared with other deblurring methods, the proposed method not only preserves the detail of the resulting images well but also effectively overcomes the problems of obvious artifacts and residual noise, and the results perform better on objective metrics such as PSNR and SSIM.
Objective
Image deblurring aims to restore a clean image from a blurry one while preserving the structure and details of the original image. With the rapid development of Internet technology, the ways in which people acquire images have become increasingly diverse; however, images are often blurred or distorted by various factors during acquisition, so deblurring is necessary. Image deblurring is of great significance for improving image quality and plays a key role in many fields, such as medical imaging, satellite image processing, and security monitoring, and it has therefore attracted the attention of many researchers. Because image deblurring is an ill-posed task, additional prior knowledge is needed to recover high-quality images. Existing deblurring methods can be divided into traditional methods and deep-learning-based methods. Among the traditional methods, filter-based deblurring is simple and convenient, but the recovered images often suffer from artifacts, content loss, and other problems and cannot meet the needs of various applications. Regularization-based deblurring has long attracted attention, and various ways of constructing regularization terms have been proposed to solve this kind of ill-posed problem. Although these traditional methods can achieve deblurring to a certain extent, they rely on image priors that are difficult to obtain accurately in practice, so they cannot be applied widely. With the broad adoption of deep learning, more and more researchers have begun to use it to solve this ill-posed problem. Deep-learning-based image deblurring methods fall into three main categories: methods based on convolutional neural networks (CNN), methods based on generative adversarial networks (GAN), and methods based on the Transformer. In CNN-based methods, the powerful feature-extraction capability of CNNs allows the model to learn complex mapping relationships, and minimizing the loss function guides the model toward the best output images. However, such methods lack multi-scale features and can produce artifacts and loss of image detail. To make up for these deficiencies, researchers proposed the GAN framework, in which the generator and discriminator are trained alternately to continuously improve the generator and thus obtain higher-quality resulting images. Owing to the success of the Transformer in natural language processing, researchers have begun to introduce it into image processing; its advantage is that the model can better capture long-range context information for image deblurring, but using Transformer blocks inevitably increases the computational complexity of the model. Aiming at the problems of obvious artifacts, blurred details, and residual noise in previous image deblurring methods, a novel multi-scale image deblurring method based on prompt learning is proposed.
Method
In this paper, three improvements are made. First, a degradation-specific information encoding module based on prompt learning uses the contextual information contained in the degraded image to dynamically guide the deep network to complete different image deblurring tasks. Second, a gated feedforward network (GFFN) is designed to control the flow of information at each level and build richer, more hierarchical feature representations; on this basis, a prompt U-shaped block (PUBlock) is constructed. In addition, an adaptive total variation regularization term is added to the original loss function to effectively suppress residual noise during image restoration and improve the visual quality of the resulting image. In general, the gating mechanism lets the network dynamically control the flow of information and thus capture complex feature relationships more effectively; depthwise convolution improves the efficiency of the model while preserving its performance; prompt learning helps the model better exploit the degradation information; and the adaptive regularization selectively smooths the image, removing noise while preventing over-smoothing.
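The abstract names the main building blocks but not their internal layers. The PyTorch sketch below shows one plausible reading: a gated feedforward block in the spirit of Restormer's gated-dconv design, and a prompt module in the spirit of PromptIR. Every layer choice, channel count, and the names GatedFeedForward and PromptBlock are assumptions rather than the paper's actual PUBlock architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedFeedForward(nn.Module):
    """Gated feedforward block: features are expanded by a 1x1 convolution,
    mixed spatially by a cheap depthwise convolution, split into a content
    branch and a gate branch, and the GELU-activated gate modulates the content."""

    def __init__(self, channels: int, expansion: float = 2.0):
        super().__init__()
        hidden = int(channels * expansion)
        self.project_in = nn.Conv2d(channels, hidden * 2, kernel_size=1)
        self.dwconv = nn.Conv2d(hidden * 2, hidden * 2, kernel_size=3,
                                padding=1, groups=hidden * 2)  # depthwise
        self.project_out = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        content, gate = self.dwconv(self.project_in(x)).chunk(2, dim=1)
        return self.project_out(content * F.gelu(gate))


class PromptBlock(nn.Module):
    """Prompt module: a small bank of learnable prompt components is mixed with
    weights predicted from the (degraded) features, resized to the feature
    resolution, and fused back in with a 3x3 convolution."""

    def __init__(self, channels: int, prompt_dim: int = 64,
                 num_prompts: int = 5, prompt_size: int = 16):
        super().__init__()
        self.prompts = nn.Parameter(
            torch.randn(num_prompts, prompt_dim, prompt_size, prompt_size))
        self.to_weights = nn.Linear(channels, num_prompts)
        self.fuse = nn.Conv2d(channels + prompt_dim, channels,
                              kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # degradation-aware mixing weights from globally pooled features
        weights = torch.softmax(self.to_weights(x.mean(dim=(2, 3))), dim=1)
        prompt = torch.einsum('bn,nchw->bchw', weights, self.prompts)
        prompt = F.interpolate(prompt, size=(h, w), mode='bilinear',
                               align_corners=False)
        return self.fuse(torch.cat([x, prompt], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 32, 64, 64)
    out = GatedFeedForward(32)(PromptBlock(32)(feats))
    print(out.shape)  # torch.Size([1, 32, 64, 64])

The small usage example only checks that feature shapes are preserved; how such blocks are stacked inside the multi-scale U-shaped network is not specified by the abstract.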
Result
To demonstrate the effectiveness of the proposed method, deblurring experiments are performed on the GoPro and REDS datasets and compared with other advanced methods, with peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as the objective evaluation metrics. The experimental results show that the proposed method outperforms the other methods on both datasets, achieving 33.04 dB and 0.962 on the GoPro dataset and 28.70 dB and 0.859 on the REDS dataset, which are better than the PSNR and SSIM values of conventional image deblurring methods. Compared with the SAM-deblur (segment anything model-deblur) algorithm, PSNR improves by 1.77 dB on the REDS dataset; compared with DFFA-Net (double-scale network with deep feature fusion attention) on the GoPro dataset, the proposed method improves PSNR and SSIM by 0.49 dB and 0.005, respectively. In addition, the visual results show that the images recovered by the proposed model are closest to the original sharp images, maintaining their structure and features with finer edges.
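For reference, the PSNR and SSIM figures quoted above can be computed per image pair with scikit-image's standard implementations; the helper name evaluate_pair, the 8-bit data range, and the channel_axis handling are assumptions, and the paper's own evaluation protocol (e.g., color space or averaging over the test set) may differ.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def evaluate_pair(restored: np.ndarray, sharp: np.ndarray) -> tuple[float, float]:
    """PSNR and SSIM for one 8-bit RGB image pair shaped (H, W, 3); higher is better."""
    psnr = peak_signal_noise_ratio(sharp, restored, data_range=255)
    ssim = structural_similarity(sharp, restored, data_range=255, channel_axis=-1)
    return psnr, ssim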
Conclusion
In this paper, aiming at the problems of existing image deblurring methods, a novel multi-scale image deblurring method based on prompt learning is proposed. The experimental results show that the new method not only preserves the details of the resulting image but also effectively overcomes the problems of obvious artifacts and residual noise, and the resulting images perform better on objective evaluation metrics such as PSNR and SSIM.
image deblurring; prompt learning; multi-scale; gated feedforward networks; depthwise convolution
Abdelhamed A, Lin S and Brown M S. 2018. A high-quality denoising dataset for smartphone cameras//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1692-1700 [DOI: 10.1109/CVPR.2018.00182]
Buades A, Coll B and Morel J M. 2006. Image enhancement by non-local reverse heat equation. CMLA Preprint, Centre de Mathématiques et Leurs Applications, 22(5): 2001-2016. https://www.researchgate.net/publication/228353108_Image_enhancement_by_non-local_reverse_heat_equation
Ba J L, Kiros J R and Hinton G E. 2016. Layer normalization. arXiv preprint [DOI: 10.48550/arXiv.1607.06450]
Chang M M, Tekalp A M and Erdem A T. 1990. Blur identification using bispectrum//Proceedings of 1990 International Conference on Acoustics, Speech, and Signal Processing. Albuquerque, USA: IEEE: 1961-1964 [DOI: 10.1109/ICASSP.1990.115892]
Cai J, Zuo W and Zhang L. 2020. Dark and bright channel prior embedded network for dynamic scene deblurring. IEEE Transactions on Image Processing, 29: 6885-6897 [DOI: 10.1109/TIP.2020.2995048]
Chang M, Li Q, Feng H and Xu Z. 2020. Spatial-adaptive network for single image denoising//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 171-187 [DOI: 10.1007/978-3-030-58577-8_11]
Cho S J, Ji S W, Hong J P, Jung S W and Ko S J. 2021. Rethinking coarse-to-fine approach in single image deblurring//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 4621-4630 [DOI: 10.1109/ICCV48922.2021.00460]
Chen L, Chu X, Zhang X and Sun J. 2022. Simple baselines for image restoration//Proceedings of 2022 European Conference on Computer Vision. Tel-Aviv, Israel: Springer: 17-33 [DOI: 10.1007/978-3-031-20071-7_2]
Chen J B, Xiong B S, Kuang F and Zhang Z Z. 2023. Motion deblurring based on deep feature fusion attention and double-scale. Journal of Image and Graphics, 28(12): 3731-3743 [DOI: 10.11834/jig.220931]
Fan C M, Liu T J, Liu K H and Chiu C H. 2022. Selective residual M-net for real image denoising//Proceedings of the 30th European Signal Processing Conference (EUSIPCO). Belgrade, Serbia: IEEE: 469-473 [DOI: 10.23919/EUSIPCO55093.2022.9909521]
Jiang Z, Zhang Y, Zou D, Ren J, Lv J and Liu Y. 2020. Learning event-based motion deblurring//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 3317-3326 [DOI: 10.1109/CVPR42600.2020.00338]
Khare A and Tiwary U S. 2005. A new method for deblurring and denoising of medical images using complex wavelet transform//Proceedings of 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference. Shanghai, China: IEEE: 1897-1900 [DOI: 10.1109/IEMBS.2005.1616821]
Köhler R, Hirsch M, Mohler B, Schölkopf B and Harmeling S. 2012. Recording and playback of camera shake: Benchmarking blind deconvolution with a real-world database//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer: 27-40 [DOI: 10.1007/978-3-642-33786-4_3]
Kheradmand A and Milanfar P. 2014. A general framework for regularized, similarity-based image restoration. IEEE Transactions on Image Processing, 23(12): 5136-5151 [DOI: 10.1109/TIP.2014.2362059]
Kupyn O, Budzan V, Mykhailych M, Mishkin D and Matas J. 2018. DeblurGAN: Blind motion deblurring using conditional adversarial networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8183-8192 [DOI: 10.1109/CVPR.2018.00854]
Kupyn O, Martyniuk T, Wu J and Wang Z. 2019. DeblurGAN-v2: Deblurring (orders-of-magnitude) faster and better//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE: 8877-8886 [DOI: 10.1109/ICCV.2019.00897]
Levin A. 2007. Blind motion deblurring using image statistics. Advances in Neural Information Processing Systems, 19: 841-848 [DOI: 10.7551/mitpress/7503.001.0001]
Li S, Liu M, Zhang Y, Chen S, Li H, Dou Z and Chen H. 2024. SAM-Deblur: Let segment anything boost image deblurring//Proceedings of the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul, Korea: IEEE: 2445-2449 [DOI: 10.1109/ICASSP48485.2024.10445844]
Nah S, Hyun Kim T and Mu Lee K. 2017. Deep multi-scale convolutional neural network for dynamic scene deblurring//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 257-265 [DOI: 10.1109/CVPR.2017.35]
Nah S, Son S, Lee S, Timofte R, Lee K M, Chen L and Jeong J. 2021. NTIRE 2021 challenge on image deblurring//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Nashville, USA: IEEE: 149-165 [DOI: 10.1109/CVPRW53098.2021.00025]
Park D, Kang D U, Kim J and Chun S Y. 2020. Multi-temporal recurrent neural networks for progressive non-uniform single image deblurring with incremental temporal training//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 327-343 [DOI: 10.1007/978-3-030-58539-6_20]
Potlapalli V, Zamir S W, Khan S and Khan F S. 2023. PromptIR: Prompting for all-in-one blind image restoration. arXiv preprint [DOI: 10.48550/arXiv.2306.13090]
Rudin L I, Osher S and Fatemi E. 1992. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1-4): 259-268 [DOI: 10.1016/0167-2789(92)90242-F]
Sun J, Cao W, Xu Z and Ponce J. 2015. Learning a convolutional neural network for non-uniform motion blur removal//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 769-777 [DOI: 10.1109/CVPR.2015.7298677]
Tao X, Gao H, Shen X, Wang J and Jia J. 2018. Scale-recurrent network for deep image deblurring//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8174-8182 [DOI: 10.1109/CVPR.2018.00853]
Tu Z, Talebi H, Zhang H, Yang F, Milanfar P, Bovik A and Li Y. 2022. MAXIM: Multi-axis MLP for image processing//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 5759-5770 [DOI: 10.1109/CVPR52688.2022.00568]
Wang Z M and Bao H. 2013. A new regularization model based on non-local means for image deblurring. Applied Mechanics and Materials, 411(1): 1164-1169 [DOI: 10.4028/www.scientific.net/AMM.411-414.1164]
Wang Z, Cun X, Bao J, Zhou W, Liu J and Li H. 2022. Uformer: A general U-shaped transformer for image restoration//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 17662-17672 [DOI: 10.1109/CVPR52688.2022.01716]
Yue Z, Zhao Q, Zhang L and Meng D. 2020. Dual adversarial network: Toward real-world noise removal and noise generation//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 41-58 [DOI: 10.1007/978-3-030-58607-2_3]
Zhang H, Yang J, Zhang Y and Huang T S. 2011. Sparse representation based blind image deblurring//Proceedings of 2011 IEEE International Conference on Multimedia and Expo. Barcelona, Spain: IEEE: 1-6 [DOI: 10.1109/ICME.2011.60120]
Zhang J, Pan J, Ren J, Song Y, Bao L, Lau R W and Yang M H. 2018. Dynamic scene deblurring using spatially variant recurrent neural networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2521-2529 [DOI: 10.1109/CVPR.2018.00267]
Zhang H, Dai Y, Li H and Koniusz P. 2019. Deep stacked hierarchical multi-patch network for image deblurring//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5971-5979 [DOI: 10.1109/CVPR.2019.00613]
Zhang K, Luo W, Zhong Y, Ma L, Stenger B, Liu W and Li H. 2020. Deblurring by realistic blurring//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 2734-2743 [DOI: 10.1109/CVPR42600.2020.00281]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S, Yang M H and Shao L. 2020. Learning enriched features for real image restoration and enhancement//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 492-511 [DOI: 10.1007/978-3-030-58595-2_30]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S, Yang M H and Shao L. 2020. CycleISP: Real image restoration via improved data synthesis//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 2693-2702 [DOI: 10.1109/CVPR42600.2020.00277]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S, Yang M H and Shao L. 2021. Multi-stage progressive image restoration//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 14816-14826 [DOI: 10.1109/CVPR46437.2021.01458]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S and Yang M H. 2022. Restormer: Efficient transformer for high-resolution image restoration//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 5718-5729 [DOI: 10.1109/CVPR52688.2022.00564]