图像复原中自注意力和卷积的动态关联学习
Dynamic association learning of self-attention and convolution in image restoration
2024, Vol. 29, No. 4, Pages 890-907
Print publication date: 2024-04-16
DOI: 10.11834/jig.230323
江奎, 贾雪梅, 黄文心, 王文兵, 王正, 江俊君. 2024. 图像复原中自注意力和卷积的动态关联学习. 中国图象图形学报, 29(04):0890-0907
Jiang Kui, Jia Xuemei, Huang Wenxin, Wang Wenbing, Wang Zheng, Jiang Junjun. 2024. Dynamic association learning of self-attention and convolution in image restoration. Journal of Image and Graphics, 29(04):0890-0907
目的
卷积神经网络(convolutional neural network, CNN)和自注意力(self-attention, SA)在多媒体应用领域已经取得了巨大的成功。然而,鲜有研究人员能够在图像修复任务中有效地协调这两种架构。针对这两种架构各自的优缺点,提出了一种关联学习的方式以综合利用两种方法的优点并抑制各自的不足,实现高质高效的图像修复。
方法
本文结合CNN和SA两种架构的优势,尤其是在特定的局部上下文和全局结构表示中充分利用CNN的局部感知和平移不变性,以及SA的全局聚合能力。此外,图像的降质分布揭示了图像空间中退化的位置和程度。受此启发,本文在背景修复中引入退化先验,并据此提出一种动态关联学习的图像修复方法。核心是一个新的多输入注意力模块,将降质扰动的消除和背景修复关联起来。通过结合深度可分离卷积,利用CNN和SA两种架构的优势实现高效率和高质量图像修复。
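The efficiency gain from depthwise separable convolution mentioned above comes from splitting a standard convolution into a per-channel (depthwise) filter and a 1 × 1 (pointwise) channel mixer. A minimal arithmetic sketch (the channel and kernel sizes are illustrative, not the paper's actual configuration):

```python
def conv_params(c_in, c_out, k):
    # standard convolution: each of the c_out filters spans all c_in channels
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    # depthwise: one k x k filter per input channel;
    # pointwise: a 1 x 1 convolution that mixes the channels
    return c_in * k * k + c_in * c_out

std = conv_params(64, 64, 3)          # 36864
sep = dw_separable_params(64, 64, 3)  # 576 + 4096 = 4672
print(std, sep, round(std / sep, 1))  # roughly 7.9x fewer parameters
```

The same ratio applies to multiply-accumulate operations at each spatial position, which is why the combination keeps the hybrid CNN/SA model lightweight.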
结果
在Test1200数据集中进行了消融实验以验证算法各个部分的有效性,实验结果证明CNN和SA的融合可以有效提升模型的表达能力;同时,降质扰动的消除和背景修复关联学习可以有效提升整体的修复效果。本文方法在3个图像修复任务的合成和真实数据上与其他10余种方法进行了比较,提出的方法取得了显著的提升。在图像去雨任务上,本文提出的ELF(image deeraining meets association learning and Transformer)方法在合成数据集Test1200上,相比于MPRNet(multi-stage progressive image restoration network),PSNR(peak signal-to-noise ratio)值提高了0.9 dB;在水下图像增强任务上,ELF在R90数据集上超过Ucolor方法4.15 dB;在低照度图像增强任务上,相对于LLFlow(flow-based low-light image enhancement)算法,ELF获得了1.09 dB的提升。
结论
本文方法在效果和性能上具有优势,在常见的图像去雨、低照度图像增强和水下图像修复等任务上优于代表性的方法。
Objective
Convolutional neural networks (CNNs) and self-attention (SA) have achieved great success in multimedia applications. However, owing to the intrinsic characteristics of local connectivity and translation equivariance, CNNs have at least two shortcomings: 1) a limited receptive field and 2) static sliding-window weights at inference, which cannot adapt to content diversity. The former prevents the network from capturing long-range pixel dependencies, while the latter sacrifices adaptability to the input content. As a result, CNNs fall short of modeling the global rain distribution and produce results with obvious rain residue. Meanwhile, because SA is computed globally, its computational complexity grows quadratically with the spatial resolution, making it infeasible for high-resolution images. In view of the advantages and disadvantages of these two architectures, this study proposes an association learning method that comprehensively exploits the strengths of both while suppressing their respective shortcomings, achieving high-quality and efficient image restoration.
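The quadratic cost noted above follows directly from the attention matrix, which pairs every token with every other token. A minimal pure-Python sketch of scaled dot-product self-attention (illustrative only; for brevity the query, key, and value all equal the input):

```python
import math

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def matmul(a, b):
    # a: n x d, b: d x m -> n x m
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def self_attention(x):
    # x: n tokens x d features; here query = key = value = x
    d = len(x[0])
    scores = matmul(x, transpose(x))            # n x n matrix: quadratic in n
    scores = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, x)

# Even a small 64 x 64 image flattened to n = 4096 tokens would need a
# 4096 x 4096 attention matrix -- the quadratic growth described above.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(x)
print(len(out), len(out[0]))  # 3 2
```

Each output row is a convex combination of all input rows, which is exactly the global aggregation ability that CNNs lack.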
Method
This study combines the advantages of CNN and SA architectures, particularly by fully exploiting CNNs' local perception and translation invariance for local context and SA's global aggregation ability for global structure representation. We take inspiration from the observation that the rain distribution reflects the location and degree of degradation, beyond serving rain distribution prediction alone. Therefore, we propose to refine background textures with the predicted degradation prior in an association learning manner. We accomplish image deraining by associating rain streak removal with background recovery, designing a dedicated network for each of the two subtasks. The key to the association learning is a novel multi-input attention module (MAM). It generates the degradation prior and produces a degradation mask according to the predicted rain distribution. Benefiting from the global correlation calculation of SA, MAM can extract informative complementary components from the rainy input (query) with the degradation mask (key) and thus help realize accurate texture restoration. SA tends to aggregate feature maps by attention-weighted importance, whereas convolution diversifies them to focus on local textures. Unlike Restormer, which is equipped with pure Transformer blocks, our design combines SA and CNNs in a parallel manner through a proposed hybrid fusion network. The network comprises one residual Transformer branch (RTB) and one encoder-decoder branch (EDB). The former takes a few learnable tokens (feature channels) as input and stacks multi-head attention and feed-forward networks to encode global features of the image. The latter leverages a multiscale encoder-decoder to represent contextual knowledge. We further propose a lightweight hybrid fusion block to aggregate the outcomes of RTB and EDB into the final solution of each subtask. In this way, we construct our final model as a two-stage Transformer-based method, namely, ELF, for single image deraining.
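The MAM described above can be pictured as cross-attention in which the rainy features serve as query and the predicted degradation mask as key, so the attention weights concentrate on degraded regions whose complementary components are gathered from the values. The following pure-Python sketch is a hypothetical illustration of that idea only: the function and variable names are ours, and the real module operates on multi-channel feature maps rather than toy matrices.

```python
import math

def _softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def _t(a):
    return [list(col) for col in zip(*a)]

def mam_cross_attention(rainy_feats, degradation_mask, values):
    # query: features of the rainy input (n_q x d)
    # key:   predicted degradation mask features (n_k x d)
    # value: components to gather for texture restoration (n_k x d_v)
    d = len(rainy_feats[0])
    scores = _matmul(rainy_feats, _t(degradation_mask))  # n_q x n_k
    scores = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = [_softmax(row) for row in scores]
    return _matmul(weights, values)

q = [[0.2, 0.8], [0.9, 0.1]]   # rainy-input features
k = [[1.0, 0.0], [0.0, 1.0]]   # degradation-mask features
v = [[0.5, 0.5], [0.3, 0.7]]   # complementary components
out = mam_cross_attention(q, k, v)
print(len(out), len(out[0]))  # 2 2
```

Because the query and key come from different inputs, the module ties rain streak removal (which produces the mask) to background recovery (which consumes the gathered components), which is the association the two-stage design exploits.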
Result
An ablation experiment is conducted on the Test1200 dataset to validate the effectiveness of each part of the algorithm. The experimental results show that the fusion of CNN and SA can effectively improve the model's expression ability. Meanwhile, the association learning between degradation removal and background recovery can effectively improve the overall restoration quality. The proposed method is compared with more than 10 representative methods on the synthetic and real data of three image restoration tasks and achieves significant improvement. In the image deraining task, the ELF method improves the peak signal-to-noise ratio (PSNR) by 0.9 dB over the multi-stage progressive image restoration network (MPRNet) on the synthetic dataset Test1200. In the underwater image enhancement task, ELF exceeds Ucolor by 4.15 dB on the R90 dataset. In the low-light image enhancement task, ELF achieves a 1.09 dB improvement over the LLFlow algorithm.
Conclusion
We rethink image deraining as a composite task of rain streak removal, texture recovery, and their association learning, and propose the ELF model accordingly. A two-stage architecture and an association learning module are adopted in ELF to account for the two goals of rain streak removal and texture reconstruction while facilitating the learning capability. Joint optimization promotes compatibility between the two subtasks while maintaining model compactness. Extensive results on image deraining and joint detection tasks demonstrate the superiority of ELF over state-of-the-art techniques. The proposed method is both efficient and effective, and outperforms representative methods on common tasks such as image deraining, low-light image enhancement, and underwater image enhancement.
图像修复；关联学习；自注意力(SA)；图像去雨；低照度图像增强；水下图像修复
image inpainting; association learning; self-attention (SA); image rain removal; low illumination image enhancement; underwater image enhancement
Ancuti C, Ancuti C O, Haber T and Bekaert P. 2012. Enhancing underwater images and videos by fusion//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE: 81-88 [DOI: 10.1109/CVPR.2012.6247661]
Bossu J, Hautière N and Tarel J P. 2011. Rain or snow detection in image sequences through use of a histogram of orientation of streaks. International Journal of Computer Vision, 93(3): 348-367 [DOI: 10.1007/s11263-011-0421-7]
Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A and Zagoruyko S. 2020. End-to-end object detection with Transformers//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 213-229 [DOI: 10.1007/978-3-030-58452-8_13]
Chen H T, Wang Y H, Guo T Y, Xu C, Deng Y P, Liu Z H, Ma S W, Xu C J, Xu C and Gao W. 2021. Pre-trained image processing Transformer//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Virtual, USA: IEEE: 12294-12305 [DOI: 10.1109/CVPR46437.2021.01212]
Chen Y L and Hsu C T. 2013. A generalized low-rank appearance model for spatio-temporally correlated rain streaks//Proceedings of 2013 IEEE International Conference on Computer Vision. Sydney, Australia: IEEE: 1968-1975 [DOI: 10.1109/ICCV.2013.247]
Cheng H D and Shi X J. 2004. A simple and effective histogram equalization approach to image enhancement. Digital Signal Processing, 14(2): 158-170 [DOI: 10.1016/j.dsp.2003.07.002]
Deng S, Wei M Q, Wang J, Feng Y D, Liang L M, Xie H R, Wang F L and Wang M. 2020. Detail-recovery image deraining via context aggregation networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 14548-14557 [DOI: 10.1109/CVPR42600.2020.01457]
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X H, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J and Houlsby N. 2021. An image is worth 16 × 16 words: Transformers for image recognition at scale [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/2010.11929.pdf
El-Nouby A, Touvron H, Caron M, Bojanowski P, Douze M, Joulin A, Laptev I, Neverova N, Synnaeve G, Verbeek J and Jegou H. 2021. XCiT: cross-covariance image Transformers [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/2106.09681.pdf
Fu X Y, Huang J B, Ding X H, Liao Y H and Paisley J. 2017a. Clearing the skies: a deep network architecture for single-image rain removal. IEEE Transactions on Image Processing, 26(6): 2944-2956 [DOI: 10.1109/TIP.2017.2691802]
Fu X Y, Huang J B, Zeng D L, Huang Y, Ding X H and Paisley J. 2017b. Removing rain from single images via a deep detail network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1715-1723 [DOI: 10.1109/CVPR.2017.186]
Garg K and Nayar S K. 2005. When does a camera see rain?//Proceedings of 10th IEEE International Conference on Computer Vision. Beijing, China: IEEE: 1067-1074 [DOI: 10.1109/ICCV.2005.253]
Ghani A S A and Isa N A M. 2015. Underwater image quality enhancement through integrated color model with Rayleigh distribution. Applied Soft Computing, 27: 219-230 [DOI: 10.1016/j.asoc.2014.11.020]
Guo C L, Li C Y, Guo J C, Loy C C, Hou J H, Kwong S and Cong R M. 2020a. Zero-reference deep curve estimation for low-light image enhancement//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1777-1786 [DOI: 10.1109/CVPR42600.2020.00185]
Guo Y C, Li H Y and Zhuang P X. 2020b. Underwater image enhancement using a multiscale dense generative adversarial network. IEEE Journal of Oceanic Engineering, 45(3): 862-870 [DOI: 10.1109/JOE.2019.2911447]
Hu M S, Jiang K, Liao L, Xiao J, Jiang J J and Wang Z. 2022. Spatial-temporal space hand-in-hand: spatial-temporal video super-resolution via cycle-projected mutual learning//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 3564-3573 [DOI: 10.1109/CVPR52688.2022.00356]
Huang H B, Yu A J, Chai Z H, He R and Tan T N. 2021. Selective wavelet attention learning for single image deraining. International Journal of Computer Vision, 129(4): 1282-1300 [DOI: 10.1007/s11263-020-01421-z]
Ijaz M, Diaz R and Chen C. 2022. Multimodal Transformer for nursing activity recognition//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 2064-2073 [DOI: 10.1109/CVPRW56347.2022.00224]
Iqbal K, Odetayo M, James A, Salam R A and Talib A Z. 2010. Enhancing the low quality images using unsupervised colour correction method//Proceedings of 2010 IEEE International Conference on Systems, Man and Cybernetics. Istanbul, Turkey: IEEE: 1703-1709 [DOI: 10.1109/ICSMC.2010.5642311]
Ji H B, Feng X, Pei W J, Li J X and Lu G M. 2021. U2-Former: a nested U-shaped Transformer for image restoration [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/2112.02279.pdf
Jiang K, Wang Z Y, Yi P, Chen C, Han Z, Lu T, Huang B J and Jiang J J. 2021a. Decomposition makes better rain removal: an improved attention-guided deraining network. IEEE Transactions on Circuits and Systems for Video Technology, 31(10): 3981-3995 [DOI: 10.1109/TCSVT.2020.3044887]
Jiang K, Wang Z Y, Yi P, Chen C, Huang B J, Luo Y M, Ma J Y and Jiang J J. 2020a. Multi-scale progressive fusion network for single image deraining//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 8343-8352 [DOI: 10.1109/CVPR42600.2020.00837]
Jiang K, Wang Z Y, Yi P, Chen C, Wang G C, Han Z, Jiang J J and Xiong Z X. 2023. Multi-scale hybrid fusion network for single image deraining. IEEE Transactions on Neural Networks and Learning Systems, 34(7): 3594-3608 [DOI: 10.1109/TNNLS.2021.3112235]
Jiang K, Wang Z Y, Yi P, Chen C, Wang X F, Jiang J J and Xiong Z X. 2021b. Multi-level memory compensation network for rain removal via divide-and-conquer strategy. IEEE Journal of Selected Topics in Signal Processing, 15(2): 216-228 [DOI: 10.1109/JSTSP.2021.3052648]
Jiang K, Wang Z Y, Yi P, Chen C, Wang Z, Wang X, Jiang J J and Lin C W. 2021c. Rain-free and residue hand-in-hand: a progressive coupled network for real-time image deraining. IEEE Transactions on Image Processing, 30: 7404-7418 [DOI: 10.1109/TIP.2021.3102504]
Jiang K, Wang Z Y, Yi P and Jiang J J. 2020b. Hierarchical dense recursive network for image super-resolution. Pattern Recognition, 107: #107475 [DOI: 10.1016/j.patcog.2020.107475]
Jiang Y F, Gong X Y, Liu D, Cheng Y, Fang C, Shen X H, Yang J C, Zhou P and Wang Z Y. 2021d. EnlightenGAN: deep light enhancement without paired supervision. IEEE Transactions on Image Processing, 30: 2340-2349 [DOI: 10.1109/TIP.2021.3051462]
Jobson D J, Rahman Z and Woodell G A. 1997. Properties and performance of a center/surround retinex. IEEE Transactions on Image Processing, 6(3): 451-462 [DOI: 10.1109/83.557356]
Kang L W, Lin C W and Fu Y H. 2012. Automatic single-image-based rain streaks removal via image decomposition. IEEE Transactions on Image Processing, 21(4): 1742-1755 [DOI: 10.1109/TIP.2011.2179057]
Khan S, Naseer M, Hayat M, Zamir S W, Khan F S and Shah M. 2021. Transformers in vision: a survey. ACM Computing Surveys, 54: 200 [DOI: 10.1145/3505244]
Lai W S, Huang J B, Ahuja N and Yang M H. 2017. Deep Laplacian pyramid networks for fast and accurate super-resolution//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5835-5843 [DOI: 10.1109/CVPR.2017.618]
Li C Y, Anwar S, Hou J H, Cong R M, Guo C L and Ren W Q. 2021. Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Transactions on Image Processing, 30: 4985-5000 [DOI: 10.1109/TIP.2021.3076367]
Li C Y, Anwar S and Porikli F. 2020a. Underwater scene prior inspired deep underwater image and video enhancement. Pattern Recognition, 98: #107038 [DOI: 10.1016/j.patcog.2019.107038]
Li C Y, Guo C L, Ren W Q, Cong R M, Hou J H, Kwong S and Tao D C. 2020b. An underwater image enhancement benchmark dataset and beyond. IEEE Transactions on Image Processing, 29: 4376-4389 [DOI: 10.1109/TIP.2019.2955241]
Li C Y, Guo J C and Guo C L. 2018a. Emerging from water: underwater image color correction based on weakly supervised color transfer. IEEE Signal Processing Letters, 25(3): 323-327 [DOI: 10.1109/LSP.2018.2792050]
Li G B, He X, Zhang W, Chang H Y, Dong L and Lin L. 2018b. Non-locally enhanced encoder-decoder network for single image de-raining//Proceedings of the 26th ACM International Conference on Multimedia. Seoul, Korea (South): ACM: 1056-1064 [DOI: 10.1145/3240508.3240636]
Li R T, Cheong L F and Tan R T. 2017. Single image deraining using scale-aware multi-stage recurrent network [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/1712.06830.pdf
Li S Y, Araujo I B, Ren W Q, Wang Z Y, Tokuda E K, Junior R H, Cesar-Junior R, Zhang J W, Guo X J and Cao X C. 2019a. Single image deraining: a comprehensive benchmark analysis//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3833-3842 [DOI: 10.1109/CVPR.2019.00396]
Li S Y, Ren W Q, Zhang J W, Yu J K and Guo X J. 2019b. Single image rain removal via a deep decomposition-composition network. Computer Vision and Image Understanding, 186: 48-57 [DOI: 10.1016/j.cviu.2019.05.003]
Li X, Wu J L, Lin Z C, Liu H and Zha H B. 2018c. Recurrent squeeze-and-excitation context aggregation net for single image deraining//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 262-277 [DOI: 10.1007/978-3-030-01234-2_16]
Liang J Y, Cao J Z, Sun G L, Zhang K, van Gool L and Timofte R. 2021. SwinIR: image restoration using Swin Transformer//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal, Canada: IEEE: 1833-1844 [DOI: 10.1109/ICCVW54120.2021.00210]
Liao L, Chen W Y, Xiao J, Wang Z, Lin C W and Satoh S. 2022. Unsupervised foggy scene understanding via self spatial-temporal label diffusion. IEEE Transactions on Image Processing, 31: 3525-3540 [DOI: 10.1109/TIP.2022.3172208]
Liu L X, Liu B, Huang H and Bovik A C. 2014. No-reference image quality assessment based on spatial and spectral entropies. Signal Processing: Image Communication, 29(8): 856-863 [DOI: 10.1016/j.image.2014.06.006]
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, Lin S and Guo B N. 2021. Swin Transformer: hierarchical vision Transformer using shifted windows//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 9992-10002 [DOI: 10.1109/ICCV48922.2021.00986]
Luo Y, Xu Y and Ji H. 2015. Removing rain from a single image via discriminative sparse coding//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 3397-3405 [DOI: 10.1109/ICCV.2015.388]
Ma L, Liu R S, Jiang Z Y, Wang Y Y, Fan X and Li H J. 2018. Rain streak removal using learnable hybrid MAP network. Journal of Image and Graphics, 23(2): 277-285
马龙, 刘日升, 姜智颖, 王怡洋, 樊鑫, 李豪杰. 2018. 自然场景图像去雨的可学习混合MAP网络. 中国图象图形学报, 23(2): 277-285 [DOI: 10.11834/jig.170390]
Mittal A, Soundararajan R and Bovik A C. 2013. Making a “completely blind” image quality analyzer. IEEE Signal Processing Letters, 20(3): 209-212 [DOI: 10.1109/LSP.2012.2227726]
Park N and Kim S. 2022. How do vision Transformers work? [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/2202.06709.pdf
Pisano E D, Zong S Q, Hemminger B M, DeLuca M, Johnston R E, Muller K, Braeuning M P and Pizer S M. 1998. Contrast limited adaptive histogram equalization image processing to improve the detection of simulated spiculations in dense mammograms. Journal of Digital Imaging, 11(4): 193-200 [DOI: 10.1007/BF03178082]
Redmon J and Farhadi A. 2018. YOLOv3: an incremental improvement [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/1804.02767.pdf
Ren D W, Zuo W M, Hu Q H, Zhu P F and Meng D Y. 2019. Progressive image deraining networks: a better and simpler baseline//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3932-3941 [DOI: 10.1109/CVPR.2019.00406]
Ronneberger O, Fischer P and Brox T. 2015. U-Net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Touvron H, Cord M, Douze M, Massa F, Sablayrolles A and Jégou H. 2021. Training data-efficient image Transformers and distillation through attention//Proceedings of the 38th International Conference on Machine Learning. Virtual: PMLR: 10347-10357
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L and Polosukhin I. 2017. Attention is all you need//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc.: 6000-6010
Wang C, Xing X Y, Wu Y T, Su Z X and Chen J Y. 2020a. DCSFN: deep cross-scale fusion network for single image rain removal//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM: 1643-1651 [DOI: 10.1145/3394171.3413820]
Wang H, Xie Q, Zhao Q and Meng D Y. 2020b. A model-driven deep neural network for single image rain removal//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 3100-3109 [DOI: 10.1109/CVPR42600.2020.00317]
Wang W H, Xie E Z, Li X, Fan D P, Song K T, Liang D, Lu T, Luo P and Shao L. 2021. Pyramid vision Transformer: a versatile backbone for dense prediction without convolutions//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 548-558 [DOI: 10.1109/ICCV48922.2021.00061]
Wang W X, Chen C, Wang J, Zha S, Zhang Y and Li J Y. 2022a. Med-DANet: dynamic architecture network for efficient medical volumetric segmentation [EB/OL]. [2023-06-11]. https://arxiv.org/pdf/2206.06575.pdf
Wang X L, Girshick R, Gupta A and He K M. 2018. Non-local neural networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7794-7803 [DOI: 10.1109/CVPR.2018.00813]
Wang Y F, Wan R J, Yang W H, Li H L, Chau L P and Kot A. 2022c. Low-light image enhancement with normalizing flow//Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI: 2604-2612 [DOI: 10.1609/aaai.v36i3.20162]
Wang Z, Bovik A C, Sheikh H R and Simoncelli E P. 2004. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4): 600-612 [DOI: 10.1109/TIP.2003.819861]
Wang Z D, Cun X D, Bao J M, Zhou W G, Liu J Z and Li H Q. 2022b. Uformer: a general U-shaped Transformer for image restoration//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 17662-17672 [DOI: 10.1109/CVPR52688.2022.01716]
Xie P Y, Xu X, Wang Z and Yamasaki T. 2022. Sampling and re-weighting: towards diverse frame aware unsupervised video person re-identification. IEEE Transactions on Multimedia, 24: 4250-4261 [DOI: 10.1109/TMM.2022.3186177]
Yang F Z, Yang H, Fu J L, Lu H T and Guo B N. 2020. Learning texture Transformer network for image super-resolution//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 5790-5799 [DOI: 10.1109/CVPR42600.2020.00583]
Yang H J, Li L Q and Wang D. 2022. Deep learning image inpainting combining semantic segmentation reconstruction and edge reconstruction. Journal of Image and Graphics, 27(12): 3553-3565
杨红菊, 李丽琴, 王鼎. 2022. 联合语义分割与边缘重建的深度学习图像修复. 中国图象图形学报, 27(12): 3553-3565 [DOI: 10.11834/jig.210702]
Yang W H, Tan R T, Feng J S, Liu J Y, Guo Z M and Yan S C. 2017. Deep joint rain detection and removal from a single image//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 1685-1694 [DOI: 10.1109/CVPR.2017.183]
Yang Y, Guan J W, Huang S Y, Wan W G, Xu Y T and Liu J X. 2022. End-to-end rain removal network based on progressive residual detail supplement. IEEE Transactions on Multimedia, 24: 1622-1636 [DOI: 10.1109/TMM.2021.3068833]
Yasarla R and Patel V M. 2019. Uncertainty guided multi-scale residual learning-using a cycle spinning CNN for single image de-raining//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8397-8406 [DOI: 10.1109/CVPR.2019.00860]
Yu W J, Huang Z, Zhang W, Feng L T and Xiao N. 2019. Gradual network for single image de-raining//Proceedings of the 27th ACM International Conference on Multimedia. Nice, France: ACM: 1795-1804 [DOI: 10.1145/3343031.3350883]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S and Yang M H. 2022. Restormer: efficient Transformer for high-resolution image restoration//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 5718-5729 [DOI: 10.1109/CVPR52688.2022.00564]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S, Yang M H and Shao L. 2021. Multi-stage progressive image restoration//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 14816-14826 [DOI: 10.1109/CVPR46437.2021.01458]
Zhang H, Goodfellow I, Metaxas D and Odena A. 2019b. Self-attention generative adversarial networks//Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: PMLR: 7354-7363
Zhang H and Patel V M. 2017. Convolutional sparse and low-rank coding-based rain streak removal//Proceedings of 2017 IEEE Winter Conference on Applications of Computer Vision. Santa Rosa, USA: IEEE: 1259-1267 [DOI: 10.1109/WACV.2017.145]
Zhang H and Patel V M. 2018a. Density-aware single image de-raining using a multi-stream dense network//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 695-704 [DOI: 10.1109/CVPR.2018.00079]
Zhang H, Sindagi V and Patel V M. 2020. Image de-raining using a conditional generative adversarial network. IEEE Transactions on Circuits and Systems for Video Technology, 30(11): 3943-3956 [DOI: 10.1109/TCSVT.2019.2920407]
Zhang R, Isola P, Efros A A, Shechtman E and Wang O. 2018b. The unreasonable effectiveness of deep features as a perceptual metric//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 586-595 [DOI: 10.1109/CVPR.2018.00068]
Zhang Y H, Guo X J, Ma J Y, Liu W and Zhang J W. 2021. Beyond brightening low-light images. International Journal of Computer Vision, 129(4): 1013-1037 [DOI: 10.1007/s11263-020-01407-x]
Zhang Y H, Zhang J W and Guo X J. 2019a. Kindling the darkness: a practical low-light image enhancer//Proceedings of the 27th ACM International Conference on Multimedia. Nice, France: ACM: 1632-1640 [DOI: 10.1145/3343031.3350926]
Zhang Y L, Tian Y P, Kong Y, Zhong B N and Fu Y. 2018c. Residual dense network for image super-resolution//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2472-2481 [DOI: 10.1109/CVPR.2018.00262]
Zhong X, Tu S D, Ma X Z, Jiang K, Huang W X and Wang Z. 2022. Rainy WCity: a real rainfall dataset with diverse conditions for semantic driving scene understanding//Proceedings of the 31st International Joint Conference on Artificial Intelligence. Vienna, Austria: Morgan Kaufmann: 1743-1749 [DOI: 10.24963/ijcai.2022/243]
Zhong X, Zhao S L, Wang X, Jiang K, Liu W X, Huang W X and Wang Z. 2021. Unsupervised vehicle search in the wild: a new benchmark//Proceedings of the 29th ACM International Conference on Multimedia. Chengdu, China: ACM: 5316-5325 [DOI: 10.1145/3474085.3475654]
Zhu J Y, Park T, Isola P and Efros A A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2242-2251 [DOI: 10.1109/ICCV.2017.244]