Visual Models and Large Multimodal Models Promote Image Restoration and Enhancement: Research Progress
2024, pp. 1-20
Published online: 2024-12-23
DOI: 10.11834/jig.240436
Wei Yanyan, Mao Tianyi, Li Baiang, et al. Visual Models and Large Multimodal Models Promote Image Restoration and Enhancement: Research Progress[J]. Journal of Image and Graphics,
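Images often degrade during capture, transmission, and storage, which impairs visual perception and the understanding of the information they carry. Image restoration and enhancement aims to recover clean images from degraded ones, improving the viewing experience and raising the accuracy of computer vision tasks such as semantic segmentation and object detection; it plays an important role in data-sensitive applications such as autonomous driving and intelligent healthcare. In recent years, vision and multimodal large models have made major advances in many fields and have shown great potential for image restoration and enhancement. This paper therefore systematically summarizes and analyzes recent domestic and international research on applying (large) vision models and multimodal large models to image restoration and enhancement. 1) It reviews methods based on the Vision Transformer (ViT) and discusses ViT's potential for modeling long-range dependencies when handling image degradation and enhancement; 2) it describes diffusion-model-based methods in detail and discusses their advantages in handling complex degradations and recovering details; 3) it analyzes the potential of X-anything models for restoration and enhancement, in particular the ability of large vision models such as the Segment Anything Model (SAM) to supply robust zero-shot prediction priors on degraded samples; 4) it introduces applications of multimodal large models such as CLIP and GPT-4V, showing how these pre-trained models provide semantic guidance during restoration; 5) it analyzes the challenges currently facing the field, such as difficult data acquisition, high computational demands, and insufficient model stability, and outlines future directions, offering new ideas and reference points for subsequent research and applications.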
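The long-range dependency modeling attributed to ViT above comes from self-attention, in which every image patch is compared with every other patch regardless of spatial distance. The following is a minimal sketch of single-head scaled dot-product self-attention, not the implementation of any method surveyed here. The names `softmax`, `self_attention`, and `patches` are illustrative, and the sketch assumes the queries, keys, and values are the patch embeddings themselves, whereas a real ViT applies learned linear projections and multiple heads.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the token embeddings
    themselves; a real ViT uses learned projections for each.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in tokens:
        # Compare this query with every key, near or far in the image.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    # Each output is a weighted average of all value vectors.
    outputs = [
        [sum(w * v[d] for w, v in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Toy input: 16 patch embeddings (a 4x4 grid of patches), 8 dimensions each.
random.seed(0)
patches = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
outputs, weights = self_attention(patches)
```

Every row of `weights` is a probability distribution over all 16 patches, so a patch in one corner can draw information from the opposite corner in a single layer. A convolution with a small kernel would need many stacked layers to cover the same distance.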
Images, as essential carriers of visual information, play an integral role in various facets of human life, from daily interactions to complex technological applications. However, throughout the processes of acquisition, transmission, and storage, images are often exposed to numerous environmental and technical factors that lead to quality degradation. This degradation not only results in diminished visual perception and information loss but also has broader implications, adversely affecting computer vision tasks. When such quality degradation occurs, it can reduce the accuracy of critical computer vision applications, including semantic segmentation and object detection, which rely heavily on high-quality input images. In application scenarios where high precision and reliability are paramount, such as autonomous driving, intelligent healthcare, and other safety-critical environments, image degradation can significantly undermine the user experience and compromise the reliability of data-driven systems.
To address these challenges, image restoration and enhancement technologies are designed with the goal of recovering degraded images to their original clarity and fidelity. These technologies aim to restore distortion-free images, thereby improving subjective visual quality and enhancing the performance of downstream tasks that depend on these images. Traditional image restoration techniques have shown some effectiveness in repairing images with mild degradation, but they often encounter difficulties when addressing complex or severe degradations, especially when multiple degradation factors are involved. This limitation has driven researchers to explore advanced methods capable of handling diverse and intricate degradation scenarios. In recent years, advancements in hardware computational power, coupled with rapid developments in deep learning, have led to significant breakthroughs in vision and multimodal large models. These models, powered by sophisticated architectures and extensive training, have demonstrated extraordinary potential across multiple fields. Leveraging these advances, image restoration and enhancement technologies have achieved notable progress, offering promising solutions to previously challenging problems.
This paper provides a systematic review of the current research landscape in image restoration and enhancement, conducting an in-depth analysis of several core technologies driving advancements in this area. The primary contributions of this paper are structured around the following six focal areas:
1) Compilation and Analysis of Datasets for Image Restoration and Enhancement Tasks: The effectiveness of image restoration methods is greatly influenced by the quality and scale of datasets used for training and evaluation. This paper offers a comprehensive compilation of datasets commonly applied in image restoration tasks, such as denoising, deraining, and dehazing. We provide insights into the characteristics of these datasets, including their scale, quality, and the techniques employed to generate low-quality images, enabling a thorough understanding of dataset influences on restoration performance.
2) Exploration of Vision Transformer (ViT) in Image Restoration and Enhancement: Vision Transformers (ViT) have introduced the powerful Transformer architecture to the field of image processing. By enabling the processing of long-range dependencies, ViT has demonstrated considerable promise in image restoration and enhancement tasks. This paper systematically reviews the application of ViT in recent restoration tasks, discussing the advantages and limitations of ViT-based methods and evaluating its potential to manage complex image degradation patterns.
3) Summary of Diffusion Model-Based Image Restoration and Enhancement Methods: Diffusion models have emerged as effective solutions for handling complex image degradation and restoring fine details in challenging cases. This paper summarizes the recent advancements in diffusion model-based image restoration, focusing on the unique strengths of the iterative denoising process. Compared to traditional methods, diffusion models show strong capabilities in detail recovery for severely degraded images, though they also present risks related to generating content that may appear less realistic.
4) Analysis of the Potential of X-anything Models in Image Restoration and Enhancement Tasks: Represented by models such as the Segment Anything Model (SAM), X-anything models leverage extensive pre-training and prior information to achieve robust zero-shot predictions, even when applied to degraded images with limited labeling. This paper explores the application potential of SAM and similar models in image restoration, highlighting their ability to provide stable restoration capabilities through zero-shot learning, which could be highly advantageous in scenarios with unlabeled or weakly labeled data.
5) Application of Multimodal Large Models in Image Restoration and Enhancement: With the rise of multimodal large models like CLIP and GPT-4V, researchers have begun to leverage the powerful information fusion capabilities of these models for image restoration and enhancement. This paper demonstrates the advantages of multimodal models in complex restoration tasks by analyzing how they utilize pre-trained semantic information to guide restoration processes. The assistance of these semantic features allows multimodal models to achieve superior performance in challenging scenarios, where traditional methods may fall short.
6) Challenges and Prospects of Image Restoration and Enhancement Technologies: Despite the significant progress made in recent years, image restoration and enhancement technologies still face substantial challenges in practical applications. Key obstacles include difficulties in acquiring high-quality and diverse training data, high computational resource demands, and the need for enhanced model stability under various conditions. This paper discusses these challenges in depth and explores prospective research directions, such as improving model adaptability to resource constraints, developing more efficient data acquisition methods, and enhancing model robustness. These directions aim to provide valuable insights for both researchers and practical applications, fostering further development in the field.
In conclusion, this paper aims to provide readers with a comprehensive overview of the research advancements in image restoration and enhancement over recent years, both domestically and internationally. By systematically summarizing the current progress and analyzing key technological innovations, this paper seeks to inspire new ideas and open up innovative directions for future research and applications in this rapidly evolving field.
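The iterative denoising process highlighted for diffusion models can be illustrated on a toy signal. This is a minimal sketch under stated assumptions, not any surveyed method. It assumes a linear noise schedule and the standard closed-form forward process x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, and it uses a deterministic (DDIM-style) reverse update. The trained noise-prediction network is replaced by a hypothetical `oracle_noise` function that cheats by looking at the clean signal. All function names and parameter values are illustrative.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process in closed form: sqrt(ab) * x0 + sqrt(1 - ab) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    return [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e
            for a, e in zip(x0, eps)]


def oracle_noise(xt, x0, alpha_bar):
    """Hypothetical stand-in for a trained noise-prediction network.

    It uses the clean signal x0, which a real model never sees.
    """
    return [(v - math.sqrt(alpha_bar) * a) / math.sqrt(1.0 - alpha_bar)
            for v, a in zip(xt, x0)]


def reverse_process(xt, x0, alpha_bars):
    """Iterative denoising with a deterministic, DDIM-style update."""
    x = list(xt)
    for t in range(len(alpha_bars) - 1, -1, -1):
        ab = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps_hat = oracle_noise(x, x0, ab)
        # Estimate the clean signal from the current sample and the noise.
        x0_hat = [(v - math.sqrt(1.0 - ab) * e) / math.sqrt(ab)
                  for v, e in zip(x, eps_hat)]
        # Move to the previous, less noisy step.
        x = [math.sqrt(ab_prev) * a + math.sqrt(1.0 - ab_prev) * e
             for a, e in zip(x0_hat, eps_hat)]
    return x


rng = random.Random(0)
clean = [0.1 * i for i in range(8)]  # toy 8-"pixel" signal
alpha_bars = make_alpha_bars(num_steps=50)
noisy = add_noise(clean, alpha_bars[-1], rng)
restored = reverse_process(noisy, clean, alpha_bars)
```

With a perfect noise estimate the loop returns the clean signal up to floating-point error. In an actual restoration model the estimate comes from a network, typically conditioned on the degraded input, and its errors are one way to understand the realism risks the abstract mentions.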
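Keywords: image restoration and enhancement; large vision models; large multimodal models; Vision Transformer; diffusion models; X-anything; computer vision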