Low-light image enhancement guided by semantic segmentation and HSV color space
2024, Vol. 29, No. 4, pp. 966-977
Print publication date: 2024-04-16
DOI: 10.11834/jig.230182
Zhang Hang, Yan Jia. 2024. Low-light image enhancement guided by semantic segmentation and HSV color space. Journal of Image and Graphics, 29(04):0966-0977
Objective
Low-light image enhancement is one of the fundamental tasks in image processing. Although various methods have been proposed, they often fail to produce visually appealing results: the enhanced images suffer from unclear details, low contrast, and color distortion, which also harms downstream tasks such as object detection and semantic segmentation. To address these problems, a low-light image enhancement method guided by semantic segmentation and the HSV (hue, saturation, and value) color space is proposed.
Method
First, an iterative image enhancement network is proposed that progressively learns the optimal pixel-level mapping from the low-light image to the enhanced image. To preserve semantic information during enhancement, an unsupervised semantic segmentation network is introduced and a semantic loss is computed; this network requires no expensive segmentation annotations. To further address color distortion, an HSV loss is designed in the HSV color space during training; to address the unclear details that often appear in low-light image enhancement, a spatial consistency loss is designed so that the enhanced image remains as detail-consistent as possible with the corresponding low-light image. The total loss function consists of five loss terms.
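To make the iterative pixel-level mapping concrete, the following is a minimal PyTorch sketch of how per-pixel curve parameter maps can be applied iteratively to an input image. It assumes the quadratic curve form popularized by Zero-DCE, on which this line of work builds; the function name, tensor shapes, and iteration count are illustrative assumptions rather than the paper's exact implementation.

```python
import torch


def iterative_curve_enhance(x: torch.Tensor, curve_params: torch.Tensor,
                            n_iter: int = 8) -> torch.Tensor:
    """Iteratively apply a pixel-wise enhancement curve to a low-light image.

    x            -- low-light input of shape (B, 3, H, W), values in [0, 1]
    curve_params -- per-pixel parameter maps of shape (B, 3 * n_iter, H, W),
                    as estimated by the enhancement network
    Assumes the quadratic curve LE(x) = x + a * x * (1 - x) from Zero-DCE.
    """
    enhanced = x
    # One 3-channel parameter map per enhancement iteration.
    for a in torch.split(curve_params, 3, dim=1)[:n_iter]:
        enhanced = enhanced + a * enhanced * (1.0 - enhanced)
    return enhanced
```

During training, the output of the final iteration is the image that the non-reference losses described below would be evaluated on.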
Result
The proposed method is compared with five methods: LIME (low-light image enhancement via illumination map estimation), RetinexNet (deep Retinex decomposition), EnlightenGAN (deep light enhancement using generative adversarial networks), Zero-DCE (zero-reference deep curve estimation), and SGZ (semantic-guided zero-shot learning). In terms of peak signal-to-noise ratio (PSNR), the proposed method is on average 0.32 dB higher than Zero-DCE; in terms of the natural image quality evaluator (NIQE), it improves on EnlightenGAN by 6%. Subjectively, the proposed method also produces better visual results.
Conclusion
The proposed low-light image enhancement method effectively alleviates problems such as unclear details and color distortion and has practical application value.
Objective
Images are often captured under sub-optimal lighting conditions and are degraded by backlight, uneven illumination, and weak light owing to unavoidable environmental and technical limitations, such as insufficient lighting and limited exposure time. The quality of such images suffers, and the information they provide to high-level tasks, such as object tracking, recognition, and detection, is also unsatisfactory. Various enhancement methods have been proposed, but they often fail to produce visually appealing results: the enhanced images exhibit unclear details, low contrast, and color distortion. Existing deep learning methods achieve better accuracy, robustness, and speed than traditional methods; however, their generalization performance is generally poor because they rely on synthetic datasets. For example, supervised methods require pairs of low-/normal-light images, and the trained models produce remarkably poor visual results when applied to real low-light images. Considering these problems, a low-light image enhancement method guided by semantic segmentation and the HSV color space is proposed. The method does not require excessive computing resources while restoring the true color and detailed texture of objects. Moreover, because training is reference-free, the model generalizes better than supervised learning.
Method
The proposed framework is an end-to-end low-light image enhancement network built from seven convolutional layers with a symmetric, U-Net-like structure. The input is a low-light image, and the output is a set of best-fitting curve parameter maps. By applying the curve iteratively, every pixel in the RGB channels of the input low-light image is mapped to produce the final enhanced image. The curve automatically maps the low-light image to the enhanced image; its parameters are adaptive, depend only on the input image, and are learned by the network. After the network estimates the curve parameter maps of the input image, the curve is applied repeatedly for enhancement, and the results are evaluated and guided by a set of non-reference loss functions. Meanwhile, the result of the last enhancement iteration is fed into an unsupervised semantic segmentation network to preserve the semantic information of the image. The loss function comprises the following terms:
1) Spatial consistency loss maintains the consistency of details between the enhanced and original images and addresses the unclear details common in low-light image enhancement. The enhanced result and the low-light input are divided into small local regions, and the loss keeps the difference between each region and its neighboring regions in the enhanced result as close as possible to the corresponding difference in the low-light input.
2) HSV loss restores the color information of the image. The enhanced result and the low-light input are converted from RGB to the HSV color space, and the hue and saturation differences of every pixel between the two are computed. Small hue and saturation differences indicate that the color stays close to the original color of the low-light image.
3) Exposure loss raises brightness by pushing each pixel toward a certain middle value, lifting the overall brightness level of the final image. This middle value represents the ideal exposure.
4) Semantic loss retains semantic information. The unsupervised semantic segmentation network performs pixel-wise segmentation on the enhanced image, yielding a predicted probability for each pixel that is used to construct the semantic loss.
5) Total variation loss controls the difference between adjacent pixels. The estimated curve parameter maps are smoothed so that the parameter values of adjacent pixels stay close to each other, preserving the monotonicity of the curve as much as possible.
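For illustration, the hedged PyTorch sketch below shows how the HSV loss (item 2) and the exposure loss (item 3) could be implemented; the helper names, the mid-level exposure value of 0.6, and the 16 × 16 patch size are assumptions made for this example and are not taken from the paper.

```python
import torch
import torch.nn.functional as F


def rgb_to_hue_sat(img: torch.Tensor, eps: float = 1e-8):
    """Hue (normalized to [0, 1]) and saturation of an RGB tensor of shape (B, 3, H, W)."""
    r, g, b = img[:, 0], img[:, 1], img[:, 2]
    cmax, argmax = img.max(dim=1)
    cmin, _ = img.min(dim=1)
    delta = cmax - cmin
    hue = torch.zeros_like(cmax)
    hue = torch.where(argmax == 0, ((g - b) / (delta + eps)) % 6.0, hue)
    hue = torch.where(argmax == 1, (b - r) / (delta + eps) + 2.0, hue)
    hue = torch.where(argmax == 2, (r - g) / (delta + eps) + 4.0, hue)
    return hue / 6.0, delta / (cmax + eps)


def hsv_loss(enhanced: torch.Tensor, low_light: torch.Tensor) -> torch.Tensor:
    """Penalize hue and saturation deviations between the enhanced result and the input.

    Hue wrap-around is ignored here for simplicity.
    """
    h_e, s_e = rgb_to_hue_sat(enhanced)
    h_l, s_l = rgb_to_hue_sat(low_light)
    return torch.mean(torch.abs(h_e - h_l)) + torch.mean(torch.abs(s_e - s_l))


def exposure_loss(enhanced: torch.Tensor, level: float = 0.6, patch: int = 16) -> torch.Tensor:
    """Push the average intensity of local patches toward a mid-level exposure value."""
    gray = enhanced.mean(dim=1, keepdim=True)       # (B, 1, H, W) brightness proxy
    pooled = F.avg_pool2d(gray, kernel_size=patch)  # patch-wise mean brightness
    return torch.mean((pooled - level) ** 2)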
Result
The proposed method was compared with five methods: low-light image enhancement via illumination map estimation (LIME), RetinexNet, EnlightenGAN, zero-reference deep curve estimation (Zero-DCE), and semantic-guided zero-shot learning (SGZ). The quality of the enhanced images is evaluated objectively with the full-reference metrics peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and mean absolute error (MAE), and with the no-reference metric natural image quality evaluator (NIQE), complemented by subjective visual comparison. PSNR measures the level of noise and distortion in an image; a high value theoretically indicates a small error between the enhanced and reference images and thus high quality. SSIM is a perceptual model aligned with human visual perception that measures the similarity between the enhanced and reference images in terms of contrast, brightness, and structure; a high SSIM value indicates that the two images are close. A small MAE value indicates a small deviation from the reference image. NIQE compares the image with a model of natural images; a low NIQE value indicates high similarity to natural real images. In terms of PSNR, the proposed method is 0.32 dB higher than Zero-DCE; in terms of NIQE, it outperforms EnlightenGAN by 6%. From a subjective viewpoint, the proposed method resolves the unclear details and color distortion seen in the other methods and produces better visual effects.
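For reference, the full-reference metrics cited above reduce to simple formulas; the NumPy sketch below shows PSNR and MAE. This is generic evaluation code, not the paper's; SSIM and NIQE are normally taken from existing implementations (e.g., scikit-image's structural_similarity and publicly available NIQE code).

```python
import numpy as np


def psnr(enhanced: np.ndarray, reference: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means less deviation from the reference."""
    mse = np.mean((enhanced.astype(np.float64) - reference.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)


def mae(enhanced: np.ndarray, reference: np.ndarray) -> float:
    """Mean absolute error; lower means the result is closer to the reference."""
    return float(np.mean(np.abs(enhanced.astype(np.float64) - reference.astype(np.float64))))
```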
Conclusion
An unsupervised semantic segmentation network was introduced in this paper to perform pixel-wise segmentation on the enhanced images, preserving the semantic information during the enhancement process. The color of low-light images was restored by designing a loss function in the HSV color space. The spatial consistency loss was designed to ensure that the enhanced images are as detail-consistent as possible with their corresponding low-light images. Subjective and objective evaluations were conducted to demonstrate the superiority of the proposed method over others. Experimental results show that the proposed enhancement method outperforms other methods in qualitative and quantitative aspects, effectively addressing the issues of unclear details and color distortion in low-light images and demonstrating its practical value.
image processing; low-light image enhancement; deep learning; semantic segmentation; HSV color space
Buchsbaum G. 1980. A spatial processor model for object colour perception. Journal of the Franklin Institute, 310(1): 1-26 [DOI: 10.1016/0016-0032(80)90058-7]
Cai J R, Gu S H and Zhang L. 2018. Learning a deep single image contrast enhancer from multi-exposure images. IEEE Transactions on Image Processing, 27(4): 2049-2062 [DOI: 10.1109/TIP.2018.2794218]
Chen Q, Montesinos P, Sun Q S, Heng P A and Xia D S. 2010. Adaptive total variation denoising based on difference curvature. Image and Vision Computing, 28(3): 298-306 [DOI: 10.1016/j.imavis.2009.04.012]
Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S and Schiele B. 2016. The cityscapes dataset for semantic urban scene understanding//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 3213-3223 [DOI: 10.1109/CVPR.2016.350]
Deng J, Dong W, Socher R, Li L J, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 248-255 [DOI: 10.1109/CVPR.2009.5206848]
Fan Z W, Sun L Y, Ding X H, Huang Y, Cai C B and Paisley J. 2018. A segmentation-aware deep fusion network for compressed sensing MRI//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: #1231 [DOI: 10.1007/978-3-030-01231-1_4]
Fu X Y, Zeng D L, Huang Y, Zhang X P and Ding X H. 2016. A weighted variational model for simultaneous reflectance and illumination estimation//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2782-2790 [DOI: 10.1109/CVPR.2016.304]
Guo C L, Li C Y, Guo J C, Loy C C, Hou J H, Kwong S and Cong R M. 2020. Zero-reference deep curve estimation for low-light image enhancement//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1777-1786 [DOI: 10.1109/CVPR42600.2020.00185]
Guo X J, Li Y and Ling H B. 2017. LIME: low-light image enhancement via illumination map estimation. IEEE Transactions on Image Processing, 26(2): 982-993 [DOI: 10.1109/TIP.2016.2639450]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Jiang Y F, Gong X Y, Liu D, Cheng Y, Fang C, Shen X H, Yang J C, Zhou P and Wang Z Y. 2021a. EnlightenGAN: deep light enhancement without paired supervision. IEEE Transactions on Image Processing, 30: 2340-2349 [DOI: 10.1109/TIP.2021.3051462]
Jiang Z Q, Li H T, Liu L J, Men A and Wang H Y. 2021b. A switched view of retinex: deep self-regularized low-light image enhancement. Neurocomputing, 454: 361-372 [DOI: 10.1016/j.neucom.2021.05.025]
Lee C, Lee C and Kim C S. 2012. Contrast enhancement based on layered difference representation//Proceedings of the 19th IEEE International Conference on Image Processing. Orlando, USA: IEEE: 965-968 [DOI: 10.1109/ICIP.2012.6467022]
Li M D, Liu J Y, Yang W H, Sun X Y and Guo Z M. 2018. Structure-revealing low-light image enhancement via robust retinex model. IEEE Transactions on Image Processing, 27(6): 2828-2841 [DOI: 10.1109/TIP.2018.2810539]
Lin T Y, Goyal P, Girshick R, He K M and Dollár P. 2017. Focal loss for dense object detection//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2999-3007 [DOI: 10.1109/ICCV.2017.324]
Lore K G, Akintayo A and Sarkar S. 2017. LLNet: a deep autoencoder approach to natural low-light image enhancement. Pattern Recognition, 61: 650-662 [DOI: 10.1016/j.patcog.2016.06.008]
Lyu F F, Lu F, Wu J H and Lim C. 2018. MBLLEN: low-light image/video enhancement using CNNs//Proceedings of 2018 British Machine Vision Conference. Newcastle, UK: BMVA Press: #220
Ma L, Ma T Y and Liu R S. 2022. The review of low-light image enhancement. Journal of Image and Graphics, 27(5): 1392-1409 [DOI: 10.11834/jig.210852]
Mittal A, Soundararajan R and Bovik A C. 2013. Making a “Completely Blind” image quality analyzer. IEEE Signal Processing Letters, 20(3): 209-212 [DOI: 10.1109/LSP.2012.2227726]
Vonikakis V, Andreadis I and Gasteratos A. 2008. Fast centre-surround contrast modification. IET Image Processing, 2(1): 19-34 [DOI: 10.1049/iet-ipr:20070012]
Wang R X, Zhang Q, Fu C W, Shen X Y, Zheng W S and Jia J Y. 2019. Underexposed photo enhancement using deep illumination estimation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 6842-6850 [DOI: 10.1109/CVPR.2019.00701]
Wang S H, Zheng J, Hu H M and Li B. 2013. Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE Transactions on Image Processing, 22(9): 3538-3548 [DOI: 10.1109/TIP.2013.2261309]
Wang Z, Bovik A C, Sheikh H R and Simoncelli E P. 2004. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4): 600-612 [DOI: 10.1109/TIP.2003.819861]
Wei C, Wang W J, Yang W H and Liu J Y. 2018. Deep retinex decomposition for low-light enhancement//Proceedings of 2018 British Machine Vision Conference. Newcastle, UK: BMVA Press: #155
Zhang Y H, Zhang J W and Guo X J. 2019. Kindling the darkness: a practical low-light image enhancer//Proceedings of the 27th ACM International Conference on Multimedia. Nice, France: ACM: 1632-1640 [DOI: 10.1145/3343031.3350926]
Zhang Y H, Guo X J, Ma J Y, Liu W and Zhang J W. 2021. Beyond brightening low-light images. International Journal of Computer Vision, 129(4): 1013-1037 [DOI: 10.1007/s11263-020-01407-x]
Zhao H S, Shi J P, Qi X J, Wang X G and Jia J Y. 2017. Pyramid scene parsing network//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6230-6239 [DOI: 10.1109/CVPR.2017.660]
Zheng S and Gupta G. 2021. Semantic-guided zero-shot learning for low-light image/video enhancement//Proceedings of 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW). Waikoloa, USA: IEEE: #64 [DOI: 10.1109/WACVW54805.2022.00064]