高斯—维纳表示下的稠密焦栈图生成方法
Gaussian-Wiener-based dense focal stack image synthesis
- 2025年30卷第2期 页码:391-405
纸质出版日期: 2025-02-16
DOI: 10.11834/jig.240249
王其腾, 李志龙, 丁新, 刘琼, 杨铀. 2025. 高斯—维纳表示下的稠密焦栈图生成方法. 中国图象图形学报, 30(02):0391-0405
Wang Qiteng, Li Zhilong, Ding Xin, Liu Qiong, Yang You. 2025. Gaussian-Wiener-based dense focal stack image synthesis. Journal of Image and Graphics, 30(02):0391-0405
目的
焦栈图像能够扩展光学系统的景深,并为计算摄影、交互式和沉浸式媒体提供灵活的图像表达。然而,受限于光学系统的物理属性和拍摄对象的动态变化,人们往往只能拍摄稀疏的焦栈图像。因此,焦栈图像的稠密化成为当前需要解决的一个难题。为应对上述挑战,提出了一种高斯—维纳表示下的稠密焦栈图生成方法。
方法
焦栈图像被抽象为高斯—维纳表示,所提出的双向预测模型分别包含双向拟合模块和预测生成模块,在高斯—维纳表示模型的基础上构建双向拟合模型,求解双向预测参数并生成新的焦栈图像。首先,将稀疏焦栈序列的图像按照相同块大小进行分块,并基于此将相邻焦距、相同位置的块组合成块对,以块对为最小单元进行双向预测。其次,在双向预测模块中,块对将用于拟合出最佳双向拟合参数,并基于此求解出预测生成参数,生成新的焦栈图像块。最后,将所有预测生成得到的块进行拼接,得到新的焦栈图像。
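The blocking and pairing step described above can be sketched in Python/NumPy as follows. This is a minimal illustration only: the block size `bs`, the function names, and the edge-padding choice for images whose size is not a multiple of `bs` are assumptions made for the example, not details fixed by the paper.

```python
import numpy as np

def split_into_blocks(img, bs):
    """Split an H x W image into non-overlapping bs x bs blocks.

    The image is edge-padded at the bottom/right so its size becomes a
    multiple of bs (one common choice; the boundary handling is an
    assumption of this sketch). Returns shape (rows, cols, bs, bs).
    """
    h, w = img.shape[:2]
    ph, pw = (-h) % bs, (-w) % bs
    img = np.pad(img, ((0, ph), (0, pw)), mode="edge")
    H, W = img.shape
    return img.reshape(H // bs, bs, W // bs, bs).swapaxes(1, 2)

def make_block_pairs(stack, bs):
    """Pair co-located blocks from focal slices adjacent in focal length.

    `stack` is a list of grayscale slices ordered from near to far focus;
    each (blocks of slice i, blocks of slice i+1) pair is the basic unit
    for bidirectional prediction.
    """
    blocked = [split_into_blocks(s, bs) for s in stack]
    return list(zip(blocked[:-1], blocked[1:]))
```

A stack of N slices thus yields N-1 block-pair sets, one per adjacent focal-length pair.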
结果
在11组稀疏焦栈图像序列上进行实验,所采用评价指标包括峰值信噪比(peak signal to noise ratio,PSNR)和结构相似性(structure similarity index measure,SSIM)。11个序列生成结果的平均PSNR为40.861 dB,平均SSIM为0.976。相比于广义高斯和空间坐标两个对比方法,PSNR分别提升了6.503 dB和6.467 dB,SSIM分别提升了0.057和0.092。各序列均值PSNR和SSIM最少提升了3.474 dB和0.012。
结论
实验结果表明,所提出的双向预测方法可以较好地生成新的焦栈图像,能够在多种以景深为导向的视觉应用中发挥关键作用。
Objective
In optical imaging systems, the depth of field (DoF) is typically limited by the properties of the optical lens, so only a limited region of the scene can be in focus at once. Expanding the DoF of optical systems is therefore a challenging task for both academia and industry. For example, in computational photography, when dense focal stack images are captured, photographers can select different focal points and depths of field in postprocessing to achieve the desired artistic effects. In macro- and micro-imaging, dense focal stack images provide clearer and more detailed pictures for more accurate analysis and measurement. For interactive and immersive media, dense focal stack images enable a more realistic and immersive visual experience. However, obtaining dense focal stack images faces several challenges. First, hardware performance limits the speed and quality of image acquisition: during shooting, the camera must adjust focus quickly and accurately and capture multiple images to build the stack, which requires high-performance cameras and adaptive autofocus algorithms. In addition, changes in the shooting environment, such as object motion or manual operations by the photographer, introduce blur and alignment issues. This paper addresses these challenges with a block-based Gaussian-Wiener bidirectional prediction model. Dividing the image into blocks and exploiting the characteristics of local blocks for prediction reduces computational complexity and improves prediction accuracy, while Gaussian-Wiener filtering smooths the prediction results and suppresses artifacts and noise, improving image quality.
The bidirectional prediction method combines the original sparse focal stack images (FoSIs) with the prediction results to generate dense FoSIs, thereby extending the DoF of the optical system. The Gaussian-Wiener bidirectional prediction model offers an innovative way to obtain dense focal stacks. It can be applied in various scenarios and application fields, giving photographers, scientists, engineers, and artists greater creative freedom and image-processing capability.
Method
This work abstracts the focal stack (FoS) as a Gaussian-Wiener representation. The proposed bidirectional prediction model consists of a bidirectional fitting module and a prediction generation module. On the basis of the Gaussian-Wiener representation, a bidirectional fitting model is constructed to solve for the bidirectional prediction parameters and generate new focal stack images. First, the images in the given sparse focal stack sequence are numbered from near to far according to focal length. The numbers start from 0 and increase by a fixed step, for example by 2 each time so that all indices are even, which yields a set of sparse focal stack images arranged in serial order. Then, all images are divided into blocks of a predefined size, which can be chosen according to the specific needs and algorithm. Co-located blocks from slices with adjacent numbers are combined into block pairs, the basic unit for bidirectional prediction. This preprocessing, splitting the focal stack images into blocks and recombining them into block pairs, can be implemented with standard image partitioning and pair-combination strategies. For each block pair, bidirectional prediction is performed to obtain the prediction parameters, which are determined by the block-based Gaussian-Wiener bidirectional prediction model. In the bidirectional prediction module, each block pair is used to fit the best bidirectional fitting parameters, from which the prediction generation parameters are solved. Applying the prediction generation parameters to the information in the block pair yields new predicted blocks.
Finally, when all the predicted blocks are concatenated according to their positions in the image, new predicted focal stack images are obtained.
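As a rough sketch of the bidirectional fitting and prediction-generation idea, the toy code below fits, for one block pair, the Gaussian blur width that best maps each block onto the other, then blends the two half-blurred blocks into a new intermediate block. This is only an illustrative stand-in under stated assumptions: the paper's model solves the Gaussian-Wiener prediction parameters analytically, whereas here `fit_sigma` uses a simple grid search, and all helper names are invented for the example.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflective boundary handling."""
    if sigma <= 0:
        return img.astype(float)
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    pad = np.pad(img.astype(float), r, mode="reflect")
    # Convolve columns, then rows (kernel is symmetric).
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, pad)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, tmp)

def fit_sigma(src, dst, sigmas=np.linspace(0.0, 3.0, 31)):
    """Grid-search the blur width that best maps `src` onto `dst` (MSE)."""
    errs = [np.mean((gaussian_blur(src, s) - dst) ** 2) for s in sigmas]
    return float(sigmas[int(np.argmin(errs))])

def predict_middle_block(a, b):
    """Bidirectionally predict a block between focal slices `a` and `b`:
    blur each endpoint halfway toward the other and average the two
    one-directional predictions."""
    s_fwd = fit_sigma(a, b)   # forward:  a -> b
    s_bwd = fit_sigma(b, a)   # backward: b -> a
    return 0.5 * (gaussian_blur(a, s_fwd / 2.0) +
                  gaussian_blur(b, s_bwd / 2.0))
```

Averaging the forward and backward predictions is what distinguishes bidirectional from one-directional prediction: each new block is constrained by both of its focal-length neighbors rather than extrapolated from one side.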
Result
This experiment is performed on 11 sparse focal stack image sequences, with the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) as evaluation metrics. The average PSNR over the 11 generated sequences is 40.861 dB, and the average SSIM is 0.976. Compared with the generalized-Gaussian and spatial-coordinate baseline methods, the PSNR improves by 6.503 dB and 6.467 dB, and the SSIM by 0.057 and 0.092, respectively. The per-sequence mean PSNR and SSIM improve by at least 3.474 dB and 0.012, respectively.
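For reference, the two metrics can be computed as below. PSNR follows its standard definition; the SSIM shown is a simplified single-window version of the usual locally windowed metric, kept short for illustration, and a peak value of 255 (8-bit images) is assumed.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(x, y, peak=255.0):
    """Single-window SSIM: the standard metric averages this formula over
    local (e.g., 11 x 11 Gaussian) windows; this global variant is a
    simplified sketch with the usual stabilizing constants."""
    x, y = x.astype(float), y.astype(float)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Higher is better for both: PSNR is unbounded above (infinite for identical images), while SSIM peaks at 1.0.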
Conclusion
The experimental results show that the proposed method outperforms the comparison methods in both subjective and objective evaluations and performs well on 11 different scene sequences. Ablation experiments further demonstrate the advantage of the bidirectional prediction in the proposed method. The results indicate that the proposed bidirectional prediction method can effectively generate new focal stack images and can play a key role in a variety of depth-of-field-oriented visual applications.