MSPRL:面向图像逆半色调的多尺度渐进式残差学习网络
MSPRL: multiscale progressively residual learning network for image inverse halftoning
2024年29卷第4期 页码:953-965
纸质出版日期: 2024-04-16
DOI: 10.11834/jig.230560
李飞宇, 杨俊, 桑高丽. 2024. MSPRL:面向图像逆半色调的多尺度渐进式残差学习网络. 中国图象图形学报, 29(04):0953-0965
Li Feiyu, Yang Jun, Sang Gaoli. 2024. MSPRL: multiscale progressively residual learning network for image inverse halftoning. Journal of Image and Graphics, 29(04):0953-0965
目的
图像逆半色调的目的是从二值半色调图像中恢复出连续色调图像。半色调图像丢失了大量原始图像内容信息,因此逆半色调成为一个经典的图像重建病态问题。现有的逆半色调算法重建效果无法满足对图像细节和纹理的需求。此外,已有方法大多忽略了训练策略对模型优化的重要影响,导致模型性能较差。针对上述问题,提出一个逆半色调网络以提高半色调图像重建质量。
方法
首先提出一个端到端的多尺度渐进式残差学习网络(multiscale progressively residual learning network,MSPRL)以恢复出更高质量的连续色调图像。该网络基于UNet架构并以多尺度图像作为输入;为充分利用不同尺度输入图像的信息,设计一个浅层特征提取模块以捕获多尺度图像的注意力信息;同时探讨不同学习策略对模型训练和性能的影响。
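下面给出一个示意性的PyTorch草图(作者自行假设的最小实现,并非论文官方代码),用于说明多尺度输入、浅层特征提取(SFE)、以残差块为核心的编码器—解码器,以及Concat加1×1卷积特征融合(FF)的基本思路;其中的模块划分、通道数与上/下采样方式均为假设。

```python
# 示意性草图(假设实现,非论文官方代码):三个尺度的输入分别经浅层特征提取与
# 残差块编码器,再用Concat + 1×1卷积融合后送入解码器,最后以残差方式输出连续色调图像。
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))
    def forward(self, x):
        return x + self.body(x)  # 残差学习

class MSPRLSketch(nn.Module):
    def __init__(self, ch=32, n_blocks=4):
        super().__init__()
        # 浅层特征提取(SFE):对每个尺度的半色调输入分别提取特征(此处假设单通道输入)
        self.sfe = nn.ModuleList([nn.Conv2d(1, ch, 3, padding=1) for _ in range(3)])
        # 编码器/解码器均由残差块堆叠组成(每层块数为假设值)
        self.enc = nn.ModuleList(
            [nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)]) for _ in range(3)])
        self.dec = nn.ModuleList(
            [nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)]) for _ in range(3)])
        # 特征融合(FF):Concat后用1×1卷积聚合不同层级编码器的特征图
        self.ff = nn.Conv2d(ch * 3, ch, 1)
        self.out = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, x):
        # 构造多尺度输入(原尺寸、1/2、1/4)
        xs = [x,
              F.interpolate(x, scale_factor=0.5, mode='bilinear', align_corners=False),
              F.interpolate(x, scale_factor=0.25, mode='bilinear', align_corners=False)]
        # 各尺度分别经SFE与编码器
        feats = [self.enc[i](self.sfe[i](xs[i])) for i in range(3)]
        # 低尺度特征上采样到原尺寸后做Concat + 1×1卷积融合
        up = [feats[0],
              F.interpolate(feats[1], size=x.shape[-2:], mode='bilinear', align_corners=False),
              F.interpolate(feats[2], size=x.shape[-2:], mode='bilinear', align_corners=False)]
        y = self.ff(torch.cat(up, dim=1))
        # 解码器逐步细化融合特征(此处简化为顺序堆叠)
        for d in self.dec:
            y = d(y)
        # 以残差方式重建连续色调图像
        return torch.clamp(x + self.out(y), 0.0, 1.0)

# 用法示例:y = MSPRLSketch()(torch.rand(1, 1, 256, 256))
```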
结果
实验在7个数据集上与6种方法进行对比。在Places365和Kodak数据集上,相比性能第2的方法,峰值信噪比(peak signal-to-noise ratio,PSNR)分别提高0.12 dB和0.18 dB;在其他5个常用于图像超分辨率的测试数据集Set5、Set14、BSD100(Berkeley segmentation dataset 100)、Urban100和Manga109上,相比性能第2的方法,PSNR值分别提高0.11 dB、0.25 dB、0.08 dB、0.39 dB和0.35 dB。基于本文的训练策略,重新训练的渐进式残差学习网络相比未优化训练模型在7个数据集上PSNR平均提高1.44 dB。本文方法在图像细节和纹理重建上实现最优效果。实验表明选用合适的学习策略能够优化模型训练,对性能提升具有重要帮助。
结论
本文提出的逆半色调模型,综合UNet架构和多尺度图像信息的优点,选用合适的训练策略,使得图像重建的细节与纹理更加清晰,视觉效果更加细致。本文算法代码公布在https://github.com/Feiyuli-cs/MSPRL。
Objective
The halftoning method represents continuous-tone images by using two levels of color, namely, black and white; it is commonly used in digital image printing, publishing, and display applications because of cost considerations. Compared with a continuous-tone image, a halftone image has only two values, so halftoning can save considerable storage space and network transmission bandwidth, making it a feasible and important image compression method. Image inverse halftoning is a classic image restoration task that aims to recover continuous-tone images from halftone images containing only bilevel pixels. However, owing to the loss of original image content in halftone images, inverse halftoning is also a classic ill-posed problem. Although existing inverse halftoning algorithms have achieved good performance, their reconstruction results still lose image details and features, causing varying degrees of curvature and roughness in some high-frequency regions and resulting in poor visual quality, which cannot meet the requirements for high image detail and texture. Therefore, recovering high-quality continuous-tone images remains a challenge for inverse halftoning. Many previous methods focused on model design to improve performance while ignoring the important impact of training strategies on model optimization, which led to poor model performance. To solve these problems, we propose an inverse halftoning network to improve the quality of halftone image reconstruction and explore different training strategies to optimize model training.
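As an illustration of the forward degradation that inverse halftoning must undo, the following is a minimal NumPy sketch of Floyd-Steinberg error diffusion (Floyd, 1976), one of the classic halftoning methods; the function name and the [0, 1] value convention are our own assumptions, and this is not code from the paper.

```python
import numpy as np

def floyd_steinberg_halftone(gray):
    """Error-diffusion halftoning: gray image in [0, 1] -> binary {0, 1} image."""
    img = gray.astype(np.float64).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = new
            err = old - new
            # Diffuse the quantization error to unprocessed neighbors (7/16, 3/16, 5/16, 1/16)
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:
                img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:
                img[y + 1, x] += err * 5 / 16
            if y + 1 < h and x + 1 < w:
                img[y + 1, x + 1] += err * 1 / 16
    return out
```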
Method
In this paper, we propose an end-to-end multiscale progressively residual learning network (MSPRL), which is based on the UNet architecture and takes multiscale images as input. To make full use of the information in the inputs at different scales, we design a shallow feature extraction (SFE) module to capture the attention features of different-scale images. We divide the model into an encoder and a decoder, where the encoder focuses on restoring content information and the decoder receives the aggregated features of the encoder to strengthen deep feature learning. The encoder and the decoder are both composed of residual blocks (RBs). MSPRL comprises three levels, and each level receives an input halftone image at a different scale. To collect the encoder features and transmit them to the decoder, we use a Concat operation followed by a 1 × 1 convolutional kernel as the feature fusion (FF) module to aggregate the feature maps of the different-level encoders. In the overall model, the input halftone images are progressively learned from the left encoder to the right decoder. We systematically study the effects of different training strategies on model training and reconstruction performance. For example, the performance obtained with a 128 × 128 pixel patch size is slightly lower than that obtained with a 256 × 256 pixel patch size, but the time spent in the model training phase is reduced by about 65%. Adding a fast Fourier transform (FFT) loss further improves model performance compared with using a single L1 loss. We also compare different feature channel dimensions, feature extraction blocks, and activation functions. Experimental results demonstrate that effective learning strategies can optimize model training and significantly improve performance.
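As a hedged sketch of the loss combination mentioned above (an L1 term plus a fast Fourier transform term), the snippet below shows one common way to implement it in PyTorch; the weighting factor fft_weight and the use of rfft2 are our assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def l1_fft_loss(pred, target, fft_weight=0.1):
    """L1 loss plus an FFT-domain L1 term; fft_weight is an assumed hyperparameter."""
    l1 = F.l1_loss(pred, target)
    # Compare the complex spectra of prediction and target in the frequency domain
    pred_fft = torch.fft.rfft2(pred)
    target_fft = torch.fft.rfft2(target)
    fft = F.l1_loss(torch.view_as_real(pred_fft), torch.view_as_real(target_fft))
    return l1 + fft_weight * fft
```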
Result
The experimental results are compared with those of six methods on seven datasets, including a denoising convolutional neural network, VDSR, an enhanced deep super-resolution network, a progressively residual learning network (PRL), a gradient-guided residual learning network, a multi-input multi-output UNet, and a retrained PRL (PRL-dt). On the Places365 and Kodak datasets, compared with that of the second-best-performing model, PRL-dt, the peak signal-to-noise ratio (PSNR) of our MSPRL is increased by 0.12 dB and 0.18 dB, respectively. On the other five test datasets commonly used for image super-resolution (Set5, Set14, BSD100, Urban100, and Manga109), compared with that of PRL-dt, the PSNR of MSPRL is increased by 0.11 dB, 0.25 dB, 0.08 dB, 0.39 dB, and 0.35 dB, respectively. Based on our training strategies, PRL-dt achieves an average PSNR improvement of 1.44 dB over the PRL trained without these strategies on the seven test datasets. Extensive experiments demonstrate that MSPRL achieves superior reconstruction results in image details and textures.
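For reference, the PSNR values reported above follow the standard mean-squared-error-based definition; the following is a minimal sketch (our own, not the paper's evaluation code), assuming images normalized to [0, peak].

```python
import numpy as np

def psnr(reference, restored, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10((peak ** 2) / mse)
```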
Conclusion
In this paper, we propose an inverse halftoning network to solve the problem of low-quality reconstruction in inverse halftoning. Our MSPRL contains an SFE module, an FF module, and an encoder and a decoder built on RBs. It combines the advantages of the UNet architecture and multiscale image information and adopts appropriate training strategies to improve image reconstruction quality and the visual effects in terms of details and textures. Extensive experiments demonstrate that our MSPRL outperforms previous approaches and achieves state-of-the-art performance.
图像逆半色调；误差扩散；多尺度渐进式学习；深度学习；图像恢复
image inverse halftoning; error diffusion; multiscale progressively learning; deep learning; image restoration
Analoui M and Allebach J. 1992. New results on reconstruction of continuous-tone from halftone//Proceedings of 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. San Francisco, USA: IEEE: 313-316 [DOI: 10.1109/ICASSP.1992.226238]
Bayer B E. 1973. An optimum method for two-level rendition of continuous-tone pictures//Proceedings of 1973 IEEE International Conference on Communications. New York, USA: IEEE: 2611-2615
Bevilacqua M, Roumy A, Guillemot C and Alberi-Morel M L. 2012. Low-complexity single-image super-resolution based on nonnegative neighbor embedding//Proceedings of the 23rd British Machine Vision Conference (BMVC). Surrey, UK: BMVA Press: 1-10 [DOI: 10.5244/C.26.135]
Catté F, Lions P L, Morel J M and Coll T. 1992. Image selective smoothing and edge detection by nonlinear diffusion. SIAM Journal on Numerical Analysis, 29(1): 182-193 [DOI: 10.1137/0729012]
Chen L Y, Chu X J, Zhang X Y and Sun J. 2022. Simple baselines for image restoration//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 17-33 [DOI: 10.1007/978-3-031-20071-7_2]
Cho S J, Ji S W, Hong J P, Jung S W and Ko S J. 2021. Rethinking coarse-to-fine approach in single image deblurring//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 4621-4630 [DOI: 10.1109/ICCV48922.2021.00460]
Cubuk E D, Zoph B, Shlens J and Le Q V. 2020. RandAugment: practical automated data augmentation with a reduced search space//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, USA: IEEE: 3008-3017 [DOI: 10.1109/CVPRW50498.2020.00359]
Dong C, Loy C C, He K M and Tang X O. 2014. Learning a deep convolutional network for image super-resolution//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 184-199 [DOI: 10.1007/978-3-319-10593-2_13]
Eschbach R and Knox K T. 1991. Error-diffusion algorithm with edge enhancement. Journal of the Optical Society of America A, 8(12): 1844-1850 [DOI: 10.1364/JOSAA.8.001844]
Everingham M, Eslami S M A, Van Gool L, Williams C K I, Winn J and Zisserman A. 2015. The Pascal visual object classes challenge: a retrospective. International Journal of Computer Vision, 111(1): 98-136 [DOI: 10.1007/s11263-014-0733-5]
Floyd R W. 1976. An adaptive algorithm for spatial gray-scale. Proceedings of Society Information Display, 17: 75-77
Goyal P, Dollár P, Girshick R, Noordhuis P, Wesolowski L, Kyrola A, Tulloch A, Jia Y Q and He K M. 2018. Accurate, large minibatch SGD: training ImageNet in 1 hour [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/1706.02677.pdf
Guo J M, Liu Y F, Chang J Y and Lee J D. 2013. Efficient halftoning based on multiple look-up tables. IEEE Transactions on Image Processing, 22(11): 4522-4531 [DOI: 10.1109/TIP.2013.2277774]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
He T, Zhang Z, Zhang H, Zhang Z Y, Xie J Y and Li M. 2019. Bag of tricks for image classification with convolutional neural networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 558-567 [DOI: 10.1109/CVPR.2019.00065]
Hendrycks D and Gimpel K. 2023. Gaussian error linear units (GELUs) [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/1606.08415.pdf
Hou X X and Qiu G P. 2017. Image companding and inverse halftoning using deep convolutional neural networks [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/1707.00116.pdf
Huang J B, Singh A and Ahuja N. 2015. Single image super-resolution from transformed self-exemplars//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 5197-5206 [DOI: 10.1109/CVPR.2015.7299156]
Huang W B, Su A W Y and Kuo Y H. 2008. Neural network based method for image halftoning and inverse halftoning. Expert Systems with Applications, 34(4): 2491-2501 [DOI: 10.1016/j.eswa.2007.04.013]
Kim J, Lee J K and Lee K M. 2016. Accurate image super-resolution using very deep convolutional networks//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 1646-1654 [DOI: 10.1109/CVPR.2016.182]
Kingma D P and Ba J. 2017. Adam: a method for stochastic optimization [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/1412.6980.pdf
Kite T D, Damera-Venkata N, Evans B L and Bovik A C. 2000. A fast, high-quality inverse halftoning algorithm for error diffused halftones. IEEE Transactions on Image Processing, 9(9): 1583-1592 [DOI: 10.1109/83.862639]
Knuth D E. 1987. Digital halftones by dot diffusion. ACM Transactions on Graphics, 6(4): 245-273 [DOI: 10.1145/35039.35040]
Lim B, Son S, Kim H, Nah S and Lee K M. 2017. Enhanced deep residual networks for single image super-resolution//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu, USA: IEEE: 1132-1140 [DOI: 10.1109/CVPRW.2017.151]
Lin Z D, Garg P, Banerjee A, Magid S A, Sun D Q, Zhang Y L, Van Gool L, Wei D L and Pfister H. 2022. Revisiting RCAN: improved training for image super-resolution [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/2201.11279.pdf
Liu Y F, Guo J M and Lee J D. 2011. Inverse halftoning based on the Bayesian theorem. IEEE Transactions on Image Processing, 20(4): 1077-1084 [DOI: 10.1109/TIP.2010.2087765]
Loshchilov I and Hutter F. 2017. SGDR: stochastic gradient descent with warm restarts [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/1608.03983.pdf
Loshchilov I and Hutter F. 2019. Decoupled weight decay regularization [EB/OL]. [2023-04-25]. https://arxiv.org/pdf/1711.05101.pdf
Lu B and Gai S. 2022. Single image rain removal based on multi scale progressive residual network. Journal of Image and Graphics, 27(5): 1537-1553
卢贝, 盖杉. 2022. 多尺度渐进式残差网络的图像去雨. 中国图象图形学报, 27(5): 1537-1553 [DOI: 10.11834/jig.210472]
Maas A L, Hannun A Y and Ng A Y. 2013. Rectifier nonlinearities improve neural network acoustic models//Proceedings of the 30th International Conference on Machine Learning. Atlanta, USA: [s.n.]: #3
Martin D, Fowlkes C, Tal D and Malik J. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics//Proceedings of the 8th IEEE International Conference on Computer Vision. Vancouver, Canada: IEEE: 416-423 [DOI: 10.1109/ICCV.2001.937655]
Matsui Y, Ito K, Aramaki Y, Fujimoto A, Ogawa T, Yamasaki T and Aizawa K. 2017. Sketch-based manga retrieval using manga109 dataset. Multimedia Tools and Applications, 76(20): 21811-21838 [DOI: 10.1007/s11042-016-4020-z]
Mese M and Vaidyanathan P P. 2001. Look-up table (LUT) method for inverse halftoning. IEEE Transactions on Image Processing, 10(10): 1566-1578 [DOI: 10.1109/83.951541]
Mulligan J B Jr and Ahumada A J. 1992. Principled halftoning based on human vision models//Proceedings Volume 1666, Human Vision, Visual Processing, and Digital Display III. San Jose, USA: SPIE: 109-121 [DOI: 10.1117/12.135960]
Nair V and Hinton G E. 2010. Rectified linear units improve restricted Boltzmann machines//Proceedings of the 27th International Conference on Machine Learning. Haifa, Israel: Omnipress: 807-814
Qian G C, Li Y C, Peng H W, Mai J J, Hammoud H, Elhoseiny M and Ghanem B. 2022. PointNeXt: revisiting PointNet++ with improved training and scaling strategies//Advances in Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc.: 23192-23204
Seldowitz M A, Allebach J P and Sweeney D W. 1987. Synthesis of digital holograms by direct binary search. Applied Optics, 26(14): 2788-2798 [DOI: 10.1364/AO.26.002788]
Shao L H, Zhang E H and Li M. 2021. An efficient convolutional neural network model combined with attention mechanism for inverse halftoning. Electronics, 10(13): #1574 [DOI: 10.3390/electronics10131574]
Shi W Z, Caballero J, Huszár F, Totz J, Aitken A P, Bishop R, Rueckert D and Wang Z H. 2016. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 1874-1883 [DOI: 10.1109/CVPR.2016.207]
Son C H and Choo H. 2014. Local learned dictionaries optimized to edge orientation for inverse halftoning. IEEE Transactions on Image Processing, 23(6): 2542-2556 [DOI: 10.1109/TIP.2014.2319732]
Unal G B and Cetin A E. 2001. Restoration of error-diffused images using projection onto convex sets. IEEE Transactions on Image Processing, 10(12): 1836-1841 [DOI: 10.1109/83.974568]
Wang X T, Yu K, Wu S X, Gu J J, Liu Y H, Dong C, Qiao Y and Loy C C. 2019. ESRGAN: enhanced super-resolution generative adversarial networks//Proceedings of the European Conference on Computer Vision (ECCV) Workshops. Munich, Germany: Springer: 63-79 [DOI: 10.1007/978-3-030-11021-5_5]
Wong P W. 1995. Inverse halftoning and kernel estimation for error diffusion. IEEE Transactions on Image Processing, 4(4): 486-498 [DOI: 10.1109/83.370677]
Xia M H and Wong T T. 2019. Deep inverse halftoning via progressively residual learning//Proceedings of the 14th Asian Conference on Computer Vision. Perth, Australia: Springer: 523-539 [DOI: 10.1007/978-3-030-20876-9_33]
Xia M H, Hu W B, Liu X T and Wong T T. 2021. Deep halftoning with reversible binary pattern//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 13980-13989 [DOI: 10.1109/ICCV48922.2021.01374]
Xiao Y, Pan C, Zhu X Y, Jiang H and Zheng Y. 2017. Deep neural inverse halftoning//Proceedings of 2017 International Conference on Virtual Reality and Visualization (ICVRV). Zhengzhou, China: IEEE: 213-218 [DOI: 10.1109/ICVRV.2017.00051]
Yen Y T, Cheng C C and Chiu W C. 2021. Inverse halftone colorization: making halftone prints color photos//Proceedings of 2021 IEEE International Conference on Image Processing (ICIP). Anchorage, USA: IEEE: 1734-1738 [DOI: 10.1109/ICIP42928.2021.9506307]
Yuan J, Pan C, Zheng Y, Zhu X Y, Qin Z and Xiao Y. 2020. Gradient-guided residual learning for inverse halftoning and image expanding. IEEE Access, 8: 50995-51007 [DOI: 10.1109/ACCESS.2019.2955025]
Zamir S W, Arora A, Khan S, Hayat M, Khan F S and Yang M H. 2022. Restormer: efficient Transformer for high-resolution image restoration//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 5718-5729 [DOI: 10.1109/CVPR52688.2022.00564]
Zeyde R, Elad M and Protter M. 2012. On single image scale-up using sparse-representations//Proceedings of the 7th International Conference on Curves and Surfaces. Avignon, France: Springer: 711-730 [DOI: 10.1007/978-3-642-27413-8_47]
Zhang K, Zuo W M, Chen Y J, Meng D Y and Zhang L. 2017. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7): 3142-3155 [DOI: 10.1109/TIP.2017.2662206]
Zhang Y, Zhang E H, Chen W J, Chen Y J and Duan J H. 2018a. Sparsity-based inverse halftoning via semi-coupled multi-dictionary learning and structural clustering. Engineering Applications of Artificial Intelligence, 72: 43-53 [DOI: 10.1016/j.engappai.2018.03.012]
Zhang Y L, Li K P, Li K, Wang L C, Zhong B N and Fu Y. 2018b. Image super-resolution using very deep residual channel attention networks//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 294-310 [DOI: 10.1007/978-3-030-01234-2_18]
Zhao H, Gallo O, Frosio I and Kautz J. 2017. Loss functions for image restoration with neural networks. IEEE Transactions on Computational Imaging, 3(1): 47-57 [DOI: 10.1109/TCI.2016.2644865]
Zhou B L, Lapedriza A, Khosla A, Oliva A and Torralba A. 2018. Places: a 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6): 1452-1464 [DOI: 10.1109/TPAMI.2017.2723009]