Curve extraction and thinning based curve-to-data conversion neural network
2024, Vol. 29, No. 4, pp. 1030-1040
Print publication date: 2024-04-16
DOI: 10.11834/jig.230280
Zhou Qidang, Liu Chunxiao, Lyu Jinlong, Feng Caibo. 2024. Curve extraction and thinning based curve-to-data conversion neural network. Journal of Image and Graphics, 29(04): 1030-1040
Objective
Curve charts are an important form of data presentation, but without the original data it is difficult to query the specific values they encode. Existing curve-to-data conversion algorithms require a large amount of manual assistance to remove interference such as grid lines in the chart, which makes them mechanically repetitive and labor-intensive. In addition, attacks such as image compression and scaling degrade image quality and further reduce the accuracy of curve-to-data conversion. To solve these problems, this paper proposes a curve-to-data conversion algorithm based on a curve extraction and thinning neural network.
Method
First, a side structure guidance and Laplace convolution based curve extraction neural network (SLCENet) is proposed; with a lightweight model, it solves the boundary-blurring problem caused by the pooling operations in existing curve extraction methods and improves curve extraction accuracy. Second, to reduce the error that the curve line width introduces into curve-to-data conversion and to balance computational complexity and accuracy, 10 features that reflect the curve trend are designed and a curve trend features and MLP based curve thinning method (CMCT) is proposed, which achieves high-precision curve thinning. Finally, PaddleOCR (paddle optical character recognition) is used to locate and recognize the coordinate labels on the axes, the transformation between axis coordinates and pixel coordinates is established, and the curve-to-data conversion task is completed through this coordinate transformation.
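The last step above amounts to a linear mapping from pixel coordinates to axis coordinates once two labelled ticks per axis have been located by OCR. A minimal sketch of that mapping (plain Python; the function name and the tick positions in the example are illustrative, not taken from the paper) follows:

def pixel_to_data(px, py, x_ticks, y_ticks):
    # Map a curve pixel (px, py) to data coordinates.
    # x_ticks and y_ticks each hold two reference ticks as
    # (pixel_position, labelled_value) pairs, e.g. recovered by OCR.
    # Linear (non-logarithmic) axes are assumed.
    (xp0, xv0), (xp1, xv1) = x_ticks
    (yp0, yv0), (yp1, yv1) = y_ticks
    x = xv0 + (px - xp0) * (xv1 - xv0) / (xp1 - xp0)
    y = yv0 + (py - yp0) * (yv1 - yv0) / (yp1 - yp0)
    return x, y

# Example with made-up tick positions (image y grows downwards):
# x axis: pixel 60 -> 0.0, pixel 560 -> 10.0; y axis: pixel 420 -> 0.0, pixel 40 -> 1.0
print(pixel_to_data(310, 230, [(60, 0.0), (560, 10.0)], [(420, 0.0), (40, 1.0)]))  # (5.0, 0.5)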
Result
In curve extraction, the proposed SLCENet achieves an optimal dataset scale (ODS) score of 0.985 and runs at 0.043 s per image on images with a resolution of 640 × 480 pixels, giving the best performance when both curve extraction accuracy and running speed are considered. In curve-to-data conversion, the proposed method achieves a normalized mean error (NME) of 0.79 and a running speed of 0.83 s per image.
Conclusion
The proposed method achieves fully automatic and highly accurate curve-to-data conversion. Compared with existing methods, it combines high accuracy and fast running speed at a small computational cost and removes the need for extensive manual interaction in curve-to-data conversion.
Objective
A curve image is an important form of data presentation; however, querying the specific values embedded in a curve image is difficult without the original data. Existing curve-to-data conversion methods require considerable manual assistance to remove interference in curve images, such as grid lines and axes; thus, they are mechanically repetitive and labor-intensive. In addition, attacks such as image compression and scaling can degrade image quality, further reducing curve-to-data conversion accuracy. Moreover, the curve has a certain line width, and the same x-coordinate corresponds to multiple pixels, so obtaining the exact position of the point to be measured on the curve is difficult. To solve the aforementioned problems, this study proposes a curve extraction and thinning based curve-to-data conversion neural network.
Method
First, we propose the side structure guidance and Laplace convolution based curve extraction neural network (SLCENet). SLCENet uses ResNet as its backbone and enhances curve extraction performance with side structure guidance. It uses deep supervision so that each layer of the network learns the details in the curve mask better. The side structure guidance contains four scales, and each scale consists of four residual blocks. To obtain clearer curve details, we add a multi-scale dilation module (MDM) to enrich the multi-scale curve features and a noise reduction module (NRM) to reduce the noise in the feature maps. Moreover, we design a Laplace module (LM) to further enhance curve extraction performance within the side structure guidance. In general, the number of curve pixels is considerably smaller than the number of non-curve pixels; thus, this study uses a weighted cross-entropy loss to balance the penalties of the loss function for curve and non-curve pixels. Consequently, SLCENet solves the problem in which the pooling operation in existing curve extraction methods leads to blurred curve edges, improving curve extraction accuracy. Second, to reduce the error caused by the curve line width in curve-to-data conversion and to balance computational complexity and curve thinning accuracy, we design 10 features that reflect the curve trend and propose the curve trend features and MLP based curve thinning method (CMCT), which achieves highly accurate curve thinning. Finally, PaddleOCR is used to locate and recognize the coordinate labels on the coordinate axes and establish the coordinate transformation between axis coordinates and pixel coordinates, which completes the curve-to-data conversion.
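The weighted cross-entropy above is only described qualitatively here; a minimal sketch, assuming a HED-style class-balanced binary cross-entropy written in PyTorch (function name and tensor shapes are illustrative, not the authors' code), is:

import torch
import torch.nn.functional as F

def class_balanced_bce(pred, target):
    # pred:   sigmoid probabilities of being a curve pixel, shape (N, 1, H, W)
    # target: binary curve mask of the same shape, float values in {0, 1}
    num_pos = target.sum()
    num_neg = target.numel() - num_pos
    beta = num_neg / (num_pos + num_neg)  # large weight for the rare curve pixels
    weight = beta * target + (1.0 - beta) * (1.0 - target)
    return F.binary_cross_entropy(pred, target, weight=weight)

Because curve pixels are scarce, beta is close to 1, so errors on curve pixels are penalized far more heavily than errors on the abundant background pixels.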
Result
Extensive experimental results show that our algorithm achieves superior accuracy and speed. In curve extraction, SLCENet achieves an optimal dataset scale (ODS) score of 0.985 and takes only 0.043 s for an image with a resolution of 640 × 480 pixels. For curve images degraded by JPEG compression, scaling, and noise attacks, SLCENet still achieves an ODS of 0.902. Although SLCENet is slightly slower than holistically-nested edge detection (HED), richer convolutional features for edge detection (RCF), and the dense extreme inception network (DexiNed), these methods fail to achieve high curve extraction accuracy. Therefore, when accuracy and running speed are considered together, SLCENet achieves the best performance. In curve-to-data conversion, our algorithm obtains a normalized mean error (NME) of 0.79 and a running speed of 0.83 s per image. In terms of model size, SLCENet achieves high accuracy with a lightweight model of only about 17 MB. To balance curve thinning accuracy and computational cost, this study also compares typical machine learning methods on the curve thinning task. The experimental results show that the decision tree performs best in curve-to-data conversion accuracy. Nevertheless, considering curve-to-data conversion accuracy, model size, and running speed together, the MLP offers the best overall performance and is therefore chosen.
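For reference, ODS is the usual edge-detection benchmark score: the F-measure at the single binarization threshold that works best over the entire test set. A compact statement of the common definition (the abstract itself does not restate it) is

\[ \mathrm{ODS} = \max_{t} \frac{2\,P(t)\,R(t)}{P(t) + R(t)}, \]

where P(t) and R(t) are the precision and recall of the binarized curve maps aggregated over all test images at threshold t.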
Conclusion
Our algorithm achieves fully automatic curve-to-data conversion with high accuracy and shows clear advantages over existing methods on curve images degraded by JPEG compression, image scaling, and noise attacks. Compared with existing methods, our algorithm is free from the need for considerable manual interaction in curve-to-data conversion while offering high accuracy and fast running speed.
curve-to-data conversion; curve extraction; curve thinning; Laplace convolution; convolutional neural network (CNN)
Breiman L. 2001. Random forests. Machine Learning, 45(1): 5-32 [DOI: 10.1023/A:1010933404324]
Canny J. 1986. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6): 679-698 [DOI: 10.1109/TPAMI.1986.4767851]
Chelmis C and Qi W T. 2021. Hierarchical MultiClass AdaBoost//Proceedings of 2021 IEEE International Conference on Big Data. Orlando, USA: IEEE: 5063-5070 [DOI: 10.1109/BigData52589.2021.9671291]
Deng R X and Liu S J. 2020. Deep structural contour detection//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM: 304-312 [DOI: 10.1145/3394171.3413750]
Gu R, Jin L, Wu Y W, Qu J Y, Wang T, Wang X J, Yuan C F and Huang Y H. 2015. Parallel training GBRT based on KMeans histogram approximation for big data//Proceedings of the 15th International Conference on Algorithms and Architectures for Parallel Processing. Zhangjiajie, China: Springer: 52-65 [DOI: 10.1007/978-3-319-27122-4_4]
Guo Y W, Li Q Z, Zuo W M and Chen H. 2023. An intermediate-level attack framework on the basis of linear regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3): 2726-2735 [DOI: 10.1109/TPAMI.2022.3188044]
He J Z, Zhang S L, Yang M, Shan Y H and Huang T J. 2019. Bi-directional cascade network for perceptual edge detection//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3828-3837 [DOI: 10.1109/CVPR.2019.00395]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Kingma D P and Ba J L. 2017. Adam: a method for stochastic optimization [EB/OL]. [2023-04-30]. https://arxiv.org/pdf/1412.6980.pdf
Lee C Y, Xie S N, Gallagher P, Zhang Z Y and Tu Z W. 2014. Deeply-supervised nets [EB/OL]. [2023-04-30]. https://arxiv.org/pdf/1409.5185v1.pdf
Lee S, Lee C, Mun K G and Kim D. 2022. Decision tree algorithm considering distances between classes. IEEE Access, 10: 69750-69756 [DOI: 10.1109/ACCESS.2022.3187172]
Li C X, Liu W W, Guo R Y, Yin X T, Jiang K T, Du Y K, Du Y N, Zhu L F, Lai B H, Hu X G, Yu D H and Ma Y J. 2022. PP-OCRv3: more attempts for the improvement of ultra lightweight OCR system [EB/OL]. [2023-04-30]. https://arxiv.org/pdf/2206.03001.pdf
Li K and Tian H X. 2021. A bagging based multiobjective differential evolution with multiple subpopulations. IEEE Access, 9: 105902-105913 [DOI: 10.1109/ACCESS.2021.3100483]
Liao T T, Lei Z, Zhu T Q, Zeng S, Li Y Q and Yuan C. 2023. Deep metric learning for K nearest neighbor classification. IEEE Transactions on Knowledge and Data Engineering, 35(1): 264-275 [DOI: 10.1109/TKDE.2021.3090275]
Liu Y, Cheng M M, Hu X W, Wang K and Bai X. 2017. Richer convolutional features for edge detection//Proceedings of 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 5872-5881 [DOI: 10.1109/CVPR.2017.622]
Pu M Y, Huang Y P, Liu Y M, Guan Q J and Ling H B. 2022. EDTER: edge detection with Transformer//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 1392-1402 [DOI: 10.1109/CVPR52688.2022.00146]
Ronneberger O, Fischer P and Brox T. 2015. U-net: convolutional networks for biomedical image segmentation//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer: 234-241 [DOI: 10.1007/978-3-319-24574-4_28]
Sahiner B, Chan H P, Petrick N, Gopal S S and Goodsitt M M. 1997. Neural network design for optimization of the partial area under the receiver operating characteristic curve//Proceedings of 1997 International Conference on Neural Networks. Houston, USA: IEEE: 2468-2471 [DOI: 10.1109/ICNN.1997.614545]
Sciavicco G and Stan I E. 2020. Knowledge extraction with interval temporal logic decision trees//Proceedings of the 27th International Symposium on Temporal Representation and Reasoning. Dagstuhl, Germany: [s.n.]: 9:1-9:16 [DOI: 10.4230/LIPIcs.TIME.2020.9]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2023-04-30]. https://arxiv.org/pdf/1409.1556.pdf
Sokolova M, Japkowicz N and Szpakowicz S. 2006. Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation//Proceedings of the 19th Australian Joint Conference on Artificial Intelligence. Hobart, Australia: Springer: 1015-1021 [DOI: 10.1007/11941439_114]
Soria X, Riba E and Sappa A. 2020. Dense extreme inception network: towards a robust CNN model for edge detection//Proceedings of 2020 IEEE Winter Conference on Applications of Computer Vision. Snowmass Village, USA: IEEE: 1912-1921 [DOI: 10.1109/WACV45572.2020.9093290]
Su Z, Liu W Z, Yu Z T, Hu D W, Liao Q, Tian Q, Pietikainen M and Liu L. 2021. Pixel difference networks for efficient edge detection//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 5097-5107 [DOI: 10.1109/ICCV48922.2021.00507]
Xie S N and Tu Z W. 2015. Holistically-nested edge detection//Proceedings of 2015 IEEE/CVF International Conference on Computer Vision. Santiago, Chile: IEEE: 1395-1403 [DOI: 10.1109/ICCV.2015.164]
Zhang T Y and Suen C Y. 1984. A fast parallel algorithm for thinning digital patterns. Communications of the ACM, 27(3): 236-239 [DOI: 10.1145/357994.358023]
Zhou Z H. 2016. Machine Learning. Beijing: Tsinghua University Press (in Chinese)