Ship name text detection method with scene priors fusion
2024, Vol. 29, No. 10, pp. 3104-3115
Print publication date: 2024-10-16
DOI: 10.11834/jig.230564
Chen Bowei, Yi Yaohua, Tang Ziwei, Peng Jibing, Yin Aiguo. 2024. Ship name text detection method with scene priors fusion. Journal of Image and Graphics, 29(10):3104-3115
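Objective
Ship name text is the core element of ship identity recognition. In real-world ship images, text regions vary widely in scale, which leads to missed detections of ship name text. In addition, existing natural scene text detection algorithms have difficulty excluding interference from background text, patterns, and similar factors in the ship name detection task. To address these problems, this paper proposes a ship name detection method that fuses scene priors.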
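Method
First, based on the correlation between the bow and the ship name target, a region supervision module based on a prior loss is proposed to constrain the model to focus on the features of ship name text regions. Then, to improve the granularity of text region localization, a ship name region localization module based on asymmetric convolution is proposed; it enhances the edge information of text regions and further improves the recall of ship name detection.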
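The asymmetric-convolution enhancement mentioned above relies on the additivity of convolution: parallel branches with compatible kernels, applied to the same input and summed, are equivalent to one convolution with the summed kernel. The following is a minimal pure-Python sketch of that property, assuming the three-branch form with 3 × 3, 3 × 1, and 1 × 3 kernels and ignoring bias and batch normalization. The helper names (`conv2d_same`, `fuse_kernels`, `add_maps`) and all numeric values are hypothetical and are not taken from the paper.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' 2D cross-correlation for an odd-sized kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as zero
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def add_maps(*maps):
    """Element-wise sum of equally sized 2D maps."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*maps)]


def fuse_kernels(k3x3, k3x1, k1x3):
    """Add the 3x1 and 1x3 kernels onto the centre column and row of the 3x3 kernel."""
    fused = [row[:] for row in k3x3]  # copy so the input kernel is not modified
    for i in range(3):
        fused[i][1] += k3x1[i][0]  # vertical kernel -> middle column
        fused[1][i] += k1x3[0][i]  # horizontal kernel -> middle row
    return fused


# Hypothetical 4x4 feature map and kernel weights, chosen only for illustration.
image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 1.0],
         [2.0, 1.0, 0.0, 2.0],
         [1.0, 0.0, 1.0, 1.0]]
k3x3 = [[0.5, -1.0, 0.25], [1.0, 2.0, -0.5], [0.0, 0.75, 1.5]]
k3x1 = [[0.5], [1.0], [-0.25]]
k1x3 = [[-1.0, 0.5, 2.0]]

# Three parallel branches, summed.
branch_sum = add_maps(conv2d_same(image, k3x3),
                      conv2d_same(image, k3x1),
                      conv2d_same(image, k1x3))
# One convolution with the fused kernel.
fused_out = conv2d_same(image, fuse_kernels(k3x3, k3x1, k1x3))

max_diff = max(abs(a - b)
               for row_a, row_b in zip(branch_sum, fused_out)
               for a, b in zip(row_a, row_b))
print(max_diff)  # the two results agree up to floating-point rounding
```

Because the 3 × 1 and 1 × 3 kernels land on the centre column and row of the fused kernel, the central "skeleton" weights are strengthened relative to the corners, which is one way to read the claim that the module enriches text edge information.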
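Result
This paper collected, annotated, and publicly released CBWLZ2023, a real-world ship name text detection dataset, for experimental validation, and compared the proposed algorithm with eight recent general-purpose natural scene text detection methods. The proposed algorithm achieves an F1 score of 94.2% on the ship name text detection task, 2.3% higher than the second-best model and 2.8% higher than the baseline model. Parameter analysis and ablation experiments were also conducted on CBWLZ2023 to verify the effectiveness of each module. The experimental results show that the proposed algorithm accurately obtains text regions with clear boundaries and improves ship name text detection.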
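Conclusion
The proposed ship name detection model with fused scene priors addresses the problems caused by varying ship name text scales and background text interference. It surpasses existing scene text detection algorithms in detection accuracy, demonstrating its effectiveness and advancement. CBWLZ2023 is available at
https://aistudio.baidu.com/aistudio/datasetdetail/224137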
Objective
Ships are the most important carriers of waterborne transportation and account for over two-thirds of the goods transported in global trade. A ship name is one of the most important identifiers of a ship: it is unique and distinctive, and it forms the core element of intelligent ship identity recognition. Detecting ship name text is therefore crucial for strengthening waterway traffic regulation and improving maritime transport safety. In real-world scenes, however, ships differ in size and type, so the area of the ship name text region varies, and the aspect ratio of the text differs greatly across ship types. These variations directly reduce detection accuracy and increase the likelihood of missed detections. In addition, background text, patterns, and other scene elements interfere with ship name text detection. Existing natural scene text detection algorithms do not fully eliminate this interference, and applying them directly to ship name text detection may result in poor robustness. To address these issues, this study proposes a ship name detection method based on scene prior information.
Method
First, because ship name text is usually located at the bow and on the two sides of a ship, this study proposes a region supervision module based on a prior loss, which exploits the correlation between the bow and the ship name text target. Classification and regression branches on the shared feature maps extract prior information about the bow region, from which a scene prior loss with bow correlation is constructed. During training, the model simultaneously learns the main task of ship name text detection and the auxiliary task of bow object detection, and the network parameters are updated through a joint loss, which constrains the model’s attention to ship name text region features and suppresses background interference. Second, a ship name region localization module based on asymmetric convolution is proposed to improve the granularity of text region localization. It builds lateral connections between deep semantic information and shallow localization information by fusing feature layers of different scales. Exploiting the additivity of convolution, three convolution kernels of sizes 3 × 3, 3 × 1, and 1 × 3 enhance the fused feature maps, balancing the weights of the features within the kernel region to enrich text edge information. Finally, differentiable binarization is introduced to generate text boundaries and localize ship name text regions. Because no ship name text detection dataset is publicly available, this study also constructs the CBWLZ2023 dataset, which comprises 1,659 images of various ship types, such as fishing vessels, passenger ships, cargo ships, and warships, captured in real-world scenes such as waterways and ports, with variations in background, ship pose, lighting, text attributes, and character size.
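The differentiable binarization step can be illustrated with a short sketch. The abstract does not state the formula, so the code below assumes the commonly cited form B = 1 / (1 + exp(-k (P - T))), where P is a predicted text probability, T is a learned threshold, and k is an amplification factor. The default k = 50, the function names, and the sample maps are illustrative assumptions, not values reported in this paper.

```python
import math


def differentiable_binarization(prob, thresh, k=50.0):
    """Soft, differentiable step function: close to 1 when prob > thresh, close to 0 otherwise.

    k controls how sharply the output switches around the threshold (assumed value).
    """
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the soft binarization pixel by pixel to two equally sized 2D maps."""
    return [
        [differentiable_binarization(p, t, k) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Hypothetical 2x3 probability and threshold maps, for illustration only.
prob_map = [[0.9, 0.5, 0.1],
            [0.8, 0.3, 0.2]]
thresh_map = [[0.3, 0.5, 0.3],
              [0.3, 0.3, 0.3]]

approx_binary = binarize_map(prob_map, thresh_map)
for row in approx_binary:
    print([round(v, 4) for v in row])
```

Unlike a hard threshold, this function has a nonzero gradient everywhere, so the threshold map can be trained jointly with the probability map; pixels whose probability equals the threshold map to exactly 0.5.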
Result
To validate the proposed algorithm, this study collected, annotated, and publicly released CBWLZ2023, a real-world ship name text detection dataset, and compared the algorithm with eight state-of-the-art general natural scene text detection methods. Quantitative results show that the proposed algorithm achieves an F-value of 94.2% on the ship name text detection task, a 2.3% improvement over the second-best model. Ablation experiments show that the F-value increases by 2.3% and 0.7% after adding the region supervision module based on prior loss and the ship name region localization module based on asymmetric convolution, respectively, and by 2.8% when both modules are combined, confirming the effectiveness of each module. Qualitative results indicate that the proposed algorithm is more robust than the other methods to text of varying scales and to background interference: it captures text regions with clear boundaries and reduces false positives and missed detections. Overall, the experiments demonstrate that the proposed algorithm improves ship name text detection performance.
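For readers unfamiliar with the metric, the sketch below shows how an F-value is typically computed, assuming it denotes the usual F1 score, the harmonic mean of precision and recall. The abstract reports only the final F-value, so the detection counts used here are hypothetical and do not reproduce the paper's results; the function name `f_measure` is likewise an illustrative choice.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from detection counts.

    true_pos:  detected boxes matched to a ground-truth text region
    false_pos: detected boxes with no matching ground truth
    false_neg: ground-truth text regions that were missed
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = f_measure(true_pos=90, false_pos=10, false_neg=10)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two values, reducing missed detections (higher recall) and reducing false positives (higher precision) both have to improve for the F-value to rise substantially.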
Conclusion
This study proposes a ship name detection method based on scene prior information. The algorithm has two main advantages. First, it fully exploits the strong correlation between the bow region of a ship and the ship name text region, suppressing the interference of background information in ship name detection. Second, it integrates multiscale text feature information to improve the robustness of multiscale text detection. On the CBWLZ2023 dataset, the proposed algorithm achieves higher detection accuracy than existing scene text detection algorithms, demonstrating its effectiveness and advancement. The CBWLZ2023 dataset can be obtained from
https://aistudio.baidu.com/aistudio/datasetdetail/224137
Keywords: ship name text detection; scene prior loss; regional supervision; feature enhancement; asymmetric convolution