改进YOLOv7的交通标志识别模型

孟勃; 史伟大

doi:10.11834/jig.230501

Image analysis and recognition

Improved traffic sign recognition model for YOLOv7
2024, Vol. 29, No. 9, pp. 2737-2752
Print publication date: 2024-09-16

孟勃，史伟大. 2024. 改进YOLOv7的交通标志识别模型. 中国图象图形学报， 29(09):2737-2752

Meng Bo， Shi Weida. 2024. Improved traffic sign recognition model for YOLOv7. Journal of Image and Graphics， 29(09):2737-2752

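Abstract

Objective

With the rapid development of autonomous and assisted driving, traffic sign recognition has become increasingly important. However, current traffic sign recognition algorithms still have low accuracy, and false and missed detections are especially common in scenes with complex backgrounds, insufficient lighting, or small-target traffic signs. To address these problems, this paper proposes a traffic sign recognition model based on an improved YOLOv7 (you only look once version 7).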

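Method

First, the spatial pyramid pooling fast cross stage partial concat (SPPFCSPC) method replaces the spatial pyramid pooling cross stage partial concat (SPPCSPC) method used by YOLOv7, which improves the feature extraction capability of the algorithm. Second, a weighted bi-directional feature pyramid network (BiFPN) is adopted to strengthen multi-scale feature fusion. Third, the normalized Wasserstein distance (NWD), a new metric for the distance between boxes, is adopted to address the excessive sensitivity of the traditional intersection over union (IoU) metric when detecting small-target traffic signs. Finally, the content-aware reassembly of feature (CARAFE) operator generates upsampling kernels adaptively from the input features, which effectively enlarges the receptive field of the model, makes better use of the information around the target, and reduces false and missed detections of traffic signs.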
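The weighted fusion attributed to BiFPN above can be illustrated with the fast normalized fusion rule usually associated with BiFPN: each input feature receives a non-negative learnable weight, and the weights are normalized before the inputs are summed. The sketch below is a minimal pure-Python illustration on flat lists rather than real feature maps. The function name `fast_normalized_fusion`, the `eps` value, and the toy inputs are assumptions for illustration and are not taken from the paper's implementation.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors with normalized non-negative weights.

    features: list of equal-length lists of floats (already resized to one scale).
    weights:  one raw scalar weight per feature; negatives are clipped to 0 (ReLU).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    clipped = [max(w, 0.0) for w in weights]   # ReLU keeps every weight >= 0
    total = sum(clipped) + eps                 # eps avoids division by zero
    norm = [w / total for w in clipped]        # normalized contribution per input
    length = len(features[0])
    return [sum(n * f[i] for n, f in zip(norm, features)) for i in range(length)]


# Toy example: two 4-element "feature maps" at the same resolution (hypothetical values).
shallow = [1.0, 2.0, 3.0, 4.0]
deep = [4.0, 3.0, 2.0, 1.0]

equal = fast_normalized_fusion([shallow, deep], [1.0, 1.0])
skewed = fast_normalized_fusion([shallow, deep], [3.0, 1.0])
print(equal)   # close to the plain average of the two inputs
print(skewed)  # pulled toward `shallow`, which carries the larger weight
```

In a trained network the raw weights would be parameters updated by backpropagation, so the model itself decides how much each resolution contributes at every fusion node.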

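Result

Experimental results show that, with fewer parameters than the original algorithm, the improved algorithm reaches mAP@0.5 and mAP@0.5∶0.9 values of 92.50% and 72.21% on the TT100K traffic sign dataset, which are 3.24 and 1.83 percentage points higher than those of the original YOLOv7 algorithm, respectively. The effectiveness of the improvements was also verified on the CCTSDB traffic sign dataset, which is characterized by small targets, and on a collated dataset of foreign traffic signs.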

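Conclusion

Experimental verification together with subjective and objective evaluation demonstrates the feasibility of the improved algorithm. It effectively recognizes small-target traffic signs in a variety of environments and, while reducing the number of parameters, further improves the average precision of the YOLOv7 algorithm on traffic sign recognition.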

Abstract

Objective

Traffic sign recognition has become an important research direction with the rapid development of driverless and assisted driving, both of which require accurate recognition of traffic signs in real driving environments. The recognition rate is easily degraded by the external environment. For small-target traffic signs in particular, most algorithms still have very low accuracy, which readily leads to false and missed detections and impairs the driver's judgment of the traffic signs on the road. Improving detection accuracy is therefore necessary to reduce accidents and improve driving safety. This paper proposes a traffic sign recognition method based on an improved YOLOv7 algorithm.

Method

First, drawing on the idea of spatial pyramid pooling fast (SPPF), the spatial pyramid pooling cross stage partial concat (SPPCSPC) module of the original YOLOv7 model was modified: the input feature map was re-partitioned into blocks, pooling operations of different sizes were applied in each block, the pooled results were concatenated according to the positions of the original blocks, and a convolution was then applied. The resulting spatial pyramid pooling structure is called spatial pyramid pooling fast cross stage partial concat (SPPFCSPC). SPPFCSPC replaces SPPCSPC in the original model and pools the input feature map at multiple scales, which optimizes training and improves the accuracy of the algorithm. Second, ordinary feature fusion often adds features of different resolutions after resizing without distinguishing between them. To solve this problem, a bidirectional feature pyramid network (BiFPN) was used in the neck to assign an additional weight to each input during feature fusion, so that the network learns the importance of each input feature, merges the multiscale features of the target effectively, and improves the detection of small targets. Third, because small-target detection requires high localization performance, the normalized Wasserstein distance (NWD), a metric for the distance between boxes, was adopted to address the high sensitivity of the traditional intersection over union (IoU) metric to small targets; it is used in anchor-based detectors to enhance the non-maximum suppression module and the loss function.
Specifically, each bounding box was modeled as a two-dimensional Gaussian distribution, which is more consistent with the characteristics of small targets, and the IoU between the predicted and ground-truth boxes was replaced by the similarity between the two distributions. NWD was designed as a new metric to measure this similarity. It can be applied to any detector that uses the IoU metric by directly replacing IoU with NWD, which improves the recognition of traffic signs with few features in real traffic scenarios. Finally, the lightweight upsampling operator content-aware reassembly of features (CARAFE) was used: the feature map is upsampled to the target size with kernels generated adaptively from the input features, which realizes feature fusion across scales, effectively enlarges the receptive field of the model, makes better use of the information around the target, improves detection capability, and reduces missed detections.
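To make the contrast between IoU and NWD concrete, the sketch below follows the commonly used NWD formulation: a box (cx, cy, w, h) is treated as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), the second-order Wasserstein distance between two such Gaussians reduces to the Euclidean distance between the vectors (cx, cy, w/2, h/2), and NWD = exp(−W₂ / C). The function names, the example boxes, and the default constant `c=12.8` are assumptions for illustration; C is dataset dependent and the value used in the paper is not given here.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    c is a dataset-dependent normalizing constant; 12.8 is a placeholder assumption.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


# The same 3-pixel horizontal shift applied to a small and a large box (hypothetical boxes).
small_gt, small_pred = (10, 10, 6, 6), (13, 10, 6, 6)
large_gt, large_pred = (100, 100, 60, 60), (103, 100, 60, 60)

print(iou(small_gt, small_pred), iou(large_gt, large_pred))  # about 0.33 versus about 0.90
print(nwd(small_gt, small_pred), nwd(large_gt, large_pred))  # identical for both box sizes

# Boxes that do not overlap at all still receive a non-zero NWD score.
far_pred = (17, 10, 6, 6)
print(iou(small_gt, far_pred), nwd(small_gt, far_pred))
```

The 3-pixel shift cuts the IoU of the 6×6 box to about one third while barely changing the IoU of the 60×60 box, whereas NWD scores both shifts the same and stays positive even without overlap. This is the behaviour that makes it attractive for label assignment, non-maximum suppression, and loss functions on small targets.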

Result

The experimental results show that the improved YOLOv7 model trained on the Tsinghua-Tencent 100K (TT100K) traffic sign dataset reached mAP@0.5 and mAP@0.5∶0.9 values of 92.5% and 72.21%, respectively. The original YOLOv7 algorithm reached 89.26% and 70.38%, so the improvements are 3.24 and 1.83 percentage points, respectively. The improved algorithm was further verified on the CSUST Chinese traffic sign detection benchmark (CCTSDB), a dataset with small targets, and on a collated dataset of foreign traffic signs. Compared with the original algorithm, the improved algorithm increased accuracy by 3.15% and 2.24% on CCTSDB and by 2.28% and 1.25% on the foreign traffic sign dataset. The improved algorithm thus increased recognition accuracy on all three traffic sign datasets.
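The gains quoted for Tsinghua-Tencent 100K are absolute differences between mAP values, which is easy to confirm from the four numbers above. The short check below recomputes them and also shows the corresponding relative gains for comparison; the dictionary and function names are illustrative only.

```python
# mAP values (in percent) quoted in the abstract for Tsinghua-Tencent 100K.
baseline = {"mAP@0.5": 89.26, "mAP@0.5:0.9": 70.38}   # original YOLOv7
improved = {"mAP@0.5": 92.50, "mAP@0.5:0.9": 72.21}   # improved model


def gains(base, new):
    """Return {metric: (absolute gain in points, relative gain in percent)}."""
    out = {}
    for metric, old in base.items():
        diff = new[metric] - old
        out[metric] = (round(diff, 2), round(100 * diff / old, 2))
    return out


result = gains(baseline, improved)
for metric, (absolute, relative) in result.items():
    print(f"{metric}: +{absolute} points absolute, +{relative}% relative")
```

The absolute differences reproduce the reported 3.24 and 1.83, which confirms that the abstract's figures are percentage-point gains rather than relative improvements.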

Conclusion

Experimental verification together with subjective and objective evaluation demonstrates the feasibility and effectiveness of the improved YOLOv7 traffic sign recognition model. The improved model effectively increases the recognition rate of ordinary and small-target traffic signs in various harsh environments while reducing the number of parameters. It therefore meets the recognition accuracy requirements of unmanned and assisted driving systems to a certain extent.

关键词

交通标志识别；空间金字塔池化快速跨级部分连接（SPPFCSPC）；加权双向特征金字塔网络（BiFPN）；归一化Wasserstein距离（NWD）；特征内容的感知重组（CARAFE）；小目标

Keywords

traffic sign recognition; spatial pyramid pooling fast cross stage partial concat (SPPFCSPC); bi-directional feature pyramid network (BiFPN); normalized Wasserstein distance (NWD); content-aware reassembly of feature (CARAFE); small targets
