Detection of camouflage targets based on attention mechanism and multi-detection layer structure

Lai Jie; Peng Ruihui; Sun Dianxing; Huang Jie

doi:10.11834/jig.221189

Image Analysis and Recognition

Vol. 29, Issue 1, Pages: 134-146(2024)
Published: 16 January 2024
Lai Jie, Peng Ruihui, Sun Dianxing, Huang Jie. 2024. Detection of camouflage targets based on attention mechanism and multi-detection layer structure. Journal of Image and Graphics, 29(01): 0134-0146. DOI: 10.11834/jig.221189.

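Abstract (translated from the Chinese)

Objective

Camouflaged targets are an important class of objects in target detection. Because they blend strongly into the background, have weak visual edges, and carry little feature information, conventional detection algorithms are prone to missed alarms and false alarms and achieve low detection accuracy. To address these difficulties, an algorithm for detecting camouflage targets based on multi-detection layers and adaptive weight (MAH-YOLOv5) is proposed, built on the YOLOv5 (you only look once) algorithm.

Method

A non-significant target detection layer is added to the network prediction head to improve the network's perception of targets that occupy very few pixels and carry insufficient semantic information. An attention mechanism is fused into the feature extraction backbone to adjust the weights that the convolutional network assigns to targets with insufficient feature information, so that the network focuses more on the camouflaged targets to be detected. A multi-scale training strategy is used during network training to further improve the robustness and generalization ability of the model. Missed-alarm and false-alarm metrics for military target detection are defined, and a comprehensive detection index for camouflaged targets is proposed.

Result

The experiments are trained and validated on a camouflage dataset collected by the research group. On this self-made dataset, the proposed method reaches a mean average precision (mAP) of 76.64%, which is 3.89% higher than that of YOLOv5. The missed detection rate is 8.53% and the false alarm rate is only 0.14%, which are 2.75% and 0.56% lower than those of YOLOv5, respectively. The comprehensive detection index for camouflaged targets reaches 88.17%. Among the compared algorithms, the comprehensive detection index of the proposed method is second only to that of the most advanced algorithms such as YOLOv8.

Conclusion

The proposed method considerably improves recognition accuracy, missed detection rate, and other metrics, has the best comprehensive detection capability, and can provide technical support and a reference for the fast, high-accuracy detection and recognition of camouflaged targets on the battlefield.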

Abstract

Objective

Camouflage target detection is an important research area in computer vision. Its main goal is to extract the position and category of a target from a complex background. Beyond its wide use in the military, camouflage target detection has considerable application and research value in medical image segmentation, industrial defect detection, agricultural fruit detection, and other fields. A camouflaged target typically blends strongly into its surrounding background and has weak visual edges, low resolution, and insufficient feature information. As a result, conventional target detection algorithms struggle to meet the requirements of camouflage target detection and typically show a high missed detection rate and low detection accuracy. To address these issues, this work proposes a camouflage target detection approach based on YOLOv5 (MAH-YOLOv5).

Method

First, the YOLOv5 prediction head detects small, medium, and large objects with three detection layers at different scales: 80 × 80, 40 × 40, and 20 × 20, respectively. The layer for the smallest objects can only recognize objects larger than 8 × 8 pixels, so targets that occupy an extremely small share of the image are overlooked. Consequently, a non-significant target detection layer is added to the prediction head to improve the network's perception of targets with insufficient feature information and to reduce missed detections and false alarms during detection and recognition. Second, a convolutional neural network is used for feature extraction, but during extraction every part of the image is assigned the same weight. This prevents the network from concentrating on the target's useful information and wastes computing resources. Therefore, the convolutional block attention module (CBAM) is inserted into the feature extraction backbone to make better use of target feature information. CBAM has two components, channel attention and spatial attention. It fuses attention features along the channel and spatial dimensions, adjusts the weights that the network assigns to targets with insufficient feature information, and makes the network pay more attention to the camouflaged target to be detected, which improves the average detection accuracy for camouflaged targets. Third, a multiscale training method is used during network training: varying the input scale expands the variety of the dataset and improves the model's robustness and generalization ability. Finally, missed-alarm and false-alarm indexes for military target detection are defined, and a comprehensive detection index for camouflaged targets is proposed. Combined with the average detection accuracy and speed, this index provides a means of quantitative comparison between different techniques.
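The relationship between input resolution, detection-layer grids, and multiscale training can be illustrated with a short sketch. It assumes a 640 × 640 input and the standard YOLOv5 strides of 8, 16, and 32, which reproduce the 80 × 80, 40 × 40, and 20 × 20 grids named above. The stride of 4 for the added layer, the ±50% scale range, and the function names `grid_sizes` and `sample_train_size` are illustrative assumptions; the abstract does not specify them.

```python
import random

# Strides of the three standard YOLOv5 detection layers (input pixels per grid cell).
BASE_STRIDES = (8, 16, 32)
# Assumption: the added layer for very small targets uses stride 4.
# The abstract does not state the stride of the non-significant target layer.
EXTRA_STRIDE = 4


def grid_sizes(input_size, strides):
    """Return {stride: grid cells per side} for a square input image."""
    return {s: input_size // s for s in strides}


def sample_train_size(base=640, low=0.5, high=1.5, multiple=32, rng=random):
    """Pick a random square training resolution that is a multiple of `multiple`.

    The 0.5x to 1.5x range is an assumed setting, not taken from the paper.
    """
    lo = int(base * low) // multiple
    hi = int(base * high) // multiple
    return rng.randint(lo, hi) * multiple


if __name__ == "__main__":
    print(grid_sizes(640, BASE_STRIDES))                    # {8: 80, 16: 40, 32: 20}
    print(grid_sizes(640, (EXTRA_STRIDE,) + BASE_STRIDES))  # adds a 160 x 160 grid
    rng = random.Random(0)
    for _ in range(3):
        size = sample_train_size(rng=rng)
        print(size, grid_sizes(size, BASE_STRIDES))
```

Under these assumptions, a stride-4 layer halves the cell size of the finest grid, which is one way a detector could respond to targets smaller than 8 × 8 pixels. Multiscale training in turn changes every grid size from batch to batch.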

Result

The model is trained and verified on the research group's camouflage dataset, which contains 3 200 training samples and 1 100 test samples, and is compared with faster region convolutional neural network (Faster R-CNN), YOLOv4-tiny, single shot multibox detector (SSD), detection Transformer (DETR), YOLOX, YOLOv7, YOLOv8, and other algorithms. On the self-made dataset, the proposed method achieves a mean average precision (mAP) of 76.64%, a detection speed of 53 frames per second (FPS), a missed detection rate (MA) of 8.53%, a false alarm rate (FA) of only 0.14%, and a comprehensive detection index for camouflaged targets of 88.17%. Compared with the YOLOv5 algorithm, the mAP is 3.89% higher, the MA is 2.75% lower, the FA is 0.56% lower, and the comprehensive detection index is 0.74% higher. Furthermore, adding a small target detection layer, integrating an attention mechanism, and training with a multiscale method each improve the detection performance of YOLOv5. After the small target detection layer is added, the mAP of YOLOv5 rises by 4.12% and the FA falls by 0.71%, while the MA and the comprehensive detection index change only slightly. After the attention mechanism is introduced, the mAP improves by 3.89%, the FA falls by 0.63%, the MA falls by 0.71%, and the comprehensive detection index rises by 0.29%. After multiscale training is applied, the mAP increases by 3.13%, the MA falls by 2.85%, and the FA falls by 0.56%. In the comparison with Faster R-CNN, SSD, YOLOv4-tiny, DETR, YOLOX, YOLOv7, YOLOv8, and other algorithms, the test results indicate that the proposed method outperforms the most advanced YOLOv8 algorithm in terms of mAP, FPS, MA, FA, and other indicators, while its comprehensive detection index is second only to that of YOLOv8.
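The YOLOv5 baseline figures are not listed in the abstract, but they can be inferred from the reported differences. The sketch below does that arithmetic, assuming the differences are absolute percentage points rather than relative changes. The resulting baseline values are implied by that assumption and are not figures quoted from the paper. The names `REPORTED`, `DELTA`, and `implied_baseline` are illustrative.

```python
from decimal import Decimal

# MAH-YOLOv5 values reported in the abstract, in percent.
REPORTED = {
    "mAP": Decimal("76.64"),
    "MA": Decimal("8.53"),
    "FA": Decimal("0.14"),
    "index": Decimal("88.17"),
}

# Reported change relative to YOLOv5.
# Assumption: these are percentage points, not relative changes.
DELTA = {
    "mAP": Decimal("3.89"),
    "MA": Decimal("-2.75"),
    "FA": Decimal("-0.56"),
    "index": Decimal("0.74"),
}


def implied_baseline(reported, delta):
    """Subtract each reported change to recover the implied YOLOv5 value."""
    return {name: reported[name] - delta[name] for name in reported}


if __name__ == "__main__":
    for name, value in implied_baseline(REPORTED, DELTA).items():
        print(f"{name}: {value}%")
    # mAP: 72.75%
    # MA: 11.28%
    # FA: 0.70%
    # index: 87.43%
```

Read this way, the reduction in false alarms is large in relative terms (from an implied 0.70% to 0.14%), while the gain in the comprehensive index is comparatively small.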

Conclusion

This paper improves the YOLOv5 method by adding a small target detection layer and fusing an attention mechanism, and proposes a camouflage target detection method on that basis. The experimental results show that the proposed method substantially improves detection accuracy and recognition rate and can effectively identify camouflaged targets in complex background environments. Comparisons show that its comprehensive detection performance is much better than that of Faster R-CNN, SSD, YOLOv4-tiny, DETR, YOLOv7, and other algorithms. The proposed method can provide technical assistance and a reference for the rapid, accurate identification of battlefield camouflage targets.

Keywords

camouflage target detection; non-significant target detection layer; attention mechanism; multi-scale training; composite detection index
