Survey of small object detection
2023, Vol. 28, No. 9, pp. 2587-2615
Print publication date: 2023-09-16
DOI: 10.11834/jig.220455
Pan Xiaoying, Jia Ningxin, Mu Yuanzhen, Gao Xuanrong. 2023. Survey of small object detection. Journal of Image and Graphics, 28(09): 2587-2615
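With the rapid development of computer vision and artificial intelligence, object detection has attracted increasingly wide attention. Because small objects occupy few pixels, carry little semantic information, are easily disturbed by complex scenes, and tend to cluster and occlude one another, small object detection has long been a major difficulty in the field of object detection. At present, vision-based small object detection is increasingly important in many areas of daily life. To further promote the development of small object detection, improve its accuracy and speed, and optimize its algorithm models, this paper reviews domestic and international research and results, with a focus on the open problems in small object detection. First, the difficulties of small object detection are analyzed from the perspectives of the visual features of small objects, the distribution of objects, and the detection environment. Small object detection methods are then systematically summarized in terms of data augmentation, super-resolution, multiscale feature fusion, contextual semantic information, anchor box mechanisms, attention mechanisms, and specific detection scenarios, and the relatively mature one-stage small object detection methods are organized by framework structure, loss function, and prediction and matching mechanisms. Second, the evaluation metrics for small object detection and the datasets that can be used for it are described in detail, and the detection results and visualized detections of several classic small object detection methods on MS-COCO (Microsoft common objects in context), VisDrone2021 (vision meets drones 2021), Tsinghua-Tencent100K, and other datasets are compared and analyzed. Finally, the challenges facing future small object detection, including the difficulty of localizing small objects, the effect of network downsampling on small objects, and intersection-over-union (IoU) thresholds that are unreasonable for small objects, are analyzed together with the corresponding research directions.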
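To illustrate why a fixed IoU threshold can be unreasonable for small objects, the following minimal sketch computes the IoU between a ground-truth box and a prediction displaced by the same number of pixels, once for a small box and once for a large one. The box sizes (16 and 128 pixels), the 4-pixel shift, and the helper names `iou` and `shifted_iou` are illustrative assumptions, not values or code from the survey.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a size x size box and the same box moved diagonally by `shift` pixels."""
    ground_truth = (0, 0, size, size)
    prediction = (shift, shift, size + shift, size + shift)
    return iou(ground_truth, prediction)


# Hypothetical sizes: the same 4-pixel localization error on two object scales.
small_iou = shifted_iou(16, 4)
large_iou = shifted_iou(128, 4)
print(f"16x16 box:   IoU = {small_iou:.3f}")
print(f"128x128 box: IoU = {large_iou:.3f}")
```

Under these assumptions, the identical 4-pixel error drops the 16x16 box to an IoU of about 0.39, below the commonly used 0.5 threshold, while the 128x128 box stays at about 0.88.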
In recent years, object detection has attracted increasing attention owing to the rapid development of computer vision and artificial intelligence. Early traditional object detection methods, such as the histogram of oriented gradients (HOG) and the deformable part model (DPM), usually follow three steps: region selection, manual feature extraction, and classification and regression. However, manual feature extraction is severely limited for small object detection. Object detection algorithms based on convolutional neural networks can be divided into two-stage and one-stage algorithms. Two-stage algorithms, such as the faster region-based convolutional neural network (Faster R-CNN) and the cascade region-based convolutional neural network (Cascade R-CNN), select candidate regions through a region proposal network and then classify and regress these regions to obtain the detection results. Their accuracy on small objects nevertheless remains low. One-stage algorithms, such as the single shot MultiBox detector (SSD) and you only look once (YOLO), locate the object and output its category directly, which improves detection speed to a certain extent. Even so, small object detection remains a major challenge in object detection because small objects occupy few pixels, carry little semantic information, and are easily disturbed by complex scenes. The main challenges are as follows. First, small objects offer few features. Because they are small in scale and cover a small area of the image, extracting useful semantic features during network training is difficult. Second, small object detection is susceptible to interference. Most small objects have low resolution, appear blurred, and carry little visual information.
Feature extraction is therefore difficult and easily disturbed, so the detection model struggles to locate and identify small objects accurately, and many false detections and missed detections occur. Third, small object datasets are scarce. Most mainstream object detection datasets, such as PASCAL VOC and MS-COCO, target normal-scale objects; the proportion of small-scale objects in them is insufficient, and their distribution is uneven. The datasets mentioned in this study that can be used for small object detection, such as the DOTA remote sensing object detection dataset and the face detection dataset and benchmark (FDDB), each target a specific scene or task and are therefore not general-purpose for small object detection. Fourth, small objects tend to cluster and occlude one another. Serious occlusion occurs when small objects cluster, and after repeated downsampling and pooling operations much of their feature information is lost, which makes detection difficult. At present, vision-based small object detection is increasingly important in many areas of daily life. To further promote its development, improve its speed and accuracy, and optimize its algorithm models, this study reviews domestic and international research and results on small object detection, with a focus on its open problems. Small object detection methods are analyzed and summarized in terms of data augmentation, super-resolution, multiscale feature fusion, contextual semantic information, anchor box mechanisms, attention mechanisms, and specific detection scenarios. Data augmentation addresses three problems: the scarcity of general-purpose small object datasets, the small number of small objects in public datasets, and the uneven distribution of small objects in images.
The earliest data augmentation strategies increase the number of training instances and improve detection performance by deforming, rotating, scaling, cropping, and translating object instances. Other effective augmentation methods emerged later, including oversampling the images that contain small objects, scaling and rotating the small objects, and copying them to arbitrary new positions in the image. Data augmentation improves the robustness of a model to a certain extent, mitigates the weak visual features and scarce information of small objects, and achieves good final detection performance. However, an improperly designed augmentation strategy may introduce new noise in practical applications and impair feature extraction, which poses a challenge for algorithm design. Because small-scale objects carry little feature information, small object detection methods based on multiscale fusion need to make full use of the detailed information in the image. In existing convolutional neural network (CNN) models for general object detection, multiscale detection helps the model obtain accurate localization information and discriminative features from low-level feature layers, which benefits the detection and recognition of small-scale objects. The feature pyramid network (FPN), which provides strong semantic features at all scales, was introduced first. The FPN-based path aggregation network (PANet) followed; it not only achieved good results in instance segmentation but also improved the detection of small objects. In feature fusion, the residual feature augmentation method extracts ratio-invariant context information to reduce the information loss of the highest-level pyramid feature map.
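The copy-and-paste augmentation mentioned above, in which a small object is copied to a new position in the same image, can be sketched roughly as follows. This is a minimal illustration rather than the procedure of any cited method: the function name `copy_paste_object`, the synthetic 64x64 image, and the 8x8 object are hypothetical, and the sketch deliberately ignores overlap with existing objects, which a practical implementation would likely need to check.

```python
import numpy as np


def copy_paste_object(image, box, rng):
    """Copy the patch inside `box` (x1, y1, x2, y2) to a random new position.

    Returns the augmented image and the bounding box of the pasted copy.
    Simplification (assumption): overlap with existing objects is not checked.
    """
    x1, y1, x2, y2 = box
    height, width = y2 - y1, x2 - x1
    img_h, img_w = image.shape[:2]
    patch = image[y1:y2, x1:x2].copy()
    # Draw a top-left corner such that the pasted patch stays inside the image.
    new_x = int(rng.integers(0, img_w - width + 1))
    new_y = int(rng.integers(0, img_h - height + 1))
    augmented = image.copy()
    augmented[new_y:new_y + height, new_x:new_x + width] = patch
    return augmented, (new_x, new_y, new_x + width, new_y + height)


# Hypothetical example: a black 64x64 image with one white 8x8 "small object".
rng = np.random.default_rng(0)
image = np.zeros((64, 64, 3), dtype=np.uint8)
image[10:18, 20:28] = 255
augmented, new_box = copy_paste_object(image, (20, 10, 28, 18), rng)
print("pasted copy at", new_box)
```

The new box returned by the function would be appended to the image's annotations, so that the detector sees one more small-object instance per pasted copy.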
At present, many methods build on multiscale feature fusion, combining the high resolution of low-level features with the strong semantic information of high-level features to improve the detection accuracy for small objects. In small object detection, the feature representation of the target is weak, so the network is often deepened to learn more feature information. Introducing an attention mechanism can make the network model pay more attention to the channels and regions relevant to the task. In object detection networks, the shallow feature maps lack contextual semantic information about small objects. Incorporating attention mechanisms into the SSD model suppresses irrelevant information during feature fusion and improves the detection accuracy for small objects. In general, an attention mechanism can allocate computing resources reasonably, find the region of interest quickly, and ignore distracting information. However, an improperly designed attention module increases the computational cost of the network and interferes with the model's extraction of object features. Finally, future research directions for small object detection are discussed. Vision-based small object detection is becoming increasingly important in many areas of daily life and is expected to develop in further directions.
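As a rough illustration of how an attention mechanism can emphasize task-relevant channels, the sketch below reweights the channels of a feature map in the style of a squeeze-and-excitation block. This design is chosen here only as a familiar example; the abstract does not specify a particular attention module. The function name `channel_attention`, the tensor shapes, the reduction ratio, and the random weights are all assumptions, and in a real detector the weights would be learned.

```python
import numpy as np


def channel_attention(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Squeeze: global average pooling gives one descriptor per channel.
    Excite: two small linear layers (ReLU, then sigmoid) give per-channel weights.
    """
    squeezed = features.mean(axis=(1, 2))              # shape (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)            # shape (C // r,), ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # shape (C,), sigmoid in (0, 1)
    return features * weights[:, None, None], weights


# Hypothetical shapes: 8 channels, 4x4 spatial size, reduction ratio r = 2.
rng = np.random.default_rng(0)
channels, height, width, ratio = 8, 4, 4, 2
features = rng.standard_normal((channels, height, width))
w1 = rng.standard_normal((channels // ratio, channels)) * 0.1   # random stand-in for learned weights
w2 = rng.standard_normal((channels, channels // ratio)) * 0.1   # random stand-in for learned weights
attended, weights = channel_attention(features, w1, w2)
print("per-channel weights:", np.round(weights, 3))
```

Channels with weights near 1 pass through almost unchanged, while channels with small weights are suppressed. The two extra matrix products per feature map are also a small instance of the added computational cost noted above.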
Keywords: object detection; small object detection; data augmentation; super-resolution; multiscale feature fusion
Akyon F C, Altinuc S O and Temizel A. 2022. Slicing aided hyper inference and fine-tuning for small object detection. IEEE International Conference on Image Processing [EB/OL]. [2022-10-24]. http://arxiv.org/pdf/2202.06934.pdfhttp://arxiv.org/pdf/2202.06934.pdf
Bai Y C, Zhang Y Q, Ding M L and Ghanem B. 2018. SOD-MTGAN: small object detection via multi-task generative adversarial network//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 210-226 [DOI: 10.1007/978-3-030-01261-8_13http://dx.doi.org/10.1007/978-3-030-01261-8_13]
Benjumea A, Teeti I, Cuzzolin F and Bradley A. 2021. YOLO-Z: improving small object detection in YOLOv5 for autonomous vehicles//Proceedings of the International Conference on Computer Vision (ICCV 2021): The ROAD Challenge Workshop. [s.l.]: [s.n.]
Bianco S, Buzzelli M, Mazzini D and Schettini R. 2017. Deep learning for logo recognition. Neurocomputing,245: 23-30 [DOI: 10.1016/j.neucom.2017.03.051http://dx.doi.org/10.1016/j.neucom.2017.03.051]
Bochkovskiy A, Wang C Y and Liao H Y M. 2020. Yolov4: Optimal speed and accuracy of object detection. Computer Vision and Pattern Recognition [EB/OL]. [2020-04-23]. http://arxiv.org/pdf/2004.10934.pdfhttp://arxiv.org/pdf/2004.10934.pdf
Cai Z W, Fan Q F, Feris R S and Vasconcelos N. 2016. A unified multi-scale deep convolutional neural network for fast object detection//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 354-370 [DOI: 10.1007/978-3-319-46493-0_22http://dx.doi.org/10.1007/978-3-319-46493-0_22]
Cai Z W and Vasconcelos N. 2018. Cascade R-CNN: delving into high quality object detection//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 6154-6162 [DOI: 10.1109/CVPR.2018.00644http://dx.doi.org/10.1109/CVPR.2018.00644]
Chen C Y, Liu M Y, Tuzel O and Xiao J X. 2016. R-CNN for small object detection//Proceedings of the 13th Asian Conference on Computer Vision. Taipei, China: Springer: 214-230 [DOI: 10.1007/978-3-319-54193-8_14http://dx.doi.org/10.1007/978-3-319-54193-8_14]
Chen X L and Gupta A. 2017. Spatial memory for context reasoning in object detection//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 4106-4116 [DOI: 10.1109/ICCV.2017.440http://dx.doi.org/10.1109/ICCV.2017.440]
Chen Y K, Zhang P Z, Li Z M, Li Y W, Zhang X Y, Meng G F, Xiang S M, Sun J and Jia J Y. 2020. Stitcher: feedback-driven data provider for object detection. Computer Vision and Pattern Recognition [EB/OL]. [2021-03-14]. http://arxiv.org/pdf/2004.12432.pdfhttp://arxiv.org/pdf/2004.12432.pdf
Cheng G, Han J W, Zhou P C and Guo L. 2014. Multi-class geospatial object detection and geographic image classification based on collection of part detectors. ISPRS Journal of Photogrammetry and Remote Sensing, 98: 119-132 [DOI: 10.1016/j.isprsjprs.2014.10.002http://dx.doi.org/10.1016/j.isprsjprs.2014.10.002]
Dai J, Li Y and He K. 2016. R-FCN: object detection via region-based fully convolutional networks. Advances in Neural Information Processing Systems, #29 [DOI: https://doi.org/10.48550/arXiv.1605.06409http://dx.doi.org/https://doi.org/10.48550/arXiv.1605.06409]
Dalal N and Triggs B. 2005. Histograms of oriented gradients for human detection//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). San Diego, USA: IEEE: 886-893 [DOI: 10.1109/CVPR.2005.177http://dx.doi.org/10.1109/CVPR.2005.177]
Ding S and Zhao K. 2018. Research on daily objects detection based on deep neural network. IOP Conference Series: Materials Science and Engineering, 322(6): #062024 [DOI: 10.1088/1757-899X/322/6/062024http://dx.doi.org/10.1088/1757-899X/322/6/062024]
Everingham M, Eslami S M A, Van Gool L, Williams C K I, Winn J and Zisserman A. 2015. The PASCAL visual object classes challenge: a retrospective. International Journal of Computer Vision, 111(1): 98-136 [DOI: 10.1007/s11263-014-0733-5http://dx.doi.org/10.1007/s11263-014-0733-5]
Everingham M, Van Gool L, Williams C K I, Winn J and Zisserman A. 2010. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2): 303-338 [DOI: 10.1007/s11263-009-0275-4http://dx.doi.org/10.1007/s11263-009-0275-4]
Felzenszwalb P, McAllester D and Ramanan D. 2008. A discriminatively trained, multiscale, deformable part model//Proceedings of 2008 IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, USA: IEEE: 1-8 [DOI: 10.1109/CVPR.2008.4587597http://dx.doi.org/10.1109/CVPR.2008.4587597]
Feng Z Q, Xie Z J and Bao Z W. 2023. Realtime dense small object detection algorithm for UAV based on improved YOLOv5. Aeronautica et Astronautica Sinica, 44(7):251-265.
奉志强,谢志军,包正伟.2023. 基于改进YOLOv5的无人机实时密集小目标检测算法.航空学报,44(7):251-265 [DOI:10.7527/ S1000-6893 2022 27106http://dx.doi.org/10.7527/S1000-6893202227106]
Fu C Y, Liu W, Ranga A, Tyagi A and Berg A C. 2017. DSSD: Deconvolutional single shot detector. Computer Vision and Pattern Recognition [EB/OL]. [2021-01-23]. http://arxiv.org/pdf/1701.06659.pdfhttp://arxiv.org/pdf/1701.06659.pdf
Fu J, Liu J, Tian H, Li Y, Bao Y J, Fang Z W and Lu H Q. 2019.Dual attention network for scene segmentation//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. Long Beach, USA: IEEE: 3146-3154
Fu J M, Sun X, Wang Z R and Fu K. 2021. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images. IEEE Transactions on Geoscience and Remote Sensing, 59(2): 1331-1344 [DOI: 10.1109/TGRS.2020.3005151http://dx.doi.org/10.1109/TGRS.2020.3005151]
Gao M, Yu R, Li A, Morariu V I and Davis L S. 2018. Dynamic zoom-in network for fast object detection in large images//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 6926-6935 [DOI: 10.1109/CVPR.2018.00724http://dx.doi.org/10.1109/CVPR.2018.00724]
Ge Z, Liu S and Wang F. 2021. Yolox: exceeding YOLO series in 2021. Computer Vision and Pattern Recognition [EB/OL]. [2021-08-06]. http://arxiv.org/pdf/2107.08430.pdfhttp://arxiv.org/pdf/2107.08430.pdf
Glenn J, Stoken A and Borovec J. 2020. YOLOv5: v3. 1-Bug Fixes and Performance Improvements [DOI: 10.5281/zenodo.4154370http://dx.doi.org/10.5281/zenodo.4154370]
Guo C X, Fan B, Zhang Q, Xiang S M and Pan C H. 2020. AugFPN: improving multi-scale feature learning for object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 12592-12601 [DOI: 10.1109/CVPR42600.2020.01261http://dx.doi.org/10.1109/CVPR42600.2020.01261]
Haris M, Shakhnarovich G and Ukita N. 2021. Task-driven super resolution: Object detection in low-resolution images//Proceedings of the 28th International Conference on Neural Information Processing. Sanur, Indonesia: Springer: 387-395 [DOI: 10.1007/978-3-030-92307-5_45http://dx.doi.org/10.1007/978-3-030-92307-5_45]
He K, Gkioxari G and Dollr P. 2017. Mask R-CNN. //Proceedings of 2017 IEEE International Conference on Computer Vision. 2961-2969 [DOI: 10.1109/ICCV.2017.322http://dx.doi.org/10.1109/ICCV.2017.322]
Jain V and Learned-Miller E. 2010. FDDB: A Benchmark for Face Detection in Unconstrained Settings. Technical Report UM-CS-2010-009. University of Massachusetts
Jia K X, Ma Z H, Zhu R and Li Y G. 2022. Attention-mechanism-based light single shot multiBox detector modelling improvement for small object detection on the sea surface. Journal of Image and Graphics, 27(4): 1161-1175
贾可心, 马正华, 朱蓉, 李永刚. 2022. 注意力机制改进轻量SSD模型的海面小目标检测. 中国图象图形学报, 27(4): 1161-1175 [DOI: 10.11834/jig.200517http://dx.doi.org/10.11834/jig.200517]
Jiang C H, Zhang D X and Zhang C. 2022. Improve YOYLv3’s ground vehicle small target detection. Computer and Digital Engineering, 50(3): 548-553
蒋川虎, 张东旭, 张超. 2022. 改进YOLOv3的地面车辆小目标检测. 计算机与数字工程, 50(3): 548-553 [DOI: 10.3969/j.issn.1672-9722.2022.03.018http://dx.doi.org/10.3969/j.issn.1672-9722.2022.03.018]
Jiang R Q, Peng Y P, Xie W X and Xie G R. 2021. Improved YOLOv4 small target detection algorithm with embedded scSE module. Journal of Graphics, 42(4): 546-555
蒋镕圻, 彭月平, 谢文宣, 谢郭蓉. 2021. 嵌入scSE模块的改进YOLOv4小目标检测算法. 图学学报, 42(4): 546-555 [DOI: 10.11996/JG.j.2095-302X.2021040546http://dx.doi.org/10.11996/JG.j.2095-302X.2021040546]
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: Curran Associates Inc
Li C. 2021. Small target detection algorithm based on YOLOv5, Changjiang Information & Communications. 34(9): 30-33
李成. 基于改进YOLOv5的小目标检测算法研究. 长江信息通信, 34(9): 30-33
Li J, Wang Y B, Wang C A, Tai Y, Qian J J, Yang J, Wang C J, Li J L and Huang F Y. 2019a. DSFD: dual shot face detector//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5055-5064 [DOI: 10.1109/CVPR.2019.00520http://dx.doi.org/10.1109/CVPR.2019.00520]
Li J N, Liang X D, Wei Y C, Xu T F, Feng J S and Yan S C. 2017. Perceptual generative adversarial networks for small object detection//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE [DOI: 10.1109/CVPR.2017.211http://dx.doi.org/10.1109/CVPR.2017.211]
Li Y H, Chen Y T, Wang N Y and Zhang Z X. 2019b. Scale-aware trident networks for object detection//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 6053-6062 [DOI: 10.1109/ICCV.2019.00615http://dx.doi.org/10.1109/ICCV.2019.00615]
Li Y J, Li S S, Du H H, Chen L J, Zhang D M and Li Y. 2020. YOLO-ACN: focusing on small target and occluded object detection. IEEE Access, 8: 227288-227303 [DOI: 10.1109/ACCESS.2020.3046515http://dx.doi.org/10.1109/ACCESS.2020.3046515]
Li Z, Peng C and Yu G.2018. Detnet: a backbone network for object detection. Computer Vision and Pattern Recognition [EB/OL]. [2018-04-19]. http://arxiv.org/pdf/1804.06215.pdfhttp://arxiv.org/pdf/1804.06215.pdf
Li Z M, Peng C, Yu G, Zhang X Y, Deng Y D and Sun J. 2018. DetNet: a backbone network for object detection. Computer Vision and Pattern Recognition [EB/OL]. [2022-05-12]. http://arxiv.org/pdf/1804.06215.pdfhttp://arxiv.org/pdf/1804.06215.pdf
Lim J S, Astrid M, Yoon H and Lee S I. 2021. Small object detec-tion using context and attention//Proceedings of 2021 International Conference on Artificial Intelligence in Information and Communication (IC-AIIC). Jeju Island, Korea (South): IEEE: 181-186 [DOI: 10.1109/ICAIIC51459.2021.9415217http://dx.doi.org/10.1109/ICAIIC51459.2021.9415217]
Lin J, Jing W and Song H. 2019. SAN: Scale-aware network for semantic segmentation of high-resolution aerial images. Computer Vision and Pattern Recognition [EB/OL]. [2019-07-06]. http://arxiv.org/pdf/1907.03089.pdfhttp://arxiv.org/pdf/1907.03089.pdf
Lin T Y, Dollr P, Girshick R, He K M, Hariharan B and Belongie S. 2017. Feature pyramid networks for object detection//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 936-944 [DOI: 10.1109/CVPR.2017.106http://dx.doi.org/10.1109/CVPR.2017.106]
Lin T Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollr P and Zitnick C L. 2014. Microsoft COCO: common objects in context//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 740-755 [DOI: 10.1007/978-3-319-10602-1_48http://dx.doi.org/10.1007/978-3-319-10602-1_48]
Liu S, Qi L, Qin H F, Shi J P and Jia J Y. 2018. Path aggregation network for instance segmentation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8759-8768 [DOI: 10.1109/CVPR.2018.00913http://dx.doi.org/10.1109/CVPR.2018.00913]
Liu S T, Huang D and Wang Y H. 2019. Learning spatial fusion for single-shot object detection. Computer Vision and Pattern Recognition [EB/OL]. [2019-11-25]. http://arxiv.org/pdf/1911.09516.pdfhttp://arxiv.org/pdf/1911.09516.pdf
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y and Berg A C. 2016. SSD: single shot MultiBox detector//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 21-37 [DOI: 10.1007/978-3-319-46448-0_2http://dx.doi.org/10.1007/978-3-319-46448-0_2]
Liu W Y, Ren G F, Yu R S, Guo S, Zhu J K and Zhang L. 2021. Image-adaptive YOLO for object detection in adverse weather conditions//Proceedings of the 36th AAAI Conference on Artificial Intelligence. [s.l.]: [s.n.]
Liu Y, Liu H Y and Fan J L. 2020. Overview of research and application of small target detection based on deep learning. Chinese Journal of Electronics ,48 (3) : 590-601
刘颖, 刘红燕, 范九伦. 2020. 基于深度学习的小目标检测研究与应用综述. 电子学报, 48(3) : 590-601[DOI: 10.3969 /j.Issn.0372-2112.2020.03.024http://dx.doi.org/10.3969/j.Issn.0372-2112.2020.03.024]
Luo J H, Huang J and Bai X Y. 2022. Road small target detection method based on improved YOLOv3. Journal of Chinese Computer Systems.43(3):449-455
罗建华,黄俊,白鑫宇.2022. 改进YOLOv3的道路小目标检测方法.小型微型计算机系, 43(3):449-455 [DOI:10.20009/ j.cnki.21-1106/TP.2020-0989http://dx.doi.org/10.20009/j.cnki.21-1106/TP.2020-0989]
Ma S Q and Zhou K. 2020. An improved small object detection algorithm based on attention mechanism and feature fusion. Computer Applications and Software, 37(5): 194-199
麻森权, 周克. 2020. 基于注意力机制和特征融合改进的小目标检测算法. 计算机应用与软件, 37(5): 194-199 [DOI: 10.3969/j.issn.1000-386x.2020.05.034http://dx.doi.org/10.3969/j.issn.1000-386x.2020.05.034]
Ng H F. 2006. Automatic thresholding for defect detection. Pattern Recognition Letters, 27(14): 1644-1649 [DOI: 10.1016/j.patrec.2006.03.009http://dx.doi.org/10.1016/j.patrec.2006.03.009]
Noh J, Bae W, Lee W, Seo J and Kim G. 2019. Better to follow, follow to be better: towards precise supervision of feature super-resolution for small object detection//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 9724-9733 [DOI: 10.1109/ICCV.2019.00982http://dx.doi.org/10.1109/ICCV.2019.00982]
Pang J M, Chen K, Shi J P, Feng H J, Ouyang W L and Lin D H. 2019. Libra R-CNN: towards balanced learning for object detection//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 821-830 [DOI: 10.1109/CVPR.2019.00091http://dx.doi.org/10.1109/CVPR.2019.00091]
Ptak B and Piechocki M.2020.Small object detection and recognition in aerial images.Applied Machine Learning: #04007 [DOI:10.13140/RG.2.2.29239.04007http://dx.doi.org/10.13140/RG.2.2.29239.04007]
Qiao S Y, Chen L C and Yuille A. 2021. DetectoRS: detecting objects with recursive feature pyramid and switchable atrous convolution//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 10208-10219 [DOI: 10.1109/CVPR46437.2021.01008http://dx.doi.org/10.1109/CVPR46437.2021.01008]
Radford A, Metz L and Chintala S. 2016. Unsupervised representation learning with deep convolutional generative adversarial networks//Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico: [s.n.]
Redmon J, Divvala S, Girshick R and Farhadi A. 2016. You only look once: unified, real-time object detection//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 779-788 [DOI: 10.1109/CVPR.2016.91http://dx.doi.org/10.1109/CVPR.2016.91]
Ren S Q, He K and Girshick R. 2015. Faster R-CNN: towards real-time object detection with region proposal networks//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press
Romberg S, Pueyo L G, Lienhart R and van Zwol R. 2011. Scalable logo recognition in real-world images//Proceedings of the 1st ACM International Conference on Multimedia Retrieval. Trento, Italy: ACM: 25 [DOI: 10.1145/1991996.1992021]
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S A, Huang Z H, Karpathy A, Khosla A, Bernstein M, Berg A C and Li F F. 2015. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3): 211-252 [DOI: 10.1007/s11263-015-0816-yhttp://dx.doi.org/10.1007/s11263-015-0816-y]
Sharma S, Kiros R and Salakhutdinov R. 2015. Action recognition using visual attention. Machine Learning [EB/OL]. [2016-02-14]. http://arxiv.org/pdf/1511.04119.pdfhttp://arxiv.org/pdf/1511.04119.pdf
Simard P Y, Steinkraus D and Platt J C. 2003. Best practices for convolutional neural networks applied to visual document analysis//Proceedings of the 7th International Conference on Document Analysis and Recognition. Edinburgh, UK: IEEE: 958-963 [DOI: 10.1109/ICDAR.2003.1227801http://dx.doi.org/10.1109/ICDAR.2003.1227801]
Singh B and Davis L S. 2018. An analysis of scale invariance in object detection-SNIP//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 3578-3587 [DOI: 10.1109/CVPR.2018.00377http://dx.doi.org/10.1109/CVPR.2018.00377]
Singh B, Najibi M and Davis L S. 2018. SNIPER: efficient multi-scale training//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc
Szegedy C, Toshev A and Erhan D. 2013. Deep neural networks for object detection//Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: Curran Associates Inc
Tang X, Du D K, He Z Q and Liu J T. 2018. PyramidBox: a context-assisted single shot face detector//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer: 812-828 [DOI: 10.1007/978-3-030-01240-3_49http://dx.doi.org/10.1007/978-3-030-01240-3_49]
Tang Z Y, Liu X and Yang B J. 2020. PENet: object detection using points estimation in high definition aerial images//Proceedings of the 19th IEEE International Conference on Machine Learning and Applications (ICMLA). Miami, USA: IEEE: 392-398 [DOI: 10.1109/ICMLA51294.2020.00069http://dx.doi.org/10.1109/ICMLA51294.2020.00069]
Wan L, Zeiler M, Zhang S X, LeCun Y and Fergus R. 2013. Regularization of neural networks using dropconnect//Proceedings of the 30th International Conference on Machine Learning. Atlanta, USA: JMLR.org: III-1058-III-1066
Wang F, Jiang M Q, Qian C, Yang S, Li C, Zhang H G, Wang X G and Tang X O. 2017. Residual attention network for image classification//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu,USA: IEEE: 6450-6458 [DOI: 10.1109/CVPR.2017.683http://dx.doi.org/10.1109/CVPR.2017.683]
Wang J F, Chen Y, Gao Z K and Gao M Y. 2023. Improved YOLOv5 network for real-time multi-scale traffic sign detection. Neural Computing and Applications, 35(10): 7853-7865 [DOI: 10.1007/s00521-022-08077-5http://dx.doi.org/10.1007/s00521-022-08077-5]
Wang J H and He Z J. Small object detection survey paper. Computer Science [EB/OL]. [2023-07-11]. https://api.semanticscholar.org/CorpusID:231860446?utm_source=wikipediahttps://api.semanticscholar.org/CorpusID:231860446?utm_source=wikipedia
Wang L and Zhang W Z. 2022. A traffic sign detection based on attentional mechanism and contextual information. Computer Measurement & Control. 30(3): 54-59
王林,张文卓. 2022. 一种融合注意力机制与上下文信息的交通标志检测方法. 计算机测量与控制, 30(3): 54-59[DOI:10.16526/j.cnki.11-4762/tp.2022.03.010http://dx.doi.org/10.16526/j.cnki.11-4762/tp.2022.03.010.]
Wang M, Yang W, Wang L, Chen D, Wei F Y, KeZiErBieKe H L T and Liao Y Y. 2023. FE-YOLOv5: Feature enhancement network based on YOLOv5 for small object detection. Journal of Visual Communication and Image, 90:#103752
Wu X W, Sahoo D and Hoi S C H. 2020. Recent advances in deep learning for object detection. Neurocomputing, 396: 39-64 [DOI: 10.1016/j.neucom.2020.01.085http://dx.doi.org/10.1016/j.neucom.2020.01.085]
Xia G S, Bai X, Ding J, Zhu Z, Belongie S, Luo J B, Datcu M, Pelillo M and Zhang L P. 2018. DOTA: a large-scale dataset for object detection in aerial images//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 3974-3983 [DOI: 10.1109/CVPR.2018.00418http://dx.doi.org/10.1109/CVPR.2018.00418]
Xiao J S, Zhao T, Zhou J, Le Q P and Yang L H. 2023. Small target detection network based on context augmentation and feature refinement. Journal of Computer Research and Development, 60(2): 465-474
肖进胜, 赵陶, 周剑, 乐秋平, 杨力衡. 2023. 基于上下文增强和特征提纯的小目标检测网络. 计算机研究与发展, 60(2): 465-474 [DOI: 10.7544/issn1000-1239.20210956http://dx.doi.org/10.7544/issn1000-1239.20210956]
Xu C J, Wang X F and Yang Y D. 2019. Attention-YOLO: YOLO detection algorithm that introduces attention mechanism. Computer Engineering and Applications, 55(6): 13-23
徐诚极, 王晓峰, 杨亚东. 2019. Attention-YOLO: 引入注意力机制的YOLO检测算法. 计算机工程与应用, 55(6): 13-23 [DOI: 10.3778/j.issn.1002-8331.1812-0010http://dx.doi.org/10.3778/j.issn.1002-8331.1812-0010]
Yaeger L, Lyon R and Webb B. 1996. Effective training of a neural network character classifier for word recognition//Proceedings of the 9th International Conference on Neural Information Processing Systems. Denver, Colorado: MIT Press
Yang S, Luo P, Loy C C and Tang X O. 2016. WIDER FACE: a face detection benchmark//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 5525-5533 [DOI: 10.1109/CVPR.2016.596http://dx.doi.org/10.1109/CVPR.2016.596]
Yang X, Zhou Y, Zhang G F, Yang J R, Wang W T, Yan J S, Zhang X P and Tian Q. 2022. The KFIoU loss for rotated object detection international conference on learning representations. Computer Vision and Pattern Recognition [EB/OL]. [2022-10-06]. http://arxiv.org/pdf/2201.12558.pdfhttp://arxiv.org/pdf/2201.12558.pdf
Yu B S and Tao D C. 2019. Anchor cascade for efficient face detection. IEEE Transactions on Image Processing, 28(5): 2490-2501 [DOI: 10.1109/TIP.2018.2886790http://dx.doi.org/10.1109/TIP.2018.2886790]
Yu J R, Huang D Q, Zeng R and Zhao H H. 2022, Research on vehicle detection algorithm based on improved YOLOv4,Laser Journal, 43(4):52-59
於积荣,黄德启,曾蓉,赵恒辉. 2022. 基于改进YOLOv4的车型检测算法研究.激光杂志, 43(4): 52-59[DOI:10.14016/j.cnki.jgzz.2022.04.052http://dx.doi.org/10.14016/j.cnki.jgzz.2022.04.052.]
Yu X, Chen P and Wu D. 2022. Object localization under single coarse point supervision//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4868-4877 [DOI: https://doi.org/10.48550/arXiv.2203.09338http://dx.doi.org/https://doi.org/10.48550/arXiv.2203.09338]
Yu X H, Gong Y Q, Jiang N, Ye Q X and Han Z J. 2020. Scale match for tiny person detection//Proceedings of 2020 IEEE Winter Conference on Applications of Computer Vision. Snowmass, USA: IEEE: 1246-1254 [DOI: 10.1109/WACV45572.2020.9093394http://dx.doi.org/10.1109/WACV45572.2020.9093394]
Zeng X Y, Ouyang W L, Yang B, Yan J J and Wang X G. 2016. Gated bi-directional CNN for object detection//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer: 354-369 [DOI: 10.1007/978-3-319-46478-7_22http://dx.doi.org/10.1007/978-3-319-46478-7_22]
Zhang H, Li F, Liu S L, Zhang L, Su H, Zhu J, Ni L M and Shum H Y. 2022. DINO: DETR with improved denoising anchor boxes for end-to-end object detection. Computer Vision and Pattern Recognition [EB/OL]. [2022-07-11]. http://arxiv.org/pdf/2203.03605.pdfhttp://arxiv.org/pdf/2203.03605.pdf
Zhang K P, Zhang Z P, Li Z F and Qiao Y. 2016. Joint face detection and alignment using Multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10): 1499-1503 [DOI: 10.1109/LSP.2016.2603342http://dx.doi.org/10.1109/LSP.2016.2603342]
Zhang L, Yang X, Liu Z Y, Qi L, Zhou H and Chiu C. 2018. Single shot feature aggregation network for underwater object detection//Proceedings of the 24th International Conference on Pattern Recognition (ICPR). Beijing, China: IEEE: 1906-1911 [DOI: 10.1109/ICPR.2018.8545677http://dx.doi.org/10.1109/ICPR.2018.8545677]
Zhang S F, Chi C, Yao Y Q, Lei Z and Li S Z. 2020. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 9756-9765 [DOI: 10.1109/CVPR42600.2020.00978http://dx.doi.org/10.1109/CVPR42600.2020.00978]
Zhang S F, Zhu X Y, Lei Z, Shi H L, Wang X B and Li S Z. 2017. S3FD: single shot scale-invariant face detector//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy, USA: IEEE: 192-201 [DOI: 10.1109/ICCV.2017.30http://dx.doi.org/10.1109/ICCV.2017.30]
Zhang T N, Chen E Q and Xiao W F. 2021. Fast target detection method for improving MobileNet_YOLOv3 network. Journal of Chinese Computer Systems, 42(5): 1008-1014
张陶宁, 陈恩庆, 肖文福. 2021. 一种改进MobileNet_YOLOv3网络的快速目标检测方法. 小型微型计算机系统, 42(5): 1008-1014 [DOI: 10.3969/j.issn.1000-1220.2021.05.018http://dx.doi.org/10.3969/j.issn.1000-1220.2021.05.018]
Zhao L, Zhi L and Zhao C.2022. Fire-YOLO: a small target object detection method for fire inspection. Sustainability, 14(9): #4930.
Zhao P F, Xie L B and Peng L. 2022. Deep small object detection algorithm integrating attention mechanism. Journal of Frontiers of Computer Science and Technology, 16(4): 927-937
赵鹏飞, 谢林柏, 彭力. 2022. 融合注意力机制的深层次小目标检测算法. 计算机科学与探索, 16(4): 927-937 [DOI: 10.3778/j.issn.1673-9418.2108087http://dx.doi.org/10.3778/j.issn.1673-9418.2108087]
Zhao Y M, Wang J C, Ren H E and Zhao L. 2022. A small object detection algorithm integrated with ReFPN and compound attention mechanism. Journal of Harbin University of Science and Technology, 27(2): 85-91
赵一鸣, 王金聪, 任洪娥, 赵龙. 2022. 融合ReFPN结构与混合注意力的小目标检测算法. 哈尔滨理工大学学报, 27(2): 85-91 [DOI: 10.15938/j.jhust.2022.02.011http://dx.doi.org/10.15938/j.jhust.2022.02.011]
Zheng P, Bai H Y, Li W and Guo H W. 2020. Small target detection algorithm in complex background. Journal of Zhejiang University (Engineering Science), 54(9): 1777-1784
郑浦, 白宏阳, 李伟, 郭宏伟. 2020. 复杂背景下的小目标检测算法. 浙江大学学报(工学版), 54(9): 1777-1784 [DOI: 10.3785/j.issn.1008-973X.2020.09.014http://dx.doi.org/10.3785/j.issn.1008-973X.2020.09.014]
Zhou L, Wei H R, Li H, Zhao W Z, Zhang Y and Zhang Y. 2020. Arbitrary-oriented object detection in remote sensing images based on polar coordinates. IEEE Access, 8: 223373-223384 [DOI: 10.1109/ACCESS.2020.3041025http://dx.doi.org/10.1109/ACCESS.2020.3041025]
Zhu C C, He Y H and Savvides M. 2019. Feature selective anchor-free module for single-shot object detection//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 840-849 [DOI: 10.1109/CVPR.2019.00093http://dx.doi.org/10.1109/CVPR.2019.00093]
Zhu C C, Tao R, Luu K and Savvides M. 2018. Seeing small faces from robust anchor’s perspective//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 5127-5136 [DOI: 10.1109/CVPR.2018.00538http://dx.doi.org/10.1109/CVPR.2018.00538]
Zhu H G, Chen X G, Dai W Q, Fu K, Ye Q X and Jiao J B. 2015. Orientation robust object detection in aerial images using deep convolutional neural network//Proceedings of 2015 IEEE International Conference on Image Processing (ICIP). Quebec City, Canada: IEEE: 3735-3739 [DOI: 10.1109/ICIP.2015.7351502http://dx.doi.org/10.1109/ICIP.2015.7351502]
Zhu X K, Lyu S, Wang X and Zhao Q. 2021. TPH-YOLOv5: improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. Montreal, Canada: IEEE: 2778-2788 [DOI: 10.1109/ICCVW54120.2021.00312http://dx.doi.org/10.1109/ICCVW54120.2021.00312]
Zhu Y J, Cai H X, Zhang S H, Wang C H and Xiong Y C. 2020. TinaFace: strong but simple baseline for face detection. Computer Vision and Pattern Recognition [EB/OL]. [2020-12-02]. http://arxiv.org/pdf/2203.03605.pdfhttp://arxiv.org/pdf/2203.03605.pdf
Zhu Y S, Zhao C Y, Wang J Q, Zhao X, Wu Y and Lu H Q. 2017. CoupleNet: coupling global structure with local parts for object detection//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 4146-4154 [DOI: 10.1109/ICCV.2017.444http://dx.doi.org/10.1109/ICCV.2017.444]
Zhu Z, Liang D, Zhang S H, Huang X L, Li B L and Hu S M. 2016. Traffic-sign detection and classification in the wild//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 2110-2118 [DOI: 10.1109/CVPR.2016.232http://dx.doi.org/10.1109/CVPR.2016.232]
Zoph B, Cubuk E D, Ghiasi G, Lin T Y, Shlens J and Le Q V. 2020. Learning data augmentation strategies for object detection//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 566-583 [DOI: 10.1007/978-3-030-58583-9_34http://dx.doi.org/10.1007/978-3-030-58583-9_34]
Zou H H and Hou J. 2022. Research on road small target detection with improved SSD algorithm. Computer Engineering, 48(5): 281-288
邹慧海, 侯进. 2022. 改进SSD算法的道路小目标检测研究. 计算机工程, 48(5): 281-288 [DOI: 10.19678/j.issn.1000-3428.0061499http://dx.doi.org/10.19678/j.issn.1000-3428.0061499]