Two-stage filter pruning incorporating cosine spatial correlation
2024, Vol. 29, No. 12, Pages 3628-3643
Print publication date: 2024-12-16
DOI: 10.11834/jig.230592
Liao Wei, Li Guanghui, Dai Chenglong, Zhang Feifei. 2024. Two-stage filter pruning incorporating cosine spatial correlation. Journal of Image and Graphics, 29(12): 3628-3643
Objective
Deep neural networks have achieved remarkable results in graphics, image processing, computer vision, and many other application areas, but their huge computational and storage requirements have long prevented deep learning models from being deployed on resource-constrained embedded devices. To resolve the contradiction between the computing resources a model requires and the limited resources of embedded devices, this paper proposes a two-stage filter pruning method that incorporates cosine spatial correlation, aiming to exploit the spatial correlation between filters to achieve better pruning.
Method
In the pre-pruning stage, the L2 norm is introduced to identify the filter with the highest norm value, which this paper calls the key filter; in the pruning stage, the cosine distance is introduced to retain the filters that have high spatial correlation with the key filter.
Result
The proposed pruning method outperforms the compared methods on the CIFAR (Canadian Institute for Advanced Research) datasets. On CIFAR10, it compresses the parameters and floating-point operations of VGG (Visual Geometry Group) 16 by 72.9% and 73.5%, respectively, while improving model accuracy by 0.1%. The efficient residual network ResNet (residual neural network) 56 and the depthwise separable network MobileNet V1 can also be compressed effectively: on CIFAR100, the method achieves a smaller accuracy loss for ResNet56 at a higher compression rate (a 0.48% accuracy gain), and for MobileNet V1 it compresses 46.89% of the parameters and 46.23% of the floating-point operations while improving accuracy by 0.11%.
Conclusion
The two-stage filter pruning strategy incorporating cosine spatial correlation avoids the suboptimal results caused by the failure of two common pruning assumptions, "a small measurement index means the measured object is unimportant" and "similarity means redundancy". By mining correlation from the perspective of filter space, it compresses more parameters and floating-point operations while preserving model accuracy.
Objective
Convolutional neural networks have made breakthroughs in computer vision, speech recognition, and other fields. However, with the continuous pursuit of high-performance neural network models, their structures have become increasingly complex, mainly in the width and number of layers. Accordingly, the size and computing resource requirements of these models keep expanding, and such huge resource consumption confines them to server platforms with abundant computing power and other resources. As deep learning networks gradually move to application end devices, many network models cannot be deployed on resource-constrained embedded devices, such as smartphones, low-end mainboards, and edge devices. To address the contradiction between the computing resource requirements of network models and resource-constrained embedded devices, existing complex models should be compressed. Building on extant model pruning methods, this article proposes a two-stage filter pruning method that incorporates cosine spatial correlation (CSCTFP), which improves pruning performance by utilizing the spatial correlation between filters. CSCTFP also relies on such spatial correlation to identify the filter bank that contributes the most to the network, thus avoiding the suboptimal pruning results caused by the assumption that "if the measurement index is small, the measurement object is not important".
Method
The existing model pruning methods are mainly divided into two types. The first type, called unstructured pruning, uses the weight parameters of the filter as the minimum pruning unit. However, this method leads to unstructured sparsity of the filter: the pruned network cannot be accelerated with existing software and hardware, and a dedicated accelerator must be designed to speed up the computation of unstructured sparse matrices. The second type, called structured pruning, takes the whole filter as the smallest pruning unit. This method makes the network structure sparse in a structured way, thus facilitating acceleration with existing software and hardware. Existing filter pruning methods mainly rely on the assumption that "if the measurement index is small, the measurement object is not important" as the evaluation criterion for filters, such as using the kernel norm of a filter as its importance index. Alternatively, the "similarity is redundancy" assumption can be used as a criterion for evaluating filter redundancy, such as using the distance between filters as a measure of redundancy. Both assumptions rest on prerequisite conditions that do not always hold in actual scenarios. CSCTFP addresses these shortcomings as follows. First, in the pre-pruning stage, instead of deleting small-norm filters, CSCTFP identifies the filter with the maximum norm value, referred to as the key filter in this article. Second, in the pruning stage, a set of filters that are highly correlated with the key filter is preserved by computing the cosine distance. Measuring the correlation between filters in these two stages avoids poor pruning results when the above two assumptions do not hold.
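The two-stage selection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a convolutional layer's weights are given as a NumPy array of shape (out_channels, in_channels, kH, kW), and the names `select_filters` and `keep_ratio` are illustrative.

```python
import numpy as np

def select_filters(weights: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Return sorted indices of the filters to keep in one conv layer."""
    # Flatten each filter into a vector: one row per output channel.
    flat = weights.reshape(weights.shape[0], -1)
    # Stage 1 (pre-pruning): L2 norm per filter; the largest is the key filter.
    norms = np.linalg.norm(flat, axis=1)
    key = int(np.argmax(norms))
    # Stage 2 (pruning): cosine similarity of every filter to the key filter;
    # a small epsilon guards against division by zero for all-zero filters.
    sims = flat @ flat[key] / (norms * norms[key] + 1e-12)
    n_keep = max(1, int(round(keep_ratio * flat.shape[0])))
    # Keep the filters most correlated (in cosine space) with the key filter.
    return np.sort(np.argsort(-sims)[:n_keep])
```

In this sketch the key filter itself always has the highest cosine similarity to itself and is therefore retained; filters pointing in opposing directions (similarity near -1) are pruned first.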
Result
Experiments were conducted using various network structures, such as visual geometry group (VGG) 16, residual neural network (ResNet) 56, and MobileNet V1, to verify that the proposed method can be adapted to different types of network models with sequential, residual, and depthwise separable structures. The experimental results on the CIFAR10 and CIFAR100 datasets were compared with those of previous methods. On the CIFAR10 dataset, the parameter count and floating-point operations (FLOPs) of VGG16 were compressed by 72.9% and 73.5%, respectively, while the model accuracy was improved by 0.1%. Compared with the HRank pruning method, CSCTFP can compress more FLOPs with less accuracy loss (the accuracy of the HRank method decreased by 0.62%). For the efficient residual network ResNet56, CSCTFP can compress 53.81% of FLOPs with an accuracy increase of 0.33%, and its accuracy loss is much lower than those obtained by SFP, FPGM, and NSPPR. The efficient depthwise separable network MobileNet V1 can also be effectively compressed, with CSCTFP compressing 46.23% of FLOPs and 46.89% of the parameters while improving accuracy by 0.11%. CSCTFP demonstrates a better compression effect than DCP, which reduces accuracy by 0.3% and compresses only 42.86% of FLOPs and 30.07% of the parameters. CSCTFP also achieves good compression performance on more complex datasets, such as CIFAR100. For VGG16, CSCTFP can compress more FLOPs (33.35%) with a much lower accuracy loss compared with Variational and DeepPruningEs. For ResNet56, CSCTFP can compress 43.02% of the parameters and 40.36% of FLOPs while achieving an accuracy improvement of 0.48%, whereas the comparison methods OICSR and NSPPR compress fewer FLOPs and suffer higher accuracy loss. In addition, CSCTFP is applicable not only to image classification tasks but also to object detection tasks.
After pruning, the lightweight face detection model RetinaFace performs well on the easy and medium validation sets of the WiderFace dataset. CSCTFP is also compared against criteria based on the assumptions "if the measurement index is small, the measurement object is not important" and "similarity is redundancy" and continues to show accuracy improvements at different pruning ratios.
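The compression percentages quoted throughout the Result section are reductions relative to the unpruned baseline. As a small illustration of that arithmetic (the helper name and the baseline figures are hypothetical, except where a quoted percentage is reproduced):

```python
def reduction(baseline: float, pruned: float) -> float:
    """Percentage of the baseline count (parameters or FLOPs) removed by pruning."""
    return 100.0 * (baseline - pruned) / baseline

# e.g. "72.9% of parameters compressed" means the pruned model retains
# only 27.1% of the original parameter count.
```

So a reported 73.5% FLOPs reduction implies the pruned model costs roughly a quarter of the baseline's floating-point operations at inference time.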
Conclusion
CSCTFP takes into account the uncertainty of the assumptions "if the measurement index is small, the object being measured is not important" and "similarity is redundancy", thus avoiding the suboptimal results that arise in the pruned model when these two assumptions fail. CSCTFP further improves the accuracy and compression rate of pruning by searching for key filters and exploiting the spatial correlation between filters. Extensive experiments confirm the effectiveness of CSCTFP and its advantages over other extant methods. The iterative pruning method used in this article can compress the network model finely, but further research is needed to reduce its time cost and to avoid manually setting the pruning ratio.
deep learning; neural network; model compression; cosine distance; filter pruning
Aggarwal C C, Hinneburg A and Keim D A. 2001. On the surprising behavior of distance metrics in high dimensional space//Proceedings of the 8th International Conference on Database Theory. London, UK: Springer: 402-434 [DOI: 10.1007/3-540-44503-X_27]
Alwani M, Wang Y and Madhavan V. 2022. DECORE: deep compression with reinforcement learning//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 12339-12349 [DOI: 10.1109/CVPR52688.2022.01203]
Cai L H, An Z L, Yang C G, Yan Y C and Xu Y J. 2022. Prior gradient mask guided pruning-aware fine-tuning//Proceedings of the 36th AAAI Conference on Artificial Intelligence. [s.l.]: AAAI: 140-148 [DOI: 10.1609/aaai.v36i1.19888]
Cai Z D, Ying N, Guo C S, Guo R and Yang P. 2021. Research on multiperson pose estimation combined with YOLOv3 pruning model. Journal of Image and Graphics, 26(4): 837-846 [DOI: 10.11834/jig.200138]
Deng J K, Guo J, Ververas E, Kotsia I and Zafeiriou S. 2020. RetinaFace: single-shot multi-level face localisation in the wild//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 5202-5211 [DOI: 10.1109/CVPR42600.2020.00525]
Denil M, Shakibi B, Dinh L, Ranzato M and de Freitas N D. 2013. Predicting parameters in deep learning//Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: Curran Associates Inc.: 2148-2156
Elkerdawy S, Elhoushi M, Zhang H and Ray N. 2022. Fire together wire together: a dynamic pruning approach with self-supervised mask prediction//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 12444-12453 [DOI: 10.1109/CVPR52688.2022.01213]
Fernandes Jr F E and Yen G G. 2021. Pruning deep convolutional neural networks architectures with evolution strategy. Information Sciences, 552: 29-47 [DOI: 10.1016/j.ins.2020.11.009]
Gao X T, Zhao Y R, Dudziak Ł, Mullins R and Xu C Z. 2019. Dynamic channel pruning: feature boosting and suppression//Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: ICLR: 5331-5344
Guo Y, Yuan H, Tan J C, Wang Z Y, Yang S and Liu J. 2021. GDP: stabilized neural network pruning via gates with differentiable polarization//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 5219-5230 [DOI: 10.1109/ICCV48922.2021.00519]
Guo Y W, Yao A B and Chen Y R. 2016. Dynamic network surgery for efficient DNNs//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc.: 1387-1395
Han S, Pool J, Tran J and Dally W J. 2015. Learning both weights and connections for efficient neural networks//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press: 1135-1143
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
He Y, Kang G L, Dong X Y, Fu Y W and Yang Y. 2018. Soft filter pruning for accelerating deep convolutional neural networks//Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden: IJCAI: 2234-2240 [DOI: 10.24963/IJCAI.2018/309]
He Y, Liu P, Wang Z W, Hu Z L and Yang Y. 2019. Filter pruning via geometric median for deep convolutional neural networks acceleration//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4335-4344 [DOI: 10.1109/CVPR.2019.00447]
Hou X, Qu G Y, Wei D Z and Zhang J C. 2022. A lightweight UAV object detection algorithm based on iterative sparse training. Journal of Computer Research and Development, 59(4): 882-893 [DOI: 10.7544/issn1000-1239.20200986]
Howard A G, Zhu M L, Chen B, Kalenichenko D, Wang W J, Weyand T, Andreetto M and Adam H. 2017. MobileNets: efficient convolutional neural networks for mobile vision applications [EB/OL]. [2023-09-04]. https://arxiv.org/pdf/1704.04861.pdf
Hu H Y, Peng R, Tai Y W and Tang C K. 2016. Network trimming: a data-driven neuron pruning approach towards efficient deep architectures [EB/OL]. [2023-09-04]. https://arxiv.org/pdf/1607.03250.pdf
Hu J, Huang Q P, Liu J X, Liu W, Yuan H and Zhao H. 2021. Greedy pruning of deep neural networks fused with probability distribution. Journal of Image and Graphics, 26(1): 198-207 [DOI: 10.11834/jig.200438]
Krizhevsky A and Hinton G. 2009. Learning multiple layers of features from tiny images. Technical report, University of Toronto
Li H, Kadav A, Durdanovic I, Samet H and Graf H P. 2016. Pruning filters for efficient ConvNets//Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR: 8710-8723
Li J S, Qi Q, Wang J Y, Ge C, Li Y J, Yue Z Z and Sun H F. 2019. OICSR: out-in-channel sparsity regularization for compact deep neural networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 7039-7048 [DOI: 10.1109/CVPR.2019.00721]
Li Y C, Lin S H, Liu J Z, Ye Q X, Wang M D, Chao F, Yang F, Ma J C, Tian Q and Ji R R. 2021. Towards compact CNNs via collaborative compression//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 6434-6443 [DOI: 10.1109/CVPR46437.2021.00637]
Lin M B, Ji R R, Wang Y, Zhang Y C, Zhang B C, Tian Y H and Shao L. 2020a. HRank: filter pruning using high-rank feature map//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 1526-1535 [DOI: 10.1109/CVPR42600.2020.00160]
Lin M B, Ji R R, Zhang Y X, Zhang B C, Wu Y J and Tian Y H. 2020b. Channel pruning via automatic structure search//Proceedings of the 29th International Joint Conference on Artificial Intelligence. Yokohama, Japan: IJCAI: 673-679 [DOI: 10.24963/ijcai.2020/94]
Lin S H, Ji R R, Yan C Q, Zhang B C, Cao L J, Ye Q X, Huang F Y and Doermann D. 2019. Towards optimal structured CNN pruning via generative adversarial learning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 2785-2794 [DOI: 10.1109/CVPR.2019.00290]
Liu H D, Du F, Yu Z H and Song L J. 2021. Label-free network pruning via reinforcement learning. Pattern Recognition and Artificial Intelligence, 34(3): 214-222 [DOI: 10.16451/j.cnki.issn1003-6059.202103003]
Liu L, Zheng Y and Fu D M. 2020. Occluded pedestrian detection algorithm based on improved network structure of YOLOv3. Pattern Recognition and Artificial Intelligence, 33(6): 568-574 [DOI: 10.16451/j.cnki.issn1003-6059.202006010]
Liu Z, Li J G, Shen Z Q, Huang G, Yan S M and Zhang C S. 2017. Learning efficient convolutional networks through network slimming//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2755-2763 [DOI: 10.1109/ICCV.2017.298]
Lu H W and Yuan X T. 2019. Dynamic network structured pruning via feature coefficients of layer fusion. Pattern Recognition and Artificial Intelligence, 32(11): 1051-1059 [DOI: 10.16451/j.cnki.issn1003-6059.201911010]
Ma L and Wang Y X. 2019. Fine-grained visual classification based on sparse bilinear convolutional neural network. Pattern Recognition and Artificial Intelligence, 32(4): 336-344 [DOI: 10.16451/j.cnki.issn1003-6059.201904006]
Nonnenmacher M, Pfeil T, Steinwart I and Reeb D. 2022. SOSP: efficiently capturing global correlations by second-order structured pruning//Proceedings of the 10th International Conference on Learning Representations. [s.l.]: ICLR: 3848-3872
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition//Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR: 1556-1569 [DOI: 10.48550/arXiv.1409.1556]
Tang Y H, Wang Y H, Xu Y X, Deng Y P, Xu C, Tao D C and Xu C. 2021. Manifold regularized dynamic network pruning//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 5016-5026 [DOI: 10.1109/CVPR46437.2021.00498]
Wang W X, Fu C, Guo J S, Cai D and He X F. 2019. COP: customized deep model compression via regularized correlation-based filter-level pruning//Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macao, China: IJCAI: 3785-3791 [DOI: 10.24963/IJCAI.2019/525]
Wang Y L, Zhang X L, Xie L X, Zhou J, Su H, Zhang B and Hu X L. 2020. Pruning from scratch//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI: 12273-12280 [DOI: 10.1609/aaai.v34i07.6910]
Wei Y X and Chen Y. 2022. Convolutional neural network compression based on adaptive layer entropy. Acta Electronica Sinica, 50(10): 2398-2408 [DOI: 10.12263/DZXB.20201372]
Xia P P, Zhang L and Li F Z. 2015. Learning similarity with cosine similarity ensemble. Information Sciences, 307: 39-52 [DOI: 10.1016/j.ins.2015.02.024]
Yang S, Luo P, Loy C C and Tang X O. 2016. WIDER FACE: a face detection benchmark//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 5525-5533 [DOI: 10.1109/CVPR.2016.596]
Yu R C, Li A, Chen C F, Lai J H, Morariu V I, Han X T, Gao M F, Lin C Y and Davis L S. 2018. NISP: pruning networks using neuron importance score propagation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 9194-9203 [DOI: 10.1109/CVPR.2018.00958]
Zhao C L, Ni B B, Zhang J, Zhao Q W, Zhang W J and Tian Q. 2019. Variational convolutional neural network pruning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 2775-2784 [DOI: 10.1109/CVPR.2019.00289]
Zhuang T, Zhang Z X, Huang Y H, Zeng X Y, Shuang K and Li X. 2020. Neuron-level structured pruning using polarization regularizer//Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc.: #827