结合适配器增强的双阶段连续缺陷判别
Adapter-enhanced two-stage continual defect detection
2025年,页码:1-15
收稿日期:2024-10-31
修回日期:2025-02-17
录用日期:2025-02-25
网络出版日期:2025-02-26
DOI: 10.11834/jig.240663
目的
传统异常检测方法在工业产品缺陷判别中仅关注当前任务,从而导致在接受新任务训练时会灾难性地遗忘以前学过的知识。鉴于现实工业场景中对异常检测模型的灵活性和持续适应性的需求,结合连续学习方法提出一种适配器增强的双阶段连续缺陷判别方法(adapter-enhanced two-stage continual defect detection,AETS)以实现连续异常检测任务。
方法
首先,在AdaptFormer基础上引入外部注意力机制,增强模型对顺序任务中全局依赖关系的捕捉能力,以提升对新任务的泛化性能。其次,在视觉转换器(vision Transformer,ViT)预训练模型的基础上结合高效微调技术,采用双阶段训练策略:在适应阶段,通过全量微调缓解自然图像与工业图像之间的域差异;在高效微调阶段,通过适配器增强模块提升模型对新任务的适应性,同时冻结大部分参数以保留对旧任务的记忆,从而缓解灾难性遗忘问题。此外,还提出遗忘波动率(forgetting fluctuation rate,FFR)这一新的连续学习评价指标,用于量化模型在整个学习过程中的遗忘波动情况,以检验模型在工业场景中的适用性和稳定性。
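下面给出适配器增强模块的一个简化示意实现(PyTorch),仅用于说明"瓶颈适配器、外部注意力与缩放因子"三者的一种组合方式;其中的特征维度、记忆单元大小以及外部注意力与适配器分支的连接位置均为示例假设,并非论文的官方实现。

```python
import torch.nn as nn
import torch.nn.functional as F

class ExternalAttention(nn.Module):
    """External attention: two linear memory units shared across all samples."""
    def __init__(self, dim, mem_size=64):
        super().__init__()
        self.mk = nn.Linear(dim, mem_size, bias=False)   # query -> attention map over external memory
        self.mv = nn.Linear(mem_size, dim, bias=False)   # attention map -> output features

    def forward(self, x):                                 # x: (B, N, dim) token features
        attn = F.softmax(self.mk(x), dim=1)               # normalize over the token dimension
        attn = attn / (attn.sum(dim=2, keepdim=True) + 1e-9)  # double normalization
        return self.mv(attn)

class AdapterEnhancedModule(nn.Module):
    """示意性适配器增强模块:瓶颈适配器 + 外部注意力(连接方式为假设)。"""
    def __init__(self, dim=768, bottleneck=64, scale=0.1):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)             # down-projection
        self.up = nn.Linear(bottleneck, dim)               # up-projection
        self.ea = ExternalAttention(dim)                    # global-dependency branch
        self.scale = scale                                  # 消融实验中研究的缩放因子

    def forward(self, x):
        h = self.up(F.relu(self.down(x)))                   # bottleneck adapter branch
        return self.scale * (h + self.ea(x))                # fused output, added to the frozen ViT block

# 使用示意:在冻结的 ViT 块中与 MLP 分支并联(AdaptFormer 风格)
# y = frozen_mlp(layer_norm(x)) + adapter(layer_norm(x)) + x
```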
结果
在MVTec-MCIL、MVTec-SCIL和MVTec+MTD数据集上进行实验,AETS的ACC值分别达到84.21%、89.16%和78.49%。相较于5种连续学习方法,AETS具有最佳的ACC、FM值和最小的训练参数量;相较于6种先进高效微调方法,其FFR值达到最佳。消融实验用于选取缩放因子并确定适配器增强模块的结构,以实现模型可塑性与稳定性的最佳平衡。
结论
所提出的AETS方法通过构建适配器增强模块,充分利用预训练模型的特征表达能力;其双阶段训练策略能够捕捉与任务相关的特征,显著增强模型在连续工业缺陷判别任务中的适应性和泛化性。
Objective
Because defect samples are difficult to obtain in industrial scenarios, many existing anomaly detection algorithms rely solely on normal samples during training. These methods achieve strong performance in industrial defect detection by training a single model tailored to a specific product type. However, traditional anomaly detection methods focus only on the current task, so training on new tasks often leads to catastrophic forgetting of knowledge learned from previous tasks. As product objects and defect types continuously change and diversify in real-world industrial environments, anomaly detection models need greater flexibility and continual adaptation capability to cope with new tasks and data. Therefore, this paper proposes an adapter-enhanced two-stage continual defect detection (AETS) method tailored for continual anomaly detection tasks.
Method
First, building upon the AdaptFormer framework, this paper introduces an external attention mechanism to strengthen the model's ability to capture global dependencies across sequential tasks, thereby improving generalization to new tasks. This enhancement is crucial in industrial anomaly detection, where the model must adapt to diverse and complex defect types across different products. By improving the representation of global contextual information, the external attention mechanism helps the model attend to relevant features and increases its flexibility and robustness across domains and tasks. Second, combining parameter-efficient fine-tuning techniques with a vision Transformer (ViT) pre-trained model, the AETS framework adopts a two-stage training strategy so that the model retains previously learned knowledge while adapting effectively to new tasks. The first stage focuses on adaptation: full-parameter fine-tuning is employed to mitigate the domain shift between natural and industrial images, which is critical because the difference in feature distributions between the two domains can severely degrade performance when pre-trained models are applied directly to industrial datasets. The second stage is efficient fine-tuning: the adapter-enhanced module selectively fine-tunes a small number of parameters, reducing computational overhead while retaining essential knowledge from previously learned tasks. By freezing most of the parameters, the model preserves knowledge from prior tasks and mitigates catastrophic forgetting, a common challenge in continual learning. The adapter-enhanced module is designed to handle new tasks without overwriting previously learned information, allowing the model to continuously adapt to evolving data distributions in industrial environments. The forgetting measure (FM) only evaluates, after the final task has been trained, how far each task has dropped from its best past performance. To provide a more comprehensive assessment of catastrophic forgetting, this paper introduces a new evaluation metric for continual learning, the forgetting fluctuation rate (FFR). FFR quantifies the extent of forgetting throughout the learning process and offers a finer-grained assessment of model stability across sequential tasks. By measuring how forgetting fluctuates over time, FFR evaluates how well the model retains knowledge while learning new tasks, yielding a more robust evaluation of continual learning performance in industrial defect detection. A low FFR value indicates that forgetting is relatively smooth with little performance fluctuation, whereas a higher value indicates greater fluctuation in forgetting.
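Since this abstract does not spell out the FFR formula, the sketch below is only one plausible way to quantify how forgetting fluctuates over a task sequence from the usual continual-learning accuracy matrix; the matrix convention and the use of the standard deviation of per-step forgetting are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def forgetting_fluctuation_rate(acc):
    """Illustrative FFR sketch (assumed formalization, not the paper's exact definition).

    acc[t, j] is the accuracy on task j measured after training on task t (j <= t).
    For each step t, the drop from the best accuracy achieved so far is averaged over
    all previously seen tasks; FFR is taken here as the standard deviation of that
    per-step forgetting, so a low value means forgetting evolves smoothly.
    """
    acc = np.asarray(acc, dtype=float)
    num_tasks = acc.shape[0]
    step_forgetting = []
    for t in range(1, num_tasks):
        drops = [acc[:t, j].max() - acc[t, j] for j in range(t)]  # forgetting of each old task at step t
        step_forgetting.append(float(np.mean(drops)))
    if not step_forgetting:          # single-task case: no forgetting to measure
        return 0.0
    return float(np.std(step_forgetting))

# Example with hypothetical numbers: 3 tasks, accuracy matrix after each training step
acc = [[0.95, 0.00, 0.00],
       [0.90, 0.93, 0.00],
       [0.88, 0.91, 0.94]]
print(forgetting_fluctuation_rate(acc))
```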
Results
We conduct extensive experiments on three benchmark settings, MVTec-MCIL, MVTec-SCIL, and MVTec+MTD, constructed from the MVTec AD and magnetic tile defect (MTD) datasets to evaluate the model under different industrial defect detection scenarios. The AETS method achieves average accuracy (ACC) scores of 84.21%, 89.16%, and 78.49% on these three settings, respectively. These results indicate that AETS outperforms other continual learning models, especially in terms of efficiency and resource utilization. Compared with five continual learning methods, AETS obtains the best ACC and FM values with the smallest number of trainable parameters; compared with six state-of-the-art parameter-efficient fine-tuning methods, it achieves the best FFR values. This demonstrates that our method not only improves accuracy but also provides a lightweight solution that is practical for real-world industrial applications. We further perform ablation studies to validate the proposed method, focusing on selecting the optimal scaling factor and determining the best structure of the adapter-enhanced module. These experiments are crucial for achieving a balanced trade-off between the model's plasticity, which enables adaptation to new tasks, and its stability, which preserves previously learned knowledge.
Conclusion
This paper presents a novel continual anomaly detection method, AETS, which leverages the strong feature representation capability of pre-trained models together with an adapter-enhanced module to improve continual learning performance in industrial defect detection tasks. By adopting a two-stage training process, AETS effectively addresses the domain shift between natural and industrial images and efficiently captures task-relevant features through parameter-efficient fine-tuning, which not only mitigates catastrophic forgetting but also improves training efficiency. In addition, the proposed FFR metric provides a finer-grained view of the model's robustness and adaptability in dynamic industrial environments. The experimental results demonstrate that AETS achieves competitive performance across multiple benchmark datasets: it maintains high detection accuracy while significantly enhancing the model's stability and plasticity under continual learning. By mitigating catastrophic forgetting, AETS enables the model to adapt to new tasks without compromising previously learned knowledge. These advantages make AETS particularly well suited to real-world industrial applications, where both accuracy and the ability to continuously learn from evolving data are essential for practical deployment and long-term performance.
Bahng H, Jahanian A, Sankaranarayanan S and Isola P. 2022. Visual prompting: Modifying pixel space to adapt pre-trained models [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/2203.17274.pdf
Bai Y F, Wang L B, Gao W D and Ma Y L. 2024. Multi-modal hierarchical classification for power equipment defect detection. Journal of Image and Graphics, 29(7): 2011-2023
白艳峰, 王立彪, 高卫东, 马应龙. 2024. 面向电力设备缺陷检测的多模态层次化分类. 中国图象图形学报, 29(7): 2011-2023 [DOI: 10.11834/jig.230269]
Bergmann P, Fauser M, Sattlegger D and Steger C. 2019. MVTec AD: A comprehensive real-world dataset for unsupervised anomaly detection // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 9592-9600 [DOI: 10.1109/CVPR.2019.00982]
Bugarin N, Bugaric J, Barusco M, Pezze D D and Susto G A. 2024. Unveiling the anomalies in an ever-changing world: A benchmark for pixel-level anomaly detection in continual learning // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, USA: IEEE: 4065-4074 [DOI: 10.1109/CVPRW63382.2024.00410]
Buzzega P, Boschini M, Porrello A, Abati D and Calderara S. 2020. Dark experience for general continual learning: A strong, simple baseline. Advances in Neural Information Processing Systems, 33: 15920-15930 [DOI: 10.48550/arXiv.2004.07211]
Chang C Y, Su Y D and Li W Y. 2022. Tire bubble defect detection using incremental learning. Applied Sciences, 12(23): 12186 [DOI: 10.3390/app122312186]
Chen S, Ge C, Tong Z, Wang J L, Song Y B, Wang J and Luo P. 2022. AdaptFormer: Adapting vision Transformers for scalable visual recognition. Advances in Neural Information Processing Systems, 35: 16664-16678 [DOI: 10.48550/arXiv.2205.13535]
Defard T, Setkov A, Loesch A and Audigier R. 2021. PaDiM: A patch distribution modeling framework for anomaly detection and localization // International Conference on Pattern Recognition. Virtual Event: Springer: 475-489 [DOI: 10.1007/978-3-030-68799-1_35]
Dekhovich A and Bessa M A. 2023. Continual learning for surface defect segmentation by subnetwork creation and selection [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/2312.05100.pdf
Guo M H, Liu Z N, Mu T J and Hu S M. 2023. Beyond self-attention: External attention using two linear layers for visual tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(5): 5436-5447 [DOI: 10.1109/TPAMI.2022.3211006]
Han X, Zhang Z, Ding N, Gu Y X, Liu X, Huo Y Q, Qiu J Z, Yao Y, Zhang A, Zhang L, Han W T, Huang M L, Jin Q, Lan Y Y, Liu Y, Lu Z W, Qiu X P, Song R H, Tang J, Wen J R, Yuan J H, Zhao W X and Zhu J. 2021. Pre-trained models: Past, present and future. AI Open, 2: 225-250 [DOI: 10.1016/j.aiopen.2021.08.002]
Hu E J, Shen Y L, Wallis P, Zhu Z A, Li Y Z, Wang S A and Chen W Z. 2021. LoRA: Low-rank adaptation of large language models [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/2106.09685.pdf
Huang Y, Qiu C and Yuan K. 2020. Surface defect saliency of magnetic tile. The Visual Computer, 36(1): 85-96 [DOI: 10.1007/s00371-018-1588-5]
Jia M, Tang L, Chen B C, Cardie C, Belongie S J, Hariharan B and Lim S N. 2022. Visual prompt tuning // European Conference on Computer Vision. Tel Aviv, Israel: Springer: 709-727 [DOI: 10.48550/arXiv.2203.12119]
Jiang J and Deng W H. 2020. Facial expression recognition improved by continual learning. Journal of Image and Graphics, 25(11): 2361-2369
江静, 邓伟洪. 2020. 持续学习改进的人脸表情识别. 中国图象图形学报, 25(11): 2361-2369 [DOI: 10.11834/jig.200315]
Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu A, Milan K, Quan J, Ramalho T, Grabska-Barwinska A, Hassabis D, Clopath C, Kumaran D and Hadsell R. 2017. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13): 3521-3526 [DOI: 10.1073/pnas.1611835114]
Ledoit O and Wolf M. 2004. A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2): 365-411 [DOI: 10.1016/S0047-259X(03)00096-4]
Lee K, Lee K, Lee H and Shin J. 2018. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. Advances in Neural Information Processing Systems, 31: 7167-7177 [DOI: 10.48550/arXiv.1807.03888]
Li C L, Sohn K, Yoon J and Pfister T. 2021. CutPaste: Self-supervised learning for anomaly detection and localization // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 9664-9674 [DOI: 10.1109/CVPR46437.2021.00954]
Li W B, Xiong Y K, Fan Z C, Deng B, Cao F Y and Gao Y. 2024. Advances and trends of continual learning. Journal of Computer Research and Development, 61(6): 1476-1496
李文斌, 熊亚锟, 范祉辰, 邓波, 曹付元, 高阳. 2024. 持续学习的研究进展与趋势. 计算机研究与发展, 61(6): 1476-1496 [DOI: 10.7544/issn1000-1239.202220820]
Li W J, Zhan J W, Wang J B, Xia B Z, Gao B B, Liu J, Wang C J and Zheng F. 2022. Towards continual adaptation in industrial anomaly detection // Proceedings of the 30th ACM International Conference on Multimedia. Lisboa, Portugal: ACM: 2871-2880 [DOI: 10.1145/3503161.3548232]
Lian D Z, Zhou D Q, Feng J S and Wang X C. 2022. Scaling & shifting your features: A new baseline for efficient model tuning. Advances in Neural Information Processing Systems, 35: 109-123 [DOI: 10.48550/arXiv.2210.08823]
Mehta S V, Patil D, Chandar S and Strubell E. 2023. An empirical investigation of the role of pre-training in lifelong learning. Journal of Machine Learning Research, 24(214): 1-50 [DOI: 10.48550/arXiv.2112.09153]
Ni Y M and Chen S C. 2022. Continual unsupervised anomaly detection. Scientia Sinica Informationis, 52(1): 75-85
倪一鸣, 陈松灿. 2022. 连续无监督异常检测. 中国科学:信息科学, 52(1): 75-85 [DOI: 10.1360/SSI-2021-0192]
Rebuffi S A, Kolesnikov A, Sperl G and Lampert C H. 2017. iCaRL: Incremental classifier and representation learning // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2001-2010 [DOI: 10.1109/CVPR.2017.587]
Reiss T, Cohen N, Bergman L and Hoshen Y. 2021. PANDA: Adapting pretrained features for anomaly detection and segmentation // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 2806-2814 [DOI: 10.1109/CVPR46437.2021.00283]
Riemer M, Cases I, Ajemian R, Liu M, Rish I, Tu Y and Tesauro G. 2018. Learning to learn without forgetting by maximizing transfer and minimizing interference [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/1810.11910.pdf
Rippel O, Mertens P and Merhof D. 2021. Modeling the distribution of normal data in pre-trained deep features for anomaly detection // 25th International Conference on Pattern Recognition. Milan, Italy: IEEE: 6726-6733 [DOI: 10.1109/ICPR48806.2021.9412109]
Tan Y, Zhou Q and Xiang X. 2024. Semantically-shifted incremental adapter-tuning is a continual ViTransformer // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 23252-23262 [DOI: 10.1109/CVPR52733.2024.02194]
Wang S Q, Cheng C, Shi M and Zhu D M. 2024a. Defect detection method for industrial product surfaces with similar features by combining frequency and ViT. Journal of Image and Graphics, 29(10): 3074-3089
王素琴, 程成, 石敏, 朱登明. 2024a. 结合频率和ViT的工业产品表面相似特征缺陷检测方法. 中国图象图形学报, 29(10): 3074-3089 [DOI: 10.11834/jig.230532]
Wang W P, Qin Y C and Shi W X. 2024b. Review of unsupervised deep learning methods for industrial defect detection. Journal of Computer Applications: 1-16
王文鹏, 秦寅畅, 师文轩. 2024b. 工业缺陷检测无监督深度学习方法综述. 计算机应用: 1-16 [DOI: 10.11772/j.issn.1001-9081.2024050736]
Yildiz O, Chan H, Raghavan K, Judge W, Cherukara M J, Balaprakash P, Sankaranarayanan S and Peterka T. 2022. Automated continual learning of defect identification in coherent diffraction imaging // 2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications. Dallas, USA: IEEE: 1-6 [DOI: 10.1109/AI4S56813.2022.00006]
Yin D S, Hu L Y, Li B and Zhang Y Q. 2023. Adapter is all you need for tuning visual tasks [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/2311.15010.pdf
Zhang G W, Wang L Y, Kang G L, Chen L and Wei Y C. 2023. SLCA: Slow learner with classifier alignment for continual learning on a pre-trained model // Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 19091-19101 [DOI: 10.1109/ICCV51070.2023.01754]
Zhou D W, Cai Z W, Ye H J, Zhan D C and Liu Z W. 2024. Revisiting class-incremental learning with pre-trained models: Generalizability and adaptivity are all you need. International Journal of Computer Vision [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/2303.07338.pdf
Zhou D W, Wang F Y, Ye H J and Zhan D C. 2023a. Deep learning for class-incremental learning: A survey. Chinese Journal of Computers, 46(8): 1577-1605
周大蔚, 汪福运, 叶翰嘉, 詹德川. 2023a. 基于深度学习的类别增量学习算法综述. 计算机学报, 46(8): 1577-1605 [DOI: 10.11897/SP.J.1016.01.2022.00001]
Zhou D W, Wang Q W and Ye H J. 2023b. A model or 603 exemplars: Towards memory-efficient class-incremental learning [EB/OL]. [2025-01-15]. https://arxiv.org/pdf/2205.13218.pdf