Uncertainty domain awareness network for cross-domain few-shot image classification
Vol. 30, No. 2, 2025, pp. 518-532
Print publication date: 2025-02-16
DOI: 10.11834/jig.240142
Yu Yue, Chen Nan, Cheng Keyang. 2025. Uncertainty domain awareness network for cross-domain few-shot image classification. Journal of Image and Graphics, 30(2): 518-532
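Objective
The main challenge in cross-domain few-shot learning is that knowledge from the source domain is hard to generalize to an unknown target domain. Some recent few-shot learning models try to address this by inducing image diversity during meta-training. However, some of these methods for simulating tasks from unseen domains have limited effect: they cannot simulate domain shift effectively, the content they generate varies only within a narrow range, and it is therefore hard to learn effective domain-invariant knowledge from the domain shift. To improve the cross-domain generalization ability of few-shot models, an uncertainty enhancement based domain-aware network (UEDA) is proposed.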
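Method
The uncertainty enhancement based domain-aware framework explores and extracts, from the feature-channel perspective, the key knowledge that can be used to alleviate domain shift. First, an uncertainty feature augmentation method is proposed: the fixed values of the feature sufficient statistics are redefined as probabilistic representations that follow a Gaussian distribution, and the uncertainty distribution is modeled with the source-domain sufficient statistics as its center. Challenging features, distinct from those produced by additive perturbation, are then generated from the uncertainty distribution so as to mine knowledge common to different domains. Second, a domain-aware method based on uncertainty enhancement is proposed: source features and generated features are treated as information from different domains, and a domain discriminator computes the correlation between feature channels and domain information, which helps the model mine latent associations between domains and identify the domain-causal information to be used for learning.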
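The Gaussian modeling described above can be written out as follows. This is only a sketch under assumptions the abstract does not state: the sufficient statistics are taken to be the channel-wise mean $\mu(x)$ and standard deviation $\sigma(x)$ of a feature map $x$, and $\Sigma_\mu$, $\Sigma_\sigma$ are hypothetical symbols for the spread assigned to each statistic.

```latex
% Statistics are sampled from Gaussians centered on the source statistics
\beta(x) \sim \mathcal{N}\!\left(\mu(x),\, \Sigma_\mu^{2}\right), \qquad
\gamma(x) \sim \mathcal{N}\!\left(\sigma(x),\, \Sigma_\sigma^{2}\right)

% The generated feature re-styles the normalized source feature
\hat{x} = \gamma(x)\,\frac{x - \mu(x)}{\sigma(x)} + \beta(x)
```

When $\Sigma_\mu = \Sigma_\sigma = 0$, the samples collapse to the source statistics and $\hat{x} = x$, so the spread controls how far the generated features move away from the source domain.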
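Result
The cross-domain generalization performance of the proposed method is evaluated on five datasets: Mini-ImageNet, CUB (Caltech-UCSD Birds), Plantae, EuroSAT (land use and land cover classification with Sentinel-2), and Cropdiseases. The experiments follow the pure source-domain generalization setting. Under the graph neural network (GNN) classification framework, with Mini-ImageNet as the source domain, the model's average accuracies over the 1-shot and 5-shot settings on the latter four target domains are 59.50%, 47.48%, 79.04%, and 75.08%, respectively, indicating that the proposed method effectively improves source-domain-based cross-domain image classification.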
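Conclusion
The proposed uncertainty enhancement based domain-aware network framework lets the model adapt to various domain shifts during training and learn effective, generalizable knowledge from them, thereby improving cross-domain image classification under few-shot conditions.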
Objective
Inspired by the fast learning ability of humans and by transfer learning, researchers have proposed a machine learning paradigm called few-shot learning. Its purpose is to enable models to quickly learn new knowledge or concepts from a limited amount of labeled data and to apply them to unseen categories. Currently, few-shot image classification is built on the meta-learning framework, which divides model learning into a meta-training phase and a meta-testing phase. Existing solutions fall into three main categories: 1) optimization-based methods, whose basic idea is to let the model find parameters that optimize performance when trained on multiple sets of task data; 2) metric learning-based methods, whose core idea is to construct an embedding space for measuring distances in which similar samples lie as close together as possible; and 3) data manipulation-based methods, which use basic data augmentation (e.g., rotating, cropping, and adding noise) to increase the diversity and amount of training samples. However, these works tend to rely on strict assumptions, such as the smoothness, cluster, and manifold assumptions, and require that the training data and test data come from the same distribution. In certain real-world scenarios, such as medical imaging, military applications, and finance, it is difficult to guarantee that training and test data share a distribution, and issues such as restricted data access and data privacy make it challenging to use labeled data from other domains to provide prior knowledge. Here, an uncertainty enhancement-based domain-awareness network (UEDA) is proposed to alleviate the domain distribution shift encountered in the learning process of few-shot models.
Method
The uncertainty enhancement-based domain-awareness approach explores and extracts, from the feature channel perspective, key knowledge that can be used to mitigate domain shift. The first part is an uncertainty feature augmentation approach. Feature channels contain both domain-relevant and domain-independent information, which suggests that the generalized learning of the model may be correlated with the ability of the feature channels to extract domain-generalized knowledge. However, most existing feature augmentation works treat feature statistics as deterministic and use additive perturbations (i.e., swapping and interpolation) to achieve augmentation, practices that may leave models negatively affected by a large amount of domain-specific information. Instead of treating the feature sufficient statistics as fixed values, the uncertainty enhancement approach represents them probabilistically as following a Gaussian distribution, with the source feature sufficient statistics as the center of the distribution and the standard deviation defining the potential range of variation. The new features generated in this way can be effectively distinguished from the source domain features. The second part of the UEDA is a domain-awareness approach. In the domain-aware module, source and generated features are treated as information from different domains, and maximizing the inter-domain difference keeps the generated features at a reasonable yet challenging offset from the source features. A domain discriminator is also introduced to compute the correlation between each feature channel and the domain information, as a way to extract effective generalizable knowledge.
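The uncertainty feature augmentation step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: the sufficient statistics are taken to be the channel-wise mean and standard deviation of each feature map, and the spread of each Gaussian is estimated from how those statistics vary across the batch. The function name `uncertainty_augment` and its parameters are hypothetical.

```python
import numpy as np


def uncertainty_augment(x, rng, eps=1e-6):
    """Resample channel statistics of feature maps x with shape (B, C, H, W).

    Assumption: the "sufficient statistics" are the per-channel mean and
    standard deviation; their uncertainty is estimated across the batch.
    """
    # Per-sample, per-channel statistics of the source features.
    mu = x.mean(axis=(2, 3), keepdims=True)                    # (B, C, 1, 1)
    sigma = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)   # (B, C, 1, 1)

    # Spread of each statistic over the batch (assumed uncertainty scale).
    mu_spread = np.sqrt(mu.var(axis=0, keepdims=True) + eps)        # (1, C, 1, 1)
    sigma_spread = np.sqrt(sigma.var(axis=0, keepdims=True) + eps)  # (1, C, 1, 1)

    # Sample new statistics from Gaussians centered on the source statistics.
    new_mu = mu + rng.standard_normal(mu.shape) * mu_spread
    new_sigma = sigma + rng.standard_normal(sigma.shape) * sigma_spread

    # Normalize with the source statistics, then re-style with the samples.
    return new_sigma * (x - mu) / sigma + new_mu


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 4, 4))  # toy batch of source features
augmented = uncertainty_augment(features, rng)
```

In this sketch, the source and augmented features would then play the role of the two "domains" that the domain-aware module compares. A sampled standard deviation can occasionally be negative here; a real implementation would likely constrain it.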
Result
The cross-domain generalization performance of the proposed method was evaluated on five datasets, namely, Mini-ImageNet, CUB, Plantae, Cropdiseases, and EuroSAT. The experiments follow single-domain generalization, using the Mini-ImageNet dataset as the source domain and the other four datasets as the target domains. The results are compared with current mainstream methods under three classification frameworks, namely, MatchingNet, RelationNet, and the GNN. Under the 5-way 1-shot and 5-way 5-shot settings on the CUB, Plantae, EuroSAT, and Cropdiseases datasets, the reported accuracies of the proposed UEDA are 41.01%, 58.78%, 38.07%, and 51.36%, respectively; under the MatchingNet classifier, 58.37%, 80.45%, 59.48%, and 69.91%; and under the GNN classifier, 49.36%, 69.65%, 38.48%, 56.49%, 68.98%, 89.11%, 64.87%, and 85.29%, respectively. The comparative results demonstrate that the proposed UEDA method can effectively improve the cross-domain generalizability of the model. Furthermore, ablation experiments were conducted to validate the effectiveness of each module, and the results show that the modules are mutually reinforcing and that each is indispensable to the overall method.
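The eight GNN accuracies can be cross-checked against the four GNN averages (59.50%, 47.48%, 79.04%, and 75.08%) given in the Result section of the first abstract. The sketch below assumes, since the text does not say so explicitly, that the eight values are ordered as (1-shot, 5-shot) pairs for CUB, Plantae, EuroSAT, and Cropdiseases. The variable names are illustrative only.

```python
from decimal import Decimal

# GNN accuracies from the abstract, assumed to be (1-shot, 5-shot) per dataset.
gnn_accuracy = {
    "CUB": ("49.36", "69.65"),
    "Plantae": ("38.48", "56.49"),
    "EuroSAT": ("68.98", "89.11"),
    "Cropdiseases": ("64.87", "85.29"),
}

# Averages over the 1-shot and 5-shot settings reported in the first abstract.
reported_average = {
    "CUB": "59.50",
    "Plantae": "47.48",
    "EuroSAT": "79.04",
    "Cropdiseases": "75.08",
}

# Exact decimal arithmetic avoids binary floating-point rounding noise.
gaps = {}
for name, (one_shot, five_shot) in gnn_accuracy.items():
    mean = (Decimal(one_shot) + Decimal(five_shot)) / 2
    gaps[name] = abs(mean - Decimal(reported_average[name]))
    print(f"{name}: mean={mean} reported={reported_average[name]} gap={gaps[name]}")
```

Under that ordering, each computed mean differs from the reported average by at most 0.005 percentage points, which is consistent with rounding to two decimals.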
Conclusion
The uncertainty enhancement-based domain-awareness network proposed in this study allows the model to adapt to various domain shifts during the training phase and to learn effective generalizable knowledge from them, thus improving cross-domain image classification under few-shot conditions.