Joint dynamic correction algorithms for local and global drifts in federated learning
2024, Vol. 29, No. 12, Pages 3727-3738
Print publication date: 2024-12-16
DOI: 10.11834/jig.230891
Qi Yincheng, Huo Yalin, Wang Ning, Hou Yu. 2024. Joint dynamic correction algorithms for local and global drifts in federated learning. Journal of Image and Graphics, 29(12): 3727-3738
Objective
In federated learning scenarios, inconsistent data distributions across clients cause large deviations among the clients' local objectives and push the globally averaged model away from the global optimum, degrading both the convergence speed of training and the accuracy of the model. To address the slow convergence and low accuracy of the global model caused by non-independent and identically distributed (non-IID) data, this paper proposes a federated learning algorithm for joint dynamic correction (FedJDC), which optimizes both the client side and the server side.
Method
To reduce the impact of local model update drift, a cumulative offset is defined to measure the degree of non-IID data at each participating client, and a dynamic constraint term is introduced into the local loss function. The strength of the constraint is adjusted dynamically according to the cumulative offset, so the algorithm automatically adapts to different degrees of non-IID data and reduces the inconsistency of local update directions, thereby improving model accuracy and communication efficiency. To counter global model aggregation drift, the cumulative offsets uploaded by the participating clients are used as the aggregation weights of the global model, dynamically updating the global model and substantially reducing the number of communication rounds.
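As a rough, hypothetical illustration of the dynamic constraint described above (the paper's exact formulation is not reproduced here), the local objective could take a FedProx-style proximal form whose strength scales with the client's cumulative offset; the names mu, theta, and theta_global below are illustrative assumptions:

```python
import numpy as np

def local_loss(theta, theta_global, X, y, cum_offset, mu=0.1):
    """Hypothetical local objective: task loss plus a dynamic
    constraint term scaled by this client's cumulative offset."""
    # Stand-in task loss (least squares); the paper trains CNNs instead
    task = 0.5 * np.mean((X @ theta - y) ** 2)
    # Dynamic constraint: a larger cumulative offset pulls the local
    # model more strongly toward the current global model
    constraint = 0.5 * mu * cum_offset * np.sum((theta - theta_global) ** 2)
    return task + constraint
```

Under this reading, a client whose updates have drifted far from the global direction (a large cum_offset) is penalized more for moving away from theta_global, which matches the abstract's description of a constraint that adapts automatically to each client's degree of non-IID.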
Result
Experimental results on three real datasets show that, compared with four different federated learning algorithms under various degrees of non-IID data, FedJDC reduces the number of communication rounds by 62.29%, 20.90%, 24.93%, and 20.47% on average and improves model accuracy by 5.48%, 1.62%, 2.10%, and 2.28% on average.
Conclusion
The proposed joint dynamic correction algorithm for local and global drifts in federated learning improves both local model updating and global model aggregation, reducing the number of communication rounds, improving accuracy, and achieving good convergence.
Objective
Federated learning enables multiple parties to collaboratively train a machine learning model without sharing their local data. In practical applications, the data across nodes usually follow a non-independent identical distribution (non-IID). During local updates, each client's model is optimized toward its local optimum (i.e., it fits the client's individual feature distribution) rather than the global objective, which causes a client update drift. Meanwhile, in global updates that aggregate these diverged local models, the server model is further distracted by the set of mismatched local optima, which subsequently leads to a global drift at the server model. To solve the problems of slow global convergence and the increased number of training communication rounds caused by non-IID data, this paper proposes a joint dynamic correction federated learning algorithm (FedJDC) that optimizes both the client and the server.
Method
To reduce the influence of non-IID data on federated learning, this paper carries out a joint optimization covering both local model updates and global model aggregation and proposes the FedJDC algorithm. FedJDC uses the cosine similarity between the local and global update directions to measure the offset of each participating client. However, because each client exhibits a different degree of non-IID, determining the model offset solely from the cosine similarity computed in the current round can make the model update unstable. Therefore, FedJDC defines a cumulative offset and introduces an attenuation coefficient ρ: in calculating the cumulative offset, both the current-round offset and the historical cumulative offset are taken into account, and adjusting ρ changes the proportion contributed by the current round, so the influence of any single round's offset on the final result can be limited. This paper also proposes a strategy for dynamically adjusting the constraint term for local model update offset. Specifically, the constraint term of the local loss function is adjusted according to the client's computed cumulative offset, so the algorithm automatically adapts to various non-IID settings without careful hyperparameter selection, improving its flexibility. To dynamically change the weight of global model aggregation in each round and effectively improve convergence speed and model accuracy, this paper further designs a dynamic weighted aggregation strategy that uses the cumulative offsets uploaded by all clients as the weights of global model aggregation in each communication round.
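The following minimal sketch shows one way the two quantities described above could be computed; only the use of cosine similarity, the attenuation coefficient ρ, and offset-based aggregation weights come from the text, while the exponential-moving-average form and the weight normalization are assumptions:

```python
import numpy as np

def update_cumulative_offset(prev_offset, local_update, global_update, rho=0.9):
    """Assumed EMA form: mix the historical cumulative offset with this
    round's offset, taken here as 1 - cos(local update, global update)."""
    cos = np.dot(local_update, global_update) / (
        np.linalg.norm(local_update) * np.linalg.norm(global_update) + 1e-12)
    round_offset = 1.0 - cos  # 0 when aligned, up to 2 when opposed
    # A larger rho lowers the proportion contributed by the current round
    return rho * prev_offset + (1.0 - rho) * round_offset

def aggregate(client_models, cum_offsets):
    """Assumed dynamic weighted aggregation: clients with a smaller
    cumulative offset receive a larger weight (softmax-style scores)."""
    scores = np.exp(-np.asarray(cum_offsets))
    weights = scores / scores.sum()
    stacked = np.stack([np.asarray(m) for m in client_models])  # (n_clients, n_params)
    return weights @ stacked  # weighted average of flattened parameters
```

Whether drifted clients are up- or down-weighted during aggregation is not specified in the abstract; this sketch down-weights them.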
Result
The proposed method is tested on three datasets using different deep learning models: LeNet-5, VGG16, and ResNet18 are trained on the MNIST, FMNIST, and CIFAR10 datasets, respectively. Four experiments are designed to demonstrate the effectiveness of the proposed algorithm. To verify the accuracy of FedJDC at different degrees of non-IID, the hyperparameter β of the Dirichlet distribution is varied, and the performance of different algorithms is compared. Experimental results show that FedJDC improves model accuracy by 5.48%, 1.62%, 2.10%, and 2.28% on average compared with FedAvg, FedProx, FedAdp, and FedLAW, respectively. To evaluate the communication efficiency of FedJDC, the number of communication rounds required to reach a target accuracy is counted and compared with that of the other algorithms. Under different degrees of non-IID, FedJDC reduces communication rounds by 62.29%, 20.90%, 24.93%, and 20.47% on average compared with FedAvg, FedProx, FedAdp, and FedLAW, respectively. This paper also investigates the effect of the number of local epochs on final model accuracy: FedJDC outperforms the other four methods under different numbers of local epochs and shows better robustness against the larger offset caused by more local update epochs. Ablation experiments further show that each optimization strategy performs well on all datasets, and that combining the two strategies in FedJDC achieves the best overall performance.
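For context, non-IID client partitions of this kind are commonly simulated by drawing per-class client proportions from a Dirichlet distribution (as in Hsu et al., 2019, cited below); the following sketch is illustrative and is not the paper's partitioning code:

```python
import numpy as np

def dirichlet_partition(labels, n_clients, beta, seed=0):
    """Split sample indices across clients with per-class proportions
    drawn from Dir(beta); a smaller beta yields a more skewed,
    more strongly non-IID partition."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        props = rng.dirichlet(np.full(n_clients, beta))  # proportions over clients
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for k, part in enumerate(np.split(idx, cuts)):
            client_idx[k].extend(part.tolist())
    return client_idx
```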
Conclusion
This paper corrects the local and global model offsets from two aspects and proposes a joint dynamic correction algorithm for these offsets in federated learning. A cumulative offset is defined, and an attenuation coefficient is introduced into its calculation: by considering both historical and current offset information, the cumulative offset is adjusted dynamically to keep the parameter updates stable during training. The dynamic constraint strategy uses the cumulative offset computed by each client in every round as the constraint parameter of the client model. The dynamic weighted aggregation strategy adjusts the weight of each local model during global model aggregation according to the cumulative offset of each participating client, dynamically updating the global model in every round. The combination of the two optimization strategies achieves good results, effectively alleviates the performance degradation of federated learning models caused by non-IID data, and provides a solid foundation for the further application of federated learning in this field.
federated learning (FL); non-independent identical distribution (non-IID); loss function; model aggregation; convergence
Cohen G, Afshar S, Tapson J and van Schaik A. 2017. EMNIST: extending MNIST to handwritten letters//Proceedings of 2017 International Joint Conference on Neural Networks. Anchorage, USA: IEEE: 2921-2926 [DOI: 10.1109/IJCNN.2017.7966217]
Hsu T M H, Qi H and Brown M. 2019. Measuring the effects of non-identical data distribution for federated visual classification [EB/OL]. [2023-12-01]. https://arxiv.org/pdf/1909.06335.pdf
Hsu T M H, Qi H and Brown M. 2020. Federated visual classification with real-world data distribution//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 76-92 [DOI: 10.1007/978-3-030-58607-2_5]
Huang L, Yin Y F, Fu Z, Zhang S F, Deng H and Liu D B. 2020. LoAdaBoost: loss-based AdaBoost federated machine learning with reduced computational complexity on IID and non-IID intensive care data. PLoS One, 15(4): #0230706 [DOI: 10.1371/journal.pone.0230706]
Kaissis G A, Makowski M R, Rückert D and Braren R F. 2020. Secure, privacy-preserving and federated machine learning in medical imaging. Nature Machine Intelligence, 2(6): 305-311 [DOI: 10.1038/s42256-020-0186-1]
Karimireddy S P, Kale S, Mohri M, Reddi S J, Stich S U and Suresh A T. 2020. SCAFFOLD: stochastic controlled averaging for federated learning//Proceedings of the 37th International Conference on Machine Learning. Virtual Event: JMLR.org: 5132-5143
Krizhevsky A. 2009. Learning Multiple Layers of Features from Tiny Images. University of Toronto
Kumar R, Khan A A, Kumar J, Zakria, Golilarz N A, Zhang S M, Ting Y, Zheng C Y and Wang W Y. 2021. Blockchain-federated-learning and deep learning models for COVID-19 detection using CT imaging. IEEE Sensors Journal, 21(14): 16301-16314 [DOI: 10.1109/JSEN.2021.3076767]
Lecun Y, Bottou L, Bengio Y and Haffner P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278-2324 [DOI: 10.1109/5.726791]
Li Q B, Diao Y Q, Chen Q and He B S. 2022. Federated learning on non-IID data silos: an experimental study//Proceedings of the 38th IEEE International Conference on Data Engineering (ICDE). Kuala Lumpur, Malaysia: IEEE: 965-978 [DOI: 10.1109/ICDE53745.2022.00077]
Li X X, Jiang M R, Zhang X F, Kamp M and Dou Q. 2021a. FedBN: federated learning on non-IID features via local batch normalization [EB/OL]. [2023-12-01]. https://arxiv.org/pdf/2102.07623.pdf
Li Z X, Lin T, Shang X Y and Wu C. 2023. Revisiting weighted aggregation in federated learning with neural networks//Proceedings of the 40th International Conference on Machine Learning. Honolulu, USA: JMLR.org: 19767-19788
Li Q B, He B S and Song D. 2021b. Model-contrastive federated learning//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 10708-10717 [DOI: 10.1109/CVPR46437.2021.01057]
Li T, Sahu A K, Zaheer M, Sanjabi M, Talwalkar A and Smith V. 2020a. Federated optimization in heterogeneous networks [EB/OL]. [2023-12-01]. https://arxiv.org/pdf/1812.06127.pdf
Li X, Huang K X, Yang W H, Wang S S and Zhang Z H. 2020b. On the convergence of FedAvg on non-IID data [EB/OL]. [2023-12-01]. https://arxiv.org/pdf/1907.02189.pdf
Liu Y, Huang A B, Luo Y, Huang H, Liu Y Z, Chen Y Y, Feng L C, Chen T J, Yu H and Yang Q. 2020. FedVision: an online visual object detection platform powered by federated learning//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI: 13172-13179 [DOI: 10.1609/aaai.v34i08.7021]
McMahan H B, Moore E, Ramage D, Hampson S and Arcas B A Y. 2017. Communication-efficient learning of deep networks from decentralized data//Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. Fort Lauderdale, USA: [s.n.]: 1273-1282
Wang H Y, Yurochkin M, Sun Y K, Papailiopoulos D and Khazaeni Y. 2020a. Federated learning with matched averaging [EB/OL]. [2023-12-01]. https://arxiv.org/pdf/2002.06440.pdf
Wang J Y, Liu Q H, Liang H, Joshi G and Poor H V. 2020b. Tackling the objective inconsistency problem in heterogeneous federated optimization//Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc.: 7611-7623
Wu H D and Wang P. 2021. Fast-convergent federated learning with adaptive weighting. IEEE Transactions on Cognitive Communications and Networking, 7(4): 1078-1088 [DOI: 10.1109/TCCN.2021.3084406]
Xia P P, Zhang L and Li F Z. 2015. Learning similarity with cosine similarity ensemble. Information Sciences, 307: 39-52 [DOI: 10.1016/j.ins.2015.02.024]
Yuan Q Q, Shen H F, Li P X and Zhang L P. 2010. Adaptively regularized multi-frame image super-resolution reconstruction. Journal of Image and Graphics, 15(12): 1720-1727 [DOI: 10.11834/jig.20101202]
Yurochkin M, Agarwal M, Ghosh S, Greenewald K, Hoang T N and Khazaeni Y. 2019. Bayesian nonparametric federated learning of neural networks//Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: [s.n.]: 9252-9261
Zhao Y, Li M, Lai L Z, Suda N, Civin D and Chandra V. 2022. Federated learning with non-IID data [EB/OL]. [2023-12-01]. https://arxiv.org/pdf/1806.00582.pdf