| Field | Value |
|---|---|
| Graduate Student | 施辰翰 (Shih, Chen-Han) |
| Thesis Title | Information-Exchangeable Hierarchical Clustering for Federated Learning With Non-IID Data (非獨立同分布下訊息交換階層式聚類之聯邦學習) |
| Advisor | 許健平 (Sheu, Jang-Ping) |
| Committee Members | 郭建志 (Kuo, Jian-Jhih); 邱德泉 (Chiu, Te-Chuan) |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Publication Year | 2023 |
| Graduation Academic Year | 111 |
| Language | English |
| Pages | 30 |
| Keywords (Chinese) | 聯邦學習、聚類、通信拓撲、機器學習建議 |
| Keywords (English) | Federated learning, Clustering, Communication Topology, Machine-learned advice |
Machine Learning (ML) has emerged as a promising approach to solving research problems such as analyzing user data and making predictions, but it raises privacy concerns over centralized data. Federated Learning (FL) collaboratively trains a global model and exchanges model updates instead of local data to address these privacy concerns. However, the current FL framework suffers from three major deficiencies: high communication cost, a single point of failure, and low model test accuracy on non-independent and identically distributed (non-IID) data.
In this thesis, we combine the ideas of Clustered Federated Learning (CFL) and Decentralized Learning (DL), formulate the joint optimization problem of Cluster Formation (CF) and Topology Construction (TC), and prove that the problem is NP-hard and not approximable to within any constant factor larger than 1 unless P = NP.
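The exact CF/TC formulation appears in the thesis itself; purely as a schematic illustration (every symbol below is an assumption introduced here, not the thesis's notation), a joint cluster-formation and topology-construction objective of this general shape can be sketched as a binary program:

```latex
% Schematic only: x_{ik} = 1 if device i joins cluster k,
% e_{kl} = 1 if cluster heads k and l exchange model updates.
\begin{align*}
\min_{x,\,e}\quad & \sum_{i}\sum_{k} c_{ik}\,x_{ik} \;+\; \sum_{k<l} d_{kl}\,e_{kl}\\
\text{s.t.}\quad  & \sum_{k} x_{ik} = 1 \qquad \forall i,\\
                  & x_{ik}\in\{0,1\},\quad e_{kl}\in\{0,1\}.
\end{align*}
```

Here $c_{ik}$ would mix the communication cost and model distance of assigning device $i$ to cluster $k$, and $d_{kl}$ the cost of letting cluster heads $k$ and $l$ exchange updates; the thesis's actual constraints and objective may differ.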
To this end, we propose an innovative FL framework, IHC-FL, to 1) group devices into clusters based on communication cost and model distance, 2) distribute model aggregation over cluster heads, and 3) construct a topology to guide cluster heads in exchanging model updates. To the best of our knowledge, this thesis makes the first attempt to jointly optimize grouping user devices into clusters and exchanging model updates among cluster heads to enhance model performance. The numerical results show that, with non-IID data on FMNIST and CIFAR-10, IHC-FL reduces the total communication cost needed to reach the target accuracy by 38%–89% compared to other heuristics. Additionally, IHC-FL improves model test accuracy by 0.803%–63.221% compared to other heuristics given the same number of training rounds.
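To make the cluster-formation step concrete, the following is a minimal sketch (assuming NumPy/SciPy; the function name `form_clusters`, the mixing weight `alpha`, and the cosine-distance choice are illustrative assumptions, not the thesis's actual IHC-FL procedure) that groups clients by agglomerative clustering on a dissimilarity combining pairwise model distance and pairwise communication cost:

```python
"""Minimal sketch, not the thesis's IHC-FL algorithm: hierarchically cluster
clients using a weighted mix of model distance and communication cost."""
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform


def form_clusters(model_updates, comm_cost, alpha=0.5, n_clusters=4):
    # model_updates: (n_clients, n_params) flattened local model updates
    # comm_cost:     (n_clients, n_clients) symmetric pairwise link costs
    model_dist = squareform(pdist(model_updates, metric="cosine"))
    # Normalize both terms to [0, 1] so the weight alpha is meaningful.
    model_dist = model_dist / (model_dist.max() or 1.0)
    comm = comm_cost / (comm_cost.max() or 1.0)
    combined = alpha * model_dist + (1 - alpha) * comm
    # Agglomerative (hierarchical) clustering on the combined dissimilarity.
    condensed = squareform(combined, checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    updates = rng.normal(size=(8, 16))   # 8 clients, toy parameter vectors
    cost = rng.uniform(1, 10, size=(8, 8))
    cost = (cost + cost.T) / 2           # make the cost matrix symmetric
    np.fill_diagonal(cost, 0)
    print(form_clusters(updates, cost))  # cluster label for each client
```

In this sketch, `alpha` trades off how strongly model similarity versus link cost drives the grouping; the thesis's actual clustering criterion, cluster-head aggregation, and topology construction are described in the full text.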