| Graduate Student | 蔡昀霖 Tsai, Yun-Lin |
|---|---|
| Thesis Title | 基於動態刪剪及擴張之聯合多任務學習演算法 (Dynamic Pruning and Expansion for Federated Multitask Learning) |
| Advisor | 洪樂文 Hong, Yao-Win Peter |
| Committee Members | 陳祝嵩 Chen, Chu-Song; 王奕翔 Wang, I-Hsiang |
| Degree | Master |
| Department | 通訊工程研究所 (Institute of Communications Engineering), College of Electrical Engineering and Computer Science |
| Year of Publication | 2021 |
| Academic Year of Graduation | 109 (ROC calendar) |
| Language | English |
| Number of Pages | 48 |
| Keywords (Chinese) | 聯合多任務學習、神經網路刪剪、神經網路擴張 |
| Keywords (English) | Federated Multitask Learning, Neural Network Pruning, Neural Network Expansion |
This thesis proposes a dynamic pruning and expansion (DyPE) technique for the federated learning of multiple diverse local tasks. The technique enables local devices to tailor their models toward their specific local tasks while leveraging the benefits of transfer through shared model parameters. This differs from most existing works on federated learning, which assume the use of a common model at all local devices. In particular, local pruning helps eliminate parameters that are less relevant to the local task so as to reduce interference from other tasks, whereas local expansion generates sub-models that capture task-specific knowledge. The proposed method is also communication-efficient since only the shared model parameters need to be exchanged between the central server and the local devices in each training round. The effectiveness of DyPE is demonstrated through simulations on real-world datasets in comparison with the current state of the art for federated multitask learning. The results show that the proposed method is capable of handling tasks with non-IID data distributions and adapts well to the compatibility of different tasks.
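The abstract describes three mechanisms: local pruning of less relevant shared parameters, local expansion into task-specific sub-models, and communication of only the shared parameters in each round. The Python sketch below is only a rough illustration of those ideas under assumed shapes, thresholds, and aggregation rules; it is not the thesis implementation, and all names (`local_prune`, `local_expand`, `aggregate_shared`) and hyperparameters are hypothetical.

```python
# Minimal illustrative sketch (not the DyPE implementation) of the three ideas
# named in the abstract: magnitude-based local pruning, local expansion with
# task-specific parameters, and server aggregation over shared parameters only.
# Shapes, the keep ratio, and the aggregation rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def local_prune(shared_weights, keep_ratio=0.5):
    """Zero out the smallest-magnitude shared weights for one client.

    The binary mask records which parameters this client considers relevant
    to its own task; keep_ratio is an assumed hyperparameter.
    """
    threshold = np.quantile(np.abs(shared_weights).ravel(), 1.0 - keep_ratio)
    mask = (np.abs(shared_weights) >= threshold).astype(shared_weights.dtype)
    return shared_weights * mask, mask

def local_expand(hidden_dim, expand_units=4):
    """Create a small task-specific sub-model (here: one extra weight block)."""
    return rng.normal(scale=0.01, size=(hidden_dim, expand_units))

def aggregate_shared(client_weights, client_masks):
    """Average only the shared parameters, counting the clients that kept each one."""
    stacked = np.stack(client_weights)            # (num_clients, ...)
    counts = np.stack(client_masks).sum(axis=0)   # how many clients kept each weight
    counts = np.maximum(counts, 1.0)              # avoid division by zero
    return stacked.sum(axis=0) / counts

# --- one illustrative round with three clients -------------------------------
hidden_dim = 8
global_shared = rng.normal(size=(hidden_dim, hidden_dim))

client_weights, client_masks, client_submodels = [], [], []
for _ in range(3):
    # Each client would fine-tune its copy on local data here (omitted).
    local_copy = global_shared + rng.normal(scale=0.05, size=global_shared.shape)
    pruned, mask = local_prune(local_copy, keep_ratio=0.5)
    client_weights.append(pruned)
    client_masks.append(mask)
    client_submodels.append(local_expand(hidden_dim))  # stays on the client

# Only the shared (pruned) parameters travel to the server; the task-specific
# sub-models never leave the clients, which is where the communication saving
# described in the abstract would come from.
global_shared = aggregate_shared(client_weights, client_masks)
print("updated shared parameters:", global_shared.shape)
```

The mask-weighted average is one plausible way to realize "only shared parameters are exchanged": a parameter's update is contributed only by the clients that retained it after pruning, while each client's expanded sub-model remains private to its own task.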