A Hybrid Learning Approach for Joint Resource Allocation and Packet Scheduling in Uplink Wireless Networks


Graduate student:	Yen, Ting-Guang (顏廷光)
Thesis title:	A Hybrid Learning Approach for Joint Resource Allocation and Packet Scheduling in Uplink Wireless Networks
Advisor:	Lee, Duan-Shin (李端興)
Committee members:	Chang, Cheng-Shang (張正尚); Chen, Chih-Cheng (陳志成)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Institute of Communications Engineering
Year of publication:	2022
Academic year of graduation:	111 (ROC calendar)
Language:	English
Number of pages:	34
Keywords:	reinforcement learning, fictitious play, stochastic games, resource allocation, scheduling algorithms, ultra-reliable low-latency communication (URLLC), enhanced mobile broadband (eMBB), wireless networks


In this thesis, we formulate a joint resource allocation and packet scheduling problem for ultra-reliable low-latency communication (URLLC) and enhanced mobile broadband (eMBB) traffic as a stochastic game. Traditional approaches that solve stochastic games with multi-agent Q-learning algorithms suffer from either the curse of dimensionality or, because the environment is distributed, a lack of information. We propose a hybrid learning scheme based on a random mixture of a multi-agent Q-learning algorithm and fictitious play. Through simulation, we show that the proposed scheme is capable of finding the best policy.
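
The abstract describes the hybrid scheme only at a high level, so the following is a minimal illustrative sketch and not the thesis's algorithm. It assumes a hypothetical two-agent, single-state, common-payoff game (the matrix `PAYOFF`), and it assumes the "random mixture" means that at each step an agent picks its action by epsilon-greedy Q-learning with probability `mix_prob` and otherwise by a fictitious-play best response to the other agent's empirical action frequencies. All names and parameter values (`HybridAgent`, `simulate`, `mix_prob`, `alpha`, `gamma`, `epsilon`) are made up for illustration.

```python
import random

# Hypothetical common-payoff game: PAYOFF[a0][a1] is the reward both agents get.
PAYOFF = [[1.0, 0.0],
          [0.0, 2.0]]
N_ACTIONS = 2


def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


class HybridAgent:
    """Agent that randomly mixes Q-learning with fictitious play (assumed form)."""

    def __init__(self, player, mix_prob, rng, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.player = player              # 0 = row player, 1 = column player
        self.mix_prob = mix_prob          # probability of taking the Q-learning branch
        self.rng = rng
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = [0.0] * N_ACTIONS        # Q-values over own actions (single state)
        self.other_counts = [1] * N_ACTIONS  # counts of the other agent's actions

    def _payoff(self, own, other):
        return PAYOFF[own][other] if self.player == 0 else PAYOFF[other][own]

    def act(self):
        if self.rng.random() < self.mix_prob:
            # Q-learning branch: epsilon-greedy over the agent's own Q-values.
            if self.rng.random() < self.epsilon:
                return self.rng.randrange(N_ACTIONS)
            return argmax(self.q)
        # Fictitious-play branch: best response to the empirical frequencies
        # of the other agent's past actions, using the known payoff matrix.
        total = sum(self.other_counts)
        expected = [
            sum(self._payoff(a, b) * self.other_counts[b] / total
                for b in range(N_ACTIONS))
            for a in range(N_ACTIONS)
        ]
        return argmax(expected)

    def update(self, own, other, reward):
        # Q-learning update; with one state, the next state is the same state.
        target = reward + self.gamma * max(self.q)
        self.q[own] += self.alpha * (target - self.q[own])
        # Fictitious-play bookkeeping: record what the other agent just did.
        self.other_counts[other] += 1


def simulate(steps=2000, mix_prob=0.5, seed=0):
    """Run both agents for `steps` rounds; return the joint-action history and agents."""
    rng = random.Random(seed)
    agents = [HybridAgent(p, mix_prob, rng) for p in range(2)]
    history = []
    for _ in range(steps):
        a0, a1 = agents[0].act(), agents[1].act()
        reward = PAYOFF[a0][a1]
        agents[0].update(a0, a1, reward)
        agents[1].update(a1, a0, reward)
        history.append((a0, a1))
    return history, agents
```

Calling `simulate(steps=2000, mix_prob=0.5, seed=0)` returns the sequence of joint actions, which can be inspected to see how often the agents coordinate. Setting `mix_prob=1.0` reduces the sketch to independent Q-learning and `mix_prob=0.0` to pure fictitious play, which makes the two ingredients easy to compare in this toy setting.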

Abstract (Chinese)
Abstract
Acknowledgements
List of Figures
List of Tables
1 Introduction
2 System Architecture
3 A Stochastic Game
4 Review of Reinforcement Learning
4.1 Principle of Deep Q-learning
5 A Hybrid Learning Algorithm
6 Simulation
6.1 Comparison between different methods of training process
6.2 Comparison between different methods of resource allocation and packet scheduling
7 Conclusions
Bibliography
                                

