Graduate Student: | 董建鋒 TUNG, CHIEN-FENG |
Thesis Title: | 以強化學習法進行孤立微電網之頻率同步 Frequency Synchronization of Isolated AC Microgrids: A Reinforcement Learning Approach |
Advisor: | 朱家齊 CHU, CHIA-CHI |
Committee Members: | 黃維澤 Huang, Wei-Tzer; 鄧人豪 Teng, Jen-Hao; 劉建宏 Liu, Jian-Hong |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
Year of Publication: | 2022 |
Graduation Academic Year: | 110 |
Language: | English |
Number of Pages: | 70 |
Keywords: | Reinforcement Learning, Multiagent Consensus Control, Optimal Control, Frequency Secondary Control |
Renewable energy, especially wind and solar power, has developed rapidly in recent years. As the penetration of renewable energy in Taiwan rises year by year, the inertia of the power system decreases accordingly, and the risk of grid frequency disturbances increases; once the frequency drops too low, the power system is in danger of collapse. Secondary frequency control ensures that grid frequency deviations are confined within an acceptable range.
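To make the objective of secondary control concrete, the following sketch simulates distributed leader-following consensus on frequency corrections over a hypothetical four-unit communication graph. It is a minimal illustration of the synchronization goal, not the controller designed in this thesis; the adjacency matrix, pinning gain, initial frequencies, and step size are all assumed values.

```python
# Minimal sketch of the leader-following consensus objective behind secondary
# frequency control. All quantities here (graph, gains, initial frequencies)
# are hypothetical; only unit 1 is pinned to the 60 Hz reference, yet every
# unit recovers the nominal frequency using neighbor information alone.
import numpy as np

Adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)   # communication adjacency (ring)
pin = np.array([1.0, 0.0, 0.0, 0.0])          # only unit 1 sees the reference
w_ref = 60.0                                  # nominal grid frequency (Hz)
w = np.array([59.70, 59.85, 60.20, 59.90])    # post-disturbance frequencies
step = 0.1                                    # consensus gain

for _ in range(500):
    # Local error: disagreement with neighbors plus the pinned reference error.
    e = Adj.sum(axis=1) * w - Adj @ w + pin * (w - w_ref)
    w = w - step * e                          # each unit's secondary correction

print("synchronized frequencies (Hz):", np.round(w, 3))
```

Because the graph is connected and one unit is pinned, the iteration is a contraction for a small enough step size, and all frequencies converge to the 60 Hz reference.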
Reinforcement learning is a field of machine learning whose algorithms can be divided into those that require model information and those that do not. We use model-free reinforcement learning algorithms to perform secondary frequency control, verifying that reinforcement learning can be applied in different environments.
This thesis proposes three reinforcement learning algorithms to achieve state synchronization of multiple agents: a Q-learning algorithm, a distributed Q-learning algorithm, and an actor-critic structure for off-policy learning. These algorithms require no knowledge of the agents' system parameters; that is, they are model-free. To confirm their performance, the three consensus algorithms are applied to state regulation in a state-space model and to frequency control of the grid, respectively. Simulation results demonstrate that the learned feedback gains provide better performance indices when the environment changes.
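As a concrete illustration of the model-free idea summarized above, the sketch below runs Q-learning-based policy iteration for a single discrete-time linear-quadratic regulator, the building block of such consensus algorithms. It is a minimal example under assumed dynamics, not the thesis implementation: the matrices A and B are hypothetical and are used only to generate data, never read by the learner.

```python
# Model-free Q-learning for a discrete-time LQR: the quadratic Q-function
# Q(x, u) = [x; u]^T H [x; u] is fitted by least squares from observed
# (state, input, next state, cost) data, and the feedback gain is improved
# greedily. A and B are hypothetical and hidden from the learner.
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0.95, 0.10],
              [0.00, 0.90]])            # unknown plant, data generation only
B = np.array([[0.0],
              [0.1]])
Qc, Rc = np.eye(2), np.eye(1)           # quadratic stage-cost weights
n, m = 2, 1

def phi(z):
    """Features such that z^T H z = phi(z) @ h for symmetric H stacked in h."""
    i, j = np.triu_indices(len(z))
    return np.where(i == j, 1.0, 2.0) * np.outer(z, z)[i, j]

def unvech(h, d):
    """Rebuild the symmetric matrix H from its stacked upper triangle h."""
    H = np.zeros((d, d))
    i, j = np.triu_indices(d)
    H[i, j] = h
    H[j, i] = h
    return H

K = np.zeros((m, n))                    # initial stabilizing gain, u = -K x
for _ in range(15):                     # policy iteration
    Phi, cost = [], []
    x = rng.standard_normal(n)
    for _ in range(300):                # collect excited closed-loop data
        u = -K @ x + 0.3 * rng.standard_normal(m)   # probing noise
        x1 = A @ x + B @ u                          # observed next state
        z, z1 = np.concatenate([x, u]), np.concatenate([x1, -K @ x1])
        Phi.append(phi(z) - phi(z1))                # temporal-difference row
        cost.append(x @ Qc @ x + u @ Rc @ u)        # observed stage cost
        x = x1
    h = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)[0]
    H = unvech(h, n + m)
    K = np.linalg.solve(H[n:, n:], H[n:, :n])       # greedy improvement

print("learned feedback gain K =", K)
```

The improvement step K = Huu^{-1} Hux is the greedy minimizer of the fitted quadratic Q-function, so with persistently exciting data the loop reproduces the model-based Riccati iteration without ever using A or B.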