Learning to play an El Farol bar game using discounted reinforcement learning


Author:	林新宗 Lin, Hsin-Tsung
Thesis title:	Learning to play an El Farol bar game using discounted reinforcement learning (運用折扣強化學習策略尋找艾爾法酒吧賽局之奈許平衡)
Advisor:	李端興 Lee, Duan-Shin
Committee members:	張正尚 Chang, Cheng-Shang; 林華君 Lin, Hwa-Chun
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Institute of Communications Engineering
Year of publication:	2018
Academic year of graduation:	106 (ROC calendar)
Language:	English
Number of pages:	29
Keywords (Chinese):	艾爾法酒吧、賽局理論、奈許平衡、學習理論
Keywords (English):	El Farol bar, game theory, Nash equilibrium, learning theory

In this thesis we study and analyze the discounted reinforcement learning process of an El Farol bar game. We derive a system of nonlinear differential equations that describes this learning process, and we show that some equilibrium points of the nonlinear system correspond to the Nash equilibria of the game. Among these equilibrium points, only those that correspond to pure Nash equilibria are stable; those that correspond to mixed Nash equilibria are unstable.
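This record does not reproduce the thesis's model, so the following Python sketch is only an illustration of what a discounted reinforcement learning process in an El Farol bar game might look like. Everything specific in it is an assumption rather than the author's formulation: the payoff values `G`, `S`, and `B` (with `B < S < G`), the capacity threshold, the rule that each agent attends with probability proportional to its propensity, and the update that multiplies both propensities by `discount` before adding the realized payoff to the chosen action.

```python
import random


def payoff(action, attendance, capacity, G=3.0, S=2.0, B=1.0):
    """Assumed payoffs: staying home (0) pays S; going (1) pays G if the
    bar is not over capacity and B if it is crowded, with B < S < G."""
    if action == 0:
        return S
    return G if attendance <= capacity else B


def is_pure_nash(actions, capacity):
    """Under the assumed payoffs (and 0 < capacity < number of agents),
    a pure profile is a Nash equilibrium when exactly `capacity` agents go."""
    return sum(actions) == capacity


def simulate(n_agents=10, capacity=6, discount=0.99, rounds=5000, seed=0):
    """Run the assumed discounted reinforcement process.

    Returns each agent's final probability of going and the attendance
    recorded in every round.
    """
    rng = random.Random(seed)
    # prop[i] = [propensity to stay, propensity to go] for agent i
    prop = [[1.0, 1.0] for _ in range(n_agents)]
    history = []
    for _ in range(rounds):
        # Choice probability is proportional to the current propensities.
        p_go = [q[1] / (q[0] + q[1]) for q in prop]
        actions = [1 if rng.random() < p else 0 for p in p_go]
        attendance = sum(actions)
        for q, a in zip(prop, actions):
            # Discount past reinforcement, then reinforce the chosen action.
            q[0] *= discount
            q[1] *= discount
            q[a] += payoff(a, attendance, capacity)
        history.append(attendance)
    p_go = [q[1] / (q[0] + q[1]) for q in prop]
    return p_go, history


if __name__ == "__main__":
    p_go, history = simulate()
    print("final probabilities of going:", [round(p, 3) for p in p_go])
    print("attendance in the last 10 rounds:", history[-10:])
```

If the stability result stated in the abstract carries over to this assumed variant, one would expect the final probabilities to drift toward 0 or 1 for most agents, with attendance settling near the capacity. The sketch does not verify that claim; it only provides a way to observe the learning dynamics numerically.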

Abstract (Chinese)
Abstract
Acknowledgements
List of Figures
List of Tables
1 Introduction
2 Discounted Reinforcement Learning
2.0.1 Equilibrium points
2.0.2 Nash Equilibria
2.0.3 Stability
3 Numerical Results
4 Conclusions
Bibliography
                                

