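Oracle Learning for Agent Negotiation Based on Rationality in Task Reallocation Problems | National Tsing Hua University Theses and Dissertations Repository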


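Graduate student:	陳怡君 Chen, Yi-chun
Thesis title:	理性代理人任務重分配協商下的神諭學習法 Oracle Learning for Agent Negotiation Based on Rationality in Task Reallocation Problems
Advisor:	蘇豐文 Soo, Von-Wun
Committee members:	陳煥宗 Chen, Hwann-Tzong; 周志遠 Chou, Jerry
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2013
Academic year of graduation:	101
Language:	English
Number of pages:	114
Keywords (Chinese):	Task Reallocation Problem, Agent Negotiation Protocol, Reinforcement Learning
Keywords (English):	Task Allocation Problem, Agent Negotiation, OCSM-Contracts Protocol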

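Task allocation is an important issue in multi-agent systems. The OCSM negotiation protocol has been proposed, and its ability to reach the optimal solution has been proved. However, during task reallocation, without explicit guidance on which exchanges to negotiate, moving from an arbitrary allocation to the optimal allocation remains complex and difficult. In this thesis we propose a method for finding such explicit negotiation guidance, the oracle, so that agents can reduce the number of negotiations and gradually approach the optimal allocation. The proposed Oracle Learning method is divided into many detailed mechanisms that answer, one by one, the details that must be resolved in the task reallocation problem. Through experiments we then evaluate how well the method solves task allocation problems, the number of negotiation exchanges it requires, and its applicability to problems of different scales. The method does reduce the number of negotiation exchanges needed and, thanks to the design of its detailed mechanisms, gives explicit negotiation guidance in every random allocation state. Consequently, the complexity of applying the OCSM negotiation mechanism to the task reallocation problem can be reduced significantly.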

Task allocation with a contract net protocol is an important issue in multi-agent systems. The OCSM-contracts protocol has been proposed, and its guarantee of global optimality has already been proved. However, without an oracle to guide the selection of strategies in each problem-solving situation, reaching the optimal allocation remains difficult. This thesis proposes a method for finding such an oracle: a guide that helps agents reduce the number of negotiation steps needed to reach the optimal allocation from any random initial task assignment. The proposed Oracle Learning method is divided into several sub-mechanisms, each designed to solve one detailed sub-problem in modeling the task (re)allocation problem. We show how each sub-problem can be solved and how the complexity of finding the optimal solution can be reduced. Through experiments, we evaluate the method's problem-solving performance, the number of negotiation steps it requires, and its applicability to problems of different scales. We conclude that the method reduces the number of negotiation steps needed and gives a proper negotiation guide for each task assignment, because its sub-mechanisms answer the questions an oracle needs to answer. As a result, the computational complexity of the OCSM negotiation mechanism in the task reallocation problem is greatly reduced.
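To make the idea of counting negotiation steps concrete, the following is a minimal sketch and not the thesis's method. It assumes a hypothetical additive cost table (`COST`), uses only O-contracts (one task moves from one agent to another), and treats a contract as individually rational when it lowers the total cost, which is the usual reading when side payments are allowed. The names `COST`, `o_contracts`, `negotiate`, and `optimal_cost` are illustrative assumptions; the learned oracle itself is not reproduced here.

```python
from itertools import product

# Hypothetical cost table (an assumption for illustration):
# COST[agent][task] is the cost for that agent to perform the task.
COST = {
    "a1": {"t1": 4, "t2": 1, "t3": 6},
    "a2": {"t1": 2, "t2": 5, "t3": 3},
}

def total_cost(allocation):
    """Sum each task's cost under its assigned agent (additive costs assumed)."""
    return sum(COST[agent][task] for task, agent in allocation.items())

def o_contracts(allocation):
    """Yield every allocation reachable by moving one task to another agent."""
    for task, owner in allocation.items():
        for agent in COST:
            if agent != owner:
                yield {**allocation, task: agent}

def negotiate(allocation):
    """Greedily apply the best cost-reducing O-contract.

    Returns the final allocation and the number of negotiation steps taken.
    """
    steps = 0
    while True:
        best = min(o_contracts(allocation), key=total_cost)
        if total_cost(best) >= total_cost(allocation):
            return allocation, steps  # no individually rational contract left
        allocation, steps = best, steps + 1

def optimal_cost():
    """Brute-force the minimum total cost over all possible allocations."""
    tasks = list(COST["a1"])
    return min(
        total_cost(dict(zip(tasks, owners)))
        for owners in product(COST, repeat=len(tasks))
    )

start = {"t1": "a1", "t2": "a2", "t3": "a1"}
final, steps = negotiate(start)
print(final, steps, total_cost(final), optimal_cost())
```

With this toy table the greedy search reaches the optimum in three steps. With non-additive costs or the larger C-, S-, and M-contract types, the number of candidate contracts per state grows quickly, which is the situation in which guidance on what to negotiate next becomes valuable.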

Abstract
Chinese Abstract
Table of Contents
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORKS
2.1    Task Allocation Problem
Definition 2.1.1: Task Allocation Problem
2.2    OCSM-Contract protocol
Definition 2.2.1: O-contract (Original Contract)
Definition 2.2.2: C-contract (Cluster Contract)
Definition 2.2.3: S-contract (Swap Contract)
Definition 2.2.4: M-contract (Multi-agent Contract)
Proposition 2.2.1: Path
Proposition 2.2.2: No Path
Proposition 2.2.3: IR Contract
2.3    Reinforcement Learning
Definition 2.3.1: Q-Learning
CHAPTER 3 PROPOSED METHODS
3.1    Specially Defined Actions
Definition 3.1.1: Action
Trick: Bounded Contract
3.2    Oracle Recommend Structure
Mechanism 3.2.1: Rational Decision Making Strategy Bases on Individual Rationality
Mechanism 3.2.2: Agent-chosen Policy
Mechanism 3.2.3: Contract Recommendation
3.3    Oracle Learning
Overall Refining Structure
Part 1: Choose policy parameters and record state table
Part 2: Update of the recorded state table
CHAPTER 4 EXPERIMENTS DESIGN AND ANALYSIS
A Small Sample Case
Experiment 4.1: Check the Properties of OCSM-Contract Protocol
Experiment 4.1-1: Test Proposition 2.2.3 IR Contract
Experiment 4.1-2: The Decrease of Performance in Optimality Finding with Bounded contracts
Experiment 4.2: Evaluation of Mechanisms in Oracle Recommend Structure
Experiment 4.2-1: Cross Validation of Feature-selection under Different Rationalities and Strategies
Experiment 4.2-2: Different contract recommend mechanisms
Experiment 4.2-3: Contract Recommendations by Different k State
Experiment 4.3: Evaluation of overall performance
Experiment 4.3-1: Different Problem at Different Scale
CHAPTER 5 CONCLUSION AND FUTURE WORK
Reference

                                


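Full-text release date: full text not authorized for public release (on-campus network)
Full-text release date: full text not authorized for public release (off-campus network)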
