
Author: 陳怡君 (Chen, Yi-chun)
Thesis title: 理性代理人任務重分配協商下的神諭學習法
Oracle Learning for Agent Negotiation Based on Rationality in Task Reallocation Problems
Advisor: 蘇豐文 (Soo, Von-Wun)
Committee members: 陳煥宗 (Chen, Hwann-Tzong); 周志遠 (Chou, Jerry)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2013
Academic year of graduation: 101
Language: English
Number of pages: 114
Keywords (Chinese): Task Reallocation Problem, Agent Negotiation Protocol, Reinforcement Learning
Keywords (English): Task Allocation Problem, Agent Negotiation, OCSM-Contracts Protocol
    Task allocation is an important issue in multi-agent systems. The OCSM contract protocol has been proposed, and its ability to reach the optimal solution has been proven. However, in the process of task reallocation, without explicit guidance for which contract exchanges to make, moving from an arbitrary allocation state to the optimal allocation remains complex and difficult. In this thesis we propose a method for finding such explicit negotiation guidance, an oracle, which lets agents reduce the number of negotiation exchanges and gradually approach the optimal allocation. The proposed Oracle Learning method is divided into several sub-mechanisms, each of which answers one of the detailed questions that must be resolved in the task reallocation problem. Through experiments, we then evaluate the method's effectiveness in solving task allocation problems, the number of negotiation exchanges it requires, and its applicability at different problem scales. The method indeed reduces the number of required exchanges and, owing to the design of its sub-mechanisms, gives explicit negotiation guidance from any random allocation state. Consequently, the complexity of applying the OCSM negotiation mechanism to task reallocation problems can be significantly reduced.


    Task allocation with a contract net protocol is an important issue in multi-agent systems. The OCSM-contracts protocol has been proposed, and its guarantee of global optimality has been proved. However, without an oracle to provide guidance on which contract strategy to select in each problem-solving situation, reaching the optimal allocation remains difficult. This thesis proposes a method for finding such an oracle: a guide that helps agents reduce the number of negotiation steps needed to reach the optimal allocation from any random initial task assignment. The proposed Oracle Learning method is divided into several sub-mechanisms, each designed to solve one detailed sub-problem in modeling the task (re)allocation problem, and we show how each sub-problem is solved and how the complexity of finding the optimal solution is thereby reduced. Through experiments, we evaluate problem-solving performance, the number of negotiation steps required, and the applicability of the method to problems of different scales. We conclude that the method substantially reduces the number of negotiation steps and gives proper negotiation guidance in every task-allocation state, since its sub-mechanisms answer the questions an oracle must answer. Thus, the computational complexity of the OCSM negotiation mechanism in task reallocation problems is greatly reduced.
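The core idea of the abstract — a learned oracle that recommends which contract type agents should try next, rewarded by the cost reduction the accepted contract achieves — can be sketched with tabular Q-learning. This is a minimal illustrative sketch, not the thesis's actual design: the cost model, the coarse cost-bucket state abstraction, the restriction to O- and S-contracts only, and all names here are simplifying assumptions.

```python
import random

random.seed(0)
N_AGENTS, N_TASKS = 3, 6
COST = [[random.randint(1, 9) for _ in range(N_TASKS)]
        for _ in range(N_AGENTS)]        # COST[agent][task]

def social_cost(alloc):
    """Sum of each owner's cost for the tasks it holds."""
    return sum(COST[owner][task] for task, owner in enumerate(alloc))

def propose(alloc, kind):
    """Draw one random contract of the given type."""
    alloc = alloc[:]
    if kind == "O":                      # O-contract: reassign one task
        t = random.randrange(N_TASKS)
        alloc[t] = random.randrange(N_AGENTS)
    else:                                # S-contract: swap two tasks' owners
        t1, t2 = random.sample(range(N_TASKS), 2)
        alloc[t1], alloc[t2] = alloc[t2], alloc[t1]
    return alloc

# Oracle: state = coarse bucket of the current social cost,
# action = contract type to recommend, reward = cost reduction achieved.
ACTIONS = ["O", "S"]
Q = {}
ALPHA, GAMMA, EPS = 0.3, 0.9, 0.2

def recommend(state):
    if random.random() < EPS:            # epsilon-greedy exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))

alloc = [random.randrange(N_AGENTS) for _ in range(N_TASKS)]
init_cost = social_cost(alloc)
for step in range(300):
    s = social_cost(alloc) // 5
    a = recommend(s)
    proposal = propose(alloc, a)
    gain = social_cost(alloc) - social_cost(proposal)
    if gain >= 0:                        # agents accept non-losing deals only
        alloc = proposal
    s2 = social_cost(alloc) // 5
    best_next = max(Q.get((s2, b), 0.0) for b in ACTIONS)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + ALPHA * (max(gain, 0) + GAMMA * best_next - old)
```

Because only non-losing contracts are accepted, the social cost here is monotonically non-increasing; such greedy acceptance can stall at local optima, which is exactly why the full OCSM repertoire of contract types (rather than the two shown) matters for the optimality guarantee.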

    Abstract
    中文摘要
    Table of Contents
    Chapter 1. Introduction
    Chapter 2. Related Works
      2.1 Task Allocation Problem
        Definition 2.1.1: Task Allocation Problem
      2.2 OCSM-Contract Protocol
        Definition 2.2.1: O-contract (Original Contract)
        Definition 2.2.2: C-contract (Cluster Contract)
        Definition 2.2.3: S-contract (Swap Contract)
        Definition 2.2.4: M-contract (Multi-agent Contract)
        Proposition 2.2.1: Path
        Proposition 2.2.2: No Path
        Proposition 2.2.3: IR Contract
      2.3 Reinforcement Learning
        Definition 2.3.1: Q-Learning
    Chapter 3. Proposed Methods
      3.1 Specially Defined Actions
        Definition 3.1.1: Action
        Trick: Bounded Contract
      3.2 Oracle Recommend Structure
        Mechanism 3.2.1: Rational Decision-Making Strategy Based on Individual Rationality
        Mechanism 3.2.2: Agent-Chosen Policy
        Mechanism 3.2.3: Contract Recommendation
      3.3 Oracle Learning
        Overall Refining Structure
        Part 1: Choose Policy Parameters and Record State Table
        Part 2: Update of the Recorded State Table
    Chapter 4. Experiments Design and Analysis
      A Small Sample Case
      Experiment 4.1: Check the Properties of the OCSM-Contract Protocol
        Experiment 4.1-1: Test Proposition 2.2.3 (IR Contract)
        Experiment 4.1-2: The Decrease of Performance in Optimality Finding with Bounded Contracts
      Experiment 4.2: Evaluation of Mechanisms in the Oracle Recommend Structure
        Experiment 4.2-1: Cross-Validation of Feature Selection under Different Rationalities and Strategies
        Experiment 4.2-2: Different Contract Recommendation Mechanisms
        Experiment 4.2-3: Contract Recommendations by Different k States
      Experiment 4.3: Evaluation of Overall Performance
        Experiment 4.3-1: Different Problems at Different Scales
    Chapter 5. Conclusion and Future Work
    References
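The four OCSM contract types listed in Section 2.2 of the outline (following Sandholm's definitions) can be illustrated as operations on an allocation. This is a toy sketch under an assumed representation — allocation as a dict from agent to its set of tasks — not the thesis's formal model.

```python
# Allocation maps each agent to the set of tasks it currently holds.

def o_contract(alloc, giver, taker, task):
    """O-contract: transfer a single task from one agent to another."""
    alloc[giver].remove(task)
    alloc[taker].add(task)

def c_contract(alloc, giver, taker, cluster):
    """C-contract: transfer a whole cluster (set) of tasks in one deal."""
    alloc[giver] -= cluster
    alloc[taker] |= cluster

def s_contract(alloc, a, b, task_a, task_b):
    """S-contract: two agents swap one task each."""
    alloc[a].remove(task_a); alloc[b].remove(task_b)
    alloc[a].add(task_b);    alloc[b].add(task_a)

def m_contract(alloc, moves):
    """M-contract: an atomic exchange among more than two agents,
    given as a list of (giver, taker, task) moves."""
    for giver, _taker, task in moves:
        alloc[giver].remove(task)
    for _giver, taker, task in moves:
        alloc[taker].add(task)

alloc = {0: {"t1", "t2"}, 1: {"t3"}, 2: {"t4"}}
o_contract(alloc, 0, 1, "t1")                      # 0 gives t1 to 1
s_contract(alloc, 1, 2, "t3", "t4")                # 1 and 2 swap t3/t4
m_contract(alloc, [(0, 1, "t2"), (1, 2, "t1"), (2, 0, "t3")])
```

Each operation conserves the task set; only ownership changes, which is the invariant the protocol's reachability propositions (2.2.1 and 2.2.2) are about.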

    [1] T. Sandholm, "Contract types for satisficing task allocation: I Theoretical results," in AAAI Spring Symposium: Satisficing Models, 1998.
    [2] M. R. Andersson and T. W. Sandholm, "Contract types for satisficing task allocation: II Experimental results," in AAAI Spring Symposium: Satisficing Models, 1998.
    [3] F. Jurčíček, B. Thomson, and S. Young, "Reinforcement learning for parameter estimation in statistical spoken dialogue systems," Computer Speech & Language, vol. 26, no. 3, pp. 168–192, June 2012.
    [4] Y. Cai and S. X. Yang, "A survey on multi-robot systems," in World Automation Congress (WAC), 2012.
    [5] E. O. Selfridge and P. A. Heeman, "Learning turn, attention, and utterance decisions in a negotiative slot-filling domain," Technical Report CSLU-11-005, October 2011.
    [6] K. Zhang, E. G. Collins, and A. Barbu, "A novel stochastic clustering auction for task allocation in multi-robot teams," in 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3300–3307, October 2010.
    [7] K. Zhang, E. G. Collins, and A. Barbu, "An efficient stochastic clustering auction for heterogeneous robot teams," in 2012 IEEE International Conference on Robotics and Automation (ICRA), pp. 4806–4813, May 2012.
    [8] X. Zheng and S. Koenig, "K-swaps: Cooperative negotiation for solving task-allocation problems," in Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI'09), pp. 373–378, 2009.
    [9] A. Ernst, H. Jiang, and M. Krishnamoorthy, "Exact solutions to task allocation problems," Management Science, vol. 52, no. 10, pp. 1634–1646, October 2006.
    [10] J. Choi, S. Oh, and R. Horowitz, "Distributed learning and cooperative control for multi-agent systems," Automatica, vol. 45, no. 12, pp. 2802–2814, December 2009.
    [11] M. J. Matarić, G. S. Sukhatme, and E. H. Østergaard, "Multi-robot task allocation in uncertain environments," Autonomous Robots, vol. 14, no. 2–3, pp. 255–263, 2003.
    [12] E. Alpaydin, Introduction to Machine Learning, 2nd ed., Ch. 18, 2009.
    [13] Wikipedia contributors, "Reinforcement learning," Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/Reinforcement_learning [accessed 27 October 2011].
    [14] Wikipedia contributors, "Assignment problem," Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/Assignment_problem [accessed 3 March 2009].
    [15] A. R. Mosteo and L. Montano, "A survey of multi-robot task allocation," http://www.mosteo.com/papers/mm10-i3a.pdf
    [16] M. Wooldridge, An Introduction to Multi-Agent Systems, 2nd ed., Ch. 11, 2009.
    [17] Wikipedia contributors, "Q-learning," Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/Q-learning [accessed 13 December 2010].

    Full-text release date: not authorized for public release (campus network)
    Full-text release date: not authorized for public release (off-campus network)
