
Graduate Student: 謝志鈴 (Hsieh, Chih-Ling)
Thesis Title: 應用深度強化學習網路於多機器人之避障與地圖導航
(Collision Avoidance and Map Navigation of a Multi-Robot System using Deep Reinforcement Learning Network)
Advisor: 葉廷仁 (Yeh, Ting-Jen)
Committee Members: 劉承賢 (Liu, Cheng-Hsien), 黃浚鋒 (Huang, Chung-Feng)
Degree: Master
Department: College of Engineering - Department of Power Mechanical Engineering
Year of Publication: 2020
Graduation Academic Year: 109
Language: Chinese
Number of Pages: 69
Keywords (Chinese): 深度強化學習、地圖導航、避障、非完整約束
Keywords (English): deep reinforcement learning, map navigation, collision avoidance, nonholonomic constraint
Chinese abstract (translated): This study uses deep reinforcement learning to build a deep neural value network that lets a group of robots avoid moving obstacles in real time and complete navigation tasks. The nonholonomic constraints of wheeled locomotion are taken into account from the initial collection of learning data, so the robots can perform behaviors that are more efficient and consistent with their kinematic limits. Reinforcement learning training starts from a two-robot network; based on the associated test results, a systematic architecture and procedure are established for training networks with a small number of robots and extending them to multi-robot networks. Finally, the trained network is combined with virtual robots placed to substitute for the fixed obstacles in different map environments, so that the network, originally trained in open space, can perform map navigation in different environments without retraining, improving its generality.
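The nonholonomic constraint mentioned in the abstract can be illustrated with a minimal differential-drive (unicycle) kinematics step. The sketch below is illustrative only; the function and parameter names are assumptions, not taken from the thesis:

```python
import math

def unicycle_step(x, y, theta, v, omega, dt):
    """One Euler integration step of differential-drive (unicycle) kinematics.

    The nonholonomic constraint means the robot can only translate along its
    current heading (x_dot*sin(theta) - y_dot*cos(theta) = 0): there is no
    sideways motion, so the feasible actions are (v, omega) pairs, not
    arbitrary displacements.
    """
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = theta + omega * dt
    return x_new, y_new, theta_new

# A robot facing +x cannot move in +y directly; it must first rotate,
# which is why a planner that ignores this constraint produces paths the
# robot cannot follow.
x, y, th = unicycle_step(0.0, 0.0, 0.0, v=1.0, omega=0.0, dt=0.1)
```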


    This thesis applies deep reinforcement learning to build a deep value network for collision avoidance and map navigation of a multi-robot system. By considering the nonholonomic constraints of wheeled mobile robots during the initial collection of learning data, the robots can perform more efficient behaviors that conform to their kinematic constraints. Reinforcement learning training starts with a dual-robot navigation network. Then, based on the test results, a systematic procedure is proposed to constructively extend the dual-robot network to a multi-robot network. It is also shown that virtual robots can be adopted in the trained navigation network to emulate fixed obstacles in the map environment. By doing so, a network originally trained in open space can be used for navigation in different map environments without retraining. Both simulations and experiments verify the effectiveness and generality of the multi-robot navigation network constructed by the proposed approach.
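The decision loop the abstract describes, in which a learned value network scores kinematically feasible actions and virtual robots stand in for fixed map obstacles, could be sketched roughly as follows. The value function here is a simple stand-in heuristic rather than a trained network, and all names and parameters are assumptions, not the thesis's actual implementation:

```python
import math

def propagate(state, v, omega, dt=0.2):
    """Advance a unicycle-model state (x, y, theta) by one time step."""
    x, y, theta = state
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def choose_action(state, goal, obstacles, value_fn, speeds, turn_rates,
                  radius=0.3):
    """One-step lookahead: propagate each feasible (v, omega) pair through
    the nonholonomic kinematics, discard actions that would collide with any
    obstacle (a real robot or a virtual one emulating a fixed map obstacle),
    and pick the action whose successor state scores best under value_fn."""
    best, best_val = None, -math.inf
    for v in speeds:
        for w in turn_rates:
            nxt = propagate(state, v, w)
            if any(math.hypot(nxt[0] - ox, nxt[1] - oy) < radius + orad
                   for ox, oy, orad in obstacles):
                continue  # collision with a (possibly virtual) robot
            val = value_fn(nxt, goal)
            if val > best_val:
                best, best_val = (v, w), val
    return best  # None if every feasible action collides
```

A stand-in value function such as the negative distance to the goal can be used to exercise the loop; in the thesis's scheme the trained deep value network would take this role, and the `obstacles` list is where stationary virtual robots would be inserted to represent walls of a map.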

    Table of Contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1 Research Motivation and Objectives
      1.2 Literature Review
      1.3 Thesis Overview
    Chapter 2  Related Theory and Basic System Architecture
      2.1 Nonholonomic Constraints of Wheeled Robots
      2.2 Common Path-Planning Algorithms
        2.2.1 Dijkstra's Algorithm
        2.2.2 A* Algorithm
        2.2.3 Hybrid A* Algorithm
      2.3 Velocity Obstacles
      2.4 Machine Learning
        2.4.1 Deep Neural Networks
        2.4.2 Reinforcement Learning
        2.4.3 Deep Reinforcement Learning
      2.5 Basic System Architecture
    Chapter 3  Two-Robot Network Architecture Design and Training Procedure
      3.1 System States
        3.1.1 The Robot's Own State
        3.1.2 Observed Obstacle States
        3.1.3 Joint (Upper-Level) System State
        3.1.4 Simplifying the System State by Coordinate Transformation
      3.2 Initial Training Data Collection
      3.3 Deep Neural Value Network Parameter Settings
      3.4 Deep Reinforcement Learning Parameter Settings and Training Procedure
        3.4.1 Reward Design
        3.4.2 Path Data Conversion
        3.4.3 Training Procedure
      3.5 Deployment Procedure
    Chapter 4  Multi-Robot Network Extension, Training, and Map Navigation
      4.1 Two-Robot Network Extension Tests
      4.2 Multi-Robot Deep Neural Network Extension
        4.2.1 Network Extension Procedure (Overview)
        4.2.2 Network Extension Procedure (Example)
      4.3 Multi-Robot Network Training
      4.4 Map Navigation Application
        4.4.1 Analysis of Collision-Avoidance Behavior
        4.4.2 Reformulating the Map Navigation Problem
        4.4.3 Virtual Robot Placement
    Chapter 5  Simulation and Experimental Results
      5.1 Test Platforms
        5.1.1 Standalone Full-Simulation Environment
        5.1.2 Communication Simulation Environment
        5.1.3 Wheeled Robot Platform
      5.2 Two-Robot Network Test Results
        5.2.1 Comparison with Other Navigation Systems
        5.2.2 Priority Analysis
        5.2.3 Results on Each Test Platform
        5.2.4 Extension to Multi-Robot Systems
      5.3 Multi-Robot Network Application Test Results
        5.3.1 Comparison of Two-Robot and Three-Robot Network Extensions
        5.3.2 Map Navigation Test Results
    Chapter 6  Conclusions and Future Work
      6.1 Conclusions
      6.2 Future Work
    References

