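Application of Deep Reinforcement Learning, Trajectory Planning and Tracking in Navigation and Collision Avoidance of Multi-Robot Systems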


Graduate student:	Yen, Di-kuei (顏棣逵)
Thesis title:	Application of Deep Reinforcement Learning, Trajectory Planning and Tracking in Navigation and Collision Avoidance of Multi-Robot Systems
Advisor:	Yeh, Ting-Jen (葉廷仁)
Committee members:	Liu, Cheng-Hsien (劉承賢); Chen, Kuo-Shen (陳國聲)
Degree:	Master
Department:	College of Engineering - Department of Power Mechanical Engineering
Year of publication:	2022
Academic year of graduation:	110
Language:	Chinese
Number of pages:	89
Chinese keywords:	deep reinforcement learning, multi-robot systems, social navigation, collision avoidance, trajectory planning, tracking control, extended Kalman filter
English keywords:	social navigation, nonholonomic constraints

This thesis develops a navigation and obstacle avoidance neural network for multi-robot systems based on deep reinforcement learning. The network design starts from a dual-robot system. By taking into account state symmetry, nonholonomic constraints, priority order, and velocity obstacles, the training efficiently produces a network that conforms to the kinematics of mobile robots and performs collision avoidance with socially courteous navigation. The thesis also proposes an innovative extension architecture: combined with the concept of social repulsive force, it allows the dual-robot obstacle avoidance and navigation network to be extended systematically to multi-robot systems at a light computational cost. In addition, to compensate for the network's insufficient positioning accuracy at the target point, the thesis develops a trajectory planning and tracking method to which control is switched once the robot has navigated to the vicinity of the target point. To verify the developed theories and methods, a set of differential-drive wheeled robots was built. Each robot is equipped with a LiDAR, an optical flow sensor, an inertial measurement unit, and other sensors, and an extended Kalman filter fuses the sensor data to achieve precise localization. Experiments in an indoor environment demonstrate the feasibility and performance of the proposed multi-robot navigation and obstacle avoidance network.
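To illustrate the velocity obstacle idea that the network design takes into account, the following minimal sketch tests whether two disc-shaped robots would collide if both kept their current velocities. It follows the textbook geometric formulation and is not taken from the thesis. The function name `in_velocity_obstacle`, its parameters, and the disc-robot assumption are hypothetical choices made for this example.

```python
def in_velocity_obstacle(p_a, p_b, v_a, v_b, combined_radius):
    """Return True if robot A's velocity lies in the velocity obstacle of robot B.

    Assumption (not from the thesis): both robots are discs, positions and
    velocities are 2D tuples, and combined_radius is the sum of the two radii.
    """
    # Position of B relative to A, and velocity of A relative to B.
    px, py = p_b[0] - p_a[0], p_b[1] - p_a[1]
    vx, vy = v_a[0] - v_b[0], v_a[1] - v_b[1]

    dist_sq = px * px + py * py
    if dist_sq <= combined_radius ** 2:
        return True  # the discs already overlap

    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return False  # no relative motion, so the gap never closes

    closing = px * vx + py * vy
    if closing <= 0.0:
        return False  # A is moving away from B (or tangentially)

    # Squared distance from B's centre to the line along the relative velocity.
    miss_sq = dist_sq - closing * closing / speed_sq
    return miss_sq < combined_radius ** 2


if __name__ == "__main__":
    # A drives straight at a stationary B: a collision course.
    print(in_velocity_obstacle((0.0, 0.0), (4.0, 0.0), (1.0, 0.0), (0.0, 0.0), 1.0))
    # A drives diagonally and passes B with clearance.
    print(in_velocity_obstacle((0.0, 0.0), (4.0, 0.0), (1.0, 1.0), (0.0, 0.0), 1.0))
```

A planner could use a test of this kind to reject candidate velocities that fall inside the collision cone. How the thesis combines velocity obstacles with priority order and the learned policy is not stated in this record.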

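Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Symbols
Chapter 1    Introduction
1    Research Motivation and Objectives
2    Literature Review
3    Thesis Overview
Chapter 2    Deep Reinforcement Learning Neural Network
1    Deep Reinforcement Learning
1.1    Deep Learning
1.2    Reinforcement Learning
2    Design and Training Architecture of the Dual-Robot Navigation and Obstacle Avoidance Neural Network
2.1    System State
2.2    Deep Neural Network Design
2.3    Reward Design
2.4    Training Procedure
2.5    Practical Application Procedure
3    Accelerating Learning with State Symmetry
4    Observation Noise and Output Disturbance
5    Preliminary Training Results
6    Velocity Obstacles
6.1    Priority Setting
6.2    Velocity Obstacles
6.3    Application of Velocity Obstacles
Chapter 3    Extended Neural Network for Multiple Robots
1    Extended Neural Network Architecture
2    Influence Coefficient
3    Training Results and Comparison of the Extended Neural Network
4    Extending the Neural Network to Four and Five Robots
5    Comparison with Results from Other Papers
Chapter 4    Trajectory Planning and Tracking
1    Trajectory Planning
1.1    Bézier Curves
1.2    "Parameter Reselection" Algorithm
1.3    Conversion Between s and Time t
1.4    Summary of Trajectory Planning
2    Trajectory Tracking
3    Simulation Results
Chapter 5    Overall Robot Hardware and Software Architecture Design
1    Overall System Architecture
2    Jetson TX2 Computer and Its Software
2.1    Intra-Robot Software Communication
2.2    Multi-Robot Communication
3    Inertial Measurement Unit
4    LiDAR
5    Optical Flow Sensor
Chapter 6    Sensor Fusion Localization
1    Extended Kalman Filter
2    EKF Sensor Fusion Field Test
Chapter 7    Experimental Results
1    Trajectory Tracking Navigation
2    Overall Obstacle Avoidance and Navigation System
Chapter 8    Conclusions and Future Work
1    Conclusions
2    Future Work
References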

                                

