| Field | Value |
|---|---|
| Graduate Student | 高唯家 Kao, Wei-Chia |
| Thesis Title | Real-time Human Upper Body Action Recognition and Body Part Motion Capturing |
| Advisors | 黃仲陵 Huang, Chung-Lin; 鐘太郎 Jong, Tai-Lang |
| Oral Defense Committee | 黃仲陵, 鐘太郎, 莊仁輝, 柳金章 |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication | 2013 |
| Academic Year of Graduation | 102 |
| Language | English |
| Pages | 57 |
| Keywords | Body part recognition, Depth image, Random Forest, Pose recognition |
This thesis proposes a real-time system that recognizes human upper-body actions and predicts the positions of upper-limb joints from depth images captured by the Microsoft Kinect. The system consists of three stages: (1) action recognition, (2) body part segmentation, and (3) offset compensation. In the first stage, the depth image is pre-processed, features are extracted, and two action classifiers identify the current user action type; the temporal correlation between consecutively recognized actions is then used to correct the result. In the second stage, based on the recognized action type, an appropriate body part classifier is selected to label the pre-processed depth image and estimate the distribution of body parts; the temporal dependency and correlation among body parts are then used to infer parts that may be occluded. In the third stage, offset classifiers are developed from the difference between the output of the second stage and the ground truth. For each action type, the appropriate offset classifier estimates the offset of each body part, which is added back to compensate the output of the body part classifier. Finally, the estimates before and after offset compensation are compared against the depth silhouette, and the one that better fits the depth image is output as the final body part locations.
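The three-stage flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the thesis implementation: the function names, the toy heuristics inside them, and the hand-picked offsets are all assumptions, and the real system replaces each stage with a trained Random-Forest classifier over depth features.

```python
import numpy as np

# Toy sketch of the three-stage pipeline summarized in the abstract.
# Every name and heuristic here is an illustrative assumption; the
# thesis itself uses Random-Forest classifiers at each stage.

def recognize_action(depth):
    """Stage 1: classify the current action from the depth image.
    (The real system extracts features, runs two action classifiers,
    then applies temporal correction across frames.)"""
    left = depth[:, : depth.shape[1] // 2].mean()
    right = depth[:, depth.shape[1] // 2:].mean()
    if abs(left - right) < 0.05:
        return "rest"
    return "raise_left" if left < right else "raise_right"

def segment_body_parts(depth, action):
    """Stage 2: action-conditioned per-pixel body-part labeling; here we
    just take the centroid of the nearest pixels as a stand-in joint."""
    ys, xs = np.nonzero(depth < depth.mean())
    return {"action": action, "hand": (float(xs.mean()), float(ys.mean()))}

def compensate_offsets(joints):
    """Stage 3: add an action-specific offset to each joint (in the
    thesis, learned from the stage-2 error against ground truth)."""
    offsets = {"rest": (0.0, 0.0),
               "raise_left": (-1.0, 0.0),
               "raise_right": (1.0, 0.0)}
    dx, dy = offsets[joints["action"]]
    x, y = joints["hand"]
    return {**joints, "hand": (x + dx, y + dy)}

def process_frame(depth):
    action = recognize_action(depth)
    joints = segment_body_parts(depth, action)
    # The full system compares the estimates before and after compensation
    # against the depth silhouette and keeps the better fit; here we simply
    # return the compensated estimate.
    return compensate_offsets(joints)

depth = np.ones((8, 8))
depth[:, :4] = 0.2               # nearer (smaller-depth) pixels on the left
result = process_frame(depth)
```

The key design point the sketch preserves is that stages 2 and 3 are conditioned on the stage-1 action label, so a different body part model and a different offset model are applied per action type.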