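Graduate student: | 徐弘豫 Hsu, Hung-Yu |
---|---|
Thesis title: | 利用學習距離矩陣產生連續的手勢動作 Generating Continuous Pose Sequences Using Learned Distance Metrics |
Advisor: | 陳煥宗 |
Committee members: | 劉庭祿, 賴尚宏 |
Degree: | Master (碩士) |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science |
Year of publication: | 2012 |
Academic year of graduation: | 100 |
Language: | Chinese |
Number of pages: | 26 |
Keywords (Chinese): | Kinect |
Keywords (English): | Kinect |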
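Recent results show that information captured by depth cameras is better suited than that from conventional cameras to solving some image-processing problems, such as object recognition and pose tracking. In addition, the success of Microsoft's motion-sensing game console and the steadily falling price have made depth cameras increasingly accessible to ordinary users. In this thesis, we therefore use image information captured by a depth camera for our research and to develop our application. We use Kinect and the OpenNI API to extract depth images and joint coordinates of people, collect a variety of poses and motions, and use these images and joint data as our data source for analysis. This thesis has two main goals: (1) to learn a distance metric that reduces the distance between similar poses and increases the distance between different ones; and (2) to build a model that describes the relations among the frames of all pose videos, so that, given only a few non-consecutive frames, the model can find the intermediate frames and join them into a continuous video. Using the distance metric learned in (1) together with MDS, we reduce the high-dimensional joint-coordinate data to a low-dimensional space, and we propose a way to describe the different pose videos and to fill in the missing frames to form a continuous video. The reduced dimensionality also cuts computation time, allows more data to be handled, and aids visualization.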
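As a rough illustration of the dimensionality-reduction step described above, the following is a minimal sketch of classical MDS applied to pairwise distances measured under a Mahalanobis-style metric. The record does not give the thesis's actual formulation, so the function names, the use of a positive semidefinite matrix `M` to stand for the learned metric, and the random stand-in data are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_sq_dists(X, M):
    """Pairwise squared distances (x_i - x_j)^T M (x_i - x_j).

    X: (n, d) array of pose vectors (e.g. flattened joint coordinates).
    M: (d, d) positive semidefinite matrix; assumed here to stand in for
       a learned metric. With M = identity this is plain Euclidean distance.
    """
    diff = X[:, None, :] - X[None, :, :]              # shape (n, n, d)
    return np.einsum("ijk,kl,ijl->ij", diff, M, diff)

def classical_mds(D_sq, k=2):
    """Classical (Torgerson) MDS: embed n points in k dimensions.

    D_sq: (n, n) matrix of squared pairwise distances.
    """
    n = D_sq.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n               # centering matrix
    B = -0.5 * J @ D_sq @ J                           # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]             # indices of the k largest
    lam = np.clip(eigvals[order], 0.0, None)          # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 45))                     # hypothetical pose vectors
    A = rng.normal(size=(45, 45))
    M = A.T @ A                                       # placeholder PSD metric, not a learned one
    Y = classical_mds(mahalanobis_sq_dists(X, M), k=2)
    print(Y.shape)
```

In this sketch the metric only enters through the distance matrix, so any metric-learning procedure that yields a positive semidefinite `M` could be plugged in without changing the MDS step.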
Depth cameras have become popular because of their increasingly affordable price and their ability to solve relatively difficult problems such as object detection [1] and pose tracking [2], most notably in the successful pose recognition system developed by Shotton et al. [3]. In this thesis we use depth information to develop our application. We obtain depth images and human skeleton coordinates using Kinect, and we focus on processing the skeleton information. Our work has two goals: (1) learning a distance metric that can identify similar-pose videos and distinguish dissimilar-pose videos; and (2) constructing a graph that describes the relations among poses, for synthesizing continuous pose sequences. We use multidimensional scaling (MDS) with the learned distance metric to reduce the dimensionality of the poses. We present a method for describing distinct poses and for interpolating missing frames. The reduced pose dimension also saves computation time, makes larger datasets tractable, and aids visualization. Our experiments show the effectiveness of our method for modeling and synthesizing pose sequences.
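To make the idea of interpolating missing frames over a pose graph more concrete, here is a small sketch under stated assumptions. The abstract does not say how the graph is built or searched, so the k-nearest-neighbour construction, the use of Dijkstra's algorithm, and all function names below are hypothetical choices for illustration, not the thesis's method. Each node is one frame, represented by its low-dimensional pose coordinates.

```python
import heapq
import numpy as np

def knn_graph(Y, k=3):
    """Symmetric k-nearest-neighbour graph; edge weight = Euclidean distance."""
    n = len(Y)
    dist = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        neighbours = [int(j) for j in np.argsort(dist[i]) if j != i][:k]
        for j in neighbours:
            graph[i][j] = float(dist[i, j])
            graph[j][i] = float(dist[i, j])       # keep the graph undirected
    return graph

def shortest_path(graph, src, dst):
    """Dijkstra's algorithm; returns node indices from src to dst, or []."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, float("inf")):
            continue                              # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in best:
        return []                                 # dst unreachable from src
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def fill_between_keyframes(graph, keyframes):
    """Chain shortest paths between consecutive keyframes into one sequence."""
    frames = [keyframes[0]]
    for a, b in zip(keyframes, keyframes[1:]):
        segment = shortest_path(graph, a, b)
        if not segment:
            raise ValueError(f"no path between frames {a} and {b}")
        frames.extend(segment[1:])                # drop the repeated start node
    return frames
```

Given a few keyframe indices, `fill_between_keyframes` returns the indices of stored frames that connect them. Whether the result looks smooth depends on how densely the collected poses sample the motion.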
[1] S. Hinterstoisser, S. Holzer, C. Cagniart, S. Ilic, K. Konolige, N. Navab, and V. Lepetit.
“Multimodal templates for real-time detection of texture-less objects in heavily cluttered
scenes”. In International Conference on Computer Vision, 2011.
[2] A. Baak, M. Müller, G. Bharaj, H.-P. Seidel, and C. Theobalt. “A data-driven approach
for real-time full body pose reconstruction from a depth camera”. In IEEE 13th International
Conference on Computer Vision, to appear, Nov. 2011.
[3] J. Shotton, A. W. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman,
and A. Blake. “Real-time human pose recognition in parts from single depth images”. In
Computer Vision and Pattern Recognition, pages 1297–1304, 2011.
[4] E. P. Xing, A. Y. Ng, M. I. Jordan, and S. J. Russell. “Distance metric learning with
application to clustering with side-information.” In Neural Information Processing
Systems, pages 505–512, 2002.
[5] A. Shashua and L. Wolf. “Kernel feature selection with side data using a spectral
approach.” In Proceedings of the European Conference on Computer Vision, Prague, Czech
Republic, May 2004.
[6] W. S. Torgerson. “Multidimensional scaling: Theory and method.” Psychometrika, vol.
17, pages 401–419, 1952.
[7] T. Yang, J. Liu, L. McMillan, and W. Wang. “A fast approximation to multidimensional
scaling.” In Proceedings of the ECCV Workshop on Computation Intensive Methods for
Computer Vision (CIMCV), 2006.
[8] H. Z. Zha and Z. Zhang. “Isometric embedding and continuum isomap.” In Proceedings
of the Twentieth International Conference on Machine Learning, pages 864–871, 2003.
[9] A. Y. Ng, M. I. Jordan, and Y. Weiss. “On spectral clustering: Analysis and an
algorithm.” In Advances in Neural Information Processing Systems, pages 849–856. MIT
Press, 2001.