Author: 邱志義 Chih-Yi Chiu
Thesis Title: 多模式視覺內容擷取 (Multi-Modal Visual Content Retrieval)
Advisors: 楊熙年 Shi-Nine Yang; 林信志 Hsin-Chih Lin
Degree: Doctoral (博士)
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2004
Academic Year of Graduation: 92
Language: English
Number of Pages: 115
Keywords (Chinese): 多模式互動、視覺內容擷取、語意隔閡、材質影像、人體動作、人體姿勢重建
Keywords (English): multi-modal interaction, visual content retrieval, semantic gap, texture image, human motion, human posture reconstruction
Visual content retrieval is a multimedia information retrieval technique that has attracted considerable attention in recent years, and the major bottleneck it currently faces is the semantic gap problem. One feasible way to narrow this gap is to perform visual content retrieval through multi-modal interactions. Multi-modal interactions offer users multiple modes of interacting with the computer system, such as relevance feedback or various input and output interfaces, so that during retrieval users can choose a convenient and appropriate mode to specify query conditions and then observe the output results.
In this dissertation we propose a multi-modal visual content retrieval framework, design and demonstrate novel interaction modes, and devise the associated indexing and matching algorithms. Texture images and human motion serve as the two example domains for studying the proposed framework. In the texture image retrieval system, users can issue adjectives together with example images to describe the desired textures. We also propose a method that automatically annotates and indexes texture images with these adjectives, and the system adapts the adjectives to each user's personal perception. In the human motion retrieval system, we design a variety of input and output modes: users can specify queries through text, an adjustable stickman, images, or motion clips, and can observe the results as trajectory maps or motion videos. Finally, we design efficient indexing and matching algorithms tailored to the characteristics of human motion to improve system efficiency and the accuracy of target search. For the image input mode in particular, we study the problem of recovering the depth cue from a single human posture image and propose a skeleton-based method for reconstructing the 3D posture. The experimental results are satisfactory and demonstrate the feasibility of the proposed multi-modal visual content retrieval framework.
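The texture query mode described above combines adjectives with example images. As a minimal illustrative sketch only (not the scoring function actually used in the thesis), one straightforward way to realize such a hybrid query is a weighted fusion of an adjective-overlap score and a visual-feature similarity; the weight `w`, the 8-dimensional features, and the set-based annotation representation are assumptions made for the example.

```python
import numpy as np

def hybrid_texture_score(query_terms, query_feature, image_terms, image_feature, w=0.5):
    """Fuse an adjective-overlap score with a visual-feature similarity.

    query_terms   : set of adjectives given by the user, e.g. {"coarse", "regular"}
    query_feature : (d,) feature vector of the user's example image
    image_terms   : set of adjectives annotated on a database image
    image_feature : (d,) feature vector of the database image
    w             : assumed weight balancing the two modalities
    """
    # Term score: fraction of the query adjectives found in the image annotation.
    term_score = len(query_terms & image_terms) / max(len(query_terms), 1)
    # Visual score: similarity derived from the Euclidean feature distance.
    visual_score = 1.0 / (1.0 + np.linalg.norm(query_feature - image_feature))
    return w * term_score + (1.0 - w) * visual_score

# Toy usage with hypothetical 8-dimensional texture features.
score = hybrid_texture_score(
    {"coarse", "regular"}, np.random.rand(8),
    {"coarse", "smooth"}, np.random.rand(8), w=0.6)
```

In a system of the kind described above, the annotation side would come from automatic adjective indexing, and the balance between the two modalities could be adjusted per user by a personalization mechanism.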
Visual content retrieval is an emerging technique for accessing multimedia information. However, it suffers from the semantic gap problem, the mismatch between low-level visual features and high-level human concepts. One way to narrow the semantic gap is to incorporate multi-modal interactions into the retrieval process. Multi-modal interactions provide users with a combination of interacting modalities, such as relevance feedback and various input and output interfaces, so that users can select a convenient and appropriate mode to specify query conditions and observe search results during the retrieval process.
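Relevance feedback, mentioned above as one interacting modality, is commonly implemented as an iterative refinement of the query's feature vector. The sketch below uses the classic Rocchio update as a stand-in; it is not the specific feedback scheme of this thesis, and the weights and feature dimensionality are assumptions.

```python
import numpy as np

def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Refine a query feature vector from user feedback (Rocchio formula).

    query        : (d,) original query feature vector
    relevant     : (n_r, d) features of results the user marked relevant
    non_relevant : (n_n, d) features of results the user marked non-relevant
    """
    new_query = alpha * query
    if len(relevant) > 0:
        new_query = new_query + beta * relevant.mean(axis=0)
    if len(non_relevant) > 0:
        new_query = new_query - gamma * non_relevant.mean(axis=0)
    return new_query

# Toy usage with hypothetical 8-dimensional visual features.
q = np.random.rand(8)
rel = np.random.rand(3, 8)   # three results the user marked relevant
non = np.random.rand(2, 8)   # two results the user marked non-relevant
q_refined = rocchio_update(q, rel, non)
```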
In this study, we propose a multi-modal interaction framework for visual content retrieval. Several novel interacting modalities are presented, and the associated indexing and matching algorithms are devised. Two types of multimedia information, namely texture images and human motion, are investigated and demonstrated. In our texture image retrieval system, users can specify linguistic terms together with visual examples to find desired texture images. In addition, an automatic annotation and indexing scheme is devised based on these linguistic terms, and a personalization mechanism is developed to cope with human subjectivity toward linguistic terms. In our human motion retrieval system, users can choose appropriate input modes, including text, a stickman, images, and motion clips, to specify their queries, and then observe retrieval results as graphics images or animation video. Moreover, we design efficient and effective indexing and matching algorithms to decrease retrieval time and improve retrieval accuracy. For the image input mode in particular, we present a novel model-based approach that reconstructs a 3D human posture from a single image. Our experimental results reveal a promising direction for the proposed multi-modal interaction framework in visual content retrieval.
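For the single-image posture reconstruction mentioned last, the core geometric idea shared by skeleton-based methods is the foreshortening relation under scaled orthographic projection: a limb of known 3D length L that projects to an image segment of length l has a relative depth of sqrt(L² − (l/s)²) between its endpoints, up to a sign ambiguity. The sketch below illustrates only this generic relation, not the thesis's specific reconstruction algorithm; the scale factor and the numeric values are assumptions.

```python
import math

def relative_depth(bone_length, p2d_a, p2d_b, scale=1.0):
    """Relative depth between two joints of one limb segment.

    bone_length : known 3D length L of the segment (e.g., from a body model)
    p2d_a, p2d_b: 2D image coordinates (x, y) of the segment endpoints
    scale       : assumed scale factor s of the orthographic projection
    Returns |dz|; which endpoint is closer to the camera remains ambiguous
    and must be resolved by other cues or user input.
    """
    dx = (p2d_b[0] - p2d_a[0]) / scale
    dy = (p2d_b[1] - p2d_a[1]) / scale
    l2 = dx * dx + dy * dy
    if l2 > bone_length ** 2:        # projection longer than the bone:
        return 0.0                   # clamp (image noise or wrong scale)
    return math.sqrt(bone_length ** 2 - l2)

# Toy usage: an upper arm of length 0.30 m seen as a short image segment,
# with an assumed scale of 200 pixels per metre.
dz = relative_depth(0.30, (120, 80), (150, 105), scale=200.0)
print(f"relative depth ~ {dz:.3f} m (sign ambiguous)")
```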