Graduate Student: 梁瀚文 Han-Wen Liang
Thesis Title: 應用於虛擬視訊會議系統的擬真視訊代理人之錯誤隱藏方法 (Error Concealment of Video Realistic Avatar for Virtual Conferencing System)
Advisor: 陳永昌 Yung-Chang Chen
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2005
Graduation Academic Year: 93 (ROC calendar)
Language: English
Pages: 37
Keywords (Chinese): 錯誤隱藏, 虛擬視訊會議, 擬真代理人, 臉部動畫
Keywords (English): error concealment, virtual conferencing system, realistic avatar, facial animation
In a virtual video conferencing system, virtual avatars stand in for the users so that they can converse face to face in a shared virtual scene. The system architecture developed in previous research integrates the advantages of 2-D video coding and 3-D model-based coding, allowing the virtual avatar to reproduce the user's facial expressions more realistically under lower bandwidth requirements.
In this thesis, we aim to develop an error concealment tool better suited to the 2-D video coder of this system: under very-low-bit-rate transmission, when the 2-D video data carried in the enhancement layer is lost, the facial animation parameters transmitted in the base layer can still drive the virtual avatar to produce the vivid expressions it should have.
From the input video we extract the facial animation parameters, and by subtracting the neutral face image from the expressive face image we obtain the "texture difference" between them. We then identify the regions where 3-D model-based coding cannot reproduce subtle expression changes, and split the texture difference into several "partial texture differences". By the nature of the facial animation parameters, each partial texture difference is associated with the parameters that affect its region.
We therefore propose a vector-quantization-like method to cluster the facial animation parameters affecting each region: the partial texture difference whose reconstructed face image yields the smallest quantization error with respect to the original face image is chosen as the representative of that expression cluster. Accordingly, given the cluster to which the received facial animation parameters belong, we can select the corresponding partial texture difference and thereby recover the expressive image, compensating for the data that may have been lost.
Avatars are commonly employed in virtual conferencing systems as users' agents for face-to-face conversation in a 3-D virtual space. In previous research, a system architecture was established that integrates a 3-D model-based coder and a 2-D video coder for very-low-bit-rate video coding. This enables the avatars, as the users' representatives, to exhibit more realistic expressions and thus form video realistic avatars.
In this thesis, we develop a new error concealment scheme for the 2-D video coder serving as the enhancement layer, which is expected to be better suited to the virtual conferencing system. With the proposed scheme, we can alleviate the artifacts caused by the loss of the enhancement layer and still reconstruct a vivid 3-D virtual facial model when insufficient bandwidth is available.
First, we extract from the input sequence the facial animation parameters (FAPs) and the texture difference, i.e., the difference between the expressive facial image and the neutral facial image synthesized by 3-D model-based coding. Then, we define regions of interest (ROIs) using the facial animation tables (FATs) to separate the texture difference of the whole face into several partial texture differences, one per ROI. Since each ROI is affected by a specific group of FAPs, the partial texture difference in that ROI is governed by the same group.
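The texture-difference and ROI-partitioning step above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the frame size, pixel values, and the mouth ROI box are invented for the example, and a real system would use the FAT-derived regions on full facial images.

```python
# Sketch of the per-ROI texture-difference step. Frame sizes, ROI boxes,
# and pixel values are illustrative assumptions, not taken from the thesis.

def texture_difference(expressive, neutral):
    """Per-pixel difference between the expressive face image and the
    neutral (model-synthesized) face image."""
    return [[e - n for e, n in zip(er, nr)]
            for er, nr in zip(expressive, neutral)]

def partial_difference(diff, roi):
    """Crop the full-face texture difference to one region of interest.
    roi = (top, left, bottom, right), half-open on bottom/right."""
    top, left, bottom, right = roi
    return [row[left:right] for row in diff[top:bottom]]

# Toy 4x4 "images": the expressive face differs from the neutral one
# only inside the (hypothetical) mouth ROI, the bottom-right 2x2 block.
neutral = [[10] * 4 for _ in range(4)]
expressive = [row[:] for row in neutral]
expressive[2][2] = expressive[2][3] = 30
expressive[3][2] = expressive[3][3] = 30

diff = texture_difference(expressive, neutral)
mouth_roi = (2, 2, 4, 4)
print(partial_difference(diff, mouth_roi))  # [[20, 20], [20, 20]]
```

Outside the ROI the difference is zero, which is exactly why only the partial texture differences need to be modeled and transmitted.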
Thus, we propose a VQ-like algorithm to cluster the FAPs together with the partial texture difference of each ROI, according to the minimum quantization error, which is measured as the difference between the real facial image and the synthetic facial image partially updated by the partial texture difference. With the codevectors of each cluster, we can predict the partial texture difference from the codevector nearest to the received FAPs. Finally, a predicted texture map is generated from the partial texture differences, and the lost enhancement layer is concealed using this predicted texture map.
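The decoder-side concealment lookup can be sketched as below: given the FAPs received in the base layer, pick the codevector whose FAP centroid is nearest and reuse its stored partial texture difference in place of the lost enhancement-layer data. The two-entry codebook here is an invented toy; in the thesis the codebook would come from the VQ-like training described above.

```python
# Nearest-codevector concealment lookup. The codebook values are
# illustrative assumptions, not the thesis's trained codebook.

def squared_distance(a, b):
    """Squared Euclidean distance between two FAP vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def conceal(received_faps, codebook):
    """codebook: list of (fap_centroid, partial_texture_diff) pairs,
    e.g. the output of Lloyd-style clustering on training FAP vectors.
    Returns the partial texture difference of the nearest cluster."""
    _, best_diff = min(
        codebook,
        key=lambda entry: squared_distance(received_faps, entry[0]))
    return best_diff

# Toy codebook with two expression clusters for one mouth ROI.
codebook = [
    ((0.0, 0.0), [[0, 0], [0, 0]]),     # neutral-mouth cluster
    ((0.8, 0.5), [[20, 20], [20, 20]]), # open-mouth cluster
]
print(conceal((0.7, 0.6), codebook))  # [[20, 20], [20, 20]]
```

Because the received FAPs (0.7, 0.6) lie closer to the open-mouth centroid, the open-mouth partial texture difference is selected and patched into the neutral synthesized face, approximating the lost expressive texture.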