| Field | Value |
|---|---|
| Graduate Student | 許孝安 Hsu, Hsiao-An |
| Thesis Title | 利用整體最佳化從影像及深度資料達成時空一致的視訊合成 (Spatio-Temporally Consistent View Synthesis from Video-Plus-Depth Data with Global Optimization) |
| Advisor | 賴尚宏 Lai, Shang-Hong |
| Committee Members | 賴尚宏, 陳永昌, 陳永盛 |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science |
| Year of Publication | 2011 |
| Graduation Academic Year | 99 (ROC calendar, 2010-2011) |
| Language | Chinese |
| Pages | 45 |
| Keywords | view synthesis, video-plus-depth, global optimization |
Abstract:

In this thesis, we propose a novel algorithm to generate a virtual-view video from a video-plus-depth sequence. The main challenge in this task is to synthesize realistic content to fill the disocclusion regions in the synthesized view. The proposed method enforces spatial and temporal consistency in the disocclusion regions by formulating the task as an energy minimization problem in a Markov random field (MRF) framework, and the resulting MRF optimization problem is solved with the belief propagation (BP) algorithm. After warping the image with its depth map, we first recover the depth images and the motion vector maps. We then formulate the MRF energy function with an additional shift variable for each node. To reduce the high computational cost of applying BP to this problem, we present a multi-level BP scheme that runs BP with a smaller number of label candidates at each level. Finally, Poisson image reconstruction is applied to improve the color consistency across the boundary of the disocclusion region in the synthesized image. Experimental results of applying the proposed algorithm to real video-plus-depth sequences demonstrate its performance.
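For concreteness, the following is a minimal sketch of the kind of pairwise MRF energy the abstract describes, assuming one shift label s_p per node p in the disocclusion region (Omega below), a data term D_p, and spatial/temporal smoothness terms; the weights and the exact potentials are illustrative assumptions rather than the thesis's definitions:

```latex
E(\{s_p\}) = \sum_{p \in \Omega} D_p(s_p)
           + \lambda_s \sum_{(p,q) \in \mathcal{N}_s} V(s_p, s_q)
           + \lambda_t \sum_{(p,p') \in \mathcal{N}_t} V(s_p, s_{p'})
```

Here s_p indexes the candidate shift (source location) whose content is copied to node p, D_p scores how well that content agrees with the visible color and depth around the hole, and N_s, N_t are the spatial and temporal neighbor pairs that enforce the consistency mentioned above.

As a sketch of the inference step, below is a generic min-sum (log-domain max-product) loopy belief propagation routine on a 4-connected grid, not the thesis's multi-level scheme; the function names, the shared pairwise cost table, and the toy problem at the bottom are assumptions for illustration only. Its per-iteration cost grows quadratically with the number of label candidates L, which is why reducing the candidate set at each level, as the abstract describes, lowers the computational burden.

```python
import numpy as np

def roll_zero(a, dy, dx):
    """Shift an (H, W, L) array by (dy, dx) on the grid axes, zero-filling vacated cells."""
    out = np.roll(a, shift=(dy, dx), axis=(0, 1))
    if dy == 1:
        out[0, :, :] = 0
    elif dy == -1:
        out[-1, :, :] = 0
    if dx == 1:
        out[:, 0, :] = 0
    elif dx == -1:
        out[:, -1, :] = 0
    return out

def min_sum_bp(unary, pairwise, n_iters=10):
    """Min-sum loopy belief propagation on a 4-connected grid.

    unary    : (H, W, L) data cost of each of L candidate labels per node
    pairwise : (L, L) smoothness cost shared by all edges
    returns  : (H, W) label map with the lowest approximate energy
    """
    H, W, L = unary.shape
    # msgs[d] = message each node currently receives from direction d
    # d: 0 = from above, 1 = from below, 2 = from left, 3 = from right
    msgs = np.zeros((4, H, W, L))
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # receiver -> sender offsets

    for _ in range(n_iters):
        new = np.zeros_like(msgs)
        for d, (dy, dx) in enumerate(offsets):
            # at each sender: unary + all incoming messages except the one from the receiver
            h = unary + msgs.sum(axis=0) - msgs[d ^ 1]
            # minimize over the sender's label for every receiver label (O(L^2) per edge)
            m = (h[..., :, None] + pairwise[None, None, :, :]).min(axis=2)
            m -= m.min(axis=-1, keepdims=True)      # normalize for numerical stability
            new[d] = roll_zero(m, -dy, -dx)         # deliver messages to the receivers
        msgs = new

    beliefs = unary + msgs.sum(axis=0)
    return beliefs.argmin(axis=-1)

if __name__ == "__main__":
    # Toy problem: 2 labels, Potts smoothness; the left half prefers label 0, the right half label 1.
    H, W, L = 8, 8, 2
    unary = np.zeros((H, W, L))
    unary[:, :W // 2, 1] = 1.0
    unary[:, W // 2:, 0] = 1.0
    potts = 0.5 * (1.0 - np.eye(L))
    print(min_sum_bp(unary, potts, n_iters=5))
```

With a handful of candidates the quadratic pairwise step above stays cheap, but with thousands of candidate shifts per node it dominates the running time, which is the motivation the abstract gives for running BP with fewer label candidates at each level.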